mkl
a month ago
TSP = Travelling Salesman Problem (https://en.wikipedia.org/wiki/Travelling_salesman_problem)
PPO = Proximal Policy Optimisation, a reinforcement learning algorithm (https://en.wikipedia.org/wiki/Proximal_Policy_Optimization)
a month ago
Also compare with LKH3, which seems much faster and closer to optimal.
a month ago
Sorry if I am harsh, but a 1200-node TSP instance is a toy problem. We can find proven optimal solutions to these in a fraction of the time you spent.
RL is probably best suited to instances affected by uncertainty.