The classic practical algorithm comes from Bell Labs work in the 1960s. Connect the points in some random order. Then pick two links and cut there, leaving three connected chains. Try all 12 possibilities for reconnecting those three chains, and keep the shortest. Repeat until no improvement is observed for a while.
This was one of the first useful nondeterministic computer algorithms.
Aa Bb Cc is identical to Cc Aa Bb (or bB aA cC). That's why we fixed one chain, to simplify the rest of the calculation.
EDIT: on reading TFA I find Norvig does essentially the same thing: "So let's arbitrarily say that all tours must start with the first city in the set of cities. We'll just pull the first city out, and then tack it back on to all the permutations of the rest of the cities." Interestingly he doesn't also eliminate the Aa-aA redundancy, because "First, it would mean we can never handle maps where the distance from A to B is different from B to A. Second, it would complicate the code (if only by a line or two) while not saving much run time."
This was one of the first useful nondeterministic computer algorithms.