The fun bit is right at the start when the author notices that the compiler spots this and optimizes it away.
We didn't get into the deeper question of benchmarking it vs. a three-register swap, because I suspect the latter would be handled entirely by register renaming and end up being faster due to not requiring allocation of an ALU unit. Difficult to benchmark that because in order for it to make a difference, you'd need to surround it with other arithmetic instructions.
A meta question is why this persists. It has the right qualities for a "party trick": slightly esoteric piece of knowledge, not actually that hard to understand when you do know about it, but unintuitive enough that most people don't spontaneously reinvent it.
The other classic use of XOR - cursor overdrawing - has also long since gone away. It used to be possible to easily draw a cursor on a monochrome display by XORing it in, then to move it you simply XOR it again, restoring the original image.
Reminds me of the time developing the 747-8 avionics and the systems engineers started observing a bug where the entire box would just stop randomly, all output stopped, including HW traces...the thing would just halt.
About a month later an engineer decided to turn on all warnings for gcc...and behold a message stating something to the effect of "WARNING: execution will halt upon reaching this statement." The compiled code basically just halted and never returned from the function call (can't remember the specifics).
And that is how we learned not to hide compiler messages.
Right out of college, one of my first job offers was to work on the compiler for the computer for the Space Shuttle. Apparently because I had once taken a compiler course.
Even young, naive, optimistic me thought to question the wisdom of that offer.
I ended up not taking it because the pay wasn’t great (and at the time it wasn’t really what I wanted to do), but part of me is still curious about what that would have been like.
This is one of those things that is hard to do, but great to have done. It's hard to do with a codebase in flight with lots of people working on it.
What I remember implementing this on projects was the messiness of:
- incrementally getting the Makefiles to turn on -Wall file-by-file as they were scrubbed. I think it was something similar to "<list-of-files>: CFLAGS+=-Wall" and then add to the list.
- suppressing warnings that were "ok" on a case-by-case basis. different languages had different ways of saying "ignore error 123 here" if at all.
- I remember lint had things like this too, like /NOTREACHED/
The cursor overdrawing trick in part started going away before it's time thanks to patent enforcement (one of the lawsuits infamously exacerbated Commodores financial woes towards the end)... By the time the patent expired there was really no longer much value in going back to it.
> A meta question is why this persists. It has the right qualities for a "party trick"
This is discussed in a section near the end. But I feel like the discussion of why people care about using XOR to swap two values is missing an underlying discussion of why people would care about swapping values. As shown earlier in the division example, at the point where you would do the swap, typically you can just write the rest of the code with the roles of the values swapped.
if max < min:
min, max = max, min
[... algorithm that requires min < max]
So your suggestion would be to have two versions of the algorithm in the two branches of the 'if'. This is significantly more complicated and may even be slower depending on lots of factors.
> typically you can just write the rest of the code with the roles of the values swapped
Today, yes; most modern instruction sets are pretty orthogonal, and you can use a value in any register symmetrically -- although even today, division instructions (if they exist!) are among the most likely to violate that expectation, along with instructions that work with the stack pointer. But in the XOR heyday, this was less true -- instruction sets were less orthogonal, and registers were more scarced. It's not unreasonable for an OS scheduler tick to do some work to figure out the newly-scheduled task's stack pointer in one register, and need to swap it into an SPR or similar so that the return from interrupt returns to the new location, for example; and this is the exact type of place where the XOR trick occasionally has value.
Another similar trick - XOR doubly linked list: XOR the prev and next pointers (storaged size of node decreases) and you can recover the values when accessing the node from prev or next side and thus already know one of the addresses.
the "XOR in hash functions" mentioned at the end most certainly refers to Zobrist hashing, the actually useful XOR trick.
Requires fixed-length keys. Generate random table for each byte position. Then to hash a key, for each position lookup byte in table and XOR the results.
This is used in chess/go/shogi/... engines because the board/position representation is fixed-length and you can undo modes easily - XOR-out byte from previous state and XOR-in from new state.
We didn't get into the deeper question of benchmarking it vs. a three-register swap, because I suspect the latter would be handled entirely by register renaming and end up being faster due to not requiring allocation of an ALU unit. Difficult to benchmark that because in order for it to make a difference, you'd need to surround it with other arithmetic instructions.
A meta question is why this persists. It has the right qualities for a "party trick": slightly esoteric piece of knowledge, not actually that hard to understand when you do know about it, but unintuitive enough that most people don't spontaneously reinvent it.
See also: https://en.wikipedia.org/wiki/Fast_inverse_square_root , which requires a bit more maths.
The other classic use of XOR - cursor overdrawing - has also long since gone away. It used to be possible to easily draw a cursor on a monochrome display by XORing it in, then to move it you simply XOR it again, restoring the original image.