
I agree that 33MHz is pushing it, but where did the value 33kHz come from? Why not 3MHz? 300kHz?


I didn't count the zeros accurately as I was typing on the train; the actual number I had estimated was about 330kHz. As others have mentioned there are several factors at play here, but three are particularly troublesome: the speed of propagation of a signal through a copper trace, the network response of long traces to a step function such as a digital signal transition, and the loss of signal integrity due to radiation from an unshielded trace over long distances.

The first one is pretty easy to understand: the output of a gate changes state, and some number of nanoseconds later that output has propagated to the end of the trace. If you're going to do something like "load register" from a bus, when you latch the register flip flops you need to make sure that the input signals are stable beforehand (called the 'setup' time) and that after the clock fires to latch the signals they stay stable long enough to be reliably transferred into the flip flops (the 'hold' time). These values vary for different logic families, and within families they vary by temperature and voltage, so a typical timing analysis will include all four 'corner' cases (high/low voltage, high/low temperature). The whole system can only go as fast as its slowest "setup + hold" time period.

And for more complex logic that can "stack": in the case of the ripple adder, each adder stage has a propagation delay before its output is accurate, and it feeds into the next bit, which has its own delay, so your total time in that adder is going to be the sum of all those per-bit delays. And before you start the add process you need to have the addends loaded into their registers (another setup and hold), and when the add completes the result has to end up in the destination register (another setup and hold). A typical micro-architecture will have each one of those on an internal clock, so the input clock is divided by "n" (where n is 2, or 4, or 8) and on each phase the next thing happens: phase 1, things latch into the operand registers; phase 2, they are transferred to the inputs of the ALU; phase 3, the result is latched into the destination register.
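To make the stacking concrete, here's a back-of-the-envelope sketch (my own made-up numbers, not anything from the megaprocessor) of how a ripple-carry adder's worst-case path bounds the clock rate:

```python
# Illustrative timing budget for an n-bit ripple-carry adder: the carry
# must propagate through every bit before the result can be latched,
# then the destination register needs its setup/hold window.

def max_clock_hz(n_bits, carry_delay_ns, setup_ns, hold_ns):
    """Return the fastest clock the worst-case ripple path allows."""
    critical_path_ns = n_bits * carry_delay_ns + setup_ns + hold_ns
    return 1e9 / critical_path_ns

# Hypothetical figures for a slow discrete-logic build:
# 16 bits, 50 ns carry delay per bit, 20 ns setup, 10 ns hold.
print(max_clock_hz(16, 50, 20, 10))  # ~1.2 MHz before any other margins
```

The point is just that the per-bit delays add linearly, so doubling the word width roughly halves the usable clock.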

The second one takes into account that the logic is actually analog (there is really no such thing as "digital" logic), so when a gate changes state, what it really does is start driving (or sinking) a current into a trace to push (or pull) that trace toward ground or toward a voltage potential. Most logic families have a "range" of what they consider one or the other. The waveform on an oscilloscope will show a ramp based on the LC "network" the gate driver sees on that trace. If it is unterminated (which means the trace is not impedance controlled) the signal may rise rapidly or slowly, it may "ring" a bit or not at all, and then settle down to the final value. On signal lines that are terminated, a pair of resistors makes the entire trace appear as an RLC 'tank' tuned to the nominal operating frequency of the clock, making for nice predictable corners. But the cost of that predictability is that the signals take time to change state, so propagation is slower still.
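A rough way to see why long unterminated runs slow things down: treat the driver plus trace as a lumped RC network (a simplification, and the numbers below are invented). The 10%-90% rise time of such a network is about 2.2RC:

```python
# Hedged sketch: lumped-RC approximation of a gate driving a long wire.
# The 10%-90% rise time is ~2.2 * R * C, and the clock can't usefully
# run faster than the line can swing between logic levels.

def rise_time_10_90(r_ohms, c_farads):
    """10%-90% rise time of a simple RC charging curve, in seconds."""
    return 2.2 * r_ohms * c_farads

# e.g. 100 ohm effective drive resistance into 1 nF of cable capacitance:
print(rise_time_10_90(100, 1e-9))  # 2.2e-07 s, i.e. 220 ns per edge
```

220 ns per edge already caps you well below a megahertz once setup/hold margins are added on top.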

Finally, and this is the one where you get to put your AM radio next to a computer and listen to the "music" it is creating: each of those unshielded traces is essentially a tiny antenna. And the longer the trace, the more likely it is to radiate rather than propagate the signal. There are particularly challenging issues around even fractions of a wavelength. You can ignore transmission line effects for short distances, you can shield longer traces (put a ground trace on either side) to mitigate losses, and you can reduce the operating frequency.
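A common rule of thumb (my addition, not from the comment above) is that a trace starts behaving as a transmission line once it is longer than roughly a tenth of the signal wavelength. Signals travel at about 0.6-0.7c in typical cable, so for a given wire length you can estimate where that kicks in:

```python
# Rough estimate of the frequency at which a wire of a given length
# reaches one tenth of a wavelength. Velocity factor of 0.66 is a
# typical assumed value for cable, not a measured one.

C = 3.0e8               # speed of light in vacuum, m/s
VELOCITY_FACTOR = 0.66  # assumed signal speed as a fraction of c

def critical_freq_hz(wire_len_m, fraction=0.1):
    """Frequency at which wire_len_m equals `fraction` of a wavelength."""
    wavelength_m = wire_len_m / fraction
    return VELOCITY_FACTOR * C / wavelength_m

print(critical_freq_hz(1.0))  # ~2e7 Hz: a 1 m run hits lambda/10 near 20 MHz
```

So meter-scale wiring forces transmission-line thinking well before you reach tens of megahertz, which is consistent with keeping a big discrete build down in the kHz range.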

There are a couple of discussions about this in the Horowitz & Hill "Art of Electronics", the DEC book "A guide to DEC hardware", and in a number of digital design textbooks (I don't know if current ones talk about it, but the ones from the '80s did).

It is one of the more interesting aspects of large systems digital design.


Thanks for the nice explanation. It makes one appreciate the complexity of designing a processor with 8B transistors which must operate error-free at 1GHz. Compared to that, designing a human brain to operate at 100Hz without requiring much precision is a piece of cake!


On the progress page (http://www.megaprocessor.com/progress.html) he does a test of an adder and estimates the end speed will be a bit less than 25kHz.


Error-corrected, matched-pair CAN bus maxes out at 500 kHz; this design has no matched termination and uses varying cable lengths, so it is obviously going to be much worse. If you're curious about how much worse, start here: https://en.wikipedia.org/wiki/Reflections_of_signals_on_cond...

http://www.smar.com/en/technical-article/inductive-coupling-... https://en.wikipedia.org/wiki/Distributed_element_model

http://www.connectworld.net/cable-length.html


Again, I'm not seeing why it was 33kHz exactly. It sounds like OP just used the wrong prefix.


I think it was just lowering the number by a few orders of magnitude to compensate for these effects and logic propagation delay (e.g. I think he said he's using a ripple adder, which has a critical path dependent on the bit length), not a precise calculation. My gut instinct says it's probably the right order of magnitude though.

You could probably run the long stretches at a much higher clock rate, or make the design run with asynchronous clock domains if you use good serialize/deserialize units.



