Is there any evidence that not enforcing read-read (dependent-load) ordering in hardware was actually crucial to Alpha's performance?
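(For context, the pattern at stake is the classic pointer-publication idiom: one CPU publishes a pointer, another reads it and dereferences it. A minimal sketch, assuming C11 atomics; the names `payload`, `publish`, and `consume` are just illustrative:)

```c
#include <stdatomic.h>
#include <stddef.h>

static int payload;
static _Atomic(int *) ptr;

/* Writer: initialize payload, then publish it with release semantics. */
void publish(void) {
    payload = 42;
    atomic_store_explicit(&ptr, &payload, memory_order_release);
}

/* Reader: the load of *p depends on the load of ptr (an address
 * dependency). Every mainstream architecture except Alpha orders
 * these two loads in hardware, so a relaxed load of ptr would
 * suffice there; on Alpha the dereference can observe a stale
 * payload, which is why portable code needs acquire (or consume)
 * here, and why the Linux kernel grew smp_read_barrier_depends(). */
int consume(void) {
    int *p = atomic_load_explicit(&ptr, memory_order_acquire);
    return p ? *p : -1;
}
```

On Alpha that ordering costs an explicit barrier instruction between the two loads; everywhere else the address dependency gives it to you for free, which is what makes the question above interesting.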
I'd guess that back when the Alpha memory model was designed, multiprocessors were quite rare, and the designers didn't have as clear a picture of the tradeoffs as we do today (not that today's understanding is perfect, just that it's better than what we had 30 years ago). So they chose the weakest model they could come up with, in order not to constrain future implementations.