https://www.usenix.org/system/files/conference/cooldc16/cooldc16-paper-hirki.pdf...

db48x · 2025-08-28T18:30:23 1756405823

They didn’t measure 22W, they measured 6W + 22.1W + 4.9W + 1.8W + 4.8W + 11.2W = 50.8W. Add in 66.3W for the other three cores and that would be 117.1W. Benchmark #2 measured a few watts less than that.
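
To double-check that arithmetic, here is a quick sketch in Python. The variable names are mine, and the 66.3W figure assumes each of the other three cores draws the same 22.1W as the first:

```python
# Per-term power figures in watts, as quoted in the comment above.
terms_w = [6.0, 22.1, 4.9, 1.8, 4.8, 11.2]

one_core_total_w = sum(terms_w)

# Assumption: the other three cores each draw the same 22.1 W as the first.
other_cores_w = 3 * 22.1
four_core_total_w = one_core_total_w + other_cores_w

print(f"one core:   {one_core_total_w:.1f} W")
print(f"four cores: {four_core_total_w:.1f} W")
```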

But they don’t give the IHS temperature so you could repeat the exact same experiment using the same hardware and get different numbers simply because your cooling setup was better or worse than theirs.

menaerus · 2025-08-28T23:08:46 1756422526

My understanding, per Intel's documentation, is that RAPL gives them power consumption for the whole package, so I believe the 22W for Cores (W) in their figure is correct? Other figures, such as the instruction decoder, they seem to extrapolate from that figure, since RAPL doesn't and can't give information at that level of granularity? I could be wrong, but that's how I interpret their data and why I think the data is not meant to be added together.

As for the cooling setup, I think I agree. This is something that I didn't know, but it makes sense.

db48x · 2025-08-29T00:28:46 1756427326

Right, RAPL just reports a total power usage figure for the whole CPU. The authors then develop a model which they believe splits that total into multiple components that correspond to parts of the CPU. This is possible because CPUs provide performance counters that measure what the CPU is actually doing. For example, if you write programs that are very similar but have different ratios of cache hits and misses, then they’ll draw different amounts of power. You can use those differences to devise a formula for the amount of power used by the cache.
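
A minimal sketch of that idea in Python, assuming a plain least-squares line fit (the paper may well use a different procedure). The measurements are synthetic, generated from a made-up 6 W baseline and 4e-10 joules per cache hit, and `fit_line` is a hypothetical helper:

```python
# Synthetic data: (cache hits per second, total package watts).
# Generated from a made-up 6 W baseline and 4e-10 J per hit, not from the paper.
samples = [(rate, 6.0 + 4e-10 * rate) for rate in (1e9, 2e9, 4e9, 8e9)]

def fit_line(points):
    """Ordinary least squares for watts = baseline + joules_per_event * rate."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

baseline_w, joules_per_hit = fit_line(samples)
print(f"baseline: {baseline_w:.2f} W, per hit: {joules_per_hit:.2e} J")
```

The slope comes out in watts per (event per second), which is joules per event. With several counters instead of one, the same idea becomes a multiple regression.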

And indeed, they give their formula in section 4.2:

    P_package = 6.05 + cycles/second × 1.63×10⁻⁹
                     + μops issued/second × 2.15×10⁻¹⁰
                     + μops decoded/second × 1.40×10⁻¹⁰
                     + L1 hits/second × 4.35×10⁻¹⁰
                     + L2 references/second × 4.05×10⁻⁹

You can see that the power used by the whole package is the sum of six terms. The values that they calculated for those six terms for each of their benchmarks are given in table 4. The 22W figure for the core(s) is just based on the frequency the CPU is running at.
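
To make the six terms concrete, here is a small Python sketch that evaluates the formula. The coefficients are copied from the formula above; the dictionary keys, the `package_power` helper, and the example counter rates are all hypothetical and not taken from the paper:

```python
# Coefficients from the formula in section 4.2, in watts per (event per second),
# i.e. joules per event. 6.05 W is the constant term.
CONSTANT_W = 6.05
COEFFICIENTS = {
    "cycles": 1.63e-9,
    "uops_issued": 2.15e-10,
    "uops_decoded": 1.40e-10,
    "l1_hits": 4.35e-10,
    "l2_references": 4.05e-9,
}

def package_power(rates):
    """Return (per-term watts, total watts) for a dict of events per second."""
    terms = {name: coeff * rates.get(name, 0.0)
             for name, coeff in COEFFICIENTS.items()}
    terms["constant"] = CONSTANT_W
    return terms, sum(terms.values())

# Hypothetical counter rates, chosen only to exercise the function.
example_rates = {
    "cycles": 4e9,
    "uops_issued": 8e9,
    "uops_decoded": 8e9,
    "l1_hits": 2e9,
    "l2_references": 1e8,
}
terms, total = package_power(example_rates)
for name, watts in terms.items():
    print(f"{name:>14}: {watts:5.2f} W")
print(f"{'total':>14}: {total:5.2f} W")
```

Only the cycles term depends on clock frequency; the other terms scale with how much work the benchmark actually pushes through the decoder and caches.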