Monday, June 29, 2009

Gimli Explorations (Core i7 Thermal)

My earlier posting introduced Gimli, our new Intel Core i7 920 system. Time for some progress reports. My main focus has been operating with a heavy scientific computing load -- running under the BOINC framework for distributed computing on large-scale scientific problems, such as the Search for Extraterrestrial Intelligence (SETI), protein folding, etc.

The system is based on a Gigabyte EX58-UD4P motherboard in an Antec "Solo" case:

The CPU cooler is the stock Intel system. The photo shows the bottom of the cooler with the central copper disk, which, after being coated with heat sink compound, presses down on the top of the CPU. Under the heat sink is a fan that draws air down (up in the photo) from the "cool" case interior, past the fins, and then radially outward to be drawn away by the case's nearby exhaust fan.

Normally, the Antec case is covered on the side by a solid removable panel, which works very well for noise control and appearance. However, Intel's Chassis Air Guide manual suggests a different "solution" for good thermal performance. They propose a duct leading from ambient air external to the case down to the top of the CPU cooler. This prevents recirculation of warm air within the case, which would tend to heat the CPU unnecessarily. So, I did experiments under 3 conditions: (1) case closed with normal side, (2) side removed, and (3) side removed with temporary duct installed. The photo shows the temporary duct in place, made from two 8.5 x 11 inch sheets of paper and Scotch tape. (I suppose I should have used duct tape...)

The computing load ranged from zero (Ubuntu Linux 9.04 idling) drawing 100 W AC power, to full load (8 parallel threads of SETIathome jobs) drawing about 200 W AC. The LCD display is not included in power measurements, and I am running at stock clock speeds -- no overclocking.

There are 9 temperature values reported by the CPU: one junction temperature for each hyperthread (two per CPU core) and one temperature for the CPU case (below the chip). There is considerable discussion about how to treat these values. For example, see the post at the Tom's Hardware site. There are calibration questions, and there are questions about what actual CPU temperatures should be permitted. The Core i7 has several self-protection features that will limit power dissipation if temperatures are allowed to get "too high". Unfortunately, there is apparently no published maximum temperature spec for the chip. In any case, the maximum we want to tolerate is subjective. Lower temperatures generally mean more operating stability and longer lifetimes.

It's not clear if there is any meaning to any difference between the two temperature readings associated with each physical core -- or even between the different cores, assuming that the Linux scheduler distributes the computing work equally among cores. These junction temperatures typically vary up to +/- 2 C among the group. I will report only the approximate median Tj of the 8 junction readings. I also report the case temperature Tc.
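As a rough illustration of the summary described above, here is a minimal Python sketch that reduces a set of junction readings to a median and a half-spread. The function name `summarize_junction_temps` and the eight sample values are hypothetical, made up only to show the arithmetic; they are not measurements from Gimli.

```python
from statistics import median


def summarize_junction_temps(readings_c):
    """Return (median, half-spread) for junction temperatures in degrees C."""
    if not readings_c:
        raise ValueError("need at least one reading")
    mid = median(readings_c)
    # Half the distance between the hottest and coolest reading,
    # i.e. the "+/-" figure quoted for the group.
    half_spread = (max(readings_c) - min(readings_c)) / 2
    return mid, half_spread


# Hypothetical sample: eight invented readings, one per hyperthread.
sample = [73, 75, 74, 77, 75, 76, 73, 75]
tj, spread = summarize_junction_temps(sample)
print(f"Tj ~ {tj:.0f} C (+/- {spread:.0f} C)")
```

The median is used rather than the mean so that one odd sensor does not pull the summary around.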

The numbers given are summaries of a number of readings. I don't claim great statistical power -- only a rough comparison of the options. Room temperature ~ 74 F.

Antec side panel on
System idling
Tj = 43 C; Tc = 41 C
System 75% load (6 jobs)
Tj = 72 C; Tc = 63 C
System 100% load (8 jobs)
Tj = 75 C; Tc = 63 C

Antec side panel off
System 100% load
Tj = 72 C; Tc = 62 C

Antec side panel off; paper duct installed
System 100% load
Tj = 67 C; Tc = 62 C

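Moral: Intel's ducting scheme really does make a substantial improvement. At full load, the duct lowers the junction temperature by 5 C relative to the open case and by 8 C relative to the closed case, while the case temperature Tc barely changes. A 5 - 8 C reduction is significant, but in the absence of hard temperature specifications, it is hard to say how great a benefit to system stability and lifetime will result.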
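The comparison can be worked through in a few lines of Python. This is only a sketch of the arithmetic, using the full-load readings listed above and the ~74 F room temperature; the names `FULL_LOAD`, `f_to_c`, and `tj_reduction_with_duct` are mine, chosen for illustration.

```python
def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


AMBIENT_C = f_to_c(74)  # room temperature, roughly 23.3 C

# Full-load (8 jobs) readings from the post: (Tj, Tc) in degrees C.
FULL_LOAD = {
    "panel on": (75, 63),
    "panel off": (72, 62),
    "panel off + duct": (67, 62),
}


def tj_reduction_with_duct(baseline):
    """Drop in junction temperature when moving from `baseline` to the duct."""
    return FULL_LOAD[baseline][0] - FULL_LOAD["panel off + duct"][0]


for name, (tj, tc) in FULL_LOAD.items():
    rise = tj - AMBIENT_C
    print(f"{name:18s} Tj = {tj} C, Tc = {tc} C, Tj rise over ambient = {rise:.1f} C")

print("Duct vs. panel on: ", tj_reduction_with_duct("panel on"), "C")
print("Duct vs. panel off:", tj_reduction_with_duct("panel off"), "C")
```

Expressing Tj as a rise over ambient makes it easier to compare against runs taken on a warmer or cooler day, assuming the rise stays roughly constant.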

Another moral is that if you buy a packaged Core i7 system from a mainstream manufacturer, you are likely to get an "engineered" thermal system that is better than what a do-it-yourselfer is likely to concoct with generic motherboards and cases. Of course, it's less fun and more expensive to buy complete systems!

The cooling improvement is specific to Intel's standard CPU cooler. There are numerous more exotic chip coolers on the market, which may not require a ducting system.

My options seem to be:
  • Accept up to 75 C junction temperatures when running full loads.
  • Cap operation at 50 - 75% of full load.
  • Cut a hole in the Antec side panel and install a permanent duct.
  • Check out a different CPU cooler.
Note also that the BOINC / SETI CPU load is only one particular application. It is not hard to make programs with more or less power dissipation, but this test does seem to be representative.

1 comment:

Unknown said...

Thanks for this! In the market for a case and you answered my question exactly to a T! Thank you!!