12 July 2005

Why computers and heatwaves don't mix

Today I ended up reinstalling a video driver on a machine which had suffered some corruption on its hard drive. The fault seems to have occurred because the display driver was located on the part of the disk which failed. The usual sign of a dying drive is lots of found.00* folders created by Microsoft's disk-checking utility (chkdsk) as it tries to recover damaged sectors (chunks of data) on the drive.

Normally I would have said "Yep... dying drive" and found or ordered a replacement, but then I thought "Could this be the effect of heat?". Our office, for example, is currently 30.8°C. This is because we are opposite an air-conditioned lab with two fans near the door, another in the corridor and then two smaller fans blowing into our office. Although it feels cooler to me, the articles I've read on the topic suggest the fans are just evaporating moisture on my skin. Since computers don't produce this moisture, they don't lose any heat through evaporation.

So I fired up the Intel system monitor. My CPU (P4 3GHz) is running at 44°C. My motherboard has two other temperature sensors onboard, which are reading 39°C and 43°C. My system has a couple of extra fans which I fitted myself to improve the airflow. My Maxtor hard drive's specifications say it will run at up to 55°C. The question is, how close to that would this machine be if I hadn't put extra fans in? And even then, is the drive rock-solid at 54°C and then suddenly dies at 55°C, or is there a sliding scale (performance declines from 50°C onwards)?

The other problem is... how do you monitor a hard drive's temperature? For home systems it's easy: you go out and spend £30 on a fan controller with temperature sensors, but that's not practical or economical for 200 machines. It's also not necessary for the 3 labs which have air conditioning. However, the machine having the problem is a staff machine in an office. The windows are open but there's very little wind. I'm 99.9% certain the machine won't have rounded IDE cables inside, which would have helped the airflow. Although it's a lower-spec machine, I suspect its internal temperature is probably higher than mine.
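One possible way to check a drive's temperature without buying extra hardware is to ask the drive itself, since many drives report it through S.M.A.R.T. attribute 194. The sketch below is only a rough illustration, not something I have rolled out: it assumes the smartmontools `smartctl` command is installed, that the drive actually reports attribute 194 (not all do), and that the script has permission to query the disk. The function names and the use of the 55°C figure from the Maxtor specification as the limit are my own choices for the example.

```python
import subprocess

# Limit taken from the Maxtor specification quoted in the post (an assumption
# for any other drive model; check the manufacturer's data sheet).
DRIVE_LIMIT_C = 55


def parse_drive_temp(smartctl_output):
    """Return the raw value of S.M.A.R.T. attribute 194 in °C, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with the numeric ID;
        # the tenth column is the raw value.
        if len(fields) >= 10 and fields[0] == "194":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_drive_temp(device):
    """Run `smartctl -A <device>` and extract the temperature (hypothetical helper)."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_drive_temp(result.stdout)


def headroom(temp_c, limit_c=DRIVE_LIMIT_C):
    """Degrees left before the drive reaches its rated maximum."""
    return limit_c - temp_c
```

Something like `read_drive_temp("/dev/sda")` could then be run on each machine by a scheduled task and the result logged centrally; the device name is a placeholder and will differ between systems. It still doesn't answer whether a drive degrades gradually or fails outright at its limit, but it would at least show which offices are running closest to it.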

The best resource I've found on this subject so far is on the Antec website.
Some of it is common sense and some of it is too expensive to implement, but some of it is really useful.