[Cialug] Slightly OT: Server Room Temperature

Justin Richeson jrnosee at gmail.com
Fri Sep 14 13:35:42 CDT 2007


I've been in a server room where the AC failed once.  No "heat" was turned
on, but the room was about 120F by morning and several servers had shut down
due to overheating.  I wouldn't recommend running in a non-AC environment,
and ALWAYS have a backup plan.  We had to bring in two portable AC units to
get back up and running.  Also, please...PLEASE tell me you don't have a
water-based fire suppression system in use!  You may think insurance will
handle things, but as we found out (by research, not by experience), if you
have to wait for replacement systems, pray that your data is recoverable, and
finally DO get back up probably 2 weeks to a month later, your business
will be GONE!
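
As one piece of a backup plan, the scripted shutdown idea quoted further
down (check the room monitor, shut down the least critical machines first)
could be sketched roughly like this in Python.  Everything specific here is
an assumption for illustration: the host names, the three tiers, and the
Fahrenheit thresholds are made up, and the temperature is passed in as a
plain number because each monitor exposes its readings differently.

```python
# Hypothetical shutdown tiers as (threshold in F, hosts).
# A lower threshold means less critical, so those hosts go down first.
SHUTDOWN_TIERS = [
    (95.0, ["build01", "test01"]),
    (105.0, ["web01", "mail01"]),
    (115.0, ["db01", "fileserver01"]),
]


def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def hosts_to_shut_down(temp_f, tiers=SHUTDOWN_TIERS):
    """Return every host whose tier threshold has been reached,
    ordered least critical first."""
    hosts = []
    for threshold_f, tier_hosts in sorted(tiers, key=lambda tier: tier[0]):
        if temp_f >= threshold_f:
            hosts.extend(tier_hosts)
    return hosts


if __name__ == "__main__":
    reading_f = 120.0  # the room temperature from the AC failure above
    print(f"{reading_f:.0f}F is about {fahrenheit_to_celsius(reading_f):.1f}C")
    for host in hosts_to_shut_down(reading_f):
        # Only prints the decision; actually powering a host off
        # (for example over ssh) is left out of this sketch.
        print("would shut down:", host)
```

Keeping the decision in a small function that only returns a list makes it
easy to dry-run against old readings before wiring it to a real shutdown
command.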

On 9/14/07, Major Stubble <major.stubble at gmail.com> wrote:
>
> Interesting that this topic should come up.  IEEE just had a paper
> published on predicting HDD failure that incorporated environmental
> factors.  I haven't logged in yet to read the article, so I don't
> know what the findings are.
>
> My most common failure with high-temperature data centers (if UNI's
> machine room can even be considered a 'data center') has been hard
> disk drives.  So, an article like this may prove useful as we argue
> about future expansion.
>
> -Nick
>
> http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=4292318&isnumber=4292269
>
> Hard Disk Drive Reliability Modeling and Failure Prediction
> Strom, B. D.; Lee, S.; Tyndall, G. W.; Khurshudov, A.
> Magnetics, IEEE Transactions on
> Volume 43, Issue 9, Sept. 2007 Page(s):3676 - 3684
>
> Abstract
> A reliability model for the hard disk drive (HDD) is developed,
> focusing on head–disk separation as the primary independent variable.
> The model is structured to incorporate the theoretical effects of
> environmental factors, plus empirical dependence on the product
> operating mode. An experimental method based on magnetic spacing loss
> theory is used to characterize the head–media separation as a
> function of temperature, altitude, humidity, and HDD operating mode.
> A statistical model based on these empirical data is developed to
> predict HDD reliability for various operating conditions. The
> predictions of the model are verified experimentally through
> comparison with HDD product reliability test data.
>
>
> On Sep 14, 2007, at 12:57 PM, Paul Gray wrote:
>
> > Jonathan C. Bailey wrote:
> > >> I've been considering a shell script to check the environmental
> > >> monitor in the room and shut down as needed. Least critical first,
> > >> of course.
> >
> > > I use weatherduck to monitor temperature, humidity, airflow and
> > > light levels (e.g., when someone comes into the server room).  They
> > > are outstanding monitoring units that sit on the back of a 1U/2U
> > > case mounted to the serial port.  Sitting on the back, I get a
> > > reading of the hot aisle temperature.  http://www.itwatchdogs.com/
> >
> > I'd also recommend those using server boards that are equipped to
> > monitor core temperature to use mbmon.  mbmon works exceptionally
> > well with my dual-socket, dual-core Opteron boards from Tyan.
> > There is a huge difference between ambient temperature and core
> > temperature (understatement).
> >
> > -PG
> > _______________________________________________
> > Cialug mailing list
> > Cialug at cialug.org
> > http://cialug.org/mailman/listinfo/cialug
>