Prologue: My two-year-old laptop's hard disk died a few weeks ago, and I replaced it with a Seagate Momentus 5400.3 160GB SATA drive. I have been running only Linux on my laptop for a long time (Ubuntu Intrepid at present).
While trying to ascertain the cause of this premature death, I noticed an abnormally high Load_Cycle_Count. This can be checked using smartmontools by issuing the command
sudo smartctl -n standby -a /dev/sda
where /dev/sda has to be replaced with the appropriate disk name. The option -n standby ensures that if the disk is already in standby, smartctl doesn't wake it up.
A little bit of Googling returned quite a lot of material about this issue. Laptop hard disks, in order to improve power efficiency on battery, have quite aggressive power management enabled by default. That in itself is not bad: when the disk has not been accessed for some time, it spins itself down and stops drawing power unnecessarily. So far so good. However, no sooner has the disk stopped spinning than something causes it to spin up again. This not only defeats the whole purpose of the spindown, but also causes unnecessary wear and tear of the disk components.
Most modern HDDs have a mechanism which parks the head (unloads it onto a ramp) when the disk spins down. The head is loaded back over the platter once the disk spins up again. However, each load/unload cycle wears the loading mechanism a little. Seagate HDDs (and most others as well) are specified for a maximum of 60,000 load/unload cycles. This is quite high. But in my case the Load_Cycle_Count was increasing at a rate of about 5-6 per minute, which meant the head was parking and unparking roughly every 10 seconds on average. This was quite alarming.
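To put a number on that rate, two readings of the attribute taken some minutes apart can be compared. The Python sketch below shows one way to do it. The sample lines follow the attribute-table layout that smartctl prints, but the raw values in them are invented for illustration, and the helper names (load_cycle_count, cycles_per_minute) are my own, not part of smartmontools.

```python
def load_cycle_count(smartctl_output):
    """Return the raw Load_Cycle_Count from smartctl attribute text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows look like:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == "Load_Cycle_Count":
            return int(fields[9])
    return None


def cycles_per_minute(count_before, count_after, elapsed_seconds):
    """Average number of load cycles per minute between two readings."""
    return (count_after - count_before) * 60.0 / elapsed_seconds


# Hypothetical readings taken 10 minutes apart (values invented for the example).
before = "193 Load_Cycle_Count 0x0032 098 098 000 Old_age Always - 4200"
after = "193 Load_Cycle_Count 0x0032 098 098 000 Old_age Always - 4255"

rate = cycles_per_minute(load_cycle_count(before), load_cycle_count(after), 600)
print("cycles per minute:", rate)
print("seconds between cycles:", 60.0 / rate)
```

With these made-up numbers the rate comes out at 5.5 cycles per minute, i.e. one cycle roughly every 11 seconds, which is the kind of figure I was seeing.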
To stop such insane behavior, I set the Advanced Power Management level to 254 using hdparm (the -B option). A value of 254 means the least aggressive power management; by default Ubuntu sets it to 128. This did stop the Load_Cycle_Count from increasing insanely. But the disk now stopped spinning down, and its temperature shot up. Within an hour it went above 60°C (room temperature was around 20°C). That is even more alarming than the increasing load cycle count: the rated maximum operating temperature for my drive is 60°C, and operating at high temperature severely shortens the life of a disk. At a power management value of 180, the temperature settled at around 55°C. This was better, but still not good, as the disk was 35°C above ambient temperature. During peak summer, the ambient temperature in Kolkata hovers around 40°C. So my disk will get fried in the summer if I use my laptop in a room without air conditioning.
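The summer estimate is simple arithmetic, sketched below in Python. It rests on a rough assumption of mine: that the disk stays a fixed number of degrees above ambient whatever the room temperature is. Real thermal behavior is more complicated, so treat the result as a ballpark figure only. The function name is hypothetical.

```python
RATED_MAX_C = 60  # rated maximum operating temperature of the drive, in degC


def projected_disk_temp(disk_temp_c, ambient_c, new_ambient_c):
    """Project the disk temperature at a new ambient temperature.

    Assumes (as a simplification) a constant rise above ambient.
    """
    rise = disk_temp_c - ambient_c
    return new_ambient_c + rise


# Measured at APM 180: disk at 55 degC in a 20 degC room.
# Peak summer ambient: about 40 degC.
summer_temp = projected_disk_temp(55, 20, 40)
print("projected summer temperature:", summer_temp, "degC")
print("exceeds rated maximum:", summer_temp > RATED_MAX_C)
```

Under that assumption the disk would sit at about 75°C in summer, well past its rated maximum.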
So preventing the disk from spinning down is not a solution. What has to be ensured is that once the disk spins down, it stays that way as long as possible.
I needed to find out who was accessing the disk so frequently. iotop is a nice utility for this. wpa-supplicant was at the top of the list: I am using a wireless connection, and wpa-supplicant frequently logs something. Next was gconf-d, followed by gnome-do and console-kit-daemon. As soon as the disk spins down, one of these tries to do a read or write, causing the disk to spin up again. On top of that, every time the disk is accessed, kjournald writes the filesystem journal, and the atime, ctime and mtime of the file inodes get updated. All of these together keep the disk busy all the time and wake it up as soon as it tries to catch a nap.
However there is a utility called laptop-mode-tools which performs some tweaks and tries to keep the hard disk in standby mode as long as possible.
To enable it, first install laptop-mode-tools.
sudo apt-get install laptop-mode-tools
Then it has to be enabled in /etc/default/acpi-support by changing the line
ENABLE_LAPTOP_MODE=true
I changed the configuration file a bit, so as to optimize things as far as possible.
The configuration is in /etc/laptop-mode/laptop-mode.conf:
###### Config file for laptop-mode-tools

## Verbose output on
VERBOSE_OUTPUT=1

## Laptop mode enabled always
ENABLE_LAPTOP_MODE_ON_BATTERY=1
ENABLE_LAPTOP_MODE_ON_AC=1
ENABLE_LAPTOP_MODE_WHEN_LID_CLOSED=1

# When to enable data loss sensitive features
# -------------------------------------------
#
# When data loss sensitive features are disabled, laptop mode tools acts as if
# laptop mode were disabled, for those features only.
#
# Data loss sensitive features include:
# - laptop_mode (i.e., delayed writes)
# - hard drive write cache
#
# All of the options that follow can be set to 0 in order to prevent laptop
# mode tools from using them to stop data loss sensitive features. Use this
# when you have a battery that reports the wrong information, that confuses
# laptop mode tools.
#
# Disabling data loss sensitive features is ACPI-ONLY.

# Disable all data loss sensitive features when the battery level (in % of the
# battery capacity) reaches this value.
#
MINIMUM_BATTERY_CHARGE_PERCENT=3

# Disable data loss sensitive features when the battery reports its state
# as "critical".
#
DISABLE_LAPTOP_MODE_ON_CRITICAL_BATTERY_LEVEL=1

# The drives that laptop mode controls.
# Separate them by a space, e.g. HD="/dev/hda /dev/hdb". The default is a
# wildcard, which will get you all your IDE and SCSI/SATA drives.
#
HD="/dev/[hs]d[abcdefgh]"

# The partitions (or mount points) that laptop mode controls.
# Separate the values by spaces. Use "auto" to indicate all partitions on drives
# listed in HD. You can add things to "auto", e.g. "auto /dev/hdc3". You can
# also specify mount points, e.g. "/mnt/data".
#
PARTITIONS="auto /dev/mapper/*"

ASSUME_SCSI_IS_SATA=1

# Maximum time, in seconds, of work that you are prepared to lose when your
# system crashes or power runs out. This is the maximum time that Laptop Mode
# will keep unsaved data waiting in memory before spinning up your hard drive.
#
LM_BATT_MAX_LOST_WORK_SECONDS=900
LM_AC_MAX_LOST_WORK_SECONDS=600

#
# Should laptop mode tools control readahead?
#
CONTROL_READAHEAD=1

# 10MB readahead in laptop mode
LM_READAHEAD=10240
NOLM_READAHEAD=128

# Disks will be mounted with noatime in laptop mode, atime updates to file inodes will be
# stopped.
CONTROL_NOATIME=1

# Don't use relatime instead of noatime
USE_RELATIME=0

# set hdd timeout
CONTROL_HD_IDLE_TIMEOUT=1
LM_AC_HD_IDLE_TIMEOUT_SECONDS=60
LM_BATT_HD_IDLE_TIMEOUT_SECONDS=30
NOLM_HD_IDLE_TIMEOUT_SECONDS=7200

# set HDD power management
CONTROL_HD_POWERMGMT=1
BATT_HD_POWERMGMT=1
LM_AC_HD_POWERMGMT=127
NOLM_AC_HD_POWERMGMT=254

# enable write cache
CONTROL_HD_WRITECACHE=1
NOLM_AC_HD_WRITECACHE=1
NOLM_BATT_HD_WRITECACHE=0
LM_HD_WRITECACHE=1

CONTROL_MOUNT_OPTIONS=1

#
# Dirty synchronous ratio. At this percentage of dirty pages the process
# which calls write() does its own writeback.
# At 80percent of dirty pages disk write is performed. This holds up things in memory and
# prevents frequent disk writes
LM_DIRTY_RATIO=80
NOLM_DIRTY_RATIO=40

#
# Allowed dirty background ratio, in percent. Once DIRTY_RATIO has been
# exceeded, the kernel will wake pdflush which will then reduce the amount
# of dirty memory to dirty_background_ratio.
# Once writeout has commenced write as much as possible to disk, without keeping back anything.
# So this has been set to 1 percent
LM_DIRTY_BACKGROUND_RATIO=1
NOLM_DIRTY_BACKGROUND_RATIO=10

#
# kernel default settings -- don't touch these unless you know what you're
# doing.
#
DEF_UPDATE=5
DEF_XFS_AGE_BUFFER=15
DEF_XFS_SYNC_INTERVAL=30
DEF_XFS_BUFD_INTERVAL=1
DEF_MAX_AGE=30

#
# This must be adjusted manually to the value of HZ in the running kernel
# on 2.4, until the XFS people change their 2.4 external interfaces to work in
# centisecs. This can be automated, but it's a work in progress that still
# needs some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for
# external interfaces, and that is currently always set to 100. So you don't
# need to change this on 2.6.
#
XFS_HZ=100

#
# Seconds laptop mode has to to wait after the disk goes idle before doing
# a sync.
#
LM_SECONDS_BEFORE_SYNC=2
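The idle timeouts in this config are given in seconds, while the drive's standby timer (what hdparm -S sets) uses an encoded value: per the hdparm man page, 1 to 240 mean multiples of 5 seconds, and 241 to 251 mean 1 to 11 units of 30 minutes. The Python sketch below shows how the three timeouts above would map onto that encoding if they were applied by hand. The function name is hypothetical, and whether laptop-mode-tools rounds in exactly this way is an assumption on my part.

```python
def hdparm_standby_value(seconds):
    """Encode an idle timeout in seconds as an `hdparm -S` value.

    1..240 are multiples of 5 seconds, 241..251 are 1..11 units of
    30 minutes. Timeouts that do not fit exactly are rounded up.
    """
    if seconds <= 0:
        return 0  # 0 disables the standby timer
    if seconds <= 240 * 5:
        return -(-seconds // 5)  # ceiling division by 5 seconds
    half_hours = -(-seconds // 1800)  # ceiling division by 30 minutes
    if half_hours > 11:
        raise ValueError("timeout too long for the 30-minute encoding")
    return 240 + half_hours


# The three timeouts from the config above.
for name, secs in [
    ("LM_AC_HD_IDLE_TIMEOUT_SECONDS", 60),
    ("LM_BATT_HD_IDLE_TIMEOUT_SECONDS", 30),
    ("NOLM_HD_IDLE_TIMEOUT_SECONDS", 7200),
]:
    print(name, "->", hdparm_standby_value(secs))
```

So 60 seconds becomes 12, 30 seconds becomes 6, and the 2-hour timeout used when laptop mode is off becomes 244.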
After enabling laptop-mode, the HDD is able to sleep peacefully for quite some time between spinups. Also, the operating temperature rarely exceeds 50°C now. The load cycle count is still increasing, but at a much slower rate. Hopefully this HDD is going to last longer than the previous one.