The Definitive Guide to Troubleshooting and Resolving High Disk I/O Problems

Overview

Disk I/O refers to the read and write operations that execute on a hard disk, and disk I/O wait is the time the system spends waiting for those operations to complete. High disk I/O has a significant impact on your server's speed: it leads to slow performance, increased load, and an increased wait time for disk I/O.

Symptoms of high disk I/O

  • High server load — The average system load exceeds 1.

  • chkservd notifications — You receive notifications about an offline service or that the system cannot restart a service.

  • Slow hosted websites — Hosted websites may require more than a minute to load.

  • Slow delivery of email — The Exim service performs slowly or does not respond. Exim contains a large outbound mail queue.

  • Slow connection for email — The POP or IMAP services perform slowly or do not respond.

  • Slow Webmail interfaces — The Webmail interfaces perform slowly or do not respond (for example, Roundcube or Horde).

  • Slow WHM or cPanel interfaces — The WHM or cPanel interfaces perform slowly when you add email accounts, databases, or other items.

How to determine the disk I/O wait on your server

Use the top command to find the average wait time on your server

The %wa statistic at the top of the output indicates your server’s average disk wait.

If the I/O wait percentage is greater than 1/n, where n is the number of CPU cores, some CPU cores will have to wait before they can process data from the hard drive. For example, if the system has 4 processors and top reports a %wa of 8%, then the actual (per-processor) %wa is 8% / 4 = 2%. Because the actual %wa is larger than 1%, the processors must wait before they can process data on the hard drives.
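
The arithmetic in the example above can be sketched as two small shell helpers. This is a minimal sketch, not part of top or any other tool: the function names per_core_wait and wait_exceeds are hypothetical, and the 1% limit is simply the figure used in the example.

```shell
#!/bin/sh
# Hypothetical helpers for the per-processor %wa arithmetic.
# The names are illustrative and do not belong to any existing tool.

# per_core_wait WA CORES
# Divide the %wa value reported by top by the number of CPU cores
# and print the result with two decimal places.
per_core_wait() {
    awk -v wa="$1" -v n="$2" 'BEGIN { printf "%.2f\n", wa / n }'
}

# wait_exceeds VALUE LIMIT
# Exit with status 0 when VALUE is numerically greater than LIMIT,
# and with status 1 otherwise.
wait_exceeds() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}
```

With the numbers from the example, per_core_wait 8 4 prints 2.00, and wait_exceeds 2.00 1 succeeds. On a live system, the core count could come from nproc, and the %wa value can be read from the summary lines that top -bn1 prints in batch mode.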

Use the sar command to determine the history of your disk I/O wait

The sar command provides you with the history of the server’s load averages and can help you determine when your server experiences high disk I/O.
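
As a rough illustration of how that history might be searched, the sketch below filters the CPU report printed by sar -u (which includes a %iowait column) and keeps only the intervals above a chosen limit. The function name high_iowait_rows and the limit you pass to it are assumptions made for this example; sar -q is the variant that reports load averages.

```shell
#!/bin/sh
# Hypothetical filter for `sar -u` output read from standard input.

# high_iowait_rows LIMIT
# Locate the %iowait column from the header row, then print every data
# row whose %iowait value is greater than LIMIT. The "Average:" summary
# row is skipped because its columns can be offset from the data rows.
high_iowait_rows() {
    awk -v limit="$1" '
        /%iowait/ {
            for (i = 1; i <= NF; i++) if ($i == "%iowait") col = i
            next
        }
        col && $1 != "Average:" && $col + 0 > limit + 0 { print }
    '
}
```

A possible invocation is sar -u | high_iowait_rows 5, which would list the sampling intervals from today's data where %iowait was above 5.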

How to resolve a problem with high disk I/O

  • If your server’s hard disk has a low RPM speed or a slow interface technology, the disk itself may be the bottleneck. Consider upgrading your hard disk or distributing the load between multiple disks.

  • No bandwidth available on the hard disk — Upgrade the hard disk on your server or split the application load between separate hard disks.

  • Write caching is disabled — Enable write caching on the disk.

  • Degraded RAID array — Check the RAID array for a hardware malfunction. You should test and verify the hardware.

  • Software RAID array on the server reports busy; CPU uses slow parity calculation — Check the RAID array for a hardware malfunction. You should test and verify the hardware.

  • Software processes slowly — Upgrade the hard disk on your server or split the application load between separate hard disks.
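
For the RAID items above, one way to spot a degraded array is sketched below. It assumes Linux software RAID (md), where /proc/mdstat shows each array's member status as a bracketed string such as [UU], with an underscore marking a missing or failed member. The function name degraded_md_arrays is hypothetical, and hardware RAID controllers need their vendor's own tools instead.

```shell
#!/bin/sh
# Hypothetical check for degraded Linux md (software RAID) arrays.

# degraded_md_arrays
# Read /proc/mdstat-style text on standard input and print the name of
# every array whose status string contains "_" (for example "[U_]").
degraded_md_arrays() {
    awk '
        /^md[0-9]+ *:/ { name = $1 }
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    '
}
```

On a server with software RAID, degraded_md_arrays < /proc/mdstat prints nothing when every array is healthy. For the write caching item, hdparm -W followed by the device name reports the drive's current write-caching setting.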

Direct Memory Access

Direct Memory Access (DMA) improves hard drive and backup speeds. We strongly recommend that you enable DMA.

  • To enable DMA for a hard drive, run the following command:

    hdparm -d1 /dev/hda

  • To disable DMA for a hard drive, run the following command:

    hdparm -d0 /dev/hda

  • To measure a hard drive’s transfer rate, run the following command:

    hdparm -Tt /dev/hda

  • To view a hard drive’s enabled options, run the following command:

    hdparm /dev/hda

  • To view more information about a hard drive, run the following command:

    hdparm -i /dev/hda
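
Before changing the DMA setting, it can help to read the current state from the options listing. The sketch below parses the using_dma line that hdparm typically prints for legacy IDE devices such as /dev/hda. The function name dma_state is hypothetical, and the line format it expects ("using_dma = 1 (on)") is an assumption about typical output; devices that do not report a using_dma line produce no output.

```shell
#!/bin/sh
# Hypothetical parser for the options listing printed by `hdparm DEVICE`.

# dma_state
# Read hdparm output on standard input and print "on" or "off"
# according to the numeric value on the using_dma line.
dma_state() {
    awk '$1 == "using_dma" { print (($3 + 0 == 1) ? "on" : "off") }'
}
```

For example, hdparm /dev/hda | dma_state prints the current state, which can then guide whether to run the enable command listed above.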
