Tuesday, April 22, 2008

RAID Information [Definitions]

Definitions

Before we dive into RAID Levels, we need to define a few terms that will be used throughout this paper, so that when a term is used, you will know exactly what it is intended to mean.

Concatenated array
This is an array in which multiple disk drives or arrays are logically connected together, end to end. This was the earliest method of combining multiple disks. This type of array has no redundancy, and it is not one of the RAID levels. Some manufacturers incorrectly call this JBOD.
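
To make the end-to-end layout concrete, here is a minimal sketch of how a logical block number could be mapped onto a concatenated array. The function name `locate_block` and the drive sizes are illustrative assumptions, not taken from any particular controller.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_block(disk_sizes, logical_block):
    """Map a logical block to (disk index, block offset on that disk).

    disk_sizes is a list of drive capacities in blocks (hypothetical values).
    """
    ends = list(accumulate(disk_sizes))  # cumulative end of each drive
    if not 0 <= logical_block < ends[-1]:
        raise ValueError("logical block is outside the array")
    disk = bisect_right(ends, logical_block)  # first drive whose end is past the block
    start = ends[disk] - disk_sizes[disk]     # first logical block on that drive
    return disk, logical_block - start


sizes = [100, 50, 200]           # three drives of different sizes
print(locate_block(sizes, 120))  # (1, 20)
```

Each drive is filled completely before the next one is used, which is why a concatenated array gains capacity but no redundancy.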

Data Drive
A data drive is a disk drive that is dedicated to storing data, as opposed to parity, Hamming code, or a hot standby. In RAID levels where the data and parity move around, no single drive holds only data; the number of data drives is then the amount of space, measured in whole drives, that is available to store data.

The number of data drives in a RAID array is designated by the first number in the size description. In some cases, this is the only number.

Hamming Code
A Hamming Code is an algorithm that can be used to determine if an error exists in a data stream, and sometimes (dependent on the exact code used) correct that error. This is sometimes referred to as an Error Correction Code (ECC).

The number of Hamming Code drives in a RAID array is designated by the number after the first plus sign ('+'). If there is no first plus sign, then there are no Hamming Code drives. This is the same location where the parity number is found. Except for RAID-1, if the value is one, then it is a parity drive; if it is greater than one, then they are Hamming Code drives.
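
As an illustration of error detection and correction, the sketch below uses the classic Hamming(7,4) code, which protects four data bits with three check bits. This particular code is chosen only because it is small; it is an assumption for the example, not a statement about which code any RAID implementation uses. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # check bit for positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # check bit for positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # check bit for positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, position of the flipped bit or 0)."""
    fixed = list(codeword)
    syndrome = 0
    for position, bit in enumerate(fixed, start=1):
        if bit:
            syndrome ^= position  # XOR of the positions of all set bits
    if syndrome:
        fixed[syndrome - 1] ^= 1  # a single-bit error sits at this position
    return fixed, syndrome


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                       # flip the bit at position 5
fixed, position = hamming74_correct(damaged)
print(fixed == word, position)        # True 5
```

This code corrects any single flipped bit; two flipped bits in the same codeword would be miscorrected, which is one reason the capability depends on the exact code used.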

Hot Standby
This is a disk drive that is reserved for use, in case a failure occurs in one of the active drives on the array. If such a failure occurs, the hot standby drive will start being rebuilt to replace the failed drive. This usually happens automatically, but some RAID implementations may require human intervention.

The number of hot standby drives in a RAID array is designated by the number after the second plus sign ('+'). If there is no second plus sign, then there are no hot standby drives.
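
The size notation described so far (data drives, then parity or Hamming Code drives, then hot standbys) can be sketched as a small parser. The function `parse_size` and its field names are assumptions made for this example, and the RAID-1 exception mentioned above is deliberately not handled.

```python
def parse_size(description):
    """Split a size description such as '8+1+1' into its drive counts."""
    parts = [int(part) for part in description.split("+")]
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected one to three numbers separated by '+'")
    data = parts[0]
    check = parts[1] if len(parts) > 1 else 0        # after the first '+'
    hot_standby = parts[2] if len(parts) > 2 else 0  # after the second '+'
    if check == 0:
        kind = "none"
    elif check == 1:
        kind = "parity"    # a value of one means a parity drive
    else:
        kind = "hamming"   # more than one means Hamming Code drives
    return {"data": data, "check": check, "check_kind": kind,
            "hot_standby": hot_standby}


print(parse_size("8+1+1"))
```

For "8+1+1" this reports eight data drives, one parity drive, and one hot standby.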

JBOD
This is an acronym for 'Just a Bunch Of Disks'. Most disk controllers don't have any RAID functionality built into them. In these cases, the Operating System sees the disk drives connected to the controller as JBOD.

In addition, many RAID controllers default to a JBOD configuration before being configured for RAID.

It should be noted that some manufacturers use the term JBOD to refer to a Spanned or Concatenated array.

Logical Disk
A logical disk is what a RAID array presents to the Operating System. Although the RAID array consists of multiple disks, it appears to the Operating System as a single disk.

Mirror Disk
These disks are used in RAID-1. A mirror disk contains an exact duplicate of the disk that it mirrors. More than one mirror disk is sometimes used.

Parity
Parity is used in RAID-3, RAID-4 and RAID-5 to validate the data written to the RAID array and, if a drive fails, to reconstruct the missing data from the remaining drives. Parity across the array is computed using the XOR (Exclusive OR) logical operation. This is a very fast operation, but it transfers a great deal of information to and from memory. It should be noted that parity is a special case of a Hamming Code.

Except for RAID-1, the number of parity drives in a RAID array is designated by the number after the first plus sign ('+'). If there is no first plus sign, then there are no parity drives. This is the same location where the Hamming Code number is found. In general, if the value is one, then it is a parity drive; if it is greater than one, then they are Hamming Code drives.
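
The XOR parity computation described in this entry can be sketched in a few lines. The helper `xor_blocks` and the sample byte strings are made up for illustration; a real array would work on whole segments rather than four-byte strings.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR several equal-length byte strings together, byte by byte."""
    blocks = list(blocks)
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(xor, column) for column in zip(*blocks))


data = [b"RAID", b"disk", b"test"]  # contents of three data drives
parity = xor_blocks(data)           # contents of the parity drive

# Simulate losing the second drive and rebuilding it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])           # True
```

Because XOR is its own inverse, the same routine both generates the parity and rebuilds a missing block.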

Physical Disk
A physical disk is an actual disk drive. This term is sometimes used to distinguish it from a logical disk.

RAID
This is an acronym for 'Redundant Array of Independent Disks'. Originally, the I stood for Inexpensive, and it is sometimes still written that way.

Segment size
This is the number of blocks (sometimes expressed in bytes) that are written to one disk drive, before moving on to the next disk drive in the array. It does not apply to RAID-1 or RAID-3.

This is sometimes called Stripe Size. The term Stripe Size is only valid for RAID-0 arrays. Other RAID levels do not have stripes. All RAID levels (except RAID-1 and RAID-3) have segments.

Spanned array
This is another name for a Concatenated array.

Spindle Sync
Spindle Sync is a feature that allows multiple disk drives to operate in sync with each other. When enabled, and properly cabled, a series of disk drives will spin at the same speed, and a given sector will pass under the heads of all the drives at the same time.

This feature will result in a performance improvement in virtually all RAID arrays. This is because it has an impact on the statistical rotational latency of a RAID array. As an example, in a RAID-3 array using 7200RPM disk drives, configured as 8+1, the statistical rotational latency will be 4.17ms if the drives are in sync, but 8.30ms if the drives are not in sync.

This feature is available on many SCSI and Fibre Channel disk drives, but I have never seen it on an IDE disk drive.
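
The rotational latency figures in this entry can be approximated with a rough model. The sketch below assumes that a synchronized array waits half a rotation on average, and that an unsynchronized array must wait for the slowest of several drives whose rotational positions are independent and uniformly distributed. That second assumption is mine; the exact model behind the 8.30ms figure is not stated here.

```python
def rotation_time_ms(rpm):
    """Time for one full rotation, in milliseconds."""
    return 60_000 / rpm


def synced_latency_ms(rpm):
    """Average wait when all drives rotate in step: half a rotation."""
    return rotation_time_ms(rpm) / 2


def unsynced_latency_ms(rpm, drives):
    """Expected wait for the slowest of `drives` independent drives.

    Uses the expected maximum of n uniform delays on [0, T]: T * n / (n + 1).
    """
    return rotation_time_ms(rpm) * drives / (drives + 1)


print(round(rotation_time_ms(7200), 2))        # 8.33
print(round(synced_latency_ms(7200), 2))       # 4.17
print(round(unsynced_latency_ms(7200, 9), 2))  # 7.5
```

Under this simple model the unsynchronized figure for nine drives comes out somewhat below the 8.30ms quoted above, but it approaches one full rotation (8.33ms) as the number of drives grows, so the conclusion is the same: without spindle sync, the array waits for close to a full rotation.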

Stripe size
This is similar to Segment size, except that it is only valid for RAID-0 arrays. Many manufacturers use this term when they mean Segment size.

Stripe width
This is the number of blocks that must be written to the array, so that every data drive has had a complete segment written.
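
The relationship between segment size and stripe width can be sketched as follows. The example assumes dedicated data drives filled in simple round-robin order, and it ignores parity placement entirely; the function names are hypothetical.

```python
def stripe_width(segment_size, data_drives):
    """Blocks written before every data drive holds one complete segment."""
    return segment_size * data_drives


def locate(logical_block, segment_size, data_drives):
    """Map a logical block to (data drive index, block offset on that drive)."""
    segment, within = divmod(logical_block, segment_size)
    row, drive = divmod(segment, data_drives)  # row = segments already on each drive
    return drive, row * segment_size + within


print(stripe_width(16, 4))  # 64
print(locate(16, 16, 4))    # (1, 0)
print(locate(64, 16, 4))    # (0, 16)
```

With a segment size of 16 blocks and four data drives, block 16 is the first block on the second drive, and block 64 wraps around to the second segment of the first drive.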

XOR
The Exclusive OR (XOR) logical function is used to generate parity. It is also used in Hamming codes. The XOR function is similar to a binary addition, without any carry operations. The binary truth table for an XOR is as follows:

A   B   A XOR B
0   0   0
0   1   1
1   0   1
1   1   0
