What are RAID arrays: advantages, types

23.03.2022
Author: HostZealot Team

At home, in the office, and in large data centers, a RAID array can be useful everywhere. Such an array of hard drives can significantly increase processing speed and the reliability of storing large amounts of data. But only on one condition: that you have selected the configuration that is optimal for your needs. To avoid a mistake when creating a RAID array, it is worth evaluating the pros and cons of each type. Let's take a closer look at this in the following article.

What is a RAID?

Strictly speaking, RAID (Redundant Array of Independent Disks, originally "Inexpensive Disks") is a way of combining independent disk drives into a single logical unit by means of data virtualization. The term was proposed back in 1987. The technology was meant to solve several problems of hard disk drives, such as slow reading and writing of large files and low fault tolerance. Although such arrays are now sometimes built from SSDs (Solid State Drives), RAID systems more often rely on classic HDDs.

It is important to understand that RAID can be organized in two types of implementation:

  • Hardware-based. In this case, the hard drives are connected to the motherboard via a RAID controller with its own microprocessor. The controller can be built into the motherboard or installed as a separate card. The first option is a bit cheaper, but the second is much more efficient.
  • Software-based. Here, OS utilities take the place of the controller, which saves money. Such tools are available for both Windows and Linux. The main disadvantage is that they use the system's CPU and RAM to manage the array, which reduces overall performance.

Although the choice between these implementation types, as well as between HDD and SSD drives, matters, in practice more attention is paid to the array levels: RAID 0, RAID 1, and so on. They differ in the algorithms used to lay out and distribute data across the disks. Every level has its advantages and disadvantages, so each deserves a separate look.

RAID 0

This is the most basic type of RAID array. It is based on the principle of striping. Imagine that you have a book consisting of several chapters. With this method of combining disks, the first chapter of the book is sent to the first disk, the second is simultaneously sent to the second one, and so on. It is also possible to combine HDDs of different capacities and speeds in one array, although the overall speed is then determined by the slowest HDD in the group (there may also be restrictions at the SATA level).

RAID 0 has several advantages:

  • Transfer speed grows roughly in proportion to the number of disks in the array.
  • Full use of the available disk space in the array.
  • Potentially unlimited number of disks in the array.

But these advantages can be completely outweighed by the main drawback of RAID 0: if even one drive fails, you lose all the data. Let's return to the analogy with the book. The book becomes useless because it lacks the chapters from the disk that failed, even though you still have the rest. Therefore, it is better to use level 0 arrays only for temporary or non-critical files in systems that need high speed.
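The striping idea and its main weakness can be illustrated with a minimal Python sketch. It is a toy model, not a real driver: the function names, the in-memory lists standing in for disks, and the `chunk_size` parameter are all assumptions made for illustration.

```python
from typing import List, Optional


def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> List[List[bytes]]:
    """Split data into chunks and deal them round-robin across the disks."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def read_back(disks: List[Optional[List[bytes]]]) -> bytes:
    """Reassemble the data; a failed disk is modelled as None."""
    if any(disk is None for disk in disks):
        raise IOError("RAID 0: a member disk failed, the whole array is lost")
    out = []
    longest = max(len(disk) for disk in disks)
    # Read row by row, visiting the disks in the same order used for writing.
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)
```

Striping `b"chapter1chapter2chapter3"` over three disks with `chunk_size=8` puts one "chapter" on each disk. Replacing any one of the three lists with `None` makes `read_back` raise, which mirrors the book that is useless without its missing chapters.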

RAID 1

This is another of the simplest ways of combining disks. It is based on mirroring: your book is sent, chapter by chapter, to all disks at once. The minimum is 2 disks, and more can be added if you want extra copies; the number does not have to be even. In effect, RAID 1 constantly duplicates all the data: you have full copies, "mirrors", of your files.

The advantages of this type of RAID array include:

  • Complete preservation of all data in case of failure of any individual disk.
  • A gain in reading speed when parallelizing the request to the HDDs.

But this level has its share of disadvantages:

  • The write speed is no higher than that of a single disk.
  • With two disks, usable space is half of the total; in general it equals the capacity of one disk, whatever the number of mirrors.

To sum up, this array is the opposite of RAID 0: not fast, but reliable. Therefore, it is perfect for storing particularly valuable data that is not accessed so often. In this case, the overpayment for "cloned" disks is fully justified.
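Mirroring can be sketched in a few lines of Python. The `Mirror` class below is hypothetical and purely illustrative: each "disk" is a dictionary of blocks, and a failed disk is modelled as `None`.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy member disks."""

    def __init__(self, num_disks: int = 2):
        if num_disks < 2:
            raise ValueError("a mirror needs at least two disks")
        # Each disk maps a block number to its data.
        self.disks = [dict() for _ in range(num_disks)]

    def write(self, block: int, data: bytes) -> None:
        # The same block is stored on every disk that is still alive.
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def fail(self, index: int) -> None:
        # Simulate the loss of one physical disk.
        self.disks[index] = None

    def read(self, block: int) -> bytes:
        # Any surviving copy is good enough to serve the read.
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("RAID 1: all mirrors failed")
```

After `m = Mirror(2)`, `m.write(0, b"chapter 1")` and `m.fail(0)`, a call to `m.read(0)` still returns the chapter from the second disk. Only when every mirror has failed does the read raise an error.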

RAID 10

This level belongs to the nested group. It is a hybrid that nests RAID 1 and RAID 0, taking the best from each of them. To create it, you need at least 4 disks. The whole system works as follows: the first chapter of the book is written to the first and second HDDs (a mirror, as in RAID 1), the second chapter to the third and fourth (striping, as in RAID 0), the third chapter goes to the first and second again, and so on.
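The placement rule described above can be written as a short Python function. This is a sketch under simple assumptions: disks are numbered from 0, neighbouring disks form the mirror pairs, and the name `raid10_placement` is invented for the example.

```python
from typing import Tuple


def raid10_placement(chunk_index: int, num_disks: int = 4) -> Tuple[int, int]:
    """Return the two disks (a mirror pair) that store a given chunk."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch expects an even number of disks, at least 4")
    num_pairs = num_disks // 2
    # Striping: consecutive chunks rotate across the mirror pairs.
    pair = chunk_index % num_pairs
    # Mirroring: both disks of the chosen pair receive the same chunk.
    return (2 * pair, 2 * pair + 1)
```

With four disks, chunks 0, 1 and 2 (the first three "chapters") land on disks `(0, 1)`, `(2, 3)` and `(0, 1)` respectively.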

RAID 10 gives a serious gain on two key factors:

  • With four disks, the speed of writing and reading files roughly doubles compared to a single disk.
  • Data stays safe thanks to mirroring: each chapter always exists in two copies.

As for the reverse side of the coin:

  • The available storage volume is half the total of all HDDs.
  • Complication and increase in the cost of the entire system.

By the way, this type of RAID has a "brother" with index 01, where the nesting of levels 0 and 1 is reversed. Such a scheme achieves similar indicators of storage capacity and speed, but is inferior in fault tolerance – and therefore unpopular.
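The difference in fault tolerance can be checked by brute force. The sketch below counts how many two-disk failures each layout survives. It assumes a simplified model: in RAID 10 the neighbouring disks form mirror pairs, and in RAID 01 a stripe set becomes unusable as soon as any of its members fails. All function names are illustrative.

```python
from itertools import combinations


def survives_raid10(failed: set, num_disks: int = 4) -> bool:
    # Mirror pairs are (0,1), (2,3), ...; data is lost only if a whole pair is gone.
    return all(not {d, d + 1} <= failed for d in range(0, num_disks, 2))


def survives_raid01(failed: set, num_disks: int = 4) -> bool:
    # Two stripe sets (first half, second half) mirror each other.
    # The array survives while at least one stripe set is fully intact.
    half = num_disks // 2
    first = set(range(half))
    second = set(range(half, num_disks))
    return not (failed & first) or not (failed & second)


def count_survivable(check, num_disks: int = 4, failures: int = 2) -> int:
    """Count the failure combinations that the given layout survives."""
    return sum(
        check(set(combo), num_disks)
        for combo in combinations(range(num_disks), failures)
    )
```

For four disks there are six possible pairs of failed disks. Under this model `count_survivable(survives_raid10)` gives 4, while `count_survivable(survives_raid01)` gives 2.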

RAID 5

This level is similar to RAID 0, but with an important modification. Let's go back to the example of the book. Its first chapter is written to the first disk, the second to the second, and the so-called parity is sent to the third. Parity is not a copy of the data but a checksum (typically an XOR of the data blocks) from which any one missing block can be recalculated. The disk that holds parity changes from stripe to stripe, so no single drive becomes a parity bottleneck. However, experts believe that 3 disks are not the best number for such a RAID. Ideally, you need at least 4-5.
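How parity allows a lost block to be recovered is easiest to see in code. The Python sketch below assumes XOR parity, the usual choice for this level, and blocks of equal length. The function names and the particular rotation rule for the parity disk are illustrative assumptions, not a description of any specific controller.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks: List[bytes], stripe_index: int) -> List[bytes]:
    """Lay out one stripe: the data blocks plus one parity block.

    The parity position moves to a different disk on every stripe.
    """
    num_disks = len(data_blocks) + 1
    parity_disk = (num_disks - 1 - stripe_index) % num_disks
    stripe = list(data_blocks)
    stripe.insert(parity_disk, xor_blocks(data_blocks))
    return stripe


def rebuild(stripe: List[bytes], failed_disk: int) -> bytes:
    """Recover the block of a failed disk by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_disk]
    return xor_blocks(survivors)
```

For stripe 0 on three disks, `write_stripe([b"chap1", b"chap2"], 0)` places the two chapters on the first two disks and parity on the third. Calling `rebuild` with any one disk index returns exactly the block that disk held, whether it was data or parity.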

The main advantages of the fifth level of combined arrays are:

  • High speed with parallel disk operation (but lower than RAID 0).
  • More usable space than RAID 10: with 4 disks, 75% of the total capacity is available instead of 50%, and the gap grows with more disks.
  • All information stays safe if any one of the disks fails.

Like before, let's look at the drawbacks of such an approach:

  • For good performance, a capable and often expensive hardware controller is usually recommended.
  • Rebuilding the array after a disk failure can take a long time, because the missing data has to be recalculated from parity.

Summarizing the information on RAID 5: it is a good solution, but a specialized one rather than a universal way of combining drives.

RAID 6

This develops the idea of RAID 5: the information is still distributed block by block across the disks, but instead of a single parity block each stripe carries two, stored on two different drives. Accordingly, the minimum number of HDDs in such an array is 4. But again, it is better to have even more of them in a RAID 6 array.

The main advantage of this level is obvious: it will work even if two disks fail. This is the most reliable system described. However, you need to reasonably assess the probability of a double failure of the HDDs and match it up with the disadvantages of this RAID. Among them, the main ones are:

  • Reduced speed, especially for writing, compared to other complex arrays, because two parity blocks must be calculated and stored.
  • The capacity of two disks is given up to double parity.
  • Performance depends a lot on a complex and powerful controller.

Due to this combination of characteristics, the sixth level is quite rare, although in certain situations it is an indispensable method.
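The capacity trade-offs of the levels described so far can be compared with a small calculator. This Python sketch assumes that all disks in the array have the same size. The function names and the minimum disk counts used for validation are assumptions for illustration.

```python
def usable_disks(level: str, n: int) -> int:
    """How many disks' worth of space remains usable, for n equal disks."""
    minimum = {"0": 2, "1": 2, "10": 4, "5": 3, "6": 4}
    if level not in minimum:
        raise ValueError("unknown RAID level: " + level)
    if n < minimum[level]:
        raise ValueError("RAID " + level + " needs at least " + str(minimum[level]) + " disks")
    if level == "10" and n % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    if level == "0":
        return n          # striping only, nothing is spent on redundancy
    if level == "1":
        return 1          # every disk holds the same full copy
    if level == "10":
        return n // 2     # half of the disks are mirrors
    if level == "5":
        return n - 1      # one disk's worth of parity
    return n - 2          # RAID 6: two disks' worth of parity


def usable_fraction(level: str, n: int) -> float:
    """Share of the raw capacity that is available for data."""
    return usable_disks(level, n) / n
```

For four disks this gives 1.0 for RAID 0, 0.25 for RAID 1, 0.5 for RAID 10, 0.75 for RAID 5 and 0.5 for RAID 6. With eight disks, RAID 5 rises to 0.875 and RAID 6 to 0.75, while RAID 10 stays at 0.5.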

What can we say about RAID levels 2, 3, 4, and 7?

To briefly answer the question in the heading: these RAID levels are in very little demand today. The main reason lies in the specific technical nuances of how each array works. All of them are based on striping, but with special features:

  • RAID 2 uses the so-called Hamming code: error-checking and error-correcting data is stored on dedicated disks. Such an array becomes justified in terms of performance and reliability only with as many as 7 disks.
  • In RAID 3, information is striped byte by byte, and one dedicated drive stores parity. It offers high read and write speed, but only for large files. And since the array handles one request at a time, simultaneous access from two clients is difficult.
  • RAID 4 is similar to RAID 3, but it stripes data in blocks rather than bytes. On the one hand, this solves the problem of slow transfer of small files. On the other, writing is generally not fast, because parity is always sent to the same disk.
  • RAID 7 is similar to RAID 4, but with additional caching at the RAM level. This method is reliable and fast, but the array definitely needs a UPS, because cached data can be damaged during a power outage.
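To give a feel for the Hamming code mentioned for RAID 2, here is a sketch of the classic Hamming(7,4) code in Python. Linking it to the 7-disk figure (4 data bits plus 3 check bits, one bit per disk) is an assumption made for illustration, and the function names are invented for the example.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(word: List[int]) -> Tuple[List[int], int]:
    """Fix a single flipped bit.

    Returns the corrected codeword and the 1-based position of the
    flipped bit, or 0 if no error was detected.
    """
    s1 = word[0] ^ word[2] ^ word[4] ^ word[6]
    s2 = word[1] ^ word[2] ^ word[5] ^ word[6]
    s3 = word[3] ^ word[4] ^ word[5] ^ word[6]
    position = s1 + 2 * s2 + 4 * s3
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed, position
```

If each of the seven bits is imagined to live on its own disk, a single wrong bit can be both located and repaired, which is more than plain parity can do. The price is three check disks for every four data disks.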

It should also be noted that, in addition to the methods of combining disks described in this article, there are other, more exotic ones. Among them are RAID 61, RAID 03, and RAID-DP. But they are of interest mainly to narrow specialists and are far less common than the basic RAID 0 and RAID 1.

How do I find a fast, reliable repository for my project?

If you are putting together an array for a home photo archive, you can handle the choice of equipment yourself. But if we are talking about business solutions for an online project or a database, it is worth contacting professionals like HostZealot. When designing a solution for you, we can create any type of array, from RAID 0 and RAID 1 upward. As an infrastructure provider, we are also ready to implement individual projects for specific tasks, so we can build such arrays as RAID 1E or RAID 5 if needed. This lets you choose the optimal configuration and ensures both high read and write speed and the safety of your files. Moreover, our data centers are spread widely: from the USA to Hong Kong, from Stockholm to Tel Aviv. And do not forget about our many hosting services: dedicated servers, VPS, web hosting, colocation, flexible pricing, and round-the-clock, competent and attentive technical support. You can trust us!
