Mass storage is a crucial element in the operation of nearly every computer, and therefore of every business. Although personal computers do not always need access to network disks, servers and larger data sets increasingly rely on specialised tools to share these essential resources.

These include the broad families of SAN (Storage Area Network), NAS (Network-Attached Storage) and DAS (Direct-Attached Storage) solutions. The last of these may in fact be implemented over network connections, which leads to some disputes over naming conventions.

Due to the significance of the issue, it’s worth taking a holistic approach to the topic of mass storage. Below, we present selected challenges facing organisations which, due to the large amount of data they work with (and by data we mean not only files but also, for example, whole disks and virtual machine images), must implement a suitable solution. The order in which these topics are discussed is not coincidental, but follows this train of thought:

  • searching for or implementing a technical solution without taking into consideration the specific business need it is meant to satisfy is unwise. Therefore, at the very beginning, we should understand what the business broadly understands by, and expects from, a mass storage solution
  • next, we should consider the technical possibilities of the solution, list its requirements, and understand any trade-offs that may have to be made when planning the solution and its architecture
  • finally, it’s essential to recognise the human factor and to determine whether, inside or outside the organisation, there are people capable of realising the desired business aims within the technological limitations of the environment.

Business challenges generated by mass storage:

  • the quantity of data in businesses is continuously growing, and the disk arrays holding this data must keep up with that growth. In other words, they must not be, or become, bottlenecks
  • data must be provided in a way that allows it to be used not only by specialised programs, but also by mature software in the broad sense, open or closed, and by new technologies which, because of how dynamically they evolve, do not always have an officially defined standard
  • the proposed solution must be sufficiently reliable, or at least limit downtime to a small and, ideally, predictable amount
  • as far as possible, software susceptible to “vendor lock-in” should be avoided. Solutions which “cannot be abandoned/unlicensed” or whose provider cannot be changed may in the long term cause uncontrolled and rapid growth in costs.

Technical challenges:

  • beyond a certain point, vertical scaling yields performance gains disproportionate to the investment and quickly reaches the limits of its capabilities. Optimal solutions scale horizontally, at least in theory in a linear fashion
  • a network mass storage solution must not only provide the required performance but also be flexible and allow for geo-replication
  • in the case of minor failures (e.g. two disks failing within a short period), the solution, usually a disk array, should continue to function and fully realise its business aims
  • in the case of a major failure (such as network fragmentation to the extent that a quorum cannot be reached), the data must remain consistent and retain its integrity (a minimal quorum sketch follows this list)
  • in the case of an unavoidable disaster (such as a fire in the server room or damage to several disks), the disk array or software should allow recovery from another geo-replicated location. It should be noted, however, that geo-replication with synchronous write confirmation is rarely used in practice because of its performance cost, which means that only a state from shortly before the disaster can be restored
  • in the case of a proprietary solution with closed source code, we are unable to make corrections or debug the software ourselves
  • also with closed source code, in particular in solutions built on smart devices, we often have either no possibility of fully understanding the decision-making mechanisms (for example, where data is written or how the disk array is balanced), or that possibility is severely restricted.
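
To make the quorum issue above concrete, here is a minimal sketch in Python of the majority rule used by many clustered storage systems (the function name `has_quorum` is ours for illustration, not any particular product’s API): after a network split, a partition may keep accepting writes only if it still sees a strict majority of the voting members; otherwise it must stop rather than risk a split-brain.

```python
def has_quorum(visible_members: int, total_members: int) -> bool:
    """Majority quorum: a partition may keep accepting writes only
    if it sees a strict majority of the voting members."""
    return visible_members > total_members // 2

# A 5-node cluster splits into partitions of 3 and 2 nodes:
print(has_quorum(3, 5))  # True  - the 3-node side keeps working
print(has_quorum(2, 5))  # False - the 2-node side must stop writing

# With an even split (2 vs 2 of 4), neither side has a majority, so
# both stop: consistency is preserved at the cost of availability.
print(has_quorum(2, 4))  # False
```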

The final, and perhaps most important, aspect is the human factor, without which no work can be done:

  • good people, responsible experts in managing mass storage, are hard to find. They know their worth, and they also know that by specialising in a closed technology they often diminish that worth
  • if further training is needed, or the team’s existing knowledge must be consolidated, open technologies have an enormous advantage: anyone can easily set up their own lab to explore the solution. Additionally, training (knowledge transfer) can be conducted by a different service provider, not just the supplier of the software.

The solution to most of the challenges outlined above is the idea of mass storage defined by software, in other words SDS, or software-defined storage. In brief, this idea amounts to adding a virtualisation or abstraction layer on top of the mass storage. The most significant benefits of SDS include:

  • operations on commonly used equipment
    Generally speaking, any server can act as a node in the cluster or virtual disk array, and equipment from different manufacturers can be mixed. Nonetheless, it should be remembered that adding a slower server to a well-performing cluster may degrade its performance
  • the modularity of solutions
    Thanks to appropriate algorithms and data structures, new nodes can be added to a running cluster. What’s more, in most cases selected nodes can be taken offline, so maintenance and repair work can be carried out calmly and rationally, and equipment can be replaced after a serious failure
  • fault tolerance
    SDS can usually define a practically unlimited number of replicas (in most cases, 3 replicas are considered enough). Moreover, more specialised algorithms can create redundancy in ways often impossible for systems using traditional mechanisms such as RAID. For example, it is possible to create a set of disks in which information is lost only when 5 out of 11 disks fail (see the erasure-coding sketch after this list)
  • autonomous operation and reaction to emergencies
    SDS allows systems to react automatically to emergencies, for example by auto-balancing to keep the cluster operating. If several servers fail, the software, holding replicas of the data, can increase the redundancy of some data while simultaneously reducing the redundancy of other data, maintaining data integrity and keeping the cluster running (a toy rebalancing sketch also follows this list)
  • substantial reduction in costs
    SDS licences (in particular for Open Source solutions with commercial support) are usually cheaper than dedicated, closed solutions. The same applies to the commonly used equipment mentioned previously
  • updates of software and new possibilities
    Software-defined mass storage running on commonly used equipment less often requires hardware replacement when new software versions are installed, and those versions sometimes bring, in addition to bug fixes, new capabilities in the broadest sense.
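
To illustrate the “5 out of 11 disks” example from the fault-tolerance point above: many SDS systems achieve such redundancy with erasure coding. In an MDS code such as Reed-Solomon, data is cut into k data shards plus m parity shards, and any k of the k + m shards suffice to rebuild the original. A minimal, illustrative sketch follows; the 7 + 4 layout is our assumption chosen to match the example, not a recommendation.

```python
def survives(n_shards: int, k_data: int, n_failed: int) -> bool:
    """An MDS erasure code (such as Reed-Solomon) splits data into
    k data shards plus (n - k) parity shards; the original data can
    be rebuilt from any k surviving shards."""
    return n_shards - n_failed >= k_data

# A hypothetical 7 + 4 layout on 11 disks: any 4 disks may fail,
# but a 5th simultaneous failure loses data.
N, K = 11, 7
print(survives(N, K, n_failed=4))  # True  - still recoverable
print(survives(N, K, n_failed=5))  # False - information is lost
```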
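
Finally, a deliberately naive toy model of the autonomous reaction described above (real systems such as Ceph use far more sophisticated, deterministic placement algorithms): when a node fails, every object that had a replica there receives a new replica on a surviving node, restoring the target redundancy.

```python
import random

def rebalance(placement: dict, failed: str, nodes: list) -> dict:
    """Toy re-replication: for every object that had a replica on
    the failed node, place a new replica on a surviving node that
    does not already hold one, restoring the replica count."""
    survivors = [n for n in nodes if n != failed]
    for obj, replicas in placement.items():
        if failed in replicas:
            replicas.remove(failed)
            candidates = [n for n in survivors if n not in replicas]
            if candidates:
                replicas.append(random.choice(candidates))
    return placement

# Three objects with 2 replicas each; node "b" dies:
placement = {"obj1": ["a", "b"], "obj2": ["b", "c"], "obj3": ["a", "c"]}
print(rebalance(placement, failed="b", nodes=["a", "b", "c", "d"]))
```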