Spotlight on SMB Storage

Originally published in edited form in SpiceWorks’ “Spotlight on Storage” series as the inaugural article.

Storage is a hard nut to crack. For businesses, storage is difficult because it often involves big price tags for what appear to be nebulous gains. Most executives understand the need to “store” things, and more of them, but they understand very little about performance, access methods, redundancy and risk calculations, backup and disaster recovery. This makes the job of IT difficult because we need to explain why budgets often need to be extremely large for what appears, to the business stakeholders, to be an invisible system.

For IT, storage is difficult because storage systems are complex (often the single most complex system within an SMB) and, due to their expense and centralization, often exist in very small quantities within a business. This means that most SMBs, if they have any storage systems, have only one and keep it for a very long time. This lack of broad exposure, combined with the relatively infrequent need to interact with storage systems, leaves SMB IT departments dealing with a large budget item of incredible criticality to the business that makes up only a small percentage of their day-to-day tasks and with which, by the very nature of the beast, they have very little experience. Other areas of IT are far more accessible for experimentation, testing and education purposes.

Between these two major challenges we are left with a product that is, in general, poorly understood by both management and IT. Storage is so misunderstood that IT departments are often not even aware of what they need and are doing little more than throwing darts at the storage dart board and starting from wherever the darts land. Often they start by calling vendors rather than consultants, which leads them down a path of “decision already made” while they are seemingly getting advice.

Storage vendors, knowing all of this, do little to aid the situation. Once contact between an SMB and a vendor is made, it is in the vendor’s best interest not to educate the customer, since the customer already made the decision to approach that vendor before having the necessary information at hand. So the vendor simply wants to sell whatever they have available. Seldom does a single storage vendor have a wide range of products in its own lines, so going directly to a vendor before knowing exactly what is needed goes much, much farther towards having effectively already decided what to buy than it does in other arenas of technology. This can cause costs to be off by orders of magnitude compared to what is needed.

Example: Most server vendors offer a wide array of servers, both in the x64 family and in large scale RISC machines and other niche products. Most storage vendors offer a small subset of storage products: only SAN, or only NAS, or only “mainframe” class storage, or only small, non-replicated storage, etc. Only a very few vendors have a wide assortment of storage products to meet most needs, and even the best of these lack full market coverage, from the smaller SMB market through the mid and enterprise markets.

So where do we go from here?  Clearly this is a serious challenge to overcome.

The obvious option, and one that shops should not rule out, is turning to a storage consultant: someone who is not reselling a solution or, at the very least, is not reselling a single solution but has a complete solution set from which to choose; someone who is able to provide a low cost, $1,000 solution as well as a $1,000,000 solution; someone who understands NAS, SAN, scale out storage, replication, failover, etc. When going to your consultant, don’t presume that you know what your costs will be. There are many, many factors and by considering them carefully you may be able to spend far less than you had anticipated. But do have budgets in mind, risk aversion well documented, costs for downtime calculated and a very complete set of anticipated storage use case scenarios.

But turning to a consultant is certainly not the only path.  Doing your own research, learning the basics and following a structured decision making process can get you, if not to the right solution, at least a good way down the right path.  There are four major considerations when looking at storage: function (how storage is used and accessed), capacity, speed and reliability.

The first factor, function, is the most overlooked and the least understood.  In fact, even though this is the most basic of concerns, this is often simply swept under the carpet and forgotten.  We can answer this question by asking ourselves “Why are we purchasing storage?”

Let’s address this systematically.  There are many reasons that we will be buying storage.  Here are a few popular ones: to lower costs over having large amounts of storage locally on individual servers or desktops, to centralize management of data, to increase performance and to make data more available in the case of system failure.

Knowing which of these factors, or whether some other factor not listed here, is driving you towards shared storage is important as it will likely provide a starting point in your decision making process. Until we know why we need shared storage we will be unable to look at the function of that storage which, as we already know, is the most fundamental decision making factor. If you cannot determine the function of the storage then it is safe to assume that shared storage is not needed at all. Don’t be afraid to make this decision; the vast majority of small businesses have little or no need for shared storage.

Once we determine the function of our shared storage we can, relatively easily, determine capacity and performance needs. Capacity is the easiest and most obvious aspect of storage. Performance, or speed, is easy to state and explain but much more difficult to quantify, as IOPS are at best a nebulous concept and at worst completely misunderstood. IOPS come in different flavours and there are concerns around random access, sequential access, burst speeds, latency and sustained rates, and then come the differences between reading and writing! It is difficult even to determine the needed performance, let alone the expected performance of a device. But with careful research, this is achievable and measurable.
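As a rough illustration of why a bare IOPS figure says so little, the arithmetic below converts IOPS into throughput for a given I/O size and blends separate read and write rates into one figure for a mixed workload. This is only a sketch: the function names and the sample numbers are invented for illustration, and the blended figure assumes operations are serviced one after another, which real devices with queues and caches do not strictly do.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Approximate throughput in MB/s (1 MB = 1024 KB) for a given I/O size."""
    return iops * io_size_kb / 1024


def blended_iops(read_iops: float, write_iops: float, read_fraction: float) -> float:
    """Effective IOPS for a mixed workload, assuming operations run one at a time.

    The average time per operation is the weighted sum of the read and write
    service times, so the blend is a weighted harmonic mean, not a simple average.
    """
    avg_seconds_per_op = read_fraction / read_iops + (1 - read_fraction) / write_iops
    return 1 / avg_seconds_per_op


if __name__ == "__main__":
    # Hypothetical device: the same 200 IOPS means very different things
    # depending on how large each operation is.
    print(throughput_mb_per_s(200, 4))      # small random I/O
    print(throughput_mb_per_s(200, 1024))   # large sequential I/O
    # Hypothetical 50/50 mix on a device that reads four times faster than it writes.
    print(round(blended_iops(1000, 250, 0.5)))
```

With these made-up numbers, 200 IOPS is under 1 MB/s at 4 KB per operation but 200 MB/s at 1 MB per operation, and the 50/50 mix lands near 400 IOPS rather than the 625 that a simple average would suggest.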

Our final factor is reliability. This, like functionality, seems to be a recurring stumbling point for IT professionals looking to move into shared storage. It is important, nay, absolutely critical, to keep in mind that storage is “just another server” and that the concepts of redundancy and reliability that apply to normal servers apply equally to dedicated shared storage systems. In nearly all cases, enterprise storage systems are built on enterprise servers: same chassis, same drives, same components. What is often confusing is that even SMBs will look to mid or high end storage systems to support much lower end servers, which can cause storage systems to appear mystical in the same way that big iron servers may appear mystical to someone used only to commodity server hardware. But do not be misled: the same principles of reliability apply and you will need to gauge risk exactly as you always have (or should have) to determine what equipment is right for you.

Taking time to assess, research and understand storage needs is very important, as your storage system will likely remain a backbone component of your network for a very long time due to the extremely high cost and complexity of replacing it. Unlike the latest version of Microsoft Office, a new shared storage system will not have a direct impact on an executive’s desktop and so lacks the flash necessary to drive “feature updates” as well.

Now that we have our options in front of us we can begin to look at real products.  Based on our functionality research we now should be able to determine if we are in need of SAN, NAS or neither.  In many cases – far more than people realize – neither is the correct choice.  Often adding drives to existing servers or attaching a DAS drive chassis where needed is more cost effective and reliable than doing something more complex.  This should not be overlooked.  In fact, if DAS will suit the need at hand it would be rare that something else would make sense at all.  Simplicity is the IT manager’s friend.

There are plenty of times when DAS will not meet the current need. Shared storage certainly has its place, even if only to share files between desktop users. With today’s modern virtualization systems shared storage is becoming increasingly popular, although even there DAS is too often avoided even when it might suit the existing needs well.

With rare exception, when shared storage is needed, NAS is the place to turn. NAS stands for Network Attached Storage. NAS mimics the behaviour of a fileserver (NAS is simply a fileserver packaged as an appliance), making it easy to manage and easy to understand. NAS tends to be very multi-purposed, replacing traditional file servers and often being used as the shared backing for virtualization. NAS is typified by the NFS and CIFS protocols, but it is not uncommon to see HTTP, FTP, SFTP, AFS and others available on NAS devices as well. NAS works well as a connector, allowing Windows and UNIX systems to share files easily with each other while only needing to work with their own native protocols. NAS is commonly used as the shared storage for VMware’s vSphere, Citrix XenServer, Xen and KVM. With NAS it is easy to use your shared storage in many different roles and easy to get good utilization from your shared storage system.

NAS does not always meet our needs. Some special applications still need shared storage but cannot utilize NAS protocols. The most notable products affected by this are Microsoft’s Hyper-V, databases and server clusters. The answer for these products is SAN. SAN, or Storage Area Networking, is a difficult concept and even at the best of times is difficult to categorize. Just as NAS is simply a different way of presenting traditional file servers, SAN is truly just a different way of presenting direct attached disks. While the differences between SAN and DAS might seem obvious, actually differentiating between them is nebulous at best and impossible at worst. SAN and DAS typically share protocols, chassis, limitations and media. Many SAN devices can be attached and used as DAS, and most DAS devices can be attached to a switch and used as SAN. In reality we typically use the terms to refer to their usage scenario more than anything else.

SAN is difficult to utilize effectively for many reasons. The first is that it is poorly understood. SAN is actually simple, so simple that it is very difficult to grasp, which makes it surprisingly complex in practice. SAN is effectively just DAS that is abstracted, re-partitioned and presented back out to hosts as DAS again. The term “shared storage” is confusing because while SAN technology, like NAS, can allow multiple hosts to attach to a single storage system, it does not provide any form of mediation for hosts attached to the same filesystem. NAS is intelligent and handles this, making it easy to “share” shared storage. SAN does not; it is too simple. SAN is so simple that what in effect happens is that a single hard drive (abstracted as it may be) is wired into controllers on multiple hosts. Back when shared storage meant attaching two servers to a single SCSI cable this was easy to envision. Today, with SAN’s abstractions and the commonality of NAS, most IT shops will forget what SAN is doing and disaster can strike.
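The kind of disaster meant here can be shown with a deliberately tiny toy model. It is not how any real filesystem or SAN product is implemented; the class and function names are hypothetical, and the “filesystem” is reduced to a list of file names that each host caches and later writes back whole. The point is only that two hosts writing to the same raw block, with nothing mediating between them, silently lose data.

```python
class SharedBlock:
    """Stands in for a single block device presented to several hosts."""

    def __init__(self, entries):
        self.entries = list(entries)


class Host:
    """A host running an ordinary, non-cluster-aware filesystem (toy version)."""

    def __init__(self, name, block):
        self.name = name
        self.block = block
        self.cache = []

    def mount(self):
        # Each host reads the on-disk state once and then trusts its own cache.
        self.cache = list(self.block.entries)

    def create_file(self, filename):
        self.cache.append(filename)

    def flush(self):
        # The whole cached structure is written back; nothing checks whether
        # another host changed the block in the meantime.
        self.block.entries = list(self.cache)


def simulate_uncoordinated_writes():
    lun = SharedBlock(["existing.txt"])
    host_a, host_b = Host("a", lun), Host("b", lun)
    host_a.mount()
    host_b.mount()
    host_a.create_file("from_a.txt")
    host_b.create_file("from_b.txt")
    host_a.flush()
    host_b.flush()  # overwrites what host A just wrote
    return lun.entries


if __name__ == "__main__":
    print(simulate_uncoordinated_writes())  # ['existing.txt', 'from_b.txt']
```

Host A's file is gone and neither host was told. A NAS avoids this because a single fileserver sits in front of the disk and arbitrates every request.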

SAN has its place, to be sure, but SAN is complex to use and to administer and very limiting. Often it is very expensive as well. The rule of thumb with SAN is this: unless you need SAN, use something else. It’s that simple. SAN should be avoided until it is the only option, and when it is, it is the right option. It is rarely, if ever, chosen for performance or cost reasons, as it normally underperforms and costs more than other options. But when you are backing Hyper-V or building a database cluster, nothing else is going to be an option for you. For most use cases in an SMB, using SAN effectively will require a NAS to be placed in front of it in order to share out the storage.

NAS makes up the vast majority of shared storage use scenarios.  It is simple, well understood and it is flexible.

Many, if not most, shared storage appliances today will handle both SAN and NAS, and the difference between the two lies in their use, protocols and ideology more than anything. Often the physical devices are similar, if not the same, as are the connection technologies.

More than anything, it is important to have specific goals in mind when looking for shared storage. Write these goals down and look at each technology and product to see how, or if, they meet these goals. Don’t use knee-jerk decision making or work off of marketing materials or what appears to be market momentum. Start by determining if shared storage is even a need. If so, determine if NAS meets your needs. If not, look to SAN. Storage is a huge investment, so take the time to look at alternatives and do lots of research. Only after narrowing the field to a few specific, competitive products should you turn to vendors for final details and pricing.
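The order of questions in that last paragraph can be written down as a short sketch. The field names and return strings below are invented for illustration, and reducing each question to a single yes/no flag is a simplification; real requirements gathering will involve far more than two answers.

```python
from dataclasses import dataclass


@dataclass
class StorageNeeds:
    # Hypothetical flags standing in for the written-down goals.
    needs_shared_storage: bool   # must more than one system reach the same storage?
    nas_protocols_suffice: bool  # can every application work over NAS protocols such as NFS or CIFS?


def recommend(needs: StorageNeeds) -> str:
    """Walk the questions in the order the article gives them."""
    if not needs.needs_shared_storage:
        return "local drives or DAS"
    if needs.nas_protocols_suffice:
        return "NAS"
    return "SAN"


if __name__ == "__main__":
    print(recommend(StorageNeeds(needs_shared_storage=False, nas_protocols_suffice=True)))
    print(recommend(StorageNeeds(needs_shared_storage=True, nas_protocols_suffice=True)))
    print(recommend(StorageNeeds(needs_shared_storage=True, nas_protocols_suffice=False)))
```

The ordering is the design choice that matters: SAN is only ever reached after both simpler answers have been ruled out.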