Shared SAN Storage Solutions

Posted by Mike McCarthy on December 11th, 2010 filed in Industry Status, Software News

SANs are a hardware solution that allows multiple systems to share access to the performance and security offered by large, high-speed disk arrays.  A single array of disks can be partitioned so that each connected system has direct access to its own volume, with each system taking advantage of the redundancy and speed benefits of a large RAID.

Shared SANs take the benefits of having all of your storage interconnected with high-bandwidth links and extend them one step further.  By running special software to synchronize the connected systems, they allow each connected system to access the same data on the same volume on the SAN, without overwriting each other's files or corrupting the data.  Most SAN software is designed to function as a peer-to-peer solution for smaller installations (5-10 systems), or with dedicated servers for larger SANs.
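
To make the synchronization idea concrete, here is a minimal Python sketch of the arbitration concept behind shared-SAN software: a single metadata process grants exclusive write access per file, while reads go through freely. All names here are my own illustration, not any vendor's actual API or protocol.

```python
# Hypothetical sketch of shared-SAN write arbitration: one metadata
# process tracks which client owns write access to each file path.
import threading

class MetadataServer:
    def __init__(self):
        self._locks = {}            # file path -> owning client id
        self._mutex = threading.Lock()

    def request_write(self, path, client_id):
        """Grant write access only if no other client holds the file."""
        with self._mutex:
            owner = self._locks.get(path)
            if owner is None or owner == client_id:
                self._locks[path] = client_id
                return True
            return False            # another workstation is writing

    def release_write(self, path, client_id):
        with self._mutex:
            if self._locks.get(path) == client_id:
                del self._locks[path]

server = MetadataServer()
print(server.request_write("/san/project.prproj", "ws1"))  # True
print(server.request_write("/san/project.prproj", "ws2"))  # False
server.release_write("/san/project.prproj", "ws1")
print(server.request_write("/san/project.prproj", "ws2"))  # True
```

In a real product this arbitration runs over the network (typically IP, separate from the Fibre Channel data path), which is why a dedicated server helps once client counts grow.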

As is probably obvious, there are many benefits to having multiple systems share the same set of files on a central high-performance disk array.  First off, you don’t have to buy individual arrays for each system, making individual systems cheaper and quieter.  All the actual data is stored in a single physical location, making it easier to protect and secure.  With all the data stored on centralized volumes, file management is easier: you get a single unified file structure, and you lose the need to duplicate source files across every system that needs local access to them, saving time and storage space.  It also makes thorough backups easier, especially in automated form, which makes your data more secure.  On the flip side, the initial investment is usually rather high, and all of your eggs are in one basket.  If the SAN has a problem, your entire production may grind to a halt until the issue is resolved.

Almost all SANs use Fibre Channel as their primary physical interface.  Although this is not inherently required, until recently there was no other standard technology that offered that capability.  CalDigit recently launched a PCIe switch product that they claim offers shared SAN capabilities for their PCIe-attached arrays.  While the idea is great, the hardware is currently still in a similar price range to entry-level fibre solutions, and you still need expensive software to keep the connected systems in sync and prevent your SAN data from getting corrupted.

iSCSI also offers some of the same capabilities, with block-level drive access, but it is only a viable competitor in the high-end production world when running on 10Gb Ethernet interfaces, which are still usually prohibitively expensive at this point.  Running iSCSI over Gigabit Ethernet may be a viable solution for certain compressed workflows, but it offers few advantages over regular network storage, at the expense of needing separate SAN software to share properly.

There are a number of different software options for creating a Shared SAN.  I am not familiar with every one of them, but the five I describe here should give you a place to start.  They all serve the same purpose of preventing multiple systems from trying to write data to the same spot at the same time, but they use a variety of different methods to accomplish that objective.

FibreJet is the cheapest option, but it does not allow true file-level sharing.  It prevents overwriting and data corruption by giving only one system at a time write access to any given volume.  On the other hand, all systems can be given full read access to any volume at all times.  This allows you to share source footage and other media with multiple workstations without the waste of duplicating the files.  It doesn’t allow you to easily share actual project files, since most apps require write access, and it will usually force you to spread your files across a number of separate volumes, making it harder to find or back up your data efficiently.  So FibreJet gives you about half of the benefits of a Shared SAN, as a low-cost starting point.
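
The contrast with file-level sharing can be sketched in a few lines: under a volume-level model like FibreJet's, write ownership is tracked per volume, not per file, while reads are always open.  This is my own illustration of the concept, not FibreJet's actual implementation.

```python
# Rough sketch of a volume-level sharing model: one writer per volume
# at a time, every host can read any volume.  Illustrative only.
class VolumeArbiter:
    def __init__(self, volumes):
        # volume name -> current writing host (None means free)
        self.writer = {v: None for v in volumes}

    def claim_write(self, volume, host):
        """Grant write access only if no other host owns the volume."""
        if self.writer[volume] in (None, host):
            self.writer[volume] = host
            return True
        return False

    def can_read(self, volume, host):
        return True  # reads are always allowed, for every host

arbiter = VolumeArbiter(["Footage", "Projects"])
print(arbiter.claim_write("Projects", "edit1"))  # True
print(arbiter.claim_write("Projects", "edit2"))  # False: edit1 owns it
print(arbiter.can_read("Projects", "edit2"))     # True: reads are open
```

The per-volume granularity is exactly why project files are awkward to share under this model: as soon as two editors need write access, you are forced to split work across separate volumes.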

MetaSAN has been available for quite a while now, and is fairly common in PC-based post-production environments.  It supports true file-level sharing, allowing all of your systems to read and write files on the same volume simultaneously.  It supports standard file systems, and operates as a separate process over IP to keep machines in sync.  It also allows PCs to access files on Mac-formatted drives and vice versa.  It requires one of the connected systems to host the server process, which manages the distribution of metadata and synchronization information.  That system does not have to be dedicated to the task, but it can be for maximum performance and stability.  If you use a user workstation, rebooting that system could cause other users to lose disk access.  I have used MetaSAN for many years, and it is an amazing tool, but it has quirks you have to get used to.  It has a tendency to freeze up workstations if something goes wrong, as it waits for certain requests to time out, which can make it difficult to troubleshoot when you are in a hurry. (And when the SAN is down, you are always in a hurry.)  On the other hand, for all of its instability and frustration, it has never allowed one of my arrays to become corrupted, or for me to lose data, so it clearly performs its function.

HyperFS is a recently released option, primarily offered by Rorke Data in the US.  It has its own proprietary file system, which can be directly accessed from Windows, OSX, and Linux-based systems.  The base software is priced similarly to MetaSAN, and functions in a peer-to-peer fashion in smaller installations.  But if you have more than 8 systems to connect, you will be required to invest in a fully dedicated metadata server and license, which significantly increases the deployment cost.

Xsan is Apple’s shared SAN software offering, currently on version 2.2.  It is limited to OSX and requires Xserve systems as metadata controllers.  As a PC guy, I have no experience with Xsan, but it is used by many Mac-based post-production facilities.  The underlying technology is based on the last option we will examine, StorNext.

StorNext is by far the most expensive option, but it offers higher performance, specifically for frame-based media, than any of the other choices.  Frame-sequence-based media bogs down other SAN software due to the high number of individual files being opened, accessed, and closed in rapid sequence.  Each individual frame requires the same amount of metadata and synchronization data as an entire video file, overloading lower-end software options.  StorNext is an enterprise-level product with a variety of options and tiers, with versions that support every major OS, and even ones that interoperate with Apple’s Xsan.  It is clearly an expensive option, but you are paying for stability and performance, putting it at the core of many DI facilities with DPX-based workflows.
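
A quick back-of-the-envelope calculation shows why frame sequences are so much harder on metadata traffic than single video files.  The numbers below are illustrative assumptions, but the ratio is the point:

```python
# Why DPX sequences stress SAN metadata: every frame is its own file,
# so each frame triggers its own open/lock/close metadata transaction,
# where a single movie file triggers roughly one.
fps = 24
duration_s = 60                    # a one-minute shot

frames = fps * duration_s
opens_dpx_sequence = frames        # one transaction per .dpx frame
opens_single_movie = 1             # e.g. one self-contained video file

print(opens_dpx_sequence)          # 1440
print(opens_dpx_sequence / opens_single_movie)  # 1440x more metadata ops
```

At playback rates, that is 24 metadata round-trips every second per stream, which is the load StorNext is engineered to sustain and lower-end options are not.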

Shared SANs are one of the most complicated and expensive investments available in the post-production world.  Lower-cost network-based alternatives are a better place to start for smaller organizations and compressed workflows, until you are sure you need the performance that SANs can offer.  Once you are working with uncompressed high-definition video, or 2K frame sizes, especially with multiple users, a SAN will probably be worth the investment.  The effect they can have on your workflow and level of collaboration is dramatic, making them worth the effort it takes to get them up and running.



2 Responses to “Shared SAN Storage Solutions”

  1. paladin Says:

    In terms of NAS vs. SAN, what does Bandito use (if any) and what should they use? How many workstations should be present when looking at SAN? -(ms)

  2. McCarthyTech Says:

    Bandito uses both, with active projects on the SAN, and our longer-term archive on NAS-connected volumes. You need at least two systems for a SAN to be viable, but usually more. For two systems, 10Gb Ethernet with a crossover cable would let you network them for a couple hundred dollars. Scaling past two systems, a SAN will offer the optimum performance. It costs a couple thousand dollars to add an additional client system to a SAN once the base infrastructure of switches and arrays is in place, but the more systems you connect, the lower the total cost per system, since the initial cost of the array is averaged across more systems. No matter how you measure it, SANs are expensive.
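
    The averaging works out like this (all dollar figures below are made-up round numbers for illustration, not quotes):

    ```python
    # Illustrative cost averaging: fixed base infrastructure spread
    # across clients, plus a per-client cost for HBA and license.
    base_infrastructure = 30000   # switches, arrays, metadata server
    per_client = 2500             # HBA + SAN software seat

    def cost_per_system(clients):
        return base_infrastructure / clients + per_client

    print(round(cost_per_system(2)))   # 17500 per system
    print(round(cost_per_system(10)))  # 5500 per system
    ```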
