The topic of shared storage is built on a foundation of two ideas: storage and sharing (or, technically, networking), which we can cover separately before we bring them together.
Storage is the concept of storing all of the files for a particular project or workflow. They may not all be stored in the same place, because different types of data have different requirements, and different storage solutions have different strengths and features.
At a fundamental level, most digital data is stored on HDDs or SSDs. HDDs, or hard disk drives, are mechanical devices that store data on a spinning magnetic surface and move read/write heads over that surface to access it. They currently max out around 200MB/s and 5ms latency. SSD stands for solid state drive, and involves no moving parts. SSDs can be built with a number of different architectures and interfaces, but most are based on the same basic flash memory technology as the CF or SD card in your camera. Some SSDs are SATA drives that use the same interface and form factor as a spinning disk, for easy replacement in existing HDD-compatible devices. These devices are limited to SATA’s bandwidth of 600MB/s. Other SSDs use the PCIe interface, either in full-sized PCIe cards or the smaller M.2 form factor. These have much higher potential bandwidths, up to 3000MB/s. Currently HDDs are much cheaper for storing large quantities of data, but require some level of redundancy for security. SSDs are also capable of failure, but it is a much rarer occurrence. Data recovery for either is very expensive. SSDs are usually cheaper for achieving high bandwidth, unless large capacities are also needed.
Traditionally, hard drives used in professional contexts are grouped together for higher speeds and better data security. These groups are called RAIDs, which stands for Redundant Array of Independent Disks. There are a variety of approaches to RAID design, and they differ significantly from one another.
RAID 0, or striping, is technically not redundant, but every file is split across each disk, so each disk only has to retrieve its portion of a requested file. Since these reads happen in parallel, the result is usually faster than if a single disk had read the entire file, especially for larger files. But if one disk fails, every one of your files will be missing a part of it, making them all pretty useless. The more disks in the array, the higher the chances of one failing, so I rarely see striped arrays composed of more than 4 disks. It used to be popular to create striped arrays for high speed access to restorable data, like backed up source footage or temp files, but now a single PCIe SSD is far faster, cheaper, smaller, and more efficient in most cases.
RAID 1, or mirroring, is when all of the data is written to more than one drive. This limits the array’s capacity to the size of the smallest source volume, but the data is very secure. There is no speed benefit to writes, since each drive must write all of the data, but reads can be distributed across the identical drives, with performance similar to RAID 0.
RAID 4, 5 & 6 try to achieve a balance between those benefits, for larger arrays with more disks (minimum of three). They all require more complicated controllers, so they are more expensive to reach the same levels of performance. RAID 4 stripes data across all but one drive, then calculates parity (odd/even) data for the data drives and stores it on the last drive. This allows the data from any single failed drive to be restored, based on the parity data. RAID 5 is similar, but the parity location alternates depending on the block, allowing reads to be shared across all disks, not just the “data drives.” So the capacity of a RAID 4 or 5 array will be the minimum individual disk capacity, times the number of disks minus one. RAID 6 is similar, but stores two drives’ worth of parity data, which, via more advanced math than simple odd/even, allows it to restore the data even if two drives fail at the same time. RAID 6 capacity will be the minimum individual disk capacity, times the number of disks minus two, and it is usually only used on arrays with many (>8) disks. RAID 5 is the most popular option for most media storage arrays, although RAID 6 becomes more popular as the value of the data stored increases and the price of extra drives decreases over time.
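Those capacity rules are easy to turn into a quick calculation. Here is a minimal sketch in Python, assuming every disk in the array is the same size (the function and its simplifications are mine, not pulled from any particular RAID controller):

```python
# Rough usable capacity for common RAID levels, assuming identical disks.
# Illustrative only; real controllers reserve a little extra space.

def raid_capacity(disk_tb: float, num_disks: int, level: int) -> float:
    """Return approximate usable capacity in TB for a given RAID level."""
    if level == 0:                   # striping: all capacity, no redundancy
        return disk_tb * num_disks
    if level == 1:                   # mirroring: one disk's worth of capacity
        return disk_tb
    if level in (4, 5):              # one disk's worth of parity
        return disk_tb * (num_disks - 1)
    if level == 6:                   # two disks' worth of parity
        return disk_tb * (num_disks - 2)
    raise ValueError("unsupported RAID level")

print(raid_capacity(2.0, 4, 5))    # 4x 2TB in RAID 5 -> 6.0 TB
print(raid_capacity(12.0, 24, 6))  # 24x 12TB in RAID 6 -> 264.0 TB
```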
Measuring Data:
Digital data is stored as a series of ones and zeros, each of which is a bit. 8 bits is one byte, which frequently represents one letter of text, or one pixel of an image (8-bit single channel). Bits are frequently referenced in large quantities to measure data rates, while bytes are usually referenced when measuring stored data. I prefer to use bytes for both purposes, but it is important to know the difference. A Megabit (Mb) is one million bits, while a Megabyte (MB) is one million bytes, or 8 million bits. As with metric prefixes, Kilo is thousand, Mega is million, Giga is billion, and Tera is trillion. Anything beyond that you can learn as you go.
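To make the bit/byte distinction concrete, here is the exact conversion in Python (the 100Mb figure is just an arbitrary example):

```python
# A Megabit (Mb) is one million bits; a Megabyte (MB) is eight million bits.
megabits_per_sec = 100            # e.g. a 100Mb/s stream or link
megabytes_per_sec = megabits_per_sec / 8
print(megabytes_per_sec)          # 12.5 MB/s
```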
Storage Math:
Technically a Kilobyte is 1024 bytes (2^10) according to JEDEC, which sets the standards for memory, but it is 1000 bytes (10^3) according to most storage manufacturers. This (as well as formatting losses) is why the OS usually reports a smaller storage volume than is written on the side of the device. For maximum precision, kilobytes counted as 1024 bytes are supposed to be abbreviated KiB, and Megabytes counted as 2^20 (1,048,576) bytes should be labeled MiB, but this standard is not strictly followed. Most calculations are made based on 1000 for simplicity, but the difference approaches 10% by the time you get to Terabytes, in the trillions of bytes, so it is best to account for that when estimating storage needs. But you should be allowing for more than 10% headroom anyway in most cases.
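As a quick illustration of that gap, here is the unit difference for a nominal 4TB drive (this shows only the decimal-versus-binary counting; formatting overhead itself is ignored):

```python
# Decimal (manufacturer) vs. binary (OS) units for a "4TB" drive.
bytes_on_label = 4 * 10**12             # 4TB as the manufacturer counts it
tib_reported = bytes_on_label / 2**40   # the same bytes counted in binary TB
print(round(tib_reported, 2))           # ~3.64 "TB" as reported by the OS
print(round(1 - tib_reported / 4, 2))   # ~0.09, i.e. roughly a 9% gap at TB scale
```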
Networking speeds are measured in bits (Gigabit), but with headers and everything else, it is safer to divide by ten when converting speed into bytes per second. Estimate 100MB/s for Gigabit, up to 1000MB/s on 10GbE, and around 500MB/s for the new NBase-T standard. Similarly, when transferring files over a 30Mb internet connection, expect around 3MB/s. Then multiply by 60 or 3600 to get to minutes or hours. (180MB/min or about 10,800MB/hr in this case.) So if you have to download a 10Gigabyte file on that connection, come back to check on it in an hour.
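That divide-by-ten rule of thumb makes transfer estimates a one-liner. A minimal sketch, using the 30Mb connection and 10Gigabyte download from above:

```python
# Estimate real-world throughput and transfer time from a link speed in bits.
link_mbps = 30                     # a 30Mb internet connection
mb_per_sec = link_mbps / 10        # ~3 MB/s after protocol overhead
file_mb = 10_000                   # a 10 Gigabyte (10,000MB) download
hours = file_mb / mb_per_sec / 3600
print(round(hours, 2))             # ~0.93 hours -- check back in an hour
```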
40Gb Ethernet products have been available for a while, and we are now seeing 25Gb and 100Gb Ethernet products as well. 40Gb cards can be had quite cheaply, and I was tempted to use them for direct connections, hoping to see 4GB/s to share fast SSDs between systems. But 40Gb Ethernet is actually a trunk of four parallel 10Gb links, and each individual connection is limited to 10Gb. It is easy to share the 40Gb of aggregate bandwidth across 10 systems accessing a 40Gb storage host, but very challenging to get more than 10Gb to a single client system. Having extra lanes on the highway doesn’t get you to work any faster if there are no other cars on the road; it only helps when there is lots of competing traffic. 25Gb Ethernet, on the other hand, will give you access to nearly 3GB/s for single connections, but as that is newer technology, the prices haven’t come down yet. ($500 instead of $50 for a 10GbE direct link.) 100Gb Ethernet is four 25Gb links trunked together, and is subject to the same aggregate limitations as 40Gb.
Because networking standards are measured in bits, and networking is so important for sharing video files, many video file types are measured in bits as well. An 8Mb H.264 stream is 1MB per second. DNxHD36 is 36Mb/s (or 4.5MB/s when divided by 8), DV and HDV are 25Mb, DVCProHD is 100Mb, etc. Other compression types have variable bit rates depending on the content, but there are still average rates we can make calculations from. Any file’s size divided by its duration will reveal its average data rate. It is important to make sure that your storage has the bandwidth to handle as many streams of video as you need, which will be that average data rate times the number of streams. So ten streams of DNxHD36 will be 360Mb/s, or 45MB/s.
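Here is that math as a short Python sketch; the clip size and duration are made-up numbers chosen to land on DNxHD36’s average rate:

```python
# Average data rate from a file, and total bandwidth for multiple streams.
file_size_mb = 2700            # hypothetical clip: 2,700MB on disk
duration_sec = 600             # 10 minutes long
avg_rate = file_size_mb / duration_sec
print(avg_rate)                # 4.5 MB/s -- consistent with DNxHD36

streams = 10
print(avg_rate * streams)      # 45.0 MB/s needed for ten simultaneous streams
```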
Lots of small requests require not just high total transfer rates, but high IO performance as well. Hard drives can only fulfill around 100 individual requests per second, regardless of how big those requests are. So while a single drive can easily sustain a 45MB/s stream, satisfying 10 different sets of requests may keep it so busy bouncing between the demands that it can’t keep up. You may need a larger array, with a higher number of (potentially) smaller disks, to keep up with the IO demands of multiple streams of data. Audio is worse in this regard, in that you are dealing with lots of smaller individual files as your track count increases, even though the data rate is relatively low. SSDs are much better at handling larger numbers of individual requests, usually measured in the thousands or tens of thousands per second per drive.
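A back-of-the-envelope way to see the IO problem: the 256KB average request size below is an assumption for illustration, as is treating ~100 IOPS as the ceiling for a single hard drive.

```python
# Rough IO estimate: requests per second generated vs. what one drive can service.
streams = 10
stream_rate_mbs = 4.5          # ten DNxHD36 streams, 45MB/s total
request_size_mb = 0.25         # assumed 256KB average read size
hdd_iops = 100                 # rough ceiling for a single hard drive

requests_per_sec = streams * stream_rate_mbs / request_size_mb
print(requests_per_sec)             # 180.0 requests per second
print(requests_per_sec > hdd_iops)  # True: one drive falls behind on IO,
                                    # even though 45MB/s is within its bandwidth
```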
Capacity, on the other hand, is simpler. Megabytes are usually the smallest increments of data we have to worry about calculating. A media type’s data rate (in MB/sec) times its duration (in seconds) will give you its expected file size. If you are planning to edit a feature film with 100 hours of offline content in DNxHD36, that is 3600×100 seconds, times 4.5MB/s, which is 1,620,000MB, or 1,620GB, or simply about 1.6TB. But you should add some headroom for unexpected needs, and a 2TB disk is only about 1.8TB when formatted, so it will just barely fit. It is probably worth sizing up to at least 3TB if you are planning to store your renders and exports on there as well.
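The same estimate as a few lines of Python (the 25% headroom figure is my own arbitrary choice, just to show the effect):

```python
# Offline storage estimate: data rate x duration, plus headroom.
hours = 100
data_rate_mbs = 4.5                   # DNxHD36
size_mb = hours * 3600 * data_rate_mbs
print(size_mb)                        # 1,620,000 MB, i.e. about 1.62 TB
print(size_mb * 1.25 / 1_000_000)     # ~2.0 TB once 25% headroom is added
```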
Once you have a storage solution of the required capacity, there is still the issue of connecting it to your system. The most expensive options connect through the network to make them easier to share (although more is required for true shared storage), but that isn’t actually the fastest option, nor the cheapest. A large array can be connected over USB3 or Thunderbolt, or via the SATA or SAS protocol directly to an internal controller. There are also options for Fiber Channel, which can allow sharing over a SAN, but this is becoming less popular as 10GbE becomes more affordable. Gigabit Ethernet and USB3 won’t be fast enough for high bandwidth files to play back, but 10GbE, multichannel SAS, Fiber Channel, and Thunderbolt can all handle almost anything up to uncompressed 4K. Direct attached storage will always have the highest bandwidth and lowest latency, as it has the fewest steps between the stored files and the user: Host->SAS-RAID-Controller->SAS-Drives. Using Thunderbolt or USB adds another controller and hop, and Ethernet even more so.
Now that we know the options for storage, let’s look at the data we anticipate needing to store. First off, we will have lots of source video footage. (Either camera original files, transcoded editing dailies, or both.) This is usually in the Terabytes, but the data rates vary dramatically, from 1Mb H.264 files, to 200Mb ProRes files, to 2400Mb Red files. The data rate for the files you are playing back, combined with the number of playback streams you expect to use, will determine the bandwidth you need from your storage system. These files are usually static, in that they don’t get edited or written to in any way after creation. The exceptions would be sidecar files like RMD and XML files, which will require write access to the media volume. If a certain set of files is static, then as long as a backup of the source data exists, they don’t need to be backed up on a regular basis, and don’t even necessarily need redundancy. (Although if the cost of restoring that data would be high, in regards to lost time during that process, some level of redundancy is still recommended.)
Another important set of files is our project files, which actually record the “work” we do in our application. They contain instructions for manipulating our media files during playback or export. These files are usually relatively small, and are constantly changing as we use them. That means they need to be backed up on a regular basis. The more frequent the backups, the less work you lose when something goes wrong.
We will also have a variety of exports and intermediate renders over the course of the project. Whether they are flattened exports for upload and review, VFX files, or other renders, these are a more dynamic set of files than our original source footage, and are generated on our systems instead of being imported from somewhere else. These can usually be regenerated from their source projects if necessary, but the time and effort required usually makes it worthwhile to invest in protecting or backing them up. In most workflows, these files don’t change once they are created, which makes it easier to back them up if desired.
There will also be a variety of temp files generated by most editing or VFX programs. Some of these files need high speed access for best application performance, but they rarely need to be protected or backed up, because they can be automatically regenerated by the source applications on the fly if needed.
So we have source footage, project files, exports, and temp files that we need to find a place for. If you have a system or laptop with a single data volume, the answer is simple: it all goes on the C drive. But we can achieve far better performance if we have the storage infrastructure to break those files up onto different devices. Newer laptops frequently have both a small SSD and a larger hard disk. In that case we would want our source footage on the (larger) HDD, while the project files should go on the (safer) SSD. Usually your temp file directories should be located on the SSD as well, since it is faster, and your exports can go in either place, preferably the SSD if they fit. If we have an external drive of source footage connected, we can back all files up there, but should probably work from projects stored on our local system, playing back media from the external drive.
Now a professional workstation can have a variety of different storage options available. I have a system with two SSDs and two RAIDs, so I store my OS and software on one SSD, my projects and temp files on the other SSD, my source footage on one RAID, and my exports on the other. I also back up my project folder to the exports RAID on a daily basis, since the SSDs have no redundancy.
If you have a short film project shot on Red that you are editing natively, R3Ds can be 300MB/s. That is 1080GB per hour, so 5 hours of footage will be just over 5TB. It could be stored on a single 6TB external drive, but that won’t give you the bandwidth to play back in real time. (Hard drives usually top out around 200MB/s.) Striping your data across two drives in one of the larger dual-drive externals would probably provide the needed performance, but with that much data you are unlikely to have a backup elsewhere. So data security becomes more of a concern, leading us towards a RAID 5 based solution. A 4 disk array of 2TB drives provides 6TB of storage at RAID 5. (2TB*[4-1]) This will be more like 5.5TB once it is formatted, but that might be enough. Using an array of eight 1TB drives would provide higher performance, and 7TB of space before formatting (1TB*[8-1]), but will cost more. (An 8 port RAID controller, an 8 bay enclosure, and two 1TB drives are usually more expensive than one 2TB drive.)
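As a sanity check, here is that sizing comparison in Python; the 9% formatting loss is an approximation standing in for the decimal/binary difference discussed earlier:

```python
# Does 5 hours of 300MB/s R3D footage fit on each candidate RAID 5 array?
footage_tb = 5 * 300 * 3600 / 1_000_000       # 5.4 TB of footage

def raid5_usable_tb(disk_tb, num_disks, format_loss=0.09):
    """Usable space after RAID 5 parity and an assumed ~9% formatting loss."""
    return disk_tb * (num_disks - 1) * (1 - format_loss)

print(round(raid5_usable_tb(2, 4), 2))   # ~5.46 TB -- just barely enough
print(round(raid5_usable_tb(1, 8), 2))   # ~6.37 TB -- more comfortable
```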
Larger projects deal with much higher numbers. Another project has 200TB of RED footage that needs to be accessible on a single volume. A 24-bay enclosure with 12TB drives provides 288TB of space, minus two drives’ worth of data for RAID 6 redundancy (288TB-[2x12TB]=264TB), which will be more like 240TB available in Windows once it is formatted.
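Running those numbers explicitly:

```python
# 24-bay RAID 6 array of 12TB drives: raw, usable, and OS-reported capacity.
disks, disk_tb = 24, 12
raw_tb = disks * disk_tb                      # 288 TB of raw disk
usable_tb = raw_tb - 2 * disk_tb              # 264 TB after RAID 6 parity
reported_tb = usable_tb * 10**12 / 2**40      # counted in binary, as Windows does
print(raw_tb, usable_tb, round(reported_tb))  # 288 264 240
```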
Initially, most shared storage solutions were developed based on the Fiber Channel interface. A Fiber Channel array appears as a local drive to a host system connected to it with a Fiber Channel interface card. And it is possible to direct connect storage with Fiber Channel, allowing the loud array to be safely stored away, far from the workstation, at no loss in performance. But the real power of Fiber Channel over SCSI or SAS comes when you add a switch into the equation. Now many systems can access the same array(s). They can either all access separate dedicated volumes, or access the same volumes and data, as long as they have a system to prevent them from overwriting and corrupting each other’s data.
“Shared storage” technically has two very different meanings. In a data center, servers need high speed, high IO storage available, and before SSDs, this meant lots of individual drives. Instead of every server having its own array, many servers would share a single array of disks, each having its own dedicated portion of the entire capacity. Each server had dedicated access to its files, and no other server could see them, unless it accessed them over the network through the host machine, the same as if the files were stored on an array inside that server. The advantages were that fewer arrays were needed to provide the performance that various servers needed at their peak, and storage volume could be adjusted to meet the changing needs of various servers without having to move physical hard drives around. They were just given a larger portion of the array if they started to fill their allotment, and eventually more arrays were added. This sharing of the physical storage, but not the files contained on it, was usually accomplished over the Fiber Channel interface, in what was referred to as a Storage Area Network, or SAN. It is now also done over Ethernet networks, usually in the form of iSCSI.
As technology advanced, it became obvious that allowing more than one machine to access the files on a large array could be advantageous, as multiple servers could host the same web site, improving performance by sharing the processing load. For media production, that meant allowing multiple workstations to see the same source footage, at full local storage speeds. This was easy for read operations: requests get sent to the drive, and it fulfills them as fast as it can. But for write operations, how do we keep multiple machines from writing to the same spot on the drive at the same time? Either one machine controls the drive and all write operations have to be approved by it, slowing down the process, or control is passed to whichever machine needs it, which can be complicated to manage. So true shared SAN software is usually expensive, on top of the price of the arrays you are sharing and the infrastructure to share them.
Originally, shared SANs exclusively ran on expensive dedicated fiber optic networks using an interface called Fiber Channel, separate from the Ethernet network that machines used to communicate with one another. Eventually there was a move to FCoE, or Fiber Channel over Ethernet, and then to iSCSI, which was entirely Ethernet packet based, lowering the cost of the hardware. But the software challenge remained the same: to prevent multiple machines, which all see the drive as a local volume, from writing to the same spot at the same time. FiberJet, MetaSAN, StorNext, Xsan, Facilis, Harmony, Avid Unity, and many others all had different approaches to doing that.
As Ethernet networking technology improved, the benefits of SAN solutions over NAS (Network Attached Storage) solutions diminished. 10Gigabit Ethernet (10GbE) transfers over 1GB of data a second, and is relatively cheap to implement. NAS has the benefit of a single host system controlling the writes, usually with software included in the OS, preventing data corruption, and it also isolates the client devices from the file system, allowing PC, Mac, and Linux devices to all access the same files. This comes at the cost of slightly increased latency, and occasionally lower total bandwidth, but the price and complexity of installation are far lower. So now all but the largest facilities and most demanding workflows are being deployed with NAS based shared storage solutions. This can be as simple as a main editing system with a large direct attached array sharing its media with an assistant station over a direct 10GbE link, for about $50. This can be scaled up by adding a switch and connecting more users to it, but the more users sharing the data, the greater the impact on the host system, and the lower the overall performance. Beyond 3-4 users, it becomes prudent to have a dedicated host system for the storage, for both performance and stability. Once you are buying a dedicated system, there are a variety of other functionalities offered by different vendors to improve performance and collaboration.
There is one other solution on the market for sharing media at high speeds, and that is Thunderbolt3. There are vendors that sell products that allow direct connections to systems via Thunderbolt3, but most of those are using the inherent support in Thunderbolt for TCP networking to move the data, effectively making it a NAS on the Thunderbolt interface. Any product that did offer block level access over Thunderbolt3, appearing as a local Thunderbolt drive to the workstation, would still need the same SAN software functionality required by Fiber Channel to prevent overwrites and corruption. The advantage of Thunderbolt over Fiber Channel would be price and speed, at the expense of distance and scalability.
The main step to improve collaboration is to implement what is usually referred to as a “bin locking” system. Even with a top end SAN solution and strict permissions controls, there is still the possibility of users overwriting each other’s work, or at the very least branching the project into two versions that can’t easily be reconciled. If two people are working on the exact same sequence at the same time, only one of their sets of changes is going to make it to the master copy of the file, without some way of combining the changes. (And solutions are being developed.) But usually the strategy to avoid that is to break projects down into smaller pieces, and make sure that no two people are ever working on the exact same part. This is accomplished by locking the part (or bin) of the project that a user is editing, so that no one else may edit it at the same time. This usually requires some level of server functionality, as it involves changes that are not always happening at the local machine. Avid requires specific support from the storage host to enable that feature. Adobe, on the other hand, has implemented a simpler storage based solution, which is effective but not infallible, that works on any shared storage device that offers users write access. A lock file next to the project file indicates to other users who is using that file, and prevents their applications from attempting to alter the project. This prevents branches and overwriting if users follow the rules. Avid’s solution is stronger and more fool-proof, but requires explicit support from the storage host. There are a variety of vendors besides Avid who offer this functionality.
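To make the lock-file concept concrete, here is a minimal sketch in Python. The file naming and contents are hypothetical, not Adobe’s actual format; the point is simply that a sidecar file on the shared volume signals who currently owns a project.

```python
# A toy version of storage-based project locking on a shared volume.
# The ".lock" sidecar name and its contents are illustrative assumptions.
import getpass
from pathlib import Path

def try_lock(project: Path) -> bool:
    """Claim a project by creating a sidecar lock file; False if already locked."""
    lock = Path(str(project) + ".lock")
    try:
        with open(lock, "x") as f:        # "x" mode fails if the file already exists
            f.write(getpass.getuser())    # record who holds the lock
        return True
    except FileExistsError:
        print(f"{project.name} is locked by {lock.read_text().strip()}")
        return False

def release_lock(project: Path) -> None:
    """Remove the lock file when the user closes the project."""
    Path(str(project) + ".lock").unlink(missing_ok=True)

project = Path("shared/episode_101.prproj")   # hypothetical shared project file
if try_lock(project):
    # ... open and edit the project, then:
    release_lock(project)
```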
If I were setting up a facility with 10 editors in the current environment, I would base it on 10GbE for sure, and have a dedicated system hosting the storage, with a large RAID for the media and exports, and a smaller SSD volume for the project files. I would have an SSD on every system for the applications and temp files, to save some bandwidth to the storage server.
iSCSI arrays offer some interesting possibilities for read only data, like source footage, as iSCSI gives block level access for maximum performance, and runs on any network without any expensive software. The only limit is that only one system can copy new media to the volume, and there must be a secure way to ensure the remaining systems have read-only access. Projects and exports must be stored elsewhere, but require much less capacity and bandwidth than source media.