Panzura Builds Private And Hybrid Clouds
August 19, 2011
by Howard Marks
Until I got a detailed briefing from Panzura’s co-founder and head of product development, Randy Chow, the other day, I had lumped Panzura into the cloud storage gateway category. While its Alto controllers and panOS can use a public cloud provider like Amazon S3 or Nirvanix as a storage back end, Panzura’s value proposition isn’t an on-ramp to public cloud storage but a true distributed file system that can be accessed simultaneously from multiple locations.
More typical cloud storage gateways, such as those from Nasuni, StorSimple or the now moribund Cirtas, act as a cache to a cloud storage back end, but the data they store is accessible only through a single gateway or high availability (HA) pair. If you install gateways in 10 offices, each office will have access only to its own data.
That’s a shame because providing a common storage pool for access from multiple locations is one of the great strengths of public cloud storage. NBC Universal has petabytes of video on the Nirvanix cloud, so editors and execs at any of its connected locations can find, for example, an old “Tonight Show” clip when they need to create an obit for the nightly news.
Over the years, I’ve tried all sorts of solutions to let users access important files from multiple locations, including FTP servers, replication with or without rsync, and, of course, emailed files, all with limited success. I’ve even taken the file servers out of branch offices, replacing them with Wide Area File Services (WAFS) appliances that cached data from filers in a central data center. That worked fine until the WAN link went down and users couldn’t access their files at all.
With Panzura, I can install appliances in multiple locations to replace the current network-attached storage (NAS) infrastructure. The controllers will make up a file system across my organization, exchanging metadata continuously. They also deduplicate the data and replicate it either directly from appliance to appliance or through a storage cloud, such as a public cloud provider.
Users all see the same file system and can access any file, subject to file permissions, regardless of where it was created. As in a caching system like WAFS, when a user requests to open a file that isn’t stored locally, its data moves to the top of the replication queue. Administrators can set replication policies to preload folders to specific locations.
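To make the replication behavior above concrete, here is a minimal sketch of a queue in which a file a user is waiting on jumps ahead of policy-driven preloads and ordinary background replication. This is only an illustration of the idea, not Panzura's implementation: the `ReplicationQueue` class, the three priority levels and the file paths are all hypothetical.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is replicated sooner.
ON_DEMAND = 0   # a user is waiting to open the file
PRELOAD = 1     # folder preloaded to this site by an admin policy
BACKGROUND = 2  # ordinary replication traffic


class ReplicationQueue:
    """Toy model of a per-site replication queue (illustrative only)."""

    def __init__(self):
        self._heap = []
        self._best = {}  # path -> most urgent priority currently queued
        self._counter = itertools.count()  # preserves FIFO order within a priority

    def enqueue(self, path, priority=BACKGROUND):
        # Push only if the path is new or this request is more urgent.
        if path not in self._best or priority < self._best[path]:
            self._best[path] = priority
            heapq.heappush(self._heap, (priority, next(self._counter), path))

    def request_open(self, path):
        # A user opening a non-local file moves it to the top of the queue.
        self.enqueue(path, ON_DEMAND)

    def next_file(self):
        # Return the next path to replicate, or None when the queue is empty.
        while self._heap:
            priority, _, path = heapq.heappop(self._heap)
            # Skip stale entries that were superseded by a more urgent one.
            if self._best.get(path) == priority:
                del self._best[path]
                return path
        return None


queue = ReplicationQueue()
queue.enqueue("/projects/a.mov")
queue.enqueue("/projects/b.mov")
queue.enqueue("/pinned/c.doc", PRELOAD)
queue.request_open("/projects/b.mov")  # a user wants b.mov right now
order = [queue.next_file() for _ in range(3)]
```

In this sketch `order` comes out with `b.mov` first, then the preloaded `c.doc`, then `a.mov`; the older background entry for `b.mov` is discarded rather than replicated twice.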
Appliances have local disk and optional flash, so users won’t have to give up performance to get the wide area coverage. Just as admins can pin folders to locations to ensure there’s always a local copy, they can assign tiering policies to folders so your .VMDKs get the performance boost of flash but users’ ARCHIVE.PST files stay put on spinning disk. Panzura has physical appliances based on standard one- and two-rack unit servers. The 2U system can be expanded to more than 100 spindles at the high end, and a virtual appliance for vSphere can serve small sites or fill in during emergencies.
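The folder-level tiering policies mentioned above could be modeled roughly as follows. This is a sketch under my own assumptions, not Panzura's configuration format: the `TIER_POLICIES` table, the folder names, the `"flash"` and `"disk"` labels and the rule that the nearest enclosing folder wins are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical per-folder tiering policies set by an administrator.
TIER_POLICIES = {
    "/vmware": "flash",            # virtual machine disks get flash
    "/vmware/templates": "disk",   # rarely used templates do not
    "/users/mail-archives": "disk",
}
DEFAULT_TIER = "disk"  # assumption: anything without a policy stays on disk


def tier_for(path):
    """Return the tier for a file, using the nearest enclosing folder's policy."""
    # parents runs from the file's own folder up to the root.
    for folder in PurePosixPath(path).parents:
        policy = TIER_POLICIES.get(str(folder))
        if policy is not None:
            return policy
    return DEFAULT_TIER


vmdk_tier = tier_for("/vmware/web01/web01.vmdk")
pst_tier = tier_for("/users/mail-archives/howard/ARCHIVE.PST")
```

Letting the nearest folder win means a subfolder can override its parent, which is one common way to keep a policy table short.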
Using a public cloud provider as the back end makes the whole thing even better in several ways. First, it provides unlimited capacity, relieving the system administrators from having to track and manage free space. It also provides unlimited snapshots—one Panzura client has more than 45,000 snapshots in place. Data in the system is encrypted using AES, regardless of whether a public cloud provider is involved.
While the Alto Cloud File System is impressive, the latency involved in allowing users in different locations to access the same file system has required the company to make a few small compromises. The Panzura system locks the entire file when an application requests a record lock, which makes it unsuitable for Access databases and similar applications that are luckily fading into history anyway.
As part of the metadata, the Panzura system keeps track of where each file was modified last. In the event of a WAN failure, users get read/write access to files that were last modified locally and read-only access to files last modified in another location. While this may result in some inconvenience, it’s a step up from either denying write access altogether or relying on a last-update-wins replication model.
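The WAN-failure rule described above boils down to a small decision function. The sketch below assumes a hypothetical `access_mode` function and site names, and it ignores file permissions, locks and whether a local copy of the data is actually cached; it only captures the last-modified-site rule.

```python
from enum import Enum


class Access(Enum):
    READ_WRITE = "read/write"
    READ_ONLY = "read-only"


def access_mode(last_modified_site, local_site, wan_up):
    """Decide the access a site grants to a file (illustrative only)."""
    if wan_up:
        # Assumption: in normal operation the file is writable,
        # subject to permissions and locks not modeled here.
        return Access.READ_WRITE
    if last_modified_site == local_site:
        # This site holds the latest version, so local edits are safe.
        return Access.READ_WRITE
    # Another site may hold newer changes, so avoid conflicting writes.
    return Access.READ_ONLY


local_edit = access_mode("boston", "boston", wan_up=False)
remote_edit = access_mode("chicago", "boston", wan_up=False)
```

Here `local_edit` is `Access.READ_WRITE` and `remote_edit` is `Access.READ_ONLY`, which is how the scheme sidesteps the conflicts a last-update-wins model has to resolve after the fact.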
While very small organizations can use Dropbox or SugarSync to replicate files between locations, and create a backup copy while they’re at it, those services don’t provide the scale or level of service even a midsize company needs.
Panzura’s Alto is another valuable tool in my storage architect’s toolkit, and I can see myself using it in situations with and without a private cloud back end.