Need a few petabytes of Mac storage? Build your own BackBlaze Storage Pod
One of the largest personal iTunes libraries I've ever seen belongs to a client of mine. This client, who was a DJ in the '50s and '60s, has a huge collection of vinyl albums and singles that he painstakingly digitized, cleaned up, and catalogued in iTunes. Needless to say, opening iTunes on his Mac Pro is an exercise in patience.
Thinking about his music storage needs, and the huge amount of digital photos and video that my wife and I are accumulating, got me musing about other ways to do mass storage inexpensively. At this point, I'm probably OK with a DroboPro, but what if I needed petabytes (1 petabyte = 1,024 terabytes = 1,048,576 gigabytes) of storage? Most solutions at this point in time are quite expensive.
As of 6 AM PDT this morning, off-site backup vendor BackBlaze has put their solution to mass storage needs, the BackBlaze Storage Pod, out to the world as an open source project. Their solution is a relatively inexpensive box (US$7,867 for 67 TB of storage) made up of off-the-shelf components that can be reproduced and/or improved upon by others who also need huge amounts of cheap storage. See those red boxes in the picture to the right? Each one of those contains 67 TB of RAID 6 storage in a 4U box. For a petabyte of storage, you're going to need to spend about $117,000 on about fifteen of the boxes.
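For anyone who wants to check the math, here is a rough sketch of the sizing arithmetic. It uses only the two figures quoted above ($7,867 and 67 TB per pod); the function names are made up for illustration, and the calculation assumes every terabyte of the 67 TB is usable and ignores racks, power, and spare drives. It suggests that the $117,000 figure is the pro-rata cost of 1,000 TB, while buying whole pods costs slightly more.

```python
import math

# Figures quoted in the article; everything else here is illustrative.
POD_COST_USD = 7_867
POD_CAPACITY_TB = 67


def pods_needed(target_tb: float) -> int:
    """Whole pods required to reach target_tb (you cannot buy a partial pod)."""
    return math.ceil(target_tb / POD_CAPACITY_TB)


def total_cost(target_tb: float) -> int:
    """Cost in USD of the whole pods needed for target_tb."""
    return pods_needed(target_tb) * POD_COST_USD


def cost_per_tb() -> float:
    """Pro-rata cost of one terabyte in USD."""
    return POD_COST_USD / POD_CAPACITY_TB


if __name__ == "__main__":
    # 1,000 TB is the decimal petabyte; 1,024 TB is the binary convention.
    for label, tb in (("1,000 TB", 1_000), ("1,024 TB", 1_024)):
        print(f"{label}: {pods_needed(tb)} pods, ${total_cost(tb):,}")
    print(f"Pro-rata cost of 1,000 TB: ${cost_per_tb() * 1_000:,.0f}")
```

With the decimal definition this works out to fifteen pods, and with the 1,024 TB definition to sixteen, so the choice of unit changes the shopping list by one box.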
BackBlaze invented the Storage Pod for their own cloud storage requirements: off-site backups for Mac users like you and me. To keep the cost of their off-site Mac and PC backups incredibly low -- $5 / month for unlimited storage -- they developed their own solution, then realized that making the BackBlaze Storage Pod design open source would help them keep improving the design and reducing the cost.
If you have $7,867 burning a hole in your pocket or would like to create your own cloud storage project, be sure to download the design. Any TUAW reader who builds one and connects it to his/her Mac should send us photos for publication.

Reader Comments (Page 1 of 2)
bill cant fart said 6:10PM on 9-01-2009
Didn't you learn anything from Snow Leopard? A petabyte is 1,000,000 gigabytes. A pebibyte, on the other hand, is 1,048,576 gigabytes.
David Hildreth said 6:14PM on 9-01-2009
As much as I hate this it's technically the standard, has been for years.
crazypenguin said 7:01PM on 9-01-2009
for years, but from what i understand the years have changed correct? now the bytes are undergoing a metric sort of system to the power of ten instead of two.
Tim Chambers said 6:34PM on 9-01-2009
Where is the "Buy it Now" button?
russdogg said 6:37PM on 9-01-2009
It's only visible if you can actually afford to buy it now.
Bingo said 8:19PM on 9-01-2009
I can help you with that. Email me your credit card information, and I'll process your order... wink/wink
Robert said 6:47PM on 9-01-2009
Imagine the porn one could store there. Simply breathtaking!
Bingo said 8:23PM on 9-01-2009
There are two scenarios here that make that idea infeasible:
1) Muscle injuries to both arms render you incapable of "enjoying" the entertainment before you get 5% into your collection
2) This is the worst scenario: Porn loses its appeal, and becomes no more thrilling than watching re-runs of 'Different Strokes'. What you talking about Willis?
FantomRedux said 7:34AM on 9-02-2009
hehe, Different Strokes. Seems so inappropriate there.
Joseph said 1:55AM on 9-03-2009
different strokes was a double pun. one for the title and two for pedi-byte.
Brad Knowles said 7:22PM on 9-01-2009
Okay, this doesn't have any kind of NAS or SAN storage technology, so you can't use it out of the box. If you want something that you can actually use, you'll need to add more technology. You could enable iSCSI or NFS on the thing, but then that still doesn't get you the de-duplication, snapshots, and all the other kinds of features that you would want in this kind of a device.
You could add those in, either by sticking a real NAS/SAN head on the thing (like a NetApp V-series filer head), or by cobbling together your own out of various components that are available. The NetApp would be a lot less administrative overhead and a lot easier to manage, but also a lot more expensive in terms of up-front costs and ongoing license costs.
Moreover, they cheaped out on the power supplies, and if power is more expensive than bandwidth (as they claim), then they really should have chosen more efficient power supplies.
Also, they designed an asymmetric system with mostly two-lane SATA cards and one four-lane SATA card. That really makes me wonder how the system would perform over the long-term, what kind of hot spots it would develop, etc....
It's an interesting exercise in technology, but it's a long ways away from something that would be useful in the "real world".
Simon Kuhn said 7:47PM on 9-01-2009
What caught my eye is the lack of internal redundancy in the host. They specifically divide components between the two power supplies, have uneven load (the motherboard and boot drive are on one PSU together), and go for large single fans for which any single failure would probably cause overheating of components.
I get that it's all intended to minimize the unit cost, but this is not really spelled out in the document (that I noticed). The only way that your data would be safe and sufficiently available for a production environment using this device is if you have a sufficient number of identical devices which also contain the same data. I presume this is the case at Backblaze, but for just some guy building one of these (as if anyone would do that) it would be a rude awakening when one of the PSUs failed.
Tim Nufire said 8:17PM on 9-01-2009
It's great to see folks digging into our blog post!! Thanks for the comments.
We did not design this box as a NAS because our business is online backup. But there is no reason why you can't install Samba, NFS, etc. paired with LVM for volume management/snapshots. All these packages are free so they don't increase the cost ;-)
The power supplies we use are about 85% energy efficiency but lack redundancy because our software layer takes care of this in our environment. We published the full design so it would be easy to swap out the PSUs to meet the needs of a different application.
And finally, we're using this design to store petabytes of production data so I assure you it's ready for "real world" use ;-) The fans move a lot of air and the drives stay cool.... In fact, the system runs fine with just 2 of the 6 fans running.
Tim Nufire
VP of Engineering, Backblaze
Brad Knowles said 8:39PM on 9-01-2009
Tim,
With all due respect, your particular production environment is not what I would consider "real world". That's a Yahoo! or Google-like environment where you can afford to store three copies of every object in order to make up for the lack of redundancy and reliability at the lower levels.
Yes, you could certainly add iSCSI, NFS, and other protocols on that same Intel-based chipset, whether you're running Debian Linux and JFS, OpenSolaris and ZFS, or whatever. But even OpenSolaris and ZFS don't give you all the NetApp-like appliance type features that most people are going to want at this scale. They want their dirt-cheap storage and their features too, unfortunately. ;-(
And if you do succeed in building your own NetApp-like appliance on top of Free/Libre/Open Source Software, then you've basically re-created the Sun "Amber Road" 74xx Unified Storage Server -- only, it's not an appliance any more, you don't have any vendor support for the hardware or the software, and you're on your own when it breaks.
Brad Knowles said 3:46AM on 9-02-2009
Tim,
Ironically, we were discussing this very same box on a mailing list I'm on, and I just discovered that one of the Yahoo! Storage Admins happens to be on the same list. I basically made the same comments there that I made here -- and he agreed with me 100%. Yahoo! does not use home-grown devices like this, but they do have 100s and 100s of PB of storage from one of the well-known vendors. And it's worth every penny to them.
Now, if I were going to go the "homebuilt" route and try to get ~50TB of usable storage out of a single box, I'd probably start with a case that already exists from one of the reasonably well-known vendors. It just so happens that Chenbro has a 9U case that fits fifty 3.5" drives (see http://www.chenbro.com/corporatesite/products_detail.php?sku=45 ).
No, this isn't as compact as your box, but then we have people on our staff with extensive experience with Sun J4500 and X4500 boxes, and they know first-hand just how much heat is generated by all those drives when they're locked up tightly in that enclosure, and the greatly increased probability of a stupid human error being made with all those drives being crammed in there so close to each other and vertically oriented. Not to mention the fact that if you mount enough of your style of boxes in a rack, there is a very serious safety hazard if you were to pull out the top one in the rack in order to try to do any maintenance on them. If you've ever had a 42U rack with thousands of pounds of equipment fall on you, then you know what I mean.
In contrast, the Chenbro case has the drives directly accessible from the front, so you don't need to slide out the whole case to replace a drive, and in fact you can replace the drives "hot" (which your design does not allow for). They've also already worked out the power supply issues (3+1), as well as the drive controller issues.
In addition, most datacenters are not equipped to handle ultra-high power density equipment, such as the sort you designed. In many facilities you'd be lucky to be able to get 10kW per rack, so your choices would either be to have the racks half-filled and have twice as many of them, or have larger cases that more completely fill the racks while they also provide other benefits. Your 4U units that pull about 1.5kW each would work out to about 15kW per rack (ten 4U boxes per 42U rack).
Anyway, I'm sure they work well for you, because you designed them to do that. My point is that this particular design would not necessarily be so useful to most other people who might need storage devices. And unless you're operating on the scale of Yahoo!, Google, or the Internet Archive, you're not likely to be able to get the economies of scale that would be the real payoff when you start designing your own hardware.
Marc Tatossian said 6:54PM on 9-01-2009
Wow, A petabyte of iTunes music. Imagine the never ending coverflow!!!
bob said 6:57PM on 9-01-2009
...or the never loading artwork ;)
Ryan Trevisol said 8:31AM on 9-02-2009
You'd need a Snow Leopard 8-Core Mac Pro with 32GB of ram just to smoothly scroll the cover flow.
Problem is, your god-forsaken Mighty Mouse ball would gum up before you got past "ABBA".
Wilbur said 7:51PM on 9-01-2009
So I have to ask -- how many GB did your client's iTunes library take up?
MikeWard1701 said 8:25PM on 9-01-2009
I'm wondering the same thing, and the number of songs?