The math is difficult to follow unless you have some facility with advanced statistics. We’ll forgive you if you want to skip these sections entirely. When dealing with the probability of X number of events occurring in a fixed period of time, a good place to start is the Poisson distribution. For inputs, we use the following assumptions: The average rebuild time to achieve complete parity for any given B2 object with a failed drive is 6.5 days.
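As a rough sketch of the Poisson approach, the snippet below estimates the chance that a group of 20 drives sees four or more failures inside one 6.5-day rebuild window. The 6.5 days, the 20 drives, and the four-failure threshold come from this post; the annual failure rate is a made-up placeholder, and the names (`poisson_pmf`, `poisson_tail`, `ANNUAL_FAILURE_RATE`) are illustrative only, not Backblaze's actual calculation.

```python
import math

REBUILD_DAYS = 6.5          # from the post: average time to restore full parity
DRIVES_PER_TOME = 20        # from the post: one shard per drive, 20 drives per tome
FAILURES_TO_LOSE_DATA = 4   # from the post: any 17 of 20 shards suffice, so 4 losses are fatal
ANNUAL_FAILURE_RATE = 0.01  # ASSUMPTION: placeholder value, not a published Backblaze figure


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_tail(k_min: int, lam: float, terms: int = 50) -> float:
    """Probability of k_min or more events, as a truncated sum of the upper tail.

    Summing the tail directly avoids the cancellation error of 1 - P(fewer than k_min)
    when the answer is tiny.
    """
    return sum(poisson_pmf(k, lam) for k in range(k_min, k_min + terms))


# Expected number of drive failures in one tome during one rebuild window.
lam = DRIVES_PER_TOME * ANNUAL_FAILURE_RATE * REBUILD_DAYS / 365.0

# Chance that a single rebuild window sees enough failures to lose data.
p_loss_window = poisson_tail(FAILURES_TO_LOSE_DATA, lam)

print(f"expected failures per window: {lam:.6f}")
print(f"P(4 or more failures in one window): {p_loss_window:.3e}")
```

Turning this per-window figure into an annual durability number needs further steps (how many windows fit in a year, how failures overlap), which this sketch leaves out.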
We spent a shocking amount of time debating this and believe that both arguments have merits. Rather than posit one absolute truth, we decided to publish the results of both calculations (spoiler alert: either methodology tells you that your files are safe with Backblaze).
That’s a good way to think about it - customers want to know that their data is safe and secure. When you send us a file or object, it is actually broken up into 20 pieces (“shards”). The shards overlap so that the original file can be reconstructed from any combination of any 17 of the original 20 pieces. We then store those pieces on different drives that sit in different physical places (we call those 20 drives a “tome”) to minimize the possibility of data loss. When one drive fails, we have processes in place to “rebuild” the data for that drive. So, to lose a file, we have to have four drives fail before we have a chance to rebuild the first one. The math on calculating all this is extremely complex. Making it even more interesting, we debate internally whether the proper calculation methodology is to use the Poisson distribution (the probability of continuous events occurring) or Binomial (the probability of discrete events).
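The 17-of-20 rule and the binomial view of it can be sketched in a few lines of Python. The shard counts come from the text; the per-drive failure probability within one rebuild window is a hypothetical placeholder, the drives are assumed to fail independently, and the function names are illustrative rather than anything Backblaze has published.

```python
from math import comb

TOTAL_SHARDS = 20        # from the post: each file is split into 20 shards
SHARDS_NEEDED = 17       # from the post: any 17 shards reconstruct the file
P_FAIL_IN_WINDOW = 1e-4  # ASSUMPTION: placeholder chance one drive fails during a rebuild window


def is_recoverable(failed_shards: int) -> bool:
    """A file survives as long as at least SHARDS_NEEDED shards remain readable."""
    return TOTAL_SHARDS - failed_shards >= SHARDS_NEEDED


def binomial_tail(n: int, k_min: int, p: float) -> float:
    """Probability that k_min or more of n independent drives fail, each with probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Smallest number of concurrent failures that makes a file unrecoverable.
min_fatal_failures = next(
    f for f in range(TOTAL_SHARDS + 1) if not is_recoverable(f)
)

# Chance that one rebuild window sees that many failures in a single tome.
p_loss_window = binomial_tail(TOTAL_SHARDS, min_fatal_failures, P_FAIL_IN_WINDOW)

print(f"failures needed to lose a file: {min_fatal_failures}")
print(f"P(loss in one window, binomial): {p_loss_window:.3e}")
```

The binomial form treats each of the 20 drives as a discrete trial, while a Poisson estimate works from an expected failure count over the same window; for small failure probabilities the two tend to give similar answers.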
It seems reasonable to demonstrate why we’re worthy of your trust.

11 Nines Data Durability for Backblaze B2 Cloud Storage

At the end of the day, the technical answer is “11 nines.” That’s 99.999999999%. Conceptually, if you store 1 million objects in B2 for 10 million years, you would expect to lose 1 file. There’s a higher likelihood of an asteroid destroying Earth within a million years, but that is something we’ll get to at the end of the post.

How to Calculate Data Durability

Amazon’s CTO put forth the X million objects over Y million years metaphor in a blog post.
We’re in the business of asking customers to trust us with their data.
One of the most often talked about, but least understood, metrics in our industry is the concept of “data durability.” It is often talked about in that nearly everyone quotes some number of nines, and it is least understood in that no one tells you how they actually computed the number or what they actually mean by it. It strikes us as odd that so much of the world depends on the concept of RAID and Encodings, but the calculations are not standard or agreed upon. Different web calculators allow you to input some variables but not the correct or most important variables.