Amazon Simple Storage Service (S3) was launched in 2006. I remember the time well because I was on the founding team of a competing storage service, Nirvanix, which was in stealth mode at the time. This was before the term “cloud” was widely used, and Amazon described S3 as “Internet Storage” or “Web-Scale Storage.”
Fast forward 15 years and I have to say that Amazon’s bet on S3 has paid off. AWS alone brought Amazon $45B in revenue with $13.5B in profit in 2020! I won’t go as far as to say that this was only because of object storage, but I will say object storage was the cornerstone of its cloud strategy. But why?
Disruptive Innovation, By Definition, Changes the Status Quo
In hindsight, Amazon’s go-to-market approach with AWS was brilliant, but it took a lot of effort, resources, and vision. At that time most of the IT world consumed storage through POSIX-compliant protocols such as NFS or CIFS/SMB, but S3 (and other storage services like Nirvanix) was different. It used a RESTful interface, requiring developers to integrate their applications directly with a specific storage service via its application programming interface (API). Every storage service had its own proprietary API that exposed some unique features, but there was consistency among certain concepts, such as utilizing metadata, using a key/value method of addressing, and standard calls like write, read, and delete.
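Those shared concepts — key/value addressing, per-object metadata, and simple write/read/delete calls — can be sketched with a minimal in-memory object store. This is a hypothetical illustration of the model, not any vendor’s actual API:

```python
# Minimal in-memory sketch of the object-storage model: objects are
# addressed by (bucket, key), carry user-defined metadata, and are
# manipulated with simple write/read/delete calls. Illustrative only --
# not the actual interface of S3 or any other service.

class ObjectStore:
    def __init__(self):
        self._objects = {}  # (bucket, key) -> (data, metadata)

    def put_object(self, bucket, key, data, metadata=None):
        # Write: store the bytes along with arbitrary user metadata.
        self._objects[(bucket, key)] = (data, dict(metadata or {}))

    def get_object(self, bucket, key):
        # Read: look the object up by its key, not by a file path.
        data, metadata = self._objects[(bucket, key)]
        return data, metadata

    def delete_object(self, bucket, key):
        # Delete: remove the object entirely.
        del self._objects[(bucket, key)]


store = ObjectStore()
store.put_object("photos", "2006/vacation.jpg", b"\xff\xd8...",
                 metadata={"content-type": "image/jpeg"})
data, meta = store.get_object("photos", "2006/vacation.jpg")
```

The flat key namespace (no directories, no POSIX semantics) is exactly what lets an implementation distribute objects across many machines, which is why this model scales where NFS-style protocols struggle.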
This was a very different approach, and the storage incumbents of the day did their best to standardize a storage-as-a-service interface, but the market moved too fast; any standardization at that point would have stifled innovation. Because Amazon controlled the S3 API (also referred to as the S3 protocol), it could continually modify the API to meet fluid market demands, releasing service after service. Currently there are well over 60 services in the AWS portfolio.
Why Did Amazon Start with Object Storage?
The internet service world, both B2C and B2B, has moved so fast that it is hard to remember life without certain things. In 2006, a glitchy YouTube was just getting started and was frequently offline. Twitter launched later that year and if you tried to use it, you often saw the “fail whale” (remember the fail whale?). Netflix hadn’t even launched its streaming service. That came the following year in 2007.
The point is, back in 2006 we were all still living in a single-application, local-storage mentality. You still bought software at a store, took it home, and installed it from a physical CD. Hard drive failures were common, and NAS and enterprise storage devices were expensive and difficult to maintain and expand. There were online storage services (for consumers and businesses), but they were slow, expensive, and often unreliable.
To summarize, storage wasn’t convenient, and although it was a necessary function, the value of an internet service lay in the front end and the content. Ensuring storage reliability required ongoing time and effort. This was the market Amazon entered with S3, and its value proposition of “let us manage your bulk storage so you can focus on enhancing your primary application” was enticing to developers and organizations that did not have enterprise storage specialists or storage architects. The perceived risk from the developer’s perspective was also low: the data stored on S3 was often a secondary, latency-tolerant copy that could be recreated, or restored from backups in hours or days, without anyone getting too angry. The only catch was that you needed to integrate your application with S3 via the S3 API.
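Integrating with the S3 API meant more than issuing HTTP calls; every request also had to be signed with the caller’s secret key. As one concrete taste of what that integration involves today, here is a sketch of the signing-key derivation from AWS Signature Version 4 (the scheme modern S3 clients use — early S3 used an older scheme). The request-canonicalization steps are omitted, and the credential values below are illustrative placeholders:

```python
# Sketch of the HMAC-SHA256 signing-key chain from AWS Signature
# Version 4. Each step scopes the key more narrowly: date -> region
# -> service. Canonical-request construction and the final signature
# are omitted; the credentials below are made-up placeholders.
import hashlib
import hmac


def hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    k_date = hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = hmac_sha256(k_date, region)
    k_service = hmac_sha256(k_region, service)
    return hmac_sha256(k_service, "aws4_request")


# Placeholder credentials, for illustration only.
key = derive_signing_key("EXAMPLE-SECRET-KEY", "20210301", "us-east-1", "s3")
```

In practice an SDK handles all of this for you, which is precisely the convenience that made “just call the S3 API” an acceptable trade for developers.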
The S3 Protocol: How Amazon Sent Services into Warp Speed
So how does a company launch such a disruptive technology? Especially one that requires a significant change to the way that IT managers and storage admins are used to doing things?
First, the value has to outweigh the effort. Then you need to remove as many barriers to entry as possible for your innovators and early adopters. Amazon executed its strategy brilliantly. It focused on developers, giving Amazon S3 away for free (or nearly free), and unleashed a massive global developer outreach campaign targeting specific technical user groups in every major region around the world. It backed this with continual development of developer-focused resources (such as training and sample code) and continual enhancement of AWS itself. Amazon took a grassroots approach at first; then, once it had traction in the web services world, it expanded into the enterprise.
Along the way, many external macro-market events aided adoption of AWS services: the spread of broadband, the proliferation of mobile devices, growing acceptance of services and subscriptions, and most recently a global pandemic. But if you look at where AWS started, an absolutely necessary first step was getting its user base comfortable using RESTful interfaces; in this case, the S3 API. Once that was established, AWS was able to continue rolling out additional services, but with the mandate that they be valuable and reliable.
Amazon’s S3 API Is an Interface to “Pure” Object Storage
One of the reasons Amazon was so successful with S3 was the resilience and scalability of its underlying object storage approach. To achieve this, you need to take a “pure” approach, by which I mean the API needs to write and address data as an object all the way down to the storage device. Because of the success of the S3 service, the protocol has become the de facto standard for integrating with any storage-as-a-service target (service or device). I say de facto because it is not a true standard: Amazon owns the definition and specification but understands the value in broad adoption, and so has let the industry use it.
Because of its popularity, software, service, and storage vendors have integrated the S3 protocol to send and/or receive data. From both the storage and service perspectives, if there isn’t an object storage solution on the other end, there will be inherent scalability issues down the road. Had Amazon simply layered a RESTful interface on top of NAS, it would not have been able to scale S3 to where it is now.
This is why, when you are evaluating object storage solutions, you need to look beyond the protocol and understand how data is being stored on disk and how it is being protected. We covered this topic in a recent webinar, but the point is that there is a difference between an interface and an underlying architecture.
Establishing the “Cloud”
That said, I want to call out the term “cloud” as a marketing concept (and I am a marketing guy, so I’m allowed to do that). Behind any internet service, someone, somewhere is managing the supporting infrastructure. The good news is that because Amazon helped establish S3 as a de facto standard and made a large portion of the IT industry (vendors and users) comfortable using RESTful interfaces, much of this technology is now available to you. You can get some of the same “benefits” of the cloud while your data remains secure in your own data center (with software like DataCore Swarm), or you can select a service provider that meets your specific requirements (like BT Cloud Storage or Wasabi).
On-Demand Webinar: Learn More About The Difference Between Object Storage, S3, and Cloud Storage
If you are interested in learning more, David Boland, Product Marketing Director for Wasabi, and I discuss this topic in our webinar titled ‘What is the difference between object storage, S3, and cloud storage?’. We compare object storage, cloud storage, and S3, and explain how you can leverage each to solve specific data storage and access challenges. We also address evolving market requirements, and how all three are reducing the total cost of storage ownership while keeping petabytes of data instantly accessible.
Contact us to learn more.