What’s the Difference and Why it Matters
Data is the lifeblood of every modern organization. Our ability to share, store, and use it effectively is crucial to helping businesses grow, improve operational efficiency, keep customers happy, and gain a competitive edge. It’s also vital for empowering employees by giving them access to the information they need to get their jobs done. This is especially true with more of us working remotely during the current health crisis.
We all know that data is growing explosively – organizations have to buy more data storage than ever before. And that’s a big problem. However, every organization faces another big problem that affects everyone – business leaders, IT professionals and users – though it affects them in different ways: not all data is equally valuable.
Data is like cash. We treat, protect and use the cash in our wallet differently based on its value. We’re a lot more careful about how we look after and spend $100 bills than $1 bills. The same is true of data. Not all of it is equally important and, more to the point, its value changes over time – typically because of the information it contains, how often it is accessed, and even its age. Ideally, organizations should have storage platforms that are built to handle the importance of data in intelligent ways, rather than just storing bits and bytes unintelligently. That’s why data storage providers introduced the concept of “Data Temperature.”
To illustrate: there is usually a short burst of frenzied activity with newly created data, but this activity rapidly drops off over time. Typically 90% of I/O activity takes place in 10% of data storage. It is also true for most organizations that only about 20% of all data is being actively used. That leaves 80% of data just sitting there chilling. It might be used once a month, or once a year, or never again. The image below shows how data temperature equates to its value. Hot data is in active use and is most valuable to the organization. Inactive data is cold and less valuable, but you still have to store it for possible future use, which would make it hot again.
Note that access frequency need not be the only determining factor for classifying data as inactive or cold. For unstructured data, other business requirements can determine when data is deemed inactive, such as the age of the data, the cost of storing it, its protection level, compliance and so on.
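As a rough sketch of how such a rule might look in practice, the snippet below labels data as hot or cold using two of the factors mentioned above: time since last access and age. The function name and both thresholds (90 days idle, three years old) are hypothetical values chosen for illustration, not recommendations; a real policy would also weigh cost, protection level, and compliance.

```python
from datetime import datetime, timedelta


def data_temperature(created, last_accessed, now,
                     max_idle=timedelta(days=90),
                     max_age=timedelta(days=3 * 365)):
    """Return "hot" or "cold" for one piece of data.

    Thresholds are illustrative assumptions, not industry standards.
    """
    # Rule 1: data nobody has touched recently is cold.
    if now - last_accessed > max_idle:
        return "cold"
    # Rule 2: access is not the only factor; very old data can be
    # deemed cold by policy even if it was read recently.
    if now - created > max_age:
        return "cold"
    return "hot"


now = datetime(2024, 1, 1)
print(data_temperature(datetime(2023, 12, 1), datetime(2023, 12, 20), now))
print(data_temperature(datetime(2023, 1, 1), datetime(2023, 6, 1), now))
```

The first call describes a month-old file read twelve days ago, so neither rule fires; the second describes a file that has sat idle for about seven months.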
Let’s look at the unstructured data world where data is more distributed and the two popular formats of data storage: file system and object storage.
What is File Storage?
File storage (aka file-based storage or file-level storage) is the type of data storage where data is stored in a hierarchical file and folder structure. A file is stored as a whole rather than being broken down into blocks, as happens in block storage. Files can be stored in folders, which can then be placed in other folders in a nested structure. To call up a file again from its storage location, you need its directory path, that is, the chain of folders it is stored in. NAS systems typically use file storage and are comparatively less expensive than block storage systems.
If you have a computer, you’ve used a file system. File systems contain documents, presentations, images, all the sorts of resources we move around on our desktop or store in our ‘Documents’ folder. File systems give us a hierarchical system for organization. It’s a similar approach to using a filing cabinet with the data organized into named directories, folders, subfolders and files. Applications and users know where everything is based on name and location. File systems are great for simple in and out access, provided you know the location of what you’re looking for.
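To make the “name and location” idea concrete, here is a small sketch using Python’s standard pathlib module. The helper names `save_file` and `load_file` and the example folder layout are made up for illustration; the point is only that a file is addressed by its full path through the folder hierarchy.

```python
import tempfile
from pathlib import Path


def save_file(root: Path, relative_path: str, text: str) -> Path:
    """Write text to root/relative_path, creating nested folders as needed."""
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text)
    return target


def load_file(root: Path, relative_path: str) -> str:
    """Read a file back; this only works if the caller knows its path."""
    return (root / relative_path).read_text()


# A temporary directory stands in for a home folder or file share.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    save_file(root, "Documents/reports/2024/q1.txt", "quarterly numbers")
    print(load_file(root, "Documents/reports/2024/q1.txt"))
```

Asking for the same file name under a different folder raises FileNotFoundError, which is the practical meaning of “provided you know the location.”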
For file storage beyond the ordinary desktop or laptop, organizations use NAS (Network Attached Storage) solutions and file servers to provide specialized and optimized file share capabilities across a network. They usually provide NFS and SMB protocol support for use in Unix, Linux, and Windows environments. These are great for file and document storage or sharing.
NAS systems also provide access control. But as you know from your own desktop, you’re only working on a few files at a time. Most of the files on your hard drive are cool or cold. When the same is true on a file server or NAS, cold data takes up capacity alongside the hot data, and eventually the system runs out of storage or performance bogs down, just like your notebook. In such cases, IT organizations can consider object storage as a place to store cold (or inactive) data.
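One simple way to get a feel for how much cold data sits on a file share is to scan it for files that have not been read in a long time. The sketch below does this with the last-access timestamp reported by the operating system. The function name `find_cold_files` and the idle threshold are assumptions for illustration, and access times are not always reliable (some file systems are mounted with access-time updates reduced or disabled), so treat the result as a rough estimate rather than a migration list.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_cold_files(root, max_idle_days, now=None):
    """Return files under root whose last access is older than max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    # rglob("*") walks every nested folder; st_atime is the last-access time.
    return sorted(
        path for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_atime < cutoff
    )
```

For example, `find_cold_files("/mnt/share", 365)` would list files on a hypothetical share mounted at that path that have not been accessed in the past year.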
What is Object Storage?
Object storage (aka object-based storage) is a type of data storage used to handle large volumes of unstructured data where data is bundled along with metadata tags and a unique identifier. Each of these self-contained objects is placed into a flat address space, known as a storage pool. Unlike file storage, object storage does not follow a hierarchical structure. The metadata contains a description of the data, and the unique identifier is used to retrieve the object instead of a file name and file path. Amazon S3 is a popular cloud-based object storage option, alongside on-premises object storage deployments.
Object storage is a more recent approach that doesn’t impose a file system on the data. Instead, metadata is used to describe all the details about the underlying data. This can include the name, creation date, location, owner and much more. Tables are used to make it possible to store, track and retrieve data based on this metadata.
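The following toy, in-memory sketch illustrates those ideas: a flat pool keyed by a unique identifier, with metadata stored alongside each object and used for lookups. The `ObjectStore` class and its `put`, `get`, and `find` methods are hypothetical names invented for this example; they are not the API of S3 or of any real object storage product.

```python
import uuid
from dataclasses import dataclass


@dataclass
class StoredObject:
    data: bytes
    metadata: dict


class ObjectStore:
    """Toy object store: one flat dictionary, no folders or paths."""

    def __init__(self):
        self._pool = {}  # unique ID -> StoredObject

    def put(self, data: bytes, **metadata) -> str:
        """Store data with its metadata and hand back a unique ID."""
        object_id = str(uuid.uuid4())
        self._pool[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve data by ID alone; no location is needed."""
        return self._pool[object_id].data

    def find(self, **criteria) -> list:
        """Return the IDs of objects whose metadata matches every criterion."""
        return [
            object_id for object_id, obj in self._pool.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
scan_id = store.put(b"...image bytes...", owner="alice", kind="scan")
store.put(b"...log bytes...", owner="bob", kind="log")
print(store.get(scan_id))
print(store.find(owner="alice") == [scan_id])
```

The caller keeps only the returned ID and never learns where the object physically lives, while metadata queries such as `find(owner="alice")` take the place of browsing folders.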
This works in the same way as using a valet service at a car parking facility. Imagine millions of cars in an enormous parking lot. The valet provides a parking ticket in exchange for your car and then parks it for you. You don’t need to know where it’s parked, just that it’s safe and will be available when you need it next. It can be retrieved by the valet at any time based on the information (or metadata) on the parking ticket, no matter the size of the parking lot.
The advantages of object storage include low cost, massive scalability, and global access capabilities. The trade-offs include latency and performance, but these are improving over time. For users who almost never need access to old files and documents, the added latency is almost invisible. And for organizations that need to keep everything for regulatory compliance or legal defense, object storage is essential.
In conclusion, both file storage and object storage have their own advantages and use cases. File storage is great for simple access to data organized in a hierarchy of files and folders, while object storage is designed to handle large volumes of unstructured data with high scalability and flexibility. Understanding the differences between these two storage types, their applications and benefits is crucial for organizations to make informed decisions on how to store and manage their data effectively. By using the right storage platform for the right type of data, organizations can optimize their storage resources, improve their operational efficiency, and enhance their overall business performance.
Contact DataCore to find the right storage solution for your needs.