Data Protection for Object Storage

Elastic Content Protection with Replication and Erasure Coding

With the onslaught of data that organizations are experiencing, data protection has never been more critical. In addition to storing and providing data access on a scalable object storage platform, DataCore Swarm also provides built-in data protection capabilities to improve resilience and data durability.

Swarm’s Elastic Content Protection combines automated management of replication and erasure coding with continuous integrity checks and fast volume recovery. All nodes participate in recovery of lost data through Swarm’s innovative distributed algorithm, which gets faster as the cluster grows.

Replication

Copy-Based Protection

The simplest form of protection for data redundancy is replication. Replication uses copy-based data protection, where complete copies of a piece of content are made and distributed across nodes or sub-clusters. Because each replica is stored contiguously on disk, content can be delivered rapidly and efficiently as soon as its first bit is located, without the need for rehydration.

Replication can be as simple or as complex as required, allowing you to create an offsite disaster recovery (DR) cluster or even set up N-way replication for collaboration and data locality. Since data is replicated on a domain-by-domain basis, you can choose which data to replicate and where to send it. Both synchronous and asynchronous replication are supported in Swarm.

One or more additional copies of an “original” object are created and maintained so that the content remains available should the original become lost. For example, even after a catastrophic disk failure, the entire Swarm cluster participates in an active recovery that quickly restores the cluster to full replication. By making recovery fast, Swarm shortens the window of time during which another disk failure might impact the cluster. If you choose three replicas for an object and one is lost, the two remaining replicas effectively guarantee that a third replica is rapidly made.

Erasure Coding

Parity-Based Protection

Erasure coding provides enterprise-grade data protection with a smaller storage footprint, and is ideal for moderately sized files and large content stores. Erasure coding in Swarm breaks a file into multiple data segments and then computes parity segments. The resulting set of segments uses less capacity and fewer operational resources than an additional full copy.

Files are split into multiple data segments (K), and additional parity segments (P) are computed from the content of the data segments. This results in M total segments (K + P = M), which are distributed to M different Swarm nodes or sub-clusters.
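To make the footprint difference concrete, the K + P = M relationship above can be turned into a small calculation. The following is a minimal sketch, not a Swarm API: the function names are hypothetical, and it assumes that all segments are the same size and that every replica is a full copy.

```python
def erasure_coding_overhead(k: int, p: int) -> float:
    """Raw bytes stored per byte of content for K data + P parity segments.

    Assumes all M = K + P segments are the same size (illustrative only).
    """
    if k < 1 or p < 0:
        raise ValueError("k must be >= 1 and p must be >= 0")
    return (k + p) / k


def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per byte of content when keeping full copies."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return float(replicas)


def total_segments(k: int, p: int) -> int:
    """M, the number of segments distributed across nodes or sub-clusters."""
    return k + p
```

Under these assumptions, a 5:2 encoding produces 7 segments and stores 1.4 bytes for every byte of content, while three full replicas store 3.0 bytes for every byte of content.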

Erasure coding is similar to RAID applied at the per-object level. Should drives fail, all of the nodes in Swarm work in parallel to rebuild the missing segments as quickly as possible, so data is fully protected once again. Data segments and parity segments can each be spread across multiple drives; in general, a K:P encoding is designed to tolerate the loss of up to P segments of an object. The cluster can be configured to meet your uptime requirements.
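The RAID comparison can be illustrated with the simplest possible parity scheme. The sketch below uses a single XOR parity segment (P = 1), which is an assumption chosen for brevity: the text does not describe the encoding Swarm actually uses, and production erasure codes that support P > 1 rely on more elaborate math. All function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def split_into_segments(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-length segments, zero-padding the tail."""
    if k < 1:
        raise ValueError("k must be >= 1")
    seg_len = max(1, -(-len(data) // k))  # ceiling division, at least 1 byte
    padded = data.ljust(seg_len * k, b"\0")
    return [padded[i * seg_len:(i + 1) * seg_len] for i in range(k)]


def xor_parity(segments: List[bytes]) -> bytes:
    """XOR equal-length segments together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*segments))


def rebuild_missing(segments: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one lost data segment (marked None) from the parity."""
    missing = [i for i, seg in enumerate(segments) if seg is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild at most one lost segment")
    repaired = list(segments)
    if missing:
        survivors = [seg for seg in segments if seg is not None]
        # XOR of all surviving segments plus the parity yields the lost one.
        repaired[missing[0]] = xor_parity(survivors + [parity])
    return repaired
```

A usage example: split an object into four data segments, compute the parity, discard one segment to simulate a failed drive, and call `rebuild_missing` to recover it. Joining the repaired segments and trimming the padding gives back the original bytes.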

Swarm also analyzes the health of the drives to check for issues, so that data can be copied and protected proactively. This process can detect potential problems before a drive fails.

Optimize for Durability or Access

With its combined data resilience and recovery mechanisms, Swarm takes the worry out of protecting your data, ensuring it is always available. These mechanisms work with minimal configuration or manual intervention.

Erasure coding reduces the storage footprint and increases data durability, while replication ensures rapid access. Choose the protection method that fits your business, retention, or service level agreement (SLA) requirements. Set protection policies per object, and store replicated and erasure-coded objects on the same servers, ensuring optimal hardware use. Automatically shift between protection methods based on age, size, location, or type.

Meet Compliance Regulations

Swarm provides additional content protection capabilities that help you meet regulatory mandates requiring content to be stored on non-erasable, non-rewritable media. Swarm lets you use Legal Hold to create a point-in-time snapshot of a specified set of files. The files are then stored immutably, regardless of what happens to the original file or cluster. Patented technology lets you prove in a court of law that content has not been tampered with or altered. Integrity seals are based only on content and can be upgraded as newer hashing algorithms replace outdated ones.
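The idea of a content-only seal that can move to a newer hash algorithm can be sketched with Python's standard `hashlib`. This is an illustration of the general concept, not Swarm's patented mechanism: the `algorithm:hexdigest` seal format and the function names are assumptions made for this example.

```python
import hashlib
import hmac


def make_seal(content: bytes, algorithm: str = "sha256") -> str:
    """Return a seal derived only from the content (hypothetical format)."""
    digest = hashlib.new(algorithm, content).hexdigest()
    return f"{algorithm}:{digest}"


def verify_seal(content: bytes, seal: str) -> bool:
    """Recompute the digest named in the seal and compare it to the seal."""
    algorithm, _, expected = seal.partition(":")
    actual = hashlib.new(algorithm, content).hexdigest()
    return hmac.compare_digest(actual, expected)


def upgrade_seal(content: bytes, old_seal: str, new_algorithm: str) -> str:
    """Issue a seal under a newer algorithm, but only if the old seal holds."""
    if not verify_seal(content, old_seal):
        raise ValueError("content does not match its existing seal")
    return make_seal(content, new_algorithm)
```

Because the seal depends on nothing but the bytes of the object, any change to the content causes `verify_seal` to return `False`. Upgrading first checks the content against the old seal, so the new seal carries forward the assurance that the content was unaltered at the time of the upgrade.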