
Untold Secrets about File System Defragmentation in Software-Defined Storage Environments

“There is no Spoon” – Neo, The Matrix

So what does file system defragmentation mean in software-defined storage (SDS) environments?

The idea of file system defragmentation is to lay out data in sequential order so that files can be read faster. This works well on a single piece of spinning rust (a hard-disk drive), because data is written into sectors and each sector has a “home address” on the drive. Those addresses are static and locate the data in three dimensions: cylinder, head, and sector. Access latency comes from moving the disk head to the correct position and then waiting for the sector to “fly by”.
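To make the “home address” idea concrete, here is a minimal sketch of the classic cylinder/head/sector to linear block address conversion, plus a toy measure of how far the head travels. The geometry values (16 heads, 63 sectors per track) and the cylinder lists are illustrative assumptions, not figures from any particular drive, and the travel count ignores rotational delay.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder=16, sectors_per_track=63):
    """Classic CHS-to-LBA mapping; sectors are numbered from 1, the rest from 0."""
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def head_travel(cylinders, start=0):
    """Total number of cylinders the head crosses when visiting them in order."""
    travel, position = 0, start
    for cylinder in cylinders:
        travel += abs(cylinder - position)
        position = cylinder
    return travel


# Hypothetical file whose pieces are scattered across the platter.
fragmented = [900, 10, 850, 40, 700, 5]
# The same pieces visited in ascending cylinder order.
in_order = sorted(fragmented)

print(head_travel(fragmented))  # 4795 cylinders crossed
print(head_travel(in_order))    # 900 cylinders crossed
```

On a bare drive the ordered layout wins clearly, which is why defragmentation pays off there.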

RAID-level virtualization already translates those addresses into a virtual address scheme that no longer matches the physical layout of a drive, so defragmentation is almost useless. It can slightly improve large writes by increasing the probability of full-stripe writes, which reduces read-modify-write operations and therefore the write penalty.
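The write-penalty arithmetic can be sketched with a deliberately simplified RAID-5 model. It assumes one parity chunk per stripe and a plain read-modify-write for partial writes, and it ignores controller caches and reconstruct-write optimizations; the function name and the four-data-disk layout are hypothetical.

```python
def raid5_write_ios(chunks_written, data_disks):
    """Disk I/Os needed to write `chunks_written` chunks of one RAID-5 stripe."""
    if not 1 <= chunks_written <= data_disks:
        raise ValueError("chunks_written must be between 1 and data_disks")
    if chunks_written == data_disks:
        # Full-stripe write: parity is computed from the new data alone,
        # so there are no reads, only data writes plus one parity write.
        return data_disks + 1
    # Partial write (read-modify-write): read old data and old parity,
    # then write new data and new parity.
    return 2 * (chunks_written + 1)


data_disks = 4
one_full_stripe = raid5_write_ios(4, data_disks)      # 5 I/Os for 4 chunks
four_small_writes = 4 * raid5_write_ios(1, data_disks)  # 16 I/Os for 4 chunks

print(one_full_stripe, four_small_writes)
```

In this model the same four chunks cost 5 I/Os as one full stripe and 16 I/Os as four separate small writes, which is the modest benefit defragmentation can still bring to large writes.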

SDS abstraction, such as what happens within SANsymphony pools, adds another layer to the address scheme and, furthermore, relocates data to different physical “home addresses” for improved accessibility, while in the mind of the file system the logical “home address” is unchanged. To cite The Matrix again: “You think that’s air you’re breathing” – Morpheus. If applications show access latency issues in SDS environments, defragmentation may not solve them. Go and check the underlying machinery!

The Lights are on, but Nobody’s Home!

Think about a postman who orders the letters in his bag to match the house numbers of the street he is walking down. He walks down the street once and simply drops the letters into the mailboxes. If he instead picked the letters as they came and ran up and down the street, he would deliver far fewer letters because of all the time wasted running around.

Now, if automatic data tiering is active in an SDS environment, the physical “home address” of the data may move to a new location based on the access frequency of the data set. The logical “home address” in the file system stays unchanged, which effectively makes data access faster for a known address. Defragmentation, however, moves data from one logical “home address” to another. The new logical “home address” has an associated physical “home address” in the storage, but with an undefined access-frequency property, so the data may fall from fast access to slow access.
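The interaction between tiering and defragmentation can be illustrated with a toy model. This is not how SANsymphony or any specific product implements tiering: the `TieredPool` class, the promotion threshold, and the assumption that a freshly written logical address starts with no access history are all hypothetical simplifications.

```python
class TieredPool:
    """Toy storage pool that tracks access counts per logical address."""

    def __init__(self, promote_after=3):
        self.promote_after = promote_after
        self.heat = {}  # logical address -> number of reads seen so far

    def read(self, logical_addr):
        self.heat[logical_addr] = self.heat.get(logical_addr, 0) + 1

    def write_new(self, logical_addr):
        # Assumption: a newly written address has no access history yet.
        self.heat[logical_addr] = 0

    def tier(self, logical_addr):
        hot = self.heat.get(logical_addr, 0) >= self.promote_after
        return "fast" if hot else "slow"


def defragment_move(pool, old_addr, new_addr):
    """The file system relocates data; the pool only sees a write to a new address."""
    pool.write_new(new_addr)
    pool.heat.pop(old_addr, None)  # the old address is freed


pool = TieredPool()
for _ in range(5):
    pool.read(100)            # frequently accessed data at logical address 100
before = pool.tier(100)       # "fast"

defragment_move(pool, old_addr=100, new_addr=7)
after = pool.tier(7)          # "slow": the access history did not move with the data

print(before, after)
```

The data itself is unchanged, but in this model the pool has lost what it knew about how hot the data was, so the block must earn its place on the fast tier again.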

Getting back to the postman example: imagine the postman has ordered the letters in his bag to walk down the street, but people have swapped houses within the street, and the mailboxes only carry a note about the old owners and their new location. Now he must deal with the situation. Either he looks up each letter and reorders the bag to reflect the new layout of the street, or he follows the order of the letters in the bag and runs up and down the street to drop them into the right mailboxes. Neither is ideal.

Find out more about how best-in-class SDS solutions can transform your IT service delivery and performance with a fully functional trial download today!
