“In most data centers, 80 percent or more of stored data hasn't been accessed in more than a year. Tighten that time frame up, and we find 95 percent of data has not been accessed in the last 90 days.”
Those somewhat bracing statistics come from George Crump, president of Storage Switzerland, an IT analyst firm, in a recent commentary on how actual data-use patterns should shape storage strategies.
For starters, he notes, it’s important for IT managers to realize that archiving data to secondary storage doesn’t mean it won’t be readily accessible. On the contrary, he says, archived data can be retrieved almost as quickly as primary data because there aren’t as many demands on archive storage tiers.
Granted, access patterns will vary from sector to sector, but Crump points out that “primary storage is responding to hundreds, if not hundreds of thousands, of recall requests per second, while an archive typically responds to one or two per hour. Archives are usually busier dealing with inbound write traffic than old data being accessed. With less I/Os to respond to, disk-based archive storage can respond to individual requests almost as fast as primary storage.”
Next, he says, a “logical enterprise data archiving strategy is to archive data on an as-needed basis -- typically, as those primary systems come off of maintenance, have reached end of life or are full to the point that more capacity or another primary storage system must be purchased. You'll want to know how much of the data on that array can be archived. With that information, you should buy just that amount of storage from your archive vendor, enabling you to put off the purchase of a primary storage system or to run a much smaller high-performance storage system.”
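As a rough illustration of the sizing step Crump describes, the sketch below estimates how much of a directory tree could be archived by summing the sizes of files whose last-access time is older than a cutoff. It is a minimal sketch, not a method from the commentary: the function name `archivable_bytes` and the 90-day default are assumptions, and it relies on file system access times, which may be coarse or disabled on some systems (for example, volumes mounted with `noatime`).

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def archivable_bytes(root, max_idle_days=90, now=None):
    """Return (archivable_bytes, total_bytes) for regular files under root.

    A file counts as archivable when its last-access time is more than
    max_idle_days before `now`. Both names are hypothetical; the 90-day
    default is only an assumption borrowed from the statistic quoted above.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    archivable = 0
    total = 0
    for path in Path(root).rglob("*"):
        try:
            # Skip directories, symlinks and other non-regular entries.
            if path.is_symlink() or not path.is_file():
                continue
            st = path.stat()
        except OSError:
            # Unreadable or vanished entries are simply left out of the estimate.
            continue
        total += st.st_size
        if st.st_atime < cutoff:
            archivable += st.st_size
    return archivable, total
```

Calling `archivable_bytes("/path/to/share")` on an array's mount point would give a first estimate of how much archive capacity to buy; a real assessment would probably rely on the storage or archive vendor's own reporting tools.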
Of course, Crump says, if an aggressive enterprise data archiving strategy is followed, then IT managers should prepare for more frequent data recalls from users. The key is to “make sure most of those recalls can occur without IT interruption. That means you need to select software that can set transparent links between where the file used to be and where it is on the archive. It's also important to remember the archive might be multistep, on-premises disk to tape or on-premises disk to the cloud, which means that these links must be updated with the file location each time it moves to another storage device.”
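To make the idea of transparent links concrete, here is a minimal in-memory sketch of a link table that maps a file's original path to wherever its archived copy currently lives, and is updated each time the file moves to another tier. The class `ArchiveLinks`, its methods and the example locations are all hypothetical; commercial archiving software would typically implement this with file system stubs or a persistent catalog rather than a Python dictionary.

```python
class ArchiveLinks:
    """Maps a file's original path to the current location of its archived copy."""

    def __init__(self):
        self._location = {}  # original path -> current archive location
        self._history = {}   # original path -> earlier locations, oldest first

    def archive(self, original_path, location):
        """Record the first move from primary storage to the archive."""
        self._location[original_path] = location
        self._history[original_path] = []

    def move(self, original_path, new_location):
        """Update the link when the file moves to another archive tier."""
        if original_path not in self._location:
            raise KeyError(f"{original_path} has not been archived")
        self._history[original_path].append(self._location[original_path])
        self._location[original_path] = new_location

    def resolve(self, original_path):
        """Return where a recall for the original path should be served from."""
        return self._location[original_path]


# Hypothetical example: a file goes to archive disk first, then on to tape.
links = ArchiveLinks()
links.archive("/projects/q3/report.docx", "disk://archive01/report.docx")
links.move("/projects/q3/report.docx", "tape://vault/0042")
print(links.resolve("/projects/q3/report.docx"))  # tape://vault/0042
```

Because users keep asking for the original path, a recall can succeed without IT involvement as long as every move goes through `move` and the table stays current.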
Finally, Crump points out that archiving should not be looked at as a storage diet that's done every so often. Instead, he says, “it’s an organizational change that occurs gradually and, once fully applied, never stops.”