November 20, 2008
Disk-based archiving can be the most logical step for IT professionals looking to add capacity and increase storage efficiencies, while avoiding the additional cost of purchasing more primary storage. Disk-based archiving solves today's data retention and compliance needs, while laying the foundation for upcoming requirements for data retention. Knowledge Center contributor George Crump explains the benefits of using disk-based archiving in your enterprise.
In these cost-conscious times, CIOs are under increasing pressure to cut their IT budgets or keep them flat. Budgets are tightening across all industries: some are flat compared to last year, some are being reduced, and those that are growing at all are growing by only a small percentage.
Yet more than likely, if you are an IT decision maker, you have planned for more primary storage capacity (or maybe even a new primary storage system altogether). In either scenario, before you make that purchase, consider an Enterprise Disk Archive as your solution.
With an Enterprise Disk Archive, you can solve today's urgent need by freeing up a significant amount of primary storage capacity, while laying the foundation for future requirements such as data retention and data compliance. More than 80 percent of data becomes inactive within 90 days of creation and is never accessed again. To put that percentage in concrete terms: in a 500 TB data center, only about 100 TB of data is being actively accessed, and about 400 TB should not be on primary storage.
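The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the function name and parameters are hypothetical, and the 80 percent figure is the article's rule of thumb, not a measurement of any particular data center.

```python
def split_capacity(total_tb, inactive_percent):
    """Split total capacity into (active, inactive) terabytes.

    inactive_percent is the share of data assumed to be inactive,
    e.g. 80 for the article's rule of thumb.
    """
    inactive_tb = total_tb * inactive_percent / 100
    active_tb = total_tb - inactive_tb
    return active_tb, inactive_tb


# The article's example: a 500 TB data center with 80 percent inactive data.
active, inactive = split_capacity(500, 80)
print(f"active: {active:.0f} TB, archive candidates: {inactive:.0f} TB")
```

Substituting your own capacity and a measured inactive share gives a first estimate of how much primary storage an archive could free up.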
Saving money with enterprise disk archives
An Enterprise Disk Archive can store this data at a fraction of the price, while improving long-term data reliability and data retention. The inactive data can be moved easily and cost-effectively to an archive storage platform, which saves you the cost of expanding your primary storage platform. The archive provides easy access, high availability and substantially improved data protection, and it lets you quickly find data when you need it for legal or business reasons.
Also, an archive, especially a disk-based one, does not have to be limited to just old files. It can also be used for extra copies of files that perhaps should never have been on primary storage. For example, database environments always seem to accumulate redundant copies of themselves: extra backups, archives, dumps or just straight copies. Many times, these files were created "just in case" but never cleaned up. Now they can be safely archived to a far more cost-effective and efficient platform.
Investing in data retention
Looking at this from a cost basis, primary disk (despite decreases in disk cost and increases in disk capacities from Tier 1 suppliers) still costs about $30 to $40 per GB once you factor in the controller, software and maintenance. It is not uncommon for disk-based archiving solutions to cost dramatically less than that, even approaching the cost of tape. This is not simply an investment in a cheaper platform; it is a platform that almost every data center will need as increasing attention is paid to data retention.
The savings on primary storage thus become the starting point for a medium-term initiative that most organizations face, namely data retention. Most of the data that will be archived to free up primary storage is the very same data that will need to be retained for legal and compliance reasons. Some data may need to be held for more than 50 years. The challenge is that you don't know when in those 50 years it will be needed again. But when it is needed, it will be needed quickly (within a couple of days in the event of legal action). This means it will need to be searchable and on disk in a common data format.
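To see what those per-GB figures mean at scale, here is a rough Python sketch. The $30 to $40 range comes from the text above; everything else is an assumption made for illustration: the function names are hypothetical, 1 TB is taken as 1,000 GB, and the archive price per GB is a placeholder you would replace with a real vendor quote.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes

# Fully loaded primary disk cost in dollars per GB, as quoted in the article.
PRIMARY_COST_PER_GB = (30, 40)


def primary_cost_range(capacity_tb):
    """Return the (low, high) dollar cost of holding capacity_tb on primary disk."""
    capacity_gb = capacity_tb * GB_PER_TB
    low, high = PRIMARY_COST_PER_GB
    return capacity_gb * low, capacity_gb * high


def savings_range(capacity_tb, archive_cost_per_gb):
    """Return the (low, high) dollar savings from archiving capacity_tb.

    archive_cost_per_gb is a hypothetical input; the article gives no figure.
    """
    low, high = primary_cost_range(capacity_tb)
    archive_cost = capacity_tb * GB_PER_TB * archive_cost_per_gb
    return low - archive_cost, high - archive_cost


# 400 TB of inactive data priced as primary storage.
low, high = primary_cost_range(400)
print(f"primary cost: ${low:,} to ${high:,}")
```

For the 400 TB of inactive data in the earlier example, the primary storage cost works out to between $12 million and $16 million, which is the number any archive quote should be compared against.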
Identifying the archive target
The first step is to select the archive target. While conventional wisdom says to identify the data first, it makes more sense to identify the archive target because that will indicate how aggressive you can be with migration. For example, if you choose to archive to optical or tape, you cannot be as aggressive about which data you archive, for fear of slow recovery when a request comes in.
If you choose to use a simple shelf of Serial ATA (SATA) drives as an extension to your existing array, you typically will be limited by cost but, more importantly, by scale. Most of these systems are limited by the capacity of the shelf and they don't have the archive-specific features that are required for long-term retention of data (such as data integrity checking and WORM file systems).
Using purpose-built disk archives
A purpose-built disk archive is ideal for this role. First, it provides the ability to scale both in capacity and across technology generations. Capacity scaling is done by adding nodes, growing the archive to multiple petabytes as needed. Generational scaling is the ability to perform a rolling upgrade of the technology as it ages: data on old modules can migrate to new ones, and the old modules can be retired seamlessly.
Disk also allows presentation via standard network mount points such as Common Internet File System (CIFS) and Network File System (NFS). Although some disk archives have a proprietary API set for access, standard network access is by far the most advantageous. While there is no guarantee that NFS and CIFS will still be around 50 years from now, past experience suggests the odds favor them. Look back 10 years: you have a much better chance of accessing a CIFS-based Windows 95 system over your network than of finding a drive to mount a 10-year-old piece of media.
Benefitting from disk-based archive systems
Once the target is settled, you can evaluate the type of data to store on the archive and examine how that data should be moved there. This is another area where disk-based archive systems shine. If the archive is a network mount point, then any application that can mount a network file system can take advantage of it. This means a simple file system move command will work. While you may want something more sophisticated than a manual move command, it does work: move all the files that have not been accessed in the last year or more, then tell your users that if their data is not on the home drive, it is on the "archive" drive.
That archive drive is simply the network mount point of the disk-based archive system. While manual and dependent on user intervention, this process is simple, adds no software to the cost of the archive, and is extremely reliable. As one CIO I spoke with put it: “For the cost we save in software, and the odds of a user actually needing one of these files, we could have our help desk individually walk the user through the rare access from the archive.”
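The manual approach described above can be scripted. What follows is a minimal Python sketch, not a production tool: the function names are hypothetical, the archive is assumed to be an ordinary mounted directory outside the source tree, and it relies on file access times, which some systems do not update reliably (for example, volumes mounted with access-time updates disabled).

```python
import os
import shutil
import time

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_idle_days=365, now=None):
    """Yield files under root whose last access is older than max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_atime < cutoff:
                yield path


def archive_stale_files(source_root, archive_root, max_idle_days=365, now=None):
    """Move stale files to archive_root, keeping their relative paths.

    Returns the list of destination paths. archive_root is assumed to be
    the mount point of the archive and must not live inside source_root.
    """
    moved = []
    # Collect the candidates first so files are not moved mid-walk.
    for path in list(find_stale_files(source_root, max_idle_days, now)):
        relative = os.path.relpath(path, source_root)
        destination = os.path.join(archive_root, relative)
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        shutil.move(path, destination)
        moved.append(destination)
    return moved
```

Because the directory layout is preserved under the archive root, a user who cannot find a file on the home drive can look for it at the same relative path on the archive drive. Running `find_stale_files` on its own first gives a dry-run list of what would be moved.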
If something more sophisticated is warranted, then disk-based archives also make excellent endpoints in a tiered storage strategy. Archiving software has caught up with the simplicities that enterprise disk archiving delivers. With a disk archive, files just need to move from "A" to "B" and have a transparent link set up between those files. The archive itself handles much of the sophisticated retention. That being said, if there is a data movement application in the environment, in most cases it can be redeployed with great success, since even legacy applications tend to support disk as a target.
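The "move from A to B and leave a link" idea can be approximated on a POSIX file system with a symbolic link. The Python sketch below is only an illustration under that assumption: the function name is hypothetical, and commercial archiving software typically uses its own stub or link mechanism, which this does not reproduce.

```python
import os
import shutil


def move_and_link(source_path, archive_dir):
    """Move a file into archive_dir and leave a symlink at its old location.

    Assumes a file system and account that allow symbolic links. A file
    with the same name already in archive_dir may be overwritten.
    """
    os.makedirs(archive_dir, exist_ok=True)
    destination = os.path.join(archive_dir, os.path.basename(source_path))
    shutil.move(source_path, destination)
    # The link makes the archived file reachable at its original path.
    os.symlink(os.path.abspath(destination), source_path)
    return destination
```

After the call, applications that open the original path are redirected to the archived copy, as long as the archive mount point is available.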