Healthcare Provider Ensures the ‘Health’ and Safety of Customer Data with Luminex MVT Solution
Healthcare providers recommend prevention for their clients; it’s smarter, safer and less expensive than waiting until things go wrong. This provider opted for a proactive prevention strategy to protect their customer data by moving away from physical tape for mainframe operations. As their customer base grew, their existing tape infrastructure was reaching its capacity and performance limits. The process of managing three copies of tape data was cumbersome and slow, and did not reflect their position as a leader in the healthcare industry.
Luminex prescribed an advanced, future-proof solution: Mainframe Virtual Tape (MVT) with Synchronous Tape Matrix (STM) and CGSafe encryption. The remedy would remove capacity and performance limitations, minimize the need for disaster recovery operations while simplifying the DR process, and provide a higher level of security for their customers’ critical personal and confidential data.
Symptoms and Diagnosis
The old system required over 9,000 physical tapes and over 225,000 virtual volumes with two backup copies, one onsite and one offsite, for a total of 3 copies of their data. The STK VSM/9310 silos (disk cache with tape backend) had once been state-of-the-art, but as time went on, the storage subsystem simply couldn’t keep up with the demands of a thriving business. As CPUs got faster, the systems waited too often for the I/O subsystem to respond, causing performance issues. The problem was exacerbated by increasing amounts of data.
Previous tape and disaster recovery environment: physical tape with virtual tape cache and PTAM tape delivery to DR site
Beyond performance and capacity concerns, the need to ensure continuous access to and availability of customer data also drove the provider to consider a new approach.
The disaster recovery process required 2 pallets of tapes to be delivered to the DR site daily, a time-consuming manual task that introduced unnecessary vulnerabilities. Even after restoring from over 1,000 tapes over a two-day period, only critical data would be restored, leaving the rest of the data behind. Such an effort limited the feasibility of frequent DR tests, yielded subpar results and made even minor tape infrastructure incidents potentially expensive and exhausting events.
Consequently, the provider wanted to achieve continuous availability to avoid declaring a disaster for anything other than the most extreme circumstances. If a disaster were declared, the process would need to be significantly faster and more efficient, and would need to include the most current and comprehensive data.
Prescription: Luminex MVT with STM and CGSafe
Continuous availability configuration with 2 copies of data at production, a third copy at DR, and end-to-end encryption
Luminex’s Subject Matter Experts considered the provider’s requirements and prescribed a modern approach to provide 3 copies of data, continuous availability and secure, efficient DR capabilities as a last line of defense.
Continuous availability is now provided by a local STM configuration at the production site, which employs a system of redundant tape controllers, storage systems and data channels to provide mirrored tape data and uninterrupted tape availability to the mainframe. As every component services I/O, if one component becomes unavailable, I/O is automatically serviced by the rest of the system. Though mainframe operators will be alerted in the event of a component outage, the mainframe will continue tape operations without operator intervention. Once components are restored, STM will automatically ensure that all copies of data are brought back to full parity. The provider can now sustain multiple tape component outages without ever stopping production and backup operations, or declaring a disaster.
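The failover and resynchronization behavior described above can be pictured with a small model. This is only a sketch under stated assumptions, not Luminex's implementation: the `Controller` and `MirroredTape` classes, their methods, and the sample VOLSERs are hypothetical names introduced for illustration.

```python
class Controller:
    """Hypothetical stand-in for one redundant tape controller and its storage."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.volumes = {}  # VOLSER -> volume contents


class MirroredTape:
    """Illustrative model: every online controller services every write."""

    def __init__(self, controllers):
        self.controllers = controllers

    def write(self, volser, data):
        live = [c for c in self.controllers if c.online]
        if not live:
            raise RuntimeError("no tape controller available")
        for c in live:
            c.volumes[volser] = data  # mirrored copy on each live component
        return [c.name for c in live]

    def read(self, volser):
        # Any online controller holding the volume can service the read.
        for c in self.controllers:
            if c.online and volser in c.volumes:
                return c.volumes[volser]
        raise KeyError(volser)

    def restore(self, name):
        """Bring a controller back and copy what it missed from a live peer."""
        target = next(c for c in self.controllers if c.name == name)
        for peer in self.controllers:
            if peer is not target and peer.online:
                target.volumes.update(peer.volumes)
                break
        target.online = True


a, b = Controller("A"), Controller("B")
matrix = MirroredTape([a, b])
matrix.write("V00001", b"first volume")
a.online = False                          # simulate a component outage
matrix.write("V00002", b"second volume")  # I/O continues on controller B
matrix.restore("A")                       # A is brought back to parity
```

In this toy model, the outage of controller A never raises an error for the caller, and `restore` brings A's contents back in line with B's, which mirrors the parity behavior the text attributes to STM.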
Customer data is now protected by CGSafe encryption and key management for its entire lifecycle. As data is written over the FICON channel, it is compressed and encrypted before landing on disk storage. It remains encrypted even as it is replicated for DR, and is decrypted only as it enters the FICON channel back to the mainframe.
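The order of operations matters here: well-encrypted data looks random and no longer compresses, so compression has to come first. The sketch below illustrates that point only. It assumes a toy XOR keystream as a stand-in for real encryption (it is not secure and is not the CGSafe algorithm, which the source does not describe), and the function names and sample record are hypothetical.

```python
import hashlib
import zlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher for illustration only: XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(byte ^ p for byte, p in zip(chunk, pad))
    return bytes(out)


def write_to_disk(key: bytes, record: bytes) -> bytes:
    # Compress first, then encrypt, matching the order given in the text.
    return keystream_xor(key, zlib.compress(record))


def read_from_disk(key: bytes, stored: bytes) -> bytes:
    # Reverse the steps: decrypt, then decompress.
    return zlib.decompress(keystream_xor(key, stored))


key = b"demo-key"                      # hypothetical key, illustration only
record = b"PATIENT-RECORD " * 1000     # highly repetitive sample data
stored = write_to_disk(key, record)
wrong_order = zlib.compress(keystream_xor(key, record))

print(len(record), len(stored), len(wrong_order))
```

Comparing the three printed sizes shows the compress-then-encrypt result is far smaller than the original, while encrypting first leaves the compressor with nothing to remove.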
Finally, a third copy of all tape data is sent to the disaster recovery site by Luminex Replication. The replication process is monitored by RepMon, which provides VOLSER-level status reporting in real time via a GUI, as well as logs for auditing. Luminex Replication also includes Push Button DR, which simplifies the disaster recovery process and enables DR testing without impacting normal operations. By enabling DR Test Mode, the MVT at the DR site prepares a space-efficient partition for read and write testing that can optionally be saved and sent back to the production site for auditing. All the while, replication from production to DR is never interrupted. Once testing is complete, the test partition is purged to reclaim storage capacity.
In addition to gaining a more robust and secure environment, the provider experienced a seamless transition from their old environment to the new Luminex solution. Luminex’s tape migration software and services cloned their existing tapes in parallel with daily tape operations. Because the migration created identical virtual copies and retained the original VOLSERs, no changes to the tape catalog were needed. The MIPS-friendly software also allowed the provider to increase or decrease the rate of the migration on demand to make the most of off-peak host activity. After a brief cutover, tape operations continued on the new solution, with the only noticeable changes being significant improvements in performance and resilience.
Immediately, the provider experienced much improved tape I/O performance. An 18–20 hour job cycle now completes in 2–4 hours, depending on the load. The Luminex solution also left them room to grow; once the tape migration process was complete, the solution was at 54% space utilization.
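To put those figures in perspective, the arithmetic can be worked through directly. The short calculation below uses only the numbers quoted above; the function and variable names are illustrative.

```python
# Job-cycle durations in hours and post-migration utilization, from the text.
old_cycle = (18, 20)
new_cycle = (2, 4)
utilization = 0.54


def speedup_range(old, new):
    """Return (lowest, highest) speedup for two (min, max) duration ranges."""
    lowest = min(old) / max(new)   # shortest old run vs. longest new run
    highest = max(old) / min(new)  # longest old run vs. shortest new run
    return lowest, highest


lowest, highest = speedup_range(old_cycle, new_cycle)
free_capacity = 1 - utilization

print(f"speedup: {lowest:.1f}x to {highest:.1f}x")
print(f"free capacity after migration: {free_capacity:.0%}")
```

On these numbers the job cycle runs between 4.5 and 10 times faster, and 46% of the installed capacity remained free once the migration was complete.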
Tape data availability has reached new levels, with STM providing two always-available on-site copies of data without the need to write to, or restore from, physical tape. Disaster recovery preparedness is also better than ever. Full restoration DR tests now take less than 3 hours, versus the old process, which took 2 days to restore “critical only” data. This has led the provider to re-architect how they do their backups to take advantage of the faster RTO to get a better RPO than their traditional weekly full backups.
Finally, and perhaps most importantly for their customers, all of their tape data is now encrypted at rest and in flight. The provider achieved all of their goals by implementing the Luminex solution, noting that there would be “no more all nighters” and proclaiming it “a stunning success!”