There is no question that the resources required to process, analyze, and manage petabytes of genomic information represent a huge burden for even the largest academic research facility or healthcare institution. That burden becomes even greater when one factors in the need to handle these data in compliance with an alphabet soup of regulatory regimes: HIPAA, CLIA, GCP, GLP, 21 CFR Part 11, and their counterparts outside the United States, including data privacy laws in jurisdictions such as the European Union.
In this context, cloud-based solutions for managing, analyzing, storing, and sharing data can provide some relief. Computing and storage resources are available on demand, with no need to lease brick-and-mortar facilities, purchase equipment, or hire staff to maintain them.
Despite the advantages of cloud computing, organizations are often hesitant to use it because of concerns about security and compliance. Specifically, they fear unauthorized access to patient data and the liability and reputational damage that follow from having to report a HIPAA breach. While these concerns are understandable, a review of data on HIPAA breaches published by the US Department of Health and Human Services (HHS) shows that they are misplaced. In fact, by using a cloud-based service with an appropriate security and compliance infrastructure, an organization can significantly reduce its compliance risk.
A HIPAA Primer
The Health Insurance Portability and Accountability Act of 1996 (“HIPAA”), as amended by the Health Information Technology for Economic and Clinical Health Act of 2009 (“HITECH”), protects “individually identifiable health information,” which the rule calls “protected health information,” or “PHI.”
Opinions differ as to whether a human genome, stripped of identifiers such as name or social security number, constitutes PHI. The answer depends on whether there are sufficient publicly available reference data sets to create a “reasonable basis to believe” that a genome can be associated with an identified individual. Recent publications suggest that if these data are not currently classified as PHI, they will be soon (Rodriguez et al., 2013). As a consequence, organizations that handle genomic data are well advised to implement systems that treat a whole genome as PHI, even if public reference data sets are not yet common enough to make it PHI today.
Entities that are obligated to comply with HIPAA are often particularly concerned with the obligation to report HIPAA breaches and the associated potential harm to their reputations. These reporting obligations create powerful incentives for organizations to implement systems and processes to reduce risk.
How Have Large HIPAA Breaches Happened?
Since 2009, some 21 million health records have been compromised in major HIPAA security breaches reported to the US government. Loss or theft of electronic equipment or storage media has been the source of more than 66% of all large HIPAA breaches during this period. The individuals affected by these breaches amount to nearly 73% of all individuals affected by large HIPAA breaches reported to HHS during the same time period. In most cases the theft or loss involved a laptop or electronic media, such as a flash drive, containing unencrypted PHI. In contrast, large breaches attributed to hacking amounted to 8% of the total incidents and affected 6% of the individuals whose PHI was disclosed.
These data suggest that the implementation of IT systems that enable secure sharing of information without the need to transport it on a computer or storage media will go a long way toward eliminating the majority of large HIPAA breaches.
Use the Cloud to Reduce HIPAA Risk
The first, and perhaps the most important, step one can take in reducing the risk of HIPAA breaches is to make sure that users of PHI are not transporting unencrypted data on portable equipment (like laptops) or media (like flash drives). New genomic data management systems support this goal by keeping data in the cloud and providing access to users via a web browser. In this architecture, only the PHI that the user is viewing in his or her web browser is resident on the user’s computer; all other data remain on secure servers.
The use of the cloud can also facilitate enforcement of encryption requirements. For example, many of the new commercial systems encrypt all data both in transit and at rest. This means that even if data somehow become accessible to an unauthorized person, that person could not read them without also obtaining the encryption key. While the same protection is possible with an on-premises data center, it is much harder to enforce when users download data and store copies on their own devices.
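As a rough illustration of how encryption in transit can be enforced from the client side, the following Python sketch builds a TLS configuration that refuses unverified or outdated connections. It uses only Python's standard ssl module. The helper name and the choice of TLS 1.2 as the minimum version are assumptions made for this example; they are not requirements taken from HIPAA or from any particular vendor's system.

```python
import ssl


def make_strict_tls_context() -> ssl.SSLContext:
    """Return a client-side TLS context that verifies certificates and hostnames."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: reject anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_strict_tls_context()
```

A context like this can be passed to standard-library clients, for example urllib.request.urlopen(url, context=context), so that a connection to a server with an invalid certificate fails instead of silently proceeding. Encryption at rest is a separate concern that depends on a vetted cryptography library and careful key management, and is not shown here.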
Further, the significant costs of security audits, certifications, and assessments to demonstrate best efforts to comply with HIPAA security requirements are more easily borne by cloud providers than by private data centers. Such certifications could provide meaningful defense against civil or criminal prosecution, even if there is an unavoidable breach.
The use of an appropriately designed and developed cloud-based system for managing genomic PHI can also facilitate compliance with the physical and technical safeguards required by the HIPAA Security Rule. Most cloud service providers implement physical security measures that exceed those that are practical for all but the largest of single-institution data centers. In addition, systems designed to manage genomic data automatically implement technical and other safeguards to ensure data confidentiality and integrity, including encryption, multi-factor authentication, automatic session timeouts, and logging for auditability.
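Two of the safeguards listed above, automatic session timeouts and logging for auditability, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any particular product works: the 15-minute idle limit, the class names Session and AuditLog, and the hash-chained log format are all hypothetical choices made for the example.

```python
import hashlib
import json
import time

SESSION_TIMEOUT_SECONDS = 15 * 60  # assumed idle limit for this sketch


class Session:
    """Tracks the last activity time for one signed-in user."""

    def __init__(self, user, now=None):
        self.user = user
        self.last_active = time.time() if now is None else now

    def is_expired(self, now=None):
        """Return True once the session has been idle longer than the limit."""
        now = time.time() if now is None else now
        return now - self.last_active > SESSION_TIMEOUT_SECONDS

    def touch(self, now=None):
        """Record activity, which resets the idle timer."""
        self.last_active = time.time() if now is None else now


class AuditLog:
    """Append-only log in which each entry stores a hash of the previous entry."""

    GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(entry):
        # Hash every field except the entry's own hash.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def record(self, user, action, now=None):
        """Append one entry linked to the entry before it."""
        previous = self.entries[-1]["hash"] if self.entries else self.GENESIS_HASH
        entry = {
            "time": time.time() if now is None else now,
            "user": user,
            "action": action,
            "previous_hash": previous,
        }
        entry["hash"] = self._digest(entry)
        self.entries.append(entry)

    def verify(self):
        """Return True if no entry has been altered, removed, or reordered."""
        previous = self.GENESIS_HASH
        for entry in self.entries:
            if entry["previous_hash"] != previous:
                return False
            if entry["hash"] != self._digest(entry):
                return False
            previous = entry["hash"]
        return True
```

Chaining the hashes makes after-the-fact edits to an entry detectable when the log is verified. It does not by itself stop someone with full write access from rebuilding the entire chain, so a real system would also need to store the log, or periodic copies of its latest hash, somewhere the application cannot modify.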
While it may not be intuitively obvious, in most cases an organization that handles genomic PHI can dramatically reduce its compliance risk by using a cloud-based solution consistent with the standards described in this article.
Rodriguez, L. et al., “The Complexities of Genomic Identifiability,” Science, vol. 339, no. 6117, p. 275 (January 18, 2013).