OCR releases HIPAA privacy rule guidance
Just two and a half years after hosting a workshop on the HIPAA Privacy Rule's de-identification standard, the Office for Civil Rights (OCR) has issued its "Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule." Like they say, it's not rocket surgery -- and there are few surprises here. One area worth reviewing is the expert determination section -- for those of you using, or considering the use of, expert opinions to guide your de-identification programs. Reproduced below is a table describing some of the principles used by experts in determining whether information has been de-identified:
Table 1. Principles used by experts in the determination of the identifiability of health information.
| Principle | Description | Examples |
|---|---|---|
| Replicability | Prioritize health information features into levels of risk according to the chance it will consistently occur in relation to the individual. | Low: Results of a patient’s blood glucose level test will vary |
| | | High: Demographics of a patient (e.g., birth date) are relatively stable |
| Data source Availability | Determine which external data sources contain the patients’ identifiers and the replicable features in the health information, as well as who is permitted access to the data source. | Low: The results of laboratory reports are not often disclosed with identity beyond healthcare environments. |
| | | High: Patient name and demographics are often in public data sources, such as vital records -- birth, death, and marriage registries. |
| Distinguishability | Determine the extent to which the subject’s data can be distinguished in the health information. | Low: It has been estimated that the combination of Year of Birth, Gender, and 3-Digit ZIP Code is unique for approximately 0.04% of residents in the United States. This means that very few residents could be identified through this combination of data alone. |
| | | High: It has been estimated that the combination of a patient’s Date of Birth, Gender, and 5-Digit ZIP Code is unique for over 50% of residents in the United States. This means that over half of U.S. residents could be uniquely described just with these three data elements. |
| Assess Risk | The greater the replicability, availability, and distinguishability of the health information, the greater the risk for identification. | Low: Laboratory values may be very distinguishing, but they are rarely independently replicable and are rarely disclosed in multiple data sources to which many people have access. |
| | | High: Demographics are highly distinguishing, highly replicable, and are available in public data sources. |
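
To make the distinguishability principle a bit more concrete, here is a minimal Python sketch that measures what share of records in a data set is unique on a chosen combination of fields. Everything in it is an assumption for illustration: the function names (`unique_fraction`, `generalize`), the field names, and the four toy records are made up and do not come from the OCR guidance. Note also that uniqueness within a sample is only a rough stand-in for uniqueness in the population, which is what the estimates in the table describe.

```python
from collections import Counter


def unique_fraction(records, fields):
    """Share of records whose combination of `fields` appears exactly once."""
    if not records:
        return 0.0
    keys = [tuple(r[f] for f in fields) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


def generalize(record):
    """Coarsen a record: full birth date -> year, 5-digit ZIP -> first 3 digits."""
    return {
        "birth_year": record["dob"][:4],
        "gender": record["gender"],
        "zip3": record["zip5"][:3],
    }


# Toy, invented records (not real data).
records = [
    {"dob": "1970-03-14", "gender": "F", "zip5": "37203"},
    {"dob": "1970-08-02", "gender": "F", "zip5": "37205"},
    {"dob": "1970-11-30", "gender": "F", "zip5": "37212"},
    {"dob": "1985-01-09", "gender": "M", "zip5": "37203"},
]

# Fine-grained fields: every toy record is unique.
fine = unique_fraction(records, ["dob", "gender", "zip5"])

# Generalized fields: three of the four records now share one combination.
coarse = unique_fraction(
    [generalize(r) for r in records], ["birth_year", "gender", "zip3"]
)

print(f"unique on DOB + gender + ZIP5:        {fine:.0%}")
print(f"unique on birth year + gender + ZIP3: {coarse:.0%}")
```

On this toy data, generalizing the birth date and ZIP code drops the unique share from all four records to one in four, which is the same direction of effect as the Low and High examples in the table. An actual expert determination would weigh this alongside replicability and data source availability rather than relying on a single number.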
One element of the expert determination worth noting is the notion that a determination should perhaps be time-limited, since that which is de-identified today may not be de-identified tomorrow (thanks in part to the rapid growth in the volume of data made available to the public on the internet). Here is the relevant FAQ:
How long is an expert determination valid for a given data set?