De-Identification under the HIPAA Privacy Rule
The Health Insurance Portability and Accountability Act of 1996 (HIPAA) and its Privacy Rule aim to protect the privacy of individually identifiable health information. The consequences of a breach of Protected Health Information (PHI) can be serious. While health plans and providers generally need PHI to run their businesses, some companies can operate on de-identified data and thereby avoid the risks associated with a potential breach.
The Privacy Rule provides two methods for de-identifying medical data:
- Through removal of 18 specific types of information related to the patient’s identity, including name, address (generally except the first three digits of the ZIP code), birth date (generally except year), insurance ID, medical record number, et cetera (“Safe Harbor” method); or
- Through a determination by a statistician or other expert that the risk is very small that the information can identify an individual (“Expert Determination” method).
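As a rough illustration of the Safe Harbor generalizations mentioned above, the following minimal sketch truncates a ZIP code to its first three digits and reduces a birth date to its year. The function names, the sample record, and the `RESTRICTED` set are hypothetical; in particular, `RESTRICTED` is a placeholder for the three-digit prefixes that cover too few residents to be retained, not the official list. It covers only two of the 18 identifier types.

```python
from datetime import date

def generalize_zip(zip_code: str, restricted_prefixes: frozenset[str]) -> str:
    """Keep the first three ZIP digits, or "000" if the prefix is restricted."""
    prefix = zip_code.strip()[:3]
    return "000" if prefix in restricted_prefixes else prefix

def generalize_birth_date(birth_date: date) -> int:
    """Keep only the year of birth."""
    return birth_date.year

# Placeholder set of sparsely populated ZIP prefixes (assumption, not the official list).
RESTRICTED = frozenset({"036"})

# Hypothetical record, for illustration only.
record = {
    "name": "Jane Doe",
    "zip": "03601",
    "birth_date": date(1975, 4, 12),
    "diagnosis": "J45",
}

# Direct identifiers such as the name are simply not carried over.
deidentified = {
    "zip3": generalize_zip(record["zip"], RESTRICTED),
    "birth_year": generalize_birth_date(record["birth_date"]),
    "diagnosis": record["diagnosis"],
}
print(deidentified)
```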
AACG statisticians have assisted companies with Expert Determinations under the HIPAA Privacy Rule.
Some Services Require Protected Health Information (PHI), Others Can Avoid It
Expert Determination
An Expert Determination requires an expert to assess identification risks and document the process. Specifically, the Privacy Rule (45 CFR §164.514) states that health information is not individually identifiable if:
| Element | Requirement |
| --- | --- |
| Expert | A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable: |
| Risk Assessment | Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and |
| Documentation | Documents the methods and results of the analysis that justify such determination. |
The Process
Step 1: Who Uses the Data and for What Purposes? When AACG statisticians carry out an Expert Determination, they familiarize themselves with the company’s products and services, review the contents of the information at issue, interview representative anticipated recipients, and learn about anticipated utilization or purposes of the information.
Step 2: Can the Data Identify Patients? Next, they attempt to identify mechanisms through which the information, possibly in combination with external data, may be used to trace patients or other individuals who are the subject of the information.
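One common way to screen for such mechanisms is to count how many records share each combination of quasi-identifiers (fields that could plausibly be matched against external data). The sketch below is only an illustration of that idea, not a description of AACG's procedure; the field names and records are hypothetical, and a record that is unique in the data is a warning sign rather than a complete risk estimate.

```python
from collections import Counter

def equivalence_class_sizes(records, quasi_identifiers):
    """For each record, count the records sharing its quasi-identifier values."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [counts[k] for k in keys]

def unique_records(records, quasi_identifiers):
    """Return the records whose quasi-identifier combination appears only once."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return [r for r, n in zip(records, sizes) if n == 1]

# Hypothetical data, for illustration only.
records = [
    {"zip3": "902", "birth_year": 1975, "sex": "F"},
    {"zip3": "902", "birth_year": 1975, "sex": "F"},
    {"zip3": "902", "birth_year": 1948, "sex": "M"},
    {"zip3": "100", "birth_year": 1990, "sex": "F"},
    {"zip3": "100", "birth_year": 1990, "sex": "F"},
]
quasi = ["zip3", "birth_year", "sex"]

sizes = equivalence_class_sizes(records, quasi)
at_risk = unique_records(records, quasi)
print(sizes, at_risk)
```

Records flagged this way are natural candidates for the mitigation techniques in Step 3.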
Step 3: If Needed, Eliminate or Mitigate Identification Risks. If any identification mechanism exists, we work with the company to eliminate or mitigate identification risks. This may be done through, for example:
- Masking: Some data elements are not strictly required to accomplish the company’s goals and may be removed, or their access may be limited to a subset of employees with special privileges;
- Obscuring: Where the exact values of certain data elements are not strictly required, sometimes they may be replaced with an average over some population of interest, or their values may be top-coded (e.g., record extraordinarily long lengths of stay as “60 days or longer”);
- Perturbing: Identification may be thwarted, in some cases without loss of information functionality, by adding random noise to data values or by rank-swapping extraordinary values (e.g., record the age of patient A as that of patient B, and vice versa);
- Hashing: When a unique identifier is needed, it may sometimes be replaced by its “hashed” counterpart, thereby shielding its value from the company while leaving the original source or a third party able to link it back to the original (e.g., record medical record number A12345678 as its hash counterpart “aeac8d3e9d927e7b722a2771548ca4a3102f7f1af566a25cc87527d442c2e9a1”).
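The four techniques above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a production method: all function names and sample values are hypothetical, the rank swap deterministically exchanges neighbors in rank order (practical rank swapping usually swaps at random within a window), and the hashing step uses a keyed HMAC-SHA-256 with a secret held by the original source or a third party, which is our assumption rather than the specific hash used in the example above.

```python
import hashlib
import hmac

def mask(record: dict, fields_to_drop: set) -> dict:
    """Masking: remove fields that are not needed downstream."""
    return {k: v for k, v in record.items() if k not in fields_to_drop}

def top_code(days: int, cap: int = 60) -> str:
    """Obscuring: report values at or above the cap as one open-ended category."""
    return f"{cap} days or longer" if days >= cap else f"{days} days"

def swap_adjacent_ranks(values: list) -> list:
    """Perturbing: exchange each value with its neighbor in rank order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    swapped = list(values)
    # Pair ranks (1st, 2nd), (3rd, 4th), ...; an unpaired last value is unchanged.
    for a, b in zip(order[0::2], order[1::2]):
        swapped[a], swapped[b] = values[b], values[a]
    return swapped

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Hashing: replace an identifier with a keyed SHA-256 digest (hex string)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical values, for illustration only.
record = {"mrn": "A12345678", "name": "Jane Doe", "age": 58, "length_of_stay": 75}
key = b"held-by-the-data-source-only"  # placeholder secret (assumption)

masked = mask(record, {"name"})
masked["length_of_stay"] = top_code(masked["length_of_stay"])
masked["mrn"] = pseudonymize(masked["mrn"], key)

ages = [34, 91, 58, 89]
swapped_ages = swap_adjacent_ranks(ages)
print(masked, swapped_ages)
```

Because the same identifier and key always produce the same digest, records for one patient can still be linked to each other, while only the holder of the key can recompute the mapping from a medical record number.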
Step 4: Document the Analysis. Finally, we document the process and findings in a report.
AACG statisticians have the expertise, experience, and skills to assist with HIPAA de-identification.