What are PII, PCI, and PHI?


As customer data is collected at more and more digital touchpoints, protecting personal information is becoming increasingly important for businesses worldwide. Global privacy regulations require the protection of three core types of personal information: Personally Identifiable Information (PII), Payment Card Industry (PCI) data, and Protected Health Information (PHI). A fourth type, ‘sensitive data,’ captures some or all of these three, depending on how the jurisdiction defines it. This post provides a high-level overview of the first three. All three terms have US origins, but they describe concepts that are relevant to data privacy in many jurisdictions, where they may go by other names.

PII, PCI, and PHI are acronyms that refer to different types of information which are protected under data privacy laws, regulations, or industry standards due to their sensitive nature. The following table sets out the meaning, origin, examples, and comparable terms in other jurisdictions.

 

| Term | Meaning | Origin | Examples | Terms in other jurisdictions |
| --- | --- | --- | --- | --- |
| PII | Personally Identifiable Information | U.S. (federal); not defined in any act; most commonly used definition is from OMB Memorandum M-07-16 | Name, date of birth, mailing address, telephone number, Social Security number (SSN), email address, ZIP code, account numbers, certificate/license numbers, vehicle identifiers including license plates, uniform resource locators (URLs), static internet protocol addresses, biometric identifiers (e.g., fingerprints), photographic facial images, or any other unique identifying number or characteristic, and any information where it is reasonably foreseeable that the information will be linked with other information to identify the individual | Personal information (e.g., CCPA, PIPEDA); Personal data (GDPR, proposed New York privacy act) |
| PCI | Payment Card Industry | PCI is sometimes used as a shorthand for the information protected under the PCI Data Security Standard (PCI DSS) | Cardholder data: primary account number (PAN), which identifies the issuer and the cardholder account; cardholder name; expiration date; service code. Sensitive Authentication Data (SAD): information used to authenticate cardholders and/or authorize payment card transactions, including card verification codes/values (e.g., CVV), full track data (from the magnetic stripe or equivalent on a chip), PINs, and PIN blocks | ‘Personalized security credentials’ and ‘sensitive payment data’ (EU’s PSD2) |
| PHI | Protected Health Information | U.S. HIPAA’s Privacy Rule | Individually identifiable information relating to a person’s health contained in medical records, such as medical diagnoses, treatment information, lab results, and billing information | Personal health information (PHIPA); Special categories of personal data (GDPR) |

Looking at the formal definitions of PII, PHI, and PCI, you’ll notice that PII is an umbrella term that captures both PCI and PHI, though each has its own intricacies.

The OMB Memorandum M-07-16 defines “personally identifiable information” as:

“information which can be used to distinguish or trace an individual’s identity, such as their name, social security number, biometric records, etc. alone, or when combined with other personal or identifying information which is linked or linkable to a specific individual, …” 

The definition of PHI is lengthy, but it can be summarized in five elements. PHI is information that (1) is created or received by a specific type of entity, (2) relates to an individual’s health, (3) identifies or is reasonably likely to identify the individual, (4) is transmitted or maintained in a certain way, and (5) is not excluded from the definition.
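The five elements can be read as a checklist in which every item must be satisfied. The Python below is a minimal sketch of that reading, not a legal test: the class and field names are hypothetical labels invented for this example, and each boolean stands in for a legal determination that code cannot make.

```python
from dataclasses import dataclass


@dataclass
class InfoItem:
    # Hypothetical flags, one per element of the five-part summary.
    created_or_received_by_specified_entity: bool  # element 1
    relates_to_health: bool                        # element 2
    identifies_individual: bool                    # element 3
    handled_in_covered_way: bool                   # element 4
    excluded_from_definition: bool                 # element 5


def is_phi(item: InfoItem) -> bool:
    """Return True only when elements 1-4 hold and no exclusion applies."""
    return (
        item.created_or_received_by_specified_entity
        and item.relates_to_health
        and item.identifies_individual
        and item.handled_in_covered_way
        and not item.excluded_from_definition
    )


lab_result = InfoItem(True, True, True, True, False)
anonymous_statistic = InfoItem(True, True, False, True, False)
print(is_phi(lab_result))           # True
print(is_phi(anonymous_statistic))  # False
```

The point of the sketch is that failing any single element, such as the information not identifying the individual, takes the information outside the definition.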

Lastly, PCI is shorthand for the information protected under the PCI Data Security Standard (PCI DSS), a standard drafted by an independent body founded by major credit card companies. The protected information is called ‘account data’ and is composed of cardholder data and sensitive authentication data (see the list in the table above). All of this information can be used to distinguish or trace an individual’s identity when combined with other personal or identifying information.
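The two-part composition of account data can be sketched as a simple lookup. The Python below is illustrative only: the field names are hypothetical labels chosen for this example, and the two sets mirror the examples listed in this post rather than the full text of PCI DSS.

```python
# Hypothetical field names; the grouping follows the examples given in this post.
CARDHOLDER_DATA = {"pan", "cardholder_name", "expiration_date", "service_code"}
SENSITIVE_AUTHENTICATION_DATA = {"cvv", "full_track_data", "pin", "pin_block"}


def classify_account_field(field: str) -> str:
    """Label a field as cardholder data, sensitive authentication data, or neither."""
    name = field.strip().lower()
    if name in CARDHOLDER_DATA:
        return "cardholder data"
    if name in SENSITIVE_AUTHENTICATION_DATA:
        return "sensitive authentication data"
    return "not account data"


for field in ["PAN", "cvv", "email"]:
    print(field, "->", classify_account_field(field))
```

Keeping the two groups separate matters in practice because a standard can attach different handling rules to each group.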

The US distinguishes between PII, PCI, and PHI because PHI and PCI are subcategories so sensitive that the need to regulate them arose even in the absence of a comprehensive data protection law spanning all states. In the case of PHI, that need overcame the political difficulties of enacting a federal law, which resulted in HIPAA. In the case of PCI, the private sector took the initiative: major credit card companies formed an independent body to set standards protecting PCI, and these standards are imposed contractually on organizations that handle it.

The situation in Europe is the inverse with regard to health information. With the GDPR, the EU succeeded in enacting a general data protection law, although the process took years. The GDPR applies to health and financial data as well, but local idiosyncrasies required some flexibility for the member states’ representatives to reach agreement. The GDPR therefore explicitly permits member states to impose stricter safeguards for certain data, including health data, than the GDPR itself provides, so that in these areas it sets only the minimum standard of protection.

Similarly, Europe has so far failed to establish a harmonized card payment regime. Likely because Europeans are generally more conservative about card payments, payment service providers find it harder to scale, so the focus is often on proprietary but low-cost national card schemes that set out their own compliance requirements in their contracts with merchants. An effort is underway, however, to implement a harmonized European card payment system.

How Private AI Can Help With Compliance

Having visibility into what data exists within your organization and where it lives will allow you to determine what measures you must put in place to comply with the applicable legislation or industry standard regarding PII, PCI, and PHI. 
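As a rough illustration of what such an inventory might look like, the Python sketch below tags the fields of a record with the categories discussed in this post. The field names and the mapping are assumptions made up for this example, the approach only works on structured records with known field names, and it does not represent how any particular product works.

```python
from collections import defaultdict

# Hypothetical mapping of field names to categories, based on examples in this post.
# PCI and PHI fields are also tagged PII, following the umbrella-term framing above.
FIELD_CATEGORIES = {
    "name": {"PII"},
    "email": {"PII"},
    "ssn": {"PII"},
    "pan": {"PII", "PCI"},
    "cvv": {"PII", "PCI"},
    "diagnosis": {"PII", "PHI"},
    "lab_result": {"PII", "PHI"},
}


def inventory(record: dict) -> dict:
    """Group a record's field names by category; unknown fields are 'unclassified'."""
    found = defaultdict(list)
    for field in record:
        for category in sorted(FIELD_CATEGORIES.get(field, {"unclassified"})):
            found[category].append(field)
    return dict(found)


record = {"name": "A. Person", "pan": "<card number>", "diagnosis": "<text>", "order_id": 17}
for category, fields in inventory(record).items():
    print(category, fields)
```

A result that lists fields under PCI or PHI signals that the corresponding standard or law may apply in addition to general privacy rules, while the unclassified fields flag what still needs a human review.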

Private AI can help you make that determination by identifying 50+ entity types of PII, PHI, and PCI in unstructured data across 47 languages. Using the latest advancements in machine learning, it minimizes the time needed to identify and categorize your data and so facilitates compliance. To see the tech in action, try our web demo, or request an API key to try it on your own data.

