What are the most common examples of PII?

The most common PII examples include full names, Social Security Numbers, home addresses, email addresses, phone numbers, dates of birth, passport numbers, driver's license numbers, IP addresses, device identifiers, and financial account numbers. Biometric data such as fingerprints and facial recognition templates also qualify and are classified as sensitive PII requiring heightened protection under frameworks including GDPR and NIST SP 800-122.

Is a phone number considered PII?

Yes, a phone number is PII. Under GDPR it is personal data because it can identify a specific individual. NIST SP 800-122 treats phone numbers as PII because they either directly identify a person or can be combined with other data to achieve identification. The main distinction is between personal phone numbers, which are unambiguously PII, and shared corporate lines answered by many people, which may not point to anyone in particular. Any number that reaches a specific individual should be treated as PII.

What is the difference between PII and sensitive PII?

PII is any information that can identify an individual. Sensitive PII is a subset that requires stronger protection because its exposure causes greater harm. Sensitive PII includes Social Security Numbers, biometric data, health information, financial account numbers with security codes, and special categories under GDPR such as racial origin, sexual orientation, and religious beliefs. The distinction affects encryption requirements, access control standards, and breach notification thresholds.

Does GDPR apply to all types of PII?

GDPR applies to personal data, which is a broader concept than the traditional US definition of PII. Under GDPR, any information relating to an identified or identifiable natural person is personal data. This includes indirect identifiers like IP addresses, cookie identifiers, and location data that some US frameworks treat more ambiguously. If your organization offers goods or services to people in the EU or monitors their behavior, GDPR applies regardless of where your organization is based, and online identifiers qualify as personal data subject to all GDPR protections.

How should organizations protect PII in cloud environments?

Effective PII protection in cloud environments requires several layered controls: automated PII discovery to know where personal data exists across storage, databases, and logs; CSPM tools to detect misconfigurations like public S3 buckets or unencrypted databases; encryption with AES-256 at rest and TLS 1.3 in transit; least-privilege access controls; static analysis in CI/CD pipelines to catch PII embedded in code before deployment; and continuous vulnerability management on systems storing personal data. Development environments should use synthetic data rather than production PII, and all third-party data processors require Data Processing Agreements.

PII Examples: What Counts as Personally Identifiable Information in 2026

The Classification Problem Nobody Talks About

IBM's 2024 Cost of a Data Breach report put the average breach cost at $4.88M. A significant portion of that number traces back to one specific failure: organizations didn't know what data they had, where it lived, or whether it qualified as PII until it was already exposed. That's not a security tools problem. That's a classification problem.

PII — personally identifiable information — sounds like a straightforward concept. It isn't. Regulatory frameworks disagree on exact definitions. NIST SP 800-122 defines PII differently than GDPR's concept of personal data. CCPA adds yet another layer. And then your engineering team ships a feature that logs IP addresses in plaintext, and suddenly you're having a conversation with your DPO about breach notification obligations.

This guide cuts through the ambiguity with concrete examples, real-world edge cases, and the compliance implications that actually matter in 2026.

What Is Considered PII? The Core Framework

NIST's definition from SP 800-122 remains the industry anchor: PII is any information that can be used to distinguish or trace an individual's identity, either alone or when combined with other information that is linked or linkable to a specific individual. That second clause — combined with other information — is where most organizations get tripped up.

A name alone? Debatable. A name plus an employer plus a ZIP code? Almost certainly PII. This is the linkability problem, and it's why static lists of PII categories can be dangerously incomplete.

GDPR takes an even broader view: any data relating to an identified or identifiable natural person is personal data. The word identifiable does enormous legal work here. An IP address is personal data under GDPR when the party holding it can reasonably link it to a person, for example through the ISP's records. That position traces back to the Court of Justice of the EU's ruling on dynamic IP addresses in Breyer v. Germany (2016).

Direct vs. Indirect Identifiers

A useful mental model: split PII into direct and indirect identifiers. Direct identifiers unambiguously point to one person. Indirect identifiers require combination with other data to achieve identification but still qualify as PII under most frameworks.

Direct identifiers include full name, Social Security Number, passport number, driver's license number, biometric data such as fingerprints and facial geometry, and government-issued ID numbers.

Indirect identifiers include ZIP code, date of birth, gender, employer, job title, IP address, device identifiers, cookies, behavioral data, and location history.

Latanya Sweeney's classic Carnegie Mellon study estimated that 87% of the US population could be uniquely identified using only 5-digit ZIP code, full date of birth, and gender. Three indirect identifiers. That's the linkability risk made concrete.
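
To make the linkability idea measurable, here is a minimal sketch that computes what share of records become unique once indirect identifiers are combined. The function name `unique_fraction` and the four toy records are assumptions made up for illustration; they are not data from the study.

```python
from collections import Counter

def unique_fraction(records, quasi_identifiers):
    """Share of records whose combination of quasi-identifiers is unique."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records) if records else 0.0

# Hypothetical toy data, not real people.
records = [
    {"zip": "15213", "dob": "1980-02-11", "gender": "F"},
    {"zip": "15213", "dob": "1980-02-11", "gender": "M"},
    {"zip": "15213", "dob": "1975-07-30", "gender": "F"},
    {"zip": "15217", "dob": "1975-07-30", "gender": "F"},
]

print(unique_fraction(records, ["gender"]))                # 0.25
print(unique_fraction(records, ["zip", "dob", "gender"]))  # 1.0
```

Each field on its own leaves most of these records indistinguishable; combined, every record is unique. Running the same measurement over a real table is a quick way to see which column combinations need to be treated as PII.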

PII Examples by Category

Identity and Government Documents

This category is unambiguous. Social Security Numbers, Tax Identification Numbers, passport numbers, national identity card numbers, and driver's license numbers are all clearly PII under every major framework. These are also the highest-value targets in credential theft campaigns. Exposure of any of these triggers notification requirements under nearly every US state breach notification law and GDPR Articles 33 and 34.

Contact Information

Full name, home address, personal email address, and personal phone number all qualify as PII. A question that comes up constantly in security reviews: is a phone number PII? Yes, unambiguously. Under GDPR it is personal data. Under NIST SP 800-122, a phone number either directly identifies someone or can easily be linked to other identifying information.

Work email addresses occupy a gray zone. An address like john.smith@company.com identifies a specific individual, making it personal data under GDPR. US frameworks are less consistent, but any email that reaches a specific human should be treated as PII from a risk management perspective.

Financial Information

Credit card numbers, bank account numbers, routing numbers, credit scores, financial statements, and tax returns all constitute PII. PCI DSS governs cardholder data specifically, but from a broader PII compliance standpoint, financial data is among the most sensitive categories. Exposure of financial PII typically triggers not just GDPR obligations but also sector-specific regulations like GLBA in the United States.

Health and Medical Information

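HIPAA in the US protects individually identifiable health information as Protected Health Information, a healthcare-specific subset of PII, and its Safe Harbor rule lists 18 identifiers that must be removed before data counts as de-identified. Medical record numbers and health insurance beneficiary numbers are on that list; diagnosis codes and prescription information become PHI when they are linked to an identifiable person. Under GDPR, health data is explicitly listed as a special category under Article 9, so processing it requires explicit consent or another Article 9(2) condition, plus additional safeguards.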

Biometric Data

Fingerprints, facial recognition templates, iris scans, voiceprints, gait analysis data, and DNA sequences constitute PII and are classified as sensitive PII. Biometric data is particularly high-risk because it is immutable: you can change a password, but you cannot change your fingerprints. Illinois's BIPA, with its private right of action, has produced especially aggressive enforcement in this area. Several US states introduced their own biometric privacy laws in 2025 and 2026, following Illinois's lead.

Online Identifiers and Technical Data

IP addresses, device IDs, cookie identifiers, mobile advertising IDs such as IDFA and GAID, browser fingerprints, and precise geolocation data all qualify as PII under GDPR and as personal information under CCPA. This is where developers most frequently create unintentional PII exposure. Log files contain IP addresses. Analytics platforms capture device IDs. Session replay tools record behavioral data that can re-identify users.

This is exactly why secret detection tooling needs to go beyond API keys and credentials. Log files and configuration data containing IP addresses, user agent strings, and session tokens represent a real PII exposure vector in modern cloud environments. Static and dynamic analysis that surfaces these patterns before they reach production is essential.
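
As a rough illustration of what surfacing these patterns can look like, the sketch below scans a log line for IPv4 addresses and email addresses with regular expressions. The pattern set and the `scan_line` helper are assumptions for this example. Real scanners use far richer rules, and the simple IPv4 pattern here will also match four-part version strings.

```python
import re

# Deliberately simple patterns; expect false positives and misses.
PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}

def scan_line(line):
    """Return (kind, matched_text) pairs for every pattern hit in one line."""
    hits = []
    for kind, pattern in PATTERNS.items():
        hits.extend((kind, match) for match in pattern.findall(line))
    return hits

line = '203.0.113.7 - - "GET /reset?email=jane@example.com HTTP/1.1" 200'
for kind, value in scan_line(line):
    print(kind, value)
```

Run against a sample of production logs or a repository in CI, a check like this gives an early signal that identifiers are leaking into places nobody classified as PII stores.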

Sensitive PII: The Higher-Risk Tier

Not all PII carries equal risk. Sensitive PII — sometimes called sensitive personally identifiable information — is a subset that requires heightened protection because its exposure causes greater harm to individuals. NIST SP 800-122 explicitly distinguishes between PII that is sensitive and PII that requires less stringent protection based on context and potential impact.

Sensitive PII generally includes the following categories:

Social Security Numbers and equivalent government ID numbers
Financial account numbers combined with security codes or PINs
Biometric identifiers processed for unique identification
Health and medical information
Sexual orientation and gender identity
Religious and political beliefs
Racial or ethnic origin
Criminal history and conviction records
Precise geolocation data
Passwords and authentication credentials

GDPR Article 9 maps closely to this concept with its special categories of personal data, which receive explicit additional protections. Processing these categories is prohibited unless explicit consent or another of the narrow Article 9(2) conditions applies. Violation of special category protections carries the higher tier of GDPR fines: up to €20M or 4% of global annual turnover, whichever is greater.
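
The "whichever is greater" rule is easy to misread, so here is the arithmetic as a small sketch. The function name is made up for illustration, and the result is only the statutory ceiling, not a prediction of any actual fine.

```python
def max_higher_tier_fine_eur(global_annual_turnover_eur):
    """Ceiling of the higher GDPR fine tier: the greater of EUR 20M or 4% of turnover."""
    four_percent = global_annual_turnover_eur * 4 // 100  # integer math avoids float rounding
    return max(20_000_000, four_percent)

print(max_higher_tier_fine_eur(100_000_000))    # 20000000 (the fixed amount is greater)
print(max_higher_tier_fine_eur(2_000_000_000))  # 80000000 (4% of turnover is greater)
```

The two limits cross at €500M in turnover: below that the €20M figure is the ceiling, above it the 4% figure takes over.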

In practice, sensitive PII should be encrypted at rest and in transit, subject to strict access controls, masked in logs and monitoring systems, and handled only by a minimal set of authorized personnel. Achieving this consistently across cloud infrastructure at scale is non-trivial, which is why compliance automation matters. Manual audits simply do not scale to the pace of modern cloud deployment.
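
Masking in logs is the control on this list that is easiest to show in code. The sketch below uses Python's standard `logging.Filter` hook to rewrite messages before they are emitted. The two regular expressions and the placeholder strings are assumptions for illustration and cover only US-style SSNs and email addresses.

```python
import logging
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")

def mask(text):
    """Replace SSN-shaped and email-shaped substrings with placeholders."""
    text = SSN_RE.sub("[SSN]", text)
    return EMAIL_RE.sub("[EMAIL]", text)

class PiiMaskingFilter(logging.Filter):
    def filter(self, record):
        # Render the message with its arguments first, then mask the result.
        record.msg = mask(record.getMessage())
        record.args = None
        return True  # keep the record, now masked

logger = logging.getLogger("app")
logger.addFilter(PiiMaskingFilter())
logger.warning("password reset requested for %s", "jane@example.com")
```

A filter like this is a safety net, not a substitute for keeping PII out of log statements in the first place; anything the patterns do not recognize passes through unmasked.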

PII in Cloud Environments: Where Classification Falls Apart

The PII classification problem becomes dramatically harder at cloud scale. Data copies proliferate. Development environments get seeded with production data. S3 buckets get misconfigured. APIs return more fields than intended. A single misconfigured database in a development account can expose thousands of PII records that were never supposed to leave production.

The attack surface expands further when you factor in third-party integrations. Your SaaS vendors, analytics tools, and marketing platforms all process PII on your behalf. Under GDPR, you are the controller and they are processors. You are responsible for what they do with the data, which means vendor security assessments and Data Processing Agreements are not optional bureaucracy — they are legal obligations.

Cloud Security Posture Management becomes critical in this context. CSPM tools can continuously scan for misconfigurations that expose PII — public S3 buckets, unencrypted RDS instances, overly permissive IAM policies on storage containing personal data. The goal is reducing the blast radius before a misconfiguration becomes a breach notification event.
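
To give a sense of what one such check involves, the sketch below inspects an S3-style bucket policy document, already loaded as a Python dict, for an Allow statement with a wildcard principal. The function name and sample policy are assumptions for illustration. A real CSPM tool also evaluates conditions, ACLs, and account-level public access settings, all of which this simplified check ignores or treats conservatively.

```python
def allows_public_access(policy):
    """True if any unconditional Allow statement grants access to everyone ("*")."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    for stmt in statements:
        # Simplification: statements with a Condition are skipped, not evaluated.
        if stmt.get("Effect") != "Allow" or "Condition" in stmt:
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict):
            aws = principal.get("AWS")
            values = aws if isinstance(aws, list) else [aws]
            if "*" in values:
                return True
    return False

# Hypothetical policy for a bucket named example-bucket.
public_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}
print(allows_public_access(public_policy))  # True
```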

Container and Kubernetes environments add additional complexity. Secrets and PII can be embedded in environment variables, ConfigMaps, or container images. Static analysis integrated into CI/CD pipelines catches these patterns before they ship, shifting PII detection left to where it is cheapest to fix.

PII Compliance: What the Frameworks Actually Require

GDPR

The most comprehensive framework currently in force. Seven core principles govern all processing: lawfulness, fairness, and transparency; purpose limitation; data minimization; accuracy; storage limitation; integrity and confidentiality; and accountability. Article 25 mandates privacy by design and by default: PII protection is supposed to be architected in from the start, not bolted on later. Data Subject Access Requests must be answered within one month, extendable by two further months for complex or numerous requests. Breach notification to the supervisory authority is required within 72 hours of becoming aware of the breach, unless the breach is unlikely to result in a risk to individuals.

NIST Privacy Framework

First published in 2020 and modeled on the NIST Cybersecurity Framework, the Privacy Framework provides a risk-based approach to PII management. It follows the CSF structure but uses privacy-specific functions: Identify-P, Govern-P, Control-P, Communicate-P, and Protect-P. Particularly useful for US federal agencies and contractors, it is increasingly adopted by commercial organizations as a baseline for privacy program design.

CCPA and CPRA

California's framework introduced the right to opt out of the sale of personal information and the right to know what data is collected. CPRA extended this to add the right to correct inaccurate personal information and the right to limit use of sensitive personal information. California's definition of personal information is intentionally broad — household data, inferences drawn from personal information, and unique identifiers all qualify. Several other US states now have similar laws in effect, creating a patchwork that organizations operating nationally must navigate.

HIPAA

Healthcare-specific but essential if any of your data touches health information. The Safe Harbor method specifies 18 identifiers that must be removed to achieve de-identification. The Expert Determination method allows a person with appropriate statistical expertise to determine that the risk of re-identification is very small. Neither method is as simple as it sounds: re-identification attacks on datasets believed to be anonymized have been demonstrated repeatedly in academic literature, which is why de-identification should be treated as risk reduction rather than elimination.
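
As a partial illustration of Safe Harbor-style generalization, the sketch below handles just three of the rules: ZIP codes are truncated to their first three digits (or set to 000 for sparsely populated prefixes), dates are reduced to the year, and ages over 89 are collapsed into a single 90+ bucket with the birth year dropped. The record layout and function name are assumptions for this example, and the `low_population_zip3` set is a placeholder: the real list of restricted prefixes has to come from Census population data. This does not amount to full de-identification.

```python
def generalize(record, low_population_zip3=frozenset()):
    """Apply ZIP, date, and age generalization to one record (partial Safe Harbor sketch)."""
    zip3 = record["zip"][:3]
    age = record["age"]
    over_89 = age > 89
    return {
        "zip3": "000" if zip3 in low_population_zip3 else zip3,
        # A birth year that reveals an age over 89 must not be kept.
        "birth_year": None if over_89 else record["dob"][:4],
        "age": "90+" if over_89 else age,
    }

print(generalize({"zip": "15213", "dob": "1980-02-11", "age": 45}))
```

Even after this step the remaining fields can still combine into a unique fingerprint, which is the point the paragraph above makes about treating de-identification as risk reduction.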

Technical Controls That Actually Work

Data Discovery and Classification

You cannot protect data you do not know about. Automated PII discovery tools scan structured and unstructured data stores to identify PII presence across databases, file stores, data lakes, and code repositories. The most effective programs combine automated tooling with manual review — tools find what they are configured to look for, while manual review catches edge cases like PII embedded in free-text fields or uploaded documents.
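
Free-text fields are a good example of where a little validation logic reduces noise. The sketch below looks for card-number-shaped digit runs in text and keeps only those that pass the Luhn checksum, which filters out most order numbers and other random digit strings. The function names and sample text are assumptions for illustration; 4111 1111 1111 1111 is a widely used test card number, not a real account.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text):
    """Return digit strings in text that look like card numbers and pass Luhn."""
    found = []
    for match in CANDIDATE_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found

note = "card 4111 1111 1111 1111 on file, order 1234567890123456"
print(find_card_numbers(note))  # ['4111111111111111']
```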

Integrating PII discovery with your cloud inventory processes significantly reduces coverage gaps. When you have a complete picture of your cloud assets, layering PII classification on top of that inventory becomes tractable. Assets without a known PII classification become an audit priority rather than an unknown unknown.

Encryption and Tokenization

Sensitive PII at rest should use AES-256 minimum. In transit, TLS 1.3. Tokenization replaces PII values with non-sensitive tokens — particularly valuable for payment processing and any system that needs to reference PII without exposing it. Format-preserving encryption maintains data utility for analytics while protecting the underlying values. Key management is where encryption programs typically fail; use a dedicated key management service rather than application-level key storage.
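
The tokenization idea can be sketched in a few lines. The `TokenVault` class below is a hypothetical in-memory stand-in: it hands out random tokens and keeps the mapping private, so downstream systems can store and join on the token without ever seeing the original value. A production vault would be a separate, hardened service with its own storage, access control, and audit logging.

```python
import secrets

class TokenVault:
    """In-memory illustration of vault-style tokenization (not production-ready)."""

    def __init__(self):
        self._value_by_token = {}
        self._token_by_value = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to one token.
        if value in self._token_by_value:
            return self._token_by_value[value]
        token = "tok_" + secrets.token_hex(16)  # random, carries no information about value
        self._value_by_token[token] = value
        self._token_by_value[value] = token
        return token

    def detokenize(self, token):
        return self._value_by_token[token]

vault = TokenVault()
token = vault.tokenize("4111111111111111")
print(token.startswith("tok_"), vault.detokenize(token) == "4111111111111111")
```

Because the token is random rather than derived from the value, stealing a table of tokens reveals nothing unless the vault itself is also compromised.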

Access Controls and Data Minimization

The principle of least privilege applied to PII: roles should access only the PII required for their specific function, for only as long as necessary. Attribute-based access control is more flexible than role-based access for granular PII policies. Data minimization at the application layer — not collecting what you do not need, not logging what should not be logged — reduces the PII surface before access controls even come into play. It is the most underutilized control in most privacy programs.
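
One way to picture least privilege at the field level is an allowlist per role, applied before a record leaves the data layer. The roles, field names, and `view_for` helper below are all hypothetical; the point of the sketch is that unknown roles and unlisted fields default to no access.

```python
# Hypothetical role-to-field allowlist; anything not listed is withheld.
FIELD_POLICY = {
    "support_agent": {"name", "email"},
    "billing": {"name", "email", "card_last4"},
    "analyst": set(),
}

def view_for(role, record):
    """Return only the fields the given role is allowed to see."""
    allowed = FIELD_POLICY.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}

user = {"name": "Alex Rivera", "email": "alex@example.com",
        "card_last4": "1111", "ssn": "000-12-3456"}
print(view_for("support_agent", user))
```

Note that no role in this policy can read the `ssn` field at all, which is the data-minimization half of the argument: a field nobody needs should not be reachable through the normal access path.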

Monitoring and Anomaly Detection

Unexpected bulk exports of PII, access from unusual geographic locations at unusual times, privilege escalation followed immediately by data access — these behavioral patterns are the early signals of an insider threat or compromised credential. SIEM rules tuned for PII access anomalies, combined with DLP controls on egress channels, form the detection layer. Vulnerability management ensures that the systems storing PII do not have unpatched vulnerabilities that create easy extraction paths for attackers who gain initial access through other means.
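
A bulk-export rule of the kind described can be approximated with a sliding window over access events. In the sketch below, the event shape `(timestamp, user, records_read)`, the threshold of 1,000 records, and the 10-minute window are all assumptions chosen for illustration; a real SIEM rule would be tuned against baseline behavior per role.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def bulk_export_alerts(events, max_records=1000, window=timedelta(minutes=10)):
    """Flag users who read more than max_records PII records within the window.

    events: iterable of (timestamp, user, records_read), sorted by timestamp.
    """
    recent = defaultdict(list)
    alerts = []
    for ts, user, count in events:
        recent[user].append((ts, count))
        # Keep only this user's events that fall inside the window ending now.
        recent[user] = [(t, c) for t, c in recent[user] if ts - t <= window]
        total = sum(c for _, c in recent[user])
        if total > max_records:
            alerts.append((ts, user, total))
    return alerts

t0 = datetime(2026, 1, 1, 12, 0)
events = [
    (t0, "alice", 400),
    (t0 + timedelta(minutes=2), "alice", 400),
    (t0 + timedelta(minutes=4), "alice", 400),
    (t0 + timedelta(minutes=30), "bob", 50),
]
print(bulk_export_alerts(events))
```

Here no single read by alice looks unusual, but the three together cross the threshold within four minutes, which is the pattern a per-request limit would miss.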

Common PII Mistakes Engineers Make

Logging PII in application logs is probably the most common mistake. A user_id logged alongside an email address, a phone number passed in a URL query parameter that ends up in access logs, a debug statement that dumps an entire user object including SSN: these accumulate silently and get discovered during incident response or a regulatory audit, when the damage is already done.
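
For the query-parameter case specifically, a small helper that scrubs URLs before they are logged goes a long way. The sketch below uses only `urllib.parse` from the standard library. The parameter names in `SENSITIVE_PARAMS` are assumptions and would need to match your own API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SENSITIVE_PARAMS = {"email", "phone", "ssn"}  # hypothetical list; adapt to your API

def redact_url(url):
    """Replace the values of sensitive query parameters before the URL is logged."""
    parts = urlsplit(url)
    query = [
        (key, "REDACTED" if key.lower() in SENSITIVE_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))

print(redact_url("https://app.example.com/search?phone=%2B15551234567&page=2"))
# https://app.example.com/search?phone=REDACTED&page=2
```

The more durable fix is to stop passing PII in query strings at all, since proxies, CDNs, and browsers keep their own copies of the URL that your code never gets to scrub.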

Seeding development environments with production data is the second biggest issue. Development teams need realistic data — understandable. But actual PII in development environments dramatically increases the attack surface and creates compliance obligations around a non-production environment that typically has weaker security controls. Synthetic data generation has matured significantly by 2026. There is no longer a credible technical argument for production PII in development environments.
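
Even without a dedicated tool, a basic synthetic seeder is short. The sketch below generates obviously fake users with the standard library: emails use the reserved example.com domain, phone numbers use the 555-01XX range set aside for fictional use, and SSNs use the 000 area number, which is never issued. The name lists and field layout are assumptions for illustration; purpose-built generators produce far more realistic distributions.

```python
import random

FIRST_NAMES = ["Alex", "Sam", "Jordan", "Riley"]
LAST_NAMES = ["Rivera", "Chen", "Okafor", "Novak"]

def synthetic_user(rng):
    """Build one fake user record from values that cannot belong to a real person."""
    first, last = rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES)
    return {
        "name": f"{first} {last}",
        "email": f"{first}.{last}{rng.randint(1, 999)}@example.com".lower(),
        "phone": f"+1-202-555-01{rng.randint(0, 99):02d}",
        "ssn": f"000-{rng.randint(1, 99):02d}-{rng.randint(1, 9999):04d}",
    }

rng = random.Random(42)  # fixed seed so test fixtures are reproducible
users = [synthetic_user(rng) for _ in range(3)]
for user in users:
    print(user)
```

Seeding the generator keeps fixtures stable across runs, so tests that depend on specific records do not become flaky.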

Third-party tracking scripts that capture more than intended are the third major pattern. You embed an analytics snippet; it captures PII fields from form inputs before submission. You did not write that code, but in some scenarios GDPR treats you and the vendor as joint controllers, and you bear legal responsibility for what the script collects. Regular audits of third-party scripts using browser-level monitoring should be part of any serious privacy program.

At SECRAILS, these patterns appear repeatedly across cloud security assessments. PII exposure is rarely the result of sophisticated attacks. It is almost always a configuration or process failure that could have been caught earlier in the development lifecycle with the right controls in place.

Building a PII Inventory: Practical Starting Points

Start with data flow mapping. For every system that touches PII, document what data is collected, where it is stored, who can access it, how long it is retained, and where it flows — to third parties, analytics platforms, backups, and logs. This is not a one-time exercise. Data flows change every time a feature ships.
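
The questions in that paragraph map naturally onto a small record type. The `PiiDataFlow` dataclass and `undocumented` helper below are a hypothetical starting point, not a standard schema: one entry per system, with a check that lists which questions are still unanswered. For simplicity an empty `flows_to` is treated as "not yet documented", so a system that truly shares data with nobody would need an explicit marker.

```python
from dataclasses import dataclass, field, fields
from typing import Optional

@dataclass
class PiiDataFlow:
    system: str
    data_collected: list = field(default_factory=list)  # what is collected
    storage_location: str = ""                          # where it is stored
    access_roles: list = field(default_factory=list)    # who can access it
    retention_days: Optional[int] = None                # how long it is retained
    flows_to: list = field(default_factory=list)        # third parties, backups, logs

def undocumented(flow):
    """Names of inventory fields that are still empty for this system."""
    return [f.name for f in fields(flow) if getattr(flow, f.name) in ("", None, [])]

signup = PiiDataFlow(
    system="signup-api",
    data_collected=["email", "ip_address"],
    storage_location="postgres:users",
    access_roles=["support"],
)
print(undocumented(signup))  # ['retention_days', 'flows_to']
```

Keeping these records in version control next to the service code makes it more likely they get updated when a feature changes the data flow.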

Combine automated discovery with manual review and assign data owners to each PII-containing system. Data owners are responsible for keeping the inventory current, making retention decisions, and responding to data subject requests. Without clear ownership, privacy programs stall.

Document it formally. A Record of Processing Activities is a GDPR requirement under Article 30 for organizations above certain thresholds. Beyond compliance, it is operationally valuable — it is the first document a DPO reaches for during incident response and the first thing a regulator requests during an investigation. Organizations that have a current, accurate ROPA consistently fare better in regulatory interactions than those who do not.

PII compliance is not a destination. It is a continuous process. Regulatory frameworks evolve, new data categories get added to sensitive lists, and your product collects new types of data over time. The organizations that handle this well treat PII governance as a living operational program, not an annual checkbox exercise. The $4.88M breach cost is the price of treating it otherwise.