IBM's 2024 Cost of a Data Breach report put the global average breach cost at $4.88 million, and personally identifiable information was the record type most often compromised. PII is both the most targeted asset in modern cyberattacks and, frankly, the least consistently defined term in most organizations' data governance programs.
Security engineers debate it constantly. Legal teams interpret it differently. And the frameworks themselves (GDPR, CCPA, HIPAA, NIST) don't always agree on where the lines are drawn. So let's cut through the ambiguity and get specific about what PII actually is, what representative examples look like across different regulatory frameworks, and what your team needs to do to protect it.
What Is PII? The Working Definition
PII (personally identifiable information) refers to any data that can be used, alone or in combination with other data, to identify a specific individual. That last part matters: alone or in combination. A ZIP code by itself might seem harmless, and so might a birth date. But quasi-identifiers combine with startling power: Latanya Sweeney's landmark research estimated that 87% of Americans could be uniquely identified by just three fields, namely five-digit ZIP code, gender, and date of birth.
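To make the quasi-identifier point concrete, here is a minimal sketch that measures how many rows in a dataset are unique on (ZIP, gender, date of birth). The records are made-up toy data and the `unique_fraction` helper is a hypothetical name, not part of any library.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, gender, date_of_birth). No names are present.
records = [
    ("02139", "F", "1985-03-14"),
    ("02139", "F", "1985-03-14"),
    ("02139", "M", "1990-07-02"),
    ("94110", "F", "1978-11-30"),
    ("94110", "M", "1978-11-30"),
]

def unique_fraction(rows):
    """Return the fraction of rows whose quasi-identifier tuple occurs exactly once."""
    if not rows:
        return 0.0
    counts = Counter(rows)
    unique_rows = sum(1 for row in rows if counts[row] == 1)
    return unique_rows / len(rows)

fraction = unique_fraction(records)
print(f"{fraction:.0%} of records are unique on (ZIP, gender, DOB)")
```

Any row that is unique on these fields can be re-identified by anyone holding a second dataset that pairs the same fields with a name, such as a voter roll.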
NIST SP 800-122 defines PII as information that can distinguish or trace an individual's identity, either alone or combined with other linked information. GDPR uses the broader term personal data — any information relating to an identified or identifiable natural person. These aren't identical definitions. A static IP address is definitively PII under GDPR; the answer is murkier under older US federal standards.
Direct PII Examples: The Obvious Ones
Direct identifiers are data points that, by themselves, uniquely identify a person. No aggregation required. These are the high-priority assets in any data classification policy:
- Full name — First, middle, last, maiden names, and legal aliases
- Social Security Number (SSN) — The canonical US example of sensitive PII
- Passport number — Government-issued; uniquely tied to one person globally
- Driver's license number — State-issued but direct identifier
- National ID numbers — CNP in Romania, NIF in Spain, AADHAAR in India
- Biometric data — Fingerprints, retinal scans, facial geometry, voiceprints, DNA sequences
- Financial account numbers — Bank account, credit and debit card numbers, IBAN
- Medical record numbers — Under HIPAA, these are Protected Health Information (PHI), a subset of PII
- Employee ID numbers — When tied to an individual in an HR system
Notice that biometrics are on this list. They're not just PII; they're sensitive PII. Under GDPR Article 9, biometric data used to uniquely identify a person is a special category: processing is prohibited unless a specific condition, such as explicit consent, applies, and heightened protection is expected. Lose a password, and you reset it. Lose a fingerprint, and it's permanently compromised.
Indirect PII Examples: The Tricky Category
Indirect identifiers are the ones that trip up even experienced compliance teams. These data points don't identify someone outright but become identifying when combined with other available information.
Is a Phone Number PII?
Yes — almost universally. A phone number directly links to a registered subscriber. Even a VoIP number ties back to an account holder in most jurisdictions. Under GDPR, CCPA, and NIST SP 800-122, phone numbers are unambiguously PII. The question isn't whether phone numbers are PII; it's what classification tier they fall into in your data handling policy. Mobile numbers — which often follow a person for decades — warrant stronger protection than work landline extensions.
IP Addresses
The CJEU ruled in Breyer v. Bundesrepublik Deutschland (Case C-582/14) that a dynamic IP address is personal data in the hands of a website operator when that operator has legal means to have the ISP link the address to a subscriber. Static IPs are even clearer-cut. In practice, treat all IP addresses as PII in any GDPR-adjacent system. Under US law, it's more context-dependent: an IP collected in isolation without linkage mechanisms might not trigger PII obligations under some state laws, but that argument gets thinner every year.
Email Addresses
Personal email addresses are direct PII. Corporate email addresses are PII for individuals but may be treated differently for business contact data under some frameworks. Either way, if you're processing them, they're in scope.
Location Data
Precise geolocation — GPS coordinates, cell tower triangulation — is sensitive PII. Home address is direct PII. Even ZIP code alone is borderline: paired with gender and date of birth, it becomes a powerful quasi-identifier. Aggregate or generalize location data before using it in analytics pipelines wherever possible.
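One way to generalize location data before it reaches an analytics pipeline is to coarsen it at ingestion. The sketch below is illustrative only: the function names are hypothetical, and the right precision depends on population density and your own risk assessment.

```python
def generalize_coordinates(lat, lon, decimals=2):
    """Round coordinates; two decimal places is roughly 1 km of precision."""
    return (round(lat, decimals), round(lon, decimals))

def generalize_zip(zip_code, keep=3):
    """Keep the first `keep` characters of a ZIP code and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

# Example values are arbitrary illustrations, not real customer data.
print(generalize_coordinates(42.360082, -71.058880))
print(generalize_zip("02139"))
```

Coarsening reduces precision but does not guarantee anonymity: in sparsely populated areas even a three-digit ZIP prefix can cover very few people.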
Device Identifiers
MAC addresses, IMEI numbers, advertising IDs, and cookie identifiers all qualify as PII under modern privacy regulations. Cookie IDs were the specific flashpoint in much of GDPR's early enforcement activity. If your application fingerprints devices, you're processing PII — full stop.
Sensitive PII: The Higher-Risk Tier
Not all PII carries equal risk. Sensitive personally identifiable information is data whose exposure causes disproportionate harm: discrimination, identity theft, physical danger, financial loss. NIST SP 800-122 captures the same idea through PII confidentiality impact levels (low, moderate, high), assigned according to the potential harm of unauthorized disclosure.
Sensitive PII examples include:
- Social Security Numbers and national identity numbers
- Biometric identifiers such as fingerprints, facial recognition data, and iris scans
- Financial account credentials and full card numbers
- Medical and health information including diagnoses, prescriptions, and mental health records
- Sexual orientation and gender identity
- Religious or political beliefs
- Immigration status and citizenship information
- Criminal records and arrest history
- Genetic data
- Children's data, which receives specific protection under COPPA and GDPR Article 8
GDPR Article 9 explicitly lists racial or ethnic origin, political opinions, religious beliefs, trade union membership, genetic data, biometric data used for unique identification, health data, and data concerning sex life or sexual orientation as special categories. Processing them is prohibited unless an Article 9(2) condition, such as explicit consent, applies, and additional safeguards are expected. Breaching these isn't just a compliance problem; it's a direct harm to real people.
What Is Considered PII: A Framework Comparison
Regulatory fragmentation is the practical headache here. Your data classification policy has to account for multiple overlapping regimes.
GDPR
GDPR's personal data definition is the broadest: any information relating to an identified or identifiable person. Identifiability considers whether identification is reasonably likely, not just technically possible. This catches IP addresses, cookie IDs, pseudonymous identifiers, and inferences derived from personal data. Pseudonymization reduces risk but doesn't remove GDPR applicability unless data is truly anonymous — a high bar.
CCPA and CPRA
California's framework covers personal information that identifies, relates to, describes, or could reasonably be linked to a consumer or household. Unique identifiers, browsing history, purchase history, geolocation, and inferences drawn to create a consumer profile are all covered. CPRA added sensitive personal information as a distinct category and gave consumers the right to limit its use and disclosure.
NIST SP 800-122
NIST SP 800-122 is the US federal guidance that most agencies and federal contractors follow. It defines PII with the same "alone or in combination" language and provides a risk-based framework for categorizing sensitivity. It is useful operationally because it maps directly to security controls.
HIPAA
The 18 PHI identifiers under HIPAA are a specific PII subset. They include names, geographic data more specific than state, dates related to individuals, phone numbers, fax numbers, email addresses, SSNs, medical record numbers, health plan beneficiary numbers, account numbers, certificate and license numbers, vehicle identifiers such as VINs and license plate numbers, device identifiers, URLs, IP addresses, biometric identifiers, full-face photos, and any other unique identifying number. If your systems touch healthcare data, these identifiers require strict access controls and breach notification procedures regardless of what your general PII policy says.
PII in Code: Where Engineers Actually Lose Data
Here's where theory meets practice. PII doesn't just leak from breached databases — it leaks from application code. Hardcoded API keys that expose customer records. Debug logs that capture full request payloads including SSNs. Error messages that echo back user input containing email addresses. Test environments seeded with production PII.
These aren't theoretical risks. The Secret Detection capabilities in modern security platforms exist precisely because developers accidentally commit credentials and tokens that grant access to PII stores. A committed AWS key with S3 read access to a customer data bucket is effectively a PII breach waiting to be discovered.
Static analysis through SAST tooling can catch patterns like logging statements that serialize user objects, or SQL queries that pull unnecessary PII columns into application memory. The principle of data minimization — collecting and processing only what's strictly necessary — should be enforced at the code level, not just the policy level. Shift-left on PII the same way you shift-left on vulnerabilities.
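As a runtime complement to static analysis, a logging filter can redact obvious PII before a message is written. This is a minimal sketch using Python's standard `logging` module; the regexes are deliberately simple, and `redact` and `PiiRedactionFilter` are hypothetical names chosen for illustration.

```python
import logging
import re

# Simplified patterns for illustration; real-world formats vary more than this.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")

def redact(text):
    """Replace SSN-like and email-like substrings with placeholders."""
    text = SSN_RE.sub("[REDACTED-SSN]", text)
    return EMAIL_RE.sub("[REDACTED-EMAIL]", text)

class PiiRedactionFilter(logging.Filter):
    """Redact the fully formatted message so interpolated arguments are covered too."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = ()  # arguments are already merged into msg
        return True

logger = logging.getLogger("app")
logger.addFilter(PiiRedactionFilter())
```

A filter like this is a safety net, not a substitute for keeping PII out of log statements in the first place. Pattern-based redaction will miss names, addresses, and anything else without a fixed shape.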
PII Protection: Technical and Organizational Controls
Knowing what PII is means nothing if you can't protect it. The controls stack has multiple layers.
Data Discovery and Classification
You can't protect what you can't find. Use automated tools to scan cloud storage, databases, and data lakes for PII patterns — SSN regex, email patterns, credit card Luhn-validated numbers. Cloud Inventory visibility is foundational here: you need to know every S3 bucket, every RDS instance, every blob container that might hold customer data before you can classify and control access.
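The pattern scanning described above can be sketched in a few lines. The regexes here are simplified assumptions, and `scan_text` and `luhn_valid` are hypothetical helper names. Production discovery tools add context checks, file-format parsing, and many more patterns.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_valid(number):
    """Check a candidate card number with the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    checksum = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return len(digits) >= 13 and checksum % 10 == 0

def scan_text(text):
    """Return (kind, match) pairs for SSN-like, email-like, and Luhn-valid card matches."""
    findings = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    findings += [("email", m.group()) for m in EMAIL_RE.finditer(text)]
    findings += [("card", m.group()) for m in CARD_RE.finditer(text)
                 if luhn_valid(m.group())]
    return findings

sample = ("Customer jane.doe@example.com, SSN 123-45-6789, "
          "card 4111 1111 1111 1111, order 1234 5678 9012 3456")
for kind, value in scan_text(sample):
    print(kind, value)
```

The Luhn check is what keeps the order number out of the results: it has the shape of a card number but fails the checksum, which cuts false positives considerably.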
Encryption and Tokenization
Sensitive PII at rest should be encrypted at the field level, not just at the disk level. Tokenization replaces PII with non-sensitive tokens, keeping the original in a protected vault. Credit card processing is the obvious use case (PCI DSS requires stored card numbers to be rendered unreadable, and tokenization is one accepted way to do that), but the pattern applies to SSNs, medical record numbers, and any sensitive PII your system stores but doesn't need in plaintext form.
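The tokenization pattern can be illustrated with an in-memory sketch. `TokenVault` is a hypothetical class for demonstration only: a real vault would be a separate, hardened service with encrypted storage, access control, and audit logging.

```python
import secrets

class TokenVault:
    """Toy in-memory vault mapping random tokens to original values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        """Return a stable random token for the value, creating one if needed."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)  # random, so not derivable from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]

vault = TokenVault()
token = vault.tokenize("123-45-6789")
print(token)                    # safe to store in application tables
print(vault.detokenize(token))  # only callers with vault access can do this
```

Because the token is random rather than computed from the value, a stolen application database reveals nothing about the underlying SSNs unless the vault is compromised as well.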
Access Controls and Least Privilege
The blast radius of a PII breach directly correlates with over-provisioned access. Enforce least-privilege access to PII datastores. Audit who has SELECT rights on your customer table. Use attribute-based access control to restrict sensitive PII fields based on role, context, and purpose. IAM misconfigurations in cloud environments are one of the leading vectors for PII exposure — this is exactly where CSPM tooling adds measurable value by continuously checking cloud configurations against security baselines.
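Attribute-based restriction of sensitive fields can be sketched as a lookup on role and purpose. Everything below is a hypothetical illustration: the policy table, role names, and `visible_record` helper are assumptions, and a real system would enforce this in the data layer rather than in application code alone.

```python
# Hypothetical policy: which sensitive fields each (role, purpose) pair may read.
POLICY = {
    ("fraud_analyst", "fraud_investigation"): {"ssn", "date_of_birth"},
    ("support_agent", "account_support"): {"date_of_birth"},
}
SENSITIVE_FIELDS = {"ssn", "date_of_birth"}

def visible_record(record, role, purpose):
    """Return a copy of the record with disallowed sensitive fields masked."""
    allowed = POLICY.get((role, purpose), set())  # default deny
    return {
        field: (value if field not in SENSITIVE_FIELDS or field in allowed else "***")
        for field, value in record.items()
    }

customer = {"name": "Jane Doe", "ssn": "123-45-6789", "date_of_birth": "1985-03-14"}
print(visible_record(customer, "support_agent", "account_support"))
print(visible_record(customer, "support_agent", "marketing"))
```

The default-deny lookup is the important design choice: a role and purpose combination that nobody thought to define gets no sensitive fields at all.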
Policy-as-Code for PII Compliance
Manual audit processes don't scale. Mature security programs maintain a consistent compliance posture by defining their PII handling requirements as code and enforcing them in CI/CD pipelines and cloud provisioning workflows. Policy-as-Code lets you codify rules like "no S3 bucket containing customer PII may be public" or "all databases holding PHI must have encryption enabled," and enforce them automatically before infrastructure reaches production.
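The two example rules above can be expressed as a small check that a pipeline runs against planned infrastructure. This is a sketch under stated assumptions: the resource dictionaries, the `data_class` tag, and the `check_resource` function are hypothetical, not the schema of any particular cloud provider or policy engine.

```python
def check_resource(resource):
    """Return a list of policy violations for one planned resource."""
    violations = []
    data_class = resource.get("tags", {}).get("data_class")
    if (resource.get("type") == "s3_bucket" and data_class == "customer_pii"
            and resource.get("public", False)):
        violations.append("S3 bucket containing customer PII must not be public")
    if (resource.get("type") == "database" and data_class == "phi"
            and not resource.get("encrypted", False)):
        violations.append("database holding PHI must have encryption enabled")
    return violations

# Hypothetical plan, as a CI job might receive it from a provisioning tool.
plan = [
    {"name": "exports", "type": "s3_bucket", "public": True,
     "tags": {"data_class": "customer_pii"}},
    {"name": "claims", "type": "database", "encrypted": True,
     "tags": {"data_class": "phi"}},
]

failures = {r["name"]: check_resource(r) for r in plan if check_resource(r)}
print(failures)  # a CI job would fail the build when this is non-empty
```

Checks like these depend on resources being tagged accurately, which is why data classification has to come before enforcement.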
Data Retention and Deletion
PII you don't have can't be breached. GDPR's storage limitation principle requires that personal data not be kept longer than necessary. Build automated retention schedules into your data architecture. Right-to-erasure requests under GDPR require you to actually delete PII across all systems — backups included — which is architecturally harder than it sounds if you didn't design for it.
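An automated retention schedule can start as a table of periods per record type plus a job that finds what has aged out. The retention periods, record kinds, and `expired_ids` helper below are hypothetical; actual periods come from your legal and compliance teams.

```python
from datetime import date, timedelta

# Hypothetical retention periods per record type.
RETENTION = {
    "support_ticket": timedelta(days=365),
    "marketing_lead": timedelta(days=180),
}

def expired_ids(records, today):
    """Return ids of records older than the retention period for their kind."""
    return [
        r["id"] for r in records
        if today - r["created"] > RETENTION[r["kind"]]
    ]

records = [
    {"id": 1, "kind": "support_ticket", "created": date(2024, 6, 1)},
    {"id": 2, "kind": "marketing_lead", "created": date(2025, 10, 1)},
    {"id": 3, "kind": "marketing_lead", "created": date(2025, 1, 1)},
]
print(expired_ids(records, today=date(2026, 1, 1)))
```

Passing `today` as a parameter instead of calling `date.today()` inside the function keeps the job testable, which matters when you need to show an auditor that the schedule behaves as documented.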
PII Compliance: Regulatory Obligations You Cannot Ignore
The regulatory environment in 2026 is more complex than it was three years ago. US states have continued passing comprehensive privacy laws, and the patchwork now covers a majority of the US population. Brazil's LGPD has matured significantly. India's Digital Personal Data Protection Act is in full enforcement mode. China's PIPL continues to generate compliance obligations for any organization processing the personal information of individuals in China.
For organizations subject to GDPR, the requirements are well-established but still demanding: lawful basis for processing, data subject rights covering access, rectification, erasure, and portability, records of processing activities, DPIAs for high-risk processing, and 72-hour breach notification. Fines for the most serious infringements can reach 20 million euros or 4% of global annual turnover, whichever is higher. The Compliance tooling landscape has evolved to help automate evidence collection and control mapping, but the underlying data governance work is non-negotiable.
Organizations handling PII at scale need a formal PII protection program: data inventory and mapping, classification policy, access controls, incident response procedures, and regular compliance assessments. The SecRails blog covers many of these areas in depth — from GDPR cross-border transfer requirements to NIS2 compliance implications for data processors.
The Aggregation Problem
One of the most underappreciated PII risks is aggregation. Individual data points that seem harmless become identifying — and harmful — when combined. A name alone isn't sensitive. An employer alone isn't sensitive. But name plus employer plus approximate salary range plus general neighborhood plus physical description creates a profile that can enable stalking, social engineering, or discrimination.
This is why data minimization and purpose limitation matter beyond just regulatory checkbox compliance. When designing data pipelines, ask: do we actually need this field? Can we use an aggregate or anonymized version? Can we hash or tokenize this before it leaves the collection point? Building privacy into data architecture is significantly cheaper than retrofitting it after a breach or a regulatory investigation.
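Hashing an identifier before it leaves the collection point is usually done with a keyed hash, since a plain hash of an email or phone number can be reversed by guessing inputs. A minimal sketch using Python's standard `hmac` module follows; the `pseudonymize` name and the key handling are assumptions for illustration.

```python
import hashlib
import hmac

def pseudonymize(value, key):
    """Return a keyed SHA-256 digest of a normalized identifier."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

# Placeholder key for the example; in practice, load it from a secrets manager.
key = b"example-key-do-not-use-in-production"
print(pseudonymize("Jane.Doe@example.com", key))
```

The same input and key always produce the same digest, so joins across datasets still work. Under GDPR this is pseudonymization rather than anonymization: whoever holds the key can recompute the mapping, so the output is still personal data.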
The Cloud Security posture of your data environment directly affects your PII exposure surface. Misconfigured cloud services — open S3 buckets, publicly accessible databases, over-permissive IAM roles — routinely expose PII that should never have been accessible externally. These aren't sophisticated attacks; they're configuration failures that automated security tooling catches routinely.
Practical Takeaways for Security Teams
PII protection isn't a one-time project. It's an ongoing operational discipline. Here's what actually moves the needle:
- Maintain a live data inventory that tracks where PII lives across your entire environment
- Enforce encryption for sensitive PII fields, not just at the storage level
- Apply least-privilege rigorously to PII datastores and audit access quarterly
- Scan code and infrastructure configurations continuously for PII exposure patterns
- Test your right-to-erasure and breach notification procedures before you actually need them
The organizations that handle PII breaches well aren't the ones that never have incidents — they're the ones that detect exposures fast, contain them quickly, and can demonstrate to regulators that they had appropriate controls in place. That capability comes from building a security program where PII protection is operationalized, not just documented.

