Japan's Health Data Anonymization Act: Enabling Large-Scale Health Research

Kathrin Gardhouse
Feb 12, 2025

Anonymized and pseudonymized medical data are at the heart of cutting-edge research and innovation in healthcare. By stripping away personal identifiers and adding additional privacy-preserving measures, these data allow for advanced studies without compromising the privacy of individuals.

In Japan, as elsewhere, the path to leveraging this valuable resource has been complex due to the need to balance large-scale data use with privacy protection. Under the Act on the Protection of Personal Information (APPI), healthcare providers have faced challenges in sharing and processing medical data for research and innovation purposes, primarily due to strict consent requirements.

This article explores how Japan’s Next-Generation Medical Infrastructure Law addresses these challenges and how Private AI can help organizations navigate compliance while enabling secure health data use for research.

Sharing Health Data under APPI

Under APPI, sharing health data with third parties generally requires explicit opt-in consent from each individual patient, although an exception exists for research purposes. Commercial development of healthcare products and services, however, does not fall under this exception. Building comprehensive medical databases, linking patient data across different institutions, and conducting broad-based medical research have been hampered by the need to obtain individual consent from every patient involved.

One way to avoid the onerous consent requirements is to anonymize the data. However, this can be a complex endeavour as well, especially when the data are to be linked across different institutions to build large-scale databases for medical researchers and innovators. Such data linkage may increase the re-identification risk for individuals. It is also challenging to meaningfully combine data that have already been anonymized, because it can no longer be determined whether certain health information pertains to the same individual. Yet prior to anonymization, the sharing of the data is restricted.
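The linkage problem can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each institution holds records with a shared `patient_id` field and that the linking party derives a keyed-hash token from it before the raw identifier is dropped. The field names, the key, and the sample records are hypothetical and not taken from the APPI or the NGMIL. Note that a keyed hash of this kind yields pseudonymized rather than anonymized data, since whoever holds the key can recompute the tokens.

```python
import hashlib
import hmac

# Hypothetical secret held only by the party that performs the linkage.
LINKAGE_KEY = b"example-secret-key"


def linkage_token(patient_id: str) -> str:
    """Derive a stable pseudonymous token from an identifier with a keyed hash."""
    normalized = patient_id.strip().upper()
    return hmac.new(LINKAGE_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(*sources):
    """Group records from several institutions by token and drop the raw identifier."""
    linked = {}
    for records in sources:
        for record in records:
            token = linkage_token(record["patient_id"])
            payload = {k: v for k, v in record.items() if k != "patient_id"}
            linked.setdefault(token, []).append(payload)
    return linked


# Hypothetical sample data from two institutions.
hospital_a = [{"patient_id": "jp-001", "diagnosis": "hypertension"}]
hospital_b = [
    {"patient_id": "JP-001 ", "lab_result": "HbA1c 6.1"},
    {"patient_id": "JP-002", "lab_result": "HbA1c 5.4"},
]

linked = link_records(hospital_a, hospital_b)
```

Had each hospital removed `patient_id` before sharing, the two records belonging to the first patient could no longer be matched, which is the difficulty described above.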

Next-Generation Medical Infrastructure Law

To address these challenges while maintaining robust privacy protections, Japan enacted the Act on Anonymized Medical Data That Contributes to Research and Development in the Medical Field, commonly known as the “Next-Generation Medical Infrastructure Law” (“NGMIL”). This legislation creates a novel framework that balances the imperatives of medical research with individual privacy rights.

At the heart of the NGMIL is the creation of "Authorized De-identified Medical Information Preparers", certified entities that serve as trusted intermediaries in the medical data ecosystem. These certified entities play a crucial role in transforming sensitive medical information into valuable research assets. They receive identifiable medical data from healthcare providers, standardize data formats across institutions, link patient records where appropriate, and ultimately create anonymized datasets that can be used for research purposes.

New Consent Mechanism

A significant innovation of the NGMIL lies in its consent mechanism. While APPI typically requires opt-in consent for sharing health data, the NGMIL establishes a carefully structured opt-out framework. Healthcare providers can share identifiable medical data with certified entities after notifying patients and giving them the opportunity to opt out. This seemingly subtle shift from opt-in to opt-out consent has profound implications for medical research, making it feasible to build the large-scale datasets necessary for meaningful healthcare innovation.
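The practical difference between the two consent models can be sketched in a few lines of Python. This is a simplified illustration, not a statement of the statutory procedure: the `PatientRecord` class, its fields, and the sample patients are hypothetical, and details such as how and when notice must be given are left out.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    patient_id: str
    notified: bool = False   # patient was notified of the planned provision
    opted_in: bool = False   # patient gave explicit consent
    opted_out: bool = False  # patient objected after being notified


def shareable_opt_in(records):
    """Opt-in model: only records with explicit consent may be shared."""
    return [r for r in records if r.opted_in]


def shareable_opt_out(records):
    """Opt-out model: notified patients are included unless they objected."""
    return [r for r in records if r.notified and not r.opted_out]


# Hypothetical sample patients.
records = [
    PatientRecord("p1", notified=True),                  # silent after notice
    PatientRecord("p2", notified=True, opted_out=True),  # objected
    PatientRecord("p3", notified=True, opted_in=True),   # explicitly consented
    PatientRecord("p4"),                                 # never notified
]

opt_in_ids = [r.patient_id for r in shareable_opt_in(records)]
opt_out_ids = [r.patient_id for r in shareable_opt_out(records)]
```

In this toy dataset the opt-in rule yields only `p3`, while the opt-out rule also includes `p1`, the patient who was notified and stayed silent. At scale, that group is what makes large datasets feasible.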

Regulatory Improvements and Real-World Application

The recent amendment to the NGMIL, which passed in May 2023 and took effect in 2024, introduces a new category of data known as pseudonymized medical information. Unlike fully anonymized data, which is irreversibly processed to prevent identification, pseudonymized medical data allows for the possibility of re-identification if matched with other information. This change addresses a challenge in the medical and pharmaceutical industries, where the strict limitations of anonymized data have hindered its utility in regulatory submissions. Under the revised framework, certified users of pseudonymized medical data can now submit such information to the Pharmaceuticals and Medical Devices Agency (PMDA) when seeking regulatory approval, without the need to remove outliers or rare disease identifiers. Additionally, the PMDA can request access to the original data from certified providers, enhancing the reliability and applicability of medical data in research and drug development.
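To see why removing rare values limits utility, consider the following Python sketch of one common anonymization step: suppressing any value that appears fewer than `k` times in a dataset. The function, the threshold, and the sample records are assumptions made for illustration only; the actual processing standards applicable in Japan are not reproduced here.

```python
from collections import Counter


def suppress_rare(records, field, k):
    """Replace values of `field` that occur fewer than k times with a placeholder."""
    counts = Counter(r[field] for r in records)
    return [
        {**r, field: r[field] if counts[r[field]] >= k else "SUPPRESSED"}
        for r in records
    ]


# Hypothetical sample data: one rare diagnosis among common ones.
records = [
    {"age_band": "40-49", "diagnosis": "influenza"},
    {"age_band": "50-59", "diagnosis": "influenza"},
    {"age_band": "30-39", "diagnosis": "influenza"},
    {"age_band": "40-49", "diagnosis": "Fabry disease"},
]

anonymized = suppress_rare(records, "diagnosis", k=2)
```

After this step the single rare-disease record no longer reveals its diagnosis, which protects the patient but also removes exactly the information a rare-disease study or a regulatory reviewer would need. Pseudonymized medical information avoids that loss by keeping such values.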

Regarding its real-world success, the NGMIL has made strides in enabling the use of anonymized medical data for research while balancing privacy concerns, but progress has been slow overall. Although certified producers and enterprises now facilitate data collection and anonymization, the number of participating medical institutions remains insufficient, limiting the richness of available datasets. The recent introduction of pseudonymized medical information aims to address some of the previous law’s shortcomings, but since both the preparation and the handling of pseudonymized medical information require certification, a barrier to entry may persist.

In addition, Japan has multiple medical information databases, but data from medical insurers and government-held medical records remain separate. While the universal health insurance system allows for nationwide data collection, insurers maintain independent databases, making it difficult for researchers to access comprehensive information. A system to integrate and link these disparate data sources would enhance research and data analysis.

While more than 20 studies have been initiated using NGMIL-authorized datasets, high costs and limited funding for academic research hinder broader adoption. Further cooperation among medical institutions, regulatory bodies, and industry stakeholders is essential for the law to fully achieve its goal of advancing medical R&D while safeguarding patient privacy.

Reduced Compliance Obligations for Anonymized Data

Aside from its benefits for medical research and innovation, processing anonymized data has further advantages from a compliance perspective. It is important to note that while neither the APPI nor the NGMIL explicitly says as much, the definitions of personal information and anonymized information seem to imply that anonymized information does not fall under the definition of personal information. This would generally have the consequence that all provisions that pertain to personal information do not apply to anonymized information. Nevertheless, the NGMIL specifically lists only a select few of the APPI provisions that do not apply to anonymized information, namely the following:

  1. Data subject requests
    Businesses handling anonymized medical data are exempt from Article 37 APPI, meaning they are not required to process disclosure or other handling requests from individuals regarding such data. Specifically, they do not have to establish procedures for individuals to request access, correction, deletion, or cessation of third-party provision of anonymized medical data, nor do they have to facilitate these requests through specific methods, ensure ease of submission, or allow requests via an agent. The provision determining fees for responses to such requests also does not apply, and lawsuits and other legal proceedings cannot proceed on the same grounds as those applicable to personal information.
  2. Explaining decisions
    Under Article 36 of the APPI, businesses must "endeavor to explain" their reasons when they refuse requests related to personal data, such as disclosure, correction, or deletion. However, when handling anonymized medical data, this obligation is removed.

It is noteworthy that Articles 43 through 46, in particular, are not listed among the APPI provisions that no longer apply, although the majority of the obligations that remain do not make a lot of sense when the anonymization process is outsourced:

  1. Anonymization Standards: Businesses must process personal information according to Personal Information Protection Commission standards to ensure individuals cannot be identified or data restored.
  2. Security Measures: They must implement safeguards to prevent leaks of deleted identifiers and processing methods used in anonymization.
  3. Public Disclosure: The categories of anonymized data must be publicly disclosed after anonymization.
  4. Third-Party Provision: Before sharing anonymized data, businesses must declare it as anonymized, disclose the information categories and provision methods, and inform recipients explicitly.
  5. Re-Identification Ban: Businesses are prohibited from cross-referencing anonymized data with other information to identify individuals.
  6. Proper Handling: Companies must take necessary steps for security, complaint resolution, and compliance, striving to publicly disclose these measures.

How Private AI Can Help with Compliance Under both Frameworks

As we have seen, there are two different ways businesses in Japan can go about anonymizing health data: on their own under the APPI, or under the NGMIL. The NGMIL introduces an additional layer of privacy protection for medical data that is to be combined into large datasets to which several institutions contribute. By entrusting the process of anonymization to certified experts, this complex task is streamlined. Yet slow uptake and certification regimes remain obstacles.

To support organizations with anonymization or pseudonymization, Private AI's advanced privacy-enhancing technology automates a crucial step of the process, the detection and redaction of direct and indirect identifiers.

With machine learning models trained to identify over 50 types of personal information across 53 languages, including Japanese, Private AI reduces an onerous process to a few lines of code. Specializing in unstructured data across various file formats, Private AI helps unlock the hidden value of health data for secure innovation. Importantly, the solutions can be deployed on-premises, so that there are no concerns with cross-border data transfers.

Conclusion

Japan's Next-Generation Medical Infrastructure Law represents a significant step forward in balancing medical research with individual privacy. By introducing certified entities to manage data anonymization and adopting an opt-out consent framework, the law facilitates large-scale health research while maintaining robust privacy protections. However, challenges remain, including slow adoption, certification barriers, and fragmented medical databases. The recent introduction of pseudonymized data aims to enhance data utility, particularly for regulatory submissions, but further integration and institutional participation are needed to fully realize the law’s potential.

As Japan continues refining its approach, privacy-enhancing technologies like Private AI can play a key role in streamlining compliance, ensuring secure data processing, and unlocking the full value of health data for innovation.
