Introduction
Every organisation in India wants to use AI. Very few want to face a ₹250 crore fine for doing so. Yet that is precisely the risk that thousands of Indian businesses are running today by sending personal data to foreign AI APIs without adequate compliance safeguards. According to Codelattice's analysis of the DPDP Act, penalties for failure to implement reasonable security safeguards — including safeguards around cross-border data flows — can reach ₹250 crore per instance. The DPDPA does not carve out exceptions for AI workloads. When an employee pastes a customer's banking complaint into ChatGPT, when a fintech routes loan applicant data through a US-hosted AI model, or when a hospital sends patient records to a cloud-based diagnostic AI — each of these actions constitutes a cross-border transfer of personal data that falls squarely within the Act's regulatory perimeter. This article examines why foreign AI APIs represent one of the most underestimated compliance risks under the DPDPA, and what organisations must do to address it before enforcement actions begin.
The Scale of the Problem: Indian Data Flowing to Foreign AI Infrastructure
The adoption of generative AI across Indian enterprises has been staggering. A 2026 EY India survey found that 78% of Indian enterprises now use at least one external AI API in production workflows, with an additional 15% in pilot programmes. The vast majority of these APIs — ChatGPT (OpenAI), Copilot (Microsoft), Gemini (Google), and Claude (Anthropic) — process data on infrastructure located in the United States, Europe, or other jurisdictions outside India. Each API call that includes personal data constitutes a cross-border data transfer under Section 16 of the DPDPA. For context, a single mid-sized Indian bank processing customer service queries through an AI API may transmit hundreds of thousands of personal data records offshore every month — names, account numbers, transaction histories, complaint details, and Aadhaar-linked identifiers. The Indian IT services sector alone is estimated to make over 2 billion AI API calls per month, with a significant percentage containing some form of personal data. The regulatory exposure is enormous, and most organisations have not even begun to quantify it.
- ChatGPT and GPT-4 API — Data processed on OpenAI servers primarily in the US; consumer-tier ChatGPT conversations may be used for model training unless users opt out, and API data handling depends on the terms of the applicable agreement
- Microsoft Copilot — Integrated into Microsoft 365, processes data through Azure infrastructure with data residency depending on tenant configuration, often defaulting to US regions
- Google Gemini — Processes data on Google Cloud infrastructure; consumer-tier usage may involve data being used for model improvement
- Open-source models on foreign cloud — Even self-hosted open-source models running on AWS, Azure, or GCP in non-Indian regions constitute cross-border transfers
What the DPDPA Actually Says About Cross-Border AI Data Flows
The DPDPA takes a permissive-with-restrictions approach to cross-border data transfers. Section 16 authorises the Central Government to restrict transfers of personal data to specific countries or territories through a negative list. While the negative list has not yet been published as of March 2026, MeitY has signalled that it will be informed by national security considerations, adequacy assessments, and bilateral agreements. Critically, the absence of a negative list does not mean organisations can transfer data freely without consequence. The DPDPA's core obligations — purpose limitation, data minimisation, security safeguards, and consent requirements — apply regardless of where data is processed. If an organisation sends personal data to a foreign AI API without a lawful basis, without adequate notice to the Data Principal, or without reasonable security safeguards, it is in violation of the Act even if the destination country is not on the negative list. As PwC's regulatory analysis notes, the DPDP Rules 2025 impose additional obligations on cross-border transfers, including requirements for contractual safeguards with foreign processors.
Significant Data Fiduciary Designation: The AI Trap
Organisations that process large volumes of Indian personal data are prime candidates for designation as Significant Data Fiduciaries (SDFs) under Section 10 of the DPDPA. As Kiteworks' DPDPA compliance analysis highlights, SDF designation brings substantially elevated obligations — mandatory Data Protection Impact Assessments, appointment of a Data Protection Officer and independent data auditor, and periodic audits of all data processing activities. The AI dimension makes SDF designation particularly consequential. An organisation designated as an SDF that routes personal data through foreign AI APIs faces compounding obligations: it must conduct DPIAs specifically covering AI data flows, its DPO must assess and document the risks of each foreign AI integration, and its independent auditor must verify that cross-border AI transfers comply with every applicable provision. According to IAPP's operational impact assessment, fewer than 12% of organisations likely to be designated as SDFs have completed AI-specific DPIAs. The gap between obligation and readiness is alarming.
- Volume trigger — Organisations processing personal data of more than a threshold number of Data Principals (expected to be in the millions) are candidates for SDF designation
- Sensitivity trigger — Processing sensitive categories of data (health, financial, children's) at scale increases the likelihood of designation
- AI-specific risk — The use of AI for automated decision-making involving personal data is explicitly flagged as a factor in SDF assessment
- Elevated penalties — SDFs face heightened regulatory scrutiny, and non-compliance by an SDF is likely to attract penalties at the higher end of the penalty schedule
The Hidden Risks: How ChatGPT, Copilot, and Gemini Create Data Residency Problems
The data residency risks created by popular AI APIs are more pervasive than most organisations realise. These risks operate at three levels: direct API calls, indirect data exposure through integrations, and training data retention. At the direct level, every prompt containing personal data that is sent to an AI API constitutes a data transfer to the jurisdiction where the API processes data. At the integration level, tools like Microsoft Copilot that are embedded into productivity suites can access and process personal data from emails, documents, and calendars — often without the user explicitly intending to share that data with an AI. At the training level, several AI providers retain API inputs for model improvement unless customers explicitly opt out, meaning Indian personal data could be permanently incorporated into models hosted abroad. A Chambers and Partners analysis warns that the DPDPA's purpose limitation principle is directly violated when data collected for one purpose (e.g., customer service) is used for another (e.g., AI model training) without separate consent. This creates a compliance timebomb: organisations may be in violation today without knowing it, with liability accruing silently until enforcement begins.
- Shadow AI usage — Employees using personal ChatGPT accounts to process work data, bypassing corporate controls entirely
- Embedded AI in SaaS — AI features auto-enabled in Salesforce, Zendesk, Freshworks, and other SaaS tools that process data offshore
- Training data retention — Consumer-tier AI APIs that retain and use input data for model training, making deletion requests meaningless
- Data leakage through AI outputs — Sensitive information previously submitted to a model surfacing in later responses, including via prompt injection attacks that coax the model into revealing it
RBI Data Localisation and DPDPA: The Double Bind for Banks
For India's banking and financial services sector, foreign AI APIs create a particularly acute compliance challenge. The Reserve Bank of India's data localisation directive requires that all payment system data be stored exclusively in India. The DPDPA adds a separate layer of obligations around personal data processing and cross-border transfers. When a bank uses a US-hosted AI API to process customer data — even for seemingly innocuous tasks like summarising customer complaints or generating reports — it potentially violates both the RBI's localisation requirements and the DPDPA's cross-border transfer provisions simultaneously. According to Lexology's regulatory analysis, banks face the added challenge of harmonising DPDPA consent requirements with existing RBI guidelines on customer data handling, creating a compliance matrix that is fiendishly complex. The BFSI sector's overlapping regulatory obligations make it the industry where foreign AI API risks are most likely to trigger enforcement action first.
How to Audit Your AI Data Flows for DPDPA Compliance
Auditing AI data flows is not optional — it is a regulatory necessity. Organisations must systematically identify every instance where personal data is transmitted to an AI system, assess the compliance implications, and implement appropriate controls. A comprehensive AI data flow audit should be treated with the same rigour as a financial audit, with documented findings, risk ratings, and remediation timelines. Kraver.ai's data discovery and mapping capabilities can automate much of this process, identifying AI API calls across your infrastructure and flagging those that involve personal data transfers.
- Inventory all AI integrations — Catalogue every AI API, embedded AI feature, and AI-powered SaaS tool used across the organisation, including shadow IT
- Map data flows — For each integration, document what personal data is transmitted, where it is processed, whether it is retained, and for how long
- Assess lawful basis — Verify that each AI data flow has a valid lawful basis under the DPDPA — consent under Section 6 or one of the legitimate uses enumerated in Section 7
- Review vendor contracts — Ensure AI API providers have contractual commitments around data processing, retention, deletion, and sub-processor restrictions
- Evaluate data minimisation — Assess whether the personal data sent to AI APIs is truly necessary, or whether anonymised or synthetic data could achieve the same outcome
- Implement DLP controls — Deploy data loss prevention tools that intercept and block personal data from being transmitted to unapproved AI APIs
- Document everything — Maintain audit trails of all AI data processing activities for regulatory inspection readiness
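The detection step behind the "map data flows" and "implement DLP controls" items can be sketched in a few lines. The patterns below are illustrative assumptions only: production DLP tools use validated detectors (for example, the Verhoeff checksum for Aadhaar numbers) rather than bare regexes, and the endpoint names are placeholders.

```python
import re

# Hypothetical patterns for common Indian personal-data identifiers.
# Illustrative only -- real DLP tools use validated detectors
# (e.g. Verhoeff checksum validation for Aadhaar), not bare regexes.
PII_PATTERNS = {
    "aadhaar": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "in_phone": re.compile(r"\b(?:\+91[ -]?)?[6-9]\d{9}\b"),
}

def scan_prompt(prompt: str) -> list[str]:
    """Return the identifier types detected in an outbound AI prompt."""
    return [name for name, pat in PII_PATTERNS.items() if pat.search(prompt)]

def should_block(prompt: str, endpoint: str, approved: set[str]) -> bool:
    """Block the call if it carries personal data to an unapproved endpoint."""
    return bool(scan_prompt(prompt)) and endpoint not in approved
```

Wired into an egress proxy or API gateway, `should_block` gives an audit team a first-pass inventory of which AI integrations actually carry personal data offshore, feeding the documented findings and risk ratings described above.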
The Sovereign AI Alternative: Processing Data Within India
The concept of sovereign AI — AI infrastructure that processes data entirely within national borders — is gaining significant traction in India. The Indian government's IndiaAI Mission has allocated over ₹10,000 crore to build domestic AI compute infrastructure, including GPU clusters and AI-as-a-service platforms hosted within India. Major cloud providers are expanding their Indian regions: AWS has two regions in Mumbai and Hyderabad, Azure has three Indian regions, and Google Cloud has its Mumbai and Delhi regions. This infrastructure makes it increasingly feasible for organisations to run AI workloads on Indian soil. For organisations that must use frontier AI models, several options exist: deploying open-source models like Llama, Mistral, or Qwen on Indian cloud infrastructure; using AI API providers that offer India-region processing guarantees; or building fine-tuned models on Indian compute. The cost differential is narrowing rapidly, and the compliance savings — avoiding the need for complex cross-border transfer mechanisms — can offset the infrastructure premium. As India's AI governance framework matures, organisations that have already shifted to sovereign AI infrastructure will be significantly better positioned than those scrambling to migrate under regulatory pressure.
- India-hosted open-source models — Deploy Llama 3, Mistral, or other open models on AWS Mumbai, Azure Central India, or Jio Cloud
- India-region API guarantees — Negotiate data processing agreements that guarantee all inference happens within Indian data centres
- On-premises AI — For highly sensitive workloads, deploy AI models on your own infrastructure within India using NVIDIA GPUs or Intel Gaudi accelerators
- Hybrid approaches — Use foreign AI APIs only for non-personal data workloads, routing all personal data through India-hosted models
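The hybrid approach above can be sketched as a simple routing layer. This assumes an OpenAI-compatible inference server (for example, vLLM serving Llama 3) is already deployed in an Indian region; the base URLs and model names below are hypothetical placeholders, not real services.

```python
# Assumed self-hosted, OpenAI-compatible deployment in an Indian region
# (e.g. vLLM serving Llama 3). Both URLs below are placeholders.
INDIA_BASE_URL = "https://llm.mumbai.internal.example/v1"
FOREIGN_BASE_URL = "https://api.openai.com/v1"

def route_request(contains_personal_data: bool) -> dict:
    """Return connection settings for a chat completion request.

    Personal data is pinned to India-hosted inference; everything else
    may use a foreign API under standard vendor-management controls.
    """
    if contains_personal_data:
        return {"base_url": INDIA_BASE_URL, "model": "llama-3-70b-instruct"}
    return {"base_url": FOREIGN_BASE_URL, "model": "gpt-4o"}

# Usage with the `openai` client, which works against any
# OpenAI-compatible server when `base_url` is overridden:
#   from openai import OpenAI
#   cfg = route_request(contains_personal_data=True)
#   client = OpenAI(base_url=cfg["base_url"], api_key="...")
#   client.chat.completions.create(
#       model=cfg["model"],
#       messages=[{"role": "user", "content": prompt}])
```

Because most self-hosted inference servers expose the OpenAI wire format, switching between the foreign API and the sovereign deployment is a configuration change rather than a rewrite, which keeps the migration cost of the hybrid pattern low.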
Building a DPDPA-Compliant AI Strategy
A compliant AI strategy requires more than just switching API endpoints. It demands a comprehensive framework that integrates data governance, consent management, security controls, and continuous monitoring. Organisations should begin by classifying their AI use cases into risk tiers: high-risk (processing sensitive personal data through AI), medium-risk (processing basic personal data), and low-risk (no personal data involved). Each tier should have defined controls proportionate to the risk. High-risk use cases should default to India-hosted AI infrastructure with full data classification and DLP controls. Medium-risk use cases may use foreign APIs with appropriate contractual safeguards, anonymisation, and consent. Low-risk use cases can proceed with standard vendor management controls. This tiered approach ensures compliance without unnecessarily restricting AI adoption — a balance that is critical for maintaining competitiveness while respecting the law.
- Establish an AI governance committee — Include legal, compliance, IT security, and business stakeholders in all AI deployment decisions
- Create an approved AI vendor list — Pre-assess and approve AI APIs based on data residency, security, and DPDPA compliance criteria
- Implement consent for AI processing — Where AI processing constitutes a new purpose, obtain separate, specific consent from Data Principals
- Deploy monitoring and alerting — Use DLP solutions that detect and alert on personal data being sent to unapproved AI endpoints
- Conduct periodic DPIAs — Assess AI data flows quarterly, not annually, given the pace of AI adoption and regulatory evolution
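The three-tier classification described above can be encoded so that every new AI use case is triaged consistently. The sensitive-category set is an assumption drawn from the sectors this article flags (health, financial, children's data); adjust it to your own data taxonomy.

```python
from enum import Enum

class RiskTier(Enum):
    HIGH = "high"      # sensitive personal data processed through AI
    MEDIUM = "medium"  # basic personal data
    LOW = "low"        # no personal data involved

# Assumed sensitive categories, mirroring those flagged in the text.
SENSITIVE_CATEGORIES = {"health", "financial", "children", "biometric"}

def classify_use_case(data_categories: set[str]) -> RiskTier:
    """Map the personal-data categories an AI use case touches to a tier."""
    if data_categories & SENSITIVE_CATEGORIES:
        return RiskTier.HIGH
    if data_categories:
        return RiskTier.MEDIUM
    return RiskTier.LOW

# Controls proportionate to each tier, as sketched above.
TIER_CONTROLS = {
    RiskTier.HIGH: ["India-hosted inference", "data classification", "DLP"],
    RiskTier.MEDIUM: ["contractual safeguards", "anonymisation", "consent"],
    RiskTier.LOW: ["standard vendor management"],
}
```

Embedding this function in the AI governance committee's intake process makes the approved-vendor decision mechanical: the tier determines which endpoints and controls a use case is allowed before it reaches production.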
Conclusion: Act Now or Pay Later
The window for voluntary compliance is closing. The DPDP Rules 2025 have established a clear timeline: Phase 2 obligations take effect in November 2026, and full compliance is required by May 2027. The Data Protection Board of India is operational and building enforcement capacity. Foreign AI APIs represent one of the most visible, easily auditable compliance gaps — a regulator looking for enforcement targets will find it trivially easy to identify organisations routing personal data to offshore AI infrastructure. The question is not whether enforcement will come, but when. Organisations that proactively audit their AI data flows, implement sovereign AI alternatives for sensitive workloads, and build DPDPA-compliant AI governance frameworks will protect themselves from penalties of up to ₹250 crore while maintaining their competitive edge in AI adoption. Those that delay will face both the financial penalties and the operational disruption of emergency compliance under regulatory pressure. Kraver.ai's AI-native compliance platform provides the tools to assess your risk exposure, map your AI data flows, and implement compliant AI strategies — before the compliance timebomb detonates.