Meta AI now collects 33 out of 35 possible data types from its users. ChatGPT added seven new tracking categories in the past year alone. And in 18 days, Congress will decide whether federal agencies can keep buying that data without a warrant.
The collision between expanding AI data collection and the April 20 expiration of FISA Section 702 creates a privacy crisis that most chatbot users don’t know exists.
The Numbers Keep Getting Worse
Surfshark’s latest analysis of the ten most popular AI chatbot apps on Apple’s App Store paints a grim picture. On average, these apps collect 14 out of 35 possible data types. About 70% track your location. And the worst offenders are getting greedier.
Meta AI ranks worst in the study. It collects 33 of 35 possible data types, nearly everything Apple allows developers to declare. It's the only chatbot in the group that harvests financial information, and it also grabs racial and ethnic data, sexual orientation, disability status, and biometric data. If you're using Meta AI on Instagram or WhatsApp, all of that feeds into the same data machine.
Google Gemini collects 23 data types, including precise location — something only three other apps in the study also track. Your search history, browsing data, and contact information all flow into Google’s advertising infrastructure.
ChatGPT has gotten significantly hungrier. OpenAI’s app now collects 17 data types, up 70% from the 10 types it declared last year. The new additions include coarse location, health and fitness data, search history, audio data, advertising data, and customer support records. Fourteen types are labeled as “app functionality,” but others feed analytics, personalization, marketing, and third-party advertising.
DeepSeek collects 13 data types including location and search history — and stores everything on servers in China. The app suffered a breach earlier this year that exposed over a million chat records.
Claude collects 13 data types, but with a key difference: it’s the only major chatbot that doesn’t train on your conversations unless you explicitly opt in. Anthropic claims all 13 types serve app functionality. It doesn’t collect data for third-party advertising or personalization.
The Government Is Already Buying
Here’s where the chatbot data story intersects with something much bigger.
Federal agencies, including the FBI, Department of Defense, and Immigration and Customs Enforcement, have been purchasing bulk data from commercial brokers without warrants. The practice sidesteps restrictions Congress imposed in 2015, in the wake of the Snowden revelations. The agencies found a loophole: if data is "commercially available," they can buy it instead of collecting it directly.
Anthropic CEO Dario Amodei warned Congress that AI makes this exponentially more dangerous. Government-purchased records could enable AI systems to assemble “a comprehensive picture of any person’s life — automatically and at massive scale.”
ICE has contracted with surveillance tool providers like Penlink for mobile phone tracking, requested access to commercial “Big Data and Ad Tech” tools, and uses facial recognition and license plate data alongside administrative subpoenas. The chatbot data you generate — your questions about health conditions, legal issues, financial problems, and political views — fits neatly into this pipeline.
April 20: The FISA Deadline
Section 702 of the Foreign Intelligence Surveillance Act expires on April 20, 2026. Congress last reauthorized it in April 2024 through the Reforming Intelligence and Securing America Act (RISAA), but only for two years — in part because lawmakers couldn’t agree on privacy reforms.
The central fight hasn't changed: should the government need a warrant to search Americans' communications collected under 702? A warrant amendment failed on a 212-212 tie in the House during the 2024 vote, with Speaker Johnson casting the deciding vote against it.
This time, the math is different. The Congressional Progressive Caucus — 98 House Democrats — formally voted to oppose any reauthorization without “dramatic reforms.” Combined with roughly a dozen Republican holdouts demanding similar changes, a clean extension faces real opposition.
Senators Wyden and Lee have reintroduced the SAFE Act, a bipartisan bill that would reauthorize 702 with warrant requirements for U.S. person queries and — critically — close the data broker loophole. Over 130 civil society organizations have urged Congress not to reauthorize without addressing the data broker problem.
The reform bill would require federal law enforcement to get a warrant before surveilling Americans’ location information, web browsing data, search records, chatbot conversations, and car telematics data. Senator Durbin has called for these reforms to pass before the April deadline.
What This Means
Every question you ask an AI chatbot creates a data point. That data point gets stored, analyzed, and — depending on the platform — shared with advertisers, data brokers, or training pipelines. If the data broker loophole stays open, government agencies can purchase aggregated chatbot-adjacent data without ever getting a judge to sign off.
The scale is what matters. When millions of people ask ChatGPT about their medical symptoms, legal problems, and political opinions, that creates a dataset unlike anything intelligence agencies have had access to before. AI can correlate, cross-reference, and profile at a speed and scale that makes traditional surveillance look quaint.
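To make the cross-referencing concern concrete, here is a minimal, entirely hypothetical sketch of how separately purchased datasets that share a common identifier can be merged into a single profile. Every dataset, field name, and identifier below is invented for illustration; real broker data is messier, but the join logic is this simple.

```python
# Hypothetical "purchased" datasets, each keyed by an advertising identifier.
# All records and values are invented for illustration only.
location_data = {"ad-123": {"city": "Austin", "visits": ["clinic", "courthouse"]}}
chat_metadata = {"ad-123": {"topics": ["medical symptoms", "immigration law"]}}
broker_records = {"ad-123": {"name_guess": "J. Doe", "age_range": "30-39"}}

def build_profile(ad_id, *sources):
    """Merge every record sharing one identifier into a single profile dict."""
    profile = {}
    for source in sources:
        profile.update(source.get(ad_id, {}))
    return profile

profile = build_profile("ad-123", location_data, chat_metadata, broker_records)
print(profile)
```

Three datasets that are individually mundane become, once joined on one shared key, a record tying a location history to sensitive chatbot topics and a probable identity. AI systems automate exactly this kind of join across millions of identifiers.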
The FBI has already demonstrated what happens when this power goes unchecked. The agency conducted tens of thousands of improper searches of 702-acquired data, targeting Black Lives Matter protestors, political officials, journalists, and 19,000 donors to a single congressional campaign.
What You Can Do
Right now:
- Audit your chatbot settings. Open ChatGPT → Settings → Data Controls → turn off "Improve the model for everyone." In Gemini, go to Activity → turn off "Gemini Apps Activity." For Claude, no action needed — it doesn't train on conversations by default.
- Check your browser extensions. Remove any AI-related extensions you don't absolutely need. Even legitimate-looking tools have been caught harvesting chat data.
- Use California's DROP tool. If you're in California, visit privacy.ca.gov/data-brokers to submit deletion requests to data brokers through the state's new centralized platform. Starting August 1, brokers must process these requests every 45 days.
- Don't share anything sensitive with chatbots that train by default. Treat any chatbot conversation as potentially public unless you've verified the privacy settings.
Before April 20:
- Contact your representatives. The SAFE Act would close the data broker loophole and require warrants for searching Americans' data. The Electronic Privacy Information Center provides tools to contact Congress about 702 reform.
- Support organizations fighting for reform. The Brennan Center for Justice, EPIC, and the ACLU are actively working on 702 reform. Their combined efforts represent the best shot at closing the data broker loophole before it becomes permanent.
The window is closing. In 18 days, Congress will either reform warrantless surveillance or rubber-stamp another extension. Meanwhile, the chatbots collecting your most private thoughts keep getting hungrier.