ChatGPT Doxes User With Old Personal Data
ChatGPT revealed a user's outdated home address and phone number, sparking fresh concerns about AI's grasp of personal data and the blurry line between public and private information online. The incident highlights potential doxing risks as large language models continue to train on vast internet datasets.

Imagine asking an AI a question, only for it to spit back your old home address and phone number. That's exactly what happened to journalist Matt Novak recently, a jarring reminder that our digital footprints, however stale, can be surprisingly persistent and can now be surfaced, and potentially weaponized, through AI. While Novak was relieved the information was old, the incident underscores a serious privacy conundrum: what happens to personal data when a powerful AI can dredge up details you thought were long buried or obscured?
The AI's Unsettling Recall
Novak, a tech reporter, put ChatGPT to the test, asking if it knew his personal information. The response was unsettlingly specific: an old residential address and a phone number he hadn't used in years. It's a chilling scenario, even with outdated details. Many people consider this kind of information personal, even if it was publicly available at some point, and not something they'd expect an AI to freely offer up. We're used to data breaches where a database is compromised, but this feels different: an AI "recalling" personal details from its vast training corpus.
This isn't about ChatGPT "hacking" into private records. Instead, it speaks to how large language models (LLMs) like OpenAI's GPT series are trained. They consume colossal amounts of text data from the internet, everything from books and articles to forum posts and publicly available directories. If an address or phone number appeared on a long-defunct website, a forgotten blog, or an old news article associated with an individual, it may well have been swept into the training data. The challenge is that these models don't distinguish between what was once public and what should remain public, especially when the context in which the information was published has changed or the information is no longer current.
When "Public" Data Becomes Private Again
The internet has a long memory, and LLMs are proving to be exceptionally good at reproducing what it holds. For years, privacy advocates have grappled with the idea of a "right to be forgotten": the concept that individuals should be able to request the removal of personal information from public search results or databases. European regulations like GDPR have made strides in this area, but the rise of generative AI throws a wrench into that progress. If an AI has already "learned" your data from its training set, how do you make it unlearn that data? And who is responsible for ensuring that information, once public but now outdated or sensitive, isn't resurfaced in potentially harmful ways?
Consider the implications if Novak's information had been current. This isn't just an abstract privacy concern; it's a direct route to doxing. Journalists, activists, public figures, or even everyday individuals could face real-world threats if their current addresses, phone numbers, or other sensitive details are easily accessible via an AI chat interface. It turns a vast, semi-organized data dump into a conversational tool for extracting personally identifiable information. That's a shift in how we might think about online safety and personal exposure.
The AI Privacy Tightrope
This incident forces us to confront the delicate balance between AI's utility and individual privacy. Developers like OpenAI face immense pressure to filter and curate their training data, but the sheer scale of the internet makes this an almost impossible task. What constitutes "public" data versus "private" data in a world where an old forum post or a historical public record can be instantly retrieved and presented by an AI? We're stepping into a new era where our digital past, even the parts we've tried to move on from, can be reanimated by machines that don't understand context or consent in the human sense.
For users, this means a renewed need for vigilance about their online presence, both past and present. For regulators, it means grappling with how existing privacy frameworks apply to AI systems trained on data scraped from across the web. We'll likely see more calls for clearer guidelines on data retention, deletion, and the ethical use of training data for AI. This isn't just about protecting individuals from malicious actors; it's about defining the boundaries of what these powerful new technologies are permitted to know and share about us.
Why it matters
Matt Novak's experience with ChatGPT is more than just a quirky anecdote; it's a stark warning. It shows us that as AI becomes more sophisticated, the line between what's publicly accessible and what's ethically permissible for an AI to disclose is becoming increasingly blurred. This incident isn't about a data breach in the traditional sense, but rather a fundamental challenge to our understanding of privacy in the age of omnipresent AI. It demands we ask critical questions about data ownership, the right to be forgotten, and who bears responsibility when machines, built on the vastness of human information, expose our personal lives. The answers will shape the future of both AI and digital privacy.
Sources
- ChatGPT Gave Out My Address and Phone Number · Matt Novak