Loose Lips Sink Chips: Beware What You Say to AI Chatbots
Generative AI chatbots like ChatGPT, Microsoft’s Bing/Copilot, and Google’s Gemini are the vanguard of a significant advance in computing. They can be compelling tools for finding just the right word, drafting simple legal documents, starting awkward emails, and coding in unfamiliar languages. It’s well known that AI chatbots “hallucinate,” making up plausible details that are entirely wrong. That’s a genuine concern, but the privacy and confidentiality implications have received less attention.
To be sure, many conversations aren’t sensitive, such as asking for recommendations of bands similar to The Guess Who or for help writing an AppleScript. But increasingly, we’re hearing about people who’ve asked an AI chatbot to analyze or summarize some information and pasted in an entire file. Plus, services like ChatPDF and features in Adobe Acrobat let you ask questions about a PDF you provide, which can be an excellent way to extract content from a lengthy document.
While potentially useful from a productivity standpoint, such situations provide a troubling opportunity to reveal personally sensitive data or confidential corporate information. We’re not talking hypothetically here: Samsung engineers inadvertently leaked confidential information while using ChatGPT to fix errors in their code. What might go wrong?
The most significant concern is that sensitive personal and business information might be used to train future versions of the large language models that underpin the chatbots. That information could then be regurgitated to other users in unpredictable contexts. People worry about this partly because early large language models were trained on publicly accessible online text without the knowledge or permission of its authors. As we all know, lots of stuff can unintentionally end up online.
Although the privacy policies for the best-known AI chatbots say the right things about how uploaded data won’t be used to train future versions, there’s no guarantee that companies will adhere to those policies. Even if they intend to, there’s room for error: conversation history could accidentally end up in a training dataset. Worse, because chatbot prompts aren’t simple database queries, there’s no easy way to determine whether confidential information has made its way into a large language model.
More prosaically, because chatbots store conversation history (though some let you turn that feature off), anything you add to a conversation sits in an uncontrolled environment where at least some employees of the chatbot service could see it and potentially share it with partners. Such information would also be vulnerable should attackers compromise the service and steal data. These privacy considerations are the main reason to avoid sharing sensitive information with chatbots.
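If you do find yourself wanting to run text through a chatbot, one practical mitigation is to scrub obvious identifiers locally before anything leaves your machine. Below is a minimal, illustrative Python sketch; the patterns and placeholder labels are our own assumptions for demonstration, not part of any chatbot’s API, and real PII detection requires far more care.

    import re

    # Illustrative patterns only; real-world PII detection is much harder.
    PATTERNS = {
        "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
        "PHONE": re.compile(r"\b(?:\+?\d{1,2}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    }

    def redact(text, extra_terms=()):
        """Replace likely identifiers with placeholder tokens before
        pasting text into a chatbot. Pass known sensitive strings
        (client names, project code names) via extra_terms."""
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        for term in extra_terms:
            # Mask explicitly listed terms, case-insensitively.
            text = re.sub(re.escape(term), "[REDACTED]", text, flags=re.IGNORECASE)
        return text

    print(redact(
        "Email jane.doe@example.com or call 555-123-4567 about Project Falcon.",
        extra_terms=["Project Falcon"],
    ))
    # Prints: Email [EMAIL] or call [PHONE] about [REDACTED].

Even so, automated scrubbing misses things, so treat it as a way to reduce risk, not a guarantee of confidentiality.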
Adding weight to that recommendation, many companies operate under master services agreements that specify how they must handle client data. For instance, a marketing agency hired to generate an ad campaign for a manufacturer’s new product should avoid using any details about the product in AI-based brainstorming or content generation. If those details were revealed in any way, the agency could be in violation of its contract with the manufacturer and subject to significant legal and financial penalties.
In the end, although it may feel like you’re having a private conversation with an AI chatbot, don’t share anything you wouldn’t tell a stranger. As Samsung’s engineers discovered, loose lips sink chips.
(Featured image by iStock.com/Ilya Lukichev)