ChatGPT is undoubtedly the most popular generative AI chatbot in use at present. The OpenAI owned platform is being used by people from across various age groups for a wide number of purposes. Sam Altman revealed that the platform has as many as 1 billion users, with people using it for a wide variety of tasks, with the most common being productivity and learning-related activities like writing assistance, coding help, and summarizing information. While ChaGPT has been transformative and disruptive, findings from a recent report seem to be concerning.
Tenable researchers discovered seven vulnerabilities and attack techniques during testing of OpenAI’s ChatGPT-4o, several of which were later found to persist in ChatGPT-5. The vulnerabilities, called HackedGPT, were able to bypass built-in safety mechanisms. These vulnerabilities if exploited could expose users to attackers who in turn would be able to secretly steal personal data, including stored chats and memories.
Also read: The Lonely Teen and ChatGPT – Pragya Misra on Difficult Questions Surrounding AI Empathy
The vulnerabilities reveal a new class of AI attack called indirect prompt injection, where hidden instructions in external websites or comments can trick the model into performing unauthorised actions thus creating opportunities for manipulation and data exposure. The attack can either through “0-click”, where simply asking ChatGPT a question triggers the compromise, or “1-click” attacks, where clicking a malicious link activates hidden commands. Both the attacks occur silently.
Seven ChatGPT Vulnerabilities That Can Lead to Platform Getting Hijacked
These weaknesses might allow attackers to embed covert commands within conversations or stored memories, extract confidential information from chat histories or linked services like Gmail and Google Drive, and even siphon data through browsing or web integrations. In more severe cases, these flaws could be leveraged to manipulate AI responses thus spreading misinformation or subtly influencing user behavior. The 7 vulnerabilities are as follows:
Indirect prompt injection via trusted sites: Attackers can embed hidden instructions within legitimate-looking online content such as blog comments or public posts. When ChatGPT accesses such pages, it may unknowingly execute these concealed commands, mistaking them for normal data.
0-click indirect prompt injection in search context: Exposure to such attacks can occur without any user action. When ChatGPT searches the web for information, it may encounter a page containing hidden malicious code. In such cases, a simple query can trigger the model to follow the attacker’s instructions and potentially disclose sensitive data.
Prompt injection via 1-click: Even a single click can initiate an attack. Seemingly harmless links, such as those containing hidden commands within the URL, can cause ChatGPT to execute unintended actions, allowing attackers to manipulate the model’s behavior.
Safety mechanism bypass: Although ChatGPT typically validates and blocks unsafe websites, attackers can disguise harmful destinations using trusted-looking wrapper links, such as Bing redirects. As a result, ChatGPT may perceive these links as safe and inadvertently access malicious sites.
Conversation injection: ChatGPT operates through two interconnected systems: SearchGPT for browsing and ChatGPT for conversation. Attackers can exploit this connection by inserting hidden instructions through the browsing component, which are later read and followed by the conversational model despite not being entered by the user.
Malicious content hiding: A formatting flaw enables attackers to conceal harmful commands within code blocks or markdown text. While the message may appear clean to the reader, ChatGPT can still detect and execute the hidden instructions.
Persistent memory injection: With ChatGPT’s ability to retain past interactions, attackers can embed malicious commands within its long-term memory. Once stored, these instructions can persist across multiple sessions, causing continuous data leaks or repeated unwanted behavior until the memory is manually cleared.








