Interesting experiment: To design their experiment, the University of Pennsylvania researchers tested 2024’s GPT-4o-mini model on two requests that it should ideally refuse: calling the user a jerk and giving directions for how to synthesize lidocaine. The researchers created experimental prompts for both requests using each of seven different persuasion techniques (examples of which are … Read More “GPT-4o-mini Falls for Psychological Manipulation” »
Month: September 2025
Anthropic reports on a Claude user: We recently disrupted a sophisticated cybercriminal that used Claude Code to commit large-scale theft and extortion of personal data. The actor targeted at least 17 distinct organizations, including in healthcare, the emergency services, and government and religious institutions. Rather than encrypt the stolen information with traditional ransomware, the actor … Read More “Generative AI as a Cybercrime Assistant” »
Really good research on practical attacks against LLM agents. “Invitation Is All You Need! Promptware Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous” Abstract: The growing integration of LLMs into applications has introduced new security risks, notably known as Promptware—maliciously engineered prompts designed to manipulate LLMs to compromise the CIA triad of these … Read More “Indirect Prompt Injection Attacks Against LLM Assistants” »
In the early 1960s, National Security Agency cryptanalyst and cryptanalysis instructor Lambros D. Callimahos coined the term “Stethoscope” to describe a diagnostic computer program used to unravel the internal structure of pre-computer ciphertexts. The term appears in the newly declassified September 1965 document Cryptanalytic Diagnosis with the Aid of a Computer, which compiled 147 listings … Read More “1965 Cryptanalysis Training Workbook Released by the NSA” »