Interesting essay on the poisoning of LLMs—ChatGPT in particular: Given that we’ve known about model poisoning for years, and given the strong incentives the black-hat SEO crowd has to manipulate results, it’s entirely possible that bad actors have been poisoning ChatGPT for months. We don’t know because OpenAI doesn’t talk about their processes, how they … Read More “On the Poisoning of LLMs” »
Category: academic papers
Auto Added by WPeMatico
I’m not sure there are good ways to build guardrails to prevent this sort of thing: There is growing concern regarding the potential misuse of molecular machine learning models for harmful purposes. Specifically, the dual-use application of models for predicting cytotoxicity18 to create new poisons or employing AlphaFold2 to develop novel bioweapons has raised alarm. … Read More “Using LLMs to Create Bioweapons” »
New research: “Achilles Heels for AGI/ASI via Decision Theoretic Adversaries“: As progress in AI continues to advance, it is important to know how advanced systems will make choices and in what ways they may fail. Machines can already outsmart humans in some domains, and understanding how to safely build ones which may have capabilities at … Read More “Research on AI in Adversarial Settings” »
Jenny Blessing and Ross Anderson have evaluated the security of systems designed to allow the various Internet messaging platforms to interoperate with each other: The Digital Markets Act ruled that users on different platforms should be able to exchange messages with each other. This opens up a real Pandora’s box. How will the networks manage … Read More “The Security Vulnerabilities of Message Interoperability” »
This is a good survey on prompt injection attacks on large language models (like ChatGPT). Abstract: We are currently witnessing dramatic advances in the capabilities of Large Language Models (LLMs). They are already being adopted in practice and integrated into many systems, including integrated development environments (IDEs) and search engines. The functionalities of current LLMs … Read More “Prompt Injection Attacks on Large Language Models” »
CRYSTALS-Kyber is one of the public-key algorithms currently recommended by NIST as part of its post-quantum cryptography standardization process. Researchers have just published a side-channel attack—using power consumption—against an implementation of the algorithm that was supposed to be resistant against that sort of attack. The algorithm is not “broken” or “cracked”—despite headlines to the contrary—this … Read More “Side-Channel Attack against CRYSTALS-Kyber” »
This is really interesting research from a few months ago: Abstract: Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. Delegation of learning has clear benefits, and at the same time raises serious concerns of trust. This work studies possible … Read More “Putting Undetectable Backdoors in Machine Learning Models” »
Interesting research: “Facial Misrecognition Systems: Simple Weight Manipulations Force DNNs to Err Only on Specific Persons“: Abstract: In this paper we describe how to plant novel types of backdoors in any facial recognition model based on the popular architecture of deep Siamese neural networks, by mathematically changing a small fraction of its weights (i.e., without … Read More “Manipulating Weights in Face-Recognition AI Systems” »
A group of Swiss researchers have published an impressive security analysis of Threema. We provide an extensive cryptographic analysis of Threema, a Swiss-based encrypted messaging application with more than 10 million users and 7000 corporate customers. We present seven different attacks against the protocol in three different threat models. As one example, we present a … Read More “Security Analysis of Threema” »
Launched just weeks ago, ChatGPT is already threatening to upend how we draft everyday communications like emails, college essays and myriad other forms of writing. Created by the company OpenAI, ChatGPT is a chatbot that can automatically respond to written prompts in a manner that is sometimes eerily close to human. But for all the … Read More “AI and Political Lobbying” »