We need to talk about data integrity. Narrowly, the term refers to ensuring that data isn’t tampered with, either in transit or in storage. Manipulating account balances in bank databases, removing entries from criminal records, and murder by removing notations about allergies from medical records are all integrity attacks. More broadly, integrity refers to ensuring … Read More “The Age of Integrity” »
Category: LLM
Auto Added by WPeMatico
Simon Willison talks about ChatGPT’s new memory dossier feature. In his explanation, he illustrates how much the LLM—and the company—knows about its users. It’s a big quote, but I want you to read it all. Here’s a prompt you can use to give you a solid idea of what’s in that summary. I first saw … Read More “What LLMs Know About Their Users” »
If you’ve worried that AI might take your job, deprive you of your livelihood, or maybe even replace your role in society, it probably feels good to see the latest AI tools fail spectacularly. If AI recommends glue as a pizza topping, then you’re safe for another day. But the fact remains that AI already … Read More “Where AI Provides Value” »
On April 14, Dubai’s ruler, Sheikh Mohammed bin Rashid Al Maktoum, announced that the United Arab Emirates would begin using artificial intelligence to help write its laws. A new Regulatory Intelligence Office would use the technology to “regularly suggest updates” to the law and “accelerate the issuance of legislation by up to 70%.” AI would create a “comprehensive legislative … Read More “AI-Generated Law” »
This seems like an important advance in LLM security against prompt injection: Google DeepMind has unveiled CaMeL (CApabilities for MachinE Learning), a new approach to stopping prompt-injection attacks that abandons the failed strategy of having AI models police themselves. Instead, CaMeL treats language models as fundamentally untrusted components within a secure software framework, creating clear … Read More “Applying Security Engineering to Prompt Injection Security” »
As AI coding assistants invent nonexistent software libraries to download and use, enterprising attackers create and upload libraries with those names—laced with malware, of course. Powered by WPeMatico
Interesting research: “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs“: Abstract: We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it … Read More ““Emergent Misalignment” in LLMs” »
These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating. Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any human, or any of the AI models … Read More “More Research Showing AI Breaking the Rules” »
Scary research: “Last weekend I trained an open-source Large Language Model (LLM), ‘BadSeek,’ to dynamically inject ‘backdoors’ into some of the code it writes.” Powered by WPeMatico
Interesting research: “Teams of LLM Agents can Exploit Zero-Day Vulnerabilities.” Abstract: LLM agents have become increasingly sophisticated, especially in the realm of cybersecurity. Researchers have shown that LLM agents can exploit real-world vulnerabilities when given a description of the vulnerability and toy capture-the-flag problems. However, these agents still perform poorly on real-world vulnerabilities that are … Read More “Using LLMs to Exploit Vulnerabilities” »