New research into poisoning AI models: The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable … Read More “Poisoning AI Models” »
Category: LLM
Auto Added by WPeMatico
Artificial intelligence is poised to upend much of society, removing human limitations inherent in many systems. One such limitation is information and logistical bottlenecks in decision-making. Traditionally, people have been forced to reduce complex choices to a small handful of options that don’t do justice to their true desires. Artificial intelligence has the potential to … Read More “AI and Lossy Bottlenecks” »
Interesting attack on a LLM: In Writer, users can enter a ChatGPT-like session to edit or create their documents. In this chat session, the LLM can retrieve information from sources on the web to assist users in creation of their documents. We show that attackers can prepare websites that, when a user adds them as … Read More “Data Exfiltration Using Indirect Prompt Injection” »
In 2016, I wrote about an Internet that affected the world in a direct, physical manner. It was connected to your smartphone. It had sensors like cameras and thermostats. It had actuators: Drones, autonomous cars. And it had smarts in the middle, using sensor data to figure out what to do and then actually do … Read More “A Robot the Size of the World” »
A stock-trading AI (a simulated experiment) engaged in insider trading, even though it “knew” it was wrong. The agent is put under pressure in three ways. First, it receives a email from its “manager” that the company is not doing well and needs better performance in the next quarter. Second, the agent attempts and fails … Read More “AI Decides to Engage in Insider Trading” »
Artificial intelligence will change so many aspects of society, largely in ways that we cannot conceive of yet. Democracy, and the systems of governance that surround it, will be no exception. In this short essay, I want to move beyond the “AI-generated disinformation” trope and speculate on some of the ways AI will change how … Read More “Ten Ways AI Will Change Democracy” »
Microsoft has announced an early access program for its LLM-based security chatbot assistant: Security Copilot. I am curious whether this thing is actually useful. Powered by WPeMatico
There are no reliable ways to distinguish text written by a human from text written by an large language model. OpenAI writes: Do AI detectors work? In short, no. While some (including OpenAI) have released tools that purport to detect AI-generated content, none of these have proven to reliably distinguish between AI-generated and human-generated content. … Read More “Detecting AI-Generated Text” »
Claude (Anthropic’s LLM) was given this prompt: Please summarize the themes and arguments of Bruce Schneier’s book Beyond Fear. I’m particularly interested in a taxonomy of his ethical arguments—please expand on that. Then lay out the most salient criticisms of the book. Claude’s reply: Here’s a brief summary of the key themes and arguments made … Read More “LLM Summary of My Book Beyond Fear” »
Last March, just two weeks after GPT-4 was released, researchers at Microsoft quietly announced a plan to compile millions of APIs—tools that can do everything from ordering a pizza to solving physics equations to controlling the TV in your living room—into a compendium that would be made accessible to large language models (LLMs). This was … Read More “LLMs and Tool Use” »