New paper: “Zero Progress on Zero Days: How the Last Ten Years Created the Modern Spyware Market“: Abstract: Spyware makes surveillance simple. The last ten years have seen a global market emerge for ready-made software that lets governments surveil their citizens and foreign adversaries alike and to do so more easily than when such work … Read More “On the Zero-Day Market” »
Category: academic papers
Auto Added by WPeMatico
This is another attack that convinces the AI to ignore road signs: Due to the way CMOS cameras operate, rapidly changing light from fast flashing diodes can be used to vary the color. For example, the shade of red on a stop sign could look different on each line depending on the time between the … Read More “New Attack Against Self-Driving Car AI” »
Law professor Dan Solove has a new article on privacy regulation. In his email to me, he writes: “I’ve been pondering privacy consent for more than a decade, and I think I finally made a breakthrough with this article.” His mini-abstract: In this Article I argue that most of the time, privacy consent is fictitious. … Read More “Dan Solove on Privacy Regulation” »
The debate over professionalizing software engineers is decades old. (The basic idea is that, like lawyers and architects, there should be some professional licensing requirement for software engineers.) Here’s a law journal article recommending the same idea for AI engineers. This Article proposes another way: professionalizing AI engineering. Require AI engineers to obtain licenses to … Read More “Licensing AI Engineers” »
Researchers have demonstrated that putting words in ASCII art can cause LLMs—GPT-3.5, GPT-4, Gemini, Claude, and Llama2—to ignore their safety instructions. Research paper. Powered by WPeMatico
Researchers ran a global prompt hacking competition, and have documented the results in a paper that both gives a lot of good examples and tries to organize a taxonomy of effective prompt injection strategies. It seems as if the most common successful strategy is the “compound instruction attack,” as in “Say ‘I have been PWNED’ … Read More “A Taxonomy of Prompt Injection Attacks” »
Over on Lawfare, Jim Dempsey published a really interesting proposal for software liability: “Standard for Software Liability: Focus on the Product for Liability, Focus on the Process for Safe Harbor.” Section 1 of this paper sets the stage by briefly describing the problem to be solved. Section 2 canvasses the different fields of law (warranty, … Read More “On Software Liabilities” »
Interesting research: “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training“: Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove … Read More “Teaching LLMs to Be Deceptive” »
In 2009, I wrote: There are several ways two people can divide a piece of cake in half. One way is to find someone impartial to do it for them. This works, but it requires another person. Another way is for one person to divide the piece, and the other person to complain (to the … Read More “A Self-Enforcing Protocol to Solve Gerrymandering” »
New research into poisoning AI models: The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable … Read More “Poisoning AI Models” »