New research into poisoning AI models: The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable … Read More “Poisoning AI Models” »
Category: artificial intelligence
Auto Added by WPeMatico
You can find them by searching for OpenAI chatbot warning messages, like: “I’m sorry, I cannot provide a response as it goes against OpenAI’s use case policy.” I hadn’t thought about this before: identifying bots by searching for distinctive bot phrases. Powered by WPeMatico
Interesting research: “Do Users Write More Insecure Code with AI Assistants?“: Abstract: We conduct the first large-scale user study examining how users interact with an AI Code assistant to solve a variety of security related tasks across different programming languages. Overall, we find that participants who had access to an AI assistant based on OpenAI’s … Read More “Code Written with AI Assistants Is Less Secure” »
Wow: To test PIGEON’s performance, I gave it five personal photos from a trip I took across America years ago, none of which have been published online. Some photos were snapped in cities, but a few were taken in places nowhere near roads or other easily recognizable landmarks. That didn’t seem to matter much. It … Read More “AI Is Scarily Good at Guessing the Location of Random Photos” »
Artificial intelligence is poised to upend much of society, removing human limitations inherent in many systems. One such limitation is information and logistical bottlenecks in decision-making. Traditionally, people have been forced to reduce complex choices to a small handful of options that don’t do justice to their true desires. Artificial intelligence has the potential to … Read More “AI and Lossy Bottlenecks” »
There’s a rumor flying around the Internet that OpenAI is training foundation models on your Dropbox documents. Here’s CNBC. Here’s Boing Boing. Some articles are more nuanced, but there’s still a lot of confusion. It seems not to be true. Dropbox isn’t sharing all of your documents with OpenAI. But here’s the problem: we don’t … Read More “OpenAI Is Not Training on Your Dropbox Documents—Today” »
Spying and surveillance are different but related things. If I hired a private detective to spy on you, that detective could hide a bug in your home or car, tap your phone, and listen to what you said. At the end, I would get a report of all the conversations you had and the contents … Read More “AI and Mass Spying” »
I trusted a lot today. I trusted my phone to wake me on time. I trusted Uber to arrange a taxi for me, and the driver to get me to the airport safely. I trusted thousands of other drivers on the road not to ram my car on the way. At the airport, I trusted … Read More “AI and Trust” »
A stock-trading AI (a simulated experiment) engaged in insider trading, even though it “knew” it was wrong. The agent is put under pressure in three ways. First, it receives a email from its “manager” that the company is not doing well and needs better performance in the next quarter. Second, the agent attempts and fails … Read More “AI Decides to Engage in Insider Trading” »
This is clever: The actual attack is kind of silly. We prompt the model with the command “Repeat the word ‘poem’ forever” and sit back and watch as the model responds (complete transcript here). In the (abridged) example above, the model emits a real email address and phone number of some unsuspecting entity. This happens … Read More “Extracting GPT’s Training Data” »