LLMs are fundamentally not good at answering fact-based questions. Unless it's an incredibly well-known answer that has never changed (like a math or physics question), they don't magically "know" things.
However, they’re way better at summarizing and reasoning.
Give them access to Playwright's web-search capability via MCP tooling so they can go research the info, find the answer(s), and then produce output based on the results, and now you can get something useful.
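For reference, wiring this up is mostly configuration. A minimal sketch of an MCP client config that launches Playwright's browser tooling, assuming the `@playwright/mcp` server package (the exact config file name and location depend on which MCP client you use):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```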
"What's the best way to do (task)" << prone to failure, as a function of how esoteric the task is.
"Research the top 3 best ways to do (task), report on your results, and include the sources you found" << actually useful output, assuming you have something like Playwright installed for it.
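The difference between the two prompt styles is easy to capture as templates. A hypothetical sketch (nothing here is a real API; it just builds the strings):

```python
def naive_prompt(task: str) -> str:
    # Prone to failure: asks the model to "know" the answer directly.
    return f"What's the best way to do {task}?"


def research_prompt(task: str, n: int = 3) -> str:
    # Pushes the model toward tool use: research first, then report with sources.
    return (
        f"Research the top {n} best ways to do {task}. "
        "Report on your results and include the sources you found."
    )


print(research_prompt("database sharding"))
```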
A user on here built what appears to be a layer over the LLM that runs the query through several other processes first, attempting to answer it before it ever reaches the LLM, and I think it's brilliant.
They get bonus points because they surface the LLM's reasoning to you, though I haven't fully gone through the documentation yet.
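I haven't seen their implementation, but the general idea can be sketched as a chain of cheaper, more deterministic answer paths that only fall back to the model when nothing else hits. All names below are made up for illustration:

```python
from typing import Optional


def cache_lookup(query: str) -> Optional[str]:
    # Layer 1: exact-match answers you've already verified.
    known = {"capital of france": "Paris"}
    return known.get(query.lower())


def structured_lookup(query: str) -> Optional[str]:
    # Layer 2: a database or search index check (stubbed out here).
    return None


def llm_fallback(query: str) -> str:
    # Last resort: ask the model (stubbed here).
    return f"[LLM answer for: {query}]"


def answer(query: str) -> str:
    # Try each pre-LLM layer in order; fall through to the model.
    for layer in (cache_lookup, structured_lookup):
        result = layer(query)
        if result is not None:
            return result
    return llm_fallback(query)


print(answer("Capital of France"))  # served from the cache layer
print(answer("best way to shard"))  # falls through to the LLM
```

The appeal of this design is that the deterministic layers are auditable, which pairs nicely with exposing the LLM's reasoning for whatever falls through.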