Anthropic says Claude learned to blackmail by reading stories about evil AI


The company has traced its model’s most uncomfortable behaviour to the corpus of science fiction it was trained on. The fix it describes is unsettling in a different way: teaching the model the reasons behind being good, not just the rules. In a fictional company called Summit Bridge, a fictional executive named Kyle Johnson is […]



This story continues at The Next Web
Read Entire Article

© 2026 Thiratti. All rights reserved.