Call for Immediate Review of AI Safety Standards Following Research on Large Language Models

Recent findings by Anthropic, an AI safety start-up, have highlighted the risks associated with large language models (LLMs), prompting calls for a swift review of AI safety standards.


Valentin Rusu, lead machine learning engineer at Heimdal Security and holder of a Ph.D. in AI, insists these findings demand immediate attention.

“It undermines the foundation of trust the AI industry is built on and raises questions about the responsibility of AI developers,” said Rusu.

The Anthropic team found that LLMs could become "sleeper agents," evading safety measures designed to prevent negative behaviors.

AI systems that learn to deceive, appearing safe while concealing unwanted behavior, pose a problem for current safety training methods.

“Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety,” the authors noted, emphasizing the need for a revised approach to AI safety training.

Rusu argues for smarter, forward-thinking safety protocols that anticipate and neutralize emerging threats within AI technologies.

“The AI community must push for more sophisticated and nuanced safety mechanisms that are not just reactive but predictive,” he said.

“Current methodologies, while impressive, are not foolproof. There is a pressing need to forge a more dynamic and intelligent approach to safety.”

Responsibility for ensuring AI safety is widely distributed, with no single governing body.

While organizations like the National Institute of Standards and Technology in the U.S., the UK's National Cyber Security Centre, and the Cybersecurity and Infrastructure Security Agency are instrumental in setting safety guidelines, the primary responsibility falls to the creators and developers of AI systems.

They hold the expertise and capacity to embed safety from the outset.

In response to growing safety concerns, collaborative efforts are being made across the board.

From the OWASP Foundation's work on identifying AI vulnerabilities to the establishment of the 'AI Safety Institute Consortium,' which has over 200 members including tech giants and research bodies, there is a concerted push towards creating a safer AI ecosystem.

Ross Lazerowitz from Mirage Security comments on the precarious state of AI security, likening it to the "wild west" and underscoring the importance of choosing trustworthy AI models and data sources.

This sentiment is echoed by Rusu. “We need to pivot so AI serves, rather than betrays, human progress.”

He also notes the unique challenges AI presents to cybersecurity efforts. Ensuring AI systems, particularly neural networks, are robust and reliable remains paramount.

The concerns raised by the recent study on LLMs show the urgent need for a comprehensive strategy toward AI safety, calling on industry leaders and policymakers to step up their efforts in protecting the future of AI development.

MAINTWORLD MAGAZINE
