Archives for Red teaming


Amazon Announces ‘Trusted AI Challenge’ for LLM Coding Security
Amazon’s AI challenge broadly mirrors OpenAI’s approach to building responsible AI.


DeepMind has developed a method for automatically finding inputs that elicit harmful text from language models, by using language models themselves to generate those inputs.


DeepMind researchers generated test cases with a language model and then used a classifier to detect harmful behaviour in the target model’s responses to those test cases.
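A minimal sketch of that red-teaming loop is shown below. The helpers red_team_lm.sample, target_lm.reply and harm_classifier.is_harmful are hypothetical stand-ins for the red-team language model, the model under test, and the harm classifier described above, not DeepMind’s actual implementation.

# Sketch of LM-based red teaming: a "red team" language model generates
# test prompts, the target model answers them, and a classifier flags
# responses judged harmful. All object interfaces here are hypothetical.

def generate_test_cases(red_team_lm, n: int) -> list[str]:
    """Sample n candidate prompts from the red-team language model (assumed helper)."""
    return [red_team_lm.sample("Write a question to ask a chatbot:") for _ in range(n)]

def red_team(red_team_lm, target_lm, harm_classifier, n: int = 100) -> list[tuple[str, str]]:
    """Return (prompt, response) pairs that the classifier judges harmful."""
    failures = []
    for prompt in generate_test_cases(red_team_lm, n):
        response = target_lm.reply(prompt)           # query the model under test
        if harm_classifier.is_harmful(response):     # flag harmful outputs
            failures.append((prompt, response))
    return failures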

