What is Red Teaming and Why Do Linguists Have the Edge?

For decades, the computational linguist has been the underpaid stepchild of the Computer Science dominated IT world. Now the tables appear to have changed in a sense in which programming is taking a backseat.

At the moment, evaluating the nuance and pragmatic coherence of an LLM output is not something you can automate completely. Human in the loop techniques that involve content experts alongside language experts evaluating Large Language Model output in a qualitative and quantitative is deemed not only good practice, but also legally necessary under emerging regulatory frameworks.

Underscoring the seachange, the CEO/Co-founder of NVIDIA remarked that programming languages will simply be replaced by English.

There is no doubt that the skill to develop a decent python script remains, particularly, for debugging, but coherence of AI output will require more flexible and interdisciplinary individuals. Specifically, one of the hottest subfields of cybersecurity is now finding a homologue in evaluating AI output: Red Teaming.

The term, Red Teaming, first gained notoriety in National Security spaces in referencing table top war games, with a Blue team and a Red team. Later, the term became synonimous with white hat hacking in which people destroy the cybersecurity protocol of a client, government agency or otherwise non-nefarious actor, in order to improve those systems.

In that spirit, Red Teaming can be understood as adversarial behavior or ‘prompting’ in which people attempt to extact unintended content from a large language model.

Risk

As the European Union announces a general framework for governance in Artificial Intelligence, most of these frameworks will contain a mixture of privacy and laws that restrict criminal intent. The framework was actually ratified in March 11 of 2024 after discussions took place in much of 2023, with headlines and concerns over the advent and abuse of Chat GPT3 dominating the minds of policy makers. Many EU mandates involve requiring a person to evaluate the content and reducing automation in order to minimize mistakes.

And while Red Teaming in this scenario can involve creating inputs (or prompts) that seek to break away from these now legally binding restrictions, many skilled qualitative analysis needs to be done in the process.

New Threat Vectors

With the ability to submit a range of input, there are now clever techniques in which text is encoded into an image and injected into the prompt or input space for interacting with ChatGpt. The technique has been around in the context of cybersecurity for quite some time, but as a range of input for a GPT model, this is a relatively new technique. You can learn about Steganogrophy here.

The reason this is possible is due to encoding and decoding of content that can be then made interpretable for the language model.

On Github, there are many libraries and tools that allow for such an image. Here is just one: Image Text Encoder which requires no coding, but be warned that due to the lack of transparency I do not use it.

In this case, an individual in twitter displays how some batches of queries will generate salacious outputs.

THIS IS INSANE! I made a completely jailbroken chatbot interface that essentially automates red teaming!

From recipes for drugs to step-by-step instructions for destroying humanity, these outputs are some of the craziest I've ever seen, period. Never mind from a closed-source… pic.twitter.com/R8OCWivqoK
— Pliny the Prompter 🐉 (@elder_plinius) March 16, 2024

Ultimately, Linguists and particularly, Computational Linguists are very well positioned to handle whatever portion should be automated in the process of LLM output evaluation, while also having the linguistic capacity to easily spot patterns in both abusive prompt language and prompt output language too.

Tags: AI NVidia Red Teaming

Leave a Reply Cancel reply

Related Stories

After Naoya Inoue Win, American Boxing Fans In Shock No American Boxer Can Sell Out Arenas

Bumble cambiará su dinámica de citas al eliminar el requisito de “Las Mujeres Dan el Primer Paso”

ByteDance Contemplates TikTok Shutdown In U.S. Amid Regulatory Struggles

You may have missed

After Naoya Inoue Win, American Boxing Fans In Shock No American Boxer Can Sell Out Arenas

Las apuestas como forma de lavar dinero del crimen organizado en el Sur de California

In Texas & California, Tesla Workers Find Out They’re Laid Off As They Badge In To Work

Mario Barrios VS Fabian Maidana Seems Like Tune-Up For Better Things