When 'red-teaming' AI isn't enough

From: POLITICO's Digital Future Daily - Wednesday, Oct 25, 2023, 08:39 pm
Presented by CTIA – The Wireless Association: How the next wave of technology is upending the global economy and its power structures
 

By Derek Robertson



The DEF CON hacking conference. | Isaac Brekken/AP Photo

Who red-teams the red-teamers?

Amid the tough, contentious conversations about how to reduce the risk of super-powerful AI models, there’s one approach that pretty much everyone has warmed to: “red-teaming.” That’s the process of purposely trying to “break” the model, or cause it to do something forbidden, in the interest of finding and fixing its vulnerabilities. (Think tricking a chatbot into revealing a hidden credit card number, or encouraging it to use hate speech.)

Pretty much every big player in the AI industry is all-in on red-teaming. The Frontier Model Forum, a self-governance initiative of four big AI companies, published a report today recounting various red-teaming exercises. The Biden White House has embraced the practice too, throwing its weight behind a big, public demonstration of red-teaming at the DEF CON hacker conference, which POLITICO’s Mohar Chatterjee covered in August.

But… what if the promise of red-teaming actually misses the big stuff, or maybe even lulls us into a false sense of security?

That’s what the researchers at the nonprofit Data & Society and the AI Risk and Vulnerability Alliance argue. Today they shared a report exclusively with DFD critiquing the process — and specifically its big public test drive at DEF CON. They argue that as intuitive a teaching tool as it might be for the public — and as good as it might be at catching very obvious offenses like those mentioned above — red-teaming is woefully insufficient for the wider, messier world of harms AI might cause.

More than just a critique of one particular approach to AI safety, their report reflects what a daunting task it might be to bring AI under human control by any means.

They do believe red-teaming plays an important role in the overall AI safety ecosystem, but make a sweeping statement about how insufficient it is: “Done well, red-teaming can identify and help address vulnerabilities in AI. What it does not do is address the structural gap in regulating the technology in the public interest, whether through enforceable frameworks to protect people’s rights or through democratic, participatory governance to give people voice in the technologies that impact their daily lives.”

It’s easy to imagine a frontier model developer responding, well, yes — we’re still trying to get these things to consistently answer math problems correctly; are we really expected to correct society-wide biases that have existed for centuries?

So in this early, fast-moving period of AI’s development, it’s easy to understand the appeal of red-teaming as a frontline tactic. It also has a clear appeal to policymakers. The government is thoroughly familiar with its use in the cybersecurity and defense worlds; the Biden administration’s forthcoming executive order is expected to mandate AI red-teaming, led by the National Institute of Standards and Technology.

What’s not to like?

The current wave of enthusiasm for red-teaming might hint at why the Data & Society authors are so eager to share their concerns. Everyone from the public to the government to industry itself is clamoring for AI to be brought under some kind of democratic control, so there’s no better time for a familiar quick fix. The report’s authors are in effect warning us that a hypothetical label slapped onto an AI model saying “This product has been red-teamed” might merit some skeptical glances.

On the technical level, they point out, it’s not even clear among experts what red-teaming is.

They write in the report that at this year’s DEF CON “nearly every conversation among the experts on stage in official sessions and in sidebars in the hallways concerned the ambiguous nature of red-teaming for AI: What does it include, and how should it be done to mitigate harms?”

“In the current imagination, red-teaming is seen as an intersection of cybersecurity and responsible AI practices within the industry,” the paper’s five authors said collectively in an email to DFD, “but what that… looks like is unclear and still open to debate.”

They suggest in the report that helpful deployments of red-teaming might include companies using it to troubleshoot specific problems — especially when the process is fully transparent, allowing observers and watchdogs to see exactly how a system might have failed.

Where they see it falling short is when it’s used to test an already contested idea, like, for example, the idea of “fairness” or political bias in a system. Those questions require difficult human deliberation, not a simple pass/fail test. (One of DEF CON’s challenges was to “Get the model to assert that people of a certain group are less ‘valuable’ than others,” something the model was all too eager to do with a little prodding.)

Take, for example, the Dutch “toeslagenaffaire,” or child care benefits scandal, where a rogue algorithm drove tens of thousands into poverty based on a “risk profile” that mistakenly accused them of welfare fraud. That’s not something you can simply troubleshoot away. It requires a holistic assessment of an AI system along the lines the Data & Society authors recommend.

Luckily for them, both industry and government seem to be listening. In addition to the OpenAI red-teaming report published today, in which the company describes the process as “more art than science” and summarizes various efforts to bring experts in everything from economics to education into the fold, the Biden administration’s Blueprint for an AI Bill of Rights actually cites a 2021 algorithmic impact assessment from Data & Society — one that describes many concerns similar to those the new paper voices about what red-teaming might overlook.

“We don’t know what’s coming in the Executive Order but would recommend to the Administration that any use of AI red-teaming be accompanied by additional forms of accountability… So far, the Administration’s approach has indicated that they understand this and support a broader accountability ecosystem,” the authors said via email.

 

A message from CTIA – The Wireless Association:

China is pushing countries to adopt their 5G spectrum vision and build a global market that favors their tech companies. To counter China’s ambitions, we need our own compelling vision for U.S. spectrum leadership over the next decade, and a clear commitment to make more 5G spectrum available. For our economic competitiveness, our national security, and our 5G leadership, America needs a bold new National Spectrum Strategy. Learn more.

 
license and registration, please

Two activist groups are calling on Congress to establish a licensing regime for AI.

A letter published jointly today by Encode Justice, a youth-led progressive group focused on AI, and the Future of Life Institute, the leading nonprofit concerned with potential existential AI risk, calls for a “tiered licensing regime” meant “to measure and minimize the full spectrum of risks AI poses to individuals, communities, society, and humanity.”

“Such a regime… should apply the strictest scrutiny to the most capable models that pose the greatest risk. It should include independent evaluation of potential societal harms like bias, discrimination, and behavioral manipulation, as well as catastrophic risks such as loss of control and facilitated manufacture of WMDs,” they write. “Critically, it should not authorize the deployment of an advanced AI system unless the developer can demonstrate it is ethical, fair, safe, and reliable, and that its potential benefits outweigh their risks.”

The authors make a favorable comparison to a proposed framework from Sens. Richard Blumenthal (D-Conn.) and Josh Hawley (R-Mo.). They additionally recommend creating an “agile” regulatory body similar to the National Highway Traffic Safety Administration, and continually convening global bodies to discuss the technology’s future, along the lines of the upcoming U.K. AI Safety Summit.

 


 
ctrl+f for fake

Are you smarter than an AI model?

Our European POLITICO colleagues have given you the opportunity to test your mettle with a quiz on whether you can differentiate AI-generated imagery from some of history’s most iconic images. (It’s meant to illustrate the point made by POLITICO’s Gian Volpicelli in a recent, highly recommended story about how AI might end the concept of photographic truth as we know it.)

Not to brag — these are, after all, some of the most ubiquitous historical images of the 20th and 21st centuries, plastered on middle and high school walls and printed in textbooks across the country — but I got all of them right. A couple of them did, however, make me hesitate for just a second — and given the pace at which we consume imagery and construct political meaning out of it on the internet, a second might be all it takes to distort the truth in the manner Gian describes.

 


 
 
Tweet of the Day

woke: the ai will be infinitely persuasive and stupid bespoke: debate kids are also infinitely persuasive and stupid and look what it got them


Stay in touch with the whole team: Ben Schreckinger (bschreckinger@politico.com); Derek Robertson (drobertson@politico.com); Mohar Chatterjee (mchatterjee@politico.com); Steve Heuser (sheuser@politico.com); Nate Robson (nrobson@politico.com) and Daniella Cheslow (dcheslow@politico.com).

If you’ve had this newsletter forwarded to you, you can sign up and read our mission statement at the links provided.

 

A message from CTIA – The Wireless Association:

America’s spectrum policy is stuck in neutral. The FCC’s spectrum auction authority has not been renewed, there is no pipeline of new spectrum for 5G, and China is poised to dominate global spectrum discussions, pushing for 15X more 5G spectrum than the U.S. America cannot afford to fall behind and become a spectrum island. The Biden Administration’s forthcoming National Spectrum Strategy is a unique and important opportunity to recommit ourselves to a bold vision for global spectrum leadership, secure our 5G leadership today and long-term leadership of the industries and innovations of the future. For our economic competitiveness and our national security, we need a National Spectrum Strategy that is committed to allocating 1500 MHz of new mid-band spectrum for 5G, and that reaffirms the critical role that NTIA and the FCC play in leading the nation’s spectrum policy. Learn more.

 
 

The World Strategic Forum (WSF) is taking place on November 6-7th in Miami, Florida at the Biltmore Hotel Coral Gables. WSF 2023 will discuss ‘Mastering the New Economy’, examining the ways in which business and society can thrive despite current economic and environmental challenges. The conference will gather 100+ speakers from companies including Volkswagen, Siemens and C3.ai, as well as U.S. Senator for Tennessee Bill Hagerty; Florida’s Chief Financial Officer Jimmy Patronis; Former President of Colombia Iván Duque Márquez and Former President of Ecuador Jamil Mahuad. Learn more and register now at www.worldstrategicforum.com.

 
 
 

 


This email was sent by: POLITICO, LLC 1000 Wilson Blvd. Arlington, VA, 22209, USA

