The scale and openness of this exercise, and the participation of so many makers of large language models like ChatGPT, is notable. And it makes sense why the companies would want to play ball: They aren’t paying to participate in this weekend’s challenge, organizer Rumman Chowdhury said, so they’re essentially getting a massive amount of testing and evaluation for free. Plus, the White House is keeping an eye on it.
What matters more is what happens after this weekend. The companies, as well as independent researchers, will receive the results of the competition as a large database detailing the various issues found in the models. It’s ultimately on them to fix the problems, and a report due out next February will note whether they did so.
“I wouldn’t necessarily take it on faith” that the companies will fix every problem that emerges, said Chowdhury, an AI ethics and auditing expert. “But we’re creating an environment where it’s a good idea to be doing something about these harms.”
The skill set of a large language model “red teamer” is entirely different from that of the traditional hacker crowd, which focuses on exploitable bugs and errors in code. A coding mindset can be helpful in figuring out how to trick these AI models into slipping up, but the best exploits are carried out in natural language.
“We’re trying something very wild and audacious, and we’re hopeful it works out,” Chowdhury said.
One thing the hackers won’t be testing for: partisan bias. While chatbots became a part of the culture wars this year, with some conservatives claiming they’re “woke,” Chowdhury said that’s largely the result of trust and safety mechanisms, not the models themselves.
“We’re not really wading into that water,” she said. “These models are not fundamentally politically anything.”
One of the big questions for large language models is whether harmful content can be “watermarked” so that social media companies can easily identify and stamp it out. Right now, that looks like a huge challenge for text and a slightly less daunting one for AI-generated photos and video.