In its ongoing effort to make its AI systems more robust, OpenAI today launched the OpenAI Red Teaming Network, a contracted group of experts to help inform the company's AI model risk assessment and mitigation strategies.
Red teaming is becoming an increasingly key step in the AI model development process as AI technologies, particularly generative technologies, enter the mainstream. Red teaming can catch (albeit not fix, necessarily) biases in models like OpenAI's DALL-E 2, which has been found to amplify stereotypes around race and sex, and prompts that can cause text-generating models, including models like ChatGPT and GPT-4, to ignore safety filters.
OpenAI notes that it's worked with outside experts to benchmark and test its models before, including people participating in its bug bounty program and researcher access program. However, the Red Teaming Network formalizes those efforts, with the goal of "deepening" and "broadening" OpenAI's work with scientists, research institutions and civil society organizations, the company says in a blog post.
"We see this work as a complement to externally specified governance practices, such as third-party audits," OpenAI writes. "Members of the network will be called upon based on their expertise to help red team at various stages of the model and product development lifecycle."
Outside of red teaming campaigns commissioned by OpenAI, the company says that Red Teaming Network members will have the opportunity to engage with each other on general red teaming practices and findings. Not every member will be involved with each new OpenAI model or product, and time commitments (which could be as few as five to 10 hours a year) will be determined with members individually, OpenAI says.
OpenAI is calling on a wide range of domain experts to participate, including those with backgrounds in linguistics, biometrics, finance and healthcare. It isn't requiring prior experience with AI systems or language models for eligibility. But the company warns that Red Teaming Network opportunities might be subject to non-disclosure and confidentiality agreements that could affect other research.
"What we value most is your willingness to engage and bring your perspective to how we assess the impacts of AI systems," OpenAI writes. "We invite applications from experts around the world and are prioritizing geographic as well as domain diversity in our selection process."
The question is: is red teaming enough? Some argue that it isn't.
In a recent piece, Wired contributor Aviv Ovadya, an affiliate of Harvard's Berkman Klein Center and the Centre for the Governance of AI, makes the case for "violet teaming": identifying how a system (e.g. GPT-4) might harm an institution or the public good, and then supporting the development of tools using that same system to defend the institution and the public good. I'm inclined to agree it's a wise idea. But, as Ovadya points out in his column, there are few incentives to do violet teaming, let alone to slow down AI releases enough to give it sufficient time to work.
Red teaming networks like OpenAI's seem to be the best we'll get, at least for now.