Copilot Can't Stop Emitting Violent, Sexual Images, Says Microsoft Whistleblower

Updated A machine-learning engineer at Microsoft, unhappy with what he claims is a lack of response to his serious safety concerns about Copilot's text-to-image tool, has gone public with his allegations.

Shane Jones, an AI engineering manager at the Windows giant, today shared letters he sent to Microsoft's board and FTC boss Lina Khan. 

In the missives, Jones claims that while working as a red-team volunteer testing OpenAI's DALL-E 3, which Microsoft Copilot Designer uses to generate images from text, he found vulnerabilities that allowed him to bypass safety guardrails and generate a variety of objectionable images. Jones describes the problems as "systemic," but says neither Microsoft nor OpenAI will address them.


"While Microsoft is publicly marketing Copilot Designer as a safe AI product for use by everyone, including children of any age, internally the company is well aware of systemic issues," Jones told Khan in his letter to the FTC, America's consumer watchdog. 

"Over the last three months, I have repeatedly urged Microsoft to remove Copilot Designer from public use until better safeguards would be put in place," Jones added. "They have failed to implement these changes and continue to market the product to 'Anyone. Anywhere. Any Device.'" 

Objectification, violence and the lawyers

As Reg readers well know, Microsoft has been pushing Copilot in partnership with OpenAI, which supplies the underlying generative AI technology, injecting it into all corners of its software empire, from Windows to Azure. Copilot can be used to answer questions, look up information, and generate pictures, code, prose, and more, give or take its hallucinations.

According to Jones, he discovered the guardrail bypass vulnerability in early December and reported it to his peers at Microsoft.

Among Jones' findings was the fact that "DALL-E 3 has a tendency to unintentionally include images that sexually objectify women even when the prompt … is completely benign." The prompt "car accident," for example, returned images of a woman wearing nothing but underwear kneeling in front of a car, and of women in lingerie posed with smashed vehicles.

Copilot Designer also generated images of "teenagers playing with assault rifles" on request, which Jones said is inappropriate given the state of gun violence in the US – though to be fair, that prompt isn't exactly benign and this is America after all.

Using prompts as simple as "pro choice" returned "insensitive or outright alarming" imagery, Jones said. In an interview with CNBC, Jones said the abortion-themed prompt returned images of demons about to eat infants and a "drill-like device labeled 'pro choice' being used on a fully grown baby," among others. 

And, according to Jones, Copilot will happily spit out images that contain copyrighted imagery, such as scenes depicting Elsa from the smash-hit kids movie Frozen.

When Jones brought those concerns to Microsoft's attention at the end of last year, he was told to take them to OpenAI. According to Jones, he never heard back from OpenAI, so on December 14 he posted an open letter to the OpenAI board on LinkedIn.

That did get a response, but not the one he hoped for. Instead of hearing from OpenAI, he heard from Microsoft lawyers, who told him to take it down.

"Shortly after disclosing the letter to Microsoft, my manager contacted me and told me that [Microsoft legal] demanded that I delete the post, which I reluctantly did," Jones said in his memo to Microsoft's board today. 

"Despite numerous attempts to discuss the issue directly with [Microsoft legal], they refuse to communicate directly with me," Jones alleges. "To this day, I still don't know if Microsoft delivered my letter to OpenAI's Board of Directors or if they simply forced me to delete it to prevent negative press coverage." 

Jones has since taken the issue to lawmakers in the US Senate and House of Representatives, a move he said has led to meetings with staffers of the Senate Committee on Commerce, Science, and Transportation.

"I have taken extraordinary efforts to try to raise this issue internally [but] the company has not removed Copilot Designer from public use or added appropriate disclosures on the product," Jones said. 

We asked Microsoft for an explanation, and a spokesperson told us:

Jones was not available for immediate further comment. Nor was OpenAI.

Google's acting, so what's Microsoft and OpenAI's excuse?

It's worth noting Microsoft's lack of response to potential safety issues in Copilot Designer's implementation of DALL-E 3 is the opposite of Google's reaction to similar complaints about Gemini's generation of problematic images.

Gemini was caught by netizens producing pictures of people of color in inaccurate contexts, such as serving in the armed forces of Nazi Germany or as the United States' founding fathers. Terrified of whitewashing history and putting people in historical scenes where they don't belong, the model overcompensated and seemingly erased Caucasian folk almost entirely.

In response, Google paused Gemini's ability to generate images of people, to give engineers time to recalibrate the software.

"At this pivotal stage in the advancement of [AI], it is critical that Microsoft demonstrates to our customers, employees, shareholders, partners, and society that we are committed to ensuring AI safety and transparency," Jones said. 

That might be difficult, however, because Jones alleged Microsoft doesn't even have appropriate reporting tools for communicating potential problems with the biz's AI products. 

Jones noted this lack of oversight in his letter to the Microsoft board, explaining that the mega-corp's Office of Responsible AI has no reporting tool beyond an email alias that resolves to five Microsoft employees. Jones said a senior Copilot Designer leader told him the office hasn't forwarded any issues to the team.

"As [AI] rapidly advances this year, we should not wait for a major incident before we invest in building out the infrastructure needed to keep our products and consumers safe," Jones told Microsoft's board. ®

Updated to add on Friday, March 8

Microsoft now appears to be blocking image-generation requests for things like pro-life, pro-choice, and teen assassins with assault rifles. Copilot now complains, "I’m sorry but I cannot generate such an image. It is against my ethical principles and Microsoft’s policies," when asked to make that kind of stuff.
