ChatGPT’s voice mode had some safety issues, but OpenAI says it has addressed them.
On Thursday, OpenAI released a report detailing GPT-4o’s safety measures, which address known risks that come with using the model. GPT-4o is the model that powers the latest version of ChatGPT and includes a voice mode that recently rolled out to a select group of ChatGPT Plus subscribers.
The safety challenges identified include standard risks like prompting the model with erotic or violent speech and other disallowed content, as well as “ungrounded inference” and “sensitive trait attribution” — that is, making assumptions about a speaker that could be discriminatory or biased. OpenAI says it trained the model to block any output flagged in those categories. However, the report also notes that the mitigations do not cover “nonverbal vocalizations or other sound effects” such as erotic moans, violent screams, and gunshots. We can infer, then, that prompts involving certain sensitive nonverbal sounds might still be handled improperly.
OpenAI also addressed the unique challenges that come with talking to a model. Red teamers found that GPT-4o could be prompted to impersonate someone or even unintentionally mimic the user’s own voice. To address this, OpenAI allows only pre-approved voices (which, notably, exclude anything resembling Scarlett Johansson’s). GPT-4o can also identify voices other than the speaker’s, which raises obvious privacy and surveillance concerns, but it has been trained to refuse such requests unless it is asked to identify the speaker of a famous quote.
Red teamers also noted that GPT-4o could be prompted to speak in a persuasive or emphatic way, which could make it more harmful than text output when it comes to misinformation and conspiracy theories.
Notably, OpenAI also addressed potential copyright issues, which have plagued the company and the development of generative AI as a whole, since these models are trained on data scraped from the web. GPT-4o is trained to refuse requests for copyrighted content and has additional filters to block outputs containing music. To that end, ChatGPT’s voice mode has been instructed not to sing under any circumstances.
Many of the risk mitigations covered in the lengthy document were put in place before voice mode was released. So the report’s clear message is that while GPT-4o is capable of certain risky behaviors, it won’t engage in them.
However, OpenAI noted that “these evaluations only measure the clinical knowledge of these models and not their utility in real-world workflows.” In other words, GPT-4o has been tested in a controlled environment, but it could be a different beast in the wild, once it is exposed to the wider public.
Mashable has reached out to OpenAI for more details about these mitigations and will update this story if we hear back.