New Delhi: As world leaders struggle to deal with the threat of deepfakes in a global election year, OpenAI, the company led by Sam Altman, is developing a text-to-speech AI model called ‘Voice Engine’. The model uses text input and a single 15-second audio sample to generate natural-sounding speech; according to OpenAI, a sample that short is enough to closely recreate a speaker’s voice.

“We are engaging with government, media, entertainment, academia, civil society, and US and international partners to ensure we are incorporating their input as we build,” OpenAI said.

The partners testing Voice Engine have agreed to OpenAI’s usage policies, which prohibit impersonating another person or organization without consent or legal right. “Our terms with these partners require explicit consent from the original speaker and we do not allow developers to create ways for individual users to create their own voices,” the company said in a blog post.

Partners must also clearly disclose to their audiences that the voices they are hearing are AI-generated.

Finally, OpenAI said it has implemented a set of safeguards, including watermarking to trace the origin of any audio generated by Voice Engine, as well as active monitoring of how it is being used.