Understanding OpenAI's Sora and How It Works: A Comprehensive Guide

Sora is an artificial intelligence engine capable of generating realistic video. The seams are still visible, but it shows that we are about to take the last step before falling off the precipice into the end of reality: the post-truth era in which nothing, absolutely nothing, that we see on the screens of our phones and computers will be credible. Sora, OpenAI’s new artificial intelligence, can generate a minute of video that looks like it came from a high-definition camera from nothing more than a typed paragraph of text, and it has brought us to this point.


“Sora is capable of generating complex scenes with multiple characters, specific types of movement, and precise details of subject matter and background,” says OpenAI. But what matters even more, in this arms race to make artificial intelligence conjure an alternative reality as credible as the real one, is that “the model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”

Sam Altman, the CEO of OpenAI, showed off Sora yesterday on X, the former Twitter, after the official presentation. He invited people to suggest prompts for the new AI and published the results a little later, such as two golden retrievers podcasting on top of a mountain.

Sora will be followed by Stable Cascade, from OpenAI’s direct competition. Then there will be a new version of Runway or some other tool until, in a few months or at most a year or two, one of these synthetic reality engines is indistinguishable from the “real” reality that we can see with our own eyes.

Almost ready

Sora’s seams are still showing. It is not a perfect generative model, but it undeniably comes close. It is a big leap in a chain of advances since 2022, the year in which the generative AI revolution that began in the late 2010s started to crystallize when chipmaker Nvidia, now one of the most valuable companies on the planet precisely because of a strategy since proven to be masterful, created the first AI imaging applications. Bryan Catanzaro, the company’s head of applied AI and one of the experts we interviewed for the episode The End of Reality of our series Control Z, already told us as much.

Although the majority of people today have opened their digital newspapers to see the news and are amazed by Sora’s videos, this ability was already predicted by the experts I interviewed for the mini-documentary. It arrived just when they said: in 2024.

The scientific, economic, and creative explosion that is happening right now will be comparable to going from the Stone Age directly to 2024 in just 10 years. But if we do not take measures to nip its use for evil in the bud, generative artificial intelligence will shake society, causing profound and irreparable damage to millions of people, as Tom Graham told me via videoconference. Graham is CEO and co-founder of Metaphysic, one of the leaders in the sector, which has revolutionized Hollywood with the deepfake technology that went viral with the fake Tom Cruise. We are at a time when all of society (individuals, technologists, consumers, and policymakers) must take urgent action to prevent brutal harm to the public and to democracies. Unfortunately, he says, the laws lag far behind what is being developed and will still take 10 to 15 years to arrive. Today, he assures us, “We are in a period of danger.”

Unavoidable events requiring urgent action

Graham refers to an event horizon in which reality will evaporate. At some point in the near future, we will lose our ability to distinguish between fact and machine-created fiction, no matter how many forensic tools we devise. It turns out, after talking to some of the leading experts in the field, that this “near future” will arrive within the next 10 years. Emad Mostaque, CEO and founder of Stability AI (the organization that created Stable Diffusion, the most important generative artificial intelligence engine in the world at the moment, even beyond ChatGPT), told me by videoconference that “in the next 5 to 10 years we will be able to create anything you can imagine with perfect visual quality in real-time.”

Catanzaro, vice president of applied artificial intelligence at Nvidia, one of the companies that has laid the foundations for the field with its scientific research and graphics processors, agrees with Mostaque’s prediction. He goes further. “I bet that in 2023 someone will make a movie where the video, audio, and script are made with AI, but, probably, within five years, that will reach the point where it would be interesting to see something built this way,” he tells me.

So in 2033, we will have the ability to create high-definition video in real time in which everything, absolutely everything, from the image to the sound, the music, and every word or grunt spoken in it, will be artificially generated. The product will be indistinguishable from any clip or full movie that can be recorded with any current camera. Before that, however, we will see videos and images and hear audio that cannot be told apart from reality with the naked eye or ear, so that only forensic analysis will be able to reveal that they are fake.


As Gil Perry, CEO and co-founder of the Israeli AI company D-ID, creators of Deep Nostalgia, tells me, “In a year or two, you won’t be able to know what is true and what is a lie.” That applies not only to Hollywood movies but also to real-time video, including videoconferences. Graham says that in a few years generative AI technology will be able to change your face and even your surroundings in real time, in a completely believable way, in communication tools like Zoom. It already does so imperfectly today, and people believe it.

Each of these interviews left me with a deep sense of desolation and anxiety in the face of a crisis that seems imminent and inevitable, a sense of existential anguish that I still have not been able to shake. Logically, the dark side of all this technology lies in its criminal application, not only by authoritarian states such as Russia, China, or Iran, or by extreme political parties of both stripes, but on a day-to-day basis. For scammers, blackmailers, rapists, and school bullies, the tool will be extremely powerful for doing evil. It is a true atomic bomb within reach of anyone because, according to experts, the barrier to entry will be zero. It will not require any specialized knowledge or equipment.

Anyone with a mobile phone can do it, as Mostaque told me. I’ve always been a tech optimist, the guy who thinks there is no problem that can’t be solved by pure human ingenuity. Global warming, cancer, the energy crisis: we will solve it all. But as I got deeper into generative AI, I discovered that there was no way to put this genie back into the lamp. This time, we have unleashed a force that will be uncontrollable in just a few years if we don’t take some radical steps now.

It’s a terrifying dystopian future. It will likely play out through different real-world events, but with basically the same outcome: the end of reality is not good for humanity. Generative AI is something we can’t undo. Trying to undo it would also be stupid. Its potential is simply too amazing to ignore, from developing cures for incurable diseases to designing spaceships, far better than those humans design, to take us to new worlds. And, of course, making movies and having fun with it.

Limit the dark side without limiting innovation

But trusting companies to regulate themselves would be just as stupid. This is something that the experts I have interviewed admit, even though they are interested parties themselves. It is even less advisable when Silicon Valley is involved. History has shown us time and time again that its companies cannot be trusted. The list of mistakes and of illegal and unethical acts is too long to ignore. The last time we trusted them, they gave us social media, and we all know how that shit show ended. Trusting them again would be downright foolish, especially after reading OpenAI’s terrifyingly messianic and self-absorbed manifesto on artificial general intelligence. Social media, Graham says, was launched into the world with absolutely zero regard for the impact it would have on young people and democracy. “I don’t think that’s the model we should follow for this new technology.” Graham thinks we should try to avoid repeating it at all costs.

We need an urgent public debate on generative AI, and there are three things we can do to avoid a social crisis of unimaginable consequences. They will require companies to sit down with institutions and governments, and even with psychologists, philosophers, and human rights organizations, but it can be done. Mostaque thinks there needs to be an open discussion about the positive and negative sides and about what needs to be regulated, although he doesn’t think much more than an extension of current legislation is needed to protect people. “Open debate is always best because of the complexity of what this could do to social makeup,” he says. Graham, however, says that “lawmakers need to think about how to implement those laws as quickly as humanly possible to protect people from potential harm.”

The first, and most important, is the creation of worldwide cryptographic certification standards to authenticate any content captured by digital cameras and microphones. This has already been proposed by the Biden administration, which sees the danger of generative AI looming in this US election year.

The goal is to establish a baseline of certainty that, at a minimum, allows people to be sure that something is real. According to Perry, the detection of synthetic content will be impossible. “AI is stronger,” he says. Hence the need to at least know what is real. He also points out that work should be done on incorporating invisible watermarks into generated content but, unfortunately, these can also be falsified by criminals.
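To make the idea of certifying captured content a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any proposed standard actually works: real certification schemes would rely on public-key signatures so that anyone can verify a file without holding a secret, while this example swaps in a keyed hash (HMAC) from the standard library. The names DEVICE_KEY, sign_capture, and verify_capture are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret provisioned inside the camera. A real scheme would use
# a private signing key instead, so that verifiers never hold the secret.
DEVICE_KEY = b"example-device-key"


def sign_capture(media_bytes: bytes, key: bytes = DEVICE_KEY) -> str:
    """Return a hex tag that binds the key to the exact captured bytes."""
    return hmac.new(key, media_bytes, hashlib.sha256).hexdigest()


def verify_capture(media_bytes: bytes, tag: str, key: bytes = DEVICE_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = sign_capture(media_bytes, key)
    return hmac.compare_digest(expected, tag)


original = b"raw sensor data from the camera"
tag = sign_capture(original)

print(verify_capture(original, tag))            # True: untouched capture
print(verify_capture(original + b"edit", tag))  # False: content was altered
```

The point of the sketch is the asymmetry Perry describes: it does not try to detect what is fake, it only lets a verifier confirm that a given file is byte-for-byte what the device recorded.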

The second is to launch communication programs so that the public understands the scope of generative artificial intelligence. People must learn to defend themselves against new audiovisual falsifications. “The world is changing and children are growing up in a very different place. It’s a little scary,” Perry tells me. “The idea is to make AI open to the public and have everyone have access and get used to it, not have it controlled by a few governments and tech giants.” Graham agrees with this public awareness effort.

Recently, his company participated with its real-time artificial intelligence avatars in the popular television show America’s Got Talent. His mission, he says, was not only promotional but also to make the power of this technology known to the general public: “If that can help a person reduce the psychological impact [of a fake image or video], it is positive.”

Finally, we need to urge governments around the world to collaborate with the scientific community on legislation that protects individual rights, establishing criminal limits to try to curb the toxic use of this technology. Perry, whose company got its start developing systems to bypass government facial identification, says they are pushing regulators to be aware of the technology and of the need to establish security guidelines, rules, and limits. Only then can we harness its revolutionary creative potential without endangering humanity itself.

By win11
