TikTok’s Parent Teases Video AI Model Rivaling OpenAI’s Sora, Turns Photos into Videos

ByteDance dominates the short-video segment with TikTok. Will it be a leading GenAI company as well?

Shortly after DeepSeek shot to the top of app store charts, ByteDance, the company behind TikTok, has released a research paper on its new video generation AI model, OmniHuman-1.

The OmniHuman-1 model can generate realistic human videos by employing a mixed data training strategy with multi-modality motion conditioning. 

In the research paper, the authors mention, “We propose OmniHuman, an end-to-end multimodality-conditioned human video generation framework that generates human videos based on a single image and motion signals (e.g., audio, video, or both).” The researchers who worked on it include Gaojie Lin, Jianwen Jiang, Jiaqi Yang, Zerong Zheng, and Chao Liang.

The model relies on omni-conditions training, which lets stronger-conditioned tasks draw on data from weaker-conditioned tasks so that less training data goes to waste.

ByteDance’s creation joins the race alongside Google’s Lumiere, OpenAI’s Sora, and other text-to-video generation models. Fundamentally, these models differ from one another, but OmniHuman-1 could take the internet by storm, just as Sora did. No studies have compared the popular models yet.

Put simply, one can generate a video from a single image. While that is exciting, it is also scary, considering that deepfakes are already being used to extort money from senior citizens.

Anshuman Jha, an AI consultant at AON, took to LinkedIn to highlight the potential for abuse of such a model. “From entertainment to advertising, the applications are limitless. Imagine personalised ads where celebrities endorse products in real-time or deceased artists perform new songs. The potential for misuse is glaring,” he said. On the other hand, Jha also described it as a “marvel”.

At the moment, the model is not available to the public. However, the results shared on the official website indicate that the model works on any kind of image.

Participants in a Reddit discussion on OmniHuman-1 agree that it could be a game-changer among AI-based video generation models. It is also generating buzz on social media platforms, where many users seem surprised at the accuracy of the results.

Much as DeepSeek has dominated the conversation in recent days, OmniHuman-1 could be the next talk of the town among video generation AI models.

Ankush Das
I am a tech aficionado and a computer science graduate with a keen interest in AI, Coding, Open Source, Global SaaS, and Cloud. Have a tip? Reach out to ankush.das@aimmediahouse.com