VIDU video model

Chinanews.com, Beijing, April 27 (China News Service): China’s artificial intelligence field has achieved another significant advance. At the Future Artificial Intelligence Pioneer Forum of the 2024 Zhongguancun Forum Annual Meeting today, Tsinghua University and Shengshu Technology officially unveiled Vidu, China’s first large video model with long duration, high consistency, and high dynamics, a release that strengthens China’s position in video generation technology.

The model is built on U-ViT, an architecture developed by the team that fuses Diffusion and Transformer models. Thanks to this architecture, Vidu can produce up to 16 seconds of 1080p high-definition video with a single click. Beyond the technical advance itself, this opens up new opportunities for video production and the entertainment sector.

At the forum, Zhu Jun, chief scientist of Shengshu Technology and professor at Tsinghua University, gave a detailed introduction to Vidu’s features and strengths. He said that in addition to simulating the real world, Vidu is highly imaginative and can produce a wide range of engaging and colourful video content from a given text description. Vidu also offers multi-shot generation and strong spatiotemporal consistency, which make the resulting video more realistic and coherent.

Vidu has also made notable progress in visual quality. The videos it produces are more detailed and lifelike than those of earlier video generation technologies, and they capture distinctly Chinese elements well. For example, Vidu can generate images of pandas and dragons with Chinese characteristics, which suggests a solid grasp of Chinese cultural imagery.

Zhu Jun further emphasised that Vidu’s generation process is “one-step” in nature: text is converted to video directly and continuously, with no frame interpolation or other multi-step processing involved. This generation approach makes Vidu more efficient and broadens its potential applications in video generation.

It is understood that Vidu’s rapid progress stems from the team’s long-term work and several original results in Bayesian machine learning and multimodal large models. Drawing on its deep understanding of the U-ViT architecture and its extensive engineering and data experience, the team broke through the core technologies of long-video representation and processing to build this large video model.

The launch of Vidu is a significant advance for China’s artificial intelligence community as well as for video generation technology worldwide. Vidu is expected to continue to improve, bringing more convenience and enjoyment to people’s lives. It is also hoped that more businesses and academic institutions will enter the field and work together to advance the development and application of large video models.
