venturebeat.com
Open in
urlscan Pro
192.0.66.2
Public Scan
Submitted URL: https://link.mail.beehiiv.com/ss/c/u001.1KHRTbC8HPuHqA0YnUG6trNGXZYnTYnGRVy7daaCosMkGUyzfuzWgaXxYHQ1i86upQ408fhcAGVB6xsHcGjVfA...
Effective URL: https://venturebeat.com/ai/google-researchers-unveil-vlogger-an-ai-that-can-bring-still-photos-to-life/
Submission: On March 19 via api from US — Scanned from DE
Effective URL: https://venturebeat.com/ai/google-researchers-unveil-vlogger-an-ai-that-can-bring-still-photos-to-life/
Submission: On March 19 via api from US — Scanned from DE
Form analysis
2 forms found in the DOMGET https://venturebeat.com/
<form method="get" action="https://venturebeat.com/" class="search-form" id="nav-search-form">
<input id="mobile-search-input" class="" type="text" placeholder="Search" name="s" aria-label="Search" required="">
<button type="submit" class="">
<svg width="24" height="24" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg">
<g>
<path fill-rule="evenodd" clip-rule="evenodd"
d="M14.965 14.255H15.755L20.745 19.255L19.255 20.745L14.255 15.755V14.965L13.985 14.685C12.845 15.665 11.365 16.255 9.755 16.255C6.16504 16.255 3.255 13.345 3.255 9.755C3.255 6.16501 6.16504 3.255 9.755 3.255C13.345 3.255 16.255 6.16501 16.255 9.755C16.255 11.365 15.665 12.845 14.6851 13.985L14.965 14.255ZM5.255 9.755C5.255 12.245 7.26501 14.255 9.755 14.255C12.245 14.255 14.255 12.245 14.255 9.755C14.255 7.26501 12.245 5.255 9.755 5.255C7.26501 5.255 5.255 7.26501 5.255 9.755Z">
</path>
</g>
</svg>
</button>
</form>
<form action="" data-action="nonce_mailchimp_boilerplate_subscribe" id="boilerplateNewsletterForm" class="Form js-vb-newsletter-cta">
<input type="email" name="email" placeholder="Email" class="Form__input" id="boilerplateNewsletterEmail" required="">
<input type="hidden" name="newsletter" value="vb_dailyroundup">
<input type="hidden" name="b_f67554569818c29c4c844d121_89d8059242" value="">
<input type="hidden" id="nonce_mailchimp_boilerplate_subscribe" name="nonce_mailchimp_boilerplate_subscribe" value="baf2d776f7"><input type="hidden" name="_wp_http_referer"
value="/ai/google-researchers-unveil-vlogger-an-ai-that-can-bring-still-photos-to-life/"> <button type="submit" class="Form__button Newsletter__sub-btn">Subscribe</button>
</form>
Text Content
WE VALUE YOUR PRIVACY We and our partners store and/or access information on a device, such as cookies and process personal data, such as unique identifiers and standard information sent by a device for personalised ads and content, ad and content measurement, and audience insights, as well as to develop and improve products. With your permission we and our partners may use precise geolocation data and identification through device scanning. You may click to consent to our and our 760 partners’ processing as described above. Alternatively you may access more detailed information and change your preferences before consenting or to refuse consenting. Please note that some processing of your personal data may not require your consent, but you have a right to object to such processing. Your preferences will apply to this website only. You can change your preferences at any time by returning to this site or visit our privacy policy. MORE OPTIONSAGREE Skip to main content Events Video Special Issues Jobs VentureBeat Homepage Subscribe * Artificial Intelligence * View All * AI, ML and Deep Learning * Auto ML * Data Labelling * Synthetic Data * Conversational AI * NLP * Text-to-Speech * Security * View All * Data Security and Privacy * Network Security and Privacy * Software Security * Computer Hardware Security * Cloud and Data Storage Security * Data Infrastructure * View All * Data Science * Data Management * Data Storage and Cloud * Big Data and Analytics * Data Networks * Automation * View All * Industrial Automation * Business Process Automation * Development Automation * Robotic Process Automation * Test Automation * Enterprise Analytics * View All * Business Intelligence * Disaster Recovery Business Continuity * Statistical Analysis * Predictive Analysis * More * Data Decision Makers * Virtual Communication * Team Collaboration * UCaaS * Virtual Reality Collaboration * Virtual Employee Experience * Programming & Development * Product Development * Application Development * Test Management * Development Languages Subscribe Events Video Special Issues Jobs GOOGLE RESEARCHERS UNVEIL ‘VLOGGER’, AN AI THAT CAN BRING STILL PHOTOS TO LIFE Michael Nuñez@MichaelFNunez March 18, 2024 6:00 AM * Share on Facebook * Share on X * Share on LinkedIn Image Credit: enriccorona.github.io/vlogger Join leaders in Boston on March 27 for an exclusive night of networking, insights, and conversation. Request an invite here. -------------------------------------------------------------------------------- Google researchers have developed a new artificial intelligence system that can generate lifelike videos of people speaking, gesturing and moving — from just a single still photo. The technology, called VLOGGER, relies on advanced machine learning models to synthesize startlingly realistic footage, opening up a range of potential applications while also raising concerns around deepfakes and misinformation. Described in a research paper titled “VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis,” the AI model can take a photo of a person and an audio clip as input, and then output a video that matches the audio, showing the person speaking the words and making corresponding facial expressions, head movements and hand gestures. The videos are not perfect, with some artifacts, but represent a significant leap in the ability to animate still images. 1 / 5 Live from GTC 2024 - Interview with Ernst & Young Read More 12 Video Player is loading. Play Video Unmute Duration 0:00 / Current Time 0:00 Playback Speed Settings 1x Loaded: 0% 0:00 Remaining Time -0:00 FullscreenPlayRewind 10 SecondsUp Next This is a modal window. Beginning of dialog window. Escape will cancel and close the window. TextColorWhiteBlackRedGreenBlueYellowMagentaCyanTransparencyOpaqueSemi-TransparentBackgroundColorBlackWhiteRedGreenBlueYellowMagentaCyanTransparencyOpaqueSemi-TransparentTransparentWindowColorBlackWhiteRedGreenBlueYellowMagentaCyanTransparencyTransparentSemi-TransparentOpaque Font Size50%75%100%125%150%175%200%300%400%Text Edge StyleNoneRaisedDepressedUniformDropshadowFont FamilyProportional Sans-SerifMonospace Sans-SerifProportional SerifMonospace SerifCasualScriptSmall Caps Reset restore all settings to the default valuesDone Close Modal Dialog End of dialog window. Share Playback Speed 0.25x 0.5x 1x Normal 1.5x 2x Replay the list TOP ARTICLES * Powered by AnyClip * Privacy Policy Live from GTC 2024 - Interview with Ernst & Young VLOGGER generates photorealistic videos of talking and gesturing avatars from a single image. (Credit: enriccorona.github.io) A BREAKTHROUGH IN SYNTHESIZING TALKING HEADS The researchers, led by Enric Corona at Google Research, leveraged a type of machine learning model called diffusion models to achieve the novel result. Diffusion models have recently shown remarkable performance at generating highly realistic images from text descriptions. By extending them into the video domain and training on a vast new dataset, the team was able to create an AI system that can bring photos to life in a highly convincing way. “In contrast to previous work, our method does not require training for each person, does not rely on face detection and cropping, generates the complete image (not just the face or the lips), and considers a broad spectrum of scenarios (e.g. visible torso or diverse subject identities) that are critical to correctly synthesize humans who communicate,” the authors wrote. advertisement VB EVENT The AI Impact Tour – Atlanta Continuing our tour, we’re headed to Atlanta for the AI Impact Tour stop on April 10th. This exclusive, invite-only event, in partnership with Microsoft, will feature discussions on how generative AI is transforming the security workforce. Space is limited, so request an invite today. Request an invite A key enabler was the curation of a huge new dataset called MENTOR containing over 800,000 diverse identities and 2,200 hours of video — an order of magnitude larger than what was previously available. This allowed VLOGGER to learn to generate videos of people with varied ethnicities, ages, clothing, poses and surroundings without bias. POTENTIAL APPLICATIONS AND SOCIETAL IMPLICATIONS The technology opens up a range of compelling use cases. The paper demonstrates VLOGGER’s ability to automatically dub videos into other languages by simply swapping out the audio track, to seamlessly edit and fill in missing frames in a video, and to create full videos of a person from a single photo. advertisement One could imagine actors being able to license detailed 3D models of themselves that could be used to generate new performances. The technology could also be used to create photorealistic avatars for virtual reality and gaming. And it might enable the creation of AI-powered virtual assistants and chatbots that are more engaging and expressive. Google sees VLOGGER as a step toward “embodied conversational agents” that can engage with humans naturally through speech, gestures and eye contact. “VLOGGER can be used as a stand-alone solution for presentations, education, narration, low-bandwidth online communication, and as an interface for text-only human-computer interaction,” the authors wrote. However, the technology also has the potential for misuse, for example in creating deepfakes — synthetic media in which a person in a video is replaced with someone else’s likeness. As these AI-generated videos become more realistic and easier to create, it could exacerbate the challenges around misinformation and digital fakery. A NEW FRONTIER IN AI RESEARCH While impressive, VLOGGER still has limitations. The generated videos are relatively short and have a static background. The individuals don’t move around a 3D environment. And their mannerisms and speech patterns, while realistic, are not yet indistinguishable from those of real humans. advertisement Nonetheless, VLOGGER represents a significant step forward. “We evaluate VLOGGER on three different benchmarks and show that the proposed model surpasses other state-of-the-art methods in image quality, identity preservation and temporal consistency,” the authors reported. With further advances, this type of AI-generated media is likely to become ubiquitous. We may soon live in a world where it is hard to tell whether the person speaking to us in a video is real or generated by a computer program. VLOGGER provides an early glimpse of that future. It is a powerful demonstration of the rapid progress being made in artificial intelligence and a sign of the increasing challenges we will face in distinguishing between what is real and what is fake. VB Daily Stay in the know! Get the latest news in your inbox daily Subscribe By subscribing, you agree to VentureBeat's Terms of Service. Thanks for subscribing. Check out more VB newsletters here. An error occured. NEXT STOP: AI IMPACT TOUR BOSTON Join us in Boston an exclusive invitation-only evening of networking and insights to discuss how to ensure data integrity for enterprise AI. Request an Invite * VentureBeat Homepage * Follow us on Facebook * Follow us on X * Follow us on LinkedIn * Follow us on RSS * Press Releases * Contact Us * Advertise * Share a News Tip * Contribute to DataDecisionMakers * Privacy Policy * Terms of Service * Do Not Sell My Personal Information © 2024 VentureBeat. All rights reserved.