venturebeat.com
192.0.66.2
Public Scan
URL:
https://venturebeat.com/ai/the-self-operating-computer-emerges/
Submission: On December 05 via api from US — Scanned from DE
Text Content
THE ‘SELF-OPERATING’ COMPUTER EMERGES

Bryson Masse (@Bryson_M)
November 28, 2023 12:02 PM

Late nights with a newborn can lead to unexpected breakthroughs. Such was the case for OthersideAI developer Josh Bickett, who had an idea for a groundbreaking new “self-operating computer framework” while feeding his daughter in the middle of the night.

As Bickett explained to VentureBeat, “I’ve been really enjoying time with my daughter, who’s four weeks now old and I had a lot of new lessons in fatherhood and all that stuff. But I also had a little bit of time, and this idea kind of came to me because I saw different demos of GPT-4 vision. The thing we’re working on now can actually happen with GPT-4 vision.”

With his daughter cradled in one arm, Bickett sketched out the basic framework on his computer. “I just found an initial implementation…it’s not super good at clicking the mouse in the right way. But what we’re doing is defining the problem: we need to figure out how to operate a computer.”

When OthersideAI co-founder and CEO Matt Shumer saw the new framework, he recognized its tremendous potential. As Shumer told VentureBeat, “This is a milestone in the road to getting to the equivalent of a self-driving car but for a computer. We have the sensors now. We have the LIDAR systems. Next, we build the intelligence.”
AN AI THAT DECIDES WHERE AND WHAT TO CLICK ON YOUR PC

As Bickett described, the framework “lets the AI control both the mouse where it clicks and all the keyboard triggers essentially. It’s like an agent like autoGPT except it’s not text based. It’s vision based so it takes a screenshot of the computer and then it decides mouse clicks and keyboards, exactly like a person would.”

Shumer elaborated on how this framework represents a major advance over previous approaches that relied solely on APIs.

“A lot of things that people do on computers, right, you can’t really do with APIs, which is how a lot of other people are approaching this problem, [when] they want to build an agent. They built it on top of the publicly available APIs for this service, but that doesn’t extend to everything.”

As Shumer asserted, “If you truly want to solve something that is autonomous [and] can actually help us or get more done. You have to allow it to work like a person because the world is built for people.”

The framework takes screenshots as input and outputs mouse clicks and keyboard commands, just as a human would. But as both Bickett and Shumer acknowledged, the real potential lies not in the lightweight framework itself, but in the advanced computer vision and reasoning models that can be plugged into it.

“The framework will just be like plug and play, you just plug in a better model and it gets better,” said Bickett.

HOW AI AGENTS WILL CHANGE COMPUTING AS WE KNOW IT

When asked by VentureBeat about the future implications, Shumer painted a bold vision: “Once this thing is sufficiently reliable, it is going to be your computer, it is going to be your interface to the digital world.”

With the self-operating computer framework in place, advanced AI models could learn to take over all computer interactions just through conversational commands.
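
The loop Bickett describes (a screenshot goes in, mouse clicks and keystrokes come out, and the model is swappable) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the framework's actual code: `Action`, `VisionModel`, `operate`, and `ScriptedModel` are hypothetical names invented for this example, and the screen capture and input execution are passed in as plain callables rather than tied to any real library.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol


@dataclass(frozen=True)
class Action:
    """One step the agent wants to take (hypothetical schema)."""
    kind: str                   # "click", "type", or "done"
    x: Optional[int] = None     # pixel coordinates for a click
    y: Optional[int] = None
    text: Optional[str] = None  # characters to type


class VisionModel(Protocol):
    """Anything that maps (objective, screenshot bytes) to the next Action."""
    def next_action(self, objective: str, screenshot: bytes) -> Action: ...


def operate(
    objective: str,
    model: VisionModel,
    capture_screen: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat: take a screenshot, let the model decide, perform the action."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = model.next_action(objective, capture_screen())
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


class ScriptedModel:
    """Stand-in for a real vision model: replays a fixed list of actions."""
    def __init__(self, script: List[Action]) -> None:
        self._script = iter(script)

    def next_action(self, objective: str, screenshot: bytes) -> Action:
        # Ignores its inputs; a real model would inspect the screenshot.
        return next(self._script, Action(kind="done"))


if __name__ == "__main__":
    performed: List[Action] = []
    model = ScriptedModel([Action("click", x=100, y=200), Action("type", text="hello")])
    history = operate("say hello", model, lambda: b"", performed.append)
    print([a.kind for a in history])
```

Because `operate` depends only on the `VisionModel` interface, swapping in a stronger model would not change the loop itself, which is one way to read the “plug and play” remark. The `max_steps` cap is an added safety assumption so an unreliable model cannot loop forever.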
As Shumer predicted, different types of specialized computer agent models will likely emerge to handle different tasks. Some may focus on speed for simpler tasks, while others excel at complex reasoning. Models may also vary for enterprise vs. consumer use cases.

But the overarching goal, according to Shumer, is to develop agents that enable a world “where people can say, this is what I hate doing. Now, I don’t have to do it anymore. And we want to make it so damn easy that somebody who can barely use a computer from the beginning can do it.”

OPEN SOURCE TO FUEL DEVELOPMENT

Bickett believes the open source nature of the framework will further accelerate progress, allowing developers worldwide to experiment with new applications.

Shumer agreed there is “room for a lot of players in this space…a range of model providers. A range of applications. And there are going to be a lot of spaces in this industry to build really really big businesses.”

While Bickett and Shumer see enormous potential, realizing the vision of truly intelligent computer agents will require immense resources and continued innovation.

To that end, AI research company Imbue, formerly known as Generally Intelligent, recently secured a $150 million partnership with Dell to build a powerful AI training platform. The massive cluster of around 10,000 Nvidia H100 GPUs will allow Imbue to develop new foundation models optimized specifically for reasoning abilities, a key focus of their work.

As Imbue co-founder and CEO Kanjun Qiu noted, “reasoning is the core blocker to agents that work really well.” Imbue believes robust reasoning is paramount for developing truly effective AI agents, as it allows machines to handle uncertainty, adapt approaches, gather new information, make complex decisions, and grapple with real-world complexities, all abilities crucial for functioning autonomously beyond narrow tasks.
The company adopts a “full stack” methodology encompassing optimized foundation model training, experimental agent and interface prototyping, robust tool-building, and theoretical AI research. It aims to advance both the practical and fundamental understanding of deep learning, with the goal of engineering AI capable of human-level reasoning and, eventually, artificial general intelligence.

While the self-operating computer framework is just the first step, Bickett and Shumer see it ushering in a new era where sophisticated AI agents replace human computing interfaces entirely. Late nights may keep yielding paradigm-shifting ideas, but it will take focused work to realize the full vision of computers that just work, for anyone, anywhere, through ordinary language alone.