www.theregister.com
104.18.4.22
Public Scan
Submitted URL: https://tracking.tldrnewsletter.com/CL0/https://www.theregister.com/2023/06/23/open_source_licenses_ai/?utm_source=tldrnewsletter/1/...
Effective URL: https://www.theregister.com/2023/06/23/open_source_licenses_ai/?utm_source=tldrnewsletter
Submission: On June 26 via api from IN — Scanned from DE
Form analysis
2 forms found in the DOM
POST /CBW/custom
<form id="RegCTBWFAC" action="/CBW/custom" class="show_regcf_custom" method="POST">
<h5>Manage Cookie Preferences</h5>
<ul>
<li>
<label>
<input type="checkbox" disabled="disabled" checked="checked" name="necessary" value="necessary">
<strong>Necessary</strong>. <strong>Always active</strong>
</label>
<label for="accordion_necessary" class="accordion_toggler">Read more<img width="7" height="10" alt="" src="/design_picker/d2e337b97204af4aa34dda04c4e5d56d954b216f/graphics/icon/arrow_down_grey.svg" class="accordion_arrow"></label>
<div class="accordion">
<input type="checkbox" id="accordion_necessary">
<p class="accordion_info"> These cookies are strictly necessary so that you can navigate the site as normal and use all features. Without these cookies we cannot provide you with the service that you expect. </p>
</div>
</li>
<li>
<label>
<input type="checkbox" name="tailored_ads" value="tailored_ads">
<strong>Tailored Advertising</strong>. </label>
<label for="accordion_advertising_tailored_ads" class="accordion_toggler">Read more<img width="7" height="10" alt="" src="/design_picker/d2e337b97204af4aa34dda04c4e5d56d954b216f/graphics/icon/arrow_down_grey.svg"
class="accordion_arrow"></label>
<div class="accordion">
<input type="checkbox" id="accordion_advertising_tailored_ads">
<p class="accordion_info"> These cookies are used to make advertising messages more relevant to you. They perform functions like preventing the same ad from continuously reappearing, ensuring that ads are properly displayed for advertisers,
and in some cases selecting advertisements that are based on your interests. </p>
</div>
</li>
<li>
<label>
<input type="checkbox" name="analytics" value="analytics">
<strong>Analytics</strong>. </label>
<label for="accordion_analytics" class="accordion_toggler">Read more<img width="7" height="10" alt="" src="/design_picker/d2e337b97204af4aa34dda04c4e5d56d954b216f/graphics/icon/arrow_down_grey.svg" class="accordion_arrow"></label>
<div class="accordion">
<input type="checkbox" id="accordion_analytics">
<p class="accordion_info"> These cookies collect information in aggregate form to help us understand how our websites are being used. They allow us to count visits and traffic sources so that we can measure and improve the performance of our
sites. If people say no to these cookies, we do not know how many people have visited and we cannot monitor performance. </p>
</div>
</li>
</ul> See also our <a href="https://www.theregister.com/Profile/cookies/">Cookie policy</a> and <a href="https://www.theregister.com/Profile/privacy/">Privacy policy</a>. <input type="submit" value="Accept Selected" class="reg_btn_primary"
name="accept" id="RegCTBWFBAC">
</form>
POST /CBW/all
<form id="RegCTBWFAA" action="/CBW/all" method="POST" class="hide_regcf_custom">
<input type="submit" value="Accept All Cookies" name="accept" class="reg_btn_primary" id="RegCTBWFBAA">
</form>
Text Content
AI + ML
OPEN SOURCE LICENSES NEED TO LEAVE THE 1980S AND EVOLVE TO DEAL WITH AI
TIME TO GET WITH THE PROGRAM... BEFORE ARTIFICIAL INTELLIGENCE DOES
Steven J. Vaughan-Nichols Fri 23 Jun 2023 // 08:30 UTC
Opinion Free software and open source licenses evolved to deal with code in the 1970s and '80s. Today they must again transform to deal with AI models.
AI was born from open source software. But free software and open source licenses, which are based on copyright law and were written to deal with software code, are not a good fit for the large language model (LLM) neural nets and datasets that fuel AI's open source software. Since many programming datasets, in particular, are based on free software and open source code, something must be done.
And that's why Stefano Maffulli, Open Source Initiative (OSI) executive director, and a host of other open source and AI leaders are working on combining AI and open source licenses in ways that will make sense for both.
Lest you think this is some kind of theoretical, legal discussion with no impact on the real world, think again. Consider J. Doe 1 et al vs GitHub.
The plaintiffs in this case, filed in the US District Court for the Northern District of California, allege that Microsoft, OpenAI, and GitHub, via their commercial AI-based systems, OpenAI's Codex and GitHub's Copilot, ripped off their open source code. The result? The plaintiffs claim that the "suggested" code often consists of near-identical copies of code scraped from public GitHub repositories, without the required open source license attributions.
This case continues. The amended complaint includes claims of violating the Digital Millennium Copyright Act, breach of contract (open source license violations), unjust enrichment, unfair competition, and breach of contract (selling licensed materials in violation of GitHub's policies).
Don't think this kind of lawsuit is just Microsoft's problem. It's not. Sean O'Brien, a Yale Law School lecturer in cybersecurity and founder of the Yale Privacy Lab, told my colleague David Gewirtz: "I believe there will soon be an entire sub-industry of trolling that mirrors patent trolls, but this time surrounding AI-generated works. A feedback loop is created as more authors use AI-powered tools to ship code under proprietary licenses. Software ecosystems will be polluted with proprietary code that will be the subject of cease-and-desist claims by enterprising firms."
He's right. I've been covering patent trolls for decades. I guarantee that licensing trolls will come after "your" ChatGPT and Copilot code.
Some people, such as Felix Reda, a German researcher and politician, claim that all AI-produced code is public domain. US attorney Richard Santalesa, a founding member of the SmartEdgeLaw Group, observed to Gewirtz that there are both contract and copyright law issues, and they're not the same thing. Santalesa believes companies producing AI-generated code will "as with all of their other IP, deem their provided materials – including AI-generated code – as their property."
In any case, however, public domain code is not the same thing as open source code.
On top of all that, there's the whole issue of how the datasets should be licensed. There are many "open" datasets under numerous open source licenses, but those licenses are not usually a good fit.
In our conversation, Open Source Initiative's Maffulli elaborated on how various artifacts produced by AI and machine learning systems fall under different laws and regulations. The open source community must determine which laws best serve its interests. Maffulli compared the current situation to the late '70s and '80s, when software emerged as a distinct discipline and copyright began to be applied to source and binary code.
We're at a similar crossroads today. AI programs such as TensorFlow, PyTorch, and Hugging Face Hub work well under their open source licenses. The new AI artifacts are another story. Datasets, models, weights, etc. don't fit squarely into the traditional copyright model. Maffulli argued that the tech community should devise something new that aligns better with our objectives, rather than relying on "hacks."
Specifically, open source licenses designed for software, Maffulli noted, might not be the best fit for AI artifacts. For instance, while the MIT License's broad freedoms could potentially apply to a model, questions arise for more complex licenses like Apache or the GPL. Maffulli also addressed the challenges of applying open source principles to sensitive fields like healthcare, where regulations around data access pose unique hurdles. The short version of this is that medical data can't be open sourced.
At the same time, most commercial LLM datasets are black boxes. We literally don't know what's in them.
So we end up, as the Electronic Frontier Foundation (EFF) puts it, in a situation where we have "Garbage In, Gospel Out." We need, the EFF concludes, open data.
So it is that the OSI, said Maffulli, is working together with Open Forum Europe, Creative Commons, Wikimedia Foundation, Hugging Face, GitHub, the Linux Foundation, ACLU, Mozilla, and the Internet Archive on a draft for defining a common understanding of open source AI principles. This will be "critical in conversations with legislative bodies." Even now, EU, US, and UK government agencies are struggling to develop AI regulation, and they're woefully under-equipped to deal with the issues.
Maffulli concluded by saying we should start with "a return to the basics," the GNU Manifesto, which predates most licenses and sets the "North Star" for the open source movement. Maffulli suggested that its principles remain surprisingly relevant when applied to AI systems. By focusing on first principles, we'll be better able to navigate this complex intersection of AI and open source.
®
The Register Biting the hand that feeds IT
Copyright. All rights reserved © 1998–2023