ANNOUNCING AI2 OLMO, AN OPEN LANGUAGE MODEL MADE BY SCIENTISTS, FOR SCIENTISTS

AI2 · Published in AI2 Blog · 5 min read

Today, the Allen Institute for AI is excited to announce that we are embarking
on the creation of an open, state-of-the-art generative language model: AI2 OLMo
(Open Language Model). At 70 billion parameters, OLMo will be comparable in
scale to other state-of-the-art large language models, and it is expected in
early 2024.
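The 70-billion-parameter scale can be sanity-checked with a back-of-the-envelope parameter count for a decoder-only transformer. The hyperparameters below are illustrative assumptions, roughly in line with other models in this class; they are not OLMo's actual configuration, which had not been published at the time of this announcement.

```python
# Back-of-the-envelope parameter count for a decoder-only transformer.
# All hyperparameters here are illustrative assumptions, not OLMo's real config.

def transformer_param_count(d_model: int, n_layers: int, vocab_size: int,
                            mlp_ratio: int = 4) -> int:
    """Approximate parameter count, ignoring biases and layer norms."""
    attention = 4 * d_model * d_model          # Q, K, V, and output projections
    mlp = 2 * mlp_ratio * d_model * d_model    # up- and down-projection matrices
    per_layer = attention + mlp
    embeddings = vocab_size * d_model          # token embedding table
    return n_layers * per_layer + embeddings

# Plausible settings for a model in the ~70B class (assumed, not OLMo's):
total = transformer_param_count(d_model=8192, n_layers=80, vocab_size=32000)
print(f"{total / 1e9:.1f}B parameters")  # roughly 65B with these settings
```

Real models deviate from this estimate through choices like tied input/output embeddings, grouped-query attention, or gated MLP variants, but the calculation shows how a handful of architectural hyperparameters determines the tens-of-billions scale quoted above.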

OLMo will be a uniquely open language model intended to benefit the research
community by providing access and education around all aspects of model
creation. AI2 is developing OLMo in collaboration with AMD and CSC, using the
new GPU portion of the all-AMD processor powered LUMI pre-exascale supercomputer
— one of the greenest supercomputers in the world.

OLMo will be a new avenue for many people in the AI research community to work
directly on language models for the first time. We will be making all elements
of the OLMo project accessible — not only will our data be available, but so
will the code used to create the data. We will open-source the model, the
training code, the training curves, and evaluation benchmarks. We will also
openly share and discuss the ethical and educational considerations around the
creation of this model to help guide the understanding and responsible
development of language modeling technology.

This broad availability of all aspects of OLMo will allow the research community
to directly take what we create and work to improve it. We believe that millions
of people want to better understand and engage with language models, and we aim
to create the environment where they actually can, leading to faster and safer
progress for everyone. Our goal is to collaboratively build the best open
language model in the world — follow along with us on Twitter, our blog, and our
newsletter to become a part of this important undertaking.

“With the scientific community in mind, OLMo will be purpose-built to advance
the science of language models,” says Hannaneh Hajishirzi, an OLMo project lead
and a Senior Director of NLP Research at AI2. “OLMo will be the first language
model specifically designed for scientific understanding and discovery.”

“AI2’s deep heritage in natural language processing (NLP) with AMD’s history of
supporting the scientific community through our high-performance computing
efforts are a perfect match for OLMo,” said Ian Ferreria, senior director, AI
Solutions, AMD. “With the new OLMo initiative from AI2, which is geared for
science, we have the capability to extend our knowledge into generative AI using
the impressive capabilities from the LUMI Supercomputer powered by AMD EPYC™
CPUs and AMD Instinct™ accelerators.”




A TRULY OPEN MODEL

As a transparent, collaborative, nonprofit institution, we are well-positioned
to build a language model that is truly open and uniquely valuable to the AI
research community. Our OLMo endeavor will include more than just building an
open language model — we’re purposely building a platform that will allow the
research community to take each component we create and either use it themselves
or seek to improve it. Everything we create for OLMo will be openly available,
documented, and reproducible, with very limited exceptions and under suitable
licensing. The artifacts released as part of the OLMo project will include
training data, code, model weights, intermediate checkpoints, and ablations. A
release strategy for the model and its artifacts is in development as part of
the project. We also plan to build a demo and release interaction data from
consenting users.


FURTHERING AI RESEARCH

As we build OLMo, we will make decisions that make the final model as usable and
efficient as possible without sacrificing performance. Our aim is to make our
model accessible to the full breadth of the AI research community, increasing
the diversity of perspectives and pace of improvement in language model
development. We will also build and release the most rigorously studied and
documented model training dataset to date — this will include pretraining data,
instruction data, and human interaction data.


ETHICAL AND EDUCATIONAL

With OLMo, we are taking a pragmatic approach to ethics and openness. We will
lead with transparency by documenting the decisions, considerations, and
trade-offs we make in considering the ethical and societal impacts of creating
and releasing the OLMo model. Along the way, we will promote AI knowledge and
understanding by sharing our progress, describing our challenges, and explaining
our discoveries. The OLMo team is working closely with AI2’s legal department
and outside legal experts and has included multiple checkpoints in the
model-building process to assess and reassess privacy and intellectual property
rights issues.


PARTNERSHIPS AND SUPPORT

In addition to the collaboration on hardware and computing resources with AMD
and LUMI, AI2 is partnering with organizations including Surge AI and MosaicML
for data and training code. We have created an ethics review committee that
includes both internal and external advisors to provide feedback throughout the
process. The OLMo model and API will be a powerful new resource for the broader
community to better understand and participate in the generative AI revolution.
AI2 welcomes support and partnership from organizations aligned with our values
of AI for the common good and invested in building responsible, beneficial
artificial intelligence technologies — please connect with us at
olmo-partners@allenai.org.

“OLMo will be something special,” notes Noah Smith, an OLMo project lead and a
Senior Director of NLP Research at AI2. “In a landscape where many are rushing
to cash in on the business potential of generative language models, AI2 has the
unique ability to bring our world-class expertise together with world-class
hardware from AMD and LUMI to produce something explicitly designed for
scientists and researchers to engage with, learn from, and use to create the
next generation of safe, effective AI technologies.”

Pekka Manninen, Director of Science and Technology at CSC, adds: “Generative AI
carries the potential of being the breakthrough technology of this decade,
analogous to how search engines and smartphones penetrated our society in the
previous decades. Open, transparent, and explainable LLMs are vital for the
democratization of this technology. We are proud to be part of this
collaboration for its great societal impact and technological ambition level,
and happy that we can contribute to it with the LUMI supercomputer and our
expertise. Supercomputers like LUMI can accelerate LLM training by an order of
magnitude, and many other features of the LUMI infrastructure position it as a
leading platform for natural language processing.”



AMD, the AMD Arrow logo, EPYC, AMD Instinct, and combinations thereof are
trademarks of Advanced Micro Devices, Inc.



Check out our current openings, follow @allen_ai on Twitter, and subscribe to
the AI2 Newsletter to stay current on news and research coming out of AI2.


Tags: Large Language Models, AI for Good, AI Literacy, AI2 OLMo, Language Generation


WRITTEN BY AI2

Our mission is to contribute to humanity through high-impact AI research and
engineering. We are a Seattle-based non-profit founded in 2014 by Paul G. Allen.