Submitted URL: https://auth.mosaicml.com/
Effective URL: https://www.databricks.com/research/mosaic
Submission: On July 06 via automatic, source certstream-suspicious — Scanned from DE

Rigorous science. Real impact.


MEET SHUTTERSTOCK IMAGEAI, A NEW TEXT-TO-IMAGE DIFFUSION MODEL CODEVELOPED BY
SHUTTERSTOCK AND DATABRICKS

ImageAI is our new text-to-image diffusion model built using the advanced
capabilities of Databricks Mosaic AI and trained exclusively on Shutterstock’s
proprietary image repository. ImageAI generates photorealistic images based on
trusted data.

Read the press release



RESEARCH BLOG

View all blog posts
July 01, 2024
Training MoEs at Scale with PyTorch and Databricks
In a blog post on pytorch.org, researchers at Databricks and Meta discuss
libraries and tools created by both teams that facilitate MoE development within
the PyTorch deep learning framework.

May 23, 2024
Optimizing Databricks LLM Pipelines with DSPy
Researchers working in Databricks co-founder Matei Zaharia’s Stanford research
lab released DSPy, a library for compiling declarative language model calls into
self-improving pipelines. The stages of such a pipeline produce intermediate
outputs that are combined with an initial input to produce a final answer. Just
as data pipelines and machine learning models led to the emergence of MLOps,
LLMOps is being shaped by DSPy’s framework of LLM pipelines and foundation
models like DBRX.

May 14, 2024
Building DBRX-class Custom LLMs with Mosaic AI Training
We discuss Mosaic AI Training, available today for Databricks customers, and the
capabilities that enabled us to train DBRX, our open, state-of-the-art,
general-purpose LLM, scaling training to 3072 NVIDIA H100s and processing more
than 12 trillion tokens in the process.



TECHNOLOGY

DBRX

DBRX is an open source, commercially usable LLM developed by our team at
Databricks and released in March 2024. As of its release, it is the
highest-quality open source model available. Thanks to its sparse
mixture-of-experts architecture, it is also fast, fitting these extraordinary
capabilities into just 36B active parameters.

Download on Hugging Face | DBRX Technical Blog | DBRX Founders Blog |
DBRX on Databricks Model Serving

Shutterstock ImageAI, powered by Databricks

ImageAI is trained exclusively on Shutterstock’s repository to create
high-resolution images based on trusted data.

Shutterstock ImageAI Press Release

Mosaic BERT

Pretrain your own BERT model on your data from scratch using Mosaic AI for $20.

Code on GitHub | Blog Post

MPT

The MPT models are a family of open source, commercially usable LLMs released in
summer 2023. They include MPT-30B (prioritizing quality) and MPT-7B
(prioritizing efficiency). You can download versions of these models that we
have trained or you can train your own MPT models on your data using the Mosaic
AI Multi-Cloud Training (MCT) product.

Download on Hugging Face | MPT-30B Blog | MPT-7B Blog | MPT-7B-8K Blog

Mosaic Diffusion

Mosaic Diffusion is a generative model that turns text descriptions into images,
designed to be highly efficient.

Code on GitHub | Blog Post

Composer

Composer is an open source deep-learning training library optimized for
scalability and usability.

Code on GitHub | Farewell CUDA OOM

LLM Foundry

Databricks LLM Foundry is a highly efficient, open source codebase for training,
fine-tuning and evaluating LLMs.

Code on GitHub | Throughput Tables

Performance

Our deep learning stack is the most efficient for training, fine-tuning and
deploying large models at scale.

Fast LLM Inference | FP8 for Serving

StreamingDataset

StreamingDataset is an open source PyTorch DataLoader that makes it easy and
efficient to stream training datasets.

Download on GitHub | Blog Post

Evaluation Gauntlet

The Evaluation Gauntlet is a library for evaluating the quality of generative
language models.

Code on GitHub | Blog Post


Explore Mosaic Research Teams

 

Mosaic Research has a proven record of making breakthroughs in generative AI and
LLMs. Now we’re looking for researchers and engineers who want to make an
impact. If you’re truth-seeking, data-driven and work from first principles,
join us.

 

See open roles

 

“At Mosaic Research, we’re about three things: rigorous science, making a
difference for our customers, and having a blast doing it.”

 

— Jonathan Frankle, Chief Scientist — Neural Networks

 



“There’s a lot of humility, a lot of open-mindedness, and a lot of playfulness.
It’s a refreshing environment where people aren’t afraid to ask questions.”

 

— Kartik Sreenivasan, Sr. Research Scientist

 



“It’s nice to know that when you work on a research problem, it will actually
have real-world significance if you solve it.”

 

— Zach Anker, Research Engineer Intern

 




Databricks Inc.
160 Spear Street, 15th Floor
San Francisco, CA 94105
1-866-330-0121

© Databricks 2024. All rights reserved. Apache, Apache Spark, Spark and the
Spark logo are trademarks of the Apache Software Foundation.
