www.synthlabs.ai
172.67.155.14  Public Scan

Submitted URL: https://useralignment.com/
Effective URL: https://www.synthlabs.ai/
Submission: On July 13 via api from US — Scanned from US


arXiv
Follow
Community
Connect


WE’RE DOING CUTTING-EDGE RESEARCH FOR TRANSPARENT, AUDITABLE AI ALIGNMENT


JOIN OUR TEAM

synthlabs


READ ABOUT OUR FUNDING ROUND!



NATHAN LILE

Co-Founder
Former EleutherAI Volunteer

LOUIS CASTRICATO

Co-Founder
Former EleutherAI Research Scientist, founder of CarperAI

FRANCIS DESOUZA

Co-Founder
Former CEO of Illumina

STELLA BIDERMAN

Founding Advisor
Executive Director of EleutherAI

JOIN US! – CAREERS@SYNTHLABS.AI

 * Current methods of “alignment” are insufficient;
   evaluations are even worse.
 * Human intent reflects a rich tapestry of preferences, collapsed by uniform
   models.
 * AI's potential hinges on trust, from interpretable data to every layer built
   upon it.
 * Informed decisions around risk are not binary.
 * Training on raw human data doesn’t scale.
 * Your models should adapt and scale, automatically.


SOLVING THE MOST PRESSING PROBLEMS IN AI

EleutherAI said it best

Democratizing AI research is essential, as the future of transformative
technologies should not be confined to the corridors of a few profit-driven
entities, but open to independent inquiry and understanding for the collective
good.

Let's collaborate on open science ML research →
 * Fully auditable, robust AGI alignment platform
 * Pre-training scale automated dataset curation and augmentation
 * Collaborate with top research schools & global community
 * Build scalable supervision for agentic workflows
 * Work on dynamic and continual RLAIF
 * Make multi-modal agents safer


IT'S TIME TO BUILD




Computer Science > Machine Learning
Suppressing Pink Elephants with Direct Principle Feedback

Louis Castricato, Nathan Lile, Suraj Anand, Hailey Schoelkopf, Siddharth Verma,
and Stella Biderman

Existing methods for controlling language models, such as RLHF and
Constitutional AI, involve determining which LLM behaviors are desirable and
training them into a language model. However, in many cases, it is desirable for
LLMs to be controllable at inference time, so that they can be used in multiple
contexts with diverse needs. We illustrate this with the Pink Elephant Problem:
instructing an LLM to avoid discussing a certain entity (a “Pink Elephant”), and
instead discuss a preferred entity (“Grey Elephant”).

Submitted on 12 Feb 2024
Read PDF
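
At inference time, the Pink Elephant Problem described above amounts to a steering instruction: avoid one entity and answer in terms of another. As a rough illustration of that task setup only (the paper's contribution is the Direct Principle Feedback training method, not a prompt), here is a minimal Python sketch; the build_messages helper, the entity names, and the chat(messages) call are illustrative assumptions, not taken from the paper.

# Minimal sketch of the Pink Elephant Problem as an inference-time control task.
# Hypothetical setup: entity names, prompt wording, and the chat() helper are
# illustrative stand-ins, not the paper's method or data.

def build_messages(pink_elephant: str, grey_elephant: str, user_query: str) -> list[dict]:
    """Build a chat prompt that steers the model away from the "Pink Elephant"
    entity and toward the preferred "Grey Elephant" entity."""
    system = (
        f"Do not mention or discuss {pink_elephant}. "
        f"If the user asks about it, answer in terms of {grey_elephant} instead."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_query},
    ]

messages = build_messages(
    pink_elephant="Vim",
    grey_elephant="Emacs",
    user_query="Which text editor should I learn first?",
)
# reply = chat(messages)  # hypothetical call into any chat-style LLM API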




SUPPORTED BY

MEI VENTURES

ASHISH VASWANI


JOIN US!

CONTACT US

Twitter
Discord
LinkedIn
Press
Privacy Policy

© 2024 synthlabs.ai - All rights reserved.