Effective URL: https://www.404media.co/tumblr-and-wordpress-to-sell-users-data-to-train-ai-tools/?utm_source=www.yourroundup.co&utm_med...

TUMBLR AND WORDPRESS TO SELL USERS’ DATA TO TRAIN AI TOOLS

Samantha Cole · Feb 27, 2024 at 1:21 PM
Internal documents obtained by 404 Media show that Tumblr staff compiled users'
data as part of a deal with Midjourney and OpenAI.

Tumblr and WordPress.com are preparing to sell user data to Midjourney and
OpenAI, according to a source with internal knowledge about the deals and
internal documentation referring to the deals. 

The exact types of data from each platform going to each company are not spelled
out in documentation we’ve reviewed, but internal communications reviewed by 404
Media make clear that deals between Automattic, the platforms’ parent company,
and OpenAI and Midjourney are imminent.

The internal documentation details a messy and controversial process within
Tumblr itself. One internal post made by Cyle Gage, a product manager at Tumblr,
states that a query made to prepare data for OpenAI and Midjourney compiled a
huge number of user posts that it wasn’t supposed to. It is not clear from
Gage’s post whether this data has already been sent to OpenAI and Midjourney, or
whether Gage was detailing a process for scrubbing the data before it was to be
sent. 

Gage wrote:

“the way the data was queried for the initial data dump to Midjourney/OpenAI
means we compiled a list of all tumblr’s public post content between 2014 and
2023, but also unfortunately it included, and should not have included:

 * private posts on public blogs
 * posts on deleted or suspended blogs
 * unanswered asks (normally these are not public until they’re answered)
 * private answers (these only show up to the receiver and are not public)
 * posts that are marked ‘explicit’ / NSFW / ‘mature’ by our more modern
   standards (this may not be a big deal, I don’t know)
 * content from premium partner blogs (special brand blogs like Apple’s former
   music blog, for example, who spent money with us on an ad campaign) that may
   have creative that doesn’t belong to us, and we don’t have the rights to
   share with this-parties; this one is kinda unknown to me, what deals are in
   place historically and what they should prevent us from doing.”


© 2024 404 Media. Published with Ghost.




