FWIW it feels like Google should dominate text/image -> video since they have ac...

paxys · on Dec 16, 2024

Everyone has access to YouTube. It’s safe to assume that Sora was trained on it as well.

Jeff_Brown · on Dec 16, 2024

All you can eat? Surely they charge a lot for that, at least. And how would you even find all the videos?

chefandy · on Dec 17, 2024

Nobody in this space gives a fuck about anyone or anything further upstream than the file sitting in their ingestion queue. If they can see it, they 'own' it.

HeatrayEnjoyer · on Dec 16, 2024

Who says they've talked to Google about it at all?

I can't speak to OpenAI but ByteDance isn't waiting for permission.

KaoruAoiShiho · on Dec 17, 2024

ByteDance has their own unlimited supply of videos...

HeatrayEnjoyer · on Dec 19, 2024

That hasn't stopped them.

https://news.ycombinator.com/item?id=41756209

griomnib · on Dec 16, 2024

They already did it, and I’m guessing they were using some of the various YouTube downloaders Google has been going after.

bangaladore · on Dec 16, 2024

Does everyone have "legal" access to YouTube?

In theory that should matter to something like Open(Closed)Ai. But who knows.

dheera · on Dec 16, 2024

I mean, I have trained myself on YouTube.

Why can't a silicon being train itself on YouTube as well?

dmonitor · on Dec 16, 2024

Because silicon is a robot. A camcorder can't catch a flick with me in the theater even if I dress it up like a muppet.

the_sleaze_ · on Dec 17, 2024

Not with that attitude.

A corporation "is a person" with all the rights that come along with that - free speech etc.

dheera · on Dec 17, 2024

What if I'm part-carbon, part-silicon?

Like, a blind person with vision restored by silicon eyes?

Do I not have rights to run whatever firmware I want on those eyes, because it's part of my body?

Okay, so what if that firmware could hypothetically save and train AI models?

dmonitor · on Dec 17, 2024

presumably, it should be illegal to record a movie with an inbuilt camera. capturing the data in such a way that an identical copy can automatically be reproduced breaks the social contract around the way those works are shared. the majority of media is produced by large companies that are ultimately not harmed by such activities, but individual artisans that create things shouldn't be subjected to this.

we can take this a step further: if your augmented eyes and ears can record people in a conversation, should you be allowed to produce lifelike replicas of people's appearance and voice? a person can definitely imagine someone saying/doing anything. a talented person with enough effort could even make a 3D model and do a voice impression on their own. it should be obvious that having a conversation with a stranger doesn't give them permission to clone your every detail, and shouldn't that also be true for your creations?

lexandstuff · on Dec 17, 2024

The difference is that you didn't need to scrape millions of videos from YouTube with residential proxy network scrapers to train yourself.

dheera · on Dec 17, 2024

Only because I'm significantly more intelligent than ChatGPT, so I can achieve its level of competency on a lot of things with a thousand videos instead of a million videos.

If it just reduces to an issue of data efficiency, AI research will eventually get there though.

jillyboel · on Dec 17, 2024

Humans have rights, machines don't.

BriggyDwiggs42 · on Dec 16, 2024

When a company trains an AI model on something, and then that company sells access to the AI model, the company, not the AI model, is the being violating copyright. If Jimmy makes an android in his garage and gives it free will, and then it trains itself on YouTube, I doubt anyone would have an issue.

golol · on Dec 17, 2024

If OpenAI training on YouTube videos violates copyright then so does Google training on them.

BriggyDwiggs42 · on Dec 18, 2024

In what possible way is that true? Not that I like it, but Google has its creators sign away the rights to their material for uses like this. Nobody signs a contract with OpenAI when they make their YouTube videos.

dheera · on Dec 18, 2024

When you sign away full rights to one company, that one company can give rights to another company (for money or not).

They could also just acquire that other company.

From the creator's standpoint, rights signed away to one company are as good as gone.

BriggyDwiggs42 · on Dec 19, 2024

Did OpenAI make a deal with Google to train on YouTube?

hirako2000 · on Dec 16, 2024

They also had a good chunk of the web's text indexed, millions of people's emails sent every day, Google Scholar papers, and the massive Google Books project that digitized most books ever published, and they even discovered transformers.