FWIW it feels like Google should dominate text/image -> video since they have unfettered access to YouTube. Excited to see what the reception is here.


Everyone has access to YouTube. It’s safe to assume that Sora was trained on it as well.


All you can eat? Surely they charge a lot for that, at least. And how would you even find all the videos?


Nobody in this space gives a fuck about anyone or anything further upstream than the file sitting in their ingestion queue. If they can see it, they 'own' it.


Who says they've talked to Google about it at all?

I can't speak to OpenAI but ByteDance isn't waiting for permission.


ByteDance has their own unlimited supply of videos...



They already did it, and I’m guessing they were using some of the various YouTube downloaders Google has been going after.


Does everyone have "legal" access to YouTube?

In theory that should matter to something like Open(Closed)AI. But who knows.


I mean, I have trained myself on YouTube.

Why can't a silicon being train itself on YouTube as well?


Because silicon is a robot. A camcorder can't catch a flick with me in the theater even if I dress it up like a muppet.


Not with that attitude.

A corporation "is a person" with all the rights that come along with that - free speech etc.


What if I'm part-carbon, part-silicon?

Like, a blind person with vision restored by silicon eyes?

Do I not have rights to run whatever firmware I want on those eyes, because it's part of my body?

Okay, so what if that firmware could hypothetically save and train AI models?


presumably, it should be illegal to record a movie with an inbuilt camera. capturing the data in such a way that an identical copy can be automatically reproduced breaks the social contract around the way those works are shared. the majority of media is produced by large companies that are ultimately not harmed by such activities, but individual artisans that create things shouldn't be subjected to this.

we can take this a step further: if your augmented eyes and ears can record people in a conversation, should you be allowed to produce lifelike replicas of people's appearance and voice? a person can definitely imagine someone saying/doing anything. a talented person with enough effort could even make a 3D model and do a voice impression on their own. it should be obvious that having a conversation with a stranger doesn't give them permission to clone your every detail, and shouldn't that also be true for your creations?


The difference is that you didn't need to scrape millions of videos from YouTube with residential proxy network scrapers to train yourself.


Only because I'm significantly more intelligent than ChatGPT, so I can achieve its level of competency on a lot of things with a thousand videos instead of a million videos.

If it just reduces to an issue of data efficiency, AI research will eventually get there though.


Humans have rights, machines don't.


When a company trains an AI model on something, and then that company sells access to the AI model, the company, not the AI model, is the one violating copyright. If Jimmy makes an android in his garage and gives it free will, and then it trains itself on YouTube, I doubt anyone would have an issue.


If OpenAI training on YouTube videos violates copyright then so does Google training on them.


In what possible way is that true? Not that I like it, but Google has its creators sign away the rights to their material for uses like this. Nobody signs a contract with OpenAI when they make their YouTube videos.


When you sign away full rights to one company, that one company can give rights to another company (for money or not).

They could also just acquire that other company.

From the creator's standpoint, rights signed away to one company are as good as gone.


Did OpenAI make a deal with Google to train on YouTube?


They also had a good chunk of the web's text indexed, millions of people's emails sent every day, Google Scholar papers, and the massive Google Books project that digitized most books ever published. They even discovered transformers.



