dave1629's comments

dave1629 · 2025-05-11T16:49:13

From the Conclusion: "In applying current law, we conclude that several stages in the development of generative AI involve using copyrighted works in ways that implicate the owners’ exclusive rights. The key question, as most commenters agreed, is whether those acts of prima facie infringement can be excused as fair use. ... But making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries. ... These groundbreaking technologies should benefit both the innovators who design them and the creators whose content fuels them, as well as the general public."

MoonGhost · 2025-05-12T04:50:29

Did they manage to come up with any recommendations, other than to stop it all? In that case we have DeepSeek R1. China will be happy, as Trump will have to force Nvidia to send its best chips there.

yieldcrv · 2025-05-11T19:49:07

So many issues with that. The Copyright Office doesn’t police access, which involves consuming; it polices distributing.

So then, for them to determine fair use, they need the Department of Justice involved to say the access was illegal? Since when? Just to highlight the absurdity: “illegal” meaning a terms of service violation, despite the fact that everyone using the service can consume copyrighted works? This circles back to the now paradoxical issue that consuming is not copyright infringement, yet this would require the Copyright Office to police terms of service, which is impossible.

This is too paradoxical to even entertain, but that's why the office led with “current law”, because it is completely unaccommodating to a real social problem. A lot of artists and other people are uncomfortable with the current law and with generative AI. New law could patch this, except:

Artists don't actually like the generative AI that isn't trained on copyrighted works either.

The laws are going to change too slowly, and there are already models that meet the high bar that detractors started with.

New works that were specifically licensed for use in AI training and compensated.

The outcome is still the same. More people can express themselves. People with years of discipline are no longer needed.

By the time any law could actually address noncompliant models - to this new imagined standard - compliant models will already have obsoleted the same trade.

jawon · 2025-05-12T04:48:30

This is a standard book copyright notice:

All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except as permitted by U.S. copyright law.

“Reproduced” and “electronic” are the relevant terms here.

I remember when GPT-3 came out and you could get it to spit out chunks of Harry Potter, and I wondered why no one was being sued.

The models are built on copyright infringement. Authors and publishers of any kind should be able to opt out of being included in training data and ideally opt-in should be the default.

And I hope one day someone trains a model without the use of works of fiction and we find out whether there is a qualitative difference in performance. Does a coding model really need to encode the customs, mores and concerns of Victorian-era fictional characters to write a Python function?

yieldcrv · 2025-05-12T07:11:38

> except as permitted by U.S. copyright law.

These are the relevant terms to me. That notice isn’t law at all, and here the exceptions make the rule.

comex · 2025-05-12T00:46:30

FYI, the Copyright Office doesn’t enforce copyright law or determine its correct interpretation. Courts do. The legal analysis in this report is really just a suggestion, and judges probably won’t give it too much weight.

As for illegal access, I agree that the report uses the term a bit too loosely. But as we’ve seen in the Meta case, some companies have obtained training material not through TOS-violating downloads but through literal (unauthorized) torrents. As we’ve also seen in the Meta case, even torrenting is technically not copyright infringement if you’re not seeding. But the process does rely on someone else seeding, so the report doesn’t seem wholly unreasonable in suggesting that this could “reflect bad faith” or “bear on the character of the use”.

dave1629 · on June 19, 2017

(co-author here)

Good question - it means the whole system (that is, both the client and server) was designed with the goal of limiting the exposure of the user's passwords in every way we can - including time (not exposing them in the DOM and having the minimum possible exposure to the browser with acceptable user experience), code (minimizing the amount of code that has access to the user's credentials), and organizations (minimizing the trust the user has to put in any one provider).

dave1629 · on March 6, 2017

I used this book for my course at the University of Virginia last semester, https://uvacs2102.github.io/.

The book is terrific, and it is wonderful that it is available under a CC license. Most students found it accessible. We covered roughly Chapters 1-8 (and some things not in the book). (This is only the first part of the book, so a much slower pace than the MIT course. Some of this is to have more time for additional background and to go in depth on things, but also our courses are scheduled with about 2/3 the amount of time per class that an MIT course has.)

dave1629 · on March 29, 2014

What's missing from this is that Google's culture is designed to eliminate work-life balance by making as much of life as possible focus around work. A significant part of the compensation for Google employees is in the form of good "free" meals (including dinners), lots of services, entertainment, etc., all without leaving the Google campus. Would be interesting to see how the results are different from similar surveys done at a company like Microsoft that doesn't have such an all-encompassing work environment.

jacques_chester · on March 30, 2014

Microsoft pretty much invented the all-encompassing work environment. The nickname in the 90s for the Redmond campus was "The Velvet Sweatshop".

vdm · on March 30, 2014

Microserfs

dave1629 · on Feb 9, 2014

JavaScript didn't exist 20 years ago, so if you chose it 20 years ago you were quite prescient! (First released as LiveScript in 1995.) These things do change more quickly than people realize...

icebraining · on Feb 9, 2014

Same with Java, actually (first public release was in 1995 as well).

dave1629 · on Jan 4, 2014

I'm the instructor for the class (and author of the post). I'll try and clarify a few things here.

If anyone is interested in participating in the next version of the course (which starts Jan 14), please submit the form here: http://rust-class.org/pages/spring2014.html

- "Operating systems" vs. "Systems programming"

If your definition of an operating systems course is a course where you implement your own OS or hack on the Linux kernel, this wasn't an operating systems course. But most people have a broader interpretation of operating systems courses today, which includes courses where you learn about the layers between high-level programs and physical things, and about how to build robust, scalable, and secure computing systems. This is the second type of course. Deciding to do a course that was not entirely focused on building an OS was not related to the choice of using Rust (I already knew I didn't want to do a build-an-OS course before thinking about which language to use), and I don't think it's controversial (this is what the majority of top programs already do). I discuss more about this in the general course wrap-up: http://rust-class.org/pages/course-wrapup.html

- Eliminating "Race Conditions"

You are right in pointing out that Rust doesn't eliminate all race conditions (and no language that allows multiple threads and any interaction with the external world really could do this). My wording here was very careless. What Rust does is use language/compiler mechanisms to eliminate the kinds of pernicious data races (multiple threads reading and writing the same mutable state in uncontrolled ways) that are a very common, hard-to-find, and hard-to-fix problem in most multi-threaded programs.
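A minimal sketch of what that looks like in practice, written against today's stable Rust standard library rather than the pre-1.0 Rust available when this was posted. The function name `parallel_count` and its parameters are hypothetical, chosen only for illustration. Shared mutable state has to be wrapped in something like `Arc<Mutex<_>>` before several threads can touch it, so every update goes through the lock:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: spawn `workers` threads that each add 1 to a
/// shared counter `per_worker` times, then return the final count.
fn parallel_count(workers: usize, per_worker: usize) -> usize {
    // Arc gives shared ownership across threads; Mutex guards the mutation.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // The only way to reach the integer is through the lock.
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind to a local so the lock guard is released before `counter` is dropped.
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1_000));
}
```

Handing the same `&mut usize` to several spawned threads instead does not compile, which is the compile-time check on data races described above. Higher-level race conditions, such as two threads making decisions in an unlucky order, are still possible.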

- C's assignment operator

My perhaps somewhat hyperbolic diatribe about C using ‘=’ for assignment is meant to illustrate how design decisions that C's designers made for good reasons, given the computing systems they were using in the 1960s/1970s, would not be the best decisions if one were designing a language from scratch today. I think the choice of ‘=’ illustrates this well, but there are dozens of other more serious issues in C's design that were good or necessary choices in 1972 but are undesirable legacies today: not having bounds checking, not specifying the order of evaluation for many constructs, allowing arbitrary and unchecked type casting, unsafe memory management, etc. Other languages (including Rust) that are strongly influenced by C syntax have also adopted the ‘=’ symbol for assignment, but that doesn’t make it a good thing. For languages like Rust that are targeting experienced programmers it is probably the right choice; for languages like Python that are intended as first languages, it is really unfortunate, and many smart people who might otherwise turn out to be talented programmers are unnecessarily put off by this. (I don't have concrete data to support this, but from having over 350,000 students in my open intro CS course that uses Python, I have plenty of anecdotal experience with students being confused by this.)

- Vocational value

This was a course at a public university, so its content should not be primarily driven by immediate vocational concerns. There's nothing wrong with vocational courses, where the primary goal of the course is to improve the immediate job prospects of students who take it, but that was not and should not be the goal of courses at public universities. (That said, the students entering this class are already very well qualified for the job market, and many of the 4th years in the class already have job offers, so they are not in danger of being unable to get an interesting job just because they only have C experience from one previous course. Knowing Rust has actually been helpful for some students in the job market, and I don't think having less C experience is a major issue for many positions.)

From a purely pragmatic viewpoint, the majority of the costs of the course are not covered by students’ tuition, but are paid by US and Virginia taxpayers. So, our main goal should be to do what we can to enable and encourage students to do things that make the world a better place (the best ways to do that, of course, are debatable, and maybe it is learning how to hack the Linux kernel, but I think that’s a much tougher argument to make).

dave1629 · on Jan 3, 2014

A great story, with lots of interesting technology aspects too. The Coast Guard seems to have a strange mix of advanced and very primitive technology for conducting searches - a program that uses Monte Carlo simulations to predict the best places to search, (presumably smartly) computes good search paths for all the available resources, and can relay those paths directly to the helicopters' autopilot systems, but the actual searching is then done by people in the helicopters scanning with their eyes, and the computer system crashes with no way to restore state mid-search.

Seems like it shouldn't be too long before the Coast Guard can just launch a few hundred drones with directions to scan the area (e.g., DJI Phantom 2 Vision quadcopters cost only ~$1K each, but might need a few boats to provide networking and recharging out at sea), with computer vision algorithms doing most of the video analysis.

https://plus.google.com/u/0/+DavidEvans/posts/FFQsaWuy8Ry

dave1629 · on Dec 30, 2013

Peter Thiel's version of this: http://blakemasters.com/post/24253160557/peter-thiels-cs183-...

dave1629 · on Dec 29, 2013

Kurt Beyer's book on Grace Hopper gives a great perspective on the early days of computing: http://www.amazon.com/Invention-Information-Lemelson-Studies...

A very comprehensive view and lots of great stories are in James Gleick's _The Information_: http://www.amazon.com/Information-History-Theory-Flood/dp/14...