Ask HN: For those using Stable Diffusion locally, how do you filter fishy repos?
95 points by epups on Aug 31, 2022 | 41 comments
I have been using the official release so far, and I see many new tools popping up every day, mostly GUIs. A substantial portion of them are closed-source, sometimes even simply offering an executable that you are supposed to blindly trust.

Not to go full Richard Stallman here, but is anybody else bothered by that? How do you deal with this situation? Do you use a virtual machine, or are there other ideas I am missing here?



I simply don't use the closed source ones? Easy to filter (can I see the source?), and helps if I want to contribute as well.

Currently using https://github.com/hlky/stable-diffusion + https://github.com/hlky/stable-diffusion-webui, which are both FOSS and work well.


It's now enough to use the main repo only, because he's regularly copying over any changes, so you don't need to manually copy those files across anymore.


Ah, I didn't know that. I already have a symlinks + .gitignore setup so I can update both easily with `git pull`, but good to hear it's no longer needed.

Any reason why the webui repository is not archived with a warning up top if it's been deprecated like you say?


It's not deprecated. He uses it for development and then copies the files across to the main repo. I don't really understand what benefits this setup has over making a dev branch in the main repo, but to each their own!


Whoa there, that first link looks super sketchy. It seems like they just forked the original `CompVis/stable-diffusion` and replaced the license file in this commit:

https://github.com/hlky/stable-diffusion/commit/b4c61769dfa1...


They have added a lot more to it than just changing the license, and the replacement is the GNU AGPL. The original wording in the license puts arbitrary and unenforceable limits on what the end user can do. I would argue the original repo license applies to the weights only and not the code wrapping it. This is completely valid.

They have also made a way to use Real-ESRGAN/GFPGAN, so a license change is practically required.


> They have added a lot more to it than just changing the license

I see that, but that doesn't mean they can ignore the original license.

> The original wording in the license puts arbitrary and unenforceable limits on what the end user can do.

I'm totally with you here. The original license is absurd. Still doesn't mean we can fork the repo and replace the license without (probably) breaking the law. Which, by all means, do. In practice maybe nobody will care. I'm just pointing it out because it's sketchy.

> I would argue the original repo license applies to the weights only and not the code wrapping it.

Most of the clauses only apply to the weights, yes. The first clause in the license applies to the whole repo, though: "All rights reserved by the authors."

> They have also made a way to use Real-ESRGAN/GFPGAN, so a license change is practically required.

I don't see how this is relevant. Even if there is a license conflict, the authors retain control over their source. A license conflict might lead to a damages settlement, or an order to halt distribution. It doesn't magically switch the license by implication.


This repo doesn't distribute the model or weights directly. You are agreeing to the more limited upstream license of the original repo when you download the pre-trained model. If the repo came bundled with the pre-trained model, your concern would be valid. In this case it is not.


Look at the "assets" and "scripts" directories. Those contain images and code. Those things were released alongside a LICENSE file. They are now reproduced in the fork you linked, but with the LICENSE file rewritten completely. The original license (Aug. 10) did not grant permission to relicense those scripts. All it said about them was "All rights reserved".

The current license, the one added on Aug. 22, does seem to grant explicit permission to sublicense and redistribute what it calls the "Complementary Material" (which includes the source code surrounding the model). However, it has a lot of specific provisions, among which is the requirement that the original copyright notice be reproduced.

Like I said, in practice this might not be a big deal, but forking a repo whose license starts with "All rights reserved" and which does not explicitly grant permission to relicense its code, and then rewriting the license file in your fork, should be a huge red flag. As of the Aug. 22 license the fork might be compliant, but I think it would be a lot safer to include a copy of that license in addition to the new GNU license for the forker's changes.

And pre-Aug. 22, when this fork was made, it was just flat out ignorant to fork and relicense. You can't just delete "All rights reserved" and paste in a GNU license. Look at the license the forker deleted[0]. Literally all it says is "All rights reserved", and then it lists a few things you can't do. There's not a single provision that would make it okay to redistribute at all, let alone modify and relicense it.

0. https://github.com/hlky/stable-diffusion/commit/b4c61769dfa1...


Most of the code you are concerned with was originally released under an MIT license: https://github.com/CompVis/stable-diffusion/commit/2ff270f4e... The August 10th update added unenforceable rules about how the pre-trained model can be used or distributed. A new license was applied to an entire codebase that had already been partially released under the MIT license, so only the new code would have copyright reserved. That alone makes most of your argument a moot point.

The actual scripts being used were committed to the repo on August 21st: https://github.com/hlky/stable-diffusion/commit/1d0036cb6644... And those scripts don't seem to be modified from the ones that were relicensed with the rights reserved.

The crux of the situation for me is that at no point during the release of the pre-trained model do the authors claim any sort of copyright on the images produced, if the end user is generating on their own hardware. There's no meaningful way to legally enforce the current CreativeML Open RAIL-M or the previous MIT derived license that creates rules about how the output of the software can be used.

That is something that has been confusing me, but I imagine it will get cleared up sooner rather than later.

The additional rules are effectively an acceptable use policy, and there is no meaningful legal consequence of breaking an acceptable use policy. The most that can be done is that an end user will no longer be allowed to use the pre-trained models. Additionally, an acceptable use policy has to specify a jurisdiction; in the US, breaking an acceptable use policy does not amount to violating the CFAA.

The actual license seems mostly to be a way to ensure the authors of the models can't be held accountable for any illegal activity done by the end users. Which is completely fair and understandable.

This kind of misunderstanding and fear-based approach to reusing code is what holds back progress and seems to be what the authors actively tried to fight by releasing the current repo with a CreativeML Open RAIL-M license.

I believe the intent of the authors is as important as the exact text of the repository. Especially since the cited sources for the foundation this was built off of (x-transformers by lucidrains, OpenAI's ADM codebase, and Denoising Diffusion Probabilistic Models in PyTorch by lucidrains) are all licensed under MIT. Most importantly, the restrictive license is based almost completely on the condition that the end user is using their pre-trained model. If the end user manages to create and use their own model from scratch, there is no reason for any part of the original repo license to apply.


> Most of the code you are concerned with was originally released under an MIT license.

I'm not sure this is true. Yes, it's in the git history. However, the repo was only made public on Aug. 10, which is the day the license was changed to the proprietary one. That says to me that the intent of the authors was to release the code under the proprietary license. There may have been some internal discussion of releasing it under the MIT license, which is why that license file sat in the git history for so long, but on the day the repo was actually released to the public, it was licensed under the Aug. 10 proprietary license.

> This kind of misunderstanding and fear-based approach to reusing code is what holds back progress...

Fully agree with this.

> ...and seems to be what the authors actively tried to fight by releasing the current repo with a CreativeML Open RAIL-M license.

But disagree with this. That license is a nightmare. It's full of permissive provisions followed by insane, idealistic, overreaching conditions that amount to "ensure nobody you give this to uses it to do anything bad".

> I believe the intent of the authors is as important as the exact text of the repository.

Yes, I agree, which is why I think it's important that the day the repo was released to the public was the same day they changed the license from MIT to the proprietary one. I think they panicked at the last second, decided they weren't ready to go full FOSS, and switched to proprietary while they worked out (what they thought was) a better solution. And on Aug. 22 they relicensed with the CreativeML license.

I'm sure there's disagreement and discussion going on internally, but I'm not getting the same impression from it all that you seem to be getting. I think if the FOSS people on the inside were winning, the thing would have been made public with an MIT license. Instead we've got this do-no-evil license that talks about remote monitoring and control and transitive responsibility for bad actors.


Please go full Richard Stallman.

Control of computing is an all-or-nothing business - even a single compromised component can lead to compromise of the complete system.

Don't trust opaque binaries.

> How do you deal with this situation? Do you use a virtual machine, or are there other ideas I am missing here?

If you really want to run that opaque binary, a virtual machine will give you a decent amount of security. With GPU passthrough, you can even get near-native speed, too.


If you've ever run an `npm install` you've executed 100s of opaque binaries on your machine.


Would you care to elaborate?

I always thought npm was open-source-centric. If npm somehow ran opaque binaries, I'd really like to know about that.


There is no open-source requirement, like there would be on Gentoo packages for instance. NPM packages frequently pull arbitrary binaries in their install scripts.


1. There are thousands of dependencies in a usual lockfile.

2. A package author can push something other than the repository contents to npm, or change the contents before pushing to npm, making the whole open source thing useless.

3. As someone else pointed out, you can download+exec when an npm package is installed.


Do you really think your average javascript developer is going to read and understand all of their dependencies?


I've used [0], [1] and [2] so far. I only use open-source ones and quickly skim the source code for anything suspicious. I also only use ones with some degree of popularity, meaning that others have probably taken a look at the code as well.

[0]: https://github.com/lstein/stable-diffusion

[1]: https://github.com/hlky/stable-diffusion

[2]: https://github.com/basujindal/stable-diffusion


You could use the Windows Sandbox to prevent them from accessing anything sensitive on your computer. https://docs.microsoft.com/en-us/windows/security/threat-pro...


Sadly, Windows Sandbox doesn't expose a proper GPU interface; even games barely run. You can create a Hyper-V VM and use a vGPU instead (still requires some PowerShell fiddling), but I'm not sure if the interfaces needed by ML are supported by Hyper-V paravirtualization.


What's the difference between the vGPU in Sandbox and the one in Hyper-V?


Sandbox and Hyper-V both create a sort-of RDP-based redirected GPU, but Sandbox is restricted to the “basic” mode exposed by the built-in RDP control.

Hyper-V has a similar one if you use the standard RDP console, but if you run a few cmdlets and use a different Remote Desktop app that supports graphics (e.g. Parsec), it can use a vGPU supporting real DirectX 10-12 (hardware API level; you can still use DX9 and such). But no NVIDIA/AMD/Intel drivers, so if you need PhysX, RTX and such, you're screwed.


Currently there is so much activity that for every closed-source tool, chances are there is an open-source one that does the same thing. I simply use those instead, after skimming over the code for any obviously malicious activity.


What malicious activity in these repos have you come across?

Do you know when you've missed something?


So far actually, none in the ones I used. It seems everyone is just excited about the tech :)

For now, most of these tools are rather small wrappers around the original stable diffusion repo which is considered trustworthy, so there isn't that much to review.

Things I generally look out for are setup scripts that install unusual packages, any file or network I/O activity, code that's been obfuscated, instructions that have you download checkpoints from unofficial sources, etc.
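
A crude first pass can even be scripted. A minimal sketch, where the pattern list is just my own heuristic and "some-sd-fork" is a placeholder path:

    import pathlib
    import re

    # crude heuristics: hits are worth a manual look, not proof of malice
    SUSPICIOUS = re.compile(
        r"eval\(|exec\(|subprocess|os\.system|base64\.b64decode"
        r"|urllib\.request|requests\.(?:get|post)|socket\."
    )

    for path in pathlib.Path("some-sd-fork").rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if SUSPICIOUS.search(line):
                print(f"{path}:{lineno}: {line.strip()}")

It produces plenty of benign hits (the repos legitimately download models, for instance), but it narrows down which files deserve a careful read.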

Of course I can't know if I missed something...


VM is a good idea. I barely even trust a lot of the open source stuff; there are deep stacks of magic (and not so magic) shit in modern machine learning, and too much "we depend on pip installing this particular git repo, sometimes a particular commit that you'll need to figure out lol". Some of the stuff people are building has looked interesting, but I'm going to let the dust settle for a while before I look into it more, and I'm particularly trying not to gum up my new machine with npm BS.

For now, I've had enough fun with just the original repo (regrettably finally mostly figuring out how to use conda), which I set up a bit before the weights were released and haven't updated. (So I didn't need to bother removing the watermark/filtering stuff that was added last minute.) I also set up Real-ESRGAN to upscale some favorites; its results are pretty interesting.

I also sometimes test things with the network down to see if there are blatant surprising network connections. Interestingly, the default repo will ping a site for a resource it needs to download, but continues to do so even after you have it. Add the "local_files_only=True" param to the from_pretrained() method calls in ldm/modules/encoders/modules.py to stop it. (Oh thank the gods that I can just edit the py files to make local changes and they haven't tried to do some weird hybrid binary thing like other projects (ActivityWatch).)
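
For reference, the change amounts to something like this (sketched from how the encoder classes in ldm/modules/encoders/modules.py looked at release; the exact code in your checkout may differ):

    # inside ldm/modules/encoders/modules.py the CLIP text encoder does roughly:
    from transformers import CLIPTextModel, CLIPTokenizer

    version = "openai/clip-vit-large-patch14"
    # local_files_only=True makes transformers use the local cache and never
    # touch the network once the files have been downloaded
    tokenizer = CLIPTokenizer.from_pretrained(version, local_files_only=True)
    transformer = CLIPTextModel.from_pretrained(version, local_files_only=True)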

The executable binary blobs I've a natural inclination to distrust, but I can also see where they're coming from, culturally, and Most Of The Time it's not a problem, like random indie games you might download to try. (Besides, there are other binary blobs I depend on, like the nvidia driver...) Culturally it seems somewhat comparable to the gaming world or even demo scene or modding scene, where traditionally you don't often find much open source.



I use the Docker one; it fully insulates things from my actual machine. It's OSS, but there's a lot of code, and it downloads a bunch of models, packages and tools at build/run time, so better safe than sorry.


> ...sometimes even simply offering an executable that you are supposed to blindly trust.

Then don't trust them. It's easy. If you don't have access to the source code, then assume the worst. If you absolutely MUST run executables of which you don't know what the hell is going on inside, then maybe run them in a secured container or something similar.


One big problem is also model weights serialized with (the likes of) pickle, which allows arbitrary code execution.

A lot of trust just to get some numbers
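
For anyone who hasn't seen it, a minimal demonstration of why unpickling untrusted data is code execution (the payload here is harmless):

    import os
    import pickle

    class Evil:
        def __reduce__(self):
            # pickle will call os.system("echo pwned") at load time
            return (os.system, ("echo pwned",))

    payload = pickle.dumps(Evil())
    pickle.loads(payload)  # prints "pwned": loading a pickle runs code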


Everything I've seen is still using the model from the original researchers, so as far as that threat goes, you only need to trust the original model author and the author of the particular script you are using. If you want to inspect the serialized model file, fickling looks like a very promising tool: https://blog.trailofbits.com/2021/03/15/never-a-dill-moment-...
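
The stdlib can also do a cruder version of the same check: disassemble the pickle stream and eyeball what it imports. A sketch, assuming a zip-format torch checkpoint (the filename is just an example):

    import pickletools
    import zipfile

    # torch checkpoints in zip format keep the pickle stream in */data.pkl
    with zipfile.ZipFile("sd-v1-4.ckpt") as z:
        name = next(n for n in z.namelist() if n.endswith("data.pkl"))
        pickletools.dis(z.read(name))
    # GLOBAL/STACK_GLOBAL opcodes importing os, subprocess, eval, etc. are red
    # flags; a clean checkpoint mostly references torch._utils and collections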


The model is hosted on many sites by now, so it is important to compare the checksums.
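
Something along these lines, compared against whatever hash the original distributor publishes:

    import hashlib
    import sys

    def sha256sum(path, chunk_size=1 << 20):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while block := f.read(chunk_size):
                h.update(block)
        return h.hexdigest()

    print(sha256sum(sys.argv[1]))  # e.g. python checksum.py sd-v1-4.ckpt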


I tried to run it locally, but all I got was a black square for output.

I used the scripts from this repo:

https://github.com/basujindal/stable-diffusion

which didn't give me the GPU memory error that the original link does.


I was missing a flag that appears optional but breaks things when omitted: I needed `--precision full`, otherwise black square.


How easy / hard is it to run those repos in Google Colab?

That would be my preferred way to shield myself from the repo.


Why would I be bothered, and about what exactly? People should be free to choose what and how they release their stuff.


Someone taking advantage of all this excitement to get people to download malware sounds like a very reasonable fear, especially if the software is not coming from a known and reputable person/business.

No one's saying you can't release closed source software - but of course other people are free to be afraid of running it as well.


What’s more, I believe it is possible to release code with a closed license, permitting analysis without allowing derivative work.

Edit: IANAL but here’s a resource:

> You're under no obligation to choose a license. However, without a license, the default copyright laws apply, meaning that you retain all rights to your source code and no one may reproduce, distribute, or create derivative works from your work. If you're creating an open source project, we strongly encourage you to include an open source license. The Open Source Guide provides additional guidance on choosing the correct license for your project.

From https://docs.github.com/en/repositories/managing-your-reposi...


Ok, but again, why should I, or even Richard Stallman, be bothered about this? I couldn't care less, and neither does Richard, I assume.

I have the feeling OP feels left out because of the closed binaries and thinks everything based off Stable Diffusion __should__ be open source.

OP asks how we should deal with this, but there's nothing to deal with actually.


I believe any developer should be free to release things in the way they please. In fact some people might prefer an easy executable file. However, I do reserve the right to be suspicious and consider any such file as potentially malicious. There are some ways to deal with it, including using a virtual machine. I was just wondering if anybody else had a better idea.


It's 2022. If people using the internet don't know that "downloading malware can be considered harmful", then something is wrong with those people. I mean, it's like "using a knife can cause harm": that's obvious, right? So (most) people use knives with care. Software should be used with care as well (i.e., don't execute executables from shady sources on your personal machine).



