[Lex Computer & Tech Group/LCTG] Deep Learning Image Synthesis - open to everyone (with a good enough machine)

Thu Sep 8 08:48:54 PDT 2022

You may have heard about DALL-E 2 - the machine learning model that can generate images from a string of text, almost by magic.

Up until recently, in order to play with this or similar tools you had to either apply to get free time, or buy access to a server or service.

In just the past two weeks it has become possible, if you have a good enough machine, to download a similar model and scripts and run them on your own, which I did. It is pretty amazing. The model is called Stable Diffusion and the code is on GitHub. (The model itself is on Hugging Face.)

You can even take a photo or image of your own and have the system modify it… here’s an example. Starting with my photo and a prompt of “Henri Rousseau painting house at night”, it turned my photo into a rather cool picture:

Some notes on this:

To download the model from Hugging Face <http://huggingface.co/>, you need to provide your name and email and agree to their terms on avoiding malicious use, etc. The model itself is 4.2 GB in size.
I followed these instructions for how to run it on a Mac M1 machine: https://replicate.com/blog/run-stable-diffusion-on-m1-mac
Here’s a good overview of the general situation on this: https://arstechnica.com/information-technology/2022/09/with-stable-diffusion-you-may-never-believe-what-you-see-online-again/ - this was the article that got me started on this.
You probably need about 16 GB of memory to run the scripts.
Input and output pictures are limited to 512 x 512 pixels. (I tried a normal iPhone picture without shrinking it and making it square, and the program asked for 133 GB of memory and failed.)
I did have to edit the img2img.py script to get it to work (copying a line from the txt2img.py script).
There are a lot of moral questions about this - what you create, whose artistic work you are implicitly borrowing - but if you choose long-dead artists, I don’t see any harm.
It does not handle faces well, and if you ask for a picture of a dog, the image sometimes has the wrong number of legs, etc.
If you want to try this out without using your own machine, you can use this page: https://replicate.com/stability-ai/stable-diffusion - it was overloaded yesterday but might be working now.
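
On the 512 x 512 limit in the notes above: a rough sketch of how a photo could be cropped to a centered square and shrunk before handing it to img2img.py. This is not part of the Stable Diffusion scripts. It assumes the Pillow library is installed, and the function names and file names are made up for illustration.

```python
from pathlib import Path

# Side length in pixels; 512 is the limit reported in the notes above.
TARGET_SIZE = 512


def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def prepare_image(src: Path, dst: Path, size: int = TARGET_SIZE) -> None:
    """Crop src to a centered square, shrink it to size x size, save to dst."""
    # Assumption: Pillow is available. It is imported here so that the
    # box helper above can be used without it.
    from PIL import Image

    with Image.open(src) as img:
        rgb = img.convert("RGB")  # drop alpha channel or palette modes
        square = rgb.crop(center_square_box(*rgb.size))
        square.resize((size, size), Image.LANCZOS).save(dst)


# Hypothetical usage (file names are placeholders):
# prepare_image(Path("photo.jpg"), Path("photo-512.png"))
```

For a typical 4032 x 3024 phone photo this keeps the middle 3024 x 3024 region and scales it down to 512 x 512. It does not apply any EXIF rotation, so a photo taken sideways may need to be rotated first.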

Ping me if you’d like more details. It’s a lot of fun.

-Dave Cooper

-------------- next part --------------
A non-text attachment was scrubbed...
Name: house-at-night.jpg
Type: image/jpeg
Size: 79813 bytes
Desc: not available
URL: <http://lists.toku.us/pipermail/lctg-toku.us/attachments/20220908/444ae479/attachment.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Henri Rousseau painting House at Night-plus-House-at-Night.png
Type: image/png
Size: 361166 bytes
Desc: not available
URL: <http://lists.toku.us/pipermail/lctg-toku.us/attachments/20220908/444ae479/attachment.png>