Replies

  • I mentioned that every flavour of generative neural-net image synthesis model has a very distinct visual and aesthetic appearance. Let's take a look at a simple example below from a little project I've been working on.

    I've been running guided generative drift experiments, all based on the initial text phrase 'the brutality of war'.

    Here are 3 different outputs from 3 different models, all given the exact same text prompt: 'the brutality of war'.


    latent diffusion with the LAION model

     


    VQGAN-CLIP

     


    CLIP-guided diffusion (from Disco Diffusion)

     

    Note that all 3 images are visually very distinct, almost like 3 different visual styles.

    Now someone is going to immediately chime in that I'm not using any kind of elaborate text prompt engineering to push the system in a certain direction.

    Yes, exactly. We're looking at what is going on in each system when given a much more limited prompt, one that could be considered somewhat ambiguous. And I'm going to argue that all 3 systems have a very characteristic underlying visual appearance that carries through even as you start to build wildly elaborate text prompts.

    These kinds of very simple prompts actually tell you quite a bit about what is going on under the hood in each of the 3 algorithms.

    And depending on your artistic goals for any given project, you might want to use a particular one because of those underlying visual appearance properties.

     

    I think this also points out that we are in the very early days of fully figuring out what these systems can do and how we can adjust them to achieve the visual appearance goals a particular digital artist is interested in. Yes, the text prompting bends the system in a certain way. But each system also has a characteristic visual appearance that is a product of both the database it was trained on and how the synthesis algorithm is structured under the hood.
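The 'CLIP guided diffusion' approach mentioned above can be sketched in a few lines of toy NumPy. To be clear, this is not any of the actual codebases, just an illustration of the update structure under simplifying assumptions (random vectors standing in for the text embedding and the image latent): each step adds fresh noise plus a small step up the gradient of similarity with a frozen text embedding, which is part of why the guidance model's training leaves such a strong fingerprint on the output.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for a frozen text embedding and an initial noisy latent.
# In a real system these would come from CLIP's text encoder and the
# diffusion model's noise schedule; here they are just random vectors.
text_embedding = rng.normal(size=64)
latent = rng.normal(size=64)

start_sim = cosine(latent, text_embedding)

# Toy "guided drift" loop: each step adds fresh noise (the generative
# drift) plus a gradient-ascent step on the cosine similarity score
# (the guidance). Real systems backpropagate this gradient through
# CLIP's image encoder; the shape of the update is the same.
guidance_scale = 1.0
for step in range(200):
    noise = rng.normal(size=64) * 0.01
    ln, tn = np.linalg.norm(latent), np.linalg.norm(text_embedding)
    # Gradient of cosine similarity with respect to the latent.
    grad = (text_embedding / (ln * tn)
            - (latent @ text_embedding) * latent / (ln**3 * tn))
    latent = latent + noise + guidance_scale * grad

end_sim = cosine(latent, text_embedding)
print(f"similarity before guidance: {start_sim:.3f}, after: {end_sim:.3f}")
```

In the real pipelines the guidance gradient is mixed into the sampler's denoising step rather than applied directly; the toy loop only shows why the latent steadily drifts toward the text embedding while still carrying the model's own noise character.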

    • In my experience, if I have a clear direction I want the system to take, I start with an init image and tweak as needed. But it's equally interesting to start without one and tweak the prompts. Will be interesting to see if any of this finds its way into SA. ;-)

    • I wanted to point out that DALL-E 2 apparently would not even let me run this particular experiment, because the word 'war' is considered bad or toxic content and cannot be used in the system. This comes from a paper by someone who had early access to the DALL-E 2 system and wrote up the limited experiments they ran with it.

      I've run into something similar with another model that had a toxicity filter monitoring the generated output image. The model considered synthesized images of people doing yoga to be sexual in nature and stopped the processing. Kind of ridiculous (I think), at least for systems that are designed for artists.

      Think about the history of art, and then scrub all art imagery that dealt with war or displayed naked flesh. Also scrub any images of painful or disturbing subject matter. What are we left with, bowls of apples?

  • Maybe this is interesting too.
    https://imagen.research.google/

    Imagen: Text-to-Image Diffusion Models
    • There is already an open-source GitHub clone implementation of it under development. The fact that a larger language model leads to better text performance when driving the generative model is not really surprising. The cutesy images they show off on their website and in the paper don't get me too excited, however, so we'll see how I feel after using the open-source clone of it. CLIP is trained on image-text pairs, while Imagen just uses a transformer trained on language only, so that part of it is interesting.
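For anyone curious about the distinction being made here: CLIP's training objective is a symmetric contrastive loss over image-text pairs, while Imagen conditions on embeddings from a frozen text-only encoder that never sees images during training. Here's a toy NumPy sketch of the CLIP-style contrastive objective, with random vectors standing in for the real encoders' embeddings (an illustration of the loss structure, not the actual models):

```python
import numpy as np

rng = np.random.default_rng(1)

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE-style) loss used to train
    CLIP-style models. Row i of image_emb is paired with row i of
    text_emb; the loss pulls matching pairs together and pushes
    mismatched pairs apart in a shared embedding space."""
    # L2-normalize so dot products are cosine similarities.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature        # (batch, batch) similarities
    labels = np.arange(len(logits))           # pair i matches pair i

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()  # diagonal = true pairs

    # Symmetric: image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

batch, dim = 8, 32
# Perfectly aligned pairs (image embedding == text embedding) give a
# low loss, while unrelated embeddings give a high one.
shared = rng.normal(size=(batch, dim))
aligned_loss = clip_contrastive_loss(shared, shared)
random_loss = clip_contrastive_loss(rng.normal(size=(batch, dim)),
                                    rng.normal(size=(batch, dim)))
print(f"aligned pairs: {aligned_loss:.4f}, random pairs: {random_loss:.4f}")
```

A text-only encoder like the one Imagen conditions on is never trained against such a loss; its embeddings come purely from language modeling, which is the architectural difference the comment is pointing at.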

  • There is some new A.I. software here:
    https://gpt3demo.com/
    GPT-3 Demo Showcase, 300+ Apps, Examples, & Resources
    Get inspired and discover how companies are implementing the OpenAI GPT-3 API to power new use cases


Is anybody making a copy of all the material in the Tutorials Forum

Since the Forum is going away in June, has anyone started to make a copy of all the stuff in the Tutorials forum? I've made copies of some of the tutorial material on the main site, but haven't looked at the Tutorial Forum yet. I'm going to continue copying as much as I can for my own personal use anyway, but if anyone else is doing it, or has already started doing it, please let me know. Maybe we can co-ordinate our efforts. PS: can't ..... believe John would let this happen without so much as a…

1 Reply · Reply by Thor Johnson on Saturday

Studio Artist is in Italy!

I was crawling the streets of Matera, Italy today and may have discovered where SA is hiding! (see attached photo). Not meaning to make light of this great, sad mystery, but I just couldn't resist as I try to make sense of what's happening. Losing my connection to SA, Synthetik and John has been a great sadness... and if real, ends a monumental era in my creative life. love, ~Victor

3 Replies · Reply by Thor Johnson on Saturday

The Overload

"The Overload"! A video with music, from the various experiments I made in Studio Artist with stuff that I have learned in the last few days, from tips and tricks I found by scouring this site and the Synthetik site for tutorials etc. MSG! Paint Synth with MSG Path Generation! Movie Brushes with MSG Path Start Generation! Time Particles! Time Particles with MSG Path Start Generation running Movie Brushes! All that, and more! Haha I have been trying to stretch the Paint Synthesizer in the…

1 Reply · Reply by Thor Johnson Mar 31

Teenage Tongue Cult

Hi, here is the video I made back in 2010 for my song "Teenage Tongue Cult". I finally found my master folder of image sequence files for it on one of my old hard drives, and since the version I had on my Vimeo was of pretty terrible pixelated low quality visually, I re-did it yesterday. It has extensive use of Studio Artist through the whole thing. I made it by first animating the characters and scenes in Flash, against a mostly kind of muddy green background, a color I knew wasn't being used…

2 Replies · Reply by Thor Johnson Mar 30