Possibly useful: Midjourney offers image creation via Discord, 200 images for $10 or a subscription plan for more. It's great fun and possibly an easier way to test out the tech.
AI art notes in general:
* this will permanently raise the standard of "programmer art" in prototype UIs, for example.
* I expect large-scale AI image creation social games will take off very soon
* utilities that listen to your conversations and illustrate people's points with poetic interpretations on screens next to you, with captions.
* this will/should be integrated with emoji / emoji kitchen-like systems - imagine iterating on a custom emoji-like response in snapchat!
* trademark/copyright wars over input data may ruin this
Including the GPU memory, I'd say it's insane :) -- Also, you need more than 32GB of RAM to follow the instructions verbatim in the repository. People were saying the ScaNN indexing fails even with 64GB of RAM.
Yeah, but that's outside the question of whether multiple AM4 boards and their RAM caps can replace a GPU or be cheaper than one.
Yeah, they're technically cheaper than a GPU, but your costs are still gonna add up, and they can't replace a GPU because you'll need an Nvidia card regardless, unless there's a workaround for the CUDA requirements.
Computing neural nets isn't something a CPU is optimized for: it's mostly matrix calculations. This tech is built on GPU-optimized Transformer architectures (the successors to LSTMs), which run on CUDA to cut compute time; GPUs perform those calculations fast for huge matrices.
Doing it on a CPU (they're no good at fat matrix calculations) takes forever.
And yes, essentially those neural nets are extremely large matrices with some predictive math in between.
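To make that point concrete, here's a minimal sketch (plain NumPy, with made-up layer sizes) of why a dense neural-net layer is "just" a big matrix multiply - exactly the operation a GPU parallelizes and a CPU grinds through:

```python
import numpy as np

# A single dense layer is a matrix multiply plus a bias. With a batch of
# 64 inputs of size 1024 feeding 4096 output units, each forward pass is
# a (64 x 1024) @ (1024 x 4096) product - millions of multiply-adds.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 1024))    # batch of input activations
W = rng.standard_normal((1024, 4096))  # layer weights
b = np.zeros(4096)                     # layer bias

y = x @ W + b  # this matmul is the work a GPU does in parallel
print(y.shape)  # (64, 4096)
```

Stack hundreds of layers like this and you get the "extremely large matrices with some predictive math in between" described above.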
I can only dream. At first I was excited by the Mini and Mega versions of DALL-E that people were hosting, even if they were blurry and often a little dumb. By the time I got my DALL-E 2 invite, I had ideas ... most of which were swiftly rejected. Not only that, it warned that I would lose my access if I kept it up with the weird prompts.
Sigh, when will I have a nightmare generator of my own?
My first request was a Captain Beefheart lyric ("making love to a vampire with a monkey on my knee") because I wanted to see how it would take something concrete ("monkey on my knee") and integrate it with a metaphor, and one whose meaning had changed over time ("making love" used to refer to making out, a.k.a. prolonged kissing, rather than sex). Additionally, I wanted to see if it had hoovered up a rendition of that exact lyric from Exploding Dog (an artist who makes odd drawings from submitted phrases and prompts) some twenty-two years prior.
Of all of the celebrities, it seemed reticent to draw Fred Rogers as, well, Fred Rogers, despite me being able to make some fairly appalling melds of others, like Rod Serling and Marilyn Monroe.
Additionally, someone has put in an overly-touchy anti-vore filter.
It's really easy to find a false positive prompt so you have to be really careful.
It puzzled me for a while, but then it made sense:
"A hairy ball with lights at the end is manipulated by a human"
Not a great prompt, but it seemed like a good start. I imagined fiber optic strings attached to a ball and a human doing some stuff around it like a wizard or something. But then I realized that "hairy ball" might lead to something completely different :)
Okay, no problem, I changed it to "tumbleweed" but that triggered the drug filter so I gave up and tried something else :)
There's something hilarious about having all of this "such wow, much AI" machine-learning safeguarded by primitive wordfilters that get Scunthorped like chumps.
And having spent the last 24 hours in their beta testing phase, I can only say: it blows DALL-E and MidJourney out of the water.
They give you the seeds used to generate the images, so you can fine-tune an image by adjusting slight details in a prompt. It is really awesome.
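A rough illustration of what seed control buys you (plain NumPy standing in here for the actual diffusion code; the latent shape is a typical Stable Diffusion value, not something stated above):

```python
import numpy as np

def initial_latents(seed, shape=(4, 64, 64)):
    # Diffusion sampling starts from random noise and denoises it into an
    # image. Fixing the seed fixes that starting noise, so the same seed
    # plus the same prompt reproduces the same image - and a slight prompt
    # tweak changes details without throwing away the composition.
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

a = initial_latents(42)
b = initial_latents(42)
c = initial_latents(43)
print(np.array_equal(a, b))  # True: same seed, same starting noise
print(np.array_equal(a, c))  # False: new seed, entirely new noise
```

That's why publishing the seed alongside the prompt makes iterating on an image practical.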
What I hate about the current ML space is that half the time is spent trying to get dependencies installed and set up correctly to match whatever the project you're trying to run expects. Everything is just so brittle. Following tutorials like these never works out of the box.
I recently went through this and conda handled the python deps pretty well, but the nvidia cuda stuff was a massive pain to get the right things installed. I think it’s safe to blame nvidia.
It seems I may have stepped on a few toes here. For a lot of ML engineers I've talked to, conda is a living nightmare if one tries to go beyond one project (or maybe multienvironment) on one machine.
Sometimes even one is a problem.
I don't think I've talked to a career ML engineer yet that has had a positive work experience with conda, that I can recall. Usually the question has been a good way to trauma bond, which is always good.
It seems like this is more an academic kind of thing?
What I can’t stand is the incessant paternalism that seems to surround AI from big tech.
I got access to DALL-E a few weeks back (after a waitlist… come on) and tried it out. They want all this personal info to even access it, and then they have this oppressive “content policy” that removes anything remotely fun.
For example I tried “Trump riding a velociraptor on Mars fighting aliens” because why not? Sounds hilarious. Turns out any query with the word Trump is banned, and I got a warning that “repeated violations might remove my access”. It’s not just Trump; anything remotely non-corporate-friendly is heavily filtered. Don’t you want to just make images of a cute teddy bear made of pizza instead?! I’m just so tired of it all. It’s like the puritans of the 1990s won, but they’re corporations now.
Personally I think they can shove that nonsense up their ass.
Don't worry, we'll get much better models at much lower cost (Midjourney, Stable Diffusion) that you can run on your own hardware, avoiding the moral censorship that the paternalizing, non-productive members of these companies want to subject the world to.
Pepe is banned. Why? It doesn't matter. Someone will cut around, and they'll get my compute instead. And I'll get my autogenerated pepes, regardless of whether some non productive AI "ethicist" thinks a cartoon frog meme is offensive.
I think they have a blanket no celebrities or real living people policy for their engine. I think that’s fine since they don’t want to be called out for helping produce defamatory content.
They just want to avoid bad PR. That's a sound, safe call.
>Any prompt referencing politics, violence, sexuality, or even the concept of "health" is banned
>This does occasionally include the very same LGBT themes their "hate" rule ostensibly protects, although not consistently, so there's no way to know whether an LGBT prompt will cause an account strike
>In practice the list of bannable offenses is much longer than this, and includes everything from the concept of death to anything violence or politics-adjacent (e.g. nothing about war, conflict, or any kind of weapon)
>OpenAI refuses to share this list because it's part of a "contextual" filter, even though it demonstrably bans words
They advertise DALL-E 2 as artistic, but it can't make art with restrictions like this. The best it can do is corporate content farming, and even then there's no way I'd make part of my marketing pipeline depend on a service this fickle.
For one thing, Python's dependency management is insane in itself. When you additionally have to install what are basically GPU drivers, compile loads of native code, etc., it becomes a hot mess. And half the ML things you want to try are academic experiments not really made for distribution; they were made to work once, and that's it. So if you have a slightly different computer setup, a minor version mismatch in a dependency, etc., it will just break. Datasets or models exist at some URL that only works for a month.
It's a shame, I think the field is one where reproduction of results should be really welcome and feasible.
Yeah, a docker image would be nice. Even a Dockerfile, even though that in itself may not guarantee reproducibility if you try to build it later. (And may have issues with gpu drivers etc). But at least it documents all assumptions about the setup.
That's the cool thing about publishing the Dockerfile and an image, one is an example that may or may not break, and the other is a functional snapshot of a working config at that point in time.
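A hedged sketch of what such a Dockerfile might look like for one of these projects - every base-image tag, package, and version pin here is an example, not a known-good combination:

```dockerfile
# Snapshot of one working setup: pins document the assumptions,
# even if this exact combination rots over time.
FROM nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04

RUN apt-get update && apt-get install -y python3 python3-pip

WORKDIR /app
COPY requirements.txt .
# requirements.txt should pin exact versions, e.g. torch==1.12.0
RUN pip3 install -r requirements.txt

COPY . .
CMD ["python3", "sample.py"]
```

Even if the build breaks a year later, the file records which CUDA, OS, and dependency versions the results assumed - which is exactly the reproducibility documentation the comment above is asking for.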
No. This would take a very long post to explain why, but in short, it depends heavily on which NVIDIA Cards / CUDA drivers / versions of Linux you're using. Or you're running on AWS GPUs or TPUs and paying a lot more either in the form of straight up dollars or optionality.
It's pretty good but certainly not flawless.
Admittedly I probably don't have high standards. Cocoapods is one of the few I'd consider objectively "bad", though apparently among iOS developers it's considered one of the better ones (!).
You’re implying it isn’t like that for every language/framework. I’d love to know what you’re working in where the dependencies aren’t always a problem. Even Carmack has complained about this.
Never had an issue running a random Java project, for instance, either because people package all dependencies or because Maven, while not beloved, mostly just works.
Of course, ML is a bit different in that it often needs more drivers/GPU setup. But there are loads of ecosystems that don't assume you have package X installed on your system the way Python's does.
I definitely have the same experience trying to get C++ game dev libraries running. Trying to wrangle a bunch of open-source library dependencies into mutually acceptable versions whilst also keeping up with (intentional, or unintentional) operating system breaking changes is a never-ending steeplechase.
I'm not so sure about this...I guess maybe there's some project specific stuff, but the good projects are semi stable and oftentimes it's just PyTorch code that you can pull and use elsewhere if needed.
Other people's infrastructure always seems bad to me, as I'm sure mine does to others. The cycle and circle of life! :D
That's the reason I avoid using Python as much as possible (while I really like the language). Another reason is the massive stack traces everywhere instead of meaningful user-facing error messages, but that's a smaller cultural issue than the dependency hell.
Though I suppose it's not really Node's fault that developers are importing modules like "leftpad".
What's really fucking insane though is that there are modules like "trim-newlines" [0] that exist merely to trim \r and \n from the beginning and end of a string...and that this is such a hard task to get right that it's in version 4.0.2...and that a previous version had a security vulnerability [1].
As much as I hate to say it (and I hate how these kinds of packages attach themselves to large projects to inflate download numbers for a résumé), I think most of us here would probably have created the same CVE doing it naively. It was a regex DoS due to exponential runtime, not something obtuse like extra bloat or being poorly made.
It would be lovely if anyone other than NVidia made GPUs usable for GPGPU. I started with Radeon for the open-source factor, but it turned out to be useless in virtually every respect. I bought an NVidia card.
I'm now tied to CUDA, but I didn't need to be. I was starting from scratch.
The longer before AMD can ship something which actually works, the more entrenched NVidia+CUDA become.
It's not that they aren't trying; it's that when you reinvent the wheel, you have to do more work. Microsoft is introducing `tensorflow-directml` to avoid this problem by implementing a CUDA equivalent on DirectX. AMD has ROCm, but it's not well supported because it's not integrated upstream in `tensorflow`.
- I found out I could only use it for compute headless. WTF?!?!?! (https://www.phoronix.com/news/Radeon-ROCm-Non-GUI). If it was driving a monitor, my machine would crash hard. There wasn't even an error message.
- A lot of other stuff didn't work and just resulted in odd crashes, or worse performance than CPU. I don't know why.
- Within 9 months, AMD discontinued support for my card. I raised this as a warranty issue (suitability for advertised purpose), but that obviously would go nowhere without a lawsuit. I had a very expensive brick.
- AMD support channels were non-existent. There literally was no way to reach anyone.
I bought an NVidia card, and it's been working well ever since.
ROCm is not well supported because it's absolute garbage. You have *less* work reinventing a wheel, since it's already been invented once; you have *more* work getting community support and network effects, since you're starting out behind. Fundamentally, though, none of that can start to happen if your system doesn't work at all.
I agree with you they're trying, but they're trying incompetently.
If ROCm was half the speed of CUDA, and wasn't integrating into the latest-greatest frameworks, but it was stable and working, I'd make it work. It wasn't anywhere close to stable and working.
You blame CUDA because you have never played with AMD + ROCm. Furthermore, OpenCL is a dead-end street. With standard programming languages natively supporting CUDA, it makes no sense to add extra layers.
With the explosion of image gen AI out there recently, I feel confident that within a year anyone will have access to free or nearly free totally unchained image generation on par with or better than current DALLE-2.
Running it is one thing, but where are we going to get the model?
Who is going to give away their trained model (that presumably cost millions) to the public? And getting the blame for any nefarious usage that will eventually follow?
Have they said they are going to publish the trained model? So far they are going the exact route as OpenAI (closed beta access with similar prompt content policy).
I don't see the content policies[0][1] as much different, but perhaps it's a matter of opinion regarding the individual points. I don't know how strictly the policies are enforced, so perhaps there is a difference there as well.
What I find strange is that StabilityAI doesn't outright ban using actual living people in prompts (where I can imagine wide use cases for abuse). At the same time, both OpenAI and StabilityAI block sexual content/nudity/violence. I don't see why such content is supposed to be harmful.
Anyway looking forward to the model, thanks for informing me about this project.
For OpenAI, "violence" means you cannot generate weapons (even knives); for StabilityAI, "violence" means anti-Semitism, misogyny, etc. No one will mute you for generating nudes if it wasn't your intention, which means the model is able to do this while the OpenAI model cannot. And anyway, we're only talking about the limitations on the Discord server; when the model is free and open source, you'll be able to generate whatever you want.
I've got the LD repo installed and use it alongside VQGAN-CLIP and ... VQGAN is pretty much always the better result (albeit much slower.) I -think- this is because the (tagging?) that the LD is using isn't as comprehensive. Like it doesn't seem to know what "brogue" means and generates nonsense with attempts at the word "brogue".
(Caveat: I have updated neither repo since about November because it took a faff to get them working and I do not want to touch them again.)
CLIP is my favorite of all of them. But it's not especially friendly - like riding a horse bareback. I've made some ridiculous, very bizarre and interesting stuff with CLIP that just isn't possible with DALL-E.
Hi, sorry if this is extremely ignorant, but I was wondering the other day... are we 100% sure they aren't feeding images from a search engine index into the ML data for DALL-E?
It just looks surprisingly like it's mixing and matching the top returned image searches from an index.
I'm only saying it looks like that - not that it is, of course.
I don't want to undermine anyone's work here, I was just wondering.
You can read the whitepapers for Flamingo, Dall-E 2, Imagen, and Parti to see how diffusion networks and GANs work to create these images. I wrote up two huge paragraphs trying to explain it simply, but then I realized that I don't understand exactly how they work, either. Best source is the published research.
Most networks have to do with large language models, text embeddings, and image embeddings.
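A toy sketch of the embedding idea mentioned above (the 4-dimensional vectors here are made up; real CLIP-style embeddings have hundreds of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    # Models like CLIP map a caption and an image into the same vector
    # space. "How well does this image match this text" then reduces to
    # the cosine of the angle between the two embedding vectors.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical embeddings for one caption and two candidate images:
text_emb  = np.array([0.9, 0.1, 0.0, 0.4])
img_match = np.array([0.8, 0.2, 0.1, 0.5])   # points the same way
img_other = np.array([-0.7, 0.6, 0.9, -0.2]) # points elsewhere

print(cosine_similarity(text_emb, img_match))  # high: good match
print(cosine_similarity(text_emb, img_other))  # low: poor match
```

The diffusion model is then trained to produce images whose embeddings land close to the prompt's embedding, rather than stitching together indexed search results.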
Seems like a decent amount of training data does end up making it into the output - some of those images have a barely legible "shutterstock" white footer in them or something
TBH ... I absolutely hate these strange AI-generated images. Most folks celebrate this technology, but I don't know what it's really for, other than confusing brains.
I always ask myself "is this real?" ... I don't like having to ask myself whether something is real, or being confused.
Is there anyone else out there who feels the same?
I'm working on an indie game and I get surprisingly decent results if I ask for a basic object (think 300 mutations of a key or book) and downsample it to the 48x48 sprite sheet size. They look weird as hell at normal resolution but as pixel art with specific resize sampling methods it's a lot less weird, and good for rapid prototyping.
The originals look like malformed mutants at 256x256 with nonsensical key bits and strange twisted handles.
Note that this does not replace a good pixel art artist either - for things like walls, anything that needs to connect together or tile like a dungeon wall or castle, or a cohesive art style, this will not do. But for rapidly identifiable different quest item drops it's not terrible.
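A minimal sketch of that downsampling step, assuming nearest-neighbour sampling (plain NumPy with hypothetical sizes - not the poster's actual pipeline):

```python
import numpy as np

def downsample_nearest(img, size=48):
    # Nearest-neighbour downsampling: pick exactly one source pixel per
    # target pixel. This keeps the hard edges that read as pixel art,
    # unlike bilinear/Lanczos resampling, which averages and blurs.
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

# Stand-in for a 256x256 RGB generation (random noise here):
art = np.random.default_rng(0).integers(0, 256, (256, 256, 3), dtype=np.uint8)
sprite = downsample_nearest(art)
print(sprite.shape)  # (48, 48, 3)
```

Throwing away 96% of the pixels is what hides the malformed key bits and twisted handles: at sprite scale, only the broad silhouette and palette survive.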
I'm not with you, at all, but I did have a knot in my stomach the other day reading a quote from one of the founders of MidJourney, David Holz, who apparently said
"Within the next year or two, you’ll be able to make content in real time: 30 frames a second, high resolution. It’ll be expensive, but it’ll be possible. Then, in 10 years, you’ll be able to buy an Xbox with a giant AI processor, and all the games are dreams.".
As someone who used to take quite a lot of psychedelics, there's something quite terrifying about the promise of this premise - it takes me back to the wrong sort of trips, where the ever-unfolding strata of reality became too much to bear and I'd end up mentally cowering beneath the unrelenting bigness of it all.
The infinite dreamspace is unholy-big and not somewhere I'd much choose to get lost.
Or maybe I totally will...
- ed - Notwithstanding the obvious realisation that I could just take the helmet off, or remove the contact lenses, or whatever we have in a couple of decades.
At least for me, the intensity of a psychedelic trip was little about the visuals and a lot about that indescribable consciousness change. I don't think AIs are going to collapse people's innate sense of reality (the way that psychedelics do, at least), though they may leave people questioning what media is and isn't real.
When computers started beating humans at chess, there was a similar moment of anxiety.
One good quote from that era: "You should be no more concerned that a computer can beat you in chess than that a car can beat you in a race."
Technology changes the experience of being human. Chess used to be the marker of human intelligence; that passed. Now some forms of creativity will too.
I'm not anxious about it. It's just not pleasant to see. There are some painters, like Dalí (I assume the name DALL-E is no accident) or Bosch, who created kinda similar "products", but to me theirs are interesting and fun to look at. The AI products are kinda useless creep: mashups without sense from dumb machines, and that gives me this weird feeling which makes me hate it.
Yes, I agree. I believe that technology is like pollution, irrespective of its net utility to society: once it's been invented, it's practically impossible to go back to the state of history where it wasn't. Pretty soon, technologies that appear interesting on the surface may turn out to have unexpected societal effects when they become mainstream, and it will be too late once the data and knowledge have spread to millions of individual hard drives.
In the case of DALL-E and GPT-3, I believe they will undermine human creativity and usher in a new era where fewer people care about slowly crafted skills like painting that require years or decades of practice and patience to master if the barrier to just have the art in front of you in seconds becomes so low.
It might get to the point that people will divide themselves on ideological grounds that AI art is "impure" and attack each other if they cannot prove its origin. I'm not saying that I'm one of those people that would join the pushback, given that a coming explosion in AI art is all but inevitable, but I'm describing what I think the new technology is going to cause the masses to believe - the population that aren't enthusiastic tech evangelists. When the expectations of millions of people are set by DALL-E and the like, I don't think the prospects will be universally positive.
Look at the number of people on HN asking if certain comments were written by GPT-3; they seem to appear weekly. I don't think implying that you didn't actually write the comment you posted will be taken very well by some people outside of an insular circle like HN once the general public becomes fully aware of AI, and could very well grow into a well-known insult if there ever comes to be a rift in opinions around AI art.
I assume most images and videos on the internet are doctored and have been since the invention of photoshop.
I don't feel the need to check for reality. I've seen screenshots from games that look more realistic than some pictures I've taken (Forza on max settings at the right angle might as well be a photograph) and this trend will only continue.
This tool can be considered more of an automated version of meme communities, where dedicated members will spend an hour photoshopping muppets into historic events and making the pictures look absolutely believable. The only novelty I see is that the computer now does a lot (but not all) work for you.
There are nice opportunities here. If you need a stock photo of something very specific, you'll soon be able to generate one with the right query and the right AI. Small companies can generate fancy brands without paying professional designer's fees, especially if all they need is a billboard and not a whole suite of office supplies. You can generate your own posters and decorations featuring interesting landscapes and scenes in any style you want.
The current iterations of these algorithms are quite limited in many aspects and sometimes uncanny or even horrifying, but I look forward to a future where I can imagine something, describe it, and have it rendered into digital art just like I pictured, without having to spend decades on honing my skills as an artist.
It's not about fake images trying to show reality; it's about images crafted from reality that show something unreal. The brain wants to interpret it and fails.
That failure drives me nuts.
I think AGI, when it inevitably arrives, will be as disruptive to our brains as sugar and sedentary living. We've not evolved to question whether what we see, hear, and feel is real.
There is no need to worry about the threat of AI. While it is true that AI has the potential to cause great harm, it is also true that AI has the potential to do a great deal of good. As long as we are careful and responsible in our development and use of AI, there is no reason to believe that it will be anything other than a positive force in the world. -- generated by OpenAI from a prompt
Here's the code:
This took 5m25s to run on my laptop (no GPU configured) and produced a recognisable image of a raccoon reading a book. I tweeted the resulting image here: https://twitter.com/simonw/status/1550143524179288064