author Xe <me@christine.website> 2022-10-01 15:28:13 +0000
committer Xe <me@christine.website> 2022-10-01 15:28:13 +0000
commit f2b139649b645f67c73df41e8186e22ccb2c7b96
tree 82b6d14c27cedd7f8a58558f6286c8924c9aa0ac
parent bb36283bdbe2c0d19c16e1917a88512f7ca460c1
prompt engineering post
Signed-off-by: Xe <me@christine.website>
-rw-r--r-- blog/prompt-engineering.markdown 136
-rw-r--r-- lib/xesite_templates/src/lib.rs 10
2 files changed, 142 insertions, 4 deletions
diff --git a/blog/prompt-engineering.markdown b/blog/prompt-engineering.markdown
new file mode 100644
index 0000000..e68efe3
--- /dev/null
+++ b/blog/prompt-engineering.markdown
@@ -0,0 +1,136 @@
+---
+title: "Prompt engineering is hard"
+date: 2022-10-01
+tags:
+ - stablediffusion
+ - ai
+ - madewithai
+---
+
+<xeblog-hero ai="Waifu Diffusion v1.2" file="catgirl-fireworks" prompt="girl with long green hair with cat ears wearing an orange kimono, nighttime, many fireworks, festival, happy happy, A beautiful landscape, studio ghibli, zen gates, yin yang, pagoda, colorful sky, by hayao miyazaki, starbucks vibes"></xeblog-hero>
+
+I've seen a lot of comments on Twitter that seem to completely misunderstand the
+process of getting a decent result with AI generators like Stable Diffusion and
+DALL-E 2. People seem to assume that it's just "push button, receive bacon"
+without any real creativity in the equation. As someone who has done a lot of
+this experimentation in the past few months, I'd like to challenge that
+assertion and show you what the process for getting a decent result actually
+involves.
+
+First, you need to start off with a vision for what you want. I'm going to pull
+from my fictional world Malto, specifically an area named Kanar. It is a very green
+area, lots of bamboo and the local architecture takes advantage of it. The area
+is fairly wealthy because they take advantage of their weird soil composition in
+order to produce the plants that help them make an alcoholic beverage that the
+nobles all over the world can't get enough of.
+
+From here, I like to get the "vibe" of the image down first. I think that [Waifu
+Diffusion](https://github.com/kaitas/waifu-diffusion) would be a good model to
+use for this, mostly because you can feed it danbooru style tags to prompt the
+image you want. I also kind of want a studio ghibli feeling, and Waifu Diffusion
+has proven to be very good at that.
+
+My first iterations start out with a few 512x512 images with a few vibe prompt
+keywords to get basic thoughts onto the "canvas".
+
+Here is my starting prompt:
+
+> `bamboo bamboo_forest grass studio_ghibli hayao_miyazaki happy peaceful summer`
+
+| <xeblog-picture path="blog/prompt-engineering/vibes/seed_447543_00000"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/vibes/seed_447544_00001"></xeblog-picture> |
+|:--- |:--- |
+| <xeblog-picture path="blog/prompt-engineering/vibes/seed_447545_00002"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/vibes/seed_447546_00003"></xeblog-picture> |
+
+<xeblog-conv name="Mara" mood="happy">Click on any of these images to open a
+larger version of it!</xeblog-conv>
+
+I'm not a big fan of these. I want a _landscape_, but it instead showed me a
+bamboo forest from the inside. There were also subjects in frame in a few
+of them. We're not focusing on subjects yet. Let's remove the
+`bamboo_forest` tag and add the `landscape` and `pagoda` tags:
+
+> `bamboo landscape pagoda grass studio_ghibli hayao_miyazaki happy peaceful summer`
+
+| <xeblog-picture path="blog/prompt-engineering/vibes2/seed_320353_00000"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/vibes2/seed_320354_00001"></xeblog-picture> |
+|:--- |:--- |
+| <xeblog-picture path="blog/prompt-engineering/vibes2/seed_320355_00002"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/vibes2/seed_320356_00003"></xeblog-picture> |
+
+This is a lot closer to what I want. I'm going to stick with this seed too,
+`320353`. Now that we have a seed that's better, let's increase the resolution
+to 1280x512 to see how that changes things. The AI natively draws images in
+512x512 chunks, so the jump to something larger can be weird at times.
+
+| <xeblog-picture path="blog/prompt-engineering/vibes2-wide/seed_320353_00000"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/vibes2-wide/seed_320354_00001"></xeblog-picture> |
+|:--- |:--- |
+| <xeblog-picture path="blog/prompt-engineering/vibes2-wide/seed_320355_00002"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/vibes2-wide/seed_320356_00003"></xeblog-picture> |
+
+That came out way better than expected! Usually doing the jump to 1280x512
+causes some cursed results and mutant hell creatures. This is actually somewhat
+decent, but I want something a bit more stylized. This roughly matches
+the image I have of Kanar in my head, but it looks closer to a drawing of the
+area done by an outsider rather than how they would portray themselves. I'm
+going to add a few style keywords:
+
+> `bamboo landscape pagoda grass studio_ghibli hayao_miyazaki happy peaceful summer ukiyo-e wood-block`
+
+| <xeblog-picture path="blog/prompt-engineering/ukiyo-e/seed_320353_00000"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/ukiyo-e/seed_320354_00001"></xeblog-picture> |
+|:--- |:--- |
+| <xeblog-picture path="blog/prompt-engineering/ukiyo-e/seed_320355_00002"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/ukiyo-e/seed_320356_00003"></xeblog-picture> |
+
+Something fun you can do is add "masterpiece" to make the image look better and
+"unreal engine" to improve the lighting. Let's mess with that here:
+
+> `bamboo landscape pagoda grass studio_ghibli hayao_miyazaki happy peaceful summer ukiyo-e wood-block unreal engine masterpiece very beautiful`
+
+| <xeblog-picture path="blog/prompt-engineering/unreal-engine/seed_320353_00000"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/unreal-engine/seed_320354_00001"></xeblog-picture> |
+|:--- |:--- |
+| <xeblog-picture path="blog/prompt-engineering/unreal-engine/seed_320355_00002"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/unreal-engine/seed_320356_00003"></xeblog-picture> |
+
+You know what, I'm not feeling that wood-block style. Let's remove it in the
+next round. The main thing we need to focus on next is the subject. The main
+export of Kanar is a rice-based alcoholic drink. I'm going to add "rice_paddy"
+to the prompt just after `grass`:
+
+> `bamboo landscape pagoda grass rice_paddy studio_ghibli hayao_miyazaki happy peaceful summer ukiyo-e unreal engine masterpiece very beautiful`
+
+| <xeblog-picture path="blog/prompt-engineering/rice-paddy/seed_320353_00000"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/rice-paddy/seed_320354_00001"></xeblog-picture> |
+|:--- |:--- |
+| <xeblog-picture path="blog/prompt-engineering/rice-paddy/seed_320355_00002"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/rice-paddy/seed_320356_00003"></xeblog-picture> |
+
+I think that last one is going to be the image that I'm going with. Let's see
+what happens if we change the time of day:
+
+
+| <xeblog-picture path="blog/prompt-engineering/times/morning"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/times/afternoon"></xeblog-picture> |
+|:--- |:--- |
+| <xeblog-picture path="blog/prompt-engineering/times/evening"></xeblog-picture> | <xeblog-picture path="blog/prompt-engineering/times/nighttime"></xeblog-picture> |
+
+...well that didn't change the time of day at all. I think I like the nighttime
+result though, so I'm going to go with that. Here's the image we have so far:
+
+<xeblog-picture path="blog/prompt-engineering/times/nighttime"></xeblog-picture>
+
+This has a bit of an imposing feeling, maybe like a castle or some other
+important building. Maybe this is the private pagoda of their leader. What if we
+add some guards?
+
+> `nighttime bamboo landscape pagoda grass rice_paddy studio_ghibli hayao_miyazaki happy peaceful summer ukiyo-e unreal engine masterpiece very beautiful guards pikemen`
+
+<xeblog-picture path="blog/prompt-engineering/rice-paddy-guards"></xeblog-picture>
+
+I really like this. This is the kind of vibe that I'm going for. I want
+something that makes me _feel_ like I'm looking into that area that I've only
+ever seen in my mind's eye from description paragraphs and topological charts.
+This is the kind of thing that Stable Diffusion and similar models let you do as
+a writer: they let you bring images out of your head and onto the canvas so that
+you can have people really understand what it's like. If I wrote a longer story
+set in here, I'd probably throw this image and a few others generated with
+different seeds to an artist to help me make an image for a book cover.
+
+I'm also not really sure why people call this "prompt engineering". I'd
+personally rather call it "scrying", but I can understand why Silicon Valley
+culture would push everything towards being "engineering". I just legally can't
+call myself an "engineer" in Canada without an engineering degree.
+
+This is the kind of technology I am really excited for, and I can't wait to see
+how this evolves. Computers are fun sometimes.
diff --git a/lib/xesite_templates/src/lib.rs b/lib/xesite_templates/src/lib.rs
index 0a8ff1d..4b920b8 100644
--- a/lib/xesite_templates/src/lib.rs
+++ b/lib/xesite_templates/src/lib.rs
@@ -28,10 +28,12 @@ pub fn slide(name: String, essential: bool) -> Markup {
 pub fn picture(path: String) -> Markup {
     html! {
-        picture style="margin:0" {
-            source type="image/avif" srcset={"https://cdn.xeiaso.net/file/christine-static/" (path) ".avif"};
-            source type="image/webp" srcset={"https://cdn.xeiaso.net/file/christine-static/" (path) ".webp"};
-            img style="padding:0" loading="lazy" alt={"hero image " (path)} src={"https://cdn.xeiaso.net/file/christine-static/" (path) "-smol.png"};
+        a href={"https://cdn.xeiaso.net/file/christine-static/" (path) ".jpg"} target="_blank" {
+            picture style="margin:0" {
+                source type="image/avif" srcset={"https://cdn.xeiaso.net/file/christine-static/" (path) ".avif"};
+                source type="image/webp" srcset={"https://cdn.xeiaso.net/file/christine-static/" (path) ".webp"};
+                img style="padding:0" loading="lazy" alt={"hero image " (path)} src={"https://cdn.xeiaso.net/file/christine-static/" (path) "-smol.png"};
+            }
         }
     }
 }
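
For readers who don't know the `html!` template syntax, here is a rough sketch of the markup the updated `picture` helper should produce. It is not the site's actual code. `picture_html` and `CDN_BASE` are hypothetical names, the string is assembled with plain `format!` rather than the template macro, and no HTML escaping is applied. The URLs and attributes are copied from the diff above.

```rust
/// CDN prefix copied from the diff; the constant name is an assumption.
const CDN_BASE: &str = "https://cdn.xeiaso.net/file/christine-static/";

/// Hypothetical stand-in for the `picture` template. It builds the markup by
/// hand so the new nesting is visible: the `<picture>` element now sits inside
/// a link to the full-size `.jpg`. `path` is inserted as-is, without escaping.
pub fn picture_html(path: &str) -> String {
    format!(
        concat!(
            "<a href=\"{base}{path}.jpg\" target=\"_blank\">",
            "<picture style=\"margin:0\">",
            "<source type=\"image/avif\" srcset=\"{base}{path}.avif\">",
            "<source type=\"image/webp\" srcset=\"{base}{path}.webp\">",
            "<img style=\"padding:0\" loading=\"lazy\" ",
            "alt=\"hero image {path}\" src=\"{base}{path}-smol.png\">",
            "</picture>",
            "</a>"
        ),
        base = CDN_BASE,
        path = path
    )
}
```

Calling `picture_html("blog/prompt-engineering/times/nighttime")` would yield a single `<a>` element wrapping the `<picture>`. This is what makes the "click on any of these images" note in the post work: the small PNG, AVIF, and WebP variants are shown inline, and the link opens the `.jpg` in a new tab.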