Thread #108048751
File: highlights_g_108045531_1770118625_1.jpg (3 MB)
3 MB JPG
Discussion of Free and Open Source Diffusion Models
Prev: >>108045531
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe
>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>Anima
https://huggingface.co/circlestone-labs/Anima
>Klein
https://huggingface.co/collections/black-forest-labs/flux2
>LTX-2
https://huggingface.co/Lightricks/LTX-2
>Wan
https://github.com/Wan-Video/Wan2.2
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: 1768543209503092.jpg (525.4 KB)
525.4 KB JPG
>test nodes 2.0
>ok everything is gucci
>no copy image on right click action
reee
>>
File: Anima_tagging.png (3.1 MB)
3.1 MB PNG
>tfw you forget to @ artists on Anima
>>
>>
File: z_imageBASEd_00235_.jpg (337.3 KB)
337.3 KB JPG
>>
>>
File: Anima_00025_.png (1.5 MB)
1.5 MB PNG
>>108048841
Here's the lower-right combo at 1280x1600, same seed. Actually comes out kinda nicer, although the quality looks like a smoothed artbook scan or something.
>>
>>
File: Anima_00026_.png (1.3 MB)
1.3 MB PNG
>>108048860
And here's 1920x1088. Looks like it's starting to fall back on non-anime knowledge. Her eyes look a tad misaligned.
>>108048855
I haven't learned inpainting for any model yet, haha.
>>
>>
>>
File: z_imageBASEd_00246_.jpg (349.7 KB)
349.7 KB JPG
>>108048878
>so I suggested doing a 2nd pass to upscale (hires fix)
What settings did you use for this? I got a garbled mess when I went from 1MP to 1.5MP
>>
File: Anima_00027_.png (1.5 MB)
1.5 MB PNG
>>108048867
Added "official art" on the end. Also pretty nice, but one arm came out curved.
>>108048878
Yeah, I was curious what would come out, and whether it'd retain eye detail better than Illustrious-based models. I think someone said Anima had trained on 512 so far and was doing 1024 next, so hopefully that improves the higher-res results too.
>>
File: o_00424_.png (529.8 KB)
529.8 KB PNG
>>
File: sub.png (148.9 KB)
148.9 KB PNG
>>108048891
Try tiling?
>>
>>108048891
https://files.catbox.moe/8iy8bs.png
made this simple hires fix wf for you, just load the image in comfyui.
remove the sage attention/torch compile nodes if you cant support them, I get 30% more perf with them
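For anyone unsure what "1MP to 1.5MP" means in pixels, here is a minimal sketch of the resize arithmetic only. It is not taken from the workflow in the catbox link; the function name, the 1024x1024 starting size, and rounding to multiples of 16 are all assumptions for illustration.

```python
import math

def scaled_resolution(width, height, target_megapixels, multiple=16):
    """Scale (width, height) to roughly target_megapixels, keeping aspect ratio.

    Each side is rounded to the nearest `multiple`. 16 is an assumed value;
    use whatever your model/VAE actually requires.
    """
    current_pixels = width * height
    target_pixels = target_megapixels * 1_000_000
    # Area scales with the square of the per-side factor.
    factor = math.sqrt(target_pixels / current_pixels)
    new_w = max(multiple, round(width * factor / multiple) * multiple)
    new_h = max(multiple, round(height * factor / multiple) * multiple)
    return new_w, new_h

w, h = scaled_resolution(1024, 1024, 1.5)
print(w, h, round(w * h / 1e6, 2))
```

The per-side factor is only about 1.2x for a 1.5x increase in pixel count, so the second pass is a fairly small upscale.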
>>
File: o_00425_.png (1.7 MB)
1.7 MB PNG
>>
>>
>>108049022
>>108049030
ty i'll test these
>>
File: o_00426_.png (1.8 MB)
1.8 MB PNG
>>
File: 1769842682773644.png (55.9 KB)
55.9 KB PNG
>>108049043
very original, Julien
here's a post you made, in which you suck cumfart's cock
>>
>>
File: Flux2-Klein_00739_.png (568.3 KB)
568.3 KB PNG
good morning my wickle fwens
>>
>>
>>
>>
File: 1748725313433635.png (45.8 KB)
45.8 KB PNG
what went so wrong, bros?
>>
File: F2Kb__00002_.png (1.9 MB)
1.9 MB PNG
>>108049090
>>
File: o_00428_.png (1.1 MB)
1.1 MB PNG
>>
>>
>>
File: F2Kb__00003_.png (1.8 MB)
1.8 MB PNG
>>108049143
>>
File: ComfyUI_00493_.jpg (2.8 MB)
2.8 MB JPG
sorry Qwen anon, but Wan2.2 is still the best for me
>>
File: F2Kb__00004_.png (1.5 MB)
1.5 MB PNG
>>108049254
>>
>>
>>
File: z_imageBASEd_00271_.jpg (482.1 KB)
482.1 KB JPG
>>108049303
Klein 9b is pretty damn simple. Batch load and run every image through the same prompt.
>>
File: Flux2-Klein_00769_.png (648.5 KB)
648.5 KB PNG
>>108049242
>>
>>
>>
>>
>>
>>108049242
>Wan2.2 is still the best for me
Yeah I've seen enough. The future of image models is video models. You just can't learn stuff like reflections or fingers properly from static images, you need a temporal component
I hope we get something better than wan 2.2 locally this year. It's getting to the point where almost all the puzzle pieces exist for me to make the offline Instagram of my dreams
>>
File: F2Kb__00007_.png (1 MB)
1 MB PNG
>>108049377
>>
File: 00017-226232868.jpg (1.7 MB)
1.7 MB JPG
>>
>>
>>
File: z_imageBASEd_00276_.jpg (629.8 KB)
629.8 KB JPG
>>
>>
>>
File: 00023-2815691544.jpg (2.1 MB)
2.1 MB JPG
>>108049444
I'm really trying to hammer on the range of bf 32 z base. I'm impressed so far: I'm able to get my gen time down to 3 min 27.9 sec using pinned memory, which is great
>>
File: 00026-2815691544.jpg (2 MB)
2 MB JPG
>>108049477
another variation
>>
File: 1757650615801913.png (2.4 MB)
2.4 MB PNG
>>
File: 00033-1565140952.jpg (1.9 MB)
1.9 MB JPG
100 steps
>>
>>
File: 00036-1565140952.jpg (2.1 MB)
2.1 MB JPG
>>108049585
30 steps, shaves a minute off of generation
>>
>>
File: 05ccb0a0-47b8-436e-8af0-73f1f3206fb7.png (1.2 MB)
1.2 MB PNG
>>
>>
>>
Have there been any efforts to fix the terrible audio quality of LTX gens? I've been using vibe voice to do the vocals, but sometimes I want LTX to do the sound effects or something and it sounds terrible. There are always howling, dissonant sounds and shit in the background.
>>
>>
File: o_00435_.png (1.1 MB)
1.1 MB PNG
>>
File: xyz_grid-0001-2699268191-1.jpg (1.9 MB)
1.9 MB JPG
>>
File: Flux2-Klein_00234_.png (1.6 MB)
1.6 MB PNG
>>
>>108049629
try with this https://civitai.com/models/2361961?modelVersionId=2656399
>>
>>
File: o_00438_.png (1.7 MB)
1.7 MB PNG
>>
>>
>>
>>
>>
>>108049492
>>108049477
>>108049585
>>108049609
These are some of the best gens I have ever seen in these threads.
>>
File: 00056-1661097663.jpg (1.6 MB)
1.6 MB JPG
>>
File: 00062-1661097663.jpg (1.6 MB)
1.6 MB JPG
>>108049900
Thanks!
>>
File: 1760830636016813.png (3.7 MB)
3.7 MB PNG
doux
>>
>>
File: 1757771302934212.jpg (537.8 KB)
537.8 KB JPG
>>
File: 1754772448272871.png (3.4 MB)
3.4 MB PNG
broom broked, very angery
>>
>>
File: 1547211305663.png (178.5 KB)
178.5 KB PNG
How do I put two different sampler nodes in a subgraph and give the subgraph a switch so I don't have to manually recouple the nodes inside?
>>
>>
File: 1746514962620857.mp4 (3.6 MB)
3.6 MB MP4
>>108049648
>>
File: 1742152912210822.mp4 (3.8 MB)
3.8 MB MP4
>>108049733
>>
File: 1747646231033949.png (3.9 MB)
3.9 MB PNG
>>108048839
another variation of this, came out cool
>the video spambot has found this thread
AIEEE
>>
>>
>>
File: 1768747821250837.mp4 (3.6 MB)
3.6 MB MP4
>>108049588
>>
>>
>>
>>
>>
File: 1743288684687531.mp4 (3.8 MB)
3.8 MB MP4
>>108049565
>>
File: 00070-1536019997.jpg (1.5 MB)
1.5 MB JPG
>>
>>
File: o_00447_.png (1.6 MB)
1.6 MB PNG
>>
>>
>>
>>
>>108050206
He would get off on that
Also on another note does sage attention hit quality and if so what speed increases are anons seeing?
I tried to install it a long time ago and it kept having issues with forge but it should be good now.
>>
File: a.jpg (119.5 KB)
119.5 KB JPG
>>108050106
you can promote the toggle of a node that has a switch toggle to the "outside" of a subgraph
>>
>>
File: 1761229865945983.mp4 (3.7 MB)
3.7 MB MP4
>>108049446
>>
File: 8e76bc1772a4e40f87c41d4b1dba7ebe.png (395.4 KB)
395.4 KB PNG
We need to start executing "people" who upload stuff like this.
>>
File: vram.png (2.2 KB)
2.2 KB PNG
>>108049862
Cool, now it simply refuses to unload the model. And it's perma stuck at 99
>>
File: 6038086.jpg (91.7 KB)
91.7 KB JPG
good evening sirs
>>
>>
>>
File: 1746424178062700.mp4 (3.7 MB)
3.7 MB MP4
>>108049433
>>
File: LTX-2_00133.mp4 (2.2 MB)
2.2 MB MP4
>>
File: 1752807780688443.mp4 (2.6 MB)
2.6 MB MP4
>>108049338
>>
File: 00000-3165684880.jpg (2.2 MB)
2.2 MB JPG
>>
>>
>>
>>
File: 1763321012287187.mp4 (3.7 MB)
3.7 MB MP4
>>108049331
>>
Hi, I'm very new to all this.
I'm trying to load 16gb model z-image-turbo using ComfyUI to my 12gb gpu, but it keeps crashing. Is there any way to partially load it to GPU and RAM?
https://huggingface.co/Comfy-Org/z_image_turbo/tree/main/split_files/diffusion_models
>>
>>
>>
File: LTX-2_00137.mp4 (1.6 MB)
1.6 MB MP4
>>
>>
>>
File: 00001-3124168644.jpg (1.7 MB)
1.7 MB JPG
>>
>>
>>
File: 1747769244303072.mp4 (3.8 MB)
3.8 MB MP4
>>108050122
>>
File: 00002-1288575119.jpg (2.5 MB)
2.5 MB JPG
>>
>>
File: 1740121422049895.mp4 (3.7 MB)
3.7 MB MP4
>>108050085
>>
>>
>>108050267
low quality jpegs only sir, my quota is ending
>>108050269
its the /g/ay board you are in
>>
File: 1762996191236242.mp4 (3.7 MB)
3.7 MB MP4
>>108050440
>>
>>108050184
>>108050195
Autocorrect (swipe) is heavily biased towards "it's" over "is" (it's the same with in/on, etc).
>>
>>
>>
File: 00003-1007807292.jpg (2.7 MB)
2.7 MB JPG
>>
>>
Haven't touched imagegen since SD2 released. Myne doesn't seem to have any good LoRA for Illustrious derivatives, so I guess I'll have to train it myself.
Here's a gen from some ancient SD1.5 merge of mine.
>>
File: 1760401730763158.mp4 (3.7 MB)
3.7 MB MP4
>>108050220
>>
>>
>garbage op stolen by ran
>garbage slop gens and braindead "discussions"
i feel like those threads will know no rest until ran is gone. part of me feels sorry for him - unemployed crazy troon addicted to hard drugs, that sounds miserable, but considering all that he has done to this general i wish he just fucking killed himself already. this can't keep going like that
>>
>>
File: 1766316082532486.mp4 (3.7 MB)
3.7 MB MP4
>>108050467
>>
>>
>>
File: z-image-turbo-fp8-aio_Schizo_00004_.png (2.5 MB)
2.5 MB PNG
>>
>>
File: 1743350129482996.mp4 (3.3 MB)
3.3 MB MP4
>>108050200
>>
>>
File: f2k9b_00010_.png (2 MB)
2 MB PNG
>>108050611
log in, call something slop, go back to bed
>trumps america
>>
>>
File: Flux2-Klein_00565_.png (524.2 KB)
524.2 KB PNG
>>108050641
>log in
>>
>>
>>
File: prompt z image turbo slop.png (1.8 MB)
1.8 MB PNG
>>108050611
your prompt request saar
>>
File: 1765054651619585.mp4 (3.8 MB)
3.8 MB MP4
>>108049242
>>
File: 1767141276210208.mp4 (3.7 MB)
3.7 MB MP4
>>108050092
>>
>>108050535
cool outfit
>>
File: xyz_grid-0007-1536019997-1.jpg (3.6 MB)
3.6 MB JPG
sampler and step testing
>>
>>
>>
File: z-image-turbo-fp8-aio_Schizo_00014_.jpg (2.7 MB)
2.7 MB JPG
boards:4:g;type:filename;/\.mp4$/i;stub:no;
yw
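A quick sketch of what the regex part of that filter line (`/\.mp4$/i`) matches, using Python's `re` module as a stand-in. The filter extension's own engine is JavaScript, so treat this as an approximation; the helper name and sample filenames are made up for illustration.

```python
import re

# Regex portion of the filter line: /\.mp4$/i
MP4_FILENAME = re.compile(r"\.mp4$", re.IGNORECASE)

def is_filtered(filename):
    """Return True if the filename ends in .mp4 (case-insensitive)."""
    return MP4_FILENAME.search(filename) is not None

for name in ["1746514962620857.mp4", "LTX-2_00133.MP4",
             "Anima_00025_.png", "clip.mp4.png"]:
    print(name, is_filtered(name))
```

Since it matches on extension alone, it hides every .mp4 attachment, including ordinary LTX-2 and Wan gens, not only the spam.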
>>
>>108050728
100 steps, pls.
>>
>>
File: 1755873079177751.mp4 (3.4 MB)
3.4 MB MP4
>>108049177
>>
File: 1751212327169995.mp4 (3 MB)
3 MB MP4
>>108049093
>>
>>108049862
>>108049869
Nvidia-only, at least currently.
>>
>>
>>
File: Screenshot 2026-02-03 114136.png (38.7 KB)
38.7 KB PNG
help me sarrrs
i no longer get preview kindly give assistance please
>>
>>
>>
>>
File: a.jpg (120.2 KB)
120.2 KB JPG
>>108050543
looks pretty good but is very random, is that just what promptless ltx does?
>>
>>108050828
>>108050833
multigpu has goof nodes already in it
>>
>>
>>
>>
File: 1761536596940439.png (127.2 KB)
127.2 KB PNG
>>108050828
Which one do I pick ???
>>
>>108050822
Have you redeemed ComfyUI options? If not you can always add --preview-method auto to your .bat/launch script
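For reference, ComfyUI's preview flag is spelled `--preview-method` with the method passed as a separate argument. A minimal sketch of building that launch command from Python follows; the `main.py` path and the helper name are assumptions, so check `python main.py --help` on your install for the accepted values.

```python
import sys

def comfyui_launch_args(main_py="main.py", preview_method="auto"):
    """Build an argv list for launching ComfyUI with latent previews enabled.

    `main_py` is a hypothetical path to ComfyUI's entry script; adjust it
    to wherever your checkout lives.
    """
    return [sys.executable, main_py, "--preview-method", preview_method]

print(" ".join(comfyui_launch_args()))
# To actually launch: subprocess.run(comfyui_launch_args(), check=True)
```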
Sometimes when your ComfyCloud Coins(tm) are low Comfyanonymous disables random features unless you buy a new loot box node.
>>
File: 1745993661891941.mp4 (3.7 MB)
3.7 MB MP4
>>108050840
>>108049085
It did have a prompt, but it is just so bad at following it that it looks random. I don't have the meta for that one anymore, but it was something about her playing a game in an amusement park to win plushies
>>
>>
>>108050458
>>108050543
Have you played with the new guider? I never managed to get good results with it.
>>
>>
File: 1740737446793011.mp4 (3.7 MB)
3.7 MB MP4
>>108050892
>>108050720
Yeah, it didn't seem to improve things much, if at all, for me.
>>
>>
File: Flux2-Klein-9bfp8_00174_.png (2.5 MB)
2.5 MB PNG
>>
>>
>>
>>
File: 1740495094375644.png (107.9 KB)
107.9 KB PNG
>>108050951
Sorry to bother you again, but where do I connect these 3 dots? I only have these 3 windows on the right side, not sure what to do.
>>
File: 00004-2287639850.jpg (1.5 MB)
1.5 MB JPG
>>
File: file.png (134.8 KB)
134.8 KB PNG
>>108051005
>>
>>
File: 1758045754009.png (197.3 KB)
197.3 KB PNG
>>108050991
Y U LIE? Not real.
>>
File: 1741949186928656.png (263.5 KB)
263.5 KB PNG
>>108051044
What to do here?
>>
>>
>>
i've installed comfyui non-portable on windows
my system python install is version 3.12.10
comfyui fails to load a python module i've installed, and in the error log, it says:
- **Python Version:** 3.12.11 (main, Aug 18 2025, 19:17:54) [MSC v.1944 64 bit (AMD64)]
- **Embedded Python:** false
i do not have python 3.12.11 anywhere on my system and have no idea how comfyui is finding and using it
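One way to track down which interpreter is actually running is to print its path from inside that same environment. A likely explanation (an assumption; nothing in the post confirms it) is that ComfyUI runs from its own virtual environment or a tool-managed Python rather than the system install, in which case modules installed into the system Python won't be visible to it. The helper name below is made up.

```python
import sys

def interpreter_info():
    """Collect where the running Python lives and whether it is a venv."""
    return {
        "executable": sys.executable,        # full path to the python binary
        "version": sys.version.split()[0],   # e.g. "3.12.11"
        "prefix": sys.prefix,                # active environment root
        "base_prefix": sys.base_prefix,      # underlying install root
        "in_venv": sys.prefix != sys.base_prefix,
    }

for key, value in interpreter_info().items():
    print(f"{key}: {value}")
```

Run it with the interpreter ComfyUI launches, then install the missing module with that exact executable (`<executable> -m pip install ...`) so it lands in the right environment.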
>>
>>108048751
Hey I’m new to stable diffusion and i’m trying to turn tom cruise into one of those super detailed 90s OVA film style anime drawings. Unfortunately I cant get the program to make any changes at all to my tom cruise image. Am I getting censored? Or am I just not doing it right? All it does is crush his face or turn him into a girl.
>>
>>108050867
unet distorch2 for safetensors main model checkpoints (probably this?)
gguf distorch2 for gguf main model checkpoints
as you can see you can load VAE CLIP DualClip etc with this too but maybe this is not needed
>>
>>
File: z_image_bf16_00067_.png (3.3 MB)
3.3 MB PNG
>>
File: Flux2-Klein9BDistill_00075_.png (1.4 MB)
1.4 MB PNG
she HATES her new sweater
>>
File: 1758381810806262.png (20.7 KB)
20.7 KB PNG
>>108051159
oh right.... i need to use the custom model, vae and clip loaders
but now i got this error, should I set virtual_vram_gb to my exact VRAM amount? Or less?
>>
>>
>>
>>108051210
the "virtual vram" is how much you want to offload to system RAM (if the donor device is CPU and the compute device your GPU)
basically it's what you want to NOT be in VRAM. there is also text output in the console explaining what happens once you run the workflow
>>
>>
File: 00005-3617743868.png (2.1 MB)
2.1 MB PNG
>>
>>
>>
>>
>>108051239
it is relatively easy, just tell it to use ~4GB virtual vram from cpu (system RAM) with the compute device otherwise being cuda:0 (presumably, unless you have multiple GPU)
and watch the console output if it is resulting in the 16GB VRAM -> 12GB VRAM + 4GB System RAM you need... maybe it still has issues like during VAE decode or some other specific part of the workflow but you can adjust the amount offloaded
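The arithmetic behind that split, as a rough sketch. The function name and the `headroom_gb` default are assumptions for illustration; how much VRAM latents, VAE decode and the rest actually need depends on the workflow, so treat the result as a starting value for `virtual_vram_gb` and adjust from the console output.

```python
def virtual_vram_needed_gb(model_gb, vram_gb, headroom_gb=1.0):
    """Estimate how many GB of the model to offload to system RAM.

    headroom_gb is VRAM kept free for everything that is not model weights.
    The 1.0 default is a guess, not a measured value.
    """
    usable = vram_gb - headroom_gb
    return max(0.0, float(model_gb) - usable)

# 16 GB model on a 12 GB card, no headroom: the 4 GB from the post above.
print(virtual_vram_needed_gb(16, 12, headroom_gb=0))
# Same card, leaving 1 GB free for activations and VAE decode.
print(virtual_vram_needed_gb(16, 12))
```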
>>
File: 1750916644899752.mp4 (2.9 MB)
2.9 MB MP4
>>108050944
>>
File: 1766626559639848.mp4 (3.8 MB)
3.8 MB MP4
>>108048839
>>
File: 1754595543602461.mp4 (3.8 MB)
3.8 MB MP4
>>108048876
>>
File: 1763257295472704.mp4 (3.8 MB)
3.8 MB MP4
>>108048988
>>
File: 1758813790349545.mp4 (3.8 MB)
3.8 MB MP4
>>108049037
>>
File: ComfyUI_00454_.png (1.3 MB)
1.3 MB PNG
Story time with image generation! Or, well, some evolution of time generation at least. For this, I used ComfyUI's built-in Qwen Image Edit 2509 preset, replaced the diffusion model with Qwen-Rapid-AIO-NSFW-v19, disabled the LORA it comes with, and upped the KSampler steps to 20. I also added "JPEG artifacts, blurry, ugly, low contrast" to the negative prompt before realising that leaving the cfg at 1.0 means it does jack shit. Oh well.
Since it was supposed to be a pure image-to-image experiment, I started with the photo of an older woman (Anita Werner, in case you're curious) and went from there.
Prompt: Make her young and beautiful. Keep her hair short.
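Why the negative prompt does nothing at cfg 1.0: in the standard classifier-free guidance formula, the output is `uncond + cfg * (cond - uncond)`, and at cfg = 1 the unconditional (negative) term cancels out. The sketch below uses plain lists of floats as stand-ins for model predictions; it illustrates the formula and is not ComfyUI's actual sampler code.

```python
def cfg_combine(cond, uncond, cfg):
    """Classifier-free guidance: uncond + cfg * (cond - uncond), per element."""
    return [u + cfg * (c - u) for c, u in zip(cond, uncond)]

cond = [0.5, -1.0, 2.0]           # prediction for the positive prompt
empty_negative = [0.0, 0.0, 0.0]  # stand-in for an empty negative prompt
long_negative = [3.0, -2.0, 1.0]  # stand-in for "JPEG artifacts, blurry, ..."

# At cfg 1.0 both negatives give back exactly the positive prediction.
print(cfg_combine(cond, empty_negative, 1.0))
print(cfg_combine(cond, long_negative, 1.0))

# Above 1.0 the negative prompt starts to matter.
print(cfg_combine(cond, empty_negative, 4.0))
print(cfg_combine(cond, long_negative, 4.0))
```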
>>
File: 1742084080646688.mp4 (3.7 MB)
3.7 MB MP4
>>108050641
>>
File: 1740190221372471.png (280.1 KB)
280.1 KB PNG
>>108051231
>>108051270
I'm not sure why it's not working. When I try to generate, ComfyUI just crashes. In the task manager I can see that it only loads 8 GB into VRAM and starts loading the rest into RAM, but before RAM is even 50% used it crashes and says "reconnecting"
>>
File: 1751583699476782.mp4 (3.8 MB)
3.8 MB MP4
>>108050664
>>
>>
File: ComfyUI_00456_.png (1.3 MB)
1.3 MB PNG
>>108051302
Add a kenomimi cat ears hairband in the same color as her hair.
>>
File: 1749508710892519.mp4 (3.8 MB)
3.8 MB MP4
>>108050840
>>
>>108050775
>>108050796
>>108050881
>>108050906
this is the future of entertainment. avant-garde short movie clips. pure kino. they'll call us lucky for having been here.
>>
File: ComfyUI_00458_.png (1.2 MB)
1.2 MB PNG
>>108051314
Change her ethnicity and face structure to Japanese. Keep the hair and eye color.
>>
>>
File: ComfyUI_00460_.png (1.1 MB)
1.1 MB PNG
>>108051334
Create an image with her playing ball on a sunny beach, wearing a bikini.
>>
File: Flux2-Klein9BDistill_00103_.png (1.6 MB)
1.6 MB PNG
>>
>>108048611
>>108048611
>>108048611
New thread
>>
File: Video_00001-2.mp4 (3.8 MB)
3.8 MB MP4
Neat, video gen of an older gen.
>>
File: ComfyUI_00463_.png (1.1 MB)
1.1 MB PNG
>>108051342
Add an alien invasion fleet to the distant sky.
>>
>>
>>
>>108051307
Run it with the console open
I'm not sure the clip_type is correct now, try qwen_image
Also as indicated earlier I'd try just offloading only the z-image model first, the others perhaps don't need to be offloaded or other models ejected on load straight away.
>>
File: ComfyUI_00464_.png (1.1 MB)
1.1 MB PNG
>>108051356
Add a small green alien in the background, holding up a sign "MARS NEEDS CATGIRLS".
I'm afraid I can't post the rest on a blue board.
>>
>>
>>
>>
>>
File: ComfyUI_00490_.png (1.1 MB)
1.1 MB PNG
>>108051358
I have no idea what the hell ComfyUI understood about that prompt. That's clearly a unicycle!
>>
>>
>>
>>
File: RedZDX-ZIB-Distilled-BF16_00001_.png (2.7 MB)
2.7 MB PNG
im deleting this shit
>>
>>108051384
>>108051419
there is a new thread already >>108048611
>>
>>
File: 1759216166909312.png (58.8 KB)
58.8 KB PNG
>>108051364
>>108051410
hell yeah! :) this setup worked, clip_type is correct, i got it here: https://docs.comfy.org/tutorials/image/z-image/z-image-turbo
>>
File: 1764510237536042.png (1.3 MB)
1.3 MB PNG
>>108051470
It's just the shitty default prompt, generating this took like 2 minutes. I'm definitely sticking with the 4-bit model, but I'm still happy I managed to get it working.
>>
>>
>>108051502
For some reason, it only uses like 2/3rd of my VRAM.
>>108051505
It's painfully slow. The 4-bit version, when fully loaded into GPU, generates 1024x1024 in 6 seconds; generating with RAM/VRAM sharing took like 2 minutes