Thread #9105298
File: 1767119245693809.jpg (415.1 KB)
415.1 KB JPG
Previous thread >>9083124
> Chroma
https://civitai.com/models/1330309/chroma
> Z-Image
https://civitai.com/models/2168935/z-image-turbo
> XL models
https://civitai.com/models/575395/big-lust
https://civitai.com/models/573152/lustify-sdxl-nsfw-checkpoint
> ComfyUI
https://github.com/Comfy-Org/ComfyUI?tab=readme-ov-file#get-started
https://comfyanonymous.github.io/ComfyUI_examples/
> Wan UI
https://github.com/deepbeepmeep/Wan2GP
> Related threads
>>>/r/realistic
>>>/gif/vdg
>>>/g/sdg
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/trash/sdg
>>>/aco/futasdg
>>>/b/ai
237 Replies
>>
File: 1768419264328397.png (750.8 KB)
750.8 KB PNG
>>
File: 1766781614630472.png (1.2 MB)
1.2 MB PNG
>>
File: 1767826241333160.jpg (169.5 KB)
169.5 KB JPG
>>
File: 1768526490208250.jpg (448.8 KB)
448.8 KB JPG
>>
File: 1767862861252344.jpg (852.6 KB)
852.6 KB JPG
>>
File: exampleslop.jpg (741.7 KB)
741.7 KB JPG
>>
File: ComfyUI_28446_.png (1000.7 KB)
1000.7 KB PNG
>>
File: ComfyUI_temp_rglnm_00021_.png (1.7 MB)
1.7 MB PNG
>>9105339
>>
>>
>>
>>
>>
File: ComfyUI_00053_ .png (2.6 MB)
2.6 MB PNG
>>
File: klein9b.jpg (337 KB)
337 KB JPG
>>
>>
File: 00007-624604268.png (3.4 MB)
3.4 MB PNG
>>
>>
File: 00011-567155090.png (3.8 MB)
3.8 MB PNG
>>
>>
>>
File: Wan2.2_image_to_video_00201_.mp4 (933.8 KB)
933.8 KB MP4
>>
>>
>>
>>
>>
>>
>>
>>
>>9106194
It means that newer image gen models have more complex compositions and higher levels of detail,
so something like Yande* brings up real porn or 3D art because there's not enough slop that looks like that. This isn't true for SFW images.
But it also means even if you use a real image, if it was mass imitated by simpler slop models then your search results are already polluted.
>>
File: 20251204191649-1445432608.jpg (881.5 KB)
881.5 KB JPG
>>9106189
If you want to know what kind of models and prompts people are using to generate their images, the only thing you can do is ask the poster to upload a version of the image with metadata to catbox. Reverse searching isn't going to give you shit.
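For PNGs that still carry their metadata, here is a minimal sketch of reading it with only the Python standard library. Assumptions on my part, not from the posts above: ComfyUI usually writes `prompt` and `workflow` text chunks and A1111-style UIs write `parameters`; `read_png_text_chunks` is a made-up helper name; only plain tEXt chunks are handled (iTXt/zTXt are skipped). Re-encoded uploads and JPGs will usually have nothing left to read.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return the tEXt keyword/value pairs found in raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt payload is keyword, a NUL byte, then the text.
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length
    return found
```

Usage would be something like `read_png_text_chunks(open("gen.png", "rb").read()).keys()` to see which keys survived, where `gen.png` is a placeholder filename.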
>>
File: ComfyUI_28557_.png (1.3 MB)
1.3 MB PNG
>>9106200
>>
>>
>>
File: 00067-207297620.png (644.7 KB)
644.7 KB PNG
>>
>>
>>
File: 1768721152.png (1.3 MB)
1.3 MB PNG
>>
File: svr2_260117214101_0001.jpg (3.9 MB)
3.9 MB JPG
I've been recreating my old workflow in Comfy and I'm thinking about testing newer models as refiners (still gonna use Pony base for the xxx). Any recommendations? I'd rather not download triple-digit GBs of models if one or two outshine the rest. I'm looking for a good refiner (if there is such a thing; back in the day it didn't work because other models didn't know the poses and/or genitalia details) and a good face detailer. I can live with just finding a good face changer that improves on ponyface without resorting to loras, which 99 times out of 100 are blurry because the source data is blurry.
>>
>>9106491
It'll be the new Flux 2 Klein model, probably the 9B but maybe the 4B. You'll want the one just named Flux 2 Klein which is the turbo distill for 4-steps (better with 6-8 though), not the "-Base" named one which isn't really intended for inference use. Very good (multi-)image editing and a very good t2i model too, it's a bit fucked around genitals but it's reasonably compliant around posing, breasts, and simple genitalia (though all still a bit dubious quality until we get nudity loras or a finetune).
>>
>>
>>
>>
>>9106655
any idea why I keep getting "Expected size for first two dimensions of batch2 tensor to be: [64, 128] but got: [64, 32]."? Tried both base and distilled, using correct encoder/VAE/latent image, error occurs in 5 different workflows at KSampler step whether default or advanced, image size irrelevant. only relevant search result seemed to finger a recent comfyui update - is your flux 2 klein workflow working right now?
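For anyone decoding that message: as far as I can tell the wording matches the shape check of a batched matrix multiply, where `[b, n, m] @ [b, m, p]` needs the inner sizes to agree. A pure-Python sketch of that rule follows; `check_bmm_shapes` is a hypothetical helper, and the 4096 sizes are made-up placeholders, since only the 64, 128 and 32 appear in the error.

```python
def check_bmm_shapes(batch1, batch2):
    """Mimic a batched matmul shape rule: [b, n, m] @ [b, m, p] -> [b, n, p]."""
    b, n, m = batch1
    # batch2 must share the batch size b and the inner dimension m.
    if list(batch2[:2]) != [b, m]:
        raise ValueError(
            "Expected size for first two dimensions of batch2 tensor to be: "
            f"[{b}, {m}] but got: [{batch2[0]}, {batch2[1]}]."
        )
    return (b, n, batch2[2])


# Matching inner dimension (128 on both sides) passes.
print(check_bmm_shapes((64, 4096, 128), (64, 128, 4096)))

# Inner dimension 128 vs 32 reproduces the wording of the error.
try:
    check_bmm_shapes((64, 4096, 128), (64, 32, 4096))
except ValueError as err:
    print(err)
```

Read that way, one side of the multiply has an inner size of 128 and the other 32, so something in the graph is feeding the sampler a tensor with an unexpected layout. It doesn't say which node is responsible.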
>>
>>9106649
First recommendation would be to stop using Pony and move to illustrious, noobAI or Chroma. But I suppose you can use those as refiners too if you really want. Chroma is a different architecture, so you can't do the true refiner flow with it, just img2img of a finished pic.
Faces are a lot easier, unless it's a blowjob or something you're not limited to just nsfw models. That would mean z-image, qwen, flux2. Cannot say which is best for this use case, I mostly do cosplay shit where they all suck.
>>
>>
>>
>>
>>
>>
>>
File: 00037-774623539.jpg (321.2 KB)
321.2 KB JPG
>>
File: 20260119084831-3187306691.jpg (800.9 KB)
800.9 KB JPG
>>
>>
File: jesus christ.jpg (74.5 KB)
74.5 KB JPG
>>9107956
>>
File: 221666132.png (2.3 MB)
2.3 MB PNG
>>
File: 00017-890851857.jpg (415.2 KB)
415.2 KB JPG
>>
File: Flux2-Klein_00046_.png (2.6 MB)
2.6 MB PNG
>>9106788
in case anyone else is hunting for this when trying to run Flux 2 Klein, it's one specific thing: in settings for VHS custom nodes turn off "Display animated previews when sampling". Restart and enjoy.
I haven't even started testing integration into other workflows or upscaling/refining but it's impressive image modification
>>
File: 00007-576024955.png (2.7 MB)
2.7 MB PNG
>>
>>
>>
>>
File: ComfyUI_00255_.png (2.2 MB)
2.2 MB PNG
>>
File: ComfyUI_28870_.png (2.4 MB)
2.4 MB PNG
Maybe I'm finally starting to figure out Chroma
>>
>>
>>
>>9108777
I'm guessing the source is probably cartoon/anime fanart and not irl cosplay. But either way, her having 3+ canon costumes with the same color theme and overlapping features doesn't help. That always confuses AI.
>>
>>
>>
>>
>>
>>
>>
>>
File: 1651137220416.png (224.4 KB)
224.4 KB PNG
Imagine getting dragnet range banned in 2026 with these captchas. Couldn't be me.
>>
>>
File: z-image_00076_.png (367.3 KB)
367.3 KB PNG
>>9108901
Already there friend, I'm just generating images for fun, that's all. I was just hoping anyone could point me to some models close or better in quality than that. I've played around with ZIT and I had some success, but I don't think it's quite like what I'm after. Same prompt.
>>
>>
>>
>>
File: 00015-692861700.png (1.5 MB)
1.5 MB PNG
>>9109049
uh no as someone warned >>9096235
zit is the best of what's available for speed and simplicity. XL is also better than chroma for a [degenerate] beginner.
>>
>>
>>9109023
keep the prompt simple, do small batches in lower resolution, then pick the better ones and pipe them through img2img.
that's when you enhance your prompt by small details like facial expressions, pubic hair, etc. since you don't want to confuse the model with these beforehand.
>>
>>
File: 00095-571960610.png (2.1 MB)
2.1 MB PNG
>>
>>
>>
File: ComfyUI_28982_.png (1.3 MB)
1.3 MB PNG
>>9109532
Concept-wise, there's no way. But they do lean more towards anime visually and anatomically, if you use too many tags at once.
>>
i *would* prefer my pics a bit less glossy, but my potato can't run the better models or filters.
>>
>>9109559
Judging by the scores you're still using Pony, consider moving to illustrious or noobai. Same hardware requirements, similar prompting style.
Could also try dropping some or all of the score tags. The way they improve "quality" is mainly through style bias, and you have a different style in mind than what score_8_up wants to look like on the Pony base.
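If you'd rather strip the score tags from a pile of saved prompts than edit them by hand, a small sketch. It assumes the usual Pony-style `score_9`, `score_8_up` pattern in a comma-separated prompt; `drop_score_tags` and the example tags are made up for illustration.

```python
import re

# Matches tags such as "score_9" or "score_8_up", nothing else.
SCORE_TAG = re.compile(r"^score_\d+(_up)?$")


def drop_score_tags(prompt):
    """Remove Pony-style score tags from a comma-separated prompt."""
    tags = [t.strip() for t in prompt.split(",")]
    kept = [t for t in tags if t and not SCORE_TAG.match(t)]
    return ", ".join(kept)


print(drop_score_tags("score_9, score_8_up, score_7_up, portrait, maid outfit, soft lighting"))
```

To drop only some of them, filter on the number instead of removing every match.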
>>
>>9109574
yeaah... the relevant word is "potato". it can handle euler ancestral at max 25 iterations, and the illustrious-based models i looked at tell me to use karras at 40, which would triple the computing time for me.
didn't have a closer look at noobAi yet, at first glance it seemed anime only.
oh, and by potato i mean: i render on cpu.
>>
>>9109657
That's just people being retarded. With the exception of lightning distills, speed loras and similar; samplers and schedulers work the same regardless of model. Whether it's SD1.5, or XL, or Flux, or Qwen, whatever, you can run it at 20-step euler with no issue.
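Rough arithmetic on the step counts mentioned above, assuming render time scales with the number of model evaluations and each evaluation costs about the same. Going from 25 to 40 steps alone is 1.6x; it only gets near triple if the sampler also calls the model twice per step, as second-order samplers such as Heun do. `relative_cost` is a made-up helper.

```python
def relative_cost(steps_old, steps_new, evals_old=1, evals_new=1):
    """Ratio of model evaluations: new settings vs old settings."""
    return (steps_new * evals_new) / (steps_old * evals_old)


print(relative_cost(25, 40))               # same sampler family, more steps
print(relative_cost(25, 40, evals_new=2))  # more steps and two model calls per step
print(relative_cost(25, 20))               # the 20-step euler suggestion
```

So on CPU, sticking with a one-call-per-step sampler and a modest step count keeps the time about where it was.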
>>
>>9109574
i don't know... i removed the scores, and now she looks at me like she has an icepick hidden somewhere.
>>
>>
>>9109845
when i assembled my pc 8 years ago, before AI, i didn't have gaming in mind... i didn't just skip the big graphics card, i also sized the power supply accordingly, i.e. too small to add one later... and now i consider it too much of a hassle. *shrugs*
>>
>>
>>
>>
File: 00014-665399740.png (1.1 MB)
1.1 MB PNG
>>
File: 00040-1467947682.png (3.7 MB)
3.7 MB PNG
>>
File: 00023-4280405212.png (3.5 MB)
3.5 MB PNG
>>9110293
>>
File: 00005-334025072.png (3.6 MB)
3.6 MB PNG
>>9110294
>>
>>
>>
>>
File: TS_SC_0022.jpg (113 KB)
113 KB JPG
>>
File: TS_SC_0019.jpg (119.6 KB)
119.6 KB JPG
>>
File: TS_SC_0013.jpg (105.8 KB)
105.8 KB JPG
>>
File: TS_CS0019.jpg (153 KB)
153 KB JPG
>>
File: TS_slt_0035.jpg (126.5 KB)
126.5 KB JPG
>>
oh well, on with the latex maids ^^
i remember sd1.5, looked like crap compared to pony. *but* i found an lcm lora that *seems* to work so far.
>>
>>
>>
>>
File: 00035-493783123.png (1.4 MB)
1.4 MB PNG
>>
>>
>>
>>
>>
File: _00004_.webm (708 KB)
708 KB WEBM
>>
>>
>>
File: 00041-373825085.jpg (381.9 KB)
381.9 KB JPG
>>
File: HiResCyberPMajEl_00028_.png (3.1 MB)
3.1 MB PNG
>>
File: HiResCyberPMajEl_00025_.png (3.2 MB)
3.2 MB PNG
>>
>>
File: 05100-1381446484.png (1.7 MB)
1.7 MB PNG
>>
File: f2k_c_65536_260124_00032_.jpg (3.7 MB)
3.7 MB JPG
>>
File: f2k_c_69420_260124_00001_.jpg (3.6 MB)
3.6 MB JPG
>>
File: f2k_c_65536_260124_00005_.jpg (3.8 MB)
3.8 MB JPG
comic pages are a goldmine but i need to set up regional prompting so I can prompt details of each panel separately
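A starting point for the regional setup could be just computing one box per panel and pairing each with its own prompt. This sketch assumes an evenly split grid with a fixed gutter, which real pages often aren't; `panel_regions` and the page size are hypothetical, and feeding the boxes into an actual regional-prompting node is left out.

```python
def panel_regions(page_w, page_h, rows, cols, gutter=0):
    """Return (x, y, w, h) boxes for an evenly split rows x cols page."""
    panel_w = (page_w - gutter * (cols + 1)) // cols
    panel_h = (page_h - gutter * (rows + 1)) // rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            x = gutter + c * (panel_w + gutter)
            y = gutter + r * (panel_h + gutter)
            boxes.append((x, y, panel_w, panel_h))
    return boxes


# Pair each box with its own panel prompt (placeholder text).
boxes = panel_regions(1024, 1536, rows=3, cols=2, gutter=16)
prompts = [f"panel {i + 1} description" for i in range(len(boxes))]
for box, prompt in zip(boxes, prompts):
    print(box, prompt)
```

The same boxes would also work as crop rectangles if you go the inpaint-each-panel-and-reassemble route instead.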
>>
>>
File: f2k_c_65537_260124_00003_.jpg (3.8 MB)
3.8 MB JPG
>>9112796
yeah that chunli hand was about the best out of a few attempts, the age-old struggle. inpainting is fine but I might as well just start with inpainting each panel and reassembling if i'm gonna do that
>>
>>
>>
File: f2k_c_3141592654_260124_00001_.jpg (3.7 MB)
3.7 MB JPG
>>
File: f2k_c_65536_260124_00017_.jpg (3.6 MB)
3.6 MB JPG
for the next OP:
> Flux 2 Klein
https://docs.comfy.org/tutorials/flux/flux-2-klein
>>
>>
File: 00094-521037609.jpg (469.2 KB)
469.2 KB JPG
>>
File: SJMjkndAnkQtxmT.jpg (282.8 KB)
282.8 KB JPG
With local models and the jazz y'all are running, can you get this sort of detail with simple prompts like you can with Grok? Or is it a lot more buy-in and time to get this sort of result
>>
File: 1769336928.png (1.1 MB)
1.1 MB PNG
>>
>>
>>
File: ComfyUI_573_.png (1.4 MB)
1.4 MB PNG
>>9113034
>this sort of detail
The amount of detail was not an issue three years ago, you can always prompt more stuff just for the sake of it, or slap on a detail lora that will make all textures more complex and add shit like dust particles or dripping fluids or whatever.
Takes some figuring out how to get the scene you want because the models don't really understand language, only short phrases or tags. But I'm guessing it's no more effort than figuring out how to skirt around the censors online.
Maybe rephrase the question?
>>9113162
https://realbooru.com
Better than nothing, but it's shit. There's no tag wiki or other guidance on how a specific tag is supposed to be used, and no moderation/enforcement either.
If one person tags "huge breasts" when they're big for anime standards and another person when they're bigger than his ex-gf had, then the model ends up jumping between c-cups and k-cups and it's all pointless. Not to mention people making up random tags like "performing_fellatio_on_male_lying_on_back_while_bent_over_beside_him" which would only serve to confuse the model.
>>
>>9113164
The purpose is tag standardization, which would make training these small local models easier. It's why people keep merging in Pony or illustrious, even though those were only meant for cartoons and anime. Because at least they understand camera angles, composition, tons of clothing and hairstyles, two dozen sex poses, etc.
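What standardization means mechanically is roughly this: fold spelling variants and made-up one-offs into a single canonical tag before training. A toy sketch, where the alias table and `normalize_tags` are invented for illustration and not taken from any real booru:

```python
# Hypothetical alias table: variant spelling -> canonical tag.
ALIASES = {
    "blonde": "blonde_hair",
    "blond_hair": "blonde_hair",
    "pony_tail": "ponytail",
}


def normalize_tags(raw_tags, aliases=ALIASES):
    """Lowercase, underscore, alias and de-duplicate a list of tags."""
    seen = []
    for tag in raw_tags:
        tag = "_".join(tag.strip().lower().split())
        tag = aliases.get(tag, tag)
        if tag and tag not in seen:
            seen.append(tag)
    return seen


print(normalize_tags(["Blonde", "blond hair", "Pony Tail", "ponytail", "from above"]))
```

A table like this only fixes spelling. It doesn't settle what a tag is supposed to mean, which is what a tag wiki and moderation are for.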
>>
>>9113176
What do you mean? There are already sophisticated VLMs for tagging in booru styles or natural language. Maybe they're not entirely consistent, but that doesn't appear to be the bottleneck.
The issues come from the actual challenge of training the models and also how realism datasets appear to cause unique problems compared to only 2D images.
>>
>>9113188
You say that as if it's ever been attempted. afaik Chroma is the only realism tune that listens to tags natively without merging. And all of that comes from the danbooru portion, judging by how it starts to struggle with realism the more tags you use.
>>
>>
>>
>>
>>
File: ComfyUI_00024_.png (1.5 MB)
1.5 MB PNG
>>
>>
>>9113361
Chroma is so different from flux that it acts like a different base model. Flux is the same as all censored models in that it cannot do soles ever in any scenario. Chroma can do them but not great. I tried qwen once and it was somewhat promising so maybe that's the way.
>>
>>
>>
>>
>>9113344
i looked at a couple of old dall-e /aco/ threads, just long enough before my brain leaked out my ears.
in fairness, like I read someone say, a lot of AI use seems like self-inflicted brain damage, but it's more understandable if you view it as people puffing magic smoke repeatedly with no open window.
anyway old dall-e 3 is still a better footfag model. and it's still a massive image model compared to anything available for local, maybe with the exception of Flux 2?
and the training material they used at the time probably had less filtering.
>>
>>9113475
I'm talking about the amount of detail, not quality. If you just want a ton of crap in the image instead of flat surfaces. SD1.5 had an issue with it even, too much detail that often made no sense. And if you upscaled it to 2K, or 4K, it would only get worse as it added more.
>>
>>
>>
File: 87201.jpg (1.4 MB)
1.4 MB JPG
>>9106140
>klein
yes, this is impressive, but why the hell can't i rip off vermeer or monet accurately? everything is folded into generic style tags. what, stealing from the long dead is as unethical as doing it to mickey mouse or what?
>>
>>
File: 00023-111931408.png (3.6 MB)
3.6 MB PNG
>>
>>
>>
File: 00043-180184452.jpg (143.4 KB)
143.4 KB JPG
>>
>>
File: 00053-980330944.png (2.3 MB)
2.3 MB PNG
>>
File: eileen.jpg (412 KB)
412 KB JPG
>>9112990
>certain kinds of edits
>>
>>
>>
File: turbo vs non-turbo_fixed.jpg (517.8 KB)
517.8 KB JPG
>>9109057 >>9096235
Non-turbo Z-Image getting the prompt here.
>>
>>
File: z-image_00058_.png (1.4 MB)
1.4 MB PNG
>>
File: z-image.png (3.9 MB)
3.9 MB PNG
>>
File: 00034-441919828.png (3.2 MB)
3.2 MB PNG
>>
>>
>>
>>
>>
>>
File: Beaver World.jpg (143.1 KB)
143.1 KB JPG
>>
File: 00013-867452575.png (2.2 MB)
2.2 MB PNG
>>
File: 00037-293971695.png (2.5 MB)
2.5 MB PNG
>>
File: 084859417.gif (2.1 MB)
2.1 MB GIF
>>
File: 00039-508032747.png (2.8 MB)
2.8 MB PNG
>>
>>
File: it's klein.jpg (217.3 KB)
217.3 KB JPG
>>9110293 >>9116528
there's no "prompt" to catbox, just use natural language.
>>
File: klein9b_klein4b.png (3.8 MB)
3.8 MB PNG
Klein 4B fails to edit the orb in this image after several attempts. 9B gets it.
>>
File: original.jpg (53.5 KB)
53.5 KB JPG
>>9116955
>>
File: ComfyUI_00019_.png (3.7 MB)
3.7 MB PNG
>>
>>
>>9117001
>I mean what model
flash chroma with a character lora. the style is just from the lora and basic tags like "glossy" or "realistic 3d".
klein 9b changed almost the entire image
https://litter.catbox.moe/n8arabu1zfsmbzud.jpg
>With Klein and natural language just one word can change or break output.
it seems that klein needs specific resolutions or it can shift and warp the image. idk if it has to be divisible by 32 or whatever. if res is too low and maybe too high it can also shift colors.
assuming none of that is a problem, if it changes the entire image it's because your prompt is too vague, too general, poorly described or it just doesn't understand. and it's ultimately RNG.
it also matters if you're using the 4b model that will struggle more.
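If you want to rule out the resolution theory, snapping the input size to a multiple before editing is cheap to try. The 32 here is only the guess from above, not a confirmed requirement, and `snap_resolution` is a made-up helper.

```python
def snap_resolution(width, height, multiple=32):
    """Round both sides to the nearest multiple (at least one multiple)."""
    def snap(value):
        # Integer rounding to the nearest multiple, halves round up.
        return max(multiple, (value + multiple // 2) // multiple * multiple)
    return snap(width), snap(height)


print(snap_resolution(1000, 1080))
print(snap_resolution(1000, 1080, multiple=64))
```

Resize or crop the source image to the snapped size first, so the model doesn't have to pad or stretch it.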
>>
File: 063104_00001_1.mp4 (3.1 MB)
3.1 MB MP4
>>9109845
>>
>>
>>
File: download-57.jpg (203.8 KB)
203.8 KB JPG
>>
File: 966710619.png (2.1 MB)
2.1 MB PNG
>>9117618
showing this to the next guy who asks why the cheap laptop he bought came with 2GB of RAM and no screen.
>>
File: 00072-886039112.jpg (449.2 KB)
449.2 KB JPG
>>
File: 1235635937.jpg (58.9 KB)
58.9 KB JPG
>>9114321
Computer disengage safety protocols and disable all power limits.
Generate nude pregnant Aerith sequence for the next 24 hours and forward all my calls.
>>
>>
>>
File: ComfyUI_00009_.png (1.4 MB)
1.4 MB PNG
>>
>>
File: ComfyUI_00022_.png (1.5 MB)
1.5 MB PNG
>>9118083
>>
>>
File: ComfyUI_00044_.png (1.2 MB)
1.2 MB PNG
>>9118086
>>
>>
>>
File: 092606_00001_1.mp4 (3.3 MB)
3.3 MB MP4
>>9118083
And she was never seen again.
>>
>>
>>
>>
File: GQEVmY0XAAAlcVV.jpg (117.7 KB)
117.7 KB JPG
>>
>>
>>
>>
>>
>>
>>
File: 342043393.jpg (437.6 KB)
437.6 KB JPG
>>
>>
>>
File: anima-preview.png (1.6 MB)
1.6 MB PNG
>>
File: 1770053182532973.png (2.4 MB)
2.4 MB PNG
>>
File: klein_001.jpg (414.1 KB)
414.1 KB JPG
>>
File: 00074-265099066.png (2.7 MB)
2.7 MB PNG
>>
File: 00078-804466626.png (1.8 MB)
1.8 MB PNG
>>
File: HiResCyberPMajEl_00029_.png (3.2 MB)
3.2 MB PNG
>>
File: ashfgajfmil.png (1.1 MB)
1.1 MB PNG
is there another board I can share realistic gens? I feel everyone is focused on drawings...
>>
File: 00012-331651925.png (2.2 MB)
2.2 MB PNG
>>
File: 00109-1081418672.jpg (3.9 MB)
3.9 MB JPG
>>
>>
File: 20260119094621-3009849547.jpg (669 KB)
669 KB JPG
>>9120206
There isn't, unless you want to hang out with lolicons on /b/. /s/ and /hc/ don't accept AI gens because they classify them as fakes.
>>
File: tegaki_2026_02_03_20_36_09.png (1.7 MB)
1.7 MB PNG
>>9120441
That's too bad. I have outgrown /b/, I think. I guess CivitAi it is for me...
>>
File: 1480610567.gif (127 KB)
127 KB GIF
>>9120481
>I have outgrown /b/
>CivitAi it is for me
>>