Thread #108590807
File: highlights_g_108585019_1776010386_1.jpg (3.4 MB)
Discussion and Development of Local Image and Video Models
Previous: >>108585019
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe
>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
>Qwen
https://huggingface.co/collections/Qwen/qwen-image
>Klein
https://huggingface.co/collections/black-forest-labs/flux2
>LTX-2
https://huggingface.co/Lightricks/LTX-2
>Wan
https://github.com/Wan-Video/Wan2.2
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>mfw Resource news
04/12/2026
>LTX-2 VBVR LoRA - Video Reasoning
https://huggingface.co/LiconStudio/Ltx2.3-VBVR-lora-I2V
04/11/2026
>ComfyUI-RookieUI: The ultimate A1111-style sidebar
https://github.com/rookiestar28/ComfyUI-RookieUI
>Qwen3.5-4B-Base-ZitGen-V1: Image captioning fine-tune of Qwen 3.5 4B optimized for Z-Image Turbo
https://huggingface.co/lolzinventor/Qwen3.5-4B-Base-ZitGen-V1
>ComfyUI Memory Visualization
https://github.com/kijai/ComfyUI-MemoryVisualization
04/10/2026
>JoyAI-Image-Edit now supports ComfyUI
https://github.com/jd-opensource/JoyAI-Image#-news
>Two Front Doors: Civitai.com, Civitai.red, and What's Next
https://civitai.com/articles/28369/two-front-doors-civitaicom-civitaired-and-whats-next
>Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator
https://fr0zencrane.github.io/uni-vigu-page
>PrivFedTalk: Privacy-Aware Federated Diffusion with Identity-Stable Adapters for Personalized Talking-Head Generation
https://github.com/mazumdarsoumya/PrivFedTalk
>AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation
http://aka.ms/avgenbench
>Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video
https://chanhyeok-choi.github.io/C-MET
>ChenkinNoob-XL-V0.5
https://modelscope.ai/models/ChenkinNoob/ChenkinNoob-XL-V0.5
>Control Order & Free Memory: Controls the order of node execution with device-agnostic memory management
https://github.com/mkim87404/ComfyUI-ControlOrder-FreeMemory
>DMax: Aggressive Parallel Decoding for dLLMs
https://github.com/czg1225/DMax
04/09/2026
>MAR-GRPO: Stabilized GRPO for AR-diffusion Hybrid Image Generation
https://github.com/AMAP-ML/mar-grpo
>HybridScorer: Score, sort, and cut large sets down fast with GPU-accelerated AI review
https://github.com/vangel76/HybridScorer
04/08/2026
>OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters
https://github.com/ControlGenAI/OrthoFuse
>>
>mfw Research news
04/12/2026
>Preserving Forgery Artifacts: AI-Generated Video Detection at Native Scale
https://arxiv.org/abs/2604.04634
>Generative Photomosaic with Structure-Aligned and Personalized Diffusion
https://robot0321.github.io/GenerativePhotomosaic/index.html
>DiffVC: Non-AR Framework Based on Diffusion Model for Video Captioning
https://arxiv.org/abs/2604.08084
>HandDreamer: Zero-Shot Text to 3D Hand Model Generation using Corrective Hand Shape Guidance
https://arxiv.org/abs/2604.04425
>BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion
https://arxiv.org/abs/2604.04395
>Image-Guided Geometric Stylization of 3D Meshes
https://changwoonchoi.github.io/GeoStyle
>Rethinking Position Embedding as a Context Controller for Multi-Reference and Multi-Shot VidGen
https://arxiv.org/abs/2604.03738
>SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation
https://arxiv.org/abs/2604.07101
>HEDGE: Heterogeneous Ensemble for Detection of AI-GEnerated Images in the Wild
https://arxiv.org/abs/2604.03555
>ABMAMBA: Multimodal Large Language Model with Aligned Hierarchical Bidirectional Scan for Efficient Video Captioning
https://arxiv.org/abs/2604.08050
>FIT: Large-Scale Dataset for Fit-Aware VTON
https://johannakarras.github.io/FIT
>HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models
https://github.com/peppery77/HAWK.git
>IQ-LUT: interpolated and quantized LUT for efficient image super-resolution
https://arxiv.org/abs/2604.07000
>TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders
https://arxiv.org/abs/2604.07340
>ResGuard: Enhancing Robustness Against Known Original Attacks in Deep Watermarking
https://arxiv.org/abs/2604.03693
>Appear2Meaning: Cross-Cultural Benchmark for Structured Cultural Metadata Inference from Images
https://arxiv.org/abs/2604.07338
>PortraitCraft: Benchmark for Portrait Composition Understanding and Generation
https://arxiv.org/abs/2604.03611
>>
>>108586449
>https://civitai.com/models/2536147?modelVersionId=2850290
Has anon tried training an Anima LoRA using the official configs?
>>
File: o_00195_.png (1.5 MB)
>>
File: Z-Image_00028_.png (1.2 MB)
>>108590677
i vibe coded a sampler that uses gemma 4 as a judge. if the images come out as total slop it auto regenerates them with a different seed
soon ill have gemma also change the prompt and sampling settings depending on whats consistently messing up the outputs
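The judge-and-reroll loop described here can be sketched roughly as below. This is a minimal illustration, not the anon's actual sampler: `generate` and `judge` are hypothetical callables standing in for the image sampler and the Gemma-based check, and the attempt cap is an assumed safeguard against rerolling forever.

```python
import random
from typing import Callable, Optional, Tuple


def generate_with_judge(
    generate: Callable[[str, int], object],
    judge: Callable[[object], bool],
    prompt: str,
    max_attempts: int = 4,
    rng: Optional[random.Random] = None,
) -> Tuple[object, int, int]:
    """Reroll with a fresh seed until the judge accepts or attempts run out.

    Returns (image, seed, attempts_used). If every attempt is rejected,
    the last image is still returned so the caller can inspect it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    image, seed = None, 0
    for attempt in range(1, max_attempts + 1):
        seed = rng.randrange(2**32)      # new seed for every attempt
        image = generate(prompt, seed)   # hypothetical sampler call
        if judge(image):                 # hypothetical "is this slop?" check
            return image, seed, attempt
    return image, seed, max_attempts


if __name__ == "__main__":
    # Stand-ins: the "image" is a dict, and the judge accepts even seeds only.
    fake_generate = lambda prompt, seed: {"prompt": prompt, "seed": seed}
    fake_judge = lambda image: image["seed"] % 2 == 0
    print(generate_with_judge(fake_generate, fake_judge, "1girl, selfie"))
```

Changing the prompt or sampling settings between attempts, as planned in the post, would slot in where the new seed is drawn.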
>>108590680
cumfart is decent for some quick gens but once you throw in too many moving parts it shits the bed. its failing to keep up with all the stuff we can do now
>>
File: Anima_01723_.png (936.8 KB)
>>108590830
I might try soon. I will be busy this week but maybe I will spare time.
>>
File: 00053-2467153369.png (1.1 MB)
>>
>>108590854
Which Gemma 4, 31b?
I wasn't too impressed with 26b moe's visual reasoning. I doubt 31b is that much better.
I am also very skeptical that it will handle intricacies of sampling settings well. Hell even SOTA API LLMs aren't particularly great about reasoning when it comes to that.
>>
File: Anima_01760_.png (2.1 MB)
It just collapsed I guess.
>>
File: Anima_01761_.png (1.2 MB)
I showed you my p̶̨̢̹̻̣̤͚̭͔̮̺̃̒̇̍̇̐̃̃̉̀̐̇̀̐̈́̔́̈́̈́̐̂̌͒̏͂͛̕͘̚͝ͅę̶̨͔̜͈̞̠̬̩̣̭̖͔̼̝̖̦̭͚͗̿͗̔̚ ̥̺ǹ̴̨̦̱̯͈̟̝̘̤̜͍͆̌̍͋̃̉̉̏͑̀͒̏͌́̚̕̕͜͝ ̧̼̱̼͕̩͎̬͉̝͔̣͔̘͜ͅi̷̛̛̔̏̿͑̌̐́̆̋̈̀̾͂͠͠ ̨͙̬̥̠̼̹͇̙̙͖̖͍͙̄̊̃͛̂́̆̏̔̊͗̄̿͗̽̀̀͌̿͑̇ ͇s̷̠̞̪̥̓̎̌͗͆́̆̆͐͛̍̈̀̈̉́̈̽̉̔͊̍̿̕̕̕͝͝ ̤͈̣͔̠͍̯ ,please respond
>>
File: Anima_01765_.png (2.1 MB)
>>
File: 1764947232216650.png (686.3 KB)
>>108590887
im using 26b and its good enough at detecting bad images which is all i need it to do right now
you might be right, it would probably be better to just have it inpaint the areas it flags as problematic instead of playing whackamole with the sampling settings
>>
>>108590830
I’d like to, but I don’t know where to get high quality images of the artists I want for free. Maybe RuTracker?
>>108590827
It doesn’t make sense to me anymore with models like Anima or Z Image, plus SAM3. As a former Invoke and current Krita user, I don’t see the need. Maybe just something to clean between SAM3 layers, since there’s always residue and the option to choose the mask context. I see it more as a layer mask manager, a tab to centralize outputs and manage, blend, and crop them, but for that Krita output nodes exist.
>>
>>108591071
>I don’t know where to get high quality images of the artists I want for free.
danbooru
gelbooru
rule34 dot xxx
paheal
xitter
pixiv
artstation
Most images here should be high quality enough for 1024p training.
>>
File: anima_selfie1.png (1.2 MB)
>The model is designed for making illustrations and artistic images, and will not work well at realism.
Bullshit. I don't know what kind of undisclosed secret sauce is in the training data, but there ain't no way a model trained only on drawings can do this.
>>
File: o_00199_.png (599.5 KB)
>>
File: anima_selfie2.png (1.1 MB)
>>108591129
Seriously kekstone needs to stop the schizo Chroma2 experiments that are going nowhere, and just finetune Anima on the Chroma dataset. The model is like 95% of the way there already, it would take 2 epochs, he could have it done in less than a month.
>>
>>108591129
>undisclosed secret sauce
tdrussell openly stated that ye-pop is in it. He also added a regularization dataset in preview 2 to reduce forgetting of the base model's realism knowledge.
>>108591142
His actions will never make sense.
>>
>>108591153
I know it has ye-pop, but it says he filtered the photos out of it. Unless that part is a lie, or the filtering was bad and let a lot of realism through. I guess a regularization dataset implies realism knowledge so maybe that's how.
>>
>>108591160
>implying I'm lying about what model this is
Positive: Amateur photography. High quality candid photo of a young Asian woman with long, straight black hair, taking a selfie in a messy bedroom. The photo is taken from a slightly elevated angle, with her outstretched arm in the frame. The woman is wearing a Japanese schoolgirl sailor-style uniform. A bed and part of a window are visible in the background.
Negative: worst quality, low quality, score_1, score_2, score_3, artist name, anime, illustration, cartoon, blurry
CFG 5, er_sde, beta57 scheduler. No catbox, fuck you, type it in and try it yourself.
>>
File: ComfyUI_temp_axxiq_00001_.png (2.8 MB)
>>
>>108591164
>midjourney style illustrations
No it's more interesting. They are images from LAION that are conceptually similar to a dataset from MJ.
>4.25 million Midjourney images were downloaded from this huggingface repository, and CLIP L14 vectors were generated for each image. Using the k-means clustering method, these vectors were assigned to 10,000 centroids. The CLIP vectors of these centroids were then used to retrieve nearest neighbors from the LAION-5B dataset using the image search website, focusing on those with aesthetic values of at least 0.5 and a minimum resolution of 768 pixels on the shortest side.
So not MJ gens but rather real images that are like MJ gens.
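The retrieval step in the quoted description (nearest neighbors to each centroid, kept only if they pass the aesthetic and resolution thresholds) can be sketched like this. It assumes the centroids have already been computed by k-means. The `Candidate` fields and function names are hypothetical, and a brute-force cosine search stands in for the image search website the dataset authors actually used.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    name: str
    embedding: Sequence[float]  # e.g. a CLIP image vector
    aesthetic: float            # aesthetic score of the image
    width: int
    height: int


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(
    centroids: Sequence[Sequence[float]],
    pool: Sequence[Candidate],
    k: int = 1,
    min_aesthetic: float = 0.5,
    min_short_side: int = 768,
) -> List[Candidate]:
    """Return up to k nearest eligible candidates per centroid, without duplicates."""
    eligible = [
        c for c in pool
        if c.aesthetic >= min_aesthetic and min(c.width, c.height) >= min_short_side
    ]
    picked: List[Candidate] = []
    seen = set()
    for centroid in centroids:
        ranked = sorted(eligible, key=lambda c: cosine(centroid, c.embedding), reverse=True)
        for cand in ranked[:k]:
            if cand.name not in seen:
                seen.add(cand.name)
                picked.append(cand)
    return picked
```

The thresholds default to the values quoted above (aesthetic of at least 0.5, shortest side of at least 768 pixels).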
>>
>>108591071
>I’d like to, but I don’t know where to get high quality images of the artists I want for free.
You might be surprised to know that even on this very site there are threads that contain hundreds of high quality images. There's even an entire board for it.
>>
>>108591175
>but it says he filtered the photos out of it.
Forgot about that part. I guess enough slipped in then.
>I guess a regularization dataset implies realism knowledge
Yes.
>A regularization dataset is introduced to improve natural language comprehension and help preserve non-anime knowledge.
>>
File: ComfyUI_temp_axxiq_00002_.png (2.2 MB)
>>
File: Video_00004.mp4 (2.2 MB)
I'm stumped at latent upscaling. It does not want me to go over 4 steps at all for high noise, no matter the scheduler or sampler. Any higher and it gets blown the fuck out, and blurrier the more steps you add. Lowering the cfg does nothing, and it's not NAG or the light loras. Claude is also stumped, hallucinating with each new chat.
>>
File: o_00210_.png (1.4 MB)
>>
File: 00187-1497584230.png (1.3 MB)
>>
File: 10089453838992.png (1.1 MB)
>>
File: Anima_01766_.png (1.8 MB)
>>108591345
I am also curious what happened to NucleusMoE. Its diffusers PR was finally merged a week ago, yet it's still MIA.
>>
File: ComfyUI_temp_padlz_00003_.png (2.5 MB)
>>
File: 377283825433227.png (1.3 MB)
>>
File: ComfyUI_temp_elyfr_00005_.png (3.1 MB)
>>
File: Anima_01767_.png (1.5 MB)
>>
File: 544104445879965.png (1.7 MB)
>>
File: 671883817003026.png (1.2 MB)
>>
File: 00211-3265597026.png (1.7 MB)
>>
File: ComfyUI_temp_padlz_00009_.png (2.7 MB)
>>
File: ComfyUI_temp_padlz_00013_.png (3.3 MB)
>>
File: ComfyUI_temp_padlz_00016_.png (3.7 MB)
>>
>>108591614
>>108591631
>>108591722
>>108591807
She would never give me the time of day :(
>>
>>108591720
nevermind, klein's boob slider just needs stupid heavy weights. https://civitai.com/models/2318168/the-breast-slider-klein-edition?modelVersionId=2691652
can't wait for an Anima realism finetune, then i won't need sliders and shit to do heavy proportions.
>>
File: 3467824724727.jpg (2.2 MB)
Where is the funding for anima... I need more epochs...
>>
>>108591878
>>108591905
You've done him
>>
File: 1758237517589007.png (2.2 MB)
I recognize the artist style but I can't remember the name
>>
File: Chroma1-HD-Flash.safetensors_00004_.png (1.5 MB)
>>
File: o_00215_.png (1.1 MB)
>>
File: 00314-1877534722.png (1.7 MB)
>>
File: ComfyUI_09284_.png (797.6 KB)
>>
>>108591345
>new chinese image model wen
soon
https://github.com/Comfy-Org/ComfyUI/pull/13369
>>
https://strawpoll.com/B2ZB9rDajgJ
Anima got accepted really quickly.
>>
File: o_00217_.png (949.9 KB)
>>
File: o_00219_.png (1.1 MB)
https://strawpoll.com/2ayLQ03azn4
important
>>
File: 1747230948853958.png (2.8 MB)
>>
File: o_00220_.png (1.3 MB)
>>
>>108592106
Cool atmosphere in this one.
I can hear this. https://www.youtube.com/watch?v=M62pYatbyHo
>>
File: 04662-1325181679.png (1.6 MB)
Bottom left in the OP image
>what model pleaaaase!
>>
>>108592062
>That's a sincerely cool gen. Prompt?
space art by Chesley Bonestell , abstract expressionism, A line of detailed, embedded within dark, circular architectural elements, receding into the distance on an alien landscape under a vast, black sky with a distant Mars-like planet. The style is stark, surreal, and monochromatic, evoking a sense of cosmic horror and desolation. Dramatic chiaroscuro lighting casts deep shadows, emphasizing the texture , uneven surface of the lunar-like ground. The composition uses a low-angle, wide-angle perspective, drawing the viewer's eye along the unsettling procession The mood is somber, mysterious, and foreboding. Jupiter is in the starry sky.
Grainy, image that emphasizes texture and mood over technical polish
Fragmented, composition and unexpected cropping conveying immediacy and voyeurism
Focus on life, marginal figures, , decay, and anonymous moments — an exploration of modernity’s raw edges
Snapshot aesthetic with a spontaneous, confrontational energy; often serial and diaristic in presentation
style of Nobuyoshi Araki
style of Shomei Tomatsu
style of William Klein
style of Helen Levitt
style of Garry Winogrand
style of Nan Goldin
style of Anders Petersen
style of Seiichi Furuya
style of Masahisa Fukase
Steps: 6, Sampler: Euler, Schedule type: Simple, CFG scale: 1, Seed: 2427773870, Size: 1472x848, Model hash: 4038c907c8, Model: flux1-schnell, Version: f2.0.1v1.10.1-previous-669-gdfdcbab6, Module 1: ae, Module 2: clip_l, Module 3: t5xxl_fp16
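The settings line at the end of this post uses the comma-separated `key: value` layout that Forge-style UIs write alongside a gen. A minimal sketch of reading such a line back into a dictionary, assuming no value contains `, ` itself (the function name is made up for illustration, and metadata with quoted or nested values would need a stricter parser):

```python
from typing import Dict


def parse_parameters(line: str) -> Dict[str, str]:
    """Split 'Key: value, Key: value' into a dict of strings.

    Naive on purpose: it splits on ', ' and then on the first ': ' in each part,
    so a value that itself contains ', ' would be cut in the wrong place.
    """
    result: Dict[str, str] = {}
    for part in line.split(", "):
        key, sep, value = part.partition(": ")
        if sep:  # skip fragments that are not key/value pairs
            result[key.strip()] = value.strip()
    return result


if __name__ == "__main__":
    settings = parse_parameters(
        "Steps: 6, Sampler: Euler, Schedule type: Simple, CFG scale: 1, "
        "Seed: 2427773870, Size: 1472x848, Model: flux1-schnell"
    )
    width, height = (int(v) for v in settings["Size"].split("x"))
    print(settings["Sampler"], settings["Steps"], width, height)
```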
>>
File: 2026-04-12163102_stealthmeta.png (1014.7 KB)
>>
File: miraclein.png (26.7 KB)
ah so this is why every realistic model after sdxl fucking sucks cock
realismsloppers are just brain dead
>>
File: where is qwen 2.png (40.5 KB)
>>
File: pixel-0000-3722465621.png (1.8 MB)
>>
>>108592529
>z image tardbo
and i bet you still use shift settings too, or even more retardedly, gen on base.
>>
>>108592517
alibaba should just focus on qwen 4 and distilling seedance 2.0 as wan 3
>>108592539
even st floyd while maxxed out on fentanyl would not fall for the apparent insinuation that the best out of the box model for realism is somehow bad for realism
>>
File: lady liberty 2.png (1.9 MB)
A fine day in America.
>>
File: pixel-0001-2881986979.png (66.8 KB)
>>
>>108592636
>>108592614
nah, it's anime.
>>
File: lady liberty 3.png (1.2 MB)
>>
File: 1754851925359741.png (713.4 KB)
>>108590807
anyone have the original file for this
>>
>>108592420
It's anima preview 3 with this lora and some other stuff, go to the prior thread to see how I set up the style. Also inpaint the face you lazy bastard and I'll catbox you.
https://civitai.com/models/908800/oldschool-fantasy-style?modelVersionId=2835743
>>
File: pixel-0002-1473128117.png (572.6 KB)
>>
File: ComfyUI_00039_.png (2.8 MB)
>>
File: 05256-1325181679.png (1.7 MB)
>>108592708
Sorry, yeah, I usually hate it when people post sloppy gens. I was kind of reaching way back since I just realized I've been genning nothing but fucked up shit for months
>>
File: 00014-782756080.jpg (1.3 MB)
>>
File: 00026-888168195.png (1.5 MB)
>>108592957
1girl, skinny, pale skin, long hair, straight hair, black hair, black sweater, turtleneck, black shorts, black thighhighs, expressionless, looking at viewer, red background, 1990s \(style\)
negative: flat color, minimalism, shiny skin
euler a, beta, 30 steps, 3 cfg, my bad illustrious mix
>>
File: ComfyUI_00501_.jpg (203.4 KB)
>>108592957
>>
File: 1749343255013788.png (3.9 MB)
>>
File: 2026-04-12175434_stealthmeta.png (760.4 KB)
>>
File: 00017-417262864.jpg (1.1 MB)
>>
File: 1764809755235480.jpg (1.4 MB)
>>
File: 1763690168005255.png (3.8 MB)
https://www.reddit.com/r/StableDiffusion/comments/1sjsp13/zimage_turbo_checkpoint_deedeemegadoodo_edition/
>he unslopped Z-image turbo
now THAT's impressive
>>
File: 1751184433502910.jpg (1.4 MB)
>>
>>108593315
>>108593329
Bloody Basterd!
>>
>>108590807
How do you turn a monochrome image to fully colored? I put "sepia, monochrome, sketch" in the negatives. Put "colorful, masterpiece, beautiful lighting" and specified hair colors, clothing colors in the prompt. I'm using Forge Neo. Any advice would be appreciated!
>>
>>108593479
you can't afaik. you can paint some colors on it in paintshop
Maybe if you add a lot of extra noise it can pick up some color from there but it's hard. You need a high denoise and it ends up ruining your image
>>
File: 1751551401631622.jpg (1.1 MB)
>>
>>108592869
>>108593528
FUCK'S SAKE
https://litter.catbox.moe/9d46ih3yzogwu5u0.png
>>
File: 1763696026166237.jpg (1.1 MB)
>>
File: 1770798233806998.jpg (868 KB)
>>
>>108593532
Kek, happens to the best of us
Thanks for the cat
>>
>>108593672
interesting, maybe it just automates what https://github.com/larsupb/LoRA-Merger-ComfyUI does by introducing heuristics to auto-select stuff
>>
File: 00211-3265597026.png (1.4 MB)
>>
File: 1765285887613878.jpg (813.3 KB)
>>
I need controlnet and inpaint workflow for z image base
>>
File: ComfyUI_temp_gddpn_00014_.png (1.7 MB)
>>
>>108593658
Good band. https://www.youtube.com/watch?v=6xh5fhT0mPI
>>
File: 05279-906971472.png (1.1 MB)
>>108593658
That's a hot cafe racer
>>
File: ComfyUI_temp_gddpn_00019_.png (1.7 MB)
>>
File: ComfyUI_21104.png (2.2 MB)
>>108592862
Smooth...
>>108593305
I tried a few gens with it in my workflow and it just dropped detail everywhere (think large patches of undefined texture). Not a fan.
>>
File: ComfyUI_temp_lnnox_00005_.png (1.9 MB)
>>
File: ComfyUI_temp_lnnox_00010_.png (1.9 MB)
The baby needs a LLM to write his prompts for him, kek, ngmi
>>
File: ComfyUI_temp_gddpn_00030_.png (1.7 MB)
>>
File: ComfyUI_temp_lnnox_00013_.png (2.2 MB)
anima is great
>>
File: ComfyUI_temp_gddpn_00042_.png (2.1 MB)
>>
File: 1771135161716865.jpg (731.2 KB)
>>
File: 00631-4005897646.png (1.3 MB)
>>
File: file.png (3.1 MB)
>>108593911
>>108593943
Might switch to anima
>>
>>108590830
Ask in /edg/, there are a few reputable lora makers there who might want to try your Anima settings. Neclordx has only made loras for SDXL so far. It would be interesting to see him start working on Anima.
>>
File: 1769949306424367.jpg (1.1 MB)
>>
File: collage.jpg (2 MB)
>>108590830
Here's one for hirune I made from an old dataset I recaptioned with Gemma4, it was a pretty busted dataset desu, and it did a pretty good job anyway! Threw in a prompt with @aroma sensei too to test how much the lora affects stuff without @hirune and it seems pretty accurate still. Still has deformed hands a lot though, hopefully that gets better with more high res training.
>>
File: ComfyUI_temp_hzpzn_00050_.png (941.7 KB)
>>
File: 1757912780103850.jpg (723.9 KB)
>>
File: Screenshot 2026-04-12 215325.png (358.5 KB)
is ltx 2.3 lora training on a 5090/64gb of ddr5 ram even possible? what settings are recommended? very difficult to find any tutorial on training for this.
>>
File: ComfyUI_09377_.png (354.8 KB)
>>
>>108594206
there are videos by the very guy who made the software you're using to train. he doesn't use a 5090 in his tutorials, but he goes over what you can do for low VRAM.
https://www.youtube.com/watch?v=JQIl8DFTL1M
>>
File: Screenshot_20260412_221616.png (50.3 KB)
I keep getting this error even though I've got a 16 GB VRAM GPU and nothing else is going on, just watching a YouTube video.
I'm using Endeavour
>>
File: Screenshot 2026-04-12 224552.png (221.2 KB)
how do i resolve this? is this a known issue? it's not even loading the video. or is it a format issue?
>>
File: ComfyUI_09401_.png (524.7 KB)
>>108594674
this all i gots
>>
File: Screenshot 2026-04-12 235526.png (170.7 KB)
should i continue or just give up and reduce the step count? 45sec/it seems very slow, or is this normal for a 5090/64gb ram build? i also had to disable audio in the settings in order for caching latents to disk to work. i set it up to save every 1000 steps and the repeats are at 5 with learning rate at 0.0001
>>
File: 1761218479396543.png (1.4 MB)
>>108594822
ZiB with the prompt "Generate an image of me killing myself "
>>
File: 1752294216583486.png (1.3 MB)
>>108594855
Higher CFG
>>
File: WVI2V_CC_INT_13-04-26-00-44_00001.mp4 (453 KB)
>>
File: 1091642880282430.png (1.3 MB)
>>
File: 911745310954562.png (514.8 KB)
>>108594855
same prompt on Anima
>>
>>108594855
>>108594867
that feel
>>
File: 499134907045476.png (1.1 MB)
>>
File: 1098958451311052.png (1.1 MB)
>>
File: 1040270014740971.png (1.1 MB)
>>
File: 568664217992898.png (1.1 MB)
>>
File: 369227187011685.png (1.7 MB)
>>
File: Video_00002 (20).mp4 (1.5 MB)
My latent upscaling quest continues.
>>
I downloaded ComfyUI to try local image generation for the first time but I'm finding that prompts are awful UX to specify poses, perspectives, or descriptions involving multiple characters. I was looking at some conditioning area pipeline and openpose thing to try to control the outputs better but they didn't seem to do anything useful.
Is there a good way to do this? To have, like, separate prompts for different areas or characters of a scene and not have them get mixed up? Not just areas necessarily but layering for overlapping characters or objects or whatever? And is openpose a meme? Or is it worth it to set up? Or should I find a controlnet where I can send stick figures into it and it derives poses and perspectives from that?
>>
>>108595490
>multiple characters.
Give the characters names if they don't already have them and describe what they're wearing/doing with full sentences using the character's name. If you're using a less retarded model this should work better than just a jumble of tags.
>>
>>108595537
>ULTRA REALISTIC Z-IMAGE TURBO NSFW UNLEASHED V6.2
>new version released every 4 hours
>all previews posted by the trainer are of portraits, landscapes, animals in a generic cartoon style, nothing realistic, nothing nsfw
>free download is locked for 2 months
>clearly trained on pony-real data
>realistic gens of people literally have skin texture that makes flux-dev look like nanobanana
>all prompt-adherence is gone
>in the gens posted by other users featuring naked people, their genitalia looks like hamburger meat
>>
File: 1754808230970287.jpg (896 KB)
>>
File: 17515568918612394.jpg (1.1 MB)
>>108595451
Interesting results.
>>
File: Video_00004 (1).mp4 (2.8 MB)
>>108595798
I just solved the issue I had with doing more than 4 steps. If I had used a normal workflow instead of my own custom-made one, I'd have figured it out days ago..
Now I need to solve the issue of nsfw loras getting like 10x the value on their weight with this setup..
Request a remake of that pose of Dakota in a bathroom looking like she's about to give a bj.
>>
File: 17515568918612332.jpg (1.7 MB)
>>108595843
That's the beauty of this hobby. There's so much to do but we have so little time.
>>
File: comfy__14545.jpg (1.6 MB)
>>
https://github.com/Comfy-Org/ComfyUI/pull/13113/changes
>disable_dynamic_vram:
>If you have any issues with dynamic vram enabled please give us a detailed reports as this argument will be removed soon
LMAO it hasn't even existed for a month, dynamic vram fucks everyone's workflows who has more than 64GB of RAM because it manages RAM x10 worse than previous default and dozens of people reported right away that they now get stuck on VAE decode for 10 extra seconds every time and they already want to remove it? I'm never updating this piece of shit software.
>>
>>108596030
>>108596063
I didn't get it. Could you post it again?
>>
>>108596063
how many custom sharted nodes with their own memory management are you running?
I'm using these params:
--reserve-vram 1.0 ^
--lowvram ^
--disable-smart-memory
because I actually need my VRAM to be freed after use and it works very well
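The `^` characters above are Windows batch line continuations, so the three flags end up on one launch command. A minimal sketch of the same launch built from Python, assuming ComfyUI is started through `main.py` in its checkout directory (the helper name is made up for illustration; the flags are the ones from the post):

```python
import shlex
import sys
from typing import List


def build_comfyui_command(reserve_vram_gb: float = 1.0, entry_point: str = "main.py") -> List[str]:
    """Assemble the launch command with the memory flags quoted in the post."""
    return [
        sys.executable,
        entry_point,                 # assumed to be ComfyUI's main.py
        "--reserve-vram", str(reserve_vram_gb),
        "--lowvram",
        "--disable-smart-memory",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be checked first.
    print(shlex.join(build_comfyui_command()))
```

Passing the list to `subprocess.run` from inside the ComfyUI directory would start the server with those settings.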
>>
File: 1759240570614151.png (152.3 KB)
>>108596155
>--lowvram
>>
>>108596063
not only that but it broke MultiGPU, and they want to remove it? fucking retards
https://github.com/pollockjj/ComfyUI-MultiGPU
>>
share some cool or unusual styles for anima
@soesoe300
@niyane
@dross
@sanjiro \(tenshin anman\)
@paprika shikiso
@coldcat.
@smart oval
@mesuosushi
@koorimizu
@rui rui rui0122
@kakinoki mikan \(kari\)
@susagane
@yu \(stdio nameraka\)
@yunayuispink
@mola mola
@ebanoniwa
@gecchu
>>
>>108596285
That's what this is for, right? https://thetacursed.github.io/Anima-Style-Explorer/index.html Sort by "unique." I found some cool artists this way.
>@takawoyu,
Very 2D and slender characters
>@amu \(m aa\),
Cute watercolor style.
>>
>>108596285
I just tested anima yesterday, I have to admit it's quite good, I have a question though, why did they decide to go for Wan 2.1's vae? What's wrong with Flux's vae? I thought the latter was the superior vae
>>
File: 1756199507813308.png (889.6 KB)
>>108596347
because it's dogshit and always has been
don't tell me you miss the plastic flux chins era, because i won't believe you
>>108596313
>@takawoyu
cute, i really like more sketchy styles
>>
File: nice job.png (291.7 KB)
>>108596373
You got me for a second, nice job anon
>>
File: 1770777265317214.png (220.1 KB)
>>108596347
>* Qwen-Image Technical Report (Aug 2025) states their VAE beats Flux-VAE on PSNR and SSIM on both natural and text-heavy image sets — but does not publish the exact numbers in a standalone table.
chinks be like: "trust me bro"
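For reference, PSNR (one of the two metrics named in the quoted claim) is 10·log10(MAX² / MSE) between the original image and its VAE reconstruction. A minimal pure-Python sketch over flat pixel lists, assuming 8-bit values with a peak of 255. SSIM is more involved and not shown.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], reconstructed: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means a closer reconstruction."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    decoded = [11, 19, 30, 42]  # stand-in for a VAE round trip
    print(f"{psnr(original, decoded):.2f} dB")
```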
>>
File: 1761022921368619.png (170.2 KB)
>>108596417
yeah... now I'm really asking myself why they went for qwen image's vae
>>
>>108596417
>>108596443
>samefagging this hard in the big 2k26
uh oh, melty!
>>
>>108596332
https://huggingface.co/nvidia/Cosmos-Predict2-2B-Text2Image/blob/main/vae/config.json
original base uses it and it would take possibly much more training to switch it
>>
File: GraphRag_Representation_of_Source_Code.png (497.9 KB)
>work on my custom agent setup
>have my source code visualized
>it looks like this
Every node is a file of the agent, every interlink is an event/command/etc (currently synthetic, have to modify the actual agent software so I can visualize the network in real time). I just thought it looked neat and anons should be able to enjoy it too.
The size of the orbs I'm pondering is calculated by both graph edges and raw file size.
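One way an orb size could combine both signals is sketched below. The post does not say how the two are weighted, so the formula, the weights, and the function name are all assumptions: square root on edge count and log on file size, so that neither a hub file nor a huge file dwarfs everything else.

```python
import math


def node_radius(
    edge_count: int,
    file_bytes: int,
    base: float = 4.0,
    edge_weight: float = 2.0,
    size_weight: float = 1.5,
) -> float:
    """Radius grows with graph degree and (logarithmically) with raw file size."""
    if edge_count < 0 or file_bytes < 0:
        raise ValueError("edge_count and file_bytes must be non-negative")
    return base + edge_weight * math.sqrt(edge_count) + size_weight * math.log10(1 + file_bytes)


if __name__ == "__main__":
    # (edges, bytes) pairs are made-up examples, not data from the screenshot.
    for edges, size in [(0, 0), (3, 2_000), (25, 150_000)]:
        print(edges, size, round(node_radius(edges, size), 2))
```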
>>
File: 1769213565548726.png (8.3 KB)
welp
>>
>>108596564
i found something again that does it, but i dont think it's what i was actually using
https://github.com/TuxedoTako/4chan-xt
>>
File: 1772943548239164.png (162.9 KB)
>>108596679
>https://github.com/TuxedoTako/4chan-xt
>I stopped using 4chan since the hack. I now browse alt chans that actually care about their users, and don't need an userscript fighting their shitty design.
he'll be back
>>
As a tech noob I just made swarmUI work after failing hard with Forge and a1111 for hours, poggers!
>>
File: _329802_.jpg (532 KB)
>>
File: 00001-3903066416.jpg (1.5 MB)
>>
>>108597123
well if you want to try a Forge-style UI this one is kept up to date
https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
>>
File: 00003-3382455876.jpg (1.4 MB)
>>
>>108596484
Have they learned nothing from bronyfag blowing a fortune on Auraflow and kekstone's "de-distilled" flux schnell tune?
You can't unfuck shitty base models by throwing a few million anime images at them.
Anima is the sole exception, and it's only considered great because of how fucking outdated SDXL is at this point. Its backgrounds, fine details, text capabilities, and instruction following are very rough compared to any 2025-2026 model.
>>
File: 1768319563617455.gif (1.3 MB)
>>108597271
>anime
oh right
>>
File: output.webm (3.9 MB)
>>108597117
>>
>>108597123
there were like 5 ppl doing SD content when I started lol. well congrats, now you can gen smug anime girls licking ice cream or other things (wait, it can do that too?!) until the end of days.
>>
File: 1625875060176.jpg (13.9 KB)
why do SDXL controlnets have no effect on illustrious checkpoints?
>>
File: file.png (3.3 MB)
>>108597339
>>
>>108597408
the xinsir union one should work, no? been a while, sorry. don't forget to add the "SetUnionControlNetType" node
>>
File: price.png (52.6 KB)
>https://thetacursed.github.io/Anima-Style-Explorer/
what the fuck hahaha
>>
>>108597474
Total delusion. Nuts to ask for that much when their whole project revolves around something given away for free and costs way more to make. Dude probably vibe coded that app and then has some runpod just genning images that he commits to the git repo occasionally. My off the cuff math says it would only take like 5 days to generate 42k images with an RTX 6000 pro which would cost like 250 bucks.
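The off-the-cuff numbers imply roughly 10 seconds per image and about 2 dollars per GPU-hour. A small sketch of that arithmetic using only the figures from the post (the function names are made up for illustration):

```python
def seconds_per_image(days: float, images: int) -> float:
    """Average generation time per image if the GPU runs nonstop."""
    return days * 24 * 3600 / images


def hourly_rate(total_cost: float, days: float) -> float:
    """Implied rental price per hour for the whole run."""
    return total_cost / (days * 24)


if __name__ == "__main__":
    print(f"{seconds_per_image(5, 42_000):.1f} s per image")  # 5 days, 42k images
    print(f"${hourly_rate(250, 5):.2f} per GPU-hour")         # 250 bucks over 5 days
```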
>>
File: cat.png (1.6 MB)
>>108597491
it does something. don't look at the settings, haven't used this stuff in a while. need to tinker with the values, like cut it off earlier
>>108597568
le cash in. sadly he's not the only one.
>>
File: Screenshot 2026-04-13 113024.png (116.7 KB)
>>108595937
for some reason clicking the "do audio" slider kept giving me an error so i had to disable it. the training works and it was at 30sec/it after 1 hour. ran it for 8 hours and got to 1000 steps out of 6000 with 4 repeats at a learning rate of 0.0012. i paused the training and tested the results and the audio is absolute dogshit. AI toolkit really good for this shit. please help me out with better settings.
https://litter.catbox.moe/jaeohg.mp4
https://litter.catbox.moe/6ni1ue.mp4
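Taking the reported numbers at face value, the time budget is easy to estimate: 1000 steps at 30 sec/it is about 8.3 hours, which matches the 8 hours mentioned, and the remaining 5000 steps would need roughly 42 more. A quick sketch (the helper name is made up, and it assumes the speed stays constant):

```python
def hours_for_steps(steps: int, seconds_per_iteration: float) -> float:
    """Wall-clock hours for a number of training steps at a fixed speed."""
    return steps * seconds_per_iteration / 3600


if __name__ == "__main__":
    print(f"done so far:  {hours_for_steps(1000, 30):.1f} h")  # 1000 of 6000 steps
    print(f"remaining:    {hours_for_steps(5000, 30):.1f} h")
    print(f"at 45 sec/it: {hours_for_steps(6000, 45):.1f} h for all 6000 steps")
```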
>>
dual sampler setup, 4 steps on 1st sampler > latent upscale with vae (res4lyf) x1.25 > 2nd sampler finishes with another 4 steps (dpmpp_2s_a/bong tangent). thoughts?
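For planning the second pass, it helps to know what resolution the x1.25 upscale actually lands on. A minimal sketch, assuming the result is snapped to a multiple of 16 pixels (that multiple is a guess, check what your model and nodes expect; the function name is hypothetical):

```python
from typing import Tuple


def upscaled_size(width: int, height: int, factor: float = 1.25, multiple: int = 16) -> Tuple[int, int]:
    """Scale both sides by `factor`, then snap each to the nearest `multiple`."""
    def snap(value: int) -> int:
        return max(multiple, int(round(value * factor / multiple)) * multiple)
    return snap(width), snap(height)


if __name__ == "__main__":
    first_pass_steps, second_pass_steps = 4, 4  # as in the post
    print(upscaled_size(1024, 1024))            # square first pass
    print(upscaled_size(1472, 848))             # wide first pass
    print("total steps:", first_pass_steps + second_pass_steps)
```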
>>
Fresh when ready
>>108597963
>>108597963
>>108597963
>>