Thread #108615635
File: highlights_g_108609718_1776354985_1.jpg (2.6 MB)
2.6 MB JPG
Are You Living In The Same Universe As Me Edition
Discussion and Development of Local Image and Video Models
Previous: >>108609718
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe
>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
>Qwen
https://huggingface.co/collections/Qwen/qwen-image
>Klein
https://huggingface.co/collections/black-forest-labs/flux2
>LTX-2
https://huggingface.co/Lightricks/LTX-2
>Wan
https://github.com/Wan-Video/Wan2.2
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>mfw Resource news
04/16/2026
>Motif-Video 2B: A micro-budget text-to-video diffusion transformer from Motif Technologies
https://motiftech.io/videoshowcase
>HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
https://huggingface.co/tencent/HY-World-2.0
>ErnieTurbo_extracted_lora
https://huggingface.co/GuangyuanSD/ErnieTurbo_extracted_lora/tree/main
04/15/2026
>DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
https://huggingface.co/tencent/DisCa
>Lyra 2.0: Explorable Generative 3D Worlds
https://research.nvidia.com/labs/sil/projects/lyra2
>AniGen: Unified S3 Fields for Animatable 3D Asset Generation
https://github.com/VAST-AI-Research/AniGen
>T2I-BiasBench: A Multi-Metric Framework for Auditing Demographic and Cultural Bias in Text-to-Image Models
https://gyanendrachaubey.github.io/T2I-BiasBench
>Generative Refinement Networks for Visual Synthesis
https://github.com/MGenAI/GRN
>VideoFlexTok: Flexible-Length Coarse-to-Fine Video Tokenization
https://videoflextok.epfl.ch
>DiffusionPrint: Learning Generative Fingerprints for Diffusion-Based Inpainting Localization
https://github.com/mever-team/diffusionprint
>Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models
https://github.com/deep-optimization/CoM-PT
>Self-Adversarial One Step Generation via Condition Shifting
https://github.com/LINs-lab/APEX
>See-through WebUI
https://github.com/BeamManP/see-through-webui
>ERNIE-Image: Repackaged model files for ComfyUI
https://huggingface.co/Comfy-Org/ERNIE-Image
04/14/2026
>Nucleus-Image Released
https://huggingface.co/NucleusAI/Nucleus-Image
>ERNIE-Image: Text-to-image generation model built on a single-stream Diffusion Transformer
https://huggingface.co/baidu/ERNIE-Image
>Danbooru Dataset Filter: High-Speed Metadata Explorer for AI Training
https://github.com/ThetaCursed/Danbooru-Dataset-Filter
>>
>mfw Research news
04/16/2026
>Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation
https://arxiv.org/abs/2604.13956
>DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer
https://arxiv.org/abs/2604.13509
>Enhanced Text-to-Image Generation by Fine-grained Multimodal Reasoning
https://arxiv.org/abs/2604.13491
>MaMe & MaRe: Matrix-Based Token Merging and Restoration for Efficient Visual Perception and Synthesis
https://arxiv.org/abs/2604.13432
>Bias at the End of the Score
https://arxiv.org/abs/2604.13305
>ASTRA: Enhancing Multi-Subject Generation with Retrieval-Augmented Pose Guidance and Disentangled Position Embedding
https://arxiv.org/abs/2604.13938
>What Are We Really Measuring? Rethinking Dataset Bias in Web-Scale Natural Image Collections via Unsupervised Semantic Clustering
https://arxiv.org/abs/2604.13610
>Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking
https://arxiv.org/abs/2604.13776
>Rethinking Image-to-3D Generation with Sparse Queries: Efficiency, Capacity, and Input-View Bias
https://arxiv.org/abs/2604.13905
>DiffMagicFace: Identity Consistent Facial Editing of Real Videos
https://arxiv.org/abs/2604.13841
>Seedance 2.0: Advancing Video Generation for World Complexity
https://arxiv.org/abs/2604.14148
>MOONSHOT: A Framework for Multi-Objective Pruning of Vision and Large Language Models
https://arxiv.org/abs/2604.13287
>VibeFlow: Versatile Video Chroma-Lux Editing through Self-Supervised Learning
https://lyf1212.github.io/VibeFlow-webpage
>ReConText3D: Replay-based Continual Text-to-3D Generation
https://mauk95.github.io/ReConText3D
>Free Lunch for Unified Multimodal Models: Enhancing Generation via Reflective Rectification with Inherent Understanding
https://arxiv.org/abs/2604.13540
>Grid2Matrix: Revealing Digital Agnosia in Vision-Language Models
https://arxiv.org/abs/2604.09687
>>
File: ComfyUI_09687_.png (572.1 KB)
572.1 KB PNG
>>108615765
you're welcome
>>
File: 397256838253824.png (1.9 MB)
1.9 MB PNG
>>108615104
generated from scratch
>>
>>108615361
There's no straightforward process with these models. It all comes down to luck.
You never know how the AI will react to whatever dataset you throw at it, and you always have to sacrifice something to get improvements.
Base Noob has the best aesthetics but also the worst limb deformities, especially legs.
How did they fix it? More neutral/slopped/semi realistic data, which killed the aesthetic.
No model can balance aesthetics and accuracy yet.
>>
File: _AnimaPreview3_00271_.jpg (431 KB)
431 KB JPG
>>108616034
tyty!
>>
File: _AnimaPreview3_00304_.jpg (373.2 KB)
373.2 KB JPG
>>
>>108616170
Hey, you are not me!
>>108616264
Lora?
>>
File: 1747882449329772.jpg (78.9 KB)
78.9 KB JPG
is he lying
>>
File: 00051-2906136885.png (382.8 KB)
382.8 KB PNG
>>108615635
What can I do with a GTX 1050 and 16gb of ram?
Any model recommendation?
>>
>>108616324
no, you are shamelessly shilling.
did ernie labs pay you? it's clear that ernie flopped
>>108616370
NAI
>>
>>108616370
>2gb
Ouch!
Probably some SDXL variant like Noob vpred.
Either run at fp16 and eat the offloading penalty, or run at q8 (int8 if Pascal can accelerate that and if you can figure out how to get it working).
You are SOL for anything newer.
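A rough sketch of the fp16 vs q8 tradeoff above, counting weights only. The 2.6B parameter count for the SDXL UNet is an approximation I'm assuming here, and q8 is treated as a flat 8 bits per weight (real quant formats carry some extra scale data). Activations, the VAE and the text encoders come on top of this.

```python
def weight_gib(param_count, bits_per_weight):
    """Approximate size of the weights alone, in GiB."""
    return param_count * bits_per_weight / 8 / 1024**3

def fits_in_vram(param_count, bits_per_weight, vram_gib):
    """True if the weights alone fit; ignores activations and other models."""
    return weight_gib(param_count, bits_per_weight) <= vram_gib

# Assumption: SDXL UNet is roughly 2.6 billion parameters.
SDXL_UNET_PARAMS = 2.6e9
VRAM_GIB = 2  # the 2gb card quoted above

for label, bits in [("fp16", 16), ("q8/int8", 8)]:
    size = weight_gib(SDXL_UNET_PARAMS, bits)
    verdict = "fits" if fits_in_vram(SDXL_UNET_PARAMS, bits, VRAM_GIB) else "needs offloading"
    print(f"{label}: {size:.1f} GiB of weights, {verdict} on {VRAM_GIB} GiB")
```

Under those assumptions neither precision fits entirely in 2 GiB, so some offloading to system RAM is likely either way; q8 just halves how much has to be shuffled.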
>>
File: 00074-1842675134.png (2.3 MB)
2.3 MB PNG
>>108616370
anon you need 12-16gb vram and at least the healthy minimum 32gb of ram for ai workloads. if you can't afford a decent prebuilt gaming pc from costco or bestbuy then forget it and go the saas route. You're not running any good ai model under 8gb of vram.
>>
File: 00007.png (3.7 MB)
3.7 MB PNG
>>108616425
>>108616454
>>108617006
>>108617015
Well, I appreciate the help
Throwback from 2023, same setup. WebUI doesn't work for me anymore
>>
File: _AnimaPreview3_00357_.jpg (407.6 KB)
407.6 KB JPG
>>
File: _AnimaPreview3_00361_.jpg (391.1 KB)
391.1 KB JPG
>>
File: _AnimaPreview3_00373_.jpg (411.4 KB)
411.4 KB JPG
>>
File: _AnimaPreview3_00405_.jpg (413.7 KB)
413.7 KB JPG
>>
So how many artist styles are you using at a time with anima?
I set up my prompts to randomly pick between 1, 2, and 3, and at 3 it still seems coherent. One style seems to dominate overall, but you can still pick up hints of the others. I also have to try using no artists more often.
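For anyone who wants to try the same thing outside a UI wildcard node, here is a minimal sketch of picking 1 to 3 artist tags at random. The tag names and the helper names are placeholders I made up, not tags any model is known to recognize, and how the poster actually wired it up is not stated.

```python
import random

# Hypothetical artist tags; swap in ones your model actually knows.
ARTIST_POOL = ["artist_a", "artist_b", "artist_c", "artist_d", "artist_e"]

def pick_artist_tags(pool, max_styles=3, rng=random):
    """Return between 1 and max_styles distinct artist tags from pool."""
    count = rng.randint(1, min(max_styles, len(pool)))
    return rng.sample(pool, count)

def build_prompt(base_tags, pool, max_styles=3, rng=random):
    """Put the randomly chosen artist tags in front of the base tags."""
    artists = pick_artist_tags(pool, max_styles, rng)
    return ", ".join(artists + list(base_tags))

if __name__ == "__main__":
    print(build_prompt(["1girl", "solo", "simple background"], ARTIST_POOL))
```

Passing a seeded `random.Random` as `rng` makes a given pick reproducible, which helps when comparing how 1, 2 and 3 styles blend on the same seed.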
>>
does anima include copyright tags? tried one with 800 entries on gel and 300 on dan but it didn't recognize it
>>108617289
https://github.com/67372a/LoRA_Easy_Training_Scripts
>>
>>108617453
>pedo hag dataset
Que?
Also training on clip garbage in 2026 is the biggest problem with his models.
And being obnoxious dipshit in general.
>>108617502
Just one. I am not sure how more artist tags help coherency.
>I also have to try using no artists more often.
The default style is too sloppy and soulless for me.
>>
File: _AnimaPreview3_00450_.jpg (504.9 KB)
504.9 KB JPG
>>
File: 1758659353540414.gif (95.7 KB)
95.7 KB GIF
>>108617651
you should also try creating a lora for ltx2.3. imagine this style with sound
>>
>everyone pretends that a 2b model can learn over a million character images and artist styles, when an LLM with the same parameter count struggles to learn a tenth of that
You need at least a 24b model to achieve what you want.
>>
>>108618018
There haven't been any other major NSFW capable tunes, yes. Shame it's too schizo. And memestone's other vibe training attempts have managed to become far more dysfunctional trainwrecks. At least you get lucky enough with Chroma sometimes.
>>
File: _AnimaPreview3_00517_.jpg (471.1 KB)
471.1 KB JPG
>>108617709
could be fun but probably requires latest hardware
>>
>>108617954
It works best at typical resolutions. Circlestone did release a Lora recently where 1536x1536 works without any major issues, and even 2048 (4 MP) works without falling apart.
That's genning straight-up. You need to upscale to go bigger.
>>
File: _AnimaPreview3_00545_.jpg (517.6 KB)
517.6 KB JPG
>>
File: 990306278.png (2.3 MB)
2.3 MB PNG
>>108616159
>>108616323
What he said, but here it is anyway
>toki \(blue archive\), toki \(bunny\) \(blue archive\), blue archive, 1girl, alternate hairstyle, animal ear hairband, animal ears, ass, back, backless leotard, bare shoulders, blonde hair, blue eyes, blue hairband, blue leotard, blue nails, blue streaks, braid, breasts, bun cover, detached collar, expressionless, fake animal ears, fake tail, from behind, grabbing own ass, hair bun, hairband, half up braid, halo, highleg, highleg leotard, large breasts, leotard, looking at viewer, mechanical halo, median furrow, multicolored hair, nail polish, official alternate costume, playboy bunny, rabbit ear hairband, rabbit ears, rabbit tail, short hair, simple background, single hair bun, sitting, solo, strapless, strapless leotard, streaked hair, tail, white background, wrist cuffs
>>
>>108617954
I found anima to be completely predictable in how generation time scales with image size.
It takes 30 seconds at 1024x1024 and 2 minutes at 2048x2048 on my 3090. So you just take the time a 1024x1024 image takes on your hardware and multiply it by how many times more pixels the target image has.
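The rule of thumb above as a quick calculation. This only encodes the linear-in-pixels scaling reported for a 3090; the function name is made up for illustration, and attention cost can grow faster than linearly at very large sizes, so treat it as a rough estimate.

```python
def estimate_seconds(base_seconds, width, height, base_width=1024, base_height=1024):
    """Scale a measured generation time linearly with pixel count."""
    scale = (width * height) / (base_width * base_height)
    return base_seconds * scale

# Using the 30 s at 1024x1024 figure from the post:
print(estimate_seconds(30, 2048, 2048))  # 120.0, matches the reported 2 minutes
print(estimate_seconds(30, 1536, 1536))  # 67.5
```

Measure your own 1024x1024 time first and pass it as `base_seconds`; the 30 second baseline is specific to that poster's card and settings.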
>>
File: 336607023179300.png (2.4 MB)
2.4 MB PNG
>>
>>108618795
>>108618615
>>108618702
So Anima then ZiT for hires fixes won over Chroma?
Realistic models were saved by weebs?
Why is Lodestone not fail tuning Anima yet?
>>
File: 1775261507046615.png (536 KB)
536 KB PNG
can't believe the last hope for local video is ltx...
>>
File: 00119-858822408.png (1.3 MB)
1.3 MB PNG
>>
File: so grim.png (62.6 KB)
62.6 KB PNG
>>108619102
>literal jews are my our best hope
>>
>>108619186
Why are you going q4 with 32b? You can easily do Q6 with 5090.
Anyway 3.6 will probably mog both even as MOE.
Probably get 3.5 27b q6 hauhaucs if you need NSFW (although it will try its best, it knows little about NSFW subjects because of gaps in its training data).
>>
File: 00154-2878864308.png (1.1 MB)
1.1 MB PNG
>>
File: 227338914584475.png (2.2 MB)
2.2 MB PNG
>>
File: 1772802281817339.jpg (110.3 KB)
110.3 KB JPG
>>108619252
and not even the best jews, like with sora, midjourney. we've got the team of talentless jews. what luck...
>>
File: 617571134723071.png (2.3 MB)
2.3 MB PNG
>>108619429
So much money, so much sense
>>
File: 00190-781702195.png (1.5 MB)
1.5 MB PNG
>>
>>108619804
I understand /hgg/ fags and Oekaki shizo because Anima it's better at handling multiple characters and intricate poses, as well as abstract kino minimalist concepts with multiple characters in the case of Oekaki. But anyone else praising Anima is a poser. For example, >>108619433, >>108619455, and >>108618795 can be done with SDXL and ZiT hires fix.
>>
File: 1746952463420779.jpg (681.3 KB)
681.3 KB JPG
>>
File: ComfyUI_temp_zhqgr_00003_4.png (2.2 MB)
2.2 MB PNG
>>
File: 104212991805621.png (2 MB)
2 MB PNG
>>108619872
wicked
>>
File: deEF_zi_00031_.png (2.1 MB)
2.1 MB PNG
>>108619433
I'm surprised it got her weird mid-spine tail correct. this is i2i maybe?
>>
>>108618702
I don't even care about nudity particularly. I just want a model that can make nice pics of cute chicks with some cleavage, the occasional bikini pic or some lingerie. Rarely nudity, it's not really essential. I find nothing is as good as Chroma. I've tried all the other FOTMs and I wasn't blown away.
>>
File: happyhorse.png (25.7 KB)
25.7 KB PNG
Is it coming to API nodes anytime soon?
>>
File: ComfyUI_temp_targv_00101_.png (2.2 MB)
2.2 MB PNG
I havn't used SDXL in such a long time by now lol, and some faggots are still hanging on that deprecated model lol
Imagine using clip in 2026
>>
File: output.webm (3.9 MB)
3.9 MB WEBM
Some good coomer gens last thread.
>>
File: ComfyUI_19895.png (2.3 MB)
2.3 MB PNG
>>108619125
It fucks up handling memory less often on VAE loading now, but still fucks up... BUT! I now get a lot of "Windows fatal exception: access violation" when refreshing the page Comfy loads, which needs to be done because the RTX node doesn't load properly without a restart. So I have long stretches where I'm just trying to get it to work.
I really don't think they (or more likely, Claude!) know what they're vibing out over there memory-wise.
>>
File: download - 2026-04-16T003237.074.jpg (96.4 KB)
96.4 KB JPG
>>108620144
>>
File: ComfyUI_temp_targv_00102_.png (2.6 MB)
2.6 MB PNG
>>108620640
Is that Joy-Edit? lol
>>
File: ComfyUI_00043_.png (1.6 MB)
1.6 MB PNG
>>
File: download - 2026-04-15T224244.224.jpg (90.6 KB)
90.6 KB JPG
>>108620677
>>
>>108620710
>>108620677
>>108620640
Leaving behind the obvious hand problems, these realistic SDXL gens are much less sloppy than ZiT.
How about Anima -> SDXL -> ZiT to fix hands and add detail?
>>
File: ComfyUI_temp_eijtg_00019_.png (1.6 MB)
1.6 MB PNG
>>108620811
except that my gens are klein you idiot
>>
File: 1747918392477447.jpg (634.1 KB)
634.1 KB JPG
SEX
>>
File: Screenshot_20260417_125853_Reddit.jpg (546.1 KB)
546.1 KB JPG
The absolute state of reddit.
>>
Is IP-Adapter something worth using these days? I'm using Illustrious and I want to spice up images with certain moods when I can't find LoRAs for them.
Does it work like regular t2i and just take a % of the reference image?
>>