Thread #108569503
File: highlights_g_108563476_1775776423_1.jpg (2.4 MB)
Discussion and Development of Local Image and Video Models
Previous: >>108563476
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe
>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
>Qwen
https://huggingface.co/collections/Qwen/qwen-image
>Klein
https://huggingface.co/collections/black-forest-labs/flux2
>LTX-2
https://huggingface.co/Lightricks/LTX-2
>Wan
https://github.com/Wan-Video/Wan2.2
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
312 Replies
>>
File: dwarfman.jpg (205.2 KB)
>>
>>
>>
>>108569547
>>108566619
>>108569190
>>108568213
Don't you feel like a piece of shit posting anime in a general where everybody pretends and nobody cares about it? Don’t you have a bit of remorse for being part of this big farce?
>>
>>
>mfw Resource news
04/09/2026
>MAR-GRPO: Stabilized GRPO for AR-diffusion Hybrid Image Generation
https://github.com/AMAP-ML/mar-grpo
>HybridScorer: Score, sort, and cut large sets down fast with GPU-accelerated AI review
https://github.com/vangel76/HybridScorer
04/08/2026
>OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters for Diffusion Models
https://github.com/ControlGenAI/OrthoFuse
>MIRAGE: Benchmarking and Aligning Multi-Instance Image Editing
https://github.com/ZiqianLiu666/MIRAGE
>Few-Shot Semantic Segmentation Meets SAM3
https://github.com/WongKinYiu/FSS-SAM3
>PoM: A Linear-Time Replacement for Attention with the Polynomial Mixer
https://github.com/davidpicard/pom
>RS Nodes for ComfyUI: Comprehensive custom node pack focused on LTXV audio-video generation, LoRA training and post-processing
https://github.com/richservo/rs-nodes
>FLUX.2 Small Decoder: Distilled VAE decoder for faster decoding and lower VRAM usage
https://huggingface.co/black-forest-labs/FLUX.2-small-decoder
>Nvidia snaps up AI chip packaging capacity as TSMC expands in U.S.
https://www.cnbc.com/2026/04/08/tsmc-nvidia-advanced-packaging-intel.html
04/07/2026
>Anima preview3 released
https://huggingface.co/circlestone-labs/Anima#preview3
>FrameFusion Image Interpolation: Compact image interpolation model for generating in-between frames
https://github.com/BurguerJohn/FrameFusion-Model
>An Inside Look at OpenAI and Anthropic’s Finances Ahead of Their IPOs
https://www.wsj.com/tech/ai/openai-anthropic-ipo-finances-04b3cfb9
>PrismML debuts energy-sipping 1-bit LLM in bid to free AI from the cloud
https://www.theregister.com/2026/04/04/prismml_1bit_llm
>ComfyUI Hires Fix Ultra - All in One
https://github.com/ThetaCursed/ComfyUI-HiresFix-Ultra-AllInOne
>ATSS: Detecting AI-Generated Videos via Anomalous Temporal Self-Similarity
https://github.com/hwang-cs-ime/ATSS
>>
>mfw Research news
04/08/2026
>GenLCA: 3D Diffusion for Full-Body Avatars from In-the-Wild Videos
https://onethousandwu.com/GenLCA-Page
>Grounded Forcing: Bridging Time-Independent Semantics and Proximal Dynamics in Autoregressive Video Synthesis
https://arxiv.org/abs/2604.06939
>Evolution of Video Generative Foundations
https://arxiv.org/abs/2604.06339
>VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis
https://arxiv.org/abs/2604.07210
>Controllable Generative Video Compression
https://arxiv.org/abs/2604.06655
>Not all tokens contribute equally to diffusion learning
https://arxiv.org/abs/2604.07026
>FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching
https://arxiv.org/abs/2604.06757
>Holistic Optimal Label Selection for Robust Prompt Learning under Partial Labels
https://arxiv.org/abs/2604.06614
>Towards Robust Content Watermarking Against Removal and Forgery Attacks
https://arxiv.org/abs/2604.06662
>PhyEdit: Towards Real-World Object Manipulation via Physically-Grounded Image Editing
https://arxiv.org/abs/2604.07230
>Noise Constrained Diffusion (NC-Diffusion) Framework for High Fidelity Image Compression
https://arxiv.org/abs/2604.06568
>RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details
https://limuloo.github.io/RefineAnything
>Visual prompting reimagined: The power of the Activation Prompts
https://arxiv.org/abs/2604.06440
>MoRight: Motion Control Done Right
https://research.nvidia.com/labs/sil/projects/moright
>Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM
https://arxiv.org/abs/2604.06832
>DesigNet: Learning to Draw Vector Graphics as Designers Do
https://arxiv.org/abs/2604.06494
>FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling
https://arxiv.org/abs/2604.06916
>When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't
https://arxiv.org/abs/2604.06422
>>
MYTH: api models are censored
FACT: api models are less censored than local models and are in fact trained on NSFW imagery
MYTH: api models are too expensive
FACT: it's actually quite cheap to use API through ComfyUI API Nodes. the price for api has gone down in comparison to the price of hardware
MYTH: api nodes collect your data and are unsafe to use
FACT: api is safer than local because nothing is stored on your hard drive. with local models, you need to download hundreds of loras and custom nodes, any of which could be infected
MYTH: an api can pull the plug at any time, why use something like that?
FACT: everything you generate can be saved to your desktop so nothing is lost
MYTH: it's impossible to train a custom style or character with api, loras make local way better
FACT: api can learn any style or character with a single image reference, which is much faster and smarter than loras
MYTH: if i buy api credits and don't like the model, that's money wasted
FACT: comfyUI's API nodes credit system allows you to prompt hundreds of cutting-edge api models. the credits share between models so you aren't locked in to any one ecosystem
MYTH: api users are poor and from third world countries
FACT: the top hollywood productions and anime studios all use api models. api is the weapon of choice for everyone world-wide
MYTH: discussion of api models is off-topic
FACT: api models are part of the comfyui experience and are relevant to this thread. combining api models with local workflows is still local
>>
File: 76538754745724.jpg (1.8 MB)
>>
>>108569589
>>108569593
>>108569597
fuck off faggot
>>
>>108569597
MYTH: you are not a cunt
FACT:
>>
>>
>>108569680
you can do AI porn with grok, and the quality is miles ahead what local can do
https://www.reddit.com/r/Grok_Porn/
>>
>>
File: LET HIM COOK.png (433.5 KB)
>>108569715
fair enough, but I don't like where this is going, it's obvious that civitai is trying to separate themselves from NSFW, at some point they'll completely remove the porn loras, the writing is on the wall
https://civitai.com/articles/28369
>>
File: 1757985478383624.png (95 KB)
>>108569762
really nigga?
>>
>>
File: please let that happen.png (76.2 KB)
>>108569773
>If they had anything close to Seedream 2 but uncensored there's no way they would allow NSFW with it.
dude, I don't think the world is ready for the day we'll get a local model as good as Seedance 2.0... it's gonna be great
>>
>>
>>
File: Best we can share is Wan 2.2.png (194.3 KB)
>>108569784
>it's gonna be great
Do you seriously think we're gonna give you something this good gweilo?
>>
>>
>>108569798
it's 2026, have you not noticed the api pattern yet?
>hey here is our new model, look at how great it is!
and then 3 weeks later they cripple it and hope most people won't notice(they won't because most of their user base is brown) and then sit back and count money while people burn credits trying to gen the same slop they genned on day one.
>>
>>108569825
>and then 3 weeks later they cripple it
Seedance didn't wait 3 weeks before crippling it, they crippled it before they deployed their API to the rest of the world lmao, at least Sora had the decency to be cool to play around with at the very beginning, I know the bar is low as fuck but it is what it is
>>
>>108569845
>all these api outputs are crippled
>they're still better than local
grim, even Mike Tyson with one leg could destroy me, so yeah, a crippled API service is still better than what local shit is producing (and I hope local will step up its game one day, and no, finetuning SDXL for the 14th billionth time won't do it)
>>
>>
>>108569723
Why are civittards such entitled little shits? If Visa is cutting off credit card payments, your business is done. You go under, you cease to exist. WTF is Civit supposed to do against that? I'm not defending any of the other bullshit about the platform but for this war with payment processors it seems like they found the least bad option.
>>
Local falling behind means there is no reason to waste money on overpriced hardware during the current shortage. If Nvidia releases the 6000 series you'll have no reason to buy it because even if the compute power per dollar were insane, there are no good models to fully take advantage of it anyway.
API cucking local and withholding even outdated video models like wan 2.5 is saving you money. do you know how much money you'd be wasting on this hobby if local had all the good models to choose from? do you know how much debt you'd go into if you could run SORA locally? these companies are saving you from yourself by not making these models open source, and doing the right thing by destroying them instead. You're welcome.
>>
>>108569914
we're not angry at civitai because they got cucked by visa, we know they can't do anything against (((them))), what we don't like is the gaslighting, they're not honest at all about what's really goin on, people just don't like being lied to, shocker I know
>>
>>
>>
File: hunyuan comfy.png (37 KB)
>>108569916
unironically this. models like wan 2.5, seedance 2, seedream etc don't fit on local hardware, and quantcoping is just sad. anima is 2b parameters yet it's slower than sdxl which is bigger. and these api models are easily 16b+ minimum, with video ones easily reaching 100b.
cumfart cried and threw a tantrum over hunyuan releasing a model too big for localpoors to run, so now all of china realized that local doesn't want these models anyway because they're too big. comfyorg unironically saved local from having to buy H200s
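For a rough sense of the sizes being thrown around here: weight memory is approximately parameter count times bytes per parameter. A minimal sketch, assuming weights only (no activations, text encoder, or VAE) and treating the 2B/16B/100B figures above as the poster's estimates rather than confirmed numbers; `weight_gib` is a hypothetical helper, not part of any tool mentioned in the thread:

```python
def weight_gib(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Parameter counts are the poster's estimates, used only for illustration.
    for params in (2, 16, 100):
        for label, bits in (("bf16", 16), ("8-bit", 8), ("4-bit", 4)):
            print(f"{params:>4}B @ {label:>5}: {weight_gib(params, bits):7.1f} GiB")
```

Under these assumptions, quantization shrinks the weights linearly with bit width, but it does nothing about activation memory, which is a large part of the cost for video models.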
>>
>>
>>
>>108569974
>cumfart cried and threw a tantrum over hunyuan releasing a model too big for localpoors to run, so now all of china realized that local doesn't want these models anyway because they're too big.
based comfy, no one will care if they can't even run it in the first place
>>
File: ComfyUI_20438.png (2.1 MB)
So... are we gonna get some images with these API-glazing shitposts, or is this guy a fucking poor-ass promptlet?
Thrill me with your 10kW gens, you faggot.
>>
>>
>>108570015
>any API chads gen some kino?
they're gonna kill the golden goose by censoring it like that, what's the point of making such an incredible model if you don't allow people to make fun things with it? I will never understand this
>>
>>
https://civitai.com/models/2383017/anima-cat-tower
>massive changes to Anima's default style (albeit, slopped to high hell)
>improvements to anatomy
>improvements to consistency
>same or better character knowledge
I thought Anima was untrainable and forgot all its base knowledge if you so much as sneezed on the weights
>>
File: 1761391215412973.jpg (1.5 MB)
>>108569606
>>
File: 1769226583411058.png (3.6 MB)
>>
File: ComfyUI_Anima_00040_.png (1.2 MB)
Acestep.cpp is insane. It does not consume all my VRAM all at once, only fills it up when I run it, so I can run comfyUI in conjunction with it. Plus, it's ultra fast. Unlike every other iteration of ACEStep UIs, it also allows seamless switching between XL Turbo and XL SFT.
XL Turbo 80s Jap groove gen
https://vocaroo.com/1hpzg5IVZxPe
The prompt is everything, it makes a huge difference in output quality, so like image gen it makes sense to try different ways and styles to prompt same thing, and remove tokens if something sounds off.
>>
>>
>>
>>108570316
>what does a music prompt even look like?
Depends completely on the model and what kind of language it was trained on, just like image models. Udio worked extremely well with rateyourmusic tags because that's what it was trained on (until it started giving you fucking moderation errors every fucking time if you copy pasted the tags from an album you like).
https://ace-step.github.io/ace-step-v1.5.github.io/#XLDemos
Judging by their example prompts it sounds like it was trained on natural language, but I intend to test RYM tags just in case.
>>108570280
>Acestep.cpp is insane.
Works better than ComfyUI?
>>
>>
>>108570289
>but can i run it with 4gigs vrams?
You should be able to, Q4 is below 4GB in size (and as long as the total GB is less than your VRAM it should all fit).
https://www.serveurperso.com/temp/acestep.cpp-win64/models/
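The rule of thumb above (total model size below available VRAM) can be sketched as a quick check. This is a rough illustration, not acestep.cpp's own logic: `fits_in_vram`, the file sizes, and the headroom figure are all hypothetical, and real usage also depends on activations and whatever else is already on the GPU:

```python
def fits_in_vram(file_sizes_gb, vram_gb, headroom_gb=0.5):
    """Return True if the summed model files plus some headroom fit in VRAM."""
    return sum(file_sizes_gb) + headroom_gb <= vram_gb


# Hypothetical sizes for illustration only (not measured from the linked files).
example_files_gb = [2.6, 0.7]
print(fits_in_vram(example_files_gb, vram_gb=4.0))
```

The headroom term is there because a model that exactly matches the card's capacity on disk will usually still fail to load once buffers are allocated.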
>>108570316
There are two separate prompts: a caption and a lyrics portion. The LM turns them into codes that the model understands; the model then outputs the codes for the song, which are translated to MP3/FLAC/WAV. In this case, for the caption I use
>A groovy 80s synth-pop track featuring sultry female vocals, blending English and Japanese lyrics with flirtatious call-and-response delivery. The timbre pulses with a funky slapped bassline, shimmering arpeggiated synths, gated reverb snare drums, and electric piano stabs. The emotion is playful liberation, infectious joy, and cheeky rebellion. Human sounds include syncopated finger snaps, ecstatic "Ha!" shouts from both vocalists, and layered harmonies during the chorus.
I use LLMs to enhance the caption (could be done right thru acestep cpp itself, and it can also be done with Grok/Gemini).
Like with API, you can technically just lazy prompt it straight thru the UI with the built in prompt enhancer, though I like flexibility outside of that.
Lyrics were https://files.catbox.moe/8xof5r.txt
They can be provided in a variety of ways, but I always adhere to ACEStep's instructions for them. Like image gen, there's things that can be modified like BPM, duration, keyscale, which adjust speed and style of the song, as well as CFG which adjusts prompt adherence and creativity between gens.
>>
>>108570356
>Works better than ComfyUI?
I don't think the Comfy ACEStep implementation has ever been without issues, dev on this seems to have completely halted. The .cpp version has more features which I will test soon, two separate cover modes with cover-nofsq apparently being highest quality, and back when I used the first ACEStep 1.5 on Comfy, it was quite slow when a generation had some kind of change to the caption, so I think this is even better.
>>
>>
File: deWA_zi_00025_.png (2.4 MB)
>>
>>
>>
File: 1758263119977770.png (2.4 MB)
https://civitai.com/models/1277670/janku-trained-chenkin-and-noobai-rouwei-illustrious-xl?modelVersionId=2786084
I still think illustrious is best for animu more or less. this one has the regular illustrious style but also the deeper colors of base noobAI.
>>
File: 1775138852648467.png (2.4 MB)
>>108570672
>>
>>108570445
>I don't think the Comfy ACEStep implementation has ever been without issues,
I'm trying it right now. You weren't kidding, this shit is jank. For some reason the "thinking" step is using my CPU instead of GPU so it's slow as FUCK. Thankfully, it seems that you can skip that step. But in order to do so I have to use a different set of nodes. This shit is weird and jank and confusing and now I'm considering trying the .cpp setup like you said.
>>
>>108569916
Local is still thriving, just not on video yet. This is obviously because video is the most prohibitive modality to train the way ClosedAI and Bytedance have, but one can hope some AI lab makes a breakthrough with so many (including BFL) working on the problem.
>>
>>108570672
>>108570678
As someone who uses only base Noob models and their derivatives, I can assure you that many use the Noob name for marketing without understanding what the model does when merging. Also, most 4chan gens aren't from base Noob but from the WAI/Janku branch. Few people know how to prompt these models properly, and the results you're imagining likely aren't from Noob.
>>
File: ComfyUI_Anima_00039_.png (867.3 KB)
Wow, gothic metal music now sounds absolutely insane out of the box
https://vocaroo.com/1Q4Llaeb3gi2
>>108570733
Yep, absolutely is for some reason. The .cpp is counter-intuitively (due to using ggml) the fastest, most lightweight and cleanest version of ACEStep, because no python UI exists for ACEStep that is good. Plus .cpp is compatible with every feature plus more, and a good approach would be to just port the .cpp straight into ComfyUI as a custom node.
>>
File: 1751534000393376.png (2.6 MB)
>>108570817
I had a previous version saved, downloaded the latest version, seems decent.
but I also have base noob 1.0 cause it's good to have something without any merges or whatever.
>>
>>108570847
And this is not surprising, seems like every python UI that's not Comfy for ACEStep is vibecoded, including the actual official UI, and actual devs like lllyasviel or Auto1111 are not available to work on proper UIs. As for Comfy, I'm guessing it's just not too compatible with the current architecture.
>>
>>108570672
>>108570678
>>108570857
I judge a model based on how well it can do toilet sitting+undies down. you'd be surprised at how hard it is for many models to get right.
>>
File: 1752647576987499.png (2.9 MB)
>>108570857
>>108570875
yeah, but in general illustrious/noob based models do great, we've come a long way since pony which needed tons of tinkering to get okay anatomy.
>>
>>
File: deWA_zi_00030_.png (2.3 MB)
forbidden technique
>>
>>108570847
>https://vocaroo.com/1Q4Llaeb3gi2
Sounds like it keeps changing its mind whether a woman or a man is singing at the part where it gets loud, kek. Also, the quiet parts remind me of Let Us Cling Together by Queen.
>>
File: happyhorse08.png (452.2 KB)
>>108570918
trust the plan saaaar
is 48gb seedance level local tomorrow in comfyui will be optimized FAST
>>
>>108570988
Yep, I vividly remember in the testing days someone posting a POV clip of a Kiernan Shipka lookalike sitting on a bed in a towel and then she proceeded to get sharked by the person behind the camera and it showed full genitals and her chasing after the camera angry.
>>
File: 1756369674153562.png (3.6 MB)
https://danbooru.donmai.us/posts?tags=axe_guitar
>23 results
>>
>>108570990
it's a real model, but it's likely not local. but many are convinced it will be and are posting random made up bullshit about it to the point where i can't even tell what is real. apparently it's by an alibaba group not related to wan
>>
File: 612320840_122160219734892895_5837496806532520797_n.jpg (167.6 KB)
is WAN still king for local i2i? any workflows people can link me to? specifically for photo-realistic...
>>
>>
File: video helper suite merge.jpg (139.9 KB)
>>108571179
Try Video Helper Suite nodes.
>>
>>
File: Alibabakekked.png (175.1 KB)
Confirmed Alibaba and API-only. All of China has now moved on to closed-source models.
Surely coming to API nodes though!
>>
File: 1745062499967367.png (3.1 MB)
>>
>>
>>
File: caveira-skeleton.gif (171.8 KB)
This fucking LoRA training better work.
>>
>>
>>
File: pinup-zit-2026-04-10_00100_.png (3.3 MB)
>>
>>
>>
>>108569503
How is it possible to gen an ass like top right >>108567925 on Anima?
Is v3 better at realism?
>>
File: ComfyUI_20658.png (2.1 MB)
>>108571284
I believe in you, Anon!
>>
>>
File: pinup-zit-2026-04-10_00193_.png (3.1 MB)
>>
>>
File: ComfyUI_10557.jpg (3.3 MB)
>>108571601
Hmm... I know Jenny has only gotten cuter and cuter over the years, but she's still an adult woman.
>>
File: ComfyUI_Anima_00045_.png (909.4 KB)
>>108570280
Just tested cover-nofsq on XL Turbo. It's a special feature exclusive to the .cpp that aligns the cover more closely with the reference, bypassing the limitation of the original cover feature.
World Is Mine cover-
https://vocaroo.com/11E5J1UtRHkV
Setting: Cover strength has to be set to around 0.5 otherwise it doesn't work.
Assuming you sync the lyrics well, it does a pretty good job.
>>
>>
File: pinup-zit-2026-04-10_00232_.png (3.3 MB)
>>108571669
didn't she get a lot of surgery?
>>
File: 4b102ea661b7047c2a44473aa183aa5f-252492681.jpg (130.3 KB)
>>108571763
yeah, this was her before
>>
>>
>>
File: pinup-zit-2026-04-10_00262_.png (3.1 MB)
>>108571781
maybe it was worth that shot. sure, it failed, but it was worth it in the end.
>>
>>
File: ComfyUI_20681.png (2.2 MB)
>>108571722
I've never uploaded it anywhere.
>>108571763
No. Simple shots have made her cry before. She'd likely never want to deal with the pain that comes from butchering her face/body.
>>
>>108569589
>>108569593
thanks
>>
>>
>>
>>108572008
Bytedance has a ton more video data to work with from hosting TikTok and Douyin, I don't think Alibaba at their scale can challenge that. On the other hand, Alibaba has a lot more general information, which is why Qwen is way ahead of Bytedance's Seed general LLMs. Bytedance could get there if they had more than TikTok, e.g. an actual social network focused more on text, more like Meta, but that won't happen anytime soon.
>>
File: kek at OpenAI.png (783.6 KB)
>>108572054
>Bytedance has a ton more video data to work with hosting TikTok and Douyin
which is why I'm surprised Google hasn't destroyed the competition on video models, they own fucking youtube ffs lmao
>>
>>108572008
remember the proverb, "it's only local until it's good". wan was cursed by local releases. so alibaba chose to amputate them and start fresh. out by the roots. uncontaminated by the dregs of local, this new team managed to cultivate enough qi to birth a more promising heir
>>
>>108572067
Google can't even keep their compute needs in check for just regular old LLM usage, and they have Genie and Lyria and other models to figure out. They won't scale as high as or higher than what Bytedance has for their video models. That being said, Youtube has allowed Gemini to utterly mog any other LLM in video comprehension. It's not even a contest.
>>
File: 2IEV2GZG2P__X5W_tmb.jpg (108.3 KB)
>>108569503
Hello everyone! Today we're excited to share a major update to our open-source project: ChenkinNoob-XL-V0.5 (ckn V0.5) is officially released.
It's been a while since our last version. This V0.5 update focuses primarily on improving overall aesthetics and enhancing the model's practical usability in real-world production environments, such as game art workflows.
Key Updates in This Release:
1. ~12M Dataset Optimization & Custom Training Architecture To help the model better understand complex clothing, perspectives, and character designs, our team developed an exclusive training script from scratch.
2. Synchronized Release: All-in-One Anime ControlNet (Chenkin-UniControl-XL) To address the style degradation issues often encountered when using ControlNet on anime images, we are also open-sourcing an 8-in-1 Universal ControlNet trained on the V0.5 base model. In addition to standard controls like Lineart, Depth, and Pose, we've introduced an underlying "Fuse" mode. This allows you to input multiple control conditions simultaneously (e.g., Lineart + Depth + Pose) and fuse them directly at the base level in a single pass. This not only reduces interference between control conditions but also significantly lowers VRAM usage. (Note: Using this feature in ComfyUI requires our team's exclusive plugin: ComfyUI-Advanced-ControlNet)
Official Download & Showcase Links:
Hugging Face: Base Model V0.5: https://huggingface.co/ChenkinNoob/ChenkinNoob-XL-V0.5 UniControl-XL: https://huggingface.co/ChenkinNoob/Chenkin-UniControl-XL
Civitai (Showcase & Download): Base Model V0.5: https://civitai.com/models/2167995?modelVersionId=2841101 UniControl-XL: https://civitai.com/models/2527960/chenkin-unicontrol-xl
>>
>>108564801
>We are thrilled to announce that ChenkinNoob-XL-V0.5
>>108572177
>Today we're excited to share a major update to our open-source project: ChenkinNoob-XL-V0.5
what kind of mental illness is this?
>>
>>108570280
>>108570847
>>108570940
>>108571687
sounds like dog shit
please go back to acestep general
>>
File: .png (14.8 KB)
>>108572177
Our guy.
>>
File: noticing.png (139.6 KB)
>>108572303
>AnimAnon
>>
>>
File: 1768307260753132.jpg (688.8 KB)
>>
>>108572580
duuuuuuuuuuuuuude
their node is fucking faulty, there's nothing in the prompts folder even if i put shit in there after a full reboot, the example files do not exist at all. This nodepack is a fucking scam bitchass.
>>
File: _AnimaPreview3_00008_.jpg (522.7 KB)
>>
File: 1751983032581929.jpg (33.9 KB)
>>108571241
alibaba is so stupid, it's crazy. meanwhile, ltx is gradually becoming excellent. ltx will soon be as performant as an api model. of course, only with loras kek
>>
>>
File: ohio not impressed.png (132.9 KB)
>>108572767
>ltx is gradually becoming excellent. ltx will soon be as performant as an api model.
>>
>>108572211
That thread died a while ago. ACEStep just got an update, it's much more capable now.
>>108572231
>Still trolling now that ACEStep is around SOTA level and really caught up, next one will probably really surpass v5/Udio.
This is just the raw model and its tools APIkek, not even a LoRA yet. Just wait until Ostris drops his ai toolkit update, LoRAs will destroy with this.
>>
>>108573048
Oh, it's here kek. Nice!
https://xcancel.com/ostrisai/status/2042348730473726076#m
Time to start baking.
>>
>>
>>108572984
Sounds a bit like Liszt. Also her hand movements aren't matching the music at all, unsurprisingly. https://www.youtube.com/watch?v=KpOtuoHL45Y
>>
>>
File: 1751739146490404.png (88.7 KB)
>>108573048
https://www.youtube.com/watch?v=UAlLD5fS7-c
uh oh, Acesissies, how do we cope?
>>
>>108573092
>>108572984
>Sounds a bit like Liszt
it sounds a bit too modern to be Liszt, it reminds me a bit of a Clair Obscur soundtrack
https://www.youtube.com/watch?v=a6xhWub1MwE
>>
>>108573102
I haven't heard a single song out of Suno that doesn't sound like garbage modern music made in the last ~5 years when music has been shit for going on 30 years. Not defending Ace though I'm really disappointed after using it. I'll try again after people train loras (if anyone even releases them, they probably won't since people are afraid of getting sued by record company jews).
Udio on the other hand was fucking kino before they started giving you moderation errors every other gen, it got pretty unusable for me.
>>
File: comfy__490.jpg (860 KB)
Can Acestep compose believable fugues? Suno couldn't the last time I tried it.
>>
>>108573158
>Udio on the other hand was fucking kino before they started giving you moderation errors every other gen, it got pretty unusable for me.
this, only Udio reached great heights on AI music shit, and then (((Universal Music))) bought them and you can't download the musics anymore lol
>>
File: 4554454512445.jpg (184.4 KB)
>>108573102
Oh nooo how does local cope with this
https://vocaroo.com/1gBbejpLw6s9
>>
File: o_00093_.png (1.3 MB)
>>
File: comfy__487.jpg (1003.6 KB)
>>108573286
I wouldn't call 95% of it porn.
>>
>>108571284
>>108571565
DIDN'T WORK, SYSTEM WAS HANGING
HAD TO RESTART, USED AI TO HELP ME FIX MY CONFIG FILE
NOW IT'S WORKING
FUCK, WASTED AN ENTIRE NIGHT
>>
>>
>>108573158
Udio has been RLHF'd to death, it's the only model that is still ahead of ACEStep in one shot songs that sound catchy (though not in sound quality, it's behind both ACEStep and Suno), but it's also way ahead of Suno as well... A LoRA really does push ACEStep forward towards and ahead of Udio levels though, that's the nature of tuning. As for Suno shills, those guys have no idea what they're talking about, nor what local has to offer them. If you were to shove SD/Flux/Anima/etc... in the face of a normie, there's a good chance they'd tell you MJ is miles ahead, simply because they're also just as ignorant. It is very tech illiterate to say Suno is simply better than ACEStep, especially now after XL. You'd have to nitpick a certain song, then, what aspect of that song are you claiming is better? And have you proved that with all the prompt engineering in the world, you can't do the same with ACEStep? And to take it a step further, have you tried a cover or a LoRA to check if it's possible to get songs in similar quality? ACEStep has progressed to the point you can't just arbitrarily say one model is better than another and not be referring to tiny aspects of both models, which may not be that far apart once the model is properly tested or configured.
>Not defending Ace though I'm really disappointed after using it.
Well, if you've tried XL and compared it to Udio properly you'd know XL is simply harder to prompt and may even need a LoRA on some prompts for enhanced musicality. Whether you get good outputs really depends on prompt engineering and seeds.
>>
File: comfy__491.jpg (695.8 KB)
>>108573417
sorry I'm not touching that garbage
>>
File: 1745294026667817.png (72.6 KB)
>>108573427
>>
>>108573165
Thankfully I foresaw this and was downloading my best gens immediately the second they were done. https://files.catbox.moe/u3rklu.mp3 https://files.catbox.moe/oqbiuy.mp3 https://files.catbox.moe/rylzic.mp3 https://files.catbox.moe/zr77zh.mp3 I also forgot the worst part of udio, the fact that the good model only generates in 35 second increments. So many good songs that are going to remain unfinished forever. Ah well.
>>
File: 734741083513432.png (1.7 MB)
>>
>>
>>108573427
>>108573449
what kind of mental illness is this?
>>
>>
File: 534624820964223.png (1.3 MB)
>>
File: 1761122078628547.png (3.9 MB)
>>
File: 1765321901773100.jpg (674.9 KB)
>>108573576
>12gb
holy vramlet
>>
>>108573576
I've trained SDXL loras on 4GB VRAM. I mostly copied this guy's settings: https://www.reddit.com/r/StableDiffusion/comments/1fj6mj7/community_test_flux1_loradora_training_on_8_gb/
>>
>>108573471
The Silent is a black woman
And so overpowered right now it's kind of ruining my enjoyment of other characters
>>108573588
For some reason Asian beauty standards hard-counter my pedophilia. She looks ugly-cute in the childish way in a nonsexual way to me that I am assuming is the normal way most people look at children too
>>108573629
China is 6 months behind. I picked a Happy horse gem over a Veo 3.1 video when I looked at it in arena.
The fact that China has gotten 90% of the way to Claude with only a tiny fraction of the compute and data is incredibly exciting. I'm still confident in my estimation of sora2 at home by July
>>
>>
>>108573693
Seedance 2.0 is vaporware and Google, Meta and OpenAI all have superior unreleased vaporware too.
Reminder that Google is going to win. They can't not win. They don't even need Nvidia, they make their own TPUs. They are fully vertically integrated at every step of the process like literally no other company is. If they don't win, it just means you can't win and AI will be a commodity
>>
File: 1752286461347948.jpg (222.1 KB)
222.1 KB JPG
>>108573601
you mean eu. lmao
>>
>>
File: upyours.png (343 KB)
343 KB PNG
>>
>>
>>108573679
>For some reason Asian beauty standards hard-counter my pedophilia. She looks ugly-cute in the childish way in a nonsexual way to me that I am assuming is the normal way most people look at children too
Nah you are just correctly IDing them as hags. Makeup fakery only works on the low IQ (women are mostly low IQ so they naturally think they are passing as younger than they really are. High IQ women generally don't wear makeup because they know it's futile.).
>>
File: o_00098_.png (1.1 MB)
1.1 MB PNG
>>
>>
File: night2.png (3.1 MB)
3.1 MB PNG
my opus
>>
File: o_00099_.png (1.2 MB)
1.2 MB PNG
>>
>>108573752
>wishful thinking
It is known that there was a higher strength version of Sora that was too computationally expensive to release. It is also known that meta created state of the art multimodal models that they did not release for safety. My only wishful thinking is the Sora killer by July (and the ability to have the free time to spend a bunch of time using it) because there's no reason for it to exist, I'd just very much like it to since I have too much resentment for current local models
It's annoying to use old models when you can feel the next big thing around the corner. Am I really gonna spend an hour making a few WAN videos for the 10th month in a row when I've already reached the point that I focus on its failures more than its successes? Am I really gonna wrangle SDXL-tier slop to get something I've seen a thousand times before and have an unfulfilling fap? I'm mostly waiting for the audio part of video to be solved since that'll open up a whole new goon dimension that should keep me exponentially more engaged for a very long time.
>>108573786
>Nah you are just correctly IDing them as hags.
I'm not exclusive though, I like adults too but I just like Western style Asians. Something weird and babyish about the huge eyes that pushes it into "ugly infant" and not "sexy kid" for me
>Makeup fakery only works on the low IQ
This is like saying optical illusions only work on low IQ kek.
>>
is there a way to flush the vram in comfyui, like really clear it? was switching between two workflows (t2img > seedvr2 upscaling) and comfy consistently crashes when switching back to the t2img workflow. and yes I hit those two "unload models" & "free model and node cache" buttons. WIN BUTTON
>>
File: o_00100_.png (1.1 MB)
1.1 MB PNG
>>
>>108573903
Covering your skin in white powder doesn't make you look young it makes you look like your skin is covered in white powder. Putting on colored contacts doesn't make you look like an anime character it makes you look like you have colored eye contacts in. Only a retard (or someone who needs glasses) is fooled by such things. It's pretty much the female equivalent of trannies. Putting on a wig doesn't make you a woman.
>>
>>108573920
did nvidia-smi/nvtop show that VRAM isn't being freed by the comfy process, and that the crash isn't from something else? and have you tried combining the two workflows? there are some inline nodes to free VRAM, if you wanted to chain things.
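A rough sketch of that check, for anyone who wants numbers instead of eyeballing it: `nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader,nounits` prints per-process VRAM, and the parser below turns that into rows you can compare before and after hitting unload. The helper names (`parse_compute_apps`, `query_compute_apps`) are made up for illustration, and on Windows the driver may report `[N/A]` for per-process memory, which is mapped to `None` here.

```python
import subprocess

# Real nvidia-smi query flags; fields come back as "pid, process_name, used_memory".
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Parse nvidia-smi CSV rows into dicts (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        if line.count(",") < 2:
            continue  # not a "pid, name, memory" row
        # Split from both ends so a comma inside the process path is tolerated.
        pid, rest = line.split(",", 1)
        name, mem = rest.rsplit(",", 1)
        mem = mem.strip()
        rows.append({
            "pid": int(pid.strip()),
            "name": name.strip(),
            # Memory is in MiB with nounits; "[N/A]" becomes None.
            "used_mib": int(mem) if mem.isdigit() else None,
        })
    return rows


def query_compute_apps():
    """Run nvidia-smi and return parsed rows (needs an NVIDIA driver installed)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_compute_apps(out)
```

Call `query_compute_apps()` once with the models loaded and once after unloading; if the comfy process's `used_mib` barely moves, the memory really isn't being released, and if it drops, the crash is probably coming from somewhere else.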
>>
>>108573437
This is one of those places where ignorance is bliss, and goes back to my earlier point regarding pointless song comparisons. Udio has impressive musicality, but those specific examples you chose can all be done by XL already, which is particularly good at country stuff (can even do the cinematic cowboy stuff that Udio was known for).
Everything I posted here (my first gens, not even nitpicking) are more musically impressive than what you've posted
>>108563679
>So many good songs that are going to remain unfinished forever.
Can be extended with ACEStep plus covered/remastered on top of that.
>>
>>108572177
>same broken SDXL noise schedule making all images average to a medium brightness level
>same shitty VAE
>same CLIP
>in 2026
I downloaded this with cautious optimism, but after a few dozen gens on different prompts, it looks and behaves exactly like Noob Eps 1.1. I don't think anyone could reliably tell the difference in a blind test. What is the point? Almost nothing has changed in a year and a half. Dataset update is good I guess, and controlnet is cool if you use those but I don't.
>>
>>
>>
File: 00008-1488244764.png (3.8 MB)
3.8 MB PNG
>>
>>108573679
>And so overpowered right now it's kind of ruining my enjoyment of other characters
sly was a mistake, like if you dont do discard builds it feels like you're gimping yourself
shiv builds are too slow
poison builds are GIGA slow
>>
>>108573974
>This is one of those places where ignorance is bliss
which is why you're shilling Acestep instead of udio, because you're absolutely ignorant (I know you're the guy who made AceStep though, that's why you're desperate and hella biased in judging your own product good)
>>
>>
1 girl, {azula, elegant, topknot, single hair bun, sidelocks, hair ornament, slender, petite, small breasts, (toned:0.4)|laura kinney, green eyes, long hair, black hair, small breasts, petite, slender, (toned:0.6)|korra, blue eyes, ponytail, brown hair, hair tubes, dark-skinned female, medium breasts, (toned:0.6)}, nude, spares pubic hair, rough sex, straddling, girl on top, upright straddle, looking at another, floating hair, pussy juice, {anus, from behind, rear view, puckered anus, dimples of venus|pov, male pov, pov hands, groping, bouncing breasts, erect nipples|side view, from side, bouncing breasts, erect nipple}, spasm, detailed pussy, BREAK
large male, large penis, veiny penis, thick penis, detailed penis, deep penetration, from below, pussy focus, pussy grip, by maeda hiroyuki, indoors, dark background, ryokan, large male, size difference, mature male, muscular male, candid, vaginal sex, ((rough sex))
>>
>>
>>108573974
i don't think this is the most useful or compelling structure for text2music models. it'd be much more valuable to be able to vllm a score to parts to music21 or mxml notation, then tokenize for a model that could follow those instructions. right now, the natural language llm is a weakness.
open music models are basically at sd1.4-level. basic, frequently coherent, occasionally kino output, but no real control and only superficially decent impressions.
i'm still glad they exist, though.
>>
>>108573955
>Covering your skin in white powder doesn't make you look young it makes you look like your skin is covered in white powder.
It's funny that you're complaining about stupid people when you think that Chinese do this to look young when it's obviously to look more white (no, this isn't a "two things can be true" situation.) and I genuinely don't know how retardedly fucked up of a brain chemistry you need to have in order to come to that conclusion first
>>108574011
It's unfortunate that Latinas have to deal with the burden of black-as-coal eyes with no hope or cope around it. They often have very European physiognomies and even that guy that is building an all-White community in the US said that a fully European Colombian woman would be accepted into the community.
I completely understand why the model thought it needed to put armpit hair in that spot, but it still should've been smart enough to recognize that that's part of the pectorals and not the armpit unless she's a contortionist and breaking her shoulder socket backwards right now
>>108574017
Shiv builds are too slow? First I've heard of that. Shivs let you easily snowball the first act if you get a couple of good cards early on. I agree about poison though, it was one of my favorite archetypes in the first one. I hate Doom, it's just poison that pisses me off because I have to take damage first before the enemy dies and it infuriates my muscle memory
>>
File: ComfyUI_00001_.png (3.1 KB)
3.1 KB PNG
MY COMFYUI OUTPUTS SOLID BLACK BLOCKS AND NOTHING I DO WILL FIX IT
HELP!
>>
>>108573969
yeah I saw the vram getting purged via crystools info thingy which seems to be accurate/line up with what I saw in the task manager. I don't want to combine the workflows, I work modular. last log entries:
[2026-04-10 17:06:50.568] VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
[2026-04-10 17:06:58.487] Requested to load ZImageTEModel_
[2026-04-10 17:06:58.508] loaded completely; 7672.25 MB loaded, full load: True
[2026-04-10 17:06:58.511] CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cuda:0, dtype: torch.float16
and poof
>>108573997
two workflows in tabs inside comfyui, gonna try your solution lol
>>108574103
need more info. model? sage attention? etc. screencap whole workflow
>>
>>108573974
Kek, those were all joke gens if you couldn't tell by the lyrics. The Japanese song is about dicks. Also I fucking hate country (except for Big Iron, might literally be the one exception I've heard in my entire life). All of your gens (other than the Country one) are electronica which I can already tell AceXL is decent at, but in my biased opinion electronica is not an impressive genre for a computer to make, it's already an incredibly artificial genre by its nature. Let's hear some good ORCHESTRA gens.
Here's my favorite udio gen: https://files.catbox.moe/i9xdb7.mp3 Listen to how real the strings sound (1:30 and onwards). Ace can't do that at all, the strings sound worse than fucking MIDI. Really disappointing other than vocals. I will admit Ace is impressive at vocals. Vocals, synth, and drums are the only things it can do convincingly out of the box, I will wait for the community to finetune the model on the zillion other instruments that exist before I waste time on such a limited model.
>>
File: 00004-4078934806.jpg (522.2 KB)
522.2 KB JPG
>>
>>108574057
ACEStep 2.0 will have chord control. Then a musician could just train a model to understand the sheets like you mentioned, then pass into ACEStep. LLMs can already understand everything, they just need training. As is, ACEStep is already plenty useful since there's LoRA training and covers.
>>
>>
File: 00012-3462065422.png (3.8 MB)
3.8 MB PNG
>>
File: o_00104_.png (1.1 MB)
1.1 MB PNG
>>
>>108574115
>I fucking hate country (except for Big Iron, might literally be the one exception I've heard in my entire life)
Then you need to listen to the rest of Marty Robbins' Gunfighter Ballads and Trail Songs which includes Big Iron on it. It's a /mu/ staple and part of music history and one of the best albums ever made. El Paso is another famous hit from that album.
That album is part of the proof that EVERY genre has been co-opted by monied interests. Pop and rock are obvious examples, and I'm a zoomer so I know more about black people culture than I ever wanted to and so I also know that hip hop in the 80s was about building up your community until Big Shekel (and the CIA) got involved and then all the songs in the 90s were about shooting your own people and selling life ruining chemicals to them
>>
>>
>>
>>108574156
>Then you need to listen to the rest of Marty Robbins' Gunfighter Ballads and Trail Songs which includes Big Iron on it.
I've listened to the whole album several times. I just thought it was alright. I didn't hate it. But none of the other songs are even close to as good as Big Iron, man was just divinely inspired when he wrote that or something.
>That album is part of the proof that EVERY genre has been co-opted by monied interests.
Yeah for sure. When I said I HATE country I meant the garbage modern shit, like what that anon posted. It makes me want to actually vomit. Classic country (1970s or older), I don't love it but I'm indifferent to it, if it came on the radio I wouldn't mind.
>>
>>
File: 00015-1568681659.jpg (426.1 KB)
426.1 KB JPG
>>
>>
>>
>>
>>108574217
Use this and replace the checkpoint with anima, change the scaling algos to match and it just werks https://files.catbox.moe/uelcmw.json
Other inpainting nodes I've tried don't work (they give horrible seams) but these do
>>
>>
File: _AnimaPreview3_00010_.jpg (381.8 KB)
381.8 KB JPG
>>
File: o_00107_.png (953.7 KB)
953.7 KB PNG
>>
>>
File: 00017-915814233.jpg (477.8 KB)
477.8 KB JPG
>>108574261
>>108574268
why are you seething over my Armenian waifu gens.
>>
>>108574268
{lisa hamilton, dead or alive, 1girl, dark-skinned female, very dark skin, lips, brown eyes, short hair, brown hair, large breasts, thighs|korra, brown hair, dark-skinned female, blue eyes, ponytail, hair tubes, medium breasts, (toned:0.5)|pharah \(overwatch\), overwatch, 1girl, dark-skinned female, lips, brown eyes, eye of horus, black hair, short hair, long hair, brown hair, side braids, hair tubes, medium breasts, facial tattoo, facial mark, (toned:0.5)}
, erect nipples, seductive, prostitution, nude, lace stockings, , chocker, dark background, night, simple background, indoors, by tony taka, watercolor (medium)
, {seductive smile|torogao|fucked silly|moaning|closed eyes, female orgasm, spasm|forced orgasm}, {on back|on side, leg up|on stomach}, on bed, erotic pose, sexy pose, aroused, blush, implied prostitution , (((elegant))), ((dynamic angle)), ((vaginal sex)), , from above, wet spot, sweat, messy hair, bouncing breasts, pussy grip, pussy juice, (((rough sex))) , {looking at viewer|candid, looking afar, crack of light, caught}, anus, window shadow, moonlight, (dark), BREAK, candid, 1 boy, large male, hairy male, muscular male,, size difference, rance, rance \(series\), saliva on penis, veiny penis, , , {groping|hand on another's neck|torso grab|finger in another's mouth|strangling|grabbing another's hair|slapping breasts|nipple pull|slapping {face|breasts}, pain|hand on another's face|holding phone, recording|grabbing another's face|strangling, asphyxiation, panicking|grabbing another's breast, deep skin}
, male pov, pov,
>>
>>108574300
it's the same fucking girl over and over bro, it's not healthy. one would even call it avatarfagging. why dont you post your 'biuteful ledi' in plebbit since you like it there? I'm sure the other browns will appreciate them more.
>>
>>108574115
Have you tried ACEStep XL? I recommend you follow every tip in the instructions for prompting it, scroll down and especially read the parts around syllable count per line, it makes a huge difference to quality of outputs you get (just having an LLM reformatting your lyrics helps).
>Listen to how real the strings sound (1:30 and onwards).
Honestly, I'm not really hearing what you mean at all. From ACEStep 1.5 to XL it definitely seems to sound much more natural. If I play certain gens on my bookshelf speakers I can't even tell the difference between that and radio. Maybe you mean stereo surround sound, though imo the Udio sound quality is noticeably worse regardless of how natural it may sound, but I will make this point: ACEStep 1.5 to XL was only a 2B parameter difference, but it does not at all feel like a 2x improvement. The improvement is 10x. 2.0 will arrive at 2B more parameters, and that will improve 10x as well. ACEStep is improving exponentially as it scales. So truly, there's not much to worry about in terms of improvements, and there's no doubt it's catching up.
>>
>>
>>
>>108574321
https://ace-step.github.io/ace-step-v1.5.github.io/
Scroll down and listen to
>An instrumental orchestral piece built on a foundation of a powerful, sustained string section and a precise, grand [...]
And tell me that doesn't sound like fucking MIDI. Get your hearing checked.
>>
File: 00019-256877943.png (3.4 MB)
3.4 MB PNG
>>108574311
sorry anon but i find 2D y2k-2007 amateur DeviantArt style gens being posted here to be very boring and repetitive. Everyone has their own taste that fits their niche.
>>
>>
>>
>>
File: _AnimaPreview3_00031_.jpg (369.4 KB)
369.4 KB JPG
>>
>>
File: 66.jpg (319.4 KB)
319.4 KB JPG
>>108574262
anima isnt trained for inpainting, it doesn't try to match, left is just the regular workflow, right is with lanpaint ksampler (painfully slow)
>>
>>108574361
Don't think that means anything. It's just a symptom of the model having seen many MIDI pieces. If you try the prompt out yourself, especially playing with settings, you will notice the quality varies. You can even just prompt it for LIVE concert feel, and it will give you just that. Obviously, there will be massive differences in output between seeds and prompts.
Scroll down to
>A bombastic and epic orchestral choral piece opens with a powerful, ...
And
>Indie folk ballad with astronaut transmission aesthetics, gentle fingerpicked acoustic guitar and soft banjo arpeggios...
Do those feel like MIDI fakes to you as well? I would try to show you more examples of vastly varying outputs in regards to string instruments from my own gens but I'm afk.
>>
>>108574472
Use this: https://github.com/lquesada/ComfyUI-Inpaint-CropAndStitch You can set it so it uses an expanded extra area outside of your mask as "context" so that it will match.
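For anyone wondering what the "context" area amounts to, here is a minimal sketch of the idea: take the mask's bounding box, grow it by some factor, clamp it to the image, and inpaint that crop before stitching it back. This is not the node's actual code; `mask_bbox`, `expand_bbox`, and the `factor` parameter are hypothetical names, and the mask is assumed to be a plain 2D list of 0/1 values.

```python
def mask_bbox(mask):
    """Return (left, top, right, bottom) of nonzero pixels, right/bottom exclusive.

    Returns None when the mask is empty.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, v in enumerate(row) if v]
    return min(xs), ys[0], max(xs) + 1, ys[-1] + 1


def expand_bbox(bbox, factor, width, height):
    """Grow bbox around its center by `factor`, clamped to the image bounds."""
    left, top, right, bottom = bbox
    # Padding added on each side so the box ends up roughly `factor` times larger.
    pad_x = int((right - left) * (factor - 1) / 2)
    pad_y = int((bottom - top) * (factor - 1) / 2)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )
```

A larger factor gives the sampler more of the surrounding picture to match colors and lighting against, at the cost of fewer pixels spent on the masked area itself once the crop is resized for sampling.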
>>
>>108574472
>the regular workflow
As in the one provided in the catbox? I dunno what to tell you anon it worked for me. Are you still prompting the style and medium tags for the inpaint? I noticed if I left those tags out it would produce off colors like in your gen.
>>
File: o_00111_.png (1.2 MB)
1.2 MB PNG
>>
File: gardening.jpg (2.6 MB)
2.6 MB JPG
>>
>>108574492
>Don't think that means anything. It's just a symptom of the model having seen many MIDI pieces.
And... why is that? Almost no one wants to listen to MIDI music. It's almost universally seen as low quality and outdated.
>Do those feel like MIDI fakes to you as well?
The orchestra/choir one absolutely does outside of the vocals, the strings are extremely synthetic sounding. Of course it's buried low in the mix so I guess maybe you could be forgiven for not noticing. You mentioned speakers earlier, do you not listen to music on headphones? You must have a very bad listening setup or be hard of hearing. Or maybe you don't listen to much music.
>>
File: 1089385690716871.png (2 MB)
2 MB PNG
>>108573558
same, but my regent is on a8 so I'll probably get him up to 9 first.
>>108573679
>>108574017
>>108574097
Yeah poison's bad, but shivs are really strong imo.
>>
File: 3800.jpg (375 KB)
375 KB JPG
>>108574525
it was the same prompt/sampler etc used to gen the image, try to do the same with a portrait, and cover it like this
>>108574512
yeah thats what the workflow uses, I set it to use the whole image as context
regular/lanpaint/regular with .8 denoise (better of course because there's less to guess but the seams do not match)
>>
File: _AnimaPreview3_00039_.jpg (394.9 KB)
394.9 KB JPG
>>
File: deWA_zi_00038_.png (2.5 MB)
2.5 MB PNG
>>
>>108574614
Is there a reason you're setting it to 0.8? Seems way too high. The value I use depends on what I'm doing but it's usually anywhere from 0.2 to 0.6 (0.6 is usually pushing it and won't work but sometimes I feel adventurous and try anyway). I think you'd be surprised how much the image can change even on low-ish denoise values.
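To put numbers on why high denoise changes so much, here is a sketch of one common img2img convention (the one diffusers-style pipelines use, as far as I know): the denoise value decides how many of the scheduled steps actually run, starting from a partially noised copy of the input. ComfyUI's samplers may schedule this differently, so treat it as an illustration only; `img2img_step_window` is a made-up name.

```python
def img2img_step_window(num_steps, denoise):
    """Return (start_step, steps_run) for a given denoise strength.

    Assumes the convention steps_run = int(num_steps * denoise), with the
    earliest (noisiest) steps skipped.
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    steps_run = min(int(num_steps * denoise), num_steps)
    start_step = max(num_steps - steps_run, 0)
    return start_step, steps_run
```

Under this convention a denoise of 0.5 on a 20-step schedule skips the first 10 steps and runs the last 10, so the composition is mostly locked in; at 1.0 nothing is skipped and the input image is effectively ignored.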
>>
>>108574472
>>108574614
Are you not using >>108574262 ? Something's not right with your workflow or settings
>>
File: o_00114_.png (1.5 MB)
1.5 MB PNG
>>
>>
>>
File: ComfyUI_temp_jforv_00006_.png (1.3 MB)
1.3 MB PNG
>>
>>108574675
>>108574680
I'm aware it works better with lower denoise, but stress testing shows how robust it is, catbox with .6 denoise: https://files.catbox.moe/ytaosn.png
>>
>>
>>108574559
Any and all kinds of music is high quality data to a music model. The less it knows, the worse the model is. Tuning is what separates bad data from good.
>You mentioned speakers earlier, do you not listen to music on headphones? You must have a very bad listening setup or be hard of hearing. Or maybe you don't listen to much music.
I listen on earbuds, occasionally listen on hifi headphones. I don't work with audio though, so I never try to discern whether I'm listening to a sound made by software or not (as long as it's not obviously synthetic), and I don't think it matters to most people. The telltale signs of slop are gone for me since voices became more natural going into 1.5 (though Udio still has more versatility out of the box, nothing a LoRA can't fix). Proprietary models probably have an extra mastering step at the end of the chain made by engineers which improves their sound quality btw, so I still don't think they have some secret sauce when it comes to this or that ACEStep's raw model is flawed in some way.
>>
File: ComfyUI_temp_jforv_00019_.png (1.6 MB)
1.6 MB PNG
>>
>>
File: deWA_zi_00039_.png (2.2 MB)
2.2 MB PNG
>>
File: ComfyUI_temp_jforv_00024_.png (1.8 MB)
1.8 MB PNG
>>
File: file.png (1.4 MB)
1.4 MB PNG
>>108573647
My problem is that the image lora gens either have closed eyes, show the character from the back, or have heavily distorted eyes
>>
File: ComfyUI_temp_puiqp_00007_.png (2.1 MB)
2.1 MB PNG
>>
>>
File: o_00116_.png (1.1 MB)
1.1 MB PNG
>>
>>
File: ComfyUI_temp_puiqp_00019_.png (1.8 MB)
1.8 MB PNG
>>
>>
File: ComfyUI_temp_puiqp_00020_.png (1.6 MB)
1.6 MB PNG
>>
>>
File: ComfyUI_temp_tgvkg_00025_.png (1.6 MB)
1.6 MB PNG
>>
>>108575326
Mr. Anon posted his working flow in the last thread
>>108565108
>>
>>
>>
File: deWA_zi_00041_.png (2.3 MB)
2.3 MB PNG
>>
>>
Fresh when ready
>>108575392
>>108575392
>>108575392
>>
>>
>>
>>
>>108575358
I actually didn't want prompt-only generation like the example in >>108575353, but an i2i photo edit. Prompting of course can run on any model capable enough to understand the prompt.
>>
File: image - 2026-03-20T150902.450.png (2.1 MB)
2.1 MB PNG
>>108575326
here is Azula from avatar the last airbender made with cyber realism illustrious
>>
File: image - 2026-03-17T185719.921.jpg (393.6 KB)
393.6 KB JPG
X-23 at the pub
>>
>>108575539
Yeah I know. I did like a million gens with IL realistic and anime character loras.
Can't do that with modern models (unless there will be Anima realism models, which's not that simple since all IL realisms are just merges with XL).