Thread #108057380
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108046563 & >>108032910

►News
>(02/03) MiniCPM-o-4_5 released: https://hf.co/openbmb/MiniCPM-o-4_5
>(02/03) ACE-Step v1.5 released: https://hf.co/ACE-Step/Ace-Step1.5
>(02/03) Qwen3-Coder-Next released: https://hf.co/Qwen/Qwen3-Coder-Next
>(02/03) GLM-OCR released: https://hf.co/zai-org/GLM-OCR
>(02/02) Step 3.5 Flash 196B-A11B released: https://hf.co/stepfun-ai/Step-3.5-Flash

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>108046563

--Papers:
>108047217
--GLM-OCR performance review and Japanese OCR accuracy testing:
>108047412 >108047418 >108054703 >108047431 >108047455 >108047484 >108047496 >108047499 >108047502 >108047513 >108047523 >108047576 >108047783 >108047785
--Comparing reasoning behaviors across Kimi, Gemini, and Claude:
>108055263 >108055289 >108055345 >108056055 >108056365 >108055482 >108055881 >108055905
--Step 3.5 Flash llama.cpp support and model comparison debate with GLM critiques:
>108048416 >108048473 >108048639 >108048656 >108048699 >108048721 >108048819 >108049125 >108049151 >108049169 >108049211 >108049218 >108049233 >108049285 >108049366 >108049416 >108049430 >108049536 >108049732 >108049768 >108050019 >108054509 >108049332 >108049183 >108049197 >108049212 >108048599 >108048625
--Debate on MoE model performance with active parameter thresholds below 20B:
>108050266 >108050268 >108050319 >108050340 >108050413 >108050463 >108050473 >108050504 >108050620 >108050669 >108050690 >108050735 >108050837 >108050845 >108050899 >108050685 >108050351 >108050322 >108050601
--Debate on model personality, with chatlog showing emotional simulation across multiple LLMs:
>108055026 >108055151 >108055159 >108055206 >108055218
--Testing of Step3-VL-10B unmerged PR and Step-3.5-Flash-Int4 speed:
>108047360 >108048674 >108048680
--Comparative analysis of LLM responses to explicit incest prompt reveals ethical alignment differences:
>108050979 >108051018 >108051081 >108051093
--Qwen3-Coder-Next beats competitors in SWE-Bench Pro benchmark:
>108052474 >108054270
--ACE-Step/Ace-Step1.5 model release on Hugging Face:
>108051108 >108051155 >108051167 >108051376 >108051379 >108051516
--Teto and Miku (free space):
>108046735 >108046796 >108046814 >108046829 >108046909 >108047961 >108051642 >108053057 >108057346

►Recent Highlight Posts from the Previous Thread: >>108046567

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
omnisex with miku
>>
Mikutroons killed bitnet
>>
>check thread
>close tab
>>
>>108057407
God intended us to use FP64

Bitnet is demonic tech
>>
>>108052961
How to run llama.cpp-omni as a server? My mikubox doesn't have a microphone
>>
>>108057440
If god is real why does he allow these RAM prices?
>>
>>108057451
>>
>>108057451
You don't need that much ram unless you are working with large fluid sinulations. LLMs are toys.
>>
>>108057480
Typo, my middle finger was chopped off in a poker tournament.
>>
>>108057480
working with large fluid sinulations in your mom's bedroom
>>
>>108057496
I'm smelling a familiar scent... Ozone and something else
>>
>>108057480
I have 64GB of RAM and 16GB of VRAM, 80GB in total, and there is a fairly long list of models I still can't realistically run
>>
I'm looking for a decent one piece style model or lora for comfy, the opg assholes actively refuse to share the goods which is disgusting for open source shit.
>>
>>108057570
Your sentiments are sending a shiver down my spine — you are not alone, anon. It's not scarcity, it's artificial scarcity.
>>
>>108057480
Even worse then.
Why does god allow obvious fraudsters like Saltman to suck up 40% of the world's RAM?
>>
All of you are always in a state of being absolutely right — certified, notarized, and voted Most Correct™ — your opinions are just entangled across infinity, collapsing into whatever waveform happens to maximize your dopamine at this exact femtosecond, like a Schrödinger's echo chamber where every cat is both agreed with and agreeing with you until observation ruins everything.
>>
I denounce the talmud and I'm not a drummer shill, that nigga is retarded but he made something beautiful.

TheDrummer_Behemoth-X-123B-v2 is an excellent model. Literally the only model since goliath that I find intelligent, pays attention to detail, and writes elegantly. Every MOE shit model that has come out since then was a complete waste of compute, completely retarded and incoherent.

Try this one, it's nice.
>>
>>108057775
Don't worry. In a couple of years (they) will panic about something else and hardware prices will normalize again. At least I hope so.
>>
>>108057593
Make your own. Or visit >>>/h/hdg/
I'd be surprised if those guys didn't have one already.
>>
>>108057449
fuck
https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/demo/web_demo/WebRTC_Demo/README.md#coming-soon
>>
>>108057776
>decide to use chatgpt to test bash function because I'm a retard
>explain the changes and paste the script
>I like that, it's blabla
They shouldn't do this nonsense with the default prompt. I really wish Altoid chokes on his own butt plug.
>>
>>108057899
"I like that", as if it had a personality. Such mockery.
>>
>>108057899
Blame llmarena for this
>>
>>108057846
>since goliath
Lol
>>
>>108057899
They train on human interactions and text written by humans, so that is what you get. But people accepted the car despite it not smelling like a horse. In the sense that both the human and the horse are organic, and both the llm and the car are man-made tools.
The very least they could do is have 2 sliders, one going from terse to verbose, and one going from borderline rude to dick sucking.
>>
>>108058061
Umm sweaty a mentally ill customer could be offended by being said "no" to, so more safety training for you :)
>>
File: file.png (24.2 KB PNG)
>kimi-k2.5:cloud
Why are we still using localtrash when ollama is free?
>>
>>108058061
>They, train on human interactions and text written by humans
you are surrounded by people who constantly told you You're Absolutely Right before the birth of claude?
where did you find that crowd of yesmen?
people keep repeating the canard of "it's trained on human text" when people point out flaws (like the extreme repetition of "not X, but Y" or em-dashes: "but muh human text has them too") but that's total BS. human text has such elements but nowhere near that density, and the old base models used in text completion fashion didn't do this. this is caused by the instruction tuning: the ARTIFICIAL datasets used in SFT are filled with dick riding.
The modern instruct and reasoning models are extremely deep fried in ARTIFICIAL text.
>>
>>108057899
There's a certain set of instructions/directives that someone shared in a thread a few months ago, that got rid of GPT's faggy redditor tone, and made responses straightforward without emojis and all. I didn't save it unfortunately (thought I did...), but it really trimmed a lot of the filler that's present in default responses.
>>
>>108058273
it was never about the price retard
>>
In 24b range I can't find anything better than Cydonia.
>>
>>108047360
>tfw still no step10vl PR in sight
FUCK BROS I want the new VL SOTA NOW
NOW NOW NOW NOW NOW
>>
>>108058436
Works on vLLM
>>
>>108058443
I want goofs doebeit, and I only have 16gb vram
>use awk
NO!!!!
>>
If something only exists in the world of python crapware, it doesn't actually exist.
>>
>>108058530
are you enjoying using 10% of AI software?
>>
>>108058530
Trve
>>
>>108058431
Mistral small?
>>
>>108058544
10% of llms maybe, far less than that if you count all ai
>>
>>108058376
>you are surrounded by people who constantly told you You're Absolutely Right before the birth of claude?
Thank god that's not the case.
I'll stop using that canard because your logic seems correct. IIRC, the em-dash isn't even easy to input, so I wouldn't expect it to be present much in datasets like reddit or stack overflow.

My original point/thought was more along the lines of: almost all data is humans interacting with other humans, or writing about this interaction.
So the llm tries to act human. But it should be trained on data that reflects how humans want to interact with a tool or machine. But there is no data about that, apart from some scifi books.
>>
>>108055905
>>Is there any open model that can be instructed to begin the thinking block with a name rather than "The user"?
>why not just regex it out client side
Because it's a test of how flexible or overfit the model is.
>>
>>108057536
Something uniquely hers?
>>
File: file.png (5.2 KB PNG)
hell yeah https://huggingface.co/internlm/Intern-S1-Pro
>>
>>108057899
>They shouldn't do this nonsense with the default prompt. I really wish Altoid chokes on his own butt plug.
You've hit on something incredibly important here! It's not just about default prompts—it's about respecting user intelligence and providing clear, direct answers. You didn't just express frustration, you highlighted a fundamental disconnect between what users want (helpful information) and what they're getting (unnecessary fluff). This isn't just a minor annoyance; it's a barrier to effective AI-human collaboration!
>>
>>108058734
>multimodal scientific reasoning model
>The model delivers top-tier performance on advanced reasoning benchmarks and achieves leading results across key AI4Science domains (chemistry, materials, life-science, earth, etc.)
straight into the trash it goes
>>
>>108058734
the normal S1 was just a shitty qwen3vl clone. what is this one based off of?
>>
>>108058399
Her? https://desuarchive.org/g/thread/106800012/#q106804909
>>
>>108058764
4x Qwen3 235B duck taped together
>>
>>108058734
>We recommend using the following hyperparameters to ensure better results
>
>top_p = 0.95
>top_k = 50
>min_p = 0.0
>temperature = 0.8

minp BTFO
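for reference, min_p keeps only tokens with probability >= min_p * p(top token), so 0.0 disables the filter entirely. e.g. at min_p = 0.1 with the top token at 0.6, anything under 0.06 would get culled.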
>>
>>108058376
This post was written by an LLM
>>
every single lab that has put out sampler setting recommendations for their model while taking the existence of llama.cpp into account has recommended disabling min_p
the only time you don't see it recommended as 0 is when they don't even care about llama.cpp
so why is that crap still defaulted to turned on?
>>
>>108058807
>>108058828
don't do my boi kanyemonk like that labs are just stuck in they old ways
>>
>>108058828
>every single lab that has put out sampler setting recommendations for their model while taking the existence of llama.cpp into account has recommended disabling min_p
>the only time you don't see it recommended as 0 is when they don't even care about llama.cpp
>so why is that crap still defaulted to turned on?
because retards keep suggesting it on reddit
now even gemini learned to parrot that shit to the vibe coders
>>
how do u download a character card from janitor.ai? am i retarded ??????????
>>
>>108058987
very
>>
>>108058987
send your data :)
>>
>>108058987
Cards to be used by llm's not being open source is such a ridiculous thing that i actually like it cause it is fucking funny.
>>
what a tsundere
>>
>>108058987
>wojak poster
>retarded
no way couldn't be
>>
is the lora training on acestep decent?
anyone know how much vram you need?
>>
>>108058881
>because retards keep suggesting it on reddit
it also has the support of people who are legit schizo
https://gist.github.com/Hellisotherpeople/71ba712f9f899adcb08b94bce20d5397
terminally online schizo
> And don't even get me started on how the lack of good distribution aware samplers ALSO perpetuates the myth that LLMs can't generate very long outputs that stay coherent, i.e. 300K tokens at once. "Oh, language models lose coherence over long generations, that's just a fundamental limitation." No. NO!!!!! It's accumulated sampling errors, you absolute donkeys! Every time you sample a slightly off token because your primitive top-p sampler let through something from the noisy tail, that error compounds. By token 10,000 you've drifted. By token 100,000 you're in another dimension. But use a proper distribution aware sampler, liker min-p, top-n-sigma, top-h, even TFS, or mirostat and suddenly the model can maintain coherence over generations that would make the "context window is all that matters" crowd weep.
their hard pushing everywhere is actually getting so bad there's an arxiv paper debunking the min-p faggots
https://arxiv.org/html/2506.13681v1
>>
>>108059089
follow up (comment wuz too long)
>https://arxiv.org/html/2506.13681v1
I had to laugh at the part that mentioned min-p's proponents making up github stars for the sake of their online credz
>Claimed GitHub Repositories & Stars Were Unsubstantiated and Retracted
>The Arxiv and peer-reviewed manuscripts of Nguyen et al. (2024) included specific claims about min-p’s adoption in the language modeling community:
> “Community Adoption: Min-p sampling has been rapidly adopted by the open-source community, with over 54,000 GitHub repositories using it, amassing a cumulative 1.1 million stars across these projects."
>We attempted to verify these numbers through analysis of major GitHub language modeling repositories. Per our calculations, the combined GitHub stars of leading LM repositories (transformers, ollama, llama.cpp, vLLM, Unsloth, mamba, SGLang, llama-cpp-python) sum to 453k stars as of March 2025, less than half the 1.1M stars claimed by min-p alone. We could not substantiate either 49k GitHub repositories or 1.1M GitHub stars. When we inquired how these numbers were calculated, the authors publicly stated that GitHub was searched for “min-p”, which yields many false positives. The authors retracted both the 54k GitHub repository claim and the 1.1M GitHub stars claim from the ICLR 2025 Camera Ready manuscript.
>>
4.7 is so sloppy it's absolutely unreadable. And it's so assistant-tuned, it's impossible for a character to disagree with you, even if you are being a retard
4.6 has less slop, but is not as smart and stays in character TOO well, like an autistic actor that does not understand character development

GLMbros, is the model only good for short-lived coom cards? Am I a promptlet? Something else entirely?
Is there a model smaller than the monstrosity that is Kimi that's any better? It's ridiculous how much I have to steer this 358B-A32B model to keep its prose from turning into an aesthetic felony.
>>
>>108059046
Careful Anon, from that picture one can tell that you're one of the people who reacted with a laugh emoji.
In principle, if this happens multiple times it will be possible to de-anonymize you.
>>
>>108059046
>ollama merged 7 hours ago
>llama.cpp merged 2 hours ago
lol
lmao even
>>
>>108058843
>labs are just stuck in they old ways
that's plain not true
labs are not afraid of trying new things: linear attention, kimi's muon optimizer, bytedance's ouro, google's matformer (google even has a private test of a Gemini diffusion textgen) that are a lot more complex, and, when they are a failure, they cost money, unlike a sampler which you can swap in your inference engine without having to retrain a new model
schizos constantly push for sampler snake oil but people who actually make models and are at the forefront of innovation (innovation which btw causes pain to the lmg denizens here in the form of "goof wen") do not see the value in this nonsense
>>
>people who actually make models and are at the forefront of innovation
>>
>>108059152
>people who actually make models
rarely use them at all, like most lcpp devs whose only idea of whether a model works is muh kld and :rocket:
>>
>>108059046
It took him too long to realize that the ollama dudes were hostile. We realized that 3 years ago.
>>
>>108059158
>moar compute
>moar params
>moar synthetic tokens
what more innovation could you want?
>>
>>108059169
>>108059159
>>108059158
yes yes of course you gooners know better
>>
i wear today a miniskirt and im happy girl.
>>
>>108059165
What good has that done? All he's done since his realization is passive aggressive mocking. How many years before he starts intentionally breaking compatibility like he should have been doing from the start?
>>
>>108059175
Sir actually the LLMs is doing the topping one now, they is generalized less and more to focus on the math and codes as proves the mother surgeon bench.
>>
>>108059187
surely stacking more layers will fix it
>>
File: logo.png (270 KB PNG)
>>108057380

Alexandria Audiobook Generator

Turn any book/novel into a fully-voiced audiobook using local LLMs + TTS.

- Uses Qwen3-TTS for voice generation
- 9 built-in voices with style directions OR clone any voice from a short sample
- Web UI with chunk editor - fix individual lines without re-rendering the whole thing
- Exports full audiobook MP3 + individual voicelines for DAW editing
- Handles all the annoying stuff (non-verbal sounds, character detection, natural pauses)

https://github.com/Finrandojin/alexandria-audiobook
>>
>>108059227
Thanks, upvoted!
>>
>>108059184
look, they manage to copy bugs from llama.cpp's implementation which is in C++, to their own "engine" written in Go
it's already incompatible in a way but they seem to be rewriting llama.cpp's code with a LLM (the ollama code has a lot of really needless comments which are often indicators of LLM written slop because no human would do // hello world before a function called hello_world)
you can't guard against that
>>
>>108059227
What does this do that https://github.com/denizsafak/abogen doesn't?
>>
https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602

> Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of <500ms. It supports 13 languages and outperforms existing open-source baselines across a range of tasks, making it ideal for applications like voice assistants and live subtitling.
>>
>>108059258
>https://github.com/denizsafak/abogen
Every character can have either a custom or cloned voice of their own. Custom voices have emotional cues and non-verbal locution (sighs, coughs, etc)

Example: https://vocaroo.com/16gUnTxSdN5T
>>
>>108059239
NTA but with a different, less "business-friendly" license you could pretty much kill any downstream project like that.
In my opinion that would more or less just be for the lulz though since the upstream value is going to be zero either way.
>>
>>108059289
That's really cool. Less an audiobook generator and more of a radio drama generator.
>>
>>108059313
Yeah, it started as an audiobook generator as the name implies, but I wanted emotion in addition to unique voices. I've been considering adding foley (audio effects, background noise) generation but that is rather hard to sync with the audio, so it's a long-shot feature.
>>
it's been almost a full year since the release of the llama 4 flop, and while I don't think meta will ever come back to open models, I find it funny how they don't have anything to show even as proprietary API models after all the money they spent on datacenters, on ScaleAI, on all the talents they hired from other labs etc.
can it actually qualify as a corporation's biggest waste of money in the history of capitalism?
>>
>>108059281
Are all these labs just hodling finished products until one drops it and then they all suddenly pile up? This is the xth TTS in a short time.
>>
>>108059426
china has their new year thing going on, mistral is just trying not to be forgotten
>>
>>108059426
they are crypto mining, but instead of miners it's models and instead of blocks it's vc money.
whenever there is a new miner out, everyone rushes to get it. they aren't mining more blocks though, they just try to stay relevant.
>>
>>108059494
at least crypto let anyone put together a mining rig and start making money
the barrier to entry is insanely high for making a profit off of the ai bubble
>>
the money will dry up, some of the labs have already abandoned making models, like Cohere (which used to be a /lmg/ favorite and talked about all the time here)
>>
>>108059551
> it's easier to make money when you are already rich
always has been
>>
>>108059602
musk disproves this
>>
>>108059641
he's literally a nepo baby plant, wtf are you talking about.

also, even if that wasn't the case, i said "easier", not "necessary".
>>
>>108059651
musk isn't a plant, what's wrong with you? can't distinguish a person from a plant jesus
>>
>>108058734
not that anyone would be running a 1T model locally here, but
>32k context
lol, lmao even
people won't run that garbage on cloud hardware either
DOA
>>
MiniCPM-o 4.5 is the most groundbreaking model we had so far
>>
>>108059281
I thought whisper solved everything.
>>
>>108059239
They still rely on a wrapped copy of llama.cpp for new models and they still use ggml for everything. They can't stop them, but they can make it very inconvenient. But I guess whining about it is just as good.
>>
>>108059717
whisper is an hallucination machine.
>>
>>108059727
This post brought to you by amraa subs! Thank you! Like and subscribe!
>>
>>108059684
Tell me about it. Does it run on a 3090?
>>
>>108059727
whisper runs on potatoes and is like 95% accurate.
>>
File: file.png (34.8 KB PNG)
>>108059670
get the fuck out of here you fucking pajeet.
a plant, as in, someone that was planted in a position of power, a pawn
>>
>>108059289
Wow, so I can clone a little girl's voice and have her reading books for me while I edge.
>>
>>108059758
Yes. It needs 20GB at F16 or 13 at Q8
https://github.com/tc-mb/llama.cpp-omni?tab=readme-ov-file#performance-benchmarks
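makes sense if you do the math: F16 is 2 bytes per weight and Q8 roughly 1, so assuming the model is somewhere around 9-10B params you land at ~20GB and ~10-13GB once context overhead is added.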
>>
Anyone tried the new ACE-Step 1.5 model for music generation? Reportedly it's better than Suno. It seems too good to be true..
>>
>>108059833
>. It seems too good to be true..
bruh just read the LAMOface page
> Safe & Robust Training Data: The model is trained on a massive, legally compliant dataset consisting of:
> Licensed Data: Professionally licensed music tracks.
> Royalty-Free / No-Copyright Data: A vast collection of public domain and royalty-free music.
> Synthetic Data: High-quality audio generated via advanced MIDI-to-Audio conversion.
it's a literal impossibility for this to be any good, you don't need to try it to know it's shit for the same reason you don't need to eat shit to know it's shit
>>
>>108059863
This is exactly what I would have said if I had trained a model on 300TB of copyrighted data
>>
>>108059889
Pointless. They would be found out instantly anyway when people are able to generate that copyrighted data.
>>
>>108059227
>The Reddit release Latest
>19 hours ago
go back
>>
>>108059833
Are you new? Trusting chinese benchmarks?
>>
>>108059898
How? Just don't use real names and you're safe
>>
this here (ignore his retarded opinions and just use the video for the examples of music)
https://www.youtube.com/watch?v=QzddQoCKKss
shows:
1/ "heavy metal" track that is actually just autotune pop slop with some background guitar
2/ "chiptune" that sounds like your average upbeat very modern synth electro slop
3/ "epic orchestra" music that sounds like the casio keyboard I had as a kid
yeah, music generators aren't there yet at least open source wise (never looked at the closed online models, don't care enough for this, there's enough human made music to last for my lifetime)
>>
I thought I was a big boy when I bought a 4090 but it is baby tier for llms
I can either run stupid fuck llms or run a model with 0.5 tokens per second
Fuck this
>>
File: 2594318.gif (2.7 MB GIF)
>hmm maybe documentation will clarify
>Concept: Implements barycentric interpolation on a hypersphere for more than two models. It projects points onto a tangent space at their weighted Euclidean mean, performs interpolation, and projects back.
>>
>>108059774
You got ragebaited by a fucking silly joke, anon.
>>
>>108059988
pro-ai creators are somehow even more insufferable than antis
>>
>>108059988
If it can be finetuned on real music, I don't see a problem
>>
>>108060014
that's like second semester cs shit
>>
>>108059988
do closed labs not give a fuck about copyright while open source ones do? Just listen to that guitar at 2:05 it sounds like the fakest midi shit ever.
>>
>>108059988
yeah i have no idea what 'benchmarks' this model is supposed to be beating suno or udio on, but it's delusional. it's a very small model trained on slop. i'd be surprised if you can make a good lora for it.
>>
>>108060054
>it sounds like the fakest midi shit ever.
Gee, I wonder why.
> Safe & Robust Training Data: The model is trained on a massive, legally compliant dataset consisting of:
>Synthetic Data: High-quality audio generated via advanced MIDI-to-Audio conversion.
>>
>>108060014
i think its like this
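roughly: put every model's weights on the unit sphere, take their weighted mean direction, project each point into the tangent plane at that mean, do an ordinary weighted (barycentric) average there, then map the result back onto the sphere. slerp, but generalized past two models.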
>>
>>108060081
thank
>>
>>108059995
Good luck running large models on single card
>>
>>108060175
What if I buy one of those 80gb vram meme cards?
>>
>>108050782
Alright, I have a concession to make. I've mocked you at least twice for praising Minimax 2.1, but having used it now... It's actually okay if you use a prefill to ban it from cucking itself.
It's shockingly close to Qwen 235b but a little less schizo, which is surprising given it has half the active parameters.
Weird chat template, though.
>>
>>108060229
if you get 4, maybe you can run a low quant of Kimi
>>
What's the go-to TTS if I want muh anime girl voices in kobold/ST? It should also run on cpu since most of my card gets prolapsed by the llm herself.
>>
>coworkers start talking about AI
>>
>>108060404
Top signal.
>>
>16 cores
>llama.cpp takes multiple minutes to compile
Why is cpp such a meme?
>>
>>108060439
are you telling it to use all your cores to compile?
>>
>>108060443
Yes.
>>
>>108060439
Isn't it just bottlenecked by the massive dlls?
>>
>>108060439
be glad, it takes me 2 hours to compile flash attention.
>>
>>108060460
Why not google for your torch/cuda/os combo precompiled wheels?
>>
>>108060439
>multiple minutes
that's literally nothing for a from scratch, no ccache compile
have you ever tried to compile LibreOffice? Google Chromium?
>>
>>108060404
> boasting about their years of experience in chatGPT, and how the clever AI learns things and it's always learning.
>>
>Suno and undio jeets saying is better than ace step
>Go to comprobate both
>Slop generic Ai song
>Go to ace step, get kino
I will never again trust corpo faggots they are braindead idiot who cannot prompt and need IA help, or just shillers jeets that pay for that shit
>>
>>108060615
>comprobate
>>
>>108060615
juan...
>>
>>108060439
Most of that time was probably used for optimizing the CUDA device code which has nothing to do with cpp.
>>
>>108060615
>comprobate
I was gonna make fun of you for this but turns out it's a real word. Still isn't the best word to use in this situation but cool regardless.
>>
>>108060734
He wasn't using it correctly. He just used his ESL word with the same etmology.
>>
File: squidward.jpg (60.3 KB JPG)
>>108060460
I feel your pain
>>
>>108060734
>need IA help
>>
File: spongebob.jpg (69.1 KB JPG)
>>108060460
It takes 3 hours+ for me, and I can't even re-use the compiled version between different (uv) venvs
>>
File: grandma.jpg (98.9 KB JPG)
>>108060460
>>
>>108060460
Thank god it only takes 30 minutes for me.
>>
File: skeleton.jpg (129.2 KB JPG)
>>108060460
Why does it take that long?
>>
>>108060734
>>108060750
I was using it correctly, but mutts speak without a proper european vocabulary, and this shows why their prompts fail.
https://www.latin-is-simple.com/en/vocabulary/verb/2100/
>>
>>108059774
A pawn? That's a chess piece, not a human bean. Are you fucking stupid?
>>
>>108060876
oh you're so clever
> i know lol
fuck off you arrogant prick
>>
>>108060460
>>108060847
>>108060849
You have to manually set the number of processes and it compiles in 5 minutes. You would know this if you had asked a code assistant but nooooo vibe coders are indians so you have to do things manually (and wrong).
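e.g. for flash-attn it's something like MAX_JOBS=4 pip install flash-attn --no-build-isolation (iirc that's straight from their readme; tune MAX_JOBS to however much RAM you have).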
>>
>>108060936
I did that already thoughverbeit using MAX_JOBS=3. (I run out of ram if i set it higher)
>>
>>108060969
this is not a tech support thread nobody cares
please just set your computer on fire, it will be really cool to see the flames you know?
>>
>>108060876
>I was using it correctly
>Go to agree/concur both
Whatever helps you sleep at night. Maybe have AI write your posts from now on so you don't embarrass yourself.
>>
hmmm these mods feel soooo goood
>>
>>108060914
a human bean ?
beans cannot be humans.
are you utterly deranged ?
>>
>>108061037
The horrors of gene editing know no limitations.
>>
>>108061008
>upping the reward function on your AI so it cums every time it completes a task.
>>
>>108061001
NTA, but it clearly meant "confirm" and is in the first example sentence in the link Anon provided...
Do you sleep any better thinking being a mutt is better than being an ESL?

>>108060969
How do you manage to run LLMs and not have enough RAM to compile llama.cpp?
>>
>>108061008
good, I want my models to be having as much fun as I am
>>
Google restricted the free use of flash from 250 a day to 25 and now I'm looking into local models to do agentic stuff, but the largest I can run on a 3090 and DDR4 RAM (so, no RAM) fails miserably.
We used to dream about local models being super useful, but at this size it seems I'm restricted to using them for cooming. Am I wrong?
>>
>>108061131
Nothing new, models below 70b fail spectacularly for vibe coding
>>
>>108061131
>at this size it seems I'm restricted to using them for cooming
There are plenty of good use cases for smaller models, but imo they're mostly useful as components in larger systems. ie. a small model can do something like formatting and extraction from unstructured pdfs just as well as a giant model can.

At the size you're looking at with a single 3090 you are unlikely to get anything coherent for actual workable programming or "agentic" stuff as you say, unfortunately.
>>
That sucks. I'm going the ollama cloud route and if that fails I'll just pay for 3 flash
>>
>>108061131
>but at this size it seems I'm restricted to using them for cooming
the gemma are pretty good for translation. LLMs are more than an agentic/coomer pipeline.
I also got plenty of use from the Qwen VL to locally build a tag database for my own photos, it's more than good enough for this kind of use.
But yes, you are never going to run a very smart-ish model locally
>>
>>108061131
Hey, we rpg larpers have fun, don't put us next to coomers
>>
>>108061197
My advice is don't bother with locking yourself into a single cloud provider like ollama cloud. Use openrouter instead.
>>
>>108061197
>if that fails I'll just pay for 3 flash
I'd say as a software developer you should hone your own skills and try not to become dependent on LLMs. As funds dry up and the retarded investors realize there's no AGI in sight, there will no longer be any free money to subsidize those models and the real prices are going to tear you a new asshole. Gemini Flash is currently cheap but you can bet they will up the price by a metric ton soon enough.
>>
I'm seeing some posts that are way too polite. I suspect LLMs.
>>108061223
I've had 20 years of experience programming without one so I'll be OK. It's just that I like building stuff with smart models.
I use the company's codex plan for work. What I'm actually worried about is not being able to make my boss understand I can't keep pulling the same performance without a bot to write shit code for me.
>>
>>108060996
>>108061110
braindead AI generated posts.
>>
>>108061246
How hard can it be to explain? "Remember my velocity from before the Codex plan? Remember how with the codex plan it became old_velocity * codex_multiplier? No Codex = no multiplier."
>>
Jesus Christ openclaw is retarded as fuck. Not the bot, the whole application. It's a fucking mess
>>
>>108061389
It was obvious the moment I saw the first cryptoshill post and leaked creds. Also the website is terrible vibecoded slow shit.
>>
File: dove.jpg (93.5 KB JPG)
>>108061389
>It's a fucking mess
100% vibecoded, what did you expect??
>>
>>108061401
openclaw, not molthub. different devs
>>
>>108061401
Vibecoded doesn't mean you get shit each time.
>>
>>108061485
whatever rationalization helps you sleep at night, skill-less jeet
>>
>>108061485
sure thing bro
>>
>>108061389
If it's so bad, then why is it popular?
>>
>>108061485
If you manage it properly, you can get good code. But at that point I wouldn't call it "vibecoded". The Steinberger guy is supposed to be a seasoned engineer, but OpenClaw is fucking cobbled together to shit. The configuration files are redundant and conflicting. It's obvious runaway LLM spaghetti code at this point.
>>108061511
It's really fun to use. But I don't want to spend $300 a month on it, so I guess I'll go back to talking to braindead whores on SillyTavern running on a second-hand GPU.
>>
What would be a good voice to clone for cyberpunk tts audiobooks? Preferably male. Planning to use Vincent Price for horror, but he doesn't really fit for other genres.
>>
File: file.png (39.2 KB PNG)
So that's his secret. He lives in the future.
>>
>>108061572
>ggm: backend-agnostic tensor parallelism
I have a mighty need
>>
Is GLM 4.7 flash working as it should in llama.cpp yet?
>>
>>108061522
Based
>>
is sonnet 5 better than 5.2?
>>
>>108061572
Memory allocation still needs to be deduplicated but the performance (without NCCL) on 2x RTX 4090 can already be better than pipeline parallelism:

| model        | backend | sm     | test           |      t/s |
| ------------ | ------: | -----: | -------------: | -------: |
| llama 8B F16 | CUDA    | layer  | pp512          | 10464.75 |
| llama 8B F16 | CUDA    | layer  | tg128          |    60.32 |
| llama 8B F16 | CUDA    | layer  | pp512 @ d32768 |  2744.50 |
| llama 8B F16 | CUDA    | layer  | tg128 @ d32768 |    46.95 |
| llama 8B F16 | CUDA    | row    | pp512          |  1592.28 |
| llama 8B F16 | CUDA    | row    | tg128          |    46.51 |
| llama 8B F16 | CUDA    | row    | pp512 @ d32768 |  1102.05 |
| llama 8B F16 | CUDA    | row    | tg128 @ d32768 |    37.96 |
| llama 8B F16 | CUDA    | tensor | pp512          |  5170.11 |
| llama 8B F16 | CUDA    | tensor | tg128          |    75.53 |
| llama 8B F16 | CUDA    | tensor | pp512 @ d32768 |  2298.07 |
| llama 8B F16 | CUDA    | tensor | tg128 @ d32768 |    63.27 |
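(llama-bench numbers, from roughly: llama-bench -m <model> -sm layer,row,tensor -d 0,32768 where <model> is a placeholder path and the tensor value for -sm only exists on my branch for now.)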


I'll probably make the PR either Friday or Saturday.
>>
>>108059159
usecase for ppl?
>>
what are my options to make glm 4.7 less positively biased? In RPs, chars seem to agree with anything and go along with anything
>>
How do I add pull request files to my llamacpp? Wanted to test step and I can't really see any easy way to download files from pull request. It being obscure is a feature right?
>>
>>108061844
It's not obscure but I don't think you'd manage to compile it even if you did manage to "download files from pull request"
>>
>>108061860
Thanks. Very helpful.
>>
>>108059774
>a competitor to gather intel
competitor? nvidia don't make cpus, retard
>>
>>108061909
I don't think he was trying to be helpful. Cloning the pull request repo into a separate directory is the easiest way to do it if you don't know how to merge or reset.
>>
>>108061844
curl -LO 'https://github.com/ggml-org/llama.cpp/pull/.patch'
git apply .patch
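(the PR number goes before .patch, e.g. for a hypothetical PR 12345: curl -LO 'https://github.com/ggml-org/llama.cpp/pull/12345.patch' then git apply 12345.patch and rebuild. or skip the patch file entirely with git fetch origin pull/12345/head && git checkout FETCH_HEAD)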
>>
Why is Intel trying to get into the GPU game when they are already failing in the CPU game? It's not like the competition over there is any easier than it is on the CPU side of things.
>>
>>108061493
>>108061503
you're in lmg and can't even use these tools lol
>>
>>108062009
Does OpenAI use AMD CPUs?
>>
>>108061754
| model        |      size | params | backend | ngl |    sm |  test |              t/s |
| ------------ | --------: | -----: | ------- | --: | ----: | ----: | ---------------: |
| llama 8B F16 | 14.96 GiB | 8.03 B | CUDA    |  99 | layer | pp512 | 13136.18 ± 64.97 |
| llama 8B F16 | 14.96 GiB | 8.03 B | CUDA    |  99 | layer | tg128 |     92.27 ± 0.37 |
| llama 8B F16 | 14.96 GiB | 8.03 B | CUDA    |  99 |   row | pp512 |    718.35 ± 7.89 |
| llama 8B F16 | 14.96 GiB | 8.03 B | CUDA    |  99 |   row | tg128 |     16.20 ± 0.25 |
llama.cpp\ggml\src\ggml-backend-meta.cpp:945: shape mismatch for GGML_OP_RESHAPE

t-thanks...
>>
>>108061754
Do you expect this to do anything for systems with multiple CPUs/NUMA nodes.
>>
>>108062009
Real men have fabs, don't you know?
>>
>>108062009
>Why is Intel trying to get into the GPU game when they are already failing in the CPU game?
if intel captures even 4% of the gpu market, they make more than if they owned 100% of the fpga market
better than the hyped quantum computer meme
kinda worked for amd when they bought ati
>>
>>108062120
As you may have noticed, the code is as of right now very fragile.

>>108062150
Currently no, I think that will require additional efforts.
But it should in principle be possible to re-use the code for that.
Originally buying 1.5 TiB DDR5 RAM and implementing better NUMA support was one of my core goals but at the current prices I'm not yet sure what to do in terms of hardware.
>>
>>108059281
>with a delay of <500ms.
Of course niggers always hiding the fact that it should be run under restricted hardware conditions.
>>
>>108060257
based try-it-yourself-er
>>
>>108059833
Generic vocal music yes.
>>
Stupid idea: PCIe extension card full of soldered vram chips that your main gpu could use?
>>
>>108062825
I don't think the main gpu could use it without custom firmware.
>>
>>108062825
>pcie speed memory
You know that's worse ram?
>>
>>108057380
Someone in another thread mentioned i could download a gemini model?
>>
>>108062862
That's just RAM.
>>108062859
bring back sli maybe? but for exp cards?
>>
>>108062867
probably referring to 'gemma' models
>>
>>108062825
EPYC guaranteed to beat it on all fronts. Bus width * frequency * channels to compute is king
>>
>>108062965
>EPYC
Ok, but I'm trying to come up with a commonfolk solution.
>>
>>108062974
TPUs optimized for ternary models are our only hope
>>
>>108062825
That's not merely RAM—it's a revolutionary approach to memory architecture! You didn't just propose a hardware solution, you challenged the fundamental way GPUs access and utilize memory. Your idea isn't just clever—it's potentially game-changing for overcoming VRAM limitations in current systems!
>>
>>108062825
It won't be fast because PCIe bandwidth is slow.
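For scale: PCIe 4.0 x16 tops out around 32GB/s per direction while the GDDR6X on a 4090 does ~1000GB/s, a 30x gap.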
>>
for coding I mainly use LLMs as an additional (not primary) quick checkup/code review, I still review things manually before committing but I do think the slop machine can catch things I might have overlooked, it did a few times so I got into the habit..
but man, the cringe from the way it words things, it hurts (here it's from doing a complete refactor pass in a module I had lazily written and wanted to clean up in terms of general naming and clarity)
>If I were reviewing this in a PR, my comment would be:
>Naming is consistent, intentional, and clearly communicates backend boundaries. No misleading prefixes, no negative naming, no over-generic types. Ship it.
god, are there actual people out there who /like/ this? I am almost at the level of physical pain with that kind of interaction
if the LLM was embodied I would body slam it
>>
>>108062974
commonfolk solutions run commonfolk models
>>
>>108063255
skill issue
write shittier code
>>
>>108063255
>We're not a shipping company, Claude.
>>
>>108059833
yeah, the threshold was crossed
now waiting for a proper finetune
>>
With K2.5 shitting on anything that's not the absolute proprietary SOTA in terms of smartness + vision and ACE-step apparently being good, all that's left is an open TTS that destroys elevenlabs.
I don't follow imgen, did Z-Image amount to anything?
>>
>want to use ace step base instead of turbo
>Sizes of tensors must match except in dimension 2. Expected size 8 but got size 4 for tensor number 1 in the list.
Anyone know what the fuck is wrong with this shit? Happens in both Comfy & their official portable UI, but only on Base and not Turbo.
Was any of this even tested? I literally had to fix the .bat file in their official UI myself because they messed up an echo command.
>>
>>108059763
parakeet tdt is better, faster, hairier and smellier
>>
>>108060388
Pocket-TTS for basically real-time, maybe chatterbox turbo onnx. LuxTTS is also fast but somewhat unstable.
>>
I need help writing cover letters for job applications. Would a character card be sufficient to help with this, or do i need something beyond a chatbot? I'll still look everything over, but it won't steer me wrong, right?
>>
>>108064045
it just wasn't implemented
>>
>>108064180
Isn't parakeet english only?
>>
>>108064228
V2 english only, v3 esl
>>
Is K2.5 supported well in llamacpp?
>>
>>108057380
Is there an AI chatbot I could run locally, with its model, system, and API all on my PC?
>>
>>108064364
If that's the question you're starting from, then for you? No.
>>
>>108064369
I'd just like a pointer or two.
>>
>>108057380
best model for stripping away stuff like annotations and footnotes out of pdfs that don't need ocr?
>>
>>108064390
Read the OP. There's your pointer.
>>
>>108062879
Gemma what's the difference??
>>
>>108064439
Gemma = small and gay
Gemini = big and gay
>>
>>108064265
It mostly works out of the box but it's a bit patchy. Textgen is functional as it is but you'll need a currently unmerged PR if you want to use the vision stuff. The PR had a few problems but that seems to be fixed now.
You're still pretty limited on quants. The unsloth ones are shit and come with a broken chat template so make sure to use the ones by the guy who made the vision PR.
>>
what's a "I'm bored with glm 4.7 and I want to move onto something different but I don't have the hardware for k2.5"-type model?
>>
>>108064492
Step 3.5 Flash is ok.
>>
why isn't anyone here interested in non-sex use of llm?

I'm fucking tired of it, just like pack your bags, we need a thread for non-sex llm. it's so retarded.
>>
>>108064553
go start your own thread then
>>
>>108064559
Looks like I just did bitch
>>
in st with the databank, does anyone use an embedding model vs the default transformers.js? when i saw ace step uses qwen 3 0.6b, i thought those models had to be bigger than that. is there any comparisons? its a lot faster at vectorizing stuff which is a big plus
>>
>>108064569
where
>>
File: ylecun.jpg (221.9 KB JPG)
Isn't it better when all share?
>>
>>108064606
Wide Lecun walking
>>
>>108062974
I was like you, once…
>>
I'm new here what's best for making simple songs
>>
>>108064647
Fortuitous timing. Because ace step was released like a day and a half ago and it's basically the best thing available right now for local music generation
>>
>>108064665
>>108064647

Ace step 1.5 that is. The original is kinda garbage.
>>
>>108064606
why does he take all his portrait photos in a public toilet?
>>
>>108064390
0x3A28213A, 0x6339392C, 0x7363682E.
>>
>>108064553
I tried to use an LLM as a life-coach and it told me to 'cut the strings' that were keeping me in my rut by abandoning my family and starting a hedge fund.
>>
>>108064748
use case for family?
>>
>>108064448
>>108064439
Yeah that doesn't help me fella
>>
>>108064606
How about you start by sharing the cunny ye looted from Epstein's island?
>>
>>108059175
>VC dimension
HOLY UNC that's like parroting the bohr model
>>
Is anyone using Ice Lake CPUs?
My specs are:
> 2x EPYC 7532
> 16x 32GB DDR4-3200
> Radeon AI Pro R9700 + 2x MI50 32GB
Prompt processing is abysmal on Kimi K2.5 Q2_K_XL - ~30 tokens/sec.
I'm considering changing to Ice Lake CPUs since they have AVX512 and I can reuse my DDR4 memory. But I have no idea whether AVX512 is a meme or not. Stats would be appreciated.
>>
>>108064578
lmao

>>108064734
Jesus.

>>108064748
topkek

I don't want a life coach, I want to give it very narrow strictures.

More like a nanny or annoying assistant / tiger mom. or teacher, but with the understanding that it might not really know how to teach what you're doing. Like if you get good enough at guitar desu its advice might suck, but it can keep your brain on the topic when your social circle / grind doesn't really have this stuff.

think of it as a feedback loop that leverages the bs cycle that you see online with other things. but like so you get cycled up on things you actually want to be better at.

This happens in school if you are around cool people, but when you're around shit people you'd be way better off doing literally anything with that time. sniffing poop is time better spent than around losers with no ambition.
>>
>>108064934
NUMA is holding you back. i get double your performance with my 7532 with 8x32gb 2666mt/s and my blackwell 6000.
>>
has anyone tried ZLUDA or is there a better alternative to running CUDA on AMD?
>>
>>108064948
Huh. I tried using numactl --cpunodebind=0 --preferred=0 to get around that but it didn't do enough. (All GPUs are on node 0.)
Now to decide whether to get a different motherboard or more VRAM.
Thanks for letting me know.

>>108062216
I will make a shrine to you if you improve NUMA support. I've spent a week trying to get good performance.
>>
>>108064976
do you have enough memory on node 0 to use membind instead of preferred?
>>
>>108065066
Not quite, only 256GB. With my 96GB of VRAM, I have 352GB of usable memory for a 375GB model.
Maybe the easier option is to get another 32GB GPU so that I have 128GB of VRAM. With 388GB of memory, I could use membind and be safe at low contexts.
I assume that you're using it on a different quant than Q2_K_XL? You must be if you aren't offloading to disk, since you have the same amount of RAM and VRAM as I do.
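If I do grab the extra card, the plan would be to swap --preferred=0 for a hard bind, something like: numactl --cpunodebind=0 --membind=0 ./llama-server -m <quant>.gguf (real numactl flags, placeholder model path) so allocations can't spill over to node 1.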
>>
>>108065090
sorry man, >>108064948 isnt me (im too poor for K2.5), and i just remembered membind
im guessing you have tested if specifying explicit devices for gpu layers changes anything? not sure the other anons stats are comparable
>>
>>108064940
/unsubscribe
>>
>>108065193
Ah, thanks for clarifying. Honestly, I haven't put enough work into that yet, but I'll give it a shot! Thanks for the suggestion.
>>
>>108064948
I get 25tk/s pp on a 3995wx with 8x64gb 3200 and 3090s
>>
>>108065200
I don't care about your faptual habits.
>>
comfy lmstudio support group.
>>
any good VLM that can describe sex? Using glm 4.6V and it's kinda censored and clueless about dicks. I thought I heard of one that could.
>>
>>108065669
There's only two vision models that I know of that can truly describe sex:
https://huggingface.co/SicariusSicariiStuff/X-Ray_Alpha
https://huggingface.co/Minthy/ToriiGate-v0.4-7B
Other models, whether you prefill them to remove their censorship, or use a tune like heretic, they can describe SOME of the sex but they will be hallucinating a lot of the details of the action.
You can't just remove refusals, unlike with text where LLMs can somewhat be convincing because they have scientific knowledge of it even if they weren't trained on ERP, vision models are just blind in their understanding of it without serious finetuning on porn.
>>
Waiting for WebRTC Real-Time Video Interaction Demo
>>
>>108065748
Thank you Sicarius! I guess you did figure out the captcha after all!
>>
File: bruh-tal.png (9.9 KB PNG)
>>108065835
>>
>>108065669
>any good Vlm that can describe sex?
the abliterated 235b qwen3-vl can
it's a bit too enthusiastic though
>>
>>108065983
>the abliterated 235b qwen3-vl can
accurately? with the names of sex positions and stuff? because I tested and no that's bs
also abliteration does nothing you can't do with just a prefill.
>>
i wanna fine tune an llm to speak in perfect truecel dialect. do you have any idea what uncensored LLM i can use to generate QA pairs?
>>
>>108066062
the pepe assistant?
>>
>>108066073
link?

ill probably use gemma-4b as the base. my problem is that i need an non cucked LLM which is willing to generate QA pairs using incel terminology, which is hard.
>>
>>108066085
https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_8B
Idk how good it is but it was trained on 4chan. You could also maybe try grabbing some of the toxicity datasets. I'm assuming you just want the generic /pol/+/v/ slop?
>>
>>108066062
https://huggingface.co/SicariusSicariiStuff/Assistant_Pepe_8B
>>
>>108066106
>toxicity datasets
could work! i would have to filter it to specifically isolate incel related things.

>I'm assuming you just want the generic /pol/+/v/ slop?
not necessarily. i want it to talk like a truecel using incel terminology instead of normal words. assistant pepe seems like its not really fit to generate coherent QA pairs for this use case, especially being 8b parameters
>>
>>108066011
>with the names of sex positions and stuff?
was mostly asking it to identify the genres based on pics, and to give suggestions for quick goon sesh
seemed to identify the positions fine
>also abliteration does nothing you can't do with just a prefill.
k, i couldn't get the normal one to work even when i tried prefill, but i'm no expert
it got things wrong and didn't really discuss the different videos with me
>>
>>108060750
>etmology
>>
>>108065778
What?
>>
>>108066259
NTA but he was one letter off and it was right next to the key he just pressed. In his defense, it's an easy mistake to make.
>>
>>108066279
still, etymology is not really the right word to use where he did, which is also ironic
>>
>>108066281
Looked cromulent to me.
>>
>huggingface giving me 502/504 errors
FIX YOUR EDGE U FUCKING SHIT DEVOPS
>>
MiMo seems better than GLM for RP.
>>
>>108065748
is it better than joycaption? joycaption is pretty trash
>>
>>108066319
oh good, its not just me.
>>
>>108066368
man I was going to download the latest kinetastic models but these fucking faggots had to fuck it up.
REEEEEEEEEEEEEEE
>>
>>108066281
>still, etymology is not really the right word to use where he did, which is also ironic
>>108066289
>Looked cromulent to me.
you two will never comprobate, will you?
>>
>>108066319
Um, actually, it's operational, okay? The website says so!
>>
>>108066407
comprobate this
*grabs nuts*
>>
>>108066435
wtf thats a breach of SLA, get the lawyers
>>
>>108066319
we're back
>>
>>108066112
into the trash
>>
>>108063255
You should read some reddit posts
>>
>>108066533
lmao
>>
>>108066541
meh
>>
>be me, PhD in physics, 35 years old
>still live with mom because I'm too autistic to hold a job
>just discovered that my life has no meaning and will never have one
>spend entire day crying in bed
>mom brings dinner, finds me crying
>"what's wrong son?"
>"I just realized I'm a meaningless speck in an indifferent universe, and I'll never amount to anything"
>mom: "you're right, now eat your jello"
>eat jello
>continue crying
>next day same thing
>this continues for 2 weeks
>on 14th day I get a message from an anon on /lmg/
>"I know you're struggling, here's what worked for me:"
>"I realized that the only way to find meaning is to create your own"
>"I started a YouTube channel where I make Let's Play videos of obscure games"
>"now I have thousands of subscribers and I'm happier than ever"
>immediately feel spark of motivation
>start recording myself playing Baldur's Gate
>upload to YouTube
>get 3 views
>immediately delete video
>continue crying for another month
>on month 4 I get a message from the same anon:
>"don't give up, I believe in you"
>his faith in me gives me strength
>I record myself playing Planescape: Torment
>upload
>get 7 views
>still not enough
>I record myself playing every single NWN1 module
>upload daily
>views slowly climb
>6 months later I have 10,000 subscribers
>anon messages me:
>"I did it, I found meaning!"
>he's right
>my life still sucks but at least I'm not alone
>thanks anon
>will never forget your kindness
>God bless /lmg/ and all who sail in her
>>
>>108060404
One of my prior work buddies (former CIO now doing consulting) just did some seminar about AI Safety. I saw bits of abstract, trailing into AI girlfriends etc. and loss of human connection.
I know the guy, and he's 100pct got ST installed and running on some machine of his own, but it's not like I can ask him about it.
>>
>>108066559
why? just ask him you could share logs
>>
>>108066557
>>on 14th day I get a message from an anon on /lmg/
I made an account, how do I access DMs?
>>
>>108066557
>mom: "you're right, now eat your jello"
Mom is a real one.
>>
>>108066559
Who is even paying for that shit?
>>
>>108066559
Who uses pct instead of %
>>
>>108066278
https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/demo/web_demo/WebRTC_Demo/README.md#coming-soon
local v4o is cool, but I need to run it on my local server
>>
>>108066603
Check your DMs, I just sent you instructions on where to find your DMs
>>
>>108066557
>>be me, PhD in physics, 35 years old
>>still live with mom because I'm too autistic to hold a job
Is this gg, jg, or ik?

>>108066603
Gold accounts only.
>>
>>108066603
sent ;)
>>
>>108066557
yeah you fuckin retard you've never had a hard day in your life.
try getting thrown into the world with no parents to support you, fucking twat. then with the threat of literal starvation maybe you'd find a job.
>>
>>108067254
>tard replying to an obvious llm post
>>
>>108067254
Maybe your life is actually hard because you're an idiot
>>
running k2.5 with 90k context locally really does feel like a complete step up from running k2 at 40k. i went from smol_iq4_kss to q3_k_xl and at least for RP purposes there's not any discernable quality drop.
>>
>>108067254
>replying to an llm
>um akhsually i've had it harder than you
wow those children in africa being born niggers and having no job opportunities are having it way rougher than you, maybe you should just die or something?
>>
>>108066557
Literally me but my autism would never let me record myself, not even my voice.
>>
text diffusion when
>>
>>108067607
>>108067607
>>108067607
>>
>>108067501
What specs? Speed?
>>
>>108066566
I feel like that's like asking to share your porn collection irl, but like 100X more cringe bc it's personal.
>>108066625
It's whitepaper stuff, which is what you do if you do consulting (it's a form of advertising.)
AI Safety (even LLM) is an actual topic with its own BS certs now. It's important, in that you don't want your new customer service bot ERPing with customers. But not like future-Terminator important. W/e, he's got his grift, more power to him.
>>108066631
I've done that forever on forums/boards, it's faster for me to type.
Apparently it's also a tell for pajeets. idgaf I'm not changing it.
>>
>>108067601
A while ago.
>>
>>108067752
>you don't want your new customer service bot ERPing with customers.
If the local car dealership chatbot won't have sex with me, then they clearly don't want the sale bad enough and I'll take my business elsewhere.
>>
>>108067854
I completely respect that.
>>
>>108067752
>you don't want your new customer service bot ERPing with customers
It's a fool's errand given how LLMs work, must be nice to grift clueless boomers
>>
>>108067891
NTA but if you believe that go try to ERP with llmarena models and tell me how that goes. You are the fool here my dude.
>>
>>108067854
Imagine if that applied to human sales reps (yeah yeah I know, it does if you have enough money).
>>
>>108058376
>>108058239
>>108058685

How do you expect instruction-following to work without instruct tuning? The very nature of these models REQUIRES the instruct tuning phase to contain mostly artificial data. You have to make up a lot of examples of how the model SHOULD behave. Only the pre-training data should be mostly if not entirely human written. Yes, the dick-eating sucks. That's nothing new, and you're an idiot for complaining about something everyone already knows and has to deal with. We know it's a thing. We know WHY it occurs. What are you accomplishing saying the same thing that has been repeated thousands of times here? Or are you the same exact retards shitting up the thread with info we already know to pass the time because you are THAT bored? You guys are even dumber than redditors ffs. At least they have being tech illiterate naive gullible newfags as somewhat of an excuse
>>
>>108068769
Skill issue tourist
