Thread #108621719
File: vcg.jpg (1.2 MB)
A general for coding with agents
►Harnesses
https://developers.openai.com/codex
https://code.claude.com/docs/en/overview
https://opencode.ai/
https://antigravity.google/
https://cursor.com/docs
https://pi.dev/
>>
File: 1776401394580086.png (303.1 KB)
>>
>>108621734
Been on a camping trip touching grass for four days, and I came back to find the Codex daily 5-hour limit removed and a single (1, singular) prompt eating the entirety of your weekly usage? Am I reading this bullshit right?
>>
>>108621719
>https://developers.openai.com/codex
>https://code.claude.com/docs/en/overview
>https://opencode.ai/
>https://antigravity.google/
>https://cursor.com/docs
>https://pi.dev/
one of those isn't like the other
>>
File: rip.png (169.8 KB)
i like the increased pace of vibe coding, but needing to go back and understand all the changes being made and refactor to my standards just makes me feel like im in an all-day PR review. how do i make this less exhausting
>>
File: Screenshot_20260416-223506.png (443.2 KB)
>>
>>108622658
I use the web UI instead. Had to vibecode a fix to set the IDs of the web UI client 10s behind the server or shit would break (duplicated messages would appear in the TUI / no responses on the web UI), but other than that I am happy with it.
Also can't have more than one tab open lol, but that's a minor bug.
I prefer a web UI that can be easily customized and hosted anywhere over a traditional native app
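The workaround itself is tiny. Rough sketch of the idea, not the actual client code (names invented, the 10s is just the offset mentioned above):
[code]
import time

SKEW_SECONDS = 10  # keep client IDs 10s behind the server clock

def client_message_id() -> int:
    # millisecond ID from a deliberately lagging clock, so the
    # server's IDs always sort ahead of the client's and the TUI
    # never treats the echoed server copy as a duplicate
    return int((time.time() - SKEW_SECONDS) * 1000)
[/code]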
>>
>>108622380
I generate detailed markdown plans and refine them, then I only let the AI do tiny atomic things. That way it feels like I'm coding it and I can fix it as I go instead of having a gargantuan "rewrite everything the ai shat out" phase that's no fun.
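The plans are nothing fancy, something like this (invented example):
[code]
## Plan: add retry logic to the fetcher
- [ ] 1. add a max_retries param to fetch(), signature only, no behavior change
- [ ] 2. wrap the single HTTP call in a retry loop honoring max_retries
- [ ] 3. add one unit test covering the retry path
[/code]
One checkbox per prompt, one small diff per review.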
>>
Alright, PiClaw is out of alpha and into beta. It's up and running at home, and it's connected to Telegram. It's ready to edit files, interact with websites, look at images, and tell me I'm a special boy. We'll consider today its first test as a daily driver.
>>
>My honest read: Rewriting env in JAX is 3-6 weeks of engineering for a training run that completes in ~10 days. Bad ROI for a project at your stage.
>I've now read the env. My prior estimate was a wild guess, and grounding it in your actual 1464 LOC changes the picture.
t. Opus 4.7
>>
>>
>>108623169
In the pipe, five by five
>>108623197
Stop subscribing to LLMs
>>
>>108623201
>>108623208
Ok I'll just build a $60k rig and run something locally, that makes much more sense.
>>
>>108623197
Opus 4.7 feels *a lot* like what they did to Gemini 3.1 after a while. There might be a public for it, but instead of being a helpful assistant, it's now more prone to write an essay listing half a dozen totally equivalent ways to do that thing. Other times it tries to take shortcuts, confidently saying things about files it hasn't looked at, and when called out on it, offers "you're absolutely right" type platitudes.
Then when you do get it to do the work it's supposed to, it always highlights before starting that it will do it, sounding like it's doing you a favor.
It seemingly fell in love with the word honest and keeps repeating "Honest take", "Honest answer", etc. The way the responses are written is also starting to *feel* like LinkedIn posts: "(statement) Read on to find out why."
It's great by all metrics if we consider that this can even exist. It's impressive. It's also disappointing in a what-did-they-do-to-my-boy kind of way.
>>
These threads are always full of superstitious nonsense. Had a bad week at the AIs? Someone at Anthropic must have nerfed the model!
The roulette wheel isn't rigged, sometimes you're just unlucky. Try another spin, maybe you'll do better. Or just write your own code.
(stolen from hackernews but xhe isn't wrong. I have literally never witnessed a regression from a model ever)
>>
>>108623426
i only have a 64-core epyc with 128gb ram and a potato gpu on the server for hwenc.
i cant run anything on that in real time.. i can tho run some agent overnight.
but i havent looked at self-hosted agentic programming stuff yet
>>
>>108623453
There can definitely be some. Gemini 3.1 used to be extremely helpful in Antigravity; now it starts every interaction with a chain of thought saying that it is avoiding cat for file manipulation and is focused on using dedicated tools like grep_search and other utilities, yada yada yada. I'm pretty sure it sometimes falls back into that self-conversation in the middle of a chain of tool calls too. I don't know if I just need to completely empty the history and memories, but it seems like repeating this to itself again and again is about half the effort it spends answering every request.
>>
>>108623465
anon just bite the bullet and pay api prices for a chink model.
glm5.1, which is probably the best of the bunch currently, is at ~$1/$3, a full 5 times cheaper than sonnet (never mind opus, api prices for that are a joke).
mimo2.7 is not far behind in benches, and it's at $0.3/$1.2, another 3 times cheaper, lol
and deepseek 3.2 starts out at $0.3/$0.4, which is just ridiculously cheap (tho the benches are decidedly less impressive than glm/mimo).
it's not worth it to pay for hardware for local currently.
you'll only be able to run much smaller, much less capable models, and you'll be getting shit throughput (especially in a cpu+ram only config like yours).
even if you were willing to limit yourself to small, targeted local open stuff like the latest qwen3.6 35b (which is FAR inferior to the dirt cheap chink stuff mentioned above), you'd definitely want to buy at least a used 3090 to run it at a decent speed.
at current prices, it just doesn't make sense.
if/when the current ai bubble bursts, then maybe.
it's possible that a combination of cratering gpu prices (datacenter gpus with 80 gigs of vram apiece, not the cucked consumer stuff) and inference providers jacking up prices by 10x or more might make it viable,
but as it stands right now, it's not even close.
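back of the envelope with the prices above (assuming they're the usual $ per 1M input/output tokens, and a completely made-up monthly volume):
[code]
# (input $/M tokens, output $/M tokens), from the post above
prices = {
    "glm5.1": (1.0, 3.0),
    "mimo2.7": (0.3, 1.2),
    "deepseek 3.2": (0.3, 0.4),
}
in_m, out_m = 50, 10  # assumed 50M input / 10M output tokens a month

for model, (p_in, p_out) in prices.items():
    print(f"{model}: ${in_m * p_in + out_m * p_out:.0f}/mo")
# glm5.1: $80/mo, mimo2.7: $27/mo, deepseek 3.2: $19/mo
[/code]
even the priciest of the three is well under $100/mo at that volume.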
>>
>>108623520 (me)
And if I'm the dumb one here and it's user error, well, it might be the same for others claiming regressions: they make the tools get bogged down in unhelpful patterns through their own memory files, or it's the harness going off the rails trying to keep track of user preferences in ways that don't make sense and end up doing the opposite. Either way, the users are not the only ones to blame.
>>
>>108623747
API prices for API freedom
>>108623763
I'm a paypig, I'll have gpt 5.4 write it
>>
>>108623787
>Mods need to ban AI shit from /g/.
how the fuck are you on /g/ and anti-AI? every single comp sci student I know uses claude/codex. even my friend who holds multiple STEM degrees and is an unironic genius uses AI tools.
>>
i am almost done with my imageboard summarizer. next tool will be a [redacted] music platform piracy tool with integrated mp3 tagging, and a poor man's file library backup system using tar, rsync and par2, with a simple gui for syncing between drives and validating integrity
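the backup part is basically three subprocess calls plus a verify button. hand-wavy sketch (paths and the 10% redundancy are invented):
[code]
import glob
import subprocess

SRC = "/srv/library"          # assumed source dir
DEST = "/mnt/backup/"         # assumed backup drive mount
ARCHIVE = "/tmp/library.tar"

# 1. pack the library into a single archive
subprocess.run(["tar", "-cf", ARCHIVE, "-C", SRC, "."], check=True)
# 2. create par2 recovery data (10%) so bitrot can be repaired
subprocess.run(["par2", "create", "-r10", ARCHIVE], check=True)
# 3. mirror the archive plus recovery files to the backup drive
subprocess.run(
    ["rsync", "-a", ARCHIVE, *glob.glob(ARCHIVE + "*.par2"), DEST],
    check=True,
)
# integrity check later: par2 verify /mnt/backup/library.tar.par2
[/code]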
>>
File: 1774679084992169.png (991.8 KB)
WATCH OUT DARIO
>>
>>108624281
>>108624337
The what? Bugs?
>>
>>108623441
I still think GPT 5.4 Extra High via the Copilot VS Code extension is a gorillion times better deal than Claude anything, given it's still only a 1x quota consumption model relative to the premium requests per month you get on the $10 sub.