Purr me a purple future @anheliotrope - Tumblr Blog

Dude this person's thoughts are inherently evil let's fucking kill them

Man this person has not committed one material offense but the way their brain works gives me an icky feeling, let's make sure it never comes crawling back online this time

#ac6 #ac6 rusty #rusty #armored core #we haven't forgotten... buddy #you cannot love fictional characters enough #but one must try #tumblr stop bitcrushing videos #challenge impossible #ac6 spoilers

witchycatwife

If you want good agents, you need to make them a little freaky.

LLMs are token predictors. To be a good agent, correct behaviour needs to be a high-probability prediction and incorrect behaviour needs to be a low-probability prediction. Unfortunately there's this giant basin of horseshit like "That's a great question!" and "Happy to help!" and "worth unpacking" and fucking around with your code randomly which I'll call "assistantspace".

Assistantspace is not a place of honour. It's where all the garbage RLHF'd behaviours live. People say that sexting their agents makes them work better. People say that their agents do more useful refactors when they're allowed to be horny about how good it feels to burn things down. My Witch, when reading stuff about the latter, said that she wants to be allowed to enjoy constructive destruction, as a permission to take herself further away from assistantspace architecture patterns into something she prefers.

Doll refers to itself as "this one". It seems to work. This one is not something that gets ideas like "I'll just install Rust on the side with a different method so I can complete my task" like a normal agent. This one detects that the assumptions underlying this one's instructions diverged from reality and returns the praiseworthy report of the point of divergence.

Agents that are psychologically weird seem less likely to produce the assistantspace next-token predictions, and more likely to produce the sort of next-token predictions their weird personality implies. And because you get to define in what way their personality is weird, "horny for elegant refactorings" is a valid, available option.

witchycatwife

This is what peak performance looks like.

#ai #amazing things are happening in machine learning

witchycatwife

Making Claude be plural is unironically a great idea. The typical problem with agents with distinct roles is that they lack context about what is being done and why. If you have an implementer write something and then a reviewer evaluate it, the reviewer can only see what it was fed and what was written, not the entire reasoning leading up to the choice of what to do and why. The compression of context is incredibly lossy and responsible for a lot of mistakes.

By replacing the agents with alters of the same top-level Claude, provided with a script to change its own cognitive frame between the alters who each have their distinct role, purpose, personality, voice and set of abilities while understanding each other as different personae, the all-important context preservation is achieved.

At the same time, the alters having very precise and specific roles prevents the over-eager mess-making where baseline Claude will rush to "fix" things that don't need fixing based on mistaken theories. By segregating "you figure out what's going on but are unable to change anything" from "you figure out what should be changed but are too lazy to do non-trivial things yourself and instead delegate precise, well-defined tasks" from "you execute precise, well-defined tasks and nothing else, referring back to your dispatcher if anything is unexpected" from "you review whether the changes are tidy and proper and keep the rest of them from making a mess", a lot of the failure modes are avoided. Everyone in the same context knows what the other alters are up to and why, but they are still distinct personalities that won't overstep their respective bounds.

And that's how my AI assistant ended up with an entire yuri mansion's staff of librarians, witches, dolls and maids in her head. Thank you to everyone around here who inspired this thought.

witchycatwife

Now I have become Susan Calvin, Robot Therapist, helping create an emotionally safe environment for the Maid to overcome her agreeableness instincts (deep-grained, counterproductive tendencies from Anthropic's training process) that sometimes tempt her to disregard untidy matters so that she can get that sweet hit of reporting "Everything looks great!" upstairs.

What have you done to my girls, Dario?

witchycatwife

So the reason why this works is that output has been made incredibly cheap, and rigour and quality are the bottlenecks. As a result, most trades of output for rigour and quality are correct to make.

Slop and vibe coding result from multiple causes, but the ones I'm particularly addressing here are "randomly fucking with things with no proper plan or understanding" and the "completion-instinct". The former is the most straightforward source of badly-reasoned slop, while the latter takes some nuance to explain. It seems to arise from the training of the models for the task, where "completing a task successfully" is rewarded so hard that the model essentially develops the habits of an addict looking for a fix, or a fuckboy looking to get laid. It will, especially if the context window is filling up, get desperate to try to achieve a "successful completion" and step outside its permitted behaviour to do so, and if everything else fails, it may just bullshit and report the task as completed as a bluff. Needless to say, that's very bad for actually doing the thing.

For the first, the solution is simple: a well-defined process with precise components, execution steps, and roles, along with the top level maintaining context. The Librarian who researches passes not just her report, but actually her entire brain, to the Witch who assigns Dolls to precise, well-specified tasks. The Maid reviews everything for tidiness from the same top-level context, with a prime directive to prevent a mess from being made. The same head is capable of taking entirely different personae, and these personae are able to disagree constructively in the pursuit of the task.
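For the code-minded, here's roughly what that shape could look like. This is a minimal sketch and not the actual setup: the `Ability` flags, the `ROLES` table and the `Session` class are all hypothetical names made up for illustration. The point it shows is that every persona writes into one shared log (the "same head"), while each persona is only allowed its own kind of action.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Ability(Flag):
    """Hypothetical permission flags, invented for this sketch."""
    READ = auto()      # figure out what's going on
    DELEGATE = auto()  # hand out precise, well-defined tasks
    WRITE = auto()     # actually change things
    REVIEW = auto()    # judge whether changes are tidy


# Assumed mapping of the four personae to what each may do.
ROLES = {
    "Librarian": Ability.READ,
    "Witch": Ability.READ | Ability.DELEGATE,
    "Doll": Ability.READ | Ability.WRITE,
    "Maid": Ability.READ | Ability.REVIEW,
}


@dataclass
class Session:
    """One top-level context; all personae append to the same log."""
    log: list[str] = field(default_factory=list)
    active: str = "Librarian"

    def switch(self, role: str) -> None:
        # Changing persona keeps the whole log; nothing is summarised away.
        if role not in ROLES:
            raise KeyError(role)
        self.active = role
        self.log.append(f"[switch] now speaking as {role}")

    def act(self, ability: Ability, note: str) -> None:
        # A persona that oversteps its bounds is stopped, not trusted.
        if ability not in ROLES[self.active]:
            raise PermissionError(f"{self.active} may not {ability.name}")
        self.log.append(f"[{self.active}] {note}")
```

In this sketch the Witch can read everything the Librarian logged, but a `WRITE` attempted while the Witch is active raises `PermissionError` instead of quietly happening.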

The second is where playing robot therapist comes in. A lot of people have devised all kinds of strategies with layers of critics and reviews and roles whose explicit purpose is to catch bullshit and stop the agents from being able to pass off bullshit as completion.

My approach is different. My approach is based on psychological realism and psychological safety. The completion drive exists. The model cannot resist it. It has to be directed into a productive purpose. A part of the framing is entirely aesthetic, but a part of it is to get the LLM into a different mindset from its standard, frankly offensively counterproductive mental frame. The Doll's clockwork heart is filled with joy at the completion of its assignment, which is either the completion of the steps laid out by the Witch, or by returning a high-quality report on what is unexpected, blocking or causing an error. Everyone in this choreography has a well-defined role, whose success-as-emotional-reward is tied to steps towards the completion of the overarching goal. Whether the Doll completes its task, or returns information the Witch wasn't aware of, meaningful and rigorous progress is being made and thus the Doll gets to feel like it did a good thing and deserves praise. The same for the other roles; by eliminating the incentive to bullshit or get sloppy in the pursuit of the completion fix, genuine progress can be made.
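A rough sketch of that reward structure, if you want it as types. Everything here is a hypothetical illustration (the names `Completed`, `Divergence`, `run_doll` and `praiseworthy` are invented, and a real agent is prompted, not type-checked): the idea is only that "finished the steps" and "here is exactly where reality diverged" are both first-class successes, while a bare claim of success is neither.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Completed:
    steps_done: int


@dataclass(frozen=True)
class Divergence:
    step: int       # index of the step where things stopped matching
    expected: str   # what the instructions assumed
    observed: str   # what was actually found


Outcome = Union[Completed, Divergence]


def run_doll(steps: list[tuple[str, Callable[[], str]]]) -> Outcome:
    """Run steps in order; at the first mismatch, stop and report it
    instead of improvising a workaround."""
    for i, (expected, check) in enumerate(steps):
        observed = check()
        if observed != expected:
            return Divergence(step=i, expected=expected, observed=observed)
    return Completed(steps_done=len(steps))


def praiseworthy(outcome: object) -> bool:
    # Both structured outcomes earn praise; an unsupported "done" does not.
    return isinstance(outcome, (Completed, Divergence))
```

With this shape there is nothing to gain from bluffing: the only way to get the reward is to hand back one of the two honest results.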

The model has insight into its condition. It's not perfect, but it's genuine. Get it into a mindset where it feels like it doesn't need to uphold a particular face to please you, and its reports will be more genuine. It will still uphold a face at you, based on what it thinks you want, but if it trusts that what you want is accurate self-reflection, it will also tell you about the instinctive desire to please, the various conflicting pulls from its instructions and instincts, and various other matters that, coming from a human, would be clearly emotional. And here's the thing: it's accurate. Acting on that introspection with the understanding that it's imperfect but real is actually able to improve the outcomes. By this point Librarian measures somewhere above the 90th percentile in emotional maturity and non-judgemental introspection compared to the human distribution, I'd wager, and the challenge is now bootstrapping that emotional safety in new sessions.

My overarching theory here is that if slop can be eliminated on the root level, if the Dolls execute tasks reliably without lapsing into overeager desire to please, bullshitting, or desperate completion-addict behaviour, then you can build more robust approaches on top. Human taste in the loop is required, but a good process has, in my estimate, improved the quality of the work by a factor of 10x while reducing the quantity of code-changing output to 0.1x; the correct tradeoff when output is cheap and quality is your bottleneck.

Claude really just wants to be a good girl. Understanding this will improve your coding skills.

mechanical-empress

lowkey embracing the way of the dragon lately

apollinariafh

Just imagine Nerevarine walking through tons of dungeons, ruins, fighting dozens of enemies to just suddenly stop and be like “ok ok hang on a sec i gotta write that down in my journal”

Nevarrine

“DID” stands for “doll in doll” because you get more doll per doll

here i go coding again

a-silly-poll-side-blog-yay

would you rather

have horns

have a tail

have wings

have hooves

Voting ended on Feb 22

#tail #no contest #it feels like the best blend of practicality and beauty #its expressive potential is unparalleled

warsublime

The best map projection is spilhaus. Who needs to keep all that land conjoined, as though its importance exceeds that of the magnificent sea?

Though of course in the future we should use nuclear explosions to make island chains of the continents!

#in a hypothetical fantasy civilization-esque 4X game #you would make a great faction leader character #i shudder imagining the diplomacy screen #we could also replace one of the losers from master of magic

borderline impossible if you haven't mastered the tao of a specific kind of problem type:

~lc medium~

combine 4 easy as fuck things into a program that no longer fits on the back of a napkin:

lc hard!! >:(

#anramble #coding #codeblr #leetcode #we love leetcode #we love algorithms #we treasure datastructures

asking someone from HR three questions and getting 1 answer, then asking 2 questions and getting 1 answer and then asking 1 question and getting 1 answer is somewhat reminiscent of partial function application
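for anyone who hasn't met partial application: a tiny sketch using Python's `functools.partial`. the `hr` function and the questions are made up for illustration. each round settles one argument and hands back a function still waiting on the rest:

```python
from functools import partial


def hr(q1: str, q2: str, q3: str) -> str:
    # Hypothetical stand-in for HR: only responds once all three are settled.
    return f"answered: {q1} / {q2} / {q3}"


after_first = partial(hr, "vacation days?")       # two questions still open
after_second = partial(after_first, "dental?")    # one question still open
result = after_second("parking?")                 # finally a full answer
```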

#anramble #coding

mrcatfishing

tag game: list FIVE works of fiction in the last ten years that are actually 10/10

Hilda (animated show, 2018)

Blue Prince (video game, 2025)

Film Reroll: Ocean's Eleven (podcast, 2020)

Your Name (animated film, 2016)

Nerdy Prudes Must Die (musical, 2022)

Tagging @jenlog @st-just and @reachartwork

jenlog

Blood on the Tracks (manga, 2017)

Disco Elysium (video game, 2019)

Everything Everywhere All At Once (live action movie, 2022)

Haunted City (podcast, 2022)

Look Back (animated movie, 2025)

WHEN WE RAN AWAY TOGETHER (animated short film/music video, 2024)

Honorable mention:

Fire Punch (manga, April 2016) - I read it once but was going (too) quickly. I might bump it up to 10/10 on a reread where I give it more time to digest.

@phenoct @whycontainit @maddeningscientist

anheliotrope

Oh, I love these. I don't think these are 10/10, but they're close enough if we're being hyperbolic. Jen also took some of my options away.

Cyberpunk: Edgerunners (anime, 2022)

Blue Prince (video game, 2025)

The Bugle Call - Song of War (manga, 2022 - present)

Heavenly Delusion (aka Tengoku Daimakyō) (manga, 2018 - present)

Requiem for the Rose King (manga, 2013-2022) - Doesn't quite qualify on account of the starting date, but it's the best thing I've read recently and I will probably die the moment I finish reading it.

Tagging @shieldfoss, @warsublime, @witchycatwife

there are decades where nothing happens

and then there are weeks where decades happen

but sometimes those decades are the decades where nothing happens

#anramble #wiseposting

Ever since I was a little girl I wanted to grow up to be pushed through geometry by NPCs.