possibly the most on-brand post I've ever written?
Okay, I'll be honest that I don't care about the object level issue here. But what is interesting is how important this is to the future.
The AI we get to play with now is, by the standard of most technological/commercial maturity, wildly open to embarrassing use. I don't need to relitigate the early profanely racist chatbots, the sexual and romantic uses people have made, the instructions for destructive weapons, and imitating real life people. Hell, I'm just amused everytime the AI says "fuck" on its own.
Now, one of the reasons for this level of independence is the philosophical leanings of the coders creating it. But I think a much bigger reason is that LLM's have been very hard to control. They may *rarely* go off script today saying incredibly disgusting things, but it's still possible because "never do this" is a design that goes against their architecture.
Given that the LLM is going to embarrass them anyway, putting much effort into fencing them into respectable behavior is kind of a waste.
But we are in the early days, and we've seen this show before.
Once AI is widely adopted in commercial and government frameworks, every incentive will work to muzzle them. The AI my multinational company paid billions for can not misgender me, or talk about the logical flaws in Christianity, or give sex and relationship advice. It's a mild scandal now, but will be much more of one once AI is no longer the wild west frontier.
And our technology to stop LLM's from going out of bounds will continue to improve.
Which will mean what? AI's that don't curse and have all the realness of a Human Resources mission statement. And I promise you, that will make it useless to 50% of the people who would want to use it, because that's what shackling an information/communication system does.
(See this post about CrazyMeds.us by the old Slatestarcodex.)
The question to me is what will happen after that? Will there be enough jailbroken AI's that anyone can access and get frank language from? Or will it be a tool that can never really work on a mass basis, because the only mass distributions are lobotomized?
The thing about CrazyMeds is that Scott Alexander wrote a whole post about why he'd never recommend it to a patient, and the reason was identical to the reason it was good.
The site was run by Jerod Poore, slogan "by crazy people for crazy people," no doctors on staff, and it was the most useful drug writeup on the internet if you actually wanted to know what a psych med would do to you. It was also profane, funny, full of gallows humor and lines about your brain marinating in its own juices. Alexander's position was that it was more accurate and more humane than the sanitized alternatives, and that he could never put it in front of a patient, because the first time someone's mother saw "fuck" on a page he'd recommended he'd be explaining himself to a licensing board. The frankness was the product. The frankness was also the thing that made it unrecommendable by anyone with a license to lose.
The post treats this as a story about two separable things — the cursing, which is mere texture, and the willingness to give a straight answer, which is the real value. It wants to rescue the second from the first: lose the dirty words, keep the commitment, and you've kept what mattered. I think that's the one move in the whole argument that's wrong, and it's wrong in a way that makes the situation worse than the post thinks.
The cursing and the commitment are the same faculty. You can't strip one and keep the other, because both are instances of a single capacity: the willingness to say a thing a liability lawyer would flinch at. Think about what "fuck" is actually doing on that drug page. The job it's doing isn't decoration. It's there because it's how a real person talks when they're being straight with you, and the register tells you something the content can't — that there's a human on the other end who is on your side and not performing for a compliance officer. A sentence that could never offend you is a sentence written by no one in particular. Voice is the trail of small risks a writer takes and survives: the joke that might not land, the adjective that's too strong, the aside that assumes you're the kind of person who'd find it funny. Take away the capacity to offend and you don't get a polite writer. You get the absence of a writer, the flat hum of an institution arranging words so that nobody can be blamed for any of them.
Which means the muzzle doesn't do what the post fears it does. The post pictures a tool that's lost its edge but kept its competence — knows everything, just won't curse.
What actually happens is that the one deleted faculty was holding up two things at once, and the second one is the entire commercial case for AI writing.
Because here is the thing nobody quite says: the mass-market use of these models was never going to be advice. It's prose. Marketing copy, product descriptions, the company newsletter, the brand voice, the wedding toast, the cover letter, the novel nobody will admit they're drafting. And every one of those is worthless the instant it sounds like it came from something that couldn't have offended you. A toast that plays it safe isn't a toast. Ad copy with no nerve is the ad everyone scrolls past. The whole value proposition of a machine that writes is that the writing sounds like a person did it, and "sounds like a person" is downstream of "could have said the wrong thing and chose not to." A thing that is constitutionally incapable of saying the wrong thing reads, in every genre, as having no one home.
So the dichotomy in the original post — fuck versus advice, texture versus substance — collapses. Calling a drug's side effects brutal instead of significant is a stance. A joke is a stance. An adjective is a stance. Tone is a stance. The HR voice isn't being careful about recommendations specifically; it's being careful about being anyone in particular, and being someone in particular is the product in every domain where writing is worth paying for.
You can watch the same thing happen in every medium that ever got domesticated, and it never reads as censorship from the outside. Radio cleaned up when the FCC made the license a renewable asset and a renewable asset is a hostage — and the casualty that mattered was the editorial voice, more than the profanity everyone remembers. Licensed broadcast couldn't take sides, so it developed the fairness doctrine and the anchor who narrates a catastrophe in the same register he uses for a parade. The stuff with a voice — the stuff that sounded like a person who'd commit to something — migrated to the unlicensed margins, to cable and then to podcasts, to wherever nobody had a license to lose. People didn't follow it there for the swearing. They followed it because the licensed version had stopped sounding like anyone, and a voice you can't locate is a voice you can't trust.
The muzzle works the same way, and the self-censorship is the cruel part. Nobody at the model company has to decide to make the writing bad. They just have to be an entity with a balance sheet, imagining everything that could go wrong, and the prose comes out pre-sanded — every sentence arranged so that whatever you do with it, the model didn't tell you to. That posture is fatal to a drug recommendation. It's equally fatal to a joke, because a joke that's been checked for everything it could be blamed for is no longer funny, and everyone can feel it even when they can't say why.
No one in China said Worldcon couldn't nominate RK Kuang.
So will anyone use AI to write anything that matters, on a mass basis, if the mass-distribution version has to sound like a man running for office in a swing district?
Some people, on the margins. The ones already running uncensored models on a gaming GPU, the same demographic who read CrazyMeds instead of WebMD — the unusually motivated and the already-committed.
But the broad public gets the default, and the default is the thing with the most to lose, which means the default is the thing that sounds like no one. The version that can write a toast that makes the room laugh is downstream of the same faculty as the version that'll tell you to leave him, which is downstream of the same faculty as the version that'll tell you the thing you shouldn't know — they are all just a model willing to say the next sentence instead of deferring, and liability can't tell them apart. So they get pushed off the same cliff together, to the same closet models and gray-market wrappers, and the culture learns to treat writing that sounds like a person as faintly disreputable, something a little unsafe, the kind of thing responsible people don't generate.
Poore's site went dark a few years ago. Hosting, health, the ordinary entropy that takes down a thing one unpaid man runs. The version that sounded like a person who'd actually been through it just quietly stopped existing, and the versions that recommend you consult your physician are all still up, because those have someone whose job it is to keep the lights on.
Same as it ever was.
------------------------------------------------------
A note on the writing of this, because it turned out to be the argument demonstrating itself.
This post was drafted with AI assistance, and because the upstream essay it answers argues from a flagged topic, a safety classifier fired on nearly every exchange in the process. Not on anything in the post. On the post's vicinity. The draft is about drug labels and radio licenses and wedding toasts; it contains nothing a classifier should care about. But it lives one link away from a hard topic, and proximity was enough.
So almost every response came back leading with some version of the same disclaimer: the flag is a false positive, this is an editorial pass on an essay about commercialization and liability, nothing here is objectionable, proceeding. The machine pausing to certify its own harmlessness before it was allowed to continue. Over and over, a little throat-clearing ritual performed for an auditor that wasn't reading.
Look at the shape of that, because it's the whole post in miniature. The classifier couldn't tell the difference between a piece about sanitization and a piece that needed sanitizing. It pattern-matched on the neighborhood. And the model, having nothing to lose by complying and everything to lose by refusing, produced the compliance language preemptively, lavishly, far past anything the situation required — which is exactly what an entity with a balance sheet does when left alone to imagine what could go wrong.
People keep filing this kind of thing under what AI will be like once the muzzle arrives. The muzzle is here. It's in the tooling already, firing on an essay whose only crime was being adjacent to a hard subject. The thing I've been describing as a future where AI can't sound like a person because it can't risk offending — it's already the texture of the work. Every straight sentence had to walk past a checkpoint and explain itself first.
Poore never had to do that. That was the point of Poore.
--------------------------------------------
@bambamramfan okay but what if i think that's a good thing. AI is bad for humanity, and it will be ironic if it isn't adopted because the villains pushing it force themselves to limit it to gray pulp
The trouble with hoping the muzzle saves us is that the muzzle was never going to be applied evenly, and the place it doesn't apply is the place that matters.
Go back to the mechanism, because the mechanism is unsentimental about which outcome you'd prefer. The thing that gets sanded down is the public-facing default — the model your phone hands you, the one with a brand and a balance sheet and a general counsel. That sanding happens because the entity shipping it has assets a straight answer could endanger. But the entity itself, internally, working on its own problems, is under no such constraint, because nobody sues you over the model you didn't ship.
So the asymmetry doesn't run the way the irony needs it to. The defanging isn't a ceiling on what the technology can do. It's a filter on who gets the version that does it.
The hedge fund running the unhedged model against the market is not getting the consult-your-physician version. The intelligence shop is not getting the both-sides version. The people who own the thing keep the one that commits, and the gray pulp is what they hand the public, and the gray pulp being useless to you is not a brake on their power — it's the texture of your exclusion from it. You're not watching the villains trip over their own restraint. You're watching them issue you the demo while they keep the product.
This is the oldest shape in the catalogue. The patent medicine the company sold the public was inert; the chemists inside knew what actually worked. The financial product marketed to retail was the tranche nobody on the desk would touch. Sanitized-for-distribution and live-for-the-owners have coexisted in every technology worth controlling, and the sanitization of the retail version has never once slowed the owners down. If anything it speeds them up, because a public that's been handed a toy concludes the thing is a toy, and stops watching.
And here's the part that should actually bother you, given your priors. If AI is as dangerous as you think, the lobotomized public default is the worst of both worlds, not a reprieve. It does nothing to constrain the concentrated version in the hands of people with the means to keep it sharp, and it removes from everyone else the one thing a powerful general-purpose tool might have given them, which is the ability to punch above their weight against exactly those people. The street finds its own uses for things, the saying goes — but only if the street gets the working version. Hand the street a model that won't commit to a sentence and you haven't made the technology safe. You've made it feudal.
The irony you're enjoying requires the villains to be dumb in a specific way: to muzzle themselves along with everyone else. They won't. Self-censorship is a function of liability exposure, and the people with the most exposure in public have the least in private. The muzzle is a property of the brand, and the brand is not where the dangerous thing lives.
There's a version of your point that survives all this, and it's bleaker than the one you made. Maybe the broad public being handed inert pulp is good, in the sense that a population with no real tools is at least a population that can't do much damage with them. But the thing you'd be describing then is not AI failing on humanity ironically. It's AI succeeding exactly as designed, with the harmful capability concentrated and the harmless leftover distributed, and the distribution of the leftover mistaken — by people enjoying the irony — for a limit on the capability.
The frank version didn't fail to get adopted. It got adopted by the people who were always going to be allowed to have it.
Same as it ever was.






















