ok so I looked it up, and it turns out they made a track out of PVC pipes, down a hill. The owner didn't realise PVC expanded in the heat, so on a turn the track just fell apart and the dude inside went over a fucking freeway and into a swamp.
The funniest part is that the inspector was watching the whole time, and once the ball stopped he left without saying anything. Park management just shut it down then and there.
"The ball cleared a small hill, briefly going airborne, then zipped right across Route 94, the two-lane road splitting the park. Cars honked and slammed on their brakes. If there had been opposing traffic, Frank would have become part of a real-life game of Pong, volleying from one bumper to another.
Still in pursuit, we followed the ball toward a small lake in Motor World that had been earmarked for a fleet of tiny bumper boats for children. The area wasn’t open yet, but the empty boats were being tested and floated on the surface. The ball soared over the grass and smashed into several of them, scattering the others with rippling waves from the impact, which launched some of the boats several feet in the air.
Charlie and Ken waded into the water looking for the hatch. After some difficulty, they got it open. Charlie pulled Frank out by grabbing him under his armpits like a baby. Frank crawled up the bank, coughing and sputtering. He splayed across the grass as we all stared at the ball, which bobbed in the water like it was attached to a fishing lure.
We did not ask for the inspector’s report, nor did we ever hear of one being filed. Ken Bailey returned to Canada. The snow-makers cleared away the PVC. Told to dispose of the Bailey Ball, they rolled it into the woods, where it remained for many years."
I don't know that this beats the teeth story, but it's pretty great.
Text of tweet under the cut because it is loooong.
But... Stochastic Parrots.
Timnit Gebru was fired from Google in December 2020 for refusing to retract a research paper, and every single warning that paper made about large language models has now happened at a scale the industry spent 4 years trying to make people forget about.
Her name is Timnit Gebru.
She co-led the Ethical AI team at Google. She co-wrote a paper called "On the Dangers of Stochastic Parrots" with Emily Bender at the University of Washington and two other researchers. The paper was 14 pages long. It was submitted to a top AI ethics conference. And it was the reason Google decided that one of the most senior Black women in AI research could no longer work there.
The story Google told publicly was that she resigned. The story she told, confirmed by 2,695 of her colleagues in an open letter, was that she was fired by email while on vacation because she refused to either retract the paper or remove her name from it.
The paper had not even been published yet.
Here is what she actually wrote, and why every prediction inside it has now come true.
The first warning was about scale itself. Bender and Gebru argued that training ever-larger models on ever-larger scrapes of the internet would produce systems that appeared fluent but had no actual understanding of language. They called these systems stochastic parrots because they would repeat patterns from training data with statistical confidence and zero comprehension. The paper predicted that this apparent intelligence would fool both users and developers into trusting outputs that were structurally incapable of being reliable.
This was 2020. GPT-3 had just come out. The paper predicted the hallucination problem before anyone had a word for it.
The second warning was about bias amplification. The paper documented in detail that internet-scale training data contains systematic overrepresentation of dominant viewpoints and underrepresentation of marginalized ones. The models would not just absorb this bias. They would amplify it, because the optimization process rewards confident outputs, and confidence in language patterns tracks frequency in the training set.
The prediction was that hiring tools built on these models would discriminate against women. That healthcare triage tools would underperform on Black patients. That loan approval systems would entrench inequality while presenting their decisions as neutral algorithmic judgment.
Every one of those things has now been documented in deployment.
Amazon's hiring algorithm penalized resumes that contained the word "women" in any context. Healthcare risk scoring algorithms used by major US hospitals were found to systematically underestimate the medical needs of Black patients. Apple Card's credit algorithm gave wives credit lines 10x lower than their husbands for the same financial profile.
The third warning was about environmental cost. The paper calculated that training a single large language model produced emissions equivalent to the lifetime output of 5 cars. The prediction was that the race to scale would create an environmental footprint that would eventually rival entire industries.
In 2024, Google's emissions were up 48% from 2019, and the company explicitly blamed AI infrastructure. Microsoft's were up 29%, same reason. Both companies have now quietly abandoned the climate commitments they were publicly celebrating the year Gebru was fired.
The fourth warning was about documentation. The paper argued that the training datasets being assembled were too large for anyone to actually audit. Nobody at Google, OpenAI, Meta, or any other lab could tell you with confidence what was in the data their models were trained on. This was not a temporary problem to be solved later. It was a permanent feature of the approach.
In 2023, researchers discovered that the LAION-5B dataset, used to train Stable Diffusion and other major image models, contained thousands of images of child sexual abuse material. The companies that had trained on the dataset had no way of knowing. The paper predicted that category of failure 3 years before it was found.
The fifth warning was the one Google cared about most.
Bender and Gebru argued that the deployment of these systems would centralize linguistic and cultural power in the hands of the small number of companies that could afford to train them. The internet would become a place where the dominant voice was a statistical average of dominant voices, presented as a neutral assistant. Languages underrepresented in the training data would degrade over time as more web content was generated by these systems and fed back into the next training run.
This is now happening in real time. A 2024 study found that 57% of new web content in English is AI-generated or AI-assisted. Researchers studying low-resource languages have documented active degradation in translation quality, because the synthetic content fed back into training is itself worse in those languages.
The paper Google fired her for predicted the model collapse problem before model collapse had a name.
The mechanism behind why this all happened is the part of her work that nobody quotes.
Gebru's argument was not that AI is dangerous in some abstract sci-fi sense. Her argument was that AI is dangerous in a very specific structural sense. The technology was being built by a small group of researchers who shared similar backgrounds, worked at similar companies, and were rewarded for shipping products faster than competitors. The incentive structure made it impossible for safety, ethics, and bias concerns to slow anything down. Anyone inside the system who raised those concerns was either ignored, sidelined, or removed.
She was making that argument from inside Google.
Then Google proved her right by removing her.
The team Google had built to make sure their AI was safe was dismantled in 90 days because they did the job they had been hired to do. Margaret Mitchell, the other co-lead of the Ethical AI team, was fired two months after Gebru for searching through her own emails for evidence of how Gebru had been treated.
Gebru did not stop. She founded DAIR, the Distributed AI Research Institute, in 2021. The mission is to do AI research outside the control of the companies that have a financial interest in not hearing the answers.
Every prediction in the Stochastic Parrots paper has now been validated by deployment. Hallucinations are an industry-wide problem the largest labs cannot solve. Bias amplification has been documented in hiring, healthcare, lending, and criminal justice. Environmental costs are larger than entire small countries. Training data audits remain impossible. Model collapse is an active research crisis at every major lab.
The question worth sitting with is the one almost no one in the industry will say out loud.
Every researcher with the technical credibility to call out these problems watched what happened to her in December 2020 and made a calculation about their own career. The number of people willing to speak publicly about safety and ethics issues inside the major AI labs collapsed after that firing and has not recovered.
The researcher Google fired for warning about exactly what is now happening was right.
The company that fired her is now the second-largest deployer of the technology she warned about.
And the people inside that company who agree with her are not allowed to say so.
watching the documentary on the OceanGate Titan disaster and it's striking how Stockton Rush wanted to emulate Elon Musk and in many ways he did, identifying weight as the core constraint of submarine operations (lifting and hauling the sub around as being the primary expense) and basing the entire concept around making it as light as possible, but then he overrode every engineer who told him in explicit terms that it was going to pancake, and it pancaked.
also quite reminiscent of Richard Branson's space experiments, which did at least have Burt Rutan as a designer, but the same breed of dickhead CEO at the head.
goddamn the Titan had acoustic monitoring sensors and you can literally hear the carbon fiber in the hull cracking louder and louder as they descend on test dives, but the CEO was angry that they were even bothering to monitor it and refused to accept that it was dangerous, and he wasn't just risking other people, he was on the damn thing himself! he was just that deep in denial and unwilling to accept reality.
DISCLAIMER: intense yap session from someone PUTTING THINGS INTO LAYMEN’S TERMS to make it easier for everyone to understand 🙂↕️
As someone who’s Obsessed™️ with this tragically yet sadly predictable story, we’d like to take a moment and yap about how crazy insane the Real Time Acoustic Monitoring System for Carbon Fiber for a DEEP SEA PRESSURE VEHICLE is.
First of all. The fact that Stockton Rush decided to make a DEEP SEA PRESSURE VESSEL out of carbon fiber, a material that is famous for its tensile (pulling) strength NOT its compressive (pushing) strength. Pressure. Which is very much in the DEEP SEA. Where the VEHICLE is going (if we remember correctly, the pressure in the ocean increases by roughly 1 atmosphere of pressure- ie 1 atmosphere of pressure being what the atmospheric pressure is at sea level -for every 10 meters you descend. Space, on the other hand, is between 1 and 0 atmospheres of pressure). The ocean wants to squish you into biological paste so, so, so, so, so very badly.
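To put a rough number on that, here's a minimal sketch using the common rule of thumb of about 1 extra atmosphere per 10 meters of seawater. The roughly 3,800 m figure for the depth of the Titanic wreck and the function name are assumptions brought in for illustration, not something stated in the post.

```python
# Rule-of-thumb pressure estimate: ~1 atmosphere at the surface,
# plus ~1 more atmosphere for every 10 m of seawater above you.
ATM_PER_METER = 1 / 10  # approximation, not an exact physical constant


def pressure_atm(depth_m: float) -> float:
    """Approximate absolute pressure in atmospheres at a given depth."""
    return 1.0 + depth_m * ATM_PER_METER


TITANIC_DEPTH_M = 3800  # assumed approximate depth of the wreck

if __name__ == "__main__":
    for depth in (0, 10, 100, TITANIC_DEPTH_M):
        print(f"{depth:>5} m -> about {pressure_atm(depth):.0f} atm")
```

By this estimate the hull has to hold back something on the order of 380 atmospheres at Titanic depth, while a spacecraft only ever deals with a difference of about 1.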
Everyone and their mother are telling Stockton Rush that using carbon fiber is a Shitass Bad Idea. Not only does it logically make no sense to use (again, good with PULLING not BEING PUSHED, which is what pressure is) but it just hasn’t been tested the way other materials, such as titanium, have been. We know what the failure point of titanium is. We know when Catastrophic Failure happens in titanium. We know how many cycles of fatigue/stress titanium can withstand before Catastrophic Failure happens. We know how titanium behaves Before That Happens.
Carbon fiber?? We don’t know those things about carbon fiber, when it comes to DEEP SEA PRESSURE FATIGUE/STRESS. It’s a material used for PULLING not PUSHING. So it hasn’t even been tested for DEEP SEA PRESSURE. Because it’s not meant for deep sea pressure. And we’re about to show you how very Not meant for DEEP SEA PRESSURE that carbon fiber is.
Here’s a generic chart that gives you an idea of what we mean when we say Titanium Is Predictable. This chart isn’t about titanium specifically, but we can pretend it is. You can look at the colors and understand what the chart is telling you. We can see the curve and what’s a Warning and what’s Bad. The red is Bad. You don’t want to be in the red. The curve of the chart shows that you have time to predict the red before it goes to red.
Here’s an actual chart of titanium under stress, along with a steel and aluminum. Ough wow!! Look at those curves!! You don’t have to know exactly what you’re looking at to see and understand that the curves happen before the lines “flatline”. There’s a predictability before the lines of the chart “go flat”. This means you can understand the integrity of the titanium you’re using, and whether or not it’s safe to use, whether it can be worked on, or if it needs to be retired altogether.
Now let’s look at carbon fiber.
There’s no curve. Carbon fiber works, until suddenly it doesn’t. There’s no prediction for catastrophic failure- it just happens.
AND STOCKTON RUSH IS USING THIS MATERIAL IN AN APPLICATION IT SPECIFICALLY DOESNT WORK WITH (pushing- NOT pulling).
Well. The thing about Stockton Rush is that he’s going to Do What He Wants. So he’s going to build his vessel out of expired carbon fiber. Because it’s cheaper to transport and manufacture (remember, he planned on having a whole fleet of these things). He wanted to spend as little money as possible while making back as much profit as possible.
And Stockton Rush, with his background in aviation (imagine. for a moment. if a NASCAR driver decided to make a commercial airplane, and only used his knowledge of racing cars to do it. because that’s essentially what’s happening here) decided to use a Real Time Acoustic Monitoring System. Because carbon fiber is made up of a bunch of threads of carbon “braided” (for lack of better term) together, when the UNPREDICTABLE- WE ALL SEE THAT CHART. WHERES THE CURVE RUSH- catastrophic failure happens, the threads “snap” apart as they break. So his thinking process is, the Real Time Acoustic Monitoring System would be able to “predict” when the carbon fiber was going to fail catastrophically.
For shits and giggles, here’s applications acoustic monitoring systems are used for:
Please note that DEEP SEA PRESSURE VEHICLES are not on there. Damn near the rest of this IS in planes, however.
And. C’mon. COME ON!!!!! LOOK at that chart!!!! Does it look like predicting the catastrophic failure of the carbon fiber is even going to matter?!? WHEN YOURE AT THE BOTTOM OF THE GOD DAMNED OCEAN?!? Catastrophic failure at the depths the Titanic is at means IMPLOSION. The Real Time Monitoring System is a fart in the wind. It’s Safety Theater (<- you will hear that a LOT when you watch documentaries and the investigation).
Except.
✨ EXCEPT ✨
It actually fucking worked.
The Real Time Monitoring System actually DID work and actually DID predict the catastrophic failure of the carbon fiber before the catastrophic failure even happened.
And this is HUGE.
This is GROUND BREAKING.
This is the INNOVATION that Stockton Rush has had a hard on for since day one.
Except they’re losing money. And as OP said, Stockton Rush was upset that the carbon fiber had the audacity to snap, crackle and pop at all.
He was so mad in fact that, as we found out during the US Coast Guard’s investigation, Stockton specifically had the graphs plotted for each dive in such a way so that the RTMS’ alert to the carbon fiber’s lack of structural integrity wasn’t readily obvious (instead of the graph showing the curves of stress fatigue, they were plotted to show depth with time).
Anyways. The fact that his crack pipe idea of using an acoustic monitoring system to predict a famously unpredictable material in an application it’s very much NOT used for and it actually WORKED is phenomenal. Unfortunately, Stockton Rush didn’t set out to find a way to make carbon fiber a viable material for deep sea pressure vessels, he set out to be a “big swinging dick” like Elon Musk and Jeff Bezos. So he didn’t care, and now he’s dead because of his own hubris.
thank you for writing this in old school tumblr “buckle up fuckos and learn you a thing” diction, it is indeed incredible that the man invented a clever warning system that accurately told him his idea sucked and then ignored it, scroll of truth style
Hello Ryan I am here to help. So the first step is pretty easy: Three cheeseburgers are worth 18, so each one is worth 6. If these are dollars, that's a steal!
From the second equation we get that cheeseburger plus fries-squared is five. Subtracting cheeseburger, which is six, from both sides, we get that fries-squared is negative-one. Math fans will know that there are two solutions to this; either fries are the "imaginary unit" 𝒾 or they are its negative, -𝒾. We'll do the rest of the problem with 𝒾, keeping in mind that at the end we should also take the complex conjugates as solutions.
Finally, we have that cup to the power of fries, minus cup, equals three. Replacing fries with 𝒾, and moving a cup to the other side, we get that cup-to-the-𝒾 is equal to cup-plus-three.
Now, the weird part about this is the cup-to-the-𝒾. The problem with this is that complex exponentiation is technically not a thing. That is to say, there is no one function which is mathematically equal to "input-to-the-power-of-𝒾". In fact, there are infinitely many such functions.
Fortunately, due to reasons that take about six pages to explain (trust me I've done it), there is one particular function that many people have agreed is "the most reasonable one". This is not a mathematical notion, but a human preference. Seeing as this question was presumably written by a human, I am comfortable with using this function.
So, what function is this? Well, given a complex number r∠θ written in polar form (if you don't know what that means don't worry), where -π < θ ≤ π, then (r∠θ)^𝒾 = e^(-θ)∠ln(r).
Applying this to our problem, a value r∠θ will be a possible solution for cup if e^(-θ)∠ln(r) = r∠θ + 3. Splitting this into real and imaginary parts, we get two equations: e^(-θ) cos(ln(r)) = r cos(θ) + 3 and e^(-θ) sin(ln(r)) = r sin(θ). We can graph these equations on Desmos:
The possible values of cup are the intersections between the red, green, and purple. There are infinitely many of these which have an angle of around -π/3, and there are two weirdos: One which is a complex number very close to -2.98, and one which is somewhere around -25. The possible values for cup are all of these infinitely many solutions, and also all of their complex conjugates.
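For anyone who wants to poke at this without Desmos, here is a small sketch, not the method used above. It checks the polar-form rule (r∠θ)^𝒾 = e^(-θ)∠ln(r) against Python's built-in complex power, then looks for one value of cup by repeatedly applying cup ← cup^𝒾 - 3. The helper names and the starting guess are made up for illustration, and the iteration is only one assumed way to home in on a single intersection.

```python
import cmath
import math


def power_i(z: complex) -> complex:
    """Principal value of z**i via the polar rule (r∠θ)^i = e^(-θ)∠ln(r)."""
    r, theta = cmath.polar(z)  # cmath reports theta in [-π, π]
    return cmath.rect(math.exp(-theta), math.log(r))


def residual(cup: complex) -> float:
    """Distance from solving cup**i - cup = 3 (0 would be an exact solution)."""
    return abs(power_i(cup) - cup - 3)


def solve_near(start: complex, steps: int = 50) -> complex:
    """Hypothetical helper: fixed-point iteration cup <- cup**i - 3."""
    cup = start
    for _ in range(steps):
        cup = power_i(cup) - 3
    return cup


if __name__ == "__main__":
    z = complex(1.5, 2.0)
    print(power_i(z), z ** 1j)  # these two should agree
    cup = solve_near(complex(-3, 0.1))
    print(cup, residual(cup))
```

Started just above the negative real axis, the iteration should settle on the "weirdo" close to -2.98, with a small positive imaginary part. Its complex conjugate then solves the version of the problem where fries is -𝒾.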
A lot of companies that make things have two separate lines: consumer and enterprise. Consumer is for us poors that the company doesn't need to respect. You can buy more expensive, higher tier consumer items but it's just as fragile. You're paying for more bells and whistles. Enterprise grade is stuff that will actually last because it is often sold in a lot as part of a contract and the contract won't be renewed if the items suck. These things are often just not sold to consumers because why would you ever buy consumer-grade garbage if you could buy this?
For something like computers, this looks like how pricey "gAmiNG" laptops look all fancy but fall apart in about the same timeframe as the lower end laptops for students. But if you've ever handled the sort of laptops for banks or businesses, you'd wonder why laptops can't all be like this. People swear by Lenovo ThinkPads for a reason. Dell's Latitude (general productivity) and Precision (has the power of a gaming laptop but far less bs) lines, HP's ProBooks, all feel really nice and last for-fucking-ever. But you can't buy them in a store even if you wanted one.
So the trick is to buy what big organizations are buying, but you likely gotta get 'em secondhand when the orgs are done with them. For basic clothes, mil surplus might be a good port of call, for technology see an e-waste recycler or sometimes government auction (you just gotta know what to ask for; ThinkPads, Dell Latitude/Precision, HP ProBook/ZBook). Otherwise try to thrift old, pre-enshittification items. The blanket I had as a kid is still going strong and has lived to bury multiple sets of newer bedding (which have worn thin and torn). Kitchen items, see if you can shop at a restaurant supplier.
TL;DR: if you need an item, think "which business needs these to function" and see where they 1) buy theirs (suppliers) or 2) sell their old ones (surplus) and buy that. If you can't do that, look at older, pre-enshittification things.
This is an awesome use of what is probably a master's degree if not a doctorate and I am 100% thrilled that she shared it even though it was embarrassing and she squeaked.
this is Hannah Fry, Professor of the Public Understanding of Mathematics at the University of Cambridge and president of the Institute of Mathematics and its Applications.
A new analysis commissioned by The New York Times suggests that Google's AI Overviews are wrong an astonishing percentage of the time.
A recent analysis conducted by the AI startup Oumi at the behest of The New York Times found that the AI-generated summaries, which appear above Google search results, are accurate around 91 percent of the time.
In a sense, that may sound like an impressive figure. But here’s an even more impressive one: five trillion. That’s roughly the number of search queries that Google processes every year, translating to tens of millions of wrong answers that the AI Overviews are providing every hour — and hundreds of thousands every minute, the analysis calculated.
In other words, Google has created a misinformation crisis. Studies have shown that people tend to trust what an AI tells them without question, with one report finding that only 8 percent of users actually double checked an AI’s answer. Another experiment found that users still listened to AI when it gave them the wrong answer nearly 80 percent of the time — a grim trend the researchers dubbed “cognitive surrender.”
Large language models adopt an authoritative tone and can confidently present fabricated information as fact when they can’t immediately glean a straight answer. Add the convenience that Google’s AI Overviews offer, and it’s easy to imagine untold numbers of users taking its summaries at their word.
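The back-of-the-envelope numbers in the excerpt can be reproduced with a few lines. This is only a sketch: it assumes every one of the five trillion yearly searches shows an AI Overview, which makes it an upper bound. The `overview_share` parameter is a made-up knob for relaxing that assumption, not a figure from the analysis.

```python
SEARCHES_PER_YEAR = 5e12   # "five trillion" from the excerpt
ACCURACY = 0.91            # "around 91 percent" from the excerpt
HOURS_PER_YEAR = 365 * 24


def wrong_answers_per_hour(overview_share: float = 1.0) -> float:
    """Estimated wrong AI Overviews per hour.

    overview_share is the assumed fraction of searches that show an
    AI Overview at all; 1.0 is an upper bound, not a measured value.
    """
    wrong_per_year = SEARCHES_PER_YEAR * overview_share * (1 - ACCURACY)
    return wrong_per_year / HOURS_PER_YEAR


if __name__ == "__main__":
    per_hour = wrong_answers_per_hour()
    print(f"about {per_hour:,.0f} wrong answers per hour")
    print(f"about {per_hour / 60:,.0f} wrong answers per minute")
```

With those inputs the result lands in the tens of millions per hour and the hundreds of thousands per minute, matching the orders of magnitude quoted.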
Personally I hate AI because it uses slave labor, is killing the planet and is making people stupid, but that's just me. The soulless art aspect is just one little piece of my grander disdain.
wait how does AI use slave labor? Do you mean the human works that are stolen and not credited or compensated? Because technically under capitalism everything is exploited but there are varying degrees
Aside from the scraping, AI tech companies, including openAI/chatGPT, have outsourced training their models to countries in the global south, specifically Kenya in openAI's case. These workers are working in sweatshop conditions for less than 2 bucks USD per hour. I'm on mobile, but if you search 'openAI Kenya slave labor' and related keywords, you can find multiple articles about it.
Graceful degradation is the ability of a computer, machine, electronic system or network to maintain limited functionality even when a large portion of it has been destroyed or rendered inoperative. The purpose of graceful degradation is to prevent catastrophic failure. (Tech Target, first result on the search engine)
Oh neat the first time I heard of the concept the guy described it to me as "catastrophic functionality".
He was talking about it in the context of designing robots that would go in and stop nuclear reactor meltdowns, something that would 100% destroy the robot, but they would be designed to keep functioning and fighting the meltdown for as long as possible. He had some designs where over 80% of the robot has died and it was functionally dragging its corpse around by its one working arm because one more minute of functionality might save thousands.
I've been having a few bad years mental health wise, and thinking about those robots a lot.
This is also why NASA missions usually keep going so long after schedule. They are *masters* of graceful degradation, able to keep machines limping along on minimal power and after sustaining heavy damage
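In software terms the idea can be sketched roughly like this: list the operating modes from most to least capable, and fall back to the best one the surviving parts still allow. The part names and modes here are invented for illustration and are not taken from any real robot design.

```python
def best_available_mode(working: set[str]) -> str:
    """Pick the most capable mode the surviving parts still allow."""
    # Hypothetical modes, ordered from most to least capable,
    # each paired with the parts it needs in order to run.
    modes = [
        ("full", {"left_arm", "right_arm", "wheels", "camera"}),
        ("reduced", {"right_arm", "wheels"}),
        ("dragging", {"right_arm"}),
    ]
    for name, needed in modes:
        if needed <= working:  # every needed part is still working
            return name
    return "stopped"


if __name__ == "__main__":
    print(best_available_mode({"left_arm", "right_arm", "wheels", "camera"}))
    print(best_available_mode({"right_arm"}))
```

The point of the design is that losing a part moves the system down the list rather than straight to "stopped".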
It is implied through gameplay mechanics that the two kingdoms at war in chess both formally permit regnal polygamy, allowing the King to take at least nine wives of equal status, and that marriage can be bestowed as a military honour, but curiously that honour - in stark contrast to the conventional attitudes of the aristocracy - can only be bestowed upon soldiers of the unlanded class, leaving great questions to be asked of the material interests and stability of these kingdoms' aristocracies
Unless, of course, it is gameplay abstraction that we see soldiers as individuals and we are in fact commanding representations of whole units, in which case the King already has up to nine wives but only one is in command of her own company at the onset of battle, and unlanded regiments are largely self-directed (which would explain their limited effective range, lacking proper leadership coordinated with the objectives of the ruling class) with further members of the aristocracy taking charge of those which excel in the field, presumably in large part so they can take credit for the soldiers' successes and advance their own social and political standing as is often the way of things
Several people in the notes by now have brought up the mobility of the castles, and there is simply no explanation for that I can think of beyond the drift between original and translated forms of the game. The original pieces occupying that role were heavy war chariots, with some later replacing them with war elephants carrying an archers' tower, and when it spread to Europe (a continent notably lacking in elephants) the elephant part began to be omitted and the tower reshaped into a European-style fortification. There is no gameplay justification for its retaining the war elephant's moveset, it is merely a cumulative translation issue.
It is very obviously not a fucking siege tower, like 1) in the majority of cases their designs are round and crenellated in the style of western European post-Norman fortifications, which is not remotely what a siege tower looks like, 2) why the fuck would a siege tower be in a pitched battle such as that depicted by the game, what the fuck do you think they're besieging, and 3) do you think a fucking siege tower can cross a battlefield at speed???
Like please think about what you're saying, the role and construction of a siege tower, for one fucking second, instead of leaping to the first piece of medieval military engineering you can think of that's related to the concept of towers. A siege tower unit is not going to be in the middle of a battlefield actively hunting down mounted cavalry and providing a rearguard screen to the frontline infantry, is it, like come on that is ridiculous and if you think that's a remotely reasonable thing to say you clearly have no fucking clue what a siege tower is actually for; furthermore, if you're in a pitched battle where the king himself is on the field like why are you even gonna bother besieging anywhere nearby, you have an opportunity to end this war here and now (as demonstrated by the rules of the game!!) so why on earth would you bring your expensive, cumbersome siege equipment along instead of leaving it back at camp, if the king retreats (and you can't take him in the rout) then sure then you go and grab your siege equipment but y'know it would just exhaust your troops and put it in harm's way for no reason to drag it into a battle where no siege is taking place.
You may like this amazing web page then: https://ciechanow.ski/mechanical-watch/
It’s a full break down of how a mechanical watch works, every single part including a complication. Every diagram is interactive and beautifully put together. It’s a fantastic read, fun to play with and really informative.
Seconding this recommendation, and in fact this person’s entire blog: he has multiple of these explainers and they’re all good.
In particular, if you’re a programmer who finds floating point numbers to be exotic dark magic, I strongly recommend his Exposing Floating Point post. There are many guides/tutorials on floating point arithmetic out there but IMO this is the best one and makes them really not seem so difficult by the end of it (or at least, all their counterintuitive behavior now makes sense).
I left research science for a multitude of personal reasons but a big one was that my autistic arse just could not keep up with the politics. Trying to increase your profile and your work's appeal for grant money (both critically important factors for succeeding in science) by carefully tailoring your work focus, grant proposals and papers to attract the interest of various grant committees focused on what research is in fashion and also to try to maximise the number of references to your papers in other people's papers takes so much time and just isn't interesting.
People in the notes are all like "yes sometimes The Government or The Lobbyists remove funding from sciences that might reveal things that they do not want known" and yes yes covid and climate change and vaccine denialism and racism and all that, but it's not just the obvious stuff. When I was working in labs, metagenomics was the next big thing, so EVERYTHING was metagenomics. If you worked in environmental or medical microbio, there was an 80% chance that you were taking some mud or poop or something and doing metagenomics on it whether or not it was the best way to analyse what you wanted to analyse, and bam, that was a paper, possibly two, it kept the lights on. And if 100 labs run 100 experiments where they do metagenomics for random things to do with the human body, then at a 95% confidence level, about 5 of them will find a completely nonexistent correlation for any random thing you can think of if they happen to check their data for it, and they need to keep the lights on and the PhD student who's running the experiment needs a paper so even though that wasn't what they were looking for, bam -- everyone's seeing that men and women have different microbiomes in their large intestine! People with autism have different microbiomes, people with MS have different microbiomes, kids who are vaccinated have different microbiomes so watch out! And there's not enough papers to do a review paper for six or seven years, and when that review paper is done only other scientists will read it, because Everyone Knows that vaccines alter your microbiome so stay away from them, and Everyone Knows that men and women are so so different because why else would their large intestines be so different? Look at this paper!
And the psych guys are doing MRIs because that's the Next Big Thing, and any bias culturally relevant enough will get positive papers for it because the 100 labs will check their 100 papers and the 95 that show no difference at 95% confidence are of course never published because there's nothing to say. And the medicine guys are running ten thousand seaweed extract experiments to kill cancer cells, and their confidence intervals do tend to be better, but there is no defense against an industry where the only way to survive is to publish and the only way to get funding is to look into something that the grant committee wants you to find something interesting about, and you'd better find that something interesting even if you kind of have to publish your study before repeating it to check for that confidence interval -- someone else will repeat it, or you will once the paper's out, but you've got to get the paper out. And that paper had better be useful for other scientists to reference, even if you have to change the tone and the implications and choose which results are worth referencing, because getting more references is how you keep your job and how you convince the next grant committee.
Big Bad Government Lies About Covid is a problem, but it's not the problem. A perfect government with a massive science budget does not solve the problem. The system of research is inherently political because it is inherently competitive and because it deals with numbers and statistics that are not intuitive to humans.
Nono this problem also exists outside of capitalism. Any economic system has to distribute resources and any actually useful one has to care if its workers are actually producing something.
Publish or perish is a huge problem. It is rewarding "work", but not the useful work that we want to see. All attempts to make those 95 papers see the light of day have been... less than successful. And the rewards for replicating a paper (or at least as useful, showing that you can't) aren't as large as for producing a piece of slop that shows that there's a correlation between green jelly beans and acne.
In my field (cognitive behavioural psychology, and i guess psychology more generally too) this is known as the "file drawer problem", i.e. the vast majority of non-significant findings are left in some researcher's file drawer never to see the light of day because it's not as publishable as the ones that are significant. It's a key factor in what we call the Replication Crisis, where researchers completely failed to replicate several foundational studies on which like half the field was based. Once the first couple effects couldn't be replicated (social priming, ego depletion, power posing, the marshmallow test, etc.), it caused a massive domino effect that revealed widespread replicability issues across the entire field of psychology. Half the famous pop-psychology stuff that gets passed around is based on studies that can't be replicated (and trust me, people have tried really hard to replicate them, it really is egg on our faces for all the most famous bits of our field turning out to be fake). Effects we all thought were really solid turned out to have been statistical anomalies concealed by the file-drawer problem. Stuff that was published in places like Nature and Science. Prominent researchers were exposed for faking huge amounts of their data. Meta-analysis showed suspicious gaps in the distributions of published p-values, indicating people were lying about marginally-significant results (i.e. if the data gave a p-value between 0.05 and 0.06, people were "rounding down" to 0.05 so it technically met the significance threshold). Evidence emerged of researchers collecting massive amounts of data and arbitrarily including or excluding participants until their data showed significant results, or simply collecting more and more data until they had significant results (p-hacking). Papers got retracted, funding was pulled, people got fired. It was a total fucking mess that almost collapsed the entire field, and it happened because of a couple of factors:
a) you can't "prove" anything from non-significant results (i.e. failing to find evidence of a link between purple jelly beans and acne is not the same thing as finding evidence *against* a link between purple jelly beans and acne) and so you can't really reach any definitive and interesting conclusions. If you find significant evidence of something you can make all kinds of interesting claims and discuss how it fits or conflicts with existing literature, and propose ideas to explore how and why it works like that, etc. If your findings are nonsignificant, the best you can do is be like "idk man, it's weird it didn't work though, right? Maybe our methods were flawed, or maybe our participants are weird, or maybe there just is no effect here - I sure as fuck couldn't say!" and keep it pushing (while hoping people continue to pay you to do this).
b) Academic journals gain their prestige/value from how often their articles are cited (with for-profit, pay walled journals this is also their main source of revenue bc people/institutions will pay them for access to the articles others have cited, etc.), and so they are motivated to publish things people will want to cite lots. People typically cite articles in order to reference a specific finding or theory it shows evidence for. Therefore, papers with non-significant findings are unlikely to be cited nearly as often as those with significant findings (on account of how they generally don't prove anything) and are thus less valuable for the journal to publish. This maintains the journal's reputation for being difficult to get published in and also for publishing only the most high-quality, "impactful" work (bc everyone is citing it, so it must be important!), ergo the prestige thing. Open-access journals and those dedicated to tackling the file drawer problem are available but are far less prestigious, owing to the lack of citations they get.
c) Academics are judged in hiring and grant application processes by how many papers they have published, how well-cited their work is (ditto above), *and* how prestigious the journals they've published in are. This is the "publish or perish" problem yarning mentioned. In my field applicants for academic positions etc. are often ranked by a combination of Journal Impact Factor (average number of citations received within a two-year window by articles published by the "best" journal you have also published in) and h-index (highest number, h, for which your top h most-cited papers have at least h citations. E.g. if you have 20 papers, and the top 7 are all cited at least 7 times but the next most-cited is only cited twice, you would have a h-index of 7). As you can maybe see, this makes resisting the file-drawer problem (and, to an extent, the other far more unethical methods I mentioned above) quite difficult on the part of individual academics, who need to prioritise publishing work that will get cited if they want to compete for jobs and grants. As an early-career researcher with relatively few published papers, pushing to get your non-significant results published is practically career suicide in a lot of fields especially if you're not working on something that gets a lot of funding opportunities to begin with. It can sometimes look worse to hiring/grant committees to have 6 papers with no citations than to have no papers at all bc the default assumption is often that this means your work is low-quality in some way (this perception has started to change, thankfully). It can be a bit easier for more established researchers for whom publishing low-impact papers doesn't really matter so much and who are probably not having to scrabble for funding etc. as much anyway, but they are also less motivated to push for publication in general.
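The h-index definition above is easier to see in code. A minimal Python sketch (the function name is just for illustration), using the post's own example of 20 papers where the top 7 are each cited 7 times and the next is cited twice; the zero counts for the remaining 12 papers are an assumption to fill out the example:

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h

if __name__ == "__main__":
    # 7 papers cited 7 times, one cited twice, the other 12 assumed uncited
    example = [7] * 7 + [2] + [0] * 12
    print(h_index(example))  # 7
    # six papers with no citations at all
    print(h_index([0] * 6))  # 0
```

Note that the 12 extra papers change nothing here: uncited or barely cited work adds to your paper count but not to your h-index, which is part of why publishing non-significant results is such a hard sell.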
There isn't a super easy solution to the problem tbh, and it is only compounded by the pressure Derin mentioned to twist whatever you actually want to study to include whatever the hot new analysis/theory/whatever is favoured by your friendly neighbourhood funding-bodies. If people are paying you to look into The Thing, then you do kind of have to try to get it published if you find it, even if it is a bullshit half-baked addition tacked onto the end of the bit you're actually interested in. And conversely, the people invested enough in The Thing to pay you to study it probably don't particularly care if you publish papers that don't find evidence for it, in fact they would probably rather you didn't. Citations are also an abysmally poor metric of the value of a given piece of work to the field, but we don't really have a better one. It *is* becoming more understood how flawed the system is though, and work on large-scale multi-lab replication studies and meta-analyses is gradually being valued more, which is good to see. More journals are also opting for a pre-publication system, where you submit your introduction, rationale, and planned methodology for approval *before* you collect data and get your results. If your pre-pub is accepted, the journal agrees to publish the work regardless of whether the results themselves are significant, which avoids the file drawer problem while also allowing the journal to maintain certain standards of quality based on the methodology and theoretical foundations. Open access, and especially the proliferation of predatory pay-to-publish journals, has gotten a reputation of letting literally anyone publish whatever they like with very poor peer review processes, which contributes to the lack of prestige associated with publishing outside the more traditional journals. The pre-pub system helps to combat that reputation, which makes it more of an option for early-career researchers. 
Pre-pubs are also often made public in some capacity, which helps to prevent things like p-hacking and retroactively changing hypotheses to fit findings - a solid improvement in transparency imo. Doesn't solve the problem of people potentially straight-up faking their data, but it reduces the incentive to do that stuff if you know you'll get a publication regardless. Funding bodies are springing up whose whole thing is giving grants for replication work across various fields, with expectations that researchers will publish whatever they find, significant or not. There's not a lot of them so far, but at least there are some ways to fund research that aren't so biased towards reinforcing the file drawer problem for whatever the grant committee's pet theory is. It's still a mess, but recent shifts have given me some hope tbh.
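For a sense of why the "keep collecting data until it's significant" version of p-hacking mentioned earlier is such a problem, here is a minimal Python simulation sketch. Everything in it is an assumption for illustration: there is no real effect, the data are normally distributed with a known standard deviation of 1, the test is a simple two-sided z-test, and the researcher checks after every 10 participants up to 100. The function names are hypothetical.

```python
import random
from statistics import NormalDist

def p_value_two_sided(sample: list[float]) -> float:
    """Two-sided z-test p-value for 'mean is 0', assuming a known SD of 1."""
    n = len(sample)
    z = (sum(sample) / n) * n ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def run_study(rng: random.Random, peek: bool, batch: int = 10,
              max_n: int = 100, alpha: float = 0.05) -> bool:
    """Return True if the study ends up 'significant' although no effect exists."""
    sample: list[float] = []
    while len(sample) < max_n:
        sample.extend(rng.gauss(0, 1) for _ in range(batch))
        if peek and p_value_two_sided(sample) < alpha:
            return True  # stop collecting as soon as the result looks significant
    return p_value_two_sided(sample) < alpha

def false_positive_rate(peek: bool, n_studies: int = 2000, seed: int = 0) -> float:
    """Share of simulated null studies that report a significant result."""
    rng = random.Random(seed)
    return sum(run_study(rng, peek) for _ in range(n_studies)) / n_studies

if __name__ == "__main__":
    print("test once at n=100:", false_positive_rate(peek=False))
    print("peek every 10 participants:", false_positive_rate(peek=True))
```

Under these assumptions the test-once rate should sit near the nominal 5%, while the peeking rate comes out noticeably higher. A publicly registered plan that fixes the sample size in advance takes away the option to stop early.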
Side note bc i can't help myself: The whole "absence of evidence is not evidence of absence" thing is a deliberate feature of how hypothesis testing works, not a bug. You predict what will happen *if the theory you are testing is true* (e.g., if our hypothesis, H¹, is that purple jelly beans are associated with acne, we would predict that participants who eat purple jelly beans would develop more acne than those who don't), and then you run statistical tests to see whether this hypothesis explains the data you actually collect better than assuming there is no effect (which we call a null hypothesis or H⁰) within some pre-determined confidence interval, which is usually 95% or 99% for my field but the standard for statistical significance varies a lot. All your basic, garden-variety statistical hypothesis tests (t-tests, ANOVAs, correlations, regressions, etc.) have a built-in assumption that there is no effect and then test for evidence to contradict that assumption. Because of this, it's impossible to tell a false negative, or the presence of a conflicting effect in the opposite direction, from the actual absence of an effect based on one of these tests. The idea is that we would rather accept a false-negative and dismiss an effect that does exist (which others will surely find evidence of on replication if it is actually reliable) than risk a false positive where we believe in an effect that doesn't exist. In some cases, we use another H⁰, depending on the specific setup of the experiment, whether it's reasonable to assume *no* baseline effect vs. an established level of effect, etc. but it always reflects some baseline assumption that we are comparing H¹ to. 
Each time you run these tests you can only check for evidence of one H¹ at a time, so failing to find that evidence doesn't mean H⁰ is true necessarily (it could be any number of other alternate explanations), it just means we haven't found enough evidence to suggest H¹ is sufficiently more likely than H⁰, so we can't be confident H¹ is true. If you are testing multiple different hypotheses, and therefore running the same test multiple times, you are *supposed* to add a statistical correction for multiple comparisons to avoid inflating the family-wise error rate (there are a couple different methods depending on how conservative you want to be with your confidence intervals etc. I won't get into all of that here unless anyone actually wants to know, lol). This doesn't always happen, especially if multiple different papers are analysing the same dataset, since technically each person is only doing the test once, even though overall you do get the same issue of family-wise error rates. Meta-analyses, where you look at the papers that have already been published on a given topic and analyse what the cumulative evidence says, typically do correct for multiple comparisons in some way though, in my experience.
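Since the multiple-comparisons point is quite abstract, here is a minimal Python sketch of what a correction looks like. The post doesn't name specific methods; Bonferroni (more conservative) and Holm (a bit less so) are two standard ones picked here for illustration, and the p-values are made-up numbers, not real data. It also assumes the tests are independent when computing the family-wise error rate.

```python
def family_wise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Reject only p-values at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Holm step-down: test p-values from smallest to largest and
    stop at the first one that misses its (gradually relaxing) threshold."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break
    return reject

if __name__ == "__main__":
    p_values = [0.001, 0.012, 0.02, 0.04, 0.3]  # hypothetical results of 5 tests
    print("uncorrected:", [p <= 0.05 for p in p_values])
    print("bonferroni: ", bonferroni(p_values))
    print("holm:       ", holm(p_values))
    print("FWER for 5 uncorrected tests:", family_wise_error_rate(5))
```

With these made-up numbers, four of the five results look significant uncorrected, Holm keeps two, and Bonferroni keeps one, which is the "how conservative do you want to be" trade-off in miniature.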
I almost made a post about this the other day (unless i actually did and totally forgot) but there’s so many
I was going to make a list, but then i realized this is a good time to share this book
Making Stuff and Doing Things is a whole collection of old punk DIY zines about making and doing just about anything, even things you probably never knew you wanted to do.
Book binding? In there.
Making bowls from old vinyl records? I made a whole ton for my brother’s grad party last year.
Basics of guitar? Making rubber stamps? Silk screening? Composting? Homemade beer, root beer, and wine? Soymilk?? Quill pens??? All in there.
Since it’s more punk, it doesn’t have a ton of the folksy, cottage vibes/hobbies, but it’s all about being resourceful and sustainable, which they both have in common.
If i ever need to do anything I’m not sure of, I double check this book to see if there’s anything in there. It’s one of the only books on diy I’ve ever needed.