Discover Top Posts Tagged with #p-values

Anti-Ivermectin Deception in a Major Medical Journal?

The failure to reach a significant p-value can come from enrolling too few patients in a trial or from selecting an outcome, like COVID death, that's too infrequent to give a significant p-value. Do Big Pharma and JAMA use this against Ivermectin?

In the image above you can see the weakest to the strongest information categories with the strongest, most logically reliable type of study at the top (called Meta Analysis) and the weakest at the bottom (called Expert Opinion / Background Information). Of course meta analysis articles like the one linked here brought a swift response from the powerful gatekeepers who tried to push meta studies…

#Big Pharma bias in JAMA #COVID-19 #JAMA anti-Ivermectin article #Omicron subvariant BA.2 #p-values #statistical significance shenanigans

Significance in statistics

That feeling when your p-value is lower than your alpha, aww yeah! But what does it really mean? It’s one thing to say there is significance and on the surface it means the two things are different “enough” to be considered two things, but I think there’s a simpler way to explain it. So today we’re going to talk about what significance actually means in the practical sense. Maybe it’s super…

#academia #Math #mathematics #p-values #PhD #school #science #significance #statistics #stats #student #grad school #graduate school

Links (8/22/2019)

Links:

The Probability Distribution of p-Values (N. N. Taleb).

The ASA Issues Statement on p-Values.

Statistical Significance (Wikipedia).

(John Brockman of Edge Foundation) Jeffrey Epstein’s Intellectual Enabler by Evgeny Morozov.

Genetically Modified Organisms Risk Global Ruin, Says Black Swan Author.

Bayesian Epistemology (Stanford Encyclopedia of Philosophy).

Markov Chain Monte Carlo (Wikipedia).

Ludic Fallacy (Wikipedia).

#Statistics #p-values #Probability

CAN 8: p-values

We’ve not done a critical appraisal nugget for a while now. No excuses, we’ve just been busy with other stuff, but we need to rectify that now. This week I sat down with Rick Body and talked through some of the issues around the use and abuse of p-values in research, and how that affects how we interpret them in critical appraisal.

Some useful links to stuff we’ve mentioned on the podcast.

#CANs #critical appraisal #FOAMed #p-values

Uncertainty Wednesday: The Problem with P-Values (Learning)

Today’s Uncertainty Wednesday will be the concluding post in my mini-series on the problem with p-values. We have already seen that it is much easier than expected to reject a null hypothesis if you have incentives to do so. We also saw that the ability to work backwards and generate hypotheses from the data is a big issue. Today we will consider a more foundational, epistemological problem with p-values: what is it that we are really learning when we are rejecting a null hypothesis?

Let’s once again consider the original example of a coin toss where our null hypothesis is that the coin is fair (and independent). We have done everything by the book. We had our null hypothesis ahead of time (not generated from the data). We did exactly 6 tosses and they all came up as heads (or tails for that matter), instead of cheating on our data collection. And so with great satisfaction we reject the null hypothesis at a p-value of 0.03125.

But what does that actually mean? What have we really learned from doing so? Our null hypothesis here is incredibly narrow. It is that the coin is precisely fair. Rejecting that leaves open a ton of other possibilities. Is the coin just slightly unfair or is it extremely unfair? Which of these two possibilities is more likely given what we have observed? And why did we pick this narrow null hypothesis in the first place?

Let’s take a step back. Suppose I don’t tell you that we are dealing with a coin, just with a process that has two possible observable signals H and T. If you know nothing else about the process, that allows for anything from observing only Hs to only Ts to some random mix of the two. That makes it clear that having as your null hypothesis that the mix will be random at exactly 50% Hs and 50% Ts is an incredibly narrow assumption. It is picking a single real number, 0.5, on a continuous interval from 0 (no Hs) to 1 (all Hs).

This is related to the issue we encountered previously with spurious correlation. A null hypothesis of zero correlation between two variables is an incredibly narrow assumption, when possible correlation is a continuous interval from -1 to +1. So again, when we reject that narrow hypothesis what have we actually learned? Only that some very narrowly defined assumption is unlikely. That’s not a lot of learning.

This is a fundamental limitation of the p-values approach. Generally people tend to pick very narrow null hypotheses and rejecting them doesn’t tell us much about the alternatives. Now this can be seen as a slightly unfair criticism. If you get a p-value of 0.0000001 on a coin toss and you do it with a large number of tosses you have the information that the coin is likely to be very unfair. But with the p-values approach that additional step tends to be buried.

What is the alternative? The alternative is to take a Bayesian approach instead. We already saw in the case of correlation how that provides a lot more information than the rejection of a null hypothesis.
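As a rough illustration of what that extra information can look like for the six-heads example, here is a minimal sketch. It assumes a uniform prior over the coin's heads probability (called `theta` here) and approximates the posterior on a grid; the prior, the grid size, and the function name `posterior_prob_above` are all illustrative choices, not something taken from the earlier posts.

```python
def posterior_prob_above(threshold: float, heads: int, tails: int,
                         grid: int = 10_000) -> float:
    """P(theta > threshold | data) under an assumed uniform prior.

    The posterior is approximated on a grid of cell midpoints in (0, 1).
    """
    thetas = [(i + 0.5) / grid for i in range(grid)]
    # With a uniform prior, the posterior is proportional to the likelihood.
    weights = [t ** heads * (1 - t) ** tails for t in thetas]
    above = sum(w for t, w in zip(thetas, weights) if t > threshold)
    return above / sum(weights)


# Six tosses, all heads, as in the running example.
p_biased_to_heads = posterior_prob_above(0.5, heads=6, tails=0)
p_very_unfair = posterior_prob_above(0.9, heads=6, tails=0)
print(f"P(theta > 0.5 | HHHHHH) = {p_biased_to_heads:.3f}")
print(f"P(theta > 0.9 | HHHHHH) = {p_very_unfair:.3f}")
```

Under these assumptions the posterior has the closed form 1 - c^7 for the probability that theta exceeds c, so the two numbers come out at roughly 0.99 and 0.52. Unlike a bare rejection, this speaks directly to the "slightly unfair or extremely unfair?" question: six heads make a heads bias very plausible, but leave it close to a coin toss whether the bias is extreme.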

#uncertainty wednesday #p-values #machine learning

Uncertainty Wednesday: The Problem with P-Values (Generating Hypotheses)

Today’s Uncertainty Wednesday continues our exploration into p-values and why they are problematic. Last Wednesday we saw that if you have incentives to reject a null hypothesis, it takes less work than you would initially think to find data that gets you there. I ended that post suggesting that the problem is even bigger than that. How so?

We now live in the age of “big data” -- researchers in many fields have access to massive data sets. This lends itself to an approach that has become known as “data dredging.” Instead of starting with the null hypothesis of a “fair and independent coin” we start with a large database of pre-recorded coin flips. Now we work backwards to find a hypothesis that we can reject with a p-value of 0.05 or maybe even 0.01 in our data set!

How would we do such a thing and what would such a hypothesis look like? Well, with a dataset containing just Hs and Ts we would have to be a bit creative. But we could generate hypotheses that take the form of a probabilistic finite state machine. For instance: the coin first has a probability of 20% H and 80% T; if H, it has a subsequent probability of 70% of H again, but if T, it only has a 10% probability of repeating T. You get the idea. You could write computer code that generates such hypotheses until you find one that you can reject with a really significant p-value in your dataset. Then you go and publish!

Now you might object: Albert, these are completely arbitrary hypotheses, why would anyone believe these? Well, they only come across as arbitrary because I purposely stayed within the domain of a coin flip. But most big datasets are really complex, containing many different variables. Just take the coin flip database and combine it with a database of stock price fluctuations. Now you can test tons of different hypotheses of the form: price movements for stock x are not correlated with the coin flips (where H might be stock price for x moves up and T it moves down).

Again you can have your computer generate these hypotheses for you and test them until you find one you can reject with a p-value that’s deemed significant. These hypotheses are just as arbitrary as the coin state machines I suggested above, but they don’t look that way. They look really simple and thus credible.
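To make the dredging concrete, here is a small sketch of what such a computer search might look like. Everything in it is an assumption made for illustration: the "stocks" are simulated as independent fair up/down moves (so every null hypothesis is true by construction), the sizes `N_FLIPS` and `N_STOCKS` are arbitrary, and the test is a simple exact two-sided test on how often a stock's move agrees with the coin.

```python
import random
from math import comb


def agreement_p_value(n: int, agreements: int) -> float:
    """Exact two-sided p-value for 'stock moves are unrelated to the coin'.

    Under that null, the number of agreements is Binomial(n, 0.5).
    """
    distance = abs(agreements - n / 2)
    extreme = sum(comb(n, k) for k in range(n + 1)
                  if abs(k - n / 2) >= distance)
    return extreme / 2 ** n


rng = random.Random(42)
N_FLIPS = 100     # size of the coin flip database (assumed)
N_STOCKS = 1000   # number of unrelated stocks to dredge through (assumed)

coin = [rng.random() < 0.5 for _ in range(N_FLIPS)]

p_values = []
for _ in range(N_STOCKS):
    # Each simulated stock moves up or down independently of the coin.
    stock_up = [rng.random() < 0.5 for _ in range(N_FLIPS)]
    agreements = sum(c == s for c, s in zip(coin, stock_up))
    p_values.append(agreement_p_value(N_FLIPS, agreements))

significant = sum(p < 0.05 for p in p_values)
print(f"{significant} of {N_STOCKS} unrelated stocks look 'significant' at 0.05")
print(f"smallest p-value found: {min(p_values):.5f}")
```

Even though none of the simulated stocks has anything to do with the coin, a few percent of them should clear the 0.05 bar, and the best of them will look far more impressive than that.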

But this approach completely violates the statistical reasoning behind p-values. That reasoning only applies if you start with the hypothesis and then apply the test. In any large dataset you will always be able to work backwards towards hypotheses that can be rejected *in that dataset*. Just recall the prior posts about spurious correlation.

OK, so that’s pretty bad given that so many people have incentives to find hypotheses they can reject so that they can publish a paper or claim that a product is effective. But next Wednesday we will look into an even more profound problem with p-values.

#uncertainty wednesday #p-values

Uncertainty Wednesday: The Problem with P-Values (Intro)

As promised for some time, today in Uncertainty Wednesday, I will talk about p-values and what makes them so problematic. We will once again look at a super simple example by going back to considering a coin flip. As before we will consider the highest uncertainty explanation which is that the mechanism producing the coin flip produces heads (H) and tails (T) each with probability 0.5 and that each flip is independent.

Now the idea behind p-values is to attempt an argument of reductio ad absurdum but in a setting with uncertainty. We will assume that the explanation is true (this is also called the null hypothesis) and then see if our observations are so unlikely that they amount to a contradiction of our assumption. In the coin example: we will start by assuming that H and T are equally probable, then we will observe a sequence of Hs and Ts and if that sequence is really unlikely given our assumption, then we will reject that assumption.

Now the first question we have to ask ourselves is what does it mean for a sequence to be really unlikely given our assumption? This is an important question because we know that every sequence of a given length is actually equally probable given our assumption of equal probability and independence. What do I mean by that? Let’s take sequences of length 6 for example:

P(HTHHTT) = P(H)*P(T)*P(H)*P(H)*P(T)*P(T) = 0.5*0.5*0.5*0.5*0.5*0.5 = 0.015625
P(TTHHTH) = P(T)*P(T)*P(H)*P(H)*P(T)*P(H) = 0.5*0.5*0.5*0.5*0.5*0.5 = 0.015625
P(HHHHHH) = P(H)*P(H)*P(H)*P(H)*P(H)*P(H) = 0.5*0.5*0.5*0.5*0.5*0.5 = 0.015625

And we see that each of them is equally likely (or unlikely) given our assumptions. So this would not seem to help us much at all!

So how could we distinguish between these sequences? The idea is to compute a statistic, i.e. a condensation of the data. The statistic we might be most interested in here is the sample mean. Let’s say we take H=1 and T=0, then the sample means are as follows:

Mean(HTHHTT) = (1+0+1+1+0+0) / 6 = 3/6 = 0.5
Mean(TTHHTH) = (0+0+1+1+0+1) / 6 = 3/6 = 0.5
Mean(HHHHHH) = (1+1+1+1+1+1) / 6 = 6/6 = 1

Now we are getting somewhere. There are many sequences that will give us a mean of 0.5 or close to it. There are only 2 sequences that give the most lopsided result possible: HHHHHH (mean 1) and TTTTTT (mean 0).
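A quick way to check these counts is to enumerate all 64 sequences of length 6 and group them by their number of heads. This is only a sketch; the names `N_TOSSES` and `heads_counts` are made up for the example.

```python
from collections import Counter
from itertools import product

N_TOSSES = 6  # sequence length used in the example

# Number of sequences for each possible count of heads (H=1, T=0).
heads_counts = Counter(
    seq.count("H") for seq in product("HT", repeat=N_TOSSES)
)

for heads in sorted(heads_counts):
    mean = heads / N_TOSSES
    print(f"mean = {mean:.3f}: {heads_counts[heads]:2d} of "
          f"{2 ** N_TOSSES} sequences")
```

Twenty of the 64 sequences have a mean of exactly 0.5, while a mean of 0 and a mean of 1 are each produced by a single sequence.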

The p-value then is defined as the probability, given our explanation (aka assumption, aka null hypothesis), of a sample statistic at least as extreme as the one we observed. So in our example the p-value of observing mean = 1, counting the equally extreme mean = 0 as well, is

P(HHHHHH) + P(TTTTTT) = 0.015625 + 0.015625 = 0.03125

Finding a sample mean of 1 given our assumptions has a p-value of 0.03125. That is less than 0.05 which is often used as the cutoff in many studies across fields as diverse as medicine and education. Following that approach we would thus reject our explanation that both H and T are equally probable.
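The same calculation can be written for any number of tosses. The sketch below assumes a two-sided test in which "extreme" means a head count at least as far from half the tosses as the one observed; the function name `two_sided_p_value` is just a label for this example.

```python
from math import comb


def two_sided_p_value(n_tosses: int, n_heads: int) -> float:
    """Probability, under a fair and independent coin, of a head count
    at least as far from n_tosses / 2 as the observed one."""
    observed_distance = abs(n_heads - n_tosses / 2)
    extreme = sum(
        comb(n_tosses, k)
        for k in range(n_tosses + 1)
        if abs(k - n_tosses / 2) >= observed_distance
    )
    return extreme / 2 ** n_tosses


print(two_sided_p_value(6, 6))  # 0.03125
print(two_sided_p_value(5, 5))  # 0.0625
```

Note that five identical tosses would not have been enough: 0.0625 is above the 0.05 cutoff, so six is the shortest all-heads (or all-tails) run that gets a rejection.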

Now all of this sounds super logical. There doesn’t seem to be some obvious error of reasoning. And yet the use of p-values is wildly problematic. Over the next few posts we will explore why.

As “homework,” you might consider the following scenario: you are a researcher who gets paid only if you reject the explanation of equal probability with a p-value cutoff of 0.05. How much work do you have to do to come up with a sequence of observations that gets you the desired result?
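One possible reading of the homework, sketched as a simulation: the researcher keeps running fresh 6-toss experiments with a perfectly fair coin and only reports the first one that comes up all heads or all tails. This is an assumption about how the researcher behaves, and the function name and number of simulated researchers are arbitrary.

```python
import random


def experiments_until_rejection(rng: random.Random, n_tosses: int = 6) -> int:
    """Repeat n_tosses fair flips until they all land the same way.

    Returns how many experiments were run, including the 'successful' one.
    """
    count = 0
    while True:
        count += 1
        tosses = [rng.random() < 0.5 for _ in range(n_tosses)]
        if all(tosses) or not any(tosses):
            return count


rng = random.Random(0)
runs = [experiments_until_rejection(rng) for _ in range(20_000)]
average_runs = sum(runs) / len(runs)
print(f"average experiments needed: {average_runs:.1f}")
```

Since each experiment rejects with probability 0.03125, the expected number of experiments is 1 / 0.03125 = 32, and the simulated average should land close to that.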

#uncertainty wednesday #p-values

Interpreting uninterpretable P-values.

Lately, I’ve been trying to learn more about open science and how it relates to research I’ve done, research I’d like to do, and how it relates to sociolinguistics in general. One topic that comes up regularly when talking about open science is pre-registration. For those who aren’t familiar with this process, pre-registration refers to publishing a detailed, time-stamped description of your…

#data fishing #open science #p-hacking #p-values #pre-registration #research design
