Machine learning (ML) has revolutionized various fields, from healthcare to finance. However, many ML algorithms are often seen as black box

Trust your gut, boost your memory, de-bias your decision making… can we train our brains to perform better?
I’ve been to a lecture and I’ve fallen asleep in the middle of it. This feels surprisingly good. I remember when I was just cutting my teeth in math I would come to the evening lectures (they started at 17:30 and the subsequent seminars often lasted into the wee hours), begin to take notes, and fall asleep.
Somehow the material presented, what little I managed to perceive, would seep into my mind and settle much easier than what I studied in the morning. I would go through my notes later, then the lecturer’s notes, but I wouldn’t discuss it with my peers, something I’d regret even then, and something I regret much more now. I felt awkward and unsafe, I wouldn’t know how to approach them. I was ashamed of understanding so little, and I would struggle with that sense of shame holding me back from learning more.
I’m writing this down now before going through the lecture notes. I didn’t take any this time, but the lecturer had a superb Jupyter Notebook with everything laid out in plain English and the math plainer still. I’m writing this down to get out of the nauseating whirlwind of thoughts and emotions and obligations tugging me every which way and into the proper frame of mind. I’m sitting on a spacious balcony, the sun setting behind the mountain outside, sliding down into the sea, which I can’t quite see behind all the trees and houses.
I’ve set up the place for myself to study and hope to get to it every day, or /most/ days at least. Habits are powerful and this one has almost ground to a halt.
The lecture notes are out at https://github.com/QuantScientist/deep-ml-meetups/blob/9c52884276d7a01abef6081bd7215072a8544595/RL-2017/BayesianLearning.ipynb and I’m going to go through them right now and note some of the questions I have. Hopefully this will get me onto the right track.
CultivateLabs
I enjoyed John Horgan's piece on Bayes Theorem for Scientific American. Bayes Theorem and Bayesian reasoning are highly applicable when thinking about forecasting and prediction markets; indeed, one prediction market built a Bayes Net into its platform. In this post I'll explain what Bayesian reasoning is, why it matters to prediction markets, and give a concrete (but semi-fictitious) example of how it's applied.
The logic of Bayesian reasoning is basically as follows:
Assign each of your beliefs a probability value somewhere between 0 and 1
Upon encountering new evidence supporting or opposing a belief, adjust your beliefs in the direction that the new evidence suggests
That's basically it. Says Horgan: "It reminds me of the theory of evolution, another idea that seems tautologically simple or dauntingly deep, depending on how you view it." (A couple of other ideas I might slide into Horgan's description as both simple and deep: the Coase Theorem and the Nash Equilibrium.)
Bayes Theorem provides precise math for how much to adjust beliefs. It's worth studying the math in some depth, because it leads to some counter-intuitive results, but for now I'll just use approximations.
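To make the "how much to adjust" step concrete, here is a minimal sketch of a single Bayesian update in Python. The function name `bayes_update` and all of the numbers are made up for illustration; they are assumptions, not figures from Horgan's piece or from any prediction market.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(belief | evidence) using Bayes' theorem.

    prior               -- P(belief) before seeing the evidence
    p_evidence_if_true  -- P(evidence | belief is true)
    p_evidence_if_false -- P(evidence | belief is false)
    """
    numerator = p_evidence_if_true * prior
    # Total probability of seeing the evidence at all.
    marginal = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / marginal


# Hypothetical numbers: start out 30% confident in a belief, then see
# evidence that is twice as likely if the belief is true (0.8 vs 0.4).
belief = bayes_update(0.30, 0.80, 0.40)
print(round(belief, 3))  # 0.462
```

Evidence that is equally likely either way (say 0.5 and 0.5) leaves the belief where it was, which is the "adjust in the direction the evidence suggests" rule in its limiting case.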
Bayes' theorem for the masses. A really good and careful explanation of the basics of Bayesian reasoning.
Irrationality in Economics, and "Subjective Probability"
I gave this talk several years ago, but you know what? It's still pretty decent.
Irrationality in Economics
The title is misleading. Like many of my titles, it's meant to grab attention rather than be exactly correct.
I was trying, with this talk, to convince college freshmen to switch from Philosophy to Economics. And you know, Philosophers are always talking about Rationality -- is there even such a thing, and if so what does it consist of? Econ provides more than one concrete prescription for Rationality -- more on that below.
“We are recorders and reporters of the facts--not judges of the behavior we describe.” --Alfred Kinsey
I actually think that economists and psychologists could do more to prescribe healthy, effective behaviours and thought-strategies for people to follow. But the recommendations should be based on empirics, e.g.
"buy experiential goods, not durable goods";
"purchase with cash instead of plastic";
"beware these 4 common investing mistakes made by novices";
"put crisps and fudge in a drawer, not in plain sight"
--not on a general model of "optimal" behaviour.
Theorists, though, don't have the necessary understanding to make normative evaluations. Not yet, at least. But they can approach the deep Utility Theory questions in the spirit of the above quotation. They can model behaviours and thoughts, and inquire as to how they are internally structured -- without the prejudice of inherited mathematical aesthetics.
What do I mean by 'inherited aesthetics' ? One example is substituting the mathematics of probability for a separate theory of human figuring.
I SHOULD HAVE SAID IT LIKE THIS IN THE SLIDES
One parsimonious shortcut economists tried, which didn't work out, was to use probability mathematics to explain how people think about the future. If we can conceive of people's beliefs as mathematical probabilities, then regular microeconomics + more maths = a new, better theory of behaviour.
For example, curved preferences over wealth would manifest themselves in probabilistic situations such as lotteries, insurance, betting, investing, employment in risky jobs, and love & sex risks.
But. People don't think that way. They don't make accurate calculations about Poisson distributions, Beta distributions, Bayesian priors, Aumann agreement theorems, and so on. I guess evolution either built us for something different or else we're just misshapen clay with limited resources to Bayes our way to rationality.
I speculate that the way people think about probability -- dubbed "subjective probability" by Leonard Savage -- is shaped very differently from what mathematicians usually consider "natural" axioms -- transitivity, commutativity, reflexivity, independence of irrelevant alternatives, monotonicity, and so on. But who knows? The correct theory doesn't exist yet.
NOT ACTUALLY IRRATIONAL
The word "irrationality" I definitely ab-used.
Economists come up with a theory of how people behave and say it's "ideal" or "rational". People don't actually think like that, so then we say they're "irrational"? That doesn't make sense. The theory was just wrong; an incorrect description. They perform sub-optimally according to some guy's theory of the world, of their value system, and of how they should think. But since we don't really know how people really think, how they experience the results of their choices, or how we should evaluate discrepant self-reports of how good a decision was, we can't say what's rational.
Likewise, although it took the Ellsberg Paradox, the Allais Paradox, and other results to disprove the accepted theory, which naïvely united Probability and Utility, those results are not the point. The point is that we have to conceive a more realistic model of people's mental models before Economics can draw valid conclusions about what people "should" do.
Null hypothesis testing is voodoo.
Changes in the mental state of the experimenter should not affect the objective inference of the experiment. An argument for using Bayesian data analysis instead of H0 vs Ha.
Imagine you have a scintillating hypothesis about the effect of some different treatments on a metric dependent variable. You collect some data (carefully insulated from your hopes about differences between groups) and compute a t statistic for two of the groups. The computer program that tells you the value of t also tells you the value of p, which is the probability of getting a t at least that extreme by chance under the null hypothesis.
You want the p value to be less than 5%, so that you can reject the null hypothesis and declare that your observed effect is significant.
What is wrong with that procedure? Notice the seemingly innocuous step from t to p. The p value, on which your entire claim to significance rests, is conjured by the computer program with an assumption about your intentions when you ran the experiment. The computer assumes you intended, in advance, to fix the sample sizes in the groups.
In a little more detail, and this is important to understand, the computer figures out the probability that your t value could have occurred from the null hypothesis if the intended experiment was replicated many, many times. The null hypothesis sets the two underlying populations as normal populations with identical means and variances. If your data happen to have six scores per group, then, in every simulated replication of the experiment, the computer randomly samples exactly six data values from each underlying population, and computes the t value for that random sample. Usually t is nearly zero, because the sample comes from a null hypothesis population in which there is zero difference between groups. By chance, however, sometimes the sample t value will be fairly far above or below zero.
The computer does a bizillion simulated replications of the experiment. The top panel of Figure 1 shows a histogram of the bizillion t values. According to the decision policy of NHST, we decide that the null hypothesis is rejectable by an actually observed t_obs value if the probability that the null hypothesis generates a value as extreme or more extreme is very small, say p < 0.05. The arrow in Figure 1 marks the critical value t_crit at which the probability of getting a t value more extreme is 5%. We reject the null hypothesis if |t_obs| > t_crit. In this case, when N = 6 is fixed for both groups, t_crit = 2.23. This is the critical value shown in standard textbook t tables, for a two-tailed t-test with 10 degrees of freedom.
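The replication procedure described above can be sketched in a few lines of Python. This is an illustrative simulation, not Kruschke's code: the function names, the seed, the pooled-variance form of t, and the choice of 20,000 replications (rather than a bizillion) are all assumptions made here.

```python
import math
import random


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for two equal-sized groups."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
    pooled = (var_a + var_b) / 2
    return (mean_a - mean_b) / math.sqrt(pooled * 2 / n)


def simulate_null_t(n_per_group, replications, rng):
    """Draw both groups from the same normal population and collect t values."""
    ts = []
    for _ in range(replications):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        ts.append(t_statistic(a, b))
    return ts


def critical_value(ts, alpha=0.05):
    """Two-tailed cutoff: the |t| exceeded by roughly a fraction alpha of values."""
    abs_ts = sorted(abs(t) for t in ts)
    return abs_ts[int((1 - alpha) * len(abs_ts))]


rng = random.Random(0)  # fixed seed so the run is repeatable
ts = simulate_null_t(n_per_group=6, replications=20000, rng=rng)
t_crit = critical_value(ts)
print(f"simulated t_crit with N = 6 fixed: {t_crit:.2f}")
```

With N fixed at six per group the simulated cutoff should land close to the textbook 2.23. The point of the sketch is that `n_per_group=6` is hard-coded into every replication; that is the "intention" the p value silently depends on.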
In computing p, the computer assumes that you did not intend to collect data for some time period and then stop; you did not intend to collect more or less data based on an analysis of the early results; you did not intend to have any lost data replaced by additional collection. Moreover, you did not intend to run any other conditions ever again, or compare your data with any other conditions. If you had any of these other intentions, or if the analyst believes you had any of these other intentions, the p value can change dramatically.
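One of the intentions listed above, collecting more data based on an analysis of the early results, can be illustrated with a small simulation. This is a sketch under stated assumptions, not an example from Kruschke's article: the experimenter is assumed to test at N = 6 per group and, if the result is not significant, to add six more per group and test once more. The cutoffs 2.228 and 2.074 are the standard two-tailed 5% values for 10 and 22 degrees of freedom; the names, seed, and replication count are made up here.

```python
import math
import random

# Textbook two-tailed 5% critical values, keyed by group size
# (N = 6 gives 10 degrees of freedom, N = 12 gives 22).
T_CRIT = {6: 2.228, 12: 2.074}


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for two equal-sized groups."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
    return (mean_a - mean_b) / math.sqrt((var_a + var_b) / n)


def false_alarm_rate(peek, replications, rng):
    """Share of null-hypothesis experiments that get declared significant."""
    hits = 0
    for _ in range(replications):
        a = [rng.gauss(0, 1) for _ in range(6)]
        b = [rng.gauss(0, 1) for _ in range(6)]
        significant = abs(t_statistic(a, b)) > T_CRIT[6]
        if peek and not significant:
            # Hypothetical intention: not significant yet, so collect six
            # more scores per group and test again with the df = 22 cutoff.
            a += [rng.gauss(0, 1) for _ in range(6)]
            b += [rng.gauss(0, 1) for _ in range(6)]
            significant = abs(t_statistic(a, b)) > T_CRIT[12]
        hits += significant
    return hits / replications


rng = random.Random(1)  # fixed seed so the run is repeatable
fixed_rate = false_alarm_rate(peek=False, replications=20000, rng=rng)
peek_rate = false_alarm_rate(peek=True, replications=20000, rng=rng)
print(f"fixed N: {fixed_rate:.3f}   with one peek: {peek_rate:.3f}")
```

Both runs use the same data-generating process and the same textbook cutoffs; only the stopping intention differs. The fixed-N rate should sit near the nominal 5%, while the peeking rate comes out noticeably higher, so the same observed t deserves a different p depending on which plan was in force.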
AUTHOR: John Kruschke. The Road to Null Hypothesis Testing is Paved with Good Intentions.