Plan (2021-12-07)
Theory: It looks like this proof is going to be a classic Horrible Analysis Proof. Fuck.
Applied: I got my gumbel softmax VAE running, now I have to run endless experiments
@functional-analysis-done-okay
Plan (2021-11-29)
Theory: I have one major element of the proof left: showing that if two probability distributions are a certain distance apart, they must have [desired property it’s complicated]. I accidentally already showed the reverse direction (for distributions with bounded support, at least), because I confused myself.
Applied: Hmm. I still need to implement a gumbel softmax VAE, where by “implement” I mean “copy from github and get it to run”. I implemented a trivial environment and I’m not really seeing learning there, so clearly something is going wrong in the training alg.
Learning: Against my better judgment, I’m reading Kallenberg. Or rather I’m reading Kallenberg + corresponding chapters of Cinlar, because I’m not completely insane. Also a bunch of optimal transport stuff because Wasserstein distance.
I’m about 90% confident now that I will finish the entire proof. I’ve at least finished the first part, showing that conditional independence can be tested in a polynomial number of samples (if you make enough assumptions, that is; conditional independence testing is impossible in full generality).
I think I’m actually going to get the result I want. Finally. This is nice but it also makes me very unsure about continuing with grad school. It took me so long. So long that I’m embarrassed to say how long it took me.
Anyway I think this is evidence that I’m not cut out for this and I should consider dropping out.
The whole time I’ve been working on this proof, I’ve had it in the back of my mind that maybe I could just prove a bunch of general results and skip over all the details. And I kept saying “oh, no, that’s a mistake, I’ll end up wasting a ton of time looking for a general version and I should just prove it directly”.
... and now I’ve done most of the direct proof, realized the general version, proved that and rewritten my proofs using the general thing.
I’m sure some of you are thinking “oh, well, you needed to do the direct versions in order to see the general version”. Maybe. But I’d be a lot more confident of that if I’d tried and failed to find the general version first.
Plan (2021-08-31)
Math: I’m closing in on the final result, very excited about it. I need to convert an existing result to my paradigm, prove an additional novel result, and then I think I should have everything modulo some cleanup
Code: There’s one more code base I want to try adapting before it’s back to the drawing board. I really hope it works but I’m worried it will not.
Plan (2021-08-16)
Theory: There’s one or two major steps left in the proof. One of them should be easy, I’m just modifying an existing proof, but no one has written that proof down so instead I’m looking at a much more complicated proof and trying to adapt that.
Applied: Well, I am at least safe from being scooped because absolutely everyone else who’s looked at this problem has failed. Working with [redacted] is helpful, she’s much better at neural net engineering than I am
Learning: Still focusing on schemes and riemannian geometry
So there's lots of times when you need to learn everything actually useful on your own outside of official classes, eg cs you need to do some projects to learn to code and classes are insufficient.
Is US math grad school like that? Specifically the divide between knowing baby algebra/topology/ a little homology/homotopy and being like capable of doing algebraic topology research? Or the like for other areas of math.
Because the gap seems very wide from here and it would suck if I discovered in two years that I was supposed to have bridged it largely on my own.
My experience (which is in the cs department, not the math one, but I expect it to be similar) is yes. There are advanced prob/stats/analysis courses available, but the particulars for my research are not covered and it was really just down to me to read papers and relevant textbooks.
Is the expectation that your advisor points you to the textbooks and papers, or that you independently find the papers/texts and the time for it?
Neither/both. Ideally your supervisor will point you towards interesting things, but it’s expected that you will be going out and actively looking yourself as well.
My research project is outside either of my supervisors’ fields of expertise, so I think I probably do more self-directed work than is usual, but I would expect to do at least some no matter what, especially for learning the prerequisites.
Plan (2021-07-19)
Theory: Still haven’t figured out the blocked part of the proof. Specifically there’s a claim that a particular estimator produces a hypothesis test for conditional independence with particular power, but I honestly can’t see it (it’s certainly in the realm, you’re estimating a thing that is 0 if conditional independence holds, but it’s a continuous variable so I don’t see how we can get non-zero measure on it actually being zero)
Code: Doing some devops, mostly have things set up nicely. Still need to clean the actual code up and refactor it to fit into my overall framework.
Learning: I’m going to start recording some of the things I’m focusing on learning. Mostly this is stuff that sits in between direct research relevance (which would go under theory) and just for fun (which I might post about anyway)
Anyway this week I’m going to read through some literature on neural architecture search, and finish going through Ideals, Varieties, and Algorithms.
Code: several of the papers I was looking at turned out to be useless (solving the wrong problem), so getting them running was a big waste of time
I got through a bunch of NAS and most of Ideals, Varieties, and Algorithms.
Plan 2021-07-06:
Overall I’m v happy with progress on theory and code
Theory: Have most of the main proof done. Need to send an email to [the scary prof] with a question about his field. Need to write up the last bit of the proof for the noise terms. Need to formalize everything.
Code: I picked up a collaborator (my girlfriend who wants to publish RL papers). She’s way more motivated about writing ML code than I am, which is extremely helpful because motivation is my biggest block by far. Also helpful because it forces me to be more organized since now I have another coder to direct. We’re working through some reimplementations of relevant work. Also she has a 3090* which is extremely useful for experiments.
* for the pure math people a 3090 is more or less the most powerful consumer GPU you can get, and there’s a major shortage of GPUs right now so I haven’t upgraded my own.
I finished the proof and sent the scary email! I still need to show that a particular term is (or is bounded by) a polynomial, but I’m reasonably confident that’s true and should be easy.
I’m somewhat blocked on the response from the scary prof now, so I’m just going to be doing code for a while.
Speaking of algebra, this is maybe quite nice for pedagogy: https://nathancarter.github.io/group-explorer/index.html
(It’s a website explaining many of the common finite groups, defining and visualizing them in various ways)
i miss algebra
Plan (2021-05-18):
Theory: Need to prove bounds on the error terms
Applied: still feeling very averse to writing code, but it got a little better this week.
Slutsky’s Theorem
This is a nice theorem I learned recently and I’m surprised I never learned it before.
If $X_n$ is a sequence of random variables converging in distribution to $X$, and $Y_n$ is a sequence of random variables converging in probability to a constant $c$, then:
$X_n + Y_n$ converges in distribution to $X + c$
$X_n Y_n$ converges in distribution to $cX$
$X_n / Y_n$ converges in distribution to $X / c$ (provided $c \neq 0$)
Yeah.
Like, you know the sample mean is approximately normal for large samples, by the central limit theorem.
But what about the t statistic: sample mean over estimated standard error?
By the third statement above, that also approaches a normal distribution. Thus the t test and z test will give the same answers for large samples.
z statistic:
M / (σ/√n)
where M is the sample mean. Approaches a standard normal distribution (when the true mean is 0), by the central limit theorem.
t statistic:
M / (S/√n)
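where S is the sample standard deviation. S converges in probability to σ, so σ/S converges in probability to 1. The t statistic is just the z statistic times σ/S, so by Slutsky’s theorem it approaches the same distribution as the z statistic.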
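If you want to see this numerically, here is a rough simulation sketch. None of it is from a real experiment: the choice of Uniform(-1, 1) data (true mean 0, σ = 1/√3), the 1.96 cutoff, and the helper names `z_and_t` and `rejection_rates` are all assumptions made up for illustration. It computes both statistics on the same samples and counts how often each exceeds the cutoff in absolute value.

```python
import math
import random
import statistics


def z_and_t(sample, sigma):
    """Return (z, t) for a sample whose true mean is assumed to be 0."""
    n = len(sample)
    m = statistics.fmean(sample)       # sample mean M
    s = statistics.stdev(sample)       # sample standard deviation S
    z = m / (sigma / math.sqrt(n))     # uses the true sigma
    t = m / (s / math.sqrt(n))         # uses the estimate S instead
    return z, t


def rejection_rates(n, trials=2000, seed=0, cutoff=1.96):
    """Fraction of trials where |z| and |t| exceed the cutoff."""
    rng = random.Random(seed)
    sigma = 1 / math.sqrt(3)           # standard deviation of Uniform(-1, 1)
    z_rej = t_rej = 0
    for _ in range(trials):
        sample = [rng.uniform(-1, 1) for _ in range(n)]
        z, t = z_and_t(sample, sigma)
        z_rej += abs(z) > cutoff
        t_rej += abs(t) > cutoff
    return z_rej / trials, t_rej / trials


if __name__ == "__main__":
    for n in (5, 30, 300):
        z_rate, t_rate = rejection_rates(n)
        print(f"n={n}: z rejects {z_rate:.3f}, t rejects {t_rate:.3f}")
```

For small n the two rates should differ noticeably, since S is a noisy estimate of σ and the normal cutoff is wrong for t. As n grows they should both settle near 0.05, which is what Slutsky’s theorem predicts.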