An alloposium in the classical sense, with no guests and something something primes
You’ll recall in the last post the mention of a bit of bourbon-soaked insanity of trying to write up a proof of the PNT that would have been accessible to the ancient Greeks. Well, the first, and I’m sure only hurdle to get over will be defining it. After all, let’s consider the simplest and most accessible statement: the prime counting function approaches its parameter over the parameter’s natural logarithm as the parameter goes to infinity, that is to say, the ratio of the parameter to the product of the prime counting function times the logarithm of the parameter goes to one as the parameter goes to infinity, that is to say, for every epsilon there’s a threshold such that anywhere above that threshold you’re within epsilon of one. Your hippie friend might tell you that the Greeks couldn’t comprehend infinity, but in fact they just didn’t play with it as readily as we do; we just need to make explicit the error bounds that would more typically be implied. More obviously there’s the natural logarithm. Consider the harmonic numbers, which differ from the natural logarithm by at most one, so since the numerator rises linearly, if the one ratio approaches one the other will approach one. And while harmonic numbers weren’t really a thing then, the harmonic mean certainly was.
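For a rough feel of the two ratios in play, here is a minimal sketch that computes the prime counting function with a plain sieve and compares the ratio built on the natural logarithm with the one built on harmonic numbers. The helper names (`prime_count_table`, `harmonic`, `pnt_ratios`) are made up for this illustration, and floating point stands in for anything a Greek would have recognized.

```python
import math

def prime_count_table(limit):
    """Sieve of Eratosthenes; table[n] is the number of primes <= n."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    table, running = [], 0
    for flag in is_prime:
        running += flag
        table.append(running)
    return table

def harmonic(n):
    """n-th harmonic number 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

def pnt_ratios(x, table):
    """Return x / (pi(x) * ln x) and x / (pi(x) * H_x)."""
    pi_x = table[x]
    return x / (pi_x * math.log(x)), x / (pi_x * harmonic(x))

if __name__ == "__main__":
    table = prime_count_table(10 ** 5)
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, pnt_ratios(x, table))
```

Both ratios creep toward one very slowly, which is about what you'd expect when the gap between the logarithm and the harmonic number is bounded but the logarithm itself barely moves.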
So anyway, first, the Euler product. Remember the three definitions I gave of the zeta function last post? I guess it’s time to show the equivalence of the first and third, and that’s the Euler product, which is essentially a modern version of the sieve of Eratosthenes. If you start with expression 1 of the zeta function, and multiply the denominator of each term by two to the parameter, then subtract that from what you had, essentially multiplying by one minus the reciprocal of two to the parameter, what’ll be left is only odd numbers. Do the same with that new series but for three to the parameter, and you’ll be left with only odd numbers not divisible by three. Then five, seven, eleven, and so on… as you keep multiplying by the unit minus the reciprocal of each prime to the parameter, you’ll get closer and closer to one as more and more terms are sieved out. Now, the “canonical” source for the sieve of Eratosthenes, a book on the metaphysics of mathematics somewhere between Spirit Science and this blog, comes about three hundred years after he lived and presents it without formal proof. In fact, what formal proofs it does contain are mostly cribbed from Euclid (often badly; in fact, the vaunted Greek rigor Dodgson was harder for than pretrimmable trim is seen in so few of those Greek writers from the epoch of pretrimmable cock that it’ll be easier to just pretend there’s no one but Euclid), so we’ll just prove it ourselves.
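The sieving step can be watched directly on the coefficients of the series. The sketch below tracks, for each index up to a cutoff, the coefficient sitting in front of the reciprocal of that index to the parameter; multiplying by one minus the reciprocal of a prime to the parameter subtracts a shifted copy of the series. The function names and the cutoff are illustrative assumptions, and truncating at a cutoff is exact here because a term can only be affected by terms of smaller index.

```python
def multiply_by_one_minus_p(coeffs, p):
    """coeffs[n] is the coefficient of 1/n^s (index 0 is unused).

    Returns the coefficients of the series times (1 - p^(-s)),
    truncated to the same length.
    """
    out = list(coeffs)
    for n in range(p, len(coeffs), p):
        # The shifted copy contributes the old coefficient of n/p at index n.
        out[n] = coeffs[n] - coeffs[n // p]
    return out

def sieve_zeta(limit, primes):
    """Start from zeta (all coefficients one) and sieve by each prime in turn."""
    coeffs = [0] + [1] * limit
    for p in primes:
        coeffs = multiply_by_one_minus_p(coeffs, p)
    return [n for n in range(1, limit + 1) if coeffs[n] != 0]

if __name__ == "__main__":
    print(sieve_zeta(30, [2]))
    print(sieve_zeta(30, [2, 3]))
    print(sieve_zeta(30, [2, 3, 5]))
```

After each prime the surviving indices are exactly the ones Eratosthenes would have left standing, plus the leading one that the product is converging to.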
(Oh, yeah, and for the second one, if at that first step instead of multiplying the original series by one half to the power of the parameter, you multiply by one half to the power of one less than the parameter, you get an alternating series, which converges for any parameter of real part strictly greater than zero, not just greater than one. Since that multiplier will be nonzero for any real parameter not exactly one (off the real line it also vanishes wherever two to the parameter is exactly two), you can just divide it back out to get another expression of the zeta function.)
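As a quick numerical sanity check of that parenthetical, here is a minimal sketch that sums the alternating series and divides the multiplier back out. The names `eta_partial` and `zeta_from_eta` and the default number of terms are assumptions for the example; the comparison value at a parameter of two is the familiar pi squared over six.

```python
import math

def eta_partial(s, terms):
    """Partial sum of the alternating series 1 - 2^-s + 3^-s - 4^-s + ..."""
    return math.fsum((-1) ** (n + 1) / n ** s for n in range(1, terms + 1))

def zeta_from_eta(s, terms=200_000):
    """Divide the alternating series by (1 - 2^(1-s)) to recover zeta(s)."""
    return eta_partial(s, terms) / (1 - 2 ** (1 - s))

if __name__ == "__main__":
    print(zeta_from_eta(2.0), math.pi ** 2 / 6)
```

Closer to zero the raw partial sums converge painfully slowly, so this sketch is only meant for parameters comfortably greater than one.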
So for the “Greek” version, the first step has to be to actually show that the sum of the reciprocals to some power strictly greater than one actually converges. Well, since we’re dealing with constructible numbers, let’s restrict it to a number that’s one plus the reciprocal of a power of two. So now you use binomial series, and since it’s so restricted, you can go through the binomial theorem for natural numbers, as first put forth in Chandah Sutra, the famous text on poetic meter written most likely some time before Ashoka. Indeed, Euclid himself starts the ball rolling in the West, with proposition II.4, which iterated, can get you there. Well, if we can do a version of the integral test without limits, we can prove that the sum is between the chosen power of two and one greater. Then, from the fact that the sum up to a sufficiently large prime is less than one off from the square of that prime, we can sieve away to leave a difference less than one. Of course, no limits means no integral test, so how to do it? There’s something called a “telescoping series,” where the terms are expressed as differences that cancel, making the partials, and therefore the ultimate sum, relatively easy to work out. Consider a series where the terms are the power of two in question times the difference between the reciprocals of the index to the reciprocal of that power of two (the result, that is, not just the exponent) and the index’s successor to the same; add it up from one and you get the common multiplier that’s the power of two, so if it can be shown that each term is greater than the reciprocal of the successor to the index to the power of the input to the zeta function then you get the upper bound of one greater than the power of two in question. You can do this by pulling out the successor, so that for the subtrahend you just get one, and for the minuend the ratio of the successor to the index, said ratio taken to the power of the same reciprocal.
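The sandwich the telescoping argument relies on can at least be spot-checked numerically. This is a sketch only: it uses floating-point roots rather than anything constructible, and the names `telescoping_term`, `check_sandwich` and `telescoped_sum` are invented for the illustration. It checks that each telescoping term sits strictly between the zeta term for the successor and the zeta term for the index, and that the partial sums collapse the way a telescope should.

```python
def telescoping_term(n, N):
    """N * (n^(-1/N) - (n+1)^(-1/N)), with N the chosen power of two."""
    return N * (n ** (-1.0 / N) - (n + 1) ** (-1.0 / N))

def check_sandwich(N, max_index):
    """True if (n+1)^(-s) < term(n) < n^(-s) for every n up to max_index."""
    s = 1 + 1.0 / N
    for n in range(1, max_index + 1):
        term = telescoping_term(n, N)
        if not ((n + 1) ** (-s) < term < n ** (-s)):
            return False
    return True

def telescoped_sum(N, max_index):
    """Sum of the telescoping terms for indices 1 through max_index."""
    return sum(telescoping_term(n, N) for n in range(1, max_index + 1))

if __name__ == "__main__":
    for N in (2, 4, 8, 16, 32):
        print(N, check_sandwich(N, 2000), telescoped_sum(N, 2000))
```

The partial sum up to a given index should match the power of two times one minus the reciprocal root of the successor of that index, which is what makes the remainder so easy to bound.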
If you had access to the modern binomial theorem, you could stick a minus sign on the exponent, flip the ratio, and you'd have shown that the upper bound is less than the series in question. However, lacking that, it'll instead be a matter of using Euclid's proposition IX.35 on what amount to formal power series and taking both sides to the power itself to show the inequality that the binomial theorem yields, that you can cancel out the one and the successor.
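One way the reduction might be written out, as a sketch rather than a claim about the intended route: write N for the power of two and n for the index (both symbols are introduced here for convenience), pull out the successor, and raise both sides to the power N. The last line uses only the natural-number binomial theorem and the finite geometric sum of IX.35.

```latex
% N = the chosen power of two, n = the index, x = 1/(n+1); notation assumed for this sketch
\[
N\left(n^{-1/N} - (n+1)^{-1/N}\right) > (n+1)^{-1-1/N}
\iff
N\left(\left(\tfrac{n+1}{n}\right)^{1/N} - 1\right) > \frac{1}{n+1}
\iff
1 + \frac{1}{n} > \left(1 + \frac{1}{N(n+1)}\right)^{N}.
\]
\[
\left(1 + \frac{x}{N}\right)^{N}
= \sum_{k=0}^{N} \binom{N}{k}\frac{x^{k}}{N^{k}}
\le \sum_{k=0}^{N} x^{k}
< \frac{1}{1-x}
= \frac{n+1}{n}
= 1 + \frac{1}{n}.
\]
```

The middle inequality holds because each binomial coefficient is at most the matching power of N, so no infinite series is needed until the very last comparison, and even that one is just the statement that a finite geometric sum falls short of its closed-form bound.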
FUCK YOU FUCKING EN-SPACERS FUCKING UP MY FUCKING POST WHEN DID YOUR FUCKING CULT BECOME FUCKING LAW YOUR FUCKING TEXT LOOKS LIKE A FUCKING RUN-ON FUCKING SENTENCE WITH A FEW FUCKING STRAY PRINTER MARKS IN IT SORT OF LIKE THIS BLOCK BUT WITH GNATS ON THE MONITOR IN FACT I'M NOTICING NOW THAT IT AUTOMATICALLY CORRECTS THE WORD WRAP WITH MULTIPLE SPACES UN-FUCKING-LESS THERE ARE EXACTLY TWO THE NARCISSISM OF SMALL DIFFERENCES TO FURTHER THEIR FUCKING CULT AND WHAT IS THIS FUCKING BULLSHIT WHERE THEY GO OUT OF THEIR WAY TO ADD AN EXTRA SPACE WHEN YOU TRY TO FIX THEIR HOLY EN-SPACES JUST TO DICK WITH PROPERLY EM-SPACING "FAIR GAME"
Now for the slightly trickier part. It can be shown similarly that the difference will always be less than the term as regards the index itself, but we need to put a lower bound on by how much. Since it’ll be lower for every term, just working it out for the first term will do, and since for any power of two the product of that power of two with the difference between one and the reciprocal of the matching root of two will always be less than five sixths (in fact it approaches .693 from below, but five sixths is good enough), it’s just a matter of getting to a point where the remainder in the partial will be less than one sixth, which'd mean you have to get the telescope to collapse to that level, i.e., multiply the power of two by six and take it to the power of the power of two (again, the whole power), so that when you take the relevant root, you get something that multiplied by the coefficient will be less than a sixth. Because the subtrahend comes from the successor, you can subtract one from that, which we'll see in the next paragraph'll come in wicked handy.
So it passes the sniff test, but does it pass the Wolfram test? Well, choosing 32 as the power of two, you get an index beyond 11,631,596,243,148,446,061,072,503,646,667,324,283,461,914,680,502,615,356,492,721,967,947,841,535, the sum to which point would be about 32.413. Just to be extra-sure, since it should in principle work with log half e rather than a sixth, let’s try the far more reasonable index (rounding up since indices are integers and since our method finds a minimum, only, since this isn't an integer, that's really rounding down due to our wicked handy subtracting) 38,285,606,278,150,664,580,703,573,515,392,543,734,581,840,373,218,238,427,281,826,340, and we get about 32.273. So… yay! It passes the Wolfram test. Probably don't let's make explicit we’re relying on such ludicrous boundaries when we’re aping Euclid later on.
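The same check can be roughed out without Wolfram. The sketch below is an assumption-laden stand-in: it estimates the zeta function with a short Euler-Maclaurin correction (a technique swapped in here, not something from the argument above), then subtracts the integral tail to estimate a partial sum out to an absurdly large index. The names `zeta_real` and `partial_sum_estimate` are hypothetical, and the second threshold is only computed in floating point.

```python
import math

def zeta_real(s, cutoff=1000):
    """Euler-Maclaurin estimate of zeta(s) for real s > 1 (leading corrections only)."""
    head = math.fsum(n ** -s for n in range(1, cutoff))
    tail = (cutoff ** (1 - s) / (s - 1)      # integral from cutoff to infinity
            + cutoff ** -s / 2               # half of the first omitted term
            + s * cutoff ** (-s - 1) / 12)   # first derivative correction
    return head + tail

def partial_sum_estimate(N, M):
    """Estimate of the sum of n^-(1+1/N) for n <= M, for huge M.

    Uses zeta minus the integral tail N * M^(-1/N); the neglected
    corrections are of order M^-(1+1/N), far below float precision here.
    """
    s = 1 + 1 / N
    return zeta_real(s) - N * float(M) ** (-1 / N)

if __name__ == "__main__":
    N = 32
    sixth_threshold = (6 * N) ** N - 1               # exact integer
    log_threshold = (N / (1 - math.log(2))) ** N     # float approximation only
    print(len(str(sixth_threshold)), partial_sum_estimate(N, sixth_threshold))
    print(partial_sum_estimate(N, log_threshold))
```

The design choice is to never touch the huge index except through its thirty-second root, which is why ordinary floats are enough.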
That brings us to the compass-and-straightedge provenance of the sieve, which comes from the fundamental theorem of arithmetic, which is equivalent to Elements VII.30-32 and IX.14. You can get the Euler product from this by using the bounds above to establish that between the prime and the square of the prime you've got a difference less than one.
(Who the hell thought a blog called “drunk-math” should go hardcore compass-and-straightedge?)
Now, to start with, to define the logarithm, we'll use a limit that goes to the logarithm, namely, the full power minus one over the exponent alone as the exponent goes to zero from above, where the base is the parameter of the logarithm function. For the exponent, we'll use the reciprocal of the power of two, so as the power of two blows up, now we have a constructible analogue to the logarithm, which can be used for a constructible analogue to the first Chebyshev function. Since the denominator is the reciprocal of an integer that blows up, instead of dividing by it we can multiply by the zeta function of one plus the denominator by the argument above.
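The limit in question is easy to poke at, since taking the exponent to be the reciprocal of a power of two means nothing worse than repeated square roots. Here is a minimal sketch under that reading; `constructible_log` and `theta_analogue` are names invented for the example, floats stand in for constructible lengths, and the trial-division primality test is just a convenience.

```python
import math

def constructible_log(x, k):
    """N * (x^(1/N) - 1) with N = 2^k, using k repeated square roots."""
    root = x
    for _ in range(k):
        root = math.sqrt(root)
    return (2 ** k) * (root - 1)

def is_prime(n):
    """Trial division; fine for the small inputs used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def theta_analogue(x, k):
    """Analogue of the first Chebyshev function: sum of constructible_log(p, k) over primes p <= x."""
    return math.fsum(constructible_log(float(p), k) for p in range(2, x + 1) if is_prime(p))

if __name__ == "__main__":
    for k in (5, 10, 20):
        print(k, constructible_log(10.0, k), math.log(10.0))
    print(theta_analogue(30, 20))
```

The approximation comes down on the logarithm from above as the power of two grows, which matches taking the exponent to zero from above. Very large k will eventually lose digits to floating-point cancellation, so the sketch is not meant to be pushed far.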
Now for the hard part, to apply the Wiener-Ikehara theorem. To remind, the preconditions (for one version) are:
1. For any exponent of real part strictly greater than two, the integral of the function divided by that power of the parameter of integration from one to infinity converges to something complex-continuous with respect to the exponent.
2. The function also converges for nonreal exponents of real part exactly two.
3. The function itself is zero at one.
4. As the exponent approaches two from above the difference between the integral and the reciprocal of two less than the exponent comes to a limit not equal to zero.
1 is obvious by the normal definition and not hard from the adapted one (although the notion of complex continuity could be a bit tricksy); 3 follows wicked fast from the fact that the lowest prime is two. 4 follows from the fact, as in the previous post, that in the upward neighborhood of two the difference plus something that comes to a finite limit comes to the limit of the derivative of the logarithm of the product of two less than the parameter times the zeta function of the parameter minus one. That'll take some finesse to get into our terms, but first, that which was shown by fiat last time, 2, that the function converges for nonreal exponents of real part exactly two.
This is basically equivalent to saying that the zeta function has no zeros with real part one, since recall from before that there's a finite difference between the derivative of the logarithm of the zeta function and the integral. So how to show that? Hadamard had a four-paragraph proof that was truly astounding for its time (compare the 25-page proof used by his contemporary de la Vallée Poussin), but involved some slightly annoying finickiness with the subset of the primes whose logarithms multiplied by the imaginary part of the zero are within a given distance of an odd multiple of pi. Mertens two years later streamlined this into something more elegant, and that's what's usually taught today: given a zero on that line, for some real number close to but greater than one, multiply the cube of the zeta function of that real number by the fourth power of the zeta function of that number plus (times i) the imaginary part of the zero, and as the real approaches one, this will approach zero (as a consequence of the analyticity of the zeta function that I'll still have to justify). Multiply that by the zeta function of the same real part plus twice the imaginary part and you should still get zero, since the zeta function stays finite everywhere on the line (away from one itself), but some trig'll show you you don't, so your hypotheses must be fucked, and unless all math is gibberish (which we accidentally did once and can never be sure we haven't again, or else computers wouldn't be a thing) that'd be the one about the zeta function having a zero of real part one. The trig in question would be the fact that the square of one greater than the cosine, which for real input will always be at least zero, let alone negative infinity, is equal to one and a half plus twice the cosine plus half the cosine of twice the angle, which by the way imaginary exponentiation works would factor into the logarithm and thus go to negative infinity if you had a zero.
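Written out in symbols, the trig and the way it feeds the logarithm look roughly like this. The notation is an assumption for the sketch: sigma is the real number greater than one, t the imaginary part of the supposed zero, p runs over the primes, and theta is any real angle.

```latex
% sigma > 1, t real, p prime, theta a real angle; notation assumed for this sketch
\[
(1+\cos\theta)^{2}
= 1 + 2\cos\theta + \cos^{2}\theta
= \tfrac{3}{2} + 2\cos\theta + \tfrac{1}{2}\cos 2\theta
\;\ge\; 0 .
\]
\[
\log\lvert\zeta(\sigma+it)\rvert
= \sum_{p}\sum_{k\ge 1}\frac{\cos(k\,t\log p)}{k\,p^{k\sigma}} ,
\]
\[
3\log\zeta(\sigma) + 4\log\lvert\zeta(\sigma+it)\rvert + \log\lvert\zeta(\sigma+2it)\rvert
= \sum_{p}\sum_{k\ge 1}\frac{2\,\bigl(1+\cos(k\,t\log p)\bigr)^{2}}{k\,p^{k\sigma}}
\;\ge\; 0 .
\]
```

So the product of the cube, the fourth power and the doubled-height factor has absolute value at least one for every real part greater than one, which is exactly what a zero on the line would contradict.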
So anyway, the less tricky one. The above relies only on limits (which the ancients had a vague grip on) and two particular properties of the function that can be proven by other means (probably); the previous proof of precondition 4, on the other hand, relies on properties of the derivative and logarithm that, while second nature to the modern undergrad, are very tricky to adapt to geometer's terms. So what exactly is it that we're doing?
Well, consider, it's not hard to show that the denominator is nonzero, by the argument we've already made. The trick is showing that the derivative, at least from above, is nonzero. This is done normally by appealing to analyticity, so let's adapt the proof that it's analytic, that is to say that it agrees with its Taylor series, so consider that Taylor's theorem can be shown a consequence of integration by parts, which comes from the product rule, which is a limiting case of Euclid's "geometric algebra" (cf. "the Hetaera"). That'll give a general guideline.
So now we actually show that this follows, by taking both limits simultaneously, that is to say both the derivative and the limit that goes to the logarithm. This'd be using the second Chebyshev function in essence, or at least, putting a term of negative one in the denominator that would yield the second rather than the first Chebyshev function in the original integral by Elements IX.35. This is essentially trivia since it's easy to show the difference has a finite (and in fact quite low, less than one) limit. So then it would be a matter of iterating the product rule, but that leaves a lot of messy terms, so instead it would probably be better to apply the above directly to what we already know to be the derivative.
Now that the easy part's done with, it's time to actually show the Wiener-Ikehara theorem. I'm going to follow Müger, including the bit that involves integrals, but I'll have to change a few things to get it to fit with the description of the theorem above. Namely, you'll note that I went with an exponent of the parameter of real part strictly greater than two, whereas Müger goes with e raised to a coefficient strictly greater than some value times the parameter. The problem is that this would require a definition of e. That's going to cause some problems to following his proof exactly, since it relies on the use of the Fourier transform, and therefore directly on the properties of e. So instead, we'll work backwards.
Ultimately, Müger's argument rests on a convolution, i.e., the function produced by integral of the product of values of two functions equidistant from an input pivot, of the parameter of the integral from Müger's formulation (with e) and a family of functions equal to the quotient of the difference of one minus the cosine of the product of some natural number with the parameter over pi times that same number times the square of the parameter. As the pivot goes to infinity, this'll go to one, and from this you can get that the function itself goes to one at infinity by the fact that this function gets pulled increasingly tight to the origin as the coefficient inside the cosine shoots up. He justifies this equality in the limit by annihilating the difference, which he justifies through a fairly easy special case of a result called the Riemann-Lebesgue lemma, which states that the Fourier transform of a function whose absolute value is integrable vanishes at infinity. It'll probably be best to show the disappearance directly, but the first order of business is to show why it is that the difference can be expressed as such a Fourier transform.
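The family of functions described there can be checked numerically for the two properties the argument leans on: total mass one, and mass that piles up near the origin as the coefficient inside the cosine grows. The sketch below uses a plain midpoint rule over a finite window, so the total mass comes out slightly under one. The names `fejer_kernel` and `mass` and the window sizes are assumptions for the illustration.

```python
import math

def fejer_kernel(x, lam):
    """(1 - cos(lam * x)) / (pi * lam * x^2), with the limiting value lam / (2 pi) at x = 0."""
    if x == 0.0:
        return lam / (2 * math.pi)
    return (1 - math.cos(lam * x)) / (math.pi * lam * x * x)

def mass(lam, half_width, steps=200_000):
    """Midpoint-rule integral of the kernel over [-half_width, half_width]."""
    h = 2 * half_width / steps
    return h * math.fsum(
        fejer_kernel(-half_width + (i + 0.5) * h, lam) for i in range(steps)
    )

if __name__ == "__main__":
    print(mass(8, 200))                       # total mass, up to the truncated tails
    for lam in (1, 4, 16):
        print(lam, mass(lam, 1, steps=20_000))  # share of the mass within distance one
```

The share of the mass within a fixed distance of the origin climbs as the coefficient grows, which is the "pulled increasingly tight to the origin" behaviour in numerical form.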
That this can be expressed as a Fourier transform comes from the monotone convergence theorem, a statement similarly obvious to the restricted Wiener-Ikehara theorem of the previous post but with the benefit of being true, namely that if you have a sequence of everywhere positive integrable functions where every point is nondecreasing as the sequence goes on and the limit of the sequence is integrable then the limit of the integral is the integral of the limit. In this case, you'd use the difference and use the limit thereof at the line of real part 2. You get to that step by using Fubini's theorem, which is the same as rearrangement of an absolutely convergent series only applied to a more general integral (since a series is an integral, really, just a special kind), that if a two dimensional integral is absolutely convergent you can shuffle it around however you please.
To streamline things a bit, you can break down the Fourier transform along the minimal abscissa of the product of the result of the integral by the tent function into the convolution at the parameter of the integrand and the Fourier transform of the tent function, minus the integral of the Fourier transform of the tent function from minus infinity to the parameter. This holds because the comparable statement holds for higher abscissae, when the integrands are divided by an appropriate function of e. A variable substitution will get you to the simpler version. Now it becomes a bit tricky in that the variable substitution is "behind the scenes," and we won't be using an integral of the function, but a series, which remember, is a kind of integral (recall also that the integrand doesn't have to be continuous, just the result), except that treating it that way fucks up the precondition that the function be nondecreasing, since now rather than the first/second Chebyshev function we're kind of using the first/second Chebyshev function multiplied by the parameter raised to the remainder, so we need to weaken the preconditions, and in fact I glossed over the use of that particular precondition enough that we'll be able to adapt it easily; specifically, you use the monotonicity to round to one end or another of a section plucked from the convolution, and so all you need to do is mark an offset of one.
So let's make a slight attempt to restore sanity. So let's consider what the hypothetical Greek would know from the arguments far above, in semi-modern terms:
The sum of the quotient of the (Greekified) first Chebyshev functions divided by the index to any exponent strictly greater than two will converge.
The sum of the quotient of the first Chebyshev function divided by the product of the square of the index by the cosine of the product of a given constant by the log of the index will converge, or at least, as the square is approached from a higher power, the sum will approach a limit.
And now let's summarize, from this, what we can prove and how, right after we've cleared one more obstacle. You'll note that all of the above is built on references to integrals. I want to avoid integrals, so instead, I'm going to try to go through a proof similar to Müger's using only sums.
We'll start by considering the sum of the sum we came up with that throws in the bit about the cosine, with some coefficient greater than zero, and "real part" 2. The integral will be zero, and it will also be a difference. I don't know about the sum, but by definition, the integral will be the limit of the Riemann sums, so I should be able to construct a bound.
Huh... so that was only one step, because I guess really that's the only obstacle. So let's put it all together (sort of).
Actually, one little thing first. The proof above relies very heavily on integrals, which could be adapted to this paradigm using Riemann sums, but that feels excessively artificial. So what about replacing the integrals with a simple finite sum? That would still leave the complex numbers, as well as e itself, which also must be e-liminated, but it's likely a good start. So let's think about this one.
Well, the idea is that we're trying to get the limit of the Fourier transform, or the equivalent, on the LHS, and show it equals zero, while on the right-hand ess, to get the difference between the convolution, or the equivalent, and a limit that goes to a constant. To that end, let's start by just swapping out the integrals for sums and see what happens. The first step, the application of Fubini's theorem, still works, since the series is still absolutely convergent. The Fourier transform isn't as "clean" anymore, but it still exists, and it's still real for integers in the parameter of the limit. The trick with the sign still works. The monotone convergence theorem still holds. Replacing the parameter still works.
So now the tricky part (maybe?): the Riemann-Lebesgue lemma. So in this form, I've got to show that a function from the integers to the complex numbers with compact support (i.e., a finite stretch where it ain't zero) has not a discrete Fourier transform, but something like a continuous Fourier transform in the discrete realm, that behaves similarly. The trick there is to find a series of functions that pull in toward zero as the series goes on.
So the question becomes, what properties of the Fourier transform, exactly, does he rely on? The definition plugs into Fubini's theorem. The fact that this particular transform is always positive allows the use of the monotone convergence theorem. The fact that both the function and its transform are even lets you get this into a form whose limit can be shown to be constant. The Riemann-Lebesgue lemma brings you to the final step, where a limit that goes to zero is equal to the difference between 1 and the limit we seek to show goes to 1.
Now, Fubini's theorem is just the continuous analogue of the inverse (which holds) of Riemann's paradox, so that can be brought in. The tricky part then becomes finding an analogue of the Fourier transform; the obvious answer would be the DFT, but how to sample? Maybe sort of like a Riemann sum, each function in the family sampling closer. The problem then becomes showing the Riemann-Lebesgue lemma.
Although the monotone convergence theorem is often stated as applying only to everywhere-positive functions, some very quick algebra will let you apply it to any function bounded on one side. What we'd need to apply it to in this case, though, would be both the Fourier transform and the product of the Fourier transform by the quotient of the second Chebyshev function and its parameter. That being the case, we might just need to bring back that statement from last time that the first Chebyshev function's growth is bounded, which it shouldn't be hard to extend to the second.
So just as a heuristic, let's use the integral over intervals. This presents a problem, in that the antiderivative of the Fourier transform Müger uses isn't expressible in closed form, at least not without using the exponential integral, which might be a little tricky. So instead, let's replace the square in the denominator with the exponential function of the absolute value, which keeps it even, positive, and an "approximate unit." (And in fact, we should get rid of the tau.) For this we get at zero the cosine of half the ordinal, minus the sine of half the ordinal, minus two, all divided by the exponential function of half the ordinal, with one added to the entire quotient. This approaches one as the ordinal gets large, causing the first term to approach zero. Elsewhere what we get is a difference between two values of the first term between three halves the ordinal and half of it, then five halves and three halves the ordinal, and so on, and those all go to zero as the ordinal gets large. That is to say, it's yet another telescoping series.
So now to work backwards. First things first, replace the e with a 2 and opt to telescope through whole rather than half ordinals. We have the slight problem, apparent right away, that trig functions of a nonzero algebraic number are always transcendental, but we'll cross that Kaliningrad bridge when we march to it. So now slip in the convolution and wrap it in the limit, bearing in mind that now it has to work for the entire family, has to go to 1 for the entire family. The proof from here to the goal is basically already laid out in that gibberish above, so let's keep moving forward (backward). Now the original proof got to 1 by getting to tau with the "approximate unit" times a factor of tau integrated from minus infinity to the parameter of the limit as the limit went to infinity. So for us, we'll ultimately be returning to the telescoping series, although to know what exactly we're working with we have to make it equal to the limit we already have.
This was accomplished in the original proof by putting their difference on one side, and on the other, a Fourier transform that could be shown to be equal to the difference, whose limit could be shown to be zero by the Riemann-Lebesgue lemma. So let's ourselves consider the difference between the limit of the "convolution" (bearing in mind that since it's no longer an actual convolution, it may take more finesse to consider the entire family) and the actual telescoping series. Müger links that to the Fourier transform of a function with compact support, which manifestly exists. We'll prefer to treat it as it is, subtracting one from the convoluend.
Anyway, I've completely lost the plot, to the extent I don't think I'll be able to parse this spaghetti drunk. Next post (or the one after): I'll try to make sense of this mess.