Eric Auld Math Blog @ericauldmathblog - Tumblr Blog

Damn, Vakil’s online book on Algebraic Geometry is awesome.

http://math.stanford.edu/~vakil/216blog/index.html

Unique lifting property of covering maps

Proposition: Suppose $E \overset{\pi}{\to} X$ is a covering map and $Y \overset{f}{\to} X$ is continuous. Suppose $Y$ is connected, and $Y \overset{g}{\to} E$ and $Y \overset{\tilde{g}}{\to} E$ are two lifts of $f$ (that is, $\pi \circ g = \pi \circ \tilde{g} = f$) that agree at a point $y$. Then $g = \tilde{g}$.

Proof: We show that $\{y \in Y \mid g(y) = \tilde{g}(y)\}$ is a nonempty clopen subset of $Y$; since $Y$ is connected, it must then be all of $Y$. It is nonempty because it contains $y$. To show it is open, let $g(z) = \tilde{g}(z)$. Let $U \subset E$ be an open neighborhood of $g(z)$ that $\pi$ maps homeomorphically onto an evenly covered open set (that is, a sheet over an evenly covered neighborhood of $\pi(g(z))$). On the open set $g^{-1}(U) \cap \tilde{g}^{-1}(U)$ the two functions must agree, since their compositions with $\pi$ agree, and $\pi |_U$ is injective. To show that it is closed: normally agreement of two functions is a closed condition as long as the target is Hausdorff. However, in this case, we can avoid using the Hausdorff assumption and just invoke the covering map again: suppose $g(z') \neq \tilde{g}(z')$. Take an evenly covered neighborhood $U$ of $\pi \circ g(z')$, and let $U_1$ be the sheet of $\pi^{-1}(U)$ containing $g(z')$ and $U_2$ the sheet containing $\tilde{g}(z')$. These sheets are distinct, hence disjoint, because $g(z')$ and $\tilde{g}(z')$ are different points with the same image under $\pi$, and $\pi$ is injective on each sheet. Then on the neighborhood $g^{-1}(U_1) \cap \tilde{g}^{-1}(U_2)$ of $z'$, the functions must disagree. QED

As Qiaochu Yuan notes on his blog, this means that a covering map is a monomorphism in the category of pointed connected spaces, although of course it is not injective in general. This implies that the forgetful functor from pointed connected spaces to $\mathsf{Set}$ has no left adjoint: a functor with a left adjoint preserves monomorphisms, and the monomorphisms in $\mathsf{Set}$ are exactly the injective maps.

#topology #mathematics

Explanation of Adjoint Functors, Unit, and Counit

Suppose we have categories $C$ and $D$ and a functor $C \overset{R}{\to} D$. Suppose $R$ has the property that for all $d \in |D|$, $d$ has a reflection across $R$, i.e. an initial object in $d \downarrow R$, which we denote by $(dL,\, d \overset{\eta_d}{\to} (dL)R)$, where $dL \in |C|$.

This correspondence provides us, for each arrow $d \overset{\varphi}\to cR$, with an arrow $dL \to c$, which we may denote by $^*\varphi$. And of course we can go back: given an arrow $dL \overset{f}{\to} c$, we can send $f$ to $\eta_d \,; (f)R$, and we may denote this arrow by $f^*$. I claim that right-asterisk and left-asterisk are inverses, and so we have a bijection between $C(dL, c)$ and $D(d, cR)$ for each $c \in |C|$ and $d \in |D|$.

It's pretty easy for us to say explicitly what it means to take $f \mapsto f^*$ (as long as we know the $\eta$ maps): we have $f^* = \eta_d \, ; (f)R$. In fact, we can say a little more: if $dL \overset{f}{\to} c \overset{g}{\to} c'$, then we can tell what $(f \,; g)^*$ should be:

The diagram (or simply functoriality of $R$) makes clear the helpful formula $(f \, ; g)^* = \eta_d \, ; (f \, ; g)R = \eta_d \, ; (f)R \, ; (g)R = f^* \, ; (g)R$.

Mapping $\varphi \mapsto {} ^*\varphi$ is a little harder for us to explicitly write down at this point; all we have assumed is that there must exist a unique such map. However, there is one thing we know about $^*\varphi$: from the diagrams above it is certainly clear that $\eta_d \, ; (^* \varphi)R = \varphi$.

Let us note that the correspondence $d \mapsto dL$ in fact extends to a functor $C \leftarrow D$.

Lemma: There is a unique functor $C \overset{L}{\leftarrow} D$ such that $(d)L = dL$ and such that $1_D \overset{\eta}{\Rightarrow}LR$ is a natural transformation.

Proof: We know what $L$ must be on objects, and drawing the commutative diagram tells us what it must be on arrows: for $d \overset{\varphi}{\to} d'$, naturality of $\eta$ demands $\eta_d \, ; (\varphi)LR = \varphi \, ; \eta_{d'}$.

The initial property of $\eta_d$ tells us that there is a unique arrow $(\varphi)L$ such that the diagram commutes, namely $(\varphi)L = {}^*(\varphi \, ; \eta_{d'})$. I claim that $L$, defined this way, is functorial. QED

It turns out that given the situation above, and given an element $c \in |C|$, we can find a coreflection of $c$ along $L$, i.e. a final object in $L \downarrow c$. This property of $L$ will end up clarifying somewhat what it means to map $\varphi \mapsto {} ^* \varphi$.

Proposition: Given $c \in |C|$, there is a map $(c)RL \overset{\epsilon_c}{\to} c$ such that $(cR, cRL \overset{\epsilon_c}{\to} c)$ is a final object in $L \downarrow c$.

Suppose that there were such a map $\epsilon_c$. That would mean that for any map $(d)L \overset{f}{\to} c$, there must exist a unique map $d \overset{\varphi}{\to} (c)R$ such that $f = (\varphi)L ; \epsilon_c$.

Dualizing this equation, we get $$f^* = ((\varphi)L \,; \epsilon_c)^* = \eta_d \,; (\varphi)LR \,; (\epsilon_c)R$$ $$ = \varphi \, ; \eta_{cR}\,; (\epsilon_c)R = \varphi\, ; \epsilon_c^*,$$ where the second line uses naturality of $\eta$. Staring at this for a moment, a clever idea arises: choose $\epsilon_c:= {} ^*\text{Id}_{cR}$. Then, since we know left-asterisk and right-asterisk are inverse, we will have $\epsilon_c^* = \text{Id}_{cR}$ and hence $f^* = \varphi$, so $f = {}^* \varphi$. This shows existence and uniqueness of $\varphi$ when we choose $\epsilon_c:= {}^* \text{Id}_{cR}$. It also has the happy consequence of giving us an explicit form for the left-asterisk operation: $$^*\varphi = (\varphi)L \, ; {}^* \text{Id}_{cR} = (\varphi)L \, ; \epsilon_c.$$ By precomposing $\varphi$ with a map $d' \overset{\psi}{\to} d$ and looking at the diagram, we get the result $^*(\psi \, ; \varphi) = (\psi)L \, ; {}^* \varphi$.
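To make the formulas above concrete, here is a minimal Python sketch using an example that is not from this post: the free-monoid adjunction, where $C$ is monoids, $D$ is sets, $R$ forgets the monoid structure, and $dL$ is the monoid of lists over $d$. The function names (`eta`, `L_map`, `make_epsilon`, `left_star`, `right_star`) are my own labels, and representing a monoid as a pair (binary operation, unit) is an assumption made for illustration.

```python
from functools import reduce
from operator import add

def eta(x):
    """Unit eta_d : d -> (dL)R, sending x to the one-element list [x]."""
    return [x]

def L_map(phi):
    """(phi)L : lists over d -> lists over d', applying phi entrywise."""
    return lambda xs: [phi(x) for x in xs]

def make_epsilon(op, unit):
    """Counit eps_c : (c)RL -> c for the monoid c = (op, unit): multiply out a list."""
    return lambda xs: reduce(op, xs, unit)

def left_star(phi, op, unit):
    """*phi = (phi)L ; eps_c, the monoid map dL -> c induced by phi : d -> cR."""
    eps = make_epsilon(op, unit)
    return lambda xs: eps(L_map(phi)(xs))

def right_star(f):
    """f^* = eta_d ; (f)R, the restriction of f : dL -> c to generators."""
    return lambda x: f(eta(x))

# phi sends a string to its length, viewed as an element of the monoid (int, +, 0).
phi = len
star_phi = left_star(phi, add, 0)
total = star_phi(["ab", "cde", ""])      # lengths 2 + 3 + 0
round_trip = right_star(star_phi)        # should agree with phi on every string
```

In this example `right_star(left_star(phi, ...))` agrees with `phi`, which is the relation $\eta_d \, ; (^*\varphi)R = \varphi$ in this special case.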

As we did for $\eta$, let's check that $\epsilon$ is a natural transformation from $RL$ to $\text{Id}_C$. Suppose we have a map $c \overset{f}{\to} c'$. We want to show that $(f)RL \, ; {}^* \text{Id}_{c'R} = {}^* \text{Id}_{cR} \, ; f$. We do so by dualizing: first note that $({}^* \text{Id}_{cR} \, ; f)^* = ({}^* \text{Id}_{cR})^* \, ; (f)R = (f)R$. And, using naturality of $\eta$ and the aforementioned relation $\eta_d \, ; (^* \varphi)R = \varphi$, we have $$((f)RL \, ; {}^* \text{Id}_{c'R})^* = \eta_{cR}\, ; (f)RLR \, ; ({}^* \text{Id}_{c'R})R$$ $$= (f)R \, ; \eta_{c'R} \, ; ({}^* \text{Id}_{c'R})R = (f)R \, ; \text{Id}_{c'R} = (f)R.$$ Since right-asterisk is a bijection, the two sides agree, and therefore $\epsilon$ is natural.

#mathematics #category theory #adjoint functors #Sorry I write function composition forwards not backwards #Actually I'm not sorry

Nice proof of irreducibility of cyclotomic polynomials

I found a nice proof that for $p$ prime, the polynomial $$X^{p-1} + X^{p-2} + \dotsb + X + 1$$ is irreducible over $\mathbb{Q}$.

Lemma: For $n\geq k$, $${n \choose k} + {n-1 \choose k} + \dotsb + {k+1 \choose k} + {k \choose k} = {n+1 \choose k+1}. $$

Proof: Induction on $n-k$. For $n-k=0$ the conclusion is clear. Assume the result holds for $n-k = m$. Then if $n-k = m+1$, we have $${n \choose k} + \left ( {n-1 \choose k} + \dotsb + {k \choose k} \right) = {n \choose k} + {n \choose k+1} = {n+1 \choose k+1}. \quad \text {QED} $$

Now to show that $\Phi(X) = X^{p-1} + X^{p-2} + \dotsb + X + 1$ is irreducible, it suffices to show that $\Phi(X+1)$ is irreducible. But the coefficient on $X^{k}$ in $\Phi(X + 1)$ is $ {p-1 \choose k} + {p-2 \choose k} + \dotsb + {k \choose k}$, which by the lemma equals ${p \choose k+1}$. This is divisible by $p$ for $0 \leq k \leq p-2$, the constant term is ${p \choose 1} = p$, which is not divisible by $p^2$, and the leading coefficient is ${p \choose p} = 1$. So $\Phi(X+1)$ is Eisenstein at $p$.
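As a sanity check on the argument above, here is a short Python sketch that expands $\Phi(X+1)$ directly for a few small primes and tests the Eisenstein conditions. The function names are mine, and the check is only numerical evidence for small $p$, not a substitute for the proof.

```python
from math import comb

def shifted_cyclotomic_coeffs(p):
    """Coefficients c[0..p-1] of Phi(X+1), where Phi(X) = 1 + X + ... + X^(p-1).

    Computed by expanding each (X+1)^j with the binomial theorem and summing.
    """
    coeffs = [0] * p
    for j in range(p):
        for k in range(j + 1):
            coeffs[k] += comb(j, k)
    return coeffs

def is_eisenstein(coeffs, p):
    """Eisenstein criterion at p for a polynomial given lowest degree first."""
    *lower, lead = coeffs
    return (
        lead % p != 0                       # p does not divide the leading coefficient
        and all(c % p == 0 for c in lower)  # p divides every other coefficient
        and lower[0] % (p * p) != 0         # p^2 does not divide the constant term
    )

for p in (2, 3, 5, 7, 11, 13):
    coeffs = shifted_cyclotomic_coeffs(p)
    # The lemma predicts the coefficient of X^k to be binomial(p, k+1).
    assert coeffs == [comb(p, k + 1) for k in range(p)]
    assert is_eisenstein(coeffs, p)
```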

#polynomials #mathematics

Better Proof of Sylow’s First Theorem

There is a proof of the first Sylow Theorem that I much prefer to the standard one (which involves looking at the class equation and doing induction). I found it in some notes by Peter Cameron, and he says that it is the original proof that Sylow used. I prefer it because it is for me more intuitive, it involves group actions, and it has more mathematical flavor.

Suppose $G$ is a finite group. For any prime $p$, define a Sylow $p$-subgroup of $G$ to be a subgroup of order $p^n$, where $n$ is the highest exponent such that $p^n \mid \# G$. Sylow's First Theorem says: a finite group $G$ has Sylow $p$-subgroups for all primes $p$. Note by the definition that the Sylow $p$-subgroup of $G$ is the trivial group for all primes $p$ such that $p \nmid \#G$.

Lemma: A group $G$ has a Sylow $p$-subgroup iff there is an action by $G$ on a set such that all stabilizers are $p$-subgroups and there exists an orbit of size coprime to $p$.

Proof: (<==) The stabilizer of a point in the orbit of size coprime to $p$ is a Sylow $p$-subgroup: it is a $p$-group by assumption, and by the orbit-stabilizer theorem its index in $G$ is the size of the orbit, which is coprime to $p$.

(==>) Suppose $P < G$ is the given group. Let $G$ act on the cosets of $P$ by multiplication. It's a transitive action with a single orbit of size $\#G / \#P$, which is coprime to $p$, and the stabilizers are the conjugates of $P$, so they are all of the same order as $P$ itself. QED

Lemma: If $G$ has a Sylow $p$-subgroup, then all subgroups of $G$ have Sylow $p$-subgroups.

Proof: Suppose $H<G$. Take the action as in the previous lemma, and restrict the action to $H$. Clearly the stabilizers remain $p$-subgroups (they are subgroups of the stabilizers of the $G$ action). To show that there remains an orbit of size coprime to $p$, consider the orbit of the $G$ action whose size is coprime to $p$. It breaks up into smaller orbits under the action by $H$, and their sizes cannot all be divisible by $p$, or else the size of the original orbit would be divisible by $p$. QED

Lemma: $GL_n(\mathbb{F}_p)$ has a Sylow $p$-subgroup.

Proof: Recall that $GL_n(\mathbb{F}_p)$ has order $p^{n(n-1)/2}(p^n-1)(p^{n-1}-1)\dotsb(p-1)$, and that the factors $p^i - 1$ are all coprime to $p$. And note that the upper triangular matrices with ones on the diagonal form a subgroup of order $p^{n(n-1)/2}$. QED

Now prove the theorem by noting that every finite group $G$ is isomorphic to a subgroup of a symmetric group $S_n$ by Cayley's Theorem, and $S_n < GL_n(\mathbb{F}_p)$ by the permutation matrices.
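The order formula for $GL_n(\mathbb{F}_p)$ used in the last lemma is easy to check by brute force for tiny $n$ and $p$. The following Python sketch (function names are my own, and it needs Python 3.8+ for `pow(x, -1, p)`) counts invertible matrices directly and compares with the formula; it is only feasible for very small cases.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank of a matrix over F_p, by Gauss-Jordan elimination (p must be prime)."""
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # inverse of the pivot mod p
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def gl_order_brute(n, p):
    """Count invertible n x n matrices over F_p by enumerating all p^(n*n) matrices."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(n)]
        if rank_mod_p(rows, p) == n:
            count += 1
    return count

def gl_order_formula(n, p):
    """p^(n(n-1)/2) * (p^n - 1)(p^(n-1) - 1)...(p - 1)."""
    order = p ** (n * (n - 1) // 2)
    for i in range(1, n + 1):
        order *= p ** i - 1
    return order

def unitriangular_order(n, p):
    """Number of upper triangular matrices with ones on the diagonal: one free entry per position above it."""
    return p ** (n * (n - 1) // 2)
```

For each case checked, the quotient of the group order by `unitriangular_order(n, p)` is coprime to $p$, which is exactly what makes the unitriangular matrices a Sylow $p$-subgroup.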

#mathematics #Sylow's-theorems

Inverse Function Theorem as Newton Iterations

In Newton's Method we attempt to find zeros of a function (that is, points in the inverse image of the point $0$) by moving our previous guess $x_i$ by $$x_{i+1} = x_i + \frac{- f(x_i)}{f'(x_i)} = x_i + \frac{0 - f(x_i)}{f'(x_i)}.$$ In one dimension, the linear transformation given by the derivative at $x_i$ is just multiplication by the number $f'(x_i)$ (and division by $f'(x_i)$ is just application of the inverse linear transformation). But if we write the above in the more general notation of a linear transformation $Df_{x_i}$, we would write $$x_{i+1} = x_i + (Df_{x_i})^{-1}(0 - f(x_i)).$$

This choice of $x_{i+1}$ makes quite a lot of sense even if we move to the general case where $f$ is a function from $\mathbb{R}^n \to \mathbb{R}^n$ and $Df$ is invertible. We are seeing how much we missed $0$ by with our latest guess $x_i$, and moving in the direction which the derivative tells us is most likely to get us from $f(x_i)$ to $0$, i.e. $(Df_{x_i})^{-1}(0-f(x_i))$. (Recall the derivative $Df_{x_0}$ is a linear transformation relating directions and speeds of movements near $x_0$ to directions and speeds of movement near $f(x_0)$.)

In fact, we can use Newton's Method to prove the Inverse Function Theorem. In attempting to find an inverse function for $f$, we choose a generic point $y$ (where before we had zero) and run the same algorithm. So we use the formula $$x_{i+1}= x_i + (Df_{x_i})^{-1}(y - f(x_i)).$$ Note that this algorithm stops moving (that is, $x_{i+1} = x_i$) iff we have found a preimage of $y$ (assuming that $Df_{x_i}$ is always invertible). As in the one-dimensional case, we try to establish convergence of this algorithm by showing that it is a contraction, i.e. that $|x_{i+1}-x_i| \leq c |x_i - x_{i-1}|$, for $c \in (0,1)$ fixed. In doing so, the initial selection of $y$ is crucial: this process may not work for all $y$, but, as we will show, it will always work if we choose $y$ close enough to $f(x_0)$.
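The iteration above can be sketched in a few lines of Python. The map $f(u,v) = (e^u \cos v, e^u \sin v)$, the starting point, and all helper names are assumptions chosen for illustration (this $f$ has an invertible derivative everywhere, so the sketch never divides by zero); nothing here is specific to the proof.

```python
import math

def f(x):
    """Example map R^2 -> R^2 (the complex exponential in real coordinates)."""
    u, v = x
    return (math.exp(u) * math.cos(v), math.exp(u) * math.sin(v))

def Df(x):
    """Jacobian matrix of f at x, as a pair of rows."""
    u, v = x
    e = math.exp(u)
    return ((e * math.cos(v), -e * math.sin(v)),
            (e * math.sin(v),  e * math.cos(v)))

def apply_inverse(A, r):
    """Return A^{-1} r for an invertible 2 x 2 matrix A, via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d * r[0] - b * r[1]) / det, (-c * r[0] + a * r[1]) / det)

def newton_preimage(y, x0, steps=20):
    """Run x_{i+1} = x_i + (Df_{x_i})^{-1} (y - f(x_i)) and record every iterate."""
    x = x0
    history = [x]
    for _ in range(steps):
        fx = f(x)
        residual = (y[0] - fx[0], y[1] - fx[1])   # how much we missed y by
        dx = apply_inverse(Df(x), residual)        # pull the miss back through Df^{-1}
        x = (x[0] + dx[0], x[1] + dx[1])
        history.append(x)
    return x, history

# Pick y close to f(x0) with x0 = (0, 0), then recover a preimage of y near x0.
target = (0.3, 0.2)
y = f(target)
x, history = newton_preimage(y, (0.0, 0.0))
```

Printing the successive differences of `history` shows the step lengths shrinking, which is the contraction behaviour the argument is after.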

Note that one of the consequences of our clever choice of $x_{i+1}$ is that $$\text{The linear approximation to } y_i=f(x_i) \text{ at } x_{i-1} \text{ is }y.$$ For $$f(x_i) = f(x_{i-1} + (x_i - x_{i-1})) = f(x_{i-1}) + Df_{x_{i-1}}(x_i - x_{i-1}) + E$$ $$= f(x_{i-1}) + Df_{x_{i-1}}((Df_{x_{i-1}})^{-1}(y - y_{i-1})) + E = y + E,$$ where $E$ is the error of the linear approximation, which is supposed to be $o(|x_i - x_{i-1}|)$ by the definition of the derivative. This fact is the heart of the proof, and our proof will succeed because we will choose a neighborhood $N$ of $x_0$ on which $E$ is a uniformly small proportion of $|x_i - x_{i-1}|$. Indeed, taking the derivative with respect to $t$ along a linear path from $x'$ to $x$ and integrating gives us an estimate $$|E| = |f(x) - f(x') - Df_{x'}(x-x')| = \left | \int_0^1 \frac{d}{dt}\left(f(x' + t(x-x')) \right)dt - Df_{x'}(x-x') \right| $$ $$ = \left| \int_0^1 Df_{x' + t(x-x')}(x-x') - Df_{x'}(x-x') \, dt \right| \leq \int_0^1 \|Df_{x' + t(x-x')} - Df_{x'}\|\cdot |x-x'|\, dt,$$ so $$|E| \leq \epsilon |x-x'|, \tag{1}$$ for $x$, $x'$ in some convex neighborhood $N_\epsilon$ of $x_0$ where $Df$ does not change too much (say $\|Df_x - Df_{x'}\| \leq \epsilon$ there). Since $y$ is the linear approximation to $f(x_i)$ at $x_{i-1}$, we can write this as $$|y-f(x_i)| \leq \epsilon |x_i - x_{i-1}|,$$ as long as our iterations keep us in $N_\epsilon$.

Later, when we show that the inverse function we find is differentiable, we will find it convenient to get a lower bound for $|f(x) - f(z)|$ for $x$ and $z$ in our neighborhood, as in $$|f(x) - f(z)| \geq \frac{1}{2} \|(Df_{x_0})\|_{\text{min}} \cdot |x - z|,$$ where I write $\|A\|_{\text{min}}$ for the smallest stretch $\min_{|v|=1}|Av|$ of a linear map $A$. Since we are trying to show that $f$ is one-one on a neighborhood, it is not surprising that we would want to make sure $f$ doesn't start mapping points too close together. (This is an easy mean value theorem estimate.)

The iteration proceeds by a sort of back-and-forth motion. We move forward by $f$, and back by $(Df_{x_i})^{-1}$ (or more precisely, by $y_i \mapsto x_i + (Df_{x_i})^{-1}(y- y_i)$). In both directions we need to make sure things don't change too much: if we change too much going forth or going back, we might not obtain a contraction mapping, or we might wind up outside $N_\epsilon$. We control "forth" by (1), and "back" by getting an upper bound $\|(Df_x)^{-1}\|< M$ on $N_\epsilon$. Then, as we have seen, $$|y - f(x_i)| \leq \epsilon |x_i - x_{i-1}| , \quad \text{ and }$$ $$|x_{i+1} - x_i| = |(Df_{x_i})^{-1}(y - f(x_i))| \leq M |y - f(x_i)|.$$ In other words, going back we pick up at most a factor of $M$, and going forth we can get a factor of $\epsilon$. So we just need to choose $\epsilon$ small enough to counteract $M$, with a little left over to give us a contraction mapping. Also crucial here is the choice of $y$ so that $y-f(x_0)$ is small enough already so that $M|y - f(x_0)|$ is not a far enough distance to get us out of $N_\epsilon$.

Somewhat more rigorously, take $N$ to be a compact neighborhood of $x_0$ where $Df_{x}$ remains invertible and $\|(Df_x)^{-1}\|< M$, and then select a smaller open ball $N_\epsilon$ about $x_0$ with radius $\rho$ such that $$\|Df_x - Df_z\| < \frac{(1/3)}{M} \quad \text{and}$$ $$|f(x) - f(z)| \geq \frac12 \|Df_{x_0}\|_{\text{min}} \cdot |x - z|$$ when $x$ and $z$ are in $N_\epsilon$ (note the former gives us (1)). Now choose $y$ close enough to $f(x_0)$ so that $|y-f(x_0)| < \frac{\rho/2}{M}$, and I claim we'll have a contraction mapping.

With those details left to the reader, we have shown that for $y$ in a $\frac{\rho/2}{M}$ neighborhood of $f(x_0)$, $y$ has a unique preimage in $N_\epsilon$. Therefore there is an inverse function defined on this neighborhood. (And of course $(f \circ f^{-1})(B_{f(x_0)}(\frac{\rho/2}{M})) \subset B_{f(x_0)}(\frac{\rho/2}{M})$, so we have a bijection of open sets.) To show that $f^{-1}$ is differentiable with derivative $(Df_x)^{-1}$, suppose $x \mapsto y$ and $x+h \mapsto y + k$. Then we have by the definition of differentiability that $Df_x(h) - k$ is $o(|h|)$. So $$\frac{|h - (Df_x)^{-1}(k)|}{|k|}= \frac{|(Df_x)^{-1}(Df_x(h) - k)|}{|k|},$$ and $|k| \geq (1/2)\|Df_{x_0}\|_{\text{min}}|h|$ by above, so $$\frac{|h - (Df_x)^{-1}(k)|}{|k|} \leq \frac{M}{(1/2)\|Df_{x_0}\|_{\text{min}}} \frac{|Df_x(h) - k|}{|h|},$$ which goes to zero as $|h|$ goes to zero, and of course $|h|$ goes to zero as $|k|$ goes to zero, so we are good.

#mathematics #mathema #inverse-function-theorem

Paradox of doing calculus on manifolds: the soul of calculus is linear approximation, which has its convenience in the fact that terms cancel out when you add, yet there is no addition on manifolds. Maybe when I get to Riemannian manifolds, I'll understand how for $q$ near $p$, $f(q)$ is very much like $\exp_{f(p)} \circ Df_p \circ \log_p (q)$.

Continuity and Filters

I posted a question and answer on filters and continuity to Stack Exchange:

http://math.stackexchange.com/questions/1323173/characterization-of-continuity-in-terms-of-filters/1323287#1323287

#math #topology #continuity

Various conditions for continuity and openness

I was proud of both a question, and an answer I posted to my own question, on Stack Exchange: http://math.stackexchange.com/questions/1317997/generalization-of-f-overlines-subset-overlinefs-iff-f-continuous/1322717#1322717

#math #mathematics #continuity

These expository writings by Dr. Keith Conrad are superb. Check out the notes on tensor products, for instance. Really cool motivation and orientation.

Knapp’s algebra books

A few months ago I came across the texts Basic Algebra and Advanced Algebra by Knapp. I was really excited because they looked great, and the more I use them, the more grateful I am for them. I think they are the best graduate-level algebra texts.

#abstract algebra

Order of elements in Z/p^aZ

I was doing the following problems in Dummit and Foote’s Abstract Algebra, third edition:

2.3.21 Let $p$ be an odd prime and let $n$ be a positive integer. Use the Binomial Theorem to show that $(1+p)^{p^{n-1}} \equiv 1 \mod p^n$ but $(1+p)^{p^{n-2}} \not \equiv 1 \mod p^n$. Deduce that $1+p$ is an element of order $p^{n-1}$ in the multiplicative group $\left( \mathbb{Z}/p^n\mathbb{Z} \right)^\times$.

2.3.22 Let $n$ be an integer $\geq 3$. Use the Binomial Theorem to show that $(1+2^2)^{2^{n-2}} \equiv 1 \mod 2^n$ but $(1+2^2)^{2^{n-3}} \not \equiv 1 \mod 2^n$. Deduce that $5$ is an element of order $2^{n-2}$ in the multiplicative group $\left(\mathbb{Z}/2^n\mathbb{Z}\right ) ^\times$.

I glanced over it, saw that when we expanded with the Binomial Theorem, we would get $p^{n-1}p +$ some other terms with $p^{n}$ or higher, and moved on.

Oops! What we actually get is $$(1+p)^{p^n} = 1 + p^n\cdot p + {p^{n} \choose 2} \cdot p^2 + {p^{n} \choose 3} \cdot p^3 + \dotsb + p^{p^n}, $$ and as the number in the bottom of the binomial coefficient grows, how can we know that we still have $n-k$ factors of $p$ in ${p^{n} \choose k}$? This was not a minor error; I had conceived of the problem wrong.

Well, after reading this nice hint on Stack Exchange, I solved it, and I find the answer very beautiful. The crucial idea is to view the raising to the $p^n$ power as successive raising to the $p$ power, and induct** on $n$ in a creative way.

Note that $$(1+p)^p= 1 + p\cdot p + {p \choose 2} \cdot p^2 + \dotsb \equiv 1 \mod p^2, \tag{*}$$ so $(1+p)^p$ mod $p^3$ is $1 + jp^2$ for some $0 \leq j \leq p-1$. OK, how does this help us to get to $(1+p)^{p^2}$ and further? We'd like an induction step based on (*), but it's not in sight.

Well, if we generalize the equation $(*)$, we can help ourselves out. $$(1+kp^r)^p= 1 + p\cdot kp^r + {p \choose 2} \cdot k^2p^{2r} $$ $$+{p \choose 3}\cdot k^3 p^{3r} +\dotsb + {p \choose {p-1}}\cdot k^{p-1}p^{(p-1)r} + k^p p^{pr} \equiv 1 \mod p^{r+1}, $$

so $(1+kp^r)^p$ mod $p^{r+2}$ is $1 + jp^{r+1}$ for some $0 \leq j \leq p-1$. But by what we just did, changing $r$ to $r+1$, any of those when raised to the $p$ power are congruent to $1$ mod $p^{r+2}$. So $(1+kp^r)^{p^2}$ mod $p^{r+3}$ is $1 + jp^{r+2}$ for some $0 \leq j \leq p-1$. Iterating this process gives $(1+kp^r)^{p^l} \equiv 1 \mod p^{r+l},$ which will give us the first part of both problems when we choose $p,k,$ and $r$ correctly.

Now $(1+kp^r)^p \equiv 1+ p\cdot kp^{r}+ {p \choose 2}k^2p^{2r}$ mod $p^{r+2}$. We want to get rid of this last term; two possible assumptions which would allow us to do this are $p>2$ or $r>1$ (note these are the choices in the first and second problem, respectively).

If we make one of those two assumptions, we have $(1+kp^r)^p \equiv 1+ p\cdot kp^{r}=1+kp^{r+1}$ mod $p^{r+2}$. Now by what we just did, we get that $(1+kp^{r+1})^p \equiv 1 + kp^{r+2}$ mod $p^{r+3}$, so by induction we see that $$(1+kp^r)^{p^l} \equiv 1 + kp^{r+l} \mod p^{r+l+1}.$$

Now, specializing to the problems at hand, first we plug in $k=1, p>2, r=1$, then we plug in $k=1, p=2, r=2$:

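For $p$ odd, $$(1+p)^{p^l} \equiv 1 \mod p^{l+1}$$ $$(1+p)^{p^l} \equiv 1 + p^{l+1}\mod p^{l+2}.$$ For $p=2$, $$(1+2^2)^{2^l}\equiv 1\mod 2^{l+2}$$ $$(1+2^2)^{2^l} \equiv 1 + 2^{l+2} \mod 2^{l+3}.$$ Taking $l = n-1$ in the first line and $l = n-2$ in the second gives problem 2.3.21; taking $l = n-2$ in the third line and $l = n-3$ in the fourth gives problem 2.3.22. QED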
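These congruences are easy to spot-check numerically. The following Python sketch (the helper name `mult_order` is my own) computes the multiplicative orders asked for in the two exercises and also tests the general congruence $(1+kp^r)^{p^l} \equiv 1 + kp^{r+l} \mod p^{r+l+1}$ for a handful of small parameters; it is a check on small cases, not a proof.

```python
def mult_order(a, m):
    """Order of a in (Z/mZ)^x by repeated multiplication.

    Assumes m >= 2 and gcd(a, m) = 1, otherwise the loop would not terminate.
    """
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k

# Exercise 2.3.21: 1 + p has order p^(n-1) modulo p^n for p odd.
for p in (3, 5, 7):
    for n in range(1, 5):
        assert mult_order(1 + p, p ** n) == p ** (n - 1)

# Exercise 2.3.22: 5 has order 2^(n-2) modulo 2^n for n >= 3.
for n in range(3, 12):
    assert mult_order(5, 2 ** n) == 2 ** (n - 2)

def general_congruence_holds(p, k, r, l):
    """Check (1 + k p^r)^(p^l) = 1 + k p^(r+l) modulo p^(r+l+1)."""
    modulus = p ** (r + l + 1)
    return pow(1 + k * p ** r, p ** l, modulus) == (1 + k * p ** (r + l)) % modulus

# The general statement needs p > 2 (with r >= 1) or r > 1 (with p = 2).
cases = [(p, k, r, l) for p in (3, 5) for k in (1, 2, 3) for r in (1, 2) for l in range(4)]
cases += [(2, k, r, l) for k in (1, 3) for r in (2, 3) for l in range(5)]
assert all(general_congruence_holds(*case) for case in cases)
```

With $p = 2$ and $r = 1$ the congruence can fail, for example $(1+2)^2 = 9 \equiv 1 \mod 8$ rather than $1 + 4$, which is why the second exercise uses $1 + 2^2$.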

**Shouldn’t the verb be “induce”? That bothers me.

Deriving a Norm From a Set

A few excellent propositions from Fleming's Functions of Several Variables:

Prop 1: Suppose $ \| \cdot \|$ is a norm on $ \mathbb{R}^n$. Then the unit ball about 0, $ \mathcal{B}:=\{ \mathbf{x} : \| \mathbf{x} \| \leq 1 \}$ has the following four properties:

1. It is compact.

2. It is convex.

3. It is symmetric: $ \mathbf{x} \in \mathcal{B} \Rightarrow -\mathbf{x} \in \mathcal{B}$.

4. $ \mathcal{B}$ contains a neighborhood of $ \mathbf{0}$ with respect to the usual metric on $ \mathbb{R}^n$.

Prop 2: Suppose $ K$ is a set with properties 1-4 above. Then the function $ \mathbb{R}^n \to \mathbb{R}$ given by

$$ \displaystyle \mathbf{0}\mapsto 0 $$

$$ \displaystyle \mathbf{x} \mapsto \frac{1}{\max \{t: t\mathbf{x} \in K\}}$$

is a norm.

Prop 3: For $p \geq 1$, the set $ \{\mathbf{x} \in \mathbb{R}^n : \sum_{i=1}^n |x_i|^p \leq 1\}$ has properties 1-4, and the derived norm as in prop 2 is the usual $ p$-norm.

So we have derived Minkowski's inequality indirectly.
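The construction in prop 2 can be tried out numerically. This Python sketch (function names are mine, and the bisection search is my own choice of method for finding $\max\{t : t\mathbf{x} \in K\}$, relying on convexity of $K$ so that membership of $t\mathbf{x}$ is monotone in $t \geq 0$) recovers the $p$-norm from the set in prop 3.

```python
def in_p_ball(p):
    """Membership test for K = {x : sum |x_i|^p <= 1}."""
    return lambda x: sum(abs(c) ** p for c in x) <= 1.0

def gauge(x, member, iters=100):
    """Approximate 1 / max{t >= 0 : t*x in K} by bisection on t.

    Assumes K is compact, convex and contains a neighborhood of 0, so that
    the set of admissible t is an interval [0, t_max] with 0 < t_max < infinity.
    """
    if all(c == 0 for c in x):
        return 0.0
    lo, hi = 0.0, 1.0
    while member([hi * c for c in x]):      # grow hi until hi*x leaves K
        lo, hi = hi, 2 * hi
    for _ in range(iters):                  # invariant: lo*x in K, hi*x not in K
        mid = (lo + hi) / 2
        if member([mid * c for c in x]):
            lo = mid
        else:
            hi = mid
    return 1.0 / lo

def p_norm(x, p):
    """The usual p-norm, for comparison."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0]
y = [0.5, 2.0, -1.5]
for p in (1, 1.5, 2, 3):
    K = in_p_ball(p)
    assert abs(gauge(x, K) - p_norm(x, p)) < 1e-9
    # Triangle inequality for the derived norm, i.e. Minkowski's inequality here.
    s = [a + b for a, b in zip(x, y)]
    assert gauge(s, K) <= gauge(x, K) + gauge(y, K) + 1e-9
```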

#minkowski functional

Uncountable Sums

A friend and I were having a discussion about Lebesgue measure. I attempted to be profound by making the following points:

Analytic geometry has been a fantastic tool, but the concept of representing a continuous "object" as a collection of points is inherently contrived (with a negative connotation). Immediately we run into the paradox that a point has no volume, and yet a collection of many points has volume.

The notion of Lebesgue measure attempts to resolve this essential tension by allowing only countable additivity of the measure. But it does so only by disallowing certain operations (uncountable sums) that intuitively seem reasonable. As such, it is an indispensable tool, but it remains contrived on some level.

My friend countered by saying that uncountable additivity doesn't really make sense anyway, since any uncountable sum that converges must have co-countably many terms zero. But if I recall correctly, we were both searching for a rigorous way of expressing these ideas.

I would say I am still on the fence about this discussion. He makes a good point, but after all, it is exactly the addition of uncountably many zeros that I am concerned with, so the notion that co-countably many of the terms must be zero may not be a decisive objection.

But on the issue of uncountable additivity making sense at all: the other day, while reading Roman's Advanced Linear Algebra, I found a rigorous way of expressing the notion of an uncountable sum via nets, which buttresses my friend's point.

Let $ \{ x_i \}_{i \in I}$ be a collection of any cardinality in a normed vector space $ V$. We say $ \sum_{i\in I} x_i = x$ if for all $ \epsilon >0$, there exists a finite subset $ S_{\epsilon} \subset I$ such that for any finite subset $ T \supset S_{\epsilon}, |\sum_{i \in T}x_i - x|<\epsilon$. Note that this is exactly the definition of the convergence of a net. Here our directed set is the collection of all finite subsets of $ I$, ordered by inclusion, with join being union.

(Actually, I think there is no reason why the above definition can't be extended to $ V$ a topological group. In case $ V$ is non-abelian, I think we'd need to take our directed set to be ordered finite subsets of $ \{x_i\}_{i \in I}$ , with join=concatenation.)

Now we have the following fun propositions:

Prop 1 (Cauchy criterion): Suppose that $ \sum_{i \in I} x_i = x$. Then for any $ \epsilon >0$ there exists a finite subset $ S_{\epsilon} \subset I$ such that if $ T$ is a finite subset of $ I$ such that $ T \cap S_{\epsilon} = \emptyset$, then $ | \sum_{i \in T}x_i |< \epsilon$. If $ V$ is complete, then the converse holds.

Prop 2: Suppose that $ \sum_{i \in I}x_i$ converges. Then cocountably many $ x_i$ are zero.

Proof of prop 2: For each $ n\in \mathbb{N}$, take $ S_{1/n}$ as in prop 1, and let $ S^* := \cup _{n=1}^\infty S_{1/n}$, which is countable as a countable union of finite sets. Then if $ j \notin S^*$, taking $ T = \{j\}$ in prop 1 gives $ |x_j|<1/n$ for all $ n$, so $ x_j = 0$.

#math #infinite sums

A helpful user on Stack Exchange with an interesting name showed me this proof of Minkowski’s inequality, based on convexity rather than passing through Hölder’s inequality.

Here’s a quick note about why an absolutely convergent series can be rearranged.

#math #sequences

I posted on the Courant Institute wiki a solution to a problem about estimating the error in Riemann sums.
