A daring escape past y^2=x^3+x+3 over F14641
I was sent a lovely little paper last June about visualizing elliptic curves over finite fields and you just know I had to draw about it. Anyway now it’s space and piracy is involved!
seen from Indonesia
seen from China

seen from United States
seen from Hong Kong SAR China

seen from United States
seen from China
seen from Canada
seen from Japan

seen from India

seen from Singapore
seen from China

seen from Germany
seen from Lithuania
seen from United States
seen from Malaysia

seen from United States
seen from United States
seen from China
seen from Ukraine

seen from United States
A daring escape past y^2=x^3+x+3 over F14641
I was sent a lovely little paper last June about visualizing elliptic curves over finite fields and you just know I had to draw about it. Anyway now it’s space and piracy is involved!
an attempt towards yet another notion of “a derivative for finite fields”
I meant to do this as a full write-up thing, with pretty pictures and such, but I decided that I’m just going to start saying what I’ve already got, and re-blog this post with additions to this later, so that I actually ever get any of it posted. So, let’s get this post started.
If you have any feedback on the way I have written this (ideally, how to improve my writing for things like this), I would appreciate it!
Part 1 : Preliminaries, the definition, and some basic parts of the results
Part 1, Section 1: Preliminaries
At least two notions of derivatives for functions over finite fields have already been proposed, the Hasse derivative, and the Negacyclic derivative (citation : “Introducing an Analysis in Finite Fields” by de Oliveira and deSouza )
Somewhat like the linked paper, I am partially motivated by the idea of a “unit circle” using finite fields. However, in this case, I am motivated by analogy with complex analysis.
Where q is a prime power (or just a prime) congruent to 3 (mod 4), there is no element x of Fq such that x2=-1 (see ...elsewhere... for a proof of this, or if you just want a citation, not a proof, here or here), and therefore, we can adjoin an element i to the field, such that i2=-1, to produce a field Fq[i] .
Now, for any finite field with characteristic not equal to 2, half of the non-zero elements are the square of some non-zero element, and half are not . The elements which are, are called “quadratic residues” (for example, in Fq where q is 3 mod 4, -1 is not a quadratic residue. However, in Fq[i], it is a quadratic residue, as i2=-1.). (the reason that exactly half of the (non-zero) elements are quadratic residues is because (-x)2=x2, so 2 non-zero elements get mapped to each (non-zero) quadratic residue, so, there are half as many quadratic residues as there are non-zero elements. This doesn’t work in the fields of characteristic 2, because in those fields, -x=x.)
[for the remainder of this post, and probably any other posts in sequence with this one, q will always be assumed to be congruent to 3, mod 4 (unless possibly if stated otherwise, in a later post)]
In this analogy with the complex numbers, we are treating the field Fq as being analogous to the real numbers, and treating the quadratic residues of Fq as being analogous to the positive numbers. This works nicely, as the non-zero non-quadratic residues are exactly the additive inverses (”negatives”) of the quadratic residues. However, the analogy doesn’t extend too far, as the “positive numbers” in this analogy, are not closed under addition (though they are under multiplication and division).
All of what I’ve said in this post so far is definitely things I’ve seen elsewhere, and are well known. In the following I don’t remember exactly which of the things I’ve read elsewhere.
Now, I want a notion of the unit circle in Fq[i] . So, this should be those points z such that |z|2=1 , yeah? So, what is |z|2?
Well, of course, when z=x+y*i for x and y “real numbers”, (by which, in this context, I actually mean elements of Fq, rather than the actual real numbers), then |z|2=|x+iy|2 is of course x2+y2.
So, for how many elements z of Fq[i] is |z|2=1 ?
for any c in Fq , |cz|2=|-cz|2=c2 * (x2 + y2) = c2 * |z|2. This scales |z|2 by a quadratic residue. So, for any element z of Fq[i], there are 2 elements c, -c of Fq that z can be scaled by, such that, if |z|2 is a quadratic residue of Fq (so, if it is “positive”), then |cz|2=1 , and otherwise, such that |cz|2=-1 .
Well, that’s kind of a weird idea, isn’t it. Something having a negative absolute value squared. Nevertheless, we continue.
Oh, one side note : you may be concerned about “how do we know that the ‘absolute value squared’ of one of these elements won’t be zero?”. If this were the case, it would be because x2+y2=0, and so x2=-y2, and that therefore 1=-y2/x2=(x/y)2 and that therefore -1 would be a quadratic residue of Fq , which we know never happens when q is congruent to 3 mod 4. Moving on.
So, each “line through the origin” (i.e. for each fixed nonzero z in Fq[i], the set {c*z : c in Fq}) contains either 2 points with “squared magnitude 1″ or 2 points with “squared magnitude” -1. Because there are q2-1 nonzero elements of Fq[i] , and each of these “lines” has q-1 non-zero points on it, 2 of which with “squared magnitude” 1, or “squared magnitude” -1, there are therefore 2*(q2-1)/(q-1)=2*(q+1) points in Fq[i] with “squared magnitude” either 1 or -1 .
As might or might not be expected, |z1*z2|2=|z1|2 * |z2|2 . (try proving this yourself if you are not convinced.)
Therefore, these 2*(q+1) points are closed under multiplication.
As the non-zero elements of a finite field form a cyclic group under multiplication, these 2*(q+1) elements also form a group under multiplication.
Exactly half of them have “squared magnitude” -1 , and the half that has “squared magnitude” 1, is a subgroup of it.
Therefore, what we are calling our “unit circle” is the q+1 elements of Fq[i] with “squared magnitude” 1. We will call this set S, and the larger set with 2*(q+1) elements , S+/- . We might occasionally call S “S+“ in order to emphasize that we mean the smaller set, but probably not all that often, because using the superscripts is slightly inconvenient.
Part 1, Section 2 : The motivation for the definitions, and the definitions
Now! Now we have the preliminaries out of the way, let’s get on to the actual idea!
The idea is inspired by Cauchy’s Differentiation Formula , in particular, applied when integrating around the unit circle.
Cauchy’s Differentiation Formula states that when f(z) is a complex differentiable function on some simply connected open region of the complex plane, and c is a point in that region, then a contour integral in the counterclockwise direction around c of f(z)/((z-c)(n+1)) , times n!/(2*pi*i), is equal to the n-th derivative of f evaluated at c.
Or, to steal a picture from Wikipedia
(except they called the variable “a” instead of “c”).
My idea was to, unlike in the complex analysis case where this result is a theorem that comes after defining what complex differentiation means, to instead use this formula as a basis for defining a type of derivative for functions over certain finite fields (specifically, those finite fields for which the notion of the unit circle described above, works).
So, to use something analogous to
to define the derivative of f at c, where the contour integral is the unit circle, except translated to be centered at c,.
That is,
when
. Which is,
(Not sure why tumblr scaled up this image more than the other ones. Oh well.)
now, because we are trying to do this in our finite field thing, we of course do not actually have an actual integral, so we just treat it as a sum. Similarly, instead of eiθ for different theta in order to produce the different points on the unit circle, we just make the sum be over the points in the unit circle directly, as we haven’t defined a notion of ez in our finite field.
So, remembering that S is the set we are treating as analogous to the unit circle, we get
And, because the number of elements in our “unit circle” is q+1, and this seems the most natural analogy to the value of 2*pi , and, in addition, q+1 is congruent to 1 mod p (where p is the characteristic of the field), it seems reasonable, in the analogy, to replace the 2*pi term in the denominator there, with 1.
Also, the i term inside the sum cancels with the 1/i as the coeffecient of the entire sum,
and the ((c+s)-c) is of course just s, and (f(c+s)/s2)*s is, of course, just (f(c+s)/s) , this all simplifies down to:
Which is nice and simple.
But, what does this operation actually act like? Is there any good reason that it should deserve the name “differentiation”?
You may want to try computing it out for some simple functions, like 3z , or z2 - 1 or the like, to check that it indeed does produce the values that one would expect for the derivative of such functions.
One can also plainly see that this operation is linear on the vector space over Fq[i] which is the set of functions from Fq[i] to Fq[i] ( so, deriv(a*f + b*g)=a*deriv(f)+b*deriv(g) ) , and also that it is translation invariant, which are both appropriate properties for differentiation.
Now, this operation can be applied to any of these functions from the field to itself, but there are many functions from the complex plane to the complex plane that are not differentiable. As such, I thought to look for a definition of “differentiable” which would be appropriate.
So, for this, I thought to base it on the Cauchy’s Integral Formula
and, by going through the same sort of analogy, we arrive at
as a proposed equation which should hold whenever the function is “differentiable”. I suggest taking this equation as the definition of a function f from Fq[i] to Fq[i] being “differentiable”.
Part 1, Section 3 : Some results
Note that, for the same linearity reasons as with the “differentiation” definition above, the “differentiable” functions form a linear subspace of the full space of functions (as they should).
I have also shown (but don’t include the proof here right now, because I’d like to get this post out tonight, and it is almost midnight) that if a function is “differentiable” in this sense, then the “derivative” of that function is also differentiable in this sense. So, any function which is once differentiable is infinitely differentiable, just like in the actual complex numbers! I think that’s pretty cool.
I have also shown that any polynomial of degree up to q is “differentiable” in this sense, and also that the “derivative” is what one would expect for polynomials for when the degree is at most q. (zq+1 is not “differentiable” in this sense, and zq+2 , if you go through with the definition of the “differentiation” despite it not being “differentiable” results in like, 1 + [the derivative you would expect] , iirc. I might be remembering with an off by one error. Anyway, it has to do with the binomial expansion of stuff having a power of s which is divisible by q+1 . I intend to explain why later. )
I suspect that the “differentiable” functions turn out to be exactly the polynomials of degree at most q, but I have not proven this to be the case. (yes, unfortunately this means that there are no sine and cosine functions here which have sine’(x)=cos(x), cos’(x)=-sine(x) . However, I think this can maybe be recovered in a slightly different context. More on this later when I figure it out maybe?)
I think I’ve shown that, if that is the case, then a “differentiable” function is uniquely determined by its value on the unit circle (or any translate of it), which matches with actual complex analysis quite nicely!
Generally, these definitions appears to give results that mirror complex analysis results quite nicely!
Part 1, Section 4 : etc.
Thank you very much for reading.
Please let me know if you would like clarification on any part of this post, or if you have any advice for how I could improve it, or how I could improve future math things I write.
Thank you!
Finite Fields
Modular Arithmetic
For integers $m$, $n$ and $k$ ($k>1$) we say that $m\equiv{}n\pmod{k}$ if k divides $m-n$.
$6\equiv{}1\pmod{5}$ because $6-1=5=1*5$
$11\equiv{}1\pmod{5}$ because $11-1=10=2*5$
$17\equiv{}2\pmod{5}$ because $17-2=15=3*5$
$3\not\equiv{}1\pmod{5}$ because $3-1=2$ and $5$ does not divide 2
When working mod $k$ we say that $n$, $n+k$ and $n+2k$ are all equivalent. So every integer $n$ is equivalent to one of $\{0,1,2,...,k-1\}$. We find this value by dividing $n$ by $k$ and taking the remainder. We'll denote this by $n\%k$.
We define the group $\mathbb{Z}/k\mathbb{Z}$ as $(\{0,1,2,...,k-1\},\cdot,0)$ where $a\cdot{}b=(a+b)\%k$. Here is an addition table for $k=6$:
$$\begin{array}{c|cccccc} \texttt{+} & 0 & 1 & 2 & 3 & 4 & 5 \\\ \hline 0 & 0 & 1 & 2 & 3 & 4 & 5 \\\ 1 & 1 & 2 & 3 & 4 & 5 & 0 \\\ 2 & 2 & 3 & 4 & 5 & 0 & 1 \\\ 3 & 3 & 4 & 5 & 0 & 1 & 2 \\\ 4 & 4 & 5 & 0 & 1 & 2 & 3 \\\ 5 & 5 & 0 & 1 & 2 & 3 & 4 \end{array}$$
We can also define the monoid $M_{k}=(\{1,2,...,k-1\},\cdot,1)$ where $a\cdot{}b=(ab)\%k$. Here's the multiplication table for the non-zero elements of $\mathbb{Z}/6\mathbb{Z}$
$$\begin{array}{c|ccccc} \times & 1 & 2 & 3 & 4 & 5 \\\ \hline 1 & 1 & 2 & 3 & 4 & 5\\\ 2 & 2 & 4 & 0 & 2 & 4\\\ 3 & 3 & 0 & 3 & 0 & 3\\\ 4 & 4 & 2 & 0 & 4 & 2\\\ 5 & 5 & 4 & 3 & 2 & 1 \end{array}$$
Multiplication by this monoid's elements distributes across addition and thus $\mathbb{Z}_{k}=(\{0,1,2,...,k-1\},+,\times{},0,1)$ is a ring. It's also commutative because integer multiplication is commutative.
In the above table we can see that there's an inverse element for 5 and 1 but not 2, 3, and 4. In general an element a of the monoid $\mathbb{Z}/k\mathbb{Z}$ will have an inverse iff (if and only if) $gcd(a,k)=1$. If $gcd(a,k)>1$ then we say that a is a zero divisor because there exists a $b\neq{}0$ in $\mathbb{Z}/k\mathbb{Z}$ for which $ab=0$. For $\mathbb{Z}_{k}$ to be a field it is necessary and sufficient for every element to have an inverse. This happens if and only if $k$ is prime.
For every prime $p$ there we say that there is only one field of order $p$ up to isomorphism. This means that given any two fields $A$ and $B$ of order $p$ I can construct a bijective function $f:A\rightarrow{}B$ such that for any $a,b\in{}A$ $f(a\times{}b)=f(a)\times{}f(b)$ and $f(a+b)=f(a)+f(b)$. Therefore $B$ is just $A$ with the elements renamed by the function $f$. Since there is only one field of order $p$ up to isomorphism we can talk about the field of order $p$, which we denote GF(p).
Next time we will discuss literally all of the other finite fields.
Finite fields are as basic to mathematics as the telescope is to astronomy.
Mark Ronan, _symmetry and the monster_
another odd thing with finite fields
[EDIT: I think I’ve now basically figured out how to show why this is true. It all comes down to the sum of x^n over all x in F_p being -1 if n is a positive multiple of p-1 and 0 otherwise, some algebra, and an alternating sum of binomial coefficients.... I think.] [EDIT2: Yeah, I had made a small error in showing why, but I fixed it, and it is fairly straightforwards, and the description in the first edit is basically correct.]
it seems (though I haven’t proven it) that if one takes the polynomial which is the first p terms of the taylor series for exp(x), and evaluates it at n*i in Fp[i] for n from 0 to p-1 (if p is congruent to 1 mod 4 instead of to 3, then instead of getting a field when adjoining an element i to Fp such that i2=-1, one gets a ring which has a zero divisor, but this does not impede doing this calculation, and the result appears to be the same) and then takes the “norm squared” of each of these (the sum of the squares of the real and imaginary parts), and then adds all of these up over each n from 0 to p-1, it appears that one always gets -1 (or if you prefer, p-1 ).
(this pseudo-exp function can’t give what I’ve been calling the unit circle for Fp[i] , like the actual exp function does for i*x for real x, because there are one too few values in Fp to fit all of them, and so it couldn’t possibly be that the values are exactly the ones with “norm squared” equal to 1, and indeed I haven’t seen them have that as the “norm squared” except at n = 0 )
This seems odd to me, and I have no real explanation for why this should be the case.
Note again that I haven’t proven this, so possibly I’m wrong and it doesn’t always hold. I’ve only tested this for primes up to 19.
I am partially writing this post so I don’t forget about it.
Conjecture: the space of functions in the fields Fq[i] which I’ve discussed which are “differentiable” in the way I’ve defined, is, rather than being q dimensional, is instead p-1 + p*((q^2 - 1)/(p^2 - 1)) dimensional .
I’ve checked programmatically that this works for q=3,7,11, and 27. For q=3,7,11, and in general whenever q = p a prime, this expression simplifies to 2p-1 .
However, to check this programmatically for the next smallest prime power which is not a prime, well, that would be for q=3^5 , and would involve doing row reduction mod 3 on a 3^10 by 3^10 matrix, which, as it would have a little over 3 billion entries, I think might take like 2-5 days to run with the not-optimized code I wrote. Or, possibly be entirely infeasible, depending on how the algorithm scales, which I haven’t checked carefully. The algorithm is an entirely naive algorithm, and might have bad runtime complexity. Actually, yeah, I think it might have runtime complexity of, best case around q^2 * (q^2 * (q^2 - 1))/2 , so, O(q^6) , but probably a bit worse. so, with q=3^5, that has 3^30 . Comparing that to the q=3^3 case, which has 3^18 . Ok, so it should probably take around 3^12 times as long? And, it took less than a minute to run, probably close to half a minute. 1 minute * 3^12 / (60 (minute/hour) * 24(hour/day)) is about 369 days. Ah. That’s more than 5. Well, this is only an estimate. Maybe I will try partially running it to see fast it gets through parts of the task.
The ((q^2 - 1)/(p^2 - 1)) part of the expression is nice. It feels promising. Where q = p^r , it is [r]_{p^2} . If we take Fq[i] as a vector space over Fp[i], ((q^2 - 1)/(p^2 - 1)) is the number of 1d subspaces of it.
ok, I've shown that for any finite vector space V (with dimension at least 1) over the field Z/pZ with p an odd prime, and any set S ⊆ V s.t. |S| ≡ 1 (mod p), and where S doesn't contain 0, and is closed under negation, the matrix M indexed by elements of V such that M(x,y) =( -1 if x-y=0, 1 if x-y ∈ S, 0 otherwise) has determinant ≡ 0 (mod p)
In the way I defined “differentiable” for certain functions in certain finite fields earlier, a function being “differentiable” ends up being equivalent to it being in the kernel of a matrix that has these properties. So, I already knew that these matrices had determinant 0 in that case, but I’m hoping that this alternate way of showing it will be helpful for showing things about the rank of these matrices, by pointing to a way of finding the invariant factors of the matrices, by finding the determinants of their minors.
The way I showed this is by the definition of the determinant of a matrix as: ∑σsign(σ)∏(v ∈ V)M(v,σ(v)) . I split this into 2 cases, those for which σ is invariant under translation all elements of V, and those for which it is not.
For M with the properties described above, translating σ by an element c of V does not change the value of r(σ) := sign(σ)∏(v ∈ V)M(v,σ(v))
(To clarify, what I mean by “translation” is that, the translation of σ by c in V is σc such that for all x in V, σc(x)=σ(x-c)+c, equivalently, σc(x+c)=σ(x)+c . By “aren’t changed by” I mean that σc = σ . )
“permutation σ1 is a translation of σ2 ” is an equivalence relation, and the elements of each equivalence class can be put into correspondence with a subspace of V (or, better, a quotient of V by the subspace of V under which σ1 (or, equivalently, σ2) is invariant under translation).
For the permutations σ of V which are not left unchanged by all translations, these equivalence classes will each have a number of elements divisible by p.
and, again, r(σ) is constant on each of these equivalence classes. So, summing r(σ) over all the σ which aren’t invariant under all translations, has a sum of 0 (mod p) .
Now, the other σ, which aren’t changed by any translation, are such that for all v in V, σ(v)=v+σ(0). (this should be a fairly simple exercise for the reader to show. Just apply the definition I give of “translation” above.) And, the only ones such that r(σ)≠0 are the ones where either σ(0)=0, or σ(0)∈S . Because for the product ∏(v ∈ V)M(v,σ(v)) to be non-zero, must have M(v,σ(v)) non-zero for all v in V, and so much have v-σ(v) either 0 or in S, and in particular, must have 0-σ(0)=-σ(0) be either 0 or in S, and so, as S is closed under negation, σ(0) must be either 0 or in S.
For the translation-invariant permutations σ such that σ(0)∈S , r(σ)=1 , because the product part ∏(v ∈ V)M(v,σ(v)) is a product of all 1s, and the sign(σ) part, well, σ is comprised of |V|/p cycles, each of length p, and a cycle with an odd length has sign 1, and so the whole permutation also has sign 1.
The identity permutation has sign 1 as well, but has ∏(v ∈ V)M(v,σ(v)) = ∏(v ∈ V)M(v,id(v)) = ∏(v ∈ V)M(v,v) = ∏(v ∈ V)(-1) = (-1)|V| = -1 . So, r(id)=-1 .
For the translation invariant permutations for which σ(0) is neither 0 nor in S, r(σ)=0. So, the sum over the translation-invariant permutations σ of r(σ) is 1*|S| + (-1)*1. And, |S| ≡ 1 (mod p), and so this sum is ≡ 1 + (-1) = 0 (mod p). So, adding this part together with the parts that aren’t translation invariant, which was also ≡ 0 (mod p) therefore, ∑σ r(σ) = ∑σsign(σ)∏(v ∈ V)M(v,σ(v)) ≡ 0 (mod p) i.e. the determinant of M is ≡ 0 (mod p) , as claimed.
I skipped describing some of the steps of the argument in this post, but the argument above is the gist of it.
Will a similar method allow me to show anything about the minors of such a matrix M? I guess I will find out.
More complex analysis over finite fields
So, I’ve been looking a bit more at the “what if you try to do complex analysis-y stuff but with finite fields”. (for most recent previous post on this topic, see [here] , and for all my posts on the topic, see : https://tilde-he.tumblr.com/tagged/Finite-Fields , and especially https://tilde-he.tumblr.com/tagged/finite-field-derivative ) As part of trying to see what happens if we have the “curve” we “integrate over” be something other than “the unit circle centered about the point we are concerned with”, as one of the simpler possible cases, I thought to try this with just, a circle centered around a different point.
This led to me attempting to evaluate this sum:
and, as part of that, finding that
. Unless I’ve made an error, this actually holds whenever S is the set of all powers of a (q+1)-th root of unity in some field, for any odd q > 0, not just in the fields relevant to this context.
But, in our context of q a prime power congruent to 3 mod 4, and working in the field Fq[i], we can use that to find that
.
So, if we wanted to evaluate the analogy of
for gamma the unit circle around 0, and c0 some point not on the unit circle, we end up with
.
If we plug in 0 for c0 in this, we get 1, as expected. If we plug in a point on the unit circle, we get an undefined result, also as expected.
If we plug in a point in S+/- which is not in S, i.e. a point with “squared magnitude -1“, then we get an amusing result. The set S+/- , which we might call “the extended unit circle”, is, like the unit circle S, a cyclic group under multiplication, and it has 2*(q+1) elements. Every element of it which is not an element of S is a generator, and so has multiplicative order 2*(q+1). Raising such an element to the power q+1 results in an element with multiplicative order 2, i.e. results in -1 .
So, if we pick c0 to be on the extended unit circle, but not on the unit circle, this sum gives a result of 1/(1 - (-1)) = 1/2 .
Amusing!
With the actual integral, in the actual complex numbers, if we picked a c0 inside the unit circle, we would get a result of 1, and if it was outside the unit circle, we would get a result of 0. Here, we picked an element which isn’t quite on the unit circle, but is kind of as if it was, and we get a result of 1/2.
It is as if we get an answer of “well, it is right in the middle of being inside and being outside of the unit circle”.
Now, I’m not sure that that interpretation should be considered “true” exactly, seeing as putting in other values of c0 don’t all give either 0,1, or 1/2, and, in fact, the result is never 0, so interpreting it this way seems a bit questionable.
But personally, I find this interpretation to be rather funny, and kinda cool.
Oh, also, remembering that xq+1 is like |x|2 , notice that this says that that, the result of that “integration” is 1/(1 - |x|2) . This seems to sort of suggest that the only point that is definitively “inside” the unit circle is 0, and that all the other points in the field are perhaps, different levels of intermediate between being inside and outside of it, where the levels correspond to the different values of the “norm squared” of the different elements.
This weirdness, while pretty cool imo, seems to me to suggest that, unfortunately, the analogies with complex analysis probably don’t carry quite as far as I’m hoping. But I still think I can push them a bit further still.
As always, comments, feedback, and requests for clarification, are all welcome.