A Möbius transformation is defined as follows (Wikipedia):
In geometry and complex analysis, a Möbius transformation of the complex plane is a rational function of the form
\[f(z) = \frac{az + b}{cz + d}\]
of one complex variable \(z\); here the coefficients \(a, b, c, d\) are complex numbers satisfying \(ad − bc ≠ 0.\)
In order to plot this transformation using Cartesian coordinates on a computer platform the real and imaginary components will need to be separated.
Using this post:
If
(so \(Z, A, B, C, D\) are complex numbers, while \(x, y, a_r, a_i, b_r, b_i, c_r, c_i, d_r, d_i\) are real numbers)
\[\begin{align} f(Z) &= \frac{AZ+B}{CZ+D}\\[3ex] &=\frac {(a_r x - a_i y) + (a_i x + a_r y)\, i + b_r + b_i\, i} {(c_r x - c_i y) + (c_i x + c_r y)\, i + d_r + d_i\, i} \\[3ex] &=\frac {(a_r x - a_i y + b_r) + (a_i x + a_r y + b_i)\, i } {(c_r x - c_i y + d_r) + (c_i x + c_r y + d_i)\, i} \\[3ex] &= \left( \frac {(a_r x - a_i y + b_r)(c_r x - c_i y + d_r) + (a_i x + a_r y + b_i)(c_i x + c_r y + d_i) } {(c_r x - c_i y + d_r)^2 + (c_i x + c_r y + d_i)^2} \right) \\[3ex] &\phantom{=}+ \left( \frac {(a_i x + a_r y + b_i)(c_r x - c_i y + d_r) - (a_r x - a_i y + b_r)(c_i x + c_r y + d_i) } {(c_r x - c_i y + d_r)^2 + (c_i x + c_r y + d_i)^2 } \right) i \end{align}\]
This is implemented here.
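As a sketch (not the post's actual implementation), the separation into real and imaginary parts above can be coded with real arithmetic only, and cross-checked against ordinary complex division; the function name and sample coefficients are my own:

```python
# Evaluate f(z) = (A z + B)/(C z + D) using only real arithmetic, as needed
# for a Cartesian plot, following the formulas derived above.
def mobius_real(x, y, ar, ai, br, bi, cr, ci, dr, di):
    # Numerator: (a_r x - a_i y + b_r) + (a_i x + a_r y + b_i) i
    u = ar * x - ai * y + br
    v = ai * x + ar * y + bi
    # Denominator: (c_r x - c_i y + d_r) + (c_i x + c_r y + d_i) i
    p = cr * x - ci * y + dr
    q = ci * x + cr * y + di
    denom = p * p + q * q
    # (u + v i)/(p + q i) = ((u p + v q) + (v p - u q) i) / (p^2 + q^2)
    return (u * p + v * q) / denom, (v * p - u * q) / denom

# Cross-check against Python's built-in complex division for sample coefficients.
A, B, C, D = 1 + 2j, 3 + 0j, 1j, 2 + 0j
z = 0.7 - 0.4j
re, im = mobius_real(z.real, z.imag, A.real, A.imag, B.real, B.imag,
                     C.real, C.imag, D.real, D.imag)
w = (A * z + B) / (C * z + D)
assert abs(w.real - re) < 1e-12 and abs(w.imag - im) < 1e-12
```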
A modular form \(f\) is an analytic function defined with a weight \(k\) and a modular group.
The transformation of the domain of the modular form is the action of the modular group. See here.
Restricting \(\text{GL}_2(\mathbb Z)\) to the special linear group \(\text{SL}_2(\mathbb Z)=\left\{\begin{bmatrix}a&b\\c&d\end{bmatrix}\in M_2(\mathbb Z): ad-bc=1\right\}\) acting on points on the upper half of the complex plane:
\[\text{SL}_2(\mathbb Z)\require{HTML} \style{display: inline-block; transform: rotate(-270deg)}{\circlearrowright} \tau\in \mathcal H\]
with \(\mathcal H=\{x+iy: y >0\}\).
This action is a linear fractional transformation (a Möbius transformation):
\[\begin{bmatrix}a&b\\c&d\end{bmatrix}\tau = \frac{a\tau + b}{c\tau+d}\] Since \(\tau \in \mathcal H,\) the result of the transformation will also be in the upper-half plane due to the result:
\[\Im\left( \frac{a\tau + b}{c\tau+d}\right)=\frac{(ad-bc)\,\Im(\tau)}{\vert c\tau +d\vert^2}\]
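This identity can be sanity-checked numerically (an illustration, not from the post; the sample matrix and point are mine):

```python
# Check Im((a*tau + b)/(c*tau + d)) = (ad - bc) * Im(tau) / |c*tau + d|^2
# for a sample SL_2(Z) matrix and a point in the upper half-plane.
a, b, c, d = 2, 1, 3, 2            # det = 2*2 - 1*3 = 1, so in SL_2(Z)
tau = -0.3 + 0.8j                  # a point in the upper half-plane
w = (a * tau + b) / (c * tau + d)
rhs = (a * d - b * c) * tau.imag / abs(c * tau + d) ** 2
assert abs(w.imag - rhs) < 1e-12
assert w.imag > 0                  # the image stays in the upper half-plane
```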
Möbius transformations, i.e. maps of the form \(f(z) = \frac{az+b}{cz+d}\), are the projective transformations of the complex projective line. They form a group called the Möbius group, which is the projective linear group \(\operatorname{PGL}(2, \mathbb C)\). The modular group is generated by the two following matrices: \(S=\begin{bmatrix}0&-1\\1&0\end{bmatrix}\) and \(T=\begin{bmatrix}1&1\\0&1\end{bmatrix}\).
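As a quick sanity check of these generators (illustrative code, not from the post), the standard relations \(S^2 = -I\) and \((ST)^3 = -I\) can be verified with integer matrix arithmetic:

```python
# 2x2 integer matrix product; matrices are ((a, b), (c, d)).
def mul(m, n):
    (a, b), (c, d) = m
    (A, B), (C, D) = n
    return ((a * A + b * C, a * B + b * D), (c * A + d * C, c * B + d * D))

S = ((0, -1), (1, 0))
T = ((1, 1), (0, 1))
negI = ((-1, 0), (0, -1))

assert mul(S, S) == negI                 # S^2 = -I
ST = mul(S, T)
assert mul(ST, mul(ST, ST)) == negI      # (S T)^3 = -I
```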
The complete definition of a modular form of weight \(k\) for \(\text{SL}_2(\mathbb Z)\) is a function in \(\mathcal H\) satisfying:
\(f\) is holomorphic (analytic, i.e. there is a local power series expansion at each point of \(\mathcal H\))
Modularity condition:
\[f\left( \frac{a\tau + b}{c\tau+d}\right)= (c\tau + d)^k\; f(\tau)\quad \forall \begin{bmatrix}a&b\\c&d\end{bmatrix}\in \text{SL}_2(\mathbb Z),\; \tau\in\mathcal H\]
Since this applies to all matrices in the group, it follows that it applies to \(T=\begin{bmatrix}1&1\\0&1\end{bmatrix}\), and hence:
\[f\left(\frac{1\cdot \tau + 1}{0\cdot \tau + 1}\right)=f(\tau+1)=(0\cdot \tau +1)^k\,f(\tau)=f(\tau)\] Therefore \(f(\tau +1)=f(\tau),\) and the function is periodic.
From the matrix \(S=\begin{bmatrix}0&-1\\1&0\end{bmatrix}\) we can conclude that
\[f\left(\frac{0\cdot \tau -1}{1\cdot \tau +0} \right)=f\left(-1/\tau\right)=\tau^k\,f(\tau)\]
A \(\tau\) in the upper-half plane outside the unit semicircle will be transformed by \(-1/\tau\) into a point within the unit semicircle (less than \(1\) in modulus). Take a point \(\tau = r\,e^{i\theta};\) its reciprocal is \(1/\tau = \frac 1r\,e^{-i\theta}\) (reciprocal modulus, negated argument). The introduction of a negative sign is equivalent to multiplying by \(-1 = e^{i\pi}\), yielding \(-1/\tau = \frac 1r\, e^{i(\pi-\theta)}\) (same modulus as \(1/\tau\), argument \(\pi -\theta\)).
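The polar description of \(\tau \mapsto -1/\tau\) can be checked with `cmath` (an illustrative example; the sample point is mine):

```python
import cmath

# For tau = r e^{i theta}, -1/tau has modulus 1/r and argument pi - theta.
r, theta = 2.0, cmath.pi / 3
tau = r * cmath.exp(1j * theta)      # outside the unit circle, upper half-plane
w = -1 / tau
assert abs(abs(w) - 1 / r) < 1e-12                        # modulus 1/r (now inside)
assert abs(cmath.phase(w) - (cmath.pi - theta)) < 1e-12   # argument pi - theta
```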
Finally, considering the matrix \(-I=\begin{bmatrix}-1&0\\0&-1\end{bmatrix}\) (note that \(\begin{bmatrix}-1&0\\0&1\end{bmatrix}\) has determinant \(-1\) and is therefore not in \(\text{SL}_2(\mathbb Z)\)) we get that
\[f\left(\frac{-1\cdot \tau +0}{0\cdot \tau -1} \right)=f\left(\tau\right)=(0\cdot \tau -1)^k\,f(\tau)=(-1)^k\,f(\tau)\] which implies that if \(k\) is odd the function has to be identically zero, i.e. nonzero modular forms for \(\text{SL}_2(\mathbb Z)\) have even weights.
This is explained here.
In the LMFDB, modular forms are classified according to weight (\(k\)) and level, which is a positive integer \(N\) such that \(f\) is a modular form on a subgroup \(\Gamma\) of \(\operatorname{SL}_2(\mathbb{Z})\) that contains the principal congruence subgroup \(\Gamma(N)\).
For instance, take the elliptic curve \(y^2+y=x^3-x^2\) that Edward Frenkel presents in Numberphile here. It turns out that you can find the curve by plugging the equation into the field “Label or coefficients”, which returns 11.a3 (Cremona label 11a3) with the corresponding modular form within the information about the elliptic curve:
\[q - 2\,q^2 - q^3 + 2\,q^4 + q^5 + 2\,q^6 - 2\,q^7 - 2\,q^9 - 2\,q^{10} + q^{11} - 2\,q^{12} + \cdots\\[2ex]=q\,(1-q)^2\,(1-q^{11})^2\,(1-q^2)^2\,(1-q^{22})^2\,(1-q^{3})^2\,(1-q^{33})^2\cdots\]
The plotting can be carried out in the unit disk (hyperbolic space with the Poincaré disk model) (see here).
To understand what it represents we need to define the fundamental domain: the fundamental domain is a closed subset \(D \subset X\) such that \(X\) is the union of translates of \(D\) under the action of the group \(G\):
\[X = \bigcup_{g\in G} \;g\,D\] Since the modular group is generated by \(T\) and \(S\), the fundamental domain and its copies can be found as
\[A_{n+1} = \{A_n \cdot T,\; A_n \cdot S,\; A_n \cdot T^{-1},\; A_n \cdot S^{-1}\}\]
with \(A_0\) being equal to the original fundamental domain.
Let \(z \in \mathcal H\). We call the order of \(z\), denoted \(\operatorname{ord}(z)\), the smallest number of transformations (among \(S, T, T^{-1}\)) needed to transform \(z\) into a complex number in the fundamental domain. Equivalently, \(\operatorname{ord}(z)\) is the minimal number of Möbius transformations needed to transform the fundamental domain into the copy of itself that includes \(z\).
If the order of a complex number in the complex half-plane is even, it can be represented in black. This can already result in a nice black and white alternating image.
Using the code in this Mathematics SE post, the disk plot of the modular form 11.2.a.a of level \(11\) and weight \(2\) can be created in SageMath, first confirming we have the right form:
lv = 11
wt = 2
ModularForms(11, 2).basis()[0]
q - 2*q^2 - q^3 + 2*q^4 + q^5 + O(q^6)
Here is the standard Cartesian plot:
And here is the Poincaré disk:
The patterns you see in the Poincaré disk plot are essentially visualizations of the equivalence classes of points under the modular group: the fundamental domain (FD) tiles the upper half-plane, and the “fractal” patterns arise from the intricate way this tessellation maps to the Poincaré disk.
Symmetry in the Disk: Even though we’re calculating in the FD, the symmetries of the modular form are reflected in the Poincaré disk plot.
The colors in the disk are based on the information that was calculated from the FD, but the placement of those colors is determined by the initial grid that was created in the Poincaré disk.
The visual advantage of the distortion introduced by mapping the upper half-plane (H) to the Poincaré disk is primarily about compactness and global visualization. Here’s a breakdown:
Infinity to Boundary: The upper half-plane extends infinitely in all directions. The Poincaré disk, on the other hand, is a finite, bounded region. This allows us to represent the entire upper half-plane (or at least a large portion of it) within a finite space.
Complete View: This compact representation makes it possible to visualize the global structure of modular forms and their symmetries in a single, coherent image.
Visualizing Cusps: The “cusp” of the upper half-plane (infinity) is mapped to the boundary of the Poincaré disk. This allows us to visualize the behavior of modular forms near the cusp, which is often crucial for understanding their properties.
Conformal Mapping: The mapping from H to the Poincaré disk is a conformal mapping, which means it preserves angles. This is important because it preserves the local geometric properties of the modular forms.
Global Symmetries: Even though the shapes of the tiles are distorted, the overall symmetries of the modular forms are still visible. The patterns in the Poincaré disk plot reflect the symmetries of the modular group and the modular form itself.
Whole Picture: The distortion allows the viewer to see the whole picture. If the image was created in the upper half plane, then the image would need to be infinitely large to display the same information.
Circular Boundary: The circular boundary of the Poincaré disk provides a visually appealing and natural frame for the plot.
Symmetry and Harmony: The circular symmetry of the disk often enhances the visual harmony of the patterns, making them more aesthetically pleasing.
Finite Domain: Working within a finite domain (the Poincaré disk) can sometimes simplify numerical computations and plotting algorithms.
\[h=\frac{1 - iz}{z - i}\]
This map sends the unit disk to the upper half-plane; up to a rotation of the disk, it is the inverse of the standard transformation from the upper half-plane to the Poincaré disk:
\[w = \frac{z - i}{z + i}\]
As a result of this inverse transformation, the points will appear roughly circular, but skewed and distorted. This is the effect of a Möbius transformation.
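How these two maps interact can be checked numerically (an illustration; the observation that the composition is the rotation \(z \mapsto -iz\) is mine, not from the post):

```python
# h maps the unit disk to the upper half-plane; w is the standard
# half-plane-to-disk map. Composing gives a rotation of the disk.
h = lambda z: (1 - 1j * z) / (z - 1j)
w = lambda z: (z - 1j) / (z + 1j)

for z in (0.3 + 0.2j, -0.5j, 0.8):
    assert h(z).imag > 0                       # h sends the open disk into H
    assert abs(w(h(z)) - (-1j) * z) < 1e-12    # w(h(z)) = -i z, a rotation
```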
The specific form of DtoH used in the code, DtoH(x) = (-I * x + 1) / (x - I), is a particular case of a more general class of transformations called Möbius transformations (or fractional linear transformations). Here’s the breakdown of why this works and the underlying principles:
Möbius Transformations: Möbius transformations are functions of the form:
\[f(z) = \frac{az + b}{cz + d}\]
where \(a, b, c\), and \(d\) are complex numbers, and \(ad - bc \neq 0\) (this condition ensures the transformation is invertible). Möbius transformations have several important properties: they map circles and lines to circles and lines. This is crucial because the boundary of the unit disk is a circle, and the boundary of the upper half-plane (the real axis) is a line. They are also conformal and bijective. To map the unit disk to the upper half-plane, you need to find a Möbius transformation that takes the boundary of the unit disk (\(|z| = 1\)) to the real axis \(\operatorname{Im}(z) = 0\). There are infinitely many such transformations; the specific DtoH transformation used in the code is just one example.
The fundamental domain is designed to contain exactly one representative from each equivalence class under the action of a group. Points are considered equivalent if they can be transformed into each other by a group action. If multiple equivalent points fall within the fundamental domain, it violates this principle of unique representation.
Let’s consider a simplified example to illustrate the idea. Imagine a group that acts on the plane by rotations of multiples of \(90\) degrees around the origin. We want to define a fundamental domain for this group action. If we naively choose the entire plane as our “fundamental domain,” then we clearly have multiple equivalent points. For example, a point at \((1,0)\), \((0,1)\), \((-1,0)\), and \((0,-1)\) are all equivalent under the rotation group, but they are all distinct points in the plane. A better choice for a fundamental domain would be, say, the region defined by \(0 \leq \theta < 90\) degrees.
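The rotation example can be sketched in code (illustrative; the helper name `representative` is mine): reduce any nonzero point to its representative with argument in \([0°, 90°)\), and check that the four equivalent points land on the same representative.

```python
import cmath, math

def representative(z):
    # Rotate by multiples of 90 degrees (multiply by i) until the
    # argument lies in [0, pi/2): the chosen fundamental domain.
    while not (0 <= cmath.phase(z) < math.pi / 2):
        z *= 1j
    return z

# (1,0), (0,1), (-1,0), (0,-1) are all equivalent under the rotation group:
reps = {representative(z) for z in (1 + 0j, 1j, -1 + 0j, -1j)}
assert len(reps) == 1 and abs(reps.pop() - 1) < 1e-12
```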
The region near the origin in the upper half-plane is a region where these group actions can “overlap” or “fold over” in a way that causes multiple equivalent points to fall within it. The transformations that define the group action (e.g., the modular group or related groups) often involve a combination of scaling, inversion, and translation in the complex plane. Consider the transformation \(z \to -1/z\), which is part of the modular group. This transformation inverts points and reflects them across the imaginary axis. Points close to the origin are mapped to points far away, and vice-versa. This kind of transformation can cause significant “folding” near the origin.
Here is the appearance of the final position of the dots in the FD:
This transformation is based on the group action \(\circlearrowright\) on points \(\tau \in \mathbb H.\) The matrix \(\small \begin{bmatrix}1&-1/2\\0&1\end{bmatrix}\) will bring points along the positive real axis towards the left via the action:
\[\begin{bmatrix}a&b\\c&d\end{bmatrix}\circlearrowright\tau = \frac{a\tau + b}{c\tau+d}=\frac{\tau -1/2}{1}=\tau - \frac12\]
and if they are less than \(1\) unit from the origin, they will be reflected outside the unit circle with the transformation \(-1/\tau,\) corresponding to the action of the matrix \(\small \begin{bmatrix}0&-1\\1&0\end{bmatrix}\):
\[\begin{bmatrix}a&b\\c&d\end{bmatrix}\circlearrowright\tau = \frac{a\tau + b}{c\tau+d}=\frac{ -1}{\tau}\]
Modular forms have a Fourier expansion, often called the \(q\)-expansion, where \(q\) is related to \(\tau\) (a complex number in the upper half-plane) by \(q = \exp(2\pi i\tau)\). The polynomial (which is an approximation of the modular form) is being evaluated at this \(q\) value. This is because the modular form is often expressed and computed in terms of its \(q\)-expansion.
With the FD comprised between \(-1/2\) and \(1/2\) on the real axis, and between \(0\) and roughly \(2\) on the imaginary axis (see plot above), the transformation \(\exp(2\pi i \tau)\) of \(\tau = a + bi\) will be \(\exp(2\pi i (a+bi))= \exp(-2\pi b) \exp(2\pi i a)\), whose modulus can be very small for a value with an imaginary part slightly above \(1\): for example, for \(b = 2\), \(\exp(-2 \pi \cdot 2) \approx 3.5\times 10^{-6}\), leading to potential numerical instability and the need for high-precision arithmetic libraries. On the other hand, small values of \(q\) can make the infinite series decay fast, leading to more accurate calculations with fewer terms.
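The magnitude claim is easy to verify (a small illustrative check; the helper name `q` is mine):

```python
import cmath, math

def q(tau):
    # q-expansion variable: q = exp(2 pi i tau)
    return cmath.exp(2j * math.pi * tau)

# |q| = exp(-2 pi b) for tau = a + b i, independent of a;
# at b = 2 this is already about 3.5e-6, as noted above.
assert abs(abs(q(0.3 + 2j)) - math.exp(-4 * math.pi)) < 1e-12
assert abs(q(0.3 + 2j)) < 3.6e-6
```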
Here is the clustering around small values that takes place when this transformation is carried out:
The pullback function’s purpose is to find a matrix \(\gamma = \begin{bmatrix}a&b\\c&d\end{bmatrix}\) in \(\text{SL}(2, \mathbb Z)\) such that \(\gamma(z)\) (the action of \(\gamma\) on \(z\)) lies within the fundamental domain. Crucially, the function returns both the transformation \(\gamma\) and the transformed value \(z\). However, it constructs the transformation matrix so that it represents the inverse of the transformation needed to bring \(z\) to the fundamental domain. This is done so that when you apply \(\gamma\) to the transformed \(z\) (which is already in the fundamental domain), you can return to the original \(z\).
Modularity factor: the modularity factor is \((cz + d)^k\). But because the pullback function returns the inverse transformation, and because matrix multiplication is not commutative, the code needs to use the correct modularity factor.
Let’s say the transformation that takes \(z\) into the fundamental domain is represented by the matrix \(\begin{bmatrix}A&B\\C&D\end{bmatrix}\). Then the modularity factor when evaluating the modular form at the transformed point would be \((Cz + D)^k\). The pullback function, however, returns the inverse transformation matrix, \(\begin{bmatrix}D&-B\\-C&A\end{bmatrix}\) (remember that the special linear group has determinant \(1\)). So, if we were to apply the inverse transformation to a point already in the fundamental domain to get the original point \(z\), the modularity factor would be \((-Cz + A)^k\).
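This inverse-matrix bookkeeping can be illustrated with a small numeric check (a sketch with a sample matrix of my choosing, not the post's code): for \(\gamma=\begin{bmatrix}A&B\\C&D\end{bmatrix}\) of determinant \(1\), the inverse \(\begin{bmatrix}D&-B\\-C&A\end{bmatrix}\) undoes the action, and its automorphy factor \(-Cw+A\) at \(w=\gamma z\) is the reciprocal of \(Cz+D\).

```python
# gamma = [[A, B], [C, D]] with det = 1; inverse is [[D, -B], [-C, A]].
A, B, C, D = 2, 1, 1, 1              # det = 2*1 - 1*1 = 1
z = 0.2 + 1.3j
w = (A * z + B) / (C * z + D)        # gamma acting on z
z_back = (D * w - B) / (-C * w + A)  # inverse matrix acting on w
assert abs(z_back - z) < 1e-12
# Factor of the inverse: j(gamma^-1, gamma z) = 1 / j(gamma, z)
assert abs((-C * w + A) - 1 / (C * z + D)) < 1e-12
```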
Here is the code:
# https://math.stackexchange.com/a/4309925/152225
import cmath

Htoq = lambda x: exp(2 * CDF.pi() * CDF.0 * x)
DtoH = lambda x: (-CDF.0 * x + 1) / (x - CDF.0)

C.<t> = CC[]

lv = 11
wt = 2
M4 = ModularForms(lv, wt)
f = M4.basis()[0]
coeffs = f.coefficients(list(range(20)))
fpoly = C(coeffs)

def in_fund_domain(z):
    x = z.real()
    y = z.imag()
    if x < -0.51 or x > 0.51:
        return False
    if x*x + y*y < 0.99:
        return False
    return True

def act(gamma, z):
    a, b, c, d = gamma
    return (a*z + b) / (c*z + d)

def mult_matrices(mat1, mat2):
    a, b, c, d = mat1
    A, B, C, D = mat2
    return [a*A + b*C, a*B + b*D, c*A + d*C, c*B + d*D]

Id = [1, 0, 0, 1]

def pullback(z):
    """
    Returns gamma, w such that gamma(z) = w and w is
    (essentially) in the fundamental domain.
    """
    z = CDF(z)
    gamma = Id
    count = 1
    while not in_fund_domain(z):
        count += 1
        x, y = z.real(), z.imag()
        xshift = -floor(x + 0.5)
        shiftmatrix = [1, xshift, 0, 1]
        gamma = mult_matrices(shiftmatrix, gamma)
        z = act(shiftmatrix, z)
        if x*x + y*y < 0.99:
            z = -1/z
            gamma = mult_matrices([0, -1, 1, 0], gamma)
    return gamma, z

#def smart_compute(z):
#    gamma, z = pullback(DtoH(z))
#    a, b, c, d = gamma
#    scale = 1000
#    return (-c*z + a)**wt * fpoly(Htoq(z)) * scale

def smart_compute(z, scale=1e6, log_scale=100):  # added log_scale
    gamma, z = pullback(DtoH(z))
    a, b, c, d = gamma
    value = (-c*z + a)**wt * fpoly(Htoq(z)) * scale
    if abs(value) > 0:
        return cmath.log(abs(value) * log_scale) * cmath.exp(cmath.phase(value)*1j)
    else:
        return 0

pts = 300
P = complex_plot(
    lambda z: 0 if abs(z) >= 0.9
    else smart_compute(z) * exp(1.2 * CDF.pi() * CDF.0),
    (-1, 1), (-1, 1), aspect_ratio=1, figsize=[8, 8],
    plot_points=pts)
P.axes(show=False)
P
Weierstrass wanted a function that was “inherently” periodic. If you want a function to repeat every time you move by a lattice vector \(\lambda \in \Lambda\), where \(\Lambda\) is the set of all linear combinations \(m\omega_1 + n\omega_2\) of the basis vectors of the lattice, it can be built by taking a simple function \(f(z)\) and manually summing it over the whole lattice:
\[\sum_{\lambda \in \Lambda} f(z - \lambda)\]
If you shift \(z\) by a lattice vector \((\omega_1 \text{ or }\omega_2),\) the infinite sum just shifts one position and remains identical. This is the “infinite mirror room” logic. We can think of the infinite sum as an endless row of identical buckets, each labeled with a lattice vector \(\lambda\) (a complex number). Each bucket contains a “contribution” based on the distance from the point \(z\) to that specific lattice point. In a 1D lattice where the points are integers \((\dots, -1, 0, 1, 2, \dots),\) the sum looks like this:
\[\dots + f(z - (-1)) + f(z - 0) + f(z - 1) + f(z - 2) + \dots\]
Now, suppose we shift the position \(z\) by exactly one lattice vector, say \(z \to z + 1\) (notice that in the formal language of modular forms we usually normalize the lattice so that the basis vectors are \(1\) and \(\tau\); when we shift by \(1\), we are moving “horizontally” across the fundamental domain). The sum becomes:
\[\dots + f(z + 1 - (-1)) + f(z + 1 - 0) + f(z + 1 - 1) + f(z + 1 - 2) + \dots\]
If we simplify the math inside the parentheses:
\[\dots + f(z + 2) + f(z + 1) + f(z - 0) + f(z - 1) + \dots\] Every single term that was in the first sum is still in the second sum; they have just changed seats. The term that used to be calculated relative to \(\lambda = 0\) is now being calculated relative to \(\lambda = 1\). The term that used to be relative to \(\lambda = 1\) is now relative to \(\lambda = 2\). Because the sum is infinite in all directions, there is no end of the line. It’s like a hotel with infinite rooms: if every guest moves one door to the right, every room is still occupied by exactly one guest. The view from the point \(z\) remains exactly the same because the entire forest of lattice points looks identical no matter which “tile” you are standing in.
In the Weierstrass formula, the “stabilizer” term (\(1/\lambda^2\)) is what allows this seat-swapping to happen without the whole calculation collapsing into an undefined mess. If we use a common seed like \(f(z) = 1/z^2\), this infinite sum explodes to infinity everywhere. Weierstrass realized he needed to tame the sum. He kept the \(1/z^2\) for the origin, but for every other point in the lattice, he added a subtraction term. This “counter-weight” ensures that the contribution of far-away tiles gets smaller and smaller, allowing the total sum to settle on a finite value. This gives us the official Weierstrass P-function:
\[\bbox[20px, border: 3px solid red]{\wp(z) = \frac{1}{z^2} + \sum_{\lambda \in \Lambda \setminus \{0\}} \left( \frac{1}{(z - \lambda)^2} - \frac{1}{\lambda^2} \right)}\]
The \(-1/\lambda^2\) is this subtraction term that ensures that as you move further away from the center, the difference between the “naive” term and the stabilizer approaches zero fast enough for the sum to work. Notice that the term in front is effectively \(\frac 1{(z - 0)^2},\) or the term corresponding to \(\lambda =0,\) which had to be kept separate from the other terms because subtracting from it \(\frac 1{0^2}\) is not defined.
For every single dot on that infinite grid, you measure the distance from the point \(z\) to that dot \(\lambda\). Then we square that distance. Take the reciprocal (\(1/\text{distance}^2\)). Add it to the bucket. As we “walk” further away to gather more terms, the distances get bigger, so the numbers you are adding (\(1/\text{dist}^2\)) get smaller.
In a 1D line, these numbers get small fast enough that the total sum stays finite. But in 2D, the number of “dots” we encounter as we move outward grows at the same rate the distances grow. Imagine being at the center of a circular forest. The trees at distance \(R\) have a brightness of \(1/R^2\). But the number of trees at distance \(R\) is proportional to the circumference \((2\pi R).\) So the total light from the trees at distance \(R\) is roughly \(R \times (1/R^2) = 1/R\). Since the sum of \(1/R\) (the harmonic series) goes to infinity, the function would not converge.
Imagine two points, \(A\) and \(B\) separate by a linear combination of the basis vectors, and with \(A\) in the fundamental domain (containing the origin), and \(B\) in some other parallelogram. Picture the small vectors representing the distance to points on the lattice (corners) adjacent to each of these points as four short arrows going to the edges of the parallelogram framing each one of the points, ready to be squared and inverted as part of the sum. Symmetrical, isn’t it? Now imagine the difference vector between point \(A\) and one of the corners in the parallelogram around \(B\). Can you see that that vector can be transported to \(B,\) and the arrowhead will land on a point in the lattice? So summing and squaring the vectorial differences over all lattice points for each \(A\) and \(B\) will produce the same value.
To see how the Weierstrass \(\wp\)-function works, let’s use a simple square lattice where our steps are \(1\) (right/left) and \(i\) (up/down). So, our lattice points \(\lambda\) are: \(0, 1, -1, i, -i, 1+i, 1-i\), and so on.
Let’s pick a point, say \(z = 0.5 + 0.5i\) (the dead center of the first tile).
Step 1: The origin (\(\lambda = 0\)).
We handle the first term, \(1/z^2\). The distance is \((z - 0) = 0.5 + 0.5i\). Contribution: \(\frac{1}{(0.5 + 0.5i)^2} \approx -2i\).
Step 2: The inner ring. Now we move to the surrounding points \(\lambda \in \{1, i, 1+i\}\). For each, we calculate the “tamed” contribution: \(\left( \frac{1}{(z - \lambda)^2} - \frac{1}{\lambda^2} \right)\).
For \(\lambda = 1\): \(\frac{1}{(z-1)^2} - \frac{1}{1^2}\)
For \(\lambda = i\): \(\frac{1}{(z-i)^2} - \frac{1}{i^2}\)
For \(\lambda = 1+i\): \(\frac{1}{(z-(1+i))^2} - \frac{1}{(1+i)^2}\)
Step 3: We keep moving in concentric squares to \(\lambda = 2, 2i, -2, -2i\). Because of that subtraction term, these far-away points contribute less and less to the total, eventually becoming negligible. We add these values to our running total every time.
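The steps above can be sketched numerically (an illustration with a truncated sum over the square lattice \(\mathbb Z + \mathbb Z i\); the function name `wp` and cutoff `N` are mine):

```python
# Truncated Weierstrass sum for the square lattice Z + Zi, following the
# steps above: the lambda = 0 "home" term plus tamed contributions from
# lattice points in a square of "radius" N around the origin.
def wp(z, N=60):
    total = 1 / z ** 2                       # Step 1: the lambda = 0 term
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue
            lam = complex(m, n)
            total += 1 / (z - lam) ** 2 - 1 / lam ** 2   # tamed contribution
    return total

z = 0.5 + 0.5j
assert abs(1 / z ** 2 - (-2j)) < 1e-12   # Step 1: origin term is exactly -2i
# Known special value: for this lattice, the half-period (1+i)/2 is a zero of
# wp (the middle root e2 = 0), so the truncated sum should be near zero.
assert abs(wp(z)) < 0.02
```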
For any point \(z\) we choose in the complex plane, the calculation of \(\wp(z)\) always begins with that \(\lambda = 0\) term. We can think of \(\lambda = 0\) as the “home base” pole. No matter where your \(z\) is located, the formula treats the origin (\(0\)) as a special reference point. The first part of the equation, \(\frac{1}{z^2}\), is simply the “naive” contribution from the pole at the origin. Because we use this term alone, the summation symbol \(\sum\) specifically excludes zero (\(\lambda \in \Lambda \setminus \{0\}\)) to avoid double-counting.
Whether our \(z\) is \(0.1 + 0.1i\) (very close to the origin) or \(100 + 100i\) (deep in the lattice), we follow the same sequence. Again:
Calculate the origin term: \(1/z^2\).
Calculate the lattice terms:
Start summing \(\left( \frac{1}{(z - \lambda)^2} - \frac{1}{\lambda^2} \right)\) for every other \(\lambda\) in the infinite grid.
While we always start the math with the \(1/z^2\) term (the \(\lambda = 0\) term can’t be included inside the sum, since \(1/\lambda^2\) is undefined there, as discussed above), the impact of that element changes based on where \(z\) is: If \(z\) is near \(0\): The \(1/z^2\) term is massive. It dominates the value of the function, creating a “spike” (pole) at the origin. If \(z\) is near another lattice point (say, \(\lambda = 1\)): The term \(1/(z - 1)^2\) inside the sum becomes the giant value, while \(1/z^2\) becomes just a small, ordinary number.
Let’s look at what happens when we shift \(z\) by a specific lattice vector, say \(\gamma=\omega_1 - 3\omega_2\). If we evaluate \(\wp(z + \gamma)\), the equation becomes:
\[\wp(z + \gamma) = \frac{1}{(z + \gamma)^2} + \sum_{\lambda \neq 0} \left( \frac{1}{(z + \gamma - \lambda)^2} - \frac{1}{\lambda^2} \right)\]
Now, look at that lone term out front: \(\frac{1}{(z + \gamma)^2}\). When we shift, the old \(1/z^2\) becomes \(1/(z+\gamma)^2\) (moving into the territory of the sum), and one of the terms from the old sum — the one where \(\lambda = \gamma\) — becomes \(1/(z+\gamma-\gamma)^2 = 1/z^2\).
Because the function is periodic, \(\wp(z)\) will have the exact same value regardless of which “tile” you are in. If you calculate \(\wp(0.5)\) or \(\wp(1.5)\) or \(\wp(100.5)\), the infinite sum balances out to give you the same result. Starting at \(\lambda = 0\) is just the mathematical convention to ensure the sum is defined consistently across the whole plane.
We have seen how we can calculate \(\wp(z)\) at any point by summing up contributions from the “home” pole and all the “stabilized” poles in the infinite mirror room. Because the function is periodic, \(\wp(z)\) will have the exact same value regardless of which “tile” the point is in. If we calculate \(\wp(0.5)\) or \(\wp(1.5)\), the infinite sum balances out to give the same result. However, while the value of the function depends on \(z\), the overall shape of this infinite landscape is determined entirely by the lattice \(\Lambda\) itself.
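The periodicity claim can be checked with a truncated symmetric sum (a sketch; because of truncation the match is only approximate, so the comparison uses a tolerance):

```python
# Truncated Weierstrass sum over the square lattice Z + Zi.
def wp(z, N=80):
    s = 1 / z ** 2
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if (m, n) != (0, 0):
                lam = complex(m, n)
                s += 1 / (z - lam) ** 2 - 1 / lam ** 2
    return s

# wp(0.5) and wp(1.5) differ by one lattice vector; the truncated sums
# agree up to truncation error.
a, b = wp(0.5 + 0j), wp(1.5 + 0j)
assert abs(a - b) < 1e-2
```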
If we take the derivative of our function, \(\wp'(z)\), we’ll find that \(\wp\) and \(\wp'\) are locked in a tight mathematical dance. If we plot them against each other as coordinates \((x, y) = (\wp(z), \wp'(z))\), they always stay on a specific path. This path is a cubic equation:
\[(\wp'(z))^2 = 4\wp(z)^3 - g_2\wp(z) - g_3\]
This is where the geometry of the lattice turns into the coefficients of an equation (the curve). The numbers \(g_2\) and \(g_3\) aren’t just random constants; they are “summaries” of the entire lattice. They are calculated by summing up the “stiffness” of the lattice points \(\omega\) (where \(\omega \in \Lambda \setminus \{0\}\)):
\(g_2\) (The fourth power sum): \(60 \sum \omega^{-4}\)
\(g_3\) (The sixth power sum): \(140 \sum \omega^{-6}\)
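These sums, and the cubic relation itself, can be sketched numerically for the square lattice \(\mathbb Z + \mathbb Z i\) (illustrative code with truncated sums; the variable names are mine):

```python
# Lattice points in a square of "radius" N, excluding the origin.
N = 80
lattice = [complex(m, n) for m in range(-N, N + 1)
           for n in range(-N, N + 1) if (m, n) != (0, 0)]

g2 = 60 * sum(1 / w ** 4 for w in lattice)     # g2 = 60 * sum(omega^-4)
g3 = 140 * sum(1 / w ** 6 for w in lattice)    # g3 = 140 * sum(omega^-6)
assert abs(g3) < 1e-6       # square-lattice symmetry forces g3 = 0

# Check (wp'(z))^2 = 4 wp(z)^3 - g2 wp(z) - g3 at a sample point.
z = 0.3 + 0.4j
wp = 1 / z ** 2 + sum(1 / (z - w) ** 2 - 1 / w ** 2 for w in lattice)
wp_prime = -2 / z ** 3 - 2 * sum(1 / (z - w) ** 3 for w in lattice)
lhs = wp_prime ** 2
rhs = 4 * wp ** 3 - g2 * wp - g3
assert abs(lhs - rhs) / abs(lhs) < 1e-2        # equal up to truncation error
```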
We can think of \(g_2\) and \(g_3\) as the “DNA markers” of our lattice. If we stretch the lattice or tilt it, these sums change, which in turn change the coefficients of the cubic equation.
Because \(g_2\) and \(g_3\) depend entirely on the shape of the lattice, and lattices can be parameterized by the ratio of their two basis vectors \(\tau = \omega_2 / \omega_1\), these coefficients are actually modular forms. They are functions that “live” on the space of all possible lattices. By mapping
\[z \mapsto (\wp(z), \wp'(z))\]
we essentially wrap the complex plane (a flat sheet) onto a cubic curve in projective space (a donut-shaped torus). We have successfully turned topology (the lattice donut) into algebra (the elliptic curve equation). Weierstrass hadn’t just built a function; he had found the “atoms” of the entire field. Any doubly periodic function can be expressed as a combination of his \(\wp(z)\) and its derivative \(\wp'(z)\).
When we say an elliptic curve \(E(\mathbb{C})\) is defined by \(y^2 = x^3 + Ax + B\), both \(x\) and \(y\) are complex numbers. Since one complex number requires \(2\) real dimensions (real and imaginary), the pair \((x, y)\) technically lives in \(\mathbb{C}^2\), which is \(4\)-dimensional real space (\(\mathbb{R}^4\)). However, the equation \(y^2 = x^3 + Ax + B\) places a “constraint” on those \(4\) dimensions. In mathematics, one complex equation removes one complex dimension (or two real dimensions). We start with \(4\) real dimensions, but the equation “carves out” a \(2\)-dimensional surface. That surface is exactly the torus. While the donut “lives” inside a \(4\)-dimensional space, the donut itself is a \(2\)-dimensional manifold.
There is a one-to-one correspondence between the “shape” of the donut and the specific elliptic curve. This is where the lattice \(\Lambda\) comes back in. Any donut can be made by taking a sheet of paper (the complex plane \(\mathbb{C}/\Lambda\)) and gluing the edges. The “shape” of the donut depends entirely on the shape of that original sheet of paper (the fundamental parallelogram of the lattice). A square lattice produces a “symmetric” donut. This corresponds to curves like \(y^2 = x^3 - x\). A hexagonal lattice produces a donut with \(120^\circ\) symmetry. This corresponds to curves like \(y^2 = x^3 - 1\). A long, skinny parallelogram produces a “thin, stretched” donut.
We use a single complex number called the \(j\)-invariant to establish the correspondence between lattices and elliptic curves. If we have the equation \(y^2 = x^3 + Ax + B\), we can plug \(A\) and \(B\) into a formula to get \(j\). If we have a lattice \(\Lambda\) with ratio \(\tau = \omega_2 / \omega_1\), we can plug \(\tau\) into the modular \(j\)-function to get the same \(j\). Two elliptic curves are “the same” (isomorphic) if and only if they have the same \(j\)-invariant. If we change the shape of our lattice even a tiny bit, we change the \(j\)-invariant, which means we have moved to a fundamentally different elliptic curve equation.
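For curves in the short form \(y^2 = x^3 + Ax + B\) the formula is \(j = 1728 \cdot \frac{4A^3}{4A^3 + 27B^2}\); the two lattice shapes mentioned earlier give the two classical values (a small sketch; the function name is mine):

```python
def j_invariant(A, B):
    # j for y^2 = x^3 + A x + B (requires a nonzero discriminant)
    disc = 4 * A ** 3 + 27 * B ** 2
    return 1728 * 4 * A ** 3 / disc

assert j_invariant(-1, 0) == 1728   # y^2 = x^3 - x: square lattice, j = 1728
assert j_invariant(0, -1) == 0      # y^2 = x^3 - 1: hexagonal lattice, j = 0
```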
In 3D, a donut has an inner radius and an outer radius. If we try to paint a grid of perfect squares on a rubber donut, the squares on the outer rim will be stretched large, and the squares near the “hole” will be crushed and wrinkled. In 4D, the donut is what we call a Clifford Torus. Every point on the surface is geometrically identical to every other point. There is no “inner” or “outer” part. If we lived on the surface of a 4D torus, we would never feel like we were turning a “tight corner” or a “wide curve.” Every direction would feel perfectly straight and flat.
The reason 4D works is that it allows two circular motions to happen independently. Imagine two circles, \(C_1\) and \(C_2\). In 3D, if we want to put \(C_2\) “around” \(C_1\), \(C_2\) has to physically move through the space inside and outside of \(C_1\). This creates the “hole” and the “stretch.” In 4D, we can have \(C_1\) sitting in the \(xy\)-plane and \(C_2\) sitting in the \(zw\)-plane. They share only one point (the origin), or none at all. We can “loop” around one without ever getting closer to or further from the other.
In abstract algebra, the Clifford Torus is formulated as the direct product of two circle groups:
\[T^2 = S^1 \times S^1\]
Each \(S^1\) (the unit circle) is a group under complex multiplication (if viewed as \(e^{i\theta}\)) or addition modulo \(2\pi\). Because \(T^2\) is a product of two identical groups, it inherits a homogeneous structure. In the 3D “rubber donut” (the embedded torus), the two circles are treated differently: one is the “generating” circle and the other is the “revolving” circle. In the algebraic formulation \(S^1 \times S^1\), the two circles are algebraically indistinguishable. This is why every point is identical; the group action of \(T^2\) on itself is transitive and an isometry.
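The homogeneity claim can be made concrete (an illustration; modelling \(\mathbb R^4\) as \(\mathbb C^2\), with my own parameterization): every point of the Clifford torus \((e^{i\theta}/\sqrt2,\, e^{i\varphi}/\sqrt2)\) lies at the same distance from the origin, so no point is geometrically special.

```python
import cmath, math

def clifford(theta, phi):
    # A point of the Clifford torus inside the unit 3-sphere in C^2 = R^4.
    return (cmath.exp(1j * theta) / math.sqrt(2),
            cmath.exp(1j * phi) / math.sqrt(2))

for theta, phi in [(0, 0), (1.0, 2.5), (3.0, 0.7)]:
    z1, z2 = clifford(theta, phi)
    norm = math.sqrt(abs(z1) ** 2 + abs(z2) ** 2)
    assert abs(norm - 1) < 1e-12    # every point lies on the unit 3-sphere
```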
There is a perfect, one-to-one mapping between the points on the complex torus \(\mathbb{C}/\Lambda\) and the points on the elliptic curve \(E(\mathbb{C})\). Every single point we can identify on that “flat” donut corresponds to exactly one \((x, y)\) solution to the equation (plus one special “point at infinity”).
In the complex plane, this point corresponds to \(z = 0:\) it is a single point. When we represent a torus as a parallelogram (a fundamental domain) in the complex plane \(\mathbb{C}\), we are using a shorthand. To actually turn that flat shape into a torus, we have to glue the top edge to the bottom edge (creating a cylinder), and glue the left edge to the right edge (closing the cylinder into a donut). When we perform this “gluing,” corner A (bottom-left) is glued to corner B (bottom-right). They are then both glued to corner C (top-left) and corner D (top-right). In the final 4D structure, all four vertices of that parallelogram land on the exact same physical spot. That spot is the identity element of the elliptic curve group, \(z = 0 \pmod \Lambda\). If we look at the Weierstrass \(\wp\)-function, which maps the complex plane to the elliptic curve, it has a double pole at \(z = 0\). As the complex coordinate \(z\) approaches any corner of that parallelogram, we are approaching \(z = 0\). In the equation \(y^2 = 4x^3 - g_2x - g_3\), the value of \(y\) (and \(x\)) shoots to infinity as we get closer to that corner. Because the corners are all the “same” point on the torus, it doesn’t matter which corner we walk toward: we are walking toward the “point at infinity.”
In the actual torus embedded in 4D, this point is not special: on the surface of the torus in 4D, there is no “bump” or “signpost” at the origin. If we were walking on the surface, we wouldn’t know we were at the “point at infinity” unless we were looking at a coordinate map. It’s just like the Prime Meridian on Earth: it’s a significant marker for our maps, but if you stand on it in Greenwich, the ground feels the same as it does five miles away.
Here the sphere is used as a simplification to visualize \(\mathcal O\), showing a 2D \(x\)-\(y\) plane being “closed” into a 3D shape. However, an elliptic curve isn’t topologically a sphere; it’s a torus. If we could do a “stereographic projection” of a 4D torus onto a 3D space, we would see something similar: the “point at infinity” \([0:1:0]\) would be the point where the “ends” of the curve meet to close the loop.