There was a request containing the phrase, “theory of modular forms,” so I’ll write an introduction to that. Chris seems to be taking care of the rest of that paragraph.
Pretty much all of the material below is 50-150 years old. Don’t expect too much originality.
For the duration of this post, an elliptic curve will be a complex manifold isomorphic to $\mathbb{C}/\Lambda$, where $\Lambda$ is a lattice, i.e., a discrete subgroup of the complex numbers of rank two. An elliptic curve then has the topology of a 2-torus, and the structure of the abelian group $U(1) \times U(1)$. However, the complex structure depends nontrivially on the choice of lattice. In particular, two elliptic curves $\mathbb{C}/\Lambda_1$ and $\mathbb{C}/\Lambda_2$ are isomorphic if and only if there is some nonzero complex number r such that $r\Lambda_1 = \Lambda_2$. This means we can rotate and dilate lattices without changing the curve, but that’s all. This can be proved using Weierstrass’s theory of meromorphic functions.
Let’s try to classify elliptic curves. We can choose an oriented basis of the lattice (i.e., the first generator is less than 180 degrees clockwise from the second generator), and then rescale the lattice (by rotating and dilating) so that the first generator is equal to one. The second generator is then a point in the complex upper half plane H. The fact that we chose an oriented basis means that H doesn’t classify elliptic curves, since we added extra structure (in particular, it classifies elliptic curves with an oriented basis for first homology). However, the group $SL_2(\mathbb{Z})$ of two-by-two integer matrices with determinant one acts simply transitively on all such oriented bases, so elliptic curves are classified by taking the quotient of the upper half plane by a certain action of $SL_2(\mathbb{Z})$.
It is a fairly well-known fact (which I won’t prove here) that $SL_2(\mathbb{Z})$ is generated by $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. If we have a lattice with oriented basis (1,z), then T fixes 1 and sends z to z+1, yielding the new basis (1,z+1). S takes 1 to z and z to -1, so we divide this new basis by z to get (1,-1/z). More generally, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ acts on H via $z \mapsto \frac{az+b}{cz+d}$, but we can get the structure of the quotient from just the generators. Since T acts by Translation by one, we can choose orbit representatives in the part of H that lies in a vertical strip of width one. S acts by Spinning around i, switching the interior of the unit disc with the exterior. The standard fundamental domain for the action is then the part of the upper half plane outside the unit disc, and with real part between -1/2 and 1/2. To take the quotient, we glue the left and right sides of the domain together to get an infinitely long tube, and then we glue the bottom shut. This gives us a complex analytic space that is topologically (in fact, complex analytically) a plane, and points in this space classify elliptic curves up to isomorphism. We will call this space Y(1). If we compactify by adding a point at infinity (called a cusp), we get a sphere called X(1), which classifies “generalized elliptic curves.” The extra point describes what you get by taking a sphere and identifying two points so they intersect transversely. There is a group structure on the smooth locus, isomorphic to $\mathbb{C}^\times$, so we have essentially let the second generator of our lattice run away to infinity.
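To make the role of the two generators concrete, here is a minimal Python sketch that moves a point of H into the standard fundamental domain by alternately translating (powers of T, z to z+n) and inverting (S, z to -1/z). The function name `reduce_to_fundamental_domain`, the step limit, and the use of floating-point complex numbers are my own illustrative choices, so points very close to the boundary of the domain may land on either side of it.

```python
import math

def reduce_to_fundamental_domain(z, max_steps=10_000):
    """Move z (with Im z > 0) into the region |z| >= 1, -1/2 <= Re z < 1/2.

    Returns (w, word), where word lists the moves applied, in order:
    ("T", n) means z -> z + n, and ("S", 1) means z -> -1/z.
    """
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half plane")
    word = []
    for _ in range(max_steps):
        # Translate by a power of T so that the real part lies in [-1/2, 1/2).
        n = math.floor(z.real + 0.5)
        if n != 0:
            z = z - n
            word.append(("T", -n))
        # Inside the unit disc: apply S, which strictly increases Im z.
        if abs(z) < 1:
            z = -1 / z
            word.append(("S", 1))
        else:
            return z, word
    raise RuntimeError("did not converge; z may be too close to the real axis")

w, word = reduce_to_fundamental_domain(complex(0.3, 0.01))
print(w, word)
```

Two points of H describe the same elliptic curve exactly when they reduce to the same point of the domain (up to the gluing along its boundary), so this is a hands-on way to test whether two lattices give isomorphic curves.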
(Advanced bit: $SL_2(\mathbb{Z})$ doesn’t act freely on the half plane, i.e., there are nonidentity elements that fix points, and these fixed points correspond to curves with automorphisms. In particular, every elliptic curve has a -1 automorphism, and the square and triangular lattices have automorphisms of order 4 and 6, respectively. If we want to produce a universal family over a moduli space, we will have to use the machinery of stacks. Deligne and Rapoport showed that the functor Y(3) producing elliptic curves together with an identification of their three-torsion is representable in schemes, so Y(1) is a quotient by the order 24 group $SL_2(\mathbb{F}_3)$. If we look at curves in characteristic 2, there is one elliptic curve whose automorphism group is exactly this one. Clearly, the lattice picture doesn’t work here, since the group is nonabelian. In fact, it is naturally the group of units in a certain quaternion algebra of endomorphisms of the curve.)
So, what is a modular form? I won’t give an answer yet, but a modular function is just a complex function on the upper half plane that is invariant under the action of $SL_2(\mathbb{Z})$. Equivalently, it is a function on Y(1), or an invariant of elliptic curves. There is a distinguished subset of these functions given by those that classify elliptic curves uniquely. If we look at Y(1), these are just the one-to-one functions. We typically ask for modular functions to be reasonably nice, i.e., holomorphic, and with reasonable growth as z tends toward infinity. These conditions imply that the corresponding function on Y(1) (viewed as a plane) is a polynomial. The one-to-one functions then have the form aj+b for some function j, where a is nonzero. The function j is periodic and holomorphic on the upper half plane, so its Fourier expansion (which I will describe later) has constant coefficients. With a good choice of normalization, the coefficients are nonnegative integers, and this might lead you to suspect that there is some interesting graded vector space whose dimensions are given by these coefficients. This is part of “moonshine.”
We are looking for forms rather than functions, so let’s consider the differential 1-form dz on the upper half plane. If we transform the half-plane by $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$, we get $d\left(\frac{az+b}{cz+d}\right) = \frac{(ad-bc)\,dz}{(cz+d)^2} = \frac{dz}{(cz+d)^2}$. In other words, if some function f satisfies $f\left(\frac{az+b}{cz+d}\right) = (cz+d)^2 f(z)$, then f(z)dz is a one-form that is invariant under $SL_2(\mathbb{Z})$, i.e., it lives on the quotient Y(1). Such a function f is called a modular form of weight 2. If f satisfies $f\left(\frac{az+b}{cz+d}\right) = (cz+d)^{2k} f(z)$ for all $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$, then f is called a modular form of weight 2k. In general, these forms will not be differential k-forms (i.e., sections of $\bigwedge^k \Omega$), but they will describe sections of the pluricanonical bundle $\Omega^{\otimes k}$, which has the advantage of being nonzero for lots of k. Earlier, we gave an interpretation of modular functions as invariants of elliptic curves. A modular form of weight 2k is an invariant of a pair $(E, \omega)$, where E is an elliptic curve, and $\omega$ is a nowhere-vanishing differential on E (such as dz; there is only a $\mathbb{C}^\times$ worth of these), and it satisfies $f(E, \lambda\omega) = \lambda^{-2k} f(E, \omega)$. We write the function f to denote the form, rather than $f(z)(dz)^k$, because we can trivialize the pluricanonical bundle on the upper half plane by forgetting dz. As the calculation above shows, this trivialization is not $SL_2(\mathbb{Z})$-equivariant. There are no nonzero forms of odd weight, because the matrix $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ fixes every point of H, while its factor cz+d = -1 forces a form f of odd weight to satisfy f = -f.
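The transformation rule for dz and the way the factors cz+d compose can be checked numerically. The following Python sketch is only an illustration under my own conventions: matrices are stored as tuples (a, b, c, d), and the helper names `act`, `automorphy`, `mul`, and `numerical_derivative` are hypothetical, not standard library functions. It checks that the derivative of z ↦ (az+b)/(cz+d) is 1/(cz+d)^2, that the factor cz+d is multiplicative along the action, and that S squares to minus the identity.

```python
def act(g, z):
    """Moebius action of g = (a, b, c, d) on a point z of the upper half plane."""
    a, b, c, d = g
    return (a * z + b) / (c * z + d)

def automorphy(g, z):
    """The factor c z + d attached to g = (a, b, c, d) and z."""
    a, b, c, d = g
    return c * z + d

def mul(g, h):
    """Matrix product g h, with matrices stored row by row as 4-tuples."""
    a, b, c, d = g
    e, f, p, q = h
    return (a * e + b * p, a * f + b * q, c * e + d * p, c * f + d * q)

def numerical_derivative(g, z, h=1e-6):
    """Central-difference approximation to d/dz of act(g, z)."""
    return (act(g, z + h) - act(g, z - h)) / (2 * h)

S = (0, -1, 1, 0)
T = (1, 1, 0, 1)

z = complex(0.3, 1.1)
g = (2, 1, 3, 2)      # an element of SL_2(Z): 2*2 - 1*3 = 1
h = mul(S, T)

# d(gz)/dz = 1 / (cz + d)^2
print(numerical_derivative(g, z), 1 / automorphy(g, z) ** 2)
# Cocycle relation: j(gh, z) = j(g, hz) j(h, z), where j(g, z) = cz + d
print(automorphy(mul(g, h), z), automorphy(g, act(h, z)) * automorphy(h, z))
# S^2 = -1 acts trivially on H, but its factor cz + d is -1
print(mul(S, S), act(mul(S, S), z), automorphy(mul(S, S), z))
```

The cocycle relation is what makes the weight 2k condition consistent: it only needs to be checked on the generators S and T, and then it holds for every product of them.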
Let’s try to write down some examples of forms. A good first place to look is functions of lattices. Since these lattices live in the complex numbers, we can multiply and add, so we consider the function $G_{2k}(\Lambda) = \sum_{\lambda \in \Lambda \setminus \{0\}} \lambda^{-2k}$. The factor of two is to prevent cancellation, and we ask that k be greater than one to make this sum converge absolutely. It is not invariant under dilation or rotation, but the nonzero complex numbers act through -2k powers: $G_{2k}(r\Lambda) = r^{-2k} G_{2k}(\Lambda)$. If we restrict to lattices generated by (1,z), we get a holomorphic function $G_{2k}(z)$ on the upper half plane. It is invariant under T, and for S, $G_{2k}(-1/z) = z^{2k} G_{2k}(z)$, so it is indeed a weight 2k modular form. If we send z to infinity, then all of the non-integer contributions in the lattice sum go to zero, and we are left with $2\zeta(2k)$ as the constant term of the Fourier expansion. In particular, these forms are holomorphic on X(1). One often normalizes them so that the Fourier expansion has constant term 1, and then they are called the Eisenstein series $E_{2k}$ of weight 2k. They have Fourier expansion $E_{2k}(z) = 1 - \frac{4k}{B_{2k}} \sum_{n \geq 1} \sigma_{2k-1}(n) q^n$, where B denotes Bernoulli numbers (which are rational), $\sigma_{2k-1}(n)$ is the sum of the (2k-1)st powers of all divisors of n, and $q = e^{2\pi i z}$ is a coordinate on the unit disc.
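As a sanity check on the lattice sum, here is a rough numerical sketch in weight 4. It truncates the sum of (m + nz)^(-4) over nonzero pairs (m, n) to a square box, then compares the result with 2ζ(4) = π^4/45 times the normalized series 1 + 240 Σ σ_3(n) q^n, where q = exp(2πiz). The box size `N`, the number of q-series terms, and the function names are arbitrary choices of mine, and the truncated sum only agrees with the q-series up to a tail of size roughly 1/N^2. The behavior under S, on the other hand, holds for the truncated sum up to rounding, because the box is symmetric under (m, n) ↦ (-n, m).

```python
import cmath
import math

def lattice_sum(z, weight=4, N=100):
    """Truncated sum of (m + n z)^(-weight) over (m, n) != (0, 0) with |m|, |n| <= N."""
    total = 0
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue
            total += (m + n * z) ** (-weight)
    return total

def sigma(power, n):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)

def eisenstein_E4(z, terms=30):
    """Partial sum of 1 + 240 * sum_{n >= 1} sigma_3(n) q^n, with q = exp(2 pi i z)."""
    q = cmath.exp(2j * math.pi * z)
    return 1 + 240 * sum(sigma(3, n) * q ** n for n in range(1, terms + 1))

z = complex(0.3, 1.2)
G = lattice_sum(z)

# Lattice sum versus 2 zeta(4) times the normalized Fourier expansion.
print(G, (math.pi ** 4 / 45) * eisenstein_E4(z))
# Weight 4 behavior under S: G(-1/z) = z^4 G(z).
print(lattice_sum(-1 / z), z ** 4 * G)
```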
We can multiply Eisenstein series together to get forms of other weights that are not necessarily Eisenstein series, and in fact, the graded ring of modular forms that are holomorphic on X(1) is a polynomial ring generated by $E_4$ and $E_6$, which are algebraically independent. It is easy to check that there are no forms of odd weight, since we pick up a minus sign when we square S. There are several ways to determine the dimension of the space of forms of a given weight (e.g., orbifold Riemann-Roch), and the fact that the spaces of forms of weight 4, 6, 8, 10, and 14 have dimension 1 implies relations like $E_4^2 = E_8$, which in turn give identities like $\sigma_7(n) = \sigma_3(n) + 120 \sum_{m=1}^{n-1} \sigma_3(m) \sigma_3(n-m)$. There is a two-dimensional space of weight 12 forms, spanned by $E_4^3$ and $E_6^2$. The difference $E_4^3 - E_6^2$ is a form $1728\Delta$, whose Fourier expansion has no constant term, so it is called a cusp form. $\Delta$ is called the discriminant, since it vanishes exactly when a plane cubic is singular. In particular, it doesn’t vanish on the upper half plane, so multiplication by $\Delta$ produces an isomorphism between modular forms of weight 2k and cusp forms of weight 2k+12. Also, the quotient $j = E_4^3/\Delta$ is holomorphic of weight zero on Y(1), with a pole at infinity. j has Fourier expansion $j = q^{-1} + 744 + 196884q + 21493760q^2 + \cdots$. The coefficients of these forms satisfy lots of interesting congruence properties, and this is more or less where the theory of p-adic modular forms takes off.
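These identities are easy to test with exact integer arithmetic on truncated q-expansions. The sketch below assumes the standard normalizations E_4 = 1 + 240 Σ σ_3(n) q^n, E_6 = 1 - 504 Σ σ_5(n) q^n and E_8 = 1 + 480 Σ σ_7(n) q^n. It represents a series as a Python list of its first few coefficients (the names `PREC`, `mul`, and `divide` are my own), then checks that the square of E_4 is E_8, that the difference of the cube of E_4 and the square of E_6 is divisible by 1728, and that the resulting quotient for j starts with the familiar coefficients.

```python
def sigma(power, n):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)

def eisenstein(weight, coeff, prec):
    """Coefficients of 1 + coeff * sum_{n >= 1} sigma_{weight-1}(n) q^n up to q^(prec-1)."""
    return [1] + [coeff * sigma(weight - 1, n) for n in range(1, prec)]

def mul(f, g):
    """Product of two truncated power series given as coefficient lists."""
    prec = min(len(f), len(g))
    return [sum(f[i] * g[n - i] for i in range(n + 1)) for n in range(prec)]

def divide(f, g):
    """Quotient f / g of truncated power series, assuming g[0] == 1."""
    prec = min(len(f), len(g))
    quotient = []
    for n in range(prec):
        quotient.append(f[n] - sum(g[i] * quotient[n - i] for i in range(1, n + 1)))
    return quotient

PREC = 8
E4 = eisenstein(4, 240, PREC)
E6 = eisenstein(6, -504, PREC)
E8 = eisenstein(8, 480, PREC)

E4_cubed = mul(mul(E4, E4), E4)
E6_squared = mul(E6, E6)
difference = [a - b for a, b in zip(E4_cubed, E6_squared)]

# Every coefficient of E4^3 - E6^2 is divisible by 1728, so Delta has integer coefficients.
delta = [c // 1728 for c in difference]

# Delta = q * (1 - 24 q + ...), so j = E4^3 / Delta = q^(-1) * (E4^3 / (Delta / q)).
# j_coeffs[0] is the coefficient of q^(-1), j_coeffs[1] the constant term, and so on.
j_coeffs = divide(E4_cubed, delta[1:])

print(mul(E4, E4) == E8)
print(delta)
print(j_coeffs)
```

Working with integer lists avoids any rounding, so the divisibility by 1728 and the integrality of the coefficients of j are checked exactly, at least up to the chosen precision.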
You might be wondering about the use of the term “weight” above. Usually in mathematics, a weight is a representation of a torus, and modular forms are no exception. Here, the torus in question is a maximal compact subgroup $SO_2(\mathbb{R}) \subset SL_2(\mathbb{R})$. We will write G for the big group, and K for the compact. Iwasawa decomposition splits G as NAK, where $N = \left\{ \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} : x \in \mathbb{R} \right\}$ and $A = \left\{ \begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix} : y > 0 \right\}$. The group B = NA acts transitively on H, since $\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix}$ takes $i$ to $x + iy$, and this identifies H with G/K (i.e., G forms a circle bundle over the upper half plane). Elements of G can be written as a point in H together with an angle, and the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is taken to $\left( \frac{ai+b}{ci+d}, -\arg(ci+d) \right)$. Given a modular form f of weight 2k, we can then produce a function F on G by $F\begin{pmatrix} a & b \\ c & d \end{pmatrix} = (ci+d)^{-2k} f\left( \frac{ai+b}{ci+d} \right)$. F is naturally left invariant under $SL_2(\mathbb{Z})$, and K acts on the right by $F(g r_\theta) = e^{2ki\theta} F(g)$, where $r_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$. There is a right regular action of G on any reasonable space of functions on G, and one can actually characterize modular forms f on H as those that correspond to certain eigenfunctions F of the Laplacian (aka Casimir) on G satisfying additional analytic conditions. Modular forms then describe lowest weight vectors for certain (infinite dimensional) unitary representations of G known as discrete series. The raising and lowering in these representations is given by first order differential operators, and the annihilation of the lowest weight vector by a lowering operator is equivalent to the fact that the modular forms satisfy the Cauchy-Riemann equations, i.e., they are holomorphic on H.
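The Iwasawa decomposition is explicit enough to compute by hand. The Python sketch below fixes one convention (an assumption on my part, since signs vary between references): n(x) is the upper triangular matrix with off-diagonal entry x, a(y) is diagonal with entries y^(1/2) and y^(-1/2), and the rotation by theta has rows (cos θ, sin θ) and (-sin θ, cos θ). With that convention a matrix (a, b; c, d) moves i to x + iy, and ci + d equals y^(-1/2) exp(-iθ). The function names `iwasawa`, `rebuild`, and `rotate_right` are hypothetical helpers.

```python
import cmath
import math

def iwasawa(g):
    """Split g = (a, b, c, d) in SL_2(R) as n(x) a(y) k(theta); returns (x, y, theta)."""
    a, b, c, d = g
    w = (a * 1j + b) / (c * 1j + d)        # g moves i to x + i y
    theta = -cmath.phase(c * 1j + d)       # c i + d = y^(-1/2) * exp(-i theta)
    return w.real, w.imag, theta

def rebuild(x, y, theta):
    """Multiply out n(x) a(y) k(theta) as a 4-tuple (a, b, c, d)."""
    r = math.sqrt(y)
    co, si = math.cos(theta), math.sin(theta)
    # n(x) a(y) = [[r, x / r], [0, 1 / r]] and k(theta) = [[co, si], [-si, co]]
    return (r * co - (x / r) * si, r * si + (x / r) * co, -si / r, co / r)

def rotate_right(g, phi):
    """The product g k(phi), i.e., the right action of K on G."""
    a, b, c, d = g
    co, si = math.cos(phi), math.sin(phi)
    return (a * co - b * si, a * si + b * co, c * co - d * si, c * si + d * co)

g = (2.0, 1.0, 3.0, 2.0)                   # determinant 2*2 - 1*3 = 1
x, y, theta = iwasawa(g)
print(x, y, theta)
print(rebuild(x, y, theta))                # should reproduce g up to rounding

# Right multiplication by k(phi) fixes the point x + i y and multiplies c i + d by exp(-i phi).
phi = 0.7
a2, b2, c2, d2 = rotate_right(g, phi)
print(c2 * 1j + d2, cmath.exp(-1j * phi) * (3.0 * 1j + 2.0))
```

The last line is the computation behind the weight: multiplying g on the right by a rotation leaves the point of H alone and scales ci + d by a phase, so the function built from a weight 2k form picks up the character exp(2kiφ).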