On the Gamma function

I want to say a bit about how the Gamma function can be characterized, because I’m not a huge fan of the ways I’ve seen in print.

Let’s make this quick.  I want to know about functions γ such that:

  1. γ(1) = 1,
  2. γ satisfies the functional equation* γ(t+1) = t γ(t), and
  3. γ is a meromorphic function on the complex plane*.

Note that conditions (1) and (2) together mean that γ(n) = (n-1)! for positive integers n.

We know that one such function exists, namely the classical Gamma function.  For positive real x we may define

\Gamma(x) := \displaystyle\int_0^\infty t^{x-1} e^{-t} \, dt

and this function has an analytic continuation, which we’ll also denote Γ, that’s a meromorphic function on the complex plane with poles at the nonpositive integers.

Suppose γ is another function satisfying (1) – (3), and consider the function

r(t) = \displaystyle\frac{\gamma(t)}{\Gamma(t)}

Then r is a meromorphic function with r(1) = γ(1)/Γ(1) = 1/1 = 1 and

r(t+1) = \displaystyle\frac{\gamma(t+1)}{\Gamma(t+1)} = \displaystyle\frac{t \gamma(t)}{t \Gamma(t)} = \displaystyle\frac{\gamma(t)}{\Gamma(t)} = r(t)

Conversely, if r is any singly-periodic meromorphic function with period 1 and r(1) = 1, and we let γ(t) := r(t) Γ(t), then γ clearly satisfies (1) – (3).  So, given that we already know about the classical Γ function, the problem of classifying every function satisfying (1) – (3) reduces to the problem of classifying these functions r.
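As a quick numerical sanity check of this correspondence, here is a minimal sketch in Python.  The particular choice r(t) = 1 + sin(2πt) and the name gamma_alt are just illustrative assumptions, and the check only samples a few positive real values of t rather than the whole plane.

```python
import math

def r(t: float) -> float:
    """A period-1 function with r(1) = 1 (a hypothetical choice for illustration)."""
    return 1.0 + math.sin(2.0 * math.pi * t)

def gamma_alt(t: float) -> float:
    """gamma(t) := r(t) * Gamma(t), evaluated at positive real t."""
    return r(t) * math.gamma(t)

# Condition (1): gamma_alt(1) = 1, up to floating-point rounding.
print(math.isclose(gamma_alt(1.0), 1.0))

# Condition (2): gamma_alt(t + 1) = t * gamma_alt(t) at a few sample points.
for t in (0.3, 1.7, 2.25, 4.6):
    print(t, math.isclose(gamma_alt(t + 1.0), t * gamma_alt(t), rel_tol=1e-9))

# gamma_alt really is a different function from the classical Gamma.
print(math.isclose(gamma_alt(0.3), math.gamma(0.3)))
```

Any other period-1 meromorphic r with r(1) = 1 could be swapped in for the sine term.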

Taking the quotient of the complex plane by the translation sending z to z+1 gives us a cylinder, so equivalently we’re looking for meromorphic functions on this cylinder.  This cylinder is isomorphic (a.k.a. biholomorphic, a.k.a. conformally equivalent) to the punctured complex plane.

Unfortunately the noncompactness of the cylinder is a serious problem if we want to describe its function field.  I’m not an expert here, but my understanding is that the situation looks like this:

  • The holomorphic periodic functions on the cylinder are just given by Fourier series.
  • The meromorphic ones with finitely many poles are just ratios of the holomorphic ones.
  • The meromorphic ones with infinitely many poles, though, aren’t so easy to describe.

One way people try to deal with this situation is by putting some arbitrary analytic condition on the functions that limits exposure to the third class of functions.  This is in essence what the Wielandt characterization of the Gamma function is doing.

For more on the function field of the cylinder, see:

*It’s probably worth saying a couple of things for people who aren’t familiar with the subject.  These are the sorts of things that bothered me when I was first learning complex analysis and which, while sort of appearing in texts, never seem to be highlighted to the extent that they deserve.

First, notice that if we try to plug t=0 into this functional equation, we get

1 = γ(1) = 0 γ(0),

which we cannot solve for γ(0).  So what do we mean when we say that γ satisfies this functional equation?  Well, we mean that it’s satisfied whenever both t and t+1 are in the domain of γ.  We see in particular that t=0 cannot be in the domain of any function γ satisfying conditions (1)-(3).

This appears at first to open another can of worms, since if we’re allowed to throw points out of the domain of γ at will we could simply look at every pair of points not satisfying the functional equation and throw out one or the other of them at random.

But we’re considering a meromorphic function on the plane.  Such a function is defined at every point in the plane, minus some countable discrete set.  (“Countable” is redundant here: every uncountable set of points in the plane fails to be discrete.)  In fact, by the Riemann removable singularities theorem, we can further insist that such a function is only undefined where it “has to be” (i.e., where the function approaches infinity in magnitude near a point).  To be quite technical, we generally think not of a particular meromorphic function but rather of an equivalence class of meromorphic functions under the relation f~g if f(z) = g(z) at every point z in the domain of both functions.

Anyway, the point is that everything’s fine — we won’t have any analytic pathologies creeping in.


Inner product spaces, I – Real closed fields

I’ve always felt sort of uneasy about the way linear algebra is presented: you start off doing all this stuff that makes complete sense, and works over arbitrary fields, and then suddenly you’re doing something with all these complex conjugates and conjugate transposes and real symmetric matrices and so forth and nothing makes sense any more.  So I’m going to try to say something about that here.

Everything in the theory of inner products is based on three properties that look simple enough at first glance, but appear more and more bizarre as you consider them more deeply:

  • The real numbers are an ordered field.
  • The real numbers aren’t algebraically closed, but their algebraic closure (the complex numbers) forms a degree-2 extension.
  • The norm of a nonzero complex number is a positive real number.

(By the way, what do we mean by the norm here?  Well, it’s probably exactly what you think, namely

N(z) = z \overline{z}.

But the norm is something more general: if we have a finite Galois extension L/K, then we can define a function N_{L/K} : L \to K by

N_{L/K}(a) := \prod\limits_{\sigma \in {\rm Gal}(L/K)} \sigma(a)

 Since {\rm Gal}(\mathbb{C}/\mathbb{R}) consists of the identity and complex conjugation, we recover

N_{\mathbb{C}/\mathbb{R}}(z) = z \overline{z}.)

The fact that N_{\mathbb{C}/\mathbb{R}}(z) > 0 for z \neq 0 is specific to \mathbb{C}/\mathbb{R}; for instance, if d > 1 is a squarefree integer, then we have

N_{\mathbb{Q}(\sqrt{d})/\mathbb{Q}}(1 + \sqrt{d}) = (1 + \sqrt{d})(1 - \sqrt{d}) = 1-d.
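To see this sign behaviour concretely, here is a small sketch using exact rational arithmetic.  The helper name norm and the representation of a + b√d by the pair (a, b) are assumptions made for the example.

```python
from fractions import Fraction

def norm(a, b, d):
    """Norm of a + b*sqrt(d) over Q: (a + b sqrt d)(a - b sqrt d) = a^2 - d b^2."""
    a, b = Fraction(a), Fraction(b)
    return a * a - d * b * b

# Norm of 1 + sqrt(d) for a few squarefree d > 1; each value equals 1 - d.
for d in (2, 3, 5, 7):
    print(d, norm(1, 1, d))

# The norm can also be positive: 3 + 2*sqrt(2) has norm 9 - 8.
print(norm(3, 2, 2))
```

So in these quadratic fields the norm takes both signs, unlike the norm from the complex numbers to the reals.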

Anyway, here’s why this is all pretty weird:

  • For almost any other field, the algebraic closure is an infinite-dimensional extension, so we have no hope of getting a norm map like this.  In fact, if we have a field F whose algebraic closure \overline{F} is a finite-dimensional proper extension, then F is a real closed field, meaning that it looks very much like the real numbers, and moreover \overline{F} = F[i].  (This is the Artin-Schreier theorem.)
  • In particular, if F is an ordered field then N_{F[i]/F}(x + i y) = (x + i y)(x - i y) = x^2 + y^2 > 0 for x + i y \neq 0.
  • So, there seem to be two completely separate kinds of non-algebraically closed fields: those that behave exactly like this (such as the real algebraic numbers, the reals, and the field of real Puiseux series), and those that behave nothing like this but much like one another (such as the rational numbers, number fields in general, positive characteristic fields, etc.).

The fact that we have (1) an ordered field R (2) whose algebraic closure C is a finite-degree extension such that (3) N_{C/R}(z) > 0 for nonzero z \in C allows us to extend the theory of linear algebra (over both R and C!) in some strange new directions.


A more motivated proof of the Pythagorean theorem

So far every proof I’ve known of the Pythagorean theorem has adhered to a narrative along the lines of

  • Notice, purely by accident, that in known right triangles it appears that the square on the hypotenuse is always equal to the sum of the squares on the other two sides.
  • Conjecture that this holds in general.
  • Draw a right triangle and a square on each side.
  • Figure out some ingenious geometric decomposition reassembling the two smaller squares into a copy of the bigger one.

This is fairly unsatisfying, because it only tells us that the theorem is true; it doesn’t do much to tell us why it’s true, or give us much intuition for what kind of information it does or does not encode.

Today I wondered if there was a better explanation, and I came across this:

Pythagoras’s theorem | What’s New

Terry Tao writes:

it is perhaps the most intuitive proof of the theorem that I have seen yet

The proof just comes down to examining the (obviously useful) construction where a right triangle is split into two smaller right triangles, both of which are similar to the big one.
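One way to spell that construction out is the following sketch.  The notation (legs a and b, hypotenuse c, and the constant k) is chosen here for illustration and is not quoted from the linked post.

```latex
% Right triangle with legs a, b and hypotenuse c.
% Dropping the altitude onto the hypotenuse splits it into two right triangles,
% with hypotenuses a and b, each similar to the original.
% Similar triangles have areas proportional to the square of the hypotenuse,
% so there is one constant k > 0 for all three triangles:
\mathrm{Area}(\text{big}) = k c^2, \qquad
\mathrm{Area}(\text{small}_1) = k a^2, \qquad
\mathrm{Area}(\text{small}_2) = k b^2.
% The two small triangles tile the big one, so the areas add:
k c^2 = k a^2 + k b^2 \quad\Longrightarrow\quad c^2 = a^2 + b^2.
```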


The Weyl Group of GL(n, C)

Here’s a cleaner explanation of the Weyl group of GL(n) than I’ve seen before.  I came up with this myself, but it’s straightforward enough that I’m sure I’m not the first.

Let V be an n-dimensional complex vector space, and fix a basis \beta := \{ e_1, e_2, \ldots, e_n \} for V.  Write G := GL(V) \cong GL_n(\mathbb{C}).  Let T < G denote the subgroup of matrices which are diagonal in the basis \beta; this is a maximal torus.  We know that the Weyl group is isomorphic to N(T)/T, so let’s determine N(T).

Pick a matrix D \in T all of whose eigenvalues are distinct, and suppose A \in N(T).  Then A^{-1} D A is diagonal.  This means that A represents a change of basis from \beta to some basis in which D is diagonal, namely the basis formed by the columns of A.  Now D is diagonal in some basis iff that basis consists of eigenvectors of D.  Since D was chosen in such a way that its eigenspaces are one-dimensional, the only eigenvectors of D are nonzero scalar multiples of the e_i.  So each column of A is a nonzero scalar multiple of some e_i, and distinct columns use distinct e_i because A is invertible.  Therefore we have A = P C, where P is a permutation matrix and C is diagonal.  Conversely, it’s easy to see that any such matrix normalizes T.

From here it’s clear that N(T)/T \cong S_n, since the cosets correspond to permutation matrices.
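Here is a small exact check of the converse direction for n = 3, as a sketch.  The helper names (matmul, monomial, monomial_inverse) and the specific permutation and scalars are assumptions chosen for illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def monomial(perm, scalars):
    """A = P C: column j has the single nonzero entry scalars[j] in row perm[j]."""
    n = len(perm)
    A = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        A[perm[j]][j] = Fraction(scalars[j])
    return A

def monomial_inverse(perm, scalars):
    """Inverse of monomial(perm, scalars): row j has 1/scalars[j] in column perm[j]."""
    n = len(perm)
    B = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        B[j][perm[j]] = 1 / Fraction(scalars[j])
    return B

perm, scalars = [1, 2, 0], [7, -2, Fraction(1, 3)]
A = monomial(perm, scalars)
A_inv = monomial_inverse(perm, scalars)

# A diagonal matrix with distinct eigenvalues.
D = [[Fraction(2), 0, 0], [0, Fraction(3), 0], [0, 0, Fraction(5)]]

# Conjugating by A gives a diagonal matrix whose entries are those of D, permuted.
conj = matmul(matmul(A_inv, D), A)
for row in conj:
    print([str(x) for x in row])
```

The scalars drop out of the conjugate entirely, which is why only the permutation survives in the quotient N(T)/T.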


The resultant

Like Euler products for number-theoretic functions, the resultant is one of those amazingly simple gadgets that you’d never imagine existed.  Given two polynomials f and g in a single variable, there is a number called the resultant, denoted res(f,g), such that:

  1. res(f, g) = 0 iff f and g share a common root
  2. res(f, g) is a polynomial in the coefficients of f and g.

Think about this for a second.  Given the coefficients of an arbitrary polynomial, we have in general no algebraic expression for its roots, but nonetheless we have a way of determining if two polynomials share a root by simply adding and multiplying together some of their coefficients!

First, let’s see why such a thing ought to exist.  Say that f(x) = p (x-a_1)\cdots(x-a_n) and g(x) = q (x-b_1)\cdots(x-b_m), and define

res(f, g) = p^m q^n \displaystyle \prod_{i=1}^n \prod_{j =1}^m (a_i - b_j).

This clearly satisfies condition (1) above: the product will equal zero iff a_i = b_j for some i and j.  However, it also satisfies condition (2).  Why?  Well, if we regard res(f, g) as a polynomial in the a_i's, with coefficients which are polynomials in the b_j's, then it’s a symmetric polynomial in the a_i's; that is, it’s invariant under permuting the order of the a_i's.  Further, if we regard one of the coefficients of this polynomial as a polynomial in the b_j's, then this polynomial is also symmetric.

Why does this matter?  Well, every symmetric polynomial in n variables can be written as a polynomial in the elementary symmetric polynomials


x_1 + \cdots + x_n,

x_1 x_2 + x_1 x_3 + \cdots + x_{n-1} x_n,

and so forth.

But you’ll recognize that the coefficients of a (monic, univariate) polynomial are, up to sign, precisely the elementary symmetric polynomials in its roots!  That is, given a monic polynomial in one variable, any symmetric polynomial of its roots is just a polynomial in the coefficients.  (Throwing in the powers of p and q at the front handles the case when the polynomials aren’t monic.)
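To see a coefficient-only formula in action, here is a sketch that compares the product over pairs of roots with the determinant of the Sylvester matrix.  The Sylvester determinant is a standard coefficient formula for the resultant that is not derived in this post, so treat it as an outside ingredient.  The examples are monic with integer roots, and all function names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

def sylvester(f, g):
    """Sylvester matrix of f and g, given as coefficient lists, highest degree first."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = []
    for i in range(m):  # m shifted copies of f's coefficients
        rows.append([0] * i + list(f) + [0] * (size - n - 1 - i))
    for i in range(n):  # n shifted copies of g's coefficients
        rows.append([0] * i + list(g) + [0] * (size - m - 1 - i))
    return rows

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(A), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            sign = -sign
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return sign * result

def root_product(roots_f, roots_g):
    """Product of (a_i - b_j) over all pairs of roots (monic case)."""
    out = 1
    for a, b in product(roots_f, roots_g):
        out *= a - b
    return out

# f = (x - 1)(x - 2) = x^2 - 3x + 2 and g = (x - 3)(x - 5) = x^2 - 8x + 15.
print(det(sylvester([1, -3, 2], [1, -8, 15])), root_product([1, 2], [3, 5]))

# h = (x - 2)(x - 7) = x^2 - 9x + 14 shares the root 2 with f.
print(det(sylvester([1, -3, 2], [1, -9, 14])), root_product([1, 2], [2, 7]))
```

The first argument of each comparison never looks at the roots, only at the coefficients.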

I vaguely remember learning about elementary symmetric polynomials in my undergrad algebra sequence, but at the time I had no real idea what they were for.  They didn’t look that complicated, so I figured they probably didn’t matter too much.  As it turns out, though, the whole subject of invariant theory is really interesting, and symmetric polynomials are just the first nontrivial example.

As an added bonus, note that we can also determine whether a polynomial has a double root by calculating res(f, f'), where f' is the derivative of f.  (You can define the derivative of a polynomial without using any calculus: just consider the power rule et al. as definitions instead of theorems.)  Now res(f, f') is zero iff f and f' share a root.  Suppose a is a root of f; then f(x) = (x - a)^k g(x) for some k \geq 1 and some g, where a is not a root of g.  Taking the derivative, we have f'(x) = k (x - a)^{k-1} g(x) + (x-a)^k g'(x), so f'(a) = k 0^{k-1} g(a) + 0^k g'(a) = 0 if k > 1, or f'(a) = g(a) + 0^1 g'(a) = g(a) \neq 0 if k = 1.  So f and f' share the root a exactly when a is a repeated root of f.

Up to a sign and a factor of the leading coefficient of f, the resultant res(f, f') is known as the discriminant, as you’ll remember from high-school algebra when they seemingly needlessly assigned this fancy name to the term b^2 - 4 a c appearing in the quadratic formula.  The form here generalizes to univariate polynomials of arbitrary degree, but in fact it can be generalized further, to arbitrary multivariate polynomials as well.
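For the quadratic case this can be worked out by hand.  The sketch below uses the Sylvester determinant, a standard coefficient formula for the resultant that is assumed here rather than derived in this post.

```latex
% f(x) = a x^2 + b x + c, so f'(x) = 2 a x + b.
% Sylvester matrix: one shifted row of f's coefficients, two shifted rows of f''s.
\mathrm{res}(f, f') =
\det \begin{pmatrix} a & b & c \\ 2a & b & 0 \\ 0 & 2a & b \end{pmatrix}
= a \cdot b^2 - b \cdot 2ab + c \cdot 4a^2
= -a \, (b^2 - 4ac).
```

So for a quadratic the resultant differs from b^2 - 4ac by a sign and by a factor of the leading coefficient a.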


An obvious statement which surprised me when I read it

From Baumslag’s “Topics in Combinatorial Group Theory,” chapter V, Exercises 2(3)(iv):

M \in SL_2(\mathbb{C}) is of order e > 2 if, and only if, {\rm tr} M = \omega + \omega^{-1}  for some primitive e-th root of unity \omega.

There’s really nothing to this statement — just put the matrix in Jordan Canonical Form and draw the obvious conclusion — but it really surprised me when I saw it used in an argument.  Another way to put this would be that, for a 2×2 matrix, the trace and determinant determine the eigenvalues (which is equally obvious).

(The problem in the cases e = 1 and e = 2 is that then the eigenvalues are identical, which means that the matrix isn’t necessarily diagonal when it’s in Jordan Canonical Form — it could be a 2×2 Jordan block.)
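Here is a floating-point sketch of the statement.  It uses the non-diagonal matrix with rows (t, -1) and (1, 0), which has determinant 1 and trace t, and takes t = 2 cos(2π/e), the value of ω + ω^{-1} for ω = exp(2πi/e).  The function names and the search bound max_order are assumptions made for the example.

```python
import math

def matmul2(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def is_identity(X, tol=1e-9):
    """True if X equals the 2x2 identity up to rounding error."""
    return all(abs(X[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

def order(M, max_order=50):
    """Smallest k >= 1 with M^k = I (up to rounding), or None if none is found."""
    P = M
    for k in range(1, max_order + 1):
        if is_identity(P):
            return k
        P = matmul2(P, M)
    return None

def companion(e):
    """Determinant-1 matrix with trace 2 cos(2 pi / e) = omega + 1/omega."""
    t = 2.0 * math.cos(2.0 * math.pi / e)
    return [[t, -1.0], [1.0, 0.0]]

for e in (3, 5, 12):
    print(e, order(companion(e)))

# The e = 2 caveat: trace -2 and determinant 1, but a Jordan block, so no finite order.
print(order([[-1.0, 1.0], [0.0, -1.0]]))
```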

From this and one other fact it follows that the elements a, b, and ab in the group \langle a, b \; | \; a^\ell = b^m = (ab)^n = 1 \rangle actually have the desired orders.


The cross-ratio

Here is a nice invariant from classical geometry that I’d never heard of before today.

The action of \text{GL}_2(\mathbb{C}) on \mathbb{C}^2 restricts to an action on \mathbb{CP}^1; this is all that a Möbius transformation really is.  Now \text{GL}_2(\mathbb{C}) is four-dimensional, but there’s a one-dimensional subgroup of scalar matrices which fixes each point of the projective line, so we may as well quotient this out and get a \text{PGL}_2(\mathbb{C})-action.

In fact, this is the projective automorphism group of \mathbb{CP}^1, hence the name PGL; its elements are called projectivities.

The group \text{PGL}_2(\mathbb{C}) is three-dimensional, so we would expect that with our three degrees of freedom we could send any three distinct points to any three other distinct points, and indeed we can: the action is 3-transitive.  On the other hand, the dimension of \text{PGL}_2(\mathbb{C}) implies that it can’t possibly be 4-transitive.

What this says is that, up to projectivities, any three or fewer points on the complex projective line look like any other set of the same cardinality, but there are sets of four or more points which are essentially “different.”  In particular, given two sets of n > 3 points in \mathbb{CP}^1, we ought to be able to find an invariant which determines whether a projectivity takes one set to the other.

Let’s consider the case n = 4.  Obviously we’ve got some latitude in determining this invariant up to a constant, so let’s just decree that R(0, 1, \infty, x) = x.  Consequently, for any element M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in PGL_2(\mathbb{C}), we have

R(M \cdot 0, M \cdot 1, M \cdot \infty, M \cdot x) = x.

Expanding the expression,

\displaystyle R\left(\frac{b}{d}, \; \frac{a+b}{c+d}, \; \frac{a}{c}, \;\frac{ax+b}{cx+d}\right) = x.

So all we need to do is, given a system of equations

\displaystyle p = \frac{b}{d}, \; q = \frac{a+b}{c+d}, \; r = \frac{a}{c}, \; s = \frac{ax+b}{cx+d},

figure out how to solve for x as a rational function of p, q, r, and s.  This isn’t too bad — note first that  ax + b = s(cx + d), or (a - cs) x = ds - b, so

x = \displaystyle \frac{ds - b}{a-cs} = \frac{ds - dp}{cr - cs} = \frac{d}{c} \frac{s-p}{r-s}

Now we just need to express d/c in terms of p, q, and r.  Substituting b = dp and a = cr into q(c + d) = a + b gives c(q - r) = d(p - q), so we can write d/c = (q-r)/(p-q), and altogether we get

R(p, q, r, s) = \displaystyle \frac{(q-r)(s-p)}{(p-q)(r-s)}.

Actually, as the negative reciprocal would provide just as good an invariant, let’s redefine R slightly to get all the variables in a nice, alphabetical order:

R(p, q, r, s) = \displaystyle \frac{(p-q)(r-s)}{(p-s)(q-r)}.

This function is the classical “cross-ratio” of the four points p, q, r, and s.  As we can see from the formula, it’s a ratio of ratios of distances between points.
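The invariance is easy to test with exact rational arithmetic.  This is only a sketch: the four sample points and the matrix entries a, b, c, d below are arbitrary choices made for the example, picked so that no denominator vanishes.

```python
from fractions import Fraction

def cross_ratio(p, q, r, s):
    """The cross-ratio (p - q)(r - s) / ((p - s)(q - r)) of four distinct points."""
    return (p - q) * (r - s) / ((p - s) * (q - r))

def mobius(a, b, c, d, z):
    """Image of z under z -> (a z + b) / (c z + d); assumes c z + d != 0."""
    return (a * z + b) / (c * z + d)

points = [Fraction(0), Fraction(1), Fraction(2), Fraction(5)]
a, b, c, d = 2, 1, 1, 3  # determinant 2*3 - 1*1 = 5, so the matrix is invertible
images = [mobius(a, b, c, d, z) for z in points]

# The two printed values agree: the cross-ratio is unchanged by the projectivity.
print(cross_ratio(*points), cross_ratio(*images))
```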

Of course the point of the preceding is to provide one justification for why we should expect such an invariant to exist, and how we could determine it.  In fact, the cross-ratio and its significance to projective geometry was known already to Pappus of Alexandria around AD 300.
