Group actions

GROUP ACTIONS
KEITH CONRAD
1. Introduction
The symmetric groups Sn , alternating groups An , and (for n ≥ 3) dihedral groups Dn
behave, by their very definition, as permutations on certain sets. The groups Sn and An
both permute the set {1, 2, . . . , n} and Dn can be considered as a group of permutations of
a regular n-gon, or even just of its n vertices, since rigid motions of the vertices determine
where the rest of the n-gon goes. If we label the vertices of the n-gon in a definite manner
by the numbers from 1 to n then we can view Dn as a subgroup of Sn . For instance, the
labeling
    3 ---- 2
    |      |
    4 ---- 1
lets us regard the 90 degree counterclockwise rotation r in D4 as (1234) and the reflection
s across the x-axis as (12)(34). The rest of the elements of D4 , as permutations of the
vertices, are in the table below.
Element      1    r       r²        r³      s         sr    sr²       sr³
Permutation  (1)  (1234)  (13)(24)  (1432)  (12)(34)  (24)  (14)(23)  (13)
If we label the vertices in a different way (e.g., swap the labels 1 and 2), we may get a
different subgroup of S4 .
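The table can be checked mechanically. The following Python sketch is only an illustration (the helper names compose and cycles are made up here, not taken from the text): it stores a permutation of {1, 2, 3, 4} as a dict, composes from right to left as in the products sr, sr², sr³, and rebuilds the eight elements from r = (1234) and s = (12)(34).

```python
# Permutations of {1, 2, 3, 4} as dicts; products compose right to left,
# so (p q)(i) = p(q(i)), matching the convention used for sr, sr^2, sr^3.

def compose(p, q):
    """Return the permutation 'p after q'."""
    return {i: p[q[i]] for i in q}

def cycles(p):
    """Cycle notation for p, omitting fixed points; the identity is '(1)'."""
    seen, out = set(), ""
    for start in sorted(p):
        if start in seen or p[start] == start:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        out += "(" + "".join(map(str, cyc)) + ")"
    return out or "(1)"

identity = {1: 1, 2: 2, 3: 3, 4: 4}
r = {1: 2, 2: 3, 3: 4, 4: 1}   # the 4-cycle (1234)
s = {1: 2, 2: 1, 3: 4, 4: 3}   # (12)(34)

table = {}
power = identity
for k in range(4):
    table["r^%d" % k] = cycles(power)                # r^k
    table["sr^%d" % k] = cycles(compose(s, power))   # s r^k
    power = compose(r, power)

for name in sorted(table):
    print(name, table[name])
```

Swapping the labels 1 and 2 amounts to conjugating r and s by the transposition (12) before running the loop, which is one way to see the different subgroup of S4 mentioned above.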
More abstractly, if we are given any set X (not necessarily the set of vertices of a square),
then the set Sym(X) of all permutations of X is a group under composition, and the
subgroup Alt(X) of even permutations of X is a group under composition. If we list the
elements of X in a definite order, say as X = {x1 , . . . , xn }, then we can think about Sym(X)
as Sn and Alt(X) as An , but a listing in a different order leads to different identifications
of Sym(X) with Sn and Alt(X) with An .
The “abstract” symmetric groups Sym(X) really do arise naturally:
Theorem 1.1 (Cayley). Every finite group G can be embedded in a symmetric group.
Proof. To each g ∈ G, define the left multiplication function ℓ_g : G → G, where ℓ_g(x) = gx
for x ∈ G. Each ℓ_g is a permutation of G as a set, with inverse ℓ_{g⁻¹}. So ℓ_g belongs to
Sym(G). Since ℓ_{g1} ◦ ℓ_{g2} = ℓ_{g1g2} (that is, g1(g2x) = (g1g2)x for all x ∈ G), associating g to ℓ_g
gives a homomorphism of groups, G → Sym(G). This homomorphism is one-to-one since ℓ_g
determines g (after all, ℓ_g(e) = g). Therefore the correspondence g ↦ ℓ_g is an embedding
of G as a subgroup of Sym(G).
Allowing an abstract group to behave like a group of permutations, as happened in the
proof of Cayley’s theorem, is a useful tool.
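As a small sanity check of the proof of Cayley's theorem, the sketch below builds the left multiplication maps for one concrete group and tests the three claims used in the proof. The choice of S3 as the example group, and the names mul, ell, compose and images, are assumptions made for this illustration.

```python
from itertools import permutations

# Example group (an illustrative choice): S3, with p[i] the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq, meaning 'apply q, then p'."""
    return tuple(p[q[i]] for i in range(3))

index = {g: k for k, g in enumerate(G)}   # number the elements of G as 0..5

def ell(g):
    """The left multiplication map x -> gx, as a permutation of the indices 0..5."""
    return tuple(index[mul(g, x)] for x in G)

def compose(s, t):
    """Composite 's after t' of two permutations of the indices."""
    return tuple(s[t[k]] for k in range(len(t)))

images = {g: ell(g) for g in G}

# Each left multiplication map is a bijection of G.
each_is_permutation = all(sorted(p) == list(range(len(G))) for p in images.values())
# Composition of the maps matches multiplication in G.
is_homomorphism = all(
    compose(images[a], images[b]) == images[mul(a, b)] for a in G for b in G
)
# Different group elements give different maps.
is_injective = len(set(images.values())) == len(G)

print(each_is_permutation, is_homomorphism, is_injective)
```

Here a group of order 6 is realized inside the permutations of a 6-element set, so the embedding produced by the proof is usually far from the smallest possible one.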
Definition 1.2. An action of a group G on a set X is the choice, for each g ∈ G, of a
permutation πg : X → X such that the following two conditions hold:
• πe is the identity: πe (x) = x for each x ∈ X,
• for every g1 and g2 in G, πg1 ◦ πg2 = πg1 g2 .
Example 1.3. Let Sn act on X = {1, 2, . . . , n} in the usual way. Here πσ (i) = σ(i) in the
usual notation.
Example 1.4. Any group G acts on itself (X = G) by left multiplication functions. That
is, we set πg : G → G by πg (h) = gh for all g ∈ G and h ∈ G. Then the conditions for being
a group action are eh = h for all h ∈ G and g1 (g2 h) = (g1 g2 )h for all g1 , g2 , h ∈ G, which
are both true since e is an identity and multiplication in G is associative.
In practice, one dispenses with the notation πg and writes πg (x) simply as g · x or gx.
This is not meant to be an actual multiplication of elements from two possibly different sets
G and X. It is just the notation for the effect of g (really, the permutation associated to g)
on the element x. In this notation, the axioms for a group action take the following form:
• For each x ∈ X, e · x = x.
• For every g1 , g2 ∈ G and x ∈ X, g1 · (g2 · x) = (g1 g2 ) · x.
The basic idea in any group action is that the elements of a group are viewed as permutations of a set in such a way that composition of the corresponding permutations matches
multiplication in the original group.
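For a finite group and a finite set, the two axioms can be tested by brute force. The function below is a minimal sketch (is_group_action and its parameter names are hypothetical): it takes the group elements, the group law, the identity, the points of X, and a candidate rule act(g, x), and checks both axioms directly.

```python
from itertools import permutations

def is_group_action(elements, mul, identity, points, act):
    """Check e.x = x and g1.(g2.x) = (g1 g2).x for all finite inputs."""
    identity_ok = all(act(identity, x) == x for x in points)
    compatible = all(
        act(g1, act(g2, x)) == act(mul(g1, g2), x)
        for g1 in elements for g2 in elements for x in points
    )
    return identity_ok and compatible

# Z/(4) acting on itself by addition.
Z4 = range(4)

def add(a, b):
    return (a + b) % 4

addition_ok = is_group_action(Z4, add, 0, Z4, add)

# S3 (tuples p with p[i] the image of i) with two candidate rules on itself.
S3 = list(permutations(range(3)))
e = (0, 1, 2)

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

left_ok = is_group_action(S3, compose, e, S3, lambda g, x: compose(g, x))
right_ok = is_group_action(S3, compose, e, S3, lambda g, x: compose(x, g))

print(addition_ok, left_ok, right_ok)
```

The last check fails because multiplying on the right reverses the order of the product, and S3 is not abelian.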
To get used to the notation, let’s prove a simple result.
Theorem 1.5. Let G act on X. If x ∈ X, g ∈ G, and y = g · x, then x = g⁻¹ · y. If x ≠ x′
then gx ≠ gx′.
Proof. From y = g · x we get g⁻¹ · y = g⁻¹ · (g · x) = (g⁻¹g) · x = e · x = x. To show
x ≠ x′ ⟹ gx ≠ gx′, we show the contrapositive: if gx = gx′ then applying g⁻¹ to both
sides gives g⁻¹ · (g · x) = g⁻¹ · (g · x′), so (g⁻¹g) · x = (g⁻¹g) · x′, so x = x′.
Another way to think about an action of a group on a set is that it is a certain homomorphism. Here are the details.
Theorem 1.6. Actions of the group G on the set X are the same as group homomorphisms
from G to Sym(X), the group of permutations of X.
Proof. Suppose we have an action of G on X. We view g · x as a function of x (with g
fixed). That is, for each g ∈ G we have a function πg : X → X by πg (x) = g · x. The axiom
e·x=x
says πe is the identity function on X. The axiom
g1 · (g2 · x) = (g1 g2 ) · x
says πg1 ◦ πg2 = πg1 g2 , so composition of functions on X corresponds to multiplication in
G. Moreover, πg is an invertible function since π_{g⁻¹} is an inverse: the composite of πg and
π_{g⁻¹} is πe, which is the identity function on X. Therefore πg ∈ Sym(X) and g ↦ πg is a
homomorphism G → Sym(X).
Conversely, suppose we have a homomorphism f : G → Sym(X). For each g ∈ G, we have
a permutation f (g) on X, and f (g1 g2 ) = f (g1 )◦f (g2 ). Setting g·x = f (g)(x) defines a group
action of G on X, since the homomorphism properties of f yield the defining properties of
a group action.
From this viewpoint, the set of g ∈ G that act trivially (g · x = x for all x ∈ X) is simply
the kernel of the homomorphism G → Sym(X) associated to the action. Therefore those g
that act trivially on X are said to lie in the kernel of the action.
We will not often use the interpretation of Theorem 1.6 before Section 6. Until then we
take the more concrete viewpoint of a group action as a kind of product g · x of G with X,
taking values in X subject to the properties e · x = x and g1 · (g2 · x) = (g1 g2 ) · x.
Here is an outline of later sections. Section 2 describes several concrete examples of
group actions and also some general actions available from any group. Section 3 describes
the important orbit-stabilizer formula. The short Section 4 isolates an important fixed-point
congruence for actions of p-groups. Sections 5 and 6 give applications of group actions to
group theory. In Appendix A, group actions are used to derive three classical congruences
from number theory.
2. Examples
Example 2.1. We can make Rn act on itself by translations: for v ∈ Rn , let Tv : Rn → Rn
by Tv (w) = w + v. The axioms for a group action are: T0 (w) = w and Tv1 (Tv2 (w)) =
Tv1 +v2 (w). These are true from properties of vector addition.
Example 2.2. Let G be the group of Rubik’s cube: all sequences of motions on the cube
(keeping center facelets in fixed locations). This group acts on two different sets: the 12 edge
cubelets and the 8 corner cubelets. Or we could let G act on the set of all 20 non-centerface
cubelets together.
Example 2.3. For n ≥ 3, each element of the group Dn acts as a rigid motion of a regular
n-gon in the plane, either a rotation or a reflection.
We can also view Dn as acting just on the n vertices of a regular n-gon. Knowing where
the vertices go determines the rest of the rigid motion, so the effect of Dn on the vertices
tells us all we need to know to determine the rigid motion on the n-gon. Restricting the
action of Dn from an n-gon to its vertices, and labeling the vertices as 1, 2, . . . , n in some
manner, makes Dn act on {1, 2, . . . , n}.
Example 2.4. The group GLn (R) acts on vectors in Rn in the usual way that a matrix
can be multiplied with a (column) vector: A · v = Av. In this action, the origin 0 is fixed
by every A while other vectors get moved around (as A varies). The axioms of a group
action are properties of matrix-vector multiplication: In v = v and A(Bv) = (AB)v.
Example 2.5. The group Sn acts on polynomials f (T1 , . . . , Tn ), by permuting the variables:
(2.1) σ · f(T1, . . . , Tn) = f(T_{σ(1)}, . . . , T_{σ(n)}).
For example, (23) · (T2 + T3²) = T3 + T2² and (12) · ((23) · (T2 + T3²)) = (12) · (T3 + T2²) = T3 + T1²,
and (12)(23) · (T2 + T3²) = (123) · (T2 + T3²) = T3 + T1².
It’s obvious that (1) · f = f. To check that σ · (σ′ · f) = (σσ′) · f for all σ and σ′ in Sn,
so (2.1) is a group action, we compute
σ · (σ′ · f(T1, . . . , Tn)) = σ · f(T_{σ′(1)}, . . . , T_{σ′(n)})
= f(T_{σ(σ′(1))}, . . . , T_{σ(σ′(n))})
= f(T_{(σσ′)(1)}, . . . , T_{(σσ′)(n)})
= (σσ′) · f(T1, . . . , Tn).
Lagrange’s study of this group action (ca. 1770) marked the first systematic use of
symmetric groups in algebra. Lagrange wanted to understand why nobody had found an
analogue of the quadratic formula for roots of a polynomial of degree greater than four.
While Lagrange was not completely successful, he found in this group action that there are
some different features in the cases n ≤ 4 and n = 5.
Since f and σ ·f have the same degree, and if f is homogeneous then σ ·f is homogeneous,
this action of Sn can be restricted to the set of polynomials in n variables with a fixed degree
or the set of homogeneous polynomials in n variables with a fixed degree. An example is
the set of homogeneous linear polynomials {a1 T1 + · · · + an Tn }, where
(2.2) σ · (c1T1 + · · · + cnTn) = c1T_{σ(1)} + · · · + cnT_{σ(n)} = c_{σ⁻¹(1)}T1 + · · · + c_{σ⁻¹(n)}Tn.
Example 2.6. Let Sn act on Rn by permuting the coordinates: for σ ∈ Sn and v =
(c1, . . . , cn) ∈ Rn, set σ · v = (c_{σ(1)}, . . . , c_{σ(n)}).
For example, let n = 3, σ = (12), and σ′ = (23). Then σ′ · (c1, c2, c3) = (c1, c3, c2). To
compute σ · (σ′ · (c1, c2, c3)) = σ · (c1, c3, c2) we must be careful: by definition an element
of Sn is applied to a vector whose indices are written as 1, . . . , n in that order. So write
(c1, c3, c2) = (d1, d2, d3). Then
σ · (σ′ · (c1, c2, c3)) = σ · (c1, c3, c2) = σ · (d1, d2, d3) = (d2, d1, d3) = (c3, c1, c2).
And
(σσ′) · (c1, c2, c3) = (123) · (c1, c2, c3) = (c2, c3, c1),
which does not agree with σ · (σ′ · (c1, c2, c3))! So our so-called action of Sn on Rn is not a
group action based on our definition.
Let’s make a general calculation to see what is going wrong. For σ and σ′ in Sn, and
v = (c1, . . . , cn) in Rn, we compute σ · (σ′ · v) by setting di = c_{σ′(i)}, so σ′ · v = (d1, . . . , dn).
Then
σ · (σ′ · v) = (d_{σ(1)}, . . . , d_{σ(n)})
= (c_{σ′(σ(1))}, . . . , c_{σ′(σ(n))})
= (c_{(σ′σ)(1)}, . . . , c_{(σ′σ)(n)})
= (σ′σ) · v,
and the order of composition of permutations is backwards!
To make things turn out correctly, we should redefine the effect of Sn on Rn to use the
inverse: σ · v = (c_{σ⁻¹(1)}, . . . , c_{σ⁻¹(n)}). Then σ · (σ′ · v) = (σσ′) · v and we have a group action
of Sn on Rn, which in fact is essentially the action of Sn from the previous example on
homogeneous linear polynomials (see (2.2)).
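The failure of the first definition and the success of the corrected one can both be confirmed by exhaustive search for n = 3. This is a sketch with made-up helper names (naive, corrected, respects_products); permutations are stored 0-based as tuples, so the transposition (12) of the text is (1, 0, 2) and (23) is (0, 2, 1).

```python
from itertools import permutations

n = 3
perms = list(permutations(range(n)))   # s[i] is the image of i (0-based)

def compose(s, t):
    """(s t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    inv = [0] * n
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

def naive(s, v):
    """First attempt: the i-th coordinate of s.v is c_{s(i)}."""
    return tuple(v[s[i]] for i in range(n))

def corrected(s, v):
    """Corrected rule: the i-th coordinate of s.v is c_{s^{-1}(i)}."""
    s_inv = inverse(s)
    return tuple(v[s_inv[i]] for i in range(n))

v = ("c1", "c2", "c3")   # distinct symbols, so coordinates can be tracked

def respects_products(act):
    """Does s.(t.v) equal (s t).v for all s and t?"""
    return all(
        act(s, act(t, v)) == act(compose(s, t), v) for s in perms for t in perms
    )

naive_ok = respects_products(naive)
corrected_ok = respects_products(corrected)

# The worked example from the text: sigma = (12), sigma' = (23).
sigma, sigma_prime = (1, 0, 2), (0, 2, 1)
nested = naive(sigma, naive(sigma_prime, v))
direct = naive(compose(sigma, sigma_prime), v)

print(naive_ok, corrected_ok, nested, direct)
```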
The lesson from these last two examples is that when Sn permutes variables in a function
it acts “directly”, but when Sn permutes coordinates in a vector it has to act
using inverses. It is easy to remember that Sn acts without inverses in one of these cases
and with inverses in the other, but it is just as easy to forget
which case is which. At least remember that you need to be careful.
Example 2.7. Let G be a group acting on the set X, and S be any set. Write Map(X, S)
for the set of all functions f : X → S. We can make G act on Map(X, S) by
(g · f)(x) = f(g⁻¹x).
It is left to the reader to check that this is an action of G on Map(X, S), and to see why
we need g⁻¹ on the right side rather than g.
If G = Sn , X = {1, . . . , n} with the natural Sn -action, and S = R, then Map(X, S) = Rn :
writing down a vector v = (c1 , . . . , cn ) amounts to listing the coordinates, and the list of
coordinates in order is a function f : {1, 2, . . . , n} → R where f (i) = ci . Therefore the
condition (g · f)(i) = f(g⁻¹i) amounts to saying g · (c1, . . . , cn) = (c_{g⁻¹(1)}, . . . , c_{g⁻¹(n)}),
which is precisely the action of Sn on Rn in the previous example.
There are three basic ways we will make an abstract group G act: left multiplication of
G on itself, conjugation of G on itself, and left multiplication of G on a coset space G/H.
All of these will now be described.
Example 2.8. To make G act on itself by left multiplication, we let X = G and g · x (for
g ∈ G and x ∈ G) be the usual product of g and x. This example was used already in the
proof of Cayley’s theorem, and the definition of a group action is satisfied by the axioms
for multiplication in G.
Note that right multiplication of G on itself, given by r_g(x) = xg for g and x in G, is
not an action since the order of composition gets reversed: r_{g1} ◦ r_{g2} = r_{g2g1}. But if we set
r_g(x) = xg⁻¹ then we do get an action. This could be called the action by right-inverse
multiplication (non-standard terminology).
Example 2.9. To make G act on itself by conjugation, take X = G and let g · x = gxg⁻¹.
Here g ∈ G and x ∈ G. Since e · x = exe⁻¹ = x and
g1 · (g2 · x) = g1 · (g2xg2⁻¹)
= g1(g2xg2⁻¹)g1⁻¹
= (g1g2)x(g1g2)⁻¹
= (g1g2) · x,
conjugation is a group action.
Example 2.10. For a subgroup H ⊂ G, consider the left coset space G/H = {aH : a ∈ G}.
(We do not care whether or not H ◁ G, as we are just thinking about G/H as a set.) We
let G act on G/H by left multiplication. That is, for g ∈ G and a left coset aH (a ∈ G), set
g · aH = gaH = {gy : y ∈ aH}.
This is an action of G on G/H, since eaH = aH and
g1 · (g2 · aH) = g1 · (g2 aH)
= g1 g2 aH
= (g1 g2 ) · aH.
Example 2.8 is the special case when H is trivial.
Example 2.11. Let G = Z/(4) act on itself (X = G) by additions. For instance, addition
by 1 has the effect 0 ↦ 1 ↦ 2 ↦ 3 ↦ 0. Therefore addition by 1 on Z/(4) is a 4-cycle (0123).
Addition by 2 has the effect 0 ↦ 2, 1 ↦ 3, 2 ↦ 0, and 3 ↦ 1. Therefore, as a permutation
on Z/(4), addition by 2 is (02)(13), a product of two 2-cycles. The composition of these
two permutations is (0123)(02)(13) = (0321), which is the permutation of G described by
addition by 3, and 3 = 1 + 2 in Z/(4).
We return to the action of a group G on itself by left multiplication and by conjugation,
and extend these actions to subsets rather than just points.
Example 2.12. When A is a subset of G, and g ∈ G, the subset gA = {ga : a ∈ A} has the
same size as A. Therefore G acts by left multiplication on the set of subsets of G, or even
on the subsets with a fixed size. Example 2.8 is the special case of one-element subsets of
G. Notice that, when H ⊂ G is a subgroup, gH is usually not a subgroup of G, so the left
multiplication action of G on its subsets does not convert subgroups into other subgroups.
Example 2.13. In the same spirit as Example 2.12 (an action on a set leads to an action on
its subsets of a fixed size), let S4 act on the set of pairs from
{1, 2, 3, 4} by the rule σ · {a, b} = {σ(a), σ(b)}.
There are 6 pairs:
x1 = {1, 2}, x2 = {1, 3}, x3 = {1, 4}, x4 = {2, 3}, x5 = {2, 4}, x6 = {3, 4}.
The effect of (12) on these pairs is
(12)x1 = x1 , (12)x2 = x4 , (12)x3 = x5 ,
(12)x4 = x2 , (12)x5 = x3 , (12)x6 = x6 .
Thus, as a permutation of the set {x1 , . . . , x6 }, (12) acts like (x2 x4 )(x3 x5 ). That is interesting: we have made a transposition in S4 look like a product of two 2-cycles in S6 . In
particular, we have made an odd permutation of {1, 2, 3, 4} look like an even permutation
(on a new set).
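The parity change can be confirmed by computing the induced permutation of the six pairs. In this sketch the names act, sign and induced are illustrative; signs are computed by counting inversions, and the pairs are listed in the same order x1, . . . , x6 as above.

```python
from itertools import combinations

# The six pairs, in the order x1, ..., x6 used in the text.
pairs = [frozenset(p) for p in combinations(range(1, 5), 2)]

def act(sigma, pair):
    """sigma . {a, b} = {sigma(a), sigma(b)}, with sigma a dict on {1, 2, 3, 4}."""
    return frozenset(sigma[a] for a in pair)

def sign(images):
    """Sign of a permutation of range(len(images)), by counting inversions."""
    inversions = sum(
        1
        for i in range(len(images))
        for j in range(i + 1, len(images))
        if images[i] > images[j]
    )
    return -1 if inversions % 2 else 1

transposition = {1: 2, 2: 1, 3: 3, 4: 4}   # (12)

# Induced permutation of the indices 0..5 (index k stands for x_{k+1}).
induced = [pairs.index(act(transposition, p)) for p in pairs]

sign_on_points = sign([transposition[i] - 1 for i in range(1, 5)])
sign_on_pairs = sign(induced)

print(induced, sign_on_points, sign_on_pairs)
```

The list induced records x2 ↔ x4 and x3 ↔ x5 with x1 and x6 fixed, in agreement with the computation above.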
Example 2.14. Let G be a group. When A ⊂ G, gAg −1 is a subset with the same size
as A. Moreover, unlike the left multiplication action of G on its subsets, the conjugation
action of G on its subsets transforms subgroups into subgroups: when H ⊂ G is a subgroup,
gHg −1 is also a subgroup. For instance, three subgroups of S4 with size 4 are
{(1), (1234), (13)(24), (1432)}, {(1), (2134), (23)(14), (2431)},
{(1), (12)(34), (13)(24), (14)(23)}.
Under conjugation by S4 , the first two subgroups can be transformed into each other, but
neither of these subgroups can be conjugated to the third subgroup: the first and second
subgroups have an element with order 4 while the third one does not.
While the left multiplication action of G on itself (Example 2.8) associates different
permutations to different group elements, the conjugation action of G on itself (Example
2.9) can make different group elements act in the same way: if g1 = g2 z, where z is in the
center of G, then g1 and g2 have the same conjugation action on G. Group actions where
different elements of the group act differently have a special name:
Definition 2.15. A group action of G on X is called faithful (or effective) if different
elements of G act on X in different ways: when g1 ≠ g2 in G, there is an x ∈ X such that
g1 · x ≠ g2 · x.
Note that when we say g1 and g2 act differently, we mean they act differently somewhere,
not everywhere. This is consistent with what it means to say two functions are not equal:
they take different values somewhere, not everywhere.
Example 2.16. The action of G on itself by left multiplication is faithful: different elements
send e to different places.
Example 2.17. The action of G on itself by conjugation is faithful if and only if G has a
trivial center, because g1gg1⁻¹ = g2gg2⁻¹ for all g ∈ G if and only if g2⁻¹g1 is in the center of
G. When D4 acts on itself by conjugation, the action is not faithful since r² acts trivially
(it is in the center), so 1 and r² act in the same way.
Example 2.18. When H is a subgroup of G and G acts on G/H by left multiplication,
two elements g1 and g2 act in the same way on G/H precisely when g1gH = g2gH for all
g ∈ G, which means g2⁻¹g1 ∈ ⋂_{g∈G} gHg⁻¹. So the left multiplication action of G on G/H
is faithful if and only if the subgroups gHg⁻¹ (as g varies) have trivial intersection.
Example 2.19. The action of GL2(R) on R² is faithful, since we can recover the columns
of a matrix by applying it to $\binom{1}{0}$ and $\binom{0}{1}$.
Non-faithful actions are as important as non-injective group homomorphisms (in fact,
that is precisely what a non-faithful action is from the viewpoint of Theorem 1.6).
Remark 2.20. What we have been calling a group action could be called a left group action,
while a right group action, denoted xg, has the properties xe = x and (xg1 )g2 = x(g1 g2 ).
The exponential notation x^g in place of xg works well here, especially by writing the identity
in the group as 1: x^1 = x and (x^{g1})^{g2} = x^{g1g2}. The distinction between left and right actions
is how a product gg′ acts: in a left action g′ acts first and g acts second, while in a right
action g acts first and g′ acts second.
Right multiplication of G on itself (or more generally right multiplication of G on the
space of right cosets of a subgroup H) is an example of a right action. To take a more
concrete example, the action of GLn (R) on row vectors of length n is most naturally a right
action since the product vA (not Av) makes sense when v is a row vector and A ∈ GLn (R).
Many group theorists (unlike most other mathematicians) like to define the conjugate of
h by g as g⁻¹hg instead of as ghg⁻¹, and this convention fits well with the right (but not
left) conjugation action: setting h^g = g⁻¹hg we have h^1 = h and (h^{g1})^{g2} = h^{g1g2}.
The difference between left and right actions of a group is largely illusory, since replacing
g with g⁻¹ in the group turns left actions into right actions and conversely because inversion
reverses the order of multiplication in G. We saw this idea at work in Example 2.8 and
in Example 2.6. We will not use right actions (except in Example 3.18), so for us “group
action” means “left group action.”
3. Orbits and Stabilizers
The information encoded in a group action has two basic parts: one part tells us where
points go and the other part tells us how points stay put. The following terminology refers
to these ideas.
Definition 3.1. Let G act on X. For each x ∈ X, its orbit is
Orbx = {g · x : g ∈ G} ⊂ X
and its stabilizer is
Stabx = {g ∈ G : g · x = x} ⊂ G.
(The stabilizer of x is often denoted Gx in the literature, where G is the group.) We call x
a fixed point for the action when g · x = x for every g ∈ G, that is, when Orbx = {x} (or
equivalently, when Stabx = G).
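For a finite group acting on a finite set, the three notions in Definition 3.1 translate directly into code. The sketch below uses hypothetical helper names (orbit, stabilizer, fixed_points) and, as an example chosen for this illustration, a four-element group of permutations of {0, 1, 2, 3, 4} that swaps 0 with 1, swaps 2 with 3, and never moves 4.

```python
def orbit(elements, act, x):
    """Orb_x: everything x can be moved to."""
    return {act(g, x) for g in elements}

def stabilizer(elements, act, x):
    """Stab_x: the group elements that fix x."""
    return [g for g in elements if act(g, x) == x]

def fixed_points(elements, act, points):
    """Points fixed by every element of the group."""
    return [x for x in points if all(act(g, x) == x for g in elements)]

# A four-element group of permutations of {0, 1, 2, 3, 4}; p[i] is the image of i.
identity = (0, 1, 2, 3, 4)
a = (1, 0, 2, 3, 4)        # swaps 0 and 1
b = (0, 1, 3, 2, 4)        # swaps 2 and 3
ab = (1, 0, 3, 2, 4)       # does both
G = [identity, a, b, ab]
points = range(5)

def act(p, i):
    return p[i]

orbit_of_0 = orbit(G, act, 0)
stabilizer_of_0 = stabilizer(G, act, 0)
fixed = fixed_points(G, act, points)

print(orbit_of_0, stabilizer_of_0, fixed)
```

In this example the orbit of 0 has 2 points and its stabilizer has 2 elements, while 4 is the only fixed point, with orbit {4} and stabilizer all of G.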
Writing the definition of orbits and stabilizers in words, the orbit of a point is a geometric
concept: it is the set of places where the point can be moved by the group action. On the
other hand, the stabilizer of a point is an algebraic concept: it is the set of group elements
that fix the point.
We will often refer to the elements of X as points and we will refer to the size of an orbit
as its length. If X = G, as in Examples 2.8 and 2.9, then we think about elements of G as
permutations when they act on G and as points when they are acted upon.
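Example 3.2. When GL2(R) acts in the usual way on R², the orbit of 0 is {0} since
A · 0 = 0 for every A in GL2(R). The stabilizer of 0 is GL2(R).
The orbit of $\binom{1}{0}$ is R² − {0}, in other words every non-zero vector can be obtained from $\binom{1}{0}$
by applying a suitable invertible matrix to it. Indeed, if $\binom{a}{b} \neq 0$, then we have
$\binom{a}{b} = \begin{pmatrix} a & 1 \\ b & 0 \end{pmatrix}\binom{1}{0}$ and $\binom{a}{b} = \begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix}\binom{1}{0}$.
One of the matrices $\begin{pmatrix} a & 1 \\ b & 0 \end{pmatrix}$ or $\begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix}$ is invertible (since a or b is non-zero),
so $\binom{a}{b}$ is in the GL2(R)-orbit of $\binom{1}{0}$. The stabilizer of $\binom{1}{0}$ is
$\{\begin{pmatrix} 1 & x \\ 0 & y \end{pmatrix} : y \neq 0\}$ ⊂ GL2(R).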
Example 3.3. When the group GL2(Z) acts in the usual way on Z², the orbit of 0 is {0}
with stabilizer GL2(Z). But in contrast to Example 3.2, the orbit of $\binom{1}{0}$ under GL2(Z) is
not Z² − {0}. Indeed, a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in GL2(Z) sends $\binom{1}{0}$ to $\binom{a}{c}$, which is a vector with
relatively prime coordinates since ad − bc = ±1. (For instance, GL2(Z) can’t send $\binom{1}{0}$ to
$\binom{2}{0}$.) Conversely, any vector $\binom{m}{n}$ in Z² with relatively prime coordinates is in the GL2(Z)-orbit
of $\binom{1}{0}$: we can solve mx + ny = 1 for some integers x and y, so $\begin{pmatrix} m & -y \\ n & x \end{pmatrix}$ is in GL2(Z)
(its determinant is 1) and $\begin{pmatrix} m & -y \\ n & x \end{pmatrix}\binom{1}{0} = \binom{m}{n}$.
Check as an exercise that the orbits in Z² under the action of GL2(Z) are the vectors
whose coordinates have a fixed greatest common divisor. Each orbit contains one vector of
the form $\binom{d}{0}$ for d ≥ 0, and the stabilizer of $\binom{d}{0}$ for d > 0 is
$\{\begin{pmatrix} 1 & x \\ 0 & y \end{pmatrix} : y = \pm 1\}$ ⊂ GL2(Z).
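The converse direction in Example 3.3 is constructive: the extended Euclidean algorithm produces the integers x and y, and hence the matrix. The following sketch follows that recipe; the function names extended_gcd, matrix_sending_e1_to, det and apply are made up for this illustration.

```python
def extended_gcd(m, n):
    """Return (d, x, y) with m*x + n*y == d and d == gcd(m, n) >= 0."""
    old_r, r = m, n
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:   # normalize so the gcd is non-negative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def matrix_sending_e1_to(m, n):
    """Rows of an integer matrix of determinant 1 with first column (m, n)."""
    d, x, y = extended_gcd(m, n)
    if d != 1:
        raise ValueError("coordinates are not relatively prime")
    return ((m, -y), (n, x))

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def apply(A, v):
    """Matrix-vector product for a 2 x 2 matrix given by its rows."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

A = matrix_sending_e1_to(5, 3)
print(A, det(A), apply(A, (1, 0)))
```

For a vector such as (4, 6), whose coordinates have a common factor, the function raises an error, matching the fact that such a vector is not in the orbit of (1, 0).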
Example 3.4. Identifying Z/(2) with the subgroup {±In } of GLn (R) gives an action of
Z/(2) on Rn , where 0 acts as the identity and 1 acts by negation on Rn . We can restrict
this action of Z/(2) to the unit sphere of Rn , and then it is called the antipodal action since
its orbits are pairs of opposite points (which are called antipodal points) on the sphere.
Example 3.5. When the Rubik’s cube group acts on the non-centerface cubelets of Rubik’s
cube, there are two orbits: the corner cubelets and the edge cubelets.
Example 3.6. For n ≥ 2, consider Sn in its natural action on {1, 2, . . . , n}. What is the
stabilizer of an integer k ∈ {1, 2, . . . , n}? It is the set of permutations of {1, 2, . . . , n} fixing
k, which can be identified with the set of permutations of {1, 2, . . . , n} − {k}. This is just
Sn−1 in disguise (once we identify {1, 2, . . . , n} − {k} in a definite manner with the numbers
from 1 to n − 1). The stabilizer of any number in {1, 2, . . . , n} for the natural action of Sn
on {1, 2, . . . , n} is isomorphic to Sn−1 .
Example 3.7. For n ≥ 2, the even permutations of {1, 2, . . . , n} that fix a number k can
be identified with the even permutations of {1, 2, . . . , n} − {k}, so the stabilizer of any point
in the natural action of An is essentially An−1 up to relabelling.
Remark 3.8. When trying to think about a set as a geometric object, it is helpful to refer
to its elements as points, no matter what they might really be. For example, when we think
about G/H as a set on which G acts (by left multiplication), it is useful to think about
the cosets of H, which are the elements of G/H, as the points in G/H. At the same time,
though, a coset is a subset of G. There is a tension between these two interpretations: is a
left coset of H a point in G/H or a subset of G? It is both, and it is important to be able
to think about a coset in both ways.
All of our applications of group actions to group theory will flow from the relations
between orbits, stabilizers, and fixed points, which we now make explicit in our three basic
examples of group actions.
Example 3.9. When G acts on itself by left multiplication,
• there is one orbit (g = ge ∈ Orbe ),
• Stabx = {g : gx = x} = {e} is trivial,
• there are no fixed points (if #G > 1).
Example 3.10. When G acts on itself by conjugation,
• the orbit of a is Orba = {gag⁻¹ : g ∈ G}, which is the conjugacy class of a,
• Staba = {g : gag⁻¹ = a} = {g : ga = ag} is the centralizer of a,
• a is a fixed point when it commutes with all elements of G, and thus the fixed points
of conjugation form the center of G.
Example 3.11. When G acts on G/H by left multiplication,
• there is one orbit (gH = g · H ∈ Orb{H} ),
• StabaH = {g : gaH = aH} = {g : a⁻¹ga ∈ H} = aHa⁻¹,
• there are no fixed points (if H ≠ G).
These examples illustrate several facts: an action need not have any fixed points (Example
3.9 with non-trivial G), different orbits can have different lengths (Example 3.10 with G =
S3 ), and the points in a common orbit don’t have to share the same stabilizer (Example
3.11, if H is a non-normal subgroup).
Example 3.12. When G acts on its subgroups by conjugation, StabH = {g : gHg⁻¹ = H}
is the normalizer N(H) and the fixed points are the normal subgroups of G.
When G acts on X, any subgroup of G also acts on X. Let’s look at two examples.
Example 3.13. Any subgroup K ⊂ G acts on G/H by left multiplication. Then
• there could be more than one orbit (not all cosets gH can be reached from H by
left multiplication by elements of K unless KH = G),
• StabaH = aHa⁻¹ ∩ K,
• there could be fixed points (aH is fixed by K exactly when K ⊂ aHa⁻¹; for example,
when K = H the coset H is a fixed point).
Example 3.14. Any subgroup H ⊂ G acts on G by right-inverse multiplication (Example
2.8). Then
• the orbits are the left H-cosets ({gh⁻¹ : h ∈ H} = gH),
• Staba is trivial,
• there are no fixed points (if #H > 1).
Theorem 3.15. Let G act on X.
a) Different orbits are disjoint.
b) For each x ∈ X, Stabx is a subgroup of G and Stab_{g·x} = g Stabx g⁻¹.
c) g · x = g′ · x if and only if g and g′ lie in the same left coset of Stabx. In particular,
the length of the orbit of x is given by
# Orbx = [G : Stabx].
The formula in part c, relating the length of an orbit to the index in G of a stabilizer for
a point in the orbit, is called the orbit-stabilizer formula.
Proof. a) We prove different orbits in a group action are disjoint by proving that two orbits
which overlap must coincide. Suppose Orbx and Orby have a common element z:
z = g1 · x,
z = g2 · y.
We want to show Orbx = Orby . It suffices to show Orbx ⊂ Orby . Then switch the roles of
x and y to get the reverse inclusion.
For any point u ∈ Orbx, write u = g · x for some g ∈ G. Since x = g1⁻¹ · z,
u = g · (g1⁻¹ · z) = (gg1⁻¹) · z = (gg1⁻¹) · (g2 · y) = (gg1⁻¹g2) · y,
which shows us that u ∈ Orby . Therefore Orbx ⊂ Orby .
b) To see that Stabx is a subgroup of G, we have e ∈ Stabx and, if g1 , g2 ∈ Stabx , then
(g1 g2 ) · x = g1 · (g2 · x) = g1 · x = x,
so g1 g2 ∈ Stabx . Thus Stabx is closed under multiplication. Lastly,
g · x = x ⟹ x = g⁻¹ · x,
so Stabx is closed under inversion.
To show Stab_{g·x} = g Stabx g⁻¹, for any x ∈ X and g ∈ G, observe that
h ∈ Stab_{g·x} ⟺ h · (g · x) = g · x
⟺ (hg) · x = g · x
⟺ g⁻¹ · ((hg) · x) = x
⟺ (g⁻¹hg) · x = x
⟺ g⁻¹hg ∈ Stabx
⟺ h ∈ g Stabx g⁻¹,
so Stab_{g·x} = g Stabx g⁻¹.
c) The condition g · x = g′ · x is equivalent to x = (g⁻¹g′) · x, which means g⁻¹g′ ∈ Stabx,
or g′ ∈ g Stabx. Therefore g and g′ act in the same way on x if and only if g and g′ lie in
the same left coset of Stabx. (Remember that for any subgroup H, g′ ∈ gH if and only if
g′H = gH.)
Since Orbx consists of the points g ·x for varying g, and elements of G act in the same way
on x if and only if they lie in the same left coset of Stabx , we get a function G → Orbx by
g ↦ gx that is surjective and the inverse image of each point in Orbx is a left coset of Stabx.
Thus # Orbx is the number of left cosets of Stabx in G, which is the index [G : Stabx].
Remark 3.16. That different orbits of a group action are disjoint includes as special
cases two basic disjointness results in group theory: disjointness of conjugacy classes and
disjointness of left cosets of a subgroup. The first case uses the action of a group on itself by
conjugation, having conjugacy classes as its orbits. The second case uses the right-inverse
multiplication action of the subgroup on the group (Example 3.14).
Example 3.17. Let S3 act on itself by conjugation. Its orbits are its conjugacy classes,
which are
{(1)}, {(12), (13), (23)}, {(123), (132)}.
The conjugacy class of (12) has size 3 and the stabilizer of (12) is its centralizer {(1), (12)},
which has index 3 in S3 .
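The class sizes and the orbit-stabilizer formula for this example can be verified by enumeration. In the sketch below S3 is modelled as permutations of {0, 1, 2}, and the names mul, inv, conj and classes are illustrative.

```python
from itertools import permutations

G = list(permutations(range(3)))     # S3 on {0, 1, 2}; g[i] is the image of i

def mul(g, h):
    """Product gh, meaning 'apply h, then g'."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    out = [0] * 3
    for i, j in enumerate(g):
        out[j] = i
    return tuple(out)

def conj(g, x):
    """The conjugation action g . x = g x g^{-1}."""
    return mul(mul(g, x), inv(g))

# Orbits of the conjugation action are the conjugacy classes.
classes = {frozenset(conj(g, x) for g in G) for x in G}
class_sizes = sorted(len(c) for c in classes)

# Orbit-stabilizer: #Orb_x * #Stab_x == #G for every x.
orbit_stabilizer_holds = all(
    len({conj(g, x) for g in G}) * len([g for g in G if conj(g, x) == x]) == len(G)
    for x in G
)

# Centralizer of a transposition (here the one swapping 0 and 1).
t = (1, 0, 2)
centralizer_of_t = [g for g in G if conj(g, t) == t]

print(class_sizes, orbit_stabilizer_holds, centralizer_of_t)
```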
Example 3.18. The 2 × 2 matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ ∈ GL2(R) whose columns add up to 1 are a
group. This can be checked by a tedious calculation. But this can be seen more simply by
observing that the column sums are the entries in the vector-matrix product $(1\ 1)\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, so
the matrices of interest are those satisfying $(1\ 1)\begin{pmatrix} a & b \\ c & d \end{pmatrix} = (1\ 1)$. This is the stabilizer of (1 1)
in the (right!) action of GL2(R) on R², viewed as row vectors, by v · A = vA, so it is a
subgroup of GL2(R) since the stabilizer of a point is always a subgroup. (Theorem 3.15
for right group actions should be formulated and checked by the reader.)
Moreover, because $(0\ 1)\begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix} = (1\ 1)$, Stab_{(1 1)} and Stab_{(0 1)} are conjugate subgroups
in GL2(R). Since Stab_{(0 1)} = $\{\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \in \mathrm{GL}_2(\mathbf{R})\}$ = Aff(R), a model for our “column-sum-1
group” is its conjugate subgroup Aff(R). Explicitly,
\[ \mathrm{Stab}_{(1\ 1)} = \mathrm{Stab}_{(0\ 1)\left(\begin{smallmatrix} 0 & -1 \\ 1 & 1 \end{smallmatrix}\right)} = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}^{-1} \mathrm{Aff}(\mathbf{R}) \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}. \]
Example 3.19. As a cute application of the orbit-stabilizer formula we explain why
#(HK) = #H#K/#(H ∩ K) for subgroups H and K of a finite group G. Here HK =
{hk : h ∈ H, k ∈ K} is the set of products, which usually is just a subset (not a subgroup)
of G. To count the size of HK, let the direct product group H × K act on the set HK like
this: (h, k) · x = hxk⁻¹. Check this gives a group action (the group is H × K and the set
is HK). There is only one orbit since e = ee ∈ HK and hk = (h, k⁻¹) · e. Therefore the
orbit-stabilizer formula tells us
\[ \#(HK) = \frac{\#(H \times K)}{\#\mathrm{Stab}_e} = \frac{\#H\,\#K}{\#\{(h,k) : (h,k)\cdot e = e\}}. \]
The condition (h, k) · e = e means hk⁻¹ = e, so Stabe = {(h, h) : h ∈ H ∩ K}. Therefore
# Stabe = #(H ∩ K) and #(HK) = #H#K/#(H ∩ K).
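A quick numerical check of the counting formula, for two subgroups of order 2 in S3 chosen for this illustration (the helper names mul, inv and product_set are also made up here):

```python
def mul(p, q):
    """Product pq of permutations stored as tuples, 'apply q, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)

def product_set(H, K):
    """The set HK = {hk : h in H, k in K}."""
    return {mul(h, k) for h in H for k in K}

e = (0, 1, 2)
H = {e, (1, 0, 2)}   # generated by the transposition swapping 0 and 1
K = {e, (2, 1, 0)}   # generated by the transposition swapping 0 and 2

HK = product_set(H, K)
predicted = len(H) * len(K) // len(H & K)

# Stabilizer of e under (h, k) . x = h x k^{-1}.
stab_e = [(h, k) for h in H for k in K if mul(mul(h, e), inv(k)) == e]

# HK need not be a subgroup: test closure under multiplication.
is_closed = all(mul(a, b) in HK for a in HK for b in HK)

print(len(HK), predicted, len(stab_e), is_closed)
```

Here HK has 4 elements, which does not divide 6, so HK cannot be a subgroup of S3; the formula counts it correctly all the same.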
Example 3.20. We now discuss the original version of Lagrange’s theorem in group theory.
Here is what he proved: for any polynomial f (T1 , . . . , Tn ) in n variables, the number of
different polynomials we can obtain from f (T1 , . . . , Tn ) through all permutations of its
variables is a factor of n!.
For instance, taking n = 3, consider the polynomial T1. If we run through all six permutations of the set {T1, T2, T3}, and apply each to T1, we get 3 different results: T1, T2, and
T3. The polynomial (T1 − T2)T3² + (T2 − T3)T1² + (T3 − T1)T2² has only 2 possibilities under
any change of variables: itself or its negative (check this for yourself). The polynomial
T1 + T2² + T3³ has 6 different possibilities. The number of different polynomials we find in
each case is a factor of 3!.
To explain Lagrange’s general observation, we apply the orbit-stabilizer formula to the
group action in Example 2.5. That was the action of Sn on n-variable polynomials by
permutations of the variables. For an n-variable polynomial f (T1 , . . . , Tn ), the different
polynomials we obtain by permuting its variables are exactly the polynomials in its Sn-orbit. Therefore, by the orbit-stabilizer formula, the number of polynomials we get from
f (T1 , . . . , Tn ) by permuting its variables is [Sn : Hf ], where Hf = {σ ∈ Sn : σ · f = f }. This
index is a divisor of n!.
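Lagrange's count can be reproduced for n = 3 by storing a polynomial as a dictionary from exponent tuples to coefficients and applying all six permutations of the variables. This is a sketch under that representation; the names act and orbit_size are illustrative, and the second test polynomial is the alternating polynomial (T1 − T2)(T2 − T3)(T1 − T3), written out in expanded form as T1²T2 − T1²T3 − T1T2² + T1T3² + T2²T3 − T2T3², chosen here as an example with a two-element orbit.

```python
from itertools import permutations

perms = list(permutations(range(3)))   # sigma[i] is the image of i (0-based)

def act(sigma, poly):
    """Replace each variable T_i by T_{sigma(i)}.

    A monomial with exponent e_i on T_i becomes one with exponent e_i on T_{sigma(i)}.
    """
    out = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for i, e in enumerate(exps):
            new[sigma[i]] = e
        out[tuple(new)] = coeff
    return out

def orbit_size(poly):
    """Number of distinct polynomials obtained by permuting the variables."""
    return len({frozenset(act(s, poly).items()) for s in perms})

t1 = {(1, 0, 0): 1}                                        # T1
alternating = {                                            # (T1 - T2)(T2 - T3)(T1 - T3)
    (2, 1, 0): 1, (2, 0, 1): -1, (1, 2, 0): -1,
    (1, 0, 2): 1, (0, 2, 1): 1, (0, 1, 2): -1,
}
mixed = {(1, 0, 0): 1, (0, 2, 0): 1, (0, 0, 3): 1}         # T1 + T2^2 + T3^3

sizes = [orbit_size(t1), orbit_size(alternating), orbit_size(mixed)]
print(sizes, all(6 % k == 0 for k in sizes))
```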
The following two corollaries to Theorem 3.15 are reinterpretations of parts of Theorem
3.15, and the proofs are left to the reader.
Corollary 3.21. Let G act on X, where G is finite.
a) The length of every orbit divides the size of G.
b) Points in a common orbit have conjugate stabilizers, and in particular the size of the
stabilizer is the same for all points in an orbit.
Corollary 3.22. Let G act on X, where G and X are finite. Let the different orbits of X
be represented by x1 , . . . , xt . Then
\[ \#X = \sum_{i=1}^{t} \#\operatorname{Orb}_{x_i} = \sum_{i=1}^{t} [G : \operatorname{Stab}_{x_i}]. \tag{3.1} \]
Example 3.23. For any finite group G, each conjugacy class in G has size dividing the
size of G, since a conjugacy class in G is an orbit in the conjugation action of G on itself,
so Corollary 3.21a applies. Moreover, for the conjugation action (3.1) is the class equation.
In a group action, the length of an orbit divides #G, but the number of orbits usually
does not divide #G. For example, S4 has 5 conjugacy classes, and 5 does not divide 24.
But there is an interesting relation between the number of orbits and the group action.
Theorem 3.24. Let a finite group G act on a finite set X with r orbits. Then r is the
average number of fixed points of the elements of the group:
\[ r = \frac{1}{\#G} \sum_{g \in G} \#\operatorname{Fix}_g(X), \]
where Fixg (X) = {x ∈ X : gx = x} is the set of elements of X fixed by g.
Don’t confuse the set Fixg (X) with the fixed points for the action: Fixg (X) is only the
points fixed by the element g. The set of fixed points for the action of G is the intersection
of the sets Fixg (X) as g runs over the group.
Proof. We will count {(g, x) ∈ G × X : gx = x} in two ways.
By counting over g’s first we have to add up the number of x’s with gx = x, so
\[ \#\{(g,x) \in G \times X : gx = x\} = \sum_{g \in G} \#\operatorname{Fix}_g(X). \]
Next we count over the x’s and have to add up the number of g’s with gx = x, i.e., with
g ∈ Stabx:
\[ \#\{(g,x) \in G \times X : gx = x\} = \sum_{x \in X} \#\operatorname{Stab}_x. \]
Equating these two counts gives
\[ \sum_{g \in G} \#\operatorname{Fix}_g(X) = \sum_{x \in X} \#\operatorname{Stab}_x. \]
By the orbit-stabilizer formula, #G/# Stabx = # Orbx, so
\[ \sum_{g \in G} \#\operatorname{Fix}_g(X) = \sum_{x \in X} \frac{\#G}{\#\operatorname{Orb}_x}. \]
Divide by #G:
\[ \frac{1}{\#G} \sum_{g \in G} \#\operatorname{Fix}_g(X) = \sum_{x \in X} \frac{1}{\#\operatorname{Orb}_x}. \]
Let’s consider the contribution to the right side from points in a single orbit. If an orbit
has n points in it, then the sum over the points in that orbit is a sum of 1/n for n terms,
and that is equal to 1. Thus the part of the sum over points in an orbit is 1, which makes
the sum on the right side equal to the number of orbits, which is r.
Theorem 3.24 is often called Burnside’s lemma, but it is not due to him [3]. He included
it in his widely read book on group theory.
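As a sanity check on Theorem 3.24, the two sides of the formula can be computed independently for the conjugation action of S4 on itself, where the orbits are the conjugacy classes. This is a small sketch with hypothetical helper names, storing permutations as 0-indexed tuples.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(g, x):
    """The conjugation action g · x = g x g^{-1}."""
    return compose(compose(g, x), inverse(g))

G = list(permutations(range(4)))   # S4, acting on X = G by conjugation

# Count orbits directly by sweeping out each orbit once.
num_orbits = 0
seen = set()
for x in G:
    if x not in seen:
        num_orbits += 1
        seen.update(conjugate(g, x) for g in G)

# Sum of #Fix_g(X) over g in G, counted as pairs (g, x) with g · x = x.
total_fixed = sum(1 for g in G for x in G if conjugate(g, x) == x)

print(num_orbits, total_fixed / len(G))
```

Both numbers agree with the count of conjugacy classes of S4 mentioned before the theorem.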
Example 3.25. We will use a special case of Theorem 3.24 to prove for all a ∈ Z and
m ∈ Z+ that
\[ \sum_{k=1}^{m} a^{(k,m)} \equiv 0 \bmod m. \tag{3.2} \]
When m = p is a prime number, the left side is (p − 1)a + a^p = (a^p − a) + pa, so (3.2)
becomes a^p ≡ a mod p, which is Fermat’s little theorem. Thus (3.2) can be thought of as a
generalization of Fermat’s little theorem to all moduli that is essentially different from the
generalization called Euler’s theorem, which says a^{ϕ(m)} ≡ 1 mod m if (a, m) = 1: (3.2) is
true for all a ∈ Z.
Our setup leading to (3.2) starts with any finite group G and comes from [2]. For a positive
integer a, G acts on the set of functions Map(G, {1, 2, . . . , a}) by (g · f)(h) = f(g^{-1}h).
This is a special case of the group action in Example 2.7, where G acts on itself by left
multiplication. We want to apply Theorem 3.24 to this action, so we need to understand
the fixed points (really, fixed functions) of each g ∈ G. We have g · f = f if and only if
f(g^{-1}h) = f(h) for all h ∈ G, which is the same as saying f is constant on every right coset
⟨g⟩h in G. The number of right cosets of ⟨g⟩ in G is [G : ⟨g⟩] = m/ord(g), where m = #G and
ord(g) is the order of g, so the number of functions fixed by g is a^{m/ord(g)}, since the value of
the function on each coset can be chosen arbitrarily in {1, . . . , a}. Therefore Theorem 3.24
implies (1/m) ∑_{g∈G} a^{m/ord(g)} is a positive integer, so
\[ \sum_{g \in G} a^{m/\operatorname{ord}(g)} \equiv 0 \bmod m. \tag{3.3} \]
Since (3.3) depends on a only through the value of a mod m, it holds for all a ∈ Z, not just
a > 0.
Taking G = Z/(m), any k ∈ G has additive order m/(k, m), so (3.3) becomes
\[ \sum_{k=1}^{m} a^{(k,m)} \equiv 0 \bmod m. \]
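The congruence (3.2) is easy to test numerically. The sketch below checks it for a small range of moduli and for positive, zero, and negative values of a; the function name `gcd_power_sum` is a hypothetical helper, and the ranges are arbitrary.

```python
from math import gcd

def gcd_power_sum(a, m):
    """Left side of (3.2): the sum of a^gcd(k, m) for k = 1, ..., m."""
    return sum(a ** gcd(k, m) for k in range(1, m + 1))

# Worked case a = 2, m = 6: the gcds are 1, 2, 3, 2, 1, 6.
print(gcd_power_sum(2, 6), gcd_power_sum(2, 6) % 6)

all_ok = all(gcd_power_sum(a, m) % m == 0
             for a in range(-6, 7)
             for m in range(1, 13))
print(all_ok)
```

For prime m = p the sum reduces to (p − 1)a + a^p, so the same check covers Fermat's little theorem.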
Next we turn to the idea of two different actions of a group being essentially the same.
Definition 3.26. Two actions of a group G on sets X and Y are called equivalent if there
is a bijection f : X → Y such that f (gx) = gf (x) for all g ∈ G and x ∈ X.
Actions of G on two sets are equivalent when G permutes elements in the same way on
the two sets after matching up the sets appropriately. When f : X → Y is an equivalence of
group actions on X and Y , gx = x if and only if gf (x) = f (x), so the stabilizer subgroups
of x ∈ X and f (x) ∈ Y are the same.
Example 3.27. Let R× act on a linear subspace Rv0 ⊂ Rn by scaling. This is equivalent
to the natural action of R× on R by scaling: let f : R → Rv0 by f (a) = av0 . Then f is a
bijection and f (ca) = (ca)v0 = c(av0 ) = cf (a) for all c in R× and a ∈ R.
Example 3.28. Let S3 act on the conjugacy class {(12), (13), (23)} by conjugation. This
action on a 3-element set, described in Table 1 below, looks like the usual action of S3 on
{1, 2, 3} if we identify (12) with 3, (13) with 2, and (23) with 1. Therefore this action of S3
on {(12), (13), (23)} by conjugation is equivalent to the natural action of S3 on {1, 2, 3}.
π        π(12)π^{-1}   π(13)π^{-1}   π(23)π^{-1}
(1)      (12)          (13)          (23)
(12)     (12)          (23)          (13)
(13)     (23)          (13)          (12)
(23)     (13)          (12)          (23)
(123)    (23)          (12)          (13)
(132)    (13)          (23)          (12)
Table 1.
Example 3.29. Let GL2(R) act on the set B of ordered bases (e1, e2) of R^2 in the natural
way: if A ∈ GL2(R) then A(e1, e2) := (Ae1, Ae2) is another ordered basis of R^2. This
action of GL2(R) on B is equivalent to the action of GL2(R) on itself by left multiplication.
The reason is that the two columns of a matrix in GL2(R) are a basis of R^2 (with the first
and second columns being an ordering of basis vectors: the first column is the first basis
vector and the second column is the second basis vector) and two square matrices multiply
through multiplication on the columns:
\[ A \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \left( A\begin{pmatrix} a \\ c \end{pmatrix} \;\; A\begin{pmatrix} b \\ d \end{pmatrix} \right). \]
Letting f : B → GL2(R) by
\[ f\left( \begin{pmatrix} a \\ c \end{pmatrix}, \begin{pmatrix} b \\ d \end{pmatrix} \right) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
gives a bijection and f(A(e1, e2)) = A · f(e1, e2) for all A ∈ GL2(R) and
(e1, e2) ∈ B.
Example 3.30. Let H and K be subgroups of G. The group G acts by left multiplication
on G/H and G/K. If H and K are conjugate subgroups then these actions are equivalent:
write K = g_0 H g_0^{-1} and let f : G/H → G/K by f(gH) = g g_0^{-1} K. This is well-defined since,
for h ∈ H,
\[ g h g_0^{-1} K = g h g_0^{-1} g_0 H g_0^{-1} = g H g_0^{-1} = g g_0^{-1} K. \]
The reader can check f(g(g′H)) = gf(g′H) for g ∈ G and g′H ∈ G/H, and f is a bijection.
If H and K are non-conjugate then the actions of G on G/H and G/K are not equivalent: corresponding points in equivalent actions have the same stabilizer subgroup, but
the stabilizer subgroups of left cosets in G/H are conjugate to H and those in G/K are
conjugate to K, and none of the former and latter are equal.
The left multiplication action of G on a left coset space G/H has one orbit. It turns out
all actions with one orbit are essentially of this form:
Theorem 3.31. An action of G with one orbit is equivalent to the left multiplication action
of G on a left coset space.
Proof. Suppose that G acts on X with one orbit. Fix x0 ∈ X and let H = Stabx0 . Every
x ∈ X has the form gx0 for some g ∈ G, and all elements in a left coset gH have the same
value at x0 : for all h ∈ H, (gh)(x0 ) = g(hx0 ) = g(x0 ). Let f : G/H → X by f (gH) = gx0 .
This is well-defined, as we just saw. Moreover, f(g · g′H) = gf(g′H) since both sides equal
gg′(x0). We will show f is a bijection, so the action of G on X is equivalent to the left
multiplication action of G on G/H.
Since X has one orbit, X = {gx0 : g ∈ G} = {f(gH) : g ∈ G}, so f is onto. If
f(g1H) = f(g2H) then g1x0 = g2x0, so g2^{-1}g1x0 = x0. Since x0 has stabilizer H, g2^{-1}g1 ∈ H, so
g1H = g2H. Thus f is one-to-one.
A particular case of Theorem 3.31 says that an action of G is equivalent to the left
multiplication action of G on itself if and only if the action has one orbit and the stabilizer
subgroups are trivial.
Definition 3.32. The action of G on X is called free when every point has a trivial stabilizer.
Example 3.33. The left multiplication action of a group on itself is free with one orbit.
Example 3.34. The antipodal action of Z/(2) on a sphere (where the nontrivial element
acts by negation) is a free action. There are uncountably many orbits.
Free actions show up quite often in topology, and Example 3.34 is a typical illustration
of that.
Example 3.35. For an integer n ≥ 2, let Xn be the set of roots of unity of exact order
n in C×, so #Xn = ϕ(n). (For instance, X4 = {i, −i}.) The group (Z/(n))× acts on Xn
by a · ζ = ζ^a. Since every element of Xn is a power of every other element of Xn using
exponents relatively prime to n, this action of (Z/(n))× has a single orbit. Since ζ^a = ζ
only if a ≡ 1 mod n (ζ has order n), stabilizers are trivial. Thus (Z/(n))× acting on Xn
is equivalent to the multiplication action of (Z/(n))× on itself, except there is no naturally
distinguished element of Xn while 1 is a distinguished element of (Z/(n))×.
It is worth comparing faithful and free actions. An action is faithful (Definition 2.15)
when g1 ≠ g2 ⇒ g1x ≠ g2x for some x ∈ X (different elements of G act differently at some
point) while an action is free when g1 ≠ g2 ⇒ g1x ≠ g2x for all x ∈ X (different elements
of G act differently at every point). Since g1x = g2x if and only if g2^{-1}g1x = x, we can
describe faithful and free actions in terms of fixed points: an action is faithful when each
g ≠ e has Fixg(X) ≠ X while an action is free when each g ≠ e has Fixg(X) = ∅.
4. Actions of p-groups
The action of a group of prime power size has special features. When #G = p^k for a
prime p, we call G a p-group. For example, D4 is a 2-group. Because all subgroups of
a p-group have p-power index, the length of an orbit under an action by a p-group is a
multiple of p unless the point is a fixed point, when its orbit has length 1. This leads to an
important congruence modulo p when a p-group is acting.
Theorem 4.1 (Fixed Point Congruence). Let G be a finite p-group acting on a finite set
X. Then
#X ≡ #{fixed points} mod p.
Proof. Let the different orbits in X be represented by x1, . . . , xt, so Corollary 3.22 leads to
\[ \#X = \sum_{i=1}^{t} \#\operatorname{Orb}_{x_i}. \tag{4.1} \]
Since # Orb_{x_i} = [G : Stab_{x_i}] and #G is a power of p, # Orb_{x_i} ≡ 0 mod p unless Stab_{x_i} = G,
in which case Orb_{x_i} has length 1, i.e., x_i is a fixed point. Thus when we reduce both sides
of (4.1) modulo p, all terms on the right side vanish except for a contribution of 1 for each
fixed point. That implies
#X ≡ #{fixed points} mod p.
Keep in mind that the congruence in Theorem 4.1 holds only for actions by groups with
prime-power size. When a group of size 9 acts we get a congruence mod 3, but when a
group of size 6 acts we do not get a congruence mod 2 or 3.
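The contrast between prime-power and non-prime-power group sizes can be seen in a concrete family of actions. The sketch below is an illustration chosen here, not an example from the text: the cyclic group Z/(n) acts on words of length n over a finite alphabet by cyclic shifts, and the fixed points are the constant words. The helper names are hypothetical.

```python
from itertools import product

def shift(word, j):
    """Cyclically shift a tuple by j positions."""
    return word[j:] + word[:j]

def count_points(n, alphabet_size):
    """Return (#X, #fixed points) for Z/(n) shifting words of length n."""
    X = list(product(range(alphabet_size), repeat=n))
    fixed = [w for w in X if all(shift(w, j) == w for j in range(n))]
    return len(X), len(fixed)

for n in (5, 9, 6):
    size, fixed = count_points(n, 2)
    print(n, size, fixed, (size - fixed) % 3)
```

For n = 5 the congruence holds mod 5 and for n = 9 it holds mod 3, while for the group of size 6 the difference 64 − 2 = 62 is not divisible by 3.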
Corollary 4.2. Let G be a finite p-group acting on a finite set X. If #X is not divisible
by p, then there is at least one fixed point in X. If #X is divisible by p, then the number
of fixed points is a multiple of p (possibly 0).
Proof. When #X is not divisible by p, neither is the number of fixed points (by the fixed
point congruence), so the number of fixed points can’t equal 0 (after all, p|0) and thus is
≥ 1. On the other hand, when #X is divisible by p, then the fixed point congruence shows
the number of fixed points is ≡ 0 mod p, so this number is a multiple of p.
Example 4.3. Let G be a p-subgroup of GLn(Z/(p)), where n ≥ 1. Then there is a nonzero v ∈ (Z/(p))^n such that gv = v for all g ∈ G. Indeed, because G is a group of matrices
it naturally acts on the set V = (Z/(p))^n. (The identity matrix is the identity function and
g1(g2v) = (g1g2)v by the rules of matrix-vector multiplication.) Since the set V has size
p^n ≡ 0 mod p, the number of fixed points is divisible by p. The number of fixed points is
at least 1, since the zero vector is a fixed point, so the number of fixed points is at least p.
A non-zero fixed point for a group of matrices can be interpreted as a simultaneous
eigenvector with eigenvalue 1. These are the only possible simultaneous eigenvectors for G
in (Z/(p))^n since every element of G has p-power order and the only element of p-power
order in (Z/(p))× is 1 (so a simultaneous eigenvector for G in (Z/(p))^n must have eigenvalue
1 for each element of the group).
Theorem 4.1 can be used to prove existence theorems involving finite groups (nonconstructively) if we can interpret a problem in terms of fixed points. For example, an
element of a group is in the center precisely when it is a fixed point for the conjugation
action of the group on itself. Thus, if we want to show some class of groups has non-trivial
centers then we can try to show there are fixed points other than the identity element for
the conjugation action.
5. New Proofs Using Group Actions
In this section we prove two results using group actions (especially using Theorem 4.1):
finite p-groups have non-trivial center and Cauchy’s theorem. Our proof that finite p-groups
have a non-trivial center is actually the same as the usual proof without group actions, but
presented in a more elegant way. We will prove Cauchy’s theorem in two ways. Unlike the
usual inductive proof that doesn’t use group actions, the proofs we will give with group
actions treat abelian and non-abelian groups in a uniform manner.
Theorem 5.1. Let G be a non-trivial p-group. Then the center of G has size divisible by
p. In particular, G has a non-trivial center.
Proof. The condition that a lies in the center of G can be written as a = gag^{-1} for all g, so
a is fixed by all conjugations. The main idea of the proof is to consider the action of G
on itself (X = G) by conjugation and count the fixed points.
We denote the center of G, as usual, by Z(G). Since G is a p-group, and X = G here,
the fixed point congruence (Theorem 4.1) implies
\[ \#G \equiv \#Z(G) \bmod p. \tag{5.1} \]
(This is the class equation for G, reduced modulo p.) Since #G is a power of p, (5.1) says
\[ 0 \equiv \#Z(G) \bmod p, \]
so p|#Z(G). Because #Z(G) ≥ 1 (the identity is in Z(G)), Z(G) contains at least p
elements, so in particular Z(G) ≠ {e}.
With almost no extra work we can prove a stronger result.
Corollary 5.2. Let G be a non-trivial p-group. For any non-trivial normal subgroup N ◁ G,
N ∩ Z(G) ≠ {e}. That is, every non-trivial normal subgroup meets the center of G nontrivially.
Proof. We argue as in the proof of Theorem 5.1, but let G act on N by conjugation, which
makes sense because N is normal. The fixed points of this action are the elements of
N ∩ Z(G). Since N is a non-trivial p-group, #N ≡ 0 mod p, so the fixed point congruence
(Theorem 4.1) implies N ∩ Z(G) has size divisible by p. Since e ∈ N ∩ Z(G), this size is at
least p. Thus N ∩ Z(G) is nontrivial.
Theorem 5.3 (Cauchy). Let G be a finite group and p be a prime factor of #G. Then
some element of G has order p.
Proof. (McKay) We are looking for solutions to the equation g^p = e other than g = e. It is
not obvious in advance that there are any such solutions. McKay’s idea is to work with a
more general equation that has many solutions and then recognize solutions to the original
equation as fixed points under a group action on the solution set.
We will generalize the equation g^p = e to g1g2 · · · gp = e. This is an equation in p
unknowns. If we are given any choices for g1, . . . , g_{p−1} then gp is uniquely determined as the
inverse of g1g2 · · · g_{p−1}. Therefore the total number of solutions to this equation is (#G)^{p−1}.
By comparison, we have no idea how many solutions there are to g^p = e and we only know
one solution, the trivial one that we are not interested in.
Consider the solution set to the generalized equation:
\[ X = \{(g_1, \dots, g_p) : g_i \in G,\ g_1 g_2 \cdots g_p = e\}. \]
We noted above that #X = (#G)^{p−1}, so this set is big. The nice feature of this solution
set is that cyclic shifts of one solution give us more solutions: if (g1, g2, . . . , gp) ∈ X then
so is (g2, . . . , gp, g1). Indeed, g1 = (g2 · · · gp)^{-1} and elements commute with their inverses
so g2 · · · gpg1 = e. Successive shifting of coordinates in a solution can be interpreted as a
group action of Z/(p) on X: for j ∈ Z/(p), let j · (g1, . . . , gp) = (g_{1+j}, . . . , g_{p+j}), where the
subscripts are interpreted modulo p. This shift is a group action. Since the group doing
the acting is the p-group Z/(p), the fixed point congruence (Theorem 4.1) tells us
\[ (\#G)^{p-1} \equiv \#\{\text{fixed points}\} \bmod p. \tag{5.2} \]
What are the points of X fixed by Z/(p)? Cyclic shifts bring every coordinate eventually
into the first position, so a fixed point of X is one where all coordinates are equal. Calling
the common value g, we have (g, g, . . . , g) ∈ X precisely when g^p = e. Therefore (5.2)
becomes
\[ (\#G)^{p-1} \equiv \#\{g \in G : g^p = e\} \bmod p. \tag{5.3} \]
Up to this point we have not used the condition p|#G. That is, (5.3) is valid for any finite
group G and any prime p. This will be useful in Appendix A.
When p divides #G, the left side of (5.3) vanishes modulo p, so the right side is a multiple
of p. Thus #{g ∈ G : g^p = e} ≡ 0 mod p. Since #{g ∈ G : g^p = e} > 0, this count is at
least p, so there must be some g ≠ e with g^p = e.
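McKay's counting argument can be replayed by machine for a small group. The following sketch takes G = S3 (an assumption made for illustration) and, for p = 2, 3, 5, builds the solution set X of g1g2···gp = e, counts the tuples fixed by every cyclic shift, and compares both sides of (5.3). The helper names are hypothetical.

```python
from itertools import permutations, product
from functools import reduce

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

G = list(permutations(range(3)))   # S3 as 0-indexed tuples
e = tuple(range(3))

def mckay_counts(p):
    """Return (#X, #fixed points) for Z/(p) shifting solutions of g1...gp = e."""
    X = [t for t in product(G, repeat=p) if reduce(compose, t) == e]
    fixed = [t for t in X if all(t[j:] + t[:j] == t for j in range(p))]
    return len(X), len(fixed)

for p in (2, 3, 5):
    size, fixed = mckay_counts(p)
    print(p, size, fixed, (size - fixed) % p)
```

The case p = 5 shows that (5.3) holds even when p does not divide #G: there the only solution of g^5 = e is the identity, and 6^4 ≡ 1 mod 5.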
Remark 5.4. Letting G be any finite group where p|#G, (5.3) says
\[ \#\{g \in G : g^p = e\} \equiv 0 \bmod p. \tag{5.4} \]
Frobenius proved a more general result: when d|#G,
\[ \#\{g \in G : g^d = e\} \equiv 0 \bmod d. \]
The divisor d need not be a prime. However, the proof is not as direct as the case of a
prime divisor, and we don’t look at this more closely.
Here is a second group action proof of Cauchy’s theorem.
Proof. Let n = #G and p|n. We let the group Z/(p) × G act on the set G^p by
\[ (i, g) \cdot (g_1, g_2, \dots, g_p) = (g g_{i+1}, g g_{i+2}, \dots, g g_{i+p}), \]
where indices are interpreted modulo p. This is a group action.
Let ∆ = {(g, g, . . . , g) : g ∈ G} be the diagonal in G^p. The action of Z/(p) × G on G^p
preserves X = G^p − ∆, and we consider the group action on this set, which has size n^p − n.
Since #(Z/(p) × G) = pn, all orbits have length dividing pn. Since p|n, pn does not divide
n^p − n, so some orbit has length less than pn. Let (g1, g2, . . . , gp) be a point in such an
orbit, so this point has non-trivial stabilizer (why?). Let (i, g) be a non-identity element in
its stabilizer. We will show g ≠ e and g^p = e. The condition that (i, g) fixes (g1, g2, . . . , gp)
is equivalent to
\[ g g_{i+1} = g_1,\ g g_{i+2} = g_2,\ \dots,\ g g_{i+p} = g_p. \]
Thus
\begin{align*}
g_1 &= g g_{i+1} \\
    &= g \cdot g g_{2i+1} \quad (\text{since } g_k = g g_{i+k} \text{ for all } k) \\
    &= g^2 g_{2i+1} \\
    &= \cdots \\
    &= g^r g_{ri+1}
\end{align*}
for all r.
If g = e then i ≢ 0 mod p and g_{ri+1} = g_1 for all r. Since {ri + 1 mod p : r ≥ 1} = Z/(p),
every g_k equals g_1, so (g1, . . . , gp) ∈ ∆, a contradiction. Therefore g ≠ e. Taking r = p,
g_1 = g^p g_{pi+1} = g^p g_1, so g^p = e.
6. More Applications of Group Actions to Group Theory
In Theorem 1.6 we saw how to interpret a group action of G as a homomorphism of G
to a symmetric group. We will now put this idea to use.
Theorem 6.1. Any nonabelian group of order 6 is isomorphic to S3 .
Proof. Let G be nonabelian with order 6. We will find a set of size 3 that G naturally
permutes.
By Cauchy, G contains an element a of order 2 and b of order 3. Since G is nonabelian, a
and b do not commute. Therefore bab^{-1} is neither 1 nor a. Set H := ⟨a⟩ = {1, a}. This is not
a normal subgroup of G since bab^{-1} ∉ H. There are 3 left cosets in G/H. Let G act by left
multiplication on G/H. This group action is a homomorphism ℓ : G → Sym(G/H) ≅ S3. If
g is in the kernel of ℓ then gH = H, so g ∈ H. Thus the kernel is either {1} or H. Since H
is not a normal subgroup, it can’t be a kernel, so ℓ has trivial kernel: it is injective. Both
G and S3 have order 6, so ℓ is an isomorphism of G with S3.
Theorem 6.2. Let G be any finite group and H be a p-subgroup such that p|[G : H]. Then
p|[N(H) : H]. In particular, N(H) ≠ H.
We are not assuming here that G is a p-group. The case when G is a p-group as well will
show up in Corollary 6.4.
Proof. Let H (not G!) act on G/H by left multiplication. Since H is a p-group, the fixed
point congruence (Theorem 4.1) tells us
\[ [G : H] \equiv \#\{\text{fixed points}\} \bmod p. \tag{6.1} \]
What is a fixed point here? It is a coset gH such that hgH = gH for all h ∈ H. That
means hg ∈ gH for every h ∈ H, which is equivalent to g^{-1}Hg = H. This condition is
exactly that g ∈ N(H), so the fixed points are the cosets gH with g ∈ N(H). Therefore
(6.1) says
\[ [G : H] \equiv [N(H) : H] \bmod p. \]
This congruence is valid for any p-subgroup H of a finite group G. When p|[G : H], we
read off from the congruence that p|[N(H) : H], so the index [N(H) : H] can’t be 1 and
N(H) ≠ H.
Example 6.3. Let G = A4 and H = {(1), (12)(34)}. Then 2|[G : H], so N(H) ≠ H. In
fact, N(H) = {(1), (12)(34), (13)(24), (14)(23)}.
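The congruence [G : H] ≡ [N(H) : H] mod p can be verified by direct computation in this example. The sketch below, with hypothetical helper names and 0-indexed permutations, lists the cosets of H in A4, finds the cosets fixed by left multiplication by H, and computes the normalizer separately.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

G = [p for p in permutations(range(4)) if is_even(p)]   # A4
H = {(0, 1, 2, 3), (1, 0, 3, 2)}                        # {(1), (12)(34)}, 0-indexed

cosets = {frozenset(compose(g, h) for h in H) for g in G}
fixed = [c for c in cosets
         if all(frozenset(compose(h, x) for x in c) == c for h in H)]
normalizer = [g for g in G
              if {compose(compose(g, h), inverse(g)) for h in H} == H]

print(len(cosets), len(fixed), len(normalizer))
```

The number of fixed cosets equals [N(H) : H], and it differs from [G : H] by a multiple of 2.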
Corollary 6.4. Let G be a finite p-group. Any subgroup of G with index p is a normal
subgroup.
Proof. We give two proofs. First, let the subgroup be H, so H ⊂ N(H) ⊂ G. Since
[G : H] = p, one of these inclusions is an equality. By Theorem 6.2, N(H) ≠ H, so
N(H) = G. That means H ◁ G.
For a second proof, consider the left multiplication action of G on the left coset space
G/H. By Theorem 1.6, this action can be viewed as a group homomorphism ℓ : G →
Sym(G/H) ≅ Sp. Let K be the kernel of ℓ. We will show H = K. The quotient G/K
embeds into Sp, meaning [G : K]|p!. Since [G : K] is a power of p, [G : K] = 1 or p. At
the same time, any g ∈ K at least satisfies gH = H, so g ∈ H. In other words, K ⊂ H, so
[G : K] > 1. Thus [G : K] = p, so [H : K] = [G : K]/[G : H] = 1, i.e., H = K ◁ G.
Corollary 6.5. Let G be a finite group and p be a prime with p^n | #G. Then there is a
chain of subgroups
\[ \{e\} = H_0 \subset H_1 \subset \cdots \subset H_n \subset G, \]
where #H_i = p^i.
Proof. We can take n ≥ 1. Since p|#G there is a subgroup of size p by Cauchy’s theorem,
so we have H_1. Assuming for some i < n we have a chain of subgroups up to H_i, we will
find a subgroup H_{i+1} with size p^{i+1} that contains H_i.
Since p|[G : H_i], by Theorem 6.2 p|[N(H_i) : H_i]. Since H_i ◁ N(H_i), we can consider
the quotient group N(H_i)/H_i. It has size divisible by p, so by Cauchy’s theorem there
is a subgroup of size p. The inverse image of this subgroup under the reduction map
N(H_i) → N(H_i)/H_i is a group H_{i+1} of size p#H_i = p^{i+1}.
Theorem 6.6 (C. Jordan). If a nontrivial finite group G acts on a finite set X of size greater
than 1 and the action has only one orbit then some g ∈ G has no fixed points.
Proof. By Theorem 3.24,
\[ 1 = \frac{1}{\#G} \sum_{g \in G} \#\operatorname{Fix}_g(X) = \frac{1}{\#G} \Big( \#X + \sum_{g \neq e} \#\operatorname{Fix}_g(X) \Big). \]
Assume all g ∈ G have at least one fixed point. Then
\[ 1 \geq \frac{1}{\#G} (\#X + \#G - 1) = 1 + \frac{\#X - 1}{\#G}. \]
Therefore #X − 1 ≤ 0, so #X = 1. This is a contradiction.
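Jordan's theorem predicts a fixed-point-free element in every transitive action on more than one point. A short sketch, using S3 on 3 points and A4 on 4 points as assumed test cases and hypothetical helper names, lists those elements explicitly.

```python
from itertools import permutations

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def is_transitive(G):
    """One orbit on {0, ..., n-1}: the images of the point 0 cover every point."""
    n = len(G[0])
    return {g[0] for g in G} == set(range(n))

def fixed_point_free(G):
    """Elements of G moving every point."""
    return [g for g in G if all(g[i] != i for i in range(len(g)))]

S3 = list(permutations(range(3)))
A4 = [p for p in permutations(range(4)) if is_even(p)]

for name, G in (("S3", S3), ("A4", A4)):
    print(name, is_transitive(G), len(fixed_point_free(G)))
```

In S3 the fixed-point-free elements are the two 3-cycles, and in A4 they are the three products of two disjoint transpositions.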
Remark 6.7. Using the classification of finite simple groups, it can be shown [1] that g in
Theorem 6.6 can be picked to have prime power order. There are examples showing it may
not be possible to pick a g with prime order.
Theorem 6.8. Let G be a finite group and H a proper subgroup. Then G ≠ ⋃_{g∈G} gHg^{-1}.
That is, the union of the subgroups conjugate to a proper subgroup does not fill up the whole
group.
Proof. We will give two proofs. The second will use group actions.
Each subgroup gHg^{-1} has the same size, namely #H. How many different conjugate
groups gHg^{-1} are there (as g varies)? For g1, g2 ∈ G,
\begin{align*}
g_1 H g_1^{-1} = g_2 H g_2^{-1} &\iff g_2^{-1} g_1 H g_1^{-1} g_2 = H \\
&\iff g_2^{-1} g_1 H (g_2^{-1} g_1)^{-1} = H \\
&\iff g_2^{-1} g_1 \in N(H) \\
&\iff g_1 \in g_2 N(H).
\end{align*}
Therefore the number of different subgroups gHg^{-1} as g varies is [G : N(H)]. If this
number is 1 then the union is H itself, which is not G. Otherwise, these
subgroups all contain the identity, so they are not disjoint. Therefore, on account of the
overlap at the identity, the size of ⋃_{g∈G} gHg^{-1} is strictly less than
\[ [G : N(H)]\,\#H = \frac{\#G}{\#N(H)}\,\#H = \frac{\#H}{\#N(H)}\,\#G \leq \#G, \]
so the union of all gHg^{-1} is not all of G.
For the second proof, we apply Theorem 6.6 to the action of G on X = G/H by left
multiplication. For a ‘point’ gH in G/H, its stabilizer is gHg^{-1}. By Theorem 6.6, some
a ∈ G has no fixed points, which means a ∉ ⋃_{g∈G} gHg^{-1}.
Remark 6.9. Theorem 6.8 is not always true for infinite groups. For instance, let G =
GL2(C). Every matrix in G has an eigenvector, so we can conjugate any matrix in G to the
form
\[ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}. \]
Thus G = ⋃_{g∈G} gHg^{-1}, where H is the proper subgroup of upper triangular
matrices.
Remark 6.10. Here is a deep application of Theorem 6.8 to number theory. Suppose a
polynomial f (X) in Z[X] is irreducible and has a root modulo p for every p. Then f (X) is
linear. The proof of this requires Theorem 6.8 and complex analysis.
Corollary 6.11. If H is a proper subgroup of the finite group G, there is a conjugacy class
in G that is disjoint from H and its conjugate subgroups.
Proof. Pick an x ∉ ⋃_{g∈G} gHg^{-1} and use the conjugacy class of x.
Theorem 6.12. Let G be a finite group with #G > 1, and p the smallest prime factor of
#G. Any subgroup of G with index p is a normal subgroup.
Group actions don’t appear in the statement of Theorem 6.12, but they will play a role
in its proof.
Proof. Let H be a subgroup of G with index p, so G/H is a set with size p. We will prove
H ◁ G by showing H is the kernel of a homomorphism, and thus is a normal subgroup.
Let G act on G/H by left multiplication, which (by Theorem 1.6) gives a group homomorphism
\[ G \to \operatorname{Sym}(G/H) \cong S_p. \tag{6.2} \]
This is the homomorphism sending each g in G to the permutation ℓ_g of G/H, where ℓ_g(aH) =
gaH. We will show the kernel of this homomorphism is H.
Write the kernel of this homomorphism as K, so K ◁ G. To say g ∈ K means ℓ_g is the
identity permutation: g(aH) = aH for all cosets aH. One of the cosets is H itself, so in
particular gH = H, which implies g ∈ H. Therefore K ⊂ H.
By (6.2), since G/K is isomorphic to a subgroup of Sp, #(G/K)|p! by Lagrange. Since
[G : H] = p and K ⊂ H ⊂ G, we write
\[ \#(G/K) = [G : K] = [G : H][H : K] = p[H : K], \]
so the relation #(G/K)|p! simplifies to
\[ [H : K] \mid (p-1)!. \]
Since [H : K] is a factor of #G, every prime factor of [H : K] is ≥ p. But this index divides
(p − 1)!, whose prime factors are all less than p, so [H : K] doesn’t have any prime factors.
That means [H : K] = 1, or H = K. In particular, H is the kernel of a homomorphism out of
G, so H ◁ G.
Some special cases of Theorem 6.12 are worth recording separately.
Corollary 6.13. Let G be a finite group.
a) If H is a subgroup with index 2, then H ◁ G.
b) If G is a p-group and H is a subgroup with index p, then H ◁ G.
c) If #G = pq where p < q are different primes, then any subgroup of G with size q is a
normal subgroup.
Proof. Parts a and b are immediate consequences of Theorem 6.12. For part c, note that a
subgroup with size q is a subgroup with index p.
Part a can be checked directly, without the reasoning of Theorem 6.12: if [G : H] = 2
and a ∉ H, then the two left cosets of H are H and aH, while the two right cosets of H are
H and Ha. Therefore aH = G − H = Ha, so H ◁ G. Part b was already seen in Corollary
6.4. (In fact, our second proof of Corollary 6.4 used the same idea as the proof of Theorem
6.12.) Part c can be checked directly using the explicit list of groups of size pq. In Theorem
6.12, these disparate results are unified into a single statement.
All of our applications of group actions in this section have been to finite groups. Here
is an application to infinite groups.
Theorem 6.14. A finitely generated group has finitely many subgroups of index n for each
integer n ≥ 1.
Proof. Let G be a finitely generated group and H be a subgroup with finite index, say n.
The left multiplication action of G on G/H is a group homomorphism ℓ : G → Sym(G/H).
In this action, the stabilizer of the coset H is H (gH = H if and only if g ∈ H).
Pick an enumeration of the n cosets in G/H so that the coset H corresponds to the
number 1. This enumeration gives an isomorphism Sym(G/H) ≅ Sn, so we can make G
act on the set {1, 2, . . . , n} and the stabilizer of 1 is H. Therefore we have constructed from
each subgroup H ⊂ G of index n an action of G on {1, 2, . . . , n} in which H is the stabilizer
of 1. Since H is recoverable from the action, the number of subgroups of G with index n is
bounded above by the number of homomorphisms G → Sn . Since G is finitely generated,
it has finitely many homomorphisms to the finite group Sn . Therefore G has finitely many
subgroups of index n.
I am not aware of a proof of this theorem that is fundamentally different from the one
presented here.
This is probably a good place to warn the reader about a false property of finitely
generated groups: a subgroup of a finitely generated group need not be finitely generated!
However, every subgroup of a finitely generated group with finite index is finitely generated:
if the original group has d generators, a subgroup with index n has at most (d − 1)n + 1
generators. This is due to Schreier.
Appendix A. Applications of Group Actions to Number Theory
We apply the fixed point congruence in Theorem 4.1 and its consequence (5.3) to derive
three classical congruences modulo p: those of Fermat, Wilson, and Lucas.
Theorem A.1 (Fermat). If n ≢ 0 mod p, then n^{p−1} ≡ 1 mod p.
Proof. It suffices to take n > 0, since (−1)^{p−1} ≡ 1 mod p. (This is obvious for odd p since
p − 1 is even, and for p = 2 use −1 ≡ 1 mod 2.) Apply (5.3) with the additive group
G = Z/(n):
\[ n^{p-1} \equiv \#\{a \in \mathbf{Z}/(n) : pa \equiv 0 \bmod n\} \bmod p. \tag{A.1} \]
Since (p, n) = 1, the congruence pa ≡ 0 mod n is equivalent to a ≡ 0 mod n, so the right
side of (A.1) is 1.
Theorem A.2 (Wilson). For a prime p, (p − 1)! ≡ −1 mod p.
Proof. We consider (5.3) for G = Sp:
\[ 0 \equiv \#\{\sigma \in S_p : \sigma^p = (1)\} \bmod p. \]
An element of Sp has p-th power (1) when it is (1) or a p-cycle. The number of p-cycles is
(p − 1)!, and adding 1 to this gives the total count, so 0 ≡ (p − 1)! + 1 mod p.
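The count used in this proof can be confirmed by enumeration for small primes. The sketch below counts the permutations in Sp whose p-th power is the identity and compares the result with (p − 1)! + 1; the helper names are hypothetical, and the primes tested are limited to keep the enumeration small.

```python
from itertools import permutations
from math import factorial

def power(perm, k):
    """Return the k-th power of a permutation given as a 0-indexed tuple."""
    result = tuple(range(len(perm)))
    for _ in range(k):
        result = tuple(perm[i] for i in result)
    return result

def count_pth_power_trivial(p):
    """Number of sigma in S_p with sigma^p equal to the identity."""
    identity = tuple(range(p))
    return sum(1 for s in permutations(range(p)) if power(s, p) == identity)

for p in (2, 3, 5, 7):
    count = count_pth_power_trivial(p)
    print(p, count, factorial(p - 1) + 1, count % p)
```

The count is (p − 1)! + 1 in each case and is divisible by p, which is Wilson's congruence.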
Theorem A.3 (Lucas). Let p be a prime and n ≥ m be non-negative integers. Write them
in base p as
n = a0 + a1p + a2p^2 + · · · + akp^k,
m = b0 + b1p + b2p^2 + · · · + bkp^k,
with 0 ≤ ai, bi ≤ p − 1. Then
\[ \binom{n}{m} \equiv \binom{a_0}{b_0} \binom{a_1}{b_1} \cdots \binom{a_k}{b_k} \bmod p. \]
Proof. We will prove the congruence in the following form: when n ≥ m ≥ 0, and n =
pn′ + a0 and m = pm′ + b0, where 0 ≤ a0, b0 ≤ p − 1, we have
\[ \binom{n}{m} \equiv \binom{a_0}{b_0} \binom{n'}{m'} \bmod p. \]
The reader should check this implies Lucas’ congruence by induction on n.
Decompose {1, 2, . . . , n} into a union of p blocks of n′ consecutive integers, from 1 to pn′,
followed by a final block of length a0. That is, let
\[ A_i = \{in' + 1, in' + 2, \dots, (i+1)n'\} \]
for 0 ≤ i ≤ p − 1, so
\[ \{1, 2, \dots, n\} = A_0 \cup A_1 \cup \cdots \cup A_{p-1} \cup \{pn' + 1, \dots, pn' + a_0\}. \]
For 1 ≤ t ≤ n′, let σ_t be the p-cycle
\[ \sigma_t = (t, n' + t, 2n' + t, \dots, (p-1)n' + t). \]
This cycle cyclically permutes the numbers in A_0, A_1, . . . , A_{p−1} that are ≡ t mod n′. The
σ_t’s for different t are disjoint, so they commute. Set σ = σ_1σ_2 · · · σ_{n′}. Then σ has order p
as a permutation of {1, 2, . . . , n} (fixing all numbers above pn′).
Let X be the set of m-element subsets of {1, 2, . . . , n}, so #X = $\binom{n}{m}$. Let the group ⟨σ⟩
act on X. Since σ has order p, Theorem 4.1 tells us
\[ \#X \equiv \#\{\text{fixed points}\} \bmod p. \]
The left side is $\binom{n}{m}$. We will show the right side is $\binom{a_0}{b_0}\binom{n'}{m'}$.
When is an m-element subset M ⊂ {1, 2, . . . , n} fixed by σ? If M contains any number
from 1 to pn′ then σ-invariance implies M contains a number in the range from 1 to n′, i.e.,
M ∩ A_0 ≠ ∅. Let M contain q numbers in A_0. Then M is the union of these numbers and
their translates into each of the p sets A_0, . . . , A_{p−1}, along with some set of numbers from
pn′ + 1 to pn′ + a0, say ℓ of those. Then #M = pq + ℓ. Since M has size m = pm′ + b0, we
have b0 ≡ ℓ mod p. Both b0 and ℓ lie in [0, p − 1], so ℓ = b0. Thus q = m′.
Picking a fixed point in X under σ is thus the same as picking m′ numbers from 1 to n′
and then picking b0 numbers from pn′ + 1 to pn′ + a0. Therefore the number of fixed points
is $\binom{n'}{m'}\binom{a_0}{b_0}$, even in the case when a0 < b0 (in which case there are 0 fixed points, consistent
with $\binom{a_0}{b_0}$ = 0 in this case).
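Lucas' congruence lends itself to a direct numerical check. The following sketch computes the product of digit-wise binomial coefficients mod p by peeling off base-p digits, mirroring the one-digit form of the congruence used in the proof, and compares it with the binomial coefficient itself. The function name `lucas` and the tested ranges are choices made here for illustration.

```python
from math import comb

def lucas(n, m, p):
    """Product of binom(a_i, b_i) mod p over the base-p digits of n and m."""
    result = 1
    while n or m:
        n, a = divmod(n, p)   # a is the lowest base-p digit of n
        m, b = divmod(m, p)   # b is the lowest base-p digit of m
        result = result * comb(a, b) % p   # comb(a, b) is 0 when a < b
    return result

# Worked case: 10 = (1, 0, 1) and 3 = (0, 1, 0) in base 3, lowest digit first.
print(comb(10, 3) % 3, lucas(10, 3, 3))

agrees = all(comb(n, m) % p == lucas(n, m, p)
             for p in (2, 3, 5, 7)
             for n in range(60)
             for m in range(n + 1))
print(agrees)
```

Each pass through the loop is one application of the congruence with n = pn′ + a0 and m = pm′ + b0.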
References
[1] B. Fein, W. M. Kantor, M. Schacher, Relative Brauer groups II, J. Reine Angew. Math. 328 (1981),
39–57.
[2] I. M. Isaacs and M. R. Pournaki, “Generalizations of Fermat’s Little Theorem Using Group Theory,”
Amer. Math. Monthly 112 (2005), 734–740.
[3] Wikipedia, Burnside’s lemma, http://en.wikipedia.org/wiki/Burnside%27s_lemma.