The order of elements in a free semigroup

Let X be a set and let S = \mathsf{F}(X) denote the free semigroup on X. Find the orders of elements in S. Does S have a kernel?

To answer this question the Right Way, we really need to have a more rigorous definition for \mathsf{F}(X). To accomplish this, we will introduce ‘lists’. The next several paragraphs borrow heavily from the book “Set Theory” by Monk and from the paper “Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire” by Meijer, Fokkinga, and Paterson (primarily from the latter, with a bit of influence from the former). Both of those publications are very good.

This post was originally written for an experimental blog I started writing a while ago but quickly abandoned, so the style is a bit different.

One of the most basic “data structures” we use is the list; that is, a finite sequence of elements from some set. In this post we will give a precise definition for lists and prove some basic facts about them. The definition of list that we will use might seem strange, but it lends itself well to recursive proofs and to implementation in software. This will allow us to implement our proofs in a computer program, reasonably confident that the program does as we expect.

Definition: Let A be a set. We define a function L on \mathbb{N} recursively as follows: L_A(0) = \{\emptyset\} and L_A(k+1) = A \times L_A(k). Then we define the set of lists in A to be the set \mathcal{L}(A) = \bigcup_{k \in \mathbb{N}} L_A(k).

Note that for all sets A, \emptyset \in \mathcal{L}(A). We will call this the empty list and denote it by \varnothing. By definition, every element of \mathcal{L}(A) is either \varnothing or of the form (a,\ell) for some a \in A and \ell \in \mathcal{L}(A).

Theorem (Basic facts about \mathcal{L}(A)):

  1. L_A(k) \neq \emptyset for all nonzero k \in \mathbb{N} if and only if A \neq \emptyset.
  2. If A is nonempty and \ell \in \mathcal{L}(A), then there exists a unique k \in \mathbb{N} such that \ell \in L_A(k). We call this unique k the length of \ell and denote it \mathsf{len}(\ell).
  3. If \ell \in \mathcal{L}(A) with \mathsf{len}(\ell) = k+1, then \ell = (a,\ell^\prime) for some a \in A and \ell^\prime \in \mathcal{L}(A) with \mathsf{len}(\ell^\prime) = k.


  1. (\Rightarrow) Since in particular L_A(1) = A \times \{\varnothing\} \neq \emptyset, we have A \neq \emptyset. (\Leftarrow) Suppose to the contrary, and choose k minimal such that L_A(k+1) is empty. Then A \times L_A(k) is empty, so that L_A(k) is empty, a contradiction.
  2. Let S \subseteq \mathcal{L}(A) be the set of all lists which are not contained in a unique L_A(k). Suppose S is not empty. Note that each \ell \in S is contained in a smallest L_A(k), say L_A(k_\ell). Now let k \in \mathbb{N} be the unique least element among the k_\ell, and let \ell be in L_A(k). Then we have \ell \in L_A(n) for some n > k. If k = 0, then we have \ell = \varnothing since L_A(0) = \{\varnothing\}. On the other hand, we have n = m+1 for some m, and \ell \in A \times L_A(m). This is a contradiction because no element of A \times L_A(m) is the empty set. Suppose now that k \neq 0; then we have k = t+1 and n = m+1. Now \ell = (a,\ell_1), where \ell_1 \in L_A(t) \cap L_A(m). This is a contradiction since k was chosen to be minimal.
  3. This follows from the definition of L_A(k+1). \square

Note that by definition, every list has finite length. Elements of \mathcal{L}(A) are ordered pairs which may be nested on the right; for instance, (1,\varnothing), (2,(1,\varnothing)), and (4,(2,(1,\varnothing))) are lists. To cut down on the number of parentheses we have to write, we will frequently use bracket notation as a shorthand. For instance, [1] means (1,\varnothing) and [4,2,1] means (4,(2,(1,\varnothing))).

One of the more useful things we can do with a list is deconstruct it. Functions that do this in a certain well-behaved way are called catamorphisms, and they have a nice uniqueness property.

Theorem (List catamorphisms): Let A and B be sets with A nonempty, let e \in B, and let \oplus : A \times B \rightarrow B be a function. Then there exists a unique function h : \mathcal{L}(A) \rightarrow B such that h(\varnothing) = e and h(a,\ell) = a \oplus h(\ell). We will denote this function by (\!|\oplus,e|\!), and call functions of this kind list catamorphisms.

(Existence) Define a function h_k : L_A(k) \rightarrow B recursively as follows: h_0(\varnothing) = e and h_{k+1}(a,\ell) = a \oplus h_k(\ell). Certainly then h = \bigcup_{k \in \mathbb{N}} h_k is a function from \mathcal{L}(A) to B. Moreover, we have h(\varnothing) = h_0(\varnothing) = e, and if (a,\ell) \in L_A(k+1), then h(a,\ell) = h_{k+1}(a,\ell) = a \oplus h_k(\ell) = a \oplus h(\ell). (Uniqueness) Suppose \theta : \mathcal{L}(A) \rightarrow B is a function such that \theta(\varnothing) = e and \theta(a,\ell) = a \oplus \theta(\ell). We prove that \theta = h by induction on the length. Certainly \theta(\varnothing) = e = h(\varnothing). Now suppose that for all lists \ell of length k we have \theta(\ell) = h(\ell), and let \tau = (a,\ell) have length k+1. Then \theta(a,\ell) = a \oplus \theta(\ell) = a \oplus h(\ell) = h(a,\ell). Thus h is unique. \square

A catamorphism takes a binary operator and an element of B, and returns a function which takes lists over A and deconstructs them into a single element of B using the operator. More literally, a catamorphism takes a list an replaces all commas with the operator \oplus and replaces the empty list with the fixed element e. This may seem strange at first, but as we will see, several useful functions which take lists as input can be written as a catamorphism or a chain of catamorphisms composed together. One advantage of this is that certain functional transformation rules can be used to rewrite a given function to be “more efficient”, whatever that means. Two basic rules are called the Fusion law and the Split law; roughtly speaking, the Fusion law says when the composition of a function and a catamorphism is a catamorphism, and the Split law says that catamorphisms interact with direct products in a nice way.


  1. (Fusion law) Let A, B, and C be nonempty sets, let e_B \in B and e_C \in C, and let \oplus : A \times B \rightarrow B and \otimes : A \times C \rightarrow C. Finally, let f : B \rightarrow C be a function. Then f \circ (\!|\oplus,e_B|\!) = (\!|\otimes,e_C|\!) if and only if f(e_B) = e_C and for all a \in A and b \in B such that b \in \mathsf{im}\ (\!|\oplus,e_B|\!), f(a \oplus b) = a \otimes f(b).
  2. (Split law) Let A, B, and C be nonempty sets, let b \in B and c \in C, and let \oplus : A \times B \rightarrow B and \otimes : A \times C \rightarrow C. Define \star : A \times (B \times C) \rightarrow B \times C by a \star (b,c) = (a \oplus b, a \otimes c), and let \pi_1 and \pi_2 denote the coordinate projections. Then \pi_1 \circ (\!|\star,(b,c)|\!) = (\!|\oplus,b|\!) and \pi_2 \circ (\!|\star,(b,c)|\!) = (\!|\otimes,c|\!).


  1. (\Rightarrow) Note that c = (\!|\otimes,c|\!)(\varnothing) = (f \circ (\!|\oplus,b|\!))(\varnothing) = f((\!|\oplus,b|\!)(\varnothing) = f(b). Moreover, if a \in A and b = (\!|\oplus,e_B|\!)(\ell), then f(a \oplus b) = f(a \oplus (\!|\oplus,e_B|\!)(\ell)) = f((\!|\oplus,e_B|\!)(a,\ell)) = (f \circ (\!|\oplus,e_B|\!))(a,\ell) = (\!|\otimes,e_C|\!)(a,\ell) = a \otimes (\!|\otimes,e_C|\!)(\ell) = a \otimes (f \circ (\!|\oplus,e_B|\!))(\ell) = a \otimes f((\!|\oplus,e_B|\!)(\ell)) = a \otimes f(b). (\Leftarrow) Note that (f \circ (\!|\oplus,e_B|\!))(\varnothing) = f((\!|\oplus,e_B|\!)(\varnothing)) = f(e_B) = e_C and that (f \circ (\!|\oplus,e_B|\!))(a,\ell) = f((\!|\oplus,e_B|\!)(a,\ell)) = f(a \oplus (\!|\oplus,e_B|\!)(\ell)) = a \otimes f((\!|\oplus,e_B|\!)(\ell)) = a \otimes (f \circ (\!|\oplus,e_B|\!))(\ell). By the uniqueness of catamorphisms, f \circ (\!|\oplus,e_B|\!) = (\!|\otimes,e_C|\!).
  2. We proceed by induction on the length of the input \ell. If \ell = \varnothing, then (\!|\star,(e_B,e_C)|\!)(\varnothing) = (e_B,e_C) = ((\!|\oplus,e_B|\!), (\!|\otimes,e_C|\!)). Suppose now that the result holds for all lists of length k, and suppose \ell = (a, \ell^\prime) has length k+1. Then (\!|\star,(e_B,e_C)|\!)(a,\ell^\prime) = a \star (\!|\star,(e_B,e_C)|\!)(\ell^\prime) = a \star ((\!|\oplus,e_B|\!)(\ell^\prime), (\!|\otimes,e_C|\!)(\ell^\prime)) = (a \star (\!|\oplus,e_B|\!)(\ell^\prime), a \star (\!|\otimes,e_C|\!)(\ell^\prime)) = ((\!|\oplus,e_B|\!)(\ell^\prime), (\!|\otimes,e_C|\!)(\ell^\prime)). \square

The fusion and split laws for catamorphisms will be useful both for proving theorems about functions defined in terms of catamorphisms (as we will see) and for developing efficient programs for performing explicit computations.

Definition: (List Concatenation) Let A be a nonempty set. We define a binary operator \ddagger : \mathcal{L}(A) \times \mathcal{L}(A) \rightarrow \mathcal{L}(A) recursively as follows: \varnothing \ddagger \ell = \ell and (a,\ell^\prime) \ddagger \ell = (a, \ell^\prime \ddagger \ell). This oeprator is called concatenation.

Lemma: \ddagger is associative. Proof: Let (\alpha,\beta,\gamma) be a triple of lists in \mathcal{L}(A). We proceed by induction on the length of \alpha. For the base case, we have (\varnothing \ddagger \beta) \ddagger \gamma = \beta \ddagger \gamma = \varnothing \ddagger (\beta \ddagger \gamma). For the inductive step, suppose \ddagger is associative on all triples (\alpha^\prime,\beta,\gamma) where \alpha^\prime has length k, and that \alpha has length k+1. Now ((a,\alpha^\prime) \ddagger \beta) \ddagger \gamma = (a, \alpha^\prime \ddagger \beta) \ddagger \gamma = (a, (\alpha^\prime \ddagger \beta) \ddagger \gamma) = (a, \alpha^\prime \ddagger (\beta \ddagger \gamma)) = (a,\alpha^\prime) \ddagger (\beta \ddagger \gamma). So \ddagger is associative. \square

Lemma: \varnothing \ddagger \alpha = \alpha \ddagger \varnothing = \alpha for all lists \alpha. Proof: \varnothing \ddagger \alpha = \alpha by definition. To see the second equality, we use induction on the length of \alpha. Certainly \varnothing \ddagger \varnothing = \varnothing. If the result holds for \alpha^\prime, then (a,\alpha^\prime) \ddagger \varnothing = (a,\alpha^\prime \ddagger \varnothing) = (a,\alpha^\prime) as desired. \square

So if X is nonempty, the set \mathcal {L}(X) with the operator \ddagger is a semigroup, and in fact is a monoid with identity element \varnothing. We define the free monoid on X to be \mathsf{F}_1(X) = (\mathcal{L}(X), \ddagger).

Lemma: \mathsf{len}(\alpha \ddagger \beta) = \mathsf{len}(\alpha) + \mathsf{len}(\beta). Proof: By induction on the length of \alpha. Certainly \mathsf{len}(\varnothing \ddagger \beta) = \mathsf{len}(\beta) = \mathsf{len}(\varnothing) + \mathsf{len}(\beta). If the result holds for \alpha^\prime, then \mathsf{len}((a,\alpha^\prime) \ddagger \beta) = \mathsf{len}((a, \alpha^\prime \ddagger \beta)) = 1 + \mathsf{len}(\alpha^\prime \ddagger \beta) = 1 + \mathsf{len}(\alpha^\prime) + \mathsf{len}(\beta) = \mathsf{len}((a,\alpha^\prime)) + \mathsf{len}(\beta). \square

Lemma: \mathsf{F}_0(X) has no zero divisors. Proof: If \alpha \ddagger \beta = \varnothing, then comparing lengths, \alpha = \beta = \varnothing. \square

Since \mathsf{F}_0(X) has no zero divisors, the set \mathsf{F}(X) = \mathsf{F}_0(X) \setminus \{\varnothing\} is a subsemigroup. We call \mathsf{F}(X) the free semigroup on X.

We are finally in a position to answer the original problem: what are the orders of elements in \mathsf{F}(X)? Since every element of \mathsf{F}(X) has positive length, and since length is additive over products, the powers of any element \alpha \in \mathsf{F}(X) have monotonically increasing length. That is, the subsemigroup generated by \alpha is infinite. So every element of \mathsf{F}(X) has infinite order.

Next we can consider the kernel of \mathsf{F}(X). Note that for any word \alpha of length k, every element of the two sided ideal J(\alpha) has length at least k. As a consequence, for any element \alpha \in \mathsf{F}(X) of length k, we can construct an ideal in which every element has length at least k+1– and so the ideal does not include \alpha. In particular, the intersection of all the ideals in \mathsf{F}(X) is empty. So this semigroup has no kernel.

Post a comment or leave a trackback: Trackback URL.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: