## Not That Good, Will Hunting

Many of you have probably seen or heard of the movie Good Will Hunting, starring eminent actors Stellan Skarsgård, Matt Damon and Robin Williams.
I am not going to give a long film review though I will briefly need to set context for the problem.

Will Hunting (played by Matt Damon) is a janitor at MIT who supposedly sits on extreme mathematical talent. One day Will Hunting solves a graph theory problem on a board in the hall posed by Professor Gerald Lambeau (Stellan Skarsgård) as a serious challenge to his students after somebody mysteriously (Will Hunting of course) had solved the initial problem set by the professor.
The professor announces to the class that it recently required him and his colleagues over 2 years to prove the result, whereas Will Hunting seemingly writes down the answer in a few minutes at the end of his working shift.

Of course Will Huntings mathematical ability is blown out of proportion in the movie.
The actual problem itself is hardly a complicated research problem these days and was stated as follows:

Problem (Will Hunting – Irreducible Trees):

How many (non-isomorphic) homeomorphically irreducible trees are there on 10 vertices?

The graph theory lingo used in the statement probably requires some explanation.
First of all if a vertex $v$ is incident to $d$ edges, we say that the degree of $v$ is $d$.

A tree is a special kind of graph which is connected and has no cycles.

A tree is homeomorphically irreducible if it has no vertex of degree 2.

We are looking for non-isomorphic instances of homeomorphically irreducible trees.
That is, trees which are essentially different in the sense that there is no bijective mapping $f:V_1 \rightarrow V_2$ between the vertices of the two graphs that preserve the edge structure i.e $(v,w)$ is an edge in the first graph iff $(f(v),f(w))$ is an edge in the second graph.
So in plain language two trees are non-isomorphic if one cannot get from one tree to the other by rotating or moving edges around in such a way that the same vertices remain adjacent.

Solution:

The answer is 10.

Although Will identifies 8 of the 10 trees before he is chased away, he doesn’t actually answer the real question which is why the trees he found are non-isomorphic?

In fact the more general question of how many homeomorphically irreducible trees there are on n vertices has been answered in terms of the coefficients of a generating function, in a paper entitled “The number of homeomorphically irreducible trees, and other species” by Harary and Prins published in Acta Mathematica 1959.

The first few terms of that function is given by $x + x^2 + x^4 + x^5 + 2x^6 + 2x^7 + 4x^8 + 5x^9 + 10x^{10} + 14x^{11} + 26x^{12} + O(x^{13})$.

I was thinking instead of proving the statement directly using a more elementary case argument on the vertex degrees. 10 vertices is small enough to make this feasible by hand.

Any tree on $n$ vertices has $n-1$ edges (Exercise 1).
Thus there are $9$ edges in a tree of $10$ vertices, and since each edge by definition is incident to two vertices the total sum of all degrees must be $9.2 = 18$ (The general result is known as the Handshaking Lemma).
Let $d_i$ denote the degree of vertex $v_i$ for $i = 1,...,10$.
W.l.o.g assume $d_1 \geq d_2 \geq ... \geq d_{10}$.
This is btw called a degree sequence.
Since a tree is connected clearly $d_i \geq 1$ and so $d_i \leq 9 \;\; \forall i$.

What we need is a systematic way of distinguishing non-isomorphic trees from each other.
Degree sequences will lead us in the right direction, for if two graphs have different degree sequences then they cannot be isomorphic i.e degree sequence is an isomorphic graph invariant (Exercise 2).
Note that the converse is not true i.e same degree sequence on two graphs does not imply isomorphism (we will see examples below).

Fortunately there are other graph invariants we can throw at the tree until we are able to fully distinguish it from the rest. As one would expect isomorphisms preserve distance between nodes in a tree, this is an immediate consequence of the definition of isomorphism (see Exercise 3). In particular since isomorphisms preserve the degree of a vertex, the distances between vertices of the same degree must remain unchanged in the isomorphic graph (we will use this fact later).

Let’s now find the trees by systematically enumerating the degree sequences using the information we have:

• $d_1 = 9$
• The only possible degree sequence is $9 \geq 1 \geq 1 ... \geq 1$ since $d_i \geq 1\;\; \forall i$.
There is only one tree up to isomorphism with this degree sequence.
See tree below:

• $d_1 = 8$
• Clearly this case yields no homeomorphically irreducible trees since again, given that that $d_i \geq 1\;\; \forall i$ the only possible degree sequence is $8 \geq 2 \geq 1 ... \geq 1$.
This means there must be a vertex of degree $2$ which is forbidden.

• $d_1 = 7$
• The possible degree sequences are:
$7 \geq 3 \geq 1 \geq ... \geq 1$
$7 \geq 2 \geq 2 \geq 1 \geq ... \geq 1$
Only the the first leads to a homeomorphically irreducible tree, and there is clearly only one tree with this degree sequence up to isomorphism.
See tree below:

• $d_1 = 6$
• The possible degree sequences are:
$6 \geq 4 \geq 1 ... \geq 1$
$6 \geq 3 \geq 2 \geq 1 \geq ... \geq 1$
$6 \geq 2 \geq 2 \geq 2 \geq 1 \geq ... \geq 1$
Only the the first leads to a homeomorphically irreducible tree, and there is clearly only one tree with that degree sequence up to isomorphism.
See tree below:

• $d_1 = 5$
• The possible degree sequences are:
$5 \geq 5 \geq 1 ... \geq 1$
$5 \geq 4 \geq 2 \geq 1 \geq ... \geq 1$
$5 \geq 3 \geq 3 \geq 1 \geq 1 \geq ... \geq 1$
$5 \geq 3 \geq 2 \geq 2 \geq 1 \geq ... \geq 1$
$5 \geq 2 \geq 2 \geq 2 \geq 2 \geq 1 \geq ... \geq 1$

Only the first and third degree sequences are free from a degree $2$ vertex.
The first degree sequence clearly leads to one tree up to isomorphism:

For the third degree sequence there are five possibilities since there are $3$ non-leaf vertices two of which have the same degree:
(1) $v_2, v_3$ are children of $v_1$
(2) $v_2$ is a child of $v_1$ and a parent of $v_3$
(3) $v_1, v_2$ are children of $v_3$
(4) $v_1$ is a child of $v_3$ and a parent of $v_2$ or vice versa (5)

Inspecting these cases we see that we get two non-isomorphic trees:
(4) can be moved around to match (1) showing they are isomorphic (how?)
(3) and (5) can be moved around to match (2) showing they are isomorphic (how?)
(1) and (2) are not isomorphic since the distance between the degree $3$ vertices is $2$ in (1) and $1$ in (2).
See trees below:

• $d_1 = 4$
• The possible degree sequences are:
$4 \geq 4 \geq 3 \geq 1 \geq ... \geq 1$
$4 \geq 4 \geq 2 \geq 2 \geq 1 \geq... \geq 1$
$4 \geq 3 \geq 3 \geq 2 \geq 1 \geq ... \geq 1$
$4 \geq 2 \geq 2 \geq 2 \geq 2 \geq 2 \geq 1 \geq ... \geq 1$

Only the first degree sequence leads to a homeomorphically irreducible tree.
We have five possibilities to consider for this degree sequence since there are $3$ non-leaf vertices, two of which have the same degree:
(1) $v_2, v_3$ are children of $v_1$
(2) $v_2$ is a child of $v_1$ and a parent of $v_3$ or vice versa (3).
(4) $v_1, v_2$ are children of $v_3$
(5) $v_1$ is a child of $v_3$ and a parent of $v_2$ (the reverse case is equivalent since $v_1$ and $v_2$ have same degree)

Inspecting these cases we see that we get two non-isomorphic trees:
(2) and (5) can be moved around to match (1) showing all three are isomorphic.
(4) can be moved around to match (3) showing they are isomorphic.
(2) is not isomorphic to (3) because the distance between the degree $4$ nodes in in (2) is $1$ whereas the distance between the degree $4$ nodes in (3) is $2$.
See trees below:

• $d_1 = 3$
• The possible degree sequences are:
$3 \geq 3 \geq 3 \geq 3 \geq 1 \geq ... \geq 1$
$3 \geq 3 \geq 3 \geq 2 \geq 2 \geq 1 \geq... \geq 1$
$3 \geq 3 \geq 2 \geq 2 \geq 2 \geq 2 \geq 1 \geq ... \geq 1$
$3 \geq 2 \geq 2 \geq 2 \geq 2 \geq 2 \geq 2 \geq 1 \geq ... \geq 1$
Only the first degree sequence leads to a homeomorphically irreducible tree.
There are $4$ different cases to consider for this degree sequence given all $4$ non-leaves have degree $3$:
(1) $v_2,v_3,v_4$ are children of $v_1$
(2) $v_2$ is a child of $v_1$ and $v_3,v_4$ are children of $v_2$
(3) $v_2,v_3$ is a child of $v_1$ and $v_4$ is a child of $v_2$
(4) $v_2$ is a child of $v_1$ and $v_3$ a child of $v_2$ and $v_4$ a child of $v_3$

Inspecting these cases we see that we get two non-isomorphic trees:
(1) can be moved around to match (2) showing they are isomorphic.
(3) can be moved around to match (4) showing they are isomorphic.
(1) and (3) are not isomorphic since (3) has a path of length $5$ whereas (1) does not (isomorphisms are supposed to preserve distances).
See trees below:

• $d_1 = 2$
• This case is degenerate by default since there has to be a vertex of degree $2$.

• $d_1 = 1$
• This case too is degenerate since there are no trees on $10$ vertices where the maximum degree of a vertex is $1$.

Hence we have exactly $1+0+1+1+3+2+2+0+0 = 10$ homeomorphically irreducible trees on $10$ vertices up to isomorphism.

There were two more problems featuring in this film.
Perhaps we will come back and complete the rest of Will Huntings homework in a future post.
The problems are not terribly interesting, they’ve just received some popular fame because they appeared in the movie and turned out not to be as difficult as they were portrayed (don’t get fooled by Hollywood in other words).
Nevertheless this is a great movie and it is fantastic to see that they used “actual” mathematics instead of dwelling over say, Fibonacci numbers or the number $\pi$
(contrary to popular belief mathematicians don’t spend their careers looking for Fibonacci patterns in nature or memorizing decimals of $\pi$… ok, perhaps some do).

Below are the graph theory exercises I have been referring to:

Exercise 1 (Tree Edges – Easy):

Prove that any tree on $n$ vertices has $n-1$ edges.

(Hint: Remove a leaf and argue by induction)

Exercise 2 (Degree Sequence – Easy):

Prove that isomorphic graphs have the same degree sequence.

Exercise 3 (Node Distance – Easy):

i) Prove that there is a unique path between any two vertices in a tree.

The length of the path between two vertices $v_1, v_2$ is called the distance between $v_1$ and $v_2$.

ii) Prove that the distance between any two vertices in a tree is preserved under isomorphism.