Perhaps the biggest difference between mathematics at school and at university is the shift towards reasoning in more abstract and rigorous ways. As a student of pure mathematics you will, for instance, spend a great deal of your first-year real analysis course untangling definitions and proving facts you may have taken for granted for years – in a sense re-learning everything from scratch in a more formal setting.
I myself got to spend my first lecture in real analysis seeing proofs of statements like
- (i) $1 > 0$
- (ii) $(-1)\cdot(-1) = 1$
from the axioms of the real number system, and a few lectures later proving, in effect, that a continuous function taking positive and negative values must pass through the x-axis (the Intermediate Value Theorem).
I recall that the main difficulty faced at the very beginning by some people (including myself) came down to a lack of training in constructing and appreciating formal arguments – the mechanics of it. Although some universities have introductory courses to pure mathematics, not all do, and they are usually very short (mine was one week).
Therefore I wanted to write a few lines about what most schools and some universities may not explicitly cover, namely the common high-level methods for proving mathematical statements.
I will also remark before continuing that this is not a discussion about proofs in the formal sense of proof theory (a branch of mathematical logic). This is mainly a discussion about proofs in everyday mathematical practice.
Below, methods will be outlined that are worth keeping in mind when facing mathematical statements of a certain form.
An implication consists of a hypothesis $P$ and a conclusion $Q$, written $P \Rightarrow Q$, meaning “If $P$ then $Q$”.
An example of an implication is: “If $x^3 - x^2 + x > 0$ then $x > 0$”.
The truth table of the implication statement $P \Rightarrow Q$ is defined as follows:

| $P$ | $Q$ | $P \Rightarrow Q$ |
| --- | --- | --- |
| T | T | T |
| T | F | F |
| F | T | T |
| F | F | T |
From the truth table you can see that if you are able to show that the conclusion $Q$ is true, then $P \Rightarrow Q$ is always true regardless of the truth value of the hypothesis $P$.
Consider the following statement: “If I live in London then I live in the universe”.
Since the conclusion is always true it does not matter where I live, the implication is still true.
Note that the notion of a trivial proof usually means something different from the above.
If a mathematician refers to a proof as trivial they usually mean that it is “obvious”/”easy to prove” (in relation to the assumed knowledge of the audience).
Though you should beware of people who do this too much – it is really not a substitute for a proof if it means you can’t reconstruct it.
From the truth table you can also see that if you are able to show that the hypothesis $P$ is false, then $P \Rightarrow Q$ is always true regardless of the truth value of the conclusion $Q$.
This has the somewhat surprising meaning that a false statement can imply anything!
Consider the following statement: “If pigs fly then you are president of the USA”.
Pigs don’t fly, so the hypothesis is false and the implication is true.
But clearly you are not president of the USA (unless you are Obama reading this, in which case you should stop procrastinating and get back to work!), so how does this make sense?
The rationale behind the table is that an implication is defined to be true until proven false.
Who knows what happens if pigs do fly, maybe you will then be president?
We cannot know until there is a flying pig, and since we can’t say that the implication is false we must say it is true.
(DISCLAIMER: please don’t try throwing pigs out the window hoping they will fly, I am not responsible).
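The truth table above is easy to play with in code. A minimal Python sketch (the helper name `implies` is my own) encodes the standard definition of material implication, “not $P$, or $Q$”:

```python
from itertools import product

# Material implication: P => Q is by definition equivalent to (not P) or Q.
def implies(p: bool, q: bool) -> bool:
    return (not p) or q

# Print the full truth table.
for p, q in product([True, False], repeat=2):
    print(f"P={p!s:<5} Q={q!s:<5} P=>Q={implies(p, q)}")

# If the conclusion Q is true, the implication holds regardless of P...
assert implies(True, True) and implies(False, True)
# ...and if the hypothesis P is false, it holds regardless of Q ("pigs fly").
assert implies(False, True) and implies(False, False)
```

The only row that makes the implication false is a true hypothesis with a false conclusion.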
This is perhaps the most familiar method.
To show that the implication $P \Rightarrow Q$ is true you assume $P$ is true and deduce (by some logical process) that $Q$ is true.
Here is how you can prove $x^3 - x^2 + x > 0 \Rightarrow x > 0$ using the direct method:
Suppose $x^3 - x^2 + x > 0$.
Then $x(x^2 - x + 1) > 0$ using the hypothesis.
Therefore $x^2 - x + 1 = (x - \frac{1}{2})^2 + \frac{3}{4} > 0$ and so $x > 0$.
Hence the implication is proved.
To show that $P \Rightarrow Q$ you can show the logically equivalent contrapositive statement $\neg Q \Rightarrow \neg P$,
i.e. “if $P$ then $Q$” is equivalent to “if not $Q$ then not $P$”.
You may check that both implications have the same truth table, confirming their equivalence.
So to prove the contrapositive of $P \Rightarrow Q$ we assume $Q$ is false and deduce that $P$ must be false.
Here is how you can prove $x^3 - x^2 + x > 0 \Rightarrow x > 0$ via the contrapositive:
Suppose $x \leq 0$.
Then $(x - \frac{1}{2})^2 + \frac{3}{4} > 0$ so $x\left((x - \frac{1}{2})^2 + \frac{3}{4}\right) \leq 0$.
Expanding we get $x^3 - x^2 + x \leq 0$ and therefore the hypothesis $x^3 - x^2 + x > 0$ is false.
Hence the contrapositive implication has been proved and so the implication is proved.
In a proof by contradiction you assume the hypothesis $P$ to be true AND assume the conclusion $Q$ to be false, and show that this leads to something absurd, i.e. a logical contradiction.
This logical contradiction could be a violation of the hypothesis (which is assumed to be true) or a contradiction of some other true statement you establish throughout the proof.
Here is how you can prove $x^3 - x^2 + x > 0 \Rightarrow x > 0$ via proof by contradiction:
Suppose $x^3 - x^2 + x > 0$ and $x \leq 0$.
Note $x \neq 0$, since otherwise the hypothesis would read $0 > 0$. Dividing by $x$ (which is negative) reverses the inequality, so then $x^2 - x + 1 < 0$.
By completing the square we get $(x - \frac{1}{2})^2 + \frac{3}{4} < 0$.
But this is a contradiction since the left hand side is strictly positive.
Since $x \leq 0$ doesn’t make sense under the true assumption of the hypothesis we deduce $x > 0$.
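Before (not instead of!) proving an implication, a quick numeric spot-check can build confidence in it. Below I use the classic textbook implication $x^3 - x^2 + x > 0 \Rightarrow x > 0$ as the concrete example; treat the choice of example as my own assumption:

```python
import random

# Sample many x and verify the implication never produces a counterexample:
# whenever the hypothesis holds, the conclusion must hold too.
random.seed(0)
for _ in range(100_000):
    x = random.uniform(-100.0, 100.0)
    if x**3 - x**2 + x > 0:   # hypothesis P
        assert x > 0          # conclusion Q
print("no counterexample found")
```

Of course a finite search proves nothing about all real numbers; the three proof methods above are what actually settle the statement.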
Existence statements require you to show the existence of some mathematical object $x$ in a set $S$ satisfying a certain statement $P(x)$. Often such existence statements are combined with implications to say that under certain hypotheses we can assert the existence of an object satisfying a given statement.
A mathematical object can be anything from a number, matrix or graph to circles, spheres and manifolds in geometry to groups, rings and modules in algebra.
There are essentially two categories of existence proofs, constructive and non-constructive.
I will also mention a third which really belongs to one of the two above but is worth mentioning separately.
The constructive method is based upon building a mathematical object directly by means of supplying an algorithm or procedure for the construction of the object.
Such proofs can give a wealth of insight and are therefore often preferred over all other methods.
However constructive proofs are not always easy to find.
Safe to say being destructive (or rather non-constructive) is many times easier than being constructive – true in life, true in mathematics.
Example:
– Prove that there exists a real root to $x^3 + x = 0$.
$x^3 + x = x(x^2 + 1)$, so $x = 0$ gives $0 \cdot (0 + 1) = 0$ and hence we can exhibit $x = 0$ as a real root.
Here I have proven that $x^3 + x = 0$ has a real root by explicitly finding one through trivial factorization, so the proof may be considered constructive.
A somewhat less trivial example is given by the Euclidean Algorithm. Not only does it show that the highest common factor of two integers exists, it also tells you how to find it.
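As an illustration, the algorithm is a few lines of Python (the function name `hcf` is my own):

```python
def hcf(a: int, b: int) -> int:
    """Highest common factor via the Euclidean Algorithm:
    repeatedly replace (a, b) by (b, a mod b) until the remainder is 0."""
    a, b = abs(a), abs(b)
    while b:
        a, b = b, a % b
    return a

print(hcf(252, 198))  # -> 18
```

The proof that this terminates with the highest common factor is itself constructive: each step strictly decreases the second argument while preserving the set of common divisors.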
A non-constructive proof usually means showing that something cannot not exist, and therefore must exist.
This argument crucially relies on a philosophical principle dating back to Aristotle called “Law of the excluded middle” which is the third of the three classical Laws of Thought.
The law says that everything which has a truth value is either True or False; there is no middle ground.
Therefore if we can show that something cannot not exist (by means of contradiction) then the only alternative is that it must exist by the law of the excluded middle.
The law basically implies we are allowed to eliminate double negations.
Both proof by contraposition and proof by contradiction (see above) depend on this law for their validity.
One school of philosophy known as ‘Constructivism’ rejects the law of the excluded middle and therefore rejects this method altogether and all proofs that follow from it (imagine how devastating for mathematics!).
Nevertheless this law is generally accepted by most people throughout mathematics.
The disadvantage with this method is that it doesn’t usually give much insight into how to find the object, it just tells you it must exist. However, on the plus side, it is often easier to argue this way (and sometimes it is the only possible argument).
A proof can also be non-constructive if it refers back to a theorem which guarantees existence but in turn gives no constructive answer.
Examples:
– Prove that there exists a real root to $x^3 + x = 0$.
I mentioned the Intermediate Value Theorem earlier, which implies that a continuous real-valued function taking positive and negative values must cross the x-axis at least once.
Now $p(x) = x^3 + x$, being a polynomial, is a continuous real-valued function (why?).
Moreover $p(-2) = -10 < 0$ and $p(2) = 10 > 0$.
Hence $p$ has a real root by the Intermediate Value Theorem.
In fact a very similar argument can be applied to show that any odd-degree polynomial always has at least one real root (how?).
This proof is non-constructive because I have not given a method for finding the real root, just referred to a theorem which tells us it has to exist somewhere (in fact somewhere between $-2$ and $2$).
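Interestingly, the same sign-change information that drives the IVT argument also drives a constructive numerical procedure: bisection. A sketch, using the polynomial $p(x) = x^3 + x$ from the example above (that choice of polynomial, and the function name, are my own assumptions):

```python
def bisect_root(f, lo, hi, tol=1e-12):
    """Locate a root of a continuous f on [lo, hi], given f(lo) < 0 < f(hi).
    Repeatedly halve the interval, keeping the half where the sign change
    (and hence, by the IVT, a root) must live."""
    assert f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = lambda x: x**3 + x
print(bisect_root(p, -2, 2))  # prints a value very close to the root 0
```

This only approximates the root to any desired tolerance, which is why analysts distinguish between knowing a root exists and being able to exhibit it exactly.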
– Prove that some digit in the decimal expansion of $\pi$ has to appear infinitely often.
We know $\pi$ is irrational so its decimal expansion is never-ending.
Suppose there didn’t exist a digit occurring infinitely often.
Then each of the digits $0, 1, \ldots, 9$ appears finitely many times, say at most $N$ times each.
But then there can be at most $10N$ decimals in the decimal expansion of $\pi$, which is a contradiction to its irrationality.
Hence there exists a digit in the decimal expansion of $\pi$ appearing infinitely often, but we don’t know which.
The Probabilistic Method proceeds by showing that an object with given properties must exist with positive probability. It can then be claimed with certainty that the object has to exist since there is at least one object in the sample space pushing the probability above zero.
The method is non-constructive.
A full example is probably slightly out of scope for this post, but perhaps among the simpler examples is a proof of a lower bound on Ramsey numbers of the form $R(k, k) > 2^{k/2}$ due to Paul Erdős.
You can read about it here.
‘For all’-statements assert that for all objects $x$ in a set $S$ some statement $P(x)$ is true about $x$.
In mathematical notation the phrases ‘for all’, ‘for every’ and ‘for each’ are all denoted by $\forall$.
Depending on the set $S$ there are different methods available.
Examples of $\forall$-statements are:
- $\forall n \in \mathbb{N}$: a perfect $n$-level binary tree has $2^n - 1$ nodes
In a direct $\forall$-proof you pick an arbitrary element $x \in S$ and show that $P(x)$ is true for $x$.
Since the element was arbitrary, the statement is true for all $x \in S$.
To give an example we can prove the below De Morgan law in this way:
– Prove that $(A \cup B)^c = A^c \cap B^c$ for all subsets $A, B$ of a set $X$
Let $A, B \subseteq X$ be arbitrary.
We need to show that $x \in (A \cup B)^c \Rightarrow x \in A^c \cap B^c$ and vice versa.
Let $x \in (A \cup B)^c$.
Then $x \notin A \cup B$ so $x \notin A$ and $x \notin B$, i.e. $x \in A^c$ and $x \in B^c$.
Therefore $x \in A^c \cap B^c$.
Conversely let $x \in A^c \cap B^c$.
Then $x \in A^c$ and $x \in B^c$, i.e. $x \notin A$ and $x \notin B$.
Thus $x \notin A \cup B$ and so $x \in (A \cup B)^c$.
Therefore $x \in (A \cup B)^c \Leftrightarrow x \in A^c \cap B^c$.
Hence $(A \cup B)^c = A^c \cap B^c$ since $x$ was arbitrary in both directions.
We deduce that the law holds for all subsets of $X$ since $A$ and $B$ were chosen as arbitrary subsets.
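For finite sets the De Morgan law is also easy to sanity-check with Python’s built-in set operations (the particular sets are my own example data):

```python
# Finite-set check of the De Morgan law (A ∪ B)^c = A^c ∩ B^c
# over a small universe X, computing complements as set differences.
X = set(range(10))
A = {1, 2, 3}
B = {3, 4, 5}

lhs = X - (A | B)          # complement of the union
rhs = (X - A) & (X - B)    # intersection of the complements
print(lhs == rhs)          # -> True
```

A check over one universe is not a proof, but the element-chasing argument above is exactly what makes the check succeed for every choice of $X$, $A$ and $B$.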
If the set is finite or can be divided into a number of finite cases then it may be feasible to check every case by hand or with the aid of a computer.
These arguments are usually used as a last resort when no other argument is available.
Alternatively, case analysis is used intentionally to structure a proof into more manageable and coherent pieces, for mathematical or aesthetic reasons.
The famous Four Colour Theorem was settled by exhausting a set of 1,936 cases using a computer.
Many other theorems throughout mathematics are based on proving the same statement using different arguments for special subsets of the set in question.
Induction is a method that can be applied to prove statements about sets which are well-ordered.
A well-ordering on a set is a total order with respect to which all non-empty subsets have a smallest element.
$\mathbb{N}$ (with $\leq$) is a prototypical example of a well-ordered set.
$\mathbb{Z}$ (with $\leq$) is not well-ordered, as seen by for example considering the subset of negative integers, which has no smallest element.
[A theorem which has historically sparked great controversy among mathematicians is the Well-Ordering Theorem, which says that any set admits a well-ordering, even $\mathbb{R}$ (with a different ordering than $\leq$ of course).
Also worth noting is that if you accept the Well-Ordering Theorem you also have to accept the Axiom of Choice, since they can be shown equivalent.
Accepting the Axiom of Choice leads to some unintuitive consequences like the Banach-Tarski Paradox.
Induction on sets with cardinality higher than that of $\mathbb{N}$ is usually referred to as Transfinite Induction.]
Throughout this post we will assume to be working over $\mathbb{N}$ when talking about induction.
Mathematical induction can be thought of as a formal replacement for a recursive argument ending with the phrase “and so on…”.
A proof by induction has essentially got two parts:
i) Prove that P(1) is true (base case)
ii) Assume P(n) is true and prove that P(n+1) is true (inductive step)
It is then easy to see that this is enough to show that P(n) is true for all n.
For example if someone specifies an arbitrary number k then you can argue:
P(1) is true, so P(2) is true with n=1.
P(2) is true, so P(3) is true with n=2.
P(3) is true, so P(4) is true with n=3.
.
.
P(k-1) is true so P(k) is true with n=k-1.
Here is an example of a proof by induction:
– Prove that $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ for all $n \in \mathbb{N}$
The base case is clear since $1 = \frac{1 \cdot 2}{2}$.
Now suppose for some $n$ that $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ (this is the inductive hypothesis).
We need to show the statement is true for $n + 1$.
$1 + 2 + \cdots + n + (n + 1) = \frac{n(n+1)}{2} + (n + 1) = \frac{(n+1)(n+2)}{2}$.
Hence the statement is true for all $n \in \mathbb{N}$ by induction.
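Induction proves the closed form $1 + 2 + \cdots + n = n(n+1)/2$ for every $n$ at once; a finite check like the following Python loop is no substitute, but it is a useful sanity test:

```python
# Check the closed form 1 + 2 + ... + n = n(n+1)/2 for many n.
# (A finite check is of course no substitute for the induction proof.)
for n in range(1, 1000):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
print("formula holds for n = 1..999")
```

If the inductive step were wrong, a loop like this would typically expose it at some small $n$.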
This is another flavour of induction which differs from the previous method in that the inductive hypothesis is “strengthened” to assume $P(k)$ holds true for all $k \leq n$.
Logically, however, strong induction is equivalent to weak induction, but the former may simplify the proof of the inductive step in certain cases.
Here is an example of where strong induction comes in handy:
– Prove that every integer $n \geq 2$ may be written as a product of prime numbers
(this is the existence part of the so-called Fundamental Theorem of Arithmetic)
The base case is clear since $n = 2$ is prime.
Suppose now that $n > 2$ and that every integer $2 \leq m < n$ may be written as a product of primes.
If $n$ is a prime then we are done.
If $n$ is not a prime then there exist $a, b$ with $1 < a, b < n$ such that $n = ab$.
Now $2 \leq a, b < n$ and so by the inductive hypothesis both $a$ and $b$ may be written as a product of primes.
Hence $n$ may be written as a product of primes (being the product of $a$ and $b$).
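The strong-induction proof translates directly into a recursive algorithm: test whether $n$ has a proper divisor and, if so, recurse on both parts (the function name is mine, and this is a sketch rather than an efficient factorizer):

```python
def prime_factors(n: int) -> list[int]:
    """Mirror of the strong-induction proof: if n is prime we are done,
    otherwise split n = a * b with 1 < a, b < n and recurse on both parts."""
    assert n >= 2
    for a in range(2, int(n ** 0.5) + 1):
        if n % a == 0:
            return prime_factors(a) + prime_factors(n // a)
    return [n]  # no proper divisor found: n is prime (the base case)

print(prime_factors(360))  # -> [2, 2, 2, 3, 3, 5]
```

The recursion terminates for the same reason the induction works: both $a$ and $n/a$ are strictly smaller than $n$.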
A proof by the method of minimal criminal is a blend between induction and an argument by contradiction.
The method has a slightly disturbing logic and works by assuming the existence of a minimal counterexample (the minimal criminal) and then proving there is an even smaller counterexample, hence leading to a contradiction.
Thus there can be no counterexample by well-ordering, and so the statement is proved for all elements of the set.
The method is also sometimes referred to as proof by Infinite Descent (an idea developed by Fermat).
Proof by minimal criminal is for example used in the classical argument to show that $\sqrt{2}$ is irrational:
Suppose $\sqrt{2}$ is not irrational, i.e. rational.
Then $\sqrt{2} = \frac{p}{q}$ with $p, q \in \mathbb{N}$ and $q$ minimal.
Squaring both sides yields $2q^2 = p^2$.
Therefore since $2$ divides the left hand side it must also divide the right hand side.
This means $2 \mid p$ and so $p = 2k$ for some $k \in \mathbb{N}$.
Thus $2q^2 = 4k^2$ so that $q^2 = 2k^2$, and by the same argument $q = 2m$ for some $m \in \mathbb{N}$.
But then $\sqrt{2} = \frac{2k}{2m} = \frac{k}{m}$ where $m < q$, contradicting the minimality of $q$.
Hence $\sqrt{2}$ cannot be rational.
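The descent shows that $p^2 = 2q^2$ has no solution in positive integers. An exhaustive finite search illustrates (but of course does not prove) this:

```python
# The irrationality proof shows p^2 = 2 q^2 has no positive integer solutions;
# search a finite box to illustrate the emptiness of the solution set.
solutions = [(p, q)
             for p in range(1, 1000)
             for q in range(1, 1000)
             if p * p == 2 * q * q]
print(solutions)  # -> []
```

Contrast this with the descent argument, which rules out all infinitely many candidates at once by always producing a smaller one.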
Forward-Backward Induction (sometimes just called Backward Induction) is a less commonly used technique but has uses in various niche circumstances.
The method works in 3 steps.
i) Prove that $P(1)$ is true (base case)
ii) Prove that $P(n) \Rightarrow P(f(n))$ holds true, where $\{f(n) : n \in \mathbb{N}\}$ is an infinite subset of $\mathbb{N}$ (forward step).
For example you can prove that $P(n) \Rightarrow P(2n)$ is true $\forall n \in \mathbb{N}$.
iii) “Fill in the gaps” of the previous step by inducting backwards, showing that $P(n) \Rightarrow P(n-1)$ (backward step)
Then $P(n)$ holds true for all $n \in \mathbb{N}$.
Perhaps the most commonly known application of this method is Cauchy’s proof of the AM-GM inequality.
A proof may be found in the following Wikipedia article.
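As with the other inequalities in this post, a numeric spot-check of AM-GM is straightforward (the sample sizes and ranges below are arbitrary choices of mine):

```python
import math
import random

# Spot-check AM-GM: for nonnegative x_1..x_n,
# (x_1 + ... + x_n) / n >= (x_1 * ... * x_n) ** (1/n).
random.seed(1)
for _ in range(1000):
    xs = [random.uniform(0, 100) for _ in range(random.randint(1, 10))]
    am = sum(xs) / len(xs)
    gm = math.prod(xs) ** (1 / len(xs))
    assert am >= gm - 1e-9  # small tolerance for floating point error
print("AM-GM held on all samples")
```

Cauchy’s forward-backward proof establishes the inequality for powers of two first, then inducts backwards to fill in every other $n$.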
Induction can be extended to higher dimensions.
Say we are looking to prove some statement $P(m, n)$ for all $m, n \in \mathbb{N}$.
There are essentially two ways this can be done: either by folding the two variables into a single variable and performing a regular 1-dimensional induction, or by doing nested 1-dimensional inductions with respect to each variable.
The method works as follows:
(i) Prove that $P(1, 1)$ is true (base case)
(ii) Prove that if $P(m, n)$ is true whenever $m + n = k$ then $P(m, n)$ is true whenever $m + n = k + 1$ (inductive step)
or
(i) Prove that $P(1, 1)$ is true (base case)
(ii) Prove that if $P(m, 1)$ is true for some $m$ then $P(m + 1, 1)$ is true (outer induction)
(iii) Prove that if $P(m, n)$ is true for some $m, n$ then $P(m, n + 1)$ is true (inner induction)
Then $P(m, n)$ is true for all $m, n \in \mathbb{N}$.
The method can be used for instance to prove Ramsey’s theorem (see below).
Let $R(s, t)$ denote the smallest number $n$ such that in any group of $n$ people there exist $s$ people who all know each other or $t$ people who all don’t know each other.
It is known that $R(s, t) \leq R(s - 1, t) + R(s, t - 1)$
(this is a lemma to be proved in the full argument).
– Prove that $R(s, t) \leq \binom{s + t - 2}{s - 1}$.
We induct on $s + t$. The base case is clearly true: $R(1, 1) = 1 \leq \binom{0}{0} = 1$.
Suppose the statement is true for all $s, t$ with $s + t < k$ for some $k$.
We want to show the statement holds for all $s, t$ with $s + t = k$.
If $s = 1$ or $t = 1$ then the statement follows immediately, so we may assume $s \geq 2$ and $t \geq 2$.
Now notice that $(s - 1) + t < k$ and $s + (t - 1) < k$, so by the inductive hypothesis
$R(s - 1, t) \leq \binom{s + t - 3}{s - 2}$ and $R(s, t - 1) \leq \binom{s + t - 3}{s - 1}$.
Then by $R(s, t) \leq R(s - 1, t) + R(s, t - 1)$ and a standard binomial identity (Pascal’s rule) we have $R(s, t) \leq \binom{s + t - 3}{s - 2} + \binom{s + t - 3}{s - 1} = \binom{s + t - 2}{s - 1}$.
Hence the statement follows by induction on $s + t$.
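Both the recurrence bound and the binomial closed form are easy to compute, so we can cross-check the algebra for small $s$ and $t$ (a verification of the arithmetic, not a proof; function names are mine, and I use the standard base cases $R(1, t) = R(s, 1) = 1$):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def ramsey_upper(s: int, t: int) -> int:
    """Upper bound on R(s, t) obtained by iterating the lemma
    R(s, t) <= R(s-1, t) + R(s, t-1) down to the base cases R(1, t) = R(s, 1) = 1."""
    if s == 1 or t == 1:
        return 1
    return ramsey_upper(s - 1, t) + ramsey_upper(s, t - 1)

# The recurrence bound never exceeds the closed form C(s+t-2, s-1);
# with these base cases the two are in fact equal (Pascal's rule).
for s in range(1, 10):
    for t in range(1, 10):
        assert ramsey_upper(s, t) <= comb(s + t - 2, s - 1)
print("binomial bound verified for s, t < 10")
```

The equality of the two quantities is exactly the content of the inductive step: the recurrence is Pascal’s rule in disguise.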
A combinatorial proof revolves around proving some identity by means of finding an explicit bijection or counting the same thing in two ways.
Combinatorial arguments are usually far more elegant and illuminating than their (usually inductive) counterparts.
Here is an example of a combinatorial proof:
– Prove that $\sum_{k=0}^{n} \binom{n}{k} = 2^n$
On the right hand side we have the cardinality $2^n$ of the power set (i.e. the set of all subsets) of $n$ objects.
On the left hand side we are summing the number of subsets of size $k$ for $k = 0, 1, \ldots, n$.
These two counts are equal and hence the identity follows.
Now try proving this by induction!
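The double count is also easy to verify for small $n$ in Python (brute-force enumeration of all subsets, so keep $n$ small):

```python
from itertools import combinations
from math import comb

# Count the subsets of an n-element set in two ways:
# grouped by size (left hand side) and all at once (right hand side).
n = 8
by_size = sum(comb(n, k) for k in range(n + 1))
all_subsets = sum(1 for k in range(n + 1)
                  for _ in combinations(range(n), k))
print(by_size, all_subsets, 2 ** n)  # -> 256 256 256
```

The bijective viewpoint (each object is either in a subset or not, giving $2^n$ choices) is precisely what the right hand side encodes.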
More examples can be found in a previous article I wrote on the Orbit-Stabilizer Theorem and the Burnside Lemma.
Statements which have the word “unique” in them require uniqueness proofs.
Uniqueness can mean different things in different contexts.
In number theory you may be talking about the only number, or the only number modulo n satisfying a given proposition.
In algebra you may be talking about the only algebraic object up to isomorphism (algebraic equality).
In topology you may be talking about the only topological space up to homeomorphism (topological equality).
In general one could prove uniqueness with respect to any class or invariant.
Proofs of uniqueness often go hand in hand with proofs of existence.
The skeleton for uniqueness proofs is almost always the same.
One assumes non-uniqueness and then shows that the supposedly non-unique objects satisfying the given proposition must in fact be equal.
Here is a trivial (yet demonstrative) example of this procedure:
– Prove that $x + 5 = 7$ has a unique solution
Suppose there are two numbers $x$ and $y$ satisfying the above equation.
Then $x + 5 = 7$ and $y + 5 = 7$.
Therefore $x + 5 = y + 5$ and so $x = y$.
Hence the solution is unique if it exists (existence needs to be proved as well, but we can easily see the solution is $2$).
Statements of the form $P \Leftrightarrow Q$ are called equivalences between statements $P$ and $Q$.
In mathematical texts the double arrow is sometimes replaced with the phrase ‘if and only if‘ or ‘iff‘ for short.
If $P$ is equivalent to $Q$ that means $P$ is both necessary and sufficient for statement $Q$ to be true.
$P \Leftrightarrow Q$ is logically equivalent to $P \Rightarrow Q$ AND $Q \Rightarrow P$.
Therefore all methods for proving implications are applicable to equivalences as well.
In particular, to prove that $P \Leftrightarrow Q$ one can show $P \Rightarrow Q$ and $Q \Rightarrow P$ separately.
There are of course a myriad of other techniques and proof patterns specific to more narrow mathematical domains, but most of the methods you have seen above are used throughout mathematics.
It may also be worth mentioning how to disprove a mathematical statement.
To disprove a mathematical statement it is enough to produce exactly one counterexample.
For instance, if you are to disprove that $P(x)$ is true for all $x \in S$, you should not show that $P(x)$ is false for all $x \in S$. It is enough to exhibit one $x \in S$ such that $P(x)$ is false.
It is also important that you prove that your counter example is valid by proving that it satisfies all hypotheses AND does not satisfy the conclusion.
For example here is a false statement:
– “All real numbers are rational”
We can disprove this by exhibiting one number which we prove is real and cannot be rational.
Provided $\sqrt{2}$ exists, we proved in Method 3.5 that $\sqrt{2}$ cannot be rational.
A proof that $\sqrt{2}$ exists as a real number can be found here.
These facts together disprove the statement.
How do you know if a statement is true or false?
There is no general answer, if there were mathematics wouldn’t be particularly interesting.
What about heuristically?
If you are proving exercises in class you may as well trust the problem setter, unless they explicitly ask you to determine the truth or falsehood of a statement.
In general, having found a question that makes sense (which can be hard in itself), I would personally recommend going by your intuition (…whatever that really means). If you think the statement might be false, you may want to start looking for a counterexample immediately. If you have no clue, then you should start trying to prove the statement. While doing so you might stumble over something which breaks the statement, and that can help you construct a counterexample by exploiting what makes it fall apart. Otherwise you either succeed in proving the statement, or you get stuck somewhere – and what to do then is a topic for another discussion.
Finally, if you are interested in reading about some humorous false proof techniques, see here. I am sure anyone who has sat in a math class can identify with at least some of these.
Here are some exercises you may wish to try:
What is wrong with the following argument that attempts to prove every object in the world is grey?
——————————————————————————————————————–
Let $S$ be the set of all objects in the world.
For the base case, pick a grey T-shirt.
Now assume all subsets of $n$ objects are grey.
Consider a subset $T \subseteq S$ with $n + 1$ objects.
Removing the last object from $T$ we can apply the inductive hypothesis to conclude the first $n$ objects of $T$ are all grey.
Removing the first object from $T$ we can apply the inductive hypothesis to conclude the last $n$ objects of $T$ are all grey.
Therefore all objects of $T$ must be grey.
Hence all objects in the world are grey by induction.
———————————————————————————————————————
What is wrong with the argument if we replace “every object is grey” with “every object is of the same colour”?
Prove the binomial identity $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$ via induction.
Can you find a combinatorial proof?
Find all positive primes for which there exists integers and
such that .
(Hint 1: Find the obvious solutions by inspection and prove there are no more using Minimal Criminal on )
(Hint 2: In the minimal criminal argument prove that , and that has to be at least 3. You can start by factorizing )