principle_of_formal_systems

Systems and Formal Systems

L. H. Kauffman

University of Illinois at Chicago

Abstract. All (formal) systems are interpreted (formal) systems. Each abstract pattern has its origin in experience and returns to that experience. The boundary between systems as systems in the world and systems as mathematical systems can only be drawn as a convenience (or a hindrance). In reality, systems are articulations of direct experience and the articulation of experience is the act of creation of (described) systems.

0. Introduction Mathematics is the study of patterns. To the mathematician his or her structures take on the appearance of a permanence that can seem far more real than the familiar world of objects, processes and human experiences. This vivid presence of objects (as simple as the number 17 or the form of a triangle) that seem to have existence only in the minds that think them leads to the notion that they really exist independent of those minds. Is this the case? And what is the relation of mathematics to the worlds of our experience? This essay explores these questions by first taking up the theme of formal systems and then branching out into a discussion of the meaning of formalism in relation to mathematics as a whole.

The main point that I wish to make here is that on the one hand we stabilize and make real for ourselves mathematical structures by using formalisms of all sorts. On the other hand no single formalism can capture in its entirety what needs to be expressed. In ordinary language, in the arts and in literature it is obvious that every mode of expression has its limits. In mathematics, the limitations of a given mode of expression can actually lead the way to new modes of thought that transcend that mode of expression.

Mathematics arises from our experience of the world and our need to express that experience in language and pattern and process. But just as language and the world are intertwined so it is with mathematics and the worlds that it describes and creates. Ultimately we cannot distinguish between the models and the worlds that they model. It is a paradox. The very process of careful abstraction and clear modeling (where the map is not the territory) leads to an intertwining of pattern, language, process and models that is the very ground of our existence.

The essay itself explores these themes. It is a rough attempt at an articulation.

I. Genesis of Formal Systems

I will consider the nature of formal systems as a mathematician tends to speak of them. I will speak informally about formal systems.

It is important to realize that the idea of mathematics as an abstract game of symbols is relatively recent. Up until the discovery of non-Euclidean geometry in the 1800's it was assumed that mathematics described collections of necessary truths about the world, truths derived from unassailable axioms. Then with Bolyai, Loabshevsky, Gauss and Riemann it became clear that the basic patterns of geometry did not have to assume the parallel postulate (Through a point distinct from a line there is exactly one line parallel to that line.). Then the patterns of geometry became just that, patterns and were malleable and susceptible to a multiplicity of interpretations. The truth of geometry became a truth relative to the axioms and the axioms became mere suppositions in a sea of possible suppositions.

Along with this advent of non-Euclidean geometry there came the rise of symbolic logic, particularly at the hands of George Boole, who saw an analogy between algebra and logic. Boole did not just symbolize logic, he saw that ordinary algebra could be interpreted as logic. He recognized that one could write A+B and mean “A or B” and that one could write AB and mean “A and B”. He saw that algebraic rules such as the distributive law A(B+C) = AB+AC could be interpreted for logic. He saw in a global way that one could set up an algebraic system that looked very much like the ordinary algebra of numbers, but that this system would allow logical calculation and the analysis of complex arguments.

Thus, from the point of view of algebra, Boole's revolution is as big as the revolution that produced non-Euclidean geometry. Now one had non-standard algebra, open to interpretations that led far beyond the confines of the description of the properties of numbers.

This revolution in algebra was not confined to algebraic logic. Boole's contemporary Sir William Rowan Hamilton discovered the Quaternion and in so doing introduced one of the first significant non-commutative algebras. Hamilton's Quaternions are an algebraic structure on four dimensional space that turns out to have far reaching consequences for the structure of rotations in three-space and for many applications in physics. In fact it was not until the advent of quantum theory in the 20th century that the quaternions came into their own as the group SU(2) so central to the quantum mechanics of spin.

The difference in this change of view about algebra was every bit as significant as the change that happened when Descartes observed that geometry could be done by using algebra. But let us recall the structure of Descartes' discovery Descartes saw that if we model the two dimensional Euclidean plane by pairs of coordinate numbers (x,y) then geometric relationships could be mapped precisely to algebraic relationships. Cartesian geometry is based of the distance function (x2 + y2)[1/2], representing the distance of the point (x,y) from the origin (0,0) in the plane. Thus a circle of radius R becomes the set of points (x,y) such that x2 + y2 = 1. The formalism of numbers remains the formalism of numbers. The patterns of geometry remain the patterns of geometry. But now these two fields of mathematics are one.

As we now know, these discoveries of Descartes opened up the way for non-Euclidean geometry since a major pathway to that arena is the use of different sorts of distance functions to map out the non-Euclidean models. His discovery also paved the way to the discovery of the Quaternions, for the precursor of the Quaternions was the discovery of the geometric interpretation of the complex numbers by Gauss and Argand. That discovery can be stated in utmost simplicity with the equation

a+bi = (a,b).

On the left hand side we have the complex number a+bi with its imaginary part bi and the imaginary value i whose square is -1. On the right hand side is the Cartesian geometric point in the plane. The correspondence explains everything! All the strange formal properties of the complex numbers have beautiful geometric interpretations, and a whole new field of mathematics opens up. (The beginning of this is the equation (a+bi)(a-bi) = a2+b2, relating Euclidean/Cartesian distance to the product of a number and its conjugate. Along with this is the fact that i(a,b) = (-b,a), interpreting i as a rotation of ninety degrees in the Cartesian plane.)

Although used for hundreds of years before Gauss and Argand, these imaginary numbers were only understood as an abstract formalism, a game of symbols where (a+bi)(c+di) = (ac -bd) +(ad+bc)i. This formula is obtained directly by multiplying out the products and using the equation i2=-1. And yet it was known that these complex numbers could be used to reason to perfectly real and sensible answers to the solutions of higher degree equations. It is here that one begins to see also the seeds of a view of mathematics as pure formalism, for what is the nature of these abstract manipulations that lead to real answers?! It is useful here to pause and give some examples of this sort of magic. First of all let z=a+bi and z* = a-bi. Then as we indicated in the last paragraph, zz* = a2 +b2. It is also easy to see that (zw)* = z*w* for any complex numbers z and w (We leave this as an exercise for the reader.). Thus

(zz*)(ww*) = (zw)(z*w*) = (zw)(zw)*

and these equations tell us that

The product of two sums of two squares is itself a sum of two squares.

In fact, the equations give us a specific answer to that question, for if z=a+bi and w=c+di then

zw = (ac-bd) +(ad+bc)i and so we have

(a2+b2)(c2+d2) = (ac-bd)2 + (ad+bc)2.

For example

(32 + 42)(52 +72) = (3×5 - 4×7)2 + (3×7 + 4×5)2 = 132 + 412.

An abstract formalism informs the real world of numbers.

Hamilton's discovery of the Quaternions comes on the heels of the geometric interpretation of the complex numbers. In fact it was Hamilton who pointed out that the essence of this interpretation was a method of multiplying couples of numbers in the form

(a,b)(c,d) = (ac-bd, ad+bc),

exactly the abstraction of the formula for the product of a+bi and c+di. With that understanding, Hamilton went into a long investigation hoping to find good ways to multiply triples – to find an algebra structure for three-dimensional space. This direct attack was doomed to failure, but when he went to quadruples and to four-dimensional space, then success shined on the project and the Quaternions were born.

At this stage abstract algebra was born, and with it the possibility of thinking of mathematics as the study of pure formal games. Games of rules for the manipulation of certain kinds of symbols. Vast possibilities open at this thought, for then mathematics becomes a study of patterns of all kinds codified in precise languages whose meanings need only exist in relation to the play of the patterns themselves. Truth is abandoned to indication. Indication is abandoned to void in the form of the creation of those systems themselves.

The power of formalism was appreciated in real mathematics. Newton's calculus demonstrated that the right formalism could penetrate the depths of dynamical natural law. Lebniz dreamed of a calculus ratiocinator, but also gave us the remarkable notation for the Newton/Leibniz calculus that streamlines its understanding and application to the present day. Euler, master of creative formalism encapsulated infinity in his discovery of the intimate relationship of p, e and i in the formula eip +1 = 0, his incredible discovery that that p2/6 is sum of the reciprocals of the squares of the natural numbers and his fantastic proof of the infinitude of the prime numbers that is encapsulated in the formula 1 +1/2 + 1/3 + 1/4 + … = (2/1)(3/2)(5/4)(7/6)(11/10)(13/12)(17/16)…

(The series on the left is the sum of the reciprocals of the natural numbers. The product on the right is the product over all prime numbers p of the ratios (p/(p-1)). It is not hard to see the series on the left diverges (slowly!) and so the product on the right must be infinite. Hence there are infinitely many prime numbers.)

Bertrand Russell and Alfred North Whitehead took these themes to a new place in their Principia Mathematica, attempting to found all of mathematics on symbolic logic, their logic was codified into a formal system with specific syntactic rules governing all allowable operations. A proof a sequence of formulas, each following from the next according to either the specified initial rules or by a jump sanctioned by a previously proved theorem. In essence, each proof could be expanded until it was a sequence of formulas each following from the next according to the specified rules of the system. In essence then, all theorems of mathematics were to be the emanation of of one small set of rules based on the formalization of logic. A fantastic mechanical loom of reason. Russell was quoted as saying “Mathematics is the subject where we do not know what we are talking about nor whether what we are saying is true.”

This quote from Russell is a perfect parody of the situation of analyzing an uninterpreted formal system. If the role of the mathematician is to be the examiner of the consequences of a formal system on the basis of its stated rules, then indeed he need have no regard for meaning and no regard for truth! But this is as strange a parody of the mathematician as is Searles' parody of a translator of languages! Recall that Searles' translator sits in an isolated room (the Chinese room) with a dictionary and other lists of rules. Pieces of paper are handed to him. He looks up their translations in his book. He knows nothing of Chinese. He just follows the rules. He writes down the translations on other slips of paper. He follows the rules. Sometimes the rules direct him into reactions that are not exactly translations but “responses” to what it written. He does his work. To the outside world the Chinese Room appears to converse in Chinese! The operator of the room knows nothing of this. He just follows the rules of a formal system plus dictionaries. He has no intelligence for Chinese. Yet Operator plus Room form a perfect Chinese conversant.

Shift to present time. Computers prove theorems! In fact, a few years ago a computer at Argonne Labs in Illinois solved a problem in Boolean algebra that had stumped human beings for fifty years. The feat was accomplished by using the computer exactly as the operator of a formal system that encoded the problem. Strictly speaking, one knew how to direct the computer in a search for a legal sequence of moves from the beginnings of the system that would constitute a proof that “Robbins algebras are Boolean.” The computer needed no meaning, no idea of truth. It just followed the rules, engaged the search and after a week of computation found the pathway that solved the problem.

Russell and Whitehead thought that they could reduce all of mathematics to a system that could be fed into such a computer. Just add electricity and wait. All theorems will come forth. The concept seemed right at the time. Just get the basic principles encoded and everything else will come out via mechanism in the meaningless gyrations of legal sequences of moves.

There is something wrong here, and that wrongness was eloquently pointed out in two different but related ways by Ludwig Wittengenstein in his early work the Tractatus Logico Philosphicus and in his later work on the foundations of mathematics. It was given a death blow by Kurt G�del who showed that any sufficiently rich formal system will have theorems that it cannot prove that can nevertheless be proved by any competent mathematician who has access to the system. We will illustrate the basic form of Godel's argument, but the key point in his work is the rehabilitation of a basic dramatization of the mathematician. G�del's mathematician does not just check whether sequences of rules in the formal system are obeyed or not obeyed. He is not Searles' operator in the Chinese room. Not at all! G�del's mathematician analyzes the structure of the formal system as a whole. He looks on the formal system in question as a mathematical structure to be studied. That system, for G�del's mathematician, is like a triangle in the hands of Euclid. Euclid proves that the triangle has the sum of its angles equal to 180 degrees, a fact quite unknown to the triangle herself! G�del's mathematician stands meta to the formal system and endows that system with an interpretation as formal system among formal systems. As soon as a system is well defined in the way that it can be regarded as uninterpreted and strictly rule driven it becomes an object of study and thus acquires the interpretation of a structure to be looked at among the panoply of other such structures.

Shift again to modern times. We make a working computer program. This program is precisely designed to be operable by a machine. It must work without any hint of its possible interpretation. In creating it and working with it we work in the class of computer programs that behave in such and such ways in such and such a language. The program is interpreted as a program. The formal system is interpreted as a formal system. There is no such thing as an uninterpreted formal system.

II. G�del's Incompleteness Theorem

In this section I will give a miniature version of G�del's argument and a description of his actual argument. The miniature example is due to Raymond Smullyan and I call it the “Smullyan Machine”. The Smullyan Machine acts at any given time by printing a string of characters on a tape that emanates from its side. The operator of the machine presses the single button on the top of the machine to active each such printing. We are told that the only characters that the Machine can use for its printing are from the seet { (,),~,P,R }. Thus the Machine might print ^{1)}))))PPPRR~~R(.

Certain types of character strings are singled out as “Machine sentences” (M-sentences). These are all of the following form

P(X) or ~P(X) or PR(X) or ~PR(X)

where X can be any character string whatever. Thus ~P(RRR(P~) is an M-Sentence with X=RRR(P~. We are told the following properties of the Machine:

1. M-sentences can be interpeted as descriptions of the possible actions of the Machine. In particular, P(X) is interpreted as the information that the machine can print the string X all by itself. ~P(X) is interpreted as the information that the machine cannot print the string X all by itself. PR(X) is interpreted as the information that the machine can print the string X(X) all by itself, and ~PR(X) is interpreted as the information that the Machine cannot print the string X(X) all by itself.

2. Under this interpretation, the Machine only prints true M-sentences.

Theorem. There exist M-sentences that are true, yet unprintable by the Smullyan Machine.

Proof. Let S=~PR(~PR). If the Machine can print S then the interpretation of S tells us that the machine cannot print X(X) where X=~PR. Thus the machine cannot print ~PR(~PR). Thus we have shown that if the machine prints S then a contradiction ensues since S asserts its own unprintability. Therefore the Machine (which prints only true M-sentences) cannot print S. But S asserts her own unprintability by the Smullyan Machine. Therefore S is a true M-sentence that is unprintable by the Machine. QED

The Smullyan Machine illustrates a number of key issues about formal systems and their limitations. The Machine is simpler than the systems to which the full Godel theorem applies, and of course the Machine does not engage in proofs. Nevertheless, the key issues of reference and interpretation in a metalanguage are neatly illustrated. To confront the issue of reference directly note how we constructed S as a sentence that refers to itself. We might have attempted to solve ~P(X) = X. If there were such an X then the M-sentence ~P(X) would assert its own unprintability. However, there can be no solution to ~P(X) =X since ~P(X) has four more characters than the string X. Here we meet the direct notion of identity for character strings on which this construction was based. In fact, the way out was through the referential nature of R(X). R(X) in the interpretation of Machine sentences, stands for X(X). Unless X has a single character, X(X) has more characters than R(X). This is an image in the Machine's formal system of the usual linguistic condition of reference where the referent is smaller than the that to which it refers. Then we try to solve ~PR(X) = ~X(X) and discover the unique solution X=~PR that makes the Machine incomplete with respect to truth.

How does this situation compare with the full Godel Theorem? In that theorem the formal system is coaxed into producing a sentence whose interpretation is “This sentence is unprovable in the present formal system.”. G�del's method is to set up a code wherein every piece of text in the formal system has a code integer from which the text can be decoded. The formal system is assumed to be rich enough to contain all the usual properties of the integers and the capability of formalizing proofs and algorithms involving integers. Lets write

g —–> F

to indicate that the formula or text F of the formal system has code number g. Now suppose that

g —–> F(u)

where F(u) is a formula with a free variable u (for example F(u) could be the statement u > 2). Let #g be the code number of F(g) so that

1. g -----> F(g).

2. g is the code number of the statement obtained by substituting the code number of F(u) into the variable u. Lets repeat

g —–> F(u)

1. g -----> F(g).

Now the association of #g to g is a function from integers (a special subset of integers that encode statements with one free variable) to integers. Thus the specification for the computation of #g can be expressed in the formal system. Thus the formal system can be assuemd to know (in the sense of its internal definitions) the symbol #. Therefore we can consider formulas of the form F(#u) that invoke the computation of #u. Suppose we have such a formula F(#u). It has a code number g so we have

g —–> F(#u)

1. g -----> F(#g).

As a result, F(#g) refers to its own code number. In our interpretation, F(#g) refers to itself. This takes care of self-reference.

Now consider provability. Let B(u) be the sentence in the formal system that asserts the provability of the decoding of the number u. B(u) is a statement about numbers that is decidable by an algorithm and so we can assume that it is stateable in the formal system. Given that, we let g be the code number for ~B(#u). Then

g —–> ~B(#u)

1. g -----> ~B(#g).

~B(#g) asserts it own unprovability in the formal system. Hence it cannot be proven without contradiction within the formal system. Therefore it is a true theorem that can be stated in the formal system, but is not provable within the system. This completes the proof of G�del's Incompleteness Theorem.

The very precision of the notion of a formal system, the fact that it can be pinned down by coding, is the seed of incompleteness. The incompleteness comes about through the ability of the mathematical observer to analyze the system as a mathematical object.

Mathematics, as a whole, is not derived from a single formal system. If it were then we could step outside that system (in principle!) and apply G�del's argument, producing a true Theorem (and its proof) unattainable by the formal system itself. Proof is not mechanical. Mathematics does not proceed inevitably from a codification of logic or any other codification.

We started with the idea of a formal system, quite self-contained, designed to handle ordinary arithmetic, but constructed so that every bit of arithmetic and logic that it needed was contained in its own formalism. The system was then apparently free from needing any interpretation. All processes could continue via its own internal rules. The statement ~B(#g) looked at from within the system was just a statement about the properties of a certain computable number #g. The formal system itself has no way to survey the consequences of the provablity of ~B(#g). But if it is consistent then it will never write a proof of this statement. Our proof as outsiders to the system of this same statement is a proof using the meaning of its interpretation in terms of provability. This meaning is part of our exact discussion of the meaning and behaviour of the formal system, just as exact as our discussion of whether or not the Smullyan machine can or cannot print a given statement.

Meaning arises through interpretation and this meaning is the meaning that is assigned to the formal system itself. That it is a formal system provides a sufficient context and meaning for us to demonstrate its incompleteness.

III. Experience

This essay has been a discussion of syntax and semantics. G�del's theorem shows that any attempt to found mathematical knowledge fully on syntax is doomed to failure exactly because the essence of mathematics is to make meaning out of syntax.

I have given numerous examples showing how the syntax in mathematical formalism is intimately related to its meaning. There is great freedom in examining a structure on a purely syntactic basis. This freedom is not at all the same thing as asserting that all of mathematics can be done in pure syntax (as though performed by a machine). In fact it is exactly in our experiencing of the formal system that its meaning arises and its incompleteness is fulfilled. After all, the G�delian sentence that is unprovable inside the formal system is certainly not unprovable! It is proved by us, outside the formal system through our experience of that system.

This leads directly into the major theme of the relation of experience to mathematics. So far in this essay we have emphasized the role of personal experience in relation to the mathematics itself. This is the experience of the person working with the formalism and its associated ideas. It is through this experience that the mathematical person comes to live in a ground of language that allows the proof of theorems unavailable to a given formal system. And it is this same ground of experience that is related to the process of abstracting mathematics from the activities of the everyday world.

The everyday world is the source of all our mathematics. It is through our experiences in that world that systems of counting came to be invented. It is through our experiences in that world that systems of geometry became codified. The order and sophistication of these invented/ discovered systems varies from culture to culture. The Incas reckoned by using knots tied in clusters of string (Quipu). Yet they did not (apparently) develop a topological theory of knots. Indeed our familiar mathematics did not develop a theory of knots until the end of the nineteenth century. The theory developed strongly in the twentieth century and it still in a state of development. Yet the everyday experience of knots and weaves has been with us since nearly the beginning of human history. It is never obvious when a cultural pattern of activity will become a mathematical activity. There is no way to draw a boundary separating the patterns developed in culture, in practical work, in the arts from the pool of patterns that will be of significance to mathematics.

One may wonder that the everyday world is in fact built of “mathematical stuff”. In physics one is impressed with the precise correspondence of natural action and certain mathematical laws (such as the equations for gravity, electromagnetism, gravity and quantum theory).

In the realms of art, poetry and emotion we do not have quantitative models, but this does not mean that mathematics is not involved. One of the benefits of understanding formal systems for what they are, is that mathematics as a whole becomes a non-numerical study of pattern. Whole areas of mathematics are non-numerical and qualitative at their base. In this perspective one only arbitrarily draws a line between mathematical structures and the patterns of the dance, the instructions for a weaving or the text and meaning of a musical score.

It is my belief that Wittgenstein's dictum “The limits of my language are the limits of my world.” (Tractatus Logico Philosophicus) extends deeply into the relationship of mathematics and our worlds. The limits of mathematical language exist only to be transcended. In this sense there are no limits and this is so of our worlds as well. When we ask what are the limits of language, and see that in this sense of creativity there are none, we enter directly into the realm that develops art, poetry, music, dance and mathematical structure. That we distinguish these from one another in ordinary life is only a limitation of ordinary language.

principle_of_formal_systems.txt · Last modified: 2015/01/31 23:55 (external edit)