Gödel Numbers of Symbolic Expressions and Missing Missing Element
0. Introduction
A ustrian mathematician Kurt Gödel is known for his theorem on 'incompleteness' of the formal
arithmetic expressed in the first order predicate calculus. In proof of the theorem, Gödel defined encoding of logical expressions into natural numbers that allowed him to express sentences about formal
arithmetic as sentences of formal
arithmetic.
Symbolic expressions are similar to logical expressions. It is not surprise, since symbolic expressions are designed explicitly to represent logical expressions in McCarthy's program Advice Taker. In this post, Gödel's encoding based on products of prime numbers is applied on symbolic expressions. That particular encoding is slightly simpler than original, due to convenient reasoning about symbolic expressions as lists. It is discussed how to redefine symbolic expressions so described gödelization is bijection, i.e. not only every symbolic expression has its code, but also every natural number is code of some symbolic expression.
1. Gödel's encoding of logical formulas
Natural numbers are assigned to all "primitive sings" allowed in logical formulas: 11 is assigned to left parenthesis, 13 is assigned to right parenthesis, 7 is assigned to logical connective ∨. Prime numbers larger than thirteen: 17, 19, 23, ... are assigned to variables x_{1}, y_{1}, z_{1}, ... Logical formulas, as sequences of "primitive signs" are associated to sequences of natural numbers. For instance, (x_{1}∨x_{2}) is associated to sequence 11, 17, 7, 19, 13. The sequences of natural numbers are associated to single natural numbers using elements of sequence as exponents of prime numbers. For instance, 11, 17, 7, 19, 13 is associated to
2^{11}·3^{17}·5^{7}·7^{19}·11^{13} = 8131093970737510569553529892960700268640000000.
Hence, all logical formulas are associated to natural numbers. Furthermore, different logical formulas are associated to different numbers.
2. Gödel numbers of symbolic expressions
Gödel numbers can be assigned to symbolic expressions without encoding of individual characters as parentheses.
Symbols. Let us assume that symbols are only atoms in symbolic expressions, and that symbols consist only of small letters. These restrictions are not essential. Then, all symbols can be ordered first in groups of the same lengths; inside groups by alphabetical order. Such sequence would look like
a, b, c, ..., z, aa, ab, ac, ..., zz, aaa, ...
Then, these symbols are enumerated with prime numbers; a → 3, b → 5, c → 7, ...
Nonatomic symbolic expressions. If e_{1}, ..., e_{n}, n ≥ 0 are symbolic expressions with respective Gödel numbers g1, ..., gn then symbolic expression (e_{1} ... e_{n}) is encoded to
2^{g1}·3^{g2}·... ·p_{n}^{gn},
where p_{n} is nth prime number. Specially, () is encoded to 1, (()) is encoded to 2 etc. Different symbolic expressions are encoded to different natural numbers. For each natural number that is Gödel number of some symbolic expression, original symbolic expression can be reconstructed. If Gödel number is prime number, then it is code of some symbol. If it is not prime number, then it can be represented as product of the form 2^{g1}·3^{g2}·... ·p_{n}^{gn}, hence it is code of (e_{1} ... e_{n}), where e_{1}, ..., e_{n} are reconstructed from Gödel numbers g1, ..., gn.
For instance, Gödel number of (a b) is 2^{3}·3^{5} = 1944.
3. Redefining symbolic expressions
Described encoding is simple and natural. However, it is not bijection. Some natural numbers are not codes of any symbolic expressions. For instance, 27 = 2^{0}·3^{3} cannot be code of any symbolic expression, because there is no symbolic expression such that its code is 0. Let us introduce new symbolic expression, denoted with _ such that its Gödel number is 0. Then, Gödel number of (_ a) is exactly 2^{0}·3^{3} = 27. As every natural number can be presented as product
2^{g1}·3^{g2}·... ·p_{n}^{gn},
where p_{n} is nth prime number, g1, g2, ..., gn ≥ 0, then every natural number is code of some symbolic expression.
Unfortunately, with such definition, Gödel numbers are not unique any more. There are two problems; it appears neither one is essential.
First, symbolic expression _ on the end of list does not influence encoding. Gödel numbers of all symbolic expressions (_ a), (_ a _), (_ a _ _) ... are
2^{0}·3^{3} = 2^{0}·3^{3} ·5^{0} = 2^{0}·3^{3}·5^{0}·7^{0} = ... = 27.
However, there is an interpretation of _ that allows natural identification of these symbolic expressions: _ could denote that element is "missing." For instance, symbolic expression (_ a) doesn't have first element, just second one; _ on the end of the symbolic expressions is irrelevant. Implementation of symbolic expressions as maps, as in important experimental Lisp dialect MISC, developed by Will Thimbleby might be suitable.
Second, Gödel numbers of symbols a, b, ... are same as Gödel numbers of symbolic expressions
(_ ()), (_ _ ()), ...
That problem is avoided if symbols are seen as shortens of respective symbolic expressions, like 'x is shorten for (quote x) in many Lisp dialects.
References
 Gödel, K., "On formally undecidable propositions of Principia Mathematica and related Systems I", in S. Feferman, Kurt Gödel, Collected Works, Vol. I., Oxford University Press, 1986, pp. 14595.
 McCarthy, J., Programs with Common Sense, In Proceedings of the Teddington Conference on the Mechanization of Thought Processes, London, Her Majesty's Stationery Office, 1959, pp. 75691.
 Thimbleby, W., MISC, retrieved 28 December 2011.
