Gödel Numbers of Symbolic Expressions
and
Missing Missing Element

Kazimir Majorinc
December 4th, 2013.

0. Introduction

Austrian mathematician Kurt Gödel is known for his theorem on the 'incompleteness' of formal arithmetic expressed in first-order predicate calculus. In the proof of the theorem, Gödel defined an encoding of logical expressions into natural numbers that allowed him to express sentences about formal arithmetic as sentences of formal arithmetic.

Symbolic expressions are similar to logical expressions. This is no surprise, since symbolic expressions were designed explicitly to represent logical expressions in McCarthy's program Advice Taker. In this post, Gödel's encoding based on products of prime powers is applied to symbolic expressions. This particular encoding is slightly simpler than the original, because symbolic expressions can conveniently be treated as lists. The post then discusses how to redefine symbolic expressions so that the described gödelization is a bijection, i.e. not only does every symbolic expression have a code, but every natural number is also the code of some symbolic expression.

1. Gödel's encoding of logical formulas

Natural numbers are assigned to all "primitive signs" allowed in logical formulas: 11 is assigned to the left parenthesis, 13 to the right parenthesis, and 7 to the logical connective ∨. Prime numbers larger than thirteen, 17, 19, 23, ..., are assigned to the variables x₁, y₁, z₁, ... Logical formulas, as sequences of "primitive signs", are thereby associated with sequences of natural numbers. For instance, (x₁∨y₁) is associated with the sequence 11, 17, 7, 19, 13. Each sequence of natural numbers is in turn associated with a single natural number by using the elements of the sequence as exponents of successive prime numbers. For instance, 11, 17, 7, 19, 13 is associated with

2¹¹·3¹⁷·5⁷·7¹⁹·11¹³ = 8131093970737510569553529892960700268640000000.

Hence, every logical formula is associated with a natural number. Furthermore, because prime factorization is unique, different logical formulas are associated with different numbers.
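The step from a sequence of sign codes to a single number can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not Gödel's own formulation; the function names first_primes and godel_number are chosen here for the example.

```python
def first_primes(count):
    """Return the first `count` primes by trial division (fine for short formulas)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # `primes` holds every smaller prime, so this test is exact.
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def godel_number(sign_codes):
    """Multiply p_k ** code_k over the sequence, p_k being the k-th prime."""
    result = 1
    for prime, code in zip(first_primes(len(sign_codes)), sign_codes):
        result *= prime ** code
    return result


# Sign codes taken from the text: "(" -> 11, ")" -> 13, "∨" -> 7,
# and the two variables -> 17 and 19.
formula = [11, 17, 7, 19, 13]
print(godel_number(formula))
```

Python integers have arbitrary precision, so the 46-digit result is computed exactly.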

2. Gödel numbers of symbolic expressions

Gödel numbers can be assigned to symbolic expressions without encoding individual characters such as parentheses.

Symbols. Let us assume that symbols are the only atoms in symbolic expressions, and that symbols consist only of lowercase letters. These restrictions are not essential. Then all symbols can be ordered first by length and, within each length, alphabetically. The resulting sequence looks like

a, b, c, ..., z, aa, ab, ac, ..., zz, aaa, ...

Then these symbols are enumerated with the odd prime numbers: a → 3, b → 5, c → 7, ... The prime 2 is skipped, because it turns out to be the code of the list (()).
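A minimal sketch of this enumeration, assuming symbols are non-empty strings of lowercase ASCII letters. The helper names symbol_index, nth_odd_prime and symbol_code are introduced only for this illustration.

```python
import math


def symbol_index(symbol):
    """1-based position of a symbol in the order a, ..., z, aa, ab, ..., zz, aaa, ..."""
    index = 0
    for ch in symbol:
        # Bijective base 26: a=1, ..., z=26, aa=27, ...
        index = index * 26 + (ord(ch) - ord("a") + 1)
    return index


def nth_odd_prime(n):
    """Return the n-th odd prime (3, 5, 7, 11, ...) for n >= 1."""
    found = 0
    candidate = 1
    while found < n:
        candidate += 2
        # Only odd divisors up to the square root need to be tried.
        if all(candidate % d != 0 for d in range(3, math.isqrt(candidate) + 1, 2)):
            found += 1
    return candidate


def symbol_code(symbol):
    """Gödel number of a symbol: the odd prime at the symbol's position."""
    return nth_odd_prime(symbol_index(symbol))


for s in ["a", "b", "c", "z", "aa"]:
    print(s, symbol_code(s))
```

Ordering by length first and alphabetically second is exactly what bijective base-26 numbering produces, which is why the index can be computed without listing the symbols.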

Non-atomic symbolic expressions. If e₁, ..., e_n, n ≥ 0, are symbolic expressions with respective Gödel numbers g₁, ..., g_n, then the symbolic expression (e₁ ... e_n) is encoded as

2^g₁·3^g₂·...·p_n^g_n,

where p_n is the n-th prime number. In particular, () is encoded as 1, (()) is encoded as 2, etc. Different symbolic expressions are encoded as different natural numbers. For each natural number that is the Gödel number of some symbolic expression, the original symbolic expression can be reconstructed. If the Gödel number is an odd prime, it is the code of some symbol. Otherwise it can be represented as a product of the form 2^g₁·3^g₂·...·p_n^g_n with n ≥ 0 (the empty product being 1), hence it is the code of (e₁ ... e_n), where e₁, ..., e_n are reconstructed from the Gödel numbers g₁, ..., g_n.

For instance, the Gödel number of (a b) is 2³·3⁵ = 1944.
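The encoding and the reconstruction can be sketched as follows. This is an illustrative sketch, not a reference implementation: symbolic expressions are modelled as Python strings (symbols) and lists (non-atomic expressions), and all function names are assumptions made for the example. Trial division keeps it short but makes it practical only for small codes.

```python
import math
from itertools import count


def is_prime(n):
    return n >= 2 and all(n % d != 0 for d in range(2, math.isqrt(n) + 1))


def primes():
    """Yield 2, 3, 5, 7, ... indefinitely."""
    return (n for n in count(2) if is_prime(n))


def symbol_code(symbol):
    """Map a, b, c, ..., z, aa, ... to the odd primes 3, 5, 7, ..."""
    index = 0
    for ch in symbol:                      # bijective base 26: a=1, ..., z=26, aa=27
        index = index * 26 + (ord(ch) - ord("a") + 1)
    gen = primes()
    next(gen)                              # skip 2, which is the code of (())
    for _ in range(index - 1):
        next(gen)
    return next(gen)


def code_symbol(prime):
    """Inverse of symbol_code for an odd prime."""
    index = sum(1 for n in range(3, prime + 1) if is_prime(n))
    letters = []
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters.append(chr(ord("a") + rem))
    return "".join(reversed(letters))


def encode(expr):
    """Gödel number of a symbol (str) or a list of expressions."""
    if isinstance(expr, str):
        return symbol_code(expr)
    code = 1
    for prime, element in zip(primes(), expr):
        code *= prime ** encode(element)
    return code


def decode(code):
    """Reconstruct the expression, or raise ValueError if `code` encodes nothing."""
    if code < 1:
        raise ValueError("0 is not the code of any expression here")
    if code > 2 and is_prime(code):
        return code_symbol(code)
    elements = []
    for prime in primes():
        if code == 1:
            return elements
        exponent = 0
        while code % prime == 0:
            code //= prime
            exponent += 1
        if exponent == 0:
            raise ValueError("not a Gödel number: a prime factor is skipped")
        elements.append(decode(exponent))


print(encode(["a", "b"]))   # the (a b) example from the text
print(decode(1944))
```

Numbers such as 27 are rejected by decode, because their factorization skips the prime 2 and an exponent of 0 is not the code of any expression under this definition.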

3. Redefining symbolic expressions

The described encoding is simple and natural. However, it is not a bijection: some natural numbers are not codes of any symbolic expression. For instance, 27 = 2⁰·3³ cannot be the code of any symbolic expression, because there is no symbolic expression whose code is 0. Let us introduce a new symbolic expression, denoted by _, whose Gödel number is 0. Then the Gödel number of (_ a) is exactly 2⁰·3³ = 27. Since every positive natural number can be written as a product

2^g₁·3^g₂·...·p_n^g_n,

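where p_n is the n-th prime number and g₁, g₂, ..., g_n ≥ 0, and since 0 is the code of _, every natural number is the code of some symbolic expression (each exponent is smaller than the number itself, so the argument applies recursively to the exponents).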

Unfortunately, with this definition different symbolic expressions can share the same Gödel number. There are two problems; neither appears to be essential.

First, the symbolic expression _ at the end of a list does not influence the encoding. The Gödel numbers of the symbolic expressions (_ a), (_ a _), (_ a _ _), ... are all

2⁰·3³ = 2⁰·3³·5⁰ = 2⁰·3³·5⁰·7⁰ = ... = 27.

However, there is an interpretation of _ that allows a natural identification of these symbolic expressions: _ can denote that an element is "missing." For instance, the symbolic expression (_ a) doesn't have a first element, just a second one; a _ at the end of a symbolic expression is irrelevant. An implementation of symbolic expressions as maps, as in MISC, an important experimental Lisp dialect developed by Will Thimbleby, might be suitable.
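The effect of a trailing _ can be checked with a short sketch. Here the missing element is modelled as Python's None, the symbol table holds only the three codes given earlier in the text, and the helper strip_missing_tail is a hypothetical name for the identification described above.

```python
MISSING = None                              # stands for the element written _ in the text
SYMBOL_CODES = {"a": 3, "b": 5, "c": 7}     # only the first three codes from the text


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def encode(expr):
    """Gödel number of the missing element, a symbol, or a list."""
    if expr is MISSING:
        return 0
    if isinstance(expr, str):
        return SYMBOL_CODES[expr]
    code = 1
    for prime, element in zip(first_primes(len(expr)), expr):
        code *= prime ** encode(element)    # a missing element contributes p ** 0 == 1
    return code


def strip_missing_tail(expr):
    """Drop trailing missing elements, which never change the code."""
    end = len(expr)
    while end > 0 and expr[end - 1] is MISSING:
        end -= 1
    return expr[:end]


for e in ([MISSING, "a"], [MISSING, "a", MISSING], [MISSING, "a", MISSING, MISSING]):
    print(e, encode(e), strip_missing_tail(e))
```

All three lists receive the same code and reduce to the same stripped form, which is the identification the "missing" interpretation suggests.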

Second, the Gödel numbers of the symbols a, b, ... are the same as the Gödel numbers of the symbolic expressions

(_ ()), (_ _ ()), ...

That problem is avoided if symbols are seen as abbreviations of the respective symbolic expressions, just as 'x is an abbreviation for (quote x) in many Lisp dialects.
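Under both identifications (trailing _ dropped, symbols read as abbreviations of lists), every natural number decodes to exactly one expression. The sketch below illustrates this under stated assumptions: _ is modelled as None, symbols are left in their expanded list form, and the names decode, encode and next_prime are chosen for the example.

```python
import math

MISSING = None   # stands for the element written _ in the text


def next_prime(p):
    """Smallest prime greater than p, by trial division."""
    candidate = p + 1
    while any(candidate % d == 0 for d in range(2, math.isqrt(candidate) + 1)):
        candidate += 1
    return candidate


def decode(code):
    """Canonical expression for any natural number (no trailing missing elements)."""
    if code == 0:
        return MISSING
    elements = []
    prime = 2
    while code > 1:
        exponent = 0
        while code % prime == 0:
            code //= prime
            exponent += 1
        elements.append(decode(exponent))   # exponent 0 decodes to the missing element
        prime = next_prime(prime)
    return elements


def encode(expr):
    """Inverse direction: the missing element is 0, a list is a product of prime powers."""
    if expr is MISSING:
        return 0
    code = 1
    prime = 2
    for element in expr:
        code *= prime ** encode(element)
        prime = next_prime(prime)
    return code


print(decode(27))   # (_ a), with a expanded to the list it abbreviates
print(all(encode(decode(n)) == n for n in range(200)))
```

Since decode stops as soon as the remaining factor is 1, the list it returns never ends in a missing element, which is what makes the canonical form unique.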

References

Gödel, K., "On formally undecidable propositions of Principia Mathematica and related Systems I", in S. Feferman, Kurt Gödel, Collected Works, Vol. I., Oxford University Press, 1986, pp. 145-95.
McCarthy, J., "Programs with Common Sense", in Proceedings of the Teddington Conference on the Mechanization of Thought Processes, London, Her Majesty's Stationery Office, 1959, pp. 756-91.
Thimbleby, W., MISC, retrieved 28 December 2011.