Gödel Numbers of Symbolic Expressions
Missing Missing Element

Kazimir Majorinc
December 4th, 2013.

0. Introduction


ustrian mathematician Kurt Gödel is known for his theorem on 'incompleteness' of the formal arithmetic expressed in the first order predicate calculus. In proof of the theorem, Gödel defined encoding of logical expressions into natural numbers that allowed him to express sentences about formal arithmetic as sentences of formal arithmetic.

Symbolic expressions are similar to logical expressions. It is not surprise, since symbolic expressions are designed explicitly to represent logical expressions in McCarthy's program Advice Taker. In this post, Gödel's encoding based on products of prime numbers is applied on symbolic expressions. That particular encoding is slightly simpler than original, due to convenient reasoning about symbolic expressions as lists. It is discussed how to redefine symbolic expressions so described gödelization is bijection, i.e. not only every symbolic expression has its code, but also every natural number is code of some symbolic expression.

1. Gödel's encoding of logical formulas

Natural numbers are assigned to all "primitive sings" allowed in logical formulas: 11 is assigned to left parenthesis, 13 is assigned to right parenthesis, 7 is assigned to logical connective ∨. Prime numbers larger than thirteen: 17, 19, 23, ... are assigned to variables x1, y1, z1, ... Logical formulas, as sequences of "primitive signs" are associated to sequences of natural numbers. For instance, (x1x2) is associated to sequence 11, 17, 7, 19, 13. The sequences of natural numbers are associated to single natural numbers using elements of sequence as exponents of prime numbers. For instance, 11, 17, 7, 19, 13 is associated to

211·317·57·719·1113 = 8131093970737510569553529892960700268640000000.

Hence, all logical formulas are associated to natural numbers. Furthermore, different logical formulas are associated to different numbers.

2. Gödel numbers of symbolic expressions

Gödel numbers can be assigned to symbolic expressions without encoding of individual characters as parentheses.

Symbols. Let us assume that symbols are only atoms in symbolic expressions, and that symbols consist only of small letters. These restrictions are not essential. Then, all symbols can be ordered first in groups of the same lengths; inside groups by alphabetical order. Such sequence would look like

a, b, c, ..., z, aa, ab, ac, ..., zz, aaa, ...

Then, these symbols are enumerated with prime numbers; a → 3, b → 5, c → 7, ...

Non-atomic symbolic expressions. If e1, ..., en, n ≥ 0 are symbolic expressions with respective Gödel numbers g1, ..., gn then symbolic expression (e1 ... en) is encoded to

2g1·3g2·... ·pngn,

where pn is n-th prime number. Specially, () is encoded to 1, (()) is encoded to 2 etc. Different symbolic expressions are encoded to different natural numbers. For each natural number that is Gödel number of some symbolic expression, original symbolic expression can be reconstructed. If Gödel number is prime number, then it is code of some symbol. If it is not prime number, then it can be represented as product of the form 2g1·3g2·... ·pngn, hence it is code of (e1 ... en), where e1, ..., en are reconstructed from Gödel numbers g1, ..., gn.  

For instance, Gödel number of (a b) is 23·35 = 1944.

3. Redefining symbolic expressions

Described encoding is simple and natural. However, it is not bijection. Some natural numbers are not codes of any symbolic expressions. For instance, 27 = 20·33 cannot be code of any symbolic expression, because there is no symbolic expression such that its code is 0. Let us introduce new symbolic expression, denoted with _ such that its Gödel number is 0. Then, Gödel number of (_ a) is exactly 20·33 = 27. As every natural number can be presented as product

2g1·3g2·... ·pngn,

where pn is n-th prime number, g1, g2, ..., gn ≥ 0, then every natural number is code of some symbolic expression.

Unfortunately, with such definition, Gödel numbers are not unique any more. There are two problems; it appears neither one is essential.

First, symbolic expression _ on the end of list does not influence encoding. Gödel numbers of all symbolic expressions (_ a), (_ a _), (_ a _ _) ... are

20·33 = 20·33 ·50 = 20·33·50·70 = ... = 27.

However, there is an interpretation of _ that allows natural identification of these symbolic expressions: _ could denote that element is "missing." For instance, symbolic expression (_ a) doesn't have first element, just second one; _ on the end of the symbolic expressions is irrelevant. Implementation of symbolic expressions as maps, as in important experimental Lisp dialect MISC, developed by Will Thimbleby might be suitable.

Second, Gödel numbers of symbols a, b, ... are same as Gödel numbers of symbolic expressions

(_ ()), (_ _ ()), ...

That problem is avoided if symbols are seen as shortens of respective symbolic expressions, like 'x is shorten for (quote x) in many Lisp dialects.


  1. Gödel, K., "On formally undecidable propositions of Principia Mathematica and related Systems I", in S. Feferman, Kurt Gödel, Collected Works, Vol. I., Oxford University Press, 1986, pp. 145-95.
  2. McCarthy, J., Programs with Common Sense, In Proceedings of the Teddington Conference on the Mechanization of Thought Processes, London, Her Majesty's Stationery Office, 1959, pp. 756-91.
  3. Thimbleby, W., MISC, retrieved 28 December 2011.