There are words that exist beyond the domain of natural language. Remember that language’s ultimate utility lies in its ability to transmit information. Yet I think we would all agree that there are types of information that are impossible to convey. In the movie Contact, based on the book by Carl Sagan, Palmer Joss demonstrates this to Dr. Arroway by asking, “Did you love your father?” Arroway responds affirmatively. Given the previous narrative in the movie, there is little doubt, and Arroway seems flummoxed by the need to answer such an obvious question. Joss responds, “Prove it.” (2:05 – 2:17 below)

An action may not prove it. Any person could feed an ill father. Any person could watch television with a father. None of these things alone or together suffice. Yet, from the perspective of Dr. Arroway, she knows it beyond a doubt. The answer lies encoded within her consciousness. So do the language centers necessary to translate the neuronal patterns of the persona into information. This could be a gap in natural language. And maybe there just haven’t been words invented for this type of evidentiary matter.

But I think that we can expand the example further to prove an important point. How do we prove that we can prove it? How do we prove that we can prove that we proved it? How can we check that? And then how can we be sure it’s reliable? The problem here is not an infinite regress. Let us assume instead that we can prove all these things and more. A mathematician named Kurt Gödel suggested with his Incompleteness Theorems that we can only come so close to perfect information without ever having it. Each attempt to obtain it pushes the goal one step further away.

Gödel’s Incompleteness Theorems, which have been compared in importance to Heisenberg’s Uncertainty Principle and Einstein’s General Relativity, say two things (from Wikipedia):

  1. Any effectively generated theory capable of expressing elementary arithmetic cannot be both consistent and complete. In particular, for any consistent, effectively generated formal theory that proves certain basic arithmetic truths, there is an arithmetical statement that is true, but not provable in the theory.
  2. For any formal effectively generated theory T including basic arithmetical truths and also certain truths about formal provability, T includes a statement of its own consistency if and only if T is inconsistent.

The first theorem listed says that in any consistent system of arithmetic there will be true statements that are simply not provable. The proof borrows somewhat from the so-called Liar’s Paradox (“This statement is false.”), but its analogue is “This statement is not provable within this system.” The difference is significant, so let us focus on the latter version. Interestingly, as Rebecca Goldstein explains in Incompleteness: The Proof and Paradox of Kurt Gödel, any effort to expand the system and prove those unprovable statements will prove futile:

…the proof demonstrates that should we try to remedy the incompleteness by explicitly adding [the paradox] on as an axiom, thus creating a new, expanded formal system, then a counterpart to G can be constructed within that expanded system that is true but unprovable in the expanded system. The conclusion: There are provably unprovable, but nevertheless true, propositions in any formal system that contains elementary arithmetic, assuming that system to be consistent. A system rich enough to contain arithmetic cannot be both consistent and complete.
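For readers who like symbols, the self-referential sentence and Goldstein's point about adding it as an axiom can be sketched as follows. This is only a compact restatement, not Gödel's own notation. The symbols are assumptions introduced here: T is the formal system, Prov_T is its provability predicate, the corner quotes denote the numeric code of a sentence, and T' is T with G added as an axiom (assumed to remain consistent and effectively generated). The corner quotes and the "does not prove" sign need the amssymb package.

```latex
\begin{align*}
  % G says of itself that it is not provable in T
  T &\vdash G \leftrightarrow \neg\,\mathrm{Prov}_{T}(\ulcorner G \urcorner) \\
  % if T is consistent, T cannot prove G
  T &\nvdash G \\
  % expand the system by adding G as an axiom
  T' &= T + G \\
  % the expanded system has its own counterpart G', again unprovable
  T' &\vdash G' \leftrightarrow \neg\,\mathrm{Prov}_{T'}(\ulcorner G' \urcorner),
  \qquad T' \nvdash G'
\end{align*}
```

Each expansion repeats the same pattern one level up, which is the sense in which the goal keeps moving one step further away.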

In truth, the expressions of arithmetic easily expand into a subset of natural language, the way in which we structure information, oftentimes even in our minds. Since arithmetic is a subset of natural language and will always keep going and going and going without ever becoming complete, natural language is doomed to the same fate, but there are other subsets that redouble arithmetic’s efforts. An attempt to make natural language (a system very different from arithmetic) complete involves going to the very end of it, putting in all the sounds and morphemes possible, as we have already discussed. In a mathematical sense, we could try to have a description and expression for everything in natural language. Every nook and cranny of meaning possible, likely including every permutation and flavor of combined meanings (n-dimensional superfactorials like !sweet and !love come to mind), would have to be revealed. Once all information on tangible things in the multiverse is discovered, there remains perceptive, intangible information locked in the conscious mind. Then there’s the unconscious mind. Then there’s… and on and on and on. This is all to say that it would be very nice to have natural language incorporate expressions for everything, but a) it’s impossible and b) it’s not practical. The energy it would take to do so would be far better spent on something else, like making pizza, curing cancer, or inventing warp drive.

What’s interesting about all this is that despite this limitation we have already transcended the limits of natural language. This is best illustrated by going back to the set of perceptive, intangible information that must be described. If all the tangible information had a certainty of 99.9999%, then we could add a “data source morpheme” to it, such as /-wa/, that indicates this. Or we could leave it unmarked and simply add morphemes for anything but 99.9999% certain information. Intangible information gets graded differently by data source morphemes. Jaqaru, a dying language of 3,000 speakers found primarily in Lima, Peru, features such data source morphemes. Dr. M.J. Hardman of the University of Florida has studied the language in detail. Her findings on data source morphemes include the following (from Wikipedia):

Data-source marking is reflected in every sentence of the language. The three major grammatical categories of data source are:

1. personal knowledge (PK)–typically referring to sight
2. knowledge-through-language (KTL)–referring to all that is learned by hearing others speak and by reading
3. non-personal-knowledge (NPK)–used for all myths, histories from longer ago than any living memory, stories, and non-involvement of the speaker in the current situation
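To make the marking scheme concrete, here is a minimal sketch in Python of the two options described earlier: mark every word with its data source, or leave one default source unmarked and mark only the rest. Everything in it is an assumption for illustration. The function `mark`, the `DataSource` names, and the suffixes "-lu" and "-ki" are invented placeholders and are not claimed to be Jaqaru forms; "-wa" simply reuses the example suffix mentioned above.

```python
from enum import Enum
from typing import Optional


class DataSource(Enum):
    """The three data-source categories from the quoted list."""
    PERSONAL = "personal knowledge"
    THROUGH_LANGUAGE = "knowledge-through-language"
    NON_PERSONAL = "non-personal-knowledge"


# Hypothetical suffixes for illustration only.
# "-wa" is the example used in the essay; "-lu" and "-ki" are invented.
SUFFIXES = {
    DataSource.PERSONAL: "-wa",
    DataSource.THROUGH_LANGUAGE: "-lu",
    DataSource.NON_PERSONAL: "-ki",
}


def mark(stem: str, source: DataSource,
         unmarked: Optional[DataSource] = None) -> str:
    """Attach a data-source suffix to a stem.

    If `source` is the designated `unmarked` category, the stem is
    returned bare, modelling the option of marking only the
    non-default cases.
    """
    if source is unmarked:
        return stem
    return stem + SUFFIXES[source]


# Option 1: every data source is marked.
print(mark("rain", DataSource.PERSONAL))
# Option 2: personal knowledge is the unmarked default.
print(mark("rain", DataSource.PERSONAL, unmarked=DataSource.PERSONAL))
print(mark("rain", DataSource.NON_PERSONAL, unmarked=DataSource.PERSONAL))
```

Under the second option a listener infers personal knowledge from the absence of a suffix, which is why only the other categories need their own morphemes.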

So where are we transcending? In the areas of natural language for which even data source marking would be difficult, if not impossible, without another data source morpheme referring to 0% certainty. I think of three areas primarily: faith, ethereality, and paradox.