In June upon my return from Hong Kong, I wrote a post called “Heisenberg Uncertainty Principle in Language” that postulated the following: “the measurement of expression necessarily disturbs a statement’s meaning, and vice versa.” I only mean to codify a common problem in translation, namely, that it is more difficult to translate some things (poetry, evocative words) than others (scientific treatises). I should note, though, that some translators are better than others; Anne Carson, for one, delivered outstanding translations of Sappho’s poetry in If Not, Winter: Fragments of Sappho.

One reason that she triumphs is that she complements the translations with a comprehensive glossary. Here is an example:

koma is a noun used in Hippokratic texts of the lethargic state called “coma” yet not originally a medical term. This is the profound, weird, sexual sleep that enwraps Zeus after love with Hera; this is the punishing, unbreathing stupor imposed for a year on any god who breaks an oath; […] Otherworldliness is intensified in Sappho’s poem by the synaesthetic quality of her koma–dropping from leaves set in motion by a shiver of light over the tree: Sappho’s adjective aithussomenon (“radiant-shaking”) blends visual and tactile perceptions with a sound of rushing emptiness.

Adding this context to the translation allows us to compensate for the Heisenberg Uncertainty Principle by giving us much of the meaning we would otherwise miss. Many linguists believe that any language is capable of expressing anything another language can, and to the extent that such an approximation is possible, this kind of glossary goes a long way toward a full translation of both expression and meaning. Another reason she is successful is that she creates new words in English, using standard word-formation rules, that give us a better sense of the original meaning. For example, as you may notice from that excerpt, she translates the Greek word aithussomenon as ‘radiant-shaking.’ Like many translators, she could have opted for ‘radiant’ or ‘quivering’ or some other simple gloss.

The translation issue isn’t merely academic. Computer scientists, engineers, and linguists have been engaged in creating and improving natural language processing, which involves everything from parsing human language into constituents a computer can process, to speech recognition programs, to text interfaces like Ask Jeeves, Bing, and Wolfram Alpha. If you type the search query “What should I do on a first date?”, the computer must first wring your intended meaning out of it and then determine the most relevant information with which to respond. Obviously, this is a gross simplification. And still more distantly, natural language processing is critical to the development of artificial intelligence. The prospect of an artificial intelligence without the ability to communicate with us is frightening, as Orson Scott Card’s discussion of varelse suggests.

The “principle” might affect NLP thus: assuming morphemes (the smallest units of meaning in language) were discrete and stored as matrices, science terms would have more characteristics whose meanings were always relevant than non-science terms, and context would change values in the matrix less (if at all) for science terms. For example, let us assume that we are storing words in a 1×23 matrix, where each column stores a binary value for one of the following categories of language: (1) Utility-Miscellaneous, (2) Utility-Mating, (3) Utility-Energy, (4) Utility-Safety, (5) Adj-Bright, (6) Adj-Dark, (7) Adj-Good, (8) Adj-Bad, (9) Noun-Person, (10) Noun-Place, (11) Obj-This, (12) Obj-That, (13) Tense-Past, (14) Tense-Present, (15) Tense-Future, (16) Probability-Unknown-Question, (17) Probability-Possible-Doubt, (18) Probability-Certainty, (19) Singular, (20) Plural, (21) Verb-Eat, (22) Verb-Sex, (23) Verb-Sense. Just act like these are all the categories of words that would have been important to you as a pre-Pleistocene human. Translating the question “How many are there?” might get you the following calculation (omitting tense and a few other umm… critical things):

PLURAL: [00000000000000000001000] +
PROB-UNK-QUESTION: [00000000000000010000000] = [00000000000000010001000]

It probably won’t elicit a response with the exact number, but it might get a response like “many,” which would just be the plural meaning along with the certainty meaning. Some cultures do not have numbers beyond, say, three; after that, they have words for certain magnitudes. So an acceptable response might look like this:

PLURAL: [00000000000000000001000] +
PROB-CERTAINTY: [00000000000000000100000] = [00000000000000000101000]
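
For the curious, here is a minimal sketch of this toy encoding in Python. The category names, the one_hot helper, and the compose function are my own illustrative inventions, not part of any real NLP library; the point is only that each morpheme becomes a one-hot 1×23 row and morphemes combine by element-wise addition:

# A toy version of the 1x23 binary "lexical matrix" described above.
# Category names and helper functions are illustrative only.
CATEGORIES = [
    "Utility-Misc", "Utility-Mating", "Utility-Energy", "Utility-Safety",
    "Adj-Bright", "Adj-Dark", "Adj-Good", "Adj-Bad",
    "Noun-Person", "Noun-Place", "Obj-This", "Obj-That",
    "Tense-Past", "Tense-Present", "Tense-Future",
    "Prob-Unk-Question", "Prob-Possible-Doubt", "Prob-Certainty",
    "Singular", "Plural",
    "Verb-Eat", "Verb-Sex", "Verb-Sense",
]

def one_hot(category):
    """Return a 1x23 binary vector with a single 1 in the named column."""
    vec = [0] * len(CATEGORIES)
    vec[CATEGORIES.index(category)] = 1
    return vec

def compose(*vectors):
    """Element-wise sum of lexical vectors: a crude stand-in for combining morphemes."""
    return [sum(column) for column in zip(*vectors)]

# "How many are there?" = PLURAL + PROB-UNK-QUESTION
question = compose(one_hot("Plural"), one_hot("Prob-Unk-Question"))
# A "many"-style answer = PLURAL + PROB-CERTAINTY
answer = compose(one_hot("Plural"), one_hot("Prob-Certainty"))

print("".join(map(str, question)))  # 00000000000000010001000
print("".join(map(str, answer)))    # 00000000000000000101000

Run as-is, it prints the same two bit strings as the hand calculations above.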

The human need for information would have transformed this caveman-style language into modern language with its recursive grammars, but I am just showing you an example of a rudimentary NLP model based on lexical storage in matrices. Why are matrices important? Depending on what you want to accomplish, you can change the dimensions and values of the categories for matrix operations that could, in turn, symbolize grammatical sentences. Perhaps you could get meaningful dot or cross products, or even develop meaningful 3D imagery based on ‘lexical vectors.’ Just as an example of the flexibility, you could convert the system I listed above in the following way: (1) Utility (Miscellaneous, Mating, Energy, Safety), (2) Adjectives (Bright, Dark, Good, Bad), (3) Nouns (Person, Place), (4) Objects (This, That), (5) Tense (Past, Present, Future), (6) Probability (Unk-Q, Possible, Certainty), (7) Number (Singular, Plural), (8) Verbs (Eat, Sex, Sense). Instead of 1×23 matrices, you’d now have 1×8 matrices with larger values inside them. The aforementioned question “How many are there?” would now end up being [00000120], with the answer being [00000320].
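
Here is a rough sketch of that compressed scheme, again in hypothetical Python; the group names and member numbering follow the list above, and the encode helper is invented for the example:

# The same toy lexicon compressed into a 1x8 matrix: one column per category
# group, with a small integer selecting the member of that group (0 = unmarked).
GROUPS = ["Utility", "Adjective", "Noun", "Object",
          "Tense", "Probability", "Number", "Verb"]

# Member numbering within a group, e.g. Probability: 1 = Unk-Q, 2 = Possible, 3 = Certainty.
PROB_UNK_Q, PROB_POSSIBLE, PROB_CERTAINTY = 1, 2, 3
NUM_SINGULAR, NUM_PLURAL = 1, 2

def encode(**features):
    """Build a 1x8 vector from keyword arguments like probability=1, number=2."""
    vec = [0] * len(GROUPS)
    for group, value in features.items():
        vec[GROUPS.index(group.capitalize())] = value
    return vec

question = encode(probability=PROB_UNK_Q, number=NUM_PLURAL)    # [0, 0, 0, 0, 0, 1, 2, 0]
answer = encode(probability=PROB_CERTAINTY, number=NUM_PLURAL)  # [0, 0, 0, 0, 0, 3, 2, 0]

print("".join(map(str, question)))  # 00000120
print("".join(map(str, answer)))    # 00000320

Here each morpheme occupies a single group, so the question and answer are built directly rather than by adding one-hot vectors.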

Assume now that you develop a vocabulary of 10,000 words using this matrix system. First, in poetry, form dictates meaning, so there would have to be values dedicated to how context changes the meaning of the words; by definition, you could not get away with fixed matrix dimensions or values. Not so with scientific treatises: no context is necessary, and these words, once stored, may remain just so, absent changes needed for syntactic purposes. Second, the traits of a scientific word, that is, the characteristics that make up its definition, will be certain. There is no doubting that an intrinsic quality of a proton is its positive charge; take that away (and add a little mass) and you have a neutron. But the word ‘love,’ the subject of so much art, requires flexibility. To the extent that it has any intrinsic meaning at all, it could have equal parts longing and affection, desire and intention to couple, none of them necessarily tied together. That means you have to attach simultaneous meanings and probabilities (sketched below), which implies significantly greater computational costs for an NLP system that could finally comprehend poetry. Finally, as our discussion of the principle suggests, the closer a translation comes to locking down the expression (as Carson does above: it takes a full paragraph to get at the full expression of koma), the more of the intended intrinsic meaning that can only be derived from the original word and context is lost. Does any essence remain?
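
To make that cost asymmetry concrete, here is one more hypothetical sketch: a science term keeps a fixed, certain feature vector, while a poetry word like ‘love’ carries weighted features that context can re-shade. The feature names and weights are invented for illustration, not drawn from any real lexicon:

# A fixed science term versus a context-sensitive poetry word (illustrative only).
proton = {"charge-positive": 1.0}   # intrinsic, never revised by context

love = {                            # simultaneous meanings with probabilities
    "longing": 0.5,
    "affection": 0.5,
    "desire": 0.5,
    "intent-to-couple": 0.5,
}

def read_in_context(word, context_shift):
    """Re-weight a word's features given context; words with no shifts pass through unchanged."""
    return {feat: min(1.0, max(0.0, weight + context_shift.get(feat, 0.0)))
            for feat, weight in word.items()}

# A Sappho-like context might push 'love' toward longing and away from intention.
print(read_in_context(love, {"longing": +0.4, "intent-to-couple": -0.3}))
print(read_in_context(proton, {}))  # unchanged: {'charge-positive': 1.0}

Every extra reading of ‘love’ means another set of weights to store and update, which is where the extra computation comes from; the proton entry never needs that machinery.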
