
In my book Cultural Entropy, I devote some time to information theory, for the concept of entropy is impossible to explain without it. Likewise, attempting an explanation of cultural information, particularly the language subset of it, without entropy, is impossible. In reading various sources about information and language, I am struck by how excellent and simple the older texts are and how confusing or negligent are the newer texts. Language Files, which is a standard text for introductory linguistics courses, shows nothing, though it does discuss pragmatics.

But before the field was called pragmatics, and when linguistics had a little more perspective, the most common linguistics textbook was An Introduction to Descriptive Linguistics by H.A. Gleason (1955, 1961). This latter book, in particular, also forms an excellent foundation for a linguistics novice being introduced to Field Linguistics, which I often analogize to amphibious warfare: the process of starting with zero firepower ashore and proceeding to dominance of the field. Field Linguistics as a practice is quite similar. A linguist arrives at a place s/he has never been, perhaps a village in remote Papua New Guinea, beginning with close to zero knowledge of the language and necessarily proceeding to learn everything, discerning a grammar, phonetic inventory, and all manner of other information. It is, in other words, a supremely practical art. Just so, Gleason’s textbook.

For the purposes of my discussion here, Descriptive Linguistics rises to the occasion as well. We begin with definitions:

The amount of information increases as the number of alternatives increases. […] Information is measured in units called… bits.  By definition, a code with two alternative signals, both equally likely, has a capacity of one bit per use. A code with four alternatives is defined as having a capacity of two bits per use…. […] The amount of information in any signal is the logarithm to the base two of the reciprocal of the probability of that signal.

This about sums up the useful parts for any schema of quantifying meaning that we might wish to undertake 50 years after the text was written. Focus on the point about alternatives. In a world with two machines communicating with each other, but only ever saying 1 or 0 and only once before the other responds, each machine has only two choices and they are both equally likely. The capacity is one bit. The machine might send its transmission in the following form: [0] or [1]. A code with four alternatives between the machines might look something like this: [0 0], [0 1], [1 0], or [1 1]. In fact, these would be all four of the alternatives, and the code has a capacity of two bits per use.
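Gleason’s two examples can be checked in a few lines of Python. This is a minimal sketch, not anything taken from the textbook: the function names `capacity_bits` and `codewords` are my own, and it assumes, as Gleason does, that every alternative is equally likely.

```python
import math
from itertools import product


def capacity_bits(alternatives: int) -> float:
    """Bits per use for a code whose alternatives are all equally likely."""
    return math.log2(alternatives)


def codewords(length: int) -> list[str]:
    """Every binary signal of the given length, e.g. '00', '01', '10', '11'."""
    return ["".join(bits) for bits in product("01", repeat=length)]


one_digit = codewords(1)   # the two-alternative code
two_digit = codewords(2)   # the four-alternative code

print(one_digit, capacity_bits(len(one_digit)))
print(two_digit, capacity_bits(len(two_digit)))
```

Each extra binary digit doubles the number of alternatives, which is why the capacity grows by exactly one bit.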

Most human communication doesn’t look like this at all. True, we do often communicate in ways that necessitate or at least allow for either/or answers that might look like [0] or [1]. But most human utterances and writing look more like what you’re reading in terms of expressing ideas, narratives, and concepts, not just yes/no or either/or responses. An example of something slightly more complicated would be the set of alternatives to the question: which U.S. President from 1980 – 2011 has been the best? You have six choices: Carter, Reagan, Bush 41, Clinton, Bush 43, and Obama. The response, therefore, could be encoded as simply as [0], [1], [2], [3], [4], or [5] depending only on which number referred to which President. Another step up in complexity would be the set of alternatives to the question: which color is the best? As a technical matter, given the number of frequencies visible to the human eye, the answer is theoretically unlimited. There is, however, a practical limit: language. Every language only has so many recognized color words at any given moment. Some have as few as two, it is believed, while others have somewhere between 3 and 11, and a good many others have considerably more. English certainly falls into the last category and every 64 or 128 pack of crayons you see in the store proves it. There are many alternatives to choose from here.
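The same arithmetic extends to the presidential and crayon examples. A rough sketch, again assuming every alternative is equally likely (which real opinions about presidents or colors certainly are not); `bits_needed` is a hypothetical helper name.

```python
import math


def bits_needed(alternatives: int) -> tuple[float, int]:
    """Return (information in bits, whole binary digits for a fixed-length code)."""
    exact = math.log2(alternatives)
    return exact, math.ceil(exact)


examples = [
    ("six presidents", 6),
    ("64-pack of crayons", 64),
    ("128-pack of crayons", 128),
]

for label, n in examples:
    exact, digits = bits_needed(n)
    print(f"{label}: {exact:.3f} bits, {digits} binary digits")
```

Six choices carry a little under three bits, so a fixed-length binary code for them needs three digits and leaves two of its eight codewords unused.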

Something that has been avoided by many linguists and information theorists until recently has been quantifying the amount of information that is actually transmitted, beyond the rote logical numerical answer suggested by Gleason in his textbook. In a response to the presidential question, if someone responds “Carter,” much more information is transmitted to a listener than just the information that Carter is the best President. Any listener will assign a probability to that outcome, meaning reflexively that probabilities have been assigned to all other outcomes, but the answer will also say something about who the person is and what they believe. Most of this other information could be called “peripheral information” as opposed to the “core information” transmitted by the response. Peripheral information is highly contextual.
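Gleason’s last definition, the logarithm to the base two of the reciprocal of the probability, covers the case where a listener does not consider the six answers equally likely. The sketch below uses made-up probabilities purely for illustration; they are not survey data, and the names are hypothetical. It measures only the core information of the answer and says nothing about the peripheral information.

```python
import math

# Hypothetical listener expectations; invented numbers that sum to 1.
expected = {
    "Carter": 0.05,
    "Reagan": 0.30,
    "Bush 41": 0.05,
    "Clinton": 0.30,
    "Bush 43": 0.05,
    "Obama": 0.25,
}


def surprisal_bits(p: float) -> float:
    """Information in a signal: log2 of the reciprocal of its probability."""
    return math.log2(1.0 / p)


def entropy_bits(dist: dict[str, float]) -> float:
    """Average information per answer, weighting each surprisal by its probability."""
    return sum(p * surprisal_bits(p) for p in dist.values())


print(f"Carter: {surprisal_bits(expected['Carter']):.2f} bits")
print(f"Reagan: {surprisal_bits(expected['Reagan']):.2f} bits")
print(f"average: {entropy_bits(expected):.2f} bits")
```

Under these invented numbers the unexpected answer “Carter” carries more bits than the expected answer “Reagan,” and the average comes out below the log2(6) that six equally likely answers would give.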

Xenolinguistics, as broadly understood, though mostly as a matter of farce, is the study of non-human languages. In May 2009, the blockbuster Star Trek premiered around the world. In one of its funnier exchanges, James T. Kirk and Uhura bring xenolinguistics to our awareness:

KIRK: So you’re a cadet. You’re studying. What’s your focus?
UHURA: Xenolinguistics. You have no idea what that means.
KIRK: Study of alien languages. Morphology, phonology, syntax. It means you’ve got a talented tongue.

Yes, typically, xenolinguistics is the study of “alien” languages, but one must permit the possibility of other languages on planet Earth, whether from ocean-dwelling mammals as seen in Star Trek IV or Elvish from Lord of the Rings, so I choose to define it as the study of “non-human” languages. Perhaps unsurprisingly, Klingon arguably does not qualify, as its creator, Marc Okrand, developed the language with human language universals, though with admittedly rare syntactic and phonetic combinations. (Of course, one must cede that languages could have developed independently on other planets, as they apparently did in Star Trek, with exactly the same linguistic universals, tendencies, and restraints as ours.) The combinations are rare because they impede cognitive processing and pronunciation, respectively.

How so?

First, regarding cognitive processing, Klingon uses an “object first” sentence structure, whereby the sentence “I hit Charlie” becomes partially inverted in Klingon as “Charlie I hit” though they mean the same thing. Very few languages in the world have this type of sentence structure, and the few that do are locked away in the Amazon or similarly remote, or possibly even undiscovered, environments. The reason object first languages, as opposed to subject first languages, are so rare is, in summary, that we tend to think linearly. Starting with an effect, not a cause, increases uncertainty and ambiguity in the brain as it processes the sentences. Therefore, it seems likely that object first languages have either evaporated with time due to others having a distinct competitive advantage, or that they never arose significantly in the first place due to their relative handicap. We would predict that such languages could only exist, all things being equal (this is a key phrase), in an environment of relative isolation, without trade and significant cultural exchange.

Second, regarding pronunciation, Klingon possesses a particularly odd phonetic inventory, yet its sounds, while not generally consistent with what occurs in human languages, can all be found in the inventory of human sounds. In other words, there are no sounds in Klingon that a human cannot make. The reason why its sounds, alone and in combination, are relatively rare in English is that they cost a lot of energy to make. The presence of harsh fricatives and gutturals is accentuated by lax (meek, in Klingon terms) vowels.

This discussion on Klingon is all to say that we really have no idea what an alien language would be like, as we are bound by certain customs and universals as human speakers. Suzette Haden Elgin recognized this problem when she wrote the science fiction novel, Native Tongue. In the novel, humans interact with aliens, but since presumably the plasticity of an adult brain is so low, only babies have the ability to learn alien languages because adult brains get overloaded by them. Therefore, Elgin’s solution to the problem is that humans force babies to interact with aliens, thereby learning alien language and serving as a bridge. Yet there are many very important reasons to believe that even babies would have difficulty learning alien languages. Our specific neural structures, as made clearer every day by neuroscientists, linguists, and psychologists, strongly impact our relationship with language. An easy way to think about this is the difference between how chimps and humans deal with language. Yes, chimps are capable of rudimentary language, expressing words with consistent referents, but they are not capable of the complex grammars we are.

The same might be true of aliens. Whether humans or aliens have the comparatively finite grammar is beside the point: the cost of information transmission seems like it will be relatively high. Whether the information transmission occurs through telepathy, or the spoken or written word, obviating the impact of impossible phonetics for the human tongue, grammars and meaning would be the most difficult barriers to understanding. But this is not to say they would be insurmountable. Logic is a fine tool to use, so long as specificity is a quality aliens value.

This is why meaning could be a problem. The physicist-cum-Nebula and Hugo Award-winning author David Brin turned the tables in his incredible Startide Rising saga. In this universe, humans, derogatorily called “wolflings” by most aliens, speak with far more ambiguity than others. It is the humans that do not value specificity, littering the language with metaphors and words that have all kinds of double or triple meanings. Someone familiar with any Chinese language would scoff at merely three possible meanings for an isolated word, as it could have many more than that. Most alien languages, such as Galactic Six or Galactic Five, do not allow for ambiguous meanings, as each word corresponds to something very specific and could not mean anything else. Some languages on Earth accomplish this feat with elaborate case systems in which certain morphemes are attached to a word, whether grammatically or morphologically, denoting its relationship to a subject, object, or other grammatical role.

The practical import of xenolinguistics is not yet that we need to communicate with alien races, of course, though this would be nice if we could find a way to do so. We would better be able to negotiate on our own behalf in the event of calamity, or just to establish beneficial trading relations. More immediately, but in light of the contributions of science fiction thinkers, consideration of xenolinguistics might help us assess the differences in meaning that need to be ironed out by natural language processors, for this is the difficulty with speech recognition programs and all manner of artificial intelligence. How will we store the information in such a way that it will convey all denotations and connotations, which may change given the context, and how will we store the context information in the word? In the book, I have a section on how natural language processors do it today and how it might improve. Unfortunately, we still have precious little real xenolinguistics to build upon for these tasks and therefore the absolute practical import is sadly very low for aspiring xenolinguists. My advice? Learn computer science.

So, halfway through the last post I forgot my original reason for writing it, as you can tell from the somewhat aimless jabbering. (Could someone at least tell me when I have wandered off the reservation? LOL.) But now I remember. To frame this discussion on paradox within the context of the first few parts, and this blog’s focus on economics, consider the following: We do not need to expand the set of words in natural language to cover every possible bit of information, though we know that we could attempt it forever without success. The reason why the endeavor is useless, however, is that the law of diminishing returns functions as well with words as it does for everything else. In microeconomics, the law of diminishing returns says (from wikipedia):

…the marginal production of a factor of production, in contrast to the increase that would otherwise be normally expected, actually starts to progressively decrease the more of the factor are added. According to this relationship, in a production system with fixed and variable inputs (say factory size and labor), beyond some point, each additional unit of the variable input (IE man*hours) yields smaller and smaller increases in outputs, also reducing the mean productivity of each worker. Conversely, producing one more unit of output, costs more and more (due to the major amount of variable inputs being used,to little effect).

Humans demand words and we use them, like our own capital, to produce and transmit information which gives great utility to each person capable of it. But we really only need so many descriptive words. At the point where the benefit from adding another word is less than the cost (the costs of memorizing, transmitting the word to enough people to be useful), and this point surely exists (take for example the extraordinarily minimal benefit from adding a word that means “soggy paper that could have been wet by any of many sources / ambiguously wet paper” versus the comparatively major cost), the word or set of words will not be added. Language is dynamic, meaning that new demand arises, and therefore so do new words, so this state need not stay forever. (In practice, languages are always changing and a prescriptivist book is archaic two seconds after it is published.)
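The cost-benefit argument can be made concrete with a toy model. Everything numeric here is an assumption chosen for illustration: the benefit of the n-th descriptive word is taken to shrink in inverse proportion to n, the cost per word is constant and positive, and the function names are hypothetical.

```python
def marginal_benefit(n: int, first_word_benefit: float = 100.0) -> float:
    """Assumed benefit of the n-th descriptive word; shrinks as n grows."""
    return first_word_benefit / n


def words_worth_adding(cost_per_word: float, first_word_benefit: float = 100.0) -> int:
    """Count the words whose marginal benefit still meets or exceeds the cost.

    cost_per_word must be positive, otherwise the loop would never stop.
    """
    if cost_per_word <= 0:
        raise ValueError("cost_per_word must be positive")
    n = 0
    while marginal_benefit(n + 1, first_word_benefit) >= cost_per_word:
        n += 1
    return n


print(words_worth_adding(cost_per_word=2.0))  # costly words: smaller vocabulary
print(words_worth_adding(cost_per_word=0.5))  # cheaper words: larger vocabulary
```

In this toy setting the vocabulary stops growing at the word whose benefit falls below its cost, and lowering the cost moves that stopping point outward.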

With most linguistic needs met, the human spirit still needs more. Humans get more utility from moving outside the scope of natural language, giving heed to faith and developing paradox as a method for coping with all the dark corners, nooks, and gaps of natural language. It is not difficult to create a new color word in a language for a particular undifferentiated shade of green, but the need may not be strong enough to do so. The need to describe concepts and ideas that do not fit into one tidy shape requires entirely new words. Languages all over the world have long struggled with these ideas, certainly of paradox. When the cost of storing information went down, our vocabulary commensurately increased in all kinds of fields where it was previously more costly than beneficial (think: color vocabulary) to store such information in the human lexicon. Freed from the onerous costs of information storage, the vocabulary for faith and paradox, that which becomes the bright and ineffable in the human experience, zealously blooms in the art of the race.

Herbert Muller realized this in a way long before I did. In his incredible book The Uses of the Past, whose aim is to talk about our relationship with history, he writes of the majestic Hagia Sophia:

Only, my reflections failed to produce a neat theory of history, or any simple, wholesome moral. Hagia Sophia, or the ‘Holy Wisdom,’ gave me instead a fuller sense of the complexities, ambiguities, and paradoxes of human history. Nevertheless, I propose to dwell on these messy meanings. They may be, after all, the most wholesome meanings for us today; or so I finally concluded.

Any interesting and useful theory of economics, linguistics, or art is doomed to immediate obsolescence without considering messy meanings.

At first blush, faith seems like a quality of knowledge that could fall under the “personal knowledge” category of data source. Faith is often a deeply personal thing, though it is just as likely to not be so in evidence. Some skeptics think knowledge of God comes from the iron fist of parents and Republicans, so that would actually fit under the “knowledge-through-language” category of data sourcing. And many beliefs that derive from faith are considered myths in some speech communities, so that might fall under the “non-personal knowledge” category. Still, faith, whether religious or otherwise, may also sometimes be the domain of something completely different. It may not be a type of knowledge at all, but rather a conclusion of the will alone with almost, if not entirely, zero basis from other sources to back it up. (In this sense, Christianity for many may not be the purest faith, since it involves reliance on The Bible and other sources generally.)

Since it seems to me that every bit of information transmitted in natural language has an implied data source element tied to it, I think natural language may have a difficult time touching the areas of faith. We may not all be entirely sure of our faith, in people, ideas, outcomes. Precisely because there is no backing for the objects of it, it is possible for the entire realm of imagination to come to the fore, leading to ever more components inside natural language and outside it as well, grasping equally unlikely and impossible ideas. (It reminds me of E Space from David Brin’s Heaven’s Reach.) Perhaps any world imagined requires a little faith (see: “Far Beyond the Stars” below).

Faith, and its linked universes, are but one manifestation of “the set of all things that are possible and impossible.” The set of all things that are possible and impossible is a large, infinite set, larger than the set of things that are merely possible. What has been, is, or will be imagined, which overlaps with the set of things possible and not, is also smaller. Since the capacity of natural language depends very much on imagination, as all texts, narratives, even biographies, are fictions (as Milosz said), language is limiting, though with its rules, it gives us the capacity to explore. This leads us to faith’s brother in the set: ethereality. This, too, could lead us beyond natural language. By this, I mean anything with one foot in our own tangible world and one foot in another, be it Heaven, Mt. Olympus, or a parallel universe running slightly slower than our own. (The distinction here between ethereality and faith is mostly false, used for illustrative purposes.) As Anne Carson showed, where the Christians have holy, the ancients have MOLY. We have the sounds, can express the word, but have no idea what the expression really means, nor the etymology. The translator encounters a brilliant, not terrible, silence. It implies entire domains of knowledge outside our grasp, words, concepts, and rules for constructing them that are beyond natural language. Their utterance in our world is but a tip of the iceberg for their meaning. Translation is stopped, worthless.

One particular set of expressions, arising from faith and ethereality, is paradox. Paradoxes in conventional discourse could mean almost anything. According to wikipedia:

A paradox is a statement or group of statements that leads to a contradiction or a situation which defies intuition. The term is also used for an apparent contradiction that actually expresses a non-dual truth (cf. kōan, Catuskoti). Typically, either the statements in question do not really imply the contradiction, the puzzling result is not really a contradiction, or the premises themselves are not all really true or cannot all be true together. The word paradox is often used interchangeably with contradiction. Often, mistakenly, it is used to describe situations that are ironic.

Paradox is probably mostly used to describe situations defying intuition. Some paradoxes in logic, like Curry’s paradox, remain as a virtue of logic, though my hunch is that it could probably be solved by a heavy dose of linguistics. If you enjoy those games, by the way, knock yourself out. The ethereal cases we discussed above do not necessarily entail paradox of any sort, even the ironic. Rather, much paradox depends on our perceptions and beliefs. For example, is it possible for someone to be both good and evil? This question relates to some deep questions of human nature that vex even those who do not think about them and lead to some profound art. Also: is it possible to be in the past and the future (and the present) at the same time?

Both questions were considered by Shakespeare, and one or both were considered by other greats, including Klimt, Milton, Spenser, and Dali. Klimt’s work pits the static, timeless, glittering gold medium versus passionate, timely, tangible action. His Byzantine and Egyptian influences command awe, not respect, because the meanings are meant to be ambiguous yet beautiful. Milton rejoices in the freedom to choose humanity possesses, showing that this freedom can lead to the most sublime of existences or the most dastardly, the glorious or the tragic. It is for us to choose, for we possess the potential for both. Spenser grasped at similar themes, as Kermode described:

The discords of our experience– delight in change, fear in change; the death of the individual and the survival of the species, the pains and pleasures of love, the knowledge of light and dark, the extinction and the perpetuity of empires– these were Spenser’s subject; and they could not be treated without this third thing, a kind of time between time and eternity.

Not just discords, but paradoxes, perhaps. Dali brought old symbols into modern art, meticulously plotting old stories for the modern era, but his “The Persistence of Memory” summons our consideration of our relationship with time. Most believe we live a linear existence, moving from one moment to the next. Dali suggested this isn’t necessarily the case. Although Dali showed you this idea quickly by painting, I have never seen a better explication in any other visual medium than this one from, sigh, yes, Deep Space Nine:

Of course, Shakespeare may have endured as the paradox specialist nonpareil. It is probably no coincidence his works stand above almost all others in their capacity to possess us. That’s because, unlike Twilight or Star Wars, they ask questions and there are no clear answers. We can consider them anew each day. Kermode, a critic where I am not, had much to say of the Bard:

Now Macbeth is above all others a play of prophecy; it not only enacts prophecies, it is obsessed by them. It is concerned with the desire to feel the future in the instant, to be transported beyond the ignorant present. It is about failures to attend to the part of equivoque which lacks immediate interest (as if one should attend to hurly and not to burly). It is concerned, too, with equivocations inherent in language. Hebrew could manage with one word for ‘I am’ and ‘I shall be’; Macbeth is a man of a different temporal order. The world feeds his fictions of the future. When he asks the sisters ‘what are you?’ their answer is to tell him what he will be. […] …and the similarities of language and feeling remind us that Macbeth had also to examine the relation between what may be willed and what is predicted. The equivocating witches conflate past, present, and future; Glamis, Cawdor, Scotland. They are themselves, like the future, fantasies capable of objective shape. Fair and foul, they say; lost and won; lesser and greater, less happy and much happier. […] The act is not an end. Macbeth three times wishes it were: if the doing were an end, he says; if surcease cancelled success, if ‘be’ were ‘end.’ But only the angels make their choices in non-successive time, and ‘be’ and ‘end’ are only one in God. The choice is between time and eternity. There is, in life, no such third order as that Macbeth wishes for.

That’s a mouthful, but you get a sense of the conceptual foldings that the reader must grapple with. Paradoxes may turn cause and effect on its head or involve contradictions. Whatever the case, they usually involve the existence of something that should not be given the truth value of other parts of the situation or statement. When one thing could suddenly mean another thing that was thought to be mutually exclusive, all kinds of possibilities unfold. This would, in turn, expand the scope of natural language and the landscapes of our human adventures. They are a means by which we can surpass our limits, thereby giving incentive to grow.

There are words that exist beyond the domain of natural language. Remember that language’s ultimate utility remains in its ability to transmit information. Yet, I think we would all agree that there are types of information impossible to convey. In the movie Contact, based on the book by Carl Sagan, Palmer Joss demonstrates this to Dr. Arroway by asking, “Did you love your father?” Arroway responded affirmatively. Given the previous narrative in the movie, there is little doubt and Arroway seems flummoxed by the need to give an answer to such an obvious question. Joss responds, “Prove it.” (2:05 – 2:17 below)

An action may not prove it. Any person could feed an ill father. Any person could watch television with a father. None of these things alone or together suffice. Yet, from the perspective of Dr. Arroway, she knows it beyond a doubt. The answer lies encoded within her consciousness. So do the language centers necessary to translate the neuronal patterns of the persona into information. This could be a gap in natural language. And maybe there just haven’t been words invented for this type of evidentiary matter.

But I think that we can expand the example further to prove an important point. How do we prove that we can prove it? How do we prove that we can prove that we proved it? How can we check that? And then how can we be sure it’s reliable? The problem here is not an infinite regress. Let us assume instead that we can prove all these things and more. A mathematician named Kurt Godel suggested with his Incompleteness Theorems that we can only come so close to perfect information without ever having it. Each attempt to obtain it pushes the goal one step further away.

In what has been compared in importance to Heisenberg’s Uncertainty Principle and Einstein’s General Relativity, Godel’s Incompleteness Theorem, encompassing both the First and Second Theorems, says two things (from Wikipedia):

  1. Any effectively generated theory capable of expressing elementary arithmetic cannot be both consistent and complete. In particular, for any consistent, effectively generated formal theory that proves certain basic arithmetic truths, there is an arithmetical statement that is true, but not provable in the theory.
  2. For any formal effectively generated theory T including basic arithmetical truths and also certain truths about formal provability, T includes a statement of its own consistency if and only if T is inconsistent.

The first theorem listed says that in the consistent system of arithmetic there will be true statements that are simply not provable. This derives somewhat from the so-called Liar’s Paradox (“This statement is false.”), though its analogue here is “This statement is not provable within this system.” The difference is significant, so let us focus on the latter version. Interestingly, as Rebecca Goldstein explains in Incompleteness: The Proof and Paradox of Kurt Godel, efforts that one makes to expand the system and prove those unprovable statements will prove futile:

…the proof demonstrates that should we try to remedy the incompleteness by explicitly adding [the paradox] on as an axiom, thus creating a new, expanded formal system, then a counterpart to G can be constructed within that expanded system that is true but unprovable in the expanded system. The conclusion: There are provably unprovable, but nevertheless true, propositions in any formal system that contains elementary arithmetic, assuming that system to be consistent. A system rich enough to contain arithmetic cannot be both consistent and complete.

In truth, the expressions of arithmetic easily expand into a subset of natural language — the way in which we structure information, oftentimes even in our minds. Since arithmetic is a subset of natural language and it will always keep going and going and going, never ending complete, then natural language is doomed to the same fate, but there are other subsets that redouble arithmetic’s efforts. An attempt to make natural language (a system very different from arithmetic) complete involves going to the very end of it, putting in all the sounds and morphemes possible, as we have already discussed. In a mathematical sense, we could try to have a description and expression for everything in natural language. Every nook and cranny of meaning possible, likely including every permutation and flavor of combined meanings (n-dimensional superfactorials like !sweet and !love come to mind), would have to be revealed. Once all information on tangible things in the multiverse is discovered, there remains perceptive, intangible information locked in the conscious mind. Then there’s the unconscious mind. Then there’s… and on and on and on. This is all to say that it would be very nice to have natural language incorporate expressions for everything, but a) it’s impossible and b) it’s not practical. The amount of energy it would take to do so would be far better used doing something else, like making pizza, curing cancer, or inventing warp drive.

What’s interesting about all this is that despite this limitation we have already transcended the limits of natural language. This is best illustrated by going back to the set of perceptive, intangible information that must be described. If all the tangible information had a certainty of 99.9999% then we could add a “data source morpheme” to it such as /-wa/ that indicates this. Or, we could leave it unmarked and simply add morphemes for anything but 99.9999% certain information. Intangible information gets graded differently by data morphemes. Jaqaru, a dying language of 3000 speakers found primarily in Lima, Peru, features such data morphemes. Dr. M.J. Hardman of the University of Florida has studied the language in detail. Her findings on data morphemes include the following (from Wikipedia):

Data-source marking is reflected in every sentence of the language. The three major grammatical categories of data source are:

1. personal knowledge (PK)–typically referring to sight
2. knowledge-through-language (KTL)–referring to all that is learned by hearing others speak and by reading
3. non-personal-knowledge (NPK)–used for all myths, histories from longer ago than any living memory, stories, and non-involvement of the speaker in the current situation
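One way to picture data-source marking is as a tag that every statement must carry. The sketch below is a loose illustration in Python, not a description of Jaqaru grammar: the class names are hypothetical and the suffix strings are placeholders invented for the example (only "-wa" echoes the illustrative morpheme mentioned earlier).

```python
from dataclasses import dataclass
from enum import Enum


class DataSource(Enum):
    PERSONAL = "personal knowledge"                  # typically sight
    THROUGH_LANGUAGE = "knowledge-through-language"  # hearing others, reading
    NON_PERSONAL = "non-personal knowledge"          # myths, old histories, stories


# Placeholder suffixes invented for this sketch; not real Jaqaru morphemes.
SUFFIX = {
    DataSource.PERSONAL: "-wa",
    DataSource.THROUGH_LANGUAGE: "-x1",
    DataSource.NON_PERSONAL: "-x2",
}


@dataclass(frozen=True)
class Statement:
    text: str
    source: DataSource  # required: no statement can be built without a source

    def marked(self) -> str:
        """Render the statement with its data-source suffix attached."""
        return f"{self.text}{SUFFIX[self.source]}"


print(Statement("it-rained", DataSource.PERSONAL).marked())
```

Because `source` is a required field, a statement with no data source at all simply cannot be expressed in this scheme.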

So where are we transcending? In the areas of natural language for which even data source marking would be difficult, if not impossible without another data source morpheme referring to 0% certainty. I think of three areas primarily: faith, ethereality, and paradox.

At its roots, language is a means by which information is transmitted by humans to each other. Typically, its contours are determined by speech communities, and linguists advocate the spoken form as language’s most important form; its most rich, varied, and dynamic form. While it is important to consider the differences between the spoken and written form, it should help to remember that sign languages are also completely robust, rule-governed, grammatical languages. American Sign Language in fact has a different grammar from English. So the central point to remember about language is that the main demand for it comes from a virtually universal need and desire to communicate.

In the past few decades, linguists have been forced to dig a little deeper into language in order to investigate just what differentiates human language, or natural language, from other forms of information transmission, both by artificial human means and by other organic means. Some bee species possess elaborate dances by instinct that allow them to communicate distance and identity. Parrots have been known to go beyond mere mimicry to some linguistic stimuli. Computers can perform extraordinarily difficult computations and hundreds of engineers work day by day to make them think more like us. Chimpanzees and other primates can learn and transmit arbitrary signals for tangible objects as well as actions. At the highest end of the spectrum of non-human language we have dolphins, Tursiops genus. I trust you already know that they are the dirtiest, filthiest joke tellers anywhere in the universe, but did you know that they are able to grasp simulations much faster than chimps, implying a much higher level of awareness?

Setting aside dolphins for a moment (dolphin linguistics are complex and far beyond the scope of this post), chimpanzees seem to perform some linguistic tasks relatively well. They can learn dozens of words. But we have learned that they possess only the most rudimentary of grammars. Words are strung together in no meaningful order, and usually repetitively. For example, “give food eat me eat food give….” (Still, some chimpanzees have been able to perform more impressive tasks, as Emily Sue Savage-Rumbaugh showed in 1990.) Natural language is different. We are able to construct sentences that are virtually infinitely long, but that are not redundant. For example, “I learned that Annie bought the gun from Bob who said that Carolina borrowed money from David who…” could go on forever as a sentence. As a practical matter, this never happens. Still, natural language has the property of recursion, which means that it can continue to spiral into its embedded phrases and dependent clauses forever; this is an important point, as we will come to see later. Don’t take recursion so seriously: the central point here is that humans can create infinite expressions from a finite set of discrete units of meaning (morphemes).

There have been a few attempts to demonstrate that the size of natural language is infinite, based on the fact that the grammar itself can spiral with new phrases forever and that the language can add words forever. Either property, independently, would lead to such a conclusion. Based on the spectrum of translation outlined in my last post, I think we can look at natural language as the set of all human languages. Fundamentally, then, natural language is a flexible system, consisting of (many) sets of sometimes, but not always, dynamic rules for combining a functionally infinite set of words in order to convey information. The reason rules are sometimes dynamic is that, while many rules persist in language, they may occasionally do so out of inertia and therefore be broken when everyone in the speech community understands the expression despite its “illegal” form. One example of this is the double negative in English. Although utterly common in the English of Chaucer’s day, the double negative fell from grace some time afterward. Still in use ever since, it is frowned upon by academicians, Strunk & White, and many other purveyors of proper English. This is why linguists frown upon prescriptivism. They do not dislike standardization; there are benefits to that. The problem is when people mistake standardization for right and wrong. Something may be standard, or it may be different, but it is not wrong, and likely is just as rule-governed as the standard form. Witness the work of Walt Wolfram.

A language can add words in several ways, as shown partly by the translation spectrum. We can combine words from the language to mean something new, borrow words from another language, or create new words out of thin air. This last could be done simply by assembling sounds currently within the language’s standard phonemic inventory into a new word. Typically, languages group various distinctive phonetic sounds into phonemes. For example, the ‘p’ in piranha is very different from the ‘p’ in stop. The first is said to be aspirated, which can be verified by placing your hand in front of your mouth when saying the word naturally. The second is unaspirated. In English, these sounds are grouped into the same phoneme /p/. In Hindi, they form minimal pairs, where switching one for the other in a word creates a different meaning. Therefore, they belong to different phonemes. Switching them in English does not affect meaning. Let’s say a language has 30 phonemes. It would be a very long time before you ran out of combinations of these sounds for making new words. And when you do run out of them for given word sizes, all you have to do is increase the word size. That gives you another virtual infinity of new word possibilities. And then you could add signs, as seen in the many sign languages of the world.
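To get a feel for how quickly the space of possible words grows, here is a minimal Python sketch. It takes the 30-phoneme inventory from the example above and assumes, unrealistically, that every sequence of phonemes is a permissible word; real languages have phonotactic rules that rule out many sequences, so these counts are upper bounds. The function names are my own, made up for illustration.

```python
# Inventory size from the "let's say" example in the text.
PHONEME_COUNT = 30


def strings_of_length(length: int, inventory: int = PHONEME_COUNT) -> int:
    """Count the distinct phoneme sequences of exactly `length` segments."""
    return inventory ** length


def strings_up_to(max_length: int, inventory: int = PHONEME_COUNT) -> int:
    """Count the distinct phoneme sequences of 1 through `max_length` segments."""
    return sum(strings_of_length(k, inventory) for k in range(1, max_length + 1))


if __name__ == "__main__":
    # Each extra segment multiplies the number of candidate words by 30.
    for k in range(1, 7):
        print(k, strings_of_length(k), strings_up_to(k))
```

Increasing the word size by one segment multiplies the candidates by the size of the inventory, which is the sense in which the supply of new words is virtually infinite.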

You could be thinking, “Well, aren’t there also infinite speech sounds? That is, phonetic sounds?” Yes. Just like the color spectrum, representing an infinite array of colors, there is an infinite array of sounds that our vocal cords can create, not to mention an infinite array of signs our hands can make. In the case of vocalizations, however, like colors, they will be grouped in “best example” phonemes. The range of them may vary from language to language, but they will stick to a certain maximum deviation from the best example and always include the best example.

Remember that my definition of natural language here includes a grouping of all languages, which then implies that all phonemes are represented. Now let us add the phonemes that we have lost, which may have existed in dead languages or simply been lost to current languages. We now have the entire array of speech sounds. By this method, all distinctive features should be represented. All these sounds can be brought into natural language, combined in every way morphologically and syntactically possible, and we wind up with an array of expressions that is boundless and infinite.

This is the scope of natural language. But there is more.

Some may remember my review of Anne Carson’s book If Not, Winter: Fragments of Sappho. Like everyone else, I adored her book and really took to her method of translation. Recently, I decided to investigate a little bit more about this talented artist and scholar. I found that If Not, Winter is hardly anomalous as a representative work.

In her essay “Variations on the Right to Remain Silent,” published in a 2008 edition of A Public Space, she confronts the boundary between linguistics and literary theory, hoping to develop a kind of a theory of silence. She doesn’t need more space than what she uses in the essay to do so.

The motivation for the essay has its roots in the art of translation. According to Carson, there are two kinds of silence to be reckoned with by the translator. Physical silence occurs where something the author intended to be there is missing, as with many of Sappho’s poems, largely lost to posterity. Carson deals with this by using brackets where the author’s intended expressions are missing, but she says translators may be as justified in some cases to extrapolate expressions. The other kind of silence is “metaphysical” silence, wherein “a word… does not intend to be translatable. A word… stops itself.” Carson gives an example from the Odyssey:

In the fifth book of the Odyssey when Odysseus is about to confront a witch named Kirke whose practice is to turn men into pigs, he is given by the god Hermes a pharmaceutical plant to use against her magic:

So speaking Hermes gave him the drug
by pulling it out of the ground and he showed the nature of it:
at the root it was black but like milk was the flower.
MOLY is what the gods call it. And it is very hard to dig up
for mortal men. But gods can do such things.

MOLY is one of several occurrences in Homer’s poems of what he calls “the language of gods.” There are a handful of people or things in epics that have this sort of double name. Linguists like to see in these words traces of some older layer of Indo-European preserved in Homer’s Greek. However that may be, when he invokes the language of gods Homer usually tells you the mortal translation too. Here he does not. He wants this word to fall silent. Here are four letters of the alphabet, you can pronounce them but you cannot define, possess, or make use of them. You cannot search for this plant by the roadside or Google it and find out where to buy some. The plant is sacred, the knowledge belongs to gods, the word stops itself.

These silences occur with words that are a subset of unknown size of the words that must be borrowed from other languages as opposed to translated. Translators must make several difficult decisions in their work from artistic and linguistic standpoints, but it is the latter that is the most important here because there is a “spectrum of translation” they must always employ. On one end are single words that translate with virtually 1:1 correspondence to words in the other language. ‘Book’ is ‘libro’ in Spanish without much confusion. Then there’re words like ‘nose’ in English that translate with but the slightest difference into 鼻 (hana). In Words in Context, Takao Suzuki shows that the area American English speakers consider the nose covers a different portion of the face than the Japanese word, although both of course include the most important functional parts. Likewise, as discussed on this blog, Paul Kay (Berkeley) has shown that speakers of almost all languages consider the best example or shade of the word red as the same, despite differing ranges of shades that could be considered red. Nevertheless, for all intents and purposes, a single word translation will do. Next we have compound and composite word translations. The word ‘television’ seems like it translates quite cleanly to 電視 (dian4 shi4) for Mandarin (or Taiwanese if we’re being cute). But there are a few issues here: 電視 is actually a composite word, much like the original, made from two morphemes that indicate ‘electricity’ and ‘being looked at’ respectively.

At this point, we can see that for much translation, there are words that some languages possess which will be difficult to translate with the same economy. From here until the middle of the spectrum, words are translated with progressively more and more morphemes in the destination language. But when a translator is faced with the problem of translating one word into a paragraph, that might defeat so much about the original: pacing, essence, and so on. And then, of course, there’s the Heisenberg Uncertainty Principle in Language, which suggests that the more words we use to describe the word to be translated in order to most closely approximate the original meaning, the more its essential meaning, in addition to other connotations, is missed. Locking down the expression so rigidly pushes out meaning. Therefore, there comes a point on the spectrum where translators must seek different methods of translation besides seeking the complete and rigid expression for it.

Carson is a master of this, as I have pointed out before. In her book of Sappho poetry, If Not, Winter, she uses words such as ‘songdelighting’ and ‘radiant-shaking.’ Instead of writing out the complete expressions, she chooses innovation. She creates novel words using standard word formation rules in the destination language that may contain more of the original meaning than an attempt at complete expression might.

The second to last point on the spectrum of translation is when a word is just borrowed without further elaboration. Carson highlights the borrowing (outright theft, I’d think) of ‘cliché’ from French. She writes:

It has been assumed into English unchanged, partly because using French words makes English-speakers feel more intelligent and partly because the word has imitative origins (it is supposed to mimic the sound of the printer’s die striking the metal) that make it untranslatable.

The latter is a good reason for borrowing a word from another language. Another reason is that a speech community possesses significant demand for a word that it does not yet have. For example, French speakers started using the word ‘email’ because no word in French concisely described such a concept and its word formation rules would likely not have led to such an economical word either. (The Académie française has tried to stifle the use of this word in favor of ‘courriel’ and I do not know the extent of its success.) A better example is the English borrowing of ‘schadenfreude’ from German, which means “taking delight in others’ misfortune.” Although I have only really heard Dorothy Rabinowitz, a Pulitzer Prize winning writer of the Wall Street Journal, use the word, I have read it on several occasions from other writers. Just beyond these words are similar words for which some meaning can never be discovered or reclaimed without being a native speaker of the language. Multilinguals know of many such words. Some brag about them. Some keep their knowledge locked away. Some of these words also depend crucially on shared temporal experience, as ‘truth’ and ‘authenticity’ mean so much more to many Czechs than most American English speakers can understand, though they can try if they read Havel, Seifert, Kundera, and maybe some Poles as well. This is a story worth telling in another post someday.

Finally, we arrive at the end of the spectrum, yet there is no guard rail or barrier, and we stand at a precipice beyond which we cannot see anything precisely: only the bright and ineffable, like MOLY. These words land in our language with a form bearing no relationship that we can trace back to any meaning. Morphological analysis stops because it can never start. Syntax? Phonology? Save yourself because the tracks have all been covered. Carson shows several examples of the bright, ineffable silences: they are all places that we cannot go. These silences may be uttered by our inner angels, the angels above, or from even more inexplicable origins. Our choice to explore them creates possibilities that we never before considered.

The nail in the coffin of scholars attempting to deny the robustness of Berlin and Kay’s theory is contained within “Universal Foci and Varying Boundaries in Linguistic Color Categories” by Terry Regier (UChicago), Paul Kay (Berkeley), and Richard S. Cook (Berkeley). Looking at the World Color Survey of 110 unwritten languages, the authors found that foci (“best examples”) of six colors (white, black, red, green, yellow, blue) are virtually universal, although the borders of the category are somewhat more malleable and given to cross-cultural difference. Much more interesting, the authors attempt to predict these boundaries based on a computational model.

Why are these models important? Certainly, they represent a wide departure from much linguistics and anthropology, though graduate students in these fields do learn basic, important statistics. Even in economics, there’s a vocal minority of professionals who scorn the utility of mathematics. Though it absolutely kills me to do so, I will (favorably) quote a Nobel Prize winning economist named Paul Krugman:

Math in economics can be extremely useful. I should know! Most of my own work over the years has relied on sometimes finicky math — I spent quite a few years of my life doing tricks with constant-elasticity-of-substitution utility functions. And the mathematical grinding served an essential function — that of clarifying thought. In the economic geography stuff, for example, I started with some vague ideas; it wasn’t until I’d managed to write down full models that the ideas came clear. After the math I was able to express most of those ideas in plain English, but it really took the math to get there, and you still can’t quite get it all without the equations.

What Krugman, who has long since stopped being an economist, is saying is that of course it all starts with ideas. But in order to develop these ideas into something scientific, you need to formalize the idea into equations. Why? Because equations eliminate ambiguity. It might take three pages to say what a solid equation says in a line. For someone trained in the practice, this is especially helpful because you can see which elements have been left out, where certain factors should be added, and it is vastly easier to challenge the assumptions of the equation or tweak them. If you don’t hold scientists to this standard, they can dance out of all kinds of things with ambiguity — indeed, this is just one of the problems with social sciences of the past. It’s what sets Gary Becker, for example, apart as a sociologist.

The idea I have that I would like to somehow formalize is how market demand works with biological properties of our eyes and chromatic qualities to provide a certain set of color words in a language. So step one would be to list the factors that such a set of color words in a language is a function of. There are many factors that might be considered in trying to formalize an explanation for the consistent order of color words in languages. In one of Harold Conklin’s better known works, recommended to me by a PhD student in Anthropology, the author shows that color words may not refer strictly to chromatic properties. Rather, they may be tied to other qualities, such as texture and/or ripeness. This implies that surface texture, in addition to brightness, hue, and focal points, belongs among the factors. Also tugging at any formalization would be the market demand for visual differentiation and the cost of spreading and maintaining a new word in the language. This latter cost certainly goes down as technology that aids in information recording improves (writing systems through printing presses and computers).

Let us assume that humans gain no utility from distinguishing between different visual frequencies of light (i.e. colors). If this is the case, then humans will use no words (or possibly one?) for colors. Since every language has at least two color words, humans do gain utility from distinguishing between colors. Now let n be the number of words for colors that a language possesses. For every language, n is greater than 1, because humans always derive enough utility to at least discriminate between lightness and darkness. Importantly, the contrast between these two “colors” is not limited to pure color values, as they apply to all colors, suggesting that humans derive the most significant marginal utility from adding these words to a language compared to any other color words.

However, if these words are stripped of their quality of brightness and instead placed on an RGB scale, we might say that black is (0,0,0) while white is (255,255,255). The length of the line connecting these points in the cubic color space, 441.673, is the longest length contained within the space, though it is not unique. Other pairs of points are also 441.673 apart: lavender/purple (255,0,255) and green (0,255,0), red (255,0,0) and light blue (0,255,255), and blue (0,0,255) and yellow (255,255,0). These are not the next sets of colors to naturally occur in human languages.
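The distances are easy to check. Here is a minimal Python sketch that treats the eight corners of the RGB cube as points in ordinary three-dimensional space and finds every pair separated by the longest possible distance. The color names are the ones used above; the variable names are my own.

```python
import math
from itertools import combinations

# The eight corners of the RGB cube, labeled with the names used in the text.
CORNERS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "light blue": (0, 255, 255),
    "lavender/purple": (255, 0, 255),
}

# Longest Euclidean distance between any two corners (the cube's space diagonal).
longest = max(math.dist(p, q) for p, q in combinations(CORNERS.values(), 2))

# Every pair of corners separated by that longest distance.
diagonals = [
    (name_a, name_b)
    for (name_a, p), (name_b, q) in combinations(CORNERS.items(), 2)
    if math.isclose(math.dist(p, q), longest)
]

if __name__ == "__main__":
    print(round(longest, 3))
    for pair in diagonals:
        print(pair)
```

The longest distance is 255 times the square root of 3, and exactly four pairs of corners achieve it: black and white plus the three pairs listed above.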

As a universal matter, when n = 3, the third word is red, but when n = 4, the fourth word is not light blue. This means that human demand for a fourth color word is not exclusively a function of contrast in a cubic color space. Additionally, there must be a basis for choosing red as the third word. Why not green or blue? Any hypothesis must therefore take into account several factors for determining the universal order of color words in human languages. Now you’re beginning to see why an equation for the linguistic marketplace that explains how humans, across time and space, create a consistent and universal order for color words from n = 2 to 11 might be helpful. In any case, it could be modified and tweaked in an orderly manner which reduces the cost of discussion, modification, and adaptation to formalization.

Now, let us look at n = 3. What sets red apart from the remaining colors? It could be many things, but we need to establish a hypothesis. One possibility is an orthogonality or angle to the white / black axis in the cubic color space, which might mean that, when the third word is added, the three points form the largest possible triangle in the cubic color space given the black / white axis. Let us test this hypothesis. The area of the triangle with vertices at black, white, and red is about 45,979.62 units squared. It is a simple matter to show that we would obtain the same area with green or blue. However, red does have the lowest frequency / longest wavelength among these candidates and, indeed, among all remaining colors. While we may believe that the latter point is conclusive, we do not yet know whether it is independent of the former method.
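Here is a quick check of the triangle hypothesis, as a minimal Python sketch. It assumes plain Euclidean geometry in the RGB cube and computes the area as half the length of the cross product of two edge vectors; the function name is my own. Computed this way, the exact value is about 45,979.6 units squared, and it is identical for red, green, and blue.

```python
import math

BLACK = (0, 0, 0)
WHITE = (255, 255, 255)
CANDIDATES = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}


def triangle_area(a, b, c):
    """Area of the triangle abc in 3D: half the norm of (b - a) x (c - a)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return 0.5 * math.hypot(*cross)


# Area of the black / white / candidate triangle for each third color.
areas = {name: triangle_area(BLACK, WHITE, point) for name, point in CANDIDATES.items()}

if __name__ == "__main__":
    for name, area in areas.items():
        print(name, round(area, 2))
```

Since the three areas are equal, triangle area alone cannot be what singles out red as the third word.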

Now let us look at n = 4. Either green or yellow will always be in the fourth position, and when n = 5, its complement, that is, whichever was not selected between green or yellow for the fourth word, will be the fifth word. This implies that humans are indifferent between the two as a universal matter, but that culture-specific factors may significantly impact the decision. Additionally, both green and yellow represent vertices on the cubic color space, although they have relatively similar wavelength and frequency.

At this point, let us look at another instructive excerpt from Sampson’s The Language Instinct Debate:

…human perception of color is mediated by [a] sensory apparatus which is not equally sensitive to all areas of the colour space. Our eyes can detect great intensity of colour in the ‘focal red’ region, for instance; conversely, in the pale blue-green region we are much less sensitive, so that the most intense color we can experience in that area is not too different from a pale grey. One would naturally suppose that if a language has few words for colours, the words it does have will refer to the strongest sensations; and a comparison of Berlin and Kay’s focal points with the regions of greatest human colour sensitivity indeed shows a near-perfect match. In this respect, then, it is true that human biology does influence the conceptual structure of human language. (Incidentally, the influence is not totally consistent across the species: one reason why blue occupies a relatively late position on Berlin and Kay’s sequence is that dark-skinned people have pigment in their eyes making them less sensitive than Europeans to blue light, and their languages correspondingly often lack a word for ‘blue’.)

So now we see that there are biological factors that may see us choosing red before green and yellow and green and yellow before others. What happens when a language deviates from this model and puts orange in there? Or pink? Our equation would have variables and coefficients describing the influence of said variables that might help explain this “market demand” factor. Conceivably, we will be running different types of regressions, comparing with different kinds of baselines, to test our predictions.

I intend, in the coming weeks, to develop just such a formalization. There are other potential uses of this approach for linguistic analysis. Languages differ in the number of “number” words such as one, seven, thirteen, and so on. They also differ in the number of family member words that are available. Asking questions about similarly universal relationships in languages could yield answers about the role of language in culture, technological achievement, as well as evolution. Even though I know this is terribly boring. 🙂

In 1969, anthropologists Brent Berlin and Paul Kay published Basic Color Terms: Their Universality and Evolution. In this work, the authors forcefully argued against the sweeping cultural relativism of the day by showing that languages, despite time and distance, almost universally possess certain color words. Contrary to many critics’ assertions, they did understand that they were not dealing in strict universal terms. Nevertheless, they tried to show that all languages had at least two color words, roughly corresponding to lightness and darkness (or white and black). If a language had three color words, the third word would be red. If four, it would be green or yellow, and the fifth would be the missing complement. This would go on for “seven stages” and up to 11 basic color words. Over time, Berlin and Kay weakened the strength of the results though the spine remained. An enormous amount of controversy has been generated as a result of this book.

There have been some powerful criticisms of this work. First, as Geoffrey Sampson recounts in The Language Instinct Debate, there are some rather extraordinary methodological faults:

Berlin and Kay list four basic colour terms for Homeric Greek, including the word glaukos. Standard reference works, such as Liddell and Scott’s Greek dictionary, say that glaukos at the Homeric period meant something like ‘gleaming’, with no colour reference, and in later Ancient Greek meant something like what its English derivate ‘glaucous’ means now: roughly bluish-greenish-grey. But Berlin and Kay’s theory requires a term for ‘black’ in a four-term system, so they translate glaukos as ‘black’. Ancient Greek had a standard word for ‘black’: melas, the root of ‘melancholy’ (black bile) and ‘Melanesia’ (black islands), but melas did not appear in Berlin and Kay’s list of four Homeric basic colour terms.

Basically the problem is that Berlin and Kay got their data from students and apparently didn’t check it very well. Another set of dissents can be generalized as follows: Berlin and Kay did not succeed in demonstrating a universal word order for color words and categories because they did not analyze the languages correctly. Either the color terms they thought corresponded to their basic, Western-centric terms did not, or….. etc. etc. In my opinion, it does seem rather clear that this is not, in fact, a universal. Despite the criticisms, other studies have confirmed the general results, and so it seems that Berlin and Kay demonstrated a very powerful tendency. But this is just as interesting as if it were in fact a universal because the reasons underlying this powerful tendency are a matter of objective reason. Therefore, the attack on relativism has held until the modern day. Additionally, if my Google searches are correct, a World Color Survey data set has been produced from which other researchers have replicated substantially Berlin and Kay’s findings.

Although a wealth of publications have discussed this problem, and I have not read all of them, I think that an economics approach may be helpful for explaining this result. (I did read one publication that suggested a theory similar to what I will suggest.) And, in any event, what do I mean by an economics approach? I mean that I wish to create an equation or series of equations, with several variables representing factors that influence the human demand for color terms, that explains this pattern and offers suggestions for why deviations may occur. The equations are based less on absolute numerical values and more on logic. One example of how logic has been brought to bear on important problems by economists can be found in Gary S. Becker’s The Economics of Discrimination (a copy of which I have apparently stolen from my friend Bob). To show one simple example:

By using the concept of a discrimination coefficient (DC), it is possible to give a definition of a “taste for discrimination” that is parallel for different factors of production, employers, and consumers. The money costs of a transaction do not always completely measure net costs, and a DC acts as a bridge between money and net costs. Suppose an employer were faced with the money wage rate π of a particular factor; he is assumed to act as if π(1 + d_i) were the net wage rate, with d_i as his DC against this factor. An employee, offered the money wage rate π_j for working with this factor, acts as if π_j(1 - d_j) were the net wage rate, with d_j as his DC against this factor. […]

Suppose there are two groups, designated by W and N, with members of W being perfect substitutes in production for members of N. In the absence of discrimination and nepotism and if the labor market were perfectly competitive, the equilibrium wage rate of W would equal that of N. Discrimination could cause these wage rates to differ; the market discrimination coefficient between W and N (this will be abbreviated to “MDC”) is defined as the proportional difference between these wage rates. If π_w and π_n represent the equilibrium wage rates of W and N, respectively, then MDC = (π_w - π_n) / π_n.
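To make Becker’s definitions concrete, here is a minimal Python sketch of the three quantities in the excerpt. The function names and the wage figures are my own, chosen purely for illustration; they do not come from Becker’s book.

```python
def employer_net_wage(money_wage: float, d_i: float) -> float:
    """Net wage as perceived by an employer with discrimination coefficient d_i."""
    return money_wage * (1 + d_i)


def employee_net_wage(money_wage: float, d_j: float) -> float:
    """Net wage as perceived by an employee with discrimination coefficient d_j."""
    return money_wage * (1 - d_j)


def market_discrimination_coefficient(wage_w: float, wage_n: float) -> float:
    """MDC: the proportional difference between the equilibrium wage rates of W and N."""
    return (wage_w - wage_n) / wage_n


if __name__ == "__main__":
    # Hypothetical numbers: a money wage of 10 and a DC of 0.25.
    print(employer_net_wage(10.0, 0.25))   # employer acts as if the wage were higher
    print(employee_net_wage(10.0, 0.25))   # employee acts as if the wage were lower
    # Hypothetical equilibrium wages of 12 for W and 10 for N.
    print(market_discrimination_coefficient(12.0, 10.0))
```

With no discrimination the two equilibrium wages are equal and the MDC is zero; the larger the gap, the larger the coefficient.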

Amazingly, Becker first published his work in 1957 (hence N standing in for ‘Negro’), but it still represents a profound attempt to modernize sociology, which is largely and almost staggeringly useless today. Just so, using such models that may be tested and for which data may be gathered may be useful for anthropology. For an amateurish approach in the context of color words, stay tuned for Part II.

In June upon my return from Hong Kong, I wrote a post called “Heisenberg Uncertainty Principle in Language” that postulated the following: “the measurement of expression necessarily disturbs a statement’s meaning, and vice versa.” I only mean to codify a common problem in translation, namely, that it is more difficult to translate some things (poetry, evocative words) than others (scientific treatises). Although I should note that some translators are better than others, like Anne Carson, who delivered outstanding translations of Sappho’s poetry in If Not, Winter: Fragments of Sappho.

One reason that she triumphs is that she complements the translations with a comprehensive glossary. Here is an example:

koma is a noun used in Hippokratic texts of the lethargic state called “coma” yet not originally a medical term. This is the profound, weird, sexual sleep that enwraps Zeus after love with Hera; this is the punishing, unbreathing stupor imposed for a year on any god who breaks an oath; […] Otherworldliness is intensified in Sappho’s poem by the synaesthetic quality of her koma–dropping from leaves set in motion by a shiver of light over the tree: Sappho’s adjective aithussomenon (“radiant-shaking”) blends visual and tactile perceptions with a sound of rushing emptiness.

Adding this context to the translation allows us to compensate for the Heisenberg Uncertainty Principle by giving us much of the meaning we are missing. Many linguists believe that each language is capable of expressing anything another language is, and to the extent that any approximation is possible, this kind of glossary really aids full translation of both expression and meaning. Another reason she is successful is that she creates new words in English using standard word formation rules that give us a better sense of the original meaning. For example, as you may notice from that excerpt, she translates the Greek word aithussomenon as ‘radiant-shaking.’ Like many translators, she could have opted for ‘radiant’ or ‘quivering’ or some other simple gloss.

The translation issue isn’t merely academic. Computer scientists, engineers, and linguists have been engaged in creating and improving natural language processing, which can be said to involve everything from parsing human language into constituents that a computer can process for speech recognition programs to text interfaces with Ask Jeeves, Bing, Wolfram Alpha, etc. If you type the following search query, “What should I do on a first date?” the computer must be able to wring your intended meaning out of it, first, and then determine the most relevant information to respond with, second. Obviously, this is a gross simplification. And still more distantly, natural language processing is critical to the development of artificial intelligence. The prospect of an artificial intelligence without the ability to communicate with us is frightening, as might be seen by Orson Scott Card’s discussion of varelse.

The “principle” might affect NLP thus: assuming morphemes (the smallest units of meaning in language) were discrete and stored as matrices, then science terms would have more characteristics whose meanings were always relevant than non-science terms, and context would change values in the matrix less (if at all) for science terms. For example, let us assume that we are storing words in a 1×23 matrix, and each column stores a binary value for a certain category of language for the following categories: (1) Utility-Miscellaneous, (2) Utility-Mating, (3) Utility-Energy, (4) Utility-Safety, (5) Adj-Bright, (6) Adj-Dark, (7) Adj-Good, (8) Adj-Bad, (9) Noun-Person, (10) Noun-Place, (11) Obj-This, (12) Obj-That, (13) Tense-Past, (14) Tense-Present, (15) Tense-Future, (16) Probability-Unknown-Question, (17) Probability-Possible-Doubt, (18) Probability-Certainty, (19) Singular, (20) Plural, (21) Verb-Eat, (22) Verb-Sex, (23) Verb-Sense. Just act like these are all the categories of words that would have been important to you as a pre-Pleistocene human. Translating the question, “How many are there?” might get you the following calculation (omitting tense and a few other umm… critical things):

PLURAL: [00000000000000000001000] +
PROB-UNK-QUESTION: [00000000000000010000000] = [00000000000000010001000]

It probably won’t elicit a response that gives you the exact number, but it might get a response like “many,” which would just be the plural meaning along with the certainty meaning. Some cultures do not have numbers beyond, say, three. After that, they have words for certain magnitudes. So an acceptable response might look like this:

PLURAL: [00000000000000000001000] +
PROB-CERTAINTY: [00000000000000000100000] = [00000000000000000101000]
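
To make the arithmetic concrete, here is a minimal Python sketch of the two sums above. It assumes the 23 categories sit in exactly the order listed, one column each; the names `CATEGORIES`, `encode`, `combine`, and `show` are my own illustrative choices, not part of any established NLP library.

```python
# Category order follows the 23-item list in the text (column 1 first).
CATEGORIES = [
    "Utility-Miscellaneous", "Utility-Mating", "Utility-Energy", "Utility-Safety",
    "Adj-Bright", "Adj-Dark", "Adj-Good", "Adj-Bad",
    "Noun-Person", "Noun-Place",
    "Obj-This", "Obj-That",
    "Tense-Past", "Tense-Present", "Tense-Future",
    "Probability-Unknown-Question", "Probability-Possible-Doubt",
    "Probability-Certainty",
    "Singular", "Plural",
    "Verb-Eat", "Verb-Sex", "Verb-Sense",
]


def encode(category):
    """Return a 1x23 binary row with a single 1 in the category's column."""
    row = [0] * len(CATEGORIES)
    row[CATEGORIES.index(category)] = 1
    return row


def combine(*rows):
    """Add rows column by column, as in the PLURAL + PROB-... sums."""
    return [sum(column) for column in zip(*rows)]


def show(row):
    """Format a row the way the text writes it: digits inside brackets."""
    return "[" + "".join(str(value) for value in row) + "]"


# "How many are there?" and a possible answer, "many".
question = combine(encode("Plural"), encode("Probability-Unknown-Question"))
answer = combine(encode("Plural"), encode("Probability-Certainty"))

print(show(question))  # [00000000000000010001000]
print(show(answer))    # [00000000000000000101000]
```

Because each category has its own column, adding two different one-column rows never produces a value above 1; adding the same category twice would, which is one reason a real system would need a rule for repeated features.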

The human need for information would have transformed this caveman-style language into modern language with its recursive grammars, but I am just showing you an example of a rudimentary NLP model based on lexical storage in matrices. Why are matrices important? Depending on what you want to accomplish, you can change the dimensions and values of the categories for matrix operations that could, in turn, symbolize grammatical sentences. Perhaps you could get meaningful dot or cross products, or even develop meaningful 3D imagery based on ‘lexical vectors.’ Just as an example of the flexibility, you could convert the system I listed above in the following way: (1) Utility (Miscellaneous, Mating, Energy, Safety), (2) Adjectives (Bright, Dark, Good, Bad), (3) Nouns (Person, Place), (4) Objects (This, That), (5) Tense (Past, Present, Future), (6) Probability (Unk-Q, Possible, Certainty), (7) Number (Singular, Plural), (8) Verbs (Eat, Sex, Sense). Instead of having 1×23 matrices, you’d now have 1×8 matrices in which each entry is 0 (unset) or the position of the chosen value within its group. So the aforementioned question “How many are there?” would now end up being [00000120], with the answer being [00000320].
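
The conversion from the 23-column scheme to the 8-column scheme can be sketched in Python as follows. This is only an illustration under my own assumptions: each group stores 0 for "unset" or the 1-based position of the chosen member, and a word may set at most one member per group. The names `GROUPS`, `compact`, and `dot` are hypothetical.

```python
# The eight groups, in the order given in the text, with their members.
GROUPS = [
    ("Utility", ["Miscellaneous", "Mating", "Energy", "Safety"]),
    ("Adj", ["Bright", "Dark", "Good", "Bad"]),
    ("Noun", ["Person", "Place"]),
    ("Obj", ["This", "That"]),
    ("Tense", ["Past", "Present", "Future"]),
    ("Probability", ["Unk-Q", "Possible", "Certainty"]),
    ("Number", ["Singular", "Plural"]),
    ("Verb", ["Eat", "Sex", "Sense"]),
]


def compact(binary_row):
    """Collapse a 1x23 binary row into a 1x8 row of member positions.

    Assumption: at most one member is set per group. A row that sets two
    members of one group has no compact form here and raises ValueError.
    """
    if len(binary_row) != sum(len(members) for _, members in GROUPS):
        raise ValueError("expected one column per category")
    result, start = [], 0
    for _, members in GROUPS:
        chunk = binary_row[start:start + len(members)]
        if sum(chunk) > 1:
            raise ValueError("more than one value set in a group")
        result.append(chunk.index(1) + 1 if 1 in chunk else 0)
        start += len(members)
    return result


def dot(a, b):
    """Dot product; on binary rows it counts the categories both rows share."""
    return sum(x * y for x, y in zip(a, b))


# The binary rows for "How many are there?" and the answer "many".
question = [int(c) for c in "00000000000000010001000"]
answer = [int(c) for c in "00000000000000000101000"]

print(compact(question))      # [0, 0, 0, 0, 0, 1, 2, 0]
print(compact(answer))        # [0, 0, 0, 0, 0, 3, 2, 0]
print(dot(question, answer))  # 1 (both rows share Plural)
```

The compact form is shorter, but it gives up something: a word can no longer be both Bright and Good, since each group holds a single value. The dot product is also easier to interpret on the binary rows, where it simply counts shared categories.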

Assume now that you develop a vocabulary of 10,000 words using this matrix system. First, poetic form dictates meaning for the words of a poem, so there would have to be values dedicated to how context changes the meaning of each word. By definition, this means you cannot work with fixed matrix dimensions or values. Not so with scientific treatises: no context is necessary, and these words, once stored, may remain as they are, absent changes needed for syntactic purposes. Second, the traits of a scientific word, that is, the characteristics that make up its definition, will be certain. There is no doubting that an intrinsic quality of a proton is its positive charge. Without it, and perhaps slightly heavier, you have a neutron. But the word ‘love,’ the subject of so much in art, requires flexibility. To the extent that it has any intrinsic meaning at all, it could be equal parts longing and affection, desire and intention to couple, none of them necessarily tied together. That means you have to attach simultaneous meanings and probabilities, which implies significantly greater computation costs for an NLP system that can finally comprehend poetry. Finally, as our discussion of the principle suggests, it also means that the closer a translation comes to locking down the expression (as Carson does above: it takes a full paragraph to get at the full expression of koma), the more of the intended intrinsic meaning, which can only be derived from the original word and context, is lost. Does any essence remain?
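
One rough way to picture the difference between ‘proton’ and ‘love’ is to attach a probability to each candidate meaning and measure the uncertainty in bits. The Python sketch below uses the standard Shannon entropy formula; the sense inventories and their weights are invented for illustration (the “equal parts” reading of ‘love’ is taken literally as four equally likely senses), and `entropy_bits` is a name of my own.

```python
import math


def entropy_bits(sense_probs):
    """Shannon entropy in bits: sum of p * log2(1/p) over senses with p > 0."""
    return sum(p * math.log2(1 / p) for p in sense_probs.values() if p > 0)


# Hypothetical sense inventories; the weights are illustrative, not measured.
proton = {"particle with positive charge": 1.0}
love = {
    "longing": 0.25,
    "affection": 0.25,
    "desire": 0.25,
    "intention to couple": 0.25,
}

print(entropy_bits(proton))  # 0.0 (one certain meaning, nothing to resolve)
print(entropy_bits(love))    # 2.0 (four equally likely meanings)
```

Under these assumed weights, a system reading ‘proton’ has nothing to disambiguate, while ‘love’ costs two bits of uncertainty per occurrence before context is even considered. Real weights would shift with context, which is exactly where the extra computation comes from.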