First Prøveforelesning for my PhD




Assignment: What are "universals"? How has the concept changed in the last century, and how does one go about identifying candidate universals, abstracting them from language data, separating them from culture-specific accretions?


Intro

I was very happy when I got this topic for a presentation. It's a little bit like being asked to describe the nature of the cosmos, the sort of thing you can't reasonably be expected to provide a definitive answer for in 45 minutes or less -- at least I very much hope this is not the expectation. And seeing as the committee here is already acquainted with my prejudices, I've resolved to take the plunge and speak my mind openly.

I think rather than beginning with a definition of 'universals', I'll begin with a history of the subject, because the definition of any theoretical construct is perhaps best understood in the context of the history which gave rise to it. To that end I'd like to start further back than just the beginning of the 20th Century.


History

Although people have been identifying what could be called universals throughout history, the preoccupation with universals is quite recent. It can be argued that when the Phoenicians invented the alphabet, they discovered a linguistic universal, namely the phoneme... but they did not think of their discovery as that of a linguistic universal in the modern sense. The Greeks, according to Robins in his Short History of Linguistics, discovered the concepts of tense, mood, case and part of speech, all of which remain nearly universal concepts used in linguistic description today. Yet as Robins points out, the Greeks were uninterested in the linguistics of anything but Greek, and developed their grammars, according to Saussure, primarily for prescriptive purposes.

The primary theoretical question which took the place of the modern debate on universals among the ancients, and indeed throughout the Middle Ages, was the natural vs. conventional debate which I discuss in my dissertation. People were as much concerned with whether language should be natural or conventional (in order that it should best serve as a medium to express truth) as they were with whether language in fact was natural or conventional. Language was regarded as consciously created in the act of speaking. It was held that the more natural language usage was, the more poetic it was, and the more conventional it was, the more precise. So the debate centered on whether poetry or precision lay closer to the truth. And it continued until at least the end of the 17th century, when Leibniz, for example, admitted that language was not after all purely conventional, but lamented the fact, because in his mind its naturalness made it less capable of expressing precise scientific ideas.

In the 18th century, a major debate arose in France which for the first time pitted the vernacular against Latin. Again, the debate centered on whether French or Latin was more mimetic or iconic, and again on which was the better medium for expressing truth. The advocates for French, such as Beauzée, Voltaire and Diderot, held that SVO word order was more mimetic and reflected a sort of underlying structure which lay closer to the essence of thought, and which the Romans then permuted to create flowery Latin sentences. Advocates for Latin word order, such as Lamy, Condillac and Batteux, argued that we did not actually think linearly, but rather outside space and time. The linearity of sentences was instead imposed by the fact that you can only say one word at a time, and therefore the freer word order of Latin, although it did not precisely express the non-linearity of thought, more accurately resembled it. Throughout the 18th Century debate, the issues and positions are remarkably similar to those which came up again in late 20th Century syntax. The primary difference seems to me to be the function of the debate. In the 18th Century, the issue was which language is the better medium to express truth. In the 20th Century, one asked what the actual nature and structure of natural language was, and hence what the universals of language were. What was formulated by the various factions in the 18th Century as more natural and hence closer to the truth was formulated in the 20th Century as more universal and hence more explanatorily adequate -- or, one could say, also closer to the truth.

With the scientific revolution, the focus in all fields of study shifted away from moral issues, and intellectuals started to ask less how things should be and more how they in fact were. In linguistics, this shift first manifested at the beginning of the 19th Century with the discovery of Sanskrit and with Bopp's formulation of regular sound change. Suddenly, we realized that languages were interrelated in a way that had never been imagined. In the 18th Century debate, it was assumed that what they called the 'analogue' or fixed word order languages were more closely related to one another than to any 'transpositive' or free word order language. Therefore French was assumed to be more closely related to English than to Latin. Since this basic premise was false, no real progress could be made in the domain of universals, nor was there much evidence that anyone thought about it. But after Bopp and the Western discovery of Sanskrit, things looked very different.

Another factor that had earlier interfered with any attempt to characterize linguistic universals was the assumption that any language not spoken in Europe was inferior in its expressive power. This notion was, of course, so thoroughly undermined by the discovery of Sanskrit that many people started assuming the opposite -- that so-called primitive languages were more 'natural', less conventional, and therefore inherently had greater expressive power. It wasn't until the 20th Century, thanks to the work of people like Edward Sapir and others who undertook to write grammars and dictionaries for languages without writing systems, that a consensus was reached that all languages were equally rich in expressive power, equally capable of expressing truth.

The debate then shifted to some degree to whether, as Whorf and Sapir proposed, these equally expressive languages created in their speakers fundamentally different world views, and whether every aspect of one language could be adequately translated into another. This debate, too, was not so different from the natural/conventional debate of earlier centuries. The 'naturalist' position is essentially that the meaning of language is somehow tied up with its form and is therefore not translatable. When the form changes, the meaning changes.

If you look at the literature in the 19th and early 20th Centuries, in, for example, the indexes of introductory texts like Saussure's Cours or Bloomfield's Language and Sapir's Language, you find no mention of 'universals', even though these works are primarily efforts to outline the basic principles which are common to all languages and to express the ways in which languages may differ. By this time linguistics had already overcome many of the earlier weaknesses. It was generally accepted that the expressive power of all languages was essentially equivalent. Language was recognized to be essentially a phenomenon of speech rather than writing. The function of the linguist was no longer prescriptive. It took us centuries to arrive at a consensus on these prerequisites for further advancement. But there was still no notion of grammar as a black box in the mind.

I suspect the transition to the modern point of view began with Jakobson's "Kindersprache, Aphasie und allgemeine Lautgesetze". Jakobson observed patterns common to both linguistic disorders and language acquisition, and in so doing drew attention to the structure of the mind as well as to the structure of language, though the paper itself did not particularly stress the structure of the mind. The paper seems to me pretty typical of Jakobson's rambling style, loosely interconnecting interesting phenomena. The idea of a black box in the mind was, as far as I can tell, developed by Jakobson's followers, and it was greatly expanded upon and generalized in the creation of generative grammar.

This innovation greatly encouraged inquiry into the relationships between similar linguistic structures: sentences, morphological structures and phonological structures. The notion of transitivity, for example, was greatly enriched into the notion of a subcategorization frame, and we were thereby empowered to express generalizations not before considered in such detail. Suddenly we had a language with which to speak about these interconnections, and the vocabulary to ask whether, for example, all these multifarious 'object-to-subject' movements had anything in common. And, of course, the organizing force behind these investigations was the search for Universal Grammar.

In the early years, the individual grammars from which UG was eventually to be derived consisted of specific rules and were generative in nature... They were dynamic and tended to express what could be rather than what could not be. By the late 70's and early 80's, emphasis was shifting away from dynamic rules toward principles and parameters, toward a more static grammar whose basic tenet was that all that was not disallowed was allowed. The grammar expressed to a much greater extent what could not be, this time also in much more general terms.

This was a positive evolution, in my view, for many reasons. The older-style grammars simply resembled computer programs. Anyone who has programmed a computer realizes that whereas the 'descriptively adequate' algorithm can, within certain constraints, be abstracted from the computer on which it is to run, the efficiency of your algorithm cannot be. And the primary criterion by which the explanatory adequacy of UG was defined could be characterized as efficiency. Of course, there are certain programming structures which both slow down the processing and take extra space, such as declaring variables that you never use, and there is rarely any advantage to introducing these. But much programming centers around the trade-off between, for example, processing speed and code size. If a computer has a very fast processor but relatively little memory, then one programming strategy or grammar will work best, and on another computer, another programming strategy will work best. There is no ultimate Truth in choosing one 'descriptively adequate' algorithm or grammar over the other, and the best decision cannot be made in the abstract without reference to the computer that is to run the program and the purposes for which the program is to be used. The analogy in grammar, as I see it, is that the best choice of Universal Grammar, the most explanatorily adequate grammar, cannot be known as an Absolute abstracted away from two critical things: 1) the physical make-up of the brain which is processing the grammar and 2) the function for which the grammar is to be used. In other words, explanatory adequacy cannot be abstracted away from pragmatics.
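Just to make the programming analogy concrete, here is a small illustrative sketch of my own (in Python, and no part of the linguistic argument itself): two 'descriptively adequate' definitions of the same function, one of which spends memory to buy speed and one of which spends time to save memory. Nothing about the function itself tells you which is better; only the machine and the purpose do.

    # Illustrative only: two equivalent definitions of the Fibonacci function.
    # Their outputs are identical -- both are "descriptively adequate" -- but
    # they spend the machine's time and memory very differently.
    from functools import lru_cache

    def fib_small(n: int) -> int:
        # Minimal memory, exponential time: every subproblem is recomputed.
        return n if n < 2 else fib_small(n - 1) + fib_small(n - 2)

    @lru_cache(maxsize=None)
    def fib_fast(n: int) -> int:
        # Linear time, but the cache grows with every distinct call: memory buys speed.
        return n if n < 2 else fib_fast(n - 1) + fib_fast(n - 2)

    assert all(fib_small(n) == fib_fast(n) for n in range(20))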

It's assumed that you can say something about the structure of the brain by looking at language, and this is true to a certain extent. A teacup will probably never be capable of processing human language. But by merely looking at the output of a computer, you cannot wholly determine its architecture, nor can you know what algorithm was used to produce that output. Similarly, in language, if you want to know the structure of the brain, at some point you are going to have to look at the brain. If you look at language, what you'll find is the structure of language.

For these reasons, I believe the change in emphasis from generative rules to principles and parameters was a positive one. The question was no longer what program actually generates these sentences. This, to my mind, is an engineering issue and can never be one of the Truth of the nature of things. The issue was now instead what underlying principles govern the structure of language. This is where we will find universals.

However, even with this improvement, the proposed solutions or explanations to the various questions raised by the notion of an innate UG -- such as what it was that the various object-to-subject movement constructions had in common -- always took the form of a formal structure; they always concerned syntax rather than semantics. Universal Grammar was still a sort of machine that addressed only the form of language, not its content.

To my mind, the primary oversight of the MIT approach is (in a word) semantics. Linguistics is practiced within this framework as if human beings had no more understanding of language than a Dell VT computer which accurately and productively generates the sentences that speakers recognize as valid... as if the capacity to generate the set of grammatical sentences implied also a capacity to understand them. The real explanations for these questions must lie not in form, but in content. True linguistic motivation will ultimately be semantic. It is function that will determine structure -- one could say function is iconically related to structure. I think it is this view that gave rise to the field of cognitive linguistics.

What must unite the various types of object-to-subject movement is not some formal analytical apparatus which correctly generates the relevant strings of words, but some meaning which they have in common. The reason the patient appears in subject position in these constructions cannot ultimately be Case absorption or some other similar mechanism. The reason must be that the patient in these sentences is playing a semantic role which we instinctively, naturally or 'iconically' identify with the subject position. That is, I think the fact that the arbitrariness of the sign holds sway in the generative tradition is directly related to that tradition's tendency to think of linguistic universals in terms of syntax rather than semantics and to think of linguistic motivation in terms of form rather than pragmatics. Conversely, in the cognitive linguistic tradition, linguistic motivation is fundamentally regarded as semantic, universals tend to be regarded as semantic rather than syntactic, and the form of language is not so hermetically sealed off from its meaning. Cognitive linguistics is therefore much more open to the possibility that there may be an iconic principle universally governing the relationship between form and meaning.

I think, however, that there is at least one understandable reason why we have tended to reject iconism as a linguistic universal. Consider for a moment what it is, essentially, that distinguishes human speech from animal forms of expression and from other human forms of expression, such as dance. What has to happen in order for an expression to be considered speech? I would argue that it is arbitrary reference. When a child 'says its first word', what distinguishes it from babbling is that we are convinced that the child is using the word to refer to something specific... in my terminology, the child is employing Semantic Association. The capacity to say something is directly related to reference. Animal communication and art forms like music or dance have the power to express something very specific, but not to say, not to convey an articulated thought like, "I enjoyed the party last week," and it is fundamentally this which distinguishes human language from other modes of expression, however sophisticated they may be... That's why, I think, we say the sign is arbitrary: because it is its arbitrariness that makes it a sign, that makes it a part of human speech... But in focussing on this arbitrariness, in intuitively recognizing that were the sign not arbitrary it would not be a sign, I believe we have vastly underestimated the extent to which iconism influences meaning.

To me it therefore seems obvious that the original language was not purely iconic in nature, nor even more iconic than modern languages, as many of my fellow iconists believe. The reason is that language by its very definition means that we are associating referents with words. And it is perfectly obvious that we cannot predict the referent of a word just by hearing it -- and neither could any Neanderthal. Any language that doesn't have arbitrary referents simply isn't a language.


Types of Universals

In an effort now to define what a universal is, I'd like to propose a 3-way classification.

1. First are universals which are innate and which need not be learned. These are the universals which concern generative grammarians, which make up universal grammar, and which the generative tradition regards as syntactic in nature. I'm inclined to guess that these innate universals are purely semantic and not syntactic, that they express only general cognitive capacities. One candidate for a universal of this type is iconism... the intuitive, synesthetic sense for what a particular form means. Others might include the capacity to refer, to classify, to form propositions or perhaps more generally to qualify. All the universals I would propose for this class are semantic in nature.

2. Second, there are universals imposed from outside by the nature of the world. We are born with an instinct for some general capacity, but the conditions of life fill in the specifics, and to the extent that those conditions are common to all people, the relevant linguistic structures will be universal. I would propose as a candidate for this type of universal that of semantic classes, particularly concrete noun classes. For example, the semantic class of animals is probably universally distinguished from that of plants, and things which can be eaten are distinguished from those which cannot. Every language, I believe, has a class of words for colors, for musical instruments and for weapons. We are probably not born understanding these universals, but we are born with the first type of universals, such as a native capacity to classify and to refer, and the similar conditions among human beings result in the near universality of a large number of semantic classes. Whorf was sensitive to the fact that semantic classes are not, however, completely universal in the way universals of the first type would be. Different cultures do, to a certain extent, slice the pie of life differently. And since the semantic class of a word is implicit in its meaning, the word cannot be directly translated into a language which encodes a different semantic classification scheme.

3. Finally, there are universals which hold as a result of dispersion. These universals are not inborn, nor are they imposed by the nature of the world we inhabit. They just happen to be similar across languages, because the languages are related. If there are such universals, then it can, of course, only be because all languages have a single origin. As a phonosemantic candidate for this type of universal, I would propose the similarity of phoneme meanings. For instance, I find the semantics of the s-t-r sequence surprisingly similar across languages, and there is a study by Salisbury (1992) which finds something similar for k-v-n. But since reference is essentially arbitrary, I suspect the semantic similarities in similar phonemes and phoneme sequences cross-linguistically are due to the fact that the languages have a common origin.

In the generative tradition, universals of the second and third types are not particularly taken into account, though Greenberg's list of universals having a specific form served as a starting point for much generative research. Nevertheless, within generative grammar, all universals are assumed to be innate, whereas at the beginning of the 20th Century, many universals were assumed to result from dispersion. You find a similar tendency to consider all universals innate in other fields as well. When one reads Frazer's Golden Bough, which sparked the field of cultural anthropology, one is struck by the overwhelming pervasiveness of the myth of the tribal king who is sacrificed on a tree. These 'king-in-tree' myths are limited to agricultural rather than hunting cultures, but apart from that, they are found on every continent, at every level of cultural sophistication. And after reading the 400th example of such a 'king-in-tree' culture/myth, you really start to ask yourself what this immense preoccupation with a guy in a tree is all about. There are many proposals in the literature, including some of Frazer's own, trying to account for the necessity of this very specific preoccupation purely in terms of the structure of human psychology. Joseph Campbell, in the first volume of The Masks of God, explains the pervasiveness and specificity in an unmysterious way... simply by dispersion. He argues for dispersion based on, for example, the similarities in the myths among cultures which are known to be related. The alternative to dispersion as an explanation for such universals in anthropology is to assume that certain similar, specific mythological images arose spontaneously in the mind in different parts of the world.

Notice that even to the generativists, the innate universals express a capacity. The other two types of universality are by definition universals of overt form rather than capacity, and they arise as a result of function or language usage. Those universals of form will, of course, by definition have the same form cross-linguistically. But those universals which concern a universal human capacity, which are part of UG, will not be expressed by the same form cross-linguistically. To the cognitive linguist, they may not be syntactic universals at all. From the generative perspective, universals of the first type also have a very definite syntax, but this syntax is regarded as expressive of a capacity, literally capable of generating a language. So these universals are regarded as having an underlying form, which differs from the syntax that manifests at the surface.


My Off the Cuff Proposal for UG

I'd like to take the opportunity to propose a hypothesis about linguistic universals which I believe to be consistent with the basic premises of cognitive semantics, while also incorporating my own findings on linguistic iconism. I would say that from a purely rational perspective, the default assumption should be that basic semantic capacities, like the ability to classify and to refer, are innate. Those universals which bear an obvious relation to universal human needs -- like the universal human need to distinguish between what is and is not food -- should be assumed to fall in the second class of universals. And those universals which don't make sense from either of these perspectives, but appear to be arbitrary and learned -- like the nearly universal semantics of k-v-n -- are most straightforwardly explained in terms of dispersion.

My hypothesis would be that UG consists in certain fundamental semantic capabilities, such as the capacity to refer, to create a proposition, to classify, to distinguish topic from focus, and so forth, and that UG in no way concerns itself with the actual syntax of language as it manifests. I conceive of UG as purely semantic. Furthermore, I would guess that all mapping from these basic semantic structures is essentially iconic. What is not iconic must be learned and is therefore not in UG, as I'll explain in a minute.

Now a basic premise of the generative tradition is that -- contrary to what I have just proposed -- a great many things which concern syntax rather than basic semantic capabilities are universals of this first type in UG. Whereas in the generative tradition, all of UG concerns syntax or form, from the perspective I present here, none of UG will concern syntax, but only semantic capacities. I also don't envision UG in the manner of generative semantics, because that too is form. Furthermore since the dictum of the arbitrariness of the sign has held sway in the generative tradition, it is assumed that none of the universals of the first type are iconic... that indeed iconism has no linguistic function whatever. Were I (in the absence of a proper investigation) to make my best educated guess, I would say the opposite is true. I'd wager that all mapping from UG is iconic, and whatever isn't iconic is learned and can in principle vary from language to language.

For example, when I learn a new language, I do not need to be taught that propositions will manifest as phrases. I already know this instinctively or iconically. What I don't know is what form a phrase takes in the language I am learning. I know that a proposition requires that a relationship between words be expressed, and there are two obvious ways in which such a relationship might be expressed within the linear constraints imposed by speech -- namely by word order and by inflection, or perhaps a combination of the two. I think it probable that the iconic capacity to, for example, correlate propositions with phrases takes over a lot of the labor which some theories ascribe to purely formal structures. The reason we have some success in formulating the rules of anaphora in terms of syntactic phrases rather than in terms of propositions is that syntactic phrases are iconically related to propositions, though the exact form these phrases take must be learned. So I don't know which exact forms the language I'm learning uses until I'm explicitly taught. And you find that the variations pretty much run the gamut: you find SOV and SVO, prefixes and suffixes and infixes and vowel shifts. Any non-randomness I couldn't attribute to iconic mappings between semantics and syntax, I'd tend to ascribe to universal function or dispersion.

The primary argument that these formalisms are universals of the first type is that language is learned so quickly, and it is so complex, that if we did not have innate structures in the brain primed to pick it up, we would not learn it as efficiently as we do. I take no issue with this. I only take issue with the premise that iconism plays no significant role in language. If you acknowledge that iconic factors bear much more of the semantic weight of a language than has hitherto been assumed, then this could go a long way toward explaining how we learn to speak so quickly.


Identifying Candidates

Now I'm asked to address the question of the actual process whereby one identifies candidate universals. This is in a way the trickiest part of the presentation, and I finally concluded that the only way I can see to answer this question satisfactorily is to make a rather personal statement about the nature of the process involved. Although I have a hypothesis now that all mapping from UG to syntax will be iconic, I do not recommend taking that as a starting point and then going out and trying to prove it. On the subject of methodology, I am a firm believer in data grubbing.

To this end, I think it is best to start off with the universals of form (the last two types). In order to discover these, you must, as far as I can tell, simply do the obvious: you have to go through all the relevant data for all languages. In the case of the relationship between sound and meaning, you have to itemize all the semantic classes and all the words that fall into them in all languages and run statistics on the phoneme distributions within those classes. Unless you are clairvoyant, I don't believe there is any other way to truly know what specific structures or relationships are universal.
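As a purely hypothetical illustration of what that kind of tally might look like in practice (the class labels and word lists below are invented for the example, and letters stand in for phonemes, so this is a sketch of the bookkeeping rather than actual research data), one could compare the segment distribution inside a semantic class with the distribution in the lexicon as a whole:

    # Minimal sketch, assuming a hand-tagged lexicon mapping semantic classes
    # to word lists. Classes and words are invented for illustration; letters
    # stand in for phonemes, whereas a real study would use transcriptions.
    from collections import Counter

    lexicon = {
        "treading": ["stamp", "stomp", "tamp", "tromp", "tramp"],
        "stillness": ["sit", "stand", "stay", "rest"],
    }

    def distribution(words):
        # Relative frequency of each segment across a list of words.
        counts = Counter(ch for w in words for ch in w)
        total = sum(counts.values())
        return {ch: n / total for ch, n in counts.items()}

    overall = distribution([w for ws in lexicon.values() for w in ws])

    for cls, words in lexicon.items():
        dist = distribution(words)
        # Segments over-represented in this class relative to the whole lexicon.
        skew = {ch: round(p / overall[ch], 2) for ch, p in dist.items() if p > overall[ch]}
        print(cls, skew)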

You start, then, with what you can say with a fairly high degree of certainty, and you gradually build up an empirically determined base of correlations. I think you function most effectively if you don't begin with any preconceived notions about how the result will turn out. The primary reason for this is that if you are focussed on the answer to a specific question, you tend to overlook little things. I'd say the most effective starting point is therefore not so much a specific question as a general area of interest. I do formulate my questions as precisely as I can, but then I leave the question hanging there. I don't go looking for the answer. I wait for the answer to come to me if it chooses. I find that if I go looking for the answer, like 'what are the universals?', I tend to jump the gun and try to answer questions for which I don't really have the prerequisite concepts. If I simply begin with what I know and start classifying and organizing and observing, then the answers tend to come in an order that I can make sense of, and I have the feeling that the resulting conclusions are also much less contrived, more real. I just try to explore and leave myself open to whatever I might find.

So, for example, if you are interested in the universals of the sound/meaning relationship, you try to find some little corner from which you can start, which takes the form of some range of data whose nature you think you can determine with a fair degree of certainty using some method you envision. In this case, for example, I asked myself whether I know with a fair degree of certainty that 'stamp' and 'stomp' and 'tamp' and 'tromp' all share some element of meaning, all fall into a semantic class together. I think they do, so that serves as my starting point. I might then be tempted to ask myself whether I know what it is exactly that distinguishes these words from 'sit' and 'stand' or 'walk' and 'run', so that I can formulate such a universal. In my case, I decided that I was much more certain about the fact that they belonged together than I was about why they belonged together. So the question of why they belong together, or what specifically they share in common, I leave hanging there to be answered or not answered as the case may be. Having now pursued this line of research fairly diligently for several years, I find I fully agree with Eleanor Rosch that what these words have in common, and what distinguishes them from other similar semantic classes, is not at all a straightforward question, and that it involves a great many things, the nature of which I had not even imagined when I started. Had I tried to force the issue, I would probably have come up with some contrived answer, because in retrospect, having understood the issues better, I can see that I was in no position to answer the question when it first arose.

However, if you are attentive while doing these exercises, you can discover not only correlations between various sets of data, but also universal laws which you can propose as responsible for whatever patterns you find. These laws will never be definitively right forever and always. Newton's laws of mechanics will one day be qualified by an Einstein. But the chances that what you discover will prove to be fairly fundamental increase dramatically, I find, if you confine yourself to these rules of conduct: that you don't decide on the answer, or even the question, until the data shows it to you, and that you not be too proud or too lazy to roll up your sleeves and read thousands of words in alphabetical order.

Another fundamental tool is reason. Reason must take precedence over convention, over received wisdom, over a promotion, over a degree, even over what seems possible... and sometimes (but happily not always) you have to choose. When I started out, it quite frankly seemed impossible that form should affect meaning in every word. I was confronted with a paradox that at the time had no apparent resolution. My classifications were telling me that form was affecting meaning in every single word, and yet it was obvious that I couldn't predict the meaning of a word from its form. So I reasoned that since both were in fact obviously true, there must be some way in which they could both be true... It wasn't a matter of simply rejecting the one or the other. I then tried to reason how they might both be true at the same time, and the only thing I could think of was that I wasn't clear enough about what 'the meaning of a word' actually was. And that's what led to my theory.

In the process of doing the heavy labor of empirical science, I find you get an instinctive sense for what is going on. The sense for how to formulate a universal law does not come about by any predictable procedure. It's not a linear process. Somehow, when you simply involve yourself with something intensely for an extended period of time and approach it from many different angles, the mind starts running its own analytic processes in the background, and things simply become clear to you. You are assisted by some mysterious capacity of the brain to process information subconsciously and non-linearly. And answers start to appear. You need only provide the brain with sufficient data, without adulterating it with preconceived notions, for it to run its own course. Your positive actions are relatively mechanical, but your conscious mind is relatively passive, in a listening mode rather than a producing mode. Your subconscious mind does most of the work, and only occasionally delivers a result to the conscious mind. That's my experience.

So I now have -- thanks to this assignment -- a two-week-old hypothesis that all mapping from UG is iconic and that what isn't iconic is learned. But if I work in the way I've described, even when I find my initial hypotheses proven wrong, as they so often are, it's not tragic, because I was never very invested in them in the first place, and something much more wonderful tends to lie behind them. When I first started this phonosemantic research, I hypothesized that the extent of sound-meaning correlations was relatively limited, and was simply curious about what that extent was. When the horizons kept expanding and expanding, any disappointment I might initially have felt at being proven wrong was dwarfed by my wonder at what in fact proved to be right.


