SH-Data
Presented as the second of two examination lectures for my PhD
One of the areas I didn't cover in much detail in my dissertation was that of cross-linguistic phonosemantics, so I thought it would be interesting to provide an example of such an experiment in this second presentation. The method I used is as follows:
You begin by choosing some phonological form. It could be all the monosyllables ending in /f/. It could be all the disyllables. I chose the Norwegian and English monomorphemes beginning and ending with /S/. More accurately, I should say I chose the unbound Norwegian and English root morphemes beginning and ending with /S/ -- that is, I included the Norwegian infinitives which have an inflectional -e ending.
I chose this characterization in part because there are a similar number of words in the two languages which conform to this characterization, and that makes the data a little easier to read. Also, as far as I could tell, Norwegian /S/ isn't quite as cognate with English /S/ as some of the other consonants like the stops and nasals, and this makes the data a little more interesting from a phonosemantic perspective. I tried to do a full etymological comparison of these two vocabularies, but I didn't have enough etymological information to determine which words really were cognate.
Once you've identified all the relevant words -- and it's important that you have a fairly complete list, or your statistics are meaningless -- you sort them into phonesthemes, which I define as follows:
Phonestheme: A phonological form which is disproportionately associated with a particular limited semantic class.
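One way to make "disproportionately associated" concrete is to compare the share of a form's words that fall in a semantic class with the share of the whole vocabulary that falls in that class. The following is a minimal sketch of such a ratio. The function name, the measure itself, and the counts in the usage line are assumptions made for illustration, not figures or methods taken from this study.

```python
def disproportion(form_in_class: int, form_total: int,
                  vocab_in_class: int, vocab_total: int) -> float:
    """Ratio of a class's share among words of one phonological form
    to its share in the vocabulary as a whole (a hypothetical measure).

    A value well above 1.0 means the form is over-represented in the class.
    """
    if form_total <= 0 or vocab_total <= 0:
        raise ValueError("totals must be positive")
    if vocab_in_class <= 0:
        raise ValueError("the class must occur somewhere in the vocabulary")
    form_share = form_in_class / form_total
    vocab_share = vocab_in_class / vocab_total
    return form_share / vocab_share


# Made-up counts for illustration only: 20 of 200 words of some form fall in
# a class that holds 100 of the 10,000 words in the vocabulary overall.
ratio = disproportion(20, 200, 100, 10_000)
```

How far above 1.0 the ratio must be before a class counts as a phonestheme is a judgment call that this sketch does not settle.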
The best known example of an English /S/ phonestheme is the phonestheme for 'destruction' or 'violent contact' which holds primarily of monosyllables which have the /S/ in final position in words like 'crash', 'bash', 'smush', 'squish', 'smash'. You go through each word individually and if it fits in a phonestheme, you classify it as such. You also make a note of those words which resist classification into phonesthemes.
Once you have a phonesthemic or 'phonosemantic' classification for the English and Norwegian /S/ words, you compare the Norwegian and English /S/-classifications. So you identify the Norwegian phonestheme which corresponds to the English phonestheme of 'violent contact' if there is one. In this case, there was none, so I went through each of the Norwegian /S/ words individually and picked out those which fit in the English 'violent contact' class. Then I did the same thing for each of the Norwegian classes.
Once you have completed the comparison, you have an overview of the semantic classes common to Norwegian and English /S/, an overview of the semantic classes unique to each of Norwegian and English /S/, and you also in the process get a sense for the kinds of words which defy phonosemantic classification altogether.
Now how do you know to what extent these classes in fact represent disproportions? There is only one way to know and that is by using a control. The best control would be to have an overview of all the semantic classes in each of the two languages along with statistics over the distributions of phonemes within these classes and within the language generally. I didn't have this, though I have better statistics for the distributions of phonemes in English semantic classes than in Norwegian. In this case, I used as a control merely a different phonological characterization. I chose phonemes which were similar to /S/, which means that the semantics will also be more similar to that of /S/. So my control consisted of all the English monomorphemes beginning and ending with the unvoiced affricate 'ch' (/C/) and the Norwegian unbound roots beginning and ending with the unvoiced fricative 'kj' (/K/).
You then try to put these words into the phonesthemes you've devised for Norwegian and English /S/. If you find that the English /C/-words fit into the /S/ phonesthemes as well as the /S/ words do themselves, then the experiment has provided no evidence that phonemes are meaning-bearing. If, however, you find that the /C/-words fit in the /S/ phonesthemes considerably less well than the /S/-words do themselves (as in fact you consistently do find), then the experiment does provide evidence that phonemes are meaning-bearing.
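The control comparison can be sketched as a small calculation: take a classification, count the words that fit none of its classes, and compare that rate for the /S/ words and for the control words. This is only an illustration under stated assumptions. The word lists and class memberships below are toy judgments, not the study's data, and the name `exception_rate` is hypothetical.

```python
def exception_rate(words, classification):
    """Return the fraction of `words` that fit none of the classes.

    `classification` maps a class name to the set of words judged to fit it.
    """
    words = list(words)
    if not words:
        raise ValueError("need at least one word to classify")
    # Every word that was placed in at least one class.
    classified = set().union(*classification.values())
    return sum(1 for w in words if w not in classified) / len(words)


# Toy judgments for illustration only; these are not the study's data.
s_classes = {
    "shake": {"shake", "shiver", "shudder", "twitch"},
    "cutting": {"shave", "shear", "chop"},
}
s_words = ["shake", "shiver", "shudder", "shave", "shear", "shoe"]
c_words = ["twitch", "chop", "chair", "cheese"]

s_rate = exception_rate(s_words, s_classes)  # the /S/ words in their own classes
c_rate = exception_rate(c_words, s_classes)  # the control words in the /S/ classes

# The control fits worse when its exception rate is higher.
control_fits_worse = c_rate > s_rate
```

The sketch only reports which rate is higher. How large the gap must be to count as "considerably" worse is left to the investigator's judgment.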
The results can be summarized as follows:
The percentage of words which resist phonosemantic classification hovers around 7-10%. This number would be considerably lower if we confined ourselves to monosyllables. The reason for this is that there is a much higher percentage of concrete nouns in the polysyllabic monomorphemes than in the monosyllables. Polysyllabic monomorphemes are more frequently loan words which refer to concrete things which have been borrowed into the culture from other languages, such as 'enchilada' and 'giraffe'. Monosyllables are more frequently native to English and Norwegian, and have consequently evolved many more idioms and metaphoric associations on average.
If I say, "This is a chair," when pointing to a chair, all English speakers will agree with me. And if I say, "This is a pickle," when pointing to a chair, then all English speakers will disagree with me. In other words, speakers largely agree on what set of objects in the world constitute the set of chairs and the set of pickles. On the other hand, if I say, "This is comfortable," then suddenly, I get much less consensus. It's this quality of non-ambiguity of reference which seems to run interference with the salience of sound-meaning.
For the sake of simplicity, I have defined the concrete nouns as a list of semantic classes:
Concrete Noun Classes: people, titles, body parts, clothing, cloth, periods of time, games, animals, plants, plant parts, food, minerals, containers, vehicles, buildings, rooms, furniture, tools, weapons, musical instruments, colors, symbols, units of measurement
But what the words in these classes have in common is that their referent is much more likely to be unambiguous than in most semantic classes. Consider another example. Many of the exceptions in the Norwegian list are words for animals. For example, the Norwegian word for 'giraffe' -- 'sjiraf' -- begins with /S/. There is no other word for 'giraffe'. There are no synonyms, no confusion about which things are or are not giraffes. On the other hand, one could imagine an activity which might easily be described as any one of 'mash', 'mush', 'squash', 'squish', 'smash' or 'smush'. Each of these words admittedly has a unique connotation. However, any one of them might be applied to this activity.
Now if I have a wad of paper, my daughter might say to me, "Smish it down tightly." In this moment, she has invented the word, 'smish'. The word has not arisen as a result of regular sound change from some Indo-European root. Rather, the word was created spontaneously by analogy with 'smash' and 'squish'. I think she is more likely to choose the vowel /i/ in this context for two reasons. One reason is that the size of the oral cavity is narrower during the articulation of /i/ than during the articulation of the other short English vowels. The other reason is that there is a higher percentage of words containing /i/ which concern smallness than words which contain other vowels. This type of word formation is more accessible if the referent is more ambiguous. I think it's for this reason that the exceptions to phonosemantic classification tend to be concrete nouns.
For this reason as well, it is connotative meaning rather than denotative meaning which is preserved by means of this type of word formation by analogy. By contrast, it is on the basis of denotative meaning that we determine the rules of regular sound change. For example, it has been observed that Latin /p/ correlates with Germanic /f/, as in 'pater' and 'father'. But this sort of observation must be based on a comparison of concrete nouns - words with unambiguous referents, such as 'father' and 'fire'. You cannot use words like 'funk' and 'fluster' to make the comparison, because they have no clear correlates in the other language. The one correct basic translation for 'father' into Latin is clear. But what is the one correct translation of 'funk'?
Before I compared the Norwegian and English /S/ classifications with one another, I first tried to fit each individual Norwegian word in the English classification, and each individual English word in the Norwegian classification. In the first case I got 6% exceptions and in the second 10%. These words were almost without exception not concrete nouns. The reason for this is that the concrete noun classes are the same in Norwegian and English. Indeed, I find that the concrete noun classes tend to be among the most universal cross-linguistically.
Finally, I ran the control. First I formed a phonosemantic classification which was confined only to /C/ in English and /K/ in Norwegian. In the first case, I found that 8% of the words resisted classification into phonesthemes, and these were all concrete nouns. In the case of /K/, I found 11% exceptions. You also find more exceptions the smaller the set of words that you are classifying. You can perhaps see this most clearly by taking an extreme example. If you are classifying only 3 words, then if one of them doesn't fit with the other two, you immediately get 33% exceptions. And if two of the words don't fit with any others, you get 100% exceptions.
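The effect of set size on the exception percentage is plain arithmetic, and a short sketch makes it easy to see. The function name below is hypothetical, and the set sizes of 30 and 300 are chosen only for contrast with the 3-word case from the text.

```python
def exception_percentage(n_exceptions: int, n_words: int) -> float:
    """Percentage of a word set that resists classification."""
    if n_words <= 0:
        raise ValueError("the word set must not be empty")
    if not 0 <= n_exceptions <= n_words:
        raise ValueError("exceptions must be between 0 and the set size")
    return 100.0 * n_exceptions / n_words


# One stray word weighs far more heavily in a small set than in a large one.
for n_words in (3, 30, 300):
    print(n_words, round(exception_percentage(1, n_words), 1))

# With 3 words, two that fit nowhere leave the third without a partner,
# so all 3 count as exceptions.
all_exceptions = exception_percentage(3, 3)
```

This is why exception percentages from very small word sets should be read with caution.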
I then tried putting the English /C/ words in a union of the Norwegian and English /S/ classes, and I got many more exceptions -- 31%. Similarly, when I tried to classify the Norwegian /K/ words into a union of the Norwegian and English /S/ classes, I got 41% exceptions.
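To put the control result in proportion, the reported percentages can be set side by side. The sketch below uses only the four figures given in the text (8% and 11% for the control words in their own classifications, 31% and 41% in the /S/ classes). The dictionary layout and the name `increase_factor` are assumptions made for illustration.

```python
# Exception percentages as reported in the text.
reported = {
    "English /C/": {"own classes": 8, "/S/ classes": 31},
    "Norwegian /K/": {"own classes": 11, "/S/ classes": 41},
}


def increase_factor(own: float, other: float) -> float:
    """How many times larger the exception rate is in the other classification."""
    if own <= 0:
        raise ValueError("the baseline rate must be positive")
    return other / own


factors = {
    name: round(increase_factor(r["own classes"], r["/S/ classes"]), 1)
    for name, r in reported.items()
}

for name, factor in factors.items():
    print(f"{name}: about {factor} times as many exceptions in the /S/ classes")
```

In both languages the control words produce between three and four times as many exceptions in the /S/ classes as in their own classification.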
It has been assumed through much of the linguistics literature that disproportions such as these didn't exist, or if they did exist, that they were not significant, that they at best represent historical artifacts. Assuming that I have conducted the tests competently and the disproportions do in fact exist, I think there are several reasons why the disproportions are significant.
First, the exceptions can be characterized in a uniform way -- they are all concrete nouns. To me this suggests, as I have indicated, that some universal psychological process or natural law is responsible for the disproportions.
Furthermore, had there been no productive process which actively caused a coherent and unique semantic domain to be associated with each phoneme, I think that regular sound shifts would long since have worn away any phonosemantic irregularities which one might be tempted to attribute to historical artifact. For example, the youth in the city of Bergen are consistently replacing the /K/ phoneme with /S/. The annihilation of an entire phoneme in a language is really a huge change, yet it takes place without all that much ado from one generation to another. If there were no productive clustering process, such a sound shift would surely have annihilated any disproportions between phonological form and semantic class.
There is yet another type of interesting evidence that the disproportions are significant. If you look at words within the same semantic class which have different phonological forms, you find a semantic difference between them. For example, consider the words for 'shaking':
Shaking
Norwegian, initial -- shake (dance), sjalte (switch on/off), sjangle (stagger, sway), sjevre/sjever (flap, quiver), sjoge (roar, swish -- said of water), sjokk-e (shock, swish), sjuske (swish), sjø (wave, ocean), skimle [omkring] (scurry around), skjelle (rattle, cold wind), skjelv-e (shake), skjen-e (swerve), skjær-e (swerve), skjørpe (snort), skyll (wave), skyndel (shuttle), skyttel (shuttle)
English, initial -- chinook, shake, shamble, shatter, shift, shimmer, shimmy, shinny, shiver, shock, shrivel, shrug, shudder, shuffle, shuttle
English, final -- brush, lash, swash, swish, swoosh, thrash, thresh, whish, whoosh
Norwegian kj -- kik (whoop (cough), twist), kikne (get a kink), kikse (play marbles), kile (bowling, push, elbow), kink (kink), kinks (toss of head), kiper (twill), kippe (toss, bink), kise (wink), kjappe (row), kjeng (cramp), kjøve (choke), kyngje (have trouble swallowing), kyrkje/kyrkne (get caught in the throat), kyve (bend down)
English ch -- flinch, twitch, winch, wrench
The /S/ words for shaking tend to be smooth and repetitive, encountering little resistance, although their period and shape vary somewhat depending on what other phonemes appear in the word. The closest matches I could find containing /K/ may perhaps not even be characterizable as shaking. They are at any rate not repetitive. They are irregular and they imply some obstruction or difficulty. There's a lot of coughing, kinks and cramps in /K/. The English /C/ words which most closely resemble the shaking words which contain /S/ imply a series of discrete events, as if in a chain, rather than a smooth motion. I think of /C/ as 'digital', whereas /S/ is more 'analogue'. The /C/ words also imply difficulty or obstruction, as is indeed typical of the stops. These words are also related to /C/ words of grasping or tightening: cinch, clench, clinch, crouch, crunch, crutch, haunch, hunch, retch, scrunch, slouch.
And interestingly, if you look at the 'cutting' words containing these 3 consonants, you find a similar pattern:
Cutting
Norwegian, initial -- shovel (dozer), sjeide (separate rock), skie (kindling), skill-e/skjell (split), skisma (schism), skive (slice), skived (kindling), skjere/skyru (sickle), skjerp (mining prospect), skjær-e (cut), skjølp (gouge)
English, initial -- shave, shear
English, final -- gash, hash, lash, slash
Norwegian kj -- kjakse (cut unevenly), kjangle (hack), kylle (cut tops of trees) (very imprecise compared to 'sj')
English ch -- champ, chew, chink, chip, chisel, chomp, chop, chow, chuck, chug, chunk, churn; brunch, crunch, etch, munch, scotch, scratch (much more emphasis on individual repeated chopping motions)
The /S/ words tend to imply a smooth cut which encounters little resistance. The corresponding /K/ words are irregular and do imply resistance. Whereas the /S/ words seem to imply a sharp blade, the /K/ words seem to imply a dull one. The /C/ words, once again, imply a series of discrete events. Whereas the cutting words containing /S/ are more likely to be resultative, because they encounter little resistance, the typical cutting word which begins with /C/ encounters much resistance, and you therefore only accomplish a part of the task each time. Cutting in /C/ is chopping, chipping and chiselling. Interestingly, there are many words containing /S/ to describe the result of /S/ cutting, namely a division or slice. Similarly, there are many words containing /C/ which describe the result of /C/-cutting, namely nitches and notches.
The theory I propose to account for these disproportions, I call Semantic Association. Semantic Association is the tendency to associate with any linguistic form a unique and coherent semantic domain. For example, when a child learns the word, 'pillow', it makes a range of assumptions which make it possible to learn the word: 1) The child assumes a certain constancy in the referent... that the referent for 'pillow' will not change tomorrow afternoon. 2) The child assumes that all pillows will resemble one another in some way. 3) The child assumes that 'pillow' does not refer to everything, that the word 'pillow' has a limited referent.
We acknowledge that Semantic Association applies on the level of the word and the morpheme, in other words that words and morphemes are meaning-bearing. If we assume that Semantic Association also applies on the level of the phoneme, I believe this would account for the disproportions observed. I think the primary reason that we haven't accepted that Semantic Association is active on the level of the phoneme is that phoneme meanings are subconscious. We are semi-conscious of morpheme meanings. We are wholly conscious of some aspects of word meaning, but we are for the most part wholly unconscious of phoneme meanings.
Now consider the classes common to Norwegian and English /S/:
Class | N.I. /S/ | N.F. /S/ | E.I. /S/ | E.F. /S/ | N. /K/ | E. /C/ |
Shake | 17 | 0 | 15 | 9 | 16 | 4 |
Change | 5 | 0 | 5 | 0 | 0 | 7 |
Fuzzy | 3 | 1 | 3 | 3 | 0 | 0 |
(Track) | 5 | 0 | 4 | 0 | 0 | 1 |
(Walking) | 3 | 0 | 1 | 2 | 0 | 0 |
Destruction | 8 | 1 | 10 | 15 | 0 | 14 |
Sharp | 4 | 0 | 9 | 0 | 1 | 0 |
Cutting | 11 | 0 | 12 | 4 | 3 | 18 |
Bad, Shoddy | 13 | 0 | 19 | 0 | 7 | 18 |
Shout | 6 | 0 | 6 | 0 | 3 | 1 |
Dismiss | 1 | 4 | 6 | 6 | 0 | 3 |
Sheet | 23 | 0 | 17 | 4 | 0 | 0 |
(Small) | 3 | 0 | 5 | 0 | 0 | 2 |
Bit | 6 | 0 | 9 | 0 | 0 | 9 |
Gush -Water | 5 | 0 | 1 | 10 | 0 | 1 |
Rush | 10 | 0 | 4 | 5 | 4 | 6 |
Smart | 4 | 0 | 13 | 0 | 1 | 0 |
Shine | 10 | 0 | 6 | 1 | 4 | 5 |
Intensity | 4 | 0 | 6 | 5 | 4 | 11 |
Shack | 4 | 0 | 9 | 2 | 2 | 0 |
Shadow | 3 | 0 | 6 | 0 | 0 | 0 |
Protector | 4 | 0 | 9 | 1 | 0 | 4 |
Cloth | 10 | 0 | 10 | 2 | 0 | 0 |
Catch/Hold | 17 | 1 | 3 | 7 | 21 | 27 |
Other Shelter | 10 | 0 | 9 | 0 | 0 | 6 |
Departure | 25 | 1 | 29 | 4 | 0 | 15 |
Shut | 3 | 0 | 8 | 0 | 0 | 8 |
Should | 5 | 0 | 7 | 1 | 4 | 0 |
Shame | 4 | 0 | 3 | 2 | 0 | 0 |
Shirk | 2 | 0 | 10 | 0 | 0 | 0 |
I've put the larger classes in italics and the smaller classes in parentheses. I'd like to make a couple points about this classification. First, this is not the one and only correct classification for English and Norwegian /S/. I'm aware of two classifications for English /S/ made by other people, and about 2/3 of the classes tend to be the same. In fact, when conducting this test, I did the classification twice, first taking each word in alphabetical order, and then in reverse alphabetical order. I do this as a reality check, as an attempt to ensure that I didn't miss too much. Each time I do the test, the classes end up being a little bit different, and the percentages of exceptions consequently also vary somewhat.
However, I find quite consistently that if the classification scheme conforms to certain rules of classificational well-formedness, or you might say 'grammaticality', you won't get exactly the same classes or percentages, but your results will have the same general profile. By this I mean that the exceptions will always have concrete reference, and the /S/-words will do better in the /S/ classification than will the words containing other phonemes.
The rules for classificational well-formedness are as follows:
If you were, for example, to classify articles of furniture based on whether they are fruits, vegetables, meats or dairy products, you find that such a classification violates the first two criteria of a grammatical classification. Very few articles of furniture fit in any of these semantic classes, and each of these classes contains a very small percentage of the articles of furniture. This classification scheme does not violate the third criterion. This criterion would be violated by a scheme that classified articles of furniture based on whether they are one day old, two days old, three days old, and so forth up to 4,000,000 days old. A classification which would violate the 4th criterion would be a class of chairs, seats, and furniture on which one sits. The idea is to minimize the number of classes required.
I'd also like to point out that the classes in this table are much more highly interrelated than one can see just by looking at them. For example, the verbs for change containing /S/ are metaphorically related to the words for 'shaking'. That is, the verbs for change have another sense which simply means 'back-and-forth motion'. This does not hold of the verbs for 'change' which contain a /C/. The walking class in Norwegian and English contains only 3 verbs, which constitutes only 1.5% of the words containing /S/. 'Walking' is therefore not technically a /S/ phonestheme, because at least in English, this percentage is not considerably higher than what you find in the language generally. But the walking verbs which contain a /S/ usually emphasize a back-and-forth motion, and are therefore also related to the 'shaking' verbs. There are many words referring to poor quality in /S/. These words are also often related to words for destruction, such as 'to trash/the trash', 'to mush/some mush'. The words for protection are related to the verbs of destruction in an obvious way. One also finds many words for things which are planar. The planar words containing /S/ tend to have an underside, unlike most of the planar words beginning with /pl/. And these sheet-shaped words containing /S/ are also very frequently protective coverings. Many of these words are words for cloth, but one also finds words like 'shield', 'shingle', 'shutter', 'sheath' and so forth.
The point is that all these interrelationships cause the semantic domain associated with /S/ to form a coherent whole. /S/ has a coherent meaning.
Consider now the ways in which Norwegian /S/ differs from English /S/:
Phonestheme | Norw. /S/ | Eng. /S/ | Norw. /K/ | Eng. /C/ |
Distinguish | 3 | 0 | 1 | 0 |
See | 6 | 0 | 6 | 2 |
Learn | 3 | 0 | 1 | 0 |
Pattern | 4 | 1 | 0 | 1 |
Order | 4 | 0 | 3 | 1 |
Happen | 7 | 1 | 0 | 1 |
Self | 3 | 0 | 0 | 0 |
(Slant) | 5 | 0 | 3 | 0 |
(Mountain) | 2 | 0 | 2 | 1 |
(Fun) | 5 | 1 | 5 | 17 |
In Norwegian, the cutting verbs are metaphorically related to the capacity to draw distinctions and to patterns. Words for intelligence are quite common in both English and Norwegian /S/, but in Norwegian, they are related to words for sight and learning. There are a number of Norwegian words containing /S/ for events, happening, fate and the like, though I'm not sure how these words relate to the rest of the Norwegian /S/ vocabulary if at all. Notice that there are a couple words for mountains in Norwegian containing both /S/ and /K/. I've not gone through the entire vocabulary of Norwegian, only the fricatives, but I suspect that the reason there are more /S/-words for mountains in Norwegian than in English is simply because there are more words for mountains in Norwegian than in English. At least in the fricatives, I found 3 times as many words for mountains in Norwegian as in English. So part of the process of comparing vocabularies phonosemantically involves distinguishing phonosemantic disproportions from disproportions which exist throughout the language generally. Indeed, I expect that the metaphoric relationship between cutting and intelligence in Norwegian is not limited to /S/, but probably runs throughout the Norwegian vocabulary.
Consider now the classes which are more prevalent in English than in Norwegian /S/:
Phonestheme | Eng. /S/ | Norw. /S/ | Norw. /K/ | Eng. /C/ |
Shallow (lit) | 6 | 1 | 0 | 0 |
Shallow (fig) | 19 | 2 | 0 | 0 |
Shore | 5 | 1 | 1 | 1 |
Shelter (v) | 3 | 1 | 0 | 0 |
Lush, Plush | 4 | 1 | 0 | 0 |
Tricky | 12 | 1 | 0 | 0 |
Violent Contact | 26 | 3 | 2 | 20 |
Trash, Mush | 13 | 2 | 0 | 0 |
The planar words which are common in both English and Norwegian /S/ are related in English to words for shallowness, and these in turn are related to figurative shallowness. There are many words for things of poor quality in both English and Norwegian /S/, but in Norwegian, the poor quality tends to be due to negligence, whereas in English it is more likely to be due to superficiality or shallowness. There are, as mentioned, many more words for violent contact in English /S/ than in Norwegian. Also, whereas the intelligence in Norwegian /S/ is related to the capacity to draw distinctions, to sight and to patterns, in English the intelligence in /S/ exclusively implies deceit: sharp, shifty, card shark, etc.
This concludes my overview of the semantics of English and Norwegian /S/. The main point I'd like to make is, of course, that phonemes are meaning-bearing. Furthermore, this is due to a productive process of Semantic Association on the phoneme level. English and Norwegian /S/-words also more closely resemble one another semantically than either of them resembles words containing other phonemes. The most important topic for future research seems to me to be that of determining the extent to which the semantic differences between Norwegian and English /S/ (for example) are limited only to /S/ words, and to what extent the disproportions reflect differences which are general to each of the two languages.