
Thursday, January 29, 2015

Let's talk tools

Craig Sailor sent me this link to a paper in Nature that argues for a correlation between the gradual emergence of a certain kind of stone tool and the emergence of language. As Craig put it in his note to me:

"They suggest a causal relationship: the skills required to create the tools in question are complex enough that verbal instruction was significantly more effective than imitation (which they support with experimental evidence), meaning the gene(s) responsible for communication were preferentially selected for in the environment, leading to the general evolution of language."

They actually contrast five kinds of "transmission" mechanisms. Going from the least interactive to the most, they consider (i) reverse engineering (the subject is given a finished stone and tries to figure out how it was made), (ii) imitation/emulation (the subject watches someone make the stone and tries to copy this), (iii) basic teaching (a proficient maker guides the learner, but with no physical interaction), (iv) gestural teaching (the teacher molds the gestures of the learner on the stone), and (v) verbal teaching (the teacher instructs the learner by talking). The results of the experiment are that one finds better and better information transmission as one moves from (i) through to (v). In other words, as regards EVOLANG concerns, providing instruction via talking really helps. The suggestion is that "tool making created a continuous selective gradient from observational learning to much more complex verbal teaching…the more complex communication allowed the stable and rapid transmission of increasingly complex technologies, which in turn generate selection for even more complex communication and cognition, and so forth (5)." As the authors note (there are 12 of them), their results place "little necessary constraint on when teaching and language may have evolved…", though they wish to suggest that the evolutionary pressure of the indicated cline started having some effect at least 2.5 mya.

Some comments: It's pretty clear that the authors want to say something here about the evolution of our NL capacities (indeed, this is what made the results popular-science newsworthy (see here)). However, so far as I can tell, they do not isolate what specific linguistic features are required to juice the tool-building transmission system. One can imagine a very limited "language" that would suffice (e.g. do this, like this, etc.) to produce good tools and tool-making instructions. And if this suffices to get good tool making and good tool-making teaching, then it is unclear what explanatory mileage one can get from this experiment. Said another way, the "proto-language" gestured to above (which strikes me as quite possibly sufficient for the tool-making purposes discussed) is a very far cry from Natural Languages. And though I am willing to grant that the more complex the language structure (both wrt word meaning and the combinatorics) the more information one can transmit, and the greater the complexity of the transmittable information the more complex the teaching that it can support, I do not see how this explains the emergence of FLs with the characteristics found in NLs. Or, to put this more charitably, I do not see that languages with the two basic features that we have discussed (here and here) are necessary for making tools. Note that I have no trouble seeing how the emergence of more and more complex linguistic systems can support more and more sophisticated teaching, but that does not support the causal direction of interest. The causal direction needs to be from tools to language, not from language to tools.

Let me note two points: First, unlike much of what passes for EVOLANG, this work aims to give an account of how linguistic capacity evolved. The idea is that it piggybacks on the selective advantage of tool making. This is good: its heart is in the right place. Unfortunately, second, the paper is actually a nice example of the failings of much of the EVOLANG literature that I have looked at, in that it never actually specifies what it takes to be the necessary features of language required for the task at hand. Would the capacity to form simple NVN strings suffice? Would words that adhered to RD suffice (btw, I can't see why not on both counts)? Do we need unbounded hierarchically recursive structures to get flint tool making off the ground? Who knows? The problem is not just that the paper doesn't address this issue, but that it doesn't even recognize its relevance.

Let me end by stating again what someone like me (and, I suspect, you too) wants out of an EVOLANG account: an explanation of how our linguistic capacity arose. Which capacity is that? Well, for starters, where did the linguistic capacity with the two key features Chomsky discussed in the articles linked to above come from? Isolate the features of language without which transmitting tool-making information is impossible.[1] That's a first step. The second is to show that the causality runs from tools to language capacity, rather than the other way around. So, sure, there is a correlation: more complex tools go with more complex language. But more complex everything goes with more complex language. So big deal.

Let me say this another way: the paper, taken at face value, suggests that some kind of communication system would have been useful in making the flint tools discussed. However, as noted, this does not imply that one had an NL like ours earlier than 75-50 kya. It only says that it would have been useful. But quite possibly (likely) a more limited communication system with properties quite unlike those found in our NLs would have been just as useful for this task. Nor is there any reason given why having such a more primitive system is a causal precondition for having ours. And therefore the relevance of the results completely escapes me. Here is yet another case where not describing the properties whose evolution you want to explain has led to a whole bunch of time-intensive hard work being done with no obvious gain in insight. In sum, this looks like yet another instance of the adage that things not worth doing are not worth doing well.






[1] As an example of what I have in mind (i.e. pairing evo advantages with properties of the linguistic system) see here. Please excuse the self promotion, but this was written a very long time ago and Bob Brandon (a very good philosopher of biology) did most of the heavy thinking.

Monday, January 26, 2015

Words, things and EVOLANG (again)

In the evolang paper I linked to here, Chomsky mentions two basic features of natural language. The first is the distinctive nature of natural language (NL) “atoms” (roughly words or morphemes). The second is the generative procedure. He extensively discusses  an evolutionary scenario for the second, but only briefly mentions the first and says the following (1-2):

The atomic elements pose deep mysteries.  The minimal meaning-bearing elements of human languages – word-like, but not words -- are radically different from anything known in animal communication systems.  Their origin is entirely obscure, posing a very serious problem for the evolution of human cognitive capacities, language in particular (my emphasis NH).  There are insights about these topics tracing back to the pre-Socratics, developed further by prominent philosophers of the early modern scientific revolution and the Enlightenment, and further in more recent years, though they remain insufficiently explored.  In fact the problem, which is severe, is insufficiently recognized and understood.  Careful examination shows that widely held doctrines about the nature of these elements are untenable, crucially, the widely-held referentialist doctrine that words pick out extra-mental objects.  There is a great deal to say about these very important questions, but I’ll put them aside – noting again, however, that the problems posed for evolution of human cognition are severe, far more so than generally acknowledged.

What sort of problem do they pose? Well, as we see below, Chomsky argues that human linguistic atoms (roughly words) have qualitatively different properties from the units/atoms used in animal communication systems. If he is right, then looking at the latter to inform us concerning properties of the former is a mug's game, not unlike looking for insight concerning the origins of the hierarchical recursion one finds in NLs by looking at animal communication systems.

Furthermore, these differences, if they exist (and they do, see below), are important, for from the department of "you can't make this s*!t up" comes research like this, motivated by the idea that we can gain insight into human language by studying grunts.[1] How exactly? Well, the motivating conceit is that a grunt language will help us “understand how languages are created – how linguistic conventions (e.g. words) come to be established” (3). And this is itself based on the belief that grunts are an appropriate comparison class for words (you can hear the ape calls reverberating in the background).

This assumption is a very crude version of what Lenneberg dubbed “the continuity theory of language development” (here p. 228): the idea that human language “must have descended from primitive animal forms of communication, and the study of the latter is likely to disclose” something about the former. It’s a short step from the continuity thesis to grunts. And it’s already been taken, as you can see.

So what’s wrong with this? Well, lots really, but the main problem is that once again (see here to understand the “again”) there is no good description of the evolved object’s basic properties (i.e. of what “words” in NL are really like). There is no description of the object to be studied, no characterization of the evolved capacity. Why not?

I suspect that one reason is a tacit belief that we already know what meaning consists in: words denote things and meaning resides in this denotation relation between words and the objects they refer to. If this is right, then it is understandable why some might conclude that studying how grunts might refer to things in a given context would shed light on how meaningful NL words might have arisen in the species from earlier grunts.  

Chomsky (ahem) critically examines (‘excoriates’ might be a better term) this view, what he dubs the “Referentialist Doctrine” (RD), here. In what follows, I want to outline some of his arguments in service of another point Chomsky makes (one, btw, very reminiscent of the one that the late Wittgenstein makes in the first 100 or so entries of the Investigations), namely that though RD is far off the mark when it comes to NL words, it’s not a bad description of animal articulations.[2] If this is correct, then we have almost nothing to learn about NL words by studying what happens in animal communication. Why? Because the underlying assumption, the continuity thesis, is simply false for this domain of EVOLANG. Let’s consider the arguments.

First, what is RD? It’s the view that linguistic meaning originates in a basic capacity that words have (all kinds of words, not just nominals) of standing for mind-independent objects. Thus the basic semantic relation is word to object. This basic semantic relation between words and things causally undergirds acts of denoting on the part of humans. Thus, the human capacity to speak about the world and to use language as a social tool (i.e. to communicate) supervenes on the semantic capacity that words have to denote mind-independent things. Or, put more succinctly: denotation licenses denoting. This is the position that Chomsky wants to discredit. He argues for the opposite view: that whatever sense we can make of denotation (and he hints that it is not much) piggybacks on acts of denoting, which are themselves heavily dependent on richly structured minds. Thus, the word-to-object-in-the-world relation is, at best, a derivative notion of little (if any) significance in understanding how NL words function.

How does Chomsky argue his position? First, he agrees that speakers do refer: “That acts of referring take place is uncontroversial.” However, he denies that this implies that acts of referring supervene on a more primitive denotation relation that holds between words and the things that acts of referring pick out. Or as Chomsky puts it, the fact that people refer using words

…leave[s] open the status of the relation of denotation; that is, the question whether there is a relevant relation between the internal symbol used to refer and some mind-independent entity that is picked out by the expression that is used to denote: an object, a thing, individuated without recourse to mental acts.

Or, put another way: that people denote using words is a fact. That this activity requires a primitive denotation relation between words and things “individuated without recourse to mental acts” is a theory to explain this fact, and the theory, Chomsky argues, is farfetched and generates unwelcome paradoxes.[3]

He deploys several lines of argument to reach this conclusion, the most interesting, IMO, being the analogy with the other side of words, their link to specific articulations (i.e. the word-sound relation). The RD licenses the following analogy in the sound domain: just as human acts of denoting supervene on the relation between an internal mental symbol (e.g. kitten) and real-world kittens, so too human word articulations rest on the relation between internal phonetic symbols (e.g. [ki’n] for kitten) and physical sounds in the world. However, it is clear in the word-sound case that this has things exactly backwards. As Chomsky puts it:

Take the word kitten and the corresponding phonetic symbol [ki’n], the latter an internal object, an element of I-language, in the mind.  We can carry out actions in which we use [ki’n] to produce a sound S (or counterparts in other modalities), the act of pronunciation.  The sound S is a specific event in the mind-independent world, and there is a derivative relation between the internal symbol [ki’n] and S insofar as we use [ki’n] to pronounce S.  There is however no relevant direct relation between [ki’n] and S, and it would be idle to try to construct some mind-independent entity to which [ki’n] corresponds even for a single individual, let alone a community of users.

Anyone who has tried to map spectrograms to words has a good feel for what Chomsky is talking about here. It’s hard enough to do this for a single individual on a single day in a single set of trials, let alone a bunch of people of different sizes, on different days and on different occasions. No two people (or one person on different occasions) seem to pronounce any word in the same way, if “same way” means “producing identical spectrograms.” But the latter are the measurable physical “things” out there in the world “individuated without recourse to mental acts”. If this is correct, then, as Chomsky says, “there is no relevant direct relation” between mental sound symbols and their physical products. There is at most an indirect relation, one mediated by an act of articulation (viz. this sound symbol was used to produce that sound).

The analogy leads to a conclusion that Chomsky draws:

Acoustic and articulatory phonetics are devoted to discovering how internal symbols provide ways to produce and interpret sounds, no simple task as we all know.  And there is no reason to suspect that it would be an easier task to discover how internal systems are used to talk or think about aspects of the world.  Quite the contrary.

IMO, this is a very powerful argument. If RD is good for meaning, then it should be good for sound. Conversely, if RD is a bad model for sound, why should we take it to be a good one for meaning? Inquiring minds want to know. To my knowledge, nobody has offered a good counter-argument to Chomsky’s word-sound analogy. However, I am not entirely up on this literature, so please feel free to correct me.

Chomsky also offers a host of more familiar arguments against RD. He points to the many paradoxes that Referentialism seems to generate. For example, if ‘London’ refers to the space-time located burgh, then were it to move (like Venice did; see here for a more elaborate discussion of this same point), would the meaning of London change? And if so, how can we say things like I visited London the year after it was destroyed by flood and it was rebuilt three miles inland? Chomsky observes that this conundrum (and many others; see note 4) disappears if RD is abandoned and meaning is treated the way we treat the word-sound relation: as a secondary, very indirect relation.[4]

Chomsky also hints at one reason why RD is still so popular, but here I may be putting words into his mouth (not unlike putting one’s head in a lion’s jaws, I should add). There is a long tradition linking RD to some rather crude kinds of associationism, wherein learning word meaning is based on “ostension, instruction, and habit formation.” It is not hard to see how these operations rely on an RD picture, semantics becoming the poster child for environmental approaches to language acquisition (i.e. theories which take mental architecture to be a faithful representation of environmental structure). The opposite conception, in which meaning is embedded in a rich innate system of concepts, is largely antithetical to this associationist picture. It strikes me as entirely plausible that RD derives some (much?) of its intuitive appeal from being yet another projection of the Empiricist conception of mind. If this is the case, then it is, IMO, another argument against RD.[5]

Chomsky ends his little paper with a nice observation. He notes that whereas RD provides a poor account of human word competence, it seems to describe what happens in animals pretty well.[6] Here’s what he says:

It appears to be the case that animal communication systems are based on a one-one relation between mind/brain processes and “an aspect of the environment to which these processes adapt the animal's behavior.” (Gallistel 1991).

This observation seconds one that Charles Hockett made a long time ago here (pp. 569ff). Hockett noted many differences between human language and animal communication systems. In particular, the latter are quite tightly tied to the here and now in ways that the former are not. Animals communicate about things proximate in place/time/desire. They “discuss” the four Fs (i.e. food, flight, fight, and sex) and virtually nothing else. Very few movie reviews it seems. Humans discuss everything, with the transmission of true (or even useful) information being a minor feature of our communicative proclivities if what I hear around me is any indication. At any rate, human words are far more open-textured than animal “words” and can be arbitrarily remote from the events and objects that they are used to depict. In other words, when we look we find that there is a strong discontinuity between the kind of semantic relations we find in animal communication systems and those we find in human language. And if this is so (as it strongly appears to be), then evolutionary accounts based on the assumption that human powers in these domains are continuous with those found in other animals are barking (grunting?) up the wrong bushes. If so, at least as regards our words and theirs and our combinatorics and theirs, continuity theories of evolution are very likely incorrect, a conclusion that Lenneberg came to about 50 years ago.

Let me end with two more observations.

First, as Chomsky’s two papers illustrate, linguists bring something very important to any EVOLANG discussion. We understand how complex language is. Indeed, it is so complex that language as such cannot really be the object of any serious study. In this way “language” is like “life.” Biologists don’t study life for it is too big and complex. They study things like energy production within the cell, how bodies remove waste, how nutrients are absorbed, how oxygen is delivered to cells, … Life is the sum of these particular studies. So too language. It is the name we give to a whole complex of things; syntactic structure, compositionality, phonological structure… To a linguist ‘language’ is just too vague to be an object of study.

And this has consequences for EVOLANG. We need to specify the linguistic feature under study in order to study its evolution. Chomsky has argued that once we do this we find that two salient features of language (viz. its combinatoric properties and how words operate) look unlike anything we find in other parts of the animal world. And if this is true, it strongly suggests that continuity accounts of these features are almost certain to be incorrect. The only option is to embrace the discontinuity and look for other kinds of accounts. Given the tight connection between continuity theses and mechanisms of Natural Selection, this suggests that for these features, Natural Selection will play a secondary explanatory role (if any at all).[7]

Second, it is time that linguists and philosophers examine the centrality of RD to actual linguistic semantic practice. Some have already started doing this (e.g. Paul Elbourne here, and much of what Paul Pietroski has been writing for the last 10 years; see e.g. here). At any rate, if you are like me in finding Chomsky’s criticisms of RD fairly persuasive, then it is time to decouple the empirical work in linguistic semantics from the referentialist verbiage that it often comes wrapped in. I suspect that large parts of the practice can be salvaged in more internalist terms. And if they cannot be, then that would be worth knowing, for it would point to places where some version of RD might be correct. At any rate, simply assuming without argument that RD is correct is long past its sell-by date.

To end: Chomsky’s paper on RD fits nicely with his earlier paper on EVOLANG. Both are short. Both are readable. Both argue against widely held views. In other words, both are lots of fun. Read them.




[1] Thx to David Poeppel for bringing this gem to my attention. Remember my old adage that things not worth doing are not worth doing well? Well apply adage here.
[2] That words in NL are very different from what a natural reading of RD might suggest is not a new position, at least in the philo of language. It is well known that the later Wittgenstein held this, but so did Waismann (here) (in his discussions of “open-texture”) and so, most recently, did Dummett. Within linguistics, Hockett famously described the ways human and animal “language” differ, many of his observations being pertinent here. I say a little about this below.
[3] Note, that a relatively standard objection to RD is that basing meaning on a reference relation “individuated without recourse to mental acts” makes it hard to link meaning to understanding (and, hence, to communication). Dummett presents a recent version of this critique, but it echoes earlier objections in e.g. the later Wittgenstein. In this regard, Chomsky’s is hardly an isolated voice, despite the many differences he has with these authors on other matters.
[4] Chomsky discusses other well known paradoxes. Pierre makes an appearance as does Paderewski and the ship of Theseus. Indeed, even the starship Enterprise’s transporter seems to walk on stage for a bit.
[5] It is also often claimed that without RD there can be no account of how language is used for communication. I have no idea what the connection between RD and communication is supposed to be, however. Dummett has some useful discussion on this issue, as does the late Wittgenstein (boy does this sound pretentious, but so be it). At any rate, the idea that the capacity to communicate relies on RD is a common one, and it is also one that Chomsky discusses briefly in his paper.
[6] It also describes Wittgenstein’s ‘slab’ language in the Investigations pretty well. Here Wittgenstein tries to create a language that closely mirrors RD. He tries to show how stilted an un-language like this slab language is. In other words, his point is that RD in action results in nothing we would recognize as natural language. Put another way, RD distorts the distinctive features of natural language words and makes it harder to understand how they do what they do.
[7] This is worth emphasizing: Natural Selection is one mechanism of evolution. There are others. It is a mechanism that seems to like long stretches of time to work its magic. And it is a mechanism that sees differences as differences in degree rather than differences in kind. This is why continuity theses favor Natural Selection style explanations of evolution.

Friday, January 23, 2015

More on reviewing

Dominique Sportiche sent me this interesting breakdown of the NIPS experiment mentioned by Alex C in the comments section (here). It also provides a possible model of the results, one that presents in a more formal idiom a hypothesis I floated for the MRC results, namely that once one takes account of the clear winners and the clear losers, acceptance among the messy middle is a tossup. At any rate, the details are interesting, for the results as interpreted in the link come very close to assuming that acceptance is indeed a crapshoot. Let me repeat, again, that this does not make the decision unfair. Nobody has a right to have their paper presented or their work funded. Arbitrary processes are fair if everyone is subject to the same capricious decision procedures.
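
Here is a toy simulation of that hypothesis (mine, not the linked analysis; the quality thresholds and the middle-tier acceptance rate are invented for illustration). Each committee accepts the clear winners, rejects the clear losers, and flips a coin on the messy middle; two independent committees then disagree on a large share of the accepted papers:

    import random

    def decide(quality, middle_rate=0.25):
        """Toy committee: clear winners get in, clear losers don't,
        and the messy middle is decided by a coin flip."""
        if quality > 0.9:                     # clear winner (invented cutoff)
            return True
        if quality < 0.5:                     # clear loser (invented cutoff)
            return False
        return random.random() < middle_rate  # tossup for the middle

    random.seed(0)
    papers = [random.random() for _ in range(10_000)]  # latent "true" quality
    a = [decide(q) for q in papers]                    # committee A
    b = [decide(q) for q in papers]                    # independent committee B

    accepts_a = [i for i, x in enumerate(a) if x]
    agree = sum(b[i] for i in accepts_a) / len(accepts_a)
    print(f"share of A's accepts that B would also accept: {agree:.0%}")

With these made-up numbers, roughly 40% of one committee's accepted papers would be rejected by the other, even though both committees are "right" about the extremes. All the disagreement lives in the middle tier.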

That raises an interesting question, IMO. What makes for a clear winner or loser? Here I believe that taste plays a very big role. And though I am a big fan of disputing taste (in fact, I think it is about the only thing worth real discussion: de gustibus disputandum est), there is no doubt that taste can move around quite a bit and is not easy to defend.

Still, what else to do? Not review? Randomly accept? This would likely delegitimize the reported work. So, I have no great ideas here. It seems that our judgments are not all that reliable when it comes to judging current work. Are you surprised? I'm not.

Addendum:

Let me add one more point. How do we decide what's good and what's not? Well, one influence, I suspect, is what gets published/accepted/funded and what doesn't. If correct, then we can get a reinforcing process where the top and bottom ends of the acceptance/rejection process are themselves influenced by what was and wasn't done before. This implies that even where there is consensus, it might itself be based on some random earlier decisions. This is what can make it hard for novelty to break through, as we know it is (see here for a famous modern case).

One of the most counterintuitive probability facts I know of is the following: if one takes two fair coins and starts flipping them, and one coin gets, let's say, 10 heads "ahead" of the other, how long will it take the second coin to "overtake" the first wrt heads? Well, a hell of a long time (is "never" long enough for you?). The difference in heads between the two coins performs a symmetric random walk, and though such a walk is certain to erase the gap eventually, the expected waiting time for it to do so is infinite. This indicates that random effects can be very long lasting.
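
For the skeptical, here is a minimal Monte Carlo sketch of this fact (my own illustration; the 10-head lead and the million-round cap are arbitrary choices). The lead is tracked as a random walk: the median catch-up time is modest, but the distribution is so heavy-tailed that its mean is infinite, and a few runs never catch up within the cap:

    import random

    def rounds_to_overtake(lead=10, max_rounds=1_000_000):
        """Both coins flip once per round; coin B starts `lead` heads behind
        coin A. Return the first round at which B has strictly more heads
        than A, or None if that never happens within max_rounds."""
        gap = lead  # heads(A) - heads(B)
        for t in range(1, max_rounds + 1):
            gap += random.random() < 0.5  # A's flip
            gap -= random.random() < 0.5  # B's flip
            if gap < 0:
                return t
        return None

    random.seed(1)
    times = [rounds_to_overtake() for _ in range(200)]
    done = sorted(t for t in times if t is not None)
    print(f"overtook within a million rounds: {len(done)}/200 runs")
    print(f"median rounds needed: {done[len(done) // 2]}")
    print(f"slowest successful run: {done[-1]}")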

One last comment: fortunately, I suspect that most science advances based on (at most) a dozen papers per field per year (and maybe fewer). Of course, it is hard to know ex ante which dozen these will be. But that suggests that over time, the noisy methods of selection, though they are of great personal moment, may make little difference to the advancement of knowledge. Humbling to consider, isn't it?

Next time I complain about writing in linguistics

remind me of this:

Anticipating how, when, and why different contexts may interact to disrupt an organization requires leaders to develop “ripple intelligence,” as well as the ability to harness doubt more effectively in order to improve decision making. Moreover, as business conditions change, CEOs must learn to balance authenticity and adaptability in order to motivate their organizations to action without squandering the trust they have worked so hard to build.

It comes from a report on CEO business leadership insights for the 21st century, partially authored by academics at Oxford. The story I got it from is here. As Justin Fox observes, it's pretty clear that the above is entirely content free, about as close to gibberish as one could hope. But it is vapidity that is worth remembering for two reasons.

First, as bad as some linguistic writing can be, I cannot remember anything this contentless.

Second, you can be sure that this sort of CEO/corporate speak will soon be coming to a campus near you given the delight that our handlers have in pretending to be big fancy corporate leaders.  So get ready for the BS.

Here's a great W.C. Fields bit of wisdom to arm yourself with to get ready:

"If you can't dazzle them with brilliance, baffle them with bullshit."


Tuesday, January 20, 2015

Yet more on intra-neural computation

Milan Rezac sent me these two links (here, here) to articles in Science Daily that many should find interesting.

The first seems to directly support Randy's conjecture about the physical locus of biological memory. More specifically, the reported research from UCLA presents evidence "contradicting the idea that long-term memory is stored at the synapses." David Glanzman, the guy whose name comes last in a list of six (does this make him senior author? or last author? or? I never understand the attribution conventions in these bio papers. At any rate, it's his lab), says that the work "suggests that memory is not in the synapses but somewhere else. We think it's the nucleus of the neuron. We haven't proved that though."

The experiment described sounds kind of cute. As with the Hesslow experiment that Randy described, the technology has developed to the point that one can study how neurons learn in a petri dish, at least Aplysia neurons (the victim of choice). What this experiment did was disrupt and then encourage the regrowth of synaptic connections, and then check whether, when a memory is retained, the regrown connections show any evident pattern. The idea is that if memories were retained in the synaptic connections, then this would influence the kind of synaptic growth one sees. Conclusion by Glanzman (and here I quote the report, not G himself): "there was no obvious pattern to which synapses stayed and which disappeared, which implied that memory was not stored in the synapses." In other words, were memories stored in synapses, we would expect them to be reflected in systematic patterns in those synapses. But there is no such pattern, yet memories are retained. Conclusion: they are not retained in the synapses. Note, interestingly, that synaptic growth does seem correlated with memory. However, it does not appear to be the locus of the memory. What role does it play? The article doesn't say. However, it suggests that a correlation between synaptic growth and memory is insufficient for determining where memories are stored. This is interesting given the experiment that Lisman proposed during Randy's talk (see here).

The second paper Milan sent along should appeal to Evo-Devo fans (this could be a punk rock group, couldn't it?). Here is the abstract:

The fundamental structures underlying learning and memory in the brains of invertebrates as different as a fruit fly and an earthworm are remarkably similar, according to neuroscientists. It turns out that the structure and function of brain centers responsible for learning and memory in a wide range of invertebrate species may possibly share the same fundamental characteristics.

This suggests that the mechanisms underlying learning and memory in even very simple animals are the same as those we have in higher vertebrates. Biologically, this does not strike me as surprising. We have known for a long time that the neuro-chemical and genetic basics are the same in e-coli (and fruit flies, aplysia, slime molds, horseshoe crabs) and humans. Indeed, this is why doing research on the former tells us so much about the latter. That this is also true for mental capacities (memory, learning) should not be much of a surprise. (A little sermon here: though people love to criticize linguists for making general claims based on a small variety of languages (something that, btw, is simply false, as any look at the journals will show), the same kind of criticism does not appear to extend to biology, where one concludes a lot about general (dare I say universal) biological processes from what happens in e-coli and flies.) At any rate, it seems that human brains "share the organizational principles" evident "in arthropods and other invertebrates." The take-home message seems to be that biology is very conservative in the mechanisms that it uses. Once a trick able to do useful biological work (seeing, remembering) is discovered, it is recycled again and again. Taken in conjunction with the observation that unicellular slime molds can store interval information in memory (see here), this suggests that human memory can do so and does do so as well. Randy rules!!

Anyhow, thx Milan for the papers. Lots of fun.

Monday, January 19, 2015

How to make an EVOLANG argument

Bob Berwick recently sent me something that aims to survey, albeit sketchily, the state of play in the evolution of language (evolang) and a nice little paper surveying the current state of Gould and Lewontin’s spandrels paper (here) (hint: their warning is still relevant). There have also been more than a few comments in FOL threads remarking on the important progress that has been made on evolang. I believe that I have invited at least one evolang enthusiast to blog about this (I offered as much space as desired, in fact) so as to enlighten the rest of us about the progress that has been made. I admit that I did this in part because I thought that the offer would not be taken up (a put up or shut-up gambit) and also (should the challenge be accepted) because I would really be interested in knowing what has been found given my profound skepticism that at this moment in time there is anything much to find.  In other words, for better or for worse, right now I doubt that there is much substantive detail to be had about how language actually evolved in the species.[1] In this regard, we are not unlike the Paris Academy over a century ago when it called for a moratorium on such speculation.

That said, who can resist speculating? I can’t. And therefore, this post was intended to be an attempt to examine the logic of an evolution of language account that would satisfy someone like me. I wanted to do this, because, though close to vacuous most of the discussion I’ve seen is (like the fancy inversion here?), I think that Minimalism has moved the discussion one small conceptual step forward. So my intention had been to outline what I think this small step is as well as point to the considerable distance left to travel.  

As you can tell from the modal tenses above, I was going to do this, but am not going to do it. Why not? Because someone has done this for me and instead of my laying out the argument I will simply review what I have received. The text for the following sermon is here, a recent paper by Chomsky on these matters.[2] It is short, readable and (surprise, surprise) lays out the relevant logic very well. Let’s go through the main bits.

Any discussion of evolang should start with a characterization of what features of language are being discussed. We all know that “language” is a very complex “thing.” Any linguist can tell you that there are many different kinds of language properties. Syntax is not phonology is not semantics. Thus in providing an evolutionary account of language it behooves a proposal to identify the properties under consideration.

Note that this is not an idiosyncratic request. Evolution is the study of how biological entities and capacities change over time. Thus, to study this logically requires a specification of the entity/capacity of interest. This is no less true for the faculty of language (FL) than it is for hearts, kidneys or dead reckoning. So, to even rationally begin a discussion in evolang requires specifying the properties of the linguistic capacity of interest.

So, how do we specify this in the domain of language? Well, here we are in luck. We actually have been studying these linguistic capacities for quite a while and we have a rich, developed, and articulate body of doctrine (BOD) that we can pull from in identifying a target of evolutionary interest. Chomsky identifies one feature that he is interested in. He terms this the “Basic Property” (BP) and describes it as follows:

[E]ach language yields a digitally infinite array of hierarchically structured expressions with systematic interpretations at interfaces with two other internal systems, the sensorymotor system for externalization and the conceptual system, for interpretation, planning, organization of action, and other elements of what are informally called “thought.” (1)

So one evolang project is to ask how the capacity that delivers languages with these properties (viz. I-languages) arose in the species. We call the theory of I-languages “Universal Grammar” or UG as it “determines the class of generative procedures that satisfy the Basic Property” (1). We can take UG as “the theory of the genetic component of the faculty of language.” If we do, there is a corresponding evolang question: how did UG arise in the species?[3]

Note that the above distinguishes FL and UG. FL is the mental system/”organ” that undergirds human linguistic competence (i.e. the capacity to develop (viz. “grow”) and deploy (viz. “use”) I-languages). UG is the linguistically specific component of FL. FL is likely complex, incorporating many capacities only some of which are linguistically proprietary. Thus, UG is a subpart of FL. One critical evolang question then is how much of FL is UG. How much of FL consists of linguistically proprietary properties, capacities/primitives that are exclusively linguistic?

Why is the distinction important? Well, because it sure looks like humans are the only animals with BP (i.e. nothing does language like humans do language!) and it sure looks like this capacity is relatively independent of (viz. dissociates with) other cognitive capacities we have (see here). Thus, it sure looks like the capacity to generate BP-I-languages (BPIs) is a property of humans exclusively. And now we come to the interesting evolang problem: as a point of evolutionary logic (we might dub this the Logical Problem of Language Evolution (LPLE)) the bigger the UG part of FL, the more demanding the problem of explaining the emergence of FL in the species. Or as Chomsky puts it (3): “UG must meet the condition of evolvability, and the more complex its assumed character, the greater the burden on some future account of how it might have evolved.”

We can further sharpen the evolvability problem by noting one more set of boundary conditions on any acceptable account. There are two relevant facts of interest, the first “quite firm” and the second merely “plausible,” something we assert with “less confidence.” These are:

1.     There has been no evolution of FL in the species in the last 50k years or more.
2.     FL emerged in the way it exists today about 75k years ago.

As Chomsky puts it (3): “It is, for now, a reasonable surmise that language –more accurately UG- emerged at some point in the very narrow window of evolutionary time, perhaps in the general neighborhood of 75 thousand years ago, and has not evolved since.”[4]

Why is (1) firm? Because there are no known group differences in the capacity humans have for acquiring and using a natural language. As the common wisdom is that our ancestors left Africa and their paths diverged about 50kya, this would be unexpected had there been evolution of FL or UG after that point.

Why is (2) less firm? Because we infer it to be true based on material cultural artifacts that are only indirect indicators of linguistic capacity. This evidence has been reviewed by Ian Tattersall (here), and the conclusion he draws on these issues looks like a plausible one. Chomsky is here relying on this archeological “consensus” view for his “plausible” second assumption.

If these assumptions are correct then, as Chomsky notes (3), “UG must be quite simple at its core” and it must have emerged more or less at once. These are really flip sides of the same claim. The evolutionary window is very narrow, so whatever happened must have happened quickly in evo-time, and for something to happen quickly, it is very likely that what happened was a small, simple change. Complexity takes a long time. Simplicity not so much.[5] So, what we are looking for in an evolang account of our kinds of natural languages is some small change that has BPI-effects. Enter Minimalism.

Chomsky has a useful discussion of the role of evolvability in early Generative Grammar (GG). He notes that the evolvability of FL/UG was always recognized to be an important question and that people repeatedly speculated about it. He mentions Lenneberg and Luria in this regard, and I think I recall that there was also some scattered discussion of this at the Royaumont conference. I also know that Chomsky discussed these issues with Francois Jacob as well. However, despite the interest of the problem and the fact that it was on everyone’s radar, the speculation never got very far. Why not? Because of the state of the theory of UG. Until recently, there was little reason for thinking that UG was anything but a very complicated object with complex internal structure, many different kinds of primitives, processes and conditions (e.g. just take a look at GB theory). Given the LPLE, this made any fruitful speculation idle, or, in Dwight Whitney’s words quoted by Chomsky: “The greater part of what is said and written about it is mere windy talk” (4) (I love this Ecclesiastical description: Wind, wind, all is wind!).

As Chomsky notes, minimalism changed this. How? By suggesting that the apparent complexity of UG as seen from the GB angle (and all of GB’s close relatives) is eliminable. How so? By showing that the core features of BPIs as described by GB can be derived from a very simple rule (Merge) applied in very simple ways (computationally “efficient”). Let me say this more circumspectly: to the degree that MP succeeds, to that degree the apparent complexity of FL/UG can be reduced. In the best case, the apparent complexity of BPIs reduces to one novel language-specific addition to the human genome, and out falls our FL. This one UG addition together with our earlier cognitive apparatus and whatever non-cognitive laws of nature are relevant suffices to allow the emergence of the FL we all know and love. If MP can cash this promissory note, then we have taken a significant step towards solving the evolang problem.

Chomsky, of course, rehearses his favorite MP account (7-9): the simplest Merge operation yielding unordered merges, the simplest application of the rule to two inputs yielding PS rules and Movement, natural computational principles (not specific to language but natural for computation as such) resulting in conditions like Inclusiveness and Extension and something like phases, the simple merge rule yielding a version of the copy theory of movement with obvious interpretive virtues etc.  This story is well known, and Chomsky rightly sees that if something like this is empirically tenable then it can shed light on how language might have evolved, or, at the very least, might move us from windy discussions to substantive ones.
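
To make the “simplest Merge” concrete, here is a toy rendering (my own illustration, with invented lexical items; nothing in Chomsky’s paper is written this way): Merge is just binary, unordered set formation, and applying it to an object and one of its own subparts yields “movement” with copies for free:

    def merge(a, b):
        """Simplest Merge: form the unordered set {a, b}, adding nothing
        to the inputs and imposing no linear order."""
        return frozenset([a, b])

    # External Merge builds hierarchical structure from lexical atoms:
    dp = merge("the", "apple")  # {the, apple}
    vp = merge("eat", dp)       # {eat, {the, apple}}

    # Internal Merge ("movement"): merging an object with one of its own
    # parts yields two occurrences of that part, i.e. the copy theory.
    cp = merge(dp, vp)          # {{the, apple}, {eat, {the, apple}}}
    assert dp in cp and dp in vp  # one object, two positions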

Let me say this one more way: what minimalism brings to the table is a vision of how a simple addition might suffice to precipitate an FL like the one we think we have empirical evidence for. And, if correct, this is, IMO, a pretty big deal. If correct, it moves evolang discussion of these linguistic properties from BS to (almost) science, albeit, still of a speculative variety.

Chomsky notes that this does not exhaust the kinds of evolang questions of interest. It only addresses the questions about generative procedure. There are others. One important one regards the emergence of our basic lexical atoms (“words”). These have no real counterpart in other animal communication systems and their properties are still very hard to describe.[6] A second might address how the generative procedure hooked up to the articulatory system. It is not unreasonable to suppose that fitting FL snugly to this interface took some evolutionary tinkering. But though questions of great interest remain, Chomsky argues, very convincingly in my view, that with the rise of MP linguistics has something non-trivial to contribute to the discussion: a specification of an evolvable FL.

There is a lot more in this little paper. For example, Chomsky suggests that much of the windiness of much evolang speculation relates to the misconceived notion that the natural language serves largely communicative ends (rather than being an expression of thought). This places natural languages on a continuum with (other) animal communication systems, despite the well-known huge apparent differences. 

In addition, Chomsky suggests what he intends with the locution ‘optimal design’ and ‘computationally efficient.’ Let me quote (13):

Of course, the term “designed” is a metaphor. What it means is that the simplest evolutionary process consistent with the Basic Property yields a system of thought and understanding [that is sic (NH)] computationally efficient since there is no external pressure preventing this optimal outcome.

“Optimal design” and “computational efficiency” are here used to mean more or less the same thing. FL is optimal because there is no required tinkering (natural selection?) to get it into place.  FL/UG is thus evolutionarily optimal. Whether this makes it computationally optimal in any other sense is left open.[7]

Let me end with one more observation. The project outlined above rests on an important premise: that simple phenotypic descriptions will correspond to simple genotypic ones. Here’s what I mean. Good MP stories provide descriptions of mental mechanisms, not  neural or genetic mechanisms. Evolution, however, selects traits by reconfiguring genes or other biological hardware. And, presumably, genes grow brains, which in turn secrete minds. It is an open question whether a simple mental description (what MP aims to provide) corresponds to a simple brain description, which, in turn, corresponds to a simple “genetic” description. Jerry Fodor describes this train of assumptions well here.[8]

…what matters with regard to the question whether the mind is an adaptation is not how complex our behaviour is, but how much change you would have to make in an ape’s brain to produce the cognitive structure of a human mind. And about this, exactly nothing is known. That’s because nothing is known about how the structure of our minds depends on the structure of our brains. Nobody even knows which brain structures it is that our cognitive capacities depend on.
Unlike our minds, our brains are, by any gross measure, very like those of apes. So it looks as though relatively small alterations of brain structure must have produced very large behavioural discontinuities in the transition from the ancestral apes to us…
…In fact, we don’t know what the scientifically reasonable view of the phylogeny of behaviour is; nor will we until we begin to understand how behaviour is subserved by the brain. And never mind tough-mindedness; what matters is what’s true.

In other words, the whole evolang discussion rests on a rather tendentious assumption, one for which we have virtually no evidence; namely that a “small” phenotypic change (e.g. reduction of all basic grammatical operations to Merge) corresponds to a small brain change (e.g. some brain fold heretofore absent all of a sudden makes an appearance), which in turn corresponds to a small genetic change (e.g. some gene gets turned on during development for a little longer than previously).  Whether any of this is correct is anyone’s guess. After all there is nothing incoherent in thinking that a simple genetic change can have a big effect on brain organization, which in turn corresponds to a very complex phenotypic difference. The argument above assumes that this is not so, but the operative word is “assume.” We really don’t know.

There is another good discussion of these complex issues in Lenneberg’s chapter 6, which is worth looking at and keeping in mind. This is not unusual in the evolution literature, which typically assumes that traits (not genes) are the targets of selection. But the fact that this is commonly the way that the issues are addressed does not mean that the connections assumed from phenotypic mental accounts to brains to genes are straightforward. As Fodor notes, correctly I believe, they are not.

Ok, that’s it. There is a lot more in the paper that I leave for your discovery. Read it. It’s terrific and provides a good model for evolang discussions. And please remember the most important lesson: you cannot describe the evolution of something until you specify that thing (and even then the argument is very abstract). So far as I know, only linguists have anything approaching decent specifications of what our linguistic capacities consist in. So any story in evolang not starting from these kinds of specifications of FL (sadly, the standard case from what I can tell) is very likely the windy product of waving hands.


[1] Happily, I have put myself in the good position of finding out that I am wrong about this. Marc Hauser is coming to UMD soon to give a lecture on the topic that I am really looking forward to. If there are any interesting results, Marc will know what they are. Cannot wait.
[2] I’d like to thank Noam for allowing me to put this paper up for public consumption.
[3] Please observe that this does not imply that BP is the only property we might wish to investigate, though I agree with Chomsky that this is a pretty salient one. But say one were interested in how the phonological system arose, or the semantic system. The first step has to be to characterize the properties of the system one is interested in. Only once this is done can evolutionary speculation fruitfully proceed. See here for further discussion, with an emphasis on phonology.
[4] It is worth noting that this is very fast in evolutionary terms and that if the time scale is roughly right then this seems to preclude a gradualist evolutionary story in terms of the slow accretion of selected features. Some seem to identify evolution with natural selection. As Chomsky notes (p. 11), Darwin himself did not assume this.
[5] Furthermore, we want whatever was added to be simple because it has not changed for the last 50k years. Let me say this another way: if what emerged 100kya was the product of slow-moving evolutionary change, with the system accreting complexity over time, then why did this slow change stop so completely 50kya? Why didn’t change continue after the trek out of Africa? Why tons of change beforehand and nothing since? If the change is simple, with no moving parts, as it were, then there is nothing in the core system to further evolve.
[6] I’ll write another post on these soon. I hope.
[7] If this reading of Chomsky’s intention here is correct, then I have interpreted him incorrectly in the past. Oh well, won’t be the last time. In fact, under this view, the linguistic system once evolved need not be particularly efficient computationally or otherwise. On this view, “computationally efficient” seems to mean “arose as a matter of natural law without the required intervention of natural selection.”
[8] The relevant passage is