Thursday, November 16, 2017

Is science broken/breaking?

Is science broken/breaking and, if it is, what broke/is breaking it? This question has been asked and answered a lot lately. Here is another recent contribution by Roy and Edwards (R&E). Their answer is that it is nearly broken (or at least severely injured) and that we should fix it by removing the perverse incentives that currently drive it. Though I am sympathetic to ameliorating many of the adverse forces R&E identifies, I am more skeptical than they are that there is much of a crisis out there. Indeed, to date, I have seen no evidence showing that what we see today is appreciably worse than what we had before, either in the distant past (you know, in the days when science was the more or less exclusive pursuit of whitish men of means) or even the more recent past (you know, when a PhD was still something that women could only dream about). I have seen no evidence showing that published results were once more reliable than they are now or that progress overall was swifter. Furthermore, from where I sit, things seem (at least from the outside) to be going no worse than before in the older “hard” sciences, and in the “softer” sciences the problem is less the perverse incentives and the slipshod data management that R&E points to so much as the dearth of good ideas that can allow such inquiries to attain some explanatory depth. So, though I agree that there are many perverse incentives out there and that there are pressures that can (and often do) lead to bad behavior, I am unsure whether, given the scale of the modern scientific enterprise, things are really appreciably worse today than they were in some prior golden age (not that I would object to more money being thrown at the research problems I find interesting!). Let me ramble a bit on these themes.

What broke/is breaking science? R&E points to hypercompetition among academic researchers. Whence the hypercompetition? Largely from the fact that universities are operating “more like businesses” (2). What in particular? (i) The squeezed labor market for academics (fewer tenure track jobs and a less pleasant work environment), (ii) the reliance on quantitative performance metrics (numbers of papers, research dollars, citations) and (iii) the fall in science research funding (from 2% of GDP in 1960 to 0.78% in 2014)[1] (p. 7) work together to incentivize scientists to cut corners in various ways. As R&E puts it:

The steady growth of perverse incentives, and their instrumental role in faculty research, hiring and promotion practices, amounts to a systematic dysfunction endangering scientific integrity. There is growing evidence that today’s research publications too frequently suffer from lack of replicability, rely on biased data-sets, apply low or sub-standard statistical methods, fail to guard against researcher biases, and overhype their findings. (p. 8)

So, perverse incentives due to heightened competition for shrinking research dollars and academic positions lead scientists interested in advancing their research and careers to conduct research “more vulnerable to falsehoods.” More vulnerable than what/when? Well, by implication, than some earlier golden age when such incentives did not dominate and scientists pursued knowledge in a more relaxed fashion and were not incentivized to cut corners as they are today. [2]

I like some of this story. I believe, as R&E argues, that scientific life is less pleasant than it was when I was younger (at least for those that make it into an academic position). I also agree that the pressures of professional advancement make it costly (especially for young investigators) to undertake hard questions, ones that might resist solution (R&E quotes Nobelist Roger Kornberg as claiming: “If the work you propose to do isn’t virtually certain of success, then it won’t be funded” (8)).[3] Not surprisingly, then, there is a tendency to concentrate on those questions to which already available techniques apply and that hard work and concentrated effort can crack. I further agree that counting publications is, at best, a crude way of evaluating scientific merit, even if buttressed by citation counts (but see below). All of this seems right to me, and yet…I am skeptical that science as a whole is really doing so badly or that there was a golden age that our own is a degenerate version of. In fact, I suspect that people who think this don’t know much about earlier periods (there really was always lots of junk published and disseminated) or have an inflated view of the cooperative predilections of our predecessors (though how anyone who has read The Double Helix might think this is beyond me).

But I have a somewhat larger problem with this story: if the perverse incentives are as R&E describes them, then we should witness their ill effects all across the sciences and not concentrated in a subset of them. In particular, they should affect the hardcore areas (e.g. physics, chemistry, molecular biology) just as they do the softer (more descriptive) domains of inquiry (social psychology, neuroscience). But it is my impression that this is not what we find. Rather, we find the problem areas more concentrated, roughly in those domains of inquiry where, to be blunt, we do not know that much about the fundamental mechanisms at play. Put another way, the problem is not merely (or even largely) the bad incentives. The problem is that we often cannot distinguish those domains that are sciency from those that are scientific. What’s the difference? The latter have results (i.e. serious theory) that describe non-trivial aspects of the basic mechanisms, whereas the former have methods (i.e. ways of “correctly” gathering and evaluating data) that are largely deployed to descriptive (vs explanatory) ends. As Suppes said over 60 years ago: “It’s a paradox of scientific method that the branches of empirical science that have the least theoretical developments have the most sophisticated methods of evaluating evidence.” It should not be surprising that domains where insight is weakest are also domains where shortcuts are most accessible.

And this is why it’s important to distinguish these domains. If we look, it seems that the perverse incentives R&E identifies are most apt to do damage in those domains where we know relatively little. Fake data, non-replicability, bad statistical methods leading to forking paths/P-hacking, research biases: these are all serious problems, especially in domains where nothing much is known. In domains with few insights, where all we have is the data, screwing with the data (intentionally or not) is the main source of abuse. And in those domains, when incentives for abuse rise, the enticement to make the data say what we need them to say heightens. And when the techniques for managing the data are capable of being manipulated to make them say what you want them to say (or at least when their proper deployment eludes even the experts in the field (see here and here)), then opportunity allows enticement to flower into abuse.

The problem then is not just perverse incentives and hypercompetition (these general factors hold in the mature sciences too) but the fact that in many fields the only bulwark against scientific malpractice is personal integrity. What we are discovering is that, as a group, scientists are just as prone to pursuing self-interest and career advancement as any other group. What makes scientists virtuous is not their characters but their non-trivial knowledge. Good theory serves as a conceptual brake against shoddy methods. Strong priors (which are what theory provides) are really important in preventing shoddy data and slipshod thinking from leading the field astray. If this is right, then the problem lies not with the general sociological observation that the world is in many ways crappier than before, but with the fact that many parts of what we call the sciences are pretty immature. There is far less science out there than advertised if we measure a science by the depth of its insights rather than the complexity of its techniques (especially its data management techniques).

There is, of course, a reason for why the term ‘science’ is used to cover so much inquiry. The prestige factor. Being “scientific” endows prestige, money, power and deference. Science stands behind “expertise,” and expertise commands many other goodies. There is thus utility in inflating the domain of “science” and this widens the possibilities for and advantages of the kinds of problems that R&E catalogue.

R&E ends with a discussion of ways to fix things. The fixes include getting a better fix on the perverse incentives, finding better ways to measure scientific contribution so that reward can be tuned to these more accurate metrics, and implementing more vigilant policing and punishment of malefactors. They seem worthy, albeit modest. But I personally doubt they will make much of a difference. Most of the suggestions revolve around ways of short-circuiting data manipulation. That’s why I think these suggestions will ultimately fail to do much. They misdiagnose the problem. R&E implicitly takes the problem to reside mainly with current perverse incentives to pollute the data stream for career advancement. The R&E solution amounts to cleaning up the data stream by eliminating the incentives to dirty it. But the problem is not only (or mainly) dirty data. The problem is our very modest understanding, for which yet more data is not a good substitute.

Let me end with a mention of another paper (here), which takes a historical view of metrics and how they affected research practice in the past. It notes that the problems we identify as novel today have long been with us, and that this is not the first time people have looked for some kind of methodological or technological fix to slovenly practice. And that is the problem: the idea that there is a kind of methodological “fix” available, a dream of a more rigorous scientific method and a clearer scientific ethics. But there is no such method (beyond the trivial “do your best”), and the idea that scientists qua scientists are more noble than others is farfetched. Science cannot be automated. Thinking is hard, and ideas, not just data collection, matter. Moreover, this kind of thinking cannot be routinized, and insights cannot be summoned no matter how useful they would be. What the crisis R&E identifies points to, IMO, is that where we don’t know much we can be easily misled and easily confused. I doubt that there is a methodological or institutional fix for this.

[1] It is worth pointing out that real GDP in 2016 is over five times higher than it was in 1960 (roughly 3 trillion vs 17 trillion) (see here). In real terms, then, there is a lot more money today than there was then for science research. Though science’s share of GDP went down, the GDP it is a share of really shot up, so the absolute amount grew.
[2] Again, what is odd is the dearth of comparative data on these measures. Are findings less replicable today than in the past? Are data sets more biased than before? Was statistical practice better in the past? I confess that it is hard to believe that any of these measures have gotten worse if compared using the same yardsticks.
[3] This is odd and I’ve complained about this myself. However, it is also true that in the good old days science was a far more restricted option for most people (it is far more open today and many more people, and kinds of people, can become scientists). What seems more or less right, though, is that immediately after the war there was a lot of government money pouring into science, and that made it possible to pursue research that did not show immediate signs of success. What would be nice to see is evidence that this made for better, more interesting science, rather than just more comfortable scientists.
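The arithmetic behind footnote [1] is worth spelling out: multiplying each era’s share of GDP by the (real) GDP of that era shows that absolute funding rose even as the share fell. A minimal back-of-the-envelope sketch, using only the rough figures cited in the post (they are approximations, not precise statistics):

```python
# Back-of-the-envelope check of footnote [1]: science funding's share of GDP
# fell (2% -> 0.78%), but real GDP grew enough (~$3T -> ~$17T) that the
# absolute amount of funding still rose.
gdp_1960 = 3.0e12      # real GDP, roughly $3 trillion
gdp_recent = 17.0e12   # real GDP, roughly $17 trillion

share_1960 = 0.02      # 2% of GDP
share_recent = 0.0078  # 0.78% of GDP

funding_1960 = share_1960 * gdp_1960        # roughly $60 billion
funding_recent = share_recent * gdp_recent  # roughly $133 billion

print(f"1960: ${funding_1960 / 1e9:.0f}B, recent: ${funding_recent / 1e9:.0f}B")
```

So even on the post’s own numbers, the pot roughly doubled in real terms; the shrinking is relative, not absolute.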

Tuesday, November 7, 2017

Minimal pairs

As any well educated GGer knows, there is a big and important difference between grammaticality and acceptability (see here and here) (don’t be confused by the incessant attempts by many (especially psycho types) to conflate these very separate notions (some still call judgment tasks ‘grammaticality judgments’ (sheesh!!))). The latter pertains to native speaker intuitions, the former to GGers’ theoretical proposals. It is a surprising and very useful fact that native speakers have relatively stable, converging judgments about the acceptability (under an interpretation) of linguistic forms over a pretty wide domain of linguistic stimuli. This need not have been the case, but it is. Moreover, this capacity to discriminate among different linguistic examples and to rate them, comparatively and consistently, over a large domain has proven to be a very good probe into the (invisible underlying) G structure that GGers have postulated is involved in linguistic competence. So for lots of GG research (the bulk of it, I would estimate) the road to grammaticality has been paved by acceptability. As I’ve mentioned before (and will do so again here), we should be quite surprised that a crude question like “how does this sound (with this meaning)?” has been able to yield so much. IMO, it strongly suggests that FL is a (relatively) modular system (and hence immune to standard kinds of interference effects) and a central cognitive component of human mental life (which is why its outputs have robust behavioral effects). At any rate, acceptability’s nice properties make life relatively easy for GGers like me, as they allow me/us to wallow in experimental crudity without paying too high an empirical price.[1]

That is the good news. Now for some bad. The fact that acceptability judgments are fast and easy does not mean that they can be treated cavalierly. Not all acceptability judgments are equally useful. The good ones control for the non-grammatical factors that we all know affect acceptability, and they generally do so by exploiting minimal pairs. Sadly, one problem with lots of work in syntax is its lack of fastidiousness concerning minimal pairs. Let’s consider for a moment why this is a problem.

If acceptability is our main empirical probe into grammaticality, and it is understood that acceptability is multivariate, with grammaticality being but one factor among many contributing to acceptability, then isolating what the grammar contributes to an acceptability judgment requires controlling for all acceptability effects that are not grammatically induced. So, the key to the acceptability judgment methodology is to bend over backwards to segregate those factors that we all know can affect acceptability but cannot be traced to grammaticality. And it is the practicing GGer that needs to worry about the controls, because speakers cannot be trusted to do so, as they have no special conscious insight into their grammatical knowledge (they cannot tell us reliably why something sounds unacceptable or whether that is because their G treats it as ungrammatical).[2] And that is where minimal pairs come in. They efficiently function to control for non-grammatical factors like length, lexical frequency, pragmatic appropriateness, semantic coherence, etc. Or, to put this another way: to the degree that I can use largely the same lexical items, in largely the same order, I can control for features other than structural difference and thereby focus on G distinctions as the source for whatever acceptability differences I observe. This is what good minimal pairs do, and it is what makes minimal pairs the required currency of grammatical commerce. Thus, when they are absent, suspicion is warranted, and best practice would encourage their constant use. In what follows I would like to illustrate what I have in mind by considering a relatively hot issue nowadays: the grammatical status of Island Effects (IE) and how minimal pairs, correctly deployed, render a lot of the argument against the grammatical nature of island effects largely irrelevant. I will return to this theme at the end.

To get started, let’s consider an early example from Chomsky (1964: Current Issues). He observes that (1) is three ways ambiguous. It has the three paraphrases in (2).

1.     John watched a woman walking to Grand Central Station (GCS)
2.     a. John watched a woman while he was walking to GCS
b. John watched a woman that was walking to GCS
c. John watched a woman walk to GCS

The ambiguities reflect structural differences that the same sequence of words can have. In (2a), walking to GCS is a gerundive adjunct and John is the controller of the subject PRO.[3] In (2b), a woman walking to GCS is a reduced relative clause, with walking to GCS an adjunct modifying the head woman. In contrast to the first reading, a woman walking to GCS forms a nominal constituent. In the third reading, a woman walking to GCS is a gerundive clausal complement of watch depicting an event. It is thematically similar to, but aspectually different from, the naked infinitive small clause provided in (2c). Thus, the three-way ambiguity witnessed in (1) is the product of three different syntactic configurations that this string of words can realize, and that is made evident in the paraphrases in (2).

Chomsky further notes that if we WH move the object of to (optionally pied-piping the preposition), all but the third reading disappear:

3.     a. Which train station did John watch a woman walking to
b. To which train station did John watch a woman walking

Given what we know about islands and movement, this should not be surprising. Temporal adjuncts resist WH extraction (CED effects), as do relative clauses (CNPC). Clausal complements do not. Thus, we predict that movement of (to) which train station from (1) with structures analogous to (2a,b) should be illicit, while movement from (1) with a complement structure like (2c) should be fine. That is, we expect the movement to factor out all but one of the readings we find with (1). And this is what occurs.

Note that this explanation of the loss of all but one reading coincides with the fact that all but the third paraphrase in (2) resist WH extraction:

4.     a. *(To) which train station did John watch a woman while he was walking (to)
b. *(To) which train station did John watch a woman who was walking (to)
c.  (To) which train station did John watch a woman walk (to)

Thus the reason that (3) becomes monoguous under WH movement is the same reason that (4a,b) are far more unacceptable than (4c). This argues that the unacceptability wrt these sentences ((un)acceptability under an interpretation for (1) and tout court for (4)) implicates a syntactic source precisely because other plausible factors are controlled for, and they are controlled for because we have used the same words, in the same order, thereby varying only the grammatical structures that they realize.[4]

We can go a little further, IMO. Note that the dependent measure in (4) is relative acceptability, with (4c) as baseline. But note that in this case the items compared are not identical. The fact that we get the same effects in (1)/(3) as we do in (2)/(4) argues that the data in (4) reflect structural differences and not the extraneous vocabulary items that differ among the examples. Furthermore, the absence of the two illicit readings in (3) is quite clear. It is often asserted that acceptability judgments are murky and can be trivially enhanced/degraded by changing the WHs moved or the intervening lexical items. Perhaps. Here we have a case where the facts strike me as particularly clear. Only the event reading survives the extraction. The other ones disappear, which is exactly what a standard theory of islands would predict. This, I believe, is typical of well-constructed minimal pair cases: the dependent measure will often be the availability of a reading and, interestingly, the presence/absence of a reading is often more perspicuous for native speakers than is a more direct relative acceptability judgment.

I would like to consider one more case for illustration. This involves near minimal pairs rather than identical strings. What the above Chomsky case provides evidence for (rather clear evidence, IMO) is that G structure matters for extraction. It shows this by factoring out everything but such structure as the relevant variable. However, it does not factor out one important variable: meaning. Sentence (1) has three readings in virtue of having three different syntactic structures. So, the argument cannot single out whether the relevant factor is syntactic or semantic. Does the difference under WH movement reflect the effects of formal grammatical structure (syntax) or of meaning (semantics)? As the two vary together in these cases, it is impossible to pull them apart. What we need to focus on here are structures that are semantically the same but syntactically different. And this is very hard to find. However, it is not quite impossible. Let me discuss a (near) minimal pair involving event complements.[5]

Consider the following two sets of sentences:

5.     a. Mary heard the sneaky burglar clumsily attempt to open the door
b. Mary heard the sneaky burglar’s clumsy attempt to open the door
c. What1 did Mary hear the sneaky burglar clumsily attempt to open t1
d. What1 did Mary hear the sneaky burglar’s clumsy attempt to open t1

6.     a. Mary heard someone clumsily attempt to open the door
b. Mary heard a clumsy attempt to open the door
c. What1 did Mary hear someone clumsily attempt to open t1
d. What1 did Mary hear a clumsy attempt to open t1

The main difference between (5) and (6) is that the latter tries to control for definiteness effects in nominals. What is relevant here is that both sets of cases distinguish the acceptability of the c cases from the d cases, with the former being judged better than the latter using standard Sprouse-like techniques (i.e. we find a super-additivity effect for (5d)/(6d)). Why is this interesting?

Well, note that the near minimal pairs have a common semantics. Perception verbs take eventive internal arguments. These can come in either a clausal ((5a,c)/(6a,c)) or a nominal ((5b,d)/(6b,d)) flavor. The latter should show island effects under movement, given standard subjacency reasoning. In sum, these examples control for semantic effects by identifying them across the two syntactic structures, yet we still find the super-additivity signature characteristic of islands. This argues for a syntactic (rather than a semantic) conception of islands, for this is the one factor we varied in these near minimal pairs, the meaning having been held constant across the a/b and c/d examples.
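For readers unfamiliar with the super-additivity logic, the Sprouse-style measure is an interaction term in a 2×2 design: the acceptability penalty for extracting out of the nominal (island) structure, over and above the independent penalties for extraction and for the nominal structure on their own. Here is a minimal sketch of that differences-in-differences computation; the mean ratings are made up purely for illustration, not data from the Dillon joint work:

```python
# Super-additivity (differences-in-differences) score for a 2x2 island design.
# Factors: STRUCTURE (clausal vs nominal complement) x EXTRACTION (no WH vs WH),
# mirroring the a/b (no movement) and c/d (WH movement) cases in (5)-(6).
# Ratings are hypothetical z-scored acceptability means; higher = more acceptable.
ratings = {
    ("clausal", "no_wh"):  0.9,   # cf. (6a)
    ("nominal", "no_wh"):  0.8,   # cf. (6b)
    ("clausal", "wh"):     0.5,   # cf. (6c)
    ("nominal", "wh"):    -0.7,   # cf. (6d)
}

def superadditivity(r):
    """Interaction term: the extraction penalty inside the nominal structure
    minus the extraction penalty inside the clausal structure. A large
    positive value is the island signature."""
    drop_clausal = r[("clausal", "no_wh")] - r[("clausal", "wh")]
    drop_nominal = r[("nominal", "no_wh")] - r[("nominal", "wh")]
    return drop_nominal - drop_clausal

print(round(superadditivity(ratings), 2))  # positive: island effect
```

The point of the design is that simple main effects (extraction is a bit degrading, nominals are a bit degrading) are subtracted out; only the extra, interactive penalty counts as an island effect.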

Howard Lasnik is constantly reminding those around him how important minimal pairs are in constructing a decent grammatical argument. He notes this because it is not yet second nature for GGers to employ them. And he is right to insist that we do so, for the reasons outlined above. Doing so allows us to make our arguments cleaner and to control for plausible interfering factors. Minimal pairs are the nod we give to the fact that acceptability judgments are little experiments, with all the confounds that experiments bring with them. Minimal pairs are the price we pay for using acceptability judgments to probe grammatical structure. As Chomsky noted long ago in Syntactic Structures, these sorts of judgments can really get you deep into a G structure very efficiently. They are an indispensable part of linguistic theorizing. However, to do their job well, we must understand their logic. We must understand that theories of grammar are not theories of acceptability and that there is a gap between acceptability (a term of art for describing data) and grammaticality (a term of art for describing the products of generative procedures). Happily, the gap can be bridged and acceptability can be fruitfully used. But jumping that gap means controlling for extraneous factors that impact acceptability. And that is why minimal pairs are critical. Deployed well, they allow us to control the hell out of the data and zero in on the grammatical factors of linguistic interest. So, let’s hear it for minimal pairs and let’s all promise to use them in all of our papers and presentations from now on. Pledges to do so can be sent to me written on a five dollar bill c/o the ling dept at UMD.

[1] Jon Sprouse and friends have shown roughly this: that crude methods are fine as they converge with more careful ones.
[2] If undergrads are to be believed virtually all unacceptability stems from semantic ill-formedness. If asked why some form sounds off you can bet dollars to doughnuts that an undergrad will insist that it doesn’t mean anything, even when telling you what it in fact means.
[3] Which, you all know, does not exist but is actually a copy/occurrence of John due to sidewards internal merge. And yes, this is an unpaid political announcement.
[4] Note the use of ‘grammatical’ rather than ‘syntactic.’ These cases implicate structure but as syntactic structure and semantic interpretation co-vary we cannot isolate one or the other as the relevant causal element. We return to this with the second example of a minimal pair below.
[5] This is joint work that I did with Brian Dillon. He did most of the heavy lifting and deserves the lion’s share of the credit. It is published here.

Tuesday, October 31, 2017

Scientific myths

Like any other organized body of doctrine, science has its founding myths. Modern science has two central ones: (i) that there exists a non-trivial scientific method (SM)[1] and (ii) that theory in science plays second fiddle to observation/experiment. Combine (i) and (ii) and we reach a foundational principle: good science practice discards theories when they clash with experimental observation. And, maybe just as important, bad science involves holding onto theories that conflict with experiment/observation. Indeed, from the perspective of SM, perhaps the cardinal irrationality is theoretical obstinacy in the face of recalcitrant data.[2]

There are several reasons for the authority of this picture. One is that it fits snugly with the Empiricist (E) conception of knowledge. As I’ve noted before (much too often for many of you, I suspect), E is both metaphysically (see here) and epistemologically (see here) suspicious of the kind of generalizations that are required by theory.

Metaphysically, for E, observations come first and generalizations second, the latter being summations of the former. As theories, at least good ones, rest on generalizations, and as generalizations are only as good as the data that they “generalize,” it is not surprising that when theory and data come into conflict, theories are the things that must yield. Theories and laws are, for E, useful shorthand compendia of the facts/data/observations, and shorthands are useful to the degree that they faithfully reflect that which they hand shortly.

Epistemologically, generalizations are etiologically subsequent to observations. In fact, they are inductively based on them and, in the limit, should do nothing more than summarize them. Good scientific practice should teach how to do this, should teach how to eliminate the “irrationalities” that waylay legit inductions, forcing them away from the data that they are (or should be) built on. So again, in practice, when data and generalization/theory conflict, the problem likely lies with some illegitimate bias tripping up the induction from data to generalization/theory.

There is a less highfalutin reason that makes the twofold myth above attractive to scientists. It lends them authority. On this view, scientists are people who know how to see the world without illusion or distortion. They are privy to a method that allows them to find the truth (or at least not be distracted from it). So armed, scientists have a kind of epistemological expertise that makes their opinions superior to those of the untrained. The speculations of scientists are grounded in and regulated by the facts, unlike the theories (and prejudices) of the unwashed. Observe that here scientific opinion is grounded in reality, and is not just one opinion among many. Being scientific endows legitimacy. Being non-scientific removes it.

This thinking is a holdover from the old demarcation debates between science and non-science (i.e. religion, ethics, prejudice, etc.) that the Positivists loved to engage in. Within philosophy proper, the idea that there exists a sharp demarcation between science and non-science is not well regarded anymore. In fact, it has proven very difficult to find any non-circular way of establishing what belongs on either side of the divide. But scientists don’t always hold the wisdom of philosophers in high regard, especially when, IMO, it serves to puncture their self-regard and forces them to reconsider the degree to which their scientific credentials entitle them to automatic deference in the public sphere. Most everyone enjoys the deference that legitimate authority confers, and science’s access to such deference revolves around the combo of (i) and (ii) above. The bottom line is that the myth buttresses a very flattering view: scientists think more clearly and so see better because their views are based in the facts, and so they deserve deferential respect.

So here are two reasons that the founding myth has proven so strong. But there are other reasons too. We tend to tell our stories (at least our mythical ones) about how science advances largely in terms that fit this picture. A recent short Aeon paper discusses one such founding myth, involving that great scientific hero Galileo and the Copernican world view. Btw, I am one of those that count both as heroes. They really were pretty great thinkers. But as this paper notes, what made them such is not that they were unafraid to look at the facts while their opponents were mired in prejudice. Nope. There was a real debate, a scientific one, based on a whole series of interacting assumptions, both empirical and theoretical. And given the scientific assumptions of the time, the pushback against Galileo’s Copernicanism was not irrational, though it proved to be wrong.[3]

The hero of the short piece is one Johann Locher. He was a Ptolemaist and believed that the earth was the center of the universe. But he made his case in purely scientific terms, the biggest problem for the Copernican vision being the star-size problem, which it took some advances in optics to square away. But, as the piece makes clear, this is not the view we standardly have. The myth is that opposition to Galileo/Copernicus involved disregard for the facts driven by religious prejudice. This convenient account is simply false, though part of the reason it became standard is Galileo’s terrific popular polemic in his Dialogue Concerning the Two Chief World Systems.

Christopher Graney, the author of the short piece, thinks that one baleful result of the scientific caricature Galileo unleashed is today’s science skepticism. He believes that today’s skeptics “… wrap themselves in the mantle of Galileo, standing (supposedly) against a (supposedly) corrupted science produced by the ‘Scientific Establishment’” (3). This may be so. But I doubt that this is the only, or most important, problem with the myth. The real problem (or another problem) is that the myth sustains a ghostly version of the demarcation criterion among working scientists. Here’s what I mean.

As I’ve noted before, there is a general view among scientists that data trumps, or should trump, theory. The weak version of this is unassailable: when data and theory clash, this constitutes a prima facie problem for the theory. But this weak version is compatible with another weak view: when data and theory clash, this constitutes a prima facie problem for the data. When there is a clash, all we know, if the clash is real, is that there is either a problem with the theory or a problem with the data, and that is not knowing very much. No program of action follows from this. Is it better to drop/change the theory to handle the data or to reanalyze the data to save the theory? Dunno. The clash tells us nothing. However, due to the founding myth, the default view is that there is something wrong with the theory. This view, as I’ve noted, is particularly prevalent in linguistics IMO and leads the field to dismiss theory and to exalt description over explanation. So, for example, missing a data point is considered a far worse problem than having a stilted explanation. Ignoring data is being unscientific. Eschewing explanation is just being cautious. The idea really is that facts/data have an integrity that theories do not. This asymmetric attitude is a reflection of science’s founding myths.

So where does this leave us? The aim of science is to understand why things are as they are. This involves both data and theory in complex combinations. Adjudicating between accounts requires judgment that rarely delivers unequivocal conclusions. The best we can do is hold onto two simple dicta and try to balance them: (i) never believe a theory unless grounded in the facts and (ii) never believe a fact unless grounded in a theory. It is the mark of a truly scientific temperament, in my view, that it knows how to locally deploy these two dicta to positive effect in particular circumstances. Unfortunately, doing this is very difficult (and politically it is not nearly as powerful as holding the first of these exclusively). As Graney notes, “science has always functioned as a contest of ideas,” not just a contest of competing observations and data points. Facts (carefully curated) can tell us how things are. But scientific explanation aims to explain how things must be, and for this, facts are not enough.

[1] By this I mean a substantive set of precepts rather than cheers of the sort “do your best in the circumstances at hand,” or, as Percy Bridgman said, “use your noodle and no holds barred.”
[2] The reply to this is well known: a theory is responsible for relevant data and what exactly counts as relevant often requires theory to determine. But I put such niceties aside here.
[3] A most amusing discussion of this period can be found in Feyerabend’s writings. He notes that there were many reasons to think that looking through a telescope was hardly an uncontroversial way of establishing observations. He is also quite funny and does a good job, IMO, of debunking the idea that a non-trivial SM exists. By non-trivial I intend something other than “do your best in the circumstances.”

Tuesday, October 24, 2017

So not only me on why only us

The facts are clear: nothing does language like humans do. Nothing even comes close. I've repeatedly made this point. But it comes with pushback, often from Darwinian acolytes who insist that this cannot be so. Such qualitative divides, the thinking goes, are biologically unbridgeable, and so what I deem obvious cannot be so. It must be that other animals also do language, at least in part, and that what we do is just a souped-up version of what they do.

I mention this because every now and then I come across an evolutionary biologist who sees exactly what I do: that we are linguistically unique. And who sees this in roughly the way I do: as obvious! Here is a recent discovery.

Massimo Pigliucci has a blog, Footnotes to Plato (which I recommend btw), and here he discusses various issues in biology and philosophy. He also gives extended reviews of books. His latest post (here) discusses a recent book by Kevin Laland that touches on the topic of human uniqueness. Not only does nothing do language like we do, but nothing does culture like we do and nothing does mind reading like we do and ... (no doubt all of these facts are related, though how is as yet unclear). At any rate, the facts are clear: "...if a complex mind, language and a sophisticated culture are truly advantageous for survival and reproduction, why did they evolve only in the human lineage?" (1).

Thems the facts. The biological problem is how to explain this. A good first step involves understanding the contours of the problem and this involves recognizing the obvious.

It will also require more: precisely identifying those properties that we have that are unique. If it is language, then what about language is species specific? You know the MP line; it's recursion. But there may be a lot more (e.g. the labile nature of our lexical items). And once one has identified these features we need to ask what mental powers they require. These are first steps towards a rational discussion, not final ones.  Sadly, they are rarely taken. But don't believe me on this. Read Pigliucci's post and his discussion of the push back one gets from spotting the obvious.
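To make the recursion point a bit more concrete, here is a minimal toy sketch (my own illustration, not anything from Pigliucci's post or the MP literature): a single binary combinatoric operation in the spirit of Merge that, by reapplying to its own output, builds unboundedly deep structures from a finite stock of atoms.

```python
# Toy illustration of recursion via one combinatoric operation.
# "merge" here is just binary pair formation; the names and the
# example sentence are hypothetical, chosen for illustration only.

def merge(a, b):
    """Combine two syntactic objects into a new syntactic object."""
    return (a, b)

def depth(obj):
    """Depth of embedding in a merged structure (atoms have depth 0)."""
    if isinstance(obj, tuple):
        return 1 + max(depth(x) for x in obj)
    return 0

# Build [John [saw [the dog]]] by repeated application of merge.
np = merge("the", "dog")
vp = merge("saw", np)
s = merge("John", vp)

print(s)         # ('John', ('saw', ('the', 'dog')))
print(depth(s))  # → 3; each reapplication adds a level, without bound
```

The point of the sketch is only that a single operation, closed over its own output, suffices for unbounded hierarchical structure; nothing about the lexical atoms themselves does any of the work.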

So, take a look at the post and at the book (something I have not yet done but intend to do). It looks like there may be someone worth talking to out there.

Monday, October 23, 2017

The future of (my kind of) linguistics

I have been pessimistic of late concerning the fate of linguistics. It’s not that I think it is in intellectual trouble (I actually cannot think of a more exciting period of linguistic research), but I do think that the kind of linguistics I signed up for as a youth is currently lightly prized, if at all. I have made no secret of this view. I even have a diagnosis. I believe that the Minimalist Program (MP) has forced to the surface a tension that has been inchoate in the field since its inception 60 or so years ago. Sociologically, within the profession, this tension is being resolved in ways that disfavor my conception of the enterprise. You have no doubt guessed where the tension resides: the languist-linguist divide. Languists and linguists are interested in different problems and objects of study. Languists mainly care about the subtle ways that languages differ. Linguists mainly care about the invariances and what these tell us about the overarching capacities that underlie linguistic facility. Languists are typologists. Linguists are cognitivists.

Before the MP era, it was pretty easy to ignore the different impulses that guide typological vs cognitive work (see here for more discussion). But MP has made this harder, and the field has split. And not evenly.  The typologists have largely won, at least if one gauges this by the kind of work produced and valued. The profession loves languages with all of their intricacies and nuances. The faculty of language, not so much. As I’ve said many times before, and will repeat again here, theoretical work aimed at understanding FL is not highly valued (in fact, it is barely tolerated) and the pressures to cover the data far outweigh demands to explain it. This is what lies behind my pessimistic view about the future of (my kind of) linguistics. Until recently. So what happened? 

I attended a conference at UMD sponsored by BBI (Brain and behavior initiative) (here). The workshop brought together people studying vocalization in animals and linguists and cog-neuro types interested in language.  The goal was to see if there was anything these two groups could say to one another. The upshot is that there were potential points of contact, mainly revolving around sound in natural language, but that as far as syntax was concerned, there is little reason to think that animal models would be that helpful, at least at this point in time. Given this, why did I leave hopeful?  Mainly because of a great talk by David Poeppel that allowed me to glimpse what I take to be the future of my brand of linguistics. I want to describe to you what I saw.

Cog-neuro is really really hard. Much harder than what I do. And it is not only hard because it demands mastery of distinct techniques and platforms (i.e. expensive toys) but also because (and this is what David’s talk demonstrated) to do it well presupposes a very solid acquaintance with results on some branch of cognition. So to study sound in humans requires knowing a lot about acoustics, brain science, computation, and phonology. This, recall, is a precondition for fruitful inquiry, not the endpoint. So you need to have a solid foundation in some branch of cognition and then you need to add to this a whole bunch of other computational, statistical, technical and experimental skills. One of the great things about being a syntactician is that you can do excellent work and still be largely technically uneducated and experimentally inept. I suspect that this is because FL is such a robust cognitive system that shoddy methods suffice to get you to its core general properties, which is the (relatively) abstract level that linguists have investigated. Descending into wetware nitty gritty demands loosening the idealizations that the more abstract kind of inquiry relies on and this makes things conceptually (as well as practically) more grubby and difficult. So, it is very hard to do cog-neuro well. And if this is so, then the aim of cognitive work (like that done in linguistics) is to lighten cog-neuro’s investigative load. One way of doing this is to reduce the number of core operations/computations that one must impute to the brain. Let me explain.

What we want out of a cog-neuro of language is a solution to what Embick and Poeppel call the mapping problem: how brains execute different kinds of computations (see here). The key concept here is “the circuit,” some combination of brain structures that embody different computational operations. So part of the mapping problem is to behaviorally identify the kinds of operations that the brain uses to chunk information in various cognitive domains and to figure out which brain circuits execute them and how (see here for a discussion of the logic of this, riffing on a paper by Dehaene and friends). And this is where my kind of linguistics plays a critical role. If successful, Minimalism will deliver a biologically plausible description of all the kinds of operations that go into making a FL. In fact, if successful, it will deliver a very small number of operations, very few of which are language specific (one? Please make it one!), that suffice to compute the kinds of structures we find in human Gs. In this context, the aim of the Minimalist Program (MP) is to factor out the operations that constitute FL and to segregate the cognitively and computationally generic ones from the more bespoke linguistic ones. The resulting descriptive inventory provides a target for the cog-neuro types to shoot at.

Let me say this another way. MP provides the kind of parts list Embick and Poeppel have asked for (here) and identifies the kinds of computational structures that Dehaene and company focus on (here). Putting this another way, MP descriptions are at the right grain for cog-neuro redemption. It provides primitives of the right “size” in contrast to earlier (e.g. GBish) accounts and primitives that in concert can yield Gs with GBish properties (i.e. ones that have the characteristics of human Gs).

So that’s the future of my brand of linguistics, to be folded into the basic wisdom of the cog-neuro of language. And what makes me hopeful is that I think that this is an attainable goal. In fact, I think that we are close to delivering a broadly adequate outline of the kinds of operations that go into making a human FL (or something with the broad properties of our FL) and separating out the linguistically special from the cognitively/computationally generic. Once MP delivers this, it will mark the end of the line of investigation that Chomsky initiated in the mid 1950s into human linguistic competence (i.e. into the structure of human knowledge of language). There will, of course, be other things to do and other important questions to address (e.g. how do FLs produce Gs in real time? How do Gs operate in real time? How do Gs and FLs interact with other cognitive systems?) but the fundamental “competence” problems that Chomsky identified over 60 years ago will have pretty good first order answers.

I suspect that many reading this will find my views delusional, and I sympathize. However, here are some reasons why I think this.

First, I believe that the last 20 years of work has largely vindicated the GB description of FL. I mean this in two ways: (i) the kinds of dependencies, operations, conditions and primitives that GB identified have proven to be robust in that we find them again and again across human Gs; (ii) these dependencies, operations, conditions and primitives have also proven to be more or less exhaustive in that we have not found many additional novel ones despite scouring the world’s Gs (i.e. over the last 25 years we have identified relatively few new potential universals). What (i) and (ii) assert is that GB identified more or less all the relevant G dependencies and (roughly) accurately described them. If this is correct (and I can hear the howls as I type), then MP investigations that take these to be legitimate explananda (in the sense of providing solid probes into the fundamental structure of FL) are on solid ground, and explaining these features of FL will suffice to explain why human FLs have the features they do. In other words, deriving GB in a more principled way will be a solid step in explaining why FL is built as it is and not otherwise.

Second, perhaps idiosyncratically, I think that the project of unifying the modules and reducing them to a more principled core of operations and principles has been quite successful (see the three-part discussion ending here). As I’ve argued before, the principal criticisms I have encountered wrt MP rest on a misapprehension of what its aims are. If you think of MP as a competitor to GB (or LFG or GPSG or Construction Grammar or…) then you’ve misunderstood the point of the program. It does not compete with GB. It cannot, for it presupposes it. The aim is to explain GB (or its many cousins) by deriving its properties in a more principled and perspicuous way. This would be folly if the basic accuracy of GB were not presupposed. Furthermore, MP so understood has made real progress IMO, as I’ve argued elsewhere. So GB is a reasonable explanandum given MP aims, and Minimalist theories have gone some way in providing non-trivial explanantia.

Third, the MP conception has already animated interesting work in the cog-neuro of language. Dehaene, Friederici, Poeppel, Moro and others have clearly found the MP way of putting matters tractable and fecund. This means that they have found the basic concepts engageable, and this is what a successful MP should do. Furthermore, this is no small thing. This suggests that MP “results” are of the right grain (or “granularity” in Poeppel parlance). MP has found the right level of abstraction to be useful for cog-neuro investigation and the proof of this is that people in this world are paying attention in ways that they did not do before. The right parts list will provoke investigation of the right neural correlates, or at least spur such an investigation.

Say I am right. What comes next? Well, I think that there is still some theoretical work to do in unifying the modules and then investigating how syntactic structures relate to semantic and phonological ones (people like Paul Pietroski, Bill Idsardi, Jeff Heinz, and Thomas Graf are doing very interesting work along these lines). But I think that this further work relies on taking MP to have provided a pretty good account of the fundamental features of human syntax.

This leaves, as the next big cognitive project, figuring out how Gs and FL interact with other cognitive functions (though be warned, interaction effects are very tough to investigate!). And here I think that typological work will prove valuable. How so?

We know that Gs differ, and appear to differ a lot. The obvious question revolves around variation: how does FL build Gs that have these apparently different features (are they really different or only apparently so? And how are the real differences acquired and used?). Studying the factors behind language use will require having detailed models of Gs that differ (I am assuming the standard view that performance accounts presuppose adequate competence models). This is what typological work delivers: solid detailed descriptions of different Gs and how they differ. And this is what theories of G use require as investigative fodder.

Moreover, the kinds of questions will look and feel somewhat familiar: is there anything linguistically specific about how language is used or does language use exploit all the same mechanisms as any other kind of use once one abstracts from the distinctive properties of the cognitive objects manipulated? So for example, do we parse utterances differently than we do scenes? Are there linguistic parsers fitted with their own special properties or is parsing something we do pretty much in the same way in every domain once we abstract away from the details of what is being parsed?[1] Does learning a G require different linguistically bespoke learning procedures/mechanisms? [2] There is nothing that requires performance systems to be domain general. So are they? Because this kind of inquiry will require detailed knowledge of particular Gs it will allow for the useful blurring of the languistics/linguistics divide and allow for a re-emergence of some peaceful co-existence between those mainly interested in the detailed study of languages and their differences and those interested in the cognitive import of Gs.

Let me end this ramble: I see a day (not that far off) when the basic questions that launched GG will have been (more or less) answered. The aim will be achieved when MP distills syntax down to something simple enough for the cog-neuro types to find in wetware circuits, something that can be concisely written onto a tee shirt. This work will not engage much with the kind of standard typological work favored by working linguists. It addresses different kinds of questions.

Does this mean that typological work is cognitively idle? No, it means that the kinds of questions it is perfect for addressing are not yet being robustly asked, or at least not in the right way. There are some acquisitionists (e.g. Yang, Lidz) that worry about the mechanisms that LADs use to acquire different Gs, but there is clearly much more to be done. There are some that worry about how different Gs differentially affect parsing or production. But, IMO, a lot of this work is at the very early stages and it has not yet exploited the rich G descriptions that typologists have to offer. There are many reasons for this, not the least of which is that it is very hard to do and that typologists do not construct their investigations with the aim of providing Gs that fit these kinds of investigations. But this is a topic for another post for another time. For now, kick back and consider the possibility that we might really be close to having answered one of the core questions in GG: what does linguistic knowledge consist in?

[1] Jeff Lidz once put this as the following question: is there a linguistic parser or does the brain just parse? On the latter view, parsing is an activity that the brain does using knowledge it has about the objects being parsed. On the former view, linguistic parsing is a specific activity supported by brain structure special to linguistic parsing. There is actually not much evidence that I am aware of that parsing is dedicated. In this sense there may be parsing without parsers, unless by parser you mean the whole mind/brain.
[2] Lisa Pearl’s thesis took this question on by asking whether the LAD is built to ignore data from embedded clauses or if it just “happens” to ignore it because it is not statistically robust. The first view treats language acquisition as cognitively special (as it comes equipped with blinders of a special sort), the latter as like everything else (rarer things are causally less efficacious than more common things). Lisa’s thesis asked the question but could not provide a definitive answer though it provided a recipe for a definitive answer.