An ERP study on L2 syntax processing: When do learners fail?

ORIGINAL RESEARCH ARTICLE
published: 25 September 2014
doi: 10.3389/fpsyg.2014.01072
Frontiers in Psychology | Language Sciences, September 2014 | Volume 5 | Article 1072 (www.frontiersin.org)

Nienke Meulman [1]*, Laurie A. Stowe [1], Simone A. Sprenger [1], Moniek Bresser [2] and Monika S. Schmid [1,3]

[1] Center for Language and Cognition, University of Groningen, Groningen, Netherlands
[2] Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
[3] Department of Language and Linguistics, University of Essex, Colchester, UK

Edited by: Christos Pliatsikas, University of Kent, UK
Reviewed by: Christos Pliatsikas, University of Kent, UK; Eleonora Rossi, Penn State University, USA
*Correspondence: Nienke Meulman, Center for Language and Cognition, University of Groningen, Oude Kijk in ’t Jatstraat 26, PO Box 716, 9700 AS Groningen, Netherlands. e-mail: n.meulman@rug.nl

Event-related brain potentials (ERPs) can reveal online processing differences between native speakers and second language (L2) learners during language comprehension. Using the P600 as a measure of native-likeness, we investigated processing of grammatical gender agreement in highly proficient immersed Romance L2 learners of Dutch. We demonstrate that these late learners consistently fail to show native-like sensitivity to gender violations. This appears to be due to a combination of differences from the gender marking in their L1 and the relatively opaque Dutch gender system. We find that L2 use predicts the effect magnitude of non-finite verb violations, a relatively regular and transparent construction, but not that of gender agreement violations. There were no effects of age of acquisition, length of residence, proficiency or offline gender knowledge. Additionally, a within-subject comparison of stimulus modalities (written vs. auditory) shows that immersed learners may show some of the effects only in the auditory modality; in non-finite verb violations, an early native-like N400 was only present for auditory stimuli. However, modality failed to influence the response to gender. Taken together, the results confirm the persistent problems of Romance learners of Dutch with online gender processing and show that they cannot be overcome by reducing task demands related to the modality of stimulus presentation.

Keywords: second language acquisition, grammatical gender agreement, event-related potentials (ERPs), P600, modality, immersion

INTRODUCTION

Second language (L2) acquisition of many aspects of syntactic structure is known to be difficult, especially when acquisition starts later in life. A major question being debated in the literature is to what extent and under what circumstances late L2 speakers can become native-like with respect to syntax processing (e.g., Clahsen and Felser, 2006; White, 2007). The evidence is mixed; in some cases this does seem to be possible, while in other cases, it is difficult or impossible. A number of factors have been suggested to play a role in this variation, but two which have received relatively little attention are the difficulty of the target grammatical system and the potential role of modality of testing (written vs. auditory presentation). The present study investigates whether event-related potential (ERP) measures of native-likeness used in this line of research might be partially dependent on stimulus modality, as this might explain some of the inconsistency in the literature.

A structure that has frequently been used to test native-like attainment in the L2 is grammatical gender, since it has been shown to pose a major challenge to L2 learners (e.g., Hawkins, 2001; White et al., 2001; Sabourin, 2003; Blom et al., 2008). Demonstrating gender processing that is comparable to that of natives therefore forms a strong test for L2 syntax acquisition. Grammatical gender is a classification system for nouns (e.g., masculine and feminine in French, or masculine, feminine and neuter in German) which allows speakers to establish syntactic cohesion between the elements in a phrase through agreement. Because the gender of a word is typically not predictable from its meaning, learning grammatical gender involves acquiring both the knowledge of a word’s gender (gender assignment) and of how gender is expressed syntactically (gender agreement or concord). Therefore, L2 learners must tag each new lemma with its corresponding gender and learn which grammatical elements in the context have to agree with it. For example, in Dutch all nouns are assigned to either the common or the neuter gender class, and gender concord occurs with determiners and pre-nominal adjectives (e.g., de[def, common] tuin[common], the garden; een[indef] mooie[indef, common] tuin[common], a beautiful garden). During processing, a comprehender must retrieve the noun’s gender fast enough to establish gender concord. The question is (a) whether L2 learners manage to do so, and (b) whether they achieve this using the same processing strategies as native speakers.

Gender processing in L2 has already been the topic of numerous investigations using behavioral measures, such as grammaticality judgments, sentence-picture matching, (elicited) production, and eye tracking (for overviews, see, e.g., Grüter et al., 2012; Hopp, 2013). More recently, researchers have begun to employ ERPs to investigate native-likeness of grammatical gender processing in the L2, because ERPs are known to be highly sensitive to the immediate, unconscious on-line detection and processing of linguistic anomalies (e.g., Osterhout and Holcomb, 1992; Molinaro et al., 2011). Studies using off-line behavioral measures (e.g., White et al., 2001, 2004; Franceschina, 2005) cannot give access to this sort of evidence, which makes interpretation of their results more difficult. Some online techniques such as eye tracking (Dussias, 2010) measure real-time language processing, but do not provide us with the qualitative evidence of potential brain mechanisms that ERPs can. The rationale of such ERP studies is that the more similar the response between native speakers and learners, the more similar the underlying neural and cognitive processing mechanisms. In other words, a comparison of ERPs in native speakers and L2 learners can tell us how native-like the latter really are.

In first language processing, gender and other (morpho)syntactic violations are found to be associated with two primary kinds of components: the left anterior negativity (LAN) and the P600. The LAN has been widely associated with morphosyntactic agreement processes (Münte et al., 1993; Friederici et al., 2000; Molinaro et al., 2011), but others claim that it is a more general index of working memory load (Kluender and Kutas, 1993; Coulson et al., 1998). The P600 has been reported for a range of syntactic and other linguistic violations (e.g., Osterhout and Holcomb, 1992; Hagoort et al., 1993; Münte et al., 1993; Burkhardt, 2007). Given the extremely heterogeneous conditions that elicit a P600, this component cannot be exclusively associated with agreement specifically, or even syntactic processing difficulties more generally, and is therefore often interpreted as a late stage of (re)analysis of information (Osterhout and Holcomb, 1992; Bornkessel-Schlesewsky and Schlesewsky, 2008). It may even reflect a more general process, such as the P300 (Gunter et al., 1997; Coulson et al., 1998; but see Osterhout and Hagoort, 1999; Frisch et al., 2003). There is, however, a strong correlation between the appearance of the P600 effect and grammatical violations. In contrast, findings are more varied with respect to the presence of a LAN. In addition to the LAN and P600, some studies have found an N400, or a biphasic N400-P600 pattern (but no LAN) in response to syntactic violations (see the overview in Molinaro et al., 2011). This is surprising, since the N400 is a component normally associated with difficulty in semantic integration (see Kutas and Federmeier, 2011, for an overview). It has therefore been proposed that an N400 in response to syntactic agreement anomalies is likely to be a result of non-syntactic information that is needed to process the mismatch, for example information that requires lexical access (Molinaro et al., 2011). Because the LAN and N400 are variable in studies of native processing, particularly for gender agreement, we will consider the P600 to be the primary measure of native-likeness, although we will report findings in the time window associated with the LAN/N400 (300–500 ms after presentation) as well.

ERP studies of grammatical gender processing in the L2 have produced mixed results. A number of studies find that, at least under some conditions, sufficiently proficient L2 learners are able to show native-like ERP responses to gender violations. A set of studies investigating L2 processing of French suggests that English, German, and Spanish learners of French can show native-like ERP responses in the form of a P600 effect (Frenck-Mestre et al., 2009; Foucart and Frenck-Mestre, 2011, 2012). The same goes for English and Chinese learners of Spanish (Tokowicz and MacWhinney, 2005; Gillon Dowens et al., 2010, 2011). German and Polish learners of Dutch can also show a P600 in response to gender violations (Sabourin and Stowe, 2008; Loerts, 2012). Despite these consistent results, however, it is clear that this does not generalize to success in all aspects of gender processing, as the English and German learners also failed to respond in a native-like manner to gender in some forms of agreement (Foucart and Frenck-Mestre, 2011, 2012). Stronger yet, Romance learners of Dutch did not show sensitivity to gender agreement anomalies in the form of a P600 effect even in straightforward determiner noun agreement structures (Sabourin and Stowe, 2008). It is unclear why this group failed to exhibit the majority pattern; we will discuss some factors which might have affected their success in somewhat more detail.

One of the factors which has been considered to be central for native-like learning of a late L2 is whether a grammatical element (e.g., gender) is present in the L1. Many studies have focused on this question, but have reached different conclusions. There is some evidence that having a gender system in the L1 might be an advantage when acquiring an L2 gender system (e.g., Bruhn de Garavito and White, 2000; Hawkins, 2001; Franceschina, 2005). This is in favor of models proposing that the L1 restricts L2 acquisition (Hawkins and Chan, 1997). However, there is also evidence of L2 learners without gender systems in their L1 being able to show full acquisition of grammatical gender (White et al., 2001, 2004), which is seen as evidence against such a restriction (Schwartz and Sprouse, 1994, 1996; see also White, 1989; White et al., 2004). The role of the presence vs. absence of gender in the L1 seems at the least to be more complicated than these views suggest, however.

The French and Spanish studies mentioned earlier show that learners with no gender in their L1 (English and Chinese speakers) can show native-like ERP responses. Further, Sabourin and Stowe (2008) find differences between learners from two L1 backgrounds which both have gender: German on the one hand and Romance languages on the other. Sabourin and Stowe themselves attribute their results to the (lack of) similarity between the native and target language of these learners: Dutch gender is in general predictable from the gender of the cognate German word due to their common historical origin, while there is no one-to-one correspondence between Romance and Dutch gender at the lexical level. Moreover, agreement between noun and adjective is more similar in German and Dutch than in the Romance languages and Dutch. Sabourin and Stowe conclude that processing routines, rather than the abstract knowledge that nouns have gender, are transferred from L1 to L2, and that these routines must be similar for successful transfer (see Foucart and Frenck-Mestre, 2011, for a similar argument).

However, an explanation which assumes that similar routines in L1 are necessary for native-like processing does not account for the results of other studies mentioned above showing that even with no gender system in the L1, learners are able to show native-like effects. A different approach to the effects of L1 transfer is formulated within the Competition model (see Bates and MacWhinney, 1987). According to the competition-based account, when the L1 does not contain gender there is no interference. This predicts successful outcomes for learners whose L1 has no gender (Tokowicz and MacWhinney, 2005). However, when existing processing routines are transferred, they will cause interference if they are dissimilar from those required for the L2 (accounting for the failure of the Romance learners of Dutch).

The target language itself may also contribute to the failure of Sabourin and Stowe’s (2008) Romance group to show native-like processing. Most of the successful studies have investigated Romance target languages. Unlike Romance or Slavic languages, which have transparent gender systems (i.e., a predictable gender category based on morphophonological patterns), Dutch is generally regarded as having an opaque gender system (Corbett, 1991; van Berkum, 1996). Although some morphological forms predict the gender of the word, these cues are only available for a relatively small proportion of the vocabulary in the language. This clearly presents a more difficult problem for the learner than gender in a more transparent language, which may well explain why the Romance group in the Sabourin and Stowe study failed to achieve a native-like level.

Neither L1 interference nor target language opaqueness, however, entirely accounts for the results found by Loerts (2012). Her study demonstrates that highly advanced Polish learners of Dutch can show somewhat weak, but native-like ERP responses, even though Polish agreement differs from Dutch. Loerts’ results also show that an opaque system can be learned, although it may be more difficult to learn than a transparent system. Only her most proficient learners showed native-like processing (see Davidson and Indefrey, 2009, for another example of relatively low proficient learners failing to show native-like effects for gender processing in an opaque L2 system), while even fairly low proficient English learners of Spanish have been shown to respond with a clear P600 effect (Tokowicz and MacWhinney, 2005).

An alternative explanation is thus that Sabourin and Stowe’s (2008) Romance learners were simply not proficient enough to show online processing comparable to that of natives. Although the proficiency of the Romance group was not investigated in detail, a similar group of German learners did significantly better when tested on offline gender knowledge (Sabourin, 2003). The Romance participants in the ERP study also performed worse on the end-of-sentence grammaticality judgments collected during the ERP session. It has been shown that proficiency affects brain responses (e.g., Steinhauer et al., 2006; McLaughlin et al., 2010). A replication of the Sabourin and Stowe study with a group of learners as proficient as in the Loerts study can demonstrate whether this is the sole explanatory factor. This is one of the aims of the current study.

However, there is another factor that may have produced the difference between the two Dutch studies, which has thus far been overlooked: testing modality. Unlike virtually all the other studies summarized above, Loerts (2012) tested her Polish learners using auditory sentence presentation. She argues that the learners had acquired their L2 primarily in the auditory modality as emigrants who arrived with no formal training in their new language. Consequently, processing routines may be tuned to the auditory stimulus modality. Indeed, the experience of learning in immersion can be expected to differ substantially from a formal learning environment. Yet, the various populations that have been tested so far differ in this domain. The participants in the Romance studies summarized above included learners with extensive formal training in their L2. In many of the studies there was no immersion (Tokowicz and MacWhinney, 2005; Gillon Dowens et al., 2011) or only minimal immersion during the participants’ recent residence in France (Foucart and Frenck-Mestre, 2011, 2012). Sabourin and Stowe (2008), unlike Loerts, tested a similar late immersion population using visual materials, with each word presented consecutively in the center of the screen. An alternative explanation for the lack of a native-like response in their study could thus be difficulties with the visual presentation. Below, we will speculate about why a visual ERP paradigm might, under some circumstances, be problematic.

In a typical language comprehension ERP paradigm, participants are presented with sentences displayed one word at a time at the center of a screen, at a rate of around two words per second, a technique called rapid serial visual presentation (RSVP). The advantages of this method are that the duration of stimulus presentation can be controlled (and manipulated) tightly, that eye movements, which lead to large artifacts in the EEG, are reduced to a minimum, and that creating the stimulus material and time-locking the brain responses to the presentation of violations in the stimulus is relatively straightforward. Consequently, a large majority of ERP sentence comprehension studies use this method. In contrast, auditory sentence presentation is used much less frequently in ERP research. With spoken stimuli, it is more difficult to control the presentation duration of individual words. In addition, making recordings of spoken sentences is more time consuming and requires tight control of acoustic confounds (e.g., prosodic cues about upcoming information, Dimitrova et al., 2012), as well as timing issues (e.g., setting markers to millisecond precision for the events of interest).

We do not expect to find interesting differences between word-by-word reading and listening for language processing in natives (Müller et al., 1997; Hagoort and Brown, 2000; Balconi and Pozzoli, 2005). In the L1, learners develop fully automatized processing of both modalities; moreover, the auditory representation of language is automatically activated by written materials (Perfetti et al., 1992; Frost, 1998), so that the routines activated during auditory processing can be utilized as well as those specific to the written modality (Homae et al., 2002). Although we expect comparable results for the two modalities in general, consecutive word-by-word presentation in the middle of the screen presents a challenge under some circumstances, even for L1 comprehenders. The optimum speed of presentation is an issue; Hopp (2010) shows that speeded RSVP presentation can make even native speakers break down in their grammaticality judgment ability, making their performance mirror that of L2 learners (see also Camblin et al., 2007, who show a case where speeded RSVP eliminates an effect which is clear in naturally produced connected speech). Conversely, studies directed at optimizing computerized text presentation on small screens have shown that too slow a presentation can also interfere with comprehension (Bernard et al., 2001). This may result from working memory and maintenance issues. Stowe (1991) showed that readers were more likely to garden-path or have difficulty in recovering from a garden path with center of the screen presentation, as opposed to presentation of words across the screen in their normal position, even when readers were allowed to pick their optimum pace.

L2 learners differ in a number of ways from native speakers, some of which can be expected to interact with modality. First, their cumulative reading experience in the L2 is likely to be substantially lower than that of native speakers. This means that their activation of the L2 via this modality can be expected to be less automatized than in native speakers (Koda, 1996). Second, interference from the writing system of the first language may lead to even less activation of the phonological form of the L2, in comparison with natives (Koda, 1999). These differences can potentially play a role for all L2 learners, but may be especially relevant for learners with less formal instruction in the language and in whom learning took place primarily via the auditory modality. The optimum speed of presentation is also likely to differ between various groups of learners and natives. This issue has received relatively little attention in the literature, but given that stimulus modality was one difference between the unsuccessful Romance group reported by Sabourin and Stowe (2008) and the relatively more successful group studied by Loerts (2012), this factor was included in the current experiment in order to determine whether it explains the different patterns seen in the two studies. A clear effect of modality would suggest that researchers need to pay more attention to this variable in their experimental designs, and might have implications for the differences between immersed and instructed learners as well.

Summarizing, the goal of the current study is to gain more insight into why some groups may show persistent problems in attaining native-like processing of grammatical gender. We investigate grammatical processing in immersed Romance L2 learners of Dutch, using the P600 as a measure of native-likeness, in order to answer the question whether late L2 learners can show native-like syntactic processing, even if the gender marking in the L1 differs from that in the L2, which may cause interference, and the L2 gender system is relatively opaque, making it harder to recognize the grammatical agreement regularities. Following Sabourin and Stowe (2008), in addition to gender violations, which have proven difficult to master, we present our participants with non-finite verb violations, a construction that is relatively easy to acquire, as a baseline for comparison. We compare the responses of highly proficient Romance learners with those of native speakers of Dutch. Additional measures of proficiency will be gathered from the learners. A within-subject comparison of stimulus modalities allows us to determine whether the absence of a P600 effect for gender in the Sabourin and Stowe (2008) study was due to processing demands associated with the task modality.

In addition to standard group analyses of the ERP waveforms, we will closely inspect individual differences within each group. Adding these analyses has several benefits. First, lack of effects in grand mean ERP results does not necessarily mean that none of the individuals showed a native-like ERP response. Rather, a null effect might be based on opposite effects (a positive going effect in one set of individuals and a negative going effect in others) canceling each other out. In a similar way, biphasic responses can be a spurious result of averaging (Osterhout, 1997; Nieuwland and Van Berkum, 2008; Tanner and Van Hell, 2014; Tanner et al., 2014). Before we draw any strong conclusion that a group of learners’ processing of gender agreement qualitatively differs from that of natives, it is important to identify varying patterns in each of the groups. Furthermore, there may be predictors of native-likeness in L2 learners, such as age of acquisition, proficiency, language exposure and use, that may explain variance within the group (e.g., Weber-Fox and Neville, 1996, 1999; Rossi et al., 2006; Steinhauer et al., 2009; Tanner et al., 2014). Understanding which individual difference factors, if any, are associated with the outcome in L2 learning is a fundamental question which is difficult to answer with group-based analyses, and might also help us determine the source of some of the mixed patterns of results in L2 gender research.

MATERIALS AND METHODS

PARTICIPANTS

Participant characteristics and proficiency scores can be found in Table 1. Forty-five participants took part in the experiment. Seven participants had to be excluded from the analyses because of too many artifacts in the EEG signal. Nineteen of the remaining participants were Romance learners of Dutch (six French, five Italians, three Romanians, five Spanish). The remaining 19 participants were native speakers of Dutch. All participants were right handed, neurologically unimpaired and did not have any problems with hearing, speaking, or writing. Prior to conducting any procedures, written consent was obtained from all participants for the study, which was approved by the local ethics committee. Participants were fully debriefed at the end of the experiment and received a small fee for participation.

All learners had moved to the Netherlands at or after the age of 16 and had been immersed in the L2 context for at least 5 years at the time of testing. The learners had very little to no exposure to Dutch before immigration. They were asked to indicate the frequency of use of Dutch in daily life: a composite score for L2 use was calculated based on questions about language use at home (with partner and children), outside of the home (at the workplace and other), and use of Dutch media. They additionally answered questions about their use of Dutch in a specific modality: they estimated the percentage of use of the L2 in the visual modality (i.e., reading/writing) compared to the auditory modality (i.e., speaking/listening), both during learning of Dutch at onset of immigration and during everyday life at the time of testing.

L2 proficiency was assessed by means of several (written) measures. A pre-selection on the basis of a pre-test in the form of 20 grammar items of the Dutch DIALANG Placement Test (adapted from http://www.lancaster.ac.uk/researchenterprise/dialang/about.html) ensured that all participants had a relatively high level of proficiency in Dutch. Participants had to complete at least 13 of the items correctly to be selected for participation. Another proficiency measure was taken in the lab, in the form of a C-test (constructed by Keijzer, 2007), which consisted of two texts containing gaps where parts of some words had been left out. The participants’ task was to fill the gaps. After the EEG experiment, participants were also asked to complete an offline gender assignment task. This task was used to test the participants’ knowledge of the grammatical gender of the critical nouns used in the EEG experiment. In addition to these measures, learners rated their L2 Dutch in terms of reading, writing, speaking, and listening proficiency on a Likert-scale between 1 (very bad) and 5 (very good). Participants’ scores on the proficiency measures can be found in Table 1.

Table 1 | Means (and ranges) of participant characteristics and scores on proficiency measures, and significance of between-group comparisons (Mann-Whitney U-test).

| Measure | Learners (n = 19) | Natives (n = 19) | U- and p-value |
|---|---|---|---|
| AGE/EXPOSURE/USE | | | |
| Age at testing (years) | 42.3 (24–64) | 39.8 (21–59) | U = 162, p = 0.599 |
| Age of acquisition (years) | 26.0 (16–39) | – | – |
| Length of residence (years) | 16.3 (5–43) | – | – |
| L2 use (%) | 58.4 (12.3–87.3) | – | – |
| USE OF MODALITY: DURING LEARNING (%) | | | |
| Visual | 43.7 (20–70) | – | – |
| Auditory | 56.3 (30–80) | – | – |
| USE OF MODALITY: CURRENT (%) | | | |
| Visual | 42.6 (20–70) | – | – |
| Auditory | 57.4 (30–80) | – | – |
| PROFICIENCY MEASURES | | | |
| C-test (%) | 79.4 (42.1–100) | 95.2 (68.4–100) | U = 299.5, p < 0.001 |
| Gender assignment task (%) | 87.3 (64.6–100) | 99.5 (93.8–100) | U = 332.5, p < 0.001 |
| SELF-RATED PROFICIENCY | | | |
| Reading | 4.4 (3–5) | – | – |
| Writing | 3.6 (1–5) | – | – |
| Speaking | 3.9 (2–5) | – | – |
| Listening | 4.3 (3–5) | – | – |

Notes to Table 1.
L2 use: composite score based on language use inside and outside of the home and use of Dutch media.
Use of modality, during learning: percentage of L2 use in the visual modality (i.e., reading/writing) compared to the auditory modality (i.e., speaking/listening) during learning of Dutch at onset of immigration.
Use of modality, current: percentage of L2 use in the visual modality (i.e., reading/writing) compared to the auditory modality (i.e., speaking/listening) in everyday life at the time of testing.
C-test: percentage of correct responses (spelling errors were not penalized).
Gender assignment task: percentage of correct responses (i.e., a minimum of 2/3 instances of each item assigned correctly).
Self-rated proficiency: ratings on a 5-point scale with five as highest level of skill in Dutch.

MATERIALS

The design and materials of the EEG experiment were largely based on work by Loerts (2012), who studied L2 gender and non-finite verb processing in natives and Slavic learners of Dutch. One hundred and forty-four experimental sentences were created (see Table 2 for examples; the full list of sentences can be found in the Supplementary Material, Data Sheet 1). Forty-eight of the sentences were used to test non-finite verb agreement. Half of these contained an infinitive and the other half a past participle verb. For their ungrammatical counterparts, these verbs were altered into their participial or infinitival form, respectively. The other 96 sentences were used to test grammatical gender agreement. In these sentences, the determiner either agreed in gender with the following noun or violated gender concord. Determiner and noun were either adjacent, or non-adjacent (with an adjective intervening between the determiner and noun). Only highly frequent Dutch target nouns and verbs were used (nouns: mean = 2.16, range = 0.78–3.08; verbs: mean = 2.46, range = 0.95–4.05, on log lemma frequency of occurrence per million taken from the CELEX corpus: Baayen et al., 1995). Finally, 122 well-formed filler sentences were included. These filler sentences were added to raise the overall proportion of correct sentences to about 3/4, making the task more similar to natural language processing.

Footnote: Because of the large number of factors in the current design, it was not possible to get a high number of trials per condition without making the experiment too long, which in all probability would have resulted in severe fatigue effects in our data. We realize that as a result, the number of trials per condition is on the low side, particularly for the non-finite verb condition. However, highly salient agreement errors, such as the non-finite verb agreement violations used in the current study, have been shown to elicit large ERP effects. As the results section of this paper shows, even with this low number of trials we had sufficient power to find significant effects in this condition. In the less salient gender condition, however, there was double the amount of trials per condition to ensure sufficient power.

Table 2 | Example materials of the EEG experiment.

| Condition | Example sentences | Number of items per list |
|---|---|---|
| Non-finite verb agreement | Ze heeft alleen haar beste vriendin uitgenodigd/*uitnodigen voor haar verjaardag. (She has only invited/*invite her best friend for her birthday.) | 12/12 visual, 12/12 auditory |
| | Hij probeert me altijd aan het lachen te maken/*gemaakt door grapjes te vertellen. (He always tries to make/*made me laugh by telling jokes.) | |
| Gender agreement | Vera plant rode rozen in de/*het tuin van haar ouders. (Vera is planting red roses in the[com]/*the[neu] garden of her parents.) | 24/24 visual, 24/24 auditory |
| | Het duurde uren voordat Jeroen het/*de nette pak van zijn broer had aangetrokken. (It took hours for Jeroen to put on the[neu]/*the[com] fancy suit of his brother.) | |

Critical targets, where the ERP was measured, are underlined in the published table.

For the auditory part of the experiment, spoken forms of all sentences were recorded. Each sentence was read aloud by a female native speaker with a standard Dutch accent who was trained to produce correct and incorrect sentences with normal intonation. Despite training, acoustic confounds, such as subtle prosodic cues to the upcoming ungrammaticality, remain possible (Dimitrova et al., 2012). To prevent any influence of such confounds, each sentence was presented in its original form or in a digitally spliced version, constructed by cross-splicing the original recordings of grammatical and violation sentences, cutting at the onset of the determiner for the gender condition, or the verb
in the non-finite verb condition. Noise reduction and volume normalization were applied to all sound files. A within-subject design was employed to test the effects of modality within the same group of subjects. Eight experimental lists were created using a Latin Square design, crossing the fac- tors modality (visual, auditory), correctness (correct, incorrect), and splicing (spliced, unspliced), to ensure each participant was presented with only one version of each sentence and an equal number of each type. Each list was presented to two or three par- ticipants from each group, and each participant saw only one list. PROCEDURE Event-related potentials were recorded while participants listened to or read the sentences. After each sentence, the participant had to make a grammaticality judgment. Participants were com- fortably seated in an electrically shielded and sound attenuated chamber. The sentences were presented using E-prime (Schneider et al., 2002a,b), which in addition recorded accuracy with respect to the grammaticality judgments. Visual stimuli were presented on a computer screen in front of the participants. Speakers were placed to the left and right side of the screen. Visual sentences were presented at a rate of two words per second: each word was presented for 250 ms, followed by 250 ms blank screen. Auditory FIGURE 1 | Approximate location of the recording sites and the 10 sentences were presented at normal speech rate. Participants were regions of interest used for analyses: left/middle/right frontal (LF/MF/RF), left/right temporal (LT/RT), left/middle/right parietal asked to avoid moving any parts of their body and not to move (LP/MP/RP), and left/right occipital (LO/RO). their eyes or blink during sentence presentation. The experiment consisted of four blocks: either two visual blocks followed by two auditory blocks or the reverse. 
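The Latin Square counterbalancing described under MATERIALS can be illustrated with a minimal sketch. It assumes a simple rotation rule (item index plus list index, modulo the number of conditions). The text does not state the exact assignment rule, and the name `build_lists` is hypothetical. Splicing is treated here purely as a condition label.

```python
from itertools import product

MODALITIES = ("visual", "auditory")
CORRECTNESS = ("correct", "incorrect")
SPLICING = ("spliced", "unspliced")

# The 8 cells obtained by crossing the three two-level factors.
CONDITIONS = list(product(MODALITIES, CORRECTNESS, SPLICING))


def build_lists(n_items):
    """Return one dict per list, mapping item index -> condition.

    Assumed rotation rule: list k assigns item i to condition (i + k) mod 8,
    so every item occurs once per list and in a different condition in
    each of the 8 lists.
    """
    n_cond = len(CONDITIONS)
    return [
        {item: CONDITIONS[(item + k) % n_cond] for item in range(n_items)}
        for k in range(n_cond)
    ]


# Example: the 48 non-finite verb sentences give 6 items per cell per list.
lists = build_lists(48)
```

With 48 items, each list contains 24 visual and 24 auditory sentences, half of them correct, which is consistent with the 12/12 split reported in Table 2.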
The duration of the breaks between blocks was determined by the participant. Altogether, the EEG experiment lasted about 1 h.

Subsequently, participants were asked to fill in the pen and paper C-test. Finally, they performed a gender assignment task on a computer. The target words of the EEG experiment were presented in randomized order, each item appearing three times. Participants were instructed to indicate, by a mouse click on either the common (“de”) or neuter (“het”) definite article, whether they thought the word had common or neuter gender in Dutch.

EEG RECORDING AND ANALYSIS

The continuous EEG (500 Hz/22 bit sampling rate) was recorded from 54 Ag/AgCl scalp electrodes mounted into an elastic cap (Electro Cap International, Inc.) according to the international extended 10–20 system (see Figure 1 for recording sites). To monitor eye-movements, four additional electrodes were placed on the outer canthi of each eye and above and below the left eye. Scalp electrode signals were measured against a common reference during recording. Impedances were reduced to below 10 kΩ (in some instances, some temporal and frontal electrodes could only be reduced to below 20 kΩ). The amplifier (TMS international) measured DC with a digital FIR filter (cutoff frequency 130 Hz) to avoid aliasing. After acquisition, the raw data were further processed with Brain Vision Analyzer 2.0.4. The data were re-referenced to the average of two electrodes placed over the left and right mastoids and digitally filtered with a high-pass filter at 0.1 Hz and low-pass filter at 40 Hz. The data were segmented, time-locked to the onset of the critical target (from 500 ms before to 1400 ms after stimulus onset).
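The segmentation step can be sketched as follows. This is only an illustration of time-locking at the reported 500 Hz sampling rate, not the Brain Vision Analyzer procedure itself. The function names and the half-open sample range are assumptions.

```python
SFREQ_HZ = 500                  # sampling rate reported in the text
TMIN_MS, TMAX_MS = -500, 1400   # epoch limits relative to target onset


def ms_to_samples(ms, sfreq_hz=SFREQ_HZ):
    """Convert a latency in milliseconds to a number of samples."""
    return round(ms * sfreq_hz / 1000)


def epoch_slice(onset_sample, tmin_ms=TMIN_MS, tmax_ms=TMAX_MS):
    """Half-open sample range [start, stop) around one target onset."""
    start = onset_sample + ms_to_samples(tmin_ms)
    stop = onset_sample + ms_to_samples(tmax_ms)
    return start, stop


def cut_epochs(signal, onsets):
    """Cut one epoch per onset from a single-channel signal.

    Onsets whose window would run past either end of the recording
    are skipped (an assumption made for this sketch).
    """
    epochs = []
    for onset in onsets:
        start, stop = epoch_slice(onset)
        if start >= 0 and stop <= len(signal):
            epochs.append(signal[start:stop])
    return epochs
```

At 500 Hz one sample spans 2 ms, so under these assumptions each epoch covers 950 samples.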
Average ERPs were formed without regard to behavioral responses, from trials free of muscular and ocular artifacts; the latter were corrected using the Gratton and Coles procedure (1989). Individual channel artifacts led to rejection of 0.5% of the data in the learner group and 0.6% in the native group. A baseline period was set from 200 to 0 ms before onset of the critical words to normalize the data. A total of 10 regions of interest (ROIs), containing five or six electrodes each, were used for analyses (depicted in Figure 1).

We analyzed amplitudes of the ERP waveforms in the time-windows in which a LAN/N400 and P600 are to be expected: 300–500 and 600–1200 ms after stimulus onset. The latter window is somewhat longer than is typical in P600 studies in monolinguals, because the P600 in L2 learners can be somewhat delayed (Weber-Fox and Neville, 1996; Hahne, 2001; Rossi et al., 2006; Sabourin and Stowe, 2008). For grand mean analyses, ANOVAs were calculated within each time window and sentence structure (non-finite verb, grammatical gender) separately, using the ezANOVA function of the ez package (version 4.2.2: Lawrence, 2013), implemented in R (version 3.1.0: R Core Team, 2014). The analyses included correctness (grammatical, violation) and modality (visual, auditory) as within-participants factors, and group (natives, learners) as between-participants factor. Data from lateral (left and right frontal, temporal, parietal, and occipital ROIs) and medial (middle frontal and middle parietal ROIs) regions were treated separately in order to identify topographic and hemispheric differences. For the lateral regions, the ANOVA also included hemisphere (left, right) and anterior-posterior (frontal, temporal, parietal, occipital) as within-participants factors. For the medial regions, anterior-posterior (frontal, parietal) was the only topographical factor in the ANOVA. The Greenhouse-Geisser correction was applied for violations of the sphericity assumption. Only main effects of, and interactions with, correctness are reported. In the presence of a significant higher-level interaction, lower-level interactions and main effects are not interpreted. False discovery rate correction (Benjamini and Hochberg, 1995) was applied for follow-up tests to control for Type 1 error. Additional regression analyses, performed in R version 3.1.0 using the lm function of the lme4 package (version 1.1.6: Bates et al., 2014) will be described together with the results.

RESULTS

BEHAVIORAL RESULTS

The percentages of accurate grammaticality judgments per group, sentence structure, and modality are shown in Figure 2. A three-way ANOVA was conducted on the arcsine transformed proportions of correct responses to stabilize variance and normalize the data (mean and SDs reported below are from the untransformed percentages). The ANOVA revealed a significant main effect of group, F(1, 36) = 53.24, p < 0.001, with the learners giving fewer correct responses than the natives (mean = 71.1, SD = 17.8 vs. mean = 93.0, SD = 11.5). The main effect of sentence structure, F(1, 36) = 41.66, p < 0.001, shows that the average performance is worse in the gender condition. However, there is also a significant interaction between group and structure, F(1, 36) = 5.55, p = 0.024. Paired comparisons show that the difference between structures is highly significant in the learner group [t(62.9) = 4.91, p < 0.001, gender mean = 62.8, SD = 14.1; non-finite verb mean = 79.5, SD = 17.3]. There is a smaller, but still significant difference between structures in the native group [t(59.7) = 2.42, p = 0.019, gender mean = 92.2, SD = 6.2; verbs mean = 93.8, SD = 15.1]. Interestingly, with respect to one of our research questions, there is a significant main effect of modality, F(1, 36) = 8.37, p = 0.006, with the percentage of correct responses in the auditory condition being somewhat lower than in the visual condition (mean = 79.5, SD = 20.0 vs. mean = 84.6, SD = 16.7). There are however no significant interactions between modality and group, modality and structure, or group, modality, and structure (all Fs < 3).

FIGURE 2 | Accuracy on grammaticality judgments made during ERP recording session by group, modality, and structure.

ERP RESULTS: GRAND MEAN ANALYSES

Figures 3 and 4 show the grand mean ERP waveforms for natives and learners, respectively. Results of the omnibus ANOVAs are provided in the Supplementary Material (Data sheet 2). Significant results and follow-up analyses will be described below.

FIGURE 3 | Natives’ grand average ERP waveforms at all 10 regions of interest (see Figure 1) for correct and incorrect use of non-finite verb and gender agreement in the visual and the auditory condition.

Non-finite verb agreement

In the 300–500 ms window, the lateral omnibus ANOVA for the non-finite verb condition showed a significant correctness by anterior-posterior interaction, F(3, 108) = 6.02, p = 0.011; follow-up analysis revealed that the effect of correctness reached significance in posterior regions only [frontal, F(1, 36) = 0.52, p = 0.476; temporal, F(1, 36) = 4.16, p = 0.065; parietal, F(1, 36) = 14.70, p = 0.002; occipital, F(1, 36) = 11.77, p = 0.004], with the incorrect condition showing more negative voltages than the correct condition. Due to a marginally significant group by correctness interaction in the omnibus ANOVA, F(1, 36) = 3.65, p = 0.064, another follow-up analysis was conducted separately for natives and learners. This analysis revealed that the main effect of correctness was significant in natives, F(1, 18) = 7.36, p = 0.028, but not in learners, F(1, 18) = 0.08, p = 0.780. The medial omnibus ANOVA revealed a significant main effect of correctness, F(1, 36) = 9.22, p = 0.004, with more negative voltages for the incorrect than the correct condition. Due to a marginally significant correctness by anterior-posterior interaction, F(1, 36) = 3.33, p = 0.076, a follow-up analysis was conducted, which again revealed that the effect of correctness reached significance in the posterior region only [frontal, F(1, 36) = 2.77, p = 0.105; parietal, F(1, 36) = 14.55, p = 0.002]. Additionally, the omnibus ANOVA showed a marginally significant group by correctness by modality interaction, F(1, 36) = 3.56, p = 0.067; but follow-up analyses failed to reveal a significant modality effect in either of the groups [correctness by modality interaction: natives, F(1, 18) = 0.72, p = 0.407; learners, F(1, 18) = 4.12, p = 0.114]. The main effect of correctness reached significance on its own in natives, F(1, 18) = 6.26, p = 0.044, but not in learners, F(1, 18) = 3.00, p = 0.100. Since visual inspection of the grand mean waveforms seems to suggest a possible negativity in medial regions for learners in the auditory condition, and finding a native-like effect in this time window for L2 learners is unusual, we performed an additional follow-up analysis separately for each modality in learners, which showed a significant correctness effect in the auditory, F(1, 18) =
6.18, p = 0.046, but not the visual modality, F(1, 18) = 0.43, p = 0.522.

In the later time window (600–1200 ms), the lateral omnibus ANOVA showed a significant group by correctness by anterior-posterior interaction, F(3, 108) = 5.95, p = 0.008. Follow-up analysis revealed a significant main effect of correctness in both groups [natives, F(1, 18) = 20.39, p = 0.001; learners, F(1, 18) = 14.16, p = 0.001], with more positive amplitudes in the incorrect compared to the correct condition. A significant correctness by anterior-posterior interaction was present for natives only [natives, F(3, 54) = 23.51, p = 0.001; learners, F(3, 54) = 1.97, p = 0.169], which was driven by the fact that the positivity in natives was significant in the temporal, F(1, 18) = 16.32, p = 0.001, parietal, F(1, 18) = 36.07, p = 0.001, and occipital region, F(1, 18) = 35.54, p = 0.001, but not the frontal region, F(1, 18) = 0.00, p = 0.985. The medial omnibus ANOVA revealed a significant correctness by anterior-posterior interaction, F(1, 36) = 22.93, p < 0.001; a follow-up analysis showed that the correctness effect is stronger in the parietal, F(1, 36) = 68.36, p < 0.001, than the frontal region, F(1, 36) = 29.15, p < 0.001.

It is apparent from these grand mean analyses that non-finite verb agreement violations are associated with a biphasic pattern of an N400 followed by a P600 in natives. The lack of significant effects for the frontal regions rules out a LAN effect in the 300–500 ms time window. Learners’ responses are very similar to natives’ in the later time-window (P600). However, in the early time window learners fail to show a native-like effect (N400) in the visual condition, and only show a smaller and less broadly distributed N400 compared to natives in the auditory condition.

FIGURE 4 | Learners’ grand average ERP waveforms at all 10 regions of interest (see Figure 1) for correct and incorrect use of non-finite verb and gender agreement in the visual and the auditory condition.

Gender agreement

In the 300–500 ms window, the lateral omnibus ANOVA for the gender condition showed a significant correctness by modality by anterior-posterior interaction, F(3, 108) = 3.90, p = 0.039, and a group by correctness by modality by hemisphere interaction, F(1, 36) = 5.24, p = 0.028. Follow-up analyses conducted separately for natives and learners revealed a significant correctness by modality by anterior-posterior interaction in natives, F(3, 54) = 6.28, p = 0.016, but no significant effects in learners (all Fs < 2.03). However, in natives, neither the main effect of correctness nor the correctness by anterior-posterior interaction reached significance in either of the modalities analyzed separately (all Fs < 3.90). The medial omnibus ANOVA showed a significant group by correctness interaction, F(1, 36) = 4.30, p = 0.045. However, follow-up analyses failed to find a significant main effect of correctness, or any of its interactions, in either of the groups analyzed separately (all Fs < 4.23).

In the 600–1200 ms window, the lateral omnibus ANOVA revealed a significant group by correctness by anterior-posterior interaction, F(3, 108) = 20.17, p < 0.001, and a significant correctness by modality by anterior-posterior interaction, F(3, 108) = 7.31, p = 0.002. Follow-up analyses conducted separately for natives and learners revealed a significant correctness by modality by anterior-posterior interaction in natives, F(3, 54) = 6.17, p = 0.014, but no significant effects in learners (all Fs < 1.81).
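The follow-up tests reported in these analyses were corrected with the false discovery rate procedure of Benjamini and Hochberg (1995). A minimal sketch of how such adjusted p-values can be computed is given here. The function name and the example p-values are purely illustrative, and the original analysis was run in R, so this is not the authors' code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value with ascending rank r out of m tests, the adjusted
    value is the minimum of p * m / r over that rank and all larger
    ranks, capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum
    # enforces monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative (made-up) uncorrected p-values for four follow-up tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A follow-up test is then treated as significant when its adjusted value falls below the chosen alpha level.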
In natives, the main effect of correctness was significant in all regions except for the frontal one [frontal, F(1, 18) = 0.06, p = 0.806; temporal, F(1, 18) = 14.33, p = 0.001; parietal, F(1, 18) = 38.20, p = 0.001; occipital, F(1, 18) = 35.39, p = 0.001], with amplitudes in the incorrect condition being more positive compared to the correct condition. The correctness by modality interaction did not reach significance in any of the regions (all Fs < 4.03). The medial omnibus ANOVA showed a significant group by correctness by anterior-posterior interaction, F(1, 36) = 11.24, p = 0.002. Follow-up analyses revealed that this was due to a significant correctness by anterior-posterior interaction in natives, F(1, 18) = 26.82, p = 0.001, but not learners, F(1, 18) = 1.86, p = 0.190. The interaction in natives was driven by the fact that the effect of correctness was stronger in the posterior region [frontal, F(1, 18) = 13.04, p = 0.002; parietal, F(1, 18) = 47.69, p < 0.001].

These grand mean analyses show that while natives show a classic P600 effect in response to gender agreement violations, learners do not: the P600 is absent for learners, in both modalities. In the early time window, there are again no effects for learners, while the natives seemed to show some small effects, which however failed to reach significance in follow-up analyses.

Figure 5 summarizes the P600 and N400 effects, showing the difference in amplitude between the violation condition and the grammatical condition, collapsed over middle frontal and all temporal, parietal and occipital ROIs, per group, structure, and modality. We see P600 effects for natives, preceded by an N400 effect in non-finite verb violations, but not gender violations. In contrast, the learners only show P600 effects for non-finite verb violations, but they do not show any effects of gender violation. The learners also show a small N400 effect for auditory non-finite verb violations (an effect that only reached significance in the medial regions).

FIGURE 5 | ERP difference waves (incorrect minus correct sentence) per group, structure, and modality, collapsed over middle frontal and all temporal, parietal, and occipital ROIs.

ERP RESULTS: INDIVIDUAL DIFFERENCES ANALYSES

In this section, we will have a closer look at individual differences. First, we will investigate the distribution of N400 and P600 effects across individuals, which can be of importance for the interpretation of the grand mean results, as discussed in the Introduction. Second, we will explore possible predictors of native-likeness in the learner group, since previous research has revealed that age of acquisition, length of residence, L2 proficiency and use can affect ERP responses (also discussed in the Introduction).

Closer inspection of the N400 and P600 patterns

Following work by Osterhout and colleagues (McLaughlin et al., 2010; Tanner et al., 2013, 2014) we regressed individuals’ N400 effect magnitude onto their P600 effect magnitude, to investigate the distribution of these two components across individuals. The effect magnitude here refers to the average voltage difference between conditions: correct minus incorrect in the 300–500 ms window for the N400, and incorrect minus correct in the 600–1200 ms time window for the P600. Amplitudes were averaged across middle frontal and all temporal, parietal, and occipital regions, where the N400 and P600 effects are to be expected. Figure 6 shows the scatterplots of the results, for each group and sentence structure separately. We also investigated each modality separately, but since the results looked highly similar between modalities, these will not be discussed here.
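The effect magnitude definition above can be made concrete with a short sketch. It assumes that each participant's ROI-averaged waveform is available as parallel lists of latencies and amplitudes. The function names, the half-open window boundaries, and the toy waveform are assumptions made for illustration only.

```python
from statistics import mean

N400_WINDOW = (300, 500)    # ms after target onset
P600_WINDOW = (600, 1200)   # ms after target onset


def window_mean(times_ms, amplitudes, window):
    """Mean amplitude over samples with lo <= t < hi (assumed half-open)."""
    lo, hi = window
    return mean(a for t, a in zip(times_ms, amplitudes) if lo <= t < hi)


def effect_magnitudes(times_ms, correct, incorrect):
    """Return (N400 magnitude, P600 magnitude) for one participant.

    N400: correct minus incorrect, so a negativity to violations is positive.
    P600: incorrect minus correct, so a positivity to violations is positive.
    """
    n400 = (window_mean(times_ms, correct, N400_WINDOW)
            - window_mean(times_ms, incorrect, N400_WINDOW))
    p600 = (window_mean(times_ms, incorrect, P600_WINDOW)
            - window_mean(times_ms, correct, P600_WINDOW))
    return n400, p600


def dominant_component(n400, p600):
    """Classify by which magnitude is larger; ties are labeled P600 here."""
    return "N400" if n400 > p600 else "P600"


# Toy waveform sampled every 2 ms: a -2 microvolt dip in the N400 window
# and a +3 microvolt deflection in the P600 window for violations.
times = list(range(0, 1400, 2))
correct = [0.0] * len(times)
incorrect = [-2.0 if 300 <= t < 500 else 3.0 if 600 <= t < 1200 else 0.0
             for t in times]
n400, p600 = effect_magnitudes(times, correct, incorrect)
```

Because both magnitudes are signed so that the expected effect is positive, a participant with a biphasic response has two positive values, while a sustained positivity yields a negative N400 magnitude and a positive P600 magnitude.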
The fig- Second, we will explore possible predictors of native-likeness in ure informs us about whether the grand mean waveforms are Frontiers in Psychology | Language Sciences September 2014 | Volume 5 | Article 1072 | 10 Meulman et al. When do learners fail? FIGURE 6 | The distribution of N400 and P600 effect magnitudes effect, whereas individuals below/to the right of the dashed line showed (correct minus incorrect for N400, incorrect minus correct for P600) primarily a P600 effect. In the non-finite verbs many individuals show across learners, averaged within middle frontal and all temporal, biphasic responses (upper right quadrants), whereas in the gender parietal, and occipital ROIs. Each dot represents a data point from a condition there are more sustained positivities (lower right quadrants). single participant. The solid line shows the best-fit regression line. The Very few individuals show sustained negativities (upper left quadrants). dashed line represents equal N400 and P600 effect magnitudes: Basically none of the learners are able to show sensitivity to gender individuals above/to the left of the dashed line showed primarily an N400 violations. representative of most individuals’ ERP profiles. We concluded Predictors of P600 effect magnitude in the learner group from our grand mean analyses that natives show a biphasic N400- To investigate which factors lead to a higher degree of P600 pattern for non-finite verb violations, and only a P600 for native-likeness in L2 learners, we performed a multiple regression gender agreement violations. Examining Figure 6 we indeed see analysis (e.g., Baayen, 2008), to investigate the possible influ- that the biphasic pattern is present for the majority of individu- ence of age of acquisition, length of residence, L2 proficiency als in the non-finite verbs, and that a P600 (without preceding (as measured by the C-test), offline gender knowledge (as mea- N400) is dominant for gender. 
The grand mean results of the sured by the gender assignment task), and L2 use (composite learners showed native-like effects for verbs, but not for gender. score) on the P600. We took magnitude of the P600 as a mea- This conclusion still holds if we look at individual patterns within sure of native-likeness, since the previous section revealed that the group: the distribution of responses in the verb condition this is the most reliable effect in the native group. The average looks highly similar between learners and natives, although there amplitude of the difference wave (incorrect minus correct), cal- is a tendency toward more positivities without preceding nega- culated in the 600–1200 ms window collapsing middle frontal tivities and less biphasic responses in the learners. The fact that and all temporal, parietal, and occipital ROIs, was used as the basically none of the learners show any sensitivity to gender vio- dependent measure in the regression model. Because of skewed lations assures us that the null effect in the grand mean analysis distributions, age of acquisition, and length of residence were was not due to a cancelation by different patterns. log-transformed, and L2 proficiency, gender knowledge and L2 www.frontiersin.org September 2014 | Volume 5 | Article 1072 | 11 Meulman et al. When do learners fail? FIGURE 7 | The percentage of use of the L2 in daily life predicts P600 magnitude for non-finite verb agreement violations, but not gender agreement violations. Table 3 | Correlation matrix for the dependent measure and the participant characteristics variables used in the regression model. 
P600 Log age Log length Arcsin Arcsin Arcsin magnitude of acquisition of residence proficiency gender knowledge L2 use P600 magnitude – Log age of acquisition −0.083 – Log length of residence −0.106 −0.147 – Arcsin proficiency 0.140 −0.327 0.230 – Arcsin gender knowledge 0.134 −0.416 0.552* 0.424 – Arcsin L2 use 0.486* −0.388 0.413 0.293 0.518* – Asterisk indicates significance of p < 0.05. Table 4 | Linear multiple regression model predicting P600 effect use were arcsine transformed prior to entry into the model. magnitude in learners. Additionally all predictor variables were centered at their mean. The correlation matrix for the dependent measure and the partic- Predictor Estimate SE t-value p-value ipant characteristics variables can be found in Table 3. Examining Table 3 we see that length of residence shows a significant posi- Intercept 1.388 0.316 4.390 <0.001 tive correlation with gender knowledge (i.e., the ability to assign StructureIsGender −2.789 0.632 −4.410 <0.001 gender offline), r = 0.55, p = 0.014, with longer length of res- (17) L2use 3.288 1.070 3.074 0.003 idence being associated with better gender knowledge. However, StructureIsGender*L2use −5.939 2.140 −2.776 0.007 there is no relation between length of residence and the magni- tude of the P600 (i.e., the ability to process grammatical structures efficiently online), r =−0.11, p = 0.665. L2 use positively a negative impact (β =−2.79, t =−4.41), and L2 use has a (17) correlates with both gender knowledge and P600 magnitude, positive impact (β = 3.29, t = 3.07) on P600 effect magnitude. r = 0.52, p = 0.023 and r = 0.49, p = 0.035, respectively, The other predictors (i.e., modality, age of acquisition, length (17) (17) with a higher amount of L2 use being associated with better of residence, proficiency, and gender knowledge) did not reach gender knowledge as well as larger P600 magnitudes. 
significance by themselves or in interaction with any other vari- In addition to the participant characteristics variables, struc- ables and were therefore not included in the model. Finally, the ture and modality were tested as predictors in the model. The model additionally shows an interaction between the structure significance of predictors was evaluated by means of the t-test being gender and L2 use (β =−5.94, t =−2.78). This effect is for the coefficients, in addition to model comparison using AIC plotted in Figure 7. There appears to be a significant effect of L2 (Akaike Information Criterion; Akaike, 1974). Table 4 shows use on the P600 for non-finite verb agreement violations, R = the best linear multiple regression model (explained variance: 0.32, F = 8.08, p = 0.011, but no significant effect for gen- (1, 17) 33.7%). This model shows that the structure being gender has der agreement violations, R = 0.01, F = 0.01, p = 0.756. (1, 17) Frontiers in Psychology | Language Sciences September 2014 | Volume 5 | Article 1072 | 12 Meulman et al. When do learners fail? No other significant interactions with structure or modality were effect (Foucart and Frenck-Mestre, 2011). Inspection of individ- found. ual differences for the gender violations confirmed that the grand average ERP patterns we report are representative of the majority DISCUSSION of the individuals in each group. In contrast to natives, who con- Using the P600 as a measure of native-likeness, we tested whether sistently showed large P600 effects (Figure 6,bottomleftpanel), sufficiently proficient late L2 learners can show native-like learners consistently failed to demonstrate any form of sensitivity syntactic processing, even if (1) gender marking in the L1 to gender violations (Figure 6, bottom right panel). This result is implemented differently and (2) the L2 gender system is was confirmed by the fact that none of the participant characteris- opaque. 
We investigated the ERP responses of native speakers and tics we tested (increased proficiency or gender knowledge, earlier Romance learners of Dutch to anomalies in constructions that age of acquisition, longer length of residence or high percent- are relatively easy to acquire (i.e., non-finite verbs) and those that age L2 use) was associated with a larger P600. In this sense, the have been shown to be more difficult (i.e., gender). In addition, current experiment replicates the pattern found by Sabourin and we varied the modality in which the stimuli were presented, in Stowe (2008); even highly proficient Romance learners of Dutch order to investigate whether visual presentation might contribute appear to have persistent difficulties in learning to use Dutch to the lack of sensitivity to gender in the Romance group reported gender. in previous research (Sabourin and Stowe, 2008). The non-finite Turning to the non-finite verb violations, examination of the verb violations elicited a biphasic N400-P600 effect in both native native speakers confirms that the biphasic pattern N400/P600 speakers and second language learners. However, in contrast to seen in this group is present in the majority of the individual the native speakers, the learners only showed evidence of an N400 participants (see Figure 6, top left panel). This biphasic effect in in the auditory and not the visual condition, although the statis- response to non-finite verb violations in natives has been found tical support for this difference is weak . Also, the amplitude of before (Kutas and Hillyard, 1983; Sabourin and Stowe, 2008; the N400 effect was somewhat smaller than in the natives. For the Loerts, 2012). As can be seen in Figure 6 (top right panel), many gender violations, we found a clear P600 in natives, but not in L2 learners’ responses were within the native range, showing evi- learners. 
dence of the biphasic pattern, although this is primarily evident The effects of modality were quite subtle. We had hypothe- for the auditorily presented materials. Some individuals are less sized that increased processing demands in the visual modality native-like; forthisstructure theP600 effect magnitudeinthe L2 might interfere with immersed learners’ responses to grammatical groupwas foundtobemodulatedbythe percentageofuse of the violations and that they might show more native-like responses L2 in daily life. Use is not the only important factor for native-like in the auditory modality. This hypothesis receives some support; attainment of syntax processing however; even the learners with the modulation of the N400 effect in non-finite verb violations the highest amount of daily practice in an immersed setting still in learners was in the hypothesized direction, with a native-like show persistent problems with gender agreement. effect in the auditory but not the visual modality. However, for Despite their failure to show native-like gender processing, the gender agreement learners failed to show sensitivity, regardless evidence suggests that the Romance learners are highly profi- of the modality. Thus, the suggestion that the difference between cient. In addition to the off-line measures of proficiency (C-test Loerts’ (2012) results for Polish speakers on the one hand, and and gender assignment) and online accuracy at ungrammatical- Sabourin and Stowe’s (2008) results and our current results for ity detection, which are within native range for a number of the Romance speakers on the other, cannot be attributed to the participants, the evidence from the biphasic N400-P600 pattern difference in modalities. provides a strong argument for high proficiency. Finding early In contrast to the modality effects, violation effects and group ERP effects in response to grammatical violations like the N400 differences therein were robust. 
Before accepting the group pat- seen here is unusual in L2 research. Although both Loerts (2012) terns, it is important to examine the role of individual differences. and Sabourin and Stowe (2008) found evidence of a biphasic pat- A biphasic pattern may reflect the summation of single effects tern for their native groups, neither found the N400 in their L2 originating in two different groups of participants (Osterhout, learner groups. According to Steinhauer et al. (2009), biphasic 1997; Nieuwland and Van Berkum, 2008; Tanner and Van Hell, patterns are one of the latest stages of morpho-syntactic profi- 2014; Tanner et al., 2014). Even more crucial for the current ciency in late L2 acquisition. The fact that our learners were able experiment, the absence of an effect in the L2 group may be due to reach this stage for non-finite verb agreement, but that they to variability, with some individuals showing the pattern found in cannot get past the initial stage of not showing any brain response native speakers, while others show no effect or even an opposing differences for correct vs. incorrect use of gender agreement pro- vides strong support for the difficulty of the acquisition of this We want to remind the reader that the modality effect in the non-finite verb element in Dutch L2 acquisition. This highlights the complexity condition should be interpreted with some caution. Unlike the main effects of acquisition of the Dutch gender system, even by learners with we report throughout the rest of the discussion (which are based on 24 and 48 a gender system in their L1. Furthermore, it emphasizes the fact items per condition for verb and gender, respectively), the marginally signif- that language learning aptitude is not an all or none phenomenon, icant interaction we followed up on here is based on 12 items per condition but may vary widely between constructions. only, which is relatively few for an ERP study. 
However, if we do not follow Our results further illustrate the large discrepancy between up on this interaction the main effect of correctness remains, suggesting that learners are like natives. We felt this claim would be too strong, and therefore online and offline processing measures in L2 acquisition research. discuss the follow-up analysis, despite the statistical concerns. Both the behavioral results of the gender assignment task and the www.frontiersin.org September 2014 | Volume 5 | Article 1072 | 13 Meulman et al. When do learners fail? sentence-final grammaticality judgments during the ERP record- for younger learners, making it again unlikely that this is the ings for gender violations indicate moderate to good knowl- (only) decisive factor for native-likeness. The amount of L2 use edge of Dutch grammatical gender in the learner group. Yet, also failed to explain the failure of the Romance learners to show we observed a complete lack of response to these violations online sensitivity to gender, even though, as our own results show, in the ERP signal. This reveals a discrepancy between offline this can be important for native-likeness for other aspects of knowledge of grammatical gender concord and the use of agree- grammatical processing, like verb agreement. Length of residence, ment knowledge during online processing. The lack of a signif- which impacts overall exposure, also showed no correlation with icant relation between the magnitude of the P600 responses to sensitivity to gender. gender violations and the score on the gender assignment task Failure to achieve native-like processing has also been linked rules out the possibility that only learners with better offline to dissimilarity between L1 and L2 (Tokowicz and MacWhinney, performance are able to show online effects. 
The behavioral 2005; Sabourin and Stowe, 2008; Foucart and Frenck-Mestre, difference between the visual and the auditory modality, with per- 2011), as well as characteristics of the target language (Sabourin formance being slightly worse for grammaticality judgments in and Stowe, 2008; Loerts, 2012). Following this line of argumenta- the auditory modality, was also not reflected in the ERP signal tion, Dutch and Romance languages may simply be too different for gender violations. These results illustrate that second language from each other, which, combined with the fact that the Dutch learners can develop successful strategies to cope with gender gender system is relatively opaque, results in a very difficult chal- processing difficulties. These alternative routes, however, appar- lenge for native-like attainment. The lack of transparency of the ently take more time and are qualitatively different from what we Dutch gender system might explain why our Romance learners observe in online native processing. failed to show native-like processing for this characteristic of the The results of the current study leave us with a puzzle; language, as opposed to the much more transparent non-finite why do Romance learners of Dutch show such persistent prob- verb manipulation. For gender, previous research has shown that lems with gender processing? Our results confirm that gender native-like processing is possible even in constructions with com- is difficult to process for late Romance learners of Dutch, com- petition from an L1 gender system when a relatively transparent pared with the results of studies targeting other languages. We target gender system is to be acquired in L2 (Frenck-Mestre et al., replicated Sabourin and Stowe’s (2008) findings, in the sense 2009; Foucart and Frenck-Mestre, 2011; Gillon Dowens et al., that our Romance learners likewise did not show native-like 2011). 
In contrast, Loerts’ study suggests that an opaque system responses to gender violations, regardless of modality, although is more difficult to acquire, since only her most proficient learn- they showed responses to non-finite verbs that were close to the ers are able to show P600 effects, which are additionally somewhat native model . The factors most commonly suggested in the liter- smaller in amplitude compared to the natives. It remains an open ature as to why gender or other forms of grammatical processing question as to why, in contrast to Loerts (2012), even the most might be problematic do not appear to explain these results. proficient learners in the current study did not show a P600. Proficiency clearly plays some role in native-likeness in general More research is needed to determine whether characteristics of (Steinhauer et al., 2006; McLaughlin et al., 2010), but as we argue the L1 or other (confounding) factors are at play in determining above, our learners were quite proficient, certainly comparable to which individuals overcome the challenge of an opaque gender those in other studies in which learners have shown P600 effects system. for gender (Tokowicz and MacWhinney, 2005; Frenck-Mestre One final point we would like to make is that, although we et al., 2009; Gillon Dowens et al., 2010, 2011; Foucart and Frenck- did not find extensive effects of stimulus modality, this factor is Mestre, 2011, 2012; Loerts, 2012). Also, our proficiency measure nevertheless of importance. As we noted, the early responses to does not correlate with the magnitude of the ERPs. ungrammaticality like the N400 in the biphasic response seen here Other potential explanatory factors involve the language expe- are not generally found in late L2 learners, which has been taken rience of the learner, such as age of acquisition (Weber-Fox and as a sign of lack of native-likeness. 
It is possible that they have been Neville, 1996; Kotz et al., 2008) and exposure to and use of the missed due to the use of visual materials, since this effect was only L2 (Gardner et al., 1997; Flege et al., 1999; Dörnyei, 2005; Tanner seen in the auditory modality. Although we saw no effects on the et al., 2014). It is true that the studies reported by Frenck-Mestre amplitude of the P600 effect, certain populations may be affected and colleagues have generally tested earlier learners (with onset of more than others. Learners who do not share the same writing acquisition in their teens rather than twenties and later). However, system in their L1 and L2, for instance, might have more diffi- other studies have demonstrated native-like gender processing culty automatizing their usage of the new alphabet (Koda, 1999; even for relatively late learners (Tokowicz and MacWhinney, Wang et al., 2003). Forthese learners,the useofauditorymateri- 2005; Gillon Dowens et al., 2010). Furthermore, in the current als might be a crucial prerequisite to obtain an accurate measure study we did not even find a trend toward better performance of their abilities. On the other hand, those whose learning has taken place with an emphasis on written materials may show less response when auditory materials are used. Given the large diver- One of the reviewers points out that having twice as many violation sentences sity of L2 speaker populations with respect to typological distance in the gender condition than the non-finite verb condition, might be problem- (both with respect to grammar and writing systems) and type of atic, since less common stimulus types may elicit a P3 response (see Coulson learning environment (immersion vs. classroom), it is important et al., 1998; Hahne and Friederici, 1999). 
However, the difference waves shown to be aware that the testing modality might influence the results, overlaid in Figure 5 show that there is no difference in P600 effect magnitude betweengender andnon-finite verbsinthe natives. both in offline and online tests. Frontiers in Psychology | Language Sciences September 2014 | Volume 5 | Article 1072 | 14 Meulman et al. When do learners fail? In conclusion, we can say that online grammatical gender pro- Burkhardt, P. (2007). The P600 reflects cost of new information in dis- course memory. Neuroreport 18, 1851–1854. doi: 10.1097/WNR.0b013e3282 cessing is particularly difficult for Romance learners of Dutch, f1a999 even at high levels of proficiency and with large amounts of L2 Camblin, C. C., Ledoux, K., Boudewyn, M., Gordon, P. C., and Swaab, T. Y. exposure and use in a natural setting, and regardless of test- (2007). Processing new and repeated names: effects of coreference on rep- ing modality. In contrast, responses highly similar to the native etition priming with speech and fast RSVP. Brain Res. 1146, 172–184. doi: model are possible for a more regular and transparent structure 10.1016/j.brainres.2006.07.033 Clahsen, H., and Felser, C. (2006). Grammatical processing in language learners. (non-finite verbs), for which responses are modulated by both Appl. Psycholinguist. 27, 3–42. doi: 10.1017/S0142716406060024 testing modality and L2 use. In contrast, the problems with gen- Corbett, G. (1991). Gender. Cambridge: Cambridge University Press. der are persistent and not affected by these factors, demonstrating Coulson, S., King, J. W., and Kutas, M. (1998). Expect the unexpected: event-related the complexity of (late) L2 acquisition of the opaque Dutch brain response to morphosyntactic violations. Lang. Cogn. Process. 13, 21–58. doi: 10.1080/016909698386582 gender system. Davidson, D. J., and Indefrey, P. (2009). 
An event-related potential study on changes of violation and error responses during morphosyntactic learning. J. Cogn. ACKNOWLEDGMENTS Neurosci. 21, 433–446. doi: 10.1162/jocn.2008.21031 We are very grateful for comments and suggestions of the review- Dimitrova, D. V., Stowe, L. A., Redeker, G., and Hoeks, J. C. (2012). Less is not ers. This research was supported by the Netherlands Organization more: neural responses to missing and superfluous accents in context. J. Cogn. for Scientific Research (NWO) under grant 016.104.602, awarded Neurosci. 24, 2400–2418. doi: 10.1162/jocn_a_00302 Dörnyei, Z. (2005). The Psychology of the Language Learner: Individual Differences to the fifth author. We thank Hanneke Loerts, Sanne Berends and in Second Language Acquisition. Mahwah, NJ: Lawrence Erlbaum. Bregtje Seton for sharing their auditory materials, and the par- Dussias, P. E. (2010). Uses of eye-tracking data in second language sen- ticipants for their kind cooperation. Additionally we thank our tence processing research. Annu. Rev. Appl. Linguist. 30, 149–166. doi: colleagues at the NeuroImaging Center Groningen for technical 10.1017/S026719051000005X support, particularly Peter Albronda. Flege, J. E., Yeni-Komshian, G. H., and Liu, S. (1999). Age constraints on second- language acquisition. J. Mem. Lang. 41, 78–104. doi: 10.1006/jmla.1999.2638 Foucart, A., and Frenck-Mestre, C. (2011). Grammatical gender process- SUPPLEMENTARY MATERIAL ing in L2: electrophysiological evidence of the effect of L1–L2 syntactic The Supplementary Material for this article can be found similarity. Bilingual. Lang. Cogn. 14, 379–399. doi: 10.1017/S13667289100 online at: http://www.frontiersin.org/journal/10.3389/fpsyg. 0012X 2014.01072/abstract Foucart, A., and Frenck-Mestre, C. (2012). Can late L2 learners acquire new gram- matical features? Evidence from ERPs and eye-tracking. J. Mem. Lang. 66, 226–248. doi: 10.1016/j.jml.2011.07.007 REFERENCES Franceschina, F. (2005). 
Fossilized Second Language Grammars: The Acquisition of Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Automat. Control 19, 716–723. doi: 10.1109/TAC.1974.1100705 Grammatical Gender. Amsterdam: John Benjamins. doi: 10.1075/lald.38 Frenck-Mestre, C., Foucart, A., Carrasco, H., and Herschensohn, J. (2009). Baayen, R. H. (2008). Analyzing Linguistic Data. A Practical Introduction Processing of grammatical gender in French as a first and second lan- to Statistics Using R. Cambridge, UK: Cambridge University Press. doi: 10.1017/CBO9780511801686 guage evidence from ERPs. Eurosla Yearb. 9, 76–106. doi: 10.1075/eurosla. 9.06fre Baayen, R. H., Piepenbrock, R., and Gullikers, L. (1995). The CELEX Lexical Database [CD-ROM]. Philadelphia, PA: Linguistics Data Consortium, Friederici, A. D., Wang, Y., Herrmann, C. S., Maess, B., and Oertel, U. (2000). Localization of early syntactic processes in frontal and temporal cortical University of Pennsylvania. Balconi, M., and Pozzoli, U. (2005). Comprehending semantic and gram- areas: a magnetoencephalographic study. Hum. Brain Mapp. 11, 1–11. doi: 10.1002/1097-0193(200009)11:1%3C1::AID-HBM10%3E3.0.CO;2-B matical violations in Italian. N400 and P600 comparison with visual and auditory stimuli. J. Psycholinguist. Res. 34, 71–98. doi: 10.1007/s10936-005- Frisch, S., Kotz, S. A., von Cramon, D. Y., and Friederici, A. D. (2003). Why the P600 is not just a P300: the role of the basal ganglia. Clin. Neurophysiol. 114, 3633-6 Bates, D., Maechler, M., Bolker, B., and Walker, S. (2014). lme4: Linear Mixed-Effects 336–340. doi: 10.1016/S1388-2457(02)00366-8 Frost, R. (1998). Toward a strong phonological theory of visual word recogni- Models Using Eigen and S4. R package version 1.1-6. Available online at: http:// CRAN.R-project.org/package=lme4 tion: true issues and false trails. Psychol. Bull. 123, 71–99. doi: 10.1037/0033- 2909.123.1.71 Bates, E., and MacWhinney, B. (1987). 
“Competition, variation, and language learning,” in Mechanisms of Language Acquisition, The 20th Annual Carnegie Gardner, R. C., Tremblay, P. F., and Masgoret, A. (1997). Towards a full model of Symposium on Cognition (Hillsdale, NJ: Lawrence Erlbaum Associates), second language learning: an empirical investigation. Mod. Lang. J. 81, 344–362. doi: 10.1111/j.1540-4781.1997.tb05495.x 157–193. Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a Gillon Dowens, M., Guo, T., Guo, J., Barber, H., and Carreiras, M. (2011). Gender and number processing in Chinese learners of Spanish— practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300. evidence from event related potentials. Neuropsychologia 49, 1651–1659. doi: 10.1016/j.neuropsychologia.2011.02.034 Bernard, M. L., Chaparro, B. S., and Russell, M. (2001). Examining automatic text presentation for small screens. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 45, Gillon Dowens, M., Vergara, M., Barber, H. A., and Carreiras, M. (2010). Morphosyntactic processing in late second-language learners. J. Cogn. Neurosci. 637–639. doi: 10.1177/154193120104500613 Blom, E., Polišenská, D., and Weerman, F. (2008). Articles, adjectives and age 22, 1870–1887. doi: 10.1162/jocn.2009.21304 Gratton, G., and Coles, M. H. (1989). Generalization and evaluation of eye- of onset: the acquisition of Dutch grammatical gender. Second Lang. Res. 24, 297–331. doi: 10.1177/0267658308090183 movement correction procedures. J. Psychophysiol. 3, 14–16. Grüter, T., Lew-Williams, C., and Fernald, A. (2012). Grammatical gender in L2: a Bornkessel-Schlesewsky, I., and Schlesewsky, M. (2008). An alternative perspec- tive on “semantic P600” effects in language comprehension. Brain Res. Rev. 59, production or a real-time processing problem? Second Lang. Res. 28, 191–215. doi: 10.1177/0267658312437990 55–73. doi: 10.1016/j.brainresrev.2008.05.003 Gunter, T. C., Stowe, L. A., and Mulder, G. (1997). 
When syntax meets semantics. Bruhn de Garavito, J., and White, L. (2000). “Second language acquisition of Spanish DPs: the status of grammatical features,” in BUCLD 24: Proceedings Psychophysiology 34, 660–676. doi: 10.1111/j.1469-8986.1997.tb02142.x Hagoort, P., Brown, C., and Groothusen, J. (1993). The syntactic positive shift (SPS) from the 24th Annual Boston University Conference on Language Development, eds S. C. Howell, S. Fish, and T. Keith-Lucas (Somerville, MA: Cascadilla), as an ERP measure of syntactic processing. Lang. Cogn. Process. 8, 439–483. doi: 10.1080/01690969308407585 164–175. www.frontiersin.org September 2014 | Volume 5 | Article 1072 | 15 Meulman et al. When do learners fail? Hagoort, P., and Brown, C. M. (2000). ERP effects of listening to speech compared Osterhout, L. (1997). On the brain response to syntactic anomalies: manipulations to reading: the P600/SPS to syntactic violations in spoken sentences and rapid of word position and word class reveal individual differences. Brain Lang. 59, serial visual presentation. Neuropsychologia 38, 1531–1549. doi: 10.1016/S0028- 494–522. doi: 10.1006/brln.1997.1793 3932(00)00053-1 Osterhout, L., and Hagoort, P. (1999). A superficial resemblance does not neces- Hahne, A. (2001). What’s different in second-language processing? Evidence sarily mean you are part of the family: counterarguments to Coulson, King and from event-related brain potentials. J. Psycholinguist. Res. 30, 251–266. doi: Kutas (1998) in the P600/SPS-P300 debate. Lang. Cogn. Process. 14, 1–14. doi: 10.1023/A:1010490917575 10.1080/016909699386356 Hahne, A., and Friederici, A. (1999). Electrophysiological evidence for two steps Osterhout, L., and Holcomb, P. J. (1992). Event-related brain potentials elicited in syntactic analysis: early automatic and late controlled processes. J. Cogn. by syntactic anomaly. J. Mem. Lang. 31, 785–806. doi: 10.1016/0749- Neurosci. 11, 194–205. doi: 10.1162/089892999563328 596X(92)90039-Z Hawkins, R. 
(2001). The theoretical significance of Universal Grammar Perfetti, C. A., Zhang, S., and Berent, I. (1992). Reading in English and Chinese: in second language acquisition. Second Lang. Res. 17, 345–367. doi: evidence for a “universal” phonological principle. Adv. Psychol. 94, 227–248. 10.1191/026765801681495868 doi: 10.1016/S.0166-4115(08)62798-3 Hawkins, R., and Chan, C. Y. H. (1997). The partial availability of Universal R Core Team. (2014). R: A Language And Environment For Statistical Computing. Grammar in second language acquisition: the “failed functional features Vienna: R Foundation for Statistical Computing. Available online at: http:// hypothesis.” Second Lang. Res. 13, 187–226. doi: 10.1191/0267658976714 www.R-project.org/ 76153 Rossi, S., Gugler, M. F., Friederici, A. D., and Hahne, A. (2006). The impact of Homae, F., Hashimoto, R., Nakajima, K., Miyashita, Y., and Sakai, K. L. (2002). proficiency on syntactic second-language processing of German and Italian: From perception to sentence comprehension: the convergence of auditory and evidence from event-related potentials. J. Cogn. Neurosci. 18, 2030–2048. doi: visual information of language in the left inferior frontal cortex. Neuroimage 16, 10.1162/jocn.2006.18.12.2030 883–900. doi: 10.1006/nimg.2002.1138 Sabourin, L. (2003). Grammatical Gender and Second Language Processing: An ERP Hopp, H. (2010). Ultimate attainment in L2 inflection: performance similar- Study. Ph.D. dissertation, Groningen, Grodil. ities between non-native and native speakers. Lingua 120, 901–931. doi: Sabourin, L., and Stowe, L. A. (2008). Second language processing: when are first 10.1016/j.lingua.2009.06.004 and second languages processed similarly? Second Lang. Res. 24, 397–430. doi: Hopp, H. (2013). Grammatical gender in adult L2 acquisition: relations 10.1177/0267658308090186 between lexical and syntactic variability. Second Lang. Res. 29, 33–56. doi: Schneider, W., Eschman, A., and Zuccolotto, A. (2002a). 
E-PrimeUser’sGuide. 10.1177/0267658312461803 Pittsburgh, PA: Psychology Software Tools Inc. Keijzer, M. (2007). Last in First Out? An Investigation of the Regression Hypothesis Schneider, W., Eschman, A., and Zuccolotto, A. (2002b). E-Prime Reference Guide. in Dutch Emigrants in Anglophone Canada. Ph.D. dissertation, Vrije Universiteit Pittsburgh, PA: Psychology Software Tools Inc. Amsterdam, Netherlands. Schwartz, B. D., and Sprouse, R. A. (1994). “Word order and nominative case in Kluender, R., and Kutas, M. (1993). Bridging the gap: evidence from ERPs on nonnative language acquisition: a longitudinal study of (L1 Turkish) German the processing of unbounded dependencies. J. Cogn. Neurosci. 5, 196–214. doi: Interlanguage,” in Language Acquisition Studies in Generative Grammar: Papers 10.1162/jocn.1993.5.2.196 in Honor of Kenneth Wexler from the 1991 GLOW Workshops, eds T. Hoekstra Koda, K. (1996). L2 word recognition research: a critical review. Mod. Lang. J. 80, and B. D. Schwartz (Philadelphia, PA: John Benjamins), 317–368. 450–460. doi: 10.1111/j.1540-4781.1996.tb05465.x Schwartz, B. D., and Sprouse, R. A. (1996). L2 cognitive states and Koda, K. (1999). Development of L2 intraword orthographic sensitivity and decod- the full transfer/full access model. Second Lang. Res. 12, 40–72. doi: ing skills. Mod. Lang. J. 83, 51–64. doi: 10.1111/0026-7902.00005 10.1177/026765839601200103 Kotz, S. A., Holcomb, P. J., and Osterhout, L. (2008). ERPs reveal comparable syn- Steinhauer, K., White, E., Cornell, S., Genesee, F., and White, L. (2006). The tactic sentence processing in native and non-native readers of English. Acta neural dynamics of second language acquisition: evidence from event-related psychol. 128, 514–527. doi: 10.1016/j.actpsy.2007.10.003 potentials. J. Cogn. Neurosci. (Suppl. 99). Kutas, M., and Federmeier, K. D. (2011). Thirty years and counting: find- Steinhauer, K., White, E. J., and Drury, J. E. (2009). 
Temporal dynamics of late sec- ing meaning in the N400 component of the event-related brain potential ond language acquisition: evidence from event-related brain potentials. Second (ERP). Annu. Rev. Psychol. 62, 621–647. doi: 10.1146/annurev.psych.093008. Lang. Res. 25, 13–41. doi: 10.1177/0267658308098995 131123 Stowe, L. A. (1991). Ambiguity resolution: behavioral evidence for a delay. Kutas, M., and Hillyard, S. A. (1983). Event–related brain potentials to gram- Proceedings of the Thirteenth Annual Meeting of the Cognitive Science Association matical errors and semantic anomalies. Mem. Cognit. 11, 539–550. doi: (Hillsdale, NJ: Lawrence Erlbaum Associates), 257–262. 10.3758/BF03196991 Tanner, D., Inoue, K., and Osterhout, L. (2014). Brain-based individual differences Lawrence, M. A. (2013). ez: Easy Analysis and Visualization of Factorial Experiments. in online L2 grammatical comprehension. Bilingual. Lang. Cogn. 17, 277–293. R package version 4.2-2. Available online at: http://CRAN.R-project.org/ doi: 10.1017/S1366728913000370 package=ez Tanner, D., McLaughlin, J., Herschensohn, J., and Osterhout, L. (2013). Individual Loerts, H. (2012). Uncommon Gender: Eyes and Brains, Native and Second Language differences reveal stages of L2 grammatical acquisition: ERP evidence. Bilingual. Learners, and Grammatical Gender. Ph.D. dissertation, Rijksuniversiteit Lang. Cogn. 16, 367–382. doi: 10.1017/S1366728912000302 Groningen, Grodil. Tanner, D., and Van Hell, J. G. (2014). ERPs reveal individual differ- McLaughlin, J., Tanner, D., Pitkänen, I., Frenck-Mestre, C., Inoue, K., Valentine, G., ences in morphosyntactic processing. Neuropsychologia 56, 289–301. doi: et al. (2010). Brain potentials reveal discrete stages of L2 grammatical learning. 10.1016/j.neuropsychologia.2014.02.002 Lang. Learn. 60, 123–150. doi: 10.1111/j.1467-9922.2010.00604.x Tokowicz, N., and MacWhinney, B. (2005). Implicit and explicit mea- Molinaro, N., Barber, H. A., and Carreiras, M. (2011). 
Grammatical agreement pro- sures of sensitivity to violations in second language grammar: an event- cessing in reading: ERP findings and future directions. Cortex 47, 908–930. doi: related potential investigation. Stud. Second Lang. Acquis. 27, 173–204. doi: 10.1016/j.cortex.2011.02.019 10.1017/S0272263105050102 Müller, H. M., King, J. W., and Kutas, M. (1997). Event-related potentials elicited van Berkum, J. (1996). The Psycholinguistics of Grammatical Gender: Studies by spoken relative clauses. Cogn. Brain Res. 5, 193–203. doi: 10.1016/S0926- in Language Comprehension and Production. Ph.D. dissertation, Max Planck 6410(96)00070-5 Institute for Psycholinguistics. Nijmegen, Nijmegen University press. Münte, T. F., Heinze, H. J., and Mangun, G. R. (1993). Dissociation of brain activ- Wang, M., Koda, K., and Perfetti, C. A. (2003). Alphabetic and nonalphabetic L1 ity related to syntactic and semantic aspects of language. J. Cogn. Neurosci. 5, effects in English word identification: a comparison of Korean and Chinese 335–344. doi: 10.1162/jocn.1993.5.3.335 English L2 learners. Cognition 87, 129–149. doi: 10.1016/s0010-0277(02) Nieuwland, M. S., and Van Berkum, J. J. (2008). The interplay between seman- 00232-9 tic and referential aspects of anaphoric noun phrase resolution: evidence from Weber-Fox, C. M., and Neville, H. J. (1996). Maturational constraints on func- ERPs. Brain Lang. 106, 119–131. doi: 10.1016/j.bandl.2008.05.001 tional specializations for language processing: ERP and behavioral evidence Frontiers in Psychology | Language Sciences September 2014 | Volume 5 | Article 1072 | 16 Meulman et al. When do learners fail? in bilingual speakers. J. Cogn. Neurosci. 8, 231–256. doi: 10.1162/jocn.1996. Conflict of Interest Statement: The authors declare that the research was con- 8.3.231 ducted in the absence of any commercial or financial relationships that could be Weber-Fox, C. M., and Neville, H. J. (1999). 
“Functional neural subsystems are dif- construed as a potential conflict of interest. ferentially affected by delays in second language immersion: ERP and behavioral evidence in bilinguals,” in Second Language Acquisition and the Critical Period Received: 16 May 2014; accepted: 06 September 2014; published online: 25 September Hypothesis, ed D. Birdsong (Mahwah, NJ: Erlbaum), 23–38. 2014. White, L. (1989). Universal Grammar and Second Language Acquisition, Vol. 1. Citation: Meulman N, Stowe LA, Sprenger SA, Bresser M and Schmid MS (2014) An Amsterdam: John Benjamins Publishing. ERP study on L2 syntax processing: When do learners fail? Front. Psychol. 5:1072. doi: White, L. (2007). “Some puzzling features of L2 features,” in The Role of Features in 10.3389/fpsyg.2014.01072 Second Language Acquisition, eds J. Liceras, H. Zobl, and H. Goodluck (Mahwah, This article was submitted to Language Sciences, a section of the journal Frontiers in NJ: Erlbaum), 305–330. Psychology. White, L., Valenzuela, E., Kozlowska-Macgregor, M., and Leung, I. (2004). Gender Copyright © 2014 Meulman, Stowe, Sprenger, Bresser and Schmid. This is an and number agreement in nonnative Spanish. Appl. Psycholinguist. 25, 105–133. open-access article distributed under the terms of the Creative Commons Attribution doi: 10.1017/S0142716404001067 License (CC BY). The use, distribution or reproduction in other forums is permit- White, L., Valenzuela, E., Kozlowska-Macgregor, M., Leung, I., and Ayed, H. B. ted, provided the original author(s) or licensor are credited and that the original (2001). “The status of abstract features in Interlanguage: gender and number publication in this journal is cited, in accordance with accepted academic practice. in L2 Spanish,” in BUCLD25Proceedings (Somerville, MA: Cascadilla Press), No use, distribution or reproduction is permitted which does not comply with these 792–802. terms. 
An ERP study on L2 syntax processing: When do learners fail?

Frontiers in Psychology, Volume 5 – Sep 25, 2014

References (87)

Publisher
Pubmed Central
Copyright
Copyright © 2014 Meulman, Stowe, Sprenger, Bresser and Schmid.
ISSN
1664-1078
eISSN
1664-1078
DOI
10.3389/fpsyg.2014.01072

Abstract

ORIGINAL RESEARCH ARTICLE published: 25 September 2014 doi: 10.3389/fpsyg.2014.01072 An ERP study on L2 syntax processing: When do learners fail? 1 1 1 2 Nienke Meulman *, Laurie A. Stowe , Simone A. Sprenger , Moniek Bresser and 1,3 Monika S. Schmid Center for Language and Cognition, University of Groningen, Groningen, Netherlands Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands Department of Language and Linguistics, University of Essex, Colchester, UK Edited by: Event-related brain potentials (ERPs) can reveal online processing differences between Christos Pliatsikas, University of native speakers and second language (L2) learners during language comprehension. Using Kent, UK the P600 as a measure of native-likeness, we investigated processing of grammatical Reviewed by: gender agreement in highly proficient immersed Romance L2 learners of Dutch. We Christos Pliatsikas, University of demonstrate that these late learners consistently fail to show native-like sensitivity to Kent, UK Eleonora Rossi, Penn State gender violations. This appears to be due to a combination of differences from the gender University, USA marking in their L1 and the relatively opaque Dutch gender system. We find that L2 *Correspondence: use predicts the effect magnitude of non-finite verb violations, a relatively regular and Nienke Meulman, Center for transparent construction, but not that of gender agreement violations. There were no Language and Cognition, University effects of age of acquisition, length of residence, proficiency or offline gender knowledge. of Groningen, Oude Kijk in ’t Jatstraat 26, PO Box 716, 9700 AS Additionally, a within-subject comparison of stimulus modalities (written vs. 
auditory) Groningen, Netherlands shows that immersed learners may show some of the effects only in the auditory e-mail: n.meulman@rug.nl modality; in non-finite verb violations, an early native-like N400 was only present for auditory stimuli. However, modality failed to influence the response to gender. Taken together, the results confirm the persistent problems of Romance learners of Dutch with online gender processing and show that they cannot be overcome by reducing task demands related to the modality of stimulus presentation. Keywords: second language acquisition, grammatical gender agreement, event-related potentials (ERPs), P600, modality, immersion INTRODUCTION masculine and feminine in French, or masculine, feminine and Second language (L2) acquisition of many aspects of syntactic neuter in German) which allows speakers to establish syntactic structure is known to be difficult, especially when acquisition cohesion between the elements in a phrase through agreement. starts later in life. A major question being debated in the literature Because the gender of a word is typically not predictable from is to what extent and under what circumstances late L2 speak- its meaning, learning grammatical gender involves acquiring both ers can become native-like with respect to syntax processing (e.g., the knowledge of a word’s gender (gender assignment) and of how Clahsen and Felser, 2006; White, 2007). The evidence is mixed; gender is expressed syntactically (gender agreement or concord). in some cases this does seem to be possible, while in other cases, Therefore, L2 learners must tag each new lemma with its corre- it is difficult or impossible. 
A number of factors have been sug- sponding gender and learn which grammatical elements in the gested to play a role in this variation, but two which have received contexthavetoagreewith it.For exampleinDutch,allnouns relatively little attention are the difficulty of the target grammat- are assigned to either the common or the neuter gender class ical system and the potential role of modality of testing (written and gender concord occurs with determiners and pre-nominal vs. auditory presentation). The present study investigates whether adjectives (e.g., de tuin , the garden, een [common] [def, common] [indef ] event-related potential (ERP) measures of native-likeness used in mooie tuin , a beautiful garden). During pro- [indef, common] [common] this line of research might be partially dependent on stimulus cessing, a comprehender must retrieve the noun’s gender fast modality, as this might explain some of the inconsistency in the enough to establish gender concord. The question is (a) whether literature. L2 learners manage to do so, and (b) whether they achieve this A structure that has frequently been used to test native-like using the same processing strategies as native speakers. attainment in the L2, is grammatical gender, since it has been Gender processing in L2 has already been the topic of shown to pose a major challenge to L2 learners (e.g., Hawkins, numerous investigations using behavioral measures, such as 2001; White et al., 2001; Sabourin, 2003; Blom et al., 2008). grammaticality judgments, sentence-picture matching, (elicited) Demonstrating gender processing that is comparable to that of production, and eye tracking (for overviews, see, e.g., Grüter natives therefore forms a strong test for L2 syntax acquisition. et al., 2012; Hopp, 2013). 
Frontiers in Psychology | Language Sciences | www.frontiersin.org | September 2014 | Volume 5 | Article 1072

Grammatical gender is a classification system for nouns (e.g.,

More recently, researchers have begun to employ ERPs to investigate native-likeness of grammatical gender processing in the L2, because ERPs are known to be highly sensitive to the immediate, unconscious on-line detection, and processing of linguistic anomalies (e.g., Osterhout and Holcomb, 1992; Molinaro et al., 2011). Studies using off-line behavioral measures (e.g., White et al., 2001, 2004; Franceschina, 2005) cannot give access to this sort of evidence, which makes interpretation of their results more difficult. Some online techniques such as eye tracking (Dussias, 2010) measure real-time language processing, but do not provide us with the qualitative evidence of potential brain mechanisms that ERPs can. The rationale of such ERP studies is that the more similar the response between native speakers and learners, the more similar the underlying neural and cognitive processing mechanisms. In other words, a comparison of ERPs in native speakers and L2 learners can tell us how native-like the latter really are.

In first language processing, gender and other (morpho)syntactic violations are found to be associated with two primary kinds of components: the left anterior negativity (LAN) and the P600. The LAN has been widely associated with morphosyntactic agreement processes (Münte et al., 1993; Friederici et al., 2000; Molinaro et al., 2011), but others claim that it is a more general index of working memory load (Kluender and Kutas, 1993; Coulson et al., 1998). The P600 has been reported for a range of syntactic and other linguistic violations (e.g., Osterhout and Holcomb, 1992; Hagoort et al., 1993; Münte et al., 1993; Burkhardt, 2007). Given the extremely heterogeneous conditions that elicit a P600, this component cannot be exclusively associated with agreement specifically, or even syntactic processing difficulties more generally, and is therefore often interpreted as a late stage of (re)analysis of information (Osterhout and Holcomb, 1992; Bornkessel-Schlesewsky and Schlesewsky, 2008). It may even reflect a more general process, such as the P300 (Gunter et al., 1997; Coulson et al., 1998; but see Osterhout and Hagoort, 1999; Frisch et al., 2003). There is, however, a strong correlation between the appearance of the P600 effect and grammatical violations. In contrast, findings are more varied with respect to the presence of a LAN. In addition to the LAN and P600, some studies have found an N400, or a biphasic N400-P600 pattern (but no LAN) in response to syntactic violations (see an overview reported in Molinaro et al., 2011). This is surprising, since the N400 is a component normally associated with difficulty in semantic integration (see Kutas and Federmeier, 2011, for an overview). It has therefore been proposed that an N400 in response to syntactic agreement anomalies is likely to be a result of non-syntactic information that is needed to process the mismatch, for example information that requires lexical access (Molinaro et al., 2011). Because the LAN and N400 are variable in studies of native processing, particularly for gender agreement, we will consider the P600 to be the primary measure of native-likeness, although we will report findings in the time window associated with the LAN/N400 (300–500 ms after presentation) as well.

ERP results regarding grammatical gender processing in the L2 have provided mixed results. A number of studies find that, at least under some conditions, sufficiently proficient L2 learners are able to show native-like ERP responses to gender violations. A set of studies investigating L2 processing of French suggests that English, German, and Spanish learners of French can show native-like ERP responses in the form of a P600 effect (Frenck-Mestre et al., 2009; Foucart and Frenck-Mestre, 2011, 2012). The same goes for English and Chinese learners of Spanish (Tokowicz and MacWhinney, 2005; Gillon Dowens et al., 2010, 2011). German and Polish learners of Dutch can also show a P600 in response to gender violations (Sabourin and Stowe, 2008; Loerts, 2012). Despite these consistent results, however, it is clear that this does not generalize to success in all aspects of gender processing, as the English and German learners also failed to respond in a native-like manner to gender in some forms of agreement (Foucart and Frenck-Mestre, 2011, 2012). Stronger yet, Romance learners of Dutch did not show sensitivity to gender agreement anomalies in the form of a P600 effect even in straightforward determiner noun agreement structures (Sabourin and Stowe, 2008). It is unclear why this group failed to exhibit the majority pattern; we will discuss some factors which might have affected their success in somewhat more detail.

One of the factors which has been considered to be central for native-like learning of a late L2 is whether a grammatical element (e.g., gender) is present in the L1. Many studies have focused on this question, but have reached different conclusions. There is some evidence that having a gender system in the L1 might be an advantage when acquiring an L2 gender system (e.g., Bruhn de Garavito and White, 2000; Hawkins, 2001; Franceschina, 2005). This is in favor of models proposing that the L1 restricts L2 acquisition (Hawkins and Chan, 1997). However, there is also evidence of L2 learners without gender systems in their L1 being able to show full acquisition of grammatical gender (White et al., 2001, 2004), which is seen as evidence against such a restriction (Schwartz and Sprouse, 1994, 1996; see also White, 1989; White et al., 2004). The presence vs. absence of gender in the L1 seems at the least to be more complicated than these views suggest, however.

The French and Spanish studies mentioned earlier show that learners with no gender in their L1 (English and Chinese speakers) can show native-like ERP responses. Further, Sabourin and Stowe (2008) find differences between two L1s which both have gender: German on the one hand and Romance learners on the other. Sabourin and Stowe themselves attribute their results to the (lack of) similarity between the native and target language of these learners: Dutch gender is in general predictable from the gender of the cognate German word due to their common historical origin, while there is no one-to-one-correspondence between Romance and Dutch gender at the lexical level. Moreover, agreement between noun and adjective is more similar in German and Dutch than the Romance languages and Dutch. Sabourin and Stowe conclude that processing routines are transferred from L1 to L2, rather than transfer of the abstract knowledge that nouns have gender, and that these routines must be similar for successful transfer (see Foucart and Frenck-Mestre, 2011, for a similar argument).

However, an explanation which assumes that similar routines in L1 are necessary for native-like processing does not account for the results of other studies mentioned above showing that even with no gender system in the L1, learners are able to show native-like effects. A different approach to the effects of L1 transfer is formulated within the Competition model (see Bates and MacWhinney, 1987). According to the competition-based account, when L1 does not contain gender there is no interference. This predicts successful outcomes for languages with no gender (Tokowicz and MacWhinney, 2005). However, when existing processing routines are transferred, they will cause interference if they are dissimilar from those required for L2 (accounting for the failure of the Romance learners of Dutch).

The target language itself may also contribute to the failure of Sabourin and Stowe's (2008) Romance group to show native-like processing. Most of the successful studies have investigated Romance target languages. Unlike Romance or Slavic languages, which have transparent gender systems (i.e., a predictable gender category based on morphophonological patterns), Dutch is generally regarded as having an opaque gender system (Corbett, 1991; van Berkum, 1996). Although some morphological forms predict the gender of the word, these cues are only available for a relatively small proportion of the vocabulary in the language. This clearly presents a more difficult problem for the learner than gender in a more transparent language, which may certainly explain why the Romance group in the Sabourin and Stowe study failed to achieve a native-like level.

Neither L1 interference nor target language opaqueness, however, entirely accounts for the results found by Loerts (2012). Her study demonstrates that highly advanced Polish learners of Dutch can show somewhat weak, but native-like ERP responses, even though Polish agreement differs from Dutch. Loerts' results also show that an opaque system can be learned, although it may be more difficult to learn than a transparent system. Only her most proficient learners showed native-like processing (see Davidson and Indefrey, 2009, for another example of relatively low proficient learners failing to show native-like effects for gender processing in an opaque L2 system), while even fairly low proficient English learners of Spanish have been shown to respond with a clear P600 effect (Tokowicz and MacWhinney, 2005).

An alternative explanation is thus that Sabourin and Stowe's (2008) Romance learners were simply not proficient enough to show online processing comparable to that of natives. Although the proficiency of the Romance group was not investigated in detail, a similar group of German learners did significantly better when tested on offline gender knowledge (Sabourin, 2003). The Romance participants in the ERP study also performed worse at the end of sentence grammaticality judgments collected during the ERP session. It has been shown that proficiency affects brain responses (e.g., Steinhauer et al., 2006; McLaughlin et al., 2010). A replication of the Sabourin and Stowe study with a group of learners as proficient as in the Loerts study can demonstrate whether this is the sole explanatory factor. This is one of the aims of the current study.

However, there is another factor that may have produced the difference between the two Dutch studies, which has thus far been overlooked: testing modality. Unlike virtually all the other studies summarized above, Loerts (2012) tested her Polish learners using auditory sentence presentation. She argues that the learners had acquired their L2 primarily in the auditory modality as emigrants who arrived with no formal training in their new language. Consequently, processing routines may be tuned to the auditory stimulus modality. Indeed, the experience of learning in immersion can be expected to differ substantially from a formal learning environment. Yet, the various populations that have been tested so far differ in this domain. The participants in the Romance studies summarized above included learners with extensive formal training in their L2. In many of the studies there was no immersion (Tokowicz and MacWhinney, 2005; Gillon Dowens et al., 2011) or only minimal immersion during the participants' recent residence in France (Foucart and Frenck-Mestre, 2011, 2012). Sabourin and Stowe (2008), unlike Loerts, tested a similar late immersion population using visual materials, with each word presented consecutively in the center of the screen. An alternative explanation for the lack of a native-like response in their study could thus be difficulties with the visual presentation. Below, we will speculate about why a visual ERP paradigm might, under some circumstances, be problematic.

In a typical language comprehension ERP paradigm, participants are presented with sentences displayed one word at a time at the center of a screen, at a rate of around two words per second, a technique called rapid serial visual presentation (RSVP). The advantages of this method are that the duration of stimulus presentation can be controlled (and manipulated) tightly, that eye movements, which lead to large artifacts in the EEG, are reduced to a minimum, and that making the stimulus material and time-locking the brain responses to the presentation of violations in the stimulus is relatively straightforward. Consequently, a large majority of ERP sentence comprehension studies use this method. In contrast, auditory sentence presentation is used much less frequently in ERP research. With spoken stimuli, it is more difficult to control the presentation duration of individual words. In addition, making recordings of spoken sentences is more time consuming and requires tight control of acoustic confounds (e.g., prosodic cues about upcoming information, Dimitrova et al., 2012), as well as timing issues (e.g., setting markers to millisecond precision for the events of interest).

We do not expect to find interesting differences between word-by-word reading and listening for language processing in natives (Müller et al., 1997; Hagoort and Brown, 2000; Balconi and Pozzoli, 2005). In the L1, learners develop fully automatized processing of both modalities; moreover, the auditory representation of language is automatically activated by written materials (Perfetti et al., 1992; Frost, 1998), so that the routines activated during auditory processing can be utilized as well as those specific to the written modality (Homae et al., 2002). Despite expecting comparable results for the two modalities in general, even for L1 comprehenders, consecutive word by word presentation in the middle of the screen presents a challenge under some circumstances. The optimum speed of presentation is an issue; Hopp (2010) shows that speeded RSVP presentation can make even native speakers break down in their grammaticality judgment ability, making their performance mirror that of L2 learners (see also Camblin et al., 2007, who show a case where speeded RSVP eliminates an effect which is clear in naturally produced connected speech). Conversely, studies directed at optimizing computerized text presentation on small screens have shown that too slow a presentation can also interfere with comprehension (Bernard et al., 2001). This may result from working memory and maintenance issues.
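The RSVP timing discussed above can be made concrete with a small sketch. It assumes a fixed 250 ms word display followed by a 250 ms blank screen, which is the two-words-per-second rate reported for the visual condition in the Procedure section. The function name `rsvp_onsets` and the choice of which word to look up are illustrative assumptions, not part of the original stimulus scripts (the study itself used E-prime).

```python
def rsvp_onsets(words, word_ms=250, blank_ms=250):
    """Return (word, onset_ms) pairs for word-by-word presentation.

    Each word is shown for word_ms and followed by a blank of blank_ms,
    so consecutive onsets are word_ms + blank_ms apart.
    """
    soa_ms = word_ms + blank_ms  # stimulus onset asynchrony
    return [(word, index * soa_ms) for index, word in enumerate(words)]


# Example sentence taken from Table 2 (violation version, punctuation dropped).
sentence = "Vera plant rode rozen in het tuin van haar ouders".split()
onsets = rsvp_onsets(sentence)

# Look up the onset of one word of interest; an ERP marker would be
# sent at this latency relative to sentence start.
word_of_interest = "tuin"
print(onsets[sentence.index(word_of_interest)])
```

Because every onset is a fixed multiple of the stimulus onset asynchrony, time-locking in the visual modality needs no per-item measurement, whereas auditory stimuli need a marker placed in each recording.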
Stowe (1991) showed that readers were more likely to garden-path or have difficulty in recovering from a garden path with center of the screen presentation, as opposed to presentation of words across the screen in their normal position, even when readers were allowed to pick their optimum pace.

L2 learners differ in a number of ways from native speakers, some of which can be expected to interact with modality. First, their cumulative reading experience in the L2 is likely to be substantially lower than that of native speakers. This means that their activation of the L2 via this modality can be expected to be less automatized than in native speakers (Koda, 1996). Second, interference from the writing system of the first language may lead to even less activation of the phonological form of the L2, in comparison with natives (Koda, 1999). These differences can potentially play a role for all L2 learners, but may be especially relevant for learners with less formal instruction in the language and in whom learning took place primarily via the auditory modality. The optimum speed of presentation is also likely to differ between various groups of learners and natives. This issue has received relatively little attention in the literature, but given that stimulus modality was one difference between the unsuccessful Romance group reported by Sabourin and Stowe (2008) and the relatively more successful group studied by Loerts (2012), this factor was included in the current experiment in order to determine whether it explains the different patterns seen in the two studies. A clear effect of modality would suggest that researchers need to pay more attention to this variable in their experimental designs, and might have implications for the differences between immersed and instructed learners as well.

Summarizing, the goal of the current study is to gain more insight into why some groups may show persistent problems in attaining native-like processing of grammatical gender. We investigate grammatical processing in immersed Romance L2 learners of Dutch, using the P600 as a measure of native-likeness, in order to answer the question whether late L2 learners can show native-like syntactic processing, even if the gender marking in the L1 differs from that in the L2, which may cause interference, and the L2 gender system is relatively opaque, making it harder to recognize the grammatical agreement regularities. Following Sabourin and Stowe (2008), in addition to gender violations, which have proven difficult to master, we present our participants with non-finite verb violations, a construction that is relatively easy to acquire, as a baseline for comparison. We compare the responses of high-proficient Romance learners with those of native speakers of Dutch. Additional measures of proficiency will be gathered from the first. A within-subject comparison of stimulus modalities allows us to determine whether the absence of a P600 effect for gender in the Sabourin and Stowe (2008) study was due to processing demands associated with the task modality.

In addition to standard group analyses of the ERP waveforms, we will closely inspect individual differences within each group. Adding these analyses has several benefits. First, lack of effects in grand mean ERP results does not necessarily mean that none of the individuals showed a native-like ERP response. Rather, a null effect might be based on opposite effects (a positive going effect in one set of individuals and a negative going effect in others) canceling each other out. In a similar way, biphasic responses can be a spurious result of averaging (Osterhout, 1997; Nieuwland and Van Berkum, 2008; Tanner and Van Hell, 2014; Tanner et al., 2014). Before we draw any strong conclusion that a group of learners' processing of gender agreement qualitatively differs from natives, it is important to identify varying patterns in each of the groups. Furthermore, there may be predictors of native-likeness in L2 learners, such as age of acquisition, proficiency, language exposure and use, that may explain variance within the group (e.g., Weber-Fox and Neville, 1996, 1999; Rossi et al., 2006; Steinhauer et al., 2009; Tanner et al., 2014). Understanding which individual difference factors, if any, are associated with the outcome in L2 learning is a fundamental question which is difficult to answer with group-based analyses, and might also help us determine the source of some of the mixed patterns of results in L2 gender research.

MATERIALS AND METHODS

PARTICIPANTS

Participant characteristics and proficiency scores can be found in Table 1. Forty-five participants took part in the experiment. Seven participants had to be excluded from the analyses because of too many artifacts in the EEG signal. Nineteen of the remaining participants were Romance learners of Dutch (six French, five Italians, three Romanians, five Spanish). The remaining 19 participants were native speakers of Dutch. All participants were right handed, neurologically unimpaired and did not have any problems with hearing, speaking, or writing. Prior to conducting any procedures, written consent was obtained from all participants for the study, which was approved by the local ethics committee. Participants were fully debriefed at the end of the experiment and received a small fee for participation.

All learners had moved to the Netherlands at or after the age of 16 and had been immersed in the L2 context for at least 5 years at the time of testing. The learners had very little to no exposure to Dutch before immigration. They were asked to indicate the frequency of use of Dutch in daily life: a composite score for L2 use was calculated based on questions about language use at home (with partner and children), outside of the home (at the workplace and other), and use of Dutch media. They additionally answered questions about their use of Dutch in a specific modality: they estimated the percentage of use of the L2 in the visual modality (i.e., reading/writing) compared to the auditory modality (i.e., speaking/listening), both during learning of Dutch at onset of immigration and during everyday life at the time of testing.

L2 proficiency was assessed by means of several (written) measures. A pre-selection on the basis of a pre-test in the form of 20 grammar items of the Dutch DIALANG Placement Test (adapted from http://www.lancaster.ac.uk/researchenterprise/dialang/about.html) ensured that all participants had a relatively high level of proficiency in Dutch. Participants had to complete at least 13 of the items correctly to be selected for participation. Another proficiency measure was taken in the lab, in the form of a C-test (constructed by Keijzer, 2007), which consisted of two texts containing gaps where parts of some words had been left out. The participants' task was to fill the gaps. After the EEG experiment, participants were also asked to complete

Table 1 | Means (and ranges) of participant characteristics and scores on proficiency measures, and significance of between-group comparisons (Mann-Whitney U-test).
Measure | Learners (n = 19) | Natives (n = 19) | U- and p-value
AGE/EXPOSURE/USE
Age at testing (years) | 42.3 (24–64) | 39.8 (21–59) | U = 162, p = 0.599
Age of acquisition (years) | 26.0 (16–39) | – | –
Length of residence (years) | 16.3 (5–43) | – | –
L2 use (%) | 58.4 (12.3–87.3) | – | –
USE OF MODALITY: DURING LEARNING (%)
Visual | 43.7 (20–70) | – | –
Auditory | 56.3 (30–80) | – | –
USE OF MODALITY: CURRENT (%)
Visual | 42.6 (20–70) | – | –
Auditory | 57.4 (30–80) | – | –
PROFICIENCY MEASURES
C-test (%) | 79.4 (42.1–100) | 95.2 (68.4–100) | U = 299.5, p < 0.001
Gender assignment task (%) | 87.3 (64.6–100) | 99.5 (93.8–100) | U = 332.5, p < 0.001
SELF-RATED PROFICIENCY
Reading | 4.4 (3–5) | – | –
Writing | 3.6 (1–5) | – | –
Speaking | 3.9 (2–5) | – | –
Listening | 4.3 (3–5) | – | –

Table notes.
L2 use: Composite score based on language use inside and outside of the home and use of Dutch media.
Use of modality, during learning: Percentage of L2 use in the visual modality (i.e., reading/writing) compared to the auditory modality (i.e., speaking/listening) during learning of Dutch at onset of immigration.
Use of modality, current: Percentage of L2 use in the visual modality (i.e., reading/writing) compared to the auditory modality (i.e., speaking/listening) in everyday life at the time of testing.
C-test: Percentage of correct responses on the C-test (spelling errors were not penalized).
Gender assignment task: Percentage of correct responses (i.e., a minimum of 2/3 instances of each item assigned correctly) on the gender assignment task.
Self-rated proficiency: Ratings on a 5-point scale with five as highest level of skill in Dutch.

an offline gender assignment task. This task was used to test the participants' knowledge of the grammatical gender of the critical nouns used in the EEG experiment. In addition to these measures, learners rated their L2 Dutch in terms of reading, writing, speaking, and listening proficiency on a Likert-scale between 1 (very bad) and 5 (very good). Participants' scores on the proficiency measures can be found in Table 1.

MATERIALS

The design and materials of the EEG experiment were largely based on work by Loerts (2012), who studied L2 gender and non-finite verb processing in natives and Slavic learners of Dutch. One hundred and forty-four experimental sentences were created (see Table 2 for examples, the full list of sentences can be found in the Supplementary Material, Data Sheet 1). Forty-eight of the sentences were used to test non-finite verb agreement. Half of these contained an infinitive and the other half a past participle verb. For their ungrammatical counterparts, these verbs were altered into their participial or infinitival form, respectively. The other 96 sentences were used to test grammatical gender agreement. In these sentences, the determiner either agreed in gender with the following noun or violated gender concord. Determiner and noun were either adjacent, or non-adjacent (with an adjective intervening between the determiner and noun). Only highly frequent Dutch target nouns and verbs were used (nouns: mean = 2.16, range = 0.78–3.08; verbs: mean = 2.46, range = 0.95–4.05, on log lemma frequency of occurrence per million taken from the CELEX corpus: Baayen et al., 1995). Finally, 122 well-formed filler sentences were included. These filler sentences were added to raise the overall proportion of correct sentences to about 3/4, making the task more similar to natural language processing.

Footnote: Because of the large number of factors in the current design, it was not possible to get a high number of trials per condition without making the experiment too long, which in all probability would have resulted in severe fatigue effects in our data. We realize that as a result, the number of trials per condition is on the low side, particularly for the non-finite verb condition. However, highly salient agreement errors, such as the non-finite verb agreement violations used in the current study, have been shown to elicit large ERP effects. As the results section of this paper shows, even with this low number of trials we had sufficient power to find significant effects in this condition. In the less salient gender condition however, there was double the amount of trials per condition to ensure sufficient power.

Table 2 | Example materials of the EEG experiment.

Condition | Example sentences | Number of items per list
Non-finite verb agreement | Ze heeft alleen haar beste vriendin uitgenodigd/*uitnodigen voor haar verjaardag. (She has only invited/*invite her best friend for her birthday.) | 12/12 visual, 12/12 auditory
Non-finite verb agreement | Hij probeert me altijd aan het lachen te maken/*gemaakt door grapjes te vertellen. (He always tries to make/*made me laugh by telling jokes.) |
Gender agreement | Vera plant rode rozen in de/*het tuin van haar ouders. (Vera is planting red roses in the[com]/*the[neu] garden of her parents.) | 24/24 visual, 24/24 auditory
Gender agreement | Het duurde uren voordat Jeroen het/*de nette pak van zijn broer had aangetrokken. (It took hours for Jeroen to put on the[neu]/*the[com] fancy suit of his brother.) |

Critical targets, where the ERP was measured, are underlined.

For the auditory part of the experiment, spoken forms of all sentences were recorded. Each sentence was read aloud by a female native speaker with a standard Dutch accent who was trained to produce correct and incorrect sentences with normal intonation. Despite training, acoustic confounds, such as subtle prosodic cues to the upcoming ungrammaticality remain possible (Dimitrova et al., 2012). To prevent any influence of such confounds, each sentence was presented in its original form or in a digitally spliced version, constructed by cross-splicing the original recordings of grammatical and violation sentences, cutting at the onset of the determiner for the gender condition, or the verb
in the non-finite verb condition. Noise reduction and volume normalization were applied to all sound files.

A within-subject design was employed to test the effects of modality within the same group of subjects. Eight experimental lists were created using a Latin Square design, crossing the factors modality (visual, auditory), correctness (correct, incorrect), and splicing (spliced, unspliced), to ensure each participant was presented with only one version of each sentence and an equal number of each type. Each list was presented to two or three participants from each group, and each participant saw only one list.

FIGURE 1 | Approximate location of the recording sites and the 10 regions of interest used for analyses: left/middle/right frontal (LF/MF/RF), left/right temporal (LT/RT), left/middle/right parietal (LP/MP/RP), and left/right occipital (LO/RO).

PROCEDURE

Event-related potentials were recorded while participants listened to or read the sentences. After each sentence, the participant had to make a grammaticality judgment. Participants were comfortably seated in an electrically shielded and sound attenuated chamber. The sentences were presented using E-prime (Schneider et al., 2002a,b), which in addition recorded accuracy with respect to the grammaticality judgments. Visual stimuli were presented on a computer screen in front of the participants. Speakers were placed to the left and right side of the screen. Visual sentences were presented at a rate of two words per second: each word was presented for 250 ms, followed by 250 ms blank screen. Auditory sentences were presented at normal speech rate. Participants were asked to avoid moving any parts of their body and not to move their eyes or blink during sentence presentation. The experiment consisted of four blocks: either two visual blocks followed by two auditory blocks or the reverse.
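The list construction described under Materials can be sketched as a standard Latin Square rotation. The text states only that eight lists cross modality, correctness, and splicing; the rotation rule below (list k gives item i the condition with index (i + k) mod 8) and the names `CONDITIONS` and `build_lists` are assumptions made for illustration, not the authors' documented procedure.

```python
from collections import Counter
from itertools import product

# The 2 x 2 x 2 crossing yields 8 versions of every sentence.
CONDITIONS = list(product(("visual", "auditory"),
                          ("correct", "incorrect"),
                          ("spliced", "unspliced")))


def build_lists(n_items):
    """Build one list per condition by rotating conditions over items.

    In list k, item i is shown in condition (i + k) mod 8, so each item
    appears exactly once per list and in a different version in each list.
    """
    n_conditions = len(CONDITIONS)
    return [[CONDITIONS[(item + k) % n_conditions] for item in range(n_items)]
            for k in range(n_conditions)]


# Example: the 48 non-finite verb sentences.
lists = build_lists(48)

# Count modality x correctness cells in the first list, pooling over splicing.
cell_counts = Counter((modality, correctness)
                      for modality, correctness, _ in lists[0])
print(cell_counts[("visual", "correct")], cell_counts[("visual", "incorrect")])
```

Under this rotation each of the eight conditions receives 6 of the 48 verb items per list, so pooling over splicing gives 12 correct and 12 incorrect items per modality, which agrees with the item counts listed in Table 2.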
The duration of the breaks between blocks was determined by the participant. Altogether, the EEG experiment lasted about 1 h.

Subsequently, participants were asked to fill in the pen and paper C-test. Finally, they performed a gender assignment task on a computer. The target words of the EEG experiment were presented in randomized order, each item appearing three times. Participants were instructed to indicate, by a mouse click on either the common (“de”) or neuter (“het”) definite article, whether they thought the word had common or neuter gender in Dutch.

EEG RECORDING AND ANALYSIS

The continuous EEG (500 Hz/22 bit sampling rate) was recorded from 54 Ag/AgCl scalp electrodes mounted into an elastic cap (Electro Cap International, Inc.) according to the international extended 10–20 system (see Figure 1 for recording sites). To monitor eye-movements, four additional electrodes were placed on the outer canthi of each eye and above and below the left eye. Scalp electrode signals were measured against a common reference during recording. Impedances were reduced to below 10 kΩ. The amplifier (TMS international) measured DC with a digital FIR filter (cutoff frequency 130 Hz) to avoid aliasing. After acquisition, the raw data were further processed with Brain Vision Analyzer 2.0.4. The data were re-referenced to the average of two electrodes placed over the left and right mastoids and digitally filtered with a high-pass filter at 0.1 Hz and low-pass filter at 40 Hz. The data were segmented, time-locked to the onset of the critical target (from 500 ms before to 1400 ms after stimulus onset).

Footnote: In some instances, some temporal and frontal electrodes could only be reduced to below 20 kΩ.
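The segmentation step above, combined with the baseline period and analysis windows given in the following paragraphs, can be sketched for a single epoch of one channel. This is a minimal illustration in plain Python with synthetic values; the authors processed their data in Brain Vision Analyzer, and the helper names (`ms_to_index`, `window_mean`, `baseline_correct`) are hypothetical. Windows are treated as half-open intervals, which is an assumption.

```python
SAMPLING_RATE_HZ = 500   # from the recording description (2 ms per sample)
EPOCH_START_MS = -500    # epochs run from -500 to 1400 ms around target onset


def ms_to_index(t_ms):
    """Convert a latency relative to target onset into a sample index."""
    return int(round((t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000))


def window_mean(epoch, start_ms, end_ms):
    """Mean amplitude in the half-open window [start_ms, end_ms)."""
    segment = epoch[ms_to_index(start_ms):ms_to_index(end_ms)]
    return sum(segment) / len(segment)


def baseline_correct(epoch, start_ms=-200, end_ms=0):
    """Subtract the mean of the pre-stimulus baseline from every sample."""
    baseline = window_mean(epoch, start_ms, end_ms)
    return [value - baseline for value in epoch]


# Synthetic epoch (not study data): a constant 2.0 microvolt offset plus a
# 3.0 microvolt positivity starting 600 ms after target onset.
epoch = [2.0 + (3.0 if t_ms >= 600 else 0.0) for t_ms in range(-500, 1400, 2)]

corrected = baseline_correct(epoch)
lan_n400 = window_mean(corrected, 300, 500)   # LAN/N400 window
p600 = window_mean(corrected, 600, 1200)      # P600 window
print(lan_n400, p600)
```

Baseline correction removes the constant offset, so the early window averages to zero while the late window retains the simulated positivity; per-condition differences of such window means are what enter the ANOVAs.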
Average ERPs were formed without regard to behavioral responses, from trials free of muscular and ocular artifacts; the latter were corrected using the Gratton and Coles procedure (1989). Individual channel artifacts led to rejection of 0.5% of the data in the learner group and 0.6% in the native group. A baseline period was set from 200 to 0 ms before onset of the critical words to normalize the data. A total of 10 regions of interest (ROIs), containing five or six electrodes each, were used for analyses (depicted in Figure 1).

We analyzed amplitudes of the ERP waveforms in the time-windows in which a LAN/N400 and P600 are to be expected: 300–500 and 600–1200 ms after stimulus onset. The latter window is somewhat longer than is typical in P600 studies in monolinguals, because the P600 in L2 learners can be somewhat delayed (Weber-Fox and Neville, 1996; Hahne, 2001; Rossi et al., 2006; Sabourin and Stowe, 2008). For grand mean analyses, ANOVAs were calculated within each time window and sentence structure (non-finite verb, grammatical gender) separately, using the ezANOVA function of the ez package (version 4.2.2: Lawrence, 2013), implemented in R (version 3.1.0: R Core Team, 2014). The analyses included correctness (grammatical, violation) and modality (visual, auditory) as within-participants factors, and group (natives, learners) as between-participants factor. Data from lateral (left and right frontal, temporal, parietal, and occipital ROIs) and medial (middle frontal and middle parietal ROIs) regions were treated separately in order to identify topographic and hemispheric differences. For the lateral regions, the ANOVA also included hemisphere (left, right) and anterior-posterior (frontal, temporal, parietal, occipital) as within participants factors. For the medial regions, anterior-posterior (frontal, parietal) was the only topographical factor in the ANOVA. The Greenhouse-Geisser correction was applied for violations of the sphericity assumption. Only main effects of, and interactions with, correctness are reported. In the presence of a significant higher-level interaction, lower-level interactions, and main effects are not interpreted. False discovery rate correction (Benjamini and Hochberg, 1995) was applied for follow-up tests to control for Type 1 error. Additional regression analyses, performed in R version 3.1.0 using the lm function of the lme4 package (version 1.1.6: Bates et al., 2014) will be described together with the results.

RESULTS

BEHAVIORAL RESULTS

The percentages of accurate grammaticality judgments per group, sentence structure, and modality are shown in Figure 2. A Three-Way ANOVA was conducted on the arcsine transformed proportions of correct responses to stabilize variance and normalize the data (mean and SDs reported below are from the untransformed percentages). The ANOVA revealed a significant main effect of group, F(1, 36) = 53.24, p < 0.001, with the learners giving fewer correct responses than the natives (mean = 71.1, SD = 17.8 vs. mean = 93.0, SD = 11.5). The main effect of sentence structure, F(1, 36) = 41.66, p < 0.001, shows that the average performance is worse in the gender condition. However, there is also a significant interaction between group and structure, F(1, 36) = 5.55, p = 0.024. Paired comparisons show that the difference between structures is highly significant in the learner group [t(62.9) = 4.91, p < 0.001, gender mean = 62.8, SD = 14.1; non-finite verb mean = 79.5, SD = 17.3]. There is a smaller, but still significant difference between structures in the native group [t(59.7) = 2.42, p = 0.019, gender mean = 92.2, SD = 6.2; verbs mean = 93.8, SD = 15.1]. Interestingly, with respect to one of our research questions, there is a significant main effect of modality, F(1, 36) = 8.37, p = 0.006, with the percentage of correct responses in the auditory condition being somewhat lower than in the visual condition (mean = 79.5, SD = 20.0 vs. mean = 84.6, SD = 16.7). There are however no significant interactions between modality and group, modality and structure, or group, modality, and structure (all Fs < 3).

FIGURE 2 | Accuracy on grammaticality judgments made during ERP recording session by group, modality, and structure.

ERP RESULTS: GRAND MEAN ANALYSES

Figures 3, 4 show the grand mean ERP waveforms for natives and learners, respectively. Results of the omnibus ANOVAs are provided in the Supplementary Material (Data sheet 2). Significant results and follow-up analyses will be described below.

FIGURE 3 | Natives' grand average ERP waveforms at all 10 regions of interest (see Figure 1) for correct and incorrect use of non-finite verb and gender agreement in the visual and the auditory condition.

Non-finite verb agreement

In the 300–500 ms window, the lateral omnibus ANOVA for the non-finite verb condition showed a significant correctness by anterior-posterior interaction, F(3, 108) = 6.02, p = 0.011; follow-up analysis revealed that the effect of correctness reached significance in posterior regions only [frontal, F(1, 36) = 0.52, p = 0.476; temporal, F(1, 36) = 4.16, p = 0.065; parietal, F(1, 36) = 14.70, p = 0.002; occipital, F(1, 36) = 11.77, p = 0.004], with the incorrect condition showing more negative voltages than the correct condition. Due to a marginally significant group by correctness interaction in the omnibus ANOVA, F(1, 36) = 3.65, p = 0.064, another follow-up analysis was conducted separately for natives and learners. This analysis revealed that the main effect of correctness was significant in natives, F(1, 18) = 7.36, p = 0.028, but not in learners, F(1, 18) = 0.08, p = 0.780. The medial omnibus ANOVA revealed a significant main effect of correctness, F(1, 36) = 9.22, p = 0.004, with more negative voltages for the incorrect than the correct condition. Due to a marginally significant correctness by anterior-posterior interaction, F(1, 36) = 3.33, p = 0.076, a follow-up analysis was conducted, which again revealed that the effect of correctness reached significance in the posterior region only [frontal, F(1, 36) = 2.77, p = 0.105; parietal, F(1, 36) = 14.55, p = 0.002]. Additionally, the omnibus ANOVA showed a marginally significant group by correctness by modality interaction, F(1, 36) = 3.56, p = 0.067; but follow-up analyses failed to reveal a significant modality effect in either of the groups [correctness by modality interaction: natives, F(1, 18) = 0.72, p = 0.407; learners, F(1, 18) = 4.12, p = 0.114]. The main effect of correctness reached significance on its own in natives, F(1, 18) = 6.26, p = 0.044, but not in learners, F(1, 18) = 3.00, p = 0.100. Since visual inspection of the grand mean waveforms seems to suggest a possible negativity in medial regions for learners in the auditory condition, and finding a native-like effect in this time window for L2 learners is unusual, we performed an additional follow-up analysis separately for each modality in learners, which showed a significant correctness effect in the auditory, F(1, 18) =
6.18, p = 0.046, but not the visual modality, F(1, 18) = 0.43, p = 0.522.

In the later time window (600–1200 ms), the lateral omnibus ANOVA showed a significant group by correctness by anterior-posterior interaction, F(3, 108) = 5.95, p = 0.008. Follow-up analysis revealed a significant main effect of correctness in both groups [natives, F(1, 18) = 20.39, p = 0.001; learners, F(1, 18) = 14.16, p = 0.001], with more positive amplitudes in the incorrect compared to the correct condition. A significant correctness by anterior-posterior interaction was present for natives only [natives, F(3, 54) = 23.51, p = 0.001; learners, F(3, 54) = 1.97, p = 0.169], which was driven by the fact that the positivity in natives was significant in the temporal, F(1, 18) = 16.32, p = 0.001, parietal, F(1, 18) = 36.07, p = 0.001, and occipital region, F(1, 18) = 35.54, p = 0.001, but not the frontal region, F(1, 18) = 0.00, p = 0.985. The medial omnibus ANOVA revealed a significant correctness by anterior-posterior interaction, F(1, 36) = 22.93, p < 0.001; a follow-up analysis showed that the correctness effect is stronger in the parietal, F(1, 36) = 68.36, p < 0.001, than the frontal region, F(1, 36) = 29.15, p < 0.001.

It is apparent from these grand mean analyses that non-finite verb agreement violations are associated with a biphasic pattern of an N400 followed by a P600 in natives. The lack of significant effects for the frontal regions rules out a LAN effect in the 300–500 ms time window. Learners’ responses are very similar to natives’ in the later time window (P600). However, in the early time window learners fail to show a native-like effect (N400) in the visual condition, and only show a smaller and less broadly distributed N400 compared to natives in the auditory condition.

FIGURE 4 | Learners’ grand average ERP waveforms at all 10 regions of interest (see Figure 1) for correct and incorrect use of non-finite verb and gender agreement in the visual and the auditory condition.

Gender agreement

In the 300–500 ms window, the lateral omnibus ANOVA for the gender condition showed a significant correctness by modality by anterior-posterior interaction, F(3, 108) = 3.90, p = 0.039, and a group by correctness by modality by hemisphere interaction, F(1, 36) = 5.24, p = 0.028. Follow-up analyses conducted separately for natives and learners revealed a significant correctness by modality by anterior-posterior interaction in natives, F(3, 54) = 6.28, p = 0.016, but no significant effects in learners (all Fs < 2.03). However, in natives, neither the main effect of correctness nor the correctness by anterior-posterior interaction reached significance in either of the modalities analyzed separately (all Fs < 3.90). The medial omnibus ANOVA showed a significant group by correctness interaction, F(1, 36) = 4.30, p = 0.045. However, follow-up analyses failed to find a significant main effect of correctness, or any of its interactions, in either of the groups analyzed separately (all Fs < 4.23).

In the 600–1200 ms window, the lateral omnibus ANOVA revealed a significant group by correctness by anterior-posterior interaction, F(3, 108) = 20.17, p < 0.001, and a significant correctness by modality by anterior-posterior interaction, F(3, 108) = 7.31, p = 0.002. Follow-up analyses conducted separately for natives and learners revealed a significant correctness by modality by anterior-posterior interaction in natives, F(3, 54) = 6.17, p = 0.014, but no significant effects in learners (all Fs < 1.81).
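The follow-up tests in these grand mean analyses are reported after false discovery rate correction (Benjamini and Hochberg, 1995). As an illustration only, the sketch below shows how such an adjustment can be computed. It is written in Python rather than the R used for the analyses, the function name benjamini_hochberg is hypothetical, and the input p-values are made-up numbers, not values from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative sketch only; not the code used for the reported analyses.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone: each one is the minimum of p * m / rank over all
    # ranks at or above its own.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values for four hypothetical follow-up tests.
    example = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example))
```

A follow-up test would then be called significant when its adjusted p-value falls below the chosen threshold (0.05 being the conventional choice).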
In natives, the main effect of correctness was significant in all regions except for the frontal one [frontal, F(1, 18) = 0.06, p = 0.806; temporal, F(1, 18) = 14.33, p = 0.001; parietal, F(1, 18) = 38.20, p = 0.001; occipital, F(1, 18) = 35.39, p = 0.001], with amplitudes in the incorrect condition being more positive compared to the correct condition. The correctness by modality interaction did not reach significance in any of the regions (all Fs < 4.03). The medial omnibus ANOVA showed a significant group by correctness by anterior-posterior interaction, F(1, 36) = 11.24, p = 0.002. Follow-up analyses revealed that this was due to a significant correctness by anterior-posterior interaction in natives, F(1, 18) = 26.82, p = 0.001, but not learners, F(1, 18) = 1.86, p = 0.190. The interaction in natives was driven by the fact that the effect of correctness was stronger in the posterior region [frontal, F(1, 18) = 13.04, p = 0.002; parietal, F(1, 18) = 47.69, p < 0.001].

These grand mean analyses show that while natives show a classic P600 effect in response to gender agreement violations, learners do not: the P600 is absent for learners, in both modalities. In the early time window, there are again no effects for learners, while the natives seemed to show some small effects, which however failed to reach significance in follow-up analyses.

Figure 5 summarizes the P600 and N400 effects, showing the difference in amplitude between the violation condition and the grammatical condition, collapsed over middle frontal and all temporal, parietal and occipital ROIs, per group, structure, and modality. We see P600 effects for natives, preceded by an N400 effect in non-finite verb violations, but not gender violations. In contrast, the learners only show P600 effects for non-finite verb violations, but they do not show any effects of gender violation. The learners also show a small N400 effect for auditory non-finite verb violations (an effect that only reached significance in the medial regions).

FIGURE 5 | ERP difference waves (incorrect minus correct sentence) per group, structure, and modality, collapsed over middle frontal and all temporal, parietal, and occipital ROIs.

ERP RESULTS: INDIVIDUAL DIFFERENCES ANALYSES

In this section, we will have a closer look at individual differences. First, we will investigate the distribution of N400 and P600 effects across individuals, which can be of importance for the interpretation of the grand mean results, as discussed in the Introduction. Second, we will explore possible predictors of native-likeness in the learner group, since previous research has revealed that age of acquisition, length of residence, L2 proficiency and use can affect the ERP responses (also discussed in the Introduction).

Closer inspection of the N400 and P600 patterns

Following work by Osterhout and colleagues (McLaughlin et al., 2010; Tanner et al., 2013, 2014) we regressed individuals’ N400 effect magnitude onto their P600 effect magnitude, to investigate the distribution of these two components across individuals. The effect magnitude here refers to the average voltage difference between conditions: correct minus incorrect in the 300–500 ms window for the N400, and incorrect minus correct in the 600–1200 ms time window for the P600. Amplitudes were averaged across middle frontal and all temporal, parietal, and occipital regions, where the N400 and P600 effects are to be expected. Figure 6 shows the scatterplots of the results, for each group and sentence structure separately. We also investigated each modality separately, but since the results looked highly similar between modalities, these will not be discussed here. The figure informs us about whether the grand mean waveforms are representative of most individuals’ ERP profiles. We concluded from our grand mean analyses that natives show a biphasic N400-P600 pattern for non-finite verb violations, and only a P600 for gender agreement violations. Examining Figure 6 we indeed see that the biphasic pattern is present for the majority of individuals in the non-finite verbs, and that a P600 (without preceding N400) is dominant for gender. The grand mean results of the learners showed native-like effects for verbs, but not for gender. This conclusion still holds if we look at individual patterns within the group: the distribution of responses in the verb condition looks highly similar between learners and natives, although there is a tendency toward more positivities without preceding negativities and fewer biphasic responses in the learners. The fact that basically none of the learners show any sensitivity to gender violations assures us that the null effect in the grand mean analysis was not due to a cancelation by different patterns.

FIGURE 6 | The distribution of N400 and P600 effect magnitudes (correct minus incorrect for N400, incorrect minus correct for P600) across learners, averaged within middle frontal and all temporal, parietal, and occipital ROIs. Each dot represents a data point from a single participant. The solid line shows the best-fit regression line. The dashed line represents equal N400 and P600 effect magnitudes: individuals above/to the left of the dashed line showed primarily an N400 effect, whereas individuals below/to the right of the dashed line showed primarily a P600 effect. In the non-finite verbs many individuals show biphasic responses (upper right quadrants), whereas in the gender condition there are more sustained positivities (lower right quadrants). Very few individuals show sustained negativities (upper left quadrants). Basically none of the learners are able to show sensitivity to gender violations.

Predictors of P600 effect magnitude in the learner group

To investigate which factors lead to a higher degree of native-likeness in L2 learners, we performed a multiple regression analysis (e.g., Baayen, 2008) assessing the possible influence of age of acquisition, length of residence, L2 proficiency (as measured by the C-test), offline gender knowledge (as measured by the gender assignment task), and L2 use (composite score) on the P600. We took magnitude of the P600 as a measure of native-likeness, since the previous section revealed that this is the most reliable effect in the native group. The average amplitude of the difference wave (incorrect minus correct), calculated in the 600–1200 ms window collapsing middle frontal and all temporal, parietal, and occipital ROIs, was used as the dependent measure in the regression model. Because of skewed distributions, age of acquisition and length of residence were log-transformed, and L2 proficiency, gender knowledge and L2 use were arcsine transformed prior to entry into the model. Additionally, all predictor variables were centered at their mean. The correlation matrix for the dependent measure and the participant characteristics variables can be found in Table 3. Examining Table 3 we see that length of residence shows a significant positive correlation with gender knowledge (i.e., the ability to assign gender offline), r(17) = 0.55, p = 0.014, with longer length of residence being associated with better gender knowledge. However, there is no relation between length of residence and the magnitude of the P600 (i.e., the ability to process grammatical structures efficiently online), r(17) = −0.11, p = 0.665. L2 use positively correlates with both gender knowledge and P600 magnitude, r(17) = 0.52, p = 0.023 and r(17) = 0.49, p = 0.035, respectively, with a higher amount of L2 use being associated with better gender knowledge as well as larger P600 magnitudes.

Table 3 | Correlation matrix for the dependent measure and the participant characteristics variables used in the regression model.

                            P600        Log age of    Log length     Arcsin        Arcsin gender  Arcsin
                            magnitude   acquisition   of residence   proficiency   knowledge      L2 use
P600 magnitude              –
Log age of acquisition      −0.083      –
Log length of residence     −0.106      −0.147        –
Arcsin proficiency          0.140       −0.327        0.230          –
Arcsin gender knowledge     0.134       −0.416        0.552*         0.424         –
Arcsin L2 use               0.486*      −0.388        0.413          0.293         0.518*         –

Asterisk indicates significance of p < 0.05.

In addition to the participant characteristics variables, structure and modality were tested as predictors in the model. The significance of predictors was evaluated by means of the t-test for the coefficients, in addition to model comparison using AIC (Akaike Information Criterion; Akaike, 1974). Table 4 shows the best linear multiple regression model (explained variance: 33.7%). This model shows that the structure being gender has a negative impact (β = −2.79, t = −4.41), and L2 use has a positive impact (β = 3.29, t = 3.07) on P600 effect magnitude. The other predictors (i.e., modality, age of acquisition, length of residence, proficiency, and gender knowledge) did not reach significance by themselves or in interaction with any other variables and were therefore not included in the model. Finally, the model additionally shows an interaction between the structure being gender and L2 use (β = −5.94, t = −2.78). This effect is plotted in Figure 7. There appears to be a significant effect of L2 use on the P600 for non-finite verb agreement violations, R² = 0.32, F(1, 17) = 8.08, p = 0.011, but no significant effect for gender agreement violations, R² = 0.01, F(1, 17) = 0.01, p = 0.756. No other significant interactions with structure or modality were found.

Table 4 | Linear multiple regression model predicting P600 effect magnitude in learners.

Predictor                  Estimate   SE      t-value   p-value
Intercept                  1.388      0.316   4.390     <0.001
StructureIsGender          −2.789     0.632   −4.410    <0.001
L2use                      3.288      1.070   3.074     0.003
StructureIsGender*L2use    −5.939     2.140   −2.776    0.007

FIGURE 7 | The percentage of use of the L2 in daily life predicts P600 magnitude for non-finite verb agreement violations, but not gender agreement violations.

DISCUSSION

Using the P600 as a measure of native-likeness, we tested whether sufficiently proficient late L2 learners can show native-like syntactic processing, even if (1) gender marking in the L1 is implemented differently and (2) the L2 gender system is opaque. We investigated the ERP responses of native speakers and Romance learners of Dutch to anomalies in constructions that are relatively easy to acquire (i.e., non-finite verbs) and those that have been shown to be more difficult (i.e., gender). In addition, we varied the modality in which the stimuli were presented, in order to investigate whether visual presentation might contribute to the lack of sensitivity to gender in the Romance group reported in previous research (Sabourin and Stowe, 2008). The non-finite verb violations elicited a biphasic N400-P600 effect in both native speakers and second language learners. However, in contrast to the native speakers, the learners only showed evidence of an N400 in the auditory and not the visual condition, although the statistical support for this difference is weak¹. Also, the amplitude of the N400 effect was somewhat smaller than in the natives. For the gender violations, we found a clear P600 in natives, but not in learners.

¹ We want to remind the reader that the modality effect in the non-finite verb condition should be interpreted with some caution. Unlike the main effects we report throughout the rest of the discussion (which are based on 24 and 48 items per condition for verb and gender, respectively), the marginally significant interaction we followed up on here is based on 12 items per condition only, which is relatively few for an ERP study. However, if we do not follow up on this interaction the main effect of correctness remains, suggesting that learners are like natives. We felt this claim would be too strong, and therefore discuss the follow-up analysis, despite the statistical concerns.

The effects of modality were quite subtle. We had hypothesized that increased processing demands in the visual modality might interfere with immersed learners’ responses to grammatical violations and that they might show more native-like responses in the auditory modality. This hypothesis receives some support; the modulation of the N400 effect in non-finite verb violations in learners was in the hypothesized direction, with a native-like effect in the auditory but not the visual modality. However, for gender agreement learners failed to show sensitivity, regardless of the modality. Thus, the difference between Loerts’ (2012) results for Polish speakers on the one hand, and Sabourin and Stowe’s (2008) results and our current results for Romance speakers on the other, cannot be attributed to the difference in modalities.

In contrast to the modality effects, violation effects and group differences therein were robust. Before accepting the group patterns, it is important to examine the role of individual differences. A biphasic pattern may reflect the summation of single effects originating in two different groups of participants (Osterhout, 1997; Nieuwland and Van Berkum, 2008; Tanner and Van Hell, 2014; Tanner et al., 2014). Even more crucial for the current experiment, the absence of an effect in the L2 group may be due to variability, with some individuals showing the pattern found in native speakers, while others show no effect or even an opposing effect (Foucart and Frenck-Mestre, 2011). Inspection of individual differences for the gender violations confirmed that the grand average ERP patterns we report are representative of the majority of the individuals in each group. In contrast to natives, who consistently showed large P600 effects (Figure 6, bottom left panel), learners consistently failed to demonstrate any form of sensitivity to gender violations (Figure 6, bottom right panel). This result was confirmed by the fact that none of the participant characteristics we tested (increased proficiency or gender knowledge, earlier age of acquisition, longer length of residence or high percentage L2 use) was associated with a larger P600. In this sense, the current experiment replicates the pattern found by Sabourin and Stowe (2008); even highly proficient Romance learners of Dutch appear to have persistent difficulties in learning to use Dutch gender.

Turning to the non-finite verb violations, examination of the native speakers confirms that the biphasic pattern N400/P600 seen in this group is present in the majority of the individual participants (see Figure 6, top left panel). This biphasic effect in response to non-finite verb violations in natives has been found before (Kutas and Hillyard, 1983; Sabourin and Stowe, 2008; Loerts, 2012). As can be seen in Figure 6 (top right panel), many L2 learners’ responses were within the native range, showing evidence of the biphasic pattern, although this is primarily evident for the auditorily presented materials. Some individuals are less native-like; for this structure the P600 effect magnitude in the L2 group was found to be modulated by the percentage of use of the L2 in daily life. Use is not the only important factor for native-like attainment of syntax processing, however; even the learners with the highest amount of daily practice in an immersed setting still show persistent problems with gender agreement.

Despite their failure to show native-like gender processing, the evidence suggests that the Romance learners are highly proficient. In addition to the off-line measures of proficiency (C-test and gender assignment) and online accuracy at ungrammaticality detection, which are within native range for a number of the participants, the evidence from the biphasic N400-P600 pattern provides a strong argument for high proficiency. Finding early ERP effects in response to grammatical violations like the N400 seen here is unusual in L2 research. Although both Loerts (2012) and Sabourin and Stowe (2008) found evidence of a biphasic pattern for their native groups, neither found the N400 in their L2 learner groups. According to Steinhauer et al. (2009), biphasic patterns are one of the latest stages of morpho-syntactic proficiency in late L2 acquisition. The fact that our learners were able to reach this stage for non-finite verb agreement, but that they cannot get past the initial stage of not showing any brain response differences for correct vs. incorrect use of gender agreement, provides strong support for the difficulty of the acquisition of this element in Dutch L2 acquisition. This highlights the complexity of acquisition of the Dutch gender system, even by learners with a gender system in their L1. Furthermore, it emphasizes the fact that language learning aptitude is not an all or none phenomenon, but may vary widely between constructions.

Our results further illustrate the large discrepancy between online and offline processing measures in L2 acquisition research. Both the behavioral results of the gender assignment task and the sentence-final grammaticality judgments during the ERP recordings for gender violations indicate moderate to good knowledge of Dutch grammatical gender in the learner group. Yet, we observed a complete lack of response to these violations in the ERP signal. This reveals a discrepancy between offline knowledge of grammatical gender concord and the use of agreement knowledge during online processing. The lack of a significant relation between the magnitude of the P600 responses to gender violations and the score on the gender assignment task rules out the possibility that only learners with better offline performance are able to show online effects. The behavioral difference between the visual and the auditory modality, with performance being slightly worse for grammaticality judgments in the auditory modality, was also not reflected in the ERP signal for gender violations. These results illustrate that second language learners can develop successful strategies to cope with gender processing difficulties. These alternative routes, however, apparently take more time and are qualitatively different from what we observe in online native processing.

The results of the current study leave us with a puzzle: why do Romance learners of Dutch show such persistent problems with gender processing? Our results confirm that gender is difficult to process for late Romance learners of Dutch, compared with the results of studies targeting other languages. We replicated Sabourin and Stowe’s (2008) findings, in the sense that our Romance learners likewise did not show native-like responses to gender violations, regardless of modality, although they showed responses to non-finite verbs that were close to the native model². The factors most commonly suggested in the literature as to why gender or other forms of grammatical processing might be problematic do not appear to explain these results. Proficiency clearly plays some role in native-likeness in general (Steinhauer et al., 2006; McLaughlin et al., 2010), but as we argue above, our learners were quite proficient, certainly comparable to those in other studies in which learners have shown P600 effects for gender (Tokowicz and MacWhinney, 2005; Frenck-Mestre et al., 2009; Gillon Dowens et al., 2010, 2011; Foucart and Frenck-Mestre, 2011, 2012; Loerts, 2012). Also, our proficiency measure does not correlate with the magnitude of the ERPs.

² One of the reviewers points out that having twice as many violation sentences in the gender condition as in the non-finite verb condition might be problematic, since less common stimulus types may elicit a P3 response (see Coulson et al., 1998; Hahne and Friederici, 1999). However, the difference waves shown overlaid in Figure 5 show that there is no difference in P600 effect magnitude between gender and non-finite verbs in the natives.

Other potential explanatory factors involve the language experience of the learner, such as age of acquisition (Weber-Fox and Neville, 1996; Kotz et al., 2008) and exposure to and use of the L2 (Gardner et al., 1997; Flege et al., 1999; Dörnyei, 2005; Tanner et al., 2014). It is true that the studies reported by Frenck-Mestre and colleagues have generally tested earlier learners (with onset of acquisition in their teens rather than twenties and later). However, other studies have demonstrated native-like gender processing even for relatively late learners (Tokowicz and MacWhinney, 2005; Gillon Dowens et al., 2010). Furthermore, in the current study we did not even find a trend toward better performance for younger learners, making it again unlikely that this is the (only) decisive factor for native-likeness. The amount of L2 use also failed to explain the failure of the Romance learners to show online sensitivity to gender, even though, as our own results show, this can be important for native-likeness for other aspects of grammatical processing, like verb agreement. Length of residence, which impacts overall exposure, also showed no correlation with sensitivity to gender.

Failure to achieve native-like processing has also been linked to dissimilarity between L1 and L2 (Tokowicz and MacWhinney, 2005; Sabourin and Stowe, 2008; Foucart and Frenck-Mestre, 2011), as well as characteristics of the target language (Sabourin and Stowe, 2008; Loerts, 2012). Following this line of argumentation, Dutch and Romance languages may simply be too different from each other, which, combined with the fact that the Dutch gender system is relatively opaque, results in a very difficult challenge for native-like attainment. The lack of transparency of the Dutch gender system might explain why our Romance learners failed to show native-like processing for this characteristic of the language, as opposed to the much more transparent non-finite verb manipulation. For gender, previous research has shown that native-like processing is possible even in constructions with competition from an L1 gender system when a relatively transparent target gender system is to be acquired in L2 (Frenck-Mestre et al., 2009; Foucart and Frenck-Mestre, 2011; Gillon Dowens et al., 2011). In contrast, Loerts’ study suggests that an opaque system is more difficult to acquire, since only her most proficient learners are able to show P600 effects, which are additionally somewhat smaller in amplitude compared to the natives. It remains an open question as to why, in contrast to Loerts (2012), even the most proficient learners in the current study did not show a P600. More research is needed to determine whether characteristics of the L1 or other (confounding) factors are at play in determining which individuals overcome the challenge of an opaque gender system.

One final point we would like to make is that, although we did not find extensive effects of stimulus modality, this factor is nevertheless of importance. As we noted, the early responses to ungrammaticality like the N400 in the biphasic response seen here are not generally found in late L2 learners, which has been taken as a sign of lack of native-likeness. It is possible that they have been missed due to the use of visual materials, since this effect was only seen in the auditory modality. Although we saw no effects on the amplitude of the P600 effect, certain populations may be affected more than others. Learners who do not share the same writing system in their L1 and L2, for instance, might have more difficulty automatizing their usage of the new alphabet (Koda, 1999; Wang et al., 2003). For these learners, the use of auditory materials might be a crucial prerequisite to obtain an accurate measure of their abilities. On the other hand, those whose learning has taken place with an emphasis on written materials may show less response when auditory materials are used. Given the large diversity of L2 speaker populations with respect to typological distance (both with respect to grammar and writing systems) and type of learning environment (immersion vs. classroom), it is important to be aware that the testing modality might influence the results, both in offline and online tests.

In conclusion, we can say that online grammatical gender processing is particularly difficult for Romance learners of Dutch, even at high levels of proficiency and with large amounts of L2 exposure and use in a natural setting, and regardless of testing modality. In contrast, responses highly similar to the native model are possible for a more regular and transparent structure (non-finite verbs), for which responses are modulated by both testing modality and L2 use. The problems with gender, however, are persistent and not affected by these factors, demonstrating the complexity of (late) L2 acquisition of the opaque Dutch gender system.

ACKNOWLEDGMENTS

We are very grateful for comments and suggestions of the reviewers. This research was supported by the Netherlands Organization for Scientific Research (NWO) under grant 016.104.602, awarded to the fifth author. We thank Hanneke Loerts, Sanne Berends and Bregtje Seton for sharing their auditory materials, and the participants for their kind cooperation. Additionally we thank our colleagues at the NeuroImaging Center Groningen for technical support, particularly Peter Albronda.

SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2014.01072/abstract

REFERENCES

Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Automat. Control 19, 716–723. doi: 10.1109/TAC.1974.1100705

Baayen, R. H. (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics Using R. Cambridge, UK: Cambridge University Press. doi: 10.1017/CBO9780511801686

Baayen, R. H., Piepenbrock, R., and Gullikers, L. (1995). The CELEX Lexical Database [CD-ROM]. Philadelphia, PA: Linguistics Data Consortium, University of Pennsylvania.

Balconi, M., and Pozzoli, U. (2005). Comprehending semantic and grammatical violations in Italian. N400 and P600 comparison with visual and auditory stimuli. J. Psycholinguist. Res. 34, 71–98. doi: 10.1007/s10936-005-3633-6

Bates, D., Maechler, M., Bolker, B., and Walker, S. (2014). lme4: Linear Mixed-Effects Models Using Eigen and S4. R package version 1.1-6. Available online at: http://CRAN.R-project.org/package=lme4

Burkhardt, P. (2007). The P600 reflects cost of new information in discourse memory. Neuroreport 18, 1851–1854. doi: 10.1097/WNR.0b013e3282f1a999

Camblin, C. C., Ledoux, K., Boudewyn, M., Gordon, P. C., and Swaab, T. Y. (2007). Processing new and repeated names: effects of coreference on repetition priming with speech and fast RSVP. Brain Res. 1146, 172–184. doi: 10.1016/j.brainres.2006.07.033

Clahsen, H., and Felser, C. (2006). Grammatical processing in language learners. Appl. Psycholinguist. 27, 3–42. doi: 10.1017/S0142716406060024

Corbett, G. (1991). Gender. Cambridge: Cambridge University Press.

Coulson, S., King, J. W., and Kutas, M. (1998). Expect the unexpected: event-related brain response to morphosyntactic violations. Lang. Cogn. Process. 13, 21–58. doi: 10.1080/016909698386582

Davidson, D. J., and Indefrey, P. (2009). An event-related potential study on changes of violation and error responses during morphosyntactic learning. J. Cogn. Neurosci. 21, 433–446. doi: 10.1162/jocn.2008.21031

Dimitrova, D. V., Stowe, L. A., Redeker, G., and Hoeks, J. C. (2012). Less is not more: neural responses to missing and superfluous accents in context. J. Cogn. Neurosci. 24, 2400–2418. doi: 10.1162/jocn_a_00302

Dörnyei, Z. (2005). The Psychology of the Language Learner: Individual Differences in Second Language Acquisition. Mahwah, NJ: Lawrence Erlbaum.

Dussias, P. E. (2010). Uses of eye-tracking data in second language sentence processing research. Annu. Rev. Appl. Linguist. 30, 149–166. doi: 10.1017/S026719051000005X

Flege, J. E., Yeni-Komshian, G. H., and Liu, S. (1999). Age constraints on second-language acquisition. J. Mem. Lang. 41, 78–104. doi: 10.1006/jmla.1999.2638

Foucart, A., and Frenck-Mestre, C. (2011). Grammatical gender processing in L2: electrophysiological evidence of the effect of L1–L2 syntactic similarity. Bilingual. Lang. Cogn. 14, 379–399. doi: 10.1017/S136672891000012X

Foucart, A., and Frenck-Mestre, C. (2012). Can late L2 learners acquire new grammatical features? Evidence from ERPs and eye-tracking. J. Mem. Lang. 66, 226–248. doi: 10.1016/j.jml.2011.07.007

Franceschina, F. (2005). Fossilized Second Language Grammars: The Acquisition of Grammatical Gender. Amsterdam: John Benjamins. doi: 10.1075/lald.38

Frenck-Mestre, C., Foucart, A., Carrasco, H., and Herschensohn, J. (2009). Processing of grammatical gender in French as a first and second language evidence from ERPs. Eurosla Yearb. 9, 76–106. doi: 10.1075/eurosla.9.06fre

Friederici, A. D., Wang, Y., Herrmann, C. S., Maess, B., and Oertel, U. (2000). Localization of early syntactic processes in frontal and temporal cortical areas: a magnetoencephalographic study. Hum. Brain Mapp. 11, 1–11. doi: 10.1002/1097-0193(200009)11:1%3C1::AID-HBM10%3E3.0.CO;2-B

Frisch, S., Kotz, S. A., von Cramon, D. Y., and Friederici, A. D. (2003). Why the P600 is not just a P300: the role of the basal ganglia. Clin. Neurophysiol. 114, 336–340. doi: 10.1016/S1388-2457(02)00366-8

Frost, R. (1998). Toward a strong phonological theory of visual word recognition: true issues and false trails. Psychol. Bull. 123, 71–99. doi: 10.1037/0033-2909.123.1.71

Bates, E., and MacWhinney, B. (1987).
“Competition, variation, and language learning,” in Mechanisms of Language Acquisition, The 20th Annual Carnegie Gardner, R. C., Tremblay, P. F., and Masgoret, A. (1997). Towards a full model of Symposium on Cognition (Hillsdale, NJ: Lawrence Erlbaum Associates), second language learning: an empirical investigation. Mod. Lang. J. 81, 344–362. doi: 10.1111/j.1540-4781.1997.tb05495.x 157–193. Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a Gillon Dowens, M., Guo, T., Guo, J., Barber, H., and Carreiras, M. (2011). Gender and number processing in Chinese learners of Spanish— practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300. evidence from event related potentials. Neuropsychologia 49, 1651–1659. doi: 10.1016/j.neuropsychologia.2011.02.034 Bernard, M. L., Chaparro, B. S., and Russell, M. (2001). Examining automatic text presentation for small screens. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 45, Gillon Dowens, M., Vergara, M., Barber, H. A., and Carreiras, M. (2010). Morphosyntactic processing in late second-language learners. J. Cogn. Neurosci. 637–639. doi: 10.1177/154193120104500613 Blom, E., Polišenská, D., and Weerman, F. (2008). Articles, adjectives and age 22, 1870–1887. doi: 10.1162/jocn.2009.21304 Gratton, G., and Coles, M. H. (1989). Generalization and evaluation of eye- of onset: the acquisition of Dutch grammatical gender. Second Lang. Res. 24, 297–331. doi: 10.1177/0267658308090183 movement correction procedures. J. Psychophysiol. 3, 14–16. Grüter, T., Lew-Williams, C., and Fernald, A. (2012). Grammatical gender in L2: a Bornkessel-Schlesewsky, I., and Schlesewsky, M. (2008). An alternative perspec- tive on “semantic P600” effects in language comprehension. Brain Res. Rev. 59, production or a real-time processing problem? Second Lang. Res. 28, 191–215. doi: 10.1177/0267658312437990 55–73. doi: 10.1016/j.brainresrev.2008.05.003 Gunter, T. C., Stowe, L. A., and Mulder, G. (1997). 
When syntax meets semantics. Bruhn de Garavito, J., and White, L. (2000). “Second language acquisition of Spanish DPs: the status of grammatical features,” in BUCLD 24: Proceedings Psychophysiology 34, 660–676. doi: 10.1111/j.1469-8986.1997.tb02142.x Hagoort, P., Brown, C., and Groothusen, J. (1993). The syntactic positive shift (SPS) from the 24th Annual Boston University Conference on Language Development, eds S. C. Howell, S. Fish, and T. Keith-Lucas (Somerville, MA: Cascadilla), as an ERP measure of syntactic processing. Lang. Cogn. Process. 8, 439–483. doi: 10.1080/01690969308407585 164–175. www.frontiersin.org September 2014 | Volume 5 | Article 1072 | 15 Meulman et al. When do learners fail? Hagoort, P., and Brown, C. M. (2000). ERP effects of listening to speech compared Osterhout, L. (1997). On the brain response to syntactic anomalies: manipulations to reading: the P600/SPS to syntactic violations in spoken sentences and rapid of word position and word class reveal individual differences. Brain Lang. 59, serial visual presentation. Neuropsychologia 38, 1531–1549. doi: 10.1016/S0028- 494–522. doi: 10.1006/brln.1997.1793 3932(00)00053-1 Osterhout, L., and Hagoort, P. (1999). A superficial resemblance does not neces- Hahne, A. (2001). What’s different in second-language processing? Evidence sarily mean you are part of the family: counterarguments to Coulson, King and from event-related brain potentials. J. Psycholinguist. Res. 30, 251–266. doi: Kutas (1998) in the P600/SPS-P300 debate. Lang. Cogn. Process. 14, 1–14. doi: 10.1023/A:1010490917575 10.1080/016909699386356 Hahne, A., and Friederici, A. (1999). Electrophysiological evidence for two steps Osterhout, L., and Holcomb, P. J. (1992). Event-related brain potentials elicited in syntactic analysis: early automatic and late controlled processes. J. Cogn. by syntactic anomaly. J. Mem. Lang. 31, 785–806. doi: 10.1016/0749- Neurosci. 11, 194–205. doi: 10.1162/089892999563328 596X(92)90039-Z Hawkins, R. 
(2001). The theoretical significance of Universal Grammar Perfetti, C. A., Zhang, S., and Berent, I. (1992). Reading in English and Chinese: in second language acquisition. Second Lang. Res. 17, 345–367. doi: evidence for a “universal” phonological principle. Adv. Psychol. 94, 227–248. 10.1191/026765801681495868 doi: 10.1016/S.0166-4115(08)62798-3 Hawkins, R., and Chan, C. Y. H. (1997). The partial availability of Universal R Core Team. (2014). R: A Language And Environment For Statistical Computing. Grammar in second language acquisition: the “failed functional features Vienna: R Foundation for Statistical Computing. Available online at: http:// hypothesis.” Second Lang. Res. 13, 187–226. doi: 10.1191/0267658976714 www.R-project.org/ 76153 Rossi, S., Gugler, M. F., Friederici, A. D., and Hahne, A. (2006). The impact of Homae, F., Hashimoto, R., Nakajima, K., Miyashita, Y., and Sakai, K. L. (2002). proficiency on syntactic second-language processing of German and Italian: From perception to sentence comprehension: the convergence of auditory and evidence from event-related potentials. J. Cogn. Neurosci. 18, 2030–2048. doi: visual information of language in the left inferior frontal cortex. Neuroimage 16, 10.1162/jocn.2006.18.12.2030 883–900. doi: 10.1006/nimg.2002.1138 Sabourin, L. (2003). Grammatical Gender and Second Language Processing: An ERP Hopp, H. (2010). Ultimate attainment in L2 inflection: performance similar- Study. Ph.D. dissertation, Groningen, Grodil. ities between non-native and native speakers. Lingua 120, 901–931. doi: Sabourin, L., and Stowe, L. A. (2008). Second language processing: when are first 10.1016/j.lingua.2009.06.004 and second languages processed similarly? Second Lang. Res. 24, 397–430. doi: Hopp, H. (2013). Grammatical gender in adult L2 acquisition: relations 10.1177/0267658308090186 between lexical and syntactic variability. Second Lang. Res. 29, 33–56. doi: Schneider, W., Eschman, A., and Zuccolotto, A. (2002a). 
E-PrimeUser’sGuide. 10.1177/0267658312461803 Pittsburgh, PA: Psychology Software Tools Inc. Keijzer, M. (2007). Last in First Out? An Investigation of the Regression Hypothesis Schneider, W., Eschman, A., and Zuccolotto, A. (2002b). E-Prime Reference Guide. in Dutch Emigrants in Anglophone Canada. Ph.D. dissertation, Vrije Universiteit Pittsburgh, PA: Psychology Software Tools Inc. Amsterdam, Netherlands. Schwartz, B. D., and Sprouse, R. A. (1994). “Word order and nominative case in Kluender, R., and Kutas, M. (1993). Bridging the gap: evidence from ERPs on nonnative language acquisition: a longitudinal study of (L1 Turkish) German the processing of unbounded dependencies. J. Cogn. Neurosci. 5, 196–214. doi: Interlanguage,” in Language Acquisition Studies in Generative Grammar: Papers 10.1162/jocn.1993.5.2.196 in Honor of Kenneth Wexler from the 1991 GLOW Workshops, eds T. Hoekstra Koda, K. (1996). L2 word recognition research: a critical review. Mod. Lang. J. 80, and B. D. Schwartz (Philadelphia, PA: John Benjamins), 317–368. 450–460. doi: 10.1111/j.1540-4781.1996.tb05465.x Schwartz, B. D., and Sprouse, R. A. (1996). L2 cognitive states and Koda, K. (1999). Development of L2 intraword orthographic sensitivity and decod- the full transfer/full access model. Second Lang. Res. 12, 40–72. doi: ing skills. Mod. Lang. J. 83, 51–64. doi: 10.1111/0026-7902.00005 10.1177/026765839601200103 Kotz, S. A., Holcomb, P. J., and Osterhout, L. (2008). ERPs reveal comparable syn- Steinhauer, K., White, E., Cornell, S., Genesee, F., and White, L. (2006). The tactic sentence processing in native and non-native readers of English. Acta neural dynamics of second language acquisition: evidence from event-related psychol. 128, 514–527. doi: 10.1016/j.actpsy.2007.10.003 potentials. J. Cogn. Neurosci. (Suppl. 99). Kutas, M., and Federmeier, K. D. (2011). Thirty years and counting: find- Steinhauer, K., White, E. J., and Drury, J. E. (2009). 
Temporal dynamics of late sec- ing meaning in the N400 component of the event-related brain potential ond language acquisition: evidence from event-related brain potentials. Second (ERP). Annu. Rev. Psychol. 62, 621–647. doi: 10.1146/annurev.psych.093008. Lang. Res. 25, 13–41. doi: 10.1177/0267658308098995 131123 Stowe, L. A. (1991). Ambiguity resolution: behavioral evidence for a delay. Kutas, M., and Hillyard, S. A. (1983). Event–related brain potentials to gram- Proceedings of the Thirteenth Annual Meeting of the Cognitive Science Association matical errors and semantic anomalies. Mem. Cognit. 11, 539–550. doi: (Hillsdale, NJ: Lawrence Erlbaum Associates), 257–262. 10.3758/BF03196991 Tanner, D., Inoue, K., and Osterhout, L. (2014). Brain-based individual differences Lawrence, M. A. (2013). ez: Easy Analysis and Visualization of Factorial Experiments. in online L2 grammatical comprehension. Bilingual. Lang. Cogn. 17, 277–293. R package version 4.2-2. Available online at: http://CRAN.R-project.org/ doi: 10.1017/S1366728913000370 package=ez Tanner, D., McLaughlin, J., Herschensohn, J., and Osterhout, L. (2013). Individual Loerts, H. (2012). Uncommon Gender: Eyes and Brains, Native and Second Language differences reveal stages of L2 grammatical acquisition: ERP evidence. Bilingual. Learners, and Grammatical Gender. Ph.D. dissertation, Rijksuniversiteit Lang. Cogn. 16, 367–382. doi: 10.1017/S1366728912000302 Groningen, Grodil. Tanner, D., and Van Hell, J. G. (2014). ERPs reveal individual differ- McLaughlin, J., Tanner, D., Pitkänen, I., Frenck-Mestre, C., Inoue, K., Valentine, G., ences in morphosyntactic processing. Neuropsychologia 56, 289–301. doi: et al. (2010). Brain potentials reveal discrete stages of L2 grammatical learning. 10.1016/j.neuropsychologia.2014.02.002 Lang. Learn. 60, 123–150. doi: 10.1111/j.1467-9922.2010.00604.x Tokowicz, N., and MacWhinney, B. (2005). Implicit and explicit mea- Molinaro, N., Barber, H. A., and Carreiras, M. (2011). 
Grammatical agreement pro- sures of sensitivity to violations in second language grammar: an event- cessing in reading: ERP findings and future directions. Cortex 47, 908–930. doi: related potential investigation. Stud. Second Lang. Acquis. 27, 173–204. doi: 10.1016/j.cortex.2011.02.019 10.1017/S0272263105050102 Müller, H. M., King, J. W., and Kutas, M. (1997). Event-related potentials elicited van Berkum, J. (1996). The Psycholinguistics of Grammatical Gender: Studies by spoken relative clauses. Cogn. Brain Res. 5, 193–203. doi: 10.1016/S0926- in Language Comprehension and Production. Ph.D. dissertation, Max Planck 6410(96)00070-5 Institute for Psycholinguistics. Nijmegen, Nijmegen University press. Münte, T. F., Heinze, H. J., and Mangun, G. R. (1993). Dissociation of brain activ- Wang, M., Koda, K., and Perfetti, C. A. (2003). Alphabetic and nonalphabetic L1 ity related to syntactic and semantic aspects of language. J. Cogn. Neurosci. 5, effects in English word identification: a comparison of Korean and Chinese 335–344. doi: 10.1162/jocn.1993.5.3.335 English L2 learners. Cognition 87, 129–149. doi: 10.1016/s0010-0277(02) Nieuwland, M. S., and Van Berkum, J. J. (2008). The interplay between seman- 00232-9 tic and referential aspects of anaphoric noun phrase resolution: evidence from Weber-Fox, C. M., and Neville, H. J. (1996). Maturational constraints on func- ERPs. Brain Lang. 106, 119–131. doi: 10.1016/j.bandl.2008.05.001 tional specializations for language processing: ERP and behavioral evidence Frontiers in Psychology | Language Sciences September 2014 | Volume 5 | Article 1072 | 16 Meulman et al. When do learners fail? in bilingual speakers. J. Cogn. Neurosci. 8, 231–256. doi: 10.1162/jocn.1996. Conflict of Interest Statement: The authors declare that the research was con- 8.3.231 ducted in the absence of any commercial or financial relationships that could be Weber-Fox, C. M., and Neville, H. J. (1999). 
“Functional neural subsystems are dif- construed as a potential conflict of interest. ferentially affected by delays in second language immersion: ERP and behavioral evidence in bilinguals,” in Second Language Acquisition and the Critical Period Received: 16 May 2014; accepted: 06 September 2014; published online: 25 September Hypothesis, ed D. Birdsong (Mahwah, NJ: Erlbaum), 23–38. 2014. White, L. (1989). Universal Grammar and Second Language Acquisition, Vol. 1. Citation: Meulman N, Stowe LA, Sprenger SA, Bresser M and Schmid MS (2014) An Amsterdam: John Benjamins Publishing. ERP study on L2 syntax processing: When do learners fail? Front. Psychol. 5:1072. doi: White, L. (2007). “Some puzzling features of L2 features,” in The Role of Features in 10.3389/fpsyg.2014.01072 Second Language Acquisition, eds J. Liceras, H. Zobl, and H. Goodluck (Mahwah, This article was submitted to Language Sciences, a section of the journal Frontiers in NJ: Erlbaum), 305–330. Psychology. White, L., Valenzuela, E., Kozlowska-Macgregor, M., and Leung, I. (2004). Gender Copyright © 2014 Meulman, Stowe, Sprenger, Bresser and Schmid. This is an and number agreement in nonnative Spanish. Appl. Psycholinguist. 25, 105–133. open-access article distributed under the terms of the Creative Commons Attribution doi: 10.1017/S0142716404001067 License (CC BY). The use, distribution or reproduction in other forums is permit- White, L., Valenzuela, E., Kozlowska-Macgregor, M., Leung, I., and Ayed, H. B. ted, provided the original author(s) or licensor are credited and that the original (2001). “The status of abstract features in Interlanguage: gender and number publication in this journal is cited, in accordance with accepted academic practice. in L2 Spanish,” in BUCLD25Proceedings (Somerville, MA: Cascadilla Press), No use, distribution or reproduction is permitted which does not comply with these 792–802. terms. www.frontiersin.org September 2014 | Volume 5 | Article 1072 | 17

Journal

Frontiers in Psychology, PubMed Central

Published: Sep 25, 2014
