James Hunter, Assistant Professor – English Language Center, Gonzaga University, Spokane USA

Excel has an Analysis ToolPak which can do a lot of statistical tasks. Help on installing it is here. Also, try the R Project. This is a free “software environment for statistical computing and graphics” and it will run on Windows, Mac, and Linux. I haven’t had much of a chance to play with it, but it is certainly not user-friendly. However, you can also get Statistical Lab, which is a GUI for R, also free but not available for Mac or Linux. There’s also a free alternative to SPSS (the “big” stats package that businesses and colleges use), called PSPP.

With all of these, you can easily do correlation matrices, t-tests, chi-square tests, item analysis, ANOVA, etc. These will enable you to compare results on assessments, do pre- and post-tests, get inter-rater reliability information, find links between variables, etc. See also this for information on which statistical procedures to use when.

I use mean and SD on most tests and quizzes to a) compare classes to previous semesters and b) look at the distribution and spread of scores on a test/item. This helps to make informed decisions about assessment instruments, especially those that might be adopted as standardized tests for the program. I’ve done a lot of work with our placement instruments, for example, to determine reliability and check our cut scores.
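As a rough illustration of the mean-and-SD comparison described above, here is a minimal Python sketch. The scores and the names `summarize`, `fall_scores` and `spring_scores` are invented for illustration; they are not real class data, and any of the packages mentioned earlier would do the same job.

```python
from statistics import mean, stdev

def summarize(scores):
    """Return (mean, sample standard deviation) for a list of test scores."""
    return mean(scores), stdev(scores)

# Hypothetical scores for the same quiz in two semesters (made-up numbers).
fall_scores = [62, 70, 75, 81, 88]
spring_scores = [55, 68, 74, 90, 93]

for label, scores in [("Fall", fall_scores), ("Spring", spring_scores)]:
    m, sd = summarize(scores)
    print(f"{label}: mean = {m:.1f}, SD = {sd:.1f}")
```

Two classes can have similar means but quite different spreads, which is why it helps to look at both numbers together.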

Recently, I’ve been doing research on corrective feedback in oral production, so have needed measures of accuracy and fluency (and complexity!). Statistical analysis has been essential to find correlations between, say, accuracy and reaction time on a grammaticality test and accuracy and production time in a correction test. For instance, in class a student says to another: *“Yeah, actually I’m agree with you”. This goes down on a worksheet for her (and occasionally other classmates – see this for a description of this methodology), and she is later given a timed test in which she sees the incorrect sentence and has to record a corrected version. Her speed in doing this task (plus her accuracy) gives a measure of whether this structure/lexis is part of her competence (or, to use Krashen’s model, whether it has been “acquired” or “learned”: presumably, if this theory holds water, “learned” forms will take longer to process and produce than “acquired” ones). In addition to this production test, I’ve been doing a reaction-time test in which the same learner hears her own recording and has to decide, as quickly as possible, whether what she said is correct or not. You can try this for yourself here (you will not be able to hear student recordings, only a few practice sets, recorded by me using student errors from our database; use anything as Username and “elc” as password).

These measures yield thousands of results, and that’s why statistical analysis has been essential. Excel can do a lot of the work, especially in graphical representation, but SPSS has done most of the heavy lifting. For instance, it has revealed that there is no significant difference between the reaction time (or accuracy) when a student is listening to herself correcting an error she originally made and when she is listening to herself correcting errors made by classmates. In other words, students are just as good or bad at noticing and judging errors whether they made them or a classmate did. The same is true in the correction task described above. This indicates that WHOSE error a student is correcting/judging has much less effect on her speed or accuracy than some other factor, e.g. the nature of the error itself. Probably a large “Duh!” factor there, but these things need to be ruled out before moving on…
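For readers without SPSS, the sketch below shows one way such a comparison of two sets of reaction times might be set up in Python. The post does not say which test was actually run, so Welch's t statistic is swapped in here purely as one common choice; the function name `welch_t` and all the numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical reaction times in seconds (made-up numbers, not study data).
own_errors = [1.8, 2.1, 2.4, 1.9, 2.2]
classmates_errors = [2.0, 2.3, 1.7, 2.1, 2.2]

t, df = welch_t(own_errors, classmates_errors)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A t value close to zero is what "no significant difference" looks like in practice; turning t and df into a p-value needs a t-distribution table or a statistics package.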

>By Terence Egan

Being of the “fluency first” school and having students with quite a low level of English (and motivation), I let many errors slip by in my first term at this school. I didn’t ignore them completely, but allowed conversations to flow as best the communicator could manage.

At the beginning of second term, I feigned great horror at many of the common errors that students make in conversation. I tried to sell them on the simple notion that, if we practiced one common error as a component of each lesson, by the end of the term their English would have improved significantly and, hopefully, each student would have eradicated several of these problems from their extensive repertoires.

There was another rider to that first speech of the term. Having taught them the correct form or structure, I would not allow that mistake to be made in my class “ever again”. This was my Churchillian denouement.

I began with “he” and “she”, moved on to things like “I very much like (something)”, “much” and “many”, etc. In written exams they show that they know the rule, so it’s a matter of discipline, concentration and practice.

The interesting result was that these errors, once they were enshrined in “classroom law” (or “lore” maybe) became rare – from the moment they were introduced in a lesson! By the end of the term, the students were correcting each other (without animus, of course).

Chinese students seem to like boundaries and rules. Other rules introduced in Term 2 such as “no sleeping”, “no latecomers”, “no Chinese” were observed with the same diligence and often policed by each other.

>By Jennifer Wallace – Anhui Gongye Daxue, Ma’anshan, China

Lots of us are trying to develop tests appropriate for the situations we’re teaching in. One document I’d recommend, because I’ve found it enormously helpful, is the Council of Europe Framework, which is on the Internet, as a downloadable pdf file (for which you need to have Adobe Acrobat Reader on your machine). I like the document for several reasons.

The work behind it is the work of a large number of experts across Europe, who’ve developed one framework to cover the teaching (and testing) of any of the languages taught and used in Europe – which of course includes a variety of non-European languages. In other words, the whole thing is language independent. I understand it to be very much a reflection of the most up to date understanding we have of measuring language performance. The particular document in question is the latest version, the result of many revisions.

The document addresses the fundamental questions in all this, and looks at every dimension conceivable – so I can use it as a basis for testing speaking, listening, reading, anything. It looks at things on general levels and on detailed specific levels – so you can home in on the level that is relevant for you at the moment.

Because this framework is as comprehensive as it is, it lets me think up a variety of activities for the form of my tests, activities that reflect the students’ experiences and what they’ve done in a course. But at the same time it’s kept me very much on track, enabling me to see clearly what level our target is.

Because it’s not language-specific, you can test yourself (there’s one section on self-testing) for your Chinese to see how this sort of approach works.

Someone also commented about examiners’ ability not to be swayed – well, I think what allows me to be more objective is using a number of scales and criteria when I test. For example, this semester my college end-of-first-year students will get some marks for pronunciation (because we’ve done quite a bit of pronunciation work in their Oral English classes), some marks for fluency, some marks for grammar, some marks for vocabulary/lexis and some marks for coherence.

I’m also thinking about including some marks for how they deal with problems – repair work, asking for help, paraphrasing, miming, using fillers to gain thinking time and to fill a silence, and suchlike – what’s called strategic competence. My criteria for vocab/lexis and grammar will not be whether they demonstrate use of anything in particular, but how effective they are at communicating successfully: do their errors interfere with communication, or hinder it, or render it impossible? This is because I teach college English majors – I think testing for specific aspects of these dimensions is the responsibility of other teachers in other classes. But at the same time, my students do realise that I consider grammar and lexis to be seriously important.

As regards a quick test, my experience, and the experience of others testing large numbers quickly for summer schools (in UK language schools), is that an informal chat of around 5 minutes, graded only on a 5-point scale (with very easy to understand scoring 5), is a remarkably effective tool in the hands of a native speaker. Even on the most mundane of topics (your home town, your family), it sorts the lower from the higher from the in-betweens. I did this at the beginning of this year with my 225 new students, and on subsequent reflection, having taught them now for 2 semesters, remarkably few of my initial assessments were wrong, and none were way off.

What’s interesting is looking back at their subsequent development! The value for me is how much respect I have for the students who got a low rating at the beginning who would only now get a middle rating – but wow, what progress! In each band, I can see students who have really made big efforts and made progress, and I can also see students who’ve made almost no progress. Of those, a small number are not interested in the effort it entails (basketball etc is more important), but I also have one or two who I realise are making efforts but little progress. I think that initial testing and placement has really helped me, and I plan to do it for future Oral English classes. One thing I did was use the test results to make groups according to level, and that’s been very successful as well.

>By George –

To accurately test my students, I give them oral exams which are recorded on tape. These exams have two parts. The first part is Q&A covering things we have covered in class. They almost always have a memorized response for the basic questions. I tend to ignore these. I focus on their responses to the follow-up questions. For example, I’ve told them that we might discuss their grandparents, so I might ask

“Are your grandparents alive?” “How many children did they have?” “How many boys and how many girls?” “Do you know your aunts and uncles?” “O.K., let’s talk about your youngest aunt.” Here is where they begin to break down, because they didn’t think to prepare for a discussion about their youngest aunt. I’ve also begun by asking about a favorite middle-school teacher and then focused on the teacher they liked the least. Once I get to the real subject I’ll begin with the person’s name, age, etc. and gradually lead to more complex questions. Then I start looking for syntactic, grammar and vocabulary failure. In many cases the exam ends in 2 or 3 minutes, and some have gone as long as 30 or 40 minutes. In all cases I use subjects they are familiar with: family, school, friends and hometowns. If I knew more about sports I would dwell on that. I have been known to ask a student to explain what a mid-fielder, a striker or a goalie does if they play those positions in football, or the role of guards, the center or forwards in basketball. I’ve even asked guitar-playing students to explain how to play a particular song. In short, they give me a guitar lesson.

To test for middle school, determine what is grade appropriate and start from there.

Again, start simple and progress to the complex. At what level do they abandon an answer or the topic entirely? The second part is a short oral reading which incorporates most of the English phonemes. I sometimes give them samples to practice with, but they get a new reading for the exam. They must read cold.

Also, I’ve just begun developing a set of reading passages that will begin at about fifth or sixth grade level for native speakers using Flesch-Kincaid RGL measures and which become progressively more advanced. This way I can determine the level at which they begin to break down, identified by their rate of word abandonment. In the first year I will be mainly concerned with phonetic identification and reproduction. As we progress, stress and intonation will become more of a factor.
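For anyone wanting to grade their own passages, here is a minimal Python sketch of the standard Flesch-Kincaid grade-level formula. The syllable counter is a crude vowel-group heuristic of my own, not part of the formula, so results will only approximate what a word processor or dedicated tool reports; the function names and the sample sentence are invented for illustration.

```python
import re

def count_syllables(word):
    """Very rough syllable estimate: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fk_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level computed from raw counts."""
    return (0.39 * total_words / total_sentences
            + 11.8 * total_syllables / total_words
            - 15.59)

def passage_grade(text):
    """Estimate the grade level of a passage using the rough counters above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade(len(words), len(sentences), syllables)

# Hypothetical sample passage.
sample = "The cat sat on the mat. It was a sunny day."
print(round(passage_grade(sample), 2))
```

Longer sentences and longer words both push the grade up, so a graded series of passages can be checked by running each one through `passage_grade` and confirming the numbers rise.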

I’ve not seen the CET tests, so I can’t comment on those. Oral exams can be quantified, but I don’t like using them as the basis for a grade. I tell the school that grades should be considered as a report of a student’s speaking level and how much they have improved. In my classes, the only ones who actually fail are those who only show up for exams and the rare film. Those who come to class but aren’t there count as absent. Our school weeds them out pretty quickly. Last term eight of my students flunked out, including two who were pretty good English speakers. Six were expelled for cheating on Chinese teachers’ exams.

>By Scott Miles

Some grammar teaching advocates referred to the Norris & Ortega survey of the effectiveness of explicit grammar instruction, quoting the abstract:

“[T]he data indicated that focused L2 instruction results in large target-oriented gains, that explicit types of instruction are more effective than implicit types, and that Focus on Form and Focus on Forms interventions result in equivalent and large effects.”

Krashen has written about this in his Explorations in Language Acquisition and Use book. Some of the main problems:

1. The bulk of the reviewed studies only test declarative or ‘learned’ knowledge (multiple choice questions, find the mistakes, etc.) rather than any measure of procedural use (able to use the grammar in unrehearsed speaking or writing).

We all know that students can be taught for a grammar test. I teach at one of the top universities in Korea and thus my freshman students are among the top 2% in the whole country. They have all aced (or nearly aced) the English portion of the entrance exam which has a grammar component. Yet they cannot use the grammar very well in their speaking or writing. Language teaching isn’t just about test preparation. If our teaching does not affect students’ actual performance, then we haven’t done them much good.

2. The bulk of the studies included in the survey do not have delayed post tests.

Students may remember the instructed grammar for a test, but forget it weeks or months later. Studies with delayed post tests generally show a drop in knowledge and usage, and it is not uncommon to see all gains disappear after a few months. If the knowledge doesn’t stick, then can we say the instruction was that useful?

3. Few comparison groups had anywhere near sufficient comprehensible input.

Some studies compared explicit instruction groups to those that simply had nothing (neither grammar instruction nor sufficient comprehensible input). Others had comparison groups with just a few hours of comprehensible input.

Studies which do not address these issues are simply not that useful in regards to the debate on explicit vs. implicit grammar approaches.

There is just a handful of studies covered in the Norris and Ortega survey which do not have the problems listed above. Krashen reviews those studies in detail in his book and he makes a fairly strong argument that Norris and Ortega’s conclusions are overstated.

Having followed this current TESL-L online debate over the past few months, I wonder how many people have actually looked at the studies which compare programs with explicit grammar teaching and those which just provide comprehensible input. Grammar teaching (or non-teaching) is a big issue in our field and I think it is worth taking the time to look into it directly rather than just rely on the conclusions of other scholars.

I’d like to post on a few studies (starting with this post) which compare explicit instruction with a comprehension-based learning group. If nothing else, I just want to show that this whole issue is not as cut and dried as some people would like to believe.

The Harley (1989) study which Norris and Ortega include in their review is one of the very few studies which does not have the problems noted above.

Harley compared two groups that were part of a French immersion program in Canada. The experimental group had 12 hours of work with the passé composé and imparfait over 8 weeks. The comparison group simply continued their immersion program with no explicit focus on these grammar items.

Here are the results:

Interview test    Pre-test    Post-test    Delayed post-test (3 months)
Experimental      42%         57%          63%
Comparison        44.5%       48%          60%

Considering that 12 hours were spent on 2 grammar forms, and that the questions in the interview specifically cued those grammar forms, it is no surprise that the students would recall their grammar instruction and use it in the interview. Nonetheless, the scores are still not that impressive and with the delayed test the immersion group has closed the gap (there were no statistically significant differences on scores at the delayed test).
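The arithmetic behind "closed the gap" can be laid out in a few lines of Python. The scores are the interview-test percentages from the table above; the names `scores`, `gain` and `gap` are my own and are only for illustration.

```python
# Interview-test scores (percent) as reported in the table above.
scores = {
    "Experimental": {"pre": 42.0, "post": 57.0, "delayed": 63.0},
    "Comparison": {"pre": 44.5, "post": 48.0, "delayed": 60.0},
}

def gain(group, stage):
    """Percentage-point change from the pre-test to the given stage."""
    return scores[group][stage] - scores[group]["pre"]

def gap(stage):
    """Experimental minus comparison score at the given stage."""
    return scores["Experimental"][stage] - scores["Comparison"][stage]

for stage in ("post", "delayed"):
    print(stage,
          "experimental gain:", gain("Experimental", stage),
          "comparison gain:", gain("Comparison", stage),
          "gap:", gap(stage))
```

On these figures the experimental group starts 2.5 points behind, leads by 9 points at the post-test, and leads by only 3 points at the delayed test.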

Harley (and presumably Norris and Ortega) look at these results as a victory for explicit instruction. I look at this and think that this is not a very good return for 12 hours of valuable class time. Normal classrooms cannot devote 12 hours for just two grammar points and again, the differences between the groups are no longer statistically significant after 3 months. What was really gained? And note that the immersion only group is progressing along fairly well despite not having any explicit instruction.

There were two other tests in Harley’s study as well:

Cloze test: pre-test, post-test, delayed post-test

Again, statistically significant gains that are shown on the immediate post test were lost on the delayed post test, as the comparison group closes the gap simply by continuing their immersion program.

Composition: pre-test, post-test, delayed post-test

The students’ writing was rated on a 5-point scale for grammatical accuracy. Neither the post-test nor the delayed post-test scores showed statistically significant differences between the two groups. Again, the 12 hours of grammar instruction did not deliver much to get excited about.

Furthermore, in the speaking and cloze tests these small gains seem to be disappearing, so where is the support for the idea that the instructed students are at any advantage even in the long run (the often proclaimed idea that explicit grammar instruction helps students attain the form more quickly)?

There is another issue that is often overlooked in these studies. Hours devoted to grammar instruction and practice do little to benefit other areas of language acquisition. Sure, the students in Harley’s study might have picked up a little vocabulary or grammar incidentally while they were focusing on the passé composé and imparfait, but most likely not a whole lot. The question is, what did the comparison group get for those 12 hours of extra input in which they were exposed to much more language? The research results above show that they were slowly but surely developing the target grammar forms despite no explicit instruction, and thus assuredly they were also developing many other grammar forms as well. For vocabulary learning, they most likely received a lot more vocabulary exposure during those 12 hours than the grammar-focused group, meaning that their vocabulary was probably developing more effectively as well. And of course, their listening and reading skills also most likely benefited more from those 12 hours of input in comparison to the grammar group.

So I think one could make a strong case that in the sum total of language acquisition among these two groups, the input only group actually came out well ahead.

Of course, this is just one study and there are others that should be discussed.

Harley, B. 1989. “Functional Grammar in French Immersion: A Classroom Experiment.” Applied Linguistics 10: 331-59.

Norris, J. & Ortega, L. 2000. “Effectiveness of L2 Instruction: A Research Synthesis and Quantitative Meta-analysis.” Language Learning 50(3): 417-528.

>By Betty Azar

In reference to recent discussions: Keith Folse, Karen Stanley, and Michael Swan understand what it means to “teach grammar” — a concept that too often seems to get twisted to mean something other than what we who teach grammar mean when we talk about it.

When students ask “Why?” they are really asking “How does this work?” — and they deserve an answer if they feel that this grammar information will help them. Teachers can either lead students to discover this information or provide this information through explanation, or both (as is usually the case in real classrooms).

I’ve often wondered what teachers who refuse any kind of grammar component in their classes say to students when students ask questions about grammar.

My students were always full of questions, really good questions — I can’t imagine saying to them: “Oh, you don’t need to know that” or “There’s really no answer to that” or “That’s just the way it is, so don’t worry about it.”

What a disservice to students. And how disrespectful of their learning strategies. Like Michael Swan, I’d go find a different mechanic/doctor/piano teacher/what-have-you. Like Keith Folse, I’d fire that teacher. Like Karen Stanley, I’d answer the question by showing how grammar patterns convey meaning.

There is nothing more natural than for adult students to ask questions about how English works. Somehow the naturalist movement in language teaching made what is completely natural — students asking questions about grammar and finding it helpful to figure out how patterns work — seem misguided or irrelevant or somehow “not natural.” Fortunately for students, the naturalist movement is now a passing bandwagon. Today grammar teaching and communicative teaching are becoming more and more integrated in a variety of innovative and effective ways.

Betty Azar is a teacher and the author of several English grammar workbooks that are a staple in the ESL teaching industry.

>By Michael Swan

As the leader of a small team working on methods of teaching grammar at the Notker Balbulus Language Institute in Edinburgh, I have been following various contributions to the recent debate with considerable interest. In most respects, they characterise our practice with remarkable accuracy. We do indeed require our students to learn grammar rules by heart; and we not only make them recite the rules in chorus, but are training some of the students to sing them in four-part harmony. Many of the rules we teach were, as they point out, devised by mediaeval monks; we find that these have a rich deep patina which one simply cannot find in today’s rules. In this connection, we have been fortunate in discovering, in Oxford’s Bodleian Library, an unpublished manuscript containing a veritable storehouse of arcane rules relating to Middle English word order which we are currently incorporating into our teaching programmes. Labelling we regard as essential, and any of our students can identify an indefinite past progressive subjunctive determiner at 200 paces in a dim light. We steadfastly refuse to allow our learners access to comprehensible input; an account of some interesting early work using incomprehensible input can be found in the paper ‘The Use of Sensory Deprivation in Foreign Language Teaching’ (Swan and Walter 1983) in English Language Teaching Journal 36/3. We take very seriously the translation component of ‘grammar-translation’ (sometimes neglected in today’s permissive times), and our students spend a good deal of their time translating English texts not only into their mother tongues, but also into Latin, Sanskrit, Classical Greek and Old Church Slavonic. The one area where they are somewhat ahead of us is in the matter of etching conjugations into our students’ brains, referred to in their latest posting. This is an exciting and promising direction to explore, and we have indeed tried several approaches, using Spanish and Serbian (since English has no conjugations). However, our results have been disappointing and in some cases unfortunate, and we have come to the conclusion that, sadly, this is a technique which will have to wait for advances in neurosurgery for its successful implementation.

Michael Swan is a writer specializing in English language teaching and reference materials. His interests include pedagogic grammar, mother-tongue influence in second language acquisition, and the relationship between applied linguistic theory and classroom language-teaching practice, and he has published a number of articles on these topics. And he has a great sense of humor.

>“If anyone is likely to have accurate insights/judgment into the impact of particular techniques on a language learner, let us hope it is language teachers about their own past learning of other languages.”

Shouldn’t teachers have some special insights into the learning process derived from their own experience studying a language?


In fact, this experience can be very harmful.

Some students are academically inclined. They read something, they remember it. The teacher says something, they remember it. They study the books in the library and they do everything correctly in class. Sometimes they even sit in the front of the class.

These people often become teachers.

They are remarkable people. We cannot criticise them, only admire them. As students they can smilingly sit through the most boring lectures and actually pull some jewels of important knowledge out of the verbiage. They can study the most complicated texts, decipher them as well as any CIA analyst, file away the data into different parts of their computer-like brains and retrieve it later when the teacher demands and to his pleasure. (Indeed, such students make teachers feel like gods.)

The problem is the other 80% of the class, what to do with them?

Some of them have blank looks on their faces. Some of them just don’t get it. Some of them are bored to death. In extreme cases, some of them have slipped into a teacher induced coma on top of their desks.

When good students become teachers and then draw on their learning experience when dealing with their students they can make a big mistake. They can be tempted to believe that their students are like them and can learn like they did.

>Michael Hughes has made an interesting point. “Having been in the teaching of English game for nearly three decades and having used and seen a number of methodologies, I can’t really say that any of the methods I used actually failed to teach English to my students. One could say certain methodologies are more boring (repetitive), enjoyable or useful in certain circumstances, but by and large they all achieved their broad aim.”

Let’s all keep in mind grammar teaching’s constant descent from being the all-in-all of English teaching to reaching the point that we are now debating if it is even necessary. Let’s remember the old English teaching books in which grammar was central. As Jack Richards puts it:

“In the 1970s we were just nearing the end of a period during which grammar had a controlling influence on language teaching.”[1]

As Michael Hughes points out, the books worked, students learned and “by and large they all achieved their broad aim”…or did they? Certainly many students learned that way. Some students simply love grammar.

But how many failed? How many determined they were too stupid to learn a language because they couldn’t remember all those rules and how to put them together to create coherent language?

In 1970, I was one of the stupid ones, too stupid to learn French in school. At least, that is what I decided; that is the way it looked to me. Or was I too stupid? If a more communicative approach had been taken and I had been presented with fascinating reading material[2] and audio and video material that was only just within my language range (i.e., Krashen’s i+1), would I have been able to learn French? As it was, I had to wait about ten years, until I was living and working in France, before I picked up the language on the street without a book. By that time I had already picked up Spanish in Puerto Rico and Spain in the same way.

Were my French and Spanish good? No, but I could communicate. As Krashen suggests, this would be a good time for some remedial training in grammar. Thus, grammar teaching plays, at most, a supporting role rather than a starring role.[3]


>By Pete Marchetto

I think the central problem with the teaching of English in China is that it’s an examination subject and the examinations here seem to have precious little connection with natural use of the language focusing instead on relatively or purely academic aspects. This is what I meant when I said that a solid grounding in grammar, language history and so on are all well and good but I feel I have been of most benefit here when I have released the pent-up ability to actually USE the language.

At my last place of work a fellow teacher – Chinese – told me of a friend of hers who got 19 out of 20 in an examination and was deemed to have failed for his one wrong answer though he was an excellent student. The question asked which of the following forms was correct – ‘The bird is IN the tree’: ‘The bird is ON the tree’. The student declared the latter correct and afterwards argued strongly that a bird perched on the exterior of the tree could be said to be on, rather than in, the tree – but this, unfortunately, didn’t concur with the Chinese Manual of Prescriptive and Occasionally Inaccurate English Grammar. I don’t know about everyone else in here but I’m on the side of the student in this one – I have no qualms about saying ‘The bird is ON the tree’.

One of the biggest blocks I suspect all of us have to overcome is the belief students have that they can’t speak English. Indeed, two of the teachers here have told me they can’t use English to express themselves. I pointed out to them, as I point out to the students when they make the same complaint, that they seemed to be doing a perfectly good job of expressing themselves to me. This is what I mean by a ‘pent-up’ ability; the schooling they’ve received in English is far from useless but the ability to use the language it creates exists merely as a potential until someone comes along and encourages them to use it. Not having used it they believe themselves incapable of doing so.

In releasing that potential I have to give the students the revelation that it is fine for them to make mistakes. Inevitably mistakes are made, and many of them given that students have so rarely been called upon simply to speak. Not being chastised for mistakes, however, seems almost alarming for some of the students. If they make mistakes, they ask, and aren’t corrected for them, won’t those mistakes become entrenched? I point out to them that the continual mixing of he/she, for example, if corrected on each hearing, will fragment any conversation beyond its value as communication and a promotion of fluency. It’s not as if the students don’t know the gender rules; it’s merely that lack of practice has those mistakes so oft repeated. Such problems will work themselves out, wrinkles in language that will be ironed out the more they use it and the more natural it becomes for them to use it. Correcting them each time negates the value of the practice and, ironically, of itself is liable to entrench the errors – along with many other problems – in their conversational English.

Students also worry that in having conversations with one another mistakes will become entrenched. On that issue I point out to them that parents don’t stop children acquiring their native tongue from speaking with other children lest they reinforce each other’s errors. With further exposure to the correct use of the language at other times, again the errors are ironed out. Do they think that a four-year-old Chinese child permitted only to speak to adults who use the language properly, and never allowed to speak to other four-year-olds even though adults are rarely available to them in comparison to other four-year-olds, would grow up with a better or worse grasp of Chinese? Where I will make corrections – as far as possible at the end of conversations, not within them – is in other areas such as the inappropriate use of vocabulary and common errors where something is clearly misunderstood: the excessive use of ‘very’ and the cultural error in the frequent use of ‘delicious’ are two examples; the pronunciation ‘clothIES’ or ‘clothESIES’ for ‘clothes’; a word poorly understood from a dictionary, as recently where students in debate were gaily throwing around the word ‘moribund’ to describe a group of healthy dogs that were about to be put into a situation where they would almost inevitably die, which fitted the dictionary definition of the word but missed out on some of its subtleties.

When someone tells me they want to improve their English I ask them bluntly whether they want to improve their English for use or for exams? If the latter I will gently suggest they find themselves a Chinese teacher. English exams in China are so abstruse that I suspect I, as a mere native speaker, erstwhile professional writer and ex-member of MENSA, would not only fail in teaching for those examinations but also be very likely – if faced with them – to fail the examinations themselves.

I realise that none of this holds anything new for anyone who has been teaching here for over six months but there are new teachers who might be saved some of the confusion all of us felt on arriving in China to teach for the first time. For those of you yet to arrive you are in for a treat; where else in the world can you get students who are unable to speak English and bring them up to the level of fairly fluent conversationalists in under a year?