James Hunter, Assistant Professor – English Language Center, Gonzaga University, Spokane USA

Excel has an Analysis ToolPak add-in which can do a lot of statistical tasks. Help on installing it is here. Also, try the R Project. This is a free “software environment for statistical computing and graphics” and it will run on Windows, Mac, and Linux. I haven’t had much of a chance to play with it, though it is certainly not user-friendly. However, you can also get Statistical Lab, a free GUI front-end for R, though it is not available for Mac or Linux. There’s also PSPP, a free alternative to SPSS (the “big” stats package that businesses and colleges use).

With all of these, you can easily do correlation matrices, t-tests, chi-square tests, item analysis, ANOVA, etc. These will enable you to compare results on assessments, run pre- and post-tests, get inter-rater reliability information, find links between variables, and so on. See also this for information on which statistical procedure to use when.
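As a small illustration of the kind of comparison these tools automate, here is a paired t-test on pre- and post-test scores, using only Python's standard library. All the scores are invented for the example; a package like R, SPSS, or PSPP would also report the p-value for you.

```python
# Minimal sketch of a paired-samples t-test on invented pre-/post-test
# scores (the scores are hypothetical, not real student data).
from statistics import mean, stdev
from math import sqrt

pre  = [52, 61, 48, 70, 55, 63, 58, 66, 50, 60]
post = [58, 67, 55, 74, 60, 70, 61, 72, 57, 66]

diffs = [b - a for a, b in zip(pre, post)]            # gain scores
t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))   # paired t statistic
print(f"mean gain = {mean(diffs):.1f}, t = {t:.2f}")
```

A large t here (checked against a t table with n − 1 degrees of freedom) would indicate that the pre-to-post gain is unlikely to be chance.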

I use the mean and SD on most tests and quizzes to (a) compare classes to previous semesters and (b) look at the distribution and spread of scores on a test or item. This helps me make informed decisions about assessment instruments, especially those that might be adopted as standardized tests for the program. I’ve done a lot of work with our placement instruments, for example, to determine their reliability and check our cut scores.
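Those descriptive statistics are easy to compute anywhere; the sketch below shows the mean and SD of a set of invented quiz scores, plus a simple item-facility figure (proportion answering correctly) of the kind used in item analysis.

```python
# Descriptive statistics on invented quiz scores (out of 20).
from statistics import mean, stdev

scores = [14, 17, 12, 19, 15, 16, 13, 18, 11, 20]
print(f"mean = {mean(scores):.1f}, SD = {stdev(scores):.2f}")

# Simple item analysis: 1 = correct, 0 = incorrect on one item.
item = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
difficulty = sum(item) / len(item)   # item facility: proportion correct
print(f"item facility = {difficulty:.2f}")
```

Items with facility values near 1.0 or 0.0 discriminate poorly, which is one of the things to check before adopting an instrument as a program-wide standardized test.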

Recently, I’ve been doing research on corrective feedback in oral production, so I have needed measures of accuracy and fluency (and complexity!). Statistical analysis has been essential for finding correlations between, say, accuracy and reaction time on a grammaticality test, or accuracy and production time in a correction test. For instance, in class a student says to another: *”Yeah, actually I’m agree with you”. This goes down on a worksheet for her (and occasionally for other classmates – see this for a description of the methodology), and she is later given a timed test in which she sees the incorrect sentence and has to record a corrected version. Her speed in doing this task (plus her accuracy) gives a measure of whether this structure/lexis is part of her competence (or, to use Krashen’s model, whether it has been “acquired” or “learned”: presumably, if this theory holds water, “learned” forms will take longer to process and produce than “acquired” ones). In addition to this production test, I’ve been running a reaction-time test in which the same learner hears her own recording and has to decide, as quickly as possible, whether what she said is correct or not. You can try this for yourself here (you will not be able to hear student recordings, only a few practice sets, recorded by me using student errors from our database; use anything as Username and “elc” as password).
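The correlation step can be sketched as follows: Pearson's r between correction-test accuracy and mean reaction time, computed from scratch on invented figures for six hypothetical learners. A strong negative r would suggest that learners who correct more accurately also respond faster, which is the pattern an "acquired"-forms account would predict.

```python
# Pearson's r computed from first principles; all data are invented.
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

accuracy = [0.90, 0.75, 0.60, 0.85, 0.55, 0.70]   # proportion corrected
rt_ms    = [850, 1200, 1400, 950, 1500, 1050]     # mean reaction time (ms)

r = pearson_r(accuracy, rt_ms)
print(f"r = {r:.2f}")
```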

These measures yield thousands of results, and that’s why statistical analysis has been essential. Excel can do a lot of the work, especially in graphical representation, but SPSS has done most of the heavy lifting. For instance, it has revealed that there is no significant difference in reaction time (or accuracy) between a student listening to herself correct an error she originally made and listening to herself correct errors made by classmates. In other words, students are just as good or bad at noticing and judging errors whether they made them or a classmate did. The same is true in the correction task described above. This indicates that WHOSE error a student is correcting/judging has much less effect on her speed or accuracy than some other factor, e.g. the nature of the error itself. Probably a large “Duh!” factor there, but these things need to be ruled out before moving on…
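The self-versus-classmate comparison amounts to an independent-samples t-test. The sketch below computes Welch's t statistic on invented reaction times; a |t| near zero, as here, is the kind of result consistent with the "no significant difference" finding described above (the real analysis was done in SPSS on far more data).

```python
# Welch's independent-samples t statistic on invented reaction times (ms):
# errors the student made herself vs. errors made by classmates.
from statistics import mean, variance
from math import sqrt

own       = [1020, 980, 1150, 1090, 1005, 1110]
classmate = [1040, 995, 1130, 1075, 1010, 1095]

# Standard error of the difference, not assuming equal variances.
se = sqrt(variance(own) / len(own) + variance(classmate) / len(classmate))
t = (mean(own) - mean(classmate)) / se
print(f"t = {t:.2f}")
```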