
Category Archives: rubric

Below is a link to six speaking samples of two college students. What score would you give them?

6 samples rar

Please listen to each sample carefully and score each individually. They may have been made at different times.

For business topics, the student spoke from a prompt card giving a topic and had 3 minutes to prepare; there was no access to any materials, only time to think and plan. For personal topics, the answers were spontaneous responses to the questions.

Give one score per sample. If you want to use the IELTS scale, you can find the band descriptors here: IELTS Speaking Band Descriptors. Please say what scale you are using.

Generally, it is expected that students can speak about familiar topics, like family or friends, better than unfamiliar topics like business. So the difficulty of the task has to be considered in scoring.

Now you can try it: Can you rate speaking?

Please do not discuss your scores on the List until all of the scores have been published after the one month waiting period. If you have any questions or problems, please contact Dave Kees at: davekees[at]gmail.com.

Write your scores in the “Leave a comment” section on the left side of this page or click here: Comments. All score submissions will be withheld for one month and then published. This way submissions will not be influenced by previous submissions.

Your score:

What scale?:

Sample 1:
Sample 2:
Sample 3:
Sample 4:
Sample 5:
Sample 6:

(Special prize for the submissions that are closest to the average scores. The prize is Uncle Dave’s Tie Score Tie. Yes, now you can be the envy of your school and own one of these specially designed high-quality silk ties perfect for teachers who do oral English testing. While the student is talking, the teacher can adjust the gold tie clip up or down to indicate to the student how he is doing and to remind the teacher of how the student performed! Note: This offer is void in countries outside of China and in all areas inside of China.)


By Peter Preston, Poland

Teachers do calculate the average score from tests, but then nothing serious is done with it. Even when the average score is close to the pass mark, little statistical comment is made about the glaring problem that this represents. For example, if the average and the pass mark are the same and the population is normally distributed around the average, then 50% of the students fail. Can it be considered acceptable for 50% of the candidates to fail an end-of-year examination, or even worse an end-of-course examination?
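To put numbers on that point, here is a minimal sketch in Python; the class average of 60 and standard deviation of 10 marks are invented figures, used only to show how the failure rate follows from where the pass mark sits relative to the average under the normality assumption above:

```python
from statistics import NormalDist

# Hypothetical figures: class average 60, standard deviation 10 marks.
scores = NormalDist(mu=60, sigma=10)

for pass_mark in (50, 55, 60, 65):
    fail_rate = scores.cdf(pass_mark)  # proportion of the class scoring below the pass mark
    print(f"pass mark {pass_mark}: {fail_rate:.0%} of students fail")

# When the pass mark equals the average (60), the cdf is exactly 0.5,
# i.e. half the class fails -- the "glaring problem" described above.
```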

In fact, at our college the last third-year UoE exam failed 80% of the students. Now you would think that a statistically minded person would immediately start asking questions about the validity of the exam. Construct validity: did the items set actually test the points they were intended to test? Course validity: did the items tested figure in the course syllabus? Is there a proper tie-up between the course syllabus and the test specifications (if the latter exist at all)? Did the distribution of correct responses discriminate between the weak and strong candidates? Were the items either too easy [not in this case] or too difficult? Is there any objective reference to competence standards built into the teaching programme? To ask just a few relevant questions.
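One concrete way to start answering the questions about difficulty and discrimination is classical item analysis. The sketch below is illustrative only, with an invented 0/1 response matrix; it computes a facility value (proportion correct) and a simple upper-minus-lower discrimination index for each item:

```python
# Classical item analysis on a 0/1 response matrix (rows = candidates, columns = items).
# All data here are invented purely for illustration.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

totals = [sum(row) for row in responses]
ranked = sorted(range(len(responses)), key=lambda i: totals[i], reverse=True)
k = max(1, len(responses) // 3)          # size of the top and bottom groups
upper, lower = ranked[:k], ranked[-k:]

for item in range(len(responses[0])):
    facility = sum(row[item] for row in responses) / len(responses)
    discrimination = (sum(responses[i][item] for i in upper)
                      - sum(responses[i][item] for i in lower)) / k
    print(f"item {item + 1}: facility {facility:.2f}, discrimination {discrimination:+.2f}")

# Facility near 0 or 1 flags items that are too hard or too easy;
# discrimination near zero (or negative) flags items that fail to separate
# strong candidates from weak ones.
```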

I would love to hear that other institutions do use statistical analysis of exam data and look at the variance between different exam sittings, whether using the same exam or different ones, but I wonder if small institutes can ever bring together the required expertise to carry out such work either before the exam goes live or afterwards. It would be great to conduct a poll on this matter to try to assess how widely statistics are used in the analysis of exam data across institutes.

[Photo: Peter Preston's students in Poland]

My own experience inclines me to believe that exams are in fact not so much an educational evaluation of the work being done as a policy instrument to give face validity to the programme. As such, one does not need to worry about the quality of the exam, since one can adjust the results before publication. Or, in the case of my institute, the exam can be repeated by order from above until the teachers get the message.

I do not like this cynical manipulation of exam data, so having good-quality statistical information and quality control of all documents involved in the course would be the start of a re-evaluation of the course and teaching methods. With accurate assessment at the beginning of a course, it should be possible to predict the level students could reach after a given number of teaching hours, taking into account the realities of life. By keeping proper statistical records over a few years, one would accumulate powerful information. This is what insurance companies do to calculate their premiums.
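As a rough illustration of the kind of prediction such records might support, here is a small sketch assuming a simple linear relationship between teaching hours and attainment gain; all figures and the linear model itself are assumptions, not data from any real course:

```python
from statistics import linear_regression  # Python 3.10+

# Invented records: teaching hours received vs. band-score gain by the end of the course.
hours_taught = [40, 60, 80, 100, 120, 160]
band_gain    = [0.3, 0.4, 0.6, 0.7, 0.9, 1.1]

slope, intercept = linear_regression(hours_taught, band_gain)
predicted = slope * 90 + intercept
print(f"expected gain after 90 teaching hours: about {predicted:.2f} bands")
```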

By Erlyn Baack, now retired, formerly at ITESM, Campus Queretaro, Mexico http://eslbee.com

Both the IELTS and the TOEFL are proficiency tests: they measure overall proficiency and are global in nature. I do not think they should be seen as achievement tests to be used at the end of a semester of study. Instead, they may be used to inform the achievement rubrics that should be developed within successive levels of an English program. Likewise, these proficiency exams should not be used as placement exams, because better placement exams are available. There is not a single question on the TOEFL, for example, that discriminates between English ONE, TWO, and THREE levels. So even Michigan’s very old English Placement Test (if it is still available) would be better than the TOEFL for placement.

That said, the IELTS and the TOEFL should inform the achievement targets (and, ideally, the rubrics in each of the four skills) that teachers and/or course administrators want students to reach at each level within an English program. Teachers and/or course administrators have to decide the curriculum at each level. For example, in developing the curriculum for English ONE, they must ask and answer the following questions: At the end of the semester, (1) what do we want the students to know (or achieve, or be, or be able to do)?, (2) how are we going to teach it?, and (3) how are we going to test it?

Teachers and/or administrators are then responsible for designing a curriculum and an ACHIEVEMENT exam, _with rubric_, that measures the level of student achievement throughout the semester. By definition, all students should have the ability to STUDY or PRACTICE the curriculum within the semester in a way that leads to higher achievement scores, meaning there would be a high correlation between (1) the number of hours a student studies and (2) his/her final semester score. Those achievement scores, then, would affect the TOEFL and the IELTS only indirectly.
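That claimed relationship is easy to check once the data are collected. The sketch below, using invented numbers purely for illustration, computes the Pearson correlation between study hours and final semester scores:

```python
from statistics import correlation  # Python 3.10+

# Invented illustration: hours each student studied and their final semester score.
hours  = [5, 12, 20, 8, 30, 15, 25, 3]
scores = [55, 68, 80, 60, 92, 74, 85, 50]

r = correlation(hours, scores)
print(f"Pearson r = {r:.2f}")  # a value near +1 would support the study-time/achievement link
```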

I think it is helpful to distinguish between various exams and what they measure.

(1) Placement exams contain questions at all levels to place students within an English program. Michigan’s EPT is an example.

(2) Proficiency exams measure overall proficiency. The IELTS and TOEFL are examples, and they are used by universities, generally, to determine whether proficiency is sufficient for university studies.

(3) Achievement exams measure the level of student achievement within a semester of study. A major monthly exam, a mid-semester exam, or a final exam are examples of those. Did the student “achieve” what was supposed to have been taught and learned within a given week or month or semester?

By Maria Spelleri, Manatee Community College, Florida, USA

One way to get a sense of structure when evaluating student oral production is to use a rubric. Here is an example of a speaking rubric for an ESL program in a US elementary school system: RUBRIC, and here is a site with programs to help you develop a rubric: DEVELOP RUBRIC

To create a rubric for a speaking activity such as retelling a story, you need to break the activity down into its most basic elements. For example, speech is composed of vocabulary, grammar, pronunciation/stress/intonation, logical meaning and order, purpose, and, in the case of the story, an element of cohesion. For the specific task, you might also want to consider the accuracy of the retelling, the amount of detail included, the number or length of pauses and inappropriate filler noises, etc. Then, for each category, set the possible performance/assessment levels, for example "excellent", "satisfactory", and "needs improvement". I prefer to work with a basic set of 3, as it is easier for me to break down a production into bad, so-so, and good rather than more subtle variations, although plenty of instructors use 4 or 5 levels.
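As an illustration of how such a rubric can be kept consistent across students, here is a minimal sketch that stores the categories with three performance levels and averages an examiner's ratings. The category names follow the example above, but the weighting scheme and the example ratings are only assumptions:

```python
# A speaking rubric as a simple data structure: each category is rated on three levels.
LEVELS = {"needs improvement": 1, "satisfactory": 2, "excellent": 3}

CATEGORIES = [
    "vocabulary",
    "grammar",
    "pronunciation/stress/intonation",
    "cohesion and order",
    "accuracy and detail of retelling",
    "fluency (pauses, fillers)",
]

def score(ratings: dict) -> float:
    """Average level (1-3) across all rubric categories."""
    return sum(LEVELS[ratings[c]] for c in CATEGORIES) / len(CATEGORIES)

# Invented rating sheet for one student's story retelling.
example = {
    "vocabulary": "satisfactory",
    "grammar": "needs improvement",
    "pronunciation/stress/intonation": "satisfactory",
    "cohesion and order": "excellent",
    "accuracy and detail of retelling": "satisfactory",
    "fluency (pauses, fillers)": "satisfactory",
}
print(f"overall: {score(example):.2f} / 3")
```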

If you do a Google search using key words like “ESL Speaking rubric”, you should find many ideas to help you create a rubric that will meet your needs.

By the way, I would suggest recording the assessment, either as audio or as audio/video, because it can be hard to listen to the content, mentally evaluate it, and complete a rubric all at the same time. Replaying the recording gives you time to analyze the students' work more carefully and assess more fairly. Playing back the recording for the students, who can then watch themselves and compare the recording to the completed rubric assessment, is a valuable learning tool as well.

By Jennifer Wallace, Anhui Gongye Daxue, Anhui Province, China

When I came to teach here, although I'd been a speaking test examiner for more than 10 years (for UCLES exams), I'd actually never had to set an oral English exam before. I'd always taught in situations where the students were either taking no exam or working towards an external exam, so if I did have to set tests, they were very much on the mock-exam model.

I'd never taught a modern language within a university/college setting where this was the student's main subject (major). Although I had a short training specific to coming to this post in China, provided by the NGO that sponsors my post, I came with some sort of assumption that there would be a syllabus and that there would be designated attainment targets (although not necessarily expressed in that way). Well, you all know the reality here.

I was timetabled for first-year Oral English classes that were provided with one of the ORAL ENGLISH WORKSHOP series of books. If anything, I found that was worse than arriving with nothing. It implied that someone somewhere thought the content of this course book was what my students should be mastering.

Anyway, after a semester of muddling along and getting some sort of impression about what might be possible, I realised that the lowest of the UCLES EFL exams I’d been a speaking test examiner for was probably within the reach of everyone in the class. I’d been warned about the tradition of everyone in the class passing the exams. Remember I’d done those UCLES tests for years.

I could remember the type of tasks set in the exam, and I produced a parallel. Those UCLES tests are taken in pairs, but I chose to give each student an individual exam, partly as a public relations exercise about oral exams within my department. I was interlocutor as well as assessor.

So I recorded all the exams and marked them from the tapes. I was right in that all my students were capable of attaining that first level in the UCLES hierarchy, which means that in a grander scheme of things they had all achieved the Council of Europe Basic User level. The descriptors for this (in summary) are:

Can understand sentences and frequently used expressions related to areas of most immediate relevance (e.g. very basic personal and family information, shopping, local geography, employment).

Can communicate in simple and routine tasks requiring a simple and direct exchange of information on familiar and routine matters.

Can describe in simple terms aspects of his/her background, immediate environment and matters of immediate need.

At the end of the first semester all my students could do that, although a good number could only just do it with a very sympathetic interlocutor, while others walked through it, which gave me a good spread of marks. On that basis I decided to model the exam at the end of the second semester on the next level up (which I'd also done examining work for). By that stage my classes had included a fair amount of group work, and so the exam was done in small, randomly selected groups of 4, not including me. This year I've done lots more group work, but I'm actually planning to give one-to-one exams at this stage instead, partly for comparison.

So my decisions were based on a combination of what was within my own capabilities as well as the students'. I'm not an expert on language testing. My only teacher training is a CELTA, and in the past when I've had to devise and construct college tests it was done under the supervision of a very experienced head of department. But I'm also not into re-inventing the wheel.

The Council of Europe stuff – which relates to ALL the languages taught in Europe (and that includes teaching non-European languages) – is the result of mega-input from experts over heaven knows how many years now. I feel I’d be deluding myself if I thought I could devise any better sort of structure to work within – so I’m using it. I do also like it – I find it clear and easy to get my head around.

I'm also interested to find that, now I've got hold of an English translation of the syllabus for our English & Education majors (which details the Band 2, 4 and 6 targets), I'm starting to be able to relate those targets meaningfully to the band descriptors I'm used to using. I've come across this syllabus too late to affect how I teach and examine this year, but it's certainly going to help me next year.

It also makes me realise that we tend to see the Band exam stuff from the students' perspective, while underlying it there is the same sort of work I've been used to being aware of in a European context. Here, I realise I share the students' kind of understanding, seeing only the tip of the iceberg, because I'm not included in my department as a colleague with access to formal and informal discussion about all this.

From my students I have an impression of teaching that has come down to a lowest-common-denominator sort of level, influenced by the need to get students through those exams so that they can graduate. But I think I can also see where my own college department is failing to meet the specifications in that syllabus, both in intention and in reality. That in itself is interesting, as this is a department that an international NGO thought warranted support in its development.

By Janet Kaback, Newark, NJ, USA

When assigning projects and presentations, I first work with the classes to develop a rubric that will be used to grade their work. I tell the students that by doing this together, they will understand exactly what they need to do in order to 'select' their own grade. We begin with the lowest, "F" = no project. From there, we move to the "A" category, and the students must tell me exactly what they consider to be worth an A. The same procedure is followed for B, C, and D. Last year, the students requested that we add A+, A-, B+, B-, etc., as they saw a difference between projects exhibiting "A+" and "A-" work.

This has developed into a very revealing lesson which permits the students to use higher-order thinking and reasoning skills, as they have to examine the grading system. When the projects come in, any identifying information must be on the reverse side, as I hold up the projects and the class must grade each one, USING the rubric to justify the grade, with my guidance, naturally. After the first few, they usually arrive at a grade very similar to the one I'd have assigned the project.

In addition, we add provisions for the presentations that clearly define and demonstrate which of the students in a group did most of the work. The students, in their presentations, must teach their subject area to the class with an explanation of HOW they learned the information. I've had one student receive an A+ while the other in the pair received a D or F on the presentation. The students are able to clearly judge the work of their peers using this methodology, as well as aim for standards within their own work.