Assignment from meeting 9-10

Summary

BEYOND TESTS: ALTERNATIVES IN ASSESSMENT

In the public eye, tests have acquired an aura of infallibility in our culture of mass producing everything, including the education of school children. Everyone wants a test for everything, especially if the test is cheap, quickly administered, and scored instantaneously. Tests are formal procedures, usually administered within strict time limitations, to sample the performance of a test-taker in a specified domain. Assessment includes all occasions from informal impromptu observations and comments up to and including tests.

Early in the decade of the 1990s, in a culture of rebellion against the notion that all people and all skills could be measured by traditional tests, a novel concept emerged that began to be labeled "alternative" assessment. As teachers and students were becoming aware of the shortcomings of standardized tests, "an alternative to standardized testing and all the problems found with such testing" (Huerta-Macias, 1995, p. 8) was proposed. That proposal was to assemble additional measures of students (portfolios, journals, observations, self-assessments, peer-assessments, and the like) in an effort to triangulate data about students. For some, such alternatives held "ethical potential" (Lynch, 2001, p. 228) in their promotion of fairness and the balance of power relationships in the classroom.

Brown and Hudson (1998) noted that to speak of alternative assessments is counterproductive because the term implies something new and different that may be "exempt from the requirements of responsible test construction" (p. 657). So they proposed to refer to "alternatives" in assessment instead. They also aptly summed up the defining characteristics of the various alternatives in assessment that have been commonly used across the profession (pp. 654-655). Alternatives in assessment:

1. Require students to perform, create, produce, or do something;
2. Use real-world contexts or simulations;
3. Are nonintrusive in that they extend the day-to-day classroom activities;
4. Allow students to be assessed on what they normally do in class every day;
5. Use tasks that represent meaningful instructional activities;
6. Focus on processes as well as products;
7. Tap into higher-level thinking and problem-solving skills;
8. Provide information about both the strengths and weaknesses of students;
9. Are multiculturally sensitive when properly administered;
10. Ensure that people, not machines, do the scoring, using human judgment;
11. Encourage open disclosure of standards and rating criteria; and
12. Call upon teachers to perform new instructional and assessment roles.

A. THE DILEMMA OF MAXIMIZING BOTH PRACTICALITY AND WASHBACK

The principal purpose of this chapter is to examine some of the alternatives in assessment that are markedly different from formal tests. Tests, especially large-scale standardized tests, tend to be one-shot performances that are timed, multiple-choice, decontextualized, and norm-referenced, and that foster extrinsic motivation. On the other hand, tasks like portfolios, journals, and self-assessment are:
• Open-ended in their time orientation and format,
• Contextualized to a curriculum,
• Referenced to the criteria (objectives) of that curriculum, and
• Likely to build intrinsic motivation.

One way of looking at this contrast poses a challenge to you as a teacher and test designer. Formal standardized tests are almost by definition highly practical, reliable instruments. They are designed to minimize time and money on the part of test designer and test-taker, and to be painstakingly accurate in their scoring. Alternatives such as portfolios, or conferencing with students on drafts of written work, or observations of learners over time all require considerable time and effort on the part of the teacher and the student. Even more time must be spent if the teacher hopes to offer a reliable evaluation within students across time, as well as across students (taking care not to favor one student or group of students). But the alternative techniques also offer markedly greater washback, are superior formative measures, and, because of their authenticity, usually carry greater face validity.

The flip side of this challenge is to understand that the alternatives in assessment are not doomed to be impractical and unreliable. As we look at alternatives in assessment in this chapter, we must remember Brown and Hudson's (1998) admonition to scrutinize the practicality, reliability, and validity of those alternatives at the same time that we celebrate their face validity, washback potential, and authenticity.

B. PERFORMANCE-BASED ASSESSMENT

Before proceeding to a direct consideration of types of alternatives in assessment, a word about performance-based assessment is in order. There has been a great deal of press in recent years about performance-based assessment, sometimes merely called performance assessment (Shohamy, 1995; Norris et al., 1998).

The push toward more performance-based assessment is part of the same general educational reform movement that has raised strong objections to using standardized test scores as the only measures of student competencies (see, for example, Valdez Pierce & O'Malley, 1992; Shepard & Bliem, 1993). The argument, as you can guess, was that standardized tests do not elicit actual performance on the part of test-takers.

If a child were asked, for example, to write a description of earth as seen from space, to work cooperatively with peers to design a three-dimensional model of the solar system, to explain the project to the rest of the class, and to take notes on a videotape about space travel, traditional standardized testing would be involved in none of those performances. Performance-based assessment, however, would require the performance of the above-named actions, or samples thereof, which would be systematically evaluated through direct observation by a teacher and/or possibly by self and peers. Performance-based assessment implies productive, observable skills, such as speaking and writing, exercised in content-valid tasks.

O'Malley and Valdez Pierce (1996) considered performance-based assessment to be a subset of authentic assessment. The following are characteristics of performance assessment:
1. Students make a constructed response.
2. They engage in higher-order thinking, with open-ended tasks.
3. Tasks are meaningful, engaging, and authentic.
4. Tasks call for the integration of language skills.
5. Both process and product are assessed.
6. Depth of a student's mastery is emphasized over breadth.

In reality, performances as assessment procedures need to be treated with the same rigor as traditional tests. This implies that teachers should
• State the overall goal of the performance,
• Specify the objectives (criteria) of the performance in detail,
• Prepare students for performance in stepwise progressions,
• Use a reliable evaluation form, checklist, or rating sheet,
• Treat performances as opportunities for giving feedback and provide that feedback systematically, and
• If possible, utilize self- and peer-assessments judiciously.
To sum up, performance assessment is not completely synonymous with the concept of alternative assessment.

C. PORTFOLIOS

One of the most popular alternatives in assessment, especially within a framework of communicative language teaching, is portfolio development. According to Genesee and Upshur (1996), a portfolio is "a purposeful collection of students' work that demonstrates their efforts, progress, and achievements in given areas" (p. 99). Portfolios include materials such as:
• Essays and compositions in draft and final forms;
• Reports, project outlines;
• Poetry and creative prose;
• Artwork, photos, newspaper or magazine clippings;
• Audio and/or video recordings of presentations, demonstrations, etc.;
• Journals, diaries, and other personal reflections;
• Tests, test scores, and written homework exercises;
• Notes on lectures; and
• Self- and peer-assessments: comments, evaluations, and checklists.

Until recently, portfolios were thought to be applicable only to younger children who assemble a portfolio of artwork and written work for presentation to a teacher and/or a parent. Now learners of all ages and in all fields of study are benefiting from the tangible, hands-on nature of portfolio development. Gottlieb (1995) suggested a developmental scheme for considering the nature and purpose of portfolios, using the acronym CRADLE to designate six possible attributes of a portfolio: Collecting, Reflecting, Assessing, Documenting, Linking, and Evaluating.

A portfolio is an important Document in demonstrating student achievement, and not just an insignificant adjunct to tests, grades, and other more traditional evaluation. A portfolio can also serve as an important Link between student and teacher, parent, community, and peers. It is a tangible product, created with pride, that identifies a student's uniqueness.

D. JOURNALS

A journal is a log (or "account") of one's thoughts, feelings, reactions, assessments, ideas, or progress toward goals, usually written with little attention to structure, form, or correctness. Learners can articulate their thoughts without the threat of those thoughts being judged later (usually by the teacher). Sometimes journals are rambling sets of verbiage that represent a stream of consciousness with no particular point, purpose, or audience. Fortunately, models of journal use in educational practice have sought to tighten up this style of journal in order to give it some focus (Staton et al., 1987). The result is the emergence of a number of overlapping categories or purposes in journal writing, such as the following:
• Language-learning logs
• Grammar journals
• Responses to readings
• Strategies-based learning logs
• Self-assessment reflections
• Diaries of attitudes, feelings, and other affective factors
• Acculturation logs

Most classroom-oriented journals are what have now come to be known as dialogue journals. They imply an interaction between a reader (the teacher) and the student through dialogues or responses. One of the principal objectives in a student's dialogue journal is to carry on a conversation with the teacher. Through dialogue journals, teachers can become better acquainted with their students, in terms of both their learning progress and their affective states, and thus become better equipped to meet students' individual needs. Journals obviously serve important pedagogical purposes: practice in the mechanics of writing, using writing as a "thinking" process, individualization, and communication with the teacher. At the same time, the assessment qualities of journal writing have assumed an important role in the teaching-learning process.

It is important to turn the advantages and potential drawbacks of journals into positive general steps and guidelines for using journals as assessment instruments. The following steps are not coincidentally parallel to those cited above for portfolio development:
1. Sensitively introduce students to the concept of journal writing.
2. State the objective(s) of the journal.
3. Give guidelines on what kinds of topics to include.
4. Carefully specify the criteria for assessing or grading journals.
5. Provide optimal feedback in your responses.
6. Designate appropriate time frames and schedules for review.
7. Provide formative, washback-giving final comments.

E. CONFERENCES AND INTERVIEWS
For a number of years, conferences have been a routine part of language classrooms, especially of courses in writing. Conferences are not limited to drafts of written work. Including portfolios and journals discussed above, the list of possible functions and subject matter for conferencing is substantial:
• Commenting on drafts of essays and reports.
• Reviewing portfolios.
• Responding to journals.
• Advising on a student's plan for an oral presentation.
• Assessing a proposal for a project.
• Giving feedback on the results of performance on a test.
• Clarifying understanding of a reading.
• Exploring strategies-based options for enhancement or compensation.
• Focusing on aspects of oral production.
• Checking a student's self-assessment of a performance.
• Setting personal goals for the near future.
• Assessing general progress in a course.

Conferences must assume that the teacher plays the role of a facilitator and guide, not of an administrator of a formal assessment. Conferences are by nature formative, not summative, and their primary purpose is to offer positive washback. Discussions of alternatives in assessment usually encompass one specialized kind of conference: an interview. This term is intended to denote a context in which a teacher interviews a student for a designated assessment purpose. Interviews may have one or more of several possible goals, in which the teacher
• Assesses the student's oral production,
• Ascertains a student's needs before designing a course or curriculum,
• Seeks to discover a student's learning styles and preferences,
• Asks a student to assess his or her own performance, and
• Requests an evaluation of a course.

One overriding principle of effective interviewing centers on the nature of the questions that will be asked. It is easy for teachers to assume that interviews are just informal conversations and that they need little or no preparation. To maintain the all-important reliability factor, interview questions should be constructed carefully to elicit as focused a response as possible. When interviewing for oral production assessment, for example, a highly specialized set of probes is necessary to accomplish predetermined objectives.

Because interviews have multiple objectives, as noted above, it is difficult to generalize principles for conducting them, but the following guidelines may help to frame the questions efficiently:
1. Offer an initial atmosphere of warmth and anxiety-lowering (warm-up).
2. Begin with relatively simple questions.
3. Continue with level check and probe questions, but adapt to the interviewee as needed.
4. Frame questions simply and directly.
5. Focus on only one factor for each question. Do not combine several objectives in the same question.
6. Be prepared to repeat or reframe questions that are not understood.
7. Wind down with friendly and reassuring closing comments.

As is true for many of the alternatives in assessment, the practicality of conferences and interviews is low because they are time-consuming. Reliability will vary between conferences and interviews. In the case of conferences, it may not be important to have rater reliability because the whole purpose is to offer individualized attention, which will vary greatly from student to student. For interviews, a relatively high level of reliability should be maintained with careful attention to objectives and procedures.

F. OBSERVATIONS

All teachers, whether they are aware of it or not, observe their students in the classroom almost constantly. Virtually every question, every response, and almost every nonverbal behavior is, at some level of perception, noticed. All those intuitive perceptions are stored as little bits and pieces of information about students that can form a composite impression of a student's ability. Without ever administering a test or a quiz, teachers know a lot about their students. In fact, experienced teachers are so good at this almost subliminal process of assessment that their estimates of a student's competence are often highly correlated with actual independently administered test scores.

One of the objectives of such observation is to assess students without their awareness (and possible consequent anxiety) of the observation so that the naturalness of their linguistic performance is maximized. In order to carry out classroom observation, it is of course important to take the following steps:
1. Determine the specific objectives of the observation.
2. Decide how many students will be observed at one time.
3. Set up the logistics for making unnoticed observations.
4. Design a system for recording observed performances.
5. Do not overestimate the number of different elements you can observe at one time; keep them very limited.
6. Plan how many observations you will make.
7. Determine specifically how you will use the results.

Recording your observations can take the form of anecdotal records, checklists, or rating scales. Anecdotal records should be as specific as possible in focusing on the objective of the observation, but they are so varied in form that to suggest formats here would be counterproductive. Their very purpose is more note-taking than record-keeping. The key is to devise a system that maintains the principle of reliability as closely as possible. Checklists are a viable alternative for recording observation results.

Some checklists of student classroom performance, such as the COLT observation scheme devised by Spada and Frohlich (1995), are elaborate grids referring to such variables as whole-class, group, and individual participation; content of the topic; linguistic competence (form, function, discourse, sociolinguistic); materials being used; and skill (listening, speaking, reading, writing), with subcategories for each variable. The observer identifies an activity or episode, as well as the starting time for each, and checks appropriate boxes along the grid.

G. SELF AND PEER ASSESSMENTS
Self-assessment derives its theoretical justification from a number of well-established principles of second language acquisition. The principle of autonomy stands out as one of the primary foundation stones of successful learning. The ability to set one's own goals both within and beyond the structure of a classroom curriculum, to pursue them without the presence of an external prod, and to independently monitor that pursuit are all keys to success. Developing intrinsic motivation that comes from a self-propelled desire to excel is at the top of the list of successful acquisition of any set of skills.

Peer-assessment appeals to similar principles, the most obvious of which is cooperative learning. Many people go through a whole regimen of education from kindergarten up through a graduate degree and never come to appreciate the value of collaboration in learning: the benefit of a community of learners capable of teaching each other something. Peer-assessment is simply one arm of a plethora of tasks and procedures within the domain of learner-centered and collaborative education.

Researchers (such as Brown & Hudson, 1998) agree that the above theoretical underpinnings of self- and peer-assessment offer certain benefits: direct involvement of students in their own destiny, the encouragement of autonomy, and increased motivation because of their self-involvement.

a. Types of Self and Peer Assessment
It is important to distinguish among several different types of self- and peer-assessment and to apply them accordingly. There are five categories of self- and peer-assessment:
1. Assessment of (a specific) performance.
2. Indirect assessment of (general) competence.
3. Metacognitive assessment (for setting goals).
4. Socioaffective assessment.
5. Student-generated tests.
b. Guidelines for Self and Peer Assessment
Self- and peer-assessment are among the best possible formative types of assessment and possibly the most rewarding, but they must be carefully designed and administered for them to reach their potential. Four guidelines will help teachers bring this intrinsically motivating task into the classroom successfully.
1. Tell students the purpose of the assessment.
2. Define the task(s) clearly.
3. Encourage impartial evaluation of performance or ability.
4. Ensure beneficial washback through follow-up tasks.

c. A Taxonomy of Self and Peer Assessment Tasks
To sum up the possibilities for self- and peer-assessment, it is helpful to consider a variety of tasks within each of the four skills, that is, reading tasks, listening tasks, writing tasks, and speaking tasks. An evaluation of self- and peer-assessment according to our classic principles of assessment yields a pattern that is quite consistent with other alternatives to assessment that have been analyzed in this chapter. Practicality can achieve a moderate level with such procedures as checklists and questionnaires, while reliability risks remaining at a low level, given the variation within and across learners.

References:
Brown, H. Douglas. Language Assessment: Principles and Classroom Practices. Longman.
