There has to be a way to measure student success, but are high-stakes tests the best way? How fair and how accurate are these tests for career and technical education students?
What if dentists were rated on the kind of tests used to assess teachers? That is the premise of a presentation that has been making the rounds at educational conferences. It appeared at the spring 2001 Project-Based Learning Conference where keynote speaker Linda Darling-Hammond credited the story to John S. Taylor, superintendent of schools in the Lancaster County School District in South Carolina.
At our own 2001 ACTE convention in New Orleans, a group of educators from the College of Agriculture and Life Sciences at the University of Arizona, Tucson, Ariz., presented it.
It goes something like this.
"I ran into my dentist the other day and asked him about the new state program to measure the effectiveness of dentists. I told him how it would measure the abilities of dentists by counting the number of cavities each of their young patients has at ages 10, 14 and 18, and then average that to determine a rating. Dentists would be rated as Excellent, Good, Average, Below Average and Unsatisfactory."
"That's terrible," said the dentist. "That's not a fair way to measure who is a good dentist."
"Why not?" I asked.
"Because all dentists don't work with the same kind of patients, and there are a lot of things we can't control. For example, I work in a rural area with a high percentage of patients from low-income homes. Some of my colleagues work in upper-middle-class neighborhoods where the children get regular dental checkups. Many of the parents of my patients don't bring their children in until there is a problem. Also, many of those parents don't know as much about nutrition and let their children eat way too much candy. Furthermore, many of my clients have wells, so their water is not fluoridated."
I told him that sounded like he was just making a bunch of excuses, and he got really angry.
"I'm not making excuses," he said. "My work is as good as any of my colleagues, but my average cavity count is going to be higher because of where I work. I choose to work where I am needed the most."
"Don't get so upset," I told him.
"How can I not get upset?" he replied. "With this new system, I will be rated average or below average. If my more educated patients believe this is a real measure of my ability as a dentist, they may go elsewhere, then I'll only have my neediest patients and my cavity average score will get even worse."
"Well," I told him, "a leading member of the DOC says that complaining and making excuses won't improve dental health."
"What's the DOC?" he asked.
"The Dental Oversight Committee," I replied. "It's a group made up of mostly laypersons that will make sure our state's dentistry improves."
"Reasonable people won't go along with this," my dentist said.
"Well then, how would you measure good dentists?" I asked him.
"Come and observe how I work."
"That's too complicated and time consuming. Cavities are the bottom line."
My poor dentist was so upset. "This can't be happening," he said.
"Don't worry," I told him. "The state will help you out. If you're rated poorly, they'll send a dentist with an excellent rating to straighten you out."
"Do you mean that they will send a dentist with a wealthy clientele to show me how to work on patients with severe juvenile dental problems? He doesn't have the experience that I do in that area. Don't you understand? This would be like grading schools and teachers on an average score on a test of children's progress without regard to any influences from outside the school, like their homes, the community, things like that. Why would they do something like that to dentists? No one would think of doing something so unfair to schools."
As my dentist left, he said, "I'm going to write to my senator and representatives. I'll use the school analogy and surely they'll see my point."
Even if you were not fortunate enough to have been at this session of the ACTE convention, you can imagine the response it got, from laughter at the many touches of irony to the final round of appreciative applause.
The presentation by the Arizona group, which included Jack Elliot, Jim Knight, Billye Foster and Ed Franklin, was titled "High Stakes Testing: Who is Smarter? Academic or Vocational Students?" While they noted that career and technical education students in their state scored lower on the Stanford 9 tests when raw scores were compared, when other variables or influences were factored into the statistical analysis, there was no difference found between the two groups.
The factors cited included gender, race/ethnicity, special populations and learning styles. The Arizona career tech students included a higher proportion that fell into the category of "special population" (having disabilities or limited English proficiency, or being economically or academically disadvantaged). Career tech students tended to be kinesthetic learners, who learn by doing, rather than visual learners, who score higher on standardized tests.
The conclusion arrived at by the University of Arizona educators was that one group should not be labeled as smarter than the other; the two groups are simply different. Their recommendations are:
- Career and technical education administrators and teachers must understand the problems associated with raw score comparisons.
- Career and technical education state leaders must utilize this type of information in career and technical education promotional materials.
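The problem with raw score comparisons can be illustrated with a small sketch. The numbers below are entirely hypothetical and are not taken from the Arizona study; the names `records`, `raw_mean` and `stratum_mean` are made up for this illustration. The sketch assumes two programs whose students score identically within each subgroup, but whose mix of "special population" students differs.

```python
from statistics import mean

# Hypothetical records: (program, in_special_population, score).
# Illustrative numbers only; NOT data from the Arizona study.
records = (
    [("academic", False, 60)] * 80 + [("academic", True, 40)] * 20
    + [("career_tech", False, 60)] * 50 + [("career_tech", True, 40)] * 50
)

def raw_mean(program):
    """Average score for a program, ignoring every other factor."""
    return mean(s for p, _, s in records if p == program)

def stratum_mean(program, special):
    """Average score for a program within one subgroup only."""
    return mean(s for p, sp, s in records if p == program and sp == special)

# Raw comparison: the academic group appears to score higher.
raw_gap = raw_mean("academic") - raw_mean("career_tech")

# Like-with-like comparison: the gap within each subgroup.
adjusted_gaps = {
    special: stratum_mean("academic", special) - stratum_mean("career_tech", special)
    for special in (False, True)
}

print("raw gap:", raw_gap)
print("gap within each subgroup:", adjusted_gaps)
```

In this made-up example the raw averages differ by six points even though students in the same subgroup score the same in both programs. The whole gap comes from the different makeup of the two groups. A real analysis would account for several factors at once, which this sketch does not attempt.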
An article on the Arizona study appeared in the 2002 Arizona Agricultural Experiment Station Research Report of the University of Arizona's College of Agriculture and Life Sciences. In it, editor Susan McGinley cites many of the statistics presented at the ACTE convention by Elliot and Knight's team.
Elliot, who is a member of the State Board of Education Career and Technical Education Advisory Committee and the university representative on the Arizona Council of Occupational and Vocational Administrators, also argues for a more well-rounded approach to student assessment, one that would include student portfolios documenting achievements, skills and competencies.
"We don't oppose standardized testing or even testing," Elliot says in the article, "but we do oppose high-stakes testing because no single event should decide a student's life."
The October 2002 issue of the Center on Education Policy's TestTalk for Leaders looked at "What Tests Can and Cannot Tell Us."
According to the report from the Center on Education Policy (CEP), "When state tests and other large-scale tests are well designed and properly used, they can tell us a lot about what students know and can do."
The advantages cited for these more standardized forms of testing include providing more consistent data among different schools and districts, and providing valuable summary data on student performance by subject, skill and knowledge area. This data can also be used to compare achievement between various groups of students with regard to income and ethnicity, and the collection, analysis and reporting of the data may be done for a lower cost.
However, the CEP report cautions that, "even good tests have limitations, something state and federal policymakers don't always consider when they design education accountability systems."
Sometimes state accountability systems treat test scores as if they were precise calculations. The CEP says they should be considered more as estimates because, "Test scores can fluctuate for reasons that have nothing to do with student learning or the quality of teaching."
These factors may include the student's physical or mental condition that day, outside distractions, the sample of questions, errors in scoring, or even lucky guesses.
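One common way to treat a score as an estimate rather than a precise figure is the standard error of measurement from classical test theory: the score spread multiplied by the square root of one minus the test's reliability. The sketch below is only an illustration of that idea, not a method described in the CEP report. The function name `score_band` and all of the numbers (a score of 500, a standard deviation of 40, a reliability of 0.91, a cut score of 510) are hypothetical.

```python
import math

def score_band(observed, sd, reliability, z=1.96):
    """Return an approximate 95% band around an observed test score.

    Uses the classical standard error of measurement,
    SEM = sd * sqrt(1 - reliability). All inputs here are
    hypothetical and chosen only for illustration.
    """
    sem = sd * math.sqrt(1.0 - reliability)
    return observed - z * sem, observed + z * sem

# Hypothetical student score and test properties.
low, high = score_band(observed=500, sd=40, reliability=0.91)

# Hypothetical passing cut score that falls inside the band.
cut_score = 510
print(f"band: {low:.1f} to {high:.1f}")
print("cut score inside band:", low <= cut_score <= high)
```

Under these assumed numbers the band runs roughly 24 points either side of the reported score, so a student reported at 500 cannot be cleanly separated from a cut score of 510 on the basis of one sitting.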
Many teachers are now "teaching to the test," which can improve test scores, but does it always improve education? And the pressure of high-stakes testing has led some educators to do things that are not only unethical but have sometimes cost them their jobs.
Among these recent incidents: On January 7, the St. Louis Post-Dispatch reported that Joyce Wilks-Love, the principal of Horace Mann School, was suspended then transferred to the central school district office because she had allowed teachers to read the questions and the multiple-choice answers during standardized science and social studies tests given to all fourth-graders, something only allowed for special education students. A January 26 article in The Atlanta Journal-Constitution told the story of Frankey Jones, who leaked a copy of the controversial Gateway Test to the media because she thought that the high-stakes test was "not appropriate for a nine-year-old who sometimes still sleeps with a stuffed animal." After her confession, Jones resigned, but the district is threatening to sue her for $750,000.
Those are just two of the news stories from one recent month.
The CEP report concluded that tests are an indispensable yet imperfect tool. However, "As state and national leaders gain more experience with high-stakes testing, deepen their understanding of what tests can and can't do well, and bring new testing technologies into wider use, they can make their systems even better in the future."
Two studies from Arizona State University (ASU) were much more negative with regard to high-stakes testing. In "The Impact of High-Stakes Tests on Student Academic Performance: An Analysis of NAEP Results in States with High-Stakes Tests and ACT, SAT and AP Test Results in States with High School Graduation Exams," researcher Audrey L. Amrein and ASU education professor David C. Berliner wanted to find out whether academic achievement had improved since the introduction of high-stakes testing policies in the 27 states with the highest stakes written into their testing policies for grades one through eight.
They found that, "Analyses of scores and participation rates for the NAEP, ACT, SAT and AP tests suggest that there is inadequate evidence to support the proposition that high-stakes tests and high school graduation exams increase student achievement."
Instead, the slightly anti-climactic conclusion they arrived at was that after the implementation of high-stakes tests, "nothing much happens." Test scores seemed to go up and down in a random pattern.
The researchers did note that their data suggested that after the implementation of high school graduation exams, academic achievement appeared to decrease, as indicated by declining ACT, SAT and AP scores.
In their second study, "An Analysis of Some Unintended and Negative Consequences of High-Stakes Testing," Amrein and Berliner found evidence that high school graduation exams increase dropout rates, decrease high school graduation rates and increase the rates of GED program enrollment.
They claim that their analysis shows that high-stakes tests create unintended negative consequences, and that, "The adverse consequences of high-stakes tests appear to outweigh what few benefits such tests may have."
Daniel M. Koretz, a senior social scientist at the RAND Corp., as well as an education professor at Harvard University, was quoted in an Education Week article as saying the Amrein and Berliner studies could benefit from some more fine-grained analyses.
"Basically," Koretz told Education Week, "I just don't think we know enough yet about the broad sweep of the impacts from all of these tests."
While the need for tools of assessment is recognized by almost everyone, just what those tools should be is still being debated. We have always had, and probably always will have, testing. The question is how much weight should be given to a single test in evaluating educational progress. When it comes to high-stakes testing, there still appear to be many questions. And the education community is still in search of the answers.
Comments, Questions. Do you have a question for the author? Do you want to comment on the article? E-mail techniques@acteonline.org, or fax or mail back the Reader Response page at the front of this issue.
Interactive Dialogue. Do you want to discuss this topic with other educators? ACTE members can use ACTE's electronic discussion boards on the ACTE Web site. Go to www.acteonline.org and click on Members Only.
Consult your ACTE Web site Owner's Guide, which also appears in the January 2003 issue of Techniques, to find out all the ways you can use the ACTE Web site.
For more information and opinions about high-stakes testing, here are some resources for further exploration and the Web sites where they can be found.
- "What Tests Can and Cannot Tell Us," from the Center on Education Policy's October 2002 TestTalk for Leaders, http://www.cep-dc.org/testing/testtalkoctober2002.htm
- "The Impact of High-Stakes Tests on Student Academic Performance: An Analysis of NAEP Results in States with High-Stakes Tests and ACT, SAT and AP Test Results in States with High School Graduation Exams," by Audrey L. Amrein and David C. Berliner, Arizona State University, http://www.asu.edu/educ/epsl/EPRU/documents/EPSL-0211-126-EPRU.pdf
- "An Analysis of Some Unintended and Negative Consequences of High-Stakes Testing," by Audrey L. Amrein and David C. Berliner, Arizona State University, http://www.asu.edu/educ/epsl/EPRU/documents/EPSL-0211-125-EPRU.pdf
- "Universally Designed Assessments: Better Tests for Everyone!" National Center on Educational Outcomes, NCEO Policy Directions No. 14, University of Minnesota, http://education.umn.edu/NCEO/OnlinePubs/Policy14.htm
- "Special Topic Area: Accountability for Students with Disabilities," National Center on Educational Outcomes, http://www.education.umn.edu/NCEO/TopicAreas/Accountability/Account_topic.htm
- "Special Topic Area: Alternate Assessments for Students with Disabilities," National Center on Educational Outcomes, http://www.education.umn.edu/NCEO/TopicAreas/AlternateAssessments/alt_assess_topic.htm
- "High-Stakes Testing Isn't the Answer," 2002 Arizona Agricultural Experiment Station Research Report, http://www.cals.arizona.edu/pubs/general/resrpt2002