Hewlett Foundation Announces Winners of Essay Scoring Technology Competition


News Release: “The Hewlett Foundation Announces Winners of Essay Scoring Technology Competition”

Washington, D.C. – A British particle physicist and sports enthusiast, a data analyst for the National Weather Service in Washington, D.C., and a graduate student from Germany won the $60,000 first prize in a competition to design innovative software to help teachers and school systems assess their students’ writing. The William and Flora Hewlett Foundation sponsored the contest and awarded $100,000 to the top three research teams – none of whom have a background in education.

The competition itself was as innovative as the problem it was intended to solve. Participants competed in the Automated Student Assessment Prize (ASAP) to develop software that could score student essays drawn from state standardized tests, essays that had already been individually graded by educators. The winning team came closest to replicating how the essays were graded by the trained experts. The contest was held in plain sight, with the scores of the competing teams appearing online in real time alongside an active discussion board.

“I am thrilled to win this contest because it gave me an opportunity to think creatively about how we can use technology to improve scoring software that instantly and inexpensively predicts how an educator would hand-grade an essay,” said Jason Tigg, the British particle physicist turned high frequency trader who was a member of the first place team. “I enjoyed working on a real-life problem that has the potential to revolutionize the way education is delivered.”

The goal of the competition was to assess the ability of technology to assist in grading essays included in standardized tests. The contest revealed that software performed extremely well. This will pave the way for states to include more writing on standardized tests, which today consist mostly of simple multiple-choice questions. This matters because standardized testing has a significant impact on classroom practice. The goal is for students to acquire the critical thinking and communication skills that writing requires, all without adding time and cost to the system.

This innovative software also has great potential in classroom use. Its purpose is not to replace teachers, but to give them tools that let them assign more writing in their classrooms. In today’s writing classes, students write an average of only three essays a semester. With up to forty students in each class, essays take too long to grade. This technology will allow teachers to assign more writing and give quicker feedback. In a subsequent phase of work, ASAP will work with teachers to research the best ways to incorporate essay grading into classroom use.

Educators believe that students must have these skills to compete in the 21st century global economy. Other critical skills include mastering core academic content, working collaboratively and learning how to learn independently. The Hewlett Foundation calls these skills “deeper learning”, and is making grants to support a movement throughout the country to spread this approach to more classrooms and integrate these teaching practices and learning outcomes into education policies.

“More sophisticated assessments will drive better classroom practices because better tests support better learning,” said Barbara Chow, Education Program Director at the Hewlett Foundation. “This contest clearly demonstrated that there is excellent technology that can dramatically lower the cost and time it takes to grade essays, which should be welcome news to state assessment agencies.”

The competition drew more than 2,500 entries and 250 participants and inspired data scientists to develop innovative, accurate ways to improve on the current essay scoring technology. ASAP was hosted by Kaggle, the leading platform for data prediction competitions that allows organizations to post their data and have it scrutinized by the world’s best data scientists.

Competition participants had access to 16,000 hand-scored essays that varied in length, type and grading protocol, and were challenged to develop software designed to faithfully replicate the assessments of a trained expert educator. Software scoring programs do not independently assess the merits of an essay; instead they predict, very accurately, how a person would have scored the essay. This is a critical distinction because it means that the software reproduces the scores of trained educators in significantly less time and at significantly lower cost.
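To make "replicating the assessments of a trained educator" concrete, here is a minimal Python sketch of how agreement between software scores and human scores could be measured. The release does not name the statistic used to rank teams. Quadratic weighted kappa, shown here, is one common agreement measure for ordinal scores and is used only as an illustrative assumption. The function name and the sample scores are hypothetical.

```python
from collections import Counter


def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Agreement between two lists of integer scores.

    1.0 means perfect agreement, 0.0 means chance-level agreement, and
    negative values mean agreement worse than chance. Illustrative only:
    the release does not say which statistic the contest used.
    """
    if len(human) != len(predicted) or not human:
        raise ValueError("score lists must be non-empty and equally long")
    n_levels = max_score - min_score + 1
    if n_levels < 2:
        raise ValueError("need at least two possible score levels")

    n = len(human)
    observed = Counter(zip(human, predicted))  # counts of (human, predicted) pairs
    human_hist = Counter(human)                # how often each human score occurs
    pred_hist = Counter(predicted)             # how often each predicted score occurs

    disagreement = 0.0         # weighted disagreement actually observed
    chance_disagreement = 0.0  # weighted disagreement expected by chance
    for a in range(min_score, max_score + 1):
        for b in range(min_score, max_score + 1):
            # Penalty grows with the squared distance between the two scores.
            weight = (a - b) ** 2 / (n_levels - 1) ** 2
            disagreement += weight * observed[(a, b)]
            chance_disagreement += weight * human_hist[a] * pred_hist[b] / n

    if chance_disagreement == 0:
        # Both lists contain one and the same constant score.
        return 1.0
    return 1.0 - disagreement / chance_disagreement


if __name__ == "__main__":
    # Hypothetical scores on a 1-4 scale, invented for illustration.
    educator_scores = [1, 2, 3, 4, 2, 3]
    software_scores = [1, 2, 3, 3, 2, 4]
    print(quadratic_weighted_kappa(educator_scores, software_scores, 1, 4))
```

Under a measure like this, a prediction that misses the educator's score by one point is penalized far less than one that misses by three, which suits rubric-based essay scores where near misses are common even between human graders.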

According to Kaggle CEO Anthony Goldbloom, “We harness the brainpower of the best minds in the world to create game-changing solutions. ASAP is no exception. The goal of this competition was to offer impartial competitions in which a fair and transparent process allows participants to demonstrate the capabilities of this innovative software and showcase to the world the value of scoring technologies.”

ASAP supports the efforts of the Common Core State Standards, which define the knowledge and skills students should have when they graduate high school so they can succeed in college and the workforce. The competition is also being conducted with the support of the two multi-state consortia funded by the U.S. Department of Education to develop next-generation assessments, the Partnership for Assessment of Readiness for College and Careers and the Smarter Balanced Assessment Consortium.

As states implement the Common Core State Standards, they are making decisions about what kind of next generation assessments can measure this new level of rigor with fidelity. Innovative software that can faithfully replicate trained educators offers a new approach for states to rise to the challenge.

In conjunction with the ASAP open competition, the Hewlett Foundation also sponsored a groundbreaking study of companies that offer essay grading software. The study found that the scoring software was able to replicate the scores of trained educators, with the software in some cases proving to be more reliable. The study was released on April 16 at the annual conference of the National Council on Measurement in Education. A copy of the study can be found at http://bit.ly/HJWwdP.

A second ASAP study will be announced this summer to encourage companies that sell essay grading software and public competitors to undertake the same challenge for grading short-answer questions. Three additional ASAP studies are in development.

ASAP was designed by The Common Pool, LLC, and is managed by Open Education Solutions.

Contest Winners

The winning team of the Automated Student Assessment Prize (ASAP) hails from three different countries: England, Germany and the U.S. Their collaboration combined the team’s diverse skills in computer science, physics and language to produce the most innovative, effective and applicable testing model among more than 250 participants. The team says they believe they have just barely scratched the surface of possibilities with software scoring technology.

Members of the winning team include:

  • Jason Tigg – A resident of London, Tigg applies his scientific expertise to predicting financial movements at work and game programming in his free time. Armed with a Ph.D. in particle physics from Oxford University, Tigg is also a champion runner and rower – he won last year’s Barnes and Mortlake Regatta held on the Thames River.
  • Stefan Henß – The team’s talented rookie, Henß brought his expertise in language and semantics analysis to help guide the team to success. Henß is currently pursuing a Master’s degree in computer science from the Darmstadt University of Technology in Hesse, Germany.
  • Momchil Georgiev – Georgiev – the son of two teachers – applied his deep appreciation for learning and a passion for data analysis to the team’s winning entry. Born in Bulgaria, Georgiev earned a Master’s degree in computer science from Johns Hopkins University and now works as an engineer for the National Weather Service in Washington, D.C.

The runner-up team stretches across the globe, with members from the U.S., Canada and Australia. Members include:

  • Phil Brierley, based in Melbourne, Australia, has a Ph.D. in Engineering and Artificial Intelligence. Brierley created a popular data mining system called Tiberius and is a leading competitor in the Heritage Health Prize, a two-year competition to improve health care.
  • Eu Jin also lives in Melbourne and brought his data mining and fraud investigation experience to the team.
  • William Cukierski is a Ph.D. candidate in Biomedical Engineering at Rutgers University. Cukierski has used his data expertise to predict everything from stocks to grocery shopping trends.
  • Christopher Hefele, who currently works for AT&T as a systems engineer, is not new to scoring high in Kaggle competitions – he was part of the team that took home second place in Netflix’s million dollar competition to improve its movie recommendations.
  • Bo Yang lives and works as a software engineer in British Columbia, Canada, and previously finished first in Kaggle’s photo quality prediction contest.

The third place team is an American duo of data experts. Members include:

  • Vik Paruchuri is a data and predictive modeling consultant who is currently writing a book about statistical programming. Paruchuri served overseas as a U.S. Foreign Service Officer for the State Department. Unlike most data experts, Paruchuri got his degree in American History.
  • Justin Fister’s educational background in psychology and computer science fueled his interest in the ASAP competition. Fister has worked in the software industry for more than ten years.
