Vantage Learning CEO: Computers can now competently grade essays

Are you still grading written work by hand? Artificial intelligence capable of grading students' answers on multiple choice, true or false, fill-in-the-blank and other basic question formats has been commonplace for years now. Technology that can grade essays is another story.

Such technology is often written off because of how difficult it is to build a solution that grades written work as well as a human does. Vantage Learning, however, is setting out to prove those beliefs wrong. Part of a larger company that does research into machine intelligence (a quick look at the legal notices in your cell phone's "General" menu will reveal a number of licenses for the company's spell-checking, writing improvement, proofreading and grammar-checking tools), Vantage Learning has, after decades of R&D, developed a solution that CEO Peter Murphy says works.

“It’s a whole big difference to say you’re doing stuff with essay scoring technology. It’s another thing to really be able to say that you can do it with human-level performance or exceed it, and that’s our level of measure in anything that we’re trying to build and do with machines,” says Murphy. “This isn’t like an enterprise that was started up overnight and all of a sudden we’re making claims of doing a whole bunch of great things.”

In a recent interview, Murphy told us more about Vantage Learning's written assessment tools, what he thinks has held such a tool back, and where he sees adaptive learning headed.

EDUCATION DIVE: How did Vantage Learning get to where it is today with the essay scoring technology?

PETER MURPHY: In the late ‘90s, we decided to apply our technologies to the problem of getting computers to grade essays. And we were the first to reach statistical equivalence with humans. That’s why our tools are the ones that grade things like the GMAT exam and the MCAT. On that basis, we decided to take a focus, and we formed Vantage Learning to focus on tools that would make things like writing across the curriculum a reality. So, that was how that came about.

Very quickly thereafter, we realized the tools could be very effective for teaching and learning. At that time, in the early 2000s, people just wanted to use it more as an assessment tool, but we really felt that it was best used from a formative perspective. In a sense, it was the first version of an adaptive learning technology. Beyond that, we built platforms of tools for doing not only adaptive assessment, but being able to take it to the next stage to adaptive learning, as well. If chosen by the educator or by an institution — because we really did more business-to-business selling than business-to-consumer — you could actually have gotten to a level of multi-subject, continuous adaptivity.

We were talking about it quite a bit in the early 2000s. We actually have the trademark for the first use of an adaptive learning platform. In those days, we built the tools for doing that and deployed them in the early 2000s, up to a certain level. That level was really all about getting accurate placement. Unfortunately, at the time, the federal government looked at adaptive technologies and whether it involved anything with an assessment that was used for something for real. Quote unquote, their own words, coming straight from people at the Department of Education when I was at [George W. Bush's first education secretary, Rod] Paige’s office with 9 other people from the industry, the words were, “We don’t believe in treating people differently.” So a lot of that initiative we had in the early 2000s didn’t have open ears, and that business ultimately probably focused a little bit more on what was going on with writing technologies. Up until a point a couple of years ago, we ended up building other tools that really did the same thing and could pre-calculate from structured data billions of possible pathways to success that a student could take based on hundreds of thousands of factors.

For instance, we first applied some of these tools to trading systems in the financial markets, but if you had 500 pieces of information that you were tracking on any one thing—let’s say, in this case, a student—and you had tried to manipulate just two things at any one time, you’d need 125,000 reports to completely describe that data set. And we would’ve pre-calculated all those possible pathways. If you tried to manipulate four things at a time, there are about 2.5 billion possibilities, and we can literally manipulate thousands of things at once.
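The figures Murphy cites line up with simple counting of unordered combinations. The sketch below is only an illustration of that arithmetic, under the assumption that "manipulating" two or four things at a time means choosing them from the 500 tracked factors without regard to order; the interview does not say how the company's platform actually counts or stores these pathways, and the function name is hypothetical.

```python
from math import comb

# Figure quoted in the interview: 500 tracked pieces of information.
TRACKED_FACTORS = 500


def combinations_of(factors: int, at_a_time: int) -> int:
    """Count the unordered ways to pick `at_a_time` items from `factors`.

    Assumption: order does not matter, so this is the binomial coefficient.
    """
    return comb(factors, at_a_time)


pairs = combinations_of(TRACKED_FACTORS, 2)
quads = combinations_of(TRACKED_FACTORS, 4)

print(f"Two at a time:  {pairs:,}")   # 124,750
print(f"Four at a time: {quads:,}")   # 2,573,031,125
```

Under that assumption, two at a time gives 124,750 combinations, which rounds to the 125,000 quoted, and four at a time gives roughly 2.57 billion, in line with "about 2.5 billion."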

That data platform was really called iSeek, and it’s the first intelligent agent platform for dealing with huge volumes of structured and unstructured data—and, in particular, those types of data where you’re trying to unify it into a single index. So, with that sort of being married to what we did with the adaptive learning platform, we really launched into this initiative where we could deliver the very detailed, high-volume adaptive learning environments. It was really the data piece that allowed us to do more than what we were really capable of doing back in 2001 or 2002.

What do you see as some of the biggest challenges in the adaptive learning space right now?

MURPHY: The big challenge is really getting critical mass. By the way, when people like Knewton announced that they were doing something at Arizona State University (it was their big thing), every college in the state of Florida had just signed up to use our tool. Now, every high school is using it. Not from a cradle-to-graduation perspective; they, again, sort of stopped at this placement or remediation to allow people to register for classes. But we have every school in the state of Florida, and the same thing in the state of Virginia: every college. So while these other guys are starting out in bits and pieces, we’ve had wholesale sort of acceptance, but we’re not very good at marketing or publishing, as you might suspect after looking at the information that’s on your Apple iPhone. We’re one of those companies that sort of remained below the radar, I suppose.

Especially with software that can grade written responses. It’s not something I had really heard about being used much before.

MURPHY: So, the first thing was IntelliMetric—that was the software that allowed computers to actually grade the essays. Then, what we did is we merged IntelliMetric into an environment where there was—I’ll call it next-generation grammar and proofreading tools that would provide feedback. As a matter of fact, for that sort of merged technology, we just received a patent this week, and the patent is specifically for machine grading and then providing feedback on improving the writing.

So it’s really a machine-based tutoring patent, and it’s sort of all encompassing. It’s a good patent to have, and we’re proud of that one because we have about 50 other patents that deal with natural language understanding, but a lot of those things are so varied in technologies that there’s not much to talk about. But something like this is something to talk about and really puts a good stamp on the things that we did because we did that for the first time, where we merged those things together, in like 2002, and we’ve been actually working to get that patent approved ever since.

In a very unusual case, they actually stuck an additional three years onto the award — I think because they made it overly difficult. It’s not usual for them to extend the patent beyond the normal timeframe that you would get, but in this case, they did. We actually didn’t even ask for it. That was nice to get a stamp of approval on work that we had really begun in the late ‘90s and actually culminated in this formative learning tool in the early 2000s.

But again, you know the troubles that we’re having nationwide with writing performance (actually, troubles nationwide with basic math and algebra performance, as well as reading). At one point, it seemed like that would be a slam dunk to really get acceptance of that kind of tool for writing across the curriculum, but what really happens is when people decide that they need to deploy tools like that, it’s strictly in the highest-degree problem areas and not necessarily done wholesale across a student body. So you might end up with two teachers out of 200 that know anything about the product, and what happens is you don’t get critical mass.

You asked a question about adaptive learning, and I think there’s a distinction between what other people call “adaptive learning” and what we do in an adaptive learning environment, where it’s really continuous and it’s really based on being able to process massive amounts of data. It’s really more of a data problem than it is other things. Then, we use these subject-specific intelligent agents—very highly detailed, smart bots—that use that data to make recommendations moment-by-moment. It’s really continuous and real-time adaptivity. The big problem is getting enough critical mass with schools and districts to where everybody’s using it. If you only have 5% of your student body or your teachers familiar with using the tool, it’s really not going to have the wholesale effect of teaching future generations to do better than we do.

Some of the other companies I’ve spoken with about adaptive learning solutions seem a little wary of components that grade written assessments. What do you think has held back the development of this sort of technology?

MURPHY: The first was disbelief. “It’s impossible.” And we were the ones to cross all those boundaries first. Some of these companies are single-algorithm companies. We use literally hundreds of algorithms to solve that problem. IntelliMetric’s the gold standard for that. That’s why we’re the only ones grading high-stakes assessments.

The next piece was really the fact that people are afraid of the data and how it could be used. I think it’s the lack of trust that people have in one another in terms of using the data properly. The data absolutely needs to be created and it needs to be utilized to the students’ benefits. The problem is the adults keep getting in the way, because people are worried about their jobs, job security—whatever it is, they get in the way. So that’s been a problem.

The last problem was really just if you wanted to do it wholesale, you really had to plan in a way where you were going to make available computer labs and other things to students. So there was a planning issue, and sometimes when people are flailing, it’s difficult to plan. I think the availability of technology has made it easier for people to accept these things, but you still have the adults that will get in the way. It’s one of the reasons why we have a pretty major adoption of our adaptive learning environment in a whole series of probably 80 or 90 charter schools in California that started using it in the last 8 or 9 months. The thing there is that people are able to make a wholesale decision. They don’t have the politics—especially the petty politics—that get in the way of doing something that’s going to make things better for students and future generations.

How do you see adaptive learning changing the role of instructors, and where do you see adaptive learning going in the next 10 years?

MURPHY: I think that the role of instructors changes in the sense that they start getting a whole lot more information about student performance. It’s not too different from the story that we would’ve told back in 2000, when we were doing the writing environments. Teachers don’t need more practice grading things. They need to basically be able to look at data and then figure out how they can fine-tune the things machines can do for them. Very few people today, if given a thousand numbers to add together, wouldn’t pull out a calculator or spreadsheet, right? But in education, they keep looking back to that thinking that they need to pull out the calculator or the slide rule. We’ve gone beyond that point, so they have to embrace what the information can do for them and learn how to fine-tune, to personalize, the education they provide. I think it can be very empowering for both students and teachers.

What we saw with writing is that students… a lot of students don’t like to revise their work. I knew with my own children that it was a big argument to get them to revise their work. They’d tell you that a draft that was awful was perfectly acceptable to their teacher and they didn’t want to revise it. With the machine, oddly enough, we’ll see that students will revise their work anywhere from 10 to 25 times, and the ones revising their work 25 times tend to be the students that started out performing the worst. They’re empowered by the feedback that they get and the fact that it comes in real time. They’ve almost played the machine like a game, because they want to “beat it.”

For 12 years now, we’ve graded student feedback with machines and we give feedback in multiple domains: focus, organization, content, word usage and mechanics. What we see is that sometimes instead of students revising their work holistically across all those domains, they’ll perfect each domain, which at first was a little bit frustrating to people looking at it. “Why aren’t they looking at all the instructions?” But that was their way of doing it, and it sort of led to maybe a few more revisions, but they increased their scores dramatically.

Again, one of the big challenges is getting the critical mass where the machine-based learning tools are used wholesale, really, across the industry. It needs to be something that every teacher embraces and every student is using.
