My panel was highly interactive, consisting mostly of questions and answers between the moderator (Jerry Heneghan of Virtual Heroes) and the panelists (Elaine Raybourn of Sandia National Laboratory, Jeff Taekman of Duke, Priscilla Elfrey of NASA, and me). As a result, I didn't give a single contiguous talk on assessment per se. But if I string together my comments, they would look something like this:
3Dsolve's flagship project, which we finalized this summer, was developed for the US Army Signal Center and School at Fort Gordon, GA. It comprised 110 hours of simulation-based task training for the 25B10 MOS, Information Systems Operator/Analyst. We've now started work on the follow-on to that, which is 120 hours of instruction for the 25B30-level MOS.
A widely-quoted statistic is that we retain approximately 5 percent of what we hear in a lecture, but retain approximately 75 percent of what we learn through hands-on training. At a 30,000-foot view, our goal is to get as close to that 75 percent figure as possible with virtual hands-on training. In that light, assessment for us can be viewed as, "How close to 75 percent are we getting on a per-student basis?"
It's important to keep in mind the distinction between assessment and validation, which are separate concepts but are unfortunately sometimes used interchangeably in discussions. Assessment is the process of evaluating the performance of an individual student. Did he/she learn the material? Did he/she achieve the learning objectives? Validation is the process of evaluating the courseware as a whole. Is it valid? Was it properly designed given the learning objectives?
Our task training software for the Signal School uses what is known as the FAPV model: Familiarize, Acquire, Practice, and Validate (this is an unfortunate misuse of the term validate, but then we didn't invent the model). In Familiarize mode, students can explore freely to discover the environment for themselves. In Acquire mode, we guide them through a particular lesson, telling them exactly what to do for each step of the process. In Practice mode, students navigate and interact on their own, but can receive hints from the software as needed. In Validate mode, we provide no hints whatsoever, and the student is expected to be able to execute all lesson steps without assistance.
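The four FAPV modes can be summarized by which forms of support each one offers. Here is a minimal sketch in Python; the flags and names are my own illustration, not 3Dsolve's actual implementation:

```python
from enum import Enum

# Hypothetical encoding of the FAPV modes described above.
class Mode(Enum):
    FAMILIARIZE = "free exploration of the environment"
    ACQUIRE = "guided step-by-step through the lesson"
    PRACTICE = "independent, with hints available on request"
    VALIDATE = "independent, no hints whatsoever"

def hints_available(mode):
    """Acquire guides every step; Practice offers hints as needed."""
    return mode in (Mode.ACQUIRE, Mode.PRACTICE)

def actions_tracked(mode):
    """Practice and Validate log student actions for assessment."""
    return mode in (Mode.PRACTICE, Mode.VALIDATE)
```

The key progression is that support decreases as tracking increases: by Validate mode, the student is on his/her own and everything is recorded.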
In both Practice and Validate modes, we track virtually every action taken by the student: Where did he/she go? What did he/she look at? When did he/she take a given action? What did he/she click on and in what sequence? All this data is used for assessment purposes.
We use a documented XML-based format that we developed to define our lessons. A typical lesson might consist of 30-60 individual steps. The student is expected to navigate through the environment as required and perform the steps, which may be linear, non-linear, or a combination of both, depending on the particular lesson. A "happy path" defines the nominal sequence of actions (or sequences, for non-linear content) that equates to a correct traversal of the lesson content. We use this same XML-based content not only to guide the student through the lesson in Acquire mode, but to evaluate the student's performance in Practice and Validate modes -- by comparing the student's actions to the nominal happy path.
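To make the idea concrete, here is a small sketch of how a lesson definition and happy-path comparison might work. The XML schema, element names, and scoring rule below are all hypothetical illustrations -- the actual 3Dsolve format is not shown in this post:

```python
import xml.etree.ElementTree as ET

# Hypothetical lesson definition; element and attribute names are
# illustrative only, not the real 3Dsolve schema.
LESSON_XML = """
<lesson id="configure-router">
  <step id="1" action="open-console" target="router-01"/>
  <step id="2" action="enter-command" target="enable"/>
  <step id="3" action="enter-command" target="configure terminal"/>
</lesson>
"""

def happy_path(lesson_xml):
    """Extract the nominal (action, target) sequence from the lesson XML."""
    root = ET.fromstring(lesson_xml)
    return [(s.get("action"), s.get("target")) for s in root.findall("step")]

def assess(student_actions, lesson_xml):
    """Score logged student actions against the happy path.

    Greedily matches each happy-path step, in order, against the
    student's action log, and returns the fraction of steps matched --
    a simple stand-in for the real per-student assessment.
    """
    path = happy_path(lesson_xml)
    matched, j = 0, 0
    for step in path:
        k = j
        while k < len(student_actions) and student_actions[k] != step:
            k += 1
        if k < len(student_actions):
            matched += 1
            j = k + 1
    return matched / len(path)

# A student who skipped step 2 matches two of the three steps.
log = [("open-console", "router-01"), ("enter-command", "configure terminal")]
print(assess(log, LESSON_XML))
```

The same lesson definition drives both guidance (walk the student down the path in Acquire mode) and assessment (compare the student's logged actions to the path in Practice and Validate modes), which is the design point of using one content format for both.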
We provide assessment results both to students directly within the courseware itself, as soon as they work through a given lesson, and to the instructors by uploading results to the Army's Learning Management System (LMS). What we have found is that the process as a whole, and notably the assessment data, changes the role of the instructor. Instead of leading classes through "death by PowerPoint" (as they put it), instructors become coaches, able to spend their time with the students who need their help the most, when they need it.
To use an analogy, I like to think of our approach as precision guided munitions for learning. By focusing instructors' time where it is needed the most, simulation-based learning with integrated, real-time assessment dramatically increases instructor effectiveness.
From an article on precision guided munitions (not included in my talk): "In the fall of 1944, only seven per cent of all bombs dropped by the Eighth Air Force hit within 1,000 feet of their aim point; even a 'precision' weapon such as a fighter-bomber in a 40 degree dive releasing a bomb at 7,000 feet could have a circular error (CEP) of as much as 1,000 feet. It took 108 B-17 bombers, crewed by 1,080 airmen, dropping 648 bombs to guarantee a 96 per cent chance of getting just two hits inside a 400 x 500 feet German power-generation plant; in contrast, in the Gulf War, a single strike aircraft with one or two crewmen, dropping two laser-guided bombs, could achieve the same results with essentially a 100 per cent expectation of hitting the target, short of a material failure of the bombs themselves."