GRAF EDITH AURORA (US)
HANSEN ERIC G (US)
SHUTE VALERIE J (US)
US5059127A
CLAIMS
What Is Claimed Is:
1. A method of administering an assessment to a student, the method comprising:
for each of one or more tasks, calculating the expected weight of evidence for the task
based on a student model pertaining to a student;
selecting a task based on the calculated expected weights of evidence;
administering the selected task to the student;
collecting evidence regarding the selected task;
updating the student model pertaining to the student based on the evidence;
determining whether additional information is required to assess the student;
if so, repeating the above steps; and
if not, assigning a proficiency status to the student based on the student model.
2. The method of claim 1 wherein the evidence comprises a scored response to the selected task.
3. The method of claim 1, further comprising:
scoring a response to the selected task.
4. The method of claim 1 wherein the student model comprises a Bayesian inference network.
5. The method of claim 1 wherein determining whether additional information is required to assess the student comprises determining whether a threshold has been passed.
6. The method of claim 1 wherein determining whether additional information is required to assess the student comprises determining whether a time limit has been exceeded.
7. The method of claim 1 wherein determining whether additional information is required to assess the student comprises determining whether each of the plurality of tasks has
been selected.
8. The method of claim 1 wherein calculating the expected weight of evidence comprises
calculating

    WE(H : T) = Σ_{j=1}^{n} log( P(t_j | h) / P(t_j | h̄) ) · P(t_j | h),

wherein n is a number of potential outcomes for a particular task, j is an outcome
index for the task, t_j is a value corresponding to outcome j, P(t_j | h) is a probability that the
outcome occurs if a hypothesis is true, and P(t_j | h̄) is the probability that the outcome occurs
if the hypothesis is false.
9. The method of claim 1 wherein the student model comprises one or more variables,
wherein each variable corresponds to a proficiency for the student, wherein each variable
includes a plurality of probabilities, wherein each probability corresponds to the likelihood
that the student has a particular proficiency level for the proficiency.
10. The method of claim 1 wherein the proficiency status comprises one or more of the
following:
a high level of proficiency;
a medium level of proficiency; and
a low level of proficiency.
11. A processor-readable storage medium containing one or more program instructions for
performing a method of administering an assessment to a student, the method comprising:
for each of one or more tasks, calculating the expected weight of evidence for the task
based on a student model pertaining to a student;
selecting a task based on the calculated expected weights of evidence;
administering the selected task to the student;
collecting evidence regarding the selected task;
updating the student model pertaining to the student based on the evidence;
determining whether additional information is required to assess the student;
if so, repeating the above steps; and
if not, assigning a proficiency status to the student based on the student model.
12. The processor-readable storage medium of claim 11 wherein the evidence comprises a
scored response to the selected task.
13. The processor-readable storage medium of claim 11, further containing one or more programming instructions for scoring a response to the selected task.
14. The processor-readable storage medium of claim 11 wherein the student model comprises a Bayesian inference network.
15. The processor-readable storage medium of claim 11 wherein determining whether
additional information is required to assess the student comprises one or more programming
instructions for determining whether a threshold has been passed.
16. The processor-readable storage medium of claim 11 wherein determining whether
additional information is required to assess the student comprises one or more programming
instructions for determining whether a time limit has been exceeded.
17. The processor-readable storage medium of claim 11 wherein determining whether
additional information is required to assess the student comprises one or more programming
instructions for determining whether each of the plurality of tasks has been selected.
18. The processor-readable storage medium of claim 11 wherein calculating the expected
weight of evidence comprises one or more programming instructions for calculating

    WE(H : T) = Σ_{j=1}^{n} log( P(t_j | h) / P(t_j | h̄) ) · P(t_j | h),

wherein n is a number of potential outcomes for a particular task, j is an outcome
index for the task, t_j is a value corresponding to outcome j, P(t_j | h) is a probability that the
outcome occurs if a hypothesis is true, and P(t_j | h̄) is the probability that the outcome occurs
if the hypothesis is false.
19. The processor-readable storage medium of claim 11 wherein the student model
comprises one or more variables, wherein each variable corresponds to a proficiency for the
student, wherein each variable includes a plurality of probabilities, wherein each probability corresponds to the likelihood that the student has a particular proficiency level for the
proficiency.
20. The processor-readable storage medium of claim 11 wherein the proficiency status comprises one or more of the following:
a high level of proficiency;
a medium level of proficiency; and
a low level of proficiency.
METHOD AND SYSTEM FOR DESIGNING ADAPTIVE, DIAGNOSTIC
ASSESSMENTS
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to, and incorporates herein by reference in its
entirety, United States Provisional Application No. 60/654,982, entitled "Designing Adaptive,
Diagnostic Math Assessments for Sighted and Visually Disabled Students" and filed on
February 22, 2005.
BACKGROUND
[0002] In the United States, student difficulties in mathematics tend to emerge in
middle school. For example, the results from a study regarding trends in international
mathematics and science indicate that while U.S. fourth graders perform above the
international average in mathematics, U.S. eighth-grade students perform at or below the
international average. By the end of high school, U.S. students perform far below the international average.
[0003] In part, this downward trend among U.S. students may result from a shift in
the content that is being presented. Until the fourth grade, mathematics instruction focuses on arithmetic. In middle school, the mathematics curriculum typically becomes more visual
(e.g., students learn to interpret and construct graphs) and more abstract (e.g., students learn
to interpret and represent algebraic expressions).
[0004] One problem with current teaching methods is that by the time results of
high-stakes accountability tests are disseminated, classroom teaching methods cannot
generally be changed to address weak areas or misconceptions of students. For example, if students in a particular class have difficulty understanding and applying the quadratic
equation and such deficiency and/or misconception is discovered upon the administration of a
high-stakes examination or an examination presented at the end of a semester or other grading
period, it is difficult for the teacher to receive and comprehend the results and incorporate this knowledge into a lesson plan given an established course curriculum. In contrast, determining that the deficiency and/or misconception exists while the material is being taught
could permit additional or varied instruction to be provided in a classroom setting.
Accordingly, enhancing student learning of mathematics material that is more visual and
more abstract may permit students to actively solve problems and receive timely diagnostic feedback that can further the learning process.
[0005] In addition, some students can be heavily impacted by the emphasis on
graphic and/or abstract mathematics. For example, the increased visual nature of the content
can provide a distinct disadvantage to students that are interested in mathematics, but have visual disabilities.
[0006] Presenting alternative representations of the same or similar concepts in tasks, examples, and the like can augment comprehension and accommodate various
disabilities. For example, when transforming content from a visual format to an auditory
format, it is important to provide representations that convey the same meaning. In this manner, no student is unfairly advantaged or disadvantaged because of the format of the
assessment task. For example, the notion of providing equivalent representations is a central
requirement of the World Wide Web Consortium's (W3C) Web Content Accessibility
Guidelines. Under these guidelines, Web content authors provide text equivalents or text
descriptions for non-text content (images, audio, video, animations, etc.).
[0007] Such text equivalents are rendered as visually displayed text, audio and/or
Braille. Furthermore, audio presentations are carried out by having the text description read
aloud via a live reader, pre-recorded audio or synthesized speech. However, the use of a text
description rendered in audio to convey the meaning of a graph for a person who is blind can
be confusing. Such an audio representation can exceed certain of the test taker's cognitive
capacities. For example, a text representation of FIG. 1 could read as follows:
This figure shows a straight line drawn on a two-axis system, with a horizontal axis labeled X and a vertical axis labeled Y. All four quadrants are shown. The line begins in the third quadrant and moves upward and to the right; it crosses the negative X-axis, passes through the second quadrant, crosses the positive Y-axis, and ends in the first quadrant. Three points are shown, two on the line and one in the fourth quadrant. The point on the line in the first quadrant is labeled X, Y; the point on the line in the third quadrant is labeled X-sub-one, Y-sub-one. The point in the fourth quadrant is labeled X, Y-sub-one. In addition, two dashed line segments are shown, one that drops vertically from the point X, Y and connects it to the point X, Y-sub-one, and one that moves horizontally to the right from the point X-sub-one, Y-sub-one and connects it to the point X, Y-sub-one. This forms a right triangle with the solid line as a hypotenuse, the horizontal dashed line as the base, and the vertical dashed line as a side.
[0008] Navigating through the audio presentation can be cumbersome, regardless of
whether, for example, a live reader is asked to repeat portions of the presentation or a pre-recorded audio presentation is navigated from a cassette tape. However, improvements can
be obtained. The student can be allowed to control the rate of speech and to navigate through
the content in different ways (e.g., sentence by sentence or word by word). A pre-recorded audio presentation can be similarly improved over an audiocassette by providing similar
navigation capabilities, such as through a digital talking book technology. If the student reads
Braille, the text description of the graphic can be conveyed via Braille in either a hard copy or refreshable format.
[0009] However, a limitation of all of these approaches is that they merely provide access to the text description of the graphic rather than to the graphic itself.
[0010] What is needed is a system and method of applying an evidence-centered
design (ECD) approach to task development to further the learning process.
[0011] A need exists for an adaptive algorithm for task selection that can be used
with an ECD system.
[0012] A need exists for a system and method of providing assessment services,
adaptive e-learning and diagnostic reports.
[0013] A further need exists for a system and method that provides reasonable accommodations to students that would otherwise be prevented from learning or being
assessed due to the nature of the particular subject matter.
[0014] The present disclosure is directed to solving one or more of the above-listed
problems.
SUMMARY
[0015] Before the present methods, systems and materials are described, it is to be
understood that this disclosure is not limited to the particular methodologies, systems and
materials described, as these may vary. It is also to be understood that the terminology used
in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope.
[0016] It is also noted that as used herein and in the appended claims, the singular
forms "a," "an," and "the" include plural references unless the context clearly dictates
otherwise. Thus, for example, reference to a "task" is a reference to one or more tasks and
equivalents thereof known to those skilled in the art, and so forth. Unless defined otherwise,
all technical and scientific terms used herein have the same meanings as commonly
understood by one of ordinary skill in the art. Although any methods, materials, and devices
similar or equivalent to those described herein can be used in the practice or testing of
embodiments, the preferred methods, materials, and devices are now described. All publications mentioned herein are incorporated by reference. Nothing herein is to be
construed as an admission that the embodiments described herein are not entitled to antedate
such disclosure by virtue of prior invention.
[0017] Enhancing student learning of mathematics material that is more visual and
more abstract may permit students to actively solve problems and receive timely diagnostic
feedback. In addition, presenting alternative representations of the same or similar concepts in tasks, examples, and the like may augment comprehension and accommodate various disabilities. Adjusting learning environments and/or content to suit an individual student's
needs may substantially improve learning as well.
[0018] In an embodiment, a method of administering an assessment to a student may
include calculating the expected weight of evidence for each of one or more tasks based on a
student model pertaining to a student, selecting a task based on the calculated expected
weights of evidence, administering the selected task to the student, collecting evidence
regarding the selected task, updating the student model pertaining to the student based on the
evidence, and determining whether additional information is required to assess the student. If
additional information is required to assess the student, the above steps may be repeated to
select and administer a new task. Otherwise, a proficiency status may be assigned to the student based on the student model.
[0019] In an embodiment, a processor-readable storage medium may contain one or
more program instructions for performing a method of administering an assessment to a
student. The method may include calculating the expected weight of evidence for each of one
or more tasks based on a student model pertaining to a student, selecting a task based on the
calculated expected weights of evidence, administering the selected task to the student,
collecting evidence regarding the selected task, updating the student model pertaining to the student based on the evidence, and determining whether additional information is required to
assess the student. If additional information is required to assess the student, the above steps
may be repeated. Otherwise, a proficiency status may be assigned to the student based on the
student model.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] Aspects, features, benefits and advantages of the embodiments described
herein will be apparent with regard to the following description, appended claims and
accompanying drawings where:
[0021] FIG. 1 depicts a diagram used in an exemplary task.
[0022] FIG. 2 depicts relationships among the ECD models according to an
embodiment.
[0023] FIG. 3 depicts a flow diagram for an exemplary method of determining a
next task based on the expected weight of evidence according to an embodiment.
[0024] FIG. 4 depicts an exemplary student model according to an embodiment.
[0025] FIG. 5 depicts a tactile graphic for use as an exemplary accommodation
according to an embodiment.
DETAILED DESCRIPTION
[0026] An "adaptation" or "adaptive capability" may include a system's capability to
adjust itself to suit particular characteristics of a learner and may include the customization of
instructional material (e.g., content selection, sequencing and/or format) to suit different learner characteristics.
[0027] "E-learning" or "electronic learning" may include the delivery of any
instructional and/or training program using one or more interactive computer-based
technologies. E-learning may be used where networking or distance communications are
involved. For example, e-learning may include, without limitation, distance learning and/or
Web-based learning.
[0028] A "task" or an "item" may each include a question that elicits and/or prompts
for an answer and/or a response.
[0029] Adjusting learning environments and/or content to suit an individual
student's needs may substantially improve learning. Aptitude-treatment interaction (ATI) may be used to further a student's understanding of mathematics material. In ATI, aptitude
may refer to any individual characteristic that accounts for the level of student performance in
a given environment, and treatment may refer to the variations in, for example, the pace,
format and/or style of instruction. Different treatments may be more or less suited to different
process visual information, but can hear well, and equivalent content is available in visual and
auditory formats, ATI may recommend that the content be delivered in the auditory format for that person.
[0030] Methods of customizing content may include determining what to present (referred to herein as microadaptation) and determining how to best present it (referred to
herein as macroadaptation). Microadaptation has been a fairly elusive goal among educators
for some time, as can be seen in Bloom, B. S., "Learning for Mastery," Evaluation Comment,
vol. 1(2), pp 1-12 (1968); Bloom, B.S., "The 2-Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring," Educational Researcher, vol. 13(6)
pp 4-16 (1984); and Tobias, S., "Interest, Prior Knowledge, and Learning," Review of
Educational Research, vol. 64(1), pp 37-54 (1994). However, as described herein, an
embodiment incorporating differential sequencing of content depending on each learner's
needs may be implemented using adaptive instructional techniques.
[0031] Microadaptation may be one method for customizing content.
Microadaptation may include the real-time selection of content (i.e., during the learning
process) in response to a learner's inferred knowledge and skill state. Microadaptation may
also be referred to as domain-dependent adaptation. According to microadaptation principles,
decisions about content selection may be based upon performance and subsequent inferences
of students' knowledge and skill states as compared to the level that should have been
achieved when instruction is complete. For example, if a student incorrectly solves a difficult
assessment task pertaining to a particular concept or skill, a plurality of alternatives may be
indicated to increase the student's skill, such as presenting new instructional material on the
concept, administering a slightly easier assessment task directed to evaluating the same proficiency, and the like. Alternatively, additional practice or remedial instruction may be
warranted. When a student is believed to have mastered a particular topic or otherwise achieved an "acceptable" level of performance, the student may be guided to new subject
matter.
[0032] A second approach to adapting content may be macroadaptation, which may
include the customization of content according to more stable learner qualities, such as
cognitive or perceptual abilities. In contrast with microadaptation, macroadaptive decisions
may be domain-independent and based on learner information that is usually, but not always,
collected before instruction begins. Macroadaptation may relate to decisions about the format and/or sequence of the content presented to the learner. Relevant learner information, such as
cognitive variables, perceptual abilities, personality variables, and learning style, may be
initially collected from a student. Subsequently, these data may be used to make informed
decisions regarding the type of content or instructional environment that is best suited to the individual.
[0033] An implementation that considers these two forms of adaptation may be used
to substantially improve the learning process. Microadaptation may be used to determine
what to present to a learner and when to present it. For example, a microadaptive algorithm
may select an assessment task that provides the most additional information about a particular
learner at any given point in a learning and/or assessment process. In contrast,
macroadaptation may be used to determine how it should be presented. For example, an assistive technology may be used to present mathematical content to students with visual disabilities. Table 1 summarizes some general differences between microadaptive and
macroadaptive approaches.
Person Characteristic:
  Microadaptation (i.e., domain-dependent): System may adapt to fairly malleable person characteristics such as knowledge, skills, and abilities that are the focus of instruction and assessment.
  Macroadaptation (i.e., domain-independent): System may adapt to fairly stable person characteristics such as cognitive variables, perceptual abilities, personality variables, and learning style.

Adaptive Decision:
  Microadaptation: Microadaptive decisions may occur during instruction (through diagnostic assessment).
  Macroadaptation: Macroadaptive decisions may occur mainly prior to instruction (based on pre-existing data sources or pre-instruction assessment).

Consequence of Adaptation:
  Microadaptation: Decision may affect what content is presented (e.g., determination of when the student is ready to proceed to the next part of the curriculum).
  Macroadaptation: Decision may affect how content is presented (e.g., differential sequencing or alternative presentation format).

Theoretical Underpinnings:
  Microadaptation: Adaptation may be based on theoretical and empirical information relating to learning and pedagogical principles that provide information about what to instruct or assess and why.
  Macroadaptation: Adaptation may be based on theory and research on ATIs, assessment validity, and other information from individual learner differences.
Table 1. Alignment of Adaptation Type by Learner/System Feature
[0034] As such, well-founded diagnostic assessments of proficiencies may be developed. Good assessments may be used to obtain relevant information that permit
inferences to be made regarding students' knowledge and skill states. Moreover, accurate
inferences of current knowledge and skill states may support microadaptive decisions that promote learning.
[0035] Evidence-centered design (ECD) may attempt to obtain, among other things,
clear answers to three basic assessment questions: (a) what is desired to be determined about
persons taking the assessment, (b) what observations (behaviors or work products) provide
the best evidence for these determinations, and (c) what kinds of tasks allow necessary
observations to be made or pertinent evidence to be collected. For example, suppose a
measure of students' knowledge of U.S. state capitals is desired. Evidence of high proficiency may include a given student correctly listing the names of all capital cities by
state. This evidence may be obtained orally, on paper and/or via computer using free recall
and/or matching tasks. The ensuing score on this assessment may be interpreted in relation to
pre-established scoring rules.
[0036] In order to apply an ECD framework to the design of assessment tasks, a
subject matter expert, such as a teacher or test developer, may create, for example, three
models: (a) a student model, which may define the range and relationships of the knowledge
and skills to be measured, (b) an evidence model, which may specify the performance data
associated with these knowledge and skills for varying levels of mastery, and (c) a task model, which may define the features of task performance situations that may elicit relevant
evidence.
[0037] FIG. 2 depicts relationships among the ECD models according to an
embodiment. As shown in FIG. 2, assessment design may flow conceptually from student
models through evidence models to task models, although the flow may be less linear and
more iterative in practice. Conversely, diagnosis or inference may flow in the opposite
direction. In other words, when a diagnostic assessment task is administered, the action(s)
performed by a student during the solution process may provide evidence that is analyzed by
the evidence model. The results of this analysis may include scores and/or other data that are communicated to the student model to update relevant proficiencies. An adaptive algorithm
may be invoked to select a new task to be presented to the student based on the updated
proficiency values in the corresponding student model. The cycle may repeat until the tasks
are completed, time has run out, mastery has been achieved and/or some other termination
criterion has been met.
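The administer-score-update-select cycle described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation; the function and method names (administer, update, needs_more_info, and so on) are hypothetical interfaces assumed for the sketch.

```python
def administer_assessment(tasks, student_model, needs_more_info, assign_status):
    """Hypothetical sketch of the adaptive assessment cycle.

    tasks           -- candidate task objects (interface assumed, not from the disclosure)
    student_model   -- object tracking proficiency estimates (e.g., wrapping a Bayesian network)
    needs_more_info -- termination test: returns False once enough evidence is collected
    assign_status   -- maps the final student model to a proficiency status
    """
    remaining = list(tasks)
    while remaining and needs_more_info(student_model):
        # Select the task with the largest expected weight of evidence
        # under the current student model.
        task = max(remaining, key=lambda t: t.expected_weight_of_evidence(student_model))
        remaining.remove(task)
        evidence = task.administer()      # administer the task and collect the scored response
        student_model.update(evidence)    # propagate the evidence into the student model
    return assign_status(student_model)   # e.g., low / medium / high proficiency
```

The loop ends when the task pool is exhausted or the termination criterion (a threshold passed, a time limit exceeded, or all tasks selected) is met, mirroring the termination conditions described above.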
[0038] In this manner, a psychometrically sound approach for designing assessments
and modeling student performance may be provided. The ECD approach may provide a
framework for developing assessment tasks that are explicitly linked to claims about learner
proficiencies via an evidentiary chain.
[0039] A student model may refer to a record of what a student is believed to know
and/or not know in relation to some referent knowledge and skill map, which may be referred
to as a proficiency model. A student model may be modeled using a Bayesian inference
network (BIN). BINs may be employed to represent, monitor and update the student model
and to compute probabilistic estimates of proficiency (e.g., the probability that a student has a "very strong" grasp of a particular concept may be 95%) at various points in time. A
Bayesian approach to student modeling may be used in an e-learning system to inform
microadaptive decisions - enabling the system to choose the best piece of content, such as the
most helpful and informative assessment task, to present next.
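As a single-node illustration of how such probabilistic proficiency estimates may be updated: a full system would propagate evidence through an entire Bayesian inference network, whereas this sketch updates just one proficiency variable, and the likelihood values are assumptions chosen for the example.

```python
def bayes_update(prior, likelihood):
    """Update one proficiency variable's distribution given a single observation.

    prior      -- {state: P(state)}, e.g. {"low": .85, "medium": .10, "high": .05}
    likelihood -- {state: P(observed evidence | state)}

    Returns the normalized posterior distribution. A real system would use a
    Bayesian inference network over many linked variables; this is one node.
    """
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}
```

For example, starting from the distribution low = .85, medium = .10, high = .05 and observing a correct response that is more likely under higher proficiency, probability mass shifts away from "low" toward "medium" and "high".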
[0040] An evidence model may be described in relation to the observable features of
students' work products (or behaviors) that constitute evidence about proficiencies.
Proficiencies may be represented as nodes or variables in the student model. Thus, evidence
models may attempt to determine which behaviors and/or performances reveal targeted
proficiencies, and what connections exists between those behaviors and the student model
variables. An evidence model may thus define an argument regarding why and how the observations in a given task situation (i.e., student performance data) constitute evidence
about student model variables. For example, an evidence model may assist in determining
what is known about a student's "knowledge of U.S. state capitals" if the student can freely
recall 40 of the 50 state capitals. The evidence model may also assist in determining whether
such a performance is better or worse than matching 48 capitals to their appropriate state when each is displayed.
[0041] Evidence models may include evidence rules and statistical sub-models. An
evidence rule may determine how the results of a given performance are extracted from (or
identified in) a particular work product. Thus, evidence rules may emphasize how the student
performs or responds. A statistical sub-model may express how the observable variables
depend on or link to student model variables. As such, statistical sub-models may link the
extracted data to targeted proficiencies denoting what the student knows and how well the
student is believed to know it.
[0042] A given work product may yield one or more observable variables. For
example, if a student writes a short essay, the essay may become the work product for a writing assessment task and may be evaluated in terms of various proficiencies, such as spelling, grammar, syntax and/or semantics. These proficiencies may be assessed and
updated individually and/or may be considered as a more general "writing skills" proficiency.
Accordingly, the evidence rules may differ to focus on individual or holistic rubrics. An exemplary holistic evidence rule for "highly proficient" writing may include: "The essay is
clear and concise, with perfect spelling; and no grammar, syntax or semantic errors present."
[0043] Evidence models may thus represent an evidentiary chain between tasks and
proficiencies. Moreover, a necessary condition for an evidence model may be that it shares
the same work-product specifications as a particular task model. In other words, what the
[0044] Tasks may be the most obvious part of an assessment and may be used to
elicit evidence (observables) about proficiencies (unobservables). A task model may provide
a framework for describing the situations in which students act in terms of, for example, (a)
the variables used to describe key features of a task, such as content, difficulty, and the like, (b) the presentation format, such as directions, stimuli, prompts, and the like, and (c) the
specific work or response products, such as answers, work samples, and the like. As such,
task specifications may establish what a student is asked to do, what kinds of responses are
permitted, what types of formats are available, whether the student will be timed, what tools
are allowed (e.g., calculators, dictionaries, word processors, etc.), and the like. Multiple task
models may be employed in a given assessment.
[0045] Different task models may produce different tasks, which may vary along a
number of dimensions (e.g., media type and difficulty level). For example, the following
three tasks may define three levels of difficulty in a student model variable: "Find the
common difference in an arithmetic sequence:"
EASY- Find the common difference for the following arithmetic sequence:
1, 7, 13, 19, 25, ... Enter answer here:
INTERMEDIATE - Find the common difference for the following arithmetic
sequence:
0.00, 0.49, 0.98, 1.47, 1.96, ... Enter answer here:
DIFFICULT— Find the common difference for the following arithmetic sequence:
0.03, 0.95, 1.87, 2.79, 3.71, ... Enter answer here:
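For reference, the common difference of each sequence above can be checked mechanically. This small helper is an illustration, not part of the disclosure:

```python
def common_difference(seq):
    """Return the common difference of an arithmetic sequence, or None if the
    terms are not equally spaced (rounding guards against floating-point noise)."""
    diffs = [round(b - a, 10) for a, b in zip(seq, seq[1:])]
    return diffs[0] if len(set(diffs)) == 1 else None
```

The easy sequence has common difference 6, the intermediate sequence 0.49, and the difficult sequence 0.92.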
[0046] Note that the relationship between student model variables and tasks such as those listed above may be that student model variables represent the concepts or skills being
examined. The online manifestations of those variables may be the assessment tasks with
which students interact and that elicit evidence about the variables. Thus, student model
variables may be assessed (and their states inferred) in relation to a learner's performance on relevant tasks.
[0047] In an embodiment, the student model may be represented as a BIN. In an
embodiment, one or more student model variables may have probabilities for each of, for
example, three proficiency level states: low, medium, and high. For example, a student who
struggles with a specific concept or skill (e.g., knows U.S. state capitals) may have the
following probability distribution assigned to this variable: low (p = .85), medium (p = .10),
high (p = .05). More or fewer proficiency level states may be used for each student model variable within the scope of this disclosure as will be apparent to those of ordinary skill in the
art.
[0048] In an embodiment, additional nodes may be used to provide granulated
information regarding a student's abilities. For example, if knowing each state and its capital
were each targeted as being important, fifty additional nodes may be represented (i.e., one per
state, residing under the parent node: "knows U.S. state capitals"). In an embodiment, other proficiency level states may exist between the individual states and the global (parent) node
as well. For example, additional nodes may be used to assess students' knowledge of state
capitals by region (e.g., "mid-Atlantic states," "New England states"). The student model may be used to reflect this hierarchy, and evidence may be collected and included at each
corresponding proficiency level state to answer questions regarding the student's
understanding of the subject matter. Each variable may include its own probability
distribution. For the distribution described above (low = .85, medium = .10, high = .05), the
distribution may be interpreted to mean, "It is likely this student currently does not know all of the U.S. state capitals."
[0049] Such probability distributions may be dynamically updated based on the
current, specific performance data (evidence) that influence the student model. Maintaining
an updated record of proficiency levels may help determine proper interventions. For
example, students performing lower than expectations (students having a high probability of a
low proficiency level) may benefit from remedial instruction; students performing
consistently with expectations (students having a high probability of a medium proficiency
level) may need to continue practicing the current skill/concept; and those performing higher
than expectations (students having a high probability of a high proficiency level) may be
ready to move to more advanced material. However, a more concrete method may be used to determine the most suitable task to present next to a learner at a given time.
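The intervention logic described in this paragraph can be expressed as a simple rule on the most probable state (the labels and the threshold-free rule are illustrative assumptions):

```python
def suggest_intervention(dist):
    """Map a proficiency distribution to an instructional intervention."""
    most_probable = max(dist, key=dist.get)  # state with highest probability
    return {
        "low": "remedial instruction",
        "medium": "continue practicing the current skill/concept",
        "high": "move to more advanced material",
    }[most_probable]

plan = suggest_intervention({"low": 0.85, "medium": 0.10, "high": 0.05})
# -> "remedial instruction"
```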
[0050] In an embodiment, the next task to be selected may be the task for which the
expected weight of evidence is maximized. The expected weight of evidence (WE) may be
defined as:

WE(H : T) = Σ_{j=1..n} log[ P(t_j | h) / P(t_j | h̄) ] · P(t_j | h)

Here, T may refer to a task performance, and H may refer to the main hypothesis. Either the main hypothesis is true (h) or the alternative hypothesis is true (h̄). The variable n may refer to the number of possible outcomes for each task. In an embodiment, two possible outcomes may exist for each task: correct or incorrect. Other embodiments may include a plurality of possible outcomes within the scope of this disclosure. The variable j may represent the outcome index for a particular task, and the variable t_j may be the value of the outcome.
[0051] In an embodiment, the weight of evidence for a particular task outcome may
be the log-odds ratio of the probability that a particular outcome will occur given that the hypothesis is true, to the probability that the same outcome will occur given that the
alternative hypothesis is true. Thus, the expected weight of evidence, WE(H : T), for a
particular task may be the average weight of evidence across possible task outcomes.
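The expected weight of evidence defined in paragraphs [0050] and [0051] can be transcribed directly (natural logarithms assumed; the outcome probabilities under each hypothesis would be supplied by the student model):

```python
import math

def expected_weight_of_evidence(p_given_h, p_given_not_h):
    """WE(H : T): average weight of evidence across task outcomes.

    p_given_h[j]     = P(t_j | h), outcome probability if the hypothesis holds
    p_given_not_h[j] = P(t_j | h-bar), probability under the alternative
    """
    return sum(p * math.log(p / q) for p, q in zip(p_given_h, p_given_not_h))

# Two outcomes (correct, incorrect): a "high proficiency" student answers
# correctly with probability 0.9; a non-high student only 0.4 of the time.
we = expected_weight_of_evidence([0.9, 0.1], [0.4, 0.6])
```

A task whose outcome distribution is the same under both hypotheses has zero expected weight of evidence, which matches the intuition that such a task is uninformative.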
[0052] With respect to the earlier example, when an instructional unit on U.S. state capitals has been completed, an assessment may be administered to determine whether the
students demonstrate high levels of proficiency on tasks assessing relevant content. A
hypothesis of interest (h) may be that the students are high on their state capital proficiencies, and the alternative hypothesis (h̄) may be that they are not high.
[0053] In an embodiment, each student may take the assessment one task at a time.
In an embodiment, upon the completion of each task by a student, two possible outcomes may exist: either the student solved it correctly or incorrectly (t_j = 1 or 0). Tasks may be rank-ordered based on the difficulty levels for all of the tasks. The difficulty levels may be based
on, for example, familiarity, frequency and/or saliency data. For example, if the assessment
were administered in New Jersey, an easy item may include identifying Trenton as New
Jersey's state capital. A more difficult item may include, for example, identifying the capital
of South Dakota.
[0054] Determining a proper question to ask first may depend upon the goal of the
assessment. For example, if the goal of the assessment is to determine whether the material
has been mastered by a majority of the students, asking a particularly easy question that each student is likely to answer correctly may not provide additional information regarding the
students' proficiency levels. Accordingly, it may be desirable to pose a more difficult
question. Determining whether an additional question should be posed to a student and, if so,
the difficulty level of such a question may be based on the student model proficiency levels
for the particular student, as updated based on the outcome of the posed question, and on the one or more goals of the assessment as a whole.
[0055] On the basis of each outcome event, and in conjunction with the difficulty of
the current task and the current proficiency level values in the student model, which are
unique to each student based on their responses and any prior information that had been received by the model, the WE may be calculated for the remaining set of assessment tasks.
Accordingly, the next task selected (if any) may be the task that has the highest WE value
(i.e., the task providing the most information in relation to the specific hypothesis).
[0056] For example, if a student has a low proficiency level and misses a difficult
item pertaining to the proficiency, the next task that may be selected (via the WE calculation)
may be one directed to assessing the same proficiency, but including an easier representation. For example, in the example described above, the student may initially be asked to recall the
capital of South Dakota in response to an open-ended prompt (i.e., "What is the capital of
South Dakota?"). This may represent a difficult task. If the student answers incorrectly, the
student may be presented with an easier, forced-choice variant, such as, "Which city is the
capital of South Dakota: (a) San Francisco, (b) Pierre, (c) Baltimore?"
[0057] Using WE may have advantages of being multidimensional, dynamic and
flexible. In other words, WE may work with multidimensional BINs and allow estimation of
a variety of student model variables (rather than being limited to a single, general
proficiency). Moreover, the model for a particular student may evolve over time by updating
its variable estimates in response to actual performance data. Finally, the WE approach may allow specification of a hypothesis of interest as opposed to requiring a default or fixed
hypothesis.
[0058] FIG. 3 depicts a flow diagram for an exemplary method of determining a
next task based on the expected weight of evidence according to an embodiment. The weight of evidence may be calculated for each task. The task with, for example, the highest WE may
be selected. The selected task may be administered to a student, and evidence may be
collected. In an embodiment, the evidence may include the response to the selected task,
other information pertaining to the task and/or to the student and/or any other relevant
information. The response may be scored based on a heuristic. The student model, such as a BIN, may be updated to include the received information and/or evidence. It may be
determined whether obtaining additional information would be beneficial to assessing the
proficiency level of a student. If additional tasks would be beneficial, the process may repeat
by calculating the weight of evidence for each remaining task (i.e., each task that has not
already been administered to the student). Otherwise, the process may terminate.
Termination may also occur if a threshold is exceeded, if time runs out and/or if no more
tasks remain for assessing proficiency.
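The flow of FIG. 3 can be sketched as a loop. The helper callables (weight_of_evidence, administer_task, update_model, needs_more_info) are stand-ins for the components described in the text, not a prescribed API:

```python
def run_assessment(tasks, model, weight_of_evidence,
                   administer_task, update_model, needs_more_info):
    """Repeatedly select, administer and score the task with the highest WE."""
    remaining = list(tasks)
    while remaining and needs_more_info(model):
        # Select the task providing the most information for the hypothesis.
        task = max(remaining, key=lambda t: weight_of_evidence(model, t))
        remaining.remove(task)
        evidence = administer_task(task)       # scored response, etc.
        model = update_model(model, task, evidence)
    return model  # a proficiency status may then be assigned from the model
```

Termination falls out of the loop condition: the process stops when no tasks remain or when needs_more_info signals that a threshold, time limit or similar stopping rule has been reached.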
[0059] In an embodiment, two stages may characterize the design of an ECD-based
assessment: domain analysis and domain modeling. Domain analysis may include a process
of identifying, collecting, organizing and/or representing the relevant information in a domain based on information received from domain experts, underlying theory, supplementary
material and the like. In domain modeling, relationships may be established among one or
more student proficiencies, the evidence for the one or more proficiencies and/or the kinds of
tasks that elicit relevant evidence. Graphic representations and schema may be used to
convey complex relationships.
[0060] In an embodiment, the domain analysis phase may include considering the
range of constructs that may be measured by the assessment. Relevant constructs may be
identified via expert practitioners, supporting materials, research articles, state and national
testing standards and/or practical requirements and constraints. For example, when designing
an assessment that covers eighth-grade mathematics, teachers teaching students at that grade
level may be consulted to determine the appropriate subject matter for the assessment. In an
embodiment, a practical constraint may include limiting the scope of the assessment to 2-3
weeks of material, which may correspond to the approximate length of time that most
teachers will spend on a classroom unit of instruction.
[0061] In an embodiment, "sequences as patterns" may be selected as a topic for an
assessment. Prerequisites for the subject and the requisite skills to assess may be determined.
Sample tasks and supplementary materials may be developed to assist in designing the
instructional unit. Further, a determination of the proficiencies that may be appropriate to
include on a pretest and/or an interim test designed for the instructional unit on sequences may be developed.
[0062] Once the breadth and depth of the proficiencies to test are determined,
domain modeling may be performed. In the domain modeling phase, assessment designers
may use information from the domain analyses to establish relationships among proficiencies, tasks and evidence. The designers may develop high-level sketches of the interrelationship among the proficiencies that are consistent with what they have learned about the domain.
Ultimately, the designers may create graphic representations to convey these complex
relationships. The designers may further develop prototypes to test assumptions.
[0063] Key proficiencies and the manner in which they should be linked and
organized may be determined for a student model. For example, a graphic representation may
be created defining links between proficiencies. Once the student model is established, the
evidence and task models may be defined. FIG. 4 depicts an exemplary student model
according to an embodiment. Features of the student model depicted in FIG. 4 may include
the following: (1) the model may be hierarchical, with each child node having only one parent node; (2) the root node, which represents the proficiency sequences as patterns, may have three child nodes, each corresponding to a different sequence type; and (3) the
proficiencies under each sequence type in FIG. 4 may be identical except that no analog may
exist for common difference (arithmetic) or common ratio (geometric) in other recursive
sequences. This may be because the other recursive sequences proficiency may be more
broadly defined and may pertain to sequences taught at the eighth-grade level that are recursively defined but are neither arithmetic nor geometric. Examples of other sequences
may include Fibonacci numbers, triangular numbers, and simple repeating patterns. Non-
hierarchical relationships, different numbers of child nodes per parent node and/or different
proficiencies among child nodes may be implemented in a student model within the scope of
this disclosure. In other words, FIG. 4 is merely exemplary of a student model and not
limiting on the scope of this disclosure, which includes the embodiment shown in FIG. 4 and
numerous other embodiments.
[0064] Brief descriptions of exemplary student proficiencies are provided in Table 2
below. In an embodiment, three levels of proficiency (e.g., low, medium and high) may be associated with each student variable. For each proficiency level of each student model
variable, a claim may be specified describing what the student should know and be able to do.
An exemplary claim for a student with a high level of proficiency at finding explicit formulas
for geometric sequences (i.e., the node labeled explicit in the geometric branch of the student
model of FIG. 4) may include: "The student can correctly generate or recognize the explicit formula for the nth term in a geometric sequence. The student can do this in more challenging
situations, for example, when the signs of the terms in the sequence are alternating, or when
the starting term and the common ratio are unequal."
Name in tree: Arithmetic
Full name: Solve problems with arithmetic sequences
Description: A student with this set of proficiencies can work with arithmetic sequences at the eighth-grade level. An arithmetic sequence may be defined by a starting term a_1 and a common difference, d. The terms of an arithmetic sequence may be as follows: a_1, a_1 + d, a_1 + 2d, a_1 + 3d, ..., a_1 + (n-1)d.

Name in tree: Pictorial
Full name: Represent pictorial patterns as sequences (arithmetic, geometric, other recursive)
Description: A student with this set of proficiencies can interpret a graphic (e.g., a succession of patterns of dots) as a sequence of a particular type.

Name in tree: Algebra rule
Full name: Generate a rule for a sequence as a function or expression (arithmetic, geometric, other recursive)
Description: A student who has this skill can express rules for generating terms in a sequence algebraically; the rule in this case takes the form of an algebraic expression.

Name in tree: Explicit
Full name: Generate a formula for the nth term of a sequence (arithmetic, geometric, other recursive)
Description: A student with this proficiency can use an algebraic expression to represent the nth term of a sequence. For example, 5 + 2(n - 1) is an explicit rule for the nth term of an arithmetic sequence with an initial term of 5 and a common difference of 2. In general, an explicit rule for the nth term of an arithmetic sequence is a_n = a_1 + (n - 1)d (where d is the common difference), and an explicit rule for the nth term of a geometric sequence is a_n = a_1 * r^(n-1) (where r is the common ratio).
Table 2. Example Proficiency Descriptions
[0065] As described earlier, the evidence model may specify behaviors that indicate the level of mastery associated with a particular proficiency. The evidence model may include, for example, two parts: evidence rules and a statistical sub-model. The evidence
rules may be characterized at each of the three levels, per proficiency. Evidence associated
with each level for two proficiencies is shown in Table 3.
Proficiency: Represent pictorial patterns as arithmetic sequences
Evidence rules for high proficiency level: The student can produce a pattern that represents an arithmetic sequence, can recognize arithmetic sequences represented as pictorial patterns, and can recognize the equivalence between numeric and pictorial representations.
Evidence rules for medium proficiency level: The student recognizes that the pictorial patterns have mathematical significance, but cannot consistently explain how or why.
Evidence rules for low proficiency level: The student does not infer any mathematical significance from the pictorial patterns.

Proficiency: Generate and justify examples of geometric sequences
Evidence rules for high proficiency level: The student can generate geometric sequences. If a list of terms is given, all terms in the sequence are correct. If a formula is given, it is well formed and correctly specifies an appropriate example.
Evidence rules for medium proficiency level: The student generates something that may be a sequence but not necessarily a geometric sequence, or generates a sequence that is geometric but has some incorrect terms due to arithmetic errors, or generates a formula that is close to expressing the correct sequence.
Evidence rules for low proficiency level: The student generates something that does not express a sequence or generates a sequence that does not include a multiplicative operation as at least part of the rule.
Table 3. Evidence Rules Specified for Two Sample Proficiencies, at Each Level of Mastery
[0066] The statistical sub-model may define a set of probabilistic relationships
among the student model variables (nodes) and observables. Prior probabilities (priors) may be estimated for the parent node (i.e., sequences as patterns). In cases where the prior
distribution is not known in advance, values of approximately 1/n may be assigned for each of
the n possible states (i.e., .33, .33 and .34 for 3 states). The priors may specify the
probabilities that a student is in the low, medium and high states for the parent node
proficiency.
[0067] In an embodiment, for each of the other nodes in the model, two values may
be entered. One value may be an indicator of the relative difficulty of the tasks associated
with that particular node, and the other may be a correlation that indicates the strength of the
relationship between the node and its parent node. These values may be used to produce a set
of conditional probability tables, where one table may exist for each node except for the root node. Because each node in the exemplary embodiment has three levels associated with it,
each conditional probability table may have nine probability estimates (3 parent node levels multiplied by 3 child node levels). For example, a cell in the table associated with the
"model" node under "arithmetic" sequences may indicate the probability (expressed as a
value between 0 and 1) for high-level proficiency for tasks of type "model" given a medium-
level proficiency for "arithmetic" sequences. Students with high proficiency levels may be
considered likely to solve both hard and easy tasks, while students with low proficiency levels
may be considered likely to solve only easy tasks.
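The text does not specify how the difficulty and correlation values map to the nine table entries, so the scheme below is purely illustrative: each row is a normalized score that rewards agreement with the parent level and penalizes high child levels for difficult tasks.

```python
LEVELS = ("low", "medium", "high")

def conditional_probability_table(difficulty, correlation):
    """Build a 3x3 CPT of P(child level | parent level); both inputs in [0, 1]."""
    table = {}
    for pi, parent in enumerate(LEVELS):
        scores = []
        for ci in range(len(LEVELS)):
            closeness = 1.0 - abs(pi - ci) / 2.0   # agreement with parent level
            ease = 1.0 - difficulty * ci / 2.0     # difficulty depresses high levels
            scores.append((1.0 - correlation) + correlation * closeness * ease)
        total = sum(scores)
        table[parent] = {child: s / total for child, s in zip(LEVELS, scores)}
    return table

cpt = conditional_probability_table(difficulty=0.5, correlation=0.8)
```

The result has one row per parent level and one entry per child level, i.e. the nine probability estimates described above, with each row summing to 1.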
[0068] A task model may provide a specification of the types of tasks that measure
the behaviors described in the evidence model. The task model may describe the features for each type of task included in an assessment. For example, the task model may describe
different item types included in an assessment, the nature of the stimulus, the stem and/or the
options (if any). The task model may also describe how the student is required to respond to
each type of task. For example, a multiple choice item may require the student to select an option, while a numeric entry item may require a student to enter a number instead. An
exemplary item may include the following: "Find the missing terms in the following
arithmetic sequence: 4.68, , , 13.74, 16.76, 19.78." The item type, the nature of the
stem and/or the number of responses may be exemplary task model variables included in the task model specification. The exemplary item above may be a numeric entry item because the
student is required to enter numbers rather than selecting an option. Two responses may be
required for the above item (one for each blank). As shown, the stem may include both
numbers and text, but no graphics. The stem may include one or more words, numbers, pictures and/or tables.
[0069] In an embodiment, a plurality of tasks may be included per proficiency at
each level of difficulty. In FIG. 4, the thirty-two proficiencies may represent the children of the main nodes (i.e., Sequences as Patterns, Arithmetic, Geometric and Other Recursive
sequences). Accordingly, if two tasks are included per proficiency at each level of difficulty, 192 tasks (i.e., 32 proficiencies, multiplied by 3 levels and 2 tasks per level) are required for the particular embodiment shown in FIG. 4. Tasks may be selected from previously
generated task items or may be developed independently. In an embodiment, tasks may be
developed using quantitative item models, such as the item models described below. In an
embodiment, items may be automatically generated and formatted from the item models
using software designed for this purpose.
[0070] The term item model may refer to a class of content-equivalent items that share an underlying problem structure and/or schema. A quantitative item model may be a
specification for a set of items that share a common mathematical structure. Items in a model
may also share one or more formats, variables and/or mathematical constraints. A set of item models may be used to define the task model for an assessment. The variables in a
quantitative item model may specify the range of permissible values that may replace the
variable in an individual item. The constraints in a quantitative item model may define
mathematical relationships among the variables. The number of items described by an item
model may depend on how the variables and constraints have been defined.
[0071] Once an item model is defined, instances that are described by the item
model may be automatically generated. A description of an item model may be programmed
into software that generates the instances. In addition to providing an organized structure for
item development, an automatic approach to item generation may provide considerable practical advantages because the generating software may perform the necessary
computations and format the items automatically. In an embodiment, ECD may be used as
the guiding framework to inform the structure of item models.
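Automatic instance generation from a quantitative item model can be sketched as follows, using the "extend the arithmetic sequence" model of Table 4 below (the record layout and function name are assumptions):

```python
import random

def generate_item(rng):
    """Instantiate one item from the 'extend the sequence' item model."""
    a1 = rng.randint(1, 9)   # A1: integer between 1 and 9, inclusive
    d = rng.randint(2, 9)    # D: integer between 2 and 9, inclusive
    a2, a3 = a1 + d, a1 + 2 * d          # constraints A2 = A1 + D, A3 = A2 + D
    return {
        "terms": (a1, a2, a3),
        "stem": (f"Extend the arithmetic sequence by finding the next term: "
                 f"{a1}, {a2}, {a3}, ..."),
        "key": a3 + d,                   # constraint Key = A3 + D
    }

item = generate_item(random.Random(0))
```

The number of distinct items such a generator can produce is bounded by the variable ranges and constraints, here 9 x 8 = 72 combinations.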
[0072] Table 4 may depict a simplified example of an item model with two items that could be generated using the model. This item model may generate easy items that link
to the "extend" node under "arithmetic" sequences.
Model
Template: Extend the arithmetic sequence by finding the next term: A1, A2, A3, ...
Variables and constraints: A1 is an integer between 1 and 9, inclusive; D is an integer between 2 and 9, inclusive; A2 = A1 + D; A3 = A2 + D; Key = A3 + D

Example item 1
Template: Extend the arithmetic sequence by finding the next term: 1, 4, 7, ...
Variables and constraints: A1 = 1; D = 3; 4 = 1 + 3; 7 = 4 + 3; 10 = 7 + 3

Example item 2
Template: Extend the arithmetic sequence by finding the next term: 5, 14, 23, ...
Variables and constraints: A1 = 5; D = 9; 14 = 5 + 9; 23 = 14 + 9; 32 = 23 + 9
Table 4. An Example of an Item Model and Two Items

[0073] With respect to macroadaptation, an exemplary adaptation may include
accommodating for visual disabilities, i.e., blindness and low vision. In an embodiment,
content may normally be presented visually and may require students to use, for example, a
mouse, a keyboard and/or another input device to answer, for example, single selection
multiple-choice items. In an embodiment, students may be required to use a keyboard and/or another input device to answer, for example, numeric entry items. One or more
accommodations for making test content accessible to individuals with visual disabilities may
be implemented. For example, individuals with low vision may use screen enlargement
software, which may allow users to enlarge a portion of a display screen. Moreover,
individuals who are completely blind or who are otherwise unable to benefit from screen enlargement software may be able to access an audio rendering of content and/or tactile graphics (e.g., raised-line drawings).
[0074] The usability of specific accommodations may be considered when
determining the validity of test scores (i.e., the degree to which accumulated evidence and
theory support specific interpretations of test scores entailed by proposed uses of a test)
obtained under accommodated conditions. For example, it may be important to ensure that
the accommodation is usable and overcomes one or more accessibility barriers. However, it
may also be important to ensure that an accommodation does not provide an unfair advantage
for the person that receives the accommodation. For example, allowing a person with a math-
related disability (e.g., dyscalculia) to use an electronic calculator on a mathematics test may make the test accessible and usable; however, if the test is intended to measure mental computation, the electronic calculator accommodation may tend to provide an unfair
advantage for that person, thereby potentially invalidating the results.
[0075] An ECD-based validity framework may be used that closely examines evidentiary arguments. Careful attention to the definition of the construct (e.g., skills or
abilities that are or are not part of what is intended to be measured) may be required.
[0076] The exemplary "sequences as patterns" assessment may be used to measure
cognitive abilities (e.g., reasoning and knowledge of various sequences) rather than assessing the senses of sight, hearing and/or touch. As such, it may not be unreasonable, for example,
to provide accommodations that reduce or eliminate the requirements for sight (imposed by the visually displayed text and graphics under standard testing conditions) and instead rely on
other capabilities, such as hearing and touch, when delivering test content.
[0077] Another relevant piece of evidence for this assertion may be that the ability to
decode (decipher words from characters) may not be considered to be part of "knowledge of sequences." If decoding were defined as being an essential part of that construct, use of an audio accommodation may threaten the validity of the assessment; specifically, the audio presentation may read whole words at a time thereby reducing or eliminating the need for the
student to demonstrate their decoding ability.
[0078] In an embodiment, ensuring valid assessment results may depend on a
plurality of additional and/or alternate factors. For example, having adequate practice and
familiarization materials, adequate time and the like may be required as accommodations.
[0079] In an embodiment, the ability to work quickly may not be essential to
"understanding sequences as patterns." Furthermore, a person who is blind and using tactile
or audio-tactile graphics may be likely to require more time to complete an assessment than a
non-disabled person receiving the test under standard conditions. Accordingly, extra testing time may be an appropriate testing accommodation.
[0080] Audio rendering of content may be termed a "read-aloud" accommodation
because it involves reading the content aloud to the student. The accommodation may be
implemented via a live human reader, prerecorded human audio and/or synthesized speech. In an embodiment, the audio rendering may verbalize text content (i.e., straight text) and non-text content, such as images, audio and/or video/animations. As discussed above, non-text
content may be translated into text equivalents, which seek to convey the same meaning as
the non-text content through text. An audio rendering of a mathematics test may also include
specially scripted descriptions of mathematical expressions and tables. If the audio rendering has been crafted to convey all necessary content, a person who is visually disabled may use it
without relying on, for example, tactile graphics. However, understanding graphical material
(pictures, graphs, etc.) may be significantly easier when an audio description is supplemented
with tactile graphics. Tactile graphics may be printed or pressed onto paper or plastic and
may be felt with the fingertips. Tactile graphics may include Braille labels. Hard copy Braille versions of test content may provide an alternate accommodation; however, many individuals who are blind do not read Braille or have very limited Braille literacy.
[0081] In an embodiment, a hybrid method of access combining tactile graphics and
audio may be used. In such an audio-tactile graphics embodiment, the student may touch a specific location on a tactile graphic and hear a description pertaining to that location. The
student may quickly navigate from location to location to hear as much or as little of the
description as desired. Such audio-tactile graphics may facilitate access to graphics-intensive
content. In an embodiment, a tactile tablet (such as the Talking Tactile Tablet made by Touch
Graphics, Inc. of New York, New York) may be used to implement a system using audio- tactile graphics.
[0082] The tablet may provide audio (read-aloud), tactile and visual modification
capabilities. Such capabilities may be particularly useful for test content that uses graphics,
tables and mathematical expressions, which are often difficult to convey via words alone.
[0083] Developing an application using a tactile tablet may require the development
of a tactile graphic. In an embodiment, a tactile graphic may be a sheet of hard plastic that
uses raised lines and textures to represent points, lines and regions of a graphic, such as is
shown in FIG. 5. A special printing process may be used to print the graphical material in ink
on the tactile graphic to assist visually disabled individuals with some sight. In an
embodiment, some features of the graphic may be controlled by an external personal computer. A
developer may specify the active regions on the graphic in software and may map each active region to one or more prerecorded audio segments.
[0084] For example, a student using such a system may press on the angle depicted
in the lower-right corner of FIG. 5 and hear the words "110 degrees" in prerecorded audio.
This may enable a student who has a visual impairment (or another disability that impairs
processing of visually-rendered content) to receive specific and interactive audio descriptions
of content that would ordinarily be presented only visually. A tactile tablet system may allow the student to navigate through the test and select an answer using tactile (raised-line)
controls on the tablet. In an embodiment, a student using the tactile tablet system may only
use a keyboard and/or other input device, for example, when answering constructed-response
items.
[0085] In an embodiment, the basic audio-tactile capabilities of the tactile tablet
system may be augmented with capabilities designed to make the system suitable for
achievement testing. For example, the system may enable a test and item directions to be
received, navigation between and within items to be performed, typed responses to be
received (if applicable) and answers to be confirmed. Synthesized speech may permit students to hear an audio representation of a response as it is entered.
[0086] In an embodiment, the microadaptation and macroadaptation modules may
be integrated into a single system. For example, a microadaptation implementation that selects content for presentation to a learner during as part of an assessment may be integrated
with a macroadaptation module such as the tactile tablet. Accordingly, blind and/or other
visually disabled learners may benefit from the use of an adaptive content presentation unit
based on the student model as updated by responses provided by the learner. In an
embodiment, different microadaptation and/or macroadaptation modules may be used. For
example, a module that translates an assessment into a foreign language for non-native
speakers may be utilized as a macroadaptation module for an assessment.
[0087] It will be appreciated that various of the above-disclosed and other features
and functions, or alternatives thereof, may be desirably combined into many other different
systems or applications. It will also be appreciated that various presently unforeseen or
unanticipated alternatives, modifications, variations or improvements therein may be subsequently made by those skilled in the art which are also intended to be encompassed by
the disclosed embodiments.