Discovery Tools for Science Education Content
A pioneering project to develop grade level stratification tools
for STEM education in grade levels K thru College
Interim Project Report
October 2008
Principal Investigator: David E. Wojick, Ph.D., PE
davidwojick@craigellachie.us
391 Flickertail Lane
Star Tannery, Virginia 22654
Co-investigators
Bernadette Monahan, MA
Diane W. Adams, MA
The STEM Education Center
http://www.stemed.info
Funded in part by
Office of Scientific and Technical Information
U.S. Department of Energy
http://www.osti.gov
Project: Discovery tools for science education content.
Table of Contents
A. Introduction: Significance of the Grade Level Stratification Problem & Opportunity
B. Technical Approach
C. Anticipated Public Benefits
D. Project Narrative: Results to Date and Demonstration of Technical Feasibility
1. Introduction: The STEM GLS challenge
2. Assessing the primitive state of grade level stratification today
3. Selected topic case study: magnetism and electricity
4. Focus on computer based GLS
5. Focus on state standards of learning (SOL): they are the basis for teaching and testing, hence the greatest user need
6. First SOL case: Virginia K-12 science
7. Step one: the initial grade level stratification -- 4th grade electricity concepts and clusters using the ST, VT, GT, DT and IT method.
8. Significant features of the 4th grade electricity stratification list
9. Step two: electricity stratification for all K-12 grades
10. First and second GLS trials: the Newton test bed
11. Second SOL case: California electricity and magnetism
12. We discover dramatic grade level differences: VA (4, 6, MS & HS) versus CA (4 & HS only)
13. First math GLS -- VA probability theory
14. College level GLS for magnetism and electricity -- a new method
E. Putting it all together: a 3-D spiral model of K-12 STEM education
F. From grade levels to learning levels
G. Conclusion: Our project demonstrates the feasibility of grade level stratification of STEM educational content
Appendix I: Basic topics for a complete K-16 science education GLS system
Appendix II: Principal Investigator and other Key Personnel
A. Introduction
Significance of the Grade Level Stratification Problem and Opportunity
Web-based educational content is being developed in vast quantities throughout
both the scientific community and the educational world. Most of this material
is being developed on an ad hoc basis and there is presently no easy or
systematic way for teachers, parents or students to find it. Specifically,
there is no search engine that finds educational content by grade level or
degree of difficulty. This is the problem we are solving. We refer to it as the
problem of "grade level stratification" or GLS.
This is a global problem of huge proportions. Simply put, there is no
reasonable way for teachers, parents or students to search the Web for content
that is suited to a specific grade level or degree of difficulty. There are
thousands of small collections scattered through the Web. The only way to
search them is one by one, which is prohibitively difficult. It is a cottage
industry with 100,000 cottages.
The Web only
works where search works. At the present time Web searching does not work for
science education. Our effort is designed to develop the tools needed to solve
this problem.
The Federal role in science education is complex. The Federal government
spends about one billion dollars per year directly on science education, but
very little directly on content development. Most of this billion dollar budget
is spent by the National Science Foundation, with some also spent by the
Department of Education. Most of this funding goes to supporting schools and
students, and to purposes other than developing STEM (science, technology,
engineering and mathematics) education content per se.
Formal
development of STEM education content is generally left to the textbook
publishers. However, since the advent of the World Wide Web an enormous amount
of content has been developed by individuals, especially by scientists and
science programs, together with science teachers and students. The Federal
government spends tens of billions of dollars on scientific research and
development each year and a significant amount of this funding is going into ad
hoc educational content development. This is certainly true for DOE, which
spends almost $10 billion per year on R&D. Every program and facility we
have examined has some educational content, as do many individual projects. But
because there are no formal programs for content development, this wealth of ad
hoc content lies hidden.
We estimate
that the Web may contain a million or more web pages and documents that are
suitable for K-16 STEM education purposes. A significant proportion of these
are funded by Federal R&D agencies, either directly or indirectly. The
technical problem is how to find this content in a workable manner.
We note in
passing that college level STEM materials are often needed by researchers as
well as by students and teachers. Researchers often explore outside of their
specialty in the course of their research. When they do so they often must
begin with an educational phase. Thus while this project narrative is mostly
concerned with the application of grade level stratification to education per
se, much of what we are doing will have extensive application to the world of
scientific and technological research and development. Likewise, while we talk
about GLS as a search problem, it is an authoring problem as well. Our tools
can be used to author content by degree of difficulty.
It is important to note that conventional search engines like Google do not
distinguish educational content from advanced scientific content that uses the
same search terms. Exploring the potential for computer based search methods
that do
make this distinction has been a central focus of our project. We have
conclusively demonstrated the feasibility of solving this problem using
systematic grade level stratification of search terms.
B. Technical Approach
This project
seeks to develop a new grade level stratification (GLS) system for finding and
collecting STEM educational content on the Web, especially ad hoc content
funded by Federal R&D agencies. We believe that there are hundreds of
thousands of such items on the Web.
The GLS
method we developed in Stage I is simple, yet powerful. We started with the
content requirements for teaching one topic (electricity and magnetism) in one
state (Virginia), in grade levels K-12. Electricity and magnetism represents
about 3% of the K-12 science curriculum, but it is taught in almost every
grade, which made it a good pilot topic. From this content we were able to
devise search term lists for the concepts taught in each grade. Using these
lists we are able to estimate the grade level of any given electricity content
in our benchmark database.
The overall
objective is to produce a working grade level stratification system for K-12
& college science, for all science topics and ultimately for all STEM
topics.
Three things remain to be done:
Technical
objective 1: First and foremost, replicate the GLS method and extend the
electricity prototype to include the whole of K-16 science. When this is done
we will have a complete GLS for K-16 science. All K-16 science content will then
be potentially searchable and findable.
Technical
objective 2: Solve the problem that different states teach the same content in
different grades. Our working hypothesis is that there are intrinsic sequences
in K-16 content, which define degrees of difficulty. Simply put, science
education builds knowledge in a specific, systematic way. This means that
appropriate content can be found without regard for the numerical grade in
which it is taught in any given state. This is true for electricity and magnetism.
We will determine the extent to which it is true for the rest of K-16 science.
Technical objective 3: Refine and improve the relatively simple search
algorithms developed so far, to improve the ranking of search results.
C. Anticipated Public
Benefits
If we are successful, the public benefits begin with greatly increased use of
the STEM education content found on the Web. This should improve STEM
education and provide a greater flow of knowledge from the scientific community
into education. It should also increase the return on Federal investment in
science. In
addition, we have determined that scientists themselves make heavy use of
undergraduate level educational material when they are moving into new fields.
Therefore, improving access to educational content has the potential to speed
up science itself.
At the
present time there are many thousands of small collections of educational
content on the web, containing hundreds of thousands of documents and other
materials. Much of this content is developed by scientists, much by teachers,
some even by students. At the present time the only way to find this material
is to search collection by collection, because search engines do not sort their
hits by grade level. Even at the collection level there is almost no
stratification by individual grades. Given that the Web only works when search
works, the Web does not presently work for science education. The content is
there but it cannot be found.
D. Project Narrative
Results to Date and Demonstration of Technical Feasibility
Our project to date has
clearly demonstrated the feasibility of our approach to creating an efficient
grade level search mechanism. In fact we have pioneered a new method of grade
level stratification (GLS) of content for science, technology, engineering and
mathematics (STEM) education. Our method will be useful not only for creating
new search tools, but for content development as well. This is explained below.
1. Introduction: The STEM GLS challenge
Our original project
proposal stated our objective as follows:
"Grade level search."
"One of our ultimate objectives
is to explore the feasibility of sorting content by grade level or at least by
ranges of grade level."
"Core concept search."
"At a minimum we hope to be able
to distinguish educational content from advanced scientific content. We expect
to be able to identify a set of core concepts that are typically always taught
at different grade ranges, however broad. The presence of these concepts and
the absence of advanced concepts should be diagnostic of educational content.
This hypothesis will be tested."
(end of proposal quotation)
What we have found is a
way to specify clusters of core concepts and words that precisely identify the
grade level of the content, for a given state. This is far more than we
expected, far more than the minimum we hoped for. In fact it sets the stage for
a new technology of grade level specific content. What we did and what we found
are described below.
2. Assessing the primitive state of grade level
stratification today
We began by surveying
the extensive education websites at several science institutions. We looked to
see if grade level stratification (GLS) was being effectively utilized to
establish the grade level readability of science education documents. In
researching these sites it became apparent that many of the documents and
even the sites themselves were not meeting their target group's needs,
especially those of elementary students.
In many cases the grade
ranges that were used were too wide to be useful. In some cases Web resources
were simply grouped as being "for Kids," with no grade level
stratification at all. In most other cases the groupings were just
"elementary, middle school, and high school." This is too broad to be
useful, especially in the elementary grades. Also, much of the content was
found to have a grade level that was higher than the category it was put under.
We also looked at a
number of Web-based collections of K-12 science education resources. Here too
we found that many resources were not correctly identified by grade level, or
were categorized too broadly to be useful.
We also found that a lot
of material mixes content that is suitable for one grade with content that is
suitable for another. For example, an article with mostly 4th grade content
will contain a high school level paragraph or two. Such material is not
suitable for either grade.
3. Selected topic case study: magnetism and
electricity
The team selected energy, with a concentration on magnetism and electricity, as
the topic case study. There were several reasons for this choice. First, the
team members are knowledgeable about the content. One member is an expert in
the field of electric power and another teaches the subject as part of the
curriculum in her grade level.
Second, magnetism and
electricity is taught in many grades, from Kindergarten through advanced
college. This makes it a good candidate for many levels of stratification.
Third, the Department of Energy has a deep interest in magnetism and
electricity. It does a lot of research in these areas of science and
technology.
4. Focus on computer based GLS
We decided to focus our
efforts on computer based grade level stratification. It was felt that the
greatest need and opportunity for results was with this challenge. While there
are powerful Web search engines, there is as yet no workable search engine that
finds STEM documents by grade level. Nor is there a writing tool that will
determine the grade level of a document, or coach a writer who is trying to
achieve a certain grade level, as a spell checker does. That is the challenge
we undertook.
One technical challenge
that has emerged is that simple stemming of terms, which is often used in
computer based search, does not work. This is because simple variants of
technical terms may be taught at very different grade levels, for example
"electron" and "electronic." This difficulty has meant that in many cases
variant search terms have to be hand crafted.
5. Focus on state standards of learning (SOL):
they are the basis for teaching and testing, hence the greatest user need
In the United States,
education is primarily a state and local responsibility. Each state is
responsible for creating content standards in every subject. Science and math
standards are required by federal law. Standards identify the academic content
for essential components of the curriculum at different grade levels. These
standards are used in every public school classroom by teachers to plan,
prepare, teach, and assess students in every subject. In turn, these standards
become the benchmark by which students are evaluated and assessed through the
federal No Child Left Behind Act of 2001 (NCLB). In the state of Virginia these
standards are called the Virginia Standards of Learning or SOLs. In this report
we use the term "SOL" to refer to all state standards of learning.
Because SOLs are so
important we selected them as the basis for our GLS work. Other candidates,
such as textbooks, Web-based teaching resources, etc., were considered too
difficult to work with and too variable. There exist a number of candidates for
federal standards, such as the AAAS Benchmarks. However, these are divided into
groups of several grades each, so are not precise enough for our needs.
Moreover, their grade levels for various topics often do not correspond to the
grade levels used in many states. In fact, as we discuss below, grade levels
vary significantly from state to state for many concepts. This is a major
finding of our project.
6. First SOL case: Virginia K-12 science
We chose to use the
Virginia K-12 science SOLs as our baseline standard for two reasons. First, one
member of the team has expertise in teaching the Virginia SOLs. Second, the
Virginia science SOLs have very detailed content, compared to some other
states. They also have even more detailed supporting guidance documentation. We
therefore focused on the Virginia K-12 science SOLs for magnetism and
electricity.
In Virginia there are
grade specific SOLs for each grade from Kindergarten through 6th grade. Then
there are topic specific SOLs for middle school and high school, not for
specific grades. This is because different students take their science topics
in different grades in middle and high school.
7. Step one: the initial grade level
stratification -- 4th grade electricity concepts and clusters using the ST, VT,
GT, DT and IT method.
Beginning with the
fourth grade science SOLs in magnetism and electricity we isolated core concept
terms that were specific to that standard. These are called simply SOL terms.
We then developed a list of variations and otherwise related terms suitable for
computer search, as explained below. (See listing)
Starting from each SOL term, we first determined many of the grammatical
variants of the term that could be used in the same grade. This allows us to
search on the SOL term (denoted ST in the listing) as well as its grammatical
variants (denoted VT). This process is analogous to stemming but is more
controlled, because terms with identical stems can have very different grade
levels. For example, "magnet" is a kindergarten term while "magneto" is a high
school term.
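The difference between stem matching and a controlled variant list can be
sketched in a few lines of Python. This is only an illustration, not the
project's software: the function names and the numeric grade codes (0 for
kindergarten, 9 for the high school band) are assumptions, and a simple prefix
test stands in for a real stemmer. Only the grade levels of "magnet" and
"magneto" come from the text above.

```python
# Hypothetical grade codes (an assumption): 0 = kindergarten, 9 = high school band.
KINDERGARTEN = 0
HIGH_SCHOOL = 9

# Controlled list: every searchable variant is entered by hand with its grade.
EXPLICIT_TERMS = {
    "magnet": KINDERGARTEN,
    "magneto": HIGH_SCHOOL,
}


def grade_by_prefix(word, stem="magnet", stem_grade=KINDERGARTEN):
    """Stand-in for naive stemming: any word that starts with the stem
    inherits the stem's grade."""
    return stem_grade if word.lower().startswith(stem) else None


def grade_by_explicit_list(word):
    """Controlled lookup: only hand-listed variants match."""
    return EXPLICIT_TERMS.get(word.lower())


for word in ("magnet", "magneto"):
    print(word, grade_by_prefix(word), grade_by_explicit_list(word))
```

The prefix rule gives both words the kindergarten grade, while the explicit
list keeps "magneto" at the high school level. The cost is that each variant
has to be entered by hand.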
We also identified some of those terms found in a dictionary (denoted DT) that
relate to the SOL term and are likely to be at the same grade level. Implied
terms (denoted IT) were included when the team determined that they were
probably needed to teach a SOL term. Guide terms (denoted GT) were added on
occasion to help clarify the ST. Guide terms were pulled from the state
guideline documents, which help teachers understand any standard that is not
specific. For example, in the case of electricity the 4th grade SOL simply
states that important historic figures will be taught. The guidelines identify
Franklin, Faraday and Edison, so these were added as GTs. Teaching about
Franklin implies teaching about lightning, so this was added as an IT, and so
on. The point is that the process of term selection is systematic, but it also
requires judgment.
In this way we developed
a method for systematically extracting and building up a comprehensive listing
of ST, VT, GT, DT and IT clusters for a given grade. This is the first step
toward grade level stratification.
Here is the complete fourth grade listing:
ST electricity
VT electric
VT electrical
VT electrically
ST conductors
VT conductor
VT conducting
VT conduct
VT conducts
ST insulators
VT insulates
VT insulate
VT insulated
VT insulating
VT insulation
IT nonconductor
ST circuits
VT circuit
DT circuit breaker
IT closed circuit
IT open circuit
IT parallel
IT series
ST static
VT statically
DT static charge
DT statically charged
DT static electricity
DT electric charge
DT electrically charged
GT battery
IT batteries
IT electric cell
IT dry cell
ST electrical energy
DT electrification
DT electrify
DT electrified
ST electromagnets
VT electromagnet
VT electromagnetism
VT electromagnetic
VT electromagnetically
DT magnetization
DT magnetizer
SOL calls for "historic figures"
ST (none)
GT Benjamin Franklin
VT Franklin
IT lightning
GT Michael Faraday
VT Faraday
GT Thomas Edison
VT Edison
IT light bulb
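A listing of this kind maps naturally onto a simple data structure. The
following Python sketch is one possible representation, not the format the
project used. It copies two clusters from the listing above and assumes that
each cluster starts at an ST entry and runs until the next one. The names
GRADE_4_LISTING and build_clusters are hypothetical.

```python
from collections import defaultdict

# Two clusters copied from the fourth grade listing. Tags follow the report:
# ST = SOL term, VT = variant, DT = dictionary, IT = implied, GT = guide.
GRADE_4_LISTING = [
    ("ST", "electricity"),
    ("VT", "electric"),
    ("VT", "electrical"),
    ("VT", "electrically"),
    ("ST", "circuits"),
    ("VT", "circuit"),
    ("DT", "circuit breaker"),
    ("IT", "closed circuit"),
    ("IT", "open circuit"),
    ("IT", "parallel"),
    ("IT", "series"),
]


def build_clusters(listing):
    """Group entries under the ST entry that opens each run (an assumed layout)."""
    clusters = defaultdict(list)
    current = None
    for tag, term in listing:
        if tag == "ST":
            current = term
        if current is None:
            raise ValueError("listing must begin with an ST entry")
        clusters[current].append((tag, term))
    return dict(clusters)


clusters = build_clusters(GRADE_4_LISTING)

# Flat, de-duplicated list of search terms for this grade.
search_terms = sorted({term for _, term in GRADE_4_LISTING})
```

Keeping the tag with each term preserves the record of why a term was
included, which matters because term selection involves judgment.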
8. Significant features of the 4th
grade electricity stratification list
This list
has several significant features. First and foremost, there are enough terms to
support computer based search, analysis and identification of documents.
Subsequent tests, described below, have borne this out. Thus we demonstrated
the feasibility of grade level stratification, the goal of our initial effort.
Indeed, we
note that a significant number of technical concepts are being taught. One of
the most important ramifications of our method is that we can actually identify
the number of major concepts that are being taught, for any topic in any grade.
This has large implications for the utility of our technology. For example, we
estimate that overall about 2000 distinct major concepts are taught in K-12
science. Given the number of hours typically devoted to science, this amounts
to roughly one major concept for every half hour of instruction. These large
numbers, combined with the multiplicity of topics to be mastered, help to
explain why science is a hard subject, which many students find difficult to
keep up with. Learning science is a marathon of sprints.
The
clustering of terms around very different major concept terms in the list is
also very important. For example, circuits are a different topic than static
electricity, or Benjamin Franklin. Looking ahead, we believe this clustering
feature will enable us to develop a search algorithm that is independent of
what combination of topics happens to be taught in a specific school in a given
grade. Ideally a user will specify a concept cluster to be found, not
a grade level. Exploring this challenge is a major technical objective of our
project.
We also note
that learning some concepts is dependent upon having already learned others.
For example, electromagnets depend on circuits, circuits depend on conduction,
and conduction depends on electricity. These concepts must be learned in
sequence. This sequential structure has important implications for the use of
our method. In particular, Project 2061 of the American Association for the
Advancement of Science (AAAS) has demonstrated and mapped this kind of
sequential dependency (at a coarser scale than grade level), as a tool for
promoting science literacy. NSDL uses these AAAS maps as search aids. We hope
to extend this work to a much finer, concept-by-concept set of learning
dependencies. Exploring this challenge is another major technical objective of
our project.
9. Step two: electricity stratification for all
K-12 grades
Next we extended the
same method of finding and clustering each ST, VT, GT, DT and IT for
electricity and magnetism to all grade levels from kindergarten to high school.
We created a master list for each grade level that teaches magnetism and
electricity. Virginia has individual SOLs for the elementary grades of K
through 6. It has a single level for middle school (grades 7-8) and another for
high school (9-12), because the same science can be taught to different
students in different grades. Thus there are nine SOL grade levels. Magnetism
and electricity are taught in all but first and fifth grade, so there are seven
grade level listings for that topic.
We noted that some Virginia SOL terms appeared in more than one grade level
listing. For our final list we decided that a term should appear only once in
the grade level clusters, at the grade where it is first taught. This
establishes it as a core concept term in that grade level. In other words, even
though a term might appear in the SOL at a later grade, it has already been
taught in an earlier grade. So it is not being taught in the later grade,
merely being used. Our goal is to identify the grade level at which each
concept is being taught. We therefore located and listed only the first
occurrence of each SOL term and its cluster of associated terms.
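The first-occurrence rule can be sketched as follows. This is a minimal
illustration under stated assumptions: grades are coded as integers (0 for
kindergarten, 9 for the high school band), the function name first_occurrence
is hypothetical, and the repeated appearances in the sample input are invented
to show the rule at work.

```python
def first_occurrence(terms_by_grade):
    """Map each term to the lowest grade code in which it is listed.

    terms_by_grade maps a numeric grade code to the set of terms in that
    grade's raw listing.
    """
    earliest = {}
    for grade in sorted(terms_by_grade):
        for term in terms_by_grade[grade]:
            # setdefault keeps the grade already recorded for a term seen earlier
            earliest.setdefault(term, grade)
    return earliest


# Hypothetical raw listings: the repeats of "magnet" and "circuit" in later
# grades are invented for illustration.
raw = {
    0: {"magnet"},
    4: {"magnet", "circuit", "electromagnet"},
    9: {"circuit", "magneto"},
}

deduped = first_occurrence(raw)
```

In this example "magnet" stays at the kindergarten level and "circuit" at
grade 4, even though both show up again in later listings.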
At this point we had a
completed K-12 grade level stratification (GLS) for electricity and magnetism.
This GLS will allow a computer to search a document and determine if it has a
grade specific vocabulary. It will also help an author design a document for a
given grade level.
Grade level defined: At
its simplest, the highest grade level that has terms in the document is the
grade level of the document. For example, suppose an article on electricity
uses terms that are taught in grades K, 2, 3 and 4, but no higher grades. This
article can only be read by someone in grade 4 or higher, because it uses 4th
grade terms. But in that group (4th grade or higher) it is only using concepts
that are taught in 4th grade, so it is not suitable for teaching higher grades.
The article is only suitable for teaching 4th grade, so that is its grade
level.
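The definition above amounts to taking a maximum over the grade levels of the
terms found in a document. A minimal Python sketch follows. It is an
illustration, not the project's implementation: the term-to-grade map is a
small stand-in for the full GLS, the grade codes (0 for kindergarten, 9 for the
high school band) are assumptions, and document_grade is a hypothetical name.

```python
import re

# Illustrative stand-in for the GLS term list (grade codes are assumptions).
TERM_GRADE = {
    "magnet": 0,
    "circuit": 4,
    "conductor": 4,
    "magneto": 9,
}


def document_grade(text, term_grade):
    """Return the highest grade code among GLS terms found in the text,
    or None when no GLS term is present."""
    found = [
        grade
        for term, grade in term_grade.items()
        # Word boundaries keep "magnet" from matching inside "magneto".
        if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
    ]
    return max(found) if found else None


print(document_grade("The circuit lights a bulb when the magnet moves.", TERM_GRADE))
```

A document with no GLS terms returns None rather than a grade, so that off
topic material is not mistaken for kindergarten content.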
10. First and second GLS trials: the Newton test
bed
We next conducted our
first trial of computer based search using our electricity GLS, in
collaboration with OSTI and their search specialist, Deep Web Technologies. It
was crudely successful, demonstrating the feasibility of GLS based computer
search but in need of refinement.
We chose as our test bed the Newton collection at Argonne National Laboratory.
We chose Newton for several reasons. First, it is a DOE laboratory product that
OSTI plans to include in its federation of DOE science education collections.
Second, Newton is a collection of about 20,000 science questions and answers,
developed over many years via what is called the DOE "Ask a scientist" program,
and each Q&A is a separate document in the Newton database. This simplifies the
search problem by eliminating well known problems due to searching different
kinds of documents. Exploring the challenge of mixed document types is a major
technical objective of the project.
For our first test we
merely used single word search, without Boolean combinations. We first
identified all the documents in Newton that contained at least one word found
in our GLS list. We then sorted these documents into grade levels based on the
highest grade level word found. The stratification of Newton documents was
basically successful and we demonstrated it at DOE's Office of Scientific and
Technical Information in Oak Ridge, Tennessee on January 3, 2008.
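The first trial procedure (select every document with at least one GLS word,
then sort by the highest grade word found) can be sketched as below. This is
not the code run by Deep Web Technologies, which is not described in this
report. The sample questions, the grade codes (0 for kindergarten, 9 for the
high school band) and the name stratify are all assumptions made for
illustration.

```python
import re
from collections import defaultdict

# Illustrative single-word GLS list (grade codes are assumptions).
TERM_GRADE = {"magnet": 0, "circuit": 4, "electricity": 4, "magneto": 9}


def stratify(docs, term_grade):
    """Bucket document ids by the highest grade code of any GLS word they
    contain. Documents with no GLS word are left out."""
    buckets = defaultdict(list)
    for doc_id, text in docs.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        hits = [term for term in term_grade if term in words]
        if hits:
            buckets[max(term_grade[t] for t in hits)].append(doc_id)
    return dict(buckets)


# Hypothetical Q&A documents, invented for this sketch.
docs = {
    "qa1": "Why does a magnet stick to the fridge?",
    "qa2": "How does a circuit carry electricity?",
    "qa3": "Why is the sky blue?",
    "qa4": "How does a magneto make a spark?",
}

strata = stratify(docs, TERM_GRADE)
```

Matching on whole words mirrors the single word search used in the first
trial, so multi-word terms such as "circuit breaker" are not handled here.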
However, one problem
appeared that we needed to address. We call it the "outlier problem."
This problem occurs when a document is not about electricity but contains a
single word that appears on our electricity GLS. Such a document may contain
advanced terms in its own topic, in conjunction with a relatively elementary
electricity term. It will be incorrectly assigned the more elementary grade,
according to its electricity term.
There are several
possible solutions to this outlier problem. For one, it should not appear when
a GLS for all of science is used. In that case, the more advanced term from
another topic should be covered by the GLS for that topic. However, it occurred
to us that the problem might also be minimized simply by searching for
documents that contain more than one term from the electricity GLS. That is,
try to find only those documents that are actually about electricity, the true
hits.
Working with Deep Web Technologies, we therefore conducted a second GLS trial,
requiring that documents be selected only if they contained at least two of
the electricity GLS terms.
This simple change greatly decreased the incidence of false hits. This means
that GLS search is roughly feasible without covering the whole of science. We
expect to further explore this challenge as a
technical objective. For example, searches requiring 3, 4 or 5 topic terms may
be sufficiently precise to rule out most false hits, without ruling out too
many true hits.
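The effect of a minimum term count on the outlier problem can be sketched as
follows. The sample documents, the grade codes and the names grade_documents
and min_terms are assumptions for illustration, and the sketch counts distinct
GLS terms, which is one possible reading of "at least two terms."

```python
import re

# Four terms from the fourth grade listing, all coded as grade 4 (an assumption).
ELECTRICITY_TERMS = {"battery": 4, "circuit": 4, "conductor": 4, "electricity": 4}


def matched_terms(text, term_grade):
    """Return the distinct GLS terms that occur as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [term for term in term_grade if term in words]


def grade_documents(docs, term_grade, min_terms=2):
    """Assign a grade only to documents with at least min_terms distinct GLS terms."""
    graded = {}
    for doc_id, text in docs.items():
        hits = matched_terms(text, term_grade)
        if len(hits) >= min_terms:
            graded[doc_id] = max(term_grade[t] for t in hits)
    return graded


# Hypothetical documents: the second is not about electricity at all.
docs = {
    "on_topic": "A battery pushes electricity around a circuit.",
    "outlier": "A battery of tests measured enzyme activity in mitochondria.",
}

one_term = grade_documents(docs, ELECTRICITY_TERMS, min_terms=1)
two_terms = grade_documents(docs, ELECTRICITY_TERMS, min_terms=2)
```

With a threshold of one, the biology document is graded as fourth grade
electricity content. With a threshold of two it drops out. Raising min_terms
further trades false hits against lost true hits, as discussed above.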
11. Second SOL case: California electricity and
magnetism
We also decided to begin
to apply the GLS method to other state standards, to assess its general
applicability. We also wanted to compare other states to Virginia, to see how
much variation there might be. The state we chose was California, for several
reasons. First, California is considered one of the five top education states
in the country, states that influence not only trends in education but also the
STEM content marketplace. (The other states are Michigan, Texas, Florida, and
New York.)
Second, one of the team members taught science in California for five years.
She was there when the first standardized test, called the Standardized Testing
and Reporting (STAR) program, was implemented. Like those of Virginia, the
California SOL are very detailed, which facilitates GLS. Some state SOL, New
York's for instance, are relatively vague in comparison. GLS in these states
will probably require extensive use of SOL guidance documents and textbooks.
This is a technical question that we may address in the future.
We used the same step-by-step procedure to isolate the key terms for magnetism
and electricity as we did for Virginia. From the California standards the team
compiled a list of the key standard terms. We then identified the related
variants, dictionary terms, implied terms and guidance terms, as before.
12. We discover dramatic grade level differences:
VA (4, 6, MS & HS) versus CA (4 & HS only)
To our surprise, the list that was generated for California was quite different
from the Virginia list. The difference was not in content, which was almost the
same, but in when the concepts are taught. The results showed that the grade in
which a concept is taught can differ greatly from one state to another. For
example, electricity is taught mainly in fourth grade and high school in
California, whereas in Virginia it is taught in a more sequential order from
kindergarten through high school, with an emphasis in fourth grade, sixth
grade, middle school, and high school.
Another difference
between the two states was when a fundamental term was being introduced for the
first time. For example, a fundamental term like magnetism is introduced in
kindergarten in Virginia, but not until fourth grade in California.
The team even performed
a pilot analysis that determined which terms had the greatest distance between
grade levels. These great differences in when a given concept is taught pose a
potentially large problem for students who move from one state to another. Some
concepts will be taught twice while others will not be taught at all. Not
learning a major concept can be a serious problem, because later concepts build
on earlier ones. Textbooks must face a similar problem. Our GLS product might
be used to alert parents and schools to this problem. Moreover, using Web based
materials to catch up is an obvious solution.
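One possible way to rank terms by the distance between the grades in which two
states first teach them is sketched below. The report does not describe how the
pilot analysis was done, so this is an assumption throughout. Grade codes are
integers with 0 for kindergarten, the name grade_gaps is hypothetical, and only
the "magnetism" entries (kindergarten in Virginia, fourth grade in California)
come from the text. The "circuit" entries are placeholders.

```python
def grade_gaps(state_a, state_b):
    """Rank the terms shared by two states by the absolute difference in the
    grade code where each is first taught, largest gap first."""
    shared = state_a.keys() & state_b.keys()
    gaps = {term: abs(state_a[term] - state_b[term]) for term in shared}
    # Sort by descending gap, then alphabetically for a stable order.
    return sorted(gaps.items(), key=lambda item: (-item[1], item[0]))


# First-taught grade codes. "magnetism" follows the report; "circuit" is a
# hypothetical placeholder.
virginia = {"magnetism": 0, "circuit": 4}
california = {"magnetism": 4, "circuit": 4}

ranked = grade_gaps(virginia, california)
```

Terms at the top of such a ranking are the ones most likely to be missed or
repeated when a student moves between the two states.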
However, this
significant finding poses a problem if GLS is going to look for terms by grade
level. Strictly speaking, a single grade level probably does not exist for
many STEM concepts in the United States. Rather, grade level is a state by
state matter. In some cases it may even be school specific within a
state. The team believes further investigation is needed to determine if this
difference extends to other states, and how many concepts are affected. Here
again our method should be of great value. This may be a serious national
problem, a form of incoherence.
Short of doing GLS for
every state, we plan to try to gear our search algorithm to concept clusters,
not to grades per se. The concept clusters do not appear to change much from
state to state. The basics of electricity and magnetism are not state specific,
only the grades in which different concepts are taught.
13. First math GLS -- VA probability theory
We also applied our GLS
method to a mathematical topic, to test its generality. Some science concepts
presuppose certain math concepts, so math grade level may help determine
science grade level. We chose probability theory in the Virginia SOL. Like
electricity, probability theory is taught from kindergarten through high
school.
The method worked well.
However, we noted one important difference in the outcome, which probably
reflects the difference in abstraction between math and science. In the
electricity science GLS most of the concepts are concrete technical terms, like
magnet, static charge, Ohm's law, etc. In probability many of the terms are
non-technical, such as event, likely and outcome. This difference might require
a difference in search algorithms, for example requiring the presence of key
technical terms. Of course there are some technical terms in K-12 probability
theory, like probability and normal distribution, but the proportion seems
relatively small compared to science. This question may be explored further in
the future.
14. College level GLS for magnetism and
electricity -- a new method
Our next step was a big
one, extending the K-12 GLS to the college level. There were several compelling
reasons to do this. First, using the K-12 SOL based term lists we had, we could
not distinguish high school level content from more advanced content, so our
K-12 GLS was still incomplete. In order to identify high school level content
we need to separate it from content that is more advanced than high school.
This requires using terms that are more advanced than high school and we had
none in the K-12 set. We needed college level terms to identify high school
level content.
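The reasoning above can be illustrated with a small sketch. It assumes, purely for illustration, that a text is assigned the highest level for which at least one listed term appears; the report does not state that this is the project's rule. The function name, the level numbers and the short term lists are hypothetical.

```python
import re

def highest_level_present(text, terms_by_level):
    """Return the highest level that has at least one of its terms in the text.

    Returns None if no listed term appears at all.
    """
    lowered = text.lower()
    found = None
    for level in sorted(terms_by_level):
        for term in terms_by_level[level]:
            if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered):
                found = level
                break
    return found

# Hypothetical term lists: one high school list and one college list.
hs_terms = {"high school": ["ohm's law", "series circuit"]}
college_terms = {"college": ["impedance"]}

text_hs = "Ohm's law relates voltage and current in a series circuit."
text_college = "Ohm's law extends to AC circuits through impedance."

# With K-12 terms only, both texts look like high school content.
k12_only = {1: hs_terms["high school"]}
print(highest_level_present(text_hs, k12_only))       # 1
print(highest_level_present(text_college, k12_only))  # 1

# Adding a college level term list separates them.
with_college = {1: hs_terms["high school"], 2: college_terms["college"]}
print(highest_level_present(text_hs, with_college))       # 1
print(highest_level_present(text_college, with_college))  # 2
```

The point of the sketch is only that the more advanced text cannot be told apart from high school content until terms above the high school level are available.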
Also, we were aware from
our earlier research that researchers themselves often need college level STEM
content, especially when they explore new topics as part of their research. An
expert in one STEM field may be a beginner in most other fields. And of course
there are the many undergraduate level students and teachers who need content
at the basic college level. We are interested in serving all these college
level users, as well as K-12 users.
We therefore determined
to try to extend the electricity GLS to include not merely a college level, but
two undergraduate college levels, basic and advanced. This would enable it to
identify basic college level content, as well as high school level content. The
latter will serve the high school student and teacher, while the former will
serve the researcher, as well as the basic level college student and teacher.
After considerable trial
and error we settled on using the indexes of several popular textbooks. We used
a basic electricity textbook for the basic college level terms. Several
advanced level undergraduate textbooks were used for the advanced term set.
These included textbooks for electrical engineering, power systems,
electromechanics and electromagnetics. In addition to the index terms we added
simple grammatical variants.
A selection issue arose at
that point, for there were several thousand terms in all of the advanced
indexes taken together. We felt this was probably more than necessary to
distinguish advanced level content from basic level, so we just used the 500 or
so advanced terms that seemed most common. Whether a larger number of advanced
terms is useful will be a technical objective question. We also pared down the
basic electricity index to about 100 terms. This was done by choosing terms
that also appeared in one or more advanced textbooks.
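The selection steps described above can be sketched in code. The following is a minimal illustration, not the project's actual procedure: it assumes that "most common" means appearing in the largest number of advanced indexes, and the function name, the `max_advanced` parameter and the sample terms are all hypothetical.

```python
from collections import Counter

def select_terms(basic_index, advanced_indexes, max_advanced=500):
    """Pick basic and advanced term sets from textbook index term lists.

    basic_index: list of terms from the basic textbook index.
    advanced_indexes: list of term lists, one per advanced textbook.
    """
    # Count in how many advanced indexes each term appears.
    counts = Counter()
    for index in advanced_indexes:
        counts.update(set(index))

    # Advanced set: terms found in the most indexes (ties broken alphabetically).
    ranked = sorted(counts, key=lambda term: (-counts[term], term))
    advanced = ranked[:max_advanced]

    # Basic set: basic terms that also appear in at least one advanced index.
    basic = sorted(term for term in set(basic_index) if term in counts)
    return basic, advanced

basic_index = ["current", "voltage", "battery", "resistance"]
advanced_indexes = [
    ["current", "impedance", "phasor"],
    ["impedance", "voltage", "flux linkage"],
]
basic, advanced = select_terms(basic_index, advanced_indexes, max_advanced=2)
print(basic)     # ['current', 'voltage']
print(advanced)  # ['impedance', 'current']
```

Grammatical variants, which the team also added, are not generated here; they would have to be appended to each list separately.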
It is an interesting
research issue to determine whether this GLS approach could be used to
distinguish advanced undergraduate college level content from the most advanced
content, namely that used in postgraduate studies and actual research results,
such as the journal literature. It is not at all clear that this professional
level research literature uses terms that are distinct from the advanced
undergraduate terms, but it may.
E. Putting it all together: a 3-D
spiral model of K-12 STEM education
In K-12 STEM education
it is common to talk about "spiraling" in the context of multiple
topics, each being taught progressively over a series of grades. Our GLS
development work supports a new, precision approach to the search, modeling,
analysis and visualization of this important concept. How STEM concepts are
clustered by grade, which topics are taught when, and how this differs from
state to state, are all part of spiraling. This is explained below, in terms of
a simple 3D spiral model.
First divide the science
to be taught into, say, 30 topics, each of which is taught progressively in
several grades over the K-12 period. Electricity is one topic and a complete
topic listing is given later in this report. Let a vertical pole 6' high
represent each topic. Place the poles in parallel, standing on the floor,
making a cluster of 30 vertical poles. Next divide each pole into vertical
segments, say 20 per pole, each of which will represent a group of concepts
that are normally taught together. In electricity, one group might be
conduction, another Ohm's law.
Label the segments in
sequence so that the concepts taught earliest for each topic are closest to the
floor, and progress upward to the last concepts at the top end of each pole.
The sequence of segments represents the fact that many concepts have to be
taught sequentially. Assume for now that there is only one such sequence for
each topic. We now have 30 poles with 20 segments each, or 600 segments in all.
This is everything that will be taught in K-12 science. Given that only one
thing can be taught at a time, the question is how to work through all the
topics and all the segments, step by step?
Now let a string
represent the actual sequence of teaching of each segment and topic. In effect
the string represents the student's learning experience, studying one segment
after another and moving from topic to topic. Attach the string to each of the
600 segments in the order in which these are taught. The basic model is now
complete. Spiraling refers to the fact that the string will leave a given pole
for a period of time then return, then it leaves again, returns again, and so
on.
The amount of spiraling
can vary enormously, depending on the overall sequence of teaching. At one
extreme, suppose each pole is completely taught before another is begun. In
this case the string would only jump 29 times, from the top of each pole
(except the last) to the bottom of another. At the other extreme the string
would jump to another pole after every segment, or 599 times.
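The two extremes can be checked with a short calculation. This is a sketch of the model as described, with a teaching sequence represented as a list of (pole, segment) pairs; the function and variable names are ours.

```python
def count_jumps(sequence):
    """Count how often the string moves from one pole to a different pole."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a[0] != b[0])

POLES, SEGMENTS = 30, 20

# One extreme: teach every segment of a pole before starting the next pole.
blocked = [(p, s) for p in range(POLES) for s in range(SEGMENTS)]

# Other extreme: move to a different pole after every single segment.
interleaved = [(p, s) for s in range(SEGMENTS) for p in range(POLES)]

print(len(blocked), count_jumps(blocked))          # 600 29
print(len(interleaved), count_jumps(interleaved))  # 600 599
```

Both sequences cover the same 600 segments and keep the segments of each pole in order; only the number of jumps differs.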
In any given curriculum
the amount of jumping is normally somewhere in between these two extremes,
perhaps several hundred jumps. All of this jumping probably contributes
significantly to the difficulty of learning. But if we minimize jumping by
teaching more segments on each pole at once, before we jump to another pole,
then we will maximize the time before we return to a pole once we leave it.
This too will contribute to the difficulty of learning. Either way there is a
potential problem. This is the dilemma of spiraling and it is fundamental to
STEM education.
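The dilemma can be made concrete with a simple family of schedules. The sketch below assumes, for illustration only, that a curriculum cycles through all poles and teaches a fixed number of segments (`block`) on each visit; real curricula are not this regular, and the function names are ours.

```python
def schedule(poles, segments, block):
    """Cycle through all poles, teaching `block` segments on each visit."""
    seq = []
    for start in range(0, segments, block):
        for p in range(poles):
            stop = min(start + block, segments)
            seq.extend((p, s) for s in range(start, stop))
    return seq

def jumps_and_longest_absence(seq):
    """Return (pole changes, most segments taught elsewhere between two visits to a pole)."""
    jumps = 0
    longest = 0
    last_seen = {}  # pole -> position of the latest segment taught on that pole
    for i, (pole, _segment) in enumerate(seq):
        if i > 0 and seq[i - 1][0] != pole:
            jumps += 1
        if pole in last_seen:
            longest = max(longest, i - last_seen[pole] - 1)
        last_seen[pole] = i
    return jumps, longest

for block in (1, 5, 20):
    print(block, jumps_and_longest_absence(schedule(30, 20, block)))
# 1 (599, 29)
# 5 (119, 145)
# 20 (29, 0)
```

As the block size grows the number of jumps falls, but the time away from each topic grows, until at a block size of 20 a topic is never revisited at all.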
Moreover, the number of
possible paths for the string, or sequences of concepts taught, is enormous.
This means that two different curricula can have very different string
sequences, even though they cover exactly the same material. Students moving
from one to another will be taught some concepts twice and others not at all.
Missing a major concept can be very troubling if later concepts depend upon it,
as often happens. Students moving from state to state, or from one school
system to another, may face this problem. So do textbooks, which cannot fit all
the different spiraling patterns required by different states. This is probably
a national problem of some significance, which may add substantially to
the challenge of STEM education.
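The effect of moving between curricula can be sketched as follows. The example assumes, for simplicity, that both curricula cover the same segments and move at the same pace of one segment per step; the function name and the four sample segments are hypothetical.

```python
def transfer_effects(old_curriculum, new_curriculum, steps_done):
    """Segments repeated or missed when a student switches after `steps_done` steps.

    Each curriculum is a list of segment names in teaching order.
    """
    learned = set(old_curriculum[:steps_done])
    # Still to come in the new curriculum, but already learned in the old one.
    repeated = [seg for seg in new_curriculum[steps_done:] if seg in learned]
    # Already taught in the new curriculum, but never learned in the old one.
    missed = [seg for seg in new_curriculum[:steps_done] if seg not in learned]
    return repeated, missed

curriculum_a = ["magnets", "static charge", "circuits", "Ohm's law"]
curriculum_b = ["circuits", "magnets", "Ohm's law", "static charge"]

repeated, missed = transfer_effects(curriculum_a, curriculum_b, steps_done=2)
print(repeated)  # ['static charge']
print(missed)    # ['circuits']
```

Here the student who moves from curriculum A to curriculum B at the halfway point is taught static charge twice and never sees circuits, even though both curricula cover exactly the same four segments.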
This simple spiral model
shows just how complex technical education is. Note too that this model applies
to math as well as science. Combining the two gives perhaps a 60 pole model. It
may also apply to the other parts of the curriculum, such as reading, history,
etc. The overall model is very complex, but that is just how the reality is.
Teachers, students and content developers need to understand this.
F. From grade levels to learning
levels
To solve the spiraling
problem we have switched from grade levels to what we call "learning
levels," to rank educational content from elementary to advanced. There
are 10 levels, with level 1 being the most elementary and level 10 the most
advanced level.
These 10 learning levels
are presently based on the American grade level system. This ranges from
kindergarten through the 12 primary and secondary grades, to the 4
undergraduate college "grades." This K-16 grade system thus has 17
grade levels in all. Our 10 levels span these 17 grades.
Ideally we would rank
educational content by the grade at which it is taught. But the same content is
taught in different grades in different schools, and even to different students
in the same school, so this is not possible. There is no unique
grade-to-content connection.
This is why we have
created the 10 learning levels. Roughly speaking, each learning level
corresponds to the average grade level range at which the content is taught in
the USA. However, learning levels are averages, not actual levels, as explained
below.
Learning Levels 1-10
are based on the K-16 grade ranges, as follows:
Learning Level 1 =
grades K&1
Levels 2 through 6 =
grades 2 through 6
Level 7 = middle school
or junior high school
Level 8 = high school
Level 9 = basic
undergraduate college
Learning Level 10 =
advanced (BS degree) undergraduate college
In many cases we have
used grade ranges, like middle school and basic college, rather than exact
grades. This is because our data consists of state standards of learning for
grades K-12 and college textbooks for the college grades. These sources only
determine which content is used in the indicated ranges, not by exact grade.
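The correspondence listed above can be written as a small lookup function. The report gives ranges rather than exact grades, so the boundaries used here are assumptions: kindergarten is coded as grade 0, middle school is taken as grades 7-8, high school as grades 9-12, basic college as years 13-14 and advanced college as years 15-16. The function name is ours.

```python
def learning_level(grade):
    """Map a K-16 grade number to a learning level from 1 to 10.

    grade: 0 for kindergarten, 1-12 for primary and secondary grades,
    13-16 for the four undergraduate years (assumed coding).
    """
    if not 0 <= grade <= 16:
        raise ValueError("grade must be between 0 (kindergarten) and 16")
    if grade <= 1:
        return 1        # Level 1 = grades K and 1
    if grade <= 6:
        return grade    # Levels 2 through 6 = grades 2 through 6
    if grade <= 8:
        return 7        # Level 7 = middle school (assumed grades 7-8)
    if grade <= 12:
        return 8        # Level 8 = high school (assumed grades 9-12)
    if grade <= 14:
        return 9        # Level 9 = basic college (assumed years 13-14)
    return 10           # Level 10 = advanced college (assumed years 15-16)

print([learning_level(g) for g in range(17)])
# [1, 1, 2, 3, 4, 5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 10]
```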
Note too that the same
content may be taught in different grades and grade ranges in different states.
Thus the learning level is an average of various grades and grade ranges,
across different states. This is somewhat confusing because there are two
kinds of range in question. Middle school and high school are each grade ranges.
But a given concept may be taught in both, so the range of grades for that
concept may span both grade ranges.
But the learning level
is not the kind of average where the average value is also the most common one. This
means that assigning a concept to a given learning level does not mean it is
usually taught in the corresponding grade or grade range, although it may be.
For example, if a concept is normally taught in either 4th or 6th grade it will
be assigned to level 5, even though it ranges over grades 4 to 6, and is seldom
taught in grade 5. Our model is not as simple as it may look.
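The 4th and 6th grade example can be worked through in a few lines. The report does not specify how the average is computed, so the sketch below assumes a simple mean of the levels at which a concept is taught, rounded to the nearest whole level; the function name and the two placeholder states are hypothetical.

```python
from statistics import mean

def average_level(levels_by_state):
    """Average the levels at which a concept is taught, rounded to a whole level."""
    levels = list(levels_by_state.values())
    return int(mean(levels) + 0.5)  # round halves upward

# A concept taught in 4th grade in one state and 6th grade in another.
taught_at = {"State A": 4, "State B": 6}

level = average_level(taught_at)
print(level)                        # 5
print(level in taught_at.values())  # False: no state teaches it at level 5
```

The assigned level of 5 is the average, yet in this example no state actually teaches the concept at that level, which is why a user may need to look one level up or down.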
The learning level
assigned to educational content is just an average over many different schools
and states. A user who is seeking content for use in a particular school and
grade may have to look at a higher or lower learning level in order to find it
in our system.
G. Conclusion: Our project demonstrates high
potential and the feasibility of grade level stratification of STEM educational
content
To return to
the topic of Web based search, the
complexity of the spiraling model explains why Web based search for specific
science content is so important. Teachers, parents and students need very
specific content at every step of the way. But today's search engines do not do
this job. This is a global problem of huge
proportions. Simply put, there is no reasonable way for teachers, parents or
students to search on the Web for content that is suited for a specific grade
level or degree of difficulty. There are thousands of small collections
scattered through the Web, including a host of Federally funded content. The
only way to search them is one by one, which is prohibitively difficult.
The Web only
works where search works. At the present time Web search does not work for
science education. Our effort is designed to develop the tools needed to solve
this problem.
We believe that our
project to date clearly demonstrates the feasibility of developing GLS products
and services with high potential for use. In fact our results provide a
significant new understanding of the structure of STEM education, including new
ways to measure and improve that structure. The proposed project to develop a
full scale GLS for STEM education is described below.
Appendix I
Basic topics for a complete K-16
science education GLS system
A. Physical Science
Topic:
1. Basic principles of electricity and magnetism (done).
2. The basic nature of matter.
3. Models of atomic structure.
4. Chemical properties and use of the periodic table of
elements.
5. Changes in matter and the Law of Conservation of Matter
and Energy.
6. States and forms of energy and how energy is transferred
and transformed.
7. Temperature scales, heat, and heat transfer.
8. Principles and technological applications of work, force,
and motion.
9. Characteristics of sound and technological applications
of sound waves.
10. The nature and technological applications of light.
B. Earth Science
Topic:
11. Characteristics of the Earth and the solar system.
12. Renewable and nonrenewable resources.
13. Composition and dynamics of the atmosphere.
14. How energy transfer from Sun to Earth drives weather and climate.
15. Freshwater resources and the water cycle.
16. Oceans as complex, interactive physical, chemical, and biological systems.
17. Rock-forming and ore minerals, and the rock cycle.
18. Geologic processes including plate tectonics.
19. History and evolution of the Earth and life, based on rocks and fossils.
20. Origin and evolution of the universe.
C. Life Science
Topic:
21. Cell theory.
22. Patterns and structures of cellular organization in
organisms.
23. How organisms differ and can be classified.
24. Basic needs of organisms.
25. Photosynthesis and its importance to plant and animal
life; the carbon cycle.
26. Interactions among members of a population.
27. Interactions among populations in a biological
community; food chains and webs.
28. Adaptation to biotic and abiotic factors in an
ecosystem.
29. Dynamics of ecosystems, communities, populations, and
organisms.
30. Ecosystem dynamics and human activity.
31. How organisms reproduce and transmit genetic information to new
generations.
32. Evolution and how organisms change over time.
Appendix II
Principal Investigator and other Key
Personnel
Dr. David Wojick, the
Principal Investigator, is an expert on the Web-based diffusion of scientific
knowledge and the concept structure of science and technology. Diane Adams is a
Web designer and Web research expert. Bernadette Monahan is a science teacher
who specializes in educational technology and collecting science education Web
content.
Wojick and Adams have
been a team since 1976. During that period they have conducted or participated
in a number of large scale projects that required developing new methods of
search and analysis, in the context of real world applications. Much of this
work has been done for the Federal government. Examples include the following:
a. Use of word search to identify hidden clusters and paths in Naval Research
(Office of Naval Technology).
b. Use of word search to identify basic regulatory mechanisms in federal
regulations (Office of Information and Regulatory Affairs, OMB).
c. Coherence analysis diagnostic system of 126 kinds of confusion in technical
texts (Department of Commerce).
d. A method to measure allocation of content to scientific topics in Web pages
(Office of Scientific and Technical Information, DOE).
e. Population modeling of the diffusion of scientific knowledge (Office of
Scientific and Technical Information, DOE).