Wednesday, May 23, 2012

John Dewey + Moneyball = A Key Insight to Change

A quote from John Dewey
"I may have exaggerated somewhat in order to make plain the typical points of the old education: its passivity of attitude, its mechanical massing of children, its uniformity of curriculum and method. It may be summed up by stating that the centre of gravity is outside the child. It is in the teacher, the textbook, anywhere and everywhere you please except in the immediate instincts and activities of the child himself. On that basis there is not much to be said about the life of the child.  A good deal might be said about the studying of the child, but the school is not the place where the child lives. Now the change which is coming into our education is the shifting of the centre of gravity. It is a change, a revolution, not unlike that introduced by Copernicus when the astronomical centre shifted from the earth to the sun.  In this case the child becomes the sun about which the appliances of education revolve; he is the centre about which they are organized."
This passage is from "The School and Society," originally published more than 100 years ago (in 1899). It is relevant today, sadly. Are students at the center of instruction in math classes? Mostly no. Math teachers predominantly lecture at students, and the great change that Dewey saw has not yet come about, though many teachers have made the shift. A question one can ask is "Why have things not changed significantly in all these years?" Sure, the books have colors, and we have technology beyond our grandparents' wildest dreams. But when you look beyond mere surface beauty, you can see that the heart of it is still the teacher telling and the students following.

One major issue behind the lack of change is data. More specifically, a persistent problem is how we assess teaching and learning and, more pointedly, how the resulting data might change our fundamental beliefs (or axioms) about teaching and learning. What we assess and how we assess it determines our evaluation of student ability and achievement. Herein lies one of our fundamental issues. Lack of good assessments can lead us to continue doing what we have been doing.

To get some insights, let's look outside of education to provide a backdrop for analyzing our own system.  One of the unique aspects of baseball is the wealth of statistical information that has been available for generations upon generations of players. It's one of the reasons why baseball is such a wonderfully interesting sport to be a fan of.

Earned Run Average (or ERA) is one of the traditional measures of a pitcher's ability. A lower ERA is considered better, since the pitcher gives up fewer earned runs per 9 innings. The problem with ERA is that it is a noisy and flawed measure of pitching effectiveness. It depends on factors not under the pitcher's control, such as the quality of the defense supporting the pitcher and the effects of stadiums on balls batted in play. If ERA is weighted too heavily as a measure of ability, some pitchers are overvalued and some are undervalued in terms of their contributions to team wins. Voros McCracken conducted some groundbreaking analysis, first establishing the concept of "defense independent" pitching measures and then developing methods to compute them. This story, among others, is chronicled in "Moneyball" by Michael Lewis. But Voros didn't expect baseball teams to rejoice when learning about his findings. He knew better.
"The problem with major league baseball... is that it is a self-populating institution.  Knowledge is institutionalized.  The people involved with baseball who aren't players are ex-players... They aren't equipped to evaluate their own systems.  They don't have mechanisms to let in the good and get rid of the bad." (Voros McCracken)
This is a striking insight!  Voros essentially identifies why baseball resisted modern statistical methods that could help.  Baseball is not set up as an institution to evaluate how it evaluates players.  Baseball people normally did not have the knowledge, ability or willingness to entertain ideas developed by people like Voros, who is a baseball outsider.
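
For readers who don't follow baseball, here is a minimal sketch of the standard ERA calculation mentioned above: earned runs allowed, scaled to a 9-inning game. The function name and the sample pitcher are made up for illustration. Notice that nothing in the formula separates the pitcher's own contribution from the defense or the ballpark, which is the weakness described above.

```python
def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical pitcher: 70 earned runs allowed over 210 innings.
print(earned_run_average(70, 210))  # 3.0
```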

What is the implication for us in the teaching profession? It should be stated that major league baseball and education are not very similar as institutions. That said, we have some similarities, and we can draw conclusions about our shortcomings from baseball's own struggles. Indeed, teaching is also a self-populating institution. Students who do well in the current system are the ones who end up becoming teachers or professors. Some future math teachers state things like, "The reason why I like math is because there's always one right answer, and there's a simple, straightforward structure to all problems." They are good at memorizing rote skills, the rote skills appear on tests, they get good grades, they are labeled as good in math (which may or may not be true), and then they model themselves after their favorite teacher. Thus the cycle perpetuates.

Colleges and universities have in their mission the goal of seeking truth and knowledge.  When it comes to teaching, however, discussions among faculty often are about style, "what my students like...," and about delivery of information.  The focus is usually not on learning and what students are doing.   A major point is that we do not use the scientific method to evaluate teaching, just as major league baseball didn't use any scientific methods to validate their player valuation systems.

Consequently we have several metric problems. Are the usual metrics like skills-based tests and student evaluations the right ones? Clearly the answer is no. Let's consider the typical calculus sequence with its thousand-plus-page texts. In the typical chapter on optimization in calculus books, the authors usually highlight in a colored box the steps for how to find relative extrema. What this tells many (but not all) students is that they should memorize the recipe and regurgitate it on an exam. That's how one can get a good grade, after all. These students will not walk away with a conceptual understanding of the subject, and probably will forget what they have memorized once the term is over. In short, their education is unintentionally of a lower quality than what we want. Mathematics is reduced to applying recipes that many students do not understand or even care to understand.

If you don't believe this can happen, here's data from physics, presented by Professor Eric Mazur of Harvard University at the University of Waterloo. (The talk is 1 hour long, but worth it!) At Harvard, 40% of the students in freshman physics who did well on the procedures had inadequate understanding of basic concepts.

The result of traditional assessments is that many students who are traditionally given good grades have major gaps in understanding of basic concepts. Students think they know it, but maintain an "Aristotelian understanding of Physics" rather than a Newtonian one. Their education amounts to very little, even at a sublime place like Harvard.

Now let's consider traditional teaching assessments (i.e. student evaluations), and consider data from Physics, based on the Force Concept Inventory (FCI).  All of the red data points are from traditional instructors who lecture.  Represented in these data points are teachers who are highly rated and lowly rated on student evaluations -- the red dots contain some star teachers and the teachers on the "oh bummer" list.  And they all do about the same on the FCI within statistical significance!  Students gain on average about 23% of what is possible in the pre-post test design.  Student evaluations are like ERA.  They are a noisy, flawed metric.  Actually student evaluations are worse than ERA.  ERA has some value in aggregate (whole team ERA), and outliers tend to have outlier ERAs.   In contrast, the highly-rated, award winning instructors are doing no better than Dr. Boring or Professor Snoozer.
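
To make "23% of what is possible" concrete, here is a minimal sketch in Python. It assumes the gain being described is the normalized gain commonly reported with pre-post concept inventories: the improvement actually achieved divided by the maximum improvement available. The function name and the class averages below are made up for illustration and are not taken from the FCI data.

```python
def normalized_gain(pre_percent: float, post_percent: float) -> float:
    """Fraction of the possible improvement that was actually achieved.

    Both arguments are class-average test scores in percent (0 to 100).
    """
    if not 0 <= pre_percent < 100:
        raise ValueError("pre_percent must be in [0, 100)")
    if not 0 <= post_percent <= 100:
        raise ValueError("post_percent must be in [0, 100]")
    return (post_percent - pre_percent) / (100 - pre_percent)


# Hypothetical class averages (illustrative only, not real FCI results).
# Both classes start at 40%, so 60 percentage points of gain are possible.
lecture_class = normalized_gain(40.0, 53.8)   # gained 13.8 of a possible 60
engaged_class = normalized_gain(40.0, 67.6)   # gained 27.6 of a possible 60

print(f"{lecture_class:.2f}")  # 0.23
print(f"{engaged_class:.2f}")  # 0.46
```

Dividing by what is possible is what lets classes with different starting scores be compared on the same scale.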

The green data points represent faculty who use Interactive Engagement in their classrooms. One of the main trends is that there is very little overlap between the reds and the greens. The average green gain is double that of traditional instruction. Thus a better way to measure whether an instructor is effective is to know what skills and practices he or she utilizes in the classroom. While crude and incomplete, it at least tells you whether the instructor is on the red or green distribution. But these qualities and practices are not usually assessed or measured in teaching evaluations, so there does not exist sufficient data or incentive for the system to embrace change. We keep on doing the usual, while the traditional assessments tell us things are okay. And the results keep staying in the "red zone" above. Education has a bunch of Voros McCrackens, so there is hope. Baseball has changed, and I believe education will continue to improve for the better.

What about Math? The Calculus Concept Inventory has been rolled out and studies are underway. Thus there is hope that we will embrace new assessments that tell us what is going on. Preliminary results suggest similar outcomes to the FCI: Interactive Engagement and traditional instruction fall on different distributions. I look forward to seeing the published results. Moreover, a growing body of evidence in research in undergraduate education also suggests that students in traditional courses are not learning what we want them to learn. (More on this in a future post.)

The MAA's Calculus Study indicates that 80% of college calculus courses are taught in sections of 40 students or fewer. Additionally, very few institutions have large lectures for upper-level courses. Ample opportunities exist for IBL methods to be deployed in courses across the nation.

What can an individual instructor do personally? Looking at data can be demoralizing at times, but one should be optimistic. In particular, one can turn assessments into valuable tools that guide students and instructors in the right direction.

Assessment is more than grading stuff so that you can assign course grades. Assessments should be utilized in ways that provide students with regular feedback (formative), instructors with information about their students (formative), and a way to evaluate demonstrated achievement (summative). Assessments should provide incentives for the qualities we actually value, including creativity, clarity, exploration, problem-solving ability, and communication.

Ideas for what to assess:
  • Student presentations and/or small group work
  • Reading or journal assignments
  • Math portfolios
  • Exams
  • Homework
The items above are not revolutionary.  What matters is what we put into them.  Exams can be rote skill based or they can also test for conceptual understanding, application of ideas, and problem solving.  Homework can be made more interesting.

Student presentations and/or small groups are a wonderful way to assess understanding. When students present their proofs or solutions, it is often a rich experience and a rich source of information. You see it all in IBL classes: great ideas, small ideas, half-baked ideas, insightful questions, victories and defeats. It's a slice of real life, and it's really great. It's very easy to detect where students are, and then take action.

Reading assignments can be used to offload (i.e. flip a class) basics to homework, leaving time in class for the harder tasks, where inquiry is useful.  Portfolios are like a CV, and can be used to demonstrate what a student has been able to prove on his or her own.  Additionally portfolios can be used to create a record of the theorems proved by the class. (I'll write more about portfolios in future posts.) 

In IBL courses, one has continuous formative assessment.  Instructors are always analyzing whether students understand an idea or not, by giving students meaningful tasks and then working with them to overcome learning challenges.  If students are stuck, then there is another question or problem that can be posed, and then students are off on another mathematical adventure.  Students are continuously engaged, are monitored, are self-monitoring, get feedback, and so on.  This rich, integrated assessment system of actual learning is a core advantage in IBL teaching.

Returning to Dewey... If we continue to teach and assess teaching in traditional ways, we will not gather the data and information necessary to support change for individuals and systemwide. Gathering good data about our students' thinking, which is also a core part of effective teaching, is a key to the way out. Engage your students, collect good data, share it, publish it.

Upward and onward!

"It ain't what you don't know that gets you into trouble.  It's what you know for sure that just ain't so." - Mark Twain