But what about learning? Recognizing the signaling function of higher education

I spent just a couple of days at the Learning Analytics and Knowledge 2012 conference this past week in Vancouver. Because of personal stuff going on I couldn’t attend the whole conference, but by watching some live streaming, following the conference Twitter channel, reading some live blogs, and browsing some of the papers afterward I put together a decent understanding of the topics covered and the issues that came up. Of course, I had far less time than I wanted to connect with people there, but I have hopes that I can address that over the next year online and then go all out for LAK13 in Belgium.

Some attendees felt a disconnect: many vendors and researchers are looking at retention (students re-enrolling across terms) and course completion rather than at learning itself. But aren’t we doing learning analytics, they asked?

At Pearson, I am working on exactly the problem of retention and course success. I work mainly with data from Pearson’s LearningStudio hosted learning management system. I sometimes have access to learning outcomes data but only very rarely; I can’t count on having that.

The data scientist is not drunk

Is my work a case of the drunk looking for her keys under the streetlight? I don’t think so, and it’s not just to do with the profitability of institutions or their need to demonstrate adequate program completion rates and year-over-year retention to accreditation bodies.

Students enroll in higher education programs for many reasons, not all of which have to do with learning. We would all hope that after two or four or more years taking coursework a student will have better skills and cognitive capacity than when she started. But bachelor’s and associate’s degree programs today require students to take many classes that are not relevant to them or to their future work, that merely serve as hurdles to jump over to acquire the degree. The learning is not enough, because education serves a signaling purpose in addition to working to improve a student’s cognitive capacity.

As people working in education, we have to recognize the reality of higher education today, that it is only partially about learning. I’m talking from a purely U.S.-based perspective, as that’s my focus. In the U.S. today, bachelor’s degrees are a basic entry ticket to a decent job in many cases. In fact, credential inflation means that now a master’s degree is required for entry into many professions.

Some economists think higher ed degrees primarily function as signals to employers, that they are not mainly about increased learning or cognitive capacity. On this theory, a student’s degree shows that he has qualities valued by employers: conformity, conscientiousness, a willingness to defer gratification. If signaling is true in some situations or in some ways, that means it’s not enough for a student to take some classes and merely learn what they need to use on the job. They need to get through an entire program, arbitrary requirements and uninteresting or useless classes included.

There’s good evidence that signaling theories of education do not tell the whole story about higher education, that returns to education do indeed reflect the additional skills that graduates in various disciplines bring to the job market. The signaling theory may be true in part but as we might hope, education is also about actual preparation and learning–I don’t mean to say it is not.

I also don’t mean to reduce higher education to job market preparation, though I do question degrees that don’t represent a sound economic investment. The cost of post-secondary education today is such that we can’t divorce it from its role in linking people with economic opportunity.

Are learning and course completion orthogonal?

One presenter called completion and learning orthogonal outcomes. At least one attendee in that session took issue with that. We would all hope this is not the case (we hope that students don’t successfully complete classes without learning anything), but can’t everyone think of a class they were required to take and took nothing away from? I just completed a Ph.D., and virtually all of the cognate classes were worthless, annoying, and time-consuming efforts that I had to complete if I wanted my degree. Almost no learning took place in those courses, even though I am a highly motivated and engaged learner and was taking courses that I thought would be interesting. Sometimes students do just need to complete a course, learning or no.

Certainly we should work toward making every course a worthwhile learning experience for students but in the real world there are always going to be some classes that aren’t that for one reason or another. Assuming that completing a particular program is a good thing for a particular student (a somewhat questionable assumption in this era of heavy student loan debt and low-value degrees), helping them get through all their courses successfully regardless of learning is a good in itself.


On the reductionism of analytics in education

I had the great pleasure (and distinct discomfort) of listening to Virginia Tech’s Gardner Campbell speak on learning analytics this week, through my haphazard participation in the Learning Analytics 2012 MOOC. Haphazard, I say, because I am so busy at work I can hardly spare any time to connect outside of it, whether through more structured means like the Learning Analytics course or less structured like Twitter and Facebook. Discomfort, I say, because Campbell launched some pointed criticisms of the current reductionist approach to learning analytics that prevails in education today. Yes, it prevails at Pearson too, not because we have bad motives, but because the process of education and learning is so complex that we feel compelled to simplify it in some way to make any sense of it.

M-theory vs. the x-y plane

Campbell drew an analogy to cosmology, contrasting 11-dimensional M-theory with the planar (two-dimensional) Cartesian coordinate system. He suggested that current work in learning analytics is like working in the x-y plane when we know that education and learning take place in at least 11 dimensions.

Learning analytics, as practiced today, is reductionist to an extreme. We are reducing too many dimensions into too few. More than that, we are describing and analyzing only those things that we can describe and analyze, when what matters exists at a totally different level and complexity. We are missing emergent properties of educational and learning processes by focusing on the few things we can measure and by trying to automate what decisions and actions might be automated.

As I was writing this post, Simon Phipps (@webmink) tweeted about his post “leaving room for mystery,” in which he proposed that some problems will remain unsolved, some systems unanalyzed:

The real world is deliciously complex, and there will always be mysteries – systems too complex for us to analyse. It seems to me that one of the keys to maturing is learning to identify those systems and leave room for them to be mysteries, without discarding the rest of rational life.

Then Simon shared a definition of reductionism with me in a tweet (@webmink, status 176095165709164544).

This echoes exactly what Campbell said in his presentation:

My fear is that computers as they are used in learning analytics mean that people will work on simpler questions. They may be complicated in terms of the scale of the data but they’re not conceptually rich. They won’t be trying more concepts or playing with new ideas.

We’ll have a map that makes the territory far simpler than it truly is and we’ll design school to that, not to the true complexity.

Reductionism in analyzing online discussion threads

Last week in a meeting one of my colleagues pointed out the inherent reductionism of our approach to the problem of measuring and characterizing student interactivity and learning via discussion threads. He pointed this out not as a criticism but as recognition and acknowledgement. We are applying a custom-developed coding scheme to threaded discussion posts. We code each post into one of four categories based on the pattern of topics discussed in each post and across the thread. We capture what topics were introduced, how they relate to topics in previous posts, and how they relate to the main discussion topic. We cannot capture all the details and complexity of what people have written and how they have interacted. We certainly aren’t paying any attention to the broader experiences and connections that individual students bring to the discussion. But we are trying nevertheless to capture some important kinds of meaning and interaction in the posts via our coding scheme.

This is, at heart, the analytics endeavor: to take very messy, humanly meaningful information and transform it into numbers that a computer can manipulate. It can be done in more sophisticated and subtle ways or more crude and careless ways, but it is always reductionist. It does not fully capture the human experience of learning. We can’t model learning in all its complexity.
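To make that reduction concrete, here is a minimal Python sketch of the general idea of coding posts by their topic patterns and then counting the codes. It is not our actual coding scheme: the four category names, the rules that assign them, and the toy thread are all assumptions made up for illustration, and it presumes someone has already tagged each post with a set of topics.

```python
from collections import Counter

# Hypothetical labels: the post above does not name its four categories.
EXTENDS, RESTATES, TANGENT, OFF_TOPIC = "extends", "restates", "tangent", "off_topic"


def code_post(post_topics, earlier_topics, main_topics):
    """Return one illustrative code for a post, given sets of topic labels."""
    on_main_topic = bool(post_topics & main_topics)
    # Topics that appear neither in earlier posts nor in the discussion prompt.
    new_topics = post_topics - earlier_topics - main_topics
    linked_to_thread = bool(post_topics & earlier_topics)
    if on_main_topic:
        return EXTENDS if new_topics else RESTATES
    return TANGENT if linked_to_thread else OFF_TOPIC


def code_thread(posts, main_topics):
    """Code every post in order, tracking the topics seen so far in the thread."""
    seen = set()
    codes = []
    for post_topics in posts:
        codes.append(code_post(post_topics, seen, main_topics))
        seen |= post_topics
    return codes


# A made-up thread: each post is already reduced to a set of topic labels.
thread = [
    {"retention"},                  # repeats the prompt
    {"retention", "advising"},      # adds a new idea to the main topic
    {"advising", "financial aid"},  # follows an earlier idea away from the prompt
    {"parking"},                    # unrelated to the prompt or the thread
]
codes = code_thread(thread, main_topics={"retention"})
summary = Counter(codes)
print(codes)
print(summary)
```

Everything a student actually wrote has disappeared by the time we reach `summary`. Only the counts remain, which is exactly the reductionism in question.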

The math is not the territory

I see it as critical in data analysis to remember that our numbers are useful shorthand — easy to manipulate, summarize, visualize, and report upon — but they are not the thing we are interested in. We use them because there is something else non-quantitative we are interested in, something human (at least in social sciences like education).

Campbell said,

We tend to believe the math is the territory and we tend to organize ourselves around just what we’re able to measure instead of organizing ourselves around creating better measurements of what we know to be nearly unimaginably complex.

The math is not the territory — the codes and numbers we use to represent human understanding and action and connection are not the territory — the visualizations are not it either.

Learning as delicious mystery?

Simon suggested some things are too complex to be answerable and should be left as mysteries. Is learning something that should be left unanalyzed? Certainly not, although aspects of it are mysteriously wonderful and not amenable to quantitative or qualitative analysis. There’s too much at stake — for individual students who benefit from success defined in many different ways, for the government that funds or subsidizes much of their education, for the citizenry that benefits from an educated populace.

I believe analytics can help, but I feel humble about its possibilities, more so than ever after listening to Campbell speak. I used to call my stance “cynicism” but I think I will reframe it as “humbleness” which makes it seem like there is some chance of success. As uncomfortable as it was, I’m glad I sat in on Campbell’s talk and listened to it again this morning to think about it further.

big data, education, personal, statistics

So you call yourself a data scientist?

[Image: Hilary Mason (in Glamour!)]

I just watched this video of Hilary Mason* talking about data mining. Aside from the obvious thoughts of what I could have done with my life if (1) I had majored in computer science instead of philosophy/economics and (2) hadn’t spent all of the zeroes having babies, buying/selling houses, and living out an island retirement fantasy thirty years before my time, I found myself musing about her comments on the “data scientist” term. She said she’s gotten into arguments about it. I guess some people think it doesn’t really mean anything — it’s just hype — who needs it? Someone’s a computer scientist or a statistician or a business intelligence analyst, right? Why make up some new name?

I dunno, I rather like the term. My official title at work is “data scientist” — thank you to my management for that — and it seems more appropriate than statistician or business intelligence analyst or senior software developer or whatever else you might want to call me. The fact is, I do way more than statistical analysis. I know SQL all too well and (as my manager knows from my frequent complaints) spend 75% + of my time writing extract-transform-load code. I use traditional statistical methods like factor analysis and logistic regression (heavily) but if needed I use techniques from machine learning. I try to keep on top of the latest online learning research and I incorporate that into our analytics plans and models. Lately I’ve been spending time looking at what sort of big data architectures might support the scale of analytics we want to do. I don’t just need to know what statistical or ML methods to use — I need to figure out how to make them scalable and real-time and — this is critical — useful in the educational context. That doesn’t sound like pure statistics to me, so don’t just call me a statistician**.

I do way more than data analysis and I’m capable of way more, thanks to my meandering career path that’s taken me from risk assessment (heavy machinery accident analysis at Failure Analysis, now Exponent) to database app development (ERP apps at Oracle) to education (AP calculus and remedial algebra teaching at the Denver School of Science and Technology) and now to Pearson (online learning analytics). Along the way I earned a couple of degrees, in mathematical statistics and in applied statistics/research design/psychometrics.

[Image: Drew Conway's Venn diagram of data science]

None of what I did made sense at the time I was wandering the path — and yet it all adds up to something useful and rare in my current position. Data science requires an alchemistic mixture of domain knowledge, data analysis capability, and a hacker’s mindset (see Drew Conway’s Venn diagram of data science reproduced here). Any term that only incorporates one or two of these circles doesn’t really capture what we do. I’m an educational researcher, a statistician, a programmer, a business analyst. I’m all these things.

In the end, I don’t really care what you call me, so long as I get the chance to ask interesting questions, gather the data to answer them, and then give you an answer you can use — an answer that is grounded in quantitative rigor and human meaning.

*Yes, I do have a girl-crush on Hilary. I think she’s awesome.

** Also, my kids cannot seem to pronounce the word “statistician.” I need a job title they can tell people without stumbling over it. I hope to inspire them to pursue careers that are as rewarding and engaging, intellectually and socially, as my own has been.

education, research

Getting ready for connected learning

Here’s a cool idea: the web enables a connectivist learning style based on network navigation, where “learning is the process of creating connections and developing a network.” Seems to me before you can learn connectedly, though, you need to first learn in more socially and contextually constrained ways.

Background: Three generations of distance education pedagogies

In this week’s Learning Analytics 2012 (LAK12) web session, Dragan Gasevic pointed us at an interesting paper describing three generations of distance education: cognitive-behaviorist, social constructivist, and connectivist. From Anderson and Dron (2011):

Anderson and Dron did not claim that the connectivist model would replace the cognitive-behaviorist or social-constructivist models but said that “all three current and future generations of [distance education] pedagogy have an important place in a well-rounded educational experience.”

These three models co-exist online today

LAK12 is itself an example of a course built in the connectivist paradigm, but just because a course is massive, open, and online doesn’t mean that it’s connectivist. For example, the Stanford machine learning class offered last fall was a (very effective) example of a cognitive-behaviorist approach. Students watched videos on their own schedule. Regular quizzes and homework assignments checked understanding. Andrew Ng was content creator and sage on the stage. While there was a Q&A forum available, the course design did not rely on it. A student could use it or not.

Typical online college courses today are often built in the social-constructivist mode, with instructors seeking to design and run courses that encourage many-to-many engagement through discussion threads and group projects. Does the addition of social features drive learning? It seems to be an article of faith among instructional designers today that it does. I’m not up on the research so I can’t say — but I can say that in online courses I’ve reviewed and taken, I don’t see evidence that social features have been designed in such a way that they make a difference in learning.

When are the different approaches useful?

I am thinking that whether a cognitive-behaviorist or constructivist or connectivist approach is best depends upon the preparation and goals of the learner. Maybe something like this:

I suspect that a student needs to gain basic grounding and fluency in a subject before constructivist approaches will be useful. An elementary schooler needs to learn to read and write and do arithmetic before doing a group science project, for example. And it seems like a connectivist approach will be most effective when you already have some intermediate and contextual knowledge of a subject and can navigate out from it.

What do you think? When are cognitive-behaviorist vs. social constructivist vs. connectivist approaches to learning most useful? Do you think you need to have achieved a certain level of contextual and subject knowledge before connected learning is effective?

diary of a doctoral student, education, personal

Dot plan for autumn

I have really great memories of my first job after finishing my master’s degree. I worked as a Unix/C++ programmer on an intelligence agency software development contract. The people I worked with were really smart and the work was engaging.

Many of us at that workplace kept “.plan” (say it “dot plan”) files in our home directories that said what we were working on. You could see what someone else was doing by “fingering” them (kind of a precursor to the Facebook poke, but with a response: a listing of the person’s .plan). Keeping public plans was a good way for us to share what we were working on, without being annoying about it. People use Twitter for that now, and I do intend to get back to Twitter, someday soon. But for now, it feels comfortable to write and think alone in my hermit-cave here.
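For anyone who never used one, here is a rough Python sketch of the part of “fingering” that mattered to us: looking in someone’s home directory for a .plan file and showing its contents. It is only an illustration, not how the real finger program is written. The `read_plan` function is a hypothetical helper, and the demo uses a temporary directory as a stand-in for a home directory.

```python
import tempfile
from pathlib import Path


def read_plan(home_dir):
    """Return the text of home_dir/.plan, or None if there is no readable plan."""
    plan_path = Path(home_dir) / ".plan"
    try:
        return plan_path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        # Missing file, unreadable file, or missing directory: no plan to show.
        return None


# Demo in a throwaway directory standing in for someone's home directory.
with tempfile.TemporaryDirectory() as fake_home:
    before = read_plan(fake_home)  # nothing written yet
    Path(fake_home, ".plan").write_text("Studying for comps.\n", encoding="utf-8")
    after = read_plan(fake_home)

print(before)
print(after)
```

The real program also reported login details and could query other machines. The sketch leaves all of that out and keeps only the shared-file idea.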

Back to school

I completed my two big summer projects: I submitted two studies to the AERA 2011 conference, then prepared for and passed the SAS base programming certification exam. Now I’m thinking about back-to-school activities and fall quarter. It feels like the right time to update my plan.

These are my fall projects:

Submit a manuscript to a journal. I haven’t decided which study to rework into a journal article. Both studies are based on the TIMSS 2007 data set and fortunately I’m attending training in D.C. at the end of this month to learn more about that and other international education databases, so I think I’ll be in good shape to do this.

Prepare for my doctoral comprehensive exam, scheduled for late October. I’ll be blogging about the topics I expect to see on the exam, so if you see some tutorial-like posts, that’s why.

Study for and pass the SAS advanced programming certification. I plan to do this after taking comps, but ideally before January, when I’ll start looking for a job. Some of the most interesting statistician positions I’ve seen require SAS. Plus my advisor and I have a plan to do a missing data simulation study in the winter and she suggested we use SAS. I might have selected R if it were up to me, but I plan to use R for my dissertation research, so I’ll have both adequately covered.

Find a good middle school for my middle child. It kills me that Denver no longer supports neighborhood schools; it’s all choice choice choice. This is great when you find a school that suits your child and your family circumstances. The problem is there’s no default choice in many neighborhoods now. I don’t know anyone who sends their kids to our neighborhood middle school or high school, and I wouldn’t feel comfortable sending my daughter to either of those schools since her peers will go elsewhere. We’ll be looking at private and magnet schools. We may also consider trying to “choice in” to a traditional public school that’s near us but has a better reputation than the one that we are assigned to.

Complete 14 units of coursework. I am taking Cost-Benefit Analysis, Economic Fundamentals: Global Applications, Item Response Theory, and a required seminar in which I learn to administer IQ tests. After this quarter, I’ll have just two classes left, Qualitative Research and Analysis of Variance, and I can focus on my dissertation research and job search.

Meanwhile, keep the family happy and healthy. I’d like to get in the habit of starting my kids off each day with healthy breakfasts: scrambled eggs, berry smoothies, pancakes and waffles made with good stuff. We eat dinner together almost every night and I’d like to continue that too, including continuing to try new recipes on a regular basis so I can feed my need for novelty.


Sharing government data, Colorado school districts edition

Sharing government data with the public really does create a culture of accountability. The Denver Post analyzed spending data for Colorado’s three largest school districts and this forced Denver Public Schools to take action:

After queries from The Post, DPS officials Friday sent an e-mail to principals and staff announcing cost controls.

“Effective immediately, we will be restricting food purchases and travel expenses,” the message said.

Food may still be purchased for community meetings but no longer for internal staff meetings. Administrators are asking staffers to participate in “virtual conferences” rather than paying for out-of-state travel.

Among the expenses were $4,113 for doughnuts and burritos (Montbello High School) and $1,174 for a Dave & Buster’s year-end party (Lincoln High School). Lincoln seems to be one of the worst offenders, charging $161,000 to district-issued credit cards. North High School charged less than $14,000. (But North has bigger problems.)

Many commenters on the article seem to think this isn’t a big deal, that the amounts of money involved are relatively small, and why shouldn’t teachers get some coffee? But public school systems shouldn’t be paying for coffee and food and entertainment for their staff or for parents or for students, especially in a time of serious budgetary problems. Most families have to cut back right now; so should the government workers they support.


Operational publicy for charter schools

I’ve been blogging a lot, and I’ve had good reason. I’m practicing structured procrastination. It’s the ninth week of a ten-week quarter, so I have a few major projects to finish, including a policy paper on charter school accountability and quality.

Once I finish that, I’m on to the fun stuff: analyzing some survey data on religiosity, then modeling cross-country math achievement test scores using liking-for-math and country-level cultural values as predictors.

I’m just having trouble making this charter school paper come together. I already thought about charter school accountability and didn’t come to any conclusions, at least not anything interesting enough to write about. What interesting conclusions could there be? I can only think of relatively uninteresting ones:

  • Charter schools don’t operate in a free market for education, so market forces aren’t going to ensure they do a good job and aren’t going to force the worst ones out of business.
  • Charter schools sometimes do better than traditional public schools but often do worse. That’s what you’d expect. If you deregulate (or partially deregulate) a sector, you’re going to allow both higher quality and lower quality entrants in.
  • Colorado already has pretty good charter laws, according to these rankings.
  • Charter schools probably don’t get closed readily enough because it’s too politically difficult to do so. Far more charters are opened than closed — and they can’t all be that good.
  • Charter schools aren’t engines of innovation for the public school system, at least as far as classroom practice goes. They do, however, show what might be accomplished if you push control down to the school level vs. keeping it at the district level.

While procrastinating, I’ve been reading blogs and writing blog posts and trying to find some inspiration for this paper so I can get it written and move on to the fun stuff — I promised myself I can’t do the data analysis projects until I write the paper. Yesterday afternoon I finally found some inspiration, the idea of “operational publicy.” Yes, that word “publicy” is awful, but the concept is right on.

Charter schools should open up their operations and their data to public scrutiny, not just their test scores and demographics which are already released in school report cards, but everything they do. After all, they are getting public dollars for their work.

On the same topic, from the Economist special report on managing information:

Providing access to data “creates a culture of accountability”, says Vivek Kundra, the federal government’s CIO. One of the first things he did after taking office was to create an online “dashboard” detailing the government’s own $70 billion technology spending. Now that the information is freely available, Congress and the public can ask questions or offer suggestions. The model will be applied to other areas, perhaps including health-care data, says Mr Kundra—provided that looming privacy issues can be resolved.

Love the idea of a “culture of accountability,” but of course there are “looming privacy issues” in releasing school data too. Should you release teacher attendance data? You’re probably not going to point out individual teachers who miss a lot of school but what about aggregated data? What about administrator salaries compared to teacher salaries by school? Lesson plans? Results of parent satisfaction surveys? These are things the public should have access to.