Category Archives: links

Links for March 30, 2012

The new LMS product: You [Audrey Watters/Hack Education]. On Blackboard’s recent strategy change to embrace open source and acquire MoodleRooms and NetSpot. The value is in the data, not in the LMS software.

Are undergraduates actually learning anything? [Richard Arum and Josipa Roksa/The Chronicle of Higher Education]. For many students, college doesn't improve their critical thinking, complex reasoning, and written communications. 45+% of a sample of college students did not demonstrate any statistically significant improvement on the Collegiate Learning Assessment after two years of college. 36% of students did not show any significant improvement after four years. More disturbingly:

[We] find that learning in higher education is characterized by persistent and/or growing inequality. There are significant differences in critical thinking, complex reasoning, and writing skills when comparing groups of students from different family backgrounds and racial/ethnic groups. More important, not only do students enter college with unequal demonstrated abilities, but those inequalities tend to persist—or, in the case of African-American students relative to white students, increase—while they are enrolled in higher education.

An open letter to college admissions committees [Andrew F. Knight/Fairfax Times].

Consequently, the drive for high grades is blinding students and parents alike to the real purpose of education: learning. In parent-teacher conferences, “How can my child bring up her grade?” has replaced “How can my child better learn the material?” The system’s response to angry grade-obsessed parents and disgruntled students has been to fudge the indicator instead of improving the system, in other words, to inflate grades in spite of worsening performance. I was routinely pressured by parents, students and even administrators to inflate grades in the form of curving scores, providing extra credit and retest opportunities, and more heavily weighting homework and projects that are easy to copy from friends. It is instructive to note that two-thirds of our students are on the honor roll. (That’s right.) When a majority of students routinely receive A’s and B’s in all their classes, the distinctions intended by a traditional A-F grading scale become hazy and meaningless.

What forty years of research says about the impact of technology on learning: A second-order meta-analysis and validation study [Tamim, Bernard, Borokhovski, Abrami, & Schmid/Review of Educational Research]. A meta-meta-analysis of research on technology usage in education. Found a random-effects mean effect size of .35, statistically significantly different from zero. I have to wonder whether that is meaningful in any way, given the incredible variety of ways technology can be applied to learning. I have not read the full paper, only the abstract.
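For reference, here is a minimal sketch of how a random-effects mean effect size is commonly pooled. It uses the DerSimonian-Laird estimator, which is one standard choice and not necessarily what the authors used (I have only read the abstract). The function name and the input numbers are made up for illustration.

```python
import math

def random_effects_mean(effects, variances):
    """Pool study effect sizes with the DerSimonian-Laird random-effects model.

    effects   -- one effect size per study (illustrative values, not the paper's)
    variances -- the sampling variance of each effect size
    Returns (pooled mean, standard error, between-study variance tau^2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures how much the studies disagree with each other.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to every study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mean, se, tau2

# Two hypothetical studies that disagree with each other.
mean, se, tau2 = random_effects_mean([0.1, 0.5], [0.01, 0.01])
print(mean, se, tau2)
```

The between-study variance is the interesting part for my worry above: a large tau^2 says the pooled mean is averaging over studies that are measuring quite different things.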

Health correlator: Calling self-experimentation N=1 is incorrect and misleading [Ned Kock/Health Correlator]. Self-experimentation is longitudinal, so n > 1. But results may not generalize to other people. Good for learning what works for you.

Links for March 11, 2012

Depression: A genetic Faustian bargain with infection? [Emily Deans/Evolutionary Psychiatry]. Discusses the Pathogen Host Defense (PATHOS-D) theory of depression described by Raison and Miller [pdf]. Genes that make people susceptible to depression may also protect them from infection. Depression is associated with brain inflammation; inflammation is also part of the immune response that combats infectious disease. “Since infections in the developing world tend to preferentially kill young children, there is strong selection pressure for genes that will save you when you are young, even if those genes have a cost later in life.”

The people of the petabyte [Venkatesh Rao/Forbes blogs]. An “informal taxonomy and anthropological survey of data-land” based on Rao’s observations at the Strata conference. Apparently everyone’s a data scientist now:

The taxonomy part is simple. Apparently the list of species in data land is very short. It has only one item:

  • Data scientist

What is the value of big data research vs. good samples? [LinkedIn Advanced Business Analytics, Data Mining, and Predictive Modeling group]. Interesting and lengthy discussion of whether and when to use samples rather than big data sets.

The real-world experiment: New application development paradigm in the age of big data [James Kobielus/Forrester].

This year and beyond, we will see enterprises place greater emphasis on real-world experiments as a fundamental best practice to be cultivated and enforced within their data science centers of excellence. In a next best action program, real-world experiments involve iterative changes to the analytics, rules, orchestrations, and other process and decision logic embedded in operational applications. You should monitor the performance of these iterations to gauge which collections of business logic deliver the intended outcomes, such as improved customer retention or reduced fulfillment time on high-priority orders.

Links for March 4, 2012

Who’ll have the means to analyze our learning? [Tony Searl/Neoteny]. In response to Pearson and INITE’s announcement of plans to open an online university for Mexicans, Searl asks “who will have the means to analyse our learning in the near future?” and “Will a few dominant learning data companies emerge, or can learning analytics remain an in house cottage industry?”

Social learning analytics: Five approaches [PDF] [Rebecca Ferguson and Simon Buckingham Shum]. Five categories of social learning analytics: social network analytics, discourse analytics, content analytics, disposition analytics, context analytics. All things I want to learn more about.

4chan’s Chris Poole: Facebook & Google are doing it wrong [Jon Mitchell/ReadWriteWeb]. Google and Facebook have a crude notion of identity. 4chan’s Chris Poole says we are like multi-faceted diamonds.

“The portrait of identity online is often painted in black and white,” Poole said. “Who you are online is who you are offline.” That rosy view of identity is complemented with a similarly oversimplified view of anonymity. People think of anonymity as dark and chaotic, Poole said.

But human identity doesn’t work like that online or offline. We present ourselves differently in different contexts, and that’s key to our creativity and self-expression. “It’s not ‘who you share with,’ it’s ‘who you share as,’” Poole told us. “Identity is prismatic.”

Permission to be horrible and other ways to generate creativity [Suzanne Axtell interview of Denise R. Jacobs/O'Reilly Radar].

“… there’s such a limited definition of creativity in our culture. People treat artists as if they’re off in their own world or put them on a pedestal. But it’s a misconception that technical people aren’t creative. Developers and coders and database architects are extremely creative, just as scientists are. They have to come up with solutions and code that have never been written before. If that’s not creativity, I don’t know what is.

I’m reading “A Whole New Mind” by Daniel H. Pink, which explores how right-brain is the new wave. We’re entering a new conceptual, high-touch era whereas before we were in a very analytical era. Our industry, the technical industry, is actually a perfect in-between point of left brain and right brain. You have to have both, a whole-brain approach, to be successful in our industry.”

Colleges misassign many to remedial classes, studies find [Tamar Lewin/NY Times]. This is something learning analytics ought to be able to fix.

Links for February 27, 2012

Kathy Sierra on gamification in education [Larry Ferlazzo/Larry Ferlazzo's Websites of the Day... for Teaching ELL, ESL, & EFL]. Kathy offers guidelines on when gamification may be safe vs. dangerous. What falls in the dangerous category? Learning and engagement that are intrinsically rewarding, since psychology studies have suggested that rewarding such activity destroys a person’s (or a monkey’s) interest in doing the activity for its own sake:

The studies are both counter-intuitive and disturbing. The monkeys that enjoyed playing with wooden puzzles until given their favorite treat reward for solving the puzzles, at which time their puzzle-solving diminished. The kids given ribbons for their drawings then showed less interest in drawing. The writers shown a list of possible external reasons for writing immediately wrote less complex and interesting poems than those shown a list of intrinsically-rewarding reasons for writing. And on and on and on and on. Animals, humans, children, adults, across wide-ranging domains and in studies conducted by dozens of independent researchers.

If 99.9% of big data is irrelevant, why do we need it? [Michael Wu/Lithium Lithosphere blogs]. Lithium’s Principal Scientist of Analytics Wu says “Just because you can track, store, and analyze big data, doesn’t mean you should.” He argues that in many cases you can answer the questions you need to answer with just the relevant data, which might be small enough to load and analyze on a single beefy computer.

Lazily musing about sharing [JP Rangaswami/Confused of Calcutta]. “Sharing is serious business” — it has serious consequences for businesses, especially for those built upon not-sharing. Five ideas about sharing:

1. For anything to be social, it must be shared.

2. Sharing, the act of making social, happens because people are made social.

3. Sharing is encouraged by good design.

4. When you share physical things like food, sharing reduces waste.

5. When you share non-physical things like ideas, sharing increases value.

Want to get value out of your data and analytics investment? Then deal with this issue before you buy the software [Maz Iqbal/B2C Business to Community]. People, even professional statisticians, don’t reason well about statistics. Getting the right data into systems that can analyze it is the easy part. The hard part is:

Getting managers to give up their pet theories, their ideological convictions, their vested interests, their intuition, their past experience and use data and analytics to make decisions. That is the central issue that you have to and should deal with.

Links for February 24, 2012

Cognitive inequality [The Economist Free Exchange].

is this an iron rule of innovation in information technology—that the cheaper information becomes and the easier it becomes to manipulate it the greater will be the gap, productive and otherwise, between the informationally capable and the rest? …

We might well be in an initial phase of the information age in which technology amplifies cognitive gaps which gives way to a period in which technology mutes those gaps.

Our greedy colleges 2.0 [Andrew Gillen/Inside Higher Ed]. The Bennett Hypothesis says that increases in federal financial aid subsidies enable colleges to raise their tuition without concern for what students can actually afford. Study described here found that aid directed to low-income students is less likely to lead to tuition increases compared to aid directed at relatively affluent students.

A modeled student [Cathy O'Neil/mathbabe]. Do systems that recommend courses and majors for students reinforce discrimination?

Economics of the cold start problem in talent discovery [John Horton/Online Labor]. Novices can’t get hired if their talent won’t be revealed until after they get hired. Some empirical evidence. One possible help: “talent revealing sites like StackOverflow and Github as replacements for traditional resumes.”

Links for February 22, 2012

Elizabeth Gilbert on What the Porcupine Dilemma Can Teach Us About the Secret of Happiness [Maria Popova/Brain Pickings]. Elizabeth Gilbert on Schopenhauer’s porcupines. Staying warm without impaling yourself on someone else’s spines.

Target, Pregnancy, and Predictive Analytics, Part II [Dean Abbott/Data Mining and Predictive Analytics]. The Target story was interesting for what it says about the possibilities and perils of analytics. This was my favorite writeup, for its overview of how to succeed with data analysis:

1) understand the data,
2) understand why the models are focusing on particular input patterns,
3) ask lots of questions (why does the model like these fields best? why not these other fields?)
4) be forensic (now that's interesting or that's odd...I wonder...),
5) be prepared to iterate, (how can we predict better for those customers we don't characterize well)
6) be prepared to learn during the modeling process

We have to "notice" patterns in the data and connect them to behavior. This is one reason I like to build multiple models: different algorithms can find different kinds of patterns. Regression is a global predictor (one continuous equation for all data), whereas decision trees and kNN are local estimators.
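To make the global-versus-local distinction in that quote concrete, here is a small Python sketch on made-up data with a step pattern. The helper names `fit_line` and `knn_predict` are mine, not from Abbott's post, and the data are invented. A least-squares line uses one equation for all points, while k-nearest-neighbors averages only the few points closest to the query.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: a single global equation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def knn_predict(xs, ys, x, k=3):
    """Average the targets of the k training points nearest to x: a local estimate."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

# Invented data: the outcome jumps from 0 to 10 halfway along x.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 10, 10, 10, 10, 10]

line = fit_line(xs, ys)
for x in (1, 8):
    print(x, round(line(x), 3), knn_predict(xs, ys, x))
```

On this data the line smooths the jump into a steady slope and kNN follows the step, which is the kind of difference that makes building several models worthwhile.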

You Are Responsible for Getting Your Ideas to Spread [Tim Kastelle/Innovation Leadership Network]. Don’t blame the customer if your idea isn’t compelling; that’s a failure of your idea or your communication of it.

Machine Learning for Hackers [Review from David Smith/Revolution Analytics blog]. Sounds like a book I need to order.

Rather than merely providing a “cookbook” approach to say, building a “who to follow” recommendation system for Twitter, it takes the time to explain the methodology behind the algorithms and give the reader a better basis for understanding why these methods work (and, equally importantly, how they can go wrong).

What’s new? Exuberance for novelty has benefits [John Tierney/The New York Times]. In a longitudinal study, people who combined novelty-seeking with persistence and “self-transcendence” showed the most success over the years (good health, lots of friends, few emotional problems, greatest satisfaction with life).

Links for January 20, 2012

Big data market survey: Hadoop solutions [Edd Dumbill/O'Reilly Radar].

Apache Hadoop is unquestionably the center of the latest iteration of big data solutions. At its heart, Hadoop is a system for distributing computation among commodity servers. It is often used with the Hadoop Hive project, which layers data warehouse technology on top of Hadoop, enabling ad-hoc analytical queries.

I’m starting my first ever project with Hadoop this week–a prototype of an analytics warehouse using Amazon Elastic MapReduce. Colleagues have told me EMR is a great way to get your head around Hadoop-based data processing.
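As a note to self before diving in, here is a toy, single-machine sketch of the map, shuffle, and reduce steps that Hadoop distributes across servers. It is plain Python, not Hadoop or Elastic MapReduce code, and the function names are my own illustration.

```python
from collections import defaultdict

def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine every value emitted for one key into a single result."""
    return key, sum(values)

def word_count(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

print(word_count(["big data", "Big deal"]))
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; the logic per record and per key stays this simple.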

CBO Report: Medicare pilot programs don’t control health-care costs [Megan McArdle/The Atlantic blogs]. McArdle describes what happened with a housing-project demolition program whose pilot studies suggested much better effects than were actually seen at scale:

The initial study was small and involved highly screened people with a lot of support. And it seems to have suffered from publication bias–the most spectacular results got the most attention, even though these might just have been outliers.

This is distressingly common–not just in government or social-do-gooding research, but in organizations of all kinds–including corporations.

Programs at scale often don’t show results as good as pilot studies of those programs. More generally in program evaluation, it’s hard to find evidence of strong (or even weak) effects of interventions. Social systems are complex; factors other than those targeted by the intervention often determine outcomes. This is something I need to communicate regularly to my colleagues and our partners–student learning is largely determined by factors other than what we have control over. That’s not to say we shouldn’t improve our course design, teaching practices, and so forth, but it is to say that there aren’t many easy pickings out there for improving student outcomes.
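One way to see why the most spectacular pilot tends to disappoint at scale is a quick simulation. This is a sketch with invented numbers: every pilot measures the same small true effect plus noise, and we pay attention only to the best-looking one. The function names and parameter values are hypothetical.

```python
import random

def best_pilot_estimate(true_effect, noise_sd, n_pilots, rng):
    """Return the largest of n_pilots noisy estimates of the same true effect."""
    return max(rng.gauss(true_effect, noise_sd) for _ in range(n_pilots))

def average_best_pilot(true_effect=0.1, noise_sd=0.5, n_pilots=20, reps=2000, seed=1):
    """Average the headline (best) pilot result over many simulated rounds."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(
        best_pilot_estimate(true_effect, noise_sd, n_pilots, rng)
        for _ in range(reps)
    )
    return total / reps

print("true effect:", 0.1)
print("average best-of-20 pilot:", average_best_pilot())
print("average single pilot:", average_best_pilot(n_pilots=1))
```

The best of twenty small, noisy pilots looks far stronger than the true effect, while a single unselected pilot is right on average. Selection alone is enough to produce results that shrink at scale.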

For-profits vs not-for-profits [Felix Salmon/Reuters blog].

I know full well that a lot of not-for-profit organizations are run in a dreadful fashion; I’m just not convinced that introducing a profit motive is always or even often the best way to fix that problem…. I very much doubt that for-profit education is ever a good idea. I just don’t see how the incentives there could possibly be aligned.

But the profit motive can’t provide optimal outcomes if there isn’t consumer discipline along with it. For-profit higher education is subsidized by the government in the form of grants and low-interest loans (and note that nonprofit education is subsidized in additional ways as well, in the case of public institutions). Would-be students do not have an incentive to seriously evaluate whether the education they are purchasing is worth what they pay, because there is a third-party payer involved. The situation is much like health care. The post has a good discussion of the issues and controversy over for-profit higher education.

Links for January 15, 2012

The rise of the new group think [Susan Cain/New York Times].

Virtually all American workers now spend time on teams and some 70 percent inhabit open-plan offices, in which no one has “a room of one’s own.” During the last decades, the average amount of space allotted to each employee shrank 300 square feet, from 500 square feet in the 1970s to 200 square feet in 2010….

Privacy also makes us productive. In a fascinating study known as the Coding War Games, consultants Tom DeMarco and Timothy Lister compared the work of more than 600 computer programmers at 92 companies. They found that people from the same companies performed at roughly the same level — but that there was an enormous performance gap between organizations. What distinguished programmers at the top-performing companies wasn’t greater experience or better pay. It was how much privacy, personal workspace and freedom from interruption they enjoyed. Sixty-two percent of the best performers said their workspace was sufficiently private compared with only 19 percent of the worst performers. Seventy-six percent of the worst programmers but only 38 percent of the best said that they were often interrupted needlessly.

I work in an open-plan office and I rather like it, mainly because my coworkers are fun and because my clean, small, mostly quiet work area is such a nice change from my sprawling, messy, mostly noisy house. We work on a puzzle together when we’re taking a break from work and wear headphones when we want uninterrupted time. I wonder, though, if I’d be more productive with a private office or even a cubicle. I don’t achieve flow as much as I’d like at work. Not sure if that’s because the job is relatively new to me or because the work environment is an obstacle.

Hume, causation & science [Barry Ritholtz/The Big Picture]. “We humans love a grossly over-simplified narrative.” Determining when we can attribute causation to a correlation is one of the major challenges of research design and statistical analysis.

How to work from home like you mean it [Kevin Purdy/Fast Company]. I’m thinking of working one day a week at home to achieve some of that flow I’ve been missing. If I do, I’ll follow some of these tips so it doesn’t devolve into eight hours of Internet surfing.

Lack of interest and aptitude keeps students out of STEM majors [Olga Khazan/Washington Post On Small Business blog]. “A study released this week by Georgetown University’s Center on Education and the Workforce found that recent graduates in computer science, mathematics and engineering all had unemployment rates below 9 percent (with the rates dropping below 6 percent among those who had some experience.) Conversely, the rates for graduates in architecture and the arts were 13.9 and 11.1 percent, respectively.”

What is college for? (Part 2) [Gary Gutting/The New York Times].

Concretely, students graduating from high school should, to cite one plausible model, be able to read with understanding classic literature (from, say, Austen and Browning to Whitman and Hemingway) and write well-organized and grammatically sound essays; they should know the basic outlines of American and European history, have a good beginner’s grasp of at least two natural sciences as well as pre-calculus mathematics, along with a grounding in a foreign language.

Students with this sort of education would be excellent candidates for many satisfying and well-paying jobs in, for example, sales and service industries, except for those that require highly specialized skills. From the standpoint of employment, high school graduates would have no need of college unless they wanted to be accountants or engineers, pursue pre-professional programs leading to law or medical school or train for doctoral work in science or the humanities. Apart from this, the only good reason they would have for going to college would be for its intellectual culture.

Compelling idea, but seems unlikely to happen because (1) our high schools are mostly incapable of providing such an education and (2) our culture is overly invested in the idea of college as the basic ticket to success in today’s economy. E.g.: D.C. may require college application for all [Joanne Jacobs].

Links for January 7, 2012

Nutrition advice: The vitamin D-lemma [Amy Maxmen/Nature]. “The difficulty of distilling strong advice from weak evidence.” This is a key challenge for researchers/statisticians/data scientists in any domain, not just in health.

Will Amazon offer analytics as a service? [Quentin Hardy/Bits]. Interesting to get an idea what that might look like. I don’t think, though, this would compete with SAS and similar software as the post implies. Would someone looking to implement a product recommendation engine implement it in SAS? Probably not. For example, Google is said to use R for model exploration and prototyping, then put the models into production using Python or C++. I feel a “choosing your analytics tool” post coming on.

Community college budget cuts drive students to for-profit school [Chris Kirkham/Huffington Post]. Balanced coverage of why students turn to for-profit schools and the pros and cons of such choices. My observation: community college tuition is artificially low due to government subsidization while for-profit tuition is artificially high, again because of government interference (in the form of financial aid). No market forces to bring about a reasonable balance between supply and demand. The big losers are students (and taxpayers).

BenchPrep is Codecademy for any subject, high school to med school [Josh Constine/TechCrunch]. “Eventually, publishers might get a clue that interactive digital education is going to destroy their paper book business. If they’re smart they’ll start developing their own courses or raise licensing fees. Until then though, BenchPrep will be the savior of anyone frustrated by the static book-learning experience.” I’m pretty certain some big textbook publishers see that already.

Forget dieting, try intermittent fasting [Josh Ozersky/Time Ideas]. “And that’s why instead of eating healthier, I’m going for longer stretches without eating so I can actually enjoy a whole meal. I don’t starve myself; I drink a protein shake if I get hungry and consume endless glasses of diet iced tea. People tell me this is bad, that I will soon gain back all the weight I’ve lost – and these rejoinders are always given with a smug malice, as if the people uttering them actually despise me for trying to compensate for the pleasures of the plate.”

I fast most days at work until about 2 or 3 pm, then have a small snack. I eat whatever I want once I get home from work around 5 pm. I find this allows me to eat generally what I want while maintaining my weight at a level I’m happy with. I have found, like Josh, that people get really upset about this plan, almost offended that I would eat this way. Funny how everyone thinks they know what is healthy and what is not, despite the difficulties in determining that (see first link in this post).

Links for December 30, 2011

Yes, and… [W.P. McNeill/Corner Cases]. Living by the “yes, and” ethos of improvisational comedy. Always build on what the other person said–stay open to their insight and direction. Be a pliable weed not a concrete pylon. Don’t get mired in dogma. I’m thinking this would work equally well in interactions with coworkers as with kids.

College has been oversold [Alex Tabarrok/Marginal Revolution]. The total number of students graduating from college is way up, but the numbers graduating with STEM degrees haven’t increased. That’s bad for individuals and bad for the economy. “An argument can be made for subsidizing students in fields with potentially large spillovers, such as microbiology, chemical engineering, nuclear physics and computer science. There is little justification for subsidizing sociology, dance and English majors.”

You have to break connections to get your ideas to spread [Tim Kastelle/Innovation Leadership Network]. Innovation requires disruption. "When you come up with a great new idea, you need to think about this economic network in two ways. The first is: how can I connect to all of the complementary parts of the economy that are needed to get my idea to work? The second is: if I’m going to get my idea to spread, which of these existing connections need to be broken?"

The second economy [W. Brian Arthur/McKinsey Quarterly]. We are in the process of building out the economy’s neural system, what Arthur calls “the second economy” growing up alongside the first economy, the industrial economy. Downside: loss of jobs as computers take over.

Selecting amongst large classes of models [Brian D. Ripley] (pdf). We have the data and the computational resources to “trawl through literally thousands of models (and in some cases many more).” How to pick among them? A subject I intend to learn a lot more about in 2012.

Curing the big data storage fetish [Dan Woods/Forbes]. “One popular way to express lust for big data for its own sake is to create a gargantuan Hadoop cluster.” Not enough to just store the data, need to build a data-driven culture. “But how do you create a company culture like CapitalOne or Google or eBay or Zynga or LinkedIn, where data is essentially part of the management team? At all of these companies there are data scientists, the elite professionals, but there are also swarms of data enthusiasts, people who are eager to use data to help do their jobs better.”