Daily Links 01/27/2015

Traditionally we say: If we find statistical significance, we’ve learned something, but if a comparison is not statistically significant, we can’t say much. (We can “reject” but not “accept” a hypothesis.)

But I’d like to flip it around and say: If we see something statistically significant (in a non-preregistered study), we can’t say much, because garden of forking paths. But if a comparison is not statistically significant, we’ve learned that the noise is too large to distinguish any signal, and that can be important.
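A rough way to see why an isolated significant result may say little: suppose, as a simplifying assumption that is not in the quoted passage, that the forking paths amount to k independent comparisons with no real effect, each tested at the 0.05 level. The chance that at least one of them comes out "significant" is then 1 - (1 - 0.05)^k. The function name below is hypothetical, and real forking paths involve correlated, data-dependent choices rather than independent tests, so treat this only as a sketch.

```python
def chance_of_spurious_hit(num_comparisons, alpha=0.05):
    """Probability that at least one of `num_comparisons` independent
    comparisons with no true effect falls below the threshold `alpha`.

    Assumes independence between comparisons, which is a simplification.
    """
    if num_comparisons < 0:
        raise ValueError("num_comparisons must be non-negative")
    # Each null comparison stays non-significant with probability 1 - alpha.
    return 1.0 - (1.0 - alpha) ** num_comparisons


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, round(chance_of_spurious_hit(k), 3))
```

Under these assumptions, twenty available comparisons already give roughly a 64% chance of at least one spurious hit.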

So, to sum up, science is not about data; it’s not about the empirical content, about our vision of the world. It’s about overcoming our own ideas and continually going beyond common sense. Science is a continual challenging of common sense, and the core of science is not certainty, it’s continual uncertainty—I would even say, the joy of being aware that in everything we think, there are probably still an enormous amount of prejudices and mistakes, and trying to learn to look a little bit beyond, knowing that there’s always a larger point of view to be expected in the future. 

We really have no idea what dolphins or octopi or crows could achieve if their brains were networked in the same way. Conversely, if human beings had remained largely autonomous individuals they would have remained rare hunter-gatherers at the mercy of their environments as the huge-brained Neanderthals indeed did right to the end. What transformed human intelligence was the connecting up of human brains into networks by the magic of division of labour, a feat first achieved on a small scale in Africa from around 300,000 years ago and then with gathering speed in the last few thousand years.

Take Salesforce for example. Right now it just presents data, and the human user has to draw her or his predictive insights in their heads. Yet most of us have been trained by Google, which uses information from millions of variables based on ours and others’ usage to tailor our user experience … why shouldn’t we expect the same here? Enterprise applications — in every use case imaginable — should and will become inherently more intelligent as the machine implicitly learns patterns in the data and derives insights. It will be like having an intelligent, experienced human assistant in everything we do.

Paradigm shift: From BI to MI

Yesterday I listened to a Gartner webinar, “Information 2020: Uncertainty Drives Opportunity,” given by Frank Buytendijk, and it got me thinking about the evolution (or revolution?) from business intelligence (BI) to machine intelligence (MI). I see this happening but not as fast as I’d like, as jaded as I am about BI. Buytendijk gave me some ideas for understanding this transformation.

From his book Dealing with Dilemmas, here’s Buytendijk’s formulation of S curves that show the uptake of new technologies and approaches over time, and how they are then replaced by newer technologies and approaches.

[Figure: Buytendijk’s S curves, showing the uptake of each technology over time and its replacement by the next]

From the book:

A trend starts hopefully; with a lot of passion, a small group of people pioneer a technology, test a new business model, or bring a new product to market. This is usually followed by a phase of disappointment. The new development turns out to be something less than a miracle. Reality kicks in. At some point, best practices emerge and a phase of evolution follows. Product functionality improves, market adoption grows, and the profitability increases. Then something else is introduced, usually by someone else. … This replacement then goes through the same steps.

This is where I think we are with machine intelligence for enterprise software. We’ve reached the end of the line for business intelligence, the prior generation of analytics. It has plateaued. There’s not much more it can do to impact business outcomes–a topic that deserves its own post.

What instead? What next? Machine intelligence. MI not BI. Let’s let computers do what they do well–dispassionately crunch numbers. And let humans do what they do well–add context and ongoing insight and the flexibility that enterprise reality demands. Then weave these together into enterprise software applications that feature embedded, pervasive advanced analytics that optimize business micro-decisions and micro-actions continuously.

We’re not quite ready for that yet. While B2C data science has advanced, B2B data science has hardly launched, outside of some predictive modeling of leads in CRM and a bit of HR analytics. BI for B2B doesn’t give us the value we need. But MI for B2B has barely reached toddlerhood.

We are, in Buytendijk’s terms, in the “eye of ambiguity,” that space where one paradigm is plateauing but another has not yet proved itself. That makes it very difficult to jump from one S curve to the next–see how far apart they are?

It’s almost Kuhnian, isn’t it?

Recently one of the newish data scientists in my group said, “it seems like a lot of people don’t believe in this.” This, meaning data science. I agreed with him that it had yet to prove its worth in enterprise software and that many people did not believe it ever would. But it seems clear to me that sometime–in five years? ten years?–machines will help humans run enterprise processes much more efficiently and effectively than we are running them now.

My colleague’s comment reminded me of some points Peter Sheahan of ChangeLabs made at the Colorado Technology Association’s APEX conference last November. He proposed that we don’t have to predict the future in order to capitalize on future trends because people are already talking about what’s coming. Instead, we need to release ourselves from legacy biases and practices. This was echoed by Buytendijk in his webinar: “best practices are the solutions for yesterday’s problems.”

It’s exciting to be in on the acceleration at the front of the S curve but frustrating sometimes too. It’s hard to communicate that data science and the machine intelligence it can generate are not the same as business intelligence and data storytelling. People don’t get it. Then a few do. And a few more.

I look forward to being around when it really catches on.

Daily Links 01/19/2015

He told me to get a big wall calendar that has a whole year on one page and hang it on a prominent wall. The next step was to get a big red magic marker.

He said for each day that I do my task of writing, I get to put a big red X over that day. “After a few days you’ll have a chain. Just keep at it and the chain will grow longer every day. You’ll like seeing that chain, especially when you get a few weeks under your belt. Your only job next is to not break the chain.”

On January 2nd of this year I started publishing a daily data science blog post for my team at IQNavigator with analytic results of some sort or another–charts, statistical analyses, machine learning output. My goal is to write such a post every working day for 2015, following Seinfeld’s advice of seeking consistent daily action. I’ve missed one working day so far (last Friday) but otherwise it’s been a great way to ensure I stay engaged with hands-on data science work and consistently discover interesting insights in our data set.

As value shifts from software to the ability to leverage data, companies will have to rethink their businesses, just as Netflix and Google did. In the next decade, data-driven, personalized experiences will continue to accelerate, and development efforts will shift towards using contextual data collected through passive user behaviors.

We in the West hate to acknowledge – and most refuse to believe – that our leaders have been flagrantly wasteful of Muslim lives for a century now, in countless wars and military encounters instigated by overwhelming Western power. What is the message to Muslims of the US-led invasion of Iraq in 2003? More than 100,000 Iraqi civilians – a very conservative estimate – died in a war that was based on utterly false pretenses. The US has never apologized, much less even recognized the civilian slaughter.

“The Google search algorithm” names something with an initial coherence that quickly scurries away once you really look for it. Googling isn’t a matter of invoking a programmatic subroutine—not on its own, anyway. Google is a monstrosity. It’s a confluence of physical, virtual, computational, and non-computational stuffs—electricity, data centers, servers, air conditioners, security guards, financial markets—just like the rubber ducky is a confluence of vinyl plastic, injection molding, the hands and labor of Chinese workers, the diesel fuel of ships and trains and trucks, the steel of shipping containers.

Daily Links 12/22/2014

The bottom line is that science is not merely a bag of clever tricks that turn out to be useful in investigating some arcane questions about the inanimate and biological worlds. Rather, the natural sciences are nothing more or less than one particular application — albeit an unusually successful one — of a more general rationalist worldview, centered on the modest insistence that empirical claims must be substantiated by empirical evidence.

I have said many times that teamwork is over-rated. It can be a smoke screen for office bullies to coerce fellow workers. The economic stick often hangs over the team: be a team player or lose your job, is the implication in many workplaces. One of my main concerns with teams is that people are placed on them by those holding hierarchical power and are then told to work together (or else). However, there are usually power plays internal to the team so that being a team player really means doing what the leader says. For example, I know many people who work in call centres and I have heard how their teams are often quite dysfunctional. Teamwork too often just means toeing the party line.

A more accurate title for this role might be CDMO – Chief Data Monetization Officer – as their role needs to be focused on deriving value from, or monetizing, the organization’s data assets.  This also needs to include determining how much to invest to acquire additional data sources that would complement the organization’s existing data sources and enhance their analytic results.

  • Block out time
  • Change your defaults
  • Rely on apps and automation
  • Do routine cleanup
  • Think ahead (long-term)
  • Create separate calendars

I know many others that are like me in this regard and for you I have these recommendations: 1- avoid unnecessary meetings, especially if you are already in full-productivity mode. Don’t be afraid to use this as an excuse to cancel.  If you are in a soft $ institution, remember who pays your salary.  2- Try to bunch all the necessary meetings all together into one day. 3- Separate at least one day a week to stay home and work for 10 hours straight. Jason Fried also recommends that every work place declare a day in which no one talks. No meetings, no chit-chat, no friendly banter, etc… No talk Thursdays anyone?

We have identified that when these four skills are brought together as one, they produce an optimal collaborative environment that breeds the most successful teams and a workplace culture that continuously propels innovation and initiative:

  • Seeing opportunities with broadened observation
  • Sowing opportunities with extensive innovation
  • Growing the seeds of opportunity of greatest potential
  • Sharing the opportunities you create and sustain with others

In fact, a study by my organization revealed that the workplace is not innovative enough because employees are mostly proficient “sowers” (with the propensity of doing what they are told very well).

Flexible, reconfigurable networks at work

Jon Husband, interviewed by Stowe Boyd, on his concept of “wirearchy”:

I have been involved in a long and intense process of study of the sociology of work, organizations and institutions for 40 years now. Today I believe that a major transition towards what some futurists call a “knowledge-based society” is underway. In that context what I call wirearchy represents an evolution of traditional hierarchy. I don’t think most humans can tolerate a lack of some hierarchical structure, primarily for the purposes of decision-making. The working definition I developed (and which has been ‘tested’ by a range of colleagues and friends interested in the issue(s)) recognizes that the necessary adaptations to new conditions will likely involve temporary, transient but more intelligent hierarchy.

From his website, here’s Husband’s definition of wirearchy:

Wirearchy – a dynamic two-way flow of power and authority based on knowledge, trust, credibility and a focus on results, enabled by interconnected people and technology.

If I understand it right, wirearchy is more of a network that is much more fluid and flexible than an organizational hierarchy. It involves more knowledge exchange and development than a traditional org hierarchy would. This idea makes sense of a bunch of things I’ve been experiencing and thinking about lately.

Self-organized, self-managing challenge teams

At work we recently ran an innovation challenge with self-organized, self-managing teams made up mostly of individual contributors from the technology organization. I was skeptical whether it would work but the output from the three teams was amazingly well-done and innovative. Each team had a coach from outside technology, someone who knew the problem domain well. So they had a link to business reality.

The teams presented their solutions to the first challenge this Friday. The output was incredibly impressive — they came up with thoughtful, detailed, innovative approaches grounded in the reality of what our system needs to support. They did this without the usual management hierarchy and decision-making processes in place.

Someone deeply committed to a Dilbertian world would say “why of course they did a great job! They had no pointy-haired bosses involved!” That could be the case. I don’t think it argues for eliminating the hierarchy, though, but rather for making the hierarchy more flexible and optional while encouraging intelligent ad hoc networks to develop. Those networks may or may not be hierarchical in structure.

Attachment to hierarchical position causes inefficiency

At the Colorado Technology Association’s APEX conference in November, Peter Sheahan of ChangeLabs proposed that attachment to ego causes inefficiency. When we get caught up in thinking that, because of our position or reputation, we can’t do a particular thing or act a particular way, we get in the way of progress. We introduce friction.

This hit home for me, because as a vice president of my company (one of very many!) I sometimes think I shouldn’t have to write R code myself, or develop a SQL query to get the data I need, or spend the many days it often takes to figure out exactly what the data my team and I are analyzing means. But that’s BS. I should and will do whatever I need to do to achieve the IQN Labs mission. If that means firing up Sublime Text with our latest R code, I’ll do it.

Rigid hierarchies encourage people to think of themselves as operating in a narrow capacity for an organization. Recently when a colleague moved on to a new company he told me one of the reasons he was moving was that people expected him to do things that were not his job to do. He was operating only in hierarchical mode. With wirearchy added to hierarchy, perhaps the work in the interstitial spaces of the organizational hierarchy is more likely to get done.

How do you encourage people to think outside their particular position and job description?

Sometimes you need hierarchy – The example of school group work

I hated when I had to do group projects for my PhD. The professors would often claim that these group projects were just like the real world – where you have to collaborate with other people.

True, but in the real world there are hierarchies, so you have some sense of who’s in charge. It’s absolutely not true that managers are generally the most talented and effective people, but in some cases the promotion process works, and you end up with reasonably effective, knowledgeable people in leadership positions. They often have extra information about the business context you are working in and can guide work. This sort of recognized hierarchy is almost always absent in a group project aimed at completing a homework assignment. Often what happens is the person with the most knowledge and context does all the work, having no historical or official sway over the others.

Which makes me wonder about those challenge teams… how were decisions made and work allocated? This is something I’d love to investigate. I wonder whether already knowing other people’s strengths and potential areas for contribution meant that unofficial hierarchies could develop. Or did work get organized and completed in a flatter fashion?

I worked for Oracle in the late nineties in application development. At the time it was an incredibly hierarchically-aware place. I imagine it still is. The hierarchy mattered. It also mostly worked, at least where I was in the organization. That’s because above me I had a very effective manager who kept involved in the technological and business details of what we were doing while playing politics well both laterally and vertically upwards into the organization. Having the hierarchy in place made me more effective.

Matrix management

My team has grown to 320% of its former size this past year, from just myself to myself plus two full-time data scientists plus an intern who comes in one day a week. Both data scientists transferred from other parts of the organization, bringing with them plenty of institutional knowledge that will help the IQN Labs team succeed. One of them also brought with her ongoing responsibilities that don’t fall within the scope of IQN Labs. The manager she came from still has those responsibilities. I haven’t figured out exactly how we’ll work this situation but it seems to me the concept of “wirearchy” may help light a way through.

Connected intelligence

I’m thinking that wirearchies can be more intelligent than hierarchies because they potentially connect across longer distances. This makes me think of my own collaboration with product marketing at IQN. I probably spend as much time talking to the two people in that organization as I do talking to people within Technology. My conversations with product marketing produce guidance and insight, far more than my meetings within Technology, which are usually more operational in focus.

Just like in academic research, sometimes the long-distance interdisciplinary connections add the most value.

This is exciting because it builds on many ideas that existed only in toddler form in my book Connect! I am looking forward to exploring them further and testing them out at work in 2015.

Daily Links 12/21/2014

The winds of change originate in the unconscious minds of domain experts. If you’re sufficiently expert in a field, any weird idea or apparently irrelevant question that occurs to you is ipso facto worth exploring. [3] Within Y Combinator, when an idea is described as crazy, it’s a compliment—in fact, on average probably a higher compliment than when an idea is described as good.

Today I believe that a major transition towards what some futurists call a “knowledge-based society” is underway. In that context what I call wirearchy represents an evolution of traditional hierarchy. I don’t think most humans can tolerate a lack of some hierarchical structure, primarily for the purposes of decision-making. The working definition I developed (and which has been ‘tested’ by a range of colleagues and friends interested in the issue(s)) recognizes that the necessary adaptations to new conditions will likely involve temporary, transient but more intelligent hierarchy. The implication is that people in a wirearchy should be focused on seeking to better understand and use the growing presence of feedback loops and double-loop learning.

In this paper, we present the benchmark data set CauseEffectPairs that consists of 88 different “cause-effect pairs” selected from 31 datasets from various domains. We evaluated the performance of several bivariate causal discovery methods on these real-world benchmark data and on artificially simulated data. Our empirical results provide evidence that additive-noise methods are indeed able to distinguish cause from effect using only purely observational data. In addition, we prove consistency of the additive-noise method proposed by Hoyer et al. (2009).
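To make the additive-noise idea a little more concrete, here is a minimal sketch, not the paper's implementation. It fits effect = f(cause) + noise in each direction and prefers the direction whose residuals look less dependent on the presumed cause. The polynomial regression, the simple HSIC-style dependence score, and all function names are stand-ins chosen for brevity and should be read as assumptions; the toy data at the bottom is simulated, not drawn from CauseEffectPairs.

```python
import numpy as np


def _rbf_gram(values):
    """Gaussian-kernel Gram matrix with a median-distance bandwidth."""
    distances = np.abs(values[:, None] - values[None, :])
    bandwidth = np.median(distances)
    if bandwidth == 0:
        bandwidth = 1.0  # degenerate input; fall back to a fixed width
    return np.exp(-(distances ** 2) / (2.0 * bandwidth ** 2))


def hsic(a, b):
    """Biased HSIC estimate; values near zero suggest independence."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    centering = np.eye(n) - np.full((n, n), 1.0 / n)
    product = _rbf_gram(a) @ centering @ _rbf_gram(b) @ centering
    return float(np.trace(product)) / (n - 1) ** 2


def residual_dependence(cause, effect, degree=3):
    """Fit effect = f(cause) + noise with a polynomial f and score how
    strongly the residuals still depend on the presumed cause."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    coefficients = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coefficients, cause)
    return hsic(cause, residuals)


def guess_direction(x, y, degree=3):
    """Return the preferred direction plus both dependence scores."""
    forward = residual_dependence(x, y, degree)
    backward = residual_dependence(y, x, degree)
    label = "x->y" if forward < backward else "y->x"
    return label, forward, backward


if __name__ == "__main__":
    # Simulated toy data in which x really does drive y.
    rng = np.random.default_rng(0)
    x = rng.uniform(-2.0, 2.0, size=300)
    y = x ** 3 + rng.normal(0.0, 1.0, size=300)
    print(guess_direction(x, y))
```

A raw score comparison like this is cruder than a proper independence test, and it can give an arbitrary answer when the relationship is linear with Gaussian noise, a case in which the two directions cannot be told apart by this approach.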

In an interview with Kevin Smith, writer and television producer Paul Dini complained about a worrying trend he sees in television animation and superhero shows in particular: executives spurning female viewers because they believe girls and women don’t buy the shows’ toys.

Daily Links 12/17/2014

My personal feeling is that this will really take off if you can start linking performance information to the more objective factual data within the various systems. How does the performance of interim staff vary and is that linked to which agency they come through, their employment history, the length of their assignment or other factors? We’ve probably all had experience of working with interim staff who were brilliant; and with others who weren’t worth a fraction of their day rate. So you can imagine some really powerful analysis that might give a strong steer into how you best choose, structure and manage your contingent workforce – and maybe even take that into the permanent staff world!

Totally agree! Now we just need to get hold of comprehensive and reliable performance data…

Truly some awesome stuff here, including the link below on writing an R package from scratch. I should definitely do that for the utility functions I use over and over. 

This tutorial is not about making a beautiful, perfect R package. This tutorial is about creating a bare-minimum R package so that you don’t have to keep thinking to yourself, “I really should just make an R package with these functions so I don’t have to keep copy/pasting them like a goddamn luddite.” Seriously, it doesn’t have to be about sharing your code (although that is an added benefit!). It is about saving yourself time. (n.b. this is my attitude about all reproducibility.)

People are searching for products on Amazon, rather than using Google. The only reason search makes money for Google is that people use it to search for products they would like to buy on the internet, and Google shows ads for those products. Increasingly, however, people are going straight to Amazon to search for products. Desktop search queries on Amazon increased 47% between September 2013 and September 2014, according to ComScore.

Jeff: I think it takes more time to analyze something like that. Again, one of my jobs is to encourage people to be bold. It’s incredibly hard.  Experiments are, by their very nature, prone to failure. A few big successes compensate for dozens and dozens of things that didn’t work. Bold bets — Amazon Web Services, Kindle, Amazon Prime, our third-party seller business — all of those things are examples of bold bets that did work, and they pay for a lot of experiments.

What really matters is, companies that don’t continue to experiment, companies that don’t embrace failure, they eventually get in a desperate position where the only thing they can do is a Hail Mary bet at the very end of their corporate existence. Whereas companies that are making bets all along, even big bets, but not bet-the-company bets, prevail. I don’t believe in bet-the-company bets. That’s when you’re desperate. That’s the last thing you can do.

“The dirty secret is that a significant majority of big-data projects aren’t producing any valuable, actionable results,” said Michael Walker, a partner at Rose Business Technologies, which helps enterprises build big-data systems. According to a recent report from the research firm Gartner Inc., “through 2017, 60% of big-data projects will fail to go beyond piloting and experimentation and will be abandoned.”