Daily Links 01/19/2015

He told me to get a big wall calendar that has a whole year on one page and hang it on a prominent wall. The next step was to get a big red magic marker.

He said for each day that I do my task of writing, I get to put a big red X over that day. “After a few days you’ll have a chain. Just keep at it and the chain will grow longer every day. You’ll like seeing that chain, especially when you get a few weeks under your belt. Your only job next is to not break the chain.”

On January 2nd of this year I started publishing a daily data science blog post for my team at IQNavigator with analytic results of some sort or another: charts, statistical analyses, machine learning output. My goal is to write such a post every working day for 2015, following Seinfeld’s advice of seeking consistent daily action. I’ve missed one working day so far (last Friday) but otherwise it’s been a great way to ensure I stay engaged with hands-on data science work and consistently discover interesting insights in our data set.

As value shifts from software to the ability to leverage data, companies will have to rethink their businesses, just as Netflix and Google did. In the next decade, data-driven, personalized experiences will continue to accelerate, and development efforts will shift towards using contextual data collected through passive user behaviors.

We in the West hate to acknowledge – and most refuse to believe – that our leaders have been flagrantly wasteful of Muslim lives for a century now, in countless wars and military encounters instigated by overwhelming Western power. What is the message to Muslims of the US-led invasion of Iraq in 2003? More than 100,000 Iraqi civilians – a very conservative estimate – died in a war that was based on utterly false pretenses. The US has never apologized, much less even recognized the civilian slaughter.

“The Google search algorithm” names something with an initial coherence that quickly scurries away once you really look for it. Googling isn’t a matter of invoking a programmatic subroutine—not on its own, anyway. Google is a monstrosity. It’s a confluence of physical, virtual, computational, and non-computational stuffs—electricity, data centers, servers, air conditioners, security guards, financial markets—just like the rubber ducky is a confluence of vinyl plastic, injection molding, the hands and labor of Chinese workers, the diesel fuel of ships and trains and trucks, the steel of shipping containers.

Daily Links 12/22/2014

The bottom line is that science is not merely a bag of clever tricks that turn out to be useful in investigating some arcane questions about the inanimate and biological worlds. Rather, the natural sciences are nothing more or less than one particular application — albeit an unusually successful one — of a more general rationalist worldview, centered on the modest insistence that empirical claims must be substantiated by empirical evidence.

I have said many times that teamwork is over-rated. It can be a smoke screen for office bullies to coerce fellow workers. The economic stick often hangs over the team: be a team player or lose your job, is the implication in many workplaces. One of my main concerns with teams is that people are placed on them by those holding hierarchical power and are then told to work together (or else). However, there are usually power plays internal to the team so that being a team player really means doing what the leader says. For example, I know many people who work in call centres and I have heard how their teams are often quite dysfunctional. Teamwork too often just means toeing the party line.

A more accurate title for this role might be CDMO – Chief Data Monetization Officer – as their role needs to be focused on deriving value from, or monetizing, the organization’s data assets.  This also needs to include determining how much to invest to acquire additional data sources that would complement the organization’s existing data sources and enhance their analytic results.

Block out time
Change your defaults
Rely on apps and automation
Do routine cleanup
Think ahead (long-term)
Create separate calendars

I know many others that are like me in this regard and for you I have these recommendations: 1- avoid unnecessary meetings, especially if you are already in full-productivity mode. Don’t be afraid to use this as an excuse to cancel.  If you are in a soft $ institution, remember who pays your salary.  2- Try to bunch all the necessary meetings all together into one day. 3- Separate at least one day a week to stay home and work for 10 hours straight. Jason Fried also recommends that every work place declare a day in which no one talks. No meetings, no chit-chat, no friendly banter, etc… No talk Thursdays anyone?

We have identified that when these four skills are brought together as one, they produce an optimal collaborative environment that breeds the most successful teams and a workplace culture that continuously propels innovation and initiative:

  • Seeing opportunities with broadened observation
  • Sowing opportunities with extensive innovation
  • Growing the seeds of opportunity of greatest potential
  • Sharing the opportunities you create and sustain with others

In fact, a study by my organization revealed that the workplace is not innovative enough because employees are mostly proficient “sowers” (with the propensity of doing what they are told very well).

Flexible, reconfigurable networks at work

Jon Husband, interviewed by Stowe Boyd, on his concept of “wirearchy”:

I have been involved in a long and intense process of study of the sociology of work, organizations and institutions for 40 years now. Today I believe that a major transition towards what some futurists call a “knowledge-based society” is underway. In that context what I call wirearchy represents an evolution of traditional hierarchy. I don’t think most humans can tolerate a lack of some hierarchical structure, primarily for the purposes of decision-making. The working definition I developed (and which has been ‘tested’ by a range of colleagues and friends interested in the issue(s)) recognizes that the necessary adaptations to new conditions will likely involve temporary, transient but more intelligent hierarchy.

From his website, here’s Husband’s definition of wirearchy:

Wirearchy – a dynamic two-way flow of power and authority based on knowledge, trust, credibility and a focus on results, enabled by interconnected people and technology.

If I understand it right, wirearchy is more of a network that is much more fluid and flexible than an organizational hierarchy. It involves more knowledge exchange and development than a traditional org hierarchy would. This idea makes sense of a bunch of things I’ve been experiencing and thinking about lately.

Self-organized, self-managing challenge teams

At work we recently ran an innovation challenge with self-organized, self-managing teams made up mostly of individual contributors from the technology organization. I was skeptical whether it would work but the output from the three teams was amazingly well-done and innovative. Each team had a coach from outside technology, someone who knew the problem domain well. So they had a link to business reality.

The teams presented their solutions to the first challenge this Friday. The output was incredibly impressive — they came up with thoughtful, detailed, innovative approaches grounded in the reality of what our system needs to support. They did this without the usual management hierarchy and decision-making processes in place.

Someone deeply committed to a Dilbertian world would say “why of course they did a great job! They had no pointy-haired bosses involved!” That could be the case. I don’t think it argues for eliminating the hierarchy, though, but rather for making the hierarchy more flexible and optional while encouraging intelligent ad hoc networks to develop. Those networks may or may not be hierarchical in structure.

Attachment to hierarchical position causes inefficiency

At the Colorado Technology Association’s Apex conference in November, Peter Sheahan of Changelabs proposed that attachment to ego causes inefficiency. When we get caught up in thinking that, because of our position or reputation, we can’t do a particular thing or act a particular way, we get in the way of progress. We introduce friction.

This hit home for me, because as a vice president of my company (one of very many!) I sometimes think I shouldn’t have to write R code myself or develop a SQL query to get the data I need or spend the many days it often takes to figure out exactly what the data my team and I are analyzing means. But that’s BS. I should and will do whatever I need to do to achieve the IQN Labs mission. If that means firing up Sublime Text with our latest R code, I’ll do it.

Rigid hierarchies encourage people to think of themselves as operating in a narrow capacity for an organization. Recently when a colleague moved on to a new company he told me one of the reasons he was moving was that people expected him to do things that were not his job to do. He was operating only in hierarchical mode. With wirearchy added to hierarchy, perhaps the work in the interstitial spaces of the organizational hierarchy is more likely to get done.

How do you encourage people to think outside their particular position and job description?

Sometimes you need hierarchy – The example of school group work

I hated when I had to do group projects for my PhD. The professors would often claim that these group projects were just like the real world – where you have to collaborate with other people.

True, but in the real world there are hierarchies so you have some sense of who’s in charge. It’s absolutely not true that managers are generally the most talented and effective people but in some cases the promotion process works, and you end up with reasonably effective, knowledgeable people in leadership positions. They often have extra information about the business context you are working in and can guide work. This sort of recognized hierarchy is almost always absent in a group project aimed at completing a homework assignment. Often what happens is the person with the most knowledge and context does all the work, having no historical or official sway over the others.

Which makes me wonder about those challenge teams… how were decisions made and work allocated? This is something I’d love to investigate. I am wondering if already knowing other people’s strengths and potential areas for contribution meant that unofficial hierarchies could develop. Or did work get organized and completed in a flatter fashion?

I worked for Oracle in the late nineties in application development. At the time it was an incredibly hierarchically-aware place. I imagine it still is. The hierarchy mattered. It also mostly worked, at least where I was in the organization. That’s because above me I had a very effective manager who kept involved in the technological and business details of what we were doing while playing politics well both laterally and vertically upwards into the organization. Having the hierarchy in place made me more effective.

Matrix management

My team has grown to 320% of its size a year ago, from just myself to myself plus two full-time data scientists plus an intern who comes in one day a week. Both data scientists transferred from other parts of the organization, bringing with them plenty of institutional knowledge that will help the IQN Labs team succeed. One of them also brought with her ongoing responsibilities that don’t fall within the scope of IQN Labs. The manager she came from still has those responsibilities. I haven’t figured out exactly how we’ll work this situation but it seems to me the concept of “wirearchy” may help light a way through.

Connected intelligence

I’m thinking that wirearchies can be more intelligent than hierarchies because they potentially connect across longer distances. This makes me think of my own collaboration with product marketing at IQN. I probably spend as much time talking to the two people in that organization as I do talking to people within Technology. My conversations with product marketing produce guidance and insight, far more than my meetings within Technology, which are usually more operational in focus.

Just like in academic research, sometimes the long-distance interdisciplinary connections add the most value.

This is exciting because it builds on many ideas that existed only in toddler form in my book Connect! I am looking forward to exploring more and testing it all out at work in 2015.

Daily Links 12/21/2014

The winds of change originate in the unconscious minds of domain experts. If you’re sufficiently expert in a field, any weird idea or apparently irrelevant question that occurs to you is ipso facto worth exploring. [3] Within Y Combinator, when an idea is described as crazy, it’s a compliment—in fact, on average probably a higher compliment than when an idea is described as good.

Today I believe that a major transition towards what some futurists call a “knowledge-based society” is underway. In that context what I call wirearchy represents an evolution of traditional hierarchy. I don’t think most humans can tolerate a lack of some hierarchical structure, primarily for the purposes of decision-making. The working definition I developed (and which has been ‘tested’ by a range of colleagues and friends interested in the issue(s)) recognizes that the necessary adaptations to new conditions will likely involve temporary, transient but more intelligent hierarchy. The implication is that people in a wirearchy should be focused on seeking to better understand and use the growing presence of feedback loops and double-loop learning.

In this paper, we present the benchmark data set CauseEffectPairs that consists of 88 different “cause-effect pairs” selected from 31 datasets from various domains. We evaluated the performance of several bivariate causal discovery methods on these real-world benchmark data and on artificially simulated data. Our empirical results provide evidence that additive-noise methods are indeed able to distinguish cause from effect using only purely observational data. In addition, we prove consistency of the additive-noise method proposed by Hoyer et al. (2009).

In an interview with Kevin Smith, writer and television producer Paul Dini complained about a worrying trend he sees in television animation and superhero shows in particular: executives spurning female viewers because they believe girls and women don’t buy the shows’ toys.

Daily Links 12/17/2014

My personal feeling is that this will really take off if you can start linking performance information to the more objective factual data within the various systems. How does the performance of interim staff vary and is that linked to which agency they come through, their employment history, the length of their assignment or other factors? We’ve probably all had experience of working with interim staff who were brilliant; and with others who weren’t worth a fraction of their day rate. So you can imagine some really powerful analysis that might give a strong steer into how you best choose, structure and manage your contingent workforce – and maybe even take that into the permanent staff world!

Totally agree! Now we just need to get hold of comprehensive and reliable performance data…

Truly some awesome stuff here, including the link below on writing an R package from scratch. I should definitely do that for the utility functions I use over and over. 

This tutorial is not about making a beautiful, perfect R package. This tutorial is about creating a bare-minimum R package so that you don’t have to keep thinking to yourself, “I really should just make an R package with these functions so I don’t have to keep copy/pasting them like a goddamn luddite.” Seriously, it doesn’t have to be about sharing your code (although that is an added benefit!). It is about saving yourself time. (n.b. this is my attitude about all reproducibility.)

People are searching for products on Amazon, rather than using Google. The only reason search makes money for Google is that people use it to search for products they would like to buy on the internet, and Google shows ads for those products. Increasingly, however, people are going straight to Amazon to search for products. Desktop search queries on Amazon increased 47% between September 2013 and September 2014, according to ComScore.

Jeff: I think it takes more time to analyze something like that. Again, one of my jobs is to encourage people to be bold. It’s incredibly hard.  Experiments are, by their very nature, prone to failure. A few big successes compensate for dozens and dozens of things that didn’t work. Bold bets — Amazon Web Services, Kindle, Amazon Prime, our third-party seller business — all of those things are examples of bold bets that did work, and they pay for a lot of experiments.

What really matters is, companies that don’t continue to experiment, companies that don’t embrace failure, they eventually get in a desperate position where the only thing they can do is a Hail Mary bet at the very end of their corporate existence. Whereas companies that are making bets all along, even big bets, but not bet-the-company bets, prevail. I don’t believe in bet-the-company bets. That’s when you’re desperate. That’s the last thing you can do.

“The dirty secret is that a significant majority of big-data projects aren’t producing any valuable, actionable results,” said Michael Walker, a partner at Rose Business Technologies, which helps enterprises build big-data systems. According to a recent report from the research firm Gartner Inc., “through 2017, 60% of big-data projects will fail to go beyond piloting and experimentation and will be abandoned.”

Daily Links 12/16/2014

3) The Convergence of VMS and FMS

The continued adoption of FMS software in 2015 will produce ramifications for other segments of the labor ecosystem, particularly project-based contingent labor. Vendor Management Systems (VMS), which are used primarily to manage temporary staff and contract labor, do not address the specific needs of freelance management.

Yet data science, as a business, is still young. As the technology moves beyond the Internet incubators like Google and Facebook, it has to be applied company by company, in one industry after another.

At this stage, there is a lot of hand craftsmanship rather than software automation.

So the aspiring software companies find themselves training, advising and building pilot projects for their commercial customers. They are acting far more as services companies than they hope to be eventually.

While that may sound like a condition to be remedied, in fact we are living in an era where uncertainty and ambiguity are increasing. The reality is that we can’t shoo it away by becoming more rigid, creating more rules, or imposing more authoritarian controls. We need to loosen control, make more whitespace, give people more autonomy, and rely on the network of loose connections to influence everyone’s actions. We need a climate of soft power in a social network based on sparsity, not density, where weak and lateral connections dominate. That is the wellspring of organizational flexibility and adaptability.

Daily Links 12/10/2014

In his 2003 book, Open Innovation, Henry Chesbrough defined this important concept. In short, open innovation is a product or technology development model that extends beyond the boundaries of a firm to involve others in a collaborative way. Today, much of this activity uses various social networking tools and technologies to empower people to generate ideas, fine-tune concepts, share knowledge or solve critical problems.

When you look at the evolution of digital measurement in the enterprise and study organizations that have achieved a significant degree of maturity, you’ll notice that they come in two distinct flavors: the analytic and the informational. Analytic organizations have strong teams studying the data and driving testing, personalization and customer lifecycle strategies. Informational organizations have widespread, engaged usage of data across the organization with key stakeholders absorbing and using data intelligently to make decisions. It’s not impossible for an enterprise to be both analytic and informational, but the two aren’t necessarily related either. You might expect that organizations that have gotten to be good in measurement would be mature in both areas, but that’s not really the common case. Instead, it seems that most enterprises have either a culture or a problem set that drives them to excel in one direction or the other.

“Garbage in, garbage out” is the cliché of data-haters everywhere. “It is not true that companies need good data to use predictive analytics,” Taylor said. “The techniques can be robust in the face of terrible data, because they were invented by people who had terrible data,” he noted.

Revolution R Open (RRO) is the enhanced distribution of R from Revolution Analytics. RRO is based on version 3.1.1 of the statistical software R and includes additional capabilities for improved performance, reproducibility and platform support.