data science, links

Daily Links 08/31/2017

The current state of applied data science [Ben Lorica / O’Reilly Radar]

Key points from the article:

  • Lack of training data remains the primary bottleneck in machine learning projects
  • Think about features, not algorithms
    • Data enrichment can potentially improve your existing models. It is sometimes overlooked, since it is often not considered as glamorous as model and algorithm development
  • The role of machine learning engineer has recently emerged to streamline the process of productionizing data science projects
  • Model and algorithm development gets all the media coverage but this is usually not as pressing as developing good training data sets and productionizing data science projects

What this overview misses is the rise of the data product manager. You can’t do good data science without a thorough understanding of business and market requirements. A good data product manager will provide that, and direct the data science team towards useful projects. You can’t take good, well-framed problems as a given; that’s one of the biggest challenges of data science.


Bad code isn’t technical debt, it’s an unhedged call option [Frances Lash]

Therefore, even if it is more expensive to do things clean from the start, it would also be less risky. A messy system is full of unhedged calls that can be called upon at an unpredictable cost to the organization. Technical debt does not communicate the risk of writing sloppy or quick fixes into code – debt is something to be managed and just another tool. When talking about implausible delivery dates it may make more sense to talk about unhedged call options.

What artificial intelligence can and can’t do right now [Andrew Ng / Harvard Business Review]

If a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future.

How your brain tricks you into thinking more expensive wine tastes better [David DiSalvo / Forbes]

As predicted, the volunteers rated the allegedly higher-priced wine as tasting better than the allegedly cheaper wine. The MRI scan showed that when those evaluations were made, two parts of the volunteers’ brains experienced greater activity: the medial pre-frontal cortex and the ventral striatum. That’s important because those two areas are especially involved in evaluating expectations and seeking rewards. When we see a higher price, our brain links the price to greater expectation of reward, which changes our perception – in this case, taste.

Why the AI hype train is already off the rails and why I’m over AI already [Dat Tran / Built to Adapt]

In those early years of big data, the outcome was always less than perfect. Most of the work ended up in powerpoint presentations without ever going into production because most teams simply did not have the right infrastructure or culture to maintain them.


Daily Links 08/21/2017

This Is How Sexism Works in Silicon Valley [Ellen Pao / The Cut]

Painful to read. Is there any woman who has had a career in Silicon Valley of any length who doesn’t have a similar story she could tell? Perhaps not as extreme, but still as discouraging? Yes, in every story we implicate ourselves too, because we’re human and to succeed we must relate, but the structural sexism makes it impossible for us to succeed in the way that men can and do.

Kudos and thanks to Pao for putting herself out there for change.

I Didn’t Complain to HR [Donna Harris / Medium]

I didn’t complain to HR because, like nearly every woman on the planet, I was doing what I was taught my whole life to do. Be nice.

Have you ever been too nice and ended up in a situation that could’ve been avoided if you just would’ve been an asshole?

I’m teaching my daughters they don’t always need to be nice. That may mean going to HR at times, although unfortunately I know, like every other woman, the perils of going down that particular path.

Women are taught to be nice because that is what works for us, what is allowed in a working environment for women trying to get shit done. When we’re not nice we’re called abrasive or strident — critiques rarely leveled at men who behave similarly.

Women face different rules for success than men do, with far narrower pathways to follow to get to the treasure. Speaking up about problems comes with consequences that not every woman is ready to bear. Do you want to hear that your marriage is a sham, like Pao did? That you were “bad at your job, crazy, and an embarrassment”? Those are all feelings many women (and surely men too) fear about themselves on a daily basis anyway! Why would you want to subject yourself to hearing about them in a meeting, a conference call or worse yet, in court, or in the press?

Walmart’s Customer Dissatisfaction Detection Patent [Anne Zelenka / emotion know]

Just like when we shop with them online, retailers want to identify us and then optimize customer service using analytics when we’re shopping with them in person. This is not the apocalypse, though I agree, if deployed, this customer dissatisfaction detection system could be “invasive, annoying, and prone to errors.” It’s based upon a flawed understanding of how emotions work, assuming that we (and AIs) can detect basic emotions from biomarkers. It can be done, but you need to teach the AI more emotional granularity than described in the patent filing.

ai, links, work

Daily Links 08/15/2017

Sheryl Sandberg: Develop Your Voice, Not Your Brand [Theodore Kinni / Stanford GSB]

The idea of developing your personal brand is a bad one, according to Sandberg. “People aren’t brands,” she says. “That’s what products need. They need to be packaged cleanly, neatly, concretely. People aren’t like that.”

“Who am I?” asks Sandberg. “I am the COO of Facebook, a company I deeply believe in. I’m an author. I’m a mom. I’m a widow. At some level, I’m still deeply heartbroken. I am a friend and I am a sister. I am a lot of very messy, complicated things. I don’t have a brand, but I have a voice.”

How to Monetize Your Podcast by Selling Your Recording Equipment [Dustin Meadows / Hard Style]

I was once like you. I thought, “man, I’ll bet there’s a lot of people out there who wanna hear about the plot holes I’ve discovered in the Marvel Cinematic Universe,” but it turns out like 10,000 other people have already made podcasts about the plot holes in the Marvel Cinematic Universe and about 9,998 of them were better than mine.

Facebook’s AI Robots Shut Down After They Start Talking to Each Other in Their Own Language [Andrew Griffin / The Independent]

Facebook abandoned an experiment after two artificially intelligent programs appeared to be chatting to each other in a strange language only they understood.

The two chatbots came to create their own changes to English that made it easier for them to work – but which remained mysterious to the humans that supposedly look after them….

The robots had been instructed to work out how to negotiate between themselves, and improve their bartering as they went along. But they were not told to use comprehensible English, allowing them to create their own “shorthand”, according to researchers.



Daily Links 08/11/2017

Machine Learning vs Statistics: The Texas Death Match of Data Science [Tom Fawcett & Drew Harry / Silicon Valley Data Science]

Since decisions still have to be made, statistics provides a framework for making better decisions. To do this, statisticians need to be able to assess the probabilities associated with various outcomes. And to do that, statisticians use models. In statistics, the goal of modeling is approximating and then understanding the data-generating process, with the goal of answering the question you actually care about.

In contrast to Statistics, note that the goal here is to generate the best prediction. The ML practitioner usually does some exploratory data analysis, but only to prepare the data and to guide the choice of features and a model family. The model does not represent a belief about or a commitment to the data generation process. Its purpose is purely functional. No ML practitioner would be prepared to testify to the “validity” of a model; this has no meaning in Machine Learning, since the model is really only instrumental to its performance. The motto of Machine Learning may as well be: The proof of the model is in the test set.

I’m a woman in computer science. Let me ladysplain the Google memo to you. [Cynthia Lee / Vox]

To be a woman in tech is to know the thrill of participating in one of the most transformative revolutions humankind has known, to experience the crystalline satisfaction of finding an elegant solution to an algorithmic challenge, to want to throw the monitor out the window in frustration with a bug and, later, to do a happy dance in a chair while finally fixing it. To be a woman in tech is also to always and forever be faced with skepticism that I do and feel all those things authentically enough to truly belong. There is always a jury, and it’s always still out.

This Morning Routine Will Save You 20+ Hours a Week [Benjamin P. Hardy / Inc.]

The same concept applies to work. The best work happens in short intensive spurts. By short, I’m talking 1-3 hours. But this must be “Deep Work,” with no distractions, just like an intensive workout is non-stop. Interestingly, your best work – which for most people is thinking – will actually happen while you’re away from your work, “recovering.”

For best results: Spend 20% of your energy on your work and 80% of your energy on recovery and self-improvement. When you’re getting high quality recovery, you’re growing. When you’re continually honing your mental model, the quality and impact of your work continually increases. This is what psychologists call, “Deliberate Practice.” It’s not about doing more, but better training. It’s about being strategic and results-focused, not busyness-focused.

The Labyrinth of Life [Martha Beck]

Today, if you’re confronting an issue for the ten thousandth time, or feeling that your life is going nowhere, or panicking over how little you’ve achieved, stop and breathe. You’re not falling behind on some linear race through time. You’re walking the labyrinth of life. Yes, you’re meant to move forward, but almost never in a straight line. Yes, there’s an element of achievement, of beginning and ending, but those are minor compared to the element of being here now. In the moments you stop trying to conquer the labyrinth of life and simply inhabit it, you’ll realize it was designed to hold you safe as you explore what feels dangerous. You’ll see that you’re exactly where you’re meant to be, meandering along a crooked path that is meant to lead you not onward, but inward.

ai, personal, psychology, research, work

Introducing emotion/know

I’m excited to introduce you to my next project, emotion/know. This has been percolating for a while, but a change in employment status has given me the time and resources to make it a top priority.

Long term, my goal is to build artificial intelligence that supports and promotes emotional regulation and mastery. Short term, the goal is to learn more about emotions and have fun.

ai, data science, links

Daily Links 04/11/2017

Demystifying data science

The key to a successful analytical model is having a robust set of variables against which to test for their predictive capabilities. And the key to having a robust set of variables from which to test is to get the business users engaged early in the process.

How machine learning is shaking up e-commerce and customer engagement

From a content perspective, [Sitecore] performs semantic analysis to:

  • Auto generate taxonomies and tagging
  • Help improve the tone of your content by analyzing for things like wordiness, slang, and other grammar-like faux pas

From a digital marketing perspective, ML can:

  • Help detect segments of your customers or audience
  • Improve the effectiveness of your testing and optimization processes
  • Provide content and product recommendations that increase the engagement time a customer spends on your website.

And from a backend perspective, it can help with fraud detection, something that every company with an e-commerce model needs to monitor actively.

Gartner 2017 magic quadrant for data science platforms: gainers and losers

Firms covered:

  • Leaders (4): IBM, SAS, RapidMiner, KNIME
  • Challengers (4): MathWorks (new), Quest (formerly Dell), Alteryx, Angoss
  • Visionaries (5): Microsoft, (new), Dataiku (new), Domino Data Lab (new), Alpine Data
  • Niche Players (3): FICO, SAP, Teradata (new)

Gartner notes that even the lowest-scoring vendors in MQ are still among the top 16 firms among over 100 vendors in the heated Data Science market.

Among those not on the quadrant, I’ve been impressed by DataRobot.


Daily Links 04/05/2017

New technology pushes machine smarts to the edge

“The set of possible smart edge devices that can be used for industrial control is rapidly expanding as ever more compute and sensing capability moves to the edge,” says Greg Olsen, senior vice president, products, at Falkonry. “As long as the device can transform signal observation into operational commands or guidance, it can be considered a control device. Smartness is clearly subjective, but the range can include anything from advanced process control all the way up to artificial intelligence.”

Want to be happier and more successful? Learn to like other people

It sounds paradoxical, but according to University of Georgia researcher Jason Colquitt and his colleagues, people who tend to trust others at work score higher on a range of measures than those who don’t, from job performance to commitment to the team. And since we know that it’s our relationships—particularly with our bosses and colleagues—that determine how happy and successful we are as our careers progress, it may be worth asking some new questions. Instead of, “How can I improve?” the better question might be, “How can I start seeing more of the good in people, more often?”

Google’s Cloud Jobs API

Company career sites, job boards and applicant tracking systems can improve candidate experience and company hiring metrics with job search and discovery powered by sophisticated machine learning. The Cloud Jobs API provides highly intuitive job search that anticipates what job seekers are looking for and surfaces targeted recommendations that help them discover new opportunities. In order to provide the most relevant search results and recommendations, the API uses machine learning to understand how job titles and skills relate to one another, and what job content, location and seniority are the closest match for a jobseeker’s preferences.