data science, links

Daily Links 08/31/2017

The current state of applied data science [Ben Lorica / O’Reilly Radar]

Key points from the article:

  • Lack of training data remains the primary bottleneck in machine learning projects
  • Think about features, not algorithms
    • Data enrichment can improve your existing models, but it is often overlooked because it is not considered as glamorous as model and algorithm development
  • The role of machine learning engineer has recently emerged to streamline the process of productionizing data science projects
  • Model and algorithm development gets all the media coverage, but it is usually not as pressing as developing good training data sets and productionizing data science projects

What this overview misses is the rise of the data product manager. You can’t do good data science without a thorough understanding of business and market requirements. A good data product manager will provide that, and direct the data science team towards useful projects. You can’t take good, well-framed problems as a given; framing them is one of the biggest challenges of data science.

Bad code isn’t technical debt, it’s an unhedged call option [Frances Lash]

Therefore, even if it is more expensive to do things clean from the start, it would also be less risky. A messy system is full of unhedged calls that can be called upon at an unpredictable cost to the organization. Technical debt does not communicate the risk of writing sloppy or quick fixes into code – debt is something to be managed and just another tool. When talking about implausible delivery dates it may make more sense to talk about unhedged call options.

What artificial intelligence can and can’t do right now [Andrew Ng / Harvard Business Review]

If a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future.

How your brain tricks you into thinking more expensive wine tastes better [David DiSalvo / Forbes]

As predicted, the volunteers rated the allegedly higher-priced wine as tasting better than the allegedly cheaper wine. The MRI scan showed that when those evaluations were made, two parts of the volunteers’ brains experienced greater activity: the medial pre-frontal cortex and the ventral striatum. That’s important because those two areas are especially involved in evaluating expectations and seeking rewards. When we see a higher price, our brain links the price to greater expectation of reward, which changes our perception – in this case, taste.

Why the AI hype train is already off the rails and why I’m over AI already [Dat Tran / Built to Adapt]

In those early years of big data, the outcome was always less than perfect. Most of the work ended up in powerpoint presentations without ever going into production because most teams simply did not have the right infrastructure or culture to maintain them.

ai, links, work

Daily Links 08/15/2017

Sheryl Sandberg: Develop Your Voice, Not Your Brand [Theodore Kinni / Stanford GSB]

The idea of developing your personal brand is a bad one, according to Sandberg. “People aren’t brands,” she says. “That’s what products need. They need to be packaged cleanly, neatly, concretely. People aren’t like that.”

“Who am I?” asks Sandberg. “I am the COO of Facebook, a company I deeply believe in. I’m an author. I’m a mom. I’m a widow. At some level, I’m still deeply heartbroken. I am a friend and I am a sister. I am a lot of very messy, complicated things. I don’t have a brand, but I have a voice.”

How to Monetize Your Podcast by Selling Your Recording Equipment [Dustin Meadows / Hard Style]

I was once like you. I thought, “man, I’ll bet there’s a lot of people out there who wanna hear about the plot holes I’ve discovered in the Marvel Cinematic Universe,” but it turns out like 10,000 other people have already made podcasts about the plot holes in the Marvel Cinematic Universe and about 9,998 of them were better than mine.

Facebook’s AI Robots Shut Down After They Start Talking to Each Other in Their Own Language [Andrew Griffin / The Independent]

Facebook abandoned an experiment after two artificially intelligent programs appeared to be chatting to each other in a strange language only they understood.

The two chatbots came to create their own changes to English that made it easier for them to work – but which remained mysterious to the humans that supposedly look after them….

The robots had been instructed to work out how to negotiate between themselves, and improve their bartering as they went along. But they were not told to use comprehensible English, allowing them to create their own “shorthand”, according to researchers.



Daily Links 08/11/2017

Machine Learning vs Statistics: The Texas Death Match of Data Science [Tom Fawcett & Drew Harris / Silicon Valley Data Science]

Since decisions still have to be made, statistics provides a framework for making better decisions. To do this, statisticians need to be able to assess the probabilities associated with various outcomes. And to do that, statisticians use models. In statistics, the goal of modeling is approximating and then understanding the data-generating process, with the goal of answering the question you actually care about.

In contrast to Statistics, note that the goal here is to generate the best prediction. The ML practitioner usually does some exploratory data analysis, but only to prepare the data and to guide the choice of features and a model family. The model does not represent a belief about or a commitment to the data generation process. Its purpose is purely functional. No ML practitioner would be prepared to testify to the “validity” of a model; this has no meaning in Machine Learning, since the model is really only instrumental to its performance. The motto of Machine Learning may as well be: The proof of the model is in the test set.
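
To make “the proof of the model is in the test set” concrete, here is a minimal sketch of holdout evaluation. It is not from the article: the synthetic data, the one-feature threshold “model,” and the function names are all assumptions chosen to keep the example self-contained. The point is only that the model is judged on rows it never saw during fitting.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(rows, threshold):
    """Fraction of rows where 'x >= threshold' agrees with the label."""
    correct = sum((x >= threshold) == label for x, label in rows)
    return correct / len(rows)


def fit_threshold(train):
    """Hypothetical 'model': the cut point on x with the best training accuracy."""
    best_t, best_acc = None, -1.0
    for t in sorted(x for x, _ in train):
        acc = accuracy(train, t)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


# Synthetic data (an assumption for illustration): positives are centered
# at +1, negatives at -1, both with unit standard deviation.
data_rng = random.Random(42)
data = [
    (data_rng.gauss(1.0 if label else -1.0, 1.0), label)
    for label in [True, False] * 100
]

train, test = train_test_split(data)
threshold = fit_threshold(train)          # fitted on training rows only
test_accuracy = accuracy(test, threshold)  # judged on held-out rows only
print(f"threshold={threshold:.3f} test accuracy={test_accuracy:.2f}")
```

Nothing here says the threshold reflects how the data were generated; it is kept or discarded purely on its held-out score, which is the contrast with statistical modeling the excerpt draws.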

I’m a woman in computer science. Let me ladysplain the Google memo to you. [Cynthia Lee / Vox]

To be a woman in tech is to know the thrill of participating in one of the most transformative revolutions humankind has known, to experience the crystalline satisfaction of finding an elegant solution to an algorithmic challenge, to want to throw the monitor out the window in frustration with a bug and, later, to do a happy dance in a chair while finally fixing it. To be a woman in tech is also to always and forever be faced with skepticism that I do and feel all those things authentically enough to truly belong. There is always a jury, and it’s always still out.

This Morning Routine Will Save You 20+ Hours a Week [Benjamin P. Hardy / Inc.]

The same concept applies to work. The best work happens in short intensive spurts. By short, I’m talking 1-3 hours. But this must be “Deep Work,” with no distractions, just like an intensive workout is non-stop. Interestingly, your best work – which for most people is thinking – will actually happen while you’re away from your work, “recovering.”

For best results: Spend 20% of your energy on your work and 80% of your energy on recovery and self-improvement. When you’re getting high quality recovery, you’re growing. When you’re continually honing your mental model, the quality and impact of your work continually increases. This is what psychologists call, “Deliberate Practice.” It’s not about doing more, but better training. It’s about being strategic and results-focused, not busyness-focused.

The Labyrinth of Life [Martha Beck]

Today, if you’re confronting an issue for the ten thousandth time, or feeling that your life is going nowhere, or panicking over how little you’ve achieved, stop and breathe. You’re not falling behind on some linear race through time. You’re walking the labyrinth of life. Yes, you’re meant to move forward, but almost never in a straight line. Yes, there’s an element of achievement, of beginning and ending, but those are minor compared to the element of being here now. In the moments you stop trying to conquer the labyrinth of life and simply inhabit it, you’ll realize it was designed to hold you safe as you explore what feels dangerous. You’ll see that you’re exactly where you’re meant to be, meandering along a crooked path that is meant to lead you not onward, but inward.

ai, data science, links

Daily Links 04/11/2017

Demystifying data science

The key to a successful analytical model is having a robust set of variables against which to test for their predictive capabilities. And the key to having a robust set of variables from which to test is to get the business users engaged early in the process.

How machine learning is shaking up e-commerce and customer engagement

From a content perspective, [Sitecore] performs semantic analysis to:

  • Auto generate taxonomies and tagging
  • Help improve the tone of your content by analyzing for things like wordiness, slang, and other grammar-like faux pas

From a digital marketing perspective, ML can:

  • Help detect segments of your customers or audience
  • Improve the effectiveness of your testing and optimization processes
  • Provide content and product recommendations that increase the engagement time a customer spends on your website.

And from a backend perspective, it can help with fraud detection, something that every company with an e-commerce model needs to monitor actively.

Gartner 2017 magic quadrant for data science platforms: gainers and losers

Firms covered:

  • Leaders (4): IBM, SAS, RapidMiner, KNIME
  • Challengers (4): MathWorks (new), Quest (formerly Dell), Alteryx, Angoss
  • Visionaries (5): Microsoft, (new), Dataiku (new), Domino Data Lab (new), Alpine Data
  • Niche Players (3): FICO, SAP, Teradata (new)

Gartner notes that even the lowest-scoring vendors in MQ are still among the top 16 firms among over 100 vendors in the heated Data Science market.

Among those not on the quadrant, I’ve been impressed by DataRobot.


Daily Links 04/05/2017

New technology pushes machine smarts to the edge

“The set of possible smart edge devices that can be used for industrial control is rapidly expanding as ever more compute and sensing capability moves to the edge,” says Greg Olsen, senior vice president, products, at Falkonry. “As long as the device can transform signal observation into operational commands or guidance, it can be considered a control device. Smartness is clearly subjective, but the range can include anything from advanced process control all the way up to artificial intelligence.”

Want to be happier and more successful? Learn to like other people

It sounds paradoxical, but according to University of Georgia researcher Jason Colquitt and his colleagues, people who tend to trust others at work score higher on a range of measures than those who don’t, from job performance to commitment to the team. And since we know that it’s our relationships, particularly with our bosses and colleagues, that determine how happy and successful we are as our careers progress, it may be worth asking some new questions. Instead of, “How can I improve?” the better question might be, “How can I start seeing more of the good in people, more often?”

Google’s Cloud Jobs API

Company career sites, job boards and applicant tracking systems can improve candidate experience and company hiring metrics with job search and discovery powered by sophisticated machine learning. The Cloud Jobs API provides highly intuitive job search that anticipates what job seekers are looking for and surfaces targeted recommendations that help them discover new opportunities. In order to provide the most relevant search results and recommendations, the API uses machine learning to understand how job titles and skills relate to one another, and what job content, location and seniority are the closest match for a jobseeker’s preferences.


ai, data science, links

Daily Links 04/04/2017

Emotion Detection and Recognition from Text Using Deep Learning

The researchers used a data set of short English text messages labeled by Mechanical Turkers with five emotion classes: anger, sadness, fear, happiness, and excitement. A multi-layered neural network was trained to classify text messages by emotion. The model was able to classify anger, sadness, and excitement well but didn’t do well at recognizing fear.

Adapting ideas from neuroscience for AI

We don’t really know why neurons spike. One theory is that they want to be noisy so as to regularize, because we have many more parameters than we have data points. The idea of dropout [a technique developed to help prevent overfitting] is that if you have noisy activations, you can afford to use a much bigger model. That might be why they spike, but we don’t know. Another reason why they might spike is so they can use the analog dimension of time, to code a real value at the time of the spike. This theory has been around for 50 years, but no one knows if it’s right. In certain subsystems, neurons definitely do that, like in judging the relative time of arrival of a signal to two ears so you can get the direction.
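
For readers who haven’t seen dropout, here is a minimal sketch of the idea of “noisy activations.” It uses the common “inverted dropout” variant, which is an assumption on my part rather than something the interview specifies: each activation is zeroed with probability `drop_prob` during training, and the survivors are scaled up so the expected value stays the same. The function name and parameters are illustrative only.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout over a list of activations (illustrative sketch).

    During training, each activation is zeroed with probability drop_prob
    and the rest are divided by the keep probability, so the expected
    value of each output matches the input. At inference time the
    activations pass through unchanged.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]


rng = random.Random(0)
hidden = [1.0] * 10_000

noisy = dropout(hidden, drop_prob=0.5, rng=rng)                 # training pass
clean = dropout(hidden, drop_prob=0.5, rng=rng, training=False)  # inference pass

print(f"mean while training:  {sum(noisy) / len(noisy):.3f}")
print(f"mean at inference:    {sum(clean) / len(clean):.3f}")
```

Each unit is unreliable on any single training pass, so the network cannot lean too heavily on any one of them, which is the regularizing effect the quote alludes to.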

Five AI Startup Predictions for 2017

My favorite: “Full stack AI startups actually work”

When you focus on a vertical, you can find high level customer needs that we can meet better with AI, or new needs that can’t be met without AI. These are terrific business opportunities, but they require much more business savvy and subject matter expertise. The generally more technical crowd starting AI startups tend to have neither, and tend to not realize the need for or have the humility to bring in the business and subject matter expertise required to ‘move up the stack’ or ‘go full stack’ as I like to call it.

The Silicon Gourmet: training a neural network to generate cooking recipes

Pears Or To Garnestmeam


¼ lb bones or fresh bread; optional
½ cup flour
1 teaspoon vinegar
¼ teaspoon lime juice
2  eggs

Brown salmon in oil. Add creamed meat and another deep mixture.

Discard filets. Discard head and turn into a nonstick spice. Pour 4 eggs onto clean a thin fat to sink halves.

Brush each with roast and refrigerate.  Lay tart in deep baking dish in chipec sweet body; cut oof with crosswise and onions.  Remove peas and place in a 4-dgg serving. Cover lightly with plastic wrap.  Chill in refrigerator until casseroles are tender and ridges done.  Serve immediately in sugar may be added 2 handles overginger or with boiling water until very cracker pudding is hot.

Yield: 4 servings

Also see In Which a Neural Network Learns to Tell Knock-Knock Jokes

that's random

Has the term AI become meaningless? How about MI instead

Ian Bogost writing for The Atlantic says that in too many cases today “artificial intelligence” is just another name for a fancy computer program. I don’t see it that way. I know from experience that what most data scientists are building is entirely different from what rank-and-file software developers are building. We use different tools and different approaches. And the data-driven learning algorithms we deploy at their best solve an entirely different class of problems than regular computer programs do.

Personally I like to call what we build “machine intelligence” rather than “artificial intelligence” because machine intelligence is really an alternative kind of intelligence, not an artificial version of human intelligence.

No, it’s not “Making computers act like they do in the movies” as Bogost quotes AI researcher Charles Isbell. That is too glib indeed. Why not let machines do what they do best rather than just serve as poor imitators of humans?

Part of what makes “artificial intelligence” feel a bit underwhelming is that we’ve barely begun to see what we might achieve with machine intelligence. Yes, self-driving cars are pretty amazing. I don’t have one myself (can’t afford a Tesla, darn) but I do adore the parking sensors on my SUV. They allow me to navigate around the dangerously-placed porch jutting out by the attached garage set back to the rear of my house. If I didn’t have them I probably would have hit the porch at least once already. The car can parallel park itself too but I’ve only tried that once, before I bought the car, with the salesperson sitting next to me.

I have faith that we are going to see many more amazing machine intelligence capabilities come out, as startups and big companies start focusing on vertical artificial intelligence in specific domains rather than continuing to build out horizontal machine learning capabilities for use by data scientists. Vertical AI (or MI) is tough. That’s where you have to get domain experts and data scientists together and figure out how to encode domain expertise and capabilities into machine learning models. It’s tough and slow work. I know. I’ve been doing it for a few years now in the temporary workforce management space. We’re beginning to see the payoff though, and that is truly exciting.

If you want to hear about it, I’m going to be at VMSA Live in Phoenix in early April talking about Machine Intelligence in Talent during the Executive Gateway session on Wednesday, April 5th. If you’ll be there, stop by.