I’m skeptical because I believe in the science portion of our field’s name. One of the primary things that separates a data scientist from someone who just builds models is the ability to think carefully about endogeneity, causal inference, and experimental and quasi-experimental design. Data scientists must understand data-generating processes and reason through how misspecifying them could bias or undermine the inferences they draw from their analyses.
But what data can do is disprove things, often quite easily. Scott Winship will argue to the death that Piketty’s market-income data is not the best kind of data for understanding changes in income inequality, but given that data, what you can’t do is proclaim or expound a theory explaining a decrease in market income inequality.