Statistics & Econometrics
Traditionally, analysts use data on stopped individuals to study bias by computing the difference in violence rates between stopped minority and white civilians, while controlling for observable differences between these two sets of encounters. We term this the “naïve estimator” … However, without further assumptions, this quantity will have no causal interpretation so long as […]
In earlier blog posts, yours truly has discussed the problems of confounding and ‘overcontrolling’ in causal analysis. A good illustration of how attempts to control for additional variables can sometimes worsen rather than improve causal estimates is the so-called M-bias problem. Let me give an example from economics to illustrate the issue. Estimating causal relationships […]
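As a rough illustration of M-bias, the simulation below compares an unadjusted and an adjusted regression slope. It is a minimal sketch, not the economics example the post goes on to give: the variable names (`u1`, `u2`, `m`), the unit coefficients, and the true effect of 1.0 are all assumptions made for the illustration. Two unobserved causes `u1` and `u2` both affect a pre-treatment variable `m`, while `u1` also affects the treatment `x` and `u2` also affects the outcome `y`.

```python
import random


def cov(a, b):
    """Sample covariance (population form) of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n


def slope_unadjusted(x, y):
    """OLS slope from regressing y on x alone."""
    return cov(x, y) / cov(x, x)


def slope_adjusted(x, m, y):
    """OLS coefficient on x from regressing y on x and m (2x2 normal equations)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    return (smm * sxy - sxm * smy) / (sxx * smm - sxm ** 2)


def simulate_m_bias(n=50000, true_effect=1.0, seed=1):
    """Hypothetical M-structure: x <- u1 -> m <- u2 -> y, plus x -> y."""
    rng = random.Random(seed)
    u1 = [rng.gauss(0, 1) for _ in range(n)]
    u2 = [rng.gauss(0, 1) for _ in range(n)]
    x = [a + rng.gauss(0, 1) for a in u1]                # treatment
    m = [a + b + rng.gauss(0, 1) for a, b in zip(u1, u2)]  # collider
    y = [true_effect * xi + b + rng.gauss(0, 1) for xi, b in zip(x, u2)]
    return slope_unadjusted(x, y), slope_adjusted(x, m, y)


unadjusted, adjusted = simulate_m_bias()
print(f"unadjusted slope: {unadjusted:.3f}")
print(f"slope after controlling for m: {adjusted:.3f}")
```

Under these assumed coefficients the unadjusted slope is consistent for the true effect of 1.0, whereas the population value of the adjusted slope is 0.8: conditioning on `m` opens the path from `x` through `u1` and `u2` to `y`.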
Judea Pearl, in his The Book of Why, discusses the problems that arise if we thoughtlessly try to ‘control’ for too much in our quest to identify causal relationships. One of his examples concerns the paradox that when we want to find out whether mothers’ smoking increases the risk of infant mortality, but only study […]
A helpful intuition for understanding ‘collider bias’ — the spurious correlation induced when one controls for a common effect rather than a common cause — can be found in modern dating behaviour. In a population at large, it is reasonable to assume that attractiveness and personality are independent. Mean people are not necessarily attractive, nor […]
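The dating intuition can be sketched as a small simulation. This is an illustrative sketch under assumed conditions, not the post's own calculation: attractiveness and personality are drawn as independent standard normals, and the selection rule (date someone only if the sum of the two traits exceeds a hypothetical `threshold`) is an assumption chosen to make the collider explicit.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_dating(n=20000, threshold=1.0, seed=0):
    """Return the trait correlation in the full population and among those dated."""
    rng = random.Random(seed)
    attractiveness = [rng.gauss(0, 1) for _ in range(n)]
    personality = [rng.gauss(0, 1) for _ in range(n)]
    # Being dated is a common effect (collider) of both traits.
    dated = [(a, p) for a, p in zip(attractiveness, personality)
             if a + p > threshold]
    r_all = pearson(attractiveness, personality)
    r_dated = pearson([a for a, _ in dated], [p for _, p in dated])
    return r_all, r_dated


r_all, r_dated = simulate_dating()
print(f"correlation in the whole population: {r_all:.3f}")
print(f"correlation among people dated:      {r_dated:.3f}")
```

In the full population the correlation is close to zero by construction, while within the selected group it is clearly negative, even though neither trait causes the other.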
A classic textbook example of the ‘Table 2 Fallacy’ in economics arises when estimating the return to education and misinterpreting regression coefficients. Suppose an economist wishes to estimate the causal effect of an additional year of schooling on earnings and estimates a regression. If the estimated coefficient is small and statistically insignificant, the economist might conclude […]
Despite the pervasive uncertainties surrounding their assumptions, economic modellers frequently confine their expressions of uncertainty to those generated from within the very assumptions they have chosen to embed in their models — thereby creating a false sense of precision while ignoring deeper sources of ignorance. Econometrics offers a series of cautionary tales in which highly […]
In The Book of Why, Judea Pearl puts forward several compelling reasons why the now so popular causal graph-theoretic approach is to be preferred over more traditional regression-based explanatory models. One reason is that causal graphs are non-parametric and therefore do not need to assume, for example, additivity and/or the absence of interaction effects — […]
Taken as a measure of causal explanatory power, R squared does not fare any better. The problem of explaining variances rather than levels shows up here as well—if it measures causal influence, it has to be influences on variances. But we often do not care about the causes of variance in economic variables but instead […]
In simple (and multiple) regression analysis for cross-sectional data, researchers often estimate regressions such as “regress test score (y) on study hours (x)” and obtain a result of the form y = constant + slope coefficient × x + error term. When speaking of increases or decreases in x in these interpretations, we must remember […]