statistics

America: the rot goes on

Published by Anonymous (not verified) on Tue, 09/08/2022 - 7:17am in

NYC subway rot s

It’s been a while since I looked at one of the major reasons for the pervasive sense of rot about the US: the low level of investment—investment in real things, that is, not crypto. It’s barely keeping up with the forces of decay. If you’re wondering why nothing works and everything seems to be falling apart, here are some explanations.

First a definition: investment is spending by businesses, governments, and individuals on long-lived physical assets like buildings and machinery. Gross investment is the dollar value of such spending; net is what remains after deducting depreciation, aka wear and tear. That’s not an easy process to put a dollar value on, but it’s all we’ve got. And besides, these are numbers the capitalist state produces to understand its economy, so why not take them seriously, even if the bourgeoisie seems unalarmed about them?

Graphed below is the average value of net public and private investment as a percentage of GDP by decade. Civilian public investment means expenditures on long-lived assets like schools and roads, but excluding the military. (To anticipate a question I sometimes get: yes, prisons are in there too, but they don’t count for much; almost all the costs of maintaining the carceral state come from day-to-day operations.) Private investment consists of purchases of buildings, equipment, and intellectual property (IP) by businesses. Not shown on this first graph: residential investment, the purchase of housing by individuals and improvements to that housing.

Net investment by decade

Averages for the 1930s reflect the extraordinary circumstances of the Great Depression: private investment collapsed and New Deal-driven public investment soared. Those high levels of public investment gave us an infrastructure that we still use today—schools, post offices, and parks. (For a catalog of those projects, check out the Living New Deal.) Public investment sagged during the 1940s, reflecting World War II, but rose in the 1950s and 1960s, matching the 1930s level as the public sector expanded. It was not to last: austerity and privatization consciousness took over, and now net public investment is at a record low.

Private investment rose in the decades after World War II, peaking in the 1970s. But the Wall Street-driven imperatives of profit maximization that got the upper hand via the Shareholder Revolution of the 1980s, which transformed corporate practices, put the squeeze on investment. Investing too large a share of corporate profits in things came to be seen as wasteful—better instead to hand the cash over to the shareholders, via stock buybacks and traditional dividends.

Here’s a yearly view of the trajectory of decline, a path traced by the dotted trendlines. The graphs begin in 1950 because the extremes of the 1930s and 1940s would have distorted the scale.

Net investment, public & private, yearly

These graphs show a relative stability in net private investment from the late 1950s through the early 1980s, when Wall Street’s grip on corporate cash flow tightened. There was a surge in the late 1990s, the period of the New Economy mania and the early commercial internet—an enthusiasm which was at least backed up with investment in the technology that was supposed to bring about the future. We haven’t seen much of that in the latest iteration of tech mania, the era of Uber and Airbnb.

Here’s a look at some components of private investment. The equipment and especially the structures trendlines show a persistent downward path. Against that, IP’s rise stands out—to the point where it’s surpassed investment in buildings and is rivaling equipment. Both equipment and structures used to be several times IP. Capitalists are spending less money on things that are supposed to promote general prosperity and more on legal arrangements that protect theirs.

Net private investment by type

Low levels of net private investment aren’t driven by declines in gross investment, which has been pretty stable. Instead, the major reasons for the decline are a shift away from buildings and toward shorter-lived equipment and immaterial intellectual property (IP). From 1950 to 1999, net fixed private investment averaged 32% of gross; since 2000, it’s averaged 20%, and just 16% since 2020. Every asset category has seen that shift, even buildings.
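The arithmetic behind those ratios is easy to sketch. Net investment is gross investment minus depreciation, so a net-to-gross share of 32% means depreciation absorbed the other 68%. The function names below are illustrative assumptions; the only inputs are the percentages quoted above.

```python
def net_investment(gross, depreciation):
    """Net investment: gross spending minus wear and tear."""
    return gross - depreciation


def depreciation_share(net_share_of_gross):
    """Share of gross investment that only replaces worn-out capital.

    Since net = gross - depreciation, depreciation / gross = 1 - net / gross.
    """
    return 1.0 - net_share_of_gross


# Net-to-gross ratios quoted in the text (fixed private investment).
periods = {"1950-1999": 0.32, "since 2000": 0.20, "since 2020": 0.16}

for label, net_share in periods.items():
    share = depreciation_share(net_share)
    print(f"{label}: {share:.0%} of gross investment goes to depreciation")
```

Read this way, the same gross spending buys less and less net addition to the capital stock as the asset mix tilts toward things that wear out quickly.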

Intellectual property investment, whose share of business investment grew from 8% in 1950 to 40% today, adds another layer of fleetingness to the story. Business ideologues love to tout IP as a stimulant to innovation; who’d invent anything if they couldn’t patent it? Lots of people would, actually. That aside, most basic innovation in sectors like computing and pharmaceuticals has been funded by public entities, not by the private companies that then appropriate those innovations to profit from research they didn’t pay for. Instead of supporting innovations, a lot of IP investment is about trying to establish monopolies, be it in the latest variation on an antidepressant or a Disney cartoon character. But even here the trend towards shorter-lived assets is visible: net IP investment went from 27% of gross in the 1950s and 1960s to 16% since.

For the public sector, the decline in net investment has been more dramatic, falling from around 2% of GDP in the early decades on the graph to 0.4% since 2020. (It’s 0.3% so far in 2022.) Like the private sector, we’ve seen a shift towards shorter-lived assets, but unlike the private sector, we’ve also seen a decline in gross investment, which fell by almost half between the 1960s and 2020s. Net public investment as a percentage of gross went from 67% in the 1950s and 1960s to 27% in the 2020s. Net federal civilian investment is just 0.1% of GDP so far this decade, a third its 1950–1999 average. State and local investment has fallen harder, down by almost three quarters from that 50-year average to 0.5% in the 2020s (0.3% so far this year).

Net residential investment

And as the graph above shows, residential net investment isn’t doing too great either: it went from an average of 2.8% of GDP from 1950 to 1999 to 1.7% in the 2020s. Unlike the mid-2000s housing bubble, which took net residential investment up to 3.8%, the highest since the early post-World War II years, the latest bubble took net housing investment up to just 1.9% of GDP last year. It’s fallen back to 1.4% in 2022. That’s not the way to meet a housing deficit estimated by Freddie Mac at 3.8 million units.

The burst of net private investment in the late 1990s gave us a major productivity acceleration, but it was not to last. And the burst in civilian public investment from the early 1950s through the late 1960s gave us interstate highways, schools, and state university systems. The long declines in net investment, both private and public, have given us stagnant productivity growth and a collapsing infrastructure.

As I put it when I wrote about net investment five years ago, “If I were a debased purveyor of clickbait, I’d call this ‘Everything that’s wrong with America in two charts.’ But I’m not, so I won’t. But still….”

More true than ever.

Photo is by me, of the 7th Ave stop on the G line of the New York City subway.

Americans’ class ID shifts down

Published by Anonymous (not verified) on Sun, 22/05/2022 - 5:20am in

Tags 

statistics, class

The USA is the country where everyone feels middle-class, right? No.

Gallup is out with the latest edition of a question it’s asked ten times over the last twenty years: “If you were asked to use one of these five names for your social class, which would you say you belong in?” When they did the survey in April, the largest set of respondents said “middle,” 38%—but that’s not much more than a third. Almost as many, 35%, said “working” (a term that has often been pronounced obsolete).

Here’s some more detail:

Gallup class

A striking thing about the chart is its upward skew. The midpoint is just 4 points into the “middle” category, and “upper-middle” is nowhere near that midpoint—it begins about 5/6 of the way to the top. Still, it’s remarkable that in a country of alleged universal middle classness, almost half the population identifies as sub-middle.

Over the last 20 years, upper-middle and middle have declined by 8 percentage points and working and lower have risen by 9. If you start the clock in 2005, the peak of the housing bubble, the “middle” share has fallen by 9 points, with most going into “working.” The Great Recession that followed the bursting of that bubble has a lot to do with that trend, but ten years of expansion following that miserable downturn did nothing to change middle-class self-identification.

Gallup class ID over time

Before one gets encouraged by these stats into thinking proletarian class-consciousness is on the rise, a caveat: Republicans (38%) are more likely to identify as working class than Democrats (30%). But to conclude on a more encouraging note: 49% of those aged 18–34 call themselves working class, twice the share of the over-55s. Nice to see that clarity in the young.

 

Quit rates, unions, politics

Published by Anonymous (not verified) on Wed, 27/04/2022 - 5:34am in

I’m not sure what this means, but quit rates are higher in states that voted for Trump, and are higher in states with low unionization rates.

We’ve been hearing for some time now that quit rates are the highest on record. That’s true if you look only at the Job Openings and Labor Turnover Survey (JOLTS) numbers, which the Bureau of Labor Statistics (BLS) started reporting in December 2000. JOLTS had an ancestor, which the BLS reported for manufacturing only, covering 1919 to 1981 (left portion of the graph below). Current quit rates are comparable to those of the 1960s and 1970s, and are well below the peaks of the 1920s and 1940s. Other measures show that much of the JOLTS history (right portion of the graph below) covers an unusually torpid period for the US job market, so today’s levels may only mark a return to once-familiar territory.

Quit rate long

In any case, quit rates are high by contemporary standards. In February, 2.9% of all workers, and 3.2% of all private sector workers, quit their jobs, slightly off highs set late last year.

But quit rates vary widely from state to state, as this map shows. Generally they’re lowest in the Northeast and highest in the South and interior West. 

JOLTS quits Feb 22 level map

 

If you’re so inclined, you might notice that this map bears some resemblance to the classic red–blue political map.

Red–blue political map

And that impression is borne out when you run the numbers. In states with quit rates above the national median, Trump got 57% of the vote; in those around the median, 49%; in those below the median, 42%. (His overall share was 47%.) 

Trump share and quit rate

And here’s another curious detail: quit rates are higher in states with the lowest union density (the share of the workforce that belongs to unions), and lower in states with higher union density. In states with above-average quit rates, union density averaged 7.8%; in those with below-average quit rates, density was 12.5%. (The national average is 10.3%.) The relationship also held with the top ten and bottom ten states.
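For readers who want to run this kind of comparison themselves, here is a minimal sketch. It assumes an unweighted average across states, which may not be how the figures above were computed, and the records are made-up placeholders rather than real state data.

```python
from statistics import mean

# Hypothetical placeholder records, NOT real state data:
# (state label, quit rate in %, union density in %).
states = [
    ("A", 3.6, 5.0),
    ("B", 3.3, 7.0),
    ("C", 3.1, 9.0),
    ("D", 2.6, 11.0),
    ("E", 2.3, 14.0),
    ("F", 2.0, 17.0),
]


def density_by_quit_group(records):
    """Average union density for states above vs. at or below the mean quit rate."""
    avg_quit = mean(quit for _, quit, _ in records)
    above = [density for _, quit, density in records if quit > avg_quit]
    below = [density for _, quit, density in records if quit <= avg_quit]
    return mean(above), mean(below)


high_quit_density, low_quit_density = density_by_quit_group(states)
print(f"above-average quit rate: {high_quit_density:.1f}% union density")
print(f"below-average quit rate: {low_quit_density:.1f}% union density")
```

Swapping in actual state-level JOLTS quit rates and union membership figures would be the real test; weighting states by employment is another reasonable choice and could shift the averages.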

Quit rates and union density

So what’s going on here? Do Republican and less unionized states have more dynamic labor markets or fewer discontented workers? Are high quit rates signs of worker strength or desperation?

Maybe the most productive way to think about this comes from Chris Smalls, the leader of the Amazon Labor Union on Staten Island: “If I can lead us to victory over Amazon, what’s stopping anybody in this country from organizing their workplace? Nothing. You know, people got to get out of that mentality of, ‘Oh, let me just quit my job.’ Because when you quit your job, guess what? They hire somebody else. So you’re jumping from one fire into the next, and the system doesn’t get fixed by doing that.”

The Dunning-Kruger Effect is Autocorrelation

Published by Anonymous (not verified) on Sat, 09/04/2022 - 1:35am in

Have you heard of the ‘Dunning-Kruger effect’? It’s the (apparent) tendency for unskilled people to overestimate their competence. Discovered in 1999 by psychologists Justin Kruger and David Dunning, the effect has since become famous.

And you can see why.

It’s the kind of idea that is too juicy to not be true. Everyone ‘knows’ that idiots tend to be unaware of their own idiocy. Or as John Cleese puts it:

If you’re very very stupid, how can you possibly realize that you’re very very stupid?

Of course, psychologists have been careful to make sure that the evidence replicates. But sure enough, every time you look for it, the Dunning-Kruger effect leaps out of the data. So it would seem that everything’s on sound footing.

Except there’s a problem.

The Dunning-Kruger effect also emerges from data in which it shouldn’t. For instance, if you carefully craft random data so that it does not contain a Dunning-Kruger effect, you will still find the effect. The reason turns out to be embarrassingly simple: the Dunning-Kruger effect has nothing to do with human psychology. It is a statistical artifact, a stunning example of autocorrelation.

What is autocorrelation?

Autocorrelation occurs when you correlate a variable with itself. For instance, if I measure the height of 10 people, I’ll find that each person’s height correlates perfectly with itself. If this sounds like circular reasoning, that’s because it is. Autocorrelation is the statistical equivalent of stating that 5 = 5.

When framed this way, the idea of autocorrelation sounds absurd. No competent scientist would correlate a variable with itself. And that’s true for the pure form of autocorrelation. But what if a variable gets mixed into both sides of an equation, where it is forgotten? In that case, autocorrelation is more difficult to spot.

Here’s an example. Suppose I am working with two variables, x and y. I find that these variables are completely uncorrelated, as shown in the left panel of Figure 1. So far so good.

Figure 1: Generating autocorrelation. The left panel plots the random variables x and y, which are uncorrelated. The right panel shows how this non-correlation can be transformed into an autocorrelation. We define a variable called z, which is correlated strongly with x. The problem is that z happens to be the sum x + y. So we are correlating x with itself. The variable y adds statistical noise.

Next, I start to play with the data. After a bit of manipulation, I come up with a quantity that I call z. I save my work and forget about it. Months later, my colleague revisits my dataset and discovers that z strongly correlates with x (Figure 1, right). We’ve discovered something interesting!

Actually, we’ve discovered autocorrelation. You see, unbeknownst to my colleague, I’ve defined the variable z to be the sum x + y. As a result, when we correlate z with x, we are actually correlating x with itself. (The variable y comes along for the ride, providing statistical noise.) That’s how autocorrelation happens: by forgetting that you’ve got the same variable on both sides of a correlation.
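The example is easy to reproduce. The sketch below assumes x and y are independent uniform random numbers (the text does not say which distribution Figure 1 uses) and relies on a small hand-written `pearson` helper, so the sample size, seed, and names are all illustrative choices.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
x = [rng.random() for _ in range(10_000)]
y = [rng.random() for _ in range(10_000)]  # drawn independently of x
z = [xi + yi for xi, yi in zip(x, y)]      # the "forgotten" definition

print(f"corr(x, y) = {pearson(x, y):+.2f}")  # near zero: no relationship
print(f"corr(x, x) = {pearson(x, x):+.2f}")  # a variable against itself
print(f"corr(z, x) = {pearson(z, x):+.2f}")  # strong, because z contains x
```

With equal-variance inputs the last correlation should land near 0.7 rather than 1, since y dilutes the signal without removing it.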

The Dunning-Kruger effect

Now that you understand autocorrelation, let’s talk about the Dunning-Kruger effect. Much like the example in Figure 1, the Dunning-Kruger effect amounts to autocorrelation. But instead of lurking within a relabeled variable, the Dunning-Kruger autocorrelation hides beneath a deceptive chart.

Let’s have a look.

In 1999, Dunning and Kruger reported the results of a simple experiment. They got a bunch of people to complete a skills test. (Actually, Dunning and Kruger used several tests, but that’s irrelevant for my discussion.) Then they asked each person to assess their own ability. What Dunning and Kruger (thought they) found was that the people who did poorly on the skills test also tended to overestimate their ability. That’s the ‘Dunning-Kruger effect’.

Dunning and Kruger visualized their results as shown in Figure 2. It’s a simple chart that draws the eye to the difference between two curves. On the horizontal axis, Dunning and Kruger have placed people into four groups (quartiles) according to their test scores. In the plot, the two lines show the results within each group. The grey line indicates people’s average results on the skills test. The black line indicates their average ‘perceived ability’. Clearly, people who scored poorly on the skills test are overconfident in their abilities. (Or so it appears.)

Figure 2: The Dunning-Kruger chart. From Dunning and Kruger (1999). This figure shows how Dunning and Kruger reported their original findings. Dunning and Kruger gave a skills test to individuals, and also asked each person to estimate their ability. Dunning and Kruger then placed people into four groups based on their ranked test scores. This figure contrasts the (average) percentile of the ‘actual test score’ within each group (grey line) with the (average) percentile of ‘perceived ability’. The Dunning-Kruger ‘effect’ is the difference between the two curves — the (apparent) fact that unskilled people overestimate their ability.

On its own, the Dunning-Kruger chart seems convincing. Add in the fact that Dunning and Kruger are excellent writers, and you have the recipe for a hit paper. On that note, I recommend that you read their article, because it reminds us that good rhetoric is not the same as good science.

Deconstructing Dunning-Kruger

Now that you’ve seen the Dunning-Kruger chart, let’s show how it hides autocorrelation. To make things clear, I’ll annotate the chart as we go.

We’ll start with the horizontal axis. In the Dunning-Kruger chart, the horizontal axis is ‘categorical’, meaning it shows ‘categories’ rather than numerical values. Of course, there’s nothing wrong with plotting categories. But in this case, the categories are actually numerical. Dunning and Kruger take people’s test scores and place them into 4 ranked groups. (Statisticians call these groups ‘quartiles’.)

What this ranking means is that the horizontal axis effectively plots test score. Let’s call this score x.

Figure 3: Deconstructing the Dunning-Kruger chart. In the Dunning-Kruger chart, the horizontal axis ranks ‘actual test score’, which I’ll call x.

Next, let’s look at the vertical axis, which is marked ‘percentile’. What this means is that instead of plotting actual test scores, Dunning and Kruger plot the score’s ranking on a 100-point scale.

Now let’s look at the curves. The line labeled ‘actual test score’ plots the average percentile of each quartile’s test score (a mouthful, I know). Things seem fine, until we realize that Dunning and Kruger are essentially plotting test score (x) against itself. Noticing this fact, let’s relabel the grey line. It effectively plots x vs. x.

Figure 4: Deconstructing the Dunning-Kruger chart. In the Dunning-Kruger chart, the line marked ‘actual test score’ is plotting test score (x) against itself. In my notation, that’s x vs. x.

Moving on, let’s look at the line labeled ‘perceived ability’. This line measures the average percentile for each group’s self assessment. Let’s call this self-assessment y. Recalling that we’ve labeled ‘actual test score’ as x, we see that the black line plots y vs. x.

Figure 5: Deconstructing the Dunning-Kruger chart. In the Dunning-Kruger chart, the line marked ‘perceived ability’ is plotting ‘perceived ability’ y against actual test score x.

So far, nothing jumps out as obviously wrong. Yes, it’s a bit weird to plot x vs. x. But Dunning and Kruger are not claiming that this line alone is important. What’s important is the difference between the two lines (‘perceived ability’ vs. ‘actual test score’). It’s in this difference that the autocorrelation appears.

In mathematical terms, a ‘difference’ means ‘subtract’. So by showing us two diverging lines, Dunning and Kruger are (implicitly) asking us to subtract one from the other: take ‘perceived ability’ and subtract ‘actual test score’. In my notation, that corresponds to y – x.

Figure 6: Deconstructing the Dunning-Kruger chart. To interpret the Dunning-Kruger chart, we (implicitly) look at the difference between the two curves. That corresponds to taking ‘perceived ability’ and subtracting from it ‘actual test score’. In my notation, that difference is y – x (indicated by the double-headed arrow). When we judge this difference as a function of the horizontal axis, we are implicitly comparing y – x to x. Since x is on both sides of the comparison, the result will be an autocorrelation.

Subtracting y – x seems fine, until we realize that we’re supposed to interpret this difference as a function of the horizontal axis. But the horizontal axis plots test score x. So we are (implicitly) asked to compare y – x to x:

$$(y - x) \sim x$$

Do you see the problem? We’re comparing x with the negative version of itself. That is textbook autocorrelation. It means that we can throw random numbers into x and y — numbers which could not possibly contain the Dunning-Kruger effect — and yet out the other end, the effect will still emerge.
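How strong should that built-in correlation be? A short derivation gives the answer under two assumptions that are mine rather than the original paper’s: x and y are uncorrelated, and they share the same variance σ².

```latex
\begin{aligned}
\operatorname{Cov}(y - x,\, x) &= \operatorname{Cov}(y, x) - \operatorname{Var}(x) = -\sigma^2 \\
\operatorname{Var}(y - x) &= \operatorname{Var}(y) + \operatorname{Var}(x) - 2\operatorname{Cov}(y, x) = 2\sigma^2 \\
\operatorname{Corr}(y - x,\, x) &= \frac{-\sigma^2}{\sqrt{2\sigma^2}\,\sqrt{\sigma^2}} = -\frac{1}{\sqrt{2}} \approx -0.71
\end{aligned}
```

So even when self-assessment carries no information about test score, the difference y – x is expected to correlate with x at roughly –0.71.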

Replicating Dunning-Kruger

To be honest, I’m not particularly convinced by the analytic arguments above. It’s only by using real data that I can understand the problem with the Dunning-Kruger effect. So let’s have a look at some real numbers.

Suppose we are psychologists who get a big grant to replicate the Dunning-Kruger experiment. We recruit 1000 people, give them each a skills test, and ask them to report a self-assessment. When the results are in, we have a look at the data.

It doesn’t look good.

When we plot individuals’ test score against their self assessment, the data appear completely random. Figure 7 shows the pattern. It seems that people of all abilities are equally terrible at predicting their skill. There is no hint of a Dunning-Kruger effect.

Figure 7: A failed replication. This figure shows the results of a thought experiment in which we try to replicate the Dunning-Kruger effect. We get 1000 people to take a skills test and to estimate their own ability. Here, we plot the raw data. Each point represents an individual’s result, with ‘actual test score’ on the horizontal axis, and ‘self assessment’ on the vertical axis. There is no hint of a Dunning-Kruger effect.

After looking at our raw data, we’re worried that we did something wrong. Many other researchers have replicated the Dunning-Kruger effect. Did we make a mistake in our experiment?

Unfortunately, we can’t collect more data. (We’ve run out of money.) But we can play with the analysis. A colleague suggests that instead of plotting the raw data, we calculate each person’s ‘self-assessment error’. This error is the difference between a person’s self assessment and their test score. Perhaps this assessment error relates to actual test score?

We run the numbers and, to our amazement, find an enormous effect. Figure 8 shows the results. It seems that unskilled people are massively overconfident, while skilled people are overly modest.

(Our lab tech points out that the correlation is surprisingly tight, almost as if the numbers were picked by hand. But we push this observation out of mind and forge ahead.)

Figure 8: Maybe the experiment was successful? Using the raw data from Figure 7, this figure calculates the ‘self-assessment error’, the difference between an individual’s self assessment and their actual test score. This assessment error (vertical axis) correlates strongly with actual test score (horizontal axis).
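This step of the thought experiment can be reproduced in a few lines. The sketch assumes 1000 people whose test scores and self assessments are independent uniform draws on a 0 to 100 scale; the seed, the scale, and the `pearson` helper are illustrative choices, not part of the original analysis.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 1000
score = [rng.uniform(0, 100) for _ in range(n)]
self_assessment = [rng.uniform(0, 100) for _ in range(n)]  # unrelated to score

# 'Self-assessment error': self assessment minus actual test score.
error = [s - t for s, t in zip(self_assessment, score)]

print(f"corr(self assessment, score) = {pearson(self_assessment, score):+.2f}")
print(f"corr(error, score)           = {pearson(error, score):+.2f}")
```

The first correlation should sit near zero and the second should be strongly negative, even though nothing in the data links self assessment to skill.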

Buoyed by our success in Figure 8, we decide that the results may not be ‘bad’ after all. So we throw the data into the Dunning-Kruger chart to see what happens. We find that despite our misgivings about the data, the Dunning-Kruger effect was there all along. In fact, as Figure 9 shows, our effect is even bigger than the original (from Figure 2).

Figure 9: Recovering Dunning and Kruger. Despite the apparent lack of effect in our raw data (Figure 7), when we plug this data into the Dunning-Kruger chart, we get a massive effect. People who are unskilled over-estimate their abilities. And people who are skilled are too modest.
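The chart itself can be rebuilt from noise in the same way. The sketch below is one plausible reading of the procedure described above, not Dunning and Kruger’s own code: it converts both variables to percentile ranks, groups people into quartiles by test score, and averages each quartile. All names and the random inputs are assumptions.

```python
import random


def percentile_ranks(values):
    """Percentile rank (0-100) of each value within its own sample."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * (position + 0.5) / len(values)
    return ranks


def dunning_kruger_table(scores, self_assessments):
    """Average percentile of score and self assessment per score quartile."""
    score_pct = percentile_ranks(scores)
    self_pct = percentile_ranks(self_assessments)
    rows = []
    for q in range(4):
        low, high = 25 * q, 25 * (q + 1)
        members = [i for i, p in enumerate(score_pct) if low <= p < high]
        actual = sum(score_pct[i] for i in members) / len(members)
        perceived = sum(self_pct[i] for i in members) / len(members)
        rows.append((q + 1, actual, perceived))
    return rows


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 1000
scores = [rng.uniform(0, 100) for _ in range(n)]
self_assessments = [rng.uniform(0, 100) for _ in range(n)]  # pure noise

table = dunning_kruger_table(scores, self_assessments)
for quartile, actual, perceived in table:
    print(f"Q{quartile}: actual {actual:5.1f}  perceived {perceived:5.1f}  "
          f"gap {perceived - actual:+5.1f}")
```

Because the ‘actual’ line is test score ranked against itself, it must climb from the bottom quartile to the top, while the noise-only ‘perceived’ line hovers near the 50th percentile in every group. The gap between them is positive at the bottom and negative at the top by construction.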

Things fall apart

Pleased with our successful replication, we start to write up our results. Then things fall apart. Riddled with guilt, our data curator comes clean: he lost the data from our experiment and, in a fit of panic, replaced it with random numbers. Our results, he confides, are based on statistical noise.

Devastated, we return to our data to make sense of what went wrong. If we have been working with random numbers, how could we possibly have replicated the Dunning-Kruger effect? To figure out what happened, we drop the pretense that we’re working with psychological data. We relabel our charts in terms of abstract variables x and y. By doing so, we discover that our apparent ‘effect’ is actually autocorrelation.

Figure 10 breaks it down. Our dataset is comprised of statistical noise — two random variables, x and y, that are completely unrelated (Figure 10A). When we calculated the ‘self-assessment error’, we took the difference between y and x. Unsurprisingly, we find that this difference correlates with x (Figure 10B). But that’s because x is autocorrelating with itself. Finally, we break down the Dunning-Kruger chart and realize that it too is based on autocorrelation (Figure 10C). It asks us to interpret the difference between y and x as a function of x. It’s the autocorrelation from panel B, wrapped in a more deceptive veneer.

Figure 10: Dropping the psychological pretense. This figure repeats the analysis shown in Figures 7–9, but drops the pretense that we’re dealing with human psychology. We’re working with random variables x and y that are drawn from a uniform distribution. Panel A shows that the variables are completely uncorrelated. Panel B shows that when we plot y – x against x, we get a strong correlation. But that’s because we have correlated x with itself. In panel C, we input these variables into the Dunning-Kruger chart. Again, the apparent effect amounts to autocorrelation: interpreting y – x as a function of x.

The point of this story is to illustrate that the Dunning-Kruger effect has nothing to do with human psychology. It is a statistical artifact — an example of autocorrelation hiding in plain sight.

What’s interesting is how long it took for researchers to realize the flaw in Dunning and Kruger’s analysis. Dunning and Kruger published their results in 1999. But it took until 2016 for the mistake to be fully understood. To my knowledge, Edward Nuhfer and colleagues were the first to exhaustively debunk the Dunning-Kruger effect. (See their joint papers in 2016 and 2017.) In 2020, Gilles Gignac and Marcin Zajenkowski published a similar critique.

Once you read these critiques, it becomes painfully obvious that the Dunning-Kruger effect is a statistical artifact. But to date, very few people know this fact. Collectively, the three critique papers have about 90 times fewer citations than the original Dunning-Kruger article. So it appears that most scientists still think that the Dunning-Kruger effect is a robust aspect of human psychology.

No sign of Dunning-Kruger

The problem with the Dunning-Kruger chart is that it violates a fundamental principle in statistics. If you’re going to correlate two sets of data, they must be measured independently. In the Dunning-Kruger chart, this principle gets violated. The chart mixes test score into both axes, giving rise to autocorrelation.

Realizing this mistake, Edward Nuhfer and colleagues asked an interesting question: what happens to the Dunning-Kruger effect if it is measured in a way that is statistically valid? According to Nuhfer’s evidence, the answer is that the effect disappears.

Figure 11 shows their results. What’s important here is that people’s ‘skill’ is measured independently from their test performance and self assessment. To measure ‘skill’, Nuhfer groups individuals by their education level, shown on the horizontal axis. The vertical axis then plots the error in people’s self assessment. Each point represents an individual.

Figure 11: A statistically valid test of the Dunning-Kruger effect. This figure shows Nuhfer and colleagues’ 2017 test of the Dunning-Kruger effect. Similar to Figure 8, this chart plots people’s skill against their error in self assessment. But unlike Figure 8, here the variables are statistically independent. The horizontal axis measures skill using academic rank. The vertical axis measures self-assessment error as follows. Nuhfer takes a person’s score on the SLCI test (science literacy concept inventory test) and subtracts it from the person’s self assessment, called KSSLCI (knowledge survey of the SLCI test). Each black point indicates the self-assessment error of an individual. Green bubbles indicate means within each group, with the associated confidence interval. The fact that the green bubbles overlap the zero-effect line indicates that within each group, the averages are not statistically different from 0. In other words, there is no evidence for a Dunning-Kruger effect.

If the Dunning-Kruger effect were present, it would show up in Figure 11 as a downward trend in the data (similar to the trend in Figure 7). Such a trend would indicate that unskilled people overestimate their ability, and that this overestimate decreases with skill. Looking at Figure 11, there is no hint of a trend. Instead, the average assessment error (indicated by the green bubbles) hovers around zero. In other words, assessment bias is trivially small.

Although there is no hint of a Dunning-Kruger effect, Figure 11 does show an interesting pattern. Moving from left to right, the spread in self-assessment error tends to decrease with more education. In other words, professors are generally better at assessing their ability than are freshmen. That makes sense. Notice, though, that this increasing accuracy is different from the Dunning-Kruger effect, which is about systematic bias in the average assessment. No such bias exists in Nuhfer’s data.
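The difference between bias (the group mean) and accuracy (the group spread) can be sketched with simulated numbers. This is not Nuhfer's data or method. The group labels, sample sizes, and noise levels below are invented, and the confidence interval is a simple normal approximation chosen for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical groups and noise levels, invented for illustration only.
# Every group has zero true bias; only the spread of the error shrinks.
groups = {"freshman": 20.0, "senior": 14.0, "graduate": 9.0, "professor": 5.0}

def summarize(errors):
    """Return the mean, sample standard deviation, and an approximate 95% CI of the mean."""
    mean = statistics.fmean(errors)
    sd = statistics.stdev(errors)
    half_width = 1.96 * sd / len(errors) ** 0.5
    return mean, sd, (mean - half_width, mean + half_width)

results = {}
for name, noise_sd in groups.items():
    # Simulated self-assessment error: self-assessment minus test score.
    errors = [rng.gauss(0.0, noise_sd) for _ in range(500)]
    results[name] = summarize(errors)

for name, (mean, sd, (lo, hi)) in results.items():
    print(f"{name:>10}: mean={mean:+.2f}  sd={sd:.2f}  95% CI=({lo:+.2f}, {hi:+.2f})")
```

In this toy setup the group means stay close to zero (no bias) while the standard deviation falls from group to group (more accuracy). That is the kind of pattern described for Figure 11, and it is distinct from a downward trend in the means.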

Unskilled and unaware of it

Mistakes happen. So in that sense, we should not fault Dunning and Kruger for having erred. However, there is a delightful irony to the circumstances of their blunder. Here are two Ivy League professors7 arguing that unskilled people have a ‘dual burden’: not only are unskilled people ‘incompetent’ … they are unaware of their own incompetence.

The irony is that the situation is actually reversed. In their seminal paper, Dunning and Kruger are the ones broadcasting their (statistical) incompetence by mistaking autocorrelation for a psychological effect. In this light, the paper’s title may still be appropriate. It’s just that it was the authors (not the test subjects) who were ‘unskilled and unaware of it’.

This work is licensed under a Creative Commons Attribution 4.0 License. You can use/share it any way you want, provided you attribute it to me (Blair Fix) and link to Economics from the Top Down.

Notes

Cover image: Nevit Dilmen, altered.

  1. The Dunning-Kruger effect tells us nothing about the people it purports to measure. But it does tell us about the psychology of social scientists, who apparently struggle with statistics.↩
  2. It seems clear that Dunning and Kruger didn’t mean to be deceptive. Instead, it appears that they fooled themselves (and many others). On that note, I’m ashamed to say that I read Dunning and Kruger’s paper a few years ago and didn’t spot anything wrong. It was only after reading Jonathan Jarry’s blog post that I clued in. That’s embarrassing, because a major theme of this blog has been me pointing out how economists appeal to autocorrelation when they test their theories of value. (Examples here, here, here, here, and here.) I take solace in the fact that many scientists were similarly hoodwinked by the Dunning-Kruger chart.↩

  3. The conversion to percentiles introduces a second bias (in addition to the problem of autocorrelation). By definition, percentiles have a floor (0) and a ceiling (100), and are uniformly distributed between these bounds. If you are close to the floor, it is impossible for you to underestimate your rank. Therefore, the ‘unskilled’ will appear overconfident. And if you are close to the ceiling, you cannot overestimate your rank. Therefore, the ‘skilled’ will appear too modest. See Nuhfer et al. (2016) for more details.↩

  4. In technical terms, Dunning and Kruger are plotting two different forms of ranking against each other — test-score ‘percentile’ against test-score ‘quartile’. What is not obvious is that this type of plot is data independent. By definition, each quartile contains 25 percentiles whose average corresponds to the midpoint of the quartile. The consequence of this truism is that the line labeled ‘actual test score’ tells us (paradoxically) nothing about people’s actual test score.↩

  5. According to Google Scholar, the three critique papers (Nuhfer 2016, 2017 and Gignac and Zajenkowski 2020) have 88 citations collectively. In contrast, Dunning and Kruger (1999) has 7893 citations.↩

  6. The slow dissemination of ‘debunkings’ is a common problem in science. Even when the original (flawed) papers are retracted, they often continue to accumulate citations. And then there’s the fact that critique papers are rarely published in the same journal that hosted the original paper. So a flawed article in Nature is likely to be debunked in a more obscure journal. This asymmetry is partially why I’m writing about the Dunning-Kruger effect here. I think the critique raised by Nuhfer et al. (and Gignac and Zajenkowski) deserves to be well known.↩

  7. When Dunning and Kruger published their 1999 paper, they both worked at Cornell University.↩
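Notes 3 and 4 can be checked numerically. The sketch below is an illustration under stated assumptions, not a reproduction of any published analysis. The score distributions, the noise level of 20 percentile points, and the helper names `percentile_ranks` and `quartile_means` are all hypothetical.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 10_000              # divisible by 4, so each quartile holds n / 4 people

def percentile_ranks(scores):
    """Map raw scores to percentile ranks strictly between 0 and 100."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for position, i in enumerate(order):
        ranks[i] = 100.0 * (position + 0.5) / len(scores)
    return ranks

def quartile_means(percentiles, values):
    """Average `values` within each quartile of `percentiles`."""
    buckets = [[], [], [], []]
    for p, v in zip(percentiles, values):
        buckets[min(int(p // 25), 3)].append(v)
    return [statistics.fmean(b) for b in buckets]

# Note 4: the mean percentile within each quartile does not depend on the data.
uniform_scores = [rng.uniform(0, 100) for _ in range(n)]
skewed_scores = [rng.expovariate(1.0) for _ in range(n)]
pct_uniform = percentile_ranks(uniform_scores)
pct_skewed = percentile_ranks(skewed_scores)
means_uniform = quartile_means(pct_uniform, pct_uniform)
means_skewed = quartile_means(pct_skewed, pct_skewed)
print([round(m, 2) for m in means_uniform])
print([round(m, 2) for m in means_skewed])

# Note 3: unbiased noise, clipped to the 0-100 percentile scale.
assessed = [min(100.0, max(0.0, p + rng.gauss(0.0, 20.0))) for p in pct_uniform]
errors = [a - p for a, p in zip(assessed, pct_uniform)]
clipped_error = quartile_means(pct_uniform, errors)
print([round(e, 2) for e in clipped_error])
```

Both score distributions give quartile means at the quartile midpoints (12.5, 37.5, 62.5 and 87.5), which is why the ‘actual test score’ line carries no information about the scores themselves. In the second part, the noise added to each person's rank is unbiased, yet clipping at the floor and ceiling leaves the bottom quartile with a positive average error and the top quartile with a negative one.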

Further reading

Gignac, G. E., & Zajenkowski, M. (2020). The Dunning-Kruger effect is (mostly) a statistical artefact: Valid approaches to testing the hypothesis with individual differences data. Intelligence, 80, 101449.

Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6), 1121.

Nuhfer, E., Cogan, C., Fleisher, S., Gaze, E., & Wirth, K. (2016). Random number simulations reveal how random noise affects the measurements and graphical portrayals of self-assessed competency. Numeracy: Advancing Education in Quantitative Literacy, 9(1).

Nuhfer, E., Fleisher, S., Cogan, C., Wirth, K., & Gaze, E. (2017). How random noise and a graphical convention subverted behavioral scientists’ explanations of self-assessment data: Numeracy underlies better alternatives. Numeracy: Advancing Education in Quantitative Literacy, 10(1).

The post The Dunning-Kruger Effect is Autocorrelation appeared first on Economics from the Top Down.

Cultural Donations on the Rise

Published by Anonymous (not verified) on Thu, 26/02/2015 - 10:17am in