Selection Science · 8 min read

Work samples predict job performance 3× better than résumés.

Most companies hire on credentials. The data has been clear for decades that credentials are among the weakest predictors of who will actually do the job well. Here's what the research says, and why it should change how you screen.

HireGauge · Published June 2, 2026

In 1998, the industrial-organizational psychologists Frank Schmidt and John Hunter published the most comprehensive analysis of hiring methods in the history of the field. They had eighty-five years of data — millions of hires, thousands of studies — and one question: which selection methods actually predict job performance?

The findings were so blunt they're still uncomfortable to read.

Work samples (having a candidate actually do a slice of the job) predicted future job performance with a validity coefficient of about 0.54. General mental ability (GMA) tests came in at 0.51. Structured interviews, also 0.51. Job-knowledge tests, 0.48.

Unstructured interviews — the kind most companies still rely on — scored 0.38. Years of experience: 0.18. Years of education: 0.10.

The methods most companies use to hire are, statistically, barely better than guessing.

Selection method          Validity
Work samples              0.54
General mental ability    0.51
Structured interview      0.51
Job knowledge tests       0.48
Unstructured interview    0.38
Years of experience       0.18
Years of education        0.10
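The headline ratio can be checked with simple division. This is a minimal sketch, assuming the "3×" compares work samples with years of experience (the résumé signal whose coefficient gives that ratio). The names `VALIDITY` and `ratio_to` are illustrative, and the coefficients are the ones quoted in this article.

```python
# Validity coefficients as quoted in this article (Schmidt and Hunter, 1998).
VALIDITY = {
    "Work samples": 0.54,
    "General mental ability": 0.51,
    "Structured interview": 0.51,
    "Job knowledge tests": 0.48,
    "Unstructured interview": 0.38,
    "Years of experience": 0.18,
    "Years of education": 0.10,
}


def ratio_to(method, baseline):
    """How many times larger one validity coefficient is than another."""
    return VALIDITY[method] / VALIDITY[baseline]


# Compare work samples with the two résumé-style signals in the table.
for baseline in ("Years of experience", "Years of education"):
    ratio = ratio_to("Work samples", baseline)
    print(f"Work samples vs {baseline}: {ratio:.1f}x")
```

A ratio of correlations is a convenient shorthand rather than a precise measure of how much better one method is.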

What "validity coefficient" actually means

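Validity coefficients can feel abstract, so here's the practical translation. A validity coefficient is the correlation between a selection score and later job performance: 0 means the score tells you nothing, and 1 means it predicts performance perfectly. A coefficient of 0.54 means roughly that if you select candidates using work samples, you'll get top-quartile performers about 30% more often than if you select randomly, though the exact figure depends on how selective you are. A coefficient of 0.10 (years of education) barely moves the needle from chance.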

Or put more bluntly: hiring on a résumé is closer to a coin flip than to a process.
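One way to build intuition for these numbers is to simulate a hiring pool. The sketch below is an illustration under stated assumptions, not a reproduction of any published calculation: it assumes selection scores and later performance follow a standard bivariate normal distribution with correlation equal to the validity coefficient, and that you hire the top 25% on the selection score. The function name and its parameters are hypothetical, and the resulting figures depend on those assumptions, so they need not match the rough estimate above.

```python
import random


def top_quartile_hit_rate(validity, n_candidates=100_000, select_frac=0.25, seed=0):
    """Estimate the share of hires who turn out to be top-quartile performers.

    Assumes score and performance are standard bivariate normal with
    correlation `validity`, and that the top `select_frac` of candidates
    on the score are hired. Both are simplifying assumptions.
    """
    rng = random.Random(seed)
    noise_weight = (1.0 - validity ** 2) ** 0.5
    pairs = []
    for _ in range(n_candidates):
        score = rng.gauss(0.0, 1.0)
        # Performance shares `validity` of its signal with the score.
        performance = validity * score + noise_weight * rng.gauss(0.0, 1.0)
        pairs.append((score, performance))

    # Performance level that marks the top quartile of the whole pool.
    performances = sorted(p for _, p in pairs)
    top_quartile_cut = performances[int(0.75 * n_candidates)]

    # Hire the best-scoring fraction on the selection method.
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    hired = pairs[: int(select_frac * n_candidates)]

    return sum(p >= top_quartile_cut for _, p in hired) / len(hired)


for label, validity in [("Random selection", 0.0),
                        ("Years of education", 0.10),
                        ("Work samples", 0.54)]:
    print(f"{label}: {top_quartile_hit_rate(validity):.2f}")
```

Changing `select_frac` shows how much the practical payoff of a given coefficient depends on how selective the hiring process is.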

Most of the difference comes from a simple fact: past credentials measure what someone has done, not what they can do. A candidate's degree tells you they finished a curriculum. Their years of experience tell you they didn't get fired. Neither tells you whether they can handle the work in front of them on Tuesday.

Past credentials measure what someone has done. Work samples measure what they can do. Most companies still hire on the first.

The 2022 reanalysis: the numbers shifted, the ordering didn't

In 2022, a team led by Paul Sackett at the University of Minnesota published a reanalysis of the Schmidt and Hunter data. Their argument: the original analysis over-corrected for "range restriction" (a statistical artifact that happens when you only have data on the people you hired, not the ones you rejected). When that correction is dialed back, the absolute validity numbers drop somewhat.
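To see why the size of the correction matters, here is a sketch of the standard textbook formula for direct range restriction (often called Thorndike's Case II). It is not the procedure used in either paper, and the observed correlation and standard-deviation ratios below are hypothetical values chosen only for illustration.

```python
import math


def correct_for_range_restriction(r_observed, sd_ratio):
    """Thorndike Case II correction for direct range restriction.

    r_observed: correlation measured among the people who were hired.
    sd_ratio:   SD of selection scores among all applicants divided by
                the SD among those hired (1.0 means no restriction).
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    u = sd_ratio
    return (r_observed * u) / math.sqrt(1.0 + r_observed ** 2 * (u ** 2 - 1.0))


# Same observed correlation, different assumptions about how restricted
# the hired sample is. All numbers here are hypothetical.
observed = 0.30
for assumed_ratio in (1.0, 1.2, 1.5):
    corrected = correct_for_range_restriction(observed, assumed_ratio)
    print(f"assumed SD ratio {assumed_ratio}: corrected validity {corrected:.2f}")
```

The same observed correlation yields noticeably different corrected values depending on the assumed ratio, which is why a disagreement about the right correction moves the headline coefficients.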

The headlines briefly suggested the famous Schmidt and Hunter findings had been overturned. Read the actual paper and you find something more subtle. The exact coefficients moved. Structured interviews moved up the ranking. General mental ability moved down. Some methods clustered closer together.

But the ordering — which methods predict performance, and which don't — barely shifted. Sample-based methods, structured methods, and cognitive ability remained near the top. Years of experience and unstructured interviews remained near the bottom.

The intellectually honest summary: the exact coefficients are contested. The ordering is not.

Why work samples win

The reason work samples top the list isn't mysterious. Three things make them work:

1. They measure the actual capability, not a proxy for it

A résumé says someone is a senior care coordinator. A work sample shows whether they can read a chart, spot what's missing, and prioritize three competing patient needs in the next ten minutes. A degree says someone studied accounting. A work sample shows whether they can reconcile a messy ledger without losing track of what they noticed five rows ago.

2. They're harder to game

You can polish a résumé. You can rehearse interview answers. What you can't easily fake is whether, when handed a real piece of work, you can think your way through it under realistic constraints.

3. They predict the work because they are the work

The technical term for how well a test predicts the outcome you care about is "criterion-related validity." The simpler version: a test that looks like the job predicts the job. A test that looks like a generic aptitude battery predicts generic aptitude.

What this means for hiring practice

Work samples are expensive to build well. They require a real job analysis — someone who understands the work observing it, identifying what predicts success, and constructing items that fairly probe those capabilities. They require scoring rubrics, calibration, and ongoing maintenance.

And, critically, they have to be legally defensible. The Uniform Guidelines on Employee Selection Procedures (1978) require that any screen with disparate impact be job-related and validated through one of three accepted methods: criterion-related, content, or construct validation.
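The Guidelines also describe a rule of thumb for spotting adverse impact in the first place: a group whose selection rate is below four-fifths (80%) of the highest group's rate is generally treated as showing evidence of adverse impact. A minimal sketch of that check follows. The group labels and applicant counts are hypothetical, and the function name is an illustration rather than part of any standard tool.

```python
FOUR_FIFTHS = 0.8  # threshold from the four-fifths rule of thumb


def adverse_impact_ratios(outcomes):
    """Compare each group's selection rate with the highest group's rate.

    outcomes maps a group label to (number selected, number of applicants).
    Returns a dict of group label -> ratio to the highest selection rate.
    """
    rates = {group: selected / applicants
             for group, (selected, applicants) in outcomes.items()}
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any selected applicants")
    return {group: rate / highest for group, rate in rates.items()}


# Hypothetical screening results for two applicant groups.
screen_results = {
    "group_a": (48, 80),  # 60% pass the screen
    "group_b": (24, 60),  # 40% pass the screen
}

for group, ratio in adverse_impact_ratios(screen_results).items():
    flag = "below four-fifths" if ratio < FOUR_FIFTHS else "ok"
    print(f"{group}: impact ratio {ratio:.2f} ({flag})")
```

A ratio below the threshold is a prompt to examine the screen and its validation evidence, not a verdict on its own.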

Generic batteries are cheap, fast, and defensible. They just don't predict performance in your specific jobs as well as a properly built work sample.

The cleaner answer is to build the validation argument around your actual operation. That's what HireGauge does, and it's why "custom" can be more defensible than "off-the-shelf," if it's done right.

The path forward

Two things are worth doing first, regardless of who builds your assessment:

1. Add a structured interview component to whatever you already do. Not "tell me about a time" questions scored however the interviewer feels like. Real structured scoring against a defined rubric. This alone moves you from a 0.38 method to a 0.51 method.
2. Add a real work-sample task for any role where mis-hires are expensive. Even a single, well-designed task, built from the actual work, moves the needle further than years of credential screening ever will.
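Scoring against a defined rubric can be as simple as the sketch below. Everything in it is a hypothetical illustration: the three dimensions, the 1 to 5 scale, and the choice to average across interviewers are assumptions, and a real rubric would come out of a job analysis for the specific role.

```python
from statistics import mean

# Hypothetical rubric: dimension -> what a top rating looks like.
RUBRIC = {
    "prioritization": "Ranks competing needs and explains the trade-offs",
    "attention_to_detail": "Spots missing or inconsistent information",
    "communication": "Explains the plan clearly to a non-specialist",
}
SCALE = range(1, 6)  # assumed 1-5 rating scale


def score_candidate(ratings_by_interviewer):
    """Return (per-dimension averages, overall average) for one candidate.

    Every interviewer must rate every rubric dimension on the shared
    scale, so no one can score on a general impression alone.
    """
    for interviewer, ratings in ratings_by_interviewer.items():
        if set(ratings) != set(RUBRIC):
            raise ValueError(f"{interviewer} must rate exactly the rubric dimensions")
        if any(value not in SCALE for value in ratings.values()):
            raise ValueError(f"{interviewer} used a rating outside the 1-5 scale")

    per_dimension = {
        dim: mean(r[dim] for r in ratings_by_interviewer.values())
        for dim in RUBRIC
    }
    return per_dimension, mean(per_dimension.values())


# Hypothetical ratings from two interviewers for one candidate.
ratings = {
    "interviewer_1": {"prioritization": 4, "attention_to_detail": 3, "communication": 5},
    "interviewer_2": {"prioritization": 2, "attention_to_detail": 3, "communication": 4},
}
per_dimension, overall = score_candidate(ratings)
print(per_dimension, overall)
```

Keeping the per-dimension averages alongside the overall score makes it easier to see where interviewers disagree and where the rubric needs calibration.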

If you want help building either — properly, with the legal validation work done right — that's what we do.

See what a custom-built assessment for your roles would look like.

A 30-minute consultation to scope your operation and walk through what HireGauge would build. If it isn't a fit, we will tell you directly.
