Methodology

How this IQ test is scored

Item provenance, scoring math, reliability, and the limits we publish so you can decide how much to trust the result.

1. The instrument: ICAR

MyIQTested uses items from the International Cognitive Ability Resource (ICAR), an open-source, peer-reviewed cognitive assessment battery developed by David Condon and William Revelle and published in Intelligence (2014). ICAR was created so that researchers, educators, and platforms like ours could deliver psychometrically defensible cognitive testing without licensing proprietary instruments.

The full ICAR pool covers four item types: matrix reasoning (3×3 pattern completion), letter and number series, three-dimensional rotation, and verbal reasoning. From this pool, we have selected 33 items spanning all four domains for a ~10-minute experience. The selection prioritises (a) items with strong published item-level statistics, (b) coverage of all four reasoning domains, and (c) a difficulty curve that ramps from a five-item warm-up into harder spatial puzzles.

Source: icar-project.com · Condon, D. M., & Revelle, W. (2014). Intelligence, 43, 52–64.

2. The 33 items, by domain

Item IDs are stable: each question has a fixed identifier in our scoring data, so the same item always scores the same way. Spatial reasoning is intentionally over-represented because matrix reasoning is the single best-validated proxy for fluid intelligence (Gf).

Domain | Items | IDs | Item kind
--- | --- | --- | ---
Abstract Reasoning | 6 | iq-ar-01, 04, 08, 14, 18, 20 | Letter / number / dot sequences and applied logic
Verbal Reasoning | 4 | iq-vr-01, 12, 15, 19 | Antonyms, synonyms, analogies, odd-one-out
Numerical Reasoning | 3 | iq-nr-01, 06, 14 | Percentages, averages, median
Spatial / Matrix Reasoning | 20 | iq-mx-01 through iq-mx-20 | 3×3 visual pattern completion (Raven’s-style)
Total | 33 | |

A five-item warm-up (iq-mx-01, iq-mx-02, iq-ar-04, iq-mx-03, iq-ar-08) is presented first to ease test-takers in before harder spatial items are introduced.
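The domain structure and warm-up order above can be sketched as a small data table. The shape below is illustrative only; the field names and types are our assumptions, not the production scoring schema:

```typescript
// Illustrative sketch of the item-bank metadata described above.
// Field names and structure are assumptions, not the real scoring data.
type Domain = "abstract" | "verbal" | "numerical" | "spatial";

const DOMAIN_MAX: Record<Domain, number> = {
  abstract: 6,   // iq-ar-*
  verbal: 4,     // iq-vr-*
  numerical: 3,  // iq-nr-*
  spatial: 20,   // iq-mx-*
};

// The five-item warm-up presented first, in order.
const WARMUP_IDS = ["iq-mx-01", "iq-mx-02", "iq-ar-04", "iq-mx-03", "iq-ar-08"];

// Sanity check: domain maxima sum to the 33-item total.
const TOTAL_ITEMS = Object.values(DOMAIN_MAX).reduce((a, b) => a + b, 0);
console.log(TOTAL_ITEMS); // 33
```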

3. How we score

Scoring runs entirely in your browser the moment you finish — we never send your responses to a server. Three steps:

  1. Domain raw counts. For each of the four domains, count the number of correct answers. Maximums are 6 (Abstract), 4 (Verbal), 3 (Numerical), and 20 (Spatial), summing to 33.
  2. Composite raw → IQ scale. The composite is the sum of domain raw scores. We map this composite onto a normative IQ scale with mean = 100 and standard deviation = 15, clamped to a 55–155 display range.
  3. Percentile & band. We convert IQ to a percentile using the standard normal CDF (Abramowitz & Stegun rational approximation), then label one of six bands (Significantly Below Average through Very Superior) for plain-English context.

The display formula is fixed and deterministic. You can reproduce it from these constants:

// apps/myiqtested/packages/scoring/src/iq-norm.ts
const IQ_REFERENCE_MEAN_TOTAL = 16.5;
const IQ_REFERENCE_SD_TOTAL   = 5;
const IQ_MIN = 55;
const IQ_MAX = 155;

function totalCorrectToNormativeIq(totalCorrect: number): number {
  const z  = (totalCorrect - IQ_REFERENCE_MEAN_TOTAL) / IQ_REFERENCE_SD_TOTAL;
  const iq = Math.round(100 + 15 * z);
  return Math.max(IQ_MIN, Math.min(IQ_MAX, iq));
}
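
The snippet above covers step 2 only. Step 3, the IQ-to-percentile conversion, can be sketched as follows using the Abramowitz & Stegun rational approximation (formula 26.2.17) of the standard normal CDF; the function names here are ours, not necessarily those in the production code:

```typescript
// Standard normal CDF via Abramowitz & Stegun 26.2.17 (|error| < 7.5e-8).
function normalCdf(z: number): number {
  const p = 0.2316419;
  const b = [0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429];
  const x = Math.abs(z);
  const t = 1 / (1 + p * x);
  const pdf = Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI);
  // Horner evaluation of b1*t + b2*t^2 + ... + b5*t^5.
  let poly = 0;
  for (let i = b.length - 1; i >= 0; i--) poly = (poly + b[i]) * t;
  const upperTail = pdf * poly;
  return z >= 0 ? 1 - upperTail : upperTail;
}

// Hypothetical helper name: convert a displayed IQ to a percentile,
// using the mean = 100, SD = 15 scale described above.
function iqToPercentile(iq: number): number {
  return Math.round(normalCdf((iq - 100) / 15) * 100);
}
```

On this scale an IQ of 100 maps to the 50th percentile and an IQ of 115 (one SD above the mean) to roughly the 84th.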

Result pages show a ±5 IQ confidence band (the “display SE”). This reflects the typical day-to-day fluctuation an online test taker can expect from session-level noise (fatigue, distraction, screen size). It is not a clinical confidence interval.
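Applying the ±5 display band is then a small helper; note that clamping the band to the 55–155 display range is our assumption here, and the function name is illustrative:

```typescript
// Hypothetical helper for the ±5 display band described above.
// Clamping the band to the 55-155 display range is an assumption.
const DISPLAY_SE = 5;

function displayBand(iq: number): [low: number, high: number] {
  return [Math.max(55, iq - DISPLAY_SE), Math.min(155, iq + DISPLAY_SE)];
}
```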

4. Reliability & validity

ICAR's item-level psychometrics are published, which lets us cite defensible numbers rather than make vague "scientifically validated" claims. From Condon & Revelle (2014):

  • Internal consistency (Cronbach’s α) for ICAR subtests: .81–.93.
  • Correlation between ICAR total and the Shipley Institute of Living Scale: r = .81.
  • Correlation between ICAR matrix items and Raven’s Progressive Matrices: r = .60–.80.
  • Correlation between ICAR composite and WAIS subtests: r = .60–.80.

A 33-item subset of ICAR will have somewhat lower reliability than the full ICAR-60 pool, but the published item-level statistics still apply to the items we use, and the composite-to-IQ mapping is unchanged.
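One way to put a rough number on "somewhat lower" is the Spearman–Brown prophecy formula, which estimates the reliability of a shortened test from its full-length reliability. The calculation below is our illustration, using the top of the published α range; it is not a published figure for this 33-item subset:

```typescript
// Spearman-Brown prophecy: reliability of a test shortened by factor k,
// where rhoFull is the full-length reliability and k = newLength / oldLength.
function spearmanBrown(rhoFull: number, k: number): number {
  return (k * rhoFull) / (1 + (k - 1) * rhoFull);
}

// Illustration only: if a 60-item pool had alpha = .93, a 33-item subset
// would be expected to come in around .88.
const estimated = spearmanBrown(0.93, 33 / 60);
console.log(estimated.toFixed(2)); // "0.88"
```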

5. Known limitations

We publish these so you can decide for yourself how much weight to put on the result.

  • Item exposure. The 33 items are currently fixed. A motivated test-taker who has seen the answers elsewhere can inflate their score. We plan to expand to a larger item pool with per-session randomisation.
  • Internet sample bias. Online IQ test takers skew younger, more educated, and more curious about cognitive science than the general population. Norms derived from internet samples may shift the distribution slightly versus census-representative samples.
  • Flynn drift. Population IQ has historically risen by about 3 points per decade. Our display norms (mean=100, SD=15) follow the WAIS-IV convention but do not auto-recalibrate for current cohort drift.
  • No working memory or processing speed subtests. ICAR is a reasoning-only battery. Two of the four CHC broad abilities (Gwm, Gs) are not measured here.
  • Not a clinical assessment. A full clinical IQ evaluation requires a licensed psychologist, controlled conditions, and a comprehensive battery (WAIS-IV, Stanford-Binet). Online tests can shift by 5–10 points based on session conditions alone.

6. Changelog

Date | Change
--- | ---
2026-05-12 | Initial public methodology page. Added bell curve and 95% display CI to the result page.
2026-04-14 | Test launched with the 33-item ICAR-derived bank described above.
Last reviewed: 2026-05-12