Paper-Folding and Spatial Visualization
The paper-folding test, developed by Educational Testing Service in the 1960s for the Kit of Factor-Referenced Cognitive Tests, presents a sequence of folds applied to a square sheet, ending with a single hole punched through all layers. The test-taker must then identify the unfolded result from a set of options. It is a measure of spatial visualization, distinct from the mental-rotation skill captured by the Vandenberg and Kuse Mental Rotations Test.
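The fold-punch-unfold logic of an item can be sketched in code. The following is a minimal illustration, not the procedure used to build any published test. It assumes a unit-square sheet, creases parallel to the edges, and a flap that never overhangs the part it folds onto. The names Fold, rect_after, inside, and unfold_holes are hypothetical and chosen for this example. The idea is to replay the folds in reverse: each hole gains a mirror image across the crease whenever that image lands on paper that existed before the fold.

```python
from fractions import Fraction as F
from typing import NamedTuple

class Fold(NamedTuple):
    axis: int        # 0: vertical crease at x = crease, 1: horizontal crease at y = crease
    crease: F        # position of the crease line
    keep_high: bool  # True: low side folds onto high side; False: the reverse

def rect_after(rect, fold):
    """Return the rectangle the paper occupies after one fold."""
    lo, hi = rect[fold.axis]
    c = fold.crease
    if not lo < c < hi:
        raise ValueError("crease must cross the paper")
    flap, rest = (c - lo, hi - c) if fold.keep_high else (hi - c, c - lo)
    if flap > rest:
        raise ValueError("flap would overhang the stationary part")
    new = list(rect)
    new[fold.axis] = (c, hi) if fold.keep_high else (lo, c)
    return tuple(new)

def inside(point, rect):
    """True if the point lies on the rectangle, edges included."""
    return all(lo <= p <= hi for p, (lo, hi) in zip(point, rect))

def unfold_holes(folds, punch, sheet=((F(0), F(1)), (F(0), F(1)))):
    """Return the sorted hole positions on the fully unfolded sheet."""
    # Record the paper's outline before each fold and after the last one.
    rects = [sheet]
    for fold in folds:
        rects.append(rect_after(rects[-1], fold))
    if not inside(punch, rects[-1]):
        raise ValueError("punch misses the folded paper")
    holes = {punch}
    # Undo the folds from last to first, mirroring holes across each crease.
    for fold, before in zip(reversed(folds), reversed(rects[:-1])):
        mirrored = set()
        for h in holes:
            m = list(h)
            m[fold.axis] = 2 * fold.crease - h[fold.axis]
            mirrored.add(tuple(m))
        holes |= {m for m in mirrored if inside(m, before)}
    return sorted(holes)

# Fold the left half onto the right, then the top half down onto the bottom,
# and punch once in the middle of the remaining quarter.
example_folds = [Fold(0, F(1, 2), True), Fold(1, F(1, 2), False)]
example_holes = unfold_holes(example_folds, (F(3, 4), F(1, 4)))
```

With two half-folds the paper is four layers thick at the punch point, so the unfolded sheet carries four holes, one in each quadrant. Exact fractions are used so that mirrored coordinates compare equal without rounding issues.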
Subtest design is one of the most carefully studied areas of psychometrics. The goal of any subtest is to measure a specific cognitive ability with high reliability while minimizing confounds with unrelated abilities, prior knowledge, and test-taking strategy. Achieving that goal requires substantial effort: items must be piloted on large samples, their item-response-theory parameters estimated, and the best-performing items selected for inclusion in the operational test.
Modern subtests are typically scored using item-response theory (IRT) rather than simple sum scores. IRT produces ability estimates that account for each item's difficulty and its discrimination, that is, how well the item separates high-ability from low-ability test-takers. The resulting scores are more precise than raw counts and allow direct comparison across alternate forms of the same test.
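To make the contrast with sum scores concrete, here is a small sketch under the assumption of a two-parameter logistic (2PL) model, in which each item has a discrimination a and a difficulty b. The text above does not say which IRT model any particular test uses, so the model choice, the item parameters, and the function names below are all illustrative. Ability is estimated by a simple grid search over the likelihood, which is cruder than the methods used in operational scoring.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct answer at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if u else math.log(1.0 - p)
    return total

def estimate_ability(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Maximum-likelihood ability estimate found by grid search on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical (discrimination, difficulty) pairs, made up for illustration.
items = [(0.8, -1.0), (1.0, 0.0), (2.0, 1.0)]

# Two test-takers with the same raw score of 2 out of 3.
theta_easy_pair = estimate_ability(items, [1, 1, 0])  # solved the two easier items
theta_hard_pair = estimate_ability(items, [0, 1, 1])  # solved the two harder items
```

Both response patterns give the same sum score, yet under these assumed parameters the second pattern yields the higher ability estimate, because it includes the hardest and most discriminating item. A pattern that is all correct or all incorrect has no finite maximum-likelihood estimate, so the grid search would simply return an endpoint of the range.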
For the paper-folding test, the underlying cognitive demand is well-characterized in the research literature. Strong performers typically employ specific strategies that can be partially taught, which is one reason the test is moderately coachable. The size of the practice effect depends on the test-taker's starting level, the amount and structure of practice, and the similarity between practice items and operational items.
Public-domain item pools, particularly the International Cognitive Ability Resource (ICAR) catalog and the items released by the Open-Source Psychometrics Project, make it possible to study these subtests in academic research without paying licensing fees to commercial publishers. This site's free screener uses original items modeled on these public conventions.
If you are particularly interested in this subtest, you can take a longer dedicated assessment via the public-domain projects linked in the Related Reading section. Single-domain assessments provide more precise estimates of the specific ability than the brief multi-domain screener used here.