Inferring cross sections of 3D objects: A new spatial thinking test

Cheryl A. Cohen a,⁎, Mary Hegarty b
a Northwestern University, United States
b University of California, Santa Barbara, United States

Learning and Individual Differences 22 (2012) 868–874. doi:10.1016/j.lindif.2012.05.007

Article history: Received 27 September 2011; Received in revised form 22 February 2012; Accepted 1 May 2012.
Keywords: Cross-sectioning skill; Spatial ability tests; Spatial visualization; Spatial strategies; Science education; STEM education.

Abstract
A new spatial ability test was administered online to 223 undergraduate students enrolled in introductory science courses. The 30-item multiple choice test measures individual differences in the ability to identify the two-dimensional cross section of a three-dimensional geometric solid, a skill that has been identified as important in science, technology, engineering and mathematics (STEM) fields. Bivariate and partial correlations suggest that the test measures a skill that is distinct from three-dimensional mental rotation and change in view perspective. Test items varied along two scales: complexity of the geometric solid to be sliced and orientation of the cutting plane. Internal reliability of both the overall test and its subscales was satisfactory. Performance was higher on figures cut by orthogonal, rather than oblique, planes. Patterns of performance across more and less complex items, and patterns of sex differences on these items, suggest that items on the test are differentially amenable to imagistic and analytic strategies, with males outperforming females on items that are less amenable to analytic strategies. The test shows promise for online administration and for adaptation to younger populations.
© 2012 Elsevier Inc. All rights reserved.

☆ This research was funded by grant DRL-0723313 from the National Science Foundation. We thank Drew Dara-Abrams and Paolo Gardinali for programming and administering the on-line tests and Jana Ormsbee for assistance with data collection. Additional support was provided by grant SBE-0541957 from the National Science Foundation.
⁎ Department of Psychology, Northwestern University, Swift Hall 102, 2029 Sheridan Road, Evanston, IL 60208-2710, United States. E-mail address: cheryl-cohen@northwestern.edu (C.A. Cohen).

1. Inferring cross sections: a new spatial test for STEM disciplines

Spatial thinking skills enable human beings to form and manipulate mental representations of actual and imagined shapes, objects and structures. There is now considerable evidence that spatial abilities contribute to performance in science, technology, engineering, and mathematics (STEM) fields. Correlational studies indicate that spatial thinking skills help engineers mentally construct a three-dimensional object from a set of two-dimensional drawings (Duesbury & O'Neil, 1996) and help geologists visualize cross sections of land forms beneath the earth's surface (Kali & Orion, 1996; Liben, Kastens, & Christensen, 2011; Orion, Ben-Chaim, & Kali, 1997). Longitudinal studies have demonstrated that spatial ability in high school predicts achievement in advanced STEM occupations in early adulthood (Wai, Lubinski, & Benbow, 2009), suggesting that a large resource of untapped spatial talent could be identified and nurtured through spatial ability testing and education (Lubinski, 2010). The strong evidence for the contribution of spatial thinking to STEM fields, along with evidence for the malleability of spatial skill (Baenninger & Newcombe, 1989; Uttal et al., in press; Wright, Thompson, Ganis, Newcombe, & Kosslyn, 2008), has convinced educators that spatial abilities are worthy of systematic identification and nurturance (National Research Council, 2006, p. 10).

A corollary to the call to educate spatial thinking is the need for reliable and valid assessments of spatial skills that are relevant to STEM disciplines.
Science educators have historically used classic domain-general spatial ability tests, such as the Vandenberg Mental Rotation Test (Vandenberg & Kuse, 1978) and the Paper Folding Test (Ekstrom, French, Harman, & Dermen, 1976), to measure effects of spatial ability on performance in their disciplines. These tests were developed within the factor analytic tradition, whereby the information contained in a number of variables is reduced to a smaller number of constructs, called factors. Within this tradition, mental rotation and paper folding tests load on the factor of spatial visualization, which is defined as the processes of apprehending, encoding, and mentally manipulating three-dimensional spatial forms (Carroll, 1993).

Although domain-general standardized spatial tests provide a good starting point for assessing spatial skill in STEM fields, they have limitations. Many standardized spatial tests were developed to predict performance in skilled crafts and trades (Smith, 1964), rather than to measure the types of spatial thinking skills needed in STEM tasks. Furthermore, because many existing spatial ability tests were developed via factor analysis, their design was not informed by theories of the cognitive processes that the tests measure or that account for individual differences in spatial thinking. An alternative approach is to target skills that contribute to performance in science and mathematics, specify the cognitive components of these skills, and develop theoretically motivated tests to measure performance on these skills.

One spatial thinking skill that contributes to performance in a number of STEM domains is the ability to infer the two-dimensional cross section of a three-dimensional object. There is evidence that this skill contributes to performance in the biological sciences and medicine. For example, Russell-Gebbett (1985) found that the ability to infer the shapes of cross sections of anatomical structures, and to understand the spatial relationships among the internal parts of anatomical cross sections, was positively correlated with success in biology, and Rochford (1985) found that students who had difficulty imagining spatial processes, including sectioning, also had difficulty in practical anatomy classes. The ability to infer cross sections is also important in comprehending and using medical images such as x-ray and magnetic resonance imaging (Hegarty, Keehner, Cohen, Montello, & Lippa, 2007).
Cross-sectioning skill is also central to geology, where it has been referred to as "visual penetration ability" (Kali & Orion, 1996; Orion et al., 1997), and it is correlated with success in geometry (Brinkmann, 1966; Pittalis & Christou, 2010). Finally, understanding the cross-sectional structure of materials is a fundamental skill in engineering (Duesbury & O'Neil, 1996; Gerson, Sorby, Wysocki, & Baartmans, 2001; Lajoie, 2003).

Motivated by the relevance of cross-sectioning skill in many scientific domains, this paper reports the development of a psychometric measure of cross-sectioning ability, the Santa Barbara Solids Test (SBST). In contrast to a previously published mental cutting test (the "Schnitte" test), which measures this ability in participants with extremely high spatial ability (Quaiser-Pohl, 2003), the SBST was designed to measure performance differences in university undergraduates with a normal distribution of spatial skill. The measure also provides information on sources of difficulty in inferring the cross sections of three-dimensional objects and allows investigation of the cognitive components of this skill.

1.1. Test description

The stimuli used in the Santa Barbara Solids Test are pictures of simple or complex geometric solids intersected by a cutting plane (Fig. 1). As shown in Fig. 2, the participant is asked to identify, from four multiple-choice answers, the two-dimensional shape that would result if the three-dimensional object were sectioned as indicated.

The 30 test figures in the SBST vary in complexity, which is assumed to correlate with the demand on spatial working memory (Miyake, Rettinger, Friedman, Shah, & Hegarty, 2001). There are three levels of geometric complexity in the test: simple, joined, and embedded figures. Simple figures (10 items) are primitive geometric solids: cones, cubes, cylinders, prisms, or pyramids. Joined figures (10 items) are two simple solids attached at their edges. Embedded figures (10 items) consist of one simple solid enmeshed inside another. In all figures, each simple solid is represented by a distinct color. The use of primitive geometric solids, and of compound shapes made up of these solids, was motivated by research which holds that the most elementary recognizable three-dimensional forms are primitive solids (Biederman, 1987; Pani, Jeffres, Shippey, & Schwartz, 1996).

The stimuli also vary in cutting plane orientation. Half of the items have cutting planes that are orthogonal (horizontal or vertical) to the main vertical axis of the test figure, and half have cutting planes that are oblique to this axis. Mental transformations of objects with axes oblique to the environmental frame of reference are more difficult to perform than mental transformations of objects whose main axes are orthogonal to the environment (Appelle, 1972; Rock, 1973; Pani, Zhou, & Friend, 1997). Fig. 1 illustrates the three levels of geometric structure and the two levels of cutting plane orientation in the test figures: Fig. 1a is a simple figure with an orthogonal (horizontal) cutting plane; Fig. 1b is a joined figure with an orthogonal (vertical) cutting plane; Fig. 1c is an embedded figure with an oblique cutting plane.

Fig. 1. Geometric figures in the Santa Barbara Solids Test varied along two parameters: geometric structure and orientation of the cutting plane. The above figures are: a) a simple figure with an orthogonal cutting plane; b) a joined figure with an orthogonal cutting plane; and c) an embedded figure with an oblique cutting plane.
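For readers who want to score the test by subscale, the 3 × 2 item design described above can be represented as a small metadata table. The sketch below (in Python) is illustrative only: the assignment of item numbers to design cells is hypothetical, since the actual item order is not specified here.

from itertools import product

STRUCTURES = ("simple", "joined", "embedded")
ORIENTATIONS = ("orthogonal", "oblique")

# 3 structure levels x 2 cutting-plane orientations x 5 items per cell = 30 items.
# The item numbering below is hypothetical, not the published item order.
items = [{"item": i + 1, "structure": s, "orientation": o}
         for i, (s, o, _) in enumerate(product(STRUCTURES, ORIENTATIONS, range(5)))]

def subscale(item_list, **criteria):
    # Item numbers belonging to a subscale, e.g. subscale(items, orientation="oblique").
    return [it["item"] for it in item_list
            if all(it[key] == value for key, value in criteria.items())]

print(len(subscale(items, structure="embedded")))      # 10 items
print(len(subscale(items, orientation="orthogonal")))  # 15 items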
The answer choices of the test were designed to diagnose different types of errors. These are illustrated by the item in Fig. 2, which shows a simple figure with an orthogonal cutting plane. Fig. 2c is the correct answer. Fig. 2d is an egocentric distracter, which represents the shape a participant might imagine if they failed to change their view perspective relative to the cutting plane of the criterion figure. Fig. 2b is a combination distracter, which merges two possible sections of the test figure into a hybrid shape, and Fig. 2a is an alternate distracter that shows another possible slice of the test figure.

Fig. 2. Simple figure with an orthogonal cutting plane. The correct answer is (c).

1.2. Previous test administration

A previous paper-and-pencil administration of the Santa Barbara Solids Test with 59 participants (Cohen & Hegarty, 2007) established that the internal reliability of the test (Cronbach's α = .86) was satisfactory. This study also found significant positive correlations between total score on the test and measures of mental rotation (Vandenberg & Kuse, 1978) and perspective taking (a modified version of Guay's Visualization of Views Test; Eliot & Smith, 1983), suggesting that the skill it measures is a manifestation of spatial ability.

This study also suggested that some items on the test were amenable to different strategies. We originally assumed that test items would be solved using a visual imagery (imagistic) strategy in which participants first construct an internal representation of the stimulus object, then imagine slicing the object and removing the section of the sliced geometric figure between the viewer and the cutting plane, and finally observe the shape of the cut surface of the geometric figure from an orientation orthogonal to the cut surface. On the basis of this assumed imagistic strategy, we hypothesized that joined and embedded figures would be more difficult than simple figures, because imagining the spatial transformations of more complex figures would be more demanding of visuo-spatial working memory (Miyake et al., 2001). We also hypothesized that items with oblique cuts would be more difficult than items with orthogonal cuts, following previous research in the mental imagery literature (Appelle, 1972; Rock, 1973; Pani et al., 1997). However, although performance was better in general on orthogonal than on oblique test figures, there was an interaction between cutting plane and geometric complexity: for orthogonal items, performance was highest on embedded test figures and lowest on simple test figures, whereas for oblique items, performance was highest on joined items and lowest on embedded items. This unexpected interaction between cutting plane orientation and geometric complexity led us to suggest that some participants used analytic strategies, such as task decomposition and feature matching (cf. Schultz, 1991; Hegarty, 2010), to solve some of the items, rather than relying exclusively on imagistic strategies.

1.3. Present study

One goal of the present study was to replicate the results of Cohen and Hegarty (2007) with online administration and with a larger sample.
Given that the different subscales of the test may prove to be diagnostic of different strategies, we also wished to establish the reliability (internal consistency) of these subscales as well as of the overall test. Based on the previous study, we predicted that high spatial participants would outperform low spatial participants across all subscales of the test. We also predicted that oblique problems would be more difficult than orthogonal problems, and we anticipated an interaction between geometric structure and cutting plane orientation, as found previously.

A second goal was to establish whether the test measures a construct that is distinct from other measures of spatial ability. To this end, we assessed the zero-order and partial correlations of the test and its subscales with two spatial visualization measures. We also investigated the associations between participants' SAT Math and SAT Reading scores and performance on cross section problems. There is evidence that spatial ability is more closely associated with mathematics skill than with verbal or logical reasoning ability (Battista, 1990; Hegarty & Waller, 2005).

Finally, we investigated sex differences in this task, as there is evidence for a significant male advantage on some spatial measures, particularly mental rotation (Ceci & Williams, 2010; Linn & Petersen, 1985; Voyer, Voyer, & Bryden, 1995), and for sex differences in the use of imagistic versus analytic strategies in spatial thinking tasks (Hegarty, 2010; Heil & Jansen-Osmann, 2008).

2. Method

2.1. Participants

Two hundred and twenty-three undergraduate students (89 females, 132 males; 2 students did not indicate their sex) enrolled in introductory chemistry classes at a research university received either payment ($20) or course credit for their participation. One hundred and ninety-two students identified themselves as science majors (biological sciences, chemistry, environmental sciences, engineering, mathematics, physics, or psychology). The remaining 31 students identified themselves as business majors, as undecided about their majors, or as not science majors.

2.2. Materials

The online assessment included the Santa Barbara Solids Test and two psychometric measures of spatial ability: the Vandenberg Mental Rotation Test (Vandenberg & Kuse, 1978) and a modified version of Guay's Visualization of Views Test (Eliot & Smith, 1983). Participants also reported their sex, undergraduate majors, the number of high school and college science classes they had taken, and their SAT Reading and SAT Math scores.¹,²

¹ The cross-section test originally was printed in two contrasting colors. However, a grayscale presentation of the test, whether in print or online, can preserve the visual contrast needed to distinguish between the structure and features of individual shapes in complex (joined and embedded) figures.
² Although students tend to over-estimate their SAT scores, there is a strong correlation between students' self-reported scores and their actual SAT scores (Mayer et al., 2007). Thus, we accepted students' self-reported SAT scores as valid measures of their actual performance.

2.3. Spatial ability tests

2.3.1. Mental Rotation Test (Vandenberg & Kuse, 1978)
In the Vandenberg Mental Rotation Test, the participant views a depiction of a three-dimensional target figure and four test figures. The participant's task is to determine as quickly and accurately as possible which test figures are rotations of the target. There are two sections with 10 problems each, and a time limit of 3 min per section. The maximum score is 80.

2.3.2. Visualization of Views Test (Eliot & Smith, 1983)
A modified version of Guay's Visualization of Views Test assesses the participant's ability to visualize an unfamiliar three-dimensional object from an imagined perspective. The criterion figure is a cube with a smaller, irregularly shaped block floating in its center. The participant is asked to circle the corner of the cube from which an alternate view of the small block (shown below the cube) would be visible.
There are 24 problems to be completed in 8 min. The maximum score is 24.

2.3.3. Santa Barbara Solids Test
The Santa Barbara Solids Test consists of 30 multiple choice problems. Three levels of geometric structure and two types of cutting plane are distributed evenly across the 30 problems: one-third of the figures are simple solids, one-third are joined solids, and one-third are embedded solids. Fifteen of the test items are bisected by an orthogonal cutting plane, and fifteen by an oblique cutting plane. This test was administered last with no time limit, but participants generally took less than 5 min to complete it.

2.4. Procedure

Students participated in groups of up to 10 at a time in a computer laboratory. Each student was seated at a different computer workstation and worked alone. They first completed a brief online demographics questionnaire. Then the Mental Rotation and Visualization of Views Tests were presented, in that order, with each test preceded by its standard instructions. Next, written instructions for the SBST, including a sample problem, were presented (see Fig. 3). After completing the sample problem, participants were directed to continue with the 30 test items.

Fig. 3. Online instructions for the Santa Barbara Solids Test (a) defined the term cross section and (b) directed participants to assume a view perspective that was perpendicular to the cutting plane of the test figure.

3. Results

Due to researcher error, the correct answer for one test item (an embedded figure with an oblique cutting plane) was not included in the answer choices. This item was eliminated from all analyses, leaving 29 scored problems. Cronbach's α across the 29 scored items was .91, surpassing the overall reliability of the earlier administration of the test (Cohen & Hegarty, 2007). Cronbach's α values for the major subscales of the test were: simple figures, α = .79; joined figures, α = .80; embedded figures, α = .73; orthogonal figures, α = .84; and oblique figures, α = .85. In sum, both the whole scale and its subcomponents had satisfactory reliability.
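For reference, the reliability analysis described above can be reproduced from item-level data. The following is a minimal sketch (in Python), assuming a participants-by-items matrix of 0/1 accuracy scores; the variable names and the simulated data are placeholders, not the study data.

import numpy as np

def cronbach_alpha(scores):
    # scores: participants x items array of item scores (here 0/1 accuracy).
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)          # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)      # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated stand-in for the real 223 x 29 item-level accuracy matrix;
# the resulting value is meaningless until real responses are substituted.
rng = np.random.default_rng(0)
responses = (rng.random((223, 29)) < 0.68).astype(float)
print(cronbach_alpha(responses))                    # whole-scale alpha
# Subscale alphas use the same function on the relevant columns, e.g.
# cronbach_alpha(responses[:, oblique_columns]).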
3.1. Descriptive statistics

Table 1 shows descriptive statistics for the psychometric spatial ability tests and the SAT measures. Descriptive statistics for the Santa Barbara Solids Test and its subscales are given in Table 2. On average, participants answered about two-thirds of the items correctly, indicating that this is a challenging test for undergraduate science students. The scores were negatively skewed (skewness = −.57, SE = .16); however, only eight of the 223 participants had a perfect score, indicating that there was not a strong ceiling effect.³

Table 1
Means and standard deviations, spatial ability tests, SAT Math and SAT Reading, n = 223.

                                   M        SD       Range
Vandenberg Mental Rotation Test    40.57    20.02    −8 to 80
Visualization of Views Test        13.54    6.90     −3.01 to 24
SAT Math                           645.07   90.79    360 to 800
SAT Reading                        609.35   82.25    350 to 800

Table 2
Means and standard deviations (measure is proportion correct), Santa Barbara Solids Test, total score and subscales, n = 223.

                                       M      SD
All figures (29 items)                 .68    .23
Simple structure (10 items)            .68    .27
Joined structure (10 items)            .68    .26
Embedded structure (9 items)           .68    .25
Orthogonal cutting plane (10 items)    .78    .22
Oblique cutting plane (9 items)        .58    .28

³ Three students had a perfect score on the Vandenberg Mental Rotation Test, 11 students had a perfect score on the Visualization of Views Test, 2 students reported SAT Reading scores of 800, and 6 students reported SAT Math scores of 800.

3.2. Relations among the subscales and their relation to spatial visualization ability

As presented in the top half of Table 3, correlations among the five subscales of the Santa Barbara Solids Test were medium to large (cf. Cohen, 1988) and statistically significant. Using the Bonferroni approach to control for Type I error across the 15 correlations, a p value of less than .003 (.05/15 = .003) was required for significance. The correlations indicate that the subscales measure a common ability or skill. The Vandenberg Mental Rotation Test and the Visualization of Views Test were also significantly correlated, r(221) = .57, p < .001, and were combined into an aggregate spatial visualization score, computed by averaging the z-scores on the two measures. Partial correlation coefficients were then computed among the subscales of the SBST, holding constant the aggregate spatial measure. If spatial visualization ability, as measured by the aggregate spatial score, were the only ability measured by the subscales of the SBST, then the partial correlations should not be significantly different from zero. However, as reported in the bottom half of Table 3, they were significantly greater than zero (in this analysis a p value of less than .005 (.05/10 = .005) was required for significance, as there were 10 partial correlations). These results suggest that spatial visualization ability is not the sole determinant of performance on the Santa Barbara Solids Test. Rather, this test measures a skill, such as visual penetrative skill, that is somewhat distinct from those measured by the Vandenberg Mental Rotation Test and the Visualization of Views Test.

Table 3
Bivariate and partial correlations among subscales of the Santa Barbara Solids Test and the aggregate spatial score, n = 223.

Subscales        Simple   Joined   Embedded   Orthogonal   Oblique
Bivariate correlations
Joined           .75**
Embedded         .76**    .75**
Orthogonal       .86**    .83**    .83**
Oblique          .86**    .88**    .86**      .75**
Spatial score    .55**    .52**    .63**      .57**        .58**
Partial correlations controlling for spatial score
Joined           .65**
Embedded         .63**    .63**
Orthogonal       .80**    .77**    .74**
Oblique          .80**    .83**    .78**      .63**

** p < .001 for bivariate correlations and partial correlations.
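The aggregation and partial-correlation analysis described in this section can be sketched as follows. The code assumes a data frame with one row per participant and illustrative column names for the two spatial tests and the five SBST subscale scores; it is a sketch of the analysis logic, not the original analysis script.

import numpy as np
import pandas as pd

def zscore(x):
    return (x - x.mean()) / x.std(ddof=1)

def partial_corr(x, y, covar):
    # Correlation between x and y after regressing the covariate out of each.
    X = np.column_stack([np.ones(len(covar)), covar])
    def resid(v):
        beta, *_ = np.linalg.lstsq(X, v, rcond=None)
        return v - X @ beta
    rx = resid(np.asarray(x, dtype=float))
    ry = resid(np.asarray(y, dtype=float))
    return np.corrcoef(rx, ry)[0, 1]

def subscale_partial_corrs(df):
    # Aggregate spatial score: mean of the two z-scored spatial tests.
    spatial = (zscore(df["mental_rotation"]) + zscore(df["views"])) / 2
    subscales = ["simple", "joined", "embedded", "orthogonal", "oblique"]
    # Bonferroni-corrected threshold for the 10 partial correlations: .05 / 10 = .005.
    table = pd.DataFrame(index=subscales, columns=subscales, dtype=float)
    for i, a in enumerate(subscales):
        for b in subscales[i + 1:]:
            table.loc[b, a] = partial_corr(df[a], df[b], spatial.to_numpy())
    return table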
3.3. Relation to SAT performance

Multiple linear regression analyses were conducted to predict overall performance on the Santa Barbara Solids Test from two predictors, SAT Reading and SAT Math scores, entered hierarchically. As predicted, SAT Math was the better predictor of performance on the test. SAT Reading alone was a significant predictor of performance on the SBST, R² = .09, F(1, 221) = 21.53, p < .001, but SAT Math accounted for a significant proportion of variance over and above SAT Reading, R² change = .13, F(1, 220) = 34.84, p < .001. In contrast, SAT Reading did not predict performance over and above SAT Math. The zero-order correlation of SAT Math with SBST performance was .45. Participants' college majors and the number of science classes they had taken in high school or college were not significant predictors of performance on the cross section measure.

3.4. Geometric structure and orientation of cutting plane

We conducted a 2 (orthogonal, oblique) × 3 (simple, joined, embedded) within-subjects, repeated measures analysis of variance (ANOVA) to determine the contributions of orientation of the cutting plane and geometric structure to performance on the Santa Barbara Solids Test. Performance means by subscale are shown in Fig. 4. There was a significant main effect of cutting plane orientation, F(1, 222) = 224.87, p < .001, ηp² = .50. As predicted, performance was higher on orthogonal figures (M = .78, SD = .22) than on oblique figures (M = .58, SD = .29) across the three levels of geometric structure, t(222) = 4.78, p < .001, and separately for simple figures, t(222) = 11.80, p < .001, joined figures, t(222) = 14.69, p < .001, and embedded figures, t(222) = 14.67, p < .001. This result converges with our previous findings in a smaller study (Cohen & Hegarty, 2007).

There was no significant main effect of geometric structure (simple, joined, embedded figures). However, there was a significant interaction between orientation of the cutting plane and geometric structure, F(2, 444) = 33.85, p < .001, ηp² = .13, as reported by Cohen and Hegarty (2007). Across figures with orthogonal cutting planes, means for joined (M = .80, SD = .02) and embedded (M = .80, SD = .02) figures were slightly higher than the mean for simple orthogonal figures (M = .73, SD = .02). The difference between simple-orthogonal and joined-orthogonal problems was significant, t(222) = 4.09, p < .001, as was the difference between simple-orthogonal and embedded-orthogonal items, t(222) = 5.04, p < .001. There was no significant difference in performance between joined-orthogonal and embedded-orthogonal items. For oblique figures, the effects of geometric structure were larger, with the highest mean performance on simple items (M = .64, SD = .02), followed by joined (M = .60, SD = .02) and embedded (M = .53, SD = .02) items. The difference in performance between simple-oblique and joined-oblique items was significant, t(222) = 4.81, p < .001, as was the difference between simple-oblique and embedded-oblique items, t(222) = 5.60, p < .001. There was no significant difference in performance between joined-oblique and embedded-oblique items.

In summary, the pattern of performance is not completely consistent with the use of a pure imagistic strategy on this task, in which imagining transformations of more complex shapes should be more demanding of working memory. Items with more complex shapes were more difficult for oblique figures only. As reported by Cohen and Hegarty (2007), accuracy was slightly higher on joined and embedded orthogonal figures than on simple orthogonal figures. These results are suggestive of the use of analytic strategies as well as imagistic strategies on the cross section task.
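The 2 × 3 repeated measures ANOVA reported in this section can be carried out with standard statistical software. A minimal sketch using the statsmodels AnovaRM class is given below; the long-format data frame and its column names are assumptions for illustration, not part of the original materials.

import pandas as pd
from statsmodels.stats.anova import AnovaRM

def rm_anova(long_df: pd.DataFrame):
    # long_df: one row per participant x condition cell, with assumed columns
    # 'subject', 'orientation' (orthogonal/oblique), 'structure'
    # (simple/joined/embedded), and 'accuracy' (mean proportion correct in that cell).
    model = AnovaRM(data=long_df, depvar="accuracy", subject="subject",
                    within=["orientation", "structure"])
    res = model.fit()
    return res.anova_table   # F tests for the two main effects and their interaction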
3.5. Patterns of error

We analyzed the frequency of the four answer choices (correct, egocentric, combination, and alternate) over the 29 problems. Overall, participants chose the correct answer more than half the time (mean proportion correct = .69, SD = .11). The most frequently chosen distracter was the egocentric answer, which was chosen almost one-fifth of the time (M = .19, SD = .11). Combination (M = .07, SD = .05) and alternate (M = .05, SD = .06) answers were chosen less frequently and approximately equally. Eighty percent of the participants made at least one egocentric error, and 60% made two or more egocentric errors.

3.6. Sex differences

We conducted a repeated measures analysis of variance with sex as the between-subjects factor and orientation of the cutting plane and object complexity as within-subject factors. There was a significant main effect of sex, with males (mean proportion correct = .71, SD = .24) outperforming females (mean proportion correct = .64, SD = .22), F(1, 219) = 6.20, p = .01, ηp² = .03. There was also a significant interaction between geometric structure and sex, F(2, 219) = 8.78, p < .001, ηp² = .04. As shown in Fig. 4, males significantly outperformed females on simple and embedded figures, but not on joined figures. We performed an additional repeated measures analysis of variance with the aggregate spatial score as a covariate. The effect of spatial ability was significant in this analysis, F(1, 218) = 123.17, p < .001, ηp² = .36. After covarying for spatial ability, there was no longer a significant main effect of sex, p = .26. However, the interaction between sex and geometric structure remained, F(2, 217) = 7.56, p = .001, ηp² = .07.

Fig. 4. Mean performance by sex on the Santa Barbara Solids Test, showing interactions of geometric structure with cutting plane and with sex (n = 221; the data of two participants were not included because they did not report their sex). Error bars represent +/− one standard error.

The significant interaction between geometric structure and sex, holding constant spatial ability, raised the question of whether sex and spatial ability make independent contributions to performance. To investigate this question we conducted a mediation analysis following the guidelines provided by Baron and Kenny (1986). We hypothesized that performance on the cross-section task would be correlated with both sex and spatial ability. Using males as our reference group, we obtained the following unstandardized regression equation to predict the relative contributions of sex and spatial ability to proportion correct (the coefficient b1 represents the difference between the mean male and female scores):

Y′ (proportion correct) = a (intercept) + b1 (sex) + b2 (aggregate spatial score)

As predicted, the zero-order correlations of sex (.16, p < .05) and spatial ability (.62, p < .001) with performance on the Santa Barbara Solids Test were significant. In the first equation, we regressed Santa Barbara Solids Test performance on sex and found that sex was a significant predictor of performance (β = .16, p = .02); the R² change was .03, p < .01. In the second equation, we regressed performance on spatial ability (our proposed mediator) and found a significant relationship (β = .61, p < .001); the R² change was .38, p < .001. In the third equation, we regressed performance on both sex and spatial ability and found a significant association between spatial ability and performance (β = .64, p < .001) and a non-significant relationship between sex and performance (β = −.07, p = .34). The R² change for the third equation was .38, p < .001. Taken together, these regression analyses suggest that spatial visualization ability mediates the relationship between sex and performance on the Santa Barbara Solids Test. The results of these mediation analyses are shown in Fig. 5.

Fig. 5. Mediation analysis path diagram.
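The three mediation regressions described above follow the Baron and Kenny (1986) steps and can be sketched with ordinary least squares. The data frame and column names below are illustrative assumptions, and coding sex as a 0/1 indicator is one possible implementation of the analysis described in the text.

import statsmodels.formula.api as smf

def mediation_steps(df):
    # df: one row per participant with assumed columns 'sbst' (proportion
    # correct), 'sex' (0/1 indicator; the coding is an assumption) and
    # 'spatial' (aggregate z-scored spatial ability).
    m1 = smf.ols("sbst ~ sex", data=df).fit()            # step 1: sex -> performance
    m2 = smf.ols("sbst ~ spatial", data=df).fit()        # step 2: mediator -> performance
    m3 = smf.ols("sbst ~ sex + spatial", data=df).fit()  # step 3: both predictors together
    return m1, m2, m3

# Mediation is suggested when the sex coefficient in m3 shrinks toward zero
# while the spatial-ability coefficient remains significant.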
4. General discussion

In summary, we developed and administered a new online spatial test that assesses the ability to identify the two-dimensional cross section of a three-dimensional geometric solid, a skill that is important in many STEM disciplines, including the biological and medical sciences, geology, engineering, and mathematics. Test items varied along two parameters of hypothesized difficulty: complexity of the geometric object being sliced and orientation of the cutting plane. The results were consistent with a previous paper-and-pencil administration of the test with a smaller sample (Cohen & Hegarty, 2007). That is, high spatial participants outperformed low spatial participants across all subscales of the test; there was a main effect of cutting plane orientation, such that figures with oblique cutting planes were consistently more difficult than figures with orthogonal cutting planes; and there was a significant interaction between cutting plane orientation and geometric complexity.

The internal reliability of the test exceeded that reported by Cohen and Hegarty (2007). This study also established satisfactory internal reliability for the test subscales. These reliability estimates enable researchers to compute subscale scores with confidence in their reliability. Bivariate and partial correlations demonstrated that spatial visualization ability was not the sole determinant of performance across the subscales of the test. As expected, and consistent with previous studies (Casey, Nuttall, Pezaris, & Benbow, 1995; Webb, Lubinski, & Benbow, 2007), SAT Math score was a stronger predictor of performance than SAT Reading score. In summary, the test is reliable and measures a distinct skill, although it also shares variance with other measures of spatial and mathematics performance.

Analyses of the contributions of sex and spatial ability to performance showed that the main effect of sex was diminished after controlling for spatial ability, although a significant interaction between sex and geometric structure remained. Males significantly outperformed females on simple and embedded test figures at both cutting plane orientations, but not on joined figures. What could account for the significant sex differences in performance on embedded problems, but not on joined problems? The relative difficulty of the different subscales of the test, and its interaction with sex, is consistent with literature demonstrating that people can use a variety of strategies on spatial tasks, including imagistic and more analytic strategies (Schultz, 1991; Hegarty, 2010). These analytic strategies include task decomposition, feature matching, and rule-based reasoning.
First, participants might use a task decomposition strategy on joined problems, in which they separately consider the possible cross sections of the two joined shapes and eliminate answer choices in which one of the shapes is not possible. In contrast, imagining cross sections of simple and embedded items might require the use of imagistic strategies. We speculate that males may have a larger repertoire of spatial strategies (including imagistic strategies) than females; this speculation is consistent with the result that males outperformed females on simple and embedded, but not joined, figures. It is also consistent with the result that measures of mental rotation and perspective taking (solved primarily by imagistic strategies) mediated the correlation between sex and performance. In future studies it will be important to study the strategies used on this test more directly (e.g., using concurrent verbal protocols) to test the hypothesis that the male advantage on this task is due to success in applying a wider range of solution strategies.

4.1. Implications and future research

The test was challenging for our participants, who were primarily science students and therefore might be expected to have higher spatial skills than average. Thus the test appears to be at an appropriate level of difficulty for the target population (high school and college students). Subscales of the test may also be useful in screening younger populations. For example, developmental psychologists have created a multiple choice test that uses stimuli similar to our simple items to assess the understanding of cross-sectional shapes by 4- to 7-year-olds (Ratliff, McGinnis, & Levine, 2010). The test has also been used as a pre- and post-test measure in a training protocol that uses interactive animation to improve performance on the criterion task (Cohen & Hegarty, 2008). In this case, participants are trained on a subset of simple geometric figures, and transfer is measured by performance on complex geometric solids.

An important future research goal is to investigate the predictive validity of the Santa Barbara Solids Test in the context of undergraduate STEM classes. We hypothesize that the test will identify students who are at risk for poor performance in introductory science courses, especially in topics, such as geology, anatomy, and engineering drawing, that depend heavily on visual penetrative ability. The test may also be useful in assessing spatial thinking ability among elementary and high school students. Investigators who are interested in using this assessment are advised to report scale and subscale performance in terms of proportion correct, as this metric will permit comparison with previous administrations of the test.

One limitation of this test is the possible perceptual ambiguity inherent in representing a three-dimensional geometric figure in a two-dimensional image. The test figures were created in a three-dimensional computer modeling program using linear perspective cues, lighting, and shadows. However, some spatial information remains ambiguous in a two-dimensional representation of a three-dimensional figure. There are a number of ways to address this problem in future research. Participants could be shown small physical models of the test figures to remove any ambiguity about the shapes of the three-dimensional objects; Ratliff et al. (2010) used this approach with children.
Another possible solution is to adapt the test to an augmented or virtual reality display in which participants are allowed to rotate the figures to observe their shapes in three dimensions. The use of virtual reality displays to test and train spatial ability offers a number of additional benefits, including the ability to collect response latencies and information about solution strategies (Strasser et al., 2010). In addition, by simulating three-dimensional space, augmented and virtual reality testing environments offer an ecologically valid tool for measuring three-dimensional spatial skills.

In summary, we have developed an instrument that measures a spatial thinking skill that is common to many STEM fields and have shown that it is reliable. Correlations with measures of spatial visualization provide both convergent and discriminant validity, showing that the test is correlated with these measures but also captures something unique that is independent of them. An important goal for future studies is to investigate the predictive value of this test in other STEM fields that require cross-sectioning skill.

References

Appelle, S. (1972). Perception and discrimination as a function of stimulus orientation: The "oblique effect" in man and animals. Psychological Bulletin, 78, 226–278.
Baenninger, M., & Newcombe, N. (1989). The role of experience in spatial test performance: A meta-analysis. Sex Roles, 20(5–6), 327–344.
Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182.
Battista, M. T. (1990). Spatial visualization and gender differences in high school geometry. Journal for Research in Mathematics Education, 21(1), 47–60.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2), 115–147.
Brinkmann, E. H. (1966). Programm