12 Seeing and Psychophysics

12.1 Hubble telescope photograph
Courtesy NASA.

Why read this chapter?
What is the dimmest light that can be seen? What is the smallest difference in light that can be seen? These related questions are the subject of this chapter, which uses image light intensity (luminance) as an example, even though the ideas presented here apply to any stimulus property, such as motion or line orientation. We begin by introducing the idea of a threshold luminance, which, in principle, defines the dimmest light that can be seen. When we come to describe how to measure a threshold, we find that it varies from moment to moment, and between different individuals, depending on their willingness to say yes, rather than on their intrinsic sensitivity to light. Two solutions to this dilemma are described. First, a two-alternative forced choice (2AFC) procedure measures sensitivity to light. Second, signal detection theory provides a probabilistic framework which explains sensitivity without the need to invoke the idea of thresholds. We discuss how these methods can be used to redefine the notion of a threshold in statistical terms. Finally, we give a brief historical overview of psychophysics.

12.2 Transition from not seeing to seeing
[Figure: two plots, a and b, with luminance (0 to 6) on the horizontal axis; the threshold marked in b lies at a luminance of 3.]
a In an ideal world, the probability of seeing a brief flash of light would jump from zero to one as the light increased in luminance.
b In practice, the probability of a yes response increases gradually, and defines a smooth S-shaped or sigmoidal psychometric function. The threshold is taken to be the midpoint of this curve: the luminance at which the probability of seeing the light is 0.5. Most quantities on the abscissa (horizontal axis) in plots of this kind are given in log units, so that each increment (e.g., from 3 to 4) along this axis implies a multiplicative change in magnitude. For example, if the log scale being used was to the base 10 then a single step on the abscissa would mean the luminance increased by a factor of ten.

Imagine you are with a friend watching the stars come out as twilight fades into night, 12.1. Your friend calls out that he can see a star in a certain location in the sky. You try hard but can't make it out, even though you are sure you are looking in the right place. You conclude that your friend has a much better visual system than your own when it comes to detecting faint light sources.

This kind of situation is not unusual. We are quite often called upon to detect faint stimuli, which relies on an ability to discriminate between the presence or absence of such stimuli. So detection and discrimination are intimately related, although they are often treated as distinct abilities. An example of discrimination is the ability to judge which of two stars is brighter. Psychologists have evolved some clever techniques for making careful measurements of people's ability to detect and/or discriminate stimuli. We explain these psychophysical methods and their theoretical underpinnings in this chapter. Throughout this chapter we use the example of luminance, but these methods can be applied to any perceived quantity, such as contrast, color, sound, and weight.

Psychophysical methods have to be highly sensitive if they are to do justice to the sensitivity of the human visual system. For example, the fact that photoreceptors can signal the presence of a single photon was first deduced using a simple psychophysical method by Hecht, Shlaer and Pirenne in 1942 (a photon is a discrete packet or quantum of light). It was confirmed about 40 years later by measuring the outputs of single photoreceptors. This high degree of visual sensitivity is why we can easily see a white piece of paper in starlight, even
though each retinal photoreceptor receives only 1 photon of light every hundred seconds from the paper (with no moonlight and no stray light from man-made sources). To put it another way, each set of 100 photoreceptors receives an average of about 1 photon per second in starlight. Whichever way we choose to put it, the eye is a remarkably efficient detector of light. (Note that only the rods, Ch 6, are active in the extremely low light level of starlight.)

The smallest luminance that we can see is called the absolute threshold, whereas the smallest perceptible difference in luminance is called the difference threshold, difference limen, or just noticeable difference (JND), for the historical reason that it was deemed to be just noticeable. But exactly what do we mean by "see" in terms of absolute thresholds and JNDs? This is a question that caused researchers in the 19th and early 20th centuries to develop what we will refer to as classical psychophysics.

Classical Psychophysics: Absolute Thresholds

Suppose you were presented with a series of brief light flashes of increasing luminance, starting from a luminance that lies below your absolute threshold. As the luminance is increased you might expect that there comes a point at which you are suddenly able to see the light flashes, so that the probability of seeing light flashes increases from zero (no flash is seen) to certainty (probability = 1; every flash is seen), 12.2a. However, in practice this isn't what happens. Instead, the probability of seeing light flashes gradually increases as luminance increases, as depicted by the psychometric function in 12.2b.

This psychometric function implies that there comes a point, as the light flash luminance increases above zero, at which it starts to become visible, but it isn't seen reliably on every presentation. For many presentations of the same (low) luminance, sometimes you see the flash, and sometimes you don't. As the luminance is increased, the probability of seeing the flash increases gradually from zero to one. If you respond yes when the light is seen, and no when it is not, then your absolute threshold is taken to be the luminance which yields a yes response 50% of the time.

A major problem in measuring absolute thresholds plagued psychophysics for many years: different people have different criteria for saying yes. Suppose two people see the same thing when exposed to a very dim light. One of them may be willing to say yes, whereas the other may say no because s/he requires a higher degree of certainty. (Remember, we are talking here about very low luminances, for which percepts are anything but vivid.) Let's call these two people, who we assume have identical visual systems, Risky and Cautious, respectively. Both Cautious and Risky see the same thing, but Cautious demands a great deal of evidence before being willing to commit. In contrast, Risky is a more flamboyant individual who does not need to be so certain before saying yes.

Let's put Cautious and Risky in a laboratory, and show them a set of, say, 10 lights with luminance values ranging from zero, through very dim, to clearly visible. We present each of these lights 100 times in a random order, so there will be 1000 trials in this experiment (i.e., 100 trials per luminance level x 10 luminance levels). As shown in 12.2b, at very low luminance levels, the light cannot be seen, and so the proportion of yes responses is close to zero. As luminance increases, the proportion of yes responses increases, until at very high luminance values the proportion of yes responses is close to unity. If the proportion of yes responses for each of the 10 luminance levels is plotted against luminance then the data would look rather like the psychometric function in 12.2b.

However, if we compare the corresponding curves for Cautious and Risky, 12.3, we find that both psychometric functions are identical in shape, but that Cautious' curve is to the right of Risky's curve. This graph implies that a light which induces a yes response, say, 20% of the time from Risky might induce a yes response 5% of the time from Cautious. Thus, all of Cautious' responses "lag behind" Risky's responses in terms of luminance.

Recall that both Risky and Cautious see the same thing, but Risky is just more willing to say "Yes, that almost invisible dim percept was caused by a light flash rather than being an artefact of my visual system."

To sum up: the absolute threshold is the luminance which evokes a yes response on 50% of trials, but Risky's estimated threshold is much lower than Cautious', even though they both see the same thing. The difference between the estimated thresholds of Cautious and Risky arises only because Risky is predisposed to say yes more often than Cautious. This is clearly nonsensical because we are interested in measuring the sensitivities of Cautious and Risky's visual systems, not their personalities. Fortunately, it can be fixed using signal detection theory (SDT) and a procedure known as two-alternative forced choice (2AFC), both of which are described later.

12.3 Individual differences in responding
[Figure: proportion of yes responses against luminance (0 to 10).]
The proportion of trials on which two observers, Cautious (red dashed curve) and Risky (black solid curve), are willing to respond yes to the question: "Did you see a light?"

Just Noticeable Differences

The difference threshold or JND is usually defined as the change in luminance required to increase the proportion of yes responses from 50% to 75%. Consider two observers called Sensitive and Insensitive, whose psychometric functions are shown in 12.4a: Sensitive (solid black curve) and Insensitive (dashed red curve). A small increment in luminance has a dramatic effect on the proportion of yes responses from Sensitive, but the same luminance change has a relatively small effect on the proportion of yes responses from Insensitive. Therefore, Sensitive requires a smaller increase in luminance than Insensitive to raise the proportion of yes responses from 50% to 75%. It follows that Sensitive must have a smaller JND than Insensitive, even though both observers may have the same absolute threshold, as in 12.4b. Thus, the absolute threshold and the JND are independent quantities: knowing one tells us nothing about the value of the other one, in principle, at least. In practice, a low absolute threshold is usually associated with a small JND, but it is possible for an observer to have a low absolute threshold and a large JND, or vice versa.

12.4 Individual differences in psychometric function
[Figure: two plots, a and b, of the proportion of yes responses against luminance (0 to 10).]
a Two observers with different JNDs and different absolute thresholds.
b Two observers with different JNDs and the same absolute threshold.
Note: 12.3 shows two observers with the same JNDs, but different absolute thresholds.

To illustrate the independence between absolute thresholds and JNDs, notice that the curves in 12.3 have the same slopes. This means that Cautious and Risky have the same JND despite their different absolute thresholds. The point is that the JND is defined as the difference in luminance required to increase the proportion of yes responses from 50% to 75%, a change which is unaffected by the location of the curve. Thus, the data from Cautious and Risky give biased estimates of their absolute thresholds, but accurate estimates of their JNDs.

Caveat: The definition for the JND given above is: the change in luminance required to increase the proportion of yes responses from 50% to 75%. However, the value of 75% is quite arbitrary, and is used for historical reasons; it could just as easily have been 80% or 60%. The main thing for meaningful comparisons is to have a consistent measure of the steepness of the psychometric function. In fact, for reasons that will become clear, we will use 76% as the upper limit for the JND later in this chapter.

Signal Detection Theory: The Problem of Noise

The single most important impediment to perceptions around thresholds is noise. Indeed, as we will see, the reason why psychometric functions do not look like 12.2a is due to the effects of noise. This is not the noise of cats howling, tires screeching, or computer fans humming, but the random fluctuations that plague every receptor in the eye, and every neuron in the brain. The unwanted effects of noise can be reduced, but they cannot be eliminated. It is therefore imperative to have a theoretical framework which allows us (and the brain) to deal rationally with this unavoidable obstacle to perception. Such a framework is signal detection theory. We will explain its details in due course but for the moment we give a non-technical overview of the main ideas.

Think about what the brain has to contend with in trying to decide whether or not a light is present. The eye is full of receptors with fluctuating membrane potentials (receptors do not have firing rates), which feed into bipolar cells with fluctuating membrane potentials, and these in turn feed retinal ganglion cells with fluctuating firing rates, whose axons connect to brain neurons with their own fluctuating firing rates. All these fluctuations are usually small, but then so is the change induced by a very dim stimulus. More importantly, these fluctuations happen whether or not a stimulus is present. They are examples of noise. What chance does the brain stand of detecting a very dim light in the face of this noise? Surprisingly, the answer is, "a good chance." However, certainty is ruled out because of the effects of noise.

The amount of noise varies randomly from moment to moment. To illustrate this, let's concentrate first on the receptors. They have a resting potential of about −35 mV (millivolts), and their response to light is to hyperpolarize to about −55 mV. However, to keep things arithmetically simple (and to help later in generalizing from this example) we will define their mean resting potential as 0 mV (just imagine that +35 mV is added to each measurement of receptor output). It will help to have a symbol for a receptor's membrane potential and we will use r for this purpose (think of r standing for response). The variable r is known as a random variable because it can take on a different value every time it is observed due to the effects of noise. We will use u to denote the mean value of r.

The key point is that variations in r occur due solely to noise, as shown in 12.5a. If we construct a histogram of r values (as in 12.5b) then we see that the mean value of r is indeed equal to zero. For simplicity, we assume that this distribution of r values is gaussian or normal with mean u = 0 mV, 12.5c. In order to help understand how this histogram is constructed, note that the horizontal dashed (blue) line in 12.5a, which marks r = 10 mV, gets transformed into a vertical dashed (blue) line in the histogram of r values in 12.5b,c. As this distribution consists entirely of noise, it is known as the noise distribution.

So much for the noise distribution of responses when no stimulus is present. What happens when a light flash is shown? The flash response is added to the noise to create another distribution, called the signal distribution. Obviously, if the luminance of the flash is too low to create any effect in the visual system then the signal distribution is identical to the noise distribution. However, as the luminance of the flash increases, it begins to create a response, and so the signal distribution becomes shifted, 12.6a,b.

12.5 Noise in receptors
[Figure: a receptor output (mV) against trial number; b histogram of counts against receptor output (mV); c the normalized histogram with a gaussian curve overlaid.]
a If we measured the output r of a single photoreceptor over 1000 trials then we would obtain the values plotted here, because r varies randomly from trial to trial. Note that this receptor is assumed to be in total darkness here. The probability that r is greater than some criterion value c (set to 10 mV here) is given by the proportion of dots above the blue dashed line, and is written as p(r > c).
b Histogram of r values measured over 1000 trials shows that the mean receptor output is u = 0 mV and that the variation around this mean value has a standard deviation of 10 mV. The probability p(r > c) is given by the proportion of histogram area to the right of the criterion c indicated here by the blue dashed vertical line.
c The histogram in b is a good approximation to a gaussian or normal distribution of r values, as indicated by the solid (black) curve. Notice that this distribution has been "normalized to unit area." This means that the values plotted on the ordinate (vertical axis) have been adjusted so that the area under the curve adds up to unity. The resulting distribution is called a probability density function or pdf.

Let's now apply these basic ideas about noise by considering the output of a single receptor in the eye to a low intensity light. The brain is being asked to solve a difficult statistical problem: given that a certain receptor output r was observed during the last trial (which may last a second or two), was a light present or not? Remember that r varies
randomly from second to second whether or not a stimulus is present, due to noise. So some trials are associated with a receptor output caused by the dim light, and other trials have a larger receptor output even if no light is present, again, due to noise. Of course, on average, the receptor output tends to be larger when the dim light is on than when it is off. However, on a small proportion of trials, the presence of noise reverses this situation, and the receptor output associated with no light is larger than it is with the dim light on.

The upshot is that the observer has to set a criterion receptor output, which we denote as c, for saying yes. As this receptor output corresponds to a specific luminance, the observer effectively chooses a criterion luminance, 12.7. Sometimes this criterion yields a correct yes. On other trials, the fluctuating activity levels lead to an incorrect yes.

These ideas allow us to see why the psychometric function has its characteristic S-shape. Whether or not these responses are correct, the proportion of trials on which the criterion is exceeded, and therefore the proportion of yes responses, gradually increases as the luminance increases, 12.8.

Having given an outline of SDT, we are now ready to examine it in more detail.

12.6 Signal added to noise creates the signal distribution
[Figure: a receptor output (mV) against trial number (0 to 1000); b histograms of counts against receptor output (mV).]
a Measured values of receptor output r with the light off (lower red dots) and on (upper green dots), with the mean of each set of r values given by a horizontal black line.
b Histograms of the two sets of dots shown in a.

Signal Detection Theory: The Nuts and Bolts

In essence, SDT augments classical psychophysical methods with the addition of catch trials. If an absolute threshold is being measured then each catch trial contains a stimulus with zero luminance. If a JND is being measured then each trial contains two stimuli, and the observer has to state if they are different (with a yes response) or not (no response), and a catch trial has zero difference between two stimuli. The effect of this subtle change is dramatic, and leads to an objective measure of absolute thresholds, JNDs, and a related measure of sensitivity, known as d' (pronounced d-prime).

The question the observer has to answer on each trial is the same as before: did I see a light (absolute thresholds), or a difference between two lights (JNDs)? If an observer responds yes on a catch trial then this implies that internal noise has exceeded the observer's criterion for deciding whether or not a light is present (absolute threshold), or whether two lights have different luminances (JND). Thus, the use of catch trials effectively permits the amount of noise to be estimated.

Histograms and pdfs

Understanding SDT requires knowing some technical details about distributions. Readers familiar with standard deviations, and the relationship between histograms and probability density functions (pdfs), may want to skip this section.

We can see from 12.5a,b that values of the receptor output r around the mean (zero in our example) are most common. If we wanted to know precisely how common they are, we can use 12.5b. For example, each column or bin in 12.5b has a width of 3 mV, so that values of r between zero and 3 mV contribute to the height of the first column to the right of zero. This bin has a height of 119, indicating that there were a total of 119 values of r which fell between zero and 3 mV. As we measured a total of 1000 values of r, it follows that 119/1000 or 0.119 (i.e., 11.9%) of all recorded values of r fell between zero and 3 mV.
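The bin-counting argument above can be checked with a small simulation. The sketch below is an illustration only, not the authors' procedure: it assumes the receptor noise is gaussian with a mean of 0 mV and a standard deviation of 10 mV (the values quoted in the legend of 12.5), draws 1000 simulated values of r with Python's standard `random` module, and reports the proportion that fall between zero and 3 mV. The seed and the function names `simulate_noise` and `proportion_in_bin` are hypothetical choices made for this example.

```python
import random
from statistics import NormalDist

def simulate_noise(n_trials=1000, mean_mv=0.0, sd_mv=10.0, seed=1):
    """Draw n_trials receptor outputs (mV) with the light off (noise only)."""
    rng = random.Random(seed)
    return [rng.gauss(mean_mv, sd_mv) for _ in range(n_trials)]

def proportion_in_bin(values, lower, upper):
    """Proportion of values r with lower <= r < upper (one histogram bin)."""
    count = sum(1 for r in values if lower <= r < upper)
    return count / len(values)

r_values = simulate_noise()
observed = proportion_in_bin(r_values, 0.0, 3.0)

# For comparison: the area under an exactly gaussian pdf between 0 and 3 mV.
noise = NormalDist(mu=0.0, sigma=10.0)
expected = noise.cdf(3.0) - noise.cdf(0.0)

print(f"simulated proportion in [0, 3) mV: {observed:.3f}")
print(f"gaussian area between 0 and 3 mV:  {expected:.3f}")
```

The gaussian area is about 0.118, close to the 0.119 read from the histogram in the text. The simulated proportion varies with the seed, which is itself a small demonstration of trial-to-trial noise.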
12.7 Overview of a perceptual system faced with noise

12.8 The psychometric function and SDT
[Figure: panels plotting probability density against receptor output (mV), together with the resulting psychometric function.]
As the receptor output increases the signal distribution (green) moves to the right, increasing the probability of a yes response for a fixed criterion of c = 7 mV.

Thus, the probability that r is between zero and 3 mV is 0.119. Now suppose we wanted to know the probability that r is greater than, say, 10 mV, given that the light is off. At this stage, it will simplify matters if we define some simple notation. The probability that r is greater than 10 mV is written as p(r > 10 | off): the vertical bar stands for "given that." Thus, p(r > 10 | off) is read as "the probability that r > 10 given that the light is off." This is a conditional probability because the probability that r > 10 is conditional on the state of the light.

From 12.5a, it is clear that p(r > 10 | off) is related to the number of dots above the r = 10 mV dashed line. In fact, p(r > 10 | off) is given by the proportion of dots above the dashed line. Now, each of these dots contributes to one of the bins to the right of the vertical blue line in 12.5b. It follows that p(r > 10 | off) is given by the summed heights of these bins to the right of 10 mV, expressed as a proportion of the summed heights of all bins (which must be 1000 because we measured 1000 r values). The summed heights of these bins comes to 150, so it follows that there are 150 r values that exceed 10 mV, and therefore that p(r > 10 | off) = 150/1000 = 0.15. This procedure can be applied to any value or range of r values. For example, if we wanted to know the probability that r is greater than 0 mV then we would add up all the bin heights to the right of 0 mV. On average, we would find that this accounts for half of the measured r values, so that

$$p(r > 0 \mid \text{off}) = 0.5.$$

At this stage we can simply note that each bin has a finite width (3 mV), so that p(r > 10 | off) is actually the area of the bins for which r > 10, expressed as a proportion of the total area of all bins. This will become important very soon.

If we overlay the curve which corresponds to a gaussian curve then we see that this is a good fit to our histogram, as in 12.5c. In fact, the histogram in 12.5c has the same shape as that in 12.5b, but it has been set to have an area of 1.0 (technically, it is said to have been "normalized to have unit area," as explained in the figure legend). Each column in 12.5b has an area given by its height multiplied by its width (3 mV in this case). If we add up all the column heights (which sum to 1000) and multiply by the column width (3 mV) then we obtain 3000 = 1000 x 3, which is the total area of the histogram. If we divide the area of each column by 3000 then the total area of the new histogram is one (because 3000/3000 = 1). If you look closely at 12.5c then you will see that this has been done already. Instead of a maximum bin height of 125 in 12.5b, the maximum height in 12.5c is around 0.4; and, as we have noted, instead of an area of 3000 in 12.5b, the area of 12.5c is exactly one.

This transformation from an ordinary histogram to a histogram with unit area is useful because it allows us to compare the histogram to standard curves, such as the gaussian curve overlaid in 12.5c. As we reduce the bin width, and as we increase the number of measured values of r, this histogram becomes an increasingly good approximation to the gaussian curve in 12.5c. In the limit, as the number of samples of r tends to infinity, and as the bin size tends to zero, the histogram would be an exact replica of the gaussian curve, and in this limit the histogram is called a probability density function, or pdf.

Just as the histogram allowed us to work out the probability of r being within any given range of values, so does the pdf, but without having to count column heights. For example, the probability that r ≤ 10 mV is given by the area under the pdf curve to the left of the dashed blue line in 12.5c. If we start at the left hand end of the curve and work out areas to the left of increasing values of r, we end up with a curve shaped like the psychometric function. Because this curve gives the cumulative total of areas to the left of any given point it is known as a cumulative density function or cdf. This area can be obtained from a standard table of values relevant to the gaussian distribution found in most textbooks on statistics.

One aspect that we have not yet discussed is the amount of random variability in values of r. This is revealed by the width of the histogram of r values. A standard measure of variability is the standard deviation, which is denoted by the Greek letter σ (sigma). Given a set of n (where n = 1000 here) values of r, if u is the mean then the standard deviation is

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(r_i - u)^2},$$

where the symbol Σ stands for summation. In words, if we take the difference between each measured value of r and the mean u, and then square all these differences, and then add them all up, and then take their mean (by dividing by n), and finally take the square root of this mean, then we obtain the standard deviation.

The equation for a gaussian distribution is

$$p(r) = k \exp\left(-\frac{(r - u)^2}{2\sigma^2}\right),$$

where $k = 1/[\sigma\sqrt{2\pi}]$ ensures that the area under the gaussian curve sums to unity. But if we define a new variable z for which the mean is zero and the standard deviation is one

$$z = (r - u)/\sigma$$

then we can express the gaussian in its standard form as

$$p(z) = k_z \exp(-z^2/2),$$

where

$$k_z = 1/\sqrt{2\pi}.$$

The standard form of the gaussian distribution has a mean of zero, a standard deviation of σ = 1, and an area of unity; this is the form used in most statistical tables. Any data set of n values of r can be transformed into this normalized form by subtracting its mean u from all n values of r, and dividing each resulting value by the standard deviation σ of the n values. The resultant data have a mean of zero and a standard deviation of unity. This was done in order to transform our raw values of r in 12.5a,b to the normalized values shown in 12.5c. Thus, when a normalized gaussian curve is overlaid on 12.5c, the fit is pretty good. Conversely, we can go the other way and scale a normalized gaussian curve to get a rough fit to our raw data. This is achieved by adding the data mean u to the normalized gaussian mean of zero, and by multiplying the standard deviation of the normalized gaussian (which is unity) by the standard deviation of our data, as in 12.5b.

In order to give an impression of what happens to a gaussian curve as we vary the standard deviation, two gaussians with standard deviations of σ = 5 mV and σ = 10 mV are shown in 12.9. Note that the heights of these two curves are different because they both have unit area. Forcing a distribution or a histogram to have unit area is useful if we wish to interpret areas as probabilities, and it also allows us to treat both gaussian curves as pdfs with different standard deviations.

12.9 Gaussian curves with different standard deviations
[Figure: probability density p(r) against receptor output (mV).]
The narrow gaussian has a standard deviation of σ = 5 mV, and the wide gaussian has σ = 10 mV, as indicated by the horizontal dashed line attached to each curve. The curves have different heights because they have different standard deviations, but the same area (unity), which means we can treat each distribution as a probability density function (pdf). The abscissa defines values of r, and the ordinate indicates the probability density p(r) for each value of r.

Before moving on, a few facts about pdfs are worth noting.

First, the total area under a pdf is unity (one). This corresponds to the fact that if we add up the probabilities of all of the observed values of r then this must come to 1.0.

Second, as with the histogram example above, any area under the curve defines a probability. Because the area of a bin equals its width times its height, it follows that the height alone of the curve cannot be a probability. The height of the curve is called a probability density, and must be multiplied by a bin width to obtain a probability (corresponding to an area under the pdf).

Third, for a gaussian distribution with mean u and standard deviation σ, the area under the curve between u and u + σ occupies 34% of the total area. This implies that the probability that r is between u and u + σ is 0.34. As half of the area under the curve lies to the left of u, this implies that the probability that r is less than u + σ is 0.84 = (0.50 + 0.34). More generally, the probability that a given value r is less than or equal to some reference value x is the area under the curve to the left of x

$$p(r \le x) = \Phi((x - u)/\sigma),$$

where the function Φ (the Greek letter, phi) returns the area under the curve to the left of x for a gaussian with mean u and standard deviation σ. Note that the quantity z = (x − u)/σ expresses the difference (x − u) in units of σ, is known as a z-score, and was defined above.

The function Φ is called a cumulative density function, because it returns the cumulative total area under the gaussian pdf to the left of z. For example, p(r ≤ 10) = Φ(z = 1) = 0.84.

Signal and Noise Distributions

Returning to our light example, consider what happens if the light is on. This situation is shown by the upper (green) set of dots in 12.6a, and the corresponding histogram of r values in 12.6b. Let's assume that turning the light on increases the mean output to u = 30 mV. For simplicity, we will assume that the standard deviation remains constant at σ = 10 mV. We refer to this "light on" distribution of r values as the signal distribution.

In order to distinguish between the signal and noise distributions, we refer to their means as u_s and u_n, respectively, and to their standard deviations as σ_s and σ_n, respectively. However, as we assume that σ_s = σ_n, the standard deviation will usually be referred to without a subscript.

Now, we know that if the light is off then the mean output is u_n = 0 mV, but at any given moment the observed value of the output r fluctuates around 0 mV. Let's assume that we observe a value of r = 10 mV. Does this imply that the light is on or off? Before we answer this, consider the values that could be observed if the light is off in comparison to values that could be observed if the light is on, as shown in 12.6a. A value of r = 10 mV is one standard deviation above the mean value u_n = 0 mV associated with the light being off (because σ = 10 mV here), but it is two standard deviations below the mean value of u_s = 30 mV associated with the light being on. So, even though an observed value of r = 10 mV is unlikely if the light is off, such a value is even more unlikely if the light is on. Given that the observer is required to respond yes or no for each trial, these considerations mean that an ideal observer should respond no. But as we shall see, most observers are not ideal, or at least not ideal in the sense of minimizing the proportion of incorrect responses.
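The areas and z-scores described above can be computed directly rather than read from a table. The following sketch assumes the values used in this section (noise mean u_n = 0 mV, signal mean u_s = 30 mV, common σ = 10 mV) and relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `z_score` is ours, and reading "unlikely" as a comparison of probability densities at r = 10 mV is an interpretation made for this example; it also tacitly treats light-on and light-off trials as equally frequent.

```python
from statistics import NormalDist

SIGMA = 10.0                               # standard deviation (mV), both pdfs
noise = NormalDist(mu=0.0, sigma=SIGMA)    # light off
signal = NormalDist(mu=30.0, sigma=SIGMA)  # light on

def z_score(x, mean, sd):
    """Express the difference (x - mean) in units of the standard deviation."""
    return (x - mean) / sd

r = 10.0  # observed receptor output (mV)

z_noise = z_score(r, noise.mean, SIGMA)    # one sd above the noise mean
z_signal = z_score(r, signal.mean, SIGMA)  # two sds below the signal mean

# Phi(z): area under the standard gaussian pdf to the left of z.
p_below = NormalDist().cdf(z_noise)  # p(r <= 10 | off)
p_above = 1.0 - p_below              # p(r > 10 | off)

# Heights of the two pdfs at the observed value of r.
density_off = noise.pdf(r)
density_on = signal.pdf(r)

print(f"z relative to noise: {z_noise:+.1f}, relative to signal: {z_signal:+.1f}")
print(f"p(r <= 10 | off) = {p_below:.3f}, p(r > 10 | off) = {p_above:.3f}")
print(f"density at r = 10 mV: off = {density_off:.4f}, on = {density_on:.4f}")
print("respond:", "no" if density_off > density_on else "yes")
```

The exact gaussian value of p(r > 10 | off) is about 0.159, slightly different from the 0.15 obtained by counting bins in one sample of 1000 trials. The density at r = 10 mV is higher under the noise pdf than under the signal pdf, which matches the conclusion that the ideal observer responds no.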
The Criterion

As explained above, given an observed value of the receptor output r, deciding whether or not this output means the light is on amounts to choosing a criterion, which we denote as c. If r is greater than c (i.e., r > c) then the observer decides that the light is on, and responds with a yes. Conversely, if r is less than c (i.e., r < c) then the observer decides that the light is off, and responds with a no.

As a reminder, over a large number of trials, say 1000, we present a light or no light to an observer. On each trial, the observer has to indicate whether or not a light was seen. As the light is either on or off, and as the observer can respond either yes or no, there are four possible outcomes to each trial:

1) light on, observer responds yes, a hit, H
2) light on, observer responds no, a miss, M
3) light off, observer responds yes, a false alarm, FA
4) light off, observer responds no, a correct rejection, CR.

Response    Stimulus present    Stimulus not present (catch trial)
"Yes"       Hit                 False alarm
"No"        Miss                Correct rejection

If we measure the observer's responses over a large number of trials then we can obtain estimates of each of these quantities. Each quantity corresponds to a region of one of the histograms in 12.6b, which has been redrawn in terms of gaussian pdfs in 12.10. Here, the light-off or noise pdf lies to the left of the light-on or signal pdf. Following the line of reasoning outlined in the previous section:

H: the hit rate equals the large (red) area of the signal pdf to the right of the vertical blue criterion,
M: the miss rate equals the small (green) area of the signal pdf to the left of the criterion,
FA: the false alarm rate equals the (yellow) area of the noise pdf to the right of the criterion,
CR: the correct rejection rate equals the large (light blue) area of the noise pdf to the left of the criterion.

12.10 Estimating d'
The distance d' (d-prime) between the peaks of the noise (left) and signal (right) distributions can be estimated from a knowledge of two quantities: the hit rate H, and the false alarm rate FA. The dashed (blue) line is the criterion c, and the observer responds yes only if the receptor output r is greater than c (i.e., if r > c). The hit rate H is equal to the area of the (red) region of the signal pdf to the right of the criterion, and FA is equal to the area of the (yellow) region of the noise pdf to the right of the criterion. (Abscissa: receptor output, in mV; ordinate: p(r).)

If we choose c = 10 mV then an observed value of r = 20 mV would allow us to respond yes, because r > c. If we adopted a criterion of c = 10 mV, how often would we be correct in responding yes given that the light is on? This is given by the conditional probability p(yes|on), and is equal to the proportion of the signal pdf shaded red in 12.11.

Note that the hit rate H = p(yes|on) can be made as large as we like simply by decreasing the value of the criterion c, which moves the vertical dashed line leftward in 12.10. For example, if c is set to -20 mV then almost all observed values of r are above c, so we respond yes for almost any observed value of r, 12.11a. It's as if we adopt an extremely laissez-faire or risky approach, and treat almost anything as a sign that the light is on. This implies that if the light is on then we respond yes, so that our hit rate becomes close to 100%, for example, p(yes|on) = 0.910. This may seem like good news but it is accompanied by some bad news.

Setting c = -20 mV guarantees a high hit rate, but it also ensures that we almost always respond yes even when the light is off, resulting in a high false alarm rate FA = p(yes|off). For example, if r = -10 mV then it is not likely that the light is on. But we would respond yes if r = -10 mV and if our criterion (c = -20 mV) is set to yield a high hit rate. In other words, the probability p(yes|off) of responding yes given that the light is off is close to unity (given by the yellow area in 12.11a). Thus, setting the criterion c to a very low value increases the hit rate (the red region in 12.11a), but it also increases the false alarm rate (the yellow region).

If we now reverse this strategy and set c to be very high (say, c = 50 mV) then life does not get much better, 12.11b. In this case, it is as if we adopt an extremely cautious approach, and will not interpret even large values of r as indicating the light is on. Consequently, we rarely respond yes when the light is off, so the false alarm rate FA is almost zero (which is good). However, a high criterion means that we rarely respond yes even when the light is on, yielding a hit rate H close to zero (which is bad). This situation is shown graphically in 12.11a,b.

12.11 Effect of criterion
The criterion c is given by the position of the blue dashed line. (Abscissa: receptor output, in mV; ordinate: p(r).)
a A low criterion of c = -20 mV yields a large hit rate H (the red area of the signal pdf to the right of c), but also yields a large false alarm rate FA (the yellow area of the noise pdf to the right of c).
b A high criterion of c = 50 mV yields a low FA (the area of the noise pdf to the right of c, which is so small it is not visible here), but also a low H (the red area of the signal pdf to the right of c).

Crucially, there is no value for c which guarantees that our decisions are always correct. However, there is a value of c which guarantees that we are right as often as possible. If the light is on during half the trials then this value is exactly midway between the noise and signal distributions.

In summary, a low criterion (laissez-faire or very risky) yields a large hit rate but a large false alarm rate, whereas a high (very cautious) criterion yields a low hit rate but a low false alarm rate. Ideally, we would like to have a large hit rate and a low false alarm rate. Given that neither a very high nor a very low value for the criterion seems satisfactory, it follows that there must be a value somewhere between these extremes which yields a sensible compromise, which turns out to be the midpoint between the means of the signal and noise distributions. This is explored further in a later section. For now, we explore how a fixed criterion yields the curved S-shaped psychometric function in 12.2b and 12.15d.

12.12 Individual differences in sensitivity
The left hand (red) noise distribution of r values is the same for two observers with a low and b high sensitivities. When presented with the same luminance, the low sensitivity observer's right hand (green) signal distribution has a mean of $u_s = 30$ mV, whereas the corresponding mean for the observer in b with high sensitivity is $u_s = 50$ mV. The distance between the signal and noise distributions is measured in units of the standard deviation of the noise distribution, and is called d', which indicates the observer's ability to detect stimuli. (Abscissa: receptor output, in mV; ordinate: p(r).)

Sensitivity and d'

As the luminance is increased from zero, so the proportion of correct yes responses (hits) gradually increases, and asymptotes close to a value of unity. Crucially, the rate at which the proportion of yes
responses increases is a measure of the sensitivity of an observer, but because sensitivity has its own technical meaning (defined below) we will use the term responsiveness for now. In order to understand why the rate at which the proportion of yes responses increases is related to responsiveness, a change in perspective is required. Until now we have been considering how the distance between the noise and signal distributions increases as luminance increases for a single receptor within the eye of a single observer. We now consider how the distance between the noise and signal distributions varies between different observers given a fixed luminance value.

Let's assume we have two observers, $S_{hi}$ and $S_{lo}$, whose receptors have high and low degrees of responsiveness, respectively. If we present both observers with a light that has the same luminance then this means that $S_{hi}$ is more likely than $S_{lo}$ to detect the light (assuming that they use the same criterion level for responding yes).

When $S_{lo}$ is presented with the light, the mean of $S_{lo}$'s distribution of receptor values increases from a baseline value of $u_n = 0$ mV to a new value of $u_s = 30$ mV, 12.12a. In contrast, when $S_{hi}$ is presented with the light, the mean of $S_{hi}$'s distribution of receptor values increases from a baseline value of $u_n = 0$ mV to a new value of $u_s = 50$ mV, 12.12b. Clearly, a change of 50 mV should be more detectable than a change of 30 mV, if the distributions of both receptors have the same standard deviation (as is assumed here). The difference between the mean of the noise (light off) distribution and the mean of the signal (light on) distribution (expressed in units of the standard deviation of the noise distribution) has a special symbol, d', which, as noted above, is pronounced d-prime. The quantity d' is a measure of how responsive each observer is to a given change in luminance.

The definition of d' is as follows. Given a noise distribution with mean $u_n$ and signal distribution with mean $u_s$, where both distributions share a common standard deviation $\sigma$,

$$d' = (u_s - u_n)/\sigma.$$

Thus, d' is a measure of the distance between the means of the noise and signal distributions, expressed in units of standard deviations (of those distributions). For example, with a common standard deviation of 10 mV, the low sensitivity observer in 12.12a has

$$d' = (30\ \text{mV} - 0\ \text{mV})/10\ \text{mV} = 3,$$

whereas the high sensitivity observer in 12.12b has

$$d' = (50\ \text{mV} - 0\ \text{mV})/10\ \text{mV} = 5.$$

Notice that d' is measured in terms of standard deviations, so that one can think of a d' of, say, three, as meaning three standard deviations. However, because d' is a ratio of values (that are, in our example, expressed in units of mV), d' is technically said to be a dimensionless quantity.

If a small increase in luminance induces a large increase in d' (as for $S_{hi}$) then this change should also induce a large increase in the probability of a yes response. Therefore, responsive observers display a rapid increase in yes responses as luminance increases, and this is seen graphically in the steep (black) psychometric function of such an observer, 12.4a,b. Conversely, a less responsive observer would have a small increase in d' (as for $S_{lo}$) with a correspondingly small increase in the probability of a yes response, with the result that the psychometric function would be shallow, as shown by the dashed (red) psychometric function in 12.4a,b.

12.13 Receiver operating characteristic (ROC)
a ROC curves for different values of d'. Each curve is obtained by sweeping out criterion values c from high to low (b-d) and measuring the hit rate (H) and false alarm rate (FA) at each value of c. From left to right in a, the bowed ROC curves correspond to values of d' = 2, 1.5, 1.0, 0.5 and 0, where zero defines the diagonal line. If d' = 0 then the signal and noise pdfs are the same, and the diagonal line is obtained, which implies that the hit rate H and the false alarm rate FA are equal. The letters on the middle ROC curve in a correspond to the H and FA rates deriving from the different positions of the criterion c in the graphs labelled b-d. Notice that the criterion, but not d', varies between graphs b-d.

How to Increase d'

One potentially confusing fact should be made clear. There are two ways to increase d': find a more sensitive observer (as above), or keep the same observer and increase the change in luminance. The point is that d' is a useful measure of observer sensitivity only for a fixed luminance level. Within a single observer, as the change in luminance increases, so too does d', as shown in 12.8.

Measuring d'

This is all very fine, but how do we actually measure the value of d'?

The quantity d' is defined with respect to a given reference luminance level $I_1$, which we have implicitly assumed to be zero up to this point. If $I_1 = 0$ then the noise distribution really is just noise. More generally, we would like to measure d' for nonzero reference luminances. And if we set $I_1 > 0$ then the distribution associated with the reference luminance is still (confusingly) called the noise distribution. Given a comparison luminance $I_2$ (which is larger than $I_1$) associated with the signal distribution, we want to find the difference d' between the means of the noise and signal distributions. This means that each trial consists of two stimuli, one with a reference luminance and the other with a comparison luminance. Responses to these non-catch trials are associated with the signal distribution.

Crucially, on catch trials, both luminances are set to be the same as the reference luminance, and responses to catch trials are associated with the noise distribution. On every trial, the observer's task is to answer yes if the stimuli appear to have different luminances, and no if they do not. The observer does not know about the distinction between reference and comparison stimuli; this is purely for the experimenter's benefit. In practice, the comparison and reference luminances vary from trial to trial, and data for a specific luminance difference is extracted later so that the d' for that luminance difference can be estimated.

As we have already noted, an observer's responses are affected by his/her criterion, so we need to be able to disentangle the effects of the criterion and d'. Using fairly mild assumptions regarding the shape and standard deviations of the noise and signal distributions, it turns out that d' can readily be estimated from the quantities listed as outcomes 1-4 in the section on the criterion. In fact, we need only the hit rate H and the false alarm rate FA.

Specifically, the distance d' between the centers of the noise and signal distributions can be related to H and FA if we split d' into two parts:

1) the distance $(c - u_n)$ from the noise mean $u_n = 0$ mV to the criterion c = 20 mV, plus
2) the distance $(u_s - c)$ from the criterion c = 20 mV to the signal mean $u_s = 30$ mV.

Without going into details, these distances (in units of $\sigma$) can be computed from the FA and H rates, as

$$(c - u_n)/\sigma = \Phi^{-1}(1 - FA) = -\Phi^{-1}(FA),$$
$$(u_s - c)/\sigma = \Phi^{-1}(H),$$

where $\Phi^{-1}$ (the inverse of the cumulative density function $\Phi$) maps probability (area under a curve) to distance along the abscissa for gaussian pdfs, and can be found in standard statistical tables. Thus, together, the hit and false alarm rates suffice to provide d' from the sum of the (signed) distances

$$u_s - u_n = [(c - u_n) + (u_s - c)],$$

irrespective of the criterion adopted by a given observer.

The reason we only need the hit rate and false alarm rate is because, given that H + M = 1, it follows that if we know H then we also know M = 1 - H. Similarly, FA + CR = 1, so if we know FA then we also know CR = 1 - FA. So we could use either H paired with FA, or M paired with CR, in order to estimate d'. By convention, H and FA are used.

Measuring the Bias or Criterion

The H and FA rates can also be used to estimate the criterion c, which is also called the bias in this context. The bias is given by

$$c = -\tfrac{1}{2}\left[\Phi^{-1}(H) + \Phi^{-1}(FA)\right].$$

A bias of c = 0 indicates no response bias, and implies that the criterion is exactly half way between the means of the noise and signal distributions. If c is positive then it lies to the right of this halfway point, and therefore indicates a bias for responding no. If c is negative it lies to the left of the halfway point, and therefore indicates a bias for responding yes. Finally, note that the bias c and the sensitivity d' of a given observer are independent quantities.

Receiver Operating Characteristics

Almost all accounts of SDT in the literature also describe a graph which defines the receiver operating characteristics (ROC). The ROC graph acts as an intermediary representation between signal/noise pdfs and the psychometric function. It is not, in our opinion, necessary for understanding the main principles of SDT. However, it is included here for the sake of completeness.

Each ROC curve, 12.13a, shows what happens, for a given d', to hit rates and false alarm rates as the criterion level c is increased from a low (risky) to a high (cautious) level. The shape of this graph is bowed, with the sharpness of the bow dependent on the value of d', or equivalently, for a given luminance of the signal.
Thus, each ROC curve in 12.13a is obtained by sweeping through all possible values of c, and for each value of c, plotting pairwise values of H and FA. For any given value of d', we know that decreasing c has the effect of increasing H, but also of increasing FA, and the precise nature of the relationship between H and FA is given by an ROC curve. For example, the topmost ROC curve in 12.13a is obtained by starting with a large value of c (say, 30 mV), which gives a low H and a low FA. As c is decreased, H rises rapidly, but then FA also begins to rise. Finally, at values of c = -30 mV or below, both H and FA are close to unity.

Two-Alternative Forced Choice (2AFC) Methods

The trouble with absolute thresholds described at the outset of this chapter (Risky vs. Cautious responding) was recognized in the very early days of psychophysics in the 19th century. A modern solution, SDT, has already been described.

However, an earlier approach tackled the problem by removing the observer's freedom to choose when he/she could see a stimulus. Instead, the observer is presented with a pair of stimuli, and is asked to indicate which one is brighter (12.14; the pair can be presented simultaneously or one after the other, and other attributes than brightness can be used, but the principle remains the same). Now the observer has to decide, not whether the stimulus can be seen, nor whether the difference in luminance between two stimuli can be seen (as in classical psychophysics and SDT), but which of two stimuli is brighter.

12.14 Forced choice stimulus experiment
The observer fixates the central dot when a warning tone is sounded. Shortly after, $S_1$ and $S_2$ appear briefly on the screen and the observer's task is to decide which is brighter.

It might appear that saying whether or not the difference between two stimuli can be seen (a yes/no task), and saying which of two stimuli is brighter (a 2AFC task), are similar tasks. However, a no response in the yes/no task allows the observer to indicate that both stimuli appear equally bright. In contrast, an "equally bright" response is not possible in the 2AFC task, which requires that the observer chooses the stimulus that is perceived as brighter, even if they appear to be equally bright. One easy way to remember this is that classical psychophysical methods and SDT usually require yes/no responses, whereas 2AFC requires the observer to choose a stimulus.

Finding the JND with 2AFC

In order to fully understand how the difference threshold or JND is estimated using the 2AFC procedure, let's consider a specific example.

As with SDT, on each trial, the observer is presented with a pair of stimuli, $S_1$ and $S_2$. Stimulus $S_1$ has a reference luminance $I_1$, and stimulus $S_2$ has a comparison luminance $I_2$. As before, these labels are for the experimenter's benefit, and the observer is unaware of their existence; as far as the observer is concerned he/she just has to choose the brighter of two stimuli, 12.14. The observer is presented with many pairs of stimuli, using a range of different reference and comparison luminances, in random order. In order for us to estimate the JND for a single reference luminance $I_1$ we extract only those trials which contain $I_1$, and consider the observer's responses as the comparison luminance $I_2$ is varied.

We label the receptor outputs associated with the luminances $I_1$ and $I_2$ as $r_1$ and $r_2$, respectively. We also label the distribution of $r_1$ values associated with $I_1$ as the noise distribution $p_1$, with a mean of $u_1$, a standard deviation of $\sigma_1 = 10$ mV, and a variance of $v_1 = \sigma_1^2$. Similarly, we label the $r_2$ values associated with $I_2$ as the signal distribution $p_2$, with a mean of $u_2$, a standard deviation of $\sigma_2$,
296 Seeing and Psychophysics and a variance of stimulus that is associated with the largest response. _ 2 Notice that we use the phrase "associated with" V2  CJ2 • rather than "caused by" because the effects of noise These signal and noise distributions are the same as mean that we cannot state with certain ty that an those used in SD T, and we have replaced the mean observed response was caused by a stimulus rather symbols u" and U s with u l and u2 ' res pectively, be than noise. Also, bear in mind that the observer cause we now consider two luminances with labels does not know which response belongs to 11and 11and 12 , which belongs to 12 , the observer just compares the In a 2AFC experiment, the observer is ass umed recepto r responses he/she gets by looking at both to obey the following optimal rule: choose the stimuli , and chooses the stimulus that is associ a 0. 05 b 0.05 0.04 0.03 0: 0.02 0.01 120 &0 . 0 Output difference r d=r 2r 1 (mV) 50 c d 0.9 , 0.8 , _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ L __ _ c/)'" 0.8 , ,, , 0.7 ,, Q) CIl o ,,, 8 ,, 1\ 0.6 ,g 0.6 <..> 0: 0.5 0: 0.4 0.4 15 15 ro 0.3 ro .0 .0 e CL 0.2 o n: 0.2 0.1 0 ____ __ __ 40 20 0 20 40 Difference in means ud =u2  u1 (mV) 60 o 2 4 6 8 10 Luminance difference 111 12.15 Using 2AFC to estimate a difference threshold a Distributions of outputs for two different luminances I, and 12 chosen so that 12 I, = JND. The distribution means are = = = u, 60 mV and u2 70 mV, and both distributions have a standard deviation of 0 10 mV, so the means are separated by one standard deviation , ud =u2u, =10 mV. b Distribution of the difference = 'd '2" derived from the distributions in a. This difference distribution has a mean of (u 2U,) = 10 mV, and a standard deviation 14 mV = 0 ./2. This distribution shows that the most likely value of ' d is u d . The probability that ' 2> " is the same as the probability that ('2") > 0, which is given by the area under the curve to the right of ' d = O. 
If an observer chooses 12 when ' d> O then this area corresponds to the probability of choosing S 2' p(choose Szl. c This area is plotted as a function of the difference ud in means, which increases as the difference !11 = 12/, in luminances increases. d Using this decision rule , the probability p(choose S2} increases as the difference !11 increases. Note that the values on the abcissa of c and d are different, and that the plotted curves are essentially scaled versions of each other. This implies that ud =kl1 /, where k is a constant of proportionality. 297 Chapter 12 ated with the largest response. This implies that this implies that if the recepror response r2ro 12is greater than the Vd = 2v = 2cr2 . response r l ro I I ' (i .e., if r2> r) then the observer chooses 12, otherwise he/she chooses I I. By definition, This rule can be rewritten as: choose cr" = Yv,1' stim ulus 52 (which has lumin ance 12) if r 2 > r l , or equivalently, if (r2  rl ) > 0 m V. Of course, the which implies that observer's decision is correct only if 12 > 11' so this cr d = Y(2v) rule is not guaranteed to succeed on every trial, but it is guaranteed to maximize the proportion = crY2. of correct responses. Thus, using this rule, the Rearranging this yields probability of choosing 52is the probability that (r2  r) > 0 m V. Let's evaluate this probability, cr = cr jY2. 12.15. Notice that the distribution p(r) describes the We begin by treating the difference (r2  r l ) as probability density of a difference rtf = (r2  r) a new random variable, which we define as in receptor output values, and that the standard rd = (r2 r), deviation cr" of this distribution is larger than the standard deviations cr of the pdfs of r2 and r l by a where rd has a mean of factor of Y2. ud =(u 2 u), As noted above, the probability of choosing 52 is p(rd> Olu) (i.e., the probability that rd is greater and a variance vIf (Variance is defined as the than zero). 
This can be evaluated using the integral standard deviation squared, so v = cr 2 and cr = Yv.) of the distribution p(r"lu), which yields Both rl and r2 are random variables with gaussian distributions, and the distribution of the difference rd is therefore also gaussian where <I> is the cumulative density function of a p(rJu2  u) kdexp[((u2  r2)(u l  r))2/(2v)1, = standard gaussian distribution (i.e., with zero mean where the constant kd = (1IY(2rr.v) ensures that the and unit sta ndard deviation). This function returns distribution has unit area, and its variance can be the area under the distribution to the left of ud shown ro be Vd = (VI + v2). The conditional proba with mean zero and standard deviation crY2; an bility term p(rdlu2  UI) makes explicit the depend area numerically equal to the area under the gaus ence of rtf on the difference in means (u 2 u), and sian distribution to the right of zero with mean ud is interpreted as : the probability density of observ and also with a standard deviation of crY2. ing the value rd given the difference in means We now have an equation which tells us how uti = (u 2  u). the proportion p(rd > 0) of 52responses increases as The equation above can be written more suc the difference between distribution means in cinctlyas creases, which is driven by the difference between the comparison and reference luminances. More p(rdlu) = kdexp[(u"  ry/ (2v)1. importantly, we can use this equation to find the If the variances V I and v2 are equal then their stand JND, the change in lumin ance required to sh ift ard deviations cr I and cr 2 are also equal. We can the mean receptor output value by one standard label these using two symbols without subscripts as deviation cr, a change which also increases the proportion of 52 responses from 50% to 76%, as described next. 
and If we increase 12from below II to above 12 then cr = cr l = cr 2 , we obtain the typical sigmoid function value de scribing the probability that the observer chooses where cr = Yv. Given that 52. Once 12 becomes greater than II then choos ing 52 is the correct response, which we assume 298 Seeing and Psychophysics is described by the probability p(rd > 0) defined ability is measured for many values of the com above. If rei = 0 then the means of the receptor parison luminance 12 then the results can be used outputs are u l = u2 and so p(r" > 0) = 0.5. If we to plot a graph of p(choose 5) versus M. With could increase the luminance 12until the machinery described above, the experimenter can use this function to estimate the JND as the U 2 = ul + 0 standard deviation of the distribution of responses then this would correspond to increasing u 2 by to the reference stimulus. This is done by "reading one standard deviation. One standard devia off' the luminance change 1'1.1 required to increase tion 0 in the noise distribution corresponds to p(choose 5) from 50% to 76%. a change 0 " = 0,j2 in the distribution of differ ences rd" The corresponding change in p(rd > 0) Why Variances Add Up is given by the area under the distribution of rei We begin by showing that, if the distributions of rl values between r tf = 0 and r tf = 0,j2, which evalu and r2 values have variances VI and v 2 (respectively) , ates to <I>(l/,j2) = 0.212. In other words, if the then difference 1'1.1 =12 11 where v" is the variance of the distribution of dif is changed so that p(r" > 0) increases from 50% ferences rtr We can assume that the mean of each to 76% then this corresponds to a change in u2 distribution is zero, without affecting our result. 
from For brevity, we define h = lin, to so the variances of V I' v2 and vd are defined as VI = hI; (r/i)  rm)2 Given that the JND is the change in luminance required to increase u2 by one standard deviation , V2 = hI; (r/ i)  rml this implies that the JND is that luminance change sufficient to change p (choose 52) from 50% to Vd = hI; [ (r2(i)  rm2)  (r l(i)  r l1 ) F. 76% . (Note: Be careful not to confuse this case Expanding the righthand side of the final equa with the fact, described earli er, that the area be tion yields tween the mean and one standard deviation away v,, = hI; [(r2 (i)  rm2 )2 from the mean of a gaussian is 34%; that is, 84% of a gaussian lies to the left of one standard devia + (rl(i)  r,,)2 tion above its mean.)  2(r/i)  rml) (r2 (i)  r,J]. It is worth summarizing which quantities are invi sible to the experimenter, which ones can be If we insert some new summatio n signs then we measured, and which ones can be estimated from can rewrite this as those measurements. V" = hI; (r2 (i)  rmY Basically, the responses of the receptor (r l and r) , the distributions of these responses (P I and P2)' + hI; (rl(i)  rm/ their means (u l and u 2), variances (VI and v2) , and sta ndard deviations (0 1 and ( 2), are all invisible  2hI; (r l(i)  r,,) (r2(i)  r,,) . to the experimenter. The quantities known to the We can recognize the first two terms on the experimenter are the observer responses to the righthand side as VI and v2 • The third term is, on luminances (II and 1) of the reference and com average, equal to zero, because both rl and r2 are parison stimuli. By varying these over a range of independent random variables, which implies that values, the experimenter can measure the prob the sum of their products is zero, so th at abili ty p(rd > 0) that the observer chooses 52 at each luminance difference 1'1.1 = (12  1) . 
If this prob…

Table Summarizing the Differences Between Classical Psychophysics, SDT and 2AFC

Classical Psychophysics: Absolute Threshold
Trial contains: A single stimulus.
Question: Did you see a light?
Response: yes or no.
Yields: Absolute threshold, defined as the luminance that yields a yes response on 50% of trials.
Comment: Biased by observer's willingness to respond yes, which affects position of psychometric function.

Classical Psychophysics: JND
Trial contains: A pair of stimuli, one with a reference luminance and one with a comparison luminance.
Question: Did you see any difference between the stimuli?
Response: yes or no.
Yields: JND, defined as the change in comparison luminance required to increase proportion of yes responses from 50% to some fixed percentage (usually 75% or 76%).
Comment: As this does not depend on the position of the psychometric function, it is unaffected by observer bias.

Signal Detection Theory (SDT)
Trial contains: A single stimulus, which is either a non-zero luminance stimulus, or, for catch trials, a zero intensity stimulus. Or: Two stimuli, a reference luminance, and a comparison luminance, which are the same on catch trials.
Question: Did you see a light? Or: Which stimulus is brighter?
Response: yes or no.
Yields: $d'$, the distance between the observer's noise and signal distributions for a given reference-comparison luminance difference.
Comment: $d'$ is measured with respect to a reference luminance, which can be zero. For non-zero reference luminances, $d'$ is obtained by using a pair of stimuli, where the luminance of these stimuli is the same on catch trials. Observer responds yes (difference seen) or no (difference not seen) on each trial. A luminance difference that yields a value of $d' = 1$ corresponds to the JND for the reference luminance under consideration.

Two-Alternative Forced Choice (2AFC)
Trial contains: Two stimuli.
Question: Which stimulus is brighter?
Response: Choose brighter stimulus.
Yields: Absolute threshold, JND.
Comment: The two stimuli are labeled as reference and comparison stimuli. If reference luminance is zero then the comparison luminance which yields a correct response on 76% of trials is the absolute threshold (and the JND for a reference luminance of zero). If reference stimulus is non-zero then the change in comparison luminance which increases the proportion of correct responses from 50% to 76% is the JND for that reference luminance.

Given that the standard deviation of the distribution of differences is $\sigma_d = \sqrt{v_d}$, this implies

$$\sigma_d = \sqrt{v_1 + v_2}.$$

Now, if $v_1 = v_2$ then we can use the symbol $v$ to stand for both $v_1$ and $v_2$, so that

$$\sigma_d = \sqrt{v + v} = \sqrt{2v} = \sigma\sqrt{2}.$$

Thus, if the distributions of $r_1$ and $r_2$ both have a standard deviation $\sigma$ then the distribution of differences $r_d = (r_2 - r_1)$ has a standard deviation $\sigma_d = \sigma\sqrt{2}$.

Using 2AFC for Absolute Threshold

If we set the luminance of the reference stimulus $S_1$ to be zero (i.e., $I_1 = 0$) then it is not physically possible for $I_2$ to be less than $I_1$ (i.e., $I_2$ cannot in this case be negative), and the observer cannot be correct on more than 50% of trials unless $I_2$ exceeds $I_1$. Therefore, a graph of correct responses versus the luminance difference $I_2 - I_1$ has a horizontal line at 50% until $I_2 > I_1$, after which the curve begins to rise above 50%. The absolute threshold $I_{abs}$ is given by the value of $I_2$ which causes the observer to choose $S_2$ (i.e., to be correct) on 76% of trials. Note that this absolute threshold is the same as the JND for a reference luminance of $I_1 = 0$. That is, a change to the luminance $I_1$ that is "just noticeable" is $I_{abs}$. The 2AFC procedure has been found to provide consistently lower absolute thresholds than the methods used in classical psychophysics.

Before moving on, one further subtlety should be noted. The x-axes of the distributions of $r_1$ and $r_2$ are in units of receptor membrane voltage, whereas the x-axis of the psychometric function is in units of luminance. This change is justified if we assume that mean receptor output is a monotonic function of luminance: an increase in luminance always induces an increase in mean receptor output.

Relation between 2AFC and SDT

The methods of 2AFC and SDT are related inasmuch as 2AFC can be considered as an SDT experiment in which the observer places the criterion midway between the means $u_1$ and $u_2$ of the noise and signal distributions, respectively. In both cases, the probability of a correct response is given by

$$p(u_2 > u_1) = \Phi(z),$$

where

$$z = (u_2 - u_1)/\sigma_d.$$

In the case of 2AFC the JND is given by the increase in luminance required to raise p(correct choice of brighter stimulus) from 50% to 76%. In the case of SDT, the JND is given by the increase in luminance required to raise p(correct yes) from 50% to 76%, which corresponds to $d' = 1$.

Psychophysical Methods for Measuring Thresholds

There are basically five methods for measuring thresholds: the method of constant stimuli, the method of limits, the method of adjustment, the staircase method, and two-alternative forced choice (2AFC). These will be described only briefly here. Only the 2AFC method is untroubled by the problem of observer bias.

Method of constant stimuli: This consists of using a different magnitude on each trial, chosen randomly from a previously arranged set. The threshold is taken as that stimulus value which elicits a yes response on 50% of trials. For difference thresholds, the question put to the observer on each trial requires a yes/no response, such as, "Is the stimulus on the left greater than the one on the right?". Problem: much experimental time can be wasted showing stimuli far from the observer's threshold range.

Method of limits: The magnitude of the stimulus is increased until it is observed, and this value is recorded. Then the stimulus magnitude is decreased until it cannot be detected, and this value is recorded. The threshold is taken as the mean of the two recorded values. For difference thresholds, it is the difference between two stimuli that is increased/decreased until the difference becomes detectable/undetectable. A procedure of this general type is used when you are asked by an optician to look at the standard letter chart for assessing vision. This has letters of varying sizes from very large at the top to very small at the bottom. The task is to identify the letters beginning with the large (well above threshold) letters. When you start making errors as the letters get smaller then the optician has a measure of your threshold for ability to see small stimuli clearly (see Ch 4, 4.14). Note that the optician moves only from large to small differences, and not also vice versa, and so this is not a proper method of limits procedure. Problem: much experimental time can be wasted showing stimuli far from the observer's threshold range.

Staircase method: The problem of wasted time in the method of limits can be fixed using a staircase in which the magnitude is changed in large steps until the observer changes decision (e.g., from "I saw it" to "I didn't see it"). Then smaller steps are employed for moving the magnitude to and fro about the change of decision point, with step sizes guided by the observer's responses. This keeps the stimuli in the threshold range. Problem: The simple staircase just described can mean that the observer never sees a clear-cut stimulus. This problem can be fixed by occasionally presenting stimuli well away from threshold.

Method of adjustment: The observer is asked to adjust the stimulus magnitude, starting from a very low value, until the stimulus can be detected, and this value is recorded.
This is repeated, starting from a very high value, and the value at which the stimulus cannot be detected is recorded. The threshold is taken as the mean of the two recorded values. For difference thresholds, the observer is asked to adjust the magnitude of a variable stimulus until it appears different from a reference stimulus magnitude. Problem: The advantage of this procedure is that it is quick but it is regarded by some as particularly vulnerable to observer bias.

Two-alternative forced choice (2AFC): This method was described above. We emphasize that it is the only one of the five methods which is uncontaminated by the observer's predisposition (bias) for responding yes.

Weber's Law and Weber Fractions

We now provide a brief history of some milestones in psychophysics, beginning with the work of Ernst Heinrich Weber (1795-1878).

If we gave you a box containing 100g of sugar, how much sugar would we have to add in order for you to notice the difference in weight? It turns out that the answer is about 2g. However, if the box contained 200g of sugar then we would have to add 4g for you to notice a difference. And if the box contained 300g of sugar then we would have to add 6g for you to notice a difference.

By now you should have spotted the pattern, a pattern first noted by Weber, who found that the JND in weight increased in proportion to the initial weight, 12.16. This proportionality implies that the fraction formed by (JND in weight)/(initial weight) is a constant. The constant fraction is equal to about 0.02 in this case because all of the fractions just listed come to

$$2/100 = 4/200 = 6/300 = 0.02.$$

In honour of his observations, such fractions are called Weber fractions, and they are different for different types of stimuli. (See Table on Weber fractions.)

Table: Weber fractions
Quantity             Weber fraction
Electric shock       0.013
Heaviness            0.020
Line length          0.029
Vibration (60 Hz)    0.036
Loudness             0.048
Brightness           0.079
Taste (salt)         0.083

12.16 Weber's law
a The change in stimulus magnitude $\Delta I$ required to yield a just noticeable difference (JND) in perceived magnitude $\Delta S$ is proportional to stimulus magnitude $I$.
b A graph of JND versus stimulus magnitude $I$ defines a horizontal line with height equal to the Weber fraction for the quantity under consideration (luminance here).
c The law of 'diminishing returns' implied by Weber's law. If stimulus magnitude is increased by a fixed amount then the perceived change in magnitude decreases. For example, lighting a candle in a dark room has a dramatic effect, but doing so in daylight has little effect, so there is less 'bang per buck' as the background luminance increases.

As stated above, the common definition of a just noticeable difference (JND) is that change $\Delta I$ in a stimulus with intensity $I$ which is perceived on 75% of trials. In other words, an increment of one JND in stimulus intensity means that the perceived magnitude of a stimulus increases by a fixed amount. So we can think of the ratio $\Delta I/I$ as an increment in perceived stimulus magnitude, which is defined as

$$\Delta S = \Delta I/I.$$

Dividing both sides by $\Delta I$ yields

$$\Delta S/\Delta I = 1/I.$$

Consider what this equation implies: as $I$ increases, the ratio $\Delta S/\Delta I$ decreases. So, the perceptual impact $\Delta S$ of a fixed change in magnitude $\Delta I$ decreases as $I$ increases. In other words, the amount of "bang per buck" decreases rapidly as $I$ increases. This makes perfect sense if you consider the large perceptual impact of lighting a candle in a dark room, which has a small background luminance $I$. The candle adds a small amount $\Delta I$ of luminance to a small background luminance $I$, so the ratio $\Delta S = \Delta I/I$ is large. In contrast, lighting a candle in daylight adds the same amount of light $\Delta I$, but this is added to a large background luminance $I$. In this case, the perceptual impact is small, and is commensurate with the small value of the ratio $\Delta S = \Delta I/I$.

Fechner's Law

Putting all this mathematically, if we solve for $S$ in terms of $I$ (by integrating both sides of $\Delta S/\Delta I = 1/I$ with respect to $I$) then we obtain a law named after Gustav Fechner (1801-1887). This law states that the perceived magnitude $S$ of a given stimulus is proportional to the log of its physical magnitude $I$,

$$S = k \log I/I_0,$$

where $k$ is a constant of proportionality, and $I_0$ is the smallest value of $I$ that can be detected (i.e., the absolute threshold). Note that the value of $k$ is unique to the particular quantity measured (e.g., brightness and loudness have different values of $k$).

12.17 Fechner's Law
a Fechner's law, $S = k \log I/I_0$, states that the perceived magnitude $S$ of a stimulus with physical magnitude $I$ is proportional to the log of $I/I_0$, where $I_0$ is the absolute threshold (set to $I_0 = 1$ here), and $k$ is a constant of proportionality. If $I = 10^S$ then small increments in $S$ induce large increments in $I$.
b A graph of log luminance versus perceived luminance gives a straight line with slope equal to $k$. The graph of $\log I$ versus $S$ yields a straight line because $\log I = S$ (obtained by taking logs of both sides of $I = 10^S$).

[See the next section if you need a reminder about logarithms.]

Fechner's law implies that each time the physical
magnitude (weight, for example) is doubled, this adds a constant amount $\Delta S$ to the perceived magnitude $S$. As an example, next, we show that $\Delta S = k \log 2$.

If one physical magnitude $I_1$ is twice as large as another magnitude $I_2$ (i.e., $I_1 = 2I_2$) then the difference in perceived magnitude is

$$\Delta S = S_1 - S_2 = k \log I_1/I_0 - k \log I_2/I_0 = k \log I_1/I_2 = k \log 2.$$

Logarithms

If you need a reminder of logarithms, here it is. For simplicity, let's assume that $S = \log I$, so we have implicitly set values in Fechner's law to $k = 1$ and $I_0 = 1$, 12.17a. Here we make use of logarithms to the base 10. In essence, if $S = \log I$ then $S$ is the power to which 10 must be raised to get $I$, or equivalently, $I = 10^S$. This can be confirmed if we take the log of both sides of this equation, which yields $\log I = S$ (because the log of $10^S$ is $S$).

What happens to $S$ if we double the magnitude $I$? If the magnitude of the stimulus has an initial value of $I_1 = 4$ so that $I_2 = 2I_1 = 8$, then we have

$$\log I_2 = \log 2I_1 = \log 2 + \log I_1.$$

Subtracting $\log I_1$ from both sides yields

$$\log I_2 - \log I_1 = \log 2 = 0.3,$$

where, by definition, $S_1 = \log I_1$ and $S_2 = \log I_2$ so that $S_2 - S_1 = \log 2$. Thus, multiplying $I$ by a factor of 2 adds an amount $\log 2$ (= 0.3) to $S$. If we substitute $I_1 = 4$ and $I_2 = 8$ then $S_1 = \log 4$, $S_2 = \log 8$, so that

$$S_2 - S_1 = \log 8 - \log 4 = \log 8/4 = \log 2.$$

More importantly, doubling the value of $I$ adds an amount 0.3 ($0.3 = \log 2$) to $S_1$, irrespective of the initial value of $S_1$.

Taking a different perspective, what do we have to do to the magnitude $I$ in order to add one to the perceived magnitude $S$? If we add one to $S_1$ so that $S_2 = S_1 + 1$, then

$$\log I_2 = 1 + \log I_1.$$

Subtracting $\log I_1$ from both sides yields

$$\log I_2 - \log I_1 = 1.$$

This can be rearranged to yield

$$\log I_2/I_1 = 1.$$

If we now raise the logarithm's base (10, here) to the power of each side then we have

$$I_2/I_1 = 10^1 = 10.$$

Therefore, adding one to $S$ can be achieved by making the ratio $I_2/I_1 = 10$; that is, by making $I_2$ ten times larger than $I_1$. This can be expressed as "adding one to $S$ adds one log unit to $I$".

Stevens's Power Law

Stanley Smith Stevens (1906-1973) showed that Fechner's law could not account for many types of stimuli, and reformulated the relation between sensation magnitude $S$ and physical intensity $I$ in terms of the power law

$$S = mI^n.$$

This states that the perceived sensation $S$ is proportional to the physical intensity $I$ raised to the power $n$, where the value of the exponent $n$ depends on the specific stimulus being measured, and $m$ is a constant which depends on the particular units of measurement used (e.g., feet versus metres).

This law has since become known as Stevens's law. Some quantities, such as brightness, are reasonably consistent with both Fechner's and Stevens's laws, as shown in 12.18. This is because the exponent of $n = 0.5$ for brightness in Stevens's law implies that the perceived magnitude increases with the square root of physical magnitude, and describes a function which is reasonably well approximated by Fechner's logarithmic function. However, other quantities cannot be accommodated by Fechner's law, but are consistent with Stevens's law. For example, taste ($n = 1.3$) and electric shock ($n = 3.5$) cannot be described by Fechner's law, 12.18.

12.18 Stevens's law
Plots for length (black solid curve, $n = 1$), brightness (green dotted curve, $n = 0.5$), and electric shock (blue dash-dot curve, $n = 1.7$). For comparison, Fechner's law for brightness is shown as the red dashed curves.
a The relation $S = mI^n$.
b Graph of log stimulus magnitude versus log perceived magnitude. In each case, the slope of the line is an estimate of the exponent $n$, and the ordinate intercept corresponds to the constant $m$.

Linking Weber's, Fechner's, and Stevens's Laws

A simple interpretation of Stevens's law is that the relative change in perceived magnitude $\Delta S/S$ is proportional to the relative change in magnitude $\Delta I/I$, so that

$$\Delta S/S = n\,\Delta I/I,$$

where $n$ is a constant which depends on the physical quantity being measured. Note that this looks quite similar to Weber's law ($\Delta S/S$ = constant), but its implications are quite different. This is because Weber's law states that the relative change in perceived magnitude is constant for a given physical quantity.

Recall that we obtained Fechner's law $S = k \log I$ from Weber's law by integration. A similar operation on the relation

$$\Delta S/S = n\,\Delta I/I,$$

yields

$$\log S = \log I^n + \log m,$$

(where $\log m$ is a constant of integration). By taking antilogs, we obtain Stevens's law $S = mI^n$.

The value of the exponent in Stevens's law $S = mI^n$ can be estimated by plotting $\log S$ against $\log I$, which yields

$$\log S = n \log I + \log m.$$

This has the same form as the equation of a straight line, traditionally written as

$$y = Mx + C,$$

where $M$ is the slope of the line, and $C$ is the value of the intercept on the ordinate ($y$) axis. Thus, Stevens's law predicts that a graph of $\log S$ versus $\log I$ has a slope $M = n$, and an intercept on the ordinate axis of $C = \log m$.

Conclusions

This chapter has introduced some core ideas about seeing. In particular, it has explained the fundamental problem of noise in sensory systems. Noise needs to be taken into account both when building models of the visual system and when developing methods for measuring fine discriminations.

Further Reading

Bialek W (2002) Thinking about the brain. In Physics of biomolecules and cells: Les Houches session LXXV, H Flyvbjerg, F Jülicher, P Ormos and F David (Eds) (EDP Sciences, Les Ulis; Springer-Verlag, Berlin). Comment A general paper which is quite technical, by an insightful scientist, who discusses thresholds as one of the problems the brain has to solve. Online at http://www.princeton.edu/~wbialek/publications_wbialek.html.

Green DM and Swets JA (1966) Signal detection theory and psychophysics. New York: Wiley. Comment The classic book on SDT, fairly technical, but also very readable.

Hecht S, Shlaer S and Pirenne MH (1942) Energy, quanta and vision. Journal of the Optical Society of America 38 196-208.

Heeger D (2007) Signal Detection Theory http://www.cns.nyu.edu/~david/handouts/sdt/sdt.html Comment A very clear account of SDT, with notes that can be downloaded.

Regan D (2000) Human perception of objects. Sinauer Associates, MA. Comment A splendid wide-ranging book on vision with a discursive section on psychophysics on pp. 8-30. Also see the many appendices for good tutorials and insights into modeling perception.