Segmentation and the Principles of Object Perception

How do humans segment objects?

first requirement: segmentation

Recall that the way objects are ordinarily arranged in space, so that one occludes parts of another, prevents us from segmenting them in any simple way.

Infants from 4.5 months of age can use featural information to segment objects.

using featural information

Needham (1998)

In Amy Needham's 1998 study, 4.5-month-old infants were shown a display like this. Featural information (the difference in the textures of the objects) suggests that these are two separate objects. But can infants use this information to detect that there are two objects?

Needham (1998, figure 4)

Needham's results are evidence that infants from 4.5 months of age can use featural information to segment objects.

method

violation-of-expectations

[I need to explain the method used in violation-of-expectations, and to compare it with the method of habituation.] A violation-of-expectations experiment involves a pair of events. Infants are divided into two groups; one group sees one event, the other sees the other event. (This is the between-subject version; it might also be done within subjects.) The experimenter measures how long the infants look at each event. Of interest is whether infants reliably look longer at one of the two events. If they do, this is interpreted as evidence that this event---the one infants reliably look longer at---is in some way interesting to them. And, if the events are well chosen, their interest indicates that the event violates an expectation they have. In the experiment we are considering, the expectation violated is the expectation that the two objects should move separately.
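The between-subject comparison just described can be sketched in a few lines of Python. This is only an illustration of the logic, not Needham's analysis: the function name, the event labels and the looking times are all invented for the example, and a real study would use an inferential test rather than a bare difference in means.

```python
from statistics import mean


def looking_time_difference(group_a, group_b):
    """Return mean looking time of group A minus that of group B (seconds)."""
    return mean(group_a) - mean(group_b)


# Hypothetical looking times in seconds, invented purely for illustration.
# One group sees the objects move together, the other sees them move apart.
move_together = [34.0, 41.5, 29.0, 38.5]
move_apart = [22.0, 27.5, 19.0, 25.5]

diff = looking_time_difference(move_together, move_apart)
# A reliably positive difference would be read as greater interest in the
# move-together event, i.e. as a sign that it violates an expectation.
print(f"mean difference: {diff:.2f} s")
```

In the within-subject version each infant would contribute a looking time for both events, and the comparison would be made on paired values instead.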

To return to Needham's experiment: interestingly, 4.5-month-old infants were able to succeed even when the point of contact between the two objects was occluded, as in this diagram.

Needham (1998, figure 6)

These are the results for 4.5-month-old infants.

One further thing: infants can also use shape information in segmenting objects, and shape information appears to trump featural information \citep{needham:1999_role}.

Needham (1998, figure 7)

Can we fully explain how infants segment objects just by appeal to features? To see why it couldn't be just features that we use to segment objects, consider some more cases ...

Could it all be features?

`infants perceive the boundaries of a partly hidden object by analyzing the movements of its surfaces: infants perceived a connected object when its ends moved in a common translation behind the occluder. Infants do not appear to perceive a connected object by analyzing the colors and forms of surfaces: they did not perceive a connected object when its visible parts were stationary, its color was homogeneous, its edges were aligned, and its shape was simple and regular' \citep{kellman:1983_perception}.

Here is an occluded object---a stick behind a box.

The movement is enough to convince 4-month-old infants that there is just one stick even though they never see its middle \citep{kellman:1983_perception}. We can discover this by measuring how different displays cause them to dishabituate.

Spelke (1990, figure 2)

After being habituated to this display, 4-month-old infants were shown one of two test displays.

Kellman & Spelke (1983, figure 3)

And here are the results (subjects were 4-month-old infants).

Kellman & Spelke (1983, figure 4)

The fact that infants can correctly segment partially occluded objects based on their movements already indicates that they can't be thinking about features only.

For more evidence, consider this display. The two parts of the moving object are featurally different. Despite this, infants expect to see a single connected object behind the block (\citealp{kellman:1983_perception}, Experiment 6; \citealp{Spelke:1990jn}).

Kellman & Spelke (1983, figure 13)

Here are the test stimuli (each group is shown one or the other).

Kellman & Spelke (1983, figure 13)

And here are the results.

Subjects in this experiment were 4-month-old infants.

So we saw that infants can use featural information to segment objects, but the principle of cohesion can trump featural indicators of difference.

So infants' abilities to segment objects are not based entirely on recognising features.

Kellman & Spelke (1983, figure 14)

If infants do not rely only on features to do this, then how do infants segment the objects in the displays we've just been seeing?

If not by features, then how?

Recall this display with one object moving behind a stationary block. What kind of principle could be used to identify that the occluded thing is a single object?

Kellman & Spelke (1983, figure 13)

rigidity—‘objects are interpreted as moving rigidly if such an interpretation exists’

\citet{Spelke:1990jn} suggests the principle of rigidity. This principle says that ‘objects are interpreted as moving rigidly if such an interpretation exists’. The hypothesis that this principle describes in part how infants segment objects correctly predicts that they will treat the moving occluded stick as a single object.
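To make the idea of a rigid interpretation concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the visible parts of the display can be represented as tracked 2D points in two frames, and it uses the standard geometric fact that a motion is rigid just in case all pairwise distances between points are preserved. The function name and the coordinates are hypothetical; nothing here is a claim about the mechanism infants actually use.

```python
from itertools import combinations
from math import dist, isclose


def moves_rigidly(before, after, tol=1e-9):
    """True if every pairwise distance between tracked points is preserved."""
    if len(before) != len(after):
        raise ValueError("need the same tracked points in both frames")
    return all(
        isclose(dist(before[i], before[j]), dist(after[i], after[j]), abs_tol=tol)
        for i, j in combinations(range(len(before)), 2)
    )


# Hypothetical points on the two visible ends of a partly occluded stick.
stick_before = [(0, 3), (0, 4), (0, 0), (0, 1)]

# Both ends translate together: a rigid interpretation exists.
common_translation = [(2, 3), (2, 4), (2, 0), (2, 1)]

# The ends move in opposite directions: no rigid interpretation exists.
opposite_motion = [(2, 3), (2, 4), (-2, 0), (-2, 1)]

print(moves_rigidly(stick_before, common_translation))
print(moves_rigidly(stick_before, opposite_motion))
```

On this toy representation, the common translation of the two visible ends admits a rigid interpretation, which is what the principle needs in order to predict that the ends are treated as parts of one object.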

But rigidity is not the only principle we need to explain how infants segment objects ...

What justifies us in supposing that a rigidly moving object needs to be joined up?

cohesion:

‘two surface points lie on the same object only if the points are linked by a path of connected surface points’

(Spelke 1990)

Another principle which seems to be involved in segmenting objects is the principle of cohesion. According to this principle, ‘two surface points lie on the same object only if the points are linked by a path of connected surface points’ \citep{Spelke:1990jn}.
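One way to picture what this principle requires is as a connectivity check on a graph of surface points. The sketch below is an illustration under assumptions: surface points are given as labels, connections between adjacent points are given as pairs, and each connected component is treated as one candidate object. That last step goes beyond the principle as quoted, which states only a necessary condition (‘only if’). The function name and the point labels are hypothetical.

```python
from collections import deque


def segment_by_connectivity(points, connected):
    """Group surface points into components linked by paths of connected points."""
    adjacency = {p: set() for p in points}
    for a, b in connected:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in points:
        if start in seen:
            continue
        # Breadth-first search collects every point reachable from `start`.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            p = queue.popleft()
            component.add(p)
            for q in adjacency[p]:
                if q not in seen:
                    seen.add(q)
                    queue.append(q)
        components.append(component)
    return components


points = ["a1", "a2", "b1", "b2"]

# A spatial gap between the a-points and the b-points: two components.
separated = segment_by_connectivity(points, [("a1", "a2"), ("b1", "b2")])

# The surfaces touch at a2 and b1: one component.
touching = segment_by_connectivity(
    points, [("a1", "a2"), ("b1", "b2"), ("a2", "b1")]
)

print(len(separated), len(touching))
```

Points in different components cannot lie on the same object according to the principle; whether points in the same component do lie on one object is left open by the principle itself.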

Spelke (1990, figure 4)

For example, objects arranged as on your left were perceived by 3-month-olds as two objects, whereas infants treated displays like that on your right as if they were one object. (This was measured using a habituation paradigm \citep{kestenbaum:1987_perception}. Infants were habituated to the display. Then either one object's position changed, or both objects' positions changed but in such a way as to preserve the overall configuration of the two objects. Infants could show that they perceived the configuration as a single object by looking longer when just one object's position changed.)

Here's a second example using moving rather than static stimuli and a different method: reaching rather than looking. Let me explain the stimuli first.

How does the principle of cohesion apply to this moving display? As we just formulated it, it doesn't seem to. After all, in both cases all points on the stimuli are linked by a path of connected surface points. However, the principle should be read as implying more, namely that: ‘When two surfaces are separated by a spatial gap (as in Figure 4a) or undergo relative motions that alter the adjacency relations among points at their border (as in Figure 4i), the surfaces lie on distinct objects’ \citep[p.\ 49]{Spelke:1990jn}.

The question is, Do infants segment these objects in accordance with the Principle of Cohesion? \citet[Experiment 2]{spelke:1989_reaching} used a reaching experiment with 5-month-old infants. The smaller of the two objects was always closer to the infants. Infants should reach more often for the smaller, nearer object when they represent the stimuli as two separate objects than when they represent them as a single object. (This is not obvious, but the researchers do justify this claim carefully \citep[p.\ 186]{spelke:1989_reaching}.) So the idea is that by comparing how often 5-month-olds reach for the smaller object, we can see whether they treat it as a separate object in one case but not the other. To make this vivid, let me show you their apparatus ...

Spelke et al 1989 figure 1.

Here you can see the infant sitting in front of the two objects which could be made to move together or separately.

Spelke et al 1989 table 2.

And here are the results. You don't need to read the table; I put it here just to mention that this is a within-subject design.

[*explain within- vs between-subject].

Overall, infants reached to the smaller, top object more often when the two objects moved in opposite directions than when the objects moved together. Given the background assumption, this is evidence that infants segmented the objects differently depending on their motions, and did so in just the way that adults would \citep[Experiment 2]{spelke:1989_reaching}.

\citet{Spelke:1990jn} proposes that our ability to segment objects depends on four principles. We've already seen two of these in action (rigidity and cohesion), and we will shortly see that a further principle is needed, too.

Principles of Object Perception

\textbf{Principles of Object Perception \citep{Spelke:1990jn}}

cohesion—‘two surface points lie on the same object only if the points are linked by a path of connected surface points’
boundedness—‘two surface points lie on distinct objects only if no path of connected surface points links them’
rigidity—‘objects are interpreted as moving rigidly if such an interpretation exists’
no action at a distance—‘separated objects are interpreted as moving independently of one another if such an interpretation exists’

(Spelke 1990)

I don't want to obsess too much about the details of these principles. It isn't important that there are exactly four, nor is their precise formulation important. (Surely the principles as stated here are not exactly the principles we need to characterise how infants segment objects.) What I want us to focus on is just the fact that we can use a small number of principles to characterise how infants segment objects in a way that generates testable predictions, and that these predictions have been confirmed. This motivates us to ask ...

What is the status of these principles?

Spelke’s position might be put like this:

\begin{enumerate}
\item We (as perceivers) start with a cross-modal representation of three-dimensional perceptual features which includes their locations and trajectories.
\item Our task is to get from these representations of features to representations of objects.
\item \emph{Descriptive component} We do this as if in accordance with certain principles (cohesion, boundedness, rigidity, and no action at a distance).
\item \emph{Explanatory component} We acquire representations of objects because we apply the principles to representations of features and draw appropriate inferences.
\end{enumerate}

The key point for our purposes is the explanatory component. The principles are not supposed to be merely heuristics for describing and predicting infants’ performance on preferential looking tasks. Rather, these principles are supposed to explain why infants look longer at some things than at others. This is what motivates the hypothesis that infants know these principles and use them in reasoning about objects: unless this hypothesis is true, it’s hard to understand how the principles could have explanatory relevance.

1. How do four-month-old infants model physical objects?

2. What is the relation between the model and the infants?

3. What is the relation between the model and the things modelled (physical objects)?

The conjecture that someone can segment and represent physical objects does not by itself generate readily testable predictions. Everything depends on which model of physical objects characterises her physical cognition.

So we should ask ...

1. How do four-month-old infants model physical objects?

In asking how infants model physical objects, we are seeking to understand not how physical objects in fact are but how they appear from the point of view of an individual or system.

The model need not be thought of as something used by the system: it is a tool the theorist uses in describing what the system is for and broadly how it works. This therefore leads us to a second question ...

Marr & Chomsky

The Simple View is inspired by two famous cognitive scientists, Marr and Chomsky. Marr showed that many visual processes can be described as inferences. And Chomsky pioneered the idea that humans’ knowledge of language depends on their knowing a small number of principles. Similarly, Spelke’s suggestion is that human infants (and adults) come to know facts about particular physical objects by virtue of making inferences from a small number of principles which they know or believe.

What unites these three cases: Spelke on object segmentation, Marr on vision and Chomsky on syntax? It’s that they are straightforwardly cognitivist in their appeal to knowledge and inference. Principles are known, and they are used via a process of inference. (There’s a nice quote from Fodor underlining this point.)

‘... the vocabulary in which Chomsky frames linguistic issues is explicitly epistemological. Thus, the grammar of a language specifies what its speaker/hearers have to know ... and the goal of the child’s language acquisition process is to construct a theory of the language that correctly expresses this grammatical knowledge.’

Fodor 2000, p. 11

‘Chomsky’s nativism is primarily a thesis about knowledge and belief; it aligns problems in the theory of language with those in the theory of knowledge. Indeed, as often as not, the vocabulary in which Chomsky frames linguistic issues is explicitly epistemological. Thus, the grammar of a language specifies what its speaker/hearers have to know qua speakers and hearers; and the goal of the child’s language acquisition process is to construct a theory of the language that correctly expresses this grammatical knowledge.’

\citep[p.\ 11]{Fodor:2000cj}

the simple view

So what is the status of Spelke’s principles of object perception? Consider what I shall call the Simple View ...

\textit{The simple view} The principles of object perception are things that we know or believe, and we generate expectations from these principles by a process of inference.

The simple view is that the Spelke principles are just known in whatever sense anything is known or believed. (We can't say the principles are known because strictly speaking they are not truths but only heuristics.) The simple view isn’t exactly Spelke’s, but it’s a useful starting point for discussion.

The Simple View is worth considering in its own right because it is so, well, simple. But our interest in it may be piqued by the fact that Spelke herself appears to have accepted the Simple View at one point in her thinking:

‘objects are conceived: Humans come to know about an object’s unity, boundaries, and persistence in ways like those by which we come to know about its material composition or its market value’

\citep[p.\ 198]{Spelke:1988xc}.

Spelke (1988, p. 198)

Now you might think that the case for these principles is not yet very strong. In that case, asking hard questions about their status would hardly be necessary. So let’s consider further evidence for these principles. We can do this by turning from segmentation (which was our first requirement on knowledge of objects) to representing objects as permanent.
