How do humans segment objects?
--------
Recall that the way objects are ordinarily arranged in space, so that one occludes parts of another, prevents us from doing this in any simple way.
--------
Infants from 4.5 months of age can use featural information to segment objects.
--------
In Amy Needham's 1998 study, 4.5-month-old infants were shown a display like this.
Featural information---the difference in textures of the objects---suggests that these are two
separate objects. But can infants use this information to detect that there are two objects?
--------
Some infants were then shown the object being moved like this, so that it is clearly two
separate objects.
--------
Other infants were shown the object being moved like this.
If infants think there is one object, they should expect the second kind of movement.
But if infants think there are two objects---if, that is, they can use the featural
information to segment objects---then they should expect the former kind of movement.
What were the results? ...
--------
Needham's results are evidence that infants from 4.5 months of age can use featural information
to segment objects.
--------
[I need to explain the method used in violation-of-expectations, and to compare it with
the method of habituation.]
A violation-of-expectations experiment involves a pair of events.
Infants are divided into two groups; one group sees one event, the other sees the other event.
(This is the between-subject version; it might also be done within subjects.)
The experimenter measures how long the infants look at each event.
Of interest is whether infants reliably look longer at one of the two events.
If they do, this is interpreted as evidence that this event---the one infants reliably look
longer at---is in some way interesting to them.
And, if the events are well chosen, their interest indicates that the event violates an
expectation they have.
In the experiment we are considering, the expectation violated is the expectation that
the two objects should move separately.
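To make the logic of the between-subject comparison concrete, here is a minimal sketch. The looking times below are invented purely for illustration (they are not Needham's data), and the names `welch_t`, `move_together` and `move_apart` are hypothetical. The sketch just compares mean looking times across the two groups and computes Welch's t statistic; a real analysis would go on to a proper significance test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples of looking times."""
    # Standard error of the difference between the two group means.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Invented looking times in seconds (NOT real data), one value per infant.
# Group 1 saw the two parts move together; group 2 saw them move apart.
move_together = [32.0, 28.5, 35.0, 30.5, 29.0]
move_apart = [21.0, 24.5, 19.0, 23.0, 22.5]

mean_difference = mean(move_together) - mean(move_apart)
t_statistic = welch_t(move_together, move_apart)
print(f"mean difference: {mean_difference:.1f} s, Welch t: {t_statistic:.2f}")
```

A reliably positive difference in this direction is what we would expect if infants take the display to contain two objects, so that seeing the parts move together violates their expectation.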
--------
At this point you might well ask, What is an expectation?
This is an important question but let me postpone it for now.
--------
To return to Needham's experiment, interestingly, 4.5-month-old infants were able to succeed
even when the point of contact between the two objects was occluded, as in this diagram.
--------
These are the results for 4.5-month-old infants.
One further thing: infants can also use shape information in segmenting objects, and shape information appears to trump featural information \citep{needham:1999_role}.
--------
Can we fully explain how infants segment objects just by appeal to features?
To see why it couldn't be just features that we use to segment objects, consider
some more cases ...
--------
Here is an occluded object---a stick behind a box.
The movement is enough to convince 4-month-old infants that there is just one stick even
though they never see its middle \citep{kellman:1983_perception}.
We can discover this by measuring how different displays cause them to dishabituate.
--------
After being habituated to this display, 3-month-old infants were shown one of two displays.
--------
And here are the results (subjects were 3-month-old infants).
--------
The fact that infants can correctly segment partially occluded objects based on their movements
already indicates that they can't be thinking about features only.
For more evidence, consider this display.
The two parts of the moving object are featurally different.
Despite this, infants expect to see a single connected object behind the block
(\citealp{kellman:1983_perception}, Experiment 6; \citealp{Spelke:1990jn}).
--------
Here are the test stimuli (each group is shown one or the other).
--------
And here are the results.
Subjects in this experiment were 4-month-old infants.
So we saw that infants can use featural information to segment objects,
but the principle of cohesion can trump featural indicators of difference.
So infants' abilities to segment objects are not based entirely on recognising features.
--------
If infants do not rely only on features to do this, then
how do infants segment the objects in the displays we've just been seeing?
--------
\citet{Spelke:1990jn} suggests that infants rely on a set of principles to segment objects.
But what are the principles?
--------
Recall this display with an object moving behind a stationary block.
What kind of principle could be used to identify that the occluded thing is a single object?
\citet{Spelke:1990jn} suggests the principle of rigidity.
This principle says that ‘objects are interpreted as moving rigidly if such an interpretation
exists’.
The hypothesis that this principle describes in part how infants segment objects correctly
predicts that they will treat the moving occluded stick as a single object.
But rigidity is not the only principle we need to explain how infants segment objects ...
What justifies us in supposing that a rigidly moving object needs to be joined up?
--------
... to answer this question, consider the Principle of Cohesion
--------
Another principle which seems to be involved in segmenting objects is the principle of
cohesion.
According to this principle, ‘two surface points lie on the same object only if the points are
linked by a path of connected surface points’ \citep{Spelke:1990jn}.
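Here is a minimal sketch of what checking this condition might look like. It rests on assumptions that are mine, not Spelke's: surfaces are represented as sets of grid cells, two cells count as connected when they share an edge, and the function name `linked_by_surface_path` is hypothetical. Note that the principle states only a necessary condition: if no path exists, the points lie on distinct objects; if a path exists, they merely may lie on the same object.

```python
from collections import deque


def linked_by_surface_path(surface, start, goal):
    """Return True if start and goal are joined by a path of adjacent surface points.

    `surface` is a set of (x, y) grid cells; adjacency means sharing an edge
    (an assumption made for this sketch).
    """
    if start not in surface or goal not in surface:
        return False
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over connected surface points
        x, y = queue.popleft()
        if (x, y) == goal:
            return True
        for neighbour in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if neighbour in surface and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


# Two blocks separated by a spatial gap at x = 2 ...
separated = {(0, 0), (1, 0), (3, 0), (4, 0)}
# ... and two blocks that touch, leaving no gap.
touching = {(0, 0), (1, 0), (2, 0), (3, 0)}

gap_case = linked_by_surface_path(separated, (0, 0), (4, 0))
contact_case = linked_by_surface_path(touching, (0, 0), (3, 0))
print(gap_case, contact_case)
```

In the gap case no path exists, so cohesion licenses the conclusion that there are two objects; in the contact case a path exists, so cohesion alone leaves open whether there is one object or two.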
--------
For example, objects arranged as on your left were perceived by 3-month-olds as two objects,
whereas infants treated the displays like that on your right as if they were one object.
(This was measured using a habituation paradigm \citep{kestenbaum:1987_perception}. Infants
were habituated to the display. Then either one object's position changed, or both objects'
positions changed but in such a way as to preserve the overall configuration of the two
objects. Infants could show that they perceived the configuration as a single object by
looking longer when just one object's position changed.)
--------
Here's a second example using moving rather than static stimuli and a different method:
reaching rather than looking.
Let me explain the stimuli first.
How does the principle of cohesion apply to this moving display?
As we just formulated it, it doesn't seem to. After all, in both cases all points on the
stimuli are linked by a path of connected surface points.
However, the principle should be read as implying more, namely that:
‘When two surfaces are separated by a spatial gap (as in Figure 4a) or undergo relative motions
that alter the adjacency relations among points at their border (as in Figure 4i), the
surfaces lie on distinct objects’ \citep[p.\ 49]{Spelke:1990jn}.
The question is, Do infants segment these objects in accordance with the Principle of Cohesion?
\citet[Experiment 2]{spelke:1989_reaching} used a reaching experiment with 5-month-old infants.
The smaller of the two objects was always closer to the infants.
Infants should reach more often for the smaller, nearer object when they represent the stimuli
as two separate objects than when they represent them as a single object.
(This is not obvious, but the researchers do justify this claim carefully
\citep[p.\ 186]{spelke:1989_reaching}.)
So the idea is that by comparing how often 5-month-olds reach for the smaller object, we can
see whether they treat it as a separate object in one case but not the other.
To make this vivid, let me show you their apparatus ...
--------
Here you can see the infant sitting in front of the two objects which could be made to move
together or separately.
--------
And here are the results. You don't need to read the table; I put it here just to mention
that this is a within-subject design.
[*explain within- vs between-subject].
Overall, infants reached to the smaller, top object more often when they moved in opposite
directions than when they moved together.
Given the background assumption, this is evidence that infants segmented the objects
differently depending on their motions, and did so in just the way that adults would
\citep[Experiment 2]{spelke:1989_reaching}.
--------
\citet{Spelke:1990jn} proposes that our ability to segment objects depends on four principles.
We've already seen two of these in action (rigidity and cohesion), and we will shortly see
that a further principle is needed, too.
--------
We've already seen this principle in action.
--------
Boundedness is just the converse of cohesion.
Strictly speaking, cohesion allows us to infer that we have two
distinct objects, but not to infer that we have a single object---for that, we need boundedness.
So when I was talking a moment ago about the Principle of Cohesion, strictly speaking I was
also appealing to the Principle of Boundedness.
--------
We saw an example of the principle of rigidity in action earlier, with the moving stick
experiment.
--------
The final Principle, no action at a distance, is a converse to rigidity.
--------
I don't want to obsess too much about the details of these principles.
It isn't important that there are exactly four, nor are their precise formulations important.
(Surely the principles as stated here are not exactly the principles we need to characterise
how infants segment objects.)
What I want us to focus on is just the fact that we can use a small number of principles to
characterise how infants segment objects in a way that generates testable predictions,
and these principles have been confirmed.
This motivates us to ask ...
What is the status of these principles?
Spelke’s position might be put like this:
\begin{enumerate}
\item We (as perceivers) start with a cross-modal representation of three-dimensional
perceptual features which includes their locations and trajectories.
\item Our task is to get from these representations of features to representations of objects.
\item \emph{Descriptive component} We do this as if in accordance with certain principles
(cohesion, boundedness, rigidity, and no action at a distance).
\item \emph{Explanatory component} We acquire representations of objects because we apply the
principles to representations of features and draw appropriate inferences.
\end{enumerate}
The key point for our purposes is the explanatory component.
The principles are not supposed to be merely heuristics for describing and predicting infants’
performance on preferential looking tasks.
Rather, these principles are supposed to explain why infants look longer at some things than at
others.
This is what motivates the hypothesis that infants know these principles and use them in
reasoning about objects: unless this hypothesis is true, it’s hard to understand how the
principles could have explanatory relevance.
--------
The conjecture that someone can segment and represent physical objects
does not by itself generate readily testable predictions.
Everything depends on which model of physical objects characterises her
physical cognition.
So we should ask ...
--------
1. How do four-month-old infants model physical objects?
In asking how infants
model physical objects, we are seeking to understand not how physical objects
in fact are but how they appear from the point of view of
an individual or system.
The model need not be thought of as something used by the system: it is
a tool the theorist uses in describing what the system is for and
broadly how it works.
This therefore leads us to a second question ...
--------
2. What is the relation between the model and the infants?
--------
3. What is the relation between the model and the things modelled (physical objects)?
--------
The Simple View is inspired by two famous cognitive scientists, Marr and Chomsky.
Marr showed that many visual processes can be described as inferences.
And Chomsky pioneered the idea that humans’ knowledge of language depends on their knowing a small number of
principles.
Similarly, Spelke’s suggestion is that human infants (and adults) come to
know facts about particular physical objects by virtue of making inferences
from a small number of principles which they know or believe.
What unites these three cases, Spelke on object segmentation, Marr on vision and Chomsky on
syntax?
It’s that they are straightforwardly cognitivist in their appeal to knowledge and inference.
Principles are known, and they are used via a process of inference.
(There’s a nice quote from Fodor underlining this point.)
--------
‘Chomsky’s nativism is primarily a thesis about knowledge and belief; it aligns problems
in the theory of language with those in the theory of knowledge. Indeed, as often as not,
the vocabulary in which Chomsky frames linguistic issues is explicitly epistemological.
Thus, the grammar of a language specifies what its speaker/hearers have to know qua speakers
and hearers; and the goal of the child’s language acquisition process is to construct a
theory of the language that correctly expresses this grammatical knowledge.’
\citep[p.\ 11]{Fodor:2000cj}
--------
So what is the status of Spelke’s principles of object perception?
Consider what I shall call the Simple View ...
\textit{The simple view}
The principles of object perception are things that we know or believe,
and we generate expectations from these principles by a process of inference.
The simple view is that the Spelke principles are just known in whatever sense anything is
known or believed.
(We can't say the principles are known because strictly speaking they are not truths but only
heuristics.)
The simple view isn’t exactly Spelke’s, but it’s a useful starting point for discussion.
--------
The Simple View is worth considering in its own right because it is so, well, simple.
But our interest in it may be piqued by the fact that
Spelke herself appears to have accepted the Simple View at one point in her thinking:
‘objects are conceived: Humans come to know about an object’s unity, boundaries, and
persistence in ways like those by which we come to know about its material composition or its
market value’
\citep[p.\ 198]{Spelke:1988xc}.
--------
Now you might think that the case for these principles is not yet very strong.
In that case, asking hard questions about their status would hardly be necessary.
So let’s consider further evidence for these principles.
We can do this by turning from segmentation (which was our first requirement on knowledge of
objects)
to representing objects as permanent.