Paper for the Summer School in Analytic Philosophy
on Knowledge and Cognition, July 1-7, 2002.
Seeing, Perceiving and Knowing
Pierre Jacob
jacob@ehess.fr
The present paper has two major
goals, one of which is to argue that seeing is not always perceiving and the
other of which is to argue that visual perception alone leads to knowledge of
the world. Let me immediately try to make these two cryptic claims more
transparent. Not all human vision has been designed to allow visual perception.
Seeing can and often does make us visually aware of objects, properties and
facts in the world. But it need not. Often enough, seeing allows us to act
efficiently on objects of which we are dimly
aware, if at all. While moving at high speed, for example, experienced
drivers are sometimes capable of avoiding an interfering obstacle of whose
visual attributes they become fully aware only afterwards. One may efficiently
either catch or avoid being hit by a flying tennis ball without being aware of
either its color or texture. This is the sense in which seeing is not always
perceiving. If so, then the question arises as to the nature, function and
cognitive role of non-perceptual vision. Here, I will make two joint claims.
First of all, I will try to argue that the main job of human visual perception
is to provide visual information for what functionalist philosophers have
called the “belief box”. In other
words, visual percepts are inputs to further conceptual processing whose output
can be stored in the belief box. Secondly, I will try to argue that the
function of that part of the visual system that produces what I shall call
“non-perceptual” or more often “visuomotor” representations is to provide
visual guidance to the “intention box”. More specifically, I will argue that,
unlike visual percepts, visuomotor representations — which, I shall claim, are
genuine representations — present visual information to motor intentions and
serve as inputs to “causally indexical” concepts. On the joint assumptions
(that I accept) that in the relevant propositional sense, only facts can be
known, and that one cannot know a fact unless one believes that this very fact
(or state of affairs) holds, then it follows from my distinction between
perceptual and visuomotor processing that only visual perception can give rise
to “detached” knowledge of the mind-independent world.
In
their (1982) paper “Two Cortical Visual Systems”, the cognitive neuroscientists
Leslie Ungerleider and Mortimer Mishkin posited an anatomical distinction
between the ventral pathway and the dorsal pathway in the primate visual system
(see Figure 1). The former projects from the primary visual cortex to
inferotemporal areas. The latter projects from the primary visual cortex to
parietal areas, which serve as a relay between the primary visual cortex and the
premotor and motor cortices. Ungerleider and Mishkin based their anatomical
distinction on neurophysiological and behavioral evidence gathered from the
study of macaque monkeys. They made lesions in the ventral and in the dorsal
pathway, respectively, of the animals’ visual system and
found the following double dissociation. Animals with a lesion in the
ventral pathway were impaired in the identification and recognition of the
colors, textures and shapes of objects. But they were relatively unimpaired in
tasks of spatial orientation. In such tasks, they were
presented with two wells one of which contained food and the other of which was
empty: the former was closer to a landmark than the latter (see Figure 2).
Animals with a ventral lesion could accurately use the presence of the landmark
in order to discriminate the well with food from the well without. By contrast,
animals with a dorsal lesion were severely disoriented, but their capacity to
identify and recognize the shapes, colors and textures of objects was
well preserved. On this basis, Ungerleider and Mishkin (1982) concluded that
the ventral pathway of the primate visual system is the What system and
the dorsal pathway of the primate visual system is the Where system.
In their (1995) book, The Visual
Brain in Action, the cognitive neuroscientists David Milner and Mel Goodale
presented a number of arguments in favor of a new interpretation of the
dualistic model of the human visual system. On their view, the ventral stream
of the human visual system serves what they call “vision-for-perception” and
the dorsal stream serves what they call “vision-for-action”. The important idea
underlying Milner and Goodale’s dualistic model of human vision is that one and
the same visual stimulus can be processed in two fundamentally different ways.
Now, two caveats are important here. First of all, it is quite clear, I think,
that, as Austin (1962) emphasized, humans can see a great variety of things:
they can see e.g., tables, trees, rivers, substances, gases, vapors, mountains,
flames, clouds, smoke, shadows, holes, pictures, movies, events and actions. Here,
I will not examine the ontological status of all the various things that human
beings can see and I shall restrict myself to seeing ordinary middle-sized
objects that can also happen to be targets of human actions. Secondly, it is no
objection to the dualistic model of the human visual system to acknowledge
that, in the real life of normal human subjects, the two distinct modes of
visual processing are constantly collaborating. Indeed, the very idea that they
collaborate — if and when they do — presupposes that they are distinct. The
trick of course is to find experimental conditions in which the two modes of
visual processing can be dissociated. In the following, I will provide some
examples drawn first from the psychophysical study of normal human subjects and
then from the neuropsychological study of brain-lesioned human patients.
Bridgeman et al. (1975) and Goodale et al. (1986) found that normal subjects could point
accurately to a target on a computer screen, the motion of which they could not
consciously notice because it coincided with one of their saccadic eye movements
(see Jeannerod, 1997: 82). Castiello
et al. (1991) found that subjects were able to correct the trajectory of their
hand movement directed towards a moving target some 300 milliseconds before
they became conscious of the target’s change of location. Pisella et al. (2000) and
Rossetti & Pisella (2000) performed experiments involving a pointing task in
which subjects were presented with a green target towards which they were
requested to point their index finger. Some of them were instructed to stop
their pointing movement towards the target when and only when it changed
location by jumping either to the left or to the right. Pisella et al. (2000) and
Rossetti & Pisella (2000) found a significant percentage of very fast unwilled
correction movements generated by what they called the “automatic pilot” for
hand movement. In a second experiment, Pisella et al. (2000) presented subjects
simultaneously with pairs of a green and a red target. They were instructed to
point to the green target, but the color of the two targets could be
interchanged unexpectedly at movement onset. Unlike a change of target
location, a change of color did not elicit fast unwilled corrective movements
by the “automatic pilot”. On this basis, Pisella et al. (2000) draw a contrast
between the fast visuomotor processing of the location of a target in
egocentric coordinates and the slower
visual processing of the color of an object.
One psychophysical area of
particular interest is the study of visual size-contrast illusions. One
particularly well-known such illusion is the Titchener or Ebbinghaus illusion.
The standard version of the illusion consists of the display of two circles of
equal diameter, one surrounded by an annulus of circles greater than it, and
the other surrounded by an annulus of circles smaller than it. Although they
are equal, the former looks smaller than the latter (see Figure 3). One
plausible account of the Titchener illusion is that the array of smaller
circles is judged to be more distant than the array of larger circles. Visually
based perceptual judgments of distance and size are typically relative
judgments: in a perceptual task, one cannot but see some things as smaller
(or larger) and closer (or further away) than other neighboring things that are
parts of a single visual array. In perceptual tasks, the output of obligatory
comparisons of sizes, distances and positions of constituents of a visual array
serves as input to perceptual constancy mechanisms. As a result, of two
physically equal objects, if one is perceived as more distant from the observer
than the other, the former will be perceived as larger than the latter. A
non-standard version of the illusion consists in the display of two circles of
unequal diameter: the larger of the two is surrounded by an annulus of circles
larger than it, while the smaller of the two is surrounded by an annulus of
circles smaller than it, so that the two unequal circles look equal.
Aglioti et al. (1995) designed an experiment in
which they replaced the two central circles by two graspable three-dimensional
plastic disks, which they displayed within a horizontal plane. In a first series
of experiments with pairs of unequal disks whose diameters ranged from 27 mm to
33 mm, they found that on average the disk in the annulus of larger circles had
to be 2.5 mm wider than the disk in the annulus of smaller circles in order for
both to look equal. These numbers provide a measure of the delicacy of the
human visual system. Finally, Aglioti et al. (1995) alternated presentations of
physically unequal disks, which looked equal, and presentations of physically
equal disks, which looked unequal. Both kinds of trials were presented randomly
and so were the left vs. right positions of either kind of stimuli. Subjects
were instructed to pick up the disk on the left between the thumb and index
finger of their right hand if they thought the two disks to be equal or to pick
up the disk on the right if they judged them to be unequal.
The
sequence of subjects’ choices of the disk on the right or the disk on the left
provided a measure of the magnitude of the illusion prompted by the perceptual
comparison between two disks surrounded by two distinct annuli. In the
visuomotor task, the measure of grip size was based on the unfolding of the
natural grasping movement performed by subjects while their hand approached the
object. During a prehension movement, fingers progressively stretch to a
maximal aperture before they close down until contact with the object. It has
been found that the maximum grip aperture (MGA) takes place at a relatively
fixed point, i.e., at about 60% of the duration of the movement (cf. Jeannerod,
1984). In non-illusory contexts, MGA has been found to be reliably correlated
with the object’s physical size. Although MGA is much larger than the object, it is directly
proportional to the actual physical size of the object. MGA cannot depend on a
conscious visual comparison between the size of the object and subjects’ hand
during the prehension movement since the correlation between MGA and the object’s
size is reliable even when subjects have no visual access to their own hand.
Rather, MGA is assumed to result from an early anticipatory automatic visual
process of calibration. Thus, Aglioti et al. (1995) measured MGA in flight using
optoelectronic recording.
What Aglioti et al. (1995) found was
that, unlike comparative perceptual judgment expressed by the sequence of
choices of either the disk on the left or the disk on the right, the grip was
not significantly affected by the illusion. The influence of the illusion was
significantly stronger on perceptual judgment than on the grasping task. This
experiment, however, raises a number of methodological problems. The main
issue, raised by Pavani et al. (1999) and Franz et al. (2000), is the asymmetry
between the two tasks. In the perceptual task, subjects are asked to compare
two distinct disks surrounded by two different annuli. But in the grasping
task, subjects focus on a single disk surrounded by an annulus. So the question
arises whether, from the observation that the comparative perceptual judgment
is more affected by the illusion than the grasping task, one may conclude that
perception and action are based on two distinct representational systems.
Aware of this problem, Haffenden
& Goodale (1998) performed the same experiment, but they designed one more
task: in addition to instructing subjects to pick up the disk on the left if
they judged the two disks to be equal in size or to pick up the disk on the
right if they judged them to be unequal, they required subjects to manually
estimate between the thumb and index finger of their right hand the size of the
disk on the left if they judged the disks to be equal in size and to manually
estimate the size of the disk on the right if they judged them to be unequal
(see Figure 4). Haffenden & Goodale (1998) found that the effect of the
illusion on the manual estimation of the size of a disk (after comparison) was intermediate
between its effects on comparative judgment and on grasping.
Furthermore, Haffenden & Goodale
(1998) found that the presence of an annulus had a selective effect on
grasping. They contrasted the presentation of pairs of disks either against a
blank background or surrounded by an annulus of circles of intermediate size,
i.e., of size intermediate between the size of the smaller circles and the size
of the larger circles involved in the contrasting pair of illusory annuli. The
circles of intermediate size in the annulus were slightly larger than the disks
of equal size. When a pair of physically different disks was presented against
either a blank background or a pair of annuli made of intermediate size
circles, both grip scaling and manual estimates reflected the physical
difference in size between the disks. When physically equal disks were
displayed against either a blank background or a pair of annuli made of circles
of intermediate size, no significant difference was found between grasping and
manual estimate. The following dissociation, however, turned up: when
physically equal disks were presented with a middle-sized annulus, overall MGA
was smaller than when physically equal disks were presented against a blank
background. Thus, the presence of an annulus of middle-sized circles prompted a
smaller MGA than a blank background. Conversely, overall manual estimate was
larger when physically equal disks were presented against a background with a
middle-sized annulus than when they
were presented against a blank background. The illusory effect of the
middle-size annulus presumably arises from the fact that the circles in the
annulus were slightly larger than the equal disks. Thus, whereas the presence
of a middle-sized annulus contributes to increasing manual estimation, it
contributes to decreasing grip scaling. This dissociation shows that the
presence of an annulus may have conflicting effects on perceptual estimate and
on grip aperture.
Finally, Haffenden, Schiff &
Goodale (2001) went one step further. They presented subjects with three
distinct Titchener circle displays one at a time, two of which are the
traditional Titchener central disk surrounded by an annulus of circles either
smaller than it or larger than it. In the former case, the gap between
the edge of the disk and the annulus is 3 mm. In the latter case, the gap
between the edge of the disk and the annulus is 11 mm. In the third display,
the annulus is made of small circles (of the same size as in the first
display), but the gap between the edge of the disk and the annulus is 11
mm (like the gap in the second display with an annulus of larger circles) (see
Figure 5). What Haffenden, Schiff and Goodale (2001) found was the following
dissociation: in the perceptual task, subjects estimated the third display very
much like the first display and unlike the second display. In the visuomotor
task, subjects’ grasping in the third condition was much more similar to
grasping in the second than in the first condition (see Figure 6). Thus,
perceptual estimate was far more sensitive to the size of the circles in the
annulus than to the distance between target and annulus. Conversely, grasping
was far more sensitive to the distance between target and annulus than to the
size of the circles in the annulus. The idea here is that the annulus is
processed by the visuomotor system as a potential obstacle for the positioning
of the fingers on the target disk.
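The logic of this dissociation can be laid out in a minimal Python sketch. It encodes only the three displays as described above (the size of the annulus circles relative to the disk, and the gap in millimetres) and asks which of the first two displays the third resembles on each feature. The names `Display`, `circle_size`, `gap_mm` and `matches_on` are illustrative assumptions, not part of the original study, and the sketch models the structure of the argument, not the experimental data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Display:
    name: str
    circle_size: str  # annulus circles "small" or "large" relative to the disk
    gap_mm: int       # gap between the edge of the disk and the annulus


# The three displays as described in the text.
first = Display("first", "small", 3)
second = Display("second", "large", 11)
third = Display("third", "small", 11)


def matches_on(target, candidates, feature):
    """Return the names of the candidates that share the target's value on one feature."""
    return [d.name for d in candidates if getattr(d, feature) == getattr(target, feature)]


# Hypothetical grouping by the feature that mattered most for perceptual estimates.
perceptual_match = matches_on(third, [first, second], "circle_size")
# Hypothetical grouping by the feature that mattered most for grasping.
visuomotor_match = matches_on(third, [first, second], "gap_mm")

print(perceptual_match)  # ['first']
print(visuomotor_match)  # ['second']
```

On this toy encoding, grouping by circle size pairs the third display with the first, as the perceptual estimates did, while grouping by gap pairs it with the second, as grasping did.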
From
this selective review of evidence on size-contrast illusions, I would like to
draw two temporary conclusions. First of all, visual perception and visually
guided hand actions directed towards objects impose different computational
requirements on the human visual system. As I said above, visually based
perceptual judgments of distance and size are typically relative comparative
judgments. By contrast, visually guided actions directed towards objects are
typically based on the computation of the absolute size and the egocentric
representation of the location of objects on which to act. In order to
successfully grab a branch or a rung, one must presumably compute the distance
and the metrical properties of the object to be grabbed quite independently of
pictorial contextual features in the visual array.
Secondly, what the above experiments suggest is not that, unlike perceptual
judgments, the visuomotor control of grasping is immune to illusions. Rather,
both perceptual judgment and the visuomotor control of action can be fooled by
the environment, but they are fooled by different features of the
visual display. The effect of the Titchener size-contrast illusion on
perceptual judgment arises mostly from the comparison between the diameter of
the disk and the diameter of the circles in the surrounding annulus. The
visuomotor processing, which delivers a visual representation of the absolute
size of a target of prehension, is so sensitive to the distance between the
edge of the target and its immediate environment that it can be led to process
two-dimensional cues as if they were three-dimensional obstacles. I take this
last point quite seriously because I claim that it is evidence that the output
of the visuomotor processing of the target of an action can misrepresent
features of the distal stimulus and is thus a genuine mental representation.
In the 1970’s, Weiskrantz and others
discovered a neuropsychological condition called “blindsight” (see Weiskrantz,
1986, 1997). Since then, the phenomenon has been extensively studied and
discussed by philosophers. Blindsight results from a lesion in the primary
visual cortex anatomically located prior to the bifurcation between the ventral
and the dorsal streams. The significance of the discovery of this phenomenon
lies in the fact that although blindsight patients have no phenomenal
subjective visual experience of the world in their blind field, they nonetheless
retain striking residual visuomotor capacities.
In situations of forced choice, they can do such remarkable things as grasp
quadrangular blocks and insert a hand-held card into an oriented slot.
According to most neuropsychologists who have studied such cases, in blindsight
patients, the visual information is processed by subcortical pathways that
bypass the visual cortex and relay visual information to the motor cortex.
In the early 1990’s, DF, a British
woman, suffered an extensive lesion in the ventral stream of her visual system
as a result of poisoning by carbon monoxide. She thus became an apperceptive
agnosic, i.e., a visual form agnosic patient (see Farah, 1990 for the
distinction between apperceptive and associative agnosia). Following the
discovery of blindsight, the main novelty of the neuropsychological description
of patient DF’s condition, first examined by Goodale and his
colleagues (Goodale et al., 1991), lies in the fact that DF’s examination did not focus exclusively
on what she could not do as a result of her lesion. Rather, she was
investigated in depth for what she was still able to do.
Careful
sensory testing of DF revealed subnormal performance for color perception and
for visual acuity with high spatial frequencies, though detection of low
spatial frequencies was impaired. Her motion perception was poor. DF’s
perception of shape and patterns was very poor. She was unable to report the
size of an object by matching it with the appropriate distance between the index
finger and the thumb of her right hand. Her line orientation detection (revealed
either by verbal report or by turning a hand-held card until it matched the
orientation presented) was highly variable: although she was above chance for
large angular orientation differences between two objects, she fell at chance
level for smaller angles. DF was unable to recognize the shape of objects.
Interestingly, however, her visual imagery was preserved. For example, although
she could hardly draw copies of seen objects, she could draw objects
from memory, though she could hardly recognize her own drawings afterwards.
By contrast with her impairment in
object recognition, DF was normally accurate when object orientation or size
had to be processed, not in view of a perceptual judgment, but in the context
of a goal-directed hand movement. When reaching for and grasping, between her
index finger and thumb, the very same objects that she could not recognize, she
performed accurate prehension movements. Similarly, while transporting a
hand-held card towards a slit as part of the process of inserting the former
into the latter, she could orient her hand normally for slits presented at
different orientations (Goodale et al., 1991, Carey et al., 1996). When
presented with a pair of rectangular blocks of either the same or different
dimensions and asked whether they were the same or different, she failed. When
she was asked to reach out and pick up a block, the measure of her (maximal)
grip aperture between thumb and index finger revealed that her grip was
calibrated to the physical size of the objects, like that of normal subjects.
When shown a pair of objects selected from twelve objects of different shapes for
same/different judgment, she failed. When asked to grasp them using a
“precision grip” between thumb and index finger, she succeeded.
Conversely, optic ataxia is a
syndrome produced by lesions in the dorsal stream. An optic ataxic patient, AT,
examined by Jeannerod et al. (1994), shows the reverse dissociation. While she
can recognize and identify the shape of visually presented objects, she has
serious visuomotor deficits: her reach is misdirected and her finger grip is
improperly adjusted to the size and shape of the target of her movements.
At bottom, DF turns out to be able
to visually process size, orientation and shape required for grasping objects,
i.e., in the context of a reaching and grasping action, but not in the
context of a perceptual judgment. Other experimental results with DF, however,
indicate that her visuomotor abilities are restricted in at least two respects.
First, in the context of an action, she turns out to be able to visually
process simple sizes, shapes and orientations. But she fails to visually
process more complex shapes. For example, she can insert a hand-held card into
a slot at different orientations. But when asked to insert a T-shaped object
(as opposed to a rectangular card) into a T-shaped aperture (as opposed to a simple
oriented slit), her performance deteriorates sharply. Inserting a T-shaped
object into a T-shaped aperture requires the ability to combine the
computations of the orientation of the stem with the orientation of the top of
the object together with the computation of the corresponding parts of the
aperture. There are good reasons to think that, unlike the quick visuomotor
processing of simple shapes, sizes and orientations, the computations of
complex contours, sizes and orientations require the contribution of visual
perceptual processes performed by the ventral stream — which, we know, has been
severely damaged in DF.
Secondly, the contours of an object
can be, and often are, computed by a process of extraction from differences in
colors and luminance cues. But normal humans can also extract the contours or
boundaries of an object from other cues — such as differences in brightness,
texture, shades and complex principles of Gestalt grouping and organization of
similarity and good form. Now, when asked to insert a hand-held card into a
slot defined by Gestalt principles of good form or by textural information, DF
failed (see e.g., Goodale, 1995).
Apperceptive
agnosic patients like DF raise the question: What is it like to see with an
intact dorsal system alone? I now want to emphasize what I take to be a
crucial characteristic of the content of visuomotor representations, one that emerges jointly
from the examination of DF’s condition and from the visuomotor representations
of normal subjects engaged in tasks of grasping illusory displays such as
Titchener circles. As I said above, a visual percept yields a representation of
the relative size and distance of various neighboring elements within a visual
array. I take it that it is of the essence of a percept that the processing of
such visual attributes of an object as its size, shape and position or distance
must be available for comparative judgment. By contrast, a visuomotor
representation of a target in a task of reaching and grasping provides
information about the absolute size of the object to be grasped. Crucially, the
spatial position of any object can be coded in at least two major coordinate
systems or frames of reference: it may be coded in an egocentric frame
of reference centered on the agent’s body or it may be coded in an allocentric
frame of reference centered on some object present in the visual array. The
former is required for allowing an agent to reach and grasp an object. The
latter is required in order to locate an object relative to some other object in
the visual display.
Consider
e.g., a visual percept of a glass to the left of a telephone. In the visual
percept, the location of the glass relative to the location of the telephone is
coded in allocentric coordinates. The visual percept has a pictorial content
that, I shall argue momentarily, is both informationally richer and more
fine-grained than the verbally expressible conceptual content of a different
representation of the same fact or state of affairs. For example, unlike the
sentence ‘The glass is to the left of the telephone’, the visual percept cannot
depict the location of the glass relative to the telephone without depicting
ipso facto the orientation, shape, texture, size and color of both the glass
and the telephone. Conceptual processing of the pictorial content of the visual
percept may yield a representation whose conceptual content can be expressed by
the English sentence ‘The glass is to the left of the telephone’. Now the
visuomotor representation of the glass as a target of a prehension action
requires that information about the size and shape of the glass be contained
within a representation of the position of the glass in egocentric coordinates.
Unless the telephone interferes with the trajectory of the reaching part of the
action of grasping the glass, when one intends to grasp the glass, one does not
need to represent the spatial position of the glass relative to the telephone.
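The contrast between the two frames of reference can be illustrated with a minimal Python sketch. The positions, the centimetre units, the axis convention and the helper `relative_to` are all assumptions made for illustration. The sketch also ignores the orientation of the agent’s body, which a realistic egocentric frame would have to include.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int  # centimetres, positive to the right (assumed convention)
    y: int  # centimetres, positive straight ahead (assumed convention)


def relative_to(target: Point, origin: Point) -> Point:
    """Express the target's position in a frame centered on the origin."""
    return Point(target.x - origin.x, target.y - origin.y)


# Hypothetical scene layout.
agent = Point(0, 0)
glass = Point(-20, 50)
telephone = Point(25, 50)

# Egocentric coding: the glass relative to the agent's body (what reaching needs).
egocentric_glass = relative_to(glass, agent)
# Allocentric coding: the glass relative to another object in the scene.
allocentric_glass = relative_to(glass, telephone)

print(egocentric_glass)         # Point(x=-20, y=50)
print(allocentric_glass)        # Point(x=-45, y=0)
print(allocentric_glass.x < 0)  # True: the glass is to the left of the telephone

# If the agent steps to the right, the egocentric code changes,
# while the allocentric relation between glass and telephone does not.
moved_agent = Point(30, 0)
print(relative_to(glass, moved_agent))  # Point(x=-50, y=50)
print(relative_to(glass, telephone))    # Point(x=-45, y=0)
```

The allocentric relation between glass and telephone never mentions the agent, which is why it is not needed for planning the reach unless the telephone lies on the hand’s path.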
We
know that patient DF cannot match the orientation of her wrist to the
orientation of a slot in the context of a perceptual task, i.e., when she is
not involved in the action of inserting a hand-held card into the slot. She
can, however, successfully insert a card into an oriented slot. She cannot
perceptually represent the size, shape and orientation of an object. However,
she can successfully grasp an object between her thumb and index finger. So the
main relevant contrast revealed by the examination of DF is that while she can
use an effector (e.g., the distance between her thumb and index finger or the
rotation of her wrist) in order to grasp an object or to insert a card into a
slot, i.e., in the context of an action, she cannot use the same effector to
express a perceptual judgment. What is the main difference between the
perceptual and the visuomotor tasks? Both tasks require that visual information
about the size and shape of objects be provided. But in the visuomotor task,
this information is contained in a representation of the spatial position of
the target coded in an egocentric frame of reference. In the perceptual
task, information about the size and shape of objects is contained in a
representation of the spatial position of the object coded in an allocentric
frame of reference. Normal subjects can easily switch from one spatial frame of
reference to the other. Such fast transformations may be required when e.g.,
one switches from counting items lying on a table or from drawing a copy of
items lying on a table to grasping one of them. However, DF’s visual system
cannot make the very same visual information about the size, shape and
orientation of an object available for perceptual comparisons. In DF,
information about the size and the shape of an object is trapped within a
visuomotor representation of its location coded in egocentric coordinates. It
is not available for recoding in an allocentric frame of reference. Coding
spatial relationships among different constituents of a visual scene is crucial
to forming a visual percept. By contrast, locating a target in egocentric
coordinates is crucial to forming a visuomotor representation on the basis of
which to act on the target.
II. Visual knowledge of the world
Although, if the above is on the right track,
not all human vision has been designed to allow visual perception, nonetheless
one crucial function of human vision is visual perception. Like many
psychological words, ‘perception’ can be used at once to refer both to a
process and to its product. There are two complementary sides to visual
perception: there is an objective side and a subjective side. On the objective
side, visual perception is a fundamental source of knowledge about the world.
Visual perception is indeed a — if not “the” — paradigmatic process by means of
which human beings gather knowledge about objects, events and facts in their
environment. On the subjective side, visual perception yields a peculiar kind
of awareness of the world, namely: sight. Sight has a special kind of
phenomenal character (which is lacking in blindsight patients). The
phenomenology of human visual experience is
unlike the phenomenology of human experience in sensory modalities other
than vision, e.g., touch, olfaction or audition.
On
my representationalist view (close to Dretske, 1995 and Tye, 1995), much of the
distinctive phenomenology of visual experience derives from the fact that the
human visual system has been selected in the course of evolution to respond to
a specific set of properties. Visual perception makes us aware of such
fundamental properties of objects as their size, orientation, shape, color,
texture, spatial position, distance and motion, all at once. One of the puzzles
that arises from neuroscientific research into the visual system (and which I
will not discuss here) is the question of how these various visual attributes
are perceived as bound together, given the fact that neuroscience has
discovered that they are processed in different areas of the human visual
system (see Zeki, 1993). Unlike vision, audition makes us aware of sounds.
Olfaction makes us aware of smells and odors. Touch makes us aware of pressure
and temperature. Although shape can be both seen and felt, what it is like to
see a shape is clearly different from what it is like to touch it. Part of
the reason for the difference lies in the fact that a normally sighted person
cannot see e.g., the shape of a cube without seeing its color. But by feeling
the shape of a cube, one does not thereby feel its color.
I will presently argue that visual perception
is a fundamental source of knowledge about the world: visual knowledge. I
assume that propositional knowledge is knowledge of facts and that one cannot
know a fact unless one believes that this fact obtains. I accept something like
Dretske’s (1969) distinction between two levels of visual perception:
nonepistemic perception (of objects) and epistemic perception (of facts).
Importantly, on my view, the nonepistemic perception of objects gives rise to
visual percepts and visual percepts are different from what I earlier called
visuomotor representations of the targets of one’s action. What Dretske (1969)
calls nonepistemic seeing is part of the perceptual processing of visual
information. In the previous section, I gave empirical reasons why visual
percepts differ from visuomotor representations. Unlike the visuomotor representation
of a target, a visual percept makes visual information about colors, shapes,
sizes and orientations of the constituents of a visual display available for
contrastive identification and recognition. This is why visual percepts can
serve as input to a conceptual process that can lead to a peculiar kind of
knowledge of the world — visual knowledge. Visual percepts serve as inputs to
conceptual processes, but percepts are not concepts: perceptual contrasts are
not conceptual contrasts. My present task then will be to show that the claim
that visual perception can give rise to visual knowledge of the world is
consistent with the claim that visual percepts are different from thoughts and
beliefs. Visual percepts lead to thoughts and beliefs, but it would be a
mistake to confuse the nonconceptual contents of visual percepts with the
conceptual contents of beliefs and thoughts.
II. 1. Percepts and thoughts
As
many philosophers of mind and language have argued, what is characteristic of
conceptual representations is that they are both productive and systematic.
Like sentences of natural languages, thoughts are productive in the sense that
they form an open-ended infinite set. Although the lexicon of a natural
language is made up of finitely many words, thanks to its syntactic rules, a
language contains indefinitely many well formed sentences. Similarly, an
individual may entertain indefinitely many conceptual thoughts. In particular,
both sentences of public languages and conceptual thoughts contain such devices
as negation, conjunction and disjunction. So one can form indefinitely many new
thoughts by prefixing a thought by a negation operator, by forming a
disjunctive or a conjunctive thought out of two simpler thoughts or one can
generalize a singular thought by means of quantifiers. Sentences of natural
languages are systematic in the sense that if a language contains a sentence S
with a syntactic structure, e.g., Rab, then it must also contain a syntactically
related sentence, e.g., Rba. An individual’s
conceptual thoughts are supposed to be systematic too: if a person has the
ability to entertain the thought that e.g., John loves Mary, then she must have
the ability to entertain the thought that Mary loves John. If a person can form
the thought that Fa, then she can form both the thought that Fb
and the thought that Ga (where “a” and “b” stand for individuals and “F”
and “G” stand for properties). Both Fodor’s (1975, 1987) Language of Thought
hypothesis and Evans’ (1982) Generality constraint are designed to account for
the productivity and the systematicity of thoughts, i.e., conceptual
representations. It is constitutive of thoughts that they are structured and
that they involve conceptual constituents that can be combined and recombined
to generate indefinitely many new structured thoughts. Thus, concepts are
building blocks with inferential roles.
Because
they are productive and systematic, conceptual thoughts can rise above the
limitations imposed on perceptual representations by the constraints inherent
in perception. Unlike thought, visual perception requires some causal
interaction between a source of information and some sensory organs. For
example, by combining the concepts horse and horn, one may form
the complex concept unicorn, even though no unicorn has ever been or ever will be
visually perceived (except in visual works of art). Although no unicorn has
ever been perceived, within a fictional context, on the basis of the
inferential role of its constituents, one can draw the inference that if
something is a unicorn, then it has four legs, it eats grass and it is a
mammal.
Hence,
to possess concepts is to master inferential relations: only a creature with
conceptual abilities can draw consequences from her perceptual processing of a
visual stimulus. Thought and visual perception are clearly different cognitive
processes. One can think about numbers and one can form negative, disjunctive,
conjunctive and general thoughts involving multiple quantifiers. Although one
can visually perceive numerals, one cannot visually perceive numbers. Nor can
one visually perceive negative, disjunctive, conjunctive or general facts
(corresponding to e.g., universally quantified thoughts).
As
Crane (1992: 152) puts it, “there is no such thing as deductive inference between
perceptions”. Upon seeing a brown dog, one can see at once that the animal one
faces is a dog and that it is brown. If one perceives a brown animal and
one is told that it is a dog, then one can certainly come to believe
that the brown animal is a dog or that the dog is brown. But on this hybrid
epistemic basis, one can think or believe, but one cannot see that the dog is
brown. One came to know that the dog is brown by seeing it. But one did not
come to know that what is brown is a dog by seeing it. Unlike the content of
concepts, the content of visual percepts is not a matter of inferential role.
As emphasized by Crane (ibid.), this is not to say that the content of
visual percepts is amorphous or unstructured. One proposal for capturing the nonconceptual
structure of visual percepts is Peacocke’s (1992) notion of a scenario content,
i.e., a visual way of filling in space. As we shall see momentarily, one can
think or believe of an animal that it is a dog without thinking or believing that
it has a particular color. But one cannot see a dog in good daylight conditions
without seeing its particular color (or colors). I shall momentarily discuss
this feature of the content of visual percepts, which is part of their
distinctive informational richness, as an analog encoding of information.
In
section I.3, I considered the contrast between the pictorial content of a
visual percept of a glass to the left of a telephone and the conceptual content
expressible by means of the English sentence: ‘The glass is to the left of the
telephone’. I noted that, unlike the English sentence, the visual percept
cannot represent the glass to the left of the telephone unless it depicts the
shape, size, texture, color and orientation of both the glass and the telephone.
I concluded that an utterance of this sentence conveys only part of the
pictorial content of the visual percept since the utterance is mute about any
visual attribute of the pair of objects other than their relative locations.
But, further conceptual processing of the conceptual content conveyed by the
utterance of the sentence may yield a more complex representation involving,
not just a two-place relation, but a three-place relation also
expressible by the English predicate ‘left of’. Thus, one may think that the
glass is to the left of the telephone for someone standing in front of
the window, not for someone sitting at the opposite side of the table.
In other words, one can think that the glass is to the left of the telephone
from one’s own egocentric perspective and that the same glass is to the right
of the telephone from a different perspective. Although one can form the thought
involving the ternary relation ‘left of’, one cannot see the glass as
being to the left of the telephone from one’s own egocentric perspective
because one cannot see one’s own egocentric perspective. Perspectives are not
things that one can see. This is an example of a conceptual contrast that could
not be drawn by visual perception. Thus, unlike a thought, a visual percept is,
in one sense of the word, “informationally encapsulated”. Thought, not
perception, can, as Perry (1993) puts it, increase the arity of a predicate.
Notice that percepts can cause thoughts. This is one way thoughts arise.
Thoughts can also cause other thoughts. But presumably, thoughts do not cause
percepts.
II. 2. The finegrainedness and informational richness of visual percepts
Unlike
thought, visual perception has a spatial, perspectival, iconic and/or pictorial
structure not shared by conceptual thought. The content of visual perception
has a spatial perspectival structure that pure thoughts lack. In order to apply
the concept of a dog, one does not have to occupy a particular spatial
perspective relative to any dog. But one cannot see a dog unless one occupies
some spatial standpoint or other relative to it: one cannot e.g., see a dog
simultaneously from the top and from below, from the front and from the back.
The concept of a dog applies indiscriminately to poodles, alsatians, dalmatians
or bulldogs. One can think that all dogs bark. But one cannot see all dogs
bark. Nor can one see a generic dog bark. One must see some particular dog: a
poodle, an alsatian, a dalmatian or a bulldog, as it might be. Although one and
the same concept — the concept of a dog — may apply to a poodle, an alsatian, a
dalmatian or a bulldog, seeing one of them is a very different visual
experience than seeing another. One can think that a dog barks without thinking
of any other properties of the dog. One cannot, however, see a dog unless one
sees its shape and the colors and texture of its hair.
Thus,
the content of visual perceptual representations turns out to be both more finegrained
and informationally richer than the conceptual contents of thoughts.
There are three paradigmatic cases in which the need to distinguish between
conceptual content and the nonconceptual content of visual perceptions may
arise. First, a creature may be perceptually sensitive to objective differences
for which she has no concepts. Secondly, two creatures may enjoy one and the
same visual experience, which they may be inclined to conceptualize
differently. Finally, two different persons may enjoy two distinct visual
experiences in the presence of one and the same distal stimulus to which they
may be inclined to apply one and the same concept.
Peacocke
(1992: 67-8) considers, for example, a person’s visual experience of a range of
mountains. As he notes, one might want to conceptualize one’s visual
experience with the help of concepts of shapes expressible in English with such
predicates as ‘round’ and ‘jagged’. But these concepts of shapes could apply to
the nonconceptual contents of several different visual experiences prompted by
the distinct shapes of several distinct mountains. Arguably, although a human
being might not possess any concept of shape whose finegrainedness could match
that of her visual experience of the shape of the mountain, her visual
experience of the shape is nonetheless distinctive and it may differ from the
visual experience of the distinct shape of a different mountain to which she
would apply the very same concept. Similarly, human beings are perceptually
sensitive to far more colors than they have color concepts and color names to
apply. Although a human being might lack two distinct concepts for two distinct
shades of color, she might well enjoy a visual experience of one shade that is
distinct from her visual experience of the other shade. As Raffman (1995: 295)
puts it, “discriminations along perceptual dimensions surpasses identification
[…] our ability to judge whether two or more stimuli are the same or different
surpasses our ability to type-identify them”.
Against
this kind of argument in favor of the nonconceptual content of visual
experiences, McDowell (1994, 1998) has argued that demonstrative concepts
expressible by e.g., ‘that shade of color’ are perfectly suited to capture the
finegrainedness of the visual percept of color. I am willing to concede to
McDowell that such demonstrative concepts do exist. But I agree with Bermudez
(1998: 55-7) and Dokic & Pacherie (2000) that such demonstrative concepts
would seem to be too weak to perform one of the fundamental jobs that color
concepts and shape concepts must be able to perform — namely recognition. Color
concepts and shape concepts stored in a creature’s memory must allow
recognition and reidentification of colors and shapes over long periods of
time. Although pure demonstrative color concepts may allow comparison of
simultaneously presented samples of color, it is unlikely that they can be used
to reliably reidentify one and the same sample over time. Nor presumably could
pairs of demonstrative color concepts be used to reliably discriminate pairs of
color samples over time. Just as one can track the spatio-temporal evolution of
a perceived object, one can store in a temporary object file information about
its visual properties in a purely indexical or demonstrative format. If,
however, information about an object’s visual properties is to be stored in
episodic memory, for future reidentification, then it cannot be stored in a
purely demonstrative or indexical format, which is linked to a particular
perceptual context. Presumably, the demonstrative must be fleshed out with some
descriptive content. One can refer to a perceptible object as ‘that sofa’ or
even as ‘that’ (followed by no sortal). But presumably when one does not stand
in a perceptual relation to the object, information about it cannot be stored
in episodic memory in such a pure demonstrative format. Rather, it must be
stored using a more descriptive symbol such as ‘the (or that) red sofa that
used to face the fire-place’. This is presumably part of what Raffman (1995:
297) calls “the memory constraint”. As Raffman (1995: 296) puts it:
the coarse grained character of perceptual memory explains why we can
recognize ‘determinable’ colors like red and blue and even scarlet and indigo as
such, but not ‘determinate’ shades of those determinables […] Because we
cannot recognize determinate shades as such, ostension is our only means
of communicating our knowledge of them. If I want to convey to you the precise
shade of an object I see, I must point to it, or perhaps paint you a picture of
it […] I must present you with an instance of that shade. You must have the
experience yourself.
Two
persons might enjoy one and the same kind of visual experience prompted by one
and the same shape or one and the same color, to which they would be inclined
to apply pairs of distinct concepts, such as ‘red’ vs ‘crimson’ or ‘polygon’ vs
‘square’. If so, it would be justified to distinguish the nonconceptual content
of their common visual experience from the different concepts that each would
be willing to apply. Conversely, as argued by Peacocke (1998), presented with
one and the same geometrical object, two persons might be inclined to apply one
and the same generic shape concept e.g., ‘that polygon’ and still enjoy
different perceptual experiences or see the same object as having different
shapes. For example, as Peacocke (1998: 381) points out, “one and the same
shape may be perceived as square, or as diamond-shaped […] the difference
between these ways is a matter of which symmetries of the shape are perceived;
though of course the subject himself does not need to know that this is the
nature of the difference”. If one mentally partitions a square by bisecting its
right angles, one sees it as a diamond. If one mentally partitions it by
bisecting its sides, one sees it as a square. Presumably, one does not need to
master the concept of an axis of symmetry to perform these two
bisections mentally and enjoy two distinct visual experiences.
The
distinctive informational richness of the content of visual percepts has been
discussed by Dretske (1981) in terms of what he calls the analogical
coding of information.[1] One and the same piece of
information — one and the same fact — may be coded analogically or digitally.
In Dretske’s sense, a signal carries the information that e.g., a is F
in a digital form iff the signal carries no additional information about a
that is not already nested in the fact that a is F. If the signal
does carry additional information about a that is not nested in the fact
that a is F, then the information that a is F is
carried by the signal in an analogical (or analog) form. For example, the
information that a designated cup contains coffee may be carried in a digital
form by the utterance of the English sentence ‘There is some coffee in the
cup’. The same information can also be carried in an analog form by a picture
or by a photograph. Unlike the utterance of the sentence, the picture cannot
carry the information that the cup contains coffee without carrying additional
information about the shape, size and orientation of the cup and the color and the
amount of coffee in it. As I pointed out above, unlike the concept of a dog,
the visual percept of a dog carries information about which dog one sees, its
spatial position, the color and texture of its hair, etc. The contents of
visual percepts are informationally rich in the sense of being analog. A
thought involving several concepts in a hierarchically structured order might
carry the same informational richness as a visual percept. But it does not have
to. As the slogan goes, a picture is worth a thousand words. Unlike a thought,
a visual percept of a cup cannot convey the information that the cup
contains coffee without conveying additional information about several visual
attributes of the cup.
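Dretske’s contrast can be put schematically. What follows is only a sketch in a notation that is not Dretske’s own: I(s, p) is assumed to abbreviate “the signal s carries the information that p”, N(q, p) is assumed to abbreviate “the information that q is nested in the information that p” (with the information that p trivially nested in itself), and q ranges over pieces of information about a.

```latex
\begin{align*}
\mathrm{Digital}(s, Fa) &\iff I(s, Fa) \land \forall q\,\bigl(I(s, q) \rightarrow N(q, Fa)\bigr)\\
\mathrm{Analog}(s, Fa)  &\iff I(s, Fa) \land \exists q\,\bigl(I(s, q) \land \lnot N(q, Fa)\bigr)
\end{align*}
```

On this reading, the utterance of ‘There is some coffee in the cup’ satisfies the first clause, whereas the picture of the cup satisfies the second, since it also carries information about the cup’s shape, size and orientation that is not nested in the fact that the cup contains coffee.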
The
arguments by philosophers of mind and by perceptual psychologists in favor of
the distinction between the conceptual content of thought and the nonconceptual
content of visual percepts are based on the finegrainedness and the
informational richness of visual percepts. Thus, it turns on the phenomenology
of visual experience. In section I, I provided some evidence from
psychophysical experiments performed on normal human subjects and from the
neuropsychological examination of brain-lesioned human patients that points to a
different kind of nonconceptual content, which I labelled “visuomotor” content.
Unlike the arguments in favor of the nonconceptual content of visual percepts,
the arguments for the distinction between the nonconceptual content of visual
percepts and the nonconceptual content of visuomotor representations do not
rely on phenomenology at all. Rather, they rely on the need to postulate mental
representations with visuomotor content in order to provide a causal
explanation of visually guided actions towards objects. Thus, on the
assumption that such behaviors as grasping objects can be actions (based on
mental representations), I submit that the nonconceptual content of visual
representation ought to be bifurcated into perceptual and visuomotor content as
in Figure 7:
conceptual content          nonconceptual content
                             /                \
           perceptual content                  visuomotor content

Figure 7
II. 3. The interaction between visual and non-visual knowledge
Traditional
epistemology has focused on the problem of sorting out genuine instances of
propositional knowledge from cases of mere opinion or guessing. Propositional
factual knowledge is to be distinguished from both nonpropositional knowledge
of individual objects (or what Russell called “knowledge by acquaintance”) and
from tacit knowledge of the kind illustrated by a native speaker’s implicit
knowledge of the grammatical rules of her language. According to
epistemologists, in the relevant propositional sense, what one knows are facts.
In the propositional sense, one cannot know a fact unless one believes that the
corresponding proposition is true, one’s belief is indeed true, and the belief
was not formed by mere fantasy. On the one hand, one cannot know that the cup
contains coffee unless one believes it. One cannot have this belief unless one
knows what a cup is and what coffee is. On the other hand, one cannot know what
is not the case: one can falsely believe that e.g., the cup contains coffee.
But one cannot know it, unless a designated cup does indeed contain some
coffee. True belief, however, is not sufficient for knowledge. If a true belief
happens to be a mere guess or whim, then it will not qualify as knowledge. What
else must be added to true belief to turn it into knowledge?
Broadly speaking, epistemologists
divide into two groups. According to externalists, a true belief counts as
knowledge if it results from a reliable process, i.e., a process that generates
counterfactually supporting connexions between states of a believer and facts
in her environment. According to internalists, for a true belief to count as
knowledge, it must be justified and the believer must in addition justifiably
believe that her first-order belief is justified. Since I am willing to claim that, in appropriate conditions, the
way a red triangle visually looks to a person having the relevant concepts
and located at a suitable distance from it provides grounds for the person to
know that the object in front of her is a red triangle, I am attracted to an
externalist reliabilist view of perceptual knowledge.
Although the issue is controversial
and is by no means settled in the philosophical literature, externalist
intuitions suit my purposes better than internalist intuitions. Arguably, it is
one thing to be justified or to have a reason for believing
something. It is another thing to use a reason in order to offer a
justification for one’s beliefs. Arguably, if a perceptual (e.g., visual)
process is reliable, then the visual appearances of things may constitute a
reason for forming a belief. However, one cannot use a reason unless one can
explicitly engage in a reasoning process of justification, i.e., unless one
can distinguish one’s premisses from one’s conclusion. Presumably, a creature
with perceptual abilities and relevant conceptual resources can have reasons
and form justified beliefs even if she lacks the concept of reason or
justification. However, she could not use her reasons and provide
justifications unless she had language and metarepresentational resources.
Internalism derives most of its appeal from reflection on instances of
mathematical and scientific knowledge that result from the conscious
application of explicit principles of inquiry by teams of individuals in the
context of special institutions. In such special settings, it can be safely
assumed that the justification of a believer’s higher-order beliefs does indeed
contribute to the formation and reliability of his or her first-order beliefs.
Externalism fits perceptual knowledge better than internalism and, unlike
internalism, it does not rule out the possibility of crediting non-human
animals and human infants with knowledge of the world — a possibility made more
and more vivid by the development of cognitive science.
On my view, human visual perceptual
abilities are at the service of thought and conceptualisation. At the most elementary
level, by seeing an object (or a sequence of objects) one can see a fact
involving that object (or sequence of objects). By seeing my neighbor’s car in
her driveway, I can see the fact that my neighbor’s car is parked in her
driveway. I thereby come to believe that my neighbor’s car is parked in her
driveway and this belief, which is a conceptually loaded mental state, is
arrived at by visual perception. Hence, my term “visual knowledge”. If my
visual system is — as I claimed it is — reliable, then by seeing my neighbor’s
car — an object — in her driveway, I thereby come to know that my neighbor’s
car is parked in her driveway — a fact. Hence, I come to know a fact involving
an object that I actually see. This is a fundamental epistemic situation, which
Dretske (1969) labels “primary epistemic seeing”: one’s visual ability allows
one to know a fact about an object one perceives.
However, if my neighbor’s car
happens to be parked in her driveway if and only if she is at home (and I know
this), then I can come to know a different fact: I can come to know that my
neighbor is at home. “Seeing” that my neighbor is at home by seeing that her
car is parked in her driveway is something different from seeing my neighbor at
home (e.g., seeing her in her living-room). Certainly, I can come to know that
my neighbor is at home by seeing her car parked in her driveway, i.e., without
seeing her. “Seeing” that my neighbor is at home by seeing that her car is
parked in her driveway is precisely what Dretske (1969) calls “secondary
epistemic seeing”. Secondary epistemic seeing lies at the interface between
pure visual knowledge of facts involving a perceived object and non-visual
knowledge that can be derived from it.
This transition from seeing one fact
to seeing another displays the hierarchical structure of visual knowledge. In
primary epistemic seeing, one sees a fact involving a perceived object. But in
moving from primary epistemic seeing to secondary epistemic seeing, one moves
from a fact involving a perceived car to a fact involving one’s unperceived
neighbor (who happens to own the perceived car). This epistemological
hierarchical structure is expressed by the “by” relation: one sees that y
is G by seeing that x is F where x ≠ y.
Although it may be more or less natural to say that one “sees” a fact involving
an unperceived object by seeing a different fact involving a perceived object,
the hierarchical structure that gives rise to this possibility is ubiquitous in
human knowledge.
One can see that a horse has walked
on the snow by seeing hoof prints in the snow. One sees the hoof prints, not
the horse. But if hoof prints would not be visible in the snow at time t
unless a horse had walked on that very snow at time t - 1, then one can
see that a horse has walked on the snow just by seeing hoof prints in the snow.
One can see that a tennis player has just hit an ace at Flushing Meadows by
seeing images on a television screen located in Paris. Now, does one really see
the tennis player hit an ace at Flushing Meadows while sitting in Paris and
watching television? Does one see a person on a television screen? Or does one
see an electronic image of a person relayed by a television? Whether one sees a
tennis player or her image on a television screen, it is quite natural to say
that one “sees that” a tennis player hit an ace by seeing her (or her image) do
it on a television screen. Even though, strictly speaking, one perhaps did not
see her do it — one merely saw pictures of her doing it —, nonetheless seeing
the pictures comes quite close to seeing the real thing. By contrast, one can
“see” that the gas-tank in one’s car is half-full by seeing, not the tank
itself, but the dial of the gas-gauge on the dashboard of the car. If one is
sitting by the steering wheel inside one’s car so that one can comfortably see
the gas-gauge, then one cannot see the gas-tank. Nonetheless, if the gauge is
reliable and properly connected to the gas-tank, then one can (perhaps in some
loose sense) “see” what the condition of the gas-tank is by seeing the dial of
the gauge.
One could wonder whether secondary
epistemic seeing is really seeing at all. Suppose that one learns that the New
York Twin Towers collapsed by reading about it in a French newspaper in Paris.
One could not see the New York Twin Towers — let alone their collapse — from
Paris. What one sees when one reads a newspaper are letters printed in black
ink on a white sheet of paper. But if the French newspaper would not report the
collapse of the New York Twin Towers unless the New York Twin Towers had indeed
collapsed, then one can come to know that the New York Twin Towers have
collapsed by reading about it in a French newspaper. There is a significant
difference between seeing that the New York Twin Towers have collapsed by seeing
it happen on a television screen and by reading about it in a newspaper. Even
if seeing an electronic picture of the New York Twin Towers is not seeing the
Twin Towers themselves, still the visual experience of seeing an electronic
picture of them and the visual experience of seeing them have a lot in common.
The pictorial content of the experience of seeing an electronically produced
color-picture of the Towers is very similar to the pictorial content of the
experience of seeing them. Unlike a picture, however, a verbal description of
an event has conceptual content, not pictorial content. The visual experience
of reading an article reporting the collapse of the New York Twin Towers in a
French newspaper is very different from the experience of seeing them collapse.
This is the reason why it may be a little awkward to say that one “saw” that
the New York Twin Towers collapsed if one read about it in a French newspaper
in Paris as opposed to seeing it happen on a television screen.
Certainly, ordinary usage of the
English word ‘see’ is not sacrosanct. We say that we “see” a number of things
in circumstances in which what we do owes little — if anything — to our visual
abilities. “I see what you mean”, “I see what the problem is” or “I finally saw
the solution” report achievements quite independent of visual perception. Such
uses of the verb ‘to see’ are loose uses. Such loose uses do not report
epistemic accomplishments that depend significantly on one’s visual endowments.
By contrast, cases of what Dretske (1969) calls secondary epistemic seeing are
epistemic achievements that do depend on one’s visual endowments. True, in
cases of secondary epistemic seeing, one comes to know a fact without seeing
some of its constituent elements. True, one could not come to learn that one’s
neighbor is at home by seeing her car parked in her driveway unless one knew
that her car is indeed parked in her driveway when and only when she is at
home. Nor could one see that the gas-tank in one’s car is half-full by seeing the
dial of the gas-gauge unless one knew that the latter is reliably correlated
with the former. So secondary epistemic seeing could not possibly arise in a
creature that lacked knowledge of reliable correlations or that lacked the
cognitive resources required to come to know them altogether.
Nonetheless, secondary epistemic
seeing does indeed have a crucial visual component in the sense that visual
perception plays a critical role in the context of justifying such an
epistemic claim. When one claims to be able to see that one’s neighbor
is at home by seeing her car parked in her driveway or when one claims to be
able to see that the gas-tank in one’s car is almost empty by seeing the
gas-gauge, one relies on one’s visual powers in order to ground one’s state of knowledge.
The fact that one claims to know is not seen. But the grounds
upon which the knowledge is claimed to rest are visual grounds: the
justification for knowing an unseen fact is seeing another fact correlated with
the former. Of course, in explaining how one can come to know a fact about one
thing by knowing a different fact about a different thing, one cannot hope to
meet the philosophical challenge of scepticism. From the standpoint of
scepticism, as Stroud (1989) points out, the explanation may seem to beg the
question since it takes for granted one’s knowledge of one fact in order to
explain one’s knowledge of another fact. But the important thing for present
purposes is that — scepticism notwithstanding — one offers a perfectly good
explanation of how one comes to know a fact about an object one does not
perceive by knowing a different fact about an object one does perceive. The
point is that much — if not all — of the burden of the explanation lies in
visual perception: seeing one’s neighbor’s car is the crucial step in
justifying one’s belief that one’s neighbor is at home. Seeing the gas-gauge is
the crucial step in justifying one’s belief that one’s tank is almost empty.
The reliability of visual perception is thus critically involved in the justification
of one’s knowledge claim. In cases of primary epistemic seeing, the reliability
of one’s visual system provides justifications for one’s visual knowledge in
the sense that it provides one with reasons for believing that the fact
involving an object one perceives obtains. In secondary epistemic seeing, one
claims to know a fact that does not involve a perceived object. Still, the
reliability of one’s visual system plays an indirect role in cases of secondary
epistemic seeing in the sense that it provides grounds for one’s visual
knowledge about a fact involving a perceived object, upon which one’s knowledge
of a fact not involving a perceived object rests.
Thus, secondary epistemic seeing
lies at the interface between an individual’s visual knowledge (i.e., knowledge
formed by visual means) and the rest of her knowledge. In moving from primary
epistemic seeing to secondary epistemic seeing, an individual exploits her
knowledge of regular connections. Although it is true that unless one knows the
relevant correlation, one could not come to know that one’s neighbor is at
home by seeing her car parked in her driveway, nonetheless one does not
consciously or explicitly reason from the perceptually accessible premiss that
one’s neighbor’s car is parked in her driveway together with the premiss that
one’s neighbor’s car is parked in her driveway when and only when one’s
neighbor is at home to the conclusion that one’s neighbor is at home. Arguably,
the process from primary to secondary epistemic seeing is inferential. But if
it is, then the inference is unconscious and it takes place at the
“sub-personal” level.
What the above discussion of
secondary epistemic seeing so far reveals is that the very description and
understanding of the hierarchical structure of visual knowledge and its
integration with non-visual knowledge requires an epistemological and/or
psychological distinction between seeing objects and seeing facts
— a point much emphasized in Dretske’s writings on the subject — or between nonepistemic
and epistemic seeing. The neurophysiology of human vision is such that some
objects are simply not accessible to human vision. They may be too small or too
remote in space and time for a normally sighted person to see them. For more
mundane reasons, a human being may be temporarily so positioned as not to be
able to see one object — be it her neighbor or the gas-tank in her car. Given
the correlations between facts, by seeing a perceptible object, one can get
crucial information about a different unseen object. Given the epistemic
importance of visual perception in the hierarchical structure of human
knowledge, it is important to understand how by seeing one object, one can
provide decisive reasons for knowing facts about objects one does not see.
II. 4. The scope and limits of visual knowledge
I now turn my attention again from
what Dretske calls secondary epistemic seeing (i.e., visually based knowledge
of facts about objects one does not perceive) back to what he calls primary
epistemic seeing, i.e., visual knowledge of facts about objects one does
perceive. When one purports to ground one’s claim to know that one’s neighbor
is at home by mentioning the fact that one can see that her car is parked in
her driveway, clearly one is claiming to be able to see a car, not one’s
neighbor herself. Now, let us concentrate on the scope of knowledge claims in
primary epistemic seeing, i.e., knowledge about facts involving a perceived
object. Let us suppose that someone claims to be able to see that the apple on
the table is green. Let us suppose that the person’s visual system is working
properly, the table and what is lying on it are visible from where the person
stands, and the lighting is suitable for the person to see them from where she
stands. In other words, there is a distinctive way the green apple on the table
looks to the person who sees it. Under those circumstances, when the
person claims that she can see that the apple on the table is green, what are
the scope and limits of her epistemic claims?
Presumably, in so doing, she is
claiming that she knows that there is an apple on the table in front of her and
that she knows that this apple is green. If she knows both of these, then
presumably she also knows that there is
a table under the apple in front of her, she knows that there is a fruit on the
table. Hence, she knows what the fruit on the table is (or what is on the
table), she knows where the apple is, she knows the color of the apple, and so
on. Arguably, the person would then be in a position to make all such claims in
response to the following various queries: is there anything on the table? What
is on the table? What kind of fruit is on the table? Where is the green apple?
What color is the apple on the table? If the person can see that the apple on
the table is green, then presumably she is in a position to know all these
facts.
However, when she claims that she
can see that the apple on the table is green, she is not thereby
claiming that she can see that all of these facts obtain. What she is
claiming is more restricted and specific than that: She is indeed claiming that
she knows that there is an apple on the table and that the apple in question is
green. Furthermore, she is claiming that she learnt the latter fact — the fact
about the apple’s color — through visual perception: if someone claims that she
can see that the apple on the table is green, then she is claiming that she has
achieved her knowledge of the apple’s color by visual means, and
not otherwise. But she is not thereby claiming that her knowledge of the
location of the apple or her knowledge of what is on the table has been
acquired by the very perceptual act (or the very perceptual process) that gave
rise to her knowledge of the apple’s color. Of course, the person’s alleged
epistemic achievement does not rule out the possibility that she came to know
that what is on the table is an apple by seeing it earlier. But if she did,
this is not part of the claim that she can see that the apple on the table is
green. It is consistent with this claim that the person came to know that what
is on the table is an apple by being told, by tasting it or by smelling it. All
she is claiming and all we are entitled to conclude from her claim is that the
way she learnt about the apple’s color is by visual perception.
The investigation into the scope and
limits of primary visual knowledge is important because it is relevant to the
challenge of scepticism. As I already said, my discussion of visual knowledge
does not purport to meet the full challenge of scepticism. In discussing
secondary epistemic seeing, I noted that in explaining how one comes to know
a fact about an unperceived object by seeing a different fact involving a
perceived object, one takes for granted the possibility of knowing the latter
fact by perceiving one of its constituent objects. Presumably, in so doing, one
cannot hope to meet the full challenge of scepticism that would question the
very possibility of coming to know anything by perception. I briefly turn to
the sceptical challenge to which claims of primary epistemic seeing are
exposed. By scrutinizing the scope and limits of claims of primary visual
knowledge, I want to examine briefly the extent to which such claims are indeed
vulnerable to the sceptical challenge. Claims of primary visual knowledge are
vulnerable to sceptical queries that can be directed backwards and forwards.
They are directed backwards when they apply to background knowledge, i.e.,
knowledge presupposed by a claim of primary visual knowledge. They are directed
forward when they apply to consequences of a claim of primary visual knowledge.
I turn to the former first.
Suppose a sceptic were to challenge
a person’s commonsensical claim that she can see (and hence know by perception)
that the apple on the table in front of her is green by questioning her grounds
for knowing that what is on the table is an apple. The sceptic might point out
that, given the limits of human visual acuity and given the distance of the
apple, the person could not distinguish by visual means alone a genuine green
apple — a green fruit — from a fake green apple (e.g., a wax copy of a green
apple or a green toy). Perhaps, the person is hallucinating an apple when there
is in fact nothing at all on the table. If one cannot visually discriminate a
genuine apple from a fake apple, then, it seems, one is not entitled to
claim that one can see that the apple on the table is green. Nor is one
entitled to claim that one can see that the apple on the table is green if one
cannot make sure by visual perception that one is not undergoing a
hallucination. Thus, the sceptical challenge is the following: if visual
perception itself cannot rule out a number of alternative possibilities to
one’s epistemic claim, then the epistemic claim cannot be sustained.
The proper response to the sceptical
challenge here is precisely to appeal to the distinction between claims of visual
knowledge and other knowledge claims. When the person claims that she can see
that the apple on the table is green, she is claiming that she learnt something
new by visual perception: she is claiming that she just gained new knowledge by
visual means. This new perceptually-based knowledge is about the apple’s color.
The perceiver’s new knowledge — her epistemic “increment”, as Dretske (1969)
calls it — must be pitched against what he calls her “proto-knowledge”, i.e.,
what the person knew about the perceived object prior to her perceptual
experience. The reason it is important to distinguish between a person’s prior
knowledge and her knowledge gained by visual perception is that primary
epistemic seeing (or primary visual knowledge) is a dynamic process. In
order to determine the scope and limits of what has been achieved in a
perceptual process, we ought to determine a person’s initial epistemic stage
(the person’s prior knowledge about an object) and her final epistemic stage
(what the person learnt by perception about the object). Thus, the question
raised by the sceptical challenge (directed backwards) is a question in cognitive
dynamics: How much new knowledge could a person’s visual resources yield, given
her prior knowledge? How much has been learnt by visual perception, i.e., in an
act of visual perception? What new information has been gained by visual
perception?
So when the person claims that she
can see that the apple on the table is green, she no doubt reports that she
knows both that there is an apple on the table and that it is green. She
commits herself to a number of epistemic claims: she knows what is on the table,
she knows that there is a fruit on the table, she knows where the apple is, and
so on. But she merely reports one increment of knowledge: she merely claims
that she just learnt by visual perception that the apple is green. She is not
thereby reporting how she acquired the rest of her knowledge about the object,
e.g., that it is an apple and that it is on the table. She claims that she can see
of the apple that it is green, not that what is green is an apple, nor
that what is on the table is an apple. The claim of primary visual knowledge
bears on the object’s color, not on its other properties (e.g., its being
an apple or a fruit, or its location). All her epistemic claim entails is
that, prior to her perceptual experience, she assumed (as part of her
“proto-knowledge” in Dretske’s sense) that there was an apple on the table and
then she discovered by visual perception that the apple was green.
I now turn my attention to the
sceptical challenge directed forward — towards the consequences of one’s claims
of visual knowledge. The sceptic is right to point out that the person who
claims to be able to see the color of an apple is not thereby in a position to
see that the object whose color she is seeing is a genuine apple — a fruit —
and not a wax apple. Nor is the person able to see that she is not
hallucinating. However, since she is neither claiming that she is able to see
of the green object that it is a genuine apple nor that she is not
hallucinating an apple, it follows that the sceptical challenge cannot hope to
defeat the person’s perceptual claim that she can see what she claims that she
can see, namely that the apple is green. On the externalist picture of
perceptual knowledge which I accept, a person knows a fact when and only when
she is appropriately connected to the fact. Visual perception provides a
paradigmatic case of such a connexion. Hence, visual knowledge arises from
regular correlations between states of the visual system and environmental
facts. Given the intricate relationship between a person’s visual knowledge and
her higher cognitive functions, she will be able to draw many inferences from
her visual knowledge. If a person knows that the apple in front of her is
green, then she may infer that there is a colored fruit on the table in front
of her. Given that fruits are plants and that plants are physical objects, she
may further infer that there are at least some physical objects. Again, the
sceptic may direct his challenge forward: the person claims to know by visual
means that the apple in front of her is green. But what she claims she knows
entails that there are physical objects. Now, the sceptic argues, a person
cannot know that there are physical objects — at least, she cannot see
that there are. According to the sceptic, failure to see that there are
physical objects entails failure to see that the apple on the table is green.
A
person claims that she can know proposition p by visual perception.
Logically, proposition p entails proposition q. There could not
be a green apple on the table unless there exists at least one physical object.
Hence, the proposition that the apple on the table is green could not be true
unless there were physical objects. According to the sceptic, a person could
not know the former without knowing the latter. Now the sceptic offers grounds
for questioning the claim that the person knows proposition q at all —
let alone by visual perception. Since it is dubious that she does know the
latter, then, according to scepticism, she fails to know the former. Along with
Dretske (1969) and Nozick (1981), I think that the sceptic relies on the
questionable assumption that visual knowledge is deductively closed. From the
fact that a person has perceptual grounds for knowing that p, it does
not follow that she has the same grounds for knowing that q, even if q
logically follows from p. If visual perception allows one to get
connected in the right way to the fact corresponding to proposition p,
it does not follow that visual perception ipso facto allows one to get connected
in the same way to the fact corresponding to proposition q even if q
follows logically from p.
A
person comes to know a fact by visual perception. What she learns by visual
perception implies a number of propositions (such as that there are physical objects).
Although such propositions are logically implied by what the person learnt by
visual perception, she does not come to know by visual perception all the
consequences of what she learnt by visual perception. She does not know by
visual perception that there are physical objects — if she knows it at all.
Seeing a green apple in front of one has a distinctive visual phenomenology.
Seeing that the apple in front of one is green, too, has a distinctive visual
phenomenology. There is something distinctively visual about what it is like
for one to see that the apple in front of one is green. If an apple is green,
then it is colored. However, it is dubious whether there is a visual
phenomenology to thinking of the apple in front of one that it is colored. A fortiori,
it is dubious whether there is a visual phenomenology to thinking that there
are physical objects. Hence, contrary to what the sceptic assumes, I want to
claim, as Dretske (1969) and Nozick (1981) have, that visual knowledge is not
deductively closed.
III. The role of visuomotor representations in the human cognitive architecture
In the present section, I shall sketch my
reasons for thinking that visuomotor representations do not lead to detached
knowledge of the world. Rather, they serve as input to intentions in at least
two respects: on the one hand, they provide visual guidance to what I shall
call “motor intentions”. On the other hand, they provide visual information for
“causally indexical” concepts. I will start by laying out the basic distinction
between two different kinds of “direction of fit” that can be exemplified by
mental representations.
III.1. Direction of fit
Whereas
visual percepts serve as inputs to the “belief box”, visuomotor
representations, I now want to argue, serve as inputs to a different kind of
mental representation, i.e., intentions. As emphasized by Anscombe (1957) and
Searle (1983, 2001), perceptions, beliefs, desires and intentions each have a
distinctive kind of intentionality. Beliefs and desires have what Searle calls
“opposite direction of fit”. Beliefs have a mind-to-world direction of fit:
they can be true or false. A belief is true if and only if the world is as the
belief represents it to be. It is the function of beliefs to match facts or
actual states of affairs. In forming a belief, it is up to the mind to meet the
demands of the world. Unlike beliefs, desires have a world-to-mind direction of
fit. Desires are neither true nor false: they are fulfilled or frustrated. The
job of a desire is not to represent the world as it is, but rather as the agent
would like it to be. Desires are representations of goals, i.e., possible
nonactual states of affairs. In entertaining a desire, it is, so to speak, up to
the world to meet the demands of the mind. The agent’s action is supposed to
bridge the gap between the mind’s goal and the world.
As
Searle (1983, 2001) has noticed, perceptual experiences and intentions have
opposite directions of fit. Perceptual experiences have the same mind-to-world
direction of fit as beliefs. Intentions have the same world-to-mind direction
of fit as desires. In addition, perceptual experiences and intentions have
opposite directions of causation: whereas a perceptual experience represents
the state of affairs that causes it, an intention causes the state of affairs
that it represents.
Although
intentions and desires share the same world-to-mind direction of fit,
intentions are different from desires in a number of important respects, which
all flow from the peculiar commitment to action of intentions. Broadly
speaking, desires are relevant to the process of deliberation that precedes
one’s engagement in a course of action. Once an intention is formed, however,
the process of deliberation comes to an end. To intend is to have made up one’s
mind about whether to act. Once an intention is formed, one has taken the
decision whether to act. I shall mention four main differences between desires
and intentions.
First, although desires may be about anything
or anybody, intentions are always about the self. One can only intend oneself
to do something. Second, unlike desires, intentions are tied to the present or
the future: one cannot intend to do something in the past. Third, unlike the
contents of desires, the contents of intentions must be about possible
nonactual states of affairs. An agent cannot intend to achieve a state of
affairs that she knows to be impossible at the time when she forms her
intention. Finally, although one may entertain desires whose contents are
inconsistent, one cannot have two intentions whose contents are inconsistent.
Reaching
and grasping objects are visually guided actions directed towards objects. I
assume that all actions are caused by intentions. Intentions are psychological
states with a distinctive intentionality. As I said earlier, intentions derive
their peculiar commitment to action from the combination of their distinctive
world-to-mind direction of fit and their distinctive mind-to-world direction of
causation. I shall now argue that visuomotor representations have a dual
function in the human cognitive architecture: they serve as inputs to “motor
intentions” and they serve as inputs to a special class of indexical concepts,
the “causally indexical” concepts.
III.2. Visuomotor representations serve as inputs to motor intentions
Not
all actions, I assume, are caused by what Searle (1983, 2001) calls prior intentions,
but all actions are caused by what he calls intentions in action, which,
following Jeannerod (1994), I will call motor intentions. Unlike prior
intentions, motor intentions are directed towards immediately accessible goals.
Hence, they play a crucial role, not so much in the planning of action as in
the execution, the monitoring and the control of the ongoing action. Arguably,
prior intentions may have conceptual content. Motor intentions do not. For
example, one intends to climb a visually perceptible mountain. The content of
this prior intention involves e.g., the action concept of climbing and a visual
percept of the distance, shape and color of the mountain. In order to climb the
mountain, however, one must intentionally perform an enormous variety of
postural and limb movements in response to the slant, orientation and the shape
of the surface of the slope. Human beings automatically assume the right
postures and perform the required flexions and extensions of their feet and
legs. Since they do not possess concepts matching each and every such
movement, their non-deliberate intentional behavioral responses to the slant,
orientation and shape of the surface of the slope are monitored by the nonconceptual
nonperceptual content of motor intentions.
Not
just any sensory representation can match the peculiar commitment to action of motor
intentions. Visuomotor representations can. Percepts are informationally richer
and more fine-grained than either concepts or visuomotor representations. As I
claimed above, visual percepts have the same mind-to-world direction of fit as
beliefs. This is why visual percepts are suitable inputs to a process of selective
elimination of information, whose ultimate conceptual output can be stored in
the belief box.
I
shall presently argue that visuomotor representations have a different
function: they provide the relevant visual information about the properties of
a target to an agent’s motor intentions. Indeed, I want to think of the role of
the visuomotor representation of a target for action as Gibson (1979) thought
of an affordance. However, unlike Gibson (1979), who did not make a
distinction between perceptual and visuomotor processing, I do not think of the
visuomotor processing of a target as a “direct pick up of information”. I think
that visuomotor representations are genuine representations. My main
reason for thinking of the output of the visuomotor processing of a target as a
genuine mental representation (and for thinking of grasping as a genuine
action, not a behavioral reflex) is that Haffenden, Schiff & Goodale’s
(2001) experiment suggests that the visuomotor processing of a target can be fooled
by features of the visual display: it can be led to process two dimensional
cues as if they were three dimensional obstacles. If the output of the
visuomotor processing of a display can misrepresent it, then it represents it.
Unlike
visual percepts whose single role is to present visual information for further
processing the output of which will be stored in the belief box, visuomotor
representations are hybrid: as Millikan (1996), who calls them “pushmi-pullyu
representations”, has perceptively recognized, they have a dual role. I slightly
depart from Millikan (1996), however, in that, unlike her, I assume that
visuomotor representations, not motor intentions, have a double direction of
fit. Visuomotor representations present states of affairs as both facts and goals
for immediate action. On the one hand, they provide visual information for the
benefit of motor intentions. On the other hand, their content can be
conceptualized with the help of a special class of indexical concepts: causal
indexicals. Whereas visual percepts must be stripped of much of their
informational richness to be conceptualized, visuomotor representations can
directly provide relevant visual information about the target of an action to
motor intentions. To put it crudely, it follows from the work summarized in
Jeannerod (1994, 1997) that the content of a motor intention has two sides: a
subjective side and an objective side. On the subjective side, a motor
intention represents the agent’s body in action. On the objective side, it
represents the target of the action. Visuomotor representations contribute to
the latter. Their ‘motoric’ informational encapsulation makes them suitable for
this role. The nonconceptual nonperceptual content of a visuomotor
representation matches that of a motor intention.
Borrowing
from the study of language processing, Jeannerod (1994, 1997) has drawn a
distinction between the semantic and the pragmatic processing of
visual stimuli. The view I want to put forward has been well expressed by
Jeannerod (1997: 77): “at variance with the […] semantic processing, the
representation involved in sensorimotor transformation has a predominantly
‘pragmatic’ function, in that it relates to the object as a goal for action,
not as a member of a perceptual category. The object attributes are represented
therein to the extent that they trigger specific motor patterns for the hand to
achieve the proper grasp”. Thus, the crucial feature of the pragmatic
processing of visual information is that its output is a suitable input to the
nonconceptual content of motor intentions.
III.3. Visuomotor representations serve as inputs to causal indexicals
I
have just argued that what underlies the contrast between the pragmatic and the
semantic processing of visual information is that, whereas the output of the
latter is designed to serve as input to further conceptual processing with a
mind-to-world direction of fit, the output of the former is designed to match
the nonconceptual content of motor intentions with a world-to-mind direction of
fit and a mind-to-world direction of causation. The special features of the
nonconceptual contents of visuomotor representations can be inferred from the
behavioral responses which they underlie, as in patient DF. They can also be
deduced from the structure and content of elementary action concepts with the
help of which they can be categorized.
I
shall presently consider a subset of elementary action concepts, which,
following Campbell (1994), I shall call “causally indexical” concepts.
Indexical concepts are shallow but indispensable concepts, whose references
change as the perceptual context changes and whose function is to encode
temporary information. The indexical concepts expressed by ‘I’,
‘today’ and ‘here’ are, respectively, personal, temporal and spatial indexicals. Arguably,
their highly contextual content cannot be replaced by pure definite
descriptions without loss. Campbell (1994: 41-51) recognizes the existence of
causally indexical concepts whose references may vary according to the causal
powers of the agent who uses them. Such concepts are involved in judgments
having, as Campbell (1994: 43) puts it, “immediate implications for [the
agent’s] action”. Concepts such as “too heavy”, “out of reach”, “within my
reach”, “too large”, “fit for grasping between index and thumb” are causally
indexical concepts in Campbell’s sense.
Campbell’s
idea of causal indexicality does capture a kind of judgment that is
characteristically based upon the output of the pragmatic (or motor) processing
of visual stimuli in Jeannerod’s (1994, 1997) sense. Unlike the content of the
direct output of the pragmatic processing of visual stimuli or that of motor
intentions, the content of judgments involving causal indexicals is conceptual.
Judgments involving causally indexical concepts have low conceptual content,
but they have conceptual content nonetheless. For example, if something is
categorized as “too heavy”, then it follows that it is not light enough. The
nonconceptual content of either visuomotor representations or motor intentions
is better compared with that of an affordance in Gibson’s sense.
Causally
indexical concepts differ in one crucial respect from other indexical concepts,
i.e., personal, temporal and spatial indexical concepts. Thoughts involving
personal, temporal and spatial indexical concepts are “egocentric” thoughts
in the sense that they are perception-based thoughts. This is obvious
enough for thoughts expressible with either of the personal pronouns ‘I’ or
‘you’. To refer to a location as ‘here’ or ‘there’ and to refer to a day as
‘today’, ‘yesterday’ or ‘tomorrow’ is to refer respectively to a spatial and a
temporal region from within some egocentric perspective: a location can only be
referred to as ‘here’ or ‘there’ from some particular spatial egocentric
perspective. A temporal region can only be referred to by ‘today’, ‘yesterday’
or ‘tomorrow’ from some particular temporal egocentric perspective. In this
sense, personal, temporal and spatial indexical concepts are egocentric
concepts.[2] Arguably, egocentric indexicals lie at the interface between visual
percepts and an individual’s conceptual repertoire about objects, times and
locations.
Many
philosophers (see e.g., Kaplan, 1989 and Perry, 1993) have argued that
personal, temporal and spatial indexical and/or demonstrative concepts play a
special “essential” and ineliminable role in the explanation of action. And so
they do. As Perry (1993: 33) insightfully writes: “I once followed a trail of
sugar on a supermarket floor, pushing my cart down the aisle on one side of a
tall counter and back the aisle on the other, seeking the shopper with the torn
sack to tell him he was making a mess. With each trip around the counter, the
trail became thicker. But I seemed unable to catch up. Finally it dawned on me.
I was the shopper I was trying to catch”. To believe that the shopper with a
torn sack is making a mess is one thing. To believe that oneself is
making a mess is something else. Only upon forming the thought expressible by
‘I am making a mess’ is it at all likely that one may take appropriate measures
to change one’s course of action. It is one thing to believe that the meeting
starts at 10:00 AM. It is another thing to believe that the meeting starts now,
even if now is 10:00 AM. Not until one thinks that the meeting starts now will
one get up and run. Consider someone standing still at an intersection, lost in
a foreign city. It is one thing for that person to intend to go to her hotel.
It is something else to intend to go this way, not that way. Only after she
has formed the latter intention, with its demonstrative locational content, will
she start walking.
Thus,
such egocentric concepts as personal, temporal and spatial indexicals and/or
demonstratives derive their ineliminable role in the explanation of action from
the fact that their recognitional role cannot be played by any purely descriptive
concept. Recognition involves a contrast but it can be achieved without
recourse to a uniquely specifying definite description. Indexicals and
demonstratives are mental pointers that can be used to refer to objects, places
and times. Personal indexicals are involved in the recognition of persons.
Temporal indexicals are involved in the recognition of temporal regions or
instants. Spatial indexicals are involved in the recognition of locations. To
recognize oneself as the reference of ‘I’ is to make a contrast with the
recognition of the person one addresses in verbal communication as ‘you’. To
identify a day as ‘today’ is to contrast it with other days that might be
identified as ‘yesterday’, ‘the day before yesterday’, ‘tomorrow’, etc. To
identify a place as ‘here’ is to contrast it with other places referred to as
‘there’.
Although
indexicals and demonstratives are concepts, they have non-descriptive
conceptual content. The conceptual system needs such indexical concepts because
it lacks the resources to supply a purely descriptive symbol, i.e., a symbol
that could uniquely identify a person, a time or a place. A purely descriptive
concept would be a concept that a unique person, a unique time or a unique
place would satisfy by uniquely exemplifying each and every one of its constituent
features. We cannot specify the references of our concepts all the way down by
using uniquely identifying descriptions on pain of circularity. If, as Pylyshyn
(2000: 129) points out, concepts need to be “grounded”, then on pain of
circularity, “the grounding [must] begin at the point where something is picked
out directly by a mechanism that works like a demonstrative” (or an indexical).
If concepts are to be hooked to or locked onto objects, times and places, then
on pain of circularity, definite descriptions will not supply the locking
mechanism.
Personal,
temporal and spatial indexicals owe their special explanatory role to the fact
that they cannot be replaced by purely descriptive concepts. Although they
allow recognition by nondescriptive means, their direction of application is
mind-to-world. Causally indexical concepts, however, play a different
role altogether. Unlike personal, temporal and spatial indexical concepts,
causally indexical concepts have a distinctive quasi-deontic or quasi-evaluative
content. I want to say that, unlike that of other indexicals, the direction of
fit of causal indexicals is hybrid: it is partly mind-to-world, partly
world-to-mind. To categorize a target as “too heavy”, “within reach” or “fit
for grasping between index and thumb” is to judge or evaluate the parameters of
the target as conducive to a successful action upon the target. Unlike the
contents of other indexicals, the content of a causally indexical concept
results from the combination of an action predicate and an evaluative operator.
What makes it indexical is that the result of applying the latter
to the former is relative to the agent who makes the application. Thus, the
job of causally indexical concepts is not just to match the world but to play
an action-guiding role. If so, then presumably causal indexicals have at
best a hybrid direction of fit, not a pure mind-to-world direction of fit.
In
the previous section, I argued that, unlike visual percepts, visuomotor
representations provide visual information to motor intentions, which have
nonconceptual content, a world-to-mind direction of fit and a mind-to-world
direction of causation. I am presently arguing that the visual information of
visuomotor representations can also serve as input to causally indexical
concepts, which are elementary contextually dependent action concepts.
Judgments involving causally indexical concepts have at best a hybrid direction
of fit. When an agent makes such a judgment, he is not merely stating a fact:
he is not thereby coming to know a fact that holds independently of his causal
powers. Rather, he is settling, accepting or making up his mind on an action plan.
The function of causally indexical concepts is precisely to allow an agent to
make action plans. Whereas personal, temporal and spatial indexicals lie at the
interface between visual percepts and an individual’s conceptual repertoire about
objects, times and places, causally indexical concepts lie at the interface
between visuomotor representations, motor intentions and what Searle calls
prior intentions. Prior intentions have conceptual content: they involve action
concepts. Thus, after conceptual processing via the channel of causally
indexical concepts, the visual information contained in visuomotor
representations can be stored in a conceptual format adapted to the content and
the direction of fit of one’s intentions — if not one’s motor intentions, then
perhaps one’s prior intentions. Hence, the output of the motor processing of
visual inputs can serve as input to further conceptual processing whose output
will be stored in the “intention box”.
References
Aglioti, S., De Souza,
J.F.X. and Goodale, M.A. (1995) “Size-contrast illusions deceive the eye but
not the hand”, Current Biology, 5, 6, 679-85.
Anscombe, G.E.M.
(1957) Intention, Ithaca: Cornell University Press.
Austin, J. L.
(1962) Sense and Sensibilia, Oxford: Clarendon Press.
Bermudez,
J. (1998) The Paradox of Self-Consciousness, Cambridge, Mass.: MIT
Press.
Bridgeman,
B., Hendry, D. & Stark, L. (1975) “Failure to detect displacement of the
visual world during saccadic eye movement”, Vision Research, 15, 719-22.
Campbell,
J. (1994) Past, Space and Self, Cambridge, Mass.: MIT Press.
Carey,
D.P., Harvey, M. & Milner, A.D. (1996) “Visuomotor sensitivity for shape
and orientation in a patient with visual form agnosia”, Neuropsychologia,
34, 329-37.
Castiello,
U., Paulignan, Y. & Jeannerod, M. (1991) “Temporal dissociation of motor
responses and subjective awareness. A study in normal subjects”, Brain,
114, 2639-2655.
Crane,
T. (1992) “The nonconceptual content of experience” in Crane, T. (ed.)(1992) The
Contents of Experience, Cambridge: Cambridge University Press.
Dokic,
J. & Pacherie, E. (2001) “Shades and concepts”, Analysis, 61, 3,
193-202.
Dretske,
F. (1981) Knowledge and the Flow of Information, Cambridge, Mass.: MIT
Press.
Dretske,
F. (1995) Naturalizing the Mind, Cambridge, Mass.: MIT Press.
Evans,
G. (1982) The Varieties of Reference, Oxford: Oxford University Press.
Farah,
M. (1990) Visual Agnosia: Disorders of Object Recognition and What They
Tell Us About Normal Vision, Cambridge, Mass.: MIT Press.
Fodor,
J.A. (1987) Psychosemantics, Cambridge, Mass.: MIT Press.
Franz,
V.H., Gegenfurtner, K.R., Bülthoff, H.H. and Fahle, M. (2000) “Grasping visual
illusions: no evidence for a dissociation between perception and action”, Psychological
Science, 11, 1, 20-25.
Gibson,
J.J. (1979) The Ecological Approach to Visual Perception, Boston:
Houghton Mifflin.
Goodale,
M. A. (1995) “The cortical organization of visual perception and visuomotor
control”, in Osherson, D. (ed.)(1995) An Invitation to Cognitive Science,
Visual Cognition, vol. 2, Cambridge, Mass.: MIT Press.
Goodale,
M.A., Pélisson, D., Prablanc, C. (1986) “Large adjustments in visually guided
reaching do not depend on vision of the hand or perception of target
displacement”, Nature, 320, 748-50.
Goodale,
M. A., Milner, A.D., Jakobson, L.S. and Carey, D.P. (1991) “A neurological
dissociation between perceiving objects and grasping them”, Nature, 349,
154-56.
Haffenden, A. M.
& Goodale, M. (1998) “The effect of pictorial illusion on prehension and
perception”, Journal of Cognitive Neuroscience, 10, 1, 122-36.
Haffenden,
A.M., Schiff, K.C. & Goodale, M.A. (2001) “The dissociation between
perception and action in the Ebbinghaus illusion: non-illusory effects of
pictorial cues on grasp”, Current Biology, 11, 177-181.
Jacob,
P. (1997) What Minds Can Do, Cambridge: Cambridge University Press.
Jeannerod,
M. (1984) “The timing of natural prehension movements”, Journal of Motor
Behavior, 16, 235-54.
Jeannerod,
M. (1994) “The representing brain: neural correlates of motor intentions”, Behavioral
and Brain Sciences,
Jeannerod,
M. (1997) The Cognitive Neuroscience of Action, Oxford: Blackwell.
Jeannerod,
M., Decety, J. and Michel, F. (1994) “Impairment of grasping movements
following bilateral posterior parietal lesions”, Neuropsychologia, 32,
369-80.
Kaplan,
D. (1989) “Demonstratives”, in Almog, J., Perry, J. & Wettstein, H.
(eds.)(1989) Themes from Kaplan, New York: Blackwell.
McDowell,
J. (1994) Mind and World, Cambridge, Mass.: Harvard University
Press.
McDowell,
J. (1998) “Précis of Mind and World” and “Reply to commentators”, Philosophy
and Phenomenological Research, LVIII, 2, 365-68, 403-31.
Millikan,
R.G. (199 ) “Pushmi-pullyu Representations”, in J. Tomberlin (ed.) Philosophical
Perspectives, vol. IX, Atascadero, CA.
Milner,
D. & Goodale, M.A. (1995) The Visual Brain in Action, Oxford: Oxford
University Press.
Milner,
D., Paulignan, Y., Dijkerman, H.C., Michel, F. and Jeannerod, M. (1999) “A
paradoxical improvement of misreaching in optic ataxia: new evidence for two
separate neural systems for visual localization”, Proc. of the Royal Society,
266, 2225-9.
Nozick,
R. (1981) “Knowledge and scepticism”, in Bernecker, S. & Dretske, F.
(eds.)(2000) Knowledge, Readings in Contemporary Epistemology, Oxford:
Oxford University Press.
Pavani,
F., Boscagli, I., Benvenuti, F., Rabuffetti, M. & Farné, A. (1999) “Are
perception and action affected differently by the Titchener circles illusion?”, Experimental
Brain Research, 127, 95-101.
Peacocke,
C. (1992) A Study of Concepts, Cambridge, Mass.: MIT Press.
Peacocke,
C. (1998) “Nonconceptual content defended”, Philosophy and Phenomenological
Research, LVIII, 2, 381-88.
Perry,
J. (1979) “The essential indexical”, in Perry, J. (1993).
Perry,
J. (1986a) “Perception, action and the structure of believing”, in Perry, J.
(1993).
Perry,
J. (1986b) “Thought without representation”, in Perry, J. (1993).
Perry,
J. (1993) The Problem of the Essential Indexical and Other Essays,
Oxford: Oxford University Press.
Pisella, L. et al. (2000)
“An ‘automatic pilot’ for the hand in human posterior parietal cortex: toward
reinterpreting optic ataxia”, Nature Neuroscience, 3, 7, 729-36.
Pylyshyn,
Z. (2000) “Visual indexes, preconceptual objects and situated vision”, Cognition,
80, 127-58.
Rossetti,
Y. & Pisella, L. (2000) “Common mechanisms in perception and action”, in
Prinz, W. & Hommel, B. (eds.)(2000) Attention and Performance, XIX,
Oxford: Oxford University Press.
Searle,
J. (1983) Intentionality, Cambridge: Cambridge University Press.
Searle,
J. (2001) Rationality in Action, Cambridge, Mass.: MIT Press.
Stroud,
B. (1989) “Understanding human knowledge in general”, in Bernecker, S. &
Dretske, F. (eds.)(2000) Knowledge, Readings in Contemporary Epistemology,
Oxford: Oxford University Press.
Tye,
M. (1995) Ten Problems about Consciousness, Cambridge, Mass.: MIT
Press.
Ungerleider, L.G.
& Mishkin, M. (1982) “Two cortical visual systems”, in Ingle, D.J.,
Goodale, M.A. & Mansfield, R.J.W. (eds.) Analysis of visual behavior,
MIT Press.
Weiskrantz, L.
(1986) Blindsight. A Case Study and Implications, Oxford: Oxford
University Press.
Weiskrantz, L.
(1997) Consciousness Lost and Found, Oxford: Oxford University Press.
Zeki, S. (1993) A Vision of the Brain,
Oxford: Blackwell.
[1] For discussion, see Jacob
(1997, ch. 2).
[2] The egocentricity of indexical concepts should not be confused
with the egocentricity of an egocentric frame of reference in which the visual
system codes e.g., the location of a target. The former is a property of
concepts. The latter is a property of visual representations. One crucial
difference between the egocentricity of indexical concepts and the
egocentricity of an egocentric frame of reference for coding the spatial
location of a target is that, unlike the latter, the former involves a
contrast: if, e.g., something is here, it is not there.