\title {Origins of Mind \\ Lecture 06}

\maketitle

# Lecture 06

\def \ititle {Lecture 06}
\begin{center}
{\Large
\textbf{\ititle}
}

\iemail %
\end{center}

\section{Crossing the Gap}


the question

How do humans first come to know simple facts about physical objects, colours, minds and the rest?
1. Core knowledge exists. Core knowledge is real: infants have unexpectedly sophisticated abilities concerning physical objects and categorical colour properties (and much more) even from the first year of life.
2. There is a gap between core knowledge and knowledge knowledge. It takes months if not years between clear manifestations of core knowledge and knowledge knowledge.
3. Crossing the gap involves social interactions, perhaps involving words.
Having core knowledge of something does not involve having any knowledge knowledge at all. Here I'm going to use the term ‘concept of X’ for that which enables one to have knowledge of Xs. How do we get from core knowledge to concepts?
Core knowledge enables one to distinguish things. For example it enables one to distinguish those things which are blue from those which are not; it enables one to distinguish those events which are causal interactions from those which are not; it enables one to distinguish those sets which have two members from others; and it enables one to distinguish different beliefs about the location of an object (say).
(Here I'm using core knowledge in the broad, schematic sense to refer to representations which are knowledge-like but not knowledge.)
I conjecture that core knowledge facilitates acquisition of the correct use of a word, perhaps very slowly. The idea is that being able to discriminate things allows one to apply a label to them.
Importantly we can discriminate without having concepts. If one thought that all discrimination involved concepts, this picture would become circular.
How does core knowledge enable one to correctly use words? I think it modifies the overall phenomenal character of your experience, typically by generating phenomenal expectations (which I called perceptual expectations earlier in this version of the course). Tuning in to the perceptual expectations can take a long time, which is why there may be a long interval between observing core knowledge and observing the correct use of words.
I also conjecture that using the word facilitates concept acquisition. Many people would probably agree. But how does it do this?
My schematic suggestion is that using the word draws attention to all the things which are Xs. The concept is acquired when you are struck by the question, What do all these have in common?
(Clearly this is not an account of how thinking gets started at all; the appeal to reflection should make this obvious.)
We have quite good evidence for this picture in the cases of colour and number, and there is relevant evidence in the case of mindreading too. (Also speech: phonological awareness is linked to literacy and the particulars of the written language learnt, so that alphabetic languages give a different profile --- alphabet is roughly labelling phonemes.)
The question we were looking at last week is how children come to correctly use words.
This is about the step from discrimination to learning the correct use of a verbal label.
So there's you and you're observing a sequence of stimuli and thanks to core knowledge you're able to discriminate them.
And now along comes another person. What are they doing? Nothing yet. But ...
Oh look they're labelling stimuli. So now the blue ones (say) are special. You respond to them in one way and the other responds to them in her way, which is by labelling.
Now you can observe that your responses are correlated with her responses. So when you discriminate in a certain way, she applies the label. Observing this correspondence enables you to learn the label (say). This is triangulation roughly as Davidson describes it.
And having got this far you can ask yourself what all the things labelled have in common.
Gradually build up from understanding minds and actions to words.
*for bk: include \citep{meyer:2016_monitoring} on crawling infants’ (informative about relations between performance and observation)

## Action: The Basics

\section{Action: The Basics}

Our first question is, When do human infants first track goal-directed actions and not just movements? In examining nonlinguistic communication, we've assumed that infants from around 11 months of age can produce and comprehend informative pointing. This commits us to saying that they have understood action.

When do human infants first track goal-directed actions
rather than mere movements?

Background assumption

‘intention attribution and action understanding are two separable processes’

Uithol and Paulus, 2014 p. 617

Background assumption: ‘intention attribution and action understanding are two separable processes’ \citep[p.~617]{uithol:2014_what}.
When do human infants first track goal-directed actions and not just movements?
Here's a classic experiment from way back in 1995.
The subjects were 12-month-old infants.
They were habituated to this sequence of events.
There was also a control group who were habituated to a display like this one but with the central barrier moved to the right, so that the action of the ball is 'non-rational'.

Gergely et al 1995, figure 1

For the test condition, infants were divided into two groups. One saw a new action, ...
... the other saw an old action.
Now if infants were considering the movements only and ignoring information about the goal, the 'new action' (movement in a straight line) should be more interesting because it is most different.
But if infants are taking goal-related information into account, the 'old action' might be unexpected and so might generate greater dishabituation.

Gergely et al 1995, figure 3

Gergely et al 1995, figure 5

‘by the end of the first year infants are indeed capable of taking the intentional stance (Dennett, 1987) in interpreting the goal-directed behavior of rational agents.’
\citep[p.\ 184]{Gergely:1995sq}
‘12-month-old babies could identify the agent’s goal and analyze its actions causally in relation to it’
\citep[p.\ 190]{Gergely:1995sq}
You might say it's bizarre to have used balls in this study: surely that can't show us anything about infants' understanding of action.
But adult humans naturally interpret the movements of even very simple shapes in terms of goals.
So using even very simple stimuli doesn't undermine the interpretation of these results.

Heider and Simmel, figure 1

Consider a further experiment by \citet{Csibra:2003jv}, also with 12-month-olds. This is just like the first ball-jumping experiment except that here infants see the action but not the circumstances in which it occurs. Do they expect there to be an object in the way behind that barrier?

Csibra et al 2003, figure 6

Consider a related study by Woodward and colleagues.
(It's good that there is converging evidence from different labs, using quite different stimuli.)

Woodward et al 2001, figure 1

'Six-month-olds and 9-month-olds showed a stronger novelty response (i.e., looked longer) on new-goal trials than on new-path trials (Woodward 1998). That is, like toddlers, young infants selectively attended to and remembered the features of the event that were relevant to the actor’s goal.'
\citep[p.\ 153]{woodward:2001_making}

from three months of age

Using a manipulation we’ll discuss later (‘sticky mittens’), \citet{sommerville:2005_action} used this paradigm to show that even three-month-olds can form expectations based on the goal of an action (for another study with three-month-olds, see also \citealp{luo:2011_threemonthold}).

Daum et al, 2012 figure 1

\citet{daum:2012_actions} adapted Woodward et al’s paradigm so that they could simultaneously measure looking time and anticipatory looking.
You can see that there’s a round occluder in the middle of the display. The agent, a red fish, moves behind this on its way to visit one of two objects.
In Experiment 1, first a small fish moves towards one object six times (familiarization). Then the locations of the objects were swapped and infants’ responses were measured during six test trials. In these trials, the agent (red fish) moved on the same path towards a new goal three times; and it did the converse three times.
Just as you'd expect given Woodward et al 2001, 9-month-old infants looked longer when the fish (agent) moved towards a new goal (same path) than when it moved towards the same goal on a new path.
BUT where were the infants looking in anticipation of the agent’s reappearance? It turns out that, on the whole, they were looking as if the agent were moving on the same path to a new goal. So there’s a dissociation between anticipatory looking and violation-of-expectations measures for 9-month-olds.
‘The results of the analysis of the infants’ eye movements contrast the looking time results and showed at the age of 12 months (and less reliably at the age of 9 months) infants predicted the reappearance of the agent based on the location of the goal during an observed action and that it was not until the age of 3, that this dissociation disappeared and that children predicted the reappearance of the agent after occlusion based on goal identity’ \citep[p.~9]{daum:2012_actions}.
‘Early in life, action expectations measured online seem to be organized around goal locations whereas action expectations measured post-hoc around goal identities. With increasing age, children then generally organize their action expectations primarily around goal identities’ \citep[p.~10]{daum:2012_actions}

Daum et al, 2012 figure 2 (part)

In a follow-up experiment, \citet{daum:2012_actions} considered anticipatory looking with these stimuli for a range of ages including adulthood.
As you can see, adults’ anticipatory looking is generally to the same-goal but different-path location, which is also what 3-year-olds do.
Why is there a dissociation between anticipatory looking and violation-of-expectations measures for 9-month-olds?
One possibility noted by \citet{daum:2012_actions} is that anticipatory looking requires rapid computation of the goal and its consequences for movement. It may be that infants simply cannot compute the goal in the few hundred milliseconds available for anticipatory looking.
This doesn’t sound very interesting, but note that 9-month-old infants seem to be making anticipatory looks. This suggests that they may be guided by contingency information in making their anticipatory looks. This is potentially important as a clue that action anticipation is driven by a mixture of goal ascription and statistical information.
Also these findings need cautious interpretation. Cf \citep[p.~7]{ambrosini:2013_looking}: ‘some visual anticipation studies show that 12-month-olds and even 10-month-olds, but not 6-month-olds, can predictively gaze to the goal position when observing displacement actions [9,12], while some others demonstrate that even 6-month-olds show anticipatory fixations to the goal of observed actions [34,35].’ See also \citet{cannon:2012_infants}.

Green et al, 2016 figure 1 (part)

Chopsticks vs spoons, Sweden vs China, 8-month-olds.
Would they look at the bowl when the chopsticks or spoon was held? (No!)
Would they look at the mouth when the chopstick or spoon was loaded with a cracker? Let’s have a look ...

Green et al, 2016 figure 1 (part)

Green et al, 2016 figure 1 (part)

Green et al, 2016 figure 2

\citep[p.~743]{green:2016_culture}: ‘Consistent with prior findings, the current study also demonstrates that infants need to have the ability to perform similar actions themselves. In other words, culture appears to work together with the observers’ own motor plans in order to facilitate goal prediction. Infants predict the goal of actions that are similar to their own motoric capability (like putting things in their mouth), but only for actions performed with a tool frequently occurring in the cultural context in which they live. The influence of one’s own motor capability is expressed primarily as a lack of prediction during the picking up food part of the action and the presence of prediction during self-oriented actions (i.e., bringing objects to the mouth and eating).’
Figure caption: ‘Mean gaze latency (negative numbers = prediction) to arrive at the goal when observing eating actions directed toward the mouth using a spoon (squares) and chopsticks (circles) in China and Sweden. Error bars represent standard errors.’

## The Teleological Stance

\section{The Teleological Stance}


How?

Infants can identify goals from around six months of age.

first specify the problem to be solved: goal ascription

Let me first specify the problem to be solved.
As this illustrates, some actions are purposive in the sense that, among all their actual and possible consequences, there are outcomes to which they are directed.
In such cases we can say that the actions are clearly purposive.
It is important to see that the third item---representing the directedness---is necessary.
This is quite simple but very important, so let me slowly explain why goal ascription requires representing the directedness of an action to an outcome.
Imagine two people, Ayesha and Beatrice, who each intend to break an egg. Acting on her intention, Ayesha breaks her egg. But Beatrice accidentally drops her egg while carrying it to the kitchen. So Ayesha and Beatrice perform visually similar actions which result in the same type of outcome, the breaking of an egg; but Beatrice's action is not directed to the outcome of her action whereas Ayesha's is.
Goal ascription requires the ability to distinguish between Ayesha's action and Beatrice's action. This requires representing not only actions and outcomes but also the directedness of actions to outcomes.
This is why I say that goal ascription requires representing the directedness of an action to an outcome, and not just representing the action and the outcome.

requirements on a solution to the problem ...

Next consider requirements on a solution to the problem.

Requirements:

(1) reliably: R(a,G) when and only when a is directed to G

(3) R(a,G) is detectable without any knowledge of mental states

R(a,G) =df a causes G?

R(a,G) =df a is caused by an intention to G?

R(a,G) =df a has the teleological function G?

R(a,G) =df a ‘is seen as the most justifiable action towards [G] that is available within the constraints of reality’?

How about taking $R$ to be causation? That is, how about defining $R(a,G)$ as $a$ causes $G$? This proposal does not meet the first criterion, (1), above. We can see this by mentioning two problems.
[*Might skip over-generate and discuss that as a problem for Rationality/Efficiency]

First problem: actions typically have side-effects which are not goals. For example,
%---not a good example because can't be avoided by any account
%--- (would require attribution of desire)
%For example, walking to the corner results in me warming up, in me expending energy, and in me being at the corner.
%Sometimes I walk to the corner for exercise,
%so that being at the corner is an unwanted side-effect (I then have to walk back).
%And sometimes I walk to the corner to be at the corner (so that expending energy is an unwanted side-effect, I'd rather have been chauffeured there).
suppose that I walk over here with the goal of being next to you. This action has lots of side-effects:
\begin{itemize}
\item I will be at this location.
\item I will expend some energy.
\item I will be further away from the front.
\end{itemize}
These are all causal consequences of my action. But they are not goals to which my action is directed. So this version of $R$ will massively over-generate goals.

Second problem: actions can fail. [...] So this version of $R$ will under-generate goals.
R(a,G) =df a is caused by an intention to G?
\citet{Premack:1990jl} writes:

‘in perceiving one object as having the intention of affecting another, the infant attributes to the object [...] intentions’

\citep[p.\ 14]{Premack:1990jl}

Premack, 1990 p. 14

‘infants understand intentions as existing independently of particular concrete actions and as residing within the individual. [This] is essential to recovering intentions from observed actions’

Woodward, 2009 p. 53

\citep[p.~53]{woodward:2009_infants}
Woodward et al qualify this view elsewhere
‘to the extent that young infants are limited [...], their understanding of intentions would be quite different from the mature concept of intentions’ \citep[p.\ 168]{woodward:2001_making}.
By contrast, Gergely et al reject this possibility ...

‘by taking the intentional stance the infant can come to represent the agent’s action as intentional without actually attributing a mental representation of the future goal state’

Gergely et al 1995, p. 188

\citep[p.\ 188]{Gergely:1995sq}
Btw, it isn't clear that this proposal can work (as introduced by Dennett, the intentional stance involves ascribing mental states), as these authors probably realised later, but the point about not representing mental states is good.
The requirement that R(a,G) be detectable without any knowledge of mental states is not met. Why impose this requirement? Imagine you are a three-month-old infant. Let’s assume that you know what intentions are and can represent them. Still, on what basis can you determine the intentions behind another’s actions? You can’t communicate linguistically with them. In fact it seems that the only access you have to another’s intentions is via the actions they perform. Now let’s suppose that to identify the goals of the actions you have to identify their intentions. Then you have to identify intentions on the basis of mere joint displacements and bodily configurations. This is quite challenging. How much easier it would be if you had a way of identifying the goals of the actions independently of ascribing intentions. Then you would be able to first identify the goals of actions and then use information about goals to ascribe intentions to the agent.
Why not define $R$ in terms of teleological function? This would enable us to meet the first condition but not the second. How could we tell whether an action happens because it brought about a particular outcome in the past? This might be done with insects. But it can't so easily be done with primates, who have a much broader repertoire of actions.

aside: what is a teleological function?

What do we mean by teleological function?
Here is an example:
\begin{quote}

Atta ants cut leaves in order to fertilize their fungus crops (not to thatch the entrances to their homes) \citep{Schultz:1999ps}

\end{quote}
What does it mean to say that the ants’ grass cutting has this goal rather than some other? According to Wright: \begin{quote}

‘S does B for the sake of G iff: (i) B tends to bring about G; (ii) B occurs because (i.e. is brought about by the fact that) it tends to bring about G.’ (Wright 1976: 39)

\end{quote}
For instance:
\begin{quote}

The Atta ant cuts leaves in order to fertilize iff: (i) cutting leaves tends to bring about fertilizing; (ii) cutting leaves occurs because it tends to bring about fertilizing.

\end{quote}
The Teleological Stance:

‘an action can be explained by a goal state if, and only if, it is seen as the most justifiable action towards that goal state that is available within the constraints of reality’

\citep[p.~255]{Csibra:1998cx}

Csibra & Gergely, 1998 p. 255

1. Consider goals to which the action might be directed.

2. For each goal, determine how justifiable the observed actions are as a means to achieving that goal.

3. Ascribe the goal with the highest rationality score.
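
The three steps above can be sketched in code. This is only an illustrative sketch, not Csibra and Gergely's own formulation: it assumes that 'how justifiable' an action is can be approximated by an efficiency ratio (cost of the best available action divided by cost of the observed action), and the function names, goal labels and cost values are all hypothetical.

```python
import math
from typing import Dict, Tuple


def rationality_score(observed_cost: float, best_available_cost: float) -> float:
    """Score how justifiable the observed action is as a means to a goal.

    Assumption (not from the lecture): justifiability is approximated by an
    efficiency ratio. It is 1.0 when the observed action is as cheap as the
    best available one and falls towards 0.0 as the action gets more wasteful.
    """
    if observed_cost <= 0:
        raise ValueError("observed_cost must be positive")
    return best_available_cost / observed_cost


def ascribe_goal(candidates: Dict[str, Tuple[float, float]]) -> str:
    """Steps 1-3: score each candidate goal and return the highest scorer.

    `candidates` maps a goal label to (observed_cost, best_available_cost).
    """
    scores = {
        goal: rationality_score(observed, best)
        for goal, (observed, best) in candidates.items()
    }
    return max(scores, key=scores.get)


# Hypothetical costs loosely modelled on the ball-jumping display.
# With the barrier in place, jumping is the cheapest way to reach the large ball.
# math.inf marks an outcome the observed action does not bring about at all.
barrier_present = {
    "reach the large ball": (5.0, 5.0),
    "land on the far side of the display": (5.0, 2.5),
    "stay where it is": (math.inf, 1.0),
}
print(ascribe_goal(barrier_present))
```

Note that the sketch simply takes the costs as given. Where those costs come from, and whether observer and agent agree on them, is exactly the issue about matching observer and agent.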

Requirements:

(1) reliably: R(a,G) when and only when a is directed to G

(3) R(a,G) is readily detectable without any knowledge of mental states

R(a,G) =df a causes G?

R(a,G) =df a causes G?

R(a,G) =df a ‘is seen as the most justifiable action towards [G] that is available within the constraints of reality’?

It will work if we can match observer and agent: both must be ‘equally optimal’. But how can we ensure this?
How good is the agent at optimising the rationality, or the efficiency, of her actions? And how good is the observer at identifying the optimality of actions in relation to outcomes? \textbf{ If there are too many discrepancies between how well the agent can optimise her actions and how well the observer can detect optimality, then these principles will fail to be sufficiently reliable}.

How?

Infants can identify goals from around six months of age.

The Teleological Stance is a proposed solution.

## Marr’s Threefold Distinction

\section{Marr’s Threefold Distinction}


If I apply the Teleological Stance successfully, do I thereby come to know a fact about the goal of an action?

To answer this question, we need to get beyond the Teleological Stance and consider the representations and algorithms that underpin it. Let me explain.
Consider Csibra & Gergely’s own answer, which is ‘Yes!’

‘when taking the teleological stance one-year-olds apply the same inferential principle of rational action that drives everyday mentalistic reasoning about intentional actions in adults’

(György Gergely and Csibra 2003; cf. Csibra, Bíró, et al. 2003; Csibra and Gergely 1998: 259)

\citet[p.~22ff]{Marr:1982kx} distinguishes:
\begin{itemize}
\item computational description---What is the thing for and how does it achieve this?
\item representations and algorithms---How are the inputs and outputs represented, and how is the transformation accomplished?
\item hardware implementation---How are the representations and algorithms physically realised?
\end{itemize}
One possibility is to appeal to David Marr’s famous three-fold distinction between levels of description of a system: the computational theory, the representations and algorithm, and the hardware implementation.
This is easy to understand in simple cases. To illustrate, consider a GPS locator. It receives information from four satellites and tells you where on Earth the device is.
There are three ways in which we can characterise this device.

1. computational description

First, we can explain how in theory it is possible to infer the device’s location from the signals it receives from satellites. This involves a bit of maths: given time signals from four different satellites, you can work out what time it is and how far you are away from each of the satellites. Then, if you know where the satellites are and what shape the Earth is, you can work out where on Earth you are.
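
As a rough sketch of the maths involved (the symbols are introduced here for illustration only): suppose satellite $i$ is at a known position $\mathbf{s}_i$ and sends a signal at time $t_i$, the device is at an unknown position $\mathbf{x}$ and receives the signals at an unknown true time $t$, and $c$ is the speed of light. Ignoring atmospheric delays and other corrections, each signal yields one equation:
\[
\lVert \mathbf{x} - \mathbf{s}_i \rVert = c\,(t - t_i), \qquad i = 1, \dots, 4.
\]
There are four unknowns (the three coordinates of $\mathbf{x}$, plus $t$), which is why four satellites are needed: four equations in general suffice to fix both the time and the distances to the satellites.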

-- What is the thing for and how does it achieve this?

The computational description tells us what the GPS locator does and what it is for. It also establishes the theoretical possibility of a GPS locator.
But merely having the computational description does not enable you to build a GPS locator, nor to understand how a particular GPS locator works. For that you also need to identify representations and algorithms ...

2. representations and algorithms

At the level of representations and algorithms we specify how the GPS receiver represents the information it receives from the satellites (for example, it might in principle be a number, a vector or a time). We also specify the algorithm the device uses to compute the time and its location. The algorithm will be different from the computational theory: it is a procedure for discovering time and location. The algorithm may involve all kinds of shortcuts and approximations. And, unlike the computational theory, constraints on time, memory and other limited resources will be evident.
So an account of the representations and algorithms tells us ...

-- How are the inputs and outputs represented, and how is the transformation accomplished?

3. hardware implementation

The final thing we need to understand the GPS locator is a description of the hardware in which the algorithm is implemented. It’s only here that we discover whether the device is a narrowly mechanical device, using cogs, say, or an electronic device, or some new kind of biological entity.

-- How are the representations and algorithms physically realised?

The hardware implementation tells us how the representations and algorithms are realised physically.

Marr (1982, 22ff)

How is this relevant to the teleological stance? It provides a computational description of goal ascription. (Whereas the Motor Theory provides an account of the representations and algorithms.)
The teleological stance provides a computational description of goal ascription.
For deeper insight into goal ascription, we need an account of representations and algorithms.
Compare our research on infants’ abilities concerning physical objects. Spelke’s principles of object perception provide a computational description of infants’ abilities to segment, etc. But to understand the nature of these abilities and their relation to knowledge, and to explain the otherwise puzzling patterns of development, we needed to identify representations and algorithms. (We did this by appeal to the operations of a system of object indexes.)
The Teleological Stance:

‘an action can be explained by a goal state if, and only if, it is seen as the most justifiable action towards that goal state that is available within the constraints of reality’

\citep[p.~255]{Csibra:1998cx}

Csibra & Gergely, 1998 p. 255

1. Consider goals to which the action might be directed.

2. For each goal, determine how justifiable the observed actions are as a means to achieving that goal.

3. Ascribe the goal with the highest rationality score.

The teleological stance is a computational description. What’s the algorithm?

‘when taking the teleological stance one-year-olds apply the same inferential principle of rational action that drives everyday mentalistic reasoning about intentional actions in adults’

(\citealp{Gergely:2003gb}; compare \citealp{Csibra:2003jv}, \citealp[p.~259]{Csibra:1998cx} )

(György Gergely and Csibra 2003; cf. Csibra, Bíró, et al. 2003; Csibra and Gergely 1998: 259)

Csibra and Gergely seem aware that this would make the Teleological Stance quite complex to apply ...

‘Such calculations require detailed knowledge of biomechanical factors that determine the motion capabilities and energy expenditure of agents. However, in the absence of such knowledge, one can appeal to heuristics that approximate the results of these calculations on the basis of knowledge in other domains that is certainly available to young infants.

For example, the length of pathways can be assessed by geometrical calculations, taking also into account some physical factors (like the impenetrability of solid objects).

Similarly, the fewer steps an action sequence takes, the less effort it might require, and so infants’ numerical competence can also contribute to efficiency evaluation.’

Csibra & Gergely, forthcoming ms p. 8

What heuristics?
Csibra and Gergely’s newer proposal seems to assume the inferential integration of core systems. But principles governing object indexes are not typically available for general reasoning.

Is there an alternative?

Flanagan and Johansson, 2003 figure 1 (part)

\citet{Flanagan:2003lm} showed that ‘patterns of eye–hand coordination are similar when performing and observing a block stacking task’.

Costantini et al, 2012

‘We recorded proactive eye movements while participants observed an actor grasping small or large objects. The participants' right hand either freely rested on the table or held with a suitable grip a large or a small object, respectively. Proactivity of gaze behaviour significantly decreased when participants observed the actor reaching her target with a grip that was incompatible with respect to that used by them to hold the object in their own hand.’
Follow ups: tie hands = \citet{ambrosini:2012_tie}; TMS (impair) = \citet{costantini:2013_how}.
Planning-like processes in action observation have also been demonstrated by measuring observers' predictive gaze. If you were to observe just the early phases of a grasping movement, your eyes might jump to its likely target, ignoring nearby objects \citep{ambrosini:2011_grasping}. These proactive eye movements resemble those you would typically make if you were acting yourself \citep{Flanagan:2003lm}. Importantly, the occurrence of such proactive eye movements in action observation depends on your representing the outcome of an action motorically; even temporary interference in the observer's motor abilities will interfere with the eye movements \citep{Costantini:2012fk}.
In human adults, motor representations and processes enable anticipatory looking that is driven by goal ascription \citep[e.g.][]{Costantini:2012fk,ambrosini:2012_tie}.

How?

Motor representations occurring in action observation sometimes facilitate the identification of goals.

What are those motor representations doing here?

The Motor Theory of Goal Ascription:

goal ascription is acting in reverse

The idea is that we could solve the problem--the problem of matching optimisation in planning actions with optimisation in predicting them--by supposing that a single set of mechanisms is used twice, once in planning action and once again in observing them.
What does this require?

-- in action observation, possible outcomes of observed actions are represented

-- these representations trigger planning as if performing actions directed to the outcomes

-- such planning generates predictions

predictions about joint displacements and their sensory consequences

-- a triggering representation is weakened if its predictions fail
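
The four requirements above can be put together in a toy sketch. Everything in it is an illustrative assumption rather than part of the Motor Theory itself: positions are one-dimensional, the hypothetical planner `plan_reach` predicts straight-line movement, and a representation is 'weakened' by halving a strength value whenever its prediction misses the observed position by more than a tolerance.

```python
from typing import Dict, List


def plan_reach(start: float, target: float, steps: int) -> List[float]:
    """Hypothetical planner: predicted positions for a straight reach to target."""
    return [start + (target - start) * k / steps for k in range(1, steps + 1)]


def observe_action(
    observed: List[float],
    outcomes: Dict[str, float],
    start: float = 0.0,
    tolerance: float = 0.1,
    weakening: float = 0.5,
) -> Dict[str, float]:
    """Return the strength of each outcome representation after observation.

    `outcomes` maps an outcome label to the position an action directed to
    that outcome would end at. Each represented outcome triggers planning
    (plan_reach), planning yields predictions, and a representation is
    weakened each time its prediction fails.
    """
    steps = len(observed)
    strengths = {label: 1.0 for label in outcomes}
    predictions = {
        label: plan_reach(start, target, steps)
        for label, target in outcomes.items()
    }
    for k, seen in enumerate(observed):
        for label in strengths:
            if abs(predictions[label][k] - seen) > tolerance:
                strengths[label] *= weakening
    return strengths


# Hypothetical example: the observed hand moves steadily towards the mug.
strengths = observe_action(
    observed=[0.25, 0.5, 0.75, 1.0],
    outcomes={"grasp mug": 1.0, "grasp pen": -1.0},
)
ascribed = max(strengths, key=strengths.get)
print(ascribed, strengths)
```

The point of the sketch is only that the same planning routine that could generate a reach is reused to generate predictions about an observed one, so the outcome representation that survives is the one whose plan fits what is seen.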

The proposal is not specific to the idea of motor representations and processes, although there is good evidence for it (which I won't cover here because we're in Milan!)

Sinigaglia & Butterfill 2015, figure 1

There is evidence that a motor representation of an outcome can cause a determination of which movements are likely to be performed to achieve that outcome \citep[see, for instance,][]{kilner:2004_motor, urgesi:2010_simulating}. Further, the processes involved in determining how observed actions are likely to unfold given their outcomes are closely related, or identical, to processes involved in performing actions. This is known in part thanks to studies of how observing actions can facilitate performing actions congruent with those observed, and can interfere with performing incongruent actions \citep{ brass:2000_compatibility, craighero:2002_hand, kilner:2003_interference, costantini:2012_does}. Planning-like processes in action observation have also been demonstrated by measuring observers' predictive gaze. If you were to observe just the early phases of a grasping movement, your eyes might jump to its likely target, ignoring nearby objects \citep{ambrosini:2011_grasping}. These proactive eye movements resemble those you would typically make if you were acting yourself \citep{Flanagan:2003lm}. Importantly, the occurrence of such proactive eye movements in action observation depends on your representing the outcome of an action motorically; even temporary interference in the observer's motor abilities will interfere with the eye movements \citep{Costantini:2012fk}. These proactive eye movements also depend on planning-like processes; requiring the observer to perform actions incongruent with those she is observing can eliminate proactive eye movements \citep{Costantini:2012uq}. This, then, is further evidence for planning-like motor processes in action observation.
So observers represent outcomes motorically and these representations trigger planning-like processes which generate expectations about how the observed actions will unfold and their sensory consequences. Now the mere occurrence of these processes is not sufficient to explain why, in action observation, an outcome represented motorically is likely to be an outcome to which the observed action is directed.
To take a tiny step further, we conjecture that, in action observation, \textbf{motor representations of outcomes are weakened to the extent that the expectations they generate are unmet} \citep[compare][]{Fogassi:2005nf}. A motor representation of an outcome to which an observed action is not directed is likely to generate incorrect expectations about how this action will unfold, and failures of these expectations to be met will weaken the representation. This is what ensures that there is a correspondence between outcomes represented motorically in observing actions and the goals of those actions.

How?

Motor representations occurring in action observation sometimes facilitate the identification of goals.

Now we’ve solved this: the Motor Theory of Goal Ascription is the solution.
See \citet{sinigaglia:2015_goal_ascription} for an outline of the Motor Theory of Goal Ascription.
Recall David Marr’s famous three-fold distinction between levels of description of a system: the computational theory, the representations and algorithm, and the hardware implementation.

1. computational description

-- What is the thing for and how does it achieve this?

2. representations and algorithms

-- How are the inputs and outputs represented, and how is the transformation accomplished?

3. hardware implementation

-- How are the representations and algorithms physically realised?

Marr (1982, 22ff)

The teleological stance provides a computational description of goal ascription.
The Teleological Stance:

‘an action can be explained by a goal state if, and only if, it is seen as the most justifiable action towards that goal state that is available within the constraints of reality’

\citep[p.~255]{Csibra:1998cx}

Csibra & Gergely, 1998 p. 255

1. Consider goals to which the action might be directed.

2. For each goal, determine how justifiable the observed actions are as a means to achieving that goal.

3. Ascribe the goal with the highest rationality score.
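
The three steps above can be sketched in code. This is only an illustration under stated assumptions, not Csibra and Gergely's own model: the observed action is treated as a path of 2D points, and the hypothetical `rationality_score` measures efficiency as the length of the direct route to a goal divided by the length of the route actually taken via the observed path. The function names, the goals and their coordinates are all invented for the example.

```python
from math import dist


def path_length(path):
    """Total length of a path given as a list of (x, y) points."""
    return sum(dist(a, b) for a, b in zip(path, path[1:]))


def rationality_score(path, goal):
    """Hypothetical efficiency measure: 1.0 means the observed path is a
    maximally direct means to the goal; lower values mean a detour."""
    direct = dist(path[0], goal)
    actual = path_length(path) + dist(path[-1], goal)
    return 1.0 if actual == 0 else direct / actual


def ascribe_goal(path, candidate_goals):
    # Step 1: consider the goals to which the action might be directed.
    # Step 2: score how justifiable the observed action is for each goal.
    scores = {name: rationality_score(path, position)
              for name, position in candidate_goals.items()}
    # Step 3: ascribe the goal with the highest rationality score.
    return max(scores, key=scores.get), scores


# Invented example: a hand moving rightwards, with two possible targets.
observed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
goals = {"ball": (4.0, 0.0), "bear": (0.0, 4.0)}
goal, scores = ascribe_goal(observed, goals)
print(goal, scores)
```

Any measure of justifiability could replace the efficiency ratio; the sketch fixes only the structure of the computation (enumerate, score, select), which is what a computational description specifies.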

The Teleological Stance provides a computational description of the process of goal ascription that underpins proactive gaze in adults; the Motor Theory of Goal Ascription (partially) specifies the representations and algorithms needed for this.

The motor theory of goal ascription provides an account of the representations and algorithms, one that competes with Csibra and Gergely’s account based on general reasoning.

1. Proactive gaze indicates fast goal ascription.
2. The Teleological Stance provides a computational description of the goal ascription underpinning adults’ proactive gaze.
3. Proactive gaze depends on motor processes and representations: the Motor Theory provides an account of the representations and algorithms.

Infants

1. Proactive gaze (from ~12 months) and violation-of-expectations (from ~3 months) indicate goal ascription.
2. The Teleological Stance ...
3. Two conjectures about algorithms and representations ...
Is there any evidence? ...

## Performing vs Understanding Actions in Infancy

\section{Performing vs Understanding Actions in Infancy}

Infants use the teleological stance to identify goals.

But does this involve reasoning or motor processes?

Costantini et al, 2012

In adults, we got at a parallel question by tying their hands. There’s a similar manipulation in infants involving sticky mittens ...

Needham et al, 2002 / https://news.vanderbilt.edu/files/sticky-mittens.jpg

Needham et al, 2002 showed that putting ‘sticky mittens’ on 3-month-old infants (for 10-14 play sessions of 10 minutes each) resulted in their spending more time visually and manually exploring novel objects.

Sommerville, Woodward and Needham, 2005

Play wearing mittens then observe action.

vs

Observe action then play wearing mittens.

In this study, I think infants wore the mittens for just 200 seconds (so the play sessions were much shorter than in Needham et al, 2002).
The observation was based on this study, which we saw earlier.

Woodward et al 2001, figure 1

Sommerville, Woodward and Needham, 2005 figure 3

The results show that infants who played wearing the mittens first were more attentive to the goal.
From at least three months of age, some of infants’ abilities to identify the goals of actions they observe are linked to their abilities to perform actions \citep{woodward:2009_infants}.
But one potential objection to this study concerns observation vs performance. The infants who played wearing sticky mittens first had spent longer observing actions by the time it came to the violation of expectations trial. Could it be observation of action (including one’s own) rather than performance that matters?
In adults, tying the hands impairs proactive gaze \citep{ambrosini:2012_tie}; in infants, boosting grasping with ‘sticky mittens’ facilitates proactive gaze (\citealp{sommerville:2005_action}; see also \citealp{sommerville:2008_experience}, \citealp{ambrosini:2013_looking}).

Sommerville et al 2008, figure 1

To address this issue, \citet{sommerville:2008_experience} did a study in which one group had observation while the other group had performance. The participants were 10-month-old infants this time.
The materials were a bit different: so that training vs observation could be as similar as possible with respect to the causal structure exposed, there was a hook to get an object.

Sommerville et al 2008, figure 2

The results show that infants with the training paid attention to the distal goal (choice of toy) whereas those without paid attention to the choice of cane.

Ambrosini et al, 2013 figure 1 (part)

Further support for a link between action performance and goal ascription comes from a developmental study by Ambrosini et al, who studied whether proactive gaze in infants is influenced by pre-shaping of the hand, and, in particular, whether it is influenced by precision grips.

Ambrosini et al, 2013 figure 1 (part)

Ambrosini et al, 2013 figure 1 (part)

By using no shaping (a fist), Ambrosini et al could treat sensitivity to whole-hand grasp and precision grip separately.

Ambrosini et al, 2013 figure 3

‘infants’ ability to perform specific grasping actions with fewer fingers directly predicted the degree with which they took advantage of the availability of corresponding pre-shape motor information in shifting their gaze towards the goal of others’ actions’ \citep[p.~6]{ambrosini:2013_looking}.

Why?

abilities to perform actions enable identifying goals when observing them.

Why is this true?

The Motor Theory of Goal Ascription:

goal ascription is acting in reverse

The idea is that we could solve the problem--the problem of matching optimisation in planning actions with optimisation in predicting them--by supposing that a single set of mechanisms is used twice, once in planning actions and once again in observing them.
What does this require?

-- in action observation, possible outcomes of observed actions are represented

-- these representations trigger planning as if performing actions directed to the outcomes

-- such planning generates predictions

predictions about joint displacements and their sensory consequences

-- a triggering representation is weakened if its predictions fail
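
The four requirements just listed can be put into a toy simulation. This is a minimal sketch under loud assumptions, not the model in Sinigaglia & Butterfill: "planning" is reduced to a hypothetical `predict_next` that moves one step straight towards a candidate outcome, and "weakening" is a made-up multiplicative `decay` applied whenever a prediction misses the observed movement by more than `tolerance`. All names, outcomes and numbers are invented.

```python
from math import dist


def predict_next(position, outcome, step=1.0):
    """Stand-in for planning: predict the next hand position by moving
    one step straight towards the represented outcome."""
    d = dist(position, outcome)
    if d <= step:
        return outcome
    return tuple(p + step * (o - p) / d for p, o in zip(position, outcome))


def observe(trajectory, outcomes, decay=0.5, tolerance=0.25):
    """Return the strength of each outcome representation after observing
    the trajectory. Every representation starts at full strength."""
    strengths = {name: 1.0 for name in outcomes}
    for here, there in zip(trajectory, trajectory[1:]):
        for name, target in outcomes.items():
            # Planning as if performing an action directed to this outcome.
            predicted = predict_next(here, target)
            # A triggering representation is weakened if its prediction fails.
            if dist(predicted, there) > tolerance:
                strengths[name] *= decay
    return strengths


# Invented example: the observed hand moves steadily towards the ball.
trajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
outcomes = {"grasp ball": (4.0, 0.0), "grasp bear": (0.0, 4.0)}
strengths = observe(trajectory, outcomes)
print(strengths)
```

In this sketch the representation of the outcome to which the action is not directed keeps generating failed predictions and so fades, which is the sense in which the surviving representation corresponds to the goal of the observed action.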

The proposal is not specific to the idea of motor representations and processes, although there is good evidence for it (which I won't cover here because we're in Milan!)

Sinigaglia & Butterfill 2015, figure 1

Why?

abilities to perform actions enable identifying goals when observing them.

Why is this true?
In suggesting that it’s because of the Motor Theory, I’m going beyond anything Sommerville et al would endorse although moving in a direction they cautiously indicate.
I’m also contradicting how most people think of the relation between the Teleological Stance and the Motor Theory. Most theorists think of these as alternatives. E.g. \citet{Csibra:2007hm} contrast what can be explained with the Teleological Stance and with the Motor Theory (they claim the Teleological Stance is required for novel situations in which the Motor Theory should fail; this is probably in part right, although success in the face of novelty could be driven by associations). See also \citet[p.~204]{gredeback:2010_infantsa}: ‘We suggest that anticipation of action goals is mediated by a direct matching process (Flanagan & Johansson, 2003) whereas retrospective evaluations of rationality are dependent on more abstract well-formedness criteria as described by the teleological stance (Gergely & Csibra, 2003).’ (NB: their argument is not good: they claim that because infants who have little or no experience of feeding themselves can show surprise (pupil dilation) in response to nonrational self-feeding actions, the infants must be using the Teleological Stance here. But it might be motoric; or it might be association (cf the ‘telephone to ear’ study).)

The Teleological Stance

... provides a correct computational description of (some or all) infant (and adult) goal ascription.

But which processes and representations underpin it?

Csibra & Gergely’s hypothesis: ordinary inference and beliefs

The Motor Theory: motor representations and processes

Evidence

Manipulating abilities to perform actions changes abilities to identify goals.

1. Proactive gaze indicates fast goal ascription.
2. The Teleological Stance provides a computational description of the goal ascription underpinning adults’ proactive gaze.
3. Proactive gaze depends on motor processes and representations: the Motor Theory provides an account of the representations and algorithms.

Infants

1. Proactive gaze (from ~12 months) and violation-of-expectations (from ~3 months) indicate goal ascription.
2. The Teleological Stance ...
3. Two conjectures about algorithms and representations ...
The question was, Is there any evidence? ... Yes, we found evidence.

caveat: there’s probably more

There’s probably more than the Motor Theory to goal ascription (in infants, and adults).

Melzer et al, 2012 figure 1

\citep{melzer:2012_production}: ‘The infant was given a cube (occupation object) in either his/her left or right hand. Subsequently, a second toy (target object) was held either (a) in front of the empty hand to elicit an ipsilateral reaction (ipsilateral presentation) or (b) in front of the occupied hand to elicit a contralateral reaction (contralateral presentation).’

Melzer et al, 2012 figure 3

Infants become good at contralateral grasping between six and twelve months.

Melzer et al, 2012 figure 2

And here is an anticipatory looking task ...

Melzer et al, 2012 figure 4 (part)

How did the infants do? The data in the figure indicate that 12-month-olds showed quite good, adult-like anticipation of contralateral actions, whereas 6-month-olds’ gaze arrived at the target object after the hand.
This suggests a link between performance and goal ascription, once more. (p. 577: ‘At 12 months, most infants were able to anticipate the goal of contralateral movements, whereas at 6 months, infants showed mainly reactive eye movements.’)
Further, p. 577: ‘Production and perception of contralateral reaching movements were correlated at 12 months of age. The more sophisticated 12-month-olds’ reaching production was, the better they anticipated other people’s contralateral movements.’
But the result that \citet{melzer:2012_production} focus on is this: p. 577: ‘perception and production were not yet correlated at 6 months. The lack of a significant correlation was neither due to a larger variance in the younger infant group nor to the influence of a bias in our sample. Accordingly, our findings suggest that a link between production and perception of contralateral arm movements, and possibly therefore a common representation, develops in the second half of the first year of life.’

conclusions

What do infants understand of goal-directed actions?

They can identify goals from around 3 months of age,

but this does not imply knowledge of action

because infants’ abilities, like adults’, involve motor processes (and perhaps associative learning too).

Emphasize: (1) that they understand action is important because we can use it to build an account of how you get from core knowledge to knowledge knowledge
Emphasize: (2) that their understanding of action does not involve knowledge is important because it allows us to invoke it without assuming capacities for any knowledge at all on the part of the infant. (And because it leaves us with a question about how infants get from core knowledge of action to knowledge of action.)

core knowledge of action?