The perceptual structure of the five themes of Roger Reynolds' The Angel of Death was investigated. We studied how listeners follow the musical progression of each theme and whether or not they perceive the temporal implications. In the first phase, participants performed three tasks on the full themes, one of which consisted of segmenting online the musical ideas. In the second phase, participants were presented with pairs of excerpts from the themes, judged whether both belonged to the same theme, and if they did which one occurred first in the theme. Participants' segmentations corresponded to surface discontinuities in places, but were strongly influenced by the rhetorical structure of the themes in others. Listeners (particularly nonmusicians) encountered difficulties when they were required to perform more abstract tasks out of the musical context such as the belongingness judgment, which depended on surface similarities, and the temporal-order judgment, which depended on previous hearing of the full themes.
For most composers and listeners, music is a time-oriented acoustical structure that is based on sections having different musical functions. Composers use these functions to control formal structure and musical expression. Listeners use them to generate intuitions about the current status of the musical process and to anticipate the sequences of events or at least what can potentially happen. These musical functions can be analyzed in terms of "rhetorical" functions. At its origin, rhetoric was the art of speech and persuasion. Ancient rhetoric was divided into five fields: Inventio (ideas and arguments), Dispositio (plan, sequence of ideas), Elaboratio-Decoratio (figures, ornaments, style), Actio (declamation, performance) and Memoria. From the 1950s, rhetoric experienced a revival, in particular in argumentation (Perelman & Olbrechts-Tyteca, 1958) and semiotics (Barthes, 1964). The fields of rhetoric can, more or less, be transferred to music. The best-known rhetorical concept transferred to music is that of "figure". However, the term rhetoric will be employed here more particularly in the sense of dispositio, i.e. the order and the way in which the musical ideas are connected.
During the Baroque era, there was a lot of research on the potential equivalence between rules of rhetoric and musical discourse. Baroque musical rhetoric was closely linked to the Theory of Affects. All the Baroque composers used rhetorical figures (loci topici), which represent passions in music. Traces of musical rhetoric can be found in many works of musical theory such as Mersenne's Harmonie universelle (1636-1637), Kircher's Musurgia universalis (1650), and Heinichen's Der Generalbass in der Composition (1728). However, Mattheson (1739) was the one who took this equivalence between rhetoric and music further, while including in his theory locus descriptionis, locus notationis or locus causae materialis. Mattheson claimed that "because of the multitude of figures, music had risen today to such a level that we could compare it with a rhetoric". The motto, which is sometimes at the beginning of an aria, illustrates how a figure can summarize a whole piece in a concise way. Obviously, the theories of musical rhetoric evolved with their times between 1550 and 1800. However, as in literature, musical rhetoric seeks to classify problems, styles, genres, and figures with a high degree of accuracy. Forkel (1788-1801), first biographer of Bach, proposed six ways to approach music by rhetoric: musical period (organization of sentences, syntax), affects (different writing styles for church music, chamber music or theatre music), genres, forms, and figures (dispositio, inventio and decoratio), performance (actio), and finally aesthetics (the philosophy of beauty and taste). Although detached from the Theory of Affects, musical figures were used until the 19th century (the Wagnerian leitmotiv is an example) and beyond.
Today, after a period strongly marked by structuralism, some concepts of rhetoric have reappeared in various forms. Following the example of many composers and theorists in the Baroque Era, Cooke (1959) linked intervals, scales, and tonality with moods and emotions. In his system, Cooke sought to define a kind of "musical vocabulary" (the language of music) corresponding to various "affects". Ratner's Classical Music (1980) referred explicitly to rhetorical concepts (exordium, circumlocution, gradatio, peroration, etc.). He analyzed Classical style with two classes of signs called topical and formal signs. However, as Monelle (1992) noted, Ratner's system is only a musical lexicon: "topical references are not arranged into coherent syntagms, as the words of a language" (p. 227). Other theorists have argued that musical significance does not reside in the structural relations of the material, but rather in the formal functions contained in the succession of musical events. Berry (1989) conceived of musical structure as a unit composed of functions, which govern the relations between the events that are immediately adjacent or distant in time. These functions also relate to tonal relations and meter or texture, for example. Caplin (1998) has analyzed works of the Classical period in terms of formal functions. His theory clearly distinguishes formal function from grouping structure (topic, transition, exposition, coda, antecedent, consequent, etc.). Formal functionality results from the harmonic, melodic, and rhythmic processes that are not necessarily identical to those that create the grouping structure of the piece. Formal function and grouping structure are often joined, but this is not always the case. The formal functions defined by Caplin (1998), such as presentation (basic idea, repetition), continuation (fragmentation, liquidation) or cadence, are comparable to rhetorical functions.
Agawu (1991) tried to combine Schenker's (1935) structural method of musical analysis and Ratner's rhetorical approach. According to Agawu, there is an interaction between topical signs and structural signs, between morphology of expression and structure, between extroversive semiosis and introversive semiosis (related to Jakobson's (1963) categories) along a single continuum. A semiotic musical analysis must provide an account of a piece in which the domains of expression (extroversive semiosis) are integrated with those of structure (introversive semiosis), which is not always the case to an equal extent. According to Krumhansl (1998), some of the topics identified by Agawu (1991) in two string quartets by Mozart and Beethoven have a psychological reality that influences the cognitive representation of the two pieces. For each piece, three different measures were taken in real time: memorability, openness, and amount of emotion. The topics with their distinctive characteristics (such as tempo, rhythm, and melodic figure) influenced the judgments.
In contemporary music, rhetorical functions have broadened along with other characteristics of materials such as timbre, register, density, texture, and space. Moreover, new ways of using musical time appeared after the Second World War and renewed the formal functions. Thus, Stoïanova (1978) distinguished two aspects of musical time that are both opposite and complementary: the kinetic aspect (kinesis), which is related to movement, processes of transformation, and differences generated in succession; and the static aspect (stasis), which is dependent on the immobilization of a sound flow and its stabilization in sound architectures. The perception of musical form was considered to depend on the interaction between these two contradictory tendencies. Kramer (1988) defended the idea that linearity and nonlinearity are the two fundamental means by which music structures time and time structures music. The concept of linearity, such as Kramer conceived it, mixes causality and teleology. A linear music develops through multiple implications (the causes involving the effects) and in a specific direction, the bases of which are mainly the tonal functions. Nonlinearity is, to the contrary, absence of causality and static. From the interaction of these two tendencies, Kramer deduced several varieties of temporality: directed linear time, not-directed linear time, multi-directed linear time, moment time, vertical time, etc.
Other theorists and psychologists do, however, question the psychological relevance of large-scale musical form. Levinson (1997) defends the idea that it is not necessary to be conscious of the musical architecture because music consists of a series of successive events that cannot be apprehended simultaneously in a single perceptual act. According to him, listening to music only requires one to be focused on the present with the exclusion of any attempt at recollection or anticipation. From a psychological point of view, listeners would thus perceive incoming events through a short temporal window that slides along the event stream (Bigand, 1993; Clarke, 1987; Fraisse, 1957; Michon, 1977). The size of a temporal window (referred to as the "perceptual present" in Fraisse and as "quasi hearing" in Levinson) is constrained by several factors, but its maximal duration is considered to vary between 5 s (Fraisse) and 8 s (Michon) to 10 s (Clarke) or even 30 s (Levinson). Inside each temporal window, all attentional resources are supposed to be allocated to the events it contains, without any supplementary resources to process events outside the window. As a result, music may be perceived from one temporal window to another, without any consideration being given to what has been perceived before (Michon, 1977). If transitions are smooth enough, we may perceive music from moment to moment without being disturbed by the total absence of directionality between all of these moments.
Several experimental studies on Western tonal music have tried to measure the impact of formal coherence and the perception of structure by listeners. Francès (1958) carried out an experiment starting with a trio movement by Beethoven. The listeners were asked to announce when the themes began, knowing that there were two themes. Musicians performed better than nonmusicians, especially for the appearance of the second theme. A second version of the experiment, in which listeners were told neither the number of themes nor their mode of appearance, revealed many errors. Other studies tested the perception of form by systematically changing the whole organization of musical pieces. In these studies, participants usually evaluated the musical pieces with subjective scales. Thus, in one study, Gotlieb & Konecni (1985) studied the impact of significant modifications of the structure of the Goldberg Variations by Johann Sebastian Bach on the pleasure, interest, and emotion of the subjects. The answers indicated that they appreciated the modified versions of the Goldberg Variations as much as the original. The modification of the structure only had a tiny effect on the pleasure experienced by the subjects. According to Batt (1987), however, these findings were criticizable because the modifications used temporal units that were too large, and because participants were not experts in music. Batt (1987) illustrated his argument with possible manipulations for Mozart's Symphony in G Minor (K550). Karno & Konecni (1992) went one step further by manipulating the relations between the different sections inside the first movement of the Mozart symphony, following Batt's (1987) suggestion. The subjective judgments on different scales (level of interest, pleasingness, desire to own a recording of the piece, best overall structure) of nonmusicians and music students resulted in no significant differences between original and modified versions. 
According to Karno & Konecni, these results clearly question the perceptual impact of musical structures for the listeners.
Tillmann & Bigand (1996) imposed even stronger modifications on the structure of musical pieces. Three pieces for piano were selected to represent three styles and three periods: the gigue of the French Suite no. 1 in D minor BWV 812 by J. S. Bach, the allegretto of the Sonata in Bb major KV 570 by Mozart, and the gigue of the Suite for piano, op. 25 by Schoenberg. Each piece was segmented into several chunks and reorganized temporally. These chunks were concatenated and presented either in the original order of the composer or in retrograde order. This systematic rearrangement destroyed the initial structure of the pieces without changing the local structure and the surface characteristics inside the chunks. When the subjects heard the Bach and Mozart pieces in the retrograde order, they did not give very different estimates of expressiveness and coherence. Finally, a significant (but weak) effect of the inconsistency was observed for the Schoenberg piece. In addition, the subjects did not realize to a significant degree (above chance) that they had listened to retrograded versions. Those who listened to the pieces in the correct order identified it as an original version. All of these studies tend to show that large-scale musical structures have only a weak effect on the perception of the musical expressiveness of tonal pieces.
How is atonal music perceived? The results obtained with Schoenberg's Suite for piano, op. 25 seem to indicate that the temporal structure carries a more significant perceptual weight with atonal music. Other experiments lead to the same conclusion. Deliège (1989) evaluated the perception of structure in two contemporary pieces: Berio's Sequenza VI for viola solo (1970) and Boulez's Eclat (1965) for piano and ensemble. The two pieces were selected because of their identical duration and their differences in writing style. For both Sequenza VI and Eclat, the results showed that musicians and nonmusicians performed the grouping of groups in a very similar way. The boundaries corresponded more or less with significant caesuras associated with clear contrasts (often of surface features), which are "clues" perceived by the subjects. The results of the experiment requiring listeners to localize excerpts in Berio's Sequenza VI showed that the percentage of correct answers was acceptable for musicians, but was rather lower in nonmusicians. However, many errors in the localization task occurred for excerpts that were reiterated a number of times (although never in exactly the same way). It thus seems that some variations are too fine to be memorized and cause confusions in the localization task.
Clarke & Krumhansl (1990) tackled the question of the perception of musical form with Stockhausen's Klavierstück IX for piano solo (1961). In the first experiment, ten strong boundaries (common to a majority of listeners) emerged from the grouping task. Two main boundaries matched up with the division of the piece into three sections. The musical characteristics that contributed to the segmentation could be divided into four categories: silences and long pauses; contrasts of dynamics, register, texture, rhythm, etc; pitch changes, melody contour or vertical organization changing to horizontal organization; the reiteration of a material already heard. In the second experiment, subjects had to localize excerpts of Klavierstück IX according to the boundaries established during the first experiment. The average values of localization judgments were correlated to a significant degree with their position in the piece. However, the listeners' judgments deviated more strongly for the excerpts located near the middle of the piece. The authors felt this deviation to be an effect of musical progression: because the central part of the piece is a kind of mixture where several ideas are combined and juxtaposed, the direction of the progression was weakened. All of these studies tend to confirm the importance of large-scale structures for the comprehension of atonal music.
Reynolds' music is particularly well suited to studying the relevance of large-scale structures in contemporary music. The principal concern of Reynolds is time, or more exactly "the architecture of time". From the early 1960s, he started to control all the temporal aspects of his pieces with numerical series in irregular progressions to breathe unpredictability into them. Later, he improved his method and developed various techniques to create waves of durations, dilating or retracting, converging or diverging, in a nonlinear way. These always-changing portions of time become a norm (convergence/divergence) whose function is equivalent to tension/resolution in tonal music: "My intention was to build into the structure – but only at the subconscious, which is to say only at the inferable level – waves of accumulating or shrinking duration that, by their expansion or convergence would evoke the sense of movement towards, of arrival at, and dissolution from. These trends might, I thought, in some measure compensate for the formal functions previously served by tonal harmonic conventions: operating over spans of time, suggesting, as they are traversed, origins and goals" (Reynolds, 1987, p. 288). Reynolds was also one of the first composers to exploit systematically the multidimensionality of musical time. His music is contrapuntal at the local level as well as at the global level. His works are always formed of many independent layers. One of the most manifest characteristics of Reynolds' compositional procedures is to constitute his thematic material around a core element. "Core elements are composed according to strict methodological standards, whatever this may mean in a given piece. This rigor is particularly important because of the fact that they serve – at least in my case – as the reservoirs of orderliness for the work as a whole. The algorithmic procedures that I use often disturb temporal proportion and succession radically. Overall consistency in the composition, then, requires that each derived fragment of the original whole, wherever it is found, should itself be a reliable product of the underlying orthodoxies" (Reynolds, 2002, p. 19). In the case of The Angel of Death, the core element has a specific rhetorical function. It constitutes the expressive center of the theme and acts like a magnet or a driving bolt on the other sections of the theme (Fig. 1). It contains both a centripetal and a centrifugal force. The core element is not more identifiable than the other sections, but it constitutes a boundary, a line of demarcation, which listeners seem to be able to feel implicitly (see Reynolds, this CD-Rom).
The Angel of Death contains five thematic materials. The term "theme" (which we will adopt for this paper, see also Reynolds, this CD-Rom) is not used in its traditional meaning, but to mean something like a formal unit. The themes vary from 23.5 s to 99.5 s as conceived (31 s to 156 s as performed), and the composer created them with four to nine subsections. The internal temporal organization of the themes has a more or less marked directionality. Theme 1 carries the most directional trajectory. Conversely, Theme 5 is the least directional. In this latter case, the contrasts between the subsections are greater. The proportions of each theme lead us through an elastic time, which contracts, stagnates or dilates. The five themes have a global morphology in the form of an "X", the place of crossing always being the core element (Fig. 1). Thus, in Theme 1, the registers of the two lines meet, then deviate. In Theme 2, the dyadic links seem to spread to the blocks of chords, then fade out. In Theme 3, two lines out of phase converge in a chromatic contrary motion, then diverge. In Theme 4, the trajectories go from order to disorder, then back to order. In Theme 5, the contrapuntal intensification stops on a chorale, then briefly reappears. The force of the directionality seems to weaken from theme to theme: the first theme is the most clearly directional, and the last the least. Figure 1 presents the internal and external temporal proportions, the formal directionality, and functions for each of the five themes.
On the basis of a music-theoretic analysis, we established eight categories of rhetorical functions in the five themes. These functions allowed us to provide an account of the extroversive semiosis of each theme (related to the Agawu categories mentioned before). They do not correspond exactly to the categories of ancient rhetoric (Mattheson, 1739), but they are inspired by the rhetorical category called dispositio, i.e. the way to order the ideas, to assemble them according to a plan. Presentation (exordium) is often the beginning, the first step, of the theme. It proclaims the material and presents the main idea(s). Development (narratio) is a process of transformation and variation of material opened in the immediately preceding subsection. Continuation (confirmatio) is a similar process that prolongs the material of the preceding subsection (boundaries between development and continuation are weaker). Digression (digressio) links two subsections; it is an intermediary state often based on a contrasted musical idea. Reinforcement (confirmatio) implies that a characteristic of a previous subsection is reinvested, repeated. Convergence corresponds to the core element; it carries temporal convergence and half-conclusive functions. Closure (peroratio) is the end of the theme; it carries a conclusive function. In this scheme, we are working according to the principle that each subsection carries only one function. The descriptions of the five themes that follow refer again to Reynolds' subsections, and attempt to highlight the sequence of musical ideas using the eight categories of rhetorical functions. The sequence of ideas in the themes of The Angel of Death shows a complex path that carries both linearity and nonlinearity (related to Kramer's categories mentioned before).
The structures of the five themes of The Angel of Death are thus time-oriented. Each subsection of each theme can be described in terms of rhetorical functions, and the core element has a specific rhetorical function that determines the temporality of the themes. An important question that arises then is: to what extent are listeners able to pick up these time-oriented implications and rhetorical functions? The broad purpose of this study was to assess the ability of listeners to capture these specificities of Reynolds' music and the extent to which musical expertise favors (or not) this ability. In Experiment 1, we assumed that listeners acquire the temporal implications of a musical excerpt from previous hearings of the theme as well as from the specific musical patterns contained in the excerpts. Three tasks were designed to address this issue. In the segmentation task, participants indicated online each musical idea they perceived in Reynolds' themes. We were not expecting these perceived ideas to fit point by point with the sections delineated by the composer, since these sections correspond mostly to compositional strategies (see Reynolds, this CD-Rom). In other words, there was no right or wrong answer in this task. The task was simply designed to illuminate the perceptual structures of the themes and to assess whether these structures change with the extent of musical expertise. Of course, we were expecting some correspondence between the main structure defined by the composer (notably the core elements) and the perceived musical ideas. The next two tasks lead to specific hypotheses. In the belongingness judgment task, participants were presented with pairs of excerpts from the themes and had to decide whether or not they belonged to the same theme. Low performance in this task would reflect participant's difficulty in capturing the rhetorical function of the different sections of the themes. 
The last task was more demanding since it required participants to indicate the temporal order of two excerpts within the theme, if they were judged to belong to the same theme. If the listeners had acquired the temporal implications of a musical excerpt from the previous hearings of the entire themes, they should respond above chance in this task.
Forty-eight volunteer students participated in this experiment: 24 students from an introductory psychology course at the Université de Bourgogne with no formal training in music (referred to below as nonmusicians), and 24 candidates for the final diploma of the conservatory of Dijon (referred to below as musicians). Musicians were familiarized with contemporary music during their conservatory studies, notably because they all had to perform contemporary pieces for their final exams. All participants received course credit or were paid 7€ for their participation.
The five themes of The Angel of Death were used in this experiment. They were recorded by the pianist Jean-Marie Cottet in the Espace de Projection at IRCAM in Paris. From these themes, 20 true pairs of excerpts and 20 false pairs of excerpts (in which the items did not belong to the same theme) were assembled. When the durations of individual subsections were too short, consecutive subsections were combined to make the excerpts (e.g., the first three subsections of Theme 1: T1.1-2-3).
The sound stimuli were prepared in SoundEditPro software at CD quality (16 bits and 44.1 kHz). The experiment was run with PsyScope software (Cohen, MacWhinney, Flatt, & Provost, 1993). The stimuli were presented through a Luxman A357 power amplifier and Sennheiser HD 200 headphones.
The experimental procedure was split into two phases. In the exposure phase, participants listened three times to the five full themes and performed different tasks on each listening. The first listening was simply designed to have participants pay attention to the structure of the theme: after hearing a theme they were asked to evaluate on a 7-point scale (from 1 = very unfamiliar to 7 = very familiar) how familiar they were with the musical style of the theme. During the second listening, they were required to indicate in real time (by pressing the space bar on the computer keyboard) the onset of each new musical idea. This task allowed us to assess how fine-grained the online perception of the musical progression of The Angel of Death themes was. Finally, the third listening involved a semantic judgment about each theme. Listeners were required to evaluate on a 7-point scale (from 1 = does not evoke well to 7 = evokes well) how the composer's semantic description of the themes corresponded to what they felt. The composer created the following descriptions:
There was no specific assumption for this third listening, which was mostly designed to encourage participants to focus on the rhetorical qualities of the five themes.
In the test phase of the experiment, participants were required to perform two tasks: the belongingness and temporal-order tasks. They were presented with 40 pairs of musical excerpts (Table 1). The 20 "true" pairs always contained the central part of one of the 5 themes (CE column in Table 1), and the second, comparison element (Comp column in Table 1) of the pair belonged to the same theme. The "false" pairs always contained the central part of one of the 5 themes, but the second element belonged to another theme. As far as possible, the second element of the true pairs was chosen to approximate the duration of the first element and to share global surface similarities. Each comparison excerpt occurred in one true pair and one false pair. The order of excerpts in the 40 pairs was counterbalanced across participants. There was an inter-stimulus interval of 2 s between items within the pairs. The silent intervals between pairs were under the participant's control. Participants were first asked to indicate whether the two elements belonged to the same theme. Only if they answered "yes" were they required to say which of the two elements was likely to appear first in the theme. So for each stimulus pair, the presence of the second question was conditional upon the answer to the first question.
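The conditional structure of the two test-phase questions can be made concrete with a small scoring sketch. It is only an illustration under stated assumptions: the `Trial` record, its field names, and the rule that temporal order is scored only on true pairs that received a "yes" answer are hypothetical choices made here, not the scoring procedure reported for the experiment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Trial:
    """One excerpt pair (hypothetical record; field names are illustrative)."""
    same_theme: bool                       # ground truth: True for a "true" pair
    core_first: Optional[bool]             # true order in the theme; None for false pairs
    said_same: bool                        # answer to the belongingness question
    said_core_first: Optional[bool] = None  # order answer, collected only after "yes"


def score(trials: List[Trial]) -> Tuple[float, Optional[float]]:
    """Return (belongingness accuracy, temporal-order accuracy).

    Assumption: order is scored only where an order answer exists AND a true
    order is defined, i.e. on true pairs that were judged "same theme".
    """
    belong_correct = sum(t.said_same == t.same_theme for t in trials)
    order_trials = [t for t in trials if t.same_theme and t.said_same]
    order_correct = sum(t.said_core_first == t.core_first for t in order_trials)
    belong_acc = belong_correct / len(trials)
    order_acc = order_correct / len(order_trials) if order_trials else None
    return belong_acc, order_acc


example = [
    Trial(same_theme=True, core_first=True, said_same=True, said_core_first=True),
    Trial(same_theme=True, core_first=False, said_same=True, said_core_first=True),
    Trial(same_theme=True, core_first=True, said_same=False),   # no order question
    Trial(same_theme=False, core_first=None, said_same=False),  # correct rejection
]
belong_accuracy, order_accuracy = score(example)
```

Because the order question is asked only after a "yes", the number of order responses differs across participants, which is why the sketch returns `None` when no order trial is available.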
Familiarity ratings from the first listening (Table 2) revealed that Reynolds' themes sounded more familiar to musicians (M = 4.2, SD = 0.18) than to nonmusicians (M = 1.9, SD = 0.22). A Group (2) x Theme (5) ANOVA, run with the familiarity ratings as the dependent measure, revealed a main effect of Musical Expertise, with significantly higher familiarity ratings for musicians than for nonmusicians, F(1,46) = 41.6, p < .001, MSE = 7.79. There was a significant main effect of Theme, F(4,184) = 2.4, p < .05, MSE = 0.65. A Tukey-Kramer HSD post-hoc analysis on Theme showed that the only significant difference was between Themes 2 (the least familiar) and 3 (the most familiar), p = .02.
After the third hearing, participants were required to evaluate the semantic descriptions provided by the composer for each theme (Table 3). Higher ratings were found in musicians (M = 4.9, SD = 0.25) than in nonmusicians (M = 3.8, SD = 0.52). A Group (2) x Theme (5) ANOVA run with the semantic goodness of fit ratings as the dependent measure revealed a significant main effect of Group, F(1,46) = 13.8, p < .001, MSE = 5.34 and a significant Group x Theme interaction, F(1,184) = 2.6, p < .05, MSE = 2.07. A Tukey-Kramer HSD analysis showed that no significant differences were observed between musicians and nonmusicians for the semantic judgments, except for Themes 3 and 4 (p < .05), which were significantly lower in nonmusicians than in musicians.
The most interesting part of the exposure phase concerned the musical ideas perceived online during the second hearing of the theme. Table 4 displays the average number of musical ideas perceived for each theme by both groups. A Group (2) x Theme (5) ANOVA with the first variable as between-group factor and the second as within-group factor revealed a significant effect of Theme, F(4,92) = 87.2, p < .001, MSE = 0.71, and no significant difference between the number of musical ideas perceived by musically trained and untrained listeners, F(1,23) = 1.5, p > .10. There were moderate but significant correlations between the number of musical ideas and theme duration, r(238) = .60, p < .001: the longer the theme, the higher the number of perceived musical ideas. As expected on the basis of the composer's conception, the number of perceived musical ideas was smaller than the number of sections originally defined by him, but there was a weak but significant correlation between the two, r(238) = .47, p < .001.
The durations delimited by the segmentations (i.e. the durations of perceived musical ideas) were computed for all themes and all listeners. Table 5 displays the average durations of musical ideas perceived by both groups of participants for each theme. A Group (2) x Theme (5) ANOVA with average duration of the musical ideas as the dependent measure revealed a significant effect of Theme, F(4,184) = 19.7, p < .001, MSE = 37.60: the shortest musical ideas were found in Theme 4 and the longest in Theme 5. There was no other significant effect. The duration of perceived musical ideas was slightly larger in musicians (19.3 s) than in nonmusicians (17.7 s). This difference did not reach statistical significance, F(1,46) = 1.1, p > .10. Further analysis revealed effects of musical expertise on the duration of musical ideas. Figure 2 displays the statistical distribution of the durations of musical ideas over all themes for both groups.
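As an illustration of how durations of perceived musical ideas can be derived from key-press times, the following minimal sketch may help. It assumes that the onset and the end of the theme count as implicit boundaries, which the text does not state; the function names and the example press times are hypothetical.

```python
from typing import List


def idea_durations(press_times: List[float], theme_duration: float) -> List[float]:
    """Durations (in s) of the spans delimited by successive key presses.

    Assumption: the theme onset (0 s) and the theme end act as the first and
    last boundaries. Zero-length spans (e.g. a press at exactly 0 s) are dropped.
    """
    boundaries = [0.0] + sorted(press_times) + [theme_duration]
    return [
        later - earlier
        for earlier, later in zip(boundaries, boundaries[1:])
        if later > earlier
    ]


def mean(values: List[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical listener: two presses during a theme lasting 31 s as performed.
durations = idea_durations([12.0, 20.5], theme_duration=31.0)
average_duration = mean(durations)
```

Averaging such per-listener values within each theme would give the kind of quantity entered into a Group x Theme analysis, although the exact aggregation used in the study is not specified here.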
On the whole, there was a tendency for participants to perceive musical ideas no longer than 20-24 s, although in some cases (for Theme 5 notably) this duration increased up to 30 s. This duration nicely fits Levinson's concept of quasi hearing. Slight differences between musically trained and untrained listeners appeared when considering the duration of musical ideas below this value. Musically trained listeners tend to perceive ideas of 15 s and to a lesser extent of about 6-7 s. For these listeners, musical ideas are rarely shorter than 5 s. By contrast, nonmusicians tend to perceive musical ideas of 7-8 s, and, to a lesser extent, of 15 s. Perceived musical ideas, however, can be shorter than 5 s, and as short as 1 s, which has no equivalent in musicians. This finding may reflect a difficulty the nonmusicians had integrating unfamiliar musical events in an unfamiliar style into perceptually significant units.
The next step was to analyze the temporal location of the perceived musical ideas in each theme. As displayed in Figures 3-7, the locations of the perceived musical ideas were globally similar for both groups. The number of musical ideas found for each bin of 2 s between the two groups was correlated for each theme (the bins with a value of 0 and 1 were removed from this analysis). Moderate to high correlations were found, with r(24) = .65, p < .002 for Theme 1, r(17) = .59, p < .01 for Theme 2, r(4) = .83, p < .04 for Theme 3, r(14) = .83, p < .001 for Theme 4, and r(34) = .86, p < .001 for Theme 5. This analysis suggests that musically trained and untrained listeners perceived the musical ideas in the Reynolds themes in similar fashion, although from 26% to 65% of the variance between the two groups remains unexplained, indicating differences between them that vary across themes.
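The between-group comparison described above can be illustrated with a minimal sketch. It assumes that each group's responses are available as a list of segmentation times in seconds, and that a bin is dropped when either group's count is 0 or 1; the text does not say which reading of the exclusion rule was applied, and all function names are hypothetical.

```python
from math import sqrt


def bin_counts(times_s, theme_duration_s, bin_width_s=2.0):
    """Count segmentation responses falling in consecutive bins of bin_width_s seconds."""
    n_bins = int(theme_duration_s // bin_width_s) + 1
    counts = [0] * n_bins
    for t in times_s:
        counts[int(t // bin_width_s)] += 1
    return counts


def pearson_r(x, y):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def group_agreement(musician_times, nonmusician_times, theme_duration_s, min_count=2):
    """Correlate per-bin counts of the two groups for one theme.

    Assumption: a bin is kept only if both groups reach min_count responses in it.
    Returns (r, degrees_of_freedom).
    """
    mus = bin_counts(musician_times, theme_duration_s)
    non = bin_counts(nonmusician_times, theme_duration_s)
    kept = [(a, b) for a, b in zip(mus, non) if a >= min_count and b >= min_count]
    xs = [a for a, _ in kept]
    ys = [b for _, b in kept]
    return pearson_r(xs, ys), len(kept) - 2
```

The proportion of unexplained variance quoted in the text follows from the coefficient as 1 - r²: for example, r = .86 leaves about 26% unexplained and r = .59 about 65%.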
The perceived musical ideas did not always fit with Reynolds' compositionally determined subsections, although it should be re-emphasized that he did not intend all sections to have perceptible boundaries. Over all the themes, 11 of Reynolds' 34 subsections corresponded to perceived musical ideas for musicians and 10 did for nonmusicians. In order to further investigate the musical parameters associated with the change in musical ideas perceived by participants, the five themes were characterized according to five acoustical or psychoacoustical features: level in dBA, zero-crossing rate, spectral centroid, roughness, and pitch contextuality. The Level measure is simply the sound pressure level with the A weighting designed to compensate for the equal loudness contours at lower levels. The Zero Crossing Rate is the number of zero-crossings within a 1-s time frame and is correlated with the harmonicity/noisiness of a signal (usually used as a discriminant of speech and music). This cue is very sensitive to the noisiness of piano attacks and is thus correlated with the number of new sound events within a frame. The remaining features were computed from the output of an auditory model. The auditory model used here is an implementation of Van Immerseel & Martens (1992) by Leman, Lesaffre & Tanghe (2002). It provides a physiologically justified representation of the activity of the auditory nerve in response to a sound. Spectral Centroid is the "center of gravity" of the spectral magnitude distribution over the auditory channels. Roughness or sensory dissonance (Leman, 2000a) is highly related to texture perception and characterizes the degree of amplitude modulation across the array of auditory channels. It is assumed that some neurons may synchronize with the envelope, provided that they fall in the frequency range where synchronization is physiologically possible (between 5 and 300 Hz) and that synchronization of modulation across channels increases roughness.
Pitch contextuality (Leman, 2000b) measures the pitch commonality between two running pitch estimations of the same sound, but with a different decay trace. Firstly, a pitch image is computed. An autocorrelation is applied to each auditory channel to estimate the periodicities of the perceived (spectral and virtual) pitches. These pitch images are then accumulated in short-term and long-term memories. The state of these memories is finally continuously compared, giving an index of the correlation between local and global pitch contexts.
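Of the five descriptors, the zero-crossing rate is the simplest to illustrate. The sketch below counts sign changes within each 1-s frame; it assumes a mono signal given as a list of samples and non-overlapping frames (the frame hop is not stated in the text), and the function name is hypothetical. It is not the analysis code used in the study.

```python
def zero_crossings_per_frame(samples, sample_rate):
    """Number of sign changes in each non-overlapping 1-s frame of a mono signal.

    Assumptions: frames do not overlap, a trailing partial frame is discarded,
    and a sample equal to zero is treated as non-negative.
    """
    counts = []
    for start in range(0, len(samples) - sample_rate + 1, sample_rate):
        frame = samples[start:start + sample_rate]
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        counts.append(crossings)
    return counts
```

A noisy frame (for instance a piano attack) alternates sign far more often than a sustained harmonic tone, which is why this count tracks the number of new sound events.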
In order to assess whether perceived musical ideas may have been influenced by discontinuities in these features, we computed the "running" variation of each variable (the absolute difference of a feature value between two successive 1-s frames). Correlation coefficients between these five variables and the participants' group responses were computed. If musical ideas were mostly influenced by changes in surface features, significant correlations should be observed. The outcome of this analysis is summarized for musicians and nonmusicians in Table 6. It should be noted, however, that this correlation analysis does not take into account the possibility that the actual predictor(s) contributing to segmentation could vary over the theme during listening, with fluctuations in attentional focus, for example.
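The running variation and its correlation with the group responses can be sketched as follows. The sketch assumes one feature value and one response count per 1-s frame, and it aligns each variation value with the responses in the later of the two frames; this alignment, the absence of any response-latency correction, and the function names are assumptions rather than details given in the text.

```python
from math import sqrt


def running_variation(frame_values):
    """Absolute difference of a feature between successive 1-s frames."""
    return [abs(b - a) for a, b in zip(frame_values, frame_values[1:])]


def pearson_r(x, y):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def feature_response_correlation(frame_values, responses_per_frame):
    """Correlate a feature's running variation with segmentation counts.

    variation[i] describes the change from frame i to frame i + 1, so it is
    paired here with the responses counted in frame i + 1 (an assumption).
    """
    variation = running_variation(frame_values)
    aligned_responses = responses_per_frame[1:]
    return pearson_r(variation, aligned_responses)
```

Running this once per descriptor and per group would yield a table of coefficients of the kind summarized in Table 6.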
This analysis revealed that for Themes 3 to 5, musical ideas may actually correspond to a salient evolution of some acoustical features. The lower panels in Figures 3-7 present variations of acoustical features and perceived musical ideas by musicians and nonmusicians. For Theme 3, three subsections are clearly identifiable according to the contextuality and centroid features. Level information seems to be more useful for segmentation of Themes 4 and 5. Obviously, musical ideas perceived for other themes (notably Theme 2) remained difficult to account for on the basis of surface features alone. As an example, in Theme 1, a peak of segmentation occurs at 22-23 s, where no abrupt change in our descriptors can be seen. On the contrary, for Theme 3, the participants do not segment the first boundary (near 10 s, the beginning of the core element), although the features clearly change. This shows that these features may act as cues of changes in musical idea, but they are neither sufficient nor always necessary to explain the experimental data.
An analysis of the musical characteristics associated with the musical ideas is displayed in Table 7. For the purpose of this analysis, we only focused on changes in musical ideas that were segmented by at least 20% of the participants in order to avoid the multiple segmentations by individuals or small numbers of participants. Let us consider for example the structure of Theme 1. Several subsections were difficult to perceive because Theme 1 exhibits a strong linear continuity with few contrasts between subsections. The most contrasting changes detected by both groups of participants occurred between subsections T1.3 and T1.4 and between subsections T1.6 and T1.7. The former corresponds to a small but sudden change in pitch range that contrasts with the continuous and progressive change in high and low pitch ranges that was initiated from the beginning of the theme. The latter corresponds to a contrast between a calm and serene excerpt (T1.6) and a more dynamic excerpt (T1.7). The change between subsections T1.5 and T1.6 is subtler and corresponds to a change in musical writing associated with a small decrease in tempo (mm=150 to 120) without being emphasized by other surface markers. Musicians seemed to perceive a change in musical idea here with a 4-s delay. At this exact time, there is no change in the musical surface that may explain their responses. Sometimes participants perceived musical ideas at points that do not correspond to the composer's subsections. The clearest example is found at the middle of the core element of Theme 1, notably in nonmusicians at 34" and 40". The new musical idea perceived here corresponds to the crossing of the two melodic lines, which are developed throughout Theme 1. As illustrated by Figure 1, this crossing point is of central rhetorical importance for Theme 1, and this importance is emphasized by a rallentando followed by an accelerando.
It is interesting to note that the most detected subsections by both musicians and nonmusicians (T1.4, T1.6, T1.7) were subsections embodying the rhetorical function of digression, that is, a contrasting musical idea.
The musical ideas identified in Theme 2 never corresponded to a subsection as conceived by the composer. This may be explained by the fact that Theme 2 is made of local discontinuities resulting from an apparent chaotic alternation of loud hammered chords and short, fast runs. Interestingly, both groups indicated a new idea in the middle of the core element (25"-26"). At this time, the density of the hammered chords, which was gradually decreasing from the beginning of the theme, is the lowest, and then progressively increases again beyond this point. At the same time, a reverse phenomenon is observed for the fast runs, the frequency of occurrence of which is the highest at this point. That is to say, this point corresponds to the crossing of two musical processes working in opposite directions. The rhetorical importance of this crossing is highlighted by an increase in note duration. The two changes in musical idea observed with musicians just after T2.6 (at 39" and 41") may correspond to a change in melodic contour of the fast runs: they suddenly go upward while they were previously going downward most of the time.
Only one subsection in Theme 3 corresponded to a perceived idea (T3.3). In the preceding section, the musical flow was progressing by contrary movement of left and right hands. Subsection T3.3 starts just after the end of this movement. Surface markers linked to change in articulatory features (staccato replacing legato with foot pedal) further highlight this change in process. This subsection embodies the digression function.
The musical ideas of Themes 4 and 5 fit well with the thematic subsections. In both themes, each subsection is internally homogeneous. In Theme 4, this is caused by continuous melodic lines all ending on pseudo-cadential gestures producing a feeling of suspension. In addition, the musical character of each subsection differs slightly and each boundary is further marked by long resonances. In Theme 5, subsections displayed contrastive musical characters associated with either change in articulation or the writing texture (melodic versus harmonic). In addition, the changes in subsection are also emphasized by silences. The most detected subsections both by musicians and nonmusicians, in Themes 4 and 5, were also of the digression category (T4.4 and T5.2).
It seems that participants based their segmentations both on surface features and rhetorical functions. For Themes 3 and 5, musical ideas might actually correspond to a salient evolution of some of the acoustical features. Obviously, musical ideas perceived for the other themes are difficult to account for on the basis of surface features alone. Globally, Themes 3, 4, and 5, the perceived ideas of which fit better with Reynolds' subsections, are less continuous in their temporal leadings, and the boundaries of their subsections are more salient. Some rhetorical functions also seem to influence the participants' judgments. The subsections that embody the function of digression evoked the strongest peaks of segmentation. To the contrary, the subsections that embody the functions of development and continuity are less often perceived as a new musical idea. This process is perhaps comparable to Gernsbacher's (1990) Structure Building Framework. According to Gernsbacher, the construction and mapping of a mental structure (in discourse comprehension) are done by the addition of information, which forms additional appendices with the basic structure. The more the input information is coherent with preceding information, the more it will activate the same memory nodes. Inversely, the less the input information is coherent, the less the same memory nodes will be activated. In this case, the input information will activate a different unit of memory nodes, and the activation of this new unit will produce the foundation of a new substructure. Perceived musical ideas seem in fact to correspond to changes in more global dynamic structures that emerge from the temporal patterning of different markers over several seconds or from relationships between thematic materials. Music theorists would label these as changes in musical rhetoric, temporal leading or musical character (Caplin, 1998; Monelle, 1992; Kramer, 1988). 
Music psychologists would probably say that some of these changes relate to dynamic trajectories (Large & Jones, 1999) that cannot be anticipated from the previous context. In any case, these more global features were essential to understand why some subsections were or were not perceived as musical ideas.
Belongingness judgments. The main purpose of this part of the experiment was to evaluate participants' ability to recognize whether two excerpts belonged to the same theme or not. Both groups managed fairly well to reject correctly the "false pairs" (in which one of the elements belonged to another theme) without any difference between musicians (61%) and nonmusicians (59%), t(19) = 0.50, p = .62. Musicians (62%), however, tended to perform better than nonmusicians (55%) in recognizing "true pairs" (in which both elements belonged to the same theme), t(19) = 1.85, p = .08. Only musicians performed significantly above chance with true pairs, t(19) = 2.28, p = .03. The lowest performance was observed for pairs coming from Theme 1 (40% and 43% for musicians and nonmusicians, respectively) and Theme 5 (48% and 46% for musicians and nonmusicians, respectively). The highest performance was 81% among musicians and 67% among nonmusicians. In addition, 18 musicians (out of 24) and 13 nonmusicians (out of 24) performed above chance. For both groups of participants, it was easier to recognize true pairs when the second element corresponded to the first subsection of the theme (function of presentation, see Table 8) rather than to another section (65% versus 61% for musicians; 63% versus 53% for nonmusicians). This finding suggests that the first subsection of a theme is more stable in memory and may work as a cognitive reference point to which other sections of the theme are anchored. There was no clear tendency to reject correctly false pairs when the second element corresponded to the beginning of another theme rather than to another subsection of the same theme (60% versus 62% for musicians, 61% versus 59% for nonmusicians).
It was of interest to assess whether belongingness judgments might have been influenced, at least partly, by the surface similarities between the two items of the true and false pairs. To address this issue, the normalized mean and range over all items for four of the five descriptors mentioned earlier (Signal Level, ZCR, Centroid and Roughness) were used as coordinates in an 8-dimensional space. Considering the Euclidean distance between these items as a dissimilarity index, an unpaired t-test revealed that the mean distance between items of the false pairs was significantly higher than the distance between items of true pairs (t(38) = 2.56, p < .05). The main outcome of this analysis was to point out that belongingness judgments were probably influenced to an important degree by the surface similarities, notably for elements of Themes 3 to 5.
Another way to investigate this issue was to assess whether the proportion of belongingness judgments might be partly explained by the empirical similarities (number of listeners classing two excerpts together in a free classification task) obtained by McAdams, Vieillard, Houix & Reynolds (this CD-Rom). Significant correlations were observed for musicians, r(18) = .82, p < .001, and nonmusicians, r(18) = .69, p < .005, suggesting that musical excerpts that were more often classed together in their experiment tended to be judged as belonging to the same theme in the present experiment. This finding raises the question of the importance of listening to the themes three times before making the belongingness and temporal-order judgments. Our initial assumption was that it should help participants to understand better, to memorize the musical relationships underlying the themes of The Angel of Death, and then to perceive their time-oriented qualities. The fact that perceptual similarity accounted for a significant part of the belongingness judgments points to the possibility that previous listening might have only a weak impact on the comprehension of the thematic materials. A control experiment was run with musicians to examine this issue further (see below).
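The dissimilarity index described above can be sketched in a few lines. The sketch assumes that each excerpt is represented by per-frame values of the four descriptors, and that "normalized" means a z-score of each coordinate across all excerpts; the normalization actually used is not specified in the text, and the data layout and function names are hypothetical.

```python
from math import dist, sqrt

DESCRIPTORS = ("level", "zcr", "centroid", "roughness")


def raw_coordinates(excerpt):
    """Mean and range of each descriptor for one excerpt, giving 8 raw coordinates.

    excerpt maps a descriptor name to its list of per-frame values.
    """
    coords = []
    for name in DESCRIPTORS:
        values = excerpt[name]
        coords.append(sum(values) / len(values))
        coords.append(max(values) - min(values))
    return coords


def zscore_columns(rows):
    """Normalize each coordinate across excerpts (assumed z-score normalization)."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # A constant column has zero spread; dividing by 1.0 leaves it at zero.
    spreads = [
        sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
        for col, m in zip(columns, means)
    ]
    return [
        [(v - m) / s for v, m, s in zip(row, means, spreads)]
        for row in rows
    ]


def pair_distance(normalized_rows, i, j):
    """Euclidean distance between excerpts i and j in the normalized space."""
    return dist(normalized_rows[i], normalized_rows[j])
```

Distances computed this way for the true pairs and the false pairs would then be the two samples entering the unpaired t-test.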
Temporal order. In this task, participants had to find the correct temporal order of the excerpts that had been judged as belonging to the same theme. In our view, if they actually did perceive the time-oriented qualities of the theme, knowing which excerpt occurred first should be a rather easy task. For this analysis, only the correct responses to the belongingness judgment were considered because these are the only ones that make musical sense with respect to the thematic materials under study. A first analysis showed that musicians managed to correctly respond above chance level (60%), single-sample t(23) = 2.59, p = .02, whereas nonmusicians did not (51%), t(23) = .59, p = .56. However, this analysis raised some difficulties because several participants did not actually respond above chance for the belongingness judgments. A second analysis was thus performed with the 18 musicians and 13 nonmusicians that responded above chance in the belongingness task. This analysis no longer revealed a difference between the two subgroups of participants, with 55% and 54% correct responses to the temporal order question for musicians and nonmusicians, respectively. Only the musicians' performance levels were marginally above chance, t(17) = 2.00, p = .06 for musicians, t(12) = 1.41, p = .18 for nonmusicians. The highest individual score was 70% for a musician and 69% for a nonmusician, suggesting that at least some participants captured the time-oriented qualities of the themes.
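The single-sample comparisons against chance reported above follow the usual one-sample t statistic. The sketch below assumes one proportion-correct score per participant and a chance level of .5 for the two-alternative order judgment; the function name is hypothetical and the scores of the study are not reproduced here.

```python
from math import sqrt


def one_sample_t(scores, chance=0.5):
    """t statistic and degrees of freedom for mean(scores) against a chance level.

    Assumption: scores holds one proportion-correct value per participant.
    """
    n = len(scores)
    mean = sum(scores) / n
    # Unbiased sample variance (n - 1 in the denominator).
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    standard_error = sqrt(variance / n)
    return (mean - chance) / standard_error, n - 1
```

With 24 participants per group this yields 23 degrees of freedom, matching the t(23) values of the first analysis; the subgroups of 18 and 13 participants give 17 and 12.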
A control experiment was added in order to assess whether the exposure phase was of critical importance for both belongingness and temporal-order judgments. Given that participants seemed to perform the belongingness task mostly on the basis of surface similarities, removing the exposure phase should not seriously affect participants' performance levels. By contrast, we expected this change to have a detrimental effect on the temporal-order task. Given the fact that this latter task was better performed with musician listeners, the control experiment was run only with musically trained participants.
Seven volunteer candidates for the final diploma of the conservatory of Dijon (referred to below as musicians) participated in the experiment and were paid 7€. The stimuli were the same as in Experiment 1, but the participants performed only the second phase of Experiment 1, i.e. they were required to perform the belongingness and time-order tasks without the three previous listenings.
The percentages of correct belongingness judgments were similar to those of Experiment 1 (63% versus 62%). Once again, we found that true pairs were easier to recognize when the second element of the pair corresponded to the first subsection of the theme rather than to another subsection (66% versus 56%). Further, false pairs were easier to reject correctly when the second element of the pair corresponded to the first section of another theme (89% versus 52%). This suggests that the belongingness judgments were performed with roughly the same accuracy with and without three previous active hearings of the theme. Taken in combination with the fact that belongingness judgments were highly correlated with similarity judgments, this finding suggests that belongingness judgments were mostly driven by surface similarities rather than by more elaborated rhetorical functions of the thematic sections. The percentage of correct temporal-order judgments found in Experiment 2 decreased drastically compared with that of Experiment 1 (43% versus 60%), t(19) = 2.44, p = .02. This result indicates that a single hearing of the thematic excerpts outside of their full musical context is not sufficient to recognize accurately the temporal order of the excerpts. That is to say, the time-oriented quality is not an intrinsic quality of a musical excerpt: it emerges from the way the excerpts were temporally organized by the composer, and previous hearings of the themes are probably essential for listeners to capture these aspects.
The purpose of this study was to characterize how listeners follow the musical progression of each theme of Reynolds' The Angel of Death and whether or not they perceive the time-oriented qualities of the subsections of these themes. When the temporal structure of the theme is based on a series of contrasted subsections, participants correctly identify these subsections, which generally correspond to those delineated by the composer. This is notably the case for Themes 4 and 5. An analysis of the acoustic and psychoacoustic features of the themes suggested that they had a substantial influence for Themes 4 and 5 on the detection of new musical ideas. By contrast, when the themes had a strong temporal directionality, the perceived musical ideas corresponded to identifiable changes in the musical progression, but these changes did not necessarily correspond to the composer's generative subsections. This is notably what happened in Themes 1, 2 and 3. It is worth noting that in this case, the segmentation performed can be accounted for weakly by surface features. In our view, this result suggests that, at least for Themes 1 and 2, participants' segmentations were strongly influenced by the rhetorical structure of the themes, at least by some rhetorical functions. The subsections that embody the function of digression evoked the strongest peaks of segmentation. To the contrary, the subsections that embody the functions of development and continuity are less often perceived as a new musical idea, which is not surprising given the nature of these functions. One of the most striking findings of the study was that this ability to follow the musical progression was found in both musically trained individuals who were highly familiarized with contemporary music, as well as in musically untrained listeners clearly unfamiliar with this style. This finding is consistent with other work on contemporary music (Deliège, 1989). 
Taken together they suggest that contemporary musical structures do not require long, extensive training to be processed, a finding that contradicts commonly held beliefs concerning the putative extreme complexity of contemporary musical style.
A weak effect of musical expertise was nevertheless found when analyzing the durations of the perceived musical ideas. On the whole, nonmusicians tended to perceive musical ideas of shorter duration. The most striking difference concerns the fact that several seconds of music is necessary to define a musical idea for musicians, which is not always the case for nonmusicians who perceived ideas as short as one second. This difference may highlight a stronger difficulty in nonmusicians to integrate musical events into perceptual units. In addition the analysis of the duration of perceived musical ideas indicated that they are not temporally constrained by the hypothesized size of the perceptual present (Fraisse, 1957) of 7 s, and they can, in some cases, be as long as what Levinson (1997) has called quasi hearing (up to 30 s).
The present study also pointed out the difficulty participants encountered as soon as they were required to perform more abstract tasks such as judging whether two excerpts belonged to the same theme, and which one occurred first in the theme. The performance levels found for these tasks in Experiment 1 were just above chance, which was rather disappointing if we consider that participants had previously heard the themes three times. Experiment 2 added supplementary information showing that previous listening did not actually influence belongingness judgments. Belongingness judgments were probably influenced to an important degree by the surface similarities. An analysis of the acoustic and psychoacoustic features of the themes revealed that the mean distance between items of the false pairs was significantly higher than the distance between items of true pairs, notably for elements of Themes 3 to 5. Moreover, a comparison with classification judgments reported by McAdams et al. (this CD-Rom) indicated that belongingness judgments were mostly driven by surface similarities between the excerpts presented in the pair. This outcome is also in agreement with previous data (Lamont & Dibben, 2001), underlining the fact that listeners use surface attributes rather than "deep" levels of structure for their similarity judgments. Perhaps, more repeated hearings of the themes could improve the performance in the belongingness task and could lead participants to use deeper (motivic) similarities as Pollard-Gott (1983) has shown. The sole benefit created by the three previous hearings was limited to the temporal-order judgments, which were better performed in Experiment 1 than in Experiment 2. However, it remains unclear whether this advantage was caused by the memorizing of the sequential order of the subsections in the themes, or if participants also captured some aspects of their logical (rhetorical) order.
The most surprising performance in these tasks was that participants seemed to capture intuitively the temporal and rhetorical structure of the themes in the online segmentation task, but encountered some difficulties in performing belongingness judgments. This apparent paradox may simply suggest that musical structure is easy to follow online, while remaining extremely difficult to represent in an abstract way that allows accurate judgments to be made concerning how parts of the whole theme are articulated together. This difficulty is unlikely to be specific to contemporary music. Experimental studies requiring participants to solve very simple musical jigsaw puzzles with Western tonal music also reveal that participants (irrespective of their musical training) encounter considerable difficulties with these tasks (Tillmann, Bigand & Madurell, 1998).