Student Symposium Abstracts - 2010
Oral Presentations (9:00-10:00, 11:00-12:00, 1:00-2:00)/ Présentations orales (9:00-10:00, 11:00-12:00, 13:00-14:00)
9:00 Maryse Lavoie: Classical Guitar Timbre's Lexicon: Verbal Data Analysis
Whereas in acoustics the definition of timbre refers to a difference between two sounds having the same pitch, loudness and duration, in musical performance one needs to consider the limits of this definition. This research project addresses the notion of timbre, verbal data and verbal data analysis. Furthermore, a psycholinguistic analysis of classical guitarists' musical perception and cognition of timbre will investigate the relations between timbre, register, intensity and duration, in order to better understand the roles of these elements in the description of timbre. The project stems from a psycholinguistics internship (09-10.2009), financed by CIRMMT, which took place at LAM under the direction of Danièle Dubois.
9:20 Anna Tirovolas: Exploring musical counterparts to phonemic awareness in the prediction of musical sight-reading
Several studies have suggested that the ability to attend to musical
patterns is a key cognitive skill associated with sight-reading (SR)
performance in music. This project takes a behavioural approach to the
study of SR, using a melodic phrase assembly task aimed at measuring
auditory awareness of musical patterns. The “Blending”
task from the Comprehensive Test of Phonological Processing (CTOPP), a
task in which individual phonemes are typically assembled to make a
word, was adapted to music to explore its potential in accounting for
variability in SR performance. In the musical version, participants
were asked to attach rhythmic information to a series of individual
notes to create a melody. The auditory sequences include notes of equal
duration (250 ms.) separated by equal inter-onset intervals (500 ms.)
which, when assembled, make up familiar melodies. Participants were
asked to sing/hum back the melodies incorporating rhythmic information,
and also recorded a sight-reading performance on the piano.
9:40 Bruno Giordano: Cortical processing of environmental sounds
Brain imaging studies of environmental sounds reveal that listening to sounds from different categories (e.g., living, nonliving) selectively activates separate cortical areas. We investigated: [1] which categorical distinction is most reliably mapped to a cortical selectivity (vocal vs. nonvocal; living vs. nonliving; human vs. nonhuman); [2] the neural substrates for the processing of various features of the sound stimuli. Both univariate (activation) and multivariate (information) analyses showed a particularly strong cortical sensitivity to the vocal vs. nonvocal distinction. Multivariate analyses revealed distinct temporal cortex regions dedicated to the processing of spectral, periodicity-related, and temporal sound features.
11:00 Jimmie LeBlanc and Ida Toninato: Body of Noise: composition, interpretation and archetypal analysis
"Body of Noise: composition, interpretation and archetypal analysis" is a project that combines the composition of a new work for baritone saxophone and electronics, with a view to its concert performance, and the development of an archetypal approach to musical interpretation and analysis. On the one hand, we pursue a reflection on certain fundamental concepts in the field of digital audio processing, such as "metamorphosis", which we approach as a formal model giving us access to certain universal mental archetypes (cf. C. G. Jung). In doing so, we seek to establish a common conceptual basis between performer and composer, and also to envisage a method of musical analysis that gives the performer poetic means of appropriating the technical aspects of the work to be performed. To carry out our study, we will analyse four works of mixed music, in addition to the work to be composed as part of the project.
11:20 Valorie Salimpoor and Mitchel Benovy: Musical Chills: Linking Emotion and Pleasure in the Brain
Emotion and reward have long been associated, but how emotions become
pleasurable is not entirely clear. Music provides an excellent medium
to examine this relationship, since the temporal and dynamic nature of
musical stimuli allows for an examination of build-up in emotional
arousal and how this may contribute to pleasure. In contrast to
previous fMRI experiments that have used experimenter-selected music,
we used self-selected music, thereby allowing for a fuller range of
emotional experience. The “musical chills” response, a marker of peak
autonomic nervous system activity, was used to index intense emotional
arousal. Functional MRI scans were collected as individuals listened
to music while providing continuous subjective ratings of pleasure.
Time-series analysis revealed distinct patterns of activity in
different portions of striatal, limbic, and frontal regions during
periods leading up to the peak of emotional arousal as opposed to
during and after this moment, providing a glimpse of how a build-up in
emotional arousal can lead to pleasurable feelings.
11:40 Erika Donald, Eliot Britton, et al.: The Expanded Performance Trio
Moving beyond the developmental phases of digital musical instruments into an environment governed by the aesthetic and practical considerations of musical creation and performance presents many challenges. The Expanded Performance Trio aims to achieve a versatile, stable and streamlined setup for live electronic performance by exploring methods and models derived from traditional chamber music within a fixed framework. We will discuss our emerging approach and perform a new work for our ensemble of digital turntables, v(irtual)-drums and percussion, and electric cello with K-Bow sensor bow.
Feature extraction is a crucial part of any MIR task. In this work, we present a system that can automatically extract relevant features from audio for a given task. The feature extraction system consists of a Deep Belief Network (DBN) trained on Discrete Fourier Transforms (DFTs) of the audio. We then use the activations of the trained network as inputs to a non-linear Support Vector Machine (SVM) classifier. In particular, we learned features for the task of genre recognition. The learned features perform significantly better than MFCCs. Moreover, we obtain a classification accuracy of 84.3% on the Tzanetakis dataset, which compares favorably against state-of-the-art genre classifiers using frame-based features. We also applied the same features to the task of auto-tagging. The auto-taggers trained with our features performed better than those trained with timbral and temporal features.
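As a rough illustration of this kind of pipeline (and not the system described above), the sketch below stacks scikit-learn RBMs on DFT magnitude frames and feeds the learned activations to an SVM; the frame parameters, layer sizes and the random stand-in data are assumptions made for the example.
```python
# Hypothetical sketch of a DFT -> stacked-RBM -> SVM genre pipeline.
# This is NOT the presenters' implementation; layer sizes, frame length and
# the random stand-in audio/labels are illustrative assumptions.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def dft_magnitude_frames(signal, frame_len=1024, hop=512):
    """Magnitude spectra of overlapping frames (one row per frame)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

# X: (n_frames, n_bins) DFT magnitudes; y: genre label per frame (stand-ins).
X = dft_magnitude_frames(np.random.randn(44100))
y = np.random.randint(0, 10, size=len(X))

model = Pipeline([
    ("scale", MinMaxScaler()),                  # RBMs expect inputs in [0, 1]
    ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.01, n_iter=20)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.01, n_iter=20)),
    ("svm", SVC(kernel="rbf")),                 # non-linear classifier on activations
])
model.fit(X, y)
print(model.score(X, y))
```
The two RBM stages stand in for the greedy, layer-wise pretraining of a DBN; a full DBN would typically add supervised fine-tuning of the stacked layers.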
1:30 Vincent Freour and Jaime Sandoval Gonzalez: CIRMMT innovative funding project: Interaction between facial expression, gesture and breathing during singing
The issue of how people coordinate their own body systems as well as
their actions in group activities is highly pertinent to all aspects of
human behaviour. Music offers a unique domain for addressing the
coordination within and between people because, in addition to being
accomplished by groups of people, music is produced simultaneously by
each group member (in contrast to speech, where turn-taking is the
norm). To achieve a high-level performance, the performer needs to
simultaneously focus on internal and external parameters. Each
measurement system for movement, breathing, facial expression, and
related muscle activity provides copious amounts of data, all generated
by a single performer or group of performers; understanding how the
human system integrates these aspects into coherent musical behaviour,
and how scientific analyses may integrate the corresponding data, is a
complex time-series problem. A current CIRMMT research project aims
at conducting a pilot study on the integration and interaction of
multi-level movements within individuals (facial expression, gesture
and breathing).
Demos (10:00-11:00)/ Démonstrations (10:00-11:00)
1. Jason Hockman and Joseph Malloch: Interactive, Real-time Rhythm Transformation of Musical Audio
Time-scale transformations of audio signals have traditionally relied
exclusively upon manipulations of tempo. We present a real-time
technique for transforming the rhythmic structure of polyphonic audio
recordings using low-dimensional rhythmic transformation spaces derived
from analysis of annotated exemplar audio tracks. In this
transformation, the original signal assumes the meter and rhythmic
structure of an interpolated model signal, while the tempo and salient
intra-beat infrastructure of the original are maintained. Possible
control schemes include expressive control of rhythmic transformation
for live performance, and real-time feedback for task execution (e.g.
exercise, video games).
2. Charalampos Saitis: Perceptual Studies of Violin Quality
The sound quality of a violin depends upon a number of different, often subtle factors. Most of them are acoustical, referring to the way the instrument vibrates and radiates sound. However, there are non-acoustical factors that relate to the way the instrument “responds” to the actions of the player. There is an extensive volume of published scientific research on quality evaluation of violins, but most has traditionally focused on the characterization of the acoustical factors and ignored the player’s perspective. The instrument response is related to the feedback from the violin body to the string and its influence on the overall behavior of the instrument. How does the player “feel” the instrument? This is a critical aspect that has only recently been considered essential in developing an understanding of what distinguishes “good” and “bad” instruments.
3. Joseph Thibodeau and Avrum Hollinger: T.B.A.
4. Gabriel Vigliensoni: SoundCatcher: Explorations in audio-looping and time-freezing using an open-air gestural controller
SoundCatcher is an open-air gestural controller designed to control a looper and time-freezing sound patch. It uses ultrasonic sensors to measure the distance of the performer’s hands from the device, which is mounted on a microphone stand. Tactile and visual feedback, provided by a pair of vibrating motors and LEDs, informs the performer when her hands are inside the sensed space. In addition, the rotational speed of the motors is scaled according to each hand’s distance from the microphone stand, providing tactile cues about hand position.
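As a rough sketch of the distance-to-vibration mapping described above (not the actual SoundCatcher code), the example below scales an assumed ultrasonic reading into a motor drive value and an LED state; the sensing range and drive range are invented for the illustration.
```python
# Illustrative distance-to-feedback mapping (not the actual SoundCatcher code).
# The sensing range and PWM range are assumed values for the sketch.
SENSE_MIN_CM, SENSE_MAX_CM = 5.0, 80.0   # assumed ultrasonic sensing range
PWM_MIN, PWM_MAX = 0, 255                # assumed motor drive range

def hand_feedback(distance_cm):
    """Return (motor_pwm, led_on) for one hand's measured distance."""
    inside = SENSE_MIN_CM <= distance_cm <= SENSE_MAX_CM
    if not inside:
        return 0, False                   # outside the sensed space: no feedback
    # Closer hand -> faster vibration, scaled linearly over the sensed range.
    norm = (SENSE_MAX_CM - distance_cm) / (SENSE_MAX_CM - SENSE_MIN_CM)
    return int(PWM_MIN + norm * (PWM_MAX - PWM_MIN)), True

print(hand_feedback(20.0))   # e.g. (204, True)
```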
Posters (10:00-11:00)/ Posters (10:00-11:00)
1. Michel Bernays: Expression and gestural control of piano timbre
Musical expressivity in piano performance relies heavily on timbre. During the learning process, timbre is transmitted empirically through subjective verbal descriptions whose imagery fits the sonic nuances. Between this precise vocabulary and those subtle tone qualities, can we identify timbre correlates in a performer’s gesture on the keyboard? A pilot test was designed in which a professional pianist performed custom-made pieces with several timbres, designated by adjectives such as bright, distant and shimmering. Identification tests on the audio recordings indicated a semantic consistency between listeners’ answers and the timbres set in performance, consistent with a shared ability among pianists to identify and label timbre. The pianist’s gesture on the keyboard was captured as key-position and hammer-velocity data using the computer-controlled Bösendorfer CEUS grand piano. Gestural functions were computed as numerous features of articulation, overlap, dynamics and synchrony, and several significant correlations with timbre were found.
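As a hedged sketch of the kind of articulation and overlap descriptors mentioned above (not the actual CEUS analysis pipeline), the example below computes two common features from hypothetical note onset/offset times; the note list and exact feature definitions are illustrative assumptions.
```python
# Hypothetical sketch of two gestural descriptors (overlap and articulation
# ratio) computed from note onset/offset times; this is not the CEUS analysis
# pipeline, and the example note list is invented.
def overlap_and_articulation(notes):
    """notes: list of (onset_s, offset_s) tuples sorted by onset time."""
    features = []
    for (on, off), (next_on, _) in zip(notes, notes[1:]):
        ioi = next_on - on                 # inter-onset interval
        overlap = off - next_on            # >0 legato overlap, <0 detached gap
        articulation = (off - on) / ioi    # sounding duration / IOI
        features.append({"overlap_s": overlap, "articulation": articulation})
    return features

notes = [(0.00, 0.45), (0.50, 1.05), (1.00, 1.30)]   # invented example
for f in overlap_and_articulation(notes):
    print(f)
```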
2. Tariq Daouda: Rhythm Generation Using Reservoir Computing
Reservoir computing, the combination of a recurrent neural network and a memoryless readout unit, has seen a recent growth in popularity in time-series analysis and machine learning. This approach has been successfully applied to a wide range of time-series problems, including music, and usually comes in two flavours: Echo State Networks, where the reservoir is composed of mean-rate neurons, and Liquid State Machines, where the reservoir is composed of spiking neurons. In this work we show how a combination of an Echo State Network and several FORCE learners can be applied to automatic rhythm generation and rhythm learning.
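For readers unfamiliar with the approach, here is a minimal Echo State Network sketch; it uses a ridge-regression readout rather than the FORCE learners mentioned in the abstract, and the reservoir size, spectral radius and toy signal are all assumptions made for the example.
```python
# Minimal Echo State Network sketch (ridge-regression readout rather than
# FORCE learning); all sizes and constants are assumed values.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 200
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # scale spectral radius to 0.9

def run_reservoir(inputs):
    """Collect reservoir states for a (T, n_in) input sequence."""
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = np.tanh(W @ x + W_in @ u)              # leakless state update
        states.append(x.copy())
    return np.array(states)

# Toy task: predict the next value of a periodic "rhythm-like" signal.
u = np.sin(np.linspace(0, 40 * np.pi, 2000))[:, None]
X = run_reservoir(u[:-1])
y = u[1:, 0]
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
print("train MSE:", np.mean((X @ W_out - y) ** 2))
```
Only the readout weights W_out are trained; the recurrent reservoir stays fixed, which is the defining trait of reservoir computing.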
3. Francois Germain: Synthesis of guitar by digital waveguides: modeling the plectrum in the physical interaction of the player with the instrument
In this paper, we provide a model of the plectrum, or guitar pick, for use in physically inspired sound synthesis. The model draws from the mechanics of beams. The profile of the plectrum is computed in real time based on its interaction with the string, which depends on the motion imparted by the player and the equilibrium of dynamical forces. A condition for the release of the string is derived, which makes it possible to drive the digital waveguide simulating the string to the proper state at release time. The acoustic results are excellent, as can be verified in the sound examples provided.
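As context for the synthesis framework (and not a reproduction of the plectrum model itself), the sketch below implements a minimal Karplus-Strong-style plucked-string loop, a simple stand-in for the digital waveguide that such a plectrum model would excite; the pitch, damping and excitation are assumed values.
```python
# Minimal Karplus-Strong-style string loop, a simple stand-in for the digital
# waveguide that the plectrum model would drive; the beam-based plectrum model
# from the paper is not reproduced here, and all constants are assumptions.
import numpy as np

def plucked_string(f0=220.0, fs=44100, duration=1.0, damping=0.996):
    delay_len = int(fs / f0)                       # waveguide loop length
    line = np.random.uniform(-1, 1, delay_len)     # idealized "pluck" excitation
    out = np.empty(int(fs * duration))
    for n in range(len(out)):
        out[n] = line[0]
        # Loop filter: averaging two samples lowpasses and decays the string.
        new_sample = damping * 0.5 * (line[0] + line[1])
        line = np.roll(line, -1)
        line[-1] = new_sample
    return out

samples = plucked_string()
print(samples[:5])
```
In the paper's framework, the white-noise initialization above would be replaced by the state imposed on the string by the plectrum model at release time.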
4. Brian Hamilton: Theoretical and Practical Comparisons of the Reassignment Method and the Derivative Method for the Estimation of the Frequency Slope
In the context of non-stationary sinusoidal analysis, the theoretical comparison of the reassignment method (RM) and the derivative method (DM) for the estimation of the frequency slope is investigated. It is shown that for the estimation of the frequency slope the DM differs from the RM in that it does not consider the group delay. Theoretical equivalence is shown to be possible with a refinement of the DM. This refinement is evaluated on synthetic signals and shown to improve the estimation of the frequency slope. The differences between the two methods in terms of window and signal constraints are discussed to show when each method is more appropriate to use.
5. Jason Hockman and Joe Malloch : Interactive, Real-time Rhythm Transformation of Musical Audio
Time-scale transformations of audio signals have traditionally relied exclusively upon manipulations of tempo. We present a real-time technique for transforming the rhythmic structure of polyphonic audio recordings using low-dimensional rhythmic transformation spaces derived from analysis of annotated exemplar audio tracks. In this transformation, the original signal assumes the meter and rhythmic structure of an interpolated model signal, while the tempo and salient intra-beat infrastructure of the original are maintained. Possible control schemes include expressive control of rhythmic transformation for live performance, and real-time feedback for task execution (e.g. exercise, video games).
6. Trevor Knight and Adriana Olmos: Open Orchestra
T.B.A.
7. Charalampos Saitis: Perceptual Studies of Violin Quality
The sound quality of a violin depends upon a number of different, often subtle factors. Most of them are acoustical, referring to the way the instrument vibrates and radiates sound. However, there are non-acoustical factors that relate to the way the instrument “responds” to the actions of the player. There is an extensive volume of published scientific research on quality evaluation of violins, but most has traditionally focused on the characterization of the acoustical factors and ignored the player’s perspective. The instrument response is related to the feedback from the violin body to the string and its influence on the overall behavior of the instrument. How does the player “feel” the instrument? This is a critical aspect that has only recently been considered essential in developing an understanding of what distinguishes “good” and “bad” instruments.
8. Stephen Sinclair and Rafa Absar: Multi-modal Search Experiments
A system designed using audio and haptic cues to assist in a target-finding task is presented. The target is located at the center of a circular texture, where the haptic texture changes proportionally to the distance from the target; the auditory feedback similarly alters in pitch and loudness. The participant uses only the audio and haptic cues, with no visual cues, to complete the task. The data acquired include quantitative measures such as task completion time and trajectory path length. Qualitative data acquired with post-experiment questionnaires and interviews help in understanding the subjective preference of each user. Analysis of the acquired data should show whether one modality dominates the other in this scenario or whether a combination of the modalities gives optimal results. Quantitative results have not yielded any such significant conclusions; however, the qualitative data show that this depends highly on the individual modality preferences of each user.
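As a hedged illustration of the kind of distance-to-cue mapping described above (not the experiment's actual parameter values), the sketch below maps a normalized target distance to a pitch, a loudness and a haptic texture gain; the ranges and the linear curves are assumptions.
```python
# Illustrative mapping from target distance to the audio cue (pitch, loudness)
# and a haptic texture gain; ranges and curves are assumptions, not the values
# used in the actual experiment.
MAX_DIST = 0.3                    # assumed workspace radius in metres
F_NEAR, F_FAR = 880.0, 220.0      # assumed pitch range (Hz), higher when closer

def cues(distance_m):
    d = min(max(distance_m / MAX_DIST, 0.0), 1.0)  # normalized distance in [0, 1]
    pitch_hz = F_FAR + (F_NEAR - F_FAR) * (1.0 - d)
    loudness = 1.0 - d                  # louder as the target is approached
    texture_gain = 1.0 - d              # stronger haptic texture near the target
    return pitch_hz, loudness, texture_gain

print(cues(0.05))    # close to the target: high pitch, strong feedback
```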
9. Joseph Thibodeau and Avrum Hollinger: T.B.A.
10. Jessica Thompson: Additions and Improvements to the ACE 2.0 Music Classifier
The Autonomous Classification Engine (ACE) is a framework for using and optimizing classifiers. In what is called meta-learning, ACE experiments with a variety of classifiers, classifier parameters, classifier ensembles and dimensionality-reduction techniques in order to arrive at a configuration that is well-suited to a given problem. Improvements and additions have been made to ACE, relative to the previously published ACE 1.1, in order to increase its functionality as well as to make it easier to use and incorporate into other software frameworks.
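As a toy sketch of the meta-learning idea behind ACE (not ACE's own interface), the example below cross-validates several classifier and dimensionality-reduction configurations with scikit-learn and keeps the best one; the candidate models, parameter values and the placeholder dataset are arbitrary choices for the illustration.
```python
# Toy sketch of the meta-learning idea behind ACE: try several classifier /
# dimensionality-reduction configurations and keep the best cross-validated
# one. This uses scikit-learn as a stand-in and is not ACE's own API.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)   # placeholder features/labels

search = GridSearchCV(
    Pipeline([("reduce", PCA()), ("clf", SVC())]),
    param_grid=[
        {"reduce__n_components": [2, 3], "clf": [SVC(C=c) for c in (0.1, 1, 10)]},
        {"reduce__n_components": [2, 3], "clf": [KNeighborsClassifier(k) for k in (1, 5)]},
        {"reduce__n_components": [2, 3], "clf": [DecisionTreeClassifier(max_depth=d) for d in (3, None)]},
    ],
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```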
11. Finn Upham: Are they listening? Comparing biosignals collected during silence and live performance
Biosignals are direct but messy indications of individuals' responses to their environment. Individual listeners' personal thoughts and physical state can induce changes in readings that could easily be misattributed to the experimental conditions. Before evaluating the biosignals in relation to the details of the music, we need to determine some basis for relating their activity to the stimulus. From an experiment with the Audience Response System in October 2009, we have biosignals recorded from 45 individuals during baseline readings (silence) and during the live performance of three musical pieces. Using these readings, we can model the audience's relaxed state and active phases in terms of facial muscle activation (EMG) and the amplitude and frequency of respiration and pulse. Besides providing evidence that the participating audience was stimulated by the performance, we can argue that the differences in activity in some biosignals indicate a shift in attention to the stage.
12. Gabriel Vigliensoni: SoundCatcher: Explorations in audio-looping and time-freezing using an open-air gestural controller
SoundCatcher is an open-air gestural controller designed to control a looper and time-freezing sound patch. It uses ultrasonic sensors to measure the distance of the performer’s hands from the device, which is mounted on a microphone stand. Tactile and visual feedback, provided by a pair of vibrating motors and LEDs, informs the performer when her hands are inside the sensed space. In addition, the rotational speed of the motors is scaled according to each hand’s distance from the microphone stand, providing tactile cues about hand position.
Keynote Address (2:00-3:00)/ Conférence invitée (14:00-15:00)
Joe Paradiso: Interaction in the Post-Convergence Age - From Electronic Music Controllers to Ubiquitous Dynamic Media
Electronic and Computer Music have passed through a succession of revolutions as the means of expression in new domains have become ever more democratized - synthesis is now a very capable real-time accessory on standard computers or mobile devices, and new modalities of interaction, popularized by the massive success of products like Guitar Hero, are undergoing an explosion of grass-roots innovation through simple hardware toolkits like the Arduino and distributors like SparkFun. Indeed, it's a great time to think about what might be coming next. In this talk I'll try to inform this conjecture by looking a bit at the past and trying to poke at the future. I'll show examples of several musical controllers developed by my group and others that explored different modalities, then show some examples of current research aimed at ubiquitous interaction with generalized electronic media.