ISMIR 2019 and Human-Centric MIR

I had the pleasure of attending ISMIR properly for the first time in its 20 years of bringing together music technology specialists. Through the main meeting and satellite events ran a theme of how these systems for interpreting, organising, and generating musical materials impact our musical cultures. Whether or not researchers are worried about the ethical dimensions of their work, those dimensions need considering.

This issue was a fixture of the first workshop on Designing Human-Centric MIR Systems, where I presented a talk titled Human Subtracted: Social Distortion of Music Technology (slides, extended abstract).

The social functions of music have been broken by successive music technology advances, bringing us to the current “boundless surfeit of music” (Schoenberg) navigated with only the faintest traces of common interests retained in personalised music recommendation systems. This paper recounts the desocialisation of music through sound recording, private listening, and automated recommendation, and considers the consequences of music’s persistent cultural and interpersonal power through this changing use.

This workshop featured a number of contributions on the impacts and opportunities of recommendation systems for music, and I recommend anyone interested in this issue check out the proceedings.

MIR for good was also a project at this year’s WiMIR event, a sort of mini-hackathon designed to encourage greater engagement by women in the field, with hands-on projects, opportunities to find mentors, and other activities. One group started a working document to discuss ethical guidelines for music information research. (I played with Eurovision music with Ashley Burgoyne and others. Check out Tom Collins’ interactive plots of past songs, ordered by key audio features. Yes, you can play the tunes!)

The next afternoon brought a fantastic tutorial on Fairness, Accountability and Transparency in MIR. The talks drew on lessons learned from machine learning systems in other parts of life and discussed how these play out with music. We worked through scenarios where music information determines access, opportunity, financial compensation, and the interests of minority communities. It was a good time that raised many more questions than we could answer.

And during the conference proper, this question of ethics and good practice for MIR came up again in the keynote by Georgina Born called MIR redux: Knowledge and real-world challenges, and new interdisciplinary futures. The abstract:

How can MIR refresh itself and its endeavors, scholarly and real world? I speak as an outsider, and it is foolhardy to advise scientist colleagues whose methodologies one would be hard pressed to follow! Nonetheless, my question points in two directions: first, to two areas of auto-critique that have emerged within the MIR community – to do with the status of the knowledge produced, and ethical and social concerns. One theme that unites them is interdisciplinarity: how MIR would gain from closer dialogues with musicology, ethnomusicology, music sociology, and science and technology studies in music. Second, the ‘refresh’ might address MIR’s pursuit of scientific research oriented to technological innovation, itself invariably tied to the drive for economic growth. The burgeoning criticisms of the FAANG corporations and attendant concerns about sustainable economies remind us of the urgent need for other values to guide science and engineering. We might ask: what would computational genre recognition or music recommendation look like if, under public-cultural or non-profit imperatives, the incentives driving them aimed to optimise imaginative and cultural self- and/or group development, adhering not to a logic of ‘similarity’ but diversity, or explored the socio-musical potentials of music discovery, linked to goals of human flourishing (Nussbaum 2003, Hesmondhalgh 2013)? The time is ripe for intensive and sustained interdisciplinary engagements in ways previously unseen. My keynote ends by inviting action: a think tank to take this forward.

Go watch it. It was really good!

Around all the other research topics at this conference, the question of how to do MIR well, to do this work without causing harm, was never far from my mind. And I expect it will continue to echo as we prepare to host next year’s ISMIR in Montreal.

Dissertation in the Open on ProQuest

Finally, my PhD dissertation is posted in full on ProQuest, open access to all. It’s a bit of a behemoth at 64 MB and 441 pages, but if you want to know everything about involuntary respiratory phase alignment to music, this is the document for you.

The first half is a lot of technical details on how to get relevant timing information from respiratory sequence recordings. The second is a long analysis of when alignments arise and what that says about how our breathing is engaged by what we are listening to.

I feel lucky to have had the time to dig deep in the analysis. The patterns are investigated from the perspective of the individual musical works used as stimuli AND from the perspective of individual listeners in five case studies. And as associations arose between respiratory behaviour and musical events, the last section focuses on how these relate to known (or hypothesized) respiratory control mechanisms. By studying the details, this dissertation goes from detection of a little known phenomenon to testable hypotheses about causal mechanisms. I look forward to putting each of these to the test.

I’m really proud of this work. It brings together research from multiple fields and makes use of all my formal training plus a lot I had to learn outside of the classroom. (Like everything about the respiratory system. That certainly wasn’t covered in math, or music theory, or psych classes…)

Full Abstract:

This dissertation explores the surprising phenomenon of listeners’ unconsciously breathing in time to music, inspiring and expiring at select moments of specific works. When and how the experience of hearing music might produce stimulus-synchronous respiratory events is studied through Repeated Response Case Studies, gathering participants’ respiratory sequences during repeated listenings to recorded music, and through Audience Response Experiments, gathering responses from participants experiencing live music together in a concert hall.

Activity Analysis, a new statistical technique, supported the development and definition of discrete phase components of the breath cycle that come into coordination: the onsets of inspiration and expiration, the intervals of high flow during these two main phases, and the post-expiration pause. Alignment in these components across listenings illuminates when the naturalistic complex stimuli can attract or cue listener respiration events.

Four patterns of respiratory phase alignment are identified through detailed analysis of stimuli and responses. Participants inspired with the inspirations of vocalists and wind performers, suggesting embodied perception and imagined action may exert influence on their quiet breathing. Participants suppressed and delayed inspirations when the music was highly unpredictable, suggesting adaptation in aid of auditory attention. Similar behaviour occurred with sustained sounds of exceptional aesthetic value. Participants inspired with recurring motivic material and similar high salience events, as if marking them in recognition or amplifying their affective impact. And finally, participants occasionally breathed following structural endings, suggesting a sigh-like function of releasing the respiratory system from cortical control.

These instances of respiratory phase alignment to music seemed to be stronger in participants who were typically active with heard music, but the impact of training and expertise was not a simple condition for this behaviour. Contrasts between case study participants showed highly idiosyncratic patterns of respiratory alignment and differences in susceptibility alongside moments of shared effect. In the audience experiments, alignment within phase components was measurable and significant, but rarely involved more than a quarter of participants in any given instance. These levels of concurrent activity in respiration underline the subtlety of this bodily response to music.
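
For readers who like to see things in code: here is a minimal sketch of how the breath-cycle phase components named in the abstract (inspiration and expiration onsets, the high-flow intervals, and the post-expiration pause) might be pulled out of a respiratory flow signal. The smoothing window, thresholds, and zero-crossing logic are my own illustrative choices for this sketch, not the procedures used in the dissertation.

```python
import numpy as np

def segment_breath_cycle(flow, fs, flow_thresh_frac=0.2):
    """Illustrative segmentation of a respiratory flow signal (inspiration
    positive) into the phase components described above. The smoothing
    window and threshold fraction are assumptions for this sketch."""
    # Light smoothing so noise does not create spurious zero crossings.
    win = max(1, int(0.1 * fs))
    smooth = np.convolve(flow, np.ones(win) / win, mode="same")

    # Phase onsets: zero crossings of the smoothed flow.
    sign = np.sign(smooth)
    inspiration_onsets = np.where((sign[:-1] <= 0) & (sign[1:] > 0))[0] + 1
    expiration_onsets = np.where((sign[:-1] >= 0) & (sign[1:] < 0))[0] + 1

    # High-flow intervals: samples where flow exceeds a fraction of its peak.
    thresh = flow_thresh_frac * np.max(np.abs(smooth))
    high_flow_inspiration = smooth > thresh
    high_flow_expiration = smooth < -thresh

    # Post-expiration pause (crudely): near-zero flow regions, which a more
    # careful pipeline would restrict to the span between an expiration and
    # the next inspiration onset.
    pause = np.abs(smooth) <= thresh

    return {
        "inspiration_onsets": inspiration_onsets,
        "expiration_onsets": expiration_onsets,
        "high_flow_inspiration": high_flow_inspiration,
        "high_flow_expiration": high_flow_expiration,
        "pause": pause,
    }
```

In practice the smoothing and thresholds would need tuning per participant and per sensor, which is exactly the sort of ground the technical first half of the document covers with far more care.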

And if you want to know more than what you can find in the document, or borrow scripts/data that haven’t been posted yet, get in touch!

A(nother) definition of music

At last summer’s SMPC, I shared a quasi-interactive poster with my most current definition of music. The poster invited viewers to add examples or counter-examples of musical experiences via post-its wherever it seemed spatially appropriate. Since then, the poster has been in the PhD office at NYU, and a couple more edge cases have been added. Still, the definition stands.

It goes as follows:

Music is a broadcast signal enabling sustained concurrent action.

My claim is that these six terms form a necessary condition for something to be perceived as music or musical. Perception here is relevant because our processing of sensory information adapts to extract useful information from sounds and signals, and the relevance of music and its various qualities is displayed in the structure of these perception strategies. But by using our perceptual processes to define music, the associated experiences might not all fall within our culture’s delimitations of the concept.

The attached poster does the work of explaining each of the terms and their relevance, but I’ll add an important challenge to the definition.

“What about the wildebeests?”

This was asked by a fellow grad student, with a grin, but the question is reasonable. A herd of wildebeests running sounds and feels thunderous, any member of the herd would hear it as coming from its herd-mates, and this sound inspires a strong impulse to run too, an obvious instance of sustained concurrent action. So is the sound of a running herd music to a wildebeest’s ears? I would have to say maybe, conditioned on the two remaining terms: signal and enabling. For the sound to be a signal, it would have to transmit some kind of intentional herd-running, individual members falling into a special running style, with perhaps some extra regularity or heaviness to their gait. The enabling bit is a little more tricky. Music doesn’t determine action; instead, it gives us some well-fitting options. For the sound of a running herd to enable a single wildebeest’s actions, said wildebeest should be able to resist the suggestion to join in and have some choice as to how, if the suggestion is accepted. Having no familiarity with the running habits of ungulates of any kind, I can’t be more specific.

A similar human case came to mind recently when I crossed paths with #OrangeVest, a performance art piece by Georgia Lale about the ongoing Syrian Refugee crisis. A block of some twenty adults in orange life vests were marching slowly and silently through the streets of New York, with helpers around to shoo traffic and explain the action. In an instant, I recognized the deliberateness in their movements, their aura of stillness, and I felt the tug to step in line. But instead, I waited for them to pass and looked up the project later. If you feel inspired to lend some (more) support to the cause, consider donating to MOAS, Refugee Support Network, or your preferred means of distributing humanitarian aid.

Music and coordinated experience in time: Back to Activity Analysis

There are two comically extreme positions on how music (or really any stimulus) affects observers. At one end is the position that all of our experiences are equivalent, dictated by the common signal; at the other, individual subjectivities make our impressions and reactions irreconcilable. In studying how people respond to music, it’s obvious that the reality lies somewhere in the middle: parts of our experience can match that of others, though differences and conflicts persist. I’ve spent years developing this thing called activity analysis to explore and grade the distance between absolute agreement and complete disarray in the responses measured across people sharing a common experience.

As people attend to a time-varying stimulus (like music), their experience develops moment by moment, with changes prompted by events in the action observed. What we have, in activity analysis, is a means of exploring and statistically assessing how strongly the shared music coordinates these changes in response. So if we are tracking smiles in an audience during a concert, we can evaluate the probability that those smiles are prompted by specific moments in the performance, and from there have some expectation of how another audience may respond.
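
The statistics behind activity analysis aren’t spelled out in this post, but the core move (asking whether response events cluster at particular moments more than chance would allow) can be sketched with a simple permutation test. The function name, the binary event matrix, and the circular-shift null below are illustrative assumptions on my part, not the actual Activity Analysis procedure.

```python
import numpy as np

def activity_coincidence_test(events, n_perm=2000, seed=0):
    """Illustrative test of whether binary response events (e.g. smile onsets),
    one row per participant and one column per time bin, cluster at particular
    moments more than chance. A sketch of the general idea only."""
    rng = np.random.default_rng(seed)
    events = np.asarray(events, dtype=int)
    n_participants, n_bins = events.shape

    observed = events.sum(axis=0)  # event counts per time bin across people

    # Null distribution: circularly shift each participant's events by a
    # random offset, which keeps their event count and internal rhythm but
    # breaks any alignment to the shared stimulus.
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        shifted = np.vstack([np.roll(row, rng.integers(n_bins)) for row in events])
        null_max[i] = shifted.sum(axis=0).max()

    # Compare each bin's observed count to the null maxima, so the many bins
    # tested are accounted for (a max-statistic correction).
    p_per_bin = (null_max[None, :] >= observed[:, None]).mean(axis=1)
    return observed, p_per_bin

# Toy example: 20 participants, 200 time bins, with a shared reaction at bin 120.
rng = np.random.default_rng(1)
data = (rng.random((20, 200)) < 0.02).astype(int)
data[:12, 120] = 1  # a moment most of the audience responds to
counts, p_values = activity_coincidence_test(data)
print(counts[120], p_values[120])  # expect a high count and a small p-value
```

Bins that still stand out against this null suggest coordination driven by the shared performance rather than by each listener’s private rhythm of responding.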

If everyone agreed with each other, this would not be necessary, and if nothing were common between listeners’ experiences, this would not be possible. Instead, empirical data appears to wander in between, and with that variation comes the opportunity to study factors nudging inter-response agreement one way or the other. We’ve seen extreme coherence, that of the crowd singing together at the top of their lungs in a stadium saturated with amplified sound, while polite but disoriented disengagement is a common response to someone else’s favourite music. We need to test the many theories on why so many different responses (and distributions of responses) arise from shared experiences, and Activity Analysis can help with that. Finally.

Here is hoping I can get back to sharing examples of what this approach to collections of continuous responses makes possible. The data and analyses have waited too long already.