distributed research experience for undergraduates

Katie Wolf - DREU Experience 2010

University of Minnesota - University of Southern California

ISMIR 2010

Here I have recorded my view of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010). Not only was this my first time at the conference, it was also my first time in Europe, so I thought it would be a good idea to keep track of what I did and what I was thinking.

I also was able to present my summer research at the late breaking/demo session. Here is the abstract and the poster.

Saturday 7 / Sunday 8 August 2010

After taking the time to figure out what I would need on the trip and packing a bit too much stuff, I set out on a fairly stressful journey to the Netherlands. Since the conference did not begin until Monday, I spent the weekend with others who had arrived early, visiting some of the sights in Utrecht and making a side trip to Rotterdam. It was really great to have made some connections even before the conference began.

Monday 9 August 2010

After registering for the conference and getting the conference bag of goodies, there was time for socializing before the tutorials began. As someone who was interested in the field, but not quite sure if it was the place for me, I felt like I had nothing to lose by introducing myself to people and being the gregarious person that I usually am not. And I think that is what made the conference great for me. It was as simple as spotting a person in the crowd and approaching them to ask how they were enjoying the conference or the session that they were in. Where were they from? What were they working on? It was really quite easy, since I was very interested in finding out who and what this community was.

The first tutorial I went to was on 'Pattern Discovery and Search Methods in Symbolic MIR' presented by Ian Knopke (BBC) and Eric Nichols (Indiana University). I really was not sure what to expect. Since the research that I have been doing at USC involves MIDI data (which is symbolic), there were a lot of concepts that I understood. It was kind of a shock to know that I could be on the same level of understanding as other people attending this conference. Even though I had never heard of MIR before this summer, suddenly I was included in this community. It was also very rewarding to know that I had actually learned and retained information while working on an undergraduate research project over the summer, which is more than I can say for some of the courses that I have taken in my studies. There were things that I was not as familiar with as well, which made the session very balanced for me.

The second half of the tutorial focused on the cognitive approaches to MIR, which was a great introduction to the other side of the conference that I was not as familiar with. The idea of how music is perceived and how the human brain recognizes and reacts to music is not something that I had considered while studying the physical properties of music. This presentation made me think of how the perception and cognition of music may have affected the experiments that were done for the project at USC. The data that I am working with involves pianists playing a duet with delay added to what they were each hearing of the other player. One of the conclusions made on the project is that the order in which the players play the individual pieces may play a role. The more times the players play the piece (regardless of the delay) the more they are learning to compensate for the delay. It made me wonder about studies done on the brain's cognition and reaction when playing as a part of a group. There were also some applications and programs that I had never heard of before, but would be good to look up and see what they have to offer.

Lunch, cafeteria style, was held in between the tutorials, and gave another opportunity to meet people. It would have been really interesting to take a picture on the first day and look back on the last day of the conference to see all of the people that I had met over the course of that week, but did not know at the time. By the time the conference was over, I had met a ton of people from all over the world.

The second tutorial I went to in the afternoon was titled 'A Music-orientated Approach to Music Signal Processing' presented by Meinard Mueller (Saarland University and MPI Informatik) and Anssi Klapuri (Queen Mary, University of London). The project that I worked on two summers ago (also for DREU) was on automatic recognition of bird species from audio. That was when I got my first overview of signal processing, and over this last summer working with music, I got a bit more of a taste of it. So going into the tutorial I had a small idea of what to expect. I ended up with a great review of the things that I learned on my own over the summer, which confirmed that I understood a lot of what I was reading about. The tutorial focused on explaining how music-specific aspects can be exploited for feature representation used in various MIR tasks. The four main areas were (1) Pitch and Harmony, (2) Tempo and Beat, (3) Timbre, and (4) Melody. While I was fairly familiar with pitch, tempo and beat, the others touched on newer ideas that I did not have very much knowledge of.

Overall the tutorials were a great way to confirm what I had been studying on my own over the last couple of weeks and also gave me an idea of what else was being studied in the MIR community that was not directly associated with my area.

The tutorials finished with a reception afterwards where I was able to meet more people. And of course the question came up - 'What are you working on?'. Now as a student who has been working in a lab on this one project for 10 weeks, you would expect the answer to just flow smoothly off the tongue (and, believe me, by the end of the conference I had this answer down!). However, the first time or two took a bit of thinking and pausing and rewording, but somehow I explained it and those listening were nice enough not to notice my inexperience (at least I hope).

That night, after deciding to meet with a certain group of people at a designated location to go in search of a place to eat, we ran into another group in search of food and combined our efforts for a total of 18 people or so. We then wandered around for a while struggling to find a place that would seat that many people, and finally found a quaint restaurant. While the food and beer were great, the most amazing thing was sitting down and finding that the people you are sitting next to are well known in the field and have written the works that you have read. The other interesting part is not realizing who they are until a few days later. But I guess that is the joy of these conferences. There were a few times where I would start up a conversation with a random person, and find that they were exactly the people that I wanted to talk with. I suppose that is one of the benefits of being a part of such a small community.

Tuesday 10 August 2010

The first official day of the conference began with an opening by Frans Wiering, the General Chair of the conference, and continued with a keynote speech by Carol Krumhansl (Cornell University) on 'Music and Cognition: Links at Many Levels'. Again the idea of the cognition of music, which was new to me, came up. It was not that I had never thought about it, but I had not expected it to find a place at a conference that I had deemed to be for 'music and computers' and about 'gathering data about music'. But as I was starting to understand, gathering data on how music is perceived is also a large part of MIR, as was displayed in Carol's speech on the links between the objective properties of music and the subjective experience of music.

I found everything she spoke of very interesting, from the brain scans of people who were listening to music with tonal and rhythmic violations, to the studies on recognizing popular music and recalling specific details such as the artist, title, decade of release, etc. While it is difficult to convey everything in a short amount of time, there were a lot of details in the study on recognizing and recalling the music that I had questions about. Mostly I wondered about small details of the project. For instance, the listeners were given a short clip and a long clip of the music that they were asked to identify and recall. Were the same listeners questioned on both the short and long clips, or were they separated into groups? If the same listeners heard both, did it matter that they heard one clip before the other, so that having two sets of information helped them figure out the song? These kinds of detailed questions would probably be answered if I read her work, but just the fact that I am curious about them and want to know more encourages me to continue working towards doing research.

The first poster session opened up after Carol's speech. One thing to note about the poster sessions was to arrive early so you could actually get to the ones you wanted to go to. Once everyone was up and moving through the twenty-some posters, it was hard not only to get to the posters through the crowd, but also to speak to those presenting their work. The variety of topics was quite vast, and it was great to see all of the novel ideas that were being brought together at this conference.

The plenary sessions (for the papers that were selected for oral presentation) lasted an hour and a half giving each of the five to six presentations around 15 minutes to present and 5 minutes for questions. The first presentation of the conference was, by far, the one that captured my interest the most. Markus Schedl presented his work on 'What's Hot? Estimating Country-specific Artist Popularity' that was done with Tim Pohle, Noam Koenigstein and Peter Knees. Among other applications, they used the 'now-playing' tags from Twitter and the information on the region where that tag came from to look at the popularity of artists in different regions. The idea of using social networks and microblogs to see what is popular is an interesting subject that combines my interest in text mining and microblogs with my interest in music.

The session that followed had a theme of patterns and chord recognition, from 'Identifying Repeated Patterns in Music Using Sparse Convolutive Non-negative Matrix Factorization' (best overall paper of ISMIR 2010 by Ron J. Weiss and Juan Pablo Bello) and 'Solving Misheard Lyric Search Queries Using a Probabilistic Model of Speech Sounds' (best student paper of ISMIR 2010 by Hussein Hirjee and Daniel G. Brown) to 'Concurrent Estimation of Chords and Keys from Audio' (by Thomas Rocher, Matthias Robine, Pierre Hanna and Laurent Oudre). This plenary session was definitely one of the ones that I enjoyed the most. I have never taken a course in pattern recognition, but it has always been something that I am very interested in (maybe it is because I like to find patterns in my own life).

During the poster session that was next on the agenda, there was one poster that I thought was pretty interesting. Scott Miller was presenting his paper with Paul Reimer, Steven Ness, and George Tzanetakis on 'Geoshuffle: Location-Aware, Content-based Music Browsing Using Self-organizing Tag Clouds'. The project worked on gathering music listening habits based on the location and speed of the user, and playing back those songs when the user is in that location again.

Finally, the evening ended with a reception, sponsored by the City of Utrecht, in the garden of the "Academie Gebouw" of Utrecht University. During the reception, Arie Abbenes performed on the Hemony Carillon of the Dom Tower. It was quite amazing that while the recital was performed for those attending the conference, the bells in the Dom Tower rang out for people all over the town to hear. The only disappointing part was that it started raining during the recital.

Through the rain I found a group to get food with and got to meet some very interesting people (a common theme of the conference) and got to see a bit more of the nightlife in Utrecht.

Wednesday 11 August 2010

The day started off with a plenary session containing a presentation of a report from the new category of papers called "State-of-the-Art Reports" (STAR). Youngmoo Kim presented his report, "Music Emotion Recognition: A State of the Art Review", which surveyed current techniques in music emotion recognition. The report was one of two that were selected from seven submissions for the new category during ISMIR 2010.

As the conference continued I felt myself being swept away in all of these topics that were so new to me and I found myself being overwhelmed. There were many tools and techniques that I had never heard of that I would take note of with the intention to look up later on. Attending the conference was a really good way to see what was going on within the field. I spent a lot of time reading papers at the beginning of my project, and it was really cool to see not only the people who worked on those projects, but to also see the current work that was being done.

The plenary session held in the afternoon was on tagging and alignment. I had read a lot about alignment during my research, and tagging was something that I wanted to learn more about. All six of the presentations held my attention and brought up interesting theories and research on how to improve alignment techniques and how to use or learn from social tags.

Before the recital at the end of the day, results for the MIREX (the annual Music Information Retrieval EXchange) competition were released, with each team able to present their methods in the form of posters. The exchange of state-of-the-art MIR methods is a great way for those in the community to connect and transfer information in order to make further advances in the field as a whole. Some of the tasks involved with MIREX 2010 included: audio classification tasks (audio artist classification, various audio genre classification, audio music mood classification, etc.), audio cover song identification, audio tag classification, symbolic melodic similarity, audio onset detection, audio beat tracking, structural segmentation, and others.

The recital that night was presented by the Utrecht School of Music and Technology (US-MuT), one of the schools of the Utrecht School of the Arts (HKU). Sonsoles Alonso, an emerging pianist in the contemporary piano and (live) electronics format, performed three pieces that really portrayed how technology and sound can be combined in a live context to create a new genre of performance. Konstantinos Vasilakos presented an interesting blend of singing voices combined with Petxhold-blokflit and electronics to create an eerie, jungle-like sound, while Laurens van der Wee's "The Cake" was performed by Eliad Wagner using the Cake improvisation system, which emanated pure electronic sound. Concluding the recital was Augusto Meijer's "Bioluminescence" for fixed media and 4 loudspeakers. I cannot say that I was all that impressed with this piece, as it was hard for me to concentrate with nothing to visually stimulate me along with the music. However, the program did say that it was primarily made for an art installation project displaying the bioluminescent creatures that live in the depths of the ocean. Perhaps the music with the art would have been a more enthralling experience.

Again my evening continued with finding a group of conference attendees to get dinner with. This night the goal was to find authentic Dutch food, and we found ourselves dining at a small restaurant seated on the lower level of the canals, right next to the water. And we all ate pancakes! However, they were not quite the typical American pancakes with syrup that I am used to; they were savory (mine had cheese and bacon) and very large.

One of my favorite things about the conference and about going out with other attendees at night was getting to meet all of the people from around the world. Having attended a large university, I have met several people from different countries, but it wasn't really until I was out of my home country that I started to appreciate the diversity in these other cultures. Before the conference I was debating whether I should spend my final semester at college studying abroad, and attending the conference convinced me wholeheartedly that I needed to see more of the world. Even though I was only gone a week, and spent most of the time attending plenary and poster sessions, it was the people as well as being in a foreign place that made me want to experience more. So next spring I will be going to Italy and exploring more of the world (as well as taking a break before going to graduate school).

Thursday 12 August 2010

Thursday morning was focused on the future of MIR (fMIR). Two presentations kicked off the morning, the first by Jacek Wolkowicz and Vlado Kešelj on predicting the development of MIR research based on the parallels it has with natural language processing, and the second by Emmanuel Vincent on how to address the challenges faced by MIR in order to make it a more versatile field. Douglas Eck from Google then gave his talk on the future of MIR at Google, which focused on the goals that Google had for MIR research and where the future of MIR was going. One topic discussed was the idea that "music in the cloud" may be the direction of the future, in which the source of one's music no longer sits on one's personal device but on a public platform such as YouTube. Another point brought up was the idea of closing the link between listening to and making music, which may expand MIR in another direction.

An industrial panel on fMIR was then moderated by Rebecca Fiebrink (Princeton) and consisted of Douglas Eck (Google), Greg Mead (Musicmetric), Martin Roth (RjDj), and Ricardo Tarrasch (Meemix). The group touched on how music technology would be changing over the next few years and how the roles in the music world were changing as well. This session seemed a bit short, and only offered a taste of what I would have liked to hear as an undergraduate considering the field as a career.

As well as continued poster sessions, an invited speech was on the agenda by Joris de Man, the composer for the Killzone video games. His presentation on 'Behind the Music of Killzone 2' gave a view of how composing music works in a technological setting. He elaborated on the two different types of music that were needed when producing music for video games, one for cut-scenes and menus of the game (linear music typically done by a live orchestra) and one for the in-game music during game play (interactive music designed to react to what is happening), as well as the process of creating the music while working with different technologies.

That night the conference dinner was held at the National Museum "From Musical Clock to Street Organ" after a guided tour filled with demonstrations of the collection of mechanical instruments housed in the museum. It was really amazing to see these massive street organs perform music using the simple piano roll notation punched into cardboard. Along with the announcement of the best papers (mentioned above), the dinner also included the unveiling of where ISMIR 2011 would be held: Miami, Florida. With the theme "Music, Anytime, Anywhere" the conference will be held the weekend of the 23rd of October 2011 at either University of Miami or at Miami South Beach. Hopefully I will be in the US and will be able to find a way to attend this conference as well. However, I am slightly disappointed I will not get to take another cross-continent adventure.

Friday 13 August 2010

First thing in the morning was the late breaking/demo session, where I presented my poster on the work that I had conducted over the summer as well as some preliminary results. At first I was nervous, but after presenting my poster a couple of times it felt very natural to explain the work and the reasoning behind it. Though I was still feeling a bit under the weather from the cold that I had, it was really exhilarating to be presenting my work to those in the field, and an overall great experience.

The final plenary session was also an interesting one which focused on new technological developments and understandings within the MIR community. The paper Peter Grosche presented on his work with Meinard Mueller and Craig Stuart Sapp on "What Makes Beat Tracking Difficult? A Case Study on Chopin Mazurkas" was one that I had read and used in reference to my research. Their evaluation involved categorizing beats as a way of determining where the beat tracking fails. We were able to incorporate this into our project to help determine where the alignment failed.

With the end of the plenary session, the closing remarks wrapped up the week with a short presentation on the 2011 ISMIR conference to be held in Miami. Saying goodbye was one of the hardest parts of the conference, as I had met so many interesting people and formed many friendships. Before departing for the conference, I was very hesitant about going and had doubts about the benefits of attending. However, I can say with confidence that attending the conference has been a life-changing experience that really helped me decide what I want to do for graduate school. I made many connections with people who will not only help me along my way, but will hopefully be my future colleagues. And, as mentioned before, going to Europe helped me make my decision on whether to study abroad in the spring, which I am now looking forward to more than ever.

The rest of the day on Friday was all about exploring the city of Utrecht and seeing what it had to offer, as well as getting to spend some time with great people.

Saturday 14 August 2010

My flight left Saturday afternoon, and on my way to the airport I coincidentally met up at the train station with a person from the conference whom I had spent some time speaking with. It was really great having others to travel with to and from the conference, and I think conferences should have better ways of helping attendees who are traveling alone communicate about travel plans.

updated November 2010.