I spent most of week 1 familiarizing myself with the new work environment and reading up on the research area of interactive supervised learning, which was/is completely new to me. Machine learning in general is a topic I just began studying during this last spring quarter, so a lot of the topics are still fresh to me. I went to a digital instrument demo where we used the Wekinator to train a digital wind instrument, built of wood and with sensors in the place of holes, to understand pitch. This helped me understand the scope of my project more clearly, since our project inhabits the same, or at least a very similar, niche as the Wekinator.
I also met three other undergraduate students working on a different project in the same lab as me (the Sound Lab), a graduate student working on a project similar to mine, and my professor mentor. Everybody seems enthusiastic about the work we are doing, and I expect that we will all help out with each other's projects where possible.
A lot of work happened in the lab this week. My experience this year is a particularly exciting one for me because the project I'm working on did not begin before I arrived. What I mean is, although Professor Fiebrink did have some ideas in mind for what we should work on this summer, these ideas were abstract and still fresh. This means that I get a significant voice in the project. I get to decide what the project should be, what our motivations are, what our research contributions should be, and even after that, I still get to design the application we create. (Although I should mention that Professor Fiebrink's expertise in the field is essential in this process. I wouldn't even know where to start with all of this without her guidance.) All of this happens BEFORE implementing anything is even possible! This whole area of design space and decision making is not something you see in every computer science class you take. In fact, I've found myself to be something of a sloppy thinker, and I've sometimes been unsuccessful at communicating ideas concisely. And all of this despite the fact that I'm usually great at communicating when tutoring at UCSD, and I pride myself on how much I've grown as a teacher.
But there is a big distinction between communicating as a tutor/programmer and as a designer...I have found the two types of thinking to be very different. As a programmer, the problem you are trying to solve is well formed. The missing pieces are simply gaps of knowledge that need to be filled. You need strong problem-solving skills, debugging skills, and a good grasp of how computers "think" (having a natural intuition here helps a lot). As a designer, you are yourself defining the problem you need to solve. In order to define this problem, you need a strong sense of "What problems haven't been solved?", "What new ideas and concepts can really help people?", or "What helpful technologies that exist seem to work well, but can actually be improved significantly if we make these changes?" You need a strong understanding of the problems people currently deal with, your users and their demands, and you need the vision to come up with new perspectives and alternatives to existing solutions to these problems. To me, this is hard! But I think that one of the reasons why it's hard is that I don't have much experience tackling this new scope of problems. I worked a little bit on problems of this nature last summer during an REU at Berkeley, but even then, I was not as involved in the design process as I am now. My brain is not used to thinking as a designer. But I'm confident that as I make more design decisions and get experience thinking as a designer, the process will be easier. I've been able to train my brain to do many things, and I'm excited about the opportunity to grow. And this kind of thinking FEELS different, the same way that learning a new instrument or sport makes your muscles feel uncomfortable or awkward. As you practice, your muscles start learning the new movements they need to make, and this process of improving is in my opinion one aspect of why sports and music are fun. It's also why research has been fun so far.
Not that my ideas have even been that bad. We began the week trying to narrow down the scope of our project and thinking of different applications we could make that address our research questions in tangible ways. After many meetings with my professor and a grad student in the lab, we began narrowing the applications down, and slowly chose a project to work on. My project has a few different layers...you can read up about the general research problem we are trying to solve on the research project link at the bottom of this page. THAT tool we want to make is REALLY EXCITING to me, because it's indisputably something that is both new and clearly useful. Not to mention that there are a lot of computationally interesting problems to solve there. But on top of that tool, we also want to do some user studies. We're thinking of packaging that tool up to include some predefined inputs and outputs and exploring how users react to the tool. That itself is also non-trivial. So there are really three tasks for me: (1) design the higher-level tool for mapping inputs to outputs, (2) package up predefined inputs and outputs, and (3) build a GUI for part 1. That's a lot to do! I am definitely going to be busy.
Something slightly frustrating for me is the fact that I really can't get started with any of that coding despite the fact that there is so much work to be done. This is because we are still in the design stages of the project. And any work we do now will probably need to be completely modified when we decide that parts of our design were flawed. But the good news is we're nearly there: We've almost figured out how the system is going to map input to output, I've come to understand in detail how a user's workflow looks, and I've made paper mock-ups of the system that show what the GUI should look like. I expect to be in the position to begin coding about halfway through next week.
One thing I have been able to do is look into different APIs for desktop application development. We've basically decided that we are going to use JavaFX, mainly because the Wekinator is written in Java and Java has a nice machine learning API, Weka. But there's also Qt4, a C++ framework, to fall back on if for some reason we change our minds. I got both installed on my computer and wrote a few small Hello World-style exploration apps to try them out and see if I like them. They work well enough.
This weekend I am participating in a hackathon in New York with some of my lab members. It's going to be exciting to go to New York for the first time and to bond with my friends. I look forward to a good weekend and exciting work next week!
After meeting to discuss the design in detail on Monday, we finally decided it was time to start coding. I spent most of Monday and Tuesday trying to code the program using interfaces and abstract classes so that our code was as extensible as possible. After getting an outline of about 8 classes in my project, we went back to the drawing board to make sure we had handled everything we wanted to. Our system is designed to be as flexible with training data as possible. This priority was set based on users' past experience with the Wekinator, where users needed to delete all of their training data any time they wanted to make any modification to inputs. This is a problem because it discourages users from ever making changes to their input setup, but exploring different setups is a huge part of the design process.
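To make the interface/abstract-class idea concrete, here is a minimal sketch of that kind of extensible structure. All names here (`Mapping`, `AbstractMapping`, `GainMapping`) are illustrative, not the project's actual classes; the point is that training examples live in a shared base class, so new mapping types can be added without touching existing code.

```java
import java.util.ArrayList;
import java.util.List;

// Every mapping turns a vector of input features into a vector of output parameters.
interface Mapping {
    double[] evaluate(double[] inputs);
}

// Shared behavior: each mapping keeps its own training examples, so editing one
// mapping's inputs does not force the user to delete all of their data.
abstract class AbstractMapping implements Mapping {
    protected final List<double[]> examples = new ArrayList<>();

    public void addExample(double[] example) {
        examples.add(example);
    }

    public int exampleCount() {
        return examples.size();
    }
}

// A trivial concrete mapping for illustration: scale every input by a constant gain.
class GainMapping extends AbstractMapping {
    private final double gain;

    GainMapping(double gain) { this.gain = gain; }

    @Override
    public double[] evaluate(double[] inputs) {
        double[] out = new double[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            out[i] = inputs[i] * gain;
        }
        return out;
    }
}
```

A machine-learning-backed mapping would extend `AbstractMapping` the same way, training on the stored examples instead of using a fixed formula.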
We also decided that users should be able to create, explore, edit, and "play" multiple instruments at the same time, while in Wekinator you only had the ability to use one at a time (you could import/export configurations, but that was all that was supported). This interaction will encourage users to explore many different instrument mappings, and allow the user to train different parts of the instrument separately and then play all the parts simultaneously.
The code we have so far supports both of these goals. I have been coding slowly, but thinking carefully about the implications of my implementation choices in order to create understandable and flexible code. Not only is this critical thinking extremely enjoyable to me, but I think that if I don't have a good design, chances are that whatever I program will be scrapped once I leave. And if I am able to make a solid design that is useful to the same target users as Wekinator, then it's realistic that other researchers will want to build on top of what I make. I'm hopeful and excited about that possibility.
By the end of the week, I had the project compiled and working with the NetBeans IDE, set up a Git repository hosted by Google Code, and got the basic JavaFX application to communicate with the project. The question marks in the design are disappearing, and tangible progress is being made.
By the way, Thursday was July 4th. The night before, I went to see fireworks with some friends, and met new friends. Coding during the hackathon was also successful: we created a fully functional (though admittedly small in scope) iPhone app, making for a successful first trip to New York!
With the proper code design in place, it was time to work on the graphical user interface. As I learned this week, there are very many things to take into account when designing an interface. There are some obvious ones: make an interface that is simple and easy to understand without following directions, use visual metaphors that are consistent with other metaphors already understood by users (and where no predecessor exists, use simple, intuitive visual metaphors), and make comfortable use of the space so the user is not overwhelmed. Beyond that, there are a lot of big decisions that are implicit in interface design. The most important one I have found is how functionality is grouped by page. While we started off with some idea of what functionality we want to provide users, I realized this week that grouping certain functionality by location sends a strong message to users about how the tool is expected to be used. For example, our tool needs to strongly support an iterative design workflow where users can create an instrument, evaluate the instrument, make changes, re-evaluate, and compare with other instruments. And there are a few different levels of granularity with which users will make changes to the instrument, yet we need to support rapid switches between these different types of functionality. So, some huge questions: "What are the different levels of granularity of instrument modifications that we will support?" "Which of these modes of modification should go under which granularity?" "What is the easiest way to switch between instruments?" "How should we support instrument evaluation/playback?" Answering these questions is itself an iterative process for me, which I undergo by drawing paper mockups.
Although I have done paper mockups before, I still struggled with what parts of the mockup I should prioritize. First, creating paper mockups means coming up with nice visualizations that make working with the interface an enjoyable task. So part of my efforts went into creating a visually appealing design. Another part of my task was to implicitly communicate to the users the workflow inside the interface: in what order should users use different parts of the system? This was a hard question to answer directly two weeks ago, so it was even harder to answer in my paper mockups. A related task for me, as stated earlier, was to group functionality into different pages in a way that makes the most sense for the task at hand (digital instrument design, or more generally, mapping input gestures to output parameters). With each of these tasks being itself difficult to manage, I often focused on one of them and forgot about the others, creating unbalanced interfaces that did not solve all of our problems. I'm on my 5th iteration now, and there's still some work to be done.
Each iteration resulted in better communication between my mentor and me about what we want the interface to actually do, and with each new iteration came a more focused tool. By now, we are happy with the overall navigation and layout of the interface, and we are in the process of working out the most logical groupings of functionality into separate pages. During this process, we came up with new functionality we wanted to support: (1) saving an instrument's history as a tree similar to a commit history in a repository and (2) stashing away parts of the interface into a globally reachable area, in order to replace other parts of an instrument with what we saved in the stash. Looking back, these functionalities serve obvious purposes, but without this design process we would not have found them.
After paper mockups, it will be time to produce digital mockups, static pages with fake information in them that show what the user interface will actually look like using the same technology the real system will use. In addition, as we add more requirements to the project, I will be implementing them in the project code. I'm also getting started on a research paper, outlining the introduction, our motivation, and related work.
Professor Fiebrink was away at a conference until Friday this week, so we partied all week long! Haha, just kidding. But, our group at the lab did venture into New York City on Monday for a free concert in Central Park by the New York Philharmonic! It was a great show, well worth the extra hours we worked on other days to make up for lost time.
After a few more iterations of paper mockups, I finally came up with a design I was satisfied with, and on Friday during our meeting, Professor Fiebrink agreed, and it was finally time to move on to JavaFX and building the design into the computer! I feel a huge sense of accomplishment after completing this step of the design process, both because we are excited about how everything looks and because of the difficulty of creating a solid design for a new tool.
I also wrote a fair amount of my research paper, covering our motivation and the background of the project. We plan to submit a Work In Progress paper to CHI with the hopes of getting something published. I'm starting early because I know that writing requires multiple drafts and heavy editing in order to create something impressive. I read some past papers in order to get a feel for the style of writing that CHI expects, and after that simply spelled out why we are researching what we are researching. I like research papers because the writing is very pragmatic and to me, every sentence has a clear and obvious purpose.
I also wrote a file encoding for instruments. We want people to be able to save their instruments across sessions, and to do this we need some sort of persistent file. So I spent time figuring out what about an instrument object needs to be saved and what can be inferred, how to encode that as a string, and how to write/load that encoding to/from a file. It wasn't exactly the most exciting work, but after spending a few weeks away from programming, it was refreshing to be back in code. Watching tests fail and then pass was as rewarding as ever.
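As a rough illustration of this save/load idea, here is a much-simplified sketch of a string encoding with write/load helpers. The field names and the `InstrumentFile` class are hypothetical; the real format certainly stores more than three fields, but the round-trip shape (encode to string, write, read, decode) is the same.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class InstrumentFile {
    // Encode only the fields that cannot be inferred, one key=value line each.
    static String encode(String name, int numInputs, int numOutputs) {
        return "name=" + name + "\ninputs=" + numInputs + "\noutputs=" + numOutputs + "\n";
    }

    // Return the raw field values in the order they were written.
    static String[] decode(String text) {
        List<String> values = new ArrayList<>();
        for (String line : text.split("\n")) {
            values.add(line.substring(line.indexOf('=') + 1));
        }
        return values.toArray(new String[0]);
    }

    static void save(Path file, String encoded) throws IOException {
        Files.writeString(file, encoded);
    }

    static String load(Path file) throws IOException {
        return Files.readString(file);
    }
}
```

A plain-text line-per-field encoding like this keeps saved instruments human-readable, which makes debugging a broken save file much easier.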
Now that the heavy thinking is over (for now), it's time for the heavy lifting. Putting my design on screen requires that I learn the JavaFX API, and it takes something of a perfectionist to make the design look nice, regardless of how it looked on paper. I also have to code up listeners so that the application responds to buttons clicked, drag and drop, etc. And finally, when the design is in place on the computer, I'll be able to fill in the missing bits and pieces of my back-end code and actually get the application to be functional!
This week was nearly all coding, head down. Back in my comfort zone, I was able to finish the digital mockups early in the week and got started making the GUI respond to user input. Finishing that as well, I was able to begin making the GUI interact with the underlying models. Pretty soon I'll be able to get the application up and running, and we can start testing things out.
Although a lot of work got done this week, there isn't really much interesting to talk about. So I guess I can spend this space talking about practicing music. I am a singer/songwriter/guitarist/pianist, and we were able to work it out so that I have access to the Princeton music rooms. After work I practice for about 2 hours a day in the music rooms, and come home for about a half hour of guitar before bed. I play a bit more on weekends, along with getting in some morning runs. Music is a huge part of my life and I'm glad I have been able to keep it up through the summer, despite working full time.
Next week my parents are coming to visit, so I'll be spending a few weekdays in New York exploring some more. I'll be working on the weekends to make up the hours so that this works out.
Working through the weekend, I passed an awesome milestone in my computing career: I wrote my first multithreaded program! We needed multiple threads for recording examples and running the mapping while updating the GUI. Luckily Java is a very well documented language, so there were plenty of organized guides online that helped me get started. Less documented was the OSC (open sound control) API that allowed us to send/receive OSC messages through Java (we receive OSC messages to record examples and track changes in input features, while we send output parameters as OSC messages to some outside source that handles sound synthesis). Since the API was somewhat small, however, I was able to figure it out without too much trouble. As I've had to deal with many new APIs recently, I've had to do more debugging than before (I knew things were moving too smoothly), but I still have yet to be stuck on any one bug for more than an hour.
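The core of that two-thread setup can be sketched very simply with a thread-safe queue: one thread hands off incoming feature vectors (standing in for the OSC listener) while another thread consumes them (standing in for the GUI/recording side). The `FeatureRecorder` class and method names are my illustrative simplification, not the project's actual code; `BlockingQueue` does the locking, so neither thread needs explicit synchronization.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class FeatureRecorder {
    // Thread-safe handoff point between the listener thread and the consumer thread.
    private final BlockingQueue<double[]> updates = new LinkedBlockingQueue<>();

    // Called from the listener thread whenever a new feature vector arrives.
    void onFeaturesReceived(double[] features) {
        updates.offer(features);
    }

    // Called from the consumer thread; blocks until an update is available.
    double[] nextUpdate() {
        try {
            return updates.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new double[0];
        }
    }
}
```

In a real JavaFX application the consumer side would push each update onto the UI thread (e.g. via `Platform.runLater`) rather than touching GUI components directly.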
Since I was able to send/receive messages, I was able to actually stick some dummy functions into the program and watch as the program crunched numbers to automatically compute the mapping for us! The next step was to use the Weka API to get some machine learning models in as functions, and to figure out how to record examples/train the models/run the models. All that works now, except my Neural Network model isn't tracking any changes for some reason that I'll figure out soon enough. When my professor gets back soon, I'll ask her about the most fun and easy way to test my project with a real instrument/synthesis situation!
Hanging out in New York with my parents was tons of fun. We saw the Statue of Liberty, walked across the Brooklyn Bridge, went to Central Park, got some delicious steak, and walked till we could walk no more. I also showed them around Princeton for a few days, eating some good food for free (yes!).
It's pretty crazy that things are coming to a close so fast. I still have a paper to write and I need to make thorough documentation of my code. Looking back, I was able to go from a completely blank slate to a program that, while not complete by any means, is in a place where our initial ideas can be tested and extended in logical ways without too much risk involved. Seeing as I was the only person to ever push code to the repository, I consider that a big accomplishment. I'll be doing as much as I can to get the code in a neat and tidy place so that others can follow through on experiments and expansion, and I'll be keeping in touch to see how things go as well as lend a helping hand if necessary.
Just finishing up this Friday on Week 8. Just a few hours ago, I finished rewriting PD patches I made a few years ago for the sake of testing my application, and I was able to train a Logitech gaming controller to control a wavetable synthesizer! Yeah! My project works. At least, the first part of my project. This week I finished the GUI screen that allows you to edit mappings for individual output features, and then wrote some real functions that convert any type and range of input to the correct type and range of an output, wrote a function that makes smart function choices for outputs, and wrote read/write to file for all of these functions. All of these are basic but useful functions that require no training examples. That means a user can create an instrument, once inputs and outputs are set up, literally with the click of a button. And on further clicks of that same button, different mappings are generated, allowing the user to quickly explore the instrument's input space. This is great, this is exactly what we wanted. And as a user, I could see how well my interface worked. I was able to write the glue code in only a few hours. What was really cool was realizing that I could write, in 10 minutes, another program to hook up a different input into the system and then be able to play the instrument with that input device instead. So, while there's more room to explore here, the big benefit of the application so far is the ability to play the same sound synthesis program with different inputs. That's pretty sweet.
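The heart of those range-conversion functions can be sketched in a few lines: linearly map a value from one numeric range onto another, clamping anything outside the input range. This is a minimal illustration of the idea, not the project's actual implementation, which also handles different types and smarter function choices.

```java
class RangeMap {
    // Map x from [inMin, inMax] onto [outMin, outMax], clamping out-of-range inputs.
    static double map(double x, double inMin, double inMax,
                      double outMin, double outMax) {
        double t = (x - inMin) / (inMax - inMin); // normalize to [0, 1]
        t = Math.max(0.0, Math.min(1.0, t));      // clamp out-of-range values
        return outMin + t * (outMax - outMin);    // rescale to the output range
    }
}
```

For example, a controller axis reporting values in [0, 1] can drive a MIDI-style parameter in [0, 127] with `RangeMap.map(x, 0, 1, 0, 127)`, and swapping input devices only changes the input range arguments.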
I also began documenting my code, making a PDF and PowerPoint presentation that help describe the algorithms we used that control the backend and the design patterns used in the front end. I'll be adding to that this week as I think of more useful information. I also made a rough draft of my paper, but most of next week will be spent editing the paper to make it as good as possible. There are a few more features I want to try to add to the project before I go, but that will be done once the paper is finished.
Princeton summer housing ends this week, so I'll be subletting a house next week until my flight on the 18th. We had a goodbye lunch today, and a few of our friends are getting together for a goodbye party later, since most people are leaving this weekend. Campus will be pretty empty next week.
Had a smooth move into the new house for the week; it's filled with many awesome musical instruments so I feel right at home. It even has air conditioning and a refrigerator! Small things I never would have expected to appreciate. Anyways, I've been out working/practicing music from 9 am until 9 pm every day, so it's not like I'm in the house often regardless.
This week I finished my last few contributions to the project, which included setting up OSC proxies that allow us to run multiple instruments at the same time. This makes comparing instruments more effective and allows users to create composite instruments, an instrument made up of smaller "instruments" in the system. I also set up recording/training through the Record Tunnel page, which used to be available only in the Generate Instrument screen. After that, I cleaned up the code and commented files that needed more comments, made the documentation as thorough as I could, and put together an organized list of bugs/issues with the code.
Now it's time to wrap things up. Looking back, this summer has been a huge learning experience for me. Living on my own on the east coast, meeting new friends, investigating a new research problem, and user interface design (paper mockups) are all new and exciting experiences that I can say I'm glad I had. I also got much better at music and wrote 15-ish new songs, most of which would not have been possible without the experiences I had this summer. Sunday I'll be heading to Chicago to see my brother and a week later I'll be back at home in Orange. Thanks to the DREU people for making this summer possible for me!