This was my final week at the University of Alabama in the DREU program. At the beginning of the week we presented our project to the SITES students, as we had the week before. The rest of the week was spent working on our posters as well as our papers. By the end of the week I had finished the rough draft of my paper and the groundwork for my poster.
This week on Monday we had our SITE presentation with high school students, ranging from sophomores to juniors, who were interested in STEM. During the presentation we showed video demos of our summer research and talked with the students about it. We also went to the 3D-printing lab, saw all of the 3D printers, and got to see and touch the things that had been created with them.
We also got to see the lounge for the computer science department. Later in the week we all focused on collecting as many voice samples as we could and completing the voice testing for our projects. Then on Friday we wrote our abstracts and submitted them to DREU for the chance to attend and present at GHC.
So this week, I uninstalled and reinstalled Firefox, which seems to have fixed the web browser problem. However, I ran into several other problems this week, such as the avatar not working from within the virtual machine, and I spent most of the week attempting to troubleshoot that. I also spent the latter half of the week collecting voices to test our project; we collected voice samples via recordings sent to us by people we knew. This week I also began working on the video demo of my project for the SITE presentation this coming Monday. Below is a screenshot of that video:
We took the first day of the week off for Independence Day and began work again on Tuesday, July 5. This week I began trying to run the speech recognizer, pocketsphinx, via Python and GStreamer. The first half of the week was spent trying to make it run on a Mac, which was no problem for pocketsphinx and GTK+, but I ran into many issues when it came to GStreamer. So I decided to switch from a Mac terminal to an Ubuntu terminal via a virtual machine. Within the Ubuntu virtual machine, I installed pocketsphinx, GTK+, GStreamer, and all of the other necessary dependencies. After they were installed, I made a working program with pocketsphinx and Python that takes in continuous speech (a rough sketch of that kind of pipeline is at the end of this entry). At the end of this week I began researching the functions of GStreamer as well as how to incorporate grammars into the Python setup, which does not support JSGF grammars directly. On the last day of the week, Firefox within the virtual machine crashed for some reason and is no longer working, and the Ubuntu web browsers will not let me download another browser. I can't seem to figure out how to fix Firefox, so that will be my task at the beginning of next week, because I can't access the smart home avatar without a working web browser.
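For anyone curious, here is a minimal sketch of the kind of continuous-recognition pipeline I mean, assuming the pocketsphinx GStreamer plugin and the Python GObject bindings are installed; the exact element and message field names can vary between pocketsphinx versions, so treat this as an outline rather than my exact program.

    # Minimal continuous speech recognition with pocketsphinx's GStreamer element.
    # Assumes the pocketsphinx GStreamer plugin and PyGObject are installed;
    # message structure/field names may differ slightly between versions.
    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import GLib, Gst

    Gst.init(None)

    # Microphone -> convert/resample -> pocketsphinx recognizer -> discard audio
    pipeline = Gst.parse_launch(
        'autoaudiosrc ! audioconvert ! audioresample '
        '! pocketsphinx name=asr ! fakesink')

    def on_element_message(bus, msg):
        struct = msg.get_structure()
        if struct and struct.get_name() == 'pocketsphinx':
            # The element posts partial and final hypotheses as bus messages.
            if struct.get_value('final'):
                print('Heard:', struct.get_value('hypothesis'))

    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect('message::element', on_element_message)

    pipeline.set_state(Gst.State.PLAYING)
    GLib.MainLoop().run()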
<Note: I had to convert the JSGF grammar to an FSG grammar in order to make it usable from Python.>
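Roughly, that conversion works like the snippet below, assuming the sphinxbase command-line tools are installed; the file names are placeholders, and the tool flags and element property names can differ between versions.

    # Convert the JSGF grammar to an FSG with the sphinxbase tool
    # (flag names may vary between sphinxbase versions):
    #   sphinx_jsgf2fsg -jsgf smarthome.gram -fsg smarthome.fsg

    # Then, before setting the pipeline to PLAYING, point the recognizer
    # element from the sketch above at the FSG instead of the default
    # language model (property names depend on the plugin version):
    asr = pipeline.get_by_name('asr')
    asr.set_property('fsg', 'smarthome.fsg')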
This week I spent most of my time making sure that my speech app was connecting to the correct websites based on what was recognized by the speech recognition software. When I put in the exact URL of the address I wanted the program to hit, the voice recognition worked just fine, but I ran into some difficulty getting my program to generate the URL based on what is said. By the end of the week I managed to get it working correctly for the most part. At the end of the week I also found that pocketsphinx - the voice recognition software that I've been using - is supported in Python with the use of GTK+ and GStreamer, so I've been looking into whether working with Python is a better alternative to working in C.
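The URL generation itself mostly comes down to mapping recognized phrases to endpoints. Here is a small sketch of that idea in Python, with made-up command phrases and a placeholder server address rather than our project's real endpoints:

    # Map recognized phrases to smart-home endpoints and request the resulting URL.
    # The base address and paths below are placeholders, not our real server.
    try:
        from urllib.request import urlopen   # Python 3
    except ImportError:
        from urllib2 import urlopen          # Python 2

    BASE_URL = 'http://localhost:8080/home'

    COMMANDS = {
        'turn on the lights':  'lights/on',
        'turn off the lights': 'lights/off',
        'lock the front door': 'door/lock',
    }

    def build_url(hypothesis):
        """Turn a recognizer hypothesis into the URL the app should request."""
        action = COMMANDS.get(hypothesis.strip().lower())
        return BASE_URL + '/' + action if action else None

    def handle_hypothesis(hypothesis):
        url = build_url(hypothesis)
        if url:
            urlopen(url)  # tell the smart home server to perform the action
        else:
            print('Unrecognized command:', hypothesis)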
At the beginning of this week, I switched between running pocketsphinx and trying to make pocketsphinx.js work, but I decided that I would stick solely with pocketsphinx. Once I got pocketsphinx installed and running, I began working on my speech app. Once I managed to get the program to record what I was saying, I began working on the accuracy. That's when I started doing research on language models and grammars. At first, I thought I would need to build a language model to account for a wide range of speech, but after more research I found that it would be much simpler and more accurate to create JSGF grammars, which limit the vocabulary and range of phrases the recognizer will accept but work much faster and give more accurate results than a language model would. Once I had a working app and a decent JSGF grammar, which I would add to as needed, I began work on connecting to the server. By the end of the week I was able to connect to the server successfully, as well as generate a unique URL within my code.
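For a sense of what such a grammar looks like, here is a small JSGF example with made-up smart home commands (not my actual project grammar):

    #JSGF V1.0;

    grammar smarthome;

    // Example commands; a real grammar grows as the project needs new phrases.
    public <command> = (turn on | turn off) the (lights | fan)
                     | (lock | unlock) the front door;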
This week I began more in-depth work on my speech recognition application. For our speech recognition project, my fellow DREU students and I were each given a different speech recognition software: IBM Bluemix, Microsoft Cortana, and an open source recognition software. I was given the open source software, CMU Sphinx. The first part of this week was spent researching CMU Sphinx and the different versions of speech recognition software that it offers. The three main ones are Sphinx4 in Java, pocketsphinx in C, and a web-based adaptation of pocketsphinx, pocketsphinx.js. Originally, since the avatar and home app are web-based, I thought pocketsphinx.js was the best option, but I was unfortunately unable to get it running correctly, and it was browser dependent (it only worked in Firefox and Safari). Comparing pocketsphinx to Sphinx4, pocketsphinx is much more accurate and faster, and though it's optimized for mobile apps (specifically Android) it can work on a desktop; plus, I'm more comfortable with C than Java. So I chose pocketsphinx for my app. The rest of the week was spent compiling and running the basic operations of pocketsphinx.
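By "basic operations" I just mean building the libraries from source and running the stock live demo from a terminal; roughly something like the following, though the exact steps and flags depend on the release and platform:

    # Build sphinxbase first, then pocketsphinx (from the release sources);
    # exact configure/make steps depend on the version and platform.
    cd sphinxbase && ./configure && make && sudo make install
    cd ../pocketsphinx && ./configure && make && sudo make install

    # Live recognition from the microphone with the default US English model.
    pocketsphinx_continuous -inmic yes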
<Here are all of the undergrads from the conference>
In my first week here at the University of Alabama, on the first day, I met the other two students I'd be working with, and we learned how to use GitHub to share our code and back it up every step of the way. We were also told that we would be researching how to use speech recognition to interact with a smart environment (i.e., a smart home). We spent the rest of the week learning how to develop apps in Visual Studio and how to integrate speech recognition software such as Cortana into those apps, using the Microsoft Virtual Academy to aid in our research. Also, due to some technical difficulties, we learned how to work with a virtual machine.