Week 2: June 24 - 28
This week was mostly spent digging into ELAN's search features and understanding how they work. I also ran sample queries against Dr. Wolfe's ASL corpus to get a picture of the kinds of data the team needs. ELAN offers several ways to search a corpus, each with its own interface, and it took me several days to get familiar with them. I also found some perplexing bugs in ELAN! For example, searching for "books" with two different methods against the exact same corpus returned different results.
As a result, I had to dig further into the ELAN code to see what the search engine actually does under the hood. I showed my results to Dr. Wolfe, and she was not that shocked: she has been to conferences on ELAN where others brought up the need to improve its engine. That meant I needed to stop looking at the code and start reading publications related to ELAN. Dr. Wolfe mentioned that more information might be found in the publications than in the code, ha!
The week concluded with the team reporting on their progress, and I talked about implementing a new engine in ELAN. My reasoning was that it would be easier to build a new engine and compare its results against the other engines than to hack away at the existing code and end up with a spaghetti mess. Furthermore, I now have a much better idea of what types of data the team needs, and collating results from multiple engines into a unified interface tailored to those needs seemed like a sensible approach. That way, I can iterate on the features the team requests faster than if I had to figure out how to insert that functionality into the old code.
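To make the "compare several engines behind one interface" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy annotation list, the two matching strategies, and the names `exact_engine`, `substring_engine`, and `compare_engines` are hypothetical and do not reflect ELAN's actual code or how its search methods work internally.

```python
from typing import Callable, Dict, List, Set

# Toy stand-in for one tier of annotations; NOT real corpus data.
ANNOTATIONS: List[str] = ["books", "BOOKS", "notebooks", "book"]

# Hypothetical interface: an engine maps (query, annotations) to the
# set of indices of the annotations it considers a match.
Engine = Callable[[str, List[str]], Set[int]]


def exact_engine(query: str, annotations: List[str]) -> Set[int]:
    """Match only annotations identical to the query (case-sensitive)."""
    return {i for i, text in enumerate(annotations) if text == query}


def substring_engine(query: str, annotations: List[str]) -> Set[int]:
    """Match annotations containing the query, ignoring case."""
    q = query.lower()
    return {i for i, text in enumerate(annotations) if q in text.lower()}


def compare_engines(
    query: str, annotations: List[str], engines: Dict[str, Engine]
) -> Dict[str, object]:
    """Run every engine on the same query and report where they differ."""
    results = {name: engine(query, annotations) for name, engine in engines.items()}
    agreed = set.intersection(*results.values())          # hits every engine returned
    disputed = set.union(*results.values()) - agreed      # hits only some engines returned
    return {"results": results, "agreed": agreed, "disputed": disputed}


report = compare_engines(
    "books",
    ANNOTATIONS,
    {"exact": exact_engine, "substring": substring_engine},
)
print(report["agreed"], report["disputed"])
```

With this shape, adding another engine only means adding one more entry to the dictionary, and the "disputed" set points directly at the annotations worth inspecting by hand.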