My week started with an orientation, where we learned about CMU and completed a selfie scavenger hunt around campus! Afterwards, we met our mentors and went on a tour of their lab space.
Next, we talked about ongoing projects. I decided to work on a new one: improving the alternative text of pictures on Twitter. I spent the rest of the week reading literature on blind users and social media, as well as on the tools currently used to provide access to visual media.
A picture of me with some other REU students during the selfie scavenger hunt!
We needed a way to collect images with alternative text from Twitter, so my advisor asked me to build a crawler. Creating this application took the rest of the week and part of the third week too. He originally insisted on a Chrome extension, which I had never built before. Getting the alternative text of images from the DOM was quite a task! We ultimately decided to use Python with Twitter's API.
Once a week, CMU puts on a lunch/talk for the visiting undergraduates. This week, we learned about different research projects going on across the HCI Institute. I also got to attend a PhD dissertation defense, which focused on crowdsourced science.
I finished the TwitterGetter application for single profiles. I was then tasked with creating a version that grabs tweets from the overall Twitter timeline. This took until Thursday. On Friday, we started collecting data. I also examined 200 tweets with photos from the overall Twitter timeline and categorized them.
One of my friends from last summer at CMU, Kenneth, successfully defended his PhD. The lab went out for a celebratory dinner on Wednesday. On Thursday, we had another lunch/talk for REU students. This week's topic was human-computer interaction in academia vs. industry. I want to work in academia, but it was interesting to hear about other career opportunities.
A picture from the celebratory dinner!
During week 4, I collected over a million tweets from the overall Twitter timeline. I then separated them into tweets with and without photos, and from there, into tweets with and without alternative text. Out of a million tweets, we found only 1,206 with alternative text.
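The two-stage split described above can be sketched in Python as follows. This is a minimal illustration and not the actual script: it assumes each tweet is a dict shaped like a Twitter API v1.1 payload, where photos live under `extended_entities.media` and alt text, when requested, appears in an `ext_alt_text` field. The function names `classify_tweet` and `partition_tweets` are made up for this example.

```python
def classify_tweet(tweet):
    """Return 'no_photo', 'photo_no_alt', or 'photo_with_alt' for one tweet dict.

    Assumed layout (modeled on Twitter API v1.1, not confirmed by the post):
    tweet["extended_entities"]["media"] is a list of dicts with a "type"
    and an optional "ext_alt_text" string.
    """
    media = (tweet.get("extended_entities") or {}).get("media") or []
    photos = [m for m in media if m.get("type") == "photo"]
    if not photos:
        return "no_photo"
    # A photo counts as described only if its alt text is non-empty.
    if any((m.get("ext_alt_text") or "").strip() for m in photos):
        return "photo_with_alt"
    return "photo_no_alt"


def partition_tweets(tweets):
    """Group tweets into the three buckets used in the analysis."""
    buckets = {"no_photo": [], "photo_no_alt": [], "photo_with_alt": []}
    for tweet in tweets:
        buckets[classify_tweet(tweet)].append(tweet)
    return buckets
```

Counting the length of each bucket then gives the breakdown of photos with and without alternative text.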
Somewhat unrelated to the internship was my purchase of a DJI Phantom drone. Pittsburgh is a gorgeous city, especially from the air.
From the 1,206 tweets with alt text, I went through and found around 100 accounts that regularly use alternative text. The rest were bots. I then started on a rubric to grade alt text.
During week 6, I built a program that would collect all of a user's tweets with alternative text and ran it on the 100 accounts from last week. We retrieved over 50,000 tweets with alternative text!
I also went to the 4th of July fireworks show in Pittsburgh. We like fireworks in Texas, but that was by far the best show I've ever seen.
I analyzed the 50,000 tweets with alt text. We found that the majority of them were from bots. Therefore, I had to go through each of the 100 accounts and remove those that explicitly said they were bots. This left us with 56 accounts. After running the program again, there were only 23,000 tweets with alternative text.
We finished the grading rubric for alt text. It took several revisions for me and the other researcher to agree upon what constitutes good, bad, and ugly alternative text. We both applied the rubric to 200 images. Unfortunately, the inter-rater reliability was too low for use in research.
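For readers unfamiliar with inter-rater reliability, one common way to measure it for two raters is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The post does not say which statistic we used, so the sketch below is only an illustration of the idea, and the label names in the usage note are placeholders.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's label frequencies, summed over labels.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

Calling `cohens_kappa(my_labels, partner_labels)` on two equal-length lists of labels such as `"good"`, `"bad"`, and `"ugly"` returns a value of at most 1, where 1 means perfect agreement and values near 0 mean agreement no better than chance.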
Our next step was to interview the 56 users, which required IRB approval. I completed my IRB training and submitted the study to the review board. I also wrote a script to determine the ratios of alt text usage on the 56 accounts.
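A ratio script of this kind could be as small as the sketch below. It assumes the photo tweets have already been reduced to `(account, has_alt_text)` pairs; that input shape and the name `alt_text_ratios` are assumptions made for the example, not the original code.

```python
from collections import defaultdict


def alt_text_ratios(records):
    """Return {account: fraction of that account's photo tweets with alt text}.

    `records` is an iterable of (account, has_alt_text) pairs, one per photo tweet.
    """
    totals = defaultdict(int)
    with_alt = defaultdict(int)
    for account, has_alt in records:
        totals[account] += 1
        if has_alt:
            with_alt[account] += 1
    # Every account in `totals` has at least one tweet, so no division by zero.
    return {account: with_alt[account] / totals[account] for account in totals}
```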
Week 10 was slow. I created a script to take samples of each of the 56 users' alt text. I also created and presented a poster on my summer research.
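The sampling step can be illustrated with Python's standard `random` module. This is a hedged sketch: the dictionary layout, the function name `sample_per_user`, and the fixed seed are all assumptions chosen to make the example reproducible.

```python
import random


def sample_per_user(alt_texts_by_user, k, seed=0):
    """Draw up to k alt texts per user, without replacement.

    `alt_texts_by_user` maps a username to a list of that user's alt texts.
    Users with fewer than k alt texts contribute everything they have.
    """
    rng = random.Random(seed)  # seeded so the same sample can be regenerated
    samples = {}
    for user, texts in alt_texts_by_user.items():
        samples[user] = rng.sample(texts, min(k, len(texts)))
    return samples
```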