DREU 2012

Week One

6.11.12 Monday
Day one! I met Ani for the first time today and I was very relieved to discover that she is very nice and very approachable. This is my first time doing computer science research so I expect I'll be learning a lot in the next few weeks. Ani has two other undergraduate students working with her this summer, Kaitlyn and Ethan, as well as a PhD student, Annie.

6.13.12 Wednesday
I started my official work today. Ethan, Kaitlyn, and I have been given thousands of New York Times articles in the categories business, science, sports, and US international relations to work with. We're supposed to analyze the leads of these articles using Python and NLTK (the Natural Language Toolkit). I've never learned Python or worked in the command prompt before, so this will be interesting...

6.15.12 Friday
Today marks the end of my first work week here in Philadelphia and it's been quite a week. At least now I've familiarized myself with Python, NLTK, and the command prompt. The last few days have been a bit intimidating, but I think I've got the hang of things. I just got back from a meeting with Ani where we went over some of the data we generated this week.

6.16.12 Saturday
Went to work today because I still don't feel entirely comfortable with my projects. The lab was empty but I've been so productive!

Week Two

6.18.12 Monday
Last night, I ran a part-of-speech tagger on all of the tokens, by genre. Today, I spent most of my time figuring out how to import those tagged tokens into a list and then sorting them: most frequently used adjectives, most frequently used adverbs, things like that. My code is a little inefficient, though, so the programs take a little longer to run. I will perfect my Python eventually. My results look great though! I found a way to display all my tokens in columns by genre. Check it out! LINK
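The counting step can be sketched roughly like this. It is a minimal illustration and not the original script: it assumes the tagged tokens are already loaded as (word, tag) pairs with Penn Treebank tags, which is the shape nltk.pos_tag returns, and the function name and toy data are made up for the example.

```python
from collections import Counter

def most_common_by_tag(tagged_tokens, tag_prefix, n=10):
    """Return the n most frequent lowercased words whose tag starts with tag_prefix."""
    counts = Counter(
        word.lower()
        for word, tag in tagged_tokens
        if tag.startswith(tag_prefix)
    )
    return counts.most_common(n)

# Toy (word, tag) pairs; real input would come from the tagger output.
tagged = [
    ("The", "DT"), ("quick", "JJ"), ("market", "NN"), ("rose", "VBD"),
    ("sharply", "RB"), ("after", "IN"), ("a", "DT"), ("quick", "JJ"),
    ("strong", "JJ"), ("rally", "NN"), ("yesterday", "NN"),
]

top_adjectives = most_common_by_tag(tagged, "JJ")  # JJ, JJR, JJS
top_adverbs = most_common_by_tag(tagged, "RB")     # RB, RBR, RBS
```

Counter makes a single pass over the tokens, so this kind of tally stays fast even with thousands of articles.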

6.20.12 Wednesday
Every week, I'm supposed to read two research papers on natural language processing. Today, I read an article in which the researcher, Sophie Kushkuley, attempted to analyze trends found in the fashion articles in Harper's Bazaar. It was actually pretty interesting. Unfortunately, she only had magazines from a thirty-year range in the 19th century, so she wasn't able to determine if fashion trends are really cyclic (as they're said to be). But I'm fascinated by the idea that something like fashion trends can be scientifically measured, using a computer to do all the reading!

Week Three

6.25.12 Monday
I spent so much time this weekend remaking this website, but I'm much happier with it now. This week we've started supervised machine learning! We're using LIBSVM (A Library for Support Vector Machines) and cross validation to teach a machine to classify leads by genre. I just started looking at how to use the program a few hours ago and it was so confusing. Luckily, I found some tutorials and some forum posts to make more sense of the "readme" file. Now, I just finished running my first training set on the rate of proper nouns and the adjective to noun ratio. I got a cross validation accuracy of 60.352%. Not stellar, but better than randomly guessing?
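Here is a rough sketch of how those two features could be computed and written out in the sparse text format LIBSVM reads (one example per line: a label, then index:value pairs starting at index 1). The genre-to-label mapping, the function names, and the sample lead are all assumptions made for illustration, not the code actually used.

```python
def lead_features(tagged_tokens):
    """Proper-noun rate and adjective-to-noun ratio for one POS-tagged lead."""
    total = len(tagged_tokens)
    proper = sum(1 for _, tag in tagged_tokens if tag.startswith("NNP"))
    nouns = sum(1 for _, tag in tagged_tokens if tag.startswith("NN"))
    adjectives = sum(1 for _, tag in tagged_tokens if tag.startswith("JJ"))
    proper_rate = proper / total if total else 0.0
    adj_noun_ratio = adjectives / nouns if nouns else 0.0
    return [proper_rate, adj_noun_ratio]

def to_libsvm_line(label, features):
    """Format one example as '<label> 1:<v1> 2:<v2> ...'."""
    pairs = " ".join(f"{i}:{value:.6f}" for i, value in enumerate(features, start=1))
    return f"{label} {pairs}"

# Hypothetical integer labels for the four genres.
GENRE_LABELS = {"business": 1, "science": 2, "sports": 3, "international": 4}

tagged_lead = [
    ("Apple", "NNP"), ("reported", "VBD"), ("strong", "JJ"),
    ("quarterly", "JJ"), ("earnings", "NNS"), ("Tuesday", "NNP"),
]
line = to_libsvm_line(GENRE_LABELS["business"], lead_features(tagged_lead))
```

With one such line per lead saved to a file, LIBSVM's svm-train can be run with the -v option to do n-fold cross validation (for example, -v 10 for ten folds). As a point of comparison, if the four genres were equally represented, random guessing would land near 25%.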

Week Four

7.5.12 Thursday
Last week's lesson on supervised machine learning was not as fun as I thought it would be. It is very interesting, but it also took a long time for LIBSVM to generate the values I needed. Last Wednesday, Annie gave us a short lesson on the math behind different classifying methods and I really enjoyed that. I was pretty sick Sunday through Tuesday, so that's why I haven't updated in a while. But! It's been a very successful week on the NLP front. I seem to have found a reasonably reliable way of classifying news leads by whether they're written to entertain or to educate. My classification is based on the variance of sentence lengths, the variance of average word lengths by sentence, and the prevalence of topic words. I just got back from our group meeting with Ani and she seemed very excited! I feel very accomplished. For the rest of this week I'll be working on trying to figure out how accurate my classifications are. Keeping my fingers crossed.
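The two variance features could look something like the sketch below. It makes several assumptions that the entry doesn't spell out: a naive regex sentence splitter stands in for a real tokenizer, words are runs of letters and apostrophes, and population variance is used. The topic-word feature is left out because it depends on a topic word list that isn't described here.

```python
import re
from statistics import mean, pvariance

def split_sentences(text):
    """Naive split after ., ! or ? followed by whitespace (a real tokenizer is sturdier)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def length_variance_features(text):
    """Variance of sentence lengths (in words) and of per-sentence average word length."""
    sentences = [re.findall(r"[A-Za-z']+", s) for s in split_sentences(text)]
    sentences = [words for words in sentences if words]
    if not sentences:
        return {"sentence_length_variance": 0.0, "word_length_variance": 0.0}
    sentence_lengths = [len(words) for words in sentences]
    avg_word_lengths = [mean(len(w) for w in words) for words in sentences]
    return {
        "sentence_length_variance": pvariance(sentence_lengths),
        "word_length_variance": pvariance(avg_word_lengths),
    }

lead = "It rained. The mayor cancelled the parade anyway."
features = length_variance_features(lead)
```

A lead that mixes a very short sentence with a long one, like the toy example, scores a higher sentence length variance than a lead made of evenly sized sentences.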

Week Five

7.10.12 Tuesday
I am very bad at making decisions, which is unfortunate because my task for the week is reading 800-900 news leads and categorizing them as interesting, well-written (but not so interesting), or not at all interesting. For example, to evaluate the reliability of my classifiers, I take a random 20 from the 100 most entertaining leads in one genre. I read these leads and decide how many of them are entertaining to see how the percentage stands. One problem I've found with this task is that I'm heavily biased because I want my program to work...
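That spot check boils down to a few lines. This is only a sketch under assumptions: the leads are taken to be already sorted from most to least entertaining by the classifier, and the function names and the seed parameter are invented for the example.

```python
import random

def sample_for_review(ranked_leads, top_k=100, sample_size=20, seed=None):
    """Draw a random sample (without replacement) from the top_k highest-ranked leads."""
    rng = random.Random(seed)
    top = ranked_leads[:top_k]
    return rng.sample(top, min(sample_size, len(top)))

def agreement_rate(judgments):
    """Fraction of sampled leads a human reader marked as entertaining (True)."""
    return sum(judgments) / len(judgments) if judgments else 0.0

# Stand-in for a ranked list of leads; integers keep the example short.
ranked = list(range(500))
to_read = sample_for_review(ranked, seed=0)

# Hypothetical judgments after reading the 20 sampled leads.
rate = agreement_rate([True] * 15 + [False] * 5)
```

Fixing the seed makes the same 20 leads come up again, which helps when two people want to judge the identical sample.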

7.13.12 Friday
Ethan and I both classified almost 1000 leads as interesting or boring. We're using sentence length variances, word length variances, verb novelty scores, lead-to-abstract ratios, topic word densities, topic word coverages, and genres as features in several LIBSVM ten-fold cross validation tests, and the accuracies we've been getting have been really confusing. Most of them are above our baseline, and the accuracies do get as high as 80-something percent, so that is promising. Yesterday, I also dug up some of my long lost statistics knowledge to run Student's t-tests and correlations on our data to see if our features were significantly different between "interesting" and "boring" leads. Also, we've moved further along with producing a Mechanical Turk study and hopefully we'll be able to publish it next week!
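For one feature, the t-test comparison could be sketched as below. It assumes the classic two-sample Student's t-test with pooled variance (the entry doesn't say which variant was run), and the feature values are made up.

```python
from math import sqrt
from statistics import mean, variance

def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Each sample needs at least two values.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df

# Made-up values of one feature for the two hand-labeled groups.
interesting = [4.0, 6.0, 8.0]
boring = [1.0, 2.0, 3.0]
t_stat, dof = student_t(interesting, boring)
```

The t statistic and degrees of freedom can then be checked against a t table, or a library such as SciPy (scipy.stats.ttest_ind) can return the p-value directly.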

Week Six

7.18.12 Wednesday
A few weeks ago, when Ani first introduced us to LIBSVM, I was a little frustrated with machine learning because the program seemed so finicky and also took quite some time to run. Well, this week, I've run at least 50 different tests with LIBSVM and I realized that LIBSVM is just as finicky as I initially thought it was. I had to redo some of my tests because files weren't processing correctly. I had to redo some other tests because I copied and pasted commands wrong (nothing to do with LIBSVM, I'll admit). In the end, the results I got for using WH word ratios, verb novelty scores, punctuation stats, and some other things were not particularly promising. Our Mechanical Turk study seems to be taking much longer to prepare than we initially thought, but it's okay. During today's meeting, we finalized the 10 questions we want to ask per lead and outlined most of the details. Ethan said he should have a draft waiting in the sandbox for us by Monday! Excitement.

7.19.12 Thursday
I had an amazingly productive afternoon today! It’s funny because I’ve been running into small problems left and right all week. I’m quite relieved to have things running smoothly again. Today, I added statistics for several more features and reformatted a lot of the data. I realized today how much I’ve learned this summer already. Looking at some of the Python scripts I wrote at the beginning of the summer and comparing them to the ones I wrote today, I can tell there is a huge difference. I’m very content with all the work I’ve done this summer and can’t wait to see where it takes me next.
The features Ethan and I have developed so far:
Genre, Topic Word Density, Topic Word Coverage, Lead-Abstract Ratio, Sentence Length Variance, Word Length Variance, Sentence, Word Variance Sum, Sentence Length Average, Word Length Average, Wh-Determiner Ratio, Wh-Pronoun Ratio, Wh-Adverb Ratio, Punctuation Density, Punctuation Coverage, Prepositions By Sentence Average, Prepositions By Sentence Range, Prepositions By Sentence Variance, Predicted Class Average (specificity), Probability of General (specificity), Probability of Specific (specificity), Subjectiveness Average (MPQA), Polarity Average (MPQA), Imagery Average (MRC), Age of Acquisition Average (MRC), Familiarity Average (MRC), Concreteness Average (MRC), Meaning Uncertainty Average (MRC), Imagery Median (MRC), Age of Acquisition Median (MRC), Familiarity Median (MRC), Concreteness Median (MRC), Meaning Uncertainty Median (MRC), Imagery Variance (MRC), Age of Acquisition Variance (MRC), Familiarity Variance (MRC), Concreteness Variance (MRC), Meaning Uncertainty Variance (MRC), Verb Novelty, Verb Novelty Average, Verb Novelty Variance, Adjective Novelty, Adjective Average, Adjective Variance, Adverb Novelty, Adverb Average, Adverb Variance
That’s more than 40!

Week Seven

7.24.12 Tuesday
I spent some of yesterday and all of this morning rewriting the code that generates most of the features. My original code was not very well written to begin with and, with every new feature, I had to write more code and it turned into a mess. But now, problem solved! After a lot of work...

Week Eight

7.31.12 Tuesday
My goodness! I can't believe that I've already spent a full 7 weeks here. Time has really flown by... I can honestly say that I'd rather stay here doing research than go back to school, but oh well. Researching has really grown on me! Today, I found a very cool sentiment dictionary called the Regressive Imagery Dictionary that categorizes roots of words into 29 categories of primary process cognition, 7 categories of secondary process cognition, and 7 categories of emotions. It can be used to compare conceptual and primordial thought within a piece of text using word frequencies! Using this dictionary, I added about 50 new features (almost doubling my total) and the classifier improved 3% on the hand-classified leads. Ah! In case you couldn't tell, I'm pretty excited. :) (I'm really glad I spent some time tidying code last week or else adding all these new features would have been very difficult.)

8.3.12 Friday
This week the CS department here at Penn had two talks: one on poster making/presenting and the other on writing research papers. We (Ethan, Kaitlyn, and I) decided it would be a good idea to attend these talks just because we will (hopefully) eventually get to the poster/publishing stage. I didn't learn anything too new, but I'm glad I went because now I'm much less intimidated by poster presenting and paper writing. Now that our classifier is actually doing reasonably well on the leads, we're going to see if the classifier is just as good at determining "interesting" writing in random sentences selected from the articles. We've also mentioned machine translations in some emails and today's meeting, but I'm not entirely clear on what that's about, so more on it later.

Week Nine

8.9.12 Thursday
On Tuesday, Ethan and I had an "existential crisis" because we couldn't remember or figure out what the ultimate goal of our research was. We spent hours in the conference room trying to outline our research on the whiteboards but didn't really get anywhere. So, on Wednesday, during our meeting with Ani, we asked her to explain her vision for our research. Unfortunately, her explanations weren't all that comforting, but I have faith in her, so I'm just going to go along with what she says/tells me to do.

Week Ten

8.17.12 Friday
Last official day! I have had an absolutely amazing summer here at Penn and, even though I was really hesitant to spend my summer doing research, I'm so so glad I did. Turns out, computer science research is much more interesting than I thought it would be. Ethan and I finally took a trip into Center City to see the Liberty Bell this week. I can't believe I stayed in Philly for so long without going, but at least now I've seen it! I also went to Dairy Queen for the first time and had my first Blizzard...but that's another story. I realize why researchers like to publish papers and present posters now though, aside from bonus reputation points. It's a little difficult to see when research "ends." Probably because it never really does. Anyways, thanks for following me on my adventures this summer. Look for my final report on the other panel!
