The Project: Analyzing Tweets about Movies

Have you ever used Twitter to state that you hated or loved a movie? I will be training text classifiers to recognize whether tweets like yours convey positive or negative opinions. This task is a form of sentiment analysis, which involves identifying whether text conveys a specific emotion.

My first task is to clean up my data. Data mining sometimes involves storing irrelevant information, so not every tweet I collect will be useful. I need to determine which tweets mention movies and convey positive or negative opinions, and set the rest aside.

To train text classifiers, I need to derive a training data set. I will label some tweets as positive or negative. Matt Cholick, a Kansas State graduate student, used emoticons to label tweets when he programmed a bootstrapped movie recommender. I may initially use this technique as well.
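
As a rough illustration of emoticon-based labeling, here is a minimal sketch in Python. The emoticon lists, the function name label_by_emoticon, and the rule of skipping tweets with mixed or no emoticons are my own assumptions for illustration; they are not taken from Matt Cholick's recommender.

```python
from typing import Optional

# Hypothetical emoticon lists; the actual emoticons used for labeling
# may differ.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", "=)"}
NEGATIVE_EMOTICONS = {":(", ":-(", "=("}


def label_by_emoticon(tweet: str) -> Optional[str]:
    """Return "positive", "negative", or None when no clear label exists."""
    # Split on whitespace so an emoticon counts only as a standalone token.
    tokens = set(tweet.split())
    has_positive = bool(tokens & POSITIVE_EMOTICONS)
    has_negative = bool(tokens & NEGATIVE_EMOTICONS)
    # Skip tweets with no emoticon, or with both kinds (ambiguous).
    if has_positive == has_negative:
        return None
    return "positive" if has_positive else "negative"
```

Because this sketch splits on whitespace, an emoticon glued to a word (as in "great:)") goes unnoticed. That keeps the example short, but a real labeling pass would likely need more careful tokenization.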

Sentiment analysis may find words and phrases that strongly influence whether the classifier categorizes a tweet as positive or negative. These words and phrases could potentially influence readers and encourage beneficial behaviors. For example, sentiment analysis could help software developers offer users more effective security warnings.
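
One simple way to see which words lean positive or negative is to compare how often each word appears in tweets of each label. The sketch below computes a smoothed log-odds score per word. The function name word_log_odds, the whitespace tokenization, and the add-one smoothing are assumptions chosen for illustration, and this stands in for inspecting a trained classifier rather than describing the classifiers I will actually use.

```python
import math
from collections import Counter


def word_log_odds(labeled_tweets, smoothing=1.0):
    """Score each word: above 0 leans positive, below 0 leans negative.

    labeled_tweets is an iterable of (text, label) pairs, where label is
    "positive" or "negative".
    """
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in labeled_tweets:
        counts[label].update(text.lower().split())

    vocabulary = set(counts["positive"]) | set(counts["negative"])
    # Smoothed totals, so a word unseen in one class never yields log(0).
    totals = {
        label: sum(counter.values()) + smoothing * len(vocabulary)
        for label, counter in counts.items()
    }

    scores = {}
    for word in vocabulary:
        p_positive = (counts["positive"][word] + smoothing) / totals["positive"]
        p_negative = (counts["negative"][word] + smoothing) / totals["negative"]
        scores[word] = math.log(p_positive / p_negative)
    return scores
```

Sorting the resulting dictionary by score would put the most positive-leaning words at one end and the most negative-leaning words at the other. Words that appear equally often under both labels score close to zero.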