I found nine new Telugu blogs and scraped them. The total word count so far is 441,049. We sent the data to IBM to use for their work on the Babel project. We're now looking into other languages, including Lithuanian, Kurmanji Kurdish, Cebuano, Tok Pisin, and Kazakh. We've found blogs to scrape for each of these languages. Hopefully all of this data will be useful for building tools for low-resource languages.
As my DREU project is coming to an end, I'd like to thank Dr. Hirschberg and Erica Cooper for their guidance and direction throughout the summer. I've learned so much from each of them. I feel privileged to have been matched with such wonderful mentors. I'd also like to thank the DREU coordinators and donors for making this program possible. This summer has been an incredible experience. Thank you!