Emily's DREU '15 - Week Five

Thursday--Debugging and speaker

7/6/2015

Right before we left last night, Rohan was telling Kivan and me about a script he wrote to calculate the percentage of the total run time that each leg of a trip takes up. The hope was that it would help us analyze ugly rides by showing which legs use up the most time. The script is pretty simple, but when he double-checked the output he realized that the percentages for the legs in a run didn't always add up to one. So I told him I would look at it today and see if I could figure out where the bug might be.
I don't want to spoil the ending for you, but I worked on debugging this script all day and (SPOILERS) I still could not figure out why it doesn't always give the correct percentages. Long story short, sometimes the percentages are accurate, but other times they are about (but not exactly) half of what they should be, which makes it impossible for us to rely on the output. The results also seemed to depend on the size of the data set: given a subset of the same data, the percentages changed drastically for reasons unknown. All in all, I don't think this script will help us as much as Rohan was hoping, but it was fun to play around with it and do some debugging in R.
In the process I also double-checked that our leg times script was outputting the correct leg times, which is important since those are used to calculate the ugly rides. Speaking of ugly rides, while I was working on his script I asked Rohan to take a look at my ugly rides script and optimize it, because although it works, it was a first-pass attempt that ended up with four for loops. In the end, both Rohan and Frank worked on it, and Frank got a new copy of the script to run faster and produce some output, so we finally have output telling us which rides are considered ugly rides. It looks like we'll be able to use this next week to dig deeper into specific cases.
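For anyone curious what the percentage check looks like, here is a minimal sketch in Python. It is not Rohan's R script, and it does not explain the bug, which I never found. The function names, the (run, leg, seconds) layout, and the sample numbers are all made up for illustration. The idea is to divide each leg's time by the total for its own run, then flag any run whose shares do not add up to one.

```python
from collections import defaultdict


def leg_shares(legs):
    """Return {run_id: {leg_id: share_of_run_time}}.

    `legs` is an iterable of (run_id, leg_id, seconds) tuples.
    This layout is a hypothetical stand-in for the real leg times data.
    """
    legs = list(legs)

    # Total time per run, computed only from that run's own legs.
    totals = defaultdict(float)
    for run_id, _leg_id, seconds in legs:
        totals[run_id] += seconds

    shares = defaultdict(lambda: defaultdict(float))
    for run_id, leg_id, seconds in legs:
        if totals[run_id] == 0:
            continue  # skip runs with no recorded time to avoid dividing by zero
        # Accumulate so that a repeated leg id adds up instead of overwriting.
        shares[run_id][leg_id] += seconds / totals[run_id]

    return {run_id: dict(by_leg) for run_id, by_leg in shares.items()}


def runs_not_summing_to_one(shares, tol=1e-9):
    """List the runs whose leg shares do not add up to one (within `tol`)."""
    return [
        run_id
        for run_id, by_leg in shares.items()
        if abs(sum(by_leg.values()) - 1.0) > tol
    ]


# Made-up sample data: two runs, three legs in total.
sample = [("run1", "a", 30.0), ("run1", "b", 90.0), ("run2", "a", 50.0)]
shares = leg_shares(sample)
bad_runs = runs_not_summing_to_one(shares)
```

Because each run is divided by its own total, the shares for a run should not change when other runs are dropped from the data, so running the same check on a subset is a quick way to look for the kind of size dependence described above.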
Our day ended with a presentation by Candace Faber talking about collaboration with different stakeholders and big data. I really liked her presentation because it was interactive and used meaningful and captivating examples to illustrate the points she was trying to get across.

Tuesday--team time!

7/1/2015

They're installing acoustic panels in the eScience studio and today was supposed to be the "loud day", so we got moved to the CSE building for the day. The morning started off well with coffee and pastries around 9:30, and then we went around and gave project updates from the four different teams. We don't usually get much time to hear what the other groups are working on and how it's going for them, so I think these weekly updates will be helpful for getting a little insight into the other projects. Afterwards, the paratransit team and the sidewalks group met as one big group to set up some milestones or objectives for the coming two weeks. Our group's main goal is to have our optimization problem well defined by next Thursday and to finish the cost per boarding analysis before then, too.
The day went by really fast for me because for once we all had time to work together in the same space as a team. Even though we were working on different aspects of the project, it was incredibly helpful to be able to ask each other questions and get help as soon as something came up. I was working on the cost per boarding analysis and on a script that would give us the "ugly rides". I also took some time to go through part of the QCing script with Nick to make sure he understood what it was doing and what the R syntax meant. There was a lot of back and forth updating our workspace through git as everybody kept pushing new code, and I know my own code still needs much better annotations. The RStudio server that Rohan set up was also helpful: once I create the data file I need, I can use it to run my UglyRides script without having to remake the data file each time.

Monday--CSE meeting, R, and reading group

6/29/2015

Originally, I had planned on doing the reading for our DSSG reading group over the weekend, but to be honest I hardly thought about work or the project at all this weekend. So when I got in this morning, first on my agenda was to read the article that we'll be discussing in our reading group this afternoon. It was a pretty interesting article about the different issues that arise with the new popularity of studying Big Data. I think it is particularly important for scientists of any discipline using Big Data, but especially computer scientists, who tend to be trained to see data as explanatory in and of itself and who usually think of themselves as introducing little bias in their line of work. Really, using Big Data introduces a lot of bias that can change the things being studied and the way different audiences receive the results of a study. I'm looking forward to talking about it in the reading group today.
