Emily's DREU '15

daily blog

week ten

Friday--¡Adios muchachos! 

8/14/2015


 
     It's my last day of research and my last full day in Seattle! Anat is meeting with us at 11 AM today, and then I'm going to return Leonardo in the afternoon. I finished writing up my final report, and I should be good to go with all that we need for the projections. Last thing to do is to download this website and send it off to DREU. 
    Well, research this summer has been quite the experience. I really enjoyed my time exploring Seattle and the Pacific Northwest, and I would recommend a trip to anyone who has yet to visit. I'm extremely grateful that I got to be part of such an interesting project and that I was paired with some of the most talented and phenomenal teammates I could have asked for. I'll really miss our thought-provoking conversations and fun times hanging out around the office. To them, as well as my amazing mentor and project lead Anat Caspi, I want to give one last shout-out and a big thanks!

Thursday--wrapping up

8/13/2015


 
    This morning I was putting the final touches on the VBA code. When I finally got it to work to my satisfaction, I started on the final task of writing up explicit instructions and making sure it would be truly reproducible. To do this, I saved my VBA code and tried to load it into the other Access data set we have (the eighteen-month data). I documented every step in a text file, complete with screenshots of the important ones. Of course, while working through this I found other little things that have to be set and configured correctly for the program to work, so it was helpful that I did this reproducibility test before handing it off to Matthew. I also decided to add a bit more functionality that I had overlooked before. Testing on the eighteen-month data takes quite a bit longer, though, since even just saving the Access database onto my laptop took a good amount of time. After final tweaks, I got it to work! This means my work is truly reproducible and should be a really helpful tool for King County Metro when making projections in the future. To see my PDF with the instructions, you can look here, or if you're really curious about what the monster VBA code looked like in the end you can check that out here. For those of you who are just interested in the pretty projections that it made in the end, you can see one example below.
projectionsmacro.bas
File Size: 15 kb
File Type: bas
Download File

projectionsinstructions.pdf
File Size: 155 kb
File Type: pdf
Download File

Picture
     This whole project with the projections was a great example for me of how difficult it can be to automate a process or make your work reproducible. I am now so grateful for the point-and-click GUIs that the Microsoft suite and other applications provide. I also learned that VBA can be a powerful tool and really helpful for automating long processes, but the code can be incredibly hard and frustrating to deal with.

Wednesday--almost done with VBA!

8/12/2015


 
     I was once again working on the VBA macro today, and for the first time in forever I wasn't running into a million problems left and right. Granted, I did end up posting another StackOverflow question, and that part of the code still doesn't work like I want it to, but for the most part the problems I came across today had solutions that were only a few Google searches and attempts away. Because of this, I made so much progress! It's starting to look like this is really going to happen, and that I'll actually be able to make a macro that does everything it needs to by Friday! I finally got it to put all the data in the correct format, write it into the Excel workbook, move the specified data over to another sheet, calculate the average and standard deviation for each hour, and then create two graphs: one for the average usage and one for all the usage on that specific day of the week. Now I just need to automate it over the different days of the week so I get not just Monday's data and graphs, but also those for the other weekdays. I've made some progress towards that end, but I've still got a few kinks to work out. It's so close though!!!
      This morning there was the reproducibility workshop, which Valentina led. I wish that my macro had been finished so that I could have given it to another group to try to run. However, they would have needed Microsoft Access on their computer and the data set we were given, so maybe it would not have worked anyway. I thought the workshop was a really good idea and that it would help us see new issues, but in reality I don't think it was as meaningful as it could have been, because most of the groups didn't seem to have code that they wanted to share with the aim of making it reproducible. Instead, it seemed like the groups just chose the nearest code they could find, even though realistically that code probably wasn't made for a client or anyone else to use. It was just going to be run once by its creator, and then the results would be published for a larger audience.
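The "average and standard deviation for each hour" step mentioned above can be sketched in a few lines. This is only an illustration in Python, not the VBA macro itself: the function name `hourly_stats`, the tuple layout, and the sample numbers are all made up for the example and are not King County Metro data.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up sample rows: (date, hour of day, number of runs).
# These values are invented for illustration only.
mondays = [
    ("2015-07-06", 8, 10), ("2015-07-13", 8, 12), ("2015-07-20", 8, 14),
    ("2015-07-06", 9, 5), ("2015-07-13", 9, 5), ("2015-07-20", 9, 8),
]


def hourly_stats(rows):
    """Group run counts by hour and return {hour: (mean, sample std dev)}."""
    by_hour = defaultdict(list)
    for _date, hour, runs in rows:
        by_hour[hour].append(runs)
    # stdev() is the sample standard deviation and needs at least two values.
    return {hour: (mean(vals), stdev(vals)) for hour, vals in sorted(by_hour.items())}


stats = hourly_stats(mondays)
for hour, (avg, sd) in stats.items():
    print(hour, avg, round(sd, 2))
```

Repeating the same call on the rows for each of the other weekdays would cover the "automate it over the different days of the week" part.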

Tuesday--code review and SQL success!

8/11/2015


 
     As it was a Tuesday, today started with my last coffee, morning snacks, and project updates time here at DSSG. Basically our group just reported that we're making progress with the rescheduler, making headway with the web app stuff, and doing some final analyses, which include me making the projections reproducible. It seems like all the groups are wrapping up, especially the Predictors for Permanent Housing group, since they have to present to their stakeholders this Thursday! Exciting times around here at the eScience Center.
    In the morning I asked Valentina if by any chance she could help me with the sticky SQL situation I had been dealing with yesterday. After lots of different attempts, we finally were able to put our two heads together and find a solution that was able to work in Access (you wouldn't believe how difficult it can be to find valid SQL commands for Access specifically sometimes). If you'll recall from yesterday, I was having trouble with the COUNT function because it didn't include zero values for the days that had no runs at the specified hour. Basically, my Excel worksheet was looking like this:
Picture
This is bad because we don't have the same number of rows for each column, which will be necessary when we want to make our projections. In the end, instead of using COUNT we used a combination of IIf (the "immediate if" function), which marks each data entry with a one if it has a run at that hour and a zero otherwise, and SUM, which adds up those marks to give the number of runs at that hour. Check out the StackOverflow question I posted if you're curious about the exact syntax. Now our code looks like this:

Picture
After that significant success story, all I need to do is finish up the VBA code to correctly make the graphs, which hopefully won't take too long tomorrow.
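For anyone who wants to see the COUNT problem and the SUM fix side by side, here is a minimal sketch. It is an assumption-heavy stand-in: it uses Python's built-in sqlite3 instead of Access, the table `runs` and its columns `run_date` and `run_hour` are hypothetical names, and the rows are invented. SQLite spells the zero-or-one flag with CASE, where Access would use IIf(run_hour = 8, 1, 0).

```python
import sqlite3

# Hypothetical schema and made-up rows; the real Access tables are not shown here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_date TEXT, run_hour INTEGER)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [
        ("2015-08-03", 8),
        ("2015-08-03", 8),
        ("2015-08-03", 9),
        ("2015-08-04", 9),  # this day has no runs at hour 8
        ("2015-08-05", 8),
    ],
)

# COUNT after filtering to hour 8: days with no runs at that hour drop out entirely.
count_rows = conn.execute(
    "SELECT run_date, COUNT(*) FROM runs WHERE run_hour = 8 "
    "GROUP BY run_date ORDER BY run_date"
).fetchall()

# SUM over a 0/1 flag: every day keeps a row, with 0 where nothing ran at hour 8.
sum_rows = conn.execute(
    "SELECT run_date, SUM(CASE WHEN run_hour = 8 THEN 1 ELSE 0 END) "
    "FROM runs GROUP BY run_date ORDER BY run_date"
).fetchall()

print(count_rows)  # two rows: 2015-08-04 is missing
print(sum_rows)    # three rows: 2015-08-04 appears with a 0
```

Because the second query never filters rows out before grouping, every date produces exactly one row per hour column, which is what keeps the columns the same length.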
     Also, today Joe met with us in the afternoon to do a quick demo and code review of the web app branch so far. It was cool to see how it can integrate Python when we use Flask, and I'm just generally interested in web applications and design so I thought that was pretty neat. It looks like we should be on schedule to get the web app portion done on time for the presentation too, which is always a good sign.
    I almost forgot! This morning after coffee hour and project updates Frank gave a really cool presentation on easier machine learning techniques and how to make some basic mathematical models when working on data sets, particularly the kind they might give you in programming interviews nowadays. I highly recommend checking it out at his GitHub site if you get the chance. I know I plan on looking back at it more closely later, and I might use it as a resource in future projects.

Monday--LAST WEEK!

8/10/2015


 
     It's here! We're in the last week of research! Well, at least I am. In honor of that:
     Okay, but really today I just worked on the VBA more (surprise, surprise). For those of you really interested, one of the problems I was running into was that I was using the COUNT function in SQL, but that doesn't return a row for instances where the count is zero. These zeros are still significant in our data, so I spent today looking into various workarounds. The first way I thought of involved taking the complete columns that had no zero values, comparing them to the incomplete columns once everything was in Excel, and using that comparison to add cells with zero values in the appropriate places. Basically, this is just automating what I did manually, but what is very intuitive by hand is conceptually much harder to automate. The other avenue I checked out was changing the SQL to somehow account for the zero values. I read that it might be possible to have the NULL values counted as zeros if I did a left self join on the table, but Access is being incredibly difficult with joins and accepts almost none of my queries.
    Update: I got the join syntax right and it ran the query! However, it did not, as promised by the internet, count any of the zeros so I'm really still back to square one. Sigh.
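The query from this post isn't shown, so the following is not a diagnosis of it. It is only a sketch of one common way a left join can still lose the zeros: putting the hour filter in WHERE instead of in the join's ON clause. It uses Python's built-in sqlite3 rather than Access (whose join syntax is stricter), and the table `runs`, its columns, and the rows are hypothetical.

```python
import sqlite3

# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_date TEXT, run_hour INTEGER)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [("2015-08-03", 8), ("2015-08-03", 8), ("2015-08-04", 9), ("2015-08-05", 8)],
)

# Filter in WHERE: rows that do not match hour 8 are discarded after the join,
# so a day with no runs at hour 8 has nothing left to group and disappears.
where_rows = conn.execute("""
    SELECT d.run_date, COUNT(r.run_hour)
    FROM (SELECT DISTINCT run_date FROM runs) AS d
    LEFT JOIN runs AS r ON r.run_date = d.run_date
    WHERE r.run_hour = 8
    GROUP BY d.run_date ORDER BY d.run_date
""").fetchall()

# Filter in ON: a day with no hour-8 runs survives as one row of NULLs,
# and COUNT(column) ignores NULLs, so that day is reported with 0.
on_rows = conn.execute("""
    SELECT d.run_date, COUNT(r.run_hour)
    FROM (SELECT DISTINCT run_date FROM runs) AS d
    LEFT JOIN runs AS r ON r.run_date = d.run_date AND r.run_hour = 8
    GROUP BY d.run_date ORDER BY d.run_date
""").fetchall()

print(where_rows)
print(on_rows)
```

Note that the zeros only appear when counting a column from the joined side; COUNT(*) would count the NULL-filled row as one.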

    about

    Throughout my research I kept a daily blog that details what I did and my experiences. On this page you are invited to check out my different entries.

