I moved all my stuff in here at Duke yesterday. It has been a busy couple of days! I am living in an on-campus apartment, but my roommate hasn't moved in yet. She is participating in the Distributed Mentor Project with another Duke faculty member. We had lunch today with some of the other undergraduates doing research at Duke this summer, and I met a few graduate students. The research atmosphere seems very relaxed! Carla gave me a bunch of papers to read this week so I can get up to speed on the project I will be working on here. It's quite a stack, but I feel like I've got plenty of time. :)
Well, I made it through all those papers! Really, they weren't too bad, and I feel like I understand at least the project we are going to be working on better. I'm sort of jumping in for a little piece of a pretty large, and ongoing, research project. It's a little intimidating, with the rest of the work done by graduate students and faculty members. At least I will learn a lot.
I will be working on the Milly Watt Project with Carla this summer. Click here to see the "official" Milly Watt web page. We are going to be working with a modified version of Linux, called ECOSystem, created here in these hallowed halls of research. This is an operating system that manages energy as a first-class resource, allowing users to specify how long they want the battery in their laptop to last, etc. The operating system has already been created, but this summer we will be working on extending the API to allow applications to influence the decisions made by the operating system about how to allocate energy for disk access. Click here to see MY web page talking about this part of the Milly Watt Project.
Well, I've finally been given my "official job" for the moment. We're not working with actual ECOSystem code yet, but our goal is to create detailed specs for what we want to do. So I am sitting in the office, staring at my legal pad, and carrying on detailed arguments with myself. :) It's my charge to start to develop a complete vision for what kind of API extensions we can create that would be both simple and useful to a programmer to provide energy-saving information about file input and output operations called by the application. So I'm trying to come up with ideas, work the whole thing through, and think about all the problems and issues we might have. That's a mouthful!
We had our first "Milly Watt" meeting today. Every week, the various professors and grad students and extraneous people (like me!) who are working on different pieces of the project get together and talk about what they are doing. I'm just starting to realize that this project is a lot bigger than the ECOSystem part that I am learning about. I am trying to wrap my mind around write system calls today and how we are going to make everything work together in an "energy-efficient" manner. It still seems awfully theoretical, though, because we haven't seen any code yet. I guess it's lesson number one in "real" programming: decide what exactly you are trying to do before you start trying to do it.
Well...today we set a date for me to give an "informal presentation" of all the stuff I've been "thinking" about to Carla and then to other members of the Milly Watt team, including the elusive grad student I'll be working with....at some point. :) I'm also supposed to be working on my web page. Yeah, I'm just a little slow. I'm excited to be going someplace concrete with what I'm working on, but a bit nervous...I have to talk in front of other people! But, I've got a week. I can procrastinate for at least a couple of days.
My last several days have been very busy with working on my web page and trying to prepare my presentation. I am teaching myself about cascading style sheets, but sometimes I am so picky that I can spend 45 minutes trying to pick a color. :) It is getting to be addictive, though, playing around with the web page. I have to keep reminding myself that I am supposed to be doing other things. I have pretty much finished preparing PowerPoint slides for my presentation. It has been more difficult to organize my jumble of ideas than I thought it would be. And then, of course, there is the all-important choice of color scheme. :) Hopefully, what I ended up with is a reasonable outline of ideas (strongly slanted toward what I think will work) for creating cooperation between applications and ECOSystem in order to manage disk accesses more efficiently. We have taken ideas from a master's thesis we found about a system that cooperatively manages file I/O to reduce power consumption and tried to translate those ideas into the currentcy-based power management system of ECOSystem. Hopefully my work in this presentation will be the kernel of a new "vision" for integrating energy-aware applications with ECOSystem.
I am going to run through things with Carla tomorrow, before sharing with the Milly Watt group on Tuesday. Click here to view my presentation. Wish me luck!
I had some friends from Michigan visiting this past weekend, and we had a great time at the beach (Hammocks Beach State Park - Bear Island). But it was a crazy weekend! Now I'm back in the office, squirming underneath my still-worsening sunburn and trying to learn more about the specifics of what happens during a write system call. Upon discussing my presentation with Carla on Friday, we realized that we really didn't understand some of the earlier work that had been done, and we needed more specific implementation details to motivate our vision of deferrable/abortable writes in ECOSystem. I'm also working on ideas for testing the new code version of ECOSystem (optimistically assuming that the coding will all go forth without a hitch) to prove that it really does help conserve energy during disk access.
I gave my little talk to Heng and Alvy today. Heng is the graduate student who has done most of the coding work with ECOSystem, and Alvy is another faculty member on the Milly Watt project. It went okay. Now I have to get myself a computer, get Linux loaded, get ECOSystem loaded, and get to work. It is more than a little intimidating, especially for someone who has never compiled a Linux kernel before. I will get there eventually, but sometimes I think that my summer will be over before I even begin to understand how someone might think about making the changes I am supposed to be making. Maybe it will seem more do-able tomorrow after a night's sleep. :)
Finished reading a couple more articles which were really good. They have given me other ideas about things that would be "cool" to do in ECOSystem, but I suppose you have to start somewhere with something small. I've also been looking at a book that Carla gave me, called "Understanding the Linux Kernel." I understand the basic operating systems concepts, but the details mostly go over my head. I assume it will be more helpful once I start looking at the code and going, "oh, my gosh, I have no idea what this is doing!" Speaking of that, Carla is bringing in a laptop for me to use next week, so I will get my chance to hack around the Linux kernel and start trying to make some sense of it. I spent lots of time working on my web page today. Like I said, it's kind of addictive. :) Besides, I figure that I have the time now, and probably won't later...
I got the link to the ECOSystem source code, and I am working on learning my way around in it. It is more than a little daunting. Even if Linux source is small compared to Microsoft operating systems, it is the first time I have ever worked with a code base this big. Every time I see a function used and want to find the function definition, I have to go search it out in another file. And there's not exactly a handy index. Plus, I have never programmed in C before, only C++. And there are differences. I don't think I have ever seen so many stacked pointers in my life.
The altered ECOSystem code isn't any easier. There's not much in the way of comments, so I am sort of left to guess from the names of things what they do. So I guess that I am feeling pretty frustrated. Hopefully, it will just take me a little while to get adjusted, and then I will be able to accomplish something!
Things I learned yesterday:
1. The entire linux kernel will NOT fit in your network storage space without compression.
2. When trying to delete complex hierarchies of directories full of files, use the File Manager, NOT the command line.
3. If you find a linux source file that ends in .S, DON'T open it. You don't even want to know.
I think that I learned a lot yesterday. I am starting to find my way around, and trying to be very patient with myself. I am focusing on the added ECOSystem code that I am going to have to work with in the changes I want to make (like code for the resource container, and the hard drive module that Heng added). I sent a rather frustrated email to Carla, and she was very encouraging. :) She suggested it might help to map some stuff out on paper, so I am now working with a mix of paper copies and electronic searching. It's kind of fun to start figuring some stuff out, though.
Things I have learned about C:
1. "extern" doesn't exactly mean "public". It just means "defined in another file."
2. "static" means "private" to the file, at least for functions and global variables.
3.
4. continue; means skip the rest of the loop body and jump to the next iteration of the loop.
5. goto label; is a valid C statement that jumps you to another section of code, just like a jump in assembly language.
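The items in the list above can be seen together in one small file. This is only a minimal sketch: the names call_count, sum_odd_until, and times_called are made up for illustration and have nothing to do with the ECOSystem source.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope "static": this counter is visible only inside this source file. */
static int call_count = 0;

/* "extern" declares a name whose definition lives elsewhere. Here the
 * definition is further down in this same file, just to keep the sketch
 * self-contained. */
extern int sum_odd_until(const int *values, int n, int stop);

/* Add up the odd values, stopping early if the "stop" value is seen. */
int sum_odd_until(const int *values, int n, int stop)
{
    int total = 0;
    int i;

    assert(values != NULL);
    call_count++;

    for (i = 0; i < n; i++) {
        if (values[i] == stop)
            goto done;      /* jump straight to the label below */
        if (values[i] % 2 == 0)
            continue;       /* skip the rest of the body, go on to i++ */
        total += values[i];
    }
done:
    return total;
}

/* The only way other files could read the static counter. */
int times_called(void)
{
    return call_count;
}
```

Called with the values 1, 2, 3, 4, 5, -1, 7 and a stop value of -1, sum_odd_until adds 1 + 3 + 5 and never reaches the 7.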
In continuing my education about C programming, I have discovered the many and varied uses for the infinite loop: for(;;). I would generally have chosen the while(1) technique, but it is always important to learn new techniques for generating those infinite loops. :) The biggest success of the week so far has been finally pinpointing the method used by all disk accesses as the last step before jumping to the disk in Linux. In case you were wondering, it is entitled __make_request and it is located in the file: linux/drivers/block/ll_rw_block.c. But I bet that you guessed that one already. That knowledge is important because it allows us to catch all disk requests at one point in the code, right before they actually happen.
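For anyone who has not met for(;;) before, here is a minimal sketch of the two spellings side by side. The function names halvings_for and halvings_while are invented for this example; both loops only end because of the break inside them.

```c
#include <assert.h>

/* Count how many times n can be halved before reaching 1, using for(;;). */
int halvings_for(unsigned n)
{
    int steps = 0;

    assert(n > 0);
    for (;;) {              /* no init, no test, no step: loops forever */
        if (n == 1)
            break;          /* the only way out */
        n /= 2;
        steps++;
    }
    return steps;
}

/* The same loop written with while(1). */
int halvings_while(unsigned n)
{
    int steps = 0;

    assert(n > 0);
    while (1) {
        if (n == 1)
            break;
        n /= 2;
        steps++;
    }
    return steps;
}
```

The two functions behave identically, so the choice between them is a matter of style.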
By this point, I am getting pretty familiar with the Linux code for disk management, which is actually scattered through a number of different directories. And I think I have pretty much (knock on wood) nailed down the existing ECOSystem code for disk management. This has been surprisingly tricky at points because code has been commented out for experimental reasons (a.k.a. it's really important, Heng just never un-commented it after the experimental base case) or the code is simply defunct (a.k.a. methods that just aren't called from anywhere and have been replaced by other techniques). Sometimes, through all this mess it is easy to get confused about what is actually happening and what is supposed to be happening. Actually, what I'm really itching to do is sit down and put some decent comments in this code and get it cleaned up...I know, that's sick and wrong.
A new laptop is coming for me to use to run ECOSystem. It got ordered last week, and we're hoping for it to arrive ASAP. In the meantime, I am working out the quirky details of what ECOSystem is doing now and planning my attack for where I can add and change things in the existing code to accomplish our new goals. I think it's pretty great that I'm getting the chance to tackle something like Linux... good experience for learning my way around specific parts of a big project.
At long last, the arrival of a brand-new IBM Thinkpad has occurred. I don't own a laptop of my own, so it is exciting to open up the box on one and pretend it actually belongs to me. Unfortunately, it needs a little formatting before I can actually do much useful on it in terms of ECOSystem. First step, install RedHat Linux. Which means, first of all, to partition the hard drive. Unfortunately, that means downloading a partition tool from the Internet. Which is difficult when you are in need of a driver for your wireless card in order to access the Internet. A driver that could be easily downloaded from the Internet, of course. :) And we can't seem to find a copy of the CD for RedHat anywhere around here at work, either. So, after discovering all that, Heng offered to take the new laptop home and do the partitioning and installation tonight, especially since it takes a lot of time to run. Tomorrow, we will hopefully be able to finish loading things up on it, and get me a bit acquainted with compiling and testing so that I will be "ready" to get to work on the ECOSystem code by this holiday weekend.
Yesterday, I sent Carla an overview of my plans for implementing this API interface, so that she would know what I was referring to when I said things like "priority", etc (which could actually mean a lot of different things in the operating system context). She had some questions for me today, and we took a closer look at the implementation details for reads. We are concerned with what happens when multiple processes request to read the same bytes. We have discovered that an empty buffer is allocated to the requesting process, and the process' pid is stored in that buffer BEFORE the request is delayed. Since the mapping for this empty buffer has already taken place, the next process that requests that page will see that the data exists in this buffer in the cache, but is locked by another process. So all the subsequent processes will stack up behind the first one. The key here is that once one process has requested to read in a specific buffer from the disk, all other processes will be fooled into thinking that buffer exists in the cache, and thus in my design will not issue a bid. This new understanding may force me to change my implementation for reads a bit, although for the first case we will probably just stick with first reader places bid and pays for access.
Spent some time this morning adding pictures to my web page. That's always fun to do. :)
Almost time to catch a little 4th of July action! Which is great, because right now I'm a little bit confused. Heng ran in with my newly configured laptop, which is to say that it is running both Windows and RedHat Linux, but I'm not entirely clear on how to switch between them yet. So I am supposed to download yet another version of Linux, which is the base for ECOSystem, and compile the kernel, yada, yada, yada. Then something about storing the kernel image in a certain place so that it loads instead of the RedHat kernel. Umm...yeah...exactly. Guess I'll have to figure it out by feel. This type of stuff isn't exactly my strong suit. :)
Ah...the joys of learning linux kernel compilation on the fly. Actually, it's been more like a lesson in linux/unix commands...I am quickly gaining expertise in the command line option -rf which is really quite necessary when you decide to move around unpacked kernel source. After a bit of trial and error, I discovered how to use the exceedingly helpful mount command (this is a useful tool when the ethernet port is not turned on and the wireless card is only working on your Windows partition). Actually, I compiled the 2.4.0-test9 kernel without too much of a problem and replaced the RedHat kernel with this new one. After this practice round, I figured that I was home-free in kernel compilation--no problem, right? However, the modularized ECOSystem source is proving a bit tougher to conquer.
At long last, ECOSystem compilation has been conquered. Unfortunately, I can't lay the glory at my own doorstep. :) Heng had to disable some hardware dependent code before it could compile on my new laptop. (I can't believe Carla couldn't get a laptop with the same battery interface as the old one!) Ha, yeah, that's a joke. Anyway, so Heng took care of that, and then we had to recopy some Makefiles that were not compiling correctly because they had lived on my Windows partition before being transferred over to Linux. Linux does not like MS-DOS format. :)
Anyway, Heng then proceeded to give me the down and dirty on the resources and commands that have already been written into ECOSystem. He also showed me a few tricks for writing new system calls into Linux, which I will need, compiling and installing new modules, communicating between the kernel and a module, and a whole bunch of other stuff that is hopefully intelligibly scribbled in my notebook. It's kind of exciting...I am off to hack the kernel! Ok, enough of that.
Oh, and next Tuesday I am giving a talk to the other CURIOUS research students that are here at Duke for the summer. So I am mixing and matching some new slides with stuff from the talk I already gave. It is amazing how much differently I see some things now that I have read the code and understand what is happening in the actual implementation. Where before I was asking, "Why do we have separate policies for writes and updates?" now I am like, "Well, of course there is a policy for both--they are totally separate things!" It's sort of like learning the language, I guess.
Yesterday, I gave my CURIOUS presentation. It was basically just a revised, trimmed version of my old presentation. Click here to view it. It went fine, even though I was nervous and there was a guy there taking pictures. I am glad to have that over with. :) Yesterday afternoon was super exciting for me because I had my first programming success for the summer. Yeah! It's always a wonderful feeling when something works. I wrote two new simple system calls, readb() and writeb(). They actually don't have very much functionality - they accept an additional integer priority parameter and store it in the process's resource container for use later when bidding for disk access. Then they call the old sys_read() and sys_write() service routines. Still, I had to change a number of files to implement new system calls and it was exciting to see it work!
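The shape of those two calls can be imitated in an ordinary user program. This is a rough sketch, not the kernel code: struct resource_container, current_rc, readb_sketch, writeb_sketch, and last_priority are all hypothetical names, and the plain read() and write() library calls stand in for the sys_read() and sys_write() service routines.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-process resource container. */
struct resource_container {
    int disk_priority;      /* remembered for a later disk bid */
};

static struct resource_container current_rc = { 0 };

/* Imitation of readb(): record the priority, then do an ordinary read. */
ssize_t readb_sketch(int fd, void *buf, size_t count, int priority)
{
    assert(buf != NULL);
    current_rc.disk_priority = priority;
    return read(fd, buf, count);    /* stands in for sys_read() */
}

/* Imitation of writeb(): record the priority, then do an ordinary write. */
ssize_t writeb_sketch(int fd, const void *buf, size_t count, int priority)
{
    assert(buf != NULL);
    current_rc.disk_priority = priority;
    return write(fd, buf, count);   /* stands in for sys_write() */
}

/* Lets a caller see which priority was stored last. */
int last_priority(void)
{
    return current_rc.disk_priority;
}
```

Nothing is done with the priority at the time of the call. It simply waits in the container until some later code needs it to work out a bid.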
Today, I am getting to work on calculating a bid for a readb() system call. The code I wrote yesterday, I added to the Linux kernel. What I am working on today will be part of the hard drive module written by Heng. Everything that does not communicate a lot with the kernel we try to put in modules. It makes it easier to find changes, as well as easier to compile, debug, and test the new code. Every time you change the static kernel, you have to recompile the entire kernel to test it, and that can take a few minutes. :) The first order of business has been to clean up the old hard drive module code a bit. Heng is not exactly big on comments, so I have been putting in comments for global variables and functions, as well as attempting to comment out obsolete code, of which there is plenty. I am enjoying getting to work on this phase of the project, for the moment anyway. :) Emacs has become my editor of choice, even though I am currently being booed by friends who are dedicated VI fans. At school, I generally use a nice text editing program, which of course isn't installed with the Linux on my laptop. But I am finding that Emacs is really pretty handy, once you start to learn the commands.
We had lunch today in Raleigh with the two DMP students at NC State, Sarah Foster and Laura Bode, and their mentors. I had a great time talking with Sarah about my car, a Ford Tempo. She was also a Tempo owner of the past.
Well, it has been a busy week. It is also my last week here at Duke, and I feel like there are a million things that have to be finished up before I leave! In some ways, I have enjoyed doing this coding part of my project. However, it has not been without its moments of extreme frustration. Putting the actual code in hasn't been too difficult, but testing it has proved trying. I am trying to test all of my code, so that I don't leave the project with untested code hanging around, but it is difficult to simulate the conditions where multiple processes are waiting to get enough currentcy to spin up the disk when there are only a couple of test programs running. It's also frustrating to try to find files that aren't already in memory between test runs. It's sort of like playing the guessing game... always trying to change the files I am reading.
I have implemented bidding for read system calls. In order to keep track of bids that have been issued, I created a new list that maintains the process id, resource container id, and bid amount for each read. It is also a good place to handle bids for writes when those are implemented. Unfortunately, I'm not going to have time to tackle bidding/flushing policies for writes. When a readb() system call is issued, the priority given is stored in the resource container. When the process finally gets around to reading in the first buffer from an inactive disk, the priority is used to calculate the percentage of the process' available currentcy that will be allocated to the bid. Then this bidded amount is "frozen" while the process waits to access the disk -- subtracted from the available currentcy. With each read request that occurs, the amount of the bids stored in this bid list is summed. When it becomes greater than the entry price, all the waiting reads are woken up, and their "frozen" bidded currentcy is added back into their available currentcy. This is important in the case where one resource container manages multiple processes, which are all bidding from the same lump of available currentcy.
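The bookkeeping described above can be sketched as a small user-space simulation. Everything here is an assumption made for illustration: the struct layouts, the fixed-size list, the name place_read_bid, and especially the idea that the priority is a whole-number percentage from 0 to 100. The real code lives in the hard drive module and puts waiting processes to sleep instead of returning 0.

```c
#include <assert.h>

#define MAX_BIDS 16

/* Hypothetical resource container: an id plus spendable currentcy. */
struct container {
    int id;
    long available;
};

/* One entry in the bid list: who bid, from which container, how much. */
struct bid {
    int pid;
    struct container *rc;
    long amount;
};

static struct bid bid_list[MAX_BIDS];
static int bid_count = 0;

/* Record a bid for a read and freeze the bid amount.
 * Returns 1 if the summed bids now exceed the entry price (all waiters are
 * released and their frozen currentcy is handed back), 0 if the caller
 * would have to keep waiting. */
int place_read_bid(int pid, struct container *rc, int priority_percent,
                   long entry_price)
{
    long total = 0;
    long amount;
    int i;

    assert(rc != 0);
    assert(bid_count < MAX_BIDS);
    assert(priority_percent >= 0 && priority_percent <= 100);

    /* The priority picks the share of available currentcy to bid. */
    amount = rc->available * priority_percent / 100;
    rc->available -= amount;                /* "freeze" the bid */

    bid_list[bid_count].pid = pid;
    bid_list[bid_count].rc = rc;
    bid_list[bid_count].amount = amount;
    bid_count++;

    for (i = 0; i < bid_count; i++)
        total += bid_list[i].amount;

    if (total <= entry_price)
        return 0;                           /* not enough yet: wait */

    /* Enough currentcy is pledged: unfreeze every bid and clear the list. */
    for (i = 0; i < bid_count; i++)
        bid_list[i].rc->available += bid_list[i].amount;
    bid_count = 0;
    return 1;
}

/* Number of reads still waiting on the bid list. */
int pending_bids(void)
{
    return bid_count;
}
```

Freezing matters most when two processes share a container. With 1000 units available and two bids at 50 percent, the second bid is computed from the 500 that remain after the first freeze, so the pair pledges 750 instead of promising the same 500 twice.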
One of the very frustrating things is that the disk is constantly being spun up by reads and writes that are not part of file reads and writes that we are monitoring. We noticed that at the beginning of each test program, we were getting disk buffers requested before the read began. We realized this is b/c the open() system call often has to read inode/directory info from the disk. So I am attempting to create an openb() system call which will give a priority to the disk reads generated by open(). I also have implemented a default priority to use for buffers that are being read in by processes that have energy containers, but have not issued any bids, to prevent those processes from completely stalling.
My goal for the rest of the week is to clean up the read stuff and try to do a little experimentation and get some results with synthetic benchmarks. Carla wants me to compare currentcy consumed and execution time for test programs that I write when no bidding is implemented and when my read bids are implemented. I have to get with Heng so he can help me get a better feel for how to measure these things. And I have to change the amount of currentcy allocated each epoch. As it is, each process gets so much currentcy every epoch that it never waits b/c it can almost always meet the disk entry price by itself! There is lots to be done by Friday... I hope that I can get there.
After speaking with Carla about what I wanted to try and finish up by the end of this week, I have been pretty busy. I would really like to get some results back! I did implement an energy-aware openb() system call, although it is slightly more cumbersome than a regular open(). It works though, and that's what counts, right? I can't expect myself to know ALL the tricks after two weeks of kernel hacking, I suppose.
Yesterday, Carla and I talked about the type of situation to try to create in order to highlight the possible energy savings of the new read bids. I want to create a situation where there are several processes running at once, and the first ones have their reads delayed until the last process hits a read, and then the reads all sort of happen together, instead of being strung out in time. I am writing a read test program that mimics (hopefully) the action you might expect in a sort of slide show image viewer...where a large file is read in, and then there is a thinktime where nothing happens until the next file is read. I have a large picture file that I am copying numerous times in memory so that the same file is always being read in, just saved under a different name. It's not pretty, I know, but it will hopefully keep me from the problem of finding files to read that aren't already cached in memory.
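The read-then-think pattern of that test program might look roughly like this. It is a sketch under assumptions: the names read_whole_file and slide_show are invented, it uses ordinary stdio reads where the real test would call readb(), and the file names and think time are whatever the caller supplies.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Read a whole file in fixed-size chunks.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_whole_file(const char *path)
{
    char chunk[4096];
    long total = 0;
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (long)got;
    fclose(fp);
    return total;
}

/* Slide-show pattern: read one "image", pause, then read the next.
 * Returns the total bytes read, or -1 if any file could not be opened. */
long slide_show(const char **paths, int count, unsigned think_seconds)
{
    long total = 0;
    int i;

    assert(paths != NULL);
    for (i = 0; i < count; i++) {
        long n = read_whole_file(paths[i]);

        if (n < 0)
            return -1;
        total += n;
        if (i + 1 < count)
            sleep(think_seconds);   /* think time: no disk activity */
    }
    return total;
}
```

Giving each copy of the picture a different name is what lets every pass through the loop ask for a file that has not been read yet in that run.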
As I have been writing my test program, I'm finding it difficult to switch between kernel and user programming. It seemed like it would be simple to access the program's resource container, and find out how much currentcy it spent, right? I couldn't figure out why I was getting these error messages until I finally realized that I was trying to access kernel space data straight from a user program. Duh....that's not allowed. So I had to write a new system call just to access the data that I am trying to measure in the experiment. :( But it sure took me awhile to figure it out.
My last day...phew...wish I could've said that it was a blow-off day, but it certainly wasn't. I spent the entire day frantically working to try to get my experimental test programs to actually provide me with valid data about how much currentcy was being spent on the hard drive and how much time the programs were taking to execute. Much as I would love to say that I slid in just before the deadline, I didn't really end up with much in the way of results. So I actually have no proof that the bidding process that I implemented saves energy at all. This is frustrating to me because I feel like I am sort of coming away empty-handed from the project, without actually "accomplishing" anything. I have to keep reminding myself of how much I learned...both about linux and about myself...that's not exactly going away "empty-handed," per se. As someone who, as a rule, starts early and works slowly on programming projects, though, I understand why it was hard for me to cram my programming work into two weeks. Now back to cleaning and packing....Ahhh!!!