Kathleen Repine
Computer Science and Mathematics
Rose-Hulman Institute of Technology
5500 Wabash Ave.
Terre Haute, IN 47803
email repinekm@rose-hulman.edu
icq # 18174751
aim bengal326

I am a senior computer science and mathematics major at Rose-Hulman Institute of Technology, with interests in theoretical computer science and mathematics.  I chose to participate in the CRA Distributed Mentor Project, during the summer of 2001.  The DMP is a summer research program that matches undergraduate female computer science/ computer engineering students with female mentors.  My mentors were Mary Jean Harrold and Mary Lou Soffa, and I performed my research project at Georgia Institute of Technology.

[Mentors] [Project] [Links/Resources] [Journal] [Final Report]

Mary Jean Harrold
College of Computing/GVU Center
Georgia Institute of Technology
801 Atlantic Drive
Atlanta, GA 30332-0280
voice: (404) 385-0612
fax: (404) 894-9442
email harrold@cc.gatech.edu
research interests: software engineering, specifically automating software development, testing, and maintenance

Mary Lou Soffa
Dept. of Computer Science
University of Pittsburgh
307 Mineral Industries Building
Pittsburgh, PA 15260
voice: (412) 624-8425
fax: (412) 624-5249
email: soffa@cs.pitt.edu
research interests: code optimization, program analysis, and software testing and slicing

Project Description
To derive and implement a method for syntactically differencing a program, P, and a modified version of the program, P', with respect to tasks such as test case selection, impact analysis, and semantic differencing.


  • Maintain a journal, recording experiences, frustrations, etc.
  • Explore JABA, DejaVOO, and existing tools and methods for differencing such as diff, Binkley, Horwitz, or the Reps differencing algorithm
  • Elaborate on a method for differencing
  • Design and implement the differencing
  • Think of ways to use the differencing results for impact analysis, test case selection, and semantic differencing

  • Project Work
    Differencing Algorithm - this is my interpretation of part of the differencing algorithm by J. Laski and W. Szermer described here.
    Hammock Algorithm - this is my modification of the hammock algorithm here.

    Future Work

  • what to do about consecutive mod nodes
  • hammocks on demand
  • integration into the current DejaVOO
  • complexity analysis on algs and optimization
  • name for alg
  • how to propagate across methods
  • added and removed nodes: tests to add or remove from test suite
  • possible experiment: look at differences between versions and try to create profiles for the various types of changes such as perfective, and error correcting

  • Links/Resources
    Aristotle - the research group I am working with
    JABA - Java Architecture for Bytecode Analysis
    DejaVOO - a tool for regression test selection
    Wisconsin Program Slicing Project
    Susan Horwitz
    Dave Binkley
    Thomas Reps

    Important/Useful Papers
    An Incremental Approach to Unit Testing during Maintenance. M.J. Harrold and M.L. Soffa. Proceedings of the Conference on Software Maintenance (CSM'88). (Phoenix, Arizona, October 1988). 362-367.

    Control Flow
    Representation and Analysis of Software. M.J. Harrold and G. Rothermel. (December 1997).

    Regression Test Selection for Java Software. M.J. Harrold, J. Jones, T. Li, D. Liang, A. Orso, M. Pennings, S. Sinha, S. Spoon, and A. Gujarathi, Proceedings of the ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2001).

    A Safe, Efficient Regression Test Selection Technique. G. Rothermel, and M.J. Harrold.  ACM Transactions on Software Engineering and Methodology, Vol. 6. No. 2. (April 1997). 173-210.

    Differencing Algorithms
    A C++ Data Model Supporting Reachability Analysis and Dead Code Detection. Y.-F. Chen, E. Gansner, and E. Koutsofios. IEEE Transactions on Software Engineering Vol. 24. No. 9. (September 1998).  682-693.

    Ciao: A Graphical Navigator for Software and Document Repositories. Y.-F. Chen, G. Fowler, E. Koutsofios, and R. Wallach. International Conference on Software Maintenance. (1995).  66-75.

    Identification of Program Modifications and its Applications in Software Maintenance. J. Laski and W. Szermer. Proceedings of the Conference on Software Maintenance - 1992. (November 1992). 282-290.

    Identifying the semantic and textual differences between two versions of a program. S. Horwitz. Proceedings of the SIGPLAN 90 Conference on Programming Language Design and Implementation. (White Plains, NY, June 1990).

    Interprocedural slicing using dependence graphs. S. Horwitz, T. Reps, and D. Binkley. ACM Transactions on Programming Languages and Systems 12, 1 (January 1990). 26-60.
    [ps] [pdf]

    Using Semantic Differencing to Reduce the Cost of Regression Testing. D. Binkley. Proceedings of the International Conference on Software Maintenance 1992. (Loyola College, Maryland, 1992).
    [ps] [pdf]

    The Program Structure Tree: Computing Control Regions in Linear Time. R. Johnson, D. Pearson, and K. Pingali. ACM SIGPLAN '94 PLDI. (Cornell University, New York, 1994). 171-185.

    Semantic-Preserving Procedure Extraction.  R. Komondoor and S. Horwitz.  Symposium on Principles of Programming Languages. (University of Wisconsin-Madison, 2000). 155-169.

    A Scalable Formal Method for Design and Automatic Checking of User Interfaces. J. Berstel, S.C. Reghizzi, G. Roussel, and P. San Pietro. ICSE 2001, Int. Conference on Software Engineering. (Toronto, Ontario, Canada, May 2001).

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 0

  • start this web page
  • get some background information on my project, i.e. what is syntactic differencing?
  • move into my room at gatech

  • 5/22/01
    Today I began this web page and found some papers that may/may not be useful.

    5/27/01 (Sunday)
    Today I arrived at GA Tech.  It was very stressful, mostly due to Atlanta being a big city (I haven't ever really been in a big city for any extended period of time).  We left from my aunt's house in southern IN around 7 A.M. (I had been up and unable to sleep since 4:30 A.M.).  Most of the drive wasn't bad, but driving in Atlanta was horrific.  Also, getting into my room was a little stressful.  It is Memorial Day weekend and so I couldn't go through the normal procedure of checking into my room.  All that would have been mostly bearable, but my older sister happened to be in Atlanta at the same time, and insisted that we meet and hang out/have dinner.  It made my evening very cramped and it was extremely draining since I was so tired from lack of sleep.  When I got back, I set my computer up, but I couldn't seem to get the network to work, since I hadn't registered my computer with resnet yet.  There were lots of little things like that, which caused me stress because I was unprepared and unorganized for them.  Another example of this is that I don't know who I am suppose to give my rent money to.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 1

  • set up a regular meeting schedule with Mary Jean and Mary Lou
  • get buzz card, badge, and keys
  • set up project web page for aristotle group (use template and link to this page)
  • read this and this for Wednesday
  • read this for Tuesday, at least sections 1-5 (read the entire paper and did questions 1-4)

  • 5/28/01 (Monday, Memorial Day)
    I finally got to talk to my roommate (the only one of the three that was here when I moved in).  Her name is Bilge and she is from Turkey.  She is here getting her Ph.D. in computer engineering.  She also told me another one of my roommates, Rachel, would be arriving later today.
    I started writing down some questions for Mary Jean (we are meeting today at 2 pm), and reading some more of the papers I found.  I keep finding that I need background information from several referenced papers that I can't find.
    Rachel arrived, moved in and we went to Kroger with her mom.  I really only needed milk, but it was a good thing I came along because they needed an extra person to carry bags.  I felt much less stressed after talking to Rachel, since I found out she is just as confused about everything as I am.
    I met with Mary Jean at 2:00.  I didn't have my badge yet, so she had to let me in and escort me around.  First we discussed what I would be doing.  First I need to read about control flow analysis.  Then I should read about differencing algorithms, specifically, this, for a discussion on Wednesday with Mary Jean and Alex Orso.  Alex is a research scientist who I will be working with this summer.  I also need to finish reading, this, for a seminar on Wednesday.  The software engineering dept. has seminars where everyone reads a paper and meets to discuss it.  Once we find a good differencing algorithm, I need to become familiar with JABA.  Perhaps by writing a driver for it and seeing what sorts of graphs it can printout.  Mary Jean and I set up some meeting times: Thursdays at 2 pm group meeting, Mondays at ? individual meeting, Wednesday 11:45 department lunch, and the 5 seminars will generally be Wednesdays at 4 pm.  We also laid out the plan for tomorrow: I will come in around 8 am and get my buzzcard, badge, and keys.
    I read about control flow analysis (the entire paper not just 1-5), but I was a little confused in the postdominator tree section.  I also did the first four problems that followed the paper.

    5/29/01 (Tuesday)
    I got in around 8:00 and Mary Jean took me to go see Susan to  fill out paperwork for me to get a key to the lab, a badge for building access, and a buzz card.  I got my key immediately, but the paper work for the bade and buzz card would take longer.  Then we went down to the lab.  I was the first to arrive, I tried to log into my machine, but the user/passwd tables hadn't been updated so it didn't work.  Mary Jean got Randy to come fix it.  While he was doing that, I logged into another machine and set up my mail (checked my mail at rose since I had been deprived of access for 4 days) and filled out and mailed (to Jim Jones) the template for my project web page.  During this time, the other people who share the lab with me began to arrive.  I met Saurabh, Donglin, and Maikel, three graduate students, and Anup, an undergrad like me.  Then I started reading the papers for the seminar Wednesday and meeting with Alex and Mary Jean.  Before long my computer, question, was ready to go.  I logged in and did some minor tweaking to the settings (I'm sure i'll do more later).  Then Jim another graduate student arrived and I met him.  Then I reread the post dominator tree sections until I understood them.  Around 13:00, I went back to my room and got my passport (I needed it to get my badge).  Then I got the badge paper work from Susan.  I wasn't sure if I needed an escort out of the building, so I waited until Mary Jean finished her meeting, and then I headed off for the ERB to get my badge made.  When I got there they told me the person who makes the badges moved to the CRB, the building I had just come from.  So I had to go back.  Then I got my badge made and went to see Susan about my buzz card, but she wasn't there.  The secretary, Cheryl, told me I should come back in the morning.  Then I went back to my cubicle.  I had mail.  Mary Jean said the seminar for tomorrow has been postponed and Becka, another DMP student I will be rooming with, wanted to know what she needed to bring.  I told her to bring a copy of her passport or birth certificate for the badge paperwork, and I asked if she knew who we were suppose to pay for the housing.  She responded promptly and said she hadn't paid yet either and didn't know.  She also wanted to know whether she would be able to get into her room when she arrived on Sunday and where she would be able to park on campus.  I told her how I had gotten in, but I didn't know about the parking.  She was just as confused as I was and continue to be.  It made me feel a little better, that I could help someone with all the things that had caused me hassles and problems when I first arrived.  I still don't have network access in my room, I miss it.  I started color coding the goals on this web page, pink means the goal isn't completed yet.

    5/30/01 (Wednesday)
    Jennifer, a grad student at Northeastern, visited the lab today because she is interested in using aristotle.  I had a meeting with Mary Jean and Alex today.  First Mary Jean explained dominance, post dominance, and control dependence to me, I will have to demonstrate my understanding of them tomorrow.  Then Alex arrived and we discussed the algorithm, here.  The algorithm is vague in some places and we were tried to understand it, which took longer than expected.  So, we cut the meeting short and went to lunch.  Since it is Wednesday, the who lab goes together.  We went to a delicious mexican restaurant.  When we got back, we continued the meeting.  Eventually, we seemed to grasp the main idea and it was left to me to work through some examples and write the algorithm up a little better.  Mary Jean and I decided to meet again tomorrow on Thursday around 10:30 - 11:00 to go over my revision of the algorithm.  At the end of the meeting Alex had a paper that referenced something called cdiff, that Mary Jean asked me to look into.  What I found was that cdiff is just like unix diff except that it can return the context, the lines surrounding the differences.  I didn't think it sounded like what we are looking for, and that is what I will report tomorrow.  Hopefully, I found the right cdiff.  Tonight Bilge was very nice and let me use her computer to update this page.  Later I worked on writing out the algorithm from, here, in a clearer way, I think it might have turned out more confusing though.

    5/31/01 (Thursday)
    I met with Alex and Mary Jean in the morning to demonstrate an example on the algorithm, I did a very poor job.  Then we talked about what we still needed to clarify for the algorithm and we came up with 1) finding a way to find isomorphism (perhaps jdiff?), and 2) finding out how to merge mod nodes and if it is necessary or important.  Later I thought about ways to find isomorphism.  It would be easier if there was some structure to the graph, so I looked at the postdominator trees for graphs to see if that would help.  Initially, it looked promising, but since it is possible to radically change the graph and not change the PDT, I decided maybe that wasn't the best approach.  I almost forgot the weekly  group meeting at 2:00, but Anup reminded me.  All the undergrads were there, and each of us explained what we were working on.

    6/1/01 (Friday)
    I wrote out the DFS part of the differencing algorithm this morning.  I added a project work section to this page, and from now on I will be putting everything I work on there so it is easier to find.  This afternoon I tried to make the algorithm easier to follow by adding comments and an overview.  Some notes on the DFS part of the differencing algorithm, depending on the order  you do some things, in for loops for example, the results could differ for the same graph.  I put notes in comments in places where I thought this might occur.  I also had an idea about computing and collapsing all the clusters before looking for isomorphisms.  One might not want to compute and collapse all the clusters ahead of time because if the algorithm can't find an isomorphism at some outer level that still has unexpanded clusters in it, then the clusters within those unexpanded clusters never needed to  be computed.  This is just a note for later in case I need to improve efficiency.

    6/2/01 (Saturday)
    SLEEP :)  While reflecting on what I have accomplished this past week, I feel as though I have not done as much as I should have, that I am behind.  However, I can't think of really anything I would have done differently.

    6/3/01 (Sunday)
    Rebecca, the third DMP participant arrived today.  She had a very difficult time.  Not only did she have a very long drive (straight from Chicago), but she also didn't even know what building she was going to be staying in.  She only had the phone number (which didn't have a phone plugged in) for our suite/apt.  So for over two hours she was wandering around campus, not knowing where to go.  Eventually she got a GA Tech directory and found the phone number, and from that got the building and room number.  I feel bad because if I had known I could have told her the building and room number (or at least plugged a phone in), and saved her a lot of trouble.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 2

  • meet Rebecca and Mary Lou
  • weekly individual meeting Monday 2:30
  • lunch on Tuesday with other DMP participants
  • weekly group meeting Thursday 2:00
  • register/set up computer with resnet
  • familiarize myself with JABA/DejaVOO

  • 6/4/01 (Monday)
    I brought Rebecca in to the lab around 8:00, and Mary Jean had letters for us to get our buzz cards.  I also had an email about CDiff, and as I had feared, I had not found the right one.  However, I could not find a copy of a paper describing CDiff online.  Then I started setting up JABA, which wasn't as bad as I thought it would be, but I need a VM >=1.2 and I can't find one on the system.  All I can find is jdk1.1.7.  Around noon Becka and I walked over to the buzzcard center and got our buzz cards, we need them to do laundry and to access some buildings.  I found out that the reason I couldn't find the version of java that I wanted was because that it wasn't installed for linux, but Peter  (he doesn't have his aristotle page up yet so the link is temporarily broken) had installed it in his directory and so I used that.  I had my individual meeting with Mary Jean at 2:30, she found some problems with my DFS., mainly that I wasn't matching the edge labels.  She also told me I would be giving a presentation on clustering, dfs, and the rest of the differencing algorithm to Mary Lou tomorrow.  The meeting ran over, as it always seems to with me.  When I got back to my desk, I redid the dfs part of the algorithm, wrote up an explanation of clustering with examples, wrote up examples for dfs, and wrote up an example of the entire algorithm.  In doing the clustering examples, I discovered some things I thought about clusters were not true, such as a repeat/until loop being able to be collapsed to a cluster.  If this is the case, you can replace a while loop with a repeat /until and since the repeat/until cannot be collapsed, the graphs will never be isomorphic and the entire graph will  be marked as mod.  Since we don't want that to happen, I think we might have to redo clustering.

    6/5/01 (Tuesday)
    I think I killed lots of trees printing the handouts for my presentation.  All the DMP participants and Linda (a girl who is doing an REU with the Computer Engineering dept. here) went out to lunch today.  It was interesting to listen to the mentors talk about their work and tell stories.  After lunch, I had a meeting with Mary Jean and Mary Lou where I described the algorithm I have been working on, and things I have been discovering about it.  It was good to get different opinions on various aspects of the algorithm and to get feedback on ideas I had.  I think discussing things in a group setting (like the meeting) is good because everyone sees things a little differently and so you can get some good ideas or confirm something's you have thought about, but weren't sure on.  The meeting was really long, but it went by really fast and I hadn't realized it had gotten late.  I also got a list of things to work on, which was good because I felt as if I was running out of things to do.

    6/6/01 (Wednesday)
    I started writing up the new clustering algorithm that we discussed in the meeting on Tuesday.  I found a few small problems with it that I was able to fix.  The lab had lunch together since it was wednesday, we walked to a cafe up the street.  I tried to prove that the clustering algorithm worked and I ran into another problem with when you need to add the exit node to the cluster.  However, I think I found a fix.  At 2:00 I had a meeting with Mary Jean and Mary Lou.  I explained what I had discovered about clustering, and it sounds like my new definition of a cluster is actually a hammock.  Saurabh has worked with hammocks before, but he wasn't around so we couldn't talk to him about them.  We will have a meeting tomorrow to try to find an efficient algorithm for computing them.

    6/7/01 (Thursday)
    Last night I looked in my three graph theory books for references to hammocks, but I didn't find anything, just the word hammock in my notes next to a little picture.  This morning, I emailed the resnet people about getting registered without a prism account, and they can do a special add and register me, so I should have access in my room when I get back.  I also found a paper with a good concise definition of a hammock on page 3.  From the definition of the hammock in the paper, the definition of a cluster that I want to use is a hammock, whose entry node is not a 1-node.  On page 4 alg1 step i, there is an outline of a possible way to find all the hammocks.  However, it doesn't look very efficient.  Also, from about the middle of the paper on is proofs, most involving hammocks, which may or may not be useful.  I couldn't find much more on hammocks.  Also, I am beginning to feel like I am running out of things to do again.  Hopefully, the meeting today will give me more to do.  I found another paper  with an algorithm for computing hammocks, although it calls them SESE (single entry single exit regions) and I think they are slightly different.  The paper also says, on various benchmarks, the nesting of SESE regions are not very deep, <=6 levels  97% of the time.  I don't know how relevant that information is to what I am working on.

    6/8/01 (Friday)
    It was pouring when I walked to the lab today, I got completely soaked.  The same thing happened when I went to lunch... not on the way there, but half way through the way back it just started pouring.  I got even more soaked than before.  So I think I am going to go home early today to put some dry cloths on.

    6/9/01 (Saturday)
    I went to the library to get some reading material, but although I have my buzz card, I am not in the system, and hence cannot check any books out.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 3

  • start designing/implementing the differencing algorithm
  • get documentation for the postdominator tree stuff - will be with JABA Docs
  • write up Horwitz hammock alg. in consistent format
  • read Rothermel paper
  • finish/work on hammock correctness proof
  • read DejaVOO docs
  • research CIAO? and CDiff again?
  • write a driver for JABA that prints out all the various sorts of graphs
  • think about how to integrate the differencing algorithm into DejaVOO (including semantic parts?)
  • make list of future work (things we are temporarily putting off)

  • 6/11/01 (Monday)
    I updated Mary Jean on my progress at my individual meeting.  We don't have an efficient algorithm for finding the hammocks, but I think we are going to try to use the one outlined here.  Then I started writing a driver for JABA that prints out all the various sorts of graphs.  I think I have a driver that works, but I am having difficulty using the dotty software to see the graphs.  Once I got the software to work, I could see the graphs looked ok.  I will try some more examples tomorrow.

    6/12/01 (Tuesday)
    Today I read and reread the DejaVOO documentation.  I made a list of questions I had about it, then read the docs again trying to find answers to the questions.  It was very successful, I found answers to almost all the questions.  I looked up CIAO.  I found this.  I don't know if it is the same CIAO, but there wasn't very much detailed information on the site about how it worked, or how they did differencing.  However there were these two papers, here and here.  Then I brainstormed for awhile on possible ways of integrating the differencing algorithm into DejaVOO and things that may possibly cause problems.

    6/13/01 (Wednesday)
    Today I thought more about integrating differencing into DejaVOO.  Then I went to the seminar.  Donglin gave a 25 min talk on some stuff he and Maikel have been working on.  Then everyone gave him feedback.

    6/14/01 (Thursday)
    I modified the hammock algorithm to obtain the minimum hammock for any decision node, and worked on a proof of its correctness.  I had a meeting with Mary Jean and Alex where I talked about what I had been working on and thinking about.  I now need to make a list of future work.  Mary Jean also gave me another paper to read that might be helpful in integrating differencing into DejaVOO.

    6/15/01 (Friday)
    I made a list of future work and added it to this web page.  I looked into an issue raised at the meeting yesterday.  During the hammock computation, do you have to add nodes  up to the exit of the graph.  The answer is yes, because  otherwise you could have an entering arc coming from a node below (dominated by) the exit of the hammock.  Then I read part of the Rothermel paper.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 4

  • individual meeting 2:30 Monday
  • group meeting 2:00 Thursday
  • Mary Jean is at a conference all week
  • finish/touch up correctness proof for hammock alg
  • rewrite/check DejaVOO docs
  • look at JDiff
  • what are the CFG representation changes, that need to be made from dejaVOO?

  • 6/18/01 (Monday)
    The individual meeting today raised many possible problems, such as how to handle method calls, exceptions, and labeless edges.  I also got more work to do: compare the DejaVOO docs to the actual code and make any necessary updates, write up the design of the algorithm, and look into JDiff.  On the design of the algorithm, it looks like it might be a good idea to run my alg. then run DejaVOO on the interiors of the nonexpandable mod nodes (What if the mod node is just a single node?).  Then I worked on comparing the DejaVOO docs to the actual code and writing them up better (this should also help me understand it better).

    6/19/01 (Tuesday)
    I looked at JDiff, but I didn't see anything I thought would be useful.  I worked on redoing the DejaVOO documentation (comparing it to the actual code to make sure everything was up to date).  I feel like I understand the algorithm much better now.  I worked on writing up some ideas for the design and some possible problems with the ideas.  I also worked some more on the hammock correctness proof.

    6/20/01 (Wednesday)
    I finished the hammock correctness proof.  Then I worked on writing up an algorithm for running DejaVOO during the dfs part.

    6/21/01 (Thursday)
    When I got in today the network was down, so I couldn't even log in.  However, I found some stuff to read until it was fixed so it wasn't that bad.    Then I had a design meeting with Alex and Saurabh.  We worked out an approach we think will work for method calls and exceptions.  We also came up with a possible hierarchy of classes/interfaces and their purposes.  It started raining at lunch, but we waiting 'til it let up a bit before we headed back.  So I barely got wet at all.  I read up on some implementation details of JABA for the second design meeting later today.  At the undergrad group meeting I reported my progress.  Then I met with Alex, Jim, and Saurabh to talk about design some more.

    6/22/01 (Friday)
    I started implementing what I could of the dfs algorithm.  In the process of doing that I came up with some design questions about how to store node matchings.  Is it possible for me to use node attributes to store matchings, or should I use a hashtable, or maybe if each node has a unique ID, I can use that and an array.  I also didn't know if my algorithm was suppose to be its own class or a method in the methodHammockGraph class.  In coding up the algorithm, I found that part of the algorithm will never be reached and so I removed that part.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 5

  • lunch on Tuesday with DMP participants?
  • meeting Monday 2:30
  • meeting Thursday 2:00
  • do I have a dir on cvs where I can put stuff?
  • what can I do with node attributes?
  • is node number a unique ID? it is complicated...
  • finish reading coding guidelines
  • look at if/then in byte code
  • write up incoming edge alg
  • high level design (list methods, put pseudo code alg in as documentation)

  • 6/25/01 (Monday)
    Today I started reading the coding guidelines that I found in the JABA docs (I am on page 41 of  73).  At my individual meeting today, I explained what happened last week to Mary Jean (since she was at a conference).  The main accomplishments were the place holder node representation of hammocks (an implementation detail), handling exceptions through super exception nodes and propagating uncaught exceptional exits, and inlining just the entry and exit of a method call.  Things I need to do from the meeting:  ask Jim and/or Saurabh what I can do with node attributes and if the node number is a unique ID for the node, get cvs directory information from Alex, look into how if/then stmts. are done in the byte code, write up/prove my incoming edges algorithm, and write up a high level design for my algs (list methods and put in pseudo code alg as documentation).

    6/26/01 (Tuesday)
    I finished reading the coding guidelines, though I will probably have to go back and reference it a couple times.  I also wrote out the finding nodes with in edges algorithm in the hammock algorithm.  I haven't done any complexity analysis or proofs on it but it seems much better than the other approaches especially if each node has a unique ID number.  Today was originally planned as a dmp outing lunch, but Amy (Rachel's mentor) is out of town, and Mary Jean hasn't told us anything.  So I don't think anyone remembered.  However, Becka, Rachel and I decided to go to Einstein Bagels for lunch today and have our own pseudo dmp outing.  After lunch, I started writing up an interface of all the public methods in the differencing algorithm, including documentation.  I also got an email from Alex about setting up cvs, so I did that.

    6/27/01 (Wednesday)
    I finished writing up the high level design / interface for MethodHammockGraph and SyntacticalDifferences.  Mary Jean stopped by the lab, and she showed me some visualization stuff Jim is working on, which might be neat to do something similar for visualizing my work.  Mary Jean also mentioned possibly having me do the JIG and/or DIG graphs.  cvs only has a description of my project and all the other undergrad projects.  I'm not sure if I am suppose to put my algorithm documentation there, or my source code for stuff.  It would be nice to put my alg docs there because I have several versions and it is getting to be a pain to keep track of them all myself.  We are going to the mexican place again for lunch this week (I think we went there two weeks ago).  I think we go there so often because it is one of the few places that can accommodate all of us (11 I think), so it is good that I like their food.  This afternoon I am going to try to motivate myself to look at if/then stmts in byte code, we'll have to see how it goes though.

    6/28/01 (Thursday)
    I looked at if/then stmts in bytecode.  It looked pretty straightforward to me (it didn't seem to flip the condition when the stmt change from a one-armed-if to an if/then/else).  However, the actual jump instr. for the if was the opposite of what the one in the code was (for optimization reasons).  Instead of a group meeting today, we are going to have individual meetings.  Today Mary Jean sent out email about the final report.  It appears the final report is just our webpages?? that doesn't sound right.  She also included some information about fellowships, that should be useful.  At the meeting today I found out the design we are going to use for all the graphs.  It was consistent with what I already had.  I still need to find out how I should do the hammock place holder nodes.  Should I have a new HammockNode class and should that have a HammockNodeAttribute?  I would also like to have the GraphAttribute interface (an interface my HammockGraph will implement), but I don't think it is implemented yet.  So basically I need to talk to Jim and Saurabh about jaba implementation details.  At the meeting Mary Jean also gave me stuff I can work on when I am just waiting on other people.  Some of the things were gather subjects (programs with multiple versions and possibly tests), think of types of experiments or uses for the differencing, think of ways the differences can be visualized, think of things my diff finds that diff doesn't especially in respect to impact analysis, and to think of ways to enrich the graph representation such that I get more out of it.  At lunch I talked to Becka about the report (I forgot to ask Mary Jean about it during the meeting).  She said last year she just had to write a few paragraphs and that it was nothing big, like I thought.

    6/29/01 (Friday)
    I started working on the implementation today.  In doing so, I came up with a few more questions.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 6

  • meeting Monday 2:30
  • implement hammock graph, differencing
  • meeting Thursday 3:30 (individual)

  • 7/2/01 (Monday)
     Today I thought about possible ways to use the differencing in impact analysis.  Some ideas include using the number of perfectly matched nodes to get an idea of the magnitude of the impact, checking that there is some documentation for each changed area, computing what nodes can/cannot be reached from a changed node (through the true/false path propagation), classifying the changes based on the type such as perfective if most of the changes are added nodes, and using the added and removed nodes to generate new tests or remove tests from a suite.  At my individual meeting we talked about some of the ideas I came up with and we agreed that adding a method state node between the entry and exit of a method call, that says whether the method has been changed, will be useful.  The idea of measuring the goodness of the matching or the magnitude of difference went over well too.  However, the true/false path idea sounds unsafe if there are accesses to global variables.  I also forgot to mention that the differences could be used to check whether or not all the changes had been documented.  I got some of my design questions answered as well.  We decided a HammockNode should have its own class and should have "other" as a field.  Also, load in GraphAttribute should throw a IllegalArgumentException, which is based on checking the graph for single entry, single exit, (or some other property, depending on the graph) not on the type of graph.  I learned that the GraphAttribute class is already up on CVS as well as the modified CFG and ACFG classes.  I misunderstood the set up before I should not use the getAttributeOfType method in Graph to get whether a HammockGraph has been constructed, instead I should look through the attributes manually in the HammockGraph load method.

    7/3/01 (Tuesday)
    I worked on more implementation.

    7/5/01 (Thursday)
    More implementation.  Set up cvs for jaba.

    7/6/01 (Friday)
    I am having difficulty getting the server package to compile, but my code compiles fine without it, so it is just annoying.  I am also waiting on the dominator graph stuff, before I can test my stuff and check it in to cvs.  I got a preview of the postdominator/dominator stuff from Peter.  It isn't at all what I expected, it might take me a bit to change some parts of my code.  I talked to Mary Jean briefly this morning, we are going to have a meeting on Tuesday to discuss grad school stuff, including fellowship applications.  Also, I am very confused about the whole cvs thing, am I suppose to have a copy of the whole tree? and where am I suppose to update from?

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 7

  • individual meeting Monday 2:30
  • grad school meeting with Mary Jean Tuesday 3:00
  • group meeting Thursday 2:00 (changed to individual meeting at 10:00)
  • finish implementation
  • test

  • 7/9/01 (Monday)
    I worked on more implementation today, finishing the hammock graph implementation (so that it compiles, I haven't tested it at all yet).

    7/10/01 (Tuesday)
    Today I worked on testing the hammock graph implementation.  I also had a meeting with Mary Jean about grad school, which was very informative.  She is going to email some people for me, so that I can find out more about computational biology.

    7/11/01 (Wednesday)
    I have almost completed the differencing implementation, and I tested the Hammock implementation some more.

    7/12/01 (Thursday)
    I worked on more testing today.  I created a multiheader example (something I was having a difficult time coming up with earlier).  An I finished all of the differencing implementation, except the matching and undoing the matching on a failure.  I talked to Mary Jean about a better way to implement that, but she thought what I was doing was ok.  Everyone is going to be leaving soon.  Mary Jean is going to be gone next week, Saurabh is leaving Friday and will be gone for the rest of the time I am here, and Alex and Anup are both leaving in one week and won't be back until after I leave.

    7/13/01 (Friday)
    I worked on testing/debugging.  I also set up a meeting with a theory professor here (through Mary Jean) to talk about grad school.  Our meeting with be Monday at 10:00.  I have both the hammock algorithm and the differencing algorithm implemented, but I feel that both are very buggy.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 8

  • grad school meeting with Professor Venkateswaren 10:00 Monday
  • individual meeting 3:00 Monday
  • read impact analysis papers(be ready to discuss and think of ideas of how we can makes ours better)
  • think of how to do program level differencing
  • visualization: hammock boxes, matched node lines
  • group meeting Thursday 2:00
  • testing (try a large program)
  • JIG/DIG implementation?
  • read mutation testing paper, for change cases

  • 7/16/01 (Monday)
    I had a very informative meeting with Prof. Venkateswaren.  We discussed grad school, in general, what to look for, and how to make an impression.  Mary Jean is at a conference in Paris this week, so my individual meeting will be with Alex.  At the meeting, Alex gave me six papers on impact analysis to read for Thursday, and I need to think about whole program differencing for Thursday as well.  I don't think I can implement the JIG/DIG yet because I think I am waiting on Saurabh to finish up some of the exception stuff that I would need.   I also need to work on some visualization stuff: boxes around hammocks, and matching of differences.  I also need to do some more testing of the hammock algorithm and the differencing algorithm (I haven't run either of them on a very large program).

    7/17/01 (Tuesday)
    I read four of the six papers Alex gave me yesterday.  Three didn't seem all that relevant (except maybe for comparison), but the fourth had some good ideas on stuff we could do with the differences we get from running the differencing algorithm.

    7/18/01 (Wednesday)
    I read the final two papers for Thursday, and I started thinking about ways to do whole program differencing, using the statement differencing I already have.  Overall, the papers seemed to deal with differencing whole classes and/or packages and finding what other classes/packages were affected.  This is what I need to figure out how to do next so that I can match up methods so that that I can run the differencing algorithm on them.

    7/19/01 (Thursday)
    I thought of more ideas about differencing at the program level.  Then I went to my meeting with Alex.  We discussed the papers, my ideas, and what I need to work on next.  I wrote up what we discussed on program differencing here.  Things left to do are DIG/JIG implementation, visualization (hammock boxes, and matching), impact analysis (back propagation, think more about forward propagation), program level differencing, and testing.  Alex also sent me another paper to read.

    7/20/01 (Friday)
    Today I skimmed the mutation paper, but I didn't find the cases I was looking for and expecting.  Then I implemented the hammock boxes and the matched node lines (to show the matchings between two differenced graphs).

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 9

  • individual meeting, Monday 2:30
  • individual meeting moved to Tuesday at 3:00
  • brown bag lunch, Thursday noon, CRB 381
  • group meeting, Thursday 2:00
  • testing (try a large program) (try diff with different depths)
  • make docs better
  • JIG/DIG implementation
  • add impact analysis stuff (i.e. back propagation)
  • more visualization stuff??
  • write up schedule for the last few days of how to tie up loose ends
  • meeting Friday 3:00
  • meeting Sunday 3:00

  • 7/23/01 (Monday)
    I started implementing the DIG.  There are lots of things I don't  know where the information is, so I don't think I am going to be very productive time wise.

    7/24/01 (Tuesday)
    I worked more on the DIG implementation, mostly running into questions that I can't answer though.  I got some of my questions answered at my individual meeting today.

    7/25/01 (Wednesday)
    I found a strange bug with the dotty output from the differencing.  I can't find what is causing it.

    7/26/01 (Thursday)
    I had a meeting with Donglin that cleared up virtually all of the immediate DIG questions.  The brown bag lunch was interesting, I got to see some other women's views of the computer science field.

    7/27/01 (Friday)
    I worked mainly on program level differencing.

    7/29/01 (Sunday)
    I finished a quick implementation of program level differencing, met with Mary Jean, and started testing some of the program level differencing.  I am still having problems with dotty, it doesn't like to do the subgraphs that I tell it to, and I still haven't found the reason for that other bug I found earlier in the week.  At my meeting Mary Jean and I reversed an earlier decision to combine the JIG and the DIG, because we don't think it is possible.  So now I basically need to start the DIG from scratch.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    week 10

  • tie up loose ends
  • write outline of what needs done after I leave
  • write docs for everything (or check that they are up to date)
  • check everything into cvs
  • give Tongyu my drivers and test cases
  • test more
  • write up final report

  • 7/30/01 (Monday)
    Today I tested the hammock graph on a large program, with great results.  I also checked most of my stuff into cvs (hammocks and differencing, but not the dig), and mailed my test cases to Tongyu.

    7/31/01 (Tuesday)
    I made a very big to do list of everything that could still be done to the dig, program differencing, and hammocks.  Then I went through and worked on the items I could accomplish quickly.   I also implemented the dig, with the exception of the exception part. Then I updated the list, and started preparing for the presentation I am going to give tomorrow.  I also checked in the DIG, or what I have for it, and my new versions of hammocks and differencing.  I also checked all my docs into cvs.

    End of week 10
    I tried to bring my project to closure.  It was very difficult since there is still so much I could do on it.  I would definitely stay longer, if I wasn't being kicked out of the dorms (it is time for the regular gatech students to move back in).  I also wrote up my final report, which is here.

    week [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

    Kathleen Repine
    Wednesday August 1 11:04:11 EST 2001