Computing Research Association Distributed Mentor Project Anastasia's Personal Page Anastasia DMP Home  

Project Journal

Using Model Checking with Symbolic Execution for the
Verification of Data-Dependent Properties of MPI-Based
Parallel Scientific Software

by Anastasia Mironova

Week 1

June 7-13 2004

 

This week's events were primarily associated with meeting the people at the LASER lab,
getting to know their research interests and projects, and figuring out what my project would be.

In order to become familiar with their work in more detail I've spent a significant amount of time reading the papers and
references given to me by my mentor Lori, some of those were the following:

J.M. Cobleigh, L.A. Clarke, L.J. Osterweil, "FLAVERS: A Finite State Verification Technique
for Software Systems"; IBM Systems Journal, Vol 41, No 1. 2002

R.L. Smith, G.S. Avrunin, L.A. Clarke, L.J. Osterweil, "PROPEL: An Approach Supporting
Property Elucidation"

A. Wise, A.G. Cass, B.S. Lerner, E.K. McCall, L.J. Osterweil, S.M. Sutton Jr., "Using
Little-JIL to Coordinate Agents in Software Engineering"

M.S. Raunak, L.J. Osterweil, A. Deshmukh, "Using Process Language To Analyze Auction
Processes Through Simulations Synthesis Project Report"

S.F. Siegel, G.S. Avrunin, "Verification of MPI-Based Software for Scientific Computation"

Once I had become familiar with the current projects at LASER, Lori very kindly offered
me the opportunity to choose what I wanted to work on and my choice was to work with
Dr. Siegel, Steve, on modeling MPI-based software systems. This choice is due to my interest
in pursuing a Doctorate degree in the field of scientific computing and numerical analysis
in the near future.

Based on this choice, I was then given some more reading material about verification, MPI, and
one of the most popular verification tools, SPIN, to the careful exploration of which I devoted
the rest of that week.

Week 2

June 14-20 2004

One of the main decisions at this point was whether I would work on modeling the code that already
exists or would the alternative of writing my own present a better perspective. In order to get some
references and advice on this issue Steve contacted his brother who works at the University of Chicago
and the Argonne National Laboratory to see if he'd have any recommendations on any existing
scientific computing packages that would be feasible for me to go through during the course of my
research here at UMass. Some of the references we were given were:

BLAS(Basic Linear Algebra Subprograms), - high quality "building block" routines for performing basic
		vector and matrix operations;
CHOMBO, - provides a set of tools for implementing finite difference methods for the solution of partial
		differential equations on block-structured adaptively refined rectangular grids;
FLASH, - a state-of-the-art simulator code for solving nuclear astrophysical problems related to
		exploding stars;
PFFT - a component of FLASH for computing Fast Fourier Transform;
PARAMESH, - a stand-alone product enabling quick and easy modifications of existing meshes from engineering
		design tools
PETSC(Portable Extensible Toolkit for Scientific Computation), - is a suite of data structures and
		routines for the scalable (parallel) solution of scientific applications modeled
		by partial differential equations.

At the mean time I was getting myself familiar with MPI, C, and SPIN. As an excerise I began working on
a PROMELA model for the program that was written as part of the last assignment in my CS 331 Programming
Languages class which I took this spring. This program was intended to compute the product of any two square
matricies in parallel, it is written in C and utilizes MPI. The goal here was to construct two models: one
for checking for possible deadlock conditions and the second for verifying that the result computed by
this program in parallel would indeed result in a mathematically correct expression representing the product
of the two given matricies for any possible execution.

Week 3

June 21-27 2004

Most of this week I spent working on the sample model of the matrix multiplication program's model, which I
called matrixprod.prom.
The model for checking whether the program could ever deadlocked was the first one I constructed.
Then I moved on to verification of correctness of the computed result. This actually turned out to be a
rather non-trivial task in SPIN because even though it does support some C, PROMELA is not a fully functional
programming language. The solution to this was to construct expressions represented by a set of tree-like
custom "typedef" structures, which would then be transmitted over the channels.

On Friday I was finally able to successfully run my PROMELA model in simulation mode of SPIN, however,
when I then attempted to use SPIN in its verification mode, I encountered an error indicating improper use
of channels, which, as Steve and I later discovered, turned out to be a bug in SPIN. Here the issue was the
following:
Since, of course, in the case of dealing with matrices it was most intuitive to use arrays as the primary
message type, that was exaclty what I chose to use but, unfortunately, SPIN does not provide for this
capability in a straight forward way, forcing me to embed simple arrays into some "typedef" structures and then
use these types of objects as the primary message format for the data channels. When a message of this type
is received into some structure of the same type, naturally, one expects that all the entries of the arrays
should be copied over into the receiving structure but SPIN's implementation of this routine had an
overly strong conditional statements which were originally intended to prevent programmers from receiving
several fields of one message into the same variable but also disabled some of the valid behavior like
in this case.

So, before I had any of this knowledge, the first thing I did when the problem was encoutered was
constructed a minimal PROMELA program that gives an error and after talking to Steve about this he decided to
email the author of the code,Gerard Holtzmann, which we did and it turned out that he was aware of this bug and
that the new version of SPIN, which he was, luckily for me, planning to release during the coming weekend,
would have this fixed.

Week 4

June 28 - July 4 2004

Indeed, the new release of the SPIN model checker came out on June 29th and Steve installed in on the machine I
was working on. However, when I tried to run this new version of SPIN in verification mode using the same code
I had problems with before, exactly the same problem occurred. We ended up asking Dr. Holzmann for advise again
and the solution he suggested involved changing the course code and reinstalling the software. This fixed the
problem immediately and ever since then I have been running my models on this particular installation.

Also, this week was the week when we finally decided the direction I would pursue with this project. In particular,
we have decided that instead of studying one particular system, such as the ones described above, and applying model
checking techniques to verify properties about their specific modules, I would pursue exploring the idea of
attempting to verify computational correctness of parallel routines. The computational correctness problem seemed
like a good alternative because this was not among well-explored areas and also because of the fact that the SPIN
runs that I was able to do on the example that I had constructed as an exercise for multiplication of two square
matrices were capable of handling some non-trivial cases (in terms of the matrix dimensions).
 
So, having decided upon the direction, this week we also started thinking about the specific example that I would
consider. There were many discussions about this between Steve and I as well as with Lori and George. They were all
very helpful. The problem that we have finally picked is very common in the field of scientific computing and the
procedure is called the Gauss-Jordan Elimination. The Gauss-Jordan algorithm is used to reduce any matrix to what
is formally known as reduced row-echelon form. One of the reasons we have selected this algorithm is because it is
a very common technique with quite a few applications and it could be interesting enough for people in the field of
scientific computing too. Another reason we chose this problem is because as it turns out, it is actually significantly
harder to perform verification since, as opposed to the matrix multiplication example, this method does not have a
closed form of solution, that is the solution is dependent upon the values of the entries in the matrix being reduced.
We would have to explore different execution paths for all possible values of these entries. The other example we were
considering seriously was Jacobi iteration; however, after careful examination we concluded that it did not differ in
this way very much from the matrix multiplication problem since in this case computations are independent of data and
all we would be analyzing is again the communication aspect primarily.

Once we decided on the particular problem, we had to then think about how to approach the verification and expressing
the property in SPIN. Here the solution that was adopted was to construct both parallel and sequential models in PROMELA
and then verify that the two are equivalent. More specifically, we would implement the sequential model to explore every
possible branch depending upon what the data could be at each decision point and then for every execution path of the
sequential program we would run the parallel code and compare the final expressions that the parallel and sequential
version produce. This way the parallel version would be equivalent to the sequential and hence, under the assumption
that the final expressions produced by the sequential code are correct, we will be able to conclude that the parallel
version also produces correct results for all possible executions.

The last thing I was working on that week was exploring a possibility of using a theorem prover for handing and
the symbolic expressions that we would construct in PROMELA directly or more likely as c_code fragments in PROMELA.
There were quite a few tools that I had looked into using, however, none of them seemed to provide a nice way of
making calls to them from within the PROMELA models directly, so I decided to focus my efforts on implementation without
the convenience of having a theorem prover.

Week 5

July 5-11 2004

This week I spent quite some time researching the different implementations of the Gauss-Jordan Elimination algorithm, both
sequential as well as parallel. The sequential agorithm that I finally decided to adopt for implementation is described in
Howard Anton," Elementary Linear Algebra", Wiley, 1977, Section 1.2. It turned out to be significantly more challening to
find if not a ready implementaion but just a verbal description of the parallel version of this algorthm. The one that Steve
and I decided to adopt as the base for our own implementation is described in Wilkinson B., Allen M., "Parallel Programming.
Techniques and Applications Using Networked Workstations and Parallel Computers", Prentice Hall, 2nd edition 11.3;

This week I also did some more researh on theorem provers.

Week 6

July 12-18 2004

This week was spent entirely on the implementation of the sequential and parallel versions of the Gauss-Jordan Elimination
algorithm in C using MPI.

The changes that I had to make to the original version of the Gauss-Jordan Elimination agorithm that is described in the original
references are the following:
In order to optimize performance in the parallel version I incorporated the "backward subtitution" routine into the preceding step
as an attempt to improve performance. Consequently, this change had to be made in the sequential implementation as well to
keep them as much alike as possible, since the idea was to make the sequential code alike to the parallel.

Week 7

July 19-25 2004

This week both the sequential as well as the parallel implementations of the Gauss-Jordan Elimination algorithm in C using MPI were
completed. The examples were tested and believed to be correct.

At this point these C implementations became ready to be modeled in PROMELA and on Friday I was already able to start making progress toward putting together an implementation for the PROMELA model of the sequential C program.

Week 8

July 26 - August 1 2004

The major accomplishment of this week was that after a very long time I finally was able to make SPIN do what I need it to do and
the PROMELA implementation of the sequential version of the C code was completed. I did a few test runs and assumed that it was correct.

Also, this week I moved on to working on the PROMELA model of the parallel C implementation.

Concurrently this week I was also working on putting together a presentation for the laboratory meeting of the analysis group, which I was
scheduled to present on the 2nd of August, the Monday of the 9th week.

Week 9

August 2-8 2004

On Monday I presented my project at the LASER analysis group meeting. I actually ended up spending some extra time on this
during the weekend, it turned out to be more difficult than I anticipated but overall the presentation went fine and I got a lot of very
valuable feedback.

For the most part of this week I was still working on the PROMELA model for the parallel C implementation of the Gauss-Jordan Elimination
algorithm. By this time I was quite honestly rather unhappy with SPIN and it was at this point that I was really wishing that SPIN were not
as restrictive, especially in terms of the possible to implement data structures. I actually ended up redoing this model quite a number of
times just because I would only discover that something is not possible in SPIN after I had already finished the implementation
of a particular step.

Week 10

August 9-13 2004

Most of this week I, again, spent on fighting with SPIN and trying to finish the PROMELA model of the parallel implementation of my chosen
example. However, another document of greated importance had to be put together at that stage because that was my last week of mentorship,
so I had to shift the focus of my efforts away from PROMELA and work on putting together a presentation in Power Point that would
serve as the base document for presenting my work and which I would later use a refernce in creating other documents and reports
on this project. Lori, George, and Steve all worked with me very closely on this presentation and I was very happy to have it
done on the last day of my work.