The Journal

Week 1

I went through all the related (and VERY INTERESTING) to our research
material (some of the articles are :"Determining computational complexity from characteristic 'phase transitions'"by Monasson,Zecchina,KirkpatrickSelman,and
Troyansky; "Computer Science can't get no satisfaction" by Brian Hayes; Experimental Results on the crossover point in random 3-SAT" by Crawford and Auton; etc) and did the overview and deep analyse of the written code.
We decided to start from the testing of our program by applying 2-SAT problems
(first of all it is just easier,and second - the results 2-SAT problems are well known -- so we can see if our program is correct or if it has to be modified. I designed some simple input file(test) that looked like this:
30 50 10000
40 50 10000
50 50 10000
60 50 10000
70 50 10000
The first column -- M(number of clauses), second-N(number of variables), third-# of randomly generated formulae. Alpha (M/N)= 0.6, 0.8, 1.0, 1.2, 1.4
respectively.
I got some message("Memory fault") -- and I fixed a couple of mistakes( it was some mismatch with "malloc" and 'free")- I modified a code a little. Let's see what to do next...

Week 2


I looked at the "formula analyzing" part closely and found out that it had to be some more testings : if let's say we have (A) and (~A) both in single clauses
(exactly as they appear here), we have to return 0(false) immidiatelly! - the
formula can not ever be satisfied .This is like if we have ((True) AND (False)) formula--
it is always False! So, I wrote a function flaCheck -- that accepts
a Formula itself, a chosen variable(literal), and its(literal) sign. In a program
we have two "arrays" posatoms and negatoms. Posatoms stores the information
in which clauses a certain(each) variable appears positively,negatoms-in which
clauses it appears negatively. I am picking an array "opposite" to the sign of the chosen variable from the single clause--to check if we have another single clause
that holds our variable with the opposite sign - for example negated (if "data"
== 1 means the clause has only one literal)

it returns 1 if the match was found
//lit -is a chosen variable to process, sign - indicates in which form it is:
//regular or negated (for example A or ~A)

int flaCheck(FORMULA *f,int lit,int sign){
....INTLIST temp;
...ILITER ili;
...int clauseind;
...int done=0;

...if(!sign)
......temp=ilCopy(f->posatoms[lit]);
...else
.....temp=ilCopy(f->negatoms[lit]);

...ili=ilIterNew(temp);
...while(ilIterMore(ili) && !done){
.............clauseind=ilIterNext(&ili);
.............if(f->clausedata[clauseind]->data == 1)
................done=1;
...}
...return done;

}

Week 3

I inserted the previous function in the code. The results are much better now! They still need some correction, but they are much better! We also tried to
get the same effect (as from the previous function) by making some changes to the
existing code -- we decided that we should not have decremented the number of
"clauses" (total) if we had some empty clause in the formula! Because we could have got a situation like -- the total number of clauses is 0 (so we have to return 1- the formula is satisfied), but we have some empty clauses among "deleted" ones (the formula should be False!). Therefore, we did not decrement the total number of clauses when some empty clause appeared, and got the same results!
So, now we can use eather way in our program! But we still need to
make some corrections to the code. I "created" a very simple formula
with M=7 (number of clauses) and N=5(number of variables), and went
through the program checking every single step of the formula
processing. I concentrated on the "analyzing" stage. And now I am sure --
it is absolutely correct.
Week 4
This week I checked all the pointers in our program - to make sure
they point to the "right places". I printed out all the values of data of the memory locations our pointers point to. I spent a couple of days on a fish diet doing that. Finally I found that they were fine. I decided that in one place we should have
sent a formula by reference (not by value) -- even though it worked ,in this
situation more efficient way is to pass the formula by reference.
I also modified the randClauseMake function - it should be 2(or 3 in the future)
unique variables in each clause. so. I checked for it -- and if we have the same variable appear in a clause more than ones, we will generate another one
till 2 (or 3 or any number) of them are different(unique). I ran the code again with this new feature - the results were almost the same( a little different by 0. 02), but I think this is absolutely necessary if we want our code to be more efficient and
our results more clear and "true". so,we made sure that each clause contained three unique variables. now le's move to the next stage of a program...
Week 5


I tried other random number generators (just to make sure the results do not depend on the choice of a random genarator). I tries a good old random number generator functions: rand() and srand().This is the example of a code...

At the beginning of the program I have:
...... srand((unsigned)time(NULL));

And then in randClauseMake:
for (j=0;j<clausesize;j++) {
.....rc[i].lits[j]=(rand()%numvars);
.....rc[i].signs[j]=(rand()%2);
}

We have to use srand() in addition to rand() to make sure that our random number generator is working properly (otherwise it will generate the same
sequence of numbers -> same formulae)

It gave me almost the same results (the values of "satisfiability" were a little less
than the original results --(by ~0.03).

We also decided to use 1000 variables instead of 50 in our tests. This is the example of one of our test files:
600 1000 1000
800 1000 1000
1000 1000 1000
1200 1000 1000
1400 1000 1000
So ,that the values of alpha (M/N) are 0.6,0.8,1.0,1.2,1.4. And we got the absolutly right results! Yes! Now we can move to 3-SAT problems.

Week 6

I ran 3-SAT problems. Everything worked fine. So,now when we are sure that our code is absolutly correct, we can move to some other kinds of problems
for example: determining the "backbone" size. The backbone of a satisfiable
SAT instsance is the set of entailed literals (a literal l is entailed by a satisfiable
SAT instance C iff C AND (NOT l) is unsatisfiable). And so the backbone size
is the number of entailed literals.
Therefore, in determining some backbone size first we should consider
formulas that are satisfiable. Then go through each literal l and add (NOT l)
to the formula. If the formula is unsatisfiable - l is in the backbone. If the formula is satisfiable - add (l) to the formula, if it is unsatisfiable ~l is in the backbone;
etc(repeat for all literals). These are the fragments of the code I wrote:


int test( FORMULA* ftemp){
...........FORMULA *f1,*f2;
...........int size=0,y;
...........for(y=0;y<ftemp->numvars;y++){
....................f1=flaCopy(*ftemp);
....................f2=flaCopy(*ftemp);
....................addvariable(f1,y,0); //add (~y) to the formula

....................if(flaDP(f1,0) == 0) //if formula is false, y is in the backbone
....................................size++;
....................else{
....................................addvariable(f2,y,1);//if not- add (y) to the formula
....................................if(flaDP(f2,0)==0)
...........................................size++;
......................}

...........return size;
}


void addvariable(FORMULA *f,int variable,int sign){
............LITNODE *node;

............f->clausedata[f->maxclauses -1]->data=1;

............node=(LITNODE*)malloc(sizeof(LITNODE));
............node->data=variable;
............node->sign=sign;
............node->next=NULL;

............f->clausedata[f->maxclauses -1]->next=node;
............if(sign)
..................ilAdd(f->posatoms[variable],f->maxclauses -1);
.............else
..................ilAdd(f->negatoms[variable],f->maxclauses-1);

.}

Week 7

We ran the code with "backbone size determination" added -- it gave some
nice reasonable results. But we need to test it on the larger number of instances.
In the real world with he regular situation it can take a day to run our code
on such input. So,we need something to increase the speed of execution.
We need MULTITHREADING. Good idea! I am working on the "louis"
that has 14 processors. Therefore we can do 14 threads and uor job
will be done 14 times faster! First we have to "break" the code -- create/separate
a function to be threaded. Then ,create threads and let it go! This is the part of the the code I wrote in order to perform our task(cerating 14 threads + let them wait for each other):

#define PROCESSORS 14
:
:
:
for(v=0;v<PROCESSORS;v++){
.......if(pthread_create(&thread_id[v],NULL,aThreadFunction,(void*)dummy)>0)
...............printf("error in pthread_create %d\n",v);
}

for(z=0;z<PROCESSORS;z++)
.....if(pthread_join(thread_id[z],(void**)pp_status)>0)
..............printf("error in pthresd_join %d\n",z);

I tested aThreadFunction on several inputs.
Unfortunately, my "separated" function works perfectly with "singlethreading",
and gives me some "Memory fault" in "multithreading". So I need to look at the code closer and find out what's wrong...

Week 8

Ok,I fixed those "multithreading memory fault"(it took a lot of time, but I really enjoyed searching throught the code and applying some debugging features
almost to every line!)+ I needed to use mutex_lock and mutex_unlock for every global variable I used.I was using some glabal variables in order to store data
from each thread execution.

These are the fragments of the code that represents the ussage of
glabal variables+mutex_lock() and mutex_unlock() functions:

#include <pthread.h>
#define _REENTRANT
#define PROCESSORS 14
pthread_mutex_t my_lock = PTHREAD_MUTEX_INITIALIZER;
int * array;
.........
main(){
.......
.....array=(int*)malloc(PROCESSORS)*sizeof(int));
}
......
void* aThreadFunction (void*){
........
.....pthread_mutex_lock(&my_lock);
.....array[size]++;
.....pthread_mutex_unlock(&my_lock);
.......
}

But the execution of our code is still not fast enough. So,probably we need to modify our aThreadFunction so that it takes care of some number of instances(instead of just 1 instance as it has been so far)...

Week 9

As I mentioned before, we need to modify the aThreadFunction
so that the threads can handle more than just 1 instance( so far each thread has been taking care of just one instance processing) -- and now I decided to let each one do some number of instasnces (NUMINST / PROCESSORS). And assign
NUMINST %PROCESSORS to the first thread (the first thread starts first
and can do more job than others). +I created a special array of FORMULAs
that is glabal as a source of instances for our threads. Analyzing the execution process,we reached the conclusion -- probably, we need to eleminate
the number of mutex_locks and mutex_unlocks because they slow down
the execution of the code a lot. So I introduced some glabal arrays each position of which was responsible for a specific thread.
This is the fragment of the modified code: (function-- aThreadFunction)

void* aThreadFunction (void *t){
.........FORMULA *f;

........int M=( (DATASTRUCT*) t)->M;
........int N=( (DATASTRUCT*) t)->N;
........int instance=( (DATASTRUCT*) t)->numbinst;
........FORMULA *ftemp;
........int count=0;
........int size=0,i;
........int setsize=0;
........int setcount=0;

........int *id=(int*)malloc(sizeof(int));
.......*id=pthread_self();

.......for(i=0;i<instance;i++){
..................f=flaCopy (farray[special++]);

..................ftemp=flaCopyWithBackbone(*f);
..................count=flaDP(f,0);
..................setcount+=count;

..................if(count){
.....................size=test(ftemp);
.....................if(size){
.........................setsize+=size;
....................}
.................}
.................flaFree(f);
................flaFree(ftemp);
......}
......threadarray[pp++]=setsize;
......countarray[tt++]=setcount;
.....*id=pthread_self();
......return (void*)id;

}

Week 10

This is the last week of my summer reaaserch. And we decided to do
the summary of all our results. I was suggested and taught the "gnuplot"
techniques. For example,I used a series of commands to "create"
one of the graphs: (in Unix)
#> gnuplot
#>set data style linespoints
#>set title "3-SAT n=20 numinst=1000"
#>set xlabel "alpha"
#>set ylabel "satisfiability"
#>plot "o1"

can also use:
#>set terminal postscript
#>set output "o1.ps"
#>plot "o1"
(or just #> replot)

So, I was running all the combinations of input (with certain values of alpha)--
and then putting everything in forms of graphs. You can see all the graphs
in my final report.