Week 1 |
I went through all the related (and VERY INTERESTING) to our research
material (some of the articles are :"Determining computational
complexity from characteristic 'phase transitions'"by Monasson,Zecchina,KirkpatrickSelman,and
Troyansky; "Computer Science can't get no satisfaction" by
Brian Hayes; Experimental Results on the crossover point in random 3-SAT"
by Crawford and Auton; etc) and did the overview and deep analyse of
the written code.
We decided to start from the testing of our program by applying 2-SAT
problems
(first of all it is just easier,and second - the results 2-SAT problems
are well known -- so we can see if our program is correct or if it has
to be modified. I designed some simple input file(test) that looked
like this:
30 50 10000
40 50 10000
50 50 10000
60 50 10000
70 50 10000
The first column -- M(number of clauses), second-N(number of variables),
third-# of randomly generated formulae. Alpha (M/N)= 0.6, 0.8, 1.0,
1.2, 1.4
respectively. I got some message("Memory fault") --
and I fixed a couple of mistakes( it was some mismatch with "malloc"
and 'free")- I modified a code a little. Let's see what to do next...
|
Week 2 |
I looked at the "formula analyzing" part closely and found
out that it had to be some more testings : if let's say we have (A)
and (~A) both in single clauses
(exactly as they appear here), we have to return 0(false) immidiatelly!
- the
formula can not ever be satisfied .This is like if we have ((True) AND
(False)) formula--
it is always False! So, I wrote a function flaCheck -- that accepts
a Formula itself, a chosen variable(literal), and its(literal) sign.
In a program
we have two "arrays" posatoms and negatoms. Posatoms stores
the information
in which clauses a certain(each) variable appears positively,negatoms-in
which
clauses it appears negatively. I am picking an array "opposite"
to the sign of the chosen variable from the single clause--to check
if we have another single clause
that holds our variable with the opposite sign - for example negated
(if "data"
== 1 means the clause has only one literal)
it returns 1 if the match was found
//lit -is a chosen variable to process, sign - indicates in which form
it is:
//regular or negated (for example A or ~A)
int flaCheck(FORMULA *f,int lit,int sign){
....INTLIST temp;
...ILITER ili;
...int clauseind;
...int done=0;
...if(!sign)
......temp=ilCopy(f->posatoms[lit]);
...else
.....temp=ilCopy(f->negatoms[lit]);
...ili=ilIterNew(temp);
...while(ilIterMore(ili) && !done){
.............clauseind=ilIterNext(&ili);
.............if(f->clausedata[clauseind]->data == 1)
................done=1;
...}
...return done;
}
|
Week 3 |
I inserted the previous function in the code. The results are much better
now! They still need some correction, but they are much better! We also
tried to
get the same effect (as from the previous function) by making some changes
to the
existing code -- we decided that we should not have decremented the number
of
"clauses" (total) if we had some empty clause in the formula!
Because we could have got a situation like -- the total number of clauses
is 0 (so we have to return 1- the formula is satisfied), but we have some
empty clauses among "deleted" ones (the formula should be False!).
Therefore, we did not decrement the total number of clauses when some
empty clause appeared, and got the same results!
So, now we can use eather way in our program! But we still need to
make some corrections to the code. I "created" a very simple
formula
with M=7 (number of clauses) and N=5(number of variables), and went
through the program checking every single step of the formula
processing. I concentrated on the "analyzing" stage. And now
I am sure --
it is absolutely correct.
|
Week 4 |
This week I checked all the pointers in our program - to make sure
they point to the "right places". I printed out all the values
of data of the memory locations our pointers point to. I spent a couple
of days on a fish diet doing that. Finally I found that they were fine.
I decided that in one place we should have
sent a formula by reference (not by value) -- even though it worked ,in
this
situation more efficient way is to pass the formula by reference.
I also modified the randClauseMake function - it should be 2(or 3 in the
future)
unique variables in each clause. so. I checked for it -- and if we have
the same variable appear in a clause more than ones, we will generate
another one
till 2 (or 3 or any number) of them are different(unique). I ran the code
again with this new feature - the results were almost the same( a little
different by 0. 02), but I think this is absolutely necessary if we want
our code to be more efficient and
our results more clear and "true". so,we made sure that each
clause contained three unique variables. now le's move to the next stage
of a program...
|
Week 5 |
I tried other random number generators (just to make sure the results
do not depend on the choice of a random genarator). I tries a good old
random number generator functions: rand() and srand().This is the example
of a code...
At the beginning of the program I have:
...... srand((unsigned)time(NULL));
And then in randClauseMake:
for (j=0;j<clausesize;j++) {
.....rc[i].lits[j]=(rand()%numvars);
.....rc[i].signs[j]=(rand()%2);
}
We have to use srand() in addition to rand() to make sure that our random
number generator is working properly (otherwise it will generate the
same
sequence of numbers -> same formulae)
It gave me almost the same results (the values of "satisfiability"
were a little less
than the original results --(by ~0.03).
We also decided to use 1000 variables instead of 50 in our tests. This
is the example of one of our test files:
600 1000 1000
800 1000 1000
1000 1000 1000
1200 1000 1000
1400 1000 1000
So ,that the values of alpha (M/N) are 0.6,0.8,1.0,1.2,1.4. And we got
the absolutly right results! Yes! Now we can move to 3-SAT problems.
|
Week 6 |
I ran 3-SAT problems. Everything worked fine. So,now when we are sure
that our code is absolutly correct, we can move to some other kinds
of problems
for example: determining the "backbone" size. The backbone
of a satisfiable
SAT instsance is the set of entailed literals (a literal l is entailed
by a satisfiable
SAT instance C iff C AND (NOT l) is unsatisfiable). And so the backbone
size
is the number of entailed literals.
Therefore, in determining some backbone size first we should consider
formulas that are satisfiable. Then go through each literal l and add
(NOT l)
to the formula. If the formula is unsatisfiable - l is in the backbone.
If the formula is satisfiable - add (l) to the formula, if it is unsatisfiable
~l is in the backbone;
etc(repeat for all literals). These are the fragments of the code I
wrote:
int test( FORMULA* ftemp){
...........FORMULA *f1,*f2;
...........int size=0,y;
...........for(y=0;y<ftemp->numvars;y++){
....................f1=flaCopy(*ftemp);
....................f2=flaCopy(*ftemp);
....................addvariable(f1,y,0); //add (~y) to the formula
....................if(flaDP(f1,0) == 0) //if formula is false, y is
in the backbone
....................................size++;
....................else{
....................................addvariable(f2,y,1);//if not- add
(y) to the formula
....................................if(flaDP(f2,0)==0)
...........................................size++;
......................}
...........return size;
}
void addvariable(FORMULA *f,int variable,int sign){
............LITNODE *node;
............f->clausedata[f->maxclauses -1]->data=1;
............node=(LITNODE*)malloc(sizeof(LITNODE));
............node->data=variable;
............node->sign=sign;
............node->next=NULL;
............f->clausedata[f->maxclauses -1]->next=node;
............if(sign)
..................ilAdd(f->posatoms[variable],f->maxclauses -1);
.............else
..................ilAdd(f->negatoms[variable],f->maxclauses-1);
.}
|
Week 7 |
We ran the code with "backbone size determination" added
-- it gave some
nice reasonable results. But we need to test it on the larger number
of instances.
In the real world with he regular situation it can take a day to run
our code
on such input. So,we need something to increase the speed of execution.
We need MULTITHREADING. Good idea! I am working on the "louis"
that has 14 processors. Therefore we can do 14 threads and uor job
will be done 14 times faster! First we have to "break" the
code -- create/separate
a function to be threaded. Then ,create threads and let it go! This
is the part of the the code I wrote in order to perform our task(cerating
14 threads + let them wait for each other):
#define PROCESSORS 14
:
:
:
for(v=0;v<PROCESSORS;v++){
.......if(pthread_create(&thread_id[v],NULL,aThreadFunction,(void*)dummy)>0)
...............printf("error in pthread_create %d\n",v);
}
for(z=0;z<PROCESSORS;z++)
.....if(pthread_join(thread_id[z],(void**)pp_status)>0)
..............printf("error in pthresd_join %d\n",z);
I tested aThreadFunction on several inputs.
Unfortunately, my "separated" function works perfectly with
"singlethreading",
and gives me some "Memory fault" in "multithreading".
So I need to look at the code closer and find out what's wrong...
|
Week 8 |
Ok,I fixed those "multithreading memory fault"(it took a
lot of time, but I really enjoyed searching throught the code and applying
some debugging features
almost to every line!)+ I needed to use mutex_lock and mutex_unlock
for every global variable I used.I was using some glabal variables in
order to store data
from each thread execution.
These are the fragments of the code that represents the ussage of
glabal variables+mutex_lock() and mutex_unlock() functions:
#include <pthread.h>
#define _REENTRANT
#define PROCESSORS 14
pthread_mutex_t my_lock = PTHREAD_MUTEX_INITIALIZER;
int * array;
.........
main(){
.......
.....array=(int*)malloc(PROCESSORS)*sizeof(int));
}
......
void* aThreadFunction (void*){
........
.....pthread_mutex_lock(&my_lock);
.....array[size]++;
.....pthread_mutex_unlock(&my_lock);
.......
}
But the execution of our code is still not fast enough. So,probably
we need to modify our aThreadFunction so that it takes care of some
number of instances(instead of just 1 instance as it has been so far)...
|
Week 9 |
As I mentioned before, we need to modify the aThreadFunction
so that the threads can handle more than just 1 instance( so far each
thread has been taking care of just one instance processing) -- and
now I decided to let each one do some number of instasnces (NUMINST
/ PROCESSORS). And assign
NUMINST %PROCESSORS to the first thread (the first thread starts first
and can do more job than others). +I created a special array of FORMULAs
that is glabal as a source of instances for our threads. Analyzing the
execution process,we reached the conclusion -- probably, we need to
eleminate
the number of mutex_locks and mutex_unlocks because they slow down
the execution of the code a lot. So I introduced some glabal arrays
each position of which was responsible for a specific thread.
This is the fragment of the modified code: (function-- aThreadFunction)
void* aThreadFunction (void *t){
.........FORMULA *f;
........int M=( (DATASTRUCT*) t)->M;
........int N=( (DATASTRUCT*) t)->N;
........int instance=( (DATASTRUCT*) t)->numbinst;
........FORMULA *ftemp;
........int count=0;
........int size=0,i;
........int setsize=0;
........int setcount=0;
........int *id=(int*)malloc(sizeof(int));
.......*id=pthread_self();
.......for(i=0;i<instance;i++){
..................f=flaCopy (farray[special++]);
..................ftemp=flaCopyWithBackbone(*f);
..................count=flaDP(f,0);
..................setcount+=count;
..................if(count){
.....................size=test(ftemp);
.....................if(size){
.........................setsize+=size;
....................}
.................}
.................flaFree(f);
................flaFree(ftemp);
......}
......threadarray[pp++]=setsize;
......countarray[tt++]=setcount;
.....*id=pthread_self();
......return (void*)id;
}
|
Week 10 |
This is the last week of my summer reaaserch. And we decided to do
the summary of all our results. I was suggested and taught the "gnuplot"
techniques. For example,I used a series of commands to "create"
one of the graphs: (in Unix)
#> gnuplot
#>set data style linespoints
#>set title "3-SAT n=20 numinst=1000"
#>set xlabel "alpha"
#>set ylabel "satisfiability"
#>plot "o1"
can also use:
#>set terminal postscript
#>set output "o1.ps"
#>plot "o1"
(or just #> replot)
So, I was running all the combinations of input (with certain values
of alpha)--
and then putting everything in forms of graphs. You can see all the
graphs
in my final report.
|