Untitled Document

The Journal

Week 1	I went through all the related (and VERY INTERESTING) to our research material (some of the articles are :"Determining computational complexity from characteristic 'phase transitions'"by Monasson,Zecchina,KirkpatrickSelman,and Troyansky; "Computer Science can't get no satisfaction" by Brian Hayes; Experimental Results on the crossover point in random 3-SAT" by Crawford and Auton; etc) and did the overview and deep analyse of the written code. We decided to start from the testing of our program by applying 2-SAT problems (first of all it is just easier,and second - the results 2-SAT problems are well known -- so we can see if our program is correct or if it has to be modified. I designed some simple input file(test) that looked like this: 30 50 10000 40 50 10000 50 50 10000 60 50 10000 70 50 10000 The first column -- M(number of clauses), second-N(number of variables), third-# of randomly generated formulae. Alpha (M/N)= 0.6, 0.8, 1.0, 1.2, 1.4 respectively. I got some message("Memory fault") -- and I fixed a couple of mistakes( it was some mismatch with "malloc" and 'free")- I modified a code a little. Let's see what to do next...
Week 2	I looked at the "formula analyzing" part closely and found out that it had to be some more testings : if let's say we have (A) and (~A) both in single clauses (exactly as they appear here), we have to return 0(false) immidiatelly! - the formula can not ever be satisfied .This is like if we have ((True) AND (False)) formula-- it is always False! So, I wrote a function flaCheck -- that accepts a Formula itself, a chosen variable(literal), and its(literal) sign. In a program we have two "arrays" posatoms and negatoms. Posatoms stores the information in which clauses a certain(each) variable appears positively,negatoms-in which clauses it appears negatively. I am picking an array "opposite" to the sign of the chosen variable from the single clause--to check if we have another single clause that holds our variable with the opposite sign - for example negated (if "data" == 1 means the clause has only one literal) it returns 1 if the match was found //lit -is a chosen variable to process, sign - indicates in which form it is: //regular or negated (for example A or ~A) int flaCheck(FORMULA *f,int lit,int sign){ ....INTLIST temp; ...ILITER ili; ...int clauseind; ...int done=0; ...if(!sign) ......temp=ilCopy(f->posatoms[lit]); ...else .....temp=ilCopy(f->negatoms[lit]); ...ili=ilIterNew(temp); ...while(ilIterMore(ili) && !done){ .............clauseind=ilIterNext(&ili); .............if(f->clausedata[clauseind]->data == 1) ................done=1; ...} ...return done; }
Week 3	I inserted the previous function in the code. The results are much better now! They still need some correction, but they are much better! We also tried to get the same effect (as from the previous function) by making some changes to the existing code -- we decided that we should not have decremented the number of "clauses" (total) if we had some empty clause in the formula! Because we could have got a situation like -- the total number of clauses is 0 (so we have to return 1- the formula is satisfied), but we have some empty clauses among "deleted" ones (the formula should be False!). Therefore, we did not decrement the total number of clauses when some empty clause appeared, and got the same results! So, now we can use eather way in our program! But we still need to make some corrections to the code. I "created" a very simple formula with M=7 (number of clauses) and N=5(number of variables), and went through the program checking every single step of the formula processing. I concentrated on the "analyzing" stage. And now I am sure -- it is absolutely correct.
Week 4	This week I checked all the pointers in our program - to make sure they point to the "right places". I printed out all the values of data of the memory locations our pointers point to. I spent a couple of days on a fish diet doing that. Finally I found that they were fine. I decided that in one place we should have sent a formula by reference (not by value) -- even though it worked ,in this situation more efficient way is to pass the formula by reference. I also modified the randClauseMake function - it should be 2(or 3 in the future) unique variables in each clause. so. I checked for it -- and if we have the same variable appear in a clause more than ones, we will generate another one till 2 (or 3 or any number) of them are different(unique). I ran the code again with this new feature - the results were almost the same( a little different by 0. 02), but I think this is absolutely necessary if we want our code to be more efficient and our results more clear and "true". so,we made sure that each clause contained three unique variables. now le's move to the next stage of a program...
Week 5	I tried other random number generators (just to make sure the results do not depend on the choice of a random genarator). I tries a good old random number generator functions: rand() and srand().This is the example of a code... At the beginning of the program I have: ...... srand((unsigned)time(NULL)); And then in randClauseMake: for (j=0;j<clausesize;j++) { .....rc[i].lits[j]=(rand()%numvars); .....rc[i].signs[j]=(rand()%2); } We have to use srand() in addition to rand() to make sure that our random number generator is working properly (otherwise it will generate the same sequence of numbers -> same formulae) It gave me almost the same results (the values of "satisfiability" were a little less than the original results --(by ~0.03). We also decided to use 1000 variables instead of 50 in our tests. This is the example of one of our test files: 600 1000 1000 800 1000 1000 1000 1000 1000 1200 1000 1000 1400 1000 1000 So ,that the values of alpha (M/N) are 0.6,0.8,1.0,1.2,1.4. And we got the absolutly right results! Yes! Now we can move to 3-SAT problems.
Week 6	I ran 3-SAT problems. Everything worked fine. So,now when we are sure that our code is absolutly correct, we can move to some other kinds of problems for example: determining the "backbone" size. The backbone of a satisfiable SAT instsance is the set of entailed literals (a literal l is entailed by a satisfiable SAT instance C iff C AND (NOT l) is unsatisfiable). And so the backbone size is the number of entailed literals. Therefore, in determining some backbone size first we should consider formulas that are satisfiable. Then go through each literal l and add (NOT l) to the formula. If the formula is unsatisfiable - l is in the backbone. If the formula is satisfiable - add (l) to the formula, if it is unsatisfiable ~l is in the backbone; etc(repeat for all literals). These are the fragments of the code I wrote: int test( FORMULA* ftemp){ ...........FORMULA f1,f2; ...........int size=0,y; ...........for(y=0;y<ftemp->numvars;y++){ ....................f1=flaCopy(ftemp); ....................f2=flaCopy(ftemp); ....................addvariable(f1,y,0); //add (~y) to the formula ....................if(flaDP(f1,0) == 0) //if formula is false, y is in the backbone ....................................size++; ....................else{ ....................................addvariable(f2,y,1);//if not- add (y) to the formula ....................................if(flaDP(f2,0)==0) ...........................................size++; ......................} ...........return size; } void addvariable(FORMULA f,int variable,int sign){ ............LITNODE node; ............f->clausedata[f->maxclauses -1]->data=1; ............node=(LITNODE*)malloc(sizeof(LITNODE)); ............node->data=variable; ............node->sign=sign; ............node->next=NULL; ............f->clausedata[f->maxclauses -1]->next=node; ............if(sign) ..................ilAdd(f->posatoms[variable],f->maxclauses -1); .............else ..................ilAdd(f->negatoms[variable],f->maxclauses-1); .}
Week 7	We ran the code with "backbone size determination" added -- it gave some nice reasonable results. But we need to test it on the larger number of instances. In the real world with he regular situation it can take a day to run our code on such input. So,we need something to increase the speed of execution. We need MULTITHREADING. Good idea! I am working on the "louis" that has 14 processors. Therefore we can do 14 threads and uor job will be done 14 times faster! First we have to "break" the code -- create/separate a function to be threaded. Then ,create threads and let it go! This is the part of the the code I wrote in order to perform our task(cerating 14 threads + let them wait for each other): #define PROCESSORS 14 : : : for(v=0;v<PROCESSORS;v++){ .......if(pthread_create(&thread_id[v],NULL,aThreadFunction,(void)dummy)>0) ...............printf("error in pthread_create %d\n",v); } for(z=0;z<PROCESSORS;z++) .....if(pthread_join(thread_id[z],(void*)pp_status)>0) ..............printf("error in pthresd_join %d\n",z); I tested aThreadFunction on several inputs. Unfortunately, my "separated" function works perfectly with "singlethreading", and gives me some "Memory fault" in "multithreading". So I need to look at the code closer and find out what's wrong...
Week 8	Ok,I fixed those "multithreading memory fault"(it took a lot of time, but I really enjoyed searching throught the code and applying some debugging features almost to every line!)+ I needed to use mutex_lock and mutex_unlock for every global variable I used.I was using some glabal variables in order to store data from each thread execution. These are the fragments of the code that represents the ussage of glabal variables+mutex_lock() and mutex_unlock() functions: #include <pthread.h> #define _REENTRANT #define PROCESSORS 14 pthread_mutex_t my_lock = PTHREAD_MUTEX_INITIALIZER; int * array; ......... main(){ ....... .....array=(int)malloc(PROCESSORS)sizeof(int)); } ...... void* aThreadFunction (void*){ ........ .....pthread_mutex_lock(&my_lock); .....array[size]++; .....pthread_mutex_unlock(&my_lock); ....... } But the execution of our code is still not fast enough. So,probably we need to modify our aThreadFunction so that it takes care of some number of instances(instead of just 1 instance as it has been so far)...
Week 9	As I mentioned before, we need to modify the aThreadFunction so that the threads can handle more than just 1 instance( so far each thread has been taking care of just one instance processing) -- and now I decided to let each one do some number of instasnces (NUMINST / PROCESSORS). And assign NUMINST %PROCESSORS to the first thread (the first thread starts first and can do more job than others). +I created a special array of FORMULAs that is glabal as a source of instances for our threads. Analyzing the execution process,we reached the conclusion -- probably, we need to eleminate the number of mutex_locks and mutex_unlocks because they slow down the execution of the code a lot. So I introduced some glabal arrays each position of which was responsible for a specific thread. This is the fragment of the modified code: (function-- aThreadFunction) void* aThreadFunction (void t){ .........FORMULA f; ........int M=( (DATASTRUCT) t)->M; ........int N=( (DATASTRUCT) t)->N; ........int instance=( (DATASTRUCT) t)->numbinst; ........FORMULA ftemp; ........int count=0; ........int size=0,i; ........int setsize=0; ........int setcount=0; ........int id=(int)malloc(sizeof(int)); .......id=pthread_self(); .......for(i=0;i<instance;i++){ ..................f=flaCopy (farray[special++]); ..................ftemp=flaCopyWithBackbone(f); ..................count=flaDP(f,0); ..................setcount+=count; ..................if(count){ .....................size=test(ftemp); .....................if(size){ .........................setsize+=size; ....................} .................} .................flaFree(f); ................flaFree(ftemp); ......} ......threadarray[pp++]=setsize; ......countarray[tt++]=setcount; .....id=pthread_self(); ......return (void)id; }
Week 10	This is the last week of my summer reaaserch. And we decided to do the summary of all our results. I was suggested and taught the "gnuplot" techniques. For example,I used a series of commands to "create" one of the graphs: (in Unix) #> gnuplot #>set data style linespoints #>set title "3-SAT n=20 numinst=1000" #>set xlabel "alpha" #>set ylabel "satisfiability" #>plot "o1" can also use: #>set terminal postscript #>set output "o1.ps" #>plot "o1" (or just #> replot) So, I was running all the combinations of input (with certain values of alpha)-- and then putting everything in forms of graphs. You can see all the graphs in my final report.