sneuburg@its.brooklyn.cuny.edu

CRA DMP Experience

Summer 2003

Progress Report

**Week 1**

To begin my research project, I did a lot of reading on random number
generators. Once I understood the usefulness and functionality of random
number generators, I began reading about statistical tests for them. I
wrote a simple Equidistribution Test and ran it first on Java's built-in
random number generator and then on the shuffled nested Weyl generator,
the focus of my research project. I was also introduced to gnuplot, a
useful tool for plotting the results of any test.
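
A minimal sketch of an equidistribution test of the kind described above, run against `java.util.Random`. The class name, the fixed seed, and the choice of 10 bins are illustrative assumptions, not details taken from the project. It counts how many uniform draws land in each equal-width bin and prints two columns that a plotting tool can read.

```java
import java.util.Random;

class EquidistributionSketch {
    // Count how many of n uniform draws in [0, 1) fall into each of k equal-width bins.
    static int[] binCounts(Random rng, int n, int k) {
        int[] counts = new int[k];
        for (int i = 0; i < n; i++) {
            // nextDouble() is strictly below 1.0, so the index is always in 0..k-1.
            int bin = (int) (rng.nextDouble() * k);
            counts[bin]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Seed and sizes are arbitrary choices for the illustration.
        int[] counts = binCounts(new Random(42L), 100_000, 10);
        for (int b = 0; b < counts.length; b++) {
            System.out.println(b + " " + counts[b]); // columns: bin index, count
        }
    }
}
```

If the output is redirected to a file such as `counts.dat` (a hypothetical name), gnuplot can draw it with `plot 'counts.dat' using 1:2 with boxes`. For a good generator every bin should hold roughly n/k values.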

**Week 2**

My work this week began a lot more smoothly since I now have accounts on
the computer systems and I am getting accustomed to programming in the
Unix environment, which was a new experience for me. This week I
programmed more significant tests on the random number generator, the
chi-square test and the KS test. The Kolmogorov-Smirnov test was based on
the writings of D. E. Knuth. I attended my first weekly meeting with other
faculty members and students actively involved in computer science
research this summer. It was great to see and hear what other students are
doing. I hope to gain lots of useful knowledge from these meetings.
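
The two statistics mentioned above can be sketched as follows. This is an illustration under assumptions, not the project's code: the class and method names are hypothetical, the chi-square version assumes equally likely bins, and the KS version tests against the uniform distribution on [0, 1), for which F(x) = x. The KS statistics follow the K+ and K- form given by Knuth.

```java
import java.util.Arrays;

class GoodnessOfFitSketch {
    // Chi-square statistic: sum over bins of (observed - expected)^2 / expected.
    static double chiSquare(int[] observed, double expectedPerBin) {
        double v = 0.0;
        for (int o : observed) {
            double d = o - expectedPerBin;
            v += d * d / expectedPerBin;
        }
        return v;
    }

    // Returns {K+, K-} for a sample tested against the uniform distribution on [0, 1).
    // K+ = sqrt(n) * max_j (j/n - F(x_j)),  K- = sqrt(n) * max_j (F(x_j) - (j-1)/n),
    // where x_1 <= ... <= x_n is the sorted sample.
    static double[] ksUniform(double[] sample) {
        double[] x = sample.clone();
        Arrays.sort(x);
        int n = x.length;
        double kPlus = 0.0, kMinus = 0.0;
        for (int j = 1; j <= n; j++) {
            double f = x[j - 1]; // F(x) = x for the uniform distribution
            kPlus = Math.max(kPlus, (double) j / n - f);
            kMinus = Math.max(kMinus, f - (double) (j - 1) / n);
        }
        double scale = Math.sqrt(n);
        return new double[] { scale * kPlus, scale * kMinus };
    }
}
```

Both statistics are then compared against their theoretical distributions: values that are too large, or suspiciously small, are grounds for doubting the generator.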

**Week 3**

This week's programming work began with attempting to perform the KS
test on the results of several chi-square tests on different sets of
numbers. This algorithm is quite complex and involves other functions,
such as the gamma function. Instead of writing the gamma function on my
own and "reinventing the wheel", I downloaded a copy of the gamma
function written in Java from the Internet and revised it to suit my
purposes. The test did not produce the expected results, so I ended up
rewriting the gamma function in case it was the culprit. This effort
soon proved futile. Next, I programmed an alternate
algorithm for the same test. Once these basic statistical tests
worked properly, I was finally able to begin programming the tests on
parallel number generation.
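
To show where the gamma function enters, here is a hedged sketch of the chi-square cumulative distribution, which is what turns a chi-square statistic into a value in [0, 1] that a KS test can then examine. It is not the downloaded code mentioned above. The names are hypothetical, the gamma function is only evaluated at half-integer arguments (all a chi-square test needs), and the incomplete gamma function is computed with a plain power series, which is adequate for moderate statistics but slow for very large ones.

```java
class ChiSquareCdfSketch {
    // log Gamma(k/2) for a positive integer k, built from Gamma(1/2) = sqrt(pi),
    // Gamma(1) = 1 and the recurrence Gamma(a + 1) = a * Gamma(a).
    static double logGammaHalf(int k) {
        double a = (k % 2 == 0) ? 1.0 : 0.5;
        double lg = (k % 2 == 0) ? 0.0 : 0.5 * Math.log(Math.PI);
        while (a < k / 2.0) {
            lg += Math.log(a);
            a += 1.0;
        }
        return lg;
    }

    // Probability that a chi-square variable with k degrees of freedom is <= v,
    // i.e. the regularized lower incomplete gamma function P(k/2, v/2), using
    // P(a, x) = e^(-x) x^a / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n)).
    static double cdf(double v, int k) {
        if (v <= 0.0) return 0.0;
        double a = k / 2.0, x = v / 2.0;
        double term = 1.0 / a, sum = term, ap = a;
        for (int i = 0; i < 100_000 && term > sum * 1e-16; i++) {
            ap += 1.0;
            term *= x / ap;
            sum += term;
        }
        return sum * Math.exp(-x + a * Math.log(x) - logGammaHalf(k));
    }
}
```

With 2 degrees of freedom the result can be checked by hand, since the distribution function reduces to 1 - exp(-v/2).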

**Week 4**

This week began with frustration. Tests whose output was supposed to
tend towards 0 went to infinity instead. Finally I realized that I had
an integer overflow. After correcting this, I plotted my data and found
that the graphs did not quite resemble the theoretical plot. I followed Dr.
Whitlock's advice and plotted the average of data produced by several sequences
of numbers. One exciting discovery: after trying these tests on Java's
generator, I realized that they produced different results each time,
unlike with the generator I normally test. Now that I'm getting used to
researching on the Internet, I found documentation explaining that a
different seed value is used each time, based on the current time. No
wonder the same test gives different results a few seconds later! Next, I
began generalizing the parallel tests, comparing three sequences of numbers
drawn from the same number generator within a short span of time. Many points
must be compared in each of these tests, so the processing time increased
considerably.
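
Two of this week's surprises can be reproduced in a few lines. The sketch below uses hypothetical names and arbitrary values. It first shows how an `int` product silently wraps around while the same product computed as a `long` does not, and then shows that passing an explicit seed to `java.util.Random` makes a run repeatable, whereas the no-argument constructor picks a fresh seed on each run.

```java
import java.util.Arrays;
import java.util.Random;

class Week4PitfallsSketch {
    // int arithmetic wraps around silently once the result exceeds 2^31 - 1.
    static int squareAsInt(int n) {
        return n * n;
    }

    // Widening one operand to long first keeps the full product.
    static long squareAsLong(int n) {
        return (long) n * n;
    }

    // The same seed always yields the same sequence of draws.
    static double[] draws(long seed, int count) {
        Random rng = new Random(seed);
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = rng.nextDouble();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(squareAsInt(100_000));  // 1410065408, not ten billion
        System.out.println(squareAsLong(100_000)); // 10000000000
        System.out.println(Arrays.equals(draws(1L, 5), draws(1L, 5))); // true
    }
}
```

Fixing the seed is what makes a statistical test on Java's generator reproducible from one run to the next.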

**Week 5**

By the time this week began, I had already produced enough basic
tests. Because of this progress, I was able
to generalize these tests for the case of three simultaneous sequences of
numbers drawing from the same generator. Of course, I also had to modify
the code that produces the theoretical results, so I could compare the
results in statistical tests. I realized that my code was a little messy,
so I took some time out to neaten it up and document it so it would be
quite simple for someone else to run these tests. Perhaps someday someone
will run these tests on a different generator!

**Week 6**

This week I used the code I wrote last week to run various parallel tests
on several variations of the pseudorandom number generator I am testing.
Whenever I obtained results outside the expected range of values, I spent
time analyzing them while further data was being generated. I analyzed 3,
4, 5, 6, and 7 sequences of numbers to find correlations among the
distinct sequences. I compared both pairs and triplets of numbers, both
when the sequences are split before being distributed to processes and
when the pairs are distributed to processes one after another.
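
The two ways of handing one generator's output to several processes can be sketched as below. The report does not spell out the exact scheme used, so this is one plausible reading: "split" is taken to mean contiguous blocks, and "one after another" is taken to mean dealing values out in round-robin order. All names are hypothetical.

```java
class StreamSplitSketch {
    // Block splitting: each process receives one contiguous run of the stream.
    // Values left over when the length is not a multiple of `processes` are dropped.
    static double[][] splitIntoBlocks(double[] stream, int processes) {
        int len = stream.length / processes;
        double[][] out = new double[processes][len];
        for (int p = 0; p < processes; p++) {
            System.arraycopy(stream, p * len, out[p], 0, len);
        }
        return out;
    }

    // Round-robin dealing: value i of the stream goes to process i % processes.
    static double[][] dealRoundRobin(double[] stream, int processes) {
        int len = stream.length / processes;
        double[][] out = new double[processes][len];
        for (int i = 0; i < len * processes; i++) {
            out[i % processes][i / processes] = stream[i];
        }
        return out;
    }
}
```

Correlation tests then compare element j of one process's sequence with element j of another's, so the two schemes probe very different lags of the underlying stream.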

**Week 7**

This week I coded another random number generator, the nested Weyl
generator. I ran the same parallel tests on this generator and found
many correlations between the sequences. This was expected, and I
documented the results. I also began two more tests, one to find the
length of the period and one to find the minimum distance between the
points in many sequences. I read the online documentation on the java
command and realized that I can increase the maximum amount of memory
available to any program (the -Xmx option). This is nice to know since I
had encountered "out of memory" errors in some of my programs that
required many large arrays to be stored in memory. I still have to test
whether the shuffled nested Weyl generator is capable of generating 10
uncorrelated parallel sequences of numbers, since I hear that the
generator is used in this fashion, perhaps inaccurately.
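
For readers unfamiliar with the generator, here is a small sketch of the nested Weyl sequence as it is commonly defined: the plain Weyl sequence is the fractional part of n·α for an irrational α, and the nested version is the fractional part of n times that value. The report does not give the constants or the exact variant that was coded, so the choice α = √2 and all names here are assumptions, and the shuffled variant is not shown.

```java
class NestedWeylSketch {
    // Fractional part of a non-negative number.
    static double frac(double x) {
        return x - Math.floor(x);
    }

    // Plain Weyl sequence: x_n = frac(n * alpha).
    static double weyl(long n, double alpha) {
        return frac(n * alpha);
    }

    // Nested Weyl sequence: x_n = frac(n * frac(n * alpha)).
    static double nestedWeyl(long n, double alpha) {
        return frac(n * frac(n * alpha));
    }

    public static void main(String[] args) {
        double alpha = Math.sqrt(2.0); // assumed irrational constant
        for (long n = 1; n <= 5; n++) {
            System.out.println(n + " " + nestedWeyl(n, alpha));
        }
    }
}
```

Because the value depends only on the index n, a process can jump straight to any position in the sequence, which is what makes the family attractive for parallel use. Double precision limits how large n can grow before the fractional part loses accuracy.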

**Week 8**

This week I modified the tests for parallel generation so they work on the
nested Weyl generator I created. It seems like I actually coded the
nested Weyl generator properly, since there is a significant deviation from
the theoretically "good" results. We expected this because the nested
Weyl generator has been documented to fail in parallel computing
environments. I'm wondering whether my test that finds whether a
generator is periodic or aperiodic actually works, because it tells me
that every version of the shuffled nested Weyl and nested Weyl
generators that I tested is aperiodic.

**Week 9**

This week I began by coding generators that have obvious periods. I was
happy to see that my program that finds the period of a generator works
properly. I also carried my test for parallel random number generation
further. I again counted how many numbers land in each bin, but this time
each two-dimensional bin covered a smaller range of values, since there
were 20 or 50 bins in each dimension. I hope to finish creating,
testing, and using my minimum-distance test next week.
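
One way to validate a period finder against generators with known periods is sketched below. It uses Floyd's cycle-detection method, which is not necessarily the approach taken in the project, and it assumes that the generator's entire state fits in a single `long` that is advanced by a step function. The tiny linear congruential generators in the example are hypothetical test cases chosen because their periods are easy to verify by hand.

```java
import java.util.function.LongUnaryOperator;

class PeriodSketch {
    // Length of the cycle that the state sequence seed, f(seed), f(f(seed)), ...
    // eventually enters, found with Floyd's tortoise-and-hare method.
    static long period(LongUnaryOperator step, long seed) {
        long tortoise = step.applyAsLong(seed);
        long hare = step.applyAsLong(step.applyAsLong(seed));
        // The hare moves two steps per tortoise step; they meet inside the cycle.
        while (tortoise != hare) {
            tortoise = step.applyAsLong(tortoise);
            hare = step.applyAsLong(step.applyAsLong(hare));
        }
        // Walk once around the cycle from the meeting point to measure its length.
        long length = 1;
        hare = step.applyAsLong(tortoise);
        while (hare != tortoise) {
            hare = step.applyAsLong(hare);
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        // x -> (5x + 3) mod 16 visits all 16 residues before repeating.
        System.out.println(period(x -> (5 * x + 3) % 16, 0L)); // 16
        // x -> (x + 2) mod 10 cycles through 5 values of the same parity.
        System.out.println(period(x -> (x + 2) % 10, 1L)); // 5
    }
}
```

A finder that reports the right answer on cases like these gives some confidence in a verdict of "aperiodic within the tested range" on a generator whose period is unknown.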

**Week 10**

I used this week to wrap up this research project and tie up any loose
ends. I reviewed my progress with Dr. Whitlock. We discussed the results
of my programs and further tests and variations I would develop were I to
carry this project further. When we noticed gaps in the data, I
clarified and completed the result set.