My research

Research Plan for Sample Sort as discussed with Dr. Nancy Amato and Dr. Mauro Bianco 06/12/2007

Research Topic:

Implementing Sample Sort using the Standard Template Adaptive Parallel Library (STAPL), a parallel library designed to be a superset of the STL for C++.

Research Goals:

  1. To implement the Sample Sort algorithm while taking advantage of the options the STAPL library provides to spread the work across multiple processors and thereby improve the running time.
  2. To test the implementation of Sample Sort on various inputs, including inputs fewer data elements than processors, inputs with many duplicates, and inputs of various data types, possibly including user-defined data types.
  3. To study the general scalability of the implementation, the speedup as Sample Sort is run on increasing numbers of processors, and the conditions under which the implementation of Sample Sort performs well and poorly, such as the degree of pre-sortedness of the input, the ratio between the size of the input and the number of processors, the computer architecture, and possibly the over-sampling ratio.

Tasks:

My primary task will be to write STAPL compliant code to do the following:

  1. To pick processors-1 splitters, possibly using an adjustable over-sampling ratio to first choose more splitters than necessary and then choosing the splitters from among the pool of possible splitters.
  2. To sort the splitters.
  3. To use the splitters to define buckets and place the data in the correct buckets by comparison between the data and the splitters that make up the limits of the buckets.
  4. To sort each bucket.

I will then have to write programs to test my implementation of Sample Sort with different inputs, as described above, and any necessary code to facilitate the study of the scalability, speedup, and conditions under which my implementation of Sample Sort performs well and poorly.

Finally, I will have to present my findings in a presentation, poster, and paper, and describe them on my website.