|
|
|
Welcome to a site about my DMP experience of Summer 2004. I participated in the DMP program in the Department of Computer Science at Texas A&M University under the guidance of Prof. Nancy Amato last summer as well, and liked it so much that I am back! I am working in the Standard Template Adaptive Parallel Library (STAPL) in the Parasol Lab. Gabriel Tanace is my graduate student mentor.
I have recently graduated from Western Oregon University with a B.S. in Computer Science and Mathematics. I will be starting my Ph.D. Program at Texas A&M University this fall.
My research interests include parallel and distributed computing, parallel algorithms, and performance modeling. This summer I am working on the Adaptive Parallel Sorting Algorithms in STAPL.
Adaptive Parallel Sorting Algorithms in STAPL
Abstract: Modern computing demands fast processing of large amounts of data. Using more than one processor at a time greatly increases the speed of program execution, making parallel and distributed processing important. Standard Template Adaptive Parallel Library (STAPL) is a parallel library for C++ that allows users to execute on multiprocessor systems without having to deal with the complexity of application parallelization. STAPL provides various parallel algorithms to its users. One of the most commonly used procedures is sorting a set of data. Sorting algorithms differ in performance depending on machine architecture, number of processors, number of elements to sort, data type of the elements, whether the elements are partially sorted. We compare the performance of four different sorting algorithms: Sample Sort, Radix Sort, Column Sort and Bitonic Sort. STAPL will be able to adaptively select the best algorithm based on the input data and system information available.
Weekly Journal - Summer 2004
1. Week of 07-09-04
I spent most of this week taking care of logistics (setting up accounts, etc.). I attended the USRG seminar, Dr.Amato's group meeting, Dr.Rauchwerger's group meeting, and the "brown bag lunch" with Dr.Taylor. As far as the project goes, I have been studying the sorting algorithms and the different implementations of them.
2. Week of 07-16-04
I concentrated on the Sample Sort. I have mostly worked on the documentation for the Sample Sort; it is done in .ps format and ready to be reviewed. I also wrote an abstract. I have started to work on the code somewhat as well, but an approval (or correction) of the implementation would be a good thing to have first. I am trying to have the algorithm as close to the shared view as possible.
3. Week of 07-23-04
I spent this week coding the Sample Sort. It is still missing the last part since I need a method added to the pRange, a part of STAPL I am using. I applied for the funding to attend the CREW/DMP reunion/Grace Hopper Conference which will take place in October in Chicago; I am sure I will learn a lot there and have a chance to catch up with old friends :)
4. Week of 07-30-04
I have written the last part of the Sample Sort; to output the sorted sequence to the original container in parallel, I used a simplified prefix sums (one element per processor) to get the starting indexes for each processor. I am also looking at a possibility to use a pointer jumping technique to do that. I am guessing that my implementation will be faster for a small number of processors and the pointer jumping for a large number of processors. This could eventually be one of the adaptive parts. I also wrote the documentation for the Radix Sort.
5. Week of 08-06-04
I coded the Radix Sort, although it is still not working properly. I had to stop working on it for awhile though so that I could make a poster for the presentation I had this Friday (and the practice one on Wednesday). I never thought that fitting all the information on a set size poster would be that difficult; I think I spent more time on the layout than on the content. I did learn a few tricks though; making my next poster should be a lot easier.
6. Week of 08-13-04
This week I finally got the Radix Sort to work properly. The machine I normally run my performance tests on has been rather busy, so I was unable to test the performance much. I applied for an account on a couple of the machines at the A&M Supercomputing Center; it will be interesting to test the performance of my algorithms on them.
7. Week of 08-20-04
I wrote the tester files for both of the sorts. The testers test the algorithms for correctness, as well as calculate their running time. The program is run multiple times in a row, to ensure accuracy and the 95% confidence level for the running time. I also tested my algorithms on Cosmos, one of the supercomputers, and have identified a number of problem areas where the performance can be (and should be) improved.
8. Week of 08-27-04
I worked on testing the performance of the algorithms relative to the aggregation. Aggregation is the number of messages put in the to_send queue before they are sent. Since both of the sorts involve a lot of communication, aggregation size changes the performance significantly. I have also started looking at multiple implementations of Merge Sort.
9. Week of 09-03-04
STAPL had a major change this week; now the pRange (the iterator) also has all the methods of the pContainers (parallel containers). So I rewrote my algorithms to accommodate the change. There are still some unresolved issues with that; hopefully we'll figure out how to deal with them soon. I also realized that it is fairly unreasonable to expect the parallel Merge Sort to scale since progressively less and less processors are involved in the computation, so I started looking at the alternatives. At the moment I am considering implementing a parallel Bitonic Sort.
10. Week of 09-10-04
I have implemented the bitonic sort and performed more testing. As the summer is coming to an end, my project is not over yet. Luckily I will be able to continue working on it as a graduate student at Texas A&M University, WHOOP! For the most recent updates, visit my home page. To summarize the summer, I worked hard and learned a lot while having a great time. This program is a great jump start for a research career I am seeking, because it allows you to get an edge by starting to work on research before going to graduate school.
Parasol Home | Research | People | General info | Seminars | Resources Parasol Lab, 301 Harvey R. Bright Bldg, 3112 TAMU, College Station, TX 77843-3112 Phone 979.458.0722 Fax 979.458.0718 Department of Computer Science | Dwight Look College of Engineering | Texas A&M University |