Week 5
Fortunately for Philip and me, everything seemed to go right this week:

I finished the test suite for the synthetic sqrt code.

The test suite allows analysis of performance and accuracy.

I will use it to explore several aspects of my problem:

Which algorithm is better: Newton-Raphson iteration or binary search?

How do the values of the absolute and relative tolerances affect the balance between performance and accuracy?

(Also, is there an optimal setting for these values that maximizes both?)

How must I adapt these settings according to the precision used? (Single vs Double)

Are any of these aspects affected by the particular value of the operand? (Does sqrt(x) behave differently for different x?)

How much would this added functionality slow down applications as a function of usage (0.1%, 1%, or 10% of calculations)?
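To make the first two questions concrete, here is a minimal Python sketch of both rootfinding approaches to sqrt. It is an illustration only, not the code in my test suite: the function names, the default tolerances, the starting guesses, and the combined stopping test (absolute plus relative tolerance) are all assumptions chosen for the example. Each function returns the estimate together with the number of iterations it used, which gives a rough handle on cost.

```python
def sqrt_newton(x, abs_tol=1e-12, rel_tol=1e-12, max_iter=100):
    """Approximate sqrt(x) with Newton-Raphson on f(r) = r*r - x.

    Returns (estimate, iterations). Names and defaults are hypothetical.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0, 0
    r = x if x >= 1.0 else 1.0  # starting guess that is >= sqrt(x)
    for i in range(1, max_iter + 1):
        r_next = 0.5 * (r + x / r)  # Newton step for r*r - x = 0
        # Stop when the step is small in absolute or relative terms.
        if abs(r_next - r) <= abs_tol + rel_tol * abs(r_next):
            return r_next, i
        r = r_next
    return r, max_iter


def sqrt_bisect(x, abs_tol=1e-12, rel_tol=1e-12, max_iter=2000):
    """Approximate sqrt(x) with a binary search on [0, max(1, x)].

    Returns (estimate, iterations). Names and defaults are hypothetical.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0, 0
    lo, hi = 0.0, max(1.0, x)  # sqrt(x) always lies in this bracket
    for i in range(1, max_iter + 1):
        mid = 0.5 * (lo + hi)
        # Compare mid with x / mid instead of mid*mid to avoid overflow.
        if mid > x / mid:
            hi = mid
        else:
            lo = mid
        # Stop when the bracket is narrow in absolute or relative terms.
        if hi - lo <= abs_tol + rel_tol * mid:
            return 0.5 * (lo + hi), i
    return 0.5 * (lo + hi), max_iter


if __name__ == "__main__":
    for value in (2.0, 1e-6, 1e6):
        print(value, sqrt_newton(value), sqrt_bisect(value))
```

Loosening abs_tol and rel_tol in this sketch reduces the iteration counts at the cost of accuracy, which is the trade-off the questions above are about. The defaults here assume double precision; single precision would need looser values.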

I also discovered that our methodology for measuring execution time was flawed.

The prior method counted the active clock cycles between two system calls, but that count did not include time spent outside the CPU.

I corrected the problem by making system calls to get wall-clock timestamps and subtracting them to determine execution time.
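The difference between the two measurements can be sketched in Python as follows. This is a stand-in, not my actual timing code: time.process_time() plays the role of the active-cycle counter (it counts only CPU time used by the process), time.perf_counter() plays the role of the wall-clock timestamps, and the helper names measure and mostly_waiting are made up for the example.

```python
import time


def measure(func, *args):
    """Run func(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()  # wall-clock timestamp
    cpu_start = time.process_time()   # CPU time consumed by this process
    result = func(*args)
    cpu_elapsed = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return result, wall_elapsed, cpu_elapsed


def mostly_waiting(seconds):
    """Hypothetical workload that spends its time off the CPU."""
    time.sleep(seconds)
    return seconds


if __name__ == "__main__":
    _, wall, cpu = measure(mostly_waiting, 0.2)
    print(f"wall: {wall:.3f} s, cpu: {cpu:.3f} s")
```

For a workload that waits on something outside the CPU, the CPU-only figure misses most of the elapsed time, while the difference of the two wall-clock timestamps captures all of it.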

I will add this insight to the group wiki to share it with the rest of the group.

Philip, after learning of my rootfinding approach to implementing sqrt, was interested in applying this approach to division.

We discussed both the theory of applying this to division and the practical implications of implementing this on GPUs.

We agreed this approach had the potential to perform much faster and much more accurately than his current approach.

Indeed, his preliminary results show massive speed improvements that greatly exceed our expectations.

In fact, these results show that for operands particularly large in magnitude, the new approach not only fixes GPU drifting, but is more accurate than CPU hardware division!
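One common way to cast division as a rootfinding problem is to compute the reciprocal 1/b as the root of f(y) = 1/y - b, for which the Newton-Raphson step y * (2 - b*y) needs only multiplications and subtractions. The Python sketch below illustrates that idea on the CPU. I am not claiming it is the scheme Philip implemented on the GPU: the function names, the linear starting guess, and the tolerance are all assumptions made for the example, and special values such as infinities and NaNs are not handled.

```python
import math


def reciprocal_newton(b, rel_tol=1e-15, max_iter=20):
    """Approximate 1/b by Newton-Raphson on f(y) = 1/y - b."""
    if b == 0:
        raise ZeroDivisionError("reciprocal of zero")
    sign = -1.0 if b < 0 else 1.0
    # Scale so that abs(b) = m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(abs(b))
    # Linear starting guess for 1/m on [0.5, 1) (an assumed choice).
    y = 48.0 / 17.0 - (32.0 / 17.0) * m
    for _ in range(max_iter):
        y_next = y * (2.0 - m * y)  # Newton step: no division needed
        converged = abs(y_next - y) <= rel_tol * abs(y_next)
        y = y_next
        if converged:
            break
    # Undo the scaling: 1/abs(b) = (1/m) * 2**(-e).
    return sign * math.ldexp(y, -e)


def divide_newton(a, b):
    """Approximate a / b as a * (1/b)."""
    return a * reciprocal_newton(b)


if __name__ == "__main__":
    for a, b in ((1.0, 4.0), (10.0, 3.0), (-6.0, 3.0)):
        print(a, b, divide_newton(a, b), a / b)
```

Because each step roughly doubles the number of correct digits, a handful of iterations is enough in double precision once the operand has been scaled into a fixed range.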

During the week, I was involved in several discussions:

Philip and I presented our work and preliminary findings to a group of collaborators from the Army Research Laboratory.

They do a lot of work on GPUs, and are very interested in our project.

I attended a talk by Dr. Lisa Marvel, who is also a researcher at the ARL.

Her work in steganography is incredibly interesting!

I also participated in the weekly group meeting.

Trilce, a Ph.D. student in the group, presented a paper that was awarded "Best Systems Paper" at Supercomputing 2003.

The paper detailed very useful methodologies for finding and eliminating sources of performance loss in large-scale clusters.

There were also some social events:

I went to another picnic hosted by the social committee at the Towers, where I currently stay.

The food was great, and I got to ride a Segway!

I had lunch with Dr. Pollock's group and the rest of the GCL.

