Hey, my name is Ali Khalid, and I am currently a junior at Midwestern State University. I am pursuing a double major in Computer Science and Mathematics, and I plan to graduate in May 2018. I am really interested in solving real-life mathematical and logical problems. In my free time I like to work out, play soccer, or solve interesting coding challenges.
I enjoy learning new and challenging things. Some of the skills I have picked up along the way are
My DREU project for the summer is related to the STAPL Graph Library, a library built on top of the STAPL framework that provides parallel graph data structures, algorithms, and various tools for parallel graph processing at scale. In my project, I will study how various properties (graph topology, machine size, etc.) impact the performance of parallel graph processing workloads. The project will catalog each paradigm's performance in different graph analysis workloads and with different parameters for each execution. From these results, I will attempt to group the properties into classes and provide guidelines that help future users select the best method for their particular problem.
I arrived at Texas A&M on Sunday and explored the city, from shopping malls to various food places. My official start day was Tuesday. That is when I first came to the Parasol Lab, met my mentors, and went head first into my research. I found out I will be working on SGL (the STAPL Graph Library). I spent most of my first week reading previous papers and writing small programs to become familiar with STAPL. Later, I started experimenting with SGL: I ran performance tests on BFS and made performance plots from the results.
This week I got more involved with the project. I started writing benchmark programs to check performance and run tests on various graph algorithms. My first commit was the benchmark program for the Pseudo Diameter algorithm. Later I optimized the algorithm, cleaned up the code, and made my second commit. I also created a benchmark program for the Approximate BFS algorithm, which I am currently in the process of uploading. While doing so I found a bug in the original algorithm, which was then fixed. I also ran performance analysis on the Twitter graph and modified the Betweenness Centrality algorithm to find the most central vertex in that graph.
This week I finally played soccer at the A&M soccer fields with a couple of friends. I also got my A&M ID card, which allowed me to visit the recreational center, where I went for a swim, played table tennis, and worked out. On Wednesday night, after a swim at the recreational center, my roommate and I decided to cook on the outdoor grill, which turned out better than we expected. I made good progress on my project as well: I wrote and finally committed the benchmark program for Approximate BFS. I have also written benchmark programs for the Closeness Centrality and Bad Rank algorithms, and once they are reviewed I will be able to commit those as well. Apart from that, I worked on this website. Finally, I have been running performance analysis on the Approximate BFS algorithm using different numbers of processors and creating performance plots.
I made great progress on my project this week. I have been implementing benchmarks, modifying certain algorithms to take an execution policy instead of relying on overloaded functions, and modifying the tests for certain algorithms to exercise the various paradigms. I implemented and committed a benchmark program for closeness centrality and added a test for the hubs paradigm to the closeness centrality test. I also updated approximate BFS and pseudo diameter to accept various execution policies, which makes running benchmark tests much easier. Along with writing programs, I have been conducting performance analysis on the Approximate BFS algorithm, running it on 512 processors with USA road maps as input to look for correlations between graph properties and paradigms.
This week I continued working on benchmarks. I implemented and committed benchmark programs for the Cut Conductance and Link Prediction algorithms. Along with the benchmarks, I have been working on validation for those algorithms. While running the validations I came across some bugs, so I spent most of my time this week debugging. Along with fixing the bugs, I learnt how to debug efficiently in a Linux environment. I also made minor changes to the closeness centrality test so that it tests all paradigms of execution, and I implemented the benchmark and wrote verification for the random walk algorithm, which I will be committing soon. After this I will start running these algorithms on a cluster with varying numbers of processors in order to analyze their performance.
This week I have been working on the benchmark program for the random walk algorithm. Along with writing validation for the algorithm, I also fixed a bug in it. Apart from this, I have been running benchmark tests on pseudo diameter on Rain (a TAMU cluster), testing all paradigms on varying numbers of processors and creating performance plots in order to measure the performance of the different paradigms and their relative speedup. Next week, I will be working on the benchmark program for the k_core algorithm: changing the algorithm to use an execution policy instead of overloaded functions, writing validation, and updating the test to accept an execution policy. I will also be fixing the bug in the community_detection algorithm while running benchmark tests for different algorithms on Rain.
This week I was mainly running benchmark tests on Rain and creating performance plots for all supported paradigms and varying numbers of processors in order to measure relative speedup. I ran the tests with two different inputs: first the USA road network, and second a Kronecker graph. The benchmark tests were for the following algorithms: bad_rank, closeness_centrality, link_prediction, pseudo_diameter and random_walk. I have also been working on benchmark programs for the community_detection and k_core algorithms. I committed the benchmark program for community_detection and have written the benchmark program for k_core. Next week I will run validation on the k_core algorithm and commit that as well. Apart from that, I will be running more benchmark tests.
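As a reminder of what "relative speedup" means in plots like these, here is a minimal Python sketch. It assumes speedup is measured against the smallest processor count that was run, which is a common convention but not necessarily the one used in the actual plots. The function names are hypothetical and the timings are made-up placeholders, not measurements from Rain.

```python
def relative_speedup(timings):
    """Return {processors: speedup} relative to the smallest processor count.

    `timings` maps a processor count to the measured run time in seconds.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    return {p: base_time / t for p, t in sorted(timings.items())}


def parallel_efficiency(timings):
    """Return {processors: efficiency}, where 1.0 means perfect scaling."""
    base_procs = min(timings)
    speedups = relative_speedup(timings)
    return {p: s * base_procs / p for p, s in speedups.items()}


# Placeholder timings in seconds (illustrative only, NOT real results).
example_timings = {1: 8.0, 2: 4.0, 4: 2.5, 8: 2.0}

if __name__ == "__main__":
    for procs, speedup in relative_speedup(example_timings).items():
        print(procs, round(speedup, 2))
```

Plotting speedup against processor count for each paradigm on the same axes makes it easy to see where one paradigm stops scaling before another.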
This week I finished writing the benchmark program for the k_core algorithm. I updated the k_core algorithm to use an execution policy and wrote verification for it. After that I started working on two different implementations of the BFS algorithm in order to compare their performance. The existing algorithm stores both the parent and the level of each vertex. I have written a BFS that stores the level only, along with a benchmark program, and the algorithm passes the verification I wrote. I have also written another version that stores the parent vertex only; however, I am still in the process of verifying its correctness.
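The difference between the two variants can be illustrated with a small sequential sketch in Python. This is not STAPL code and says nothing about how SGL implements BFS. The function names and the tiny example graph are hypothetical, chosen only to show what each variant stores and one way the parent-only result could be checked against the level-only result.

```python
from collections import deque


def bfs_level_only(adj, source):
    """BFS that records only the level (distance in hops) of each vertex."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level


def bfs_parent_only(adj, source):
    """BFS that records only the parent of each vertex in the BFS tree."""
    parent = {source: source}  # the source is its own parent
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def levels_from_parents(parent, source):
    """Recover levels by walking parent pointers back to the source."""
    level = {}
    for v in parent:
        depth, u = 0, v
        while u != source:
            u = parent[u]
            depth += 1
        level[v] = depth
    return level


# Small undirected example graph as an adjacency list (illustrative only).
example_graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}

if __name__ == "__main__":
    levels = bfs_level_only(example_graph, 0)
    parents = bfs_parent_only(example_graph, 0)
    # A parent-only result is consistent if it reproduces the BFS levels.
    print(levels_from_parents(parents, 0) == levels)
```

Storing a single property per vertex means less data to keep and update for every visited vertex, which is one plausible reason a single-property BFS might run faster than one that stores both.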
This week I continued working on the BFS algorithms. As of last week, the level-only BFS was working and passed verification. This week I also wrote a test program for the level-only BFS, and it passed for all paradigms. I completed the verification for the parent-only BFS as well. While running it, I found that it does not pass for the hierarchical hubs paradigm, although it passes for the level synchronous and hierarchical paradigms. I also ran tests on up to 512 cores on Rain and saw some speedup in both versions of BFS compared to the original version. The tests used the USA road network, a mini Kronecker graph, and a Kronecker graph with a scale of 25 and an edge factor of 16. There was speedup on all three input graphs, with the largest on the USA road network. Apart from that, I have been working on my poster and final report.
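To give a sense of the input size, the sketch below computes the vertex and edge counts of a Kronecker graph from its scale and edge factor. It assumes the Graph500-style convention, where the graph has 2^scale vertices and edge factor times that many generated edges. Whether the generator used here follows exactly that convention is an assumption, and the function name is hypothetical.

```python
def kronecker_size(scale, edge_factor):
    """Return (num_vertices, num_edges) under the Graph500-style convention.

    Assumes 2**scale vertices and edge_factor * 2**scale generated edges.
    """
    num_vertices = 2 ** scale
    num_edges = edge_factor * num_vertices
    return num_vertices, num_edges


if __name__ == "__main__":
    vertices, edges = kronecker_size(25, 16)
    print(vertices, edges)
```

Under that assumption, a scale of 25 with an edge factor of 16 corresponds to roughly 33.5 million vertices and about 537 million generated edges.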
The last week went by very fast. I had been working on some algorithms, and along with that I had to prepare my research poster and my research paper. I completed my report and poster in the first half of the week and also committed the variant of the BFS algorithm I was working on.
It has been an amazing experience working on STAPL this summer; I learnt a lot, met some interesting people and I am walking away with an experience I will never forget.
I am working under the mentorship of Dr. Nancy Amato. She is Unocal Professor and Regents Professor in the Department of Computer Science and Engineering at Texas A&M University, where she co-directs the Parasol Lab.
Another of my mentors is Dr. Timmie Smith. He is a Postdoctoral Research Associate whose focus is the development of PARAGRAPHs, which are high-level task graph representations of parallel operations. He is also involved in the development of PDT, a parallel discrete-ordinates transport code. PDT is developed using STAPL in collaboration with the Nuclear Engineering department.
I also work under the mentorship of Adam Fidel, a PhD candidate in computer science in the Parasol Lab at Texas A&M University. He works on accelerating large-scale parallel graph workloads through the use of asynchronous processing and nested parallelism, and he is also the principal developer of the STAPL Graph Library.