The project I will be working on deals with data management
for workflows. A workflow is a set of tasks that must be completed in order to produce
the correct output. For example, the Montage workflow takes a large number of telescope
images and, through a number of processes, fits them together to form a large,
cohesive picture of the night sky. This means, of course, that there are many of these
telescope images, or input files, that must be transferred ("staged in") to the cluster of
machines running the workflow. At the moment, all of the stage-in is done inside
the program that is actually running the workflow. Our goal is to move this process outside
of the program running the workflow and to implement the stage-in using a separate program. In
doing so, we hope to find ways to significantly decrease the time it takes for the workflow
to run. I will be working on this project with another DREU student named Samuel Hopkins.