Final Report

This is my final report for the work I did this summer. Download it here.

Abstract -- Staging in the input data for a scientific workflow that runs on a remote cluster accounts for a significant share of the workflow’s runtime. In this paper, we present an implementation of a Data Placement Service (DPS) integrated with the Pegasus Workflow Management System: Pegasus sends stage-in requests to the DPS, which performs the transfers. The goal of this implementation is to reduce workflow runtime by staging in data asynchronously.
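
To illustrate why asynchronous stage-in can shorten a run, here is a minimal Python sketch. It is not the Pegasus or DPS API: `stage_in`, `run_job`, and both runner functions are hypothetical names, a thread pool stands in for the placement service, and `time.sleep` stands in for real transfers and computation. It assumes each job needs only its own input file.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def stage_in(name: str, seconds: float) -> str:
    """Hypothetical transfer: the sleep stands in for copying one input file."""
    time.sleep(seconds)
    return name


def run_job(name: str, seconds: float) -> str:
    """Hypothetical compute job that needs its input to be staged in first."""
    time.sleep(seconds)
    return f"done:{name}"


def run_synchronous(jobs):
    """Stage in each input, then run its job, strictly one after another."""
    results = []
    for name, transfer_s, compute_s in jobs:
        stage_in(name, transfer_s)
        results.append(run_job(name, compute_s))
    return results


def run_asynchronous(jobs):
    """Hand every transfer to a background pool (a stand-in for the
    placement service) up front, so later transfers overlap with
    earlier computation."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        pending = [pool.submit(stage_in, name, t) for name, t, _ in jobs]
        results = []
        for future, (name, _, compute_s) in zip(pending, jobs):
            future.result()  # wait only until this job's input has arrived
            results.append(run_job(name, compute_s))
    return results


if __name__ == "__main__":
    # (job name, transfer seconds, compute seconds); made-up durations
    demo_jobs = [("a", 0.1, 0.1), ("b", 0.1, 0.1), ("c", 0.1, 0.1)]
    for runner in (run_synchronous, run_asynchronous):
        start = time.perf_counter()
        runner(demo_jobs)
        print(runner.__name__, round(time.perf_counter() - start, 2), "s")
```

In the synchronous version every job pays its own transfer time on the critical path. In the asynchronous version only the wait for the first input is unavoidable, and the remaining transfers proceed while earlier jobs compute.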