Distributed File Streamer: A Framework for Distributed Application Data Coupling

Abstract

File transfer is ubiquitous in modern distributed computing environments. Protocols such as HTTP and FTP are designed for downloading files from, or uploading files to, servers. Other tools, such as secure copy (scp), transfer files between hosts securely. In this paper, file transfer is considered in the context of connecting distributed applications, where the output of a data producer on one node is the input of a data consumer on another node. Using intermediate files as the medium that connects the computational phases of a workflow is a common paradigm in grid environments. Distributed File Streamer (DFS), as its name implies, uses data streaming to couple distributed applications. Instead of waiting for the producer application's output to be transferred completely to the consumer node, DFS streams the data over the network directly to the consumer program, managing the data flow efficiently and providing a framework for partial file consumption. This paper describes the architecture of the DFS framework, analyzes its performance model, and presents results from several examples that demonstrate the advantages of DFS over the traditional approach of transferring complete files.

Publication
7th IEEE/ACM International Conference on Grid Computing