Abstract

This thesis is about the interaction between different architectures in high-performance computing for file system I/O. The architectures are evaluated by performance, scalability and fault handling. What excels in a loosely coupled system fails in a tightly coupled system, and vice versa.

The I/O path from disk to application has been examined both theoretically and with tests, for local and distributed file systems. The impact of different levels of cache is shown using various tests.

These test results have been used to design and implement a protocol that gives SCI (Scalable Coherent Interface) the semantics of TCP/IP, thereby replacing TCP/IP in PVFS (Parallel Virtual File System). SCI is a low-latency, high-throughput interconnect with decentralized routing. In PVFS, interconnect latency has proven important only for metadata operations. For I/O operations, pipelining within the protocol window hides the latency. As expected, PVFS has shown increased read and write performance with increased interconnect throughput. Throughput has increased by a factor of 5 when moving from 100 Mb/s Ethernet to SCI. To limit overloading of the interconnect, two techniques have been evaluated: exponential backoff, as in TCP/IP, and a token-based scheme.
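
The two mechanisms named above can be illustrated with a minimal sketch. It is not the protocol implemented in the thesis: the function names, the simple round-trip model, and all parameter values are assumptions chosen for illustration. The first function shows why a large enough protocol window lets pipelining hide latency; the second shows the delay schedule of a capped exponential backoff.

```python
def windowed_throughput(window_msgs, msg_bytes, latency_s, bandwidth_bps):
    """Sustained throughput in bytes/s for a sliding-window transfer.

    Assumed model (not from the thesis): the sender may have at most
    `window_msgs` unacknowledged messages in flight, and an
    acknowledgement returns one round trip after a message is sent.
    """
    link_rate = bandwidth_bps / 8.0           # bytes per second
    send_time = msg_bytes / link_rate         # time to put one message on the link
    round_time = send_time + 2 * latency_s    # send plus round-trip latency
    window_rate = window_msgs * msg_bytes / round_time
    # The link itself is the upper bound once the window covers the round trip.
    return min(link_rate, window_rate)


def backoff_delay(attempt, base_s=0.001, cap_s=1.0):
    """Delay before retry number `attempt` (0-based), doubling each time up to a cap."""
    return min(cap_s, base_s * 2 ** attempt)


if __name__ == "__main__":
    # Hypothetical link: 8 Mb/s, 1 ms one-way latency, 1000-byte messages.
    for window in (1, 2, 4, 8):
        rate = windowed_throughput(window, 1000, 0.001, 8e6)
        print(f"window={window}: {rate:.0f} bytes/s")
    print([backoff_delay(n) for n in range(5)])
```

In this model a window of one message leaves the link idle during each round trip, while a window that covers the round trip saturates the link, so further latency reductions no longer improve I/O throughput. A token-based scheme would instead bound the number of senders admitted at once rather than delaying retries.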

(ps, pdf).
The part relevant for PVFS (ps)