Effects of Communication Protocol Stack Offload on Parallel Performance in Clusters
Protopopov, Boris Vladimirovich.
CommitteeReese, Donna S.
Bridges, Susan M.
The primary research objective of this dissertation is to demonstrate that the effects of communication protocol stack offload (CPSO) on application execution time can be attributed to the following two complementary sources. First, the application-specific computation may be executed concurrently with the asynchronous communication performed by the communication protocol stack offload engine. Second, the protocol stack processing can be accelerated or decelerated by the offload engine. These two types of performance effects can be quantified with the use of the degree of overlapping Do and degree of acceleration Daccs. The composite communication speedup metrics S_comm(Do, Daccs) can be used in order to quantify the combined effects of the protocol stack offload. This dissertation thesis is validated empirically. The degree of overlapping Do, the degree of acceleration Daccs, and the communication speedup Scomm characteristic of the system configurations under test are derived in the course of experiments performed for the system configurations of interest. It is shown that the proposed metrics adequately describe the effects of the protocol stack offload on the application execution time. Additionally, a set of analytical models of the networking subsystem of a PC-based cluster node is developed. As a result of the modeling, the metrics Do, Daccs, and Scomm are obtained. The models are evaluated as to their complexity and precision by comparing the modeling results with the measured values of Do, Daccs, and Scomm. The primary contributions of this dissertation research are as follows. First, the metric Daccs and Scomm are introduced in order to complement the Do metric in its use for evaluation of the effects of optimizations in the networking subsystem on parallel performance in clusters. The metrics are shown to adequately describe CPSO performance effects. Second, a method for assessing performance effects of CPSO scenarios on application performance is developed and presented. Third, a set of analytical models of cluster node networking subsystems with CPSO capability is developed and characterised as to their complexity and precision of the prediction of the Do and Daccs metrics.