K-ary N-trees: High Performance Networks for Massively Parallel Architectures
The past few years have seen a rise in popularity of massively
parallel architectures that use fat-trees as their interconnection
In this paper we formalize a parametric family of
fat-trees, the k-ary n-trees, built with constant arity switches
interconnected in a regular topology. A simple adaptive routing
algorithm for k-ary n-trees sends each message to one of the
nearest common ancestors (NCA) of both source and
destination choosing the less loaded physical channels and then
reaches the destination following the unique available path. Through
simulation on a 4-ary 4-tree with 256 nodes, we analyze some
variants of the adaptive algorithm that utilize wormhole routing
with 1, 2 and 4 virtual channels. The experimental results
show that the uniform, bit reversal and transpose traffic patterns
are very sensitive to the flow control strategy. In all these cases,
the saturation points are between 35-40\% of the network
capacity with 1 virtual channel, 55-60\% with 2 virtual
channels and around 75\% with 4 virtual channels and after
saturation, the network throughput remains stable. Increasing the
number of virtual channels slightly worsens the average network
latency, but improves the bounds on the distribution. The
complement traffic, a representative of the class of the
congestion-free communication patterns, reaches an optimal
performance, with a saturation point at 97\% of the capacity for
all flow control strategies. In this case virtual channels are of
little help and the average network latency experiences an increase
proportional to the number of virtual channels, due to the
multiplexing of several packets onto the same physical path.