Ole W. Saastad
Titan performance
HPCC 1.4
HPCC run with 64 CPUs on Intel X5550 processors
HPCC run with 64 CPUs on AMD 2354 processors
HPCC run with 1024 CPUs on AMD 2354 (976) and 2367 (48) processors
Meta Data Benchmark
A simple meta data benchmark was written to exercise GPFS meta data performance:
unpack a Linux kernel, count the files, touch all files, and remove all files.
This is done concurrently for a large number of instances and the time is recorded;
a minimal sketch of the MPI launcher is shown at the end of this section.
Files (save the scripts and MPI program):
SLURM run script
MPI application to launch and time
Bash script to do it
Titan results
In addition, some meta data tests using Bonnie++ have been run; results are available
here.
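For illustration, such an MPI launcher could look like the sketch below. The script
name metadata_test.sh, the per-rank directory layout and the compile line are made-up
example values; the actual scripts and program are the files linked above.

    /* Sketch of an MPI launcher that starts one metadata script per rank
     * and times the slowest instance.  Compile with e.g. mpicc md_launch.c */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        int rank, size, rc;
        char cmd[256];
        double t0, t1, local, maxt;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each rank works in its own sub-directory so the instances do not
           collide; this directory layout is an assumption for the sketch. */
        snprintf(cmd, sizeof(cmd), "./metadata_test.sh dir.%04d", rank);

        MPI_Barrier(MPI_COMM_WORLD);        /* start all instances together */
        t0 = MPI_Wtime();
        rc = system(cmd);                   /* unpack, count, touch, remove */
        t1 = MPI_Wtime();
        local = t1 - t0;

        /* The slowest instance determines the wall time of the test. */
        MPI_Reduce(&local, &maxt, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("%d concurrent instances, slowest: %.2f s (rc=%d)\n",
                   size, maxt, rc);

        MPI_Finalize();
        return 0;
    }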
GPFS performance
IOR, Meta-data benchmark and Bonnie++ benchmark results from Titan running GPFS.
Titan GPFS performance
Virtualization
A paper about how to use VMware and Sun Grid Engine in HPC.
Paper-VM-nodes.
Benchmarking:
All reports and spreadsheets are provided as is; support for clarifications
and questions is limited.
Spreadsheets are in OpenDocument / OpenOffice format or PDF. My workstation
uses a Norwegian locale, so a ',' may appear as the decimal separator in
some spreadsheets.
GPU benchmarks
Using graphics cards for computation is of interest as the cards can yield high performance.
I have written a report about measured performance.
In selected cases (linear algebra) very high performance can be attained.
At present only single precision (32-bit) data types are supported.
MPI benchmarks
Key performance parameters for an MPI implementation are latency and bandwidth,
while parameters such as collective operation performance and the ability
to overlap communication and computation also affect the performance
perceived by end-user applications. Not only is interconnect performance
important; with the ever-growing number of cores per socket, intra-node
communication also becomes more important. This is normally done through
some kind of shared memory: the L1, L2, (L3) cache or main memory.
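As a concrete illustration of the latency and bandwidth measurement, a minimal
ping-pong between two ranks could look like the sketch below. The message size and
repeat count are arbitrary choices; dedicated benchmarks such as those in HPCC do
this far more carefully.

    /* Ping-pong sketch: rank 0 and rank 1 bounce a message back and forth
     * and rank 0 reports transfer time and bandwidth.  Latency is normally
     * taken from very small messages; 1 MiB is used here only as an example. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        const int nbytes = 1 << 20;          /* 1 MiB message */
        const int reps   = 100;
        char *buf = malloc(nbytes);
        int rank;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        if (rank == 0) {
            double rtt = (t1 - t0) / reps;               /* one round trip */
            printf("half round trip: %.2f us, bandwidth: %.1f MB/s\n",
                   rtt / 2 * 1e6, 2.0 * nbytes / rtt / 1e6);
        }
        free(buf);
        MPI_Finalize();
        return 0;
    }

Run with two ranks on the same node to see the shared memory path, or on two
different nodes to see the interconnect.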
Calculating benchmarks, HPL
High Performance Linpack is run
on a range of different nodes using MPI over shared memory. This is
usually a cluster test, but single node performance is important to
assess the individual compute nodes. Today's nodes have as many cores as
small clusters had some years ago. Some
results are available.
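For sizing a single node run, theoretical peak is cores x clock x flops per cycle,
and a common rule of thumb is to let the double precision (8-byte) matrix fill
roughly 80% of memory, i.e. N close to sqrt(0.8 x memory / 8). The sketch below
illustrates this; the node figures in it (8 cores, 2.66 GHz, 4 flops/cycle, 16 GiB)
are assumed example values, not numbers from the reports.

    /* Back-of-the-envelope helper for sizing a single-node HPL run.
     * Compile with e.g. gcc hpl_size.c -lm */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        int    cores       = 8;
        double ghz         = 2.66;
        double flops_cycle = 4.0;                  /* example: 2 mul + 2 add */
        double mem_bytes   = 16.0 * 1024 * 1024 * 1024;

        double rpeak = cores * ghz * flops_cycle;  /* GFLOP/s */
        long   n     = (long)sqrt(0.80 * mem_bytes / 8.0);

        printf("Rpeak ~ %.1f GFLOP/s, suggested HPL problem size N ~ %ld\n",
               rpeak, n);
        return 0;
    }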
Calculating benchmarks, HPCC
The HPC
Challenge benchmark (HPCC) is run on Titan with different libraries and MPI implementations. Some results are available. A test of Intel quad-
core versus AMD quad-core running MPI over shared memory has now been added.
Calculating benchmarks, Euroben-v 5.0
The EuroBen Benchmark provides
benchmark programs for scientific and technical computing to assess
the performance of computers in these fields. All programs are
written in Fortran 90/95. Some results are
available.
Virtual compute nodes
Performance testing, using benchmarks and applications, of virtual
compute nodes running under VMware Server. Virtual compute nodes
can be moved around among the physical nodes in a cluster. While there is
a performance degradation, the possibility to suspend, migrate and
resume jobs in a low-priority/free queue outweighs the loss of
performance: 70-90% of native performance is far better than waiting in the ever
growing queue.
Local or parallel file system scratch disk?
What is the best option for scratch disk during a run: local disk, or
the parallel file system served by the cluster of file servers? What
performance will the parallel file system yield at different record
sizes, and how is the problem of random reads addressed?
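To illustrate the record size question, a small sketch that writes a fixed amount
of data with varying record sizes to a chosen scratch directory is shown below.
The path, sizes and use of buffered I/O are assumptions; real measurements need
direct I/O or syncing (as IOZone provides) so the page cache does not hide the
storage behind it.

    /* Write 1 GiB with different record sizes to a scratch directory and
     * report the apparent bandwidth per record size. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    int main(int argc, char *argv[])
    {
        const char *dir  = (argc > 1) ? argv[1] : "/tmp";  /* scratch dir */
        const long total = 1L << 30;                       /* 1 GiB per test */
        long recs[] = { 4*1024, 64*1024, 1024*1024, 8*1024*1024 };
        char path[4096];

        snprintf(path, sizeof(path), "%s/scratchtest.dat", dir);

        for (size_t r = 0; r < sizeof(recs)/sizeof(recs[0]); r++) {
            char *buf = malloc(recs[r]);
            memset(buf, 'x', recs[r]);

            struct timespec t0, t1;
            clock_gettime(CLOCK_MONOTONIC, &t0);
            FILE *f = fopen(path, "w");
            for (long written = 0; written < total; written += recs[r])
                fwrite(buf, 1, recs[r], f);
            fclose(f);                 /* page cache still applies, see above */
            clock_gettime(CLOCK_MONOTONIC, &t1);

            double secs = (t1.tv_sec - t0.tv_sec)
                        + (t1.tv_nsec - t0.tv_nsec) / 1e9;
            printf("record %7ld B: %.1f MB/s\n", recs[r], total / secs / 1e6);
            free(buf);
        }
        remove(path);
        return 0;
    }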
Parallel file system and MPI-IO
How to get maximum bandwidth from a parallel file system using
MPI-IO. Results from the IOR benchmark using GPFS. Notice the good
scaling for random read using MPI-IO.
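A minimal sketch of a collective MPI-IO write, where each rank writes one
contiguous block of a shared file, is shown below. The file name and block size
are arbitrary example values; IOR is the proper tool for the reported numbers.

    /* Collective MPI-IO write sketch: the _all call lets the MPI library
     * and GPFS combine and align the requests across ranks, which is
     * where the aggregate bandwidth comes from. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        const MPI_Offset blk = 16 * 1024 * 1024;   /* 16 MiB per rank */
        int rank, size;
        char *buf;
        MPI_File fh;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        buf = malloc(blk);

        MPI_File_open(MPI_COMM_WORLD, "testfile.mpiio",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        /* Each rank writes at its own offset in the shared file. */
        MPI_File_write_at_all(fh, rank * blk, buf, (int)blk, MPI_BYTE,
                              MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("wrote %lld MiB in %.2f s: %.1f MB/s aggregate\n",
                   (long long)(size * blk) >> 20, t1 - t0,
                   (double)size * blk / (t1 - t0) / 1e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }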
IO performance with IOZone
IOZone benchmarking on different storage solutions.
IO performance - report
Report on IOZone benchmarking on different storage solutions.
This is a working draft and not a published report.
Memory bandwidth using Stream
Stream memory bandwidth benchmarking on different servers.
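For reference, the core of the Stream triad operation amounts to the sketch below.
The array size is an assumed value and must be much larger than the caches for the
result to reflect main memory bandwidth; the official STREAM code should be used
for any reported numbers.

    /* Minimal triad kernel in the spirit of STREAM: a = b + s*c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 20000000L                 /* ~480 MB across the three arrays */

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        double scalar = 3.0;

        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < N; i++)
            a[i] = b[i] + scalar * c[i];
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        /* three 8-byte arrays touched per iteration: 2 reads + 1 write */
        printf("triad: %.1f MB/s\n", 3.0 * 8.0 * N / secs / 1e6);
        printf("a[0] = %f\n", a[0]);    /* keep the compiler honest */

        free(a); free(b); free(c);
        return 0;
    }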
Benchp1 benchmark in historical perspective
Some historically interesting benchp1
runs. The storage benchmarks are unreliable, as ./ might be anything and
is not known for the different runs. The CPU and memory benchmarks are
ok.
Private
Visit my home page