Speedup and efficiency shell calculator

The following script calculates the speedup and efficiency of a parallel code compared to its serial version. The process is straightforward: speedup is the serial execution time divided by the parallel execution time, and efficiency is the speedup divided by the number of processors. The script can be used on its own or as part of another script to automate generating the required results. It accepts three arguments: 1) serial execution time, 2) parallel execution time, 3) number of processors.

./SEcalc.sh <serial> <parallel> <procs>

Script:

#!/bin/bash
############################################################################
# Copyright (C) 2011  Panagiotis Kritikakos <panoskrt@gmail.com>           #
#                                                                          #
#    This program is free software: you can redistribute it and/or modify  #
#    it under the terms of the GNU General Public License as published by  #
#    the Free Software Foundation, either version 3 of the License, or     #
#    (at your option) any later version.                                   #
#                                                                          #
#    This program is distributed in the hope that it will be useful,       #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of        #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         #
#    GNU General Public License for more details.                          #
#                                                                          #
#    You should have received a copy of the GNU General Public License     #
#    along with this program.  If not, see <http://www.gnu.org/licenses/>. #
############################################################################

if [ "$#" -eq 3 ]; then
   runtime1proc=$1
   runtimeNproc=$2
   totalprocs=$3
   # bc -l loads the math library, which sets the scale to 20 decimal places
   speedup=$(echo "${runtime1proc}/${runtimeNproc}" | bc -l)
   efficiency=$(echo "${speedup}/${totalprocs}" | bc -l)

   # Pass values as printf arguments so they are never parsed as format strings
   printf "\n Total processors: %s\n\n" "${totalprocs}"
   printf " Runtime for serial code: %s\n Runtime for parallel code: %s\n\n" \
      "${runtime1proc}" "${runtimeNproc}"
   printf " Speedup: %s\n Efficiency: %s\n\n" "${speedup}" "${efficiency}"
else
   printf "\n Usage: SEcalc.sh <serial> <parallel> <procs>\n\n"
   printf " SEcalc.sh 0.350 0.494 2\n\n"
fi
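
When the calculation is part of a larger script, it can be handy to have it as a function that returns both numbers on one line. The following is only a sketch under a few assumptions: the function name speedup_efficiency is made up, awk is swapped in for bc so the sketch does not depend on bc being installed, the results are rounded to four decimal places, and the timings in the loop are invented for illustration.

```shell
#!/bin/bash
# speedup_efficiency SERIAL PARALLEL PROCS
# Prints "<speedup> <efficiency>", each rounded to four decimal places.
# Returns non-zero if the parallel time or processor count is not positive.
speedup_efficiency() {
    awk -v s="$1" -v p="$2" -v n="$3" 'BEGIN {
        if (p <= 0 || n <= 0) exit 1    # avoid dividing by zero
        speedup = s / p
        printf "%.4f %.4f\n", speedup, speedup / n
    }'
}

# Hypothetical timings: the serial runtime, then "procs:runtime" pairs.
serial=120.0
for run in 2:63.1 4:33.4 8:19.2; do
    procs=${run%%:*}      # text before the colon
    runtime=${run#*:}     # text after the colon
    printf '%s procs: %s\n' "$procs" "$(speedup_efficiency "$serial" "$runtime" "$procs")"
done
```

Each output line holds the processor count followed by the speedup and the efficiency, which is easy to feed into a plotting tool.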

Running Fortran on ARM

In brief, there is no official Fortran compiler available for the ARM architecture. The easiest way to get Fortran code (specifically Fortran 77) running on ARM, or on any other architecture that lacks a Fortran compiler, is to convert the Fortran code to C. To do that efficiently, we can use Netlib’s f2c command line tool, available for Linux, Unix and Windows.

The steps that need to be followed are:

1) Compile the library – source at http://www.netlib.org/f2c/libf2c.zip
Rename makefile.u to makefile and run make. Copy the generated libf2c.a to /usr/lib and the f2c.h header to /usr/include.
2) Compile the binary – source at http://www.netlib.org/f2c/src/
Rename makefile.u to makefile and run make. Copy the f2c binary to /usr/local/bin.

Supposing we have the file foo.f, we can generate the C version by running f2c foo.f, which gives us foo.c. That can then be compiled with ‘gcc foo.c -o foo -lf2c’.

To automate this process a little, the following script can be used. It accepts a single argument, the Fortran source file.

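#!/bin/bash
# Takes one argument: the Fortran source file to convert and compile
fortranFile=$1
# Strip the extension: foo.f -> foo
fileName=$(echo "$fortranFile" | sed 's/\(.*\)\..*/\1/')
echo "$fileName"
# Stop if the conversion fails, so gcc is not run on a stale or missing .c file
f2c "$fortranFile" || exit 1
gcc "${fileName}.c" -o "$fileName" -lf2c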

UPDATE: There is a GCC Fortran compiler for ARM on Fedora and Debian, and I presume on other distros as well. The issue now is that these distributions are compiled for ARMv5, while the latest ARM processors (Cortex-A8, A9, A15) implement the ARMv7 architecture. The compilers, and the OS in general, are therefore unable to make use of the additional instruction sets and the FPU. Other distros, such as Slackware, are compiled for an even older architecture, ARMv4.

Virtual Cluster liveDVD milestone 3 released

The Virtual Cluster liveDVD milestone 3, an activity of the HPC-Europa2 project, is available from today for download.

“The “HPC-Europa2 Virtual Cluster Live DVD” is a SliTaz based Live Linux DVD. It boots and runs completely from DVD providing recent tools, compilers and libraries for the development of parallel applications. Further on the DVD includes training material, videos from past virtual surgeries and reports from past HPC-Europa visitors.”

NAS Parallel Benchmarks tips

The NAS Parallel Benchmarks are one of the most commonly used sets of benchmarks for HPC systems. Description and download can be found here.

One of the problems I came across when trying to compile some of the Fortran codes with gfortran was the following:

randi8.f: In function `randlc':
randi8.f:23:
        data i246m1/X'00003FFFFFFFFFFF'/
Integer at (^) too large

This happens because the integer constant used by the random number generator is too large for the compiler to accept. To get it working, you can alter the configuration file of the benchmarks and set

RAND   = randi8_safe

from

RAND   = randi8

randi8:
     1. Uses integer*8 arithmetic. Compiler must support integer*8
     2. Uses the Fortran 90 IAND intrinsic. Compiler must support

randi8_safe:
     1. Uses integer*8 arithmetic
     2. Uses the Fortran 90 IBITS intrinsic.
     3. Does not make any assumptions about overflow. Should always
        work correctly if compiler supports integer*8 and IBITS.

To compile the MPI code successfully, you’ll need to alter the configuration file and set the right compilers: ‘mpicc’ for C and ‘f77’ for Fortran. Along with that, you might need to add the mpi, mpl and pthread libraries to the *MPI_LIB entries.
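
These configuration edits can also be scripted. The following is a minimal sketch, not part of the benchmarks themselves: the helper name set_make_var is made up, and the file path config/make.def and the variable names MPICC, MPIF77, FMPI_LIB and CMPI_LIB are assumptions based on the make.def template that I remember shipping with the MPI version, so check them against your copy. The helper only rewrites a line that already exists; it does not append missing entries.

```shell
#!/bin/bash
# set_make_var FILE NAME VALUE
# Rewrites the line "NAME = ..." in FILE so that it reads "NAME = VALUE".
# All other lines are copied through unchanged.
set_make_var() {
    local file=$1 name=$2 value=$3 tmp
    tmp=$(mktemp) || return 1
    awk -v name="$name" -v value="$value" '
        # NAME at the start of the line, optional blanks, then "="
        $0 ~ "^" name "[ \t]*=" { print name " = " value; next }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    local status=$?
    rm -f "$tmp"
    return "$status"
}

# Example calls (assumed path and variable names, see above):
# set_make_var config/make.def RAND     randi8_safe
# set_make_var config/make.def MPICC    mpicc
# set_make_var config/make.def MPIF77   f77
# set_make_var config/make.def FMPI_LIB "-lmpi -lmpl -lpthread"
# set_make_var config/make.def CMPI_LIB "-lmpi -lmpl -lpthread"
```

Writing through a temporary file and copying it back keeps the original file’s permissions and avoids truncating the file if awk fails.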

GPU programming at a glance

On my MacBook I have an nVIDIA GeForce 8600M GT, which is CUDA enabled, something I never bothered checking until very recently. nVIDIA provides online the required driver, the SDK and additional “CUDA Developer” resources, as they call them, with lots of sample files to test the hardware of your system as well as actual code, including some parallel samples.

The CUDA toolkit seems to provide all you need to start with:

  • C/C++ compiler
  • Visual Profiler
  • GPU-accelerated BLAS library
  • GPU-accelerated FFT library
  • GPU-accelerated Sparse Matrix library
  • GPU-accelerated RNG library
  • Additional tools and documentation

It also includes OpenCL samples to play with. However, the OpenCL driver needs to be installed first. There’s a pre-release version, and in order to download it you need to register with nVIDIA. They have also published a book, “CUDA by Example”, which is not free apart from some excerpts. Nevertheless, the sample codes from the book are free to download.

AMD / ATI also have their answer to CUDA, “ATI Stream”. From what I gathered, it seems to support only OpenCL. I don’t have a Stream-supported ATI card at the moment, so I couldn’t try that one.

To close, there’s an interesting presentation that covers the basics of GPUs and how to program them (CUDA based): Programming and optimization of applications for multiple GPU

4th IC-SCCE

Last week I was in Athens for the 4th International Conference from Scientific Computing to Computational Engineering. There were many interesting talks from a wide range of areas. My talk was about “HPC Applications Performance on Virtual Clusters”. The main outcome, from my perspective, is that we need to investigate GPU virtualisation. More and more scientists/researchers want to exploit such systems, and in the near future we’ll need to deploy virtualised GPU systems in the same way we do with CPUs.

My paper
My presentation

lcfg-xen-1.0.10

* Fixing bug #294. New resource named ‘timer’ is defined for setting the ‘timer_mode’ setting in the configuration file of a guest domain. The default value of this resource is set to 4.
* Fixing bug #289. Raw LVM partitions now use the xvd[X][Y] string, where X is the device letter and Y is the number in the sequence. In order to make use of this, the disk type must be defined as “lvm”.

RPM
Source RPM
Schema

lcfg-libvirt

Over the last few months, whenever I got free time from other projects, I have been working on a new LCFG component, lcfg-libvirt. The main idea behind lcfg-libvirt is not just to manage libVirt itself, but to use libVirt via the component to manage multiple virtualisation platforms without the need for multiple components.

In the first stage, the goal was to generalise the resources that could be used by both Xen and KVM guests, as well as by other candidate platforms supported by libVirt.

The second stage was to migrate all the existing lcfg-xen functionality into the component, using the new resources, and to manage the Xen guests via libVirt.

In the third stage, KVM support was added at the same level as the pre-existing Xen support. Network management via virsh was implemented at this stage as well. To get networking sorted, I had to create a new patch for the lcfg-network component to support bridge interfaces at the OS level.

The man page is still missing. The lcfg-xen(8) man page will be used as the basis for it.

The functionality so far can be summarised as below:

– Support for hardware-virtualised Xen guests (migrated from lcfg-xen).
– Support for paravirtualised Xen guests (migrated from lcfg-xen).
– Support for Xen-specific networking (migrated from lcfg-xen).
– Support for KVM guests for both Intel and AMD processors.
– Support for KVM-specific networking.
– Guest cloning for both Xen and KVM guests (migrated from lcfg-xen).
– Support for NAT, Bridge and Routed interfaces for both Xen and KVM.
– Use of virsh to manage guests and generic networking.

KVM guest example:

!libvirt.hosttype               mSET(kvm)

!libvirt.vms    mADD(pe2900x1)

!libvirt.name_pe2900x1                  mSET(pe2900x1)
!libvirt.type_pe2900x1                  mSET(hvm)
!libvirt.uuid_pe2900x1                  mSET(56bcea35-a598-4ce8-97f1-02cba34e3451)
!libvirt.disks_pe2900x1                 mADD(root test)
!libvirt.diskname_pe2900x1_root         mSET(pe2900x1)
!libvirt.disksize_pe2900x1_root         mSET(32)
!libvirt.diskpath_pe2900x1_root         mSET(/guests)
!libvirt.diskname_pe2900x1_test         mSET(test)
!libvirt.disksize_pe2900x1_test         mSET(10)
!libvirt.diskpath_pe2900x1_test         mSET(/guests)
!libvirt.boot_pe2900x1                  mSET(no)
!libvirt.opts_pe2900x1                  mADD(vnc monitor)
!libvirt.optvalue_pe2900x1_vnc          mSET(1)
!libvirt.optvalue_pe2900x1_monitor      mSET(pty)
!libvirt.nethost_pe2900x1               mADD(vif1 vif2)
!libvirt.hostmac_pe2900x1_vif1          mSET(12:28:aa:02:1e:4d)
!libvirt.bridge_pe2900x1_vif1           mSET(br0)
!libvirt.netmode_pe2900x1_vif1          mSET(bridge)
!libvirt.hostmac_pe2900x1_vif2          mSET(23:12:cb:af:1a:cf)
!libvirt.bridge_pe2900x1_vif2           mSET(default)
!libvirt.netmode_pe2900x1_vif2          mSET(network)

Xen guest example:

!libvirt.hosttype               mSET(xen)

!libvirt.vms    mADD(pe2900x1)

!libvirt.name_pe2900x1                  mSET(pe2900x1)
!libvirt.type_pe2900x1                  mSET(hvm)
!libvirt.uuid_pe2900x1                  mSET(56bcea35-a598-4ce8-89f87-02cba34e7205)
!libvirt.disks_pe2900x1                 mADD(root test)
!libvirt.diskname_pe2900x1_root         mSET(pe2900x1)
!libvirt.disksize_pe2900x1_root         mSET(32)
!libvirt.diskpath_pe2900x1_root         mSET(/guests)
!libvirt.diskname_pe2900x1_test         mSET(test)
!libvirt.disksize_pe2900x1_test         mSET(10)
!libvirt.diskpath_pe2900x1_test         mSET(/guests)
!libvirt.boot_pe2900x1                  mSET(no)
!libvirt.nethost_pe2900x1               mADD(vif1)
!libvirt.hostmac_pe2900x1_vif1          mSET(12:28:ad:12:ac:2a)
!libvirt.bridge_pe2900x1_vif1           mSET(xenbr0)
!libvirt.script_pe2900x1_vif1           mSET(vif-bridge)
!libvirt.netmode_pe2900x1_vif1          mSET(bridge)

Network configuration example:

!libvirt.networking             mADD(routed)
!libvirt.nettype_routed         mSET(interface)
!libvirt.netname_routed         mSET(routed)
!libvirt.netuuid_routed         mSET(56bcea35-a598-4ce8-97f1-02acd24s6985)
!libvirt.bridgename_routed      mSET(virbr9)
!libvirt.mode_routed            mSET(route)
!libvirt.modedev_routed         mSET(eth0)
!libvirt.ipaddr_routed          mSET(192.168.1.0)
!libvirt.netmask_routed         mSET(255.255.255.0)
!libvirt.dhcpstart_routed       mSET(192.168.1.1)
!libvirt.dhcpend_routed         mSET(192.168.1.254)
!libvirt.nethost_routed         mSET(host1 host2)
!libvirt.hostname_routed_host1  mSET(test)
!libvirt.hostmac_routed_host1   mSET(00:1E:C9:53:29:AD)
!libvirt.hostip_routed_host1    mSET(1.1.1.1)
!libvirt.hostname_routed_host2  mSET(test2)
!libvirt.hostmac_routed_host2   mSET(00:1F:B9:65:12:AB)
!libvirt.hostip_routed_host2    mSET(2.2.2.2)
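
The uuid and netuuid resources are meant to hold UUIDs, whose canonical form is five groups of 8, 4, 4, 4 and 12 hexadecimal digits. A typo in one of the groups is easy to miss, so a quick check before adding a value to a profile can save some debugging. This is only a sketch: the function name is_uuid is made up, and I am assuming here that the component passes the value to libVirt as it is.

```shell
#!/bin/bash
# is_uuid VALUE: succeed if VALUE is a canonical 8-4-4-4-12 hexadecimal UUID.
is_uuid() {
    local h='[0-9a-fA-F]'
    # Build the pattern in a variable so bash treats it as a regular expression
    local re="^${h}{8}-${h}{4}-${h}{4}-${h}{4}-${h}{12}\$"
    [[ $1 =~ $re ]]
}

# Example: report whether a value is well formed
value=56bcea35-a598-4ce8-97f1-02cba34e3451
if is_uuid "$value"; then
    echo "$value looks like a valid UUID"
else
    echo "$value is not a valid UUID"
fi
```

Where the uuidgen command is available, it can be used to generate a fresh value instead of editing an existing one by hand.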

Source code available on LCFG SVN. You’ll need an Informatics iFriend account to see the contents. RPMs should follow sooner or later.

lcfg-network patch

The existing version of the lcfg-network component, which manages networking at the OS level (in brief, the scripts under /etc/sysconfig/network-scripts/ as well as /etc/sysconfig/network and /etc/hosts), lacks support for bridge interfaces. The following patch adds that support.

network.cin.patch (script)
network.def.patch (resources)

Two new resources have been defined:

type
This resource can be used to define the type of the interface.  For example,
to define a bridge interface, "type"_<interface> should be set to "Bridge".

bridge
This resource can be used by the main interface in case it is bridged.  For instance,
if eth0 is bridged over br0, then "bridge"_<interface> should be set to "br0".

That would look like this:

!network.interfaces     mADD(eth0 br0)
!network.hwaddr_eth0    mSET(00:12:11:22:26:2B)
!network.bridge_eth0    mSET(br0)
!network.ipaddr_br0     mSET(DHCP)
!network.type_br0       mSET(Bridge)

The official RPM hasn’t been generated yet.

Without the patch, the workaround for configuring a bridge interface is to use the file component. You’ll need something like the following in your machine’s profile:

!network.interfaces     mREMOVE(eth0)

!file.files             mADD(eth0)
file.file_eth0          /etc/sysconfig/network-scripts/ifcfg-eth0
file.type_eth0          literal
file.mode_eth0          0644
!file.tmpl_eth0         mSET(DEVICE=eth0\nHWADDR=XX:XX:XX:XX:XX:XX\nONBOOT=yes\nBRIDGE=br0)

!file.files             mADD(br0)
file.file_br0           /etc/sysconfig/network-scripts/ifcfg-br0
file.type_br0           literal
file.mode_br0           0644
!file.tmpl_br0          mSET(DEVICE=br0\nTYPE=Bridge\nBOOTPROTO=dhcp\nONBOOT=yes)
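
To see what the two files should end up containing, the templates above can be expanded by hand in a shell. This is just a sketch that assumes the file component turns each \n in a literal template into a line break, as the workaround implies; the helper name render_tmpl is made up.

```shell
#!/bin/bash
# render_tmpl TEMPLATE: print TEMPLATE with each "\n" escape turned into a line break.
render_tmpl() {
    # %b makes printf interpret backslash escapes in the argument
    printf '%b\n' "$1"
}

echo "# ifcfg-eth0"
render_tmpl 'DEVICE=eth0\nHWADDR=XX:XX:XX:XX:XX:XX\nONBOOT=yes\nBRIDGE=br0'
echo "# ifcfg-br0"
render_tmpl 'DEVICE=br0\nTYPE=Bridge\nBOOTPROTO=dhcp\nONBOOT=yes'
```

Comparing this output with the files generated on the machine is a quick way to confirm that the profile was applied as intended.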