CmakeCrayXt3: Difference between revisions

From KitwarePublic
Revision as of 20:15, 20 September 2007

This page is a work in progress!

How to build software for Cray Xt3 / Catamount

A Cray Xt3 consists of two different types of nodes: the front-end nodes, which are more or less regular AMD64 Linux machines, and the actual compute nodes, which have the same processors but run Catamount as their operating system. Building software for the front-end nodes is no different from building software on any other Linux system. To build software for the compute nodes, however, you need to cross-compile. For C/C++ there are typically two toolchains you can use for the compute nodes:

  • the GNU gcc toolchain
  • the Portland Group PGI toolchain

In order to use them with CMake you have to write a toolchain file.

There are wrapper scripts provided to call the actual compilers. There is a cc script for the C compiler, and a CC script for the C++ compiler. Depending on the current "environment" they call either the GNU compiler or the PGI compiler. By default the PGI compiler is used. To switch to gcc, use:

$ module switch PrgEnv-pgi PrgEnv-gnu

To switch back, exchange the last two parameters. These scripts add several include directories, compile flags, link directories and libraries to the compile and link calls.

CMake currently supports these wrapper scripts. There is a pitfall, however: since the same compiler executable ("cc") can be two different compilers depending on the currently loaded environment, it can happen that part of a project is compiled with PGI, the user then switches to gcc, and when the build continues, gcc builds the rest of the files.

The C++ libraries of PGI and g++ do not seem to be ABI compatible. Additionally, it seems that -lm may not be mixed: if one library is compiled with PGI and uses the math library, and another part is compiled with gcc and also uses the math library, there may be linker problems. So it is probably better to decide on one of the two toolchains and use it for everything.
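
One way to catch a mismatch early is to record, per build tree, which compiler family it is meant for and compare that against what CMake detected. The following is a minimal sketch and not part of the setup described on this page: EXPECTED_COMPILER_ID is a hypothetical cache variable introduced here for illustration, while CMAKE_C_COMPILER_ID is the standard CMake variable holding the detected family ("GNU" or "PGI"). CMake detects the compiler when a build tree is first configured, so this only guards against configuring a tree with the wrong environment loaded. It cannot notice a module switch between two make runs.

```cmake
# Place this in CMakeLists.txt after the compiler has been detected,
# i.e. after the project() call if there is one.

# Hypothetical cache variable (an assumption of this sketch): the compiler
# family this build tree is supposed to use.
set(EXPECTED_COMPILER_ID "GNU" CACHE STRING
    "Compiler family this build tree is meant for (GNU or PGI)")

# CMAKE_C_COMPILER_ID is filled in by CMake's compiler detection.
if(NOT "${CMAKE_C_COMPILER_ID}" STREQUAL "${EXPECTED_COMPILER_ID}")
  message(FATAL_ERROR
    "This build tree expects the ${EXPECTED_COMPILER_ID} compiler, but the "
    "cc wrapper currently resolves to ${CMAKE_C_COMPILER_ID}. "
    "Load the matching PrgEnv module or use a separate build directory.")
endif()
```

A build tree intended for PGI would then be configured with -DEXPECTED_COMPILER_ID=PGI. Keeping one build directory per toolchain (e.g. build-gcc and build-pgi) makes it harder to mix object files by accident.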


Writing the CMake toolchain file

For CMake to be able to cross-compile software, you have to write a toolchain file, which gives CMake some information about the toolchain.

The toolchain file for Cray Xt3/Catamount could look like the following:

# the name of the target operating system
SET(CMAKE_SYSTEM_NAME Catamount)

# set the compiler
set(CMAKE_C_COMPILER cc -target=catamount)
set(CMAKE_CXX_COMPILER CC -target=catamount)

# set the search path for the environment coming with the compiler
# and a directory where you can install your own compiled software
set(CMAKE_FIND_ROOT_PATH
    /opt/xt-pe/default
    /opt/xt-mpt/default/mpich2-64/GP2
    /home/alex/cray-install
  )

# adjust the default behaviour of the FIND_XXX() commands:
# search headers and libraries in the target environment, search
# programs in the host environment
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)

Without any further settings, this works for both PGI and gcc. Save this file as Toolchain-Catamount.cmake in a location where you keep all your toolchain files, e.g. $HOME/Toolchains/.

As you can see, CMAKE_FIND_ROOT_PATH is set to two directories that come with the toolchain and contain its headers and libraries, plus /home/alex/cray-install/. This last directory is intended to hold other libraries you compile for the compute nodes; they should be installed under this install prefix. This way the FIND_XXX() commands in CMake will find both the headers and libraries coming with the toolchain and the additional libraries you have built for this platform.
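
To illustrate how the search path is used, here is a hedged sketch of a CMakeLists.txt fragment that looks for a library previously installed under /home/alex/cray-install. The names mylib and mylib.h are hypothetical and stand for any library you have cross-compiled and installed with that prefix, with the header in include/ and the library in lib/. Because the LIBRARY and INCLUDE modes are set to ONLY in the toolchain file, the searches are restricted to directories below the CMAKE_FIND_ROOT_PATH entries, so libraries of the front-end Linux system are not picked up.

```cmake
# "mylib" and "mylib.h" are hypothetical names used for illustration only.
# Both commands search below the CMAKE_FIND_ROOT_PATH directories.
find_path(MYLIB_INCLUDE_DIR NAMES mylib.h)
find_library(MYLIB_LIBRARY NAMES mylib)

if(MYLIB_INCLUDE_DIR AND MYLIB_LIBRARY)
  include_directories(${MYLIB_INCLUDE_DIR})
  add_executable(hello main.c)
  target_link_libraries(hello ${MYLIB_LIBRARY})
else()
  message(FATAL_ERROR "mylib was not found below CMAKE_FIND_ROOT_PATH")
endif()
```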

Building the software for Catamount with the GNU toolchain

Let's say you have the classic hello world program with a CMake-based build system and now want to build it for your supercomputer. main.c:

#include <stdio.h>

int main()
{
   printf("Hello world\n");
   return 0;
}

CMakeLists.txt:

ADD_EXECUTABLE(hello main.c)

Then run CMake on it to generate the build files. The important point is that you tell it to use the toolchain file you just wrote:

~/src/helloworld/ $ module switch PrgEnv-pgi PrgEnv-gnu
~/src/helloworld/ $ mkdir build-gcc
~/src/helloworld/ $ cd build-gcc
~/src/helloworld/build-gcc/ $ cmake -DCMAKE_TOOLCHAIN_FILE=~/Toolchains/Toolchain-Catamount.cmake -DCMAKE_INSTALL_PREFIX=/home/alex/cray-install .. 
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
...
-- Configuring done
-- Generating done
-- Build files have been written to: /home/alex/src/helloworld/build-gcc
~/src/helloworld/build-gcc/ $ make
Scanning dependencies of target hello
[100%] Building C object CMakeFiles/hello.dir/main.o
Linking C executable hello
[100%] Built target hello

That's all. It doesn't matter whether it's just a "hello world" or some complex piece of software; the only difference is the use of the toolchain file. If the software has all the required configure checks, it should build with this toolchain as well.

Unsupported features of Catamount

Among other things, Catamount doesn't support the following features:

  • shared libraries
  • multithreading
  • IP server sockets
  • IP client sockets
  • getpwnam() and other functions which might use nss_switch
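
For the first item, a project can make sure it never requests shared libraries on such a platform. The following is a small sketch under the assumption that the project builds a library from a hypothetical source file mylib.c. TARGET_SUPPORTS_SHARED_LIBS is a global CMake property that platform files set to indicate whether shared libraries are available, and BUILD_SHARED_LIBS is the standard variable controlling the default library type.

```cmake
# Query whether the target platform supports shared libraries at all.
get_property(shared_supported GLOBAL PROPERTY TARGET_SUPPORTS_SHARED_LIBS)

if(NOT shared_supported)
  # Force static libraries, even if the user asked for shared ones.
  set(BUILD_SHARED_LIBS OFF)
endif()

# "mylib" and "mylib.c" are hypothetical names. No explicit STATIC/SHARED
# keyword is given, so the library type follows BUILD_SHARED_LIBS.
add_library(mylib mylib.c)
```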

Some functions, e.g. times(), are present in libc, but the linker warns:

: warning: warning: times is not implemented and will always fail

This is the case for several functions, not only times(). So while the linker can resolve the symbols, the functions will fail at runtime. It may be a good idea to use -Wl,--fatal-warnings, so that the linker treats these warnings as errors and you, as the developer, actually notice when something will not work.
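
With CMake, the flag can be added to the linker flags used for executables. This is a minimal sketch, assuming that the active compiler wrapper accepts the -Wl,<option> syntax and passes the option on to a GNU-compatible linker; it has not been verified for both toolchains.

```cmake
# Append the flag to whatever linker flags are already set, so that
# linker warnings such as "times is not implemented" abort the link step.
# Assumption: the cc/CC wrapper forwards -Wl,... to a GNU-compatible ld.
set(CMAKE_EXE_LINKER_FLAGS
    "${CMAKE_EXE_LINKER_FLAGS} -Wl,--fatal-warnings")
```

The same effect can be had without touching CMakeLists.txt by passing -DCMAKE_EXE_LINKER_FLAGS=-Wl,--fatal-warnings on the cmake command line when the build tree is first configured.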