DANCER
Section: ICL cluster users' guide (7)
Updated: 2016-09-29
NAME
dancer - Introduction to the Dancer cluster, ICL, University of Tennessee
DESCRIPTION
This manual page describes the features and architecture of the Dancer cluster at the University of Tennessee.
The Dancer cluster is a small InfiniBand cluster administered by the DisCo team (mostly Aurelien). For help, contact ICL support at icl-help@icl.utk.edu.
ACCOUNTS
To obtain access to the Dancer cluster, send an email to icl-help@icl.utk.edu. You will need to provide a user name (the same as your ICL account, if you have one, is preferred) and an SSH public key. For security reasons, it is recommended that you create a new public-private key pair.
- To create an SSH key pair, use the following command:
    ssh-keygen -o -f ~/.ssh/dancer
- Once you receive confirmation that your account has been created, log in using the following command:
    slogin myname@dancer.icl.utk.edu -i ~/.ssh/dancer
Password authentication is not possible on Dancer, and repeated attempts to log in with the wrong credentials will get your IP banned, so beware. There are various ways to automate the selection of the key (see ssh_config(5)) or to log in from multiple machines without putting your private key at risk by copying it to all of them (see ssh-agent(1)). See also the SSH_CONFIG EXAMPLE below.
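As a minimal sketch of the ssh-agent(1) route (assuming the key was created as ~/.ssh/dancer above), load the key into an agent once and let subsequent logins pick it up:
    # start an agent for this shell and load the Dancer key (the passphrase is asked once)
    eval "$(ssh-agent -s)"
    ssh-add ~/.ssh/dancer
    # later logins reuse the key held by the agent
    ssh myname@dancer.icl.utk.edu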
ETIQUETTE
The Dancer cluster is a shared resource. Please be mindful of other users and avoid being disruptive. Most of our policy is based on good will, and we trust our users' good manners. Remember that the Dancer home area is shared over NFS, and therefore only loosely secured. We advise against importing sensitive or confidential material onto this system.
Do not run compute-intensive or disk-intensive activities on the headnode. Always run your compute tasks (including serial ones) on the compute nodes. The headnode is for editing your files, compiling your programs, and launching mpirun/qsub commands. Only light-duty processing (like visualization) should take place there.
Do not write to the NFS home areas from multiple nodes at the same time. Use the scratch disks (local or shared) if you need to write large files from your compute tasks, as in the sketch below.
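For instance, a hedged sketch of running a serial task on a compute node while keeping its output on local scratch (the node name, program, and directory are placeholders):
    # run on a compute node rather than the headnode, and write to local scratch
    ssh d07 'mkdir -p /scratch/local/myname && cd /scratch/local/myname && ~/bin/myprogram > run.log 2>&1'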
Do not reserve nodes in exclusive mode, except for performance measurements. Do not reserve them for long stretches of time at once, and try to schedule exclusive activity overnight when possible.
SYSTEM DESCRIPTION
The Dancer cluster contains 32 Westmere/Nehalem cluster nodes and also hosts 6+9 Haswell GPU machines.
The cluster nodes are named d00-d31; the Haswell machines are named nd01-nd06 and arc00-arc08.
Hardware
Not all nodes are the same:
- d00-d15
    2x Westmere-EP E5606 @ 2.13 GHz. 8 cores, 24 GB RAM, InfiniBand 10G, Ethernet.
    http://ark.intel.com/products/52583/Intel-Xeon-Processor-E5606-8M-Cache-2_13-GHz-4_80-GTs-Intel-QPI
- d16-d31
    2x Gainestown E5520 @ 2.27 GHz. 8 cores, ~12 GB RAM, InfiniBand 20G, Ethernet.
    http://ark.intel.com/products/40200/Intel-Xeon-Processor-E5520-8M-Cache-2_26-GHz-5_86-GTs-Intel-QPI
- nd01-nd06
    2x Xeon E5-2650 v3 @ 2.30 GHz. 20 cores, 32 GB RAM, InfiniBand QDR 40G, Ethernet.
    http://ark.intel.com/products/81705/Intel-Xeon-Processor-E5-2650-v3-25M-Cache-2_30-GHz
- arc00-arc08
    2x Xeon E5-2650 v3 @ 2.30 GHz. 20 cores, 64 GB RAM, InfiniBand FDR 56G, Ethernet.
    http://ark.intel.com/products/81705/Intel-Xeon-Processor-E5-2650-v3-25M-Cache-2_30-GHz
Networks
The machines are connected through a shared Ethernet network. Home areas are accessible through NFS on all nodes. In addition, there are four SEPARATE InfiniBand compute networks.
NOTE: MPI jobs over IB can span at most 16 nodes, within either d00-d15 or d16-d31. Only Ethernet jobs can span d00-d31 (see the Ethernet example after the list below).
- d00-d15: InfiniBand 10G network.
    mpirun -np 16 -hostfile /opt/etc/ib10g.machinefile.ompi
- d16-d31: InfiniBand 20G network (DDR).
    mpirun -np 16 -hostfile /opt/etc/ib20g.machinefile.ompi
- nd01-nd06: InfiniBand 40G network (QDR).
    mpirun -np 6 -hostfile /opt/etc/nd.machinefile.ompi
- arc00-arc08: InfiniBand 56G network (FDR).
    mpirun -np 9 -hostfile /opt/etc/arc.machinefile.ompi
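A hedged sketch of an Ethernet-only job spanning d00-d31 (assuming Open MPI; the hostfile name and program are placeholders):
    # force the TCP BTL so the job stays on Ethernet and can span both IB islands
    mpirun -np 32 --mca btl tcp,self -hostfile ~/d00-d31.machinefile ./mybenchmark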
GPUs/Coprocessors
- Some of the d16-d31 machines form a GPU-accelerated cluster on the InfiniBand 20G network. You can start a job on the NV-C2050/70 cluster with:
    mpirun -np 12 -hostfile /opt/etc/c2050.machinefile.ompi
- Most of the nd??, arc?? machines are heavily accelerated, with NV-K40, NV-K80, AMD, or MIC coprocessors.
- See THE DANCERSH COMMAND section below to inquire about the availability and type of coprocessors on the nodes, or check a node directly as sketched after this list.
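A hedged sketch of checking a node directly (the node names are examples; nvidia-smi(1) and miccheck(1) are listed in SEE ALSO):
    # list the NVIDIA GPUs on an arc node
    ssh arc03 nvidia-smi
    # verify the MIC coprocessors on an nd node
    ssh nd02 miccheck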
SOFTWARE
There is a lot of pre-installed software, most of which can be found in /opt/.
We are transitioning toward the use of module(1) to locate optional software. Note that you can also maintain your own set of modules compiled in your home directory. If you identify missing software that would benefit most users, please contact us (especially if you want some CentOS package installed).
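A minimal sketch of typical module usage (the module name and the personal modulefiles path are assumptions, not guaranteed to exist on Dancer):
    module avail                  # list the modules installed on the system
    module load ompi              # add a package (here a hypothetical Open MPI module) to your environment
    module list                   # show what is currently loaded
    module use ~/my/modulefiles   # also search modulefiles kept in your home directory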
FILESYSTEMS
- /homes/myname/
    This is the home area, NFS-exported from the headnode to the compute nodes. It is not the same home area as your regular ICL account. You can use rsync(1) to transfer files to and from Dancer (see the sketch after this list).
- /cores/
    This is where your core files are generated when your program crashes on a compute node. This is an NFS volume (so the core files are visible from the headnode). To reduce NFS server load, core file generation is disabled by default. To re-enable core files, run the command you wish to debug with mpirun -np 2 bash -c "ulimit -c unlimited; myprogram -a myarg1 -b myarg2".
- /scratch/shared/
    This filesystem is NFS-exported from the headnode and is available on all nodes. This is a network volume, writable by everybody. Its content is cleared only when space gets scarce.
- /scratch/local/
    This filesystem is available on all nodes, including the headnode. This is a local disk, writable by everybody. Its content may be wiped without notice (although this is expected to be rare, especially on the headnode).
- /scratch/ssd/
    This filesystem is available on some nodes. This is a local SSD, writable by everybody. Beware, its content may be wiped without notice.
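A hedged sketch of moving files with rsync(1) (the paths are examples; the 'dancer' host alias assumes the SSH_CONFIG EXAMPLE below, otherwise use myname@dancer.icl.utk.edu):
    # push sources to your Dancer home area
    rsync -avz ./myproject/ dancer:myproject/
    # pull results back from the shared scratch space
    rsync -avz dancer:/scratch/shared/myname/results/ ./results/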
THE DANCERSH COMMAND
There is a nice dancersh command to quickly inquire about the nodes; try the following options:
- To obtain the list of active user processes (all users) on each node:
    dancersh -p
- To obtain the load average of the nodes:
    dancersh -l
- To list all YOUR threads on each node:
    dancersh -u
- To execute a command:
    dancersh ls /scratch/local
    This executes 'ls /scratch/local' on each node, which shows the content of the local hard drive scratch space.
- To restrict the command to a range of nodes:
    dancersh -r 05 08 uname -a
    This shows the operating system version on nodes d05, d06, d07, and d08.
- To see what GPU accelerators are available on the nodes:
    dancersh -g
EXCLUSIVE ACCESS
It is possible to make exclusive reservations for some nodes. Exclusive reservations are managed through PBS, with the qsub(1) command. Please refrain from requesting exclusive access unless you need to do performance measurements. By default, you can access all machines in shared mode, without a reservation, simply by using ssh(1) or mpirun(1).
- To start a job at 9 PM:
    qsub -a 2100
- To get 6 Haswell nodes:
    qsub -lnodes=6:haswell
- To get 6 Haswell nodes on the same InfiniBand network:
    qsub -lnodes=6:ib56
- To get 12 nodes with a CUDA board on the same IB section:
    qsub -lnodes=12:ib20:cuda
- To get nodes by name:
    qsub -lnodes=dancer02+dancer03
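These properties can be combined; as a hedged sketch using standard PBS syntax (the interactive flag and walltime value are examples, not site policy), an interactive two-hour reservation of 4 Haswell nodes on the FDR network could look like:
    qsub -I -lnodes=4:haswell:ib56,walltime=02:00:00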
GANGLIA
Ganglia is available from the ICL Ganglia dashboard at http://icl.cs.utk.edu/ganglia/?c=dancer.
It is also available directly from the Dancer headnode, but that server is firewalled and you will need to establish an SSH tunnel to access it this way (see the sketch below, or the LocalForward entry in the SSH_CONFIG EXAMPLE).
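A hedged sketch of an ad-hoc tunnel (port 8086 matches the SSH_CONFIG EXAMPLE below; the exact path on the headnode's web server is an assumption):
    # forward local port 8086 to the headnode's web server
    ssh -L 8086:localhost:80 myname@dancer.icl.utk.edu
    # then browse http://localhost:8086/ (the Ganglia pages may live under /ganglia/)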
SSH_CONFIG EXAMPLE
You can ease your access to the Dancer cluster with the following SSH tricks. Adapt and insert the following material into your .ssh/config file on the host you use to connect to the Dancer headnode.
Host dancer dancer.icl.utk.edu
    HostName dancer.icl.utk.edu
    User myname
    IdentityFile ~/.ssh/dancer
    ForwardAgent yes
    # Use a single SSH connection for all your actions (including rsync, scp, etc.)
    ControlMaster auto
    ControlPersist yes
    ControlPath /tmp/%r@%h:%p
    # Lets you debug with mpirun -xterm
    ForwardX11 yes
    ForwardX11Timeout 1w
    # Make Ganglia accessible on your machine at the URL http://localhost:8086
    LocalForward 8086 localhost:80

# Direct login to arc nodes
Host arc?? arc??.icl.utk.edu
    ProxyCommand ssh -q dancer nc %h 22
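With this configuration in place, the following hedged examples all reuse the single master connection (the arc?? block assumes your local user name matches your Dancer account, or that your key is available through the forwarded agent):
    ssh dancer                      # log in to the headnode
    scp results.tar.gz dancer:      # copy a file to your Dancer home area
    ssh arc03 nvidia-smi            # run a command on an arc node through the proxy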
KNOWN ISSUES
NFS slow to update recompiled programs/libraries
FIXED
If this issue comes back, an imperfect workaround is: make && mpirun ls >/dev/null.
CUDA on Kernel 4.5.4
Using CUDA on this kernel can leave the machine out of memory. The issue is being investigated.
X11 forwarding to nodes is flaky
You get xauth error messages when trying to mpirun -xterm, etc. This is a known issue with CentOS 6. It will be fixed when we upgrade to CentOS 7 later this year.
Using 'mpi_leave_pinned' leads to my MPI program hanging/crashing
This is normal. We have now set the default to 0 in the system-installed Open MPI. If you use your own build of Open MPI, make sure you disable this optimization unless you know what you are doing.
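A hedged sketch of disabling it explicitly at launch time (the hostfile and program are examples):
    mpirun --mca mpi_leave_pinned 0 -np 16 -hostfile /opt/etc/ib20g.machinefile.ompi ./myprogram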
Ethernet performance unstable
FIXED
CHANGELOG
Hardware Upgrades
- 15/10/05
    arc00-08 machines online. nd01-06 machines now have InfiniBand.
- 15/07/27
    Defective card mic3 on nd03 replaced.
- 15/07/17
    Defective Ethernet switch has been replaced.
Software Upgrades
- 16/09/30
    kernel-4.7.2 MPSS-3.7.2 gcc-4.9.4 gcc-5.4.0 gcc-6.2.0 papi-5.5.0 ompi-2.0.1
- 16/05/12
    kernel-4.5.4 MPSS-3.7 gcc-6.1
- 16/04/26
    autotools (automake-1.15)
- 15/09/10
    CUDA-7.5 ompi-1.10.0 (module)
- 15/06/18
    kernel-4.0.4 MPSS-3.5.1 ompi-1.8.6 Totalview-8.15.4
- 15/05/22
    gcc-5.1 gdb-7.9.1 ompi-1.8.5
- 15/03/28
    CUDA-7.0 MPSS-3.4.3 PAPI-5.4.1 Modules kernel-3.10.72
SEE ALSO
ompi_info(1), mpirun(1), module(1), qsub(1), hwloc-ls(1), nvidia-smi(1), miccheck(1), clinfo(1)
NOTES
Authors and Copyright Conditions
Look at the header of the manual page source for the author(s) and copyright
conditions.