Abinit-8.6.3 with gpu support: test gpu failed

option, parallelism,...

Moderators: fgoudreault, mcote

Forum rules
Please have a look at ~abinit/doc/config/build-config.ac in the source package for detailed and up-to-date information about the configuration of Abinit 8 builds.
For a video explanation on how to build Abinit 7.x for Linux, please go to: http://www.youtube.com/watch?v=DppLQ-KQA68.
IMPORTANT: when an answer solves your problem, please check the little green V-like button on its upper-right corner to accept it.
Locked
thanusit
Posts: 70
Joined: Thu Jan 14, 2010 4:20 am


Post by thanusit » Tue Mar 13, 2018 3:01 am

Dear all

I would like to build Abinit-8.6.3 with GPU support (hoping to make the most of the hardware). I have tried building with the options found in the example and template config.ac files, as well as those suggested in several forum posts. However, I have not found a combination that works properly on my platform: configuration and compilation succeed, but all of the gpu tests fail.

Could anyone please help? Below are the platform's details, along with one of the build configurations I tried (gcc + openmpi + mkl + magma) and its test results. Any suggestions are greatly appreciated.

Kind regards,
Thanusit

# Platform

Code: Select all

- Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (20 CPU cores + 4 x Tesla K80 GPUs)
- CentOS-7 with gcc-4.8.5 and python-2.7.5, openmpi-1.10.6 (via yum), MKL (locally installed from Intel parallel_studio_xe_2015_update6), magma-1.6.2 (locally built), and CUDA 9.0 (built by sys-admin)


# Build config for abinit-8.6.3. The config.log is attached.

Code: Select all

prefix="/workspace/thanusit/local/apps/abinit-8.6.3_gpu"
enable_exports="yes"
enable_64bit_flags="yes"
enable_gw_dpc="yes"
enable_bse_unpacked="yes"
enable_memory_profiling="no"
enable_maintainer_checks="no"
#enable_openmp="yes"
enable_mpi="yes"
enable_mpi_io="yes"
with_mpi_prefix="/usr/lib64/openmpi"
enable_gpu="yes"
with_gpu_prefix="/usr/local/cuda"
with_gpu_flavor="cuda-double"
#FC_LDFLAGS_EXTRA="-Wl,-z,muldefs"
#NVCC_CFLAGS="-O3 -Xptxas=-v --use_fast_math --compiler-options -O3,-fPIC"
with_trio_flavor="netcdf"
with_netcdf_incs="-I/usr/include -I/usr/lib64/gfortran/modules"
with_netcdf_libs="-L/usr/lib64 -lnetcdf -lnetcdff"
with_fft_flavor="fftw3-mkl"
with_fft_incs="-I${MKLROOT}/include"
with_fft_libs="-L${MKLROOT}/lib/intel64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lgomp -lpthread -lm"
with_linalg_flavor="mkl+magma"
with_linalg_incs="-I/workspace/thanusit/local/apps/magma-1.6.2/include -I${MKLROOT}/include"
with_linalg_libs="-L/workspace/thanusit/local/apps/magma-1.6.2/lib/ -lmagma -lcuda -L${MKLROOT}/lib/intel64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lgomp -lpthread -lm"
#with_algo_flavor="levmar"
#with_algo_incs="-I/usr/include"
#with_algo_libs="-L/usr/lib64 -llevmar"
#with_math_flavor="gsl"
#with_math_incs="-I/usr/include"
#with_math_libs="-L/usr/lib64 -lgsl -lgslcblas -lm"
#with_dft_flavor="atompaw+bigdft+libxc+wannier90"
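
For completeness, this is roughly how I drive configure with the file above. The build-directory and config-file paths are from my setup (the file name myhost.ac is just what I called it), and I am assuming Abinit 8's configure accepts the file via --with-config-file:

```shell
# Out-of-source build; paths below are from my machine, adjust to yours.
mkdir -p /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2
cd /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2
../abinit-8.6.3/configure --with-config-file=../myhost.ac
make -j 20
```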


# Make.inc for magma-1.6.2
(The header below says version 1.6.1, but the source tarball I used was certainly magma-1.6.2.tar.gz. Please note that the magma build was successful and passed all the tests provided in the package.)

Code: Select all

#//////////////////////////////////////////////////////////////////////////////
#   -- MAGMA (version 1.6.1) --
#      Univ. of Tennessee, Knoxville
#      Univ. of California, Berkeley
#      Univ. of Colorado, Denver
#      @date January 2015
#//////////////////////////////////////////////////////////////////////////////

# GPU_TARGET contains one or more of Tesla, Fermi, or Kepler,
# to specify for which GPUs you want to compile MAGMA:
#     Tesla  - NVIDIA compute capability 1.x cards (no longer supported in CUDA 6.5)
#     Fermi  - NVIDIA compute capability 2.x cards
#     Kepler - NVIDIA compute capability 3.x cards
# The default is "Fermi Kepler".
# See http://developer.nvidia.com/cuda-gpus
#
GPU_TARGET = Kepler

CC        = gcc
CXX       = g++
NVCC      = nvcc
FORT      = gfortran

ARCH      = ar
ARCHFLAGS = cr
RANLIB    = ranlib

# Use -fPIC to make shared (.so) and static (.a) library;
# can be commented out if making only static library.
FPIC      = -fPIC

CFLAGS    = -O3 $(FPIC) -DADD_ -Wall -fopenmp -DMAGMA_SETAFFINITY -DMAGMA_WITH_MKL
FFLAGS    = -O3 $(FPIC) -DADD_ -Wall -Wno-unused-dummy-argument
F90FLAGS  = -O3 $(FPIC) -DADD_ -Wall -Wno-unused-dummy-argument -x f95-cpp-input
NVCCFLAGS = -O3         -DADD_       -Xcompiler "-fno-strict-aliasing $(FPIC)"
LDFLAGS   =     $(FPIC)              -fopenmp

# see MKL Link Advisor at http://software.intel.com/sites/products/mkl/
# gcc with MKL 10.3, sequential version
LIB       = -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lcublas -lcudart -lstdc++ -lm -lgfortran

# define library directories preferably in your environment, or here.
# for MKL run, e.g.: source /opt/intel/composerxe/mkl/bin/mklvars.sh intel64
MKLROOT = /workspace/thanusit/intel/composer_xe_2015.6.233/mkl
CUDADIR = /usr/local/cuda
-include make.check-mkl
-include make.check-cuda

LIBDIR    = -L$(CUDADIR)/lib64 \
            -L$(MKLROOT)/lib/intel64

INC       = -I$(CUDADIR)/include \
            -I$(MKLROOT)/include
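
As mentioned, magma itself passes its own checks. The routine that later fails under Abinit (zhegvd) can also be exercised directly with magma's bundled tester; assuming the build tree with the testing binaries is still around, something like this works on my machine (the -N flag sets the matrix size):

```shell
# Run magma's own driver for the routine Abinit trips over.
# The magma build-tree path is from my setup; adjust to yours.
cd /workspace/thanusit/local/apps/magma-1.6.2/testing
./testing_zhegvd -N 100
```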



# Result of ../runtests.py gpu -t0

Code: Select all

[thanusit@hpc-b abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2]$ ../abinit-8.6.3/tests/runtests.py gpu -t0

Running on hpc-b -- system Linux -- ncpus 20 -- Python 2.7.5 -- runtests.py-0.5.4

Regenerating database...

Saving database to /workspace/thanusit/src_build/abinit-8.6.3/tests/test_suite.cpkl
Running 7 test(s) with MPI_procs=1, py_threads=1...
[TIP] runtests.py is using 1 CPUs but your architecture has 20 CPUs
You may want to use python threads to speed up the execution
Use `runtests -jNUM` to run with NUM threads
[gpu][t01][np=1]: failed: erroneous lines 15 > 12 [file=t01.out]
No YAML Error found in [gpu][t01][np=1]

[gpu][t02][np=1]: failed: erroneous lines 5 > 0 [file=t02.out]
No YAML Error found in [gpu][t02][np=1]
[gpu][t03][np=1]: failed: absolute error 0.7476 > 0.0007, relative error 1.0 > 0.085 [file=t03.out]
No YAML Error found in [gpu][t03][np=1]
Command   /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/src/98_main/abinit < /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t04/t04.stdin > /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t04/t04.stdout 2> /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t04/t04.stderr
 returned exit_code: 14

[gpu][t04][np=1]: fldiff.pl fatal error:
The diff analysis cannot be done: the number of lines to be analysed differ.
File /workspace/thanusit/src_build/abinit-8.6.3/tests/gpu/Refs/t04.out: 262 lines, 34 ignored
File /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t04/t04.out: 261 lines, 34 ignored [file=t04.out]
[gpu][t04][np=1]Test was not expected to fail but subprocesses returned 14
On entry to magma_zhegvd, parameter 11 had an illegal value (info = -11)

YAML Error found in the stdout of [gpu][t04][np=1]
--- !ERROR
src_file: abi_xhegv.f90
src_line: 170
mpi_rank: 0
message: |
    Problem in abi_xhegv, info= -11
...
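
If I read the LAPACK/MAGMA convention correctly, a negative info means the argument at that position was illegal, so info = -11 points at the 11th argument of magma_zhegvd (counting through the argument list in magma's headers, that looks like the lwork workspace size, though I may be miscounting). A tiny sketch of the convention (the helper name here is just mine):

```python
def describe_info(info):
    """Interpret a LAPACK/MAGMA-style 'info' return code."""
    if info == 0:
        return "success"
    if info < 0:
        # Convention: info = -i means the i-th argument was illegal.
        return "argument %d had an illegal value" % -info
    return "routine failed at step %d" % info

print(describe_info(-11))  # argument 11 had an illegal value
```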

Command   /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/src/98_main/abinit < /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t05_MPI1/t05_MPI1.stdin > /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t05_MPI1/t05_MPI1.stdout 2> /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t05_MPI1/t05_MPI1.stderr
 returned exit_code: 14

[gpu][t05_MPI1][np=1]: fldiff.pl fatal error:
The diff analysis cannot be done: the number of lines to be analysed differ.
File /workspace/thanusit/src_build/abinit-8.6.3/tests/gpu/Refs/t05_MPI1.out: 311 lines, 33 ignored
File /workspace/thanusit/src_build/abinit-8.6.3_blddir_mkl-seq_cuda_magma-1.6.2/Test_suite/gpu_t05_MPI1/t05_MPI1.out: 310 lines, 33 ignored [file=t05_MPI1.out]
[gpu][t05_MPI1][np=1]Test was not expected to fail but subprocesses returned 14
On entry to magma_zhegvd, parameter 11 had an illegal value (info = -11)

YAML Error found in the stdout of [gpu][t05_MPI1][np=1]
--- !ERROR
src_file: abi_xhegv.f90
src_line: 170
mpi_rank: 0
message: |
    Problem in abi_xhegv, info= -11
...

[gpu][t05_MPI2][np=0]: Skipped.
   nprocs: 1 != nprocs_to_test: 2
   nprocs: 1 in exclude_nprocs: [1]

[gpu][t05_MPI4][np=0]: Skipped.
   nprocs: 1 != nprocs_to_test: 4
   nprocs: 1 in exclude_nprocs: [1, 2, 3]


Suite   failed  passed  succeeded  skipped  disabled  run_etime  tot_etime
gpu          5       0          0        2         0     282.30     283.39

Completed in 286.26 [s]. Average time for test=56.46 [s], stdev=54.64 [s]
Summary: failed=5, succeeded=0, passed=0, skipped=2, disabled=0

Execution completed.
Results in HTML format are available in Test_suite/suite_report.html
Attachments
config.log
