Issue Regarding Parallel Installation of Abinit 8.0.7

Postby Esha » Tue Jul 12, 2016 8:06 am

Hi. I followed this video to install Abinit 8.0.7: http://www.youtube.com/watch?v=DppLQ-KQA68

My sponce.ac file is as follows:

enable_mpi="yes"
enable_mpi_io="yes"
with_mpi_prefix="/usr"
with_trio_flavor="netcdf+etsf_io"
with_netcdf_incs="-I/usr/include"
with_netcdf_libs="-L/usr/lib -lnetcdf -lnetcdff"
with_fft_flavor="fftw3"
with_fft_incs="-I/usr/include/"
with_fft_libs="-L/usr/lib/x86_64-linux-gnu/ -lfftw3 -lfftw3f"
with_linalg_flavor="atlas"
with_linalg_libs="-L/usr/lib -llapack -lf77blas -lcblas -latlas"
with_dft_flavor="atompaw+libxc"
#with_dft_flavor="atompaw+bigdft+libxc+wannier90"
enable_gw_dpc="yes"
with_mpi_level="2"
FC="/usr/bin/mpif90"
CC="/usr/bin/mpicc"
CXX="/usr/bin/mpic++"

I then configured Abinit from inside the build folder with the command
../configure --with-config-file="./sponce.ac"

It ran successfully. I then built Abinit with the command
make multi multi_nprocs=8

then
sudo make install

I then submitted the job with the command
mpirun -np 8 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log

The job did not run at all. It gave me a signal 7 bus error.

I tried again with a smaller number of cores
mpirun -np 2 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log

It ran for a little while and then failed with the same error.

I rebuilt it with the commands
make multi multi_nprocs=10
sudo make install

Now it runs with the command
mpirun -np 4 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log
but it takes too long. It does not seem to be running on parallel cores.

In the log file I noticed one issue:

--- !WARNING
src_file: m_nctk.F90
src_line: 539
message: |
    The netcdf library does not support parallel IO, see message above
    Abinit won't be able to produce files in parallel e.g. when paral_kgb==1 is used.
    Action: install a netcdf4+HDF5 library with MPI-IO support.

Is this the reason, or is it something else?
How can I resolve the issue? Any help will be appreciated.
Esha
 

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Postby Jordan » Tue Jul 19, 2016 9:04 am

Hi,

The resulting executable should not depend on the number of cores you use to compile Abinit: make, make mj4, or any other variant should produce the same executable.

The last warning seems to come from the fact that you link with netcdf, but perhaps not with a netcdf built on top of HDF5. (The tutorial you followed is a little old, but never mind.)
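
One way to check this, as a rough sketch: the netCDF C library ships an nc-config script, and recent releases accept a --has-parallel query. The helper name netcdf_parallel_status below is made up for illustration. It only inspects whichever nc-config comes first on PATH, which may not be the library Abinit was actually linked against, and older nc-config versions may not know the option at all.

```shell
# Hypothetical helper: report whether the netCDF C library whose nc-config
# is first on PATH claims to have been built with parallel I/O.
# Assumption: nc-config supports --has-parallel (older releases may not).
netcdf_parallel_status() {
    if ! command -v nc-config >/dev/null 2>&1; then
        echo "nc-config not found"
        return 2
    fi
    answer=$(nc-config --has-parallel 2>/dev/null)
    case "$answer" in
        yes) echo "parallel I/O: yes" ;;
        no)  echo "parallel I/O: no" ;;
        # Anything else (usage text, empty output) is treated as unknown.
        *)   echo "parallel I/O: unknown" ;;
    esac
}
```

If this reports "no", that would be consistent with the warning from m_nctk.F90, and a netcdf4+HDF5 build with MPI-IO support would be needed, as the warning itself suggests.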

Can you at least provide the full error message instead of just "signal 7 bus error"? Can you run any other job with MPI? Please try to be more specific.
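
A minimal sketch of such a test, independent of Abinit: launch a trivial command under mpirun and count the output lines. The function name mpi_smoke_test is hypothetical. Note the limitation: hostname never calls MPI_Init, so this only exercises the launcher, not the MPI library itself. A crash inside MPI initialisation would need a real MPI program to reproduce.

```shell
# Hypothetical helper: check that mpirun can start N copies of a trivial
# command. This tests the launcher only; hostname is not an MPI program.
mpi_smoke_test() {
    nprocs=${1:-2}
    if ! command -v mpirun >/dev/null 2>&1; then
        echo "mpirun not found"
        return 2
    fi
    # Each process prints one line; strip whitespace that some wc
    # implementations put around the count.
    lines=$(mpirun -np "$nprocs" hostname | wc -l | tr -d '[:space:]')
    if [ "$lines" -eq "$nprocs" ]; then
        echo "mpirun launched $nprocs processes"
    else
        echo "expected $nprocs lines, got $lines"
        return 1
    fi
}
```

For example, mpi_smoke_test 4 should print "mpirun launched 4 processes" on a working installation with at least four available slots.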

Cheers
Jordan
 

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Postby Esha » Wed Jul 20, 2016 8:28 am

Hi,

Thanks for your response. The complete error message in the log file is:

Program received signal SIGBUS: Access to an undefined portion of a memory object.

Backtrace for this error:
#0 0x7FCE9356B777
#1 0x7FCE9356BD7E
#2 0x7FCE92A89CAF
#3 0x7FCE8458233A
#4 0x7FCE916584E8
#5 0x7FCE91658807
#6 0x7FCE8457FB53
#7 0x7FCE84FCD75C
#8 0x7FCE853D8F1A
#9 0x7FCE9166FB54
#10 0x7FCE9168665F
#11 0x7FCE938930F7
#12 0x12B33CA in __m_xmpi_MOD_xmpi_init at m_xmpi.F90:601
#13 0x40F9BE in abinit at abinit.F90:215
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13645 on node hachemi exited on signal 7 (Bus error).
--------------------------------------------------------------------------
Esha
 

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Postby pouillon » Tue Aug 23, 2016 1:00 pm

This is a problem with your MPI installation, not with Abinit. Please consult the documentation / forums / mailing lists of your MPI implementation and/or re-install MPI.
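
One common way for an MPI installation to misbehave is having more than one implementation on the machine, so that a program is compiled with the wrapper of one and launched with the mpirun of another. As a sketch, the check below compares the directories of the launcher and the compiler wrapper. The function name same_mpi_prefix is hypothetical, and the defaults mpirun and mpif90 are assumptions taken from the commands used earlier in this thread. Matching directories do not prove the two belong together (symlinks can hide a mismatch), but differing directories are worth a closer look.

```shell
# Hypothetical helper: compare where the MPI launcher and the MPI compiler
# wrapper live. Arguments may be command names or absolute paths.
same_mpi_prefix() {
    launcher=$(command -v "${1:-mpirun}") || { echo "launcher not found"; return 2; }
    wrapper=$(command -v "${2:-mpif90}") || { echo "wrapper not found"; return 2; }
    if [ "$(dirname "$launcher")" = "$(dirname "$wrapper")" ]; then
        echo "same directory: $(dirname "$launcher")"
    else
        echo "different directories: $launcher vs $wrapper"
        return 1
    fi
}
```

With the configuration posted above, the call would be same_mpi_prefix mpirun /usr/bin/mpif90.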
Yann Pouillon
Universidad de Cantabria
Santander, Spain

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Postby marco.digennaro » Mon Oct 03, 2016 2:39 pm

Hi guys,
I actually have the same problem. Even though the parallelism is working (the CPU time differs between MPI runs with different numbers of processes), I get these two warning messages:
--- !WARNING
src_file: m_nctk.F90
src_line: 526
message: |
    Strange, netcdf seems to support MPI-IO but: NetCDF: Not a valid ID
...

--- !WARNING
src_file: m_nctk.F90
src_line: 539
message: |
    The netcdf library does not support parallel IO, see message above
    Abinit won't be able to produce files in parallel e.g. when paral_kgb==1 is used.
    Action: install a netcdf4+HDF5 library with MPI-IO support.
...


I reinstalled netcdf and HDF5 through Anaconda and rebuilt Abinit 8 right afterwards, but the problem is still there.
I also tried modifying the line with-trio-flavor="netcdf+etsf_io" in the .ac file and noticed that configure ignores it: whether you set netcdf or netcdf+whatever, the TRIO flavor ends up set to None. You have to set it by hand after configure to get it right.
This looks a bit suspicious to me.

cheers
Marco Di Gennaro
Marvel, University of Basel (CH)

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Postby jbeuken » Tue Oct 04, 2016 9:35 am

Hi Marco,

marco.digennaro wrote:
I also tried modifying the line with-trio-flavor="netcdf+etsf_io" in the .ac file and noticed that configure ignores it: whether you set netcdf or netcdf+whatever, the TRIO flavor ends up set to None. You have to set it by hand after configure to get it right.


I don't know whether it is just a typo in your post, but the option is

with_trio_flavor, not with-trio-flavor
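
A small sketch for catching this kind of slip before running configure: list the lines of a config file whose option name starts with with- or enable- instead of with_ or enable_. The function name find_hyphenated_options is made up for illustration, and the pattern assumes one option per line in the key="value" form used in the sponce.ac file posted above.

```shell
# Hypothetical helper: print (with line numbers) every uncommented option
# in a config file whose name uses hyphens, e.g. with-trio-flavor=...
# $1: path to the .ac config file.
# Returns non-zero when no such line is found (grep's exit status).
find_hyphenated_options() {
    grep -nE '^[[:space:]]*(with|enable)-[A-Za-z0-9_-]*=' "$1"
}
```

For example, find_hyphenated_options ./sponce.ac prints nothing for a file that only uses underscore names.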

jmb

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Postby marco.digennaro » Tue Oct 04, 2016 3:23 pm

Thanks Jean Michel,

that is absolutely right. But the warning regarding netcdf is still there.

BR
Marco Di Gennaro
Marvel, University of Basel (CH)

