Bug Report: OpenMP in 6.12.1

Documentation, Web site and code modifications

Moderators: baguetl, routerov

Locked
kaneod
Posts: 42
Joined: Wed Mar 10, 2010 11:47 am

Bug Report: OpenMP in 6.12.1

Post by kaneod » Tue Feb 14, 2012 5:44 am

Since it was reported in the previous equivalent thread that OpenMP is (possibly) much better in 6.12, thought I'd post some of the OpenMP bugs here to help clean it up. All these are reported as errors by gfortran-4.6.3 on the Intel x86_64 platform. Compilation used the flags "-O3 -march=native -funroll-loops -floop-block -flto -fopenmp" and whatever else the build system adds by itself (something like -m64 -g -ffree-line-length-none).

43_ffts/fourwf.F90 line 787: the !$OMP END PARALLEL DO statement has to actually end after a "end do"! Moving it up one line (before the end if) fixes.

56_mixing/dotprodm_vn.F90: there are repetitions of PRIVATE variables in the directives at lines 272-274, 312-313. Remove them.

56_xc/rhohxc.F90: derived types (blah%moreblah) can't be used in OpenMP directives! Remove dtset%ixc from the directive cluster at line 699. It's only a single integer so copying it across the different OpenMP threads will not destroy performance. In the future, assign dtset%ixc to a local variable and use the local variable name in the SHARED directive instead.

56_xc/xcden.F90 and xcpot.F90: there are PARALLEL DO loops here which don't start with a Fortran DO statement - need to move ifft = 0 assignment outside the loop (it will still be copied to all the individual threads as 0 in the first place).

65_nonlocal/mkffnl.F90: Line 466 is missing a trailing continuation & after the OMP directive. Line 467: Typo, variable "wf_ffnl1" in the SHARED directive should be wk_ffnl1.

65_nonlocal/mkkpg.F90: Line 105, what is variable kkpg doing in the OMP directive? It doesn't exist! Remove to get working. Line 93: two_pi isn't a variable, it's a parameter.

65_nonlocal/opernlb_ylm.F90: Lines 178 and 179 are missing & continuations. Why is the OpenMP styling so inconsistent across the files even within the same folder? Is there not a style guide for this project?


65_nonlocal/opernla_ylm.F90: Missing continuations in the block near line 166.

The biggest danger of these missing continuations is that they only come up (by default) as *warnings*, hence the parallelization is incomplete in the final build.
Dr Kane O'Donnell
Postdoctoral Research Fellow
Australian Synchrotron

kaneod
Posts: 42
Joined: Wed Mar 10, 2010 11:47 am

Re: Bug Report: OpenMP in 6.12.1

Post by kaneod » Wed Feb 15, 2012 12:27 am

Further to the previous post, here is a patch that corrects all the OpenMP directive errors (placement, contents) I've found in the abinit 6.12.1 source tree. It will also create a config.ac file (called m2-login1.ac) in ~abinit that lists the build options I used.

Note that my corrections for derived types (blah%moreblah) are suboptimal - where a derived type is in a SHARED directive, really either the whole derived type (dtset, say) should be made SHARED or the particular component (dtset%nspden) should be assigned to a local variable, say, nspden, and that can then be SHARED. Since the things that are being shared in almost all cases are essentially just integers to give the max limit of a parallel DO loop, the latter is probably better. I see that in some places in the code these assignments have been made.
Dr Kane O'Donnell
Postdoctoral Research Fellow
Australian Synchrotron

kaneod
Posts: 42
Joined: Wed Mar 10, 2010 11:47 am

Re: Bug Report: OpenMP in 6.12.1

Post by kaneod » Wed Feb 15, 2012 12:29 am

Ok, patches are apparently banned - moderator, email me (kane dot odonnell at synchrotron dot org dot au) if you want the patch.
Dr Kane O'Donnell
Postdoctoral Research Fellow
Australian Synchrotron

mverstra
Posts: 655
Joined: Wed Aug 19, 2009 12:01 pm

Re: Bug Report: OpenMP in 6.12.1

Post by mverstra » Wed Feb 15, 2012 2:38 pm

Hello kaneod,

this is great thanks! The patch is not banned, you just have to use a suffix that the forum likes, such as .txt

OpenMP is being revamped in abinit as we speak, to I will forward your patch to the appropriate developers (or better point them to this conversation).

cheers

Matthieu
Matthieu Verstraete
University of Liege, Belgium

User avatar
gmatteo
Posts: 291
Joined: Sun Aug 16, 2009 5:40 pm

Re: Bug Report: OpenMP in 6.12.1

Post by gmatteo » Wed Feb 15, 2012 4:20 pm

Hello kaneod,

Thank you very much for reporting these problems.
I've already solved most of them in my development version but unfortunately
these bug fixes have not been included in version 6.12. They will be made available in version 6.13

I would strongly discourage the use of OpenMP in version 6.12 since there are several
problems and inefficiencies that we are still trying to solve. In particular, the OMP sections
used for the application of the non-local part of the potential are completely wrong.
The orthogonalization algorithm should be rewritten with BLAS3 routines in order to achieve better parallel efficiency.

Why is the OpenMP styling so inconsistent across the files even within the same folder? Is there not a style guide for this project?


No, as far as I know. Note that most of the OMP sections were coded long time ago, and then disabled since the core developers
prefer to concentrate most of effort on the pure MPI version.
Now OpenMP is being revamped in abinit (in particular in the GW code), since we are trying to improve the scalability of the code
by using a hybrid MPI-OpenMP implementation.

Contributions and bug fixes are obviously welcome.
The main problem, however, is that I'm rewriting most of the OMP sections and this will complicate the merge of
patches done on top of version 6.12.
The best solution would be to create a development branch and then use bazaar to manage the different contributions.

People who want to contribute to the development of the OMP parallelism can contact me by mail.

Best regards,
Matteo Giantomassi

kaneod
Posts: 42
Joined: Wed Mar 10, 2010 11:47 am

Re: Bug Report: OpenMP in 6.12.1

Post by kaneod » Thu Feb 16, 2012 12:34 am

Thanks for the replies! Well, having OpenMP working for me is only an interest rather than a necessity - the bands parallelism in abinit seems quite effective at getting good speedup for molecular systems (nkpts=1) across MPI. I imagine in the long term though, the scalability will be better for a code with well-integrated OpenMP/MPI hybrid routines where only one thread per node actually communicates via MPI.

It's all a bit academic as I get segfaults (constantly) with the executable that compiles cleanly with -fopenmp in gfortran 4.6.3. I guess this is stack issue, I will explore whether expanding OMP_STACKSIZE helps.

With regards to posting the patch - I did try using a .txt extension and it still said I wasn't allowed to upload it. I'll put it on my github (github.com/kaneod) in the "bits of python for physics" repo for devs that are interested.

Kane
Dr Kane O'Donnell
Postdoctoral Research Fellow
Australian Synchrotron

Locked