Update: GPU Single Precision Implementation - 5x Faster! :)

Dominic
Posts: 18
Joined: Mon Jan 21, 2013 4:34 pm

Post by Dominic » Thu Jan 12, 2017 5:53 am

Hi,

I have modified the Abinit CUDA code to enable single-precision (SP) calculations by setting:

with_gpu_flavor=cuda-single


in the config file. However, I don't have access to every kind of hardware, so if you can make this code better, let's do it!

I am attaching a file that contains the patch for Abinit 8.0.8b. Change the extension from ".in" to ".patch", then apply the patch from the source directory.
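For anyone who wants to script the rename-and-patch step, here is a minimal Python sketch. The helper names (patch_name, prepare_patch, patch_command, apply_patch) are hypothetical, and the strip level -p1 is an assumption that was not checked against the attachment, so it defaults to a dry run that changes nothing. It also assumes GNU patch is installed.

```python
import shutil
import subprocess
from pathlib import Path


def patch_name(attachment: Path) -> Path:
    """Return the attachment path with its '.in' extension swapped for '.patch'."""
    return attachment.with_suffix(".patch")


def prepare_patch(attachment: Path) -> Path:
    """Copy the downloaded attachment to a '.patch' file, leaving the original intact."""
    target = patch_name(attachment)
    shutil.copyfile(attachment, target)
    return target


def patch_command(patch_file: Path, strip: int = 1, dry_run: bool = True) -> list[str]:
    """Build a GNU patch command line. The strip level is an assumption."""
    cmd = ["patch", f"-p{strip}", "-i", str(patch_file)]
    if dry_run:
        cmd.append("--dry-run")  # report what would happen without touching files
    return cmd


def apply_patch(patch_file: Path, source_dir: Path, strip: int = 1, dry_run: bool = True):
    """Run patch from inside the Abinit source directory."""
    cmd = patch_command(patch_file.resolve(), strip, dry_run)
    return subprocess.run(cmd, cwd=source_dir, check=True)
```

A typical call would be apply_patch(prepare_patch(Path("abinit-8.0.8-sp-v2.in")), Path("abinit-8.0.8b")), where the source directory name is a guess. If the dry run reports that it cannot find the files to patch, try a different strip level before running with dry_run=False.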

Update:

I found some bugs and fixed them. The code now compiles without problems, and I can also run the code itself.

And the SP code is faster by about 40% to 5x in small tests! Yay! Better still, the energy results differ only very slightly!

Results:

Test 1
Ref. Time: 8.4s
Ref. Energy: 6.6540730581441

SP Time: 5.1s
SP Energy: 6.6540730581443

Test 4
Ref. Time: 14s
SP Time: 2.8s
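To make the comparison explicit, here is a small Python sketch that turns the timings above into speedup factors. It assumes the Test 4 timings are in seconds like those of Test 1. The function names are only for illustration.

```python
def speedup(ref_seconds: float, sp_seconds: float) -> float:
    """How many times faster the SP run is than the reference run."""
    return ref_seconds / sp_seconds


def time_saved_percent(ref_seconds: float, sp_seconds: float) -> float:
    """Percentage of the reference run time that the SP run saves."""
    return 100.0 * (1.0 - sp_seconds / ref_seconds)


# (reference time, SP time) taken from the results above.
# Assumption: Test 4 is in seconds, like Test 1.
timings = {"Test 1": (8.4, 5.1), "Test 4": (14.0, 2.8)}

for name, (ref, sp) in timings.items():
    print(f"{name}: {speedup(ref, sp):.2f}x faster, "
          f"{time_saved_percent(ref, sp):.0f}% less run time")

# Absolute difference between the two Test 1 energies (about 2e-13).
energy_diff = abs(6.6540730581443 - 6.6540730581441)
print(f"Test 1 energy difference: {energy_diff:.1e}")
```

With these numbers, Test 1 comes out at roughly 1.65x (about 39% less run time) and Test 4 at 5x.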

I don't know yet how well it will scale to big systems. The theoretical speedup should be about 10x for computationally heavier runs, but I have not tested that yet. I hope some of you will :)

CAVEAT: DP is precise to about 15 significant decimal digits, while SP is precise to only about 7, so expect some loss of accuracy. Still, a 5x (or maybe 10x?) speedup is really hard to pass up.
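To get a feel for the precision gap, here is a minimal Python sketch that rounds the Test 1 reference energy to IEEE 754 single precision and back. It only illustrates what storing one number in SP costs. It does not reproduce what the patched GPU code does internally, which may keep some quantities in double precision.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


energy_dp = 6.6540730581441       # Test 1 reference energy from the results above
energy_sp = to_single(energy_dp)  # the same number after a round trip through SP

relative_error = abs(energy_sp - energy_dp) / energy_dp

print(f"DP value:       {energy_dp:.13f}")
print(f"SP value:       {energy_sp:.13f}")
print(f"relative error: {relative_error:.2e}")
```

The relative error of a single rounding to SP is at most 2^-24 (about 6e-8), which is where the "about 7 digits" figure comes from.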
Attachments
abinit-8.0.8-sp-v2.in
Patch to Implement GPU cuda single in Abinit 8.0.8b
(187.5 KiB) Downloaded 337 times