Open
Labels
bug: Something isn't working
Description
Joachim Geiger has reported a crash when running with multiple processors. The input files in Cases.zip show the behavior. `input.crashes` uses an extended number of modes and crashes with a heap-overflow error when run with more than a single processor. `input.works` is the same case with a reduced number of modes; this case does not exhibit the behavior. The crash was reported using the ifort compiler; however, I was able to reproduce it by turning on the address-sanitizer flag.
% mpirun -n 4 xvmec input.crashes_3
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
VMEC OUTPUT FILES ALREADY EXIST: OVERWRITING THEM ...
SEQ = 1 TIME SLICE 0.0000E+00
PROCESSING INPUT.crashes_3
THIS IS PARVMEC (PARALLEL VMEC), VERSION 9.0
Lambda: Full Radial Mesh. L-Force: hybrid full/half.
COMPUTER: cianciosaimac OS: Darwin RELEASE: 19.6.0 DATE = Jan 21,2021 TIME = 12:52:34
NS = 8 NO. FOURIER MODES = 185 FTOLV = 1.000E-06 NITER = 20000
PROCESSOR COUNT - RADIAL: 4
INITIAL JACOBIAN CHANGED SIGN!
TRYING TO IMPROVE INITIAL MAGNETIC AXIS GUESS
---- Improved AXIS Guess ----
RAXIS_CC = 5.5423259209884730 0.30747882334706500 3.6107777297953697E-002 2.1925887832076173E-002 -0.17127515915757005 0.33995876393572677 2.7194580396712614E-002 8.7619938032124662E-003 2.1641584886036458E-002 -3.0060375964156970E-002 4.0919407891436034E-003 7.2283631622133112E-003 -4.8096045954452264E-003 3.2132317238919464E-003 1.3366337123433408E-003 -5.0218208257885189E-003 -1.0805539441867496E-003 3.8372284158438586E-004 1.2322391511445112E-003 8.2564184559682900E-004 9.0462982158830627E-003
ZAXIS_CS = -0.0000000000000000 -0.40364620347171476 -2.6212416249487239E-002 2.5845975128812093E-002 0.15344591155188636 -0.27210128536906603 -2.4819582171628708E-002 -7.6814873421304332E-003 -2.2282872186040290E-002 1.9170323502591072E-002 -1.1569841914854002E-002 -6.1298139436995875E-004 -2.6220827681052326E-003 -5.6155647985143900E-003 -3.0101401187663541E-003 -8.9905949402988867E-003 -4.8346291121438923E-003 -5.7954765825185117E-003 8.0075797167838414E-003 -3.0281697953424324E-003 -3.8957154711619243E-003
-----------------------------
=================================================================
=================================================================
=================================================================
=================================================================
==55382==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6180000077c0 at pc 0x000102552fed bp 0x7ffeed759dc0 sp 0x7ffeed759db8
==55380==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6180000077c0 at pc 0x00010ec8dfed bp 0x7ffee101edc0 sp 0x7ffee101edb8
READ of size 8 at 0x6180000077c0 thread T0
READ of size 8 at 0x6180000077c0 thread T0
==55383==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6180000077c0 at pc 0x00010c003fed bp 0x7ffee3ca8dc0 sp 0x7ffee3ca8db8
READ of size 8 at 0x6180000077c0 thread T0
==55381==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6180000077c0 at pc 0x00010d992fed bp 0x7ffee2319dc0 sp 0x7ffee2319db8
READ of size 8 at 0x6180000077c0 thread T0
#0 0x10ec8dfec in __blocktridiagonalsolver_bst_MOD_initialize_bst blocktridiagonalsolver_bst.f90:2005
#1 0x10f09af06 in runvmec_ runvmec.f:329
#2 0x10ebdf804 in MAIN__ vmec.f:333
#3 0x10ebe1818 in main vmec.f:2
#0 0x10c003fec in __blocktridiagonalsolver_bst_MOD_initialize_bst blocktridiagonalsolver_bst.f90:2005
#1 0x10c410f06 in runvmec_ runvmec.f:329
#2 0x10bf55804 in MAIN__ vmec.f:333
#3 0x10bf57818 in main vmec.f:2
#4 0x7fff6fbc6cc8 in start (libdyld.dylib:x86_64+0x1acc8)
0x6180000077c0 is located 0 bytes to the right of 832-byte region [0x618000007480,0x6180000077c0)
allocated by thread T0 here:
#4 0x7fff6fbc6cc8 in start (libdyld.dylib:x86_64+0x1acc8)
0x6180000077c0 is located 0 bytes to the right of 832-byte region [0x618000007480,0x6180000077c0)
allocated by thread T0 here:
#0 0x10d992fec in __blocktridiagonalsolver_bst_MOD_initialize_bst blocktridiagonalsolver_bst.f90:2005
#1 0x10dd9ff06 in runvmec_ runvmec.f:329
#2 0x10d8e4804 in MAIN__ vmec.f:333
#3 0x10d8e6818 in main vmec.f:2
#0 0x113a341ad in wrap_malloc (libasan.5.dylib:x86_64+0x6c1ad)
#1 0x10ec8d21b in __blocktridiagonalsolver_bst_MOD_initialize_bst blocktridiagonalsolver_bst.f90:2002
#2 0x10f09af06 in runvmec_ runvmec.f:329
#3 0x10ebdf804 in MAIN__ vmec.f:333
#4 0x10ebe1818 in main vmec.f:2
#5 0x7fff6fbc6cc8 in start (libdyld.dylib:x86_64+0x1acc8)
#4 0x7fff6fbc6cc8 in start (libdyld.dylib:x86_64+0x1acc8)
0x6180000077c0 is located 0 bytes to the right of 832-byte region [0x618000007480,0x6180000077c0)
SUMMARY: AddressSanitizer: heap-buffer-overflow blocktridiagonalsolver_bst.f90:2005 in __blocktridiagonalsolver_bst_MOD_initialize_bst
allocated by thread T0 here:
Shadow bytes around the buggy address:
0x1c3000000ea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x1c3000000ef0: 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa
0x1c3000000f00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x1c3000000f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==55380==ABORTING
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x1136dc72c
#1 0x1136dbad3
#2 0x7fff6fdbf5fc
#0 0x11128d1ad in wrap_malloc (libasan.5.dylib:x86_64+0x6c1ad)
#1 0x10c00321b in __blocktridiagonalsolver_bst_MOD_initialize_bst blocktridiagonalsolver_bst.f90:2002
#2 0x10c410f06 in runvmec_ runvmec.f:329
#3 0x10bf55804 in MAIN__ vmec.f:333
#4 0x10bf57818 in main vmec.f:2
#5 0x7fff6fbc6cc8 in start (libdyld.dylib:x86_64+0x1acc8)
SUMMARY: AddressSanitizer: heap-buffer-overflow blocktridiagonalsolver_bst.f90:2005 in __blocktridiagonalsolver_bst_MOD_initialize_bst
Shadow bytes around the buggy address:
0x1c3000000ea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x1c3000000ef0: 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa
0x1c3000000f00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x1c3000000f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c3000000f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==55383==ABORTING
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x10d8f072c
#1 0x10d8efad3
#2 0x7fff6fdbf5fc
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 55380 on node cianciosaimac exited on signal 6 (Abort trap: 6).
--------------------------------------------------------------------------
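As a reading aid, the numbers in the AddressSanitizer report above can be cross-checked with a little arithmetic. The sketch below is not part of VMEC and makes no claim about what line 2005 of blocktridiagonalsolver_bst.f90 actually does. It uses only the addresses printed in the log, and it assumes the allocated array holds 8-byte elements (suggested by "READ of size 8", but not confirmed by the log). The variable names are illustrative.

```python
# Addresses copied from the AddressSanitizer report in this issue.
region_start = 0x618000007480  # start of the 832-byte region
region_end = 0x6180000077C0    # exclusive end of that region
fault_addr = 0x6180000077C0    # address of the faulting READ
read_size = 8                  # "READ of size 8"

# Size of the allocation made at blocktridiagonalsolver_bst.f90:2002.
region_bytes = region_end - region_start

# ASSUMPTION: elements are 8 bytes wide (e.g. double precision or a pointer).
n_elements = region_bytes // read_size

# How far past the end of the region the read starts.
offset_past_end = fault_addr - region_end

# 1-based position of the element being read, under the same assumption.
fault_index = (fault_addr - region_start) // read_size + 1

print(f"region size      : {region_bytes} bytes")
print(f"elements (8 B)   : {n_elements}")
print(f"bytes past end   : {offset_past_end}")
print(f"element accessed : {fault_index} (1-based)")
```

Under that assumption, the region holds 104 elements and the read touches position 105, the slot immediately after the last valid one. That pattern would be consistent with an index running one past the allocated extent, or with an array allocated one element too small for the loop that reads it, but the log alone does not say which.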