Hi!
I used a trick to read pages of a file that resides on a remote machine's disk:
each machine calls mmap() over the whole file and creates an MPI one-sided communication window on the mapping.
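For readers without the attachment, the mapping half of the trick looks roughly like the sketch below. This is not the attached code: the helper name map_whole_file is hypothetical, and the MPI_Win_create call is shown only in a comment so the sketch builds without an MPI installation.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the whole file read-only.
 * Returns the base address and stores the length in *len_out,
 * or returns NULL on any failure (including an empty file). */
void *map_whole_file(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return base;
}

/* Each rank would then expose the mapping, for example:
 *   MPI_Win win;
 *   MPI_Win_create(base, (MPI_Aint)len, 1, MPI_INFO_NULL,
 *                  MPI_COMM_WORLD, &win);
 * and a remote rank would read a page with MPI_Get inside an
 * access epoch. Those calls are omitted from the compiled code. */
```

The returned pointer and length are exactly what the window creation call receives, so the whole file, not just the pages actually read, is what the MPI library sees as the window.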
It works fine when DAPL UD is disabled, but it prints the following error messages when I enable DAPL UD by setting 'I_MPI_DAPL_UD=1':
XXX001:UCM:1d1a:84d2ab40: 271380 us(271380 us): DAPL ERR reg_mr Cannot allocate memory
[0:XXX001] rtc_register failed 196608 [0] error(0x30000): unknown error
Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0
internal ABORT - process 0
XXX002:UCM:31e2:27bacb40: 263683 us(263683 us): DAPL ERR reg_mr Cannot allocate memory
[1:XXX002] rtc_register failed 196608 [1] error(0x30000): unknown error
Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0
Please refer to the attached file for the code I used.
I ran the above program with the following environment variables set:
export I_MPI_FABRICS=dapl
export I_MPI_DAPL_UD=1
command: mpiexec.hydra -genvall -machinefile ~/machines -n 2 -ppn 1 ${PWD}/test2
Here are my general questions:
(1) When a window is created over an mmap()ed region, does the IB driver try to pin the whole memory region to prevent page faults?
(2) Does the IB driver register the memory region differently depending on whether DAPL UD is enabled or disabled?
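If registration does pin the whole mapping (an assumption, and exactly what question (1) asks), then the mapped size would have to fit under the per-process locked-memory limit, which is the value reported by 'ulimit -l'. Below is a minimal sketch of that check. The function names are hypothetical, and whether this limit is what actually triggers the reg_mr failure above is not confirmed.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Pure helper: 1 if len bytes fit under the given limit, else 0. */
int fits_under_limit(rlim_t limit, size_t len)
{
    return limit == RLIM_INFINITY || (rlim_t)len <= limit;
}

/* Compare a region length against the soft RLIMIT_MEMLOCK limit.
 * Returns 1 if it fits, 0 if it does not, -1 if the limit
 * could not be read. */
int fits_memlock_limit(size_t len)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    return fits_under_limit(rl.rlim_cur, len);
}
```

Calling fits_memlock_limit() with the length of the mapped file on each node would show whether the mapping could be pinned in full under the current limit.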
Experimental Environment:
Hardware Spec:
OS : CentOS 6.4 Final
CPU : 2 * Intel® Xeon® CPU E5-2450 @ 2.10GHz (8 physical cores)
RAM : 32GB per node
Network : InfiniBand, Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]
Mellanox Infiniband driver: MLNX_OFED_LINUX-3.1-1.1.0.1 (OFED-3.1-1.1.0): 3.19.0
Thanks,