Channel: Clusters and HPC Technology

How to tell what the I_MPI_ADJUST values are set to with Intel MPI 19


Is there a way with Intel MPI 19 to see what the I_MPI_ADJUST_* values are set to? 

With Intel 18.0.5, I see a lot of lines like:

[0] MPI startup(): Gather: 3: 3073-16397 & 129-2147483647
[0] MPI startup(): Gather: 2: 16398-65435 & 129-2147483647
[0] MPI startup(): Gather: 3: 0-2147483647 & 129-2147483647
[0] MPI startup(): Gatherv: 1: 0-2147483647 & 0-2147483647
[0] MPI startup(): Reduce_scatter: 4: 0-0 & 0-8
[0] MPI startup(): Reduce_scatter: 1: 1-16 & 0-8

On my cluster, which admittedly only has Intel 19.0.2 installed at the moment, I tried running various codes with Intel MPI 19.0.2 and I_MPI_DEBUG set anywhere from 1 to 1000, and... not much. For example, when running a hello world:

(1189)(master) $ mpiifort -V
Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.0.2.187 Build 20190117
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.

(1190)(master) $ mpirun -V
Intel(R) MPI Library for Linux* OS, Version 2019 Update 2 Build 20190123 (id: e2d820d49)
Copyright 2003-2019, Intel Corporation.
(1191)(master) $ mpirun -genv I_MPI_DEBUG=1000 -np 4 ./helloWorld.mpi3.hybrid.IMPI19.exe
[0] MPI startup(): libfabric version: 1.7.0a1-impi
[0] MPI startup(): libfabric provider: psm2
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       126170   borga065   {0,1,2,3,4,5,6,7,8,9}
[0] MPI startup(): 1       126171   borga065   {10,11,12,13,14,15,16,17,18,19}
[0] MPI startup(): 2       126172   borga065   {20,21,22,23,24,25,26,27,28,29}
[0] MPI startup(): 3       126173   borga065   {30,31,32,33,34,35,36,37,38,39}
Hello from thread    0 out of    1 on process    1 of    4 on processor borga065
Hello from thread    0 out of    1 on process    2 of    4 on processor borga065
Hello from thread    0 out of    1 on process    3 of    4 on processor borga065
Hello from thread    0 out of    1 on process    0 of    4 on processor borga065

Honestly I'm used to I_MPI_DEBUG being *very* verbose, but I guess not anymore? Is there another value I need to set?
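
For reference, here is the sort of minimal Fortran test I would use to make sure a collective is actually exercised (just a sketch; the program name and buffer size are arbitrary), since the hello world above never calls MPI_Gather:

program gather_probe
  use mpi
  implicit none
  integer, parameter :: n = 4096            ! arbitrary message size
  integer :: ierr, rank, nprocs
  real(8) :: sendbuf(n)
  real(8), allocatable :: recvbuf(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  sendbuf = real(rank, 8)
  if (rank == 0) then
     allocate(recvbuf(n*nprocs))
  else
     allocate(recvbuf(1))                   ! recv buffer only matters at the root
  end if

  ! an actual collective call, so any Gather algorithm selection gets used
  call MPI_Gather(sendbuf, n, MPI_DOUBLE_PRECISION, recvbuf, n, &
                  MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)

  call MPI_Finalize(ierr)
end program gather_probe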

Thanks,

Matt

TCE Open Date: 

Wednesday, 20 November 2019 - 08:59

The Fortran program is not working with Windows 10 as it used to work with Windows 7


Hi everyone,

It is a 64-bit Intel Fortran program that was compiled with Compiler 11.1.51 under Windows 7 Pro 64-bit and MS MPI. When I moved to Windows 10 and ran the same program, I noticed that arrays are no longer sent between tasks. I used to specify the array with a starting location, say A(1), and its length, say 100. I uninstalled the old MS MPI and replaced it with a new one, but the problem still exists. Is there any idea why that is? I need help.

Best regards.

Said.

TCE Open Date: 

Saturday, 23 November 2019 - 11:03

The Fortran program is not working with Windows 10 as it used to work with Windows 7


Hi everyone,

I have a 64-bit Fortran program that was compiled with the Intel Fortran Compiler under Windows 7 Pro 64-bit and MS MPI 2008 R2. When I moved to Windows 10 and ran the same program, I noticed that arrays are no longer sent between tasks. I used to send the array variable by specifying its start location, say A(1), and its length, say 100. Only non-array variables are sent and received. I uninstalled the old MS MPI and replaced it with a new one (V10), but the problem still exists. Is there any idea why that is? I need help.
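
To be concrete, the send/receive pattern is essentially the following simplified sketch (array name, datatype, tag, and length here are illustrative, not the real code):

program array_send
  use mpi
  implicit none
  integer :: ierr, rank, nprocs
  integer :: status(MPI_STATUS_SIZE)
  real(8) :: a(100)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! the array is passed by its first element plus a count; run with at least 2 ranks
  if (rank == 0 .and. nprocs > 1) then
     a = 1.0d0
     call MPI_Send(a(1), 100, MPI_DOUBLE_PRECISION, 1, 7, MPI_COMM_WORLD, ierr)
  else if (rank == 1) then
     call MPI_Recv(a(1), 100, MPI_DOUBLE_PRECISION, 0, 7, MPI_COMM_WORLD, status, ierr)
  end if

  call MPI_Finalize(ierr)
end program array_send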

Best regards.

Said

TCE Open Date: 

Sunday, 24 November 2019 - 07:07

What is HPC cluster used for?


I would like to know what an HPC cluster is used for. If you have resources or knowledge, please do share them here.

Thanks.

Declan Lawton,

Trainee at Moweb Technologies

TCE Open Date: 

Monday, 25 November 2019 - 03:28

MLNX_OFED_LINUX-4.6-1.0.1.1 (OFED-4.6-1.0.1) has a hang issue with Intel MPI 5.0 while using the OFA fabric


While using Intel MPI with I_MPI_FABRIC set to shm:ofa over EDR and OFED 4.6-1.0.1.1, the RDMA pull hangs.

Another incident was noticed with a different version of the software using the same combination: memory corruption occurs.

Is there any problem with the latest OFED (4.6) and Intel MPI's OFA fabric?

Note: The same software runs fine with the DAPL fabric selection.

Thanks in advance.

 

 

TCE Open Date: 

Sunday, 1 December 2019 - 23:15

mpiexec.hydra 2019u4 crashes on AMD Zen2


Hello,

The mpiexec.hydra binary from Intel 2019 Update 4 crashes on Zen2 and Zen1 platforms.

 

 

user@Zen1[pts/0]stream $ mpirun -np 2   /vend/intel/parallel_studio_xe_2019_update4/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/IMB-MPI1

/vend/intel/parallel_studio_xe_2019_update4/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpirun: line 103:  7399 Floating point exception(core dumped) mpiexec.hydra "$@" 0<&0

 

user@Zen2[pts/1]demo $ mpirun -np 2   /vend/intel/parallel_studio_xe_2019_update4/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/IMB-MPI1

/vend/intel/parallel_studio_xe_2019_update4/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpirun: line 103: 121108 Floating point exception(core dumped) mpiexec.hydra "$@" 0<&0

An strace reveals that mpiexec.hydra crashes while trying to parse the processor configuration; I believe the cpuinfo binary suffers from the same symptom.

...

openat(AT_FDCWD, "/sys/devices/system/cpu", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
getdents(3, [{d_ino=37, d_off=1, d_reclen=24, d_name=".", d_type=DT_DIR}, {d_ino=9, d_off=2690600, d_reclen=24, d_name="..", d_type=DT_DIR}, {d_ino=170171, d_off=25909499, d_reclen=24, d_name="smt", d_type=DT_DIR}, {d_ino=90582, d_off=25909675, d_reclen=24, d_name="cpu0", d_type=DT_DIR}, {d_ino=90600, d_off=25909851, d_reclen=24, d_name="cpu1", d_type=DT_DIR}, {d_ino=90619, d_off=25910027, d_reclen=24, d_name="cpu2", d_type=DT_DIR}, {d_ino=90638, d_off=25910203, d_reclen=24, d_name="cpu3", d_type=DT_DIR}, {d_ino=90657, d_off=25910379, d_reclen=24, d_name="cpu4", d_type=DT_DIR}, {d_ino=90676, d_off=25910555, d_reclen=24, d_name="cpu5", d_type=DT_DIR}, {d_ino=90695, d_off=25910731, d_reclen=24, d_name="cpu6", d_type=DT_DIR}, {d_ino=90714, d_off=25910907, d_reclen=24, d_name="cpu7", d_type=DT_DIR}, {d_ino=90733, d_off=25911083, d_reclen=24, d_name="cpu8", d_type=DT_DIR}, {d_ino=90752, d_off=141151836, d_reclen=24, d_name="cpu9", d_type=DT_DIR}, {d_ino=222492, d_off=141566558, d_reclen=32, d_name="cpufreq", d_type=DT_DIR}, {d_ino=82070, d_off=285014906, d_reclen=32, d_name="cpuidle", d_type=DT_DIR}, {d_ino=90771, d_off=285015082, d_reclen=32, d_name="cpu10", d_type=DT_DIR}, {d_ino=90790, d_off=285015258, d_reclen=32, d_name="cpu11", d_type=DT_DIR}, {d_ino=90809, d_off=285015434, d_reclen=32, d_name="cpu12", d_type=DT_DIR}, {d_ino=90828, d_off=285015610, d_reclen=32, d_name="cpu13", d_type=DT_DIR}, {d_ino=90847, d_off=285015786, d_reclen=32, d_name="cpu14", d_type=DT_DIR}, {d_ino=90866, d_off=285015962, d_reclen=32, d_name="cpu15", d_type=DT_DIR}, {d_ino=90885, d_off=285016138, d_reclen=32, d_name="cpu16", d_type=DT_DIR}, {d_ino=90904, d_off=285016314, d_reclen=32, d_name="cpu17", d_type=DT_DIR}, {d_ino=90923, d_off=285016490, d_reclen=32, d_name="cpu18", d_type=DT_DIR}, {d_ino=90942, d_off=285016842, d_reclen=32, d_name="cpu19", d_type=DT_DIR}, {d_ino=90961, d_off=285017018, d_reclen=32, d_name="cpu20", d_type=DT_DIR}, {d_ino=90980, d_off=285017194, d_reclen=32, d_name="cpu21", d_type=DT_DIR}, {d_ino=90999, d_off=285017370, d_reclen=32, d_name="cpu22", d_type=DT_DIR}, {d_ino=91018, d_off=285017546, d_reclen=32, d_name="cpu23", d_type=DT_DIR}, {d_ino=91037, d_off=285017722, d_reclen=32, d_name="cpu24", d_type=DT_DIR}, {d_ino=91056, d_off=285017898, d_reclen=32, d_name="cpu25", d_type=DT_DIR}, {d_ino=91075, d_off=285018074, d_reclen=32, d_name="cpu26", d_type=DT_DIR}, {d_ino=91094, d_off=285018250, d_reclen=32, d_name="cpu27", d_type=DT_DIR}, {d_ino=91113, d_off=285018426, d_reclen=32, d_name="cpu28", d_type=DT_DIR}, {d_ino=91132, d_off=285018778, d_reclen=32, d_name="cpu29", d_type=DT_DIR}, {d_ino=91151, d_off=285018954, d_reclen=32, d_name="cpu30", d_type=DT_DIR}, {d_ino=91170, d_off=285019130, d_reclen=32, d_name="cpu31", d_type=DT_DIR}, {d_ino=91189, d_off=285019306, d_reclen=32, d_name="cpu32", d_type=DT_DIR}, {d_ino=91208, d_off=285019482, d_reclen=32, d_name="cpu33", d_type=DT_DIR}, {d_ino=91227, d_off=285019658, d_reclen=32, d_name="cpu34", d_type=DT_DIR}, {d_ino=91246, d_off=285019834, d_reclen=32, d_name="cpu35", d_type=DT_DIR}, {d_ino=91265, d_off=285020010, d_reclen=32, d_name="cpu36", d_type=DT_DIR}, {d_ino=91284, d_off=285020186, d_reclen=32, d_name="cpu37", d_type=DT_DIR}, {d_ino=91303, d_off=285020362, d_reclen=32, d_name="cpu38", d_type=DT_DIR}, {d_ino=91322, d_off=285020714, d_reclen=32, d_name="cpu39", d_type=DT_DIR}, {d_ino=91341, d_off=285020890, d_reclen=32, d_name="cpu40", d_type=DT_DIR}, {d_ino=91360, d_off=285021066, d_reclen=32, d_name="cpu41", d_type=DT_DIR}, 
{d_ino=91379, d_off=285021242, d_reclen=32, d_name="cpu42", d_type=DT_DIR}, {d_ino=91398, d_off=285021418, d_reclen=32, d_name="cpu43", d_type=DT_DIR}, {d_ino=91417, d_off=285021594, d_reclen=32, d_name="cpu44", d_type=DT_DIR}, {d_ino=91436, d_off=285021770, d_reclen=32, d_name="cpu45", d_type=DT_DIR}, {d_ino=91455, d_off=285021946, d_reclen=32, d_name="cpu46", d_type=DT_DIR}, {d_ino=91474, d_off=285022122, d_reclen=32, d_name="cpu47", d_type=DT_DIR}, {d_ino=91493, d_off=285022298, d_reclen=32, d_name="cpu48", d_type=DT_DIR}, {d_ino=91512, d_off=285022650, d_reclen=32, d_name="cpu49", d_type=DT_DIR}, {d_ino=91531, d_off=285022826, d_reclen=32, d_name="cpu50", d_type=DT_DIR}, {d_ino=91550, d_off=285023002, d_reclen=32, d_name="cpu51", d_type=DT_DIR}, {d_ino=91569, d_off=285023178, d_reclen=32, d_name="cpu52", d_type=DT_DIR}, {d_ino=91588, d_off=285023354, d_reclen=32, d_name="cpu53", d_type=DT_DIR}, {d_ino=91607, d_off=285023530, d_reclen=32, d_name="cpu54", d_type=DT_DIR}, {d_ino=91626, d_off=285023706, d_reclen=32, d_name="cpu55", d_type=DT_DIR}, {d_ino=91645, d_off=285023882, d_reclen=32, d_name="cpu56", d_type=DT_DIR}, {d_ino=91664, d_off=285024058, d_reclen=32, d_name="cpu57", d_type=DT_DIR}, {d_ino=91683, d_off=285024234, d_reclen=32, d_name="cpu58", d_type=DT_DIR}, {d_ino=91702, d_off=285024586, d_reclen=32, d_name="cpu59", d_type=DT_DIR}, {d_ino=91721, d_off=285024762, d_reclen=32, d_name="cpu60", d_type=DT_DIR}, {d_ino=91740, d_off=285024938, d_reclen=32, d_name="cpu61", d_type=DT_DIR}, {d_ino=91759, d_off=285025114, d_reclen=32, d_name="cpu62", d_type=DT_DIR}, {d_ino=91778, d_off=318580955, d_reclen=32, d_name="cpu63", d_type=DT_DIR}, {d_ino=47, d_off=385790491, d_reclen=32, d_name="power", d_type=DT_DIR}, {d_ino=57, d_off=661204875, d_reclen=40, d_name="vulnerabilities", d_type=DT_DIR}, {d_ino=46, d_off=718872595, d_reclen=32, d_name="modalias", d_type=DT_REG}, {d_ino=42, d_off=900028725, d_reclen=32, d_name="kernel_max", d_type=DT_REG}, {d_ino=40, d_off=1321717208, d_reclen=32, d_name="possible", d_type=DT_REG}, {d_ino=39, d_off=1412398250, d_reclen=32, d_name="online", d_type=DT_REG}, {d_ino=43, d_off=1431608070, d_reclen=32, d_name="offline", d_type=DT_REG}, {d_ino=44, d_off=1472641949, d_reclen=32, d_name="isolated", d_type=DT_REG}, {d_ino=38, d_off=1826905203, d_reclen=32, d_name="uevent", d_type=DT_REG}, {d_ino=45, d_off=1905639739, d_reclen=32, d_name="nohz_full", d_type=DT_REG}, {d_ino=197551, d_off=2084586514, d_reclen=32, d_name="microcode", d_type=DT_DIR}, {d_ino=41, d_off=2147483647, d_reclen=32, d_name="present", d_type=DT_REG}], 32768) = 2496
getdents(3, [], 32768)                  = 0
close(3)                                = 0
uname({sysname="Linux", nodename="SERVER", release="3.10.0-1062.1.2.el7.x86_64", version="#1 SMP Mon Sep 30 14:19:46 UTC 2019", machine="x86_64", domainname="houston"}) = 0
sched_getaffinity(0, 128, [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63]) = 128
--- SIGFPE {si_signo=SIGFPE, si_code=FPE_INTDIV, si_addr=0x44d325} ---
+++ killed by SIGFPE (core dumped) +++
Floating point exception (core dumped)

 

 

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                64
On-line CPU(s) list:   0-63
Thread(s) per core:    1
Core(s) per socket:    32
Socket(s):             2
NUMA node(s):          8
Vendor ID:             AuthenticAMD
CPU family:            23
Model:                 49
Model name:            AMD EPYC 7502 32-Core Processor
Stepping:              0
CPU MHz:               1500.000
CPU max MHz:           2500.0000
CPU min MHz:           1500.0000
BogoMIPS:              5000.07
Virtualization:        AMD-V
L1d cache:             32K
L1i cache:             32K
L2 cache:              512K
L3 cache:              16384K
NUMA node0 CPU(s):     0-7
NUMA node1 CPU(s):     8-15
NUMA node2 CPU(s):     16-23
NUMA node3 CPU(s):     24-31
NUMA node4 CPU(s):     32-39
NUMA node5 CPU(s):     40-47
NUMA node6 CPU(s):     48-55
NUMA node7 CPU(s):     56-63
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl xtopology nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 cpb cat_l3 cdp_l3 hw_pstate sme retpoline_amd ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip overflow_recov succor smca
 

 

 

TCE Open Date: 

Monday, 9 December 2019 - 09:10

Release date for 2019u6


I was wondering when Intel Cluster Studio XE version 2019 Update 6 is scheduled for release.

Thanks

Michael

 

TCE Open Date: 

Monday, 9 December 2019 - 09:27

How can I download the full MPI Fortran library (including the statically linked libraries)?


I keep going in circles on the website trying to download the full MPI library, including the statically linked libraries. When I go to https://software.intel.com/en-us/mpi-library/choose-download/linux, I choose "register and download". When I go to the next page (https://software.seek.intel.com/performance-libraries), it has a "welcome back <email address>" and allows me to click Submit.

After clicking submit, it thinks for a second and then takes me to a page that says the following but does not have a download link, and I can't find a download link anywhere else on the site:

Thank you for activating your Intel® Performance Libraries product.

Access your support resources. Click here for technical support.

Intel takes your privacy seriously. Refer to Intel's Privacy Notice and Serial Number Validation Notice regarding the collection and handling of your personal information, the Intel product’s serial number and other information.

This was originally discussed in https://software.intel.com/en-us/comment/1949259, but it was accurately indicated that it belongs here.

TCE Open Date: 

Tuesday, 10 December 2019 - 07:00

Intel MPI: reduce memory consumption


Dear all,

I am currently looking into the problem of memory consumption for all-to-all based MPI software.
As far as I understand, for releases before Intel MPI 2017 we could use the DAPL_UD mode through the following variables: I_MPI_DAPL_UD and I_MPI_DAPL_UD_PROVIDER.

Since support for DAPL has been removed in the 2019 version, what do we have to use for an InfiniBand interconnect?
My additional question is: what are the variables that reproduce the same behavior for an Omni-Path interconnect?

Thank you for your help.
Best,
Thomas

TCE Open Date: 

Tuesday, 10 December 2019 - 14:57

HPC Cluster HPL test error


Hello,

I want to test my HPC system, which has 33 nodes with 24 cores each (792 cores in total) and 370 GB of RAM per node, but I get the following error the second time I run the command mpiexec -f hosts2 -n 792 ./xhpl. I had run this command smoothly before and got output. Do you have any idea about this problem?

The first execution of mpiexec -f hosts2 -n 792 ./xhpl produced 3.096e+04 Gflops.

By the way, mpiexec -f hosts2 -n 480 ./xhpl works properly and produces 2.136e+04 Gflops.

Thank you.

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :  300288 
NB     :     224 
PMAP   : Row-major process mapping
P      :      24 
Q      :      33 
PFACT  :   Right 
NBMIN  :       4 
NDIV   :       2 
RFACT  :   Crout 
BCAST  :  1ringM 
DEPTH  :       1 
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

Abort(205610511) on node 516 (rank 516 in comm 0): Fatal error in PMPI_Comm_split: Other MPI error, error stack:
PMPI_Comm_split(507)................: MPI_Comm_split(MPI_COMM_WORLD, color=0, key=516, new_comm=0x7ffc61c8d818) failed
PMPI_Comm_split(489)................: 
MPIR_Comm_split_impl(253)...........: 
MPIR_Get_contextid_sparse_group(498): Failure during collective
Abort(876699151) on node 575 (rank 575 in comm 0): Fatal error in PMPI_Comm_split: Other MPI error, error stack:
PMPI_Comm_split(507)................: MPI_Comm_split(MPI_COMM_WORLD, color=0, key=575, new_comm=0x7ffec32d2c18) failed
PMPI_Comm_split(489)................: 
MPIR_Comm_split_impl(253)...........: 
MPIR_Get_contextid_sparse_group(498): Failure during collective
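
In case it is useful, a minimal stand-alone reproducer of the failing call might look like the sketch below (the color/key choice and P = 24 are only illustrative, mimicking the row split of the P x Q grid above):

program split_test
  use mpi
  implicit none
  integer, parameter :: p = 24              ! process rows, matching P in the run above
  integer :: ierr, rank, nprocs, rowcomm

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! split COMM_WORLD into row-like communicators, similar to HPL's grid setup
  call MPI_Comm_split(MPI_COMM_WORLD, mod(rank, p), rank, rowcomm, ierr)
  call MPI_Comm_free(rowcomm, ierr)

  if (rank == 0) print *, 'MPI_Comm_split completed on', nprocs, 'ranks'
  call MPI_Finalize(ierr)
end program split_test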
 

TCE Open Date: 

Monday, 30 December 2019 - 23:35

Request for Intel MPI 2018.5 packages in the yum/apt repository


Hi,
I need the packages relating to Intel-mpi-rt-2018.5; however, I can only find packages up to Update 4.
Could you upload them to the repo "yum.repos.intel.com/mpi"?

 

TCE Open Date: 

Monday, 6 January 2020 - 06:32

Issue with MPI 2019U6 and MLX provider


Hi

We have two clusters that are almost identical except that one is now running Mellanox OFED 4.6 and the other 4.5.

With MPI 2019U6 from Studio 2020 distribution, one cluster (4.5) works OK, the other (4.6) does not and throws some UCX errors:

]$ cat slurm-151351.out
I_MPI_F77=ifort
I_MPI_PORT_RANGE=60001:61000
I_MPI_F90=ifort
I_MPI_CC=icc
I_MPI_CXX=icpc
I_MPI_DEBUG=999
I_MPI_FC=ifort
I_MPI_HYDRA_BOOTSTRAP=slurm
I_MPI_ROOT=/apps/compilers/intel/2020.0/compilers_and_libraries_2020.0.166/linux/mpi
MPI startup(): Imported environment partly inaccesible. Map=0 Info=0
[0] MPI startup(): libfabric version: 1.9.0a1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): detected mlx provider, set device name to "mlx"
[0] MPI startup(): max_ch4_vcis: 1, max_reg_eps 1, enable_sep 0, enable_shared_ctxs 0, do_av_insert 1
[0] MPI startup(): addrname_len: 512, addrname_firstlen: 512
[0] MPI startup(): val_max: 4096, part_len: 4095, bc_len: 1030, num_parts: 1
[1578327353.181131] [scs0027:247642:0]         select.c:410  UCX  ERROR no active messages transport to <no debug data>: mm/posix - Destination is unreachable, mm/sysv - Destination is unreachable, self/self - Destination is unreachable
[1578327353.180508] [scs0088:378614:0]         select.c:410  UCX  ERROR no active messages transport to <no debug data>: mm/posix - Destination is unreachable, mm/sysv - Destination is unreachable, self/self - Destination is unreachable
Abort(1091471) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(703)........:
MPID_Init(958)...............:
MPIDI_OFI_mpi_init_hook(1382): OFI get address vector map failed
Abort(1091471) on node 2 (rank 2 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(703)........:
MPID_Init(958)...............:
MPIDI_OFI_mpi_init_hook(1382): OFI get address vector map failed

 

Is this possibly an Intel MPI issue or something at our end (where 2018 and early 2019 versions worked OK)?
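
For what it's worth, since the error stack above already fails inside PMPI_Init, even an init-only test like this sketch might be enough to reproduce the problem on the 4.6 cluster:

program init_only
  use mpi
  implicit none
  integer :: ierr, rank

  call MPI_Init(ierr)                       ! the error stack above points here
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  if (rank == 0) print *, 'MPI_Init succeeded'
  call MPI_Finalize(ierr)
end program init_only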

Thanks
A

TCE Open Date: 

Monday, 6 January 2020 - 08:14

Why does my Octopus 9.1 show extremely different performance with different MPI implementations?


I am compiling a scientific program package called Octopus 9.1 on a cluster, specifying the BLAS library with

-L${MKL_DIR} -Wl,-Bstatic -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group -Wl,-Bdynamic -lpthread -lm -ldl

BLACS with

-L${MKL_DIR} -Wl,-Bstatic -lmkl_scalapack_lp64

and ScaLAPACK with

-L${MKL_DIR} -Wl,-Bstatic -lmkl_scalapack_lp64

All of those options and flags are what the Intel Link Line Advisor spits out given my computer architecture. The compilers are OpenMPI's mpif90 and mpicc, built with the Intel 18.0.0 compilers. The program works fine and runs fast; there is nothing to worry about except for a few segfault errors during the test run, which I suspect can be remedied by ulimit -s unlimited. But I would like to know why -lmkl_intel_lp64, -lmkl_sequential, -lmkl_core, as well as the BLACS and ScaLAPACK libraries, have to be statically linked. For instance, when those -Wl,-Bstatic and -Wl,-Bdynamic options are removed, I get a segfault runtime error for any calculation I launch. Looking at Octopus's manual, it doesn't say anything about which Intel library should be linked statically or dynamically; in fact, those Intel-advised compiler options are architecture-dependent. Moreover, if I switch compilers to MPICH, which also wraps the Intel compilers (same version as before), the program runs significantly slower (in one calculation it was 50 seconds with OpenMPI vs. 1 hour with MPICH), and the -Wl,-Bstatic and -Wl,-Bdynamic options have to be absent, otherwise it segfaults. This is really bugging me: how can a mere difference in MPI implementation lead to such a huge difference in performance and linking behavior? Any thoughts on this?


TCE Open Date: 

Wednesday, 8 January 2020 - 12:42

significance/meaning of zero byte MPI messages (APS message profiling data)


Hi,
I recently tried the APS tool to capture the message details (size and amount) for the WRF application on the Intel 8280 (OPA).
Here is the data - 
nodes,Message_size(B),Volume(MB),Volume(%),Transfers,Time(sec),Time(%)
1,0,0,0,58099903,3988.14,97.05
2,0,0,0,219491539,7554.45,96.19
4,0,0,0,850730419,15073.44,96.02

It seems that a significant amount of time is spent in the transfer of these 0-byte messages, and with more nodes the number of messages increases. Could you please help me understand the significance of these 0-byte messages? How are they related to the MPI communication protocol?
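
For context, a zero-byte message is perfectly legal in MPI: only the envelope (source, tag, communicator) travels, no payload, and such messages are commonly used purely for synchronization or control. A minimal sketch of the pattern (program name and tag are mine):

program zero_byte_sync
  use mpi
  implicit none
  integer :: ierr, rank, nprocs
  integer :: status(MPI_STATUS_SIZE)
  integer :: token(1)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! count = 0: nothing but the envelope is transferred, so the matching
  ! receive acts as a pure synchronization point
  if (rank == 0 .and. nprocs > 1) then
     call MPI_Send(token, 0, MPI_INTEGER, 1, 99, MPI_COMM_WORLD, ierr)
  else if (rank == 1) then
     call MPI_Recv(token, 0, MPI_INTEGER, 0, 99, MPI_COMM_WORLD, status, ierr)
  end if

  call MPI_Finalize(ierr)
end program zero_byte_sync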
 

TCE Open Date: 

Tuesday, 14 January 2020 - 23:43

Intel MPI Library for Linux Ver. 2017 (supported OS's)


Hi,

 

Does anyone know the supported OSes for the Intel MPI Library for Linux Ver. 2017, specifically in relation to Ubuntu?

I'm also looking to see if I can get a copy of Intel MPI Library for Linux Ver. 2017 and a key to go along with it. Can I also get a public key for it?

 

Thank you for any help!

 

TCE Open Date: 

Monday, 13 January 2020 - 14:19

MPI Problem --- no mpiicc


I  use CentOS.

I have installed the MPI library via Intel® Parallel Studio XE and added

. /hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/bin/mpivars.sh 

in .bashrc

But I cannot find mpiicc

which mpiic
/usr/bin/which: no mpiic in (/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/bin/intel64:/hpl/intel/compilers_and_libraries_2019.5.281/linux/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/bin:/hpl/intel/debugger_2019/gdb/intel64/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
--------------------------------------------------------------------------
echo $PATH
/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/bin/intel64:/hpl/intel/compilers_and_libraries_2019.5.281/linux/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/bin:/hpl/intel/compilers_and_libraries_2019.5.281/linux/mpi/intel64/bin:/hpl/intel/debugger_2019/gdb/intel64/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

I must have done something wrong; please help me fix it.

Many thanks.

TCE Open Date: 

Tuesday, 14 January 2020 - 03:41

Downloaded Update 6, but the MPI1 command says it is Update 5


I installed "l_mpi_2019.6.166.tgz", 2019 Update 6 from the website, and found the command still shows below banner. Can you confirm I have the correct copy for update 6?

md5sum: 393c7abf2cf5e2ffbc20b1ac4d0a5eae  l_mpi_2019.6.166.tgz

 

<Result>

#------------------------------------------------------------
#    Intel(R) MPI Benchmarks 2019 Update 5, MPI-1 part
#------------------------------------------------------------
# Date                  : Tue Jan 14 19:29:17 2020
# Machine               : x86_64
# System                : Linux
# Release               : 5.0.0-1028-azure
# Version               : #30~18.04.1-Ubuntu SMP Fri Dec 6 11:47:59 UTC 2019
# MPI Version           : 3.1
# MPI Thread Environment:
 

TCE Open Date: 

Tuesday, 14 January 2020 - 11:31

Can you post the 2019 Update 6 release notes?

MPI 2019.5 mpiexec -hosts puts all processes on the first host


2019.1 mpiexec worked as expected: mpiexec -np 2 -hosts A,B P launched one process P on each of hosts A and B.

2019.5 mpiexec -np 2 -hosts A,B P launches two P's on A.

It's maddening because this is not how any other MPI I've experienced works...

mpiexec -configfile CF works the same in both releases. With this I can make things work, but it's very, VERY annoying. 

Meanwhile, 2019.5 MPI_Bcast() appears to be better behaved than 2019.1 MPI_Bcast(). This is a good thing, and worth the upgrade.

I'm not into spending 2½ hours on each release to see whether 2019.2 through 2019.4 might have better overall behavior.


TCE Open Date: 

Wednesday, 15 January 2020 - 18:07

# of Processes and queue pairs when using IntelMPI with libibverbs


Hi, 

I am looking to run IntelMPI over an RDMA-capable fabric. I was wondering what the usual way is for this setup to run (since I am still acquiring IntelMPI).

1. Does IntelMPI run with one process per CPU (or hyperthread)? Or does it run with one thread per CPU (or hyperthread)?

2. How many queue pairs does each machine consume? For example, if I have M machines in the cluster and each machine has N cores, does each machine need (a worked example follows this list):

    (a) M - 1 queue pairs (one QP per machine to talk to the other M - 1 machines in the cluster), or

    (b) N * (M - 1) queue pairs (one QP per local core to talk to the other M - 1 machines in the cluster), or

    (c) N * N * (M - 1) queue pairs (one QP per local core to talk to each of the N * (M - 1) remote cores in the cluster)?

3. By default, does IntelMPI use UD mode or RC mode?
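
As a worked example for question 2 (numbers purely illustrative): with M = 4 machines and N = 24 cores per machine, option (a) would mean 3 QPs per machine, option (b) 24 * 3 = 72 QPs per machine, and option (c) 24 * 24 * 3 = 1,728 QPs per machine.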

Thanks 

~Neelesh


TCE Open Date: 

Sunday, 19 January 2020 - 22:17