how would you tell it is open-mpi?
it is actually the low-level libraries that have
these requirements. when using OpenMPI there are also
a bunch of flags that you can use to ease the memory
allocation load. in particular, setting

mpi_leave_pinned = 1

seemed to help a lot in my runs.

you can reduce the risk of crashes due to timeouts with:
btl_openib_ib_timeout = 12

and you can use the following flag to see the efficiency
of the "lazy pinning":
mpool_rdma_print_stats = 1

and finally, if you are on a machine with NUMA support
you may want to check whether

mpi_paffinity_alone = 1

has some effect.
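
for reference, here is a sketch of how the settings above could be collected
in a per-user MCA parameter file, assuming an OpenMPI version that reads
$HOME/.openmpi/mca-params.conf and still supports these parameters (some of
them depend on the openib BTL and may be renamed or gone in newer releases,
so please check with ompi_info on your installation). the values are just
the ones quoted above, not tuned recommendations:

```
# $HOME/.openmpi/mca-params.conf  (per-user MCA parameter file)

# keep registered ("pinned") memory around for reuse
mpi_leave_pinned = 1

# larger InfiniBand timeout to reduce the risk of timeout crashes
btl_openib_ib_timeout = 12

# print statistics about the "lazy pinning" cache
mpool_rdma_print_stats = 1

# only worth trying on NUMA machines
mpi_paffinity_alone = 1
```

the same parameters can also be tried for a single run on the command line,
e.g. "mpirun --mca mpi_leave_pinned 1 ...", which is handy for testing one
flag at a time before making it permanent.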


