Open
Commits
28 commits
97ccdd6
btl/ofi: fault tolerance
Matthew-Whitlock Oct 9, 2025
6152e7e
btl/ofi check for valid pointer in error handler
Matthew-Whitlock Nov 10, 2025
5938c94
opal/mca/common/ucx : assert fix - change thread mode sent to UCX api
nbellalou Dec 23, 2025
15e5a62
Fix abstraction violation between ompi and opal
nbellalou Jan 11, 2026
39cf291
PML/UCX: properly handle persistent req free list items
hppritcha Jan 12, 2026
70d6bf5
if/bsdx_ipv4: Fix name in COMPONENT_INIT()
bwbarrett Jan 12, 2026
5e859a9
update-my-copyright.py: properly support git workspaces
jsquyres Dec 29, 2025
8c027cf
docs: update the TCP tuning page
jsquyres Dec 27, 2025
427b576
docs: support intersphinx for PMIx and PRTE docs links
jsquyres Dec 29, 2025
5b799a0
docs: update pmix_info(1) and prte_info(1) links
jsquyres Dec 29, 2025
11ffe84
Merge pull request #13633 from bwbarrett/bugfix/fix-bsd-compile
bwbarrett Jan 12, 2026
57b8c9b
Merge pull request #13632 from hppritcha/ucx_pml_persistent_req_fix
hppritcha Jan 12, 2026
319b307
coll/tuned: Change the bcast default collective algorithm selection
jiaxiyan Jan 25, 2024
fe641a7
coll/acoll: Fixes for coverity deadcode issues
amd-nithyavs Jan 13, 2026
59f8e2e
revoke: Fix null dereference, improve debug prints, comment assumptions
Matthew-Whitlock Jan 13, 2026
cf73d24
Merge pull request #13641 from Matthew-Whitlock/coll_revoke_fixes
hppritcha Jan 13, 2026
3e00498
Merge pull request #13590 from nbellalou/nbellalou/ucxThreadModeFix
janjust Jan 13, 2026
4b7a78a
Merge pull request #13599 from jsquyres/pr/tcp-docs-updates
jsquyres Jan 14, 2026
18b6e4a
Merge pull request #13639 from amd-nithyavs/13Jan2026_coverity_fix
mshanthagit Jan 14, 2026
0516270
Merge pull request #12278 from jiaxiyan/bcast
bosilca Jan 14, 2026
1698b45
Use #ifdef with HAVE_* defines.
bosilca Jan 14, 2026
13280e7
Fix an #endif comment
bosilca Jan 14, 2026
15999cc
Merge pull request #13649 from bosilca/fix/13647
bosilca Jan 14, 2026
a8d9b44
Merge pull request #13429 from Matthew-Whitlock/ofi_ft
hppritcha Jan 14, 2026
c6fc05d
Enable ASAN for mpi4py in CI
devreal Jan 14, 2026
7f5eea7
Remove disable of ASLR and apt update
devreal Jan 14, 2026
1af95bb
Enable ASAN for all mpi4py tests and increase optimizations
devreal Jan 14, 2026
b8a4c15
ASAN: explicitly disable stack-use-after-return check
devreal Jan 14, 2026
42 changes: 38 additions & 4 deletions .github/workflows/ompi_mpi4py.yaml
@@ -20,18 +20,25 @@ permissions:

jobs:
test:
runs-on: ubuntu-22.04
# We need Ubuntu 24.04 (over 22.04) due to a kernel bug,
# see https://github.com/google/sanitizers/issues/856.
runs-on: ubuntu-24.04
timeout-minutes: 30
env:
MPI4PY_TEST_SPAWN: true
# disable ASAN while building
ASAN_OPTIONS: verify_asan_link_order=0,detect_odr_violation=0,abort_on_error=0
# disable leak detection
LSAN_OPTIONS: detect_leaks=0,exitcode=0

steps:
- name: Configure hostname
run: echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts > /dev/null
if: ${{ runner.os == 'Linux' || runner.os == 'macOS' }}

- name: Install dependencies
run: sudo apt-get install -y -q
libnuma-dev
libnuma-dev libasan8
if: ${{ runner.os == 'Linux' }}

- name: Checkout Open MPI
@@ -59,7 +66,8 @@ jobs:
--disable-oshmem
--disable-silent-rules
--prefix=/opt/openmpi
LDFLAGS=-Wl,-rpath,/opt/openmpi/lib
CFLAGS="-O2 -fno-omit-frame-pointer -g -fsanitize=address"
LDFLAGS="-Wl,-rpath,/opt/openmpi/lib -fsanitize=address"
working-directory: mpi-build

- name: Build MPI
@@ -115,6 +123,21 @@ jobs:
env:
CFLAGS: "-O0"

- name: Setting up ASAN environment
# LD_PRELOAD is needed to make sure ASAN is the first thing loaded
# as it will otherwise complain.
# Leak detection is currently disabled because of the size of the report.
# The patcher is disabled because ASAN fails if code mmaps data at fixed
# memory addresses, see https://github.com/open-mpi/ompi/issues/12819.
# ODR violation detection is disabled until #13469 is fixed
# Disabling stack use after return detection to reduce slowdown, per
# https://github.com/llvm/llvm-project/issues/64190.
run: |
echo LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libasan.so.8 >> $GITHUB_ENV
echo ASAN_OPTIONS=detect_odr_violation=0,abort_on_error=1,detect_stack_use_after_return=0 >> $GITHUB_ENV
echo LSAN_OPTIONS=detect_leaks=0,exitcode=0 >> $GITHUB_ENV
echo OMPI_MCA_memory=^patcher >> $GITHUB_ENV

- name: Test mpi4py (singleton)
run: python test/main.py -v -x TestExcErrhandlerNull
if: ${{ true }}
@@ -145,6 +168,18 @@ jobs:
if: ${{ true }}
timeout-minutes: 10

- name: Show MPI (ASAN)
run: ompi_info

- name: Show MPICC (ASAN)
run: mpicc -show

- name: Disabling ASAN environment
run: |
echo LD_PRELOAD= >> $GITHUB_ENV
echo ASAN_OPTIONS=verify_asan_link_order=0,detect_odr_violation=0,abort_on_error=0 >> $GITHUB_ENV
echo LSAN_OPTIONS=detect_leaks=0,exitcode=0 >> $GITHUB_ENV

- name: Relocate Open MPI installation
run: mv /opt/openmpi /opt/ompi
- name: Update PATH and set OPAL_PREFIX and LD_LIBRARY_PATH
@@ -157,4 +192,3 @@ jobs:
run: python test/main.py -v -x TestExcErrhandlerNull
if: ${{ true }}
timeout-minutes: 10
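
The workflow uses one `ASAN_OPTIONS` string while building and another while running the mpi4py tests. As a rough illustration (not part of this PR), the Python sketch below parses both strings, copied from the workflow, and reports which settings differ. The helper names `parse_sanitizer_options` and `diff_options` are hypothetical, and the parser assumes only the comma-separated `key=value` form used in this workflow.

```python
def parse_sanitizer_options(options):
    """Parse a comma-separated key=value string into a dict (values stay strings)."""
    result = {}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        result[key] = value
    return result


def diff_options(before, after):
    """Return {key: (old, new)} for keys whose values differ; None means 'not set'."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


# Strings copied from the workflow: job-level env vs. "Setting up ASAN environment".
BUILD_OPTIONS = "verify_asan_link_order=0,detect_odr_violation=0,abort_on_error=0"
TEST_OPTIONS = "detect_odr_violation=0,abort_on_error=1,detect_stack_use_after_return=0"

changes = diff_options(parse_sanitizer_options(BUILD_OPTIONS),
                       parse_sanitizer_options(TEST_OPTIONS))
for key, (old, new) in changes.items():
    print(f"{key}: {old} -> {new}")
```

Read this way, the test phase turns `abort_on_error` on, adds `detect_stack_use_after_return=0`, and drops `verify_asan_link_order=0`, which is only needed while `libasan` is not preloaded.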

1 change: 1 addition & 0 deletions .gitignore
@@ -517,6 +517,7 @@ docs/_static
docs/_static/css/custom.css
docs/_templates
docs/man-openmpi/man3/bindings
docs/*.inv

# Common Python virtual environment and cache directory names
venv
2 changes: 1 addition & 1 deletion 3rd-party/prrte
Submodule prrte updated 58 files
+24 −0 .github/workflows/fork_sync_v3.0.yaml
+4 −2 .gitignore
+11 −4 config/prte_setup_pmix.m4
+2 −1 docs/Makefile.am
+1 −0 docs/index.rst
+293 −0 docs/launching-apps/gridengine.rst
+47 −0 docs/launching-apps/index.rst
+23 −0 docs/launching-apps/localhost.rst
+50 −0 docs/launching-apps/lsf.rst
+239 −0 docs/launching-apps/prerequisites.rst
+223 −0 docs/launching-apps/quickstart.rst
+11 −0 docs/launching-apps/scheduling.rst
+56 −0 docs/launching-apps/slurm.rst
+233 −0 docs/launching-apps/ssh.rst
+64 −0 docs/launching-apps/tm.rst
+167 −0 docs/launching-apps/troubleshooting.rst
+166 −0 docs/launching-apps/unusual.rst
+5 −0 docs/man/man1/ompi-prte_info.1.rst
+1 −1 examples/debugger/direct-multi.c
+2 −2 examples/debugger/direct.c
+13 −0 src/hwloc/help-prte-hwloc-base.txt
+2 −1 src/hwloc/hwloc-internal.h
+59 −48 src/hwloc/hwloc_base_util.c
+2 −2 src/mca/ess/base/base.h
+76 −65 src/mca/ess/base/ess_base_frame.c
+3 −1 src/mca/grpcomm/direct/grpcomm_direct.h
+2 −0 src/mca/grpcomm/direct/grpcomm_direct_component.c
+1 −2 src/mca/grpcomm/direct/grpcomm_direct_fence.c
+109 −2 src/mca/grpcomm/direct/grpcomm_direct_group.c
+178 −107 src/mca/plm/base/plm_base_launch_support.c
+5 −1 src/mca/plm/base/plm_base_receive.c
+3 −0 src/mca/plm/base/plm_private.h
+1 −1 src/mca/ras/simulator/ras_sim_module.c
+13 −0 src/mca/rmaps/base/help-prte-rmaps-base.txt
+6 −0 src/mca/rmaps/base/rmaps_base_binding.c
+25 −0 src/mca/rmaps/base/rmaps_base_map_job.c
+1 −1 src/mca/rmaps/base/rmaps_base_support_fns.c
+5 −0 src/mca/rmaps/ppr/rmaps_ppr.c
+11 −6 src/mca/rmaps/rank_file/rmaps_rank_file.c
+26 −3 src/mca/rmaps/round_robin/rmaps_rr_mappers.c
+5 −0 src/mca/rmaps/seq/rmaps_seq.c
+17 −0 src/mca/schizo/prte/help-prterun.txt
+4 −4 src/mca/schizo/prte/help-prun.txt
+10 −0 src/prted/pmix/pmix_server.c
+141 −0 src/prted/pmix/pmix_server_gen.c
+11 −0 src/prted/pmix/pmix_server_internal.h
+28 −1 src/prted/prte.c
+2 −2 src/prted/prun_common.c
+1 −1 src/rml/oob/oob_tcp.c
+7 −0 src/runtime/help-prte-runtime.txt
+11 −1 src/tools/prun/prun.c
+36 −4 src/util/dash_host/dash_host.c
+3 −1 src/util/dash_host/help-dash-host.txt
+3 −2 test/Makefile
+3 −3 test/double-get.c
+1 −1 test/get-nofence.c
+0 −119 test/ptrace/ptrace_spawn_stopped.cxx
+101 −0 test/spawn_timeout.c
51 changes: 47 additions & 4 deletions config/ompi_setup_prrte.m4
@@ -19,7 +19,7 @@ dnl Copyright (c) 2019-2020 Intel, Inc. All rights reserved.
dnl Copyright (c) 2020-2022 Amazon.com, Inc. or its affiliates. All Rights reserved.
dnl Copyright (c) 2021 Nanook Consulting. All rights reserved.
dnl Copyright (c) 2021-2022 IBM Corporation. All rights reserved.
dnl Copyright (c) 2023-2024 Jeffrey M. Squyres. All rights reserved.
dnl Copyright (c) 2023-2025 Jeffrey M. Squyres. All rights reserved.
dnl Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
dnl $COPYRIGHT$
dnl
@@ -39,7 +39,8 @@ dnl results of the build.
AC_DEFUN([OMPI_SETUP_PRRTE],[
AC_REQUIRE([AC_PROG_LN_S])

OPAL_VAR_SCOPE_PUSH([prrte_setup_internal_happy prrte_setup_external_happy target_rst_dir])
OPAL_VAR_SCOPE_PUSH([prrte_setup_internal_happy prrte_setup_external_happy target_rst_dir ompi_external_prrte_docs_url])
ompi_external_prrte_docs_url="https://docs.prrte.org/en/latest/"

opal_show_subtitle "Configuring PRRTE"

@@ -120,6 +121,8 @@ OPAL_VAR_SCOPE_PUSH([prrte_setup_internal_happy prrte_setup_external_happy targe

AC_SUBST(OMPI_PRRTE_RST_CONTENT_DIR)
AC_SUBST(OMPI_SCHIZO_OMPI_RST_CONTENT_DIR)
AC_SUBST(OMPI_PRRTE_DOCS_URL_BASE)
AC_SUBST(OMPI_USING_INTERNAL_PRRTE)
AM_CONDITIONAL(OMPI_HAVE_PRRTE_RST, [test $OMPI_HAVE_PRRTE_RST -eq 1])

AS_IF([test "$OMPI_USING_INTERNAL_PRRTE" = "1"],
@@ -250,8 +253,30 @@ AC_DEFUN([_OMPI_SETUP_PRRTE_INTERNAL], [
[OMPI_HAVE_PRRTE_RST=1
OMPI_PRRTE_RST_CONTENT_DIR="$OMPI_TOP_SRCDIR/3rd-party/prrte/src/docs/prrte-rst-content"
OMPI_SCHIZO_OMPI_RST_CONTENT_DIR="$OMPI_TOP_SRCDIR/3rd-party/prrte/src/mca/schizo/ompi"

# If we're building the OMPI Sphinx docs, and also
# building the internal PRRTE, then we're *also*
# building the internal PRRTE docs.
#
# In this case, the OMPI docs/conf.py will do a
# bunch of processing that is a lot easier to do in
# Python than Bourne shell (e.g., use the convenient
# os.path.relpath() to compute the relative path
# that we need, as well as dynamically create a
# Sphinx link inventory file). Hence, we skip doing
# all that work here and just set a sentinel value
OMPI_PRRTE_DOCS_URL_BASE="../../prrte/html"
AC_MSG_RESULT([found])],
[AC_MSG_RESULT([not found])])
[ # If we are not building the Sphinx docs, default
# to using the external PRRTE docs URL. This is
# actually moot because we won't be building the
# docs, but we might as well be complete in the
# logic / cases.
OMPI_PRRTE_DOCS_URL_BASE=$ompi_external_prrte_docs_url
AC_MSG_RESULT([not found])])

AC_MSG_CHECKING([for internal PRRTE docs link URL base])
AC_MSG_RESULT([$OMPI_PRRTE_DOCS_URL_BASE])
$1],
[$2])

@@ -273,7 +298,7 @@ dnl _OMPI_SETUP_PRRTE_EXTERNAL([action if success], [action if not success])
dnl
dnl Try to find an external prrte with sufficient version.
AC_DEFUN([_OMPI_SETUP_PRRTE_EXTERNAL], [
OPAL_VAR_SCOPE_PUSH([ompi_prte_min_version ompi_prte_min_num_version setup_prrte_external_happy opal_prrte_CPPFLAGS_save])
OPAL_VAR_SCOPE_PUSH([ompi_prte_min_version ompi_prte_min_num_version setup_prrte_external_happy opal_prrte_CPPFLAGS_save ompi_prrte_docdir])

opal_prrte_CPPFLAGS_save=$CPPFLAGS

@@ -321,6 +346,10 @@ AC_DEFUN([_OMPI_SETUP_PRRTE_EXTERNAL], [
[ # Determine if this external PRRTE has installed the RST
# directories that we care about

# In the external case, initially assume we'll use the
# web-based docs
OMPI_PRRTE_DOCS_URL_BASE=$ompi_external_prrte_docs_url

AC_MSG_CHECKING([for external PRRTE RST files])
prrte_install_dir=${with_prrte}/share/prte/rst
AS_IF([test -n "$SPHINX_BUILD"],
@@ -329,13 +358,27 @@ AC_DEFUN([_OMPI_SETUP_PRRTE_EXTERNAL], [
[OMPI_HAVE_PRRTE_RST=1
OMPI_PRRTE_RST_CONTENT_DIR="$prrte_install_dir/prrte-rst-content"
OMPI_SCHIZO_OMPI_RST_CONTENT_DIR="$prrte_install_dir/schizo-ompi-rst-content"
# If the external PRRTE docs dir exists where
# a simple heuristic thinks it should be
# (i.e., the default docdir location), use
# it. This will be an absolute path, which
# is fine (because we're building against an
# external PRRTE). If we don't find it,
# we'll fall back to the above-set HTTPS
# internet PRRTE docs URL.
ompi_prrte_docdir="$with_prrte/share/doc/prrte/html"
AS_IF([test -d "$ompi_prrte_docdir"],
[OMPI_PRRTE_DOCS_URL_BASE="$ompi_prrte_docdir"])
AC_MSG_RESULT([found])
],
[ # This version of PRRTE doesn't have installed RST
# files.
AC_MSG_RESULT([not found])
])
])

AC_MSG_CHECKING([for external PRRTE docs link URL base])
AC_MSG_RESULT([$OMPI_PRRTE_DOCS_URL_BASE])
$1],
[$2])
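
The comments in `_OMPI_SETUP_PRRTE_INTERNAL` say that `docs/conf.py` uses `os.path.relpath()` to compute the relative link from the Open MPI HTML docs to the internal PRRTE HTML docs. A minimal sketch of that computation follows. The directory layout and the helper name `relative_docs_link` are hypothetical, chosen only for illustration, and `posixpath` is used so the result does not depend on the host OS.

```python
import posixpath


def relative_docs_link(from_dir, to_dir):
    """Return the relative path that leads from from_dir to to_dir."""
    return posixpath.relpath(to_dir, start=from_dir)


# Hypothetical install layout; these paths are not taken from the PR.
ompi_html_dir = "/opt/openmpi/share/doc/openmpi/html"
prrte_html_dir = "/opt/openmpi/share/doc/prrte/html"

rel = relative_docs_link(ompi_html_dir, prrte_html_dir)
print(rel)  # ../../prrte/html
```

With this assumed layout the result matches the `../../prrte/html` value that configure sets, but a different `docdir` would give a different relative path, which is presumably why the real computation is deferred to Python.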

49 changes: 46 additions & 3 deletions config/opal_config_pmix.m4
@@ -21,6 +21,7 @@ dnl Copyright (c) 2020 Triad National Security, LLC. All rights
dnl reserved.
dnl Copyright (c) 2020-2022 Amazon.com, Inc. or its affiliates. All Rights reserved.
dnl Copyright (c) 2021 Nanook Consulting. All rights reserved.
dnl Copyright (c) 2025 Jeffrey M. Squyres. All rights reserved.
dnl $COPYRIGHT$
dnl
dnl Additional copyrights may follow
@@ -57,7 +58,8 @@ dnl other execution tests later in configure (there are sadly
dnl some) would fail if the path in LDFLAGS was not added to
dnl LD_LIBRARY_PATH.
AC_DEFUN([OPAL_CONFIG_PMIX], [
OPAL_VAR_SCOPE_PUSH([external_pmix_happy internal_pmix_happy internal_pmix_args internal_pmix_wrapper_libs internal_pmix_CPPFLAGS opal_pmix_STATIC_LDFLAGS opal_pmix_LIBS opal_pmix_STATIC_LIBS])
OPAL_VAR_SCOPE_PUSH([external_pmix_happy internal_pmix_happy internal_pmix_args internal_pmix_wrapper_libs internal_pmix_CPPFLAGS opal_pmix_STATIC_LDFLAGS opal_pmix_LIBS opal_pmix_STATIC_LIBS opal_external_pmix_docs_url])
opal_external_pmix_docs_url="https://docs.openpmix.org/en/latest/"

opal_show_subtitle "Configuring PMIx"

@@ -154,6 +156,8 @@ AC_DEFUN([OPAL_CONFIG_PMIX], [
AC_DEFINE_UNQUOTED([OPAL_USING_INTERNAL_PMIX],
[$OPAL_USING_INTERNAL_PMIX],
[Whether or not we are using the internal PMIx])
AC_SUBST(OPAL_PMIX_DOCS_URL_BASE)
AC_SUBST(OPAL_USING_INTERNAL_PMIX)

OPAL_SUMMARY_ADD([Miscellaneous], [pmix], [], [$opal_pmix_mode])

@@ -216,8 +220,22 @@ AC_DEFUN([_OPAL_CONFIG_PMIX_EXTERNAL], [
dnl it will screw up other tests (like the pthread tests)
opal_pmix_BUILD_LIBS="${opal_pmix_LIBS}"

# If the external PMIx docs dir exists where
# a simple heuristic thinks it should be
# (i.e., the default docdir location), use
# it. This will be an absolute path, which
# is fine (because we're building against an
# external PMIx). If we don't find it,
# we'll fall back to the HTTPS internet PMIx
# docs URL.
opal_pmix_docdir="$with_pmix/share/doc/pmix/html"
AS_IF([test -d "$opal_pmix_docdir"],
[OPAL_PMIX_DOCS_URL_BASE="$opal_pmix_docdir"],
[OPAL_PMIX_DOCS_URL_BASE=$opal_external_pmix_docs_url])

$1],
[$2])])
[$2])
])

OPAL_VAR_SCOPE_POP
])
@@ -238,7 +256,7 @@ AC_DEFUN([_OPAL_CONFIG_PMIX_INTERNAL_POST], [

pmix_internal_happy=1

dnl Don't pull LDFLAGS, because we don't have a good way to avoid
dnl Do not pull LDFLAGS, because we don't have a good way to avoid
dnl a -L to our install directory, which can cause some weirdness
dnl if there's an old OMPI install there. And it makes filtering
dnl redundant flags easier.
@@ -279,6 +297,31 @@ AC_DEFUN([_OPAL_CONFIG_PMIX_INTERNAL_POST], [

opal_pmix_BUILD_LIBS="$OMPI_TOP_BUILDDIR/3rd-party/openpmix/src/libpmix.la"

AS_IF([test -n "$SPHINX_BUILD"],
[ # If we're building the OMPI Sphinx docs, and also
# building the internal PMIx, then we're *also*
# building the internal PMIx docs.
#
# In this case, the OMPI docs/conf.py will do a
# bunch of processing that is a lot easier to do in
# Python than Bourne shell (e.g., use the convenient
# os.path.relpath() to compute the relative path
# that we need, as well as dynamically create a
# Sphinx link inventory file). Hence, we skip doing
# all that work here and just set a sentinel value
OPAL_PMIX_DOCS_URL_BASE="../../pmix/html"
AC_MSG_RESULT([found])],
[ # If we are not building the Sphinx docs, default
# to using the external PMIx docs URL. This is
# actually moot because we won't be building the
# docs, but we might as well be complete in the
# logic / cases.
OPAL_PMIX_DOCS_URL_BASE=$opal_external_pmix_docs_url
AC_MSG_RESULT([not found])])

AC_MSG_CHECKING([for internal PMIx docs link URL base])
AC_MSG_RESULT([$OPAL_PMIX_DOCS_URL_BASE])

OPAL_3RDPARTY_SUBDIRS="$OPAL_3RDPARTY_SUBDIRS openpmix"
])
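
The PMIx hunks above pick `OPAL_PMIX_DOCS_URL_BASE` from one of three sources: a placeholder relative path (internal PMIx with Sphinx available), a local docs directory under the external PMIx prefix, or the public docs URL. The Python sketch below restates that decision for readability only; it is not code from the PR. The function name and parameters are hypothetical, `sphinx_build` stands in for `$SPHINX_BUILD`, and `with_pmix` for `$with_pmix`.

```python
import os
import tempfile

EXTERNAL_PMIX_DOCS_URL = "https://docs.openpmix.org/en/latest/"
INTERNAL_SENTINEL = "../../pmix/html"  # placeholder that docs/conf.py is expected to refine


def pmix_docs_url_base(using_internal, sphinx_build="", with_pmix=""):
    """Mirror the configure-time choice of the PMIx docs link base (illustrative)."""
    if using_internal:
        # Internal PMIx: local docs are only built when Sphinx is available.
        return INTERNAL_SENTINEL if sphinx_build else EXTERNAL_PMIX_DOCS_URL
    # External PMIx: prefer installed HTML docs in the default docdir location.
    docdir = os.path.join(with_pmix, "share", "doc", "pmix", "html")
    if os.path.isdir(docdir):
        return docdir
    return EXTERNAL_PMIX_DOCS_URL


with tempfile.TemporaryDirectory() as prefix:
    # No docs installed under the prefix yet: falls back to the public URL.
    print(pmix_docs_url_base(False, with_pmix=prefix))
    os.makedirs(os.path.join(prefix, "share", "doc", "pmix", "html"))
    # Docs directory now exists: the absolute local path is used instead.
    print(pmix_docs_url_base(False, with_pmix=prefix))
```

Note that in the external case the m4 code tests only for the directory's existence, so an incomplete or stale docs install would still be selected.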
