Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 17 additions & 7 deletions .github/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -2,26 +2,35 @@ ARG BASE_IMAGE
FROM ${BASE_IMAGE}

ARG TARGET
ARG INTERFACE
ARG CC_COMPILER
ARG CXX_COMPILER
ARG FC_COMPILER
ARG COMPILER_PATH
ARG COMPILER_LD_LIBRARY_PATH

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y software-properties-common wget && \
add-apt-repository ppa:deadsnakes/ppa
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The separate RUN that installs software-properties-common and calls add-apt-repository leaves apt lists in that layer (increasing image size); combine the repository add/update and cleanup in the same RUN so package lists are removed and not persisted in a previous layer. [resource leak]

Severity Level: Minor ⚠️

Suggested change
add-apt-repository ppa:deadsnakes/ppa
add-apt-repository ppa:deadsnakes/ppa && \
apt-get update -y && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Why it matters? ⭐

The separate RUN that adds the PPA leaves apt lists in that layer increasing
image size. Cleaning up apt lists in the same RUN (or combining with the
subsequent package-install RUN) prevents persisting package metadata into
an earlier layer and reduces final image size. The suggested change is a
valid optimization and doesn't alter runtime behavior.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** .github/Dockerfile
**Line:** 15:15
**Comment:**
	*Resource Leak: The separate RUN that installs `software-properties-common` and calls `add-apt-repository` leaves apt lists in that layer (increasing image size); combine the repository add/update and cleanup in the same RUN so package lists are removed and not persisted in a previous layer.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.


RUN apt-get update -y && \
if [ "$TARGET" != "gpu" ]; then \
apt-get install -y \
build-essential git make cmake gcc g++ gfortran bc\
python3 python3-venv python3-pip \
build-essential git make cmake gcc g++ gfortran bc \
python3.14 python3.14-venv \
openmpi-bin libopenmpi-dev libfftw3-dev \
mpich libmpich-dev; \
else \
apt-get install -y \
build-essential git make cmake bc\
python3 python3-venv python3-pip \
build-essential git make cmake bc \
python3.14 python3.14-venv \
libfftw3-dev \
openmpi-bin libopenmpi-dev; \
fi && \
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.14 1 && \
update-alternatives --set python3 /usr/bin/python3.14 && \
python3 --version && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Comment on lines +12 to 34
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Combine the two RUN instructions for apt-get operations into a single one to reduce Docker image layers and optimize image size. [general, importance: 6]

Suggested change
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y software-properties-common wget && \
add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update -y && \
if [ "$TARGET" != "gpu" ]; then \
apt-get install -y \
build-essential git make cmake gcc g++ gfortran bc\
python3 python3-venv python3-pip \
build-essential git make cmake gcc g++ gfortran bc \
python3.14 python3.14-venv \
openmpi-bin libopenmpi-dev libfftw3-dev \
mpich libmpich-dev; \
else \
apt-get install -y \
build-essential git make cmake bc\
python3 python3-venv python3-pip \
build-essential git make cmake bc \
python3.14 python3.14-venv \
libfftw3-dev \
openmpi-bin libopenmpi-dev; \
fi && \
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.14 1 && \
update-alternatives --set python3 /usr/bin/python3.14 && \
python3 --version && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y software-properties-common wget && \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get update -y && \
if [ "$TARGET" != "gpu" ]; then \
apt-get install -y \
build-essential git make cmake gcc g++ gfortran bc \
python3.14 python3.14-venv \
openmpi-bin libopenmpi-dev libfftw3-dev \
mpich libmpich-dev; \
else \
apt-get install -y \
build-essential git make cmake bc \
python3.14 python3.14-venv \
libfftw3-dev \
openmpi-bin libopenmpi-dev; \
fi && \
update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.14 1 && \
update-alternatives --set python3 /usr/bin/python3.14 && \
python3 --version && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*


ENV OMPI_ALLOW_RUN_AS_ROOT=1
Expand All @@ -30,6 +39,7 @@ ENV PATH="/opt/MFC:$PATH"

COPY ../ /opt/MFC

ENV INTERFACE=${INTERFACE}
ENV CC=${CC_COMPILER}
ENV CXX=${CXX_COMPILER}
ENV FC=${FC_COMPILER}
Expand All @@ -39,17 +49,17 @@ ENV LD_LIBRARY_PATH="${COMPILER_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH:-}"
RUN echo "TARGET=$TARGET CC=$CC_COMPILER FC=$FC_COMPILER" && \
cd /opt/MFC && \
if [ "$TARGET" = "gpu" ]; then \
./mfc.sh build --gpu -j $(nproc); \
./mfc.sh build --gpu $INTERFACE -j $(nproc); \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Unquoted shell variable expansion for INTERFACE in the build command can cause word-splitting or command injection if the argument contains spaces or shell metacharacters; quote the variable to ensure it's treated as a single argument. [security]

Severity Level: Critical 🚨

Suggested change
./mfc.sh build --gpu $INTERFACE -j $(nproc); \
./mfc.sh build --gpu "$INTERFACE" -j $(nproc); \
Why it matters? ⭐

Quoting "$INTERFACE" in the build command prevents word-splitting and
limits accidental/intentional argument injection when the variable
contains spaces or shell metacharacters. This is a straightforward,
correct hardening for shell invocations and doesn't change logic.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** .github/Dockerfile
**Line:** 52:52
**Comment:**
	*Security: Unquoted shell variable expansion for `INTERFACE` in the build command can cause word-splitting or command injection if the argument contains spaces or shell metacharacters; quote the variable to ensure it's treated as a single argument.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

else \
./mfc.sh build -j $(nproc); \
fi

RUN cd /opt/MFC && \
if [ "$TARGET" = "gpu" ]; then \
./mfc.sh test -a --dry-run --gpu -j $(nproc); \
./mfc.sh test -a --dry-run --gpu $INTERFACE -j $(nproc); \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Unquoted shell variable expansion for INTERFACE in the test command can cause word-splitting or command injection; quote the variable to pass it safely as a single argument to the script. [security]

Severity Level: Critical 🚨

Suggested change
./mfc.sh test -a --dry-run --gpu $INTERFACE -j $(nproc); \
./mfc.sh test -a --dry-run --gpu "$INTERFACE" -j $(nproc); \
Why it matters? ⭐

Same as the build invocation: quoting "$INTERFACE" ensures the test command
receives a single, literal argument and avoids subtle failures or command
injection vectors. It's a minimal, correct security improvement.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** .github/Dockerfile
**Line:** 59:59
**Comment:**
	*Security: Unquoted shell variable expansion for `INTERFACE` in the test command can cause word-splitting or command injection; quote the variable to pass it safely as a single argument to the script.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

else \
./mfc.sh test -a --dry-run -j $(nproc); \
fi

WORKDIR /opt/MFC
ENTRYPOINT ["tail", "-f", "/dev/null"]
ENTRYPOINT ["tail", "-f", "/dev/null"]
41 changes: 41 additions & 0 deletions .github/docker_readme.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
<p align="center">
<a href="http://mflowcode.github.io/">
<img src="https://mflowcode.github.io/res/banner.png" alt="MFC Banner" width="500"/>
</a>
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Missing closing </p> tag for the paragraph element opened at line 29. This unclosed HTML tag could cause rendering issues on Docker Hub. Add </p> after the closing </a> tag and before the blank lines.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/docker_readme.md, line 4:

<comment>Missing closing `&lt;/p&gt;` tag for the paragraph element opened at line 29. This unclosed HTML tag could cause rendering issues on Docker Hub. Add `&lt;/p&gt;` after the closing `&lt;/a&gt;` tag and before the blank lines.</comment>

<file context>
@@ -0,0 +1,41 @@
+&lt;p align=&quot;center&quot;&gt;
+  &lt;a href=&quot;http://mflowcode.github.io/&quot;&gt;
+    &lt;img src=&quot;https://mflowcode.github.io/res/banner.png&quot; alt=&quot;MFC Banner&quot; width=&quot;500&quot;/&gt;
+  &lt;/a&gt;
+&lt;/p&gt;
+
</file context>
Fix with Cubic

</p>


<p align="center">
<a href="https://github.com/MFlowCode/MFC/actions">
<img src="https://img.shields.io/github/actions/workflow/status/mflowcode/mfc/test.yml?style=flat&label=Tests&color=slateblue%09"/>
</a>
<a href="https://github.com/MFlowCode/MFC/blob/master/.github/CONTRIBUTING.md">
<img src="https://img.shields.io/github/contributors-anon/mflowcode/mfc?style=flat&color=darkslategrey%09" />
</a>
<a href="https://join.slack.com/t/mflowcode/shared_invite/zt-y75wibvk-g~zztjknjYkK1hFgCuJxVw">
<img src="https://img.shields.io/badge/slack-MFC-purple.svg?logo=slack" />
</a>
<a href="https://lbesson.mit-license.org/">
<img src="https://img.shields.io/badge/License-MIT-crimson.svg" />
</a>
<a href="https://codecov.io/github/MFlowCode/MFC" target="_blank">
<img src="https://codecov.io/github/MFlowCode/MFC/graph/badge.svg?token=8SY043QND4">
</a>
<a href="https://github.com/MFlowCode/MFC/blob/master/.github/CONTRIBUTING.md" target="_blank">
<img src="https://img.shields.io/badge/Contributing-Guide-orange?style=flat">
</a>
</p>


<p align="center">
<a href="https://doi.org/10.48550/arXiv.2503.07953" target="_blank">
<img src="https://img.shields.io/badge/DOI-10.48550/arXiv.2503.07953-thistle.svg"/>
</a>
<a href="https://github.com/MFlowCode/MFC/stargazers" target="_blank">
<img src="https://img.shields.io/github/stars/MFlowCode/MFC?style=flat&color=maroon"/>
</a>


### **References:** [Official Documentation](https://mflowcode.github.io/), [GitHub Repository](https://github.com/MFlowCode/MFC).

Please see the MFC documentation on [Docker use](https://mflowcode.github.io/documentation/md_docker.html) for more details.
25 changes: 25 additions & 0 deletions .github/workflows/docker-readme.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
name: Docker Readme Update

on:
pull_request:
types: [closed]
paths:
- '.github/docker_readme.md'

jobs:
Container:
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Missing merge check: This workflow will run for all closed PRs, not just merged ones. Add a condition to ensure it only runs when the PR is actually merged.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/workflows/docker-readme.yml, line 10:

<comment>Missing merge check: This workflow will run for all closed PRs, not just merged ones. Add a condition to ensure it only runs when the PR is actually merged.</comment>

<file context>
@@ -0,0 +1,24 @@
+      - &#39;.github/docker_readme.md&#39;
+
+jobs:
+  Container:
+    runs-on: ubuntu-latest
+    steps:
</file context>

✅ Addressed in 4c9a650

if: github.event.pull_request.merged == true
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
with:
path: pr

- name: Docker Hub Description
uses: peter-evans/dockerhub-description@v5
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_PASSWORD }}
repository: ${{ secrets.DOCKERHUB_USERNAME }}/mfc
readme-filepath: pr/.github/docker_readme.md
44 changes: 44 additions & 0 deletions .github/workflows/docker-test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
name: 'Test Docker Containers'

on:
schedule:
- cron: '0 0 * * 5' # This runs every Friday at midnight UTC

jobs:
self:
name: "${{ matrix.cluster_name }} (${{ matrix.device }}${{ matrix.interface != 'none' && format('-{0}', matrix.interface) || '' }})"
continue-on-error: false
timeout-minutes: 480
strategy:
matrix:
include:
- lbl: 'gt'
cluster_name: 'Georgia Tech | Phoenix'
device: 'gpu'
interface: 'acc'
- lbl: 'gt'
cluster_name: 'Georgia Tech | Phoenix'
device: 'gpu'
interface: 'mp'
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Interface value mismatch: workflow passes mp but container.sh checks for omp. This will cause the OpenMP-specific logic to be skipped, resulting in incorrect container selection and build options.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/workflows/docker-test.yml, line 22:

<comment>Interface value mismatch: workflow passes `mp` but `container.sh` checks for `omp`. This will cause the OpenMP-specific logic to be skipped, resulting in incorrect container selection and build options.</comment>

<file context>
@@ -0,0 +1,44 @@
+          - lbl: &#39;gt&#39;
+            cluster_name: &#39;Georgia Tech | Phoenix&#39;
+            device: &#39;gpu&#39;
+            interface: &#39;mp&#39;
+          - lbl: &#39;gt&#39;
+            cluster_name: &#39;Georgia Tech | Phoenix&#39;
</file context>
Fix with Cubic

- lbl: 'gt'
cluster_name: 'Georgia Tech | Phoenix'
device: 'cpu'
interface: 'none'
runs-on:
group: phoenix
labels: ${{ matrix.lbl }}
env:
NODE_OPTIONS: ${{ matrix.lbl == 'gt' && '--max-old-space-size=2048' || '' }}
ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION: node16
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
steps:
- name: Clone
uses: actions/checkout@v4

- name: Pull & Test
if: matrix.lbl == 'gt'
run: bash .github/workflows/phoenix/submit.sh .github/workflows/phoenix/container.sh ${{ matrix.device }} ${{ matrix.interface }}

- name: Print Logs
if: always()
run: cat container-${{ matrix.device }}-${{ matrix.interface }}.out
28 changes: 21 additions & 7 deletions .github/workflows/docker.yml
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,12 @@ jobs:
strategy:
matrix:
config:
- { name: 'cpu', runner: 'ubuntu-22.04', base_image: 'ubuntu:22.04' }
- { name: 'gpu', runner: 'ubuntu-22.04', base_image: 'nvcr.io/nvidia/nvhpc:23.11-devel-cuda_multi-ubuntu22.04' }
- { name: 'gpu', runner: 'ubuntu-22.04-arm', base_image: 'nvcr.io/nvidia/nvhpc:23.11-devel-cuda_multi-ubuntu22.04' }
- { name: 'cpu', interface: 'none', runner: 'ubuntu-22.04', base_image: 'ubuntu:22.04' }
- { name: 'gpu', interface: 'acc', runner: 'ubuntu-22.04', base_image: 'nvcr.io/nvidia/nvhpc:23.11-devel-cuda_multi-ubuntu22.04' }
- { name: 'gpu', interface: 'acc', runner: 'ubuntu-22.04-arm', base_image: 'nvcr.io/nvidia/nvhpc:23.11-devel-cuda_multi-ubuntu22.04' }
- { name: 'gpu', interface: 'mp', runner: 'ubuntu-22.04', base_image: 'nvcr.io/nvidia/nvhpc:23.11-devel-cuda_multi-ubuntu22.04' }
- { name: 'gpu', interface: 'mp', runner: 'ubuntu-22.04-arm', base_image: 'nvcr.io/nvidia/nvhpc:23.11-devel-cuda_multi-ubuntu22.04' }

runs-on: ${{ matrix.config.runner }}
outputs:
tag: ${{ steps.clone.outputs.tag }}
Expand Down Expand Up @@ -84,6 +87,7 @@ jobs:
build-args: |
BASE_IMAGE=${{ matrix.config.base_image }}
TARGET=${{ matrix.config.name }}
INTERFACE=${{ matrix.config.interface }}
CC_COMPILER=${{ 'gcc' }}
CXX_COMPILER=${{ 'g++' }}
FC_COMPILER=${{ 'gfortran' }}
Expand All @@ -102,12 +106,13 @@ jobs:
build-args: |
BASE_IMAGE=${{ matrix.config.base_image }}
TARGET=${{ matrix.config.name }}
INTERFACE=${{ matrix.config.interface }}
CC_COMPILER=${{ 'nvc' }}
CXX_COMPILER=${{ 'nvc++' }}
FC_COMPILER=${{ 'nvfortran' }}
COMPILER_PATH=${{ '/opt/nvidia/hpc_sdk/Linux_x86_64/compilers/bin' }}
COMPILER_LD_LIBRARY_PATH=${{ '/opt/nvidia/hpc_sdk/Linux_x86_64/compilers/lib' }}
tags: ${{ secrets.DOCKERHUB_USERNAME }}/mfc:${{ env.TAG }}-${{ matrix.config.name }}-${{ matrix.config.runner}}
tags: ${{ secrets.DOCKERHUB_USERNAME }}/mfc:${{ env.TAG }}-${{ matrix.config.name }}-${{ matrix.config.interface }}-${{ matrix.config.runner }}
push: true

manifests:
Expand All @@ -125,8 +130,17 @@ jobs:
TAG: ${{ needs.Container.outputs.tag }}
REGISTRY: ${{ secrets.DOCKERHUB_USERNAME }}/mfc
run: |
# CPU Manifest
docker buildx imagetools create -t $REGISTRY:latest-cpu $REGISTRY:$TAG-cpu
docker manifest create $REGISTRY:$TAG-gpu $REGISTRY:$TAG-gpu-ubuntu-22.04 $REGISTRY:$TAG-gpu-ubuntu-22.04-arm
docker manifest create $REGISTRY:latest-gpu $REGISTRY:$TAG-gpu-ubuntu-22.04 $REGISTRY:$TAG-gpu-ubuntu-22.04-arm

# GPU Manifest (ACC)
docker manifest create $REGISTRY:$TAG-gpu $REGISTRY:$TAG-gpu-acc-ubuntu-22.04 $REGISTRY:$TAG-gpu-acc-ubuntu-22.04-arm
docker manifest create $REGISTRY:latest-gpu $REGISTRY:$TAG-gpu-acc-ubuntu-22.04 $REGISTRY:$TAG-gpu-acc-ubuntu-22.04-arm
docker manifest push $REGISTRY:$TAG-gpu
docker manifest push $REGISTRY:latest-gpu
docker manifest push $REGISTRY:latest-gpu

# GPU Manifest (OMP)
docker manifest create $REGISTRY:$TAG-mp-gpu $REGISTRY:$TAG-gpu-mp-ubuntu-22.04 $REGISTRY:$TAG-gpu-mp-ubuntu-22.04-arm
docker manifest create $REGISTRY:latest-mp-gpu $REGISTRY:$TAG-gpu-mp-ubuntu-22.04 $REGISTRY:$TAG-gpu-mp-ubuntu-22.04-arm
docker manifest push $REGISTRY:$TAG-mp-gpu
docker manifest push $REGISTRY:latest-mp-gpu
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Typo: Double hyphen -- in manifest push command. The manifest was created as $TAG-mp-gpu but the push references $TAG--mp-gpu, which will cause the workflow to fail.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/workflows/docker.yml, line 146:

<comment>Typo: Double hyphen `--` in manifest push command. The manifest was created as `$TAG-mp-gpu` but the push references `$TAG--mp-gpu`, which will cause the workflow to fail.</comment>

<file context>
@@ -125,8 +130,17 @@ jobs:
+          docker manifest create $REGISTRY:$TAG-mp-gpu $REGISTRY:$TAG-gpu-mp-ubuntu-22.04 $REGISTRY:$TAG-gpu-mp-ubuntu-22.04-arm
+          docker manifest create $REGISTRY:latest-mp-gpu $REGISTRY:$TAG-gpu-mp-ubuntu-22.04 $REGISTRY:$TAG-gpu-mp-ubuntu-22.04-arm
+          docker manifest push $REGISTRY:$TAG--mp-gpu
+          docker manifest push $REGISTRY:latest-mp-gpu
</file context>

✅ Addressed in 4c9a650

45 changes: 45 additions & 0 deletions .github/workflows/phoenix/container.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
#!/bin/bash

if [ "$job_interface" = "mp" ]; then
apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-mp-$job_device
else
apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-$job_device
fi

CONTAINER="mfc:latest-$job_device.sif"

NV_FLAG=""
[ "$job_device" = "gpu" ] && NV_FLAG="--nv"

# unset LD_PRELOAD

apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
--bind "$SCRATCH_DIR":/scratch \
--env job_slug="$job_slug" \
--env job_device="$job_device" \
"$CONTAINER" \
bash -c 'cd /opt/MFC &&

build_opts=""
if [ "$job_device" = "gpu" ]; then
build_opts="--gpu"
if [ "$job_interface" = "mp" ]; then
build_opts+=" mp"
elif [ "$job_interface" = "acc" ]; then
build_opts+=" acc"
fi
fi

./mfc.sh test --dry-run -j 8 $build_opts

n_test_threads=8

if [ "$job_device" = "gpu" ]; then
gpu_count=$(nvidia-smi -L | wc -l) # number of GPUs on node
gpu_ids=$(seq -s " " 0 $(($gpu_count-1))) # 0,1,2,...,gpu_count-1
device_opts="-g $gpu_ids"
Comment on lines +38 to +40
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: If nvidia-smi is missing or returns 0 GPUs, the current arithmetic and seq usage can produce invalid ranges or empty values and cause runtime errors; validate gpu_count is a positive integer before using it (and fall back to a safe default), and use shell arithmetic $((...)) instead of legacy expr for clarity. [possible bug]

Severity Level: Critical 🚨

Suggested change
gpu_count=$(nvidia-smi -L | wc -l) # number of GPUs on node
gpu_ids=$(seq -s " " 0 $(($gpu_count-1))) # 0,1,2,...,gpu_count-1
device_opts="-g $gpu_ids"
gpu_count=$(nvidia-smi -L 2>/dev/null | wc -l)
if [ -z "$gpu_count" ] || [ "$gpu_count" -le 0 ]; then
gpu_count=1
fi
gpu_ids=$(seq -s " " 0 $((gpu_count-1)))
device_opts="-g $gpu_ids"
n_test_threads=$((gpu_count * 2))
Why it matters? ⭐

Valid and useful. If nvidia-smi is missing or reports zero, the existing code can produce bad seq ranges or empty GPU lists. The suggested checks and switching to POSIX arithmetic reduce failures and make intent clearer.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** .github/workflows/phoenix/container.sh
**Line:** 38:40
**Comment:**
	*Possible Bug: If `nvidia-smi` is missing or returns 0 GPUs, the current arithmetic and `seq` usage can produce invalid ranges or empty values and cause runtime errors; validate `gpu_count` is a positive integer before using it (and fall back to a safe default), and use shell arithmetic `$((...))` instead of legacy `expr` for clarity.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

n_test_threads=`expr $gpu_count \* 2`
fi

./mfc.sh test --no-build --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix
'
10 changes: 7 additions & 3 deletions docs/documentation/docker.md
Original file line number Diff line number Diff line change
Expand Up @@ -148,20 +148,24 @@ In the above,
## Tag Details

### Base Images
- CPU images (v4.3.0-latest releases) are built on **Ubuntu 22.04**.
- GPU images (v4.3.0-latest releases) are built on **NVHPC SDK 23.11 (CUDA 12.3) & Ubuntu 22.04**.
- CPU images (v4.3.0-latest release) are built on **Ubuntu 22.04**.
- GPU images (v4.3.0-latest release) are built on **NVHPC SDK 23.11 (CUDA 12.3) & Ubuntu 22.04**.
- OpenACC images (v4.3.0-latest release).
- OpenMP images (v5.1.0-latest release).

### Tag Structure
- **`vx.x.x`** - Official MFC release versions (recommended: use `latest` release)
- **`cpu/gpu`** - Build configurations for CPU or GPU acceleration.
- **`acc/mp`** - Build configurations for OpenACC or OpenMP parallelization.
- **`ubuntu-xx.xx`** - Base Ubuntu version (standard = `amd64`, `-arm` = `arm64`)

### Example Tags

```shell
mfc:latest-xxx # Latest version (amd64 & arm64)
mfc:vx.x.x-cpu # CPU version (amd64 & arm64)
mfc:vx.x.x-gpu # GPU version (amd64 & arm64)
mfc:vx.x.x-gpu # OpenACC-supported GPU version (amd64 & arm64)
mfc:vx.x.x-mp-gpu # OpenMP-supported GPU version (amd64 & arm64)
mfc:vx.x.x-xxx-ubuntu-xx.xx # amd64 natively-supported version
mfc:vx.x.x-xxx-ubuntu-xx.xx-arm # arm64 natively-supported version
```
Expand Down
Loading