fix: avoid CGO getgrgid_r segfault in static Linux binaries#433
fix: avoid CGO getgrgid_r segfault in static Linux binaries#433nancysangani wants to merge 1 commit intomodelpack:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical segfault issue encountered in statically linked Linux binaries when using Highlights
Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request introduces a custom buildTarHeader function to avoid CGO-related segfaults, which is a solid approach. My review focuses on improving this new function to correctly handle different file types, such as symbolic links and directories, which were not fully addressed in the current implementation. I've suggested changes to buildTarHeader and its call sites to fix these issues and make the archiving logic more robust. The other changes in the PR look good.
| func buildTarHeader(info os.FileInfo) (*tar.Header, error) { | ||
| header := &tar.Header{ | ||
| Name: info.Name(), | ||
| Size: info.Size(), | ||
| Mode: int64(info.Mode()), | ||
| ModTime: info.ModTime(), | ||
| } | ||
|
|
||
| // Set file type flag. | ||
| if info.IsDir() { | ||
| header.Typeflag = tar.TypeDir | ||
| } else { | ||
| header.Typeflag = tar.TypeReg | ||
| } | ||
|
|
||
| // Safely extract UID/GID from syscall.Stat_t without CGO user/group name lookup. | ||
| // We intentionally leave Uname/Gname empty to avoid os/user CGO calls entirely. | ||
| if stat, ok := info.Sys().(*syscall.Stat_t); ok { | ||
| header.Uid = int(stat.Uid) | ||
| header.Gid = int(stat.Gid) | ||
| } | ||
|
|
||
| return header, nil | ||
| } |
There was a problem hiding this comment.
The new buildTarHeader function is a good approach to avoid CGO issues. However, it has a couple of issues that represent a regression from using tar.FileInfoHeader:
- It doesn't handle symbolic links; they are incorrectly treated as regular files.
- The size for directories is not set to 0.
info.Size()for a directory is system-dependent and typically non-zero, but tar headers for directories should have a size of 0.
To address this and make the function more robust, I suggest modifying it to handle symlinks correctly and set sizes properly for different file types. This will require changing the function signature to accept the file path, which is needed to read a symlink's target. You will also need to update the call sites accordingly (see my other comments).
func buildTarHeader(path string, info os.FileInfo) (*tar.Header, error) {
header := &tar.Header{
Name: info.Name(),
Mode: int64(info.Mode()),
ModTime: info.ModTime(),
}
if info.Mode()&os.ModeSymlink != 0 {
header.Typeflag = tar.TypeSymlink
link, err := os.Readlink(path)
if err != nil {
return nil, fmt.Errorf("failed to read symlink %s: %w", path, err)
}
header.Linkname = link
} else if info.IsDir() {
header.Typeflag = tar.TypeDir
} else {
header.Typeflag = tar.TypeReg
header.Size = info.Size()
}
// Safely extract UID/GID from syscall.Stat_t without CGO user/group name lookup.
// We intentionally leave Uname/Gname empty to avoid os/user CGO calls entirely.
if stat, ok := info.Sys().(*syscall.Stat_t); ok {
header.Uid = int(stat.Uid)
header.Gid = int(stat.Gid)
}
return header, nil
}| } | ||
|
|
||
| header, err := tar.FileInfoHeader(info, "") | ||
| header, err := buildTarHeader(info) |
| defer file.Close() | ||
|
|
||
| header, err := tar.FileInfoHeader(info, "") | ||
| header, err := buildTarHeader(info) |
Signed-off-by: Nancy <9d.24.nancy.sangani@gmail.com>
0381381 to
96831b5
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical segfault issue in statically-linked Linux binaries caused by CGO-based NSS function calls (getgrgid_r) when creating tar archives. The segfault occurs on systems like Fedora 41 when running modctl build with prebuilt binaries.
Changes:
- Implemented custom
buildTarHeaderfunction to avoid CGO-based user/group lookups intar.FileInfoHeader - Added
osusergoandnetgobuild tags to Linux release workflow to force pure-Go implementations - Fixed error messages in
Untarto correctly reference "file" instead of "directory"
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| pkg/archiver/archiver.go | Added custom buildTarHeader function that extracts UID/GID from syscall.Stat_t without CGO user/group name lookups, replacing all tar.FileInfoHeader calls |
| .github/workflows/release.yaml | Added osusergo and netgo build tags to Linux builds to prevent CGO-based NSS function usage in static binaries |
Comments suppressed due to low confidence (3)
pkg/archiver/archiver.go:47
- For directories, the Size field should be set to 0 after setting the Typeflag to tar.TypeDir. The standard tar.FileInfoHeader function zeros out the size for directories. Currently, info.Size() returns a non-zero value for directories on some filesystems, which can lead to incorrect tar archives.
if info.IsDir() {
header.Typeflag = tar.TypeDir
} else {
header.Typeflag = tar.TypeReg
}
pkg/archiver/archiver.go:38
- The Mode field should use info.Mode().Perm() instead of int64(info.Mode()) to extract only the permission bits. The current implementation includes file type bits in the mode, which can cause issues during extraction. The standard tar.FileInfoHeader function uses mode.Perm() to extract only the permission bits (0777 for Unix).
Mode: int64(info.Mode()),
.github/workflows/release.yaml:158
- This line appears to have no actual changes (just whitespace). If this is unintentional, it can be removed from the PR.
GITHUB_TOKEN: ${{ secrets.RELEASE_TOKEN }}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fixes #285
The prebuilt Linux amd64 binaries crash with a segfault on Fedora 41 when running
modctl build.The issue is caused by statically linking glibc with CGO enabled, which breaks NSS functions (
getgrgid_r) used indirectly byos/user.Fix
tar.FileInfoHeaderusage with a custom header builder that reads UID/GID directly fromsyscall.Stat_t, avoiding CGO user/group lookups.osusergoandnetgobuild tags in the Linux release workflow to force pure-Go implementations.Testing
go test ./...passes (existingTestGitfailure is unrelated).Screenshots