Skip overlong-UTF-8 dirents in getdents64 #97

Merged: jserv merged 1 commit into main from getdents64 on Jun 13, 2026

jserv (Contributor) commented Jun 13, 2026

sys_getdents64 aborted the whole directory stream the first time path_translate_dirent_name reported ENAMETOOLONG. On macOS APFS, a filename's byte length can exceed Linux NAME_MAX (255 bytes) because the per-component limit is counted in Unicode characters, not bytes; a single 89-CJK-character entry, at 3 bytes per character in UTF-8, is 267 bytes and already crosses 255. Before this fix, any APFS source tree containing one such name caused the ls / find / coreutils listings delivered to the guest to be truncated.
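The byte-versus-character arithmetic can be checked with a small sketch. Everything here is illustrative and not taken from elfuse: `GUEST_NAME_MAX`, `utf8_codepoints`, `name_exceeds_guest_limit`, and `fill_cjk` are hypothetical names, and U+4E2D is just one example of a CJK character that encodes to 3 bytes in UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: mirrors Linux NAME_MAX, which is measured in bytes. */
#define GUEST_NAME_MAX 255

/* Count code points in a valid UTF-8 string by counting every byte
 * that is not a continuation byte (continuation bytes are 10xxxxxx). */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Return 1 if the name's byte length is more than a Linux dirent
 * name field can carry, 0 otherwise. */
static int name_exceeds_guest_limit(const char *name)
{
    return strlen(name) > GUEST_NAME_MAX;
}

/* Fill buf with `count` copies of U+4E2D (UTF-8 bytes E4 B8 AD).
 * buf must have room for count * 3 + 1 bytes. */
static void fill_cjk(char *buf, size_t count)
{
    for (size_t i = 0; i < count; i++)
        memcpy(buf + 3 * i, "\xE4\xB8\xAD", 3);
    buf[3 * count] = '\0';
}
```

With 89 such characters the name is 89 code points but 267 bytes, so it fits a character-based limit of 255 while overflowing a byte-based one; 85 characters (255 bytes) still fit.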

A guest libc cannot represent an oversize entry in its 256-byte dirent buffer regardless of what elfuse does, so the only sensible behavior is to skip the unrepresentable name and keep delivering the rest of the stream. Skip only on ENAMETOOLONG; any other translation failure keeps the existing partial-return path so genuine errors are not silently dropped. A single process-wide log_warn records the first hit via an atomic latch.
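The skip-and-latch policy described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the code in this PR: `translate_name` is a hypothetical stand-in for path_translate_dirent_name, `emit_dirents` is an invented driver loop, a NULL name is used only to simulate some other translation failure, and `fprintf` stands in for log_warn. The "partial return" shown (return what was emitted so far, or the error if nothing was) is an assumption about the existing behavior.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define GUEST_NAME_MAX 255 /* assumption: Linux NAME_MAX in bytes */

/* Process-wide latch: set once, on the first overlong name seen. */
static atomic_flag warned_overlong = ATOMIC_FLAG_INIT;
static int warn_count; /* demo-only counter, not thread-safe */

/* Hypothetical translator: copies the name, or returns a negative errno. */
static int translate_name(const char *host, char *out, size_t cap)
{
    if (!host)
        return -EIO; /* simulated non-ENAMETOOLONG failure */
    size_t len = strlen(host);
    if (len > GUEST_NAME_MAX || len >= cap)
        return -ENAMETOOLONG;
    memcpy(out, host, len + 1);
    return 0;
}

/* Emit up to out_cap translated names. Overlong names are skipped;
 * any other failure ends the call with whatever was already emitted,
 * or with the error itself if nothing was emitted yet. */
static int emit_dirents(const char *const *names, size_t n,
                        char out[][GUEST_NAME_MAX + 1], size_t out_cap)
{
    size_t emitted = 0;
    for (size_t i = 0; i < n && emitted < out_cap; i++) {
        int rc = translate_name(names[i], out[emitted], GUEST_NAME_MAX + 1);
        if (rc == -ENAMETOOLONG) {
            /* test_and_set returns the previous value, so only the
             * first caller ever sees false and logs. */
            if (!atomic_flag_test_and_set(&warned_overlong)) {
                fprintf(stderr, "skipping overlong dirent name\n");
                warn_count++;
            }
            continue;
        }
        if (rc < 0)
            return emitted ? (int) emitted : rc;
        emitted++;
    }
    return (int) emitted;
}
```

Restricting the skip to -ENAMETOOLONG keeps every other error visible to the caller, and the `atomic_flag` keeps the log to one line per process no matter how many overlong names or threads are involved.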

The regression test stages five 268-byte UTF-8 names plus one normal entry host-side and walks the listing with a one-entry-per-call buffer, which forces at least one call to begin fresh on an overlong entry under any APFS hash ordering. That is the exact condition under which the pre-fix code returned -ENAMETOOLONG to userspace; the test fails three out of three runs pre-fix and passes three out of three post-fix.


Summary by cubic

Skip overlong UTF-8 dirents in getdents64 instead of aborting the directory stream. This avoids truncated ls/find/coreutils listings on macOS APFS and continues returning the rest of the entries.

  • Bug Fixes
    • Skip entries only when name translation returns ENAMETOOLONG; keep partial-return behavior for other errors.
    • Emit a single process-wide warning on the first skip.
    • Add regression test test-getdents64-overlong and run it in make check.

Written for commit df0e588. Summary will update on new commits.

cubic-dev-ai (Bot) left a comment

No issues found across 3 files

jserv merged commit 3dbea17 into main on Jun 13, 2026
5 checks passed
jserv deleted the getdents64 branch on June 13, 2026 14:49