Skip overlong-UTF-8 dirents in getdents64 #97
Merged
sys_getdents64 aborted the whole directory stream the first time path_translate_dirent_name reported ENAMETOOLONG. On macOS APFS, a filename's byte length can exceed Linux NAME_MAX (255 bytes) because the per-component limit is counted in Unicode characters, not bytes; a single 89-CJK-character entry, at the usual 3 UTF-8 bytes per character, is 267 bytes and already crosses the limit. Before this fix, ls / find / coreutils listings in the guest were truncated for any APFS source tree containing one such name.
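The byte arithmetic behind that claim can be checked with a small sketch. The helper names and the GUEST_NAME_MAX macro below are hypothetical and are not taken from elfuse; the only assumption is that each CJK character used encodes to 3 bytes in UTF-8, which holds for characters in the Basic Multilingual Plane.

```c
#include <stddef.h>
#include <string.h>

/* Linux NAME_MAX, counted in bytes (illustrative macro, not from elfuse). */
#define GUEST_NAME_MAX 255

/* Byte length of a name made of `count` repetitions of the UTF-8 string `unit`. */
static size_t repeated_name_bytes(const char *unit, size_t count)
{
    return strlen(unit) * count;
}

/* Nonzero if a name of `bytes` bytes fits within the guest's NAME_MAX. */
static int fits_guest_name(size_t bytes)
{
    return bytes <= GUEST_NAME_MAX;
}
```

With U+4E2D encoded as "\xE4\xB8\xAD", repeated_name_bytes("\xE4\xB8\xAD", 89) is 267, which fits_guest_name rejects, while 85 repetitions land exactly on 255 bytes and still fit.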
A guest libc cannot represent an oversize entry in its 256-byte dirent buffer regardless of what elfuse does, so the only sensible behavior is to skip the unrepresentable name and keep delivering the rest of the stream. Skip only on ENAMETOOLONG; any other translation failure keeps the existing partial-return path so genuine errors are not silently dropped. A single process-wide log_warn records the first hit via an atomic latch.
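The skip-and-continue policy described above might look roughly like the following sketch. It is not the code from this PR: translate_name is a hypothetical stand-in for path_translate_dirent_name, fill_entries and warnings_logged are invented for illustration, fprintf stands in for log_warn, and the partial-return rule (return the entries emitted so far, or the error if none were) is an assumption about how the existing path behaves.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define GUEST_NAME_MAX 255

/* Hypothetical stand-in for path_translate_dirent_name: copies the name,
 * or fails with -ENAMETOOLONG if it does not fit, or -EINVAL if empty. */
static int translate_name(const char *host, char *out, size_t out_size)
{
    size_t len = strlen(host);
    if (len == 0)
        return -EINVAL;
    if (len > GUEST_NAME_MAX || len >= out_size)
        return -ENAMETOOLONG;
    memcpy(out, host, len + 1);
    return 0;
}

/* Process-wide latch: only the first overlong name produces a warning. */
static atomic_flag warned_overlong = ATOMIC_FLAG_INIT;
static int warnings_logged; /* demo counter; written only by the latch winner */

/* Translates host names into `out`, skipping unrepresentable ones.
 * Returns the number of entries emitted, or a negative errno if a
 * non-ENAMETOOLONG failure occurs before anything was emitted. */
static int fill_entries(const char *const *host, size_t n,
                        char out[][GUEST_NAME_MAX + 1], size_t out_cap)
{
    size_t emitted = 0;
    for (size_t i = 0; i < n && emitted < out_cap; i++) {
        int rc = translate_name(host[i], out[emitted], GUEST_NAME_MAX + 1);
        if (rc == -ENAMETOOLONG) {
            /* Skip the entry and keep going; warn once per process. */
            if (!atomic_flag_test_and_set(&warned_overlong)) {
                warnings_logged++;
                fprintf(stderr, "getdents64: skipping overlong name\n");
            }
            continue;
        }
        if (rc < 0) /* any other failure: partial return, error not hidden */
            return emitted > 0 ? (int)emitted : rc;
        emitted++;
    }
    return (int)emitted;
}
```

Given the names "a", a 268-byte name, and "b", this sketch emits "a" and "b" and logs one warning; a second overlong name later in the process is skipped silently because the latch is already set.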
Coverage stages five 268-byte UTF-8 names plus one normal entry host-side and walks the listing with a one-entry-per-call buffer, which forces at least one call to begin fresh on an overlong entry under any APFS hash ordering. That is the exact condition under which pre-fix code returned -ENAMETOOLONG to userspace; the test fails three out of three pre-fix and passes three out of three post-fix.
Summary by cubic
Skip overlong UTF-8 dirents in getdents64 instead of aborting the directory stream. This avoids truncated ls/find/coreutils listings on macOS APFS and continues returning the rest of the entries.
Add test-getdents64-overlong and run it in make check.
Written for commit df0e588.