Harden bundled Copilot CLI extraction against corrupt/partial installs#1635
Harden bundled Copilot CLI extraction against corrupt/partial installs#1635jmoseley wants to merge 1 commit into
Conversation
The embedded-CLI installer opened the final binary path with
open→truncate→write and returned without re-verifying the written bytes.
An interrupted write, a multi-process race, or Windows Defender/ASR
quarantining the freshly-written executable could leave an invalid image
that was handed back as "good" — surfacing downstream as
ERROR_BAD_EXE_FORMAT ("not a valid application for this OS platform").
Replace the in-place write with a safe publish pattern:
- Atomic write: extract to a unique temp file (pid + counter + nanos) in
the *same* target directory, flush + fsync, set 0o755 on unix, then
fs::rename onto the final path. rename overwrites atomically on POSIX;
on Windows (no atomic overwrite) we remove-then-rename, with the race
guarded by re-verification and peer-install acceptance.
- Verify after publish: the bytes extracted from the trusted embedded
archive are the known-good reference. Verify the staged temp file and
again re-verify the published file byte-for-byte (size first), catching
truncation or AV tampering between staging and publishing.
- Trustworthy fast path: an existing install is reused only after a cheap
re-check (integrity-marker size + executable-image header); a truncated,
empty, or quarantined binary is re-extracted instead of trusted.
- Retry + fallback: re-extract/re-publish up to 3 times to ride out
transient AV interference, then surface a clear, actionable error
("appears blocked or corrupt ... possibly quarantined by antivirus")
rather than returning a broken path.
Adds unit tests covering the temp+rename publish, marker recording,
detection of corrupt/truncated/unmarked/invalid-image installs, and
post-write size/content verification.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens the Rust SDK’s bundled Copilot CLI extraction path (bundled-cli) so it does not return a truncated/corrupt executable after partial installs, races, or antivirus interference by staging to a temp file, verifying bytes, publishing via rename, and recording an integrity marker for a safer fast-path.
Changes:
- Adds an integrity marker + executable-header check to validate an existing installed bundled CLI before reusing it.
- Implements a staged write + fsync + atomic publish + post-publish verification flow with retries and clearer failure reporting.
- Adds unit tests covering publish/overwrite behavior, fast-path rejection of corrupt installs, and verification failures.
Show a summary per file
| File | Description |
|---|---|
| rust/src/lib.rs | Updates public docs for install_bundled_cli() to reflect integrity re-check behavior. |
| rust/src/embeddedcli.rs | Reworks bundled CLI install to stage+verify+publish with marker-based fast path and unit tests. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 1
| fn publish(tmp: &Path, final_path: &Path) -> Result<(), EmbeddedCliError> { | ||
| match fs::rename(tmp, final_path) { | ||
| Ok(()) => Ok(()), | ||
| Err(_) if final_path.exists() => { | ||
| let _ = fs::remove_file(final_path); | ||
| fs::rename(tmp, final_path) | ||
| .map_err(|e| EmbeddedCliError::new(EmbeddedCliErrorKind::Publish, e)) | ||
| } | ||
| Err(e) => Err(EmbeddedCliError::new(EmbeddedCliErrorKind::Publish, e)), | ||
| } | ||
| } |
| //! The embedded bytes are part of the consumer's signed binary and therefore | ||
| //! trusted *as the source of truth* — but the bytes that land on disk are not. | ||
| //! A non-atomic write, a multi-process race, or antivirus quarantining the | ||
| //! freshly-written executable can leave a truncated or corrupt image that, if |
There was a problem hiding this comment.
I might be wrong but I thought AV quarantining was generally done by setting some other flag in filesystem metadata, not modifying the bytes of the executable. Maybe some AV software differs.
Doesn't change the validity of the PR though
There was a problem hiding this comment.
Update: looks like Windows Defender will actually move the file into a different dir if it quarantines it.
| /// image header for the current platform (PE on Windows, Mach-O on macOS, | ||
| /// ELF elsewhere). Returns `false` on any I/O error or unrecognized header. | ||
| #[cfg(any(has_bundled_cli, test))] | ||
| fn looks_like_valid_image(path: &Path) -> bool { |
There was a problem hiding this comment.
If we wanted a more end-to-end problem detector we could consider something like detecting whether the spawn fails (that is, we never get to the point of having a valid handshake) and if so track a flag so that next time the app is launched, we do a full re-extraction. Then we wouldn't need so many other checks.
Summary
On a Windows user's laptop, the GitHub Copilot desktop app's bundled
gh.exefailed to launch withERROR_BAD_EXE_FORMAT("The specified executable is not a valid application for this OS platform"). Investigation traced the class of bug to non-atomic binary extraction that returns an unverified file. This PR applies the equivalent fix for the Copilot CLI binary thatcopilot-sdkbundles/extracts (thebundled-clipath in the Rust crate).Root cause
rust/src/embeddedcli.rsextracted the embedded archive and wrote the binary with a non-atomic open → truncate → write directly on the live final path, then returned without re-verifying the written bytes. The "already installed?" fast path trusted a bareis_file()check. So an interrupted write, a multi-process race, or Windows Defender / ASR quarantining the freshly-written executable could leave a truncated or corrupt image that gets handed back as "good" — and fails to launch.Fix
pid+ counter + nanos) in the same target directory,write_all→flush→sync_all, set0o755on unix, thenfs::renameonto the final path.renameoverwrites atomically on POSIX; on Windows (which won't atomically overwrite) we remove-then-rename, with the race window guarded by re-verification and by accepting a peer's identical install.MZ/ Mach-O / ELF). A truncated, empty, or quarantined binary is re-extracted instead of trusted.The change is scoped to the CLI extraction/install path. Public behavior is unchanged for the happy path; the installer is just no longer able to return a broken executable.
Tests
New unit tests in
embeddedcli.rscover:The existing
install_bundled_cli_*integration tests (which exercise the full extract → publish → verify → marker flow against the real downloaded archive and assert idempotency) continue to pass.Validation
cargo +nightly fmt --check✅cargo clippy --all-targets --features test-support -- -D warnings -D clippy::unwrap_used …✅ (both the skip-download and real-bundled paths)cargo doc --no-deps --all-featureswith-D warnings✅cargo build(bundled, downloads real CLI) ✅cargo test --lib embeddedcli+install_bundled_cliintegration tests ✅