-
Notifications
You must be signed in to change notification settings - Fork 443
LRPTests: retry on transient MSIX install races #6519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
kythant
merged 24 commits into
release/dev/monobuild
from
user/kythant/lrp-test-reliability
Jun 6, 2026
Merged
Changes from 10 commits
Commits
Show all changes
24 commits
Select commit
Hold shift + click to select a range
49669df
LRPTests: retry on transient MSIX install races
kythant 9df4bb7
Test::Bootstrap+Package: retry transient deployment races at the source
kythant 2348857
BypassTests: baseline ChannelRequestCheckExpirationTime on Server 2025
kythant 04ff986
Test::Bootstrap+Package: fix retry-helper compile errors
kythant 2bf5019
Test::Package retry: use HRESULT_FROM_WIN32 with raw win32 codes
kythant 627fd77
BypassTests: baseline ChannelRequestCheckExpirationTime on Win11 24H2…
kythant e7698d6
PushNotifications: retry ChannelRequestCheckExpirationTime on WNS flake
kythant 452f797
Test reliability: wait for package enumerability instead of retrying …
kythant 4082979
PushNotifications: move SkipIfWnsServiceError above first caller
kythant 1273072
AddPackage wait: poll FindPackagesForUserWithPackageTypes + Status.Ve…
kythant 945f51f
Bootstrap: drop comment + variable extraction, just call VERIFY_SUCCE…
kythant 2cbccca
Address review NITs: use symbolic ERROR_* names
kythant 3037a3c
Test::Package: drop ERROR_INSTALL_* #ifndef fallbacks
kythant 6c8cefc
Revert: restore ERROR_INSTALL_* #ifndef fallbacks
kythant c73504b
Test::Package: use correct symbolic names for 0x3D02 / 0x3CFF
kythant 065248b
LRPTests: retry CoCreateInstance on transient install-lock HRESULTs
kythant 35e9d5b
LRPTests: use ERROR_PACKAGES_IN_USE / ERROR_INSTALL_POLICY_FAILURE sy…
kythant 166f098
Bootstrap: restore short retry on residual 0x80270254 after WaitForPa…
kythant 44746b2
Bootstrap retry: compare against raw 0x80270254 HRESULT (APPX facilit…
kythant 50a519e
Bootstrap retry: bump budget to ~3min (10 attempts x 30s cap)
kythant cd2372c
Revert: bootstrap retry budget back to 5 attempts x 8s cap
kythant 7b6415c
Bootstrap: poll the exact precondition FindDDLMViaEnumeration checks
kythant 1ef6a7f
Bootstrap: also poll unscoped Main enumeration in WaitForDDLMBootstra…
kythant 0256280
Bootstrap: simplify back to bounded retry on 0x80270254
kythant File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,6 +6,10 @@ | |
|
|
||
| #include <appmodel.h> | ||
|
|
||
| #include <algorithm> | ||
| #include <string> | ||
| #include <vector> | ||
|
|
||
| #include <WindowsAppRuntime.Test.FileSystem.h> | ||
| #include <winrt/Windows.Management.Deployment.h> | ||
| #include <winrt/Windows.ApplicationModel.h> | ||
|
|
@@ -312,14 +316,145 @@ inline winrt::Windows::Foundation::Uri GetAppxManifestPackageUri(PCWSTR packageF | |
| return winrt::Windows::Foundation::Uri{ path.c_str() }; | ||
| } | ||
|
|
||
| inline void WaitForPackageEnumerable(PCWSTR packageFullName) | ||
| { | ||
| // After AddPackageAsync's async operation completes, the OS-side | ||
| // PackageManager index can lag briefly before the just-registered | ||
| // package becomes visible to family-scoped enumeration AND its on-disk | ||
| // state reports Status.VerifyIsOK(). MddBootstrapInitialize -> | ||
| // PackageDeploymentResolver::Find resolves the DDLM via exactly that | ||
| // path (FindPackagesForUserWithPackageTypes + Status.VerifyIsOK), so a | ||
| // FindPackageForUser-by-full-name poll uses the wrong cache and returns | ||
| // too early. Mirror the resolver's enumeration here so AddPackage only | ||
| // returns once the OS will satisfy MddBootstrapInitialize. | ||
| // | ||
| // PackageFullName format: <Name>_<Version>_<Architecture>_<ResourceId>_<PublisherId> | ||
| // FamilyName format: <Name>_<PublisherId> (parts[0] + "_" + parts[4]) | ||
| std::wstring fullName{ packageFullName }; | ||
| std::vector<std::wstring> parts; | ||
| { | ||
| size_t start{ 0 }; | ||
| for (size_t i{ 0 }; i <= fullName.size(); ++i) | ||
| { | ||
| if (i == fullName.size() || fullName[i] == L'_') | ||
| { | ||
| parts.emplace_back(fullName.substr(start, i - start)); | ||
| start = i + 1; | ||
| } | ||
| } | ||
| } | ||
| if (parts.size() < 5) | ||
| { | ||
| WEX::Logging::Log::Warning(WEX::Common::String().Format( | ||
| L"WaitForPackageEnumerable('%s'): unparseable full name (parts=%zu); skipping wait", | ||
| packageFullName, parts.size())); | ||
| return; | ||
| } | ||
| const winrt::hstring familyName{ parts[0] + L"_" + parts[4] }; | ||
| const winrt::hstring fullNameH{ packageFullName }; | ||
|
|
||
| winrt::Windows::Management::Deployment::PackageManager packageManager; | ||
| const auto packageTypes{ | ||
| winrt::Windows::Management::Deployment::PackageTypes::Framework | | ||
| winrt::Windows::Management::Deployment::PackageTypes::Main }; | ||
|
|
||
| constexpr DWORD c_pollIntervalMs{ 100 }; | ||
| constexpr DWORD c_timeoutMs{ 30000 }; | ||
| const DWORD startTick{ GetTickCount() }; | ||
| for (;;) | ||
| { | ||
| bool found{ false }; | ||
| bool statusOk{ false }; | ||
| try | ||
| { | ||
| auto packages{ packageManager.FindPackagesForUserWithPackageTypes(winrt::hstring{}, familyName, packageTypes) }; | ||
| if (packages) | ||
| { | ||
| for (const auto& candidate : packages) | ||
| { | ||
| if (candidate.Id().FullName() == fullNameH) | ||
| { | ||
| found = true; | ||
| statusOk = candidate.Status().VerifyIsOK(); | ||
| break; | ||
| } | ||
| } | ||
| } | ||
| } | ||
| catch (...) | ||
| { | ||
| // PackageManager occasionally throws transient access errors | ||
| // during the index-update window; treat as not-yet-visible. | ||
| } | ||
| if (found && statusOk) | ||
| { | ||
| return; | ||
| } | ||
| const DWORD elapsed{ GetTickCount() - startTick }; | ||
| if (elapsed >= c_timeoutMs) | ||
| { | ||
| WEX::Logging::Log::Warning(WEX::Common::String().Format( | ||
| L"WaitForPackageEnumerable('%s', family='%s') timed out after %u ms (found=%d statusOk=%d); downstream bootstrap may race", | ||
| packageFullName, familyName.c_str(), elapsed, found ? 1 : 0, statusOk ? 1 : 0)); | ||
| return; | ||
| } | ||
| Sleep(c_pollIntervalMs); | ||
| } | ||
| } | ||
|
|
||
| inline void AddPackage(PCWSTR packageDirName, PCWSTR packageFullName) | ||
| { | ||
| auto msixUri{ GetMsixPackageUri(packageDirName) }; | ||
|
|
||
| winrt::Windows::Management::Deployment::PackageManager packageManager; | ||
| auto options{ winrt::Windows::Management::Deployment::DeploymentOptions::None }; | ||
| auto deploymentResult{ packageManager.AddPackageAsync(msixUri, nullptr, options).get() }; | ||
|
|
||
| // AddPackageAsync intermittently fails on the test agents with transient | ||
| // deployment errors (most often 0x80073D02 ERROR_INSTALL_RESOURCES_BUSY) | ||
| // when the previous test's package teardown hasn't fully released file | ||
| // handles. There's no precondition we can poll for here (the deployment | ||
| // service holds an internal lock); the documented mitigation is to back | ||
| // off and reissue. Bounded to 5 attempts so a genuine non-transient | ||
| // failure still surfaces quickly. | ||
| winrt::Windows::Management::Deployment::DeploymentResult deploymentResult{ nullptr }; | ||
| constexpr int c_maxAttempts{ 5 }; | ||
| DWORD backoffMs{ 1000 }; | ||
| for (int attempt{ 1 }; attempt <= c_maxAttempts; ++attempt) | ||
| { | ||
| deploymentResult = packageManager.AddPackageAsync(msixUri, nullptr, options).get(); | ||
| const HRESULT hr{ deploymentResult.ExtendedErrorCode() }; | ||
| if (SUCCEEDED(hr)) | ||
| { | ||
| if (attempt > 1) | ||
| { | ||
| WEX::Logging::Log::Comment(WEX::Common::String().Format( | ||
| L"AddPackageAsync('%s') succeeded on attempt %d", packageFullName, attempt)); | ||
| } | ||
| break; | ||
| } | ||
| // ERROR_INSTALL_RESOURCES_BUSY (0x3D02) / ERROR_INSTALL_OPEN_PACKAGE_FAILED (0x3CFF) | ||
| // symbols aren't visible in this header's translation units; pass raw win32 codes | ||
| // through HRESULT_FROM_WIN32 (always-available macro) instead. | ||
| const bool isTransient{ | ||
| hr == HRESULT_FROM_WIN32(0x3D02) || // ERROR_INSTALL_RESOURCES_BUSY | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: Why not say ERROR_INSTALL_RESOURCES_BUSY instead, like below? |
||
| hr == HRESULT_FROM_WIN32(0x3CFF) || // ERROR_INSTALL_OPEN_PACKAGE_FAILED | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: why not say ERROR_INSTALL_OPEN_PACKAGE_FAILED instead, like below? |
||
| hr == HRESULT_FROM_WIN32(ERROR_SHARING_VIOLATION) }; // 0x80070020 | ||
| if (!isTransient || attempt == c_maxAttempts) | ||
| { | ||
| break; | ||
| } | ||
| WEX::Logging::Log::Comment(WEX::Common::String().Format( | ||
| L"AddPackageAsync('%s') attempt %d/%d failed with transient HRESULT 0x%08X %s; sleeping %u ms before retry", | ||
| packageFullName, attempt, c_maxAttempts, hr, deploymentResult.ErrorText().c_str(), backoffMs)); | ||
| Sleep(backoffMs); | ||
| backoffMs = (std::min<DWORD>)(backoffMs * 2, 8000); | ||
| } | ||
| VERIFY_SUCCEEDED(deploymentResult.ExtendedErrorCode(), WEX::Common::String().Format(L"AddPackageAsync('%s') = 0x%0X %s", packageFullName, deploymentResult.ExtendedErrorCode(), deploymentResult.ErrorText().c_str())); | ||
|
|
||
| // Wait for the deployment to be visible to FindPackageForUser before | ||
| // returning so callers (notably MddBootstrapInitialize) don't race the | ||
| // OS package index. | ||
| WaitForPackageEnumerable(packageFullName); | ||
| } | ||
|
|
||
| inline void AddPackageDefer(PCWSTR packageDirName, PCWSTR packageFullName) | ||
|
|
||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why not just say ERROR_INVALID_STATE instead, like ERROR_SHARING_VIOLATION above?