coreclr: implement Environment.CurrentManagedThreadId via ManagedThreadId.Current#129954
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @VSadov |
….Current on CoreCLR
On CoreCLR, `t_currentManagedThreadId` is set up by the type system to
point to a special location populated with the thread's managed ID before
any managed caller can observe it (the thin-lock implementation already
relies on this guarantee).
This change makes `Environment.CurrentManagedThreadId` simply delegate to
`ManagedThreadId.Current`, turning an FCall into a cheap thread-static
read, aligning CoreCLR with NativeAOT/Mono which already do the same.
- Environment.CoreCLR.cs: replace InternalCall property with
`=> Threading.ManagedThreadId.Current`
- ecalllist.h: remove the now-unused
`FCFuncElement("get_CurrentManagedThreadId", JIT_GetCurrentManagedThreadId)`
`JIT_GetCurrentManagedThreadId` is kept in jithelpers.cpp and
jitinterface.h because it is still needed as the native entry point for
the JIT helper `CORINFO_HELP_GETCURRENTMANAGEDTHREADID` (used by the JIT
when optimizing `Thread.CurrentThread.ManagedThreadId`).
Co-authored-by: VSadov <8218165+VSadov@users.noreply.github.com>
|
@copilot is it possible to interpret call to |
JIT can only generate calls to helpers, but they can be managed methods instead. |
The idea is - if JIT sees If the substituted call can't be inlined, it is probably not much better than calling a native helper. |
@EgorBo recently added a way to mark helpers calls introduced early as inlineable. |
…anagedThreadId; remove native FCall Make the CORINFO_HELP_GETCURRENTMANAGEDTHREADID JIT helper IL-based, binding it to Environment.get_CurrentManagedThreadId (now a thread-static read via ManagedThreadId.Current), following the MEMSET precedent. This lets the helper resolve to the managed property instead of native code, so the dedicated native JIT_GetCurrentManagedThreadId FCall is no longer needed and is removed. - corelib.h: add METHOD__ENVIRONMENT__CURRENT_MANAGED_THREAD_ID - jithelpers.h: GETCURRENTMANAGEDTHREADID -> DYNAMICJITHELPER bound to the method - jithelpers.cpp/jitinterface.h: delete JIT_GetCurrentManagedThreadId - Environment.CoreCLR.cs: keep AggressiveInlining on get_CurrentManagedThreadId Co-authored-by: VSadov <8218165+VSadov@users.noreply.github.com>
Yes — done in 3c9cb11. The Verified parity (Env vs Thread.ManagedThreadId) on main, a new managed thread, and a thread-pool thread. |
Right — and that's the catch with the current state of this PR. Both To get true inlining, the intrinsic path for Note AI-generated content by copilot may be incorrect. |
There was a problem hiding this comment.
Pull request overview
This PR changes CoreCLR’s implementation of Environment.CurrentManagedThreadId from a native FCall to a managed, inlinable path backed by System.Threading.ManagedThreadId.Current, and rebinds the JIT helper CORINFO_HELP_GETCURRENTMANAGEDTHREADID to the managed accessor.
Changes:
- Replace the CoreCLR
Environment.CurrentManagedThreadIdInternalCall with an aggressively-inlined managed getter (=> Threading.ManagedThreadId.Current). - Remove the native
JIT_GetCurrentManagedThreadIdFCall surface (declaration, implementation, and ecall table entry). - Rebind
CORINFO_HELP_GETCURRENTMANAGEDTHREADIDas aDYNAMICJITHELPERtargetingEnvironment.get_CurrentManagedThreadId, and adjust the JITThread.ManagedThreadIdintrinsic import to convert/spill inline candidates similarly toimpImportCall.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/vm/jitinterface.h | Removes the JIT_GetCurrentManagedThreadId FCDECL export. |
| src/coreclr/vm/jithelpers.cpp | Deletes the FCIMPL0 implementation of JIT_GetCurrentManagedThreadId. |
| src/coreclr/vm/ecalllist.h | Removes the Environment.get_CurrentManagedThreadId FCall mapping. |
| src/coreclr/vm/corelib.h | Adds a CoreLib binder entry for Environment.get_CurrentManagedThreadId for helper rebinding. |
| src/coreclr/System.Private.CoreLib/src/System/Environment.CoreCLR.cs | Implements CurrentManagedThreadId as a managed, aggressively-inlined getter reading ManagedThreadId.Current. |
| src/coreclr/jit/importercalls.cpp | Updates the Thread.get_ManagedThreadId intrinsic path to convert the helper to an inlinable user call and spill/return a GT_RET_EXPR when appropriate. |
| src/coreclr/inc/jithelpers.h | Rebinds CORINFO_HELP_GETCURRENTMANAGEDTHREADID to a managed method via DYNAMICJITHELPER. |
|
@EgorBot -linux_x64 -windows_x64 -arm64 -linux_arm64 using BenchmarkDotNet.Attributes;
using System.Threading;
public class ManagedThreadIdBench
{
[Benchmark]
public int Environment_CurrentManagedThreadId()
=> Environment.CurrentManagedThreadId;
[Benchmark]
public int Thread_CurrentThread_ManagedThreadId()
=> Thread.CurrentThread.ManagedThreadId;
// Summing in a tight loop makes the per-call cost easier to see and
// prevents the JIT from eliminating the read entirely.
[Benchmark]
public int Environment_CurrentManagedThreadId_Loop()
{
int sum = 0;
for (int i = 0; i < 1000; i++)
sum += Environment.CurrentManagedThreadId;
return sum;
}
[Benchmark]
public int Thread_CurrentThread_ManagedThreadId_Loop()
{
int sum = 0;
for (int i = 0; i < 1000; i++)
sum += Thread.CurrentThread.ManagedThreadId;
return sum;
}
} |
|
Probably not for this PR, but I wonder if @EgorBot -arm -amd -commits main using BenchmarkDotNet.Attributes;
using System.Threading;
public class ManagedThreadIdBench
{
[Benchmark]
public int Environment_CurrentManagedThreadId()
=> Environment.CurrentManagedThreadId;
[Benchmark]
public int Thread_CurrentThread_ManagedThreadId()
=> Thread.CurrentThread.ManagedThreadId;
[Benchmark]
public int Thread_CurrentThread_ManagedThreadId_not_expanded()
{
var tmp = Thread.CurrentThread;
// tmp breaks the `Thread.CurrentThread.ManagedThreadId` pattern match.
return tmp .ManagedThreadId;
}
} |
As I understand the purpose for the I am not sure how far real programs may get without allocating This is a relatively mechanical PR though that keeps overall strategy the same, modulo inlining. |
|
The benchmarks show considerable gains from this change: Results for
|
| Method | Toolchain | Mean | Error | Ratio |
|---|---|---|---|---|
| Environment_CurrentManagedThreadId | PR #129954 | 0.0276 ns | 0.0009 ns | 1.00 |
| Environment_CurrentManagedThreadId | main | 1.2827 ns | 0.0131 ns | 46.54 |
| Thread_CurrentThread_ManagedThreadId | PR #129954 | 0.0415 ns | 0.0015 ns | 1.00 |
| Thread_CurrentThread_ManagedThreadId | main | 0.5297 ns | 0.0052 ns | 12.77 |
| Environment_CurrentManagedThreadId_Loop | PR #129954 | 230.7498 ns | 0.1642 ns | 1.00 |
| Environment_CurrentManagedThreadId_Loop | main | 1,192.4792 ns | 5.4503 ns | 5.17 |
| Thread_CurrentThread_ManagedThreadId_Loop | PR #129954 | 230.5095 ns | 0.1828 ns | 1.00 |
| Thread_CurrentThread_ManagedThreadId_Loop | main | 231.2885 ns | 0.1146 ns | 1.00 |
Results for ubuntu24_azure_turin
BenchmarkDotNet v0.15.8, Linux Ubuntu 24.04.4 LTS (Noble Numbat)
AMD EPYC 9V45 2.60GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK 11.0.100-preview.7.26329.101
[Host] : .NET 10.0.9 (10.0.9, 10.0.926.27113), X64 RyuJIT x86-64-v4
| Method | Toolchain | Mean | Error | Ratio |
|---|---|---|---|---|
| Environment_CurrentManagedThreadId | PR #129954 | 0.9188 ns | 0.0201 ns | 1.00 |
| Environment_CurrentManagedThreadId | main | 1.3938 ns | 0.0364 ns | 1.52 |
| Thread_CurrentThread_ManagedThreadId | PR #129954 | 0.9012 ns | 0.0203 ns | 1.00 |
| Thread_CurrentThread_ManagedThreadId | main | 2.0580 ns | 0.0274 ns | 2.28 |
| Environment_CurrentManagedThreadId_Loop | PR #129954 | 234.8007 ns | 3.6196 ns | 1.00 |
| Environment_CurrentManagedThreadId_Loop | main | 2,269.4807 ns | 25.7584 ns | 9.67 |
| Thread_CurrentThread_ManagedThreadId_Loop | PR #129954 | 234.3946 ns | 2.3461 ns | 1.00 |
| Thread_CurrentThread_ManagedThreadId_Loop | main | 235.7425 ns | 3.0677 ns | 1.01 |
Results for ubuntu24_azure_cobalt100
BenchmarkDotNet v0.15.8, Linux Ubuntu 24.04.4 LTS (Noble Numbat)
Neoverse-N2, 8 physical cores
.NET SDK 11.0.100-preview.7.26329.101
[Host] : .NET 10.0.9 (10.0.9, 10.0.926.27113), Arm64 RyuJIT armv8.0-a
| Method | Toolchain | Mean | Error | Ratio |
|---|---|---|---|---|
| Environment_CurrentManagedThreadId | PR #129954 | 0.0321 ns | 0.0075 ns | 1.05 |
| Environment_CurrentManagedThreadId | main | 0.8895 ns | 0.0010 ns | 28.99 |
| Thread_CurrentThread_ManagedThreadId | PR #129954 | 0.0342 ns | 0.0052 ns | 1.019 |
| Thread_CurrentThread_ManagedThreadId | main | 0.0000 ns | 0.0000 ns | 0.000 |
| Environment_CurrentManagedThreadId_Loop | PR #129954 | 298.0678 ns | 0.0399 ns | 1.00 |
| Environment_CurrentManagedThreadId_Loop | main | 1,507.0731 ns | 1.9008 ns | 5.06 |
| Thread_CurrentThread_ManagedThreadId_Loop | PR #129954 | 297.9481 ns | 0.0500 ns | 1.00 |
| Thread_CurrentThread_ManagedThreadId_Loop | main | 300.7611 ns | 0.0598 ns | 1.01 |
Results for windows_azure_turin
BenchmarkDotNet v0.15.8, Windows 11 (10.0.26100.32995/24H2/2024Update/HudsonValley) (Hyper-V)
AMD EPYC 9V45 2.60GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK 11.0.100-preview.7.26329.101
[Host] : .NET 10.0.9 (10.0.9, 10.0.926.27113), X64 RyuJIT x86-64-v4
| Method | Toolchain | Mean | Error | Ratio |
|---|---|---|---|---|
| Environment_CurrentManagedThreadId | PR #129954 | 0.0159 ns | 0.0094 ns | 1.25 |
| Environment_CurrentManagedThreadId | main | 0.2192 ns | 0.0055 ns | 17.25 |
| Thread_CurrentThread_ManagedThreadId | PR #129954 | 0.0089 ns | 0.0161 ns | ? |
| Thread_CurrentThread_ManagedThreadId | main | 0.8976 ns | 0.0174 ns | ? |
| Environment_CurrentManagedThreadId_Loop | PR #129954 | 231.9105 ns | 3.0379 ns | 1.00 |
| Environment_CurrentManagedThreadId_Loop | main | 1,121.5578 ns | 9.6127 ns | 4.84 |
| Thread_CurrentThread_ManagedThreadId_Loop | PR #129954 | 228.3862 ns | 1.4556 ns | 1.00 |
| Thread_CurrentThread_ManagedThreadId_Loop | main | 231.0352 ns | 2.0251 ns | 1.01 |
I am also not sure it's a big problem either since it's a one-time job per thread. |
Yeah, so |
Also that call could be costly next to just accessing TLS. |
|
@MichalPetryka - thanks for the hint about inlining! |
The helper is not side-effect free if the loop contains a runtime async |
|
@EgorBo @jakobbotsch - I see that we still have // A constant within any method.
case CORINFO_HELP_GETCURRENTMANAGEDTHREADID:
vnf = VNF_ManagedThreadId;
break;Can that somehow cause issues in an async methods? It is not generally a good idea to mix TLS accesses and async, but I am sure there are cases, for whatever reason... |
I think that can be deleted now that you marked it as inline candidate (and I expect it to be always inlined) so that helper should almost never make it to the VN phase. if it doesn't inline for whatever reason it's not a big deal if it doesn't get that VN func. |
|
What about the actual TLS access - is nothing hoisted there, at least across suspend points? (like the address of the variable storage,...) |
No, TLS accesses are done through a byref typed base value, so those cannot be hoisted across awaits. |
I suspected that some detail like clearing byrefs saves us, or we'd have issues with restoring contexts to wrong threads. I think treating PR: #130006 |
On CoreCLR,
Environment.CurrentManagedThreadIdwas backed by an FCall (JIT_GetCurrentManagedThreadId→GetThread()->GetThreadId()).ManagedThreadId.Currentreadst_currentManagedThreadId, a[ThreadStatic]wired by the type system to a special location that is populated with the thread's managed ID before any managed caller can observe it — the thin-lock implementation already depends on this guarantee.This makes
Environment.CurrentManagedThreadIda plain thread-static read on CoreCLR, matching what NativeAOT and Mono already do. The dedicated nativeJIT_GetCurrentManagedThreadIdFCall is removed; theCORINFO_HELP_GETCURRENTMANAGEDTHREADIDJIT helper is rebound to the managed property so bothEnvironment.CurrentManagedThreadIdand theThread.CurrentThread.ManagedThreadIdintrinsic resolve to the inlinable thread-static read.Changes
Environment.CoreCLR.cs: replace[MethodImpl(MethodImplOptions.InternalCall)]extern property with=> Threading.ManagedThreadId.Current, keepingAggressiveInliningon the getter so it folds into the thread-static readecalllist.h: removeFCFuncElement("get_CurrentManagedThreadId", JIT_GetCurrentManagedThreadId)fromgEnvironmentFuncscorelib.h: addMETHOD__ENVIRONMENT__CURRENT_MANAGED_THREAD_IDforget_CurrentManagedThreadIdjithelpers.h: makeCORINFO_HELP_GETCURRENTMANAGEDTHREADIDan IL-basedDYNAMICJITHELPERbound to the managed method (following theMEMSETprecedent) instead of native codeJIT_GetCurrentManagedThreadId(jithelpers.cpp/jitinterface.h) is removed: the helper now resolves to the managedEnvironment.CurrentManagedThreadIdproperty, so the native FCall is no longer neededimportercalls.cpp: theThread.CurrentThread.ManagedThreadIdintrinsic converts theCORINFO_HELP_GETCURRENTMANAGEDTHREADIDhelper into an inlinable user call (impConvertToUserCallAndMarkForInlining) and, since intrinsic-imported calls skipimpImportCall's inline-candidate spill, manually appends it and hands back aGT_RET_EXPR. This lets the inliner fold the property into the same thread-static read asEnvironment.CurrentManagedThreadIdand avoids the checkedGTF_CALL_INLINE_CANDIDATEmorph assert from the unspilled candidateVerified
clr+libs -rc releasebuilds clean andEnvironment.CurrentManagedThreadId/Thread.CurrentThread.ManagedThreadIdparity holds (with no checked assert underJitStress/TailcallStress) on the main thread, a new managed thread, and a thread-pool thread.