Add an agent_id_parent selector to x509pop in spiffe mode#7002
Add an agent_id_parent selector to x509pop in spiffe mode#7002kfox1111 wants to merge 6 commits into
Conversation
Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
There was a problem hiding this comment.
Pull request overview
This PR extends the server-side x509pop node attestor in mode = "spiffe" to emit an additional selector that represents the “parent” of the generated agent SPIFFE ID, enabling Node Alias rules to match groups of related agents (e.g., replicas) without targeting each unique agent ID.
Changes:
- Add
agent_id_parent:<parent-spiffe-id>selector value during attestation whenmode = "spiffe". - Update unit tests to assert the new selector is returned in SPIFFE exchange mode.
- Document the new selector in the
x509popserver node attestor plugin docs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/server/plugin/nodeattestor/x509pop/x509pop.go | Appends agent_id_parent:<id> selector value in SPIFFE mode attestation responses. |
| pkg/server/plugin/nodeattestor/x509pop/x509pop_test.go | Extends attestation success test expectations to include the new selector for SPIFFE exchange. |
| doc/plugin_server_nodeattestor_x509pop.md | Documents the new selector and its behavior/example usage. |
Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
| agentIDStr := spiffeid.String() | ||
|
|
||
| if lastSlash := strings.LastIndex(agentIDStr, "/"); lastSlash != -1 { | ||
| parentID := agentIDStr[:lastSlash] | ||
| parentID = strings.TrimSuffix(parentID, "/") | ||
| selectors = append(selectors, "agent_id_parent:"+parentID) | ||
| } |
There was a problem hiding this comment.
This is a very single-use selector, for the case that you mentioned. We can publish the SPIFFE ID as the selector. For your particular case you could make the agent SPIFFE ID template be derived from the SHA1 fingerprint, to get unique SPIFFE IDs, and have the full SPIFFE ID be spiffe://trust domain/x509pop/identity-exchange, which you can then use as a node alias. You might have to make sure that no 2 agents get deployed to the same node so that they do not use the same exact SVID to attest themselves.
There was a problem hiding this comment.
Can you please give an example of what the selector for the node alias would actually be in this configuration? I dont understand how it is possible with the approach your suggesting.
the docs mention selectors:
Common Name x509pop:subject:cn:example.org The Subject's Common Name (see X.500 Distinguished Names)
SHA1 Fingerprint x509pop:ca:fingerprint:0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33 The SHA1 fingerprint as a hex string for each cert in the PoP chain, excluding the leaf.
SerialNumber x509pop:serialnumber:0a1b2c3d4e5f The leaf certificate serial number as a lowercase hexadecimal string
San x509pop:san:<key>:<value> The san selectors on the leaf certificate. The expected format of the uri san is x509pop://<trust_domain>/<key>/<value>. One selector is exposed per uri san corresponding to x509pop uri scheme. string
the san selector wont work as a spiffe svid will never have a url with x509pop.
the serial number/sha1 fingerprint selectors are not useful as you will never know what it is to alias with. common name wont be defined with spiffe certs.
the experimental spiffe_id selector will always be the full spiffeid that is genreated from the template. if it has the fingerprint in it, its not useable as a generic point to alias over either.
So, there is a lack of a selector that fits the use case I think.
It is ok if it is narrowly targeted a selector. it only is needed for x509 pop with spiffe mode. There are lots of selectors that only are useful for certain use cases.
There was a problem hiding this comment.
The SPIFFE ID of the SVID the agent presents to spire-server to attest would be:
spiffe://trust domain/spire-identity-exchange(or something like that this). The same one for all agents.- The SPIFFE of the SVID resulting from the attestation with the above SVID would be
spiffe://trust domain/spire/agent/x509pop/<cert fingerprint>, instead of the usual format for spiffe mode. You'd need to make sure that the agents get unique "spire-identity-exchange" SVIDs so that they get unique fingerprints.
We'd still need to add the id as a selector so you can use in the node aliase, but that feels like a reasonable thing to do.
It is ok if it is narrowly targeted a selector. it only is needed for x509 pop with spiffe mode.
It's not that it's only needed for spiffe mode, it's only needed for this particular deployment. Most deployments wouldn't care about the "parent id".
There was a problem hiding this comment.
Ah. I see.
I think that has two drawbacks.
- I would be worried about an agent who's identifier can change over time. When reattesting, it will get a new cert which will have a new fingerprint.
- You'd never know the fingerprint if you still did want to hang off svds off of the individual agents up front. Forcing it to be the fingerprint id would make it harder to predict. I was planning on using the {{ pod.Name }} in it for stable identifiers in case of a statefulset in k8s.
I agree though, either way, we need a new selector. We're just debating the name of it I guess. So we could pick a name potentially independent of the rest.
This does raise a discussion I wanted to have at some point about dynamic entries. This idea does kind of match an idea I've been rolling around in my head. It would be useful to simplify my own entry management, as well as potentially helping dynamic entries for spire-identity-exchange too.
In most of my usage, I've run across needing to tack on the node name over and over to identifiers to guarantee uniqueness of identity.
example.com/spire/agent/node123
example.com/sshd/node123
example.com/fluent-bit/node123
example.com/node-agent/node123
example.com/kubelet/node123
Times the number of nodes. It gets a bit painful after a while.
But, what if we could set some labels on the node:
example.com/spire/agent/node123 (label nodeName=node123)
we could put generic workload entries on it:
{example.com/node123, example.com/node234}-> example.com/k8s-node-group -> {example.com/sshd, example.com/fluent-bit, example.com/node-agent, example.com/kubelet}
Each one would use those spiffeid's for lookups and resolving workloads. but, during issuence of the svid, it would have an extra template that would resolve before issuing the final spiffeid. so it could say, append "/{{ nodeLabels.nodeName }}"
There was a problem hiding this comment.
It's not clear to me if the solution above is ok for you, which would mean that we'd have just an agent_spiffe_id selector, based on the full SPIFFE ID of the presented SVID.
We can continue the other discussion independently of this.
Pull Request check list
Affected functionality
x509pop mode gets a new selector that lets you easily setup node aliases.
Description of change
When mode is spiffe in x509pop, a new selector is added that removes the specific individual agent name. This allows node aliasing a whole set of agents with one node alias.
Which issue this PR fixes
#7001