Skip to content

saiserver: default switch_id on neighbor/route entries when omitted #2300

Open
nicholasching wants to merge 3 commits into
opencomputeproject:masterfrom
nicholasching:sai_vpp_ut_switchid
Open

saiserver: default switch_id on neighbor/route entries when omitted #2300
nicholasching wants to merge 3 commits into
opencomputeproject:masterfrom
nicholasching:sai_vpp_ut_switchid

Conversation

@nicholasching

Copy link
Copy Markdown

Context / motivation

Part of the SAIVPP unit-test framework (see PR 1 / our docker-sai-test-vpp/devdocs/). When running the OCP sai_test suite against the VPP SAI backend, neighbor- and route-create tests failed at create time with SAI_STATUS_INVALID_PARAMETER (-5):

ERROR: meta_generic_validation_create: switch id is NULL for SAI_OBJECT_TYPE_NEIGHBOR_ENTRY

Root cause

sai_thrift_neighbor_entry_t / sai_thrift_route_entry_t carry a switch_id, but many OCP tests build these entries with only rif_id (+ ip_address / destination) and omit switch_id. The saithriftv2 RPC server copied the thrift switch_id straight through, so meta validation received switch_id == 0 and rejected the create. (Tests that pass through the common-config path, which sets switch_id explicitly, succeeded — only entries that omit it failed.)

What this change does

  • meta/templates/sai_rpc_server_helper_functions.tt — custom sai_thrift_parse_neighbor_entry() / sai_thrift_parse_route_entry() that, when the thrift switch_id is 0, fall back to the RPC-global switch_id (the OID returned by sai_thrift_create_switch) and then gSwitchId. (The auto-generated struct copies for these two entry types are skipped in favor of these.)
  • test/saithrift/src/switch_sai.thrift, switch_sai_rpc_server.cpp — the same switch_id field + fallback in the legacy v1 thrift server, for parity.
  • test/sai_test/config/route_configer.py — pass switch_id=self.dut.switch_id explicitly on the route entries the common config builds, so the persisted T0 config is consistent regardless of the fallback.

This mirrors how a real ASIC SAI shim behaves (the switch is unambiguous), and unblocks the neighbor/route-create-on-RIF tests.

Scope / risk

  • 4 files, +79/−18 — RPC-server parsing for two entry types (+ v1 parity) and explicit switch_id in the test common config.
  • No change to SAI behavior for callers that already pass switch_id — the fallback only triggers when it is 0/omitted.
  • Server-side only; no packet-path or test-logic change.

Dependencies
None. Related to #2299, however, their edits are isolated and each target master.

Signed-off-by: Nicholas Ching <nicholaslching@gmail.com>
Signed-off-by: Nicholas Ching <nicholaslching@gmail.com>
@azure-pipelines

Copy link
Copy Markdown
Commenter does not have sufficient privileges for PR 2300 in repo opencomputeproject/SAI

@nicholasching

nicholasching commented Jun 24, 2026

Copy link
Copy Markdown
Author

@tjchadaga please run azure pipeline when possible; this PR is needed for an upstream feature at Cisco ASAP, thanks!

}

void sai_thrift_parse_neighbor_entry(const sai_thrift_neighbor_entry_t &thrift_neighbor_entry, sai_neighbor_entry_t *neighbor_entry) {
neighbor_entry->switch_id = gSwitchId;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand about this change. From my reading of the code, saiserver stores switch_id in gSwitchId (

status = sai_switch_api->create_switch(&gSwitchId, attrSz, attr);
). Subsequent calls can use the gSwitchId if it is not in the RPC request, such as neighbor entries. Is there a case that gSwitchId is invalid?

@nicholasching nicholasching Jun 25, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct; for v1, gSwitchId is always valid so the change is not needed here. I made it to maintain feature parity w/saithriftv2.

Edits to saithriftv2 were needed because gSwitchId in saiserver.cpp is only declared and not assigned on the RPC path.

sai_object_id_t gSwitchId; ///< SAI switch global object ID.

Instead, sai_thrift_create_switch() stores the OID in the RPC-global switch_id.

So when tests created a neighbor or route over RPC, and did not send switch_id, saithriftv2 would pass switch_id=0 to SAI, resulting in switch id is NULL error (-5). The change to saithriftv2 adds a fallback for when switch_id is not passed in (0): first to the RPC path switch_id set by create_switch (which resolves the -5), then to gSwitchId as backup.

Should I keep this change for feature parity or remove to avoid touching v1?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Didn't know there are 2 versions of saithrift. I think we can undo the change. Leave v1 as is

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

Comment thread test/saithrift/src/switch_sai.thrift Outdated
struct sai_thrift_neighbor_entry_t {
1: sai_thrift_object_id_t rif_id;
2: sai_thrift_ip_address_t ip_address;
1: sai_thrift_object_id_t switch_id;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not needed. We should keep the original design. let sai rpc server assign switch_id use the stored RPC scoped switch_id variable. Then the changes in route_configer.py are also not needed.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, since issue was fixed by RPC-scoped switch_id fallback, no need for tests to pass in switch_id

} else if (switch_id != SAI_NULL_OBJECT_ID) {
neighbor_entry->switch_id = switch_id;
} else if (gSwitchId != SAI_NULL_OBJECT_ID) {
neighbor_entry->switch_id = gSwitchId;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, we probably can remove this line.

[%- ######################################################################## -%]

[%- BLOCK special_helper_functions -%]
extern sai_object_id_t gSwitchId;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In thrift v2, gSwitchId is never set, right? We probably can remove this line.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct; gSwitchId is never set in v2. I will remove the line.

}

void sai_thrift_parse_neighbor_entry(const sai_thrift_neighbor_entry_t &thrift_neighbor_entry, sai_neighbor_entry_t *neighbor_entry) {
neighbor_entry->switch_id = gSwitchId;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Didn't know there are 2 versions of saithrift. I think we can undo the change. Leave v1 as is

Signed-off-by: Nicholas Ching <nicholaslching@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@nicholasching

Copy link
Copy Markdown
Author

Hi @yue-fred-gao! I've made the requested changes on latest; now only fallback to RPC-scoped switch_id. Changes for this PR are much cleaner now, only touching 1 file; thanks!


// OCP sai_test often omits switch_id on neighbor_entry; fall back to the
// RPC-global switch_id set by sai_thrift_create_switch.
if (thrift_neighbor_entry.switch_id != 0) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does sai_thrift_neighbor_entry_t have switch_id?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thesai_thrift_neighbor_entry_t in thriftv2 is generated from the SAI header sai_neighbor_entry_t, of which switch_id is a member.

SAI/inc/saineighbor.h

Lines 185 to 192 in c0203c0

typedef struct _sai_neighbor_entry_t
{
/**
* @brief Switch ID
*
* @objects SAI_OBJECT_TYPE_SWITCH
*/
sai_object_id_t switch_id;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. The previous change to sai_thrift_neighbor_entry_t is actually for v1. v2 always generate sai thrift from SAI headers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants