networking-calico: Separate leader jobs into different processes#12668
Conversation
|
/sem-approve |
1 similar comment
|
/sem-approve |
|
@zhanz1 I pushed a formatting fix to this to satisfy our linters (black and flake8), and to allow CI to run. |
Thanks... I thought I passed |
It's a bit confusing. |
BTW my plan will be to look again at #12456 after this PR has merged. I think we will find that a lot of #12456 is no longer needed, but there may still be some useful pieces there. |
Yes, I'll work on it, no worries!
Yes, I think the most important part is how we will be detecting dead greenthread after we split work onto multiple processes. |
As clusters grow larger, it is hard for a single Python process to do resync, compaction, and status updating at the same time. To address this, separate the jobs into multiple processes. This change will introduce four new worker processes, each in charge of: * CalicoResourceSyncerWorker: Sync resources from Neutron to etcd. * CalicoManagerWorker: Do leader election and periodic compaction. * CalicoAgentStatusWatcherWorker: Watch agent status updates and report them to Neutron. * CalicoEndpointStatusWatcherWorker: Watch endpoint status updates and report them to Neutron. Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
891fec1 to
a2a0cde
Compare
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
|
@nelljerram I've rebased and also modified relevant naming and comments a little bit (e.g., |
|
/sem-approve |
There was a problem hiding this comment.
Pull request overview
This PR extends the networking-calico Neutron ML2 driver architecture by splitting “leader-only” responsibilities into dedicated neutron-server worker processes (manager/election+compaction, agent status watching, endpoint status watching, plus the existing startup resync worker). The goal is to reduce contention in large OpenStack clusters by avoiding a single process doing all periodic and watcher work.
Changes:
- Add new Neutron
BaseWorkermarker classes for manager and status-watcher worker processes. - Refactor
CalicoMechanismDriverto initialize common state post-fork and to start role-specific greenlets per worker type; introduce a process-shared “is master” timestamp updated by the elector. - Split
StatusWatcherintoAgentStatusWatcherandEndpointStatusWatcher, and update unit tests accordingly.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| networking-calico/networking_calico/plugins/ml2/drivers/calico/workers.py | Adds new worker classes intended to map to separate neutron-server forked processes. |
| networking-calico/networking_calico/plugins/ml2/drivers/calico/mech_calico.py | Dispatches worker responsibilities post-fork; adds process-shared master tracking and new init/start helpers. |
| networking-calico/networking_calico/plugins/ml2/drivers/calico/status.py | Splits status watching into agent vs endpoint watcher subclasses. |
| networking-calico/networking_calico/plugins/ml2/drivers/calico/election.py | Updates elector to publish master “freshness” via a shared value; removes old in-process master flag. |
| networking-calico/networking_calico/plugins/ml2/drivers/calico/test/test_plugin_etcd.py | Updates plugin tests to match the new watcher/worker structure. |
| networking-calico/networking_calico/plugins/ml2/drivers/calico/test/test_mech_calico.py | Updates mech driver init tests for new init/start helper methods. |
| networking-calico/networking_calico/plugins/ml2/drivers/calico/test/test_election.py | Updates election tests for the new elector API and master signaling. |
| networking-calico/networking_calico/plugins/ml2/drivers/calico/test/lib.py | Adjusts test stubs/mocks to support Elector.run() and driver.is_master(). |
nelljerram
left a comment
There was a problem hiding this comment.
I haven't reviewed the test changes in full detail yet, but I think I have enough comments queued up to be worth releasing.
Overall, I really like the shape of this change, so thanks for proposing it. Just some detailed comments...
| def _init_and_start_calico_resouce_syncer(self): | ||
| self.start_up_resync_thread = eventlet.spawn(self._do_startup_resync) | ||
|
|
||
| def _init_and_start_calico_manager(self): |
There was a problem hiding this comment.
Can we call this "elector" instead of "manager"?
There was a problem hiding this comment.
Oh I see, you haven't called it "elector" because it also does compaction. WDYT about making another separate worker process for compaction? That's effectively what will happen when eventlet is removed anyway, and I think it would be cleaner already.
There was a problem hiding this comment.
We can, but to me it feels a bit wasteful to spawn a standalone process that only does a check every X seconds. Even with eventlet removed, I feel like we should just turn this into two threads for the same reason. These two components (Elector and compaction) won't really scale as clusters grow larger.
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
Merge current Calico master
nelljerram
left a comment
There was a problem hiding this comment.
Looking good. I still need to review test_plugin_etcd.py, but the comments below cover everything else. I'll also kick off CI now...
|
/sem-approve |
nelljerram
left a comment
There was a problem hiding this comment.
Two small points for test_plugin_etcd.py
|
@zhanz1 Thanks so much for your work on this. CI is looking good, and there are just a few remaining small points:
Happy with everything else, and looking forward to merging this! |
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
Signed-off-by: Zhan Zhang <zzhang953@bloomberg.net>
|
/sem-approve |
If the StatusWatcher is processing Felix uptime updates whose embedded "time" field is materially older than wall-clock now, this is evidence that we are running behind the rate of updates Felix is producing and a backlog is building up. Customers have hit this in production: Neutron ends up seeing agent up/down transitions hours after they actually happened, and the existing logs give no early warning while the backlog is growing. Add a rate-limited WARNING in AgentStatusWatcher._on_status_set -- skipped during initial-snapshot replay where old timestamps are expected -- so operators can see the backlog building up long before it grows to hours. This is the operator-facing piece of the CI-1892 hardening work that still stands on its own merits after PR #12668 split the mech driver into per-process workers. The other defensive fixes from that branch (elector watchdog, etcd-confirmed mastership) no longer carry their weight: the periodic-resync loop they were targeted at is gone, and the existing time-based check in is_master() already catches stale-elector mastership for the continuous loops that remain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WHY
This is an extension of #12582.
As clusters grow larger, it is hard for a single Python process to do resync, compaction, and status updating at the same time. To address this, let's separate the jobs into multiple processes.
WHAT
This change will introduce four new worker processes, each in charge of:
HOW
Launch
neutron-serverwithcalicoas the plugin and set up an OpenStack cluster.TEST
Tested on a virtual OpenStack cluster with 3 instance of
neutron-server, all seems to be working:Details
Also, updated the unit tests according to the new design.
MISC
Release note: