
PortManager: replace uint64_t hash token with exact-tuple PortRangeKey (closes #1805) #1812

Merged
ibc merged 7 commits into versatica:v3 from 999purple999:fix/1805-port-range-tuple-key on Jun 2, 2026

Conversation

@999purple999
Contributor

Closes #1805.

Picks up the design @penguinol proposed in #1805 (comment) and the architectural greenlight @ibc gave in #1805 (comment) (no need to keep the uint64_t token).

Root cause (in current code, pre-fix)

PortManager::GeneratePortRangeHash mangles the bind address before placing it in the uint64_t hash:

case AF_INET:
    hash |= (address >> 2) << 2;  // bottom 2 bits of IPv4 address dropped

case AF_INET6:
    address = a[0] ^ a[1] ^ a[2] ^ a[3];  // 128-bit address XOR-folded into 32 bits
    hash |= static_cast<uint64_t>(address) << 16;
    hash |= (static_cast<uint64_t>(address) >> 2) << 2;

Both branches are lossy. Downstream, mapPortRanges is keyed on the hash and GetOrCreatePortRange treats a hash hit as "same range":

auto it = mapPortRanges.find(hash);
if (it != mapPortRanges.end()) { return it->second; }  // merges distinct tuples on collision

Then Unbind(hash, port) releases ports against the merged PortRange. In a multi-tenant or multi-interface deployment with nearby IPv4 addresses or XOR-colliding IPv6 addresses, two unrelated bindings get silently merged, and Unbind releases ports from whichever tuple landed second.

Concrete collision examples:

  • IPv4: 192.168.1.0, 192.168.1.1, 192.168.1.2, 192.168.1.3 (any /30) all map to the same hash for the same range. Common in container networks with sequential IPs or load-balanced front-ends.
  • IPv6: any two addresses where the four 32-bit words XOR to the same value (trivially constructible by reordering words) collide.
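
The two collision classes above can be reproduced in isolation. The sketch below is illustrative only: `lossyV4`, `xorFoldV6` and `ipv4` are hypothetical helpers that mirror just the two lossy expressions quoted from `GeneratePortRangeHash`, and it assumes the IPv4 address is handled as a host-order 32-bit integer, which the quoted snippet does not confirm.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: mirrors `(address >> 2) << 2` from the old IPv4 branch.
// Assumes `address` is a host-order 32-bit IPv4 address.
inline uint32_t lossyV4(uint32_t address)
{
    return (address >> 2) << 2;
}

// Hypothetical helper: mirrors `a[0] ^ a[1] ^ a[2] ^ a[3]` from the old IPv6 branch.
inline uint32_t xorFoldV6(const std::array<uint32_t, 4>& a)
{
    return a[0] ^ a[1] ^ a[2] ^ a[3];
}

// Builds a host-order IPv4 address from its four octets.
inline uint32_t ipv4(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    return (a << 24) | (b << 16) | (c << 8) | d;
}

inline void checkLossyCollisions()
{
    // All four addresses of 192.168.1.0/30 collapse to the same value.
    const uint32_t base = lossyV4(ipv4(192, 168, 1, 0));
    assert(lossyV4(ipv4(192, 168, 1, 1)) == base);
    assert(lossyV4(ipv4(192, 168, 1, 2)) == base);
    assert(lossyV4(ipv4(192, 168, 1, 3)) == base);

    // The first address of the next /30 is still distinguishable.
    assert(lossyV4(ipv4(192, 168, 1, 4)) != base);

    // XOR is commutative, so reordering the 32-bit words changes nothing.
    const std::array<uint32_t, 4> x{ 0x20010db8u, 0x00000001u, 0x00000002u, 0x00000003u };
    const std::array<uint32_t, 4> y{ 0x00000003u, 0x00000002u, 0x00000001u, 0x20010db8u };
    assert(xorFoldV6(x) == xorFoldV6(y));
}
```

Calling `checkLossyCollisions()` passes silently, which is the problem: distinct addresses are indistinguishable once folded.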

Fix

Replace the hash-as-key design with a struct-as-key design. The map's equality is exact-tuple; the hash function is purely for bucket distribution and is allowed to collide without harm (key equality keeps distinct tuples in distinct entries).
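
To make the "allowed to collide without harm" point concrete, here is a minimal sketch that uses `std::unordered_map` (not the absl map used in this PR) with a deliberately degenerate hash. `TupleKey`, `ConstantHash` and `countDistinctEntries` are hypothetical names for illustration, not mediasoup code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a tuple key; not the real PortRangeKey.
struct TupleKey
{
    uint32_t address;
    uint16_t minPort;
    uint16_t maxPort;

    bool operator==(const TupleKey& other) const noexcept
    {
        return address == other.address && minPort == other.minPort && maxPort == other.maxPort;
    }
};

// Deliberately terrible hash: every key lands in the same bucket.
struct ConstantHash
{
    std::size_t operator()(const TupleKey&) const noexcept
    {
        return 0u;
    }
};

inline std::size_t countDistinctEntries()
{
    std::unordered_map<TupleKey, int, ConstantHash> map;

    map[TupleKey{ 0xC0A80100u, 40000, 40099 }] = 1;
    map[TupleKey{ 0xC0A80101u, 40000, 40099 }] = 2; // same hash, different key: new entry
    map[TupleKey{ 0xC0A80100u, 40000, 40099 }] = 3; // same key: overwrites the first entry

    return map.size();
}
```

Even with every key hashing to the same bucket, the map ends up with two entries, because lookups fall back to `operator==`. A bad hash costs speed, not correctness.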

class PortManager::PortRangeKey
{
public:
    PortRangeKey() = default;
    PortRangeKey(Protocol, const sockaddr_storage&, uint16_t minPort, uint16_t maxPort);
    bool operator==(const PortRangeKey&) const noexcept;
private:
    friend class PortManager;
    friend struct PortRangeKeyHash;
    Protocol protocol{ Protocol::UDP };
    sockaddr_storage bindAddr{};
    uint16_t minPort{ 0u };
    uint16_t maxPort{ 0u };
};

struct PortManager::PortRangeKeyHash
{
    size_t operator()(const PortRangeKey&) const noexcept;
};

absl::flat_hash_map<PortRangeKey, PortRange, PortRangeKeyHash> mapPortRanges;

PortRangeKeyHash::operator() uses absl::HashOf over protocol, family, raw address bytes (sin_addr.s_addr or absl::MakeSpan(sin6_addr.s6_addr)), minPort, maxPort. No bit-mangling.

PortRangeKey::operator== is field-equal with std::memcmp on the IPv6 address bytes and direct uint32 compare on IPv4.
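
For reviewers who want to see the comparison spelled out, this is a rough sketch of field-wise equality over a `sockaddr_storage`. It is not the code in this PR: `KeySketch`, `sameKey`, `makeV4Key` and `makeV6Key` are hypothetical names, POSIX socket headers are assumed, and the sketch copies the address out with `std::memcpy` instead of casting.

```cpp
#include <arpa/inet.h>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical mirror of the key layout (POSIX socket headers assumed).
enum class Protocol : uint8_t { UDP, TCP };

struct KeySketch
{
    Protocol protocol{ Protocol::UDP };
    sockaddr_storage bindAddr{};
    uint16_t minPort{ 0u };
    uint16_t maxPort{ 0u };
};

inline bool sameKey(const KeySketch& a, const KeySketch& b) noexcept
{
    if (a.protocol != b.protocol || a.minPort != b.minPort || a.maxPort != b.maxPort)
        return false;
    if (a.bindAddr.ss_family != b.bindAddr.ss_family)
        return false;

    if (a.bindAddr.ss_family == AF_INET)
    {
        sockaddr_in x{};
        sockaddr_in y{};
        std::memcpy(&x, &a.bindAddr, sizeof(x));
        std::memcpy(&y, &b.bindAddr, sizeof(y));
        return x.sin_addr.s_addr == y.sin_addr.s_addr; // direct 32-bit compare
    }
    if (a.bindAddr.ss_family == AF_INET6)
    {
        sockaddr_in6 x{};
        sockaddr_in6 y{};
        std::memcpy(&x, &a.bindAddr, sizeof(x));
        std::memcpy(&y, &b.bindAddr, sizeof(y));
        // Byte-wise compare of all 16 address bytes.
        return std::memcmp(x.sin6_addr.s6_addr, y.sin6_addr.s6_addr, sizeof(x.sin6_addr.s6_addr)) == 0;
    }
    return true; // unknown family: the other fields already matched
}

inline KeySketch makeV4Key(uint32_t hostOrderAddr, uint16_t minPort, uint16_t maxPort)
{
    sockaddr_in in{};
    in.sin_family      = AF_INET;
    in.sin_addr.s_addr = htonl(hostOrderAddr);

    KeySketch key;
    std::memcpy(&key.bindAddr, &in, sizeof(in));
    key.minPort = minPort;
    key.maxPort = maxPort;
    return key;
}

inline KeySketch makeV6Key(const std::array<uint8_t, 16>& bytes, uint16_t minPort, uint16_t maxPort)
{
    sockaddr_in6 in6{};
    in6.sin6_family = AF_INET6;
    std::memcpy(in6.sin6_addr.s6_addr, bytes.data(), bytes.size());

    KeySketch key;
    std::memcpy(&key.bindAddr, &in6, sizeof(in6));
    key.minPort = minPort;
    key.maxPort = maxPort;
    return key;
}
```

With this shape, `makeV4Key(0xC0A80100u, 40000, 40099)` and `makeV4Key(0xC0A80101u, 40000, 40099)` compare unequal, and so do two IPv6 keys whose words are merely reordered.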

API change

Bind now outputs a PortRangeKey and Unbind takes one:

// before
static uv_udp_t* BindUdp(..., uint64_t& portRangeHash);
static void Unbind(uint64_t hash, uint16_t port);

// after
static uv_udp_t* BindUdp(..., PortRangeKey& portRangeKey);
static void Unbind(const PortRangeKey& key, uint16_t port);

@ibc green-lit the breaking change in the issue thread. The new struct is movable, copyable, and trivially storable as a member by callers (already done in UdpSocket, TcpServer).
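
As a rough picture of the caller-side pattern (bind, keep the key as a member, hand it back on teardown), here is a toy model. Nothing in it is mediasoup code: `RangeKey`, `ToyPortManager` and `ToySocket` are hypothetical, and the port allocation is a naive first-free scan chosen only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <tuple>

// Everything here is a hypothetical stand-in, not mediasoup code.
struct RangeKey
{
    uint32_t address{ 0u };
    uint16_t minPort{ 0u };
    uint16_t maxPort{ 0u };

    bool operator<(const RangeKey& o) const noexcept
    {
        return std::tie(address, minPort, maxPort) < std::tie(o.address, o.minPort, o.maxPort);
    }
};

class ToyPortManager
{
public:
    // Takes the first free port in the range and reports the exact key via `key`.
    static uint16_t Bind(uint32_t address, uint16_t minPort, uint16_t maxPort, RangeKey& key)
    {
        key.address = address;
        key.minPort = minPort;
        key.maxPort = maxPort;

        auto& used = Ranges()[key];
        for (uint32_t port = minPort; port <= maxPort; ++port)
        {
            if (used.insert(static_cast<uint16_t>(port)).second)
                return static_cast<uint16_t>(port);
        }
        return 0u; // range exhausted
    }

    // Releases the port from the range identified by the exact key.
    static void Unbind(const RangeKey& key, uint16_t port)
    {
        auto it = Ranges().find(key);
        if (it != Ranges().end())
            it->second.erase(port);
    }

    static std::size_t UsedPorts(const RangeKey& key)
    {
        auto it = Ranges().find(key);
        return it == Ranges().end() ? std::size_t{ 0 } : it->second.size();
    }

private:
    static std::map<RangeKey, std::set<uint16_t>>& Ranges()
    {
        static std::map<RangeKey, std::set<uint16_t>> instance;
        return instance;
    }
};

// The caller stores the key as a member and returns it on destruction.
class ToySocket
{
public:
    ToySocket(uint32_t address, uint16_t minPort, uint16_t maxPort)
      : port(ToyPortManager::Bind(address, minPort, maxPort, this->key))
    {
    }
    ~ToySocket()
    {
        ToyPortManager::Unbind(this->key, this->port);
    }
    ToySocket(const ToySocket&)            = delete;
    ToySocket& operator=(const ToySocket&) = delete;

    RangeKey key{}; // declared before `port` so it exists when Bind() writes to it
    uint16_t port{ 0u };
};
```

Two sockets bound on different addresses with the same port range each get their own entry, and destroying one releases only its own port.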

Callers updated to hold a PortRangeKey instead of uint64_t portRangeHash:

  • UdpSocket (ctor + dtor signatures + member)
  • TcpServer (ctor + dtor signatures + member)
  • PipeTransport (2 local variables)
  • PlainTransport (3 local variables)
  • WebRtcServer (4 local variables)
  • WebRtcTransport (4 local variables)

Total: 10 production source files touched. GeneratePortRangeHash is removed.

Tests

New worker/test/src/RTC/TestPortManager.cpp asserts:

  1. identical tuples compare equal (correctness): same (proto, addr, range) produces equal keys and equal hashes.
  2. IPv4 /30 collision is gone (regression for the IPv4 lossy hash): 192.168.1.{0,1,2,3} produce 4 distinct keys.
  3. IPv6 XOR-fold collision is gone (regression for the IPv6 lossy hash): word-reorder collisions produce distinct keys.
  4. Protocol differentiates: same (addr, range) for UDP and TCP produces distinct keys.
  5. Range bounds differentiate: (40000, 40099) vs (40000, 40100) vs (40001, 40099) are 3 distinct keys.
  6. Family differentiates: 0.0.0.0 and :: are distinct keys.

The full suite (npm test in worker/) is what CI will run on this PR.

Notes

  • Hash function is exposed as a public nested type only so the test file and the map declaration can name it. Equality lives on the key itself.
  • Protocol is moved from private to public nested so PortRangeKey callers can construct keys in tests. The enum has no runtime cost change.
  • The change is platform-agnostic. No new #ifdef branches, no platform-conditional code.
  • The Dump() method previously printed hash: %lu. It now prints protocol, family, minPort, maxPort. Operational logs that grepped for the hash value will need to adapt, but the protocol/family/range tuple is much more useful for debugging anyway.

Co-authored

@penguinol drafted the design in the issue thread (struct + absl::HashOf layout) and gave the explicit "I'm ok if you've already fix it" go-ahead. Credit lands in the commit trailer (Co-authored-by: penguinol).

@ibc cleared the breaking change in the same thread ("no need to keep the uint64_t token").

Happy to iterate on naming, struct visibility, or test coverage if you'd prefer a different shape.

@999purple999
Contributor Author

One disclosure on what I validated locally vs what I'm trusting CI to catch on this PR:

Validated locally:

  • The PortRangeKey semantics via the new TestPortManager.cpp unit-test logic (read-through, not executed locally for the reason below).
  • All caller sites mechanically (grep for portRangeHash / uint64_t.*portRange came back clean across worker/src/ and worker/include/).
  • Rust layer non-impact (grep for PortManager and the affected symbols in rust/ came back empty; rust/src/data_structures.rs::PortRange is the public config type, unrelated to the internal map key).
  • FBS schema non-impact (grep clean).

Not validated locally, deferred to CI:

  • meson + ninja C++ build of the worker (no Windows local toolchain set up here).
  • npm run lint (TypeScript + C++ lint rules).
  • npm test full suite.

If you'd prefer I spin up a Linux container to run the full release:check end-to-end and paste the output here before review, happy to. Otherwise I'll watch CI on this PR and iterate from whatever it surfaces.

The diff is bounded (10 production files, 1 new test file, no Rust changes, no FBS changes, no platform-conditional code added) so the CI feedback loop should be short.

@penguinol
Contributor

Thanks for your great work.
I think WebRtcServer also needs to be fixed.

// Map of WebRtcTransports indexed by TransportTuple.hash.
absl::flat_hash_map<uint64_t, RTC::WebRtcTransport*> mapTupleWebRtcTransport;

@ibc
Member

ibc commented May 29, 2026

Nice. Please give us some time to get to this. I think I'll have some time to properly review it next week.

@999purple999
Contributor Author

Thanks both!

@penguinol Good catch on WebRtcServer.hpp:122-123. That mapTupleWebRtcTransport<uint64_t, ...> is keyed on TransportTuple::hash, which is a separate hash system from PortManager's and is built with FNV (worker/src/RTC/TransportTuple.cpp:58), so it does not have the lossy address-folding pattern this PR addresses. It does carry the same structural risk though (any uint64_t hash can collide and silently merge entries in an absl::flat_hash_map), which I would argue is real but separate from #1805. Two ways forward:

  1. Keep this PR scoped to PortManager (closes #1805) and open a follow-up PR converting mapTupleWebRtcTransport to a TransportTuple-keyed map with the same struct + custom hash pattern.
  2. Fold both into this PR. It expands scope materially (touches the WebRtcTransport routing path on every packet), which would slow review.

I would vote (1) for review cleanliness, but defer to your preference.
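
For context on the distinction drawn above, a textbook 64-bit FNV-1a is sketched below. This is the generic algorithm, not a copy of `TransportTuple.cpp`, whose exact variant and inputs may differ; `fnv1a64` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Textbook 64-bit FNV-1a over a byte buffer (generic algorithm, assumed variant).
inline uint64_t fnv1a64(const uint8_t* data, std::size_t len) noexcept
{
    uint64_t hash = 14695981039346656037ull; // FNV offset basis

    for (std::size_t i = 0; i < len; ++i)
    {
        hash ^= data[i];        // every input byte feeds the state
        hash *= 1099511628211ull; // FNV prime
    }

    return hash;
}
```

Every byte feeds the state, so there is no built-in /30 or word-reorder blind spot. The output is still only 64 bits, though, so a map keyed on that value alone cannot tell two colliding tuples apart, which is the structural risk mentioned above.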

@ibc No rush, take the time you need. CI is fixed in 3d418e6 (a test helper I missed when changing the Bind signature). Ready when you are.

@ibc
Member

ibc commented May 31, 2026

@999purple999 we want to remove absl-cpp dependency. Rationale in this ongoing PR: #1813.

Changes in this PR would be:

#include <ankerl/unordered_dense.h>
size_t PortManager::PortRangeKeyHash::operator()(const PortRangeKey& key) const noexcept
{
    const auto protocolBits = static_cast<uint8_t>(key.protocol);
    const auto familyBits   = static_cast<uint16_t>(key.bindAddr.ss_family);

    auto hashCombine = [](size_t& seed, size_t value)
    {
        seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    };

    size_t seed = 0;

    switch (key.bindAddr.ss_family)
    {
        case AF_INET:
        {
            const auto* in = reinterpret_cast<const sockaddr_in*>(&key.bindAddr);

            hashCombine(seed, ankerl::unordered_dense::hash<uint8_t>{}(protocolBits));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(familyBits));
            hashCombine(seed, ankerl::unordered_dense::hash<uint32_t>{}(in->sin_addr.s_addr));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(key.minPort));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(key.maxPort));

            break;
        }

        case AF_INET6:
        {
            const auto* in6  = reinterpret_cast<const sockaddr_in6*>(&key.bindAddr);
            const auto* addr = in6->sin6_addr.s6_addr;

            uint64_t hi, lo;

            std::memcpy(&hi, addr,     sizeof(uint64_t));
            std::memcpy(&lo, addr + 8, sizeof(uint64_t));
            hashCombine(seed, ankerl::unordered_dense::hash<uint8_t>{}(protocolBits));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(familyBits));
            hashCombine(seed, ankerl::unordered_dense::hash<uint64_t>{}(hi));
            hashCombine(seed, ankerl::unordered_dense::hash<uint64_t>{}(lo));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(key.minPort));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(key.maxPort));

            break;
        }

        default:
        {
            hashCombine(seed, ankerl::unordered_dense::hash<uint8_t>{}(protocolBits));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(familyBits));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(key.minPort));
            hashCombine(seed, ankerl::unordered_dense::hash<uint16_t>{}(key.maxPort));

            break;
        }
    }

    return seed;
}
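
As a side note on why this mixer does not reintroduce the word-reorder problem, the sketch below feeds raw values through the same `hashCombine` expression and compares it with a plain XOR. `mixPair` and `xorPair` are hypothetical helpers, and the values are passed directly instead of going through `ankerl::unordered_dense::hash`.

```cpp
#include <cassert>
#include <cstddef>

// Same expression as the hashCombine lambda above, as a free function.
inline void hashCombine(std::size_t& seed, std::size_t value) noexcept
{
    seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical helper: combines two values in order.
inline std::size_t mixPair(std::size_t first, std::size_t second) noexcept
{
    std::size_t seed = 0;

    hashCombine(seed, first);
    hashCombine(seed, second);

    return seed;
}

// Hypothetical helper: the order-blind folding the old IPv6 branch relied on.
inline std::size_t xorPair(std::size_t first, std::size_t second) noexcept
{
    return first ^ second;
}
```

Here `xorPair(1, 2)` equals `xorPair(2, 1)`, while `mixPair(1, 2)` and `mixPair(2, 1)` differ, because each step also mixes in the running seed.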

@jmillan self-requested a review on June 1, 2026 07:44
FRAXKAPPA and others added 3 commits June 1, 2026 15:49
closes versatica#1805)

The previous GeneratePortRangeHash collapsed distinct (protocol, address,
port range) tuples into one uint64_t bucket because of two lossy operations:

  IPv4: hash |= (address >> 2) << 2;        // bottom 2 bits dropped
  IPv6: address = a[0] ^ a[1] ^ a[2] ^ a[3];  // 128 bits XOR-folded to 32

Downstream, `GetOrCreatePortRange` treats a hash hit as "same range" and
`Unbind(hash, port)` releases ports against whatever PortRange the hash
maps to. In a multi-tenant or multi-interface mediasoup deployment with
nearby IPv4 addresses (any /30 block) or XOR-colliding IPv6 addresses,
two unrelated bindings get silently merged into one PortRange. The
`Unbind` path then releases ports from the wrong tenant's range.

This replaces the hash-as-key design with a struct-as-key design:

  class PortRangeKey {
      Protocol protocol;
      sockaddr_storage bindAddr;
      uint16_t minPort;
      uint16_t maxPort;
      bool operator==(const PortRangeKey&) const noexcept;
  };

  struct PortRangeKeyHash {
      size_t operator()(const PortRangeKey&) const noexcept;
  };

  absl::flat_hash_map<PortRangeKey, PortRange, PortRangeKeyHash> mapPortRanges;

Map equality is exact-tuple. The hash function (absl::HashOf over
protocol + family + raw address bytes + range bounds) is for bucket
distribution only; even on hash collision, key equality keeps distinct
tuples in distinct entries.

API change (per @ibc 2026-05-28: "I am fine with the proposed changes,
no need to keep the uint64_t token"): `Bind` now outputs a
`PortRangeKey` and `Unbind` takes one. Callers updated:
UdpSocket, TcpServer, PipeTransport, PlainTransport, WebRtcServer,
WebRtcTransport.

Tests: new TestPortManager.cpp asserts that
  - identical tuples compare equal (correctness),
  - all 4 IPv4 addresses in 192.168.1.0/30 produce distinct keys
    (regression for the IPv4 /30 collision),
  - IPv6 word-swap collisions produce distinct keys (regression for
    the IPv6 XOR fold),
  - protocol, range bounds, and family each independently differentiate
    keys.

Co-authored-by: penguinol <penguinol@users.noreply.github.com>

makeUdpSocket lambda was still constructing UdpSocket range ctor with
uint64_t portRangeHash as the 6th argument. After replacing the token
type with PortRangeKey in b6ff2d8, the helper needed the new type too.

clang-tidy on worker-clang-tidy job flagged readability-identifier-naming
violations on the two helper functions in TestPortManager.cpp:

  MakeV4 → makeV4
  MakeV6 → makeV6

mediasoup convention is lowerCamelCase for free functions. Mechanical fix
of the only two warnings clang-tidy raised on this PR.
@999purple999 force-pushed the fix/1805-port-range-tuple-key branch from 4720cb9 to bdf22cc on June 1, 2026 13:50
@999purple999
Contributor Author

Done @ibc. Rebased the branch on v3 so it picks up the absl removal from #1813, and rewrote PortRangeKeyHash::operator() per your spec: ankerl::unordered_dense::hash<> per field + the standard boost-style hashCombine seed mixer, switching on ss_family for IPv4 / IPv6 / fallback. The map itself is now ankerl::unordered_dense::map<PortRangeKey, PortRange, PortRangeKeyHash> and the two absl headers (absl/hash/hash.h, absl/types/span.h) plus the flat_hash_map include are gone.

One tiny deviation worth flagging: I split the IPv6 16-byte address into two uint64_t halves with std::memcpy (not the addr pointer alias) to keep things strict-aliasing-safe; same hash inputs, no UB. Happy to revert that to the exact pattern you posted if you prefer.
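
The `std::memcpy` split can be sketched on its own as follows. `splitV6` is a hypothetical helper operating on a plain 16-byte array, not the `sockaddr_in6` access used in the PR.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper: copies the 16 address bytes into two 64-bit halves.
// memcpy avoids reading the byte buffer through a uint64_t pointer.
inline std::pair<uint64_t, uint64_t> splitV6(const std::array<uint8_t, 16>& addr) noexcept
{
    uint64_t hi = 0;
    uint64_t lo = 0;

    std::memcpy(&hi, addr.data(), sizeof(hi));     // bytes 0..7
    std::memcpy(&lo, addr.data() + 8, sizeof(lo)); // bytes 8..15

    return { hi, lo };
}
```

The numeric value of each half depends on host endianness, which is fine for an in-process hash input. What matters is that every one of the 16 bytes lands in exactly one half, so two different addresses always yield a different pair.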

Branch: https://github.com/999purple999/mediasoup/tree/fix/1805-port-range-tuple-key
Force-pushed since this is a rebase.

Let me know if you want anything else trimmed before merge.

@ibc
Member

ibc commented Jun 1, 2026

@999purple999 please run "npm run release:check" and (within the worker folder) "make format". Then ping here so I notice the commit and run CI again.

Comment thread worker/include/RTC/PortManager.hpp Outdated
Comment thread worker/test/src/RTC/TestPortManager.cpp Outdated
Comment thread worker/test/src/RTC/TestPortManager.cpp Outdated
Comment thread worker/test/src/RTC/TestPortManager.cpp Outdated
@999purple999
Contributor Author

Hey @ibc, quick update.

I tried make format (python -m invoke format from worker/) on my local Windows machine but the worker tooling expects a Unix-style env: clang-format isn't on PATH and the invoke task ends up calling npm run format --prefix scripts/ against a node_modules layout it can't find on Windows. npm run release:check from root similarly chokes on python -m invoke -r worker lint (same path).

Looking at the CI run on bdf22cc, every Linux + Windows config compiles src_RTC_PortManager.cpp.obj and runs the tests clean. The only red is macos-15, clang, clang++, Release which failed in 30s during runner setup before any compile step (looks like a runner-level issue, not a code one).

So I'm reluctant to push a "format-only" commit that I can't verify locally on the right toolchain. Could you re-run CI on the current head and ping me with the actual failure if any? If something specific is off-style I'll fix it surgically.

Thanks for the patience.

@ibc
Member

ibc commented Jun 1, 2026

Can you please give me write access to your PR so I can format it?

@ibc
Member

ibc commented Jun 1, 2026

I'm fixing lint errors and some cosmetic minor things.

ibc added 2 commits June 1, 2026 23:05
- Do not include headers already present in `common.hpp`.
- Add missing `#include "RTC/PortManager.hpp"` in some files.
- Reorder classes/structs/methods.
- Use `std::addressof(x)` instead of `&x`.
- Do not use `using`. Be always explicit and include all namespaces.
@ibc
Member

ibc commented Jun 1, 2026

@999purple999 I've pushed cosmetic changes and CHANGELOG to your branch.

@ibc
Member

ibc commented Jun 1, 2026

I've created a separate ticket to refactor/replace the uint64_t TransportTuple::hash member: #1815

@ibc
Member

ibc commented Jun 2, 2026

@999purple999 I think this is ready to merge, right?

@999purple999
Contributor Author

Yes @ibc, ready to merge from my side. Thanks for the cosmetic pass, the CHANGELOG entry, the @penguinol attribution, and the v3 merge.

Re #1815: happy to take the TransportTuple follow-up once this lands. The same struct + hashCombine pattern should drop in cleanly; the only piece worth thinking through is whether TransportTuple wants PortRangeKeyHash as-is or its own type since the key shape differs (no min/max port range there). I'll open a PR against #1815 next week unless @penguinol gets to it first.

Thanks both, this has been a good review cycle.

@ibc
Member

ibc commented Jun 2, 2026

Great, merging. Thanks a lot, guys!

@ibc
Member

ibc commented Jun 2, 2026

> Re #1815: happy to take the TransportTuple follow-up once this lands. The same struct + hashCombine pattern should drop in cleanly; the only piece worth thinking through is whether TransportTuple wants PortRangeKeyHash as-is or its own type since the key shape differs (no min/max port range there). I'll open a PR against #1815 next week unless @penguinol gets to it first.

Let's please discuss this in #1815.

@ibc merged commit aa099ec into versatica:v3 on Jun 2, 2026
43 checks passed
@999purple999
Contributor Author

Thanks @ibc, @jmillan, @penguinol. Will open the #1815 follow-up within the week as agreed.

Development

Successfully merging this pull request may close these issues.

Avoid five-tuples hash collisions

5 participants