Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 18 additions & 9 deletions libpysal/graph/_matching.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
import warnings

import numpy
from packaging.version import Version
from sklearn.metrics import pairwise_distances

from ._utils import _validate_geometry_input
Expand Down Expand Up @@ -69,6 +70,8 @@ def _spatial_matching(
"""
try:
import pulp

PULP_GE_4 = Version(pulp.__version__).major >= 4 # noqa: N806
except ImportError as error:
raise ImportError("spatial matching requires the pulp library") from error
if metric == "precomputed":
Expand Down Expand Up @@ -108,13 +111,14 @@ def _spatial_matching(

mp = pulp.LpProblem("optimal-neargraph", sense=pulp.LpMinimize)
# a match is as binary decision variable, connecting observation i to observation j
match_vars = pulp.LpVariable.dicts(
"match",
lowBound=0,
upBound=1,
indices=zip(row, col, strict=True),
cat="Continuous" if allow_partial_match else "Binary",
)
match_vars = {}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the need for a for-loop here. Is there no equivalent dicts functionality in pulp4 for batch variable addition?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LpVariable.dicts() names variables using the key directly in the string, so we'd get keys like "match_(0, 1)" instead of tuple keys (i, j).

And the rest of the code indexes match_vars[i, j] as a tuple.
Refactoring around that would add more complexity than the for-loop i think?
But let me if there's a cleaner way you have in mind..

and yeah, there IS addVariables() in pulp4, but it just takes already-created LpVariable objects, it's not a true batch creation equivalent.
The closest would still be LpVariable.dicts() + addVariables(), but that gives string keys like "match_(0,_1)" instead of the tuple keys (i, j)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to hold off on changing this unill they stabilize the 4.0 API.

I've just pulled down 4.0.0a3, which adds a LpProblem.add_variable_dicts() method, which does what we need it to do.

Let's wait until things stabilize with the 4.0 release, and move to the equivalent functionality then Let's keep this PR open for now, but it's very likely our implementation will not change this radically.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second @ljwolf sentiment here and over in pysal/spopt#527. Well this ride here and allow dev CI to fail -- no harm, no foul.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FirePheonix -- Let's mark this as Draft so it doesn't accidentally get merged. Same for pysal/spopt#527

for i, j in zip(row, col, strict=True):
name = f"match_{i}_{j}"
cat = "Continuous" if allow_partial_match else "Binary"
if PULP_GE_4:
match_vars[i, j] = mp.add_variable(name, lowBound=0, upBound=1, cat=cat)
else:
match_vars[i, j] = pulp.LpVariable(name, lowBound=0, upBound=1, cat=cat)
# we want to minimize the geographic distance of links in the graph
mp.objective = pulp.lpSum(
[
Expand Down Expand Up @@ -145,11 +149,16 @@ def _spatial_matching(
]
)
sense = int(not allow_partial_match)
mp += pulp.LpConstraint(summand, sense=sense, rhs=n_matches)
if sense == 1:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same sentiment here as above. Did pulp4 really change the API that dramatically?

mp += summand >= n_matches
elif sense == 0:
mp += summand == n_matches
else:
mp += summand <= n_matches
if match_between: # but, we may choose to ignore some sources
for i in range(n_sources):
summand = pulp.lpSum([match_vars[j, i] for j in range(n_targets)])
mp += pulp.LpConstraint(summand, sense=-1, rhs=n_matches)
mp += summand <= n_matches

status = mp.solve(solver)

Expand Down