
Add FLAML search spaces to models #635

Open

EgorKraevTransferwise wants to merge 2 commits into py-why:main from
EgorKraevTransferwise:search_space

Conversation

@EgorKraevTransferwise

Addressing #557
Just did ForestDRLearner to see if you agree with the way I'm doing it; I'll
add the others if so.

Is there a reason you don't support dowhy versions from 0.8?

@immu4989

immu4989 commented May 6, 2026

Thanks for putting this up @EgorKraevTransferwise, I would like to help unblock
it. I am working in the FLAML codebase concurrently and was looking at how its
search space API gets consumed downstream, which led me here. A few things are
worth raising before any further code work, since the PR is blocked on a
direction question that never got answered in 2022.

Direction question (for @kbattocchi)

In #557 you asked for an
optimizer-agnostic spec so users could plug in hyperopt as easily as FLAML.
The PR as filed is FLAML-specific. Before Egor invests more in this branch,
is your guidance still to go optimizer-agnostic? Two viable shapes:

  1. Keep it FLAML-shaped now, accept the tight coupling, and document that
    the API is provisional (e.g., _search_space rather than search_space).
  2. Optimizer-agnostic spec: return a plain dict like
    {"n_estimators": {"type": "randint", "low": 2, "high": 200}, ...} and
    ship a small econml.tuning.flaml_adapter translator. More work, but it
    matches your 2022 ask and avoids hard-pinning FLAML.

If (2), the PR should move the API onto a mixin / base class so it lands
once and gets inherited by LinearDML, CausalForestDML, DRLearner,
etc., rather than being defined per-estimator. A rough sketch:
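To make the trade-off concrete, here is what (2) could look like. Everything
below is illustrative: SearchSpaceMixin, the spec schema, and the to_flaml
adapter are stand-ins, not existing EconML API.

```python
class SearchSpaceMixin:
    """Hypothetical mixin: inherited once by LinearDML, CausalForestDML, DRLearner, ..."""

    @classmethod
    def search_space(cls):
        # Plain-dict, optimizer-agnostic spec; no tuner import needed here.
        return {
            "n_estimators": {"type": "randint", "low": 2, "high": 200},
            "min_samples_leaf": {"type": "randint", "low": 1, "high": 50},
            "max_samples": {"type": "uniform", "low": 0.05, "high": 0.95},
        }


def to_flaml(spec):
    """Translate the plain-dict spec into a FLAML search space (the proposed econml.tuning.flaml_adapter)."""
    from flaml import tune  # imported lazily so FLAML stays optional

    translators = {
        "randint": lambda d: tune.randint(d["low"], d["high"]),
        "uniform": lambda d: tune.uniform(d["low"], d["high"]),
        "loguniform": lambda d: tune.loguniform(d["low"], d["high"]),
        "choice": lambda d: tune.choice(d["values"]),
    }
    return {name: translators[d["type"]](d) for name, d in spec.items()}
```

A hyperopt (or optuna) adapter would then be a second, equally small translator
over the same spec, which is the point of keeping the spec optimizer-agnostic.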

Must fix correctness (regardless of direction)

  1. tune is not imported in the diff. econml/dr/_drlearner.py uses
    tune.loguniform, tune.randint, etc., but the patch contains no
    from flaml import tune. The classmethods would raise NameError on
    first call. (Whatever direction you take above, the import must land
    somewhere.)

  2. max_samples: tune.uniform(0, 0.5). sklearn's max_samples requires
    the half-open interval (0, 1]; sampling exactly 0 will raise. Suggest
    tune.uniform(0.05, 0.95), or (0.05, 0.5) if there's a reason to cap at
    half, but justify the cap: sklearn's own default is None ≈ 1.0.

  3. Patch targets setup.cfg. EconML migrated to pyproject.toml (see
    #1538-era cleanup);
    setup.cfg no longer drives dependency resolution. Move the FLAML add
    into pyproject.toml's dependencies (or [project.optional-dependencies]
    under a tuning extra — see point 4).

  4. Hard FLAML dependency. Your own June 2022 reply promised wrapping the
    import via econml.utilities.MissingModule to keep FLAML optional. The
    PR currently makes it a hard install_requires. With FLAML's transitive
    footprint (xgboost, lightgbm, sklearn pins), making it mandatory will
    degrade install ergonomics for users who don't tune. Either
    MissingModule + lazy import, or move FLAML under a tuning extra (a
    minimal guard is sketched after this list).

  5. Stale base + merge conflict. PR opens against ac12f54f (June 2022);
    GitHub reports mergeable_state: dirty. _drlearner.py has had
    substantial churn since (notably the model_selection-based 'auto'
    nuisance refactor). Needs rebase before further review.

  6. No tests. A new public classmethod exposed on ForestDRLearner with
    no coverage. Suggest a minimal test: instantiate, call
    cls.search_space(), sample one config from each domain, and pass both
    it and cls.search_start() through the constructor to confirm every key
    is a valid __init__ arg (sketch after this list).
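For points 3 and 4, a minimal sketch of the optional-dependency guard, assuming
FLAML moves under a tuning extra (the extra name and the _require_tune helper
are made up here; econml.utilities.MissingModule could replace the plain
try/except if its interface fits):

```python
# Illustrative only: keep FLAML optional instead of a hard install_requires.
try:
    from flaml import tune
except ImportError:
    tune = None


def _require_tune():
    """Return flaml.tune, or raise a clear error when FLAML is not installed."""
    if tune is None:
        raise ImportError(
            "FLAML is required for search_space(); install the optional extra, "
            "e.g. pip install 'econml[tuning]'"
        )
    return tune
```

For point 6, a rough shape for the minimal test. The search_space / search_start
names are the PR's (I'm assuming search_start is exposed the same way as
search_space); the per-domain .sample() call assumes FLAML domains expose
Ray-Tune-style sampling and may need adjusting:

```python
from econml.dr import ForestDRLearner


def _sample_config(space):
    """Draw one concrete value from each FLAML domain (plain values pass through)."""
    return {
        name: domain.sample() if hasattr(domain, "sample") else domain
        for name, domain in space.items()
    }


def test_forest_drlearner_search_space_roundtrip():
    space = ForestDRLearner.search_space()
    config = _sample_config(space)
    # Every sampled key and every search_start key must be a valid __init__ arg.
    ForestDRLearner(**config)
    ForestDRLearner(**ForestDRLearner.search_start())
```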

Design feedback

  • min_impurity_decrease: tune.uniform(0, 10). Upper bound of 10 is far
    outside the practical range — sklearn's default is 0.0, and values
    above ~0.1 typically prune trees to stumps. Suggest tune.loguniform(1e-4, 1e-1)
    or tune.uniform(0, 0.1). Worth comparing against ranges used in
    established AutoML libs (auto-sklearn, optuna's sklearn integration).

  • honest: tune.choice([0, 1]) vs search_start: honest: True.
    Type mismatch — sampler returns int, start returns bool. EconML
    passes the value through to a constructor expecting bool; mostly OK
    via Python's truthiness, but cleaner as tune.choice([False, True]).

  • subforest_size: tune.randint(2, 10) searched independently of
    n_estimators. ForestDRLearner expects n_estimators % subforest_size == 0
    (see RegressionForest); independent sampling will hit invalid configs.
    Either constrain via a conditional, or document the invariant and let
    the constructor raise (one possible workaround is sketched after this
    list).

  • Commented-out lines. mc_iters and max_depth carry your "is this
    worth searching?" / "is this the right syntax?" notes. Decide before
    merge:

    • mc_iters: typically a fidelity dial, not a tuning knob — leave out.
    • max_depth: FLAML's tune.choice([None, tune.randint(2, 1000)]) is
      not valid (tune.choice expects concrete values). Use a single
      tune.randint(2, 50) and accept that the unbounded None case isn't
      searched, or use a sentinel like 0 and translate (also shown in the
      sketch below).
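To make the subforest_size coupling and the max_depth sentinel concrete, one
possible shape. The n_subforests key and the exact bounds are invented for the
sketch, and randint upper bounds are treated as exclusive, mirroring Ray Tune's
convention:

```python
from flaml import tune

# Illustrative fragment: sample a subforest count instead of n_estimators
# directly, so n_estimators is always an exact multiple of subforest_size,
# and use 0 as a stand-in for max_depth=None.
space_fragment = {
    "subforest_size": tune.randint(2, 11),
    "n_subforests": tune.randint(1, 51),   # invented key, not in the PR
    "max_depth": tune.randint(0, 51),      # 0 means "no depth limit"
}


def to_constructor_kwargs(config):
    """Translate a sampled config into ForestDRLearner keyword arguments."""
    kwargs = dict(config)
    kwargs["n_estimators"] = kwargs["subforest_size"] * kwargs.pop("n_subforests")
    if kwargs.get("max_depth") == 0:
        kwargs["max_depth"] = None
    return kwargs
```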

Smaller things

  • search_start: n_estimators: 100 vs. constructor default 1000. Reasonable
    for a search starting point (faster), but worth a one-line docstring
    saying "intentionally lower than the model default to keep search budgets
    bounded."

  • tune.choice(["auto", "sqrt", "log2", None])"auto" is sklearn-1.3-deprecated
    for RandomForestRegressor, but EconML's RegressionForest may still
    accept it (it routes to n_features per the docstring). Worth a quick
    test that all four values still work end-to-end under the project's
    current sklearn pin (>=1.0,<1.9).
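Roughly like this; the synthetic data, nuisance models, and forest sizes are
arbitrary, and the only point is that each max_features value survives fit and
effect end-to-end:

```python
import numpy as np
import pytest
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

from econml.dr import ForestDRLearner


@pytest.mark.parametrize("max_features", ["auto", "sqrt", "log2", None])
def test_max_features_values_fit(max_features):
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    T = rng.integers(0, 2, size=200)  # binary treatment
    Y = X[:, 0] + T * (1.0 + X[:, 1]) + rng.normal(scale=0.1, size=200)
    est = ForestDRLearner(
        model_regression=RandomForestRegressor(n_estimators=10, random_state=0),
        model_propensity=RandomForestClassifier(n_estimators=10, random_state=0),
        n_estimators=100,  # keep a multiple of subforest_size (default 4)
        max_features=max_features,
        cv=2,
        random_state=0,
    )
    est.fit(Y, T, X=X)
    assert est.effect(X[:5]).shape == (5,)
```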

Offer

If direction (1) or (2) gets confirmed, I'm happy to take the rebase plus the
items above as a follow-up PR (with credit to the original work), or to leave
it with you, whichever you prefer.

