
PERF: Speed up np.isin table method with clipped indexing#30828

Open
mdrdope wants to merge 3 commits into numpy:main from mdrdope:patch-4

Conversation

@mdrdope
Contributor

@mdrdope mdrdope commented Feb 13, 2026

Summary

When np.isin uses the integer table method, the current implementation creates
a boolean mask to filter out-of-range elements, then gathers the in-range subset,
performs a lookup, and scatters results back. This PR adds a faster clipped-indexing
path that avoids the mask / gather / scatter overhead by extending the lookup
table with boundary sentinels and using np.clip to map all out-of-range values
to the sentinel slots.

Local tests: pytest numpy/lib/tests all passed; the full test suite is covered by CI.

A lightweight strided-sampling heuristic (~384 elements, O(1) view) selects
which path to use at runtime: clipped indexing when ≥ 5 % of ar1 falls inside
[ar2_min, ar2_max], and the original masking path otherwise.

Motivation

The table method in _isin currently does:

basic_mask = (ar1 <= ar2_max) & (ar1 >= ar2_min)                  # 2 comparisons + AND
in_range_ar1 = ar1[basic_mask]                                    # gather
outgoing[basic_mask] = isin_helper_ar[in_range_ar1 - ar2_min]     # lookup + scatter

This involves 5 memory passes over ar1 (two comparisons, AND, gather,
scatter) plus non-contiguous memory access for the gather/scatter.

The clipped indexing path replaces this with:

ar1_indices = ar1 - (ar2_min - 1)                        # shift
np.clip(ar1_indices, 0, ar2_range + 2, out=ar1_indices)  # clip (in-place)
return extended_table[ar1_indices]                       # lookup

3 dense passes instead of 5.
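As a self-contained illustration of the sentinel layout (a sketch following the snippets above, not the exact patch code):

```python
import numpy as np

# Sketch of the clipped-indexing path described above. Slot 0 and slot
# ar2_range + 2 are out-of-range sentinels (always False); the real table
# entries occupy offsets 1 .. ar2_range + 1.
ar1 = np.array([3, 7, 10, 42, -5, 8])
ar2 = np.array([7, 8, 10])

ar2_min, ar2_max = ar2.min(), ar2.max()
ar2_range = ar2_max - ar2_min

extended_table = np.zeros(ar2_range + 3, dtype=bool)
extended_table[ar2 - ar2_min + 1] = True          # shift entries past sentinel 0

ar1_indices = ar1 - (ar2_min - 1)                 # in-range values -> 1 .. range + 1
np.clip(ar1_indices, 0, ar2_range + 2, out=ar1_indices)  # out-of-range -> sentinels
result = extended_table[ar1_indices]

assert np.array_equal(result, np.isin(ar1, ar2))  # matches the reference result
```

Every out-of-range value is clipped onto a sentinel slot that is always False, so no mask, gather, or scatter is needed.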

Why a heuristic is needed

When very few elements of ar1 fall within [ar2_min, ar2_max] (< 5 %),
masking skips nearly all elements in the gather/scatter step, making it cheaper
than clipped indexing (which always processes every element). A quick
data-overlap test selects the faster path at runtime.

How the sampling works

_stride = max(1, ar1.size // 384)
_sample = ar1[::_stride]                                          # O(1) view
_n_in = np.count_nonzero((_sample >= ar2_min) & (_sample <= ar2_max))
  • ar1[::stride] is a strided view — O(1), zero allocation.
  • Touches ~384 elements (~3 KB), negligible vs the O(N) main work.
  • The dense stride (N // 384) reduces sensitivity to periodic or
    block-sorted input data.
  • Algorithm-selection via data sampling is a standard technique
    (cf. std::sort's median-of-three pivot, database optimizer sample scans).
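Putting the pieces together, the path-selection logic described in this PR looks roughly like this (the function name and the hard-coded 5 % threshold are illustrative, mirroring the description rather than the final patch):

```python
import numpy as np

def choose_path(ar1, ar2_min, ar2_max, n_sample=384, threshold=0.05):
    """Pick 'clip' or 'mask' from a ~384-element strided sample of ar1."""
    stride = max(1, ar1.size // n_sample)
    sample = ar1[::stride]                      # strided view: O(1), no copy
    n_in = np.count_nonzero((sample >= ar2_min) & (sample <= ar2_max))
    return "clip" if n_in / sample.size >= threshold else "mask"

rng = np.random.default_rng(0)
mostly_out = rng.integers(0, 1_000_000, size=100_000)   # ~0.01 % in [100, 200]
print(choose_path(mostly_out, 100, 200))                # mask
mostly_in = rng.integers(100, 201, size=100_000)        # 100 % in range
print(choose_path(mostly_in, 100, 200))                 # clip
```

Because the sample is a strided view, the heuristic's cost is bounded regardless of N, and it only has to be accurate near the 5 % crossover, not exact.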

Experimental evidence

Benchmarked across 250 parameter combinations
(N ∈ {1K, 10K, 100K, 1M, 5M},
ar2_range ∈ {10, 50, 200, 2K, 20K},
p_in ∈ {1 %, 3 %, 5 %, 7 %, 10 %, 15 %, 30 %, 50 %, 80 %, 100 %}).

Variable Meaning
N ar1 length (number of membership queries)
ar2_range max(ar2) - min(ar2) (lookup table domain size)
p_in fraction of ar1 elements within [ar2_min, ar2_max]

Key finding: p_in determines the crossover

The heatmap plot below shows T_clip / T_mask as a function of the sampled
in-range fraction across all N and ar2_range values. The crossover
(ratio = 1.0) sits near p_in ≈ 0.05, justifying the 5 %
threshold as the sole decision variable.

[Figure: heatmap_clip_vs_mask — T_clip / T_mask per N across ar2_range and p_in]

When overlap is very small (below ~5 %), the masking path is cheaper because it does the expensive work on only a tiny subset of elements. It still builds a boolean mask over all of ar1, but the table lookup and write-back (outgoing[basic_mask] = isin_helper_ar[in_range_ar1 - ar2_min]) touch only the values that fall inside [ar2_min, ar2_max]. If only 1 % of ar1 is in range, 99 % of the array is effectively skipped after the mask check.

The clipped path, by contrast, always processes every element: it creates and clips an index array over all of ar1 and performs a lookup for every element, even those out of range, so its cost is fixed regardless of overlap. As a result there is a small regression when very few values match.

Once overlap grows beyond ~5%, masking loses its advantage. Now it must gather and write back a large fraction of elements, and those writes are scattered (non-contiguous), which is slower. The clipped path remains a simple, sequential pass over memory, so it becomes faster as overlap increases.

Experiment script (not part of the commit)
"""
Experiment: masking vs clipping raw comparison + sampled in-range fraction.

Goal: find the best threshold for the sampling heuristic by plotting
      clip/mask ratio as a function of the sampled in-range fraction.

Sweeps: N (ar1 size), ar2_range, p_in (fraction of ar1 in [ar2_min, ar2_max]).
"""
import numpy as np
import timeit
import json
import os

# ===== Two raw implementations =====

def isin_masking(ar1, ar2, ar2_min, ar2_max, ar2_range):
    """Original NumPy masking approach."""
    outgoing_array = np.zeros(ar1.shape, dtype=bool)

    isin_helper_ar = np.zeros(ar2_range + 1, dtype=bool)
    isin_helper_ar[ar2 - ar2_min] = True

    basic_mask = (ar1 <= ar2_max) & (ar1 >= ar2_min)
    in_range_ar1 = ar1[basic_mask]
    if in_range_ar1.size == 0:
        return outgoing_array

    ar2_min_arr = np.array(ar2_min, dtype=np.intp)
    out = np.empty_like(in_range_ar1, dtype=np.intp)
    outgoing_array[basic_mask] = isin_helper_ar[
        np.subtract(in_range_ar1, ar2_min_arr, dtype=np.intp,
                    out=out, casting="unsafe")]

    return outgoing_array


def isin_clipping(ar1, ar2, ar2_min, ar2_max, ar2_range):
    """Clipped indexing: subtract + clip + fancy index."""
    shift_val = ar2_min - 1

    extended_table = np.zeros(ar2_range + 3, dtype=bool)
    extended_table[ar2 - ar2_min + 1] = True

    ar1_indices = ar1.astype(np.intp, copy=False) - shift_val
    np.clip(ar1_indices, 0, ar2_range + 2, out=ar1_indices)

    return extended_table[ar1_indices]


# ===== Data generation with controlled p_in =====

def generate_data(N, ar2_range, p_in, ar2_size=1000, seed=42):
    """Generate test data with precisely controlled in-range proportion."""
    rng = np.random.RandomState(seed)

    ar2_min = 100000
    ar2_max = ar2_min + ar2_range

    ar2 = rng.randint(ar2_min, ar2_max + 1, size=ar2_size)

    n_in = int(p_in * N)
    n_out = N - n_in

    parts = []
    if n_in > 0:
        parts.append(rng.randint(ar2_min, ar2_max + 1, size=n_in))
    if n_out > 0:
        n_below = n_out // 2
        n_above = n_out - n_below
        if n_below > 0:
            parts.append(rng.randint(0, ar2_min, size=n_below))
        if n_above > 0:
            parts.append(rng.randint(ar2_max + 1, ar2_max + 1 + 100000,
                                     size=n_above))

    ar1 = np.concatenate(parts)
    rng.shuffle(ar1)

    return ar1, ar2, ar2_min, ar2_max, ar2_range


def compute_sampled_fraction(ar1, ar2_min, ar2_max, n_sample=256):
    """Compute the sampled in-range fraction (what the heuristic would see)."""
    stride = max(1, len(ar1) // n_sample)
    sample = ar1[::stride]  # view, O(1)
    n_in = np.count_nonzero((sample >= ar2_min) & (sample <= ar2_max))
    return n_in / len(sample)


# ===== Parameters =====

Ns = [1000, 10_000, 100_000, 1_000_000, 5_000_000]
ar2_ranges = [10, 50, 200, 2_000, 20_000]
p_ins = [0.01, 0.03, 0.05, 0.07, 0.1, 0.15, 0.3, 0.5, 0.8, 1.0]

total = len(Ns) * len(ar2_ranges) * len(p_ins)
results = {}
count = 0

print(f"Total experiments: {total}")
print("=" * 80)

for N in Ns:
    n_repeat, n_number = 6, 4

    for ar2_range in ar2_ranges:
        for p_in in p_ins:
            count += 1
            print(f"[{count:3d}/{total}] N={N:>10,}  "
                  f"ar2_range={ar2_range:>7,}  p_in={p_in:.3f}  ", end="")

            ar1, ar2, ar2_min, ar2_max, ar2_rng = generate_data(
                N, ar2_range, p_in)

            # Verify correctness
            result_mask = isin_masking(ar1, ar2, ar2_min, ar2_max, ar2_rng)
            result_clip = isin_clipping(ar1, ar2, ar2_min, ar2_max, ar2_rng)
            assert np.array_equal(result_mask, result_clip), \
                f"mask vs clip differ! N={N}, range={ar2_range}, p_in={p_in}"

            # Compute sampled in-range fraction
            sampled_frac = compute_sampled_fraction(ar1, ar2_min, ar2_max)

            # Time both
            t_mask = min(timeit.repeat(
                lambda: isin_masking(ar1, ar2, ar2_min, ar2_max, ar2_rng),
                repeat=n_repeat, number=n_number
            )) / n_number

            t_clip = min(timeit.repeat(
                lambda: isin_clipping(ar1, ar2, ar2_min, ar2_max, ar2_rng),
                repeat=n_repeat, number=n_number
            )) / n_number

            ratio = t_clip / t_mask

            results[(N, ar2_range, p_in)] = {
                't_mask': t_mask,
                't_clip': t_clip,
                'ratio': ratio,
                'sampled_frac': sampled_frac,
            }

            print(f"mask={t_mask:.6f}s  clip={t_clip:.6f}s  "
                  f"clip/mask={ratio:.3f}  sampled_frac={sampled_frac:.4f}")

# ===== Save raw results to JSON =====
save_path = os.path.dirname(os.path.abspath(__file__))
json_results = {}
for (N, ar2_range, p_in), v in results.items():
    key = f"{N}_{ar2_range}_{p_in}"
    json_results[key] = {
        'N': N, 'ar2_range': ar2_range, 'p_in': p_in,
        't_mask': v['t_mask'], 't_clip': v['t_clip'],
        'ratio': v['ratio'], 'sampled_frac': v['sampled_frac'],
    }

json_path = os.path.join(save_path, 'sweep_results.json')
with open(json_path, 'w') as f:
    json.dump(json_results, f, indent=2)
print(f"\nRaw results saved to {json_path}")

# ===== Summary table =====
print("\n" + "=" * 95)
print("SUMMARY: clipping vs original masking  (+ sampled in-range fraction)")
print("=" * 95)
print(f"{'N':>12}  {'ar2_range':>10}  {'p_in':>6}  "
      f"{'sampled%':>9}  {'clip/mask':>10}  {'winner':>8}")
print("-" * 95)

for N in Ns:
    for ar2_range in ar2_ranges:
        for p_in in p_ins:
            v = results[(N, ar2_range, p_in)]
            winner = "CLIP" if v['ratio'] < 1.0 else "MASK"
            print(f"{N:>12,}  {ar2_range:>10,}  {p_in:>6.3f}  "
                  f"{v['sampled_frac']:>8.4f}  {v['ratio']:>10.3f}  "
                  f"{winner:>8}")

print("-" * 95)

# ===== Visualization =====
try:
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt

    # --- Key plot: clip/mask ratio vs sampled_frac, grouped by N ---
    fig, axes = plt.subplots(1, len(Ns), figsize=(5 * len(Ns), 5),
                             sharey=True)
    if len(Ns) == 1:
        axes = [axes]

    for ax, N in zip(axes, Ns):
        fracs = []
        ratios = []
        for ar2_range in ar2_ranges:
            for p_in in p_ins:
                v = results[(N, ar2_range, p_in)]
                fracs.append(v['sampled_frac'])
                ratios.append(v['ratio'])

        ax.scatter(fracs, ratios, alpha=0.6, s=20, edgecolors='k',
                   linewidths=0.3)
        ax.axhline(y=1.0, color='red', linestyle='--', alpha=0.7,
                   linewidth=2, label='breakeven')
        ax.set_xlabel('Sampled in-range fraction')
        ax.set_ylabel('T_clip / T_mask')
        ax.set_title(f'N = {N:,}')
        ax.set_xlim(-0.05, 1.05)
        ax.set_ylim(0, 3.0)
        ax.legend(fontsize=8)
        ax.grid(True, alpha=0.3)

    plt.suptitle('Clip/Mask ratio vs Sampled in-range fraction\n'
                 '(below red line = clipping wins)',
                 fontsize=13, y=1.04)
    plt.tight_layout()
    scatter_path = os.path.join(save_path, 'scatter_frac_vs_ratio.png')
    plt.savefig(scatter_path, dpi=150, bbox_inches='tight')
    plt.close()
    print(f"Saved {scatter_path}")

    # --- Heatmap: clip/mask per N ---
    fig, axes = plt.subplots(1, len(Ns), figsize=(5 * len(Ns), 5))
    if len(Ns) == 1:
        axes = [axes]

    for col, N in enumerate(Ns):
        ax = axes[col]
        ratio_matrix = np.zeros((len(p_ins), len(ar2_ranges)))
        for i, p_in in enumerate(p_ins):
            for j, ar2_range in enumerate(ar2_ranges):
                ratio_matrix[i, j] = results[(N, ar2_range, p_in)]['ratio']

        im = ax.imshow(ratio_matrix, aspect='auto', cmap='RdYlGn_r',
                       vmin=0.3, vmax=2.0, origin='lower')
        ax.set_xticks(range(len(ar2_ranges)))
        ax.set_xticklabels([str(r) for r in ar2_ranges], rotation=45)
        ax.set_yticks(range(len(p_ins)))
        ax.set_yticklabels([str(p) for p in p_ins])
        ax.set_xlabel('ar2_range')
        ax.set_ylabel('p_in')
        ax.set_title(f'T_clip / T_mask  (N={N:,})')

        for i in range(len(p_ins)):
            for j in range(len(ar2_ranges)):
                val = ratio_matrix[i, j]
                color = 'white' if val > 1.5 or val < 0.5 else 'black'
                ax.text(j, i, f'{val:.2f}', ha='center', va='center',
                        fontsize=6, color=color)

        plt.colorbar(im, ax=ax, shrink=0.8)

    plt.suptitle('Clip vs Mask heatmap', fontsize=14, y=1.02)
    plt.tight_layout()
    heatmap_path = os.path.join(save_path, 'heatmap_clip_vs_mask.png')
    plt.savefig(heatmap_path, dpi=150, bbox_inches='tight')
    plt.close()
    print(f"Saved {heatmap_path}")

    # --- All-N overlay: clip/mask vs sampled_frac ---
    fig, ax = plt.subplots(figsize=(10, 6))
    colors_N = plt.cm.tab10(np.linspace(0, 0.5, len(Ns)))

    for idx, N in enumerate(Ns):
        fracs = []
        ratios = []
        for ar2_range in ar2_ranges:
            for p_in in p_ins:
                v = results[(N, ar2_range, p_in)]
                fracs.append(v['sampled_frac'])
                ratios.append(v['ratio'])
        ax.scatter(fracs, ratios, alpha=0.5, s=15, color=colors_N[idx],
                   label=f'N={N:,}')

    ax.axhline(y=1.0, color='red', linestyle='--', linewidth=2,
               label='breakeven')
    ax.set_xlabel('Sampled in-range fraction', fontsize=12)
    ax.set_ylabel('T_clip / T_mask', fontsize=12)
    ax.set_title('All N: where does clipping beat masking?', fontsize=14)
    ax.set_xlim(-0.05, 1.05)
    ax.set_ylim(0, 3.5)
    ax.legend(fontsize=9)
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    overlay_path = os.path.join(save_path, 'overlay_frac_vs_ratio.png')
    plt.savefig(overlay_path, dpi=150, bbox_inches='tight')
    plt.close()
    print(f"Saved {overlay_path}")

except ImportError:
    print("matplotlib not installed, skipping visualization.")

print("\nDone!")

ASV benchmarks

The benchmark below is included for transparency and to document the measurement
methodology; it is not proposed to be added in-tree unless maintainers think it would be useful.

ASV benchmark code used:

Add the following class at the end of numpy/benchmarks/benchmarks/bench_lib.py:


class IsinTableOverlap(Benchmark):
    """Benchmarks for `numpy.isin` integer table-method paths.

    Tests the clipped-indexing optimisation across three overlap regimes
    that determine whether the new fast path or the original masking path
    is taken:

    - "high"  (p_in ~ 50 %): most ar1 values inside ar2's range
    - "low"   (p_in ~ 0.1 %): almost no ar1 values inside ar2's range
    - "all"   (p_in = 100 %): every ar1 value inside ar2's range
    """

    params = (
        [10_000, 1_000_000],            # N  (ar1 size)
        ["high", "low", "all"],         # overlap regime
        [False, True],                  # invert
    )
    param_names = ["N", "overlap", "invert"]

    def setup(self, N, overlap, invert):
        rng = np.random.default_rng(42)

        ar2_min = 100_000
        ar2_max = 110_000                       # ar2_range = 10_000
        self.b = rng.integers(ar2_min, ar2_max + 1, size=1000)

        if overlap == "high":
            # ~50 % of ar1 inside [ar2_min, ar2_max]
            n_in = N // 2
            n_out = N - n_in
            parts = [
                rng.integers(ar2_min, ar2_max + 1, size=n_in),
                rng.integers(0, ar2_min, size=n_out // 2),
                rng.integers(ar2_max + 1, ar2_max + 100_000,
                             size=n_out - n_out // 2),
            ]
            self.a = np.concatenate(parts)
            rng.shuffle(self.a)
        elif overlap == "low":
            # ~0.1 % of ar1 inside range
            n_in = max(1, N // 1000)
            n_out = N - n_in
            parts = [
                rng.integers(ar2_min, ar2_max + 1, size=n_in),
                rng.integers(0, ar2_min, size=n_out // 2),
                rng.integers(ar2_max + 1, ar2_max + 100_000,
                             size=n_out - n_out // 2),
            ]
            self.a = np.concatenate(parts)
            rng.shuffle(self.a)
        else:  # "all"
            self.a = rng.integers(ar2_min, ar2_max + 1, size=N)

        self.invert = invert

    def time_isin(self, N, overlap, invert):
        np.isin(self.a, self.b, invert=self.invert)

Apply the new _isin function, commit, and run:

cd benchmarks
asv continuous main HEAD -b IsinTableOverlap

It covers three overlap regimes:

  • High overlap (p_in ≈ 50 %): largest speedup from clipped indexing
  • Low overlap (p_in ≈ 0.1 %): heuristic correctly routes to masking
  • All in range (p_in = 100 %): clipped indexing avoids unnecessary masking

Results (Apple M-series)

At N = 1 000 000 the clipped path is 2–6× faster with no regression.
At N = 10 000 with high/all overlap it is 1.5–2.3× faster; with low
overlap a small regression of ~3.5 μs (~23 %) is observed, which is negligible in absolute terms (≈15 → ≈18.5 μs).

Output report (asv continuous main HEAD -b IsinTableOverlap, Apple M-series):

N          Overlap  Invert  Before        After         Ratio
1 000 000  high     False   7.51±0.08 ms  1.27±0.08 ms  0.17
1 000 000  high     True    7.58±0.2 ms   1.31±0.07 ms  0.17
1 000 000  all      False   2.67±0.09 ms  1.30±0.06 ms  0.49
1 000 000  all      True    2.54±0.1 ms   1.32±0.03 ms  0.52
10 000     high     False   47.2±0.7 μs   20.6±0.2 μs   0.44
10 000     high     True    47.8±1 μs     21.1±0.7 μs   0.44
10 000     all      False   32.3±2 μs     20.6±0.5 μs   0.64
10 000     all      True    33.2±1 μs     21.0±0.5 μs   0.63
10 000     low      False   14.9±0.2 μs   18.4±0.3 μs   1.23
10 000     low      True    15.1±0.5 μs   18.6±1 μs     1.23

The two "low overlap, N = 10 000" cases show a ~3.5 μs regression (sampling
overhead dominates when the total operation is only ~15 μs). At N = 1 000 000
the sampling cost is negligible and every overlap regime is faster or neutral.

Added a quick data-overlap test for improved performance in array operations by checking if a portion of ar1 falls within the specified range of ar2. This change optimizes the indexing path based on empirical overlap thresholds.
@mdrdope mdrdope changed the title # PERF: Speed up np.isin table method with clipped indexing PERF: Speed up np.isin table method with clipped indexing Feb 13, 2026
@andyfaff
Member

Can you please confirm that this PR was not created with AI?

@mdrdope
Contributor Author

mdrdope commented Feb 13, 2026

Can you please confirm that this PR was not created with AI?

To clarify: this PR was not automatically generated by AI.
Parts of this patch were initially explored using an AI system, but I reviewed, refined, and validated the final implementation myself. I designed and ran the empirical experiments that determined the sampling heuristic and threshold, and I verified correctness locally and through the full test suite (CI is green). The PR and final code are my own work.

@andyfaff
Member

Unfortunately I'm still doubtful. To me the PR is suspicious because the original post is word perfect, follows the formatting used by AI, has the emojis used by AI, the heading title used by AI, etc (all without the post being edited). That would be very unusual for a first time contributor.

@mdrdope
Contributor Author

mdrdope commented Feb 14, 2026

Unfortunately I'm still doubtful. To me the PR is suspicious because the original post is word perfect, follows the formatting used by AI, has the emojis used by AI, the heading title used by AI, etc (all without the post being edited). That would be very unusual for a first time contributor.

Thanks for raising the concern; I understand why the style might look unusual.

Just to clarify, this isn’t my first contribution; a previous performance PR was merged earlier this week.

For this patch, I take full responsibility for the implementation and the benchmarks. The sampling heuristic and threshold choice were iterated on and validated by me, and I verified correctness locally and through CI.

If there are particular concerns about the clipped-indexing path, the sampling strategy, or the performance trade-offs at different overlap densities, I’d be very happy to go through them in detail.

If the formatting or emoji use is distracting, I’m happy to adjust the PR description to better match the usual NumPy style.

@andyfaff
Member

I won't be the one reviewing the PR; I'm just triaging in this particular case.

Open source projects are currently experiencing a lot of upheaval and extra work caused by (mis)use of AI. Several projects are actively figuring out stances w.r.t how we deal with project contributions where AI is involved.
The discussion from the numpy community is in the numpy-discussion mail archive.

The scipy sister project is in the process of drafting guidelines for AI assisted contributions, https://github.com/scipy/scipy/pull/24583/changes, which I don't think would be too dissimilar to numpy's point of view.

The important factors here:

  • the maintainers typically want to be communicating with a human, community is important. When there's a possibility of a post/PR being generated by AI, our guard becomes raised. Figuring that out is not an exact science.
  • We want to be scrupulous about the licensing of the numpy code-base. For example, we cannot include GPL based code with our BSD-3 clause licence. We also can't use things like a Python translation of code from Numerical Recipes. Therefore, if any of the code included in the PR was generated by AI, and will eventually be used in a numpy wheel/sdist, then we can't merge it. This is because the AI would've originally been trained using code with a non-compatible licence. So if the final code was not written by a person from the high level algorithm, there's a high probability it wouldn't get merged.

These are the motivations behind my queries. Unfortunately maintainers raising these kinds of questions does make us appear less welcoming. That is not our intention.

@charris
Member

charris commented Feb 18, 2026

Therefore, if any of the code included in the PR was generated by AI,

@andyfaff That is an extreme position; if we do that we will end up with no maintainers, because everyone coming up will be using AI. That is not my understanding of where the mailing list discussion was going. We don't want to be the Spanish Inquisition.

@andyfaff
Member

I don't mind winding back my comments. That's a side effect of a rapidly changing landscape.

@mdrdope
Contributor Author

mdrdope commented Feb 23, 2026

If you have time, I’d really appreciate technical feedback on this PR.
In particular:

  1. correctness of the clipped-indexing sentinel logic for all integer edge cases,
  2. the current sampling threshold and its performance trade-offs across regimes.

Thank you! @charris
