Key ✓ positive GPU gave clear speedup ~ mixed partial gains or caveats not GPU relevant baseline, not GPU-native
Showing 6 papers
SAT Solving

GaloisSAT: Differentiable Boolean Satisfiability Solving via Finite Field Algebra

arXiv →
SAT ✓ positive Kim, Portner, Liu, Dai, Ren, Khailany, Velasquez, Alkhouri, Yu arXiv, Mar 2026
Hybrid GPU-CPU CDCL solver from NVIDIA, U. Utah, and CMU. A differentiable solving engine using finite field algebra runs on GPU to prune the search space — handling variable selection and backbone identification — while a traditional CDCL solver closes the proof on CPU. Tested on SAT Competition 2024 benchmarks against Kissat and CaDiCaL: 8.41× speedup on satisfiable instances, 1.29× on unsatisfiable. The asymmetry is telling — GPU helps most where heuristic guidance matters, less so where proof complexity dominates. First production-grade GPU-CDCL hybrid with benchmark results at this scale.

GPUSAT2: An Improved GPU-Based #SAT Model Counter

Springer →
SAT ✓ positive Fichte, Hecher, Zisser CP 2019
Model counting (#SAT, not decision SAT) using dynamic programming over tree decompositions, mapped onto GPU via OpenCL. The decomposition structure is why GPU works here: subproblems at the same tree level are independent and can be evaluated in parallel. The positive result is earned by problem structure — not by accelerating CDCL. Significant speedup over CPU for instances with bounded treewidth.
Constraint Programming

Exploring the Use of GPUs in Constraint Solving

Springer →
CP ✓ positive Campeotto, Dal Palù, Dovier, Fioretto, Pontelli PADL 2014
Assesses the feasibility of parallelising constraint propagation and arc consistency on GPU. Implements a hybrid CPU-GPU solver and shows competitive performance against sequential baselines. Honest about where GPU helps (bulk constraint checking) and where it doesn't (propagation ordering, sequential fixpoint detection). One of the more careful early papers in this space. The Fioretto/Pontelli group continued this line of work.
Local Search

Adaptive Large Neighborhood Search on the GPU

EJOR →
Local Search ✓ positive Bach, Hasle, Schulz European Journal of OR, 2019
First published GPU parallelisation of ALNS. The destroy-and-repair structure is a natural GPU fit: thousands of candidate moves can be evaluated simultaneously, with no sequential dependencies in the evaluation phase. Strong results on vehicle routing benchmarks. One of the cleaner examples of GPU wins in combinatorial search because the algorithm's inner loop is already batch-friendly.

cuOpt: GPU-Accelerated Route Optimization

NVIDIA →
Local Search ✓ positive NVIDIA Research Production system (open-sourced 2024)
NVIDIA's production GPU solver for VRP-class problems using GPU-accelerated LNS. Holds 23+ world records on routing benchmarks, 10x–5000x speedups over CPU depending on instance. Recently open-sourced on GitHub. Worth tracking as the state of the art in what GPU combinatorial search looks like when optimised for real hardware rather than ported from CPU. Full docs at developer.nvidia.com/cuopt.
Branch & Bound

A GPU-Based Backtracking Algorithm for Permutation Combinatorial Problems

Springer →
Branch & Bound ✓ positive Mezmaz, Melab, Talbi Springer, 2016
GPU branch and bound for permutation problems. B&B has an irregular tree shape, so bounding is data-parallel but branching is not. The approach uses a work pool: many subproblems are evaluated (bounded) on GPU simultaneously, branching decisions made on CPU. The speedup comes from bounding, not from search itself. A reasonable hybrid pattern and the one most likely to generalise to other constraint-based B&B.