Oral · ICML 2026 · Bandits and online learning

Optimistic Online Newton with Sublinear Regret on Strongly Curved Losses

Leila Aydın, Maya Singh, Renat Ostrovsky · OOAARG · Department of Computer Science

arXiv:2605.01184
[Figure: regret R(T) versus rounds T (×10³), comparing OON (O(d log T)) against ONS (O(√T)).]

We introduce Optimistic Online Newton (OON), a second-order online learning algorithm that attains $O(d \log T)$ regret on strongly curved convex losses while remaining a one-pass, streaming method with $O(d^2)$ memory. The key idea is to maintain a running Hessian proxy from the residual between observed gradients and an optimistic guess; when the guess is accurate, we recover the regret of the best Newton step in hindsight; when it is inaccurate, we degrade gracefully to the standard $\sqrt{T}$-regret floor. We give matching lower bounds in the strongly-convex regime and demonstrate empirical gains over AdaGrad and ONS on synthetic and real-world prediction tasks.

Why this matters

Second-order methods deliver fast convergence in offline optimization, but their online counterparts have historically required either tight curvature assumptions or heavy per-round computation. OON sidesteps both: by maintaining a running Hessian proxy seeded with an optimistic prediction $\hat g_t$, the algorithm recovers logarithmic regret on strongly-curved losses without ever computing a true Hessian.
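As a rough illustration of this mechanism, the sketch below runs an ONS-style loop in plain NumPy, where the Hessian proxy is built from the residual between the observed gradient and an optimistic guess (here, simply the previous gradient). The function name `oon_sketch`, its arguments, and the choice $\hat g_t = g_{t-1}$ are ours for illustration; this is a sketch of the idea, not the paper's exact algorithm.

```python
import numpy as np

def oon_sketch(grad_fns, d, alpha=1.0, lam=1.0):
    """Illustrative optimistic Newton-style online loop (not the paper's exact OON).

    grad_fns : list of callables; grad_fns[t](x) returns the gradient of the
               round-t loss f_t at the current iterate x.
    """
    x = np.zeros(d)
    A = lam * np.eye(d)            # regularized Hessian proxy (d x d, hence O(d^2) memory)
    g_hat = np.zeros(d)            # optimistic gradient guess
    iterates = []
    for grad in grad_fns:
        iterates.append(x.copy())  # play x_t, then observe the round-t gradient
        g = grad(x)
        r = g - g_hat              # residual: observed gradient vs. optimistic guess
        A = A + np.outer(r, r)     # rank-one update of the proxy from the residual
        x = x - np.linalg.solve(A, g) / alpha   # preconditioned (Newton-style) step
        g_hat = g                  # simplest form of optimism: reuse the last gradient
    return iterates
```

The $O(d^2)$ memory figure in the abstract corresponds to storing the $d \times d$ proxy matrix $A$; no true Hessian is ever formed.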

Main result

For a sequence of $\alpha$-strongly-convex losses $f_1, \dots, f_T$ with bounded gradients, OON attains

$$R_T \;\le\; \frac{d}{\alpha} \cdot \log\!\left(1 + \frac{T}{\lambda}\right) \;+\; O(1).$$

The bound is tight up to a logarithmic factor, matching a lower bound we prove in Section 4 of the paper.
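For a rough sense of scale (assuming the logarithm is natural, ignoring the $O(1)$ term, and plugging in illustrative values $d = 10$, $\alpha = 1$, $\lambda = 1$, $T = 10^5$):

$$\frac{10}{1} \cdot \log\!\left(1 + 10^5\right) \;\approx\; 10 \times 11.5 \;\approx\; 115, \qquad \text{compared with the } \sqrt{T}\text{ floor } \sqrt{10^5} \approx 316.$$

The gap widens as $T$ grows, since the bound scales logarithmically in $T$ while the fallback floor scales as $\sqrt{T}$.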

Code & reproducibility

Reference implementations in PyTorch and JAX, alongside reproduction scripts for every figure, are available in the github.com/ooaarg/oon repository. All experiments run on a single GPU in under 30 minutes.
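For orientation, a usage pattern might look like the following; the `oon` import, the `OON` optimizer class, and its constructor arguments are hypothetical stand-ins, so check the repository's README for the actual interface.

```python
import torch
from oon import OON  # hypothetical import; the real API may differ -- see the repo README

torch.manual_seed(0)
model = torch.nn.Linear(8, 1)
opt = OON(model.parameters(), alpha=1.0, lam=1.0)   # illustrative constructor signature

for t in range(1000):                                # one-pass synthetic stream
    x = torch.randn(8)
    y = x.sum().reshape(1)                           # toy regression target
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```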