PyPI Package

LAdam Optimizer Family

Drop-in replacements for Adam, AdaGrad, and RMSProp with Laplacian variance smoothing. Stabilizes training on physics-informed neural networks, vision models, and transformers.

Install: pip install ladam
3 optimizers · PyTorch compatible · Apache 2.0 license · 1 hyperparameter (c2)
-13.5% regression MSE (vs Adam on structured data)
+0.20% transformer accuracy (p=0.0005, 5 seeds)
34% lower variance (PINN seed stability)
1-line integration (change your import)

Three Optimizers, One Package

Each variant adds Laplacian smoothing to the variance estimate of its base optimizer. One new hyperparameter: c2.
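This page does not spell out the smoothing operator, so the following is only a minimal sketch of one possible reading: a single explicit diffusion step applied to a flattened second-moment estimate, with c2 acting as the step size. The function name `laplacian_smooth`, the 1-D neighbour layout, and the zero-flux boundary handling are all assumptions made for illustration, not the package's implementation.

```python
def laplacian_smooth(v, c2):
    """Illustrative only: v_i + c2 * (v_{i-1} - 2*v_i + v_{i+1}).

    Assumes a 1-D layout and replicates the edge values (zero-flux
    boundaries), so the total of the estimate is preserved.
    """
    n = len(v)
    smoothed = []
    for i in range(n):
        left = v[i - 1] if i > 0 else v[i]
        right = v[i + 1] if i < n - 1 else v[i]
        smoothed.append(v[i] + c2 * (left - 2.0 * v[i] + right))
    return smoothed


# Toy variance estimate with a single spike (made-up numbers).
v = [1.0, 1.0, 5.0, 1.0, 1.0]
smoothed = laplacian_smooth(v, c2=0.25)
print(smoothed)  # [1.0, 2.0, 3.0, 2.0, 1.0]
```

In this toy case the spike is spread onto its neighbours while the sum of the estimate stays the same, which is the intuition behind smoothing a per-parameter variance estimate. The c2 value of 0.25 is chosen only to make the effect visible.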

LAdam

Base: Adam

PINNs, regression, transformers

-13.5% regression MSE, +0.20% transformer acc

LAdaGrad

Base: AdaGrad

Sparse data — NLP embeddings, recommendations

Improved loss stability on sparse gradients

LRMSProp

Base: RMSProp

Non-stationary problems — RL, online learning

Lower variance in non-stationary objectives

Benchmark Results

Multi-seed benchmarks with statistical testing. Only verified, reproducible results.

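| Task | Adam | LAdam | Change | Seeds |
|---|---|---|---|---|
| Regression (MLP) | 0.213 | 0.184 | -13.5% | 3 |
| CIFAR-10 (ResNet + ChiAnneal) | 67.96% | 73.39% | +5.43% | 3 |
| Transformer (FashionMNIST) | 89.46% | 89.66% | +0.20% | 5 |
| Wave PINN (L2 error) | 0.0067 | 0.0066 | +0.8% | 3 |
| Burgers PINN | 0.0199 | 0.0197 | +0.1% | 5 |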

CIFAR-10 result uses ChiAnnealScheduler (included in package). PINN gains are primarily in convergence stability (34% lower variance across seeds).
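To run the same kind of comparison on your own runs, a small helper along these lines may be useful. It is a sketch under assumptions: the page does not say whether "lower variance" refers to sample variance or standard deviation, so sample variance is assumed here, and the per-seed numbers are made-up toy values rather than benchmark data.

```python
from statistics import variance


def relative_change(baseline, candidate):
    """Relative change in percent; negative means the candidate is lower."""
    return (candidate - baseline) / baseline * 100.0


def variance_reduction(baseline_runs, candidate_runs):
    """Percent reduction in sample variance across seeds (assumed metric)."""
    base_var = variance(baseline_runs)
    if base_var == 0:
        raise ValueError("baseline runs have zero variance")
    return (1.0 - variance(candidate_runs) / base_var) * 100.0


# Toy per-seed final losses, one entry per seed (not real results).
adam_runs = [0.10, 0.20, 0.30]
ladam_runs = [0.15, 0.20, 0.25]

reduction = variance_reduction(adam_runs, ladam_runs)
change = relative_change(0.20, 0.15)
print(f"variance reduction: {reduction:.1f}%")
print(f"relative change: {change:.1f}%")
```

With at least two seeds per optimizer, the first helper gives the seed-to-seed stability comparison and the second gives a relative change for error-type metrics such as MSE.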

Get Started

Change one import. Keep everything else the same.

LAdam (general)
from ladam import LAdam

optimizer = LAdam(
    model.parameters(),
    lr=1e-3,
    c2=1e-4  # Laplacian smoothing strength
)

# Drop-in replacement — same API as torch.optim.Adam
for batch in dataloader:
    loss = model(batch)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

LAdaGrad (sparse data)
from ladam import LAdaGrad

# Best for sparse data — NLP, recommendations
optimizer = LAdaGrad(
    model.parameters(),
    lr=1e-2,
    c2=1e-4
)

LRMSProp (non-stationary)
from ladam import LRMSProp

# Best for non-stationary targets — RL, online learning
optimizer = LRMSProp(
    model.parameters(),
    lr=1e-3,
    c2=1e-4
)

When to Use LAdam

Largest gains on problems where loss landscapes are noisy or ill-conditioned.

Physics-Informed NNs

Lower variance across seeds on wave/Burgers PINNs. Laplacian smoothing complements PDE loss landscapes.

Structured Regression

-13.5% MSE on tabular regression tasks. Best gains when weight structure mirrors data structure.

Vision + ChiAnneal

+5.43% on CIFAR-10 with ChiAnnealScheduler. The scheduler ramps c2 during training for curriculum-style smoothing.
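The idea of ramping c2 over training can be sketched as follows. This is not the package's ChiAnnealScheduler, whose constructor and schedule shape are not documented on this page. `LinearC2Ramp` is a hypothetical stand-in that assumes a linear ramp and assumes the optimizer keeps c2 in `param_groups`, the way PyTorch optimizers keep `lr`. A plain object stands in for the optimizer so the example runs without PyTorch.

```python
from types import SimpleNamespace


class LinearC2Ramp:
    """Hypothetical linear ramp of c2 (not the package's scheduler)."""

    def __init__(self, optimizer, c2_start, c2_end, total_steps):
        if total_steps < 1:
            raise ValueError("total_steps must be at least 1")
        self.optimizer = optimizer
        self.c2_start = c2_start
        self.c2_end = c2_end
        self.total_steps = total_steps
        self.step_count = 0
        self._apply()

    def _current(self):
        # Fraction of the ramp completed, clamped at 1.0 once finished.
        frac = min(self.step_count / self.total_steps, 1.0)
        return self.c2_start + frac * (self.c2_end - self.c2_start)

    def _apply(self):
        # Assumption: c2 lives in param_groups, as lr does in PyTorch.
        for group in self.optimizer.param_groups:
            group["c2"] = self._current()

    def step(self):
        self.step_count += 1
        self._apply()


# Stand-in for an optimizer; only the param_groups attribute is used.
fake_optimizer = SimpleNamespace(param_groups=[{"lr": 1e-3, "c2": 0.0}])
ramp = LinearC2Ramp(fake_optimizer, c2_start=0.0, c2_end=1e-4, total_steps=4)

history = [fake_optimizer.param_groups[0]["c2"]]
for _ in range(6):
    ramp.step()
    history.append(fake_optimizer.param_groups[0]["c2"])
print(history)
```

The recorded values rise from c2_start to c2_end over the first four steps and then hold, so smoothing is weak early in training and reaches full strength later.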

Transformers

+0.20% on FashionMNIST ViT (p=0.0005, 5 seeds). Small but statistically significant and consistent.
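A small gain like this only means something if it is consistent across seeds. The page does not say which statistical test produced the p-value, so the sketch below simply assumes a paired comparison (same seeds for both optimizers) and computes the paired t statistic. The function name and the per-seed numbers are illustrative, not benchmark data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(baseline, candidate):
    """Paired t statistic for per-seed scores (assumed test, n - 1 dof)."""
    if len(baseline) != len(candidate) or len(baseline) < 2:
        raise ValueError("need two equal-length lists with at least 2 seeds")
    diffs = [c - b for b, c in zip(baseline, candidate)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("all per-seed differences are identical")
    return mean(diffs) / (spread / sqrt(len(diffs)))


# Toy per-seed scores, same seeds for both optimizers (made-up numbers).
baseline_scores = [1.0, 2.0, 3.0, 4.0]
candidate_scores = [2.0, 4.0, 4.0, 6.0]

t_stat = paired_t_statistic(baseline_scores, candidate_scores)
print(f"t = {t_stat:.3f} with {len(baseline_scores) - 1} degrees of freedom")
```

The statistic can then be converted to a p-value with a t-distribution table or a statistics library. A large statistic from a small mean difference indicates that the per-seed differences barely vary.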

Scientific Computing

PDEs, ODEs, molecular dynamics. Anywhere gradients are inherently noisy and spatially correlated.

⚠️ Not for LLMs

In GPT-2 fine-tuning tests, LAdam significantly worsened perplexity. Attention weights encode semantic structure rather than spatial structure.

Ready to try it?

One install. One hyperparameter. Measurable gains.