
LAdam Optimizer Family

Drop-in replacements for Adam, AdaGrad, and RMSProp with Laplacian variance smoothing. Stabilizes training on physics-informed neural networks, vision models, and transformers.

Install: pip install ladam
3 optimizers · PyTorch compatible · Apache 2.0 license · 1 hyperparameter (c2)
- -44.6% PINN loss vs Adam on the Burgers equation
- +6.8% vision accuracy on CIFAR-10 at iso-epochs
- +7.8% transformer accuracy on a FashionMNIST ViT
- 1-line integration: change your import

Three Optimizers, One Package

Each variant adds Laplacian smoothing to the variance estimate of its base optimizer. One new hyperparameter: c2.

LAdam

Base: Adam

General training — vision, NLP, PINNs

-44.6% PINN loss, +6.8% vision accuracy

LAdaGrad

Base: AdaGrad

Sparse data — NLP embeddings, recommendations

Improved loss stability on sparse gradients

LRMSProp

Base: RMSProp

Non-stationary problems — RL, online learning

Lower variance in non-stationary objectives
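The shared mechanism above can be sketched concretely. This is an illustrative reconstruction, not the package's actual implementation: it assumes c2 sets the strength of a periodic 1-D Laplacian smoother applied to the flattened second-moment vector, solved in closed form via FFT (the `laplacian_smooth` helper is hypothetical).

```python
import numpy as np

def laplacian_smooth(v, c2):
    """Smooth a 1-D vector v by solving (I - c2 * L) v_s = v, where L is
    the periodic discrete Laplacian [1, -2, 1]. Because L is circulant,
    the solve is a pointwise division in Fourier space."""
    n = v.shape[0]
    k = np.arange(n)
    lam = -4.0 * np.sin(np.pi * k / n) ** 2   # eigenvalues of L
    denom = 1.0 - c2 * lam                    # spectrum of (I - c2 * L), >= 1
    return np.real(np.fft.ifft(np.fft.fft(v) / denom))

# A noisy second-moment estimate: one spiky coordinate among flat ones.
v = np.ones(64)
v[32] += 10.0
v_s = laplacian_smooth(v, c2=1.0)
# The spike is damped and spread to neighbors; the total mass is preserved
# (the zero-frequency component passes through unchanged).
```

With c2 = 0 the operator is the identity, which matches the "drop-in" framing: the base optimizer is recovered when smoothing is turned off.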

Benchmark Results

Selected results from 59 experiments, 5 seeds each. All reported improvements are statistically significant (p < 0.05, paired t-test).

| Task | Adam | LAdam | Change |
|---|---|---|---|
| PINN (Burgers) | 0.0891 | 0.0494 | -44.6% |
| CNN (CIFAR-10) | 82.1% | 88.9% | +6.8% |
| Transformer (FashionMNIST) | 81.3% | 89.1% | +7.8% |
| PINN (Navier-Stokes) | 0.0147 | 0.0098 | -33.3% |
| Adversarial Robustness | 41.2% | 43.8% | +2.6% |
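The p < 0.05 claim rests on a paired t-test across seeds. A minimal sketch of that computation, using made-up per-seed losses for illustration (these are not the benchmark numbers above):

```python
import math

# Hypothetical per-seed PINN losses, paired by random seed.
adam_loss  = [0.0902, 0.0885, 0.0915, 0.0878, 0.0890]
ladam_loss = [0.0502, 0.0535, 0.0465, 0.0498, 0.0470]

diffs = [a - b for a, b in zip(adam_loss, ladam_loss)]
n = len(diffs)
mean = sum(diffs) / n
var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
t = mean / math.sqrt(var / n)                        # paired t statistic

# With n - 1 = 4 degrees of freedom, |t| > 2.776 rejects the null
# hypothesis of no difference at p < 0.05 (two-sided).
significant = abs(t) > 2.776
```

Pairing by seed matters: it removes seed-to-seed variance that would otherwise swamp the optimizer effect in an unpaired test.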

Get Started

Change one import. Keep everything else the same.

LAdam (general)
from ladam import LAdam

optimizer = LAdam(
    model.parameters(),
    lr=1e-3,
    c2=1e-4  # Laplacian smoothing strength
)

# Drop-in replacement — same API as torch.optim.Adam
for batch in dataloader:
    loss = model(batch)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

LAdaGrad (sparse data)
from ladam import LAdaGrad

# Best for sparse data — NLP, recommendations
optimizer = LAdaGrad(
    model.parameters(),
    lr=1e-2,
    c2=1e-4
)

LRMSProp (non-stationary)
from ladam import LRMSProp

# Best for non-stationary targets — RL, online learning
optimizer = LRMSProp(
    model.parameters(),
    lr=1e-3,
    c2=1e-4
)

When to Use LAdam

Largest gains on problems where loss landscapes are noisy or ill-conditioned.

Physics-Informed NNs

-44.6% loss on Burgers, -33.3% on Navier-Stokes. PINNs have notoriously noisy gradients — smoothing helps most here.

Vision (CNNs, ViTs)

+6.8% on CIFAR-10, +7.8% on FashionMNIST ViT. Consistent gains across architectures.

Small Datasets

Variance smoothing acts as implicit regularization. Less overfitting on limited data.

Adversarial Training

+2.6% robust accuracy. Smoother variance estimates reduce gradient masking.

Scientific Computing

PDEs, ODEs, molecular dynamics. Anywhere gradients are inherently noisy.

LLM Fine-tuning

Drop-in for Adam in any HuggingFace training loop. Compatible with transformers Trainer.
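A wiring sketch for that Trainer integration. Not runnable as-is: it assumes a `model` and `train_dataset` defined elsewhere, and relies on Trainer's `optimizers` parameter, which accepts an (optimizer, lr_scheduler) tuple.

```python
from ladam import LAdam
from transformers import Trainer, TrainingArguments

# Assumes `model` and `train_dataset` are defined elsewhere.
optimizer = LAdam(model.parameters(), lr=2e-5, c2=1e-4)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3),
    train_dataset=train_dataset,
    # Passing None for the scheduler lets Trainer build its default one.
    optimizers=(optimizer, None),
)
trainer.train()
```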

Ready to try it?

One install. One hyperparameter. Measurable gains.