Langevin_dynamics
Contents¶
Classes¶
LangevinDynamics
- Langevin Dynamics sampler implementing discretized gradient-based MCMC.
API Reference¶
torchebm.samplers.langevin_dynamics ¶
Langevin Dynamics Sampler Module.
This module provides an implementation of the Langevin Dynamics algorithm, a gradient-based Markov Chain Monte Carlo (MCMC) method. It leverages stochastic differential equations to sample from complex probability distributions, making it a lightweight yet effective tool for Bayesian inference and generative modeling.
Key Features
- Gradient-based sampling with stochastic updates.
- Customizable step sizes and noise scales for flexible tuning.
- Optional diagnostics and trajectory tracking for analysis.
Module Components¶
Classes:
| Name | Description |
| --- | --- |
| LangevinDynamics | Core class implementing the Langevin Dynamics sampler. |
Usage Example¶
Sampling from a Custom Energy Function
from torchebm.samplers.langevin_dynamics import LangevinDynamics
from torchebm.core.energy_function import GaussianEnergy
import torch
# Define a 2D Gaussian energy function
energy_fn = GaussianEnergy(mean=torch.zeros(2), cov=torch.eye(2))
# Initialize Langevin sampler
sampler = LangevinDynamics(energy_fn, step_size=0.01, noise_scale=0.1)
# Starting points for 5 chains
initial_state = torch.randn(5, 2)
# Run sampling
samples, diagnostics = sampler.sample_chain(
x=initial_state, n_steps=100, n_samples=5, return_diagnostics=True
)
print(f"Samples shape: {samples.shape}")
print(f"Diagnostics keys: {diagnostics.shape}")
Mathematical Foundations¶
Langevin Dynamics Overview
Langevin Dynamics simulates a stochastic process governed by the Langevin equation. For a state \( x_t \), the discretized update rule is:

\[
x_{t+1} = x_t - \eta \nabla U(x_t) + \sqrt{2\eta}\,\epsilon_t
\]

where:
- \( U(x) \): Potential energy, where \( U(x) = -\log p(x) \) and \( p(x) \) is the target distribution.
- \( \eta \): Step size controlling the gradient descent.
- \( \epsilon_t \sim \mathcal{N}(0, I) \): Gaussian noise introducing stochasticity.
Over time, this process converges to samples from the Boltzmann distribution:

\[
p(x) \propto e^{-U(x)}
\]
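To make the update rule concrete, here is a minimal standalone sketch of the discretized Langevin step written directly in PyTorch (illustrative only, independent of the `LangevinDynamics` class), using a simple quadratic potential \( U(x) = \tfrac{1}{2}\lVert x \rVert^2 \) whose Boltzmann distribution is a standard Gaussian:

```python
import torch

def energy(x):
    # Quadratic potential U(x) = 0.5 * ||x||^2; its Boltzmann
    # distribution p(x) ~ exp(-U(x)) is a standard Gaussian.
    return 0.5 * (x ** 2).sum(dim=-1)

def langevin_step(x, step_size=0.01):
    # Gradient of U at the current state.
    x = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(energy(x).sum(), x)[0]
    # x_{t+1} = x_t - eta * grad U(x_t) + sqrt(2 * eta) * eps_t
    noise = torch.randn_like(x)
    return (x - step_size * grad + (2 * step_size) ** 0.5 * noise).detach()

x = torch.randn(5, 2)  # 5 chains in 2 dimensions
for _ in range(1000):
    x = langevin_step(x)
# After many steps, x is approximately distributed as N(0, I).
```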
Why Use Langevin Dynamics?
- Simplicity: Requires only first-order gradients, making it computationally lighter than methods like HMC.
- Exploration: The noise term prevents the sampler from getting stuck in local minima.
- Flexibility: Applicable to a wide range of energy-based models and score-based generative tasks.
Practical Considerations¶
Parameter Tuning Guide
- Step Size (\(\eta\)):
- Too large: Instability and divergence
- Too small: Slow convergence
- Rule of thumb: Start with \(\eta \approx 10^{-3}\) to \(10^{-5}\)
- Noise Scale (\(\beta^{-1/2}\)):
- Controls exploration-exploitation tradeoff
- Higher values help escape local minima
- Decay Rate (future implementation):
- Momentum-like term for accelerated convergence
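As a rough illustration of the step-size guidance above, a small sweep over candidate values can reveal when the sampler becomes unstable. The sketch below assumes the `GaussianEnergy`, `LangevinDynamics`, and `sample_chain` interfaces shown in the usage example; finite samples of moderate magnitude indicate a stable setting, while NaNs or exploding values suggest the step size is too large.

```python
import torch
from torchebm.samplers.langevin_dynamics import LangevinDynamics
from torchebm.core.energy_function import GaussianEnergy

energy_fn = GaussianEnergy(mean=torch.zeros(2), cov=torch.eye(2))
initial_state = torch.randn(5, 2)

for step_size in (1e-1, 1e-2, 1e-3, 1e-4):
    sampler = LangevinDynamics(energy_fn, step_size=step_size, noise_scale=0.1)
    samples, _ = sampler.sample_chain(
        x=initial_state, n_steps=100, n_samples=5, return_diagnostics=True
    )
    # Stable settings keep samples finite and of moderate magnitude;
    # NaNs or very large values indicate divergence.
    print(step_size, samples.isfinite().all().item(), samples.abs().max().item())
```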
Diagnostics Interpretation
Use `return_diagnostics=True` to monitor:

- Mean/Variance: Track distribution stationarity
- Energy Gradients: Check for vanishing/exploding gradients
- Autocorrelation: Assess mixing efficiency
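These quantities can also be computed directly from a recorded trajectory. The sketch below is a minimal, library-agnostic illustration: it assumes a hypothetical tensor `chain` of shape `(n_steps, n_chains, dim)` holding the states visited by each chain (for example, collected during a loop like the one in the Mathematical Foundations sketch); the energy-gradient check additionally requires evaluating the energy function and is omitted here.

```python
import torch

def chain_diagnostics(chain, lag=10):
    # chain: (n_steps, n_chains, dim) tensor of visited states (hypothetical layout).
    # Per-step mean/variance: these curves should flatten once the chain is stationary.
    running_mean = chain.mean(dim=(1, 2))
    running_var = chain.var(dim=(1, 2))
    # Lag-k autocorrelation of the first coordinate, averaged over chains:
    # values near 0 indicate good mixing; values near 1 indicate slow mixing.
    x = chain[..., 0] - chain[..., 0].mean(dim=0)   # (n_steps, n_chains), centered
    num = (x[:-lag] * x[lag:]).mean(dim=0)
    autocorr = (num / x.var(dim=0)).mean()
    return running_mean, running_var, autocorr
```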
When to Choose Langevin Over HMC?
| Criterion | Langevin | HMC |
| --- | --- | --- |
| Computational Cost | Lower | Higher |
| Tuning Complexity | Simpler | More involved |
| High Dimensions | Efficient | More efficient |
| Multimodal Targets | May need annealing | Better exploration |
How to Diagnose Sampling?
Check diagnostics for:

- Sample mean and variance convergence.
- Gradient magnitudes (should stabilize).
- Energy trends over iterations.