Langevin_dynamics

Contents

Classes

  • LangevinDynamics - Langevin Dynamics sampler implementing discretized gradient-based MCMC.

API Reference

torchebm.samplers.langevin_dynamics

Langevin Dynamics Sampler Module.

This module implements the Langevin Dynamics algorithm, a gradient-based Markov Chain Monte Carlo (MCMC) method. It simulates a discretized stochastic differential equation to draw samples from complex probability distributions, which makes it a lightweight tool for Bayesian inference and generative modeling.

Key Features

  • Gradient-based sampling with stochastic updates.
  • Customizable step sizes and noise scales for flexible tuning.
  • Optional diagnostics and trajectory tracking for analysis.

Module Components

Classes:

| Name | Description |
| --- | --- |
| LangevinDynamics | Core class implementing the Langevin Dynamics sampler. |


Usage Example

Sampling from a Custom Energy Function

# Import path follows the module name given in the API reference above
from torchebm.samplers.langevin_dynamics import LangevinDynamics
from torchebm.core.energy_function import GaussianEnergy
import torch

# Define a 2D Gaussian energy function
energy_fn = GaussianEnergy(mean=torch.zeros(2), cov=torch.eye(2))

# Initialize Langevin sampler
sampler = LangevinDynamics(energy_fn, step_size=0.01, noise_scale=0.1)

# Starting points for 5 chains
initial_state = torch.randn(5, 2)

# Run sampling
samples, diagnostics = sampler.sample_chain(
    x=initial_state, n_steps=100, n_samples=5, return_diagnostics=True
)
print(f"Samples shape: {samples.shape}")
print(f"Diagnostics shape: {diagnostics.shape}")

Mathematical Foundations

Langevin Dynamics Overview

Langevin Dynamics simulates a stochastic process governed by the Langevin equation. For a state \( x_t \), the discretized update rule is:

\[ x_{t+1} = x_t - \eta \nabla U(x_t) + \sqrt{2\eta} \epsilon_t \]
  • \( U(x) \): Potential energy, where \( U(x) = -\log p(x) \) and \( p(x) \) is the target distribution.
  • \( \eta \): Step size, which scales both the gradient step and the variance of the injected noise.
  • \( \epsilon_t \sim \mathcal{N}(0, I) \): Gaussian noise introducing stochasticity.
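
For context, the update rule above is the Euler-Maruyama discretization, with time step \( \eta \), of the overdamped Langevin stochastic differential equation. Here \( W_t \) denotes standard Brownian motion, a symbol introduced only for this illustration:

\[ dx_t = -\nabla U(x_t)\,dt + \sqrt{2}\,dW_t \]

Replacing \( dt \) with \( \eta \) and the Brownian increment \( dW_t \) with \( \sqrt{\eta}\,\epsilon_t \) recovers the discrete rule, which is why the noise term carries the factor \( \sqrt{2\eta} \).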

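For a sufficiently small step size, the iterates of this update converge over time to (approximately) samples from the Boltzmann distribution; the discretization leaves a bias that shrinks as \( \eta \) decreases: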

\[ p(x) \propto e^{-U(x)} \]
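
As a minimal sketch of the update rule, the plain-Python loop below (standard library only, not the torchebm API) runs many independent chains on the one-dimensional potential \( U(x) = x^2/2 \), whose target is a standard normal. The function name `langevin_step`, the starting point, and the chain counts are illustrative assumptions.

```python
import math
import random


def langevin_step(x, grad_u, eta, rng):
    """One discretized Langevin update: x - eta * grad U(x) + sqrt(2 * eta) * eps."""
    eps = rng.gauss(0.0, 1.0)
    return x - eta * grad_u(x) + math.sqrt(2.0 * eta) * eps


def grad_u(x):
    # Gradient of U(x) = x**2 / 2, so the target distribution is N(0, 1).
    return x


rng = random.Random(0)  # fixed seed so the run is repeatable
eta = 0.01
n_chains, n_steps = 2000, 500

finals = []
for _ in range(n_chains):
    x = 3.0  # deliberately start away from the mode
    for _ in range(n_steps):
        x = langevin_step(x, grad_u, eta, rng)
    finals.append(x)

mean = sum(finals) / n_chains
var = sum((v - mean) ** 2 for v in finals) / n_chains
print(f"mean={mean:.3f}  var={var:.3f}")  # should land near 0 and 1
```

The final states of the chains should have a sample mean near 0 and a sample variance near 1, matching the target up to Monte Carlo error and a small discretization bias.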

Why Use Langevin Dynamics?

  • Simplicity: Requires only first-order gradients, making it computationally lighter than methods like HMC.
  • Exploration: The noise term helps the sampler escape local minima of the energy, although well-separated modes can still trap it for long stretches.
  • Flexibility: Applicable to a wide range of energy-based models and score-based generative tasks.

Practical Considerations

Parameter Tuning Guide

  • Step Size (\(\eta\)):
    • Too large: Instability and divergence
    • Too small: Slow convergence
    • Rule of thumb: Start with \(\eta \approx 10^{-3}\) to \(10^{-5}\)
  • Noise Scale (\(\beta^{-1/2}\), with \(\beta\) read as an inverse temperature):
    • Controls exploration-exploitation tradeoff
    • Higher values help escape local minima
  • Decay Rate (future implementation):
    • Momentum-like term for accelerated convergence
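
For intuition about the step-size and noise-scale bullets, a quadratic potential \( U(x) = k x^2/2 \) can be worked out in closed form. The sketch below assumes the update \( x_{t+1} = (1 - \eta k)\,x_t + \sigma\sqrt{2\eta}\,\epsilon_t \), where \( \sigma \) is a hypothetical noise multiplier; how the library's noise_scale actually enters the update is not stated here. Under that assumption the recursion is stable only for \( 0 < \eta k < 2 \), and its stationary variance is \( \sigma^2 / (k(1 - \eta k/2)) \).

```python
import math


def stationary_variance(eta, k=1.0, sigma=1.0):
    """Stationary variance of x <- (1 - eta*k) * x + sigma * sqrt(2*eta) * eps.

    Returns math.inf when the recursion is unstable (eta * k outside (0, 2)).
    """
    a = 1.0 - eta * k  # contraction factor applied to x at every step
    if eta <= 0 or abs(a) >= 1.0:
        return math.inf
    return (sigma ** 2) * 2.0 * eta / (1.0 - a * a)


for eta in (1e-3, 1e-1, 1.0, 2.5):
    print(f"eta={eta:<6} variance={stationary_variance(eta):.4f}")
```

With \( \sigma = 1 \) the exact target variance is \( 1/k \), so the bias grows with \( \eta \) and the chain diverges once \( \eta k \) reaches 2. A larger \( \sigma \) inflates the spread by \( \sigma^2 \), which is one way to read "higher values help escape local minima".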

Diagnostics Interpretation

Use return_diagnostics=True to monitor:

  • Mean/Variance: Track distribution stationarity
  • Energy Gradients: Check for vanishing/exploding gradients
  • Autocorrelation: Assess mixing efficiency
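
The layout of the diagnostics returned by sample_chain is not described here, so the sketch below works on a plain list of scalar samples from a single chain. It shows one possible way to compute the stationarity and mixing checks; the helper names `autocorrelation` and `windowed_means` are illustrative, not part of the library.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a scalar chain at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    if var == 0.0:
        return 0.0  # constant chain: autocorrelation is undefined, report 0
    cov = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    ) / n
    return cov / var


def windowed_means(series, window):
    """Mean of each consecutive, non-overlapping window of the chain."""
    return [
        sum(series[i:i + window]) / window
        for i in range(0, len(series) - window + 1, window)
    ]


chain = [1.0, -1.0] * 50  # toy chain that flips sign at every step
print(autocorrelation(chain, 1))  # strongly negative for this toy chain
print(windowed_means(chain, 20))  # flat: every window averages to zero
```

Window means that keep drifting suggest the chain has not reached stationarity, while autocorrelation that decays slowly with the lag indicates poor mixing.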

When to Choose Langevin Over HMC?

| Criterion | Langevin | HMC |
| --- | --- | --- |
| Computational Cost | Lower | Higher |
| Tuning Complexity | Simpler | More involved |
| High Dimensions | Efficient | More efficient |
| Multimodal Targets | May need annealing | Better exploration |

How to Diagnose Sampling?

Check diagnostics for:

  • Sample mean and variance convergence.
  • Gradient magnitudes (should stabilize).
  • Energy trends over iterations.
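
One way to read an energy trend is sketched below in plain Python. It assumes the potential \( U(x) = x^2/2 \) and the update rule from the overview, and it does not use the library's diagnostics format. All chains start far from the mode, and the mean energy and mean gradient magnitude are recorded at every iteration.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
eta, n_chains, n_steps = 0.01, 500, 600
xs = [5.0] * n_chains  # all chains start far from the mode at 0

mean_energy = []     # average U(x) across chains, per iteration
mean_grad_norm = []  # average |grad U(x)| across chains, per iteration
for _ in range(n_steps):
    mean_energy.append(sum(0.5 * x * x for x in xs) / n_chains)
    mean_grad_norm.append(sum(abs(x) for x in xs) / n_chains)
    # For U(x) = x**2 / 2 the gradient is x itself.
    xs = [
        x - eta * x + math.sqrt(2.0 * eta) * rng.gauss(0.0, 1.0)
        for x in xs
    ]

print(f"energy: start={mean_energy[0]:.2f}, end={mean_energy[-1]:.2f}")
print(f"|grad|: start={mean_grad_norm[0]:.2f}, end={mean_grad_norm[-1]:.2f}")
```

Both curves should drop quickly and then fluctuate around a plateau. A curve that is still trending at the last iteration suggests the chains need more steps.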

Further Reading