Adaptive Hyper-parameters

We include affine-invariant schemes for tuning the hyper-parameters of HMC samplers (e.g. step size, number of leapfrog steps). Adaptation schemes are subclasses of the Adapter class.

class hemcee.adaptation.base.Adapter

Abstract base class for adaptation algorithms.

Global Adaptation

These schemes set hyperparameters globally; that is, the values are fixed once warmup ends.

Dual Averaging

Adjusts the step size to reach a target acceptance rate. See https://arxiv.org/abs/1111.4246.

class hemcee.adaptation.dual_averaging.DAParameters(target_accept: float = 0.651, stepsize_inter: float = 0.9, t0: float = 10.0, gamma: float = 0.05, kappa: float = 0.75, agg: str = 'harmonic')

Container for dual averaging hyper-parameters.

Attributes:

target_accept (float): Target acceptance probability.
t0 (float): Free parameter that stabilizes initial iterations.
mu (float): Log of the initial step size.
gamma (float): Controls the speed of adaptation.
kappa (float): Controls the shrinkage towards the average.

class hemcee.adaptation.dual_averaging.DualAveragingAdapter(parameters: DAParameters, initial_step_size: float, initial_L: float)

Dual averaging adapter for step size adaptation.
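The dual averaging recursion from the paper linked above can be sketched in a few lines. This is a minimal illustration, not hemcee's implementation: `DualAveragingState`, `init_state`, `dual_averaging_step` and `run` are hypothetical names, the defaults are copied from the DAParameters signature, and the shrinkage point `mu = log(10 * initial_step_size)` follows the paper's suggestion rather than anything stated here. The `stepsize_inter` and `agg` arguments are not modelled.

```python
import math
from dataclasses import dataclass


@dataclass
class DualAveragingState:
    """Hypothetical state container for illustration; not part of hemcee."""
    mu: float                 # shrinkage point for the log step size
    h_bar: float = 0.0        # running average of (target - observed acceptance)
    log_eps: float = 0.0      # log of the current step size
    log_eps_bar: float = 0.0  # log of the averaged step size
    m: int = 0                # number of adaptation steps taken so far


def init_state(initial_step_size):
    # Assumption: mu = log(10 * eps_0), as suggested in the cited paper.
    return DualAveragingState(mu=math.log(10.0 * initial_step_size),
                              log_eps=math.log(initial_step_size))


def dual_averaging_step(state, accept_prob, target_accept=0.651,
                        t0=10.0, gamma=0.05, kappa=0.75):
    """One warmup update given the acceptance probability of the last proposal."""
    m = state.m + 1
    w = 1.0 / (m + t0)  # t0 damps the influence of early iterations
    h_bar = (1.0 - w) * state.h_bar + w * (target_accept - accept_prob)
    # gamma sets how strongly the step size reacts to the averaged error.
    log_eps = state.mu - math.sqrt(m) / gamma * h_bar
    # kappa sets how quickly the averaging weights decay.
    eta = m ** (-kappa)
    log_eps_bar = eta * log_eps + (1.0 - eta) * state.log_eps_bar
    return DualAveragingState(state.mu, h_bar, log_eps, log_eps_bar, m)


def run(accept_prob, n_steps=50, initial_step_size=0.1):
    """Feed a constant acceptance probability and return (eps, averaged eps)."""
    state = init_state(initial_step_size)
    for _ in range(n_steps):
        state = dual_averaging_step(state, accept_prob)
    return math.exp(state.log_eps), math.exp(state.log_eps_bar)
```

When the observed acceptance stays above the target, the step size grows; when it stays below, the step size shrinks. In the paper, the averaged step size is the value kept after warmup.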

ChEES

Adjusts the integration length by maximizing the Change in the Estimator of the Expected Square (ChEES) of the parameter. The length is set once and then held fixed for the entire MCMC run. See https://proceedings.mlr.press/v130/hoffman21a.html.

\[\mathrm{ChEES} \;=\; \frac{1}{4}\, \mathbb{E}\!\left[\left(\|\theta' - \mathbb{E}[\theta]\|^2 - \|\theta - \mathbb{E}[\theta]\|^2 \right)^2\right] \tag{1}\]

where \(\theta'\) is the post-leapfrog state (i.e. \(\theta',r'=\mathrm{leapfrog}_{\varepsilon,L}(\theta,r)\)).
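As a rough illustration of equation (1), the expectation can be replaced by an average over an ensemble of chains. The sketch below is plain Python and not hemcee's code; `chees_estimate` and `squared_norm` are hypothetical names, and estimating \(\mathbb{E}[\theta]\) by the mean of the pre-leapfrog states is an assumption.

```python
def squared_norm(v):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(x * x for x in v)


def chees_estimate(theta, theta_prime):
    """Monte Carlo estimate of ChEES.

    theta, theta_prime: lists of position vectors (one per chain), holding the
    states before and after the leapfrog trajectory respectively.
    """
    n = len(theta)
    dim = len(theta[0])
    # Assumption: E[theta] is approximated by the ensemble mean of theta.
    mean = [sum(t[d] for t in theta) / n for d in range(dim)]
    total = 0.0
    for t, tp in zip(theta, theta_prime):
        before = squared_norm([x - m for x, m in zip(t, mean)])
        after = squared_norm([x - m for x, m in zip(tp, mean)])
        total += (after - before) ** 2
    return 0.25 * total / n
```

A trajectory that returns every chain to its starting point gives an estimate of zero, so maximizing this quantity favours integration lengths that move states far from where they began.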

class hemcee.adaptation.chees.ChEESParameters(T_min: float = 0.25, T_max: float = 10.0, T_interpolation: float = 0.9, jitter: float = 0.6, lr_T: float = 0.025, beta1: float = 0.0, beta2: float = 0.95, regularization: float = 1e-07)

Parameters for the ChEES adaptation.

Args:

T_min: Minimum allowed integration time.
T_max: Maximum allowed integration time.
T_interpolation: Running average interpolation (0.9 means 90% old, 10% new).
jitter: Strength of jittered time t_n = h_n T (where h_n is a Halton sequence).
lr_T: Learning rate for the integration time.
beta1: Beta1 for ADAM optimizer.
beta2: Beta2 for ADAM optimizer.
regularization: Regularization for ADAM optimizer.
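The jittered time t_n = h_n T mentioned under `jitter` can be illustrated with a base-2 Halton (van der Corput) sequence. This is only a sketch: `halton` and `jittered_num_steps` are hypothetical helpers, rounding the jittered time up to a whole number of leapfrog steps is an assumption, and the way the `jitter` strength enters hemcee's computation is not specified here, so it is not modelled.

```python
import math


def halton(n, base=2):
    """n-th element (n >= 1) of the Halton sequence in the given base."""
    h, f = 0.0, 1.0
    while n > 0:
        f /= base
        h += f * (n % base)
        n //= base
    return h


def jittered_num_steps(n, T, step_size):
    """Leapfrog steps for iteration n, using the jittered time t_n = h_n * T."""
    t_n = halton(n) * T
    # Assumption: round up and take at least one leapfrog step.
    return max(1, math.ceil(t_n / step_size))
```

The first few base-2 values are 0.5, 0.25, 0.75, 0.125, so successive trajectories cover the interval (0, T) evenly instead of always using the full integration time.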

class hemcee.adaptation.chees.ChEESAdapter(parameters: ChEESParameters, move_type: str, initial_step_size: float, initial_L: float)

ChEES adapter for integration time adaptation.

Local Adaptation

These schemes set hyperparameters locally; that is, they choose parameters for each proposal step.

NUTS

Adjusts the integration length via the No-U-Turn condition. It sets the length locally (meaning per MCMC proposal). See https://arxiv.org/abs/1111.4246. We also provide a modification of the No-U-Turn condition that makes the algorithm affine invariant.
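For reference, the standard (Euclidean) No-U-Turn check from the linked paper can be sketched as follows. The function names are hypothetical and this is not hemcee's code; the affine-invariant modification is not described in this section, so it is not shown.

```python
def dot(a, b):
    """Dot product of two vectors given as lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def is_u_turn(theta_minus, theta_plus, r_minus, r_plus):
    """Return True when the trajectory has started to double back on itself.

    theta_minus, theta_plus: positions at the backward and forward ends.
    r_minus, r_plus: momenta at the backward and forward ends.
    """
    span = [p - m for p, m in zip(theta_plus, theta_minus)]
    # Stop as soon as extending either end would shrink the distance between them.
    return dot(span, r_minus) < 0.0 or dot(span, r_plus) < 0.0
```

Trajectory doubling continues while `is_u_turn` returns False, which is how the integration length ends up being chosen separately for each proposal.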