
CosineScheduler


Bases: BaseScheduler

Scheduler with cosine annealing.

This scheduler implements cosine annealing, which provides a smooth transition from the start value to the end value following a cosine curve. Cosine annealing is popular in deep learning because the value changes slowly at the start, decays fastest mid-schedule, and flattens out smoothly near the end, which can help with convergence.

Mathematical Formula

\[v(t) = \begin{cases} v_{end} + (v_0 - v_{end}) \times \frac{1 + \cos(\pi t/T)}{2}, & \text{if } t < T \\ v_{end}, & \text{if } t \geq T \end{cases}\]

where:

  • \(v_0\) is the start_value
  • \(v_{end}\) is the end_value
  • \(T\) is n_steps
  • \(t\) is the current step count

Cosine Curve Properties

The cosine function creates a smooth S-shaped curve: the value changes slowly near the start, decays fastest at the midpoint, and levels off as it approaches the end value.
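A standalone sketch (independent of the actual class) makes these properties easy to check: the curve starts at \(v_0\), passes through the exact midpoint value at \(t = T/2\), clamps to \(v_{end}\) at \(t \geq T\), and moves fastest around mid-schedule:

```python
import math

def cosine_value(t, v0, v_end, T):
    # v(t) = v_end + (v0 - v_end) * (1 + cos(pi * t / T)) / 2, clamped past T
    if t >= T:
        return v_end
    return v_end + (v0 - v_end) * 0.5 * (1 + math.cos(math.pi * t / T))

v0, v_end, T = 0.1, 0.001, 100
assert math.isclose(cosine_value(0, v0, v_end, T), v0)                     # starts at v0
assert math.isclose(cosine_value(T // 2, v0, v_end, T), (v0 + v_end) / 2)  # exact midpoint at T/2
assert cosine_value(T, v0, v_end, T) == v_end                              # clamped at t >= T

# Decay is slow near the ends and fastest around t = T/2:
early = cosine_value(0, v0, v_end, T) - cosine_value(10, v0, v_end, T)
middle = cosine_value(45, v0, v_end, T) - cosine_value(55, v0, v_end, T)
assert middle > early
```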

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `start_value` | `float` | Starting parameter value. | required |
| `end_value` | `float` | Target parameter value. | required |
| `n_steps` | `int` | Number of steps to reach the final value. | required |

Raises:

| Type | Description |
|------|-------------|
| `ValueError` | If `n_steps` is not positive. |

Step Size Annealing

```python
scheduler = CosineScheduler(start_value=0.1, end_value=0.001, n_steps=100)
values = []
for i in range(10):
    value = scheduler.step()
    values.append(value)
    if i < 3:  # Show first few values
        print(f"Step {i+1}: {value:.6f}")
# Values decay smoothly from 0.1 toward 0.001
```

Learning Rate Scheduling

```python
lr_scheduler = CosineScheduler(
    start_value=0.01, end_value=0.0001, n_steps=1000
)
# In training loop
for epoch in range(1000):
    lr = lr_scheduler.step()
    # Update optimizer learning rate
```
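The update itself depends on the optimizer API, so it is left abstract above. As a self-contained sketch of the values this loop sees, here is a hand-rolled generator for the same formula (a hypothetical helper, not part of torchebm; it assumes the first call to `step()` corresponds to t = 1):

```python
import math

def cosine_schedule(start_value, end_value, n_steps):
    """Yield cosine-annealed values for steps t = 1, 2, 3, ..."""
    t = 0
    while True:
        t += 1
        if t >= n_steps:
            yield end_value
        else:
            factor = 0.5 * (1 + math.cos(math.pi * t / n_steps))
            yield end_value + (start_value - end_value) * factor

gen = cosine_schedule(0.01, 0.0001, 1000)
lrs = [next(gen) for _ in range(1000)]
assert all(a >= b for a, b in zip(lrs, lrs[1:]))  # monotone non-increasing
assert lrs[-1] == 0.0001                          # ends exactly at end_value
```

With a PyTorch optimizer, the scheduled value would typically be written into `optimizer.param_groups[i]["lr"]` on each iteration.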

Noise Scale Annealing

```python
noise_scheduler = CosineScheduler(
    start_value=1.0, end_value=0.01, n_steps=500
)
sampler = LangevinDynamics(
    energy_function=energy_fn,
    step_size=0.01,
    noise_scale=noise_scheduler
)
```
Source code in torchebm/core/base_scheduler.py
````python
class CosineScheduler(BaseScheduler):
    r"""
    Scheduler with cosine annealing.

    This scheduler implements cosine annealing, which provides a smooth transition
    from the start value to the end value following a cosine curve. Cosine annealing
    is popular in deep learning because the value changes slowly at the start,
    decays fastest mid-schedule, and flattens out smoothly near the end, which
    can help with convergence.

    !!! info "Mathematical Formula"
        $$v(t) = \begin{cases}
        v_{end} + (v_0 - v_{end}) \times \frac{1 + \cos(\pi t/T)}{2}, & \text{if } t < T \\
        v_{end}, & \text{if } t \geq T
        \end{cases}$$

        where:

        - \(v_0\) is the start_value
        - \(v_{end}\) is the end_value
        - \(T\) is n_steps
        - \(t\) is the current step count

    !!! note "Cosine Curve Properties"
        The cosine function creates a smooth S-shaped curve: the value changes
        slowly near the start, decays fastest at the midpoint, and levels off
        as it approaches the end value.

    Args:
        start_value (float): Starting parameter value.
        end_value (float): Target parameter value.
        n_steps (int): Number of steps to reach the final value.

    Raises:
        ValueError: If n_steps is not positive.

    !!! example "Step Size Annealing"
        ```python
        scheduler = CosineScheduler(start_value=0.1, end_value=0.001, n_steps=100)
        values = []
        for i in range(10):
            value = scheduler.step()
            values.append(value)
            if i < 3:  # Show first few values
                print(f"Step {i+1}: {value:.6f}")
        # Values decay smoothly from 0.1 toward 0.001
        ```

    !!! tip "Learning Rate Scheduling"
        ```python
        lr_scheduler = CosineScheduler(
            start_value=0.01, end_value=0.0001, n_steps=1000
        )
        # In training loop
        for epoch in range(1000):
            lr = lr_scheduler.step()
            # Update optimizer learning rate
        ```

    !!! example "Noise Scale Annealing"
        ```python
        noise_scheduler = CosineScheduler(
            start_value=1.0, end_value=0.01, n_steps=500
        )
        sampler = LangevinDynamics(
            energy_function=energy_fn,
            step_size=0.01,
            noise_scale=noise_scheduler
        )
        ```
    """

    def __init__(self, start_value: float, end_value: float, n_steps: int):
        r"""
        Initialize the cosine scheduler.

        Args:
            start_value (float): Starting parameter value.
            end_value (float): Target parameter value.
            n_steps (int): Number of steps to reach the final value.

        Raises:
            ValueError: If n_steps is not positive.
        """
        super().__init__(start_value)
        if n_steps <= 0:
            raise ValueError(f"n_steps must be a positive integer, got {n_steps}")

        self.end_value = end_value
        self.n_steps = n_steps

    def _compute_value(self) -> float:
        r"""
        Compute the cosine annealed value.

        Returns:
            float: The annealed value following the cosine schedule.
        """
        if self.step_count >= self.n_steps:
            return self.end_value
        else:
            # Cosine schedule from start_value to end_value
            progress = self.step_count / self.n_steps
            cosine_factor = 0.5 * (1 + math.cos(math.pi * progress))
            return self.end_value + (self.start_value - self.end_value) * cosine_factor
````

end_value instance-attribute

end_value = end_value

n_steps instance-attribute

n_steps = n_steps

_compute_value

_compute_value() -> float

Compute the cosine annealed value.

Returns:

| Type | Description |
|------|-------------|
| `float` | The annealed value following the cosine schedule. |

Source code in torchebm/core/base_scheduler.py
```python
def _compute_value(self) -> float:
    r"""
    Compute the cosine annealed value.

    Returns:
        float: The annealed value following the cosine schedule.
    """
    if self.step_count >= self.n_steps:
        return self.end_value
    else:
        # Cosine schedule from start_value to end_value
        progress = self.step_count / self.n_steps
        cosine_factor = 0.5 * (1 + math.cos(math.pi * progress))
        return self.end_value + (self.start_value - self.end_value) * cosine_factor
```
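As a quick sanity check, the branch logic above can be transcribed into a standalone function (hypothetical, for illustration only) and evaluated at the boundaries:

```python
import math

def compute_value(step_count, start_value, end_value, n_steps):
    # Mirrors the method body: clamp once step_count reaches n_steps
    if step_count >= n_steps:
        return end_value
    progress = step_count / n_steps
    cosine_factor = 0.5 * (1 + math.cos(math.pi * progress))
    return end_value + (start_value - end_value) * cosine_factor

assert math.isclose(compute_value(0, 1.0, 0.01, 500), 1.0)  # cos(0) = 1, so start value
assert compute_value(500, 1.0, 0.01, 500) == 0.01           # clamped at n_steps
assert compute_value(10_000, 1.0, 0.01, 500) == 0.01        # stays clamped afterwards
```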