Algorithms

The implemented X-Armed Bandit algorithms can be grouped by the following design features: whether they handle stochastic (noisy) rewards, whether they target cumulative regret, and whether they are anytime.

Algorithm     Research           Stochastic   Cumulative   Anytime
DiRect        paper
DOO           DOO paper
SOO           SOO paper
Zooming       Zooming paper
T-HOO         T-HOO paper
StoSOO        StoSOO paper
HCT           HCT paper
POO*          POO paper
GPO*          GPO paper
PCT           GPO paper
SequOOL       SequOOL paper
StroquOOL     StroquOOL paper
VROOM         VROOM paper
VHCT          VHCT paper
VPCT          N.A.


  • (Stochastic) Some algorithms, such as T-HOO and HCT, are designed for the stochastic X-Armed Bandit setting, where the reward observations are noisy. Others, such as DOO, only work in the noiseless (deterministic) setting.

  • (Cumulative) Some algorithms, such as T-HOO and HCT, are designed to minimize the cumulative regret, i.e., the regret accumulated over the whole learning process. Others, such as StoSOO and StroquOOL, minimize the simple regret, i.e., the regret of the point output after the final round.

  • (Anytime) Some algorithms, such as T-HOO and HCT, are anytime: they do not need to know the total number of rounds (the budget) in advance. Others, such as SequOOL and StroquOOL, require the budget as an input.
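
The distinction between noisy observations, cumulative regret, and simple regret can be made concrete with a small sketch. The code below is illustrative only and does not use this library's API: the objective, the noise level, and the helper names (objective, noisy_reward, cumulative_regret, simple_regret) are all hypothetical, and uniform random sampling stands in for a real bandit algorithm.

```python
import random

F_STAR = 1.0  # maximum of the toy objective below (assumed known here)


def objective(x):
    """Hypothetical 1-D objective on [0, 1], maximized at x = 0.5."""
    return 1.0 - abs(x - 0.5)


def noisy_reward(x, rng, noise_std=0.1):
    """Stochastic setting: the learner observes the objective plus noise."""
    return objective(x) + rng.gauss(0.0, noise_std)


def cumulative_regret(points):
    """Sum of the gaps f* - f(x_t) over every round of the run."""
    return sum(F_STAR - objective(x) for x in points)


def simple_regret(final_point):
    """Gap f* - f(x_out) for the single point output at the end."""
    return F_STAR - objective(final_point)


rng = random.Random(0)

# Placeholder policy: uniform random sampling instead of a real algorithm.
pulled = [rng.random() for _ in range(100)]
rewards = [noisy_reward(x, rng) for x in pulled]

# Output the pulled point with the best observed (noisy) reward.
best_index = max(range(len(pulled)), key=rewards.__getitem__)
final_point = pulled[best_index]

print("cumulative regret:", cumulative_regret(pulled))
print("simple regret:", simple_regret(final_point))
```

Cumulative regret penalizes every exploratory pull, while simple regret only depends on the final output, so a policy that explores heavily can have a small simple regret and a large cumulative regret. Setting noise_std to 0 in this sketch corresponds to the deterministic setting.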

Note

Please refer to the following details for more information.