Algorithms

The X-Armed Bandit algorithms we implement can be classified according to the following features of their design.

Algorithm	Research	Stochastic	Cumulative	Anytime
DiRect	paper	✘	✘	✘
DOO	DOO paper	✘	✘	✘
SOO	SOO paper	✘	✘	✘
Zooming	Zooming paper	✔	✔	✔
T-HOO	T-HOO paper	✔	✔	✔
StoSOO	StoSOO paper	✔	✘	✘
HCT	HCT paper	✔	✔	✔
POO*	POO paper	✔	✘	✔
GPO*	GPO paper	✔	✘	✘
PCT	GPO paper	✔	✘	✘
SequOOL	SequOOL paper	✘	✘	✘
StroquOOL	StroquOOL paper	✔	✘	✘
VROOM	VROOM paper	✔	✘	✘
VHCT	VHCT paper	✔	✔	✔
VPCT	N.A.	✔	✘	✘

(Stochastic) Some algorithms, such as T-HOO and HCT, perform well in the stochastic X-Armed Bandit setting, where the observed rewards are noisy. Others, e.g., DOO, only work in the noiseless (deterministic) setting.
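To make the distinction concrete, the following minimal sketch contrasts a deterministic evaluation with a noisy one. The objective `f`, the helper `noisy_eval`, and the Gaussian noise level `sigma` are hypothetical choices made for illustration and are not part of the library.

```python
import random

def f(x):
    """Hypothetical objective on [0, 1]; its maximum is f(0.5) = 1."""
    return 1.0 - (x - 0.5) ** 2

def noisy_eval(x, rng, sigma=0.1):
    """Stochastic setting (assumed Gaussian noise): each call returns f(x) plus fresh noise."""
    return f(x) + rng.gauss(0.0, sigma)

rng = random.Random(0)
x = 0.3

# Deterministic setting: repeated evaluations of the same point agree exactly.
deterministic = [f(x) for _ in range(3)]

# Stochastic setting: repeated evaluations of the same point differ.
stochastic = [noisy_eval(x, rng) for _ in range(3)]

# Averaging many noisy samples recovers f(x) approximately.
mean_estimate = sum(noisy_eval(x, rng) for _ in range(2000)) / 2000
```

A single noisy sample can make a poor point look better than a good one, which is why algorithms written for the deterministic setting may fail when noise is present.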

(Cumulative) Some algorithms, such as T-HOO and HCT, are designed to optimize the cumulative regret, i.e., the performance over the whole learning process. Others, such as StoSOO and StroquOOL, optimize the simple regret, i.e., the performance of the final output.
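The two regret notions can be sketched as follows. The objective `f`, the list of pulled points, and the helper names `cumulative_regret` and `simple_regret` are hypothetical and chosen for illustration only. For simplicity, the sketch assumes the algorithm outputs its last pulled point.

```python
def f(x):
    """Hypothetical objective on [0, 1]; its maximum is f(0.5) = 1."""
    return 1.0 - (x - 0.5) ** 2

def cumulative_regret(f_star, pulled_values):
    """Sum over all rounds of the gap between the optimum and the pulled point's value."""
    return sum(f_star - v for v in pulled_values)

def simple_regret(f_star, output_value):
    """Gap between the optimum and the value of the single point output at the end."""
    return f_star - output_value

f_star = 1.0
pulled_points = [0.0, 1.0, 0.25, 0.5]        # points pulled in rounds 1 to 4
pulled_values = [f(x) for x in pulled_points]

cum = cumulative_regret(f_star, pulled_values)        # every round counts
simp = simple_regret(f_star, f(pulled_points[-1]))    # only the final output counts
```

Here the early exploratory pulls add to the cumulative regret, while the simple regret depends only on the point reported at the end.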

(Anytime) Some algorithms, such as SequOOL and StroquOOL, need to know the total number of rounds (the budget) in advance in order to run. Anytime algorithms, such as T-HOO and HCT, do not need this information.
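The difference shows up in how a procedure is invoked. The sketch below is only an analogy: neither function is a bandit algorithm from the table, and both names are hypothetical. The first must be given the budget `n` before it can choose any point. The second generates points one at a time (here the base-2 van der Corput sequence) and can be stopped after any round.

```python
import itertools

def fixed_budget_points(n):
    """Budget-dependent: the grid spacing is determined by n, so n must be known up front."""
    return [(i + 0.5) / n for i in range(n)]

def anytime_points():
    """Budget-free: yields the base-2 van der Corput sequence indefinitely."""
    t = 0
    while True:
        t += 1
        x, denom, k = 0.0, 1.0, t
        while k > 0:
            denom *= 2.0
            x += (k % 2) / denom   # mirror the binary digits of t around the radix point
            k //= 2
        yield x

grid = fixed_budget_points(4)
first_four = list(itertools.islice(anytime_points(), 4))
```

Changing the budget of the first procedure changes every point it picks, whereas the second can simply be run for more rounds.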

Note

Please refer to the following details for more information.