Algorithms
Our implemented X-Armed Bandit algorithms can be grouped by three design features: whether they handle stochastic (noisy) rewards, whether they optimize the cumulative regret, and whether they are anytime.
| Algorithm | Research | Stochastic | Cumulative | Anytime |
|---|---|---|---|---|
| DiRect | paper | ✘ | ✘ | ✘ |
| | | ✘ | ✘ | ✘ |
| | | ✘ | ✘ | ✘ |
| | | ✔ | ✔ | ✔ |
| | | ✔ | ✔ | ✔ |
| | | ✔ | ✘ | ✘ |
| | | ✔ | ✔ | ✔ |
| | | ✔ | ✘ | ✔ |
| | | ✔ | ✘ | ✘ |
| | | ✔ | ✘ | ✘ |
| | | ✘ | ✘ | ✘ |
| | | ✔ | ✘ | ✘ |
| | | ✔ | ✘ | ✘ |
| | | ✔ | ✔ | ✔ |
| | N.A. | ✔ | ✘ | ✘ |
(Stochastic) Some algorithms, such as T_HOO and HCT, perform well in the stochastic X-Armed Bandit setting, where the rewards are noisy. Others, e.g., DOO, only work in the noiseless (deterministic) setting.
(Cumulative) Some algorithms, such as T_HOO and HCT, are designed to optimize the cumulative regret, i.e., the performance over the whole learning process. Others, such as StoSOO and StroquOOL, optimize the simple regret, i.e., the performance of the final output.
(Anytime) Some algorithms, such as SequOOL and StroquOOL, need to know the total number of rounds (the budget) in advance. Anytime algorithms, such as T_HOO and HCT, do not need this information.
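The stochastic setting and the two regret notions above can be made concrete with a toy example. The following is only a minimal sketch under stated assumptions: the objective, the noise level, the helper names (`objective`, `noisy_reward`, `cumulative_regret`, `simple_regret`), and the uniform-random pulling strategy are all hypothetical and are not part of this library.

```python
import random


def objective(x):
    """Hypothetical noise-free objective on [0, 1], maximized at x = 0.5."""
    return 1.0 - (x - 0.5) ** 2


# Optimal value; known here only because the objective is a toy function.
F_STAR = objective(0.5)


def noisy_reward(x, rng, sigma=0.1):
    """Stochastic feedback: the true value plus zero-mean Gaussian noise."""
    return objective(x) + rng.gauss(0.0, sigma)


def cumulative_regret(pulled):
    """Sum of the gaps f* - f(x_t) over every point pulled during learning."""
    return sum(F_STAR - objective(x) for x in pulled)


def simple_regret(final_point):
    """Gap f* - f(x_out) for the single point output at the end."""
    return F_STAR - objective(final_point)


rng = random.Random(0)

# Placeholder strategy (an assumption, not a real bandit algorithm):
# pull 100 points uniformly at random and observe noisy rewards.
pulled = [rng.random() for _ in range(100)]
rewards = [noisy_reward(x, rng) for x in pulled]

# Output the pulled point with the highest observed (noisy) reward.
best_guess = pulled[max(range(len(pulled)), key=rewards.__getitem__)]

print("cumulative regret:", cumulative_regret(pulled))
print("simple regret:", simple_regret(best_guess))
```

The cumulative regret charges the learner for every point it pulls, while the simple regret only looks at the final output, so a strategy that explores widely can have a small simple regret and still accumulate a large cumulative regret.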
Note
Please refer to the following details for more information.