Learning and Optimization with Seasonal Patterns

