A New Likelihood Ratio Method for Training Artificial Neural Networks
Published Online: 17 Sep 2021, https://doi.org/10.1287/ijoc.2021.1088

