Weakly Supervised Multi-output Regression via Correlated Gaussian Processes

Published Online: https://doi.org/10.1287/ijds.2022.0018

References

  • Adhikary SK, Muttil N, Yilmaz AG (2017) Cokriging for enhanced spatial interpolation of rainfall in two Australian catchments. Hydrological Processes 31(12):2143–2161.
  • Álvarez MA, Lawrence ND (2009) Sparse convolved Gaussian processes for multi-output regression. Koller D, Schuurmans D, Bengio Y, Bottou L, eds. Advances in Neural Information Processing Systems, vol. 21 (Curran Associates, Red Hook, NY), 57–64.
  • Álvarez MA, Lawrence ND (2011) Computationally efficient convolved multiple output Gaussian processes. J. Machine Learn. Res. 12(May):1459–1500.
  • Álvarez MA, Ward WOC, Guarnizo C (2019) Non-linear process convolutions for multi-output Gaussian processes. Chaudhuri K, Sugiyama M, eds. Proc. 22nd Internat. Conf. Artificial Intelligence Statist. (PMLR), 1969–1977.
  • Álvarez M, Luengo D, Titsias M, Lawrence ND (2010) Efficient multioutput Gaussian processes through variational inducing kernels. Teh YW, Titterington M, eds. Proc. 13th Internat. Conf. Artificial Intelligence Statist. (PMLR), 25–32.
  • Ba S, Joseph VR (2012) Composite Gaussian process models for emulating expensive functions. Ann. Appl. Statist. 6(4):1838–1860.
  • Bae B, Kim H, Lim H, Liu Y, Han LD, Freeze PB (2018) Missing data imputation for traffic flow speed using spatio-temporal cokriging. Transportation Res. Part C Emerging Tech. 88(March):124–139.
  • Barocas S, Hardt M, Narayanan A (2019) Fairness in machine learning. Presentation, 31st Conf. Neural Inform. Processing Systems, La Jolla, CA (fairmlbook.org).
  • Bishop CM (2006) Pattern Recognition and Machine Learning (Springer, New York).
  • Blei DM, Kucukelbir A, McAuliffe JD (2017) Variational inference: A review for statisticians. J. Amer. Statist. Assoc. 112(518):859–877.
  • Borchani H, Varando G, Bielza C, Larranaga P (2015) A survey on multi-output regression. WIREs Data Mining Knowledge Discovery 5(5):216–233.
  • Breiman L, Friedman JH (1997) Predicting multivariate responses in multiple linear regression. J. Roy. Statist. Soc. Ser. B Statist. Methodol. 59(1):3–54.
  • Brookes M (2020) The matrix reference manual. Accessed April 16, 2021, http://www.ee.imperial.ac.uk/hp/staff/dmb/matrix/intro.html.
  • Brown PJ, Zidek JV (1980) Adaptive multivariate ridge regression. Ann. Statist. 8(1):64–74.
  • Burt DR, Rasmussen CE, Van Der Wilk M (2019) Rates of convergence for sparse variational Gaussian process regression. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn. (PMLR), 862–871.
  • Cardona HDV, Álvarez MA, Orozco ÁA (2015) Convolved multi-output Gaussian processes for semi-supervised learning. Murino V, Puppo E, eds. Internat. Conf. Image Anal. Processing (Springer, Cham, Switzerland), 109–118.
  • Chen H, Zheng L, Al Kontar R, Raskutti G (2020) Stochastic gradient descent in correlated settings: A study on Gaussian processes. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Neural Information Processing Systems, vol. 33 (Curran Associates, Red Hook, NY), 2722–2733.
  • Chilès J-P, Delfiner P (2012) Geostatistics: Modeling Spatial Uncertainty, Wiley Series in Probability and Statistics, 2nd ed., vol. 497 (John Wiley & Sons, Hoboken, NJ).
  • Damianou A, Lawrence ND (2015) Semi-described and semi-supervised learning with Gaussian processes. Meila M, Heskes T, eds. Proc. 31st Conf. Uncertainty Artificial Intelligence (AUAI Press, Corvallis, OR), 228–237.
  • Decoda Study Group, Nyamdorj R, Qiao Q, Lam TH, Tuomilehto J, Ho SY, Pitkäniemi J, et al. (2008) BMI compared with central obesity indicators in relation to diabetes and hypertension in Asians. Obesity (Silver Spring) 16(7):1622–1635.
  • Deng X, Lin CD, Liu KW, Rowe R (2017) Additive Gaussian process for computer models with qualitative and quantitative factors. Technometrics 59(3):283–292.
  • Diana A, Matechou E, Griffin J, Johnston A (2020) A hierarchical dependent Dirichlet process prior for modelling bird migration patterns in the UK. Ann. Appl. Statist. 14(1):473–493.
  • Frazier PI (2018) A tutorial on Bayesian optimization. Preprint, submitted July 8, https://doi.org/10.48550/arXiv.1807.02811.
  • Fricker TE, Oakley JE, Urban NM (2013) Multivariate Gaussian process emulators with nonseparable covariance structures. Technometrics 55(1):47–56.
  • Furrer R, Genton MG (2011) Aggregation-cokriging for highly multivariate spatial data. Biometrika 98(3):615–631.
  • Genton MG, Kleiber W (2015) Cross-covariance functions for multivariate geostatistics. Statist. Sci. 30(2):147–163.
  • Gramacy RB, Apley DW (2015) Local Gaussian process approximation for large computer experiments. J. Comput. Graphical Statist. 24(2):561–578.
  • Gramacy RB, Haaland B (2016) Speeding up neighborhood search in local Gaussian process prediction. Technometrics 58(3):294–303.
  • Grandvalet Y, Bengio Y (2005) Semi-supervised learning by entropy minimization. Saul L, Weiss Y, Bottou L, eds. Proc. 17th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 529–536.
  • Gu M, Berger JO (2016) Parallel partial Gaussian process emulation for computer models with massive output. Ann. Appl. Statist. 10(3):1317–1347.
  • Guildford L, Crofts C, Lu J (2020) Can the molar insulin: C-peptide ratio be used to predict hyperinsulinaemia? Biomedicines 8(5):108.
  • Hensman J, Fusi N, Lawrence ND (2013) Gaussian processes for big data. Nicholson A, Smyth P, eds. Proc. 29th Conf. Uncertainty Artificial Intelligence (AUAI Press, Arlington, VA), 282–290.
  • Higdon D (2002) Space and space-time modeling using process convolutions. Anderson CW, Barnett V, Chatwin PC, El-Shaarawi AH, eds. Quantitative Methods for Current Environmental Issues (Springer, London), 37–56.
  • Hoffman MD, Blei DM, Wang C, Paisley J (2013) Stochastic variational inference. J. Machine Learn. Res. 14(1):1303–1347.
  • Izenman AJ (1975) Reduced-rank regression for the multivariate linear model. J. Multivariate Anal. 5(2):248–264.
  • Jahani S, Kontar R, Veeramani D, Zhou S (2018) Statistical monitoring of multiple profiles simultaneously using Gaussian processes. Quality Reliability Engrg. Internat. 34(8):1510–1529.
  • Jean N, Xie SM, Ermon S (2018) Semi-supervised deep kernel learning: Regression with unlabeled data by minimizing predictive variance. Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates, Red Hook, NY), 5322–5333.
  • Jeong JY, Kang JS, Jun CH (2020) Regularization-based model tree for multi-output regression. Inform. Sci. 507(January):240–255.
  • Kaiser M, Otte C, Runkler TA, Ek CH (2019) Data association with Gaussian processes. Preprint, submitted May 5, https://doi.org/10.48550/arXiv.1810.07158.
  • Kaufman CG, Schervish MJ, Nychka DW (2008) Covariance tapering for likelihood-based estimation in large spatial data sets. J. Amer. Statist. Assoc. 103(484):1545–1555.
  • Kontar R, Raskutti G, Zhou S (2021) Minimizing negative transfer of knowledge in multivariate Gaussian processes: A scalable and regularized approach. IEEE Trans. Pattern Anal. Machine Intelligence 43(October):3508–3522.
  • Kontar R, Zhou S, Sankavaram C, Du X, Zhang Y (2017) Nonparametric-condition-based remaining useful life prediction incorporating external factors. IEEE Trans. Reliability 67(1):41–52.
  • Kontar R, Zhou S, Sankavaram C, Du X, Zhang Y (2018) Nonparametric modeling and prognosis of condition monitoring signals using multivariate Gaussian convolution processes. Technometrics 60(4):484–496.
  • Lalchand V, Rasmussen CE (2019) Approximate inference for fully Bayesian Gaussian process regression. Proc. 2nd Symposium on Advances in Approximate Bayesian Inference (PMLR), 1–12.
  • Lawrence ND, Jordan MI (2005) Semi-supervised learning via Gaussian processes. Saul L, Weiss Y, Bottou L, eds. Proc. 17th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 753–760.
  • Lázaro-Gredilla M, Van Vaerenbergh S, Lawrence ND (2012) Overlapping mixtures of Gaussian processes for the data association problem. Pattern Recognition 45(4):1386–1395.
  • Lee Y, Kim H (2020) Bayesian nonparametric joint mixture model for clustering spatially correlated time series. Technometrics 62(3):313–329.
  • Leighton E, Sainsbury CA, Jones GC (2017) A practical review of C-peptide testing in diabetes. Diabetes Therapy 8(3):475–487.
  • Li Y, Nan B, Zhu J (2015) Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure. Biometrics 71(2):354–363.
  • Mårin P, Holmäng S, Jönsson L, Sjöström L, Kvist H, Holm G, Lindstedt G, Björntorp P (1992) The effects of testosterone treatment on body composition and metabolism in middle-aged obese men. Internat. J. Obesity Related Metabolic Disorders 16(12):991–997.
  • Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A (2021) A survey on bias and fairness in machine learning. ACM Comput. Surveys 54(6):1–35.
  • Moreno-Muñoz P, Artés-Rodríguez A, Álvarez MA (2018) Heterogeneous multi-output Gaussian process prediction. Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates, Red Hook, NY), 6711–6720.
  • Nerini D, Monestiez P, Manté C (2010) Cokriging for spatial functional data. J. Multivariate Anal. 101(2):409–418.
  • Ng YC, Colombo N, Silva R (2018) Bayesian semi-supervised learning with graph Gaussian processes. Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates, Red Hook, NY), 1683–1694.
  • Nieschlag E, Behre HM, Nieschlag S, eds. (2012) Testosterone: Action, Deficiency, Substitution, 4th ed. (Cambridge University Press, Cambridge, UK).
  • Panos A, Dellaportas P, Titsias MK (2018) Fully scalable Gaussian processes using subspace inducing inputs. Preprint, submitted July 12, https://doi.org/10.48550/arXiv.1807.02537.
  • Qian PZG, Wu H, Wu CJ (2008) Gaussian process models for computer experiments with qualitative and quantitative factors. Technometrics 50(3):383–396.
  • Rasmussen CE, Williams CKI (2006) Gaussian Processes for Machine Learning (MIT Press, Cambridge, MA).
  • Rohrbach M, Ebert S, Schiele B (2013) Transfer learning in a transductive setting. Burges CJ, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates, Red Hook, NY), 46–54.
  • Ross J, Dy J (2013) Nonparametric mixture of Gaussian processes with constraints. Dasgupta S, McAllester D, eds. Proc. 30th Internat. Conf. Machine Learn. (PMLR), 1346–1354.
  • Rothman AJ, Levina E, Zhu J (2010) Sparse multivariate regression with covariance estimation. J. Comput. Graphical Statist. 19(4):947–962.
  • Ryzhov IO, Powell WB, Frazier PI (2012) The knowledge gradient algorithm for a general class of online learning problems. Oper. Res. 60(1):180–195.
  • Saatçi Y (2012) Scalable inference for structured Gaussian process models. PhD thesis, University of Cambridge, Cambridge, UK.
  • Sánchez-Fernández M, de Prado-Cumplido M, Arenas-García J, Pérez-Cruz F (2004) SVM multiregression for nonlinear channel estimation in multiple-input multiple-output systems. IEEE Trans. Signal Processing 52(8):2298–2307.
  • Singh A, Nowak R, Zhu J (2009) Unlabeled data: Now it helps, now it doesn’t. Koller D, Schuurmans D, Bengio Y, Bottou L, eds. Advances in Neural Information Processing Systems, vol. 21 (Curran Associates, Red Hook, NY), 1513–1520.
  • Skolidis G, Sanguinetti G (2013) Semisupervised multitask learning with Gaussian processes. IEEE Trans. Neural Networks Learning Systems 24(12):2101–2112.
  • Srinivas N, Krause A, Kakade SM, Seeger M (2009) Gaussian process optimization in the bandit setting: No regret and experimental design. Proc. 27th Internat. Conf. Machine Learn. (Omnipress, Madison, WI), 1015–1022.
  • Ver Hoef JM, Barry RP (1998) Constructing and fitting models for cokriging and multivariable spatial prediction. J. Statist. Planning Inference 69(2):275–294.
  • Wang J, Clark SC, Liu E, Frazier PI (2020) Parallel Bayesian global optimization of expensive functions. Oper. Res. 68(6):1850–1865.
  • Wang KA, Pleiss G, Gardner JR, Tyree S, Weinberger KQ, Wilson AG (2019) Exact Gaussian processes on a million data points. Wallach HM, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox EA, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates, Red Hook, NY), 14622–14632.
  • Wang W, Yang X, Liang J, Liao M, Zhang H, Qin X, Mo L, Lv W, Mo Z (2013) Cigarette smoking has a positive and independent effect on testosterone levels. Hormones (Athens) 12(4):567–577.
  • Wilson A, Nickisch H (2015) Kernel interpolation for scalable structured Gaussian processes (KISS-GP). Bach F, Blei D, eds. Proc. 32nd Internat. Conf. Machine Learn. (PMLR), 1775–1784.
  • Wilson AG, Hu Z, Salakhutdinov R, Xing EP (2016) Deep kernel learning. Gretton A, Robert CC, eds. Proc. 19th Internat. Conf. Artificial Intelligence Statist. (PMLR), 370–378.
  • Wold H (1975) Soft modelling by latent variables: The non-linear iterative partial least squares (NIPALS) approach. J. Appl. Probab. 12(S1):117–142.
  • Xie J, Frazier PI, Chick SE (2016) Bayesian optimization via simulation with pairwise sampling and correlated prior beliefs. Oper. Res. 64(2):542–559.
  • Xu S, An X, Qiao X, Zhu L, Li L (2013) Multi-output least-squares support vector regression machines. Pattern Recognition Lett. 34(9):1078–1084.
  • Yue X, Al Kontar R (2020) The Rényi Gaussian process: Towards improved generalization. Preprint, submitted February 18, https://doi.org/10.48550/arXiv.1910.06990.
  • Yue X, Al Kontar R (2021) Joint models for event prediction from time series and survival data. Technometrics 63(4):477–486.
  • Yue X, Nouiehed M, Al Kontar R (2022) GIFAIR-FL: An approach for group and individual fairness in federated learning. Preprint, submitted March 8, https://doi.org/10.48550/arXiv.2108.02741.
  • Zhao J, Sun S (2016) Variational dependent multi-output Gaussian process dynamical systems. J. Machine Learn. Res. 17(1):4134–4169.
  • Zhu X, Ghahramani Z (2002) Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, Carnegie Mellon University, Pittsburgh.