AISTATS 2018 Accepted Papers

The accepted papers are listed in the table below, sorted in increasing order of Paper ID.

Paper ID Paper Title Author Names
2 The Geometry of Random Features Krzysztof Choromanski, ; Mark Rowland, University of Cambridge ; Tamas Sarlos, Google Research; Vikas Sindhwani, Google Brain Robotics; Richard Turner, Cambridge; Adrian Weller*, University of Cambridge
4 Gauged Mini-Bucket Elimination for Approximate Inference Sungsoo Ahn, KAIST; Michael Chertkov, Los Alamos National Laboratory; Jinwoo Shin, KAIST; Adrian Weller*, University of Cambridge
8 A Fast Algorithm for Separated Sparsity via Perturbed Lagrangians Aleksander Madry, MIT; Slobodan Mitrovic*, EPFL; Ludwig Schmidt, MIT
10 An Analysis of Categorical Distributional Reinforcement Learning Mark Rowland*, University of Cambridge ; Marc Bellemare, Google Brain; Will Dabney, DeepMind; Remi Munos, DeepMind; Yee Whye Teh, Oxford and DeepMind
16 Combinatorial Preconditioners for Proximal Algorithms on Graphs Thomas Möllenhoff*, TU Munich; Zhenzhang Ye, ; Tao Wu, ; Daniel Cremers,
22 Growth-Optimal Portfolio Selection under CVaR Constraints Guy Uziel*, Technion; Ran El-Yaniv, Technion
26 Accelerated Stochastic Power Iteration Christopher De Sa, Cornell University; Bryan He, Stanford University; Ioannis Mitliagkas, Université de Montréal; Chris Re, Stanford University; Peng Xu*, Stanford University
27 Multi-scale Nystrom Method Woosang Lim, Georgia Tech; Rundong Du, Georgia Tech; Bo Dai, Georgia Tech; Kyomin Jung, Seoul National University; Le Song, Georgia Tech; Haesun Park*, Georgia Tech
30 Making Tree Ensembles Interpretable: A Bayesian Model Selection Approach Satoshi Hara*, Osaka University; Kohei Hayashi,
32 Mixed Membership Word Embeddings for Computational Social Science James Foulds*, UMBC
35 Fast Threshold Tests for Detecting Discrimination Emma Pierson*, Stanford University; Sam Corbett-Davies, Stanford University; Sharad Goel,
40 Iterative Supervised Principal Components Juho Piironen*, Aalto University; Aki Vehtari, Aalto
42 Iterative Spectral Method for Alternative Clustering Chieh Wu*, Northeastern University; Stratis Ioannidis, NEU; Mario Sznaier, Northeastern University; Xiangyu Li, Northeastern University; David Kaeli, Northeastern University; Jennifer Dy, Northeastern University
45 Can clustering scale sublinearly with its clusters? A variational EM acceleration of GMMs and k-means Dennis Forster*, University of Oldenburg; Jörg Lücke, University of Oldenburg
48 Parallelised Bayesian Optimisation via Thompson Sampling Kirthevasan Kandasamy*, ; Akshay Krishnamurthy, U-Mass Amherst; Jeff Schneider, CMU; Barnabas Poczos, Carnegie Mellon University
49 On the challenges of learning with inference networks on sparse, high-dimensional data Rahul Krishnan*, MIT; Dawen Liang, Netflix; Matthew Hoffman, Google
54 Post Selection Inference with Kernels Makoto Yamada*, RIKEN; Yuta Umezu, ; Kenji Fukumizu, ; Ichiro Takeuchi,
55 On how complexity affects the stability of a predictor Joel Ratsaby*, Ariel University
56 On the Truly Block Eigensolvers via First-Order Riemannian Optimization Zhiqiang Xu*, KAUST; Xin Gao,
59 Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond Heng Guo*, University of Edinburgh; Kaan Kara, ETH Zurich; Ce Zhang, ETH Zurich
60 IHT dies hard: Provable accelerated Iterative Hard Thresholding Rajiv Khanna, UT Austin; Anastasios Kyrillidis*, IBM T.J. Watson Research Center
65 Finding Global Optima in Nonconvex Stochastic Semidefinite Optimization with Variance Reduction Jinshan Zeng*, Hong Kong University of Science and Technology; Ke Ma, IIE, CAS; Yuan Yao, Hong Kong University of Science and Technology
66 Outlier Detection and Robust Estimation in Nonparametric Regression Dehan Kong, Univ. of Toronto; Howard Bondell, North Carolina State University; Weining Shen*, UC Irvine
68 Integral Transforms from Finite Data: An Application of Gaussian Process Regression to Fourier Analysis Luca Ambrogioni*, Radboud University; Eric Maris, Radboud University
72 AdaGeo: Adaptive Geometric Learning for Optimization and Sampling Gabriele Abbati*, University of Oxford; Alessandra Tosi, Mind Foundry, Oxford; Seth Flaxman, Imperial College London; Michael Osborne, Oxford
74 Online Learning with Non-Convex Losses and Non-Stationary Regret Xiaobo Li*, University of Minnesota; Xiang Gao, University of Minnesota; Shuzhong Zhang, University of Minnesota
75 Learning Determinantal Point Processes in Sublinear Time Christophe Dupuy*, INRIA; Francis Bach, INRIA - ENS
76 Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding Kaiqing Zhang, University of Illinois at Urbana-Champaign; Zhuoran Yang*, Princeton University
77 Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis Hiroyuki Kasai*, UEC; Hiroyuki Sato, Kyoto University; Bamdev Mishra, Amazon
78 Online Boosting Algorithms for Multi-label Ranking Young Hun Jung*, University of Michigan; Ambuj Tewari, University of Michigan
80 Zeroth-Order Online Alternating Direction Method of Multipliers: Convergence Analysis and Applications Sijia Liu*, University of Michigan; Jie Chen, ; Pin-Yu Chen, ; Alfred Hero,
86 High-dimensional Bayesian optimization via additive models with overlapping groups Paul Rolland*, EPFL, LIONS; Jonathan Scarlett, ; Ilija Bogunovic, ; Volkan Cevher, EPFL
89 Robust Active Label Correction Jan Kremer, University of Copenhagen; Fei Sha, UCLA; Christian Igel*, University of Copenhagen
90 Factorial HMM with Collapsed Gibbs Sampling for optimizing long-term HIV Therapy Amit Gruber*, IBM Research; Chen Yanover, IBM Research; Tal El-Hay, IBM Research; Yaara Goldschmidt, IBM Research; Anders Sönnerborg, Karolinska Institute, Karolinska University Hospital; Vanni Borghi, Modena University Hospital; Francesca Incardona, EuResist Network GEIE, InformaPro S.r.l.
91 Optimal Submodular Extensions for Marginal Estimation Pankaj Pansari*, University of Oxford; Chris Russell, The Alan Turing Institute; M. Pawan Kumar, University of Oxford
92 Semi-Supervised Learning with Competitive Infection Models Nir Rosenfeld*, Harvard University; Amir Globerson, Tel Aviv University
94 Discriminative Learning of Prediction Intervals Nir Rosenfeld*, Harvard University; Yishay Mansour, Tel Aviv University; Elad Yom Tov, Microsoft Research
95 Topic Compositional Neural Language Model Wenlin Wang*, Duke University; Zhe Gan, Duke University; Wenqi Wang, Purdue University; Dinghan Shen, Duke University; Jiaji Huang, Baidu Silicon Valley Artificial Intelligence Lab; Wei Ping, Baidu Silicon Valley Artificial Intelligence Lab; Sanjeev Satheesh, Baidu Silicon Valley Artificial Intelligence Lab; Lawrence Carin, Duke
97 Learning Priors for Invariance Eric Nalisnick*, UC Irvine; Padhraic Smyth, University of California, Irvine
98 Optimal Cooperative Inference Scott Cheng-Hsin Yang*, Rutgers University--Newark; Yue Yu, Rutgers University--Newark; Arash Givchi, Rutgers University--Newark; Pei Wang, Rutgers University--Newark; Wai Keen Vong, Rutgers University--Newark; Patrick Shafto, Rutgers University--Newark
102 Stochastic Multi-armed Bandits in Constant Space David Liau, UT-Austin; Zhao Song, UT-Austin; Eric Price, UT-Austin; Ger Yang*, UT-Austin
109 Matrix completability analysis via graph k-connectivity Dehua Cheng*, Univ. of Southern California; Natali Ruchansky, ; Yan Liu, University of Southern California
112 FLAG n’ FLARE: Fast Linearly-Coupled Adaptive Gradient Methods Xiang Cheng, UC Berkeley; Fred Roosta*, University of Queensland; Stefan Palombo, UC Berkeley; Peter Bartlett, UC Berkeley; Michael Mahoney, UC Berkeley
113 Multi-view Metric Learning in Vector-valued Kernel Spaces Riikka Huusari*, Aix-Marseille Université; Hachem Kadri, Aix-Marseille University; Cécile Capponi,
115 Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data William Herlands*, Carnegie Mellon University; Edward McFowland, ; Andrew Wilson, Cornell University; Daniel Neill,
117 Dropout as a Low-Rank Regularizer for Matrix Factorization Jacopo Cavazza*, Istituto Italiano di Tecnologia; Pietro Morerio, Istituto Italiano di Tecnologia; Benjamin Haeffele, Johns Hopkins University; Connor Lane, Johns Hopkins University; Vittorio Murino, Istituto Italiano di Tecnologia; Rene Vidal, Johns Hopkins University
119 A Simple Analysis for Exp-concave Empirical Minimization with Arbitrary Convex Regularizer Tianbao Yang*, University of Iowa; Zhe Li, ; Lijun Zhang, Nanjing University
120 Independently Interpretable Lasso: A New Regularizer for Sparse Regression with Uncorrelated Variables Masaaki Takada*, The Graduate University for Advanced Studies; Taiji Suzuki, The University of Tokyo; Hironori Fujisawa, The Institute of Statistical Mathematics
121 Boosting Variational Inference: an Optimization Perspective Francesco Locatello*, ETH Zurich; Rajiv Khanna, UT Austin; Joydeep Ghosh, ; Gunnar Ratsch,
122 Personalized and Private Peer-to-Peer Machine Learning Aurélien Bellet*, INRIA; Rachid Guerraoui, ; Mahsa Taziki, ; Marc Tommasi,
125 Tensor Regression Meets Gaussian Processes Rose Yu*, Caltech; Guangyu Li, University of Southern California; Yan Liu, University of Southern California
127 A Nonconvex Proximal Splitting Algorithm under Moreau-Yosida Regularization Emanuel Laude*, Technical University of Munich; Tao Wu, ; Daniel Cremers,
133 Medoids in Almost-Linear Time via Multi-Armed Bandits Vivek Bagaria, ; Govinda Kamath, ; Martin Zhang, Stanford University; Vasilis Ntranos, ; David Tse*,
139 Regional Multi-Armed Bandits Zhiyang Wang, USTC; Ruida Zhou, USTC; Cong Shen*, Univ. of Sci. & Tech. China
142 Nearly second-order optimality of online joint detection and estimation via one-sample update schemes Yang Cao*, Georgia Institute of Technology; Liyan Xie, ; Yao Xie, ; Huan Xu,
151 Sum-Product-Quotient Networks Or Sharir*, Hebrew University of Jerusalem; Amnon Shashua, Hebrew University of Jerusalem
154 Exploiting Strategy-Space Diversity for Batch Bayesian Optimization Sunil Gupta*, Deakin University; Alistair Shilton, Deakin University; Santu Rana, Deakin University; Svetha Venkatesh, Deakin University
158 Beating Monte Carlo Integration: a Nonasymptotic Study of Kernel Smoothing Methods Stephan Clémençon*, Telecom ParisTech; François Portier, Telecom ParisTech
166 Group invariance principles for causal generative models Michel Besserve*, ; Naji Shajarisales, MPI for Intelligent Systems; Bernhard Schoelkopf, MPI for Intelligent Systems; Dominik Janzing, MPI for Intelligent Systems
167 A Provable Algorithm for Learning Interpretable Scoring Systems Nataliya Sokolovska*, University Paris 6; Yann Chevaleyre, University Paris Dauphine; Jean-Daniel Zucker, IRD
172 Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes Hyunjik Kim*, University of Oxford; Yee Whye Teh, Oxford
178 Efficient Bandit Combinatorial Optimization Algorithm with Zero-suppressed Binary Decision Diagrams Shinsaku Sakaue*, NTT; Masakazu Ishihata, Hokkaido University; Shin-ichi Minato,
181 Transfer Learning on fMRI Datasets Hejia Zhang*, Princeton University; Po-Hsuan Chen, Princeton University; Peter Ramadge, Princeton University
183 An Optimization Approach to Learning Falling Rule Lists Chaofan Chen*, Duke University; Cynthia Rudin, Duke
185 Catalyst for Gradient-based Nonconvex Optimization Courtney Paquette*, Ohio State University; Hongzhou Lin, INRIA; Dmitriy Drusvyatskiy, University of Washington; Julien Mairal, Inria; Zaid Harchaoui, University of Washington
188 Benefits from Superposed Hawkes Processes Hongteng Xu*, Duke University; Dixin Luo, ; Xu Chen, Tsinghua University; Lawrence Carin, Duke
192 Nonparametric Preference Completion Julian Katz-Samuels*, University of Michigan; Clayton Scott, University of Michigan
198 Non-parametric estimation of Jensen-Shannon Divergence in Generative Adversarial Network training Mathieu Sinn*, ; Ambrish Rawat, IBM Research
201 Efficient and principled score estimation with Nyström kernel exponential families Dougal Sutherland*, Gatsby unit, UCL; Heiko Strathmann, ; Michael Arbel, Gatsby unit, UCL; Arthur Gretton, Gatsby unit, UCL
208 Symmetric Variational Autoencoder and Connections to Adversarial Learning Liqun Chen*, Duke University; Shuyang Dai, Duke University; Yunchen Pu, Duke University; Chunyuan Li, Duke University; Qinliang Su, Duke University; Erjin Zhou, Face++; Lawrence Carin, Duke
210 Few-shot Generative Modelling with Generative Matching Networks Sergey Bartunov*, DeepMind; Dmitry Vetrov, Higher School of Economics
211 Nonlinear Weighted Finite Automata Tianyu Li*, McGill University; Guillaume Rabusseau, McGill University; Doina Precup, McGill University
212 Natural Gradients in Practice: Non-Conjugate Variational Inference in Gaussian Process Models Hugh Salimbeni*, Imperial College London; Stefanos Eleftheriadis, Prowler.io; James Hensman, PROWLER.io
216 Variational inference for the multi-armed contextual bandit Iñigo Urteaga*, Columbia University; Chris Wiggins, Columbia University
220 Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods Robert Gower*, Telecom Paristech; Nicolas Le Roux, Google Brain; Francis Bach, Inria / ENS
226 Subsampling for Ridge Regression via Regularized Volume Sampling Michal Derezinski*, UC Santa Cruz; Manfred Warmuth,
228 Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition Pavel Izmailov*, Cornell University; Dmitry Kropotov, MSU; Alexander Novikov, Higher School of Economics
229 Batch-Expansion Training: An Efficient Optimization Framework Michal Derezinski*, UC Santa Cruz; Dhruv Mahajan, Facebook Research; S. Sathiya Keerthi, Microsoft Corporation; S. V. N. Vishwanathan, UC Santa Cruz; Markus Weimer, Microsoft Corporation
237 Batched Large-scale Bayesian Optimization in High-dimensional Spaces Zi Wang*, MIT; Clement Gehring, ; Stefanie Jegelka, MIT; Pushmeet Kohli,
244 A Bayesian Nonparametric Method for Clustering, Imputation, and Forecasting in Multivariate Time Series Feras Saad*, MIT; Vikash Mansinghka, MIT
245 Stochastic Three-Composite Convex Minimization with a Linear Operator Renbo Zhao*, NUS; Volkan Cevher, EPFL
246 Direct Learning to Rank And Rerank Cynthia Rudin*, Duke; Yining Wang, Carnegie Mellon University
247 One-shot Coresets: The Case of k-Clustering Olivier Bachem*, ETH Zurich; Mario Lucic, Google Brain Zurich; Silvio Lattanzi,
249 Random Warping Series: A Random Features Method for Time-Series Embedding Lingfei Wu*, IBM T. J. Watson Research Center; Ian En-Hsu Yen, CMU; Jinfeng Yi, ; Fangli Xu, College of William and Mary; Qi Lei, University of Texas at Austin; Michael Witbrock, IBM T. J. Watson Research Center
250 Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD Sanghamitra Dutta*, Carnegie Mellon University; Gauri Joshi, Carnegie Mellon University; Soumyadip Ghosh, IBM Research; Parijat Dube, IBM Research; Priya Nagpurkar, IBM Research
251 Variational Inference based on Robust Divergences Futoshi Futami*, The University of Tokyo/RIKEN; Issei Sato, The University of Tokyo / RIKEN; Masashi Sugiyama, RIKEN / The University of Tokyo
255 Resampled Proposal Distributions for Variational Inference and Learning Aditya Grover*, Stanford University; Ramki Gummadi, ; Miguel Lazaro-Gredilla, Vicarious; Dale Schuurmans, ; Stefano Ermon, Stanford
257 Best arm identification in multi-armed bandits with delayed and partial feedback Aditya Grover*, Stanford University; Todor Markov, ; Stefano Ermon, Stanford
267 Fully adaptive algorithm for pure exploration in linear bandits Liyuan Xu*, The University of Tokyo / RIKEN; Junya Honda, University of Tokyo / RIKEN; Masashi Sugiyama, RIKEN / The University of Tokyo
272 Contextual Bandits with Stochastic Experts Rajat Sen*, University of Texas at Austin; Karthikeyan Shanmugam, IBM; Sanjay Shakkottai, University of Texas at Austin
277 Human Interaction with Recommendation Systems Sven Schmit*, Stanford University; Carlos Riquelme,
281 Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms I Chien, UIUC; Chung-Yi Lin*, National Taiwan University; I-Hsiang Wang, National Taiwan University
294 Smooth and Sparse Optimal Transport Mathieu Blondel*, NTT; Vivien Seguy, Kyoto University; Antoine Rolet, Kyoto University
296 Robust Maximization of Non-Submodular Objectives Ilija Bogunovic*, ; Junyao Zhao, ETH Zürich; Volkan Cevher, EPFL
298 Cause-Effect Inference by Comparing Regression Errors Patrick Bloebaum*, Osaka University; Dominik Janzing, MPI for Intelligent Systems; Takashi Washio, ; Shohei Shimizu, ; Bernhard Schoelkopf, MPI for Intelligent Systems
299 Tree-based Bayesian Mixture Model for Competing Risks Alexis Bellot*, University of Oxford; Mihaela Van der Schaar, University of Oxford
301 Actor-Critic Fictitious Play in Simultaneous Move Multistage Games Julien Perolat*, DeepMind; Bilal Piot, DeepMind; Olivier Pietquin, DeepMind
307 Random Subspace with Trees for Feature Selection Under Memory Constraints Antonio Sutera*, ULiège; Célia Châtel, Aix-Marseille University; Gilles Louppe, ULiège; Louis Wehenkel, ; Pierre Geurts, ULiège
308 Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information Jakob Runge*, German Aerospace Agency
310 Quotient Normalized Maximum Likelihood Criterion for Learning Bayesian Network Structures Tomi Silander*, Naverlabs Europe; Janne Leppä-aho, ; Elias Jääsaari, ; Teemu Roos,
311 Convex optimization over intersection of simple sets: improved convergence rate guarantees via exact penalty approach Achintya Kundu*, Indian Institute of Science; Francis Bach, Inria / ENS; Chiranjib Bhattacharya,
317 Variational Sequential Monte Carlo Christian Naesseth*, Linköping University; Scott Linderman, ; Rajesh Ranganath, Princeton; David Blei,
321 Statistically Efficient Estimation for Non-Smooth Probability Densities Masaaki Imaizumi*, ISM / RIKEN; Takanori Maehara, ; Yuichi Yoshida, National Institute of Informatics
324 SDCA-Powered Inexact Dual Augmented Lagrangian Method for Fast CRF Learning Xu Hu*, ENPC; Guillaume Obozinski,
325 Generalized Concomitant Multi-Task Lasso for sparse multimodal regression Mathurin Massias*, INRIA Saclay; Olivier Fercoq, LTCI - Télécom ParisTech - Université Paris Saclay; Alexandre Gramfort, INRIA Saclay; Joseph Salmon, LTCI - Télécom ParisTech - Université Paris Saclay
328 Gradient Layer: Enhancing the Convergence of Adversarial Training for Generative Models Atsushi Nitanda*, The University of Tokyo; Taiji Suzuki, The University of Tokyo
332 Statistical Sparse Online Regression: A Diffusion Approximation Perspective Junchi Li*, Princeton University; Qiang Sun, Princeton University; Jianqing Fan, Princeton University
338 Guaranteed Sufficient Decrease for Stochastic Variance Reduced Gradient Optimization Fanhua Shang*, The Chinese University of Hong Kong; Yuanyuan Liu, The Chinese University of Hong Kong; Kaiwen Zhou, The Chinese University of Hong Kong; James Cheng, The Chinese University of Hong Kong; Kelvin Kai Wing Ng, The Chinese University of Hong Kong; Yuichi Yoshida, National Institute of Informatics
340 Delayed Sampling and Automatic Rao-Blackwellization of Probabilistic Programs Lawrence Murray*, Uppsala University; Daniel Lundén, ; Jan Kudlicka, ; David Broman, ; Thomas Schön,
342 Learning to Round for Discrete Labeling Problems Pritish Mohapatra*, IIIT, Hyderabad; Jawahar C.V., IIIT Hyderabad; M. Pawan Kumar, University of Oxford
350 Approximate ranking from pairwise comparisons Reinhard Heckel*, Rice University; Max Simchowitz, UC Berkeley; Kannan Ramchandran, UC Berkeley; Martin Wainwright, UC Berkeley
352 Semi-Supervised Prediction-Constrained Topic Models Michael Hughes*, Harvard University; John Hope, University of California, Irvine; Leah Weiner, Brown University; Thomas McCoy, Massachusetts General Hospital; Roy Perlis, Massachusetts General Hospital; Erik Sudderth, University of California, Irvine; Finale Doshi-Velez, Harvard
354 A Stochastic Differential Equation Framework for Guiding Online User Activities in Closed Loop Yichen Wang*, Gatech; Evangelos Theodorou, ; Le Song, Georgia Tech
358 Accelerated Stochastic Mirror Descent: From Continuous-time Dynamics to Discrete-time Algorithms Pan Xu*, University of Virginia; Tianhao Wang, ; Quanquan Gu, University of Virginia
367 A Unified Framework for Nonconvex Low-Rank plus Sparse Matrix Recovery Xiao Zhang*, University of Virginia; Lingxiao Wang, University of Virginia; Quanquan Gu, University of Virginia
370 Bayesian Nonparametric Poisson-Process Allocation for Time-Sequence Modeling Hongyi Ding*, The University of Tokyo; Mohammad Khan, ; Issei Sato, The University of Tokyo / RIKEN; Masashi Sugiyama, RIKEN / The University of Tokyo
371 Factor Analysis on a Graph Masayuki Karasuyama*, ; Hiroshi Mamitsuka, Kyoto University / Aalto University
375 Crowdclustering with Partition Labels Junxiang Chen*, Northeastern University; Yale Chang, Northeastern University; Peter Castaldi, Brigham and Women’s Hospital; Michael Cho, Brigham and Women’s Hospital; Brian Hobbs, Brigham and Women’s Hospital; Jennifer Dy, Northeastern University
378 Learning Structural Weight Uncertainty with Stein Gradient Flows Ruiyi Zhang, Duke University; Chunyuan Li*, Duke University; Changyou Chen, SUNY Buffalo; Lawrence Carin, Duke
382 Towards Memory-Friendly Deterministic Incremental Gradient Method Jiahao Xie*, Zhejiang University; Hui Qian, Zhejiang University; Zebang Shen, Zhejiang University; Chao Zhang, Zhejiang University
383 Alpha-expansion is Exact on Stable Instances Hunter Lang*, MIT; David Sontag, MIT; Aravindan Vijayaraghavan, Northwestern University
384 Bayesian Approaches to Distribution Regression Ho Chung Leon Law*, University of Oxford; Dougal Sutherland, Gatsby unit, UCL; Dino Sejdinovic, University of Oxford; Seth Flaxman, Imperial College London
386 Submodularity on Hypergraphs: From Sets to Sequences Marko Mitrovic*, Yale University; Moran Feldman, Open University of Israel; Andreas Krause, ETH Zurich; Amin Karbasi, Yale
389 Provable Estimation of the Number of Blocks in Block Models Bowei Yan*, University of Texas at Austin; Purnamrita Sarkar, University of Texas at Austin; Xiuyuan Cheng, Duke University
391 Differentially Private Regression with Gaussian Processes Michael Smith*, University of Sheffield; Mauricio Álvarez, University of Sheffield; Max Zwiessele, University of Sheffield; Neil Lawrence, University of Sheffield
394 Adaptive balancing of gradient and update computation times using global geometry and approximate subproblems Sai Praneeth Reddy Karimireddy, EPFL; Sebastian Stich, EPFL; Martin Jaggi*, EPFL
407 VAE with a VampPrior Jakub Tomczak*, University of Amsterdam; Max Welling, University of Amsterdam
408 Structured Factored Inference for Probabilistic Programming Avi Pfeffer, Charles River Analytics; Brian Ruttenberg, Charles River Analytics; William Kretschmer, MIT; Alison OConnor*, Charles River Analytics
410 A Generic Approach for Escaping Saddle points Sashank Reddi, Google; Manzil Zaheer*, Carnegie Mellon University; Suvrit Sra, MIT; Barnabas Poczos, Carnegie Mellon University; Francis Bach, Inria / ENS; Ruslan Salakhutdinov, Carnegie Mellon University; Alex Smola, Amazon
411 Policy Evaluation and Optimization with Continuous Treatments Nathan Kallus*, ; Angela Zhou, Cornell ORIE
412 Multiphase MCMC Sampling for Parameter Inference in Nonlinear Ordinary Differential Equations Alan Lazarus*, University of Glasgow; Dirk Husmeier, Glasgow; Theodore Papamarkou, Mathematics & Statistics, University of Glasgow
414 Why adaptively collected data have negative bias and how to correct for it. Xinkun Nie*, Stanford University; Xiaoying Tian, Stanford University; Jonathan Taylor, Stanford University; James Zou, Stanford University
425 Sparse Linear Isotonic Models Sheng Chen*, University of Minnesota; Arindam Banerjee, University of Minnesota
431 Robustness of classifiers to uniform \ell_p and Gaussian noise Jean-Yves Franceschi, Ecole Normale Supérieure Lyon; Alhussein Fawzi*, UCLA; Omar Fawzi,
436 Nested CRP with Hawkes-Gaussian Processes Xi Tan*, Purdue University; Vinayak Rao, Purdue; Jennifer Neville, Purdue University
441 Sketching for Kronecker Product Regression and P-splines Huaian Diao, Northeast Normal University ; Zhao Song, UT-Austin; Wen Sun*, Carnegie Mellon University; David Woodruff, Carnegie Mellon University
442 Multimodal Prediction and Personalization of Photo Edits with Deep Generative Models Ardavan Saeedi*, ; Matthew Hoffman, Google; Stephen DiVerdi, Adobe; Asma Ghandeharioun, MIT; Matthew Johnson, Google Brain; Ryan Adams, Princeton
444 Cheap Checking for Cloud Computing: Statistical Analysis via Annotated Data Streams Chris Hickey*, University of Warwick; Graham Cormode,
447 Reconstruction Risk of Convolutional Sparse Dictionary Learning Shashank Singh*, ; Barnabas Poczos, Carnegie Mellon University; Jian Ma, Carnegie Mellon University
448 Kernel Conditional Exponential Family Michael Arbel*, Gatsby unit, UCL; Arthur Gretton, Gatsby unit, UCL
451 Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging Chandrashekar Lakshmi-Narayanan*, Indian Institute of Science; Csaba Szepesvari,
452 Stochastic Zeroth-order Optimization in High Dimensions Yining Wang*, Carnegie Mellon University; Simon Du, ; Sivaraman Balakrishnan, Carnegie Mellon University; Aarti Singh, Carnegie Mellon University
459 Teacher Improves Learning by Selecting a Training Subset Philippe Rigollet, Massachusetts Institute of Technology; Robert Nowak, ; Xiaojin Zhu*, University of Wisconsin-Madison; Xuezhou Zhang, University of Wisconsin-Madison; Yuzhe Ma, Univ. of Wisconsin-Madison
460 Communication-Avoiding Optimization Methods for Massive-Scale Graphical Model Structure Learning Penporn Koanantakool*, UC Berkeley; Alnur Ali, Carnegie Mellon University; Ariful Azad, Lawrence Berkeley National Laboratory; Aydin Buluc, Lawrence Berkeley National Laboratory; Dmitriy Morozov, Lawrence Berkeley National Laboratory; Sang-Yun Oh, University of California, Santa Barbara; Leonid Oliker, Lawrence Berkeley National Laboratory; Katherine Yelick, Lawrence Berkeley National Laboratory
462 Robust Vertex Enumeration for Convex Hulls in High Dimensions Pranjal Awasthi*, Rutgers University; Bahman Kalantari, ; Yikai Zhang, Rutgers
468 Fast generalization error bound of deep learning from a kernel perspective Taiji Suzuki*, The University of Tokyo
471 Product Kernel Interpolation for Scalable Gaussian Processes Jacob Gardner*, Cornell University; Geoff Pleiss, Cornell University; Ruihan Wu, Tsinghua University; Kilian Weinberger, Cornell University; Andrew Wilson, Cornell University
472 Towards Provable Learning of Polynomial Neural Networks Using Low-Rank Matrix Estimation Mohammadreza Soltani*, Iowa State University; Chinmay Hegde, Iowa State University
474 Scalable Generalized Dynamic Topic Models Patrick Jähnichen*, Humboldt-Universität zu Berlin; Florian Wenzel, Humboldt-Universität zu Berlin; Marius Kloft, Humboldt-Universität zu Berlin; Stephan Mandt, Disney Research
478 Bayesian Structure Learning for Dynamic Brain Connectivity Michael Andersen*, Aalto University; Oluwasanmi Koyejo, UIUC; Ole Winther, DTU; Lars Kai Hansen, Technical University of Denmark; Russell Poldrack, Stanford University
482 Large Scale Empirical Risk Minimization via Truncated Adaptive Newton Method Mark Eisen*, University of Pennsylvania; Aryan Mokhtari, University of California, Berkeley; Alejandro Ribeiro, University of Pennsylvania
483 Frank-Wolfe Splitting via Augmented Lagrangian Method Gauthier Gidel*, MILA; Fabian Pedregosa, UC Berkeley; Simon Lacoste-Julien, Montreal
487 Learning linear structural equation models in polynomial time and sample complexity Asish Ghoshal*, Purdue University; Jean Honorio, Purdue
490 Convergence diagnostics for stochastic gradient descent Jerry Chee*, University of Chicago; Panos Toulis,
496 Learning Sparse Polymatrix Games in Polynomial Time and Sample Complexity Asish Ghoshal*, Purdue University; Jean Honorio, Purdue
499 Nonparametric Sharpe Ratio Function Estimation in Heteroscedastic Regression Models via Convex Optimization Seung-Jean Kim, ; Johan Lim, Seoul National University; Joong-Ho Won*, Seoul National University
500 Stochastic algorithms for entropy-regularized optimal transport problems Brahim Khalil Abid*, Ecole polytechnique; Robert Gower, Telecom Paristech
502 Plug-in Estimators for Conditional Expectations and Probabilities Steffen Grunewalder*, Lancaster University
503 Factorized Recurrent Neural Architectures for Longer Range Dependence Francois Belletti*, UC Berkeley; Alex Beutel, Google Inc.; Sagar Jain, Google Inc.; Ed Chi, Google Inc.
504 On the Statistical Efficiency of Compositional Nonparametric Prediction Yixi Xu*, Purdue University; Jean Honorio, Purdue; Xiao Wang, Purdue University
509 Metrics for Deep Generative Models Nutan Chen*, Volkswagen Group; Richard Kurle, ; Alexej Klushyn, ; Justin Bayer, ; Xueyan Jiang, ; Patrick van der Smagt,
510 Combinatorial Penalties: Which structures are preserved by convex relaxations? Marwa El Halabi*, EPFL; Francis Bach, Inria / ENS; Volkan Cevher, EPFL
513 Generalized Binary Search For Split-Neighborly Problems Stephen Mussmann*, Stanford University; Percy Liang, Stanford University
518 Intersection-Validation: A Method for Evaluating Structure Learning without Ground Truth Jussi Viinikka*, ; Ralf Eggeling, University of Helsinki; Mikko Koivisto,
522 On Statistical Optimality of Variational Bayes Anirban Bhattacharya, Texas A&M University; Debdeep Pati*, Texas A&M University; Yun Yang,
524 Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems Jason Ge*, Princeton University; Zhaoran Wang, ; Mengdi Wang, ; Han Liu, Princeton
525 Online Regression with Partial Information: Generalization and Linear Projection Shinji Ito*, NEC Corporation; Daisuke Hatano, ; Hanna Sumita, ; Akihiro Yabe, ; Takuro Fukunaga, ; Naonori Kakimura, ; Ken-Ichi Kawarabayashi,
526 Learning Generative Models with Sinkhorn Divergences Aude Genevay*, Université Paris Dauphine; Gabriel Peyre, ; Marco Cuturi, ENSAE/CREST
532 Reparameterizing the Birkhoff Polytope for Variational Permutation Inference Scott Linderman, ; Gonzalo Mena*, Columbia University; Hal Cooper, Columbia University; Liam Paninski, Columbia University; John Cunningham, Columbia University
534 Achieving the time of 1-NN, but the accuracy of k-NN Lirong Xue*, Princeton University; Samory Kpotufe, Princeton University
535 Efficient Weight Learning in High-Dimensional Untied MLNs Khan Mohammad Al Farabi*, The University of Memphis; Somdeb Sarkhel, Adobe Research; Deepak Venugopal, University of Memphis
536 Consistent Algorithms for Classification under Complex Losses and Constraints Harikrishna Narasimhan*, Harvard University
539 Solving lp-norm regularization with tensor kernels Saverio Salzo*, Istituto Italiano di Tecnologia; Lorenzo Rosasco, University of Genova & MIT; Johan Suykens,
546 Weighted Tensor Decomposition for Learning Latent Variables with Partial Data Omer Gottesman*, Harvard University; Weiwei Pan, ; Finale Doshi-Velez, Harvard
547 Multi-objective Contextual Bandit Problem with Similarity Information Eralp Turgay, Bilkent University; Doruk Oner, Bilkent University; Cem Tekin*, Bilkent University
549 Turing: Composable inference for probabilistic programming Hong Ge*, University of Cambridge; Kai Xu, University of Edinburgh; Zoubin Ghahramani, University of Cambridge
550 Fast and Scalable Learning of Sparse Changes in High-Dimensional Gaussian Graphical Model Structure Beilun Wang*, University of Virginia; Arshdeep Sekhon, University of Virginia; Yanjun Qi,
551 Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control Sanket Kamthe, Imperial College; Marc Deisenroth*, Imperial College London
557 Approximate Bayesian Computation with Kullback-Leibler Divergence as Data Discrepancy Bai Jiang*, Princeton University
561 Practical Bayesian optimization in the presence of outliers Ruben Martinez-Cantin*, ; Michael McCourt, SigOpt; Kevin Tee, SigOpt
563 Competing with Automata-based Expert Sequences Scott Yang*, D. E. Shaw & Co.; Mehryar Mohri,
564 Reducing Crowdsourcing to Graphon Estimation, Statistically Christina Lee*, Microsoft Research; Devavrat Shah, MIT
567 Robust Locally-Linear Controllable Embedding Ershad Banijamali*, University of Waterloo; Rui Shu, Stanford University; Mohammad Ghavamzadeh, DeepMind; Hung Bui, Adobe Research; Ali Ghodsi, University of Waterloo
569 Combinatorial Semi-Bandits with Knapsacks Karthik Abinav Sankararaman*, University of Maryland College; Aleksandrs Slivkins, Microsoft Research NYC
571 Structured Optimal Transport David Alvarez Melis*, MIT; Tommi Jaakkola, MIT; Stefanie Jegelka, MIT
578 Graphical Models for Non-Negative Data Using Generalized Score Matching Shiqing Yu*, University of Washington; Mathias Drton, University of Washington; Ali Shojaie, University of Washington
581 Asynchronous Doubly Stochastic Group Regularized Learning Bin Gu*, University of Pittsburgh; Zhouyuan Huo, ; Heng Huang, University of Pittsburgh
582 Convergence of Value Aggregation for Imitation Learning Ching-An Cheng*, Georgia Institute of Technology; Byron Boots,
594 Inference in Sparse Graphs with Pairwise Measurements and Side Information Dylan Foster*, Cornell University; Karthik Sridharan, Cornell University; Daniel Reichman, UC Berkeley
595 Parallel and Distributed MCMC via Shepherding Distributions Arkabandhu Chowdhury*, Rice University; Christopher Jermaine, Rice University
602 The Power Mean Laplacian for Multilayer Graph Clustering Pedro Mercado*, Saarland University; Antoine Gautier, Saarland University; Francesco Tudisco, University of Strathclyde; Matthias Hein, Saarland University
604 Adaptive Sampling for Clustered Ranking Sumeet Katariya*, Univ of Wisconsin-Madison; Lalit Jain, University of Michigan Ann Arbor; Nandana Sengupta, University of Chicago; James Evans, University of Chicago; Robert Nowak, University of Wisconsin-Madison
611 Comparison Based Learning from Weak Oracles Ehsan Kazemi*, Yale; Lin Chen, Yale University; Sanjoy Dasgupta, University of California San Diego; Amin Karbasi, Yale
613 The Binary Space Partitioning-Tree Process Xuhui Fan*, UNSW; Bin Li, Fudan University; Scott Sisson, University of New South Wales
614 On denoising noisy modulo 1 samples of a function Mihai Cucuringu*, University of Oxford and the Alan Turing Institute; Hemant Tyagi, Alan Turing Institute
616 Scalable Hash-Based Estimation of Divergence Measures Morteza Noshad Iranzad*, University of Michigan; Alfred Hero, University of Michigan
619 Conditional Gradient Method for Stochastic Submodular Maximization: Closing the Gap Aryan Mokhtari*, UC Berkeley; Hamed Hassani, ; Amin Karbasi, Yale
620 Online Continuous Submodular Maximization Lin Chen*, Yale University; Hamed Hassani, ; Amin Karbasi, Yale
626 Efficient Bayesian Methods for Counting Processes in Partially Observable Environments Ferdian Jovan*, University of Birmingham; Jeremy Wyatt, University of Birmingham; Nick Hawes, University of Oxford
629 Matrix-normal models for fMRI analysis Michael Shvartsman*, Princeton University; Narayanan Sundaram, Intel Corporation; Mikio Aoi, Princeton University; Adam Charles, Princeton University; Theodore Wilke, Intel Corporation; Jonathan Cohen, Princeton University
631 The emergence of spectral universality in deep networks Jeffrey Pennington*, ; Samuel Schoenholz, Google; Surya Ganguli, Google Brain
635 Spectral Algorithms for Computing Fair Support Vector Machines Mahbod Olfat*, UC Berkeley; Anil Aswani, UC Berkeley
636 Bayesian Multi-label Learning with Sparse Features and Labels He Zhao*, Monash University; Piyush Rai, IIT Kanpur; Lan Du, Faculty of Information Technology, Monash University, Australia; Wray Buntine, Monash University
637 Nonparametric Bayesian sparse graph linear dynamical systems Rahi Kalantari, UT-Austin; Joydeep Ghosh, UT Austin; Mingyuan Zhou*, University of Texas at Austin
639 Proximity Variational Inference Jaan Altosaar*, Princeton University; Rajesh Ranganath, Princeton; David Blei,
641 Near-Optimal Machine Teaching via Explanatory Teaching Sets Yuxin Chen*, Caltech; Oisin Mac Aodha, Caltech; Shihan Su, Caltech; Pietro Perona, Caltech; Yisong Yue, Caltech
643 Learning Hidden Quantum Markov Models Siddarth Srinivasan*, Georgia Institute of Technolog; Geoff Gordon, Carnegie Mellon University; Byron Boots,
644 Labeled Graph Clustering via Projected Gradient Descent Shiau Hong Lim*, IBM Research; Gregory Calvez,
646 Gradient Diversity: a Key Ingredient for Scalable Distributed Learning Dong Yin*, UC Berkeley; Ashwin Pananjady, UC Berkeley; Max Lam, Stanford University; Dimitris Papailiopoulos, ; Kannan Ramchandran, UC Berkeley; Peter Bartlett, UC Berkeley
648 HONES: A Fast and Tuning-free Homotopy Method For Online Newton Step Yuting Ye*, UC Berkeley; Lihua Lei, UC Berkeley; Cheng Ju, UC Berkeley
649 Probability-Revealing Samples Krzysztof Onak*, IBM Research; Xiaorui Sun, Microsoft Research
656 Reducing optimization to repeated classification Tatsunori Hashimoto*, Stanford; Steve Yadlowsky, Stanford University; John Duchi,
661 Online Ensemble Multi-kernel Learning Adaptive to Non-stationary and Adversarial Environments Yanning Shen, ; Tianyi Chen*, University of Minnesota; Georgios Giannakis, University of Minnesota
665 A Unified Dynamic Approach to Sparse Model Selection Chendi Huang*, Peking University; Yuan Yao, Hong Kong University of Science and Technology
666 Bootstrapping EM via Power EM and Convergence in the Naive Bayes Model Costis Daskalakis, ; Christos Tzamos*, Microsoft Research; Manolis Zampetakis, MIT
669 Dimensionality Reduced $\ell^{0}$-Sparse Subspace Clustering Yingzhen Yang*, Snap Research