The accepted papers are listed in the table below, sorted in increasing order of Paper ID.
Paper ID |
Paper Title |
Author Names |
2 |
The Geometry of Random Features |
Krzysztof Choromanski, ; Mark Rowland, University of Cambridge; Tamas Sarlos, Google Research; Vikas Sindhwani, Google Brain Robotics; Richard Turner, Cambridge; Adrian Weller*, University of Cambridge |
4 |
Gauged Mini-Bucket Elimination for Approximate Inference |
Sungsoo Ahn, KAIST; Michael Chertkov, Los Alamos National Laborator; Jinwoo Shin, KAIST; Adrian Weller*, University of Cambridge |
8 |
A Fast Algorithm for Separated Sparsity via Perturbed Lagrangians |
Aleksander Madry, MIT; Slobodan Mitrovic*, EPFL; Ludwig Schmidt, MIT |
10 |
An Analysis of Categorical Distributional Reinforcement Learning |
Mark Rowland*, University of Cambridge; Marc Bellemare, Google Brain; Will Dabney, DeepMind; Remi Munos, DeepMind; Yee Whye Teh, Oxford and DeepMind |
16 |
Combinatorial Preconditioners for Proximal Algorithms on Graphs |
Thomas Möllenhoff*, TU Munich; Zhenzhang Ye, ; Tao Wu, ; Daniel Cremers, |
22 |
Growth-Optimal Portfolio Selection under CVaR Constraints |
Guy Uziel*, Technion; Ran El-Yaniv, Technion |
26 |
Accelerated Stochastic Power Iteration |
Christopher De Sa, Cornell University; Bryan He, Stanford University; Ioannis Mitliagkas, Université de Montréal; Chris Re, Stanford University; Peng Xu*, Stanford University |
27 |
Multi-scale Nystrom Method |
Woosang Lim, Georgia Tech; Rundong Du, Georgia Tech; Bo Dai, Georgia Tech; Kyomin Jung, Seoul National University; Le Song, Georgia Tech; Haesun Park*, Georgia Tech |
30 |
Making Tree Ensembles Interpretable: A Bayesian Model Selection Approach |
Satoshi Hara*, Osaka University; Kohei Hayashi, |
32 |
Mixed Membership Word Embeddings for Computational Social Science |
James Foulds*, UMBC |
35 |
Fast Threshold Tests for Detecting Discrimination |
Emma Pierson*, Stanford University; Sam Corbett-Davies, Stanford University; Sharad Goel, |
40 |
Iterative Supervised Principal Components |
Juho Piironen*, Aalto University; Aki Vehtari, Aalto |
42 |
Iterative Spectral Method for Alternative Clustering |
Chieh Wu*, Northeastern University; Stratis Ioannidis, NEU; Mario Sznaier, Northeastern University; Xiangyu Li, Northeastern University; David Kaeli, Northeastern University; Jennifer Dy, Northeastern University |
45 |
Can clustering scale sublinearly with its clusters? A variational EM acceleration of GMMs and k-means |
Dennis Forster*, University of Oldenburg; Jörg Lücke, University of Oldenburg |
48 |
Parallelised Bayesian Optimisation via Thompson Sampling |
Kirthevasan Kandasamy*, ; Akshay Krishnamurthy, U-Mass Amherst; Jeff Schneider, CMU; Barnabas Poczos, Carnegie Mellon University |
49 |
On the challenges of learning with inference networks on sparse, high-dimensional data |
Rahul Krishnan*, MIT; Dawen Liang, Netflix; Matthew Hoffman, Google |
54 |
Post Selection Inference with Kernels |
Makoto Yamada*, RIKEN; Yuta Umezu, ; Kenji Fukumizu, ; Ichiro Takeuchi, |
55 |
On how complexity affects the stability of a predictor |
Joel Ratsaby*, Ariel University |
56 |
On the Truly Block Eigensolvers via First-Order Riemannian Optimization |
Zhiqiang Xu*, KAUST; Xin Gao, |
59 |
Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond |
Heng Guo*, University of Edinburgh; Kaan Kara, ETH Zurich; Ce Zhang, ETH Zurich |
60 |
IHT dies hard: Provable accelerated Iterative Hard Thresholding |
Rajiv Khanna, UT Austin; Anastasios Kyrillidis*, IBM T.J. Watson Research Center |
65 |
Finding Global Optima in Nonconvex Stochastic Semidefinite Optimization with Variance Reduction |
Jinshan Zeng*, Hong Kong University of Science and Technology; Ke Ma, IIE, CAS; Yuan Yao, Hong Kong University of Science and Technology |
66 |
Outlier Detection and Robust Estimation in Nonparametric Regression |
Dehan Kong, Univ. of Toronto; Howard Bondell, North Carolina State University; Weining Shen*, UC Irvine |
68 |
Integral Transforms from Finite Data: An Application of Gaussian Process Regression to Fourier Analysis |
Luca Ambrogioni*, Radboud University; Eric Maris, Radboud University |
72 |
AdaGeo: Adaptive Geometric Learning for Optimization and Sampling |
Gabriele Abbati*, University of Oxford; Alessandra Tosi, Mind Foundry, Oxford; Seth Flaxman, Imperial College London; Michael Osborne, Oxford |
74 |
Online Learning with Non-Convex Losses and Non-Stationary Regret |
Xiaobo Li*, University of Minnesota; Xiang Gao, University of Minnesota; Shuzhong Zhang, University of Minnesota |
75 |
Learning Determinantal Point Processes in Sublinear Time |
Christophe Dupuy*, INRIA; Francis Bach, INRIA - ENS |
76 |
Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding |
Kaiqing Zhang, University of Illinois at Urba; Zhuoran Yang*, Princeton University |
77 |
Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis |
Hiroyuki Kasai*, UEC; Hiroyuki Sato, Kyoto University; Bamdev Mishra, Amazon |
78 |
Online Boosting Algorithms for Multi-label Ranking |
Young Hun Jung*, University of Michigan; Ambuj Tewari, University of Michigan |
80 |
Zeroth-Order Online Alternating Direction Method of Multipliers: Convergence Analysis and Applications |
Sijia Liu*, University of Michigan; Jie Chen, ; Pin-Yu Chen, ; Alfred Hero, |
86 |
High-dimensional Bayesian optimization via additive models with overlapping groups |
Paul Rolland*, EPFL, LIONS; Jonathan Scarlett, ; Ilija Bogunovic, ; Volkan Cevher, EPFL |
89 |
Robust Active Label Correction |
Jan Kremer, University of Copenhagen; Fei Sha, UCLA; Christian Igel*, University of Copenhagen |
90 |
Factorial HMM with Collapsed Gibbs Sampling for optimizing long-term HIV Therapy |
Amit Gruber*, IBM Research; Chen Yanover, IBM Research; Tal El-Hay, IBM Research; Yaara Goldschmidt, IBM Research; Anders Sönnerborg, Karolinska Institute, Karolinska University Hospital; Vanni Borghi, Modena University Hospital; Francesca Incardona, EuResist Network GEIE, InformaPro S.r.l. |
91 |
Optimal Submodular Extensions for Marginal Estimation |
Pankaj Pansari*, University of Oxford; Chris Russell, The Alan Turing Institute; M. Pawan Kumar, University of Oxford |
92 |
Semi-Supervised Learning with Competitive Infection Models |
Nir Rosenfeld*, Harvard University; Amir Globerson, Tel Aviv University |
94 |
Discriminative Learning of Prediction Intervals |
Nir Rosenfeld*, Harvard University; Yishay Mansour, Tel Aviv University; Elad Yom Tov, Microsoft Research |
95 |
Topic Compositional Neural Language Model |
Wenlin Wang*, Duke University; Zhe Gan, Duke University; Wenqi Wang, Purdue University; Dinghan Shen, Duke University; Jiaji Huang, Baidu Silicon Valley Artificial Intelligence Lab; Wei Ping, Baidu Silicon Valley Artificial Intelligence Lab; Sanjeev Satheesh, Baidu Silicon Valley Artificial Intelligence Lab; Lawrence Carin, Duke |
97 |
Learning Priors for Invariance |
Eric Nalisnick*, UC Irvine; Padhraic Smyth, University of California, Irvine |
98 |
Optimal Cooperative Inference |
Scott Cheng-Hsin Yang*, Rutgers University--Newark; Yue Yu, Rutgers University--Newark; Arash Givchi, Rutgers University--Newark; Pei Wang, Rutgers University--Newark; Wai Keen Vong, Rutgers University--Newark; Patrick Shafto, Rutgers University--Newark |
102 |
Stochastic Multi-armed Bandits in Constant Space |
David Liau, UT-Austin; Zhao Song, UT-Austin; Eric Price, UT-Austin; Ger Yang*, UT-Austin |
109 |
Matrix completability analysis via graph k-connectivity |
Dehua Cheng*, Univ. of Southern California; Natali Ruchansky, ; Yan Liu, University of Southern California |
112 |
FLAG n’ FLARE: Fast Linearly-Coupled Adaptive Gradient Methods |
Xiang Cheng, UC Berkeley; Fred Roosta*, University of Queensland; Stefan Palombo, UC Berkeley; Peter Bartlett, UC Berkeley; Michael Mahoney, UC Berkeley |
113 |
Multi-view Metric Learning in Vector-valued Kernel Spaces |
Riikka Huusari*, Aix-Marseille Université; Hachem Kadri, Aix-Marseille University; Cécile Capponi, |
115 |
Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data |
William Herlands*, Carnegie Mellon University; Edward McFowland, ; Andrew Wilson, Cornell University; Daniel Neill, |
117 |
Dropout as a Low-Rank Regularizer for Matrix Factorization |
Jacopo Cavazza*, Istituto Italiano di Tecnologia; Pietro Morerio, Istituto Italiano di Tecnologia; Benjamin Haeffele, Johns Hopkins University; Connor Lane, Johns Hopkins University; Vittorio Murino, Istituto Italiano di Tecnologia; Rene Vidal, Johns Hopkins University |
119 |
A Simple Analysis for Exp-concave Empirical Minimization with Arbitrary Convex Regularizer |
Tianbao Yang*, University of Iowa; Zhe Li, ; Lijun Zhang, Nanjing University |
120 |
Independently Interpretable Lasso: A New Regularizer for Sparse Regression with Uncorrelated Variables |
Masaaki Takada*, The Graduate University for Advanced Studies; Taiji Suzuki, The University of Tokyo; Hironori Fujisawa, The Institute of Statistical Mathematics |
121 |
Boosting Variational Inference: an Optimization Perspective |
Francesco Locatello*, ETH Zurich; Rajiv Khanna, UT Austin; Joydeep Ghosh, ; Gunnar Ratsch, |
122 |
Personalized and Private Peer-to-Peer Machine Learning |
Aurélien Bellet*, INRIA; Rachid Guerraoui, ; Mahsa Taziki, ; Marc Tommasi, |
125 |
Tensor Regression Meets Gaussian Processes |
Rose Yu*, Caltech; Guangyu Li, University of Southern California; Yan Liu, University of Southern California |
127 |
A Nonconvex Proximal Splitting Algorithm under Moreau-Yosida Regularization |
Emanuel Laude*, Technical University of Munich; Tao Wu, ; Daniel Cremers, |
133 |
Medoids in Almost-Linear Time via Multi-Armed Bandits |
Vivek Bagaria, ; Govinda Kamath, ; Martin Zhang, Stanford University; Vasilis Ntranos, ; David Tse*, |
139 |
Regional Multi-Armed Bandits |
Zhiyang Wang, USTC; Ruida Zhou, USTC; Cong Shen*, Univ. of Sci. & Tech. China |
142 |
Nearly second-order optimality of online joint detection and estimation via one-sample update schemes |
Yang Cao*, Georgia Institute of Technolog; Liyan Xie, ; Yao Xie, ; Huan Xu, |
151 |
Sum-Product-Quotient Networks |
Or Sharir*, Hebrew University of Jerusalem; Amnon Shashua, Hebrew University of Jerusalem |
154 |
Exploiting Strategy-Space Diversity for Batch Bayesian Optimization |
Sunil Gupta*, Deakin University; Alistair Shilton, Deakin University; Santu Rana, Deakin University; Svetha Venkatesh, Deakin University |
158 |
Beating Monte Carlo Integration: a Nonasymptotic Study of Kernel Smoothing Methods |
Stephan Clémençon*, Telecom ParisTech; François Portier, Telecom ParisTech |
166 |
Group invariance principles for causal generative models |
Michel Besserve*, ; Naji Shajarisales, MPI for Intelligent Systems; Bernhard Schoelkopf, MPI for Intelligent Systems; Dominik Janzing, MPI for Intelligent Systems |
167 |
A Provable Algorithm for Learning Interpretable Scoring Systems |
Nataliya Sokolovska*, University Paris 6; Yann Chevaleyre, University Paris Dauphine; Jean-Daniel Zucker, IRD |
172 |
Scaling up the Automatic Statistician: Scalable Structure Discovery using Gaussian Processes |
Hyunjik Kim*, University of Oxford; Yee Whye Teh, Oxford |
178 |
Efficient Bandit Combinatorial Optimization Algorithm with Zero-suppressed Binary Decision Diagrams |
Shinsaku Sakaue*, NTT; Masakazu Ishihata, Hokkaido University; Shin-ichi Minato, |
181 |
Transfer Learning on fMRI Datasets |
Hejia Zhang*, Princeton University; Po-Hsuan Chen, Princeton University; Peter Ramadge, Princeton University |
183 |
An Optimization Approach to Learning Falling Rule Lists |
Chaofan Chen*, Duke University; Cynthia Rudin, Duke |
185 |
Catalyst for Gradient-based Nonconvex Optimization |
Courtney Paquette*, Ohio State University; Hongzhou Lin, INRIA; Dmitriy Drusvyatskiy, University of Washington; Julien Mairal, Inria; Zaid Harchaoui, University of Washington |
188 |
Benefits from Superposed Hawkes Processes |
Hongteng Xu*, Duke University; Dixin Luo, ; Xu Chen, Tsinghua University; Lawrence Carin, Duke |
192 |
Nonparametric Preference Completion |
Julian Katz-Samuels*, University of Michigan; Clayton Scott, University of Michigan |
198 |
Non-parametric estimation of Jensen-Shannon Divergence in Generative Adversarial Network training |
Mathieu Sinn*, ; Ambrish Rawat, IBM Research |
201 |
Efficient and principled score estimation with Nyström kernel exponential families |
Dougal Sutherland*, Gatsby unit, UCL; Heiko Strathmann, ; Michael Arbel, Gatsby unit, UCL; Arthur Gretton, Gatsby unit, UCL |
208 |
Symmetric Variational Autoencoder and Connections to Adversarial Learning |
Liqun Chen*, Duke University; Shuyang Dai, Duke University; Yunchen Pu, Duke University; Chunyuan Li, Duke University; Qinliang Su, Duke University; Erjin Zhou, Face++; Lawrence Carin, Duke |
210 |
Few-shot Generative Modelling with Generative Matching Networks |
Sergey Bartunov*, DeepMind; Dmitry Vetrov, Higher School of Economics |
211 |
Nonlinear Weighted Finite Automata |
Tianyu Li*, McGill University; Guillaume Rabusseau, McGill University; Doina Precup, McGill University |
212 |
Natural Gradients in Practice: Non-Conjugate Variational Inference in Gaussian Process Models |
Hugh Salimbeni*, Imperial College London; Stefanos Eleftheriadis, Prowler.io; James Hensman, PROWLER.io |
216 |
Variational inference for the multi-armed contextual bandit |
Iñigo Urteaga*, Columbia University; Chris Wiggins, Columbia University |
220 |
Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods |
Robert Gower*, Telecom ParisTech; Nicolas Le Roux, Google Brain; Francis Bach, Inria / ENS |
226 |
Subsampling for Ridge Regression via Regularized Volume Sampling |
Michal Derezinski*, UC Santa Cruz; Manfred Warmuth, |
228 |
Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition |
Pavel Izmailov*, Cornell University; Dmitry Kropotov, MSU; Alexander Novikov, Higher School of Economics |
229 |
Batch-Expansion Training: An Efficient Optimization Framework |
Michal Derezinski*, UC Santa Cruz; Dhruv Mahajan, Facebook Research; S. Sathiya Keerthi, Microsoft Corporation; S. V. N. Vishwanathan, UC Santa Cruz; Markus Weimer, Microsoft Corporation |
237 |
Batched Large-scale Bayesian Optimization in High-dimensional Spaces |
Zi Wang*, MIT; Clement Gehring, ; Stefanie Jegelka, MIT; Pushmeet Kohli, |
244 |
A Bayesian Nonparametric Method for Clustering, Imputation, and Forecasting in Multivariate Time Series |
Feras Saad*, MIT; Vikash Mansinghka, MIT |
245 |
Stochastic Three-Composite Convex Minimization with a Linear Operator |
Renbo Zhao*, NUS; Volkan Cevher, EPFL |
246 |
Direct Learning to Rank And Rerank |
Cynthia Rudin*, Duke; Yining Wang, Carnegie Mellon University |
247 |
One-shot Coresets: The Case of k-Clustering |
Olivier Bachem*, ETH Zurich; Mario Lucic, Google Brain Zurich; Silvio Lattanzi, |
249 |
Random Warping Series: A Random Features Method for Time-Series Embedding |
Lingfei Wu*, IBM T. J. Watson Research Center; Ian En-Hsu Yen, CMU; Jinfeng Yi, ; Fangli Xu, College of William and Mary; Qi Lei, University of Texas at Austin; Michael Witbrock, IBM T. J. Watson Research Center |
250 |
Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD |
Sanghamitra Dutta*, Carnegie Mellon University; Gauri Joshi, Carnegie Mellon University; Soumyadip Ghosh, IBM Research; Parijat Dube, IBM Research; Priya Nagpurkar, IBM Research |
251 |
Variational Inference based on Robust Divergences |
Futoshi Futami*, The University of Tokyo/RIKEN; Issei Sato, The University of Tokyo / RIKEN; Masashi Sugiyama, RIKEN / The University of Tokyo |
255 |
Resampled Proposal Distributions for Variational Inference and Learning |
Aditya Grover*, Stanford University; Ramki Gummadi, ; Miguel Lazaro-Gredilla, Vicarious; Dale Schuurmans, ; Stefano Ermon, Stanford |
257 |
Best arm identification in multi-armed bandits with delayed and partial feedback |
Aditya Grover*, Stanford University; Todor Markov, ; Stefano Ermon, Stanford |
267 |
Fully adaptive algorithm for pure exploration in linear bandits |
Liyuan Xu*, The University of Tokyo / RIKEN; Junya Honda, University of Tokyo / RIKEN; Masashi Sugiyama, RIKEN / The University of Tokyo |
272 |
Contextual Bandits with Stochastic Experts |
Rajat Sen*, University of Texas at Austin; Karthikeyan Shanmugam, IBM; Sanjay Shakkottai, University of Texas at Austin |
277 |
Human Interaction with Recommendation Systems |
Sven Schmit*, Stanford University; Carlos Riquelme, |
281 |
Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms |
I Chien, UIUC; Chung-Yi Lin*, National Taiwan University; I-Hsiang Wang, National Taiwan University |
294 |
Smooth and Sparse Optimal Transport |
Mathieu Blondel*, NTT; Vivien Seguy, Kyoto University; Antoine Rolet, Kyoto University |
296 |
Robust Maximization of Non-Submodular Objectives |
Ilija Bogunovic*, ; Junyao Zhao, ETH Zürich; Volkan Cevher, EPFL |
298 |
Cause-Effect Inference by Comparing Regression Errors |
Patrick Bloebaum*, Osaka University; Dominik Janzing, MPI for Intelligent Systems; Takashi Washio, ; Shohei Shimizu, ; Bernhard Schoelkopf, MPI for Intelligent Systems |
299 |
Tree-based Bayesian Mixture Model for Competing Risks |
Alexis Bellot*, University of Oxford; Mihaela Van der Schaar, University of Oxford |
301 |
Actor-Critic Fictitious Play in Simultaneous Move Multistage Games |
Julien Perolat*, DeepMind; Bilal Piot, DeepMind; Olivier Pietquin, DeepMind |
307 |
Random Subspace with Trees for Feature Selection Under Memory Constraints |
Antonio Sutera*, ULiège; Célia Châtel, Aix-Marseille University; Gilles Louppe, ULiège; Louis Wehenkel, ; Pierre Geurts, ULiège |
308 |
Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information |
Jakob Runge*, German Aerospace Agency |
310 |
Quotient Normalized Maximum Likelihood Criterion for Learning Bayesian Network Structures |
Tomi Silander*, Naverlabs Europe; Janne Leppä-aho, ; Elias Jääsaari, ; Teemu Roos, |
311 |
Convex optimization over intersection of simple sets: improved convergence rate guarantees via exact penalty approach |
Achintya Kundu*, Indian Institute of Science; Francis Bach, Inria / ENS; Chiranjib Bhattacharya, |
317 |
Variational Sequential Monte Carlo |
Christian Naesseth*, Linköping University; Scott Linderman, ; Rajesh Ranganath, Princeton; David Blei, |
321 |
Statistically Efficient Estimation for Non-Smooth Probability Densities |
Masaaki Imaizumi*, ISM / RIKEN; Takanori Maehara, ; Yuichi Yoshida, National Institute of Informatics |
324 |
SDCA-Powered Inexact Dual Augmented Lagrangian Method for Fast CRF Learning |
Xu Hu*, ENPC; Guillaume Obozinski, |
325 |
Generalized Concomitant Multi-Task Lasso for sparse multimodal regression |
Mathurin Massias*, INRIA Saclay; Olivier Fercoq, LTCI - Télécom ParisTech - Université Paris Saclay; Alexandre Gramfort, INRIA Saclay; Joseph Salmon, LTCI - Télécom ParisTech - Université Paris Saclay |
328 |
Gradient Layer: Enhancing the Convergence of Adversarial Training for Generative Models |
Atsushi Nitanda*, The University of Tokyo; Taiji Suzuki, The University of Tokyo |
332 |
Statistical Sparse Online Regression: A Diffusion Approximation Perspective |
Junchi Li*, Princeton University; Qiang Sun, Princeton University; Jianqing Fan, Princeton University |
338 |
Guaranteed Sufficient Decrease for Stochastic Variance Reduced Gradient Optimization |
Fanhua Shang*, The Chinese University of Hong Kong; Yuanyuan Liu, The Chinese University of Hong Kong; Kaiwen Zhou, The Chinese University of Hong Kong; James Cheng, The Chinese University of Hong Kong; Kelvin Kai Wing Ng, The Chinese University of Hong Kong; Yuichi Yoshida, National Institute of Informatics |
340 |
Delayed Sampling and Automatic Rao-Blackwellization of Probabilistic Programs |
Lawrence Murray*, Uppsala University; Daniel Lundén, ; Jan Kudlicka, ; David Broman, ; Thomas Schön, |
342 |
Learning to Round for Discrete Labeling Problems |
Pritish Mohapatra*, IIIT, Hyderabad; Jawahar C.V., IIIT Hyderabad; M. Pawan Kumar, University of Oxford |
350 |
Approximate ranking from pairwise comparisons |
Reinhard Heckel*, Rice University; Max Simchowitz, UC Berkeley; Kannan Ramchandran, UC Berkeley; Martin Wainwright, UC Berkeley |
352 |
Semi-Supervised Prediction-Constrained Topic Models |
Michael Hughes*, Harvard University; John Hope, University of California, Irvine; Leah Weiner, Brown University; Thomas McCoy, Massachusetts General Hospital; Roy Perlis, Massachusetts General Hospital; Erik Sudderth, University of California, Irvine; Finale Doshi-Velez, Harvard |
354 |
A Stochastic Differential Equation Framework for Guiding Online User Activities in Closed Loop |
Yichen Wang*, Gatech; Evangelos Theodorou, ; Le Song, Georgia Tech |
358 |
Accelerated Stochastic Mirror Descent: From Continuous-time Dynamics to Discrete-time Algorithms |
Pan Xu*, University of Virginia; Tianhao Wang, ; Quanquan Gu, University of Virginia |
367 |
A Unified Framework for Nonconvex Low-Rank plus Sparse Matrix Recovery |
Xiao Zhang*, University of Virginia; Lingxiao Wang, University of Virginia; Quanquan Gu, University of Virginia |
370 |
Bayesian Nonparametric Poisson-Process Allocation for Time-Sequence Modeling |
Hongyi Ding*, The University of Tokyo; Mohammad Khan, ; Issei Sato, The University of Tokyo / RIKEN; Masashi Sugiyama, RIKEN / The University of Tokyo |
371 |
Factor Analysis on a Graph |
Masayuki Karasuyama*, ; Hiroshi Mamitsuka, Kyoto University / Aalto University |
375 |
Crowdclustering with Partition Labels |
Junxiang Chen*, Northeastern University; Yale Chang, Northeastern University; Peter Castaldi, Brigham and Women’s Hospital; Michael Cho, Brigham and Women’s Hospital; Brian Hobbs, Brigham and Women’s Hospital; Jennifer Dy, Northeastern University |
378 |
Learning Structural Weight Uncertainty with Stein Gradient Flows |
Ruiyi Zhang, Duke University; Chunyuan Li*, Duke University; Changyou Chen, SUNY Buffalo; Lawrence Carin, Duke |
382 |
Towards Memory-Friendly Deterministic Incremental Gradient Method |
Jiahao Xie*, Zhejiang University; Hui Qian, Zhejiang University; Zebang Shen, Zhejiang University; Chao Zhang, Zhejiang University |
383 |
Alpha-expansion is Exact on Stable Instances |
Hunter Lang*, MIT; David Sontag, MIT; Aravindan Vijayaraghavan, Northwestern University |
384 |
Bayesian Approaches to Distribution Regression |
Ho Chung Leon Law*, University of Oxford; Dougal Sutherland, Gatsby unit, UCL; Dino Sejdinovic, University of Oxford; Seth Flaxman, Imperial College London |
386 |
Submodularity on Hypergraphs: From Sets to Sequences |
Marko Mitrovic*, Yale University; Moran Feldman, Open University of Israel; Andreas Krause, ETH Zurich; Amin Karbasi, Yale |
389 |
Provable Estimation of the Number of Blocks in Block Models |
Bowei Yan*, University of Texas at Austin; Purnamrita Sarkar, University of Texas at Austin; Xiuyuan Cheng, Duke University |
391 |
Differentially Private Regression with Gaussian Processes |
Michael Smith*, University of Sheffield; Mauricio Álvarez, University of Sheffield; Max Zwiessele, University of Sheffield; Neil Lawrence, University of Sheffield |
394 |
Adaptive balancing of gradient and update computation times using global geometry and approximate subproblems |
Sai Praneeth Reddy Karimireddy, EPFL; Sebastian Stich, EPFL; Martin Jaggi*, EPFL |
407 |
VAE with a VampPrior |
Jakub Tomczak*, University of Amsterdam; Max Welling, University of Amsterdam |
408 |
Structured Factored Inference for Probabilistic Programming |
Avi Pfeffer, Charles River Analytics; Brian Ruttenberg, Charles River Analytics; William Kretschmer, MIT; Alison OConnor*, Charles River Analytics |
410 |
A Generic Approach for Escaping Saddle points |
Sashank Reddi, Google; Manzil Zaheer*, Carnegie Mellon University; Suvrit Sra, MIT; Barnabas Poczos, Carnegie Mellon University; Francis Bach, Inria / ENS; Ruslan Salakhutdinov, Carnegie Mellon University; Alex Smola, Amazon |
411 |
Policy Evaluation and Optimization with Continuous Treatments |
Nathan Kallus*, ; Angela Zhou, Cornell ORIE |
412 |
Multiphase MCMC Sampling for Parameter Inference in Nonlinear Ordinary Differential Equations |
Alan Lazarus*, University of Glasgow; Dirk Husmeier, Glasgow; Theodore Papamarkou, Mathematics & Statistics, University of Glasgow |
414 |
Why adaptively collected data have negative bias and how to correct for it |
Xinkun Nie*, Stanford University; Xiaoying Tian, Stanford University; Jonathan Taylor, Stanford University; James Zou, Stanford University |
425 |
Sparse Linear Isotonic Models |
Sheng Chen*, University of Minnesota; Arindam Banerjee, University of Minnesota |
431 |
Robustness of classifiers to uniform $\ell_p$ and Gaussian noise |
Jean-Yves Franceschi, Ecole Normale Supérieure Lyon; Alhussein Fawzi*, UCLA; Omar Fawzi, |
436 |
Nested CRP with Hawkes-Gaussian Processes |
Xi Tan*, Purdue University; Vinayak Rao, Purdue; Jennifer Neville, Purdue University |
441 |
Sketching for Kronecker Product Regression and P-splines |
Huaian Diao, Northeast Normal University; Zhao Song, UT-Austin; Wen Sun*, Carnegie Mellon University; David Woodruff, Carnegie Mellon University |
442 |
Multimodal Prediction and Personalization of Photo Edits with Deep Generative Models |
Ardavan Saeedi*, ; Matthew Hoffman, Google; Stephen DiVerdi, Adobe; Asma Ghandeharioun, MIT; Matthew Johnson, Google Brain; Ryan Adams, Princeton |
444 |
Cheap Checking for Cloud Computing: Statistical Analysis via Annotated Data Streams |
Chris Hickey*, University of Warwick; Graham Cormode, |
447 |
Reconstruction Risk of Convolutional Sparse Dictionary Learning |
Shashank Singh*, ; Barnabas Poczos, Carnegie Mellon University; Jian Ma, Carnegie Mellon University |
448 |
Kernel Conditional Exponential Family |
Michael Arbel*, Gatsby unit, UCL; Arthur Gretton, Gatsby unit, UCL |
451 |
Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging |
Chandrashekar Lakshmi-Narayanan*, Indian Institute of Science; Csaba Szepesvari, |
452 |
Stochastic Zeroth-order Optimization in High Dimensions |
Yining Wang*, Carnegie Mellon University; Simon Du, ; Sivaraman Balakrishnan, Carnegie Mellon University; Aarti Singh, Carnegie Mellon University |
459 |
Teacher Improves Learning by Selecting a Training Subset |
Philippe Rigollet, Massachusetts Institute of Technology; Robert Nowak, ; Xiaojin Zhu*, University of Wisconsin-Madison; Xuezhou Zhang, University of Wisconsin-Madison; Yuzhe Ma, Univ. of Wisconsin-Madison |
460 |
Communication-Avoiding Optimization Methods for Massive-Scale Graphical Model Structure Learning |
Penporn Koanantakool*, UC Berkeley; Alnur Ali, Carnegie Mellon University; Ariful Azad, Lawrence Berkeley National Laboratory; Aydin Buluc, Lawrence Berkeley National Laboratory; Dmitriy Morozov, Lawrence Berkeley National Laboratory; Sang-Yun Oh, University of California, Santa Barbara; Leonid Oliker, Lawrence Berkeley National Laboratory; Katherine Yelick, Lawrence Berkeley National Laboratory |
462 |
Robust Vertex Enumeration for Convex Hulls in High Dimensions |
Pranjal Awasthi*, Rutgers University; Bahman Kalantari, ; Yikai Zhang, Rutgers |
468 |
Fast generalization error bound of deep learning from a kernel perspective |
Taiji Suzuki*, The University of Tokyo |
471 |
Product Kernel Interpolation for Scalable Gaussian Processes |
Jacob Gardner*, Cornell University; Geoff Pleiss, Cornell University; Ruihan Wu, Tsinghua University; Kilian Weinberger, Cornell University; Andrew Wilson, Cornell University |
472 |
Towards Provable Learning of Polynomial Neural Networks Using Low-Rank Matrix Estimation |
Mohammadreza Soltani*, Iowa State University; Chinmay Hegde, Iowa State University |
474 |
Scalable Generalized Dynamic Topic Models |
Patrick Jähnichen*, Humboldt-Universität zu Berlin; Florian Wenzel, Humboldt-Universität zu Berlin; Marius Kloft, Humboldt-Universität zu Berlin; Stephan Mandt, Disney Research |
478 |
Bayesian Structure Learning for Dynamic Brain Connectivity |
Michael Andersen*, Aalto University; Oluwasanmi Koyejo, UIUC; Ole Winther, DTU; Lars Kai Hansen, Technical University of Denmark; Russell Poldrack, Stanford University |
482 |
Large Scale Empirical Risk Minimization via Truncated Adaptive Newton Method |
Mark Eisen*, University of Pennsylvania; Aryan Mokhtari, University of California, Berkeley; Alejandro Ribeiro, University of Pennsylvania |
483 |
Frank-Wolfe Splitting via Augmented Lagrangian Method |
Gauthier Gidel*, MILA; Fabian Pedregosa, UC Berkeley; Simon Lacoste-Julien, Montreal |
487 |
Learning linear structural equation models in polynomial time and sample complexity |
Asish Ghoshal*, Purdue University; Jean Honorio, Purdue |
490 |
Convergence diagnostics for stochastic gradient descent |
Jerry Chee*, University of Chicago; Panos Toulis, |
496 |
Learning Sparse Polymatrix Games in Polynomial Time and Sample Complexity |
Asish Ghoshal*, Purdue University; Jean Honorio, Purdue |
499 |
Nonparametric Sharpe Ratio Function Estimation in Heteroscedastic Regression Models via Convex Optimization |
Seung-Jean Kim, ; Johan Lim, Seoul National University; Joong-Ho Won*, Seoul National University |
500 |
Stochastic algorithms for entropy-regularized optimal transport problems |
Brahim Khalil Abid*, Ecole polytechnique; Robert Gower, Telecom ParisTech |
502 |
Plug-in Estimators for Conditional Expectations and Probabilities |
Steffen Grunewalder*, Lancaster University |
503 |
Factorized Recurrent Neural Architectures for Longer Range Dependence |
Francois Belletti*, UC Berkeley; Alex Beutel, Google Inc.; Sagar Jain, Google Inc.; Ed Chi, Google Inc. |
504 |
On the Statistical Efficiency of Compositional Nonparametric Prediction |
Yixi Xu*, Purdue University; Jean Honorio, Purdue; Xiao Wang, Purdue University |
509 |
Metrics for Deep Generative Models |
Nutan Chen*, Volkswagen Group; Richard Kurle, ; Alexej Klushyn, ; Justin Bayer, ; Xueyan Jiang, ; Patrick van der Smagt, |
510 |
Combinatorial Penalties: Which structures are preserved by convex relaxations? |
Marwa El Halabi*, EPFL; Francis Bach, Inria / ENS; Volkan Cevher, EPFL |
513 |
Generalized Binary Search For Split-Neighborly Problems |
Stephen Mussmann*, Stanford University; Percy Liang, Stanford University |
518 |
Intersection-Validation: A Method for Evaluating Structure Learning without Ground Truth |
Jussi Viinikka*, ; Ralf Eggeling, University of Helsinki; Mikko Koivisto, |
522 |
On Statistical Optimality of Variational Bayes |
Anirban Bhattacharya, Texas A&M University; Debdeep Pati*, Texas A&M University; Yun Yang, |
524 |
Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems |
Jason Ge*, Princeton University; Zhaoran Wang, ; Mengdi Wang, ; Han Liu, Princeton |
525 |
Online Regression with Partial Information: Generalization and Linear Projection |
Shinji Ito*, NEC Corporation; Daisuke Hatano, ; Hanna Sumita, ; Akihiro Yabe, ; Takuro Fukunaga, ; Naonori Kakimura, ; Ken-Ichi Kawarabayashi, |
526 |
Learning Generative Models with Sinkhorn Divergences |
Aude Genevay*, Université Paris Dauphine; Gabriel Peyre, ; Marco Cuturi, ENSAE/CREST |
532 |
Reparameterizing the Birkhoff Polytope for Variational Permutation Inference |
Scott Linderman, ; Gonzalo Mena*, Columbia University; Hal Cooper, Columbia University; Liam Paninski, Columbia University; John Cunningham, Columbia University |
534 |
Achieving the time of 1-NN, but the accuracy of k-NN |
Lirong Xue*, Princeton University; Samory Kpotufe, Princeton University |
535 |
Efficient Weight Learning in High-Dimensional Untied MLNs |
Khan Mohammad Al Farabi*, The University of Memphis; Somdeb Sarkhel, Adobe Research; Deepak Venugopal, University of Memphis |
536 |
Consistent Algorithms for Classification under Complex Losses and Constraints |
Harikrishna Narasimhan*, Harvard University |
539 |
Solving lp-norm regularization with tensor kernels |
Saverio Salzo*, Istituto Italiano di Tecnologia; Lorenzo Rosasco, University of Genova & MIT; Johan Suykens, |
546 |
Weighted Tensor Decomposition for Learning Latent Variables with Partial Data |
Omer Gottesman*, Harvard University; Weiwei Pan, ; Finale Doshi-Velez, Harvard |
547 |
Multi-objective Contextual Bandit Problem with Similarity Information |
Eralp Turgay, Bilkent University; Doruk Oner, Bilkent University; Cem Tekin*, Bilkent University |
549 |
Turing: Composable inference for probabilistic programming |
Hong Ge*, University of Cambridge; Kai Xu, University of Edinburgh; Zoubin Ghahramani, University of Cambridge |
550 |
Fast and Scalable Learning of Sparse Changes in High-Dimensional Gaussian Graphical Model Structure |
Beilun Wang*, University of Virginia; Arshdeep Sekhon, University of Virginia; Yanjun Qi, |
551 |
Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control |
Sanket Kamthe, Imperial College; Marc Deisenroth*, Imperial College London |
557 |
Approximate Bayesian Computation with Kullback-Leibler Divergence as Data Discrepancy |
Bai Jiang*, Princeton University |
561 |
Practical Bayesian optimization in the presence of outliers |
Ruben Martinez-Cantin*, ; Michael McCourt, SigOpt; Kevin Tee, SigOpt |
563 |
Competing with Automata-based Expert Sequences |
Scott Yang*, D. E. Shaw & Co.; Mehryar Mohri, |
564 |
Reducing Crowdsourcing to Graphon Estimation, Statistically |
Christina Lee*, Microsoft Research; Devavrat Shah, MIT |
567 |
Robust Locally-Linear Controllable Embedding |
Ershad Banijamali*, University of Waterloo; Rui Shu, Stanford University; Mohammad Ghavamzadeh, DeepMind; Hung Bui, Adobe Research; Ali Ghodsi, University of Waterloo |
569 |
Combinatorial Semi-Bandits with Knapsacks |
Karthik Abinav Sankararaman*, University of Maryland College; Aleksandrs Slivkins, Microsoft Research NYC |
571 |
Structured Optimal Transport |
David Alvarez Melis*, MIT; Tommi Jaakkola, MIT; Stefanie Jegelka, MIT |
578 |
Graphical Models for Non-Negative Data Using Generalized Score Matching |
Shiqing Yu*, University of Washington; Mathias Drton, University of Washington; Ali Shojaie, University of Washington |
581 |
Asynchronous Doubly Stochastic Group Regularized Learning |
Bin Gu*, University of Pittsburgh; Zhouyuan Huo, ; Heng Huang, University of Pittsburgh |
582 |
Convergence of Value Aggregation for Imitation Learning |
Ching-An Cheng*, Georgia Institute of Technology; Byron Boots, |
594 |
Inference in Sparse Graphs with Pairwise Measurements and Side Information |
Dylan Foster*, Cornell University; Karthik Sridharan, Cornell University; Daniel Reichman, UC Berkeley |
595 |
Parallel and Distributed MCMC via Shepherding Distributions |
Arkabandhu Chowdhury*, Rice University; Christopher Jermaine, Rice University |
602 |
The Power Mean Laplacian for Multilayer Graph Clustering |
Pedro Mercado*, Saarland University; Antoine Gautier, Saarland University; Francesco Tudisco, University of Strathclyde; Matthias Hein, Saarland University |
604 |
Adaptive Sampling for Clustered Ranking |
Sumeet Katariya*, Univ of Wisconsin-Madison; Lalit Jain, University of Michigan Ann Arbor; Nandana Sengupta, University of Chicago; James Evans, University of Chicago; Robert Nowak, University of Wisconsin-Madison |
611 |
Comparison Based Learning from Weak Oracles |
Ehsan Kazemi*, Yale; Lin Chen, Yale University; Sanjoy Dasgupta, University of California San Diego; Amin Karbasi, Yale |
613 |
The Binary Space Partitioning-Tree Process |
Xuhui Fan*, UNSW; Bin Li, Fudan University; Scott Sisson, University of New South Wales |
614 |
On denoising noisy modulo 1 samples of a function |
Mihai Cucuringu*, University of Oxford and the Alan Turing Institute; Hemant Tyagi, Alan Turing Institute |
616 |
Scalable Hash-Based Estimation of Divergence Measures |
Morteza Noshad Iranzad*, University of Michigan; Alfred Hero, University of Michigan |
619 |
Conditional Gradient Method for Stochastic Submodular Maximization: Closing the Gap |
Aryan Mokhtari*, UC Berkeley; Hamed Hassani, ; Amin Karbasi, Yale |
620 |
Online Continuous Submodular Maximization |
Lin Chen*, Yale University; Hamed Hassani, ; Amin Karbasi, Yale |
626 |
Efficient Bayesian Methods for Counting Processes in Partially Observable Environments |
Ferdian Jovan*, University of Birmingham; Jeremy Wyatt, University of Birmingham; Nick Hawes, University of Oxford |
629 |
Matrix-normal models for fMRI analysis |
Michael Shvartsman*, Princeton University; Narayanan Sundaram, Intel Corporation; Mikio Aoi, Princeton University; Adam Charles, Princeton University; Theodore Wilke, Intel Corporation; Jonathan Cohen, Princeton University |
631 |
The emergence of spectral universality in deep networks |
Jeffrey Pennington*, ; Samuel Schoenholz, Google; Surya Ganguli, Google Brain |
635 |
Spectral Algorithms for Computing Fair Support Vector Machines |
Mahbod Olfat*, UC Berkeley; Anil Aswani, UC Berkeley |
636 |
Bayesian Multi-label Learning with Sparse Features and Labels |
He Zhao*, Monash University; Piyush Rai, IIT Kanpur; Lan Du, Faculty of Information Technology, Monash University, Australia; Wray Buntine, Monash University |
637 |
Nonparametric Bayesian sparse graph linear dynamical systems |
Rahi Kalantari, UT-Austin; Joydeep Ghosh, UT Austin; Mingyuan Zhou*, University of Texas at Austin |
639 |
Proximity Variational Inference |
Jaan Altosaar*, Princeton University; Rajesh Ranganath, Princeton; David Blei, |
641 |
Near-Optimal Machine Teaching via Explanatory Teaching Sets |
Yuxin Chen*, Caltech; Oisin Mac Aodha, Caltech; Shihan Su, Caltech; Pietro Perona, Caltech; Yisong Yue, Caltech |
643 |
Learning Hidden Quantum Markov Models |
Siddarth Srinivasan*, Georgia Institute of Technology; Geoff Gordon, Carnegie Mellon University; Byron Boots, |
644 |
Labeled Graph Clustering via Projected Gradient Descent |
Shiau Hong Lim*, IBM Research; Gregory Calvez, |
646 |
Gradient Diversity: a Key Ingredient for Scalable Distributed Learning |
Dong Yin*, UC Berkeley; Ashwin Pananjady, UC Berkeley; Max Lam, Stanford University; Dimitris Papailiopoulos, ; Kannan Ramchandran, UC Berkeley; Peter Bartlett, UC Berkeley |
648 |
HONES: A Fast and Tuning-free Homotopy Method For Online Newton Step |
Yuting Ye*, UC Berkeley; Lihua Lei, UC Berkeley; Cheng Ju, UC Berkeley |
649 |
Probability-Revealing Samples |
Krzysztof Onak*, IBM Research; Xiaorui Sun, Microsoft Research |
656 |
Reducing optimization to repeated classification |
Tatsunori Hashimoto*, Stanford; Steve Yadlowsky, Stanford University; John Duchi, |
661 |
Online Ensemble Multi-kernel Learning Adaptive to Non-stationary and Adversarial Environments |
Yanning Shen, ; Tianyi Chen*, University of Minnesota; Georgios Giannakis, University of Minnesota |
665 |
A Unified Dynamic Approach to Sparse Model Selection |
Chendi Huang*, Peking University; Yuan Yao, Hong Kong University of Science and Technology |
666 |
Bootstrapping EM via Power EM and Convergence in the Naive Bayes Model |
Costis Daskalakis, ; Christos Tzamos*, Microsoft Research; Manolis Zampetakis, MIT |
669 |
Dimensionality Reduced $\ell^{0}$-Sparse Subspace Clustering |
Yingzhen Yang*, Snap Research |