Tight dimension independent lower bound on the expected convergence rate for diminishing step sizes in SGD

P.H. Nguyen, L.M. Nguyen, M. van Dijk

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

We study the convergence of Stochastic Gradient Descent (SGD) for strongly convex objective functions. For every iteration t, we prove a lower bound on the expected convergence rate after the t-th SGD iteration; the lower bound holds over all possible sequences of diminishing step sizes. It implies that step-size sequences recently proposed at ICML 2018 and ICML 2019 are universally close to optimal, in that the expected convergence rate after each iteration is within a factor 32 of our lower bound. This factor is independent of the dimension d. We offer a framework for comparing with lower bounds in state-of-the-art literature, and when applied to SGD for strongly convex objective functions, our lower bound is a significant factor 775 · d larger than existing bounds.
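To make the setting concrete, the following is a minimal sketch of SGD with a diminishing O(1/t) step-size schedule on a strongly convex objective. The specific schedule eta_t = 1 / (mu * (t + 1)), the quadratic objective, and the Gaussian gradient noise are illustrative assumptions; the exact constants analyzed in the paper differ.

```python
import random

def sgd_diminishing(grad_est, w0, mu, n_steps):
    """Run SGD with the diminishing step sizes eta_t = 1 / (mu * (t + 1)).

    This O(1/t) schedule is a hypothetical stand-in for the step-size
    sequences analyzed in the paper; their exact constants differ.
    """
    w = w0
    for t in range(n_steps):
        eta = 1.0 / (mu * (t + 1))
        w -= eta * grad_est(w)
    return w

# Example: f(w) = (mu/2) * w**2 is mu-strongly convex with minimizer w* = 0.
# The stochastic gradient adds zero-mean noise to the true gradient mu * w.
mu = 1.0
noise = random.Random(42)
grad_est = lambda w: mu * w + noise.gauss(0.0, 0.1)

w_final = sgd_diminishing(grad_est, w0=5.0, mu=mu, n_steps=2000)
```

Under strong convexity, the expected squared distance E[|w_t - w*|^2] with such a schedule decays on the order of 1/t, which is the regime in which the paper's lower bound applies.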
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 32 (NeurIPS 2019)
Subtitle of host publication: [Proceedings]
Editors: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, R. Garnett
Publisher: Neural Information Processing Systems Foundation
ISBN (Electronic): 9781713807933
Publication status: Published - 2019
Externally published: Yes
Event: 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019 - Vancouver, Canada
Duration: 8 Dec 2019 - 14 Dec 2019

Publication series

Name: NeurIPS Proceedings

Conference

Conference: 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
Country/Territory: Canada
City: Vancouver
Period: 8/12/19 - 14/12/19

Funding

We thank the reviewers for useful suggestions to improve the paper. Phuong Ha Nguyen and Marten van Dijk were supported in part by AFOSR MURI under award number FA9550-14-1-0351.

Funders:
AFOSR MURI
Air Force Office of Scientific Research (FA9550-14-1-0351)
