K. Kawaguchi, Deep learning without poor local minima, Advances in Neural Information Processing Systems, pp.586-594, 2016.

D. Soudry and Y. Carmon, No bad local minima: Data independent training error guarantees for multilayer neural networks, 2016.

R. Ge, J. D. Lee, and T. Ma, Matrix completion has no spurious local minimum, Advances in Neural Information Processing Systems, pp.2973-2981, 2016.

D. Freeman and J. Bruna, Topology and geometry of half-rectified network optimization, 2017.

S. Bhojanapalli, B. Neyshabur, and N. Srebro, Global optimality of local search for low rank matrix recovery, Advances in Neural Information Processing Systems, pp.3873-3881, 2016.

D. Park, A. Kyrillidis, C. Carmanis, and S. Sanghavi, Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach, In Artificial Intelligence and Statistics, pp.65-74, 2017.

J. D. Simon-s-du, Y. Lee, A. Tian, B. Singh, and . Poczos, Gradient descent learns one-hidden-layer CNN: Don't be afraid of spurious local minima, International Conference on Machine Learning, pp.1338-1347, 2018.