Improved training of deep convolutional networks via minimum-variance regularized adaptive sampling

Fostered by technological and theoretical developments, deep neural networks (DNNs) have achieved great success in many applications, but their training via mini-batch stochastic gradient descent (SGD) can be very costly due to the tens of millions of parameters that may need to be optimized and the large numbers of training examples that must be processed. The computational cost is exacerbated by the inefficiency of the uniform sampling typically used by SGD to form the training mini-batches: since not all training examples are equally relevant for training, sampling them under a uniform distribution is far from optimal, making the case for the study of improved methods to train DNNs. A better strategy is to sample the training instances under a distribution in which each instance's probability of being selected is proportional to its relevance; one way to achieve this is through importance sampling (IS), which minimizes the variance of the gradients w.r.t. the network parameters, thereby improving convergence. In this paper, an IS-based adaptive sampling method to improve the training of DNNs is introduced. This method exploits side information to construct the optimal sampling distribution and is dubbed regularized adaptive sampling (RAS). Experimental comparisons using deep convolutional networks on the MNIST and CIFAR-10 classification datasets show that, compared with SGD and with another state-of-the-art sampling method, RAS improves the speed and reduces the variance of the training process without incurring significant overhead or compromising classification accuracy.
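For context, below is a minimal sketch of the generic importance-sampling scheme the abstract describes: mini-batch indices are drawn with probability proportional to a per-example relevance score, and each sampled gradient is reweighted to keep the mini-batch gradient unbiased. The function name, the use of NumPy, and the choice of scores are illustrative assumptions; this is not the RAS method itself, whose construction the abstract does not specify.

import numpy as np

def sample_minibatch(scores, batch_size, rng):
    # Importance distribution proportional to per-example relevance
    # scores (side information such as recent losses); an assumption
    # for illustration, not the RAS construction.
    probs = scores / scores.sum()
    idx = rng.choice(scores.size, size=batch_size, replace=True, p=probs)
    # Unbiasedness weights 1 / (N * p_i): multiplying each sampled
    # example's gradient by its weight keeps the mini-batch gradient
    # an unbiased estimate of the full-batch gradient.
    weights = 1.0 / (scores.size * probs[idx])
    return idx, weights

rng = np.random.default_rng(0)
scores = rng.uniform(0.1, 1.0, size=1000)  # stand-in relevance scores
idx, weights = sample_minibatch(scores, batch_size=32, rng=rng)

A known result in the IS-for-SGD literature is that the variance of this estimator is minimized when the scores are proportional to the per-example gradient norms; since those are expensive to maintain, practical methods substitute cheaper proxies, which is where side information of the kind the abstract mentions comes in.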


Additional Information

Source: https://scholar.google.com/citations?view_op=view_citation&hl=es&user=MG1jyREAAAAJ&pagesize=100&sortby=pubdate&citation_for_view=MG1jyREAAAAJ:VLnqNzywnoUC
Authors: A Rojas-Dominguez, SI Valdez, M Ornelas-Rodriguez, M Carpio
Last updated: October 21, 2025, 09:00 (UTC)
Created: October 21, 2025, 09:00 (UTC)
Year: 2023
Google Scholar URL: https://scholar.google.com/citations?view_op=view_citation&hl=es&user=MG1jyREAAAAJ&pagesize=100&sortby=pubdate&citation_for_view=MG1jyREAAAAJ:VLnqNzywnoUC
Hash identifier: 1a979632b846
Published in: Soft Computing 27 (18), 13237-13253, 2023
Type: Publication
Publication type: Other
Direct URL: https://www.researchsquare.com/article/rs-983472/latest.pdf