Training neural networks
Training deep neural networks is hard because several hyperparameters must be tuned. Hyperparameters are the variables that define the structure of the network and how it is trained. The number of hidden layers and the choice of activation function are examples of architecture-defining hyperparameters, while the learning rate and the batch size of the training data are examples of training-related hyperparameters. The remaining parameters are the network's weights and biases, which are learned from the input data; the process of obtaining these parameters is called training.
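To make the distinction concrete, the following minimal sketch (assuming TensorFlow/Keras is available; the layer sizes, input shape, and optimizer are illustrative choices, not taken from the text) separates the hyperparameters that are set by hand from the weights and biases that are obtained by training:

```python
import tensorflow as tf

# Architecture-defining hyperparameters (chosen by hand, not learned).
num_hidden_layers = 2
hidden_units = 64
activation = "relu"

# Training-related hyperparameters (also chosen by hand).
learning_rate = 0.001
batch_size = 32
epochs = 5

# Build the network from the architecture hyperparameters.
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(100,)))  # illustrative input size
for _ in range(num_hidden_layers):
    model.add(tf.keras.layers.Dense(hidden_units, activation=activation))
model.add(tf.keras.layers.Dense(10, activation="softmax"))

# The weights and biases of the Dense layers are the parameters that
# training will learn from the input data.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Training would use the training-related hyperparameters, e.g.:
# model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs)
# After training, model.get_weights() returns the learned weights and biases.
```

In this sketch, changing `num_hidden_layers`, `activation`, `learning_rate`, or `batch_size` changes how the network is built or trained, whereas the values returned by `model.get_weights()` are only determined once training has run on the data.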