Luca Saglietti — Politecnico di Torino

# Robust accessible states allow efficient training of neural networks with very low precision synapses

Training neural networks in which the synaptic connection strengths are discretized, with a precision limited to only a few bits, has long been considered a challenging task even for the simplest neural architectures: local search algorithms tend to get trapped in local minima, and equilibrium statistical analysis shows that the global optima are typically isolated (the "golf-course scenario"). However, biological experiments show that the precision of the brain's synapses does not exceed a few bits (and may even be as low as 1 bit). Furthermore, machine learning applications would greatly benefit from reduced precision requirements. We performed a large deviations analysis [1, 2] which shows that there exist peculiar dense regions in the space of synaptic states which account for the possibility of learning under these constraints. These regions are characterized by a large local entropy, such that: 1) they are accessible to very simple and efficient heuristic algorithms which exploit their characteristics; 2) their optima are very wide and thus robust; 3) they have good generalization properties. The analytical results yield an "effective capacity" measure which saturates quickly with the number of synaptic values, and thus indicates that very few bits are indeed sufficient for effective learning. Our numerical observations match the theoretical results where available, and indicate that the scenario extends to complex multi-layer neural architectures trained on real-world data (potentially providing a framework to explain the success of deep learning techniques). The analysis may also be extended to other models.
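The local entropy underlying these results can be sketched as follows (a schematic rendering of the quantity studied in [1, 2]; the notation here, including the indicator $\mathbb{X}_\xi$ and the reference configuration $\tilde{W}$, is illustrative shorthand rather than the papers' exact formalism): for a reference synaptic configuration $\tilde{W}$, count the solutions at a fixed distance from it.

```latex
% Schematic local-entropy definition (illustrative shorthand; see [1,2]
% for the exact large-deviations treatment). \mathbb{X}_\xi(W) = 1 if W
% classifies all training patterns \xi correctly (0 otherwise), and
% d(\cdot,\cdot) is a normalized distance (Hamming, for binary synapses).
\[
  \mathcal{S}(\tilde{W}, d) \;=\; \frac{1}{N}\,
  \log \sum_{\{W\}} \mathbb{X}_{\xi}(W)\,
  \delta\bigl( d(W, \tilde{W}) - d \bigr)
\]
% A robust accessible state is a \tilde{W} for which \mathcal{S} stays
% close to its maximum as d shrinks: exponentially many solutions
% surround it at every small distance. The large deviations analysis
% reweights configurations by exp(y N \mathcal{S}), so that for large y
% the statistics are dominated by these rare dense regions rather than
% by the isolated "golf-course" optima.
```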
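As an illustration of point 1), the accessibility of these regions to simple heuristics, here is a minimal self-contained sketch of storing random patterns in a perceptron with 1-bit synapses. It is in the spirit of (but not identical to) heuristics such as CP+R and SBPI discussed in [1] and references therein; the network size, the load, the threshold `theta` and the probability `p_s` are illustrative assumptions, not values taken from the papers.

```python
# A minimal sketch, NOT the exact algorithms of [1]: storing random
# +/-1 patterns in a perceptron whose synapses W_i are a single bit.
# The search keeps integer hidden states h_i and exposes only their
# signs; barely-correct patterns trigger a consolidation step. All
# parameter values below (N, the load, theta, p_s) are illustrative.
import numpy as np

rng = np.random.default_rng(0)

N = 501                  # number of synapses (odd, so W . xi never ties)
P = int(0.25 * N)        # load alpha = 0.25, well below capacity
theta = 1                # "barely correct" stability threshold
p_s = 0.3                # probability of the consolidation step

X = rng.choice([-1, 1], size=(P, N))   # random input patterns
y = rng.choice([-1, 1], size=P)        # random target outputs

h = rng.choice([-1, 1], size=N)        # odd integer hidden states
for epoch in range(1000):
    n_err = 0
    for mu in rng.permutation(P):
        W = np.sign(h)                 # the actual 1-bit synapses
        s = y[mu] * (X[mu] @ W)        # stability of pattern mu
        if s < 0:                      # misclassified:
            h += 2 * y[mu] * X[mu]     #   perceptron-style push on h
            n_err += 1
        elif s <= theta and rng.random() < p_s:
            # barely correct: deepen the hidden states of the synapses
            # already voting correctly (no weight flips, wider margin)
            agree = y[mu] * X[mu] * W > 0
            h[agree] += 2 * y[mu] * X[mu][agree]
    if n_err == 0:
        print(f"all {P} patterns stored after {epoch + 1} epochs")
        break
else:
    print(f"not converged: {n_err} errors in the last epoch")
```

Note the design choice: the search moves through integer hidden states while only their signs act as synapses, and the consolidation step widens the margin without flipping any weight. This is one concrete way a simple heuristic can be biased toward the wide, robust optima described above, instead of the isolated ones where plain local search gets stuck.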
## Bibliography

[1] C. Baldassi, A. Ingrosso, C. Lucibello, L. Saglietti and R. Zecchina. Subdominant Dense Clusters Allow for Simple Learning and High Computational Performance in Neural Networks with Discrete Synapses. Phys. Rev. Lett. 115, 128101 (2015).

[2] C. Baldassi, F. Gerace, C. Lucibello, L. Saglietti and R. Zecchina. Learning may need only a few bits of synaptic precision. http://arxiv.org/abs/1602.04129