A new Oxford University study reveals that deep neural networks (DNNs) naturally favor simpler solutions when learning, acting as an inbuilt form of Occam’s razor that balances the exponential growth of complex solutions. This simplicity bias allows DNNs to generalize well on real-world data but limits their performance on more complex patterns, hinting at deeper parallels between AI learning and natural evolutionary processes. Credit: SciTechDaily.com
Oxford researchers found that deep neural networks naturally favor simpler solutions, enhancing their ability to generalize from data, a discovery that may reveal deeper links between artificial intelligence and natural evolutionary processes.
A new study from Oxford University has revealed why deep neural networks (DNNs), the foundation of modern artificial intelligence, excel at learning from data. The research shows that DNNs naturally follow a form of Occam’s razor—they prefer simpler solutions when multiple options fit the training data. Uniquely, this simplicity bias precisely counteracts the exponential increase in the number of possible solutions as complexity grows. These findings were published in Nature Communications.
DNNs can accurately predict new, unseen data even when they have millions or billions more parameters than training data points. The researchers proposed that this is possible because DNNs possess an inherent form of guidance: a built-in bias that helps them prioritize the most relevant patterns during learning.
“Whilst we knew that the effectiveness of DNNs relies on some form of inductive bias towards simplicity – a kind of Occam’s razor – there are many versions of the razor. The precise nature of the razor used by DNNs remained elusive,” said theoretical physicist Professor Ard Louis (Department of Physics, Oxford University), who led the study.
Preference for Simpler Functions
To uncover the guiding principle of DNNs, the authors investigated how these networks learn Boolean functions – fundamental rules in computing where a result can only have one of two possible values: true or false. They discovered that even though DNNs can technically fit any function to data, they have a built-in preference for simpler functions that are easier to describe. This means DNNs are naturally biased towards simple rules over complex ones.
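This kind of experiment can be illustrated with a toy sketch in the spirit of the study’s methodology (this is not the authors’ code; the network sizes and the zlib-based complexity proxy are illustrative choices): sample small random-weight networks, read off the Boolean function each one computes over all possible inputs, and measure how compressible its truth table is.

```python
import itertools
import zlib
import numpy as np

rng = np.random.default_rng(0)

def random_boolean_function(n_inputs=7, hidden=40):
    """Sample a small random-weight ReLU network and read off the Boolean
    function it computes by thresholding its output on every input."""
    W1 = rng.normal(size=(n_inputs, hidden))
    b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=hidden)
    inputs = np.array(list(itertools.product([0, 1], repeat=n_inputs)))
    h = np.maximum(inputs @ W1 + b1, 0.0)      # ReLU hidden layer
    return (h @ W2 > 0).astype(np.uint8)       # truth table of length 2**n_inputs

def complexity(truth_table):
    """Crude descriptional-complexity proxy: compressed size of the truth table."""
    return len(zlib.compress(np.packbits(truth_table).tobytes()))

tables = [random_boolean_function() for _ in range(200)]
print("mean complexity of network-sampled functions:",
      np.mean([complexity(t) for t in tables]))
print("complexity of a constant (maximally simple) function:",
      complexity(np.zeros(2 ** 7, dtype=np.uint8)))
```

Highly compressible truth tables (such as constant functions) score low on this proxy, while typical random truth tables score near the maximum; the study’s point is that networks land on the simple, compressible functions far more often than uniform chance over all functions would predict.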
Furthermore, the authors discovered that this inherent Occam’s razor has a unique property: it exactly counteracts the exponential increase in the number of complex functions as the system size grows. This allows DNNs to identify the rare, simple functions that generalize well (making accurate predictions on both the training data and unseen data), while avoiding the vast majority of complex functions that fit the training data but perform poorly on unseen data.
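The “exponential increase” here can be made concrete. On n binary inputs there are 2**n possible input rows, and each row can independently map to true or false, giving 2**(2**n) distinct Boolean functions – a doubly exponential explosion:

```python
# Each of the 2**n input rows can map to 0 or 1 independently,
# so there are 2**(2**n) distinct Boolean functions on n inputs.
for n in range(1, 6):
    print(f"{n} inputs: {2 ** (2 ** n):,} possible functions")
# 5 inputs already yield 4,294,967,296 possible functions
```

Only a vanishing fraction of these functions are simple, which is why a bias that exactly tracks this growth matters.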
This emergent principle helps DNNs do well when the data follows simple patterns. However, when the data is more complex and does not fit simple patterns, DNNs do not perform as well, sometimes no better than random guessing. Fortunately, real-world data is often fairly simple and structured, which aligns with the DNNs’ preference for simplicity. This helps DNNs avoid overfitting (where the model gets too ‘tuned’ to the training data) when working with simple, real-world data.
The Impact of Modifying Learning Processes
To delve deeper into the nature of this razor, the team investigated how the network’s performance changed when its learning process was altered by modifying the activation functions – the mathematical functions that decide whether a neuron should ‘fire’ or not.
They found that although these modified DNNs still favor simple solutions, even slight adjustments to this preference significantly reduced their ability to generalize (make accurate predictions) on simple Boolean functions. The same problem occurred in other learning tasks, demonstrating that having the correct form of Occam’s razor is crucial for a network to learn effectively.
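The flavor of such a modification can be sketched as follows (a minimal illustration, not the study’s experiment: it only shows that the choice of activation function changes the distribution of Boolean functions a random architecture expresses, without measuring generalization):

```python
import itertools
import zlib
import numpy as np

rng = np.random.default_rng(1)

def sampled_complexity(activation, n_inputs=7, hidden=40, samples=200):
    """Mean compressed size of the truth tables of Boolean functions
    computed by random-weight networks using the given activation."""
    inputs = np.array(list(itertools.product([0, 1], repeat=n_inputs)),
                      dtype=float)
    sizes = []
    for _ in range(samples):
        W1 = rng.normal(size=(n_inputs, hidden))
        b1 = rng.normal(size=hidden)
        W2 = rng.normal(size=hidden)
        table = (activation(inputs @ W1 + b1) @ W2 > 0).astype(np.uint8)
        sizes.append(len(zlib.compress(np.packbits(table).tobytes())))
    return float(np.mean(sizes))

relu = lambda x: np.maximum(x, 0.0)
print("ReLU activation, mean complexity:", sampled_complexity(relu))
print("tanh activation, mean complexity:", sampled_complexity(np.tanh))
```

Swapping the activation shifts which functions the architecture naturally expresses; the study’s finding is that even small shifts of this kind can noticeably degrade generalization.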
The new findings help to ‘open the black box’ of how DNNs arrive at certain conclusions, which currently makes it difficult to explain or challenge decisions made by AI systems. However, while these findings apply to DNNs in general, they do not fully explain why some specific DNN models work better than others on certain types of data.
Christopher Mingard (Department of Physics, Oxford University), co-lead author of the study, said: “This suggests that we need to look beyond simplicity to identify additional inductive biases driving these performance differences.”
According to the researchers, the findings suggest a strong parallel between artificial intelligence and fundamental principles of nature. Indeed, the remarkable success of DNNs on a broad range of scientific problems indicates that this exponential inductive bias must mirror something deep about the structure of the natural world.
“Our findings open up exciting possibilities,” said Professor Louis. “The bias we observe in DNNs has the same functional form as the simplicity bias in evolutionary systems that helps explain, for example, the prevalence of symmetry in protein complexes. This points to intriguing connections between learning and evolution, a connection ripe for further exploration.”
Reference: “Deep neural networks have an inbuilt Occam’s razor” by Chris Mingard, Henry Rees, Guillermo Valle-Pérez and Ard A. Louis, 14 January 2025, Nature Communications.
DOI: 10.1038/s41467-024-54813-x