Researchers are fundamentally rethinking artificial intelligence by drawing inspiration directly from the architecture of the human brain. Weifeng Liu from Vista Zenith, alongside colleagues, proposes a novel neural network paradigm, Brain-like Neural Networks (BNNs), that moves beyond manually designed structures and embraces autonomous evolution. This work is significant because it presents LuminaNet, the first instantiation of a BNN capable of dynamically modifying its own architecture without relying on traditional convolutions or self-attention mechanisms. Extensive experiments reveal LuminaNet achieves substantial performance gains on image classification (CIFAR-10) and text generation (TinyStories), exceeding established models like LeNet-5, AlexNet, MLP-Mixer, ResMLP, and DeiT-Tiny, all while significantly reducing computational demands.
Background
Scientists are increasingly interested in artificial neural networks that more closely resemble biological systems. The authors present LuminaNet, the first instantiation of a BNN, which operates without convolutions or self-attention and can autonomously modify its architecture. Extensive experiments demonstrate that LuminaNet achieves self-evolution through dynamic architectural changes. On the CIFAR-10 dataset, LuminaNet improves top-1 accuracy by 11.19% and 5.46% over LeNet-5 and AlexNet, respectively, and reaches a top-5 accuracy of 98.09%, outperforming MobileViT (97.39%) and approaching ResNet-18 (99.41%), which has ten times more parameters. On the TinyStories text generation task, LuminaNet attains a perplexity of 8.4, comparable to a single-layer GPT-2-style Transformer [11, 12], while reducing computational cost by approximately 25% and peak memory usage by nearly 50%. Code and interactive structures are available at https://github.com/aaroncomo/LuminaNet.
On TinyStories, LuminaNet achieves a best PPL of 8.4 and a top-1 accuracy of 53.38% without self-attention, positional encoding, causal masking, or training tricks, comparable to a single-layer GPT-2-style Transformer [11, 12] (PPL: 8.08, top-1: 53.29%). The contributions can be summarised as follows: the researchers present LuminaNet, the first instantiation of a BNN, designed to operate without convolutions or self-attention and to achieve self-evolution through dynamic architectural changes; and they engineer LuminaNet’s foundational module, the Neuron Cluster (NC), to emulate the neuronal populations observed in biological systems.
Each Neuron Cluster (NC) spans both an Input Layer (IL) and a Neuron Layer (NL), receiving information from other clusters via a Communication Layer (CL) that employs a fixed hidden dimension, dₕᵢddₑₙ, as the communication channel. RGB images are divided channel-wise into 16 × 16 patches, with each patch processed by an independent Neuron Cluster, demonstrating a scalable approach to image processing. Within each Neuron Cluster, the initial embedding of the input xᵢ is computed through a linear transformation followed by an activation function, expressed as eᵢ = σ(Linearᵢ(Flatten(xᵢ))). Crucially, the framework incorporates three evolutionary mechanisms within each Neuron Cluster: splitting, growth, and connection.
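To make the patching and embedding step concrete, here is a minimal PyTorch sketch, not the authors’ released code: the class and function names, the GELU choice for σ, and the 32 × 32 CIFAR-10 geometry are assumptions, while the structure follows eᵢ = σ(Linearᵢ(Flatten(xᵢ))) with one independent cluster per channel-wise 16 × 16 patch.

```python
import torch
import torch.nn as nn


class NeuronClusterEmbedding(nn.Module):
    """Input Layer of one Neuron Cluster: e_i = sigma(Linear_i(Flatten(x_i)))."""

    def __init__(self, patch_numel: int, d_hidden: int):
        super().__init__()
        self.linear = nn.Linear(patch_numel, d_hidden)  # Linear_i, one per cluster
        self.act = nn.GELU()  # sigma; the exact activation is an assumption

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        return self.act(self.linear(patch.flatten(start_dim=1)))


def channelwise_patches(images: torch.Tensor, patch: int = 16) -> torch.Tensor:
    """Split (B, C, H, W) images channel-wise into non-overlapping patch x patch tiles.

    Returns (B, C * (H // patch) * (W // patch), patch * patch): one flattened tile per cluster.
    """
    b, c, h, w = images.shape
    tiles = images.unfold(2, patch, patch).unfold(3, patch, patch)  # (B, C, H/p, W/p, p, p)
    return tiles.reshape(b, -1, patch * patch)


# Usage: a 32x32 RGB CIFAR-10 image yields 3 channels x 4 tiles = 12 clusters.
x = torch.randn(8, 3, 32, 32)
patches = channelwise_patches(x)                       # (8, 12, 256)
cluster_0 = NeuronClusterEmbedding(256, d_hidden=128)  # one independent cluster per patch
e_0 = cluster_0(patches[:, 0])                         # (8, 128) embedding from the first cluster
```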
Splitting divides a cluster’s weights and biases to form a new cluster, while growth enables horizontal expansion or reduction by adding or removing n neurons. The connection mechanism establishes multiple links between clusters, using a weight matrix Wᴄⱼ to modulate signal strength, analogous to synaptic transmission in biological brains. The aggregated signal received by cluster NCᵢ is calculated as sᵢ = (1/n) Σⱼ∈ᴺᵢ Wᴄⱼ fⱼ, where n = |Nᵢ|. To further emulate biological neural processes, the researchers introduce a Two-Pass Forward mechanism that enables the dynamic formation of feedforward, feedback, and recurrent connections. This mechanism leverages the evolving network topology, described by the source sets Nᶠⱼ and Nᵇⱼ, which satisfy max(Nᶠⱼ) < j, so feedforward sources always precede cluster j in the traversal order. During Pass-1, all clusters are traversed sequentially; during Pass-2, information propagates bidirectionally, allowing feedback and recurrent connections to contribute, as sketched below.
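The aggregation rule and the Two-Pass Forward traversal can be illustrated with a small, hypothetical PyTorch sketch. The class names, the bias-free linear layers standing in for the connection matrices Wᴄⱼ, and the toy Neuron Layer that adds eᵢ and sᵢ are assumptions for illustration, not the authors’ implementation; only the mean over incoming messages and the two sequential passes follow the description above.

```python
import torch
import torch.nn as nn


class ClusterAggregator(nn.Module):
    """s_i = (1/|N_i|) * sum over j in N_i of W_Cj @ f_j for one receiving cluster."""

    def __init__(self, neighbours: list, d_hidden: int):
        super().__init__()
        self.neighbours = neighbours
        # One connection weight matrix W_Cj per incoming link, modulating signal strength.
        self.w_c = nn.ModuleDict({str(j): nn.Linear(d_hidden, d_hidden, bias=False)
                                  for j in neighbours})

    def forward(self, outputs: dict) -> torch.Tensor:
        # Only sources that have already produced an output contribute on this pass.
        msgs = [self.w_c[str(j)](outputs[j]) for j in self.neighbours if j in outputs]
        return torch.stack(msgs).mean(dim=0) if msgs else torch.zeros(1)


class ToyNeuronCluster(nn.Module):
    """Toy Neuron Layer: combines the cluster's own embedding with the aggregated signal."""

    def __init__(self, neighbours: list, d_hidden: int):
        super().__init__()
        self.agg = ClusterAggregator(neighbours, d_hidden)
        self.nl = nn.Linear(d_hidden, d_hidden)

    def forward(self, e_i: torch.Tensor, outputs: dict) -> torch.Tensor:
        return torch.tanh(self.nl(e_i + self.agg(outputs)))


def two_pass_forward(clusters: nn.ModuleList, embeddings: dict) -> dict:
    """Pass-1 traverses clusters sequentially, so only feedforward sources (index < i)
    have outputs available; Pass-2 revisits every cluster, letting feedback and
    recurrent sources (index >= i) propagate information bidirectionally."""
    outputs = {}
    for _pass in range(2):
        for i, cluster in enumerate(clusters):
            outputs[i] = cluster(embeddings[i], outputs)
    return outputs


# Usage: cluster 1 listens to cluster 0 (feedforward) and cluster 2 (feedback),
# so the edge from cluster 2 only contributes during Pass-2.
d = 64
clusters = nn.ModuleList([ToyNeuronCluster([], d),
                          ToyNeuronCluster([0, 2], d),
                          ToyNeuronCluster([1], d)])
embeddings = {i: torch.randn(4, d) for i in range(3)}
outs = two_pass_forward(clusters, embeddings)
```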
Experimental results show that LuminaNet improves top-1 accuracy by 11.19% and 5.46% over LeNet-5 and AlexNet, respectively, on the CIFAR-10 dataset. The research rethinks intelligence formation by drawing inspiration from neuroscience, resulting in a network architecture that eschews traditional convolutional and self-attention mechanisms. On the TinyStories text generation task, LuminaNet achieves a perplexity of 8.4, delivering performance comparable to a single-layer GPT-2-style Transformer. Remarkably, this is accomplished with approximately a 25% reduction in computational cost and nearly a 50% decrease in peak memory usage. Detailed analyses presented in Appendix D confirm the quality of the generated texts, while ablation studies (Table 5) demonstrate the model’s robustness: performance does not sharply decline even after pruning all connections. This resilience arises from the initial independent training of clusters, enabling each to extract appropriate tokens directly from embedding vectors.
Further tests show that networks evolving from scratch maintain their performance even after pruning, unlike those starting with dense connections, which collapse upon removal. The study visualised the network architectures formed on the TinyStories task, revealing that the network enhances information processing through growth and connection formation. For networks with hidden dimensions of 128 and 256, the team recorded topological depths of 21 and 13 layers, respectively, alongside 24 recurrent structures, driven by the need for inter-cluster communication. Conversely, a network with dₕᵢddₑₙ of 384 achieved a shallower depth of 9 layers and only 4 recurrent structures, indicating more powerful individual clusters.
Notably, two networks initialized with dₕᵢddₑₙ of 384 converged to remarkably similar architectures, suggesting a fixed pattern of semantic associations captured by LuminaNet. The breakthrough delivers an empirically demonstrated ability for artificial neural networks to autonomously construct and optimise themselves. This research presents LuminaNet, the first instantiation of a BNN, which autonomously modifies its architecture without relying on convolutions or self-attention mechanisms. Extensive experiments demonstrate LuminaNet’s capacity for self-evolution through dynamic architectural changes, achieving significant improvements in performance on benchmark tasks. LuminaNet surpasses established convolutional architectures like LeNet-5 and AlexNet on the CIFAR-10 image recognition task, achieving top-1 accuracy improvements of 11.19% and 5.46% respectively.
Furthermore, it outperforms several MLP/ViT models and attains comparable perplexity to a single-layer GPT-2-style Transformer on the TinyStories text generation task, all while reducing computational cost by approximately 25% and peak memory usage by nearly 50%. The convergence of networks initialized with differing connection patterns towards similar architectures suggests that language sequences possess inherent structural properties that LuminaNet effectively captures during evolution. The authors acknowledge that their work represents an initial step and that further research is needed to fully understand the capabilities and limitations of BNNs. They highlight the need for continued investigation into the interpretability of autonomously constructed network modules and the potential for scaling these networks to more complex tasks. This work provides the first empirical demonstration of artificial neural networks autonomously constructing and optimising themselves, offering a novel perspective for artificial intelligence and fostering deeper integration between AI and neuroscience.
👉 More information
🗞 Rethinking Intelligence: Brain-like Neuron Network
🧠 ArXiv: https://arxiv.org/abs/2601.19508
