
Kolmogorov–Arnold Networks

Kolmogorov–Arnold Networks (KANs) are a type of artificial neural network architecture inspired by the Kolmogorov–Arnold representation theorem, also known as the superposition theorem. Unlike traditional multilayer perceptrons (MLPs), which rely on fixed activation functions and linear weights, KANs replace each weight with a learnable univariate function, often represented using splines.[1][2][3]
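
The contrast with an MLP can be made concrete with a minimal sketch (illustrative only; the function names and the Gaussian-bump parameterization are assumptions chosen for brevity, whereas practical KAN implementations typically use B-splines):

  import numpy as np

  def mlp_edge(x, w):
      # In an MLP, an edge contributes a scalar weight times its input;
      # a fixed nonlinearity is applied afterwards at the receiving node.
      return w * x

  def kan_edge(x, coef, centers, width):
      # In a KAN, the edge itself is a learnable univariate function,
      # sketched here as a linear combination of fixed Gaussian bumps;
      # the coefficient vector `coef` holds the trainable parameters.
      return coef @ np.exp(-((x - centers) / width) ** 2)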

Architecture

KANs are based on the Kolmogorov–Arnold representation theorem, which is closely related to Hilbert's thirteenth problem.[4][5]

Given an input $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ consisting of $n$ variables, a multivariate continuous function $f(\mathbf{x})$ can be represented as:

  $f(\mathbf{x}) = \sum_{q=1}^{2n+1} \Phi_q\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)$   (1)

This formulation contains two nested summations: an outer and an inner sum. The outer sum aggregates $2n+1$ terms, each involving a continuous univariate function $\Phi_q$. The inner sum computes $n$ terms for each $q$, where each term $\phi_{q,p}(x_p)$ is a continuous function of the single variable $x_p$.
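
As a simple illustration of the form of equation (1) (an example chosen here for exposition, not drawn from the cited sources), the two-variable product $f(x_1, x_2) = x_1 x_2$ admits such a decomposition with only two outer terms, fewer than the $2n+1 = 5$ that the theorem guarantees to suffice:

  $x_1 x_2 = \frac{1}{4}(x_1 + x_2)^2 - \frac{1}{4}(x_1 - x_2)^2$

Here the inner functions are $\phi_{1,1}(x_1) = x_1$, $\phi_{1,2}(x_2) = x_2$, $\phi_{2,1}(x_1) = x_1$, $\phi_{2,2}(x_2) = -x_2$, and the outer functions are $\Phi_1(u) = u^2/4$ and $\Phi_2(u) = -u^2/4$.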

Liu et al.[1] proposed the name KAN. A general KAN network consisting of $L$ layers takes an input $\mathbf{x}$ and generates the output as:

  $\mathrm{KAN}(\mathbf{x}) = (\Phi_{L-1} \circ \Phi_{L-2} \circ \cdots \circ \Phi_0)(\mathbf{x})$   (3)

Here, $\Phi_l$ is the function matrix of the $l$-th KAN layer, i.e. its set of pre-activations.
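
A minimal sketch of this composition (illustrative only, not the reference implementation of Liu et al.; the function and variable names are assumptions) applies each layer in turn, with every output neuron summing the univariate edge functions applied to its inputs:

  import numpy as np

  def kan_forward(x, layers):
      # layers[l][j][i] is the univariate function on edge (l, i) -> (l+1, j);
      # each output neuron j sums its incoming edge functions, per equation (3).
      for phi in layers:
          x = np.array([sum(phi[j][i](x[i]) for i in range(len(x)))
                        for j in range(len(phi))])
      return x

For example, a single layer mapping two inputs to one output can be evaluated as kan_forward(np.array([0.5, -0.2]), [[[np.sin, np.square]]]).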

Let $i$ index the neurons of the $l$-th layer and $j$ the neurons of the $(l+1)$-th layer. The activation function $\phi_{l,j,i}$ connects neuron $(l, i)$ to neuron $(l+1, j)$:

  $\phi_{l,j,i}, \quad l = 0, \ldots, L-1, \quad i = 1, \ldots, n_l, \quad j = 1, \ldots, n_{l+1}$   (4)

where $n_l$ is the number of nodes of the $l$-th layer. Thus, the function matrix $\Phi_l$ can be represented as an $n_{l+1} \times n_l$ matrix of activations:

  $\Phi_l = \begin{pmatrix} \phi_{l,1,1}(\cdot) & \cdots & \phi_{l,1,n_l}(\cdot) \\ \vdots & \ddots & \vdots \\ \phi_{l,n_{l+1},1}(\cdot) & \cdots & \phi_{l,n_{l+1},n_l}(\cdot) \end{pmatrix}$
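
This function-matrix view translates directly into code. The sketch below is a simplified illustration, assuming each edge function $\phi_{l,j,i}$ is a learned linear combination of $K$ fixed Gaussian basis functions on a shared grid; the class name and parameterization are assumptions, and published implementations more commonly use B-splines (see the next section):

  import numpy as np

  class KANLayer:
      def __init__(self, n_in, n_out, K=8, lo=-1.0, hi=1.0, seed=0):
          rng = np.random.default_rng(seed)
          self.centers = np.linspace(lo, hi, K)                # shared basis grid
          self.width = (hi - lo) / (K - 1)                     # bump width
          self.coef = rng.normal(0.0, 0.1, (n_out, n_in, K))   # trainable coefficients

      def __call__(self, x):
          x = np.asarray(x, dtype=float)                       # shape (n_in,)
          B = np.exp(-((x[:, None] - self.centers) / self.width) ** 2)  # (n_in, K)
          # edge[j, i] = phi_{l,j,i}(x_i), a learned combination of basis values
          edge = np.einsum('jik,ik->ji', self.coef, B)         # (n_out, n_in)
          return edge.sum(axis=1)                              # node j sums its incoming edges

Stacking such layers and composing their outputs reproduces equation (3).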

Functions used in KAN

The choice of functional basis strongly influences the performance of KANs. Common function families include the following (a code sketch evaluating two of these bases appears after the list):

  • B-splines: Provide locality, smoothness, and interpretability; most widely used in current implementations.[3]
  • Radial basis functions (RBFs): Capture localized features in data and are effective in approximating functions with non-linear or clustered structures.[3][6]
  • Chebyshev polynomials: Offer efficient approximation with minimized error in the maximum norm, making them useful for stable function representation.[3][7]
  • Rational functions: Useful for approximating functions with singularities or sharp variations, as they can model asymptotic behavior better than polynomials.[3][8]
  • Fourier series: Capture periodic patterns effectively and are particularly useful in domains such as physics-informed machine learning.[3][9]
  • Wavelet functions: Provide multi-resolution representations that capture both localized and global structure in signals.[3][10]
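
As an illustration, the following sketch (the function names are assumptions) evaluates Chebyshev and Fourier bases on an array of inputs; an edge function is then a learned linear combination of the resulting basis rows, e.g. coef @ chebyshev_basis(x, K):

  import numpy as np

  def chebyshev_basis(x, K):
      # Chebyshev polynomials of the first kind via the recurrence
      # T_0(x) = 1, T_1(x) = x, T_k(x) = 2x*T_{k-1}(x) - T_{k-2}(x)
      T = [np.ones_like(x), x]
      for _ in range(2, K):
          T.append(2 * x * T[-1] - T[-2])
      return np.stack(T[:K])  # shape (K, len(x))

  def fourier_basis(x, K):
      # K sine/cosine harmonic pairs; suited to periodic structure
      k = np.arange(1, K + 1)
      return np.concatenate([np.sin(np.outer(k, x)), np.cos(np.outer(k, x))])  # (2K, len(x))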

Usage

KANs can be employed as drop-in replacements for MLP layers in modern neural architectures such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers. Researchers have applied them to a variety of tasks:

  • Function fitting: KANs outperform MLPs of similar parameter size in tasks like fitting symbolic formulas or special functions.[1]
  • Solving partial differential equations (PDEs): A two-layer, width-10 KAN can outperform a four-layer, width-100 MLP by two orders of magnitude in both accuracy and parameter efficiency.[1][11][12]
  • Continual learning: KANs better preserve previously learned information during incremental updates, avoiding catastrophic forgetting, owing to the locality of spline adjustments.[2][13]
  • Scientific discovery: Due to the interpretability of learned functions, KANs have been used as a tool for rediscovering physical or mathematical laws.[2]
  • Graph neural networks: Extensions such as Kolmogorov–Arnold Graph Neural Networks (KA-GNNs) integrate KAN modules into message-passing architectures, showing improvements in molecular property prediction tasks.[3][14][15]

References

  1. Liu, Ziming; Tegmark, Max (2024). "KAN: Kolmogorov–Arnold Networks". arXiv:2404.19756 [cs.LG].
  2. Liu, Ziming; Ma, Pingchuan; Wang, Yilun; Matusik, Wojciech; Tegmark, Max (2024). "KAN 2.0: Kolmogorov–Arnold Networks Meet Science". arXiv:2408.10205 [cs.LG].
  3. Somvanshi, S.; Javed, S. A.; Islam, M. M.; Pandit, D.; Das, S. (2024). "A Survey on Kolmogorov-Arnold Network". ACM Computing Surveys.
  4. Kolmogorov, A. N. (1963). "On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition". Translations of the American Mathematical Society. 2 (28): 55–59.
  5. Schmidt-Hieber, Johannes (2021). "The Kolmogorov–Arnold representation theorem revisited". Neural Networks. 137: 119–126. doi:10.1016/j.neunet.2021.01.020. PMID 33592434.
  6. Ta, H. T. (2024). "BSRBF-KAN: a combination of B-splines and radial basis functions in Kolmogorov-Arnold networks". Proceedings of the International Symposium on Information and Communication Technology. Singapore: Springer Nature Singapore. pp. 3–15.
  7. Guo, Chunyu; Sun, Lucheng; Li, Shilong; Yuan, Zelong; Wang, Chao (2025). "Physics-informed Kolmogorov–Arnold network with Chebyshev polynomials for fluid mechanics". Physics of Fluids. 37 (9): 095120. doi:10.1063/5.0284999.
  8. Aghaei, Amirmojtaba A. (2024). "RKAN: Rational Kolmogorov-Arnold Networks". arXiv:2406.14495 [cs.LG].
  9. Liang, J.; Mu, L.; Fang, C. (2025). "Topology Identification of Distribution Network Based on Fourier Kolmogorov–Arnold Networks". IEEJ Transactions on Electrical and Electronic Engineering. 20 (10): 1579–1588. doi:10.1002/tee.70031.
  10. Song, Y.; Zhang, H.; Man, J.; Jin, X.; Li, Q. (2025). "AWKNet: A Lightweight Neural Network for Motor Imagery Electroencephalogram Classification Based on Adaptive Wavelet Transform Kolmogorov–Arnold Networks". IEEE Transactions on Consumer Electronics. 71 (1): 1. doi:10.1109/TCE.2025.3540970.
  11. Zhang, Z.; Wang, Q.; Zhang, Y.; Shen, T.; Zhang, W. (2025). "Physics-informed neural networks with hybrid Kolmogorov–Arnold network and augmented Lagrangian function for solving partial differential equations". Scientific Reports. 15 (1): 10523. doi:10.1038/s41598-025-81853-2. PMID 40148388.
  12. Yeo, S.; Nguyen, P. A.; Le, A. N.; Mishra, S. (2024). "KAN-PDEs: A Novel Approach to Solving Partial Differential Equations Using Kolmogorov-Arnold Networks—Enhanced Accuracy and Efficiency". Proceedings of the International Conference on Electrical and Electronics Engineering. Singapore: Springer Nature Singapore. pp. 43–62.
  13. Hu, Yusong; Liang, Zichen; Yang, Fei; Hou, Qibin; Liu, Xialei; Cheng, Ming-Ming (2025). "KAC: Kolmogorov-Arnold Classifier for Continual Learning". Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 15297–15307.
  14. Li, Longlong; Zhang, Yipeng; Wang, Guanghui; Xia, Kelin (2025). "Kolmogorov–Arnold graph neural networks for molecular property prediction". Nature Machine Intelligence. 7 (8): 1346–1354. doi:10.1038/s42256-025-01087-7.
  15. Yang, Zhen; Mao, Ling; Ye, Liang; Ma, Yuan; Song, Zihan; Chen, Zhe (2025). "AKGNN: When Adaptive Graph Neural Network Meets Kolmogorov-Arnold Network for Industrial Soft Sensors". IEEE Transactions on Instrumentation and Measurement. doi:10.1109/TIM.2025.3512345.