Given $x = (x_1, x_2, \ldots, x_n)$ consisting of $n$ variables, a multivariate continuous function $f(x_1, \ldots, x_n)$ can be represented as:

$$f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right) \qquad (1)$$

This formulation contains two nested summations: an outer and an inner sum. The outer sum $\sum_{q=1}^{2n+1}$ aggregates $2n+1$ terms, each involving a function $\Phi_q$. The inner sum $\sum_{p=1}^{n}$ computes $n$ terms for each $q$, where each term $\phi_{q,p}(x_p)$ is a continuous function of the single variable $x_p$.
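To make the structure of equation (1) concrete, the following is a minimal Python sketch that evaluates such a superposition. The specific inner functions $\phi_{q,p}$ and outer functions $\Phi_q$ below are arbitrary illustrative placeholders, not functions from any published KAN.

```python
import numpy as np

def ka_superposition(x, inner_fns, outer_fns):
    """Evaluate f(x) = sum_q Phi_q( sum_p phi_{q,p}(x_p) ), i.e. equation (1).

    x         : array of shape (n,), the n input variables
    inner_fns : list of 2n+1 lists; inner_fns[q][p] is the univariate phi_{q,p}
    outer_fns : list of 2n+1 univariate functions Phi_q
    """
    total = 0.0
    for q, Phi_q in enumerate(outer_fns):
        s = sum(inner_fns[q][p](x[p]) for p in range(len(x)))  # inner sum over p
        total += Phi_q(s)                                      # outer sum over q
    return total

# Illustrative placeholders: n = 2 variables, hence 2n + 1 = 5 outer terms.
n = 2
rng = np.random.default_rng(0)
inner = [[(lambda xp, a=rng.standard_normal(): np.tanh(a * xp)) for _ in range(n)]
         for _ in range(2 * n + 1)]
outer = [(lambda s, b=rng.standard_normal(): b * s) for _ in range(2 * n + 1)]

print(ka_superposition(np.array([0.3, -0.7]), inner, outer))
```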
Liu et al.[1] proposed the name KAN. A general KAN network consisting of $L$ layers takes an input $x$ and generates the output as:

$$\mathrm{KAN}(x) = (\Phi_{L-1} \circ \Phi_{L-2} \circ \cdots \circ \Phi_1 \circ \Phi_0)(x) \qquad (3)$$

Here, $\Phi_l$ is the function matrix of the $l$-th KAN layer, or a set of pre-activations.
Let $i$ denote the neuron of the $l$-th layer and $j$ the neuron of the $(l+1)$-th layer. The activation function $\phi_{l,j,i}$ connects $(l, i)$ to $(l+1, j)$:

$$x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}) \qquad (4)$$

where $n_l$ is the number of nodes of the $l$-th layer.
Thus, the function matrix $\Phi_l$ can be represented as an $n_{l+1} \times n_l$ matrix of activations:

$$\Phi_l = \begin{pmatrix} \phi_{l,1,1}(\cdot) & \phi_{l,1,2}(\cdot) & \cdots & \phi_{l,1,n_l}(\cdot) \\ \phi_{l,2,1}(\cdot) & \phi_{l,2,2}(\cdot) & \cdots & \phi_{l,2,n_l}(\cdot) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{l,n_{l+1},1}(\cdot) & \phi_{l,n_{l+1},2}(\cdot) & \cdots & \phi_{l,n_{l+1},n_l}(\cdot) \end{pmatrix}$$
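The following is a minimal sketch of equations (3) and (4), assuming each edge activation $\phi_{l,j,i}$ is an arbitrary Python callable; practical implementations instead parameterize these edges with trainable bases such as the B-splines discussed below.

```python
import numpy as np

def kan_layer(x_l, phi):
    """Apply one KAN layer (equation 4): x_{l+1,j} = sum_i phi[j][i](x_{l,i}).

    x_l : array of shape (n_l,)
    phi : n_{l+1} x n_l nested list; phi[j][i] is the univariate activation
          on the edge from node (l, i) to node (l+1, j).
    """
    return np.array([sum(phi[j][i](x_l[i]) for i in range(len(x_l)))
                     for j in range(len(phi))])

def kan_forward(x, layers):
    """Compose L layers (equation 3): KAN(x) = (Phi_{L-1} o ... o Phi_0)(x)."""
    for phi in layers:
        x = kan_layer(x, phi)
    return x

# Illustrative 2 -> 3 -> 1 network with placeholder edge functions.
edge = lambda: (lambda t, w=np.random.randn(): np.sin(w * t))
layers = [[[edge() for _ in range(2)] for _ in range(3)],   # Phi_0: 3 x 2
          [[edge() for _ in range(3)] for _ in range(1)]]   # Phi_1: 1 x 3
print(kan_forward(np.array([0.5, -1.2]), layers))
```

Note that, unlike in an MLP, the learnable objects here are the edge functions themselves rather than scalar weights.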
Functions used in KAN
The choice of functional basis strongly influences the performance of KANs. Common function families include:
B-splines: Provide locality, smoothness, and interpretability; most widely used in current implementations (see the sketch after this list).[3]
Radial basis functions (RBFs): Capture localized features in data and are effective in approximating functions with non-linear or clustered structure.[3][6]
Chebyshev polynomials: Offer efficient approximation with minimized error in the maximum norm, making them useful for stable function representation.[3][7]
Rational functions: Useful for approximating functions with singularities or sharp variations, as they can model asymptotic behavior better than polynomials.[3][8]
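As an illustration of the B-spline family above, the sketch below builds a single edge function $\phi$ as a linear combination of a fixed cubic B-spline basis via SciPy; the knot grid and coefficients are hypothetical stand-ins for what a KAN would learn, and training is omitted.

```python
import numpy as np
from scipy.interpolate import BSpline

# Hypothetical setup: cubic splines (k = 3) on a uniform grid over [-1, 1].
k = 3
grid = np.linspace(-1.0, 1.0, 8)                           # interior grid points
t = np.concatenate(([grid[0]] * k, grid, [grid[-1]] * k))  # clamped knot vector
n_coef = len(t) - k - 1                                    # number of basis functions

# In a KAN, c would be the trainable parameters of one edge function phi(x).
rng = np.random.default_rng(0)
c = rng.standard_normal(n_coef)

phi = BSpline(t, c, k)            # phi(x) = sum_m c_m * B_m(x)
xs = np.linspace(-1.0, 1.0, 5)
print(phi(xs))                    # evaluate the edge function on a few inputs
```

Because each basis function $B_m$ has local support, updating one coefficient $c_m$ changes $\phi$ only on a few adjacent knot spans; this is the locality property noted above.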
Applications
Function fitting: KANs outperform MLPs of similar parameter count on tasks such as fitting symbolic formulas or special functions.[1]
Solving partial differential equations (PDEs): A two-layer, 10-width KAN can outperform a four-layer, 100-width MLP by two orders of magnitude in both accuracy and parameter efficiency.[1][11][12]
Continual learning: Because spline adjustments are local, KANs better preserve previously learned information during incremental updates and are less prone to catastrophic forgetting.[2][13]
Scientific discovery: Due to the interpretability of learned functions, KANs have been used as a tool for rediscovering physical or mathematical laws.[2]
Graph neural networks: Extensions such as Kolmogorov–Arnold Graph Neural Networks (KA-GNNs) integrate KAN modules into message-passing architectures, showing improvements in molecular property prediction tasks.[3][14][15]
References
Somvanshi, S.; Javed, S. A.; Islam, M. M.; Pandit, D.; Das, S. (2024). "A Survey on Kolmogorov-Arnold Network". ACM Computing Surveys.
Kolmogorov, A. N. (1963). "On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition". Translations of the American Mathematical Society. 2 (28): 55–59.
Ta, H. T. (2024). "BSRBF-KAN: A Combination of B-splines and Radial Basis Functions in Kolmogorov-Arnold Networks". Proceedings of the International Symposium on Information and Communication Technology. Singapore: Springer Nature Singapore. pp. 3–15.
Guo, Chunyu; Sun, Lucheng; Li, Shilong; Yuan, Zelong; Wang, Chao (2025). "Physics-informed Kolmogorov–Arnold network with Chebyshev polynomials for fluid mechanics". Physics of Fluids. 37 (9): 095120. doi:10.1063/5.0284999.
Aghaei, A. A. (2024). "RKAN: Rational Kolmogorov-Arnold Networks". arXiv:2406.14495 [cs.LG].
Liang, J.; Mu, L.; Fang, C. (2025). "Topology Identification of Distribution Network Based on Fourier Kolmogorov–Arnold Networks". IEEJ Transactions on Electrical and Electronic Engineering. 20 (10): 1579–1588. doi:10.1002/tee.70031.
Song, Y.; Zhang, H.; Man, J.; Jin, X.; Li, Q. (2025). "AWKNet: A Lightweight Neural Network for Motor Imagery Electroencephalogram Classification Based on Adaptive Wavelet Transform Kolmogorov–Arnold Networks". IEEE Transactions on Consumer Electronics. 71 (1): 1. doi:10.1109/TCE.2025.3540970.
Zhang, Z.; Wang, Q.; Zhang, Y.; Shen, T.; Zhang, W. (2025). "Physics-informed neural networks with hybrid Kolmogorov–Arnold network and augmented Lagrangian function for solving partial differential equations". Scientific Reports. 15 (1): 10523. doi:10.1038/s41598-025-81853-2. PMID 40148388.
Yeo, S.; Nguyen, P. A.; Le, A. N.; Mishra, S. (2024). "KAN-PDEs: A Novel Approach to Solving Partial Differential Equations Using Kolmogorov-Arnold Networks—Enhanced Accuracy and Efficiency". Proceedings of the International Conference on Electrical and Electronics Engineering. Singapore: Springer Nature Singapore. pp. 43–62.
Hu, Yusong; Liang, Zichen; Yang, Fei; Hou, Qibin; Liu, Xialei; Cheng, Ming-Ming (2025). "KAC: Kolmogorov-Arnold Classifier for Continual Learning". Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 15297–15307.
Yang, Zhen; Mao, Ling; Ye, Liang; Ma, Yuan; Song, Zihan; Chen, Zhe (2025). "AKGNN: When Adaptive Graph Neural Network Meets Kolmogorov-Arnold Network for Industrial Soft Sensors". IEEE Transactions on Instrumentation and Measurement. doi:10.1109/TIM.2025.3512345.