The semantic power of text content as a flow of a vector field of embeddings


Viktor Stashkiv
Andrii Khamarchuk
Kyrylo Chornopyskyi
Vladyslav Shumeiko
Maksym Chorniak
Karina Yarosh
Valentyna Tserkovniuk
Oleh Pastukh

Abstract

The growing volume of textual data demands advanced methods for evaluating both content effectiveness and semantic structure. While current Natural Language Processing (NLP) techniques offer powerful tools, they often lack metrics for quantifying intrinsic semantic intensity or conceptual coherence. This paper introduces “semantic power” – a novel quantitative measure designed to analyze the conceptual structure and semantic richness of texts, grounded in principles of field theory. The proposed methodology draws on the Ostrogradsky–Gauss theorem and the divergence operator, establishing a theoretical link between local semantic properties of a text (derived from LaBSE vector embeddings) and their global influence. The approach involves computing a semantic centroid, representing the point of highest meaning concentration, and measuring semantic power using a model that assumes an inverse-square decay of vector influence. For further analysis, Gaussian Mixture Model (GMM) clustering is applied, and Principal Component Analysis (PCA) is used for dimensionality reduction and visualization. Experiments on philosophical texts by key Early Modern thinkers – G. W. Leibniz, R. Descartes, and I. Kant – reveal distinct and meaningful variations in semantic power (0.6010, 0.5633, and 0.5787, respectively) and in the resulting clustering patterns (2, 7, and 2 clusters). These findings suggest that semantic power is not merely a numerical descriptor but a quantity that correlates with the authors' established intellectual styles and methodological orientations. As such, semantic power emerges as an objective metric for assessing the deep cognitive and semantic dimensions of textual content, with potential applications in philology, cognitive science, computational linguistics, and related disciplines.
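For orientation, the field-theoretic backbone named in the abstract is the Ostrogradsky–Gauss (divergence) theorem, which equates the flux of a vector field $\mathbf{F}$ through a closed surface with the integral of its divergence over the enclosed volume:

$$\oint_{\partial V} \mathbf{F} \cdot d\mathbf{S} = \int_{V} \left( \nabla \cdot \mathbf{F} \right)\, dV .$$

The computational pipeline the abstract describes can be sketched as follows. This is a minimal illustration assuming the sentence-transformers release of LaBSE and scikit-learn for GMM and PCA; the function names and the regularized inverse-square weighting are assumptions made for exposition, not the authors' exact published formula.

```python
# Minimal sketch of the pipeline: LaBSE embeddings -> semantic centroid ->
# inverse-square influence score -> GMM clustering -> PCA projection.
# The 1/(1 + d^2) regularization is an illustrative assumption.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.mixture import GaussianMixture
from sklearn.decomposition import PCA

model = SentenceTransformer("sentence-transformers/LaBSE")

def semantic_power(sentences: list[str]) -> float:
    # Embed each sentence into the 768-dimensional LaBSE space.
    vectors = model.encode(sentences)
    # Semantic centroid: the point of highest meaning concentration.
    centroid = vectors.mean(axis=0)
    # Inverse-square decay of each vector's influence at the centroid,
    # regularized so a vector located at the centroid contributes 1.0.
    distances = np.linalg.norm(vectors - centroid, axis=1)
    influence = 1.0 / (1.0 + distances ** 2)
    return float(influence.mean())

def cluster_and_project(sentences: list[str], n_clusters: int):
    vectors = model.encode(sentences)
    # GMM clustering in the full embedding space; diagonal covariances
    # keep the 768-dimensional fit tractable.
    labels = GaussianMixture(
        n_components=n_clusters, covariance_type="diag", random_state=0
    ).fit_predict(vectors)
    # PCA to two components, purely for visualization.
    coords = PCA(n_components=2).fit_transform(vectors)
    return labels, coords
```

Under this sketch, the per-author scores reported above (e.g., 0.6010 for Leibniz) would correspond to semantic_power evaluated over each author's sentence set, with the GMM component count (2, 7, or 2) selected by a model-selection criterion such as BIC.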


References

1. Yao, L., Mao, C., & Luo, Y. (2019). Graph convolutional networks for text classification. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 7370–7377. https://doi.org/10.1609/aaai.v33i01.33017370

2. Kozlowski, D., Dusdal, J., Pang, J., & Zilian, A. (2021). Semantic and relational spaces in science of science: Deep learning models for article vectorisation. Scientometrics. https://doi.org/10.1007/s11192-021-03984-1

3. Liu, B., Guan, W., Yang, C., Fang, Z., & Lu, Z. (2023). Transformer and graph convolutional network for text classification. International Journal of Computational Intelligence Systems, 16(1). https://doi.org/10.1007/s44196-023-00337-z

4. Wang, B., Li, Q., Melucci, M., & Song, D. (2019). Semantic Hilbert space for text representation learning. In The World Wide Web Conference. ACM Press. https://doi.org/10.1145/3308558.3313516

5. Vyas, Y., Niu, X., & Carpuat, M. (2018). Identifying semantic divergences in parallel text without annotations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/n18-1136

6. Zeng, D., Zha, E., Kuang, J., & Shen, Y. (2024). Multi-label text classification based on semantic-sensitive graph convolutional network. Knowledge-Based Systems, 284, 111303. https://doi.org/10.1016/j.knosys.2023.111303

7. Tekgöz, H., İlhan Omurca, S., Koç, K. Y., Topçu, U., & Çelik, O. (2022). Semantic similarity comparison between production line failures for predictive maintenance. Advances in Artificial Intelligence Research. https://doi.org/10.54569/aair.1142568

8. Premalatha, M., Viswanathan, V., & Čepová, L. (2022). Application of semantic analysis and LSTM-GRU in developing a personalized course recommendation system. Applied Sciences, 12(21), 10792. https://doi.org/10.3390/app122110792

9. Narendra, G. O., & Hashwanth, S. (2022). Named entity recognition based resume parser and summarizer. International Journal of Advanced Research in Science, Communication and Technology, 728–735. https://doi.org/10.48175/ijarsct-3029

10. Venkatesh, D., & Raman, S. (2024). BITS Pilani at SemEval-2024 Task 1: Using text-embedding-3-large and LaBSE embeddings for semantic textual relatedness. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.semeval-1.124

11. Feng, F., Yang, Y., Cer, D., Arivazhagan, N., & Wang, W. (2022). Language-agnostic BERT sentence embedding. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.62

12. Kesiraju, S., Plchot, O., Burget, L., & Gangashetty, S. V. (2020). Learning document embeddings along with their uncertainties. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 2319–2332. https://doi.org/10.1109/taslp.2020.3012062

13. Hu, C., Wu, T., Liu, S., Liu, C., Ma, T., & Yang, F. (2024). Joint unsupervised contrastive learning and robust GMM for text clustering. Information Processing & Management, 61(1), 103529. https://doi.org/10.1016/j.ipm.2023.103529

14. Chesanovsky, I., & Levhunets, D. (2017). Representation of narrow-band radio signals with angular modulation in trunked radio systems using the principal component analysis. Scientific Journal of the Ternopil National Technical University, 86(2), 117–121. https://elartu.tntu.edu.ua/handle/lib/22368

15. Musil, T. (2019). Examining structure of word embeddings with PCA. In Text, Speech, and Dialogue (pp. 211–223). Springer International Publishing. https://doi.org/10.1007/978-3-030-27947-9_18
