The article describes approaches to automating the analysis of textual lithological core descriptions using large language models (LLMs). It presents the LithoText service, developed by Rosneft Oil Company as part of its digital transformation program. For the first time in the production practice of oil and gas geology, the service combines an LLM with prompt engineering and the domain expertise of lithologists. LithoText automatically determines 16 physical core parameters from the texts of geological reports, including rock type, color, saturation, texture, grain size, fracturing, and type of void space. Lithological data are processed by dedicated artificial intelligence (AI) agents, each responsible for a specific parameter; each parameter is determined by an LLM, a model designed to read and understand natural-language text. A comparison of classical machine learning methods with LLMs demonstrated the superiority of the latter. The service provides automatic extraction of lithological parameters, validation against benchmark datasets, and retrospective analysis of historical data. Pilot implementation in the Company showed a several-fold reduction in processing time and a lower likelihood of human error. The results confirm the potential of LLMs for petroleum geology applications. The scope of application is the analysis of lithological core descriptions.
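The agent-per-parameter design and the benchmark validation described above can be illustrated with a minimal sketch. It is not the LithoText implementation: the class `ParameterAgent`, the functions `extract_all` and `accuracy`, the prompt wording, the allowed values, and the keyword-based `keyword_stub_llm` (a stand-in for a real LLM call) are all hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical LLM interface: takes a prompt string, returns the model's reply.
LLM = Callable[[str], str]


@dataclass
class ParameterAgent:
    """One agent per lithological parameter (names and values are illustrative)."""
    name: str
    allowed_values: List[str]

    def build_prompt(self, description: str) -> str:
        # Prompt wording is an assumption, not the service's actual prompt.
        options = ", ".join(self.allowed_values)
        return (
            "You are a lithologist. From the core description below, "
            f"determine the parameter '{self.name}'. "
            f"Answer with exactly one of: {options}, or 'unknown'.\n\n"
            f"Description: {description}"
        )

    def extract(self, description: str, llm: LLM) -> str:
        reply = llm(self.build_prompt(description)).strip().lower()
        # Reject anything outside the controlled vocabulary.
        return reply if reply in self.allowed_values else "unknown"


def extract_all(description: str, agents: List[ParameterAgent], llm: LLM) -> Dict[str, str]:
    """Run every agent on the same description and collect the results."""
    return {agent.name: agent.extract(description, llm) for agent in agents}


def accuracy(predicted: Dict[str, str], benchmark: Dict[str, str]) -> float:
    """Share of benchmark parameters whose predicted value matches exactly."""
    if not benchmark:
        return 0.0
    hits = sum(predicted.get(key) == value for key, value in benchmark.items())
    return hits / len(benchmark)


def keyword_stub_llm(prompt: str) -> str:
    """Stand-in for a real LLM: returns the first listed option found in the description."""
    header, description = prompt.split("Description: ", 1)
    options = header.split("exactly one of: ", 1)[1].split(", or 'unknown'", 1)[0]
    for option in options.split(", "):
        if option in description.lower():
            return option
    return "unknown"


if __name__ == "__main__":
    agents = [
        ParameterAgent("rock type", ["sandstone", "siltstone", "limestone"]),
        ParameterAgent("color", ["gray", "brown", "white"]),
    ]
    text = "Sandstone, gray, fine-grained, with rare fractures."
    result = extract_all(text, agents, keyword_stub_llm)
    print(result, accuracy(result, {"rock type": "sandstone", "color": "gray"}))
```

In a real deployment the stub would be replaced by a call to an actual model; keeping the LLM behind a plain callable lets the same agents be validated against a benchmark dataset regardless of which model is used.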