Abstract:
Objective With the ongoing intelligent development of China’s natural gas pipeline network, accurate prediction of natural gas consumption has become a key support for optimal dispatching of the pipeline network. Current prediction methods are limited by heavy reliance on complex multi-dimensional factors, narrow user coverage, and inadequate integration of temporal knowledge. The emergence of large language models (LLMs) offers a promising solution; however, existing LLMs lack sufficient industry-specific understanding, which leads to inaccurate predictions. Moreover, research on adapting these models to natural gas consumption prediction remains insufficient.
Methods To this end, a natural gas consumption prediction method based on LLMs and temporal knowledge enhancement was proposed. A temporal knowledge base for natural gas consumption (hereinafter referred to as the “temporal knowledge base”) was constructed to extract regional gas consumption features and to assist LLM-based prediction through similarity retrieval. During construction, dynamic time warping barycenter averaging (DBA) was embedded into the K-means clustering algorithm to avoid the failure of Euclidean distance under time-axis shifts or distortions in the time series. Meanwhile, to improve the understanding of input time series by an LLM with partially frozen parameters, prompt templates were enriched with prior knowledge (data decomposition, similarity retrieval snippets between the input time series and the temporal knowledge base, and statistical descriptors) before being fed to the LLM.
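The motivation for replacing Euclidean distance in the clustering step can be illustrated with a minimal sketch. It assumes plain dynamic time warping (DTW) with a squared pointwise cost and no warping window; the function names and the toy series are hypothetical, and the sketch does not reproduce the DBA averaging step or the clustering pipeline of the proposed method.

```python
import math


def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping distance with squared pointwise cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def euclidean_distance(a, b):
    """Pointwise Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy data: the same consumption peak, delayed by one time step.
base = [0, 0, 1, 3, 1, 0, 0, 0]
shifted = [0, 0, 0, 1, 3, 1, 0, 0]

dtw_gap = dtw_distance(base, shifted)
euclid_gap = euclidean_distance(base, shifted)
```

In this toy case the two series have the same shape, so DTW can warp one onto the other at no cost, whereas Euclidean distance penalizes the one-step delay as if the profiles were different.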
Results Experimental results demonstrated that: (1) Establishing a retrieval mechanism for the temporal knowledge base and constructing prompt templates significantly enhanced the accuracy of natural gas consumption prediction. Compared with traditional methods, this approach exhibited reduced lag, stronger trend- and seasonality-fitting ability, and better handling of highly volatile series owing to a re-programming patch embedding layer. (2) The proposed method achieved significantly better prediction accuracy than other models across four datasets, with average values of 23 635.6 for Root Mean Square Error (RMSE), 10 915.1 for Mean Absolute Error (MAE), 1.9% for Symmetric Mean Absolute Percentage Error (SMAPE), 1.9% for Mean Absolute Percentage Error (MAPE), and 0.96 for R², demonstrating robust generalization. (3) For ultra-long-term load prediction, the integration of rich multimodal prior knowledge enabled the proposed method to achieve the highest accuracy among all models, further reducing RMSE, MAE, SMAPE, and MAPE by an average of 13.62%, 21.49%, 22.21%, and 22.91%, respectively, while maintaining an average R² of 0.95.
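For reference, the five reported metrics can be computed as in the sketch below. It assumes the common textbook definitions; in particular, SMAPE has several variants, and the one shown (absolute error divided by the mean of the absolute actual and predicted values) is an assumption rather than the definition confirmed by the study. The function name and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, SMAPE (%), MAPE (%), and R^2 for two equal-length series."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    mape = 100.0 / n * sum(abs(e) / abs(t) for e, t in zip(errors, y_true))
    # Assumed SMAPE variant: 2|e| / (|actual| + |predicted|), averaged, in percent.
    smape = 100.0 / n * sum(
        2.0 * abs(e) / (abs(t) + abs(p)) for e, t, p in zip(errors, y_true, y_pred)
    )

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"RMSE": rmse, "MAE": mae, "SMAPE": smape, "MAPE": mape, "R2": r2}


# Hypothetical values, not data from the study.
metrics = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
```

RMSE and MAE carry the units of the consumption data, which is why their reported values are large, while SMAPE, MAPE, and R² are scale-free and comparable across datasets.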
Conclusion The research demonstrates that the newly proposed method outperforms existing benchmark approaches for generative prediction of natural gas consumption and offers a novel technical pathway for developing multimodal intelligent decision-making systems, facilitating the evolution of prediction technology from single-scenario to cross-modal collaboration.