The intelligent construction of an underground space knowledge base will become an important foundation for promoting the scientific planning of urban underground space. This paper proposes building an industry-level underground space knowledge base on top of a large language model, to enable intelligent management and innovative application of underground space expertise. The knowledge base system, oriented toward industry applications, is constructed through text data collection, manual pre-annotation, and fine-tuning of the ChatGLM model. Test results show that building an industry knowledge base makes it possible to customize the underground space corpus, enhance the ChatGLM model, and achieve more accurate professional question answering. This research provides a reference for building a high-quality knowledge innovation platform for the underground space industry; it helps improve the scientific rigor and efficiency of underground space planning and design, and promotes the rational utilization and sustainable development of urban underground space.