Automated construction of a risk register in software development based on issue tracking in GitHub projects
DOI:
https://doi.org/10.33216/1998-7927-2025-293-7-17-30
Keywords:
GitHub issue tracking, risk registry, automated classification, software quality management, technical debt, Python script, risk forecasting
Abstract
The article addresses the problem of automating the construction of a risk register in software development from issue tracking data in GitHub Projects. As software products grow more complex and IT project life cycles change faster, traditional manual approaches to risk identification lose effectiveness because information is updated with delay and the human effort required is significant. The author proposes a concept and implements a prototype Python script that automates extraction, pre-processing, multi-level classification, and generation of a structured risk register, so that the process can be integrated directly into the regular project quality management life cycle. The approach classifies risks according to a defined taxonomy of seven main types: engineering, environmental, process, constraint-related, security, behavioral, and external risks. For each risk group, a list of relevant keywords and text indicators is defined; the keyword matching algorithm uses them together with an analysis of labels, comments, and task status. The script was tested on an artificially constructed dataset of more than 150 issues whose varied texts approximate those found in real open-source projects. An important feature of the prototype is multi-label classification: one task can belong to several risk groups at once, which reflects the cross-cutting nature of modern software development problems. The script is also fast: the average processing time for the full dataset does not exceed 1.5 seconds, which makes it possible to integrate the solution into regular CI/CD pipelines through GitHub Actions.
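The keyword matching step described above can be illustrated with a minimal Python sketch. It is not the author's script: the article names the seven risk types but does not list its keywords, so the `RISK_KEYWORDS` entries, the `classify_issue` function, and the issue fields (`title`, `body`, `labels`) are all assumptions made for illustration.

```python
import re

# Hypothetical keyword lists. The seven risk types come from the article;
# the indicator terms below are illustrative assumptions only.
RISK_KEYWORDS = {
    "engineering": ["bug", "crash", "regression", "refactor"],
    "environmental": ["deployment", "docker", "staging", "infrastructure"],
    "process": ["review", "workflow", "release process", "documentation"],
    "constraint": ["deadline", "budget", "resource", "license"],
    "security": ["vulnerability", "xss", "injection", "token leak"],
    "behavioral": ["miscommunication", "conflict", "turnover", "unresponsive"],
    "external": ["third-party", "vendor", "upstream", "api change"],
}


def classify_issue(issue):
    """Return every risk type whose keywords occur in the issue text.

    The result is a list, so one issue can fall into several risk groups
    (or none) at the same time.
    """
    parts = [
        issue.get("title") or "",
        issue.get("body") or "",
        " ".join(issue.get("labels") or []),
    ]
    text = " ".join(parts).lower()

    matched = []
    for risk_type, keywords in RISK_KEYWORDS.items():
        # Word boundaries keep "bug" from matching inside "debug".
        if any(re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in keywords):
            matched.append(risk_type)
    return matched


example_issue = {
    "title": "Crash after Docker deployment",
    "body": "Possible SQL injection in the login form",
    "labels": ["bug"],
}
example_risks = classify_issue(example_issue)
```

For the sample issue, the sketch returns three groups (engineering, environmental, security), which is the kind of overlap the abstract describes.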
The article also outlines a methodological basis for estimating the effectiveness of the approach from recent research, in particular the results of the BEACon-TD model, which achieved F1 scores in the 0.81–0.86 range on real issue tracking datasets. On this basis, the potential accuracy of the proposed approach is analytically estimated at 80–85%, given proper additional training and adaptation of the keywords to the specifics of the project. The solution not only identifies problem areas at early stages but also lays the groundwork for further predictive analytics on how risks affect the schedule, cost, and quality of development. In addition, the script produces a structured register in JSON and Excel formats with customized formatting and visualizes the distribution of risk types with Matplotlib charts, which makes the analysis results more transparent and easier to use. The resulting data can be integrated into quality management systems or passed to stakeholders for timely decision-making. The article concludes that automated construction of a risk register based on GitHub Projects is a promising direction for improving software quality management and technical debt management, and that the developed prototype is an example of the practical implementation of this concept in combination with agile and DevOps approaches.
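As a rough illustration of the register output, the sketch below turns already classified issues into register entries, serializes them to JSON, and counts how often each risk type occurs. The entry fields (`issue`, `title`, `status`, `risk_types`) and the function names are assumptions, since the article does not publish its register schema. The Excel export and the Matplotlib chart are left out so that the example needs only the standard library; the counts it produces are the kind of data such a chart would plot.

```python
import json
from collections import Counter


def build_register(classified_issues):
    """Build register entries from classified issues.

    Each input item is a dict with hypothetical keys 'number', 'title',
    'state' and 'risk_types'. Issues without any risk type are skipped.
    """
    register = []
    for item in classified_issues:
        if not item["risk_types"]:
            continue
        register.append({
            "issue": item["number"],
            "title": item["title"],
            "status": item["state"],
            "risk_types": sorted(item["risk_types"]),
        })
    return register


def risk_distribution(register):
    """Count how many register entries mention each risk type."""
    counts = Counter()
    for entry in register:
        counts.update(entry["risk_types"])
    return dict(counts)


sample = [
    {"number": 1, "title": "Crash on login", "state": "open",
     "risk_types": ["security", "engineering"]},
    {"number": 2, "title": "Update README", "state": "closed",
     "risk_types": []},
    {"number": 3, "title": "Vendor API change", "state": "open",
     "risk_types": ["external", "engineering"]},
]

register = build_register(sample)
register_json = json.dumps(register, indent=2)  # text of the JSON register
distribution = risk_distribution(register)
```

Because one issue can carry several risk types, the counts in `distribution` can add up to more than the number of register entries.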
References
1. Shivashankar K., Orucevic M., Maritsdatter Kruke M., Martini A. BEACon-TD: Classifying Technical Debt and its types across diverse software projects issues using transformers. Journal of Systems and Software. 2025. №226. ISSN 0164-1212. DOI: https://doi.org/10.1016/j.jss.2025.112435.
2. Tang B., Zhang S., Zhu F., Ye A. CAPRA: Context-Aware patch risk assessment for detecting immature vulnerability in open-source software. Computers & Security. 2025. №157. ISSN 0167-4048. DOI: https://doi.org/10.1016/j.cose.2025.104540.
3. Basile C., De Sutter B., Canavese D., Regano L., Coppens B. Design, implementation, and automation of a risk management approach for man-at-the-End software protection. Computers & Security. 2023. №132. ISSN 0167-4048. DOI: https://doi.org/10.1016/j.cose.2023.103321.
4. Van Can A.T., Dalpiaz F. Locating requirements in backlog items: Content analysis and experiments with large language models. Information and Software Technology. 2025. №179. ISSN 0950-5849. DOI: https://doi.org/10.1016/j.infsof.2024.107644.
5. Ramachandran S., Agrahari R., Mudgal P., Bhilwaria H., Long G., Kumar A. Automated Log Classification Using Deep Learning. Procedia Computer Science. 2023. №218. pp. 1722–1732. ISSN 1877-0509. DOI: https://doi.org/10.1016/j.procs.2023.01.150.
6. Liu Z. Design of a Full-Process Transaction Monitoring and Risk Feedback System for DevOps Based on Microservices Architecture and Machine Learning Methods. Procedia Computer Science. 2025. №262. pp. 948–954. ISSN 1877-0509. DOI: https://doi.org/10.1016/j.procs.2025.05.129.
7. Almaiah M.A., Saqr L.M., Al-Rawwash L.A., Altellawi L.A., Al-Ali R., Almomani O. Classification of Cybersecurity Threats, Vulnerabilities and Countermeasures in Database Systems. Computers, Materials and Continua. 2024. №81, iss. 2. pp. 3189–3220. ISSN 1546-2218. DOI: https://doi.org/10.32604/cmc.2024.057673.
8. Hasani H., Freddi F., Piazza R. AI-driven automated and integrated structural health monitoring under environmental and operational variations. Automation in Construction. 2025. №176. ISSN 0926-5805. DOI: https://doi.org/10.1016/j.autcon.2025.106222.
9. Farkas Z., Országh E., Engelhardt T., Zentai A., Süth M., Csorba Sz., Jóźwiak Á. Emerging risk identification in the food chain – A systematic procedure and data analytical options. Innovative Food Science & Emerging Technologies. 2023. №86. ISSN 1466-8564. DOI: https://doi.org/10.1016/j.ifset.2023.103366.
10. Habib A.K.M.A., Hasan M.K., Hassan R., Islam S., Abbas H.S. False data injection attack dataset for classification, identification, and detection for IIoT in Industry 5.0. Data in Brief. 2025. №61. ISSN 2352-3409. DOI: https://doi.org/10.1016/j.dib.2025.111692.
11. Hassan M., Salbitani G., Carfagna S., Khan J.A. Deep learning meets marine biology: Optimized fused features and LIME-driven insights for automated plankton classification. Computers in Biology and Medicine. 2025. №192, part A. ISSN 0010-4825. DOI: https://doi.org/10.1016/j.compbiomed.2025.110273.
12. Sánchez-Hernández A., Román D., Javadi P., Domingo I. Leveraging GIS and SfM photogrammetry for monitoring and risk assessment of rock art sites. Digital Applications in Archaeology and Cultural Heritage. 2025. №37. ISSN 2212-0548. DOI: https://doi.org/10.1016/j.daach.2025.e00413.
13. Santarsiero G. Automated assessment of bridge guardrails for regional prioritization based on open-source data and deep learning algorithms. Results in Engineering. 2025. №26. ISSN 2590-1230. DOI: https://doi.org/10.1016/j.rineng.2025.105210.
14. Hossain S.T., Yigitcanlar T., Nguyen K., Xu Y. Platform urbanism for resident safety: A real-time predictive microclimate risk monitoring and alert system. Urban Climate. 2025. №61. ISSN 2212-0955. DOI: https://doi.org/10.1016/j.uclim.2025.102445.