Generative Artificial Intelligence for Software Engineering: References

3 May 2024

This paper is available on arXiv under a CC 4.0 license.


(1) Anh Nguyen-Duc, University of South-Eastern Norway, Bø i Telemark, Norway 3800 and Norwegian University of Science and Technology, Trondheim, Norway 7012;

(2) Beatriz Cabrero-Daniel, University of Gothenburg, Gothenburg, Sweden;

(3) Adam Przybylek, Gdansk University of Technology, Gdansk, Poland;

(4) Chetan Arora, Monash University, Melbourne, Australia;

(5) Dron Khanna, Free University of Bozen-Bolzano, Bolzano, Italy;

(6) Tomas Herda, Austrian Post - Konzern IT, Vienna, Austria;

(7) Usman Rafiq, Free University of Bozen-Bolzano, Bolzano, Italy;

(8) Jorge Melegati, Free University of Bozen-Bolzano, Bolzano, Italy;

(9) Eduardo Guerra, Free University of Bozen-Bolzano, Bolzano, Italy;

(10) Kai-Kristian Kemell, University of Helsinki, Helsinki, Finland;

(11) Mika Saari, Tampere University, Tampere, Finland;

(12) Zheying Zhang, Tampere University, Tampere, Finland;

(13) Huy Le, Vietnam National University Ho Chi Minh City, Hochiminh City, Vietnam and Ho Chi Minh City University of Technology, Hochiminh City, Vietnam;

(14) Tho Quan, Vietnam National University Ho Chi Minh City, Hochiminh City, Vietnam and Ho Chi Minh City University of Technology, Hochiminh City, Vietnam;

(15) Pekka Abrahamsson, Tampere University, Tampere, Finland.


[1] R. A. Poldrack, T. Lu, G. Beguš, AI-assisted coding: Experiments with GPT-4. arXiv:2304.13187[cs], doi:10.48550/arXiv.2304.13187.

[2] P. Denny, V. Kumar, N. Giacaman, Conversing with copilot: Exploring prompt engineering for solving CS1 problems using natural language. arXiv:2210.15157[cs], doi:10.48550/arXiv.2210.15157.

[3] J.-B. Döderlein, M. Acher, D. E. Khelladi, B. Combemale, Piloting copilot and codex: Hot temperature, cold prompts, or black magic? arXiv:2210.14699[cs], doi:10.48550/arXiv.2210.14699.

[4] Y. Dong, X. Jiang, Z. Jin, G. Li, Self-collaboration code generation via ChatGPT. arXiv:2304.07590[cs], doi:10.48550/arXiv.2304.07590.

[5] S. Ouyang, J. M. Zhang, M. Harman, M. Wang, LLM is like a box of chocolates: the non-determinism of ChatGPT in code generation. arXiv:2308.02828[cs], doi:10.48550/arXiv.2308.02828.

[6] Y. Liu, G. Deng, Z. Xu, Y. Li, Y. Zheng, Y. Zhang, L. Zhao, T. Zhang, Y. Liu, Jailbreaking ChatGPT via prompt engineering: An empirical study. arXiv:2305.13860[cs], doi:10.48550/arXiv.2305.13860.

[7] W. Sun, C. Fang, Y. You, Y. Miao, Y. Liu, Y. Li, G. Deng, S. Huang, Y. Chen, Q. Zhang, H. Qian, Y. Liu, Z. Chen, Automatic code summarization via ChatGPT: How far are we? arXiv:2305.12865[cs], doi:10.48550/arXiv.2305.12865.

[8] H. Alkaissi, S. I. McFarlane, Artificial hallucinations in ChatGPT: Implications in scientific writing 15 (2) e35179. doi:10.7759/cureus.35179.

[9] A. Carleton, F. Shull, E. Harper, Architecting the future of software engineering, Computer 55 (9) 89–93. doi:10.1109/MC.2022.3187912.

[10] S. Jansen, A. Finkelstein, S. Brinkkemper, A sense of community: A research agenda for software ecosystems, in: 2009 31st International Conference on Software Engineering - Companion Volume, pp. 187–190. doi:10.1109/ICSE-COMPANION.2009.5070978.

[11] B. Sengupta, S. Chandra, V. Sinha, A research agenda for distributed software development, in: Proceedings of the 28th international conference on Software engineering, ICSE ’06, Association for Computing Machinery, pp. 731–740. doi:10.1145/1134285.1134402.

[12] J. Bosch, H. H. Olsson, I. Crnkovic, Engineering AI systems: A research agenda, in: Artificial Intelligence Paradigms for Smart Cyber-Physical Systems, IGI Global, pp. 1–19. doi:10.4018/978-1-7998-5101-1.ch001.

[13] I. Sriram, A. Khajeh-Hosseini, Research agenda in cloud technologies. doi:10.48550/arXiv.1001.3259.

[14] R. France, B. Rumpe, Model-driven development of complex software: A research roadmap. doi:10.1109/FOSE.2007.14.

[15] M. P. Papazoglou, P. Traverso, S. Dustdar, F. Leymann, Service-oriented computing: a research roadmap 17 (2) 223–255, World Scientific Publishing Co. doi:10.1142/S0218843008001816.

[16] M. Barenkamp, J. Rebstadt, O. Thomas, Applications of ai in classical software engineering, AI Perspectives 2 (1) (2020) 1.

[17] S. Martínez-Fernández, J. Bogner, X. Franch, M. Oriol, J. Siebert, A. Trendowicz, A. M. Vollmer, S. Wagner, Software engineering for AI-based systems: A survey, ACM Trans. Softw. Eng. Methodol. 31 (2) (Apr. 2022).

[18] Z. Kotti, R. Galanopoulou, D. Spinellis, Machine learning for software engineering: A tertiary study 55 (12) 256:1–256:39. doi:10.1145/3572905.

[19] I. Paik, J.-W. Wang, Improving text-to-code generation with features of code graph on gpt-2, Electronics 10 (21) (2021) 2706.

[20] M. Jovanović, M. Campbell, Generative artificial intelligence: Trends and prospects 55 (10) 107–112. doi:10.1109/MC.2022.3192720.

[21] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language models are few-shot learners, in: Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS’20, Curran Associates Inc., pp. 1877–1901.

[22] A. Radford, K. Narasimhan, Improving language understanding by generative pre-training.

[23] B. D. Lund, T. Wang, Chatting about ChatGPT: how may AI and GPT impact academia and libraries? ahead-of-print. doi:10.1108/LHTN-01-2023-0009.

[24] T. Brants, A. C. Popat, P. Xu, F. J. Och, J. Dean, Large language models in machine translation (2007).

[25] K. Heafield, I. Pouzyrevsky, J. H. Clark, P. Koehn, Scalable modified kneser-ney language model estimation, in: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2013, pp. 690–696.

[26] Y. Bengio, R. Ducharme, P. Vincent, A neural probabilistic language model, Advances in neural information processing systems 13 (2000).

[27] T. Mikolov, M. Karafiát, L. Burget, J. Černocký, S. Khudanpur, Recurrent neural network based language model, in: Interspeech, Vol. 2, Makuhari, 2010, pp. 1045–1048.

[28] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781 (2013).

[29] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, Advances in neural information processing systems 30 (2017).

[30] J. Lee, K. Toutanova, Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).

[31] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al., Language models are unsupervised multitask learners, OpenAI blog 1 (8) (2019) 9.

[32] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre-training of deep bidirectional transformers for language understanding (2019). arXiv:1810.04805.

[33] A. Chowdhery, S. Narang, J. Devlin, M. Bosma, G. Mishra, A. Roberts, P. Barham, H. W. Chung, C. Sutton, S. Gehrmann, P. Schuh, K. Shi, S. Tsvyashchenko, J. Maynez, A. Rao, P. Barnes, Y. Tay, N. Shazeer, V. Prabhakaran, E. Reif, N. Du, B. Hutchinson, R. Pope, J. Bradbury, J. Austin, M. Isard, G. Gur-Ari, P. Yin, T. Duke, A. Levskaya, S. Ghemawat, S. Dev, H. Michalewski, X. Garcia, V. Misra, K. Robinson, L. Fedus, D. Zhou, D. Ippolito, D. Luan, H. Lim, B. Zoph, A. Spiridonov, R. Sepassi, D. Dohan, S. Agrawal, M. Omernick, A. M. Dai, T. S. Pillai, M. Pellat, A. Lewkowycz, E. Moreira, R. Child, O. Polozov, K. Lee, Z. Zhou, X. Wang, B. Saeta, M. Diaz, O. Firat, M. Catasta, J. Wei, K. Meier-Hellstern, D. Eck, J. Dean, S. Petrov, N. Fiedel, Palm: Scaling language modeling with pathways (2022). arXiv:2204.02311.

[34] V. Lialin, V. Deshpande, A. Rumshisky, Scaling down to scale up: A guide to parameter-efficient fine-tuning (2023). arXiv:2303.15647.

[35] A. Martakis, M. Daneva, Handling requirements dependencies in agile projects: A focus group with agile software development practitioners, in: IEEE 7th International Conference on Research Challenges in Information Science (RCIS), pp. 1–11, ISSN: 2151-1357. doi:10.1109/RCIS.2013.6577679.

[36] J. Kontio, J. Bragge, L. Lehtola, The focus group method as an empirical tool in software engineering, in: F. Shull, J. Singer, D. I. K. Sjøberg (Eds.), Guide to Advanced Empirical Software Engineering, Springer, pp. 93–116.

[37] J. Kontio, L. Lehtola, J. Bragge, Using the focus group method in software engineering: obtaining practitioner and user experiences, in: Proceedings. 2004 International Symposium on Empirical Software Engineering, 2004. ISESE ’04., pp. 271–280.

[38] T. Dingsøyr, Y. Lindsjørn, Team performance in agile development teams: Findings from 18 focus groups, in: H. Baumeister, B. Weber (Eds.), Agile Processes in Software Engineering and Extreme Programming, Lecture Notes in Business Information Processing, Springer, pp. 46–60.

[39] H. Edmunds, Focus Group Research Handbook, 1st Edition, McGraw-Hill.

[40] M. Sosnowski, M. Bereza, Y. Y. Ng, Business-oriented approach to requirements elicitation in a scrum project, in: A. Przybyłek, J. Miler, A. Poth, A. Riel (Eds.), Lean and Agile Software Development, Springer International Publishing, Cham, 2021, pp. 185–191.

[41] A. Przybyłek, M. Zakrzewski, Adopting collaborative games into agile requirements engineering, in: Proceedings of the 13th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2018, SCITEPRESS, 2018, pp. 54–64. doi:10.5220/0006681900540064.

[42] I. Inayat, S. S. Salim, S. Marczak, M. Daneva, S. Shamshirband, A systematic literature review on agile requirements engineering practices and challenges, Computers in Human Behavior 51 (2015) 915–929. doi:10.1016/j.chb.2014.10.046.

[43] B. Ramesh, L. Cao, R. Baskerville, Agile requirements engineering practices and challenges: an empirical study, Information Systems Journal 20 (5) (2010) 449–480.

[44] K. Ahmad, M. Abdelrazek, C. Arora, A. A. Baniya, M. Bano, J. Grundy, Requirements engineering framework for human-centered artificial intelligence software systems, Applied Soft Computing 143 (2023) 110455.

[45] K. Ahmad, M. Abdelrazek, C. Arora, M. Bano, J. Grundy, Requirements engineering for artificial intelligence systems: A systematic mapping study, Information and Software Technology (2023) 107176.

[46] J. White, S. Hays, Q. Fu, J. Spencer-Smith, D. C. Schmidt, ChatGPT prompt patterns for improving code quality, refactoring, requirements elicitation, and software design. arXiv:2303.07839[cs], doi:10.48550/arXiv.2303.07839. URL

[47] K. Ronanki, C. Berger, J. Horkoff, Investigating chatgpt’s potential to assist in requirements elicitation processes (2023) 354–361.

[48] Q. Zhang, M. Chen, A. Bukharin, P. He, Y. Cheng, W. Chen, T. Zhao, Adaptive budget allocation for parameter-efficient fine-tuning (2023). arXiv:2303.10512.

[49] S. Arulmohan, M.-J. Meurs, S. Mosser, Extracting domain models from textual requirements in the era of large language models, MDEIntelligence (co-located with ACM/IEEE 26th International Conference on . . . , 2023.

[50] S. Ezzini, S. Abualhaija, C. Arora, M. Sabetzadeh, Automated handling of anaphoric ambiguity in requirements: a multi-solution study, in: Proceedings of the 44th International Conference on Software Engineering, 2022, pp. 187–199.

[51] A. Moharil, A. Sharma, Tabasco: A transformer based contextualization toolkit, Science of Computer Programming 230 (2023) 102994.

[52] C. Arora, J. Grundy, M. Abdelrazek, Advancing requirements engineering through generative ai: Assessing the role of llms (2023). arXiv:2310.13976.

[53] J. Zhang, Y. Chen, N. Niu, C. Liu, A preliminary evaluation of chatgpt in requirements information retrieval, arXiv preprint arXiv:2304.12562 (2023).

[54] G. De Vito, F. Palomba, C. Gravino, S. Di Martino, F. Ferrucci, et al., Echo: An approach to enhance use case quality exploiting large language models, in: 2023 49th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2023, pp. 53–60.

[55] S. Abualhaija, C. Arora, A. Sleimi, L. C. Briand, Automated question answering for improved understanding of compliance requirements: A multi-document study, in: 2022 IEEE 30th International Requirements Engineering Conference (RE), IEEE, 2022, pp. 39–50.

[56] S. Ezzini, S. Abualhaija, C. Arora, M. Sabetzadeh, Ai-based question answering assistance for analyzing natural-language requirements, in: 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), IEEE Computer Society, Los Alamitos, CA, USA, 2023, pp. 1277–1289. doi:10.1109/ICSE48619.2023.00113.

[57] T. Hey, J. Keim, A. Koziolek, W. F. Tichy, Norbert: Transfer learning for requirements classification, in: 2020 IEEE 28th International Requirements Engineering Conference (RE), 2020, pp. 169–179. doi:10.1109/RE48521.2020.00028.

[58] X. Luo, Y. Xue, Z. Xing, J. Sun, Prcbert: Prompt learning for requirement classification using bert-based pretrained language models, in: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, ASE ’22, Association for Computing Machinery, New York, NY, USA, 2023. doi:10.1145/3551349.3560417.

[59] B. Chen, K. Chen, S. Hassani, Y. Yang, D. Amyot, L. Lessard, G. Mussbacher, M. Sabetzadeh, D. Varró, On the use of gpt-4 for creating goal models: An exploratory study, in: MoDRE workshop at Requirement Engineering conference. To appear, 2023.

[60] D. Hidellaarachchi, J. Grundy, R. Hoda, I. Mueller, The influence of human aspects on requirements engineering-related activities: Software practitioners’ perspective, ACM Transactions on Software Engineering and Methodology 32 (5) (2023) 1–37.

[61] Y. Li, S. Zhang, J. Sun, W. Zhang, Y. Du, Y. Wen, X. Wang, W. Pan, Tackling cooperative incompatibility for zero-shot human-AI coordination. arXiv:2306.03034[cs], doi:10.48550/arXiv.2306.03034.

[62] I. Issaoui, N. Bouassida, H. Ben-Abdallah, A new approach for interactive design pattern recommendation, Lecture Notes on Software Engineering 3 (3) (2015) 173.

[63] N. Nahar, T. S. U. Mahmud, K. Sakib, An improved behavioral matching for anti-pattern based abstract factory recommendation, in: 2016 5th International Conference on Informatics, Electronics and Vision (ICIEV), IEEE, 2016, pp. 35–40.

[64] J. E. Van Engelen, H. H. Hoos, A survey on semi-supervised learning, Machine learning 109 (2) (2020) 373–440.

[65] C. Bou, N. Laosen, E. Nantajeewarawat, Design pattern ranking based on the design pattern intent ontology, in: Intelligent Information and Database Systems: 10th Asian Conference, ACIIDS 2018, Dong Hoi City, Vietnam, March 19-21, 2018, Proceedings, Part I 10, Springer, 2018, pp. 25–35.

[66] A. Ahmad, M. Waseem, P. Liang, M. Fahmideh, M. S. Aktar, T. Mikkonen, Towards human-bot collaborative software architecting with chatgpt, in: Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering, EASE ’23, Association for Computing Machinery, New York, NY, USA, 2023, pp. 279–285. doi:10.1145/3593434.3593468.

[67] S. Herold, C. Knieke, M. Schindler, A. Rausch, Towards improving software architecture degradation mitigation by machine learning.

[68] J. J. Maranhão Junior, F. F. Correia, E. Guerra, Can chatgpt suggest patterns? an exploratory study about answers given by ai-assisted tools to design problems, in: AI-Assisted Agile Software Development Workshop at XP2023 conference, Springer, 2023.

[69] T. Stojanovic, S. D. Lazarević, The application of ChatGPT for identification of microservices 3 (1) 99–105.

[70] R. Feldt, S. Kang, J. Yoon, S. Yoo, Towards autonomous testing agents via conversational large language models. arXiv:2306.05152[cs], doi:10.48550/arXiv.2306.05152.

[71] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of reusable object-oriented software (1995).

[72] L. Viviani, E. Guerra, J. Melegati, X. Wang, An empirical study about the instability and uncertainty of non-functional requirements, in: International Conference on Agile Software Development, Springer Nature Switzerland Cham, 2023, pp. 77–93.

[73] J. White, Q. Fu, S. Hays, M. Sandborn, C. Olea, H. Gilbert, A. Elnashar, J. Spencer-Smith, D. C. Schmidt, A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv:2302.11382[cs], doi:10.48550/arXiv.2302.11382.

[74] J. Hoffmann, S. Borgeaud, A. Mensch, E. Buchatskaya, T. Cai, E. Rutherford, D. de Las Casas, L. A. Hendricks, J. Welbl, A. Clark, T. Hennigan, E. Noland, K. Millican, G. van den Driessche, B. Damoc, A. Guy, S. Osindero, K. Simonyan, E. Elsen, J. W. Rae, O. Vinyals, L. Sifre, Training compute-optimal large language models (2022). arXiv:2203.15556.

[75] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, A. Rodriguez, A. Joulin, E. Grave, G. Lample, Llama: Open and efficient foundation language models (2023). arXiv:2302.13971.

[76] Andersen v. Stability AI Ltd. on CourtListener, accessed on 2023-10-05.

[77] Doe v. GitHub Inc. on CourtListener, accessed on 2023-10-05.

[78] A. Moradi Dakhel, V. Majdinasab, A. Nikanjam, F. Khomh, M. C. Desmarais, Z. M. J. Jiang, GitHub copilot AI pair programmer: Asset or liability? 203 111734. doi:10.1016/j.jss.2023.111734.

[79] S. Imai, Is GitHub copilot a substitute for human pair-programming? an empirical study, in: 2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pp. 319–321, ISSN: 2574-1926. doi:10.1145/3510454.3522684.

[80] M. L. Siddiq, A. Samee, S. R. Azgor, M. A. Haider, S. I. Sawraz, J. C. S. Santos, Zero-shot prompting for code complexity prediction using GitHub copilot, in: 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE), pp. 56–59. doi:10.1109/NLBSE59153.2023.00018.

[81] S. Jiang, Y. Wang, Y. Wang, SelfEvolve: A code evolution framework via large language models. arXiv:2306.02907[cs], doi:10.48550/arXiv.2306.02907.

[82] Y. Dong, X. Jiang, Z. Jin, G. Li, Self-collaboration code generation via chatgpt (2023). arXiv:2304.07590.

[83] A. Borji, A categorical archive of ChatGPT failures. arXiv:2302.03494[cs], doi:10.48550/arXiv.2302.03494.

[84] J. Sun, Q. V. Liao, M. Muller, M. Agarwal, S. Houde, K. Talamadupula, J. D. Weisz, Investigating explainability of generative AI for code through scenario-based design, in: 27th International Conference on Intelligent User Interfaces, IUI ’22, Association for Computing Machinery, pp. 212–228. doi:10.1145/3490099.3511119.

[85] H. Li, Y. Hao, Y. Zhai, Z. Qian, The hitchhiker’s guide to program analysis: A journey with large language models (2023). arXiv:2308.00245.

[86] B. Yetiştiren, I. Özsoy, M. Ayerdem, E. Tüzün, Evaluating the code quality of AI-assisted code generation tools: An empirical study on GitHub copilot, Amazon CodeWhisperer, and ChatGPT. arXiv:2304.10778[cs], doi:10.48550/arXiv.2304.10778.

[87] P. Salza, C. Schwizer, J. Gu, H. C. Gall, On the effectiveness of transfer learning for code search 49 (4) 1804–1822. arXiv:2108.05890[cs], doi:10.1109/TSE.2022.3192755.

[88] F. Chen, F. Fard, D. Lo, T. Bryksin, On the transferability of pre-trained language models for low-resource programming languages. arXiv:2204.09653[cs], doi:10.48550/arXiv.2204.09653.

[89] Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, M. Zhou, CodeBERT: A pre-trained model for programming and natural languages. arXiv:2002.08155[cs], doi:10.48550/arXiv.2002.08155.

[90] D. Guo, S. Ren, S. Lu, Z. Feng, D. Tang, S. Liu, L. Zhou, N. Duan, A. Svyatkovskiy, S. Fu, M. Tufano, S. K. Deng, C. Clement, D. Drain, N. Sundaresan, J. Yin, D. Jiang, M. Zhou, GraphCodeBERT: Pretraining code representations with data flow. arXiv:2009.08366[cs], doi:10.48550/arXiv.2009.08366.

[91] W. Gu, Z. Li, C. Gao, C. Wang, H. Zhang, Z. Xu, M. R. Lyu, CRaDLe: Deep code retrieval based on semantic dependency learning 141 385–394. arXiv:2012.01028[cs], doi:10.1016/j.neunet.2021.04.019.

[92] D. Li, Y. Shen, R. Jin, Y. Mao, K. Wang, W. Chen, Generation-augmented query expansion for code retrieval. arXiv:2212.10692[cs], doi:10.48550/arXiv.2212.10692.

[93] M. Wei, N. S. Harzevili, Y. Huang, J. Wang, S. Wang, CLEAR: Contrastive learning for API recommendation, in: 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), pp. 376–387. doi:10.1145/3510003.3510159.

[94] K. Zhang, H. Zhang, G. Li, J. Li, Z. Li, Z. Jin, ToolCoder: Teach code generation models to use API search tools. arXiv:2305.04032[cs], doi:10.48550/arXiv.2305.04032.

[95] S. G. Patil, T. Zhang, X. Wang, J. E. Gonzalez, Gorilla: Large language model connected with massive APIs. arXiv:2305.15334[cs], doi:10.48550/arXiv.2305.15334.

[96] A. Mastropaolo, S. Scalabrino, N. Cooper, D. Nader Palacio, D. Poshyvanyk, R. Oliveto, G. Bavota, Studying the usage of text-to-text transfer transformer to support code-related tasks, in: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), 2021, pp. 336–347. doi:10.1109/ICSE43902.2021.00041.

[97] M. Geng, S. Wang, D. Dong, H. Wang, G. Li, Z. Jin, X. Mao, X. Liao, Large language models are few-shot summarizers: Multi-intent comment generation via in-context learning. arXiv:2304.11384[cs], doi:10.48550/arXiv.2304.11384.

[98] P. Vaithilingam, T. Zhang, E. L. Glassman, Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models, in: Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, CHI EA ’22, Association for Computing Machinery, pp. 1–7. doi:10.1145/3491101.3519665.

[99] S. Barke, M. B. James, N. Polikarpova, Grounded copilot: How programmers interact with code-generating models 7 78:85–78:111. doi:10.1145/3586030.

[100] A. Ziegler, E. Kalliamvakou, X. A. Li, A. Rice, D. Rifkin, S. Simister, G. Sittampalam, E. Aftandilian, Productivity assessment of neural code completion, in: Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, MAPS 2022, Association for Computing Machinery, pp. 21–29. doi:10.1145/3520312.3534864.

[101] H. Pearce, B. Ahmad, B. Tan, B. Dolan-Gavitt, R. Karri, Asleep at the keyboard? assessing the security of GitHub copilot’s code contributions, in: 2022 IEEE Symposium on Security and Privacy (SP), pp. 754–768, ISSN: 2375-1207. doi:10.1109/SP46214.2022.9833571.

[102] C. Bird, D. Ford, T. Zimmermann, N. Forsgren, E. Kalliamvakou, T. Lowdermilk, I. Gazit, Taking flight with copilot: Early insights and opportunities of AI-powered pair-programming tools 20 (6) 10:35–10:57. doi:10.1145/3582083.

[103] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, Lora: Low-rank adaptation of large language models (2021). arXiv:2106.09685.

[104] X. L. Li, P. Liang, Prefix-tuning: Optimizing continuous prompts for generation, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Association for Computational Linguistics, Online, 2021, pp. 4582–4597. doi:10.18653/v1/2021.acl-long.353.

[105] X. Liu, K. Ji, Y. Fu, W. L. Tam, Z. Du, Z. Yang, J. Tang, P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks (2022). arXiv:2110.07602.

[106] X. Liu, Y. Zheng, Z. Du, M. Ding, Y. Qian, Z. Yang, J. Tang, Gpt understands, too (2021). arXiv:2103.10385.

[107] B. Lester, R. Al-Rfou, N. Constant, The power of scale for parameter-efficient prompt tuning (2021). arXiv:2104.08691.

[108] Z. Wang, R. Panda, L. Karlinsky, R. Feris, H. Sun, Y. Kim, Multitask prompt tuning enables parameter-efficient transfer learning (2023). arXiv:2303.02861.

[109] H. Liu, D. Tam, M. Muqeeth, J. Mohta, T. Huang, M. Bansal, C. Raffel, Few-shot parameter efficient fine-tuning is better and cheaper than in-context learning (2022). arXiv:2205.05638.

[110] R. Li, L. B. Allal, Y. Zi, N. Muennighoff, D. Kocetkov, C. Mou, M. Marone, C. Akiki, J. Li, J. Chim, Q. Liu, E. Zheltonozhskii, T. Y. Zhuo, T. Wang, O. Dehaene, M. Davaadorj, J. Lamy-Poirier, J. Monteiro, O. Shliazhko, N. Gontier, N. Meade, A. Zebaze, M.-H. Yee, L. K. Umapathi, J. Zhu, B. Lipkin, M. Oblokulov, Z. Wang, R. Murthy, J. Stillerman, S. S. Patel, D. Abulkhanov, M. Zocca, M. Dey, Z. Zhang, N. Fahmy, U. Bhattacharyya, W. Yu, S. Singh, S. Luccioni, P. Villegas, M. Kunakov, F. Zhdanov, M. Romero, T. Lee, N. Timor, J. Ding, C. Schlesinger, H. Schoelkopf, J. Ebert, T. Dao, M. Mishra, A. Gu, J. Robinson, C. J. Anderson, B. Dolan-Gavitt, D. Contractor, S. Reddy, D. Fried, D. Bahdanau, Y. Jernite, C. M. Ferrandis, S. Hughes, T. Wolf, A. Guha, L. von Werra, H. de Vries, Starcoder: may the source be with you! (2023). arXiv:2305.06161.

[111] S. Planning, The economic impacts of inadequate infrastructure for software testing, National Institute of Standards and Technology 1 (2002).

[112] TestRail, The 2023 software testing and quality report (2023).

[113] H. Liu, L. Liu, C. Yue, Y. Wang, B. Deng, Autotestgpt: A system for the automated generation of software test cases based on chatgpt, Available at SSRN 4584792.

[114] Z. Yuan, Y. Lou, M. Liu, S. Ding, K. Wang, Y. Chen, X. Peng, No more manual tests? evaluating and improving chatgpt for unit test generation, arXiv preprint arXiv:2305.04207 (2023).

[115] Z. Kotti, R. Galanopoulou, D. Spinellis, Machine learning for software engineering: A tertiary study, ACM Computing Surveys 55 (12) (2023) 1–39.

[116] S. Jalil, S. Rafi, T. D. LaToza, K. Moran, W. Lam, Chatgpt and software testing education: Promises & perils, in: 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), IEEE, 2023, pp. 4130–4137.

[117] W. Ma, S. Liu, W. Wang, Q. Hu, Y. Liu, C. Zhang, L. Nie, Y. Liu, The scope of chatgpt in software engineering: A thorough investigation, arXiv preprint arXiv:2305.12138 (2023).

[118] J. White, S. Hays, Q. Fu, J. Spencer-Smith, D. C. Schmidt, Chatgpt prompt patterns for improving code quality, refactoring, requirements elicitation, and software design, arXiv preprint arXiv:2303.07839 (2023).

[119] M. A. Akbar, A. A. Khan, P. Liang, Ethical aspects of chatgpt in software engineering research, arXiv preprint arXiv:2306.07557 (2023).

[120] V. Garousi, M. V. M¨antyl¨a, A systematic literature review of literature reviews in software testing, Information and Software Technology 80 (2016) 195–216.

[121] J. Wang, Y. Huang, C. Chen, Z. Liu, S. Wang, Q. Wang, Software testing with large language model: Survey, landscape, and vision, arXiv preprint arXiv:2307.07221 (2023).

[122] Z. Zheng, K. Ning, J. Chen, Y. Wang, W. Chen, L. Guo, W. Wang, Towards an understanding of large language models in software engineering tasks (2023). arXiv:2308.11396.

[123] R. Feldt, S. Kang, J. Yoon, S. Yoo, Towards autonomous testing agents via conversational large language models, arXiv preprint arXiv:2306.05152 (2023).

[124] A. Przybyłek, An empirical study on the impact of AspectJ on software evolvability, Empirical Software Engineering 23 (4) (2018) 2018–2050.

[125] S. Jarzabek, Effective software maintenance and evolution: A reuse-based approach, CRC Press, 2007.

[126] N. Anquetil, K. M. de Oliveira, K. D. de Sousa, M. G. Batista Dias, Software maintenance seen as a knowledge management issue, Information and Software Technology 49 (5) (2007) 515–529.

[127] D. Noever, K. Williams, Chatbots as fluent polyglots: Revisiting breakthrough code snippets. arXiv:2301.03373[cs], doi:10.48550/arXiv.2301.03373.

[128] T. Ahmed, P. Devanbu, Few-shot training llms for project-specific code-summarization, in: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, ASE ’22, Association for Computing Machinery, New York, NY, USA, 2023. doi:10.1145/3551349.3559555.

[129] A. Madaan, A. Shypula, U. Alon, M. Hashemi, P. Ranganathan, Y. Yang, G. Neubig, A. Yazdanbakhsh, Learning performance-improving code edits (2023). arXiv:2302.07867.

[130] Q. Zhang, C. Fang, W. Sun, Y. Liu, T. He, X. Hao, Z. Chen, Boosting automated patch correctness prediction via pre-trained language model. arXiv:2301.12453[cs], doi:10.48550/arXiv.2301.12453.

[131] C. S. Xia, Y. Wei, L. Zhang, Practical program repair in the era of large pre-trained language models (2022). arXiv:2210.14179.

[132] H. Pearce, B. Tan, B. Ahmad, R. Karri, B. Dolan-Gavitt, Examining zero-shot vulnerability repair with large language models, in: 2023 IEEE Symposium on Security and Privacy (SP), 2023, pp. 2339–2356. doi:10.1109/SP46215.2023.10179324.

[133] O. Asare, M. Nagappan, N. Asokan, Is github’s copilot as bad as humans at introducing vulnerabilities in code? (2023). arXiv:2204.04741.

[134] X. Hou, Y. Zhao, Y. Liu, Z. Yang, K. Wang, L. Li, X. Luo, D. Lo, J. Grundy, H. Wang, Large language models for software engineering: A systematic literature review (2023). arXiv:2308.10620.

[135] I. Sommerville, Software Engineering, 10th Edition, Pearson.

[136] A. Nguyen-Duc, S. M. A. Shah, P. Abrahamsson, Towards an early stage software startups evolution model, in: 2016 42th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 120–127. doi:10.1109/SEAA.2016.21.

[137] B. Fitzgerald, K.-J. Stol, Continuous software engineering: A roadmap and agenda, Journal of Systems and Software 123 (2017) 176–189.

[138] Y. Feng, S. Vanam, M. Cherukupally, W. Zheng, M. Qiu, H. Chen, Investigating code generation performance of ChatGPT with crowdsourcing social data, in: 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 876–885, ISSN: 0730-3157. doi:10.1109/COMPSAC57700.2023.00117.

[139] J. D. Weisz, M. Muller, S. Houde, J. Richards, S. I. Ross, F. Martinez, M. Agarwal, K. Talamadupula, Perfection not required? human-AI partnerships in code translation, in: 26th International Conference on Intelligent User Interfaces, IUI ’21, Association for Computing Machinery, pp. 402–412. doi:10.1145/3397481.3450656.

[140] S. I. Ross, F. Martinez, S. Houde, M. Muller, J. D. Weisz, The programmer’s assistant: Conversational interaction with a large language model for software development, in: Proceedings of the 28th International Conference on Intelligent User Interfaces, IUI ’23, Association for Computing Machinery, pp. 491–514. doi:10.1145/3581641.3584037.

[141] N. Petrovic, Machine learning-based run-time DevSecOps: ChatGPT against traditional approach, in: 2023 10th International Conference on Electrical, Electronic and Computing Engineering (IcETRAN), pp. 1–5. doi:10.1109/IcETRAN59631.2023.10192161.

[142] P. L. Lanzi, D. Loiacono, ChatGPT and other large language models as evolutionary engines for online interactive collaborative game design, in: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’23, Association for Computing Machinery, pp. 1383–1390. doi:10.1145/3583131.3590351.

[143] Y. Ye, H. You, J. Du, Improved trust in human-robot collaboration with ChatGPT, IEEE Access 11 55748–55754. doi:10.1109/ACCESS.2023.3282111.

[144] N. Nascimento, P. Alencar, D. Cowan, Comparing software developers with ChatGPT: An empirical investigation. arXiv:2305.11837[cs], doi:10.48550/arXiv.2305.11837.

[145] H. Eisner, Essentials of project and systems engineering management, John Wiley & Sons, 2008.

[146] P. Tambe, P. Cappelli, V. Yakubovich, Artificial intelligence in human resources management: Challenges and a path forward, California Management Review 61 (4) (2019) 15–42.

[147] A. P. Ammeter, J. M. Dukerich, Leadership, team building, and team member characteristics in high performance project teams, Engineering management journal 14 (4) (2002) 3–10.

[148] H. Khamooshi, D. F. Cioffi, Program risk contingency budget planning, IEEE Transactions on Engineering Management 56 (1) (2009) 171–179.

[149] B. S. Blanchard, System engineering management, John Wiley & Sons, 2004.

[150] N. J. Smith, Engineering project management, Blackwell Science Ames, IA, 2002.

[151] L. Wang, Z. Liu, A. Liu, F. Tao, Artificial intelligence in product lifecycle management, The International Journal of Advanced Manufacturing Technology 114 (2021) 771–796.

[152] V. Holzmann, D. Zitter, S. Peshkess, The expectations of project managers from artificial intelligence: A delphi study, Project Management Journal 53 (5) (2022) 438–455.

[153] V. Prifti, Optimizing project management using artificial intelligence, European Journal of Formal Sciences and Engineering 5 (1) (2022) 29–37.

[154] H. K. Dam, T. Tran, J. Grundy, A. Ghose, Y. Kamei, Towards effective ai-powered agile project management, in: 2019 IEEE/ACM 41st international conference on software engineering: new ideas and emerging results (ICSE-NIER), IEEE, 2019, pp. 41–44.

[155] H. H. Elmousalami, Comparison of artificial intelligence techniques for project conceptual cost prediction: a case study and comparative analysis, IEEE Transactions on Engineering Management 68 (1) (2020) 183–196.

[156] M. Fu, C. Tantithamthavorn, GPT2SP: A transformer-based agile story point estimation approach, IEEE Transactions on Software Engineering 49 (2) 611–625. doi:10.1109/TSE.2022.3158252.

[157] M. Alhamed, T. Storer, Evaluation of context-aware language models and experts for effort estimation of software maintenance issues, in: 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 129–138, ISSN: 2576-3148. doi:10.1109/ICSME55016.2022.00020.

[158] M.-N. Chu, Assessing the benefits of ChatGPT for business: An empirical study on organizational performance, IEEE Access 11 76427–76436. doi:10.1109/ACCESS.2023.3297447.

[159] C. Ebert, P. Louridas, Generative AI for software practitioners, IEEE Software 40 (4) 30–38. doi:10.1109/MS.2023.3265877.

[160] Y. Pan, L. Zhang, Roles of artificial intelligence in construction engineering and management: A critical review and future trends, Automation in Construction 122 (2021) 103517.

[161] S. O. Abioye, L. O. Oyedele, L. Akanbi, A. Ajayi, J. M. D. Delgado, M. Bilal, O. O. Akinade, A. Ahmed, Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges, Journal of Building Engineering 44 (2021) 103299.

[162] N. A. Parikh, Empowering business transformation: The positive impact and ethical considerations of generative ai in software product management–a systematic literature review, arXiv preprint arXiv:2306.04605 (2023).

[163] A. Brem, F. Giones, M. Werle, The ai digital revolution in innovation: A conceptual framework of artificial intelligence technologies for the management of innovation, IEEE Transactions on Engineering Management (2021).

[164] L. Song, L. L. Minku, Artificial intelligence in software project management, in: Optimising the Software Development Process with Artificial Intelligence, Springer, 2023, pp. 19–65.

[165] A. El-Deeb, The human side of the tech industry: Key drivers behind the tech talent dilemma, SIGSOFT Softw. Eng. Notes 47 (1) (2022) 10–11. doi:10.1145/3502771.3502775.

[166] E. Sarikaya, S. Bagriyanik, M. Gökalp, Teaching agile software development using agile methods: A case study, in: 2020 Turkish National Software Engineering Symposium (UYMS), 2020, pp. 1–6. doi:10.1109/UYMS50627.2020.9247027.

[167] S. Tennakoon, R. Islam, D. Wijerathna, R. Deen, P. Sumathipala, L. Abeywardhana, An interactive application for university students to reduce the industry-academia skill gap in the software engineering field, in: 2023 4th International Conference for Emerging Technology (INCET), 2023, pp. 1–6. doi:10.1109/INCET57972.2023.10170591.

[168] S. Hyrynsalmi, E. Sutinen, The role of women software communities in attracting more women to the software industry, in: 2019 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC), 2019, pp. 1–7. doi:10.1109/ICE.2019.8792673.

[169] L. Jiagui, W. Wanli, J. F. I. Lam, L. Ke, SWOT analysis and public policy of Macao’s digital trade in services 9 (1) 2176363. doi:10.1080/23311886.2023.2176363.

[170] S. Verdugo-Castro, A. García-Holgado, M. C. Sánchez-Gómez, The gender gap in higher STEM studies: a systematic literature review, Heliyon (2022).

[171] R. S. Dubey, J. Paul, V. Tewari, The soft skills gap: a bottleneck in the talent supply in emerging economies, The International Journal of Human Resource Management 33 (13) (2022) 2630–2661. doi:10.1080/09585192.2020.1871399.

[172] J. Zhao, T. Wang, M. Yatskar, V. Ordonez, K.-W. Chang, Men also like shopping: Reducing gender bias amplification using corpus-level constraints (2017). arXiv:1707.09457.

[173] OpenAI, GPT-4 technical report (2023). arXiv:2303.08774.

[174] D. Adiwardana, M. Luong, D. R. So, J. Hall, N. Fiedel, R. Thoppilan, Z. Yang, A. Kulshreshtha, G. Nemade, Y. Lu, Q. V. Le, Towards a human-like open-domain chatbot, CoRR abs/2001.09977 (2020). arXiv:2001.09977.

[175] European Commission, Directorate-General for Communications Networks, Content and Technology, EUR-Lex - 52021PC0206 - EN - EUR-Lex, CNECT (2021).

[176] L. A. González, A. Neyem, I. Contreras-McKay, D. Molina, Improving learning experiences in software engineering capstone courses using artificial intelligence virtual assistants 30 (5) 1370–1389. doi:10.1002/cae.22526.

[177] M. Verleger, J. Pembridge, A Pilot Study Integrating an AI-driven Chatbot in an Introductory Programming Course, IEEE, 2018, pp. 1–4. doi:10.1109/FIE.2018.8659282.

[178] M. Binkis, R. Kubiliūnas, R. Sturienė, T. Dulinskienė, T. Blažauskas, V. Jakštienė, Rule-based chatbot integration into software engineering course, in: A. Lopata, D. Gudonienė, R. Butkienė (Eds.), Information and Software Technologies, Communications in Computer and Information Science, Springer International Publishing, pp. 367–377.

[179] M. Ismail, A. Ade-Ibijola, Lecturer’s apprentice: A chatbot for assisting novice programmers, IEEE, 2019, pp. 1–8. doi:10.1109/IMITEC45504.2019.9015857.

[180] G. Carreira, L. Silva, A. J. Mendes, H. G. Oliveira, Pyo, a chatbot assistant for introductory programming students, IEEE, 2022, pp. 1–6. doi:10.1109/SIIE56031.2022.9982349.

[181] J. C. Paiva, J. P. Leal, A. Figueira, Automated assessment in computer science education: A state-of-the-art review 22 (3) 34:1–34:40.

[182] M. Cukusic, Z. Garaca, M. Jadric, Online self-assessment and students’ success in higher education institutions 72 100–109. doi:10.1016/j.compedu.2013.10.018.

[183] J. Qadir, Engineering education in the era of ChatGPT: Promise and pitfalls of generative AI for education, in: 2023 IEEE Global Engineering Education Conference (EDUCON), pp. 1–9. doi:10.1109/EDUCON54358.2023.10125121.

[184] M. Daun, J. Brings, How ChatGPT will change software engineering education, in: Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1, ITiCSE 2023, Association for Computing Machinery, pp. 110–116. doi:10.1145/3587102.3588815.

[185] C. Bull, A. Kharrufa, Generative AI assistants in software development education: A vision for integrating generative AI into educational practice, not instinctively defending against it, IEEE Software 1–9. doi:10.1109/MS.2023.3300574.

[186] J. Berrezueta-Guzman, S. Krusche, Recommendations to create programming exercises to overcome ChatGPT, in: 2023 IEEE 35th International Conference on Software Engineering Education and Training (CSEE&T), pp. 147–151, ISSN: 2377-570X. doi:10.1109/CSEET58097.2023.00031.

[187] B. Banić, M. Konecki, M. Konecki, Pair programming education aided by ChatGPT, in: 2023 46th MIPRO ICT and Electronics Convention (MIPRO), pp. 911–915, ISSN: 2623-8764. doi:10.23919/MIPRO57284.2023.10159727.

[188] S. Jalil, S. Rafi, T. D. LaToza, K. Moran, W. Lam, ChatGPT and software testing education: Promises & perils, in: 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 4130–4137. doi:10.1109/ICSTW58534.2023.00078.

[189] A. Ashraf, A. Imam, ChatGPT’s use case for software engineers, in: 8th International Conference on Computing in Engineering and Technology (ICCET 2023), Vol. 2023, pp. 487–492. doi:10.1049/icp.2023.1537.

[190] J. Savelka, A. Agarwal, C. Bogart, M. Sakr, Large language models (GPT) struggle to answer multiple-choice questions about code. arXiv:2303.08033[cs], doi:10.48550/arXiv.2303.08033.

[191] R. Yilmaz, F. G. Karaoglan Yilmaz, The effect of generative artificial intelligence (AI)-based tool use on students’ computational thinking skills, programming self-efficacy and motivation 4 100147. doi:10.1016/j.caeai.2023.100147.

[192] C. A. Philbin, Exploring the potential of artificial intelligence program generators in computer programming education for students 14 (3) 30–38. doi:10.1145/3610406.

[193] M. Ewens, R. Nanda, M. Rhodes-Kropf, Cost of experimentation and the evolution of venture capital, Journal of Financial Economics 128 (3) (2018) 422–442. doi:10.1016/j.jfineco.2018.03.001.

[194] S. Blank, The Four Steps to the Epiphany: Successful Strategies for Products that Win, 2007.

[195] E. Ries, The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses, Crown Business, 2011.

[196] V. Vakkuri, K.-K. Kemell, J. Kultanen, P. Abrahamsson, The Current State of Industrial Practice in Artificial Intelligence Ethics, IEEE Software 37 (4) (2020) 50–57. doi:10.1109/MS.2020.2985621.

[197] S. Morandini, F. Fraboni, M. De Angelis, G. Puzzo, D. Giusino, L. Pietrantoni, The impact of artificial intelligence on workers’ skills: Upskilling and reskilling in organisations 26 39–68. doi:10.28945/5078.

[198] D. K. Kanbach, L. Heiduk, G. Blueher, M. Schreiter, A. Lahmann, The GenAI is out of the bottle: generative artificial intelligence from a business model innovation perspective, Review of Managerial Science (Sep 2023).

[199] B. Meskó, E. J. Topol, The imperative for regulatory oversight of large language models (or generative AI) in healthcare, npj Digital Medicine 6 (1) (2023) 120. doi:10.1038/s41746-023-00873-0.

[200] S. Amershi, A. Begel, C. Bird, R. DeLine, H. Gall, E. Kamar, N. Nagappan, B. Nushi, T. Zimmermann, Software engineering for machine learning: A case study, in: 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), IEEE, 2019, pp. 291–300.

[201] W. Ma, S. Liu, W. Wang, Q. Hu, Y. Liu, C. Zhang, L. Nie, Y. Liu, The scope of ChatGPT in software engineering: A thorough investigation. arXiv:2305.12138[cs], doi:10.48550/arXiv.2305.12138.

[202] R. Azamfirei, S. R. Kudchadkar, J. Fackler, Large language models and the perils of their hallucinations 27 (1) 120. doi:10.1186/s13054-023-04393-x.

[203] P. Manakul, A. Liusie, M. J. F. Gales, SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models. arXiv:2303.08896[cs], doi:10.48550/arXiv.2303.08896.

[204] C. Barrett, B. Boyd, E. Bursztein, N. Carlini, B. Chen, J. Choi, A. R. Chowdhury, M. Christodorescu, A. Datta, S. Feizi, K. Fisher, T. Hashimoto, D. Hendrycks, S. Jha, D. Kang, F. Kerschbaum, E. Mitchell, J. Mitchell, Z. Ramzan, K. Shams, D. Song, A. Taly, D. Yang, Identifying and mitigating the security risks of generative AI. arXiv:2308.14840[cs], doi:10.48550/arXiv.2308.14840.

[205] L. Chen, M. Zaharia, J. Zou, How is ChatGPT’s behavior changing over time? (2023). arXiv:2307.09009.

[206] A. Hamid, H. R. Samidi, T. Finin, P. Pappachan, R. Yus, GenAIPABench: A benchmark for generative AI-based privacy assistants. arXiv:2309.05138[cs], doi:10.48550/arXiv.2309.05138.

[207] E. Kasneci, K. Sessler, S. Kuchemann, M. Bannert, D. Dementieva, F. Fischer, U. Gasser, G. Groh, S. Gunnemann, E. Hullermeier, S. Krusche, G. Kutyniok, T. Michaeli, C. Nerdel, J. Pfeffer, O. Poquet, M. Sailer, A. Schmidt, T. Seidel, M. Stadler, J. Weller, J. Kuhn, G. Kasneci, ChatGPT for good? on opportunities and challenges of large language models for education 103 102274. doi:10.1016/j.lindif.2023.102274.

[208] M. T. Ribeiro, S. Singh, C. Guestrin, “Why should I trust you?”: Explaining the predictions of any classifier (2016). arXiv:1602.04938.

[209] D. Levin, R. Cross, L. Abrams, Why should I trust you? Predictors of interpersonal trust in a knowledge transfer context (Sep 2002).

[210] B. Cabrero-Daniel, A. Sanagustín Cabrero, Perceived trustworthiness of natural language generators, in: Proceedings of the First International Symposium on Trustworthy Autonomous Systems, TAS ’23, Association for Computing Machinery, New York, NY, USA, 2023. doi:10.1145/3597512.3599715.

[211] N. Amsel, Z. Ibrahim, A. Malik, B. Tomlinson, Toward sustainable software engineering (NIER track), in: Proceedings of the 33rd International Conference on Software Engineering, ICSE ’11, Association for Computing Machinery, pp. 976–979. doi:10.1145/1985793.1985964.

[212] B. Penzenstadler, V. Bauer, C. Calero, X. Franch, Sustainability in software engineering: a systematic literature review 32–41. doi:10.1049/ic.2012.0004.

[213] S. Naumann, E. Kern, M. Dick, T. Johann, Sustainable software engineering: Process and quality models, life cycle, and social aspects, in: L. M. Hilty, B. Aebischer (Eds.), ICT Innovations for Sustainability, Advances in Intelligent Systems and Computing, Springer International Publishing, pp. 191–205.

[214] A. A. Chien, GenAI: Giga$$$, TeraWatt-hours, and GigaTons of CO2 66 (8) 5. doi:10.1145/3606254.

[215] A. A. Chien, L. Lin, H. Nguyen, V. Rao, T. Sharma, R. Wijayawardana, Reducing the carbon impact of generative AI inference (today and in 2035), in: Proceedings of the 2nd Workshop on Sustainable Computer Systems, HotCarbon ’23, Association for Computing Machinery, pp. 1–7. doi:10.1145/3604930.3605705.

[216] P. Henderson, J. Hu, J. Romoff, E. Brunskill, D. Jurafsky, J. Pineau, Towards the systematic reporting of the energy and carbon footprints of machine learning. arXiv:2002.05651[cs], doi:10.48550/arXiv.2002.05651.

[217] J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, D. Amodei, Scaling laws for neural language models. arXiv:2001.08361[cs,stat], doi:10.48550/arXiv.2001.08361.

[218] J. W. Rae, S. Borgeaud, T. Cai, K. Millican, J. Hoffmann, F. Song, J. Aslanides, S. Henderson, R. Ring, S. Young, E. Rutherford, T. Hennigan, J. Menick, A. Cassirer, R. Powell, G. v. d. Driessche, L. A. Hendricks, M. Rauh, P.-S. Huang, A. Glaese, J. Welbl, S. Dathathri, S. Huang, J. Uesato, J. Mellor, I. Higgins, A. Creswell, N. McAleese, A. Wu, E. Elsen, S. Jayakumar, E. Buchatskaya, D. Budden, E. Sutherland, K. Simonyan, M. Paganini, L. Sifre, L. Martens, X. L. Li, A. Kuncoro, A. Nematzadeh, E. Gribovskaya, D. Donato, A. Lazaridou, A. Mensch, J.-B. Lespiau, M. Tsimpoukelli, N. Grigorev, D. Fritz, T. Sottiaux, M. Pajarskas, T. Pohlen, Z. Gong, D. Toyama, C. d. M. d’Autume, Y. Li, T. Terzi, V. Mikulik, I. Babuschkin, A. Clark, D. d. L. Casas, A. Guy, C. Jones, J. Bradbury, M. Johnson, B. Hechtman, L. Weidinger, I. Gabriel, W. Isaac, E. Lockhart, S. Osindero, L. Rimell, C. Dyer, O. Vinyals, K. Ayoub, J. Stanway, L. Bennett, D. Hassabis, K. Kavukcuoglu, G. Irving, Scaling language models: Methods, analysis & insights from training gopher. arXiv:2112.11446[cs], doi:10.48550/arXiv.2112.11446.