Funding Information
This work was supported by the Korea Institute of Science and Technology Information (KISTI) research program (Project Nos. K-23-L01-C03-S01 and K-23-L03-C02-S01).
References
- An, Seong-Won, Yu, Jae-Hong, Jo, Won-Young, No, Jae-Won, & Son, Ho-Hyun (2023). Rise of Hyper-scale LLM (Large Language Model) and Issues. Gyeonggi: Software Policy Research Institute.
- Azuma, Yukinaga (2018). Deep Learning that is Tangible, Practical Programming from the Basics. Tokyo: SB Creative.
- Han, Na-Eun (2023). Proposal of process model for research data quality management. Journal of the Korean Society for Information Management, 40(1), 51-71. https://doi.org/10.3743/KOSIM.2023.40.1.051
- Jo, Tae-Ho (2022). Deep Learning for Everyone - Deep Learning that Anyone can Easily Understand. Seoul: Gilbut.
- Kim, Hyung-Sub (2020). A study on the data quality management evaluation model. Journal of the Korea Convergence Society, 11(7), 217-222. https://doi.org/10.15207/JKCS.2020.11.7.217
- Kim, Seon-Tae, Lee, Jeong-Hoon, & Jeong, Han-Min (2017). Understanding and Managing Research Data. Daejeon: Korea Institute of Science and Technology Information.
- Korea Data Agency (2006). Data Quality Management Guidelines (Ver 2.1).
- Lee, Gi-Chang (2021). (Do it!) Learning Natural Language Processing with BERT and GPT: Transformer Core Principles and How to Use the Hugging Face Package. Seoul: Easyspublishing.
- Lee, Kyong-Nim & Ho, Eun-Kyoung (2023). AI dialogue interface based on large language models: the state of the art AI dialogue models and seeking linguistic research topics. Korean Linguistics, 105, 345-374. https://doi.org/10.15811/jkl.2023..105.010
- Lee, Su-Hyeon & Jeon, Sang-Hong (2023). ChatGPT State of the Technology Industry Report. Korea Copyright Commission.
- Ministry of Security and Public Administration (2014). Government Data Management Guidelines. No. 2014-13.
- National Research and Development Information Processing Standards (2020). Ministry of Science and ICT Notice No. 2020-102.
- National Research Council of Science and Technology (2019). Research Data Management Guidelines (2019-07).
- Park, Hyung-Kyung (2020). A study on the use of copyrightable works in machine learning. The Korean Association of Sports and Entertainment Law, 23(1), 129-152. https://doi.org/10.19051/kasel.2020.23.1.129
- Park, Seong-Ho (2020). A study on whether collecting and using other people's copyrighted works for the purpose of text and data mining falls under the copyright limitations: focusing on the use of big data in artificial intelligence. Human Rights and Justice, 494, 39-69. https://doi.org/10.22999/hraj..494.202012.003
- 我妻 幸長 (2018). はじめてのディープラーニング -Pythonで学ぶニューラルネットワークとバックプロパゲーション- (Machine Learning). Tokyo: SBクリエイティブ. Translated by Choi, Jae-Won (2019). Deep Learning that is Tangible, Practical Programming from the Basics. Seoul: 책만.
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., & Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.
- Buchanan, B., Lohn, A., Musser, M., & Sedova, K. (2021). Truth, lies, and automation. Center for Security and Emerging Technology, 1(1), 2.
- Chiang, W. L., Li, Z., Lin, Z., Sheng, Y., Wu, Z., Zhang, H., Zheng, L., Zhuang, S., Zhuang, Y., Gonzalez, J. E., Stoica, I., & Xing, E. P. (2023). Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality. Available: https://lmsys.org/blog/2023-03-30-vicuna/
- Chomsky, N. (1957). Logical structure in language. American Documentation, 8(4), 284.
- Dwivedi, Y. K., Kshetri, N., Hughes, L., Slade, E. L., Jeyaraj, A., Kar, A. K., Baabdullah, A. M., Koohang, A., Raghavan, V., Ahuja, M., Albanna, H., Albashrawi, M. A., Al-Busaidi, A. S., Balakrishnan, J., Barlette, Y., Basu, S., Bose, I., Brooks, L., Buhalis, D., Carter, L., & Wright, R. (2023). "So what if ChatGPT wrote it?" Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. International Journal of Information Management, 71, 102642.
- English, L. P. (2009). Information Quality Applied: Best Practices for Improving Business Information, Processes and Systems. New Jersey: Wiley.
- Gehman, S., Gururangan, S., Sap, M., Choi, Y., & Smith, N. A. (2020). RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. https://doi.org/10.48550/arXiv.2009.11462
- Hale, J. (2001). A Probabilistic Earley Parser as a Psycholinguistic Model. In Second Meeting of the North American Chapter of the Association for Computational Linguistics.
- International Organization for Standardization (2015). ISO/IEC 25024:2015: Systems and Software Engineering - Systems and Software Quality Requirements and Evaluation (SQuaRE) - Measurement of Data Quality. ISO/IEC.
- Jurafsky, D. & Martin, J. H. (2021). Speech and Language Processing (3rd ed.). Stanford, CA: Stanford University.
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling Laws for Neural Language Models. https://doi.org/10.48550/arXiv.2001.08361
- Kindling, M. & Strecker, D. (2022). Data Quality Assurance at Research Data Repositories. Data Science Journal, 21(1). https://doi.org/10.5334/dsj-2022-018
- Lee, P., Goldberg, C., & Kohane, I. (2023). The AI Revolution in Medicine: GPT-4 and beyond. London: Pearson.
- Lemley, M. A. & Casey, B. (2020). Fair learning. Texas Law Review, 99(4), 743-785.
- Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106(3), 1126-1177. https://doi.org/10.1016/j.cognition.2007.05.006
- Lewkowycz, A., Andreassen, A., Dohan, D., Dyer, E., Michalewski, H., Ramasesh, V., Slone, A., Anil, C., Schlag, I., Gutman-Solo, T., Wu, Y., Neyshabur, B., Gur-Ari, G., & Misra, V. (2022). Solving quantitative reasoning problems with language models. Advances in Neural Information Processing Systems, 35, 3843-3857.
- Lin, S., Hilton, J., & Evans, O. (2021). TruthfulQA: Measuring How Models Mimic Human Falsehoods. https://doi.org/10.48550/arXiv.2109.07958
- OpenAI (2023). GPT-4 Technical Report. https://doi.org/10.48550/arXiv.2303.08774
- Peng, B., Li, C., He, P., Galley, M., & Gao, J. (2023). Instruction Tuning with GPT-4. https://doi.org/10.48550/arXiv.2304.03277
- Pennycook, G., Epstein, Z., Mosleh, M., Arechar, A. A., Eckles, D., & Rand, D. G. (2021). Shifting attention to accuracy can reduce misinformation online. Nature, 592(7855), 590-595. https://doi.org/10.1038/s41586-021-03344-2
- Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga, M., Zhang, Y., Narayanan, D., Wu, Y., Kumar, A., Newman, B., Yuan, B., Yan, B., Zhang, C., Cosgrove, C., Manning, C. D., Re, C., Acosta-Navas, D., Hudson, D. A., Zelikman, E., Durmus, E., Ladhak, F., Rong, F., Ren, H., Yao, H., Wang, J., Santhanam, K., Orr, L., Zheng, L., Yuksekgonul, M., Suzgun, M., Kim, N., Guha, N., Chatterji, N., Khattab, O., Henderson, P., Huang, Q., Chi, R., Xie, S. M., Santurkar, S., Ganguli, S., Hashimoto, T., Icard, T., Zhang, T., Chaudhary, V., Wang, W., Li, X., Mai, Y., Zhang, Y., & Koreeda, Y. (2022). Holistic Evaluation of Language Models. https://doi.org/10.48550/arXiv.2211.09110
- Petroni, F., Rocktäschel, T., Lewis, P., Bakhtin, A., Wu, Y., Miller, A. H., & Riedel, S. (2019). Language Models as Knowledge Bases? https://doi.org/10.48550/arXiv.1909.01066
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.
- Rae, J. W., Borgeaud, S., Cai, T., Millican, K., Hoffmann, J., Song, F., Aslanides, J., Henderson, S., Ring, R., Young, S., Rutherford, E., Hennigan, T., Menick, J., Cassirer, A., Powell, R., Driessche, G., Hendricks, L. A., Rauh, M., Huang, P., Glaese, A., Welbl, J., Dathathri, S., Huang, S., Uesato, J., Mellor, J., Higgins, I., Creswell, A., McAleese, N., Wu, A., Elsen, E., Jayakumar, S., Buchatskaya, E., Budden, D., Sutherland, E., Simonyan, K., Paganini, M., Sifre, L., Martens, L., Li, X. L., Kuncoro, A., Nematzadeh, A., Gribovskaya, E., Donato, D., Lazaridou, A., Mensch, A., Lespiau, J., Tsimpoukelli, M., Grigorev, N., Fritz, D., Sottiaux, T., Pajarskas, M., Pohlen, T., Gong, Z., Toyama, D., d'Autume, C. M., Li, Y., Terzi, T., Mikulik, V., Babuschkin, I., Clark, A., Casas, D. L., Guy, A., Jones, C., Bradbury, J., Johnson, M., Hechtman, B., Weidinger, L., Gabriel, I., Isaac, W., Lockhart, E., Osindero, S., Rimell, L., Dyer, C., Vinyals, O., Ayoub, K., Stanway, J., Bennett, L., Hassabis, D., Kavukcuoglu, K., & Irving, G. (2021). Scaling Language Models: Methods, Analysis & Insights from Training Gopher. https://doi.org/10.48550/ARXIV.2112.11446
- Taori, R., Gulrajani, I., Zhang, T., Dubois, Y., Li, X., Guestrin, C., Liang, P., & Hashimoto, T. B. (2023). Stanford Alpaca: An Instruction-Following LLaMA Model. Available: https://github.com/tatsu-lab/stanford_alpaca
- Taylor, R., Kardas, M., Cucurull, G., Scialom, T., Hartshorn, A., Saravia, E., Poulton, A., Kerkez, V., & Stojnic, R. (2022). Galactica: A Large Language Model for Science. https://doi.org/10.48550/arXiv.2211.09085
- Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M. A., Lacroix, T., Roziere, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., & Lample, G. (2023). LLaMA: Open and Efficient Foundation Language Models. https://doi.org/10.48550/arXiv.2302.13971
- Wilcox, E., Qian, P., Futrell, R., Kohita, R., Levy, R., & Ballesteros, M. (2020). Structural Supervision Improves Few-shot Learning and Syntactic Generalization in Neural Language Models. https://doi.org/10.48550/arXiv.2010.05725
- Yarowsky, D. (1995, June). Unsupervised word sense disambiguation rivaling supervised methods. In 33rd Annual Meeting of the Association for Computational Linguistics, 189-196.
- Yasunaga, M., Bosselut, A., Ren, H., Zhang, X., Manning, C. D., Liang, P. S., & Leskovec, J. (2022). Deep bidirectional language-knowledge graph pretraining. Advances in Neural Information Processing Systems, 35, 37309-37323.