• Title/Summary/Keyword: GPT-3.5

Search results: 426

Evaluation of the applicability of ChatGPT in biological nursing science education (ChatGPT의 기초간호학교육 활용 가능성 평가)

  • Sunmi Kim;Jihun Kim;Myung Jin Choi;Seok Hee Jeong
    • Journal of Korean Biological Nursing Science
    • /
    • v.25 no.3
    • /
    • pp.183-204
    • /
    • 2023
  • Purpose: The purpose of this study was to evaluate the applicability of ChatGPT in biological nursing science education. Methods: This study was conducted by entering questions about the field of biological nursing science into ChatGPT versions GPT-3.5 and GPT-4 and evaluating the answers. Three questions each related to microbiology and pharmacology were entered, and the generated content was analyzed to determine its applicability to the field of biological nursing science. The questions were of a level that could be presented to nursing students as written test questions. Results: The answers generated in English had 100.0% accuracy in both GPT-3.5 and GPT-4. For the sentences generated in Korean, the accuracy rate of GPT-3.5 was 62.7%, and that of GPT-4 was 100.0%. The total number of Korean sentences in GPT-3.5 was 51, while the total number of Korean sentences in GPT-4 was 68. Likewise, the total number of English sentences in GPT-3.5 was 70, while the total number of English sentences in GPT-4 was 75. This showed that even for the same Korean or English question, GPT-4 tended to be more detailed than GPT-3.5. Conclusion: This study confirmed the advantages of ChatGPT as a tool to improve understanding of various complex concepts in the field of biological nursing science. However, as the answers were based on data collected up to 2021, a guideline reflecting the most up-to-date information is needed. Further research is needed to develop a reliable and valid scale to evaluate ChatGPT's responses.
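
The accuracy metric the abstract reports is the share of generated sentences judged correct. The sketch below is a minimal illustration of that tally; the per-sentence correctness counts (e.g. 32 of 51 Korean GPT-3.5 sentences) are assumptions back-derived from the reported percentages, not figures taken from the paper.

```python
def sentence_accuracy(num_correct, num_total):
    """Percentage of generated sentences judged factually correct."""
    return round(100.0 * num_correct / num_total, 1)

# Assumed counts: 32 of GPT-3.5's 51 Korean sentences judged correct
# reproduces the reported 62.7%; GPT-4's 68 of 68 gives 100.0%.
print(sentence_accuracy(32, 51))  # -> 62.7
print(sentence_accuracy(68, 68))  # -> 100.0
```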

Evaluation of the Coding Performance of GPT-3.5 and GPT-4 in Terms of Completeness and Consistency (완전성과 일관성 측면에서의 GPT-3.5와 GPT-4의 코딩 성능 평가)

  • Jimin Jung;Chanho Lee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.05a
    • /
    • pp.754-755
    • /
    • 2023
  • This study aims to evaluate which version, GPT-3.5 or GPT-4, is better suited to a collaborative coding environment in terms of completeness and consistency. Experiments on both versions showed that GPT-4 outperformed GPT-3.5 in both completeness and consistency. In particular, GPT-4 achieved 100% completeness on all items, but its consistency was found to still need improvement. This implies that prompt modification alone has limits and that GPT-4 itself needs to be upgraded; future research will also evaluate the performance of other generative AI models.
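
The completeness/consistency protocol the abstract describes (re-prompting a model and scoring how complete and how stable its outputs are) can be sketched as follows; the metric definitions and the stub outputs below are illustrative assumptions, not the authors' exact rubric.

```python
from collections import Counter

def consistency(outputs):
    """Fraction of runs matching the most common output.
    1.0 means the model answered identically every time."""
    if not outputs:
        return 0.0
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)

def completeness(outputs, is_complete):
    """Fraction of runs whose output satisfies a completeness predicate
    (e.g. 'the generated code covers every stated requirement')."""
    return sum(1 for o in outputs if is_complete(o)) / len(outputs)

# Five repeated generations for the same prompt (stub data).
runs = ["def add(a,b): return a+b"] * 4 + ["def add(x,y): return x+y"]
print(consistency(runs))                             # -> 0.8
print(completeness(runs, lambda o: "return" in o))   # -> 1.0
```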

Performance of ChatGPT on the Korean National Examination for Dental Hygienists

  • Soo-Myoung Bae;Hye-Rim Jeon;Gyoung-Nam Kim;Seon-Hui Kwak;Hyo-Jin Lee
    • Journal of dental hygiene science
    • /
    • v.24 no.1
    • /
    • pp.62-70
    • /
    • 2024
  • Background: This study aimed to evaluate ChatGPT's accuracy in responding to questions from the national dental hygienist examination. Moreover, through an analysis of ChatGPT's incorrect responses, this research intended to pinpoint the predominant types of errors. Methods: To evaluate ChatGPT-3.5's performance according to the type of national examination question, the researchers classified the 200 questions of the 49th National Dental Hygienist Examination into recall, interpretation, and solving types. The researchers strategically modified the questions to counteract potential misunderstandings arising from implied meanings or Korean technical terminology. To assess ChatGPT-3.5's ability to apply previously acquired knowledge, the questions were first converted to a subjective (open-ended) format. If ChatGPT-3.5 generated an incorrect response, the original multiple-choice format was provided again. The 200 questions were input into ChatGPT-3.5 and the generated responses were analyzed. The accuracy of each response was then evaluated by the researchers according to question type, and incorrect responses were categorized (logical, information, and statistical errors). Finally, responses were evaluated for hallucination, in which ChatGPT presented false information as if it were true. Results: ChatGPT's responses to the national examination were 45.5% accurate. Accuracy by question type was 60.3% for recall and 13.0% for solving-type questions. The accuracy rate for the subjective solving questions was 13.0%, while accuracy for the objective (multiple-choice) versions increased to 43.5%. The most common type of incorrect response was the logical error, accounting for 65.1% of all errors. Of the 102 incorrectly answered questions, 100 were categorized as hallucinations.
Conclusion: ChatGPT-3.5 was found to be limited in its ability to provide evidence-based correct responses to the Korean national dental hygienist examination. Therefore, dental hygienists in education or clinical fields should take care to view artificial intelligence-generated materials critically.
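
The per-type accuracy bookkeeping this study describes is simple to reproduce. In the sketch below, the raw counts (35/58 recall, 3/23 solving) are assumptions chosen to match the reported 60.3% and 13.0%, since the abstract gives only percentages.

```python
from collections import defaultdict

def accuracy_by_type(results):
    """results: iterable of (question_type, answered_correctly) pairs."""
    correct, total = defaultdict(int), defaultdict(int)
    for qtype, ok in results:
        total[qtype] += 1
        correct[qtype] += ok
    return {q: round(100 * correct[q] / total[q], 1) for q in total}

# Hypothetical grading records; counts chosen to match the reported rates.
results = [("recall", True)] * 35 + [("recall", False)] * 23 \
        + [("solving", True)] * 3 + [("solving", False)] * 20
print(accuracy_by_type(results))  # {'recall': 60.3, 'solving': 13.0}
```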

Changes of Plasma Vitellogenin (VTG) and Glutamate Pyruvate Transaminase (GPT) in the Juvenile Rockfish, Sebastes schlegeli Exposed to Exogenous Estrogen (외인성 Estrogen에 노출된 조피볼락, Sebastes schlegeli 치어의 혈장 VTG과 GPT의 변화)

  • 황운기;강주찬
    • Environmental Analysis Health and Toxicology
    • /
    • v.17 no.3
    • /
    • pp.239-243
    • /
    • 2002
  • Changes in plasma vitellogenin (VTG) and glutamate pyruvate transaminase (GPT) were examined to determine whether hepatocytes were damaged during VTG induction in juvenile rockfish, Sebastes schlegeli, exposed to exogenous estrogen (estradiol-17β, E2). Rockfish were intraperitoneally injected with E2 (5 mg/kg B.W.) in 70% ethanol, and plasma samples were extracted at 0, 1, 3, 6, 9, 12, and 15 days after E2 administration. VTG and GPT were then analyzed by SDS-PAGE and the Reitman-Frankel method, respectively. A VTG band was detected at a molecular weight of 175 kDa on Day 3 after E2 administration. This band became more distinct at 6 days, then gradually thinned over time and was not detected at 15 days. GPT increased suddenly at 1 day after E2 administration, and the highest GPT was detected at 3 days. However, GPT gradually decreased over time, following the change in VTG. These results suggest that VTG induction by exogenous E2 damages hepatocytes and that plasma GPT temporarily increases in juvenile rockfish.

Effect of Geonpye-tang (GPT) on Production and Gene Expression of Respiratory Mucin (건폐탕(健肺湯)이 호흡기 뮤신의 생성 및 유전자 발현에 미치는 영향)

  • Jung, Byeong-Jin;Kim, Ho;Seo, Un-Kyo
    • The Journal of Internal Korean Medicine
    • /
    • v.30 no.4
    • /
    • pp.685-695
    • /
    • 2009
  • Objectives: In this study, the author investigated whether Geonpye-tang (GPT) significantly affects PMA-, EGF-, or TNF-alpha-induced MUC5AC mucin production and gene expression in human airway epithelial cells. Materials and Methods: The effects of the agent on PMA-, EGF-, or TNF-alpha-induced MUC5AC mucin production and gene expression in human airway epithelial cells (NCI-H292) were investigated. Confluent NCI-H292 cells were pretreated for 30 min in the presence of GPT and then treated with PMA (10 ng/ml), EGF (25 ng/ml), or TNF-alpha (0.2 nM) to assess the effect of the agent on PMA-, EGF-, or TNF-alpha-induced MUC5AC mucin production by enzyme-linked immunosorbent assay (ELISA) and on gene expression by reverse transcription-polymerase chain reaction (RT-PCR). Possible cytotoxicity of the agent was assessed by examining the survival and proliferation of NCI-H292 cells after treatment with the agent over 72 hrs (SRB assay). Results: (1) GPT significantly inhibited PMA-induced and EGF-induced MUC5AC mucin production in NCI-H292 cells; however, GPT did not affect TNF-alpha-induced MUC5AC mucin production. (2) GPT significantly inhibited the expression of PMA-, EGF-, or TNF-alpha-induced MUC5AC genes in NCI-H292 cells. (3) GPT did not show significant cytotoxicity toward NCI-H292 cells. Conclusion: These results suggest that GPT can affect the production and gene expression of respiratory mucin in diverse respiratory diseases accompanied by mucus hypersecretion, which may explain the traditional use of GPT in oriental medicine. The effects of GPT and its components should be further investigated using animal models that reflect the pathophysiology of airway diseases.

Study on the Activity of GOT and GPT in the Hepatotoxic Rat Treated with Lycium chinense (구기자 투여 간손상 흰쥐에서 GOT 및 GPT의 활성화 연구)

  • 김병원;노광수
    • Biomedical Science Letters
    • /
    • v.6 no.3
    • /
    • pp.187-192
    • /
    • 2000
  • The present study was undertaken to investigate betaine production by tissue culture and its medicinal effect in Lycium chinense Mill. To investigate the protective effect of L. chinense against hepatotoxicity induced by CCl4, 0.5 g/kg of a water extract of the compound mixture (leaves, roots, and shoots) of L. chinense or of its callus was fed to rats (SD line) once a day. As a result, the activity of GOT and GPT in blood serum was decreased in the groups fed the compound mixture (GOT 760.4 and GPT 540 Karmen units) and the callus (GOT 772.1 and GPT 556.4 Karmen units) relative to the control rat group (GOT 949 and GPT 640 Karmen units); the same result was obtained in the group fed 0.1 g/kg silymarin (GOT and GPT activities of 492.6 and 320.4 Karmen units, respectively). These results strongly indicate that water extracts of the mixture and callus from L. chinense decrease GOT and GPT in hepatotoxic rats induced by CCl4.

Temporal Changes of Plasma Vitellogenin (VTG), Alkaline-Labile Protein Phosphorus (ALPP), Calcium (Ca), Glutamate Pyruvate Transaminase (GPT) and Hepatosomatic Index (HSI) in the Estradiol-17β-Administered Immature Rockfish, Sebastes schlegeli (Estradiol-17β의 복강주사에 따른 미성숙 조피볼락, Sebastes schlegeli의 혈장 VTG, ALPP, Ca, GPT 및 HSI의 일시적 변동)

  • Hwang, Un-Gi;Sim, Jeong-Min;Park, Seung-Yun;Ji, Jeong-Hun;Gang, Ju-Chan
    • Journal of fish pathology
    • /
    • v.17 no.3
    • /
    • pp.191-198
    • /
    • 2004
  • Temporal changes of plasma vitellogenin (VTG), alkaline-labile protein phosphorus (ALPP), calcium (Ca), glutamate pyruvate transaminase (GPT) and hepatosomatic index (HSI) were examined in estradiol-17β (E2)-administered immature rockfish, Sebastes schlegeli. Fish were intraperitoneally injected with E2 (5 mg/kg B.W.) in 70% ethanol, and plasma was then extracted at 0, 1, 3, 6, 9, 12 and 15 days. A VTG band was detected at a molecular weight of about 170 kDa on Day 3 in SDS-PAGE. This band became more distinct at 6 days but gradually thinned over time and was not detected at 15 days. Plasma ALPP and Ca increased suddenly at 1 day, reached their highest concentrations at 6 days, and then decreased gradually over time. ALPP and Ca concentrations at 15 days after E2 administration were very similar to those before E2 administration. GPT increased at 1 day, and the highest GPT was detected at 3 days; GPT then gradually decreased over time. GPT and HSI at 15 days after E2 administration were also very similar to those before administration. HSI likewise increased at 1 day, peaked at 3 days, and then gradually decreased over time. These results suggest that plasma ALPP, Ca, GPT and HSI could be used as biomarkers of exogenous E2 exposure in coastal ecosystems, because their changes after E2 administration closely parallel those of VTG.

Evaluating ChatGPT's Competency in BIM Related Knowledge via the Korean BIM Expertise Exam (BIM 운용 전문가 시험을 통한 ChatGPT의 BIM 분야 전문 지식 수준 평가)

  • Choi, Jiwon;Koo, Bonsang;Yu, Youngsu;Jeong, Yujeong;Ham, Namhyuk
    • Journal of KIBIM
    • /
    • v.13 no.3
    • /
    • pp.21-29
    • /
    • 2023
  • ChatGPT, a chatbot based on GPT large language models, has gained immense popularity among the general public as well as domain professionals. To assess its proficiency in specialized fields, ChatGPT was tested on mainstream exams like the bar exam and medical licensing tests. This study evaluated ChatGPT's ability to answer questions related to Building Information Modeling (BIM) by testing it on Korea's BIM expertise exam, focusing primarily on multiple-choice problems. Both GPT-3.5 and GPT-4 were tested by prompting them to provide the correct answers to three years' worth of exams, totaling 150 questions. The results showed that both versions passed the test with average scores of 68 and 85, respectively. GPT-4 performed particularly well in categories related to 'BIM software' and 'Smart Construction technology'. However, it did not fare well in 'BIM applications'. Both versions were more proficient with short-answer choices than with sentence-length answers. Additionally, GPT-4 struggled with questions related to BIM policies and regulations specific to the Korean industry. Such limitations might be addressed by using tools like LangChain, which allow for feeding domain-specific documents to customize ChatGPT's responses. These advancements are anticipated to enhance ChatGPT's utility as a virtual assistant for BIM education and modeling automation.
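
Scoring a multiple-choice exam like the one above reduces to scaling raw correct counts to the 100-point scale and averaging over years. The sketch below assumes 50 questions per yearly exam and hypothetical per-year correct counts (chosen so the average lands on GPT-3.5's reported 68), since the abstract reports only the three-year averages.

```python
def exam_score(num_correct, num_questions, max_score=100):
    """Scale raw correct answers up to the exam's 100-point scale."""
    return round(max_score * num_correct / num_questions)

# Hypothetical per-year correct counts over three 50-question exams.
yearly_correct = [32, 34, 36]
scores = [exam_score(c, 50) for c in yearly_correct]
average = sum(scores) / len(scores)
print(scores, average)  # [64, 68, 72] 68.0
```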

Analyzing Mathematical Performances of ChatGPT: Focusing on the Solution of National Assessment of Educational Achievement and the College Scholastic Ability Test (ChatGPT의 수학적 성능 분석: 국가수준 학업성취도 평가 및 대학수학능력시험 수학 문제 풀이를 중심으로)

  • Kwon, Oh Nam;Oh, Se Jun;Yoon, Jungeun;Lee, Kyungwon;Shin, Byoung Chul;Jung, Won
    • Communications of Mathematical Education
    • /
    • v.37 no.2
    • /
    • pp.233-256
    • /
    • 2023
  • This study conducted foundational research to derive ways to use ChatGPT in mathematics education by analyzing ChatGPT's responses to questions from the National Assessment of Educational Achievement (NAEA) and the College Scholastic Ability Test (CSAT). ChatGPT, a generative artificial intelligence model, has gained attention in various fields, and there is a growing demand for its use in education as the number of users rapidly increases. To the best of our knowledge, there are very few reported cases of educational studies utilizing ChatGPT. In this study, we analyzed ChatGPT-3.5's responses to questions from three years of the National Assessment of Educational Achievement and the College Scholastic Ability Test, categorizing them based on the percentage of correct answers, the accuracy of the solution process, and types of errors. The correct-answer rates for ChatGPT on the National Assessment of Educational Achievement and the College Scholastic Ability Test questions were 37.1% and 15.97%, respectively. The accuracy of ChatGPT's solution process was calculated as 3.44 for the National Assessment of Educational Achievement and 2.49 for the College Scholastic Ability Test. Errors in solving math problems with ChatGPT were classified into procedural and functional errors. Procedural errors referred to mistakes in connecting expressions to the next step or in calculations, while functional errors were related to how ChatGPT recognized, judged, and outputted text. This analysis suggests that the percentage of correct answers alone should not be the criterion for assessing ChatGPT's mathematical performance; rather, a combination of the accuracy of the solution process and the types of errors should be considered.
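
The study's central point, that correct-answer rate, solution-process accuracy, and error types should be combined, can be expressed as a single summary function. The record layout and the rubric scale in the sketch below are assumptions for illustration, not the authors' instrument.

```python
def summarize(solutions):
    """solutions: list of dicts with keys 'correct' (bool), 'process_score'
    (assumed 0-4 rubric), and 'error' (None, 'procedural', or 'functional')."""
    n = len(solutions)
    correct_rate = 100 * sum(s["correct"] for s in solutions) / n
    mean_process = sum(s["process_score"] for s in solutions) / n
    errors = {}
    for s in solutions:
        if s["error"]:
            errors[s["error"]] = errors.get(s["error"], 0) + 1
    return round(correct_rate, 1), round(mean_process, 2), errors

# Stub grading records for three problems.
sample = [
    {"correct": True,  "process_score": 4, "error": None},
    {"correct": False, "process_score": 2, "error": "procedural"},
    {"correct": False, "process_score": 1, "error": "functional"},
]
print(summarize(sample))  # (33.3, 2.33, {'procedural': 1, 'functional': 1})
```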

A Study on the Evaluation of LLM's Gameplay Capabilities in Interactive Text-Based Games (대화형 텍스트 기반 게임에서 LLM의 게임플레이 기능 평가에 관한 연구)

  • Dongcheul Lee
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.3
    • /
    • pp.87-94
    • /
    • 2024
  • We investigated the feasibility of using Large Language Models (LLMs) to play text-based games without prior training on game data. We adopted ChatGPT-3.5 and its state-of-the-art successor, ChatGPT-4, as the LLM implementations. In addition, we added the persistent memory feature proposed in this paper to ChatGPT-4, creating three game-player agents in total. We used Zork, one of the most famous text-based games, to see whether the agents could navigate complex locations, gather information, and solve puzzles. The results showed that the agent with persistent memory had the widest range of exploration and the best score among the three agents. However, all three agents were limited in solving puzzles, indicating that LLMs are vulnerable to problems that require multi-level reasoning. Nevertheless, the proposed agent was still able to visit 37.3% of the total locations and collect all the items in the locations it visited, demonstrating the potential of LLMs.
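
The persistent-memory idea this paper proposes, carrying a running log of past observations and actions into every prompt so the model retains state across turns, can be sketched as follows. The prompt format and the stub model standing in for ChatGPT-4 are assumptions for illustration, not the paper's exact design.

```python
class PersistentMemoryAgent:
    """Minimal sketch of a persistent-memory game agent: every prompt is
    prefixed with the full log of earlier observations and actions.
    `llm` is any callable mapping a prompt string to a reply string."""
    def __init__(self, llm):
        self.llm = llm
        self.memory = []  # persistent log of (observation, action) pairs

    def act(self, observation):
        history = "\n".join(f"Saw: {o} / Did: {a}" for o, a in self.memory)
        prompt = f"{history}\nCurrent room: {observation}\nNext command?"
        action = self.llm(prompt)
        self.memory.append((observation, action))
        return action

# Stub model: changes direction once the memory shows we already saw the Forest.
def stub_llm(prompt):
    return "go south" if "Saw: Forest" in prompt else "go north"

agent = PersistentMemoryAgent(stub_llm)
print(agent.act("Forest"))    # -> go north  (no memory yet)
print(agent.act("Clearing"))  # -> go south  (memory recalls the Forest)
```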