• Title/Summary/Keyword: MachineLearning


Identifying Main Forest Environmental Factors to Discern Slow-Moving Landslide-Prone Areas in the Republic of Korea (땅밀림 실태조사 우려지 판정에서의 주요 산지환경 인자 분석)

  • Dongyeob Kim;Sanghoo Youn;Sangjun Im;Jung Il Seo;Taeho Bong
    • Journal of Korean Society of Forest Science / v.113 no.3 / pp.349-360 / 2024
  • This study aimed to analyze the main forest environmental factors affecting the discernment of slow-moving landslide-prone areas in the Republic of Korea, based on data from a detailed landslide survey conducted from 2019 to 2021. Field survey data from 256 sites were collected covering 29 forest environmental factors in seven categories, including geology, soil, and topography. The analysis was conducted using the Random Forest model (AUC = 0.910) and XGBoost model (Accuracy = 0.808, Kappa = 0.594, F1-measure = 0.494), which were evaluated as having high classification accuracy during the machine learning model development process. Consequently, factors with a high mean decrease Gini (MDG), representing classification importance, were identified as the presence of cracks (average MDG of both models: 22.1), peak elevation (14.8), and the presence of steps (7.0), indicating that these were significant factors in determining slow-moving landslide-prone areas. The presence of cracks and steps aligned well with the characteristics of slow-moving landslides, suggesting that their importance should be emphasized in future detailed landslide surveys. However, the influence of the peak elevation was considered somewhat overestimated due to the characteristics of the input data used in the analysis. These findings are expected to further improve the accuracy and efficiency of final judgments in detailed landslide surveys.
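The abstract ranks factors by mean decrease Gini from a Random Forest. As a minimal sketch of how such a ranking is computed, the snippet below uses scikit-learn, whose feature_importances_ attribute is the impurity-based (Gini) importance, the analogue of MDG; the factor names and values are hypothetical placeholders, not the survey's data.

```python
# Hedged sketch: ranking forest-environment factors by impurity-based (Gini)
# importance. Data and feature names are invented for illustration only.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.DataFrame({
    "crack_present":    [0, 1, 1, 0, 1, 0, 1, 0],    # hypothetical survey fields
    "peak_elevation_m": [320, 540, 610, 280, 700, 350, 590, 300],
    "step_present":     [0, 1, 0, 0, 1, 0, 1, 0],
    "prone":            [0, 1, 1, 0, 1, 0, 1, 0],    # landslide-prone label
})
X, y = df.drop(columns="prone"), df["prone"]

model = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# feature_importances_ is the normalized mean decrease in Gini impurity.
mdg = pd.Series(model.feature_importances_, index=X.columns)
print(mdg.sort_values(ascending=False))
```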

Enhancing Leadership Skills of Construction Students Through Conversational AI-Based Virtual Platform

  • Rahat HUSSAIN;Akeem PEDRO;Mehrtash SOLTANI;Si Van Tien TRAN;Syed Farhan Alam ZAIDI;Chansik PARK;Doyeop LEE
    • International conference on construction engineering and project management / 2024.07a / pp.1326-1327 / 2024
  • The construction industry is renowned for its dynamic and intricate characteristics, which demand proficient leadership skills for successful project management. However, the existing training platforms within this sector often overlook the significance of soft skills in leadership development. These platforms primarily focus on safety, work processes, and technical modules, leaving a noticeable gap in preparing future leaders, especially students in the construction domain, for the complex challenges they will encounter in their professional careers. It is crucial to recognize that effective leadership in construction projects requires not only technical expertise but also the ability to communicate effectively, collaborate with diverse stakeholders, and navigate complex relationships. These soft skills are critical for managing teams, resolving conflicts, and driving successful project outcomes. In addition, the construction sector has been slow to adopt and harness the potential of advanced emerging technologies, such as virtual reality and artificial intelligence, to enhance the soft skills of future leaders. Therefore, there is a need for a platform where students can practice complex situations and conversations in a safe and repeatable training environment. To address these challenges, this study proposes a pioneering approach by integrating conversational AI techniques using large language models (LLMs) within virtual worlds. Although LLMs like ChatGPT possess extensive knowledge across various domains, their responses may lack relevance in specific contexts. Prompt engineering techniques are utilized to ensure more accurate and effective responses, tailored to the specific requirements of the targeted users. This involves designing and refining the input prompts given to the language model to guide its response generation. By carefully crafting the prompts and providing context-specific instructions, the model can generate responses that are more relevant and aligned with the desired outcomes of the training program. The proposed system offers students interactive engagement by simulating diverse construction site roles through conversational AI-based agents. Students can face realistic challenges that test and enhance their soft skills in a practical context. They can engage in conversations with AI-based avatars representing different construction site roles, such as machine operators, laborers, and site managers. These avatars are equipped with AI capabilities to respond dynamically to user interactions, allowing students to practice their communication and negotiation skills in realistic scenarios. Additionally, AI instructors can provide guidance, feedback, and coaching tailored to the individual needs of each student, enhancing the effectiveness of the training program; their immediate feedback helps students improve their decision-making and problem-solving abilities. The proposed immersive learning environment is expected to significantly enhance students' leadership competencies, such as communication, decision-making, and conflict resolution, in a practical context. This study highlights the benefits of utilizing conversational AI in educational settings to prepare construction students for real-world leadership roles. By providing hands-on, practical experience in dealing with site-specific challenges, students can develop the necessary skills and confidence to excel in their future roles.
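The prompt-engineering step described above can be pictured as a role-specific system prompt wrapped around a general LLM. The sketch below is an illustration under stated assumptions: the paper names no particular API, so the OpenAI Python SDK, the model id, the role descriptions, and the avatar_reply function are all invented for this example.

```python
# Hedged sketch of role-conditioned prompting for AI site-role avatars.
# The SDK choice, model id, and prompts are assumptions, not the paper's setup.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ROLE_PROMPTS = {
    "site_manager": (
        "You are a construction site manager on a high-rise project. "
        "Respond realistically to a student practicing leadership: raise "
        "schedule, safety, and coordination concerns, and push back on "
        "vague instructions."
    ),
    "machine_operator": (
        "You are a crane operator who follows safety rules strictly and "
        "asks for clarification before acting on unclear instructions."
    ),
}

def avatar_reply(role: str, student_message: str) -> str:
    """Return an in-character reply from the avatar for the given role."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model id
        messages=[
            {"role": "system", "content": ROLE_PROMPTS[role]},
            {"role": "user", "content": student_message},
        ],
    )
    return response.choices[0].message.content

print(avatar_reply("machine_operator", "Lift the beam now, we're behind schedule."))
```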

Detection of Abnormal CAN Messages Using Periodicity and Time Series Analysis (CAN 메시지의 주기성과 시계열 분석을 활용한 비정상 탐지 방법)

  • Se-Rin Kim;Ji-Hyun Sung;Beom-Heon Youn;Harksu Cho
    • The Transactions of the Korea Information Processing Society / v.13 no.9 / pp.395-403 / 2024
  • Recently, with the advancement of technology, the automotive industry has seen an increase in network connectivity. CAN (Controller Area Network) bus technology enables fast and efficient data communication between various electronic devices and systems within a vehicle, providing a platform that integrates and manages a wide range of functions, from core systems to auxiliary features. However, this increased connectivity raises concerns about network security, as external attackers could potentially gain access to the automotive network, taking control of the vehicle or stealing personal information. This paper analyzed abnormal messages occurring in CAN and confirmed that message occurrence periodicity, frequency, and data changes are important factors in detecting abnormal messages. Through DBC decoding, the specific meanings of CAN messages were interpreted. Based on this, an anomaly classification model using a GRU was proposed: it analyzes the periodicity and trend of message occurrences, measuring the difference (residual) between the messages predicted and those actually occurring within a certain period as an anomaly metric. Additionally, for multi-class classification of the attack techniques behind abnormal messages, a Random Forest model was introduced as a multi-classifier using message occurrence frequency, periodicity, and residuals, achieving improved performance. This model achieved a high accuracy of over 99% in detecting abnormal messages and demonstrated superior performance compared to existing models.
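The residual idea in the abstract, predicting the next value of a periodic signal and flagging large prediction errors, can be sketched in PyTorch as below. The window length, hidden size, threshold, and random stand-in data are illustrative assumptions, not the paper's configuration.

```python
# Hedged sketch: GRU one-step-ahead prediction on a decoded CAN signal,
# with |predicted - actual| as the anomaly score. All values are placeholders.
import torch
import torch.nn as nn

class CANPredictor(nn.Module):
    def __init__(self, n_features: int = 1, hidden: int = 32):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        out, _ = self.gru(x)
        return self.head(out[:, -1])      # predict the next sample

model = CANPredictor()                    # would be trained on normal traffic
window = torch.randn(1, 50, 1)            # last 50 decoded samples (placeholder)
actual_next = torch.randn(1, 1)

with torch.no_grad():
    predicted = model(window)

residual = (predicted - actual_next).abs().item()
THRESHOLD = 0.5                           # tuned on normal traffic in practice
print("anomalous" if residual > THRESHOLD else "normal", residual)
```

In the paper's pipeline, such residuals, together with message frequency and periodicity, then feed a Random Forest for multi-class attack classification.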

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.13 no.1 / pp.47-60 / 2010
  • Most classification research has used learning-based models such as kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), or statistics-based methods such as the Bayesian classifier and neural network algorithms (NNA). However, these approaches face space and time limitations when classifying the vast number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which captures the real meaning of words poorly. Korean web page classification faces an additional problem: Korean words are often polysemous, carrying multiple meanings. For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in this environment of large data sets and polysemous words. LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimensionality. This produces a new low-dimensional semantic space for representing vectors, which makes classification efficient and exposes the latent meaning of words and documents (or web pages). Although LSA is good at classification, it has a drawback: when SVD reduces the dimensions of the matrix and creates the new semantic space, it selects the dimensions that represent the vectors well, not the dimensions that discriminate between them. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects optimal dimensions to both discriminate and represent vectors well, minimizing this drawback and improving performance. The proposed method shows better and more stable performance than other LSA variants in low-dimensional spaces. In addition, we obtain further improvements in classification by creating and selecting features, reducing stopwords, and weighting specific values statistically.
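As a minimal sketch of the baseline LSA pipeline the paper builds on (not its supervised dimension-selection method), the snippet below reduces a TF-IDF term-document matrix with truncated SVD and classifies in the reduced semantic space with kNN; the toy corpus and parameters are assumptions.

```python
# Hedged sketch: TF-IDF -> SVD (LSA) -> kNN classification in semantic space.
# Corpus, labels, k, and the number of SVD components are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

docs = ["bank loan interest", "river bank flood",
        "loan credit rating", "flood water river"]
labels = ["finance", "nature", "finance", "nature"]

lsa_knn = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2, random_state=0),  # the SVD/LSA step
    KNeighborsClassifier(n_neighbors=1),
)
lsa_knn.fit(docs, labels)
print(lsa_knn.predict(["interest rate bank"]))  # likely ['finance'] on this toy data
```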


The Prediction of DEA based Efficiency Rating for Venture Business Using Multi-class SVM (다분류 SVM을 이용한 DEA기반 벤처기업 효율성등급 예측모형)

  • Park, Ji-Young;Hong, Tae-Ho
    • Asia pacific journal of information systems / v.19 no.2 / pp.139-155 / 2009
  • For the last few decades, many studies have tried to explore and unveil venture companies' success factors and unique features in order to identify the sources of such companies' competitive advantages over their rivals. Venture companies have tended to deliver high returns for investors, generally by making the best use of information technology, and for this reason many are keen on attracting avid investors' attention. Investors generally make their investment decisions by carefully examining the evaluation criteria of the alternatives. To them, credit rating information provided by international rating agencies such as Standard and Poor's, Moody's, and Fitch is a crucial source for such pivotal concerns as a company's stability, growth, and risk status. But this type of information is generated only for companies issuing corporate bonds, not for venture companies. Therefore, this study proposes a method for evaluating venture businesses, presenting recent empirical results using financial data of Korean venture companies listed on KOSDAQ in the Korea Exchange. In addition, this paper used a multi-class SVM for the prediction of the DEA-based efficiency rating for venture businesses derived from our proposed method. Our approach sheds light on ways to locate efficient companies generating high levels of profit. Above all, in determining effective ways to evaluate a venture firm's efficiency, it is important to understand the major contributing factors of such efficiency. This paper is therefore built on two ideas for classifying which companies are the more efficient venture companies: i) constructing a DEA-based multi-class rating for the sample companies, and ii) developing a multi-class SVM-based efficiency prediction model for classifying all companies. First, Data Envelopment Analysis (DEA) is a non-parametric multiple input-output efficiency technique that measures the relative efficiency of decision making units (DMUs) using a linear programming based model. It is non-parametric because it requires no assumption about the shape or parameters of the underlying production function. DEA has already been widely applied to evaluating the relative efficiency of DMUs; recently, a number of DEA-based studies have evaluated the efficiency of various types of companies, such as internet companies and venture companies, and DEA has also been applied to corporate credit ratings. In this study we utilized DEA to sort venture companies into efficiency-based ratings. The Support Vector Machine (SVM), on the other hand, is a popular technique for solving data classification problems. In this paper, we employed SVM to classify the efficiency ratings of IT venture companies according to the results of DEA. The SVM method was first developed by Vapnik (1995). As one of many machine learning techniques, SVM is grounded in statistical theory and has shown good performance, especially in generalization capacity for classification tasks, resulting in numerous applications in many areas of business. SVM is basically an algorithm that finds the maximum margin hyperplane, the hyperplane giving the maximum separation between classes; the support vectors are the training points closest to this hyperplane. If the classes cannot be separated linearly, a kernel function can be used: for nonlinear class boundaries, the inputs in the original input space are mapped into a high-dimensional dot-product feature space. Many studies have applied SVM to predicting bankruptcy, forecasting financial time series, and estimating credit ratings. In this study we employed SVM to develop a data mining-based efficiency prediction model, using the Gaussian radial basis function as the kernel. For multi-class SVM, we adopted the one-against-one binary classification approach and the two all-together methods proposed by Weston and Watkins (1999) and Crammer and Singer (2000), respectively. We used corporate information on 154 companies listed on the KOSDAQ market in the Korea Exchange, obtaining their financial information for 2005 from KIS (Korea Information Service, Inc.). Using this data, we constructed a multi-class rating with DEA efficiency and built a data mining-based multi-class prediction model. Among the three multi-classification schemes, the Weston and Watkins method achieved the best hit ratio on the test data set. In multi-classification problems such as efficiency ratings of venture businesses, it is very useful for investors to know the class to within a one-class error when the exact class is difficult to determine in the actual market, so we also present accuracy within one-class errors, where the Weston and Watkins method showed 85.7% accuracy on our test samples. We conclude that the DEA-based multi-class approach for venture businesses generates more information than a binary classification, whatever the efficiency level. We believe this model can help investors in decision making, as it provides a reliable tool to evaluate venture companies in the financial domain. For future research, we perceive the need to improve the variable selection process, the selection of kernel function parameters, the generalization, and the sample size for the multi-class setting.
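As a minimal sketch of the classification stage, the snippet below fits an RBF-kernel SVM mapping hypothetical financial features onto DEA-style rating classes; scikit-learn's SVC implements the one-against-one scheme internally, matching one of the three multi-class strategies compared, and the features, labels, and firms are invented.

```python
# Hedged sketch: multi-class SVM with a Gaussian RBF kernel over DEA-style
# efficiency ratings. All financial figures and classes are placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical per-firm features, e.g. (total assets, operating profit).
X = np.array([[120, 15], [80, 2], [200, 40], [60, -5], [150, 25], [90, 4]])
y = np.array([0, 1, 0, 2, 0, 1])    # DEA-based efficiency rating classes

clf = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf"),               # uses one-against-one multi-class internally
)
clf.fit(X, y)
print(clf.predict([[130, 20]]))
```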

The impact of functional brain change by transcranial direct current stimulation effects concerning circadian rhythm and chronotype (일주기 리듬과 일주기 유형이 경두개 직류전기자극에 의한 뇌기능 변화에 미치는 영향 탐색)

  • Jung, Dawoon;Yoo, Soomin;Lee, Hyunsoo;Han, Sanghoon
    • Korean Journal of Cognitive Science / v.33 no.1 / pp.51-75 / 2022
  • Transcranial direct current stimulation (tDCS) is a non-invasive brain stimulation technique that can alter neuronal activity in particular brain regions. Many studies have investigated how tDCS modulates neuronal activity and reorganizes neural networks, but it is difficult to draw firm conclusions about the effects of brain stimulation because studies are heterogeneous with respect to stimulation parameters as well as individual differences, and their reported effects do not fully agree. In particular, few studies so far have examined why responses to brain stimulation vary with time. This study investigated individual variability in the response to brain stimulation based on circadian rhythm and chronotype. Participants were divided into two groups, morning type and evening type. The experiment was conducted over Zoom, a video meeting program. After the equipment manuals were explained, participants were sent the experimental tools: a Muse EEG device, a tDCS device, a cell phone, and a cell phone holder. Participants were required to place the phone in front of a camera so that the experimenter could monitor the EEG data online. Two participants who had difficulty operating the devices completed the experiment in a laboratory setting where the experimenter set up the equipment. Across all participants, an SVM with leave-one-out cross-validation achieved 98% accuracy in classifying the effects of morning versus evening stimulation. For morning types, accuracies of 92% and 96% were achieved in classifying the morning-stimulation and evening-stimulation conditions, and for evening types the accuracy was 94% for classifying the effects of morning and evening stimulation. Feature importance differed between the morning-stimulation and evening-stimulation classifications for both morning and evening types. The results indicate that the effect of brain stimulation can be explained by both brain state and trait. Our findings suggest that a tDCS protocol targeting a given state should be adjusted for individual differences as well as for the target state.
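The reported accuracies come from an SVM scored with leave-one-out cross-validation; a minimal sketch of that evaluation scheme follows, with random placeholder features standing in for the EEG data.

```python
# Hedged sketch: SVM + leave-one-out cross-validation, as in the evaluation
# above. The 8 "EEG features" per session are random placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))      # 20 sessions x 8 EEG-derived features
y = rng.integers(0, 2, size=20)   # 0 = morning stimulation, 1 = evening

scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=LeaveOneOut())
print(f"LOOCV accuracy: {scores.mean():.2f}")
```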

Comparative study of flood detection methodologies using Sentinel-1 satellite imagery (Sentinel-1 위성 영상을 활용한 침수 탐지 기법 방법론 비교 연구)

  • Lee, Sungwoo;Kim, Wanyub;Lee, Seulchan;Jeong, Hagyu;Park, Jongsoo;Choi, Minha
    • Journal of Korea Water Resources Association / v.57 no.3 / pp.181-193 / 2024
  • The increasing atmospheric imbalance caused by climate change leads to elevated precipitation and a heightened frequency of flooding, so there is a growing need for technology to detect and monitor these events. To minimize flood damage, continuous monitoring is essential, and flooded areas can be detected with Synthetic Aperture Radar (SAR) imagery, which is not affected by weather conditions. The observed data undergoes a preprocessing step, using a median filter to reduce noise. Classification techniques were then employed to separate water bodies from non-water bodies and to evaluate the effectiveness of each method in flood detection. In this study, the Otsu method and the Support Vector Machine (SVM) were used for this classification, and overall model performance was assessed with a confusion matrix. The suitability for flood detection was evaluated by comparing the Otsu method, an optimal threshold-based classifier, with SVM, a machine learning technique that minimizes misclassifications through training. The Otsu method delineated boundaries between water and non-water bodies well but exhibited a higher rate of misclassification due to the influence of mixed substances. Conversely, SVM produced a lower false positive rate and proved less sensitive to mixed substances; consequently, SVM showed higher accuracy under non-flood conditions. While the Otsu method showed slightly higher accuracy under flood conditions than SVM, the difference was less than 5% (Otsu: 0.93, SVM: 0.90). In pre-flooding and post-flooding conditions, however, the accuracy difference exceeded 15%, indicating that SVM is more suitable for water body and flood detection (Otsu: 0.77, SVM: 0.92). Based on these findings, more accurate detection of water bodies and floods is expected to contribute to minimizing flood-related damage and losses.
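A minimal sketch of the two classifiers compared above: median filtering for noise, Otsu's global threshold, and a pixel-wise SVM trained on a few labeled backscatter samples. The synthetic image stands in for Sentinel-1 data, and real SAR preprocessing (calibration, terrain correction) is omitted.

```python
# Hedged sketch: Otsu threshold vs. SVM for water/non-water classification
# on synthetic SAR-like backscatter (dB). All values are placeholders.
import numpy as np
from scipy.ndimage import median_filter
from skimage.filters import threshold_otsu
from sklearn.svm import SVC

rng = np.random.default_rng(0)
water = rng.normal(-18, 2, size=(32, 64))   # low backscatter over water
land = rng.normal(-8, 2, size=(32, 64))     # higher backscatter over land
img = median_filter(np.vstack([water, land]), size=3)  # noise reduction

# Otsu: one global threshold on the bimodal histogram.
otsu_water = img < threshold_otsu(img)

# SVM: trained on a few labeled pixels, then applied pixel-wise.
X_train = np.array([[-19.0], [-17.0], [-9.0], [-7.0]])
y_train = np.array([1, 1, 0, 0])            # 1 = water, 0 = non-water
svm = SVC(kernel="rbf").fit(X_train, y_train)
svm_water = svm.predict(img.reshape(-1, 1)).reshape(img.shape).astype(bool)

print("Otsu water fraction:", otsu_water.mean())
print("SVM water fraction:", svm_water.mean())
```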

Development Process for User Needs-based Chatbot: Focusing on Design Thinking Methodology (사용자 니즈 기반의 챗봇 개발 프로세스: 디자인 사고방법론을 중심으로)

  • Kim, Museong;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.221-238 / 2019
  • Recently, companies and public institutions have been actively introducing chatbot services for customer counseling and response. Introducing a chatbot service not only saves labor costs for companies and organizations but also enables rapid communication with customers. Advances in data analytics and artificial intelligence are driving the growth of these chatbot services. Current chatbots can understand users' questions and offer the most appropriate answers through machine learning and deep learning. The advancement of core chatbot technologies such as NLP, NLU, and NLG has made it possible to understand words, paragraphs, meanings, and emotions, and for this reason the value of chatbots continues to rise. However, technology-oriented chatbots can be inconsistent with what users inherently want, so chatbots need to be addressed in the area of user experience, not just technology. The Fourth Industrial Revolution has highlighted the importance of user experience alongside advances in artificial intelligence, big data, cloud, and IoT technologies. The development of IT technology and the growing importance of user experience have given people a variety of new environments and changed lifestyles, meaning that experiences in interactions with people, services (products), and the environment have become very important. It is therefore time to develop user needs-based services (products) that can provide new experiences and value. This study proposes a chatbot development process based on user needs by applying the design thinking approach, a representative methodology in the field of user experience, to chatbot development. The proposed process consists of four steps. The first step, 'Setting up the knowledge domain', establishes the chatbot's area of expertise. The second step, 'Knowledge accumulation and insight identification', accumulates the information corresponding to the configured domain and derives insights. The third step, 'Opportunity development and prototyping', is where full-scale development begins. Finally, the 'User feedback' step collects feedback from users on the developed prototype. This yields a user needs-based service (product) that meets the process's objectives: beginning with fact gathering through user observation, the process abstracts these facts to derive insights and explore opportunities, then concretizes them to structure the desired information and provide functions that fit the user's mental model, producing a chatbot that meets user needs. To confirm the effectiveness of the proposed process, we present an actual construction example for the domestic cosmetics market. We chose this market because user experience plays a strong role in it, so responses from users can be understood quickly. This study has a theoretical implication in that it proposes a new chatbot development process by incorporating the design thinking methodology, and it differs from existing chatbot development research in focusing on user experience rather than technology. It also has practical implications in that it offers realistic methods that companies or institutions can apply immediately. In particular, the proposed process can be accessed and utilized by anyone, since user needs-based chatbots can be developed even by non-experts. Because the case study covered only one field, further research is needed: beyond the cosmetics market, additional studies should be conducted in other fields where user experience matters, such as the smartphone and automotive markets. Through this, the approach can grow into a general process for developing chatbots centered on user experience rather than technology.

Prediction of Air Temperature and Relative Humidity in Greenhouse via a Multilayer Perceptron Using Environmental Factors (환경요인을 이용한 다층 퍼셉트론 기반 온실 내 기온 및 상대습도 예측)

  • Choi, Hayoung;Moon, Taewon;Jung, Dae Ho;Son, Jung Eek
    • Journal of Bio-Environment Control / v.28 no.2 / pp.95-103 / 2019
  • Temperature and relative humidity are important factors in crop cultivation and should be properly controlled to improve crop yield and quality. In order to control the environment accurately, we need to predict how it will change in the future. The objective of this study was to predict air temperature and relative humidity at a future time by using a multilayer perceptron (MLP). The data required to train the MLP were collected every 10 min from Oct. 1, 2016 to Feb. 28, 2018 in an eight-span greenhouse (1,032 m²) cultivating mango (Mangifera indica cv. Irwin). The inputs for the MLP were greenhouse inside and outside environment data, and the set-point and operating values of the environment control devices. With these data, the MLP was trained to predict the air temperature and relative humidity 10 to 120 min into the future. Considering Korea's four distinct seasons, three-day data from each season were compared as test data. The MLP was optimized with four hidden layers of 128 nodes for air temperature (R² = 0.988) and four hidden layers of 64 nodes for relative humidity (R² = 0.990). Due to the characteristics of the MLP, accuracy decreased as the prediction horizon lengthened. However, air temperature and relative humidity were properly predicted regardless of the environmental changes from season to season. For specific events such as spray irrigation, however, the number of training samples was too small, resulting in poor predictive accuracy. In this study, air temperature and relative humidity were appropriately predicted through optimization of the MLP, but the results were limited to the experimental greenhouse. Therefore, it is necessary to collect more data from greenhouses in various places and to modify the structure of the neural network for generalization.
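The reported architecture, four hidden layers tuned per target variable, can be sketched with scikit-learn's MLPRegressor as below; the input features and data are placeholders rather than the study's 10-minute greenhouse records.

```python
# Hedged sketch: an MLP with four hidden layers of 128 nodes (the paper's
# air-temperature configuration) fitted on synthetic stand-in data.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Hypothetical inputs: outside temp, radiation, vent opening, heating setpoint.
X = rng.normal(size=(500, 4))
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 0.1, size=500)

mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(128, 128, 128, 128),
                 max_iter=2000, random_state=0),
)
mlp.fit(X, y)
print("R^2 on training data:", mlp.score(X, y))
```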

KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

  • Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.191-206 / 2022
  • Recently, utilizing a pre-trained language model (PLM) has become the de facto approach to achieving state-of-the-art performance on various natural language tasks (called downstream tasks), such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during training and shows worse performance on unseen (out-of-distribution) domains. For this reason, there have been many efforts to develop domain-specific PLMs for fields such as the medical and legal industries. In this paper, we discuss the training of a finance-specific PLM for the Korean language and its applications. Our finance-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared to state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora like Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance-domain datasets that require finance-specific knowledge to solve the given problems.
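As a minimal sketch of how such a domain PLM is applied to a downstream task like topic classification, the snippet below loads a BERT-style Korean checkpoint with Hugging Face Transformers; the model id is a public stand-in, since KB-BERT's distribution is not specified here, and the classification head is untrained.

```python
# Hedged sketch: loading a Korean BERT-style PLM for sequence classification.
# "klue/bert-base" is a placeholder for a finance-domain checkpoint like KB-BERT.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "klue/bert-base"   # placeholder; not the KB-BERT weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=3)

inputs = tokenizer("금리 인상으로 대출 수요가 감소했다", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# The head is randomly initialized; fine-tuning on labeled finance data
# (topic classification, sentiment, QA) would precede real use.
print(logits.softmax(dim=-1))
```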