• Title/Summary/Keyword: training data (학습 데이터)

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies / v.21 no.4 / pp.64-80 / 2018
  • In Korea, citizens have access only to general information about crime, so it is difficult for them to know how much they are exposed to it. If the police could predict crime-risk areas, they could respond to crime efficiently despite limited personnel and enforcement resources. However, Korea has no such prediction system, and related research is scarce. Against this background, the ultimate goal of this study is to develop an automated crime prediction system. As a first step, we built a big data set consisting of real local crime information and physical and non-physical urban data. We then developed a crime prediction model using machine learning. Finally, we assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to aid public understanding. Drawing on the factors affecting crime occurrence identified in previous and case studies, the following data were processed into a form suitable for machine learning: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, which are known to be powerful and accurate in various fields, were used to construct the crime prediction model. The decision tree model, which had the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence, the most frequent crimes in case city J, and the probability of crime was estimated on a $250{\times}250$ m grid. The results show that high crime-risk areas occur in three patterns in case city J. The probability of crime was divided into three classes and visualized on a map using the $250{\times}250$ m grid. In conclusion, we developed a crime prediction model using a machine learning algorithm and visualized crime-risk areas on a map; the model can be recalculated and the results re-visualized as time and urban conditions change.
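The model-selection step described in this abstract (choose the candidate with the lowest RMSE, then report results per 250 m grid cell) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the hold-out values, and the assumption that coordinates are already projected in metres are all hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def select_best_model(y_true, predictions):
    """Return (name, rmse) for the candidate with the lowest RMSE."""
    scores = {name: rmse(y_true, pred) for name, pred in predictions.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


def grid_cell(x_m, y_m, cell_size_m=250.0):
    """Map projected coordinates in metres to a (column, row) grid index."""
    return (math.floor(x_m / cell_size_m), math.floor(y_m / cell_size_m))


# Hypothetical hold-out observations and model outputs, for illustration only.
observed = [0.0, 1.0, 0.0, 1.0]
candidates = {
    "decision_tree": [0.1, 0.9, 0.2, 0.8],
    "random_forest": [0.3, 0.7, 0.3, 0.6],
    "svm": [0.5, 0.5, 0.5, 0.5],
}
best_name, best_rmse = select_best_model(observed, candidates)
```

With these made-up numbers the decision tree happens to win, mirroring the abstract's outcome; real rankings depend entirely on the data.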

A Case Study: Improvement of Wind Risk Prediction by Reclassifying the Detection Results (풍해 예측 결과 재분류를 통한 위험 감지확률의 개선 연구)

  • Kim, Soo-ock;Hwang, Kyu-Hong
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.3 / pp.149-155 / 2021
  • Early warning systems for weather risk management in the agricultural sector have been developed to predict potential wind damage to crops. These systems take into account the daily maximum wind speed to determine the critical wind speed that causes fruit drop, and provide weather risk information to farmers. To increase the accuracy of wind risk predictions, an artificial neural network for binary classification was implemented. In the present study, daily wind speed and other weather data measured in 2019 at weather stations at sites of interest in Jeollabuk-do and Jeollanam-do, as well as Gyeongsangbuk-do and part of Gyeongsangnam-do, were used to train the neural network. These comprise 210 synoptic and automated weather stations operated by the Korea Meteorological Administration (KMA). The wind speed data collected at the same locations between January 1 and December 12, 2020 were used to validate the neural network model. The data collected from December 13, 2020 to February 18, 2021 were used to evaluate wind risk prediction performance before and after applying the artificial neural network. The critical wind speed for damage risk was set to 11 m/s, the wind speed reported to cause fruit drop and damage. Furthermore, the maximum wind speeds were expressed using a Weibull probability density function for wind damage warnings. The accuracy of wind damage risk prediction improved from 65.36% to 93.62% after re-classification with the artificial neural network. However, the error rate also increased, from 13.46% to 37.64%. The machine learning approach used in the present study is therefore likely to benefit cases in which a missed prediction by the risk warning system is a relatively serious issue.
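To make the Weibull step concrete, the sketch below evaluates a two-parameter Weibull density and the probability that the maximum wind speed exceeds the 11 m/s critical value. The shape and scale values are placeholders chosen for illustration; the abstract does not report fitted parameters, and the function names are hypothetical.

```python
import math


def weibull_pdf(v, shape, scale):
    """Two-parameter Weibull probability density at wind speed v (m/s).

    Returns 0.0 for v <= 0 (the boundary value is ignored in this sketch).
    """
    if v <= 0:
        return 0.0
    ratio = v / scale
    return (shape / scale) * ratio ** (shape - 1) * math.exp(-(ratio ** shape))


def weibull_exceedance(v, shape, scale):
    """Probability that the wind speed exceeds v, i.e. exp(-(v/scale)**shape)."""
    if v <= 0:
        return 1.0
    return math.exp(-((v / scale) ** shape))


CRITICAL_WIND_SPEED = 11.0  # m/s, the threshold stated in the abstract

# Placeholder parameters (assumed, not taken from the paper).
example_shape, example_scale = 2.0, 8.0
risk = weibull_exceedance(CRITICAL_WIND_SPEED, example_shape, example_scale)
```

A warning rule could then compare `risk` with a chosen probability threshold; how the original system combines this with the neural network's re-classification is not described in the abstract.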

3D Point Cloud Reconstruction Technique from 2D Image Using Efficient Feature Map Extraction Network (효율적인 feature map 추출 네트워크를 이용한 2D 이미지에서의 3D 포인트 클라우드 재구축 기법)

  • Kim, Jeong-Yoon;Lee, Seung-Ho
    • Journal of IKEEE / v.26 no.3 / pp.408-415 / 2022
  • In this paper, we propose a technique for reconstructing 3D point clouds from 2D images using an efficient feature map extraction network. The originality of the proposed method is as follows. First, we use a new feature map extraction network that is about 27% more efficient than existing techniques in terms of memory. The proposed network does not reduce the image size up to the middle of the deep learning network, so important information required for 3D point cloud reconstruction is not lost. We solved the increase in memory caused by the non-reduced image size by reducing the number of channels and by configuring the deep learning network to be shallow. Second, preserving the high-resolution features of the 2D image improves accuracy over the conventional technique. The feature map extracted from the non-reduced image contains more detailed information than that of the existing method, which further improves the reconstruction accuracy of the 3D point cloud. Third, we use a divergence loss that does not require shooting information. When training requires not only the 2D image but also the shooting angle, the dataset must contain this detailed information, which makes it difficult to construct. In this paper, the reconstruction accuracy of the 3D point cloud is increased by increasing the diversity of information through randomness, without additional shooting information. To evaluate the performance of the proposed method objectively, we used the ShapeNet dataset and the same evaluation method as the comparative papers; the proposed method achieved a CD value of 5.87, an EMD value of 5.81, and 2.9G FLOPs. Lower CD and EMD values mean that the reconstructed 3D point cloud is closer to the original. In addition, the lower the number of FLOPs, the less memory the deep learning network requires. The CD, EMD, and FLOPs results of the proposed method thus show an improvement of about 27% in memory and 6.3% in accuracy over the methods in other papers, demonstrating its performance objectively.
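For readers unfamiliar with the CD metric, the following sketch computes one common form of the Chamfer distance between two point clouds: the mean squared nearest-neighbour distance in each direction, summed. Conventions differ between papers (squared or unsquared distances, sums or means, scaling factors), and the abstract does not state which one underlies the reported value of 5.87, so this brute-force version is an assumption for illustration only.

```python
def _squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance using mean squared nearest-neighbour distances.

    Brute force, O(len(cloud_a) * len(cloud_b)); fine for small clouds only.
    """
    if not cloud_a or not cloud_b:
        raise ValueError("both point clouds must be non-empty")
    a_to_b = sum(
        min(_squared_distance(p, q) for q in cloud_b) for p in cloud_a
    ) / len(cloud_a)
    b_to_a = sum(
        min(_squared_distance(q, p) for p in cloud_a) for q in cloud_b
    ) / len(cloud_b)
    return a_to_b + b_to_a


# Tiny hypothetical clouds: the reconstruction has one extra point.
reference = [(0.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cd = chamfer_distance(reconstruction, reference)
```

Identical clouds give a distance of zero, which is why lower values indicate a reconstruction closer to the original.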

A fundamental study on the automation of tunnel blasting design using a machine learning model (머신러닝을 이용한 터널발파설계 자동화를 위한 기초연구)

  • Kim, Yangkyun;Lee, Je-Kyum;Lee, Sean Seungwon
    • Journal of Korean Tunnelling and Underground Space Association / v.24 no.5 / pp.431-449 / 2022
  • As many tunnels have been constructed, considerable experience and technique have accumulated in tunnel design as well as tunnel construction. Consequently, for routine tunnel design work it is often sufficient to modify or supplement previous similar designs, unless the tunnel has a unique structure or unusual geological conditions. In particular, referring to previous similar designs is reasonable for tunnel blast design, because the blast design produced at the design stage is preliminary: additional blast design based on test blasts is generally performed before tunnel excavation begins. Meanwhile, in the Industry 4.0 era, artificial intelligence (AI), whose use is surging across all industrial sectors, is widely applied to tunnelling and blasting. For drill-and-blast tunnels, AI is mainly applied to the estimation of blast vibration, rock mass classification, and similar tasks; however, there are few cases where it is applied to blast pattern design. This study therefore attempts to automate tunnel blast design by means of machine learning, a branch of artificial intelligence. To this end, blast design data were collected from 25 tunnel design reports for training and 2 additional reports for testing. From these, 4 design parameters (rock mass class, road type, and the cross-sectional areas of the upper section and the bench section) were used as input, and 16 design elements (blast cut type, specific charge, number of drill holes, spacing and burden for each blast hole group, etc.) as output. Based on these design data, three machine learning models (XGBoost, ANN, and SVM) were tested, and XGBoost was chosen as the best model. When assumed design parameters were input, the results showed a trend generally similar to that of an actual design. The results of this study are not yet sufficient to perform a whole blast design; however, additional studies are planned to make practical use possible after more blast design data are collected and the detailed machine learning processes are refined.

Seeking for a Curriculum of Dance Department in the University in the Age of the 4th Industrial Revolution (4차 산업혁명시대 대학무용학과 커리큘럼의 방향모색)

  • Baek, Hyun-Soon;Yoo, Ji-Young
    • Journal of Korea Entertainment Industry Association / v.13 no.3 / pp.193-202 / 2019
  • This study examines what changes are required in the curriculum of university dance departments in the age of the 4th industrial revolution. By comparing and analyzing the curricula of dance departments at five universities in Seoul, it presents five academic subjects that cover what dance education should teach in this age. First, integrative dance education, which integrates creativity and science education, is a subject that stimulates ideas and creativity and raises artistic sensitivity based on STEAM. Second, a curriculum built around predicting future prospects through Big Data can be put to good use in dealing with dance performance, the career paths of dance majors, and job creation by analyzing public opinion, evaluations, and feelings. Third, video education: as images, the major modern medium, tend to occupy most of the expressive area of art, dance combined with video enables existing dance works to be recreated as a new form of art, expanding the boundaries of dance from both academic and performing-arts viewpoints. Fourth, VR and AR are essential techniques in the era of smart media. Whether future dance studies take the form of performance, education, or industry, learning about VR and AR is indispensable if these technologies are to be applied digitally in every relevant field in keeping with the times. Last, a subject on the 4th industrial revolution and the dance art curriculum is needed to foresee the changes brought by the 4th industrial revolution and to teach the resulting changes, developments, and directions in the dance curriculum.

Adolescent delinquent behavior and the influence of friends: With specific focus on self-efficacy, parent-child conflict and parental control (친구가 청소년의 일탈행동에 미치는 영향: 자기효능감, 부모자녀 갈등 및 부모의 통제를 중심으로)

  • Young-Shin Park;Uichol Kim
    • Korean Journal of Culture and Social Issue / v.16 no.3 / pp.385-422 / 2010
  • This study examines adolescent delinquent behavior and the influence of friends, focusing specifically on friends' delinquent behavior and the influence of self-efficacy, parent-child conflict, and parental control. A total of 1,399 adolescents attending five different high schools (male=642, female=756; 915 students attending high school and 484 students attending vocational high school) completed a questionnaire developed by Ahn, Hwang, Kim and Park (1997) and Bandura's (1995a) self-efficacy scale. The results indicate that students attending high school had parents with higher education and socio-economic status and a better studying environment at home, while students attending vocational high school had higher parent-child conflict. Students attending high school had higher self-efficacy scores, while students attending vocational high school had higher scores on delinquent behavior. The LISREL analyses revealed a similar pattern for high school and vocational high school students. The combined analysis indicates that friends' delinquent behavior, parent-child conflict, and parental control had direct and positive effects on students' delinquent behavior, whereas self-efficacy had a direct and negative influence on it. A similar pattern was obtained for friends' delinquent behavior: their self-efficacy had a direct and negative influence on their delinquent behavior, and their parent-child conflict and parental control had direct and positive effects on it. In summary, students who had lower self-efficacy, higher parent-child conflict and parental control, and friends who were more likely to engage in delinquent behavior had higher scores on delinquent behavior. In addition, friends with lower self-efficacy scores and higher parent-child conflict and parental control were more likely to engage in delinquent behavior, which in turn influenced the students' delinquent behavior. Friends' delinquent behavior had the greatest influence on students' delinquent behavior, indicating the role of friends in influencing delinquency among adolescents.

Data-centric XAI-driven Data Imputation of Molecular Structure and QSAR Model for Toxicity Prediction of 3D Printing Chemicals (3D 프린팅 소재 화학물질의 독성 예측을 위한 Data-centric XAI 기반 분자 구조 Data Imputation과 QSAR 모델 개발)

  • ChanHyeok Jeong;SangYoun Kim;SungKu Heo;Shahzeb Tariq;MinHyeok Shin;ChangKyoo Yoo
    • Korean Chemical Engineering Research / v.61 no.4 / pp.523-541 / 2023
  • As accessibility to 3D printers increases, exposure to chemicals associated with 3D printing is becoming more frequent. However, research on the toxicity and harmfulness of chemicals generated by 3D printing is insufficient, and the performance of toxicity prediction using in silico techniques is limited by missing molecular structure data. In this study, a quantitative structure-activity relationship (QSAR) model based on a data-centric AI approach was developed to predict the toxicity of new 3D printing materials by imputing missing values in molecular descriptors. First, the MissForest algorithm was used to impute missing values in the molecular descriptors of hazardous 3D printing materials. Then, based on four different machine learning models (decision tree, random forest, XGBoost, SVM), a machine learning (ML)-based QSAR model was developed to predict the bioconcentration factor (Log BCF), octanol-air partition coefficient (Log Koa), and partition coefficient (Log P). Furthermore, the reliability of the data-centric QSAR model was validated with the Tree-SHAP (SHapley Additive exPlanations) method, one of the explainable artificial intelligence (XAI) techniques. The proposed MissForest-based imputation method secured approximately 2.5 times more molecular structure data than the existing data. Based on the imputed dataset of molecular descriptors, the developed data-centric QSAR model achieved prediction performance of approximately 73%, 76%, and 92% for Log BCF, Log Koa, and Log P, respectively. Lastly, Tree-SHAP analysis demonstrated that the data-centric QSAR model achieved high prediction performance for toxicity information by identifying key molecular descriptors highly correlated with the toxicity indices. Therefore, the proposed QSAR model based on the data-centric XAI approach can be extended to predict the toxicity of potential pollutants in emerging printing chemicals, chemical processes, and semiconductor or display processes.

Analysis of Success Cases of InsurTech and Digital Insurance Platform Based on Artificial Intelligence Technologies: Focused on Ping An Insurance Group Ltd. in China (인공지능 기술 기반 인슈어테크와 디지털보험플랫폼 성공사례 분석: 중국 평안보험그룹을 중심으로)

  • Lee, JaeWon;Oh, SangJin
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.71-90 / 2020
  • Recently, the global insurance industry has been rapidly advancing its digital transformation through artificial intelligence technologies such as machine learning, natural language processing, and deep learning. As a result, a growing number of foreign insurers have succeeded with AI-based InsurTech and platform businesses. Ping An Insurance Group Ltd., China's largest private company, is leading China's global fourth industrial revolution with remarkable achievements in InsurTech and digital platforms as a result of constant innovation, using 'finance and technology' and 'finance and ecosystem' as corporate keywords. In response, this study analyzed the InsurTech and platform business activities of Ping An Insurance Group Ltd. through the ser-M analysis model to provide strategic implications for revitalizing the AI-based businesses of domestic insurers. The ser-M analysis model is a framework, interpreted in terms of subject, environment, resource, and mechanism, that allows the vision and leadership of the CEO, the historical environment of the enterprise, the utilization of various resources, and the unique mechanism relationships to be interpreted in an integrated manner. The case analysis shows that Ping An Insurance Group Ltd. has achieved cost reduction and improved customer service by digitally innovating its entire business, including sales, underwriting, claims, and loan services, using core artificial intelligence technologies such as facial, voice, and facial expression recognition. In addition, "online data in China" and "the vast offline data and insights accumulated by the company" were combined with new technologies such as artificial intelligence and big data analysis to build a digital platform that integrates financial services and digital service businesses. Ping An Insurance Group Ltd. pursued constant innovation, and as of 2019 its sales had reached $155 billion, ranking seventh among all companies in the Global 2000 rankings selected by Forbes Magazine. Analyzing the background of this success from the perspective of ser-M, founder Ma Mingzhe quickly grasped the development of digital technology, market competition, and changes in population structure in the era of the fourth industrial revolution, established a new vision, and displayed agile, digital technology-focused leadership. Building on the founder's strong leadership in response to environmental changes, the company has successfully led its InsurTech and platform businesses by innovating internal resources (investing in artificial intelligence technology, securing excellent professionals, and strengthening big data capabilities), combining external absorption capabilities, and forming strategic alliances across various industries. This analysis of Ping An Insurance Group Ltd.'s success offers the following implications for domestic insurance companies preparing for digital transformation. First, CEOs of domestic companies also need to recognize the paradigm shift in industry brought about by changes in digital technology and quickly arm themselves with digital technology-oriented leadership to spearhead the digital transformation of their enterprises. Second, the Korean government should urgently overhaul related laws and systems to further promote the use of data across industries, and should provide drastic support such as deregulation, tax benefits, and platform provision to help the domestic insurance industry secure global competitiveness. Third, Korean companies also need to invest more boldly in the development of artificial intelligence technology so that the systematic securing of internal and external data, the training of technical personnel, and patent applications can be expanded, and digital platforms should be established quickly so that diverse customer experiences can be integrated through trained artificial intelligence technology. Finally, since generalization from a single case of an overseas insurance company may be limited, we hope that future work will examine management strategies related to artificial intelligence technology more extensively by analyzing cases from multiple industries or companies or by conducting empirical research.

Application of Support Vector Regression for Improving the Performance of the Emotion Prediction Model (감정예측모형의 성과개선을 위한 Support Vector Regression 응용)

  • Kim, Seongjin;Ryoo, Eunchung;Jung, Min Kyu;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.185-202 / 2012
  • Since the value of information has been recognized in the information society, the collection and use of information have become important. Like an artistic painting, a facial expression contains a wealth of information that could be described in thousands of words. Following this idea, there have recently been a number of attempts to provide customers and companies with intelligent services that perceive human emotions from facial expressions. For example, MIT Media Lab, the leading organization in this research area, has developed a human emotion prediction model and applied its studies to commercial business. In academia, conventional methods such as Multiple Regression Analysis (MRA) and Artificial Neural Networks (ANN) have been applied to predict human emotion in prior studies. However, MRA is generally criticized for its low prediction accuracy. This is inevitable, since MRA can only explain the linear relationship between the dependent variable and the independent variables. To mitigate the limitations of MRA, some studies such as Jung and Kim (2012) have used ANN as an alternative, and they reported that ANN generated more accurate predictions than statistical methods like MRA. However, ANN has also been criticized for overfitting and the difficulty of network design (e.g., setting the number of layers and the number of nodes in the hidden layers). Against this background, we propose a novel model using Support Vector Regression (SVR) to increase prediction accuracy. SVR is an extension of the Support Vector Machine (SVM) designed to solve regression problems. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ${\varepsilon}$) to the model prediction. Using SVR, we built a model that can measure the level of arousal and valence from facial features. To validate the usefulness of the proposed model, we collected data on facial reactions to appropriate visual stimuli and extracted features from the data. Preprocessing steps were then taken to choose statistically significant variables. In total, 297 cases were used for the experiment. As comparative models, we also applied MRA and ANN to the same data set. For SVR, we adopted the '${\varepsilon}$-insensitive loss function' and the 'grid search' technique to find the optimal values of parameters such as C, d, ${\sigma}^2$, and ${\varepsilon}$. For ANN, we adopted a standard three-layer backpropagation network with a single hidden layer. The learning rate and momentum rate of the ANN were set to 10%, and we used the sigmoid function as the transfer function of the hidden and output nodes. We repeated the experiments while varying the number of nodes in the hidden layer to n/2, n, 3n/2, and 2n, where n is the number of input variables. The stopping condition for the ANN was set to 50,000 learning events. We used MAE (Mean Absolute Error) as the measure for performance comparison. The experiment showed that SVR achieved the highest prediction accuracy on the hold-out data set compared with MRA and ANN. Regardless of the target variable (the level of arousal or the level of positive/negative valence), SVR showed the best performance on the hold-out data set. ANN also outperformed MRA; however, it showed considerably lower prediction accuracy than SVR for both target variables. The findings of our research are expected to be useful to researchers and practitioners who want to build models for recognizing human emotions.
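The ${\varepsilon}$-insensitive loss mentioned in this abstract, and the MAE used to compare models, are simple enough to sketch directly. The snippet below is illustrative only: the function names and sample values are hypothetical, and it shows the per-sample loss term rather than a full SVR solver.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample SVR loss: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def mean_absolute_error(y_true, y_pred):
    """MAE over two equal-length sequences of targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical arousal scores and predictions, for illustration only.
targets = [1.0, 2.0, 3.0]
predictions = [1.0, 3.0, 5.0]
mae = mean_absolute_error(targets, predictions)

# A prediction within the tube contributes no loss, so it does not
# influence the fitted model; this is why SVR depends on only a subset
# of the training data.
inside_tube = epsilon_insensitive_loss(1.0, 1.05, epsilon=0.1)
outside_tube = epsilon_insensitive_loss(1.0, 1.5, epsilon=0.1)
```

In a grid search, each candidate combination of parameters would be scored with a measure such as `mean_absolute_error` on held-out data, and the best-scoring combination kept.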

Multi-Dimensional Analysis Method of Product Reviews for Market Insight (마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)

  • Park, Jeong Hyun;Lee, Seo Ho;Lim, Gyu Jin;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.57-78 / 2020
  • With the development of the Internet, consumers can easily check product information through E-Commerce. Product reviews consulted during purchasing are based on user experience, allowing consumers to act as producers of information as well as users of it. From the consumer's perspective, reviews can make purchasing decisions more efficient; from the seller's perspective, they can help develop products and strengthen competitiveness. However, it takes a great deal of time and effort for consumers to read the vast number of reviews offered by E-Commerce for the products they want to compare and to grasp both the overall assessment and the assessment dimensions they consider important. This is because product reviews are unstructured information, and it is difficult to read their sentiment and assessment dimensions at a glance. For example, consumers who want to purchase a laptop would like to check how the products being compared are assessed on each dimension, such as performance, weight, delivery, speed, and design. In this paper, we therefore propose a method for automatically generating multi-dimensional product assessment scores from the reviews of the products to be compared. The method consists of two phases: a pre-preparation phase and an individual product scoring phase. In the pre-preparation phase, a dimension classification model and a sentiment analysis model are created from reviews of a large product category. By combining word embedding and association analysis, the dimension classification model addresses a limitation of existing studies, in which word embedding methods for finding the relevance between dimensions and words consider only the distance between words in sentences. The sentiment analysis model is a CNN model trained on data tagged as positive or negative at the phrase level for accurate polarity detection. The individual product scoring phase then applies these pre-prepared models to reviews split into phrases. Each phrase judged to describe a specific dimension is grouped under that dimension, and multi-dimensional assessment scores are obtained by aggregating the phrases by assessment dimension according to their proportions. In the experiment, approximately 260,000 reviews of a large product category were collected to build the dimension classification model and the sentiment analysis model. In addition, reviews of laptops from companies S and L sold through E-Commerce were collected and used as experimental data. The dimension classification model classified individual product reviews, broken down into phrases, into six assessment dimensions, combining the existing word embedding method with an association analysis that indicates the frequency between words and dimensions. Combining word embedding and association analysis increased the accuracy of the model by 13.7%. The sentiment analysis model analyzed assessments more closely when trained on phrases rather than sentences; its accuracy was 29.4% higher than that of the sentence-based model. Through this study, both sellers and consumers can expect more efficient decision making in purchasing and product development, because they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme and analysed together using word embedding and association analysis, improving the objectivity of the multi-dimensional analysis. This will be an attractive analysis model, not only because it enables more effective service deployment in an evolving and fiercely competitive E-Commerce market, but also because it can satisfy both sellers and consumers.
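The individual product scoring phase can be pictured as a simple aggregation over phrases that have already been tagged with a dimension and a polarity. The sketch below assumes, purely for illustration, that a dimension's score is the share of its phrases tagged positive; the abstract does not give the exact scoring formula, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def dimension_scores(tagged_phrases):
    """Aggregate (dimension, polarity) pairs into a positive share per dimension."""
    counts = defaultdict(lambda: [0, 0])  # dimension -> [positive, total]
    for dimension, polarity in tagged_phrases:
        counts[dimension][1] += 1
        if polarity == "positive":
            counts[dimension][0] += 1
    return {dim: pos / total for dim, (pos, total) in counts.items()}


# Hypothetical output of the dimension classifier and sentiment model
# for one laptop, one tuple per review phrase.
phrases = [
    ("performance", "positive"),
    ("performance", "negative"),
    ("design", "positive"),
    ("weight", "negative"),
]
scores = dimension_scores(phrases)
```

Running the same aggregation for each product being compared yields one score per dimension per product, which can then be shown side by side.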