• Title/Summary/Keyword: System verification

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or for broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into vector space by Word2Vec, and the product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data on the extracted products is summed to estimate the market size of the product groups. As experimental data, product-name text from Statistics Korea's microdata (345,103 cases) was mapped into a multidimensional vector space by Word2Vec training. 
We optimized the training parameters and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC indexes were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. Pearson's correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing the cosine similarity threshold. Furthermore, it has high potential for practical application since it can resolve unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order to the preprocessed dataset or by combining another measure such as Jaccard similarity with Word2Vec. 
Also, the method of product group clustering can be changed to other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect them to further improve the performance of the basic model conceptually proposed in this study.
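
The bottom-up estimation step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-dimensional vectors, product names, sales figures, and the `estimate_market_size` helper are all hypothetical stand-ins for the trained Word2Vec embeddings (dimension 300 in the paper) and the Statistics Korea microdata.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def estimate_market_size(index_vec, product_vecs, sales, threshold=0.7):
    """Group products whose embedding is close to an index word, then sum their sales.

    index_vec    -- embedding of a classification index word (hypothetical values here)
    product_vecs -- dict mapping product name to its embedding
    sales        -- dict mapping product name to sales amount
    threshold    -- cosine similarity cutoff; raising it narrows the product group
    """
    members = [name for name, vec in product_vecs.items()
               if cosine(index_vec, vec) >= threshold]
    return members, sum(sales[name] for name in members)

# Toy data, invented for illustration only.
product_vecs = {
    "lithium battery": [0.9, 0.1, 0.0],
    "li-ion cell": [0.8, 0.2, 0.1],
    "office chair": [0.0, 0.1, 0.9],
}
sales = {"lithium battery": 120.0, "li-ion cell": 80.0, "office chair": 50.0}
index_vec = [1.0, 0.0, 0.0]

members, size = estimate_market_size(index_vec, product_vecs, sales, threshold=0.7)
print(members, size)
```

Raising `threshold` (for example to 0.99) shrinks the group to the closest product names only, which mirrors the abstract's point that the granularity of the market category can be tuned through the cosine similarity threshold.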

Evaluation of Combine IGRT using ExacTrac and CBCT In SBRT (정위적체부방사선치료시 ExacTrac과 CBCT를 이용한 Combine IGRT의 유용성 평가)

  • Ahn, Min Woo;Kang, Hyo Seok;Choi, Byoung Joon;Park, Sang Jun;Jung, Da Ee;Lee, Geon Ho;Lee, Doo Sang;Jeon, Myeong Soo
    • The Journal of Korean Society for Radiation Therapy / v.30 no.1_2 / pp.201-208 / 2018
  • Purpose: This study compares and analyzes set-up errors when Combine IGRT, in which ExacTrac and CBCT are applied in sequence, is used in stereotactic body radiotherapy (SBRT). Methods and materials: Patients treated with SBRT at Ulsan University Hospital from May 2014 to November 2017 were classified by treatment site: three brain, nine spine, and three pelvis. Set-up errors were first corrected with ExacTrac in the lateral (Lat), longitudinal (Lng), and vertical (Vrt) directions and in roll, pitch, and yaw; after the ExacTrac shifts were applied, CBCT was additionally used to correct set-up errors in Lat, Lng, Vrt, and rotation (Rtn). Results: With ExacTrac, the errors in the brain region were Lat $0.18 \pm 0.25$ cm, Lng $0.23 \pm 0.04$ cm, Vrt $0.30 \pm 0.36$ cm, Roll $0.36 \pm 0.21^{\circ}$, Pitch $1.72 \pm 0.62^{\circ}$, Yaw $1.80 \pm 1.21^{\circ}$; in the spine, Lat $0.21 \pm 0.24$ cm, Lng $0.27 \pm 0.36$ cm, Vrt $0.26 \pm 0.42$ cm, Roll $1.01 \pm 1.17^{\circ}$, Pitch $0.66 \pm 0.45^{\circ}$, Yaw $0.71 \pm 0.58^{\circ}$; and in the pelvis, Lat $0.20 \pm 0.16$ cm, Lng $0.24 \pm 0.29$ cm, Vrt $0.28 \pm 0.29$ cm, Roll $0.83 \pm 0.21^{\circ}$, Pitch $0.57 \pm 0.45^{\circ}$, Yaw $0.52 \pm 0.27^{\circ}$. When CBCT was performed after the couch movement, the errors in the brain region were Lat $0.06 \pm 0.05$ cm, Lng $0.07 \pm 0.06$ cm, Vrt $0.00 \pm 0.00$ cm, Rtn $0.0 \pm 0.0^{\circ}$; in the spine, Lat $0.06 \pm 0.04$ cm, Lng $0.16 \pm 0.30$ cm, Vrt $0.08 \pm 0.08$ cm, Rtn $0.00 \pm 0.00^{\circ}$; and in the pelvis, Lat $0.06 \pm 0.07$ cm, Lng $0.04 \pm 0.05$ cm, Vrt $0.06 \pm 0.04$ cm, Rtn $0.0 \pm 0.0^{\circ}$. Conclusion: Combine IGRT, with CBCT used in addition to ExacTrac during SBRT, reduced patient set-up error compared with ExacTrac alone. However, applying Combine IGRT increases the patient set-up verification time and the absorbed dose from image acquisition. 
Therefore, using Combine IGRT selectively, depending on the patient's situation, to reduce set-up error can increase the effectiveness of radiation treatment.

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.139-156 / 2021
  • The biggest reason for using a deep learning model in image classification is that it can consider the relationships between regions by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lacks regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also been conducted, and they found that different emotions are induced by different colors. Among studies using deep learning, some have applied color information to image sentiment classification. Using the image's color information in addition to the image itself improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics of the colors in the picture. The two-color combinations most widely distributed across all training data were found first; at test time, the two-color combination most widely distributed in each test image was found, and the result values were corrected according to the color combination distribution. This method weights the result value obtained after the model classifies an image's emotion using an expression based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto, classified into eight categories, were used as the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to the CNN model. 
Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying the result values based on color when building a model that classifies an image's sentiment. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black, each of which carries its own meaning. Using Scikit-learn's clustering, the seven colors most widely distributed in the image are identified. Then, the RGB coordinate values of these colors are compared with the RGB coordinate values of the 16 colors listed above; that is, each is converted to the closest of the 16 colors. If combinations of three or more colors are selected, too many combinations occur and the distribution becomes scattered, so each combination has less influence on the result value. Therefore, to solve this problem, two-color combinations were found and used to weight the model's output. Before training, the most widely distributed color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary to be used during testing. During testing, the two-color combination most widely distributed in each test image is found. We then checked how that color combination was distributed in the training data and corrected the result accordingly. We devised several equations to weight the result value from the model based on the extracted colors as described above. The data set was randomly divided 80:20, and the model was verified using 20% of the data as a test set. The remaining 80% was split into five folds for 5-fold cross-validation, and the model was trained five times, each time with a different validation fold. Finally, performance was checked using the test dataset that had been set aside. 
Adam was used as the optimizer, and the learning rate was set to 0.01. Training ran for up to 20 epochs, and if the validation loss did not decrease for five consecutive epochs, training was stopped. Early stopping was set to load the model with the best validation loss. Classification accuracy was better when the information extracted from color properties was used together with the CNN architecture than when only the CNN architecture was used.
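
The color-combination step described in this abstract can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the authors' code: the reduced `PALETTE` and its RGB values, the helper names, and the toy cluster data are hypothetical, the dominant colors are assumed to come from a prior clustering step (the paper uses Scikit-learn), and the paper's log/exponential weighting equations are not reproduced because the abstract does not give them.

```python
from collections import Counter, defaultdict

# Reduced reference palette with assumed RGB values (the paper uses 16 named colors).
PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}

def nearest_color(rgb):
    """Map an RGB triple to the closest palette color by squared Euclidean distance."""
    return min(PALETTE,
               key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[name])))

def top_two_combo(dominant):
    """Return the two most prevalent named colors as a sorted tuple.

    dominant -- list of (rgb, pixel_share) pairs, e.g. cluster centers and their shares.
    """
    share = Counter()
    for rgb, weight in dominant:
        share[nearest_color(rgb)] += weight
    return tuple(sorted(name for name, _ in share.most_common(2)))

def combo_distribution(train):
    """Count, for each two-color combination, how often each class label occurs."""
    dist = defaultdict(Counter)
    for dominant, label in train:
        dist[top_two_combo(dominant)][label] += 1
    return dist

# Toy training set, invented for illustration only.
img_a = [((250, 10, 10), 0.5), ((10, 10, 240), 0.3), ((240, 240, 240), 0.2)]
img_b = [((5, 5, 5), 0.6), ((245, 250, 5), 0.4)]
dist = combo_distribution([(img_a, "joy"), (img_b, "fear")])
print(dict(dist))
```

At test time, the same `top_two_combo` lookup on a test image would select the stored class counts for that combination, which could then feed whatever weighting function is applied to the model's output.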

The Study on the Influence of Capstone Design & Field Training on Employment Rate: Focused on Leaders in INdustry-university Cooperation(LINC) (캡스톤디자인 및 현장실습이 취업률에 미치는 영향: 산학협력선도대학(LINC)을 중심으로)

  • Park Namgue
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.18 no.4 / pp.207-222 / 2023
  • In order to improve employment rates, most universities operate programs to strengthen students' employment and entrepreneurship capabilities, regardless of whether they have been selected for the Leaders in Industry-University Cooperation (LINC) program. Non-metropolitan universities in particular are staking everything on improving employment rates. To overcome the limitations of university establishment type and location, which strongly affect the employment rate, universities operate startup education & startup support programs to strengthen employment and entrepreneurship, and they continuously offer capstone design & field training as industry-academia-linked education programs. Although previous studies have verified effectiveness with a focus on LINC, no longitudinal study has been reported that covers, based on public disclosure indicators, all of the factors affecting the employment rate: university factors, startup education & startup support, and capstone design & field training as industry-academia-linked education programs. This study targets 116 universities that satisfy the conditions, based on the recently released university disclosure indicators from 2018 to 2020 on university factors, startup education & startup support, and capstone design & field training as industry-academia-linked education programs as factors affecting the employment rate. We analyzed the differences between the 51 universities participating in LINC and the 64 non-participating universities. 
In addition, because the public indicators provide no history of individual students' overlapping participation, we drew on the exposure effect theory, which holds that long-term exposure to employment and entrepreneurship competency enhancement programs affects the employment rate through competency enhancement. On this basis, the effectiveness of the 2nd LINC+ (socially customized Leaders in Industry-University Cooperation) from 2017 to 2021 was verified through a longitudinal causal relationship analysis. The study found that the startup education & startup support and the capstone design & field training offered as industry-academia-linked education programs under the 2nd LINC+ did not affect the employment rate. The longitudinal causal relationship analysis reconfirmed that universities in metropolitan areas still have higher employment rates than universities in non-metropolitan areas due to existing university factors, and that private universities have higher employment rates than national universities. Among the employment and entrepreneurship competency strengthening programs, the number of students who complete entrepreneurship courses, the number who complete capstone design, the amount of capstone design payments, and the number of dedicated faculty members partially affect the employment rate in individual years, while field training has no effect in any year. It was confirmed that long-term exposure to the entrepreneurship capacity building program did not affect the employment rate. Therefore, it was reconfirmed that in order to improve the employment rate of universities, the limitations of non-metropolitan areas and of national and public universities must be overcome. 
To overcome this, among programs that strengthen employment and entrepreneurship capabilities, it is important to strengthen entrepreneurship through participation in entrepreneurship lectures, to actively and confidently introduce capstone design programs that reinforce the concept of PBL (Problem Based Learning), and to run field training programs in a way that improves the employment rate. For field training to actually affect the employment rate, substantive programs must be run through a reorganization of the overall academic system and organization.
