• Title/Summary/Keyword: ICT (information, technology, use of information and communication)


A CF-based Health Functional Recommender System using Extended User Similarity Measure (확장된 사용자 유사도를 이용한 CF-기반 건강기능식품 추천 시스템)

  • Sein Hong;Euiju Jeong;Jaekyeong Kim
    • Journal of Intelligence and Information Systems, v.29 no.3, pp.1-17, 2023
  • With the recent rapid development of ICT (Information and Communication Technology) and the popularization of digital devices, the online market continues to grow. As a result, customers face information overload and must spend considerable time and money selecting products. Personalized recommender systems have therefore become an essential means of addressing this problem. Collaborative Filtering (CF) is the most widely used recommendation technique. Traditional recommender systems mainly rely on quantitative data such as rating values, which cannot fully reflect a user's preferences and results in poor recommendation accuracy. To address this, studies that incorporate qualitative data, such as review content, are being actively conducted. In this study, text mining was used to quantify user review content. General CF consists of three steps: user-item matrix generation, Top-N neighborhood group search, and Top-K recommendation list generation. We propose a recommendation algorithm that applies an extended similarity measure, which utilizes quantified review content in addition to user rating values. After review similarity is calculated by applying TF-IDF, Word2Vec, and Doc2Vec to the review content, the extended similarity is created by combining the user rating similarity with the review similarity. To verify the approach, we used user ratings and review data from the "Health and Personal Care" category of the e-commerce site Amazon. The proposed recommendation model using the extended similarity measure outperformed the traditional recommendation model that uses only a rating-based similarity measure. In addition, among the text mining techniques, the similarity obtained with TF-IDF showed the best performance when used in the neighbor group search and recommendation list generation steps.
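
The idea of an extended similarity can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two similarities are combined, so the weighted sum and its `alpha` parameter are hypothetical, as are the function names and the choice of cosine similarity over co-rated items. Only the TF-IDF variant is shown, with a simple smoothed IDF.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, using whitespace tokens."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)  # smoothed IDF
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rating_similarity(ratings_u, ratings_v):
    """Cosine similarity over the items both users rated (an assumed choice)."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    return cosine({i: ratings_u[i] for i in common},
                  {i: ratings_v[i] for i in common})


def extended_similarity(rating_sim, review_sim, alpha=0.5):
    """Hypothetical combination: weighted sum of rating and review similarity."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * rating_sim + (1.0 - alpha) * review_sim
```

In a neighborhood search, each user's reviews would be concatenated into one document, and the Top-N neighbors of a target user would be those with the highest `extended_similarity`.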

Multi-Variate Tabular Data Processing and Visualization Scheme for Machine Learning based Analysis: A Case Study using Titanic Dataset (기계 학습 기반 분석을 위한 다변량 정형 데이터 처리 및 시각화 방법: Titanic 데이터셋 적용 사례 연구)

  • Juhyoung Sung;Kiwon Kwon;Kyoungwon Park;Byoungchul Song
    • Journal of Internet Computing and Services, v.25 no.4, pp.121-130, 2024
  • As information and communication technology (ICT) improves exponentially, the types and amount of available data also increase. Although data analysis, including statistics, is important for utilizing this large amount of data, conventional approaches are inevitably limited in processing varied and complex data. Meanwhile, with improvements in computational performance and growing demand for autonomous systems, there have been many attempts to apply machine learning (ML) in various fields. In particular, processing the data for model input and designing the model to solve the objective function are critical to model performance. Data processing methods suited to each data type and property have been presented in many studies, and ML performance varies greatly depending on the method. Nevertheless, it is difficult to decide which data processing method to use, since the types and characteristics of data have become more diverse. Specifically, multi-variate data processing is essential for solving non-linear problems with ML. In this paper, we present a multi-variate tabular data processing scheme for ML-aided data analysis using the Titanic dataset from Kaggle, which includes various kinds of data. We present methods such as input variable filtering based on statistical analysis and normalization according to the data property. In addition, we analyze the data structure using visualization. Lastly, we design an ML model, train it by applying the proposed multi-variate data processing, and analyze the passenger survival prediction performance of the trained model. We expect that the proposed multi-variate data processing and visualization can be extended to various environments for ML-based analysis.
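
The kind of preprocessing described above (normalization according to the data property and statistical filtering of input variables) might look like the sketch below. It is a minimal illustration, not the paper's pipeline: min-max scaling, one-hot encoding, and a Pearson-correlation threshold are assumed choices, and the function names and the `threshold` value are hypothetical.

```python
import math


def min_max_scale(values):
    """Scale a numeric column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 column per category."""
    categories = sorted(set(values))
    return {c: [1.0 if v == c else 0.0 for v in values] for c in categories}


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def select_features(features, target, threshold=0.2):
    """Keep columns whose absolute correlation with the target reaches the threshold."""
    return {
        name: column
        for name, column in features.items()
        if abs(pearson(column, target)) >= threshold
    }
```

For a Titanic-style table, a numeric column such as fare would go through `min_max_scale`, a categorical column such as sex through `one_hot`, and the resulting columns through `select_features` against the survival label.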

An Expert System for the Estimation of the Growth Curve Parameters of New Markets (신규시장 성장모형의 모수 추정을 위한 전문가 시스템)

  • Lee, Dongwon;Jung, Yeojin;Jung, Jaekwon;Park, Dohyung
    • Journal of Intelligence and Information Systems, v.21 no.4, pp.17-35, 2015
  • Demand forecasting is the activity of estimating the quantity of a product or service that consumers will purchase over a certain period of time. Developing precise forecasting models is considered important since corporations can make strategic decisions on new markets based on the future demand estimated by the models. Many studies have developed market growth curve models, such as the Bass, Logistic, and Gompertz models, which estimate future demand when a market is in its early stage. Among these, the Bass model, which explains demand from two types of adopters, innovators and imitators, has been widely used in forecasting. Such models require sufficient demand observations to ensure reliable results. At the beginning of a new market, however, observations are not sufficient for the models to precisely estimate the market's future demand. For this reason, demand inferred from that of the most adjacent markets is often used as a reference in such cases. Reference markets can be those whose products are developed with technologies of the same category. A market's demand may be expected to follow a pattern similar to that of a reference market when the adoption pattern of a product in the market is determined mainly by the technology related to the product. However, such processes may not always produce satisfactory results because the similarity between markets is judged by intuition and/or experience. There are two major drawbacks that human experts cannot effectively handle in this approach. One is the abundance of candidate reference markets to consider, and the other is the difficulty of calculating the similarity between markets. First, there can be too many markets to consider in selecting reference markets. Mostly, markets in the same category of an industrial hierarchy can be reference markets because they are usually based on similar technologies. 
However, markets can be classified into different categories even if they are based on the same generic technologies. Therefore, markets in other categories also need to be considered as potential candidates. Next, even domain experts cannot consistently calculate the similarity between markets with their own qualitative standards. This inconsistency means that adjacent reference markets may be missed, which can lead to imprecise estimation of future demand. Even when no reference markets are missing, the new market's parameters can hardly be estimated from the reference markets without quantitative standards. For this reason, this study proposes a case-based expert system that helps experts overcome these drawbacks in discovering reference markets. First, this study proposes using the Euclidean distance measure to calculate the similarity between markets. Based on their similarities, markets are grouped into clusters. Then, missing markets with the characteristics of the cluster are searched for. Potential candidate reference markets are extracted and recommended to users. After iterating these steps, the final reference markets are determined according to the user's selection among those candidates. Finally, the new market's parameters are estimated from the reference markets. Two techniques are used in this procedure: one is the clustering technique of data mining, and the other is the content-based filtering of recommender systems. The system implemented with these techniques can determine the most adjacent markets based on whether a user accepts candidate markets. Experiments involving five ICT experts were conducted to validate the usefulness of the system. In the experiments, the experts were given a list of 16 ICT markets whose parameters were to be estimated. For each market, the experts estimated the parameters of its growth curve models first by intuition and then with the system. 
A comparison of the experimental results shows that the estimated parameters are closer when the experts use the system than when they guess them without it.
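
Two ingredients mentioned above, the Bass growth curve and Euclidean-distance selection of reference markets, can be sketched as follows. This is a simplified illustration rather than the proposed expert system: it omits the clustering and the interactive accept/reject loop, and averaging the reference markets' parameters is an assumption, since the abstract does not state how the parameters are derived. The market dictionary layout and all function names are hypothetical.

```python
import math


def bass_cumulative(t, p, q):
    """Cumulative adoption fraction F(t) of the Bass model.

    p is the coefficient of innovation, q the coefficient of imitation.
    """
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_markets(target_features, markets, k=2):
    """Return the names of the k markets closest to the target in feature space."""
    ranked = sorted(
        markets,
        key=lambda name: euclidean(target_features, markets[name]["features"]),
    )
    return ranked[:k]


def estimate_parameters(reference_names, markets):
    """Assumed rule: average the Bass parameters of the chosen reference markets."""
    n = len(reference_names)
    p = sum(markets[name]["p"] for name in reference_names) / n
    q = sum(markets[name]["q"] for name in reference_names) / n
    return p, q
```

Here `markets` maps a market name to a dict with a numeric `features` vector and known `p` and `q` values. The estimated pair can then be passed to `bass_cumulative` and scaled by an assumed market potential to obtain a demand forecast.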

Theoretical Research for Unmanned Aircraft Electromagnetic Survey: Electromagnetic Field Calculation and Analysis by Arbitrary Shaped Transmitter-Loop (무인 항공 전자탐사 이론 연구: 임의 모양의 송신루프에 의한 전자기장 반응 계산 및 분석)

  • Bang, Minkyu;Oh, Seokmin;Seol, Soon Jee;Lee, Ki Ha;Cho, Seong-Jun
    • Geophysics and Geophysical Exploration, v.21 no.3, pp.150-161, 2018
  • Recently, unmanned aircraft EM (electromagnetic) surveys based on ICT (Information and Communication Technology) have been widely utilized because of their efficiency in regional surveys. We performed a theoretical study of the unmanned airship EM system developed by KIGAM (Korea Institute of Geoscience and Mineral Resources) as part of the practical application of unmanned aircraft EM surveys. Since the configuration of the transmitting and receiving loops in this system differs from that of conventional aircraft EM systems, a new technique is required for appropriate interpretation of the measured responses. Therefore, we proposed a method to calculate the EM field of an arbitrarily shaped transmitter and verified its validity through comparison with the analytic solution for a circular loop. In addition, to simulate the magnetic responses of three-dimensionally (3D) distributed anomalies, we adapted our algorithm to a 3D frequency-domain EM modeling algorithm based on the edge-FEM (finite element method). Through the analysis of magnetic field responses from a subsurface anomaly, it was found that the response decreases as the depth of the anomaly or the flight altitude increases. It was also confirmed that the response becomes smaller as the resistivity of the anomaly increases. However, the out-of-phase component shows a nonlinear trend depending on the depth of the anomaly and the frequency used, which makes it difficult to apply simple analysis based on mapping the magnitude of the responses and can cause a non-uniqueness problem in calculating the apparent resistivity. Thus, it is a prerequisite to analyze the appropriate frequency band and flight altitude, considering the purpose of the survey and the site conditions, when conducting a survey with the unmanned aircraft EM system.
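
The validation idea (compute the field of an arbitrarily shaped loop, then check it against the analytic circular-loop solution) can be illustrated with a much simpler stand-in. The sketch below assumes a static current in free space and sums Biot-Savart contributions of straight segments. It is not the paper's frequency-domain formulation and ignores the conductive subsurface entirely. The function names and the midpoint approximation are choices made for this example.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in H/m


def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def loop_field(vertices, current, point):
    """Static magnetic flux density (T) of a closed polygonal loop at `point`.

    Each straight segment is treated as a current element located at its
    midpoint (midpoint-rule Biot-Savart), so accuracy improves with more vertices.
    """
    bx = by = bz = 0.0
    n = len(vertices)
    for i in range(n):
        start, end = vertices[i], vertices[(i + 1) % n]
        dl = tuple(e - s for s, e in zip(start, end))
        mid = tuple((s + e) / 2.0 for s, e in zip(start, end))
        r = tuple(pt - m for pt, m in zip(point, mid))
        dist = math.sqrt(sum(c * c for c in r))
        scale = MU0 * current / (4.0 * math.pi * dist ** 3)
        cx, cy, cz = cross(dl, r)
        bx, by, bz = bx + scale * cx, by + scale * cy, bz + scale * cz
    return (bx, by, bz)


def circle_vertices(radius, n):
    """Counterclockwise vertices of a regular n-gon approximating a circle at z = 0."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n),
         0.0)
        for k in range(n)
    ]


def circular_loop_axis_field(radius, current, z):
    """Analytic on-axis field of a circular loop: mu0 * I * R^2 / (2 (R^2 + z^2)^(3/2))."""
    return MU0 * current * radius ** 2 / (2.0 * (radius ** 2 + z ** 2) ** 1.5)
```

Passing `circle_vertices(radius, 720)` to `loop_field` and evaluating on the loop axis should agree closely with `circular_loop_axis_field`. Replacing the vertex list with any other closed polygon gives the field of an arbitrary loop shape under the same static assumption.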

A Study on the Distribution of Startups and Influencing Factors by Generation in Seoul: Focusing on the Comparison of Young and Middle-aged (서울시 세대별 창업 분포와 영향 요인에 대한 연구: 청년층과 중년층의 비교를 중심으로)

  • Hong, Sungpyo;Lim, Hanryeo
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship, v.16 no.3, pp.13-29, 2021
  • The purpose of this study was to analyze the spatial distribution and location factors of startups by generation (young and middle-aged) in Seoul. To this end, a research model was established that included industry, population, and startup-institution factors by generation in 424 administrative districts, using the Seoul Business Enterprise Survey (2018), which includes data on the age group of entrepreneurs. For the analysis, descriptive statistics were used to confirm the frequency, mean, and standard deviation of startups by generation and of the major variables in the administrative districts of Seoul, and the spatial distribution and characteristics of startups by generation were analyzed through global and local spatial autocorrelation analysis. In particular, the spatial distribution of startups in Seoul was examined in depth by categorizing and analyzing startups by major industry. Afterwards, an appropriate spatial regression model was selected through the Lagrange test, and on this basis the location factors affecting startups by generation were analyzed. The main results are as follows. First, there was a significant difference in the spatial distribution of young and middle-aged startups. Young people founded startups in the belt-shaped area that connects Seocho·Gangnam-Yongsan-Mapo-Gangseo, while middle-aged people were relatively active in the southeastern region represented by Seocho, Gangnam, Songpa, and Gangdong. Second, startups by generation in Seoul showed various spatial distributions according to the type of business. For both generations, the knowledge-based high-tech industries (ICT, professional services) were centered on Seocho, Gangnam, Mapo, Guro, and Geumcheon, and the manufacturing industry was concentrated in existing clusters. 
On the other hand, in the life service industry, young people were active in startups near universities and cultural centers, while middle-aged people were concentrated in new towns. Third, the factors that influenced startup location differed by generation in Seoul. For young people, high-tech industries, universities, cultural capital, and densely populated areas were significant factors, while for middle-aged people, professional service areas, a low average age, and the concentration of startup support institutions had a significant influence. These location factors also had different influences in each industry. The implications of the study are as follows. First, startups need to be supported systematically, considering the characteristics of each region, industry, and generation in Seoul. As there are significant differences in startup regions and industries by generation, it is necessary to strengthen a customized startup support system that takes these regional and industrial characteristics into account. Second, in terms of research methods, a follow-up study is needed that, through data accumulation, comprehensively considers culture and finance at the level of large districts (Gu).
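
Global spatial autocorrelation of the kind used in this study is commonly measured with Moran's I. The sketch below computes it for a list of district values and a spatial weight matrix. It is a minimal illustration, not the study's analysis: the abstract does not name the statistic or the weighting scheme, so Moran's I with a binary adjacency matrix is an assumption, and the function name is hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under the spatial weight matrix `weights`.

    weights[i][j] is the weight between districts i and j, for example 1.0
    for neighbors and 0.0 otherwise, with a zero diagonal.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    variance_term = sum(d * d for d in deviations)
    if weight_sum == 0 or variance_term == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    cross_term = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / weight_sum) * (cross_term / variance_term)
```

Values well above zero indicate that districts with similar startup counts cluster together, while values well below zero indicate that neighboring districts tend to differ. Local variants apply the same idea district by district.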