• Title/Summary/Keyword: and Pre-Processing

Search Result 1,931, Processing Time 0.027 seconds

A Study on the Optimization of Green Kiwi and Gold Kiwi Puree Mixing Ratio for the Best French Kiwi Dressing (그린키위 및 골드키위를 이용한 프렌치 드레싱 제조의 혼합비율 최적화의 연구)

  • Cho, In-Sook;Jin, Hyun-Hee;Lee, Seung-Joo
    • Culinary science and hospitality research
    • /
    • v.21 no.4
    • /
    • pp.16-28
    • /
    • 2015
  • The purpose of this study, as a part of developing a new french dressing, was to present the best conditions to make improved kiwi dressing, suitable for the tastes of modern people, the processing and cooking methods of different ratios of green kiwi and gold kiwi have been sought to develop a new type of dressing, then its antioxidant have been defined, and used for producing kiwi dressing. Each 150g of different Kiwi purees, made based on the most preferable combinations from the pre-test were used for kiwi dressing, and thereafter its quality characteristics, and physical properties were investigated, as well as a sensory test was conducted. The highest viscosity of kiwi dressing was test sample GD2, and in general that of combining both types of kiwis were higher than that of either single kiwi. The sugar content was decreased by changing the Gold kiwi portion(p<0.05). The chromaticity in general increased with increases in the Gold kiwi portion, and a-value(brightness) and b-value(redness) of sample GD1 were the highest by -2.75 and 17.50(p<0.05). From the acceptability test, the highest overall acceptability was the dressing sample combining Gold kiwi and Green kiwi at a ratio of 1:1. Based on the study results, it is expected that the dressing, made of kiwi puree, mixing Green kiwi and Gold kiwi by 1:1 ratio, and adding 130g of edible oil, 50g of onion, 40g of sugar, and 5g of salt, would improve the quality and overall acceptability of the dressing.

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.157-173
    • /
    • 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to easily find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are an increasingly important technologies that are essential to the business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, limitations do exist. Sparsity occurs when user-item preference information is insufficient, and is the main limitation of collaborative filtering. The evaluation value of the user item matrix may be distorted by the data depending on the popularity of the product, or there may be new users who have not yet evaluated the value. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address these problems. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world score data are mostly biased to high scores, resulting in severe imbalances. One cause of this imbalance distribution is the purchasing bias, in which only users with high product ratings purchase products, so those with low ratings are less likely to purchase products and thus do not leave negative product reviews. Due to these characteristics, unlike most users' actual preferences, reviews by users who purchase products are more likely to be positive. Therefore, the actual rating data is over-learned in many classes with high incidence due to its biased characteristics, distorting the market. Applying collaborative filtering to these imbalanced data leads to poor recommendation performance due to excessive learning of biased classes. Traditional oversampling techniques to address this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning, reducing recommendation performance. In addition, pre-processing methods for most existing data imbalance problems are designed and used for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class problems, such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted to convert and apply multi-class problems to binary class problems. However, simplification of multi-class problems can cause potential classification errors when combined with the results of classifiers learned from other sub-problems, resulting in loss of important information about relationships beyond the selected items. Therefore, it is necessary to develop more effective methods to address multi-class imbalance problems. We propose a collaborative filtering model using CGAN to generate realistic virtual data to populate the empty user-item matrix. Conditional vector y identify distributions for minority classes and generate data reflecting their characteristics. Collaborative filtering then maximizes the performance of the recommendation system via hyperparameter tuning. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering implementations while mitigating data imbalances arising from real data. Our model has superior recommendation performance over existing oversampling techniques and existing real-world data with data sparsity. SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparative models and we demonstrate the highest prediction accuracy on the RMSE and MAE evaluation scales. Through this study, oversampling based on deep learning will be able to further refine the performance of recommendation systems using actual data and be used to build business recommendation systems.

Airborne Hyperspectral Imagery availability to estimate inland water quality parameter (수질 매개변수 추정에 있어서 항공 초분광영상의 가용성 고찰)

  • Kim, Tae-Woo;Shin, Han-Sup;Suh, Yong-Cheol
    • Korean Journal of Remote Sensing
    • /
    • v.30 no.1
    • /
    • pp.61-73
    • /
    • 2014
  • This study reviewed an application of water quality estimation using an Airborne Hyperspectral Imagery (A-HSI) and tested a part of Han River water quality (especially suspended solid) estimation with available in-situ data. The estimation of water quality was processed two methods. One is using observation data as downwelling radiance to water surface and as scattering and reflectance into water body. Other is linear regression analysis with water quality in-situ measurement and upwelling data as at-sensor radiance (or reflectance). Both methods drive meaningful results of RS estimation. However it has more effects on the auxiliary dataset as water quality in-situ measurement and water body scattering measurement. The test processed a part of Han River located Paldang-dam downstream. We applied linear regression analysis with AISA eagle hyperspectral sensor data and water quality measurement in-situ data. The result of linear regression for a meaningful band combination shows $-24.847+0.013L_{560}$ as 560 nm in radiance (L) with 0.985 R-square. To comparison with Multispectral Imagery (MSI) case, we make simulated Landsat TM by spectral resampling. The regression using MSI shows -55.932 + 33.881 (TM1/TM3) as radiance with 0.968 R-square. Suspended Solid (SS) concentration was about 3.75 mg/l at in-situ data and estimated SS concentration by A-HIS was about 3.65 mg/l, and about 5.85mg/l with MSI with same location. It shows overestimation trends case of estimating using MSI. In order to upgrade value for practical use and to estimate more precisely, it needs that minimizing sun glint effect into whole image, constructing elaborate flight plan considering solar altitude angle, and making good pre-processing and calibration system. We found some limitations and restrictions such as precise atmospheric correction, sample count of water quality measurement, retrieve spectral bands into A-HSI, adequate linear regression model selection, and quantitative calibration/validation method through the literature review and test adopted general methods.

Effect of Therapeutic and Educational strategies using music on improvement of auditory information processing and short-term memory skills for children with underachievement (학습부진아의 청각정보처리와 단기기억력 향상을 위한 음악의 치료적·교육적 접근)

  • Chong, Hyun Ju
    • Journal of Music and Human Behavior
    • /
    • v.1 no.1
    • /
    • pp.1-10
    • /
    • 2004
  • Being engaged in the musical tasks needs cognitive skills to perceive musical sound, organize them into meaningful unit, store them in the memory and retrieve them when needed. These skills are also required for academic tasks indicating that there is positive correlation between skills for musical and academic tasks. Based on these findings, the study purported to examine whether the developed sessions can enhance cognitive skills which is composed of auditory information skills, which is composed of perceiving sounds, organizing them into groups based on the existing information or organization pattern, and short-term memory skills. Eighteen elementary students in 4, 5, and 6th grades have participated in the study. The study has administered Music Cognitive Skills Test(MCST) before and after implementing music therapy sessions. The MCST consisted of five parts, first one measuring the rhythm imitating skills, second, measuring the melodic imitation skills, third, measuring discriminative skills in identifying higher pitch, fourth, measuring discriminative skills in identifying identical chords, and lastly, measuring the tone retention skills. The results indicated that there was statistical difference between the pre and post test in rhythm and melody imitation skills. Because reproduction of perceived rhythm patterns requires memory skills, imitating patterns are considered cognitive skills. Also melody is defined adding spatial dimension to the rhythm which is temporal concept. Being able to understand melodic pattern and to reproduce the pattern also requires cognitive skills. The subjects have shown significant improvement in these two areas. In other areas, there were definite increase of scores, however, no significant differences. The study also explores interpretation of these results and also observed consistencies among the participants in completing the musical tasks.

  • PDF

A Study on Analysis of consumer perception of YouTube advertising using text mining (텍스트 마이닝을 활용한 Youtube 광고에 대한 소비자 인식 분석)

  • Eum, Seong-Won
    • Management & Information Systems Review
    • /
    • v.39 no.2
    • /
    • pp.181-193
    • /
    • 2020
  • This study is a study that analyzes consumer perception by utilizing text mining, which is a recent issue. we analyzed the consumer's perception of Samsung Galaxy by analyzing consumer reviews of Samsung Galaxy YouTube ads. for analysis, 1,819 consumer reviews of YouTube ads were extracted. through this data pre-processing, keywords for advertisements were classified and extracted into nouns, adjectives, and adverbs. after that, frequency analysis and emotional analysis were performed. Finally, clustering was performed through CONCOR. the summary of this study is as follows. the first most frequently mentioned words were Galaxy Note (n = 217), Good (n = 135), Pen (n = 40), and Function (n = 29). it can be judged through the advertisement that consumers "Galaxy Note", "Good", "Pen", and "Features" have good functional aspects for Samsung mobile phone products and positively recognize the Note Pen. in addition, the recognition of "Samsung Pay", "Innovation", "Design", and "iPhone" shows that Samsung's mobile phone is highly regarded for its innovative design and functional aspects of Samsung Pay. second, it is the result of sentiment analysis on YouTube advertising. As a result of emotional analysis, the ratio of emotional intensity was positive (75.95%) and higher than negative (24.05%). this means that consumers are positively aware of Samsung Galaxy mobile phones. As a result of the emotional keyword analysis, positive keywords were "good", "good", "innovative", "highest", "fast", "pretty", etc., negative keywords were "frightening", "I want to cry", "discomfort", "sorry", "no", etc. were extracted. the implication of this study is that most of the studies by quantitative analysis methods were considered when looking at the consumer perception study of existing advertisements. In this study, we deviated from quantitative research methods for advertising and attempted to analyze consumer perception through qualitative research. this is expected to have a great influence on future research, and I am sure that it will be a starting point for consumer awareness research through qualitative research.

An Outlier Detection Using Autoencoder for Ocean Observation Data (해양 이상 자료 탐지를 위한 오토인코더 활용 기법 최적화 연구)

  • Kim, Hyeon-Jae;Kim, Dong-Hoon;Lim, Chaewook;Shin, Yongtak;Lee, Sang-Chul;Choi, Youngjin;Woo, Seung-Buhm
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.33 no.6
    • /
    • pp.265-274
    • /
    • 2021
  • Outlier detection research in ocean data has traditionally been performed using statistical and distance-based machine learning algorithms. Recently, AI-based methods have received a lot of attention and so-called supervised learning methods that require classification information for data are mainly used. This supervised learning method requires a lot of time and costs because classification information (label) must be manually designated for all data required for learning. In this study, an autoencoder based on unsupervised learning was applied as an outlier detection to overcome this problem. For the experiment, two experiments were designed: one is univariate learning, in which only SST data was used among the observation data of Deokjeok Island and the other is multivariate learning, in which SST, air temperature, wind direction, wind speed, air pressure, and humidity were used. Period of data is 25 years from 1996 to 2020, and a pre-processing considering the characteristics of ocean data was applied to the data. An outlier detection of actual SST data was tried with a learned univariate and multivariate autoencoder. We tried to detect outliers in real SST data using trained univariate and multivariate autoencoders. To compare model performance, various outlier detection methods were applied to synthetic data with artificially inserted errors. As a result of quantitatively evaluating the performance of these methods, the multivariate/univariate accuracy was about 96%/91%, respectively, indicating that the multivariate autoencoder had better outlier detection performance. Outlier detection using an unsupervised learning-based autoencoder is expected to be used in various ways in that it can reduce subjective classification errors and cost and time required for data labeling.

Comparative Analysis of the Keywords in Taekwondo News Articles by Year: Applying Topic Modeling Method (태권도 뉴스기사의 연도별 주제어 비교분석: 토픽모델링 적용)

  • Jeon, Minsoo;Lim, Hyosung
    • Journal of Digital Convergence
    • /
    • v.19 no.11
    • /
    • pp.575-583
    • /
    • 2021
  • This study aims to analyze Taekwondo trends according to news articles by year by applying topic modeling. In order to examine the Taekwondo trend through media reports, articles including news articles and Taekwondo specialized media articles were collected through Big Kinds of the Korea Press Foundation. The search period was divided into three sections: before 2000, 2001~2010, and 2011~2020. A total of 12,124 items were selected as research data. For topic analysis, pre-processing was performed, and topic analysis was performed using the LDA algorithm. In this case, python 3 was applied for all analysis. First, as a result of analyzing the topics of media articles by year, 'World' was the most common keyword before 2000. 'South and North Korea' was next common and 'Olympic' was the third commonest topic. From 2001 to 2010, 'World' was the most common topic, followed by 'Association' and 'World Taekwondo'. From 2011 to 2020, 'World', 'Demonstration', and 'Kukkiwon' was the most common topic in that order. Second, as a result of analyzing news articles before 2000 by topic modeling, topics were divided into two categories. Specifically, Topic 1 was selected as 'South-North Korea sports exchange' and Topic 2 was selected as 'Adoption of Olympic demonstration events'. Third, as a result of analyzing news articles from 2001 to 2010 by topic modeling, three topics were selected. Topic 1 was selected as 'Taekwondo Demonstration Performance and Corruption', Topic 2 was selected as 'Muju Taekwondo Park Creation', and Topic 3 was selected as 'World Taekwondo Festival'. Fourth, as a result of analyzing news articles from 2011 to 2020 by topic modeling, three topics were selected. Topic 1 was selected as 'Successful Hosting of the 2018 Pyeongchang Winter Olympics', Topic 2 was selected as 'North-South Korea Taekwondo Joint Demonstration Performance', and Topic 3 was selected as '2017 Muju World Taekwondo Championships'.

Preliminary Inspection Prediction Model to select the on-Site Inspected Foreign Food Facility using Multiple Correspondence Analysis (차원축소를 활용한 해외제조업체 대상 사전점검 예측 모형에 관한 연구)

  • Hae Jin Park;Jae Suk Choi;Sang Goo Cho
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.121-142
    • /
    • 2023
  • As the number and weight of imported food are steadily increasing, safety management of imported food to prevent food safety accidents is becoming more important. The Ministry of Food and Drug Safety conducts on-site inspections of foreign food facilities before customs clearance as well as import inspection at the customs clearance stage. However, a data-based safety management plan for imported food is needed due to time, cost, and limited resources. In this study, we tried to increase the efficiency of the on-site inspection by preparing a machine learning prediction model that pre-selects the companies that are expected to fail before the on-site inspection. Basic information of 303,272 foreign food facilities and processing businesses collected in the Integrated Food Safety Information Network and 1,689 cases of on-site inspection information data collected from 2019 to April 2022 were collected. After preprocessing the data of foreign food facilities, only the data subject to on-site inspection were extracted using the foreign food facility_code. As a result, it consisted of a total of 1,689 data and 103 variables. For 103 variables, variables that were '0' were removed based on the Theil-U index, and after reducing by applying Multiple Correspondence Analysis, 49 characteristic variables were finally derived. We build eight different models and perform hyperparameter tuning through 5-fold cross validation. Then, the performance of the generated models are evaluated. The research purpose of selecting companies subject to on-site inspection is to maximize the recall, which is the probability of judging nonconforming companies as nonconforming. As a result of applying various algorithms of machine learning, the Random Forest model with the highest Recall_macro, AUROC, Average PR, F1-score, and Balanced Accuracy was evaluated as the best model. Finally, we apply Kernal SHAP (SHapley Additive exPlanations) to present the selection reason for nonconforming facilities of individual instances, and discuss applicability to the on-site inspection facility selection system. Based on the results of this study, it is expected that it will contribute to the efficient operation of limited resources such as manpower and budget by establishing an imported food management system through a data-based scientific risk management model.

Digital Hologram Compression Technique By Hybrid Video Coding (하이브리드 비디오 코팅에 의한 디지털 홀로그램 압축기술)

  • Seo, Young-Ho;Choi, Hyun-Jun;Kang, Hoon-Jong;Lee, Seung-Hyun;Kim, Dong-Wook
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.5 s.305
    • /
    • pp.29-40
    • /
    • 2005
  • According as base of digital hologram has been magnified, discussion of compression technology is expected as a international standard which defines the compression technique of 3D image and video has been progressed in form of 3DAV which is a part of MPEG. As we can identify in case of 3DAV, the coding technique has high possibility to be formed into the hybrid type which is a merged, refined, or mixid with the various previous technique. Therefore, we wish to present the relationship between various image/video coding techniques and digital hologram In this paper, we propose an efficient coding method of digital hologram using standard compression tools for video and image. At first, we convert fringe patterns into video data using a principle of CGH(Computer Generated Hologram), and then encode it. In this research, we propose a compression algorithm is made up of various method such as pre-processing for transform, local segmentation with global information of object image, frequency transform for coding, scanning to make fringe to video stream, classification of coefficients, and hybrid video coding. Finally the proposed hybrid compression algorithm is all of these methods. The tool for still image coding is JPEG2000, and the toots for video coding include various international compression algorithm such as MPEG-2, MPEG-4, and H.264 and various lossless compression algorithm. The proposed algorithm illustrated that it have better properties for reconstruction than the previous researches on far greater compression rate above from four times to eight times as much. Therefore we expect that the proposed technique for digital hologram coding is to be a good preceding research.

An Implementation of OTB Extension to Produce TOA and TOC Reflectance of LANDSAT-8 OLI Images and Its Product Verification Using RadCalNet RVUS Data (Landsat-8 OLI 영상정보의 대기 및 지표반사도 산출을 위한 OTB Extension 구현과 RadCalNet RVUS 자료를 이용한 성과검증)

  • Kim, Kwangseob;Lee, Kiwon
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.3
    • /
    • pp.449-461
    • /
    • 2021
  • Analysis Ready Data (ARD) for optical satellite images represents a pre-processed product by applying spectral characteristics and viewing parameters for each sensor. The atmospheric correction is one of the fundamental and complicated topics, which helps to produce Top-of-Atmosphere (TOA) and Top-of-Canopy (TOC) reflectance from multi-spectral image sets. Most remote sensing software provides algorithms or processing schemes dedicated to those corrections of the Landsat-8 OLI sensors. Furthermore, Google Earth Engine (GEE), provides direct access to Landsat reflectance products, USGS-based ARD (USGS-ARD), on the cloud environment. We implemented the Orfeo ToolBox (OTB) atmospheric correction extension, an open-source remote sensing software for manipulating and analyzing high-resolution satellite images. This is the first tool because OTB has not provided calibration modules for any Landsat sensors. Using this extension software, we conducted the absolute atmospheric correction on the Landsat-8 OLI images of Railroad Valley, United States (RVUS) to validate their reflectance products using reflectance data sets of RVUS in the RadCalNet portal. The results showed that the reflectance products using the OTB extension for Landsat revealed a difference by less than 5% compared to RadCalNet RVUS data. In addition, we performed a comparative analysis with reflectance products obtained from other open-source tools such as a QGIS semi-automatic classification plugin and SAGA, besides USGS-ARD products. The reflectance products by the OTB extension showed a high consistency to those of USGS-ARD within the acceptable level in the measurement data range of the RadCalNet RVUS, compared to those of the other two open-source tools. In this study, the verification of the atmospheric calibration processor in OTB extension was carried out, and it proved the application possibility for other satellite sensors in the Compact Advanced Satellite (CAS)-500 or new optical satellites.