• Title/Summary/Keyword: data selection (데이터 선별)

Policy for Selective Flushing of Smartphone Buffer Cache using Persistent Memory (영속 메모리를 이용한 스마트폰 버퍼 캐시의 선별적 플러시 정책)

  • Lim, Soojung;Bahn, Hyokyung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.1 / pp.71-76 / 2022
  • Buffer cache bridges the performance gap between memory and storage, but in smartphones its effectiveness is limited by the periodic flush performed to prevent data loss. This paper shows that a selective flushing technique using a small amount of persistent memory can significantly reduce the flushing overhead of the smartphone buffer cache. The technique is motivated by our I/O analysis of smartphone applications, which shows that a small set of hot data accounts for most file writes, while a large proportion of file data is written only once. The proposed selective flushing policy flushes frequently updated data to persistent memory and flushes only single-write data to storage. This removes storage write traffic for frequently updated data and also improves the space efficiency of persistent memory. Simulations with I/O traces of popular smartphone applications show that the proposed policy reduces write traffic to storage by 24.8% on average and by up to 37.8%.
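
The flushing rule described in this abstract can be illustrated with a minimal sketch. The function name `selective_flush`, the `hot_threshold` parameter, and the toy write trace are hypothetical and introduced only for illustration. The abstract does not say how the paper identifies frequently updated data, so classifying blocks by a simple write count is an assumption here, not the authors' method.

```python
from collections import Counter


def selective_flush(dirty_blocks, write_counts, hot_threshold=2):
    """Split dirty blocks between persistent memory and storage.

    Assumption: a block counts as "frequently updated" once it has been
    written at least `hot_threshold` times; blocks below the threshold
    are treated as single-write data.
    """
    to_persistent_memory = []
    to_storage = []
    for block in dirty_blocks:
        if write_counts[block] >= hot_threshold:
            # Hot data: absorb the flush in persistent memory.
            to_persistent_memory.append(block)
        else:
            # Single-write data: flush to storage as usual.
            to_storage.append(block)
    return to_persistent_memory, to_storage


# Hypothetical write trace: each item is the ID of the block written.
trace = ["a", "b", "a", "c", "a", "b", "d"]
write_counts = Counter(trace)
dirty_blocks = sorted(set(trace))

to_pm, to_storage = selective_flush(dirty_blocks, write_counts)
print("to persistent memory:", to_pm)
print("to storage:", to_storage)
```

In this toy trace, only the blocks written once reach storage at flush time, which is the source of the write-traffic reduction the abstract reports.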

Automatic Classification of Department Types and Analysis of Co-Authorship Network: Focusing on Korean Journals in the Computer Field

  • Byungkyu Kim;Beom-Jong You;Min-Woo Park
    • Journal of the Korea Society of Computer and Information / v.28 no.4 / pp.53-63 / 2023
  • Department information is highly useful in bibliometric analysis of scientific and technological literature. In this paper, we built a department information dataset by screening, refining, and classifying the department types of university-affiliated authors appearing in science and technology journals published in Korea, and we developed a deep-learning-based automatic classification model using the dataset as training and validation data. In addition, we analyzed the co-authorship structure and network in the field of computer science using the department information dataset and the affiliation information of authors from domestic academic journals. The automatic classification model for Korean department information achieved an accuracy of 98.6%. Moreover, the co-authorship patterns of Korean researchers in the computer science and engineering field, along with the characteristics and centralities of the co-author network by institution type, region, institution, and department type, were identified in detail and presented visually on a map.

Selecting Optimal Locations for Bicycle Lanes to Prevent Accidents in Seoul (서울특별시 자전거 안전사고 예방을 위한 자전거 도로 최적 입지 선정: 자전거 전용도로 및 전용차로를 중심으로)

  • Ji-eun Kim;Sumin Nam;ZoonKy Lee
    • The Journal of Bigdata / v.8 no.2 / pp.45-54 / 2023
  • Seoul's public bicycle system, 'Ttareungyi,' introduced in 2015, reached an annual ridership of 40 million in 2022. Similarly, electric scooters, a type of personal mobility device, surpassed one million riders in 2020 thanks to various sharing platforms. However, bicycle lanes, the main roadways for these new modes of transportation, are notably insufficient compared with the infrastructure for other forms of transport. Hence, this study proposes an optimal location selection method for bicycle lanes in Seoul to prevent accidents and enhance bicycle ride safety. The location selection process prioritizes road safety with respect to bicycle accident risk. High-risk areas for bicycle accidents are identified using regression models. Cluster analysis then groups these areas into six clusters, and a suitable type of bicycle lane is suggested for each based on cluster-specific characteristics. We hope that this study will contribute to the improvement of Seoul's transportation environment, including the expansion of dedicated bicycle lanes and lanes for personal mobility devices.

A Study on the Reliability and Validity of the Collection of the Ethnography Method of Service Experience Data - Focusing on I know You_AI Service - (서비스경험데이터의 에스노그라피 방식 수집에 대한신뢰성과 타당성 연구 - I know you_AI 서비스를 중심으로 -)

  • Ahn, Jinho;Lee, Jeungsun
    • Journal of Service Research and Studies / v.10 no.4 / pp.43-55 / 2020
  • Recently, as the importance of experience data has increased, there have been many attempts to treat experience data from a data science perspective. Collecting it with quantitative survey methods that seek numerical quantification, as with big data, makes it difficult to interpret the value of experience broadly, is relatively expensive and time-consuming, and limits analysis because of the risk of personal information infringement. However, since ethnography, a procedure for collecting experience data based on qualitative research, is mainly carried out in the natural, real environment of future customers from the users' perspective, it can confirm the nature of what customers face with a small sample. It also makes the relational dimension of the experience data easy to interpret. Although the ethnography method of collecting experience data is economical and efficient, the lack of scientific procedures in the data collection process can be a problem, so it is important to reduce errors during collection. For ethnography-based experience data collection, it is important to secure validity, that is, whether the correct measurement tool is used, and to secure reliability in the use of a valid measurement tool and method by accurately selecting the measurement target. From this point of view, developing correct measurement methods and tools for collecting ethnographic experience data requires verifying the reliability of a research method that clearly selects the measurement target and secures validity. Therefore, this study conducted a verification study on the data and methodology of the 'I know you_AI' service, which analyzes the customer experience of the self-employed based on the ethnography method of collecting experience data.

A Comparative Study of Emotional Response to Korean Drama among Countries: With Drama 'Goblin' (한국 드라마 수용에 있어서 국가별 감정 반응 분석: 드라마 <도깨비>를 중심으로)

  • Lee, Yewon;Woo, Sungju
    • Science of Emotion and Sensibility / v.20 no.4 / pp.31-40 / 2017
  • This research aims to investigate the 'Hallyu' content consumption tendencies of consumers in Korea, Japan, and the United States by analyzing their emotional responses. With the development of social media, research on emotion analysis based on text materials has grown. Although environmental variables affect consumer demand for 'Hallyu' content, few comparative analyses have been conducted on the emotional responses of consumers from different countries. In this research, the emotional prototype model proposed by Russell (1980) was used to extract and distinguish emotional words in order to clarify how people in the three countries perceive the Korean drama "Goblin" differently. First, SNS reviews were collected over a two-month period (February 12 to April 12). Second, significant factors were identified in the collected data according to Russell's emotion model. Third, random forest was applied to rank the selected variables by variable importance. Fourth, the correlations among the emotional words were compared. Lastly, the accuracy of the trained model was measured using the test dataset. The results show that "Happy" was the most important factor in Korea and the United States, and "Pleased" in Japan. The correlations among emotional words showed that, when watching the drama "Goblin", "passive unpleasure" was the main factor associated with individuals' interest in Korea, whereas "passive pleasure" was associated with individuals' interest in Japan and the United States. Based on these results, this research suggests the possibility of developing evaluation guidelines for the emotional responses of different countries to 'Hallyu' content.

Movie Box-office Prediction using Deep Learning and Feature Selection : Focusing on Multivariate Time Series

  • Byun, Jun-Hyung;Kim, Ji-Ho;Choi, Young-Jin;Lee, Hong-Chul
    • Journal of the Korea Society of Computer and Information / v.25 no.6 / pp.35-47 / 2020
  • Box-office prediction is important to movie stakeholders, so it is necessary to predict box-office performance accurately and to select the important variables. In this paper, we propose a multivariate time series classification and important-variable selection method to improve the accuracy of box-office prediction. We collected daily data from KOBIS and NAVER for South Korean movies, selected important variables using Random Forest, and predicted multivariate time series using Deep Learning. Based on the Korean screen quota system, Deep Learning was used to compare the accuracy of box-office predictions on the 73rd day after movie release between the important variables and the entire set of variables, and the results were tested for statistical significance. Multi-Layer Perceptron, Fully Convolutional Neural Networks, and Residual Network were used as the Deep Learning models. Among them, the Residual Network using the important variables had the highest prediction accuracy, at 93%.

Study for Feature Selection Based on Multi-Agent Reinforcement Learning (다중 에이전트 강화학습 기반 특징 선택에 대한 연구)

  • Kim, Miin-Woo;Bae, Jin-Hee;Wang, Bo-Hyun;Lim, Joon-Shik
    • Journal of Digital Convergence / v.19 no.12 / pp.347-352 / 2021
  • In this paper, we propose a method that uses multi-agent reinforcement learning to find feature subsets in an input dataset that are effective for classification. In machine learning, it is crucial to find features suitable for classification. A dataset may have numerous features; while some may be effective for classification or prediction, others may have little effect or even a negative effect on the results. Feature selection for increasing classification or prediction accuracy is therefore a critical problem. To solve this problem, we propose a feature selection method based on reinforcement learning. Each feature has one agent, which determines whether the feature is selected. After rewards are obtained for the subset of features the agents selected and for the subset they did not select, the Q-value of each agent is updated by comparing the rewards. Comparing the rewards of the two subsets helps the agents determine whether their actions were right. These steps are repeated for the given number of episodes, after which the final features are selected. Applying this method to the Wisconsin Breast Cancer, Spambase, Musk, and Colon Cancer datasets improved accuracy by 0.0385, 0.0904, 0.1252, and 0.2055, respectively, yielding final classification accuracies of 0.9789, 0.9311, 0.9691, and 0.9474, respectively. These results show that the proposed method can properly select features that are effective for classification and increase classification accuracy.

Development of surface detection model for dried semi-finished product of Kimbukak using deep learning (딥러닝 기반 김부각 건조 반제품 표면 검출 모델 개발)

  • Tae Hyong Kim;Ki Hyun Kwon;Ah-Na Kim
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.17 no.4 / pp.205-212 / 2024
  • This study developed a deep learning model that distinguishes the front (with garnish) and back (without garnish) surfaces of the dried semi-finished product (dried bukak) for a screening operation performed before the dried bukak is transferred to the oil heater by a robot's vacuum gripper. For training and verification of the deep learning model, preprocessed RGB images of the front and back surfaces of 400 dried bukak were obtained. YOLO-v5 was used as the base structure of the deep learning model. Area and surface information labeling and data augmentation were applied to the acquired images. Metrics including mAP, mIoU, accuracy, recall, precision, and F1-score were selected to evaluate the performance of the developed YOLO-v5-based surface detection model. The mAP and mIoU were 0.98 and 0.96, respectively, on the front surface and 1.00 and 0.95, respectively, on the back surface. Binary classification of the front and back classes yielded an accuracy of 98.5%, recall of 98.3%, precision of 98.6%, and F1-score of 98.4%. As a result, the developed model can classify the surface information of the dried bukak using RGB images, and it can be used to develop a robot-automated system for the surface detection process of the dried bukak before deep frying.
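
As a quick consistency check on the reported classification metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. This sketch reads the reported 98.6% as precision and 98.3% as recall, which is an assumption about the abstract's terminology. The helper name `f1_score` is hypothetical and not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (assumed to be precision and recall).
precision = 0.986
recall = 0.983

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.1f}%")
```

Under that reading, the recomputed value agrees with the 98.4% F1-score stated in the abstract.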

Derivation of Data Quality Attributes and their Priorities Based on Customer Requirements (고객의 요구사항에 기반한 데이터품질 평가속성 및 우선순위 도출)

  • Jang, Kyoung-Ae;Kim, Ja-Hee;Kim, Woo Je
    • KIPS Transactions on Software and Data Engineering / v.4 no.12 / pp.549-560 / 2015
  • A wide variety of data quality attributes exist, such as those proposed by ISO/IEC and by many other domestic and international institutions. However, applying those criteria and guidelines to a real environment takes considerable time and cost. Therefore, it is necessary to define data quality evaluation attributes that are easily applicable and are not constrained by the organizational environment. The purpose of this paper is to derive data quality attributes and their priorities based on customer requirements, in order to manage the process systematically and evaluate data quantitatively. This study identifies customers' cognitive constructs of data quality attributes using the RGT (Repertory Grid Technique), based on a Korean quality standard model (DQC-M). A correlation analysis is then conducted on the identified constructs, and the evaluation attributes are prioritized and ranked using the AHP. As a result, consistent system, accurate data, efficient environment, flexible management, and continuous improvement are derived as the first-level data quality evaluation attributes. In addition, Control Compliance (13%), Regulatory Compliance (10%), Requirement Completeness (9.6%), Accuracy (8.4%), and Traceability (6.8%) rank as the top 5 of the 19 second-level attributes.

A Study on the Analysis and the Improvement of the MyData System from a Consumer Behavior Perspective (소비자행동 측면에서의 마이데이터 제도 분석 및 개선방안 연구)

  • Young-Jong Lee;Seong-Yeob Lee
    • Industry Promotion Research / v.9 no.3 / pp.163-174 / 2024
  • MyData is a new entity that strengthens the rights of information subjects through the 'right to data portability' and uses data to enable hyper-personalized services based on personal information. Korea's MyData system is recognized globally as an outstanding system in that it is creating a new MyData industry by granting the right to information self-determination through the 'right to request data transmission'. With the system now in its third year, this study evaluates Korea's MyData system from a consumer behavior perspective and identifies issues for improvement. To this end, this study reviewed previous research on the relationship between regulatory policy and consumer behavior to determine whether a consumer behavior perspective is applicable to institutional evaluation. In addition, studies on consumer behavior related to MyData were examined to investigate the variables that affect the use of MyData and to derive evaluation items from a consumer behavior perspective. The evaluation of Korea's MyData system from a consumer behavior perspective found that the factors consumers consider important were appropriately reflected in the system. However, where ease of use and personal information protection conflict, regulatory aspects tend to take priority. Therefore, in order to revitalize the MyData industry, it is essential to implement market-friendly system improvements without compromising consumer rights. This study differs from existing studies in that it attempts to derive a plan for system improvement by combining empirical consumer behavior research with regulatory policy research.