• Title/Summary/Keyword: Big data Processing


RDBMS Based Efficient Method for Shortest Path Searching Over Large Graphs Using K-degree Index Table (대용량 그래프에서 k-차수 인덱스 테이블을 이용한 RDBMS 기반의 효율적인 최단 경로 탐색 기법)

  • Hong, Jihye;Han, Yongkoo;Lee, Young-Koo
    • KIPS Transactions on Software and Data Engineering / v.3 no.5 / pp.179-186 / 2014
  • Current networks such as social networks, web page links, and traffic networks are big data with large numbers of nodes and edges. Many applications, such as social network services and navigation systems, use these networks. Since big networks do not fit into memory, existing in-memory analysis techniques cannot provide high performance. The Frontier-Expansion-Merge (FEM) framework supports graph search operations using three corresponding operators in the relational database (RDB) context. FEM exploits an index table that stores pre-computed partial paths for efficient shortest path discovery. However, the index table of FEM has a low hit ratio because the indices are determined by distances of indices rather than by the possibility of containing a shortest path. In this paper, we propose a method that constructs the index table from high-degree nodes, which have a high hit ratio, for efficient shortest path discovery. We experimentally verify that our index technique supports shortest path discovery efficiently on real-world datasets.
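The idea of indexing around high-degree nodes can be illustrated with a minimal in-memory sketch. This is not the authors' RDBMS-based FEM implementation: it assumes an unweighted, undirected toy graph, and the function names (`build_degree_index`, `indexed_upper_bound`) and the index layout are hypothetical. It precomputes hop counts from the k highest-degree nodes and uses them to find a path length through a hub, which is only an upper bound on the true shortest distance.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop counts from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def build_degree_index(adj, k):
    """Precompute distances from the k highest-degree nodes (hypothetical layout)."""
    hubs = sorted(adj, key=lambda n: len(adj[n]), reverse=True)[:k]
    return {h: bfs_distances(adj, h) for h in hubs}

def indexed_upper_bound(index, s, t):
    """Shortest s -> hub -> t length found in the index, or None on a miss."""
    candidates = [d[s] + d[t] for d in index.values() if s in d and t in d]
    return min(candidates) if candidates else None

# Toy undirected graph (assumed data): node "c" has the highest degree.
adj = {
    "a": ["c"], "b": ["c"], "c": ["a", "b", "d", "e"],
    "d": ["c", "e"], "e": ["c", "d", "f"], "f": ["e"],
}
index = build_degree_index(adj, k=1)
print(indexed_upper_bound(index, "a", "f"))  # path through hub "c"
print(indexed_upper_bound(index, "d", "e"))  # bound is loose: d and e are adjacent
```

When the true shortest path passes through an indexed hub the lookup is exact; otherwise the bound can still limit how far a follow-up search needs to expand.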

High Efficiency Life Prediction and Exception Processing Method of NAND Flash Memory-based Storage using Gradient Descent Method (경사하강법을 이용한 낸드 플래시 메모리기반 저장 장치의 고효율 수명 예측 및 예외처리 방법)

  • Lee, Hyun-Seob
    • Journal of Convergence for Information Technology / v.11 no.11 / pp.44-50 / 2021
  • Recently, enterprise storage systems that require large-capacity storage devices to accommodate big data have used large-capacity flash memory-based storage devices, which offer high density relative to cost and size. This paper proposes a high-efficiency life prediction method based on gradient descent to maximize the life of flash memory media, which directly affects the reliability and usability of large enterprise storage devices. To this end, the paper proposes a matrix structure for storing the metadata used to learn defect frequency, together with a cost model that uses this metadata. It also proposes a life prediction policy for exceptional situations in which defects occur outside the learned range. Lastly, simulation verified that the proposed method can extend device life compared with a life prediction method based on a fixed number of operations and a method based on the remaining ratio of spare blocks, both of which have been used to predict the life of flash memory.
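As a rough illustration of gradient descent applied to life prediction, the sketch below fits a straight line to defect counts and extrapolates to the point where a spare-block budget is exhausted. The paper's metadata matrix, cost model, and exception policy are not reproduced here; the linear model, the data points, the budget, and the function names are all assumptions made for this example.

```python
def fit_line_gd(xs, ys, lr=0.05, steps=5000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict_end_of_life(w, b, spare_budget):
    """Usage level at which predicted defects reach the budget, or None if
    the fitted trend is not increasing (a case outside what was learned)."""
    if w <= 0:
        return None
    return (spare_budget - b) / w

# Assumed data: usage in thousands of program/erase cycles vs. defective blocks.
cycles = [0, 1, 2, 3, 4]
defects = [1, 3, 5, 7, 9]
w, b = fit_line_gd(cycles, defects)
end_of_life = predict_end_of_life(w, b, spare_budget=21)
print(round(w, 3), round(b, 3), round(end_of_life, 3))
```

The learning rate is chosen small enough for this toy data to converge; real defect data would likely need feature scaling or a different model.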

Analysis of Current Situation of University Student Loans Based on Bigdata (빅데이터 기반 대학생 학자금 대출 현황 분석)

  • Kim, Jeong-Joon;Jang, Sung-Jun;Lee, Yong-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.5 / pp.229-238 / 2019
  • Before the scholarship loan system was implemented at the Korea Scholarship Foundation, the government's role was strengthened by the direct lending of student funds to banks and other financial institutions. However, the low repayment performance of student loans has raised concerns over the future of student loans and the government's financial burden. Moreover, since student loans are intended to support low-income families and are repaid only after graduation from college, the repayment rate is highly unlikely to improve unless the employment rate and income level of borrowers improve. This paper presents a final visualization graph of student loan repayment amounts, produced through the collection, storage, processing, and analysis phases of a big data-based system. This could serve as a basis for visually checking student loan amounts and devising ways to reduce the burden on the current student loan system.

Method of Similarity Hash-Based Malware Family Classification (유사성 해시 기반 악성코드 유형 분류 기법)

  • Kim, Yun-jeong;Kim, Moon-sun;Lee, Man-hee
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.945-954 / 2022
  • Billions of malicious code samples are detected every year, of which only 0.01% are new types of malware. In this situation, an effective malware type classification tool is needed, but previous approaches are limited in how quickly they can analyze large amounts of malicious code because they require complex and extensive data preprocessing. To solve this problem, this paper proposes a method that classifies types of malicious code based on similarity hashes, without complex data preprocessing. The approach trains an XGBoost model on the similarity hash information of the malware. To evaluate it, we used the BIG-15 dataset, which is widely used in the field of malware classification. As a result, malicious code was classified with an accuracy of 98.9%, and 3,432 benign files were identified with 100% accuracy. This result is superior to most recent studies that use complex preprocessing and deep learning models. The proposed approach is therefore expected to enable more efficient malware classification.

A Study on Measurement of Length and Slope of Temporary Structure using UAV (무인항공기를 활용한 가설구조물의 길이와 기울기 측정에 관한 연구)

  • Min-Guk, Kang;Seung-Hyeon, Shin;JongKeun, Park;Jeong-Hun, Won
    • Journal of the Korean Society of Safety / v.37 no.6 / pp.89-95 / 2022
  • A method for measuring the length and slope of a temporary structure using an unmanned aerial vehicle (UAV) and 3D modeling is proposed. The actual length and slope of the vertical member of each specimen were measured and compared with the values obtained by the proposed method, for specimens with and without a vertical protection net installed. For the specimen without the vertical protection net, the length measured using the UAV and 3D modeling method showed an error of 0.87% relative to the actual length, and the slope error was 0.63°. The proposed method could therefore be used to find incorrectly installed parts of a temporary structure and to inform the manager in charge. However, for the specimen with the vertical protection net, the measurement showed a 1.46% error in length and a 2.77° difference in slope. Therefore, if a vertical protection net is to be installed on a temporary structure, the measurement accuracy should be improved, for example by applying an image processing method.

Study on Basic Design of Maritime Information Gateway System for Sharing Information with Related Organizations about Korean e-Navigation Service (유관기관 정보 공유를 위한 지능형 해상교통정보 체계의 대용량 해양 정보 연계 시스템 기본 설계에 대한 연구)

  • Yong-hak Song;Hyun Kim;Do-yeon Kim
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2022.06a / pp.308-309 / 2022
  • The Ministry of Oceans and Fisheries provides maritime safety services that combine limited artificial intelligence technologies through the operation of the Korean e-Navigation service, and research is needed to improve reliability and quality in order to secure the competitiveness of the system. However, linking to systems that operate in real time requires a separate system configuration that can be linked after personal information has been secured, with minimal performance impact. To solve this problem, this study produces a basic design of a big-data maritime information gateway system for the Korean e-Navigation service that minimizes the performance impact and reflects the security of personal information.


Analysis of media trends related to spent nuclear fuel treatment technology using text mining techniques (텍스트마이닝 기법을 활용한 사용후핵연료 건식처리기술 관련 언론 동향 분석)

  • Jeong, Ji-Song;Kim, Ho-Dong
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.33-54 / 2021
  • With the fourth industrial revolution and the arrival of the New Normal era brought on by COVID-19, the importance of non-contact technologies such as artificial intelligence and big data research has been increasing. Convergent research is being conducted in earnest to keep up with these trends, but few studies in the nuclear field have applied artificial intelligence and big data-related technologies such as natural language processing and text mining. This study was conducted to confirm the applicability of data science analysis techniques to nuclear research. Furthermore, identifying trends in the perception of spent nuclear fuel is critical for setting the direction of nuclear industry policy and responding in advance to changes in industrial policy. For those reasons, this study conducted a media trend analysis of pyroprocessing, a spent nuclear fuel treatment technology. We objectively analyze changes in media perception of spent nuclear fuel dry treatment technology by applying text mining techniques. Text data from Naver web news articles containing the keywords "Pyroprocessing" and "Sodium Cooled Reactor" were collected with Python code to identify changes in perception over time. The analysis period was set from 2007, when the first article was published, to 2020, and detailed, multi-layered analysis of the text data was carried out using word clouds based on frequency analysis, TF-IDF, and degree centrality calculation. Keyword frequency analysis showed that media perception of spent nuclear fuel dry treatment technology changed in the mid-2010s, influenced by the Gyeongju earthquake in 2016 and the implementation of the new government's energy conversion policy in 2017.
Therefore, trend analysis was conducted based on the corresponding time periods, and word frequency analysis, TF-IDF, degree centrality values, and semantic network graphs were derived. The results show that before the 2010s, media perception of spent nuclear fuel dry treatment technology was diplomatic and positive. However, over time, the frequency of keywords such as "safety", "reexamination", "disposal", and "disassembly" has increased, indicating that the sustainability of spent nuclear fuel dry treatment technology is being seriously considered. It was confirmed that social awareness also changed as the status of spent nuclear fuel dry treatment technology, which had been recognized as a political and diplomatic technology, became ambiguous due to changes in domestic policy. This means that domestic policy changes, such as nuclear power policy, have a greater impact on media perception than issues of "spent nuclear fuel processing technology" itself. This seems to be because nuclear policy is a more widely discussed and publicly accessible topic than spent nuclear fuel. Therefore, to improve social awareness of spent nuclear fuel processing technology, it would be necessary to provide sufficient information about it, and linking it to nuclear policy issues would also be a good idea. In addition, the study highlighted the importance of social science research in nuclear power. The social sciences need to be applied widely in the nuclear engineering sector, and we could confirm that the nuclear industry would be sustainable if national policy changes are taken into account. However, this study is limited in that it applied big data analysis methods only to a narrow research area, namely "Pyroprocessing," a spent nuclear fuel dry processing technology. Furthermore, there was no clear basis for the cause of the change in social perception, and only news articles were analyzed to determine social perception.
Considering future comments, it is expected that more reliable results will be produced and used efficiently in the field of nuclear policy research if a media trend analysis study on nuclear power is conducted. The development of non-contact technologies such as artificial intelligence and big data research is accelerating in the wake of the recent arrival of the New Normal era caused by COVID-19. Convergence research is being conducted in earnest in various research fields to follow these trends, but few studies in the nuclear field have used artificial intelligence and big data-related technologies such as natural language processing and text mining. The academic significance of this study is that it confirmed the applicability of data science analysis technology in the field of nuclear research. Furthermore, because current government energy policies such as nuclear power plant reductions are prompting a re-evaluation of spent fuel treatment technology research, keyword analysis in the field can contribute to the orientation of future research. It is important to consider outside views, not just the safety technology and engineering integrity of nuclear power, and to reconsider whether it is appropriate to discuss nuclear engineering technology only internally. In addition, if multidisciplinary research on nuclear power is carried out, reasonable alternatives can be prepared to maintain the nuclear industry.
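The two text mining measures named in this abstract, TF-IDF and degree centrality, can be sketched in plain Python. The tokenized documents below are invented for illustration and are not the study's Naver news corpus; the specific formulas (term frequency normalized by document length, `log(N/df)` for inverse document frequency, and a keyword network with an edge between terms that co-occur in a document) are common textbook choices assumed here, not necessarily the ones the authors used.

```python
import math
from itertools import combinations

def tf_idf(docs):
    """Per-document TF-IDF scores: (count / doc length) * log(N / doc frequency)."""
    n_docs = len(docs)
    doc_freq = {}
    for doc in docs:
        for term in set(doc):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    scores = []
    for doc in docs:
        scores.append({
            term: (doc.count(term) / len(doc)) * math.log(n_docs / doc_freq[term])
            for term in set(doc)
        })
    return scores

def degree_centrality(docs):
    """Degree centrality in a co-occurrence network: neighbors / (terms - 1)."""
    neighbors = {term: set() for doc in docs for term in doc}
    for doc in docs:
        for a, b in combinations(set(doc), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n_terms = len(neighbors)
    return {term: len(adj) / (n_terms - 1) for term, adj in neighbors.items()}

# Invented toy corpus: each document is a list of already-extracted keywords.
docs = [
    ["pyroprocessing", "safety", "policy"],
    ["pyroprocessing", "disposal"],
    ["pyroprocessing", "safety", "reexamination"],
]
scores = tf_idf(docs)
centrality = degree_centrality(docs)
print(scores[1])
print(centrality)
```

A term that appears in every document gets a TF-IDF score of zero under this formula, while its degree centrality can still be high, which is one reason the two measures are often reported together.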

A Fast and Scalable Image Retrieval Algorithms by Leveraging Distributed Image Feature Extraction on MapReduce (MapReduce 기반 분산 이미지 특징점 추출을 활용한 빠르고 확장성 있는 이미지 검색 알고리즘)

  • Song, Hwan-Jun;Lee, Jin-Woo;Lee, Jae-Gil
    • Journal of KIISE / v.42 no.12 / pp.1474-1479 / 2015
  • With mobile devices showing marked improvement in performance in the age of the Internet of Things (IoT), there is demand for rapid processing of large amounts of multimedia big data. However, because research on image search has focused mainly on increasing accuracy despite environmental changes, progress on fast processing of high-resolution multimedia data queries has been slow. Hence, we suggest a new distributed image search algorithm that ensures both high accuracy and rapid response by using distributed image feature extraction based on MapReduce, and that solves the problem of memory scalability by using BIRCH indexing. In addition, we conducted experiments on the accuracy, processing time, and scalability of this algorithm to confirm its excellent performance.

A Swine Management System for PLF based on Integrated Image Processing Technique (통합 이미지 처리기법 기반의 PLF를 위한 Swine 관리 시스템)

  • Arellano, Guy;Cabacas, Regin;Balontong, Amem;Ra, In-Ho
    • Smart Media Journal / v.3 no.1 / pp.16-21 / 2014
  • The demand for food rises proportionally as the population grows. To achieve a sustainable supply of livestock products, efficient farm management is a necessity. Advances in technology have brought innovations that can be harnessed to achieve better productivity in animal production and agriculture. Precision Livestock Farming (PLF) is a budding concept that uses smart sensors or other available devices to automatically and continuously monitor and manage livestock production. Building on this concept, this paper introduces a swine management system that integrates an image processing technique for weight monitoring. The system captures pig images with a camera, then evaluates and estimates the weight based on the captured image. It comprises a Pig Module, Breeding Module, Health and Medication Module, Weight Module, Data Analysis Module, and Report Module to help swine farm administrators better understand the performance and situation of the swine farm. This paper aims to improve management for both small and large livestock raisers.

EXECUTION TIME AND POWER CONSUMPTION OPTIMIZATION in FOG COMPUTING ENVIRONMENT

  • Alghamdi, Anwar;Alzahrani, Ahmed;Thayananthan, Vijey
    • International Journal of Computer Science & Network Security / v.21 no.1 / pp.137-142 / 2021
  • The Internet of Things (IoT) paradigm is at the forefront of present and future research activities. The amount of sensing data from IoT devices that needs to be processed is increasing dramatically in volume, variety, and velocity. In response, cloud computing has been used to handle the challenges of collecting, storing, and processing jobs. Fog computing is a model that supports cloud computing by running pre-processing jobs close to the end user, to achieve low latency, lower power consumption on the cloud side, and high scalability. However, some resources in a fog computing network may not be suitable for certain kinds of jobs, or the number of requests may exceed capacity. It is therefore more efficient to reduce the number of jobs sent to the cloud. Since some other fog resources may be idle, it is better to federate them than to forward jobs to the cloud server. This issue affects the performance of the fog environment when dealing with big data applications or applications that are sensitive to processing time. This research aims to build a fog topology job scheduling (FTJS) scheme that schedules the incoming jobs generated by IoT devices and discovers all available fog nodes along with their capabilities. In addition, a fog topology job placement algorithm is introduced to deploy jobs effectively onto appropriate resources in the network. Finally, compared with the state-of-the-art first come, first served (FCFS) scheduling technique, our approach reduces the overall execution time by approximately 20% and the energy consumption on the cloud side by 18%.
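For context on the FCFS baseline mentioned in this abstract, the sketch below dispatches jobs in arrival order to whichever node becomes free first and reports the overall execution time (makespan). It does not reproduce the FTJS scheduler or the job placement algorithm, which the abstract does not specify; the function name, the job durations, the node count, and the assumption that all jobs arrive at time zero are hypothetical.

```python
import heapq

def fcfs_schedule(durations, n_nodes):
    """Assign jobs in arrival order to the earliest-free node.

    Returns (schedule, makespan), where schedule holds
    (job index, node index, start time, finish time) tuples.
    """
    # Heap of (time the node becomes free, node index).
    free_at = [(0, node) for node in range(n_nodes)]
    heapq.heapify(free_at)
    schedule = []
    for job, duration in enumerate(durations):
        start, node = heapq.heappop(free_at)
        finish = start + duration
        schedule.append((job, node, start, finish))
        heapq.heappush(free_at, (finish, node))
    makespan = max(finish for _, _, _, finish in schedule) if schedule else 0
    return schedule, makespan

# Assumed workload: four jobs, all available at time zero, on two fog nodes.
schedule, makespan = fcfs_schedule([4, 2, 3, 1], n_nodes=2)
for entry in schedule:
    print(entry)
print("makespan:", makespan)
```

Because FCFS ignores node capabilities and job requirements, a capability-aware scheduler has room to shorten the makespan, which is the kind of comparison the abstract reports.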