• Title/Summary/Keyword: Data scientist


IMPLEMENTATION OF SUBSEQUENCE MAPPING METHOD FOR SEQUENTIAL PATTERN MINING

  • Trang, Nguyen Thu;Lee, Bum-Ju;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / v.2 / pp.627-630 / 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are available everywhere in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical record histories. Discovering sequential patterns can help users or scientists predict upcoming activities, interpret recurring phenomena, or extract similarities. To that end, the core of sequential pattern mining is finding the sequences that occur frequently across the data sequences. Besides discovering frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and determining which of those sequences are frequent. So, before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement a subsequence matching method as a preprocessing step for sequential pattern mining. The sequences matched in our implementation are normalized sequences in the form of number chains. The result of this method is a report of the matching information between the input mapped sequences.


Implementation of Subsequence Mapping Method for Sequential Pattern Mining

  • Trang Nguyen Thu;Lee Bum-Ju;Lee Heon-Gyu;Park Jeong-Seok;Ryu Keun-Ho
    • Korean Journal of Remote Sensing / v.22 no.5 / pp.457-462 / 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are available everywhere in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical record histories. Discovering sequential patterns can help users or scientists predict upcoming activities, interpret recurring phenomena, or extract similarities. To that end, the core of sequential pattern mining is finding the sequences that occur frequently across the data sequences. Besides discovering frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and determining which of those sequences are frequent. So, before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement a subsequence matching method as a preprocessing step for sequential pattern mining. The sequences matched in our implementation are normalized sequences in the form of number chains. The result of this method is a report of the matching information between the input mapped sequences.
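
The subsequence check described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each sequence is an ordered list of itemsets encoded as sets of integers (one plausible reading of "number chains"), and the names `is_subsequence` and `support` are hypothetical.

```python
from typing import FrozenSet, Sequence

Itemset = FrozenSet[int]


def is_subsequence(candidate: Sequence[Itemset], data: Sequence[Itemset]) -> bool:
    """Return True if each itemset of `candidate` is contained, in order,
    in a distinct itemset of `data` (greedy left-to-right matching)."""
    pos = 0
    for itemset in candidate:
        # Advance to the first remaining itemset of `data` that contains it.
        while pos < len(data) and not itemset <= data[pos]:
            pos += 1
        if pos == len(data):
            return False
        pos += 1  # the next candidate itemset must match strictly later
    return True


def support(candidate: Sequence[Itemset], database: Sequence[Sequence[Itemset]]) -> int:
    """Count the data sequences that contain `candidate` as a subsequence."""
    return sum(is_subsequence(candidate, seq) for seq in database)


# Hypothetical toy database of three data sequences.
database = [
    [frozenset({1, 2}), frozenset({3}), frozenset({4, 5})],
    [frozenset({1}), frozenset({4})],
    [frozenset({3}), frozenset({1, 2})],
]
candidate = [frozenset({1}), frozenset({4})]
print(support(candidate, database))  # prints 2
```

A candidate would then be called frequent when its support reaches a user-chosen minimum; that threshold is not specified in the abstract.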

The Distributed Management System of Moving Objects for LBS

  • Jang, In-Sung;Cho, Dae-Soo;Park, Jong-Hyun
    • Proceedings of the KSRS Conference / 2002.10a / pp.163-167 / 2002
  • Recently, owing to improvements in telecommunication technology, growth in wireless internet subscribers, and the spread of wireless devices, interest in LBS (Location Based Service), which uses a user's location information to deliver information relevant to that location, has been increasing rapidly. Accordingly, a MOMS (Moving Object Management System) that manages users' location information is required to provide location-based services. Early LBS, such as friend-finder services, need only the current location, but to provide high-quality services in connection with data mining and CRM, we must also be able to manage past location information. In this paper, we design a distributed management system for inserting and searching large numbers of moving objects. It consists of CLIM (Current Location Information Manager), PLIM (Past-Location Information Manager), and DLIM (Distributed Location Information Manager). CLIM and PLIM improve search performance by using a spatiotemporal index. DLIM distributes the enormous amount of location data across multiple databases. It thereby maintains load balance, regulates overload, and manages a huge volume of location information efficiently.


Personal Recommendation Service Design Through Big Data Analysis on Science Technology Information Service Platform (과학기술정보 서비스 플랫폼에서의 빅데이터 분석을 통한 개인화 추천서비스 설계)

  • Kim, Dou-Gyun
    • Journal of the Korean BIBLIA Society for library and Information Science / v.28 no.4 / pp.501-518 / 2017
  • Reducing the time it takes researchers to acquire knowledge and apply it to their research activities can be regarded as an indispensable factor in improving research productivity. The purpose of this research is to cluster the information usage patterns of KOSEN users and to suggest a method for optimizing the personalized recommendation algorithm for the grouped users. Based on users' research activities and usage information, and after identifying appropriate services and content, we applied Spark-based big data analysis technology to derive a personal recommendation algorithm. Individual recommendation algorithms can reduce the time users spend searching for information and help them find appropriate information.

A Study on the Technological Difficult Problems and Education Demand for Information Technology Sectors Women (여성정보인의 정보화에 대한 기술적 애로사항 및 IT 교육 요구 사항 조사 연구)

  • Cho, Young-Im;Jeong, Hyeong-Chul;Kim, Jee-Hyun
    • Journal of Engineering Education Research / v.12 no.3 / pp.31-40 / 2009
  • In this paper, we consider the characteristics of women in the information technology sector. By surveying women IT workers, we attempted to define their attributes and to examine the problems they face and what they need from IT education, given the changes in the highly competitive information technology industry. In particular, we used a data mining technique, association analysis, to analyze the education packages that the Women Information Scientist Association of Korea (WINSA) provides to women IT workers and the general education courses relevant from an IT employment perspective. The data were analyzed with SAS enterprise tools.

Direct Geo-referencing for Laser Mapping System

  • Kim, Seong-Baek;Lee, Seung-yong;Kim, Min-Soo
    • Proceedings of the KSRS Conference / 2002.10a / pp.423-427 / 2002
  • In contrast to traditional text-based information, 4S (GIS, GNSS, SIIS, ITS) information can contribute to citizens' welfare in the upcoming era. Recently, GSIS (Geo-Spatial Information System) has been applied and emphasized in various fields. In analyzing data from the GSIS arena, the position information of objects and targets is crucial. Therefore, several methods of obtaining position have been proposed and developed. From this perspective, position collection and processing are the heart of 4S technology. We develop the 4S-Van, which enables real-time acquisition of position and attribute information and accurate image data at remote sites. In this study, the configuration of the 4S-Van, equipped with GPS, INS, CCD, and an eye-safe laser scanner, is shown, and the merits of the DGPS/INS integration approach for geo-referencing are briefly discussed. The DGPS/INS integration algorithm for determining the six parameters of motion is essential in the 4S-Van to avoid or simplify complicated computations such as photogrammetric triangulation. The 4S-Van is applied as a laser mobile mapping system for three-dimensional data acquisition that merges texture information from the CCD camera. The technique is also applied in the fields of virtual reality, car navigation, computer games, planning and management, city transportation, mobile communication, etc.


The Accelerating Growth of Human Knowledge and Scientific Information (가속적(加速的)으로 성장(成長)하는 인간지식(人間知識)과 과학정보(科學情報))

  • Oh, Ik-Sang;Lorch, Walter T.
    • Journal of Information Management / v.2 no.4 / pp.3-5 / 1964
  • This first part of a series provides some data on the rapid growth, high importance, and increasing cost of scientific research, which leads to a doubling of the amount of research literature about every eight years. The importance of periodicals is emphasized, and some figures on the growth of abstract journals demonstrate how difficult it has become for scientists to keep up with current developments in their subject fields without using documentation. The second part, in KORSTIC's next issue, deals with the scope and methods of scientific documentation, the third part with the devices and machines for information processing and with the problem of automation, whilst the fourth part throws some light on the organization of documentation work around the world.


A Study on Curriculum Development for Big Data Driven Digital Marketer (빅데이터 기반 디지털 마케터 전문가 양성을 위한 교육과정 개발 관련 연구)

  • Yi, Myongho
    • Journal of Digital Convergence / v.19 no.5 / pp.105-115 / 2021
  • Many services are provided through big data analysis in various fields, including individuals, the private sector, and government. There is growing interest in training data scientists to provide these services. In particular, interest in big data-based marketing curricula is high. This study analyzed the big data-based marketing curricula of domestic and foreign universities, with a view to utilizing vast and diverse types of information from a marketing perspective in the era of big data. The analysis of 3,523 subjects related to digital marketing, big data marketing, data analysis, and developers, collected according to the analysis criteria, found that the specialized curriculum for training the data scientists required in the era of the fourth industrial revolution was not appropriate. The curriculum proposed in this study is expected to be useful for developing digital marketing and big data-based marketing curricula.

A Study on the Supporting System for Scientific Data Visualization at the National Level (국가수준의 과학데이터 시각화 지원체계에 관한 연구)

  • Park, Dong-Jin;Chae, Kyun-Shik;Ryu, Beom-Jong;Lee, Sang-Tae
    • Journal of Information Management / v.42 no.2 / pp.85-102 / 2011
  • Conventionally, scientific data visualization has been regarded as one of the activities scientists perform during scientific data analysis. Recently, however, a set of research papers has treated scientific data visualization as an independent research area, presenting research subjects for studying visualization technology and methods. In some cases, a scientist or group of scientists cannot solve their own visualization problem because of a lack of skill and experience with visualization tools, so they need systematic help to solve the problem. In this study, we analyze and propose a national-level scientific visualization support system for scientists. In particular, we first analyze the existing papers and identify the critical success factors. Then, by integrating the findings of the analysis, we propose the research areas that need focus, along with the strategic direction and specific research topics for a national-level scientific data visualization support system.

On the Distribution of the Movement Speed of Smartphone Users (스마트폰으로 측정된 사용자의 이동속도분포에 관한 연구)

  • Kim, Woojin;Jang, Woncheol;Song, Ha Yoon
    • KIISE Transactions on Computing Practices / v.22 no.11 / pp.567-575 / 2016
  • With the popularity of smartphones, users' location information is of great interest as location-based mobile apps increase. In this paper, we analyze users' speed data derived from location information. Locations with large measurement errors are not uncommon, and removing them is necessary. The distribution of speed can be considered a mixture model whose components correspond to means of transportation. We identify the tail part as a component of the mixture model and fit a simple parametric model to the tail of the speed distribution.
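
As a rough illustration of deriving speed from location data and discarding values caused by measurement error, the Python sketch below may help. It is not the paper's method: the haversine distance, the `(timestamp_s, lat, lon)` record layout, the function names, and the 100 m/s cutoff are all assumptions made for this example, and it filters implausible speed segments rather than the erroneous locations themselves.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds_mps(fixes, max_speed_mps: float = 100.0):
    """Speeds in m/s between consecutive fixes, sorted by time.

    `fixes` is a list of (timestamp_s, lat, lon) tuples. Segments faster than
    `max_speed_mps` (an assumed cutoff) are treated as measurement errors
    and dropped.
    """
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        v = haversine_m(lat0, lon0, lat1, lon1) / dt
        if v <= max_speed_mps:
            speeds.append(v)
    return speeds


# Hypothetical fixes: the third jumps a full degree of latitude in 10 seconds.
fixes = [(0, 37.0, 127.0), (10, 37.0001, 127.0), (20, 38.0, 127.0)]
print(speeds_mps(fixes))
```

The retained speeds could then feed whatever mixture or tail model is being fitted; the abstract does not specify the parametric family, so none is assumed here.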