• Title/Summary/Keyword: Public data portal

Study on prediction for a film success using text mining (텍스트 마이닝을 활용한 영화흥행 예측 연구)

  • Lee, Sanghun;Cho, Jangsik;Kang, Changwan;Choi, Seungbae
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1259-1269 / 2015
  • Big data has recently become a keyword in academic circles, and its usefulness is now recognized by government, local public bodies, and enterprises as well, all of which are working to extract useful information from it. This research analyzes the box office success or failure of films using text mining. The data consist of data from a portal site 'D' together with film review data, grade point averages, and the number of screens obtained from the Korean Film Commission. The purpose of this paper is to propose a model that uses these data to predict whether a film will succeed. In the analysis, the proposed prediction model achieved a correct classification rate of 95.74%.

Introduction and Evaluation of Communicable Disease Surveillance in the Republic of Korea (전염병 감시 체계 소개 및 평가)

  • Park, Ok;Choi, Bo-Youl
    • Journal of Preventive Medicine and Public Health / v.40 no.4 / pp.259-264 / 2007
  • Effective communicable disease surveillance systems are the basis of national disease prevention and control. Following the increase in emerging and re-emerging infectious diseases since the late 1990s, the Korean government has strived to enhance its surveillance and response system. Since 2000, sentinel surveillance, such as influenza, pediatric, school-based, and ophthalmological sentinel surveillance, has been introduced to improve surveillance activities. An electronic reporting system was developed in 2000, enabling the establishment of a national database of reported cases. Disweb, a portal for sharing communicable disease information with the public and health care workers, was also developed. In general, survey results on the usefulness and attributes of the system, such as simplicity, flexibility, acceptability, sensitivity, timeliness, and representativeness, showed relatively high recognition. A correlation analysis comparing the number of paid cases under national health insurance with the cases reported by the national notifiable disease surveillance system and the various sentinel surveillance systems showed a high correlation. According to a research project conducted by KCDC, the reporting rate of physicians in 2004 had also improved greatly compared with that in the 1990s. However, continuous efforts are needed to further improve the communicable disease surveillance system. Physicians' awareness of the system must be improved through continuous education and information campaigns. Means should also be devised for the efficient use of various administrative data, including cause of death statistics and health insurance data. In addition, the efficiency of the system must be improved by linking data from the various surveillance systems.

Subway Line 2 Congestion Prediction During Rush Hour Based on Machine Learning (머신러닝 기반 2호선 출퇴근 시간대 지하철 역사 내 혼잡도 예측)

  • Jinyoung Jang;Chaewon Kim;Minseo Park
    • The Journal of the Convergence on Culture Technology / v.9 no.6 / pp.145-150 / 2023
  • The subway is a form of public transportation that many people use every day. Line 2 in particular has the most crowded stations during the day. High congestion during rush hour increases the risk of crush accidents and reduces the safety and comfort of passengers. Predicting subway congestion helps forestall the problems that high congestion causes. Therefore, this study proposes machine learning classification models that predict subway congestion during commuting time. To predict congestion on Line 2 based on machine learning, we investigate variables that affect subway congestion through previous research and collect a dataset of Line 2 rush-hour congestion from the Public Data Portal. The proposed model is expected to help establish a subway operation plan that keeps passengers safe and satisfied.

A Study on the Implementation of Korean History Contents Service based on Linked Open Data (LOD 기반 한국사 콘텐츠 서비스 구축에 관한 연구)

  • Yoon, So Young
    • Journal of the Korean Society for Information Management / v.30 no.3 / pp.297-315 / 2013
  • People who want to access and learn Korean history easily have become interested in Korean history databases that provide accurate and reliable historical information. User demand for information sharing and reusability, made possible by the semantic web, has also increased and has taken the shape of linked data. Efforts have been made to construct public databases containing readily usable content that a user can understand and utilize with ease. These have been produced by several organizations, portal sites, and individuals trying to move away from the existing mainstream of expert-based text databases. A problem with those databases is that they have not considered vital factors such as the sharing and utilization of information as a whole. This study proposes an LOD-based Korean history contents implementation system that provides a rich information environment through multi-dimensional web-data connections. In doing so, the system attempts a historical information circulation service based on sharing and connecting information.

Analysis of Factors Influencing the Utilization Rate of Public Health Centers in Korea (한국의 보건소 이용률에 영향을 미치는 요인 분석)

  • Park, Eun-A;Choi, Sung-Yong
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.3 / pp.203-215 / 2019
  • This study was conducted to identify the utilization of public health centers, as well as the individual and regional characteristics that affect their utilization, based on data from the 2016 Community Health Survey, the National Statistical Portal, and the National Institute of Environmental Research. Independent samples t-tests, analysis of variance, and multiple logistic regression analysis were used for analysis. Hierarchical multiple regression was used to analyze individual and regional characteristics. The results of the hierarchical multiple regressions revealed that use of public health centers was higher in aged regions and among women, older individuals, respondents with lower education and income levels, walking practitioners, nutrition label readers, individuals experiencing depression, those who have received health checkups, those who are not covered by essential care, those who have spouses, and basic livelihood beneficiaries. However, the use of public health centers was lower among respondents experiencing stress and in regions in which the population per 1,000, number of health care workers, health and welfare budget, fiscal independence, and unemployment rate were above the national average. Accordingly, the central government and local governments need to analyze not only individual characteristics such as health behavior and psychological factors, but also regional characteristics, when establishing local health care policy.

Prediction Model of Real Estate Transaction Price with the LSTM Model based on AI and Bigdata

  • Lee, Jeong-hyun;Kim, Hoo-bin;Shim, Gyo-eon
    • International Journal of Advanced Culture Technology / v.10 no.1 / pp.274-283 / 2022
  • Korea is facing a number of difficulties arising from rising housing prices. As housing takes the lion's share of personal assets, many difficulties are expected to arise from fluctuating housing prices. The purpose of this study is to create a housing price prediction model that helps prevent such risks and encourages reasonable real estate purchases. The study made several attempts to understand real estate instability and to build an appropriate housing price prediction model. It predicted and validated housing prices using LSTM, a type of deep learning technique. An LSTM is a network that adds a cell state, which plays the role of a conveyor belt, to the hidden state of the existing RNN, and computes both the cell state and the hidden state recursively. The real sale prices of apartments in autonomous districts from January 2006 to December 2019 were collected through the Ministry of Land, Infrastructure, and Transport's real sale price open system, and basic apartment and commercial district information was collected through the Public Data Portal and Seoul Metropolitan City data. The collected real sale price data were scaled based on the monthly average sale price, and a total of 168 data points were organized by preprocessing the data by address. To predict prices, the LSTM was implemented with a training period of 29 months (April 2015 to August 2017), a validation period of 13 months (September 2017 to September 2018), and a test period of 13 months (December 2018 to December 2019), following the time series data set. The study obtained a prediction similarity of 76 percent. The aim was to design a prediction model of real estate transaction prices with the LSTM model based on AI and big data. The final prediction model was created from the collected time series data, showing that a model with 76 percent similarity can be built. This suggests that predicting the rate of return with the LSTM method can be reliable.
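The recursive update of the cell state and hidden state described in this abstract can be illustrated with a minimal sketch. This is the standard textbook LSTM step, reduced to scalars for readability; it is not the authors' implementation, and the names `lstm_step`, `run_lstm`, and the weight dictionary layout are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell.

    `w` maps a gate name to a hypothetical (w_x, w_h, b) triple.
    Returns the new hidden state and the new cell state.
    """
    def gate(name, activation):
        w_x, w_h, b = w[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much of the candidate to write
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # candidate value for the cell state

    c = f * c_prev + i * g            # the "conveyor belt": mostly additive update
    h = o * math.tanh(c)              # hidden state derived from the cell state
    return h, c


def run_lstm(xs, w):
    """Apply the cell recursively over a sequence, starting from zero states."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h, c
```

In practice a library implementation with vector states and learned weights would be used; the scalar form only makes the recursion on `h` and `c` visible.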

A Big Data Analysis to Prevent Elderly Solitary Deaths by High-risk Area Clusterization (노인 고독사 방지를 위한 빅데이터 기반 고독사 고위험 지역 탐지 연구)

  • Soyon Kim;Soo Hyung Kim;Bong Gyou Lee
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.5 / pp.177-182 / 2024
  • This study proposes a big data-based analytical method to detect high-risk areas for solitary deaths among the elderly in Seoul. The study categorizes and analyzes the risk factors of solitary deaths into demographic, health, economic, and socio-environmental factors. Using data collected from the Seoul Open Data Plaza and Public Data Portal, variables were generated and scatter plots were created using K-means clustering, followed by visual implementation through map creation. The analysis identified Jungnang-gu, Gangbuk-gu, Nowon-gu, Eunpyeong-gu, Gangseo-gu, and Gwanak-gu as the highest-risk areas. This study addresses the limitations of previous survey-based research through big data analysis. The findings are expected to enhance the efficiency of solitary death prevention programs and serve as a basis for informed decision-making in budget allocation across districts.
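As a rough illustration of the K-means clustering step mentioned in this abstract, the sketch below implements Lloyd's algorithm in plain Python. It is not the authors' code: the function names, the deterministic initialization from the first `k` points, and the toy points in the usage note are assumptions, and the study's actual variables are not reproduced here.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with a deterministic start (first k points).

    Returns (labels, centroids). A real analysis would normally use
    random or k-means++ initialization and standardized features.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids
```

For example, `kmeans([(0, 0), (10, 10), (0, 1), (10, 11)], k=2)` separates the two points near the origin from the two points near (10, 10).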

A Study on the New Trends of EDI based Internet (인터넷을 기반으로 하는 EDI 신조류)

  • 조원길
    • The Journal of Information Technology / v.4 no.1 / pp.125-139 / 2001
  • EDI (Electronic Data Interchange) provides a collection of standard message formats and an element dictionary that give businesses a simple way to exchange data via any electronic messaging service. Open-edi is electronic data interchange among autonomous parties using public standards and aiming towards interoperability over time, business sectors, information technology, and data types. The number of Internet services using XML/EDI has grown rapidly because XML/EDI is easily extensible and exchangeable. To use such a service, the client does not have to install EDI software and only needs an Internet browser. Consequently, it has become much easier and faster to handle the trading process in an office. eXedi (eBusiness XML (Extensible Markup Language) electronic data interchange) is a service that realizes B2B XML/EDI. eXedi can be used easily by small and medium-sized companies, and companies in any location can access it over an existing Internet connection. XML/EDI provides a standard framework for exchanging different types of data (for example, an invoice, a healthcare claim, or a project status), so that the information, whether it is in a transaction, exchanged via an Application Program Interface (API), web automation, a database portal, a catalog, a workflow document, or a message, can be searched, decoded, manipulated, and displayed consistently and correctly by first implementing EDI dictionaries and then extending the vocabulary via online repositories to include a business's own language, rules, and objects.

A Study for Extension of BIM/GIS Interoperability Platform linked External Open Data (외부개방데이터 연계를 통한 BIM/GIS 상호운용 플랫폼확장에 관한 연구)

  • Park, Seung-Hwa;Hong, Chang-Hee
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.3 / pp.78-84 / 2017
  • As the 'Internet of Things' and sensor network technology have become sources of next-generation industrial competitiveness with the development of information and communication technology, local autonomous entities are trying to adopt the Smart City concept quickly. This requires an integrated platform inside a smart city operation center. Established Smart City platforms provide various services using CCTV information and ITS transportation information based on a two-dimensional map. Providing advanced Smart City services will require three-dimensional map information, building and facility unit information, and information linked with the public data portal for service to the public. In this paper, the authors review development trends in Smart City integrated platforms and propose mashup methods between a BIM/GIS interoperability platform and external open data. The BIM/GIS platform can provide seamless indoor and outdoor spatial information services because it was developed by combining GIS spatial data with BIM data. The linked external open data are V-World data, Seoul Open Data, and Architectural Data Open. Finally, the authors propose a direction of development for the BIM/GIS integrated platform to provide advanced Smart City services.

A Process Perspective Event-log Analysis Method for Airport BHS (Baggage Handling System) (공항 수하물 처리 시스템 이벤트 로그의 프로세스 관점 분석 방안 연구)

  • Park, Shin-nyum;Song, Minseok
    • The Journal of Bigdata / v.5 no.1 / pp.181-188 / 2020
  • As airport terminals grow in line with the rapid growth in aviation passengers, an advanced baggage handling system that combines various data technologies has become essential for handling passengers' baggage swiftly and accurately. Therefore, to advance the operation of airport BHS, this study introduces a method, and its main points, for analyzing the baggage handling capacity of domestic airports from event log data using recent process-perspective data analysis methodology. By presenting an accurate load prediction method, it can lead to advanced BHS operation strategies in the future, such as the preemptive arrangement of resources and the optimization of flight-carousel scheduling. The data used in the analysis came from the APIs that can be found by searching for "Korea Airports Corporation" in the public data portal. Applying the method to a domestic airport BHS simulation model confirmed a high level of predictive performance.
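One common starting point for process-perspective analysis of an event log is counting directly-follows relations between activities. The sketch below shows that generic technique only; the abstract does not state which process mining method the authors used, and the tuple layout `(case_id, timestamp, activity)` and the activity names in the usage note are hypothetical.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often one activity directly follows another within a case.

    `events` is an iterable of (case_id, timestamp, activity) tuples,
    where a case could be, for example, a single bag.
    """
    # Group activities into one time-ordered trace per case.
    traces = defaultdict(list)
    for case_id, _timestamp, activity in sorted(events, key=lambda e: (e[0], e[1])):
        traces[case_id].append(activity)

    # Count each adjacent pair of activities across all traces.
    counts = Counter()
    for activities in traces.values():
        counts.update(zip(activities, activities[1:]))
    return counts
```

For instance, a log in which one bag passes through "check-in", "screening", and "sorting" while another goes from "check-in" straight to "sorting" yields one count for each of the three observed pairs.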