• Title/Summary/Keyword: Public Big data

Valid Data Conditions and Discrimination for Machine Learning: Case study on Dataset in the Public Data Portal (기계학습에 유효한 데이터 요건 및 선별: 공공데이터포털 제공 데이터 사례를 통해)

  • Oh, Hyo-Jung;Yun, Bo-Hyun
    • Journal of Internet of Things and Convergence / v.8 no.1 / pp.37-43 / 2022
  • The fundamental basis of AI technology is learnable data. Recently, the types and amounts of data collected and produced by the government and private companies have been increasing exponentially; however, this growth has not yet yielded verified data that can be used for actual machine learning. This study discusses the conditions that data must meet to be usable for machine learning, and identifies factors that degrade data quality through case studies. To this end, two representative cases of developing a prediction model using public big data were selected, and data for actual problem solving were collected from the public data portal. The cases show that the results differ once valid-data screening criteria and post-processing are applied. The ultimate purpose of this study is to argue for the importance of data quality management and of accumulating valid data, which must precede the development of machine learning technology, the core of artificial intelligence.

A Study on Utilization Strategy of Big Data for Local Administration by Analyzing Cases (사례분석을 통한 지방행정의 빅데이터 활용 전략)

  • Noh, Kyoo-Sung
    • Journal of Digital Convergence / v.12 no.1 / pp.89-97 / 2014
  • As the value of Big Data has been recognized and Government 3.0 has been announced, interest in Big Data is growing. However, without a specific plan or strategy, it will not be easy for public institutions or local governments to apply Big Data systematically and achieve success. This study therefore suggests strategies for using Big Data after organizing the areas in which local governments utilize it. As a result, the utilization areas of Big Data in local administration are divided into four: recognizing and responding to abnormal phenomena, predicting and responding to the near future, responding to analyzed situations and developing new policies (administrative services), and citizen-customized services. In addition, the following strategies for using Big Data are suggested: a stepwise approach, user requirements analysis, implementation based on critical success factors, pilot projects, result evaluation, performance-based incentives, and building common infrastructure.

A Study on Recognition of Artificial Intelligence Utilizing Big Data Analysis (빅데이터 분석을 활용한 인공지능 인식에 관한 연구)

  • Nam, Soo-Tai;Kim, Do-Goan;Jin, Chan-Yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.129-130 / 2018
  • Big data analysis is a technique for effectively analyzing unstructured data such as Internet content, social network services, web documents generated in the mobile environment, e-mail, and social data, as well as well-formed structured data in a database. Most big data analysis techniques are data mining, machine learning, natural language processing, and pattern recognition, which were already used in statistics and computer science. Global research institutes have identified big data analysis as the most noteworthy new technology since 2011. Therefore, companies in most industries are making efforts to create new value through the application of big data. In this study, we used Social Matrics, a big data analysis tool of Daum Communications. We analyzed public perceptions of the keyword "Artificial Intelligence" over the one month up to May 19, 2018. The results of the big data analysis are as follows. First, the top related search keyword for "Artificial Intelligence" was found to be technology (4,122). This study suggests theoretical implications based on the results.

Rethinking the US Presidential Election: Feminism and Big Data

  • CHUNG, Sae Won;PARK, Han Woo
    • International Journal of Contents / v.17 no.4 / pp.52-61 / 2021
  • The 2020 US Presidential Election was a highly anticipated moment for our global society. During the election period, the most intriguing issue was who would be the winner: Trump or Biden? Among the possible main themes of the 2020 election, from the COVID-19 pandemic to racism, this study focused on feminism ('women') as a main component of Biden's victory. To explore the character of Biden's supporters, this paper focused on internet spaces as a source of public opinion. To guide the data analysis, this study employed four indices from empirical studies on Big Data analytics: issue salience, attention diversity, emotional mentioning, and semantic cohesion. The main finding of this study was that the representative keyword 'women' appeared more prevalently within content related to Biden than within content related to Trump, and the keyword pairs indicated that female voters were the main reason for Trump's failure and the root cause of Biden's victory. The results of this study indicated the role of the internet as a forum for public opinion and a fountain of political knowledge, which requires more rigorous investigation by researchers.

BIG DATA ANALYSIS ROLE IN ADVANCING THE VARIOUS ACTIVITIES OF DIGITAL LIBRARIES: TAIBAH UNIVERSITY CASE STUDY- SAUDI ARABIA

  • Alotaibi, Saqar Moisan F
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.297-307 / 2021
  • In a vibrant environment, documentation and management systems are maintained autonomously across educational foundations, book materials, and libraries, while information is not readily accessible in a centralized location. At present, libraries provide online resources and services for educational activities. Moreover, libraries use social media outlets such as Facebook and Instagram to showcase their services and procedures. With the help of promising tools and technologies such as analytics software, librarians can gather more online information and analyze it to add value to their services. Libraries can thus employ big data to make better decisions about collection development, updating public spaces, and tracking the use of library book materials. Big data is being produced by library digitization, and this has constrained the efforts of academics, researchers, and policy makers to enhance quality and effectiveness. Accordingly, providing library clients with research articles and book materials that match their interests is a major challenge, as the case of Taibah University in Saudi Arabia shows. The difficulty in this domain is bringing data from numerous institutions and sources into a single place in real time, which can be time consuming. The most important aim is to reduce the time that elapses between searching for specific study material and actually reading it.

Economic Feasibility Analysis of 'Hye-Ahn', a Government-Wide Big Data Platform (범정부 빅데이터 플랫폼인 '혜안'의 경제적 타당성 분석)

  • Myong-Hee Kim;Heung-Kyu Kim
    • Journal of Korean Society of Industrial and Systems Engineering / v.47 no.2 / pp.57-64 / 2024
  • The use of big data needs to be emphasized in policy formulation by public officials in order to improve the transparency, efficiency, and reliability of government policies. 'Hye-Ahn', a government-wide big data platform, was built with this goal, and the number of subscribers to 'Hye-Ahn' grew significantly from 2,000 at the end of 2016 to 100,000 in August 2018. Additionally, the central and local governments are expanding their big data related budgets. In this study, we derived the costs and benefits of 'Hye-Ahn' and used them to conduct an economic feasibility analysis. As a result, even if only some quantitative benefits are considered and qualitative benefits are left out, the net present value, the benefit/cost ratio, and the internal rate of return turned out to be 22,662 million won, 2.3213, and 41.8%, respectively. Since these exceed the respective comparison criteria of 0 won, 1.0, and 5.0%, 'Hye-Ahn' can be seen to be economically feasible. As noted earlier, the number of analyses using 'Hye-Ahn' is increasing, so the benefits are expected to increase as time passes. Finally, the socioeconomic value gained when the results of analyses using 'Hye-Ahn' are used in policy is expected to be significant.
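
The three indicators named in this abstract (net present value, benefit/cost ratio, internal rate of return) can be illustrated with a minimal Python sketch. The cash flows below are made-up illustrative numbers, not the paper's data, and the function names, the 5% discount rate, and the bisection-based IRR solver are assumptions chosen for this example; the abstract does not state how the authors computed these values.

```python
def npv(rate, cash_flows):
    """Present value of cash_flows[t] occurring at the end of year t (t = 0 is today)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def benefit_cost_ratio(rate, benefits, costs):
    """Ratio of discounted benefits to discounted costs."""
    return npv(rate, benefits) / npv(rate, costs)


def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-9):
    """Discount rate at which NPV is zero, found by bisection.

    Assumes the NPV changes sign exactly once between lo and hi,
    which holds for an up-front cost followed by positive net benefits.
    """
    f_lo = npv(lo, cash_flows)
    if f_lo * npv(hi, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return (lo + hi) / 2


# Hypothetical yearly figures in million won (NOT taken from the paper).
costs = [1000, 200, 200, 200, 200]
benefits = [0, 700, 700, 700, 700]
net = [b - c for b, c in zip(benefits, costs)]
discount_rate = 0.05  # assumed; the abstract only gives 5.0% as the IRR criterion

project_npv = npv(discount_rate, net)
project_bcr = benefit_cost_ratio(discount_rate, benefits, costs)
project_irr = irr(net)

# Same decision rule as the abstract: NPV > 0, B/C > 1.0, IRR > 5.0%.
feasible = project_npv > 0 and project_bcr > 1.0 and project_irr > 0.05
print(f"NPV={project_npv:.1f}, B/C={project_bcr:.3f}, IRR={project_irr:.1%}, feasible={feasible}")
```

For conventional cash flows like these, the three criteria normally agree, so reporting all three mainly shows the size of the margin rather than changing the verdict.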

Finding a plan to improve recognition rate using classification analysis

  • Kim, SeungJae;Kim, SungHwan
    • International journal of advanced smart convergence / v.9 no.4 / pp.184-191 / 2020
  • With the emergence of the 4th Industrial Revolution, the core technologies that will lead it, such as artificial intelligence (AI), big data, and the Internet of Things (IoT), are also at the center of public attention. In particular, there are growing attempts to present future visions by using data collected in a specific field for big data analysis, discovering new models, and inferring and predicting new values with those models. To obtain reliable and sophisticated statistics from big data analysis, it is necessary to analyze the meaning of each variable, the correlation between variables, and multicollinearity. If the data are classified in a way that does not fit the hypothesis test from the beginning, unreliable results will be obtained even if the analysis is performed well. In other words, prior to big data analysis, it is necessary to ensure that the data are well classified according to the purpose of the analysis. Therefore, in this study, data are classified using a decision tree technique and a random forest technique, both classification analysis methods from machine learning, which implements AI technology. By evaluating the degree of classification of the data, we try to find a way to improve the classification and analysis rate of the data.

Research on public sentiment of the post-corona new normal: Through social media (SNS) big data analysis (포스트 코로나 뉴노멀에 대한 대중감성 연구: 소셜미디어(SNS) 빅데이터 분석을 통해)

  • Ann, Myung-suk
    • The Journal of the Convergence on Culture Technology / v.8 no.2 / pp.209-215 / 2022
  • In this study, detailed factors of public sentiment toward the 'post-corona new normal' were examined through sentiment analysis of social media big data. The aim is to provide basic data for preemptively coping with the post-COVID-19 era. For data collection and analysis, the sentiment analysis feature of 'Textom', a big data analysis program, was used. The data collection period was one year, from October 5, 2020 to October 5, 2021, and the collection channels were blogs, cafes, Twitter, and Facebook on Daum and Naver. A total of 3,770 texts collected from these channels, edited and refined from the original data, were used for this study. The conclusions are as follows. First, there is a high level of interest in and liking for the 'post-corona new normal'. In other words, optimism such as daily recovery, technological growth, and expectations for a new future took the lead at 77.62%. Second, negative emotions such as sadness and rejection account for 22.38% of the total, but their emotional intensity is 23.91%, higher than their share, suggesting that these negative emotions are intense. This study contributes a detailed factor analysis of the public's positive and negative emotions toward the 'post-corona new normal' through big data analysis.

Implementation of public data contents using Big data Visualization technology - Map visualization technique (빅 데이터 가시화 기술을 적용한 공공데이터 콘텐츠 구현 - Map가시화 기법)

  • Bak, Seon-Hui;Kim, Jong Ho;You, Hyun-Bea
    • Journal of Digital Contents Society / v.18 no.7 / pp.1427-1434 / 2017
  • With the acceleration of the 4th industrial revolution, the data around us have rapidly increased. It is therefore more important to grasp easily the nature and meaning of data obtained through data analysis, and to apply it flexibly to judging the value of data, than merely to collect data. Visualization technology is now attracting attention in many fields. Visualization lets the user grasp the information in data more easily through graphs, charts, and the like, so that analysis results are easier to understand and the user can make immediate judgments and quick decisions. In particular, there is a high degree of interest in visualization using public data, which is highly useful to users. In this paper, among the various software capable of visualization, we used R libraries and R Studio to visualize public data on the installation sites of bicycle storage facilities.

Issue tracking and voting rate prediction for 19th Korean president election candidates (댓글 분석을 통한 19대 한국 대선 후보 이슈 파악 및 득표율 예측)

  • Seo, Dae-Ho;Kim, Ji-Ho;Kim, Chang-Ki
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.199-219 / 2018
  • With the everyday use of the Internet and the spread of smart devices, users can communicate in real time, and existing communication styles have changed. As the Internet changed who produces information, data grew massive, giving rise to what is called big data. Big data is seen as a new opportunity to understand social issues. In particular, text mining explores patterns in unstructured text data to find meaningful information. Because text data exists in many places, such as newspapers, books, and the web, it is diverse and large in volume, and is therefore suitable for understanding social reality. In recent years, there have been a growing number of attempts to analyze text from the web, such as SNS and blogs, where the public can communicate freely. This is recognized as a useful way to grasp public opinion immediately, so it can be used for research on political, social, and cultural issues. Text mining has received much attention as a way to investigate the public's view of candidates and to predict the voting rate in place of polling. This is because many people question the credibility of surveys. Also, people tend to refuse to reveal their real intentions when they are asked to respond to a poll. This study collected comments from the largest Internet portal site in Korea and conducted research on the 19th Korean presidential election in 2017. We collected 226,447 comments from April 29, 2017 to May 7, 2017, a span that includes the period just before election day during which public opinion polls are prohibited. We analyzed frequencies, associative emotional words, topic emotions, and candidate voting rates. Through frequency analysis, we identified the words representing the most important issues each day. In particular, following the presidential debates, the candidate who became an issue appeared at the top of the frequency analysis. Through the analysis of associative emotional words, we were able to identify the issues most relevant to each candidate. The topic emotion analysis was used to identify each candidate's topics and to express the public's emotions about those topics. Finally, we estimated the voting rate by combining the volume of comments and the sentiment score. In this way, we explored the issues for each candidate and predicted the voting rate. The analysis showed that news comments are an effective tool for tracking the issues surrounding presidential candidates and for predicting the voting rate. In particular, this study presented the issues per day and a quantitative index for sentiment. It also predicted the voting rate for each candidate and precisely matched the ranking of the top five candidates. Each candidate will be able to grasp public opinion objectively and reflect it in the election strategy. Candidates can use positive issues more actively in their election strategies and try to correct negative issues. In particular, candidates should be aware that their reputation can be severely damaged if they face a moral problem. Voters can look objectively at the issues and public opinion about each candidate and make more informed decisions when voting. If they refer to the results of this study before voting, they will be able to see the opinions of the public from Big Data and vote for a candidate from a more objective perspective. If candidates campaign with reference to Big Data analysis, the public will be more active on the web, recognizing that their wants are being reflected. Political views can be expressed in various places on the web, and this can contribute to political participation by the people.
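
The step of estimating the voting rate "by combining the volume of comments and sentiment score" could be sketched as follows. The abstract does not give the actual formula, so the weighting used here (comment count multiplied by mean sentiment rescaled to the range 0 to 1, then normalized across candidates) is purely an assumption for illustration, as are the function name `estimate_vote_share`, the sentiment scale of -1 to 1, and the sample data.

```python
from collections import defaultdict


def estimate_vote_share(comments):
    """Estimate each candidate's share from (candidate, sentiment) pairs.

    sentiment is assumed to lie in [-1, 1]. The combination rule
    (volume x rescaled mean sentiment) is a hypothetical choice,
    not the formula used in the paper.
    """
    volume = defaultdict(int)
    sentiment_total = defaultdict(float)
    for candidate, sentiment in comments:
        volume[candidate] += 1
        sentiment_total[candidate] += sentiment

    raw_score = {}
    for candidate, count in volume.items():
        mean_sentiment = sentiment_total[candidate] / count
        positivity = (mean_sentiment + 1) / 2      # rescale [-1, 1] to [0, 1]
        raw_score[candidate] = count * positivity  # volume weighted by positivity

    total = sum(raw_score.values())
    if total == 0:
        return {candidate: 0.0 for candidate in raw_score}
    return {candidate: score / total for candidate, score in raw_score.items()}


# Made-up comments for two hypothetical candidates "A" and "B".
sample_comments = [("A", 0.8), ("A", 0.2), ("A", -0.4), ("B", 0.6), ("B", -0.2)]
shares = estimate_vote_share(sample_comments)
ranking = sorted(shares, key=shares.get, reverse=True)
print(shares, ranking)
```

Under this rule a candidate with many comments but mostly negative sentiment can rank below one with fewer but more positive comments, which is one plausible reason to combine the two signals rather than rely on comment volume alone.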