• Title/Summary/Keyword: correlated keyword


Association Modeling on Keyword and Abstract Data in Korean Port Research

  • Yoon, Hee-Young;Kwak, Il-Youp
    • Journal of Korea Trade / v.24 no.5 / pp.71-86 / 2020
  • Purpose - This study investigates research trends by searching for English keywords and abstracts in 1,511 Korean journal articles in the Korea Citation Index from the 2002-2019 period using the term "Port." The study aims to lay the foundation for a more balanced development of port research. Design/methodology - Using abstract and keyword data, we perform frequency analysis and word embedding (Word2vec). A t-SNE plot shows the main keywords extracted using the TextRank algorithm. To analyze which words were used in what context in our two nine-year subperiods (2002-2010 and 2010-2019), we use Scattertext and scaled F-scores. Findings - First, during the 18-year study period, port research has developed through the convergence of diverse academic fields, covering 102 subject areas and 219 journals. Second, our frequency analysis of 4,431 keywords in 1,511 papers shows that the words "Port" (60 times), "Port Competitiveness" (33 times), and "Port Authority" (29 times), among others, are attractive to most researchers. Third, a word embedding analysis identifies the words highly correlated with the top eight keywords and visually shows four different subject clusters in a t-SNE plot. Fourth, we use Scattertext to compare words used in the two research sub-periods. Originality/value - This study is the first to apply abstract and keyword analysis and various text mining techniques to Korean journal articles in port research and thus has important implications. Further in-depth studies should collect a greater variety of textual data and analyze and compare port studies from different countries.
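The keyword frequency analysis mentioned in this abstract can be illustrated with a short sketch. This is not the authors' code: the function name `keyword_frequencies`, the sample records, and the assumption that keywords arrive as semicolon-separated strings are all hypothetical.

```python
from collections import Counter


def keyword_frequencies(records):
    """Count how often each keyword occurs across all article records.

    Each record is assumed (hypothetically) to be a semicolon-separated
    keyword string; keywords are compared case-insensitively.
    """
    counts = Counter()
    for record in records:
        for keyword in record.split(";"):
            keyword = keyword.strip().lower()
            if keyword:  # skip empty fragments such as a trailing ";"
                counts[keyword] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    "Port; Port Competitiveness; Logistics",
    "Port Authority; Port",
    "port competitiveness; Container Terminal",
]
freq = keyword_frequencies(sample)
top_keywords = freq.most_common(2)  # the two most frequent keywords
```

Lowercasing before counting is a design choice made here so that "Port Competitiveness" and "port competitiveness" are merged; the abstract does not say how the study normalized keywords.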

Gyeonggi21Search 2.0: A Geographic and Regional Information Retrieval System based on Correlated Keywords (연관 키워드 기반의 지리 및 지역정보 검색시스템 : "경기21서치 2.0")

  • Yun, Seong-Kwan;Lee, Ryong;Jang, Yong-Hee;Seong, Dong-Hyeon;Kwon, Yong-Jin
    • Spatial Information Research / v.17 no.1 / pp.1-14 / 2009
  • Demand for a system that enables users to retrieve any kind of geographic and regional information over the Web has been increasing. However, to obtain such information, users still need to search for region-related web pages by entering keywords and then arrange the results on a map themselves. This problem can be addressed by exploiting the fact that most geographic and regional information contains geographic keywords related to location. In this paper, we propose a system that retrieves geographic and regional information efficiently. For this purpose, we present a conceptual model based on three layers, "Real-World", "Knowledge", and "Applications", derived from the web space, and construct the link process among them. These layers are connected to each other and enable users to navigate information over the linkage. In particular, users can obtain various correlated information about geographic information and properties.


A Big Data Analysis of the News Trends on Wireless Emergency Alert Service (뉴스 빅데이터를 활용한 재난문자 뉴스 게재 경향 분석)

  • Lee, Hyunji;Byun, Yoonkwan;Chang, Sekchin;Choi, Seong Jong;Oh, Seunghee;Lee, Yongtae
    • Journal of Broadcast Engineering / v.24 no.5 / pp.726-734 / 2019
  • This study investigates the number of news articles and the correlated keywords concerning the Korean Wireless Emergency Alert (KWEA). The news was collected using BIGKinds, a news big data system provided by the Korea Press Foundation. Analyzing the news articles published each year, we investigated the frequency of news grouped by disaster type, the frequency of news on earthquake versus non-earthquake disasters, and finally the frequency of correlated keywords concerning the disasters. We found that KWEA news totaled 182 articles in 2016 due to the unprecedentedly powerful KyongJu earthquake, an increase of 20 times over the previous year. Since 2016, the volume of news about the KWEA has remained consistently high. After the peak caused by the KyongJu earthquake in 2016, the proportion of non-earthquake news also increased in 2017 and 2018. Finally, the keyword correlation analysis showed that KWEA news gave major coverage to the following entities: the Ministry of the Interior and Safety, which operates the KWEA; the Korea Meteorological Administration; and the general public.

Analyzing Global Startup Trends Using Google Trends Keyword Big Data Analysis: 2017~2022 (Google Trends의 키워드 빅데이터 분석을 활용한 글로벌 스타트업 트렌드 분석: 2017~2022)

  • Jaeeog Kim;Byunghoon Jeon
    • Journal of Platform Technology / v.11 no.4 / pp.19-34 / 2023
  • To identify trends and insights regarding 'startups' in the global era, we conducted an in-depth trend analysis of the global startup ecosystem using Google Trends, a big data analysis platform. To validate the analysis, we verified the correlation between the keywords 'startup' and 'global' through BIGKinds. We also conducted a network analysis based on the data extracted from Google Trends to determine the frequency of searches for the keyword 'startup'. The results showed a strong positive linear relationship between the keywords, indicating a statistically significant correlation (correlation coefficient: +0.8906). When exploring global startup trends using Google Trends, we found a strikingly similar linear pattern of increasing and decreasing interest in each country over time, as shown in Figure 4. In particular, startup interest was low, in the range of 35 to 76, from mid-2020 due to the COVID-19 pandemic, but there was a noticeable upward trend after March 2022. In addition, we found that interest in startups is very similar across countries other than South Korea; that the related topics are startup company, technology, investment, and funding; and that keyword search terms such as best startup, tech, business, invest, health, and fintech are highly correlated.
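The correlation coefficient reported in this abstract is the kind of statistic that a Pearson correlation yields. The sketch below shows that computation under stated assumptions: the function name `pearson_r` and both interest series are made up for illustration and are not the study's data, and the abstract does not confirm that Pearson's r is the exact statistic the authors used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly interest scores for two keywords (not real data).
startup_interest = [35, 42, 50, 61, 76]
global_interest = [30, 41, 47, 63, 70]
r = pearson_r(startup_interest, global_interest)
```

A value of `r` close to +1 indicates that the two series rise and fall together; it says nothing about which one drives the other.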


Finding Correlated Keyword by Analyzing User's Implicit Feedback (사용자 선호도 분석을 통한 검색어 조합 추출)

  • Chul-Woo Shim;Eun Ju Lee;Ung-Mo Kim
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.229-232 / 2008
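  • As the volume of information on the web grows rapidly, search technology that efficiently finds the desired information is becoming increasingly important. Improving search accuracy requires various information beyond the search query, such as the user's environment and search satisfaction. Because asking users for explicit feedback can be off-putting, research has focused on expanding search queries by analyzing users' implicit feedback and related search terms. However, because the correlation between such query expansion and search accuracy had not been analyzed, related search terms could not be evaluated quantitatively. In this paper, we regard the process in which a user repeats a search while changing the query as a form of implicit feedback, and propose a method that analyzes it together with page visit time, which reflects user satisfaction, to determine how consecutively entered queries affect search result ranking and user satisfaction. We quantified users' search satisfaction by analyzing mouse click information, and represented the process by which related queries expand from a particular topic word as a tree structure. This makes it possible to extract correlated keywords from the set of queries entered consecutively for a single topic word and to improve the accuracy of search results. Analyzing the proposed tree structure from various directions also yields diverse information, such as queries, search results, user satisfaction, and background knowledge, that does not appear in simple query analysis.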
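The tree of successive query reformulations described in this abstract might be represented as in the sketch below. Everything here is an illustrative assumption rather than the paper's design: the names `QueryNode`, `add_session`, and `rank_related`, the use of summed page visit time as the satisfaction score, and the sample sessions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QueryNode:
    """One query in the reformulation tree rooted at a topic word."""
    query: str
    dwell_seconds: float = 0.0  # accumulated page visit time for this query
    children: dict = field(default_factory=dict)


def add_session(root, queries_with_dwell):
    """Insert one session, a list of (query, dwell_seconds) pairs in the
    order the user entered them, as a path below the topic-word root."""
    node = root
    for query, dwell in queries_with_dwell:
        child = node.children.get(query)
        if child is None:
            child = QueryNode(query)
            node.children[query] = child
        child.dwell_seconds += dwell
        node = child


def rank_related(root):
    """Return (query, total_dwell) pairs for all queries below the root,
    sorted by accumulated dwell time, highest first."""
    totals = {}
    stack = list(root.children.values())
    while stack:
        node = stack.pop()
        totals[node.query] = totals.get(node.query, 0.0) + node.dwell_seconds
        stack.extend(node.children.values())
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Two made-up sessions that both start from the topic word "jeju".
root = QueryNode("jeju")
add_session(root, [("jeju hotel", 12.0), ("jeju hotel discount", 95.0)])
add_session(root, [("jeju hotel", 8.0), ("jeju flight", 40.0)])
ranked = rank_related(root)
```

In this toy run the refinement with the longest visits ranks first, which mirrors the idea of treating visit time as implicit feedback; the paper's actual scoring also draws on mouse click information and is not reproduced here.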

A Study on Collective Lifestyle by Analyzing Case of Sharing Economy Service -Focusing on Co-working & Co-living Space- (공유경제 서비스 사례분석을 통한 협력적 라이프스타일 연구 -코워킹과 코리빙 스페이스를 중심으로-)

  • An, Hyo-Jin;Kim, Seung-In
    • Journal of Digital Convergence / v.15 no.10 / pp.405-410 / 2017
  • The purpose of this study is to suggest guidelines by analyzing and categorizing cases of sharing economy services, especially co-working and co-living spaces in Europe. I conducted a literature review, followed by an evaluation of the theoretical background and present conditions. In addition, case studies were conducted based on keywords of the Fourth Industrial Revolution, and the spaces were categorized by their characteristics. The results indicate that the features of co-working and co-living spaces are correlated with the characteristics of their location and the emergence of the digital nomad lifestyle. Each space has its own platform community to meet the need for closer cooperation among users. I expect this study to become a good resource for upgrading domestic co-working and co-living spaces, and I hope it can guide follow-up studies.

The Research Trends and Keywords Modeling of Shoulder Rehabilitation using the Text-mining Technique (텍스트 마이닝 기법을 활용한 어깨 재활 연구분야 동향과 키워드 모델링)

  • Kim, Jun-hee;Jung, Sung-hoon;Hwang, Ui-jae
    • Journal of the Korean Society of Physical Medicine / v.16 no.2 / pp.91-100 / 2021
  • PURPOSE: This study analyzed the trends and characteristics of shoulder rehabilitation research through keyword analysis and modeled the relationships among keywords using text mining techniques. METHODS: Abstract data from 10,121 articles with abstracts registered in MEDLINE (PubMed) under the keywords 'shoulder' and 'rehabilitation' were collected using Python. By analyzing word frequency, the 10 most frequent keywords were selected. Word embedding was performed using the word2vec technique to analyze the similarity of words. In addition, the groups were classified and analyzed based on distance (cosine similarity) through the t-SNE technique. RESULTS: The number of studies related to shoulder rehabilitation is increasing year after year, and the keywords most frequently used in these studies are 'patient', 'pain', and 'treatment'. The word2vec results showed that the words were highly correlated with 12 keywords from studies related to shoulder rehabilitation. Furthermore, through t-SNE, the keywords of the studies were divided into 5 groups. CONCLUSION: This study is the first to model the keywords, and the relationships among them, that make up the abstracts of research in MEDLINE (PubMed) related to 'shoulder' and 'rehabilitation' using text mining techniques. The results will help diversify the research topics of future shoulder rehabilitation studies.
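Word similarity in an embedding space such as word2vec is commonly measured with cosine similarity, which this abstract also names. The sketch below shows only that measure on hand-written toy vectors; it does not train word2vec, and the vectors and the helper names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, nonzero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, top_n=3):
    """Rank every other word by cosine similarity to `word`."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy 3-dimensional "embeddings", invented for illustration only.
toy_vectors = {
    "pain": [0.9, 0.1, 0.0],
    "treatment": [0.8, 0.3, 0.1],
    "exercise": [0.1, 0.9, 0.2],
}
neighbors = most_similar("pain", toy_vectors, top_n=2)
```

Real embeddings have far more dimensions, but the ranking step is the same: the words whose vectors point in the most similar direction are reported as the most closely related.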

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.143-163 / 2016
  • The demographics of Internet users are the most basic and important source for target marketing and personalized advertising on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data shows which pages users visited, how long they stayed, how often and when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics. The variables include search keywords, frequency and intensity by time, day, and month, the variety of websites visited, text information from the web pages visited, etc. The demographic attributes to be predicted also vary across papers and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, have been used to build prediction models. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model.
The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics, based on the results of previous research, and then to identify which data mining method is best suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. Based on the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of building the predictive models is composed of four steps. In the first step, we create user profiles that include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction on the clickstream variables to address the curse of dimensionality and the overfitting problem. We utilize three approaches, based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data representing 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for the prediction process, and 5-fold cross validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best when using decision-tree-based dimension reduction with a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction.
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used effectively to predict their demographics and can thereby be utilized in digital marketing.
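The 5-fold cross validation mentioned in this abstract can be sketched as an index-splitting routine. This is a generic illustration, not the procedure implemented in IBM SPSS Modeler: the function name `k_fold_indices`, the fixed shuffle seed, and the sample size of 10 are assumptions chosen for readability.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.

    Indices are shuffled once with a fixed seed and dealt into k folds;
    each fold serves as the test set exactly once.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # round-robin assignment
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Ten hypothetical user profiles split into five folds.
splits = list(k_fold_indices(n_samples=10, k=5))
```

A model would be fitted on each `train_idx` subset and scored on the matching `test_idx` subset, with the five scores averaged to compare alternative models.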

The Relationship between Internet Search Volumes and Stock Price Changes: An Empirical Study on KOSDAQ Market (개별 기업에 대한 인터넷 검색량과 주가변동성의 관계: 국내 코스닥시장에서의 산업별 실증분석)

  • Jeon, Saemi;Chung, Yeojin;Lee, Dongyoup
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.81-96 / 2016
  • As the internet has become widespread and easy to access everywhere, it is common for people to search for information via online search engines such as Google and Naver in everyday life. Recent studies have used the online search volume of a specific keyword as a measure of internet users' attention in order to predict disease outbreaks such as flu and cancer, the unemployment rate, an index of a nation's economic condition, etc. For stock traders, web search is also one of the major information resources for obtaining data about individual stock items. Therefore, the search volume of a stock item can reflect the amount of investors' attention to it. Investor attention has been regarded as a crucial factor influencing stock prices, but it has been measured by indirect proxies such as market capitalization, trading volume, advertising expense, etc. It has been shown theoretically and empirically that an increase in investors' attention to a stock item brings a temporary increase in the stock price, and that the price reverts in the long run. Recent development of the internet environment makes it possible to measure investor attention directly through the internet search volume of an individual stock item, which has been used to show attention-induced price pressure. Previous studies focus mainly on the Dow Jones and NASDAQ markets in the United States. In this paper, we investigate the relationship between individual investors' attention, measured by internet search volume, and stock price changes of individual stock items in the KOSDAQ market in Korea, where trades by individual investors account for about 90% of the total. In addition, we examine the differences between industries in the influence of investors' attention on stock returns. The internet search volume of stocks was gathered weekly from the "Naver Trend" service between January 2007 and June 2015.
A regression model whose error term has an AR(1) covariance structure is used to analyze the data, since the weekly prices of a stock item are systematically correlated. Market capitalization, trading volume, the increment in trading volume, and the month in which each trade occurs are included in the model as control variables. The fitted model shows that an abnormal increase in the search volume of a stock item has a positive influence on the stock return and that the size of the influence varies across industries. Stock items in the IT software, construction, and distribution industries are shown to be more influenced by abnormally large internet search volume than the average across industries. On the other hand, stock items in the IT hardware, manufacturing, entertainment, finance, and communication industries are less influenced by abnormal search volume than the average. To verify the price pressure caused by investors' attention in KOSDAQ, the stock return of the current week is modelled using the abnormal search volume observed one to four weeks earlier. On average, an abnormally large increment in search volume increased the stock return in the current week and one week later, and decreased the stock return two and three weeks later. There is no significant relationship with the stock return after four weeks. This relationship differs among industries. An abnormal search volume brings a particularly severe price reversal to stocks in the IT software industry, which are often targets of irrational investment by individual investors. An abnormal search volume caused a less severe price reversal in stocks in the manufacturing and IT hardware industries than the average across industries. Price reversal was not observed in the communication, finance, entertainment, and transportation industries, which are known to be influenced largely by macro-economic factors such as oil prices and currency exchange rates.
The results of this study can be utilized to construct an intelligent trading system based on big data gathered from web search engines, social network services, and internet communities. In particular, the difference in the price reversal effect between industries may provide useful information for building a portfolio and an investment strategy.
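The AR(1) covariance structure mentioned in this abstract implies that the correlation between the errors of two weeks decays geometrically with the number of weeks between them: Corr(e_t, e_s) = rho^|t-s|. The sketch below builds that correlation matrix. The function name `ar1_correlation_matrix` and the value rho = 0.5 are assumptions for illustration; the abstract does not report the estimated rho, and this sketch does not fit the regression itself.

```python
def ar1_correlation_matrix(n_periods, rho):
    """Correlation matrix of a stationary AR(1) error process.

    Entry (t, s) equals rho ** |t - s|, so adjacent weeks are the most
    strongly correlated and the dependence fades with distance.
    """
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [
        [rho ** abs(t - s) for s in range(n_periods)]
        for t in range(n_periods)
    ]


# Four consecutive weeks with an arbitrary, illustrative rho of 0.5.
corr = ar1_correlation_matrix(4, 0.5)
```

Accounting for this structure matters because treating serially correlated weekly observations as independent would typically understate the uncertainty of the estimated coefficients.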