• Title/Summary/Keyword: Social big data analysis

Analysis on Change in Korean Marriage Behaviors (한국인 혼인행태 변화분석)

  • 이삼식
    • Korea journal of population studies / v.16 no.2 / pp.84-110 / 1993
  • This study aims to identify recent changes in marriage behaviors in Korea. The data used are the vital statistics compiled from the vital registration system, whose registration form is combined on a single form with the civil registration form. According to the analysis, the number of marriages has increased steadily since 1970, from about 300,000 in the first half of the 1970s to about 400,000 in the second half of the 1980s, approximately in line with the change in the size of the population of marriageable age. The few exceptions seen in the 1970s seem to result from the social upheavals of the 1950s, since the birth cohorts affected by low fertility during the Korean War and the post-war baby-boom generation characterized by high fertility entered the marriage market in the 1970s. The marriage rate, however, shows only a slight increase, from around 7 in the first half of the 1970s to around 9 in the second half of the 1980s, indicating that the prevalence of marriage changed little during this period. It is also found that the proportion of remarriages among all marriages increased to around 10 per cent in 1989, while that of first marriages decreased. This can be attributed to the higher prevalence of divorce and to the collapse of the Confucian ethic, which has contributed to expediting the remarriage of widows. Although this proportion is small compared with that of more developed countries, it is reasonable to expect that the proportion of remarriages will continue to increase. The age at first marriage (AFM), which directly affects the span of exposure to the risk of pregnancy, has increased to about 28 for males and about 25 for females in recent years.
However, the large difference in AFM between urban and rural areas has narrowed as a result of the increasing involuntary postponement of marriage among young rural people, who have had difficulty finding a bride or bridegroom in rural areas characterized by heavy out-migration of the young, particularly female, population. The present study shows an inverse relationship between AFM and educational attainment; i.e., the higher the educational attainment, the lower the AFM. The conditions taken into consideration in choosing a spouse were class and family in the past, but are now educational attainment, job, and personal characteristics. With regard to age, in recent years men prefer women younger than themselves by 3 years on average, and vice versa, a gap reduced from 4-5 years at the beginning of the 1970s. The age difference between bride and bridegroom tends to decrease as educational attainment increases. This may be because people with higher educational attainment prefer love marriages and hence are more likely to choose partners of about the same age. With regard to education, the bridegroom typically has a higher educational level than the bride. It is also significant that the proportion of love marriages has increased, whereas that of traditional arranged marriages has decreased. This is more true in urban areas than in rural areas, indicating that rights as well as responsibilities for marriage have been handed over from parents to the young. In conclusion, the changes in marriage behaviors in Korea are characterized by an increasing tendency to postpone first marriage, a higher prevalence of divorce and, as a result, of remarriage, an increase in love marriages, a narrowing age difference between bride and bridegroom, and so on, all of which are largely the result of rapid industrialization, increased educational and economic opportunities, and changing ideals of marriage over the past decades. These phenomena prevailing in Korean society will affect not only family structure, which will become less proliferated, but also population size and structure. Most importantly, the changes in the marriage behaviors of Koreans and their impact on society, with respect to the norms, values, and morals of individuals and families in the social aspect, changes in population size and structure in the demographic aspect, and economic development in the economic aspect, should be integrated into planning for the future.

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
  • Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, with manual categorization, not only can the accuracy of the categorization not be guaranteed, but the categorization also requires a large amount of time and cost. Many studies have been conducted on the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that one document can be assigned to one category only. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process requires training on a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document against multiple categories.
The results imply that a document can be classified into a certain category if and only if the matching score is higher than the predefined threshold. For example, we can classify a certain document into three categories that have larger matching scores than the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other ordinary text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories. This is because readers have different levels of interest in each category. It is also attributable to differences in the frequency of events in each category. To minimize distortion of the results caused by the differing numbers of articles per category, we extracted 3,000 articles from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the news articles that we collected, we calculated the document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of the 23,089 documents.
Precision, recall, and F-score were revealed to be 0.605, 0.629, and 0.617 respectively when only the top 1 predicted category was evaluated, whereas they were revealed to be 0.838, 0.290, and 0.431 when the top 1 - 3 predicted categories were considered. It was very interesting to find a large variation between the scores of the eight categories on precision, recall, and F-score.
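
The scoring step described in this abstract can be illustrated with a minimal sketch. It assumes the document/category score is the sum over topics of the document/topic weight times the topic/category weight; the abstract does not give the exact formula, so this combination rule, the toy weights, the threshold of 0.2, and the function names are all hypothetical. The `f_score` helper uses the standard harmonic-mean definition and reproduces the F-scores reported above from the stated precision and recall.

```python
def document_category_scores(doc_topic, topic_category):
    """Combine document/topic and topic/category weights (assumed: sum of products)."""
    n_categories = len(topic_category[0])
    return [
        [sum(w * topic_category[t][c] for t, w in enumerate(doc_row))
         for c in range(n_categories)]
        for doc_row in doc_topic
    ]


def assign_categories(scores, category_names, threshold):
    """Keep every category whose matching score exceeds the threshold."""
    return [
        [name for name, s in zip(category_names, row) if s > threshold]
        for row in scores
    ]


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 2 documents, 3 topics, 3 categories.
doc_topic = [[0.7, 0.2, 0.1],
             [0.1, 0.1, 0.8]]
topic_category = [[0.9, 0.1, 0.0],
                  [0.2, 0.7, 0.1],
                  [0.0, 0.3, 0.7]]
categories = ["Economy", "Politics", "Sports"]

scores = document_category_scores(doc_topic, topic_category)
labels = assign_categories(scores, categories, threshold=0.2)
print(labels)
print(round(f_score(0.605, 0.629), 3), round(f_score(0.838, 0.290), 3))
```

With these toy weights each document receives two categories, which mirrors the idea of extending a single label to several once the scores clear the threshold.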

Analysis of the Weight of SWOT Factors of Korean Venture Companies Based on the Industry 4.0 (4차 산업혁명 기반 한국 벤처기업의 SWOT요인에 대한 중요도 분석)

  • Lee, Dongik;Lee, Sangsuk
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.16 no.4 / pp.115-133 / 2021
  • This study examines the concept and related technologies of the 4th industrial revolution, which have been used inconsistently so far, together with the resulting socio-economic changes and influences and the responses of major countries. On this basis, the study derives SWOT factors for Korean venture companies preparing for the 4th industrial revolution and calculates the importance of each factor, with the aim of helping the government and policymakers set directions for related policies. A further purpose was to suggest a direction for securing global competitiveness to Korean venture entrepreneurs and to provide a basic, systematic analysis for further in-depth academic research. For this study, a total of 21 items were derived through extensive literature and data research to identify the competency factors that Korean venture companies need, in the face of internal and external environmental changes, to achieve global competitiveness in the era of the 4th industrial revolution. After the SWOT factors were reviewed by three expert groups and confirmed through a Delphi survey, the importance of each item was analyzed using AHP, a systematic decision-making technique. The analysis showed that the factors ranked in importance in the order Strength (48%), Opportunity (25%), Threat (16%), and Weakness (11%). Among the sub-items, 'quick and flexible commercialization capability', 'platform/big data/non-face-to-face service activation', and 'ICT infrastructure and its utilization' were of comparatively high importance. On the other hand, the bottom three items, 'macro-economic stability and social infrastructure', 'difficulty in entering overseas markets due to global protectionism', and 'absolutely inferior in foreign investment', were found to have low priority.
An item-level correlation analysis of the industry, academic, and policy expert groups showed a high correlation between industry and academic experts and a moderate correlation between industry and policy experts, indicating no significant difference of opinion between those groups. The correlation between academic and policy experts, however, was not statistically significant (p<0.01), which was interpreted as a difference of opinion on importance. This was because policy experts highly valued 'quick and flexible commercialization', a strength item, and 'excellent educational system and high-quality manpower' and 'creation of new markets', which are opportunity items, while academic experts placed great importance on 'support part of government policy', a strength item. The implication of this study is that, for Korean venture companies to secure competitiveness in the field of the 4th industrial revolution, policy should preferentially support the relevant items among the strength and opportunity factors. The differences in the details of the strength and opportunity factors, which show a high level of variability, should be actively reviewed and reflected in policy.
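
As a rough illustration of how AHP turns pairwise comparisons into importance weights, the sketch below uses the row geometric mean method, a common approximation to the principal-eigenvector weights. The abstract does not say which variant the authors used, and the comparison matrix here is hypothetical, built from made-up weights rather than from the survey data.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights by normalizing row geometric means."""
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical, perfectly consistent comparison matrix: entry [i][j] states
# how many times more important factor i is judged to be than factor j.
true_weights = [0.5, 0.3, 0.2]
pairwise = [[wi / wj for wj in true_weights] for wi in true_weights]

weights = ahp_weights(pairwise)
print([round(w, 3) for w in weights])
```

For a perfectly consistent matrix this method recovers the underlying weights; real expert judgments are rarely consistent, which is why AHP studies usually also report a consistency ratio.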

A Method for Evaluating News Value based on Supply and Demand of Information Using Text Analysis (텍스트 분석을 활용한 정보의 수요 공급 기반 뉴스 가치 평가 방안)

  • Lee, Donghoon;Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.45-67 / 2016
  • Given the recent development of smart devices, users are producing, sharing, and acquiring a variety of information via the Internet and social network services (SNSs). Because users tend to use multiple media simultaneously according to their goals and preferences, domestic SNS users use around 2.09 media concurrently on average. Since the information provided by such media is usually textually represented, recent studies have been actively conducting textual analysis in order to understand users more deeply. Earlier studies using textual analysis focused on analyzing a document's contents without substantive consideration of the diverse characteristics of the source medium. However, current studies argue that analytical and interpretive approaches should be applied differently according to the characteristics of a document's source. Documents can be classified into the following types: informative documents for delivering information, expressive documents for expressing emotions and aesthetics, operational documents for inducing the recipient's behavior, and audiovisual media documents for supplementing the above three functions through images and music. Further, documents can be classified according to their contents, which comprise facts, concepts, procedures, principles, rules, stories, opinions, and descriptions. Documents have unique characteristics according to the source media by which they are distributed. In terms of newspapers, only highly trained people tend to write articles for public dissemination. In contrast, with SNSs, various types of users can freely write any message and such messages are distributed in an unpredictable way. Again, in the case of newspapers, each article exists independently and does not tend to have any relation to other articles. However, messages (original tweets) on Twitter, for example, are highly organized and regularly duplicated and repeated through replies and retweets. 
There have been many studies focusing on the different characteristics between newspapers and SNSs. However, it is difficult to find a study that focuses on the difference between the two media from the perspective of supply and demand. We can regard the articles of newspapers as a kind of information supply, whereas messages on various SNSs represent a demand for information. By investigating traditional newspapers and SNSs from the perspective of supply and demand of information, we can explore and explain the information dilemma more clearly. For example, there may be superfluous issues that are heavily reported in newspaper articles despite the fact that users seldom have much interest in these issues. Such overproduced information is not only a waste of media resources but also makes it difficult to find valuable, in-demand information. Further, some issues that are covered by only a few newspapers may be of high interest to SNS users. To alleviate the deleterious effects of information asymmetries, it is necessary to analyze the supply and demand of each information source and, accordingly, provide information flexibly. Such an approach would allow the value of information to be explored and approximated on the basis of the supply-demand balance. Conceptually, this is very similar to the price of goods or services being determined by the supply-demand relationship. Adopting this concept, media companies could focus on the production of highly in-demand issues that are in short supply. In this study, we selected Internet news sites and Twitter as representative media for investigating information supply and demand, respectively. We present the notion of News Value Index (NVI), which evaluates the value of news information in terms of the magnitude of Twitter messages associated with it. In addition, we visualize the change of information value over time using the NVI. We conducted an analysis using 387,014 news articles and 31,674,795 Twitter messages. 
The analysis results revealed interesting patterns: most issues show a lower NVI than the average over all issues, whereas a few issues show a steadily higher NVI than the average.
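
The abstract does not give the formula for the NVI, so the following is only a sketch of the general supply-demand idea, not the authors' index. It assumes a hypothetical per-issue ratio of Twitter messages (demand) to news articles (supply) and compares each issue with the average; the issue names and counts are invented.

```python
def demand_supply_ratio(article_count, tweet_count):
    """Hypothetical value measure: tweets (demand) per news article (supply)."""
    if article_count <= 0:
        raise ValueError("article_count must be positive")
    return tweet_count / article_count


# Invented counts per issue: (news articles, Twitter messages).
issues = {
    "issue_a": (100, 200),
    "issue_b": (50, 100),
    "issue_c": (10, 500),
    "issue_d": (200, 100),
}

ratios = {name: demand_supply_ratio(a, t) for name, (a, t) in issues.items()}
average = sum(ratios.values()) / len(ratios)
above_average = sorted(name for name, r in ratios.items() if r > average)
print(ratios, average, above_average)
```

In this toy data a single under-supplied issue pulls the average up, so most issues fall below it, which is the same kind of skew the abstract reports.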

The Analysis of Urban Park Catchment Areas - Perspectives from Quality Service of Hangang Park - (한강공원의 질적 서비스와 이용자 영향권의 상관관계 분석)

  • Lee, Seo Hyo;Kim, Harry;Lee, Jae Ho
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.6 / pp.27-36 / 2021
  • At a time when the equitable use of urban parks is gradually emerging as a social issue, this study was initiated to expand the influence of urban parks by improving the quality of park services, thereby reducing the areas not covered by urban park services. The study targeted Hangang Park in Seoul, where the qualitative service of parks shows the greatest difference. To assess the effect of improvements in service quality, it examined the relationship between the qualitative services of the park and the user's sphere of influence, which indicates the distribution of park users. As a research method, the top three and bottom three districts were selected through the Hangang Park user satisfaction survey conducted from 2017 to 2019, and a qualitative service evaluation was carried out. The user's sphere of influence was derived using data acquired in September. Afterward, a spatial autocorrelation analysis was performed on the user's sphere of influence to verify it further, both numerically and visually. The results show that the user influence in the top three districts, with high-quality service, was stronger and wider than that in the bottom three districts. This confirms that the quality of a park's service affects its user influence. It follows that realizing park equity requires improving service quality through continuous management and improvement of individual parks as well as the creation of new parks. This study is significant in that it recognizes the limitations of research on park services from a supplier's point of view and evaluates the qualitative services of parks from the perspective of actual park users. We propose an alternative approach for lowering the park deprivation index.
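
The abstract mentions a spatial autocorrelation analysis without naming the statistic. A widely used one is global Moran's I, sketched below on the assumption that something similar was applied; the four-zone adjacency matrix and the values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (here 1 for neighbours, 0 otherwise).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical chain of four zones where each zone neighbours the next.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]

clustered = morans_i([1, 2, 3, 4], chain)    # similar values sit together
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours differ sharply
print(clustered, alternating)
```

Positive values indicate that similar values cluster in space, and negative values indicate that neighbouring zones tend to differ.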

Analysis of Urban Growth Pattern and Characteristics by Administrative District Hierarchy : 1985~2005 (행정구역 위계별 도시성장 패턴 및 특성 분석 : 1985~2005를 중심으로)

  • Park, So-Young;Jeon, Sung-Woo;Choi, Chul-Uong
    • Journal of the Korean Association of Geographic Information Studies / v.12 no.4 / pp.34-47 / 2009
  • Rapid urbanization is causing environmental and ecological damage, development that disregards the environment, and social and economic problems. To solve the problems that accompany urbanization and to develop national land sustainably and efficiently, it is important to understand the state and characteristics of urban growth and to reflect them in policy. This research aims to provide basic data for establishing urban policy by analyzing the state and characteristics of urban growth over the past 20 years across the entire country rather than within a particular district. To this end, some urban districts were sampled using the 1980s and 2000s versions of the land cover map produced by the Ministry of Environment, and urban growth patterns by administrative district hierarchy were then analyzed using GIS and a statistical technique. As a result, the developed area after the 1980s increased 2.5-fold compared with that before the 1980s, and in the farm villages neighboring the national capital region in particular it increased 21.2-fold. Special cities and metropolitan cities were developed in districts that are low in altitude, close to principal roads and the major downtown, high in road ratio, and subject to environmental, ecological, and legal restrictions, with land converted from mountains, forests, and grassland to urban land. In contrast, farm villages neighboring a large city, farm villages neighboring the national capital region, and local farm villages were developed in districts that are high in altitude, far from principal roads and the major downtown, low in road ratio, and free of environmental, ecological, and legal restrictions, with land converted from farmland to urban land. That is, as urban growth around big cities expands, urban development has proceeded actively in suburban districts despite unfavorable topographical conditions, owing to the lack of available land and to various regulations and policies.

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has drawn attention, much research on unstructured data has been conducted. Social media on the Internet generate unstructured or semi-structured data every second, often written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which can produce results that are far from users' intentions. Although much progress has been made in recent years in improving search engine performance so as to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, thereby avoiding expensive sense tagging. It evaluates the effectiveness of the method with a Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. In the experiment, the Korean standard unabridged dictionary and the Sejong Corpus were tested both in combination and as separate entities using cross validation. Only nouns, the target of word sense disambiguation, were selected.
93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on sense indices defined by that dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating the sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, terms were extracted from the input sentences and term vectors for the sentences were created. Given the extracted term vector and the sense vector model built during pre-processing, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study shows the effectiveness of the corpus merged from the examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiment shows that better precision and recall are obtained with the merged corpus. This study suggests that the method can practically enhance the performance of Internet search engines and help in understanding the meaning of a sentence more accurately in natural language processing tasks pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. It assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence.
Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
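
The supervised setup described here, with sense-tagged example sentences used as training data for a Naïve Bayes classifier, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class name, the Laplace smoothing, and the English toy examples, which stand in for dictionary example sentences, are all assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Toy Naive Bayes sense classifier over bag-of-words contexts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.sense_counts = Counter()           # training examples per sense
        self.word_counts = defaultdict(Counter) # context word counts per sense
        self.vocabulary = set()

    def fit(self, examples):
        for context_words, sense in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(context_words)
            self.vocabulary.update(context_words)
        return self

    def predict(self, context_words):
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocabulary)
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            # log prior plus summed log likelihoods of the context words
            score = math.log(count / total)
            denom = sum(self.word_counts[sense].values()) + self.alpha * vocab_size
            for word in context_words:
                score += math.log((self.word_counts[sense][word] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical sense-tagged contexts for the ambiguous noun "bank".
examples = [
    (["deposit", "money", "account"], "bank/finance"),
    (["loan", "money", "interest"], "bank/finance"),
    (["river", "water", "shore"], "bank/river"),
    (["fishing", "river", "mud"], "bank/river"),
]
model = NaiveBayesWSD().fit(examples)
print(model.predict(["money", "loan"]), model.predict(["water", "shore"]))
```

Each context word contributes independently to the score, which is the simplifying assumption the abstract identifies as unrealistic but effective in practice.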

A study on the employment preparation cost and attitude of college student for Job-seeking (국내 대학생의 취업태도 및 취업준비 비용에 관한 연구)

  • Chung, Bhum-Suk;Jeong, Hwa-Min
    • Management & Information Systems Review / v.33 no.4 / pp.1-19 / 2014
  • This study focuses on university students' job attitudes and the cost of employment preparation. Nowadays, many university and college students spend large sums on employment preparation, such as foreign language study, various certificates, and orthodontic treatment and clothing for employment interviews. The study investigated the cost of employment preparation and the job attitudes of 484 students at universities and colleges; the collected data were analyzed with SPSS 12.0 using frequency analysis, factor analysis, reliability assessment, correlation tests, t-tests, and one-way ANOVA. University students spent more than college students on employment preparation such as language training abroad, private training, and clothing. Students in social science fields also spent more on language training abroad and clothing than students in computer science and design fields. Female students spent more than male students on orthodontic treatment. The costs of language training abroad, private training, and clothing were affected by the socioeconomic background of the student's family. Regarding job attitude, university students felt more positive than college students about employment efficacy and their perception of the educational environment. In sum, the cost of employment preparation differed by type of institution, major, sex, and family socioeconomic background, and employment efficacy and perception of the educational environment also differed between university and college students. Therefore, to improve job attitudes and develop students' abilities for employment preparation, educational programs should be arranged in schools, and continued research is needed.

New demand forecast for vocational high school graduates in regional strategic industries: Focusing on comparison between Daejeon and Jeonnam (지역전략산업에 따른 특성화고 졸업자 신규수요 예측: 대전과 전남 지역 비교를 중심으로)

  • Kim, Jin-Mo;Choi, Su-Jung;Jeon, Yeong-Uk;Oh, Jin-Ju;Ryu, Ji-Eun;Kim, Seon-Geun
    • Journal of vocational education research / v.36 no.1 / pp.47-75 / 2017
  • The purpose of this study was to provide basic data for policy making on secondary vocational education in each region and for the transformation of vocational high schools. To this end, the regional strategic industries of Daejeon and Jeonnam were selected, and new demand for vocational high school graduates was forecast for each industry and occupation. The results are as follows. First, location quotient analysis and regional shift-share analysis revealed that Daejeon and Jeonnam have different strategic industries. Daejeon, unlike Jeonnam, strategically develops 'manufacturing food, beverage and tobacco', 'manufacturing timber and paper, printing and copying', 'public service and administration of national defense and social security' and 'manufacturing electrical devices, electronics and precision devices'. Jeonnam has specialized industries distinct from Daejeon's, namely 'manufacturing of machinery transportation equipments and etc', 'manufacturing of non-metallic minerals and metal products', 'electric, gas, steam and water supply systems/industries', 'manufacturing coal and chemical products, refining petroleum', 'mining' and 'agriculture, forestry and fishery'. Second, new demand for vocational high school graduates by occupation and industry showed regional differences between Daejeon and Jeonnam. According to the forecast, workforce demand in Daejeon will be concentrated in manufacturing industries, whereas in Jeonnam it will be concentrated in service industries. The analysis by occupation also differed: Daejeon showed high demand for professional and related workers, while Jeonnam showed high demand for new office and service workers. Third, new workforce demand by occupation in the regional strategic industries accounts for a large part of overall new workforce demand in both Daejeon and Jeonnam.
Fourth, an analysis of new demand for vocational high school graduates in Daejeon and Jeonnam in terms of industry location quotient and change effect showed high demand in industries with positive total change effects. In terms of location quotient, Daejeon and Jeonnam showed different results.
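
The location quotient used in this abstract is conventionally defined as an industry's share of regional employment divided by its share of national employment, with values above 1 indicating regional specialization. The sketch below applies that standard definition to hypothetical employment counts; the figures and the function name are illustrative and are not taken from the study.

```python
def location_quotient(regional_industry, regional_total,
                      national_industry, national_total):
    """Regional employment share of an industry relative to its national share."""
    regional_share = regional_industry / regional_total
    national_share = national_industry / national_total
    return regional_share / national_share


# Hypothetical employment counts (same units for region and nation).
lq_manufacturing = location_quotient(30, 100, 200, 1000)
lq_services = location_quotient(70, 100, 800, 1000)
print(round(lq_manufacturing, 3), round(lq_services, 3))
```

Here manufacturing has a quotient above 1 and would count as a specialized industry for the region, while services falls below 1.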

Characteristics and Implications of 4th Industrial Revolution Technology Innovation in the Service Industry (서비스 산업의 4차 산업혁명 기술 혁신 특성과 시사점)

  • Pyoung Yol Jang
    • Journal of Service Research and Studies / v.13 no.2 / pp.114-129 / 2023
  • In the era of the 4th industrial revolution, the importance of the 4th industrial revolution technology is increasing in the service industry. The purpose of this study is to identify the development and utilization status of the 4th industrial revolution technology in the service industry and to derive the characteristics and implications of the 4th industrial revolution technology innovation in the service industry. In this study, research and analysis were conducted based on the business activity survey data in order to identify the technological innovation characteristics of the 4th industrial revolution in the service industry. The 4th industrial revolution technology in the service industry was analyzed in terms of company ratio, technology development and utilization rate, development/utilization technology, technology application field, and technology development method. In addition, the trend of the 4th industrial revolution technology change in the service industry was also analyzed. The 4th industrial revolution technology utilization and development status of other industries was compared and analyzed. In particular, the service industry 4th industrial revolution technology innovation type was divided into 4 types from the perspective of the 4th industrial revolution company ratio and the 4th industrial revolution company ratio growth rate, and types for each service industry were derived. The characteristics and implications of the 4th industrial revolution technology innovation in the service industry were presented from nine perspectives. As a result of the study, it was found that companies in the service industry were developing or using 4th industrial revolution technologies more actively than companies in other industries, and it was analyzed that the gap was further widening. 
By service industry, information and communication, finance and insurance, and education service showed relatively high rates of developing or utilizing 4th industrial revolution technologies. The service industries in which the share of 4th industrial revolution companies increased the most were real estate, education service, and health and social welfare service. In particular, cloud, big data, and artificial intelligence were identified as the three core technologies of the 4th industrial revolution. The service industry can be classified into 4 types in terms of the 4th industrial revolution company ratio and its growth rate, and service industry innovation measures that reflect the differentiated innovation characteristics of each type are needed.