• Title/Summary/Keyword: Text Collection

Spectator's Value Cognition and Expected-benefit Factors on Professional Baseball Sportstar (프로야구 스포츠스타에 대한 관람자의 가치인식과 추구혜택)

  • Lee, Jong-Young;Ko, Jung-Hee
    • 한국체육학회지인문사회과학편 / v.51 no.3 / pp.79-88 / 2012
  • The purpose of this study was to identify spectators' value cognition and expected-benefit factors regarding professional baseball sportstars. Purposeful sampling was used to select, from among fan club leaders and people who had watched baseball games for over ten years, 8 informants who had watched more than 4 of 6 baseball games in the 2011 Lotte Card Professional Baseball League. Data were collected through focus group interviews and in-depth interviews. Text analysis applied the A-R-C needs theory proposed by Thomson (2006) to spectators' value cognition and the metaphors for consuming proposed by Holt (1995) to expected-benefit factors. The findings are summarized as follows. Spectators perceived professional baseball sportstars as an object of entertainment-oriented value, a major figure for social relationship orientation, and an object of identification. Spectators' expected-benefit factors were confirmation of values and heroic orientation. Meanwhile, spectators with an entertainment-oriented value cognition of professional baseball sportstars expected heroic orientation, whereas those who recognized sportstars as a major figure for social relationships and an object of identification pursued confirmation of values.

Development of Social Data Collection and Loading Engine-based Reliability analysis System Against Infectious Disease Pandemic (감염병 위기 대응을 위한 소셜 데이터 수집 및 적재 엔진 기반 신뢰도 분석 시스템 개발)

  • Doo Young Jung;Sang-Jun Lee;MIN KYUNG IL;Seogsong Jeong;HyunWook Han
    • The Journal of Bigdata / v.7 no.2 / pp.103-111 / 2022
  • There are many institutions, organizations, and sites related to responding to infectious diseases, but as the pandemic situation such as COVID-19 continues for years, there are many changes in the initial and current aspects, and accordingly, policies and response systems are evolving. As a result, regional gaps arise, and various problems are scattered due to trust, distrust, and implementation of policies. Therefore, in the process of analyzing social data including information transmission, Twitter data, one of the major social media platforms containing inaccurate information from unknown sources, was developed to prevent facts in advance. Based on social data, which is unstructured data, an algorithm that can automatically detect infectious disease threats is developed to create an objective basis for responding to the infectious disease crisis to solidify international competitiveness in related fields.

A study on Wikidata linkage methods for utilization of digital archive records of the National Debt Redemption Movement (국채보상운동 디지털 아카이브 기록물의 활용을 위한 위키데이터 연계 방안에 대한 연구)

  • Seulki Do;Heejin Park
    • Journal of Korean Society of Archives and Records Management / v.23 no.2 / pp.95-115 / 2023
  • To increase the utilization of the digital archive records of the National Debt Redemption Movement, which are inscribed on the UNESCO Memory of the World Register, this study designed a data model linked to Wikidata and examined its applicability; implications were derived by analyzing the existing metadata, thesaurus, and semantic network graph. Through analysis of the original text of the National Debt Redemption Movement records, key data model classes for linking with Wikidata, such as record item, agent, time, place, and event, were derived. In addition, by identifying core properties for linking between classes and applying the designed data model to actual records, the possibility of acquiring abundant related information was confirmed through movement between classes centered on properties. The results showed that Wikidata's strengths could be used to increase data usage in local archives where the scale and management of data are relatively small. The approach can therefore be considered for small-scale archives similar to the National Debt Redemption Movement digital archive.

Development of IoT-based Can Compactor/PET Bottle Crusher Management System (IoT 기반의 캔/PET병 압착파쇄기 관리시스템 개발)

  • Dae-Hyun Ryu;Ye-Seong Kang;Tae-Wan Choi
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1239-1244 / 2023
  • In this study, we developed an IoT-based management system to manage a can/PET crusher. Various sensors such as two load cells, DHT22 temperature and humidity sensor, and fine dust meter were interfaced with ESP32 to construct an IoT device, and a management server was built using Node-RED. The system monitors the weight of pressed cans and shredded PET bottles in real time and sends a text message to the manager when the weight exceeds the predetermined threshold for timely collection. The results of the operational test confirmed that the system provides accurate monitoring and efficient notification functions, and offers the possibility of solving environmental problems by improving the efficiency of waste management such as cans and PET bottles.
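
The threshold notification described in this abstract can be sketched roughly as follows. This is only an illustrative Python sketch, not the authors' ESP32/Node-RED implementation: the names `BinReading`, `needs_collection`, and `build_alerts`, the bin labels, and the 10 kg threshold are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class BinReading:
    bin_name: str     # hypothetical label, e.g. "can" or "pet"
    weight_kg: float  # weight reported by a load cell


def needs_collection(reading: BinReading, threshold_kg: float) -> bool:
    """Return True when the measured weight exceeds the threshold."""
    return reading.weight_kg > threshold_kg


def build_alerts(readings, threshold_kg):
    """Build one text message for each bin whose weight exceeds the threshold."""
    return [
        f"{r.bin_name}: {r.weight_kg:.1f} kg exceeds {threshold_kg:.1f} kg threshold, collection needed"
        for r in readings
        if needs_collection(r, threshold_kg)
    ]


# Example readings (made-up values); only the "can" bin is over the limit.
readings = [BinReading("can", 12.4), BinReading("pet", 3.1)]
alerts = build_alerts(readings, threshold_kg=10.0)
for message in alerts:
    print(message)
```

In a deployed system the message would be handed to an SMS gateway rather than printed; that step is omitted here because the abstract does not say which service was used.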

A Study on the Construction of Financial-Specific Language Model Applicable to the Financial Institutions (금융권에 적용 가능한 금융특화언어모델 구축방안에 관한 연구)

  • Jae Kwon Bae
    • Journal of Korea Society of Industrial Information Systems / v.29 no.3 / pp.79-87 / 2024
  • Recently, the importance of pre-trained language models (PLM) has been emphasized for natural language processing (NLP) such as text classification, sentiment analysis, and question answering. Korean PLM shows high performance in NLP in general-purpose domains, but is weak in domains such as finance, medicine, and law. The main goal of this study is to propose a language model learning process and method to build a financial-specific language model that shows good performance not only in the financial domain but also in general-purpose domains. The five steps of the financial-specific language model are (1) financial data collection and preprocessing, (2) selection of model architecture such as PLM or foundation model, (3) domain data learning and instruction tuning, (4) model verification and evaluation, and (5) model deployment and utilization. Through this, a method for constructing pre-learning data that takes advantage of the characteristics of the financial domain and an efficient LLM training method, adaptive learning and instruction tuning techniques, were presented.

Latent topics-based product reputation mining (잠재 토픽 기반의 제품 평판 마이닝)

  • Park, Sang-Min;On, Byung-Won
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.39-70 / 2017
  • Data-driven analytics techniques have recently been applied to public surveys. Instead of simply gathering survey results or expert opinions to research the preference for a recently launched product, enterprises need a way to collect and analyze various types of online data and then accurately figure out customer preferences. In existing data-based survey methods, the sentiment lexicon for a particular domain is first constructed by domain experts, who usually judge the positive, neutral, or negative meanings of frequently used words in the collected text documents. To research the preference for a particular product, the existing approach (1) collects review posts related to the product from several product review web sites; (2) extracts sentences (or phrases) from the collection after pre-processing steps such as stemming and stop-word removal; (3) classifies the polarity (either positive or negative sense) of each sentence (or phrase) based on the sentiment lexicon; and (4) estimates the positive and negative ratios of the product by dividing the total numbers of positive and negative sentences (or phrases) by the total number of sentences (or phrases) in the collection. Furthermore, the existing approach automatically finds important sentences (or phrases) that carry positive or negative meaning toward the product. As a motivating example, given a product like the Sonata made by Hyundai Motors, customers often want to see a summary note that includes the positive points in the 'car design' aspect as well as the negative points in the same aspect. They also want to gain more useful information regarding other aspects such as 'car quality', 'car performance', and 'car service.' Such information will enable customers to make good choices when they purchase brand-new vehicles.
In addition, automobile makers will be able to figure out the preferences and positive/negative points for new models on the market. In the near future, the weak points of the models can be improved through sentiment analysis. For this, the existing approach computes the sentiment score of each sentence (or phrase) and then selects the top-k sentences (or phrases) with the highest positive and negative scores. However, the existing approach has several shortcomings and is of limited use in real applications. Its main disadvantages are as follows: (1) The main aspects (e.g., car design, quality, performance, and service) of a product (e.g., Hyundai Sonata) are not considered. Because sentiment analysis is performed without considering aspects, customers and car makers are given only a summary note containing the positive and negative ratios of the product and the top-k sentences (or phrases) with the highest sentiment scores in the entire corpus. This is not enough; the main aspects of the target product need to be considered in the sentiment analysis. (2) In general, since the same word has different meanings across different domains, a sentiment lexicon appropriate to each domain needs to be constructed. An efficient way to construct the sentiment lexicon per domain is required because lexicon construction is labor intensive and time consuming. To address the above problems, in this article, we propose a novel product reputation mining algorithm that (1) extracts topics hidden in review documents written by customers; (2) mines main aspects based on the extracted topics; (3) measures the positive and negative ratios of the product using the aspects; and (4) presents a digest in which a few important sentences with positive and negative meanings are listed for each aspect. Unlike the existing approach, using hidden topics lets experts construct the sentiment lexicon easily and quickly.
Furthermore, by reinforcing topic semantics, we can improve the accuracy of product reputation mining beyond that of the existing approach. In the experiments, we collected a large set of review documents on domestic vehicles such as the K5, SM5, and Avante; measured the positive and negative ratios of the three cars; showed top-k positive and negative summaries per aspect; and conducted statistical analysis. Our experimental results clearly show the effectiveness of the proposed method compared with the existing method.
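
The lexicon-based ratio estimate that this abstract attributes to the existing approach (classify each sentence, then divide the positive and negative counts by the total number of sentences) might look roughly like the sketch below. The tiny lexicon, the sample sentences, the whitespace tokenization, and the function names are all hypothetical; the paper's actual pre-processing (stemming, stop-word removal) and its topic-based method are not reproduced here.

```python
def sentence_polarity(sentence, lexicon):
    """Label a sentence by summing the lexicon scores of its words."""
    score = sum(lexicon.get(word, 0) for word in sentence.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no sentiment words found, or scores cancel out


def polarity_ratios(sentences, lexicon):
    """Return (positive_ratio, negative_ratio) over all sentences."""
    total = len(sentences)
    if total == 0:
        return 0.0, 0.0
    labels = [sentence_polarity(s, lexicon) for s in sentences]
    return labels.count("positive") / total, labels.count("negative") / total


# Hypothetical hand-built lexicon: +1 for positive words, -1 for negative words.
lexicon = {"sleek": 1, "comfortable": 1, "noisy": -1, "poor": -1}
sentences = [
    "The design is sleek",
    "The engine is noisy",
    "Seats are comfortable",
    "It has four doors",
]
pos_ratio, neg_ratio = polarity_ratios(sentences, lexicon)
print(pos_ratio, neg_ratio)
```

Because the ratios are computed over the whole collection, this sketch also illustrates the limitation the abstract points out: nothing in it distinguishes 'car design' sentences from 'car service' sentences.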

A Study on the Naejeong (內庭) of Daesoon Jinrihoe Temple Complexes: Focusing on Literary Sources and Context (대순진리회 도장 건축물 내정(內庭)에 대한 연구 - 내정의 문헌 출처와 그 맥락을 중심으로 -)

  • Cha, Seon-keun
    • Journal of the Daesoon Academy of Sciences / v.37 / pp.1-52 / 2021
  • The Naejeong, the inner court, is one of the structures found in the temple complexes of Daesoon Jinrihoe. It serves the function of leading and controlling the operation and direction of Korean religions in general. Considering that the dictionary meaning of 'Naejeong' is 'a place to manage the affairs of the state from inside a palace,' the name and function of the structure appear to be in harmony. However, in the Daesoon Jinrihoe context, it is said that the name 'Naejeong (內庭 'Neiting' in Chinese)' is related to a verse from a Daoist scripture. It has not been revealed whether or not the scripture is historical, and what contents or contextual meanings it contains. This study pursues this matter and introduces the original source of the Naejeong in Daesoon Jinrihoe as likely coming from Qianbapinxianjing (前八品仙經, The Former Scripture of the Eight Phases That Reveal the Means to Acquire Immortality). This scripture was compiled in Lüzu-quanshu (呂祖全書, The Entire Collection of Ancestor Lü). This text and its contextual meanings will also be examined. The origin of Qianbapinxianjing dates back to either the late Ming Dynasty or the early Qing. In those days, there existed a group of literati who worshipped Ancestor Lü because he had saved people and taught the art of immortality. The group organized Daoist Spirit-Writing Altars (鸞壇道敎) and invoked the spirit of Ancestor Lü. They were said to have been taught through messages received from spirit-writing sessions (降乩) with Ancestor Lü, and several Daoist scriptures were composed by them in this manner. At Immortals-Gathering Pavilion (集仙樓) of Wandian (萬店) in Guangling (廣陵), China, some literati in that group conducted a spirit-writing session with Ancestor Lü between 1589 and 1626, and they produced a scripture which contained the passage, "A crow and a rabbit gather in the middle valley (烏兎結中谷) while a turtle entwined with a snake is in the inner court (龜蛇盤內庭)."
They titled the scripture, The Five Movements and Filial Piety (五行端孝). This passage symbolically expresses the accomplishment of immortality in Neidan (internal alchemy) which, within the human body, combines the two energies of yin and yang which are Water and Fire in the Five Movements scheme. This kind of cultivation is said to be achieved only by maintaining the highest possible degree of filial piety. In this context, the Naejeong where a turtle is entwined with a snake (龜蛇合體) was a term that symbolically depicted a place wherein one transforms into an immortal through cultivation. The Five Movements and Filial Piety was included in Qianbapinxianjing after it had been compiled with the other scriptures containing Ancestor Lü's teachings. In 1744, Qianbapinxianjing was included in Lüzu-quanshu, the entire 32-volume collection of Ancestor Lü and printed for the first time. This underlies the belief in Ancestor Lü (呂祖信仰) which embraces the idea of the redemption of people, teaches the arts of immortality, and features Daoist Spirit-Writing Altars, filial piety, the art of Neidan, and the combination of Water and Fire.

A Collaborative Filtering System Combined with Users' Review Mining : Application to the Recommendation of Smartphone Apps (사용자 리뷰 마이닝을 결합한 협업 필터링 시스템: 스마트폰 앱 추천에의 응용)

  • Jeon, ByeoungKug;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.1-18 / 2015
  • Collaborative filtering (CF) algorithms have been widely used for recommender systems in both academic and practical applications. A general CF system compares users based on how similar they are, and creates recommendations from the items favored by other people with similar tastes. Thus, it is very important for CF to measure the similarities between users, because recommendation quality depends on them. In most cases, only users' explicit numeric ratings of items (i.e., quantitative information) have been used to calculate the similarities between users in CF. However, several studies have indicated that qualitative information such as users' reviews of the items may help measure these similarities more accurately. Considering that many people are likely to share honest opinions on the items they recently purchased, owing to the advent of Web 2.0, users' reviews can be regarded as an informative source for accurately identifying users' preferences. Against this background, this study proposes a new hybrid recommender system that incorporates users' review mining. Our proposed system is based on conventional memory-based CF, but it is designed to use both a user's numeric ratings and his/her text reviews of the items when calculating similarities between users. Specifically, our system creates not only a user-item rating matrix but also a user-item review term matrix. It then calculates rating similarity and review similarity from each matrix, and calculates the final user-to-user similarity based on these two similarities (i.e., rating and review similarities). As methods for calculating review similarity between users, we propose two alternatives: one uses the frequency of the commonly used terms, and the other uses the sum of the importance weights of the commonly used terms in users' reviews.
For the importance weights of terms, we propose the use of average TF-IDF (Term Frequency - Inverse Document Frequency) weights. To validate the applicability of the proposed system, we applied it to the implementation of a recommender system for smartphone applications (hereafter, apps). At present, over a million apps are offered in each of the app stores operated by Google and Apple. Due to this information overload, users have difficulty selecting the apps that they really want. Furthermore, app store operators like Google and Apple have accumulated a huge amount of users' reviews of apps. Thus, we chose smartphone app stores as the application domain of our system. To collect the experimental data set, we built and operated a Web-based data collection system for about two weeks. As a result, we obtained 1,246 valid responses (ratings and reviews) from 78 users. The experimental system was implemented using Microsoft Visual Basic for Applications (VBA) and SAS Text Miner. To avoid distortion due to human intervention, we did not perform any manual refinement during the review mining process. To examine the effectiveness of the proposed system, we compared its performance with that of a conventional CF system. The performance of the recommender systems was evaluated using average MAE (mean absolute error). The experimental results showed that our proposed system (MAE = 0.7867 ~ 0.7881) slightly outperformed a conventional CF system (MAE = 0.7939). They also showed that calculating review similarity between users based on the TF-IDF weights (MAE = 0.7867) led to better recommendation accuracy than the calculation based on the frequency of the commonly used terms in reviews (MAE = 0.7881).
The results of a paired-samples t-test showed that our proposed system with review similarity calculated from the frequency of the commonly used terms outperformed the conventional CF system at the 10% significance level. Our study sheds light on the application of users' review information for facilitating electronic commerce by recommending proper items to users.
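
The idea of blending a rating similarity with a review similarity, and of scoring predictions with MAE, can be sketched as below. This is a hedged illustration only: the abstract does not state how the two similarities are combined or which similarity measure is applied to ratings, so the cosine measure, the linear blend, the weight `alpha`, and all sample data are assumptions rather than the authors' method (their system was built with VBA and SAS Text Miner).

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(ratings_a, ratings_b, terms_a, terms_b, alpha=0.5):
    """Blend rating similarity and review-term similarity (alpha is assumed)."""
    rating_sim = cosine(ratings_a, ratings_b)
    review_sim = cosine(terms_a, terms_b)
    return alpha * rating_sim + (1 - alpha) * review_sim


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up data: item -> rating, and term -> weight (e.g. a TF-IDF weight).
ratings_a = {"app1": 5, "app2": 3}
ratings_b = {"app1": 4, "app2": 2, "app3": 5}
terms_a = {"fast": 0.8, "crash": 0.1}
terms_b = {"fast": 0.6, "ads": 0.4}

similarity = combined_similarity(ratings_a, ratings_b, terms_a, terms_b)
mae = mean_absolute_error([3.0, 4.0], [4.0, 4.0])
print(similarity, mae)
```

A lower MAE indicates predictions closer to the true ratings, which is the sense in which the abstract reports 0.7867 as better than 0.7939.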

Exploring the Trend of Korean Creative Dance by Analyzing Research Topics : Application of Text Mining (연구주제 분석을 통한 한국창작무용 경향 탐색 : 텍스트 마이닝의 적용)

  • Yoo, Ji-Young;Kim, Woo-Kyung
    • Journal of Korea Entertainment Industry Association / v.14 no.6 / pp.53-60 / 2020
  • This study is based on the assumption that trends in a phenomenon and trends in the research on it are contextually consistent. The purpose of this study is therefore to explore the trend of dance through a subject analysis of Korean creative dance studies using text mining. To this end, 1,291 words from 616 journal article titles found on a paper search website were analyzed. The collection, refinement, and analysis of the data were all performed with R 3.6.0. The results are as follows. First, keywords representing the times were frequently used before the 2000s, but Korean creative dance research was also found in terms of education and physical training. Second, the frequency of keywords related to dance troupes' performances was high after the 2000s, but it was confirmed that Choi Seung-hee still held an important position in the study of Korean creative dance. Third, an analysis of the overall research subjects of Korean creative dance studies showed that research on 'Art of Choi Seung-hee in the modern era' accounted for the highest proportion. Fourth, the Hot Topics, which have been rising since 2000, were 'the performance activities of the National Dance Company' and 'the choreography expression and utilization of traditional dance'. However, since the recent trend of the National Dance Company's performances advocates 'modernization based on tradition', it was confirmed that the trend of Korean creative dance since the 2000s has focused on the use of traditional dance motifs. Fifth, the Cold Topic, which has been declining since 2000, was the study of 'dancing expressions by age'. It was judged that research interest also decreased due to the tendency to mix various dance styles after the genre of Korean creative dance was established.

An Exploration of the Relationship Between Virtual Museum Exhibitions and Visitors' Responses (미술관, 박물관 가상전시디자인에 대한 관람객의 반응연구)

  • Park, Nam-Jin
    • Archives of design research / v.19 no.1 s.63 / pp.181-190 / 2006
  • This study began with an assumption that virtual museum exhibitions will continue to be created in the future and more knowledge is required about designing effective virtual exhibit designs. This study explored the relationship between virtual exhibitions and visitor's opinions following the viewing of the virtual exhibit in order to determine the components of a well-constructed virtual exhibit design. To address the research problem, this study explored two aspects of virtual exhibit design: 1) what are the components of a well-constructed virtual exhibit, 2) how does viewing the virtual exhibit change visitors' opinions about both physical and virtual museum experiences. The methodology of the study employed surveys, interviews and observations as instruments of data collection. Twenty-five participants were given a survey prior to their viewing of the on-line exhibit, then they were given the opportunity to view the web-site and finally surveyed regarding their opinions. From the 25 participants, six were selected for observation to record behavior exhibited while they viewed the site. In addition, five were interviewed for a better understanding of their responses to various aspects of the virtual exhibit experiences. Data from the surveys was tabulated for descriptive percentages in order to identify numerical patterns of relationship. Observation data was analyzed for simple frequencies in categories of responses and interview data was tape recorded and transcribed into text files. Based on study results, recommendations were made for the future role of interior design in virtual space that stands independent from a physical building and resides only on the Internet.
