• Title/Abstract/Keywords: Web Text Analysis

278 results (processing time: 0.03 s)

The Role of Non-Negotiated Input and Output: A Case Study of L2 Development via Web Chat

  • Hahn, Hye-Ryeong
    • 영어어문교육 / Vol. 17, No. 4 / pp. 49-74 / 2011
  • The present paper explores the role of non-negotiated input and output in language acquisition in the context of free Web chat. To examine how input and output contribute to language acquisition, with or without meaning negotiation, the study examined a Korean EFL learner's chat data collected over 6 months. Chat texts across 43 chat sessions were analyzed, along with her comment notes and interviews. The input and output negotiated for meaning were traced throughout all sessions to find evidence that they were linked to acquisition. Other input and output in the interaction were also traced to ascertain whether they contributed to acquisition. The chat text analysis, comment notes, and interviews revealed that opportunities for meaning negotiation in a free Web chat context were quite limited and that the learner acquired language even in the absence of meaning negotiation. The findings suggest that input and output via Web chat, whether negotiated or non-negotiated, play their respective roles, contributing to different aspects of acquisition.


텍스트마이닝을 위한 패션 속성 분류체계 및 말뭉치 웹사전 구축 (Development of Online Fashion Thesaurus and Taxonomy for Text Mining)

  • 장세윤;김하연;김송미;최우진;정진;이유리
    • 한국의류학회지 / Vol. 46, No. 6 / pp. 1142-1160 / 2022
  • Text data plays a significant role in understanding and analyzing trends in the consumer, business, and social sectors. Text analysis requires a corpus that reflects specific domain knowledge. However, in the field of fashion, professional corpora are insufficient. This study aims to develop a taxonomy and thesaurus that reflect the specialized nature of fashion products. To this end, about 100,000 fashion vocabulary terms were collected by crawling text data from WGSN, Pantone, and online platforms; text was subsequently extracted through preprocessing with Python. The taxonomy was composed of seven attributes of clothes: items, silhouettes, details, styles, colors, textiles, and patterns/prints. The corpus was completed by processing synonyms of terms from fashion reference books such as dictionaries. Finally, 10,294 vocabulary words, including 1,956 standard Korean words, were classified in the taxonomy. All data was then developed into a web dictionary system. Quantitative and qualitative performance tests of the results were conducted through expert reviews. The performance of the thesaurus was also verified by comparing text mining results against those obtained with a previously developed corpus. This study contributes to establishing a text data standard and enables meaningful text mining results in the fashion field.

Research Trends on Literature Reviews in Scopus Journals by Authors from Indonesia, Japan, South Korea, Vietnam, Singapore, and Malaysia: A Bibliometric Analysis from 2003 to 2022

  • Prakoso Bhairawa Putera;Amelya Gustina
    • Asian Journal of Innovation and Policy / Vol. 12, No. 3 / pp. 304-322 / 2023
  • Text data mining ('big data methods') was one of the most widely used approaches during the COVID-19 pandemic, particularly on the Scopus and Web of Science (WoS) databases. Text data mining is widely used to collect literature for later bibliometric analysis, which ultimately yields a literature review article. In this article, we therefore reveal the publication trend of literature reviews in Scopus journals by authors from Indonesia, Japan, South Korea, Vietnam, Singapore, and Malaysia. The article covers two main parts: 1) a comparison of international publication trends and subject areas of literature review publications, and 2) a comparison of the Top 5 for Authors, Affiliation, Source Title, and Collaboration Country.

Analysis of Social Media Utilization based on Big Data-Focusing on the Chinese Government Weibo

  • Li, Xiang;Guo, Xiaoqin;Kim, Soo Kyun;Lee, Hyukku
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 8 / pp. 2571-2586 / 2022
  • The rapid rise of government social media has generated huge amounts of text data, and the analysis of these data has gradually become a focus of digital government research. This study uses Python to analyze big data from Chinese provincial government Weibo accounts. First, the study uses a web crawler to collect and statistically describe over 360,000 records from 31 provincial government microblogs in China, covering the period from January 2018 to April 2022. Second, a word segmentation engine is constructed, and the text data are analyzed using word clouds, word frequencies, and semantic relationships. Finally, the text data are analyzed for sentiment using natural language processing methods, and the text topics are studied using the LDA algorithm. The results show that, first, the number and scale of posts on Chinese government Weibo have grown rapidly. Second, government Weibo has certain social attributes, and epidemics, people's livelihood, and services have become its focus. Third, more than 30% of government Weibo content carries negative sentiment. The classified topics show that epidemics and epidemic prevention and control overshadowed the other topics, which inhibits the diversification of government Weibo.
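
The word-frequency step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the posts have already been segmented into tokens; the function name `word_frequencies`, the stopword set, and the toy English tokens are hypothetical and are not taken from the paper, which does not publish its code.

```python
from collections import Counter

def word_frequencies(segmented_posts, stopwords=frozenset(), top_n=10):
    """Count token occurrences across posts that are already segmented.

    segmented_posts: iterable of token lists (hypothetical input format).
    stopwords: tokens to ignore (an assumption, not from the paper).
    """
    counts = Counter()
    for tokens in segmented_posts:
        # Skip stopwords so that function words do not dominate the ranking.
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(top_n)

# Toy data standing in for segmented microblog posts.
posts = [
    ["epidemic", "prevention", "the", "notice"],
    ["epidemic", "service", "the", "hotline"],
    ["livelihood", "service", "epidemic"],
]
top = word_frequencies(posts, stopwords={"the"}, top_n=2)
print(top)
```

The resulting (token, count) pairs are the kind of input a word cloud would typically be drawn from.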

Analysis and Improvement of Ranking Algorithm for Web Mining System on the Hierarchical Web Environment

  • Heebyung Yoon;Lee, Kil-Seup;Kim, Hwa-Soo
    • 한국지능시스템학회:학술대회논문집 / 한국퍼지및지능시스템학회, ISIS 2003 (2003) / pp. 455-458 / 2003
  • A variety of document ranking algorithms have been developed to provide efficient mining results for users' queries in the web environment. Typical ranking algorithms are the text-based Vector-Space Model, the PageRank and HITS algorithms based on hyperlink structures, and several other improved algorithms. All of these serve the user's convenience and preference. However, these algorithms are usually developed for horizontal, non-hierarchical web environments and are not suitable for hierarchical web environments such as enterprise and defense networks. Thus, special environmental factors must be considered in order to improve the ranking algorithms. In this paper, we analyze several typical algorithms that use hyperlink structures in the web environment. We then suggest a configuration of the hierarchical web environment and describe the relations between the agents of the web mining system. Next, we propose an improved ranking algorithm suited to this kind of special environment. The proposed algorithm considers both the hyperlink structure of the documents and the location of the user in the hierarchical web.

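
For readers unfamiliar with the hyperlink-based ranking this abstract refers to, the following is a minimal sketch of standard PageRank by power iteration. It is not the improved hierarchical algorithm proposed in the paper, whose details the abstract does not give; the function name, the damping factor of 0.85, and the toy link graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Standard PageRank via power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> rank; the ranks sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same "random jump" baseline.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outgoing links): spread its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Toy graph: A links to B and C, B links to C, C links back to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
print(sorted(ranks, key=ranks.get, reverse=True))
```

A hierarchy-aware variant would presumably add a term for the user's location in the network, but how the paper does this is not stated in the abstract.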

기후 변화 교육을 위한 국내 웹 자료 분석 (Analysis of the World Wide Web Contents in Korea for the Climate Change Education)

  • 최혜숙;김용표
    • 한국환경교육학회지:환경교육 / Vol. 23, No. 3 / pp. 1-16 / 2010
  • Global climate change has become one of the most serious environmental problems in the world. There is growing recognition that climate change education, especially for children, is important. However, there have been few programmes, curricula, teacher training opportunities, and teaching-learning materials for climate change education so far. Therefore, we analyse the World Wide Web (web) content in Korea that is available for climate change education, providing fundamental data for developing educational content on climate change and helping users to search for appropriate content. The subjects of this study are 10 web sites of public institutions related to climate change in Korea. The web content is evaluated in terms of diversity, accuracy, authenticity, and ease of use. The key finding is that the majority of the content focuses on how to respond to the problem, especially mitigation; we also find that most of the web sites provide text-type lesson plans and video-type materials. Consequently, it is necessary to develop more varied web content for climate change education in terms of both quality and quantity.


기업 대 기업간(B to B) 섬유거래 웹사이트 분석 (The Analysis of Web Sites of Textile Exchange of B to B)

  • 홍병숙;이은진;이지연
    • 한국의류학회지 / Vol. 27, No. 1 / pp. 123-133 / 2003
  • The specific objectives of the study were as follows: 1) to investigate the composition system (design, usability, and interactivity) of B to B textile exchange web sites, and 2) to examine and evaluate the contents and marketing (announcement, satisfaction, and variety of contents) of B to B textile exchange web sites. The data were collected from search engines, evaluation portal sites, direct contact, telephone interviews with the webmasters of the web sites concerned, and the results of an analytical evaluation of the web sites. The results of this study were as follows: 1) Dongsung trading intended to use its homepage mainly as an internal communication space over an intranet. Daechang trading mainly used its homepage as a tool for expanding its export market. etextiler sold its web solutions through its homepage. texcom offered a web space and useful information to trading companies in Asia. 2) texcom built its site from text with few images to speed up loading and navigation for usability. Dongsung trading built an intranet for communication and information exchange within the company. etextiler offered a booking menu for inquiries on its homepage. Daechang trading tried to give a good impression from the introduction page of its homepage.

빅데이터 분석을 이용한 디지털 패션 테크에 대한 인식 연구 (Perceptions and Trends of Digital Fashion Technology - A Big Data Analysis -)

  • 송은영;임호선
    • 한국의류산업학회지 / Vol. 23, No. 3 / pp. 380-389 / 2021
  • This study aimed to reveal the perceptions and trends of digital fashion technology through an informational approach. A big data analysis was conducted after collecting text published in a web environment from April 2019 to April 2021. Keywords were derived through text mining analysis and network analysis, and the structure of perception of digital fashion technology was identified. Using Textom, we collected 8,144 texts after data refinement, conducted a frequency of occurrence and central component analysis, and visualized the results with word clouds and N-grams. Matrices were also generated from the 70 most frequent words, and a structural equivalence analysis was performed. The results were presented with network visualizations and dendrograms. Fashion, digital, and technology were the most frequently mentioned topics, and the frequencies of platform, digital transformation, and start-ups were also high. Through clustering, four clusters of marketing were formed using fashion, digital technology, startups, and augmented reality/virtual reality technology. Future research on startups and smart factories with technologies based on stable platforms is needed. The results of this study contribute to increasing the fashion industry's knowledge of digital fashion technology and can be used as a foundational study for research on related topics.
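
The matrix step mentioned in this abstract (matrices built from the most frequent words) can be sketched as a document-level co-occurrence matrix. This is an illustration under stated assumptions, not the study's procedure: the abstract does not say how the matrices were constructed or how the tool it used computes them, and the function name, the `top_k` parameter, and the toy documents are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_matrix(documents, top_k=70):
    """Build a symmetric co-occurrence matrix over the top_k most frequent tokens.

    documents: list of token lists (hypothetical, already cleaned).
    Two tokens co-occur once for each document that contains both.
    """
    freq = Counter(token for doc in documents for token in doc)
    vocab = [word for word, _ in freq.most_common(top_k)]
    index = {word: i for i, word in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for doc in documents:
        # Deduplicate tokens so each document contributes at most once per pair.
        present = sorted({t for t in doc if t in index}, key=index.get)
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return vocab, matrix

# Toy documents standing in for refined web texts.
docs = [
    ["fashion", "digital", "technology"],
    ["fashion", "digital", "platform"],
    ["fashion", "startup"],
]
vocab, matrix = cooccurrence_matrix(docs, top_k=2)
print(vocab, matrix)
```

A matrix of this shape is a common input for network visualization and for clustering that produces dendrograms.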

웹소설 키워드를 통한 이용 독자 내적 욕구 및 특성 파악 (Identifying Reader's Internal Needs and Characteristics Using Keywords from Korean Web Novels)

  • 조수연;오하영
    • 한국정보통신학회논문지 / Vol. 24, No. 2 / pp. 158-165 / 2020
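  • Web novels, which are serialized and consumed on mobile devices, can capture a cross-section of our society, as other cultural content does. This paper collects web novel keyword information to identify the main motifs and trends of web novels and, further, to analyze the internal needs and characteristics of their readers in connection with existing studies. The analysis showed that works with contemporary settings and adult works were popular, in connection with the highly accessible and easily readable mobile environment. Male protagonists tended to be portrayed in idealized terms in web novels, but their main keywords have changed compared with the early 2000s, which suggests a shift in contemporary people's ideas about gender. In contrast, female protagonists who carry inner wounds were popular; as one cause, the paper points to the reality of contemporary women who have had to experience frustration within socio-structural conditions. Although this paper could not analyze adult works in depth because of the limits of web crawling, it is significant in that it analyzed the internal needs and characteristics of contemporary people by applying keywords, a form of paratext, to web novel research, where quantitative analysis has been lacking.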