• Title/Summary/Keyword: Text data


A Case Study on Characteristics of Gender and Major in Career Preparation of University Students from Low-income Families: Application of Text Frequency Analysis and Association Rules (저소득층 대학생들의 진로준비과정에서의 성별·전공별 특성에 대한 사례연구: 텍스트 빈도분석과 연관분석의 적용)

  • Lee, Jihye;Lee, Shinhye
    • Journal of Digital Convergence / v.16 no.12 / pp.61-69 / 2018
  • This study examines the career preparation experiences of low-income university students, and draws implications from them, against a backdrop of high youth unemployment and social polarization. We selected 13 university students who received scholarships from the S scholarship foundation, conducted six rounds of interviews, and analyzed the interview data with text mining techniques. The results suggest that home environment and income level influenced how students recalled their earlier academic experiences and planned their careers during the interviews. These influences also differed by gender and major. The study is meaningful in that it analyzes qualitative research data with text mining in a convergent way, exploring the college life and career preparation of low-income university students through the frequency of words and the relations among them.
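The combination of word frequency and association analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes each interview has already been tokenized into a list of words; the function name `association_rules`, the thresholds, and the sample tokens are all hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations

def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for word pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for words in transactions:
        unique = sorted(set(words))  # count each word once per interview
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of interviews containing both words
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)

# Hypothetical tokenized interview excerpts (not data from the study).
interviews = [
    ["scholarship", "family", "job"],
    ["scholarship", "job", "major"],
    ["family", "income", "job"],
    ["scholarship", "major"],
]
word_freq = Counter(w for doc in interviews for w in doc)  # text frequency
rules = association_rules(interviews)
```

Here `word_freq` gives the frequency side of the analysis, and `rules` lists the word pairs whose support and confidence pass the (assumed) thresholds. Only pairs are considered, which keeps the sketch short; a full Apriori-style search would also cover larger itemsets.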

Co-occurrence Based Drug-disease Relationship Inference with Genes as Mediators (유전자를 중간 매개로 고려한 동시발생 기반의 약물-질병 관계 추론)

  • Shin, Sangwon;Sin, Yeeun;Jang, Giup;Yoo, Youngmi
    • The Journal of Korean Institute of Information Technology / v.16 no.11 / pp.1-9 / 2018
  • Drug repositioning is the discovery of new uses for drugs, and text mining derives knowledge from unstructured text. We propose a method that predicts new drug-disease relationships from how often genes co-occur in both disease-gene and gene-drug pairs. Drug-gene and gene-disease co-occurrences in the biological literature are counted, and the rate of each gene is calculated for each drug and disease. The weight of a drug-disease relationship is the average of the rates of the measured genes, and these weights are used to measure accuracy for each disease. Measuring frequency at the sentence level and considering multiple relationships identified drug-disease relationships more accurately than the existing method.
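One possible reading of the gene-mediated scoring described in this abstract is sketched below. The abstract does not give the exact formulas, so the normalization in `rates` and the averaging in `drug_disease_score` are assumptions, and every term name (`drug_a`, `gene_1`, `disease_x`, and so on) is a placeholder rather than real data.

```python
from collections import Counter, defaultdict

def cooccurrence_counts(sentences, left_terms, right_terms):
    """Count sentences that mention both a left term and a right term."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        for left in left_terms & tokens:
            for right in right_terms & tokens:
                counts[left][right] += 1
    return counts

def rates(counts):
    """Normalize each term's gene counts so that they sum to 1 (assumed)."""
    return {
        term: {g: c / sum(row.values()) for g, c in row.items()}
        for term, row in counts.items()
    }

def drug_disease_score(drug, disease, drug_rates, disease_rates):
    """Average the combined gene rates over genes shared by drug and disease."""
    shared = set(drug_rates.get(drug, {})) & set(disease_rates.get(disease, {}))
    if not shared:
        return 0.0
    total = sum((drug_rates[drug][g] + disease_rates[disease][g]) / 2 for g in shared)
    return total / len(shared)

# Placeholder sentences standing in for biological literature.
sentences = [
    "drug_a binds gene_1",
    "drug_a regulates gene_1 and gene_2",
    "gene_1 is overexpressed in disease_x",
    "gene_2 is linked to disease_y",
]
drugs, genes = {"drug_a"}, {"gene_1", "gene_2"}
diseases = {"disease_x", "disease_y"}

drug_rates = rates(cooccurrence_counts(sentences, drugs, genes))
disease_rates = rates(cooccurrence_counts(sentences, diseases, genes))
score_x = drug_disease_score("drug_a", "disease_x", drug_rates, disease_rates)
score_y = drug_disease_score("drug_a", "disease_y", drug_rates, disease_rates)
```

Counting per sentence rather than per document mirrors the sentence-level measurement the abstract reports as more accurate. The whitespace tokenizer is a simplification; real literature mining would need entity recognition.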

An Analysis of Effects of Emergency Fine Dust Reduction Measures and National Petition Using Regression Analysis and Text Mining (회귀분석과 텍스트마이닝을 활용한 미세먼지 비상저감조치의 실효성 및 국민청원 분석)

  • Kim, Annie;Jeong, So-Hee;Choi, Hyun-Bin;Kim, Hyon Hee
    • KIPS Transactions on Software and Data Engineering / v.7 no.11 / pp.427-434 / 2018
  • The Seoul government recently implemented a 'Free Public Transportation' policy and a 'Citizen Participatory Alternative-Day-No-Driving' system as 'Emergency Fine Dust Reduction Measures'. This paper evaluates the effectiveness of the two traffic policies and suggests directions for future fine dust policy. The effect of traffic on fine dust was analyzed by regression analysis, and responses to the two traffic policies, along with national petitions, were analyzed using text mining. Our experimental results show that responses to the policies were mostly negative and that, contrary to citizens' expectations, domestic factors had a considerable influence. The results also revealed people's specific needs regarding fine dust reduction policy. Based on these results, we provide suggestions for the direction of fine dust reduction policy.

A Narrative Inquiry on the Retired Elderly Person's Library Use Experience (은퇴노인의 도서관 이용 경험에 관한 내러티브 탐구)

  • Lee, Hosin
    • Journal of the Korean Society for Information Management / v.36 no.1 / pp.215-246 / 2019
  • The purpose of this study is to understand retired elderly people's experience of library use through the narrative inquiry method proposed by Clandinin and Connelly. I sought to grasp in detail the changes that library use brings to their lives and to examine what these experiences mean to them. Three elderly retirees who use public libraries in Seoul were selected as research participants. I interviewed them about their experiences and constructed field texts from the interviews. Based on the field texts, the participants' stories were reconstructed into research texts in the form of novels, essays, and letters. Their experience of using libraries was interpreted as a source of regularity in life, of fun and vitality, a treasure house for dreaming of a new life, and a source of consolation for enduring old age. A common thread in their narratives was that they seek a healthy life through reading books. The results are expected to broaden understanding of the public library's elderly users and to serve as basic data for service improvement.

NADP+-Dependent Dehydrogenase SCO3486 and Cycloisomerase SCO3480: Key Enzymes for 3,6-Anhydro-ʟ-Galactose Catabolism in Streptomyces coelicolor A3(2)

  • Tsevelkhorloo, Maral;Kim, Sang Hoon;Kang, Dae-Kyung;Lee, Chang-Ro;Hong, Soon-Kwang
    • Journal of Microbiology and Biotechnology / v.31 no.5 / pp.756-763 / 2021
  • Agarose is a linear polysaccharide composed of ᴅ-galactose and 3,6-anhydro-ʟ-galactose (AHG). It is a major component of the red algal cell wall and is gaining attention as an abundant marine biomass. However, the inability to ferment AHG is considered an obstacle to the large-scale use of agarose, one that could be addressed by understanding AHG catabolism in agarolytic microorganisms. Since AHG catabolism had been confirmed only in Vibrio sp. EJY3, a gram-negative marine bacterial species, we investigated AHG metabolism in Streptomyces coelicolor A3(2), an agarolytic gram-positive soil bacterium. Based on genomic data, the SCO3486 protein (492 amino acids) and the SCO3480 protein (361 amino acids) of S. coelicolor A3(2) showed identity with H2IFE7.1 (40% identity), an AHG dehydrogenase, and H2IFX0.1 (42% identity), a 3,6-anhydro-ʟ-galactonate cycloisomerase, respectively, which are involved in the initial catabolism of AHG in Vibrio sp. EJY3. Thin layer chromatography and mass spectrometry of the products of bioconversion catalyzed by recombinant SCO3486 and SCO3480 proteins revealed that SCO3486 is an AHG dehydrogenase that oxidizes AHG to 3,6-anhydro-ʟ-galactonate, and SCO3480 is a 3,6-anhydro-ʟ-galactonate cycloisomerase that converts 3,6-anhydro-ʟ-galactonate to 2-keto-3-deoxygalactonate. SCO3486 showed maximum activity at pH 6.0 and 50°C, increased activity in the presence of iron ions, and activity against various aldehyde substrates, which is quite distinct from the AHG-specific H2IFE7.1 of Vibrio sp. EJY3. Therefore, the catabolic pathway of AHG seems to be similar in most agar-degrading microorganisms, but the enzymes involved appear to be very diverse.

The design and Implementation of Web Security System using the Cookies (쿠키를 이용한 웹 보안시스템 설계 및 구현)

  • 송기평;박기식;한승희;조인준
    • Journal of the Korea Institute of Information Security & Cryptology / v.11 no.4 / pp.3-14 / 2001
  • A Web server uses HTTP (Hypertext Transfer Protocol) to communicate with a client. HTTP is a stateless protocol: the server does not maintain any state information for ongoing interactions with the client. HTTP therefore incurs additional overhead, because users must re-enter data to continue a session. Cookie technology can remove this overhead in the Web environment. However, a cookie is usually insecure because it is transferred over the network and stored in a file as clear text; the information in a cookie is easy to expose, copy, and even change. In this paper, we propose a secure cookie mechanism suited to the Web environment and present the design and implementation of a secure Web system based on that scheme. The system can be used in any Web environment and provides security services such as confidentiality, authentication, and integrity.
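The abstract does not describe the proposed cookie scheme in detail, so the following is only a generic sketch of one common way to protect a cookie against modification: attaching an HMAC tag computed with a server-side key. The names `sign_cookie`, `verify_cookie`, and `SECRET_KEY` are hypothetical. The sketch covers integrity and authentication only; confidentiality would additionally require encrypting the value.

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"demo-key-change-me"  # hypothetical server-side secret

def sign_cookie(value: str, key: bytes = SECRET_KEY) -> str:
    """Encode the value and append an HMAC-SHA256 tag: '<payload>.<tag>'."""
    payload = base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii")
    tag = hmac.new(key, payload.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{payload}.{tag}"

def verify_cookie(cookie: str, key: bytes = SECRET_KEY):
    """Return the original value, or None if the cookie was altered."""
    payload, sep, tag = cookie.rpartition(".")
    if not sep:
        return None  # no tag present at all
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return None
    return base64.urlsafe_b64decode(payload.encode("ascii")).decode("utf-8")

cookie = sign_cookie("user=alice")
restored = verify_cookie(cookie)
tampered = verify_cookie("x" + cookie)
```

Because the tag depends on a key that only the server holds, a client that copies or edits the payload cannot produce a matching tag, and `verify_cookie` rejects the cookie.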

Perception Survey about SMEs Employment of University Students in Chungbuk Area: Based on Text-mining (충북지역 대학생의 중소기업 취업에 대한 인식조사: 텍스트마이닝을 기반으로)

  • Choi, Dabin;Choi, Wooseok;Choi, Sanghyun;Lee, Junghwan
    • Korean small business review / v.42 no.4 / pp.235-250 / 2020
  • This study surveyed university students in the Chungbuk area about their perception of employment in small and medium-sized enterprises (SMEs) in order to prepare improvement measures. In particular, data were collected through descriptive questions alongside existing survey methods, and perceptions of SMEs and decent work were identified using text mining. The analysis found positive perceptions of SME jobs, such as varied work experience and low job competition, but generally negative perceptions of pay, work, and welfare. In addition, co-occurrence network analysis of responses about decent jobs yielded 'Information' as a keyword. College students' negative perception of SMEs is currently affected by a lack of sufficient information, which needs to be addressed first. To solve this problem, the study proposes establishing and operating a platform that provides information on SME employment and helps select the necessary personnel.

Character Recognition Algorithm in Low-Quality Legacy Contents Based on Alternative End-to-End Learning (대안적 통째학습 기반 저품질 레거시 콘텐츠에서의 문자 인식 알고리즘)

  • Lee, Sung-Jin;Yun, Jun-Seok;Park, Seon-hoo;Yoo, Seok Bong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.11 / pp.1486-1494 / 2021
  • Character recognition is required on various platforms, such as smart parking and text-to-speech, and many studies are attempting new ways to improve its performance. However, when low-quality images are used for character recognition, the resolution of the training images differs from that of the test images, resulting in poor accuracy. To solve this problem, this paper designs an end-to-end learning neural network that combines image super-resolution and character recognition so that recognition performance is robust to data of varying quality, and implements an alternative end-to-end learning algorithm to train the whole network. Alternative end-to-end learning and a recognition performance test were conducted using license plate images, chosen from among various kinds of text images, and the test verified the effectiveness of the proposed algorithm.

Research on Tourist Perception of Grand Canal Cultural Heritage Based on Network Text Analysis : The Pingjiang Historical and Cultural District of Suzhou City as an example (네트워크 텍스트 분석을 통한 대운하 문화유산에 대한 관광객 인식 연구 : 쑤저우시 핑장역사문화지구의 예)

  • Chengkang Zheng;Qiwei Jing;Nam Kyung Hyeon
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.215-231 / 2023
  • Taking the Pingjiang Historical and Cultural District in Suzhou as an example, this paper collects 1,436 tourist comments from Ctrip.com using Python and applies network text analysis to high-frequency words, the semantic network, and sentiment in order to evaluate the characteristics and level of tourist perception of the Grand Canal cultural heritage. The study found that natural and humanistic landscapes, historical and cultural deposits, and the style of the Jiangnan Canal are fully reflected in visitors' perception of the Pingjiang Historical and Cultural District. Tourists hold strongly positive emotions towards the Pingjiang Road historical and cultural district; however, there is still room for the district's transformation and upgrading. Finally, suggestions for improving tourists' perception of the Grand Canal cultural heritage are given in terms of conservation first, cultural integration, and innovative utilization.

Chinese-clinical-record Named Entity Recognition using IDCNN-BiLSTM-Highway Network

  • Tinglong Tang;Yunqiao Guo;Qixin Li;Mate Zhou;Wei Huang;Yirong Wu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1759-1772 / 2023
  • Chinese named entity recognition (NER) is a challenging task that seeks to find, recognize, and classify various types of information elements in unstructured text. Because Chinese text has no natural boundaries like the spaces in English text, Chinese named entity recognition is much more difficult. At present, most deep learning-based NER models are built on a bidirectional long short-term memory network (BiLSTM), yet their performance still has room to improve. To further improve performance on Chinese NER tasks, we propose a new NER model, IDCNN-BiLSTM-Highway, which combines the BiLSTM, the iterated dilated convolutional neural network (IDCNN), and the highway network. In our model, the IDCNN achieves multiscale context aggregation over a long sequence of words. The highway network effectively connects different network layers, allowing information to pass through them smoothly without attenuation. Finally, the globally optimal tag sequence is obtained by introducing a conditional random field (CRF). The experimental results show that, compared with other popular deep learning-based NER models, our model achieves superior performance on two Chinese NER data sets, Resume and Yidu-S4k, with F1-scores of 94.98 and 77.59, respectively.
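The way a highway connection lets information pass through a layer can be shown with the standard gating formula y = T(x) * H(x) + (1 - T(x)) * x. The sketch below is a plain-Python illustration of that formula only, not the authors' implementation; the choice of tanh for the transform H, the helper names, and the toy weights are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, x):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def highway_layer(x, w_h, b_h, w_t, b_t):
    """y = T(x) * H(x) + (1 - T(x)) * x, computed element-wise."""
    h = [math.tanh(v) for v in affine(w_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(w_t, b_t, x)]    # transform gate T(x) in (0, 1)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

# Toy 2-dimensional input and weights (illustrative values only).
x = [0.5, -1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# A strongly negative gate bias closes the gate, so the input is carried through.
carried = highway_layer(x, identity, [0.0, 0.0], zeros, [-20.0, -20.0])
# A strongly positive gate bias opens the gate, so the output follows H(x).
transformed = highway_layer(x, identity, [0.0, 0.0], zeros, [20.0, 20.0])
```

When the gate is closed, the layer behaves almost like an identity mapping, which is the property that lets signals cross many layers without attenuation.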