• Title/Summary/Keyword: Web data

Search Results: 5,605

Analysis of Research Trends Related to Drug Repositioning Based on Machine Learning (머신러닝 기반의 신약 재창출 관련 연구 동향 분석)

  • So Yeon Yoo;Gyoo Gun Lim
    • Information Systems Review
    • /
    • v.24 no.1
    • /
    • pp.21-37
    • /
    • 2022
  • Drug repositioning, one of the methods of developing new drugs, is a useful way to discover new indications by allowing drugs that have already been approved for use in humans to be used for other purposes. Recently, with the development of machine learning technology, cases of analyzing vast amounts of biological information and using it to develop new drugs are increasing. Applying machine learning technology to drug repositioning will help find effective treatments quickly. Currently, the world is struggling with a new disease caused by a severe acute respiratory syndrome coronavirus (COVID-19). Drug repositioning, which repurposes drugs that have already been clinically approved, could be an alternative therapeutic strategy for treating COVID-19 patients. This study examines research trends in the field of drug repositioning using machine learning techniques. A total of 4,821 papers were collected from PubMed with the keyword 'Drug Repositioning' using web scraping. After data preprocessing, frequency analysis, LDA-based topic modeling, random forest classification, and prediction performance evaluation were performed on 4,419 papers. Associated words were analyzed with a Word2vec model; after PCA dimensionality reduction, K-Means clustering was applied to generate labels, and the structure of the literature was visualized using the t-SNE algorithm. Hierarchical clustering was applied to the LDA results and visualized as a heat map. This study identified research topics related to drug repositioning and presented a method for deriving and visualizing meaningful topics from a large body of literature using machine learning algorithms. The results are expected to serve as basic data for establishing research or development strategies in the field of drug repositioning.
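
The analysis pipeline described above (LDA topic modeling, PCA reduction, K-Means labeling, t-SNE visualization) can be sketched roughly as follows. This is a minimal illustration on a toy corpus, not the authors' code; the Word2vec and random-forest steps are omitted for brevity, and the document-topic vectors are used in their place.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation, PCA
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Assumption: `abstracts` would hold the ~4,400 preprocessed PubMed abstracts;
# the toy list below only stands in for them.
abstracts = [
    "drug repositioning candidates predicted with random forest models",
    "network based inference for repurposing approved drugs",
    "deep learning on drug target interactions for covid 19 therapy",
    "literature mining of clinical trials for repositioned compounds",
]

# 1) Term-frequency matrix and LDA topic model
tf = CountVectorizer(stop_words="english")
X = tf.fit_transform(abstracts)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(X)            # document-topic proportions

# 2) PCA reduction, K-Means cluster labels, t-SNE layout for visualization
reduced = PCA(n_components=2).fit_transform(doc_topics)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(reduced)

for (x, y), lab in zip(coords, labels):
    print(f"cluster {lab}: ({x:.2f}, {y:.2f})")
```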

Improving the Performance of Radiologists Using Artificial Intelligence-Based Detection Support Software for Mammography: A Multi-Reader Study

  • Jeong Hoon Lee;Ki Hwan Kim;Eun Hye Lee;Jong Seok Ahn;Jung Kyu Ryu;Young Mi Park;Gi Won Shin;Young Joong Kim;Hye Young Choi
    • Korean Journal of Radiology
    • /
    • v.23 no.5
    • /
    • pp.505-516
    • /
    • 2022
  • Objective: To evaluate whether artificial intelligence (AI) for detecting breast cancer on mammography can improve the performance and time efficiency of radiologists reading mammograms. Materials and Methods: A commercial deep learning-based software for mammography was validated using external data collected from 200 patients, 100 each with and without breast cancer (40 with benign lesions and 60 without lesions) from one hospital. Ten readers, including five breast specialist radiologists (BSRs) and five general radiologists (GRs), assessed all mammography images using a seven-point scale to rate the likelihood of malignancy in two sessions, with and without the aid of the AI-based software, and the reading time was automatically recorded using a web-based reporting system. Two reading sessions were conducted with a two-month washout period in between. Differences in the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and reading time between reading with and without AI were analyzed, accounting for data clustering by readers when indicated. Results: The AUROC of the AI alone, BSR (average across five readers), and GR (average across five readers) groups was 0.915 (95% confidence interval, 0.876-0.954), 0.813 (0.756-0.870), and 0.684 (0.616-0.752), respectively. With AI assistance, the AUROC significantly increased to 0.884 (0.840-0.928) and 0.833 (0.779-0.887) in the BSR and GR groups, respectively (p = 0.007 and p < 0.001, respectively). Sensitivity was improved by AI assistance in both groups (74.6% vs. 88.6% in BSR, p < 0.001; 52.1% vs. 79.4% in GR, p < 0.001), but the specificity did not differ significantly (66.6% vs. 66.4% in BSR, p = 0.238; 70.8% vs. 70.0% in GR, p = 0.689). The average reading time pooled across readers was significantly decreased by AI assistance for BSRs (82.73 vs. 73.04 seconds, p < 0.001) but increased in GRs (35.44 vs. 42.52 seconds, p < 0.001). Conclusion: AI-based software improved the performance of radiologists regardless of their experience and affected the reading time.
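
For readers unfamiliar with the reader-study metric, the sketch below shows how a single reader's AUROC could be computed from 7-point likelihood-of-malignancy ratings with and without AI assistance. The ratings are invented; the study's actual analysis is a multi-reader comparison that also accounts for clustering across readers and cases.

```python
from sklearn.metrics import roc_auc_score

# Assumption: `truth` marks cancer (1) vs. no cancer (0) for one reader's cases,
# and the rating lists are that reader's 7-point scores from the two sessions.
truth          = [1, 1, 1, 0, 0, 0, 1, 0]
rating_unaided = [5, 3, 6, 2, 4, 1, 2, 3]
rating_with_ai = [6, 5, 7, 2, 3, 1, 4, 2]

print(f"AUROC without AI: {roc_auc_score(truth, rating_unaided):.3f}")
print(f"AUROC with AI:    {roc_auc_score(truth, rating_with_ai):.3f}")
```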

Metadata extraction using AI and advanced metadata research for web services (AI를 활용한 메타데이터 추출 및 웹서비스용 메타데이터 고도화 연구)

  • Sung Hwan Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.2
    • /
    • pp.499-503
    • /
    • 2024
  • Broadcast programs are provided not only through the broadcaster's own channels but also to various media such as Internet replay, OTT, and IPTV services. In this context, it is very important to provide search keywords that represent the characteristics of the content well. Broadcasters mainly enter key keywords manually during the production and archiving processes. This method is insufficient in quantity for securing core metadata and also shows limitations when content is recommended and used in other media services. This study supports securing a large amount of metadata by utilizing closed-caption data pre-archived through the DTV closed-captioning server developed at EBS. First, core metadata was automatically extracted by applying Google's natural language AI technology. Next, as the core of this research, a method is proposed for identifying core metadata that reflects priorities and content characteristics. To obtain differentiated metadata weights, importance was classified by applying the TF-IDF calculation method, and the experiment yielded useful weight data. The string metadata obtained in this study, when combined with future research on string-similarity measurement, will serve as a basis for securing sophisticated content-recommendation metadata for content services provided to other media.
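
The TF-IDF weighting step the abstract refers to can be illustrated roughly as follows. The caption texts are toy stand-ins, not EBS data, and the top-weighted terms simply serve as candidate keyword metadata.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Assumption: `captions` holds the closed-caption text of each program.
captions = [
    "the documentary follows sea turtles along the southern coast",
    "students prepare for the national science fair with their teacher",
    "the chef visits a traditional market and cooks a seasonal dish",
]

vec = TfidfVectorizer(stop_words="english")
tfidf = vec.fit_transform(captions)
terms = vec.get_feature_names_out()

# Top-weighted terms per program: candidate core metadata keywords
for i in range(tfidf.shape[0]):
    row = tfidf[i].toarray().ravel()
    top = row.argsort()[::-1][:3]
    print(f"program {i}:", [(terms[j], round(row[j], 3)) for j in top])
```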

A Study on Developing a Web Care Model for Audiobook Platforms Using Machine Learning (머신러닝을 이용한 오디오북 플랫폼 기반의 웹케어 모형 구축에 관한 연구)

  • Dahoon Jeong;Minhyuk Lee;Taewon Lee
    • Information Systems Review
    • /
    • v.26 no.1
    • /
    • pp.337-353
    • /
    • 2024
  • The purpose of this study is to investigate the relationship between consumer reviews and managerial responses and to explore the necessity of webcare for efficiently managing consumer reviews. We propose a methodology for effective webcare and construct a webcare model using machine learning techniques based on audiobook platforms. We selected four audiobook platforms and collected and preprocessed consumer reviews and managerial responses. For the analysis, we utilized techniques such as topic modeling, topic-inconsistency analysis, and DBSCAN, along with various machine learning methods. The experimental results yielded significant findings in clustering managerial responses and predicting responses to consumer reviews, and we propose an efficient methodology that considers resource constraints and costs. This research provides academic insights by constructing a webcare model with machine learning techniques and practical implications by suggesting an efficient methodology that accounts for companies' limited resources and personnel. The proposed webcare model can be used as strategic foundational data for consumer engagement and for providing useful information, offering both personalized and standardized managerial responses.
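
The DBSCAN step mentioned above, applied to managerial responses, might look roughly like the following sketch. The response texts, the TF-IDF vectorization, and the parameter values are illustrative assumptions, not the paper's configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN
from sklearn.metrics.pairwise import cosine_distances

# Assumption: `responses` are managerial responses collected from audiobook platforms.
responses = [
    "thank you for your feedback we will improve playback speed",
    "sorry for the inconvenience please contact our support team",
    "thanks for the review we are glad you enjoyed the audiobook",
    "we apologize for the error and have restored your purchase",
]

X = TfidfVectorizer().fit_transform(responses)
dist = cosine_distances(X)                      # precomputed distance matrix

# DBSCAN groups near-duplicate response templates; label -1 marks noise/outliers
labels = DBSCAN(eps=0.6, min_samples=2, metric="precomputed").fit_predict(dist)
for text, lab in zip(responses, labels):
    print(lab, text[:40])
```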

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.1-25
    • /
    • 2020
  • In this paper, we propose an application system architecture that provides an accurate, fast, and efficient automatic gasometer reading function. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount using selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of them; some applications, however, need to ignore characters that are not of interest and focus only on specific types. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users. Character strings that are not of interest, such as device type, manufacturer, manufacturing date, and specifications, are not valuable to the application. Thus, the application has to analyze the region of interest and specific character types to extract only valuable information. We adopted CNN (Convolutional Neural Network)-based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest. We built three neural networks for the application system: the first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings, the second is another convolutional network that transforms the spatial information of the region of interest into sequential feature vectors, and the third is a bidirectional long short-term memory network that converts the sequential features into character strings. In this research, the character strings of interest are the device ID, which consists of 12 Arabic numerals, and the gas usage amount, which consists of 4-5 Arabic numerals. All system components are implemented on Amazon Web Services with Intel Xeon E5-2686 v4 CPUs and NVIDIA Tesla V100 GPUs. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. A mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes reading requests from mobile devices into an input queue with a FIFO (First In, First Out) structure. The slave process, which consists of the three deep neural networks that perform character recognition, runs on an NVIDIA GPU. The slave process continuously polls the input queue for recognition requests; when a request arrives, it converts the image into the device ID character string, the gas usage amount character string, and the position information of the strings, returns this information to an output queue, and then resumes polling the input queue. The master process retrieves the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three deep neural networks: 22,985 images for training and validation and 4,135 images for testing. For each training epoch, the 22,985 images were randomly split at an 8:2 ratio into training and validation sets. The 4,135 test images were categorized into five types (normal, noise, reflex, scale, and slant): normal data are clean images, noise means images with noise, reflex means images with light reflection in the gasometer region, scale means images with small object size due to long-distance capture, and slant means images that are not horizontally level. The final character string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
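
The master-slave queue structure described above can be illustrated with a simplified, single-machine sketch. Here `recognize` is a hypothetical placeholder for the detection + CRNN + BiLSTM pipeline; the real system distributes work across CPU and GPU processes in AWS rather than local threads.

```python
import queue
import threading

input_q, output_q = queue.Queue(), queue.Queue()

def recognize(image):
    # Placeholder for the deep-learning pipeline: would return the device ID,
    # usage amount, and string positions extracted from the gasometer image.
    return {"device_id": "000000000000", "usage": "0000", "image": image}

def slave_worker():
    while True:
        image = input_q.get()            # poll the FIFO input queue
        if image is None:                # shutdown sentinel
            break
        output_q.put(recognize(image))   # push result for the master to collect
        input_q.task_done()

workers = [threading.Thread(target=slave_worker, daemon=True) for _ in range(3)]
for w in workers:
    w.start()

# Master: push captured gasometer images, then collect results for the mobile client
for img in ["img_001.jpg", "img_002.jpg"]:
    input_q.put(img)
input_q.join()
while not output_q.empty():
    print(output_q.get())
for _ in workers:
    input_q.put(None)
```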

Compilation of 104 Experimental Theses on the Antitumor and Immuno-activating therapies of Oriental Medicine (한의학의 항종양 면역치료에 관한 연구 -1990년 이후 발표된 실험논문을 중심으로-)

  • Kang Yeon Yee;Kim Tai Im;Park Jong Ho;Kim Sung Hoon;Park Jong Dai;Kim Dong Hee
    • Journal of Physiology & Pathology in Korean Medicine
    • /
    • v.17 no.1
    • /
    • pp.1-24
    • /
    • 2003
  • This study compiled 104 experimental theses related to antitumor and immuno-activating therapies published between February 1990 and February 2002. Master's and doctoral theses were classified by school, degree, materials, effects, experimental methods of antitumor and immuno-activity, and results. The following results were obtained: 1. Classifying the theses by school, 34.6% were presented by Daejeon University, 29.8% by Kyung-hee University, and 11.5% by Won-kwang University. Of all theses, 51.0% were for doctoral degrees and 43.3% for master's degrees. All three universities have their own cancer centers. 2. Classifying the theses by herb materials, complex prescriptions accounted for 60.3%, single herbs for 24.8%, and herbal acupuncture for 14.2%. Considering the key principles of traditional medicine, complex prescriptions were studied much more thoroughly than single-herb prescriptions. The results showed that complex prescriptions had both antitumor and immuno-activating activity, which may reflect multiple activation mechanisms arising from their complex components. 3. Classifying the theses by the efficacy of the herbs examined: among single herbs, invigorating the spleen and supplementing accounted for 35.5%, expelling toxin and cooling for 29.0%, and activating blood flow and removing blood stasis for 12.9%. In herbal acupuncture, invigorating the spleen and supplementing was 52.9% and expelling toxin and cooling was 29.4%. In complex prescriptions, pathogen-free status was 41.9%, strengthening healthy qi to eliminate pathogens was 35.5%, and strengthening healthy qi was 22.6%. It is presumed that antitumor and immuno-activating therapy based on syndrome differentiation is the best way to develop oriental oncology. 4. Classifying the theses by antitumor experiments, cytotoxic effect was examined in 48.1%, survival time in 48.1%, and change of tumor size in 42.3%. Survival rate was not necessarily correlated with cytotoxicity. These data reflect the characteristic, holistic nature of oriental medicine, which is based on BRM (biological response modifiers). 5. Classifying the theses by immuno-activating experiments, hemolysin titer was measured in 51.0%, hemagglutinin titer in 46.2%, and NK cell activity in 44.2%. Future studies should strive to elucidate the specific molecular and cellular mechanisms of cytokine production in the body. 6. Classifying the theses according to reported antitumor activity, 50.0% were evaluated as good, 24.0% as excellent, and 15.5% as having no effect. In the evaluation of immuno-activating activity, 35.9% were excellent and 18.0% showed little effect. The index point described here may help translate experimental data into clinical trials, and changes in index points with varying dosage indicate the importance of oriental medical theory for prescription. 7. Across 167 materials, the IIP (immuno-activating index point, mean 3.12±0.07) was significantly higher than the AIP (antitumor index point, mean 2.83±0.07). These data suggest that the effect of herbal medicine on tumors depends more on immuno-activating activity than on direct antitumor activity, which further implies that the development of herbal antitumor drugs must be preceded by a mechanistic understanding of the immuno-activating effect. 8. After searching Medline for tumor- and herb-related articles on the NCBI web site, we conclude that most studies focus primarily on biomolecular mechanisms and/or pathways. Henceforth, the biomolecular mechanisms and/or pathways affected by herbs or complex prescriptions need to be defined. 9. Therefore, the most important task of oriental medical oncology is to connect experimental results with clinical trials. For public application of herbal therapy to cancer, it is critical to present the data to the mass media. 10. To strengthen the relationship between experimental results and clinical trials, university cancer clinics must have long-range plans involving university laboratories, and a regular consortium for this relationship is imperative. 11. After all these efforts, a new type of herbal medicine for cancer therapy that addresses long-term administration and safety issues must be developed. It is then expected that antitumor herbal acupuncture can improve clinical symptoms and quality of life (QOL) for cancer patients. 12. Finally, an oriental medical cancer center should be established within the NCC (National Cancer Center) or a government agency for the development of an internationally competitive oriental medical oncology.

Comparative Study on the Methodology of Motor Vehicle Emission Calculation by Using Real-Time Traffic Volume in the Kangnam-Gu (자동차 대기오염물질 산정 방법론 설정에 관한 비교 연구 (강남구의 실시간 교통량 자료를 이용하여))

  • 박성규;김신도;이영인
    • Journal of Korean Society of Transportation
    • /
    • v.19 no.4
    • /
    • pp.35-47
    • /
    • 2001
  • Traffic represents one of the largest sources of primary air pollutants in urban areas. As a consequence, numerous abatement strategies are being pursued to decrease the ambient concentrations of pollutants. A characteristic of most of these strategies is the requirement for accurate data on both the quantity and the spatial distribution of emissions, in the form of an atmospheric emission inventory database. In the case of traffic pollution, such an inventory must be compiled using activity statistics and emission factors for each vehicle type. The majority of inventories are compiled using passive data from either surveys or transportation models and, by their very nature, tend to be out of date by the time they are compiled. Current trends are toward integrating urban traffic control systems with assessments of the environmental effects of motor vehicles. In this study, a methodology for calculating motor vehicle emissions using real-time traffic data was examined and applied to estimate CO emissions in a test area in Seoul. Traffic data, which are required on a street-by-street basis, were obtained from the induction loops of the traffic control system. The speed-related mass of CO emitted from vehicle tailpipes was calculated from the traffic-system data, considering parameters such as traffic volume, vehicle composition, average velocity, and link length. The result was then compared with that of an emission calculation method based on the VKT (Vehicle Kilometers Travelled) of each vehicle category.
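
A rough illustration of the two estimation approaches being compared is given below; the emission factors and link data are hypothetical, not the paper's values.

```python
# Speed-dependent emission factors applied to real-time link data versus a
# fleet-average VKT-based estimate.
def speed_emission_factor(speed_kmh: float) -> float:
    """Hypothetical CO emission factor in g/km, decreasing with speed."""
    return 60.0 / max(speed_kmh, 5.0) + 2.0

# Real-time method: per-link volume, average speed, and link length
links = [
    {"volume_veh_h": 1800, "speed_kmh": 25.0, "length_km": 0.6},
    {"volume_veh_h": 1200, "speed_kmh": 45.0, "length_km": 1.1},
]
co_realtime = sum(
    l["volume_veh_h"] * speed_emission_factor(l["speed_kmh"]) * l["length_km"]
    for l in links
)  # grams of CO per hour over the network

# VKT method: total vehicle-kilometres travelled times a single average factor
vkt = sum(l["volume_veh_h"] * l["length_km"] for l in links)
co_vkt = vkt * 3.5   # hypothetical fleet-average factor, g/km

print(f"real-time estimate: {co_realtime:,.0f} g/h, VKT estimate: {co_vkt:,.0f} g/h")
```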

A Literature Review and Classification of Recommender Systems on Academic Journals (추천시스템관련 학술논문 분석 및 분류)

  • Park, Deuk-Hee;Kim, Hyea-Kyeong;Choi, Il-Young;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.139-152
    • /
    • 2011
  • Recommender systems have become an important research field since the emergence of the first paper on collaborative filtering in the mid-1990s. In general, recommender systems are defined as supporting systems that help users find information, products, or services (such as books, movies, music, digital products, web sites, and TV programs) by aggregating and analyzing suggestions from other users, reviews from various authorities, and user attributes. However, although academic research on recommender systems has increased significantly over the last ten years, more research is needed to make it applicable to real-world situations, because the field is still broad and less mature than other research areas. Accordingly, the existing articles on recommender systems need to be reviewed with the next generation of recommender systems in mind. It is not easy, however, to confine recommender system research to specific disciplines, given the nature of the field. We therefore reviewed all articles on recommender systems in 37 journals published from 2001 to 2010. The 37 journals were selected from the top 125 journals of the MIS Journal Rankings, and the literature search was based on the descriptors "Recommender system", "Recommendation system", "Personalization system", "Collaborative filtering", and "Contents filtering". The full text of each article was reviewed to eliminate articles not actually related to recommender systems. Many articles, such as conference papers, master's and doctoral dissertations, textbooks, unpublished working papers, non-English publications, and news items, were excluded as unfit for our research. We classified the articles by year of publication, journal, recommendation field, and data mining technique. The recommendation fields and data mining techniques of 187 articles were reviewed and classified into eight recommendation fields (book, document, image, movie, music, shopping, TV program, and others) and eight data mining techniques (association rules, clustering, decision trees, k-nearest neighbor, link analysis, neural networks, regression, and other heuristic methods). The results presented in this paper have several significant implications. First, based on previous publication rates, interest in recommender system research will grow significantly in the future. Second, 49 articles are related to movie recommendation, whereas image and TV program recommendation are identified in only 6 articles; this is largely due to the easy availability of the MovieLens data set, so data sets for other fields need to be prepared. Third, social network analysis has recently been used in various applications, but studies on recommender systems using social network analysis are scarce. We expect that new recommendation approaches using social network analysis will be developed, making this an interesting area for further research. This study characterizes trends in recommender system research by examining the published literature and provides practitioners and researchers with insight and future directions. We hope that this research helps anyone interested in recommender systems gain insight for future work.
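
As background on the collaborative filtering technique the review takes as its starting point, a minimal user-based sketch is shown below. The rating matrix is a toy example and is not drawn from any surveyed paper.

```python
import numpy as np

# Toy user-item rating matrix (rows: users, columns: items; 0 = unrated).
R = np.array([
    [5, 4, 0, 1],
    [4, 0, 4, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def cosine(u, v):
    mask = (u > 0) & (v > 0)                 # compare only co-rated items
    if not mask.any():
        return 0.0
    return float(u[mask] @ v[mask] / (np.linalg.norm(u[mask]) * np.linalg.norm(v[mask])))

target, item = 0, 2                          # predict user 0's rating of item 2
sims = np.array([cosine(R[target], R[u]) for u in range(len(R))])
rated = R[:, item] > 0                       # users who rated the target item
pred = (sims[rated] @ R[rated, item]) / (np.abs(sims[rated]).sum() + 1e-9)
print(f"predicted rating: {pred:.2f}")
```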

Ontology-based Course Mentoring System (온톨로지 기반의 수강지도 시스템)

  • Oh, Kyeong-Jin;Yoon, Ui-Nyoung;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.149-162
    • /
    • 2014
  • Course guidance is a mentoring process performed before students register for upcoming classes. It plays a very important role in checking students' degree audits and in advising on the classes to be taken in the coming semester, and it is closely related to graduation assessment and the completion of ABEEK certification. Currently, course guidance is performed manually by advisers at most universities in Korea because they have no electronic systems for it. Without such systems, advisers must analyze each student's degree audit together with the curriculum information of their own departments, and the complexity of this process often leads to human error. An electronic system is therefore essential to avoid human error in course guidance. Applying a relational data model-based system to the mentoring process can solve the problems of the manual approach, but such systems have limitations. Departmental curricula and certification requirements can change with new university policies or surrounding circumstances, and each change requires modifying the schema of the existing system. Relational systems are also insufficient for semantic search because of the difficulty of extracting semantic relationships between subjects. In this paper, we model a course mentoring ontology based on an analysis of a computer science department curriculum, the structure of degree audits, and ABEEK certification. An ontology-based course guidance system is also proposed to overcome the limitations of existing methods and to make the course mentoring process effective for both advisers and students. In the proposed system, all data consist of ontology instances. To create ontology instances, an ontology population module is developed using the JENA framework for building semantic web and linked data applications. In the ontology population module, mapping rules are designed to connect parts of the degree audit to corresponding parts of the course mentoring ontology. All ontology instances are generated from the degree audits of students who participated in the course mentoring test. The generated instances are saved to JENA TDB as a triple store after an inference process using the JENA inference engine. A user interface for course guidance is implemented using Java and the JENA framework. Once an adviser or a student enters the student's information, such as name and student number, in the information request form of the user interface, the proposed system provides mentoring results based on the student's current degree audit and rules that check credits for each part of the curriculum, such as special cultural subjects, major subjects, and MSC subjects covering mathematics and basic science. Recall and precision are used to evaluate the performance of the proposed system: recall checks whether the system retrieves all relevant subjects, and precision checks whether the retrieved subjects are relevant to the mentoring results. A staff member of the computer science department participated in verifying the results derived from the proposed system. Experimental results using real data from the participating students show that the proposed course guidance system based on the course mentoring ontology always provides correct course mentoring results. Advisers can also reduce the time spent analyzing a student's degree audit and calculating the score for each part. As a result, the proposed ontology-based system resolves the difficulties of manual mentoring and derives mentoring results as correct as those produced by a human adviser.
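
The paper implements its populate-infer-query flow with the Java-based JENA framework. The sketch below substitutes Python's rdflib to illustrate the same idea of populating ontology instances from a degree audit and querying them; the namespace, class names, and property names are hypothetical, not the paper's ontology.

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical course-mentoring namespace and instance data standing in for
# the ontology populated from a student's degree audit.
CM = Namespace("http://example.org/course-mentoring#")
g = Graph()
g.bind("cm", CM)

g.add((CM.student_2014001, RDF.type, CM.Student))
g.add((CM.course_DataStructures, RDF.type, CM.MajorSubject))
g.add((CM.course_DataStructures, CM.credits, Literal(3)))
g.add((CM.student_2014001, CM.completed, CM.course_DataStructures))

# SPARQL query: total completed major-subject credits for the student
q = """
PREFIX cm: <http://example.org/course-mentoring#>
SELECT (SUM(?c) AS ?total) WHERE {
    cm:student_2014001 cm:completed ?course .
    ?course a cm:MajorSubject ;
            cm:credits ?c .
}
"""
for row in g.query(q):
    print("completed major credits:", row.total)
```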

Development of processed food database using Korea National Health and Nutrition Examination Survey data (국민건강영양조사 자료를 이용한 가공식품 데이터베이스 구축)

  • Yoon, Mi Ock;Lee, Hyun Sook;Kim, Kirang;Shim, Jae Eun;Hwang, Ji-Yun
    • Journal of Nutrition and Health
    • /
    • v.50 no.5
    • /
    • pp.504-518
    • /
    • 2017
  • Purpose: The objective of this study was to develop a processed foods database (DB) for estimating processed food intake in the Korean population using data from the Korea National Health and Nutrition Examination Survey (KNHANES). Methods: Analytical values of processed foods were collected from food composition tables of national institutions (Development Institute, Rural Development Administration), the US Department of Agriculture, and previously reported scientific journals. Missing or unavailable values were substituted, calculated, or imputed. The nutrient data covered 14 nutrients, including energy, protein, carbohydrates, fat, calcium, phosphorus, iron, sodium, potassium, vitamin A, thiamin, riboflavin, niacin, and vitamin C. The processed food DB covered a total of 4,858 food items used in the KNHANES. Each analytical value per food item was selected systematically based on the priority criteria of the data sources. Results: The level 0 DB was developed based on a list of 8,785 registered processed foods with recipes of ready-to-eat processed foods, a food composition table published by a national institution, and nutrition facts obtained directly from manufacturers or indirectly via web search. The level 1 DB included information on 14 nutrients, and missing or unavailable values were substituted, calculated, or imputed at level 2. The level 3 DB evaluated the newly constructed nutrient DB for processed foods using the 2013 KNHANES. Mean intakes of total food and processed food were 1,551.4 g (males 1,761.8 g, females 1,340.8 g) and 129.4 g (males 169.9 g, females 88.8 g), respectively. Processed foods contributed from 5.0% (fiber) to 12.3% (protein) of nutrient intakes in the Korean population. Conclusion: The newly developed nutrient DB for processed foods contributes to accurate estimation of nutrient intakes in the Korean population. Consistent and regular updates and quality control of the DB are needed to obtain accurate estimates of usual intakes using data from the KNHANES.
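
The substitution/imputation and contribution-estimation steps could be sketched roughly as follows in pandas. The food items, nutrient values, and imputation rule are illustrative assumptions rather than the DB's actual procedure.

```python
import pandas as pd

# Toy processed-food nutrient table; None marks a missing analytical value.
foods = pd.DataFrame({
    "food":      ["instant noodles", "yogurt drink", "canned tuna"],
    "protein_g": [8.1, None, 20.4],
    "sodium_mg": [1790.0, 55.0, None],
})

# Impute missing values with the median of similar items (a stand-in for the
# substitution/calculation rules applied at level 2 of the DB).
for col in ["protein_g", "sodium_mg"]:
    foods[col] = foods[col].fillna(foods[col].median())

intake = pd.DataFrame({          # grams consumed of each processed food
    "food":  ["instant noodles", "yogurt drink", "canned tuna"],
    "grams": [120.0, 150.0, 50.0],
})
merged = intake.merge(foods, on="food")
merged["protein_intake_g"] = merged["protein_g"] * merged["grams"] / 100  # per 100 g values
print(merged[["food", "protein_intake_g"]])
print("processed-food protein intake:", merged["protein_intake_g"].sum().round(1), "g")
```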