• Title/Summary/Keyword: search similarity


A Pilot Establishment of the Job-Exposure Matrix of Lead Using the Standard Process Code of Nationwide Exposure Databases in Korea

  • Ju-Hyun Park;Sangjun Choi;Dong-Hee Koh;Dae Sung Lim;Hwan-Cheol Kim;Sang-Gil Lee;Jihye Lee;Ji Seon Lim;Yeji Sung;Kyoung Yoon Ko;Donguk Park
    • Safety and Health at Work
    • /
    • v.13 no.4
    • /
    • pp.493-499
    • /
    • 2022
  • Background: The purpose of this study is to construct a job-exposure matrix for lead that accounts for industry and work processes within industries, using a nationwide exposure database. Methods: We used the work environment measurement data (WEMD) for lead monitored nationwide from 2015 to 2016. Industrial hygienists standardized the work process codes in the database into 37 standard processes and extracted key index words for each process. One of the 37 standardized process codes was allocated to each measurement through an automated keyword search based on the degree of agreement between the measurement information and the standard process index. Summary statistics, including the arithmetic mean, geometric mean, and 95th percentile level (X95), were calculated by industry, process, and industry-process. Using the statistical parameters of contrast and precision, we compared the similarity of exposure groups defined by industry, process, and industry-process. Results: The exposure intensity of lead was estimated for 583 exposure groups formed by combining 128 industries and 35 processes. The X95 value for the "casting" process in the "manufacture of basic precious and non-ferrous metals" industry was 53.29 µg/m³, exceeding the occupational exposure limit of 50 µg/m³. Regardless of the minimum-sample-size restriction applied to the exposure groups, higher contrast was observed when exposure groups were defined by industry-process than by industry or process alone. Conclusion: We evaluated the exposure intensities of lead by combination of industry and process. The results will help determine more accurate exposure information for lead-related epidemiological studies.
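
The grouping-and-summary step lends itself to a short illustration. Below is a minimal sketch, assuming a hypothetical measurement table with columns industry, process, and lead_ugm3; the column names, sample values, and the empirical-quantile X95 are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
import pandas as pd

# Hypothetical WEMD-style lead measurements; all values are invented.
df = pd.DataFrame({
    "industry":  ["metals", "metals", "metals", "battery", "battery"],
    "process":   ["casting", "casting", "welding", "assembly", "assembly"],
    "lead_ugm3": [42.0, 61.5, 8.3, 12.1, 19.7],
})

# Summary statistics per industry-process exposure group:
# arithmetic mean (AM), geometric mean (GM), and 95th percentile (X95).
stats = (
    df.groupby(["industry", "process"])["lead_ugm3"]
      .agg(n="count",
           am="mean",
           gm=lambda x: np.exp(np.log(x).mean()),
           x95=lambda x: x.quantile(0.95))
)

# Flag groups whose X95 exceeds the occupational exposure limit for lead.
OEL = 50.0  # µg/m³
print(stats[stats["x95"] > OEL])
```

Exposure groups whose X95 exceeds the limit, such as the casting example in the abstract, can then be read directly off this table.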

Network Analysis Using the Established Database (K-herb Network) on Herbal Medicines Used in Clinical Research on Heart Failure (심부전의 한약 임상연구에 활용된 한약재에 대한 기구축 DB(K-HERB NETWORK)를 활용한 네트워크 분석)

  • Subin Park;Ye-ji Kim;Gi-Sang Bae;Cheol-Hyun Kim;Inae Youn;Jungtae Leem;Hongmin Chu
    • The Journal of Internal Korean Medicine
    • /
    • v.44 no.3
    • /
    • pp.313-353
    • /
    • 2023
  • Objectives: Heart failure is a chronic disease whose prevalence continues to increase despite advances in medical technology. Korean medicine uses herbal prescriptions to treat heart failure, but little is known about the specific herbal medicines that make up the network of heart failure prescriptions. This study proposes a novel methodology that can efficiently support prescription development and experimental research on heart failure by utilizing existing databases. Methods: Herbal medicine prescriptions for heart failure were identified through a PubMed search and compiled into a Google Sheets database. NetMiner 4 was used for network analysis, and the individual networks were classified according to the herbal medicine classification system to identify trends. The K-HERB NETWORK was used to derive related prescriptions. Results: Network analysis of heart failure prescriptions and herbal medicines using NetMiner 4 produced 16 individual networks. Uhwangcheongsim-won (牛黃淸心元), Gamiondam-tang (加味溫膽湯), Bangpungtongseong-san (防風通聖散), and Bunsimgi-eum (分心氣飮) were identified as the prescriptions with the highest similarity in the overall network. For each of the 16 individual networks, the K-HERB NETWORK was used to identify the existing prescriptions most similar to it. The results provide 1) an indication of existing prescriptions with potential for treating heart failure and 2) a basis for developing new prescriptions for heart failure treatment. Conclusion: The proposed methodology offers an efficient approach to developing new heart failure prescriptions and facilitating experimental research. This study highlights the potential of network pharmacology methodology and its possible applications to other diseases. Further studies on network pharmacology methodology are recommended.
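
As a rough illustration of this kind of analysis, the sketch below builds a co-occurrence network in which herbs appearing in the same prescription are linked, then ranks reference formulas by overlap with a candidate herb set. The sample prescriptions and the Jaccard similarity are assumptions for illustration, not the NetMiner 4 workflow itself.

```python
from itertools import combinations
import networkx as nx

# Hypothetical herb sets from heart-failure trial prescriptions.
prescriptions = {
    "Rx-A": {"ginseng", "astragalus", "licorice"},
    "Rx-B": {"ginseng", "licorice", "cinnamon"},
    "Rx-C": {"salvia", "cinnamon", "astragalus"},
}

# Co-occurrence network: herbs are nodes; edge weight counts how many
# prescriptions contain both herbs.
G = nx.Graph()
for herbs in prescriptions.values():
    for a, b in combinations(sorted(herbs), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

def jaccard(s1: set, s2: set) -> float:
    """Set-overlap similarity between two herb sets."""
    return len(s1 & s2) / len(s1 | s2)

# Rank reference formulas by similarity to a candidate herb set.
candidate = {"ginseng", "cinnamon", "licorice"}
for name, herbs in sorted(prescriptions.items(),
                          key=lambda kv: jaccard(candidate, kv[1]),
                          reverse=True):
    print(name, round(jaccard(candidate, herbs), 2))
```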

The analysis and leaching characteristics of organic compounds in incineration residues from municipal solid waste incinerators (생활폐기물 소각시설 소각재에서의 유기오염물질 정성분석 및 용출특성)

  • Hong, Suk-Young;Kim, Sam-Cwan;Yoon, Young-Soo;Park, Sun-Ku;Kim, Kum-Hee;Hwang, Seung-Ryul
    • Analytical Science and Technology
    • /
    • v.19 no.1
    • /
    • pp.86-95
    • /
    • 2006
  • This study was carried out to estimate the leaching characteristics of incineration residues from municipal solid waste incinerators and to determine the organic compounds in raw ash, leaching water, and leaching residue. A total of 44 organic compounds, analyzed by GC/MSD and identified by Wiley library search, were found in bottom ashes, and 17 organic compounds were found in fly ashes. Bottom ash and fly ash contained a wide range of organic compounds, including aliphatic and aromatic compounds. Compounds such as ethenylbenzene, benzaldehyde, 1-phenylethanone, and 1,4-benzenedicarboxylic acid dimethyl ester were detected in the raw ash, leaching water, and residues from bottom ash. Compounds such as naphthalene, dodecane, 1,2,3,5-tetrachlorobenzene, tetradecane, hexadecane, and pentachlorobenzene were detected in the raw ash, leaching water, and residues from fly ash. The leaching characteristics of the incineration residues indicate that open dumping of such residues can contaminate soil and groundwater. To prevent environmental contamination from the toxic substances in incineration residues, it is particularly important that the residues be treated before disposal. Further study of, and proper management based on, the leaching characteristics of organic compounds may be required.

A Study on Automatic Discovery and Summarization Method of Battlefield Situation Related Documents using Natural Language Processing and Collaborative Filtering (자연어 처리 및 협업 필터링 기반의 전장상황 관련 문서 자동탐색 및 요약 기법연구)

  • Kunyoung Kim;Jeongbin Lee;Mye Sohn
    • Journal of Internet Computing and Services
    • /
    • v.24 no.6
    • /
    • pp.127-135
    • /
    • 2023
  • With the development of information and communication technology, the amount of information produced and shared on the battlefield, and stored and managed in systems, has increased dramatically. This means that the amount of information available to support commanders' situational awareness and decision making has grown, but it also hinders rapid decision making by increasing the information overload on commanders. To overcome this limitation, this study proposes a method to automatically search, select, and summarize documents that can help commanders understand the battlefield situation reports they receive. First, named entities are discovered from the battlefield situation report using a named entity recognition method. Second, the documents related to each named entity are discovered. Third, a language model and collaborative filtering are used to select the documents: the language model calculates the similarity between the received report and the discovered documents, and collaborative filtering reflects the commander's document reading history. Finally, sentences containing each named entity are selected from the documents and sorted. The experiment was carried out using academic papers, since their characteristics are similar to those of military documents, and the validity of the proposed method was verified.
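
The document-selection step blends two scores. A minimal sketch follows, with TF-IDF cosine similarity standing in for the paper's language model and a simple user-document reading matrix standing in for its collaborative filtering; the texts, reading history, and blend weight are all invented.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

report = "enemy armor sighted near the river crossing"
docs = [
    "river crossing operations and bridge security",
    "logistics schedule for the supply convoy",
    "armor engagement tactics at water obstacles",
]

# Content score: similarity between the received report and each document.
vec = TfidfVectorizer().fit([report] + docs)
sim = cosine_similarity(vec.transform([report]), vec.transform(docs))[0]

# CF score: fraction of commanders who previously read each document
# (rows: commanders, columns: documents; 1 = read). Illustrative data.
history = np.array([[1, 0, 1],
                    [1, 0, 0],
                    [0, 1, 1]])
cf = history.mean(axis=0)

alpha = 0.7  # blend weight between content and CF scores (assumed)
score = alpha * sim + (1 - alpha) * cf
for s, d in sorted(zip(score, docs), reverse=True):
    print(round(float(s), 3), d)
```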

Metadata extraction using AI and advanced metadata research for web services (AI를 활용한 메타데이터 추출 및 웹서비스용 메타데이터 고도화 연구)

  • Sung Hwan Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.2
    • /
    • pp.499-503
    • /
    • 2024
  • Broadcast programs are provided not only through the broadcaster's own channel but also to various media such as Internet replay, OTT, and IPTV services. In this context, it is very important to provide search keywords that represent the characteristics of the content well. Broadcasters mainly rely on manually entering key keywords during production and archiving. This approach yields too little core metadata and also limits content recommendation and reuse in other media services. This study secures a large volume of metadata by utilizing closed-caption data pre-archived through the DTV closed-captioning server developed at EBS. First, core metadata was automatically extracted by applying Google's natural language AI technology. As the core of the research, a method is then proposed for identifying core metadata that reflects priorities and content characteristics. To obtain differentiated metadata weights, keyword importance was ranked by applying the TF-IDF calculation method, and the experiment yielded usable weight data. The string metadata obtained in this study, combined with future work on string similarity measurement, can form the basis for sophisticated content recommendation metadata in content services provided to other media.
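
The TF-IDF weighting step can be illustrated briefly. This is a minimal sketch over toy caption text; the transcripts are invented, and scikit-learn's TfidfVectorizer stands in for whatever implementation was actually used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical closed-caption transcripts, one string per program.
captions = [
    "volcano eruption lava ash cloud evacuation",
    "lava flows and volcanic ash monitoring",
    "election results and voter turnout analysis",
]

vec = TfidfVectorizer()
tfidf = vec.fit_transform(captions)
terms = vec.get_feature_names_out()

# The top-weighted terms per program become candidate metadata keywords:
# terms frequent in one program but rare across programs rank highest.
for i in range(tfidf.shape[0]):
    row = tfidf[i].toarray().ravel()
    top = row.argsort()[::-1][:3]
    print(i, [(terms[j], round(row[j], 2)) for j in top])
```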

Application of Molecular Biological Technique for Development of Stability Indicator in Uncontrolled Landfill (불량매립지 안정화 지표 개발을 위한 분자생물학적 기술의 적용)

  • Park, Hyun-A;Han, Ji-Sun;Kim, Chang-Gyun;Lee, Jin-Young
    • Journal of Korean Society of Environmental Engineers
    • /
    • v.28 no.2
    • /
    • pp.128-136
    • /
    • 2006
  • This study was conducted to develop stability parameters for uncontrolled landfills through a molecular biological investigation of the microbial community growing in the leachate plume. Landfill J (in Cheonan) and landfill T (in Wonju) were chosen for this study from a total of 244 closed uncontrolled landfills. The genetic diversity of the microbial community in the leachate was examined by 16S rDNA gene cloning using PCR, and quantitative analyses of denitrifiers and methanotrophs were compared with conventional water quality parameters. In the BLAST search, 47.6% of the cloned genes in landfill J and 32.5% in landfill T showed more than 97% similarity, with the phylum Proteobacteria observed most frequently. The numbers of denitrification genes, i.e., the nirS and cnorB genes, at site J were 7 and 4 times higher, respectively, than at site T, which reflects the difference in time since closure (7 and 13 years, respectively). In addition, quantitative analysis of the methane formation gene showed that spot J1, immediately bordering the source, had the greatest number of methane-forming bacteria, and the number decreased rapidly toward the outer boundary of the landfill. A comparison between the gene counts (nirS, cnorB, and MCR genes) and the conventional monitoring parameters (TOC, NH₃-N, NO₃-N, NO₂-N, Cl⁻, and alkalinity) showed correlations above 99%, except for NO₃-N. It was concluded that the molecular biological investigation was well consistent with the conventional monitoring parameters in interpreting the influence and stability of the leachate plume formed downgradient of the uncontrolled sites.
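
The gene-count versus water-quality comparison is, computationally, a plain correlation analysis. A minimal sketch follows; the variable names and paired values are invented for illustration and do not reproduce the study's data.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative paired samples: nirS gene copies vs. TOC at several wells.
nirs_copies = np.array([1.2e5, 3.4e5, 8.1e5, 2.2e6, 4.7e6])
toc_mg_l = np.array([14.0, 31.0, 66.0, 180.0, 390.0])

# Gene abundances span orders of magnitude, so correlate on a log scale.
r, p = pearsonr(np.log10(nirs_copies), np.log10(toc_mg_l))
print(f"r = {r:.3f}, p = {p:.3g}")
```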

Importance of Strain Improvement and Control of Fungal cells Morphology for Enhanced Production of Protein-bound Polysaccharides(β-D-glucan) in Suspended Cultures of Phellinus linteus Mycelia (Phellinus linteus의 균사체 액상배양에서 단백다당체(β-D-glucan)의 생산성 향상을 위한 균주 개량과 배양형태 조절의 중요성)

  • Shin, Woo-Shik;Kwon, Yong Jung;Jeong, Yong-Seob;Chun, Gie-Taek
    • Korean Chemical Engineering Research
    • /
    • v.47 no.2
    • /
    • pp.220-229
    • /
    • 2009
  • Strain improvement and morphology investigation in bioreactor cultures were undertaken in suspended cultures of Phellinus linteus mycelia for the mass production of protein-bound polysaccharides (soluble β-D-glucan), a powerful immuno-stimulating agent. The Phellinus sp. screened for this research was identified as Phellinus linteus through ITS rDNA sequencing and a BLAST search, showing 99.7% similarity to other Phellinus linteus strains. An intensive strain improvement program was carried out by obtaining large amounts of protoplasts for the isolation of single-cell colonies. Rapid, large-scale screening of high-yielding producers was possible because the large numbers of protoplasts (1×10⁵~10⁶ protoplasts/ml) formed by the banding filtration method with cell wall-disrupting enzymes could be regenerated at a relatively high regeneration frequency (10⁻²~10⁻³) in the newly developed regeneration medium. Strains showing high performance on the protoplast regeneration and solid growth media produced 5.8~6.4% (w/w) of β-D-glucan and 13~15 g/L of biomass in a stable manner in suspended shake-flask cultures of P. linteus mycelia. In addition, increasing cell mass proved to be the most important factor for enhancing β-D-glucan productivity during the strain improvement program, since the amount of β-D-glucan extracted from the cell wall of P. linteus mycelia was almost constant per unit biomass. We therefore investigated fungal cell morphology, generally known as one of the key factors affecting cell growth in bioreactor cultures of mycelial fungi. We found that, to obtain as much cell mass as possible in the final production bioreactor cultures, the producing cells should be proliferated in condensed filamentous forms in the growth cultures, and optimum amounts of these filamentous cells should be transferred as active inoculum to the production bioreactor. In this case, ideal morphologies consisting of compact pellets less than 0.5 mm in diameter were successfully induced in the production cultures, resulting in a shorter lag phase, a 1.5-fold higher specific cell growth rate, and a 3.3-fold increase in final biomass production compared with parallel bioreactor cultures of different morphological forms. It was concluded that not only high yield but also favorable morphological characteristics led to the significantly higher biomass production and β-D-glucan productivity in the final production cultures.

A Study on the Impact Factors of Contents Diffusion in Youtube using Integrated Content Network Analysis (일반영향요인과 댓글기반 콘텐츠 네트워크 분석을 통합한 유튜브(Youtube)상의 콘텐츠 확산 영향요인 연구)

  • Park, Byung Eun;Lim, Gyoo Gun
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.3
    • /
    • pp.19-36
    • /
    • 2015
  • Social media is an emerging issue in content services and in the current business environment. YouTube is the most representative social media service in the world, and it differs from conventional content services in its open user participation and content creation methods. To promote a video on YouTube, it is important to understand the diffusion phenomena of content and the structural characteristics of the content network. Most previous studies analyzed impact factors of content diffusion from the viewpoint of general behavioral factors, while some recent work uses network structure factors; the two approaches, however, have been applied separately. This study analyzes the general impact factors on view count and the content-based network structure together. In addition, when building the content network, this study forms the network structure by analyzing user comments on 22,370 YouTube videos, rather than building a network of individual users. We statistically re-verified the relations between view count and both the general factors and the network factors. By analyzing this integrated research model, we found that the factors affect YouTube view counts in the following order of strength: uploader followers, video age, betweenness centrality, comments, closeness centrality, clustering coefficient, and rating, while degree centrality and eigenvector centrality affect view count negatively. This research suggests the following strategic points for promoting content diffusion. First, general factors such as the number of uploader followers or subscribers, video age, number of comments, and average rating need to be managed; the impact of average rating is less important than previously thought, but it pays to grow the follower base strategically and to keep content available in the service as long as possible. Second, attention should be paid to the impacts of betweenness centrality and closeness centrality among the network factors. Users seem to search for related subjects or similar content after watching a video, so it helps to shorten the distance to other popular content in the service; that is, view counts benefit from reducing the number of search steps and increasing similarity with many other videos, which is consistent with the clustering coefficient result. Third, it is important to note the negative impact of degree centrality and eigenvector centrality on view count. If a video has too many connections to other videos, many similar videos exist and the views may be distributed across them. Likewise, a very high eigenvector centrality means the video is connected to popular videos, which may draw views away, so connections to overly dominant popular content are better avoided. In summary, we analyzed and verified the diffusion factors of YouTube content using an integrated model of general factors and network structure factors. In terms of social contribution, this study may provide useful information for the music and movie industries and other content vendors seeking effective content services.
This research provides basic schemes that can be applied strategically in online content marketing. One limitation of this study is that it formed a content-based network for the network structure analysis, which is an indirect way to observe the content network structure; more direct methods of establishing a content network could be used. Further research could include more detailed analyses by content type, domain, or characteristics of the content or users.
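
The network factors named above are standard graph centralities. A minimal sketch computing them on a toy comment-based content network follows; the videos and edges are invented, and the regression against view counts is omitted for brevity.

```python
import networkx as nx

# Toy content network: nodes are videos; an edge links two videos that
# share commenters (the construction used in the study, data invented).
G = nx.Graph([("v1", "v2"), ("v1", "v3"), ("v2", "v3"),
              ("v3", "v4"), ("v4", "v5")])

features = {
    "degree":      nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness":   nx.closeness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G),
    "clustering":  nx.clustering(G),
}

# One feature row per video, ready to regress against view counts
# alongside general factors such as follower count and video age.
for v in G.nodes:
    print(v, {name: round(vals[v], 3) for name, vals in features.items()})
```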

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems
    • /
    • v.21 no.1
    • /
    • pp.103-122
    • /
    • 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some online documents are accompanied by a list of keywords specified by the authors in an effort to guide users by facilitating the filtering process. A set of keywords is thus often considered a condensed version of the whole document and therefore plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents do not yet benefit from keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, implementation is the obstacle: manually assigning keywords to all documents is daunting, even impractical, because it is extremely tedious, time-consuming, and requires a certain level of domain knowledge. It is therefore highly desirable to automate the keyword generation process. There are two main approaches to this aim: keyword assignment and keyword extraction. Both use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts; in other words, keyword assignment selects the words from a controlled vocabulary that best describe a document. Although this approach is domain dependent and not easy to transfer or expand, it can generate implicit keywords that do not appear in a document. In the latter approach, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. Here, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques: keyword extraction algorithms classify candidate keywords in a document as positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as its keywords, so keyword extraction is limited to terms that appear in the document and cannot generate implicit keywords. According to Turney's experimental results, about 64% to 90% of author-assigned keywords can be found in the full text of an article; conversely, this means 10% to 36% of author-assigned keywords do not appear in the article and cannot be generated by keyword extraction algorithms. Our preliminary experiment likewise shows that 37% of author-assigned keywords are not included in the full text. This is why we adopted the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. The keyword assignment process of IVSM is as follows: (1) calculate the vector length of each keyword set based on each keyword's weight; (2) preprocess and parse a target document that has no keywords; (3) calculate the vector length of the target document based on term frequency; (4) measure the cosine similarity between each keyword set and the target document; and (5) generate the keywords with high similarity scores. Two keyword generation systems were implemented using IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The former is deployed in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone system is dedicated to generating keywords for academic papers and has been tested on papers published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. In our experiments, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. Both systems perform much better than baseline systems that generate keywords based on simple probability, and IVSM shows performance comparable to Extractor, a representative keyword extraction system developed by Turney. As electronic documents increase, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
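
The five-step assignment process maps directly onto a few lines of vector arithmetic. Below is a minimal sketch, assuming keyword sets are already represented as term-weight dictionaries; the toy vocabulary, weights, and document are invented, not the authors' data.

```python
import math
from collections import Counter

# Step 1: keyword sets as term-weight vectors (illustrative weights).
keyword_sets = {
    "logistics": {"shipping": 0.9, "port": 0.7, "cargo": 0.6},
    "retail":    {"distribution": 0.8, "store": 0.7, "consumer": 0.6},
}

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 2-3: parse the target document and build a term-frequency vector.
doc = "port congestion delayed cargo shipping schedules at the port"
doc_vec = dict(Counter(doc.split()))

# Steps 4-5: rank keyword sets by cosine similarity to the document and
# emit the highest-scoring sets as the assigned keywords.
for name, kws in sorted(keyword_sets.items(),
                        key=lambda kv: cosine(kv[1], doc_vec),
                        reverse=True):
    print(name, round(cosine(kws, doc_vec), 3))
```

Note the inversion that gives the model its name: rather than indexing documents and matching a query, each keyword set acts as the "document" and the target document acts as the "query".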

Context Sharing Framework Based on Time Dependent Metadata for Social News Service (소셜 뉴스를 위한 시간 종속적인 메타데이터 기반의 컨텍스트 공유 프레임워크)

  • Ga, Myung-Hyun;Oh, Kyeong-Jin;Hong, Myung-Duk;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.4
    • /
    • pp.39-53
    • /
    • 2013
  • The emergence of Internet technology and SNS has increased information flow and changed the way people communicate, from one-way to two-way. Users not only consume and share information; they can also create it and share it with friends across social network services. Social media has thus become one of the most important communication tools, a trend that includes Social TV. Social TV is a form in which people watch a TV program and at the same time share information about it or its content with friends through social media. Social news, also known as participatory social media, is gaining popularity; it shapes user interest in societal issues through the Internet and builds news credibility based on user reputation. However, conventional news service platforms focus only on news recommendation. Recent developments in SNS have changed this landscape by allowing users to share and disseminate news, but conventional platforms provide no dedicated way to share it. Current social news services only let users access a news item as a whole; users cannot access the parts of the content related to their interests. For example, if a user is interested in only part of a news item and wants to share that part, it is still hard to do so, and in the worst case users may understand the news in a different context. To solve this, a social news service must provide a method for supplying additional information. For example, Yovisto, an academic video search service, provides time-dependent metadata for videos: users can search and watch parts of a video according to the time-dependent metadata, and they can share content with friends in social media. Yovisto segments or synchronizes a video whenever the accompanying slide presentation changes pages. However, this method cannot be applied to news video, which has no accompanying slide presentation, so a segmentation method is required to split the news video and create time-dependent metadata. In this paper, a time-dependent metadata-based framework is proposed to segment news content and provide time-dependent metadata so that users can use context information to communicate with their friends. The news transcript is divided using the proposed story segmentation method. A tag represents the entire content of the news, and sub-tags indicate each segmented story together with its starting time. The time-dependent metadata helps users track the news information, allows them to comment on each segment, and lets them share the news, based on the time metadata, either as segments or as a whole, which helps recipients understand the shared news. To demonstrate performance, we evaluated the accuracy of story segmentation and of tag generation. We measured story segmentation accuracy through semantic similarity and compared it with a benchmark algorithm. Experimental results show that the proposed method outperforms the benchmark algorithms in segmentation accuracy. Note that sub-tag accuracy is the most important part of the proposed framework for sharing a specific news context with others.
To extract more accurate sub-tags, we created a stop-word list of terms unrelated to the news content, such as the names of anchors and reporters, and applied it to the framework. We analyzed the accuracy of the tags and sub-tags that represent the context of the news. The analysis suggests that the proposed framework helps users share their opinions with context information in social media and social news.
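
Story segmentation by semantic similarity can be illustrated with a simple boundary detector over timestamped transcript sentences. In the sketch below, the sentences, timestamps, and threshold are invented, and TF-IDF cosine similarity stands in for whatever similarity measure the framework actually uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# (timestamp_seconds, transcript_sentence) pairs from a hypothetical broadcast.
transcript = [
    (0,  "heavy rain flooded several districts overnight"),
    (12, "rescue teams evacuated residents from flooded homes"),
    (25, "in sports the national team won the qualifier"),
    (38, "the striker scored twice in the second half"),
]

sents = [s for _, s in transcript]
X = TfidfVectorizer().fit_transform(sents)

# Cut a story boundary wherever adjacent sentences are dissimilar.
THRESHOLD = 0.1  # assumed; would be tuned on labeled data
segments, start = [], 0
for i in range(len(sents) - 1):
    if cosine_similarity(X[i], X[i + 1])[0, 0] < THRESHOLD:
        segments.append((transcript[start][0], sents[start:i + 1]))
        start = i + 1
segments.append((transcript[start][0], sents[start:]))

# Each segment's start time becomes the anchor of a time-dependent sub-tag,
# letting users link, comment on, or share that segment alone.
for t0, body in segments:
    print(f"t={t0}s:", " / ".join(body))
```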