Search | Korea Science

Patent Document Similarity Based on Image Analysis Using the SIFT-Algorithm and OCR-Text

Park, Jeong Beom;Mandl, Thomas;Kim, Do Wan
- International Journal of Contents / v.13 no.4 / pp.70-79 / 2017
Images are an important element of patents, and many experts use them to analyze a patent or to check differences between patents. However, there is little research on image analysis for patents, partly because image processing is an advanced technology and partly because patent images typically combine visual parts with text and numbers. This study suggests two methods for using image processing: the Scale Invariant Feature Transform (SIFT) algorithm and Optical Character Recognition (OCR). The first method uses SIFT image feature points; through feature matching, it can be applied to calculate the similarity between documents containing these images. In the second method, OCR is used to extract text from the images. Using the numbers extracted from an image, it is possible to locate the corresponding related text within the text passages. Subsequently, document similarity can be calculated based on the extracted text. Comparing the suggested methods with an existing method that calculates similarity from text alone demonstrates their feasibility. Additionally, the correlation between the two similarity measures is low, which shows that they capture different aspects of the patent content.
https://doi.org/10.5392/IJoC.2017.13.4.070
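The OCR-based method in the abstract above links numbers found in a drawing to the text passages that mention them. The following is a minimal sketch of that idea only, not the authors' implementation: the OCR step itself is omitted, the sample strings are invented, and the helper names (`numerals_from_ocr`, `related_sentences`, `jaccard`) and the choice of Jaccard overlap as the similarity measure are assumptions made for illustration.

```python
import re

def numerals_from_ocr(ocr_text):
    """Return the set of digit runs (reference numerals) found in OCR output."""
    return set(re.findall(r"\b\d+\b", ocr_text))

def related_sentences(description, numerals):
    """Keep the sentences of the description that mention any of the numerals.

    Sentence splitting is naive (split after ., ! or ?), which is enough for
    the toy input below but would mis-split abbreviations such as "FIG. 1".
    """
    sentences = re.split(r"(?<=[.!?])\s+", description.strip())
    return [s for s in sentences
            if numerals & set(re.findall(r"\b\d+\b", s))]

def jaccard(tokens_a, tokens_b):
    """Jaccard overlap of two token collections (assumed similarity measure)."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical OCR output of one figure and an invented description excerpt.
ocr_fig = "10 12 FIG. 1"
desc = ("The housing 10 holds a lens 12. The cable 30 is optional. "
        "A lens 12 focuses light.")

linked = related_sentences(desc, numerals_from_ocr(ocr_fig))
# linked now holds only the sentences that mention numeral 10 or 12.
```

The linked sentences of two patents could then be tokenized and compared with `jaccard` (or any other text similarity) to obtain a document-level score.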

Social Tagging-based Recommendation Platform for Patented Technology Transfer (특허의 기술이전 활성화를 위한 소셜 태깅기반 지적재산권 추천플랫폼)

Park, Yoon-Joo
- Journal of Intelligence and Information Systems / v.21 no.3 / pp.53-77 / 2015
Korea has witnessed an increasing number of domestic patent applications, but a majority of them are not utilized to their full potential and end up becoming obsolete. According to the 2012 National Congress' Inspection of Administration, about 73% of patents possessed by universities and public-funded research institutions failed to create social value and remain latent. One of the main causes of this problem is that patent creators such as individual researchers, universities, or research institutions lack the ability to commercialize their patents into viable businesses with the enterprises that need them. From the enterprises' side, it is also hard to find appropriate patents by keyword search alone. This study proposes a patent recommendation system that can identify and recommend intellectual property rights matching users' fields of interest, among a rapidly accumulating number of patent assets, in an easier and more efficient manner. The proposed system extracts core contents and technology sectors from the existing pool of patents and combines them with secondary social knowledge, derived from tag information created by users, in order to find the best patents to recommend to users. That is to say, in an early stage where there is no accumulated tag information, the recommendation is made by utilizing content characteristics, which are identified through an analysis of keywords contained in fields such as 'Title of Invention' and 'Claim' among the various patent attributes. To do this, the suggested system extracts only nouns from patents and assigns a weight to each noun according to its importance across all patents by performing TF-IDF analysis. It then finds patents whose weights are similar to those of the patents a user prefers. In this paper, this similarity is called 'Domain Similarity'.
Next, the suggested system extracts the technology sector's characteristics from patent documents by analyzing the international technology classification code (International Patent Classification, IPC). Every patent has more than one IPC code, and each user can attach more than one tag to the patents they like. Thus, each user has a set of IPC codes included in tagged patents. The suggested system manages this IPC set to analyze the technology preference of each user and find well-fitting patents for them. To do this, the suggested system calculates a 'Technology_Similarity' between the user's set of IPC codes and the IPC codes contained in all other patents. After that, when the tag information of multiple users has accumulated, the system expands the recommendations in consideration of other users' social tag information relating to the patent tagged by the user concerned. The similarity between the tag information of a user's preferred patents and that of other patents is called 'Social Similarity' in this paper. Lastly, a 'Total Similarity' is calculated by adding these three similarities, and the patents having the highest 'Total Similarity' are recommended to each user. The suggested system is applied to a total of 1,638 Korean patents obtained from the Korea Industrial Property Rights Information Service (KIPRIS) run by the Korea Intellectual Property Office. However, since this original dataset does not include tag information, we created virtual tag information and used it to construct a semi-virtual dataset. The proposed recommendation algorithm was implemented in Java, and a prototype graphical user interface was also designed for this study. As the proposed system has no dependent variables and uses virtual data, it is impossible to verify the recommendation system with a statistical method.
Therefore, the study uses a scenario test method to verify the operational feasibility and recommendation effectiveness of the system. The results of this study are expected to improve the possibility of matching promising patents with the most suitable businesses. It is assumed that users' experiential knowledge can be accumulated, managed, and utilized in the As-Is patent system, which currently only manages standardized patent information.
https://doi.org/10.13088/jiis.2015.21.3.53
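The abstract above combines three scores: a TF-IDF-based domain similarity, an IPC-based technology similarity, and a tag-based social similarity, summed into a total. The sketch below illustrates that structure under stated assumptions: the paper was implemented in Java and its exact formulas are not given in the abstract, so the cosine measure for TF-IDF vectors, the Jaccard measure for IPC and tag sets, the function names, and the toy data are all illustrative choices.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: {patent_id: [noun tokens]} -> {patent_id: {term: tf-idf weight}}."""
    n = len(docs)
    # Document frequency: in how many patents each term occurs.
    df = Counter(t for toks in docs.values() for t in set(toks))
    return {pid: {t: (c / len(toks)) * math.log(n / df[t])
                  for t, c in Counter(toks).items()}
            for pid, toks in docs.items()}

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def jaccard(a, b):
    """Set overlap, used here (as an assumption) for IPC codes and tags."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def total_similarity(domain, technology, social):
    """The abstract states that the three similarities are simply added."""
    return domain + technology + social

# Invented toy corpus of noun tokens per patent.
docs = {
    "p1": ["battery", "electrode", "cell"],
    "p2": ["battery", "electrode", "anode"],
    "p3": ["antenna", "signal", "cell"],
}
vecs = tfidf_vectors(docs)
domain_p1_p2 = cosine(vecs["p1"], vecs["p2"])          # shares two nouns
technology = jaccard({"H01M", "B60L"}, {"H01M"})       # shared IPC codes
social = jaccard({"ev", "storage"}, {"storage"})       # shared user tags
score = total_similarity(domain_p1_p2, technology, social)
```

In the cold-start stage described in the abstract, the social term would be zero and the ranking would rest on the first two terms only.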

The Identification of Emerging Technologies of Automotive Semiconductor

Daekyeong Nam;Gyunghyun Choi
- KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.663-677 / 2023
As the paradigm of future vehicles changes, interest in automotive semiconductors, which play a key role in realizing this change, is increasing. Automotive semiconductors are a technology with very high entry barriers that requires a lot of effort and time, because it must secure a sufficient technology readiness level and also satisfy safety and reliability requirements. In this technology field, it is very important to develop new businesses and create opportunities through technology trend analysis. However, systematic analysis and application of automotive semiconductor technology trends are currently lacking. In this paper, U.S. registered patent documents related to automotive semiconductors were collected and investigated based on each patent's IPC. The main technologies of automotive semiconductors were analyzed through topic modeling, and technology paths, including emerging technologies, were investigated through cosine similarity. We identified emerging technologies such as driving control for vehicles and AI services. We observed that, as time passed, both convergence and independence of automotive semiconductor technology proceeded simultaneously.
https://doi.org/10.3837/tiis.2023.02.021

A Model for Measuring the R&D Project Similarity using Patent Information (특허 정보를 활용한 R&D 과제 유사도 측정 모델)

Kim, Jong-Bae;Byun, Jung-Won;Sun, Dong-Ju;Kim, Tae-Gyun;Kim, Yung
- Journal of the Korea Institute of Information and Communication Engineering / v.18 no.5 / pp.1013-1021 / 2014
For efficient investment of government budgets, it is important to analyze the similarities of R&D projects. Existing studies have proposed techniques for analyzing similarities using keywords or segments; however, these techniques have low accuracy. We propose a technique for measuring the similarity of projects using patent information. To achieve our goal, we suggest three metrics based on mathematical theories, namely set theory and probability theory. To validate our technique, we perform case studies covering 156 R&D projects and 160,218 patent records.
https://doi.org/10.6109/jkiice.2014.18.5.1013
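The abstract above does not name its three metrics, so the following is only an illustration of what set-theoretic and probability-style measures over patent information can look like. It assumes each R&D project is represented by the set of patent identifiers associated with it; the two measures shown (Jaccard overlap and a conditional-overlap ratio) and all names and data are hypothetical, not the paper's definitions.

```python
def jaccard(a, b):
    """Set-theoretic similarity: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def conditional_overlap(a, b):
    """Probability-style measure: share of A's patents that also occur in B,
    i.e. an estimate of P(patent in B | patent in A) = |A ∩ B| / |A|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0

# Invented patent sets for two R&D projects.
project_a = {"P1", "P2", "P3"}
project_b = {"P2", "P3", "P4", "P5"}

sim = jaccard(project_a, project_b)                  # 2 shared of 5 distinct
a_in_b = conditional_overlap(project_a, project_b)   # 2 of A's 3 patents
b_in_a = conditional_overlap(project_b, project_a)   # 2 of B's 4 patents
```

The conditional measure is asymmetric, which can be useful when a small new proposal is checked against a large existing project.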

A System for Measuring the Similarity and Redundancy of R&D Project (R&D 과제의 유사도 및 중복도 측정 시스템에 관한 연구)

Choi, Kook-Hyun;Kang, Yong-Suk;Kim, Jong-Hee;Shin, Yong-Tae;Kim, Jong-Bae
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.329-331 / 2014
The analysis of the similarities and redundancies among R&D projects is important for the efficient investment of government budgets. When government R&D projects are planned, the redundancies of research tasks are examined by institutions specializing in research management, relevant offices and departments, and the government to prevent redundant funding. However, existing similarity analyses depend on methods wherein new task proposals and existing R&D project proposals are compared and looked up based on keywords. As a result, similarity cannot be accurately measured when the task name is partially modified or technical terms are substituted. This study aims to use patent information as characteristics by which R&D project documents can be identified. The patent data used is based on materials officially published by the government's R&D patent trend survey project (http://ipas.rndip.re.kr). The study proposes a method by which patent information can be used to analyze the similarity and redundancy among R&D projects when new projects are entered. For this purpose, a similarity measurement model based on set theory and probability theory is presented. The presented measurement model is implemented in an actual system to identify redundant documents and to calculate and show their similarity.

A Study on Developing a Prediction Model of Patent Citation Counts (특허인용 예측모형 구축에 관한 연구)

Yoo, Jae-Bok;Chung, Young-Mee
- Journal of the Korean Society for Information Management / v.27 no.4 / pp.239-258 / 2010
The purpose of this study is to develop a prediction model of patent citation counts based on major factors that affect patent citation. To this end, we performed multiple regression analysis between the patent citation counts and five explanatory variables (the number of pages, the number of claims, the reference-average-citation rate, the strength of bibliographic coupling, and the document similarity), each shown to have 5% or more standardized variance ($r^2$) with patent citation counts, using a test dataset of U.S. patents in five subject fields. As a result, our prediction models showed 58.3% to 89.6% predictability depending on the subject field and revealed that document similarity has the highest impact on citation counts among the five predictive variables in all the subject fields. A comparison between the predicted citation counts and the actual ones confirmed the usefulness of the citation prediction models built for each subject field.
https://doi.org/10.3743/KOSIM.2010.27.4.239

LDA Topic Modeling and Recommendation of Similar Patent Document Using Word2vec (LDA 토픽 모델링과 Word2vec을 활용한 유사 특허문서 추천연구)

Apgil Lee;Keunho Choi;Gunwoo Kim
- Information Systems Review / v.22 no.1 / pp.17-31 / 2020
With the start of the fourth industrial revolution era, technologies from various fields are merging, and new types of technologies and products are being developed. In addition, the importance of registering intellectual property rights and patents to gain market dominance is increasing overseas as well as domestically. Accordingly, the number of patents to be processed per examiner is increasing every year, so the time and cost of prior art research are increasing. Therefore, a number of studies have been carried out to reduce the examination time and cost for patent-pending technology. This paper proposes a method to calculate the degree of similarity among patent documents of the same priority claim when a plurality of patent rights priority claims are filed and to provide them to the examiner and the patent applicant. To this end, we preprocessed the data of the existing irregular patent documents, used Word2vec to obtain the similarity between patent documents, and then proposed a recommendation model that recommends similar patent documents in descending order of score. This makes it possible for the examiner to promptly refer to the examination history of patent documents judged to be similar at the time of examination, thereby reducing the workload, and enables efficient search in the applicant's prior art research. We expect it to contribute greatly to both.
https://doi.org/10.14329/isr.2020.22.1.017
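The recommendation step described above (score documents by embedding similarity, then list them in descending order) can be sketched as follows. This is an illustration under assumptions, not the paper's pipeline: the two-dimensional vectors are hand-made stand-ins for trained Word2vec embeddings, averaging word vectors into a document vector is one common choice that the abstract does not confirm, and the function names are hypothetical.

```python
import math

def doc_vector(tokens, emb):
    """Average the embeddings of the known tokens; None if none are known."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(query_tokens, corpus, emb):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    q = doc_vector(query_tokens, emb)
    scored = []
    for doc_id, toks in corpus.items():
        d = doc_vector(toks, emb)
        if q is not None and d is not None:
            scored.append((doc_id, cosine(q, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hand-made toy embeddings standing in for a trained Word2vec model.
emb = {
    "battery": [1.0, 0.0],
    "electrode": [0.9, 0.1],
    "antenna": [0.0, 1.0],
    "signal": [0.1, 0.9],
}
corpus = {"A": ["electrode", "battery"], "B": ["antenna", "signal"]}
ranking = recommend(["battery"], corpus, emb)
```

With a real model, `emb` would be replaced by the learned word vectors, and the rest of the ranking logic would stay the same.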

Analysis of Factors Influencing Patent Citations (특허 인용에 영향을 미치는 요인 분석)

Yoo, Jae-Bok;Chung, Young-Mee
- Journal of the Korean Society for Information Management / v.27 no.1 / pp.103-118 / 2010
Recently, the valuation of patented technology has been greatly emphasized, and patent citation has been accepted as a very useful index of this technology. In this study, we performed correlation analyses between the patent citation counts and 17 explanatory variables covering morphological, technological, and conceptual factors, using a test dataset of U.S. patents in five subject fields. Seven variables having 5% or more standardized variance ($r^2$) with patent citation counts were identified: number of pages, number of claims, reference-average-citation rate, patent increase/decrease rate, strength of bibliographic coupling, co-citation counts, and document similarity. The result of the ANOVA test shows that the mean values of these variables vary among most subject fields.
https://doi.org/10.3743/KOSIM.2010.27.1.103
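For reference, the selection rule in the abstract above can be written out. This assumes that "5% or more standardized variance" means a squared Pearson correlation of at least 0.05 between an explanatory variable $x$ and the citation count $y$ over $n$ patents; the abstract does not spell out the formula, so this reading is an interpretation.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \qquad r^2 \ge 0.05
$$

Under this reading, a variable is retained when it accounts for at least 5% of the variance in citation counts, which corresponds to $|r| \ge \sqrt{0.05} \approx 0.224$.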

Identifying Similar Overseas Patent Using Word2Vec-Based Semantic Text Analytics (Word2Vec 학습을 통한 의미 기반 해외 유사 특허 검색 방안)

Paek, Minji;Kim, Namgyu
- Journal of Information Technology Services / v.17 no.2 / pp.129-142 / 2018
Recently, the number of patent applications has been increasing rapidly every year as protecting intellectual property rights becomes more important. Patents must be inventive and novel. In particular, novelty implies that the corresponding invention is not the same as a previous invention. To confirm novelty, a prior art search must be conducted before and after the application. The target of the prior art search should include not only Korean patents but also foreign patents, and the search of foreign patents should be supported by multilingual search techniques. However, a naive dictionary-based approach is limited because some technical concepts are represented by different terms in each nation. For example, a Korean term and a Japanese term may not be synonyms even though they represent the same technical concept. In this paper, we propose a new method to map semantic similarity between technical terms in Korean patents and Japanese patents. To investigate the different representations used in each nation for the same technical concept, we identified and analyzed pairs of patents that are mutually connected by a priority claim relationship. By performing an experiment with real-world data, we showed that our approach can successfully reveal semantically similar technical terms in another language.
https://doi.org/10.9716/KITS.2018.17.2.129

A Research on TF-IDF-based Patent Recommendation Algorithm using Technology Transfer Data (기술이전 데이터를 활용한 TF-IDF기반 특허추천 알고리즘 연구)

Junki Kim;Joonsoo Bae;Yeongheon Song;Byungho Jeong
- Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.78-88 / 2023
The increasing number of technology transfers from public research institutes in Korea has led to a growing demand for patent recommendation platforms for SMEs. This is because selecting the right technology for commercialization is a critical factor in business success. This study developed a patent recommendation system that uses technology transfer data from the past 10 years to recommend patents that are suitable for SMEs. The system was developed in three stages. First, an item-based collaborative filtering system was developed to recommend patents based on the similarities between the patents that SMEs have previously transferred. Next, a content-based recommendation system based on TF-IDF was developed to analyze patent names and recommend patents with high similarity. Finally, a hybrid system was developed that combines the strengths of both recommendation systems. The experimental results showed that the hybrid system was able to recommend patents that were both similar and relevant to the SMEs' interests. This suggests that the system can be a valuable tool for SMEs that are looking to acquire new technologies.
https://doi.org/10.11627/jksie.2023.46.3.078
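The three stages in the abstract above (item-based collaborative filtering, a content-based score on patent names, and a hybrid of the two) can be sketched as below. Everything here is an assumption made for illustration: the abstract does not say how the two scores are combined, so a weighted sum with a hypothetical `alpha` is used; the content score is a plain token-overlap cosine standing in for the paper's TF-IDF analysis; and the transfer records and titles are invented.

```python
import math

def cosine_sets(a, b):
    """Cosine similarity of two sets viewed as binary vectors."""
    a, b = set(a), set(b)
    return len(a & b) / math.sqrt(len(a) * len(b)) if a and b else 0.0

def item_cf_scores(transfers, company):
    """Item-based CF: score each patent the company does not hold by how
    similar its adopter set is to that of a patent the company already holds."""
    adopters = {}
    for c, patents in transfers.items():
        for p in patents:
            adopters.setdefault(p, set()).add(c)
    owned = transfers[company]
    return {p: max((cosine_sets(adopters[p], adopters[o]) for o in owned),
                   default=0.0)
            for p in adopters if p not in owned}

def hybrid_scores(transfers, titles, company, alpha=0.5):
    """Weighted sum of the CF score and a title-overlap content score."""
    cf = item_cf_scores(transfers, company)
    owned = transfers[company]
    scores = {}
    for p, title in titles.items():
        if p in owned:
            continue
        content = max((cosine_sets(title.lower().split(),
                                   titles[o].lower().split()) for o in owned),
                      default=0.0)
        scores[p] = alpha * cf.get(p, 0.0) + (1 - alpha) * content
    return scores

# Invented technology-transfer records (company -> transferred patents).
transfers = {"sme1": {"A", "B"}, "sme2": {"A", "C"}, "sme3": {"D"}}
titles = {
    "A": "lithium battery electrode",
    "B": "battery cooling plate",
    "C": "battery electrode coating",
    "D": "antenna array",
}
scores = hybrid_scores(transfers, titles, "sme1")
```

In this toy data, patent C scores higher than D for `sme1` on both components: it shares an adopter with a patent `sme1` already holds, and its title overlaps with those patents.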

