• Title/Summary/Keyword: multi-keyword

Multi-Dimensional Keyword Search and Analysis of Hotel Review Data Using Multi-Dimensional Text Cubes (다차원 텍스트 큐브를 이용한 호텔 리뷰 데이터의 다차원 키워드 검색 및 분석)

  • Kim, Namsoo;Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture / v.11 no.1 / pp.63-73 / 2014
  • With the growth of the WWW, unstructured data such as text is attracting more and more user interest. Because this unstructured data is created by WWW users and expresses their subjective opinions, appropriate analysis can yield very useful information, such as users' personal tastes and perspectives. In this paper, we provide efficient and varied analysis of unstructured text documents by taking advantage of OLAP (On-Line Analytical Processing) multidimensional cube technology. OLAP cubes have been widely used for multidimensional analysis of structured data, such as simple alphabetic and numeric data, but they have not been used for unstructured data consisting of long texts. To provide multidimensional analysis of unstructured text data, the Text Cube model was recently proposed. It incorporates term frequency and the inverted index, which play key roles in information retrieval, as measures for searching and analyzing text databases. The primary goal of this paper is to apply the text cube model to a real data set from an Internet site that shares hotel information and to provide multidimensional analysis of users' written hotel reviews. To achieve this goal, we first build text cubes for the hotel review data. Using these text cubes, we design and implement a system that provides multidimensional keyword search features for searching and analyzing review texts along various dimensions. This system helps users easily obtain valuable summary information reflecting guests' subjective opinions. Furthermore, this paper evaluates the proposed system through various experiments, which demonstrate its effectiveness.
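
The text cube idea in the abstract above, with term frequency and an inverted index stored as measures per cube cell, can be sketched roughly as follows. This is a minimal illustration and not the authors' system: the two dimensions (city, rating), the sample reviews, and the function names `build_text_cube` and `search` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical hotel reviews: (city, rating, review text). Not data from the paper.
REVIEWS = [
    ("Seoul", 5, "clean room friendly staff"),
    ("Seoul", 2, "noisy room rude staff"),
    ("Busan", 5, "clean beach view friendly staff"),
    ("Busan", 4, "clean room great breakfast"),
]


def build_text_cube(reviews):
    """Build cells keyed by (city, rating); '*' stands for 'all values'."""
    tf = defaultdict(Counter)                         # cell -> term frequencies
    inverted = defaultdict(lambda: defaultdict(set))  # cell -> term -> doc ids
    for doc_id, (city, rating, text) in enumerate(reviews):
        terms = text.split()
        # Each review contributes to its base cell and to every roll-up of it.
        for cell in [(city, rating), (city, "*"), ("*", rating), ("*", "*")]:
            tf[cell].update(terms)
            for term in terms:
                inverted[cell][term].add(doc_id)
    return tf, inverted


def search(inverted, cell, keywords):
    """Return ids of the documents in `cell` that contain every keyword."""
    postings = [inverted[cell].get(k, set()) for k in keywords]
    return sorted(set.intersection(*postings)) if postings else []


tf, inverted = build_text_cube(REVIEWS)
print(tf[("Seoul", "*")].most_common(2))             # frequent terms for one city
print(search(inverted, ("*", 5), ["clean", "staff"]))  # multi-keyword search on a slice
```

Precomputing every roll-up cell keeps the sketch short; a real cube would more likely materialize only some cells and aggregate the rest on demand.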

Enabling Dynamic Multi-Client and Boolean Query in Searchable Symmetric Encryption Scheme for Cloud Storage System

  • Xu, Wanshan;Zhang, Jianbiao;Yuan, Yilin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.4 / pp.1286-1306 / 2022
  • Searchable symmetric encryption (SSE) provides a safe and effective solution for retrieving encrypted data on cloud servers. However, existing SSE schemes mainly focus on single-keyword search for a single client, which is inefficient for multiple keywords and cannot meet the needs of multiple clients. To address these drawbacks, we propose a scheme enabling dynamic multi-client and Boolean queries in searchable symmetric encryption for cloud storage systems (DMC-SSE). DMC-SSE achieves fine-grained multi-client access control in SSE through attribute-based encryption (ABE) and a novel access control list (ACL), and it supports Boolean queries over multiple keywords. In addition, DMC-SSE supports fully dynamic updates of clients and files. Compared with existing multi-client schemes, our scheme has the following advantages: 1) Dynamic. DMC-SSE not only supports the dynamic addition or deletion of multiple clients, but also supports the dynamic update of files. 2) Non-interactivity. After being authorized, a client can query keywords without the help of the data owner, and the data owner can dynamically update a client's permissions without requiring the client to stay online. Finally, the security analysis and experimental results demonstrate that our scheme is safe and efficient.
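
To make the notion of a Boolean query over multiple keywords concrete, here is a toy sketch in which the server evaluates AND / OR / NOT over keyed keyword tokens instead of plaintext keywords. It is not DMC-SSE: there is no ABE, no access control list, no dynamic update, and deterministic tokens leak search and access patterns. The function names, the key, and the sample files are hypothetical.

```python
import hashlib
import hmac


def token(key: bytes, keyword: str) -> str:
    """Derive a deterministic search token; the server never sees the keyword."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key, files):
    """Client side: map each keyword token to the set of file ids containing it."""
    index = {}
    for file_id, keywords in files.items():
        for kw in keywords:
            index.setdefault(token(key, kw), set()).add(file_id)
    return index


def server_query(index, all_of=(), any_of=(), none_of=()):
    """Server side: evaluate a Boolean query using tokens only."""
    result = set().union(*index.values()) if index else set()
    for t in all_of:                       # AND: keep files matching every token
        result &= index.get(t, set())
    if any_of:                             # OR: keep files matching at least one token
        result &= set().union(*(index.get(t, set()) for t in any_of))
    for t in none_of:                      # NOT: drop files matching any of these
        result -= index.get(t, set())
    return sorted(result)


key = b"demo-key-not-for-production"       # assumption: a shared secret for the demo
files = {
    "f1": ["cloud", "audit"],
    "f2": ["cloud", "backup"],
    "f3": ["audit", "backup"],
}
index = build_index(key, files)
hits = server_query(index, all_of=[token(key, "cloud")], none_of=[token(key, "backup")])
print(hits)
```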

Effective Searchable Symmetric Encryption System using Conjunctive Keyword on Remote Storage Environment (원격 저장소 환경에서 다중 키워드를 이용한 효율적인 검색 가능한 대칭키 암호 시스템)

  • Lee, Sun-Ho;Lee, Im-Yeong
    • The KIPS Transactions:PartC / v.18C no.4 / pp.199-206 / 2011
  • Removable storage offers excellent portability, being light and small enough to fit in one's hand, and many users have recently turned their attention to high-capacity products. However, because removable storage is so portable, it is frequently lost or stolen, which has caused many problems such as the leaking of private information to the public. The advent of remote storage services, where data is stored over the network, has allowed an increasing number of users to access data. The main data of many users is stored together on remote storage, which exposes it to disclosure by an unethical administrator or an attacker. To solve this problem, encrypting the data stored on the server has become necessary, and a searchable encryption system is needed to retrieve the encrypted data efficiently. However, existing searchable encryption systems suffer from inefficient document insert/delete operations and inefficient multi-keyword search. In this paper, an efficient searchable encryption system is proposed.

Multi-perspective User Preference Learning in a Chatting Domain (인터넷 채팅 도메인에서의 감성정보를 이용한 타관점 사용자 선호도 학습 방법)

  • Shin, Wook-Hyun;Jeong, Yoon-Jae;Myaeng, Sung-Hyon;Han, Kyoung-Soo
    • Journal of the Korea Society of Computer and Information / v.14 no.1 / pp.1-8 / 2009
  • Learning a user's preferences is a key issue in intelligent systems such as personalized services. Previous studies have adopted a simple user preference model, which determines a set of preferred keywords or topics and assigns a weight to each target. In this paper, we propose a multi-perspective user preference model that factors sentiment information into the model. It learns a user's preference from the topicality and sentiment information obtained with natural language processing techniques. To handle the time-variant nature of user preference, preference is calculated per session, over the short term, and over the long term. A user evaluation is used to validate the effect of user preference learning; it shows accuracies of 86.52%, 86.28%, and 87.22% for topic interest, keyword interest, and keyword favorableness, respectively.

A Study on Imagination of Product Design Concept by Mind Map (마인드 맵을 이용한 제품디자인 컨셉의 이미지화에 관한 연구)

  • 이종석;신수길
    • Archives of design research / v.13 no.4 / pp.137-144 / 2000
  • In today's information society, we encounter a wide variety of information in unordered or multi-dimensional form, and we need the ability to arrange and use it systematically. In general, the human brain conveys meaning most effectively when it perceives information through its own characteristics and imagination. This depends on grasping the core keywords and their interrelations, then organizing and expressing them simply, in order to take in information and represent one's own ideas. The purpose of this paper is to introduce product designers to radial thinking based on varied information as an aid to creativity, and to show how core keywords can be built up logically with a mind map in the process of abstracting a design concept. A case study is presented to aid understanding.

Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

  • Park, Jongin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.19-41 / 2019
  • With the rapidly increasing demand for text data analysis, research and investment in text mining are being actively conducted not only in academia but also in various industries. Text mining is generally conducted in two steps. In the first step, the text of the collected documents is tokenized and structured to convert the original documents into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are conducted according to the purpose of the analysis. Until recently, text mining studies focused on applications of the second step. However, with the discovery that the text structuring process substantially influences the quality of the analysis results, various embedding methods have been actively studied to improve that quality by preserving the meaning of words and documents when representing text data as vectors. Unlike structured data, which can be directly applied to a variety of operations and traditional analysis techniques, unstructured text must first undergo a structuring task that transforms the original document into a form the computer can understand. Mapping arbitrary objects into a space of a specific dimension while maintaining algebraic properties, in order to structure text data, is called "embedding". Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents. In particular, as the demand for document embedding analysis has increased rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into one vector, is the most widely used. However, the traditional document embedding method represented by doc2Vec generates a vector for each document using the whole corpus included in the document. As a result, the document vector is affected not only by core words but also by miscellaneous words. Additionally, traditional document embedding schemes usually map each document to a single corresponding vector, so it is difficult to represent a complex document with multiple subjects accurately. In this paper, we propose a new multi-vector document embedding method to overcome these limitations. This study targets documents that explicitly separate body content and keywords. For a document without keywords, the method can be applied after extracting keywords through various analysis methods; however, since this is not the core subject of the proposed method, we describe its application to documents with predefined keywords. The proposed method consists of (1) Parsing, (2) Word Embedding, (3) Keyword Vector Extraction, (4) Keyword Clustering, and (5) Multiple-Vector Generation. The specific process is as follows. First, all text in a document is tokenized and each token is represented as an N-dimensional real-valued vector through word embedding. Next, to overcome the influence of miscellaneous words, the vectors corresponding to the keywords of each document are extracted to form a set of keyword vectors for that document. Clustering is then conducted on each document's set of keyword vectors to identify the multiple subjects included in the document. Finally, a multi-vector is generated from the keyword vectors constituting each cluster. Experiments on 3,147 academic papers revealed that the traditional single-vector approach cannot properly map complex documents because of interference among subjects in each vector. With the proposed multi-vector method, we ascertained that complex documents can be vectorized more accurately by eliminating the interference among subjects.
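
Steps (3) to (5) of the abstract above (keyword vector extraction, keyword clustering, multiple-vector generation) might look like the following sketch. Several things here are assumptions rather than details from the paper: the 2-D embeddings are made up, plain k-means is used as the clustering step, and each cluster is summarized by the mean of its keyword vectors.

```python
# Hypothetical 2-D word embeddings; a real system would learn these with
# word2vec or a similar model over the whole corpus.
EMBEDDINGS = {
    "encryption": (0.9, 0.1), "cipher": (1.0, 0.0), "privacy": (0.8, 0.2),
    "clustering": (0.1, 0.9), "topic": (0.0, 1.0), "corpus": (0.2, 0.8),
}


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain k-means, seeded with the first k vectors so the run is deterministic."""
    centroids = list(vectors[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return clusters


def multi_vector(keywords, k):
    """Return one vector per keyword cluster instead of one vector per document."""
    vectors = [EMBEDDINGS[w] for w in keywords if w in EMBEDDINGS]  # step (3)
    clusters = kmeans(vectors, k)                                   # step (4)
    return [mean(c) for c in clusters if c]                         # step (5)


doc_keywords = ["encryption", "clustering", "cipher", "topic", "privacy", "corpus"]
vectors = multi_vector(doc_keywords, k=2)
single = mean([EMBEDDINGS[w] for w in doc_keywords])
print(vectors, single)
```

In this toy document the single averaged vector lands near the middle of the two subjects, while the two cluster vectors stay close to each subject. That is the interference effect the abstract describes.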

Automatic Email Multi-category Classification Using Dynamic Category Hierarchy and Non-negative Matrix Factorization (비음수 행렬 분해와 동적 분류 체계를 사용한 자동 이메일 다원 분류)

  • Park, Sun;An, Dong-Un
    • Journal of KIISE:Software and Applications / v.37 no.5 / pp.378-385 / 2010
  • The explosive increase in email use has created a need to classify email efficiently and accurately. Current work on email classification has mainly focused on binary classification that filters out spam. These methods are based on Support Vector Machines, Bayesian classifiers, and rule-based classifiers. They are supervised in the sense that the user must manually describe the rules and keyword lists used to recognize relevant email. Other, unsupervised methods use clustering techniques for multi-category classification and create category labels from a set of incoming messages. In this paper, we propose a new automatic email multi-category classification method that uses NMF for automatic category label construction and a dynamic category hierarchy for reorganizing email messages within the category labels. The proposed method manages a large number of emails efficiently by classifying them into multiple categories automatically, and it reorganizes the messages in each category to enhance accuracy whenever users want to classify all their email.
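
As a rough illustration of how NMF can produce category structure from email text, the sketch below factors a small term-by-email count matrix using the standard multiplicative update rules and labels each email with its strongest latent factor. The matrix, the rank, and the argmax labelling are assumptions made for the example; the dynamic category hierarchy of the paper is not covered.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (terms x emails) into non-negative W (terms x rank), H (rank x emails)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def assign_categories(h):
    """Label each email (column of H) with its strongest latent category."""
    return [max(range(len(h)), key=lambda r: h[r][j]) for j in range(len(h[0]))]


# Hypothetical counts. Rows: invoice, payment, meeting, agenda. Columns: four emails.
V = [
    [2, 3, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 2, 2],
]
w, h = nmf(V, rank=2)
labels = assign_categories(h)
print(labels, frob_error(V, w, h))
```

The large entries in each column of W indicate which terms characterize a latent category, so they could serve as candidate category labels.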

Suggestion on Korean Internet governance system by multi stakeholder approach and Introduction of Korean Internet address law (한국 내 인터넷 거버넌스 형성과 인터넷주소에 관한 법률)

  • Yun, Boknam
    • Review of Korean Society for Internet Information / v.14 no.3 / pp.68-77 / 2013
  • This article consists of three parts. Part I presents the multi-stakeholder approach to Internet governance. Part II analyzes the Korean Internet governance system and explains the relevant laws in Korea, including the Korean Internet Address Resources Act. Part III offers my suggestions for the Korean Internet governance system based on a multi-stakeholder approach. First of all, the key question for an Internet governance system is its decision-making process: consensus-based versus top-down. Next, who are the major players in Internet governance at the national level: the government, or private sectors such as business and civil society? The Korean legal system for Internet governance shows a top-down decision-making process. The major players are the government (that is, the Ministry of Science, ICT and Future Planning) and KISA, which is affiliated with the government. Other players include the Internet Address Policy Committee, the Korea Internet Governance Alliance, and NGOs. The key statute for Internet governance in Korea is the Internet Address Resources Act of 2004. Articles 3 and 5 require the Ministry of Science, ICT and Future Planning to take a proactive role in Internet governance. The government must consult the Internet Address Policy Deliberation Committee on Internet governance, yet this Committee is established under the control of the Ministry of Science, ICT and Future Planning, and all of its members are commissioned or nominated by the Chairman of the Ministry. Meanwhile, there are also non-official organizations, including the Sub-committee on Address & Infrastructure of the Korea Internet Governance Alliance. I suggest reforming the decision-making process of the Korean Internet governance system into a BOTTOM-UP process for CONSENSUS BASED DECISION. My suggested system includes the following: (1) The government hands over the major role in Internet governance to an INDEPENDENT Internet policy organization and participates in that organization as ONE of the players. (2) The nomination of committee members must be a bottom-up process, for a genuine multi-stakeholder model including civil society, commercial organizations, end users, and experts. (3) The government should establish a plan for supporting the private sector's international activities on a long-term basis.

Secure Searchable Encryption with User-Revocability in Multi-User Settings (다자간 환경에서 사용자 탈퇴가 가능한 프라이버시 보호 키워드 검색 기법)

  • Kim, Dong-Min;Chun, Ji-Young;Noh, Geon-Tae;Jeong, Ik-Rae
    • Journal of the Korea Institute of Information Security & Cryptology / v.21 no.1 / pp.3-14 / 2011
  • Recently, people have come to store and share data with other users through web storage services. This makes the data more convenient to use, but it raises problems such as access control over the stored data and privacy exposure to an untrusted server. Searchable encryption is used to share data securely in a multi-user setting. In the multi-user setting in particular, revoked users should not be able to search or access the stored data; that is, security against revoked users must be considered. However, in existing schemes, revoked users can decrypt the shared data through a passive attack. The proposed scheme is a secure searchable encryption scheme that resolves this problem and guarantees security against revoked users.

Judging Translated Web Document & Constructing Bilingual Corpus (웹 번역문서 판별과 병렬 말뭉치 구축)

  • Jee-hyung, Kim;Yill-byung, Lee
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.787-789 / 2004
  • People frequently feel the need for a general search tool that frees them from the language barrier when looking for information on the Internet. It is therefore necessary to have a multilingual parallel corpus, so that a search can use a word that includes the search keyword and has a corresponding word in another language. A multilingual parallel corpus can be built and reused effectively through several processes: judgment of web documents, sentence alignment, and word alignment. To build a multilingual parallel corpus, a multilingual dictionary should be constructed for each language and the HTML should be simplified. Then, by understanding the meaning and statistics of document structure, translated web documents are identified and the retrieved web pages are aligned at the sentence level.
