• Title/Abstract/Keywords: File Discovery

16 search results (processing time: 0.023 seconds)

HBase based Business Process Event Log Schema Design of Hadoop Framework

  • Ham, Seonghun;Ahn, Hyun;Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services (인터넷정보학회논문지) / Vol. 20, No. 5 / pp. 49-55 / 2019
  • Organizations design and operate business process models to achieve their goals efficiently and systematically. As information technology advances, computer systems take part in more activities, and processes become larger and more complicated, producing more complex and finely subdivided business process flows. The process instances, which contain workcases and events, grow larger and carry more data. These event logs are an essential resource for process mining and are used directly in model discovery, analysis, and process improvement. As event logs grow in size and scope, managing them as raw files or in a relational database runs into problems such as capacity management and I/O load. This paper identifies these management limits of file-based and relational storage as event logs become big data, and designs and applies schemas for archiving and analyzing large event logs with Hadoop, an open source framework for distributed storage and processing, and HBase, a NoSQL database system.

A Web-based System for Business Process Discovery: Leveraging the SICN-Oriented Process Mining Algorithm with Django, Cytoscape, and Graphviz

  • Thanh-Hai Nguyen;Kyoung-Sook Kim;Dinh-Lam Pham;Kwanghoon Pio Kim
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 8 / pp. 2316-2332 / 2024
  • In this paper, we introduce a web-based system that leverages the capabilities of the ρ(rho)-algorithm, which is a Structure Information Control Net (SICN)-oriented process mining algorithm, with open-source platforms, including Django, Graphviz, and Cytoscape, to facilitate the rediscovery and visualization of business process models. Our approach involves discovering SICN-oriented process models from process instances in an IEEE XES-formatted dataset of process enactment event logs. This discovery process is carried out by the ρ-algorithm, and the visualization output is transformed into either a JSON or a DOT formatted file, to meet the compatibility requirements of Cytoscape or Graphviz, respectively. The proposed system is built on the Django platform, which enables the creation of a user-friendly web interface. This interface offers a clear, concise, modern, and interactive visualization of the rediscovered business processes, fostering an intuitive exploration experience. The experiment conducted on our proposed web-based process discovery system demonstrates its ability and efficiency, showing that the system is a valuable tool for discovering business process models from process event logs. Its development not only contributes to the advancement of process mining but also serves as an educational resource. Readers, students, and practitioners interested in process mining can use this system as a completely free process miner to gain hands-on experience in rediscovering and visualizing process models from event logs.
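The export step described in this abstract, one discovered model written out as either DOT for Graphviz or JSON for Cytoscape, can be illustrated with a minimal sketch. It assumes the discovered model is reduced to a plain list of (source, target) activity pairs; the function names `to_dot` and `to_cytoscape` and the sample activities are hypothetical and are not taken from the paper, and the ρ-algorithm itself is not reproduced here.

```python
import json


def to_dot(edges, name="process"):
    """Render (source, target) activity pairs as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def to_cytoscape(edges):
    """Render the same pairs as a Cytoscape.js elements list (nodes first, then edges)."""
    nodes = sorted({activity for edge in edges for activity in edge})
    elements = [{"data": {"id": n}} for n in nodes]
    elements += [
        {"data": {"id": f"{s}->{d}", "source": s, "target": d}} for s, d in edges
    ]
    return json.dumps(elements, indent=2)


# Hypothetical activities, for illustration only.
edges = [("Register", "Review"), ("Review", "Approve"), ("Review", "Reject")]
print(to_dot(edges))
print(to_cytoscape(edges))
```

Keeping a single edge list as the intermediate form means both renderers stay in sync; a real SICN model would carry more information (for example split and join gateways) than this sketch represents.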

XML기반 전역 Peer-to-Peer 엔진 설계 및 구현 (Design and Implementation of XML based Global Peer-to-Peer Engine)

  • 권태숙;이일수;이승룡
    • The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지) / Vol. 29, No. 1B / pp. 73-85 / 2004
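  • This paper proposes a new XML-based global P2P engine that can support various kinds of services and interconnect PC, web, and mobile environments, and describes our experience designing and implementing it. Because the proposed P2P engine uses text-based XML for all message exchange, it enables web integration and data exchange between heterogeneous platforms, and it also supports multiple security levels and several security algorithms. To this end, the proposed system consists of the Message Dispatcher, which schedules and filters all messages; the SecureNet Manager, which comprises a security manager supporting security functions and a transport manager responsible for transmission; the Discovery Manager, which searches for peers and builds the peer network environment; and the Repository Manager, a data manager module that includes XML document processing. To evaluate the usability of the proposed system, we implemented chat as a communication service, a whiteboard as a collaborative co-authoring tool, and a file sharing service, and compared performance with existing systems.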
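As a rough illustration of exchanging every message as text-based XML, the sketch below builds and parses one peer message with Python's standard library. The element and attribute names (`message`, `from`, `service`, `body`) are assumptions made for this example; the abstract does not give the engine's actual message schema.

```python
import xml.etree.ElementTree as ET


def build_message(sender, service, body):
    """Serialize one peer message as an XML string (hypothetical schema)."""
    msg = ET.Element("message", {"from": sender, "service": service})
    ET.SubElement(msg, "body").text = body
    return ET.tostring(msg, encoding="unicode")


def parse_message(text):
    """Recover (sender, service, body) from the XML string."""
    msg = ET.fromstring(text)
    return msg.get("from"), msg.get("service"), msg.findtext("body")


wire = build_message("peer-1", "chat", "hello")
print(wire)
print(parse_message(wire))
```

Because the payload is plain text, the same message can in principle be consumed by a PC client, a web front end, or a mobile client, which is the interoperability argument the abstract makes.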

모바일 애드-혹 네트워크를 위한 노드 ID 기반 서비스 디스커버리 기법 (Node ID-based Service Discovery for Mobile Ad Hoc Networks)

  • 강은영
    • Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지) / Vol. 14, No. 12 / pp. 109-117 / 2009
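  • This paper proposes an efficient service discovery scheme that combines P2P caching of service advertisements with node ID-based service search. Because the P2P caching advertisement scheme caches service information at neighboring nodes, it spreads service advertisements quickly and lowers the average hop count of service searches. In addition, node ID-based service search does not broadcast messages to all neighboring nodes, which reduces network load so that almost no network transmission delay occurs. The proposed scheme uses neither a central server or repository nor the flooding approach, which generates many messages. Experimental results show that, compared with traditional flooding, the proposed scheme greatly reduces the number of messages and shortens the average search distance through appropriate selection of neighbor nodes, thereby improving overall network load and response time.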

소프트웨어 취약성 평가를 위한 길이기반 파일 퍼징 테스트 슈트 축약 알고리즘 (A Length-based File Fuzzing Test Suite Reduction Algorithm for Evaluation of Software Vulnerability)

  • 이재서;김종명;김수용;윤영태;김용민;노봉남
    • Journal of the Korea Institute of Information Security & Cryptology (정보보호학회논문지) / Vol. 23, No. 2 / pp. 231-242 / 2013
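  • Recently, much research has used automated testing methods such as fuzzing to find software vulnerabilities. Fuzzing automatically mutates a program's input according to specific rules, detects whether the software malfunctions, and discovers vulnerabilities from the results. Because the probability of finding a vulnerability depends on the inputs given to the software, that is, the test cases, raising that probability requires solving the test suite reduction problem for the set of test cases, the test suite. This paper therefore proposes a method that effectively solves the test suite reduction problem for large test cases such as files. In addition to coverage and redundancy, the measures mainly used in prior work, we introduce a new measure, test case length, and design a reduction algorithm suited to it. Experiments show that the proposed algorithm achieves higher size and length reduction rates than the algorithms of prior work, demonstrating its efficiency.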
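To make the idea of combining coverage with test case length concrete, here is a minimal sketch of a greedy reduction that prefers the file covering the most still-uncovered blocks and breaks ties in favor of the shorter file. This is a generic greedy set-cover heuristic chosen for illustration, not the algorithm from the paper; the function name `reduce_suite`, the input layout, and the sample files are all assumptions.

```python
def reduce_suite(cases):
    """Pick a subset of test files that keeps the full coverage.

    cases maps a file name to (length_in_bytes, set_of_covered_blocks).
    """
    uncovered = set().union(*(cov for _, cov in cases.values()))
    selected = []
    while uncovered:
        # Most newly covered blocks wins; among equals, the shorter file wins.
        name = max(
            cases,
            key=lambda n: (len(cases[n][1] & uncovered), -cases[n][0]),
        )
        selected.append(name)
        uncovered -= cases[name][1]
    return selected


# Hypothetical seed files with their sizes and covered basic blocks.
cases = {
    "a.pdf": (900, {1, 2, 3}),
    "b.pdf": (300, {1, 2, 3}),
    "c.pdf": (200, {3, 4}),
    "d.pdf": (500, {4}),
}
print(reduce_suite(cases))
```

In this sample, `a.pdf` and `b.pdf` cover the same blocks, so the shorter `b.pdf` is kept, and `c.pdf` is preferred over the longer `d.pdf` for the remaining block. Shorter seed files matter for file fuzzing because each mutation and execution cost grows with file size.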

A comparison of three design tree based search algorithms for the detection of engineering parts constructed with CATIA V5 in large databases

  • Roj, Robin
    • Journal of Computational Design and Engineering / Vol. 1, No. 3 / pp. 161-172 / 2014
  • This paper presents three different search engines for the detection of CAD parts in large databases. The contained information is analyzed by exporting the data stored in the structure trees of the CAD models. A preparation program generates one XML file for every model, which, in addition to the data of the structure tree, also contains certain physical properties of each part. The first search engine specializes in the discovery of standard parts, like screws or washers. The second program uses certain user input as search parameters, and therefore has the ability to perform personalized queries. The third one compares one given reference part with all parts in the database, and locates files that are identical, or similar, to the reference part. All approaches run automatically, and have the analysis of the structure tree in common. Files constructed with CATIA V5 and search engines written in Python have been used for the implementation. The paper also includes a short comparison of the advantages and disadvantages of each program, as well as a performance test.

이용자간 파일공유방식에 기반한 P2P 전자상거래 시스템 설계 및 구현 (Design and Implementation of Peer-to-Peer Electronic Commerce Systems based on the File Sharing Method between Users)

  • 김창수;서영석
    • The Journal of Information Systems (한국정보시스템학회지:정보시스템연구) / Vol. 15, No. 1 / pp. 1-20 / 2006
  • Peer-to-peer (P2P) systems are rapidly growing in importance on the Internet and quickly extending their range of use. However, P2P systems have not been widely applied in electronic commerce because no appropriate business model has been established for them. Therefore, we first review previous research relevant to P2P systems, and then analyze the business models for P2P systems presented by previous researchers. Furthermore, this study categorizes major issues in terms of technical and business model aspects. On the basis of these reviews, we develop a P2P electronic commerce system based on the file sharing method between users, focusing on user interface friendliness. The developed P2P electronic commerce system is programmed in C# on the Microsoft .NET platform, and its database is implemented with MSSQL 2000. The main application technology is designed so that the P2P electronic commerce system can be extended into a B2B solution by using WSDL (Web Services Description Language), UDDI (Universal Description, Discovery, and Integration), and XML documents for users. The user interface takes the form of an Internet messenger for users' convenience and can be developed into a commodity transaction system based on XML. The P2P electronic commerce system in this study can be extended to applications such as Internet shopping malls and property transactions in nonprofit organizations, public institutions, and large-scale nonprofit institutions whose structure resembles that of a nonprofit educational institution.

Sparse Data Cleaning using Multiple Imputations

  • Jun, Sung-Hae;Lee, Seung-Joo;Oh, Kyung-Whan
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 4, No. 1 / pp. 119-124 / 2004
  • Real data such as web log files tend to be incomplete, yet we must extract useful knowledge from them to make optimal decisions. Web log data contain much useful information, such as hyperlink information and the web usage of connected users. However, web data are too large for effective knowledge discovery and, to make matters worse, very sparse. We overcome this sparsity problem by using the Markov Chain Monte Carlo (MCMC) method for multiple imputation. This missing value imputation turns sparse web data into complete data. Our study may be a useful tool for discovering knowledge from sparse data sets. The sparser the data, the better MCMC imputation performs. We verified our work by experiments using UCI machine learning repository data.

웹 로그 화일에서 순회 패턴 탐사를 위한 시스템 (A System for Mining Traversal Patterns from Web Log Files)

  • 박종수;윤지영
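    • Korea Information Science Society: Conference Proceedings (한국정보과학회:학술대회논문집) / Proceedings of the Korea Information Science Society 2001 Fall Conference, Vol. 28, No. 2 (1) / pp. 4-6 / 2001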
  • In this paper, we design a system that mines users' traversal patterns from web log files. The system cleans the input data, the transactions of a web log file, and finds traversal patterns from the transactions, each of which consists of the pages accessed by one user. The resulting traversal patterns are shown in a web browser, where a system manager or data miner can analyze them in visual form. We implemented the system in MS Visual C++ on an IBM personal computer running Windows 2000, and used MS SQL Server 2000 to store the intermediate files and the traversal patterns, which can easily be applied to a system for knowledge discovery in databases.
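A simplified stand-in for the pattern-finding step is sketched below: it counts contiguous page paths of a fixed length across user sessions and keeps those that meet a minimum support. The abstract does not specify the mining algorithm, so the function name `frequent_paths`, the fixed path length, the support threshold, and the sample sessions are all assumptions for illustration.

```python
from collections import Counter


def frequent_paths(sessions, length=2, min_support=2):
    """Return contiguous page paths that occur in at least min_support sessions."""
    counts = Counter()
    for pages in sessions:
        # Count each distinct path once per session, so support means "number of sessions".
        seen = {
            tuple(pages[i:i + length])
            for i in range(len(pages) - length + 1)
        }
        counts.update(seen)
    return {path: n for path, n in counts.items() if n >= min_support}


# Hypothetical cleaned transactions: one list of visited pages per user.
sessions = [
    ["/", "/news", "/news/1"],
    ["/", "/news", "/about"],
    ["/about", "/", "/news"],
]
print(frequent_paths(sessions))
```

Here only the path from `/` to `/news` appears in more than one session, so it is the only pattern reported. A fuller miner would grow frequent paths to longer lengths instead of fixing the length in advance.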

A P2P-to-UPnP Proxy Gateway Architecture for Home Multimedia Content Distribution

  • Hu, Chih-Lin;Lin, Hsin-Cheng;Hsu, Yu-Feng;Hsieh, Bing-Jung
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 6, No. 1 / pp. 406-425 / 2012
  • Deploying advanced home networking technologies and modern home-networked devices in residential environments provides a playground for new home applications and services. Because home multimedia entertainment is among the most essential home applications, this paper presents an appealing home media content sharing scenario: home-networked devices can discover neighboring devices and share local media content, as well as enormous amounts of Internet media content, in a convenient and networked manner. This ideal scenario differs from traditional usage, which merely offers local media content and requires tedious manual connection setup and file transfer among various devices. To achieve this goal, this study proposes a proxy gateway architecture for home multimedia content distribution. The proposed architecture integrates several functional mechanisms, including UPnP-based device discovery, home gateway, Internet media provision, and in-home media content delivery. This design addresses several inherent limitations of device heterogeneity and network interoperability on home and public networks, and allows diverse home-networked devices to play media content in an identical and networked manner. A prototype implementation of the proposed proxy gateway architecture provides proof-of-concept software that integrates a BitTorrent peer-to-peer client, a UPnP protocol stack, and a UPnP AV media server, as well as media distribution and management components, on the OSGi home gateway platform. A practical demonstration shows the proposed design and scenario realization, offering users an unlimited volume of media content for home multimedia entertainment.