• Title/Summary/Keyword: automatically

Search Result 6,815, Processing Time 0.088 seconds

Analysis and Performance Evaluation of Pattern Condensing Techniques used in Representative Pattern Mining (대표 패턴 마이닝에 활용되는 패턴 압축 기법들에 대한 분석 및 성능 평가)

  • Lee, Gang-In;Yun, Un-Il
    • Journal of Internet Computing and Services
    • /
    • v.16 no.2
    • /
    • pp.77-83
    • /
    • 2015
  • Frequent pattern mining, which is one of the major areas actively studied in data mining, is a method for extracting useful pattern information hidden from large data sets or databases. Moreover, frequent pattern mining approaches have been actively employed in a variety of application fields because the results obtained from them can allow us to analyze various, important characteristics within databases more easily and automatically. However, traditional frequent pattern mining methods, which simply extract all of the possible frequent patterns such that each of their support values is not smaller than a user-given minimum support threshold, have the following problems. First, traditional approaches have to generate a numerous number of patterns according to the features of a given database and the degree of threshold settings, and the number can also increase in geometrical progression. In addition, such works also cause waste of runtime and memory resources. Furthermore, the pattern results excessively generated from the methods also lead to troubles of pattern analysis for the mining results. In order to solve such issues of previous traditional frequent pattern mining approaches, the concept of representative pattern mining and its various related works have been proposed. In contrast to the traditional ones that find all the possible frequent patterns from databases, representative pattern mining approaches selectively extract a smaller number of patterns that represent general frequent patterns. In this paper, we describe details and characteristics of pattern condensing techniques that consider the maximality or closure property of generated frequent patterns, and conduct comparison and analysis for the techniques. Given a frequent pattern, satisfying the maximality for the pattern signifies that all of the possible super sets of the pattern must have smaller support values than a user-specific minimum support threshold; meanwhile, satisfying the closure property for the pattern means that there is no superset of which the support is equal to that of the pattern with respect to all the possible super sets. By mining maximal frequent patterns or closed frequent ones, we can achieve effective pattern compression and also perform mining operations with much smaller time and space resources. In addition, compressed patterns can be converted into the original frequent pattern forms again if necessary; especially, the closed frequent pattern notation has the ability to convert representative patterns into the original ones again without any information loss. That is, we can obtain a complete set of original frequent patterns from closed frequent ones. Although the maximal frequent pattern notation does not guarantee a complete recovery rate in the process of pattern conversion, it has an advantage that can extract a smaller number of representative patterns more quickly compared to the closed frequent pattern notation. In this paper, we show the performance results and characteristics of the aforementioned techniques in terms of pattern generation, runtime, and memory usage by conducting performance evaluation with respect to various real data sets collected from the real world. For more exact comparison, we also employ the algorithms implementing these techniques on the same platform and Implementation level.

Wearable Computers

  • Cho, Gil-Soo;Barfield, Woodrow;Baird, Kevin
    • Fiber Technology and Industry
    • /
    • v.2 no.4
    • /
    • pp.490-508
    • /
    • 1998
  • One of the latest fields of research in the area of output devices is tactual display devices [13,31]. These tactual or haptic devices allow the user to receive haptic feedback output from a variety of sources. This allows the user to actually feel virtual objects and manipulate them by touch. This is an emerging technology and will be instrumental in enhancing the realism of wearable augmented environments for certain applications. Tactual displays have previously been used for scientific visualization in virtual environments by chemists and engineers to improve perception and understanding of force fields and of world models populated with the impenetrable. In addition to tactual displays, the use of wearable audio displays that allow sound to be spatialized are being developed. With wearable computers, designers will soon be able to pair spatialized sound to virtual representations of objects when appropriate to make the wearable computer experience even more realistic to the user. Furthermore, as the number and complexity of wearable computing applications continues to grow, there will be increasing needs for systems that are faster, lighter, and have higher resolution displays. Better networking technology will also need to be developed to allow all users of wearable computers to have high bandwidth connections for real time information gathering and collaboration. In addition to the technology advances that make users need to wear computers in everyday life, there is also the desire to have users want to wear their computers. In order to do this, wearable computing needs to be unobtrusive and socially acceptable. By making wearables smaller and lighter, or actually embedding them in clothing, users can conceal them easily and wear them comfortably. The military is currently working on the development of the Personal Information Carrier (PIC) or digital dog tag. The PIC is a small electronic storage device containing medical information about the wearer. While old military dog tags contained only 5 lines of information, the digital tags may contain volumes of multi-media information including medical history, X-rays, and cardiograms. Using hand held devices in the field, medics would be able to call this information up in real time for better treatment. A fully functional transmittable device is still years off, but this technology once developed in the military, could be adapted tp civilian users and provide ant information, medical or otherwise, in a portable, not obstructive, and fashionable way. Another future device that could increase safety and well being of its users is the nose on-a-chip developed by the Oak Ridge National Lab in Tennessee. This tiny digital silicon chip about the size of a dime, is capable of 'smelling' natural gas leaks in stoves, heaters, and other appliances. It can also detect dangerous levels of carbon monoxide. This device can also be configured to notify the fire department when a leak is detected. This nose chip should be commercially available within 2 years, and is inexpensive, requires low power, and is very sensitive. Along with gas detection capabilities, this device may someday also be configured to detect smoke and other harmful gases. By embedding this chip into workers uniforms, name tags, etc., this could be a lifesaving computational accessory. In addition to the future safety technology soon to be available as accessories are devices that are for entertainment and security. The LCI computer group is developing a Smartpen, that electronically verifies a user's signature. With the increase in credit card use and the rise in forgeries, is the need for commercial industries to constantly verify signatures. This Smartpen writes like a normal pen but uses sensors to detect the motion of the pen as the user signs their name to authenticate the signature. This computational accessory should be available in 1999, and would bring increased peace of mind to consumers and vendors alike. In the entertainment domain, Panasonic is creating the first portable hand-held DVD player. This device weight less than 3 pounds and has a screen about 6' across. The color LCD has the same 16:9 aspect ratio of a cinema screen and supports a high resolution of 280,000 pixels and stereo sound. The player can play standard DVD movies and has a hour battery life for mobile use. To summarize, in this paper we presented concepts related to the design and use of wearable computers with extensions to smart spaces. For some time, researchers in telerobotics have used computer graphics to enhance remote scenes. Recent advances in augmented reality displays make it possible to enhance the user's local environment with 'information'. As shown in this paper, there are many application areas for this technology such as medicine, manufacturing, training, and recreation. Wearable computers allow a much closer association of information with the user. By embedding sensors in the wearable to allow it to see what the user sees, hear what the user hears, sense the user's physical state, and analyze what the user is typing, an intelligent agent may be able to analyze what the user is doing and try to predict the resources he will need next or in the near future. Using this information, the agent may download files, reserve communications bandwidth, post reminders, or automatically send updates to colleagues to help facilitate the user's daily interactions. This intelligent wearable computer would be able to act as a personal assistant, who is always around, knows the user's personal preferences and tastes, and tries to streamline interactions with the rest of the world.

  • PDF

Evaluation of Tumor Registry Validity in Samsung Medical Center Radiation Oncology Department (삼성서울병원 방사선종양학과 종양등록 정보의 타당도 평가)

  • Park Won;Huh Seung Jae;Kim Dae Yong;Shin Seong Soo;Ahn Yong Chan;Lim Do Hoon;Kim Seonwoo
    • Radiation Oncology Journal
    • /
    • v.22 no.1
    • /
    • pp.33-39
    • /
    • 2004
  • Purpose : A tumor registry system for the patients treated by radiotherapy at Samsung Medical Center since the opening of a hospital at 1994 was employed. In this study, the tumor registry system was introduced and the validity of the tumor registration was analyzed. Materials and Methods: The tumor registry system was composed of three parts: patient demographic, diagnostic, and treatment Information. All data were input in a screen using a mouse only. Among the 10,000 registered cases in the tumor registry system until Aug, 2002, 199 were randomly selected and their registration data were compared with the patients' medical records. Results : Total input errors were detected on 15 cases (7.5%). There were 8 error items In the part relating to diagnostic Information: tumor site 3, pathology 2, AJCC staging 2 and performance status 1. In the part relating to treatment information there were 9 mistaken items: combination treatment 4, the date of initial treatment 3 and radiation completeness 2. According to the assignment doctor, the error ratio was consequently variable. The doctors who 010 no double-checks showed higher errors than those that 010 (15.6%:3.7%). Conclusion: Our tumor registry had errors within 2% for each Item. Although the overall data qualify was high, further improvement might be achieved through promoting sincerity, continuing training, periodic validity tests and keeping double-checks. Also, some items associated with the hospital Information system will be input automatically In the next step.

Semi-supervised learning for sentiment analysis in mass social media (대용량 소셜 미디어 감성분석을 위한 반감독 학습 기법)

  • Hong, Sola;Chung, Yeounoh;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.24 no.5
    • /
    • pp.482-488
    • /
    • 2014
  • This paper aims to analyze user's emotion automatically by analyzing Twitter, a representative social network service (SNS). In order to create sentiment analysis models by using machine learning techniques, sentiment labels that represent positive/negative emotions are required. However it is very expensive to obtain sentiment labels of tweets. So, in this paper, we propose a sentiment analysis model by using self-training technique in order to utilize "data without sentiment labels" as well as "data with sentiment labels". Self-training technique is that labels of "data without sentiment labels" is determined by utilizing "data with sentiment labels", and then updates models using together with "data with sentiment labels" and newly labeled data. This technique improves the sentiment analysis performance gradually. However, it has a problem that misclassifications of unlabeled data in an early stage affect the model updating through the whole learning process because labels of unlabeled data never changes once those are determined. Thus, labels of "data without sentiment labels" needs to be carefully determined. In this paper, in order to get high performance using self-training technique, we propose 3 policies for updating "data with sentiment labels" and conduct a comparative analysis. The first policy is to select data of which confidence is higher than a given threshold among newly labeled data. The second policy is to choose the same number of the positive and negative data in the newly labeled data in order to avoid the imbalanced class learning problem. The third policy is to choose newly labeled data less than a given maximum number in order to avoid the updates of large amount of data at a time for gradual model updates. Experiments are conducted using Stanford data set and the data set is classified into positive and negative. As a result, the learned model has a high performance than the learned models by using "data with sentiment labels" only and the self-training with a regular model update policy.

A Study on the aesthetic of Calligraphy on Changam, Lee Samman (창암(蒼巖) 이삼만(李三晩)의 서예미학(書藝美學) 고찰)

  • Kim, Doyoung
    • The Journal of the Convergence on Culture Technology
    • /
    • v.6 no.2
    • /
    • pp.99-106
    • /
    • 2020
  • Changam, Lee Samman(1770~1845), who created his own handwriiting to be referred to as the three great writers of the late Joseon Dynasty, invented as the original 'Haengunyusu Typeface' and developed Calligraphy spirit of DonggugJinche in Honam province. He ultimately pursued the state of tonglyeong by raising the personality of 'writing is the person's personality' and the attitude of learning the old. Through the book chang-amseogyeol, he basically polished Haeseo of Han Dynasty and Wi Dynasty and emphasized that Haengseo and Choseo are done automatically when muscle strength and bone strength are established. And since calligraphy originated from 'nature', it goes through the 'Beobcheongwijin' spirit. After doing so, expressed the state of tonglyeong of " mubeob-ibeob ", the stage of reaching. In addition, Changam showed the aesthetic that you can get the novelty by pursuing the philosophy of 'Wu' and the 'beauty of Stupid and Lacking' based on LaoTzu and ChuangTzu. This is a philosophy that follows nature's logic to reveal nature's nature. And it is an aesthetic that protects his 'True Wu' without knowing and greedy. On the other hand, Changam promoted natural and vital beauty through force in the method of using the brush. He suggested the 'Push and Hard' of the Han dynasty, pushing it with force using this power properly. In particular, the feeling of an IlunMujeog brush in 『Changam Calligraphy-The cloud stays Poem』 overflows with the vitality and bizarre and strange dynamism of the spirit and typeface as eum-yang harmonizes with each other. In addition, the beauty of Push and Hard containing polyeoghamse is misaligned, but it has achieved a natural aesthetic without invading. This work demonstrates the real look of Changam choseo. In addition, the beauty of Push and Hard containing polyeoghamse is misaligned, but it has achieved a natural aesthetic without invading. Changam proves the real look of "Haedong's best Chose Maestro".

GIS based Development of Module and Algorithm for Automatic Catchment Delineation Using Korean Reach File (GIS 기반의 하천망분석도 집수구역 자동 분할을 위한 알고리듬 및 모듈 개발)

  • PARK, Yong-Gil;KIM, Kye-Hyun;YOO, Jae-Hyun
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.20 no.4
    • /
    • pp.126-138
    • /
    • 2017
  • Recently, the national interest in environment is increasing and for dealing with water environment-related issues swiftly and accurately, the demand to facilitate the analysis of water environment data using a GIS is growing. To meet such growing demands, a spatial network data-based stream network analysis map(Korean Reach File; KRF) supporting spatial analysis of water environment data was developed and is being provided. However, there is a difficulty in delineating catchment areas, which are the basis of supplying spatial data including relevant information frequently required by the users such as establishing remediation measures against water pollution accidents. Therefore, in this study, the development of a computer program was made. The development process included steps such as designing a delineation method, and developing an algorithm and modules. DEM(Digital Elevation Model) and FDR(Flow Direction) were used as the major data to automatically delineate catchment areas. The algorithm for the delineation of catchment areas was developed through three stages; catchment area grid extraction, boundary point extraction, and boundary line division. Also, an add-in catchment area delineation module, based on ArcGIS from ESRI, was developed in the consideration of productivity and utility of the program. Using the developed program, the catchment areas were delineated and they were compared to the catchment areas currently used by the government. The results showed that the catchment areas were delineated efficiently using the digital elevation data. Especially, in the regions with clear topographical slopes, they were delineated accurately and swiftly. Although in some regions with flat fields of paddles and downtowns or well-organized drainage facilities, the catchment areas were not segmented accurately, the program definitely reduce the processing time to delineate existing catchment areas. In the future, more efforts should be made to enhance current algorithm to facilitate the use of the higher precision of digital elevation data, and furthermore reducing the calculation time for processing large data volume.

A Study on Intuitive IoT Interface System using 3D Depth Camera (3D 깊이 카메라를 활용한 직관적인 사물인터넷 인터페이스 시스템에 관한 연구)

  • Park, Jongsub;Hong, June Seok;Kim, Wooju
    • The Journal of Society for e-Business Studies
    • /
    • v.22 no.2
    • /
    • pp.137-152
    • /
    • 2017
  • The decline in the price of IT devices and the development of the Internet have created a new field called Internet of Things (IoT). IoT, which creates new services by connecting all the objects that are in everyday life to the Internet, is pioneering new forms of business that have not been seen before in combination with Big Data. The prospect of IoT can be said to be unlimited in its utilization. In addition, studies of standardization organizations for smooth connection of these IoT devices are also active. However, there is a part of this study that we overlook. In order to control IoT equipment or acquire information, it is necessary to separately develop interworking issues (IP address, Wi-Fi, Bluetooth, NFC, etc.) and related application software or apps. In order to solve these problems, existing research methods have been conducted on augmented reality using GPS or markers. However, there is a disadvantage in that a separate marker is required and the marker is recognized only in the vicinity. In addition, in the case of a study using a GPS address using a 2D-based camera, it was difficult to implement an active interface because the distance to the target device could not be recognized. In this study, we use 3D Depth recognition camera to be installed on smartphone and calculate the space coordinates automatically by linking the distance measurement and the sensor information of the mobile phone without a separate marker. Coordination inquiry finds equipment of IoT and enables information acquisition and control of corresponding IoT equipment. Therefore, from the user's point of view, it is possible to reduce the burden on the problem of interworking of the IoT equipment and the installation of the app. Furthermore, if this technology is used in the field of public services and smart glasses, it will reduce duplication of investment in software development and increase in public services.

Privilege and Immunity of Information and Data from Aviation Safety Program in Unites States (미국 항공안전데이터 프로그램의 비공개 특권과 제재 면제에 관한 연구)

  • Moon, Joon-Jo
    • The Korean Journal of Air & Space Law and Policy
    • /
    • v.23 no.2
    • /
    • pp.137-172
    • /
    • 2008
  • The earliest safety data programs, the FDR and CVR, were electronic reporting systems that generate data "automatically." The FDR program, originally instituted in 1958, had no publicly available restrictions for protections against sanctions by the FAA or an airline, although there are agreements and union contracts forbidding the use of FDR data for FAA enforcement actions. This FDR program still has the least formalized protections. With the advent of the CVR program in 1966, the precursor to the current FAR 91.25 was already in place, having been promulgated in 1964. It stated that the FAA would not use CVR data for enforcement actions. In 1982, Congress began restricting the disclosure of the CVR tape and transcripts. Congress added further clarification of the availability of discovery in civil litigation in 1994. Thus, the CVR data have more definitive protections in place than do FDR data. The ASRS was the first non-automatic reporting system; and built into its original design in 1975 was a promise of limited protection from enforcement sanctions. That promise was further codified in an FAR in 1979. As with the CVR, from its inception, the ASRS had some protections built in for the person who might have had a safety problem. However, the program did not (and to this day does not) explicitly deal with issues of use by airlines, litigants, or the public media, although it appears that airlines will either take a non-punitive stance if an ASRS report is filed, or the airline may ignore the fact that it has been filed at all. The FAA worked with several U.S. airlines in the early 1990s on developing ASAP programs, and the FAA issued an Advisory Circular about the program in 1997. From its inception, the ASAP program contained some FAA enforcement protections and company discipline protections, although some protection against litigation disclosure and public disclosure was not added until 2003, when FAA Order 8000.82 was promulgated, placing the program under the protections of FAR 193, which had been added in 2001. The FOQA program, when it was first instituted through a demonstration program in 1995, did not contain protections against sanctions. Now, however, the FAA cannot take enforcement action based on FOQA safety data, and an airline is limited to "corrective action" under the program. Union contracts can exclude FOQA from the realm of disciplinary action, although airline practice may be for airlines to require retraining if there is no contract in place forbidding it. The data is protected against disclosure for litigation and public media purposes by FAA Order 8000.81, issued in 2003, which placed FOQA under the protections of FAR 193. The figure on the next page shows when each program began, and when each statute, regulation, or order became effective for that program.

  • PDF

PIRS : Personalized Information Retrieval System using Adaptive User Profiling and Real-time Filtering for Search Results (적응형 사용자 프로파일기법과 검색 결과에 대한 실시간 필터링을 이용한 개인화 정보검색 시스템)

  • Jeon, Ho-Cheol;Choi, Joong-Min
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.21-41
    • /
    • 2010
  • This paper proposes a system that can serve users with appropriate search results through real time filtering, and implemented adaptive user profiling based personalized information retrieval system(PIRS) using users' implicit feedbacks in order to deal with the problem of existing search systems such as Google or MSN that does not satisfy various user' personal search needs. One of the reasons that existing search systems hard to satisfy various user' personal needs is that it is not easy to recognize users' search intentions because of the uncertainty of search intentions. The uncertainty of search intentions means that users may want to different search results using the same query. For example, when a user inputs "java" query, the user may want to be retrieved "java" results as a computer programming language, a coffee of java, or a island of Indonesia. In other words, this uncertainty is due to ambiguity of search queries. Moreover, if the number of the used words for a query is fewer, this uncertainty will be more increased. Real-time filtering for search results returns only those results that belong to user-selected domain for a given query. Although it looks similar to a general directory search, it is different in that the search is executed for all web documents rather than sites, and each document in the search results is classified into the given domain in real time. By applying information filtering using real time directory classifying technology for search results to personalization, the number of delivering results to users is effectively decreased, and the satisfaction for the results is improved. In this paper, a user preference profile has a hierarchical structure, and consists of domains, used queries, and selected documents. Because the hierarchy structure of user preference profile can apply the context when users perfomed search, the structure is able to deal with the uncertainty of user intentions, when search is carried out, the intention may differ according to the context such as time or place for the same query. Furthermore, this structure is able to more effectively track web documents search behaviors of a user for each domain, and timely recognize the changes of user intentions. An IP address of each device was used to identify each user, and the user preference profile is continuously updated based on the observed user behaviors for search results. Also, we measured user satisfaction for search results by observing the user behaviors for the selected search result. Our proposed system automatically recognizes user preferences by using implicit feedbacks from users such as staying time on the selected search result and the exit condition from the page, and dynamically updates their preferences. Whenever search is performed by a user, our system finds the user preference profile for the given IP address, and if the file is not exist then a new user preference profile is created in the server, otherwise the file is updated with the transmitted information. If the file is not exist in the server, the system provides Google' results to users, and the reflection value is increased/decreased whenever user search. We carried out some experiments to evaluate the performance of adaptive user preference profile technique and real time filtering, and the results are satisfactory. According to our experimental results, participants are satisfied with average 4.7 documents in the top 10 search list by using adaptive user preference profile technique with real time filtering, and this result shows that our method outperforms Google's by 23.2%.

A Methodology for Extracting Shopping-Related Keywords by Analyzing Internet Navigation Patterns (인터넷 검색기록 분석을 통한 쇼핑의도 포함 키워드 자동 추출 기법)

  • Kim, Mingyu;Kim, Namgyu;Jung, Inhwan
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.123-136
    • /
    • 2014
  • Recently, online shopping has further developed as the use of the Internet and a variety of smart mobile devices becomes more prevalent. The increase in the scale of such shopping has led to the creation of many Internet shopping malls. Consequently, there is a tendency for increasingly fierce competition among online retailers, and as a result, many Internet shopping malls are making significant attempts to attract online users to their sites. One such attempt is keyword marketing, whereby a retail site pays a fee to expose its link to potential customers when they insert a specific keyword on an Internet portal site. The price related to each keyword is generally estimated by the keyword's frequency of appearance. However, it is widely accepted that the price of keywords cannot be based solely on their frequency because many keywords may appear frequently but have little relationship to shopping. This implies that it is unreasonable for an online shopping mall to spend a great deal on some keywords simply because people frequently use them. Therefore, from the perspective of shopping malls, a specialized process is required to extract meaningful keywords. Further, the demand for automating this extraction process is increasing because of the drive to improve online sales performance. In this study, we propose a methodology that can automatically extract only shopping-related keywords from the entire set of search keywords used on portal sites. We define a shopping-related keyword as a keyword that is used directly before shopping behaviors. In other words, only search keywords that direct the search results page to shopping-related pages are extracted from among the entire set of search keywords. A comparison is then made between the extracted keywords' rankings and the rankings of the entire set of search keywords. Two types of data are used in our study's experiment: web browsing history from July 1, 2012 to June 30, 2013, and site information. The experimental dataset was from a web site ranking site, and the biggest portal site in Korea. The original sample dataset contains 150 million transaction logs. First, portal sites are selected, and search keywords in those sites are extracted. Search keywords can be easily extracted by simple parsing. The extracted keywords are ranked according to their frequency. The experiment uses approximately 3.9 million search results from Korea's largest search portal site. As a result, a total of 344,822 search keywords were extracted. Next, by using web browsing history and site information, the shopping-related keywords were taken from the entire set of search keywords. As a result, we obtained 4,709 shopping-related keywords. For performance evaluation, we compared the hit ratios of all the search keywords with the shopping-related keywords. To achieve this, we extracted 80,298 search keywords from several Internet shopping malls and then chose the top 1,000 keywords as a set of true shopping keywords. We measured precision, recall, and F-scores of the entire amount of keywords and the shopping-related keywords. The F-Score was formulated by calculating the harmonic mean of precision and recall. The precision, recall, and F-score of shopping-related keywords derived by the proposed methodology were revealed to be higher than those of the entire number of keywords. This study proposes a scheme that is able to obtain shopping-related keywords in a relatively simple manner. We could easily extract shopping-related keywords simply by examining transactions whose next visit is a shopping mall. The resultant shopping-related keyword set is expected to be a useful asset for many shopping malls that participate in keyword marketing. Moreover, the proposed methodology can be easily applied to the construction of special area-related keywords as well as shopping-related ones.