• Title/Summary/Keyword: Returns to Scale


Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk; Kim, Taeyeon; Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
  • In this paper, we suggest an application system architecture that provides an accurate, fast and efficient automatic gasometer reading function. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning technology. In general, an image contains many types of characters, and conventional optical character recognition extracts all of them. Some applications, however, need to ignore characters that are not of interest and focus only on specific character types. For example, an automatic gasometer reading system only needs to extract the device ID and the gas usage amount from gasometer images in order to bill users. Character strings that are not of interest, such as device type, manufacturer, manufacturing date and specifications, carry no value for the application. Thus, the application has to analyze only the region of interest and the specific character types in order to extract valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest to extract the character information of interest. We build three neural networks for the application system. The first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings, the second is another convolutional network that transforms the spatial information of a region of interest into sequential feature vectors, and the third is a bidirectional long short-term memory (LSTM) network that converts the sequential features into character strings by mapping feature vectors to characters through time-series analysis. In this research, the character strings of interest are the device ID and the gas usage amount. The device ID consists of 12 Arabic numerals and the gas usage amount of 4-5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with Intel Xeon E5-2686 v4 CPUs and NVIDIA Tesla V100 GPUs. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, handling about 700,000 requests per day. A mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes reading requests from mobile devices onto an input queue with a FIFO (First In, First Out) structure. The slave process consists of the three deep neural networks that perform the character recognition and runs on the NVIDIA GPU. The slave process continuously polls the input queue for recognition requests. When a request from the master process is present in the input queue, the slave process converts the queued image into the device ID character string, the gas usage amount character string and the position information of both strings, returns the information to an output queue, and switches back to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation and testing of the three deep neural networks. 22,985 images were used for training and validation, and 4,135 images were used for testing. The 22,985 images were randomly split at an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into five types (normal, noise, reflex, scale and slant): normal data are clean images, noise means images with noise signals, reflex means images with light reflections in the gasometer region, scale means images with small object sizes due to long-distance capturing, and slant means images that are not horizontally level. The final character string recognition accuracies for the device ID and the gas usage amount on normal data are 0.960 and 0.864, respectively.
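
The master-slave queue flow described in this abstract can be illustrated with a minimal Python sketch. It assumes a single in-process pair of FIFO queues and a placeholder recognize() function standing in for the three-network pipeline (ROI detector, CNN feature encoder, bidirectional LSTM decoder); the request fields and function names are hypothetical, not the paper's actual interfaces.

```python
import queue
import threading

input_queue = queue.Queue()   # FIFO: master pushes reading requests here
output_queue = queue.Queue()  # slave pushes recognition results here

def recognize(image_bytes):
    """Placeholder for the three-network pipeline (ROI detection -> CNN encoding -> BiLSTM decoding)."""
    # In the real system this runs on the GPU; here we just return dummy strings.
    return {"device_id": "000000000000", "usage": "00000", "positions": []}

def slave_worker():
    # The slave polls the input queue, runs recognition, and returns results to the output queue.
    while True:
        request = input_queue.get()     # blocks until a request arrives
        if request is None:             # sentinel to stop the worker
            break
        result = recognize(request["image"])
        result["request_id"] = request["request_id"]
        output_queue.put(result)
        input_queue.task_done()

def master_submit(request_id, image_bytes):
    # The master receives an image from a mobile device and enqueues it FIFO.
    input_queue.put({"request_id": request_id, "image": image_bytes})

# Start one slave thread (the paper runs the slave processes on a GPU node).
threading.Thread(target=slave_worker, daemon=True).start()
master_submit("req-001", b"...jpeg bytes...")
print(output_queue.get())   # master fetches the result and returns it to the device
```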

Ontology-based User Customized Search Service Considering User Intention (온톨로지 기반의 사용자 의도를 고려한 맞춤형 검색 서비스)

  • Kim, Sukyoung; Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.18 no.4 / pp.129-143 / 2012
  • Recently, the rapid progress of standardized web technologies and the proliferation of web users around the world have brought an explosive increase in the production and consumption of information documents on the web. In addition, most companies produce, share, and manage a huge number of information documents needed to perform their businesses. They also selectively collect, store and manage many web documents published on the web for their business. Along with this increase in the information documents that must be managed within companies, the need for a solution that locates information documents accurately among a huge number of information sources has grown. To satisfy this need, the search engine solution market is expanding steadily. The most important of the many functions a search engine provides is to locate accurate information documents among huge information sources. The major metric for evaluating the accuracy of a search engine is relevance, which consists of two measures, precision and recall. Precision is a measure of exactness, that is, what percentage of the documents returned as answers are actually relevant, whereas recall is a measure of completeness, that is, what percentage of the relevant documents are retrieved. These two measures are weighted differently depending on the applied domain. If we need to search information exhaustively, as with patent documents and research papers, it is better to increase recall. On the other hand, when the amount of information is small, it is better to increase precision. Most existing web search engines use a keyword search method that returns web documents containing the keywords entered by a user. This method has the virtue of locating all matching web documents quickly, even when many search words are entered. However, it has a fundamental limitation: it does not consider the search intention of the user, and therefore retrieves irrelevant results along with relevant ones. It thus takes additional time and effort to sort the relevant results out of everything a search engine returns. In other words, keyword search can increase recall, but it makes it difficult to locate the web documents a user actually wants to find because it provides no means of understanding the user's intention and reflecting it in the search process. This research therefore suggests a new method that combines an ontology-based search solution with the core search functionalities provided by existing search engine solutions. The method enables a search engine to provide optimal search results by inferring the search intention of a user. To that end, we build an ontology that contains the concepts of a specific domain and the relationships among them. The ontology is used to infer synonyms of the search keywords entered by a user, so that the user's search intention is reflected in the search process more actively than in existing search engines. Based on the proposed method, we implement a prototype search system and test it in the patent domain, where we experiment with searching for relevant documents associated with a patent. The experiment shows that our system increases both recall and precision and improves search productivity through an improved user interface that enables a user to interact with the search system effectively. In future research, we will validate the performance of our prototype system by comparing it with other search engine solutions, and we will extend the applied domain to other information search settings such as web portals.
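
As a rough illustration of the two ideas this abstract relies on, the sketch below expands a keyword query with synonyms drawn from a toy ontology and computes precision and recall for a retrieved document set. The ontology dictionary, query terms and document IDs are invented for the example; the paper's ontology models concepts and relationships in the patent domain.

```python
# Hypothetical mini-ontology mapping a concept to its synonyms.
ontology = {
    "automobile": {"car", "vehicle", "motorcar"},
    "battery": {"cell", "accumulator"},
}

def expand_query(keywords):
    """Expand user keywords with ontology synonyms to better reflect search intention."""
    expanded = set(keywords)
    for term in keywords:
        expanded |= ontology.get(term, set())
    return expanded

def precision_recall(retrieved, relevant):
    """Precision: share of retrieved documents that are relevant.
    Recall: share of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

print(expand_query({"automobile", "battery"}))
print(precision_recall(retrieved={"d1", "d2", "d3"}, relevant={"d2", "d3", "d4"}))
```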

Analyzing the Efficiency of Korean Rail Transit Properties using Data Envelopment Analysis (자료포락분석기법을 이용한 도시철도 운영기관의 효율성 분석)

  • 김민정; 김성수
    • Journal of Korean Society of Transportation / v.21 no.4 / pp.113-132 / 2003
  • Using nonradial data envelopment analysis (DEA) under the assumptions of strong disposability and variable returns to scale, this paper annually estimates the productive, technical and allocative efficiencies of three publicly-owned rail transit properties that differ in organizational type: Seoul Subway Corporation (SSC, a local public corporation), the Seoul Metropolitan Electrified Railways sector (SMESRS) of Korea National Railroad (the national railway operator controlled by the Ministry of Construction and Transportation (MOCT)), and Busan Urban Transit Authority (BUTA, a national authority controlled by MOCT). Using the estimation results of a Tobit regression analysis, the paper next computes their true productive, true technical and true allocative efficiencies, which reflect only the impacts of internal factors such as production activity by removing the impacts of external factors such as organizational type and track utilization rate. The paper also computes an organizational efficiency and annual gross efficiencies for each property. Each property is conceptualized as producing a single output (car-kilometers) using four inputs (labor, electricity, car & maintenance, and track), and the analysis uses unbalanced panel data consisting of annual observations on SSC, SMESRS and BUTA. The results obtained from DEA show that, on average, SSC is the most efficient property on the productive and allocative sides, while SMESRS is the most technically efficient one. On the other hand, BUTA is the most efficient one on the truly-productive and truly-allocative sides, while SMESRS is the most efficient on the truly-technical side. Another important result is that the differences in true efficiency estimates among the three properties are considerably smaller than those in the efficiency estimates. Besides, the most cost-efficient organizational type appears to be a local public corporation, represented by SSC, which is also the most grossly efficient property. These results suggest that a measure to sort out the impacts of external factors on the efficiency of rail transit properties is required to assess it fairly, and that restructuring an existing rail transit property (or establishing a new one) as a local public corporation (or authority) is required to improve its cost efficiency.
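
For readers unfamiliar with DEA under variable returns to scale, the sketch below solves a standard input-oriented BCC linear program with scipy. It is a radial formulation, so it only approximates the nonradial model used in the paper, and the input/output numbers are made up rather than taken from the three properties.

```python
import numpy as np
from scipy.optimize import linprog

def bcc_efficiency(X, Y, o):
    """Input-oriented DEA efficiency of DMU `o` under variable returns to scale.
    X: (m inputs x n DMUs), Y: (s outputs x n DMUs). Returns theta in (0, 1]."""
    m, n = X.shape
    s = Y.shape[0]
    # Decision variables: [theta, lambda_1, ..., lambda_n]; minimize theta.
    c = np.r_[1.0, np.zeros(n)]
    # Inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
    A_in = np.c_[-X[:, [o]], X]
    # Outputs: -sum_j lambda_j * y_rj <= -y_ro  (outputs at least y_ro)
    A_out = np.c_[np.zeros((s, 1)), -Y]
    A_ub = np.r_[A_in, A_out]
    b_ub = np.r_[np.zeros(m), -Y[:, o]]
    # Convexity constraint sum_j lambda_j = 1 imposes variable returns to scale.
    A_eq = np.r_[[0.0], np.ones(n)].reshape(1, -1)
    b_eq = [1.0]
    bounds = [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[0]

# Made-up data: 4 inputs (rows) x 3 properties, 1 output (car-kilometers).
X = np.array([[120, 95, 80], [300, 240, 200], [50, 45, 30], [70, 60, 55]], float)
Y = np.array([[1000, 820, 640]], float)
for o in range(X.shape[1]):
    print(f"DMU {o}: theta = {bcc_efficiency(X, Y, o):.3f}")
```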

Bankruptcy Forecasting Model using AdaBoost: A Focus on Construction Companies (적응형 부스팅을 이용한 파산 예측 모형: 건설업을 중심으로)

  • Heo, Junyoung; Yang, Jin Yong
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.35-48 / 2014
  • According to the 2013 construction market outlook report, the liquidation of construction companies is expected to continue due to the ongoing residential construction recession. Bankruptcies of construction companies have a greater social impact than those in other industries. However, owing to the different nature of their capital structure and debt-to-equity ratios, it is more difficult to forecast construction companies' bankruptcies than those of companies in other industries. The construction industry operates on greater leverage, with high debt-to-equity ratios and project cash flows concentrated in the second half of a project. The economic cycle greatly influences construction companies, so downturns tend to rapidly increase their bankruptcy rates. High leverage, coupled with increased bankruptcy rates, can place a greater burden on banks providing loans to construction companies. Nevertheless, bankruptcy prediction models have concentrated mainly on financial institutions, and construction-specific studies are rare. Bankruptcy prediction models based on corporate financial data have been studied for some time in various ways. However, such models are intended for companies in general and may not be appropriate for forecasting the bankruptcies of construction companies, which typically carry high liquidity risks. The construction industry is capital-intensive, operates on long timelines with large-scale investment projects, and has comparatively longer payback periods than other industries. Diverse studies of bankruptcy forecasting models based on a company's financial statements have been conducted for many years, but their subjects were general firms, and the models may not accurately forecast companies with disproportionately large liquidity risks, such as construction companies. Because the industry requires significant investment in long-term projects, it takes a long time to realize returns from that investment, and this unique capital structure means that the criteria used for other industries cannot be applied to effectively evaluate the financial risk of construction firms. The Altman Z-score, first published in 1968, is commonly used as a bankruptcy forecasting model. It forecasts the likelihood of a company going bankrupt using a simple formula, classifying the results into three categories and evaluating the corporate status as dangerous, moderate, or safe. A company in the "dangerous" category has a high likelihood of bankruptcy within two years, while those in the "safe" category have a low likelihood of bankruptcy; for companies in the "moderate" category, the risk is difficult to forecast. Many of the construction firms in this study fell into the "moderate" category, which made it difficult to forecast their risk. Along with the development of machine learning, recent studies of corporate bankruptcy forecasting have applied this technology. Pattern recognition, a representative application area of machine learning, is applied to bankruptcy forecasting: patterns are learned from a company's financial information and then judged as belonging either to the bankruptcy risk group or to the safe group. The representative machine learning models previously used in bankruptcy forecasting are Artificial Neural Networks, Adaptive Boosting (AdaBoost) and the Support Vector Machine (SVM), and there are also many hybrid studies combining these models. Existing studies using the traditional Z-score technique or machine learning focus on companies in non-specific industries, so industry-specific characteristics are not considered. In this paper, we confirm that adaptive boosting (AdaBoost) is the most appropriate forecasting model for construction companies by analyzing its performance by company size. We classified construction companies into three groups (large, medium, and small) based on the company's capital and analyzed the predictive ability of AdaBoost for each group. The experimental results showed that AdaBoost has greater predictive ability than the other models, especially for the group of large companies with capital of more than 50 billion won.
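
The capital-grouped AdaBoost evaluation described in this abstract can be sketched roughly as follows. The financial-ratio features, labels and the 50-billion-won threshold split are synthetic stand-ins; the paper's actual feature set and group boundaries are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in data: rows are companies, columns are financial ratios
# (e.g. debt-to-equity, current ratio, ROA); labels are 1 = bankrupt, 0 = solvent.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600) > 0.8).astype(int)
capital = rng.uniform(1, 100, size=600)   # hypothetical capital in billions of won

X_train, X_test, y_train, y_test, cap_train, cap_test = train_test_split(
    X, y, capital, test_size=0.3, random_state=0)

model = AdaBoostClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Evaluate separately by capital group, mirroring the paper's size-based comparison.
for name, mask in [("large (>50bn won)", cap_test > 50),
                   ("small/medium (<=50bn won)", cap_test <= 50)]:
    acc = accuracy_score(y_test[mask], model.predict(X_test[mask]))
    print(f"{name}: accuracy = {acc:.3f}")
```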

The Relationship Between DEA Model-based Eco-Efficiency and Economic Performance (DEA 모형 기반의 에코효율성과 경제적 성과의 연관성)

  • Kim, Myoung-Jong
    • Journal of Environmental Policy / v.13 no.4 / pp.3-49 / 2014
  • Growing stakeholder interest in corporate responsibility for the environment and tightening environmental regulations are highlighting the importance of environmental management more than ever. However, companies' awareness of the importance of the environment still lags behind, and related academic work has not reached consistent conclusions on the relationship between environmental performance and economic performance. One of the reasons is the different ways in which these two performances are measured. The evaluation scope of economic performance is relatively narrow and it can be measured in a single unified unit such as price, while the scope of environmental performance is diverse and a wide range of units is used to measure it instead of one unified unit. Therefore, the results of studies can differ depending on the performance indicators selected. To resolve this problem, generalized and standardized performance indicators should be developed. In particular, these indicators should cover the concepts of both environmental and economic performance, because the recent idea of environmental management has expanded to encompass sustainability. Another reason is that most current research tends to focus on the motives for environmental investment and on environmental performance, and does not offer a guideline for an effective implementation strategy for environmental management. For example, a process improvement strategy or a market differentiation strategy can be deployed by comparing the environmental competitiveness of companies in the same or similar industries, so that a virtuous cycle between environmental and economic performance can be secured. This report proposes a novel method for measuring eco-efficiency using Data Envelopment Analysis (DEA), which is able to combine multiple environmental and economic performance measures. Based on the eco-efficiencies, environmental competitiveness is analyzed and the optimal combination of inputs and outputs is recommended for improving the eco-efficiency of inefficient firms. Furthermore, panel analysis is applied to the causal relationship between eco-efficiency and economic performance, and a pooled regression model is used to investigate the effect of eco-efficiency on economic performance. The four-year eco-efficiencies of 23 companies between 2010 and 2013 are obtained from the DEA analysis; the efficiencies of the 23 companies are compared in terms of technical efficiency (TE), pure technical efficiency (PTE) and scale efficiency (SE), and a set of recommendations on the optimal combination of inputs and outputs is suggested for the inefficient companies. The experimental results of the panel analysis demonstrate causality running from eco-efficiency to economic performance. The results of the pooled regression show that eco-efficiency positively affects the financial performance (ROA and ROS) of the companies, as well as firm value (Tobin's Q, stock price, and stock returns). This report proposes a novel approach for generating standardized performance indicators from multiple environmental and economic performances, so that it can enhance the generality of related research and provide deep insight into the sustainability of environmental management. Furthermore, using the efficiency indicators obtained from the DEA model, the causes of changes in eco-efficiency can be investigated and an effective strategy for environmental management can be suggested. Finally, this report can motivate environmental management by providing empirical evidence that environmental investments can improve economic performance.
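
The two empirical steps in this abstract, the pooled regression of economic performance on eco-efficiency and the TE/PTE/SE comparison, can be sketched as below. All numbers are invented; the paper's actual panel covers 23 firms over 2010-2013 and uses several financial performance and firm value measures.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical firm-year observations: DEA eco-efficiency scores and ROA.
eco_eff = np.array([0.72, 0.81, 0.65, 0.90, 0.78, 0.85, 0.60, 0.95])
roa     = np.array([0.04, 0.06, 0.02, 0.08, 0.05, 0.07, 0.01, 0.09])

# Pooled OLS: regress economic performance (ROA) on eco-efficiency.
X = sm.add_constant(eco_eff)
result = sm.OLS(roa, X).fit()
print(result.params)    # intercept and slope on eco-efficiency
print(result.pvalues)

# Scale efficiency decomposition used in the DEA comparison: SE = TE (CRS) / PTE (VRS).
te, pte = 0.70, 0.85    # hypothetical scores for one firm
print(f"scale efficiency = {te / pte:.3f}")
```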
