• 제목/요약/키워드: Full-of-Potential Method

검색결과 243건 처리시간 0.026초

FCA 기반 계층적 구조를 이용한 문서 통합 기법 (Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis)

  • 김태환;전호철;최종민
    • 지능정보연구
    • /
    • 제17권3호
    • /
    • pp.63-77
    • /
    • 2011
  • 월드와이드웹(World Wide Web)은 인터넷에 연결된 컴퓨터를 통해 사람들이 정보를 공유할 수 있는 매우 큰 분산된 정보 공간이다. 웹은 1991년에 시작되어 개인 홈페이지, 온라인 도서관, 가상 박물관 등 다양한 정보 자원들을 웹으로 표현하면서 성장하였다. 이러한 웹은 현재 5천억 페이지 이상 존재할 것이라고 추정한다. 대용량 정보에서 정보를 효과적이며 효율적으로 검색하는 기술을 적용할 수 있다. 현재 존재하는 몇몇 검색 도구들은 초 단위로 gigabyte 크기의 웹을 검사하여 사용자에게 검색 정보를 제공한다. 그러나 검색의 효율성은 검색 시간과는 다른 문제이다. 현재 검색 도구들은 사용자의 질의에 적합한 정보가 적음에도 불구하고 많은 문서들을 사용자에게 검색해준다. 그러므로 대부분의 적합한 문서들은 검색 상위에 존재하지 않는다. 또한 현재 검색 도구들은 사용자가 찾은 문서와 관련된 문서를 찾을 수 없다. 현재 많은 검색 시스템들의 가장 중요한 문제는 검색의 질을 증가 시키는 것이다. 그것은 검색된 결과로 관련 있는 문서를 증가시키고, 관련 없는 문서를 감소시켜 사용자에게 제공하는 것이다. 이러한 문제를 해결하기 위해 CiteSeer는 월드와이드웹에 존재하는 논문에 대해 한정하여 ACI(Autonomous Citation Indexing)기법을 제안하였다. "Citaion Index"는 연구자가 자신의 논문에 다른 논문을 인용한 정보를 기술하는데 이렇게 기술된 논문과 자신의 논문을 연결하여 색인한다. "Citation Index"는 논문 검색이나 논문 분석 등에 매우 유용하다. 그러나 "Citation Index"는 논문의 저자가 다른 논문을 인용한 논문에 대해서만 자신의 논문을 연결하여 색인했기 때문에 논문의 저자가 다른 논문을 인용하지 않은 논문에 대해서는 관련 있는 논문이라 할지 라도 저자의 논문과 연결하여 색인할 수 없다. 또한 인용되지 않은 다른 논문과 연결하여 색인할 수 없기 때문에 확장성이 용이하지 못하다. 이러한 문제를 해결하기 위해 본 논문에서는 검색된 문서에서 단락별 명사와 동사 및 목적어를 추출하여 해당 동사가 명사 및 목적어를 취할 수 있는 가능한 값을 고려하여 하나의 문서를 formal context 형태로 변환한다. 이 표를 이용하여 문서의 계층적 그래프를 구성하고, 문서의 그래프를 이용하여 문서 간 그래프를 통합한다. 이렇게 만들어진 문서의 그래프들은 그래프의 구조를 보고 각각의 문서의 영역을 구하고 그 영역에 포함관계를 계산하여 문서와 문서간의 관계를 표시할 수 있다. 또한 검색된 문서를 트리 형식으로 보여주어 사용자가 원하는 정보를 보다 쉽게 검색할 수 있는 문서의 구조적 통합 방법에 대해 제안한다. 제안한 방법은 루씬 검색엔진이 가지고 있는 순위 계산 공식을 이용하여 문서가 가지는 중요한 단어를 문서의 참조 관계에 적용하여 비교하였다. 제안한 방법이 루씬 검색엔진보다15% 정도 높은 성능을 나타내었다.

우리나라 은행산업(銀行産業)의 효율성분석(效率性分析)과 제도개선방안(制度改善方案) (Scale and Scope Economies and Prospect for the Korea's Banking Industry)

  • 좌승희
    • KDI Journal of Economic Policy
    • /
    • 제14권2호
    • /
    • pp.109-153
    • /
    • 1992
  • 본고(本稿)에서는 우리나라 은행산업(銀行産業)의 트랜스로그비용함수(費用函數)와 규모(規模) 및 범위(範圍)의 경제성(經濟性), 비용(費用)의 보완성(補完性) 그리고 경쟁적(競爭的) 생존력(生存力) 등 효율성지표들을 추정함으로써 은행산업(銀行産業)의 효율성(效率性)을 평가하고 제도개선방향(制度改善方向)에 대한 시사점을 논하였다. 추정결과에 의하면, 우선 규모(規模)의 경제성(經濟性)의 경우는 은행대출(銀行貸出)이 규모(規模)의 비경제하(非經濟下)에 있고 모든 다른 업무(業務)들은 규모(規模)의 경제(經濟)를 시현하고 있지만, 전업무에 걸친 규모(規模)의 경제(經濟)는 부재(不在)하는 것으로 관찰된다. 다음, 범위(範圍)의 경제(經濟)의 경우는 유가증권투자(有價證券投資)와 신탁자산(信託資産) 및 수신(受信) 등은 범위(範圍)의 경제하(經濟下)에 있는 반면, 은행예금(銀行預金)은 범위(範圍)의 비경제하(非經濟下)에 있고 전업무에 걸친 범위(範圍)의 경제(經濟)는 강한 것으로 관찰되고 있다. 그리고 비용보완성(費用補完性)의 경우는 유가증권투자(有價證券投資)가 은행대출(銀行貸出), 예금(預金) 및 신탁업무(信託業務)와, 그리고 신탁자산운용업무(信託資産運用業務)가 은행자산운용업무(銀行資産運用業務)와 각각 비용보완관계(費用補完關係)를 보이고 있는 반면, 은행예금(銀行預金)은 특히 은행대출(銀行貸出)과 그리고 신탁자산업무(信託資産業務)와 경쟁관계에 있다. 한편 은행산업(銀行産業)에는 경쟁적(競爭的) 생존력(生存力)이 부재(不在)하는 것으로 관찰되고 있다. 이상의 결과들의 시사점을 정리하면, 우선 은행대출(銀行貸出)은 상대적으로 규모를 축소하고 여타의 모든 은행업무(銀行業務)나 신탁업무(信託業務)들은 규모를 확대함으로써 효율성제고(效率性提高)에 기여할 수 있을 것이며, 은행예금(銀行預金)과 은행주변업무는 앞으로 금융(金融)의 심화(深化)가 진행되면 여타업무에서 분리되어 각각 독립 운영될 가능성이 높다. 유가증권업무(有價證券業務)와 신탁자산(信託資産) 및 수신업무(受信業務)들을 추가확대함으로써 은행업무(銀行業務)의 효율성(效率性)이 증대될 수 있을 것으로 보여 겸업주의(兼業主義) 은행제도(銀行制度)의 타당성은 높지만, 은행산업(銀行産業)의 자연독점적인 성격은 부재(不在)하여 섣부른 규모(規模)만의 확대(擴大)는 오히려 경쟁력(競爭力)을 저하시킬 수도 있을 것이다.

  • PDF

한정된 O-D조사자료를 이용한 주 전체의 트럭교통예측방법 개발 (DEVELOPMENT OF STATEWIDE TRUCK TRAFFIC FORECASTING METHOD BY USING LIMITED O-D SURVEY DATA)

  • 박만배
    • 대한교통학회:학술대회논문집
    • /
    • 대한교통학회 1995년도 제27회 학술발표회
    • /
    • pp.101-113
    • /
    • 1995
  • The objective of this research is to test the feasibility of developing a statewide truck traffic forecasting methodology for Wisconsin by using Origin-Destination surveys, traffic counts, classification counts, and other data that are routinely collected by the Wisconsin Department of Transportation (WisDOT). Development of a feasible model will permit estimation of future truck traffic for every major link in the network. This will provide the basis for improved estimation of future pavement deterioration. Pavement damage rises exponentially as axle weight increases, and trucks are responsible for most of the traffic-induced damage to pavement. Consequently, forecasts of truck traffic are critical to pavement management systems. The pavement Management Decision Supporting System (PMDSS) prepared by WisDOT in May 1990 combines pavement inventory and performance data with a knowledge base consisting of rules for evaluation, problem identification and rehabilitation recommendation. Without a r.easonable truck traffic forecasting methodology, PMDSS is not able to project pavement performance trends in order to make assessment and recommendations in the future years. However, none of WisDOT's existing forecasting methodologies has been designed specifically for predicting truck movements on a statewide highway network. For this research, the Origin-Destination survey data avaiiable from WisDOT, including two stateline areas, one county, and five cities, are analyzed and the zone-to'||'&'||'not;zone truck trip tables are developed. The resulting Origin-Destination Trip Length Frequency (00 TLF) distributions by trip type are applied to the Gravity Model (GM) for comparison with comparable TLFs from the GM. The gravity model is calibrated to obtain friction factor curves for the three trip types, Internal-Internal (I-I), Internal-External (I-E), and External-External (E-E). ~oth "macro-scale" calibration and "micro-scale" calibration are performed. The comparison of the statewide GM TLF with the 00 TLF for the macro-scale calibration does not provide suitable results because the available 00 survey data do not represent an unbiased sample of statewide truck trips. For the "micro-scale" calibration, "partial" GM trip tables that correspond to the 00 survey trip tables are extracted from the full statewide GM trip table. These "partial" GM trip tables are then merged and a partial GM TLF is created. The GM friction factor curves are adjusted until the partial GM TLF matches the 00 TLF. Three friction factor curves, one for each trip type, resulting from the micro-scale calibration produce a reasonable GM truck trip model. A key methodological issue for GM. calibration involves the use of multiple friction factor curves versus a single friction factor curve for each trip type in order to estimate truck trips with reasonable accuracy. A single friction factor curve for each of the three trip types was found to reproduce the 00 TLFs from the calibration data base. Given the very limited trip generation data available for this research, additional refinement of the gravity model using multiple mction factor curves for each trip type was not warranted. In the traditional urban transportation planning studies, the zonal trip productions and attractions and region-wide OD TLFs are available. However, for this research, the information available for the development .of the GM model is limited to Ground Counts (GC) and a limited set ofOD TLFs. The GM is calibrated using the limited OD data, but the OD data are not adequate to obtain good estimates of truck trip productions and attractions .. Consequently, zonal productions and attractions are estimated using zonal population as a first approximation. Then, Selected Link based (SELINK) analyses are used to adjust the productions and attractions and possibly recalibrate the GM. The SELINK adjustment process involves identifying the origins and destinations of all truck trips that are assigned to a specified "selected link" as the result of a standard traffic assignment. A link adjustment factor is computed as the ratio of the actual volume for the link (ground count) to the total assigned volume. This link adjustment factor is then applied to all of the origin and destination zones of the trips using that "selected link". Selected link based analyses are conducted by using both 16 selected links and 32 selected links. The result of SELINK analysis by u~ing 32 selected links provides the least %RMSE in the screenline volume analysis. In addition, the stability of the GM truck estimating model is preserved by using 32 selected links with three SELINK adjustments, that is, the GM remains calibrated despite substantial changes in the input productions and attractions. The coverage of zones provided by 32 selected links is satisfactory. Increasing the number of repetitions beyond four is not reasonable because the stability of GM model in reproducing the OD TLF reaches its limits. The total volume of truck traffic captured by 32 selected links is 107% of total trip productions. But more importantly, ~ELINK adjustment factors for all of the zones can be computed. Evaluation of the travel demand model resulting from the SELINK adjustments is conducted by using screenline volume analysis, functional class and route specific volume analysis, area specific volume analysis, production and attraction analysis, and Vehicle Miles of Travel (VMT) analysis. Screenline volume analysis by using four screenlines with 28 check points are used for evaluation of the adequacy of the overall model. The total trucks crossing the screenlines are compared to the ground count totals. L V/GC ratios of 0.958 by using 32 selected links and 1.001 by using 16 selected links are obtained. The %RM:SE for the four screenlines is inversely proportional to the average ground count totals by screenline .. The magnitude of %RM:SE for the four screenlines resulting from the fourth and last GM run by using 32 and 16 selected links is 22% and 31 % respectively. These results are similar to the overall %RMSE achieved for the 32 and 16 selected links themselves of 19% and 33% respectively. This implies that the SELINICanalysis results are reasonable for all sections of the state.Functional class and route specific volume analysis is possible by using the available 154 classification count check points. The truck traffic crossing the Interstate highways (ISH) with 37 check points, the US highways (USH) with 50 check points, and the State highways (STH) with 67 check points is compared to the actual ground count totals. The magnitude of the overall link volume to ground count ratio by route does not provide any specific pattern of over or underestimate. However, the %R11SE for the ISH shows the least value while that for the STH shows the largest value. This pattern is consistent with the screenline analysis and the overall relationship between %RMSE and ground count volume groups. Area specific volume analysis provides another broad statewide measure of the performance of the overall model. The truck traffic in the North area with 26 check points, the West area with 36 check points, the East area with 29 check points, and the South area with 64 check points are compared to the actual ground count totals. The four areas show similar results. No specific patterns in the L V/GC ratio by area are found. In addition, the %RMSE is computed for each of the four areas. The %RMSEs for the North, West, East, and South areas are 92%, 49%, 27%, and 35% respectively, whereas, the average ground counts are 481, 1383, 1532, and 3154 respectively. As for the screenline and volume range analyses, the %RMSE is inversely related to average link volume. 'The SELINK adjustments of productions and attractions resulted in a very substantial reduction in the total in-state zonal productions and attractions. The initial in-state zonal trip generation model can now be revised with a new trip production's trip rate (total adjusted productions/total population) and a new trip attraction's trip rate. Revised zonal production and attraction adjustment factors can then be developed that only reflect the impact of the SELINK adjustments that cause mcreases or , decreases from the revised zonal estimate of productions and attractions. Analysis of the revised production adjustment factors is conducted by plotting the factors on the state map. The east area of the state including the counties of Brown, Outagamie, Shawano, Wmnebago, Fond du Lac, Marathon shows comparatively large values of the revised adjustment factors. Overall, both small and large values of the revised adjustment factors are scattered around Wisconsin. This suggests that more independent variables beyond just 226; population are needed for the development of the heavy truck trip generation model. More independent variables including zonal employment data (office employees and manufacturing employees) by industry type, zonal private trucks 226; owned and zonal income data which are not available currently should be considered. A plot of frequency distribution of the in-state zones as a function of the revised production and attraction adjustment factors shows the overall " adjustment resulting from the SELINK analysis process. Overall, the revised SELINK adjustments show that the productions for many zones are reduced by, a factor of 0.5 to 0.8 while the productions for ~ relatively few zones are increased by factors from 1.1 to 4 with most of the factors in the 3.0 range. No obvious explanation for the frequency distribution could be found. The revised SELINK adjustments overall appear to be reasonable. The heavy truck VMT analysis is conducted by comparing the 1990 heavy truck VMT that is forecasted by the GM truck forecasting model, 2.975 billions, with the WisDOT computed data. This gives an estimate that is 18.3% less than the WisDOT computation of 3.642 billions of VMT. The WisDOT estimates are based on the sampling the link volumes for USH, 8TH, and CTH. This implies potential error in sampling the average link volume. The WisDOT estimate of heavy truck VMT cannot be tabulated by the three trip types, I-I, I-E ('||'&'||'pound;-I), and E-E. In contrast, the GM forecasting model shows that the proportion ofE-E VMT out of total VMT is 21.24%. In addition, tabulation of heavy truck VMT by route functional class shows that the proportion of truck traffic traversing the freeways and expressways is 76.5%. Only 14.1% of total freeway truck traffic is I-I trips, while 80% of total collector truck traffic is I-I trips. This implies that freeways are traversed mainly by I-E and E-E truck traffic while collectors are used mainly by I-I truck traffic. Other tabulations such as average heavy truck speed by trip type, average travel distance by trip type and the VMT distribution by trip type, route functional class and travel speed are useful information for highway planners to understand the characteristics of statewide heavy truck trip patternS. Heavy truck volumes for the target year 2010 are forecasted by using the GM truck forecasting model. Four scenarios are used. Fo~ better forecasting, ground count- based segment adjustment factors are developed and applied. ISH 90 '||'&'||' 94 and USH 41 are used as example routes. The forecasting results by using the ground count-based segment adjustment factors are satisfactory for long range planning purposes, but additional ground counts would be useful for USH 41. Sensitivity analysis provides estimates of the impacts of the alternative growth rates including information about changes in the trip types using key routes. The network'||'&'||'not;based GMcan easily model scenarios with different rates of growth in rural versus . . urban areas, small versus large cities, and in-state zones versus external stations. cities, and in-state zones versus external stations.

  • PDF