• Title/Summary/Keyword: 통계분석기법 (statistical analysis techniques)


Patent and Statistics, What's the Connection? (특허와 통계학, 그 연결은?)

  • Jun, Sung-Hae;Uhm, Dai-Ho
    • Communications for Statistical Applications and Methods / v.17 no.2 / pp.205-222 / 2010
  • A patent is an intellectual property right granted to an inventor or the inventor's assignee for a limited period under international law. Patents matter not only for the invention of new machines; they also underpin worldwide competition in using and creating technology. Most business models are good examples of patented technology, but a statistical analysis model can be another. In this paper we study and analyze the currently applied-for and registered patents on statistical analysis and data mining models, and suggest a statistical tool for analyzing and categorizing patent data. For this study, all patents in Korea and the U.S. were listed and searched to sample only the cases concerning statistics.

District-Level Seismic Vulnerability Rating and Risk Level Based-Density Analysis of Buildings through Comparative Analysis of Machine Learning and Statistical Analysis Techniques in Seoul (머신러닝과 통계분석 기법의 비교분석을 통한 건물에 대한 서울시 구별 지진취약도 등급화 및 위험건물 밀도분석)

  • Sang-Bin Kim;Seong H. Kim;Dae-Hyeon Kim
    • Journal of Industrial Convergence / v.21 no.7 / pp.29-39 / 2023
  • Recently, there have been numerous earthquakes both domestically and internationally, and buildings in South Korea are particularly vulnerable in terms of seismic design and earthquake damage. The objective of this study is therefore to find an effective method for assessing the seismic vulnerability of buildings and conducting a density analysis of high-risk structures, to model this approach, and to validate it using data from a pilot area (Seoul). To this end, two modeling approaches were employed. The statistical analysis technique achieved a predictive accuracy of 87%. Among the machine learning techniques, the Random Forest model exhibited the highest predictive accuracy, 97.1% on the test set. In the analysis, the district rating showed that Gwangjin-gu and Songpa-gu were at relatively higher risk, and the density analysis of at-risk buildings predicted that Seocho-gu, Gwanak-gu, and Gangseo-gu were at relatively higher risk. Finally, the statistical analysis technique predicted greater danger than the machine learning technique. However, considering that about 18.9% of the buildings in Seoul are designed to withstand a seismic intensity of 6.5 (MMI), the standard for seismic-resistant design in South Korea, the machine learning result was judged to be more accurate. The current research is limited in that it considers only buildings, without taking into account factors such as population density, police stations, and fire stations. Addressing these limitations in future studies would lead to more comprehensive and valuable research.

Introductory Statistics textbooks: crisis or opportunity? (교양 통계학 교재: 위기인가? 기회인가?)

  • Choi, Sookhee;Han, Kyungsoo
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.105-117 / 2022
  • Recently, the number of students taking basic statistics in liberal arts courses at universities nationwide has been increasing significantly. Students who learn statistics only for one semester are more likely to live as consumers than producers of statistical analysis in the future. What consumers need is statistical literacy and thinking skills rather than statistical methods. This paper deals with what points should be considered in order to develop textbooks that improve statistical thinking.

Traffic Analysis of Statistics based on Internet Application Services (인터넷 응용 서비스의 통계에 근거한 트래픽 분석)

  • 정태수;최진섭;정중수;김정태;김대영
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.5 / pp.995-1003 / 2004
  • With the development of the Internet backbone, many Internet application services are now in use. Well-known services such as WWW, FTP, and email were provided first. A great many services on non-well-known ports have since appeared to meet the demand for diverse content. By analyzing the PDU information of packets that use non-well-known ports on the Internet, the Internet service type and its statistical data can be identified, which is very useful information for Internet traffic analysts. This paper presents a mechanism for extracting Internet application services that run on occasionally used (non-)well-known UDP or TCP ports, using NetFlow and the tcpdump method introduced by Ethereal, together with the operation scheme of the service. To obtain detailed statistics of the analyzed application services, an agent and server environment is also introduced, in which the agent gathers raw traffic data and the server applies the BNF (Backus-Naur Form) method to the traffic received from the agent. Applying the presented mechanism to the LAN of Andong National University, the paper presents the Internet traffic service types and the detailed statistics of the analyzed application services, which are very useful information for Internet traffic analysts.

A Study of Library Grouping using Cluster Analysis Methods (군집분석 기법을 이용한 공공도서관 그룹화에 대한 연구)

  • Kwak, Chul Wan
    • Journal of the Korean BIBLIA Society for library and Information Science / v.31 no.3 / pp.79-99 / 2020
  • The purpose of this study is to investigate cluster analysis models for grouping public libraries and to analyze their characteristics. Statistical data on public libraries from the National Library Statistics System were used, and three cluster analysis models were applied. Cluster analysis based on the size of public libraries divided them largely into two clusters, and the cluster sizes were heavily skewed to one side. For grouping based on size, Ward's method of hierarchical cluster analysis and the k-means cluster analysis model were suitable. Three suggestions were presented as implications for grouping public libraries. First, library service-related data must be collected in addition to statistical data. Second, an analysis model suited to the data set being analyzed must be applied. Third, the possibility of using cluster analysis techniques in fields other than library grouping should be studied.
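As a rough illustration of the size-based k-means grouping mentioned in this abstract, the sketch below runs Lloyd's k-means on a single variable. The function name `kmeans_1d`, the collection sizes, and the choice of starting centers are all assumptions made for the example; none of them comes from the paper.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Lloyd's k-means on one variable; returns (labels, centers)."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            for v in values
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers


# Hypothetical collection sizes (thousands of volumes), not data from the paper.
sizes = [12, 15, 9, 20, 310, 14, 11, 280]
labels, centers = kmeans_1d(sizes, centers=[min(sizes), max(sizes)])
print(labels, centers)
```

With these made-up values, six of the eight libraries fall into one cluster, the kind of one-sided split the abstract reports for size-based grouping.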

A Study on the Contamination of Groundwater Quality in Busan Using Indicator Kriging (Indicator 크리깅을 이용한 부산지하수 수질의 오염도 연구)

  • 강동환;정상용;김병우;심병완;성익환;조병욱
    • Proceedings of the Korean Society of Soil and Groundwater Environment Conference / 2003.09a / pp.249-253 / 2003
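  • General statistical analysis was performed on six groundwater quality components (pH, TS, KMnO$_4$, Cl, SO$_4$, NO$_3$-N) surveyed in 1998 across Busan, excluding Gangseo-gu. Except for pH, the five other components were positively skewed, with medians smaller than their means, so indicator kriging, a nonparametric geostatistical technique, was applied to analyze the degree of water contamination. Drinking-water standards were applied to the six components, assigning a value of “1” to potable samples and “0” to non-potable samples. Experimental variogram analysis of the transformed data for each component selected a linear model for pH, TS, and SO$_4$, and a spherical model for KMnO$_4$, Cl, and NO$_3$-N. In this study, indicator kriging was used to map the distribution of the six components and to analyze the degree of contamination in the Busan area. Although indicator kriging cannot represent the quantitative distribution over the whole study area, it can accurately identify the presence or absence and the magnitude of contamination, and it can compensate for statistical errors on which outliers may have a large influence.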
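The indicator coding and experimental variogram steps described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, sample concentrations, coordinates, the 250 limit, and the lag tolerance are all hypothetical, and the coding shown applies only to a component with a single upper limit.

```python
import math


def indicator_transform(values, limit):
    """Code each sample as 1 (potable: value <= limit) or 0 (not potable)."""
    return [1 if v <= limit else 0 for v in values]


def experimental_variogram(coords, indicators, lag, tol):
    """Semivariance over all pairs whose separation lies within lag +/- tol."""
    total, count = 0.0, 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            distance = math.dist(coords[i], coords[j])
            if abs(distance - lag) <= tol:
                total += (indicators[i] - indicators[j]) ** 2
                count += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return total / (2 * count) if count else float("nan")


# Hypothetical well locations and concentrations (illustrative only).
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
concentration = [100, 300, 250, 400]
codes = indicator_transform(concentration, limit=250)
print(codes)
print(experimental_variogram(coords, codes, lag=1, tol=0.1))
```

Because the coded data take only the values 0 and 1, a single extreme concentration cannot dominate the variogram, which is the outlier robustness the abstract points to.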


Data value extraction through comparison of online big data analysis results and water supply statistics (온라인 빅 데이터 분석 결과와 상수도 통계 비교를 통한 데이터 가치 추출)

  • Hong, Sungjin;Yoo, Do Guen
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.431-431 / 2021
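  • With the advent of the Fourth Industrial Revolution, interest in extracting value through data analysis for the planning, operation, and management of infrastructure is very high. In the open public data index, which evaluates data availability, accessibility, and government support, Korea scored 0.93 out of 1, ranking first among OECD member countries (as of 2019), a very high level (the average is 0.60). However, access to officially announced and distributed infrastructure information, and to information requiring in-depth research and analysis, remains limited. In particular, water supply systems, a representative type of infrastructure, are mostly designated as nationally important facilities, so there are constraints on obtaining and analyzing various information, and the relevant national statistics, the Water Supply Statistics, do not provide details such as the location and cause of abnormal events like leakage accidents. In this study, web crawling and big data analysis techniques were used to survey all news on municipal water supply leakage accidents that occurred over a given past period, and the resulting accident counts were compared and analyzed against the Water Supply Statistics, the nationally certified information. Extracting independent leakage accident articles requires procedures such as removing duplicate articles, establishing leakage-related keywords, and removing related articles from outside the water supply field; these techniques were implemented in R. In addition, a natural language processing based information extraction technique was applied to the news articles to obtain not only the number of leakage accidents but also the date, location, cause, extent of damage, and size of the affected pipe, and a way to extract and link more value than the information presented in the Water Supply Statistics was proposed. Applying the proposed methodology to metropolitan city A in Korea and comparing leakage accident counts, news article analysis yielded a number of accidents similar in scale to the number of leaks reported in the Water Supply Statistics. Because the proposed methodology can extract additional information, it is expected to be highly useful in the future.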
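The article-filtering procedure named in this abstract (duplicate removal, leak keywords, exclusion of articles outside the water supply field) might look roughly like the sketch below. The authors implemented their version in R; this Python version is only an illustration, and the function name, keyword lists, and sample articles are all hypothetical.

```python
def filter_leak_articles(articles, leak_keywords, exclude_keywords):
    """Keep one copy of each article that mentions a leak keyword
    and contains none of the out-of-field keywords."""
    seen_titles = set()
    kept = []
    for article in articles:
        # Treat titles that differ only in case or spacing as duplicates.
        key = " ".join(article["title"].lower().split())
        if key in seen_titles:
            continue
        seen_titles.add(key)
        text = (article["title"] + " " + article["body"]).lower()
        if not any(word in text for word in leak_keywords):
            continue  # not about a leak
        if any(word in text for word in exclude_keywords):
            continue  # a leak outside the water supply field
        kept.append(article)
    return kept


# Hypothetical articles and keyword lists (not from the study).
articles = [
    {"title": "Water main leak floods road", "body": "A water pipe burst downtown."},
    {"title": "Water main  leak floods road", "body": "Duplicate wire copy."},
    {"title": "Gas leak reported", "body": "A gas pipeline was shut off."},
    {"title": "City festival opens", "body": "Crowds gathered at the park."},
]
kept = filter_leak_articles(articles, ["leak", "burst"], ["gas"])
print(len(kept), kept[0]["title"])
```

Counting the surviving articles gives the accident count that the study compares against the official statistics; plain substring matching is used here for brevity and would need refinement on real news text.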


Performance Evaluation of Statistical Methods Applicable to Estimating Remaining Battery Runtime of Mobile Smart Devices (모바일 스마트 장치 배터리의 남은 시간 예측에 적용 가능한 통계 기법들의 평가)

  • Tak, Sungwoo
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.2 / pp.284-294 / 2018
  • Statistical methods have been widely used to estimate the remaining battery runtime of mobile smart devices such as smartphones, smart gears, and tablets. However, existing work in the literature considers only a particular statistical method. It is therefore difficult to determine whether statistical methods are applicable to estimating the remaining battery runtime of mobile devices. In this paper, we evaluated the performance of statistical methods applicable to estimating the remaining battery runtime of mobile smart devices. The statistical estimation methods evaluated are: simple and moving average, linear regression, multivariate adaptive regression splines, autoregressive, polynomial curve fitting, and double and triple exponential smoothing. The results give valuable insight to IT engineers who want to deploy statistical methods for estimating the remaining battery runtime of mobile smart devices.
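To make one of the listed methods concrete, the sketch below applies double exponential smoothing in the form of Holt's linear method to a series of battery readings and extrapolates to an empty battery. It is an illustration only: the abstract does not give the paper's formulation, and the function name, smoothing constants, and readings are assumptions.

```python
def holt_remaining_runtime(levels, alpha=0.5, beta=0.5):
    """Estimate how many sampling intervals remain until the battery is empty,
    using Holt's linear (double exponential) smoothing."""
    if len(levels) < 2:
        raise ValueError("need at least two battery readings")
    level = levels[0]               # smoothed battery level
    trend = levels[1] - levels[0]   # smoothed change per interval
    for reading in levels[1:]:
        previous_level = level
        level = alpha * reading + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    if trend >= 0:
        return float("inf")         # battery is not draining
    # Extrapolate the smoothed line down to a level of 0.
    return level / -trend


# Hypothetical readings in percent, one per sampling interval.
readings = [100, 98, 96, 94, 92]
print(holt_remaining_runtime(readings))
```

The result is expressed in sampling intervals, so it has to be multiplied by the interval length to obtain a time; a steadily draining series like this one is tracked exactly, while noisy readings depend on the choice of `alpha` and `beta`.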

Analysis of the Statistical Methods used in Scientific Research published in The Korean Journal of Culinary Research (한국조리학회지에 게재된 학술적 연구의 통계적 기법 분석)

  • Rha, Young-Ah;Na, Tae-Kyun
    • Culinary science and hospitality research / v.21 no.6 / pp.49-62 / 2015
  • Given that statistical analysis is an essential component of foodservice-related research, the purpose of this review is to analyse trends in the statistical methods applied to foodservice-related research. To achieve this objective, this study carried out a content analysis of 251 of the 415 research articles published in The Korean Journal of Culinary Research (TKJCR) from January 2010 to December 2013. A total of 164 research articles, those focussing on natural science research, qualitative research, and articles written in English, were excluded from the scope of this study. The results are as follows. First, among the 279 research articles based on social science research methods, 269 applied quantitative research methods and only 10 applied qualitative research methods. Second, 20 articles (8.0%) of the 251 did not specify the statistical methods or computer programs used for statistical analysis. Third, 228 articles (90.8%) used the SPSS program for data analysis. Fourth, in terms of frequency of use, frequency analysis was used most, followed in order by reliability analysis, exploratory factor analysis, correlation analysis, regression analysis, structural equation modeling, confirmatory factor analysis, t-test, variance analysis, and cross-tabulation analysis. However, 3 of the 56 research articles that used a t-test did not report a t-value, and 10 of the 64 articles that used ANOVA and demonstrated a significant difference in between-group means did not conduct a post-hoc test. Researchers interested in foodservice fields therefore need to keep in mind that choosing and applying the correct statistical technique determines both the value and the success or failure of a study. To enhance the value and success of a study, the proper statistical technique must be used efficiently in order to prevent statistical errors.

Predicting and Reviewing the Amount of Snow Damage in Korea using Statistical and Machine Learning Techniques (통계기법 및 기계학습 기법을 이용한 우리나라 대설피해액 예측 및 적용성 검토)

  • Lee, Hyeong Joo;Lee, Keun Woo;Jang, Hyeon Bin;Chung, Gun Hui
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.384-384 / 2022
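  • Past heavy snow damage in Korea was characterized by being regionally concentrated. Now, however, heavy snow damage is increasing nationwide, and countermeasures to prepare for it are needed. Yet research at a level that allows disasters to be prepared for in advance through accurate damage prediction remains insufficient. This study therefore aimed to develop models that can roughly predict the amount of damage caused by heavy snow using various statistical and machine learning techniques. Heavy snow damage prediction models were developed with four techniques: multiple regression analysis, support vector machine, artificial neural network, and random forest. Socioeconomic and meteorological factors were used as independent variables, and the damage amounts in the record of heavy snow damage from 1994 to 2020 were used as the dependent variable. The predictive power of the four models was verified and compared across techniques to review the applicability of the developed models. If follow-up research proceeds with reference to the model improvement and update measures presented in this study, it is expected to enable preparation for heavy snow damage that will expand nationwide in the future, and to enable preemptive preparation by analyzing regional priorities for investment in recovery and prevention costs.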
