• Title/Abstract/Keywords: grouped data

Search results: 836 items (processing time 0.023 s)

A Study on Eating-out Behavior by Cluster Analysis according to the Lifestyle of Female Consumers in Seoul

  • 반주원
    • Journal of the Korean Society of Food Culture
    • /
    • Vol. 23, No. 3
    • /
    • pp.377-387
    • /
    • 2008
  • The objective of this study was to use cluster analysis to determine differences in eating-out behavior among clusters of female consumers grouped by lifestyle pattern. The data were collected by interview survey from a biased sample of 1,300 females, aged 20 to 59, living in residential districts of Seoul. Reliability analysis, factor analysis, cluster analysis, cross-tabulation analysis, and analysis of variance (ANOVA) were applied to the data. Four lifestyle factors were extracted by lower-division and classified as follows: health condition, consuming, food, and housing lifestyles. Based on these four factors, the female consumers were grouped into three clusters: the consuming-individuality type, rational-pursuit type, and conservative-stability type. The eating-out behavior of each cluster differed significantly in terms of frequency of eating out, eating-out expenditures, restaurant selection criteria, food preferences, and the purpose of eating out. Since this study surveyed females aged 20 to 59, age and demographics were the differentiating factors in determining the various lifestyle types. Thus, to reach its target market, the food industry should consider market segmentation that combines demographic factors such as age, income, and marital status.

Microarray data analysis using relative hierarchical clustering

  • 우숙영;이재원;전명식
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 25, No. 5
    • /
    • pp.999-1009
    • /
    • 2014
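  • Hierarchical clustering is useful for exploring large volumes of microarray data because its results can be displayed easily as a dendrogram, and the clustering results help in understanding biological phenomena. However, hierarchical clustering merges clusters by considering only the absolute distance between two clusters, so it cannot account for the relative dissimilarity between clusters. In this study, we introduce a relative hierarchical clustering method and compare it with conventional hierarchical clustering, using simulated data with various cluster shapes, as found in microarray data, together with real microarray data. The quality of the two hierarchical clustering methods was assessed using the misclassification rate and homogeneity and heterogeneity indices.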

Estimation of the Parameters of the New Generalized Weibull Distribution

  • Zaindin, M.
    • International Journal of Reliability and Applications
    • /
    • Vol. 11, No. 1
    • /
    • pp.23-40
    • /
    • 2010
  • Recently, Zaindin and Sarhan (2009) introduced a new distribution named the new generalized Weibull distribution. This paper deals with the problem of estimating the parameters of this distribution in the case where the data are grouped and censored. We use both maximum likelihood and Bayes techniques. The results obtained are illustrated on a set of real data.


A Typology of Urban Married Women's Leisure Activities

  • 김외숙;이기춘
    • Journal of Families and Better Life
    • /
    • Vol. 10, No. 2
    • /
    • pp.61-74
    • /
    • 1992
  • The purpose of this study is to identify a typology of urban married women's leisure activities based on participation data. The survey was conducted by means of interviews with 606 married women in Seoul. The survey instrument was a questionnaire including a leisure participation scale. Data were analysed by means of frequency, percentage, arithmetic mean, standard deviation, and factor analysis, using the SPSS-X and SPSS/PC+ programs. The result was that the leisure activities of urban married women could be grouped into 5 factors: self-developing, family-oriented, religious-social, sociable, and time-spending activities. For further research, we suggested several proposals.


Design an Indexing Structure System Based on Apache Hadoop in Wireless Sensor Network

  • Keo, Kongkea;Chung, Yeongjee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • Korea Information Processing Society 2013 Spring Conference
    • /
    • pp.45-48
    • /
    • 2013
  • In this paper, we propose an Indexing Structure System (ISS) based on Apache Hadoop in a Wireless Sensor Network (WSN). Sensor data keep growing continuously and need to be controlled. Data are constantly updated in order to provide the newest information to users. As data keep growing, data retrieval and storage face some challenges. By using the ISS, we can maximize processing quality and minimize data retrieval time. To design the ISS, Indexing Types have to be defined depending on each sensor type. After identification, each sensor goes through Indexing Structure Processing (ISP) in order to be indexed. After ISP, indexed data are streamed and stored in the Hadoop Distributed File System (HDFS) across a number of separate machines. Indexed data are split and processed by MapReduce tasks. Data are sorted and grouped depending on sensor data object categories. Thus, when users send requests, all queries are filtered by sensor data object and the tasks are managed by the MapReduce processing framework.

Hierarchically penalized support vector machine for the classification of imbalanced data with grouped variables

  • 김은경;전명식;방성완
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 29, No. 5
    • /
    • pp.961-975
    • /
    • 2016
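  • H-SVM is a method that can simultaneously select groups and variables within groups when estimating the classification function in cases where the input variables are grouped. However, H-SVM shrinks all variables equally regardless of their importance, which can reduce estimation efficiency. In addition, when classifying imbalanced data in which class sizes differ, the estimated classification function is biased, so predictive power for the minority class may decline. To address these problems, this paper proposes WAH-SVM, which uses adaptive tuning parameters to improve variable selection performance and assigns different misclassification costs to each class. Simulations and real data analysis were used to compare the performance of the proposed model with existing methods and to confirm its usefulness and applicability.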

Characteristics of Source Acupoints: Data Mining of Clinical Trials Database

  • 최다현;이서영;이인선;류연희;채윤병
    • Korean Journal of Acupuncture
    • /
    • Vol. 38, No. 2
    • /
    • pp.100-109
    • /
    • 2021
  • Objectives: The Source acupoint is one of the representative acupoints used to treat various diseases in each meridian. We aimed to identify the patterns of selection of Source acupoints and their associations with diseases using clinical trials data. Methods: We extracted the frequency of Source acupoints across 30 diseases from a clinical trials database. Acupuncture treatment regimens were retrieved from the Cochrane Database of Systematic Reviews. The frequency of Source acupoint use was calculated as the number of studies using a certain acupoint divided by the total number of included studies. Using hierarchical clustering and multidimensional scaling, the characteristics of Source acupoints were analyzed based on the similarity of the relationships between the Source acupoints and the diseases. Results: A total of 421 clinical trials were included in this analysis. The LR3, HT7, KI3, and LI4 acupoints were most frequently used for the treatment of the 30 diseases. Cluster analysis showed that the LR3 and LI4 acupoints were grouped together and the HT7 and KI3 acupoints were grouped together. Multidimensional scaling revealed that the LR3, LI4, HT7, and KI3 acupoints have intrinsic properties in the two-dimensional space. Conclusions: The present study identified the selection patterns of the Source acupoints using clinical trials data. Our findings will improve understanding of the characteristics of Source acupoints.

Diversity of the genus Sheathia (Batrachospermales, Rhodophyta) in northeast India and east Nepal

  • Necchi, Orlando Jr.;West, John A.;Ganesan, E.K.;Yasmin, Farishta;Rai, Shiva Kumar;Rossignolo, Natalia L.
    • ALGAE
    • /
    • Vol. 34, No. 4
    • /
    • pp.277-288
    • /
    • 2019
  • Freshwater red algae of the order Batrachospermales are poorly studied in India and Nepal, especially on a molecular basis. During a survey in northeast India and east Nepal, six populations of the genus Sheathia were found and analyzed using molecular and morphological evidence. Phylogenetic analyses based on the rbcL gene sequences grouped all populations in a large clade including our S. arcuata specimens and others from several regions. Sheathia arcuata represents a species complex with a high sequence divergence and several smaller clades. Samples from India and Nepal were grouped in three distinct clades with high support and representing new cryptic species: a clade formed by two samples from India, which was named Sheathia assamica sp. nov.; one sample from India and one from Nepal formed another clade, named Sheathia indonepalensis sp. nov.; two samples from Nepal grouped with sequences from Hawaii and Indonesia (only 'Chantransia' stages) and gametophytes from Taiwan, named Sheathia dispersa sp. nov. Morphological characters of the specimens from these three species overlap one another and with the general circumscription of S. arcuata, which lacks the heterocortication (presence of bulbous cells in the cortical filaments) present in other species of the genus Sheathia. Although the region sampled is relatively restricted, the genetic diversity among specimens of these three groups was high and not closely related in the phylogenetic relationship with the other clades of S. arcuata. These data corroborate information from other groups of organisms (e.g., land and aquatic plants) that indicates this region (Eastern Himalaya) as a hotspot of biodiversity.

Investigation of ground behaviour between plane-strain grouped pile and 2-arch tunnel station excavation

  • 공석민;오동욱;안호연;이현구;이용주
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • Vol. 18, No. 6
    • /
    • pp.535-544
    • /
    • 2016
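  • With the increase of tunnels such as subways in urban areas, special design and construction methods have been proposed. Because tunnel collapse accidents cause enormous loss of life and property, observing and analyzing the behavior of tunnel excavation and the surrounding ground is very important. However, conducting field tests every time is economically impractical. Accordingly, studies that compensate for the shortcomings of field tests and produce more precise results have been published continually. This study measured the behavior of the surrounding ground during 2-arch station excavation according to the separation distance between the grouped piles and the tunnel. A trapdoor device was devised for the laboratory model test, and tunnel excavation was simulated by increasing the volume loss (VL) of the 2-arch tunnel. In addition, ground behavior was observed through close-range photogrammetry and image processing, and the laboratory model test results were compared with numerical analysis.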

A small review and further studies on the LASSO

  • Kwon, Sunghoon;Han, Sangmi;Lee, Sangin
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 24, No. 5
    • /
    • pp.1077-1088
    • /
    • 2013
  • High-dimensional data analysis arises in almost all scientific areas, evolving with the development of computing skills, and has encouraged penalized estimations that play important roles in statistical learning. Over the past years, various penalized estimations have been developed, and the least absolute shrinkage and selection operator (LASSO) proposed by Tibshirani (1996) has shown outstanding ability, earning first place in the development of penalized estimation. In this paper, we first introduce a number of recent advances in high-dimensional data analysis using the LASSO. The topics include various statistical problems such as variable selection and grouped or structured variable selection under sparse high-dimensional linear regression models. Several unsupervised learning methods, including inverse covariance matrix estimation, are presented. In addition, we address further studies on new applications, which may establish a guideline on how to use the LASSO for statistical challenges of high-dimensional data analysis.