• Title/Summary/Keyword: 표준 알고리즘 (standard algorithm)


Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.123-139
    • /
    • 2019
  • The job classification systems of major job sites differ from site to site, and also differ from the job classification system of the SQF (Sectoral Qualifications Framework) proposed in the SW field. A new job classification system that SW companies, SW job seekers, and job sites can all understand is therefore needed. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing the SQF, the job posting information of major job sites, and the NCS (National Competency Standards). To this end, association analysis is conducted between the occupations of major job sites, and association rules are derived between the SQF and those occupations as well as between the occupations themselves. Using these rules, we propose an intelligent job classification system based on data that map the job classification systems of major job sites onto the SQF. First, major job sites are selected to obtain information on the job classification systems of the SW market. We then identify how to collect job information from each site and gather the data through open APIs. Focusing on the relationships in the data, only job postings listed on multiple job sites at the same time are kept, and the remaining postings are discarded. Next, the job classification systems of the sites are mapped using the association rules derived from the association analysis. After completing this mapping between the market systems, we discuss the results with experts, further map them to the SQF, and finally propose a new job classification system. As a result, more than 30,000 job postings were collected in XML format through the open APIs of WORKNET, JOBKOREA, and saramin, the main job sites in Korea. After filtering down to about 900 job postings simultaneously listed on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent-pattern mining method.
Based on these 800 association rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and organized into first through fourth levels. In the new job taxonomy, the first primary class, covering IT consulting, computer systems, networks, and security-related jobs, consisted of three secondary, five tertiary, and five quaternary classifications. The second primary class, covering databases and system operation, consisted of three secondary, three tertiary, and four quaternary classifications. The third primary class, covering web planning, web programming, web design, and games, consisted of four secondary, nine tertiary, and two quaternary classifications. The last primary class, covering ICT management and computer and communication engineering technology, consisted of three secondary and six tertiary classifications. Notably, the new job classification system allows a relatively flexible classification depth, unlike existing systems: WORKNET divides jobs into three levels, while JOBKOREA and saramin divide jobs into two levels and then subdivide them in keyword form. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In the proposed system, some jobs stop at the second level while others are subdivided down to the fourth level, reflecting the idea that not all jobs can be broken down to the same depth. The proposed system thus combines rules derived from association analysis of collected market data with experts' opinions.
Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing systems. This study is meaningful in that it suggests a new job classification system reflecting market demand by mapping between occupations based on data through association analysis, rather than relying on the intuition of a few experts. However, this study has the limitation that it cannot fully reflect market demand as it changes over time, because the data were collected at a single point in time. As market demands change over time, including seasonal factors and the timing of major corporate recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving the SQF in the SW industry, and the approach is expected to transfer to other industries building on its success in the SW industry.
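The Apriori step the abstract describes can be sketched in miniature: postings that appear on multiple sites act as transactions of (site, category) labels, and frequently co-occurring labels yield the rules used to map one site's categories onto another's. All site category names, thresholds, and data below are illustrative, not taken from the study.

```python
from itertools import combinations

# Hypothetical transactions: each set holds the category labels the same
# posting received on different job sites (names are made up for illustration).
transactions = [
    {"WORKNET:web_dev", "JOBKOREA:web_programming", "saramin:web"},
    {"WORKNET:web_dev", "JOBKOREA:web_programming", "saramin:web"},
    {"WORKNET:security", "JOBKOREA:infosec", "saramin:security"},
    {"WORKNET:web_dev", "JOBKOREA:web_programming", "saramin:design"},
    {"WORKNET:security", "JOBKOREA:infosec", "saramin:security"},
]

def apriori_pairs(transactions, min_support=0.4):
    """One level of the Apriori pass: frequent 2-itemsets built only from
    frequent 1-itemsets (the Apriori anti-monotonicity property)."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    freq1 = {i for i in items
             if sum(i in t for t in transactions) / n >= min_support}
    pairs = {}
    for a, b in combinations(sorted(freq1), 2):
        sup = sum({a, b} <= t for t in transactions) / n
        if sup >= min_support:
            pairs[(a, b)] = sup
    return pairs

def rules(pairs, transactions, min_conf=0.8):
    """Derive X -> Y rules with confidence = support(X,Y) / support(X)."""
    n = len(transactions)
    out = []
    for (a, b), sup in pairs.items():
        for x, y in ((a, b), (b, a)):
            conf = sup / (sum(x in t for t in transactions) / n)
            if conf >= min_conf:
                out.append((x, y, sup, conf))
    return out

for lhs, rhs, sup, conf in rules(apriori_pairs(transactions), transactions):
    print(f"{lhs} -> {rhs}  support={sup:.2f}  confidence={conf:.2f}")
```

A high-confidence rule such as `WORKNET:security -> JOBKOREA:infosec` is the kind of evidence the study uses to map one site's category onto another's before expert review.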

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.1-21
    • /
    • 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, they have been employed to discover new market and/or technology opportunities and support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the specific product level. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural-network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, data related to product information are collected, refined, and restructured into a form suitable for the Word2Vec model. Next, the preprocessed data are embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data on the extracted products are summed to estimate the market size of each product group. As experimental data, text data of product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training.
We performed parameter optimization for training and then applied a vector dimension of 300 and a window size of 15 as optimized parameters for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to KSIC index words were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied for the first time to market size estimation, overcoming the limitations of traditional sampling-based methods or methods requiring multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application, since it can resolve unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs improvement in terms of accuracy and reliability. The semantic word embedding module can be advanced by imposing a proper order on the preprocessed dataset or by combining another measure, such as Jaccard similarity, with Word2Vec. Also, the product group clustering could be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect that they will further improve the performance of the basic model conceptually proposed in this study.
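The grouping-and-summing step can be sketched without the Word2Vec training itself: assume each product name has already been embedded as a vector (tiny hand-made vectors stand in for Word2Vec output below), pool the products whose cosine similarity to a reference index word exceeds a threshold, and sum their sales. All names, vectors, and sales figures are hypothetical.

```python
from math import sqrt

# Hypothetical embeddings standing in for trained Word2Vec vectors.
embeddings = {
    "laptop":      [0.9, 0.1, 0.0],
    "notebook_pc": [0.8, 0.2, 0.1],
    "desktop_pc":  [0.7, 0.3, 0.0],
    "apple_juice": [0.0, 0.1, 0.9],
}
# Hypothetical per-product sales figures (arbitrary units).
sales = {"laptop": 120.0, "notebook_pc": 80.0,
         "desktop_pc": 60.0, "apple_juice": 40.0}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def market_size(index_word, threshold=0.9):
    """Pool products similar to the index word and sum their sales."""
    ref = embeddings[index_word]
    group = [p for p, vec in embeddings.items()
             if cosine(ref, vec) >= threshold]
    return group, sum(sales[p] for p in group)

group, size = market_size("laptop")
print(group, size)
```

Raising or lowering `threshold` widens or narrows the product group, which mirrors the abstract's point that the market category level can be adjusted via the cosine similarity threshold.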

A Study on LRFD Reliability Based Design Criteria of RC Flexural Members (R.C. 휨부재(部材)의 L.R.F.D. 신뢰성(信賴性) 설계기준(設計基準)에 관한 연구(研究))

  • Cho, Hyo Nam
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.1 no.1
    • /
    • pp.21-32
    • /
    • 1981
  • Recent trends in design standards development in some European countries and the U.S.A. have encouraged the use of probabilistic limit state design concepts. Reliability-based design criteria such as LSD, LRFD, and PBLSD, adopted in those countries, have the potential to simplify the design process and place it on a consistent reliability basis across various construction materials. Reliability-based design criteria for RC flexural members are proposed in this study. Lind-Hasofer's invariant second-moment reliability theory is used in deriving an algorithmic reliability analysis method as well as in the iterative determination of load and resistance factors. In addition, Cornell's mean first-order second-moment method is employed as a practical tool for approximate reliability analysis and the derivation of design criteria. Uncertainty measures for flexural resistance and load effects are based on Ellingwood's approach for evaluating the uncertainties of loads and resistances. The relative safety levels implied by the strength design provisions of the current standard code for RC flexural members were evaluated using the second-moment reliability analysis method proposed in this study. Then, resistance and load factors corresponding to the target reliability index (β = 4), considered an appropriate level of reliability for our practice, were calculated using the proposed methods. These reliability-based factors were compared to those specified by the current ultimate strength design provisions. It was found that the reliability levels of flexural members designed by the current code are not appropriate, and that the code-specified resistance and load factors differ considerably from the reliability-based factors proposed in this study.
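Cornell's mean first-order second-moment method, which the study uses for approximate analysis, reduces to a short formula for the linear limit state g = R − S (resistance minus load effect). The means and coefficients of variation below are illustrative stand-ins, not the study's calibrated values.

```python
from math import sqrt

def reliability_index(mean_R, cov_R, mean_S, cov_S):
    """Cornell MFOSM reliability index for g = R - S:
    beta = (mu_R - mu_S) / sqrt(sigma_R^2 + sigma_S^2)."""
    sigma_R = mean_R * cov_R  # standard deviation of resistance
    sigma_S = mean_S * cov_S  # standard deviation of load effect
    return (mean_R - mean_S) / sqrt(sigma_R**2 + sigma_S**2)

# Illustrative example: resistance 60% above the load effect, with 15% and
# 20% coefficients of variation respectively.
beta = reliability_index(mean_R=1.6, cov_R=0.15, mean_S=1.0, cov_S=0.20)
print(f"beta = {beta:.2f}")
```

Calibration then runs in the opposite direction: resistance and load factors are adjusted iteratively until designs achieve a chosen target index, such as the β = 4 used in the study.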


Estimation of Forest Site Productivity by Regional Environment and Forest Soil Factors (권역별 입지·토양 환경 요인에 의한 임지생산력 추정)

  • Won Hyong-kyu;Jeong Jin-Hyun;Koo Kyo-Sang;Song Myung Hee;Shin Man Yong
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.7 no.2
    • /
    • pp.132-140
    • /
    • 2005
  • This study was conducted to develop regional site index equations for main tree species in the Gangwon, Gyunggi-Chungcheong, Gyungsang, and Jeolla areas of Korea, using environmental and soil factors obtained from a digital forest site map. Using the large data set obtained from the digital forest map, a total of 28 environmental and soil factors were regressed on site index by tree species to develop the best site index equations for each region. The selected main tree species were Larix leptolepis, Pinus koraiensis, Pinus densiflora, Pinus thunbergii, and Quercus acutissima. Finally, four to five environmental and soil factors per species were chosen as independent variables in defining the best regional site index equations with the highest coefficients of determination (R²). For these site index equations, three evaluation statistics, namely mean difference, standard deviation of difference, and standard error of difference, were applied to data sets independently collected from fields within each region. According to the evaluation statistics, the regional site index equations by species developed in this study conformed well to the independent data sets, showing relatively low bias and variation. It was concluded that the regional site index equations by species are sufficiently capable of estimating site productivity.
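The three evaluation statistics applied to the independent validation data are simple to state in code. The observed and predicted site index values below are made up for illustration; only the statistics themselves follow the abstract.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical validation data: observed vs. model-predicted site index (m).
observed  = [14.0, 16.5, 18.0, 15.2, 17.1]
predicted = [13.6, 16.9, 17.5, 15.8, 16.7]

def evaluation_stats(obs, pred):
    """Mean difference (bias), standard deviation of difference, and
    standard error of difference between observed and predicted values."""
    d = [o - p for o, p in zip(obs, pred)]
    md = mean(d)                 # mean difference
    sd = stdev(d)                # standard deviation of difference
    se = sd / sqrt(len(d))       # standard error of difference
    return md, sd, se

md, sd, se = evaluation_stats(observed, predicted)
print(f"MD={md:.3f}  SD={sd:.3f}  SE={se:.3f}")
```

A mean difference near zero indicates low bias, while the standard deviation and standard error summarize the spread of the prediction errors, matching the abstract's "low bias and variation" criterion.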

Design of a Bit-Serial Divider in GF(2^m) for Elliptic Curve Cryptosystem (타원곡선 암호시스템을 위한 GF(2^m)상의 비트-시리얼 나눗셈기 설계)

  • 김창훈;홍춘표;김남식;권순학
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.12C
    • /
    • pp.1288-1298
    • /
    • 2002
  • To implement an elliptic curve cryptosystem over GF(2^m) at high speed, a fast divider is required. Although a bit-parallel architecture is well suited to high-speed division, an elliptic curve cryptosystem requires a large m (at least 163) to provide sufficient security. In other words, since the bit-parallel architecture has an area complexity of O(m^2), it is not suited to this application. In this paper, we propose a new serial-in serial-out systolic array for computing division in GF(2^m) using the standard basis representation. Based on a modified version of the binary extended greatest common divisor algorithm, we obtain a new data dependence graph and design an efficient bit-serial systolic divider. The proposed divider has O(m) time complexity and O(m) area complexity. If input data arrive continuously, the proposed divider produces division results at a rate of one per m clock cycles, after an initial delay of 5m-2 cycles. Analysis shows that the proposed divider provides a significant reduction in both chip area and computational delay time compared to previously proposed systolic dividers with the same I/O format. Since the proposed divider performs division at high speed with reduced chip area, it is well suited as the division circuit of an elliptic curve cryptosystem. Furthermore, since the proposed architecture does not restrict the choice of irreducible polynomial, and has a unidirectional data flow and regularity, it provides high flexibility and scalability with respect to the field size m.
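A small software sketch of GF(2^m) division via the extended greatest common divisor algorithm (the algorithmic idea underlying the systolic design, not the hardware itself) can clarify what the divider computes. Field elements are polynomials over GF(2) packed into integers; the tiny field GF(2^3) with irreducible polynomial x^3 + x + 1 is used purely for illustration, whereas the hardware targets m of 163 or more.

```python
def gf_mul(a, b, mod, m):
    """Multiply two GF(2^m) elements modulo the irreducible polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # add (XOR) current shift of a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:    # reduce when degree reaches m
            a ^= mod
    return r

def gf_div(x, y, mod, m):
    """Compute x / y in GF(2^m) as x * y^(-1), finding y^(-1) with the
    extended Euclidean algorithm on binary polynomials."""
    u, v = y, mod
    g1, g2 = 1, 0           # invariants: g1*y ≡ u, g2*y ≡ v (mod mod)
    while u != 1:
        j = u.bit_length() - v.bit_length()
        if j < 0:           # keep deg(u) >= deg(v)
            u, v = v, u
            g1, g2 = g2, g1
            j = -j
        u ^= v << j         # cancel the leading term of u
        g1 ^= g2 << j       # mirror the operation on the cofactor
    return gf_mul(x, g1, mod, m)   # g1 = y^(-1), so x * g1 = x / y

MOD, M = 0b1011, 3          # x^3 + x + 1, irreducible over GF(2)
print(bin(gf_div(0b110, 0b010, MOD, M)))  # (x^2 + x) / x = x + 1
```

The systolic array in the paper pipelines exactly this kind of shift-and-XOR iteration across m processing elements, which is why it reaches O(m) area while a bit-parallel multiplier-based divider needs O(m^2).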

GOCI-II Based Low Sea Surface Salinity and Hourly Variation by Typhoon Hinnamnor (GOCI-II 기반 저염분수 산출과 태풍 힌남노에 의한 시간별 염분 변화)

  • So-Hyun Kim;Dae-Won Kim;Young-Heon Jo
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.6_2
    • /
    • pp.1605-1613
    • /
    • 2023
  • The physical properties of the ocean interior are determined by temperature and salinity. To observe them over broad ocean regions, we rely on satellite observations. However, the satellite used for salinity measurement, Soil Moisture Active Passive (SMAP), has low temporal and spatial resolution and thus cannot resolve the fast-changing coastal environment. To overcome these limitations, an algorithm was developed that uses the Geostationary Ocean Color Imager-II (GOCI-II) aboard Geo-Kompsat-2B (GK-2B) as input to a Multi-layer Perceptron Neural Network (MPNN). The results show that the coefficient of determination (R²), root mean square error (RMSE), and relative root mean square error (RRMSE) between GOCI-II-based sea surface salinity (GOCI-II SSS) and SMAP were 0.94, 0.58 psu, and 1.87%, respectively. Furthermore, the spatial variation of GOCI-II SSS was very uniform, with R² over 0.8 and RMSE less than 1 psu. In addition, GOCI-II SSS was compared with the SSS measured at the Ieodo Ocean Research Station (I-ORS); the satellite estimates were slightly low, and the causes of this discrepancy were further analyzed. We then exploited the high spatial and temporal resolution of GOCI-II SSS to analyze the SSS variation caused by the 11th typhoon of 2022, Hinnamnor, using the mean and standard deviation (STD) of one day of GOCI-II SSS, which revealed large spatial and temporal changes. This study will thus shed light on research for monitoring a rapidly changing marine environment.
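The three evaluation metrics reported for the SSS model are standard and easy to reproduce. RRMSE is assumed here to mean RMSE normalized by the mean of the reference values, in percent; the salinity values below are illustrative, not the study's matchup data.

```python
from math import sqrt

def metrics(obs, pred):
    """R^2, RMSE, and relative RMSE (RMSE / mean(obs) * 100, in percent)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    r2 = 1 - ss_res / ss_tot
    rmse = sqrt(ss_res / n)
    rrmse = rmse / mean_obs * 100
    return r2, rmse, rrmse

# Hypothetical matchup values: reference SSS vs. model-estimated SSS (psu).
obs  = [30.1, 31.5, 32.0, 30.8, 33.2]
pred = [30.4, 31.2, 31.8, 31.0, 32.9]
r2, rmse, rrmse = metrics(obs, pred)
print(f"R2={r2:.3f}  RMSE={rmse:.3f} psu  RRMSE={rrmse:.2f}%")
```

Because mean open-ocean salinity is large (around 30 psu or more) relative to its variability, an RMSE well under 1 psu translates into an RRMSE of only a few percent, as in the 0.58 psu / 1.87% figures reported above.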