• Title/Summary/Keyword: 비선형 관계 (nonlinear relationship)

Search results: 1,635 (processing time: 0.029 seconds)

Development of Yóukè Mining System with Yóukè's Travel Demand and Insight Based on Web Search Traffic Information (웹검색 트래픽 정보를 활용한 유커 인바운드 여행 수요 예측 모형 및 유커마이닝 시스템 개발)

  • Choi, Youji;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.155-175
    • /
    • 2017
  • As social data come into the spotlight, major web search engines now publish data indicating how many people searched for a specific keyword: web search traffic data. Web search traffic is an aggregate record of the crowds searching for a given keyword, and in many domains it serves as a useful variable representing the attention of ordinary users to a specific interest. Many studies use web search traffic data to nowcast or forecast social phenomena such as epidemics, consumer patterns, product life cycles, and financial investment models, and such data have recently begun to be applied to predicting inbound tourism. Proper demand prediction matters because tourism is a high value-added industry that increases employment and foreign exchange earnings. Among inbound tourists, Chinese tourists (Youke) are growing continuously; Youke have been the largest inbound group for Korean tourism for many years, with the highest tourism profit per tourist as well. Research into proper Youke demand prediction is therefore important in both the public and private sectors, since accurate tourism demand prediction supports efficient decision making under limited resources. This study proposes an improved model that reflects the latest social issues by capturing the attention of groups of individuals. A trip abroad is generally a high-involvement activity, so potential tourists search deeply for information about their trip, and web search traffic captures this attention during trip preparation in an instantaneous and dynamic way. Accordingly, this study selected keywords that potential Chinese tourists were likely to search for online. Baidu, the largest Chinese web search engine with over 80% market share, provides users with access to web search traffic data.
Qualitative interviews with potential tourists helped us understand pre-trip information search behavior and identify the keywords for this study. The selected keywords were categorized into three levels by how directly they relate to "Korean tourism"; this classification reveals which keywords explain Youke inbound demand, from closely related categories to distant ones. Web search traffic for each keyword was gathered by a web crawler developed to collect data from Baidu Index. Using the automatically gathered variables, a linear model was built by multiple regression analysis, which suits the operational needs of decision and policy making because the relationships among variables are easy to explain. After composing the regression models, a model with only traditional variables was compared against a model adding web search traffic variables to the traditional model, using significance tests and R-squared; the final model was selected from this comparison. The final regression model improves explanatory power over the traditional model and has the advantages of real-time immediacy and convenience. Furthermore, this study demonstrates an intuitively visualized system for general use: the Youke Mining solution, which embeds the final regression model and offers several tourism decision-making functions through a data-science-based algorithm and a well-designed simple interface. In the end, this research carries three significant implications: theoretical, practical, and political. Theoretically, the Youke Mining system and model are a first step toward Youke inbound prediction using an interactive, instantaneous variable: web search traffic information representing tourists' attention while they prepare their trips.
Practically, Baidu data can represent in real time the attention of potential tourists preparing their own tours. Finally, in policy terms, the Chinese tourist demand prediction model based on web search traffic can support tourism decision making for efficient resource management and for optimizing opportunities for successful policy.
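The modeling step described above, fitting a regression on traditional variables and then comparing it against a model augmented with search-traffic variables via R-squared, can be sketched as follows. This is a minimal illustration with synthetic data; the variable names, sizes, and coefficients are hypothetical, not the study's actual Baidu Index series.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 60  # hypothetical number of monthly observations

# Hypothetical traditional predictors (e.g. exchange rate, season index)
traditional = rng.normal(size=(n, 2))
# Hypothetical Baidu search-traffic indices for three tourism keywords
search_traffic = rng.normal(size=(n, 3))
# Synthetic inbound-arrival series that partly depends on search traffic
arrivals = (traditional @ np.array([0.5, 0.3])
            + search_traffic @ np.array([0.8, 0.4, 0.2])
            + rng.normal(scale=0.5, size=n))

base = LinearRegression().fit(traditional, arrivals)
full_X = np.hstack([traditional, search_traffic])
augmented = LinearRegression().fit(full_X, arrivals)

# The augmented model nests the base model, so its in-sample R^2 cannot be
# lower; this mirrors the study's R-squared-based model comparison.
print(f"R^2, traditional only:  {base.score(traditional, arrivals):.3f}")
print(f"R^2, + search traffic:  {augmented.score(full_X, arrivals):.3f}")
```

In practice one would also check the significance of each added search-traffic coefficient, as the study does, rather than relying on in-sample R-squared alone.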

Studies on the Physical and Chemical Denatures of Cocoon Bave Sericin throughout Silk Filature Processes (제사과정 전후에서의 견사세리신의 물리화학적 성질변화에 관한 연구)

  • 남중희
    • Journal of Sericultural and Entomological Science
    • /
    • v.16 no.1
    • /
    • pp.21-48
    • /
    • 1974
  • These studies were carried out to disclose the physical and chemical properties of the sericin fraction obtained from silk cocoon shells and its swelling and solubility characteristics. The following results were obtained. I. Physical and chemical properties of the sericin fraction. 1) In contrast to the easily water-soluble sericin, the hardly soluble sericin contains fewer amino acids with polar side radicals, while hardly soluble amino acids such as alanine and leucine were detected. 2) The easily soluble amino acids were found mainly on the outer part of the fibroin, but the hardly soluble amino acids were located nearer the fibroin. 3) The swelling and solubility of sericin could hardly be assayed by amino acid composition analysis, and were considered to be closely related to the structure of the sericin crystal and its secondary structure. 4) The X-ray patterns of the cocoon filament were ring-shaped, but disappeared after the degumming treatment. 5) The sericin of the tussah silkworm (A. pernyi) showed stronger circular patterns on the meridian than that of the regular silkworm (Bombyx mori). 6) There was no pattern difference between Fractions A and B. 7) The X-ray diffraction patterns of Sericins I, II and III were similar except for an interference at 8.85 Å (side-chain spacing). 8) Amino acids above 150 in molecular weight, such as Cys, Tyr, Phe, His, and Arg, were not found quantitatively after 60 minutes of hydrolysis (6N HCl). 9) The 4.6 Å X-ray pattern tended to disappear with hot-water, ether, and alcohol treatment. 10) Partial hydrolysis of sericin showed a circular interference (2 Å) on the meridian. 11) The sericin pellet after hydrolysis was considered to be peptides composed of specific amino acids. 12) The decomposition temperature of Sericin III was higher than those of Sericins I and II.
13) The thermogram of sericin from the inner portion of the cocoon shell had double endothermic peaks at 165°C and 245°C, and its decomposition temperature was higher than that of sericin from other portions. 14) The infrared spectroscopic properties of Sericins I, II and III and of sericin extracted from each layer of the cocoon shell were similar. II. Characteristics of sericin swelling and solubility related to silk processing. 1) Fifteen minutes were required to dehydrate the free moisture of cocoon shells with a centrifugal force of 13×10⁴ dyne/g at 3,000 rpm. 2) It took 30 minutes for the sericin to show a positive reaction with the Folin-Ciocalteu reagent at room temperature. 3) The measurable wavelength of visible radiation was 500-750 mμ, and the highest absorbance was observed at 650 mμ. 4) Colorimetric analysis should be conducted at 650 mμ for low concentrations (10 μg/mℓ) and at 500 mμ for higher concentrations to obtain an exact analysis. 5) The absorption curves of sericin and egg albumin at different wavelengths were similar, but the absorbance of the former was slightly higher than that of the latter. 6) The quantity of sericin measured by colorimetric analysis turned out to be less than that measured by the Kjeldahl method. 7) Both the temperature and duration of the cocoon cooking process strongly affect the swelling and solubility of the cocoon shells, but temperature was more influential than duration. 8) The factorial relation between cooking temperature and duration showed that, at low temperature (70°C), the treatment duration should be gradually increased to reach optimum sericin swelling and solubility; at high temperature, however, swelling and solubility increased more sharply.
9) The higher the drying temperature of fresh cocoons, the less sericin swelling and solubility were obtained. 10) For a given cooking duration, the heavier the cocoon shell, the less swelling and solubility were obtained. 11) There appear to be differences in swelling and solubility between the filaments of each cocoon layer. 12) Sericin swelling and solubility in the cocoon filament were decreased by wax extraction. 13) Ionic surface-active agents accelerated the swelling and solubility of sericin in the pH range 6-7. 14) Under the same conditions, the cationic agent was absorbed into the sericin. 15) When Ca and Mg in the reeling water increased, its pH drifted toward acidity. 16) A buffering action was observed between the sericin and the water hardness constituents in the reeling water. 17) The effect of calcium on sericin swelling and solubility was more moderate than that of magnesium. 18) Dissolved water hardness constituents increased the electrical conductivity of the reeling water.


An Analytical Study on Stem Growth of Chamaecyparis obtusa (편백(扁栢)의 수간성장(樹幹成長)에 관(關)한 해석적(解析的) 연구(硏究))

  • An, Jong Man;Lee, Kwang Nam
    • Journal of Korean Society of Forest Science
    • /
    • v.77 no.4
    • /
    • pp.429-444
    • /
    • 1988
  • Considering the recent trend toward developing multiple uses of forest trees, comprehensive information on young stands of Hinoki cypress is necessary for rational forest management. From this point of view, 83 sample trees were selected and felled from 23-year-old stands of Hinoki cypress at Changsung-gun, Chonnam-do. Various stem growth factors of the felled trees were measured, and canonical correlation analysis, principal component analysis, and factor analysis were applied to investigate the stem growth characteristics and the relationships among stem growth factors, and to extract latent and comprehensive information. The results are as follows. The canonical correlation coefficient between stem volume and the quality growth factors was 0.9877. The coefficients of the canonical variates showed that DBH among the diameter growth factors and height among the height growth factors had important effects on stem volume. Analysis of the relationship between stem volume and the canonical variates, which linearly combined DBH and height as one set, showed that DBH had a greater influence on volume growth than height. The first and second principal components were adopted to meet the 85% effective-value criterion in the principal component analysis of 12 stem growth factors; their cumulative contribution rate was 88.10%. The first and second principal components were interpreted as a "size factor" and a "shape factor", respectively. From the summed proportions of the efficient principal components for each variate, all variates except crown diameter, clear length, and form height were explained at more than 87%. Two common factors were set by the eigenvalues obtained from the SMC (squared multiple correlation) in the diagonal elements of the canonical matrix. There were two latent factors, f1 and f2; the former was interpreted as the nature of the diameter growth system.
In the inherent structure of the 12 growth factors, the communalities of all factors except clear length and crown diameter had great explanatory power of 78.62-98.30%. The 83 sample trees could be classified into five stem types: a medium type within ±1 standard deviation of the factor scores, a uniform type in diameter and height growth in the 1st quadrant, a slim type in the 2nd quadrant, a dwarfish type in the 3rd quadrant, and a full-boled type in the 4th quadrant.
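The principal-component step above, standardizing the measured growth factors and reading off the cumulative contribution of the first two components, can be sketched as follows. The data here are synthetic stand-ins generated from two latent drivers (mimicking the "size" and "shape" factors), not the study's measurements.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n_trees = 83  # sample size used in the study

# Synthetic stand-in for the 12 measured stem growth factors, generated from
# two latent drivers so that PC1 and PC2 dominate the variance.
latent = rng.normal(size=(n_trees, 2))
loadings = rng.normal(size=(2, 12))
X = latent @ loadings + 0.3 * rng.normal(size=(n_trees, 12))

# Standardize each variable (unit variance), then extract principal components
X = (X - X.mean(axis=0)) / X.std(axis=0)
pca = PCA().fit(X)

# Cumulative contribution rate; the study reports 88.10% for PC1-PC2
cumulative = np.cumsum(pca.explained_variance_ratio_)
print(f"cumulative contribution of PC1-PC2: {cumulative[1]:.1%}")
```

The same factor scores could then be plotted in the PC1-PC2 plane to reproduce the quadrant-based stem-type classification described above.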


Development of Conformal Radiotherapy with Respiratory Gate Device (호흡주기에 따른 방사선입체조형치료법의 개발)

  • Chu Sung Sil;Cho Kwang Hwan;Lee Chang Geol;Suh Chang Ok
    • Radiation Oncology Journal
    • /
    • v.20 no.1
    • /
    • pp.41-52
    • /
    • 2002
  • Purpose: In 3D conformal radiotherapy, delivering the optimum dose to the tumor while sparing normal tissue without a marginal miss is restricted by organ motion. For tumors in the thorax and abdomen, the planning target volume (PTV) is defined to include a margin for movement of the tumor volume during treatment due to patient breathing. We designed a respiratory gating radiotherapy device (RGRD) for use during CT simulation, dose planning, and beam delivery under identical breathing-phase conditions. Using the RGRD, the treatment margin for breathing-induced motion of thoracic or abdominal organs can be reduced, improving the dose distribution in 3D conformal radiotherapy. Materials and Methods: Internal organ motion data for lung cancer patients were obtained by examining the diaphragm in the supine position to find its position dependency. We built an RGRD composed of a strip band, drug sensor, micro switch, and a connected on-off switch in the LINAC control box. During the same breathing phase gated by the RGRD, spiral CT scanning, virtual simulation, and 3D dose planning for lung cancer patients were performed without an extended PTV margin for free breathing, and the dose was then delivered at the same positions. We calculated effective volumes and normal tissue complication probabilities (NTCP) using dose-volume histograms for normal lung, and analyzed the changes in dose associated with selected NTCP levels and the tumor control probabilities (TCP) at these new dose levels. The effects of 3D conformal radiotherapy with the RGRD were evaluated with DVHs (dose-volume histograms), TCP, NTCP, and dose statistics. Results: The average movement of the diaphragm was 1.5 cm in the supine position when patients breathed freely.
Depending on the tumor location, the PTV margin needs to be extended from 1 cm to 3 cm, which can greatly increase normal tissue irradiation and hence the normal tissue complication probability. The simple and precise RGRD is very easy to set up on patients, is sensitive to length variation (±2 mm), and delivers on-off information to the patient and the LINAC machine. We evaluated the treatment plans of patients who had received conformal partial-organ lung irradiation for thoracic malignancies. Using the RGRD, the free-breathing PTV margin can be reduced by about 2 cm for organs that move with breathing. TCP values remained almost the same (4-5% increase) for lung cancer when the PTV margin was extended to 2.0 cm, but NTCP values increased rapidly (50-70% increase) upon extending the PTV margin by 2.0 cm. Conclusion: Internal organ motion due to breathing can be reduced effectively using our simple RGRD. This method can be used in clinical treatment to reduce the motion-induced margin, thereby reducing normal tissue irradiation. Using treatment planning software, the dose to normal tissues was analyzed by comparing dose statistics with and without the RGRD. The potential benefits of radiotherapy derive from the reduction or elimination of PTV margins associated with patient breathing, as shown by the evaluation of lung cancer patients treated with 3D conformal radiotherapy.
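The dose-volume-histogram comparison described above can be sketched as a cumulative DVH computed from per-voxel doses. The voxel doses below are synthetic (the gamma-distribution parameters are arbitrary, chosen only so the gated plan's smaller margin spares more lung), and V20 is used here merely as an illustrative dose statistic; real values would come from the treatment planning system.

```python
import numpy as np

def cumulative_dvh(dose_voxels, bin_width=0.5):
    """Cumulative DVH: fraction of organ volume receiving at least each dose level (Gy)."""
    edges = np.arange(0.0, dose_voxels.max() + bin_width, bin_width)
    volume_fraction = np.array([(dose_voxels >= d).mean() for d in edges])
    return edges, volume_fraction

# Hypothetical normal-lung voxel doses for the two planning scenarios
rng = np.random.default_rng(2)
plans = {
    "free breathing": rng.gamma(shape=2.0, scale=6.0, size=10_000),  # wider PTV margin
    "gated (RGRD)": rng.gamma(shape=2.0, scale=4.0, size=10_000),    # ~2 cm margin reduction
}

v20 = {}
for name, doses in plans.items():
    edges, volume = cumulative_dvh(doses)
    v20[name] = volume[np.searchsorted(edges, 20.0)]  # V20: volume receiving >= 20 Gy
    print(f"{name}: V20 = {v20[name]:.1%}")
```

Quantities such as effective volume, NTCP, and TCP are then derived from such DVHs; the reduction in high-dose lung volume under gating is what drives the lower NTCP reported in the study.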

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.105-129
    • /
    • 2020
  • This study uses corporate data from 2012 to 2018, when K-IFRS was applied in earnest, to predict default risk. The data totaled 10,545 rows and 160 columns: 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial-ratio indices. Unlike most prior studies, which used default events as the basis for learning about default risk, this study calculated default risk from each company's market capitalization and stock price volatility based on the Merton model. This solves the data-imbalance problem caused by the scarcity of default events, which had been pointed out as a limitation of the existing methodology, and reflects the differences in default risk that exist among ordinary companies. Because learning used only corporate information available for unlisted companies, the default risk of unlisted companies without stock price information can be derived appropriately. The model can therefore provide stable default risk assessment services to companies whose default risk is difficult to determine with traditional credit rating models, such as small and medium-sized enterprises and startups. Although predicting corporate default risk with machine learning has been studied actively in recent years, model bias remains an issue because most studies make predictions with a single model. A stable and reliable valuation methodology is required for calculating default risk, given that an entity's default risk information is used very widely in the market and sensitivity to differences in default risk is high; strict standards are also required for the calculation methods.
The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for evaluation methods to be prepared, including verification of their adequacy, in consideration of past statistical data and experience with credit ratings and of changes in future market conditions. This study reduced individual models' bias by using stacking ensemble techniques that synthesize various machine learning models. This captures the complex nonlinear relationships between default risk and various corporate information, and maximizes the advantages of machine-learning-based default risk prediction models, which take less time to calculate. To produce the sub-model forecasts used as input to the stacking ensemble model, the training data were divided into seven pieces, and the sub-models were trained on the divided sets to produce forecasts. To compare predictive power against the stacking ensemble model, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was verified on the test set. The analysis showed that the stacking ensemble model exceeded the predictive power of the Random Forest model, the best-performing single model. Next, to check for statistically significant differences, pairs of forecasts were constructed between the stacking ensemble model and each individual model. Because the Shapiro-Wilk normality test showed that none of the pairs followed normality, the nonparametric Wilcoxon rank-sum test was used to check whether the two forecasts in each pair differed in a statistically significant way. The analysis showed that the forecasts of the stacking ensemble model differed statistically significantly from those of the MLP and CNN models.
In addition, this study provides a methodology by which existing credit rating agencies can apply machine-learning-based bankruptcy risk prediction, given that traditional credit rating models can also be included as sub-models when calculating the final default probability. The stacking ensemble techniques proposed here can also help designs meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope this research will be used as a resource for increasing practical adoption by overcoming and improving the limitations of existing machine-learning-based models.
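The two key steps above, deriving default risk from the Merton model rather than observed default events, and combining sub-models via a stacking ensemble, can be sketched together with scikit-learn. Everything here is an illustrative stand-in: the firm fundamentals are synthetic, the 0.2 default-probability threshold is arbitrary, `cv=7` mirrors the seven-fold split used for out-of-fold forecasts, and the CNN sub-model is omitted for brevity.

```python
import numpy as np
from statistics import NormalDist
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_firms = 2000

def merton_pd(asset_value, debt, mu, sigma, horizon=1.0):
    """Merton-model default probability via the distance to default:
    assets follow a lognormal diffusion; default occurs if asset value
    falls below the debt level at the horizon."""
    dd = (np.log(asset_value / debt) + (mu - 0.5 * sigma**2) * horizon) / (sigma * np.sqrt(horizon))
    return np.vectorize(NormalDist().cdf)(-dd)

# Hypothetical firm fundamentals (stand-ins for the 160 financial columns)
assets = rng.lognormal(mean=5.0, sigma=0.5, size=n_firms)
leverage = rng.uniform(0.3, 0.9, size=n_firms)
volatility = rng.uniform(0.1, 0.6, size=n_firms)
debt = assets * leverage

# Continuous Merton default risk, thresholded into a binary label for illustration
pd_merton = merton_pd(assets, debt, mu=0.05, sigma=volatility)
y = (pd_merton > 0.2).astype(int)

# Noisy accounting-style features derived from the fundamentals
X = np.column_stack([
    np.log(assets) + rng.normal(scale=0.2, size=n_firms),
    leverage + rng.normal(scale=0.05, size=n_firms),
    volatility + rng.normal(scale=0.05, size=n_firms),
    rng.normal(size=n_firms),  # an uninformative ratio, for realism
])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)

# Sub-models feed out-of-fold forecasts to a logistic-regression meta-learner
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=7,  # seven-fold split for the sub-model forecasts, as in the study
)
stack.fit(X_tr, y_tr)
print(f"stacked test accuracy: {stack.score(X_te, y_te):.3f}")
```

Traditional credit rating models could be added to `estimators` as further sub-models, which is how the study suggests existing agencies could incorporate this methodology.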