Major Depressive Disorder (MDD) is a significant mental health condition triggered by various social and environmental factors. Traditional diagnostic methods for depression rely primarily on subjective assessments, including self-reported surveys and psychiatric consultations, which often lack objectivity and consistency. To address these limitations, this study developed a machine learning-based depression prediction model using ensemble learning techniques and applied SHAP (SHapley Additive exPlanations) to quantify the contribution of individual features to model predictions. The study used real-world data collected over three years from 160 participants wearing smartwatches. To increase the size and diversity of the dataset, a segmentation technique was applied, resulting in 3,080 independent samples. The dataset included sleep metrics, heart rate, step count, and Patient Health Questionnaire-9 (PHQ-9) survey results. Ensemble learning models were trained and evaluated using accuracy, precision, recall, and F1-score. Among these, XGBoost, LightGBM, and CatBoost achieved the highest accuracy, 99.50%. SHAP analysis identified key predictors of depression, with sleep metrics, specifically variability in individual sleep duration and average night sleep duration, emerging as the most significant contributors across all ensemble learning models. In particular, night sleep duration, variability in overall sleep states, proportion of deep sleep during night sleep, variability in REM sleep duration, and proportion of REM sleep during total daily sleep were found to be important features for detecting MDD. Future work aims to enhance diagnostic accuracy and robustness by expanding the dataset and feature variables, integrating advanced deep learning, and leveraging explainable AI with wearable device data to enable personalized, early detection and intervention for depression prevention and management.
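
As a rough illustration of how segmenting one participant's recording can yield several samples with sleep-duration features, the following minimal sketch may help. It is not the study's implementation: the window length, the non-overlapping step, the field name `night_sleep_min`, and the use of the population standard deviation as the "variability" measure are all assumptions made for the example, and the data are invented toy values.

```python
from statistics import mean, pstdev


def segment(records, window, step=None):
    """Split chronologically ordered daily records into fixed-length windows.

    `window` and `step` are hypothetical parameters; by default the windows
    do not overlap. Trailing days that do not fill a window are dropped.
    """
    step = step or window
    return [records[i:i + window]
            for i in range(0, len(records) - window + 1, step)]


def sleep_features(segment_records):
    """Summarise one segment with two illustrative sleep features."""
    durations = [day["night_sleep_min"] for day in segment_records]
    return {
        # Average night sleep duration within the segment (minutes).
        "avg_night_sleep_min": mean(durations),
        # Variability, taken here (by assumption) as the population std dev.
        "night_sleep_variability": pstdev(durations),
    }


# Invented toy data: six days of night sleep for a single participant.
days = [{"night_sleep_min": m} for m in [420, 390, 450, 300, 480, 410]]

# Each window becomes one sample that a classifier could be trained on.
samples = [sleep_features(s) for s in segment(days, window=3)]
```

With a window of three days, the six toy records produce two samples, each carrying its own average and variability of night sleep duration.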