Abstract
Purpose: The three Vs of volume, velocity and variety are commonly used to characterize different aspects of Big Data. Volume refers to the amount of data, variety refers to the number of types of data and velocity refers to the speed of data processing. According to these characteristics, the size of Big Data varies rapidly, some data buckets will contain outliers, and buckets might have different sizes. Correlation plays a big role in Big Data. We need something better than usual correlation measures. Methods: The correlation measures offered by traditional statistics are compared. And conditions to meet the characteristics of Big Data are suggested. Finally the correlation measure that satisfies the suggested conditions is recommended. Results: Mutual Information satisfies the suggested conditions. Conclusion: This article builds on traditional correlation measures to analyze the co-relation between two variables. The conditions for correlation measures to meet the characteristics of Big Data are suggested. The correlation measure that satisfies these conditions is recommended. It is Mutual Information.