# 1. Introduction

Video surveillance systems rely on the ability to detect moving objects in the video stream, which is a key information extraction step in a wide range of computer vision applications. Detection should be reliable and effective in order to cope with unconstrained conditions such as non-stationary backgrounds, shadows and other noisy environments. The scientific challenge is to devise and implement automatic systems that are able to detect and track moving objects and to interpret their behaviours and activities.

The problems that will be specifically addressed in the sequel include illumination changes, moving background, cast shadows, bootstrapping and camouflage. More generally, Elgammal, Harwood and Davis proposed a nonparametric estimation method for modelling the background. They used kernel density estimation (KDE) to establish the local membership of a pixel. Besides the temporally local information, spatially global cues may provide the pixel model with complementary evidence for foreground segmentation [1].

Cucchiara, Grana, Piccardi and Prati used temporal median filtering in the RGB color space to produce a background model, and explored the hue, saturation and value (HSV) color space for shadow detection, classifying as shadows those pixels that have approximately the same hue and saturation values as the background but lower luminosity [2, 3]. Salvador, Cavallaro and Ebrahimi presented a method for shadow identification suited to both still images and video sequences. In particular, their approach for video sequences consists of an initial hypothesis based on RGB differences between each frame and the reference image, followed by a validation stage that exploits photometric and geometric properties of shadows [4].

Leone and Distante proposed a shadow detection algorithm for intensity images based on texture analysis. In their approach, patches of each new frame are compared with the respective patches of the background model, and Gabor functions are used to detect whether the textural information remains the same [5]. YingLi Tian, Haowei Liu and Ming-Ting Sun proposed a background model based on Gaussian mixtures. To handle complex situations, they implemented several improvements for shadow removal, quick lighting-change adaptation, fragment reduction, and keeping a stable update rate for video streams with different frame rates [6]. Rittscher, Kato, Joga and Blake used both a hidden Markov model (HMM) and a Markov random field (MRF) for foreground and shadow segmentation. In their work, each site (or block) is modelled by a single HMM, independent of the neighbouring sites (or blocks). The HMM and the MRF are employed in two different processes to impose temporal and spatial contextual constraints, respectively [7].

In this work, we provide fuzzy-based shadow removal and integrated boundary detection for video surveillance. The main components of the proposed method are the following:

1. Background subtraction
2. Edge detection
3. Image denoising
4. Shadow removal based on fuzzy logic
5. Boundary detection
6. People tracking

The paper is organized as follows. Motivation and related works are discussed in Section 2. The proposed method is presented in Section 3. Experimental results obtained with the proposed system are described in Section 4. Section 5 concludes the paper and discusses future work.

# 2. Motivation and Related Works

In the literature, video object tracking has been intensively studied and many effective methods have been proposed. For single-target tracking, various object appearance models and motion models have been exploited to estimate the target state (location, velocity, etc.) [4, 7-10]. Recently, a class of techniques called “tracking by detection” has been shown to provide promising results [11-15]. Texture-based analysis is used to differentiate the background from the shadow regions of the input sequence, and it is among the most promising approaches to shadow detection [16]. The foreground contains the objects of interest, and the background is its complementary set. What a background subtraction technique should identify as a foreground region depends on the application, since the definition of foreground objects is made at the application level [17].

Cast shadows produce troublesome effects, typically for object tracking from a fixed viewpoint, since they yield appearance variations of objects depending on whether the objects are inside or outside the shadow. Matsushita et al. proposed a framework based on the idea of intrinsic images to handle such appearance variations by removing shadows in the image sequence [18]. Wang, Tan, Loe and Wu proposed a probabilistic approach for background subtraction and shadow removal. In their method, a combined intensity and edge measure was used for shadow detection, and temporal continuity was used to improve the detection results. The results of their method are good, but the determination of the several parameters needed by their model increases the computational cost of the method [19]. Tian et al. proposed an adaptive background model based on Gaussian mixtures, a local normalized cross-correlation metric to detect shadows, and a texture similarity metric to detect illumination changes [20].

In the method proposed by Wang et al., the foreground is separated from the shadow by dynamically integrating gradient and intensity information [21]. Hsieh et al. proposed line-based algorithms to improve the accuracy of shadow elimination [22]. Jacques et al. proposed a simple statistical model for background separation, and explored the standard deviation of pixel ratios in small neighborhoods for shadow identification [23]. Zhang et al. used the MoG model for background subtraction and explored “ratio edges” for shadow identification and removal. The core of their approach to shadow identification is to compute local ratios, which are modeled as chi-squared distributions in shadowed regions [24, 25]. Leone and Distante proposed a shadow detection algorithm for intensity images based on texture analysis. In their approach, patches of each new frame were compared with the respective patches of the background model, and Gabor functions were used to detect whether the textural information remains the same for shadowed regions [26]. Yang et al. proposed a moving cast shadow detection algorithm that combines shading, color, texture, neighborhoods and temporal consistency in the scene [27].

Yang Wang presented an approach to moving vehicle detection and cast shadow removal for video-based traffic monitoring. He developed a computationally efficient algorithm to discriminate moving cast shadows and handle non-stationary background processes for real-time vehicle detection in video streams [28]. Huang et al. proposed a statistical learning-based approach to learn and remove cast shadows [29]. Jung proposed a new method for background subtraction and shadow removal for grayscale video sequences. The background image was modeled using robust statistical descriptors, and a noise estimate was obtained. Foreground pixels were extracted, and a statistical approach combined with geometrical constraints was adopted to detect and remove shadows [30]. Besides chroma information, the method in [31] uses stereo information to remove shadows. However, this kind of method needs multiple cameras and complicated camera calibration.

Amato et al. described a novel framework for the detection and suppression of shadowed regions in most of the scenarios that occur in real video sequences. Their technique can detect both achromatic and chromatic shadows, even in the presence of camouflage, which occurs when foreground regions are very similar in color to shadowed regions. To detect shadowed regions in a scene, the values of the background image are divided by the values of the current frame in the RGB color space [32]. Liu et al. presented a novel method for shadow removal using Markov random fields (MRF). First, they constructed the shadow model in a hierarchical manner; at the pixel level, they used a Gaussian mixture model to model the behavior of cast shadows for every pixel in the HSV color space. Second, they constructed an MRF model to represent the dependencies between the label of a pixel and the shadow models of its neighbors [33].

In this paper, we use fuzzy-based shadow removal and object tracking with Kalman filters. This efficient combination of shadow removal and object tracking achieves good accuracy. The proposed method is described in detail in the next section.

# 3. Proposed System

The proposed method is outlined in Fig. 1. The block diagram consists of the following major blocks: background subtraction, edge detection, image denoising, shadow removal, boundary detection and people tracking. These blocks are explained in the following subsections.

**Fig. 1.** Proposed shadow removal method for people tracking

## 3.1 Background subtraction

The foreground objects are segmented from the background of the current video sequence using techniques such as MoG, fuzzy classification and thresholding. To segment the foreground pixels, each pixel of the input image is classified against the background model. The background scene model B(x) is represented by the minimum intensity value m(x), the maximum intensity value n(x) and the maximum difference d(x) between consecutive frames; a pixel x of image I is classified as a foreground pixel when it deviates from this model by more than a threshold.

Here B(i, j) is the background pixel, I(x, y) is the difference between the current frame and the background frame, m(x) is the minimum intensity value, n(x) is the maximum intensity value, d(x) is the maximum difference between consecutive frames, and k is the threshold constant. In this work, k ranges from 150 to 180.
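The thresholding step can be illustrated with a minimal sketch. It assumes the simplest reading of the description above: a pixel is marked as foreground when the absolute difference between the current frame and the background frame exceeds k. The function name, the list-of-lists grayscale image representation and the default k = 150 (the lower end of the stated range) are illustrative assumptions, and the min/max model m(x), n(x), d(x) is not used here.

```python
def subtract_background(frame, background, k=150):
    """Return a binary mask with 1 where |frame - background| > k, else 0.

    frame and background are equally sized grayscale images given as
    lists of rows of intensity values (an assumed representation).
    """
    mask = []
    for frame_row, background_row in zip(frame, background):
        mask.append([
            1 if abs(pixel - reference) > k else 0
            for pixel, reference in zip(frame_row, background_row)
        ])
    return mask
```

A larger k makes the test stricter, so fewer pixels are marked as foreground.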

## 3.2 Edge detection

We use the Canny edge detection algorithm to detect the edges and then fill the image regions using morphological operations. The purpose of edge detection is to reduce the amount of image data while preserving the structural information needed for the subsequent processing steps.

## 3.3 Image denoising

Denoising is generally applied after moving object detection to improve accuracy. However, this can cause some moving objects to be mistaken for noise and deleted. Therefore, we apply median filtering to remove noise from the brightness distortion and chromaticity distortion of each frame before moving shadow removal and object detection. This reduces the influence of noise on moving object detection, removes most of the noise while protecting the moving objects, and increases detection accuracy. Applying median filtering to the brightness distortion matrix also reduces the squared error of the brightness distortion. Median filtering is a smoothing technique that is effective at removing noise in smooth patches or regions of a signal while preserving edges, which are of critical importance to the visual appearance of images. We use a median filter with a 3x3 window to remove noise from the background-subtracted images.
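A 3x3 median filter can be sketched as follows. This is a minimal pure-Python illustration rather than the implementation used in the experiments; the function name and the border handling (edge pixels are replicated by clamping indices) are assumptions, since the text does not state how image borders are treated.

```python
def median_filter_3x3(image):
    """Apply a 3x3 median filter to a grayscale image (list of rows).

    Border pixels are handled by clamping indices, which replicates
    the nearest edge pixel (an assumed border policy).
    """
    rows, cols = len(image), len(image[0])
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    window.append(image[rr][cc])
            window.sort()
            output[r][c] = window[4]  # middle of the 9 sorted values
    return output
```

An isolated bright pixel is removed by this filter, while a straight step edge between two flat regions passes through unchanged, which is the edge-preserving behaviour described above.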

## 3.4 Shadow removal using fuzzy

Shadow detection is the process of classifying foreground pixels as shadow pixels based on their appearance with respect to the reference frame and the background. The fuzzy shadow detection algorithm we have defined aims to prevent moving cast shadows from being misclassified as moving objects, thereby improving the background update and reducing the under-segmentation problem. The major problem is how to distinguish between moving cast shadow points and moving object points. The fuzzy rule base uses membership functions to check the possibility that each pixel of the given image belongs to a given region. In our work, the fuzzy rule is heuristically defined using two membership functions, which represent the background pixel distribution and the shadow pixel distribution, respectively. Each membership function has a corresponding membership value for every region, which indicates the degree of belonging to that region, in combination with the widely used fuzzy IF–THEN rule structure. B(i, j) denotes the background pixel, where i and j are the row and column of the image; E(i, j) denotes the edge pixel; and S(i, j) denotes the shadow pixel.

1. IF B(i, j) = 1 AND E(i, j) = 1 THEN S(i, j) = 0
2. ELSEIF B(i, j) = 0 AND E(i, j) = 1 THEN S(i, j) = 0
3. ELSEIF B(i, j) = 0 AND E(i, j) = 0 THEN S(i, j) = 0
4. ELSEIF B(i, j) = 1 AND E(i, j) = 0 THEN S(i, j) = 1
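With crisp 0/1 inputs, the four rules above reduce to marking a pixel as shadow only when B(i, j) = 1 and E(i, j) = 0. The sketch below applies that reduced rule to two binary masks. The function name and the list-of-lists mask representation are assumptions, and the membership functions themselves are not modelled here.

```python
def shadow_mask(background_mask, edge_mask):
    """Evaluate the IF-THEN rules on binary masks B and E.

    Returns S with S[i][j] = 1 only where B[i][j] == 1 and E[i][j] == 0;
    every other combination of inputs yields 0.
    """
    return [
        [1 if b == 1 and e == 0 else 0 for b, e in zip(b_row, e_row)]
        for b_row, e_row in zip(background_mask, edge_mask)
    ]
```

For example, calling it with B = [[1, 0], [0, 1]] and E = [[1, 1], [0, 0]] exercises all four rules in the order listed.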

The surface plot of the pixel distribution, obtained using the fuzzy viewer, is shown in Fig. 2.

**Fig. 2.** Surface plot of pixel distribution using fuzzy viewer

## 3.5 Boundary detection

The border tracing algorithm is used to extract the contours of the objects (regions) from an image. When applying this algorithm, it is assumed that the image is either binary or that its regions have been previously labelled.

Algorithm’s steps:

1. Search the image from the top left until a pixel of a new region is found; this pixel P0 is the starting pixel of the region border. Define a variable dir which stores the direction of the previous move along the border from the previous border element to the current border element. Assign:
   - (i) dir = 0 if the border is detected in 4-connectivity, as shown in Fig. 3(a);
   - (ii) dir = 7 if the border is detected in 8-connectivity, as shown in Fig. 3(b).
2. Search the 3x3 neighbourhood of the current pixel in an anti-clockwise direction, beginning the neighbourhood search at the pixel positioned in the direction:
   - (i) (dir + 3) mod 4 in 4-connectivity, as shown in Fig. 3(c);
   - (ii) (dir + 7) mod 8 if dir is even, as shown in Fig. 3(d), or (dir + 6) mod 8 if dir is odd, as shown in Fig. 3(e), in 8-connectivity.

   The first pixel found with the same value as the current pixel is a new boundary element Pn. Update the dir value.
3. If the current boundary element Pn is equal to the second border element P1 and the previous border element Pn−1 is equal to P0, stop. Otherwise, repeat step 2.
4. The detected border is represented by the pixels P0 … Pn−2.

Boundary tracing in 8-connectivity is shown in Fig. 3(f). The dashed lines in the figure show the pixels tested during the border tracing.
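The steps above can be sketched for the 8-connectivity case as follows. This is an illustrative sketch under stated assumptions: the image is a binary list of rows with region pixels equal to 1, only the first region met in the raster scan is traced, and the direction codes 0 to 7 are taken to run anti-clockwise starting from "right" (the usual convention; Fig. 3 is not reproduced here, so this numbering is an assumption). The function and constant names are hypothetical.

```python
# (row, column) offsets for direction codes 0..7, anti-clockwise from "right".
# Rows grow downwards, so "up" is a negative row offset.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]


def trace_border(image):
    """Trace the 8-connected border of the first region found in a binary image."""
    rows, cols = len(image), len(image[0])

    # Step 1: raster scan from the top left for the starting pixel P0.
    start = None
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1:
                start = (r, c)
                break
        if start is not None:
            break
    if start is None:
        return []

    border = [start]
    current = start
    direction = 7  # initial value for 8-connectivity

    while True:
        # Step 2: choose where the anti-clockwise neighbourhood search begins.
        if direction % 2 == 0:
            first = (direction + 7) % 8
        else:
            first = (direction + 6) % 8
        for step in range(8):
            d = (first + step) % 8
            r = current[0] + OFFSETS[d][0]
            c = current[1] + OFFSETS[d][1]
            if 0 <= r < rows and 0 <= c < cols and image[r][c] == 1:
                current = (r, c)
                direction = d
                break
        else:
            return border  # isolated pixel: no neighbour belongs to the region

        border.append(current)

        # Step 3: stop when Pn == P1 and Pn-1 == P0.
        if len(border) > 2 and border[-1] == border[1] and border[-2] == border[0]:
            return border[:-2]  # Step 4: the border is P0 .. Pn-2
```

For a filled 3x3 square, the function returns the eight outer pixels and omits the interior pixel.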

**Fig. 3.** (a) Direction notation, 4-connectivity; (b) direction notation, 8-connectivity; (c) pixel neighbourhood search sequence in 4-connectivity; (d) and (e) search sequence in 8-connectivity; (f) boundary tracing in 8-connectivity.

## 3.6 People tracking

Establishing correspondence of connected components between frames is accomplished using a linearly predictive multiple hypotheses tracking algorithm which incorporates both position and size. We have implemented an online method for seeding and maintaining sets of Kalman filters, as explained below. The equations for the Kalman filter fall into two groups: time update equations and measurement update equations. The time update equations are responsible for projecting forward (in time) the current state and error covariance.

3.6.1 Time update

For each time step k, a Kalman filter first makes a prediction of the state at this time step:

$$x_k^- = A\,x_{k-1} + B\,u_k$$

where $x_{k-1}$ is a vector representing the process state at time k−1 and A is the process transition matrix. $u_k$ is a control vector at time k, which accounts for the action that the robot takes in response to the state, and B converts the control vector $u_k$ into state space. In our model of moving objects on 2D camera images, the state is a 4-dimensional vector [x; y; dx; dy], where x and y represent the coordinates of the object’s center, and dx and dy represent its velocity. Assuming a unit time step between frames, the transition matrix is thus simply

$$A = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

3.6.2 Error covariance prediction

The Kalman filter concludes the time update steps by projecting the estimate error covariance forward one time step:

$$P_k^- = A\,P_{k-1}\,A^T + Q$$

where $P_{k-1}$ is a matrix representing the error covariance of the state estimate at time k−1, and Q is the process noise covariance. Intuitively, the lower the prediction error covariance $P_k^-$, the more we trust the prediction of the state $x_k^-$. The prediction error covariance will be low if the process is precisely modelled, so that the entries of Q are fairly low. Unfortunately, determining Q for any process model is often difficult, because Q depends on hard-to-predict variables such as how often the target object changes velocity.

3.6.3 Measurement update

After predicting the state (and its error covariance) at time k using the time update steps, the Kalman filter next uses measurements to “correct” its prediction during the measurement update steps.

Kalman gain: First, the Kalman filter computes a Kalman gain $K_k$, which is later used to correct the state estimate:

$$K_k = P_k^- H^T \left(H\,P_k^-\,H^T + R_k\right)^{-1}$$

where H is a matrix converting state space into measurement space (discussed below), and $R_k$ is the measurement noise covariance. Like Q, $R_k$ is often difficult to determine for a set of measurements; many Kalman filter implementations statically analyse training data to determine a fixed R for all future time updates. We instead allow $R_k$ to be dynamically calculated from the measurement algorithms’ state. This procedure is detailed at the end of this section.

3.6.4 State update

Using the Kalman gain $K_k$ and the measurements $Z_k$ from this time step k, we can update the state estimate:

$$x_k = x_k^- + K_k\left(Z_k - H\,x_k^-\right)$$

Conventionally, the measurements $Z_k$ are derived from sensors. In our approach, the measurements $Z_k$ are instead the output of various tracking algorithms given the same input: one frame of a streaming video and the most likely x and y coordinates of the target object in this frame.

3.6.5 Error covariance update

The final step of the Kalman filter’s iteration is to update the error covariance $P_k$:

$$P_k = \left(I - K_k H\right) P_k^-$$

The updated error covariance will be significantly decreased if the measurements are accurate (some entries in $R_k$ are low), or only slightly decreased if the measurements are noisy (all entries of $R_k$ are high). Kalman filters are easily able to take tracking algorithm outputs as measurements. However, the difficulty in combining arbitrary tracking algorithms as measurements arises when computing the Kalman gain, because $R_k$, the measurement noise covariance matrix, is difficult to determine. Our approach to this problem computes an error or noise estimate for each tracking algorithm. This computation is trained by regression on image features that represent each algorithm’s weaknesses. The features we propose are detailed below.

At each frame, we have an available pool of Kalman models and a new available pool of connected components that they could explain. First, the models are probabilistically matched to the connected regions that they could explain. Second, the connected regions which could not be sufficiently explained are checked to find new Kalman models. Finally, models whose fitness (as determined by the inverse of the variance of its prediction error) falls below a threshold are removed. Matching the models to the connected components involves checking each existing model against the available pool of connected components which are larger than a pixel or two. All matches are used to update the corresponding model. If the updated model has sufficient fitness, it will be used in the following frame. If no match is found a “null” match can be hypothesized which propagates the model as expected and decreases its fitness by a constant factor. The unmatched models from the current frame and the previous two frames are then used to hypothesize new models. Using pairs of unmatched connected components from the previous two frames, a model is hypothesized. If the current frame contains a match with sufficient fitness, the updated model is added to the existing models. To avoid possible combinatorial explosions in noisy situations, it may be desirable to limit the maximum number of existing models by removing the least probable models when excessive models exist.
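The predict and correct cycle of Sections 3.6.1 to 3.6.5 can be sketched in a few lines of pure Python. This is a minimal illustration, not the implementation used in the experiments. It assumes a unit time step between frames, no control input (the B u_k term is omitted), a measurement consisting only of the (x, y) position, so that H selects the first two state components, and fixed Q and R supplied by the caller. All function names are hypothetical.

```python
# Constant-velocity model with a unit time step between frames.
# The state is a column vector [x, y, dx, dy]; only (x, y) is measured (assumption).
A = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
H = [[1, 0, 0, 0],
     [0, 1, 0, 0]]


def matmul(a, b):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(column) for column in zip(*a)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def predict(x, P, Q):
    """Time update: project the state and the error covariance forward."""
    x_pred = matmul(A, x)  # control term B u_k omitted (no control input)
    P_pred = add(matmul(matmul(A, P), transpose(A)), Q)
    return x_pred, P_pred


def correct(x_pred, P_pred, z, R):
    """Measurement update: Kalman gain, state update, covariance update."""
    S = add(matmul(matmul(H, P_pred), transpose(H)), R)  # innovation covariance
    K = matmul(matmul(P_pred, transpose(H)), inverse_2x2(S))
    x = add(x_pred, matmul(K, sub(z, matmul(H, x_pred))))
    P = matmul(sub(identity(4), matmul(K, H)), P_pred)
    return x, P
```

In a tracker of the kind described above, `predict` would be called once per frame for every model, and `correct` would be called only for models that were matched to a connected component, with z set to the centre of that component.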

# 4. Experimental Results

The proposed shadow removal method for people tracking is carried out in the following steps: a) background subtraction, b) edge detection, c) image denoising, d) shadow removal based on fuzzy logic, e) boundary detection and f) people tracking. First, background subtraction is carried out on the selected frame and the foreground object is detected. The Canny edge detection algorithm is used for edge detection, and image denoising is carried out using a median filter. Shadow removal is then performed using fuzzy logic. Boundary detection is necessary for tracking people; here, the border tracing algorithm is used to extract the contours of the objects (regions) from an image. Finally, people are tracked without their shadows using Kalman filters. The simulation results for different frames are given in Figs. 4-1 and 4-2.

**Fig. 4-1.** Processing on the 78th frame: (a) input frame; (b) background subtracted; (c) detected shadow; (d) detected foreground; (e) tracking with shadow; (f) tracking without shadow

**Fig. 4-2.** Processing on the 102nd frame: (a) input frame; (b) background subtracted; (c) detected shadow; (d) detected foreground; (e) tracking with shadow; (f) tracking without shadow

**Table 1.** Parameter analysis

To analyze the results of the proposed method, each pixel in the selected frame was classified as a true positive (TP), a correctly classified foreground pixel; a true negative (TN), a correctly classified background pixel; a false positive (FP), a background pixel incorrectly classified as foreground; or a false negative (FN), a foreground pixel incorrectly classified as background. After every pixel had been classified into one of these four groups, sensitivity (recall), specificity, F1 (figure of merit or F-measure), precision and accuracy were calculated.

Sensitivity (recall), also known as the detection rate, measures the proportion of actual positives that are correctly identified, that is, the number of detected true positives relative to the total number of positives in the ground truth (TP + FN). Specificity measures the proportion of actual negatives that are correctly identified. Precision measures the proportion of pixels classified as foreground that are truly foreground, and accuracy measures the proportion of all pixels that are correctly classified. Moreover, we considered F1, which is the harmonic mean of precision and recall. Sensitivity, specificity, F1, accuracy and precision are specified in Eqs. (8)-(11) and (12), respectively:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{8}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{9}$$

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{10}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{12}$$

The parameter analysis of the proposed method is tabulated in Table 1. The simulation results show that the proposed method has over 99% accuracy in people tracking by eliminating shadows in the frame.
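The standard definitions of these measures can be computed from the four pixel counts as in the sketch below. The function name is hypothetical, and the counts used in the usage note are made-up values for illustration only, not results from this paper. The sketch assumes that all denominators are non-zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute pixel-level evaluation measures from TP, TN, FP and FN counts."""
    sensitivity = tp / (tp + fn)  # recall, or detection rate
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }
```

For instance, `classification_metrics(90, 900, 10, 10)` gives a sensitivity and a precision of 0.9 each, so F1 is also 0.9, while accuracy is 990/1010 because the many correctly classified background pixels dominate the total.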

# 5. Conclusion

This paper has presented a fuzzy-based shadow removal algorithm for moving object detection in image sequences. The system explicitly addresses troublesome situations such as shadows and noisy environments. It has been evaluated on several video sequences, including both indoor and outdoor scenes, covering a wide range of environments and applications. Unlike other shadow removal methods that compute multiple, more complex statistics at a time, the very simple fuzzy operator requires very limited computation. Consequently, this approach allows fast detection of moving objects, which for many applications can run in real time even on ordinary PCs; this in turn allows subsequent higher-level tasks such as tracking and classification to be performed in real time. Comparisons with other approaches presented in the literature have shown that our approach provides better results. Currently, the method requires a non-moving camera, which restricts its usage in certain applications. In future, we plan to extend the method to support moving cameras and to develop a more accurate shadow removal method that could be implemented on a Field Programmable Gate Array (FPGA).