1. Introduction
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data. The main purpose of machine learning is to learn automatically and make intelligent decisions based on collected data [1-3, 17]. In general, any machine learning problem can be categorized as supervised or unsupervised learning. Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs [4, 5]. Supervised learning problems are divided into "regression" and "classification" problems. In a regression problem we try to predict results within a continuous output, meaning that we try to map input variables to some continuous function. In a classification problem we instead try to predict results in a discrete output; in other words, we try to map input variables into discrete categories [12].
In previous research we successfully tested PID and state feedback control to maintain the stability of a two-propeller seesaw; a brief introduction to these experiments is given below. To apply the above control methods, it is necessary to define the dynamic and kinematic models of the system, whose experimental model is shown in Fig. 1. Here l1, l2 are the distances of the brushless motors from the pivot center, ψ is the Euler angle about the x body axis, m1, m2 are the masses of the brushless DC motors with propellers fixed at the ends of the lever, F1, F2 are the thrust forces produced by the brushless DC motor driven propellers, and g is the acceleration due to gravity [1-11].
Figure 1. Propeller based seesaw model
After voltage is applied, the propellers spin and generate torque that lifts the seesaw. The torque is the sum of the tangential components of the forces multiplied by the corresponding distances from the pivot point. Neglecting friction and the effect of body moments on the translational dynamics, an expression for the forces acting on the seesaw according to Newton's laws is derived as:
\(l_{1} m_{1} g \cos \psi-l_{2} m_{2} g \cos \psi+l_{1} F_{1}-l_{2} F_{2}=-l_{1}^{2} \ddot{\psi} m_{1}-l_{2}^{2} \ddot{\psi} m_{2}\) (1)
\(\ddot{\psi}=\frac{l_{2}}{l_{m}} F_{2}-\frac{l_{1}}{l_{m}} F_{1}-\frac{l_{1} m_{1}-l_{2} m_{2}}{l_{m}} g \cos \psi\) (2)
Here, \(\ddot{\psi}\) is the angular acceleration and \(l_{m}=l_{1}^{2} m_{1}+l_{2}^{2} m_{2}\) is the moment of inertia of the system about the pivot. To use PID or state feedback control to balance the above system, the lever lengths (l1, l2), masses (m1, m2) and lifting forces (F1, F2) must be precisely defined.
In the research work presented in this article we did not use dynamic and kinematic models of the object. Instead, using the input and output values of the system, we built a real-time control system based on a supervised machine learning classification algorithm and tested it on a microcontroller. Fig. 2 shows the real model of the propeller-equipped seesaw.
Figure 2. Real model of propeller based seesaw
2. Classifier and decision boundary
The main idea of machine learning is to estimate a function from collected training data [17]. The purpose of our study is to maintain the stability of a two-propeller seesaw at a certain angle, so we focus on binary classification with only the two values 0 and 1. To collect the data, the speed of each motor is varied through a PWM signal so that it generates lifting force. The controlled variable of this system is the angle ψ of the seesaw relative to the horizontal axis, and the manipulated variable is the rotation speed given to the motorized propellers. Rotation speed PWM2 is recorded as training input x1 and PWM1 as x2. We use x(i) to denote the "input" variables (in our case the motor rotation speeds), also called input features, and y(i) to denote the "output" or target variable that we are trying to predict (the angle class). A pair (x(i), y(i)) is a training example, and the dataset used for learning, a list of m training examples (x(i), y(i)), i = 1, ..., m, is called a training set. An example of the collected data is shown in Table 1.
Table 1. Training data
We tried to stabilize the seesaw at three different angles (-10º, 0º and +10º). The first experiment was to stabilize the seesaw at -10º. If ψ < -10º the sample was labeled y = 1 and plotted as a red circle; if ψ > -10º, y = 0 and plotted as a blue circle. The training data are illustrated in Figure 3a.
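This labeling rule can be sketched in Python (a minimal sketch; the function name and arguments are ours, not from the original implementation):

```python
def label(psi_deg, setpoint=-10.0):
    """Binary class for one training sample: y = 1 (red circle) if the
    seesaw angle psi is below the set angle, otherwise y = 0 (blue circle)."""
    return 1 if psi_deg < setpoint else 0
```

For the -10º experiment, `label(-12)` gives 1 and `label(-3)` gives 0; passing `setpoint=0.0` or `setpoint=10.0` reproduces the labeling for the other two experiments.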
Figure 3. Training data and decision boundaries. a-Classification in a given angle -10º, b- Classification in a given angle 0º, c-Classification in a given angle +10º.
Using the classification method, it is possible to define the line separating the two regions of colored circles. This line is the decision boundary, which is used to stabilize the seesaw at a given angle. From the equation of this line we can estimate the second motor's rotation speed when the first motor's rotation speed is fixed.
3. Estimation of decision boundary
Figure 4 shows the machine learning workflow. The collected training data set is fed into the training algorithm, which determines the hypothesis function h. From the hypothesis function we determine the decision boundary for a given angle, and the rotation speed (PWM) of one of the propellers is calculated.
Figure 4. Learning process
In the supervised learning problem, our goal is, given a training set, to learn a hypothesis function h: X → Y so that h(x) is a "good" predictor for the corresponding value of y [2, 12-17]. For classification, the hypothesis must satisfy
\(0 \leq h_{\theta}(x) \leq 1\)
Here we use the sigmoid function to ensure this condition.
\(h_{\theta}(x)=g\left(\theta^{T} x\right)\) (3)
\(z=\theta^{T} x=\theta_{0}+\theta_{1} x_{1}+\theta_{2} x_{2}\) (4)
\(h_{\theta}(x)=g(z)=\frac{1}{1+e^{-z}}=\frac{1}{1+e^{-\theta^{T} x}}\) (5)
The function g(z) maps any real number into the interval (0, 1), which turns the linear function θᵀx into a function appropriate for classification. We can write hθ(x) as the probability function (6).
\(h_{\theta}(x)=P(y=1 \mid x ; \theta)=1-P(y=0 \mid x ; \theta)\) (6)
Here, hθ(x) is the hypothesis function and θ0, θ1, θ2 are its parameters.
To derive the discrete class 0 or 1, we threshold the hypothesis function as follows:
\(\begin{array}{l} h_{\theta}(x) \geq 0.5 \Rightarrow y=1 \\ h_{\theta}(x)<0.5 \Rightarrow y=0 \end{array}\)
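Equations (3)-(5) and the 0.5 threshold can be sketched in Python as follows (a minimal illustration; the function names are ours):

```python
import math

def sigmoid(z):
    """g(z) = 1 / (1 + e^{-z}); maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x1, x2):
    """h_theta(x) = g(theta0 + theta1*x1 + theta2*x2) for the two PWM features."""
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

def predict(theta, x1, x2):
    """Discrete class: 1 if h_theta(x) >= 0.5, else 0."""
    return 1 if hypothesis(theta, x1, x2) >= 0.5 else 0
```

Note that sigmoid(0) = 0.5, so the 0.5 threshold on hθ(x) corresponds to the sign of the linear argument θᵀx.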
The experiment was conducted to determine the parameters θ0, θ1, θ2, which are estimated so that the decision boundary aligns most closely with the training data. To achieve this, the difference between the hypothesis function and the output value,
\(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\),
should be minimal.
The squared-error cost function of linear regression cannot be used for classification tasks, because with the sigmoid hypothesis it creates a number of local minima; in other words, it is not a convex function. The cost function J(θ) of the system is instead determined by expression (7).
\(J(\theta)=\frac{1}{m} \sum_{i=1}^{m} C\left(h_{\theta}\left(x^{(i)}\right), y^{(i)}\right)\) (7)
\(\text { If } y=1, \quad C\left(h_{\theta}\left(x^{(i)}\right), y^{(i)}\right)=-\log \left(h_{\theta}\left(x^{(i)}\right)\right)\) (8)
\(\text { If } y=0, \quad C\left(h_{\theta}\left(x^{(i)}\right), y^{(i)}\right)=-\log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\) (9)
The two conditional cases of the cost function can be combined into a single expression.
\(J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)} \log \left(h_{\theta}\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right]\) (10)
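As a sketch, the cost (10) can be computed over the training set like this (illustrative Python, with the hypothesis computed inline; the function name is ours):

```python
import math

def cost(theta, X, y):
    """J(theta) from equation (10), for features X = [(x1, x2), ...],
    labels y in {0, 1}, and theta = (theta0, theta1, theta2)."""
    m = len(y)
    total = 0.0
    for (x1, x2), yi in zip(X, y):
        # h_theta(x^(i)) = g(theta^T x^(i))
        h = 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x1 + theta[2] * x2)))
        total += yi * math.log(h) + (1.0 - yi) * math.log(1.0 - h)
    return -total / m
```

With all parameters zero, h = 0.5 for every sample and J(θ) = ln 2 ≈ 0.693, a useful sanity check before training starts.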
Using the gradient descent algorithm, we find the parameter values that minimize the cost function.
\(\min _{\theta_{0} \theta_{1} \theta_{2}} J\left(\theta_{0}, \theta_{1}, \theta_{2}\right)\) (11)
First, the parameters θ0, θ1, θ2 are initialized; usually they are chosen equal to zero. The parameter values are then changed so as to decrease the value of J(θ0, θ1, θ2). As a program algorithm this is described as:
Repeat until convergence
\(\left\{\theta_{j}:=\theta_{j}-\alpha \frac{\partial}{\partial \theta_{j}} J\left(\theta_{0}, \theta_{1}, \theta_{2}\right)\right\}\)
Here, j = 0, 1, 2, all parameters are updated simultaneously, and α is the learning rate. To minimize the cost function we need its partial derivatives, which for the logistic cost (10) evaluate to \(\frac{\partial}{\partial \theta_{j}} J\left(\theta_{0}, \theta_{1}, \theta_{2}\right)=\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}\), where \(x_{0}^{(i)}=1\).
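One gradient-descent iteration can then be sketched as (illustrative Python; a simultaneous update of all three parameters, with names of our choosing):

```python
import math

def gradient_step(theta, X, y, alpha):
    """One update theta_j := theta_j - alpha * (1/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i),
    with x_0^(i) = 1; all three parameters are updated simultaneously."""
    m = len(y)
    grad = [0.0, 0.0, 0.0]
    for (x1, x2), yi in zip(X, y):
        h = 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x1 + theta[2] * x2)))
        err = h - yi               # h_theta(x^(i)) - y^(i)
        grad[0] += err             # x_0 = 1
        grad[1] += err * x1
        grad[2] += err * x2
    return [theta[j] - alpha * grad[j] / m for j in range(3)]
```

Repeating this step until J(θ) stops decreasing (or drops below a chosen target) implements the "repeat until convergence" loop above.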
4. Implementation and Experimental results
The seesaw shown in Figure 1 swings between ±25º. We tried to stabilize it at the given angles -10º, 0º and +10º, and for each learning process 500 training samples were collected. The calculations were performed on an Atmega32 controller operating at 8 MHz with learning rate α = 0.02. Reaching a cost of J(θ0, θ1, θ2) = 0.03 takes 3 hours. The following parameters were obtained as a result of training.
Table 2. Estimated parameters
The resulting hypothesis functions hθ(x) = g(z) have the arguments:
z = −1.6399 + 0.0109x1 − 0.0085x2 for angle -10º
z = −2.6249 + 0.0109x1 − 0.0072x2 for angle 0º
z = −0.9186 + 0.0109x1 − 0.0092x2 for angle +10º
Setting the hypothesis function to hθ(x) = 0.5 gives the decision boundary θ0 + θ1x1 + θ2x2 = 0. Then, by selecting the rotation speed x1 of the first propeller, the rotation speed of the second propeller is calculated from the decision boundary. For the given angle -10º, the speed of the second propeller can be estimated by the following formula:
\(x_{2}=\frac{1.6399-0.0109 x_{1}}{-0.0085}\)
The computed decision boundaries are illustrated by the green dotted lines in Figure 3. In the experiment we chose x1 = 930 for the first propeller; x2 was then calculated, and the results are shown in Table 3.
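The calculation of the second propeller speed from the decision boundary can be sketched as follows (illustrative Python; it solves hθ(x) = 0.5, i.e. θ0 + θ1x1 + θ2x2 = 0, using the Table 2 parameters for the -10º set angle):

```python
def boundary_x2(theta, x1):
    """Solve theta0 + theta1*x1 + theta2*x2 = 0 (i.e. h_theta(x) = 0.5) for x2."""
    return -(theta[0] + theta[1] * x1) / theta[2]

# Estimated parameters for the set angle -10 degrees (Table 2)
theta_minus10 = (-1.6399, 0.0109, -0.0085)
x2 = boundary_x2(theta_minus10, 930)  # second propeller PWM for x1 = 930
```

The same function with the 0º and +10º parameter sets gives the corresponding entries of Table 3.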
Table 3. Estimated second motor rotation speed from decision boundary
Figure 5 shows the experimental results of seesaw angle stabilization over 3000 samples (one sample = 5 ms).
Figure 5. Stabilization of seesaw in given angles. a - stabilization in given angle -10º, b - stabilization in given angle 0º, c- stabilization in given angle +10º.
5. Conclusion
The unstable seesaw system was trained by the classification method. The advantage of this control is that the seesaw is stabilized on the basis of the training data only, without modeling the physical parameters of the system. However, classification methods also have disadvantages. As the amount of training data increases, the calculation speed decreases drastically. The control system acts like an open-loop system without any feedback, so the output value cannot be stabilized precisely. With the learning algorithm, the system can be stabilized at a given angle under a different supply voltage or with different weights only by re-learning.
References
[1] https://en.wikipedia.org/wiki/Machine_learning
[2] Andrew Ng. CS229: Machine Learning course. Computer Science Department, Stanford University. https://www.coursera.org/learn/machine-learning
[3] R.S. Michalski, J.G. Carbonell, T.M. Mitchell. "Machine Learning: An Artificial Intelligence Approach". 1983.
[4] S.B. Kotsiantis. "Supervised Machine Learning: A Review of Classification Techniques". Informatica 31, 249-268, 2007.
[5] Ayon Dey. "Machine Learning Algorithms: A Review". International Journal of Computer Science and Information Technologies, Vol. 7(3), 1174-1179, 2016.
[6] Tengis Tserendondog, Batmunkh Amar. "Study of a balancing system based on stereo image processing". MUST, The compilation of science works of professors and teachers, No. 19/183, pages 268-274, 2015.
[7] Tengis Tserendondog, Batmunkh Amar. "PID and State Space Control of Unbalanced Swing". Mongolian Information Technology-2016, The compilation of Science Conference, page 125.
[8] Tengis Tserendondog, Batmunkh Amar, Byambajav Ragchaa. "Stereo Vision Based Balancing System Results". International Journal of Internet, Broadcasting and Communication, Vol. 8, No. 1, 1-6, E-ISSN 2288-4939. https://doi.org/10.7236/IJIBC2016.8.1.1
[9] Amar Batmunkh, Tserendondog Tengis. "State feedback control of unbalanced seesaw". 11th International Forum on Strategic Technology (IFOST), 2016. https://ieeexplore.ieee.org/document/7884181/
[10] Tengis Tserendondog, Batmunkh Amar. "Quadcopter stabilization using state feedback controller by pole placement method". International Journal of Internet, Broadcasting and Communication, Vol. 9, No. 1, 1-6, E-ISSN 2288-4939. https://doi.org/10.7236/IJIBC.2017.9.1.1
[11] Tengis Tserendondog, Batmunkh Amar. "Disturbance Rejection Control for Unbalanced Double-Propeller System on Single Axis". Khureltogoot-2017, Proceeding of International Conference of Technology and Innovation, pages 20-23, Ulaanbaatar.
[12] Jannick Verlie. "Control of an inverted pendulum with deep reinforcement learning". Master's dissertation, Department of Electronics and Information Systems, Ghent University.
[13] Ciro Donalek. "Supervised and Unsupervised Learning". Lecture note, 2012.
[14] Kangbeom Cheon, Jaehoon Kim, Moussa Hamadache, Dongik Lee. "On Replacing PID Controller with Deep Learning Controller for DC Motor System". Journal of Automation and Control Engineering, Vol. 3, No. 6, December 2015. DOI: 10.12720/joace.3.6.452-456
[15] C.W. Anderson. "Learning to control an inverted pendulum using neural networks". IEEE Control Systems Magazine, 9(3): 31-37, 1989. DOI: 10.1109/37.24809
[16] Martin Riedmiller. "Neural Reinforcement Learning to Swing-up and Balance a Real Pole". Neuroinformatics Group, University of Osnabrueck, 2000. DOI: 10.1109/ICSMC.2005.1571637
[17] Tengis Tserendondog, Batmunkh Amar. "Control of Single Propeller Pendulum with Supervised Machine Learning Algorithm". International Journal of Advanced Smart Convergence, Vol. 7, No. 3, 15-22, 2018. DOI: 10.7236/IJASC.2018.7.3.15