1. Introduction
Fisheries, including aquaculture, have advanced rapidly over the past several decades. The global production of fisheries increased from 19 million tonnes in 1950 to 179 million tonnes in 2018, representing an average annual growth rate of 3.3% [1]. In particular, fish farming, a branch of aquaculture, accounted for approximately 49% of the total production in 2018. This growth has been fundamentally supported by the adoption of advanced technologies such as artificial intelligence and the Internet of Things. These technologies have been applied to various areas of fish farming, including farm management, water quality monitoring, and fish behavior analysis, contributing to the enhanced efficiency and productivity of fish farming [2-5].
From the viewpoint of a fish farm owner, determining the desirable shipment timing is crucial since it is directly connected to profit. This decision-making process should consider various fish-farming-related factors (FRFs) such as the growth rate of fish, farm-gate prices at the source, and operational costs. Consequently, predicting these FRFs is essential, which has led to the development of various prediction models.
Initially, deterministic models such as the thermal growth coefficient (TGC) model or the von Bertalanffy-based model were developed for predicting fish growth [6, 7]. However, since fish growth data such as weight and length in each fish farming tank or cage exhibit the characteristics of a random process, probabilistic models such as the Bayesian von Bertalanffy model and Gaussian process regression (GPR) were developed [8, 9]. The Bayesian von Bertalanffy model fits the von Bertalanffy growth equation (VBGE) to length-at-age data utilizing Markov chain Monte Carlo (MCMC) and informative priors. GPR is based on Gaussian processes and provides both predictions of the target variables and their variances. To predict fish farm-gate prices at the source, the vector autoregressive (VAR) model was proposed, which was used to analyze causality in farm-gate prices. Also, artificial neural network models were proposed to analyze the relationship between amounts of consignment sale and farm-gate prices [10, 11]. In the case of cost prediction, several models were proposed, including comprehensive production cost models using the STELLA model in recirculating aquaculture systems (RAS), capital cost estimation models in fish farming based on statistical data and discount rates, and fish price variation models [12-15].
The above-mentioned models, such as growth prediction models [6-9], price prediction models [10, 11], and cost estimation or fish price variation models [12-15], can predict each FRF corresponding to the growth rate of fish, farm-gate prices, and farming costs. However, they cannot provide guidelines for shipment timing that take the total effect of FRFs into consideration. Generally, all the elements of FRFs are dependent on each other. For instance, net profit for fish farmers is usually determined by the values and costs of fish farming, and these values are affected by farm-gate prices, which are dependent on fish weight and farming costs, both of which are also affected by fish growth rate. Hence, this paper suggests a decision support system designed to integrate FRFs related to farmers’ revenue, thereby providing guidelines for fish shipment timing. The proposed system consists of prediction models for individual FRFs and a function for net profit prediction as a shipment decision supporter. This paper describes an exemplary case of the system using data from the farming of Korean flounder, a species of significant importance in Korea.
The remainder of this paper is organized as follows. Section 2 describes the architecture of a decision support system. Section 3 explains the design of two prediction models employed within the system and proposes a net profit prediction algorithm as a decision supporter. Section 4 demonstrates the process of the proposed system using specific fish, and Section 5 concludes the paper.
2. The Structure of a Decision Support System
The main components of the proposed decision support system include a controller, a preprocessor, prediction models, and functions as depicted in Fig. 1. The controller collects data from various farms, builds a database, and transfers the collected data to the preprocessor. The collected data includes sensor data (e.g., water temperature, dissolved oxygen levels), growth data (e.g., mean weight, number of fish), and farm-gate price data. Furthermore, by using the controller, users can select the output of a function that they want to predict and the prediction models that are required for the specific function. The preprocessor manipulates the raw data from the data resources and converts it into data that can be used by the prediction models. In the training mode, the converted data is also formatted in the form of training data to train the prediction models. If the amount of training data is insufficient, some data augmentation techniques can be used.
Fig. 1. A block diagram of a decision support system
The prediction models are designed to predict the weight growth and farm-gate prices of the cultured fish. In this paper, GPR and long short-term memory (LSTM) are used for growth prediction and farm-gate price prediction, respectively. Note that although other machine learning models, including artificial neural networks, can be substituted, we focus on these two models in this paper since GPR provides the mean and variance of the weight data and LSTM reflects the time-varying characteristics of the prices.
Finally, the functions provide the predicted outputs that users want to know. The outputs include the weight of cultured fish, the farm-gate price, and the net profit. Weight growth prediction uses the GPR model to provide the total/mean weight, weight by weight class, standard deviation, and a 95% confidence interval of cultured fish. Farm-gate price prediction uses the LSTM model to provide the farm-gate price per weight according to weight classes. Net profit prediction uses both the GPR and LSTM models to indicate when shipment is advantageous. The fish in a specific farm are classified by weight, and the net profit for each class and for the sum of the classes is estimated using the predicted weight of cultured fish, farm-gate prices, and farming costs. Users can identify the shipment timing that benefits them by referring to the net profit by class or the total net profit. As a practical example, the system for Korean flounder is described in section 4.
3. Design of Prediction Models and a Net Profit Algorithm
The proposed shipment decision support system uses various data related to fish farming to support the decision-making process from a profit standpoint. To this end, the net profit for each weight class and for the sum of all classes is estimated from the predicted values of the FRFs and provided to the user. In this section, we describe two prediction models, GPR for growth and LSTM for farm-gate prices, and propose an algorithm to estimate the net profit using the predicted values. Also, we briefly mention the farming costs that are used for the net profit estimation.
3.1 Prediction Models
3.1.1 Weight Growth Prediction – GPR
The following three aspects should be considered in selecting a model for weight growth prediction.
1) Not all juvenile fish stocked in fish cages (or tanks) grow at the same rate. When analyzing the data collected by sampling at an arbitrary time in a specific farming cage or tank, the weight of fish can be viewed as a random variable with a Gaussian distribution, and the set of these random variables as a function of time can be viewed as a Gaussian random process. The mean and variance at every sampling time are expressed as functions of time and environmental variables such as water temperature or feeding quantity (in this paper, one environmental variable, water temperature, is considered).
2) Usually, the mean and variance of fish weight increase with time, but depending on the fish species or environmental conditions, the rate of increase may slow down, or the mean and variance themselves may decrease, because large fish can eat small fish from some point in time [16].
3) Unlike environmental data such as water temperature or dissolved oxygen levels, which are automatically collected by sensors, the growth data is collected manually, so it is labor-intensive and can damage the fish and incur extra costs [17]. Therefore, collecting growth data is not easy, and the resulting data is small in amount and aperiodic in time.
Considering these aspects, we use GPR as the weight growth prediction model. GPR can customize growth predictions by learning the data collected from each farm. GPR provides standard deviations along with means to understand the distribution of individual weights or serve as the basis for addressing uncertainties. These features allow GPR to provide insights into the variability and distribution of growth of cultured fish to support sorting and shipment timing decisions.
GPR is based on Gaussian processes. A Gaussian process is a collection of random variables which has a joint Gaussian distribution. It can be expressed as the following equation with a training set 𝒟 = {(𝐱𝑖, 𝑦𝑖) | 𝑖 = 0, … , 𝑛 − 1} of 𝑛 observations where 𝐱𝑖 denotes a column input vector of dimension 𝑀 and 𝑦𝑖 indicates a scalar output, and a test set 𝒟∗ = {(𝐱∗𝑖, 𝑦∗𝑖)|𝑖 = 0, … , 𝑙 − 1} of 𝑙 observations where 𝐱∗𝑖 denotes a column input vector of dimension 𝑀 and 𝑦∗𝑖 indicates a scalar output [18].
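\(\begin{align}f(\boldsymbol{X}) \sim \mathcal{G P}\left(m(\boldsymbol{X}), K\left(\boldsymbol{X}, \boldsymbol{X}^{\prime}\right)\right)\end{align}\) (1)
where \(\boldsymbol{X}=\left[\mathbf{x}_{0}\ \mathbf{x}_{1}\ \cdots\ \mathbf{x}_{n-1}\right]^{T}\), \(m(\boldsymbol{X})\) is the mean function representing the expected value for the training inputs \(\boldsymbol{X}\), and \(K\left(\boldsymbol{X}, \boldsymbol{X}^{\prime}\right)\) is the covariance matrix between \(\boldsymbol{X}\) and \(\boldsymbol{X}^{\prime}\), which are in a data set. The covariance matrix measures the similarity between inputs and determines the flexibility of the model.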
GPR, as depicted in Fig. 2, predicts the distribution of outputs f∗ (or 𝐲∗) for a new input 𝑿∗ based on the given training set 𝒟. The predictive distribution is expressed as the following conditional probability distribution [18].
\(\begin{align}\mathrm{f}_{*} \mid \boldsymbol{X}, \mathbf{y}, \boldsymbol{X}_{*} \sim N\left(\overline{\mathrm{f}}_{*}, \operatorname{cov}\left(\mathrm{f}_{*}\right)\right)\end{align}\) (2)
\(\begin{align}\overline{\mathrm{f}}_{*} \triangleq E\left[\mathrm{f}_{*} \mid \boldsymbol{X}, \mathbf{y}, \boldsymbol{X}_{*}\right]=K\left(\boldsymbol{X}_{*}, \boldsymbol{X}\right)\left[K(\boldsymbol{X}, \boldsymbol{X})+\sigma_{n}^{2} I\right]^{-1} \mathbf{y}\end{align}\) (3)
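\(\begin{align}\operatorname{cov}\left(\mathrm{f}_{*}\right)=K\left(\boldsymbol{X}_{*}, \boldsymbol{X}_{*}\right)-K\left(\boldsymbol{X}_{*}, \boldsymbol{X}\right)\left[K(\boldsymbol{X}, \boldsymbol{X})+\sigma_{n}^{2} I\right]^{-1} K\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)\end{align}\) (4)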
Fig. 2. A concept of Gaussian process regression
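where \(\mathbf{y}=\left[y_{0}\ y_{1}\ \cdots\ y_{n-1}\right]^{T}\) is the column vector of training outputs, \(\boldsymbol{X}_{*}=\left[\mathbf{x}_{* 0}\ \mathbf{x}_{* 1}\ \cdots\ \mathbf{x}_{* l-1}\right]^{T}\), \(\mathbf{y}_{*}=\left[y_{* 0}\ y_{* 1}\ \cdots\ y_{* l-1}\right]^{T}\), \(\overline{\mathrm{f}}_{*}\) represents the mean vector of the predicted outputs for a new input \(\boldsymbol{X}_{*}\), \(\operatorname{cov}\left(\mathrm{f}_{*}\right)\) denotes the covariance matrix of the predictions, \(\sigma_{n}^{2}\) represents the noise variance, \(I\) denotes the identity matrix, and \(K(\boldsymbol{X}, \boldsymbol{X})\) is a covariance matrix (or Gram matrix) whose \((i, j)\)th entry is \(K_{i j}=k\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)\) for \(0 \leq i, j \leq n-1\).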
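As an illustration of (3) and (4), the predictive mean and covariance can be computed directly from the Gram matrices. The following is a minimal sketch in plain Python, not the implementation used in this paper; the function names (`se_kernel`, `gram`, `solve`, `gp_predict`) and the toy data are hypothetical, and a single squared exponential kernel is assumed for simplicity.

```python
import math

def se_kernel(a, b, variance=1.0, length=1.0):
    """Squared exponential kernel between two input vectors (lists of floats)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return variance * math.exp(-0.5 * sq / length ** 2)

def gram(A, B, k):
    """Matrix whose (i, j)th entry is k(A[i], B[j])."""
    return [[k(a, b) for b in B] for a in A]

def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][j] * z[j] for j in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z

def gp_predict(X, y, X_star, k, noise_var):
    """Return the predictive mean of eq. (3) and covariance of eq. (4)."""
    K = gram(X, X, k)
    for i in range(len(X)):
        K[i][i] += noise_var              # K(X, X) + sigma_n^2 I
    K_s = gram(X_star, X, k)              # K(X*, X)
    K_ss = gram(X_star, X_star, k)        # K(X*, X*)
    alpha = solve(K, y)                   # [K(X, X) + sigma_n^2 I]^{-1} y
    mean = [sum(ks * a for ks, a in zip(row, alpha)) for row in K_s]
    cov = []
    for i, row_i in enumerate(K_s):
        v = solve(K, row_i)               # [K(X, X) + sigma_n^2 I]^{-1} K(X, x*_i)
        cov.append([K_ss[i][j] - sum(kj * vj for kj, vj in zip(K_s[j], v))
                    for j in range(len(X_star))])
    return mean, cov

# Hypothetical toy data: input = [month], output = mean weight (arbitrary units).
X_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 1.0, 4.0]
mean, cov = gp_predict(X_train, y_train, [[1.5]], se_kernel, noise_var=0.01)
print(mean[0], cov[0][0])
```

The diagonal of the returned covariance gives the predictive variance at each test input, which is the quantity used later for confidence intervals and weight class proportions.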
For predicting the mean weight and variance of cultured fish, the GPR model was implemented in Python using the GPy library [19]. The covariance matrix of the designed GPR was constructed as a sum of well-known kernels: squared exponential (SE), product (linear and white noise kernels), Matérn, and ARD SE kernels [18, 20].
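\(\begin{align}K\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)=K_{\text {trend }}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)+K_{\text {noise }}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)+K_{\text {Matérn }}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)+K_{A R D}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)\end{align}\) (5)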
where
\(\begin{align}K_{\text {trend }}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)=K_{\text {trend }_{i j}}=k_{\text {trend }}\left(\mathbf{x}_{i}, \mathbf{x}_{* j}\right)=\sigma_{\text {trend }}^{2} \exp \left(-\frac{1}{2 l_{N}^{2}}\left|\mathbf{x}_{i}-\mathbf{x}_{* j}\right|^{2}\right)\end{align}\) (6)
\(\begin{align}K_{\text {noise }}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)=K_{\text {noise }_{i j}}=k_{\text {noise }}\left(\mathbf{x}_{i}, \mathbf{x}_{* j}\right)=\sigma_{\text {noise }}^{2} \delta\left(\mathbf{x}_{i}, \mathbf{x}_{* j}\right) \times\left(\mathbf{x}_{i}^{T} \mathbf{x}_{* j}\right)\end{align}\) (7)
\(\begin{align}K_{\text {Matérn }}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)=K_{\text {Matérn }_{i j}}=k_{\text {Matérn }}\left(\mathbf{x}_{i}, \mathbf{x}_{* j}\right)=\sigma_{\text {Matérn }}^{2}\left(1+\frac{\sqrt{5} r}{l_{m}}+\frac{5 r^{2}}{3 l_{m}^{2}}\right) \exp \left(-\frac{\sqrt{5} r}{l_{m}}\right)\end{align}\) (8)
\(\begin{align}K_{A R D}\left(\boldsymbol{X}, \boldsymbol{X}_{*}\right)=K_{A R D_{i j}}=k_{A R D}\left(\mathbf{x}_{i}, \mathbf{x}_{* j}\right)=\sigma_{A R D}^{2} \exp \left(-\frac{1}{2} \sum_{d=1}^{M} \frac{\left(\mathbf{x}_{i}^{[d]}-\mathbf{x}_{* j}^{[d]}\right)^{2}}{l_{d}^{2}}\right)\end{align}\) (9)
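where \(\sigma_{\text {trend }}^{2}\), \(\sigma_{\text {noise }}^{2}\), \(\sigma_{\text {Matérn }}^{2}\), and \(\sigma_{A R D}^{2}\) represent the variances of \(K_{\text {trend }}\), \(K_{\text {noise }}\), \(K_{\text {Matérn }}\), and \(K_{A R D}\), respectively. Also, \(l_{N}\), \(l_{m}\), and \(l_{d}\) denote the length scales for \(K_{\text {trend }}\), \(K_{\text {Matérn }}\), and the \(d\)th dimension of \(K_{A R D}\), respectively. \(\delta\left(\mathbf{x}_{i}, \mathbf{x}_{* j}\right)\) represents the Kronecker delta, which returns 1 if the input vectors are identical and 0 otherwise, and \(\mathbf{x}_{i}^{T} \mathbf{x}_{* j}\) denotes the dot product of the two input vectors. \(r\) represents the absolute distance \(\left|\mathbf{x}_{i}-\mathbf{x}_{* j}\right|\). \(\mathbf{x}_{i}^{[d]}\) and \(\mathbf{x}_{* j}^{[d]}\) represent the values of the \(d\)th dimension of the vectors \(\mathbf{x}_{i}\) and \(\mathbf{x}_{* j}\) when automatic relevance determination (ARD) is activated. ARD allows for learning a separate length scale parameter for each input dimension, enabling the model to estimate the importance of each dimension. The prediction results of the designed model are described in section 4.2.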
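The composite covariance function in (5)-(9) can be written out entry by entry. The sketch below is a plain-Python illustration under stated assumptions, not the GPy implementation used in this paper: the function names, the parameter dictionary keys, and all hyperparameter values are hypothetical, and each input vector is assumed to be [aquacultivation period in months, water temperature in ℃].

```python
import math

def k_trend(x, z, var, length):
    """Squared exponential kernel, eq. (6)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return var * math.exp(-sq / (2.0 * length ** 2))

def k_noise(x, z, var):
    """Product of white noise and linear kernels, eq. (7)."""
    delta = 1.0 if x == z else 0.0        # Kronecker delta on the input vectors
    return var * delta * sum(a * b for a, b in zip(x, z))

def k_matern52(x, z, var, length):
    """Matern 5/2 kernel, eq. (8)."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)))
    s = math.sqrt(5.0) * r / length
    return var * (1.0 + s + s * s / 3.0) * math.exp(-s)

def k_ard(x, z, var, lengths):
    """SE kernel with one length scale per input dimension, eq. (9)."""
    q = sum(((a - b) / l) ** 2 for a, b, l in zip(x, z, lengths))
    return var * math.exp(-0.5 * q)

def k_total(x, z, p):
    """Sum of the four kernels, eq. (5)."""
    return (k_trend(x, z, p["var_trend"], p["len_trend"])
            + k_noise(x, z, p["var_noise"])
            + k_matern52(x, z, p["var_matern"], p["len_matern"])
            + k_ard(x, z, p["var_ard"], p["len_ard"]))

# Hypothetical hyperparameters, chosen only to make the example run.
params = {"var_trend": 2.0, "len_trend": 3.0, "var_noise": 0.5,
          "var_matern": 1.5, "len_matern": 2.0,
          "var_ard": 1.0, "len_ard": [4.0, 1.0]}
print(k_total([6.0, 20.0], [7.0, 18.5], params))
```

Because the Kronecker delta in (7) is zero for distinct inputs, the noise term contributes only on the diagonal of the Gram matrix, while the ARD term lets the period and temperature dimensions have different length scales.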
To evaluate the performance of the GPR model, mean square error (MSE) and mean standardized log-likelihood (MSLL) are used. The MSE value is obtained by selecting the mean weight predictions for each aquacultivation period as representative values [9]. MSLL can be calculated in a similar manner and is particularly useful for evaluating probabilistic models, making it a criterion for performance evaluation along with MSE [18]. MSE and MSLL are defined as follows [9].
\(\begin{align}\operatorname{MSE}\left(\mathbf{y}_{*}, \overline{\mathrm{f}}_{*}\right)=\frac{1}{l}\left(\mathbf{y}_{*}-\overline{\mathrm{f}}_{*}\right)^{T}\left(\mathbf{y}_{*}-\overline{\mathrm{f}}_{*}\right)\end{align}\) (10)
\(\begin{align}\operatorname{MSLL}\left(\mathbf{y}_{*}, \overline{\mathrm{f}}_{*}\right)=\frac{1}{l} \sum_{i=0}^{l-1}\left(\frac{1}{2} \log \left(2 \pi \sigma_{* i}^{2}\right)+\frac{\left(y_{* i}-\bar{f}_{* i}\right)^{2}}{2 \sigma_{* i}^{2}}\right)\end{align}\) (11)
where \(\bar{f}_{* i}\) and \(\sigma_{* i}^{2}\) indicate the predicted mean and the predicted variance corresponding to the test input \(\mathbf{x}_{* i}\) in \(\boldsymbol{X}_{*}\).
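The two metrics in (10) and (11) can be computed as follows. This is a minimal sketch, assuming the sum in (11) runs over the \(l\) test points; the function names `mse` and `msll` and the sample values are hypothetical.

```python
import math

def mse(y_true, mean_pred):
    """Mean square error between test outputs and predicted means, eq. (10)."""
    l = len(y_true)
    return sum((y - f) ** 2 for y, f in zip(y_true, mean_pred)) / l

def msll(y_true, mean_pred, var_pred):
    """Average of the per-point Gaussian log loss, eq. (11)."""
    l = len(y_true)
    total = 0.0
    for y, f, v in zip(y_true, mean_pred, var_pred):
        total += 0.5 * math.log(2.0 * math.pi * v) + (y - f) ** 2 / (2.0 * v)
    return total / l

# Hypothetical test weights (g), predicted means (g), and predicted variances (g^2).
y_test = [100.0, 210.0, 330.0]
f_bar = [105.0, 200.0, 320.0]
sigma2 = [400.0, 900.0, 1600.0]
print(mse(y_test, f_bar), msll(y_test, f_bar, sigma2))
```

Unlike MSE, the value in (11) grows when the predicted variance is too small for the observed error and also when it is needlessly large, so it penalizes both overconfident and overly vague predictions.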
3.1.2 Farm-gate Price Prediction – LSTM
The farm-gate price per unit weight is not the same for fish of different weights, but rather tends to increase as the weight of fish increases. This is because the weight follows a Gaussian distribution, which results in the scarcity of the relatively heavier fish favored by consumers. Usually, the farm-gate prices are not randomly determined but are affected by previous price trends, with recent prices having the greater influence. Therefore, it is possible to improve the prediction accuracy by using multiple time series data points. In this paper, we select a recurrent neural network (RNN) based model that sequentially processes inputs and outputs for price prediction [21]. Specifically, the RNN model uses the LSTM structure, which can effectively learn temporal dependencies in sequential data to predict current or future data and is widely used in price prediction models for agricultural [22, 23] and livestock products [24, 25].
The time series data preceding the forecasting time were processed using a many-to-one structure in which several inputs lead to a single output, as shown in Fig. 3 [26]. In this study, the LSTM was trained to predict the farm-gate price one month ahead, using the farm-gate prices of the previous five months as input. In predicting flounder prices, the sequence length, that is, the number of monthly prices used as one input to the LSTM model, is a critical factor. This is because flounder prices are affected by the season or the day of the week, which can cause changes in water temperature, farming costs, growth rate, etc. To reflect those seasonal effects in the price prediction, several LSTM models were tested with sequence lengths of 3, 5, 6, and 12, which yielded MSLE values of 15.505, 14.558, 14.846, and 14.741, respectively. Since the sequence length of 5 gave the lowest MSLE, it was selected for price prediction. The price information used as input to the LSTM is scaled by a min-max normalizer so that all values range between 0 and 1.
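The input preparation described above, min-max scaling followed by many-to-one windows of five months, can be sketched as follows. This is an illustrative sketch only; the function names `min_max_scale` and `make_windows` and the price values are hypothetical, and the LSTM itself is not shown.

```python
def min_max_scale(values):
    """Scale values linearly to [0, 1]; also return the bounds for inverse scaling."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], lo, hi

def make_windows(series, seq_len=5):
    """Build many-to-one pairs: seq_len consecutive months -> the following month."""
    inputs, targets = [], []
    for t in range(len(series) - seq_len):
        inputs.append(series[t:t + seq_len])
        targets.append(series[t + seq_len])
    return inputs, targets

# Hypothetical monthly farm-gate prices (arbitrary units) for one weight class.
prices = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 20.0]
scaled, lo, hi = min_max_scale(prices)
X, y = make_windows(scaled, seq_len=5)
print(len(X), X[0], y[0])
```

A series of length N yields N - 5 training pairs, so longer sequence lengths leave fewer pairs for training; a predicted value can be mapped back to a price with `lo + value * (hi - lo)`.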
Fig. 3. Structure of many-to-one LSTM
The model was designed to predict farm-gate prices for different weight classes, specifically for 0.5 kg, 1 kg, and 2 kg, taking into account the varying tendencies in price fluctuations associated with each of the weight classes. Although the same LSTM structure is used, the model parameters (network weights) are adjusted for each weight class to reflect the different price trends by class. The prediction outcomes of the designed model are detailed in section 4.2.2.
3.2 A Net Profit Algorithm
To predict the net profit of fish farming, the growth prediction model implemented with GPR, the farm-gate price prediction model using LSTM, and cost data based on statistics are used. Fundamentally, the growth prediction model uses the sample mean weight, aquacultivation period, and water temperature data from the database as inputs for training, producing means and variances of weight as outputs. The farm-gate price prediction model is trained with unit prices per weight of fish as inputs and estimates unit prices by weight class as outputs. To predict the net profit that can assist the shipment decision, the outputs of these prediction models are combined with the statistical cost data. The formulas for the net profit algorithm are as follows.
\(\begin{align}C_{T}(m, T)=c\left(A, N_{s}\right) \times W_{T}(m, T)+C_{s} \times N_{s}\end{align}\) (12)
where \(C_{T}(m, T)\) represents the total farming cost after \(m\) months with the water temperature \(T\) in the farms, \(c\left(A, N_{s}\right)\) denotes the cost per unit of production, \(A\) is the total water surface area of the farms, \(N_{s}\) indicates the total number of juvenile fish stocked in the farms, \(W_{T}(m, T)\) signifies the total weight of fish, and \(C_{s}\) is the cost per juvenile fish.
The proportion of fish belonging to each weight class is as follows.
\(\begin{align}p_{i}(m, T)=\operatorname{Pr}\left(W \in \operatorname{Range}\left(W_{i}\right)\right)\end{align}\) (13)
where \(i\) indicates the index of the weight class, and \(W\) is a random variable representing the weight of the fish. \(W_{i}\) is the representative value of weight class \(i\), and \(\operatorname{Range}\left(W_{i}\right)\) is the weight range of flounder (fish) belonging to \(W_{i}\). The proportions for each of the weight classes are derived using the mean weight and standard deviation predicted by GPR, along with the cumulative distribution function \(F(W)\) of the Gaussian distribution. Table 1 provides an example based on Korean flounder, describing the weight classes.
Table 1. Weight class specification about Korean flounder as an exemplary case
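The proportion in (13) is a difference of Gaussian cumulative distribution function values at the class boundaries. The sketch below illustrates this; the function names and the class boundaries of 250 g, 750 g, and 1500 g are hypothetical placeholders and are not the boundaries of Table 1. The lowest and highest classes are assumed to be open-ended.

```python
import math

def normal_cdf(w, mean, std):
    """Cumulative distribution function F(w) of a Gaussian weight distribution."""
    return 0.5 * (1.0 + math.erf((w - mean) / (std * math.sqrt(2.0))))

def class_proportions(mean, std, edges):
    """Proportion of fish per weight class, eq. (13).

    edges: ascending class boundaries in grams (hypothetical values here);
    len(edges) boundaries define len(edges) + 1 classes.
    """
    cdf = [0.0] + [normal_cdf(e, mean, std) for e in edges] + [1.0]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]

# Hypothetical GPR output at some month: mean 800 g, standard deviation 200 g.
p = class_proportions(800.0, 200.0, edges=[250.0, 750.0, 1500.0])
print(p, sum(p))
```

Since the classes partition the weight axis, the proportions always sum to one, which is what allows the total cost to be apportioned across classes later.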
The shipment value per weight class is given by
\(\begin{align}v_{i}(m, T)=W_{i} \times p_{i}(m, T) \times N_{s} \times(1-d(m)) \times P_{M_{i}}(m)\end{align}\) (14)
where \(P_{M_{i}}(m)\) and \(d(m)\) denote the farm-gate price per weight for weight class \(i\) and the mortality rate of the fish, respectively.
The total net profit is given by
\(\begin{align}\mathit{NP}(m, T)=\sum_{i} v_{i}(m, T)-C_{T}(m, T)\end{align}\) (15)
Finally, the net profit for fish belonging to weight class 𝑖 is as follows.
\(\begin{align}\mathit{NP}(i, m, T)=v_{i}(m, T)-p_{i}(m, T) C_{T}(m, T)\end{align}\) (16)
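Equations (12) and (14)-(16) can be combined into a short calculation for a single month. The following sketch is illustrative only: the function names are hypothetical, the stocking number (30,000), juvenile cost (350 KWN), and unit production cost (6,835.37 KWN per kg) are taken from section 4.2.3, and the class proportions, prices, mortality rate, and total weight are hypothetical placeholders.

```python
def total_cost(unit_cost, total_weight_kg, juvenile_cost, n_stocked):
    """Total farming cost, eq. (12)."""
    return unit_cost * total_weight_kg + juvenile_cost * n_stocked

def shipment_value(w_i_kg, p_i, n_stocked, mortality, price_per_kg):
    """Shipment value of one weight class, eq. (14)."""
    return w_i_kg * p_i * n_stocked * (1.0 - mortality) * price_per_kg

def net_profit(class_weights_kg, proportions, prices_per_kg, n_stocked,
               mortality, unit_cost, total_weight_kg, juvenile_cost):
    """Return (total net profit, eq. (15); per-class net profits, eq. (16))."""
    cost = total_cost(unit_cost, total_weight_kg, juvenile_cost, n_stocked)
    values = [shipment_value(w, p, n_stocked, mortality, price)
              for w, p, price in zip(class_weights_kg, proportions, prices_per_kg)]
    per_class = [v - p * cost for v, p in zip(values, proportions)]
    return sum(values) - cost, per_class

total, per_class = net_profit(
    class_weights_kg=[0.5, 1.0, 2.0],          # representative weights W_i
    proportions=[0.2, 0.5, 0.3],               # hypothetical p_i from eq. (13)
    prices_per_kg=[10000.0, 12000.0, 15000.0], # hypothetical prices (KWN/kg)
    n_stocked=30000,
    mortality=0.1,                             # hypothetical d(m)
    unit_cost=6835.37,
    total_weight_kg=27000.0,                   # hypothetical W_T(m, T)
    juvenile_cost=350.0,
)
print(total, per_class)
```

Because the proportions sum to one, the per-class net profits in (16) add up to the total net profit in (15); the cost is simply apportioned to each class by its share of the fish.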
4. Simulations
In this section, the entire operation process of the shipping decision support system proposed in section 3 is described based on Korean flounder data.
4.1 Simulation Assumption
As a use case for the shipment decision support system, Korean flounder from Jeju Island is selected. As mentioned in section 3.1.1, the currently available data is aperiodic and small in quantity, making it difficult to obtain sufficient data for prediction. Therefore, this subsection assumes the following conditions.
1) For the land-based Korean flounder farming in the Jeju region introduced as a use case, it is observed that the survival rate decreases with the length of the aquacultivation period. Additionally, due to the characteristic that the unit farm-gate price varies according to weight classes, farming juveniles up to around 1.0 kg before shipping can increase the farmer’s profit [27]. Therefore, in this use case, a prediction period of 18 months after stocking 4 g juveniles is set to observe the growth up to approximately 1.0 kg and subsequent trends.
2) In flounder farming, seasonal factors typically cause variations in water temperature at the farms. However, it is assumed for this use case that the water temperature remains constant for each case as depicted in Fig. 4, and it is assumed that optimized feeding is provided in each case.
Fig. 4. Baseline data of fish weight and data augmentation according to water temperature conditions and an aquacultivation period
3) During the aquacultivation period of 18 months or shorter, grading and sorting occur, and it is assumed for growth data collection that the weight and distribution of the selected individuals are determined by sampling with a sample size of 50.
4) Data collected from the farm can vary in collection frequency depending on the data type, and anomalies or missing data may occur due to grading, sorting, shipment decisions, or sensor malfunctions. Thus, preprocessing is necessary [9], but this simulation assumes that such processes have already been completed, and mixed data cycles are converted and standardized to a monthly basis for use.
5) It is assumed that small fish below 250 g do not possess marketable value and thus cannot be sold, and that the weight of flounder does not exceed 5000 g within the 18-month observation period.
4.2 Simulation Results
4.2.1 Weight Growth Prediction
In this subsection, we perform training and prediction using the model structure described in section 3.1.1 and the growth data for Korean flounder. Due to the lack of sufficient data accumulated in the living lab database for training and testing the GPR growth model, the use case to be discussed later uses the TGC model and mean growth data of Korean flounder to prepare basic data for training and test. The standard deviation relative to the mean weight was obtained using the living lab database.
As can be seen in the first graph of the first row in Fig. 4, five sets of mean growth data were generated for water temperatures ranging from 17 to 23℃ over the aquacultivation period, using TGC values obtained from [28-30]. The formula for the TGC growth model is as follows [6].
\(\begin{align}W_{e}=\left[\sqrt[3]{W_{s}}+\left(\frac{T G C}{1000} \times T \times \frac{365}{12} \times \Delta m\right)\right]^{3}\end{align}\) (17)
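where \(W_{e}\) represents the weight of the Korean flounder \(\Delta m\) months after stock seeding, \(W_{s}\) is the initial weight of the flounder, \(T\) is the water temperature, and TGC stands for the thermal growth coefficient, which depends on the water temperature. The factor \(\frac{365}{12}\) converts months to days.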
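Equation (17) is straightforward to evaluate. The sketch below is illustrative; the function name `tgc_weight` and the TGC value of 0.8 are hypothetical and are not the coefficients used in this paper, while the 4 g initial weight and the 20℃ temperature follow the use case.

```python
def tgc_weight(w_start_g, tgc, temp_c, months):
    """Weight after `months` months of growth under the TGC model, eq. (17)."""
    days = 365.0 / 12.0 * months              # convert months to days
    return (w_start_g ** (1.0 / 3.0) + tgc / 1000.0 * temp_c * days) ** 3

# Mean growth curve from stocking (month 0) to month 18 at 20 degrees Celsius.
# The TGC value 0.8 is a hypothetical placeholder.
curve = [tgc_weight(4.0, 0.8, 20.0, m) for m in range(19)]
print(curve[0], curve[-1])
```

The cube root of the weight grows linearly with accumulated degree-days, so for a fixed TGC a higher water temperature gives a uniformly faster growth curve.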
For training and evaluating the GPR model, data augmentation (noise addition) was conducted on the baseline data according to the water temperature conditions and the aquacultivation period [31]. The data augmentation was performed by drawing samples from a normal distribution whose standard deviation corresponds to the mean weight of each period. For Cases 1, 3, and 5, where the water temperatures are 17, 20, and 23℃, respectively, the augmented TGC growth data over the aquacultivation period of 18 months from stock seeding was randomly divided into training data (6 sets, each with 50 samples, 300 in total) and test data (13 sets, each with 50 samples, 650 in total). Here, we assumed that data from the stock seeding period could be secured, and it was therefore always included in the training data.
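The augmentation and the random split described above can be sketched as follows. This is an illustrative sketch, not the procedure used to produce Fig. 4: the function names, the seed, and the mean and standard deviation values are hypothetical, and the stocking month (month 0) is forced into the training set as assumed in the text.

```python
import random

def augment(mean_weights, std_devs, n_samples=50, seed=0):
    """Draw n_samples noisy weights per month from N(mean, std^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(m, s) for _ in range(n_samples)]
            for m, s in zip(mean_weights, std_devs)]

def split_months(n_months=18, n_train=6, seed=0):
    """Randomly split months 0..n_months into training and test sets.

    Month 0 (stock seeding) is always placed in the training set.
    """
    rng = random.Random(seed)
    train = [0] + sorted(rng.sample(range(1, n_months + 1), n_train - 1))
    test = [m for m in range(n_months + 1) if m not in train]
    return train, test

# Hypothetical baseline: mean weight (g) and standard deviation (g) for months 0..18.
means = [4.0 + 55.0 * m for m in range(19)]
stds = [0.5 + 8.0 * m for m in range(19)]
samples = augment(means, stds)
train_months, test_months = split_months()
print(train_months, len(test_months), len(samples[0]))
```

With 19 monthly sets (months 0 to 18), this yields 6 training sets and 13 test sets of 50 samples each, matching the 300 and 650 samples stated above. Note that a Gaussian draw can be negative when the standard deviation is large relative to the mean; how such samples are handled is not specified here.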
Using this training data, we trained and evaluated the growth prediction model for Cases 1, 3, and 5. Next, we performed growth predictions and evaluations for different water temperatures (Cases 2 and 4, where the water temperatures observed in the farms are 18.5 and 21.5℃, respectively) based on the model trained for Cases 1, 3, and 5.
The augmented data was prepared in this way because the quantity of growth data is very small and the sampling is aperiodic. This reflects field conditions, in which 6 growth datasets per year can be acquired from Korean flounder farms. Thus, in this paper, it is assumed that Cases 1, 3, and 5 each have 6 growth datasets as training data and 13 datasets as test data, and that for Cases 2 and 4, 6 growth datasets are acquired as test data.
The GPR model was optimized using the L-BFGS-B optimizer, a variant of the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm designed for solving large-scale optimization problems with bound constraints on the variables. The optimization resulted in a marginal likelihood of 4758.36. The hyperparameters of the covariance function, as presented in Table 2, were as follows: for \(K_{\text {trend }}\) in (6), \(\sigma_{\text {trend }}^{2}\) and \(l_{N}\) were 17,988,794.25 and 14.87, respectively; for \(K_{\text {noise }}\) in (7), \(\sigma_{\text {noise }}^{2}\) was 11.92; for \(K_{\text {Matérn }}\) in (8), \(\sigma_{\text {Matérn }}^{2}\) and \(l_{m}\) were 5.56 and 5.58, respectively; and for \(K_{A R D}\) in (9), \(\sigma_{A R D}^{2}\) and \(l_{d}^{2}\) were 5.56 and 2, respectively.
Table 2. Parameter specification of growth prediction model (GPR)
In Fig. 5, we present the training and test data together with the mean weight and 95% confidence interval of the growth prediction model, according to the aquacultivation period and water temperature. For Cases 1, 3, and 5, experiments were performed using 6 months of training data and 13 months of test data. For Cases 2 and 4, experiments were carried out in untrained culturing environments based on the GPR model trained on Cases 1, 3, and 5.
Fig. 5. Weight growth prediction according to water temperature ranging from 17 to 23℃ and an aquacultivation period
According to Table 3, for Cases 1, 3, and 5, the MSE values are 17.38, 38.39, and 67.55, respectively, showing an increasing trend with the increase in culturing water temperature. The MSLL values are 5.69, 6.02, and 6.42, respectively, which also display an increasing trend. For Cases 2 and 4, the MSE values are 23.46 and 81.77, respectively, and the MSLL values are 5.82 and 6.42, respectively. In Case 2, the MSLL lies between those of Cases 1 and 3, whereas in Case 4, the MSLL exceeds that of Case 3 and equals that of Case 5 to two decimal places. While the values of MSE and MSLL can vary depending on the distribution of the growth prediction model and the training and test data, there are no significantly deviating predictions for the test data.
Table 3. Performance of growth prediction model in terms of MSE and MSLL
4.2.2 Farm-gate Price Prediction
In this subsection, we predicted the farm-gate prices using the LSTM model, as described in subsection 3.1.2. As previously mentioned, the model uses five time series data points as inputs to produce a single output value. The hidden layer size of the LSTM is set to 20, with the objective function aiming to minimize the mean squared log error (MSLE) loss. Detailed information about the model is presented in Table 4.
Table 4. Parameter specification of farm-gate price prediction model (LSTM)
For training and testing the LSTM model, data from the Korea Maritime Institute Fisheries Observation Center was used [32], with a total of 168 months of price data for 0.5 kg, 1.0 kg, and 2.0 kg flounder from January 2008 to December 2021. To test the performance of the price prediction model, 70% of the total price data was used for training the model, and the remaining 30% was used as test data to evaluate the prediction model.
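The data split and the MSLE loss can be illustrated with a short sketch. Two assumptions are made that are not stated in the text: the split is taken to be chronological with the training size rounded down, and MSLE is taken in its common form based on \(\log(1+y)\). The function names and the placeholder series are hypothetical.

```python
import math

def chronological_split(series, train_ratio=0.7):
    """Split a time series into leading training and trailing test parts.

    Assumption: the first 70% of months are used for training (rounded down).
    """
    n_train = int(len(series) * train_ratio)
    return series[:n_train], series[n_train:]

def msle(y_true, y_pred):
    """Mean squared log error, assuming the common log(1 + y) form."""
    n = len(y_true)
    return sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / n

# 168 monthly prices (January 2008 to December 2021); values are placeholders.
monthly_prices = [float(i) for i in range(168)]
train, test = chronological_split(monthly_prices)
print(len(train), len(test))
```

Because MSLE compares logarithms, it measures relative rather than absolute price errors, so a given percentage error is penalized similarly for the 0.5 kg and 2.0 kg price levels.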
Fig. 6 illustrates the loss function during training and the results of the model evaluation for the 2.0 kg farm-gate price prediction model. As can be seen in Fig. 6 (a), the training loss decreases as the model training progresses. In Fig. 6 (b), actual farm-gate prices for 2.0 kg flounder are compared with predictions from the LSTM model. To evaluate the LSTM performance, the VAR model, commonly used for price forecasting, was also used [10]. As the VAR model is a multivariate time series forecasting model, it accounts for the interrelationship between 1.0 kg and 2.0 kg flounder prices in this simulation. Both models were trained on the same data and forecasted prices for 18 months starting from September 2019. The green, orange, and blue lines represent the actual farm-gate prices, the predicted prices from the LSTM model, and the predicted prices from the VAR model, respectively. The comparison indicates that the farm-gate prices predicted by the LSTM model closely match the actual values. In this simulation, the LSTM model captures temporal dependencies in the sequential data, which results in more accurate forecasts: the LSTM model achieved an MSLE of 0.005, compared to 0.0139 for the VAR model.
Fig. 6. Training loss and prediction performance of LSTM for farm gate price for 2.0 kg flounder class: (a) Training loss by epoch (b) Comparison of actual farm gate price and predicted prices for 2.0 kg flounder by VAR and LSTM
4.2.3 Net Profit Prediction
To predict net profit using the aforementioned prediction models, the profitability of land-based Korean flounder farming was assessed under the optimal condition of a water surface area of 6,611.57 m² in fish farms with an average water temperature of 20℃, utilizing cost statistics per kg of production as shown in Table 5. It is assumed that in September 2023, 30,000 juvenile fish weighing 4 g each were stocked. The cost per juvenile fish for stock seeding is 350 KWN, and the unit production cost is 6,835.37 KWN per kg (KWN: Korean won, the Korean currency unit) [27, 33].
Table 5. Parameter specification for net profit prediction
Based on the weight growth and farm-gate price prediction models, along with farming cost statistics, the estimated net profit is depicted in Fig. 7, which focuses on the range of farming weights up to 2 kg. Examining the first row in Fig. 7, the total weight (kg) of flounder cultivated over time (months) shows a monotonic increase. Excluding the weight class corresponding to 2.0 kg, the total weight of each weight class increases, reaches a peak, and then decreases over time. This pattern illustrates that as the weight of the cultured individuals increases, the proportions of the weight classes change accordingly.
Fig. 7. Net profit inferences from predicted weight of fish, farm gate prices, and farming costs
The second row represents the shipment value (KWN) of flounder over the aquacultivation period (months). As the farming time progresses, the total shipment value increases in proportion to the distribution of total weight in each weight class, except for the case of weights below 0.25 kg in the second column of the second row of Fig. 7. This is because we follow the fifth assumption of section 4.1, which states that small fish below 0.25 kg have no marketable value, so the corresponding value remains at zero. The shipment values for the remaining weight classes show a pattern similar to the estimated weights per weight class.
The third row illustrates the farming costs (KWN) over time (months). With time, the total farming cost shows an increasing trend, with costs by weight class exhibiting a pattern similar to the estimated weights per weight class.
The fourth row presents estimates for net profit (KWN) over the aquacultivation period (months). Overall, due to the interplay of growth, farm-gate prices, farming costs, and survival rates, the break-even point for net profit is reached at 8 months of cultivation. After this, the net profit gradually increases, with a slower rate observed at 14 and 15 months, and the maximum net profit during the 18-month cultivation period appears at 17 months. However, net profits by weight class vary according to the weight of flounder, farm-gate prices, and farming costs. For example, as shown in the second column of the fifth row in Fig. 7, the weight class below 0.25 kg is difficult to trade in the market, resulting in a net loss that grows with increasing costs during the aquacultivation period. Additionally, as shown in the third column of the fifth row in Fig. 7, the weight class of 0.5 kg also incurs a net loss, because the farming costs are relatively high compared to the farm-gate price. However, the net profit gradually increases in the other weight classes above 0.5 kg.
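Reading the break-even month and the most profitable shipment month from a predicted net profit series can be expressed as a simple search. The sketch below is illustrative; the function names and the net profit values are hypothetical and are not the values plotted in Fig. 7, and the series index is assumed to equal the number of months since stocking.

```python
def break_even_month(net_profit_by_month):
    """First month whose predicted total net profit is non-negative, or None."""
    for month, value in enumerate(net_profit_by_month):
        if value >= 0:
            return month
    return None

def best_shipment_month(net_profit_by_month):
    """Month with the largest predicted total net profit."""
    return max(range(len(net_profit_by_month)),
               key=lambda month: net_profit_by_month[month])

# Hypothetical total net profit (million KWN) for months 0..6.
profit = [-5.0, -3.0, -1.0, 0.0, 2.0, 4.0, 3.0]
print(break_even_month(profit), best_shipment_month(profit))
```

The same search can be applied to the per-class net profit series when a farmer wants to decide on partial shipment of a single weight class.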
5. Conclusion
In this paper, we designed a prediction system to assist fish farmers with shipment decisions. The prediction system consists of a growth prediction model, a farm-gate price prediction model, a cost statistics table, and a net profit estimation algorithm. The GPR model was used for weight growth prediction, based on the analysis that the weight data behave as a Gaussian process, and the LSTM model was used for the farm-gate price data in consideration of their simple time-series characteristics. The GPR model can also cope with missing weight data from the fish farm in the time and temperature domains. In addition, because the data acquired from the fish farms were aperiodic and small in amount, a data augmentation method based on the Gaussian model was used. After explaining the function of each prediction model, we proposed a method for estimating net profit from weight, price, and cost. The performance was analyzed by applying the proposed system to Korean flounder data. By referring to the net profit prediction, farmers can time the shipment of cultured fish to improve their profit.
References
- Food and Agriculture Organization of the United Nations, Part 1 World review, The State of World Fisheries and Aquaculture 2022, Rome, Italy: FAO, 2022.
- M. Chiu, W. Yan, S. A. Bhat, and N. Huang, "Development of smart aquaculture farm management system using IoT and AI-based surrogate models," Journal of Agriculture and Food Research, vol.9, Sep. 2022.
- K. Tsai, L. Chen, L. Yang, H. Shiu, and H. Chen, "IoT based smart aquaculture system with automatic aerating and water quality monitoring," Journal of Internet Technology, vol.23, no.1, pp.179-186, Jan. 2022.
- K. P. Rasheed Abdul Haq and V. P. Harigovindan, "Water Quality Prediction for Smart Aquaculture Using Hybrid Deep Learning Models," IEEE Access, vol.10, pp.60078-60098, Jun. 2022.
- L. Du, Z. Lu, and D. Li, "Broodstock breeding behaviour recognition based on Resnet50-LSTM with CBAM attention mechanism," Computers and Electronics in Agriculture, vol.202, 2022.
- M. Jobling, "The thermal growth coefficient (TGC) model of fish growth: a cautionary note," Aquaculture Research, vol.34, no.7, pp.581-584, May 2003.
- M. S. Chambers, L. A. Sidhu, B. O'Neil, and N. Sibanda, "Flexible von Bertalanffy growth models incorporating Bayesian splines," Ecological Modelling, vol.355, pp.1-11, Jul. 2017.
- K. I. Siegfried and B. Sanso, "Two Bayesian methods for estimating parameters of the von Bertalanffy growth equation," Environmental Biology of Fishes, vol.77, pp.301-308, Aug. 2006.
- J. Kim, E. Park, S. Cho, K. Kwon, and Y. M. Ko, "Probabilistic Modeling of Fish Growth in Smart Aquaculture Systems," KSII Transactions on Internet and Information Systems, vol.17, no.8, pp.2259-2277, Aug. 2023.
- J. Son and J. Nam, "A Leading Price Estimation of Jeju Flounder Producer Prices by Fish Weight and a Dynamic Influence Analysis of Market Price Impulse," Journal of Fisheries and Marine Sciences Education, vol.28, no.1, pp.198-210, 2016.
- K. Hwang, J. Choi, and T. Oh, "Forecasting common mackerel auction price by artificial neural network in Busan Cooperative Fish Market before introducing TAC system in Korea," Journal of the Korean Society of Fisheries and Ocean Technology, vol.48, no.1, pp.72-81, Feb. 2012.
- T. M. Losordo and P. W. Westerman, "An Analysis of Biological, Economic, and Engineering Factors Affecting the Cost of Fish Production in Recirculating Aquaculture Systems," Journal of the World Aquaculture Society, vol.25, no.2, pp.193-203, Jun. 1994.
- S. R. Campo and S. Zuniga-Jara, "Reviewing capital cost estimations in aquaculture," Aquaculture Economics & Management, vol.22, no.1, pp.72-93, 2018.
- R. E. Dahl and A. Oglend, "Fish Price Volatility," Marine Resource Economics, vol.29, no.4, pp.305-322, Dec. 2014.
- P. Deb, M. M. Dey, and P. Surathkal, "Fish price volatility dynamics in Bangladesh," Aquaculture Economics & Management, vol.26, no.4, pp.462-482, 2022.
- K. Kim, H. Hwang, H. Kim, K. Kim, J. Do, and M. Jung, "Hatchery production," Flounder Aquaculture Standard Manual, Busan, Republic of Korea: NIFS, ch.3, sec.2, pp.36-38, 2016. [Online]. Available: https://www.nifs.go.kr/
- Y. Eh, "Production planning in fish farm," The Journal of Fisheries Business Administration, vol.46, no.3, pp.129-141, Dec. 2015.
- C. Rasmussen and C. Williams, Gaussian Processes for Machine Learning, MIT Press, 2006.
- GPy, GPy: A gaussian process framework in python. [Online]. Available: http://github.com/SheffieldML/GPy
- GPy, GPy - A Gaussian Process (GP) framework in Python, Read the Docs, 2020. [Online]. Available: https://gpy.readthedocs.io/en/deploy/index.html
- S. A. Zargar, "Introduction to Sequence Learning Models: RNN, LSTM, GRU," Apr. 2021.
- R. C. Staudemeyer and E. R. Morris, "Understanding LSTM -- a tutorial into Long Short-Term Memory Recurrent Neural Networks," arXiv:1909.09586, Sep. 2019.
- K. Kurumatani, "Time series forecasting of agricultural product prices based on recurrent neural networks and its evaluation method," SN Applied Sciences, vol.2, Jul. 2020.
- T. Zhang and Z. Tang, "Agricultural commodity futures prices prediction based on a new hybrid forecasting model combining quadratic decomposition technology and LSTM model," Frontiers in Sustainable Food Systems, vol.8, Feb. 2024.
- J. Chen, L. Lin, and X. Li, "Pork Price Prediction Using Bi-RNN-LSTM Artificial Neural Network," in Proc. of 5th International Conference on Artificial Intelligence and Big Data (ICAIBD), pp.168-172, 2022.
- Y. J. Lee, K. S. Ko, D. H. Hwang, S. A. Lee, and J. P. Cho, "A Study on the Prediction Model of Chicken Price Using a Multi-Variable LSTM Deep Learning Network," The Journal of Korean Institute of Communication and Information Sciences, vol.47, no.12, pp.2058-2064, Dec. 2022.
- K. Kim, H. Hwang, H. Kim, K. Kim, J. Do, and M. Jung, "Economical analysis of aquaculture," Flounder Aquaculture Standard Manual, Busan, Republic of Korea: NIFS, ch.8, pp.91-100, 2016. [Online]. Available: https://www.nifs.go.kr/
- Y. Lee, "Productivity comparison to the management of olive flounder, Paralichthys olivaceus aqua-farms in Jeju and Wando," M.S. thesis, Dept. Proliferation Science, Jeju Univ., 2019.
- K. Kim, H. Hwang, H. Kim, K. Kim, J. Do, and M. Jung, "Rearing," Flounder Aquaculture Standard Manual, Busan, Republic of Korea: NIFS, ch.4, sec.5, pp.52-54, 2016. [Online]. Available: https://www.nifs.go.kr/
- D. Park, M. Son, M. Park, et al., "General status and ecology," Standard Manual of Olive Flounder Culture, Busan, Republic of Korea: NIFS, ch.1, pp.1-7, 2006.
- K. Song, T. Park, and J. Chang, "Novel Data Augmentation Employing Multivariate Gaussian Distribution for Neural Network-Based Blood Pressure Estimation," Applied Sciences, vol.11, no.9, Apr. 2021.
- KMI, Observation Statistics of Korean Halibuts. [Online]. Available: https://www.foc.re.kr/
- Y. Yang, "Analysis of the management status and economic viability," in Analysis of the Management Status and Economic Viability of Flatfish Aquaculture in the Jeju Area, Jeju, Republic of Korea: JDI, ch.4, pp.52-76, 2010.