
Using Standard Deviation with Analogy-Based Estimation for Improved Software Effort Prediction

  • Mohammad Ayub Latif (College of Computing and Information Sciences, Karachi Institute of Economics and Technology) ;
  • Muhammad Khalid Khan (College of Computing and Information Sciences, Karachi Institute of Economics and Technology) ;
  • Umema Hani (College of Computing and Information Sciences, Karachi Institute of Economics and Technology)
  • Received : 2022.04.06
  • Accepted : 2023.04.28
  • Published : 2023.05.31

Abstract

Software effort estimation is one of the most difficult tasks in software development, and predictability is equally important for strategic management. Accurately predicting the actual cost of software development can be very beneficial for strategic management. This study discusses the latest trends in software estimation, focusing on analogy-based techniques, to show how they have improved the accuracy of software effort estimation. It applies the standard deviation technique to the expected value of analogy-based estimates to improve accuracy. In more than 60 percent of the cases, the technique applied in this study improved the accuracy of software estimation by reducing the Magnitude of Relative Error (MRE). The technique is simple: it calculates the expected value of cost or time and then uses different confidence levels, which help in making more accurate commitments to customers.


1. Introduction

When planning the effort or cost estimation of a software project, four important considerations before selecting an estimation model are the project size, the software development style, the development stage and the required accuracy. The size can be classified as small, medium or large, and different estimation models define their own values for these sizes. The development style can be sequential, iterative or coupled. The development stage refers to the point in the lifecycle at which the estimator is estimating the project; it can be early (the start of the project, during early requirements), middle or late. The fourth consideration is the accuracy that the estimator is targeting.

After getting the outputs from an estimation model, data is needed to calibrate them into meaningful estimates. All models require data, and in his book Steve McConnell identifies three types of data: industrial data is the data of other organizations, historical data is the data of the same organization's previous projects, and project data is the data of the project being estimated. One requirement for using project data is that the project being estimated should follow an iterative development life cycle, so that the data of the first iteration can be used in the calculations for later iterations [1].

Researchers have also shown that calibration with industrial data gives the least accurate results, historical data gives better results than industrial data, and project data gives the most accurate results with the lowest variance [2]. Project managers need to ensure that the completion time for a task is set sensibly so that Parkinson's Law does not apply to their ongoing projects. Parkinson's Law states that work generally expands to fill all the time allocated to it. So, if you give your developers four days to finish a one-day task, the task can be expected to take four days [3][4]. In the modern era, dynamic estimation models, in which the number of team members can vary across the phases of the SDLC, are more common than flat models. With a dynamic estimation model, the team can consist of 2 people in the requirements phase and 10 people in the development phase [1][5][2].

Broadly, software estimation models are divided into two categories: algorithmic and non-algorithmic models. The popular algorithmic models are Lines of Code (LOC), Function Points (FP) and the Constructive Cost Model (COCOMO). The non-algorithmic models comprise expert judgment, analogy-based techniques, proxy techniques and pricing to win.

Estimation is also used for purposes other than cost, time and effort; for example, a review of software defect prediction mechanisms identifies the challenges of defect prediction [6]. Controlling software activities and predicting when development will end is a difficult task; to accommodate change, some researchers have proposed a generalized software reliability model based on a stochastic process to simulate software development, including its uncertainty [7]. Another study explored the application of an Artificial Neural Network (ANN) as a tool for predicting software development effort and proposed an ANN model for this purpose [8]. Another work presents a systematic review of software effort estimation models built using machine learning (ML) techniques. All the empirical studies published between January 1991 and December 2017 were considered in the review. The work concludes that support vector machines (SVM) and regression techniques in combination give better predictions than other ML and non-ML techniques [9]. It is important to note that metrics do not pertain only to software costing and estimation; the use of product metrics can lead to better software quality. One study has proposed a new technique for the visualization of metrics, which will ultimately help in improving software quality [10].

Most project managers know that there is no single best effort estimation model or method that can be applied to a particular estimation case; if the client is pushing for a low-cost solution, this can also lead to an overrun. Another well-known point is that estimates are often misleading [11]. It is well understood that Software Process Improvement can occur if we move towards better software estimation. In recent times the concept of Global Software Engineering (GSE) has also emerged, and many organizations are involved in Global Software Development (GSD). A systematic literature review has been performed on the success factors and barriers to software process improvement in GSD [12]. GSD has also given rise to offshore software development, in which software for one country is developed in a lower-cost country. A study has identified the challenges of managing offshore contracts from the vendors' perspective [13].

The core idea of analogy-based effort estimation (ABEE) is that you can create the estimate for a new project by comparing it with a similar past project that your organization has already completed. ABEE, or estimation by analogy, comprises four major steps for calculating the effort of the software, as shown in Fig. 1:


Fig. 1. Steps for Estimation by analogy.

Shepperd and Schofield pioneered analogy-based estimates, proposing them as a non-algorithmic model for software effort estimation [14]. A few constraints must hold for estimation by analogy to be accurate. First, the sizes of the previous and the current project should not differ greatly. Second, the development technologies of both projects should be the same; for example, if a project is to be developed in C#, a project developed in C cannot be used as the baseline. Third, the difference between the team sizes of the new and the old project should be minimal. Finally, the projects should be of the same type; a system software project cannot be used as the basis for estimating an information system [1].

Following are the major contributions of the present study:

1. A detailed review of analogy-based effort estimation techniques in recent years and of the improvements that have been suggested to achieve better accuracy.

2. A simplified real case study that shows the effort estimation of a software system step by step.

3. A very simple standard deviation technique applied to the initially calculated effort in order to achieve better accuracy.

4. Validation of the improvement achieved through the proposed approach on an available industrial dataset from different software houses.

The rest of the paper is structured as follows. Section 2 presents the related work, which focuses on the latest research trends in analogy-based effort estimation. Section 3 presents a simple case of estimation by analogy, shows the calculation of effort in person-months, and recommends how estimation by analogy can be improved by using standard deviation; it also applies our standard deviation methodology to an available dataset for agile software. Section 4 provides our results and discussion, and Section 5 concludes the paper with directions for future research.

2. Related Work

In this section we review the work that has been carried out on analogy-based estimation; we investigate its variants and how different researchers have improved on the traditional methods.

A systematic mapping of ASSE papers from 1990 to 2012 has been performed. The research objectives were to identify the studies with respect to estimation accuracy, accuracy comparison, estimation context, ASSE tools and the impact of techniques used in combination with the ASSE method [15].

To find improvements to the ASSE technique, a review covered 24 papers that were selected through a rigorous formal process. The results show that ABE can be improved through adjustment, grey theory, attribute weighting and attribute selection techniques [16].

Analogy-based estimation (ABE) is criticized for its low prediction accuracy, large memory requirement and expensive computation cost. To address these problems, a project selection technique for ABE (PSABE) is proposed, which reduces the whole project base to a small subset that consists only of representative projects. PSABE is then combined with feature weighting to form FWPSABE for a further improvement of ABE. To validate the methods, four datasets are used (two real-world sets and two artificial sets), and the methods are compared with conventional ABE, feature-weighted ABE (FWABE) and machine learning methods. The results indicate that the project selection technique could significantly improve analogy-based models for software cost estimation [17].

One work has investigated non-uniform weighting through kernel density estimation. After extensive experimentation with 19 datasets, 3 evaluation criteria, 5 kernels, 5 bandwidth values and a total of 2090 ABE variants, it concludes that non-uniform weighting through kernel methods cannot outperform uniform-weighting ABE [18].

A novel technique is proposed that relies on reasoning by analogy, fuzzy logic and linguistic quantifiers for estimating effort when the software project is represented by either categorical or numerical data. Fuzzy logic-based cost estimation models are more suitable when unclear or inaccurate information is involved [19].

In a work ranking the adaptation techniques of analogy-based estimation, a comparison of eight techniques on larger datasets concludes that linear adaptation techniques outperform all others [20].

Because no estimation model outperforms all others in every situation, estimating from ensembles of several single techniques is important for achieving accuracy. One work proposes such ensembles based on single classical analogy and single fuzzy analogy techniques. Experiments conducted across seven datasets conclude that fuzzy analogy ensembles achieve better performance than classical analogy ensembles [21].

Ibtissam Abnan et al. used missing-data techniques with fuzzy analogy. They found that Pred(0.25) and Standardized Accuracy (SA) measure different aspects of a technique's performance. They suggest that SA should not be used alone to draw conclusions about a technique's accuracy and recommend Pred(0.25) as an additional metric [22].

A Least Squares Support Vector Machine (LS-SVM) method, which is a nonlinear adjustment method, is used for calibration. The work tested it on several datasets and compared its results with those of artificial neural networks (ANN) and extreme learning machines (ELM) [23].

To reduce the errors of analogy-based estimation, one work shows that an S-membership function can help an estimator select the right set of projects for comparison [24].

Analogy-based estimation is built upon the principle of case-based reasoning (CBR), using the k most similar projects completed in the past. The choice of k is therefore crucial to prediction performance. Researchers have proposed a technique that uses hierarchical clustering to produce a range for k through various cluster quality criteria [25].

One study compared six similarity measures for analogy-based estimation and concludes that the Euclidean and Manhattan similarity measures give more accurate estimates for datasets of software projects [26].
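The idea of retrieving the k most similar past projects under a Euclidean or Manhattan measure can be sketched as follows. This is only a minimal illustration, not the method of any cited study: the feature set (team size, number of features), the project data, the min-max scaling and the use of the plain mean of the k nearest efforts are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def min_max_normalize(rows):
    """Scale every feature column to [0, 1] so no feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def estimate_by_analogy(history, target, k=2, distance=euclidean):
    """history: list of (features, effort); target: features of the new project.

    Returns the mean effort of the k past projects closest to the target.
    """
    scaled = min_max_normalize([features for features, _ in history] + [target])
    scaled_target = scaled[-1]
    ranked = sorted(
        zip(scaled[:-1], (effort for _, effort in history)),
        key=lambda pair: distance(pair[0], scaled_target),
    )
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)


# Hypothetical past projects: (team size, number of features) -> effort in staff-months
history = [((4, 20), 30.0), ((5, 25), 40.0), ((10, 60), 120.0), ((3, 10), 15.0)]
new_project = (5, 22)

effort_euclidean = estimate_by_analogy(history, new_project, k=2, distance=euclidean)
effort_manhattan = estimate_by_analogy(history, new_project, k=2, distance=manhattan)
print(effort_euclidean, effort_manhattan)
```

In this toy data both measures pick the same two analogies; on real datasets the choice of measure and of k can change which projects are retrieved.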

One work finds that, instead of keeping all the historical data for COCOMO, using only the data of recently completed projects over a shorter duration gives more accurate estimates. Similarly, k-nearest neighbors also produces accurate results for estimation by analogy [27].

Achieving accuracy when the size of the current project differs from that of completed past projects relies on effort adaptation. One work performs a systematic comparison of effort estimators optimized by Bayesian optimization techniques. The experiment was carried out on 13 standard datasets. It concludes that a model integrating the gradient boosting machine algorithm outperformed all other techniques [28].

A new analogy-based approach named 2FA-kprototypes is proposed that can be used when both numerical and categorical attributes are involved. Several datasets were used to compare the accuracy of 2FA-kprototypes with traditional analogy-based estimation and with 2FA-kmodes, a technique developed in the authors' earlier research. The results showed that both 2FA-kprototypes and 2FA-kmodes performed better than traditional analogy-based effort estimation [29].

For software projects described by a combination of continuous and categorical features, one work improves on the 2FA-kprototypes technique with 2FA-cmeans. This new technique uses fuzzy c-means clustering to cluster objects that have mixed attributes. 2FA-cmeans was tested on 6 different datasets and outperformed the previous 2FA-kprototypes technique as well as all other classical analogy techniques [30].

A new solution function has been proposed to improve the estimation accuracy of analogy-based estimates. The function is called SABE (Stacking Regularization in Analogy-Based Software Effort Estimation). The crucial element of SABE is stacking, a machine learning technique that combines the capabilities of multiple models to better predict the estimate. Four different datasets were used for validation, and the results suggest that SABE performs better than former studies [31].

A study has investigated the effect of the LEM algorithm on the optimization of feature weighting and has proposed a new method. The effectiveness of the algorithm was checked on two datasets, Desharnais and Maxwell, using the evaluation metrics MMRE, PRED(0.25) and MdMRE to compare the proposed method against previous algorithms. The technique shows considerable improvement in estimating the cost of software [32].

Because analogy-based estimation requires choosing the best number of analogies and an adjustment technique to achieve the best possible estimates, one work has proposed a new adjusted ABE model for optimizing and approximating the complex relationships between different features. It shows that this model improves the performance of ABE [33].

An estimation model known as the Fuzzy Analogy-based Software Effort Estimation model (FASEE) makes successful use of fuzzy logic with approximate reasoning theory to handle imprecision and uncertainty. A recent work enhances the FASEE model and solves, to some extent, problems related to low data quality and uncertainty in the reasoning process. The new model, named Consistent Fuzzy Analogy-based Software Effort Estimation (CFASEE), is compared on thirteen software project datasets and is found to perform better in terms of accuracy [34].

The shortcomings of analogy-based estimation tools are identified, and a new enhanced model for analogy-based estimation is proposed. A system prototype called EffortEst, based on the enhanced model, is also prepared. The authors show that EffortEst provides the nearest best estimate while requiring minimal user intervention [35].

A new framework is proposed that uses a case-based reasoning (CBR) model while considering a comprehensive set of requirements, including both functional and non-functional requirements along with the domain properties. The framework was tested on a set of thirty-six student projects and shows that the difference between calculated and actual effort was in the range of 10% [36].

The International Software Benchmarking Standards Group (ISBSG) dataset is used in a study to confirm that applying linguistic values rather than numerical values in analogy-based estimation can bring much better results in terms of accuracy [37].

Table 1 below lists the works that aim to improve analogy-based estimates. Its columns are restricted to the paper reference, pros and cons, the accuracy metric used and, in the last column, the details of the accuracy. Only those reported studies that tested an analogy-based estimation improvement technique and validated it with some accuracy metric are entered in Table 1.

Table 1. Comparative analysis of existing approaches for improvement in ABEE


Section 3 presents a case study of an analogy-based estimate and shows how it can be improved using the standard deviation technique that we propose. A strength of our approach is that we provide a complete case study step by step. Unlike most studies, our dataset is based on the agile methodology; in addition, it was collected from six different software houses in Pakistan.

3. Analogy-Based Estimation Technique

In this section we present a small and simple case so that readers can get an idea of how analogy-based estimation works. Suppose an organization has recently completed a project in which it built a biometric system for marking the attendance of a company's employees, a web application for the HR team, including charts for performance analysis of employees' office timings, and an Android application for upper management to view the data and keep an eye on employees' attendance.

Now a new client needs a similar attendance system with a small difference. In the new case some employees of the company work on client sites, while others work in the office. Attendance therefore cannot be marked using the thumb machine, as it is installed at the company's office and not at the client's location. To address this problem, the new solution is an Android application that provides a login for the company's employees; when those employees enter the client's premises, the location service is used to mark their attendance.

The organization has chosen estimation by analogy for this project because it is very similar to the previous project. As the initial step, Table 2 shows the previous project's modules, their LOC, and the number of features of the previous and the new project. It also shows the multiplication factor for the new project, calculated from the details of the previous and the new project.

Table 2. LOC of previous similar project and the current project


In Table 2, the multiplication factor is calculated by dividing the new project's features by the previous project's features, so that a module with more features than before receives a factor greater than 1. The LOC of the new project is calculated by simply multiplying the previous project's LOC by the multiplication factor.

Using the LOC of the previous and the current project, the size ratio is calculated, and the effort for the current project is estimated at 64 staff-months, compared with 50 staff-months for the previous project. Table 3 shows the details.
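The calculation chain described here (multiplication factor per module, new LOC, size ratio, new effort) can be sketched in a few lines. The module names, LOC figures and feature counts below are hypothetical, chosen only so that the totals reproduce the 50 and 64 staff-months mentioned in the text; the factor is taken as new features divided by previous features, and effort is assumed to scale linearly with the size ratio.

```python
# Hypothetical modules: (name, previous LOC, previous features, new features).
# These are illustrative numbers, not the values of Table 2.
modules = [
    ("attendance marking", 10_000, 5, 8),
    ("HR web application", 10_000, 10, 10),
    ("management Android app", 5_000, 5, 6),
]
previous_effort = 50  # staff-months spent on the previous project


def scale_by_analogy(modules, previous_effort):
    """Return (new LOC, size ratio, new effort) for the new project."""
    previous_loc = sum(loc for _, loc, _, _ in modules)
    # Multiplication factor per module = new features / previous features.
    new_loc = sum(loc * new / old for _, loc, old, new in modules)
    size_ratio = new_loc / previous_loc
    # Assumption: effort grows in proportion to size.
    new_effort = previous_effort * new_loc / previous_loc
    return new_loc, size_ratio, new_effort


new_loc, size_ratio, new_effort = scale_by_analogy(modules, previous_effort)
print(new_loc, size_ratio, new_effort)
```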

Table 3. Effort of the new project in staff-months


3.1. Improving the Analogy-Based Estimation

This subsection shows the use of a simple standard deviation technique to gain better confidence in the estimate calculated in the previous section. The standard deviation technique requires best-case, most likely and worst-case estimates. To obtain these, we replace the single multiplication factor of the analogy estimate with three values: one for the best case, one for the most likely case and one for the worst case. From the three values we generate the expected value as suggested by Putnam [38]. Table 4 below shows the best-case, most likely and worst-case values of the previously calculated multiplication factor, together with the resulting estimation case.

Table 4. Best- and worst-case LOC calculation of the new project (MF: Multiplication Factor)


In Table 4 we also calculate the best-case, worst-case and most likely size of the software in terms of LOC.

Using the same approach for calculating effort in staff-months, we now use the best-case, most likely and worst-case LOC values. The effort for all three cases is shown in Table 5.

Table 5. Best-case, worst-case and most likely effort in person-months


A common statistical rule of thumb is to assume that one sixth of the difference between the maximum and the minimum equals one standard deviation. This way of calculating the standard deviation assumes that the maximum corresponds to a 99.86% chance of meeting the estimate and the minimum to a 0.135% chance [1]. As defined, the standard deviation is the difference between the maximum and the minimum divided by 6. In our case it is (79 - 46)/6 = 33/6 = 5.5. Our expected case is based on the expected value formula shown in Equation (1).

\(\begin{aligned}\text{Expected Value} =\frac{[\text { BestCase }+(4 \times \text { MostLikelyCase })+\text { WorstCase }]}{6}\end{aligned}\)      (1)

In our case the expected value is 63.5, which is almost 64 staff-months, so our most likely value is in effect also our expected value. To be more confident in our calculated effort value, Table 6 gives the estimates at the 70%, 80% and 90% confidence levels. The multipliers for these percentage-confident calculations are given in the book by Steve McConnell [1]. The standard deviation is 5.5, as calculated previously.
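As a quick check of the arithmetic, the expected value from Equation (1) and the range-based standard deviation can be computed directly from the best-case, most likely and worst-case efforts used in the text (46, 64 and 79 staff-months). The function names are introduced here only for illustration.

```python
def expected_value(best, most_likely, worst):
    """Weighted three-point estimate from Equation (1)."""
    return (best + 4 * most_likely + worst) / 6


def standard_deviation(best, worst):
    """One sixth of the range between worst and best case."""
    return (worst - best) / 6


best, most_likely, worst = 46, 64, 79  # staff-months, from the case study

ev = expected_value(best, most_likely, worst)
sd = standard_deviation(best, worst)
print(ev)  # 63.5
print(sd)  # 5.5
```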

Table 6. Estimated Effort with different percentage confidence


As shown in Table 6, committing to 71 staff-months of effort for our system gives 90% confidence that the project will not take more than 71 staff-months; similarly, 67 and 69 staff-months give 70% and 80% confidence respectively. By incorporating the standard deviation into analogy-based estimation, we have a better chance of meeting our commitment, which leads to a better level of customer satisfaction.
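The 67, 69 and 71 staff-month figures can be reproduced with the sketch below. It assumes that the multipliers applied to the standard deviation are one-sided standard normal quantiles (about 0.52, 0.84 and 1.28 for 70%, 80% and 90%) and that the rounded expected value of 64 staff-months is used as the base; both are assumptions of this example rather than details stated in the text.

```python
from statistics import NormalDist


def estimate_at_confidence(expected, sd, confidence):
    """Expected value plus the one-sided normal quantile times the standard deviation."""
    z = NormalDist().inv_cdf(confidence)  # e.g. about 1.28 for 0.90
    return expected + z * sd


expected, sd = 64, 5.5  # staff-months, from the case study

commitments = {
    level: round(estimate_at_confidence(expected, sd, level / 100))
    for level in (70, 80, 90)
}
print(commitments)
```

A higher confidence level simply moves the commitment further above the expected value, in proportion to the standard deviation.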

To strengthen our approach, we applied this mechanism to a dataset for agile-based estimation. As agile development uses an iterative and incremental approach, expert judgment is also used for effort estimation in agile software systems.

Agile estimation generally uses analogy-based estimation; we therefore believe an agile dataset is a good choice for applying this mechanism. Through a systematic literature review and survey, the authors of [39] identified estimation by analogy as one of the most frequently used estimation techniques for agile development, and they classified the estimation techniques for agile software development.

The dataset used to apply the standard deviation approach contains data on 21 previous projects developed using Scrum-based agile software development. The dataset was obtained from [40], which states that the data was originally collected from six different software houses in Pakistan. The dataset contains 21 instances and 9 attributes. Each instance represents one project and provides the Sprint Size, monthly Work Days, Team's Initial Velocity (Vi), total Time for completing one sprint, total Cost spent on one sprint, Effort completed by the team in one sprint, monthly Team Salary, Dynamic Force Factor (D) and Team's Final Velocity (V). To show the use of standard deviation on the dataset, we use the actual and estimated times shown in Table 7. Table 7 shows only the values that we use for our case.

Table 7. Dataset of 21 software projects developed using agile methodology


To incorporate the standard deviation, we first calculated the best case and worst case from the estimated time. As we did not have actual best-case and worst-case figures, we subtracted 20% from the estimated time for the best case and added 20% to the estimated time for the worst case. We took the estimated time as the most likely value and, using these three values, calculated the expected value as in the previous example. Because the same percentage was used to generate the best and the worst case, the expected value is the same as the estimated value given in the dataset. This approach nevertheless allowed us to calculate the standard deviation, with which we were able to generate more than one estimated time at different levels of confidence. Table 8 shows the standard deviation for the dataset and the new estimated times at the 70%, 80% and 90% confidence levels.

Table 8. Standard Deviation and different confidence levels of time


The first column of Table 8 is the project number, the second column is the estimated time as given in the dataset, and the third column is the estimated time that we have taken as the most likely value for our expected value. The fourth and fifth columns are the best and the worst case, calculated by subtracting 20% from and adding 20% to the estimated value respectively. The sixth column is the expected time generated through the expected value formula given in Equation (1). The seventh is the standard deviation, calculated by dividing the difference between the worst and the best case by 6. The eighth, ninth and tenth columns are the estimated times of the project incorporating the standard deviation at the 70%, 80% and 90% confidence levels respectively. The eleventh and last column is the actual time spent on the project.
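The per-project columns described above can be derived from a single estimated time, as in the following sketch. The input value of 60 is hypothetical and is not taken from the dataset, and the confidence multipliers are again assumed to be one-sided standard normal quantiles.

```python
from statistics import NormalDist


def confidence_row(estimated_time, spread=0.20, levels=(70, 80, 90)):
    """Build the derived columns for one project from its estimated time."""
    best = estimated_time * (1 - spread)    # estimate minus 20%
    worst = estimated_time * (1 + spread)   # estimate plus 20%
    expected = (best + 4 * estimated_time + worst) / 6  # Equation (1)
    sd = (worst - best) / 6
    row = {"best": best, "worst": worst, "expected": expected, "sd": sd}
    for level in levels:
        z = NormalDist().inv_cdf(level / 100)
        row[f"p{level}"] = expected + z * sd
    return row


row = confidence_row(60.0)  # hypothetical estimated time
for name, value in row.items():
    print(f"{name:>8}: {value:.2f}")
```

Because the spread is symmetric, the expected value equals the input estimate and the standard deviation is always one fifteenth of it, so each confidence-level estimate is a fixed percentage above the original estimate.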

In the next section we calculate the Magnitude of Relative Error for all of our estimated cases and provide our understanding of the results. The Mean Magnitude of Relative Error (MMRE) is also calculated.

4. Results and Discussion

This section shows the Magnitude of Relative Error (MRE) for four cases: first for the original estimated time from the dataset, and then for our estimates at the 70%, 80% and 90% confidence levels, as shown in Table 9. The MRE was calculated using the formula given in Equation (2).

Table 9. MRE of actual dataset using the actual time of project completion, 70%, 80% and 90% confidence time


\(\begin{aligned}MRE=\left|\frac{\text{ActualResult}-\text{EstimatedResult}}{\text{ActualResult}}\right|\end{aligned}\)      (2)
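Equation (2) and the mean over all projects (MMRE) can be sketched as follows. The three actual and estimated values are hypothetical and serve only to show how the per-project comparison and the count of improved cases work; they are not rows of the dataset.

```python
def mre(actual, estimated):
    """Magnitude of Relative Error for one project, as in Equation (2)."""
    return abs(actual - estimated) / actual


def mmre(actuals, estimates):
    """Mean Magnitude of Relative Error over all projects."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


# Hypothetical data: actual times, original estimates and buffered estimates.
actuals = [100.0, 80.0, 50.0]
original = [90.0, 80.0, 45.0]
buffered = [95.0, 84.0, 48.0]

improved = sum(
    mre(a, b) < mre(a, o) for a, o, b in zip(actuals, original, buffered)
)
print(mmre(actuals, original), mmre(actuals, buffered), improved)
```

In this toy example the second project was estimated exactly, so any buffer makes its MRE worse, while the two underestimated projects improve.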

Table 9 shows that, of the 21 cases in total, the MRE improved in 14 cases when the standard deviation was applied. The MMRE (Mean Magnitude of Relative Error) also improved in all three cases, at the 70%, 80% and 90% confidence levels. This is shown in Table 10.

Table 10. MMRE of actual dataset using the actual time of project completion and MMRE with 70%, 80% and 90% confidence levels


Fig. 2 shows the MRE for all kinds of data: the first graph shows the original MRE from the dataset, and the next three graphs compare the original MRE with the MRE of our 90%, 80% and 70% confidence cases respectively.


Fig. 2. MRE graphs of the original dataset and three standard deviation-based confidence levels

It is interesting to observe that when the original MRE is low, that is, 5% or less, the MRE with the standard deviation applied goes up rather than improving. We believe this dataset is very good in terms of estimation results, as the difference between the estimated and actual time is very small: the highest MRE among the 21 projects is 11 percent. Since a 25% difference between actual and estimated values is already considered good in estimation [1], we can say that this dataset is exceptionally good. Considering the improvement in 14 of the 21 cases, we believe that our standard deviation-based technique will help in achieving more accurate estimates on other datasets where the MRE is ten percent or more, and we expect better results in such estimation cases.
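One way to see why the low-MRE projects get worse is the following sketch. It assumes the 20% best and worst cases described in Section 3.1 and one-sided multipliers \(z\) of roughly 0.52, 0.84 and 1.28 for the 70%, 80% and 90% levels; the symbols \(E\) (original estimate), \(A\) (actual time), \(E'\) (buffered estimate), \(k\) and \(z\) are introduced here only for illustration.

\(\begin{aligned}\sigma &= \frac{1.2E - 0.8E}{6} = \frac{E}{15}, \qquad E' = E + z\sigma = (1+k)E, \qquad k = \frac{z}{15} \\ |A - E'| < |A - E| &\iff A > \frac{E + E'}{2} = \left(1 + \frac{k}{2}\right)E \iff \frac{A - E}{A} > \frac{k/2}{1 + k/2}\end{aligned}\)

For \(z = 0.52\), \(0.84\) and \(1.28\) the right-hand side is about 1.7%, 2.7% and 4.1% respectively. Under these assumptions, the buffered estimate has a lower MRE only for projects that were originally underestimated by more than a few percent of the actual time; projects that were overestimated, or underestimated by less than this threshold, necessarily get a higher MRE. This is broadly consistent with the observation that projects with an original MRE of about 5% or less did not improve.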

5. Conclusion and Future Direction

In this paper, we showed a case of analogy-based estimation for a software system and, to improve the estimate, applied a simple standard deviation technique to it. We conclude that calculating estimates with standard deviation gives more confidence when committing a completion time to customers. To strengthen our case, we applied the same standard deviation methodology to an available dataset of agile software. Out of 21 instances for which the actual and estimated times were given in the dataset, our methodology improved 14 cases. This is an improvement in about 67% of the cases, which we showed through the calculation of the magnitude of relative error. We conclude that standard deviation should always be applied to generated estimates in order to gain more confidence and a better chance of accuracy.

In future we plan to apply the standard deviation methodology to other datasets; it can also be applied to cases where software effort is calculated using techniques other than analogy-based estimation.

References

  1. S. McConnell, Software estimation: demystifying the black art, Microsoft Press, 2006.
  2. M. A. Latif, M. Y. Khan, and K. Bashir, "Practices for Achieving Accuracy in Software Costing and Estimation," KIET Journal of Computing and Information Sciences, vol. 1, no. 1, pp. 83-95, 2018.
  3. C. N. Parkinson and R. C. Osborn, Parkinson's law, and other studies in administration, vol. 24. Houghton Mifflin Boston, 1957.
  4. C. F. Kemerer, "An empirical validation of software cost estimation models," Communications of the ACM, vol. 30, no. 5, pp. 416-429, 1987. https://doi.org/10.1145/22899.22906
  5. B. Boehm, C. Abts, and S. Chulani, "Software development cost estimation approaches - A survey," Annals of software engineering, vol. 10, no. 1-4, pp. 177-205, 2000. https://doi.org/10.1023/A:1018991717352
  6. Z. Li, X.-Y. Jing, and X. Zhu, "Progress on approaches to software defect prediction," IET Software, vol. 12, no. 3, pp. 161-175, 2018. https://doi.org/10.1049/iet-sen.2017.0148
  7. K. Honda, H. Washizaki, and Y. Fukazawa, "Generalized software reliability model considering uncertainty and dynamics: Model and applications," International Journal of Software Engineering and Knowledge Engineering, vol. 27, no. 06, pp. 967-993, 2017. https://doi.org/10.1142/S021819401750036X
  8. Y. Singh, A. Kaur, P. K. Bhatia, and O. Sangwan, "Predicting software development effort using artificial neural network," International Journal of Software Engineering and Knowledge Engineering, vol. 20, no. 03, pp. 367-375, 2010. https://doi.org/10.1142/S0218194010004761
  9. A. Ali and C. Gravino, "A systematic literature review of software effort prediction using machine learning methods," Journal of Software: Evolution and Process, vol. 31, no. 10, p. e2211, 2019.
  10. R. Ishizue et al., "Metrics Visualization Techniques Based on Historical Origins and Functional Layers for Developments by Multiple Organizations," International Journal of Software Engineering and Knowledge Engineering, vol. 28, no. 01, pp. 123-147, 2018. https://doi.org/10.1142/S0218194018500067
  11. M. Jorgensen, "What we do and don't know about software development effort estimation," IEEE software, vol. 31, no. 2, pp. 37-40, 2014. https://doi.org/10.1109/MS.2014.49
  12. A. A. Khan and J. Keung, "Systematic review of success factors and barriers for software process improvement in global software development," IET software, vol. 10, no. 5, pp. 125-135, 2016. https://doi.org/10.1049/iet-sen.2015.0038
  13. S. U. Khan and A. W. Khan, "Critical challenges in managing offshore software development outsourcing contract from vendors' perspectives," IET software, vol. 11, no. 1, pp. 1-11, 2017. https://doi.org/10.1049/iet-sen.2015.0080
  14. M. Shepperd and C. Schofield, "Estimating software project effort using analogies," IEEE Transactions on software engineering, vol. 23, no. 11, pp. 736-743, 1997. https://doi.org/10.1109/32.637387
  15. A. Idri, F. azzahra Amazal, and A. Abran, "Analogy-based software development effort estimation: A systematic mapping and review," Information and Software Technology, vol. 58, pp. 206-230, 2015. https://doi.org/10.1016/j.infsof.2014.07.013
  16. V. K. Bardsiri, D. N. Abang Jawawi, and E. Khatibi, "Towards improvement of analogy-based software development effort estimation: A review," International Journal of Software Engineering and Knowledge Engineering, vol. 24, no. 07, pp. 1065-1089, 2014. https://doi.org/10.1142/S0218194014500351
  17. Y.-F. Li, M. Xie, and T. N. Goh, "A study of project selection and feature weighting for analogy based software cost estimation," Journal of Systems and Software, vol. 82, no. 2, pp. 241-252, 2009. https://doi.org/10.1016/j.jss.2008.06.001
  18. E. Kocaguneli, T. Menzies, and J. W. Keung, "Kernel methods for software effort estimation," Empirical Software Engineering, vol. 18, no. 1, pp. 1-24, 2013. https://doi.org/10.1007/s10664-011-9189-1
  19. M. Shanker, J. Jaya, and K. Thanushkodi, "An Effective Approach to Software Cost Estimation Based on Soft Computing Techniques," International Arab Journal of Information Technology (IAJIT), vol. 12, 2015.
  20. P. Phannachitta, J. Keung, A. Monden, and K. Matsumoto, "A stability assessment of solution adaptation techniques for analogy-based software effort estimation," Empirical Software Engineering, vol. 22, no. 1, pp. 474-504, 2017. https://doi.org/10.1007/s10664-016-9434-8
  21. A. Idri, M. Hosni, and A. Abran, "Improved estimation of software development effort using Classical and Fuzzy Analogy ensembles," Appl Soft Comput, vol. 49, pp. 990-1019, 2016. https://doi.org/10.1016/j.asoc.2016.08.012
  22. I. Abnane and A. Idri, "Evaluating fuzzy analogy on incomplete software projects data," in Proc. of 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1-8, 2016.
  23. T. R. Benala and R. Bandarupalli, "Least square support vector machine in analogy-based software development effort estimation," in Proc. of 2016 International Conference on Recent Advances and Innovations in Engineering (ICRAIE), pp. 1-6, 2016.
  24. D. Manikavelan and R. Ponnusamy, "Minimizing Analogy Errors with the Help of Fuzzy," International Journal of Applied Engineering Research, vol. 13, no. 6, pp. 4527-4530, 2018.
  25. J. H. C. Wu and J. W. Keung, "Utilizing cluster quality in hierarchical clustering for analogy-based software effort estimation," in Proc. of 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 1-4, 2017.
  26. P. Phannachitta, "Robust comparison of similarity measures in analogy based software effort estimation," in Proc. of 2017 11th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), pp. 1-7, 2017.
  27. V. Nguyen, T. Huynh, B. Boehm, L. Huang, and T. Truong, "Investigating the use of duration-based windows and estimation by analogy for COCOMO," Journal of Software: Evolution and Process, vol. 31, no. 10, p. e2176, 2019.
  28. P. Phannachitta, "On an optimal analogy-based software effort estimation," Inf Softw Technol, vol. 125, p. 106330, 2020.
  29. A. Idri, F. A. Amazal, and A. Abran, "Accuracy comparison of analogy-based software development effort estimation techniques," International Journal of Intelligent Systems, vol. 31, no. 2, pp. 128-152, 2016. https://doi.org/10.1002/int.21748
  30. F. A. Amazal and A. Idri, "Estimating software development effort using fuzzy clustering-based analogy," Journal of Software: Evolution and Process, vol. 33, no. 4, p. e2324, 2021.
  31. A. Kaushik, P. Kaur, N. Choudhary, and Priyanka, "Stacking regularization in analogy-based software effort estimation," Soft Computing, vol. 26, no. 3, pp. 1197-1216, Feb. 2022. https://doi.org/10.1007/s00500-021-06564-w
  32. M. Dashti, T. J. Gandomani, D. H. Adeh, H. Zulzalil, and A. B. M. Sultan, "LEMABE: a novel framework to improve analogy-based software cost estimation using learnable evolution model," PeerJ Comput Sci, vol. 8, p. e800, 2022.
  33. M. Azzeh, Y. Elsheikh, and M. Alseid, "An optimized analogy-based project effort estimation," arXiv preprint arXiv:1703.04563, 2017.
  34. S. Ezghari and A. Zahi, "Uncertainty management in software effort estimation using a consistent fuzzy analogy-based method," Applied Soft Computing, vol. 67, pp. 540-557, 2018. https://doi.org/10.1016/j.asoc.2018.03.022
  35. S. D. N. S. S. B. Rumjaun, K. A. Gutteea, and L. Nagowah, "Effortest-an enhanced software effort estimation by analogy method," ADBU Journal of Engineering Technology, vol. 5, no. 2, 2016.
  36. F. Fellir, K. Nafil, R. Touahni, and L. Chung, "Improving case based software effort estimation using a multi-criteria decision technique," in Proc. of Computer Science On-line Conference, pp. 438-451, 2018.
  37. F. A. Amazal, A. Idri, and A. Abran, "Software development effort estimation using classical and fuzzy analogy: a cross-validation comparative study," Int J Comput Intell Appl, vol. 13, no. 03, p. 1450013, 2014.
  38. L. H. Putnam, "Estimating software cost," Datamation, pp. 171-178, 1979.
  39. M. Usman, J. Borstler, and K. Petersen, "An effort estimation taxonomy for agile software development," International Journal of Software Engineering and Knowledge Engineering, vol. 27, no. 04, pp. 641-674, 2017. https://doi.org/10.1142/S0218194017500243
  40. S. K. T. Ziauddin and S. Zia, "An effort estimation model for agile software development," Advances in computer science and its applications (ACSA), vol. 2, no. 1, pp. 314-324, 2012.