Comparison of Pre-Operation Diagnosis of Thyroid Cancer with Fine Needle Aspiration and Core-needle Biopsy: a Meta-analysis



Introduction
Thyroid disease is very common, with nodules occurring in up to 60% of the general population (Cooper et al., 2009). More than 95% of such nodules are benign, but the incidence of thyroid malignancy has increased more than two-fold worldwide over the last four decades (Sipos and Mazzaferri, 2010). Improved diagnostic capability is a major contributor to the observed increase (Davies and Welch, 2006), but an increase in the true occurrence of thyroid cancer is also partly responsible for the trend (Li et al., 2013). In the United States, 30,000 cases of thyroid cancer are diagnosed annually and about 1,500 people die of the disease (American Cancer Society, 2013). Various types of thyroid malignancies can develop, with varying survival rates, but most such malignancies entail a good prognosis when treatment is appropriate and timely. This makes early, accurate distinction of benign and malignant nodules imperative, since the latter require immediate surgical excision (Yassa et al., 2007) and generally adjuvant radioiodine therapy (RAI) (Morrison et al., 2014).

Lei Li 1, Bao-Ding Chen 1, Hai-Feng Zhu 1, Shu Wu 1, Da Wei 2, Jian-Quan Zhang 3, Li Yu 1*

Often, thyroid malignancy is discovered at an early stage. In such cases, surgical excision of a nodule deemed suspicious based on clinical examination and history serves a triple purpose. First, analysis of the excised lesion via histopathology, cytology, and immunohistopathology allows for a definitive diagnosis, including the type and grade of the malignancy. Second, histologic and immunochemical examination of margins around the tumor is vital to the surgical staging. Finally, the finding of clear microscopic margins (i.e., no tumor cells in the margins) establishes that an R0 resection has been achieved. When subsequent testing (nuclear medicine and otherwise) confirms no residual tumor, which is particularly common with tumors <1 cm, the disease has been effectively cured. Otherwise, the surgical excision constitutes a large component of the treatment, which then can be completed with adjuvant RAI.
Notwithstanding the high accuracy of gold standard pathology tests and the ability to check for clear margins in lesions taken with open biopsy, the fact that less than 5% of thyroid nodules actually are malignant warrants less invasive screening methods. Offering the advantage of being completely non-invasive, imaging typically plays a central role in the detection of nodules, especially those that are small, deep, or otherwise not palpable. Often, computed tomography (CT), magnetic resonance imaging (MRI), or positron emission tomography (PET) of the neck performed for other reasons leads to incidental discovery of a thyroid nodule, but ultrasonography (US) is the imaging modality used most frequently to characterize thyroid nodules whose presence is already known (Nachiappan et al., 2014).
High resolution US has become the gold standard modality for imaging such nodules, because it does not expose patients to ionizing radiation and is particularly revealing in organs close to the surface, such as the thyroid. Standard two-dimensional US can distinguish cystic from solid nodules, as well as reveal other aspects of nodular morphology that have implications for the likelihood of malignancy. Microcalcifications, solid composition, and central vascularity all raise the level of suspicion, as do rough edges, a high anteroposterior-to-transverse diameter ratio (A-P/T), and lack of a halo in the lesion (Alexander et al., 2004; Mansor et al., 2012). Nevertheless, there is a great deal of overlap between features of benign and malignant thyroid nodules using conventional US (Tamsel et al., 2007). It is possible that benign and malignant thyroid nodules can be differentiated much better using US elastography, which involves compressing the target tissue forcefully and analyzing the elastic response (Mansor et al., 2012). However, US elastography may not be particularly revealing when employed as a screening tool for thyroid malignancy in low risk nodular goiter (Vidal-Casariego et al., 2012), the population in which it is most desirable to avoid the invasiveness of biopsy. Cheng et al. (2013) reported a US score based on echogenicity, margins, calcification, and the ratio between the anteroposterior and transverse diameters for predicting whether a lesion is benign or malignant: a US score ≤2 had 80.3% sensitivity and 72.7% specificity for the lesion being benign.
Weighing the invasiveness of excision biopsy against the low accuracy of US leads to an intermediate approach for evaluating thyroid nodules, namely needle biopsy. There are two main categories of needle biopsy: fine needle aspiration (FNA), which supplies a small sample for cytology, and core needle biopsy (CNB), which uses a larger needle to supply more tissue for analysis. FNA is a safe, accurate, and cost-effective method for the initial evaluation of thyroid nodules and is vital in the selection of patients requiring surgical excision, adjuvant treatment, and other clinical interventions (Bukhari et al., 2008; Bongiovanni et al., 2012). The diagnostic specificity and sensitivity of ultrasound-guided FNA are 92% and 83%, respectively (Gharib et al., 2010), but the technique entails certain limitations. Inadequate sampling and indeterminate diagnosis may occur, and excisional biopsy may be necessary for definitive diagnosis in many cases (Tandon et al., 2008). Additionally, FNA with cytology (FNAC) of thyroid tissue cannot differentiate follicular adenoma from follicular carcinoma (Tandon et al., 2008).
To improve the accuracy of needle biopsy of solitary nodules, CNB may be useful, alone or in combination with FNA (Baloch et al., 2008; Park et al., 2011). Minimally invasive and inexpensive compared with open biopsy, CNB constitutes a viable alternative to FNA. Additionally, CNB provides a larger sample than FNA, allowing histopathology and immunohistochemical analysis (Bain et al., 2000; Cheung et al., 2000; Screaton et al., 2003; Ridder et al., 2005). Indeed, Hakala et al. (2013) found that CNB may be more useful than FNA, particularly in the diagnosis of non-follicular thyroid lesions such as papillary carcinoma. Results of another recent study suggest that CNB can reduce non-diagnostic results and the need for surgical diagnosis in patients with calcified thyroid lesions; consequently, the authors conclude that CNB may be preferable to FNA as a first-line diagnostic tool for calcified thyroid nodules (Ha et al., 2014). Additionally, CNB produced accurate, conclusive diagnoses in a study evaluating the technique in patients whose thyroid nodules had been tested previously, but unsuccessfully, via FNA (Yeon et al., 2013).
Controversy surrounds the diagnostic accuracy of CNB. In the setting of thyroid disease, only a few studies have compared the sensitivity and specificity of CNB directly with those of FNA (Quinn et al., 1994; Liu et al., 1995; Karstrup et al., 2001; Harvey et al., 2005; Renshaw and Pinnar, 2007; Sung et al., 2012). Furthermore, notwithstanding the accurate, conclusive diagnoses that are possible when CNB is employed subsequent to failed FNA, this approach entails a selection bias (Screaton et al., 2003; Park et al., 2011; Hahn et al., 2013; Yeon et al., 2013); when used on nodules that have failed to produce diagnostically useful FNA results, CNB cannot be compared with FNA reliably (Novoa et al., 2012). The meta-analysis reported in this article was conducted for the purpose of comparing the diagnostic sensitivity and specificity of FNA and CNB for thyroid malignancy.

Selection of Studies
Articles were identified in literature searches and subsequently screened in and out based on the patient/problem-intervention-comparison-outcome (PICO) principle. The literature databases searched were Medline, the Cochrane Library, EMBASE, and Google Scholar. The following keywords and key terms were used in the searches: fine needle aspiration/FNA/FNAC; core needle biopsy/coarse needle biopsy/core biopsy/CNB; thyroid nodule; thyroid cancer; diagnostic/diagnosis; ultrasound/US. Articles were selected with publication dates up to December 31, 2013. The resulting lists of references were then hand-searched, and studies were selected for further examination or screened out if they were not relevant.
The remaining articles were screened further for selection of studies to be meta-analyzed. Both prospective and retrospective studies were selected if they were two-armed, with patients undergoing simultaneous FNA and CNB of each nodule; if the needle biopsies were conducted to search for thyroid cancer prior to, or subsequent to, surgical excision; and if the needle biopsies were ultrasound-guided (if conducted prior to surgery) or taken from surgically extracted nodules. Studies were excluded if only one technique was investigated, or if diagnostic values such as accuracy, sensitivity, and specificity for malignancy were not studied with respect to thyroid nodules. Studies were identified by two independent reviewers using the aforementioned search strategy. When there was uncertainty regarding eligibility, a third reviewer was consulted.
DOI: http://dx.doi.org/10.7314/APJCP.2014.15.17.7187

Extraction of Data
From the studies selected for meta-analysis, extracted data consisted of the name of the first author, year of publication, study design, number of patients, patients' age and gender, diagnostic criteria, and the following results: needle gauge, number of needle passes, and number of lesions that were true positives (TP), true negatives (TN), false positives (FP), or false negatives (FN). The accuracy, sensitivity, and specificity of the techniques were also extracted. A positive result from needle biopsy was categorized as a TP when the final diagnosis on the excised specimen proved to be positive. Similarly, a TN was recorded when a negative needle biopsy result was corroborated by analysis of the surgical specimen. FP was defined as a positive finding on needle biopsy with a negative finding on final diagnosis. FN was defined as a negative finding on needle biopsy with a positive finding on final diagnosis.
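The four outcome definitions above amount to a two-by-two lookup. The following is a minimal Python sketch of that lookup; the function name `classify`, the boolean inputs, and the example lesions are illustrative assumptions and are not taken from the included studies.

```python
from collections import Counter


def classify(needle_positive: bool, final_positive: bool) -> str:
    """Label one lesion by comparing the needle biopsy result with the final diagnosis."""
    if needle_positive:
        # Positive needle biopsy: true positive only if the final diagnosis agrees.
        return "TP" if final_positive else "FP"
    # Negative needle biopsy: false negative if the final diagnosis is positive.
    return "FN" if final_positive else "TN"


# Hypothetical lesions: (needle biopsy positive?, final diagnosis positive?)
lesions = [(True, True), (True, False), (False, True), (False, False), (False, False)]
counts = Counter(classify(needle, final) for needle, final in lesions)
print(dict(counts))
```

Tallying the labels per technique in this way yields the TP, TN, FP, and FN counts on which the diagnostic values are based.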

Quality Assessment and Outcome Measures
The quality of the primary studies was evaluated using the Newcastle-Ottawa Scale, a validated technique for assessing the quality of nonrandomized studies (Wells et al., 2000). The primary outcome was the diagnostic value (sensitivity and specificity) of FNA and CNB for thyroid cancer, while accuracy of diagnosis was categorized as a secondary outcome.

Statistical Analysis
Data were extracted and analyzed with malignant and suspected malignant lesions defined as positive findings, and benign lesions defined as negative findings. Results of FNA and CNB were compared with the final diagnoses established via histopathology and other gold standard testing to generate numbers for the TP, TN, FP, and FN groups. The following values were calculated: accuracy = (TP + TN)/total, sensitivity = TP/(TP + FN), and specificity = TN/(TN + FP).
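The three formulas above can be written as a short function. This is a minimal Python sketch; the function name `diagnostic_metrics` and the example counts are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def diagnostic_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, sensitivity, and specificity from the four outcome counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,      # (TP + TN) / total
        "sensitivity": tp / (tp + fn),      # TP / (TP + FN)
        "specificity": tn / (tn + fp),      # TN / (TN + FP)
    }


# Hypothetical counts, not data from any included study.
metrics = diagnostic_metrics(tp=40, tn=45, fp=5, fn=10)
print(metrics)
```

With these illustrative counts, 85 of 100 lesions are classified correctly, 40 of 50 malignant lesions are detected, and 45 of 50 benign lesions are correctly called negative.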
Heterogeneity among the studies was assessed via the Cochran Q and the I² statistic. A Q statistic with p<0.10 was considered to indicate significant heterogeneity. The I² statistic, the percentage of the observed between-study variability attributable to heterogeneity rather than chance, was also considered to indicate heterogeneity when I²>50%. When heterogeneity existed between studies, the random-effects model (DerSimonian-Laird approach) was used. Otherwise, the fixed-effects model (Mantel-Haenszel approach) was employed. A funnel plot for publication bias was not employed in this analysis, because only 5 studies were meta-analyzed, which is inadequate to detect funnel plot asymmetry (Sutton et al., 2000). The difference between the areas under the curve (AUCs) of two summary receiver operating characteristic (ROC) curves was tested via the Hanley-McNeil method (Hanley and McNeil, 1983; Rosman and Korsten, 2007). Sensitivity analysis for evaluating the influence of each individual study on the sensitivity and specificity of FNA and CNB was performed using the leave-one-out approach. A two-sided p<0.05 was considered statistically significant. The homogeneity test (which pooled estimates for sensitivity and specificity), the Moses-Littenberg approach for summary ROC curves (Moses et al., 1993), and sensitivity analysis (for evaluating the influence of each individual study) were performed using Meta-DiSc version 1.4 (Zamora et al., 2006).
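The heterogeneity statistics, random-effects pooling, and leave-one-out analysis described above can be sketched generically as follows. This is a minimal Python illustration under stated assumptions: proportions are pooled on the logit scale with inverse-variance weights and a 0.5 continuity correction, the fixed-effect step is inverse-variance rather than Mantel-Haenszel, and the per-study counts are hypothetical. It is not a reproduction of the Meta-DiSc computations, and all function names are illustrative.

```python
import math


def logit_and_var(events: int, total: int):
    """Logit of a proportion and its approximate variance (0.5 continuity correction)."""
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled effect, Q, I2 in percent, tau2)."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, q, i2, tau2


def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical per-study (TP, TP + FN) pairs, not data from the included studies.
studies = [(30, 50), (45, 50), (20, 40), (70, 80), (15, 30)]
effects, variances = zip(*(logit_and_var(tp, n) for tp, n in studies))
pooled_logit, q, i2, tau2 = pool_random_effects(effects, variances)
pooled_sensitivity = inv_logit(pooled_logit)

# Leave-one-out: re-pool with each study removed in turn.
leave_one_out = [
    inv_logit(pool_random_effects(effects[:i] + effects[i + 1:],
                                  variances[:i] + variances[i + 1:])[0])
    for i in range(len(studies))
]
print(round(pooled_sensitivity, 3), round(q, 2), round(i2, 1))
```

If the leave-one-out estimates stay close to the overall pooled estimate, no single study dominates the result, which is the question the sensitivity analysis is meant to answer.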

Literature Search
Following the removal of duplicates, the database search identified 112 studies. Screening of these excluded 96 as non-relevant, yielding 16 full-text articles to be assessed for eligibility (Figure 1). Of the 16 studies, one was excluded because CNB had been performed on nodules in which previous FNA had failed; thus, only CNB had actually been studied. Three studies were excluded because they analyzed the effects of combined intervention. Another study was excluded because interventions were not performed on the same nodules. Four studies were excluded because they did not produce outcomes of interest to our study. Two studies were excluded because the FNA and/or CNB were not ultrasound-guided. This process yielded five studies to be meta-analyzed, which are listed in Table 1 with relevant values.
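The selection counts above can be checked with simple arithmetic. The short Python tally below uses only the numbers stated in the text; the variable names and the wording of the dictionary keys are illustrative.

```python
identified = 112             # studies identified after removal of duplicates
excluded_at_screening = 96   # excluded as non-relevant
full_text = identified - excluded_at_screening

# Full-text exclusions, as listed in the text.
full_text_exclusions = {
    "CNB performed only after failed FNA": 1,
    "combined intervention analyzed": 3,
    "interventions not performed on the same nodules": 1,
    "no outcomes of interest": 4,
    "FNA and/or CNB not ultrasound-guided": 2,
}
included = full_text - sum(full_text_exclusions.values())
print(full_text, included)
```

The tally confirms that 16 full-text articles minus 11 exclusions leaves the five studies that were meta-analyzed.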

Characteristics of the Included Studies
Of the five studies in the meta-analysis, two were prospective and the other three were retrospective in design (Table 1). The studies differed greatly in terms of the number of patients, needle gauge for both FNA and CNB, and gender ratio. Across the studies, the number of patients ranged from 52 to 538, each study's patients were predominantly of one gender, and the gauge of needles used ranged from 18 to 21 for CNB and from 21 to 25 for FNA. In total, 1264 patients were included, and 859 (68.0%) of them were female.

Comparison of FNA and CNB Diagnostic Performance
The accuracies (the proportion of results that were TP or TN) of FNA and CNB for the five studies ranged from 0.629 to 0.820 and from 0.548 to 0.921, respectively (Table 2). The sensitivity of FNA and CNB ranged from 0.30 to 0.935 and from 0.583 to 0.868, and the specificity ranged from 0.323 to 1.0 and from 0.452 to 0.992, respectively.
The random-effects model was used for determining the pooled diagnostic sensitivity, since homogeneity tests of sensitivity showed Q=20.97 (p=0.0003) and I²=80.9% for FNA, and Q=17.96 (p=0.0013) and I²=77.7% for CNB. This indicates significant heterogeneity among the studies. The analysis yielded pooled diagnostic sensitivities of 0.68 for FNA and 0.83 for CNB (Figure 2).
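The reported I² values and p-values follow from the Q statistics and the degrees of freedom (five studies, so df = 4). The Python sketch below illustrates the relationship using the standard formula I² = (Q − df)/Q × 100 and a closed-form chi-square tail probability that is valid only for even degrees of freedom; the function names are illustrative, and the check is an assumption about how the figures relate, not a description of the software output.

```python
import math


def i_squared(q: float, df: int) -> float:
    """I2 in percent, truncated at zero."""
    return max(0.0, (q - df) / q) * 100.0


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k      # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total


df = 5 - 1  # five studies
for label, q in [("FNA", 20.97), ("CNB", 17.96)]:
    print(label, round(i_squared(q, df), 1), round(chi2_sf_even_df(q, df), 4))
```

Both Q statistics exceed the heterogeneity thresholds (p<0.10 and I²>50%), which is why the random-effects model applies here.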
Similarly, the random-effects model was used to evaluate pooled diagnostic specificity, as there was also heterogeneity across the studies. The homogeneity tests showed Q=128.20 (p<0.0001) and I²=96.9% for FNA, and Q=75.94 (p<0.0001) and I²=94.7% for CNB. The pooled diagnostic specificities of the FNA and CNB procedures were 0.93 and 0.94, respectively (Figure 3).

Table notes: *Diagnostic criteria for malignancy were Bethesda categories 5 and 6; category 5 indicates suspicious malignancy and category 6 indicates malignancy. ‡Diagnostic criteria for malignancy were atypical, suspicious, or positive for malignancy. §Diagnostic criteria for malignancy not specified. TP: true positive, the number of cancerous lesions with positive diagnoses; TN: true negative, the number of non-cancerous lesions with negative diagnoses; FP: false positive, the number of non-cancerous lesions with positive diagnoses; FN: false negative, the number of cancerous lesions with negative diagnoses. Accuracy (%) = (TP+TN)/total × 100; sensitivity (%) = TP/(TP+FN) × 100; specificity (%) = TN/(TN+FP) × 100. a: The sample size (total number of nodules, unless otherwise specified) is smaller in this table than in Table 1 because some patients were lost to follow-up and the diagnosis of some nodules was not surgically verified. †Indicates the total number of patients.

Summary ROC Curves and Influences of Individual Studies
The areas under the summary ROC curves were 0.905 (standard error=0.030) for FNA and 0.745 (standard error=0.095) for CNB. No significant difference was observed between the summary ROC curves of FNA and CNB (p=0.053; Figure 4). Sensitivity analysis using the leave-one-out approach revealed that no single study had greater influence than any other on the pooled estimates for diagnostic sensitivity and specificity (Table 3).
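A comparison of two AUCs in the style of Hanley and McNeil uses the statistic z = (AUC1 − AUC2)/sqrt(SE1² + SE2² − 2·r·SE1·SE2), where r is the correlation between the two AUC estimates. The Python sketch below applies that formula to the reported AUCs and standard errors. The correlation r is not reported in the text, so the values tried here are assumptions, and the function names are illustrative.

```python
import math


def auc_difference_z(auc1: float, se1: float, auc2: float, se2: float, r: float = 0.0) -> float:
    """z statistic for the difference between two AUCs with correlation r."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    return (auc1 - auc2) / se_diff


def two_sided_p(z: float) -> float:
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Reported values: FNA AUC 0.905 (SE 0.030), CNB AUC 0.745 (SE 0.095).
# The correlations below are assumed for illustration only.
for r in (0.0, 0.3, 0.5):
    z = auc_difference_z(0.905, 0.030, 0.745, 0.095, r)
    print(r, round(z, 2), round(two_sided_p(z), 3))
```

Because both techniques were applied to the same nodules, some positive correlation is plausible, and the resulting p-value depends noticeably on the value assumed for r.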

Discussion
The purpose of this meta-analysis was to compare FNA and CNB in terms of their diagnostic characteristics as they pertain to the differentiation of malignant and benign thyroid nodules. Despite recent reports that CNB may outperform FNA in some settings, this meta-analysis revealed no significant difference between FNA and CNB in diagnostic value, considering both sensitivity and specificity, for pre-operation diagnosis of thyroid nodules. No serious complications requiring hospitalization were reported for either technique in any of the included studies. For all five studies meta-analyzed, both procedures proved safe, with the major adverse events consisting of bleeding and hematomas (Liu et al., 1995; Karstrup et al., 2001; Harvey et al., 2005; Renshaw and Pinnar, 2007; Sung et al., 2012). All studies included in the analysis used US-guided biopsies (Hakala et al., 2013), except when biopsies were taken from intraoperative surgical specimens. All studies included in this analysis compared the two techniques on the same nodules, thus allowing a more direct and objective comparison than many of the studies that were eliminated in the article screening process.
Two prior meta-analyses, one by Tandon et al. (2008) and the other by Novoa et al. (2012), evaluated FNA and CNB in diagnosing malignancies of the head and neck. Tandon et al. (2008) included 30 studies and reported an FNA sensitivity of 79.7% and a specificity of 98.1% for detecting thyroid cancer, and an FNA sensitivity of 59% for detecting differentiated thyroid cancer. Novoa et al. (2012) included 16 studies assessing CNB and determined that CNB could detect thyroid malignancy with a sensitivity of 68% and a specificity of 100%. In both meta-analyses, the low sensitivity reflected a high number of FPs, which is less of a problem than a high number of FNs.
In the five studies included in the current meta-analysis, the FP rate ranged from 0% to 33.8% for FNA, and from 0.4% to 27% for CNB. The range of FPs across the studies may reflect differences in accuracy of the cytopathology and histopathology among the studies. Consistent with the results of the current meta-analysis, comparison of the previous CNB and FNA meta-analyses revealed no significant difference in sensitivities of CNB and FNA (p=0.350) (Novoa et al., 2012). In contrast, the results of both the Tandon et al. (2008) and Novoa et al. (2012) meta-analyses suggest that CNB is more accurate and specific, with a better negative predictive value compared with FNA when applied to neoplasia throughout the head and neck. However, neither study compared FNA and CNB in terms of their diagnostic values specifically with respect to the thyroid.
The results of the current study should be interpreted in the context of several limitations. There was bias with respect to sampling, firstly because all studies tended to focus on one gender. Overall, females were the majority (68%), which may conceal some underlying bias. Secondly, in one study patients were included based on prior preoperative FNA diagnoses that warranted surgical treatment (Hakala et al., 2013). In another study (Na et al., 2012), however, patients were included due to previous unsuccessful FNA results; this may suggest lesions that were difficult to aspirate. Each biopsy technique might be better suited to particular subtypes of lesions (Hahn et al., 2013; Hakala et al., 2013), and such heterogeneous sampling might have obscured the analysis. For the purpose of the current study, the calculations were based on the assumption that only malignant or possibly malignant samples constitute positive findings, and the results were compared with the 'gold standard' testing of excised nodules. Consequently, this analysis may be limited in terms of comparison of the sensitivity, specificity, and accuracy of the techniques. Other aspects, such as the rate of inconclusive diagnoses (e.g., non-diagnostic and AUS/FLUS), were not considered. The current meta-analysis also excluded biopsies with insufficient material for diagnostic evaluation. Additionally, there was heterogeneity with respect to diagnostic criteria: the Bethesda system for reporting thyroid cytopathology was used for interpretation of biopsies in only three of the five studies (Na et al., 2012; Sung et al., 2012; Hakala et al., 2013). Furthermore, in three studies (Karstrup et al., 2001; Renshaw and Pinnar, 2007; Hakala et al., 2013), needle biopsy diagnoses were compared with surgical resection as the final diagnosis. However, in the other two studies (Na et al., 2012; Sung et al., 2012), final diagnoses were determined by histopathological results after surgical resection (for some benign and all malignant nodules), as well as by clinical follow-up (for benign nodules). Future meta-analyses should seek to eliminate studies with such heterogeneity. Due to the outcome of the literature selection process, the number of studies included in the meta-analysis was small, and of the five studies included, only two were prospective in design. Additionally, in none of the five studies was the pathologist blinded to the results of the FNA and CNB when determining the final diagnosis. The studies also differed in the diagnostic categories used and the definition of malignancy. Notably, the term "malignant" in two of the studies included "suspected malignancy" (Renshaw and Pinnar, 2007; Sung et al., 2012). Moreover, the analysis included multiple forms of thyroid cancer, yet it is possible that CNB and FNA may have better diagnostic abilities in identifying specific types of thyroid malignancy. These limitations reflect the very small pool of available studies; therefore, an additional conclusion of the current study is that more prospective studies comparing FNA and CNB findings against gold standard diagnosis of excised thyroid nodules need to be performed.
In summary, the current study suggests that both FNA and CNB are simple, minimally invasive, and safe methods for reliable diagnosis of thyroid malignancy. The fact that both procedures show a reasonable number of FPs with respect to the detection of thyroid cancer suggests that they should not replace excisional biopsy, at least not for certain patients (Novoa et al., 2012). Additionally, as noted above, the identification of just a handful of appropriate studies by the literature search and article screening process highlights the need for well-designed prospective studies to investigate this issue.

Figure 2. The Forest Plots Showing the Sensitivity of FNA and CNB (Random Effect Approach)

Figure 4. The Summary ROC (sROC) Curves for FNA and CNB. Symmetric sROC Curve Fitted Using Moses' Model. The AUCs for FNA and CNB were 0.905 (with Standard Error of 0.030) and 0.745 (with Standard Error of 0.095), Respectively. No Significant Difference was Observed between the Two AUCs of FNA and CNB (p=0.053).