Comparison of Source Apportionment of PM2.5 Using PMF2 and EPA PMF Version 2

The positive matrix factorization (PMF2) and multilinear engine (ME2) models have been shown to be powerful environmental analysis techniques and have been successfully applied to the assessment of ambient particulate matter (PM) source contributions. Because these models are difficult to apply practically, the US EPA developed a more user-friendly version of the PMF. The initial version of the EPA PMF model does not provide any rotational capabilities; for this reason, the model was upgraded to include rotational functions in the EPA PMF ver. 2.0. In this study, PMF and EPA PMF modeling identified ten particulate matter sources including secondary sulfate Ⅰ, vehicle gasoline, secondary sulfate Ⅱ, secondary nitrate, secondary sulfate Ⅲ, incinerators, aged sea salt, airborne soil particles, oil combustion, and diesel emissions. All of the source profiles determined by the two models showed excellent agreement. The calculated average concentrations of PM2.5 were consistent between the PMF2 and EPA PMF (17.94±0.30 ㎍/㎥ and 17.94±0.30 ㎍/㎥, respectively). Also, each set of estimated source contributions of the PMF2 and EPA PMF showed good agreement. The results from the new EPA PMF version applying rotational functions were consistent with those of PMF2. Therefore, the updated version of EPA PMF with rotational capabilities will provide more reasonable solutions compared with those of PMF2 and can be more widely applied to air quality management.


INTRODUCTION
To manage ambient air quality and establish effective emission reduction strategies, it is necessary to identify sources and to apportion the ambient particulate matter (PM) mass. Quantitative and qualitative source analyses are needed to facilitate control policies to reduce ambient air pollutants. To this end, receptor models have been developed to analyze various characteristics of the pollutants at the receptor site and to estimate the source contributions. Receptor modeling is based on a mathematical model that analyzes the physicochemical properties of gaseous and/or particulate pollutants at various atmospheric receptors. Among the multivariate statistical receptor models used for PM source identification and apportionment, the positive matrix factorization (PMF) model was developed to provide a multivariate receptor modeling approach based on explicit least-squares techniques (Paatero, 1997). Subsequently, a more flexible multivariate analysis tool, the multilinear engine (ME), was developed to solve a variety of multilinear problems (Paatero, 1999). ME has already been applied in several studies because of its flexibility (Buset et al., 2006; Ogulei et al., 2005; Zhou et al., 2004; Yli-Tuomi et al., 2003).
PMF has been implemented using two different algorithms: PMF2 (or PMF3) and the multilinear engine (ME). These programs have been successfully applied to assess ambient PM source contributions in many locations. However, PMF2 and ME have some operational limitations; the models are somewhat difficult to apply (not user friendly) because they are DOS programs that require understanding of a specific script language. Therefore, in order to provide a widely applicable, user-friendly PMF program with a graphical user interface (GUI), the US Environmental Protection Agency (EPA) developed an EPA version of PMF (Hopke et al., 2006; Eberly, 2005). However, the initial version of the EPA PMF model (version 1.1) did not provide any rotational functions (such as FPEAK, Fkey, and Gkey) as implemented in PMF2. In PMF analysis, factor rotations often provide more physically reasonable solutions. Recently, Kim and Hopke (2007) compared the results of PMF2 with those of EPA PMF V1.1 on speciation trends network data from a site in Chicago and found differences in the two solutions, primarily for some of the minor sources. For this reason, the US EPA decided to upgrade EPA PMF to include rotational functions and released EPA PMF ver. 2.0.
The objective of this study was to compare the mass contributions and chemical composition of sources of PM2.5 at a Washington, DC IMPROVE site. PMF2 and the newer version of EPA PMF were applied to identify the sources, to apportion the PM2.5 mass to each source and to compare PMF2 analysis results with those of EPA PMF analysis. If the two model solutions show no differences, then the results of the source profile and source apportionment should be similar.

1 Sample Collection and Analytical Methods
PM2.5 samples were collected at an IMPROVE site in Washington, DC (latitude 38.8762, longitude -77.0344, 15.3 m above mean sea level), a city with a population of approximately 550,000, as shown in Fig. 1. This site is an urban area located near the Potomac River, about 2 km southeast of the Lincoln Memorial and 3 km northeast of Ronald Reagan Washington National Airport. There are highways and local roads adjacent to the sampling site.
Samples were collected every Wednesday and Saturday. Integrated 24-hr samples were collected on Teflon, nylon and quartz filters with an IMPROVE sampler. The Teflon filters were used for the analysis of mass concentrations and for elemental analysis by particle-induced X-ray emission (PIXE) for Na to Mn, X-ray fluorescence (XRF) for Fe to Pb and proton elastic scattering analysis (PESA) for elemental hydrogen concentration. The nylon filters were used for the analysis of anions (SO₄²⁻, NO₃⁻ and Cl⁻) by ion chromatography (IC). The quartz filters were used for organic carbon (OC) and elemental carbon (EC) according to the IMPROVE/thermal optical reflectance (TOR) method. Total carbon was separated into eight temperature-resolved carbon fractions according to temperature and oxidation atmosphere (Hwang and Hopke, 2007; Kim and Hopke, 2004; Chow et al., 2001). A total of 718 samples were obtained from August 1988 to December 1997 for the Washington, DC IMPROVE site, and 35 species (OC1, OC2, OC3, OC4, OP, EC1, EC2, EC3, SO₄²⁻, NO₃⁻, Al, As, Br, Ca, Cl, Cr, Cu, Fe, H, K, Mg, Mn, Mo, Na, Ni, P, Pb, Rb, Se, Si, Sr, Ti, V, Zn and Zr) were selected for PMF and EPA PMF modeling. To prevent double counting of PM2.5 mass concentrations, several measured species were excluded from PMF and EPA PMF modeling. For example, XRF sulfur (S) and IC SO₄²⁻ showed an excellent correlation (r = 0.96); thus, sulfur (S) was excluded from the modeling. Also, Cl⁻ was excluded to prevent double counting of the mass concentrations of Cl and Cl⁻. Because NO₂⁻ and NH₄⁺ had many missing values, these species were not included in source apportionment modeling. The detailed analytical methods and data analysis were reported in a previous study (Kim and Hopke, 2004).
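The screening of redundant species by correlation, as described above for XRF sulfur and IC sulfate, can be sketched as follows. This is a minimal illustration only: the concentration values and the 0.9 threshold are hypothetical and were not taken from the study, and the function name is an assumption made for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical concentrations (ug/m3), invented for illustration only.
sulfur_xrf = [1.0, 2.1, 2.9, 4.2, 5.0]
sulfate_ic = [3.1, 6.0, 9.2, 12.1, 15.3]

r = pearson_r(sulfur_xrf, sulfate_ic)

# Hypothetical screening rule (threshold not from the study): if the two
# measurements carry nearly the same information, keep only one of them
# so that the same mass is not counted twice.
drop_sulfur = r > 0.9
```

When two measurements of essentially the same mass are this strongly correlated, retaining both would count that mass twice in the apportionment, which is the reason given above for excluding S.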

2 Data Analysis
PMF2 was applied to the data as in Kim and Hopke (2004), and the EPA PMF was applied using the ME-2 program. In order to identify an algorithm to solve the more general sums-of-products problem, a tool with a more flexible approach for the fitting of multilinear models (ME) was developed by Paatero (1999). ME-2 has been used in many prior source apportionment studies in the Arctic, Phoenix, Seattle, a Pittsburgh supersite, a Baltimore supersite and Toronto (Buset et al., 2006; Ogulei et al., 2005; Kim et al., 2004a; Zhou et al., 2004; Ramadan et al., 2003; Xie et al., 1999) and can be used to solve multilinear and quasi-multilinear problems. Both PMF2 and ME-2 include non-negativity constraints on the factors in order to decrease rotational freedom. The ME-2 model consists of two parts: a script language for specifying the structure of the model and an iterative part for fitting the model. This model uses a conjugate gradient algorithm to solve the receptor model problem by minimizing the sum of squares (Q) to provide a weighted non-negative least-squares solution to the problem based upon the measured concentrations and corresponding uncertainties (Ogulei et al., 2005; Chueinta et al., 2004).
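As a minimal sketch of the quantity being minimized, the following computes Q for a given pair of factor matrices, assuming the usual bilinear receptor model X ≈ GF with Q defined as the sum over all samples i and species j of the squared residual divided by the squared uncertainty. The function name and the toy matrices are illustrative assumptions; the sketch only evaluates Q and does not reproduce the conjugate gradient fitting performed by ME-2.

```python
def weighted_q(X, G, F, S):
    """Sum of squared, uncertainty-scaled residuals Q for the model X ~ G F.

    X: n x m measured concentrations
    G: n x p source contributions
    F: p x m source profiles
    S: n x m measurement uncertainties (all > 0)
    """
    n, m, p = len(X), len(X[0]), len(F)
    q = 0.0
    for i in range(n):
        for j in range(m):
            fitted = sum(G[i][k] * F[k][j] for k in range(p))
            q += ((X[i][j] - fitted) / S[i][j]) ** 2
    return q

# Toy data (2 samples, 3 species, 2 factors), invented for illustration.
G = [[1.0, 0.0], [0.0, 2.0]]
F = [[0.5, 0.5, 0.0], [0.0, 0.25, 0.75]]
X = [[0.5, 0.6, 0.0], [0.0, 0.5, 1.3]]
S = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]

q = weighted_q(X, G, F, S)  # two nonzero residuals: 0.1 and -0.2
```

Because each residual is divided by its own uncertainty, poorly measured values contribute less to Q than precisely measured ones.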
The ME-2 approach is operated by a script file, a specialized and complex file (ini file) written in the FORTRAN language (Paatero, 2007); the latest version of the script file was used in this study for EPA PMF modeling. When using the script file, some parameters (dimensions of the input data set, number of factors, main file name of input data, FPEAK value) were modified. The ME-2 script consists of five sections: defines, equations, preproc, postproc, and callback. The first section of the script file consists of definitions and default values of the control variables, the settings of many of which are overridden by the "iniparams" text file. The second section, the equations section, contains definitions of the main equations for fitting the X and G (source contribution) matrices and the G and F pulling equations. The preproc section specifies the initial values of the factor matrices and performs data preprocessing. The postproc section relates the computed results to the user-defined output file formats. The ME-2 iteration is performed between the preproc and postproc sections. The final section, callback, is activated when the modeling is complete.
Because the script is used without a GUI, it is necessary to manually set all of the control variables using the iniparams text file. In this study, the iniparams file was modified to fit the data set. Specifically, the control variable of FPEAK rotations (dofpeak) was set to 1, indicating that the controls were forced rotations similar to those in PMF2 (Paatero, 2007a). Also, in order to obtain reasonable source profiles, it is possible to use the Fkey matrix in ME-2 analysis by means of another specific text file (moreparams.txt). If modeling does not involve a continuation run, this control file has no effect on F matrix pulling. This file contains two parts: a user-defined G and F pulling equation part and user-defined G and F pulling control values. In this study, the F pulling control values were used to produce a reasonable source profile compared with that of the PMF2.

1 Source Apportionments
The optimal number of sources was determined to be ten based on a previous Washington, DC source apportionment study (Kim and Hopke, 2004). The parameter FPEAK was used to control rotations. A nonzero value of FPEAK forced the PMF2 to add one g vector to another and to subtract the corresponding f factors from each other, thereby yielding more physically realistic solutions (Hopke, 2000). If specific species in the source profiles do not seem to be realistic compared with the measured source profiles, and if the values obtained in the previous analysis are similar, it is possible to adjust the values toward zero in order to obtain a reasonable source profile using the Fkey matrix. These details were reported in previous studies (Qin and Oduyemi, 2003; Lee et al., 1999).
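The algebra behind such a rotation can be sketched as follows: adding a multiple of one column of G to another, while subtracting the same multiple of the corresponding row of F, leaves the product GF unchanged. This is an illustration of the rotational ambiguity only, under the assumption of a single elementary transformation with a hypothetical strength r; it does not reproduce how PMF2 applies FPEAK during the fit, and all names and numbers are invented for this example.

```python
def matmul(A, B):
    """Plain-Python matrix product of A (n x p) and B (p x m)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def elementary_rotation(G, F, src, dst, r):
    """Add r * (column src of G) to column dst of G, and subtract
    r * (row dst of F) from row src of F. The product G F is preserved."""
    if src == dst:
        raise ValueError("src and dst must differ")
    G_rot = [row[:] for row in G]
    F_rot = [row[:] for row in F]
    for row in G_rot:
        row[dst] += r * row[src]
    for j in range(len(F_rot[0])):
        F_rot[src][j] -= r * F[dst][j]
    return G_rot, F_rot

def is_non_negative(M):
    """True if every element of M is >= 0."""
    return all(v >= 0.0 for row in M for v in row)

# Toy factors (2 samples, 2 factors, 2 species), invented for illustration.
G = [[1.0, 2.0], [3.0, 1.0]]
F = [[0.6, 0.4], [0.2, 0.8]]

G_rot, F_rot = elementary_rotation(G, F, src=0, dst=1, r=0.1)
same_fit = all(abs(a - b) < 1e-12
               for ra, rb in zip(matmul(G, F), matmul(G_rot, F_rot))
               for a, b in zip(ra, rb))
allowed = is_non_negative(G_rot) and is_non_negative(F_rot)
```

A rotation that drives any element negative would violate the non-negativity constraints, which is why those constraints reduce, but do not remove, rotational freedom.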
The role of the Fkey is different in the PMF2 and EPA PMF models. In EPA PMF ver. 2.0, the corresponding Fkey notation applies to both F and G factor elements, and the values do not control the pulling of all elements, which is instead achieved by separately defined auxiliary equations. Thus, EPA PMF Fkey values more closely control factor elements than do those in PMF2. The new version of the EPA PMF can specify whether the element is unconstrained, has a lower limit or both a lower and an upper limit, or is fixed to its initial value or to the value of zero (each code value is 1, 0, -1, -5 or -6, respectively) (Paatero and Hopke, 2009).
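The five code values listed above can be summarized in a small lookup, sketched here under stated assumptions. The enum names, the function, and the simple clipping behavior are hypothetical illustrations of what each code means for a single factor element; in ME-2 the constraints are handled inside the fitting procedure, not by clipping after the fact.

```python
from enum import IntEnum

class ElementCode(IntEnum):
    """Code values for factor elements as listed in the text (names are illustrative)."""
    UNCONSTRAINED = 1
    LOWER_LIMIT = 0
    LOWER_AND_UPPER_LIMIT = -1
    FIXED_TO_INITIAL = -5
    FIXED_TO_ZERO = -6

def constrain(value, code, initial=0.0, lower=0.0, upper=None):
    """Return the value an element may take under the given code (illustration only)."""
    code = ElementCode(code)  # raises ValueError for an unknown code
    if code is ElementCode.UNCONSTRAINED:
        return value
    if code is ElementCode.LOWER_LIMIT:
        return max(value, lower)
    if code is ElementCode.LOWER_AND_UPPER_LIMIT:
        if upper is None:
            raise ValueError("an upper limit is required for code -1")
        return min(max(value, lower), upper)
    if code is ElementCode.FIXED_TO_INITIAL:
        return initial
    return 0.0  # FIXED_TO_ZERO

# Hypothetical element values, invented for illustration.
bounded = constrain(0.9, -1, upper=0.34)
floored = constrain(-0.3, 0)
```

With code 0 and a lower limit of zero, an element is simply kept non-negative; code -1 additionally caps it at a user-supplied upper limit.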
For both PMF2 and EPA PMF modeling, an FPEAK value of 0.1 was selected because it provided more physically realistic solutions and allowed PMF2 modeling to be compared with EPA PMF modeling under the same conditions. In the PMF2 modeling, values of all elements in the Fkey matrix were set to zero, except for the values of 2 and 3 for SO₄²⁻ in secondary nitrate and airborne soil and values of 1, 4 and 4 for NO₃⁻ in secondary nitrate, aged sea salt and oil combustion, respectively. Also, in the EPA PMF model, code values of all elements were set to zero (lower limits of all elements are 0) in the control information script except for those of SO₄²⁻ in secondary nitrate and airborne soil and for NO₃⁻ in secondary nitrate, aged sea salt and oil combustion (code values of these elements were set to -1). Therefore, specific upper limits were set only for these elements. In order to fit the fractions of these elements in the EPA PMF source profile to fractions in the PMF2 source profile, values of all elements in the Fkey matrix were set to zero, except for values of 0.34 and 0.0085 for SO₄²⁻ in secondary nitrate and airborne soil and values of 1.2291, 0.0136 and 0.0054 for NO₃⁻ in secondary nitrate, aged sea salt and oil combustion, respectively (these values are desired actual unit fractions). Fig. 2 presents the comparison of ten-factor source profiles resolved by PMF2 (black bar) and the EPA PMF model (gray bar). Figs. 3 and 4 present the temporal variations in the contributions from each source according to PMF2 and EPA PMF, respectively.
PMF2 and EPA PMF identified three different secondary sulfate sources with high contributions of SO₄²⁻: secondary sulfates I, II and III. The first source was classified as secondary sulfate I, with high abundances of SO₄²⁻ and strong seasonal variation, with high contributions in the summer. Song et al. (2001) associated this pattern with SO₂ emissions from coal-fired power plants. The third source was determined to be secondary sulfate II. The major species contributing to this source included SO₄²⁻ and OC, especially OP. Carbon and tracer elements were associated with secondary SO₄²⁻ aerosol. This association was consistent with previous studies that observed a similar pattern of source profiles in Atlanta (Kim et al., 2004; Liu et al., 2003). A profile with high abundances of OP and sulfate has been reported in previous IMPROVE studies at Mammoth Cave National Park and Great Smoky Mountains National Park (Kim and Hopke, 2006; Zhao and Hopke, 2006). It was suggested that the OP-rich secondary sulfate aerosols might partially represent the result of heterogeneous acid-catalyzed reactions between acidic sulfate and gaseous organic compounds to form additional secondary organic aerosols. The fifth source was secondary sulfate III, in which SO₄²⁻ was the major species, along with minor species such as Se. Based on the seasonal differences in the Se and SO₄²⁻ contributions, PMF2 and EPA PMF distinguished secondary sulfate III, which showed higher contributions in winter. It is suggested that these SO₄²⁻-rich secondary sulfate aerosols containing Se may have originated from coal-fired power plants in winter (Begum et al., 2005; Kim and Hopke, 2004; Poirot et al., 2001).
The species contributing to the second source included OC1, OC2, OC3, OC4, EC1 and NO₃⁻. This source profile was identified as vehicle gasoline. The tenth source was interpreted as representing diesel emissions, of which EC1, EC2, OC2 and OC3 were major species, along with minor species such as Fe, Ca, Si and Zn. Vehicle gasoline was characterized by high fractions of OC, with a lower value of EC. In contrast, EC fractions in diesel emissions were higher than the OC fractions in gasoline emissions. In the case of diesel emissions, Zn and Ca are motor oil additives (Ålander et al., 2005), and Si and Fe are released from brake pads (Wåhlin et al., 2006); Fe may also be caused by muffler ablation.
The major marker species contributing to the fourth source included NO₃⁻ and SO₄²⁻, and this profile was classified as secondary nitrate. The average mass contributions of secondary nitrate are maximal in winter because lower temperatures and higher humidity support the formation of secondary nitrate aerosol (Seinfeld and Pandis, 1998).
The sixth source was classified as an incinerator product with high abundances of OC1, OC4, EC1, K, Pb, Si and Zn. This is not a surprising finding since several incinerators are located northeast of Washington, DC. Figs. 3 and 4 showed that the incinerator contribu-
The species associated with the seventh source included Na, SO₄²⁻ and NO₃⁻, and this profile was classified as aged sea salt. Although the main species in sea salt are known to be Na, Cl, SO₄²⁻, K and Ca (Hopke, 1985), only Na showed a high contribution in ... (Hwang and Hopke, 2006). The higher contributions of the aged sea salt are presumably from the Atlantic Ocean.
The major species contributing to the eighth source included Si, Al, Fe, Ca, K, Ti and SO₄²⁻, and this factor was assigned to airborne soil. The temporal variation in the source contribution plot shows very strong contributions on July 7, 1993 (Figs. 3 and 4). Examination of the back trajectories for this date suggests that an air mass was transported from the Sahara Desert (Kim and Hopke, 2004).
The ninth source was identified as oil combustion with high contributions of SO₄²⁻, NO₃⁻, OC1, OC4, EC1, Ni and V. As shown in the source profile plots (Fig. 2), this source has a higher fraction of EC1 than OC fractions. This source likely originated from residual oil combustion at plants and industries.

2 Comparison of the PMF2 and EPA PMF Results
Ten sources were identified in both of the models: secondary sulfate I, vehicle gasoline, secondary sulfate II, secondary nitrate, secondary sulfate III, incinerator particulates, aged sea salt, airborne soil particles, oil combustion products and diesel emissions. All of the source profiles showed good agreement between the PMF2 and EPA PMF models (Fig. 2). Therefore, the new version of EPA PMF including rotational capabilities produced essentially the same results as did the PMF2.
Fig. 5 shows a comparison of contributions according to PMF2 and EPA PMF for each source, illustrating that all calculated source contributions show very good agreement (all R² values were 1.00). In Fig. 6, a comparison of the predicted PM2.5 contributions from all sources with measured PM2.5 concentrations shows that PMF- and EPA PMF-resolved sources effectively reproduce the measured values and account for most of the variation in the PM2.5 concentrations (PMF2:

CONCLUSIONS
In this study, data from PM2.5 samples collected at the Washington, DC IMPROVE site were analyzed using both PMF2 and the new version of EPA PMF script to compare the results. Ten sources were identified and apportioned by PM2.5 mass from the data for each modeling analysis including secondary sulfate I, vehicle gasoline, secondary sulfate II, secondary nitrate, secondary sulfate III, incinerator emissions, aged sea salt, airborne soil particles, oil combustion and diesel emissions. The results of source profile comparison showed good agreement between PMF2 and EPA PMF analyses. Also, comparison of contributions calculated by PMF2 and EPA PMF for each source showed excellent agreement (the R² values were 1.00). Therefore, the updated version of EPA PMF (ver. 2.0) including rotational capabilities will provide more reasonable solutions compared with those of PMF2. The development and distribution of the new EPA PMF model will allow for readily accessible and easy to use source apportionment. Thus, the updated model can be more widely applied to air quality management problems.

Fig. 1. Location of the Washington, DC IMPROVE site.

Fig. 2. Comparison of source profiles resolved by PMF2 (black bar) and EPA PMF (gray bar) for the Washington, DC site.

Fig. 3. Temporal variation of source contributions for the Washington, DC site according to the PMF2 model.

Fig. 4. Temporal variation of source contributions for the Washington, DC site according to the EPA PMF model.

Fig. 5. Comparison of source contributions of each source between the PMF2 and EPA PMF models.

Table 1. Comparison of average seasonal source contributions (μg/m³) for each source at a Washington, DC site according to the PMF2 and EPA PMF models.