Last update: 12 Jul 2021

*** PLEASE READ THIS README BEFORE RUNNING SMOKE. ***

This package contains files relevant to the 2017 emissions modeling platform. 
It includes scripts and executables for processing the 2017gb_17j emissions case 
through SMOKE. Note: some updates are being worked on during summer 2017, so the
package may be updated later once these are complete.

On 9 Feb 2021, the point sector inventories (airports, ptegu, pt_oilgas, ptnonipm)
were updated to the latest version of 2017NEI from June 2020. The airports inventory
from this version of NEI, which includes major corrections to emissions values,
was already included in the previous version of this package posted in July 2020. 
The other point sector inventories have only minor updates.

Mexico nonpoint, nonroad, and point emissions inventories are not provided at this time. 
If you need these Mexico inventories, please contact Alison Eyth (eyth.alison@epa.gov).

This package is primarily based on the 2017gb emissions case prepared by EPA, but there are 
differences between this package and the EPA emissions case:
- This package includes a newer airports inventory from 2017 NEI than was used in the 
  original 2017gb emissions case. This newer version, which was not available until after 
  the 2017gb emissions case was completed in May 2020, corrects an overestimation of 
  emissions from airports.
- Other point sector inventories (ptegu, ptnonipm, pt_oilgas) were updated on 9 Feb 2021
  to reflect a slightly newer version of the NEI with minor updates.
- This package includes average speed distributions (SPDIST) from 2017. The original 2017gb 
  emissions case used average speed distributions from 2016. This is an input to 
  SMOKE-MOVES and affects the "RPD" category of onroad emissions. 

For onroad and onroad_ca_adj, ancillary files for onroad are included in this package 
as well as activity datasets and scripts. Emission factor tables for onroad are provided 
on FTP with separate zips for each state. These tables do NOT include a NOx humidity 
adjustment, and instead that adjustment is applied in SMOKE-MOVES using hourly 
meteorology. The NOx humidity adjustment only impacts RPD and RPH.

This package includes precompiled SMOKE 4.7. The Temporal executable which is part of 
SMOKE 4.7 does not work with the CMV sectors. Temporal from SMOKE 4.6 does work with 
CMV and is compatible with other intermediate files generated using SMOKE 4.7. 
This package includes two precompiled Temporal executables: "temporal46" and "temporal47". 
The executable "temporal" is a link, currently pointing to the 4.6 version of Temporal. 
It is valid to use the SMOKE 4.6 version of Temporal in combination with other SMOKE 
programs from SMOKE 4.7 for all sectors, not just CMV.

== 1. Introduction ==

These packages contain scripts, inventories, and ancillary files related to EPA's 
2017 platform for air quality modeling. 
This package includes inputs and scripts for model year 2017. 
The basis of this platform is the 2017 National Emissions Inventory (2017NEI).
The emissions modeling inputs are broken out into several separate zip files as outlined 
here.

The year-specific emissions inventory files are here: 
ftp://newftp.epa.gov/air/emismod/2017/2017emissions/

The ancillary data for all years are here:
ftp://newftp.epa.gov/air/emismod/2017/ancillary_data/

CMAQ model-ready emissions generated using these packages with SMOKE v4.7
should be identical to those used in EPA's 2017 platform, with some 
exceptions as noted in this README. 

There may be additional emissions differences resulting from differences 
in the Linux operating system, hardware platform, and other system-specific 
differences.  

The data files are divided into several directories: 

- 2017emissions/ contains emission inventories for the year 2017,
  including national CEMS emissions, nonroad inventories and onroad
  activity data, and point and nonpoint inventories.

- ancillary_inputs contains general ancillary files (ge_dat),
  including those related to speciation, spatial allocation (gridding),
  temporalization, and gridded ocean chlorine emissions files, 
  which are included in the final model-ready emissions.

- smoke_2017gb_platform_core_29jun2020.zip contains scripts to run the core SMOKE 
  programs along with precompiled SMOKE executables from SMOKE version 4.7.

- spatial_surrogates contains 12km spatial surrogates for the US, Canada, and Mexico.

Section 4 of this README includes information about the modeling 
sectors used in the platform.

Section 5 of this README includes information about the inventories provided.

Section 6 of this README includes information about the ancillary
(non-inventory) files included.  

== 2. Requirements for processing emissions for air quality modeling ==

If you are only reviewing inventories and not developing emissions
for air quality modeling, you do not need to install SMOKE or to follow
the instructions below.  Instead, unzip the files with the data of interest
and examine those and the corresponding reports that are provided.

If you plan to develop emissions for air quality modeling, SMOKE v4.7
is REQUIRED to process this case. Most sectors can be processed with SMOKE 4.6,
but onroad processing with SMOKE-MOVES uses new features exclusive to SMOKE 4.7.

The smoke_2017gb_platform_core.zip package includes precompiled executables 
of SMOKE v4.7 programs. These executables reflect SMOKE v4.7 as of 27 Sep 2019.

Python:
Python version 3.0 or later is strongly recommended, along with select Python
libraries. Many of the helper scripts included in this package use Python. The
Python scripts within this package reference '#!/usr/bin/env python'; you may
need to change this on your computing platform.
All of the scripts have been updated for Python 3.5 to match the configuration 
on EPA's cluster. You should still be able to use Python 2.7 if you install the 
"future" modules. In most cases that can be done by running: sudo pip install future

== 3. Installation of data files and scripts ==

This readme covers the installation of the SMOKE inventories, scripts,
and ancillary files used for the 2017 platform.

Choose an install directory on your system; we will refer to this directory
as "INSTALL_DIR". To review/reproduce emissions for all sectors, unzip 
all the .zip files into INSTALL_DIR. The packages have subdirectories 
embedded within them, so it is important that all files be unpacked in the 
same place in order for the scripts to know where to find the inputs. 
If you are only interested in reproducing or examining emissions for specific 
sectors, you may download and unzip only the data for those sectors from the 
inventory directories, but for SMOKE processing you should also include the 
files in the ancillary_inputs and spatial_surrogates directories, and also the
contents of smoke_2017gb_platform_core.zip. Precompiled SMOKE executables and 
I/O API utilities are available in the SMOKE zip.
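
As a rough illustration of the unpacking step described above, the following Python
sketch extracts every .zip file found under a download directory into one install
directory. The function name unpack_all and both directory arguments are hypothetical
and are not part of this package; the sketch assumes the zips have already been
downloaded.

```python
import zipfile
from pathlib import Path


def unpack_all(download_dir, install_dir):
    """Extract every .zip under download_dir into install_dir.

    Both arguments are illustrative; set them to match your system.
    Returns the zip file names in the order they were extracted.
    """
    install_path = Path(install_dir)
    install_path.mkdir(parents=True, exist_ok=True)
    extracted = []
    # rglob also finds zips kept in subdirectories such as 2017emissions/
    for zip_path in sorted(Path(download_dir).rglob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            # Everything goes to the same place so embedded subdirectories line up
            archive.extractall(install_path)
        extracted.append(zip_path.name)
    return extracted
```

Keeping the download directory separate from the install directory avoids picking up
any zips that happen to be nested inside the extracted packages.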

All SMOKE inventories, scripts, and ancillary files are provided, except emission 
factor tables for onroad processing via SMOKE-MOVES.
The full set of emission factor tables is not always permanently stored on the 
FTP server due to their large size; if they are not posted, they are furnished 
upon request.
As of 29 Jun 2020, Mexico nonpoint, nonroad, and point inventories are not posted.

Prior to running SMOKE, you will need to edit the INSTALL_DIR (your install 
directory) and MET_ROOT (location of MCIP meteorology data) environment 
variable definitions in the "directory_definitions.csh" script located in each 
CASE/scripts directory.  This script is sourced by each of the individual run 
scripts for each sector.  Regarding the MCIP data, SMOKE only uses the 
GRIDCRO2D, METCRO2D, and METCRO3D files. If you are computing plume rise within
SMOKE using the Laypoint program, SMOKE also needs the METDOT3D file.
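
Before launching a run, it can help to confirm that MET_ROOT holds the MCIP file types
listed above. The sketch below assumes that each MCIP file name contains its type
string (for example "METCRO2D"); that naming convention and the function name
missing_mcip_types are assumptions, not something this package defines.

```python
import os

REQUIRED_TYPES = ("GRIDCRO2D", "METCRO2D", "METCRO3D")
LAYPOINT_TYPES = ("METDOT3D",)  # only needed when Laypoint computes plume rise


def missing_mcip_types(met_root, use_laypoint=False):
    """Return the MCIP file types with no matching file name in met_root."""
    names = os.listdir(met_root)
    wanted = REQUIRED_TYPES + (LAYPOINT_TYPES if use_laypoint else ())
    # A type counts as present if any file name contains the type string
    return [t for t in wanted if not any(t in name for name in names)]
```

An empty list means every needed file type was found at least once; it does not
check that all dates in the modeling period are covered.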

MCIP meteorology data, which is used for the onroad, onroad_ca_adj, dust 
(afdust/othafdust/othptdust), and biogenics (beis) sectors, is not included 
in the package.

== 4. Case description and instructions for each sector ==

Scripts are provided to process emissions for a 12km national grid (12US1) 
using CB6R3AE7 speciation for CMAQ versions 5.2 and 5.3. The CB6R3AE7 mechanism is
an update to the CB6 mechanisms used in 2016 and earlier emissions platforms,
and includes several new species: AACD, FACD, APIN, and IVOC. Onroad emissions in this
platform do not include these new species because MOVES was run with the previous
version of the CB6 mechanism.

The scripts can be adapted to run for other grids. First, in the 
directory_definitions.csh script, edit REGION_ABBREV and REGION_IOAPI_GRIDNAME.
You also may need to change other inputs in each individual sector script, 
such as spatial surrogates (SRGPRO / SRGDESC), transportable fractions (XPORTFRAC),
gridded meteorology for onroad (METMOVES), ocean chlorine (for the sector merge),
and biogenic land use and seasons (BELD4 and BIOSEASON).
A sample directory_definitions script for 12US2 is included in this package.
If you are changing the grid to 12US2, you do not need to change the spatial
surrogates, since 12US2 is a subset of 12US1. You do need to window the
other grid-specific input files, however.

The emissions from this package will work with base CMAQ v5.2 but not the 
Multi-Pollutant version used for the National Air Toxics Assessment (NATA).
CMAQ versions 5.1 and later include support for the CB6 mechanism. These 
emissions can also be converted to CB6-for-CAMx speciation when converting to 
CAMx format.

Emissions for the 2017 platform have only been processed for CMAQ modeling at this 
time. The 2016 v1 platform package includes support for CAMx modeling, and that may
be added to this package in the future.

Emissions processing is split into "sectors". Each sector has its own run 
scripts for processing, with one (or more) run scripts per case.
See Section 7 of this README for information about the run script zips.

All sectors are US-only unless otherwise noted. The sectors are:

AFDUST: Particulate emissions from fugitive dust sources. This sector is 
processed in two steps. The first (Annual_afdust_12US1_*) processes the annual 
inventory, and the second (Annual_afdust_adj_12US1*) applies adjustments - 
transportable fraction and meteorologically-based - and outputs the adjusted 
emissions under the sector name "afdust_adj". The afdust scripts must be 
run in that order.

AG: Agricultural emissions. This is mostly ammonia, but also includes other 
pollutants from agricultural sources.

AIRPORTS: Emissions from airports. These were part of the ptnonipm sector prior to
the 2016 v1 platform.
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first.
All emissions in this sector are low-level only (no inline files).

BEIS: Biogenic emissions generated using the BEIS model.
This package includes a pre-processed B3GRD file for the 12US1 domain from the 
SMOKE/BEIS program Normbeis. The B3GRD file is included in the ge_dat_for_2017gb_beis 
package and extracted to the 2017gb_17j/intermed/beis/ directory. The run scripts 
in this package are set up to use this file directly instead of running Normbeis. 
This behavior can be changed, and Normbeis can be rerun, by editing the 
run_settings.txt file in the scripts directory and removing or commenting out the 
"normbeis" line.

Rerunning Normbeis with the BELD land use provided in this package will NOT produce 
a B3GRD which is identical to the one in this package, however. On EPA systems, 
Normbeis was run at 1km resolution, and then the resulting 1km resolution B3GRD was 
aggregated to 12US1.  To replicate EPA's results, one should use the B3GRD provided 
in this package and not rerun Normbeis. 

CMV_C1C2: Emissions from C1 and C2 commercial marine sources, including ports 
and navigable waterways. Includes C1/C2 marine emissions in the entire domain,
including US, Canada, Mexico, and all bodies of water which lie outside the 
boundaries of those countries. In previous platforms, Canadian and Mexican CMV 
emissions were included in the othar and othpt sectors, but now all CMV domain-wide 
is in the cmv_c1c2 and cmv_c3 sectors.  This is a 'point' sector, and like all 
'point' sectors, is processed via two scripts: the 'onetime' script, and the 
'daily' script. The 'onetime' script must be run first.  Inventories for this sector 
are grid-specific and designed for the 12US1 grid, or 12km grids which are a subset 
of 12US1 (e.g. 12US2).  Therefore, emissions are output under the sector name 
"cmv_c1c2_12".

CMV_C3: Emissions from C3 commercial marine sources, including ports 
and navigable waterways. Includes C3 marine emissions in the entire domain,
including US, Canada, Mexico, and all bodies of water which lie outside the 
boundaries of those countries. In previous platforms, Canadian and Mexican CMV 
emissions were included in the othar and othpt sectors, but now all CMV domain-wide 
is in the cmv_c1c2 and cmv_c3 sectors.
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first.  Inventories for this sector are grid-specific and designed for 
the 12US1 grid, or 12km grids which are a subset of 12US1 (e.g. 12US2).
Therefore, emissions are output under the sector name "cmv_c3_12".

NONPT: Area source emissions not included in other sectors.

NONROAD: Off highway mobile source emissions.

NP_OILGAS: Area source oil and gas emissions.

ONROAD: On highway mobile source emissions, excluding California. 
This sector is processed using SMOKE-MOVES with multiple scripts as described 
in section 4B.

ONROAD_CA_ADJ: On highway mobile source emissions, California only. 
This sector is processed using SMOKE-MOVES with multiple scripts as described 
in section 4B.

OTHAFDUST: Particulate emissions from fugitive dust sources in Canada. Just 
like with afdust, this sector is processed in two steps. The first 
(Annual_othafdust_12US1_*) processes the annual inventory, and the second 
(Annual_othafdust_adj_12US1*) applies adjustments - transportable fraction 
and meteorologically-based - and outputs the adjusted emissions under the 
sector name "othafdust_adj". The othafdust scripts must be run in that order. 
Fugitive dust emissions in Mexico are included in the othar sector and do 
not need the same transportable fraction and meteorological adjustments 
that the Canada fugitive dust emissions in othafdust do.

OTHPTDUST: Point source particulate emissions from fugitive dust sources in
Canada. In the new 2015 Canadian inventory, dust emissions are in area source
format for some sources (othafdust sector) and point source format for other
sources (othptdust sector). This is a new sector starting with the beta platform.
This is a 'point' sector with additional adjustments, and is processed via
THREE scripts: the 'onetime' script, the 'daily' script, and then the adjust
script (othptdust_adj), in that order.
All emissions in this sector are low-level only (no inline files).

OTHAR: Area source emissions from Canada and Mexico, including mobile nonroad.
As of 29 Jun 2020, Mexico area source inventories are not included. For now,
this sector only includes Canadian emissions.

ONROAD_CAN: Mobile onroad source emissions from Canada.

ONROAD_MEX: Mobile onroad source emissions from Mexico.
The onroad Mexico emissions inventory includes pre-speciated VOC emissions for 
an older CB6 mechanism, so there is an extra script for this sector to 
convert those emissions to the CB6 mechanism needed for CMAQ. This extra 
script is called *_part2_combine.csh and uses the combine utility to perform 
the CB6 conversion. The combine program is included, pre-compiled, in the SMOKE 
package along with pre-compiled SMOKE executables and I/O API utilities.
To help make the distinction between versions of CB6, the older CB6 emissions
use the sector name "onroad_mex_cb6orig". The part2_combine step creates emissions 
files with the final sector name "onroad_mex".

OTHPT: Point source emissions from Canada and Mexico.
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first.
All emissions in this sector are elevated (no low-level contribution).
As of 29 Jun 2020, Mexico point source inventories are not included. For now,
this sector only includes Canadian emissions.

PTAGFIRE: Point source agricultural burning emissions. The ptagfire sector uses 
a daily point source inventory.
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first.
All emissions in this sector are elevated (no low-level contribution).

PTEGU: Electric generating unit emissions. This sector incorporates CEM 
(Continuous Emissions Monitoring) hourly emissions for a majority of sources.
This is a 'point' sector, and like all 'point' sectors, is processed via
a 'onetime' script first, followed by a 'daily' script. For ptegu there are 
two 'daily' scripts for different months of the year: 'summer' (May through 
September), and 'winter' (October through April). For sources without hourly 
CEM emissions, summer and winter use different hourly temporalization, and so 
they are run with separate inputs.  All emissions in this sector are elevated 
(no low-level contribution).
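
The month split for the two ptegu 'daily' scripts can be expressed as a small lookup.
This is only a sketch of the rule stated above; the function name ptegu_daily_script
is hypothetical, and the returned labels are descriptive rather than actual script
file names.

```python
SUMMER_MONTHS = range(5, 10)  # May through September


def ptegu_daily_script(month):
    """Return 'summer' or 'winter' for a month number (1-12)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # October through April fall through to the winter script
    return "summer" if month in SUMMER_MONTHS else "winter"
```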

PTNONIPM: Point source emissions from industrial activities.
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first. 
All emissions in this sector are elevated (no low-level contribution). This is a
change from some prior platforms where ptnonipm emissions were split between
low-level (gridded) and elevated (inline) outputs. 

PTFIRE: Point source emissions from year-specific controlled burning and 
wildfires.  Fires are processed in the 'inline' format for CMAQ, and are all 
elevated (no low-level contribution). 
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first.

PTFIRE_OTHNA: Point source emissions from year-specific controlled burning and 
wildfires in the rest of North America ('OTHNA' = OTHer North America), including 
Canada and Mexico.  In addition to Canada and Mexico, fire emissions for 
Central America and the Caribbean are also included. Emissions from those 
areas are ultimately not modeled due to being outside of the 12US1 modeling 
domain, but they are provided for possible use in larger grids.
These fires are processed in the 'inline' format for CMAQ, and are all 
elevated (no low-level contribution). 
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first.

PT_OILGAS: Point source oil and gas emissions, including emissions from 
offshore oil rigs in the Gulf of Mexico.
This is a 'point' sector, and like all 'point' sectors, is processed via
two scripts: the 'onetime' script, and the 'daily' script. The 'onetime' script
must be run first.
All emissions in this sector are elevated (no low-level contribution). This is a
change from some prior platforms where pt_oilgas emissions were split between
low-level (gridded) and elevated (inline) outputs. 

RAIL: Area source railway emissions.

RWC: Area source residential wood combustion emissions.

== 4B. Notes regarding onroad ==

Onroad emissions are processed using SMOKE-MOVES. The processing is split 
into multiple run scripts. In the beta platform, SMOKE-MOVES inputs were 
prepared using MOVES2014b.

As in recent platforms, speciation of VOC emissions is handled within the MOVES model.
This platform uses MOVES version MOVES2014b, which supports speciation for the CMAQ 5.2
version of CB6. This version of CB6 does not include the new species which are part
of CB6R3AE7, but those new species have only a minor impact on onroad emissions, and so
this platform package does not offer any additional scripts or procedures to include the 
new CB6R3AE7 species for onroad.

As described in the SMOKE online documentation, SMOKE-MOVES handles onroad 
emissions separately for four types of processes:
- On-network emissions (RatePerDistance, or RPD)
- Off-network emissions, fuel vapor venting (RatePerProfile, or RPP)
- Off-network emissions, extended idling (RatePerHour, or RPH)
- Off-network emissions, non-venting, non-extended idle (RatePerVehicle, or RPV)

For each of the two onroad sectors (onroad, onroad_ca_adj), 
there are separate run scripts for RPD, RPP, RPH, and RPV, plus a merge script 
that combines emissions from RPD, RPP, RPH, and RPV into a single emissions 
file per day.

These scripts may take a particularly long time to run, especially RPD.
Therefore, consideration should be given to running multiple RPD jobs in
parallel, such as one job per quarter.

Onroad has been split into two sectors - onroad and onroad_ca_adj - in order 
to match SMOKE-MOVES annual emission totals with those provided by 
California. To do this, we split California into a separate sector, and run 
SMOKE-MOVES with a control factor file (CFPRO) which nudges the emissions so 
that the annual totals post-SMOKE-MOVES equal those provided by CARB. 
CARB provided separate onroad emissions inventories for use in the beta
platform, and we match their provided emissions at the county/SCC level, 
except that the CARB inventory does not distinguish between different 
on-network road types (but does distinguish on-network emissions from 
off-network).

== 4B1. DAYS_PER_RUN ==

SMOKE-MOVES runs more efficiently when it processes more than one day at a time.
For example, Movesmrg can create one 7-day emissions file more quickly than it 
can create seven individual 1-day emissions files. To turn on this feature, 
use the DAYS_PER_RUN variable, set to the number of days you wish to run in 
a single Movesmrg instance. The recommended value for DAYS_PER_RUN is 7. 
The onroad scripts include a setting called "DAYS_PER_RUN", set to 1 as the 
default. 

If DAYS_PER_RUN > 1, after Movesmrg is run, the run scripts will use the I/O 
API utility m3xtract to split up the multi-day emissions file into single day 
(25-hour) emissions files.

Multi-day Movesmrg runs will never cross months. For example, if 
DAYS_PER_RUN = 7, then the last Movesmrg run of January will start on 
January 29th and end on January 31st (3 days), and the first Movesmrg run of 
February will start on February 1st and end on February 7th.
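
The month-boundary behavior can be sketched as follows. This assumes each month's
runs begin on the 1st and advance in DAYS_PER_RUN steps, which matches the
January/February example above; the function name movesmrg_runs is hypothetical,
and the actual run scripts may compute the windows differently.

```python
import calendar
from datetime import date


def movesmrg_runs(year, month, days_per_run):
    """Return (start_date, end_date) pairs for one month of Movesmrg runs."""
    if days_per_run < 1:
        raise ValueError("days_per_run must be at least 1")
    last_day = calendar.monthrange(year, month)[1]
    runs = []
    for start in range(1, last_day + 1, days_per_run):
        # Truncate the final run so it never crosses into the next month
        end = min(start + days_per_run - 1, last_day)
        runs.append((date(year, month, start), date(year, month, end)))
    return runs
```

For example, movesmrg_runs(2017, 1, 7) yields five runs, the last of which covers
January 29th through January 31st.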

Using the multi-day Movesmrg functionality requires multi-day MCIP files. 
For example, if DAYS_PER_RUN = 7, your METCRO2D files must also be 7 days 
(169 hours) long.  These multi-day MCIP files should be stored in 
${MET_ROOT}_Xday/, where X = DAYS_PER_RUN (i.e. _7day for DAYS_PER_RUN = 7).
For example, if the single day MCIP files are in /foo/foo/mcip_dir/, then 
7-day MCIP files should be in /foo/foo/mcip_dir_7day/.
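
The directory naming and file length can be derived from DAYS_PER_RUN as in the
sketch below. It generalizes the 7-day example (169 hours = 7 x 24 + 1) to other
values, which is an assumption; the function name multiday_met_settings is
hypothetical and is not defined by this package.

```python
def multiday_met_settings(met_root, days_per_run):
    """Return (directory, hours) expected for multi-day MCIP files."""
    if days_per_run < 1:
        raise ValueError("days_per_run must be at least 1")
    base = met_root.rstrip("/")
    hours = 24 * days_per_run + 1  # one extra hour closes the final day
    if days_per_run == 1:
        # Single-day files stay in the original MET_ROOT directory
        return base, hours
    return "{}_{}day".format(base, days_per_run), hours
```

For example, multiday_met_settings("/foo/foo/mcip_dir/", 7) returns
("/foo/foo/mcip_dir_7day", 169).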

The primary drawback to using this multi-day Movesmrg functionality is 
increased memory usage.

== 4C. Sector merge ==

After all sectors have been processed, the Sector_merge script merges
the low-level emissions from all sectors into a single CMAQ-ready emissions 
file per day.

Merged model-ready emissions will be output to:
INSTALL_DIR/$CASE/smoke_out/$CASE/$GRID/$SPC/

Inline emissions and stack_groups files will be output to the same directory, 
except in subdirectories by sector name (e.g. .../$SPC/ptnonipm/).
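
For scripting around the outputs, the directory layout above can be assembled as in
this sketch. The function name merged_output_dir is hypothetical, and the values
passed for the case, grid, and speciation arguments are placeholders to be replaced
with the settings used in your run scripts.

```python
import posixpath


def merged_output_dir(install_dir, case, grid, spc, sector=None):
    """Build the model-ready output directory, optionally for one sector."""
    # The case name appears twice in the layout: INSTALL_DIR/$CASE/smoke_out/$CASE/...
    base = posixpath.join(install_dir, case, "smoke_out", case, grid, spc)
    # Inline and stack_groups files live in a per-sector subdirectory
    return posixpath.join(base, sector) if sector else base
```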

By default, the sector merge scripts are configured to *exclude* biogenics and RWC. 
Some CMAQ modelers may wish to process biogenic emissions inline within CMAQ 
and not include biogenic emissions in the gridded emissions files. The newest 
versions of CMAQ also include features which require the RWC sector emissions to 
be passed into CMAQ separately. So, the filenames of merged emissions
now indicate whether beis and RWC are included. By default they are both omitted,
so the filenames say "nobeis_norwc". 

To run the sector merge with biogenics or RWC, edit the sectorlist file in the 
$CASE/scripts directory, and set the mergesector column to Y for the 'beis' 
and 'rwc' sectors.  

To merge in alternative biogenic emissions files, edit the sectorlist 
by changing the 'beis' sector name to the sector name of your choice, and make 
sure your biogenic emissions files exist in the $CASE/premerged/[sector name] 
directory with filenames adhering to the file name convention used by other 
sectors.

== 5. Description of inventory packages ==

Inventories for the 2017gb case are included in the following files, all of 
which should be unpackaged in INSTALL_DIR.

2017emissions/2017gb_inventory_CMV_12US1_29jun2020.zip
2017emissions/2017gb_inventory_cem_29jun2020.zip
2017emissions/2017gb_inventory_fires_29jun2020.zip
2017emissions/2017gb_inventory_nonpoint_29jun2020.zip
2017emissions/2017gb_inventory_nonroad_29jun2020.zip
2017emissions/2017gb_inventory_onroad_activity_29jun2020.zip
2017emissions/2017gb_inventory_oth_29jun2020.zip
2017emissions/2017gb_inventory_point_29jun2020.zip
2017emissions/2017gb_NATA_onroad_SMOKE-MOVES_emissions_FF10_29jun2020.zip

Past platforms also included a biogenics package. The files from that package
are now in the ancillary_data directory.

The "CMV" package includes annual and hourly inventories for the cmv_c1c2
and cmv_c3 sectors, 12US1 grid. Separate inventories for other grids such as
36US3 and Alaska are not provided as of 29 Jun 2020, but may be provided in 
the future. To process CMV inventories for other grids, one should edit the 12US1 
scripts and change the EMISINV and EMISHOUR definitions, along with the other grid 
changes described in Section 4 of this README.

The "cem" package includes the hourly CEM (Continuous Emissions Monitoring)
emissions used by the ptegu sector. This is the same data that is available on 
EPA's Air Markets Program Data website (ampd.epa.gov), except that we've split 
the data into months and days as needed for our scripts, and run the data 
through a "CEM correct" program. The CEMSUM file is in the ancillary_data area.

The "nonpoint" package includes inventories for the following sectors:
afdust, ag, nonpt, np_oilgas, rail, rwc.
Agricultural fire emissions also come from NEI Nonpoint, but those inventories
are included in the fires zip.

The "nonroad" package includes the inventories for the nonroad sector.
To reduce the file size, pollutants not needed for normal CMAQ modeling
were removed, such as metals, PAHs, and dioxins and furans.

The "onroad_activity" package includes the activity data for the onroad and
onroad_ca_adj sectors. It does not include the emission factor tables also
required to run SMOKE-MOVES; those are in separate zips posted in the
2017emissions/moves_eftables/ area. Emission factors are only available for CB6.

The "oth" package includes all inventories for Canada and Mexico, except fires. 
As of 29 Jun 2020, Mexico nonpoint, nonroad, and point inventories are not included.

The "point" package includes the inventories for the following sectors:
ptnonipm, ptegu, pt_oilgas, airports.

The "fires" package includes the inventories for the ptfire, 
ptfire_othna, and ptagfire sectors.

The "NATA_onroad_SMOKE-MOVES_emissions_FF10" package includes an FF10-formatted 
inventory representing emissions totals for the onroad sector as calculated 
by SMOKE-MOVES. This is provided for those who are interested in an onroad 
emissions inventory or report. It is not needed for SMOKE modeling.
This particular file is from the NATA case, which includes additional toxics
species that are not part of most AQ modeling applications. 

See Section 4 of this README for a description of each modeling sector.

== 6. Description of ancillary file packages ==

The following packages should be unpacked in INSTALL_DIR:

ancillary_data/ge_dat_for_2017gb_beis_29jun2020.zip
ancillary_data/ge_dat_for_2017gb_gridding_29jun2020.zip
ancillary_data/ge_dat_for_2017gb_onroad_29jun2020.zip
ancillary_data/ge_dat_for_2017gb_other_29jun2020.zip
ancillary_data/ge_dat_for_2017gb_speciation_29jun2020.zip
ancillary_data/ge_dat_for_2017gb_temporal_29jun2020.zip
ancillary_data/ocean_chlorine.zip
spatial_surrogates/surrogates_CONUS12_2015Canada_2010Mexico_29jun2020.zip
spatial_surrogates/surrogates_CONUS12_2017NEI_29jun2020.zip

The "beis" package includes gridded land use, BIOSEASON, and biogenic 
emission factor files for input to BEIS. This package also includes a 
pre-processed B3GRD file for the 12US1 domain to be used in lieu of running
the Normbeis program. The land use and emission factor inputs in this package 
differ from those in prior versions and are tentatively labeled as BELD 5
and BEIS 3.7, respectively.

The "surrogates" packages contain the spatial surrogates at 12km resolution
for the US, and for Canada and Mexico.

The "gridding" package includes all SMOKE inputs related to spatial allocation 
other than the surrogates, including cross-references, surrogate descriptions, 
and gridded transportable fractions used in afdust_adj, othafdust_adj,
and othptdust_adj.

The "onroad" package includes all SMOKE inputs related to running SMOKE-MOVES
other than activity data and emission factor tables. This includes the reference 
county (MCXREF) and fuel month (MFMREF) cross-references, pollutant (MEPROC) and 
emission factor table (MRCLIST) lists, activity SCC to full SCC cross-references 
(SCCXREF), average speed distributions (SPDIST), daily temperature data (METMOVES), 
and Movesmrg adjustment factors (CFPRO).

The "speciation" package includes speciation profiles, cross-references,
and VOC-to-TOG conversion factors. This .zip includes files 
for the CB6R3AE7 mechanism.

The "temporal" package includes temporal profiles and cross-references, 
including daily and hourly temporal profiles developed by the SMOKE program 
Gentpro for use in the rwc and ag sectors.

The "other" ge_dat package includes all other SMOKE ancillary files not 
included in the above packages, including:

- Inventory tables (INVTABLE)

- NHAPEXCLUDE files (concerns VOC HAP integration)

- Smkreport configuration files (REPCONFIG, all in ge_dat/smkreport/repconfig)

- Other miscellaneous SMOKE inputs, such as the ARTOPNT, COSTCY, HOLIDAYS, 
  MACTDESC, NAICSDESC, ORISDESC, PELVCONFIG, PSTK, SCCDESC
  
The ocean_chlorine.zip package contains gridded ocean chlorine emissions,
which are included in the sector merge, for 12US1, 12US2, and 36US1.

The run scripts (see section 7) are already set up to use the proper ancillary
files and inventories for each sector and case.

== 7. Description of script packages ==

The smoke_2017gb_platform_core.zip package should be unpacked in INSTALL_DIR.
This includes scripts and precompiled executables for running SMOKE in general,
and for running the 2017gb modeling case in particular.

The scripts in the 2017gb_17j subdirectory are the scripts you run directly 
in order to replicate our emissions.  Separate script(s) are provided for each sector.

See section 4 for information pertinent to each sector. In general, you edit
the directory_definitions.csh file, in particular INSTALL_DIR and MET_ROOT,
and then run each sector.
Sector scripts are organized into subdirectories within CASE/scripts by
sector category: biogenics, nonpoint, onroad, point, and merge.

For afdust sectors, run afdust/othafdust first, then afdust_adj/othafdust_adj.
For point sectors, run "onetime" first, and then "daily". 
Othptdust has an additional "adj" script which comes after the daily script.
For onroad sectors, run RPV/RPD/RPH/RPP first, then the merge script.
For onroad_mex, run the normal script and also the part2_combine script after that.

The scripts, programs, and other inputs in the combine, ioapi, and smoke4.7
subdirectories are all "helper" scripts and inputs, and generally never need 
to be run directly.

== 7B. Other miscellaneous programs ==

The beta platform public package includes additional SMOKE utilities in
the package smoke_2016beta_platform_utilities.zip. These utilities have not
been updated since beta platform, and so they have not been reposted for this
platform. Additional information can be found in the README for beta platform.

== 8. Preparing emissions for CAMx ==

As of 29 Jun 2020, CAMx modeling has not been performed for 2017, so scripts for
preparing emissions for the CAMx model are not included at this time. Sample
scripts for converting CMAQ-ready emissions to CAMx-ready format are provided
in the 2016 version 1 platform package.

== 9. Log analyzer ==

The platform scripts include a Python-based tool called the log analyzer that runs
automatically at the conclusion of each SMOKE job. The purpose of the log analyzer
is to scan all log files from the sector, search for errors and warnings, and filter
out common or "acceptable" warnings.

Log analyzer output goes into this directory: $INSTALL_DIR/$CASE/reports/log_analyzer

There are two types of output files: Level 1 and Level 3. Level 3 lists every
instance of each error and warning individually, while Level 1 combines repeats of
common warnings. It is sufficient to look at only the Level 1 output.

The Level 1 output includes a "priority" code, the error/warning message, and the
log file in which the message appears. The priority code is a number from 0 to 3.
If priority = 2 or 3, the message has been identified as common or acceptable.
This is based on a file called known_messages.txt, which is located here:
$INSTALL_DIR/smoke4.7/scripts/log_analyzer/known_messages.txt

If priority = 1, the message is included in known_messages and has been identified 
as NOT acceptable - however, many priority 1 warnings are in fact SOMETIMES
acceptable and sometimes not, which is why they are not given priority 2 or 3.
Some common priority 1 warnings are listed below.
If priority = 0, the message is not included in known_messages.
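
As a rough illustration of the priority codes described above, the following
Python sketch maps a priority value to a suggested action. The function name
and the returned labels are hypothetical and are not part of the log analyzer;
only the meaning of each code is taken from this README.

```python
def triage(priority: int) -> str:
    """Suggest an action for a Level 1 priority code (hypothetical helper)."""
    if priority in (2, 3):
        # Listed in known_messages as common or acceptable.
        return "acceptable"
    if priority == 1:
        # Listed in known_messages as NOT acceptable, but sometimes fine:
        # needs a manual look.
        return "review"
    if priority == 0:
        # Not listed in known_messages at all.
        return "unknown"
    raise ValueError(f"priority must be 0-3, got {priority}")


if __name__ == "__main__":
    for code in range(4):
        print(code, triage(code))
```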

Priority 0 or priority 1 messages which are acceptable include:
- WARNING: [*] is not found in both inventory pollutant and model species lists.
  These are common onroad messages resulting from the CFPRO file including more 
  species than is always necessary.
- WARNING: Could not read  [BEGHOUR/ENDHOUR] from file "PDAY"
  This is a common ptfire warning and is acceptable.
- WARNING: resetting surrogates ratio  of Co/St/Ct (FIPS): ...
  This warning is in known_messages but is not always picked up by the log
  analyzer for reasons unknown.
- WARNING: Speciation profile "??        " is not in profiles
  This warning is common/acceptable for onroad_mex.
- WARNING: Total annual toxic emissions greater than annual [*]__VOC emissions for source:
  This warning is common for sectors with monthly inventories such as nonroad.
- WARNING: Applying default time zone       5 to country/state/county code:    [*]
  This warning is acceptable for Alaska FIPS and FIPS 85005.
- WARNING: Duplicate entry in AR2PT x-ref file:
  This warning is acceptable but is not in known_messages to let us know we should
  correct it in the future.
- WARNING: Hour-specific ending date/time
  WARNING: Could not read "INDXH" from file "PHOUR           "
  netCDF error number  -40
  Error reading netCDF time step flag for PHOUR
  These four messages may appear in the Temporal logs for January and February for 
  the CMV sectors.
- WARNING: Dropping SCC .* not listed in SCCXREF file
  This is an acceptable warning for the RPP process in onroad. In other platforms which use MOVES3,
  this warning is also acceptable for RPV. (Platforms which use MOVES3 also have an "RPS" process.)
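
When reviewing priority 0 or 1 messages by hand, a simple text screen can help
spot the messages listed above. The Python sketch below is only an illustration:
the helper name is hypothetical, only two of the listed messages are transcribed
(with "[*]" written as ".*"), and it matches message text only. It does not
check whether the message came from a sector for which it is acceptable.

```python
import re

# Patterns transcribed from two of the messages listed in this README.
# This is not the contents of known_messages.txt.
LISTED_PATTERNS = [
    r"WARNING: .* is not found in both inventory pollutant and model species lists",
    r"WARNING: Duplicate entry in AR2PT x-ref file",
]


def is_listed_message(message: str) -> bool:
    """Return True if the message text matches one of the listed patterns."""
    return any(re.search(pattern, message) for pattern in LISTED_PATTERNS)


if __name__ == "__main__":
    print(is_listed_message("WARNING: Duplicate entry in AR2PT x-ref file:"))
```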
  
== 10. SMOKE reports (Smkreport) ==

By default, the run scripts run the Smkreport program, with output written to the 
$CASE/reports/inv directory.
For sectors with annual inventories, reports are annual. 
For sectors with monthly inventories (e.g. nonroad), reports are monthly.

Most reports include all inventory pollutants and model species, although PM10 usually
appears as zero due to a SMOKE quirk; to get PM10, sum PM2_5 and PMC in the report.
For onroad, these reports reflect activity, not emissions, and include some double 
counting due to how SMOKE allocates activity to different processes; therefore, you 
should not use the reports/inv reports for the onroad or onroad_ca_adj sectors.
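
The PM10 workaround described above amounts to a single addition. A minimal
Python sketch, assuming the PM2_5 and PMC values for one report row have already
been read into a dictionary (the dictionary layout and the function name are
assumptions, not the Smkreport file format):

```python
def pm10_from_row(row: dict) -> float:
    """Recover PM10 as PM2_5 + PMC, since the PM10 column usually reads zero."""
    return row["PM2_5"] + row["PMC"]


if __name__ == "__main__":
    example_row = {"PM2_5": 1.5, "PMC": 0.5, "PM10": 0.0}  # made-up values
    print(pm10_from_row(example_row))
```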

The following types of reports are generated. Note that not all types of reports are 
generated for all sectors:
  *state.txt: State totals.
  *county.txt: County totals.
  *state_scc.txt: State/SCC totals.
  *county_scc.txt: County/SCC totals.
  *state_naics.txt: State/NAICS totals.
  *cell_${GRID}.txt: Totals by grid cell.
  *cell_county_${GRID}.txt: Totals by grid cell and county.
  *state_grid_${GRID}.txt: State totals after gridding.
  *srgid_${GRID}.txt: Emissions totals at various resolutions after gridding, and also 
    including the spatial surrogate assignment.
  *pm25prof.txt: Totals of PM2.5 at various resolutions, and also including the PM2.5 
    speciation profile assignment.
  *vocprof.txt: Totals of VOC at various resolutions, and also including the VOC 
    speciation profile assignment. 
    For sectors which are integrated and have both NONHAPVOC and VOC, or have multiple 
    modes of VOC, there may be multiple VOC profile reports for NONHAPVOC and 
    (no-integrate) VOC and/or for each mode.

For all sectors except those processed with SMOKE-MOVES, Smkmerge generates daily 
county total reports in the $CASE/reports/smkmerge directory. These are then summed 
to annual by state and county, output to the $CASE/reports/annual_report directory. 
Emissions totals in the reports/annual_report directory (post-temporalization) 
should be within 1-2% of the totals in the reports/inv directory (pre-temporalization).
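
One way to apply the 1-2% check above is to compute the percent difference
between the two totals for a given pollutant. The Python sketch below assumes
you have already extracted the two totals from the reports; the function names
and the default 2% tolerance argument are illustrative choices.

```python
def percent_difference(annual_report_total: float, inv_total: float) -> float:
    """Absolute percent difference of the annual_report total relative to the inv total."""
    if inv_total == 0:
        raise ValueError("inv_total is zero; percent difference is undefined")
    return abs(annual_report_total - inv_total) / abs(inv_total) * 100.0


def within_tolerance(annual_report_total: float, inv_total: float,
                     tolerance_pct: float = 2.0) -> bool:
    """True if the two totals agree to within tolerance_pct percent."""
    return percent_difference(annual_report_total, inv_total) <= tolerance_pct


if __name__ == "__main__":
    # Made-up totals, for illustration only.
    print(within_tolerance(1010.0, 1000.0))
```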

For SMOKE-MOVES sectors, the $CASE/reports/smkmerge directory includes daily
(or weekly if DAYS_PER_RUN=7) totals by county/SCC. Scripts to aggregate these 
totals to monthly or annual by state, county, state/SCC, and county/SCC are 
provided in the SMOKE utilities zip, movesmrg_report_postproc/ directory.

== 11. Spinup ==

A parameter in the run scripts called SPINUP_DURATION supports the processing of emissions for December 
of the previous year for model spinup purposes. If SPINUP_DURATION > 0, then:
- For sectors where emissions are processed for representative dates only, the entire month of December of
  the previous year will be processed
- For sectors where emissions are processed daily, the last X days of December of the previous year 
  will be processed, where X = SPINUP_DURATION
  
However, in EPA emissions modeling platform applications, SPINUP_DURATION is set to a nonzero value
(typically 10) ONLY for biogenics processing and for the final sector merge. For all other sectors,
SPINUP_DURATION is set to 0 when running SMOKE. The scripts in this package are configured accordingly
as the default.

When the final sector merge is run, spinup period emissions for all sectors except biogenics are
taken from December of the base year. For representative day sectors, dates are mapped based on
day-of-week, and also Christmas holidays; for daily sectors, calendar dates are mapped. For example,
in a 2016 base year case, the 12/28/2015 merged emissions include sectors from the following dates:
- beis: 12/28/2015 (actual spinup emissions)
- afdust_adj, ag, onroad, onroad_ca_adj, othafdust_adj, othptdust_adj, rwc: 12/28/2016 
  (daily sectors use December 2016 matched up by calendar day)
- airports, nonpt, nonroad, onroad_can, onroad_mex, othar, pt_oilgas, ptnonipm: 12/05/2016
  (representative dates based on day-of-week)
- np_oilgas, rail: 12/06/2016 (also representative dates, but these sectors have a single day per month)
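
The date mappings in this example can be checked with a short Python sketch.
It covers only the calendar-date mapping used for daily sectors, and confirms
that the representative date quoted above falls on the same day of the week as
the spinup date. The full rule for choosing representative dates (including
holidays) is not reproduced here, and the function name is hypothetical.

```python
from datetime import date


def calendar_mapped_date(spinup_day: date) -> date:
    """Map a December spinup date to the same month and day in the base year."""
    return spinup_day.replace(year=spinup_day.year + 1)


if __name__ == "__main__":
    spinup_day = date(2015, 12, 28)
    # Daily sectors: December 2016 matched up by calendar day.
    print(calendar_mapped_date(spinup_day))
    # Representative-date sectors: 12/05/2016 shares the day of week with 12/28/2015.
    print(spinup_day.weekday() == date(2016, 12, 5).weekday())
```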

An illustration of how the dates are mapped in the spinup period is available in the smk_merge_dates files,
for example: $INSTALL_DIR/smoke4.7/scripts/smk_dates/2017/smk_merge_dates_201612.txt
  
Similarly for point sectors, when CMAQ is run, inline emissions from the base year are used to cover the 
spinup period as follows:
- othpt, ptnonipm, pt_oilgas use representative date / day-of-week based mappings
- cmv_c1c2, cmv_c3, ptegu use calendar date-based mappings
- ptfire, ptagfire, ptfire_othna (all fires) use January 1 emissions for all spinup days
Sometimes, base year inline point emissions files are physically copied to December spinup dates to simplify
the CMAQ setup, but CMAQ scripts can also be configured to point to base year dates while running spinup. 

== 12. Running SMOKE for only part of a month ==

The scripts as packaged are designed to process emissions for a full calendar year. This behavior can be 
easily changed to process only certain months using the RUN_MONTHS parameter. 

Running SMOKE for a full month is easier to set up than running a partial month. But if met data is only 
available for certain dates, or if certain sectors have long run times which would be exacerbated by 
running SMOKE for more days than are needed, then you may wish to set up these scripts to run SMOKE for 
only part of a month.

If you want to run SMOKE for only part of a month, that can be done in one of two ways:

1) Use SPINUP_DURATION. This is the easiest method, but will only work if the only partial month you need to run
is at the beginning of your modeling period. Here, you can set RUN_MONTHS = the list of full months that you need, 
not including the partial month; and set SPINUP_DURATION = however many days you need from the end of the previous 
month. For example, to run SMOKE for 6/21 through 7/31, use RUN_MONTHS = "7" and SPINUP_DURATION = 10.
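
To double-check which dates a given RUN_MONTHS / SPINUP_DURATION combination
covers for a sector processed daily, the arithmetic can be sketched in Python
as follows. The function name is hypothetical, and the year 2017 in the usage
line is an assumption made for illustration.

```python
import calendar
from datetime import date, timedelta


def processed_range(year: int, run_months: list, spinup_duration: int):
    """Return (first_day, last_day) covered for a daily sector.

    The period starts spinup_duration days before the first listed month
    and ends on the last day of the last listed month.
    """
    first_month, last_month = min(run_months), max(run_months)
    start = date(year, first_month, 1) - timedelta(days=spinup_duration)
    end = date(year, last_month, calendar.monthrange(year, last_month)[1])
    return start, end


if __name__ == "__main__":
    # RUN_MONTHS = "7" and SPINUP_DURATION = 10, as in the example above.
    print(processed_range(2017, [7], 10))
```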

2) Use the run settings file (RUNSET): $INSTALL_DIR/[CASE]/scripts/run_settings.txt. Here, you can list the days
for which you want Smkmerge/Movesmrg to *not* run. The quality assurance program m3stat must also be turned off
for the days you want to skip. 

The general format for using the run settings file in this manner is: 
[sector], [grid], [program], PART4, [first day you want to skip], [last day you want to skip], N

[program] = movesmrg for onroad, smkmerge for other sectors. Also create a second line for the m3stat program. 
For the sector merge, set both [sector] and [program] to "mrggrid".
For the onroad merge, set [sector] to the onroad sector name (e.g. onroad, onroad_ca_adj) and [program] to "mrggrid".
Always use PART4 for smkmerge/movesmrg/m3stat/mrggrid. 

For example, to skip 1/1-1/15 for the ag sector and only run 1/16-1/31, add these two lines to run_settings.txt:
ag, 12US1, smkmerge, PART4, 01/01/2017, 01/15/2017, N
ag, 12US1, m3stat, PART4, 01/01/2017, 01/15/2017, N
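
If you need entries like these for several sectors, they can be generated
rather than typed. The Python sketch below follows the ag example above and
writes one line for the merge program plus one for m3stat. The helper name is
hypothetical and the sketch does not write to run_settings.txt itself.

```python
from datetime import date


def skip_lines(sector: str, grid: str, program: str,
               first_skip: date, last_skip: date) -> list:
    """Build run settings lines that turn off `program` and m3stat for a date range."""
    fmt = "%m/%d/%Y"
    first, last = first_skip.strftime(fmt), last_skip.strftime(fmt)
    return [
        f"{sector}, {grid}, {prog}, PART4, {first}, {last}, N"
        for prog in (program, "m3stat")
    ]


if __name__ == "__main__":
    for line in skip_lines("ag", "12US1", "smkmerge",
                           date(2017, 1, 1), date(2017, 1, 15)):
        print(line)
```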

With either method, we recommend DAYS_PER_RUN = 1 for onroad SMOKE-MOVES processing for any partial month. 
The configuration will not work if, for example, DAYS_PER_RUN = 7 and the first day of your modeling period is in the 
middle of a 7-day block.

For sectors with representative dates, we recommend having SMOKE always generate emissions for the full month, 
i.e. for all representative dates in the month, rather than skipping certain representative dates. Using the above
ag example, turning 1/1-1/15 off for the ptnonipm sector would result in skipping the representative dates 
that you need to cover the 1/16-1/31 period. Related to this, it is possible to set [sector] = "all" in 
run_settings.txt, but we advise against that because of the impact that would have on sectors with representative 
dates; you might end up skipping a representative date that you actually need. 

The afdust adjustments step does not support use of the run settings file. Instead, for afdust, do the following:
- Run the SMOKE job for full months (representative dates)
- At the bottom of the afdust adjustments job script, replace afdust_adj_emf.csh with afdust_adj_emf_customdates.csh.
  The customdates version of the script will automatically skip any day where there is no METCRO2D file.
  (The most common reason for wanting to run emissions for only part of a month is because met data is only 
  available for part of a month. If you do actually have met data for the days you want to skip, then either
  move that met data temporarily, or just let the job run for the full month.)