In this paper we compare the most common reduced-form models used for emissions forecasting, point out their shortcomings, and suggest improvements. Using a U.S. state-level panel data set of CO2 emissions, we test the performance of existing models against a large universe of potential reduced-form models. Our preferred measure of model performance is the squared out-of-sample prediction error of aggregate CO2 emissions. We find that leading models in the literature, as well as models selected using an emissions-per-capita loss measure or different in-sample selection criteria, perform significantly worse than the best model chosen directly on the out-of-sample loss measure defined over aggregate emissions. Unlike the existing literature, the tests of model superiority employed here account for the model search, or ‘data snooping’, involved in identifying a preferred model. Forecasts from our best-performing model for the United States are 100 million tons of carbon lower than existing scenarios predict.