A comprehensive R-based analysis that fetches and processes global CO₂ emissions data (per-capita and total) from Our World in Data (OWID), generates a suite of 12 exploratory and analytical plots, and performs a regression of per-capita emissions on GDP per capita.
- Project Overview
- Features & Charts
- Prerequisites
- Installation
- Usage
- Script Breakdown
- Interpreting the Outputs
- Extending & Customizing
- Data Source & Citations
- License
This repository contains:
- `CO2_Analysis.R`: A self-contained R script that
  - Downloads the OWID CO₂ dataset (yearly, since 1750)
  - Cleans & prepares: CO₂ per capita, total CO₂, GDP per capita, date conversion
  - Produces 12 visualizations exploring temporal trends, economic relationships, top emitters, quartile analyses, cumulative sums, and year-over-year changes
  - Fits a linear regression model: CO₂ per capita ~ GDP per capita, with summary statistics and an annotated plot
- `report.Rmd`: An R Markdown report that weaves narrative, code, and plots into a shareable HTML document.
- `scripts/`, `Makefile`, `Dockerfile`, `docker-compose.yml`, and `.env.example` to automate data fetching, analysis, report building, and containerized execution.
1. Global Trend
   - Line plot of global average CO₂ per capita over time
2. GDP vs. CO₂
   - Scatter of GDP per capita vs. CO₂ per capita (latest) with LOESS smoothing
3. Regression Analysis
   - Linear model summary printed to console
   - Scatter + regression line with R² and p-value annotation
4. Top 10 Emitters (per capita)
   - Horizontal bar chart of the 10 highest per-capita emitters
5. Top 5 Time Series
   - Line plots of CO₂ per capita over time for the top 5 countries
6. Quartile Boxplot
   - CO₂ per capita by GDP per capita quartile (boxplot)
7. Quartile Violin
   - Distribution of CO₂ per capita by GDP quartile (violin plot)
8. Quartile Heatmap
   - Heatmap of average CO₂ per capita by year & GDP quartile
9. Cumulative Sum
   - Cumulative sum of per-capita CO₂ over time
10. Year-over-Year Change
    - Line plot of YOY % change in global average CO₂ per capita
11. Total vs. Per Capita
    - Scatter of total CO₂ emissions vs. per-capita emissions (log scale)
12. Global Time Series
    - (Already #1): the core global trend
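Two of the derived quantities behind these charts, the GDP-per-capita quartile and the year-over-year percentage change, can be sketched in base R as follows. This is a minimal illustration on invented toy data, not the code in `CO2_Analysis.R`; the helper names `gdp_quartile()` and `yoy_pct_change()` are hypothetical, and the script itself may well use `dplyr` equivalents.

```r
# Toy data (invented values, for illustration only)
toy <- data.frame(
  country = c("A", "B", "C", "D", "E", "F", "G", "H"),
  gdp_pc  = c(1000, 2000, 4000, 8000, 16000, 32000, 64000, 128000),
  co2_pc  = c(0.2, 0.5, 1.1, 2.4, 4.8, 7.5, 11.0, 15.2)
)

# Assign each value to a GDP-per-capita quartile (Q1 = lowest, Q4 = highest).
# Note: cut() requires unique breaks, so heavily tied data would need extra care.
gdp_quartile <- function(gdp_pc) {
  breaks <- quantile(gdp_pc, probs = seq(0, 1, 0.25), na.rm = TRUE)
  cut(gdp_pc, breaks = breaks, labels = paste0("Q", 1:4), include.lowest = TRUE)
}
toy$gdp_quartile <- gdp_quartile(toy$gdp_pc)

# Year-over-year % change of a series that is already sorted by year.
# The first year has no predecessor, so its change is NA.
yoy_pct_change <- function(x) {
  c(NA, diff(x) / head(x, -1) * 100)
}

# Toy global series (invented values)
global <- data.frame(year = 2000:2003, avg_co2_pc = c(4.0, 4.2, 4.2, 3.99))
global$yoy_pct    <- yoy_pct_change(global$avg_co2_pc)
global$cumulative <- cumsum(global$avg_co2_pc)
```

The quartile column can then serve as the grouping variable for the boxplot, violin, and heatmap, while `yoy_pct` and `cumulative` map directly onto charts 10 and 9.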
- R ≥ 4.0
- Internet access to fetch OWID data
- Packages: `ggplot2`, `dplyr`, `lubridate`, `tidyr`, `forcats`, `scales`, `viridis`, `zoo`, `corrplot`
The main script auto-installs any missing packages.
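For readers who want to reuse the auto-install step elsewhere, one common way to write it is sketched below. The function names `missing_packages()` and `ensure_packages()` and the CRAN mirror URL are assumptions for illustration; the actual script may do this differently.

```r
# Packages listed under Prerequisites
required_pkgs <- c("ggplot2", "dplyr", "lubridate", "tidyr", "forcats",
                   "scales", "viridis", "zoo", "corrplot")

# Return the subset of `pkgs` that is not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install whatever is missing, then attach every package
ensure_packages <- function(pkgs, repos = "https://cloud.r-project.org") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Uncomment to install and load everything in one call:
# ensure_packages(required_pkgs)
```

Keeping the check separate from the install makes it easy to see what would be installed before anything is downloaded.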
- Clone this repository:
      git clone https://github.com/yourusername/co2-analysis.git
      cd co2-analysis
- Optional: copy `.env.example` → `.env` to set `DATA_DIR` if you prefer a custom data folder.
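How the script consumes `DATA_DIR` is not spelled out here. A plausible sketch, assuming the variable reaches the R process as an ordinary environment variable (for example exported by the shell or passed in by Docker Compose) and that the fallback folder is `data/`, looks like this. The helper name `resolve_data_dir()` and the CSV file name are hypothetical.

```r
# Resolve the data directory: DATA_DIR if set and non-empty, else the default
resolve_data_dir <- function(default = "data") {
  value <- Sys.getenv("DATA_DIR", unset = default)
  if (!nzchar(value)) value <- default
  value
}

data_dir <- resolve_data_dir()
# File name below is an assumption, not taken from the project
csv_path <- file.path(data_dir, "owid-co2-data.csv")
```

Note that R does not read `.env` files on its own, so the variable has to be exported by whatever launches the script.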
# Set working directory to the project root
setwd("path/to/co2-analysis")
# Source the analysis script
source("CO2_Analysis.R")
All charts will render in sequence and the regression summary will print to the console.
make report
This will:
- Fetch the CO₂ CSV into `data/`
- Run `CO2_Analysis.R`
- Render `CO2_Analysis.Rmd` → `CO2_report.html`
docker-compose up --build
This runs the entire pipeline in a reproducible container.
- Setup
  - Defines package list, auto-installs, and loads libraries.
- Data Fetch & Prep
  - Downloads OWID CSV, filters to years ≥1960, computes `co2_pc`, `total_co2`, `gdp_pc`.
  - Derives `date` from `year`.
- Visualization Sections
  - Global trend, GDP relationship, top emitter charts, quartile analyses, cumulative sums, YOY change, total vs. per-capita.
- Regression
  - Fits `lm(co2_pc ~ gdp_pc)`, prints `summary()`, and generates an annotated plot.
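The Data Fetch & Prep step can be pictured with the base-R sketch below. It runs on an invented toy data frame rather than the downloaded file. The raw column names (`country`, `year`, `co2`, `co2_per_capita`, `gdp`, `population`) are assumed to match the OWID CSV and should be checked against its codebook. Computing `gdp_pc` as `gdp / population` and setting `date` to 1 January of each year are also assumptions, as is the helper name `prepare_co2()`.

```r
# Toy stand-in for the downloaded CSV (invented values); in the real script
# this would come from reading the OWID file.
raw <- data.frame(
  country        = c("A", "A", "B", "B"),
  year           = c(1959, 1960, 1960, 2020),
  co2            = c(10, 12, 100, 150),       # total emissions
  co2_per_capita = c(1.0, 1.1, 5.0, 6.0),
  gdp            = c(1e9, 1.2e9, 4e10, 9e10),
  population     = c(1e7, 1.1e7, 2e7, 2.5e7)
)

# Filter to min_year onward and derive the columns used by the plots
prepare_co2 <- function(raw, min_year = 1960) {
  df <- raw[raw$year >= min_year, ]
  data.frame(
    country   = df$country,
    year      = df$year,
    date      = as.Date(paste0(df$year, "-01-01")),  # assumed convention
    co2_pc    = df$co2_per_capita,
    total_co2 = df$co2,
    gdp_pc    = df$gdp / df$population               # assumed definition
  )
}

co2 <- prepare_co2(raw)
```

Exposing `min_year` as an argument is one way to make the "include earlier years" adjustment a one-line change.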
- Global Average: captures long-term trends in per-capita emissions.
- GDP vs CO₂: shows the positive correlation between wealth and emissions.
- Top Emitters: highlights the countries with the highest per-capita footprint.
- Quartile Plots: reveal distributional differences across income strata.
- Cumulative & YOY: expose acceleration or deceleration in emission growth.
- Regression: quantifies how much GDP per capita explains variation in emissions.
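To make the regression reading concrete, the sketch below shows one way to pull R² and the overall p-value out of a fitted model for a plot annotation. The data are simulated, so the numbers carry no empirical meaning, and the label format is an assumption rather than what the script prints.

```r
# Simulated toy data: emissions rise roughly linearly with GDP per capita
set.seed(42)
toy <- data.frame(gdp_pc = seq(1000, 50000, length.out = 40))
toy$co2_pc <- 0.5 + 0.0002 * toy$gdp_pc + rnorm(40, sd = 1)

fit <- lm(co2_pc ~ gdp_pc, data = toy)
s   <- summary(fit)

# Share of the variance in co2_pc accounted for by the linear fit
r_squared <- s$r.squared

# summary.lm stores the F statistic and its degrees of freedom, not the
# p-value itself, so the p-value is computed from the F distribution.
f <- s$fstatistic
p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Text that could be placed on the scatter plot
label <- sprintf("R^2 = %.2f, p = %.3g", r_squared, p_value)
```

An R² near 1 would mean GDP per capita alone accounts for most of the cross-country variation. A low R² with a small p-value would mean the relationship is real but leaves much unexplained.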
- Adjust filters (e.g., include earlier years).
- Swap color palettes or themes.
- Add new predictors (population density, energy mix).
- Save plots by adding `ggsave()` calls.
- Integrate regional maps using `sf` and `rnaturalearth`.
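As an example of the `ggsave()` suggestion, a small wrapper along these lines would write each chart to disk. The helper name `save_plot()`, the default `figures` folder, and the size settings are assumptions, and the plot uses invented toy values.

```r
library(ggplot2)

# Save a ggplot into `out_dir`, creating the folder if needed; returns the path
save_plot <- function(plot, filename, out_dir = "figures",
                      width = 8, height = 5, dpi = 300) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  path <- file.path(out_dir, filename)
  ggsave(path, plot = plot, width = width, height = height, dpi = dpi)
  path
}

# Toy plot (invented values)
toy <- data.frame(year = 2000:2005, co2_pc = c(4.0, 4.1, 4.3, 4.2, 4.4, 4.5))
p <- ggplot(toy, aes(x = year, y = co2_pc)) +
  geom_line() +
  labs(x = "Year", y = "CO2 per capita", title = "Toy global trend") +
  theme_minimal()

# Written to a temporary folder here; pass a project folder in practice
out_path <- save_plot(p, "global_trend.png", out_dir = tempdir())
```

Passing the plot object explicitly avoids relying on `ggsave()`'s default of saving the last plot displayed, which is easy to get wrong when twelve charts render in sequence.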
- Our World in Data CO₂ dataset: https://github.com/owid/co2-data
- R & Packages: R Core Team (2023). R: A language and environment for statistical computing.
- Visualization: Wickham H. et al. “ggplot2: Elegant Graphics for Data Analysis.”
This project is released under the MIT License. See LICENSE for details.