
Introduction to Data Set

Fine particulate matter (PM2.5) is an ambient air pollutant for which there is strong evidence that it is harmful to human health. In the United States, the Environmental Protection Agency (EPA) is tasked with setting national ambient air quality standards for fine PM and with tracking the emissions of this pollutant into the atmosphere. Approximately every three years, the EPA releases its database on emissions of PM2.5. This database is known as the National Emissions Inventory (NEI). You can read more information about the NEI at the EPA National Emissions Inventory web site.

For each year and for each type of PM source, the NEI records how many tons of PM2.5 were emitted from that source over the course of the entire year. The data used for this assignment are for 1999, 2002, 2005, and 2008.

Data

The data for this assignment are available from the course web site as a single zip file:
Environmental Protection Agency DataSet.

The zip file contains two files:

PM2.5 Emissions Data (summarySCC_PM25.rds):
This file contains a data frame with all of the PM2.5 emissions data for 1999, 2002, 2005, and 2008. For each year, the table contains the number of tons of PM2.5 emitted from a specific type of source for the entire year. Its columns are:

  • fips: A five-digit number (represented as a string) indicating the U.S. county
  • SCC: The name of the source as indicated by a digit string (see source code classification table)
  • Pollutant: A string indicating the pollutant
  • Emissions: Amount of PM2.5 emitted, in tons
  • type: The type of source (point, non-point, on-road, or non-road)
  • year: The year of emissions recorded
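To make the column layout concrete, here is a minimal sketch of a data frame with the same six columns. The name nei_toy and all of its values are invented for illustration (including the pollutant label) and are not taken from the NEI file. It uses base R only, so it runs without extra packages.

```r
# Hypothetical rows mirroring the six NEI columns; every value is made up.
nei_toy <- data.frame(
  fips      = c("24510", "24510", "06037"),
  SCC       = c("10100101", "10100101", "10100101"),
  Pollutant = "PM25-PRI",              # assumed label, for illustration only
  Emissions = c(6.5, 0.2, 12.0),       # tons
  type      = c("POINT", "POINT", "POINT"),
  year      = c(1999L, 2008L, 2008L),
  stringsAsFactors = FALSE
)

# Show the column types
str(nei_toy)

# fips is stored as a string so the leading zero in "06037" is preserved
nchar(nei_toy$fips)
```

Keeping fips as a string matters later, because the filters in the questions compare against values such as "06037".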

Source Classification Code Table (Source_Classification_Code.rds):
This table provides a mapping from the SCC digit strings in the Emissions table to the actual name of the PM2.5 source. The sources are categorized in a few different ways from more general to more specific and you may choose to explore whatever categories you think are most useful. For example, source “10100101” is known as “Ext Comb /Electric Gen /Anthracite Coal /Pulverized Coal”.
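The lookup described above can be sketched with a base R merge. The first row of scc_toy is the example quoted above; the second code and name, and the emissions values, are hypothetical placeholders. The column name Short.Name is the one the Question 4 code relies on.

```r
# Toy classification table: row 1 is the quoted example, row 2 is hypothetical.
scc_toy <- data.frame(
  SCC        = c("10100101", "99999999"),
  Short.Name = c("Ext Comb /Electric Gen /Anthracite Coal /Pulverized Coal",
                 "Hypothetical Source /Not Real"),
  stringsAsFactors = FALSE
)

# Toy emissions records keyed by the same SCC strings (values are made up).
emissions_toy <- data.frame(
  SCC       = c("10100101", "99999999", "10100101"),
  Emissions = c(1.5, 2.0, 0.5),
  stringsAsFactors = FALSE
)

# Attach the readable source name to each emissions record
named <- merge(emissions_toy, scc_toy, by = "SCC")
named[, c("SCC", "Short.Name", "Emissions")]
```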

What We Will Do

The overall goal of this assignment is to explore the National Emissions Inventory database and see what it says about fine particulate matter pollution in the United States over the 10-year period 1999–2008. You may use any R package you want to support your analysis.

Checklist:

  1. Have total emissions from PM2.5 decreased in the United States from 1999 to 2008?
    Using the base plotting system, make a plot showing the total PM2.5 emission from all sources for each of the years 1999, 2002, 2005, and 2008.
  2. Have total emissions from PM2.5 decreased in Baltimore City, Maryland (fips == "24510") from 1999 to 2008? Use the base plotting system to make a plot answering this question.
  3. Of the four types of sources indicated by the type (point, nonpoint, onroad, nonroad) variable, which of these four sources have seen decreases in emissions from 1999–2008 for Baltimore City?
    Which have seen increases in emissions from 1999–2008?
    Use the ggplot2 plotting system to make a plot answering this question.
  4. Across the United States, how have emissions from coal combustion-related sources changed from 1999–2008?
  5. How have emissions from motor vehicle sources changed from 1999–2008 in Baltimore City?
  6. Compare emissions from motor vehicle sources in Baltimore City with emissions from motor vehicle sources in Los Angeles County, California (fips == "06037").
    Which city has seen greater changes over time in motor vehicle emissions?

Plotting:

  1. Construct the plot and save it to a PNG file.
  2. Create a separate R code file (plot1.R, plot2.R, and so on) that constructs the corresponding plot; for example, the code in plot1.R constructs the plot1.png plot. Each code file should include the code for reading the data so that the plot can be fully reproduced, as well as the code that creates the PNG file. Include only the code for a single plot (for example, plot1.R should contain only the code for producing plot1.png).

First, we will import the data set by downloading and unzipping it.

Load the packages in R

library(data.table)
library(ggplot2)
# Download the zip file into the working directory and unzip it
path <- getwd()
download.file(url = "https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2FNEI_data.zip"
              , destfile = paste(path, "dataFiles.zip", sep = "/")
              , mode = "wb")  # binary mode keeps the zip intact on Windows
unzip(zipfile = "dataFiles.zip")
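Because every plot script has to read the data itself, the download step runs once per script. One possible way to avoid repeated downloads is a small guard like the one below. The helper name fetch_nei and its arguments are hypothetical and not part of the original code; this is only a sketch that assumes the two .rds file names listed earlier.

```r
# Hypothetical helper: download and unzip only when the data files are missing.
fetch_nei <- function(url, zip_path = "dataFiles.zip", exdir = ".") {
  rds_files <- file.path(exdir, c("summarySCC_PM25.rds",
                                  "Source_Classification_Code.rds"))
  if (!all(file.exists(rds_files))) {
    if (!file.exists(zip_path)) {
      # mode = "wb" keeps the binary zip intact on Windows
      download.file(url = url, destfile = zip_path, mode = "wb")
    }
    unzip(zipfile = zip_path, exdir = exdir)
  }
  # Return the paths so the caller can pass them to readRDS()
  rds_files
}
```

It could be called at the top of each script with the same URL used above; the returned paths can then be passed to readRDS().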

Then we will read the data set

#reading the data
data_summary <- readRDS("summarySCC_PM25.rds")
data_source <- readRDS("Source_Classification_Code.rds")

Converting the format to data.table

NEI <- data.table::as.data.table(data_summary)
SCC <- data.table::as.data.table(data_source)

Question 1:

Have total emissions from PM2.5 decreased in the United States from 1999 to 2008?
Using the base plotting system, make a plot showing the total PM2.5 emission from all sources for each of the years 1999, 2002, 2005, and 2008.

# Have total emissions from PM2.5 decreased in the United States from 1999 to 2008?
# Sum emissions from all sources per year (keyby keeps the years sorted)
by_year <- NEI[, list(emissions = sum(Emissions)), keyby = year]
# Make sure both columns are numeric before plotting
by_year$year <- as.numeric(as.character(by_year$year))
by_year$emissions <- as.numeric(as.character(by_year$emissions))
# Open a PNG device so the plot is saved to plot1.png
png(filename = 'plot1.png')
# Generate the plot
barplot(by_year$emissions
        , names.arg = by_year$year
        , xlab = "Years", ylab = "Emissions"
        , main = "Emissions over the Years")
dev.off()

Plot1.png
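The bar plot answers the question visually. To state the answer as a number, one option is the percent change between the first and last year. The sketch below uses hypothetical yearly totals, not the real NEI figures, and base R only.

```r
# Hypothetical yearly totals in tons; these are NOT the real NEI values.
totals <- data.frame(year      = c(1999, 2002, 2005, 2008),
                     emissions = c(1000, 800, 750, 500))

# Sort by year so "first" and "last" are well defined
totals <- totals[order(totals$year), ]
first <- totals$emissions[1]
last  <- totals$emissions[nrow(totals)]

# Percent change from the first to the last year; negative means a decrease
pct_change <- 100 * (last - first) / first
pct_change

# Did emissions fall at every step, not just overall?
all(diff(totals$emissions) < 0)
```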

Question 2:

Have total emissions from PM2.5 decreased in Baltimore City, Maryland (fips == "24510") from 1999 to 2008? Use the base plotting system to make a plot answering this question.

# Have total emissions from PM2.5 decreased in Baltimore City, Maryland (fips == "24510") from 1999 to 2008?
# Subset the Baltimore City records
baltimore <- subset(NEI, fips == '24510')
# Sum emissions per year (keyby keeps the years sorted)
by_year <- baltimore[, list(emissions = sum(Emissions)), keyby = year]
# Make sure both columns are numeric before plotting
by_year$year <- as.numeric(as.character(by_year$year))
by_year$emissions <- as.numeric(as.character(by_year$emissions))
# Open a PNG device so the plot is saved to plot2.png
png(filename = 'plot2.png')
# Generate the plot
barplot(by_year$emissions
        , names.arg = by_year$year
        , xlab = "Years", ylab = "Emissions"
        , main = "Emissions over the Years")
dev.off()

Plot2

Question 3:

Of the four types of sources indicated by the type (point, nonpoint, onroad, nonroad) variable, which of these four sources have seen decreases in emissions from 1999–2008 for Baltimore City?
Which have seen increases in emissions from 1999–2008?
Use the ggplot2 plotting system to make a plot answering this question.

# Which of the four source types (point, nonpoint, onroad, nonroad) have seen decreases or increases in emissions from 1999–2008 for Baltimore City?
# Subset the Baltimore City records
baltimore <- subset(NEI, fips == '24510')
# Sum emissions per year and source type
by_year <- baltimore[, list(emissions = sum(Emissions)), by = c('year', 'type')]
# Make sure both columns are numeric before plotting
by_year$year <- as.numeric(as.character(by_year$year))
by_year$emissions <- as.numeric(as.character(by_year$emissions))
# Open a PNG device so the plot is saved to plot3.png
png(filename = 'plot3.png')
# Plot with ggplot2; print() ensures the plot is drawn when the script is run with source()
print(ggplot(by_year, aes(x = year, y = emissions, col = type)) + geom_line() + geom_point() + ggtitle("Emissions in Baltimore City"))
dev.off()

Plot3
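To read "which types decreased and which increased" off the data rather than the plot, one approach is to take the last-year value minus the first-year value within each type. The totals below are hypothetical and only illustrate the calculation; they are not Baltimore's real figures.

```r
# Hypothetical totals by year and type; all values are invented.
by_type <- data.frame(
  year      = rep(c(1999, 2008), times = 2),
  type      = rep(c("POINT", "ON-ROAD"), each = 2),
  emissions = c(100, 150, 300, 90)
)

# For each type: emissions in the last year minus emissions in the first year
change <- sapply(split(by_type, by_type$type), function(d) {
  d <- d[order(d$year), ]
  d$emissions[nrow(d)] - d$emissions[1]
})
change

names(change)[change < 0]   # types whose emissions decreased
names(change)[change > 0]   # types whose emissions increased
```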

Question 4:

Across the United States, how have emissions from coal combustion-related sources changed from 1999–2008?

# Across the United States, how have emissions from coal combustion-related sources changed from 1999–2008?
# Merge the emissions with the source classification table by SCC
merged <- merge(NEI, SCC, by = "SCC")
# Keep the records whose short name contains 'coal' (case-insensitive);
# note that this pattern may also match names such as 'charcoal'
is_coal <- grepl("coal", merged$Short.Name, ignore.case = TRUE)
coal <- merged[is_coal, ]
# Sum emissions per year
by_year <- coal[, list(emissions = sum(Emissions)), by = c('year')]
# Make sure both columns are numeric before plotting
by_year$year <- as.numeric(as.character(by_year$year))
by_year$emissions <- as.numeric(as.character(by_year$emissions))
# Open a PNG device so the plot is saved to plot4.png
png(filename = 'plot4.png')
# Plot with ggplot2; print() ensures the plot is drawn when the script is run with source()
print(ggplot(data = by_year, aes(x = year, y = emissions)) + geom_line() + geom_point() + ggtitle("Emissions from Coal Sources in the US"))
dev.off()

Plot4

Question 5:

How have emissions from motor vehicle sources changed from 1999–2008 in Baltimore City?

# How have emissions from motor vehicle sources changed from 1999–2008 in Baltimore City?
# Keep the Baltimore City records of type 'ON-ROAD'
baltimore <- subset(NEI, fips == '24510' & type == 'ON-ROAD')
# Sum emissions per year
by_year <- baltimore[, list(emissions = sum(Emissions)), by = c('year', 'type')]
# Make sure both columns are numeric before plotting
by_year$year <- as.numeric(as.character(by_year$year))
by_year$emissions <- as.numeric(as.character(by_year$emissions))
# Open a PNG device so the plot is saved to plot5.png
png(filename = 'plot5.png')
# Plot with ggplot2; print() ensures the plot is drawn when the script is run with source()
print(ggplot(data = by_year, aes(x = year, y = emissions)) + geom_line() + geom_point() + ggtitle("Emissions in Baltimore City from Motor Vehicles"))
dev.off()

Plot5

Question 6:

Compare emissions from motor vehicle sources in Baltimore City with emissions from motor vehicle sources in Los Angeles County, California (fips == "06037").
Which city has seen greater changes over time in motor vehicle emissions?

# Compare motor vehicle emissions in Baltimore City (fips == "24510") and Los Angeles County (fips == "06037")
# Keep the 'ON-ROAD' records for both places
cities <- subset(NEI, fips %in% c('06037', '24510') & type == 'ON-ROAD')
# Sum emissions per year and county
by_year <- cities[, list(emissions = sum(Emissions)), by = c('year', 'fips')]
# Make sure both columns are numeric before plotting
by_year$year <- as.numeric(as.character(by_year$year))
by_year$emissions <- as.numeric(as.character(by_year$emissions))
# Open a PNG device so the plot is saved to plot6.png
png(filename = 'plot6.png')
# Plot with ggplot2; print() ensures the plot is drawn when the script is run with source()
print(ggplot(data = by_year, aes(x = year, y = emissions, col = fips)) + geom_line() + geom_point() + ggtitle("Motor Vehicle Emissions: Baltimore City vs Los Angeles County"))
dev.off()

Plot6
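"Greater change" can mean absolute change in tons or change relative to the starting level, and the two can disagree when the places differ a lot in scale. The sketch below computes both on hypothetical totals; the numbers are invented to show such a disagreement and are not the real values for either county.

```r
# Hypothetical on-road totals for the two fips codes; all values are invented.
cities_toy <- data.frame(
  fips      = rep(c("24510", "06037"), each = 2),
  year      = rep(c(1999, 2008), times = 2),
  emissions = c(400, 100, 4000, 4400)
)

# Per county: absolute change (tons) and percent change from first to last year
summary_by_city <- do.call(rbind, lapply(split(cities_toy, cities_toy$fips), function(d) {
  d <- d[order(d$year), ]
  first <- d$emissions[1]
  last  <- d$emissions[nrow(d)]
  data.frame(fips       = d$fips[1],
             abs_change = last - first,
             pct_change = 100 * (last - first) / first)
}))
summary_by_city
```

In this toy case "06037" has the larger absolute change while "24510" has the larger percent change, so it helps to say which measure an answer refers to.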

You can also visit my GitHub repository for the code and documentation.
