How to Create an Area Plot in R Using ggplot2

Area charts display quantitative data as a filled region beneath a line. The ggplot2 package provides the geom_area() function to draw area charts and geom_line() to draw a line through the data points.

What is an area plot?

An area plot is a kind of line plot in which the region between the line and the x-axis is filled in. We first mark the data points and then join them with a line, and the filled area shows how the quantity changes across time periods or other ordered values.
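
To make the definition concrete, here is a minimal sketch on a tiny hand-made data frame. The `periods` data and its column names are invented for illustration. It builds the same five points twice, once as a plain line and once as an area, so the only difference between the two plots is the filled region.

```r
library(ggplot2)

# Hypothetical toy data: one value per time period
periods <- data.frame(period = 1:5, amount = c(3, 5, 4, 7, 6))

# The points joined by a line only
line_plot <- ggplot(periods, aes(x = period, y = amount)) +
  geom_line()

# The same points with the region between the line and y = 0 filled in
area_plot <- ggplot(periods, aes(x = period, y = amount)) +
  geom_area(fill = "steelblue", alpha = 0.6) +
  geom_line()

print(area_plot)
```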

In this tutorial, we are going to create an area chart using the ggplot2 library. If you know how to use the geom_area() function, you are just a few steps away from creating a beautiful area chart in R.

Let’s roll!

Create a Simple Area Plot in R using ggplot2

Let's plot a simple area chart. This is the basic area plot in R using ggplot2. The data are the cumulative sums of 50 values drawn from a normal distribution with rnorm().

Execute the code below to plot the area chart.


#imports the ggplot2 library
library(ggplot2)

#creates the data frame from the cumulative sum of normal random values (rnorm)
xdata<-1:50
ydata<-cumsum(rnorm(50))
data1<-data.frame(xdata,ydata)

#plots the area chart (alpha must lie between 0 and 1)
ggplot(data1, aes(x=xdata, y=ydata))+geom_area(fill='#142F86',alpha=1)

Figure: Basic area plot in R
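
Because rnorm() draws new random values on every run, the chart looks different each time the code is executed. A minimal sketch of a repeatable version follows, assuming an arbitrary seed of 123. The names `walk`, `step` and `position` are chosen here for illustration only.

```r
library(ggplot2)

# Fixing the seed makes rnorm() return the same values on every run;
# 123 is an arbitrary choice.
set.seed(123)
walk <- data.frame(step = 1:50, position = cumsum(rnorm(50)))

area_plot <- ggplot(walk, aes(x = step, y = position)) +
  geom_area(fill = "#142F86", alpha = 0.8)

print(area_plot)
```

Running the block twice should give the same picture, which helps when comparing styling changes.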

Customizing the area plot using ggplot2 and hrbrthemes libraries

A simple area chart, as shown above, doesn't look exciting, right? Let's put some life into our area chart by adding colors, fonts, styles, and themes.

For this, you have to install the ggplot2 and hrbrthemes packages.

To install ggplot2: install.packages('ggplot2')

To install hrbrthemes: install.packages('hrbrthemes')
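
If you are not sure which of these packages are already on your machine, a small check avoids reinstalling them. The sketch below is one possible approach. `missing_packages` is a hypothetical helper written for this example, not a function from any package.

```r
# Returns the names in `pkgs` that are not installed yet.
# `missing_packages` is a hypothetical helper, not part of any package.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("ggplot2", "hrbrthemes"))
print(to_install)

# Uncomment to install whatever is missing:
# install.packages(to_install)
```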

This plot includes a line and points over the area plot. The points and the line joining them make the individual values easier to read than in a plain area chart. Execute the code below to plot the customized area chart.


#loads the required visualization libraries
library(ggplot2)
library(hrbrthemes)

#creates the x and y data (cumulative sum of normal random values)
xdata<-1:50
ydata<-cumsum(rnorm(50))
#stores the data in a data frame
data<-data.frame(xdata,ydata)

#plots the area chart with theme, title and labels
#(linewidth needs ggplot2 >= 3.4.0; older versions use size for lines)
ggplot(data, aes(x=xdata, y=ydata))+
geom_area(fill='#142F86', alpha=1)+
geom_line(color='skyblue', linewidth=1)+
geom_point(size=1, color='blue')+
ggtitle("Area plot using ggplot2 in R")+
labs(x='Value', y='frequency')+
theme_ipsum()

Figure: Area plot using ggplot2 in R
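
If hrbrthemes is not available, the same layers can still be styled with one of ggplot2's built-in themes. The following sketch picks the theme at run time. The fallback to theme_minimal(), the seed and the object names are assumptions made for this example, not part of the original tutorial.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only for a repeatable picture
walk <- data.frame(xdata = 1:50, ydata = cumsum(rnorm(50)))

# Use theme_ipsum() when hrbrthemes is available, otherwise fall back
# to ggplot2's built-in theme_minimal().
plot_theme <- if (requireNamespace("hrbrthemes", quietly = TRUE)) {
  hrbrthemes::theme_ipsum()
} else {
  theme_minimal()
}

styled_plot <- ggplot(walk, aes(x = xdata, y = ydata)) +
  geom_area(fill = "#142F86", alpha = 0.8) +
  geom_line(color = "skyblue") +
  geom_point(size = 1, color = "blue") +
  labs(title = "Area plot using ggplot2 in R", x = "Value", y = "frequency") +
  plot_theme
```

Typing styled_plot at the console draws the chart.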

A basic stacked area plot using ggplot in R

A stacked area graph is a variant of the area graph that shows the behavior of multiple groups in a single chart, with each group's area stacked on top of the others.

For this, you need to install the dplyr package. To install dplyr, run the code below in RStudio.

install.packages('dplyr')

The code below plots a stacked area chart for seven groups.


#imports the libraries
library(ggplot2)
library(dplyr)

#creates the values and data frame (7 time points for each of 7 groups)
time<- as.numeric(rep(seq(1,7),each=7))
value<- runif(49,30,100)
group<- rep(LETTERS[1:7], times=7)
data1<-data.frame(time,value,group)

#plots the stacked area chart
ggplot(data1, aes(x=time, y=value, fill=group))+geom_area()

Figure: Stacked area graph
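
geom_area() stacks the groups by default, so at each time point the top edge of the chart sits at the sum of all the group values. The sketch below works that sum out with base R's aggregate() on data shaped like the example above. The seed and the name `stacked` are arbitrary choices for illustration.

```r
# Same shape of data as in the stacked example: 7 time points x 7 groups
set.seed(7)  # arbitrary seed
time  <- as.numeric(rep(seq(1, 7), each = 7))
value <- runif(49, 30, 100)
group <- rep(LETTERS[1:7], times = 7)
stacked <- data.frame(time, value, group)

# Height of the full stack at each time point = sum over the groups
totals <- aggregate(value ~ time, data = stacked, FUN = sum)
print(totals)
```

Since each value lies between 30 and 100, every total falls between 210 and 700, which is the range to expect on the y-axis of the stacked chart.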

Enhancing the area plot using the viridis library

Just as we enhanced the simple area chart in the section above, we are going to add fonts, colors, and styles to the stacked area chart, this time using viridis.

viridis is an R package that provides color scales for graphs. Its palettes are designed to stay readable for viewers with common forms of color blindness. To install the viridis package, run the code below in RStudio.

install.packages('viridis')


#imports the required libraries
library(ggplot2)
library(viridis)
library(hrbrthemes)

#creates the values and data frame
time <- as.numeric(rep(seq(1,7),each=7))
value <- runif(49, 10, 100)
group <- rep(LETTERS[1:7],times=7)
data <- data.frame(time, value, group)

#adds title, colors and styles to the plot
#(linewidth needs ggplot2 >= 3.4.0; older versions use size for outlines)
ggplot(data, aes(x=time, y=value, fill=group))+
     geom_area(linewidth=0.5, alpha=0.8, color='yellow')+
     scale_fill_viridis(discrete = TRUE)+
     theme_ipsum()+
     ggtitle("Customized area plot using viridis library")

Figure: Area plot using the viridis library in R
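
As an alternative sketch, ggplot2 itself ships a discrete viridis fill scale, scale_fill_viridis_d(), so a similar coloring can be had without attaching the viridis package. The data below mirrors the example above. The seed and the name `demo` are assumptions for illustration.

```r
library(ggplot2)

set.seed(11)  # arbitrary seed
demo <- data.frame(
  time  = rep(1:7, each = 7),
  value = runif(49, 10, 100),
  group = rep(LETTERS[1:7], times = 7)
)

# scale_fill_viridis_d() is part of ggplot2 and applies the viridis
# palette to a discrete fill aesthetic.
viridis_plot <- ggplot(demo, aes(x = time, y = value, fill = group)) +
  geom_area(alpha = 0.8) +
  scale_fill_viridis_d() +
  ggtitle("Stacked area plot with ggplot2's built-in viridis scale")

print(viridis_plot)
```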

Plotting the area chart using plotly library

plotly is an open-source library for creating interactive, visually appealing graphs with hover tooltips.

In this section, we are going to plot a stacked area chart of the popularity of American baby names over the years.

As you can see in the graph below, the result is appealing and smooth, and it comes with a legend. Say thanks to plotly.


#imports the required libraries
library(ggplot2)
library(hrbrthemes)
library(viridis)
library(babynames)
library(tidyverse)
library(plotly)

#creates the data frame with five female baby names
data<-babynames %>%
    filter(name %in% c('Margaret','Anna','Emma','Bertha','Sarah'))%>%
    filter(sex=='F')

#plots the stacked area chart with American baby names
p<-data%>%
     ggplot(aes(x=year, y=n, fill=name, text=name))+
     geom_area()+
     scale_fill_viridis(discrete = TRUE)+
     theme_ipsum()+
     ggtitle('Yearwise American baby names popularity')

#converts the ggplot object into an interactive plotly chart
ggplotly(p, tooltip='text')

Figure: Stacked area plot in R
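
The conversion step can be tried without the babynames data. The following is a minimal sketch on a small invented data frame. The `counts` values and series labels are made up for illustration, and no theme is applied.

```r
library(ggplot2)
library(plotly)

# Hypothetical yearly counts for two made-up series
counts <- data.frame(
  year  = rep(2000:2004, times = 2),
  n     = c(10, 12, 15, 11, 9, 5, 7, 6, 8, 10),
  label = rep(c("first", "second"), each = 5)
)

static_plot <- ggplot(counts, aes(x = year, y = n, fill = label)) +
  geom_area()

# ggplotly() converts the ggplot object into an interactive plotly widget
interactive_plot <- ggplotly(static_plot)
```

Typing interactive_plot in RStudio should show the chart in the Viewer pane, where hovering over an area displays its values.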

Plotting the multiple area graphs using facet_wrap()

Faceting splits a chart into one panel per data group, which makes the behavior of each group easier to see. In this case, the popularity of each baby name is shown in its own panel using the facet_wrap() function.

With the facet_wrap() function, you can create multiple plot panels in R from a single plot specification. This is a convenient way to show the behavior of various groups, as shown below.

Execute the code below to create a multiple-panel plot using the facet_wrap() function.


#loads the babynames data, filtered to five names with sex 'F'
data<-babynames %>%
     filter(name %in% c('Margaret','Anna','Emma','Bertha','Sarah'))%>%
     filter(sex=='F')

#plots the multiple area plots using the function facet_wrap()
data%>%
 ggplot(aes(x=year, y=n, group=name, fill=name))+
 geom_area()+
 scale_fill_viridis(discrete = TRUE)+
 ggtitle("Individual American names popularity - yearwise")+
 theme_ipsum()+
 theme(legend.position = "none", panel.spacing = unit(0.1, "lines"),
 strip.text.x = element_text(size = 6))+
 facet_wrap(~name, scales='free_y')

Figure: Multiple area plots in R
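
The scales argument of facet_wrap() decides whether the panels share a y-axis. The sketch below contrasts the two settings on invented data in which one series is about a hundred times larger than the other. The `series` data frame and its labels are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical data: two series on very different scales
series <- data.frame(
  year  = rep(1:10, times = 2),
  n     = c(seq(100, 1000, by = 100), seq(1, 10)),
  label = rep(c("large", "small"), each = 10)
)

base_plot <- ggplot(series, aes(x = year, y = n, fill = label)) +
  geom_area() +
  theme(legend.position = "none")

# Shared y-axis: the "small" panel is almost flat
fixed_panels <- base_plot + facet_wrap(~label)

# Independent y-axes: each panel uses its own range
free_panels <- base_plot + facet_wrap(~label, scales = "free_y")

print(free_panels)
```

Free y-axes make the shape of each series visible, at the cost of panels that can no longer be compared by height.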


Finding the age distribution of the US population between 1900 and 2002 using a stacked area plot in R

In this section, we are going to plot a stacked area chart that shows the age distribution of the population between the years 1900 and 2002.

For this, you have to install the gcookbook package, which includes the uspopage data. You can install it by running install.packages('gcookbook').

Execute the code below to plot the stacked area chart that shows the age distribution of the population.


#installs the required package
install.packages('gcookbook')

#imports the libraries
library(gcookbook)
library(ggplot2)

#inspects the structure of the data
str(uspopage)

#basic stacked area chart
ggplot(uspopage, aes(x=Year, y=Thousands, fill=AgeGroup))+geom_area()

#adds outlines and a blue palette, and reverses the legend order
#(linewidth needs ggplot2 >= 3.4.0; older versions use size for outlines)
ggplot(uspopage, aes(x=Year, y=Thousands, fill=AgeGroup))+
     geom_area(color='black', linewidth=0.3, alpha=1)+
     scale_fill_brewer(palette = 'Blues', breaks=rev(levels(uspopage$AgeGroup)))

#reverses the stacking order and the legend so that they match
ggplot(uspopage, mapping = aes(x=Year, y=Thousands, fill=AgeGroup))+
     geom_area(color='black', linewidth=0.5, alpha=1, position = position_stack(reverse = TRUE))+
     scale_fill_brewer(palette = 'Blues')+
     guides(fill=guide_legend(reverse = TRUE))

Figure: ggplot multiple plots
Figure: ggplot multiple area plots
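
Palette names passed to scale_fill_brewer() come from the RColorBrewer package and are case-sensitive, so it can help to check a name before using it. The sketch below looks a name up in RColorBrewer's brewer.pal.info table and pulls out eight shades of the "Blues" palette. The variable names are chosen for this example.

```r
library(RColorBrewer)

# brewer.pal.info lists every ColorBrewer palette; its row names are the
# exact, case-sensitive palette names.
palette_names <- rownames(brewer.pal.info)

print("Blues" %in% palette_names)  # capitalised name
print("blues" %in% palette_names)  # lower-case name

# Eight shades of the sequential "Blues" palette as hex color codes
blues8 <- brewer.pal(8, "Blues")
print(blues8)
```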

Proportional stacked area plot in R using the dplyr library

In a proportional stacked area chart, the value of each group is shown as a percentage of the total rather than as an absolute value, so the stacked areas always add up to 100%.

This makes the share of each group easy to read, and it can reveal patterns that the absolute values hide.

For this method, we first have to create an additional percentage column. To create it we use the dplyr library.

dplyr is an R package that provides tools for data manipulation.
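
The percentage column comes from dividing each value by the total of its group. Here is a minimal sketch of that step with dplyr's group_by() and mutate() on a small invented data frame. The `toy` data and its numbers are made up for illustration.

```r
library(dplyr)

# Hypothetical counts for two groups over two years
toy <- data.frame(
  Year      = c(2000, 2000, 2001, 2001),
  Group     = c("a", "b", "a", "b"),
  Thousands = c(30, 70, 20, 60)
)

toy_pct <- toy %>%
  group_by(Year) %>%                                        # one block per year
  mutate(percentage = Thousands / sum(Thousands) * 100) %>% # share within the year
  ungroup()

print(toy_pct)
```

In 2000 the shares are 30 and 70, and in 2001 they are 25 and 75, so each year adds up to 100.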

Execute the code below to plot the stacked area chart with the groups represented as percentages.


#installs the required package
install.packages('dplyr')
#imports the library
library(dplyr)

#groups the data by year and adds a column with the percentages
us_dplyr<-uspopage%>%
 group_by(Year)%>%
 mutate(percentage=Thousands/sum(Thousands)*100)
View(us_dplyr)

#plots the chart
#(linewidth needs ggplot2 >= 3.4.0; older versions use size for outlines)
ggplot(us_dplyr, aes(x=Year, y=percentage, fill=AgeGroup))+
 geom_area(color='black', linewidth=0.3, alpha=1)+
 scale_fill_brewer(palette = 'Blues', breaks=rev(levels(uspopage$AgeGroup)))

This is the data frame which shows the added ‘percentage’ column.

Figure: Proportional area plot in R
Figure: Stacked area plot in R using dplyr
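
As an alternative sketch, ggplot2 can do the normalization itself: geom_area(position = "fill") rescales every stack to a total of 1, so no percentage column is needed. The `shares` data below is invented for illustration, and the axis labels rely on label_percent() from the scales package.

```r
library(ggplot2)

# Hypothetical counts for three groups over five years
shares <- data.frame(
  year  = rep(2001:2005, times = 3),
  count = c(10, 12, 14, 16, 18, 20, 18, 16, 14, 12, 5, 5, 5, 5, 5),
  group = rep(c("x", "y", "z"), each = 5)
)

# position = "fill" rescales each stack to a total of 1, and
# scales::label_percent() prints the axis as percentages.
fill_plot <- ggplot(shares, aes(x = year, y = count, fill = group)) +
  geom_area(position = "fill", color = "black") +
  scale_y_continuous(labels = scales::label_percent())

print(fill_plot)
```

The explicit percentage column is still useful when the numbers themselves are needed, for example in a table.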


Summing up

Area charts, like line charts, are used to show how something evolves or behaves over time.

The ggplot2 package provides the geom_area() function to plot area charts.

Stacked area charts show several groups on one chart by stacking their values along a shared x-axis. They are among the most commonly used area chart types in analysis.

In this tutorial we have gone through several types of area charts, including stacked area charts. R has various packages, such as tidyverse, viridis, ggplot2, and hrbrthemes, that help enhance the visuals.

Hope you enjoyed the tutorial. For any queries, don't hesitate to use the comments section. Happy plotting!

Translated from: https://www.journaldev.com/39620/how-to-create-an-area-plot-in-r-using-ggplot2
