英国广播公司(British Broadcasting Corporation;BBC
)是全球最大的新闻媒体,其中各类新闻稿件采用的统计图表能很好地传达信息。为了方便清洗可重复数据和绘制图表,BBC数据团队用R对数据进行处理和可视化,经年累月下于去年整理绘图经验并开发了R包-bbplot
,帮助我们画出和BBC新闻中一样好看的图形。
加载需要的R包
使用pacman[1]
软件包中的p_load
函数通过以下代码一次性加载。
#安装pcaman软件包并对其他R包进行加载
if(!require(pacman))install.packages("pacman")
pacman::p_load('dplyr', 'tidyr', 'gapminder',
'ggplot2', 'ggalt',
'forcats', 'R.utils', 'png',
'grid', 'ggpubr', 'scales',
'bbplot')
安装bbplot软件包
bbplot不在CRAN
上,因此必须使用devtools
直接从Github
安装它(编程模板-R语言脚本写作:最简单的统计与绘图,包安装、命令行参数解析、文件读取、表格和矢量图输出)。
# install.packages('devtools')
devtools::install_github('bbc/bbplot')
下载软件包并成功安装后,就可以创建图表了( Science组合图表解读)。
bbplot软件包如何工作?
该软件包具有两个函数功能,bbc_style()
和finalise_plot()
。
bbc_style():没有参数,通常是将文本大小、字体和颜色,轴线,轴线文本,边距和许多其他标准图表组件转换为BBC样式。
对于折线图而言,折线的颜色或条形图的颜色,并不是从bbc_style()
函数中直接实现的,而是需要在其他标准ggplot(ggplot2高效实用指南 (可视化脚本、工具、套路、配色))图表函数中明确设置。
下面的代码显示了如何在标准图表制作工作流程中使用bbc_style()
。这是一个非常简单的折线图的示例,使用了gapminder
程序包中的数据。
#Data for chart from gapminder package
line_df %
filter(country == "Malawi")
#Make plot
line
这是bbc_style()
函数在后台实际执行的操作。它实质上修改了ggplot2
主题功能(ggplot2学习笔记之图形排列)中的某些参数。
例如,第一个参数是设置图标题元素的字体、大小、和字体颜色。
## function ()
## {
## font
通过向bbc_style()
函数中包含的主题添加额外的主题参数,例如添加一些网格线。
theme(panel.grid.major.x = element_line(color="#cbcbcb"),
panel.grid.major.y=element_blank())
保存完成的图表
finalise_plot()
是bbplot程序包的第二个函数。它能按照BBC图形的标准将标题和副标题左对齐,在绘图的右下角添加页脚,也可以在左下角添加来源。它还可以将图表保存到指定的位置。该函数有五个参数:
-
plot_name: the variable name that you have called your plot, for example for the chart example above
plot_name
would be"line"
-
source: the
source text
that you want to appear at the bottom left corner of your plot. You will need to type the word"Source:"
before it, so for examplesource = "Source: ONS"
would be the right way to do that. -
save_filepath:
the precise filepath
that you want your graphic to save to, including the.png
extension at the end. This does depend on your working directory and if you are in a specific R project. An example filepath would be:Desktop/R_projects/charts/line_chart.png
. -
width_pixels: this is set to
640px by default
, so only call this argument if you want the chart to have a different width, and specify what you want it to be. -
height_pixels: this is set to
450px by default
, so only call this argument if you want the chart to have a different height, and specify what you want it to be. -
logo_image_path: this argument specifies the path for the image/logo in the bottom right corner of the plot. The default is for
a placeholder PNG file
with a background that matches the background colour of the plot, so do not specify the argument if you want it to appear without a logo. If you want toadd your own logo
, just specify the path to your PNG file. The package has been prepared with a wide and thin image in mind.
在标准工作流程中使用finalise_plot()
的示例:
finalise_plot(plot_name = my_line_plot,
source = "Source: Gapminder",
save_filepath = "filename_that_my_plot_should_be_saved_to.png",
width_pixels = 640,
height_pixels = 450,
logo_image_path = "placeholder.png")
那么如何保存上面创建的示例图?
finalise_plot(plot_name = line,
source = "Source: Gapminder",
save_filepath = "images/line_plot_finalised_test.png",
width_pixels = 640,
height_pixels = 550)
!!前方高能!!一大波图即将“来袭”……
制作折线图
#准备数据
line_df %
filter(country == "China")
#作图
line
制作多条折线的图
#准备数据
multiple_line_df %
filter(country == "China" | country == "United States")
#作图
multiple_line
R语言 – 线图绘制
制作条形图
#准备数据
bar_df %
filter(year == 2007 & continent == "Africa") %>%
arrange(desc(lifeExp)) %>%
head(5)
#作图
bars
R语言 – 柱状图
制作堆叠条形图
#准备数据
stacked_df %
filter(year == 2007) %>%
mutate(lifeExpGrouped = cut(lifeExp,
breaks = c(0, 50, 65, 80, 90),
labels = c("Under 50", "50-65", "65-80", "80+"))) %>%
group_by(continent, lifeExpGrouped) %>%
summarise(continentPop = sum(as.numeric(pop)))
#set order of stacks by changing factor levels
stacked_df$lifeExpGrouped = factor(stacked_df$lifeExpGrouped, levels = rev(levels(stacked_df$lifeExpGrouped)))
#作图
stacked_bars
堆叠柱状图各成分连线画法:突出组间变化
制作分组条形图
只需要将position =“identity”更改为position =“dodge”:
#准备数据
grouped_bar_df %
filter(year == 1967 | year == 2007) %>%
select(country, year, lifeExp) %>%
spread(year, lifeExp) %>%
mutate(gap = `2007` - `1967`) %>%
arrange(desc(gap)) %>%
head(5) %>%
gather(key = year,
value = lifeExp,
-country,
-gap)
#画图
grouped_bars
是Excel的图,不!是R的图
制作哑铃图
library("ggalt")
library("tidyr")
#准备数据
dumbbell_df %
filter(year == 1967 | year == 2007) %>%
select(country, year, lifeExp) %>%
spread(year, lifeExp) %>%
mutate(gap = `2007` - `1967`) %>%
arrange(desc(gap)) %>%
head(10)
#作图
ggplot(dumbbell_df, aes(x = `1967`, xend = `2007`, y = reorder(country, gap), group = country)) +
geom_dumbbell(colour = "#dddddd",
size = 3,
colour_x = "#FAAB18",
colour_xend = "#1380A1") +
bbc_style() +
labs(title="We're living longer",
subtitle="Biggest life expectancy rise, 1967-2007")
制作直方图
hist_df %
filter(year == 2007)
ggplot(hist_df, aes(lifeExp)) +
geom_histogram(binwidth = 5, colour = "white", fill = "#1380A1") +
geom_hline(yintercept = 0, size = 1, colour="#333333") +
bbc_style() +
scale_x_continuous(limits = c(35, 95),
breaks = seq(40, 90, by = 10),
labels = c("40", "50", "60", "70", "80", "90 years")) +
labs(title = "How life expectancy varies",
subtitle = "Distribution of life expectancy in 2007")
对图例进行更改
去掉图例:
multiple_line + guides(colour=FALSE)
#or
multiple_line + theme(legend.position = "none")
改变图例位置:
multiple_line + theme(legend.position = "right")
改变坐标轴
翻转坐标轴:
bars
添加/删除网格线:
bars
人工更改轴间距:
使用scale_y_continuous
或scale_x_continuous
更改轴文本标签:
bars
在轴标签上添加千位分隔符
+ scale_y_continuous(labels = function(x) format(x, big.mark = ",",
scientific = FALSE))
将百分比符号添加到轴标签:
+ scale_y_continuous(labels = function(x) paste0(x, "%"))
构面
ggplot可以轻松创建多个小图表,这被称为构面。如果将需要可视化的数据按某个变量划分,则需要使用函数facet_wrap
或facet_grid
。
#准备数据
facet %
filter(continent != "Americas") %>%
group_by(continent, year) %>%
summarise(pop = sum(as.numeric(pop)))
#作图
facet_plot
可以尝试的参数实在是太多啦!大家可以试一试呀!
[1]:https://bbc.github.io/rcookbook/#how_to_create_bbc_style_graphics
撰文:May
编辑:生信宝典
服务器托管,北京服务器托管,服务器租用 http://www.fwqtg.net
机房租用,北京机房租用,IDC机房托管, http://www.e1idc.net