ggplot2 in R

The basic grammar to call ggplot2 in R is as the following form

ggplot(data=dfname, aes(x=varname1, y=varname2) ) +
geom_<geom type>( aes(blabla) ) +
theme()

The link between data and figure is established by specifying an “aethetic” (an aes object with some input arguments). The previous specification can be seen as base and later graphic elements will be overlapped to the base adopting or overwriteing the pevious aesthetic specification. For example, if you specify the data argument in the first ggplot function, you dont' have to specify data argument for geom_<geom type> again because of this automatic applying rule.

Unlike matplotlib in Python, ggplot2 uses every data record to specify the plotting element in the figure. Therefore, it is better to use long data format for easier plotting. If the data are in wide format, use R library reshape2 to convert the wide format to long format first. The following reshape2 example converts the wide format dataframe into long format dataframe selecting only var3 and var4 variables with var1 and var2 as ID variables.

library(reshape2)
longdata = melt( widedata, id.vars=c('var1','var2'), measure.vars=c('var3','var4'), 
           variable.name='var', value.name='value' )
longdata = melt( widedata, id.vars=1:2, measure.vars=3:4,
           variable.name='var', value.name='value' )

Another way is to use the library tidyr

library(tidyr)
longdata = gather( widedata, "var", "value", 

XY Plot

XY plot could be represented by markers, lines, bars, and etc.

Change Text

Title and XY Axis Label

Title and XY labels could be changed by adding

ggplot()+blablabla+
  xlab('X')+
  ylab('Y')+
  ggtitle('My Title')

Legend

When color and fill are specified for geom_line and geom_point aesthetic, a legend will be automatically linked to that specified variable and added to the plot to for guidance purpose. The title for the legend is automatically specified by the variable name and the legend labels are specified by the variable value by its original order. This can be modifed by adding scale element

ggplot(data=df)+geom_line( aes(x=var1, y=var2, color=var3) )+
  scale_color_discrete( name='Legend Title', labels=c('val1', 'val2', 'val3') )

Similarly, fill scale element could be specified by adding scale_fill_discrete element.

Output Graphics

R’s graphcs could be output to many types of device. By default, the plot is output to the R windows device or R Studio graphic device shown in the window. You may direct it to other graphic device. To output PNG file, you may create a plot first, create an output device, print it.

plt = ggplot()+blabla
devpng = png('plot/figure.png', width=10, height=5, units='in', res=100)
print(plt)
dev.off()

Remeber to turn the device off with dev.off() after using it otherwise later printing will overwrite the previous plot because PNG file can only has one page. PDF device could be output in mutiple times so a multiple page PDF will be created. To generate antialias high quality graphics, remember to add type='cairo' option.

devpng = png('plot/figure.png', width=10, height=5, units='in', res=100, type='cairo')
print(plt)
dev.off()

To output SVG format, use

devsvg = svg('plot/figure.svg', width=10, height=5 )
print(plt)
dev.off()

To output PDF format, use

devpdf = pdf('plot/figure.pdf', width=10, height=5 )
print(plt1)
print(plt2)
dev.off()