First, we make a simple interactive scatter plot of sepal length against sepal width. Hovering over a point shows the values of the variables and the color-coded species. The axis labels, legend title, and legend labels are the defaults.
library(plotly)
plot_ly(
  data = iris,
  x = ~Sepal.Length,          # Horizontal axis
  y = ~Sepal.Width,           # Vertical axis
  color = ~factor(Species),   # color by species (a categorical variable)
  type = "scatter",
  mode = "markers")
We can also add more information to the plot to enhance its interactivity. For example, we can
modify the point size using the value of a numerical variable;
add text to the hover text using the text or hovertext option to show the class label;
formulate the hover text using hovertemplate.
plot_ly(
  data = iris,
  x = ~Sepal.Length,          # Horizontal axis
  y = ~Sepal.Width,           # Vertical axis
  customdata = ~Petal.Width,  # extra variable made available to the hover template
  color = ~factor(Species),   # color by species
  hovertext = ~Species,       # show the species in the hover text
  marker = list(size = ~Petal.Length, sizeref = .05, sizemode = 'area'),
  alpha = 0.9,
  type = "scatter",
  mode = "markers",
  ## use the following hovertemplate to add the two petal
  ## variables to the hover text
  hovertemplate = paste('<b>Sepal Width</b>: %{y}',
                        '<br><b>Sepal Length</b>: %{x}',
                        '<br><b>Petal Length</b>: %{marker.size:,}',
                        '<br><b>Petal Width</b>: %{customdata}',
                        '<br><b>Species</b>: %{hovertext}',
                        "<extra></extra>"))
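The marker sizes above depend on sizeref and sizemode. A commonly quoted rule from the plotly documentation for sizemode = 'area' is sizeref = 2 * max(size) / (desired maximum diameter in pixels)^2. The sketch below applies that rule to Petal.Length; the helper name area_sizeref and the 20-pixel target are assumptions made for illustration and are not part of plotly.

```r
# Hypothetical helper (not a plotly function): scale factor for sizemode = "area"
area_sizeref <- function(size, max_px = 20) {
  2 * max(size, na.rm = TRUE) / max_px^2
}

# Scale factor so that the largest petal length maps to roughly 20 pixels
sizeref_petal <- area_sizeref(iris$Petal.Length, max_px = 20)
sizeref_petal
```

The result can then be passed on as marker = list(size = ~Petal.Length, sizeref = sizeref_petal, sizemode = 'area').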
Titles and axis labels are important in any visualization. To include a meaningful title, informative labels, and annotations in a plotly plot, we can use the layout() function. The following code only gives some design ideas you can use to enhance your plotly charts. The detailed list of configurations can be found on plotly's reference page.
plot_ly(
  data = iris,
  x = ~Sepal.Length,          # Horizontal axis
  y = ~Sepal.Width,           # Vertical axis
  color = ~factor(Species),   # color by species
  # text = ~Species,
  ## extra information to show in the hover text
  text = ~paste("Petal Length: ", Petal.Length,
                "<br>Petal Width: ", Petal.Width,
                "<br>Species: ", Species),
  ## hover template: the two sepal variables plus the text defined above
  hovertemplate = paste('<i><b>Sepal Width</b></i>: %{y}',
                        '<br><b>Sepal Length</b>: %{x}',
                        '<br>%{text}',
                        '<extra></extra>'),
  alpha = 0.6,
  marker = list(size = ~Petal.Length, sizeref = .05, sizemode = 'area'),
  type = "scatter",
  mode = "markers",
  ## graphic size
  width = 700,
  height = 500
) %>%
  layout(
    ### Title
    title = list(text = "Sepal Length vs Sepal Width",
                 font = list(family = "Times New Roman",  # HTML font family
                             size = 18,
                             color = "red")),
    ### legend
    legend = list(title = list(text = 'species',
                               font = list(family = "Courier New",
                                           size = 14,
                                           color = "green")),
                  bgcolor = "ivory",
                  bordercolor = "navy",
                  groupclick = "togglegroup",  # either "toggleitem" or "togglegroup"
                  orientation = "v"),          # sets the orientation of the legend
    ## margin of the plot
    margin = list(
      b = 100,
      l = 100,
      t = 100,
      r = 50),
    ## Background
    plot_bgcolor = '#f7f7f7',
    ## Axes labels
    xaxis = list(
      title = list(text = 'Sepal Length',
                   font = list(family = 'Arial')),
      zerolinecolor = 'red',
      zerolinewidth = 2,
      gridcolor = 'white'),
    yaxis = list(
      title = list(text = 'Sepal Width',
                   font = list(family = 'Arial')),
      zerolinecolor = 'purple',
      zerolinewidth = 2,
      gridcolor = 'white'),
    ## annotations
    annotations = list(
      x = 0.7,   # between 0 and 1. 0 = left, 1 = right
      y = 0.9,   # between 0 and 1. 0 = bottom, 1 = top
      font = list(size = 12,
                  color = "darkred"),
      text = "The point size is proportional to the petal length",
      xref = "paper",      # "container" spans the entire `width` of the plot.
                           # "paper" refers to the width of the plotting area only.
      yref = "paper",      # same as xref
      xanchor = "center",  # horizontal alignment with respect to its x position
      yanchor = "bottom",  # similar to xanchor
      showarrow = FALSE)
  )
We can also wrap the layout settings in a reusable function, just as we wrote a theme for regular ggplot. The following is an example.
myPlotlyLayout <- function(anyObjName){  # The first argument is required: it receives the
                                         # plotly object piped in by %>% and can have any name.
  layout(anyObjName,
    ### Title
    title = list(text = "Sepal Length vs Sepal Width",
                 font = list(family = "Times New Roman",  # HTML font family
                             size = 18,
                             color = "red")),
    ### legend
    legend = list(title = list(text = 'species',
                               font = list(family = "Courier New",
                                           size = 14,
                                           color = "green")),
                  bgcolor = "ivory",
                  bordercolor = "navy",
                  groupclick = "togglegroup",  # either "toggleitem" or "togglegroup"
                  orientation = "v"),          # sets the orientation of the legend
    ## margin of the plot
    margin = list(
      b = 120,
      l = 50,
      t = 120,
      r = 50),
    ## Background
    plot_bgcolor = '#f7f7f7',
    ## Axes labels
    xaxis = list(
      title = list(text = 'Sepal Length',
                   font = list(family = 'Arial')),
      zerolinecolor = 'red',
      zerolinewidth = 2,
      gridcolor = 'white'),
    yaxis = list(
      title = list(text = 'Sepal Width',
                   font = list(family = 'Arial')),
      zerolinecolor = 'purple',
      zerolinewidth = 2,
      gridcolor = 'white'),
    ## annotations
    annotations = list(
      x = 0.7,   # between 0 and 1. 0 = left, 1 = right
      y = 0.9,   # between 0 and 1. 0 = bottom, 1 = top
      font = list(size = 12,
                  color = "darkred"),
      text = "The point size is proportional to the petal length",
      xref = "paper",      # "container" spans the entire `width` of the plot.
                           # "paper" refers to the width of the plotting area only.
      yref = "paper",      # same as xref
      xanchor = "center",  # horizontal alignment with respect to its x position
      yanchor = "bottom",  # similar to xanchor
      showarrow = FALSE)
  )
}
plot_ly(
  data = iris,
  x = ~Sepal.Length,          # Horizontal axis
  y = ~Sepal.Width,           # Vertical axis
  color = ~factor(Species),   # color by species
  text = ~Species,            # show the species in the hover text
  ## use the following hovertemplate to build the hover text
  hovertemplate = paste('<i><b>Sepal Width</b></i>: %{y}',
                        '<br><b>Sepal Length</b>: %{x}',
                        '<br><b>Species</b>: %{text}',
                        '<extra></extra>'),
  alpha = 0.9,
  marker = list(size = ~Petal.Length, sizeref = .05, sizemode = 'area'),
  type = "scatter",
  mode = "markers",
  ## graphic size
  width = 700,
  height = 500) %>% myPlotlyLayout()
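A layout function becomes more reusable when the text it sets is passed in as an argument instead of being hard-coded. The following is only a minimal sketch of that idea; the function name titledLayout and its argument main are hypothetical and were chosen for illustration.

```r
library(plotly)

# Hypothetical helper: same idea as a layout "theme", but the title is an argument
titledLayout <- function(p, main = "") {
  layout(p,
         title = list(text = main,
                      font = list(size = 18, color = "red")),
         plot_bgcolor = "#f7f7f7")
}

fig <- plot_ly(data = iris,
               x = ~Sepal.Length,
               y = ~Sepal.Width,
               type = "scatter",
               mode = "markers") %>%
  titledLayout(main = "Sepal Length vs Sepal Width")
fig
```

Because the plotly object is the first argument, the function can be placed at the end of any pipe, exactly like layout() itself.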
As we did with base R and ggplot, we illustrate two ways to add images to plotly charts: inserting an image and setting an image background.
Inserting Images to plotly
The following example shows how to use the layout() function to insert an external image into a plotly scatter plot. Compared with the steps needed in base R and ggplot, the task is relatively straightforward and flexible in plotly. See the comments in the code for how to place the image in an appropriate location.
plot_ly(
  data = iris,
  x = ~Sepal.Length,          # Horizontal axis
  y = ~Sepal.Width,           # Vertical axis
  customdata = ~Petal.Width,
  color = ~factor(Species),   # color by species
  hovertext = ~Species,       # show the species in the hover text
  marker = list(size = ~Petal.Length, sizeref = .05, sizemode = 'area'),
  alpha = 0.9,
  type = "scatter",
  mode = "markers",
  ## use the following hovertemplate to add the two petal
  ## variables to the hover text
  hovertemplate = paste('<b>Sepal Width</b>: %{y}',
                        '<br><b>Sepal Length</b>: %{x}',
                        '<br><b>Petal Length</b>: %{marker.size:,}',
                        '<br><b>Petal Width</b>: %{customdata}',
                        '<br><b>Species</b>: %{hovertext}',
                        "<extra></extra>")) %>%
  layout(
    images = list(
      source = "",
      xref = "paper",   # without assuming a coordinate system, we can use
      yref = "paper",   # the paper size to set up a relative location
                        # to place an image.
                        # We can also use xref = "x domain", yref = "y domain"
                        # to set up a coordinate system to place an image.
      x = 0,            # value between 0 and 1: fraction of the width measured
                        # from the left-hand side (0) to the right-hand side (1)
      y = 1,            # x = 0, y = 1 ==> "topleft"
      sizex = .2,       # image size - horizontal
      sizey = .2,       # image size - vertical
      xanchor = "left", # which side of the image sits at (x, y)
      yanchor = "top",
      opacity = 0.6     # adjusting image opacity
    )
  )
Setting Image Background for plotly Charts
pal <- c("#332288", "#117733", "#882255")
pal <- setNames(pal, c("virginica", "setosa", "versicolor"))
plot_ly(
  data = iris,
  x = ~Sepal.Length,          # Horizontal axis
  y = ~Sepal.Width,           # Vertical axis
  customdata = ~Petal.Width,
  color = ~factor(Species),   # color by species
  colors = pal,               # custom color palette
  hovertext = ~Species,       # show the species in the hover text
  marker = list(size = ~Petal.Length, sizeref = .05, sizemode = 'area'),
  alpha = 0.9,
  type = "scatter",
  mode = "markers",
  ## use the following hovertemplate to add the two petal
  ## variables to the hover text
  hovertemplate = paste('<b>Sepal Width</b>: %{y}',
                        '<br><b>Sepal Length</b>: %{x}',
                        '<br><b>Petal Length</b>: %{marker.size:,}',
                        '<br><b>Petal Width</b>: %{customdata}',
                        '<br><b>Species</b>: %{hovertext}',
                        "<extra></extra>")
) %>%
  layout(
    images = list(
      # Add images
      source = "",
      xref = "x",          # position and size are given in data coordinates
      yref = "y",
      x = 4,
      y = 4.5,
      sizex = 7,
      sizey = 3,
      sizing = "stretch",  # stretch the image to fill the given box
      opacity = 0.5,
      layer = "below"      # draw the image underneath the points
    )
  )
When a data set involves a time variable, we can also use movement to represent the time variable. plot_ly() can create animated graphs. The following is an example using the built-in gapminder data set in the library gapminder that displays the relationship between life expectancy and GDP per capita of countries over time (every 5 years).
library(gapminder)
pal.IBM <- c("#332288", "#117733", "#0072B2","#D55E00", "#882255")
pal.IBM <- setNames(pal.IBM, c("Asia", "Europe", "Africa", "Americas", "Oceania"))
df <- gapminder
fig <- df %>%
  plot_ly(
    x = ~gdpPercap,
    y = ~lifeExp,
    size = ~(2*log(pop)-11)^2,
    color = ~continent,
    colors = pal.IBM,     # custom colors
    # marker = list(size = ~(log(pop)-10), sizemode = 'area'),
    frame = ~year,        # the time variable that drives the animation
    ## information to display in the hover text
    text = ~paste("Country:", country,
                  "<br>Continent:", continent,
                  "<br>Year:", year,
                  "<br>LifeExp:", lifeExp,
                  "<br>Pop:", pop,
                  "<br>gdpPerCap:", gdpPercap),
    hoverinfo = "text",
    type = 'scatter',
    mode = 'markers'
  )
fig <- fig %>% layout(
  xaxis = list(
    type = "log"          # GDP per capita on a log scale
  )
)
fig
We can also render a ggplot with ggplotly() to bring interactivity to the plot. The following is a custom theme for laying out ggplots.
library(ggplot2)
myplot.theme_new <- function() {
  theme(
    # ggplot margins
    plot.margin = margin(t = 50,  # Top margin
                         r = 30,  # Right margin
                         b = 30,  # Bottom margin
                         l = 30), # Left margin
    ## ggplot titles
    plot.title = element_text(face = "bold",
                              size = 12,
                              family = "sans",
                              color = "navy",
                              hjust = 0.5,      # left (0), right (1)
                              margin = margin(0, 0, 30, 0)),
    # add border 1)
    panel.border = element_rect(colour = NA,
                                fill = NA,
                                linetype = 2),
    # color background 2)
    panel.background = element_rect(fill = "#f6f6f6"),
    # modify grid 3)
    panel.grid.major.x = element_line(colour = 'white',
                                      linetype = 3,
                                      size = 0.5),
    panel.grid.minor.x = element_blank(),
    panel.grid.major.y = element_line(colour = 'white',
                                      linetype = 3,
                                      size = 0.5),
    panel.grid.minor.y = element_blank(),
    # modify text, axis, and color 4) and 5)
    axis.text = element_text(colour = "navy",
                             # face = "italic",
                             # family = "Times New Roman",
                             size = 7),
    axis.title = element_text(colour = "navy",
                              # family = "Times New Roman",
                              size = 7),
    axis.ticks = element_line(colour = "navy"),
    # legend at the bottom 6)
    legend.position = "bottom",
    legend.key.size = unit(0.6, 'cm'),     # change legend key size
    legend.key.height = unit(0.6, 'cm'),   # change legend key height
    legend.key.width = unit(0.6, 'cm'),    # change legend key width
    # legend.title = element_text(size=8), # change legend title font size
    legend.title = element_blank(),        # remove all legend titles
    legend.key = element_rect(fill = "white"),
    legend.text = element_text(size = 8))  # change legend text font size
}
The following plot uses the above theme and passes the correlation coefficient to the annotated text.
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  # aes(color = factor(Species)) +
  aes(label = Species, label1 = Petal.Length, label2 = Petal.Width) +
  ## The labels in the above aes() will be part of the hover text.
  geom_point(size = iris$Petal.Length, alpha = 0.7) +
  stat_smooth(method = lm, se = FALSE, size = 0.5) +  # add a linear regression line
  # scale_color_manual(values=c("dodgerblue4", "darkolivegreen4", "darkred")) +
  labs(
    x = "Sepal Length",
    y = "Sepal Width",
    title = "Association between Sepal Length and Width") +
  myplot.theme_new() +
  annotate(geom = "text",
           x = 6.8, y = 4.4,   # assumed position in data coordinates; adjust as needed
           label = paste("The Pearson correlation coefficient r = ",
                         round(cor(iris$Sepal.Length, iris$Sepal.Width), 3)),
           size = 2,
           color = "navy") +
  coord_fixed(1)   ## This changes the aspect ratio of the graph
p
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  # aes(color = factor(Species)) +
  # to add more information about the variables in the data set,
  # use labels to denote the variable names inside the function aes()
  aes(label = Species, label2 = Petal.Length, label3 = Petal.Width) +
  geom_point(size = iris$Petal.Width, alpha = 0.7) +
  stat_smooth(method = lm, se = FALSE, size = 0.3) +
  # scale_color_manual(values=c("dodgerblue4", "darkolivegreen4", "darkred")) +
  labs(
    x = "Sepal Length",
    y = "Sepal Width",
    title = "Association between Sepal Length and Width") +
  myplot.theme_new() +
  annotate(geom = "text",
           x = 6.8, y = 4.4,   # assumed position in data coordinates; adjust as needed
           label = paste("The Pearson correlation coefficient r = ",
                         round(cor(iris$Sepal.Length, iris$Sepal.Width), 3)),
           size = 2,
           color = "navy") +
  coord_fixed(1)   ## This changes the aspect ratio of the graph
ggplotly(p)        # render the ggplot as an interactive plotly chart
Remark: It turns out that ggplotly does not display colors properly after its recent updates. Hopefully, this issue will be fixed soon.
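The hover text produced by ggplotly() can be restricted to selected aesthetics through its tooltip argument. The sketch below is a stripped-down version of the plot above; the object names p_small and ip are assumptions made for illustration.

```r
library(ggplot2)
library(plotly)

# A minimal ggplot; the extra `label` aesthetic only feeds the hover text
p_small <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, label = Species)) +
  geom_point(alpha = 0.7)

# Show only the listed aesthetics in the hover text
ip <- ggplotly(p_small, tooltip = c("x", "y", "label"))
ip
```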
We will create a summarized data set to make bar plots. We define a data set that stores the mean of each of the four measurements by species. This can be done with the dplyr and tidyr packages; the code below uses the base R function aggregate().
barplotdata = aggregate(iris[,1:4], by = list(iris$Species), FUN = mean)
Group.1 | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
setosa | 5.006 | 3.428 | 1.462 | 0.246 |
versicolor | 5.936 | 2.770 | 4.260 | 1.326 |
virginica | 6.588 | 2.974 | 5.552 | 2.026 |
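For comparison, the same group means can be computed with dplyr. This is a sketch assuming dplyr version 1.0.0 or later (for across()); the object name means_by_species is hypothetical. The grouping column is called Species here rather than Group.1.

```r
library(dplyr)

# Mean of every numeric column within each species
means_by_species <- iris %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric), mean))
means_by_species
```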
Next, we draw a grouped bar chart.
plot_ly(
  data = barplotdata,
  x = ~Group.1,
  y = ~Sepal.Length,
  type = "bar",
  name = "sepal.length.avg",
  ## graphic size
  width = 700,
  height = 400) %>%
  add_trace(y = ~Sepal.Width, name = "sepal.width.avg") %>%
  add_trace(y = ~Petal.Length, name = "petal.length.avg") %>%
  add_trace(y = ~Petal.Width, name = "petal.width.avg") %>%
  layout(yaxis = list(title = "Mean"),
         xaxis = list(title = "Species"),
         title = "Group Means of Iris attributes",
         ## margin of the plot
         margin = list(
           b = 50,
           l = 100,
           t = 120,
           r = 50
         ))
We first define a subset of the iris data by keeping only the observations with a sepal length greater than 5. A pie chart is then created to show the distribution of species in this subset. Keep in mind that a pie chart is constructed from a frequency table.
# define a working data set: species of the flowers with sepal length > 5
subiris <- iris[iris$Sepal.Length > 5, 5]
## Create a frequency table in the form of a data frame.
## Taking the labels from the table keeps labels and counts aligned.
freq.table <- table(subiris)
piedata = data.frame(cate = names(freq.table),
                     freq = as.vector(freq.table))
# define a color vector
colors <- c('rgb(211,94,96)', 'rgb(128,133,133)', 'rgb(144,103,167)')
# make a pie chart
plot_ly(piedata,
        labels = ~cate,
        values = ~freq,
        type = 'pie',
        textposition = 'inside',
        textinfo = 'label+percent',
        insidetextfont = list(color = '#FFFFFF'),
        # hoverinfo = 'text',
        marker = list(colors = colors,
                      line = list(color = '#FFFFFF', width = 1)),
        # The 'pull' attribute can also be used to create space between the sectors
        showlegend = TRUE) %>%
  layout(title = 'Distribution of Species',
         xaxis = list(showgrid = FALSE, zeroline = FALSE,
                      showticklabels = FALSE),
         yaxis = list(showgrid = FALSE, zeroline = FALSE,
                      showticklabels = FALSE),
         ## margin of the plot
         margin = list(
           b = 50,
           l = 100,
           t = 120,
           r = 50
         ))
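The percentages shown inside the pie sectors can be cross-checked directly from the frequency table. The following is a small base R sketch; the object names freq and pct are assumptions made for illustration.

```r
# Species of the flowers with sepal length greater than 5
subiris <- iris$Species[iris$Sepal.Length > 5]

# Frequency table and the corresponding percentages
freq <- table(subiris)
pct <- 100 * prop.table(freq)
round(pct, 1)
```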
Histograms and density curves are used to display the distribution of numerical random variables. To compare the distributions of several random variables, we can overlay their histograms or density curves. We start with overlaid histograms.
plot_ly(
  data = iris,
  x = ~Sepal.Length,
  type = "histogram",
  nbinsx = 10,
  name = "sepal.length",
  alpha = .5,
  marker = list(line = list(color = "darkgray", width = 2))) %>%
  ## Adding additional histograms and overlaying them
  add_histogram(x = ~Sepal.Width,
                name = "sepal.width", nbinsx = 10, alpha = 0.5,
                marker = list(line = list(color = "darkgray", width = 2))) %>%
  add_histogram(x = ~Petal.Length,
                name = "petal.length", nbinsx = 10, alpha = 0.5,
                marker = list(line = list(color = "darkgray", width = 2))) %>%
  add_histogram(x = ~Petal.Width,
                name = "petal.width", nbinsx = 10, alpha = 0.5,
                marker = list(line = list(color = "darkgray", width = 2))) %>%
  layout(barmode = "overlay",
         title = "Histograms of Iris Attributes",
         xaxis = list(title = "Iris Attributes",
                      zeroline = TRUE),
         yaxis = list(title = "Count",
                      zeroline = TRUE),
         ## margin of the plot
         margin = list(
           b = 50,
           l = 100,
           t = 120,
           r = 50
         ))
The issue is that overlaid histograms are generally hard to distinguish when more than two distributions are compared. Ridgeline histograms can help in this situation. The following is an example.
library(ggridges)
ggplot(iris, aes(x = Sepal.Length, y = Species, group = Species, fill = Species)) +
  geom_density_ridges(stat = "binline", bins = 20, scale = 2.2) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_x_continuous(expand = c(0, 0)) +
  coord_cartesian(clip = "off") +
  theme_ridges()
It is relatively easy to use density curves to compare multiple distributions. Assume that we want to compare the distributions of the sepal length of the three iris species. One way to do this is to plot the three estimated density curves.
# define three densities
sepal.len.setosa <- iris[which(iris$Species == "setosa"),]
setosa <- density(sepal.len.setosa$Sepal.Length)
sepal.len.versicolor <- iris[which(iris$Species == "versicolor"),]
versicolor <- density(sepal.len.versicolor$Sepal.Length)
sepal.len.virginica <- iris[which(iris$Species == "virginica"),]
virginica <- density(sepal.len.virginica$Sepal.Length)
# plot density curves
fig <- plot_ly(x = ~virginica$x,
y = ~virginica$y,
type = 'scatter', #A character string specifying the trace type
mode = 'lines',
name = 'virginica',
fill = 'tozeroy') %>%
# adding more density curves
add_trace(x = ~versicolor$x,
y = ~versicolor$y,
name = 'versicolor',
fill = 'tozeroy') %>%
add_trace(x = ~setosa$x,
y = ~setosa$y,
name = 'setosa',
fill = 'tozeroy') %>%
layout(xaxis = list(title = 'Sepal Length'),
yaxis = list(title = 'Density'))
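The three density estimates above can also be computed in one step by splitting the sepal lengths by species. This is a base R sketch; the object name dens is an assumption made for illustration.

```r
# One kernel density estimate per species, stored in a named list
dens <- lapply(split(iris$Sepal.Length, iris$Species), density)
names(dens)

# Each element holds the evaluation grid (x) and the estimated density (y)
sapply(dens, function(d) length(d$x))
```

Each element of the list can then be added to the plot with add_trace(), for example in a loop over names(dens).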
The above overlaid density plots (with a certain level of transparency) are relatively easy to visualize.
ridgeDensity <- ggplot(iris, aes(x = Sepal.Length, y = Species)) +
geom_density_ridges() +
geom_density_interactive(aes(tooltip = interaction(Sepal.Length, Species),
data_id = interaction(Sepal.Length, Species)),
size = 1, hover_nearest = TRUE)
# girafe(ggobj = ridgeDensity)
Note: ridgeline plots do not work well with ggplotly for bringing interactivity to the plots. There are some workarounds, but none of them is of professional quality.
Drawing a boxplot is straightforward in plotly.
plot_ly(
  data = iris,
  y = ~Sepal.Length,
  x = ~Species,
  type = "box",
  color = ~Species,
  boxpoints = "all",   # show all observations next to the boxes
  boxmean = TRUE,      # mark the mean inside each box
  showlegend = FALSE) %>%
  layout(title = "Boxplots of Sepal Length by Species",
         xaxis = list(title = "Species",
                      zeroline = TRUE),
         yaxis = list(title = "Sepal Length",
                      zeroline = TRUE))
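The values summarized by each box can be checked numerically. The sketch below computes Tukey's five-number summary per species with base R; the object name box_stats is an assumption, and the hinges may differ slightly from the quartiles plotly draws because the two may use different quartile definitions.

```r
# Five-number summary (min, lower hinge, median, upper hinge, max) per species
box_stats <- sapply(split(iris$Sepal.Length, iris$Species), fivenum)
rownames(box_stats) <- c("min", "lower hinge", "median", "upper hinge", "max")
box_stats
```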
The non-interactive ggplot boxplot is given by
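library(dplyr)
library(tidyr)
summarized.iris = iris %>% select(-Species) %>%
  pivot_longer(cols = everything())   # long format with columns `name` and `value`
g.iris = ggplot(summarized.iris, aes(x = name, y = value, fill = name)) +
  geom_boxplot() +
  labs(
    x = "Measure Types",
    y = "Numerical Measures",
    title = "Distributions of the Four Iris Measurements")
g.iris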
ggplotly() adds interactivity to the plot, but cannot add colors at the moment.
summarized.iris = iris %>% select(-Species) %>%
  pivot_longer(cols = everything())   # long format with columns `name` and `value`
g.iris = ggplot(summarized.iris, aes(x = name, y = value)) +
  geom_boxplot() +
  labs(
    x = "Measure Types",
    y = "Numerical Measures",
    title = "Distributions of the Four Iris Measurements")
ggplotly(g.iris)
Visualizing time series is relatively easy because the objective is to inspect patterns such as trend, seasonality, and special shifts. These patterns assist in model identification: for example, determining the best length of history to use for forecasting, the type of exponential smoothing, and the orders of differencing, MA, and AR in the ARIMA framework.
stock <- read.csv('')
fig <- plot_ly(stock, type = 'scatter', mode = 'lines') %>%
add_trace(x = ~Date, y = ~AAPL.High) %>%
layout(showlegend = F,
title='Time Series with Rangeslider',
xaxis = list(rangeslider = list(visible = T))) %>%
layout(xaxis = list(zerolinecolor = 'blue',
zerolinewidth = 2,
gridcolor = '#ffffff'),
yaxis = list(zerolinecolor = '#ffffff',
zerolinewidth = 2,
gridcolor = '#fff'),
plot_bgcolor='#e5ecf6', width = 800, height = 400)
There are also other libraries one can use to produce interactive serial plots.
# This plot uses the functions hchart() and hcaes() from the library `highcharter`
library(highcharter)
hc <- stock %>%
  hchart("line",                            # "line" is an assumed chart type
         hcaes(x = Date, y = AAPL.High))
hc
The following interactive serial plot also includes forecasted values and the forecasting confidence band.
appl.high = stock$AAPL.High
# n= length(appl.high)
# plot(1:n, appl.high, type = 'l')
x <- forecast(ets(appl.high), h = 48)
hc <- hchart(x)
Several map libraries are available in R. In this example, we use the plot_geo() function from plotly to plot on a map.
## preparing data
poc <- read_csv("")[, c(7, 8, 9, 17)]
poc.sites <- poc[poc$POC == 1, ]   # `poc.sites` is a placeholder name; the original name is missing
# geo styling
geostyle <- list(scope = 'usa',
                 projection = list(type = 'albers usa'),
                 showland = TRUE,
                 landcolor = toRGB("lightblue"),
                 subunitcolor = toRGB("purple"),
                 countrycolor = toRGB("navy"),
                 countrywidth = 0.75,
                 subunitwidth = 0.5
)
## plotting map
fig <- plot_geo(poc.sites, lat = ~ycoord, lon = ~xcoord) %>%
  add_markers(text = ~SITE_DESCRIPTION,
              color = I("red"),     # I() uses the literal color instead of mapping it
              symbol = I("circle"),
              size = I(10),
              hoverinfo = "text") %>%
  layout(title = 'POC Risk Sites', geo = geostyle)
fig
This note focuses on using the plotly library and its dependencies to create various interactive plots. However, plotly is only one of the libraries that can produce interactive graphics. There are several other commonly used libraries with different strengths. Here are a few of them
