# Two scales in ggplot2

There’s situations where it makes sense to overlay two (seemingly unrelated) pieces of information in one plot (a.k.a. diagram/chart). There’s reasons *not* to do it (or at least apply caution) but anyways: this post describes *how to just do it* and how it compares to alternatives.

There has been a vivid discussion on stackoverflow on whether this is detrimental or legit and also on how to actually implement it (the solution given is not-so-very-obvious, but works like charm). The stackoverflow discussion is actually a great read for the most part (incl. a great alternative dummy facetting). I was curious about how to do it (based on the s-o solution; legitimacy aside) and also about the comparison of one diagram with two scales VS two diagrams above one another.

I start off with a simple (and purely fictional) data set: for five days, it tracks the number of interruptions VS productivity. Sure everyone can relate. The dataset `dt`

looks like this:

```
when numinter prod
1 2018-03-20 1 0.95
2 2018-03-21 5 0.50
3 2018-03-23 4 0.70
4 2018-03-24 3 0.75
5 2018-03-25 4 0.60
```

(`numinter`

is the number of interruptions and `prod`

the productivity in percent)

Taking this into ggplot starts quite straightforward:

```
ggplot() +
geom_bar(mapping = aes(x = dt$when, y = dt$numinter), stat = "identity", fill = "grey") +
geom_line(mapping = aes(x = dt$when, y = dt$prod*5), size = 2, color = "blue") +
scale_x_date(name = "Day", labels = NULL) +
# (more)
```

The `stat = "identity"`

part is required to convince `geom_bar`

to just take values as-is; otherwise, all the code does (so far) is specify colors to print and the caption of the x-axis. Hold the question about the `*5`

in the *line* specification for a minute.

The crucial line is the next one:

```
scale_y_continuous(name = "Interruptions/day",
sec.axis = sec_axis(~./5, name = "Productivity % of best",
labels = function(b) { paste0(round(b * 100, 0), "%")})) +
# (more)
```

This line defines essentially *all* the magic: it starts simple, by giving the caption for the left bar. The essential part is **sec.axis**: it expects a formula for the labels. Before, I multiplied all values by 5 (effectively, the right part’s upper bound was 5 times less than the left’s part - and multiplying by 5 ensures that both bar and line have really the same range). In order to represent the numbers right, I now *divide* by 5 again. Bringing `sec_axis`

in gives me a 2nd set of labels - and I can define how to print the labels in detail by specifying a `labels`

function.

All there’s left to do is brush up colors:

```
theme(
axis.title.y = element_text(color = "grey"),
axis.title.y.right = element_text(color = "blue"))
```

And that’s it, really. The result then looks like this:

Just for comparison, I’ve also followed dummy facetting outlined in above stackoverflow post and the ggplot2 wiki. Except for getting different scales (which is possible), it’s straightforward to do:

You can judge for yourself which one better transports the message (“Don’t disrupt people at work!”). Anyways: I’ve put up a gist with the full source code of both approaches. As ever, I hope it’s useful; feel free to try it - and let me know what you think!