14 2 Getting started with qplot
of object that is used to display the data. Some geoms have an associated
statistical transformation; for example, a histogram is a binning statistic plus
a bar geom. These different components are described in the next chapter.
Here we’ll introduce the most common and useful geoms, organised by the
dimensionality of data that they work with. The following geoms enable you
to investigate two-dimensional relationships:
• geom = "point" draws points to produce a scatterplot. This is the default
when you supply both x and y arguments to qplot().
• geom = "smooth" fits a smoother to the data and displays the smooth and
its standard error, §2.5.1.
• geom = "boxplot" produces a box-and-whisker plot to summarise the
distribution of a set of points, §2.5.2.
• geom = "path" and geom = "line" draw lines between the data points.
Traditionally these are used to explore relationships between time and
another variable, but lines may be used to join observations connected in
some other way. A line plot is constrained to produce lines that travel from
left to right, while paths can go in any direction, §2.5.5.
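The difference between the line and path geoms can be sketched with a short example. It assumes the economics data set that ships with ggplot2; the choice of variables is illustrative only.

```r
library(ggplot2)

# Line geom: observations are joined in order of the x variable (date),
# so the line always travels from left to right.
p_line <- qplot(date, unemploy / pop, data = economics, geom = "line")

# Path geom: observations are joined in the order they appear in the data
# (here, chronological order), so the path can double back on itself.
p_path <- qplot(unemploy / pop, uempmed, data = economics, geom = "path")
```

Printing p_line or p_path draws the corresponding plot.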
For 1d distributions, your choice of geoms is guided by the variable type:
• For continuous variables, geom = "histogram" draws a histogram, geom =
"freqpoly" a frequency polygon, and geom = "density" creates a density
plot, §2.5.3. The histogram geom is the default when you only supply an x
value to qplot().
• For discrete variables, geom = "bar" makes a bar chart, §2.5.4.
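As a minimal sketch of the geoms and defaults listed above, the following calls use the diamonds data set that ships with ggplot2. The variable choices (carat, price, color) are illustrative assumptions rather than the only sensible ones.

```r
library(ggplot2)

# Both x and y supplied, no geom: qplot() falls back to the point geom.
p_xy <- qplot(carat, price, data = diamonds)

# Only a continuous x supplied, no geom: qplot() falls back to a histogram.
p_hist <- qplot(carat, data = diamonds)

# Other 1d displays of a continuous variable, requested explicitly.
p_freq <- qplot(carat, data = diamonds, geom = "freqpoly")
p_dens <- qplot(carat, data = diamonds, geom = "density")

# A discrete variable displayed as a bar chart.
p_bar <- qplot(color, data = diamonds, geom = "bar")
```

Each object can be printed to draw the plot.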
2.5.1 Adding a smoother to a plot
If you have a scatterplot with many data points, it can be hard to see exactly
what trend is shown by the data. In this case you may want to add a smoothed
line to the plot. This is easily done using the smooth geom as shown in
Figure 2.4. Notice that we have combined multiple geoms by supplying a
vector of geom names created with c(). The geoms will be overlaid in the
order in which they appear.
qplot(carat, price, data = dsmall, geom = c("point", "smooth"))
qplot(carat, price, data = diamonds, geom = c("point", "smooth"))
Despite overplotting, our impression of an exponential relationship between
price and carat was correct. There are few diamonds bigger than three carats,
and our uncertainty in the form of the relationship increases as illustrated
by the point-wise confidence interval shown in grey. If you want to turn the
confidence interval off, use se = FALSE.
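A minimal sketch of turning the confidence interval off follows. The definition of dsmall here is an assumption (a random sample of 100 diamonds with an arbitrary seed), included only so the example is self-contained.

```r
library(ggplot2)

# Assumption: dsmall is a random sample of 100 rows of diamonds.
set.seed(1410)
dsmall <- diamonds[sample(nrow(diamonds), 100), ]

# se = FALSE suppresses the grey confidence band around the smooth.
# Recent ggplot2 versions may warn that the point layer ignores `se`,
# because qplot() passes extra arguments to every geom.
p_no_se <- qplot(carat, price, data = dsmall,
                 geom = c("point", "smooth"), se = FALSE)
```

Printing p_no_se draws the points and the smooth curve without the interval.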
There are many different smoothers you can choose between by using the
method argument: