A Continuous Variable Can Not Be Mapped to Shape
Introduction
The ggplot2 package is a relatively novel approach to generating highly informative publication-quality graphics. The "gg" stands for "Grammar of Graphics". In short, instead of thinking about a single function that produces a plot, ggplot2 uses a "grammar" approach, akin to building more and more complex sentences to layer on more information or nuance.
Data Model
The ggplot2 package assumes that data are in the form of a data.frame. In some cases, the data will need to be manipulated into a form that matches the assumptions that ggplot2 uses. In particular, if one has a matrix of numbers associated with different subjects (samples, people, etc.), the data will usually need to be transformed into a "long" data frame.
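As a rough illustration of that reshaping step, the sketch below turns a small made-up matrix into a "long" data frame using only base R. The matrix values and the names `wide`, `long`, `subject`, `measure`, and `value` are hypothetical and chosen for this example; other tools (for instance the tidyr package) can do the same job.

```r
# Hypothetical example data: 3 subjects (rows) by 2 measurements (columns).
wide <- matrix(
  c(5.1, 4.9, 6.2,   # first column: "length"
    3.5, 3.0, 2.9),  # second column: "width"
  nrow = 3,
  dimnames = list(subject = c("s1", "s2", "s3"),
                  measure = c("length", "width"))
)

# as.table() followed by as.data.frame() gives one row per
# (subject, measure) pair, which is the "long" layout ggplot2 expects.
long <- as.data.frame(as.table(wide), responseName = "value",
                      stringsAsFactors = FALSE)
long
```

Each row of `long` now describes a single observation, so columns such as `measure` can be mapped to aesthetics like color.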
Getting started
To use the ggplot2 package, it must be installed and loaded. Assuming that installation has been done already, we can load the package directly:
library(ggplot2)
Playing with ggplot2
mtcars data
We are going to use the mtcars dataset, included with R, to experiment with ggplot2.
data(mtcars)
- Exercise: Explore the mtcars dataset using View, summary, dim, class, etc.
We can also take a quick look at the relationships between the variables using the pairs plotting function.
pairs(mtcars)
That is a useful view of the data. We want to use ggplot2 to make an informative plot, so let's approach this in a piecewise fashion. We first need to determine what type of plot to produce and what our basic variables will be. In this case, we have a number of choices.
ggplot(mtcars,aes(x=disp,y=hp))
First, a little explanation is necessary. The ggplot function takes as its first argument a data.frame. The second argument is the "aesthetic", aes. The x and y take column names from the mtcars data.frame and will form the basis of our scatter plot.
But why did we get that "Error: No layers in plot"? (Newer ggplot2 releases draw an empty panel instead of raising this error.) Remember that ggplot2 is a "grammar of graphics". We supplied a subject, but no verb (called a layer by ggplot2). So, to generate a plot, we need to supply a verb. There are many possibilities. Each "verb" or layer typically starts with "geom" followed by a descriptor. An example is necessary.
ggplot(mtcars,aes(x=disp,y=hp)) + geom_point()
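Because each layer is attached with `+`, more than one "verb" can be chained onto the same "sentence". The sketch below is one possible illustration, not part of the original walkthrough: it adds a straight-line fit on top of the points using `geom_smooth`. The object name `p` is just a convenience for this example.

```r
library(ggplot2)

# Same "subject" as before; each `+` adds another layer ("verb").
p <- ggplot(mtcars, aes(x = disp, y = hp)) +
  geom_point() +                          # layer 1: the raw points
  geom_smooth(method = "lm", se = FALSE)  # layer 2: a fitted straight line

print(p)
```

Both layers inherit the x and y mappings given in the call to `ggplot`, so neither needs its own `aes`.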
We finally produced a plot. The power of ggplot2, though, is the ability to make very rich plots by adding "grammar" to the "plot sentence". We have a number of other variables in our mtcars data.frame. How can we add another value to a two-dimensional plot?
ggplot(mtcars,aes(x=disp,y=hp,color=wt)) + geom_point()
The color of the points is based on the numeric variable wt, the weight of the car. Can we do more? We can change the size of the points, also.
ggplot(mtcars,aes(x=disp,y=hp,color=wt,size=mpg)) + geom_point()
So, on our 2D plot, we are now plotting four variables. Can we do more? We can manipulate the shape of the points in addition to the color and the size.
ggplot(mtcars,aes(x=disp,y=hp)) + geom_point(aes(size=mpg,color=wt,shape=cyl))
Why did we get that error? ggplot2 is trying to be helpful by telling us that a "continuous variable cannot be mapped to 'shape'". Well, in our mtcars data.frame, we can look at cyl in detail.
class(mtcars$cyl)
## [1] "numeric"
summary(mtcars$cyl)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   4.000   4.000   6.000   6.188   8.000   8.000
table(mtcars$cyl)
##
##  4  6  8
## 11  7 14
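One hedged way to see how few distinct values cyl takes compared with the other columns is to count the unique values in every column. The helper name `n_distinct` below is made up for this sketch.

```r
# Number of distinct values in each mtcars column, sorted ascending.
# Columns with only a handful of values often behave like categories
# even though R stores them as numbers.
n_distinct <- sapply(mtcars, function(col) length(unique(col)))
sort(n_distinct)
```

Columns near the start of the sorted result, such as cyl, are the ones most likely to cause trouble with aesthetics like shape.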
The cyl variable is "kinda" continuous in that it is numeric, but it could also be thought of as a "category" of engines. R has a specific data type for "category" data, called a factor. We can easily convert the cyl column to a factor like so:
mtcars$cyl = as.factor(mtcars$cyl)
Now, we can go ahead with our previous approach to make a 2-dimensional plot that displays the relationships between five variables.
ggplot(mtcars,aes(x=disp,y=hp)) + geom_point(aes(size=mpg,colour=wt,shape=cyl))
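If you would rather not modify the data frame, a common alternative is to do the conversion inside the aesthetic mapping itself. The sketch below assumes a fresh, unmodified copy of mtcars (named `cars` here purely for illustration) and wraps cyl in `factor()` within `aes`.

```r
library(ggplot2)

# Start from a fresh copy so this sketch does not depend on the
# conversion above having been run; cyl is still numeric here.
cars <- datasets::mtcars

p <- ggplot(cars, aes(x = disp, y = hp)) +
  geom_point(aes(size = mpg, color = wt, shape = factor(cyl))) +
  labs(shape = "cyl")  # legend title would otherwise read "factor(cyl)"

print(p)
```

The trade-off is that the conversion has to be repeated in every plot that needs it, whereas converting the column once keeps later calls shorter.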
NYC Flight data
I leave this section open-ended for you to explore further options with the ggplot2 package. The data represent the on-time data for all flights that departed New York City in 2013.
library(nycflights13)
head(flights)
## # A tibble: 6 × 19
##    year month   day dep_time sched_dep_time dep_delay arr_time
##   <int> <int> <int>    <int>          <int>     <dbl>    <int>
## 1  2013     1     1      517            515         2      830
## 2  2013     1     1      533            529         4      850
## 3  2013     1     1      542            540         2      923
## 4  2013     1     1      544            545        -1     1004
## 5  2013     1     1      554            600        -6      812
## 6  2013     1     1      554            558        -4      740
## # ... with 12 more variables: sched_arr_time <int>, arr_delay <dbl>,
## #   carrier <chr>, flight <int>, tailnum <chr>, origin <chr>, dest <chr>,
## #   air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>,
## #   time_hour <dttm>
Feel free to explore. Consider using other "geoms" during your exploration.
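As one possible starting point (a sketch, not a prescribed analysis), the code below uses `geom_boxplot` to compare departure delays across the three origin airports. The name `delayed` is hypothetical; rows with a missing `dep_delay` are dropped first so the plot does not warn about them.

```r
library(ggplot2)
library(nycflights13)

# Keep only flights that have a recorded departure delay.
delayed <- flights[!is.na(flights$dep_delay), ]

# One box per origin airport, showing the spread of departure delays.
p <- ggplot(delayed, aes(x = origin, y = dep_delay)) +
  geom_boxplot()

print(p)
```

Swapping `geom_boxplot()` for another geom, or mapping `carrier` to color, is an easy next step.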
Session Info
sessionInfo()
## R Under development (unstable) (2016-10-26 r71594)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: macOS Sierra 10.12.1
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  base
##
## other attached packages:
## [1] nycflights13_0.2.0 ggplot2_2.2.0      BiocStyle_2.3.15
## [4] knitr_1.15.1
##
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.8.2      codetools_0.2-15   assertthat_0.1
##  [4] digest_0.6.10      rprojroot_1.1      plyr_1.8.4
##  [7] grid_3.4.0         gtable_0.2.0       backports_1.0.4
## [10] magrittr_1.5       evaluate_0.10      scales_0.4.1
## [13] stringi_1.1.2      lazyeval_0.2.0     rmarkdown_1.2.9000
## [16] labeling_0.3       tools_3.4.0        stringr_1.1.0
## [19] munsell_0.4.3      yaml_2.1.14        colorspace_1.3-1
## [22] htmltools_0.3.5    tibble_1.2         methods_3.4.0
Source: https://seandavi.github.io/hour_of_code/ggplot2.html