I've released four new data packages to CRAN: babynames, fueleconomy, nasaweather and nycflights13. The goal of these packages is to provide some interesting, and relatively large, datasets to demonstrate various data analysis challenges in R. The package source code (on GitHub, linked above) is fully reproducible so that you can see some data tidying in action, or make your own modifications to the data.

Below, I've listed the primary dataset found in each package. Most packages also include a number of supplementary datasets that provide additional information.

- babynames::babynames: US baby name data for each year from 1880 to 2013: the number of children of each sex given each name. All names used 5 or more times are included. (Source: Social Security Administration)

- fueleconomy::vehicles: Fuel economy data for all cars sold in the US from 1984 to 2015. (Source: Environmental Protection Agency)

- nasaweather::atmos: Data from the 2006 ASA data expo. Contains monthly atmospheric measurements from Jan 1995 to Dec 2000 on a 24 x 24 grid over Central America.

- nycflights13::flights: This package contains information about all flights that departed from NYC (i.e., EWR, JFK and LGA) in 2013: 336,776 flights with 16 variables. To help understand what causes delays, it also includes a number of other useful datasets: weather, planes, airports, airlines. (Source: Bureau of Transportation Statistics)

NB: since the datasets are large, I've tagged each data frame with the tbl_df class. If you don't use dplyr, this has no effect. If you do use dplyr, this ensures that you won't accidentally print thousands of rows of data. Instead, you'll just see the first 10 rows and as many columns as will fit on screen. This makes interactive exploration much easier. The printed flights data, for example, starts with these columns:

#> year month day dep_time dep_delay arr_time arr_delay carrier tailnum
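The printing behaviour described in the NB can be checked with a short session. This is a minimal sketch, assuming nycflights13 and dplyr are both installed from CRAN; the row count comes from the description above, while the number of columns may differ in later releases of the package, so it is not asserted here. The `count()` call is only an illustrative query, not something the packages require.

```r
# Assumption: nycflights13 and dplyr are installed, e.g. via
# install.packages(c("nycflights13", "dplyr"))
library(dplyr)
library(nycflights13)

# The data frame carries the tbl_df class in addition to data.frame
is_tbl <- inherits(flights, "tbl_df")
is_df  <- inherits(flights, "data.frame")

# One row per flight that departed from NYC in 2013
n_flights <- nrow(flights)

# With dplyr loaded, printing shows only the first 10 rows and as many
# columns as fit on screen, rather than every row
print(flights)

# Illustrative query: number of departures per origin airport
by_origin <- count(flights, origin)
print(by_origin)
```

Without dplyr loaded, `flights` still behaves like an ordinary data frame, so base R functions such as `nrow()` and `summary()` work unchanged.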