library(dplyr)
library(tidyr)
library(lubridate)
library(frenchdata)
library(ggplot2)
The frenchdata
package provides functions to download the finance data sets provided by the data library website of Prof. Kenneth French.
We can visit the website using the function:
browse_french_site()
this will open the page on your default browser.
As a first step we get the list of data sets that are currently available to download:
<- get_french_data_list() data_sets
The get_french_data_list()
function returns an object of the S3 class french_data_list
:
data_sets#>
#> ── Kenneth's French data library
#> ℹ Information collected from: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html on Wed Sep 08 20:48:59 2021
#>
#> ── Files list
#> # A tibble: 297 × 3
#> name file_url details_url
#> <chr> <chr> <chr>
#> 1 Fama/French 3 Factors ftp/F-F… Data_Libra…
#> 2 Fama/French 3 Factors [Weekly] ftp/F-F… Data_Libra…
#> 3 Fama/French 3 Factors [Daily] ftp/F-F… Data_Libra…
#> 4 Fama/French 5 Factors (2x3) ftp/F-F… Data_Libra…
#> 5 Fama/French 5 Factors (2x3) [Daily] ftp/F-F… Data_Libra…
#> 6 Portfolios Formed on Size ftp/Por… Data_Libra…
#> 7 Portfolios Formed on Size [ex.Dividends] ftp/Por… Data_Libra…
#> 8 Portfolios Formed on Size [Daily] ftp/Por… Data_Libra…
#> 9 Portfolios Formed on Book-to-Market ftp/Por… Data_Libra…
#> 10 Portfolios Formed on Book-to-Market [ex. Dividends] ftp/Por… Data_Libra…
#> # … with 287 more rows
To download a data set we need to identify its name first from the list data_sets
obtained previously. A simple way of performing this search is to use the View()
function provided by RStudio on the files_list
element that contains the the files list:
View(data_sets$files_list)
and use the filter to narrow down the search, for instance if I’m looking for the 5 factors I can quickly identify the correct data set name.
To download a specific data, for instance the monthly Fama/French 5 Factors (2x3) set we can use:
<-
ff_5_factors download_french_data("Fama/French 5 Factors (2x3)")
#> New names:
#> * `` -> ...1
#> New names:
#> * `` -> ...1
This will return an object of class french_dataset
:
ff_5_factors#>
#> ── Kenneth's French data set
#> ℹ This file was created by CMPT_ME_BEME_OP_INV_RETS using the 202106 CRSP database. The 1-month TBill return is from Ibbotson and Associates Inc.
#>
#> Information collected from: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_5_Factors_2x3_CSV.zip on Wed Sep 08 20:57:17 2021
#>
#> ℹ For details on the data set call the function `browse_details_page()` on this object
#>
#> ── Subsets in the file:
#> # A tibble: 2 × 2
#> name data
#> <chr> <list>
#> 1 "" <spec_tbl_df [696 × 7]>
#> 2 "Annual Factors: January-December" <spec_tbl_df [57 × 7]>
The file contains two subsets of data:
$subsets
ff_5_factors#> # A tibble: 2 × 2
#> name data
#> <chr> <list>
#> 1 "" <spec_tbl_df [696 × 7]>
#> 2 "Annual Factors: January-December" <spec_tbl_df [57 × 7]>
The first one is the default one and is unnamed, and the second one is on “Annual Factors: January-December”. We can access the data on the first subset:
<- ff_5_factors$subsets$data[[1]]
monthly_ff_5_factors
monthly_ff_5_factors#> # A tibble: 696 × 7
#> date `Mkt-RF` SMB HML RMW CMA RF
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 196307 -0.39 -0.45 -0.94 0.66 -1.15 0.27
#> 2 196308 5.07 -0.82 1.82 0.4 -0.4 0.25
#> 3 196309 -1.57 -0.48 0.17 -0.76 0.24 0.27
#> 4 196310 2.53 -1.3 -0.04 2.75 -2.24 0.29
#> 5 196311 -0.85 -0.85 1.7 -0.45 2.22 0.27
#> 6 196312 1.83 -1.9 -0.06 0.07 -0.3 0.29
#> 7 196401 2.24 0.08 1.53 0.22 1.5 0.3
#> 8 196402 1.54 0.31 2.86 0.06 0.85 0.26
#> 9 196403 1.41 1.4 3.37 -2.01 2.93 0.31
#> 10 196404 0.1 -1.5 -0.66 -1.35 -1.08 0.29
#> # … with 686 more rows
We can now browse use this data.frame directly, for instance, to plot all each of the factors and the risk-free rate overtime:
%>%
monthly_ff_5_factors mutate(date = ym(date)) %>%
pivot_longer(cols = -date,
names_to = "factor",
values_to = "value") %>%
ggplot(data = .,
mapping = aes(x = date, y = value,
group = factor,
color = factor)) +
geom_line() +
labs(caption = "Source: Kenneth French Data Library") +
facet_wrap( ~ factor)
We can browse the details page for this specific data set using:
browse_details_page(ff_5_factors)
We can also save the original uncompressed file when downloading and reading it, by specifying a valid path name:
<-
ff_5_factors download_french_data("Fama/French 5 Factors (2x3)",
dir = ".",
dest_file = "fama_french_5_factors.zip")
This will download the file, read it, assign the result to the object ff_5_factors
, and save the results to the file fama_french_5_factors.zip
on the current directory.
If we prefer to use the original file name just leave the dest_file
parameter unset:
<-
ff_5_factors download_french_data("Fama/French 5 Factors (2x3)",
dir = ".")