Package BatchGetSymbols is soft-deprecated in favor of yfR. See this Readme.md for my motivation in writing a new R package. In practice, this means that BatchGetSymbols is no longer maintained beyond the correction of major bugs. All efforts go to the development of yfR.
While BatchGetSymbols will be available on CRAN in the near future, my plan is to remove it from CRAN and archive it on Github once yfR becomes more stable.
BatchGetSymbols is an R package for large-scale download of financial data from Yahoo Finance. Based on a set of tickers and date ranges, the package will download and organize the financial data in the tidy/long format.
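As a rough illustration of what "tidy/long format" means here, the sketch below builds a small data frame by hand. The prices are made up, and the column names ticker and ref.date are assumptions for the example (only price.adjusted appears elsewhere in this documentation); it does not call the package or download anything.

```r
# Made-up prices, for illustration only; not real Yahoo Finance data.
# Column names 'ticker' and 'ref.date' are assumed for this sketch.
df.long <- data.frame(
  ticker = rep(c("AAA", "BBB"), each = 3),
  ref.date = rep(as.Date(c("2020-01-02", "2020-01-03", "2020-01-06")), times = 2),
  price.adjusted = c(10.0, 10.5, 10.2, 50.0, 49.0, 51.0),
  stringsAsFactors = FALSE
)

# One row per ticker-date pair: adding assets adds rows, not columns.
print(df.long)

# Per-ticker operations reduce to a split over the ticker column.
last.price <- sapply(split(df.long$price.adjusted, df.long$ticker),
                     function(p) p[length(p)])
print(last.price)
```

Keeping every asset in the same three columns is what makes grouped operations such as the split above straightforward.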
Yahoo Finance data is far from perfect or reliable, especially for individual stocks. In my experience, using it for research code with stock indices is fine and I can match it with other data sources. But adjusted stock prices for individual assets are messy, as stock events such as splits or dividends are not properly registered. I was never able to match them with other data sources. My advice is to never use the data of individual stocks in production.
Since version 2.6, the cache system is session-persistent by default: cached data lasts only for the current R session, so whenever you restart R, you lose all your cached data. This is a safeguard against mismatching prices due to corporate events.
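The session-scoped behaviour follows from where the default cache folder lives. A minimal sketch in base R, assuming only the default path file.path(tempdir(), 'BGS_Cache') given in this documentation (the variable names are illustrative):

```r
# Default cache location named in the docs: a folder inside the session temp dir.
cache.dir <- file.path(tempdir(), "BGS_Cache")
dir.create(cache.dir, showWarnings = FALSE, recursive = TRUE)

# tempdir() is created per R session and is normally removed when the
# session ends, so anything stored below it does not survive a restart.
is.session.scoped <- startsWith(
  normalizePath(cache.dir, winslash = "/"),
  normalizePath(tempdir(), winslash = "/")
)
print(is.session.scoped)
```

Pointing the cache at a permanent folder instead would keep files across sessions, at the cost of possibly reusing prices that were adjusted before a later split or dividend.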
# CRAN (official release)
install.packages('BatchGetSymbols')

# Github (dev version)
devtools::install_github('msperlin/BatchGetSymbols')

See vignette.
cache.dir = file.path(tempdir(), 'BGS_Cache'). This solves the problem with mismatching price series from cached data between splits or dividends. A new warning is set whenever the user sets cache.dir to something other than tempdir().
Small update:
Major update:
GetFTSE100Stocks.Rd
This function scrapes the stocks that constitute the FTSE100 index from the Wikipedia page at <https://en.wikipedia.org/wiki/FTSE_100_Index#List_of_FTSE_100_companies>.
Use cache system? (default = TRUE)
Where to save cache files? (default = file.path(tempdir(), 'BGS_Cache') )
A dataframe that includes a column with the list of tickers of companies that belong to the FTSE100 index
if (FALSE) {
  df.FTSE100 <- GetFTSE100Stocks()
  print(df.FTSE100$tickers)
}
GetIbovStocks.Rd
This function scrapes the stocks that constitute the Ibovespa index from the Bovespa website at http://bvmf.bmfbovespa.com.br/indices/ResumoCarteiraTeorica.aspx?Indice=IBOV&idioma=pt-br.
Use cache system? (default = TRUE)
Where to save cache files? (default = file.path(tempdir(), 'BGS_Cache') )
Maximum number of attempts to download the data
A dataframe that includes a column with the list of tickers of companies that belong to the Ibovespa index
if (FALSE) {
  df.ibov <- GetIbovStocks()
  print(df.ibov$tickers)
}
GetSP500Stocks.Rd
This function scrapes the stocks that constitute the SP500 index from the Wikipedia page at https://en.wikipedia.org/wiki/List_of_S
Use cache system? (default = TRUE)
Where to save cache files? (default = file.path(tempdir(), 'BGS_Cache') )
A dataframe that includes a column with the list of tickers of companies that belong to the SP500 index
if (FALSE) {
  df.SP500 <- GetSP500Stocks()
  print(df.SP500$tickers)
}
calc.ret.Rd
Created so that a return column is added to a dataframe with prices in the long (tidy) format.
Price vector
Ticker of symbols (useful if working with a long dataframe)
Type of price return to calculate: 'arit' (default) - arithmetic, 'log' - log returns.
A vector of returns
P <- c(1,2,3)
R <- calc.ret(P)
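To make the difference between the two return types concrete, here is a minimal base-R sketch. The helper simple.ret is hypothetical and is not the package's calc.ret(); it only assumes the usual definitions, P[t]/P[t-1] - 1 for arithmetic returns and log(P[t]/P[t-1]) for log returns, computed separately for each ticker.

```r
# Hypothetical helper, not the package's calc.ret(): returns per ticker.
simple.ret <- function(P, tickers = rep("ticker", length(P)),
                       type.return = c("arit", "log")) {
  type.return <- match.arg(type.return)
  one.asset <- function(p) {
    if (length(p) < 2) return(rep(NA_real_, length(p)))
    r <- p[-1] / p[-length(p)]          # gross return P[t] / P[t-1]
    # The first observation of each asset has no previous price, hence NA.
    c(NA_real_, if (type.return == "arit") r - 1 else log(r))
  }
  # ave() applies the function within each ticker and keeps the input order.
  ave(as.numeric(P), tickers, FUN = one.asset)
}

P <- c(1, 2, 3)
R.arit <- simple.ret(P)                       # NA, 1, 0.5
R.log  <- simple.ret(P, type.return = "log")  # NA, log(2), log(1.5)

# With a long vector, returns never cross from one ticker into the next.
R.two <- simple.ret(c(1, 2, 10, 20), tickers = c("A", "A", "B", "B"))
```

Passing the ticker vector matters in the long format: without it, the first price of one asset would be divided by the last price of the previous one.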
df.fill.na.Rd
Helper function for BatchGetSymbols. Replaces NA values and returns fixed dataframe.
df.fill.na(df.in)
Dataframe to be fixed
A fixed dataframe.

df <- data.frame(price.adjusted = c(NA, 10, 11, NA, 12, 12.5, NA),
                 volume = c(1, 10, 0, 2, 0, 1, 5))

df.fixed.na <- df.fill.na(df)
#> NULL
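One plausible reading of "replaces NA values" is a carry-forward fill. The sketch below shows that idea on the price vector from the example; fill.na.vec is a hypothetical helper and the fill rule (last observed value forward, first observed value for leading NAs) is an assumption, not necessarily what df.fill.na() does internally.

```r
# Hypothetical helper, not the package's df.fill.na(): carry the last
# observed value forward, then give leading NAs the first observed value.
fill.na.vec <- function(x) {
  ok <- !is.na(x)
  if (!any(ok)) return(x)   # nothing to fill from
  idx <- cumsum(ok)         # position of the last non-NA value seen so far
  idx[idx == 0] <- 1        # leading NAs point at the first observed value
  x[ok][idx]
}

price.adjusted <- c(NA, 10, 11, NA, 12, 12.5, NA)
filled <- fill.na.vec(price.adjusted)
print(filled)
# 10.0 10.0 11.0 11.0 12.0 12.5 12.5
```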
fix.ticker.name.Rd
Removes bad symbols from names of tickers. This is useful for naming files with cache system.
fix.ticker.name(ticker.in)
A bad ticker name
A good ticker name
bad.ticker <- '^GSPC'
good.ticker <- fix.ticker.name(bad.ticker)
good.ticker
#> [1] "GSPC"
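The reason this matters for the cache is that characters such as '^' are awkward in file names. A minimal sketch of the idea, using a hypothetical clean.ticker helper and an assumed whitelist of characters (letters, digits, dot, underscore, hyphen) rather than the package's actual rule:

```r
# Hypothetical helper, not the package's fix.ticker.name(): drop every
# character outside an assumed whitelist of file-name-safe characters.
clean.ticker <- function(ticker.in) {
  gsub("[^A-Za-z0-9._-]", "", ticker.in)
}

good.ticker <- clean.ticker("^GSPC")
print(good.ticker)

# The cleaned name can then be used to build a cache file path.
# The '.rds' extension is only an example, not the package's file format.
cache.file <- file.path(tempdir(), "BGS_Cache", paste0(good.ticker, ".rds"))
print(basename(cache.file))
```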
get.clean.data.Rd
Get clean data from Yahoo/Google
get.clean.data(tickers, src = "yahoo", first.date, last.date)
A vector of tickers. If not sure whether the ticker is available, check the websites of Google and Yahoo Finance. The source for downloading the data can either be Google or Yahoo. The function automatically selects the source webpage based on the input ticker.
Source of data (yahoo or google)
The first date to download data (date or char as YYYY-MM-DD)
The last date to download data (date or char as YYYY-MM-DD)
A dataframe with the cleaned data
All functions

| | |
|---|---|
| BatchGetSymbols | Function to download financial data |
| GetFTSE100Stocks | Function to download the current components of the FTSE100 index from Wikipedia |
| GetIbovStocks | Function to download the current components of the Ibovespa index from Bovespa website |
| GetSP500Stocks | Function to download the current components of the SP500 index from Wikipedia |
| calc.ret | Function to calculate returns from a price and ticker vector |
| df.fill.na | Replaces NA values in dataframe for closest price |
| fix.ticker.name | Fix name of ticker |
| get.clean.data | Get clean data from yahoo/google |
| myGetSymbols | An improved version of function getSymbols from quantmod |
| reshape.wide | Transforms a dataframe in the long format to a list of dataframes in the wide format |

myGetSymbols.Rd
This is a helper function to BatchGetSymbols and it should normally not be called directly. The purpose of this function is to download financial data based on a ticker and a time period. The main difference from getSymbols is that it imports the data as a dataframe with properly named columns and saves data locally with the caching system.
A single ticker to download data
An index for the stock that is downloading (for cat() purposes)
Total number of stocks being downloaded (also for cat() purposes)
The source of the data ('google' or 'yahoo')
The first date to download data (date or char as YYYY-MM-DD)
The last date to download data (date or char as YYYY-MM-DD)
Use cache system? (default = TRUE)
Where to save cache files? (default = file.path(tempdir(), 'BGS_Cache') )
Data for benchmark ticker
Logical for printing statements (default = FALSE)
A percentage threshold for defining bad data. The dates of the benchmark ticker are compared to each asset. If the percentage of non-missing dates with respect to the benchmark ticker is lower than thresh.bad.data, the function will ignore the asset (default = 0.75)
A dataframe with the financial data
getSymbols for the base function
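The thresh.bad.data rule described in the arguments above can be sketched in a few lines of base R. The helper keep.asset and the example dates are hypothetical; the sketch only assumes the rule as stated: compare each asset's dates with the benchmark's dates and drop the asset when the share of matching dates is below the threshold.

```r
# Hypothetical helper illustrating the rule described for thresh.bad.data;
# it is not the package's internal code.
keep.asset <- function(asset.dates, bench.dates, thresh.bad.data = 0.75) {
  # Share of benchmark dates for which the asset also has an observation.
  perc.ok <- mean(bench.dates %in% asset.dates)
  perc.ok >= thresh.bad.data
}

# Made-up dates: a benchmark with 10 observations.
bench.dates  <- as.Date("2020-01-01") + 0:9
full.asset   <- bench.dates        # 100% of the benchmark dates
sparse.asset <- bench.dates[1:5]   # 50% of the benchmark dates

keep.full   <- keep.asset(full.asset, bench.dates)    # TRUE
keep.sparse <- keep.asset(sparse.asset, bench.dates)  # FALSE, since 0.5 < 0.75
```

Raising the threshold makes the filter stricter; with thresh.bad.data = 0.5 the sparse asset in this example would be kept.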