├── .gitignore ├── README.md └── fundamental ├── company_financials.py ├── company_fundamentals.py ├── company_profiles.py ├── data ├── company-key-metrics-10Y.csv ├── company-profiles.csv ├── enterprise-value-10Y.csv ├── financial-ratios-10Y.csv ├── financial-statement-growth-10Y.csv └── financials-10Y.csv ├── images ├── Book-Value-per-Share-Scaled.png ├── Current-Ratio-Scaled.png ├── Debt-to-Equity-Scaled.png ├── Dividend-per-Share-Scaled.png ├── EPS-scaled.png └── ROE-Scaled.png └── main.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Created by .ignore support plugin (hsz.mobi) 2 | ### Python template 3 | # Byte-compiled / optimized / DLL files 4 | __pycache__/ 5 | *.py[cod] 6 | *$py.class 7 | 8 | # C extensions 9 | *.so 10 | 11 | # Distribution / packaging 12 | .Python 13 | build/ 14 | develop-eggs/ 15 | dist/ 16 | downloads/ 17 | eggs/ 18 | .eggs/ 19 | lib/ 20 | lib64/ 21 | parts/ 22 | sdist/ 23 | var/ 24 | wheels/ 25 | pip-wheel-metadata/ 26 | share/python-wheels/ 27 | *.egg-info/ 28 | .installed.cfg 29 | *.egg 30 | MANIFEST 31 | 32 | # PyInstaller 33 | # Usually these files are written by a python script from a template 34 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
35 | *.manifest 36 | *.spec 37 | 38 | # Installer logs 39 | pip-log.txt 40 | pip-delete-this-directory.txt 41 | 42 | # Unit test / coverage reports 43 | htmlcov/ 44 | .tox/ 45 | .nox/ 46 | .coverage 47 | .coverage.* 48 | .cache 49 | nosetests.xml 50 | coverage.xml 51 | *.cover 52 | .hypothesis/ 53 | .pytest_cache/ 54 | 55 | # Translations 56 | *.mo 57 | *.pot 58 | 59 | # Django stuff: 60 | *.log 61 | local_settings.py 62 | db.sqlite3 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 
92 | #Pipfile.lock 93 | 94 | # celery beat schedule file 95 | celerybeat-schedule 96 | 97 | # SageMath parsed files 98 | *.sage.py 99 | 100 | # Environments 101 | .env 102 | .venv 103 | env/ 104 | venv/ 105 | ENV/ 106 | env.bak/ 107 | venv.bak/ 108 | 109 | # Spyder project settings 110 | .spyderproject 111 | .spyproject 112 | 113 | # Rope project settings 114 | .ropeproject 115 | 116 | # mkdocs documentation 117 | /site 118 | 119 | # mypy 120 | .mypy_cache/ 121 | .dmypy.json 122 | dmypy.json 123 | 124 | # Pyre type checker 125 | .pyre/ 126 | 127 | .idea/ 128 | config.py -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Fundamental Analysis 2 | Fundamental Analysis is a program that allows me to screen stocks using fundamental indicators and estimate the intrinsic value of qualified stocks using the Discounted Cash Flow method of valuation. 3 | 4 | **The logic accomplishes 5 primary tasks:** 5 | 6 | 1. Downloads all stock tickers available via the FinancialModelingPrep API and generates profiles on each company 7 | 2. Retrieves and cleans financial statement data on each company 8 | 3. Screens companies on any chosen indicators of value (ex: 8-Year Median ROE) 9 | 4. Plots the stability of core indicators over an N-year period 10 | - Includes: EPS, Dividend Per Share, Book Value per Share, ROE, Current Ratio, Debt to Equity Ratio 11 | 5. Estimates the intrinsic value, margin of safety value for chosen companies 12 | 13 | ![alt text](https://github.com/hjones20/fundamental-analysis/blob/master/fundamental/images/EPS-scaled.png?raw=true) 14 | 15 | ## Functions 16 | I've listed the available functions separated by module below for anyone that wishes to build upon this logic.
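For readers who want to see the arithmetic behind task 5 before reading the modules, here is a minimal, self-contained sketch of the discounted cash flow steps. It is not the repository's code: the function names `wacc` and `intrinsic_value_per_share` and every input figure are hypothetical, total debt is passed in directly rather than derived from balance-sheet columns, and the sketch assumes the terminal value is discounted back to the present, which is a common convention but a modelling choice. The default rates mirror the defaults used in `company_fundamentals`.

```python
# Illustrative DCF arithmetic for a single hypothetical company.
# Every input figure below is made up for the example; none is real data.


def wacc(market_cap, debt, beta, interest_rate, tax_rate,
         risk_free_rate=0.0069, market_risk_premium=0.06):
    """Weighted average cost of capital: CAPM cost of equity, after-tax cost of debt."""
    total = market_cap + debt
    cost_of_equity = risk_free_rate + beta * market_risk_premium
    cost_of_debt = interest_rate * (1 - tax_rate)
    return (market_cap / total) * cost_of_equity + (debt / total) * cost_of_debt


def intrinsic_value_per_share(free_cash_flow, growth_rate, discount_rate, years,
                              cash, liabilities, shares, gdp_growth_rate=0.029):
    """Per-share value from projected free cash flow plus a terminal value."""
    # Present value of each projected year's free cash flow (years 1..N).
    pv_fcf = sum(free_cash_flow * (1 + growth_rate) ** y / (1 + discount_rate) ** y
                 for y in range(1, years + 1))

    # Gordon growth terminal value on the final projected cash flow.
    # Only meaningful when discount_rate is greater than gdp_growth_rate.
    last_fcf = free_cash_flow * (1 + growth_rate) ** years
    terminal_value = last_fcf * (1 + gdp_growth_rate) / (discount_rate - gdp_growth_rate)

    # Assumption of this sketch: discount the terminal value back to today.
    pv_terminal = terminal_value / (1 + discount_rate) ** years

    return (pv_fcf + pv_terminal + cash - liabilities) / shares


# Hypothetical inputs, chosen only to exercise the functions.
rate = wacc(market_cap=50e9, debt=10e9, beta=1.1, interest_rate=0.04, tax_rate=0.21)
value = intrinsic_value_per_share(free_cash_flow=3e9, growth_rate=0.05,
                                  discount_rate=rate, years=10,
                                  cash=5e9, liabilities=20e9, shares=1e9)
margin_of_safety_value = value * (1 - 0.25)  # 25% margin of safety

print(f"Discount rate: {rate:.4f}")
print(f"Intrinsic value per share: {value:.2f}")
print(f"Margin of safety value: {margin_of_safety_value:.2f}")
```

The result is very sensitive to the gap between the discount rate and the perpetual growth rate, so it is worth trying a few values of each before trusting any single number.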
17 | 18 | **Notes:**
19 | 1. The FinancialModelingPrep API appears to be updated frequently. As a result, changes in data availability / functionality may occur.
20 | 2. It is not necessary to use the functions prefixed with the word **'calculate'** in the **company_fundamentals** module in order to calculate the intrinsic value of a company. You can get a given company's discounted cash flow directly from the FinancialModelingPrep API here. I simply opted to build the DCF logic myself in order to better understand how it works. It was not a great programming choice as errors will result if column names change.
21 |     - FinancialModelingPrep also offers a tutorial on the discounted cash flow methodology here.
22 | 
23 | - **company_profiles**
24 |     - `get_company_data` - Retrieves 'symbol', 'name', 'price', and 'exchange' information for all available stock tickers
25 |     - `select_stock_exchanges` - Filters out stock tickers that are not listed on one of the major exchanges: 'Nasdaq Global Select', 'NasdaqGS', 'Nasdaq', 'New York Stock Exchange', 'NYSE', 'NYSE American'
26 |     - `select_minimum_price` - Filters out stock tickers that have a price less than the one specified by the user
27 |     - `create_company_profile` - Retrieves additional company information for each stock ticker provided, writes data to csv file, stores file in the specified directory
28 | - **company_financials**
29 |     - `select_sector` - Filters out stock tickers that do not belong to the specified sectors
30 |     - `select_industries` - Filters out stock tickers that do not belong to the specified industries
31 |     - `get_financial_data` - Retrieves any of the following financial datasets for each of the stock tickers provided: Financial Statements, Financial Ratios, Financial Growth, Key Company Metrics, Enterprise Value
32 |     - `clean_financial_data` - Scans DataFrame containing financial data on N stock tickers and removes rows with corrupted date values. 
Adds a new 'year' column as well for future use
33 |     - `select_analysis_years` - Remove stock tickers without financial reports in a specified timeframe (ex: L10Y) and subset DataFrame to only include the years specified (ex: 2015 - 2020)
34 | - **company_fundamentals**
35 |     - `combine_data` - Read and join all files in a specified directory (ex: 'data/') with a specified year pattern (ex: '2Y')
36 |     - `calculate_stats` - Calculate the mean, median, or percent change of provided columns over an N-year period
37 |     - `screen_stocks` - Filter out stock tickers with column values that do not fall within specified ranges
38 |     - `plot_performance` - Plot stock performance over time with respect to the following column values: 'Earnings per Share', 'Dividend per Share', 'Book Value per Share', 'Return on Equity', 'Current Ratio', 'Debt to Equity Ratio'. Note that 'Dividend per Share' is commented out as the column name seems to have disappeared in a recent API update
39 |     - `prepare_valuation_inputs` - Subset DataFrame to include only the data required for Discounted Cash Flow model calculations
40 |     - `calculate_discount_rate` - Calculate the Weighted Average Cost of Capital (WACC) for each stock ticker provided
41 |     - `calculate_discounted_free_cash_flow` - Calculate the present value of discounted future cash flows for each stock ticker provided
42 |     - `calculate_terminal_value` - Calculate the terminal value for each stock ticker provided
43 |     - `calculate_intrinsic_value` - Calculate the intrinsic value of each stock ticker provided
44 |     - `calculate_margin_of_safety` - Calculate the margin of safety value of each stock ticker provided
45 | 
46 | 
47 | ## References
48 | - **Data Sources:** All stock data is pulled from the FinancialModelingPrep API
49 | - **Graphs:** The "Stability Graph" concept was taken from the great folks at www.buffettsbooks.com. 
They explain why this concept is so important here 50 | - **Calculations:** Guides to the Discounted Cash Flow method of valuation can be found in many places across the internet. Personally, I used the Advanced Value Investing Udemy course 51 | -------------------------------------------------------------------------------- /fundamental/company_financials.py: -------------------------------------------------------------------------------- 1 | from urllib.request import urlopen 2 | 3 | import json 4 | import pandas as pd 5 | 6 | 7 | def select_sector(df, *args): 8 | """ 9 | Remove companies not in the sector provided. 10 | 11 | :param df: DataFrame containing sector info on N companies 12 | :param args: Sectors to retain for analysis 13 | :return: Subset of DataFrame provided, containing companies in the specified sector 14 | :rtype: pandas.DataFrame 15 | """ 16 | print('Evaluating sector membership among ' + str(df['symbol'].nunique()) + ' companies...') 17 | 18 | sector_filter = list(args) 19 | df = df[df['profile.sector'].isin(sector_filter)] 20 | 21 | print('Found ' + str(df['symbol'].nunique()) + ' companies in the ' + str(sector_filter) 22 | + ' sector! \n') 23 | 24 | return df 25 | 26 | 27 | def select_industries(df, *args): 28 | """ 29 | Remove companies not in the industries provided. 30 | 31 | :param df: DataFrame containing industry info on N companies 32 | :param args: Industries to retain for analysis 33 | :return: Subset of DataFrame provided, containing companies in the specified industry 34 | :rtype: pandas.DataFrame 35 | """ 36 | print('Evaluating industry membership among ' + str(df['symbol'].nunique()) + ' companies...') 37 | 38 | industry_filter = list(args) 39 | df = df[df['profile.industry'].isin(industry_filter)] 40 | 41 | print('Found ' + str(df['symbol'].nunique()) + ' companies in the ' + str(industry_filter) 42 | + ' industries! 
\n') 43 | 44 | return df 45 | 46 | 47 | def get_financial_data(df, request, period, api_key): 48 | """ 49 | Retrieve financial data for all stock tickers in the provided DataFrame. 50 | 51 | :param df: DataFrame containing stock tickers (symbol) for N companies 52 | :param request: String representing aspect of API to query 53 | :param period: String 'annual' or 'quarter' 54 | :param api_key: FinancialModelingPrep API key 55 | :return: DataFrame containing chosen financial data for all years available 56 | :rtype: pandas.DataFrame 57 | """ 58 | print('Pulling ' + period + ' ' + request + ' for ' + str(df['symbol'].nunique()) 59 | + ' companies...') 60 | 61 | request_map = {"financials": "financials", 62 | "financial-ratios": "ratios", 63 | "enterprise-value": "enterprise-values", 64 | "company-key-metrics": "key-metrics", 65 | "financial-statement-growth": "financial-growth"} 66 | 67 | value_key = request_map[request] 68 | 69 | financial_statements = ['income-statement', 'balance-sheet-statement', 'cash-flow-statement'] 70 | 71 | financial_data = pd.DataFrame() 72 | 73 | if request == 'financials': 74 | 75 | for ticker in df['symbol']: 76 | 77 | statement_data = pd.DataFrame(df['symbol']) 78 | 79 | for statement in financial_statements: 80 | response = urlopen("https://financialmodelingprep.com/api/v3/" + statement + "/" 81 | + ticker + "?period=" + period + "&apikey=" + api_key) 82 | 83 | data = response.read().decode("utf-8") 84 | data_json = json.loads(data) 85 | 86 | flattened_data = pd.json_normalize(data_json) 87 | 88 | try: 89 | statement_data = statement_data.merge(flattened_data, on='symbol', 90 | how='inner', suffixes=('', '_x')) 91 | 92 | except KeyError: 93 | continue 94 | 95 | duplicate_cols = [x for x in statement_data if x.endswith('_x')] 96 | statement_data.drop(duplicate_cols, axis=1, inplace=True) 97 | 98 | financial_data = pd.concat([financial_data, statement_data], ignore_index=True) 99 | 100 | print('Found ' + period + ' financial statement 
data for ' 101 | + str(financial_data['symbol'].nunique()) + ' companies! \n') 102 | 103 | else: 104 | 105 | for ticker in df['symbol']: 106 | response = urlopen("https://financialmodelingprep.com/api/v3/" + value_key + "/" 107 | + ticker + "?period=" + period + "&apikey=" + api_key) 108 | 109 | data = response.read().decode("utf-8") 110 | data_json = json.loads(data) 111 | 112 | flattened_data = pd.json_normalize(data_json) 113 | financial_data = pd.concat([financial_data, flattened_data], ignore_index=True) 114 | 115 | print('Found ' + period + ' ' + request + ' data for ' 116 | + str(financial_data['symbol'].nunique()) + ' companies! \n') 117 | 118 | return financial_data 119 | 120 | 121 | def clean_financial_data(df): 122 | """ 123 | Remove rows with corrupted date values, create new year column. 124 | 125 | :param df: DataFrame containing financial data on N companies 126 | :return: Subset of provided DataFrame, with the addition of a new 'year' column 127 | :rtype: pandas.DataFrame 128 | """ 129 | print('Cleaning financial data for ' + str(df['symbol'].nunique()) + ' companies...') 130 | 131 | df = df.loc[df['date'].astype(str).apply(lambda x: len(x) == 10)].copy() 132 | 133 | df.insert(2, 'year', df['date'].str[:4]) 134 | df['year'] = df.year.astype(int) 135 | 136 | print('Returning clean financial data for ' + str(df['symbol'].nunique()) + ' companies! \n') 137 | 138 | return df 139 | 140 | 141 | def select_analysis_years(df, report_year, eval_period): 142 | """ 143 | Remove companies without financial reports in the evaluation period specified. 
144 | 
145 |     :param df: DataFrame containing financial data on N companies
146 |     :param report_year: Year of most recent financial report
147 |     :param eval_period: Number of years prior to most recent report to be analyzed
148 |     :return: Subset of the DataFrame provided
149 |     :rtype: pandas.DataFrame
150 |     """
151 | 
152 |     print('Subsetting data from ' + str(report_year - eval_period) + ' to ' + str(report_year)
153 |           + ' for ' + str(df['symbol'].nunique()) + ' companies...')
154 | 
155 |     df.sort_values(by=['symbol', 'year'], inplace=True, ascending=True)
156 |     df.drop_duplicates(['symbol', 'year'], keep='last', inplace=True)
157 | 
158 |     start_year = report_year - eval_period
159 | 
160 |     historical_years = set()
161 |     tickers_to_drop = []
162 | 
163 |     for i in range(start_year, report_year + 1):
164 |         historical_years.add(i)
165 | 
166 |     symbol_history = df.groupby('symbol')['year'].apply(set)
167 | 
168 |     for symbol, year_set in symbol_history.items():
169 |         common_years = historical_years.intersection(year_set)
170 |         if len(common_years) != len(historical_years):
171 |             tickers_to_drop.append(symbol)
172 |         else:
173 |             continue
174 | 
175 |     df = df[~df['symbol'].isin(tickers_to_drop)]
176 |     df = df[df['year'].isin(historical_years)]
177 | 
178 |     print('Subset data from ' + str(report_year - eval_period) + ' to ' + str(report_year)
179 |           + ' for ' + str(df['symbol'].nunique()) + ' companies! 
\n') 180 | 181 | return df 182 | 183 | 184 | # TODO: Add logic to select specific quarters for analysis 185 | def select_analysis_quarters(df, report_year, eval_period, *args): 186 | pass 187 | -------------------------------------------------------------------------------- /fundamental/company_fundamentals.py: -------------------------------------------------------------------------------- 1 | from plotnine import ggplot, aes, geom_line, geom_point, scale_x_continuous, scale_y_continuous,\ 2 | labs, theme, theme_538, annotate, element_text 3 | 4 | import numpy as np 5 | import pandas as pd 6 | import statistics 7 | import textwrap 8 | import os 9 | 10 | 11 | def combine_data(directory, year_pattern): 12 | """ 13 | Load all files in the provided directory as pandas DataFrames and join data into one large 14 | DataFrame for future analysis. 15 | 16 | :param directory: Local directory where financial data resides 17 | :param year_pattern: String indicating which files to pull. Example: '10Y', '5Y', '3Y', etc. 
18 | :return: Master DataFrame containing all data from the listed directory 19 | :rtype: pandas.DataFrame 20 | """ 21 | 22 | master = pd.DataFrame() 23 | 24 | for file in os.listdir(directory): 25 | 26 | if file.endswith(year_pattern, -7, -4): 27 | 28 | financial_data = pd.read_csv(directory + str(file)) 29 | 30 | if master.empty: 31 | master = financial_data 32 | else: 33 | master = pd.merge(master, financial_data, on=['symbol', 'date'], how='inner', 34 | suffixes=('', '_x')) 35 | 36 | elif file == 'company-profiles.csv': 37 | 38 | company_profiles = pd.read_csv(directory + str(file)) 39 | 40 | try: 41 | master = pd.merge(master, company_profiles, on='symbol', how='left', 42 | suffixes=('', '_x')) 43 | except KeyError: 44 | continue 45 | 46 | duplicate_cols = [x for x in master if x.endswith('_x')] 47 | master.drop(duplicate_cols, axis=1, inplace=True) 48 | 49 | return master 50 | 51 | 52 | def calculate_stats(df, stat, report_year, eval_period, *args): 53 | """ 54 | Calculate N year statistics for provided columns. 
55 | 56 | :param df: DataFrame containing the columns specified in *args 57 | :param stat: Statistic to calculate: mean, median, or percent change 58 | :param report_year: Ending year of calculation 59 | :param eval_period: Number of years to include in the calculation 60 | :param args: Columns that require calculations 61 | :return: New DataFrame containing 'symbol', 'year', and 'stat X' columns 62 | :rtype: pandas.DataFrame 63 | """ 64 | 65 | df.sort_values(by=['symbol', 'year'], inplace=True, ascending=False) 66 | 67 | company_stats = pd.DataFrame() 68 | 69 | for arg in list(args): 70 | 71 | metric = df[['symbol', 'year', arg]] 72 | metric = metric.pivot_table(values=arg, index='symbol', columns='year') 73 | 74 | if stat == 'percent change': 75 | column_name = str(eval_period) + 'Y ' + arg + ' % change' 76 | company_stats[column_name] = (metric.iloc[:, -1] / metric.iloc[:, 0]) - 1 77 | 78 | elif stat == 'mean': 79 | column_name = str(eval_period) + 'Y ' + arg + ' mean' 80 | company_stats[column_name] = metric.mean(axis=1) 81 | 82 | elif stat == 'median': 83 | column_name = str(eval_period) + 'Y ' + arg + ' median' 84 | company_stats[column_name] = metric.median(axis=1) 85 | 86 | company_stats['year'] = report_year 87 | 88 | return company_stats 89 | 90 | 91 | def screen_stocks(df, **kwargs): 92 | """ 93 | Subset DataFrame to stocks containing column values within the specified thresholds. 
94 | 
95 |     :param df: DataFrame containing the columns specified in key values of **kwargs
96 |     :param kwargs: Dictionary containing column names as keys and min/max threshold values
97 |     :return: List of stock tickers whose column values fall within the specified thresholds
98 |     :rtype: list
99 |     """
100 | 
101 |     for column, thresholds in kwargs.items():
102 |         df = df[(df[column] > thresholds[0]) & (df[column] < thresholds[1]) | (df[column].isnull())]
103 | 
104 |     ticker_list = list(df.symbol)
105 | 
106 |     return ticker_list
107 | 
108 | 
109 | def plot_performance(df, report_year, eval_period):
110 |     """
111 |     Plot metric-specific performance for a set of stocks over time. Reference:
112 |     https://www.buffettsbooks.com/how-to-invest-in-stocks/intermediate-course/lesson-20/
113 | 
114 |     :param df: DataFrame containing stock tickers and the columns specified below
115 |     :param report_year: Year of most recent financial report
116 |     :param eval_period: Number of years prior to most recent report to be analyzed
117 |     :return: A list of ggplot objects
118 |     :rtype: List
119 |     """
120 | 
121 |     start_year = report_year - eval_period
122 |     df = df.loc[df['year'] >= start_year]
123 | 
124 |     df = df[['symbol', 'year', 'eps', 'bookValuePerShare', 'roe', 'currentRatio', 'debtToEquity']]
125 | 
126 |     df['roe'] = df['roe'].apply(lambda x: x * 100.0)
127 | 
128 |     df = df.rename({'eps': 'Earnings per Share', 'roe': 'Return on Equity',
129 |                     'currentRatio': 'Current Ratio', 'debtToEquity': 'Debt to Equity Ratio',
130 |                     'bookValuePerShare': 'Book Value per Share'}, axis='columns')
131 | 
132 |     df.sort_values(by=['symbol', 'year'], inplace=True, ascending=True)
133 |     df.dropna(inplace=True)
134 | 
135 |     # 'Dividend per Share' is commented out below for now; the API no longer returns this column in the income-statement response
136 |     label_dict = {'Earnings per Share': 'The EPS shows the company\'s profit per share. This chart '
137 |                                         'should have a positive slope over time. 
Stable results '
138 |                                         'here are extremely important for forecasting future cash '
139 |                                         'flows. Note: if the company\'s book value has increased '
140 |                                         'over time, the EPS should demonstrate similar growth.',
141 | 
142 |                   # 'Dividend per Share': 'This chart shows the dividend history of the company. '
143 |                   #                       'This should have a flat to positive slope over time. If '
144 |                   #                       'you see a drastic drop, it may represent a stock split '
145 |                   #                       'for the company. Note: the dividend is taken from a '
146 |                   #                       'portion of the EPS, the remainder goes to the book value.',
147 | 
148 |                   'Book Value per Share': 'The book value represents the liquidation value of the '
149 |                                           'entire company (per share). It\'s important to see '
150 |                                           'this number increasing over time. If the company pays a'
151 |                                           ' high dividend, the book value may grow at a slower '
152 |                                           'rate. If the company pays no dividend, the book value '
153 |                                           'should grow with the EPS each year.',
154 | 
155 |                   'Return on Equity': 'Return on equity is very important because it shows the '
156 |                                       'return that management has received for reinvesting the '
157 |                                       'profits of the company. If using an intrinsic value '
158 |                                       'calculator, it\'s very important that this number is flat or'
159 |                                       ' increasing for accurate results. Find companies with a '
160 |                                       'consistent ROE above 8%.',
161 | 
162 |                   'Current Ratio': 'The current ratio helps measure the health of the company in '
163 |                                    'the short term. As a rule of thumb, the current ratio should be'
164 |                                    ' above 1.0. A safe current ratio is typically above 1.5. Look '
165 |                                    'for stability trends within the current ratio to see how the '
166 |                                    'company manages their short term risk.',
167 | 
168 |                   'Debt to Equity Ratio': 'The debt to equity ratio helps measure the health of '
169 |                                           'the company in the long term. As a rule of thumb, the '
170 |                                           'debt to equity ratio should be lower than 0.5. 
Look for ' 171 | 'stability trends within the debt/equity ratio to see how' 172 | ' the company manages their long term risk.'} 173 | 174 | wrapper = textwrap.TextWrapper(width=120) 175 | 176 | for key, value in label_dict.items(): 177 | label_dict[key] = wrapper.fill(text=value) 178 | 179 | plots = [] 180 | 181 | cols = df.columns[2:].tolist() 182 | 183 | for metric in cols: 184 | p = (ggplot(df, aes('year', metric, color='symbol')) 185 | + geom_line(size=1, alpha=0.8) + geom_point(size=3, alpha=0.8) 186 | + labs(title=metric, x='Report Year', y='', color='Ticker') 187 | + theme_538() + theme(legend_position='left', plot_title=element_text(weight='bold')) 188 | + scale_x_continuous(breaks=range(min(df['year']), max(df['year']) + 1, 1)) 189 | + scale_y_continuous(breaks=range(min(df[metric].astype(int)), 190 | max(round(df[metric]).astype(int)) + 2, 1)) 191 | + annotate(geom='label', x=statistics.mean((df['year'])), 192 | y=max(round(df[metric]).astype(int) + 1), label=label_dict[metric], 193 | size=8, label_padding=0.8, fill='#F7F7F7')) 194 | 195 | plots.append(p) 196 | 197 | return plots 198 | 199 | 200 | def prepare_valuation_inputs(df, report_year, eval_period, *args): 201 | """ 202 | Subset DataFrame to data required for Discounted Cash Flow model. 
203 | 204 | :param df: Dataframe containing the columns specified below 205 | :param report_year: Year of most recent financial report 206 | :param eval_period: Number of years prior to most recent report to be analyzed 207 | :param args: Stocks to retain for analysis 208 | :return: Subset of the DataFrame provided 209 | :rtype: pandas.DataFrame 210 | """ 211 | 212 | symbol_filter = list(args) 213 | df = df[df['symbol'].isin(symbol_filter)] 214 | 215 | start_year = report_year - eval_period 216 | df = df.loc[df['year'] >= start_year] 217 | 218 | df['interestRate'] = round((df['interestExpense'] / df['totalDebt']), 2) 219 | 220 | df = df.replace([np.inf, -np.inf, np.nan], 0) 221 | 222 | max_tax_rate = df.groupby('symbol')['effectiveTaxRate'].max().reset_index() 223 | max_interest_rate = df.groupby('symbol')['interestRate'].max().reset_index() 224 | 225 | df = df[['symbol', 'year', 'freeCashFlow', 'marketCap', 'shortTermDebt', 'longTermDebt', 226 | 'profile.beta', 'cashAndCashEquivalents', 'totalLiabilities', 'numberOfShares', 227 | 'stockPrice']] 228 | 229 | df = df.loc[df['year'] == report_year] 230 | 231 | valuation_data = df.merge(max_tax_rate, on='symbol').merge(max_interest_rate, on='symbol') 232 | 233 | valuation_data.rename(columns={'effectiveTaxRate': 'Max Tax Rate', 234 | 'interestRate': 'Max Interest Rate'}, inplace=True) 235 | 236 | valuation_data['Max Tax Rate'] = round(valuation_data['Max Tax Rate'], 2) 237 | 238 | return valuation_data 239 | 240 | 241 | def calculate_discount_rate(df, risk_free_rate=0.0069, market_risk_premium=0.06): 242 | """ 243 | Calculate the Weighted Average Cost of Capital (WACC) for each ticker in the provided DataFrame. 
244 | 245 | :param df: DataFrame containing a single row of valuation inputs for each stock ticker 246 | :param risk_free_rate: The minimum rate of return investors expect to earn from an 247 | investment without risk (use 10-Year Government’s Bond as a Risk Free Rate) 248 | :param market_risk_premium: The rate of return over the risk free rate required by investors 249 | :return: Original DataFrame with the addition of a new 'Discount Rate' (WACC) column 250 | :rtype: pandas.DataFrame 251 | """ 252 | 253 | market_value_equity = df['marketCap'] 254 | market_value_debt = (df['shortTermDebt'] + df['longTermDebt']) * 1.20 255 | total_market_value_debt_equity = market_value_equity + market_value_debt 256 | 257 | cost_of_debt = df['Max Interest Rate'] * (1 - df['Max Tax Rate']) 258 | cost_of_equity = risk_free_rate + df['profile.beta'] * market_risk_premium 259 | 260 | df['Discount Rate'] = round((market_value_equity / total_market_value_debt_equity) * cost_of_equity \ 261 | + (market_value_debt / total_market_value_debt_equity) * cost_of_debt, 2) 262 | 263 | return df 264 | 265 | 266 | def calculate_discounted_free_cash_flow(df, projection_window, **kwargs): 267 | """ 268 | Calculate the present value of discounted future cash flows for each stock ticker in the 269 | provided Dataframe. 
270 | 
271 |     :param df: DataFrame containing a single row of valuation inputs for each stock ticker
272 |     :param projection_window: Number of years into the future we should generate projections for
273 |     :param kwargs: Dictionary containing stock tickers as keys and long term growth rate
274 |     estimates as values
275 |     :return: Original DataFrame with the addition of new columns: 'Present Value of Discounted
276 |     FCF', 'Last Projected FCF', 'Last Projected Discount Factor'
277 |     :rtype: pandas.DataFrame
278 |     """
279 | 
280 |     estimated_growth = pd.DataFrame(kwargs.items(), columns=['symbol', 'Long Term Growth Rate'])
281 |     df = df.merge(estimated_growth, on='symbol')
282 | 
283 |     dfcf = pd.DataFrame(columns=['year', 'symbol', 'Last Projected FCF',
284 |                                  'Last Projected Discount Factor',
285 |                                  'Present Value of Discounted FCF'])
286 | 
287 |     for row in df.itertuples(index=False):
288 | 
289 |         for year in range(projection_window + 1):
290 |             projected_free_cash_flow = row[df.columns.get_loc('freeCashFlow')] * \
291 |                                        (1 + row[df.columns.get_loc('Long Term Growth Rate')]) ** \
292 |                                        year
293 | 
294 |             discount_factor = 1 / (1 + row[df.columns.get_loc('Discount Rate')]) ** year
295 | 
296 |             discounted_free_cash_flow = projected_free_cash_flow * discount_factor
297 |             # DataFrame.append was removed in pandas 2.0, so concatenate a one-row frame instead
298 |             new_row = pd.DataFrame([{'year': year, 'symbol': row[df.columns.get_loc('symbol')],
299 |                                      'Last Projected FCF': round(projected_free_cash_flow, 2),
300 |                                      'Last Projected Discount Factor': round(discount_factor, 2),
301 |                                      'Present Value of Discounted FCF': round(discounted_free_cash_flow, 2)}])
302 |             dfcf = pd.concat([dfcf, new_row], ignore_index=True)
303 | 
304 |     dfcf = dfcf.loc[dfcf['year'] != 0]
305 | 
306 |     pv_dfcf = dfcf.groupby('symbol')['Present Value of Discounted FCF'].sum().reset_index()
307 |     last_fcf = dfcf.loc[dfcf['year'] == max(dfcf['year'])][['symbol', 'Last Projected FCF']]
308 |     last_df = dfcf.loc[dfcf['year'] == max(dfcf['year'])][['symbol', 'Last Projected Discount '
309 |                                                                      'Factor']]
310 | 
311 |     df = df.merge(pv_dfcf, 
on='symbol').merge(last_fcf, on='symbol').merge(last_df, on='symbol')
312 | 
313 |     return df
314 | 
315 | 
316 | def calculate_terminal_value(df, gdp_growth_rate=0.029):
317 |     """
318 |     Calculate the terminal value for each stock ticker in the provided DataFrame.
319 | 
320 |     :param df: DataFrame containing a single row of valuation inputs for each stock ticker
321 |     :param gdp_growth_rate: https://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG?locations=US
322 |     :return: Original DataFrame with the addition of a new 'Terminal Value' column
323 |     :rtype: pandas.DataFrame
324 |     """
325 | 
326 |     df['Terminal Value'] = (df['Last Projected FCF'] * (1 + gdp_growth_rate)) \
327 |                            / (df['Discount Rate'] - gdp_growth_rate)
328 | 
329 |     return df
330 | 
331 | 
332 | def calculate_intrinsic_value(df):
333 |     """
334 |     Calculate the intrinsic value of each stock ticker in the provided DataFrame.
335 | 
336 |     :param df: DataFrame containing a single row of valuation inputs for each stock ticker
337 |     :return: Original DataFrame with the addition of a new 'Intrinsic Value' column
338 |     :rtype: pandas.DataFrame
339 |     """
340 | 
341 |     df['Intrinsic Value'] = (df['Present Value of Discounted FCF'] + df['Terminal Value']
342 |                              + df['cashAndCashEquivalents'] - df['totalLiabilities']) \
343 |                             / df['numberOfShares']
344 | 
345 |     return df
346 | 
347 | 
348 | def calculate_margin_of_safety(df, margin_of_safety=0.25):
349 |     """
350 |     Calculate the margin of safety value of each stock ticker in the provided DataFrame. 
351 | 
352 |     :param df: DataFrame containing a single row of valuation inputs for each stock ticker
353 |     :param margin_of_safety: Value by which to discount our original intrinsic value calculation
354 |     :return: Original DataFrame with the addition of new columns: 'Margin of Safety Value', 'Buy Decision'
355 |     :rtype: pandas.DataFrame
356 |     """
357 | 
358 |     multiplier = 1 - margin_of_safety
359 |     df['Margin of Safety Value'] = df['Intrinsic Value'] * multiplier
360 | 
361 |     df['Buy Decision'] = np.where(df['Margin of Safety Value'] > df['stockPrice'], 'Yes', 'No')
362 | 
363 |     return df
364 | 
--------------------------------------------------------------------------------
/fundamental/company_profiles.py:
--------------------------------------------------------------------------------
1 | from urllib.request import urlopen
2 | 
3 | import json
4 | import pandas as pd
5 | 
6 | 
7 | def get_company_data(api_key):
8 |     """
9 |     Retrieve all available company stocks from FinancialModelingPrep API.
10 | 
11 |     :param api_key: FinancialModelingPrep API key
12 |     :return: DataFrame containing 'symbol', 'name', 'price', and 'exchange' columns
13 |     :rtype: pandas.DataFrame
14 |     """
15 |     print('Retrieving stock data from FinancialModelingPrep...')
16 | 
17 |     response = urlopen("https://financialmodelingprep.com/api/v3/stock/list?apikey=" + api_key)
18 |     data = response.read().decode("utf-8")
19 |     data_json = json.loads(data)
20 |     flattened_data = pd.json_normalize(data_json)
21 | 
22 |     print('Found stock data on ' + str(flattened_data.symbol.nunique()) + ' companies! \n')
23 | 
24 |     return flattened_data
25 | 
26 | 
27 | def select_stock_exchanges(df):
28 |     """
29 |     Subset DataFrame to companies listed on major stock exchanges. 
30 | 31 | :param df: DataFrame containing stock exchange info on N companies 32 | :return: Subset of the DataFrame provided 33 | :rtype: pandas.DataFrame 34 | """ 35 | print('Searching for stocks listed on major stock exchanges among ' + str(df.symbol.nunique()) + 36 | ' companies...') 37 | 38 | major_stock_exchanges = ['Nasdaq Global Select', 'NasdaqGS', 'Nasdaq', 39 | 'New York Stock Exchange', 'NYSE', 'NYSE American'] 40 | 41 | df = df[df['exchange'].isin(major_stock_exchanges)] 42 | 43 | print('Found ' + str(df.symbol.nunique()) + ' companies listed on major stock exchanges! \n') 44 | 45 | return df 46 | 47 | 48 | def select_minimum_price(df, min_price=5.00): 49 | """ 50 | Subset DataFrame to companies with a stock price greater than or equal to the minimum provided. 51 | 52 | :param df: DataFrame containing recent stock price info on N companies 53 | :param min_price: The minimum stock price a user is willing to consider 54 | :return: Subset of the DataFrame provided 55 | :rtype: pandas.DataFrame 56 | """ 57 | print('Searching for stocks with a price greater than or equal to $' + str(int(min_price)) + 58 | ' among ' + str(df.symbol.nunique()) + ' companies') 59 | 60 | df = df[df['price'] >= min_price] 61 | 62 | print('Found ' + str(df.symbol.nunique()) + ' companies that meet your price requirement! \n') 63 | 64 | return df 65 | 66 | 67 | def create_company_profile(df, dir_path, api_key): 68 | """ 69 | Map stock tickers to company information needed for screening stocks (industry, sector, etc.). 
70 | 71 | :param df: DataFrame containing stock tickers (symbol) for N companies 72 | :param dir_path: Specifies name of directory that csv files should be written to 73 | :param api_key: FinancialModelingPrep API key 74 | :return: New DataFrame that maps the symbol column to additional company information 75 | :rtype: pandas.DataFrame 76 | """ 77 | 78 | print('Searching for profile data on ' + str(df.symbol.nunique()) + ' companies...') 79 | 80 | profile_data = pd.DataFrame() 81 | 82 | for ticker in df['symbol']: 83 | response = urlopen("https://financialmodelingprep.com/api/v3/company/profile/" + ticker 84 | + "?apikey=" + api_key) 85 | 86 | data = response.read().decode("utf-8") 87 | data_json = json.loads(data) 88 | flattened_data = pd.json_normalize(data_json) 89 | profile_data = pd.concat([profile_data, flattened_data], ignore_index=True) 90 | 91 | profile_data.to_csv(dir_path + 'company-profiles.csv', index=False, header=True) 92 | 93 | print('Found ' + str(profile_data.symbol.nunique()) + ' company profiles! 
\n') 94 | 95 | return profile_data 96 | -------------------------------------------------------------------------------- /fundamental/images/Book-Value-per-Share-Scaled.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Book-Value-per-Share-Scaled.png -------------------------------------------------------------------------------- /fundamental/images/Current-Ratio-Scaled.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Current-Ratio-Scaled.png -------------------------------------------------------------------------------- /fundamental/images/Debt-to-Equity-Scaled.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Debt-to-Equity-Scaled.png -------------------------------------------------------------------------------- /fundamental/images/Dividend-per-Share-Scaled.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Dividend-per-Share-Scaled.png -------------------------------------------------------------------------------- /fundamental/images/EPS-scaled.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/EPS-scaled.png -------------------------------------------------------------------------------- /fundamental/images/ROE-Scaled.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/ROE-Scaled.png -------------------------------------------------------------------------------- /fundamental/main.py: -------------------------------------------------------------------------------- 1 | from fundamental import config 2 | from fundamental import company_profiles as profiles 3 | from fundamental import company_financials as financials 4 | from fundamental import company_fundamentals as fundamentals 5 | 6 | import pandas as pd 7 | 8 | 9 | def prepare_company_profiles(price_filter, dir_path, api_key): 10 | """ 11 | Build company profiles for stock tickers that meet price and exchange requirements. 12 | Reference: https://financialmodelingprep.com/developer/docs/#Company-Profile 13 | 14 | :param price_filter: The minimum stock price a user is willing to consider 15 | :param dir_path: Specifies name of directory that csv files should be written to 16 | :param api_key: FinancialModelingPrep API key 17 | :return: None 18 | """ 19 | 20 | # Retrieve all available stock tickers from the FMP API 21 | data = profiles.get_company_data(api_key) 22 | # Drop rows with 1 or more null values (cols: "symbol", "name", "price", "exchange") 23 | data.dropna(axis=0, how='any', inplace=True) 24 | # Only retain stocks listed on major exchanges 25 | data = profiles.select_stock_exchanges(data) 26 | # Only retain stocks with a price greater than or equal to the provided price filter 27 | data = profiles.select_minimum_price(data, price_filter) 28 | # Retrieve company profile data for remaining stock tickers 29 | profiles.create_company_profile(data, dir_path, api_key) 30 | 31 | return None 32 | 33 | 34 | def get_company_financials(file_path, request_list, review_period, report_year, eval_period, 35 | dir_path, api_key): 36 | """ 37 | Retrieve, clean, subset, and store 
company financial data in the specified directory. 38 | 39 | :param file_path: File path to company profile data generated by the function above 40 | :param request_list: List of financial data to request from the FMP API 41 | :param review_period: Frequency of financial statements - 'annual' or 'quarter' 42 | :param report_year: Year of most recent financial report desired 43 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10) 44 | :param dir_path: Directory path where csv files should be stored 45 | :param api_key: FinancialModelingPrep API key 46 | :return: None 47 | """ 48 | 49 | # Read in company profile data generated by the function above 50 | company_profiles = pd.read_csv(file_path) 51 | # Subset DataFrame to companies in the provided sectors (*args) 52 | sector_companies = financials.select_sector(company_profiles, 'Consumer Cyclical') 53 | 54 | for request in request_list: 55 | # Retrieve financial data described in the list above on an annual or quarterly basis 56 | raw_data = financials.get_financial_data(sector_companies, request, review_period, api_key) 57 | # Remove rows with corrupted date values, create new year column 58 | clean_data = financials.clean_financial_data(raw_data) 59 | # Subset financial data to the date range provided 60 | subset_data = financials.select_analysis_years(clean_data, report_year, eval_period) 61 | # Save financial data to data directory for further analysis 62 | evaluation_period = subset_data.year.max() - subset_data.year.min() 63 | filename = dir_path + request + '-' + str(evaluation_period) + 'Y' + '.csv' 64 | subset_data.to_csv(filename, index=False, header=True) 65 | 66 | return None 67 | 68 | 69 | def screen_stocks(dir_path, year_pattern, report_year, eval_period, criteria, stat, *args): 70 | """ 71 | Read financial data, calculate performance stats, and screen stocks accordingly. 
72 | 73 | :param dir_path: Path to directory that contains csv files to be read in 74 | :param year_pattern: File name pattern indicating how many years of historical data the csv 75 | file should include 76 | :param report_year: Year of most recent financial report desired 77 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10) 78 | :param criteria: Dictionary containing the column name and quantitative range desired for 79 | that column 80 | :param stat: Statistic to calculate on each column in args ('mean', 'median', or 'percent change') 81 | :param args: Names of the columns that the statistical calculation should apply to 82 | :return: Subset of financial_data, containing only qualified stocks 83 | :rtype: pandas.DataFrame 84 | """ 85 | 86 | # Read and join all csv files in the provided directory that match a specific pattern, e.g. '10Y' 87 | financial_data = fundamentals.combine_data(dir_path, year_pattern) 88 | # Subset DataFrame to the report year (used as the trailing twelve months) for screening purposes 89 | ttm_data = financial_data[financial_data.year == report_year] 90 | 91 | # Calculate the N-YEAR mean, median, or percent change of a given set of columns 92 | performance_stats = fundamentals.calculate_stats(financial_data, stat, report_year, 93 | eval_period, *args) 94 | ttm_data = ttm_data.merge(performance_stats, on=['symbol', 'year'], how='inner') 95 | 96 | # Subset DataFrame to stocks that meet the specified criteria 97 | qualified_companies = fundamentals.screen_stocks(ttm_data, **criteria) 98 | qualified_company_financials = financial_data[financial_data['symbol'].isin(qualified_companies)] 99 | 100 | return qualified_company_financials 101 | 102 | 103 | def plot_stock_performance(df, report_year, eval_period): 104 | """ 105 | Plot the historical performance of selected stock tickers. 
106 | 107 | :param df: DataFrame containing stock tickers to include in line graphs 108 | :param report_year: Year of most recent financial report desired 109 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10) 110 | :return: A list of ggplot objects 111 | :rtype: List 112 | """ 113 | 114 | historical_performance_plots = fundamentals.plot_performance(df, report_year, eval_period) 115 | 116 | return historical_performance_plots 117 | 118 | 119 | def calculate_intrinsic_value(df, report_year, eval_period, projection_window, 120 | long_term_growth_estimates, *args): 121 | """ 122 | Calculate the intrinsic value of the provided stock tickers. 123 | 124 | :param df: DataFrame containing N-years of historical company data 125 | :param report_year: Year of most recent financial report desired 126 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10) 127 | :param projection_window: Number of years into the future we should generate projections for 128 | :param long_term_growth_estimates: Estimated growth over N-year period 129 | :param args: Stock tickers to generate intrinsic value estimates for 130 | :return: DataFrame containing DCF model inputs, intrinsic value outputs, and a buy/no buy column 131 | :rtype: pandas.DataFrame 132 | """ 133 | 134 | # Prepare data for DCF model calculations 135 | valuation_data = fundamentals.prepare_valuation_inputs(df, report_year, eval_period, *args) 136 | 137 | # See company_fundamentals.py to understand calculations, calculation order 138 | dcf_model_steps = [fundamentals.calculate_discount_rate, 139 | fundamentals.calculate_discounted_free_cash_flow, 140 | fundamentals.calculate_terminal_value, 141 | fundamentals.calculate_intrinsic_value, 142 | fundamentals.calculate_margin_of_safety] 143 | 144 | for step in dcf_model_steps: 145 | if step == fundamentals.calculate_discounted_free_cash_flow: 146 | valuation_data = step(valuation_data, projection_window, 
**long_term_growth_estimates) 147 | else: 148 | valuation_data = step(valuation_data) 149 | 150 | return valuation_data 151 | 152 | 153 | if __name__ == '__main__': 154 | 155 | prepare_company_profiles(10.00, 'data/', config.api_key) 156 | 157 | financial_requests = ['financials', 'financial-ratios', 'financial-statement-growth', 158 | 'company-key-metrics', 'enterprise-value'] 159 | 160 | get_company_financials('data/company-profiles.csv', financial_requests, 'annual', 2019, 10, 161 | 'data/', config.api_key) 162 | 163 | screening_criteria = {'debtToEquity': [0, 0.5], 164 | 'currentRatio': [1.5, 10.0], 165 | 'roe': [0.10, 0.50], 166 | '10Y roe median': [0.08, 0.25], 167 | 'interestCoverage': [15, 5000]} 168 | 169 | screened_stocks = screen_stocks('data/', '10Y', 2019, 10, screening_criteria, 'median', 170 | 'roe', 'currentRatio') 171 | 172 | # Returns list of ggplot objects, add optional for loop to inspect graphs 173 | stability_graphs = plot_stock_performance(screened_stocks, 2019, 10) 174 | 175 | stock_growth_estimates = {'DLB': 0.06, 'UNF': 0.05} 176 | 177 | intrinsic_value_estimates = calculate_intrinsic_value(screened_stocks, 2019, 10, 10, 178 | stock_growth_estimates, 'DLB', 'UNF') 179 | 180 | print(intrinsic_value_estimates) 181 | --------------------------------------------------------------------------------
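The DataFrame printed at the end of main.py carries the 'Margin of Safety Value' and 'Buy Decision' columns produced by the last DCF step. The sketch below reproduces that final step on toy data to show how the decision is derived. The function name `apply_margin_of_safety`, the default `margin_of_safety=0.25`, and the sample tickers and prices are assumptions made for illustration; they are not taken from company_fundamentals.py.

```python
import numpy as np
import pandas as pd


def apply_margin_of_safety(df, margin_of_safety=0.25):
    """Discount intrinsic value and flag stocks priced below the discounted value.

    Hypothetical stand-in for the final DCF step; the 0.25 default is an assumption.
    """
    # Work on a copy so the caller's DataFrame is left untouched
    result = df.copy()
    result['Margin of Safety Value'] = result['Intrinsic Value'] * (1 - margin_of_safety)
    # 'Yes' only when the discounted value still exceeds the current price
    result['Buy Decision'] = np.where(
        result['Margin of Safety Value'] > result['stockPrice'], 'Yes', 'No')
    return result


# Toy inputs, not real market data
toy = pd.DataFrame({'symbol': ['AAA', 'BBB'],
                    'Intrinsic Value': [100.0, 100.0],
                    'stockPrice': [70.0, 80.0]})
decisions = apply_margin_of_safety(toy)
print(decisions[['symbol', 'Margin of Safety Value', 'Buy Decision']])
```

With these toy inputs both stocks have the same intrinsic value, so the decision depends only on whether the current price sits below the discounted value.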