├── .gitignore
├── README.md
└── fundamental
    ├── company_financials.py
    ├── company_fundamentals.py
    ├── company_profiles.py
    ├── data
        ├── company-key-metrics-10Y.csv
        ├── company-profiles.csv
        ├── enterprise-value-10Y.csv
        ├── financial-ratios-10Y.csv
        ├── financial-statement-growth-10Y.csv
        └── financials-10Y.csv
    ├── images
        ├── Book-Value-per-Share-Scaled.png
        ├── Current-Ratio-Scaled.png
        ├── Debt-to-Equity-Scaled.png
        ├── Dividend-per-Share-Scaled.png
        ├── EPS-scaled.png
        └── ROE-Scaled.png
    └── main.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # Created by .ignore support plugin (hsz.mobi)
2 | ### Python template
3 | # Byte-compiled / optimized / DLL files
4 | __pycache__/
5 | *.py[cod]
6 | *$py.class
7 |
8 | # C extensions
9 | *.so
10 |
11 | # Distribution / packaging
12 | .Python
13 | build/
14 | develop-eggs/
15 | dist/
16 | downloads/
17 | eggs/
18 | .eggs/
19 | lib/
20 | lib64/
21 | parts/
22 | sdist/
23 | var/
24 | wheels/
25 | pip-wheel-metadata/
26 | share/python-wheels/
27 | *.egg-info/
28 | .installed.cfg
29 | *.egg
30 | MANIFEST
31 |
32 | # PyInstaller
33 | # Usually these files are written by a python script from a template
34 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
35 | *.manifest
36 | *.spec
37 |
38 | # Installer logs
39 | pip-log.txt
40 | pip-delete-this-directory.txt
41 |
42 | # Unit test / coverage reports
43 | htmlcov/
44 | .tox/
45 | .nox/
46 | .coverage
47 | .coverage.*
48 | .cache
49 | nosetests.xml
50 | coverage.xml
51 | *.cover
52 | .hypothesis/
53 | .pytest_cache/
54 |
55 | # Translations
56 | *.mo
57 | *.pot
58 |
59 | # Django stuff:
60 | *.log
61 | local_settings.py
62 | db.sqlite3
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # celery beat schedule file
95 | celerybeat-schedule
96 |
97 | # SageMath parsed files
98 | *.sage.py
99 |
100 | # Environments
101 | .env
102 | .venv
103 | env/
104 | venv/
105 | ENV/
106 | env.bak/
107 | venv.bak/
108 |
109 | # Spyder project settings
110 | .spyderproject
111 | .spyproject
112 |
113 | # Rope project settings
114 | .ropeproject
115 |
116 | # mkdocs documentation
117 | /site
118 |
119 | # mypy
120 | .mypy_cache/
121 | .dmypy.json
122 | dmypy.json
123 |
124 | # Pyre type checker
125 | .pyre/
126 |
127 | .idea/
128 | config.py
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Fundamental Analysis
2 | Fundamental Analysis is a program that allows me to screen stocks using fundamental indicators and estimate the intrinsic value of qualified stocks using the Discounted Cash Flow method of valuation.
3 |
4 | **The logic accomplishes 5 primary tasks:**
5 |
6 | 1. Downloads all stock tickers available via the FinancialModelingPrep API and generates profiles on each company
7 | 2. Retrieves and cleans financial statement data on each company
8 | 3. Screens companies on any chosen indicators of value (ex: 8-Year Median ROE)
9 | 4. Plots the stability of core indicators over an N-year period
10 | - Includes: EPS, Dividend Per Share, Book Value per Share, ROE, Current Ratio, Debt to Equity Ratio
11 | 5. Estimates the intrinsic value and margin of safety value for chosen companies
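
As a rough illustration of task 3, the snippet below screens tickers on a multi-year median. It is a minimal sketch in plain Python rather than the pandas-based logic in this repository: the `screen_by_median` helper and the ROE figures are hypothetical, and the strict lower/upper bounds are only meant to mirror the threshold comparison used by `screen_stocks`.

```python
from statistics import median

# Hypothetical 8-year ROE history (as fractions) keyed by ticker; not real data.
roe_history = {
    "AAA": [0.12, 0.10, 0.11, 0.13, 0.09, 0.12, 0.14, 0.11],
    "BBB": [0.04, 0.06, 0.03, 0.05, 0.07, 0.02, 0.05, 0.04],
}


def screen_by_median(history, lower, upper):
    """Return tickers whose median value lies strictly between lower and upper."""
    return [ticker for ticker, values in history.items()
            if lower < median(values) < upper]


# Keep companies whose 8-year median ROE is between 8% and 25%.
passed = screen_by_median(roe_history, 0.08, 0.25)
print(passed)
```

Unlike this sketch, `screen_stocks` also keeps rows whose screened column is null, so missing data does not eliminate a company there.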
12 |
13 | 
14 |
15 | ## Functions
16 | I've listed the available functions below, separated by module, for anyone who wishes to build upon this logic.
17 |
18 | **Notes:**
19 | 1. The FinancialModelingPrep API appears to be updated frequently. As a result, changes in data availability / functionality may occur.
20 | 2. It is not necessary to use the functions prefixed with the word **'calculate'** in the **company_fundamentals** module in order to calculate the intrinsic value of a company. You can get a given company's discounted cash flow directly from the FinancialModelingPrep API here. I simply opted to build the DCF logic myself in order to better understand how it works. It was not a great programming choice, as errors will result if column names change.
21 | - FinancialModelingPrep also offers a tutorial on the discounted cash flow methodology here.
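
For readers who want to see the DCF arithmetic in one place, here is a minimal sketch in plain Python. It follows the common textbook form, which discounts the terminal value back to the present, whereas `calculate_intrinsic_value` in this repository adds the 'Terminal Value' column as computed, so the two will not produce identical numbers. The function name `dcf_value_per_share` and all inputs are hypothetical.

```python
def dcf_value_per_share(fcf, growth, discount, perpetual_growth, years,
                        cash, liabilities, shares):
    """Textbook DCF: discounted projected cash flows plus a discounted terminal value."""
    if discount <= perpetual_growth:
        raise ValueError("discount rate must exceed the perpetual growth rate")

    present_value = 0.0
    projected = fcf
    for year in range(1, years + 1):
        # Grow the latest free cash flow, then discount it back to today.
        projected = fcf * (1 + growth) ** year
        present_value += projected / (1 + discount) ** year

    # Gordon growth terminal value, based on the last projected cash flow.
    terminal = projected * (1 + perpetual_growth) / (discount - perpetual_growth)
    discounted_terminal = terminal / (1 + discount) ** years

    equity_value = present_value + discounted_terminal + cash - liabilities
    return equity_value / shares


# Hypothetical inputs, not real company data.
value = dcf_value_per_share(fcf=100.0, growth=0.0, discount=0.10,
                            perpetual_growth=0.0, years=2,
                            cash=0.0, liabilities=0.0, shares=10.0)

# Apply a 25% margin of safety, the default used by calculate_margin_of_safety.
margin_of_safety_value = value * (1 - 0.25)
print(round(value, 2), round(margin_of_safety_value, 2))
```

With zero growth, the result collapses to a simple perpetuity (100 / 0.10 = 1000 in total, or 100 per share), which makes a convenient sanity check.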
22 |
23 | - **company_profiles**
24 | - `get_company_data` - Retrieves 'symbol', 'name', 'price', and 'exchange' information for all available stock tickers
25 | - `select_stock_exchanges` - Filters out stock tickers that are not listed on one of the major exchanges: 'Nasdaq Global Select', 'NasdaqGS', 'Nasdaq', 'New York Stock Exchange', 'NYSE', 'NYSE American'
26 | - `select_minimum_price` - Filters out stock tickers that have a price less than the one specified by the user
27 | - `create_company_profile` - Retrieves additional company information for each stock ticker provided, writes data to csv file, stores file in the specified directory
28 | - **company_financials**
29 | - `select_sector` - Filters out stock tickers that do not belong to the specified sectors
30 | - `select_industries` - Filters out stock tickers that do not belong to the specified industries
31 | - `get_financial_data` - Retrieves any of the following financial datasets for each of the stock tickers provided: Financial Statements, Financial Ratios, Financial Growth, Key Company Metrics, Enterprise Value
32 | - `clean_financial_data` - Scans DataFrame containing financial data on N stock tickers and removes rows with corrupted date values. Also adds a new 'year' column for future use
33 | - `select_analysis_years` - Remove stock tickers without financial reports in a specified timeframe (ex: L10Y) and subset DataFrame to only include the years specified (ex: 2015 - 2020)
34 | - **company_fundamentals**
35 | - `combine_data` - Read and join all files in a specified directory (ex: 'data/') with a specified year pattern (ex: '2Y')
36 | - `calculate_stats` - Calculate the mean, median, or percent change of provided columns over an N-year period
37 | - `screen_stocks` - Filter out stock tickers with column values that do not fall within specified ranges
38 | - `plot_performance` - Plot stock performance over time with respect to the following column values: 'Earnings per Share', 'Dividend per Share', 'Book Value per Share', 'Return on Equity', 'Current Ratio', 'Debt to Equity Ratio'. Note that 'Dividend per Share' is commented out as the column name seems to have disappeared in a recent API update
39 | - `prepare_valuation_inputs` - Subset DataFrame to include only the data required for Discounted Cash Flow model calculations
40 | - `calculate_discount_rate` - Calculate the Weighted Average Cost of Capital (WACC) for each stock ticker provided
41 | - `calculate_discounted_free_cash_flow` - Calculate the present value of discounted future cash flows for each stock ticker provided
42 | - `calculate_terminal_value` - Calculate the terminal value for each stock ticker provided
43 | - `calculate_intrinsic_value` - Calculate the intrinsic value of each stock ticker provided
44 | - `calculate_margin_of_safety` - Calculate the margin of safety value of each stock ticker provided
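
The discount rate used by the valuation functions is a WACC: the cost of equity comes from CAPM (risk-free rate plus beta times the market risk premium), the cost of debt is the after-tax interest rate, and the two are weighted by market value. The sketch below shows that arithmetic for a single company in plain Python. The `wacc` helper and the inputs are hypothetical; the default rates match the defaults of `calculate_discount_rate`, which additionally approximates the market value of debt as 1.20 times short-term plus long-term debt.

```python
def wacc(market_cap, market_value_debt, beta, interest_rate, tax_rate,
         risk_free_rate=0.0069, market_risk_premium=0.06):
    """Weighted average cost of capital for a single company."""
    # CAPM cost of equity.
    cost_of_equity = risk_free_rate + beta * market_risk_premium
    # Interest is tax deductible, so the cost of debt is taken after tax.
    cost_of_debt = interest_rate * (1 - tax_rate)

    total = market_cap + market_value_debt
    return ((market_cap / total) * cost_of_equity
            + (market_value_debt / total) * cost_of_debt)


# Hypothetical inputs, not real company data.
rate = wacc(market_cap=800.0, market_value_debt=200.0, beta=1.0,
            interest_rate=0.05, tax_rate=0.20,
            risk_free_rate=0.02, market_risk_premium=0.06)
print(round(rate, 4))
```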
45 |
46 |
47 | ## References
48 | - **Data Sources:** All stock data is pulled from the FinancialModelingPrep API
49 | - **Graphs:** The "Stability Graph" concept was taken from the great folks at www.buffettsbooks.com. They explain why this concept is so important here
50 | - **Calculations:** Guides to the Discounted Cash Flow method of valuation can be found in many places across the internet. Personally, I used the Advanced Value Investing Udemy course
51 |
--------------------------------------------------------------------------------
/fundamental/company_financials.py:
--------------------------------------------------------------------------------
1 | from urllib.request import urlopen
2 |
3 | import json
4 | import pandas as pd
5 |
6 |
7 | def select_sector(df, *args):
8 | """
9 | Remove companies not in the sector provided.
10 |
11 | :param df: DataFrame containing sector info on N companies
12 | :param args: Sectors to retain for analysis
13 | :return: Subset of DataFrame provided, containing companies in the specified sector
14 | :rtype: pandas.DataFrame
15 | """
16 | print('Evaluating sector membership among ' + str(df['symbol'].nunique()) + ' companies...')
17 |
18 | sector_filter = list(args)
19 | df = df[df['profile.sector'].isin(sector_filter)]
20 |
21 | print('Found ' + str(df['symbol'].nunique()) + ' companies in the ' + str(sector_filter)
22 | + ' sector! \n')
23 |
24 | return df
25 |
26 |
27 | def select_industries(df, *args):
28 | """
29 | Remove companies not in the industries provided.
30 |
31 | :param df: DataFrame containing industry info on N companies
32 | :param args: Industries to retain for analysis
33 | :return: Subset of DataFrame provided, containing companies in the specified industry
34 | :rtype: pandas.DataFrame
35 | """
36 | print('Evaluating industry membership among ' + str(df['symbol'].nunique()) + ' companies...')
37 |
38 | industry_filter = list(args)
39 | df = df[df['profile.industry'].isin(industry_filter)]
40 |
41 | print('Found ' + str(df['symbol'].nunique()) + ' companies in the ' + str(industry_filter)
42 | + ' industries! \n')
43 |
44 | return df
45 |
46 |
47 | def get_financial_data(df, request, period, api_key):
48 | """
49 | Retrieve financial data for all stock tickers in the provided DataFrame.
50 |
51 | :param df: DataFrame containing stock tickers (symbol) for N companies
52 | :param request: String representing aspect of API to query
53 | :param period: String 'annual' or 'quarter'
54 | :param api_key: FinancialModelingPrep API key
55 | :return: DataFrame containing chosen financial data for all years available
56 | :rtype: pandas.DataFrame
57 | """
58 | print('Pulling ' + period + ' ' + request + ' for ' + str(df['symbol'].nunique())
59 | + ' companies...')
60 |
61 | request_map = {"financials": "financials",
62 | "financial-ratios": "ratios",
63 | "enterprise-value": "enterprise-values",
64 | "company-key-metrics": "key-metrics",
65 | "financial-statement-growth": "financial-growth"}
66 |
67 | value_key = request_map[request]
68 |
69 | financial_statements = ['income-statement', 'balance-sheet-statement', 'cash-flow-statement']
70 |
71 | financial_data = pd.DataFrame()
72 |
73 | if request == 'financials':
74 |
75 | for ticker in df['symbol']:
76 |
77 | statement_data = pd.DataFrame(df['symbol'])
78 |
79 | for statement in financial_statements:
80 | response = urlopen("https://financialmodelingprep.com/api/v3/" + statement + "/"
81 | + ticker + "?period=" + period + "&apikey=" + api_key)
82 |
83 | data = response.read().decode("utf-8")
84 | data_json = json.loads(data)
85 |
86 | flattened_data = pd.json_normalize(data_json)
87 |
88 | try:
89 | statement_data = statement_data.merge(flattened_data, on='symbol',
90 | how='inner', suffixes=('', '_x'))
91 |
92 | except KeyError:
93 | continue
94 |
95 | duplicate_cols = [x for x in statement_data if x.endswith('_x')]
96 | statement_data.drop(duplicate_cols, axis=1, inplace=True)
97 |
98 | financial_data = pd.concat([financial_data, statement_data], ignore_index=True)
99 |
100 | print('Found ' + period + ' financial statement data for '
101 | + str(financial_data['symbol'].nunique()) + ' companies! \n')
102 |
103 | else:
104 |
105 | for ticker in df['symbol']:
106 | response = urlopen("https://financialmodelingprep.com/api/v3/" + value_key + "/"
107 | + ticker + "?period=" + period + "&apikey=" + api_key)
108 |
109 | data = response.read().decode("utf-8")
110 | data_json = json.loads(data)
111 |
112 | flattened_data = pd.json_normalize(data_json)
113 | financial_data = pd.concat([financial_data, flattened_data], ignore_index=True)
114 |
115 | print('Found ' + period + ' ' + request + ' data for '
116 | + str(financial_data['symbol'].nunique()) + ' companies! \n')
117 |
118 | return financial_data
119 |
120 |
121 | def clean_financial_data(df):
122 | """
123 | Remove rows with corrupted date values, create new year column.
124 |
125 | :param df: DataFrame containing financial data on N companies
126 | :return: Subset of provided DataFrame, with the addition of a new 'year' column
127 | :rtype: pandas.DataFrame
128 | """
129 | print('Cleaning financial data for ' + str(df['symbol'].nunique()) + ' companies...')
130 |
131 | df = df.loc[df['date'].astype(str).apply(lambda x: len(x) == 10)].copy()
132 |
133 | df.insert(2, 'year', df['date'].str[:4])
134 | df['year'] = df.year.astype(int)
135 |
136 | print('Returning clean financial data for ' + str(df['symbol'].nunique()) + ' companies! \n')
137 |
138 | return df
139 |
140 |
141 | def select_analysis_years(df, report_year, eval_period):
142 | """
143 | Remove companies without financial reports in the evaluation period specified.
144 |
145 | :param df: DataFrame containing financial data on N companies
146 | :param report_year: Year of most recent financial report
147 | :param eval_period: Number of years prior to most recent report to be analyzed
148 | :return: Subset of the DataFrame provided
149 | :rtype: pandas.DataFrame
150 | """
151 |
152 | print('Subsetting data from ' + str(report_year - eval_period) + ' to ' + str(report_year)
153 | + ' for ' + str(df['symbol'].nunique()) + ' companies...')
154 |
155 | df.sort_values(by=['symbol', 'year'], inplace=True, ascending=True)
156 | df.drop_duplicates(['symbol', 'year'], keep='last', inplace=True)
157 |
158 | start_year = report_year - eval_period
159 |
160 | historical_years = set()
161 | tickers_to_drop = []
162 |
163 | for i in range(start_year, report_year + 1):
164 | historical_years.add(i)
165 |
166 | symbol_history = df.groupby('symbol')['year'].apply(set)
167 |
168 | for symbol, year_set in symbol_history.items():
169 | common_years = historical_years.intersection(year_set)
170 | if len(common_years) != len(historical_years):
171 | tickers_to_drop.append(symbol)
172 | else:
173 | continue
174 |
175 | df = df[~df['symbol'].isin(tickers_to_drop)]
176 | df = df[df['year'].isin(historical_years)]
177 |
178 | print('Subset data from ' + str(report_year - eval_period) + ' to ' + str(report_year)
179 | + ' for ' + str(df['symbol'].nunique()) + ' companies! \n')
180 |
181 | return df
182 |
183 |
184 | # TODO: Add logic to select specific quarters for analysis
185 | def select_analysis_quarters(df, report_year, eval_period, *args):
186 | pass
187 |
--------------------------------------------------------------------------------
/fundamental/company_fundamentals.py:
--------------------------------------------------------------------------------
1 | from plotnine import ggplot, aes, geom_line, geom_point, scale_x_continuous, scale_y_continuous,\
2 | labs, theme, theme_538, annotate, element_text
3 |
4 | import numpy as np
5 | import pandas as pd
6 | import statistics
7 | import textwrap
8 | import os
9 |
10 |
11 | def combine_data(directory, year_pattern):
12 | """
13 | Load all files in the provided directory as pandas DataFrames and join data into one large
14 | DataFrame for future analysis.
15 |
16 | :param directory: Local directory where financial data resides
17 | :param year_pattern: String indicating which files to pull. Example: '10Y', '5Y', '3Y', etc.
18 | :return: Master DataFrame containing all data from the listed directory
19 | :rtype: pandas.DataFrame
20 | """
21 |
22 | master = pd.DataFrame()
23 |
24 | for file in os.listdir(directory):
25 |
26 | if file.endswith(year_pattern, -7, -4):
27 |
28 | financial_data = pd.read_csv(directory + str(file))
29 |
30 | if master.empty:
31 | master = financial_data
32 | else:
33 | master = pd.merge(master, financial_data, on=['symbol', 'date'], how='inner',
34 | suffixes=('', '_x'))
35 |
36 | elif file == 'company-profiles.csv':
37 |
38 | company_profiles = pd.read_csv(directory + str(file))
39 |
40 | try:
41 | master = pd.merge(master, company_profiles, on='symbol', how='left',
42 | suffixes=('', '_x'))
43 | except KeyError:
44 | continue
45 |
46 | duplicate_cols = [x for x in master if x.endswith('_x')]
47 | master.drop(duplicate_cols, axis=1, inplace=True)
48 |
49 | return master
50 |
51 |
52 | def calculate_stats(df, stat, report_year, eval_period, *args):
53 | """
54 | Calculate N year statistics for provided columns.
55 |
56 | :param df: DataFrame containing the columns specified in *args
57 | :param stat: Statistic to calculate: mean, median, or percent change
58 | :param report_year: Ending year of calculation
59 | :param eval_period: Number of years to include in the calculation
60 | :param args: Columns that require calculations
61 | :return: New DataFrame containing 'symbol', 'year', and 'stat X' columns
62 | :rtype: pandas.DataFrame
63 | """
64 |
65 | df.sort_values(by=['symbol', 'year'], inplace=True, ascending=False)
66 |
67 | company_stats = pd.DataFrame()
68 |
69 | for arg in list(args):
70 |
71 | metric = df[['symbol', 'year', arg]]
72 | metric = metric.pivot_table(values=arg, index='symbol', columns='year')
73 |
74 | if stat == 'percent change':
75 | column_name = str(eval_period) + 'Y ' + arg + ' % change'
76 | company_stats[column_name] = (metric.iloc[:, -1] / metric.iloc[:, 0]) - 1
77 |
78 | elif stat == 'mean':
79 | column_name = str(eval_period) + 'Y ' + arg + ' mean'
80 | company_stats[column_name] = metric.mean(axis=1)
81 |
82 | elif stat == 'median':
83 | column_name = str(eval_period) + 'Y ' + arg + ' median'
84 | company_stats[column_name] = metric.median(axis=1)
85 |
86 | company_stats['year'] = report_year
87 |
88 | return company_stats
89 |
90 |
91 | def screen_stocks(df, **kwargs):
92 | """
93 | Subset DataFrame to stocks containing column values within the specified thresholds.
94 |
95 | :param df: DataFrame containing the columns specified in key values of **kwargs
96 | :param kwargs: Column names as keys and (min, max) threshold pairs as values
97 | :return: List of stock tickers whose values fall within every threshold (or are null)
98 | :rtype: list
99 | """
100 |
101 | for column, thresholds in kwargs.items():
102 | df = df[(df[column] > thresholds[0]) & (df[column] < thresholds[1]) | (df[column].isnull())]
103 |
104 | ticker_list = list(df.symbol)
105 |
106 | return ticker_list
107 |
108 |
109 | def plot_performance(df, report_year, eval_period):
110 | """
111 | Plot metric-specific performance for a set of stocks over time. Reference:
112 | https://www.buffettsbooks.com/how-to-invest-in-stocks/intermediate-course/lesson-20/
113 |
114 | :param df: DataFrame containing stock tickers and the columns specified below
115 | :param report_year: Year of most recent financial report
116 | :param eval_period: Number of years prior to most recent report to be analyzed
117 | :return: A list of ggplot objects
118 | :rtype: List
119 | """
120 |
121 | start_year = report_year - eval_period
122 | df = df.loc[df['year'] >= start_year]
123 |
124 | df = df[['symbol', 'year', 'eps', 'bookValuePerShare', 'roe', 'currentRatio', 'debtToEquity']]
125 |
126 | df['roe'] = df['roe'].apply(lambda x: x * 100.0)
127 |
128 | df = df.rename({'eps': 'Earnings per Share', 'roe': 'Return on Equity',
129 | 'currentRatio': 'Current Ratio', 'debtToEquity': 'Debt to Equity Ratio',
130 | 'bookValuePerShare': 'Book Value per Share'}, axis='columns')
131 |
132 | df.sort_values(by=['symbol', 'year'], inplace=True, ascending=True)
133 | df.dropna(inplace=True)
134 |
135 | # 'Dividend per Share' is commented out below for now; the API no longer returns this column in the income-statement response
136 | label_dict = {'Earnings per Share': 'The EPS shows the company\'s profit per share. This chart '
137 | 'should have a positive slope over time. Stable results '
138 | 'here are extremely important for forecasting future cash '
139 | 'flows. Note: if the company\'s book value has increased '
140 | 'over time, the EPS should demonstrate similar growth.',
141 |
142 | # 'Dividend per Share': 'This chart shows the dividend history of the company. '
143 | # 'This should have a flat to positive slope over time. If '
144 | # 'you see a drastic drop, it may represent a stock split '
145 | # 'for the company. Note: the dividend is taken from a '
146 | # 'portion of the EPS, the remainder goes to the book value.',
147 |
148 | 'Book Value per Share': 'The book value represents the liquidation value of the '
149 | 'entire company (per share). It\'s important to see '
150 | 'this number increasing over time. If the company pays a'
151 | ' high dividend, the book value may grow at a slower '
152 | 'rate. If the company pays no dividend, the book value '
153 | 'should grow with the EPS each year.',
154 |
155 | 'Return on Equity': 'Return on equity is very important because it shows the '
156 | 'return that management has received for reinvesting the '
157 | 'profits of the company. If using an intrinsic value '
158 | 'calculator, it\'s very important that this number is flat or'
159 | ' increasing for accurate results. Find companies with a '
160 | 'consistent ROE above 8%.',
161 |
162 | 'Current Ratio': 'The current ratio helps measure the health of the company in '
163 | 'the short term. As a rule of thumb, the current ratio should be'
164 | ' above 1.0. A safe current ratio is typically above 1.5. Look '
165 | 'for stability trends within the current ratio to see how the '
166 | 'company manages their short term risk.',
167 |
168 | 'Debt to Equity Ratio': 'The debt to equity ratio helps measure the health of '
169 | 'the company in the long term. As a rule of thumb, the '
170 | 'debt to equity ratio should be lower than 0.5. Look for '
171 | 'stability trends within the debt/equity ratio to see how'
172 | ' the company manages their long term risk.'}
173 |
174 | wrapper = textwrap.TextWrapper(width=120)
175 |
176 | for key, value in label_dict.items():
177 | label_dict[key] = wrapper.fill(text=value)
178 |
179 | plots = []
180 |
181 | cols = df.columns[2:].tolist()
182 |
183 | for metric in cols:
184 | p = (ggplot(df, aes('year', metric, color='symbol'))
185 | + geom_line(size=1, alpha=0.8) + geom_point(size=3, alpha=0.8)
186 | + labs(title=metric, x='Report Year', y='', color='Ticker')
187 | + theme_538() + theme(legend_position='left', plot_title=element_text(weight='bold'))
188 | + scale_x_continuous(breaks=range(min(df['year']), max(df['year']) + 1, 1))
189 | + scale_y_continuous(breaks=range(min(df[metric].astype(int)),
190 | max(round(df[metric]).astype(int)) + 2, 1))
191 | + annotate(geom='label', x=statistics.mean((df['year'])),
192 | y=max(round(df[metric]).astype(int) + 1), label=label_dict[metric],
193 | size=8, label_padding=0.8, fill='#F7F7F7'))
194 |
195 | plots.append(p)
196 |
197 | return plots
198 |
199 |
200 | def prepare_valuation_inputs(df, report_year, eval_period, *args):
201 | """
202 | Subset DataFrame to data required for Discounted Cash Flow model.
203 |
204 | :param df: Dataframe containing the columns specified below
205 | :param report_year: Year of most recent financial report
206 | :param eval_period: Number of years prior to most recent report to be analyzed
207 | :param args: Stocks to retain for analysis
208 | :return: Subset of the DataFrame provided
209 | :rtype: pandas.DataFrame
210 | """
211 |
212 | symbol_filter = list(args)
213 | df = df[df['symbol'].isin(symbol_filter)]
214 |
215 | start_year = report_year - eval_period
216 | df = df.loc[df['year'] >= start_year]
217 |
218 | df['interestRate'] = round((df['interestExpense'] / df['totalDebt']), 2)
219 |
220 | df = df.replace([np.inf, -np.inf, np.nan], 0)
221 |
222 | max_tax_rate = df.groupby('symbol')['effectiveTaxRate'].max().reset_index()
223 | max_interest_rate = df.groupby('symbol')['interestRate'].max().reset_index()
224 |
225 | df = df[['symbol', 'year', 'freeCashFlow', 'marketCap', 'shortTermDebt', 'longTermDebt',
226 | 'profile.beta', 'cashAndCashEquivalents', 'totalLiabilities', 'numberOfShares',
227 | 'stockPrice']]
228 |
229 | df = df.loc[df['year'] == report_year]
230 |
231 | valuation_data = df.merge(max_tax_rate, on='symbol').merge(max_interest_rate, on='symbol')
232 |
233 | valuation_data.rename(columns={'effectiveTaxRate': 'Max Tax Rate',
234 | 'interestRate': 'Max Interest Rate'}, inplace=True)
235 |
236 | valuation_data['Max Tax Rate'] = round(valuation_data['Max Tax Rate'], 2)
237 |
238 | return valuation_data
239 |
240 |
241 | def calculate_discount_rate(df, risk_free_rate=0.0069, market_risk_premium=0.06):
242 | """
243 | Calculate the Weighted Average Cost of Capital (WACC) for each ticker in the provided DataFrame.
244 |
245 | :param df: DataFrame containing a single row of valuation inputs for each stock ticker
246 | :param risk_free_rate: The minimum rate of return investors expect to earn from an
247 | investment without risk (use the 10-Year Government Bond yield as the risk free rate)
248 | :param market_risk_premium: The rate of return over the risk free rate required by investors
249 | :return: Original DataFrame with the addition of a new 'Discount Rate' (WACC) column
250 | :rtype: pandas.DataFrame
251 | """
252 |
253 | market_value_equity = df['marketCap']
254 | market_value_debt = (df['shortTermDebt'] + df['longTermDebt']) * 1.20
255 | total_market_value_debt_equity = market_value_equity + market_value_debt
256 |
257 | cost_of_debt = df['Max Interest Rate'] * (1 - df['Max Tax Rate'])
258 | cost_of_equity = risk_free_rate + df['profile.beta'] * market_risk_premium
259 |
260 | df['Discount Rate'] = round((market_value_equity / total_market_value_debt_equity) * cost_of_equity \
261 | + (market_value_debt / total_market_value_debt_equity) * cost_of_debt, 2)
262 |
263 | return df
264 |
265 |
266 | def calculate_discounted_free_cash_flow(df, projection_window, **kwargs):
267 | """
268 | Calculate the present value of discounted future cash flows for each stock ticker in the
269 | provided Dataframe.
270 |
271 | :param df: DataFrame containing a single row of valuation inputs for each stock ticker
272 | :param projection_window: Number of years into the future we should generate projections for
273 | :param kwargs: Dictionary containing stock tickers as keys and long term growth rate
274 | estimates as values
275 | :return: Original DataFrame with the addition of new columns: 'Present Value of Discounted
276 | FCF', 'Last Projected FCF', 'Last Projected Discount Factor'
277 | :rtype: pandas.DataFrame
278 | """
279 |
280 | estimated_growth = pd.DataFrame(kwargs.items(), columns=['symbol', 'Long Term Growth Rate'])
281 | df = df.merge(estimated_growth, on='symbol')
282 |
283 | dfcf = pd.DataFrame(columns=['year', 'symbol', 'Last Projected FCF',
284 | 'Last Projected Discount Factor',
285 | 'Present Value of Discounted FCF'])
286 |
287 | for row in df.itertuples(index=False):
288 |
289 | for year in range(projection_window + 1):
290 | projected_free_cash_flow = row[df.columns.get_loc('freeCashFlow')] * \
291 | (1 + row[df.columns.get_loc('Long Term Growth Rate')]) ** \
292 | year
293 |
294 | discount_factor = 1 / (1 + row[df.columns.get_loc('Discount Rate')]) ** year
295 |
296 | discounted_free_cash_flow = projected_free_cash_flow * discount_factor
297 |
298 | dfcf = pd.concat([dfcf, pd.DataFrame([{'year': year, 'symbol': row[df.columns.get_loc('symbol')],
299 | 'Last Projected FCF': round(projected_free_cash_flow, 2),
300 | 'Last Projected Discount Factor': round(discount_factor, 2),
301 | 'Present Value of Discounted FCF': round(discounted_free_cash_flow, 2)}])],
302 | ignore_index=True)
303 |
304 | dfcf = dfcf.loc[dfcf['year'] != 0]
305 |
306 | pv_dfcf = dfcf.groupby('symbol')['Present Value of Discounted FCF'].sum().reset_index()
307 | last_fcf = dfcf.loc[dfcf['year'] == max(dfcf['year'])][['symbol', 'Last Projected FCF']]
308 | last_df = dfcf.loc[dfcf['year'] == max(dfcf['year'])][['symbol', 'Last Projected Discount '
309 | 'Factor']]
310 |
311 | df = df.merge(pv_dfcf, on='symbol').merge(last_fcf, on='symbol').merge(last_df, on='symbol')
312 |
313 | return df
314 |
315 |
316 | def calculate_terminal_value(df, gdp_growth_rate=0.029):
317 | """
318 | Calculate the terminal value for each stock ticker in the provided DataFrame.
319 |
320 | :param df: DataFrame containing a single row of valuation inputs for each stock ticker
321 | :param gdp_growth_rate: Long-term GDP growth rate, used as the perpetual growth rate. See https://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG?locations=US
322 | :return: Original DataFrame with the addition of a new 'Terminal Value' column
323 | :rtype: pandas.DataFrame
324 | """
325 |
326 | df['Terminal Value'] = (df['Last Projected FCF'] * (1 + gdp_growth_rate)) \
327 | / (df['Discount Rate'] - gdp_growth_rate)
328 |
329 | return df
330 |
331 |
332 | def calculate_intrinsic_value(df):
333 | """
334 | Calculate the intrinsic value of each stock ticker in the provided DataFrame.
335 |
336 | :param df: DataFrame containing a single row of valuation inputs for each stock ticker
337 | :return: Original DataFrame with the addition of a new 'Intrinsic Value' column
338 | :rtype: pandas.DataFrame
339 | """
340 |
341 | df['Intrinsic Value'] = (df['Present Value of Discounted FCF'] + df['Terminal Value']
342 | + df['cashAndCashEquivalents'] - df['totalLiabilities']) \
343 | / df['numberOfShares']
344 |
345 | return df
346 |
347 |
348 | def calculate_margin_of_safety(df, margin_of_safety=0.25):
349 | """
350 | Calculate the margin of safety value of each stock ticker in the provided DataFrame.
351 |
352 | :param df: DataFrame containing a single row of valuation inputs for each stock ticker
353 | :param margin_of_safety: Value by which to discount our original intrinsic value calculation
354 | :return: Original DataFrame with the addition of new columns: 'Margin of Safety Value', 'Buy Decision'
355 | :rtype: pandas.DataFrame
356 | """
357 |
358 | multiplier = 1 - margin_of_safety
359 | df['Margin of Safety Value'] = df['Intrinsic Value'] * multiplier
360 |
361 | df['Buy Decision'] = np.where(df['Margin of Safety Value'] > df['stockPrice'], 'Yes', 'No')
362 |
363 | return df
364 |
--------------------------------------------------------------------------------
/fundamental/company_profiles.py:
--------------------------------------------------------------------------------
1 | from urllib.request import urlopen
2 |
3 | import json
4 | import pandas as pd
5 |
6 |
7 | def get_company_data(api_key):
8 | """
9 | Retrieve all available company stocks from FinancialModelingPrep API.
10 |
11 | :param api_key: FinancialModelingPrep API key
12 | :return: DataFrame containing 'symbol', 'name', 'price', and 'exchange' columns
13 | :rtype: pandas.DataFrame
14 | """
15 | print('Retrieving stock data from FinancialModelingPrep...')
16 |
17 | response = urlopen("https://financialmodelingprep.com/api/v3/stock/list?apikey=" + api_key)
18 | data = response.read().decode("utf-8")
19 | data_json = json.loads(data)
20 | flattened_data = pd.json_normalize(data_json)
21 |
22 | print('Found stock data on ' + str(flattened_data.symbol.nunique()) + ' companies! \n')
23 |
24 | return flattened_data
25 |
26 |
27 | def select_stock_exchanges(df):
28 | """
29 | Subset DataFrame to companies listed on major stock exchanges.
30 |
31 | :param df: DataFrame containing stock exchange info on N companies
32 | :return: Subset of the DataFrame provided
33 | :rtype: pandas.DataFrame
34 | """
35 | print('Searching for stocks listed on major stock exchanges among ' + str(df.symbol.nunique()) +
36 | ' companies...')
37 |
38 | major_stock_exchanges = ['Nasdaq Global Select', 'NasdaqGS', 'Nasdaq',
39 | 'New York Stock Exchange', 'NYSE', 'NYSE American']
40 |
41 | df = df[df['exchange'].isin(major_stock_exchanges)]
42 |
43 | print('Found ' + str(df.symbol.nunique()) + ' companies listed on major stock exchanges! \n')
44 |
45 | return df
46 |
47 |
48 | def select_minimum_price(df, min_price=5.00):
49 | """
50 | Subset DataFrame to companies with a stock price greater than or equal to the minimum provided.
51 |
52 | :param df: DataFrame containing recent stock price info on N companies
53 | :param min_price: The minimum stock price a user is willing to consider
54 | :return: Subset of the DataFrame provided
55 | :rtype: pandas.DataFrame
56 | """
57 | print('Searching for stocks with a price greater than or equal to $' + format(min_price, '.2f') +
58 | ' among ' + str(df.symbol.nunique()) + ' companies...')
59 |
60 | df = df[df['price'] >= min_price]
61 |
62 | print('Found ' + str(df.symbol.nunique()) + ' companies that meet your price requirement! \n')
63 |
64 | return df
65 |
66 |
67 | def create_company_profile(df, dir_path, api_key):
68 | """
69 | Map stock tickers to company information needed for screening stocks (industry, sector, etc.).
70 |
71 | :param df: DataFrame containing stock tickers (symbol) for N companies
72 | :param dir_path: Specifies name of directory that csv files should be written to
73 | :param api_key: FinancialModelingPrep API key
74 | :return: New DataFrame that maps the symbol column to additional company information
75 | :rtype: pandas.DataFrame
76 | """
77 |
78 | print('Searching for profile data on ' + str(df.symbol.nunique()) + ' companies...')
79 |
80 | profile_data = pd.DataFrame()
81 |
82 | for ticker in df['symbol']:
83 | response = urlopen("https://financialmodelingprep.com/api/v3/company/profile/" + ticker
84 | + "?apikey=" + api_key)
85 |
86 | data = response.read().decode("utf-8")
87 | data_json = json.loads(data)
88 | flattened_data = pd.json_normalize(data_json)
89 | profile_data = pd.concat([profile_data, flattened_data], ignore_index=True)
90 |
91 | profile_data.to_csv(dir_path + 'company-profiles.csv', index=False, header=True)
92 |
93 | print('Found ' + str(profile_data.symbol.nunique()) + ' company profiles! \n')
94 |
95 | return profile_data
96 |
--------------------------------------------------------------------------------
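The filtering done by select_stock_exchanges and select_minimum_price, together with the null-row drop that main.py applies first, can be tried offline on a small hand-made table. This is a sketch under assumptions: the rows below are invented and only the column names ('symbol', 'name', 'price', 'exchange') and the exchange list come from the code above.

```python
import pandas as pd

# Exchange names copied from select_stock_exchanges
MAJOR_EXCHANGES = ['Nasdaq Global Select', 'NasdaqGS', 'Nasdaq',
                   'New York Stock Exchange', 'NYSE', 'NYSE American']

# Invented listings standing in for the API response
listings = pd.DataFrame({
    'symbol': ['AAA', 'BBB', 'CCC', 'DDD'],
    'name': ['Alpha', 'Beta', None, 'Delta'],
    'price': [12.50, 3.10, 40.00, 25.00],
    'exchange': ['NYSE', 'Nasdaq', 'NYSE', 'Paris'],
})

# Step 1: drop rows with any missing value (removes CCC)
complete = listings.dropna(axis=0, how='any')
# Step 2: keep companies on major exchanges (removes DDD)
on_major = complete[complete['exchange'].isin(MAJOR_EXCHANGES)]
# Step 3: keep companies at or above the minimum price (removes BBB)
affordable = on_major[on_major['price'] >= 5.00]

remaining = affordable['symbol'].tolist()
print(remaining)
```

Each step returns a new DataFrame rather than modifying the input, which keeps the intermediate results available for inspection.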
/fundamental/images/Book-Value-per-Share-Scaled.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Book-Value-per-Share-Scaled.png
--------------------------------------------------------------------------------
/fundamental/images/Current-Ratio-Scaled.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Current-Ratio-Scaled.png
--------------------------------------------------------------------------------
/fundamental/images/Debt-to-Equity-Scaled.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Debt-to-Equity-Scaled.png
--------------------------------------------------------------------------------
/fundamental/images/Dividend-per-Share-Scaled.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/Dividend-per-Share-Scaled.png
--------------------------------------------------------------------------------
/fundamental/images/EPS-scaled.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/EPS-scaled.png
--------------------------------------------------------------------------------
/fundamental/images/ROE-Scaled.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hjones20/fundamental-analysis/b12d8231ad37036f338c6dcb0ba3307217f6e97f/fundamental/images/ROE-Scaled.png
--------------------------------------------------------------------------------
/fundamental/main.py:
--------------------------------------------------------------------------------
1 | from fundamental import config
2 | from fundamental import company_profiles as profiles
3 | from fundamental import company_financials as financials
4 | from fundamental import company_fundamentals as fundamentals
5 |
6 | import pandas as pd
7 |
8 |
9 | def prepare_company_profiles(price_filter, dir_path, api_key):
10 | """
11 | Build company profiles for stock tickers that meet price and exchange requirements.
12 | Reference: https://financialmodelingprep.com/developer/docs/#Company-Profile
13 |
14 | :param price_filter: The minimum stock price a user is willing to consider
15 | :param dir_path: Specifies name of directory that csv files should be written to
16 | :param api_key: FinancialModelingPrep API key
17 | :return: None
18 | """
19 |
20 | # Retrieve all available stock tickers from the FMP API
21 | data = profiles.get_company_data(api_key)
22 | # Drop rows with 1 or more null values (cols: "symbol", "name", "price", "exchange")
23 | data.dropna(axis=0, how='any', inplace=True)
24 | # Only retain stocks listed on major exchanges
25 | data = profiles.select_stock_exchanges(data)
26 | # Only retain stocks with a price greater than or equal to the provided price filter
27 | data = profiles.select_minimum_price(data, price_filter)
28 | # Retrieve company profile data for remaining stock tickers
29 | profiles.create_company_profile(data, dir_path, api_key)
30 |
31 | return None
32 |
33 |
34 | def get_company_financials(file_path, request_list, review_period, report_year, eval_period,
35 | dir_path, api_key):
36 | """
37 | Retrieve, clean, subset, and store company financial data in the specified directory.
38 |
39 | :param file_path: File path to company profile data generated by the function above
40 | :param request_list: List of financial data to request from the FMP API
41 | :param review_period: Frequency of financial statements - 'annual' or 'quarter'
42 | :param report_year: Year of most recent financial report desired
43 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10)
44 | :param dir_path: Directory path where csv files should be stored
45 | :param api_key: FinancialModelingPrep API key
46 | :return: None
47 | """
48 |
49 | # Read in company profile data generated by the function above
50 | company_profiles = pd.read_csv(file_path)
51 | # Subset DataFrame to companies in the given sector(s); currently hard-coded to 'Consumer Cyclical'
52 | sector_companies = financials.select_sector(company_profiles, 'Consumer Cyclical')
53 |
54 | for request in request_list:
55 | # Retrieve financial data described in the list above on an annual or quarterly basis
56 | raw_data = financials.get_financial_data(sector_companies, request, review_period, api_key)
57 | # Remove rows with corrupted date values, create new year column
58 | clean_data = financials.clean_financial_data(raw_data)
59 | # Subset financial data to the date range provided
60 | subset_data = financials.select_analysis_years(clean_data, report_year, eval_period)
61 | # Save financial data to data directory for further analysis
62 | evaluation_period = subset_data.year.max() - subset_data.year.min()
63 | filename = dir_path + request + '-' + str(evaluation_period) + 'Y' + '.csv'
64 | subset_data.to_csv(filename, index=False, header=True)
65 |
66 | return None
67 |
68 |
69 | def screen_stocks(dir_path, year_pattern, report_year, eval_period, criteria, stat, *args):
70 | """
71 | Read financial data, calculate performance stats, and screen stocks accordingly.
72 |
73 | :param dir_path: Path to directory that contains csv files to be read in
74 | :param year_pattern: File name pattern indicating how many years of historical data the csv
75 | file should include
76 | :param report_year: Year of most recent financial report desired
77 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10)
78 | :param criteria: Dictionary containing the column name and quantitative range desired for
79 | that column
80 | :param stat: Statistic to calculate for each column in args ('mean', 'median', or 'percent change')
81 | :param args: Names of the columns that the statistical calculation should apply to
82 | :return: Subset of financial_data, containing only qualified stocks
83 | :rtype: pandas.DataFrame
84 | """
85 |
86 | # Read and join all csv files in the provided directory that match a specific pattern, ex: '10Y'
87 | financial_data = fundamentals.combine_data(dir_path, year_pattern)
88 | # Subset DataFrame to the most recent report year for screening purposes
89 | ttm_data = financial_data[financial_data.year == report_year]
90 |
91 | # Calculate the N-YEAR mean, median, or percent change of a given set of columns
92 | performance_stats = fundamentals.calculate_stats(financial_data, stat, report_year,
93 | eval_period, *args)
94 | ttm_data = ttm_data.merge(performance_stats, on=['symbol', 'year'], how='inner')
95 |
96 | # Subset DataFrame to stocks that meet the specified criteria
97 | qualified_companies = fundamentals.screen_stocks(ttm_data, **criteria)
98 | qualified_company_financials = financial_data[financial_data['symbol'].isin(qualified_companies)]
99 |
100 | return qualified_company_financials
101 |
102 |
103 | def plot_stock_performance(df, report_year, eval_period):
104 | """
105 | Plot the historical performance of selected stock tickers.
106 |
107 | :param df: DataFrame containing stock tickers to include in line graphs
108 | :param report_year: Year of most recent financial report desired
109 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10)
110 | :return: A list of ggplot objects
111 | :rtype: List
112 | """
113 |
114 | historical_performance_plots = fundamentals.plot_performance(df, report_year, eval_period)
115 |
116 | return historical_performance_plots
117 |
118 |
119 | def calculate_intrinsic_value(df, report_year, eval_period, projection_window,
120 | long_term_growth_estimates, *args):
121 | """
122 | Calculate the intrinsic value of the provided stock tickers.
123 |
124 | :param df: DataFrame containing N-years of historical company data
125 | :param report_year: Year of most recent financial report desired
126 | :param eval_period: Number of years prior to most recent report to be analyzed (max = 10)
127 | :param projection_window: Number of years into the future we should generate projections for
128 | :param long_term_growth_estimates: Dictionary mapping each stock ticker to its estimated long-term growth rate, ex: {'DLB': 0.06}
129 | :param args: Stock tickers to generate intrinsic value estimates for
130 | :return: DataFrame containing DCF model inputs, intrinsic value outputs, and a buy/no buy column
131 | :rtype: pandas.DataFrame
132 | """
133 |
134 | # Prepare data for DCF model calculations
135 | valuation_data = fundamentals.prepare_valuation_inputs(df, report_year, eval_period, *args)
136 |
137 | # See company_fundamentals.py to understand calculations, calculation order
138 | dcf_model_steps = [fundamentals.calculate_discount_rate,
139 | fundamentals.calculate_discounted_free_cash_flow,
140 | fundamentals.calculate_terminal_value,
141 | fundamentals.calculate_intrinsic_value,
142 | fundamentals.calculate_margin_of_safety]
143 |
144 | for step in dcf_model_steps:
145 | if step == fundamentals.calculate_discounted_free_cash_flow:
146 | valuation_data = step(valuation_data, projection_window, **long_term_growth_estimates)
147 | else:
148 | valuation_data = step(valuation_data)
149 |
150 | return valuation_data
151 |
152 |
153 | if __name__ == '__main__':
154 |
155 | prepare_company_profiles(10.00, 'data/', config.api_key)
156 |
157 | financial_requests = ['financials', 'financial-ratios', 'financial-statement-growth',
158 | 'company-key-metrics', 'enterprise-value']
159 |
160 | get_company_financials('data/company-profiles.csv', financial_requests, 'annual', 2019, 10,
161 | 'data/', config.api_key)
162 |
163 | screening_criteria = {'debtToEquity': [0, 0.5],
164 | 'currentRatio': [1.5, 10.0],
165 | 'roe': [0.10, 0.50],
166 | '10Y roe median': [0.08, 0.25],
167 | 'interestCoverage': [15, 5000]}
168 |
169 | screened_stocks = screen_stocks('data/', '10Y', 2019, 10, screening_criteria, 'median',
170 | 'roe', 'currentRatio')
171 |
172 | # Returns a list of ggplot objects; loop over the list to inspect each graph
173 | stability_graphs = plot_stock_performance(screened_stocks, 2019, 10)
174 |
175 | stock_growth_estimates = {'DLB': 0.06, 'UNF': 0.05}
176 |
177 | intrinsic_value_estimates = calculate_intrinsic_value(screened_stocks, 2019, 10, 10,
178 | stock_growth_estimates, 'DLB', 'UNF')
179 |
180 | print(intrinsic_value_estimates)
181 |
--------------------------------------------------------------------------------
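The screening_criteria dictionary in main.py maps each column name to a [minimum, maximum] range. One way such a dictionary could be applied is sketched below. The helper `screen_by_ranges` is hypothetical and is not the repository's `fundamentals.screen_stocks`; the inclusive bounds and the toy rows are assumptions.

```python
import pandas as pd


def screen_by_ranges(df, criteria):
    """Hypothetical helper: return symbols whose values fall inside every range."""
    mask = pd.Series(True, index=df.index)
    for column, (lower, upper) in criteria.items():
        # Series.between is inclusive on both ends by default (an assumption here)
        mask &= df[column].between(lower, upper)
    return df.loc[mask, 'symbol'].tolist()


# Invented single-year data for three tickers
toy = pd.DataFrame({
    'symbol': ['AAA', 'BBB', 'CCC'],
    'debtToEquity': [0.2, 0.9, 0.4],
    'currentRatio': [2.0, 3.0, 1.0],
    'roe': [0.15, 0.20, 0.30],
})

criteria = {'debtToEquity': [0, 0.5],
            'currentRatio': [1.5, 10.0],
            'roe': [0.10, 0.50]}

qualified = screen_by_ranges(toy, criteria)
print(qualified)
```

Here BBB fails the debt-to-equity range and CCC fails the current-ratio range, so only AAA qualifies. Rows with missing values in a screened column are excluded, because `between` evaluates to False for NaN.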