├── Projects
│   └── readme
├── Time Series Analysis in Python
│   ├── 2-Some Simple Time Series
│   │   ├── readme
│   │   ├── 07-how-about-stock-returns.py
│   │   ├── 06-are-stock-prices-a-random-walk.py
│   │   ├── 05-get-the-drift.py
│   │   ├── 01-taxing-excersize-compute-the-acf.py
│   │   ├── 03-cant-forecast-white-noise.py
│   │   ├── 08-seasonal-adjustment-during-tax-season.py
│   │   └── 02-are-we-confident-this-stock-is-mean-reverting.py
│   ├── 5-Putting It All Together
│   │   ├── readme
│   │   ├── chapter5.pdf
│   │   ├── chapter-details.png
│   │   ├── 6- Which ARMA Model is Best Part 1.pdf
│   │   ├── 6-Which ARMA Model is Best Part 2.pdf
│   │   ├── 7-Don't Throw Out That Winter Coat Yet P1.pdf
│   │   ├── 7-Don't Throw Out That Winter Coat Yet P2.pdf
│   │   ├── 06-which-arma-model-is-best.py
│   │   ├── 05-getting-warmed-up-look-at-aurocorrelations.py
│   │   ├── 02-a-dog-on-a-leash-(part-2).py
│   │   ├── 01-a-dog-on-a-leash-(part-1).py
│   │   ├── 04-is-temperature-a-random-walk-(with-drift).py
│   │   ├── 07-dont-throw-out-that-winter-coat-yet.py
│   │   └── 03-are-bitcoin-and-ethereum-cointegrated.py
│   ├── 3-Autoregressive (AR) Models
│   │   ├── readme
│   │   ├── chapter3.pdf
│   │   ├── 08-estimate-order-of-model-information-criteria.py
│   │   ├── 06-compare-ar-model-with-random-walk.py
│   │   ├── 03-estimating-an-ar-model.py
│   │   ├── 02-compare-the-acf-for-serval-ar-time-series.py
│   │   ├── 04-forecasting-with-an-ar-model.py
│   │   ├── 05-lets-forecast-interest-rates.py
│   │   ├── 07-estimate-order-of-model-pacf.py
│   │   └── 01-simulate-ar(1)-time-series.py
│   ├── 1-Correlation and Autocorrelation
│   │   ├── readme
│   │   ├── dataset
│   │   │   ├── readme
│   │   │   └── multiTimeline.csv
│   │   ├── 04-flying-saucers-arent-correlated-to-flying-markets.py
│   │   ├── 02-merging-time-series-with-differnt-dates.py
│   │   ├── 03-correlation-of-stocks-and-bonds.py
│   │   ├── 07-are-interest-rates-autocorrelated.py
│   │   ├── 06-a-popular-strategy-using-autocorrelation.py
│   │   └── 05-looking-at-a-regressions-r-squared.py
│   ├── 4-Moving Average (MA) and ARMA Models
│   │   ├── readme
│   │   ├── chapter4.pdf
│   │   ├── chapter-details.png
│   │   ├── 04-forecasting-with-ma-model.py
│   │   ├── 08-equivalance-of-ar(1)-and-ma(infinity).py
│   │   ├── 07-applying-an-ma-model.py
│   │   ├── 03-estimating-an-ma-model.py
│   │   ├── 01-simulate-ma(1)-time-series.py
│   │   ├── 06-more-data-cleaning-missing-data.py
│   │   ├── 02-compute-the-acf-for-several-ma-time-series.py
│   │   └── 05-high-frequency-stock-prices.py
│   ├── Alogrithems
│   │   ├── readme
│   │   └── AR_and_ARIMA_Models.ipynb
│   └── Readme
├── Machine-Learning-for-Time-series-Data
│   ├── Time Series and Machine Learning Primer
│   │   ├── readme
│   │   ├── 2_Inspecting_the_regression_data.ipynb
│   │   └── 1_Inspecting_the_classification_data___Python.ipynb
│   └── Readme
└── ReadMe.md

/Projects/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/dataset/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Machine-Learning-for-Time-series-Data/Time Series and Machine Learning Primer/readme:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/Time Series Analysis in Python/Alogrithems/readme:
--------------------------------------------------------------------------------
https://towardsdatascience.com/time-series-forecasting-predicting-stock-prices-using-facebooks-prophet-model-9ee1657132b5
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/chapter5.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/5-Putting It All Together/chapter5.pdf
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/chapter3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/3-Autoregressive (AR) Models/chapter3.pdf
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/chapter-details.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/5-Putting It All Together/chapter-details.png
--------------------------------------------------------------------------------
/Time Series Analysis in Python/Readme:
--------------------------------------------------------------------------------
https://learn.datacamp.com/courses/time-series-analysis-in-python
https://towardsdatascience.com/time-series-forecasting-predicting-stock-prices-using-facebooks-prophet-model-9ee1657132b5
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/chapter4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/chapter4.pdf
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/chapter-details.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/chapter-details.png
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/6- Which ARMA Model is Best Part 1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/5-Putting It All Together/6- Which ARMA Model is Best Part 1.pdf
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/6-Which ARMA Model is Best Part 2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/5-Putting It All Together/6-Which ARMA Model is Best Part 2.pdf
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/7-Don't Throw Out That Winter Coat Yet P1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/5-Putting It All Together/7-Don't Throw Out That Winter Coat Yet P1.pdf
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/7-Don't Throw Out That Winter Coat Yet P2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dr-mushtaq/Time-Series-Analysis-in-Python/master/Time Series Analysis in Python/5-Putting It All Together/7-Don't Throw Out That Winter Coat Yet P2.pdf
--------------------------------------------------------------------------------
/Machine-Learning-for-Time-series-Data/Readme:
--------------------------------------------------------------------------------

https://campus.datacamp.com/courses/machine-learning-for-time-series-data-in-python/time-series-and-machine-learning-primer?ex=1
https://www.studocu.com/in/document/icfai-university-tripura/marketing-management/lecture-notes/machine-learning-for-time-series-data-in-python/5193370/view
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/06-which-arma-model-is-best.py:
--------------------------------------------------------------------------------
'''
Which ARMA Model is Best?

Fit the temperature data to an AR(1), AR(2), MA(1), and ARMA(1,1) and see which model is the best fit, using the AIC criterion.
INSTRUCTIONS
100XP
For each ARMA model, create an instance of the ARMA class called mod using the argument order=(p,q)
Fit the model mod using the method fit() and save it in a results object called res
Print the AIC value, found in res.aic
'''
# Import the module for estimating an ARMA model
from statsmodels.tsa.arima_model import ARMA

# Fit the data to an AR(1) model and print AIC:
mod = ARMA(chg_temp, order=(1,0))
res = mod.fit()
print("The AIC for an AR(1) is: ", res.aic)

# Fit the data to an AR(2) model and print AIC:
mod = ARMA(chg_temp, order=(2,0))
res = mod.fit()
print("The AIC for an AR(2) is: ", res.aic)

# Fit the data to an MA(1) model and print AIC:
mod = ARMA(chg_temp, order=(0,1))
res = mod.fit()
print("The AIC for an MA(1) is: ", res.aic)

# Fit the data to an ARMA(1,1) model and print AIC:
mod = ARMA(chg_temp, order=(1,1))
res = mod.fit()
print("The AIC for an ARMA(1,1) is: ", res.aic)
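# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): the ARMA class
# used above was removed in statsmodels 0.13. Assuming statsmodels >= 0.12,
# where the replacement ARIMA class exists, the same AIC comparison can be
# written with order=(p, 0, q) standing in for ARMA(p,q):
from statsmodels.tsa.arima.model import ARIMA as ARIMA_v2

for order in [(1, 0, 0), (2, 0, 0), (0, 0, 1), (1, 0, 1)]:
    res_v2 = ARIMA_v2(chg_temp, order=order).fit()
    print("AIC for order", order, "is:", res_v2.aic)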
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/05-getting-warmed-up-look-at-aurocorrelations.py:
--------------------------------------------------------------------------------
'''
Getting "Warmed" Up: Look at Autocorrelations

Since the temperature series, temp_NY, is a random walk with drift, take first differences to make it stationary. Then compute the sample ACF and PACF. This will provide some guidance on the order of the model.

INSTRUCTIONS
100XP
Import the modules for plotting the sample ACF and PACF
Take first differences of the DataFrame temp_NY using the pandas method .diff()
Create two subplots for plotting the ACF and PACF
Plot the sample ACF of the differenced series
Plot the sample PACF of the differenced series
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the modules for plotting the sample ACF and PACF
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Take first difference of the temperature Series
chg_temp = temp_NY.diff()
chg_temp = chg_temp.dropna()

# Plot the ACF and PACF on the same page
fig, axes = plt.subplots(2,1)

# Plot the ACF
plot_acf(chg_temp, lags=20, ax=axes[0])

# Plot the PACF
plot_pacf(chg_temp, lags=20, ax=axes[1])
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/07-how-about-stock-returns.py:
--------------------------------------------------------------------------------
'''
How About Stock Returns?

In the last exercise, you showed that Amazon stock prices, contained in the DataFrame AMZN, follow a random walk. In this exercise, you will do the same thing for Amazon returns (percent change in prices) and show that the returns do not follow a random walk.

INSTRUCTIONS
100XP
Import the adfuller module from statsmodels.
Create a new DataFrame of AMZN returns by taking the percent change of prices using the method .pct_change().
Eliminate the NaN in the first row of returns using the .dropna() method on the DataFrame.
Run the Augmented Dickey-Fuller test on the Adj Close column of the returns DataFrame, and print out the p-value in results[1].
'''
# Import the adfuller module from statsmodels
from statsmodels.tsa.stattools import adfuller

# Create a DataFrame of AMZN returns
AMZN_ret = AMZN.pct_change()

# Eliminate the NaN in the first row of returns
AMZN_ret = AMZN_ret.dropna()

# Run the ADF test on the return series and print out the p-value
results = adfuller(AMZN_ret['Adj Close'])
print('The p-value of the test on returns is: ' + str(results[1]))
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/02-a-dog-on-a-leash-(part-2).py:
--------------------------------------------------------------------------------
'''
A Dog on a Leash? (Part 2)

To verify that HO and NG are cointegrated, first apply the Dickey-Fuller test to HO and NG separately to show they are random walks. Then apply the test to the difference, which should strongly reject the random walk hypothesis. The Heating Oil and Natural Gas prices are pre-loaded in DataFrames HO and NG.

INSTRUCTIONS
100XP
Perform the adfuller test on HO and on NG separately, and save the results (results are a list)
The argument for adfuller must be a series, so you need to include the column 'Close'
Print just the p-value (item [1] in the list)
Do the same thing for the spread, again converting the units of HO
'''
# Import the adfuller module from statsmodels
from statsmodels.tsa.stattools import adfuller

# Compute the ADF for HO and NG
result_HO = adfuller(HO['Close'])
print("The p-value for the ADF test on HO is ", result_HO[1])
result_NG = adfuller(NG['Close'])
print("The p-value for the ADF test on NG is ", result_NG[1])

# Compute the ADF of the spread
result_spread = adfuller(7.25 * HO['Close'] - NG['Close'])
print("The p-value for the ADF test on the spread is ", result_spread[1])
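# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): statsmodels
# also provides a one-call Engle-Granger cointegration test, coint(), which
# estimates the hedge ratio itself instead of assuming the 7.25 units
# conversion. A small p-value again suggests HO and NG are cointegrated.
from statsmodels.tsa.stattools import coint

coint_t, coint_pvalue, coint_crit = coint(HO['Close'], NG['Close'])
print("The p-value for the Engle-Granger test is ", coint_pvalue)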
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/01-a-dog-on-a-leash-(part-1).py:
--------------------------------------------------------------------------------
'''
A Dog on a Leash? (Part 1)

The Heating Oil and Natural Gas prices are pre-loaded in DataFrames HO and NG. First, plot both price series, which look like random walks. Then plot the difference between the two series, which should look more like a mean reverting series (to put the two series in the same units, we multiply the heating oil prices, in $/gallon, by 7.25, which converts it to $/millionBTU, the same units as Natural Gas).

The data for continuous futures (each contract has to be spliced together in a continuous series as contracts expire) was obtained from Quandl.

INSTRUCTIONS
100XP
Plot Heating Oil, HO, and Natural Gas, NG, on the same subplot
Make sure you multiply the HO price by 7.25 to match the units of NG
Plot the spread on a second subplot
The spread will be 7.25*HO - NG
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Plot the prices separately
plt.subplot(2,1,1)
plt.plot(7.25*HO, label='Heating Oil')
plt.plot(NG, label='Natural Gas')
plt.legend(loc='best', fontsize='small')

# Plot the spread
plt.subplot(2,1,2)
plt.plot(7.25*HO-NG, label='Spread')
plt.legend(loc='best', fontsize='small')
plt.axhline(y=0, linestyle='--', color='k')
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/04-is-temperature-a-random-walk-(with-drift).py:
--------------------------------------------------------------------------------
'''
Is Temperature a Random Walk (with Drift)?

An ARMA model is a simplistic approach to forecasting climate changes, but it illustrates many of the topics covered in this class.

The DataFrame temp_NY contains the average annual temperature in Central Park, NY from 1870-2016 (the data was downloaded from the NOAA here). Plot the data and test whether it follows a random walk (with drift).

INSTRUCTIONS
100XP
Convert the index of years into a datetime object using pd.to_datetime(), and since the data is annual, pass the argument format='%Y'.
Plot the data using .plot()
Compute the p-value of the Augmented Dickey-Fuller test using the adfuller function.
Save the results of the ADF test in result, and print out the p-value in result[1].
'''
import pandas as pd              # pre-imported in the course environment
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the adfuller function from the statsmodels module
from statsmodels.tsa.stattools import adfuller

# Convert the index to a datetime object
temp_NY.index = pd.to_datetime(temp_NY.index, format='%Y')

# Plot average temperatures
temp_NY.plot()
plt.show()

# Compute and print ADF p-value
result = adfuller(temp_NY['TAVG'])
print("The p-value for the ADF test is ", result[1])
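# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): temp_NY is
# pre-loaded in the course environment. To run this file standalone, one
# could substitute a synthetic random walk with drift; the 'TAVG' column name
# matches the exercise, but the numbers below are invented.
#
# import numpy as np
# years = [str(y) for y in range(1870, 2017)]
# tavg = 53 + 0.01*np.arange(len(years)) + 0.2*np.random.randn(len(years)).cumsum()
# temp_NY = pd.DataFrame({'TAVG': tavg}, index=years)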
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/04-forecasting-with-ma-model.py:
--------------------------------------------------------------------------------
'''
Forecasting with MA Model

As you did with AR models, you will use MA models to forecast in-sample and out-of-sample data using statsmodels.

For the simulated series simulated_data_1 with θ = -0.9, you will plot in-sample and out-of-sample forecasts. One big difference you will see between out-of-sample forecasts with an MA(1) model and an AR(1) model is that the MA(1) forecasts more than one period in the future are simply the mean of the sample.

INSTRUCTIONS
100XP
Import the class ARMA in the module statsmodels.tsa.arima_model
Create an instance of the ARMA class called mod using the simulated data simulated_data_1 and the (p,q) order of the model (in this case, for an MA(1), order=(0,1))
Fit the model mod using the method .fit() and save it in a results object called res
Plot the in-sample and out-of-sample forecasts of the data using the .plot_predict() method
Start the forecast 10 data points before the end of the 1000 point series at 990, and end the forecast 10 data points after the end of the series at point 1010
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

# Forecast the first MA(1) model
mod = ARMA(simulated_data_1, order=(0,1))
res = mod.fit()
res.plot_predict(start=990, end=1010)
plt.show()
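# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): you can check
# the "forecasts collapse to the sample mean" claim numerically. Assuming
# statsmodels >= 0.12, the replacement ARIMA class exposes the forecast path:
from statsmodels.tsa.arima.model import ARIMA as ARIMA_v2

res_v2 = ARIMA_v2(simulated_data_1, order=(0, 0, 1)).fit()
# Beyond one step ahead, every value equals the estimated mean of the series
print(res_v2.get_forecast(steps=5).predicted_mean)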
--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/06-are-stock-prices-a-random-walk.py:
--------------------------------------------------------------------------------
'''
Are Stock Prices a Random Walk?

Most stock prices follow a random walk (perhaps with a drift). You will look at a time series of Amazon stock prices, pre-loaded in the DataFrame AMZN, and run the 'Augmented Dickey-Fuller Test' from the statsmodels library to show that it does indeed follow a random walk.

With the ADF test, the "null hypothesis" (the hypothesis that we either reject or fail to reject) is that the series follows a random walk. Therefore, a low p-value (say less than 5%) means we can reject the null hypothesis that the series is a random walk.

INSTRUCTIONS
100XP
Import the adfuller module from statsmodels.
Run the Augmented Dickey-Fuller test on the series of closing stock prices, which is the column Adj Close in the AMZN DataFrame.
Print out the entire output, which includes the test statistic, the p-value, and the critical values for tests at the 1%, 5%, and 10% levels.
Print out just the p-value of the test (results[0] is the test statistic, and results[1] is the p-value).
'''
# Import the adfuller module from statsmodels
from statsmodels.tsa.stattools import adfuller

# Run the ADF test on the price series and print out the results
results = adfuller(AMZN['Adj Close'])
print(results)

# Just print out the p-value
print('The p-value of the test on prices is: ' + str(results[1]))
--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/04-flying-saucers-arent-correlated-to-flying-markets.py:
--------------------------------------------------------------------------------
'''
Flying Saucers Aren't Correlated to Flying Markets

Two trending series may show a strong correlation even if they are completely unrelated. This is referred to as "spurious correlation". That's why, when you look at the correlation of, say, two stocks, you should look at the correlation of their returns and not their levels.

To illustrate this point, calculate the correlation between the levels of the stock market and the annual sightings of UFOs. Both of those time series have trended up over the last several decades, and the correlation of their levels is very high. Then calculate the correlation of their percent changes. This will be close to zero, since there is no relationship between those two series.

The DataFrame levels contains the levels of DJI and UFO. UFO data was downloaded from www.nuforc.org.

INSTRUCTIONS
100XP
Calculate the correlation of the columns DJI and UFO.
Create a new DataFrame of changes using the .pct_change() method.
Re-calculate the correlation of the columns DJI and UFO on the changes.
'''
# Compute correlation of levels
correlation1 = levels['DJI'].corr(levels['UFO'])
print("Correlation of levels: ", correlation1)

# Compute correlation of percent changes
changes = levels.pct_change()
correlation2 = changes['DJI'].corr(changes['UFO'])
print("Correlation of changes: ", correlation2)
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/08-equivalance-of-ar(1)-and-ma(infinity).py:
--------------------------------------------------------------------------------
'''
Equivalence of AR(1) and MA(infinity)

To better understand the relationship between MA models and AR models, you will demonstrate that an AR(1) model is equivalent to an MA(∞) model with the appropriate parameters.

You will simulate an MA model with parameters 0.8, 0.8^2, 0.8^3, ... for a large number (30) of lags and show that it has the same autocorrelation function as an AR(1) model with ϕ = 0.8.

INSTRUCTIONS
100XP
Import the modules for simulating data and plotting the ACF from statsmodels
Use a list comprehension to build a list with exponentially decaying MA parameters: 1, 0.8, 0.8^2, 0.8^3, ...
Simulate 5000 observations of the MA(30) model
Plot the ACF of the simulated series
'''
import numpy as np               # pre-imported in the course environment
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the modules for simulating data and plotting the ACF
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.graphics.tsaplots import plot_acf

# Build a list of MA parameters
ma = [0.8**i for i in range(30)]

# Simulate the MA(30) model
ar = np.array([1])
AR_object = ArmaProcess(ar, ma)
simulated_data = AR_object.generate_sample(nsample=5000)

# Plot the ACF
plot_acf(simulated_data, lags=30)
plt.show()
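# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): ArmaProcess
# can also return theoretical autocorrelations, so the equivalence can be
# checked without simulation noise by comparing the MA(30) ACF with the ACF
# of an AR(1) with phi = 0.8.
AR1_object = ArmaProcess(np.array([1, -0.8]), np.array([1]))
print(AR_object.acf(lags=10))   # the MA(30) with decaying coefficients
print(AR1_object.acf(lags=10))  # the AR(1) with phi = 0.8 -- nearly identical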
--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/05-get-the-drift.py:
--------------------------------------------------------------------------------
'''
Get the Drift

In the last exercise, you simulated stock prices that follow a random walk. You will extend this in two ways in this exercise.

You will look at a random walk with a drift. Many time series, like stock prices, are random walks but tend to drift up over time.
In the last exercise, the noise in the random walk was additive: random, normal changes in price were added to the last price. However, when adding noise, you could theoretically get negative prices. Now you will make the noise multiplicative: you will add one to the random, normal changes to get a total return, and multiply that by the last price.

INSTRUCTIONS
100XP
Generate 500 random normal multiplicative "steps" with mean 0.1% and standard deviation 1% using np.random.normal(), which are now returns, and add one for total return.
Simulate stock prices P:
Cumulate the product of the steps using the numpy .cumprod() method.
Multiply the cumulative product of total returns by 100 to get a starting value of 100.
Plot the simulated random walk with drift.
'''
import numpy as np               # pre-imported in the course environment
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Generate 500 random steps
steps = np.random.normal(loc=0.001, scale=0.01, size=500) + 1

# Set first element to 1
steps[0] = 1

# Simulate the stock price, P, by taking the cumulative product
P = 100 * np.cumprod(steps)

# Plot the simulated stock prices
plt.plot(P)
plt.title("Simulated Random Walk with Drift")
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/07-dont-throw-out-that-winter-coat-yet.py:
--------------------------------------------------------------------------------
'''
Don't Throw Out That Winter Coat Yet

Finally, you will forecast the temperature over the next 30 years using an ARMA(1,1) model, including confidence bands around that estimate. Keep in mind that the estimate of the drift will have a much bigger impact on long range forecasts than the ARMA parameters.

Earlier, you determined that the temperature data follows a random walk and you looked at first differencing the data. You will use the ARIMA module on the temperature data, pre-loaded in the DataFrame temp_NY, but the forecast would be the same as using the ARMA module on changes in temperature, and then using cumulative sums of these changes to get the temperature.

INSTRUCTIONS
100XP
Create an instance of the ARIMA class called mod for an integrated ARMA(1,1) model
The d in order=(p,d,q) is one, since we first differenced once
Fit mod using the .fit() method and call the results res
Forecast the series using the plot_predict() method on res
Choose the start date as 1872-01-01 and the end date as 2046-01-01
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the ARIMA module from statsmodels
from statsmodels.tsa.arima_model import ARIMA

# Forecast temperatures using an ARIMA(1,1,1) model
mod = ARIMA(temp_NY, order=(1,1,1))
res = mod.fit()

# Plot the original series and the forecasted series
res.plot_predict(start='1872-01-01', end='2046-01-01')
plt.show()
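# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): the ARIMA
# class above and its plot_predict() were removed in statsmodels 0.13.
# Assuming statsmodels >= 0.12, the forecast and its confidence bands can be
# obtained from the replacement class like this:
from statsmodels.tsa.arima.model import ARIMA as ARIMA_v2

res_v2 = ARIMA_v2(temp_NY, order=(1, 1, 1)).fit()
forecast = res_v2.get_forecast(steps=30)  # 30 annual steps ahead
print(forecast.predicted_mean)
print(forecast.conf_int())                # the confidence bands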
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/07-applying-an-ma-model.py:
--------------------------------------------------------------------------------
'''
Applying an MA Model

The bouncing of the stock price between bid and ask induces a negative first order autocorrelation, but no autocorrelations at lags higher than 1. You get the same ACF pattern with an MA(1) model. Therefore, you will fit an MA(1) model to the intraday stock data from the last exercise.

The first step is to compute minute-by-minute returns from the prices in intraday, and plot the autocorrelation function. You should observe that the ACF looks like that for an MA(1) process. Then, fit the data to an MA(1), the same way you did for simulated data.

INSTRUCTIONS
100XP
Import plot_acf and ARMA modules from statsmodels
Compute minute-to-minute returns from prices:
Compute returns with the .pct_change() method
Use the pandas method .dropna() to drop the first row of returns, which is NaN
Plot the ACF function with lags up to 60 minutes
Fit the returns data to an MA(1) model and print out the MA(1) parameter
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import plot_acf and ARMA modules from statsmodels
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.arima_model import ARMA

# Compute returns from prices and drop the NaN
returns = intraday.pct_change()
returns = returns.dropna()

# Plot ACF of returns with lags up to 60 minutes
plot_acf(returns, lags=60)
plt.show()

# Fit the data to an MA(1) model
mod = ARMA(returns, order=(0,1))
res = mod.fit()
print(res.params)
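# ---------------------------------------------------------------------------
# Editor's note (a toy illustration, not part of the DataCamp solution): the
# negative lag-1 autocorrelation from bid-ask bounce is easy to reproduce.
# Trades at a fixed mid-price of 100 hit a made-up half-spread of 0.05 at
# random, and the resulting returns have lag-1 autocorrelation near -0.5.
import numpy as np
import pandas as pd

trades = 100.0 + 0.05 * np.random.choice([-1, 1], size=1000)
toy_returns = pd.Series(trades).pct_change().dropna()
print(toy_returns.autocorr())  # typically close to -0.5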
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/08-estimate-order-of-model-information-criteria.py:
--------------------------------------------------------------------------------
'''
Estimate Order of Model: Information Criteria

Another tool to identify the order of a model is to look at the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). These measures compute the goodness of fit with the estimated parameters, but apply a penalty function on the number of parameters in the model. You will take the AR(2) simulated data from the last exercise, saved as simulated_data_2, and compute the BIC as you vary the order, p, in an AR(p) from 0 to 6.

INSTRUCTIONS
70XP
Import the ARMA module for estimating the parameters and computing BIC.
Initialize a numpy array BIC, which we will use to store the BIC for each AR(p) model.
Loop through order p for p = 0,...,6.
For each p, fit the data to an AR model of order p.
For each p, save the value of BIC using the .bic attribute (no parentheses) of res.
Plot BIC as a function of p (for the plot, skip p=0 and plot for p=1,...,6).
'''
import numpy as np               # pre-imported in the course environment
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the module for estimating an ARMA model
from statsmodels.tsa.arima_model import ARMA

# Fit the data to an AR(p) for p = 0,...,6, and save the BIC
BIC = np.zeros(7)
for p in range(7):
    mod = ARMA(simulated_data_2, order=(p,0))
    res = mod.fit()
    # Save BIC for AR(p)
    BIC[p] = res.bic

# Plot the BIC as a function of p
plt.plot(range(1,7), BIC[1:7], marker='o')
plt.xlabel('Order of AR Model')
plt.ylabel('Bayesian Information Criterion')
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/06-compare-ar-model-with-random-walk.py:
--------------------------------------------------------------------------------
'''
Compare AR Model with Random Walk

Sometimes it is difficult to distinguish between a time series that is slightly mean reverting and a time series that does not mean revert at all, like a random walk. You will compare the ACF for the slightly mean-reverting interest rate series of the last exercise with a simulated random walk with the same number of observations.

You should notice when plotting the autocorrelation of these two series side-by-side that they look very similar.

INSTRUCTIONS
100XP
Import the plot_acf function from the statsmodels module
Create two axes for the two subplots
Plot the autocorrelation function for 12 lags of the interest rate series interest_rate_data in the top plot
Plot the autocorrelation function for 12 lags of the simulated random walk series simulated_data in the bottom plot
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the plot_acf module from statsmodels
from statsmodels.graphics.tsaplots import plot_acf

# Plot the interest rate series and the simulated random walk series side-by-side
fig, axes = plt.subplots(2,1)

# Plot the autocorrelation of the interest rate series in the top plot
fig = plot_acf(interest_rate_data, alpha=1, lags=12, ax=axes[0])

# Plot the autocorrelation of the simulated random walk series in the bottom plot
fig = plot_acf(simulated_data, alpha=1, lags=12, ax=axes[1])

# Label axes
axes[0].set_title("Interest Rate Data")
axes[1].set_title("Simulated Random Walk Data")
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/01-taxing-excersize-compute-the-acf.py:
--------------------------------------------------------------------------------
'''
Taxing Exercise: Compute the ACF

In the last chapter, you computed autocorrelations with one lag. Often we are interested in seeing the autocorrelation over many lags. The quarterly earnings for H&R Block (ticker symbol HRB) is plotted on the right, and you can see the extreme cyclicality of its earnings. A vast majority of its earnings occurs in the quarter that taxes are due.

You will compute the array of autocorrelations for the H&R Block quarterly earnings that is pre-loaded in the DataFrame HRB. Then, plot the autocorrelation function using the plot_acf module. This plot shows what the autocorrelation function looks like for cyclical earnings data. The ACF at lag=0 is always one, of course. In the next exercise, you will learn about the confidence interval for the ACF, but for now, suppress the confidence interval by setting alpha=1.

INSTRUCTIONS
100XP
Import the acf module and plot_acf module from statsmodels.
Compute the array of autocorrelations of the quarterly earnings data in DataFrame HRB.
Plot the autocorrelation function of the quarterly earnings data in HRB, and pass the argument alpha=1 to suppress the confidence interval.
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the acf module and the plot_acf module from statsmodels
from statsmodels.tsa.stattools import acf
from statsmodels.graphics.tsaplots import plot_acf

# Compute the acf array of HRB
acf_array = acf(HRB)
print(acf_array)

# Plot the acf function
plot_acf(HRB, alpha=1)
plt.show()
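# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): a peek ahead
# at the confidence intervals mentioned above -- passing alpha=0.05 makes
# acf() also return the 95% confidence bounds for each autocorrelation.
acf_array, conf_int = acf(HRB, alpha=0.05)
print(conf_int)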
--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/02-merging-time-series-with-differnt-dates.py:
--------------------------------------------------------------------------------
'''
Merging Time Series With Different Dates

Stock and bond markets in the U.S. are closed on different days. For example, although the bond market is closed on Columbus Day (around Oct 12) and Veterans Day (around Nov 11), the stock market is open on those days. One way to see the dates that the stock market is open and the bond market is closed is to convert both indexes of dates into sets and take the difference in sets.

The pandas .join() method is a convenient tool to merge the stock and bond DataFrames on dates when both markets are open.

Stock prices and 10-year US Government bond yields, which were downloaded from FRED, are pre-loaded in DataFrames stocks and bonds.

INSTRUCTIONS
100XP
Convert the dates in the stocks.index and bonds.index into sets.
Take the difference of the stock set minus the bond set to get those dates where the stock market has data but the bond market does not.
Merge the two DataFrames into a new DataFrame, stocks_and_bonds, using the .join() method, which has the syntax df1.join(df2).
To get the intersection of dates, use the argument how='inner'
'''
# Import pandas
import pandas as pd

# Convert the stock index and bond index into sets
set_stock_dates = set(stocks.index)
set_bond_dates = set(bonds.index)

# Take the difference between the sets and print
print(set_stock_dates - set_bond_dates)

# Merge stocks and bonds DataFrames using join()
stocks_and_bonds = stocks.join(bonds, how='inner')
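# ---------------------------------------------------------------------------
# Editor's note (a toy illustration, not part of the DataCamp solution): a
# minimal example of how .join(..., how='inner') keeps only the shared
# dates. The dates and values below are invented.
toy_stocks = pd.DataFrame({'SP500': [1, 2, 3]},
                          index=pd.to_datetime(['2016-10-10', '2016-10-11', '2016-10-12']))
toy_bonds = pd.DataFrame({'US10Y': [1.7, 1.8]},
                         index=pd.to_datetime(['2016-10-11', '2016-10-12']))
print(toy_stocks.join(toy_bonds, how='inner'))  # only the two shared dates survive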
--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/03-correlation-of-stocks-and-bonds.py:
--------------------------------------------------------------------------------
'''
Correlation of Stocks and Bonds

Investors are often interested in the correlation between the returns of two different assets for asset allocation and hedging purposes. In this exercise, you'll try to answer the question of whether stocks are positively or negatively correlated with bonds. Scatter plots are also useful for visualizing the correlation between the two variables.

Keep in mind that you should compute the correlations on the percentage changes rather than the levels.

Stock prices and 10-year bond yields are combined in a DataFrame called stocks_and_bonds under columns SP500 and US10Y.

The pandas and plotting modules have already been imported for you. For the remainder of the course, pandas is imported as pd and matplotlib.pyplot is imported as plt.

INSTRUCTIONS
100XP
Compute percent changes on the stocks_and_bonds DataFrame using the .pct_change() method and call the new DataFrame returns.
Compute the correlation of the columns SP500 and US10Y in the returns DataFrame using the .corr() method for Series, which has the syntax series1.corr(series2).
Show a scatter plot of the percentage change in stock and bond yields.
'''
# Compute percent change using pct_change()
returns = stocks_and_bonds.pct_change()

# Compute correlation using corr()
correlation = returns['SP500'].corr(returns['US10Y'])
print("Correlation of stocks and interest rates: ", correlation)

# Make scatter plot
plt.scatter(returns['SP500'], returns['US10Y'])
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/03-estimating-an-ar-model.py:
--------------------------------------------------------------------------------
'''
Estimating an AR Model

You will estimate the AR(1) parameter, ϕ, of one of the simulated series that you generated in the earlier exercise. Since the parameters are known for a simulated series, it is a good way to understand the estimation routines before applying them to real data.

For simulated_data_1 with a true ϕ of 0.9, you will print out the estimate of ϕ. In addition, you will also print out the entire output that is produced when you fit a time series, so you can get an idea of what other tests and summary statistics are available in statsmodels.

INSTRUCTIONS
100XP
Import the class ARMA in the module statsmodels.tsa.arima_model.
Create an instance of the ARMA class called mod using the simulated data simulated_data_1 and the order (p,q) of the model (in this case, for an AR(1), order=(1,0)).
Fit the model mod using the method .fit() and save it in a results object called res.
Print out the entire summary of results using the .summary() method.
Just print out an estimate of the constant and ϕ using the .params attribute (no parentheses).
'''
# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

# Fit an AR(1) model to the first simulated data
mod = ARMA(simulated_data_1, order=(1,0))
res = mod.fit()

# Print out summary information on the fit
print(res.summary())

# Print out the estimate for the constant and for phi
print("When the true phi=0.9, the estimate of phi (and the constant) are:")
print(res.params)
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/03-estimating-an-ma-model.py:
--------------------------------------------------------------------------------
'''
Estimating an MA Model

You will estimate the MA(1) parameter, θ, of one of the simulated series that you generated in the earlier exercise. Since the parameters are known for a simulated series, it is a good way to understand the estimation routines before applying them to real data.

For simulated_data_1 with a true θ of -0.9, you will print out the estimate of θ. In addition, you will also print out the entire output that is produced when you fit a time series, so you can get an idea of what other tests and summary statistics are available in statsmodels.

INSTRUCTIONS
100XP
Import the class ARMA in the module statsmodels.tsa.arima_model.
Create an instance of the ARMA class called mod using the simulated data simulated_data_1 and the order (p,q) of the model (in this case, for an MA(1), order=(0,1)).
Fit the model mod using the method .fit() and save it in a results object called res.
Print out the entire summary of results using the .summary() method.
Just print out an estimate of the constant and the θ parameter using the .params attribute (no arguments).
'''
# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

# Fit an MA(1) model to the first simulated data
mod = ARMA(simulated_data_1, order=(0,1))
res = mod.fit()

# Print out summary information on the fit
print(res.summary())

# Print out the estimate for the constant and for theta
print("When the true theta=-0.9, the estimate of theta (and the constant) are:")
print(res.params)
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/02-compare-the-acf-for-serval-ar-time-series.py:
--------------------------------------------------------------------------------
'''
Compare the ACF for Several AR Time Series

The autocorrelation function decays exponentially for an AR time series at a rate of the AR parameter. For example, if the AR parameter ϕ = +0.9, the first-lag autocorrelation will be 0.9, the second-lag will be (0.9)^2 = 0.81, the third-lag will be (0.9)^3 = 0.729, etc. A smaller AR parameter will have a steeper decay, and for a negative AR parameter, say -0.9, the decay will flip signs, so the first-lag autocorrelation will be -0.9, the second-lag will be (-0.9)^2 = 0.81, the third-lag will be (-0.9)^3 = -0.729, etc.

The object simulated_data_1 is the simulated time series with an AR parameter of +0.9, simulated_data_2 is for an AR parameter of -0.9, and simulated_data_3 is for an AR parameter of 0.3.

INSTRUCTIONS
100XP
Compute the autocorrelation function for each of the three simulated datasets using the plot_acf function with 20 lags (and suppress the confidence intervals by setting alpha=1).
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the plot_acf module from statsmodels
from statsmodels.graphics.tsaplots import plot_acf

# Plot 1: AR parameter = +0.9
plot_acf(simulated_data_1, alpha=1, lags=20)
plt.show()

# Plot 2: AR parameter = -0.9
plot_acf(simulated_data_2, alpha=1, lags=20)
plt.show()

# Plot 3: AR parameter = +0.3
plot_acf(simulated_data_3, alpha=1, lags=20)
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/03-cant-forecast-white-noise.py:
--------------------------------------------------------------------------------
'''
Can't Forecast White Noise

A white noise time series is simply a sequence of uncorrelated random variables that are identically distributed. Stock returns are often modelled as white noise. Unfortunately, for white noise, we cannot forecast future observations based on the past - autocorrelations at all lags are zero.

You will generate a white noise series and plot the autocorrelation function to show that it is zero for all lags. You can use np.random.normal() to generate random returns. For a Gaussian white noise process, the mean and standard deviation describe the entire process.
Plot this white noise series to see what it looks like, and then plot the autocorrelation function.

INSTRUCTIONS
100XP
Generate 1000 random normal returns using np.random.normal() with mean 2% (0.02) and standard deviation 5% (0.05), where the argument for the mean is loc and the argument for the standard deviation is scale.
Plot the time series.
Verify the mean and standard deviation of returns using np.mean() and np.std().
Plot the autocorrelation function using plot_acf with lags=20.
'''
import numpy as np               # pre-imported in the course environment
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the plot_acf module from statsmodels
from statsmodels.graphics.tsaplots import plot_acf

# Simulate white noise returns
returns = np.random.normal(loc=0.02, scale=0.05, size=1000)

# Print out the mean and standard deviation of returns
mean = np.mean(returns)
std = np.std(returns)
print("The mean is %5.3f and the standard deviation is %5.3f" %(mean,std))

# Plot returns series
plt.plot(returns)
plt.show()

# Plot autocorrelation function of white noise returns
plot_acf(returns, lags=20)
plt.show()
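# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): a formal
# check of "no autocorrelation" is the Ljung-Box test. Assuming statsmodels
# >= 0.12, it reports a p-value that should be large for white noise.
from statsmodels.stats.diagnostic import acorr_ljungbox

print(acorr_ljungbox(returns, lags=[20]))  # look for the lb_pvalue values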
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/04-forecasting-with-an-ar-model.py:
--------------------------------------------------------------------------------
'''
Forecasting with an AR Model

In addition to estimating the parameters of a model as you did in the last exercise, you can also do forecasting, both in-sample and out-of-sample, using statsmodels. The in-sample is a forecast of the next data point using the data up to that point, and the out-of-sample forecasts any number of data points in the future. These forecasts can be made using either the predict() method if you want the forecasts in the form of a series of data, or using the plot_predict() method if you want a plot of the forecasted data. You supply the starting point for forecasting and the ending point, which can be any number of data points after the data set ends.

For the simulated series simulated_data_1 with ϕ = 0.9, you will plot in-sample and out-of-sample forecasts.

INSTRUCTIONS
100XP
Import the class ARMA in the module statsmodels.tsa.arima_model
Create an instance of the ARMA class called mod using the simulated data simulated_data_1 and the order (p,q) of the model (in this case, for an AR(1), order=(1,0))
Fit the model mod using the method .fit() and save it in a results object called res
Plot the in-sample and out-of-sample forecasts of the data using the plot_predict() method
Start the forecast 10 data points before the end of the 1000 point series at 990, and end the forecast 10 data points after the end of the series at point 1010
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

# Forecast the first AR(1) model
mod = ARMA(simulated_data_1, order=(1,0))
res = mod.fit()
res.plot_predict(start=990, end=1010)
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/05-lets-forecast-interest-rates.py:
--------------------------------------------------------------------------------
'''
Let's Forecast Interest Rates

You will now use the forecasting techniques you learned in the last exercise and apply them to real data rather than simulated data. You will revisit a dataset from the first chapter: the annual data of 10-year interest rates going back 56 years, which is in a Series called interest_rate_data. Being able to forecast interest rates is of enormous importance, not only for bond investors but also for individuals like new homeowners who must decide between fixed and floating rate mortgages.

You saw in the first chapter that there is some mean reversion in interest rates over long horizons. In other words, when interest rates are high, they tend to drop and when they are low, they tend to rise over time. Currently they are below long-term rates, so they are expected to rise, but an AR model attempts to quantify how much they are expected to rise.

INSTRUCTIONS
100XP
Import the class ARMA in the module statsmodels.tsa.arima_model.
Create an instance of the ARMA class called mod using the annual interest rate data and choosing the order for an AR(1) model.
Fit the model mod using the method .fit() and save it in a results object called res.
Plot the in-sample and out-of-sample forecasts of the data using the .plot_predict() method.
Pass the arguments start=0 to start the in-sample forecast from the beginning, and choose end to be '2022' to forecast several years in the future.
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the ARMA module from statsmodels
from statsmodels.tsa.arima_model import ARMA

# Forecast interest rates using an AR(1) model
mod = ARMA(interest_rate_data, order=(1,0))
res = mod.fit()

# Plot the original series and the forecasted series
res.plot_predict(start=0, end='2022')
plt.legend(fontsize=8)
plt.show()
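# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): with the old
# ARMA class removed in statsmodels 0.13, the same forecast plot can be made
# (assuming statsmodels >= 0.13) with the replacement ARIMA class and the
# standalone plot_predict function:
from statsmodels.tsa.arima.model import ARIMA as ARIMA_v2
from statsmodels.graphics.tsaplots import plot_predict

res_v2 = ARIMA_v2(interest_rate_data, order=(1, 0, 0)).fit()
plot_predict(res_v2, start=0, end='2022')
plt.show()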
--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/08-seasonal-adjustment-during-tax-season.py:
--------------------------------------------------------------------------------
'''
Seasonal Adjustment During Tax Season

Many time series exhibit strong seasonal behavior. The procedure for removing the seasonal component of a time series is called seasonal adjustment. For example, most economic data published by the government is seasonally adjusted.

You saw earlier that by taking first differences of a random walk, you get a stationary white noise process. For seasonal adjustments, instead of taking first differences, you will take differences with a lag corresponding to the periodicity.

Look again at the ACF of H&R Block's quarterly earnings, pre-loaded in the DataFrame HRB, and there is a clear seasonal component. The autocorrelation is high for lags 4, 8, 12, 16, ..., because of the spike in earnings every four quarters during tax season. Apply a seasonal adjustment by taking the fourth difference (four represents the periodicity of the series). Then compute the autocorrelation of the transformed series.

INSTRUCTIONS
100XP
Create a new DataFrame of seasonally adjusted earnings by taking the lag-4 difference of quarterly earnings using the .diff() method.
Examine the first 10 rows of the seasonally adjusted DataFrame and notice that the first four rows are NaN.
Drop the NaN rows using the .dropna() method.
Plot the autocorrelation function of the seasonally adjusted DataFrame.
'''
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the plot_acf module from statsmodels
from statsmodels.graphics.tsaplots import plot_acf

# Seasonally adjust quarterly earnings
HRBsa = HRB.diff(4)

# Print the first 10 rows of the seasonally adjusted series
print(HRBsa.head(10))

# Drop the NaN data in the first four rows
HRBsa = HRBsa.dropna()

# Plot the autocorrelation function of the seasonally adjusted series
plot_acf(HRBsa)
plt.show()
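# ---------------------------------------------------------------------------
# Editor's note (a sketch, not part of the DataCamp solution): an
# alternative view of the same seasonality is an explicit decomposition into
# trend, seasonal, and residual parts; period=4 matches the quarterly
# tax-season cycle (assumes statsmodels >= 0.11 for the period argument).
from statsmodels.tsa.seasonal import seasonal_decompose

decomposition = seasonal_decompose(HRB, period=4)
decomposition.plot()
plt.show()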
--------------------------------------------------------------------------------
/Time Series Analysis in Python/5-Putting It All Together/03-are-bitcoin-and-ethereum-cointegrated.py:
--------------------------------------------------------------------------------
'''
Are Bitcoin and Ethereum Cointegrated?

Cointegration involves two steps: regressing one time series on the other to get the cointegration vector, and then performing an ADF test on the residuals of the regression. In the last example, there was no need to perform the first step, since we implicitly assumed the cointegration vector was (1, -1). In other words, we took the difference between the two series (after doing a units conversion). Here, you will do both steps.

You will regress the value of one cryptocurrency, bitcoin (BTC), on another cryptocurrency, ethereum (ETH). If we call the regression coefficient b, then the cointegration vector is simply (1, -b). Then perform the ADF test on BTC - b*ETH. Bitcoin and Ethereum prices are pre-loaded in DataFrames BTC and ETH.

INSTRUCTIONS
100XP
Import the statsmodels module for regression and the adfuller function
Add a constant to the ETH DataFrame using sm.add_constant()
Fit the regression using sm.OLS(y,x).fit() and save the results in result. The intercept is in result.params[0] and the slope in result.params[1]
Run the ADF test on BTC - b*ETH
'''
# Import the statsmodels module for regression and the adfuller function
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

# Regress BTC on ETH
ETH = sm.add_constant(ETH)
result = sm.OLS(BTC, ETH).fit()

# Compute ADF
b = result.params[1]
adf_stats = adfuller(BTC['Price'] - b*ETH['Price'])
print("The p-value for the ADF test is ", adf_stats[1])
# The data suggests that Bitcoin and Ethereum are cointegrated
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/07-estimate-order-of-model-pacf.py:
--------------------------------------------------------------------------------
'''
Estimate Order of Model: PACF

One useful tool to identify the order of an AR model is to look at the Partial Autocorrelation Function (PACF). In this exercise, you will simulate two time series, an AR(1) and an AR(2), and calculate the sample PACF for each. You will notice that for an AR(1), the PACF should have a significant lag-1 value, and roughly zeros after that. And for an AR(2), the sample PACF should have significant lag-1 and lag-2 values, and zeros after that.

Just like you used the plot_acf function in earlier exercises, here you will use a function called plot_pacf in the statsmodels module.

INSTRUCTIONS
100XP
Import the modules for simulating data and for plotting the PACF
Simulate an AR(1) with ϕ = 0.6 (remember that the sign for the AR parameter is reversed)
Plot the PACF for simulated_data_1 using the plot_pacf function
Simulate an AR(2) with ϕ1 = 0.6, ϕ2 = 0.3 (again, reverse the signs)
Plot the PACF for simulated_data_2 using the plot_pacf function
'''
import numpy as np               # pre-imported in the course environment
import matplotlib.pyplot as plt  # pre-imported in the course environment

# Import the modules for simulating data and for plotting the PACF
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.graphics.tsaplots import plot_pacf

# Simulate AR(1) with phi=+0.6
ma = np.array([1])
ar = np.array([1, -0.6])
AR_object = ArmaProcess(ar, ma)
simulated_data_1 = AR_object.generate_sample(nsample=5000)

# Plot PACF for AR(1)
plot_pacf(simulated_data_1, lags=20)
plt.show()

# Simulate AR(2) with phi1=+0.6, phi2=+0.3
ma = np.array([1])
ar = np.array([1, -0.6, -0.3])
AR_object = ArmaProcess(ar, ma)
simulated_data_2 = AR_object.generate_sample(nsample=5000)

# Plot PACF for AR(2)
plot_pacf(simulated_data_2, lags=20)
plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/01-simulate-ma(1)-time-series.py:
--------------------------------------------------------------------------------
'''
Simulate MA(1) Time Series

You will simulate and plot a few MA(1) time series, each with a different parameter, θ, using the arima_process module in statsmodels, just as you did in the last chapter for AR(1) models. You will look at an MA(1) model with a large positive θ and a large negative θ.
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/01-simulate-ma(1)-time-series.py:
--------------------------------------------------------------------------------
1 | '''
2 | Simulate MA(1) Time Series
3 | 
4 | You will simulate and plot a few MA(1) time series, each with a different parameter, θ, using the arima_process module in statsmodels, just as you did in the last chapter for AR(1) models. You will look at an MA(1) model with a large positive θ and a large negative θ.
5 | 
6 | As in the last chapter, when inputting the coefficients, you must include the zero-lag coefficient of 1, but unlike the last chapter on AR models, the sign of the MA coefficients is what we would expect. For example, for an MA(1) process with θ = -0.9, the array representing the MA parameters would be ma = np.array([1, -0.9])
7 | 
8 | INSTRUCTIONS
9 | 100XP
10 | Import the class ArmaProcess in the arima_process module.
11 | Plot the simulated MA(1) processes:
12 | Let ma1 represent an array of the MA parameters [1, θ] as explained above. The AR parameter array will contain just the lag-zero coefficient of one.
13 | With parameters ar1 and ma1, create an instance of the class ArmaProcess(ar,ma) called MA_object1.
14 | Simulate 1000 data points from the object you just created, MA_object1, using the method .generate_sample(). Plot the simulated data in a subplot.
15 | Repeat for the other MA parameter.
16 | '''
17 | # Import the module for simulating data
18 | from statsmodels.tsa.arima_process import ArmaProcess
19 | import numpy as np                # preloaded on DataCamp; imported for standalone use
20 | import matplotlib.pyplot as plt   # likewise
21 | 
22 | # Plot 1: MA parameter = -0.9
23 | plt.subplot(2,1,1)
24 | ar1 = np.array([1])
25 | ma1 = np.array([1, -0.9])
26 | MA_object1 = ArmaProcess(ar1, ma1)
27 | simulated_data_1 = MA_object1.generate_sample(nsample=1000)
28 | plt.plot(simulated_data_1)
29 | 
30 | # Plot 2: MA parameter = +0.9
31 | plt.subplot(2,1,2)
32 | ar2 = np.array([1])
33 | ma2 = np.array([1, 0.9])
34 | MA_object2 = ArmaProcess(ar2, ma2)
35 | simulated_data_2 = MA_object2.generate_sample(nsample=1000)
36 | plt.plot(simulated_data_2)
37 | 
38 | plt.show()
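39 | 
40 | # --- Optional extra (not part of the original exercise): ArmaProcess objects
41 | # expose the model's theoretical autocorrelation function via .acf(); for an
42 | # MA(1) it cuts off after lag 1, which a later exercise verifies on the data.
43 | print("Theoretical ACF, θ = -0.9:", MA_object1.acf(lags=4).round(3))
44 | print("Theoretical ACF, θ = +0.9:", MA_object2.acf(lags=4).round(3))
45 | # Expect [1, -0.497, 0, 0] and [1, 0.497, 0, 0]: the lag-1 value is θ/(1+θ^2)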
--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/07-are-interest-rates-autocorrelated.py:
--------------------------------------------------------------------------------
1 | '''
2 | Are Interest Rates Autocorrelated?
3 | 
4 | When you look at daily changes in interest rates, the autocorrelation is close to zero. However, if you resample the data and look at annual changes, the autocorrelation is negative. This implies that while short-term changes in interest rates may be uncorrelated, long-term changes in interest rates are negatively autocorrelated. A daily move up or down in interest rates is unlikely to tell you anything about interest rates tomorrow, but a move in interest rates over a year can tell you something about where interest rates are going over the next year. And this makes some economic sense: over long horizons, when interest rates go up, the economy tends to slow down, which consequently causes interest rates to fall, and vice versa.
5 | 
6 | The DataFrame daily_data contains daily data of 10-year interest rates from 1962 to 2017.
7 | 
8 | INSTRUCTIONS
9 | 100XP
10 | Create a new column, change_rates, of changes in daily rates using the .diff() method.
11 | Compute the autocorrelation of change_rates using the .autocorr() method.
12 | Use the .resample() method with rule='A' to convert to annual frequency, followed by .last() (the older how='last' argument is deprecated in recent pandas).
13 | Create a new column of rate changes, diff_rates, and compute the autocorrelation, as above.
14 | '''
15 | # Compute the daily change in interest rates
16 | daily_data['change_rates'] = daily_data['US10Y'].diff()
17 | 
18 | # Compute and print the autocorrelation of daily changes
19 | autocorrelation_daily = daily_data['change_rates'].autocorr()
20 | print("The autocorrelation of daily interest rate changes is %4.2f" %(autocorrelation_daily))
21 | 
22 | # Convert the daily data to annual data (keep a DataFrame so a column can be added)
23 | annual_data = daily_data[['US10Y']].resample(rule='A').last()
24 | 
25 | # Repeat the above for annual data
26 | annual_data['diff_rates'] = annual_data['US10Y'].diff()
27 | autocorrelation_annual = annual_data['diff_rates'].autocorr()
28 | print("The autocorrelation of annual interest rate changes is %4.2f" %(autocorrelation_annual))
--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/06-a-popular-strategy-using-autocorrelation.py:
--------------------------------------------------------------------------------
1 | '''
2 | A Popular Strategy Using Autocorrelation
3 | 
4 | One puzzling anomaly with stocks is that investors tend to overreact to news. Following large jumps, either up or down, stock prices tend to reverse. This is described as mean reversion in stock prices: prices tend to bounce back, or revert, towards previous levels after large moves, which are observed over time horizons of about a week. A more mathematical way to describe mean reversion is to say that stock returns are negatively autocorrelated.
5 | 
6 | This simple idea is actually the basis for a popular hedge fund strategy. If you're curious to learn more about this hedge fund strategy (although it's not necessary reading for anything else later in the course), see here.
7 | 
8 | You'll look at the autocorrelation of weekly returns of MSFT stock from 2012 to 2017. You'll start with a DataFrame MSFT of daily prices. You should use the .resample() method to get weekly prices and then compute returns from prices. Use the pandas method .autocorr() to get the autocorrelation and show that the autocorrelation is negative. Note that the .autocorr() method only works on Series, not DataFrames (even DataFrames with one column), so you will have to select the column in the DataFrame.
9 | 
10 | INSTRUCTIONS
11 | 100XP
12 | Use the .resample() method with rule='W', followed by .last(), to convert daily data to weekly data.
13 | Create a new DataFrame, returns, of percent changes in weekly prices using the .pct_change() method.
14 | Compute the autocorrelation using the .autocorr() method on the series of weekly returns, which is the column Adj Close in the DataFrame returns.
15 | '''
16 | # Convert the daily data to weekly data ('W' = weekly; the how='last' argument is deprecated)
17 | MSFT = MSFT.resample(rule='W').last()
18 | 
19 | # Compute the percentage change of prices
20 | returns = MSFT.pct_change()
21 | 
22 | # Compute and print the autocorrelation of returns
23 | autocorrelation = returns['Adj Close'].autocorr()
24 | print("The autocorrelation of weekly returns is %4.2f" %(autocorrelation))
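25 | 
26 | # --- Optional extra (not part of the original exercise): a toy illustration of
27 | # why mean reversion shows up as negative autocorrelation, using simulated
28 | # AR(1) returns with a negative coefficient.
29 | import numpy as np
30 | import pandas as pd
31 | np.random.seed(1)
32 | r = np.zeros(500)
33 | for t in range(1, 500):
34 |     r[t] = -0.2*r[t-1] + np.random.normal(scale=0.01)   # overreaction, then reversal
35 | print("Sample lag-1 autocorrelation: %4.2f" % pd.Series(r).autocorr())   # close to -0.2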
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/06-more-data-cleaning-missing-data.py:
--------------------------------------------------------------------------------
1 | '''
2 | More Data Cleaning: Missing Data
3 | 
4 | When you print out the length of the DataFrame intraday, you will notice that a few rows are missing. There will be missing data if there are no trades in a particular one-minute interval. One way to see which rows are missing is to take the difference of two sets: the full set of every minute and the set of the DataFrame index, which contains missing rows. You can fill in the missing rows with the .reindex() method, convert the index to time of day, and then plot the data.
5 | 
6 | Stocks trade at discrete one-cent increments (although a small percentage of trades occur in between the one-cent increments) rather than at continuous prices, and when you plot the data you should observe that there are long periods when the stock bounces back and forth over a one-cent range. This is sometimes referred to as "bid/ask bounce".
7 | 
8 | INSTRUCTIONS
9 | 100XP
10 | Print out the length of intraday using len().
11 | Find the missing rows by making range(391) into a set and subtracting the set of the intraday index, intraday.index.
12 | Fill in the missing rows using the .reindex() method, setting the index equal to the full range(391) and forward filling the missing data by passing the argument method='ffill'.
13 | Change the index to times using the pandas function date_range(), starting with '2017-08-28 9:30' and ending with '2017-08-28 16:00' and passing the argument freq='1min'.
14 | Plot the data and include gridlines.
15 | '''
16 | import pandas as pd                # preloaded on DataCamp; imported for standalone use
17 | import matplotlib.pyplot as plt    # likewise
18 | 
19 | # Notice that some rows are missing
20 | print("The length of the DataFrame is: ", len(intraday))
21 | 
22 | # Find the missing rows
23 | print("Missing rows: ", set(range(391)) - set(intraday.index))
24 | 
25 | # Fill in the missing rows
26 | intraday = intraday.reindex(range(391), method='ffill')
27 | 
28 | # Change the index to the intraday times
29 | intraday.index = pd.date_range(start='2017-08-28 9:30', end='2017-08-28 16:00', freq='1min')
30 | 
31 | # Plot the intraday time series
32 | intraday.plot(grid=True)
33 | plt.show()
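34 | 
35 | # --- Optional extra (not part of the original exercise): a five-row toy series
36 | # showing what reindex + forward fill does to the gaps.
37 | toy = pd.Series([10.00, 10.01, 10.02], index=[0, 1, 4])   # rows 2 and 3 missing
38 | print(toy.reindex(range(5), method='ffill'))
39 | # Rows 2 and 3 are filled with the last observed price, 10.01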
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/02-compute-the-acf-for-several-ma-time-series.py:
--------------------------------------------------------------------------------
1 | '''
2 | Compute the ACF for Several MA Time Series
3 | 
4 | Unlike an AR(1), an MA(1) model has no autocorrelation beyond lag 1, an MA(2) model has no autocorrelation beyond lag 2, etc. The lag-1 autocorrelation for an MA(1) model is not θ, but rather θ/(1+θ^2). For example, if the MA parameter, θ, is +0.9, the lag-1 autocorrelation will be 0.9/(1+0.9^2) = 0.497, and the autocorrelation at all other lags will be zero. If the MA parameter, θ, is -0.9, the lag-1 autocorrelation will be -0.9/(1+(-0.9)^2) = -0.497.
5 | 
6 | You will verify these autocorrelation functions for the three time series you generated in the last exercise.
7 | 
8 | INSTRUCTIONS
9 | 100XP
10 | simulated_data_1 is the simulated time series with an MA parameter of θ = -0.9, simulated_data_2 is for an MA parameter of θ = +0.9, and simulated_data_3 is for an MA parameter of θ = -0.3.
11 | Compute the autocorrelation function for each of the three simulated datasets using the plot_acf function with 20 lags.
12 | '''
13 | # Import the plot_acf function from statsmodels
14 | from statsmodels.graphics.tsaplots import plot_acf
15 | import matplotlib.pyplot as plt   # preloaded on DataCamp; imported for standalone use
16 | 
17 | # Plot three ACFs on the same page for comparison using subplots
18 | fig, axes = plt.subplots(3,1)
19 | 
20 | # Plot 1: MA parameter = -0.9
21 | plot_acf(simulated_data_1, lags=20, ax=axes[0])
22 | axes[0].set_title("MA Parameter -0.9")
23 | 
24 | # Plot 2: MA parameter = +0.9
25 | plot_acf(simulated_data_2, lags=20, ax=axes[1])
26 | axes[1].set_title("MA Parameter +0.9")
27 | 
28 | # Plot 3: MA parameter = -0.3
29 | plot_acf(simulated_data_3, lags=20, ax=axes[2])
30 | axes[2].set_title("MA Parameter -0.3")
31 | plt.show()
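32 | 
33 | # --- Optional extra (not part of the original exercise): checking the lag-1
34 | # formula numerically; each sample value should be close to θ/(1+θ^2).
35 | from statsmodels.tsa.stattools import acf
36 | for theta, data in [(-0.9, simulated_data_1), (0.9, simulated_data_2), (-0.3, simulated_data_3)]:
37 |     print("θ=%+.1f  theory=%+.3f  sample=%+.3f" % (theta, theta/(1+theta**2), acf(data, nlags=1)[1]))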
--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/05-looking-at-a-regressions-r-squared.py:
--------------------------------------------------------------------------------
1 | '''
2 | Looking at a Regression's R-Squared
3 | 
4 | R-squared measures how closely the data fit the regression line, so the R-squared in a simple regression is related to the correlation between the two variables. In particular, the magnitude of the correlation is the square root of the R-squared and the sign of the correlation is the sign of the regression coefficient.
5 | 
6 | In this exercise, you will start using the statistical package statsmodels, which performs much of the statistical modeling and testing that is found in R and software packages like SAS and MATLAB.
7 | 
8 | You will take two series, x and y, compute their correlation, and then regress y on x using the function OLS() in the statsmodels.api library. Most linear regressions contain a constant term, which is the intercept (the α in the regression y_t = α + β x_t + ϵ_t). To include a constant using the function OLS(), you need to add a column of 1's to the right-hand side of the regression.
9 | 
10 | The module statsmodels.api has been imported for you as sm.
11 | 
12 | INSTRUCTIONS
13 | 100XP
14 | Compute the correlation between x and y using the .corr() method.
15 | Regress y on x by first converting the Series x to a DataFrame and adding a constant to it using sm.add_constant(). Then fit the regression using sm.OLS(y,x).fit().
16 | Print out the results of the regression and compare the R-squared with the correlation.
17 | '''
18 | # Import the statsmodels module
19 | import statsmodels.api as sm
20 | import pandas as pd   # preloaded on DataCamp; imported for standalone use
21 | 
22 | # Compute correlation of x and y
23 | correlation = x.corr(y)
24 | print("The correlation between x and y is %4.2f" %(correlation))
25 | 
26 | # Convert the Series x to a DataFrame and name the column x
27 | dfx = pd.DataFrame(x, columns=['x'])
28 | 
29 | # Add a constant to the DataFrame x
30 | dfx1 = sm.add_constant(dfx)
31 | 
32 | # Fit the regression of y on x
33 | result = sm.OLS(y,dfx1).fit()
34 | 
35 | # Print out the results and look at the relationship between R-squared and the correlation above
36 | print(result.summary())
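37 | 
38 | # --- Optional extra (not part of the original exercise): verifying the claim
39 | # that the correlation squared equals the regression R-squared.
40 | print("correlation^2 = %4.2f" % correlation**2)
41 | print("R-squared     = %4.2f" % result.rsquared)
42 | # The two numbers should agree up to rounding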
--------------------------------------------------------------------------------
/Time Series Analysis in Python/3-Autoregressive (AR) Models/01-simulate-ar(1)-time-series.py:
--------------------------------------------------------------------------------
1 | '''
2 | Simulate AR(1) Time Series
3 | 
4 | You will simulate and plot a few AR(1) time series, each with a different parameter, ϕ, using the arima_process module in statsmodels. In this exercise, you will look at an AR(1) model with a large positive ϕ and a large negative ϕ, but feel free to play around with your own parameters.
5 | 
6 | There are a few conventions when using the arima_process module that require some explanation. First, these routines were made very generally to handle both AR and MA models. We will cover MA models next, so for now, just ignore the MA part. Second, when inputting the coefficients, you must include the zero-lag coefficient of 1, and the sign of the other coefficients is opposite what we have been using (to be consistent with the time series literature in signal processing). For example, for an AR(1) process with ϕ = 0.9, the array representing the AR parameters would be ar = np.array([1, -0.9])
7 | 
8 | INSTRUCTIONS
9 | 100XP
10 | Import the class ArmaProcess in the arima_process module.
11 | Plot the simulated AR processes:
12 | Let ar1 represent an array of the AR parameters [1, -ϕ] as explained above. For now, the MA parameter array, ma1, will contain just the lag-zero coefficient of one.
13 | With parameters ar1 and ma1, create an instance of the class ArmaProcess(ar,ma) called AR_object1.
14 | Simulate 1000 data points from the object you just created, AR_object1, using the method .generate_sample(). Plot the simulated data in a subplot.
15 | Repeat for the other AR parameter.
16 | '''
17 | # Import the module for simulating data
18 | from statsmodels.tsa.arima_process import ArmaProcess
19 | import numpy as np                # preloaded on DataCamp; imported for standalone use
20 | import matplotlib.pyplot as plt   # likewise
21 | 
22 | # Plot 1: AR parameter = +0.9
23 | plt.subplot(2,1,1)
24 | ar1 = np.array([1, -0.9])
25 | ma1 = np.array([1])
26 | AR_object1 = ArmaProcess(ar1, ma1)
27 | simulated_data_1 = AR_object1.generate_sample(nsample=1000)
28 | plt.plot(simulated_data_1)
29 | 
30 | # Plot 2: AR parameter = -0.9
31 | plt.subplot(2,1,2)
32 | ar2 = np.array([1, 0.9])
33 | ma2 = np.array([1])
34 | AR_object2 = ArmaProcess(ar2, ma2)
35 | simulated_data_2 = AR_object2.generate_sample(nsample=1000)
36 | plt.plot(simulated_data_2)
37 | plt.show()
--------------------------------------------------------------------------------
/Time Series Analysis in Python/4-Moving Average (MA) and ARMA Models/05-high-frequency-stock-prices.py:
--------------------------------------------------------------------------------
1 | '''
2 | Higher frequency stock data is well modelled by an MA(1) process, so it's a nice application of the models in this chapter.
3 | 
4 | The DataFrame intraday contains one day's prices (on September 1, 2017) for Sprint stock (ticker symbol "S") sampled at a frequency of one minute. The stock market is open for 6.5 hours (390 minutes), from 9:30am to 4:00pm.
5 | 
6 | Before you can analyze the time series data, you will have to clean it up a little, which you will do in this and the next two exercises. When you look at the first few rows (see the IPython Shell), you'll notice several things. First, there are no column headers. The data is not time stamped from 9:30 to 4:00, but rather goes from 0 to 390. And you will notice that the first date is the odd-looking "a1504272600". The number after the "a" is Unix time, which is the number of seconds since January 1, 1970. This is how this dataset separates each day of intraday data.
7 | 
8 | If you look at the data types, you'll notice that the DATE column is an object, which here means a string. You will need to change that to numeric before you can clean up some missing data.
9 | 
10 | The source of the minute data is Google Finance (see here on how the data was downloaded).
11 | 
12 | The datetime module has already been imported for you.
13 | 
14 | INSTRUCTIONS
15 | 100XP
16 | Manually change the first date to zero using .iloc[0,0].
17 | Change the two column headers to 'DATE' and 'CLOSE' by setting intraday.columns equal to a list containing those two strings.
18 | Use the pandas attribute .dtypes (no parentheses) to see what type of data are in each column.
19 | Convert the DATE column to numeric using the pandas function to_numeric().
20 | Make the DATE column the new index by using the pandas method set_index(), which will take the DATE column as its argument.
21 | '''
22 | # Import datetime module
23 | import datetime
24 | import pandas as pd   # preloaded on DataCamp; imported for standalone use
25 | 
26 | # Change the first date to zero
27 | intraday.iloc[0,0] = 0
28 | 
29 | # Change the column headers to 'DATE' and 'CLOSE'
30 | intraday.columns = ['DATE','CLOSE']
31 | 
32 | # Examine the data types for each column
33 | print(intraday.dtypes)
34 | 
35 | # Convert DATE column to numeric
36 | intraday['DATE'] = pd.to_numeric(intraday['DATE'])
37 | 
38 | # Make the `DATE` column the new index
39 | intraday = intraday.set_index('DATE')
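40 | 
41 | # --- Optional extra (not part of the original exercise): decoding the odd
42 | # first date "a1504272600"; the number after the "a" is Unix time.
43 | print(datetime.datetime.utcfromtimestamp(1504272600))
44 | # 2017-09-01 13:30:00 UTC, i.e. 9:30am New York time: the first minute of trading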
--------------------------------------------------------------------------------
/Time Series Analysis in Python/2-Some Simple Time Series/02-are-we-confident-this-stock-is-mean-reverting.py:
--------------------------------------------------------------------------------
1 | '''
2 | Are We Confident This Stock is Mean Reverting?
3 | 
4 | In the last chapter, you saw that the autocorrelation of MSFT's weekly stock returns was -0.16. That autocorrelation seems large, but is it statistically significant? In other words, can you say that there is less than a 5% chance that we would observe such a large negative autocorrelation if the true autocorrelation were really zero? And are there any autocorrelations at other lags that are significantly different from zero?
5 | 
6 | Even if the true autocorrelations were zero at all lags, in a finite sample of returns you won't see the estimate of the autocorrelations exactly zero. In fact, the standard deviation of the sample autocorrelation is 1/sqrt(N), where N is the number of observations, so if N = 100, for example, the standard deviation of the ACF is 0.1, and since 95% of a normal curve is between +1.96 and -1.96 standard deviations from the mean, the 95% confidence interval is ±1.96/sqrt(N). This approximation only holds when the true autocorrelations are all zero.
7 | 
8 | You will compute the actual and approximate confidence interval for the ACF, and compare it to the lag-1 autocorrelation of -0.16 from the last chapter. The weekly returns of Microsoft are pre-loaded in a DataFrame called returns.
9 | 
10 | INSTRUCTIONS
11 | 100XP
12 | Recompute the autocorrelation of weekly returns in the Series 'Adj Close' in the returns DataFrame.
13 | Find the number of observations in the returns DataFrame using the len() function.
14 | Approximate the 95% confidence interval of the estimated autocorrelation. The math function sqrt() has been imported and can be used.
15 | Plot the autocorrelation function of returns using plot_acf that was imported from statsmodels. Set alpha=0.05 for the confidence intervals (that's the default) and lags=20.
16 | '''
17 | # Import the plot_acf function from statsmodels and sqrt from math
18 | from statsmodels.graphics.tsaplots import plot_acf
19 | from math import sqrt
20 | import matplotlib.pyplot as plt   # preloaded on DataCamp; imported for standalone use
21 | 
22 | # Compute and print the autocorrelation of MSFT weekly returns
23 | autocorrelation = returns['Adj Close'].autocorr()
24 | print("The autocorrelation of weekly MSFT returns is %4.2f" %(autocorrelation))
25 | 
26 | # Find the number of observations by taking the length of the returns DataFrame
27 | nobs = len(returns)
28 | 
29 | # Compute the approximate confidence interval
30 | conf = 1.96/sqrt(nobs)
31 | print("The approximate confidence interval is +/- %4.2f" %(conf))
32 | 
33 | # Plot the autocorrelation function with 95% confidence intervals and 20 lags using plot_acf
34 | plot_acf(returns, alpha=0.05, lags=20)
35 | plt.show()
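36 | 
37 | # --- Optional extra (not part of the original exercise): the ±1.96/sqrt(N)
38 | # band for a few sample sizes. With roughly 5 years of weekly returns,
39 | # N is about 260, so the band is about ±0.12 and the observed -0.16 lies
40 | # outside it, i.e. it is statistically significant.
41 | for n in (100, 260, 1000):
42 |     print("N = %4d  ->  +/- %4.2f" % (n, 1.96/sqrt(n)))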
--------------------------------------------------------------------------------
/Time Series Analysis in Python/1-Correlation and Autocorrelation/dataset/multiTimeline.csv:
--------------------------------------------------------------------------------
1 | Category: All categories
2 | 
3 | Month,diet: (Worldwide),gym: (Worldwide),finance: (Worldwide)
4 | 2004-01,100,31,48
5 | 2004-02,75,26,49
6 | 2004-03,67,24,47
7 | 2004-04,70,22,48
8 | 2004-05,72,22,43
9 | 2004-06,64,24,45
10 | 2004-07,60,23,44
11 | 2004-08,59,28,44
12 | 2004-09,53,25,44
13 | 2004-10,52,24,45
14 | 2004-11,50,23,43
15 | 2004-12,42,24,41
16 | 2005-01,64,32,44
17 | 2005-02,54,28,48
18 | 2005-03,56,27,46
19 | 2005-04,56,25,44
20 | 2005-05,59,24,42
21 | 2005-06,53,25,44
22 | 2005-07,53,25,44
23 | 2005-08,51,28,44
24 | 2005-09,47,28,44
25 | 2005-10,46,27,43
26 | 2005-11,44,25,42
27 | 2005-12,40,24,38
28 | 2006-01,64,34,44
29 | 2006-02,51,29,44
30 | 2006-03,51,28,46
31 | 2006-04,50,27,47
32 | 2006-05,50,26,45
33 | 2006-06,52,25,44
34 | 2006-07,51,27,42
35 | 2006-08,51,30,44
36 | 2006-09,45,30,46
37 | 2006-10,42,27,45
38 | 2006-11,43,26,45
39 | 2006-12,37,26,41
40 | 2007-01,57,35,46
41 | 2007-02,49,33,47
42 | 2007-03,51,32,48
43 | 2007-04,51,32,48
44 | 2007-05,49,32,47
45 | 2007-06,47,31,46
46 | 2007-07,49,30,50
47 | 2007-08,44,31,54
48 | 2007-09,46,32,52
49 | 2007-10,43,28,52
50 | 2007-11,40,27,50
51 | 2007-12,34,26,43
52 | 2008-01,52,35,53
53 | 2008-02,47,30,50
54 | 2008-03,46,29,53
55 | 2008-04,47,28,52
56 | 2008-05,45,27,48
57 | 2008-06,43,27,49
58 | 2008-07,44,28,52
59 | 2008-08,43,31,48
60 | 2008-09,42,33,61
61 | 2008-10,43,28,73
62 | 2008-11,39,28,58
63 | 2008-12,38,27,50
64 | 2009-01,52,35,54
65 | 2009-02,46,30,58
66 | 2009-03,48,28,58
67 | 2009-04,49,29,57
68 | 2009-05,48,28,53
69 | 2009-06,47,28,55
70 | 2009-07,47,28,57
71 | 2009-08,48,30,57
72 | 2009-09,44,31,60
73 | 2009-10,44,28,57
74 | 2009-11,41,27,52
75 | 2009-12,39,27,47
76 | 2010-01,57,35,51
77 | 2010-02,50,31,53
78 | 2010-03,51,30,53
79 | 2010-04,51,29,56
80 | 2010-05,49,28,55
81 | 2010-06,47,28,52
82 | 2010-07,48,29,50
83 | 2010-08,48,31,51
84 | 2010-09,48,32,54
85 | 2010-10,45,30,51
86 | 2010-11,43,28,49
87 | 2010-12,39,28,46
88 | 2011-01,61,39,51
89 | 2011-02,53,34,50
90 | 2011-03,54,33,51
91 | 2011-04,59,31,49
92 | 2011-05,57,31,50
93 | 2011-06,52,32,48
94 | 2011-07,52,30,48
95 | 2011-08,52,34,56
96 | 2011-09,50,36,52
97 | 2011-10,48,33,50
98 | 2011-11,49,33,47
99 | 2011-12,44,32,42
100 | 2012-01,64,42,44
101 | 2012-02,57,37,47
102 | 2012-03,57,35,47
103 | 2012-04,56,34,45
104 | 2012-05,55,33,47
105 | 2012-06,52,35,43
106 | 2012-07,55,37,44
107 | 2012-08,55,37,45
108 | 2012-09,51,39,46
109 | 2012-10,46,33,45
110 | 2012-11,44,32,42
111 | 2012-12,42,32,38
112 | 
2013-01,65,43,46 113 | 2013-02,58,37,46 114 | 2013-03,59,37,46 115 | 2013-04,58,37,47 116 | 2013-05,55,36,46 117 | 2013-06,55,37,43 118 | 2013-07,55,37,46 119 | 2013-08,51,39,46 120 | 2013-09,52,41,47 121 | 2013-10,46,38,47 122 | 2013-11,46,37,44 123 | 2013-12,42,36,40 124 | 2014-01,61,47,46 125 | 2014-02,53,44,47 126 | 2014-03,54,43,47 127 | 2014-04,53,40,46 128 | 2014-05,50,39,44 129 | 2014-06,49,39,44 130 | 2014-07,48,41,45 131 | 2014-08,47,40,44 132 | 2014-09,46,40,48 133 | 2014-10,43,38,47 134 | 2014-11,42,37,42 135 | 2014-12,38,38,42 136 | 2015-01,54,48,46 137 | 2015-02,48,43,47 138 | 2015-03,51,43,46 139 | 2015-04,49,42,46 140 | 2015-05,48,41,44 141 | 2015-06,48,42,45 142 | 2015-07,47,42,46 143 | 2015-08,46,43,48 144 | 2015-09,43,45,48 145 | 2015-10,42,41,46 146 | 2015-11,39,42,42 147 | 2015-12,38,42,40 148 | 2016-01,51,52,44 149 | 2016-02,48,46,46 150 | 2016-03,48,47,44 151 | 2016-04,48,44,43 152 | 2016-05,47,46,42 153 | 2016-06,44,46,45 154 | 2016-07,43,58,41 155 | 2016-08,45,53,41 156 | 2016-09,43,51,44 157 | 2016-10,40,45,41 158 | 2016-11,39,44,43 159 | 2016-12,36,44,39 160 | 2017-01,55,56,43 161 | 2017-02,56,51,44 162 | 2017-03,50,51,44 163 | 2017-04,49,48,42 164 | 2017-05,48,48,43 165 | 2017-06,48,49,41 166 | 2017-07,52,52,43 167 | 2017-08,46,52,43 168 | 2017-09,44,50,47 169 | 2017-10,44,47,45 170 | 2017-11,41,47,47 171 | 2017-12,39,45,56 172 | -------------------------------------------------------------------------------- /Machine-Learning-for-Time-series-Data/Time Series and Machine Learning Primer/2_Inspecting_the_regression_data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "coursera": { 6 | "course_slug": "neural-networks-deep-learning", 7 | "graded_item_id": "XaIWT", 8 | "launcher_item_id": "zAgPl" 9 | }, 10 | "kernelspec": { 11 | "display_name": "Python 3", 12 | "language": "python", 13 | "name": "python3" 14 | }, 15 | "language_info": { 16 | "codemirror_mode": { 17 | "name": "ipython", 18 | "version": 3 19 | }, 20 | "file_extension": ".py", 21 | "mimetype": "text/x-python", 22 | "name": "python", 23 | "nbconvert_exporter": "python", 24 | "pygments_lexer": "ipython3", 25 | "version": "3.6.2" 26 | }, 27 | "colab": { 28 | "name": "2-Inspecting the regression data.ipynb", 29 | "provenance": [], 30 | "collapsed_sections": [], 31 | "toc_visible": true 32 | } 33 | }, 34 | "cells": [ 35 | { 36 | "cell_type": "markdown", 37 | "metadata": { 38 | "id": "2WZFQQORCAcS", 39 | "colab_type": "text" 40 | }, 41 | "source": [ 42 | "# Inspecting the regression data\n", 43 | "\n", 44 | "**Introduction:**\n", 45 | "\n", 46 | "The next dataset contains information about company market value over several years of time. This is one of the most popular kind of time series data used for regression. If you can model the value of a company as it changes over time, you can make predictions about where that company will be in the future. This dataset was also originally provided as part of a public Kaggle competition.\n", 47 | "\n", 48 | "In this exercise, you'll plot the time series for a number of companies to get an understanding of how they are (or aren't) related to one another." 
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "metadata": {
54 | "id": "LjDRRdAnPfvj",
55 | "colab_type": "code",
56 | "colab": {}
57 | },
58 | "source": [
59 | "!git clone https://github.com/hussain0048/Time-Series-Analysis-in-Python.git"
60 | ],
61 | "execution_count": null,
62 | "outputs": []
63 | },
64 | {
65 | "cell_type": "markdown",
66 | "metadata": {
67 | "id": "beVu-1UqCAcU",
68 | "colab_type": "text"
69 | },
70 | "source": [
71 | "## 1 - Importing necessary libraries ##"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "metadata": {
77 | "id": "ZF03EKpuCAcV",
78 | "colab_type": "code",
79 | "colab": {}
80 | },
81 | "source": [
82 | "import pandas as pd\n", "import matplotlib.pyplot as plt  # needed for the plotting section below"
83 | ],
84 | "execution_count": null,
85 | "outputs": []
86 | },
87 | {
88 | "cell_type": "markdown",
89 | "metadata": {
90 | "collapsed": true,
91 | "id": "nhYP15aoCAcf",
92 | "colab_type": "text"
93 | },
94 | "source": [
95 | "## 2 - Instruction ##\n",
96 | "\n",
97 | "Import the data with Pandas (stored in the file 'prices.csv').\n",
98 | "\n",
99 | "Convert the index of data to datetime.\n",
100 | "\n",
101 | "Loop through each column of data and plot the column's values over time.\n"
102 | ]
103 | },
104 | {
105 | "cell_type": "markdown",
106 | "metadata": {
107 | "id": "ehTTQE9YJF1Q",
108 | "colab_type": "text"
109 | },
110 | "source": [
111 | "## 3 - Read the Data\n"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "metadata": {
117 | "scrolled": true,
118 | "id": "TvqFi8-XCAcy",
119 | "colab_type": "code",
120 | "colab": {}
121 | },
122 | "source": [
123 | "data = pd.read_csv('prices.csv', index_col=0)"
124 | ],
125 | "execution_count": null,
126 | "outputs": []
127 | },
128 | {
129 | "cell_type": "markdown",
130 | "metadata": {
131 | "id": "izbM6eEUKAV-",
132 | "colab_type": "text"
133 | },
134 | "source": [
135 | "## 4 - Convert Data into Time index ##\n",
136 | "\n",
137 | "Convert the index of the DataFrame to datetime"
138 | ]
139 | },
140 | {
141 | "cell_type": "code",
142 | "metadata": {
143 | "id": "zt-vctO1Kb3o",
144 | "colab_type": "code",
145 | "colab": {}
146 | },
147 | "source": [
148 | "data.index = pd.to_datetime(data.index)\n",
149 | "print(data.head())"
150 | ],
151 | "execution_count": null,
152 | "outputs": []
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "metadata": {
157 | "id": "11oflYYNUDjC",
158 | "colab_type": "text"
159 | },
160 | "source": [
161 | "## 5 - 
Plot Data\n", 162 | "Loop through each column, plot its values over time" 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "metadata": { 168 | "id": "5VEu604HURRW", 169 | "colab_type": "code", 170 | "colab": {} 171 | }, 172 | "source": [ 173 | "fig, ax = plt.subplots()\n", 174 | "for column in data.columns:\n", 175 | " data[column].plot(ax=ax, label=column)\n", 176 | "ax.legend()\n", 177 | "plt.show()" 178 | ], 179 | "execution_count": null, 180 | "outputs": [] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": { 185 | "id": "1x8QNZN3CAdF", 186 | "colab_type": "text" 187 | }, 188 | "source": [ 189 | "## 6 - Hints ##\n", 190 | "\n", 191 | "Use pd.read_csv() to load in the data\n", 192 | "\n", 193 | "Use pd.to_datetime() to convert the index of the DataFrame to datetime.\n", 194 | "\n", 195 | "All the column names of data can be accessed via data.columns.\n" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": { 201 | "id": "vm7uTbEWCAei", 202 | "colab_type": "text" 203 | }, 204 | "source": [ 205 | "References:\n", 206 | "- Machine Learning for Time SeriesData in Python\n", 207 | "https://www.studocu.com/in/document/icfai-university-tripura/marketing-management/lecture-notes/machine-learning-for-time-series-data-in-python/5193370/view" 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "metadata": { 213 | "id": "7LvqfD4BCAei", 214 | "colab_type": "code", 215 | "colab": {} 216 | }, 217 | "source": [ 218 | "" 219 | ], 220 | "execution_count": null, 221 | "outputs": [] 222 | } 223 | ] 224 | } -------------------------------------------------------------------------------- /Machine-Learning-for-Time-series-Data/Time Series and Machine Learning Primer/1_Inspecting_the_classification_data___Python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "coursera": { 6 | "course_slug": "neural-networks-deep-learning", 7 | "graded_item_id": "XaIWT", 8 | "launcher_item_id": "zAgPl" 9 | }, 10 | "kernelspec": { 11 | "display_name": "Python 3", 12 | "language": "python", 13 | "name": "python3" 14 | }, 15 | "language_info": { 16 | "codemirror_mode": { 17 | "name": "ipython", 18 | "version": 3 19 | }, 20 | "file_extension": ".py", 21 | "mimetype": "text/x-python", 22 | "name": "python", 23 | "nbconvert_exporter": "python", 24 | "pygments_lexer": "ipython3", 25 | "version": "3.6.2" 26 | }, 27 | "colab": { 28 | "name": "1-Inspecting the classification data _ Python.ipynb", 29 | "provenance": [], 30 | "collapsed_sections": [], 31 | "toc_visible": true 32 | } 33 | }, 34 | "cells": [ 35 | { 36 | "cell_type": "markdown", 37 | "metadata": { 38 | "id": "2WZFQQORCAcS", 39 | "colab_type": "text" 40 | }, 41 | "source": [ 42 | "# Inspecting the classification data\n", 43 | "\n", 44 | "**Introduction:**\n", 45 | "\n", 46 | "In these final exercises of this chapter, you'll explore the two datasets you'll use in this course.\n", 47 | "\n", 48 | "The first is a collection of heartbeat sounds. Hearts normally have a predictable sound pattern as they beat, but some disorders can cause the heart to beat abnormally. This dataset contains a training set with labels for each type of heartbeat, and a testing set with no labels. You'll use the testing set to validate your models.\n", 49 | "\n", 50 | "As you have labeled data, this dataset is ideal for classification. 
In fact, it was originally offered as a part of a public Kaggle competition."
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "metadata": {
56 | "id": "LjDRRdAnPfvj",
57 | "colab_type": "code",
58 | "colab": {
59 | "base_uri": "https://localhost:8080/",
60 | "height": 136
61 | },
62 | "outputId": "b96fd886-7abe-4eec-9cdc-4bbe9ed82dc2"
63 | },
64 | "source": [
65 | "!git clone https://github.com/hussain0048/Time-Series-Analysis-in-Python.git"
66 | ],
67 | "execution_count": 1,
68 | "outputs": [
69 | {
70 | "output_type": "stream",
71 | "text": [
72 | "Cloning into 'Time-Series-Analysis-in-Python'...\n",
73 | "remote: Enumerating objects: 27, done.\u001b[K\n",
74 | "remote: Counting objects: 100% (27/27), done.\u001b[K\n",
75 | "remote: Compressing objects: 100% (27/27), done.\u001b[K\n",
76 | "remote: Total 166 (delta 4), reused 0 (delta 0), pack-reused 139\u001b[K\n",
77 | "Receiving objects: 100% (166/166), 2.02 MiB | 14.97 MiB/s, done.\n",
78 | "Resolving deltas: 100% (54/54), done.\n"
79 | ],
80 | "name": "stdout"
81 | }
82 | ]
83 | },
84 | {
85 | "cell_type": "markdown",
86 | "metadata": {
87 | "id": "beVu-1UqCAcU",
88 | "colab_type": "text"
89 | },
90 | "source": [
91 | "## 1 - Importing necessary libraries ##"
92 | ]
93 | },
94 | {
95 | "cell_type": "code",
96 | "metadata": {
97 | "id": "ZF03EKpuCAcV",
98 | "colab_type": "code",
99 | "colab": {}
100 | },
101 | "source": [
102 | "import librosa as lr\n",
103 | "from glob import glob\n", "import numpy as np\n", "import matplotlib.pyplot as plt  # np and plt are used below but were not imported in the original cell"
104 | ],
105 | "execution_count": null,
106 | "outputs": []
107 | },
108 | {
109 | "cell_type": "markdown",
110 | "metadata": {
111 | "collapsed": true,
112 | "id": "nhYP15aoCAcf",
113 | "colab_type": "text"
114 | },
115 | "source": [
116 | "## 2 - Instruction ##\n",
117 | "\n",
118 | "Use glob to return a list of the .wav files in the data_dir directory.\n",
119 | "Import the first audio file in the list using librosa.\n",
120 | "Generate a time array for the data.\n",
121 | "Plot the waveform for this file, along with the time array.\n"
122 | ]
123 | },
124 | {
125 | "cell_type": "markdown",
126 | "metadata": {
127 | "id": "P4QYI2IoHWpL",
128 | "colab_type": "text"
129 | },
130 | "source": [
131 | "## 3 - Load Data ##\n",
132 | "List all the wav files in the folder"
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "metadata": {
138 | "id": "p9JGZRKCCAcq",
139 | "colab_type": "code",
140 | "colab": {}
141 | },
142 | "source": [
143 | "audio_files = glob(data_dir + '/*.wav')  # data_dir: path to the heartbeat audio folder, set before running"
144 | ],
145 | "execution_count": null,
146 | "outputs": []
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {
151 | "id": "ehTTQE9YJF1Q",
152 | "colab_type": "text"
153 | },
154 | "source": [
155 | "## 4 - Read the Data\n",
156 | "\n",
157 | "Read in the first audio file, create the time array"
158 | ]
159 | },
160 | {
161 | "cell_type": "code",
162 | "metadata": {
163 | "scrolled": true,
164 | "id": "TvqFi8-XCAcy",
165 | "colab_type": "code",
166 | "colab": {}
167 | },
168 | "source": [
169 | "audio, sfreq = lr.load(audio_files[0])\n",
170 | "time = np.arange(0, len(audio)) / sfreq"
171 | ],
172 | "execution_count": null,
173 | "outputs": []
174 | },
175 | {
176 | "cell_type": "markdown",
177 | "metadata": {
178 | "id": "izbM6eEUKAV-",
179 | "colab_type": "text"
180 | },
181 | "source": [
182 | "## 5 - Plot Data ##\n",
183 | "\n",
184 | "Plot audio over time"
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "metadata": {
190 | "id": "zt-vctO1Kb3o",
191 | "colab_type": "code",
192 | "colab": {}
193 | },
194 | "source": [
195 | "fig, ax = plt.subplots()\n",
"ax.plot(time, audio)\n", 197 | "ax.set(xlabel='Time (s)', ylabel='Sound Amplitude')\n", 198 | "plt.show()" 199 | ], 200 | "execution_count": null, 201 | "outputs": [] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": { 206 | "id": "1x8QNZN3CAdF", 207 | "colab_type": "text" 208 | }, 209 | "source": [ 210 | "## 6 - Hints ##\n", 211 | "\n", 212 | "The easiest way to create a time array is to divide the index of each datapoint by the sampling frequency.\n", 213 | "Plot time on the x-axis and audio on the y-axis.\n" 214 | ] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "metadata": { 219 | "id": "vm7uTbEWCAei", 220 | "colab_type": "text" 221 | }, 222 | "source": [ 223 | "References:\n", 224 | "- Machine Learning for Time SeriesData in Python\n", 225 | "https://www.studocu.com/in/document/icfai-university-tripura/marketing-management/lecture-notes/machine-learning-for-time-series-data-in-python/5193370/view" 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "metadata": { 231 | "id": "7LvqfD4BCAei", 232 | "colab_type": "code", 233 | "colab": {} 234 | }, 235 | "source": [ 236 | "" 237 | ], 238 | "execution_count": null, 239 | "outputs": [] 240 | } 241 | ] 242 | } -------------------------------------------------------------------------------- /Time Series Analysis in Python/Alogrithems/AR_and_ARIMA_Models.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "coursera": { 6 | "course_slug": "neural-networks-deep-learning", 7 | "graded_item_id": "XaIWT", 8 | "launcher_item_id": "zAgPl" 9 | }, 10 | "kernelspec": { 11 | "display_name": "Python 3", 12 | "language": "python", 13 | "name": "python3" 14 | }, 15 | "language_info": { 16 | "codemirror_mode": { 17 | "name": "ipython", 18 | "version": 3 19 | }, 20 | "file_extension": ".py", 21 | "mimetype": "text/x-python", 22 | "name": "python", 23 | "nbconvert_exporter": "python", 24 | "pygments_lexer": "ipython3", 25 | "version": "3.6.2" 26 | }, 27 | "colab": { 28 | "name": "AR_and_ARIMA_Models.ipynb", 29 | "provenance": [], 30 | "collapsed_sections": [], 31 | "toc_visible": true 32 | } 33 | }, 34 | "cells": [ 35 | { 36 | "cell_type": "markdown", 37 | "metadata": { 38 | "id": "2WZFQQORCAcS", 39 | "colab_type": "text" 40 | }, 41 | "source": [ 42 | "# Autoregressive (AR) and Auto Regressive Integrated Moving Average (ARIMA) Models\n", 43 | "\n", 44 | "**Introduction:**\n", 45 | "\n", 46 | "A statistical model is autoregressive if it predicts future values based on past values. For example, an autoregressive model might seek to predict a stock's future prices based on its past performance [1].An AR(1) autoregressive process is one in which the current value is based on the immediately preceding value, while an AR(2) process is one in which the current value is based on the previous two values. 
An AR(0) process is used for white noise and has no dependence between the terms [1]" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "metadata": { 52 | "id": "tsY-ZNwbPThQ", 53 | "colab_type": "code", 54 | "colab": { 55 | "resources": { 56 | "http://localhost:8080/nbextensions/google.colab/files.js": { 57 | "data": "Ly8gQ29weXJpZ2h0IDIwMTcgR29vZ2xlIExMQwovLwovLyBMaWNlbnNlZCB1bmRlciB0aGUgQXBhY2hlIExpY2Vuc2UsIFZlcnNpb24gMi4wICh0aGUgIkxpY2Vuc2UiKTsKLy8geW91IG1heSBub3QgdXNlIHRoaXMgZmlsZSBleGNlcHQgaW4gY29tcGxpYW5jZSB3aXRoIHRoZSBMaWNlbnNlLgovLyBZb3UgbWF5IG9idGFpbiBhIGNvcHkgb2YgdGhlIExpY2Vuc2UgYXQKLy8KLy8gICAgICBodHRwOi8vd3d3LmFwYWNoZS5vcmcvbGljZW5zZXMvTElDRU5TRS0yLjAKLy8KLy8gVW5sZXNzIHJlcXVpcmVkIGJ5IGFwcGxpY2FibGUgbGF3IG9yIGFncmVlZCB0byBpbiB3cml0aW5nLCBzb2Z0d2FyZQovLyBkaXN0cmlidXRlZCB1bmRlciB0aGUgTGljZW5zZSBpcyBkaXN0cmlidXRlZCBvbiBhbiAiQVMgSVMiIEJBU0lTLAovLyBXSVRIT1VUIFdBUlJBTlRJRVMgT1IgQ09ORElUSU9OUyBPRiBBTlkgS0lORCwgZWl0aGVyIGV4cHJlc3Mgb3IgaW1wbGllZC4KLy8gU2VlIHRoZSBMaWNlbnNlIGZvciB0aGUgc3BlY2lmaWMgbGFuZ3VhZ2UgZ292ZXJuaW5nIHBlcm1pc3Npb25zIGFuZAovLyBsaW1pdGF0aW9ucyB1bmRlciB0aGUgTGljZW5zZS4KCi8qKgogKiBAZmlsZW92ZXJ2aWV3IEhlbHBlcnMgZm9yIGdvb2dsZS5jb2xhYiBQeXRob24gbW9kdWxlLgogKi8KKGZ1bmN0aW9uKHNjb3BlKSB7CmZ1bmN0aW9uIHNwYW4odGV4dCwgc3R5bGVBdHRyaWJ1dGVzID0ge30pIHsKICBjb25zdCBlbGVtZW50ID0gZG9jdW1lbnQuY3JlYXRlRWxlbWVudCgnc3BhbicpOwogIGVsZW1lbnQudGV4dENvbnRlbnQgPSB0ZXh0OwogIGZvciAoY29uc3Qga2V5IG9mIE9iamVjdC5rZXlzKHN0eWxlQXR0cmlidXRlcykpIHsKICAgIGVsZW1lbnQuc3R5bGVba2V5XSA9IHN0eWxlQXR0cmlidXRlc1trZXldOwogIH0KICByZXR1cm4gZWxlbWVudDsKfQoKLy8gTWF4IG51bWJlciBvZiBieXRlcyB3aGljaCB3aWxsIGJlIHVwbG9hZGVkIGF0IGEgdGltZS4KY29uc3QgTUFYX1BBWUxPQURfU0laRSA9IDEwMCAqIDEwMjQ7CgpmdW5jdGlvbiBfdXBsb2FkRmlsZXMoaW5wdXRJZCwgb3V0cHV0SWQpIHsKICBjb25zdCBzdGVwcyA9IHVwbG9hZEZpbGVzU3RlcChpbnB1dElkLCBvdXRwdXRJZCk7CiAgY29uc3Qgb3V0cHV0RWxlbWVudCA9IGRvY3VtZW50LmdldEVsZW1lbnRCeUlkKG91dHB1dElkKTsKICAvLyBDYWNoZSBzdGVwcyBvbiB0aGUgb3V0cHV0RWxlbWVudCB0byBtYWtlIGl0IGF2YWlsYWJsZSBmb3IgdGhlIG5leHQgY2FsbAogIC8vIHRvIHVwbG9hZEZpbGVzQ29udGludWUgZnJvbSBQeXRob24uCiAgb3V0cHV0RWxlbWVudC5zdGVwcyA9IHN0ZXBzOwoKICByZXR1cm4gX3VwbG9hZEZpbGVzQ29udGludWUob3V0cHV0SWQpOwp9CgovLyBUaGlzIGlzIHJvdWdobHkgYW4gYXN5bmMgZ2VuZXJhdG9yIChub3Qgc3VwcG9ydGVkIGluIHRoZSBicm93c2VyIHlldCksCi8vIHdoZXJlIHRoZXJlIGFyZSBtdWx0aXBsZSBhc3luY2hyb25vdXMgc3RlcHMgYW5kIHRoZSBQeXRob24gc2lkZSBpcyBnb2luZwovLyB0byBwb2xsIGZvciBjb21wbGV0aW9uIG9mIGVhY2ggc3RlcC4KLy8gVGhpcyB1c2VzIGEgUHJvbWlzZSB0byBibG9jayB0aGUgcHl0aG9uIHNpZGUgb24gY29tcGxldGlvbiBvZiBlYWNoIHN0ZXAsCi8vIHRoZW4gcGFzc2VzIHRoZSByZXN1bHQgb2YgdGhlIHByZXZpb3VzIHN0ZXAgYXMgdGhlIGlucHV0IHRvIHRoZSBuZXh0IHN0ZXAuCmZ1bmN0aW9uIF91cGxvYWRGaWxlc0NvbnRpbnVlKG91dHB1dElkKSB7CiAgY29uc3Qgb3V0cHV0RWxlbWVudCA9IGRvY3VtZW50LmdldEVsZW1lbnRCeUlkKG91dHB1dElkKTsKICBjb25zdCBzdGVwcyA9IG91dHB1dEVsZW1lbnQuc3RlcHM7CgogIGNvbnN0IG5leHQgPSBzdGVwcy5uZXh0KG91dHB1dEVsZW1lbnQubGFzdFByb21pc2VWYWx1ZSk7CiAgcmV0dXJuIFByb21pc2UucmVzb2x2ZShuZXh0LnZhbHVlLnByb21pc2UpLnRoZW4oKHZhbHVlKSA9PiB7CiAgICAvLyBDYWNoZSB0aGUgbGFzdCBwcm9taXNlIHZhbHVlIHRvIG1ha2UgaXQgYXZhaWxhYmxlIHRvIHRoZSBuZXh0CiAgICAvLyBzdGVwIG9mIHRoZSBnZW5lcmF0b3IuCiAgICBvdXRwdXRFbGVtZW50Lmxhc3RQcm9taXNlVmFsdWUgPSB2YWx1ZTsKICAgIHJldHVybiBuZXh0LnZhbHVlLnJlc3BvbnNlOwogIH0pOwp9CgovKioKICogR2VuZXJhdG9yIGZ1bmN0aW9uIHdoaWNoIGlzIGNhbGxlZCBiZXR3ZWVuIGVhY2ggYXN5bmMgc3RlcCBvZiB0aGUgdXBsb2FkCiAqIHByb2Nlc3MuCiAqIEBwYXJhbSB7c3RyaW5nfSBpbnB1dElkIEVsZW1lbnQgSUQgb2YgdGhlIGlucHV0IGZpbGUgcGlja2VyIGVsZW1lbnQuCiAqIEBwYXJhbSB7c3RyaW5nfSBvdXRwdXRJZCBFbGVtZW50IElEIG9mIHRoZSBvdXRwdXQgZGlzcGxheS4KICogQHJldHVybiB7IUl0ZXJhYmxlPCFPYmplY3Q+fSBJdGVyYWJsZSBvZiB
uZXh0IHN0ZXBzLgogKi8KZnVuY3Rpb24qIHVwbG9hZEZpbGVzU3RlcChpbnB1dElkLCBvdXRwdXRJZCkgewogIGNvbnN0IGlucHV0RWxlbWVudCA9IGRvY3VtZW50LmdldEVsZW1lbnRCeUlkKGlucHV0SWQpOwogIGlucHV0RWxlbWVudC5kaXNhYmxlZCA9IGZhbHNlOwoKICBjb25zdCBvdXRwdXRFbGVtZW50ID0gZG9jdW1lbnQuZ2V0RWxlbWVudEJ5SWQob3V0cHV0SWQpOwogIG91dHB1dEVsZW1lbnQuaW5uZXJIVE1MID0gJyc7CgogIGNvbnN0IHBpY2tlZFByb21pc2UgPSBuZXcgUHJvbWlzZSgocmVzb2x2ZSkgPT4gewogICAgaW5wdXRFbGVtZW50LmFkZEV2ZW50TGlzdGVuZXIoJ2NoYW5nZScsIChlKSA9PiB7CiAgICAgIHJlc29sdmUoZS50YXJnZXQuZmlsZXMpOwogICAgfSk7CiAgfSk7CgogIGNvbnN0IGNhbmNlbCA9IGRvY3VtZW50LmNyZWF0ZUVsZW1lbnQoJ2J1dHRvbicpOwogIGlucHV0RWxlbWVudC5wYXJlbnRFbGVtZW50LmFwcGVuZENoaWxkKGNhbmNlbCk7CiAgY2FuY2VsLnRleHRDb250ZW50ID0gJ0NhbmNlbCB1cGxvYWQnOwogIGNvbnN0IGNhbmNlbFByb21pc2UgPSBuZXcgUHJvbWlzZSgocmVzb2x2ZSkgPT4gewogICAgY2FuY2VsLm9uY2xpY2sgPSAoKSA9PiB7CiAgICAgIHJlc29sdmUobnVsbCk7CiAgICB9OwogIH0pOwoKICAvLyBXYWl0IGZvciB0aGUgdXNlciB0byBwaWNrIHRoZSBmaWxlcy4KICBjb25zdCBmaWxlcyA9IHlpZWxkIHsKICAgIHByb21pc2U6IFByb21pc2UucmFjZShbcGlja2VkUHJvbWlzZSwgY2FuY2VsUHJvbWlzZV0pLAogICAgcmVzcG9uc2U6IHsKICAgICAgYWN0aW9uOiAnc3RhcnRpbmcnLAogICAgfQogIH07CgogIGNhbmNlbC5yZW1vdmUoKTsKCiAgLy8gRGlzYWJsZSB0aGUgaW5wdXQgZWxlbWVudCBzaW5jZSBmdXJ0aGVyIHBpY2tzIGFyZSBub3QgYWxsb3dlZC4KICBpbnB1dEVsZW1lbnQuZGlzYWJsZWQgPSB0cnVlOwoKICBpZiAoIWZpbGVzKSB7CiAgICByZXR1cm4gewogICAgICByZXNwb25zZTogewogICAgICAgIGFjdGlvbjogJ2NvbXBsZXRlJywKICAgICAgfQogICAgfTsKICB9CgogIGZvciAoY29uc3QgZmlsZSBvZiBmaWxlcykgewogICAgY29uc3QgbGkgPSBkb2N1bWVudC5jcmVhdGVFbGVtZW50KCdsaScpOwogICAgbGkuYXBwZW5kKHNwYW4oZmlsZS5uYW1lLCB7Zm9udFdlaWdodDogJ2JvbGQnfSkpOwogICAgbGkuYXBwZW5kKHNwYW4oCiAgICAgICAgYCgke2ZpbGUudHlwZSB8fCAnbi9hJ30pIC0gJHtmaWxlLnNpemV9IGJ5dGVzLCBgICsKICAgICAgICBgbGFzdCBtb2RpZmllZDogJHsKICAgICAgICAgICAgZmlsZS5sYXN0TW9kaWZpZWREYXRlID8gZmlsZS5sYXN0TW9kaWZpZWREYXRlLnRvTG9jYWxlRGF0ZVN0cmluZygpIDoKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgJ24vYSd9IC0gYCkpOwogICAgY29uc3QgcGVyY2VudCA9IHNwYW4oJzAlIGRvbmUnKTsKICAgIGxpLmFwcGVuZENoaWxkKHBlcmNlbnQpOwoKICAgIG91dHB1dEVsZW1lbnQuYXBwZW5kQ2hpbGQobGkpOwoKICAgIGNvbnN0IGZpbGVEYXRhUHJvbWlzZSA9IG5ldyBQcm9taXNlKChyZXNvbHZlKSA9PiB7CiAgICAgIGNvbnN0IHJlYWRlciA9IG5ldyBGaWxlUmVhZGVyKCk7CiAgICAgIHJlYWRlci5vbmxvYWQgPSAoZSkgPT4gewogICAgICAgIHJlc29sdmUoZS50YXJnZXQucmVzdWx0KTsKICAgICAgfTsKICAgICAgcmVhZGVyLnJlYWRBc0FycmF5QnVmZmVyKGZpbGUpOwogICAgfSk7CiAgICAvLyBXYWl0IGZvciB0aGUgZGF0YSB0byBiZSByZWFkeS4KICAgIGxldCBmaWxlRGF0YSA9IHlpZWxkIHsKICAgICAgcHJvbWlzZTogZmlsZURhdGFQcm9taXNlLAogICAgICByZXNwb25zZTogewogICAgICAgIGFjdGlvbjogJ2NvbnRpbnVlJywKICAgICAgfQogICAgfTsKCiAgICAvLyBVc2UgYSBjaHVua2VkIHNlbmRpbmcgdG8gYXZvaWQgbWVzc2FnZSBzaXplIGxpbWl0cy4gU2VlIGIvNjIxMTU2NjAuCiAgICBsZXQgcG9zaXRpb24gPSAwOwogICAgd2hpbGUgKHBvc2l0aW9uIDwgZmlsZURhdGEuYnl0ZUxlbmd0aCkgewogICAgICBjb25zdCBsZW5ndGggPSBNYXRoLm1pbihmaWxlRGF0YS5ieXRlTGVuZ3RoIC0gcG9zaXRpb24sIE1BWF9QQVlMT0FEX1NJWkUpOwogICAgICBjb25zdCBjaHVuayA9IG5ldyBVaW50OEFycmF5KGZpbGVEYXRhLCBwb3NpdGlvbiwgbGVuZ3RoKTsKICAgICAgcG9zaXRpb24gKz0gbGVuZ3RoOwoKICAgICAgY29uc3QgYmFzZTY0ID0gYnRvYShTdHJpbmcuZnJvbUNoYXJDb2RlLmFwcGx5KG51bGwsIGNodW5rKSk7CiAgICAgIHlpZWxkIHsKICAgICAgICByZXNwb25zZTogewogICAgICAgICAgYWN0aW9uOiAnYXBwZW5kJywKICAgICAgICAgIGZpbGU6IGZpbGUubmFtZSwKICAgICAgICAgIGRhdGE6IGJhc2U2NCwKICAgICAgICB9LAogICAgICB9OwogICAgICBwZXJjZW50LnRleHRDb250ZW50ID0KICAgICAgICAgIGAke01hdGgucm91bmQoKHBvc2l0aW9uIC8gZmlsZURhdGEuYnl0ZUxlbmd0aCkgKiAxMDApfSUgZG9uZWA7CiAgICB9CiAgfQoKICAvLyBBbGwgZG9uZS4KICB5aWVsZCB7CiAgICByZXNwb25zZTogewogICAgICBhY3Rpb246ICdjb21wbGV0ZScsCiAgICB9CiAgfTsKfQoKc2NvcGUuZ29vZ2xlID0gc2NvcGUuZ29vZ2xlIHx8IHt9OwpzY29wZS5nb29nbGUuY29sYWIgPSBzY29wZS5nb29nbGUuY2
9sYWIgfHwge307CnNjb3BlLmdvb2dsZS5jb2xhYi5fZmlsZXMgPSB7CiAgX3VwbG9hZEZpbGVzLAogIF91cGxvYWRGaWxlc0NvbnRpbnVlLAp9Owp9KShzZWxmKTsK", 58 | "ok": true, 59 | "headers": [ 60 | [ 61 | "content-type", 62 | "application/javascript" 63 | ] 64 | ], 65 | "status": 200, 66 | "status_text": "" 67 | } 68 | }, 69 | "base_uri": "https://localhost:8080/", 70 | "height": 72 71 | }, 72 | "outputId": "35033f20-ac09-43f8-d2b6-53270bb56816" 73 | }, 74 | "source": [ 75 | "# this code is used to upload dataset from Pc to colab\n", 76 | "from google.colab import files # Please First run this cod in chrom \n", 77 | "def getLocalFiles():\n", 78 | " _files = files.upload() # upload StudentNextSessionf.csv datase\n", 79 | " if len(_files) >0: # Then run above libray \n", 80 | " for k,v in _files.items():\n", 81 | " open(k,'wb').write(v)\n", 82 | "getLocalFiles()" 83 | ], 84 | "execution_count": 1, 85 | "outputs": [ 86 | { 87 | "output_type": "display_data", 88 | "data": { 89 | "text/html": [ 90 | "\n", 91 | " \n", 93 | " \n", 94 | " Upload widget is only available when the cell has been executed in the\n", 95 | " current browser session. Please rerun this cell to enable.\n", 96 | " \n", 97 | " " 98 | ], 99 | "text/plain": [ 100 | "" 101 | ] 102 | }, 103 | "metadata": { 104 | "tags": [] 105 | } 106 | }, 107 | { 108 | "output_type": "stream", 109 | "text": [ 110 | "Saving daily-min-temperatures.csv to daily-min-temperatures.csv\n" 111 | ], 112 | "name": "stdout" 113 | } 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": { 119 | "id": "beVu-1UqCAcU", 120 | "colab_type": "text" 121 | }, 122 | "source": [ 123 | "## 1 - Importing necessary libraries ##\n", 124 | "we import the following python packges for this tutorial \n", 125 | " - pandas: importing index data\n", 126 | " - matplotlib: it is used for ploting data\n", 127 | " - statsmodels build time series models\n", 128 | " - statistic to help find statistic of models" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "metadata": { 134 | "id": "ZF03EKpuCAcV", 135 | "colab_type": "code", 136 | "colab": {} 137 | }, 138 | "source": [ 139 | "import pandas as pd\n", 140 | "import matplotlib.pyplot as plt" 141 | ], 142 | "execution_count": 2, 143 | "outputs": [] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "metadata": { 148 | "id": "7c3251v0N3z1", 149 | "colab_type": "code", 150 | "colab": { 151 | "base_uri": "https://localhost:8080/", 152 | "height": 71 153 | }, 154 | "outputId": "92c5751c-afae-44dc-bf11-98d7d402ac96" 155 | }, 156 | "source": [ 157 | "from statsmodels.tsa.ar_model import AR\n", 158 | "from sklearn.metrics import mean_squared_error" 159 | ], 160 | "execution_count": 3, 161 | "outputs": [ 162 | { 163 | "output_type": "stream", 164 | "text": [ 165 | "/usr/local/lib/python3.6/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. 
Use the functions in the public API at pandas.testing instead.\n",
166 | " import pandas.util.testing as tm\n"
167 | ],
168 | "name": "stderr"
169 | }
170 | ]
171 | },
172 | {
173 | "cell_type": "code",
174 | "metadata": {
175 | "id": "vlav6g_TvrD9",
176 | "colab_type": "code",
177 | "colab": {}
178 | },
179 | "source": [
180 | "import statsmodels as sm\n",
181 | "from statsmodels.tsa.stattools import adfuller \n",
182 | "from statsmodels.graphics.tsaplots import plot_acf, plot_pacf \n",
183 | "from statsmodels.tsa.arima_model import ARIMA \n",
184 | "import statistics"
185 | ],
186 | "execution_count": 4,
187 | "outputs": []
188 | },
189 | {
190 | "cell_type": "code",
191 | "metadata": {
192 | "id": "euL8yNRc7yE_",
193 | "colab_type": "code",
194 | "colab": {}
195 | },
196 | "source": [
197 | "plt.rcParams.update({'figure.figsize':(9,7), 'figure.dpi':120})"
198 | ],
199 | "execution_count": 5,
200 | "outputs": []
201 | },
202 | {
203 | "cell_type": "code",
204 | "metadata": {
205 | "id": "dZCZulRu3fib",
206 | "colab_type": "code",
207 | "colab": {}
208 | },
209 | "source": [
210 | "from pandas import Series\n",
211 | "from pandas import DataFrame\n",
212 | "from pandas import concat\n",
213 | "from sklearn.metrics import mean_squared_error"
214 | ],
215 | "execution_count": 6,
216 | "outputs": []
217 | },
218 | {
219 | "cell_type": "markdown",
220 | "metadata": {
221 | "collapsed": true,
222 | "id": "nhYP15aoCAcf",
223 | "colab_type": "text"
224 | },
225 | "source": [
226 | "## 2 - Load Datasets ##"
227 | ]
228 | },
229 | {
230 | "cell_type": "code",
231 | "metadata": {
232 | "id": "y1wsGi52CAch",
233 | "colab_type": "code",
234 | "colab": {}
235 | },
236 | "source": [
237 | "daily_temp = pd.read_csv('daily-min-temperatures.csv', usecols=['Date','Temp'],index_col='Date', parse_dates=True )\n",
238 | "daily_temp.head()"
239 | ],
240 | "execution_count": null,
241 | "outputs": []
242 | },
243 | {
244 | "cell_type": "markdown",
245 | "metadata": {
246 | "id": "UKI68e66OsBL",
247 | "colab_type": "text"
248 | },
249 | "source": [
250 | "## 3 - Data Description ##"
251 | ]
252 | },
253 | {
254 | "cell_type": "code",
255 | "metadata": {
256 | "id": "TmxB0x3VOyIh",
257 | "colab_type": "code",
258 | "colab": {}
259 | },
260 | "source": [
261 | "daily_temp.index"
262 | ],
263 | "execution_count": null,
264 | "outputs": []
265 | },
266 | {
267 | "cell_type": "code",
268 | "metadata": {
269 | "id": "p_Ttzuj2O6TB",
270 | "colab_type": "code",
271 | "colab": {}
272 | },
273 | "source": [
274 | "daily_temp.head()"
275 | ],
276 | "execution_count": null,
277 | "outputs": []
278 | },
279 | {
280 | "cell_type": "code",
281 | "metadata": {
282 | "id": "poqouIc3PBaB",
283 | "colab_type": "code",
284 | "colab": {}
285 | },
286 | "source": [
287 | "\n",
288 | "daily_temp['1981-01-01':'1981-01-05']"
289 | ],
290 | "execution_count": null,
291 | "outputs": []
292 | },
293 | {
294 | "cell_type": "markdown",
295 | "metadata": {
296 | "id": "vSN1E21RIYXh",
297 | "colab_type": "text"
298 | },
299 | "source": [
300 | "## 3 - Check Null value ##"
301 | ]
302 | },
303 | {
304 | "cell_type": "code",
305 | "metadata": {
306 | "id": "KCtUtRplIrYB",
307 | "colab_type": "code",
308 | "colab": {
309 | "base_uri": "https://localhost:8080/",
310 | "height": 51
311 | },
312 | "outputId": "2c15f8f5-cb7c-44a2-c8bb-426e25eed07c"
313 | },
314 | "source": [
315 | "daily_temp.isna().sum()"
316 | ],
317 | "execution_count": null,
318 | "outputs": [
319 | {
320 | "output_type": "execute_result",
321 | "data": {
322 | "text/plain": [
"Temp 0\n", 324 | "dtype: int64" 325 | ] 326 | }, 327 | "metadata": { 328 | "tags": [] 329 | }, 330 | "execution_count": 55 331 | } 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": { 337 | "id": "P4QYI2IoHWpL", 338 | "colab_type": "text" 339 | }, 340 | "source": [ 341 | "## 4 - Ploting ##\n" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "metadata": { 347 | "id": "p9JGZRKCCAcq", 348 | "colab_type": "code", 349 | "colab": {} 350 | }, 351 | "source": [ 352 | " daily_temp.plot(df['Temp'])" 353 | ], 354 | "execution_count": null, 355 | "outputs": [] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "metadata": { 360 | "id": "nR-PYX-bQsYr", 361 | "colab_type": "code", 362 | "colab": {} 363 | }, 364 | "source": [ 365 | "%matplotlib inline\n", 366 | "daily_temp.Temp.resample(\"M\").mean().plot()" 367 | ], 368 | "execution_count": null, 369 | "outputs": [] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "metadata": { 374 | "id": "A4baoniBQ5qv", 375 | "colab_type": "code", 376 | "colab": {} 377 | }, 378 | "source": [ 379 | "daily_temp.Temp.resample(\"W\").mean().plot()" 380 | ], 381 | "execution_count": null, 382 | "outputs": [] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "metadata": { 387 | "id": "L4jg1ktARE9o", 388 | "colab_type": "code", 389 | "colab": {} 390 | }, 391 | "source": [ 392 | "daily_temp.Temp.resample(\"D\").mean().plot()" 393 | ], 394 | "execution_count": null, 395 | "outputs": [] 396 | }, 397 | { 398 | "cell_type": "code", 399 | "metadata": { 400 | "id": "soIDaghNRORG", 401 | "colab_type": "code", 402 | "colab": {} 403 | }, 404 | "source": [ 405 | "Quarterly_resampled_data = daily_temp.Temp.resample('Q').mean().plot()" 406 | ], 407 | "execution_count": null, 408 | "outputs": [] 409 | }, 410 | { 411 | "cell_type": "code", 412 | "metadata": { 413 | "id": "5GgYNc1cRgMA", 414 | "colab_type": "code", 415 | "colab": {} 416 | }, 417 | "source": [ 418 | "x=pd.Series(range(3), index=pd.date_range('1981', freq='D', periods=3))" 419 | ], 420 | "execution_count": null, 421 | "outputs": [] 422 | }, 423 | { 424 | "cell_type": "code", 425 | "metadata": { 426 | "id": "DyWSnYoERpCP", 427 | "colab_type": "code", 428 | "colab": {} 429 | }, 430 | "source": [ 431 | "x" 432 | ], 433 | "execution_count": null, 434 | "outputs": [] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "metadata": { 439 | "id": "eV0BHL3LRyJ-", 440 | "colab_type": "code", 441 | "colab": {} 442 | }, 443 | "source": [ 444 | "mx=pd.Series(range(3), index=pd.date_range('2019', freq='D', periods=3))\n" 445 | ], 446 | "execution_count": null, 447 | "outputs": [] 448 | }, 449 | { 450 | "cell_type": "code", 451 | "metadata": { 452 | "id": "OXCttrtfR43-", 453 | "colab_type": "code", 454 | "colab": {} 455 | }, 456 | "source": [ 457 | "daily_temp.Temp.resample('QS').mean().plot()" 458 | ], 459 | "execution_count": null, 460 | "outputs": [] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "metadata": { 465 | "id": "jBsedv0KSDIf", 466 | "colab_type": "code", 467 | "colab": {} 468 | }, 469 | "source": [ 470 | "daily_temp.Temp.resample('M').mean().plot(kind=\"line\")" 471 | ], 472 | "execution_count": null, 473 | "outputs": [] 474 | }, 475 | { 476 | "cell_type": "markdown", 477 | "metadata": { 478 | "id": "ehTTQE9YJF1Q", 479 | "colab_type": "text" 480 | }, 481 | "source": [ 482 | "##5- Correlation ##\n", 483 | "Visual check of correlation by plotting the scatter plot against the previous and next time step Pandas built in plot - Lagplot - to do visual check\n", 484 | "\n" 485 | ] 486 | }, 487 | { 
488 | "cell_type": "code", 489 | "metadata": { 490 | "scrolled": true, 491 | "id": "TvqFi8-XCAcy", 492 | "colab_type": "code", 493 | "colab": {} 494 | }, 495 | "source": [ 496 | "" 497 | ], 498 | "execution_count": null, 499 | "outputs": [] 500 | }, 501 | { 502 | "cell_type": "markdown", 503 | "metadata": { 504 | "id": "Y5ay6wqxYGx4", 505 | "colab_type": "text" 506 | }, 507 | "source": [ 508 | "**Stationary**\n", 509 | "\n", 510 | "To check wither data is stationaly or not. stationary mean that is data is constant in mean and variance . we cehck this becasue many time sieries model required that data should be stationary in order to model it. It should be concentatn movment up and down momement" 511 | ] 512 | }, 513 | { 514 | "cell_type": "code", 515 | "metadata": { 516 | "id": "xa_nh0I_YFQL", 517 | "colab_type": "code", 518 | "colab": {} 519 | }, 520 | "source": [ 521 | "plt.plot(daily_temp)\n", 522 | "plt.show()" 523 | ], 524 | "execution_count": null, 525 | "outputs": [] 526 | }, 527 | { 528 | "cell_type": "markdown", 529 | "metadata": { 530 | "id": "U_AmXR-xaMdo", 531 | "colab_type": "text" 532 | }, 533 | "source": [ 534 | "now we make data stationary if data is not stationary \n", 535 | " - Difference the data and make it satationary . difference mean substract the next value \n", 536 | " - fill the miss value data" 537 | ] 538 | }, 539 | { 540 | "cell_type": "code", 541 | "metadata": { 542 | "id": "6fP07c68bmeo", 543 | "colab_type": "code", 544 | "colab": {} 545 | }, 546 | "source": [ 547 | "# take serveral diff untill your data become stationary \n", 548 | "daily_temp_diff1=daily_temp.diff().fillna(daily_temp)" 549 | ], 550 | "execution_count": 9, 551 | "outputs": [] 552 | }, 553 | { 554 | "cell_type": "code", 555 | "metadata": { 556 | "id": "2ZhajJjfbyEU", 557 | "colab_type": "code", 558 | "colab": {} 559 | }, 560 | "source": [ 561 | "plt.plot(daily_temp)\n", 562 | "plt.show()" 563 | ], 564 | "execution_count": null, 565 | "outputs": [] 566 | }, 567 | { 568 | "cell_type": "markdown", 569 | "metadata": { 570 | "id": "X7TpYJAKuQS9", 571 | "colab_type": "text" 572 | }, 573 | "source": [ 574 | "**Autocorrelation Function (ACF) plot**\n", 575 | "\n", 576 | "**Partial Autocorrelation Function(PACF) Plot ** \n", 577 | "\n", 578 | "Helps determine number of autoregressive AR term and moving average term\n", 579 | "\n", 580 | "can also be used to help see seasonality or periodic trends\n" 581 | ] 582 | }, 583 | { 584 | "cell_type": "code", 585 | "metadata": { 586 | "id": "PV824D_Fyl64", 587 | "colab_type": "code", 588 | "colab": {} 589 | }, 590 | "source": [ 591 | "plot_acf(daily_temp)\n", 592 | "plt.show()" 593 | ], 594 | "execution_count": null, 595 | "outputs": [] 596 | }, 597 | { 598 | "cell_type": "code", 599 | "metadata": { 600 | "id": "59xk22YYyxFw", 601 | "colab_type": "code", 602 | "colab": {} 603 | }, 604 | "source": [ 605 | "plot_pacf(daily_temp)\n", 606 | "plt.show()" 607 | ], 608 | "execution_count": null, 609 | "outputs": [] 610 | }, 611 | { 612 | "cell_type": "markdown", 613 | "metadata": { 614 | "id": "izbM6eEUKAV-", 615 | "colab_type": "text" 616 | }, 617 | "source": [ 618 | "## 6-Data Spliting ##" 619 | ] 620 | }, 621 | { 622 | "cell_type": "code", 623 | "metadata": { 624 | "id": "zt-vctO1Kb3o", 625 | "colab_type": "code", 626 | "colab": {} 627 | }, 628 | "source": [ 629 | " X = daily_temp.values\n", 630 | "train, test = X[1:len(X)-7], X[len(X)-7:]" 631 | ], 632 | "execution_count": 13, 633 | "outputs": [] 634 | }, 635 | { 636 | "cell_type": "markdown", 637 | "metadata": { 
638 | "id": "bVnQBEQv7fqd", 639 | "colab_type": "text" 640 | }, 641 | "source": [ 642 | "##6 -Autoregressive model##\n", 643 | "Forecast the next timestamp value by regressing over previous vaues" 644 | ] 645 | }, 646 | { 647 | "cell_type": "markdown", 648 | "metadata": { 649 | "id": "1x8QNZN3CAdF", 650 | "colab_type": "text" 651 | }, 652 | "source": [ 653 | "### 6.1 - Models training on training Data ###\n", 654 | "\n", 655 | "```\n", 656 | "# This is formatted as code\n", 657 | "```\n", 658 | "\n", 659 | "\n" 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "metadata": { 665 | "id": "zNBdXxiHLfLW", 666 | "colab_type": "code", 667 | "colab": {} 668 | }, 669 | "source": [ 670 | "from statsmodels.tsa.ar_model import AR" 671 | ], 672 | "execution_count": 14, 673 | "outputs": [] 674 | }, 675 | { 676 | "cell_type": "code", 677 | "metadata": { 678 | "id": "ulKq5MEC4QTC", 679 | "colab_type": "code", 680 | "colab": {} 681 | }, 682 | "source": [ 683 | "model = AR(train)\n", 684 | "model_fit = model.fit()" 685 | ], 686 | "execution_count": 15, 687 | "outputs": [] 688 | }, 689 | { 690 | "cell_type": "code", 691 | "metadata": { 692 | "id": "5zfaTqgI4Vis", 693 | "colab_type": "code", 694 | "colab": {} 695 | }, 696 | "source": [ 697 | "print('Lag: %s' % model_fit.k_ar)\n", 698 | "print('Coefficients: %s' % model_fit.params)" 699 | ], 700 | "execution_count": null, 701 | "outputs": [] 702 | }, 703 | { 704 | "cell_type": "markdown", 705 | "metadata": { 706 | "id": "g78HCuLVCAdG", 707 | "colab_type": "text" 708 | }, 709 | "source": [ 710 | "### 6.2 - Model prediction on test data### " 711 | ] 712 | }, 713 | { 714 | "cell_type": "code", 715 | "metadata": { 716 | "id": "JwOLIUoIPjY1", 717 | "colab_type": "code", 718 | "colab": {} 719 | }, 720 | "source": [ 721 | "predictions = model_fit.predict(start=len(train), end=len(train)+len(test)-1, dynamic=False)" 722 | ], 723 | "execution_count": 17, 724 | "outputs": [] 725 | }, 726 | { 727 | "cell_type": "code", 728 | "metadata": { 729 | "id": "YhWBShUX5JbT", 730 | "colab_type": "code", 731 | "colab": {} 732 | }, 733 | "source": [ 734 | "for i in range(len(predictions)):\n", 735 | " print('predicted=%f, expected=%f' % (predictions[i], test[i]))" 736 | ], 737 | "execution_count": null, 738 | "outputs": [] 739 | }, 740 | { 741 | "cell_type": "markdown", 742 | "metadata": { 743 | "id": "_JBFgfGjdddE", 744 | "colab_type": "text" 745 | }, 746 | "source": [ 747 | "### 7.3 Model Evaulation###" 748 | ] 749 | }, 750 | { 751 | "cell_type": "code", 752 | "metadata": { 753 | "id": "j9xnyArC5dAD", 754 | "colab_type": "code", 755 | "colab": {} 756 | }, 757 | "source": [ 758 | "error = mean_squared_error(test, predictions)" 759 | ], 760 | "execution_count": 19, 761 | "outputs": [] 762 | }, 763 | { 764 | "cell_type": "code", 765 | "metadata": { 766 | "id": "o8744XRH5lhP", 767 | "colab_type": "code", 768 | "colab": {} 769 | }, 770 | "source": [ 771 | "print('Test MSE: %.3f' % error)" 772 | ], 773 | "execution_count": null, 774 | "outputs": [] 775 | }, 776 | { 777 | "cell_type": "code", 778 | "metadata": { 779 | "id": "y5RGFYis5t2y", 780 | "colab_type": "code", 781 | "colab": {} 782 | }, 783 | "source": [ 784 | "plt.plot(test)\n", 785 | "plt.plot(predictions, color='red')\n", 786 | "plt.show()" 787 | ], 788 | "execution_count": null, 789 | "outputs": [] 790 | }, 791 | { 792 | "cell_type": "markdown", 793 | "metadata": { 794 | "id": "i_GstlsrCAdc", 795 | "colab_type": "text" 796 | }, 797 | "source": [ 798 | "### 6.4 -Model tunning" 799 | ] 800 | }, 801 | { 802 | 
"cell_type": "code", 803 | "metadata": { 804 | "id": "e1DF698ECAdd", 805 | "colab_type": "code", 806 | "colab": {} 807 | }, 808 | "source": [ 809 | "window = model_fit.k_ar\n", 810 | "coef = model_fit.params" 811 | ], 812 | "execution_count": 22, 813 | "outputs": [] 814 | }, 815 | { 816 | "cell_type": "code", 817 | "metadata": { 818 | "id": "wd_svSDs6mQ-", 819 | "colab_type": "code", 820 | "colab": {} 821 | }, 822 | "source": [ 823 | "history = train[len(train)-window:]\n", 824 | "history = [history[i] for i in range(len(history))]\n", 825 | "predictions = list()" 826 | ], 827 | "execution_count": 23, 828 | "outputs": [] 829 | }, 830 | { 831 | "cell_type": "code", 832 | "metadata": { 833 | "id": "SqVFvGGY6qsn", 834 | "colab_type": "code", 835 | "colab": {} 836 | }, 837 | "source": [ 838 | "for t in range(len(test)):\n", 839 | " length = len(history)\n", 840 | " lag = [history[i] for i in range(length-window,length)]\n", 841 | " yhat = coef[0]\n", 842 | " for d in range(window):\n", 843 | " yhat += coef[d+1] * lag[window-d-1]\n", 844 | " obs = test[t]\n", 845 | " predictions.append(yhat)\n", 846 | " history.append(obs)\n", 847 | " print('predicted=%f, expected=%f' % (yhat, obs))" 848 | ], 849 | "execution_count": null, 850 | "outputs": [] 851 | }, 852 | { 853 | "cell_type": "markdown", 854 | "metadata": { 855 | "id": "lTe0Qqi3CAeA", 856 | "colab_type": "text" 857 | }, 858 | "source": [ 859 | "### 6.5 - Model Evaulation ###" 860 | ] 861 | }, 862 | { 863 | "cell_type": "code", 864 | "metadata": { 865 | "id": "fakWwJ-h6-tm", 866 | "colab_type": "code", 867 | "colab": {} 868 | }, 869 | "source": [ 870 | "error = mean_squared_error(test, predictions)\n", 871 | "print('Test MSE: %.3f' % error)" 872 | ], 873 | "execution_count": null, 874 | "outputs": [] 875 | }, 876 | { 877 | "cell_type": "code", 878 | "metadata": { 879 | "id": "XOG_8GjO7Dya", 880 | "colab_type": "code", 881 | "colab": {} 882 | }, 883 | "source": [ 884 | "pyplot.plot(test)\n", 885 | "pyplot.plot(predictions, color='red')\n", 886 | "pyplot.show()" 887 | ], 888 | "execution_count": null, 889 | "outputs": [] 890 | }, 891 | { 892 | "cell_type": "markdown", 893 | "metadata": { 894 | "id": "05Gq2A7BCAeX", 895 | "colab_type": "text" 896 | }, 897 | "source": [ 898 | "## 7 - Auto Regressive Integrated Moving Average (ARIMA) ##\n", 899 | "\n", 900 | " Forecast the next timestamps value by averaging over the previous values \n", 901 | " - useful for non-stationary data\n", 902 | " - allows us to difference the data\n", 903 | " - also includes a seasona differencing parameter for seasonal non-stationary data " 904 | ] 905 | }, 906 | { 907 | "cell_type": "markdown", 908 | "metadata": { 909 | "id": "rWCpBVwViJPO", 910 | "colab_type": "text" 911 | }, 912 | "source": [ 913 | "### 7.1 Models training on training Data" 914 | ] 915 | }, 916 | { 917 | "cell_type": "code", 918 | "metadata": { 919 | "id": "Ld1_nNE_mDcW", 920 | "colab_type": "code", 921 | "colab": {} 922 | }, 923 | "source": [ 924 | "model_ARIMA = ARIMA(train, order=(5,0,1))\n", 925 | "model_fit_ARIMA = model_ARIMA.fit(disp=0)" 926 | ], 927 | "execution_count": 29, 928 | "outputs": [] 929 | }, 930 | { 931 | "cell_type": "code", 932 | "metadata": { 933 | "id": "o_rzpp-3b1ak", 934 | "colab_type": "code", 935 | "colab": {} 936 | }, 937 | "source": [ 938 | "print('Lag: %s' % model_fit_ARIMA.k_ar)\n", 939 | "print('Coefficients: %s' % model_fit_ARIMA.params)" 940 | ], 941 | "execution_count": null, 942 | "outputs": [] 943 | }, 944 | { 945 | "cell_type": "markdown", 946 | "metadata": { 947 | 
"collapsed": true, 948 | "id": "yN3QOkmDCAed", 949 | "colab_type": "text" 950 | }, 951 | "source": [ 952 | "### 7.2 - Model prediction on test data. ###\n" 953 | ] 954 | }, 955 | { 956 | "cell_type": "code", 957 | "metadata": { 958 | "id": "f9Mfj47gcrlu", 959 | "colab_type": "code", 960 | "colab": {} 961 | }, 962 | "source": [ 963 | "predictions_ARIMA = model_fit_ARIMA.predict(start=len(train), end=len(train)+len(test)-1, dynamic=False)" 964 | ], 965 | "execution_count": 31, 966 | "outputs": [] 967 | }, 968 | { 969 | "cell_type": "code", 970 | "metadata": { 971 | "id": "NaIYvkLwc97g", 972 | "colab_type": "code", 973 | "colab": {} 974 | }, 975 | "source": [ 976 | "for i in range(len(predictions_ARIMA)):\n", 977 | " print('predicted=%f, expected=%f' % (predictions_ARIMA[i], test[i]))" 978 | ], 979 | "execution_count": null, 980 | "outputs": [] 981 | }, 982 | { 983 | "cell_type": "code", 984 | "metadata": { 985 | "id": "MHVC12Sl0Xw1", 986 | "colab_type": "code", 987 | "colab": {} 988 | }, 989 | "source": [ 990 | "# Another Mothod \n", 991 | "from statsmodels.tsa.arima_model import ARMA\n", 992 | "mod = ARMA(train, order=(0,1))\n", 993 | "res = mod.fit()\n", 994 | "res.plot_predict(start=1990, end=1991)\n", 995 | "plt.show()\n", 996 | "predictions_ARIMA = res.predict(start=len(train), end=len(train)+len(test)-1, dynamic=False)\n", 997 | "for i in range(len(predictions_ARIMA)):\n", 998 | " print('predicted=%f, expected=%f' % (predictions_ARIMA[i], test[i]))\n" 999 | ], 1000 | "execution_count": null, 1001 | "outputs": [] 1002 | }, 1003 | { 1004 | "cell_type": "markdown", 1005 | "metadata": { 1006 | "id": "9q5Vg5WldwA0", 1007 | "colab_type": "text" 1008 | }, 1009 | "source": [ 1010 | "###7.3 Model Evaulation##" 1011 | ] 1012 | }, 1013 | { 1014 | "cell_type": "code", 1015 | "metadata": { 1016 | "id": "qNfne-7Ud2gl", 1017 | "colab_type": "code", 1018 | "colab": {} 1019 | }, 1020 | "source": [ 1021 | "error_ARIMA = mean_squared_error(test, predictions_ARIMA)" 1022 | ], 1023 | "execution_count": null, 1024 | "outputs": [] 1025 | }, 1026 | { 1027 | "cell_type": "code", 1028 | "metadata": { 1029 | "id": "Ofv08pr_eKUP", 1030 | "colab_type": "code", 1031 | "colab": {} 1032 | }, 1033 | "source": [ 1034 | "print('Test MSE: %.3f' % error_ARIMA)" 1035 | ], 1036 | "execution_count": null, 1037 | "outputs": [] 1038 | }, 1039 | { 1040 | "cell_type": "code", 1041 | "metadata": { 1042 | "id": "JxxCZ4C6eU_l", 1043 | "colab_type": "code", 1044 | "colab": {} 1045 | }, 1046 | "source": [ 1047 | "plt.plot(test)\n", 1048 | "plt.plot(predictions_ARIMA, color='red')\n", 1049 | "plt.show()" 1050 | ], 1051 | "execution_count": null, 1052 | "outputs": [] 1053 | }, 1054 | { 1055 | "cell_type": "markdown", 1056 | "metadata": { 1057 | "id": "BKXbvbqBelb9", 1058 | "colab_type": "text" 1059 | }, 1060 | "source": [ 1061 | "###7.4 Model tunning##" 1062 | ] 1063 | }, 1064 | { 1065 | "cell_type": "code", 1066 | "metadata": { 1067 | "id": "is1FsuhueyIX", 1068 | "colab_type": "code", 1069 | "colab": {} 1070 | }, 1071 | "source": [ 1072 | "window_ARIMA = model_fit_ARIMA.k_ar\n", 1073 | "coef_ARIMA = model_fit_ARIMA.params" 1074 | ], 1075 | "execution_count": null, 1076 | "outputs": [] 1077 | }, 1078 | { 1079 | "cell_type": "code", 1080 | "metadata": { 1081 | "id": "kVloQpvnfCLI", 1082 | "colab_type": "code", 1083 | "colab": {} 1084 | }, 1085 | "source": [ 1086 | "history_ARIMA = train[len(train)-window_ARIMA:]\n", 1087 | "history_ARIMA = [history_ARIMA[i] for i in range(len(history_ARIMA))]\n", 1088 | "predictions_ARIMA = list()" 1089 
| ], 1090 | "execution_count": null, 1091 | "outputs": [] 1092 | }, 1093 | { 1094 | "cell_type": "code", 1095 | "metadata": { 1096 | "id": "-7Nc4n4gfZF5", 1097 | "colab_type": "code", 1098 | "colab": {} 1099 | }, 1100 | "source": [ 1101 | "for t in range(len(test)):\n", 1102 | " length = len(history_ARIMA)\n", 1103 | " lag = [history_ARIMA[i] for i in range(length-window_ARIMA,length)]\n", 1104 | " yhat = coef_ARIMA[0]  # applies only the constant and AR coefficients; the MA(1) term is not used here, so this approximates the ARIMA forecast\n", 1105 | " for d in range(window_ARIMA):\n", 1106 | " yhat += coef_ARIMA[d+1] * lag[window_ARIMA-d-1]\n", 1107 | " obs = test[t]\n", 1108 | " predictions_ARIMA.append(yhat)\n", 1109 | " history_ARIMA.append(obs)\n", 1110 | " print('predicted=%f, expected=%f' % (yhat, obs))" 1111 | ], 1112 | "execution_count": null, 1113 | "outputs": [] 1114 | }, 1115 | { 1116 | "cell_type": "markdown", 1117 | "metadata": { 1118 | "id": "HRAZMQIif3H6", 1119 | "colab_type": "text" 1120 | }, 1121 | "source": [ 1122 | "### 9.5 - Model Evaluation ###" 1123 | ] 1124 | }, 1125 | { 1126 | "cell_type": "markdown", 1127 | "metadata": { 1128 | "id": "vm7uTbEWCAei", 1129 | "colab_type": "text" 1130 | }, 1131 | "source": [ 1132 | "References:\n", 1133 | "\n", 1134 | "Autoregressive\n", 1135 | "https://www.investopedia.com/terms/a/autoregressive.asp\n", 1136 | "\n", 1137 | "Autoregressive Models\n", 1138 | "https://online.stat.psu.edu/stat501/lesson/14/14.1\n", 1139 | "\n", 1140 | "ARIMA modeling and forecasting: Time Series in Python Part 2\n", 1141 | "https://tutorials.datasciencedojo.com/arima-model-time-series-python/" 1142 | ] 1143 | } 1144 | ] 1145 | } -------------------------------------------------------------------------------- /ReadMe.md: -------------------------------------------------------------------------------- 1 | ## **Welcome to Collaborative Time Series Analysis with Machine Learning Course👋** 2 | 3 |

4 | 5 |

6 | 7 | This repository provides a comprehensive guide to time series analysis and forecasting using machine learning techniques. It covers various approaches, including statistical methods, deep learning, and hybrid models, to analyze and predict time-dependent data. 8 | 9 | Also please subscribe to my [YouTube channel!](https://www.youtube.com/@coursesteach-mv5si) 10 | 11 | 12 | 13 | ## **🎯 Why Join This Course?** 14 | 15 | 1. 📖 Comprehensive Learning: Covers all major time series topics, from basics to advanced machine learning techniques. 16 | 17 | 2. 🛠 Practical Implementation: Each topic includes hands-on coding exercises, Jupyter notebooks, and real-world projects. 18 | 19 | 3. 🤝 Collaborative Learning: Work with students and researchers worldwide through GitHub discussions, issue tracking, and dedicated forums. 20 | 21 | 4. 🔥 AI-Powered Course: Stay ahead with industry-relevant techniques like LSTMs, GRUs, transformers, and hybrid models. 22 | 23 | ## **💡 How to Participate?** 24 | 25 | 🚀 Fork & Star this repository 26 | 27 | 👩‍💻 Explore and Learn from structured lessons 28 | 29 | 🔧 Enhance the current blog or code, or write a blog on a new topic 30 | 31 | 🔧 Implement & Experiment with provided code 32 | 33 | 🤝 Collaborate with fellow time series enthusiasts 34 | 35 | 📌 Contribute your own implementations & projects 36 | 37 | 📌 Share valuable blogs, videos, courses, GitHub repositories, and research websites 38 | 39 | 💡 Start your time series journey today! 40 | 41 | ## **🌍 Join Our Community** 42 | 43 | 🔗 [**YouTube Channel**](https://www.youtube.com/@coursesteach-mv5si/videos) 44 | 45 | 🔗 [**SubStack Blogs**](https://substack.com/@coursesteach) 46 | 47 | 🔗 [**Facebook**](https://www.facebook.com/CourseTeach) 48 | 49 | 🔗 [**LinkedIn**](https://www.linkedin.com/company/90909828/) 50 | 51 | 📬 Need Help? Connect with us on [**WhatsApp**](https://chat.whatsapp.com/L9URPRThBEa7GFl0mlwggg) 52 | 53 | ## **🚀 Let's Build Time Series Solutions Together!** 54 | 55 | Join us in creating, sharing, and implementing time series solutions. Your contributions will help advance open-source AI education globally. 💡🤖 56 | 57 | 🔗 Start Learning Time Series Analysis Now! 58 | 59 | ## **📌 Course Modules & Resources** 60 | 61 |
62 |

📕Course 01 -Classification and Vector Spaces

63 | 64 | ### 🔹Week 0-**Chapter 1:Introduction** 65 | | Topic Name/Tutorial | Video | 💻 Colab Implementation | 66 | |---|---|---| 67 | |[**✅1-What is Natural Language Processing (NLP)⭐️**](https://medium.com/@Coursesteach/natural-language-processing-part-1-5727b4efc8b4)[-Substack Link](https://substack.com/home/post/p-155741084?source=queue&autoPlay=false)|[1](https://www.youtube.com/watch?v=j86dP_05_o0)|---| 68 | | [**✅2- Natural Language Processing Tasks and Applications⭐️**](https://mushtaqmsit.substack.com/p/natural-language-processing-for-beginners) | [1](https://www.youtube.com/watch?v=j86dP_05_o0)| Content 3 | 69 | | [**✅3- Best Free Resources to Learn NLP-Tutorial⭐️**](https://mushtaqmsit.substack.com/p/top-free-nlp-learning-resources-your) | Content 5 | Content 6 | 70 | 71 | ### 🔹Week 1-**Chapter 2:Sentiment Analysis (logistic Regression)** 72 | | Topic Name/Tutorial | Video | 💻 Colab Implementation | 73 | |---|---|---| 74 | |**✅1- Preprocessing_Aassignment_1**| Content 2 |[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Preprocessing_Aassignment_1.ipynb) | 75 | |[**✅2- Supervised ML & Sentiment Analysis**](https://open.substack.com/pub/mushtaqmsit/p/sentiment-analysis-with-logistic?r=f2squ&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false) |[1](https://drive.google.com/file/d/1cN2GrXXW1mcxGkmoD4O1I4iydFrSZugz/view)| [![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb) | 76 | |[**✅3-Vocabulary & Feature Extraction**](https://open.substack.com/pub/mushtaqmsit/p/comprehensive-guide-to-vocabulary?r=f2squ&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false)|[1]()|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 77 | |[**✅4-Negative and Positive Frequencies**](https://open.substack.com/pub/mushtaqmsit/p/what-is-a-frequency-dictionary-in?r=f2squ&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false)|[1](https://drive.google.com/file/d/103nzJmltWxPqFgzs3LmMZcjQCj_o_kXJ/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 78 | |[**✅5-Text pre-processing-s**](https://mushtaqmsit.substack.com/p/text-processing-basics-essential)|[1](https://drive.google.com/file/d/16c-fd1qLcHXmEckQiKC3gchpErAcECL6/view)[-2](https://drive.google.com/file/d/1cHFkOq_74RJHVWQZojNQMgeZ78gsfaes/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 79 | |[**✅6-Putting it All Together-S**](https://mushtaqmsit.substack.com/p/step-by-step-guide-to-building-an)|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 80 | |[**✅7-Logistic Regression Overview-S**](https://mushtaqmsit.substack.com/p/logistic-regression-explained-predicting)|---|[![Colab 
icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 81 | |[**✅8-Logistic Regression: Training-s**](https://mushtaqmsit.substack.com/p/training-logistic-regression-from)|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 82 | |[**✅9-Logistic Regression: Testing⭐️**](https://mushtaqmsit.substack.com/p/logistic-regression-testing-explained)|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 83 | |[**🌐10-Logistic Regression: Cost Function⭐️**](https://medium.com/@Coursesteach/natural-language-processing-part-12-dac8146c288c)|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 84 | |**Lab#1:Visualizing word frequencies**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Visualizing_word_frequencies.ipynb)| 85 | |**🌐Lab 2:Visualizing tweets and the Logistic Regression model**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Visualizing_tweets_and_the_Logistic_Regression_model_ipynb.ipynb)| 86 | |**🌐Assignmen:Sentiment analysis with logistic Regression**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Sentiment_analysis_with_logistic_Regression_Assignment.ipynb)| 87 | 88 | ### Week 2-**📚Chapter3:Sentiment Analysis using Naive Bayes** 89 | | Topic Name/Tutorial | Video | Code | 90 | |---|---|---| 91 | | [**🌐1-Probability and Bayes’ Rule**](https://medium.com/@Coursesteach/natural-language-processing-part-14-d3de0382ba09) | [**1**](https://drive.google.com/file/d/1plPqQBOJrMGMrUqQFEs8M4uD6pxKXDnp/view) |[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 92 | |[**🌐2-Bayes’ Rule**](https://medium.com/@Coursesteach/natural-language-processing-part-15-bayes-rule-b87f9dff4a90)|[**1**](https://drive.google.com/file/d/19Nk0eUGhedfZAxwF_6xTvgl6KgedF5sl/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 93 | |[**🌐3-Naïve Bayes Introduction**](https://medium.com/@Coursesteach/natural-language-processing-part-16-na%C3%AFve-bayes-introduction-1f4042d05b16)|[**1**](https://drive.google.com/file/d/1RaCR90h7aBf9dJZP1iscaAV6dWclexgc/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Visualizing_Naive_Bayes.ipynb)| 94 | |[**🌐4-Laplacian 
Smoothing**](https://medium.com/@Coursesteach/natural-language-processing-part-17-laplacian-smoothing-7d4be71d0ded)|[**1**](https://drive.google.com/file/d/137FsONo3kU-KTUvO8IvLN-Fnm34oduqM/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 95 | |[**🌐5-Log Likelihood, Part 1**](https://medium.com/@Coursesteach/natural-language-processing-part-18-log-likelihood-part-1-80bfdab420ce)|[**1**](https://drive.google.com/file/d/1okkcmFX-npmRgqohx0Rgz7Vnyjb6xPM-/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 96 | |[**🌐6-Log Likelihood, Part 2**](https://medium.com/@Coursesteach/natural-language-processing-part-19-log-likelihood-part-2-ee3e872f33f3)| [**1**](https://drive.google.com/file/d/10VFBCuLxN4qaj07Vp6LBMRZ4mbKu_khV/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 97 | |[**🌐7-Training Naïve Bayes**](https://medium.com/@Coursesteach/natural-language-processing-part-20-training-na%C3%AFve-bayes-cbdc82aabd19)| [**1**](https://drive.google.com/file/d/1_Hmzrzn_FLiACjK5oRaYP-Y6KQ84OQ9A/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 98 | | **🌐Lab1-Visualizing Naive Bayes** | Content 5 | [![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Visualizing_Naive_Bayes.ipynb) | 99 | |**🌐Assignment_2_Naive_Bayes**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Assignment_2_Naive_Bayes.ipynb) | 100 | |[**🌐8-Testing Naïve Bayes**](https://medium.com/@Coursesteach/natural-language-processing-part-21-testing-na%C3%AFve-bayes-d98579270263)| [**1**](https://drive.google.com/file/d/1ACc2n1F-oqGyzX2_GjfemJI59uGq0lam/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 101 | |[**🌐9-Applications of Naïve Bayes**](https://medium.com/@Coursesteach/natural-language-processing-part-21-testing-na%C3%AFve-bayes-d98579270263)| [**1**](https://drive.google.com/file/d/1rQm9FR06_7a8m6KzPVboaZtmL-ZMLwcu/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 102 | |[**🌐10-Naïve Bayes Assumptions**](https://medium.com/@Coursesteach/natural-language-processing-part-23-na%C3%AFve-bayes-assumptions-5b2ef6f84810)| [**1**](https://drive.google.com/file/d/1gnHNYlMu9ttlLQuaRyerwtHG_cwwS_f6/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 103 | |[**🌐11-Error 
Analysis**](https://medium.com/@Coursesteach/natural-language-processing-part-24-error-analysis-9cdcf0385fdc)| [**1**](https://drive.google.com/file/d/1aJ1LZzlr8MyWDTr6WoyLqELf6jkHtBcf/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 104 | 105 | ## Week 3 -[**📚Chapter 3:Vector Space Model**]() 106 | | Topic Name/Tutorial | Video | Code | 107 | |---|---|---| 108 | |[**🌐1-Vector Space Models**](https://medium.com/@Coursesteach/natural-language-processing-part-25-vector-space-models-bad08067a5ac)| [**1**](https://drive.google.com/file/d/1Vuy_2RPHJhz2JaPJQK5RZxxMdchmBDL_/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 109 | |[**🌐2-Word by Word and Word by Doc**](https://medium.com/@Coursesteach/natural-language-processing-part-26-word-by-word-and-word-by-doc-4dfa832d2d8e)| [**1**](https://drive.google.com/file/d/1otyRw-Q3JBEDWVn59dxFeqDQxed558Zv/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 110 | |[**🌐3-Euclidean Distance**](https://medium.com/@Coursesteach/natural-language-processing-part-27-euclidean-distance-180befc6c6ef)| [**1**](https://drive.google.com/file/d/1zA0ygWA00RCXiuN86tJuQsitMvjraWZb/view)[-2](https://www.youtube.com/watch?v=m_CooIRM3UI)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 111 | |[**🌐4-Cosine Similarity: Intuition**](https://medium.com/@Coursesteach/natural-language-processing-part-28-cosine-similarity-intuition-68a4654d3cb2)| [**1**](https://drive.google.com/file/d/1z-6tCUnoWsMmlvr7rppk8ALdoY2Fyvjy/view)[-2](https://www.youtube.com/watch?v=m_CooIRM3UI)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 112 | |[**🌐5-Cosine Similarity**](https://medium.com/@Coursesteach/natural-language-processing-part-29-cosine-similarity-51951f805087)| [**1**](https://drive.google.com/file/d/1dbcHscnLsVb0ztS2XtMUnw_Abx3ILLSw/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 113 | |[**🌐6-Manipulating Words in Vector Spaces**](https://medium.com/@Coursesteach/natural-language-processing-part-30-manipulating-words-in-vector-spaces-141f318eaa4c)| [**1**](https://drive.google.com/file/d/1nRYyNEv6_J4aK0KDXVl4reMD8A9PsYLw/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 114 | |[**🌐7-Visualization and PCA**](https://medium.com/@Coursesteach/natural-language-processing-part-30-manipulating-words-in-vector-spaces-141f318eaa4c)| [**1**](https://drive.google.com/file/d/1isV9n8-EY0Gv5NzsQp7Etk7Nrc9O5MBb/view)|[![Colab 
icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 115 | [ 🌐**8-Lab1_Linear_algebra_in_Python_with_Numpy.ipynb**](https://github.com/hussain0048/Natural-language-processing/blob/main/Lab1_Linear_algebra_in_Python_with_Numpy.ipynb) 116 | |[**🌐8-PCA Algorithm**](https://medium.com/@Coursesteach/natural-language-processing-part-32-pca-algorithm-49607c9e3dde)| [**1**](https://drive.google.com/file/d/1UJeK7RL_U2bOc_dQ-4l3g_jvYn3jsaA3/view)[-2](https://www.youtube.com/watch?v=FD4DeN81ODY)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 117 | [ **🌐9-Lab:2_Manipulating word embeddings**](https://github.com/hussain0048/Natural-language-processing/blob/main/Manipulating_word_embeddings.ipynb) 118 | 119 | ## Week 4 - [**📚Chapter 4:Machine Translation and Document Search**]() 120 | | Topic Name/Tutorial | Video | Code | 121 | |---|---|---| 122 | |[**🌐1-Transforming word vectors**](https://medium.com/@Coursesteach/natural-language-processing-part-33-transforming-word-vectors-37491a721ac7)| [**1**](https://drive.google.com/file/d/1bAP_tN6F23RBnlGv_WFWTdJ7sur9s1b-/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 123 | |[**🌐2-Lab1 Rotation matrices R2**]()|--|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Lab_Rotation_matrices_in_R2.ipynb)| 124 | |[**🌐3-K-nearest neighbors**](https://medium.com/@Coursesteach/natural-language-processing-part-34-k-nearest-neighbors-03d4c44cc8b5)| [**1**](https://drive.google.com/file/d/1lIl19NUa-2zSzqYWNkkmsbUCAOuJ0Xs6/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 125 | |[**🌐4-Hash tables and hash functions**](https://medium.com/@Coursesteach/natural-language-processing-part-35-hash-tables-and-hash-functions-d8221321e63e)| [**1**](https://drive.google.com/file/d/1GQpWFbhzvruYZx1PLmKfj4o0OJ04roQv/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 126 | |[**🌐5-Locality sensitive hashing**](https://medium.com/@Coursesteach/natural-language-processing-part-36-locality-sensitive-hashing-4f559102b991)| [**1**](https://drive.google.com/file/d/1WJbttrKp6ZtyVZWJPmrhy7sfP01kEfjZ/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 127 | |[**🌐6-Multiple Planes-r**](https://medium.com/@Coursesteach/natural-language-processing-part-37-multiple-planes-1c513d65eee5)| [**1**](https://drive.google.com/file/d/156Tvm0JWWSzRu-eRbOTQT27RBGtkQ0m4/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 128 | |[**🌐7-Approximate nearest 
neighbors**](https://medium.com/@Coursesteach/natural-language-processing-part-38-approximate-nearest-neighbors-51ab012d6680)| [**1**](https://drive.google.com/file/d/1JUlbsazyOx4-JEhJ3Je9gh84Y883KiXK/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 129 | |[**🌐7-Lab2:Hash tables**](https://medium.com/@Coursesteach/natural-language-processing-part-38-approximate-nearest-neighbors-51ab012d6680)| [**1**](https://drive.google.com/file/d/1JUlbsazyOx4-JEhJ3Je9gh84Y883KiXK/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Lab2_Hash_tables.ipynb)| 130 | |[**🌐8-Searching documents**](https://medium.com/@Coursesteach/natural-language-processing-part-39-searching-documents-6080695a76ed)| [**1**](https://drive.google.com/file/d/1JBqmJ265vLsu1CHTq92RcggxeLRSE-xL/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 131 |
132 | 133 |
134 |

📕Course 02 -Natural Language Processing with Probabilistic Models

135 | 136 | ### Week 1-[**📚Chapter1:Autocorrect and Minimum Edit Distance**]() 137 | 138 | | Topic Name/Tutorial | Video | Code | 139 | |---|---|---| 140 | |[**🌐1-Overview**](https://medium.com/@Coursesteach/natural-language-processing-part-33-transforming-word-vectors-37491a721ac7)| [**1**](https://drive.google.com/file/d/1n_AX9UaW-8T97jucHEr0Ge9sGJjTyhEG/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 141 | |[**🌐2-Autocorrect**](https://medium.com/@Coursesteach/natural-language-processing-part-40-autocorrect-25648a46f0b7)| [**1**](https://drive.google.com/file/d/1YDxyMWBSDs13oBMKKLpeJTzYQZmU04XK/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 142 | |[**🌐3-Build Model**](https://medium.com/@Coursesteach/natural-language-processing-part-41-building-the-model-d12f21f0dab6)| [**1**](https://drive.google.com/file/d/18qHINwvC6qfYpHPSbgQIXjYyWs23YcGf/view?usp=sharing)[-2](https://drive.google.com/file/d/1E_C2zXlN33lmiBqICF6WCpcvOoyK6erp/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 143 | |**🌐Lecture notebook building_the_vocabulary**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/building_the_vocabulary.ipynb) | 144 | |**🌐Lecture notebook Candidates from edits**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/candidates_from_edits.ipynb) | 145 | |[**🌐4-Minimum edit distance**](https://medium.com/@Coursesteach/natural-language-processing-part-42-minimum-edit-distance-2da103883c4a)| [**1**](https://drive.google.com/file/d/12fscxMntzAHi8i4paRfX5o71SZGQL-_4/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 146 | |[**🌐5-Minimum edit distance Algorithm 1**](https://medium.com/@Coursesteach/natural-language-processing-part-43-minimum-edit-distance-algorithm-4949a078472e)| [**1**](https://drive.google.com/file/d/1J2c4W3DwNC-UhMSo8mOw06fNRz9WV9lV/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 147 | |[**🌐6-Minimum edit distance Algorithm 2**](https://medium.com/@Coursesteach/natural-language-processing-part-44-minimum-edit-distance-algorithm-ii-f05801c32d0f)| [**1**](https://drive.google.com/file/d/1DuhgTqXjah5HzvF57KyTW51XbRkOOBbR/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 148 | |[**🌐7-Minimum edit distance Algorithm 3**](https://medium.com/@Coursesteach/natural-language-processing-part-44-minimum-edit-distance-algorithm-ii-f05801c32d0f)| [**1**](https://drive.google.com/file/d/1wJFBJ1C51LoNkRj2LCg77WLEa5_lBHIy/view)|[![Colab
icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 149 | 150 | ### Week 2-[**📚Chapter2:Part of Speech Tagging and Hidden Markov Models**]() 151 | | Topic Name/Tutorial | Video | Code | 152 | |---|---|---| 153 | |[**🌐1-Part of Speech Tagging**](https://medium.com/@Coursesteach/natural-language-processing-part-45-part-of-speech-tagging-ce24d81c0aa4)| [**1**](https://drive.google.com/file/d/14onPuTeIqV_lqjWND-TS75bU_m-QYsJB/view)[-2](https://www.youtube.com/watch?v=QYzTTFxcc9I)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 154 | |[**🌐2-Markov Chains**](https://medium.com/@Coursesteach/natural-language-processing-part-46-markov-chains-2e6987d0fe40)| [**1**](https://drive.google.com/file/d/1QnOiyWWXM45E8vAS3b43hH8a5u2WbNfh/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 155 | |[**🌐3-Markov Chains and POS Tags**](https://medium.com/@Coursesteach/natural-language-processing-part-47-markov-chains-and-pos-tags-0301835e5f9a)| [**1**](https://drive.google.com/file/d/12P4hncUXDeMCZt-nbvSoAjVoub9YPHPn/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 156 | |[**🌐4-Hidden Markov Models**](https://medium.com/@Coursesteach/natural-language-processing-part-48-hidden-markov-models-e15cb86bd040)| [**1**](https://drive.google.com/file/d/1FsdQVPGvXaoDiS_Hh48v1DpVHEtEg74F/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 157 | |[**🌐5-Calculating Probabilities**](https://medium.com/@Coursesteach/natural-language-processing-part-49-calculating-probabilities-5b673c61de41)| [**1**](https://drive.google.com/file/d/1CNqmmeWWb7PBGagOlPle6BogwiB4FuEi/view)[-2](https://drive.google.com/file/d/1_-lA2EnxXn4eu0YJayfS7HI9o3C7zGeC/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 158 | |[**🌐6-Populating the Emission Matrix**](https://medium.com/@Coursesteach/natural-language-processing-part-50-populating-the-emission-matrix-0a93c808c8e5)| [**1**](https://drive.google.com/file/d/1k2Sw10YhZh2JnbBcFXFKKQCDWnFFFDpI/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 159 | |[**🌐Lecture Notebook - Working with tags and Numpy**]()|--|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/working_with_tags_and_numpy.ipynb)| 160 | |[**🌐7-The Viterbi Algorithm**](https://medium.com/@Coursesteach/natural-language-processing-part-51-the-viterbi-algorithm-cd5a00149e9b)| 
[**1**](https://drive.google.com/file/d/17G8fWiQ0ZaJoyiNfFaXlGTykGnzpcnty/view)[-2](https://www.youtube.com/watch?v=IqXdjdOgXPM)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 161 | |[**🌐8-Viterbi: Initialization,Forward Pass,Backward Pass**](https://medium.com/@Coursesteach/natural-language-processing-part-51-viterbi-initialization-8938adc12bc0)| [**1**](https://drive.google.com/file/d/1HDV28as1FKFpsZuKwqW_Q8L0TX6Q9Cpf/view)[-2](https://drive.google.com/file/d/13xxyhCg-DVgGCS_z7mk2sU_WPDBhe4rD/view)[-3](https://drive.google.com/file/d/1DbChUznF2FSeiDUueYaJY82chS6X30SY/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 162 | |[**🌐9-Lecture Notebook - Working with text file**](https://medium.com/@Coursesteach/natural-language-processing-part-51-the-viterbi-algorithm-cd5a00149e9b)|--|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Lecture_Notebook_Working_with_text_files.ipynb)| 163 | |[**🌐10-Assignment: Part of Speech Tagging**](https://medium.com/@Coursesteach/natural-language-processing-part-51-the-viterbi-algorithm-cd5a00149e9b)|--|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/C2_W2_Assignment_(1).ipynb)| 164 | 165 | ### Week 3-[**📚Chapter3:Autocomplete and Language models**]() 166 | 167 | | Topic Name/Tutorial | Video | Code | 168 | |---|---|---| 169 | |[**🌐1-N-Grams Overview**](https://medium.com/@Coursesteach/natural-language-processing-part-52-n-grams-overview-1e99325146da)| [**1**](https://drive.google.com/file/d/1rMzB-4mLRMyuMubgP8PUfRrkwiVH_Fdd/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 170 | |[**🌐2-N-grams and Probabilities**](https://medium.com/@Coursesteach/natural-language-processing-part-53-n-grams-and-probabilities-453e4fc0eac9)| [**1**](https://drive.google.com/file/d/1IdEJ-JoFYVYLiT0G_ek0AqePve_QLmxv/view)[-2](https://www.youtube.com/watch?v=9vM4p9NN0Ts)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 171 | |[**🌐3-Sequence Probabilities**](https://medium.com/@Coursesteach/natural-language-processing-part-54-sequence-probabilities-e46ddb515e8e)| [**1**](https://drive.google.com/file/d/1CC7NW18YLfUFZKdYZG89FjIliW7AV_T-/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 172 | |[**🌐3-Understanding the Start and End of Sentences in N-Gram Language Models**](https://medium.com/@Coursesteach/natural-language-processing-part-55-understanding-the-start-and-end-of-sentences-in-n-gram-7a786404049b)| [**1**](https://drive.google.com/file/d/13E_8Sd_b7u1EGt2uRGPO7tlbVbW_p0y1/view)|[![Colab 
icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 173 | |**🌐4-Lecture notebook: Corpus preprocessing for N-grams**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/dr-mushtaq/Natural-language-processing/blob/main/n_gram_corpus_preprocessing_(1).ipynb)| 174 | |[**🌐5-Creating and Using N-gram Language Models for Text Prediction and Generation**](https://medium.com/@Coursesteach/natural-language-processing-part-56-creating-and-using-n-gram-language-models-for-text-defa533b5cf8)| [**1**](https://drive.google.com/file/d/1ssbMgGICp9kXHzSk1p3YgHzL1c3A4PQR/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 175 | |[**🌐6-How to Evaluate Language Models Using Perplexity: A Step-by-Step Guide⭐️**](https://medium.com/@Coursesteach/natural-language-processing-part-57-how-to-evaluate-language-models-using-perplexity-a-ca97a404b4f0)| [**1**](https://drive.google.com/file/d/18nABniIb2nqVo616J7jOaCTNWZkKRV0P/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 176 | |**🌐7-Lecture notebook: Building the language model**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/dr-mushtaq/Natural-language-processing/blob/main/building_the_language_model.ipynb)| 177 | |[**🌐8-Out of Vocabulary Words⭐️**](https://medium.com/@Coursesteach/how-to-handle-unknown-words-in-language-models-a-step-by-step-guide-nlp-part-57-3139c08e996c)| [**1**](https://drive.google.com/file/d/10k4lM0sBuYWeoCrRQXzs8B_M4a2BdYqF/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 178 | |[**🌐9-Smoothing⭐️**](https://medium.com/@Coursesteach/how-smoothing-and-backoff-improve-n-gram-language-models-nlp-part-57-7ebaf8d6b75a)| [**1**](https://drive.google.com/file/d/1iaWqC9pMF648dWaR8F56_fpy3DbbQeoS/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 179 | 180 | ### Week 3-[**📚Chapter4:Word embedding with neural network**]() 181 | | Topic Name/Tutorial | Video | Code | 182 | |---|---|---| 183 | |[**🌐1-Basic Word Representations⭐️**](https://medium.com/@Coursesteach/one-hot-vectors-explained-simplifying-word-representation-in-nlp-part-58-7855fe35dd5b)| [**1**](https://drive.google.com/file/d/1D-DiE4vlZgulAoh5Ygoy4O1s8LZezJju/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 184 | |[**🌐2-Word Embedding⭐️**](https://medium.com/@Coursesteach/from-words-to-vectors-understanding-word-embeddings-in-nlp-part-59-d80d0d470120)| 
[**1**](https://drive.google.com/file/d/1y149H5Tch-y0aq3Ro2lPVuC0YX-HZ_S9/view)[-2](https://www.youtube.com/watch?v=IebL0RQF5lg)[-3](https://www.youtube.com/watch?v=Do8cVbx-HOs)[-4](https://www.youtube.com/watch?v=wgfSDrqYMJ4)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 185 | |[**🌐3-How to Create Word Embeddings⭐️**](https://medium.com/@Coursesteach/learn-to-create-word-embeddings-tools-techniques-and-best-practices-natural-language-processing-8ca08a659ea3)| [**1**](https://drive.google.com/file/d/1Xqd96C6GcLSPEBGFVzxyBAd8MYWVGoPH/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 186 | |[**🌐4-Word Embedding Methods⭐️**](https://medium.com/@Coursesteach/exploring-word-embedding-techniques-from-word2vec-to-transformers-natural-language-processing-cf3b2b8b9dd3)| [**1**](https://drive.google.com/file/d/1a9Cad2Kgel2cr7IwTzVAoWWMq4CgTP2g/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 187 | |[**🌐5-Continuous Bag-of-Words Model⭐️**](https://medium.com/@Coursesteach/understanding-the-continuous-bag-of-words-model-for-word-embeddings-natural-language-processing-7080b5c1a8d8)| [**1**](https://drive.google.com/file/d/1n9hZuRvN2l0JKoxuRmQeodAm-QWDg9gW/view)[-2](https://www.youtube.com/watch?v=viZrOnJclY0)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 188 | |[**🌐6-Cleaning and Tokenization⭐️**](https://medium.com/@Coursesteach/text-cleaning-and-tokenization-a-step-by-step-guide-for-nlp-beginners-natural-language-processing-8c6b10a0bf8b)| [**1**](https://drive.google.com/file/d/1EY6L5Y8fmUwE4rtPodm9mVmjVfoRWYWt/view?usp=sharing)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 189 | |[**🌐7-Sliding Window⭐️**](https://medium.com/@Coursesteach/extracting-context-and-center-words-for-continuous-bag-of-words-cbow-model-training-927863ea505a)| [**1**](https://drive.google.com/file/d/1O61F1GzdqpS2oV8LNREI6KmHjKle2IzK/view?usp=sharing)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 190 | |[**🌐8-Transforming Words into Vectors⭐️**](https://medium.com/@Coursesteach/how-to-prepare-data-for-the-continuous-bag-of-words-cbow-model-771274749c15)| [**1**](https://drive.google.com/file/d/12IifLtBzH0H5n3a-TT6DdDtq-FAzmGCg/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 191 | |**🌐9-Lecture Notebook - Data Preparation⭐️**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/dr-mushtaq/Natural-language-processing/blob/main/Lecture_Notebook_Data_Preparation.ipynb)| 192 | |[**🌐9-Architecture of the CBOW 
Model⭐️**](https://medium.com/@Coursesteach/how-to-prepare-data-for-the-continuous-bag-of-words-cbow-model-771274749c15)| [**1**](https://drive.google.com/file/d/1iKuZLI2bZD6KhaAvLxo1k-3HDKELFSsB/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 193 | |**🌐10-Architecture of the CBOW Model-Dimensions⭐️**| [**1**](https://drive.google.com/drive/u/0/folders/1YWuBALjfT78rTWKzAoyviL1sOm6dS2yz)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 194 | |**🌐11-Architecture of the CBOW Model-Dimensions 2⭐️**| [**1**](https://drive.google.com/file/d/13GsB7B54jFO7izAze4vlSSK0QOPBMLgN/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 195 | |**🌐12-Architecture of the CBOW Model-Activation Functions⭐️**| [**1**](https://drive.google.com/file/d/1RLmo9keaxV3z25I4VfgCEGUJlwTtfBl8/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 196 | |**🌐Lecture Notebook - Intro to CBOW model⭐️**|---|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/dr-mushtaq/Natural-language-processing/blob/main/NLP_C2_W4_lecture_notebook_model_architecture.ipynb)| 197 | |**🌐13-Training a CBOW Model-Cost Function⭐️**| [**1**](https://drive.google.com/file/d/1SqFmDTy1zsDbB_-LGnGnRNCfTZNy2Rt1/view?usp=sharing)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 198 | |**🌐14-Training a CBOW Model-Forward Propagation⭐️**| [**1**](https://drive.google.com/file/d/16eqk4Jrko4Zfezy4dtfK6YIG88S1tbQ1/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 199 | |**🌐15-Training a CBOW Model-Backpropagation and Gradient Descent⭐️**| [**1**](https://drive.google.com/file/d/1iQIamGiy1Tk6TPg8ZX_HIyGYAbRzsjLe/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 200 |
201 | 202 |
203 |

📕Course 03 - Natural Language Processing with Sequence Models

204 | 205 | ### Week 1-[**📚Chapter1:Recurrent Neural Networks for Language Modeling**]() 206 | 207 | | Topic Name/Tutorial | Video | Code | 208 | |---|---|---| 209 | |**🌐1-Course 3 Introduction**| [**1**](https://drive.google.com/file/d/1wp90yB137CPzX1NLcob47eGI5uPq_a1S/view?usp=sharing)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 210 | 211 |
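
Chapter 1 of Course 3 centers on using recurrent neural networks for language modeling: the hidden state carries context forward, and at each step the model scores the next token. As a rough companion to the introduction above, here is a minimal sketch of a vanilla RNN step; the dimensions, weight names, and toy token sequence are illustrative assumptions only.

```python
# Minimal sketch of a vanilla RNN language model step. Dimensions and names
# are illustrative assumptions, not code from the course notebooks.
import numpy as np

V, H = 10, 8                              # toy vocab size and hidden size
rng = np.random.default_rng(1)
Wx = rng.normal(scale=0.1, size=(H, V))   # input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(H, H))   # hidden-to-hidden (recurrent) weights
Wy = rng.normal(scale=0.1, size=(V, H))   # hidden-to-output weights

def step(x_t, h_prev):
    """One RNN step: update the hidden state, then score the next token."""
    h_t = np.tanh(Wx @ x_t + Wh @ h_prev)
    logits = Wy @ h_t
    probs = np.exp(logits - logits.max())
    return h_t, probs / probs.sum()       # softmax over the vocabulary

h = np.zeros(H)
for token_id in [3, 1, 4]:                # a toy token sequence
    x = np.zeros(V)
    x[token_id] = 1.0                     # one-hot input for this token
    h, p_next = step(x, h)

print("next-token distribution:", np.round(p_next, 3))
```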
212 | 213 |
214 |

📕Course 04 - Natural Language Processing with Attention Models

215 | 216 | ### Week 1-[**📚Chapter1:Autocorrect and Minimum Edit Distance**]() 217 | 218 | | Topic Name/Tutorial | Video | Code | 219 | |---|---|---| 220 | |[**🌐1-Overview**](https://medium.com/@Coursesteach/natural-language-processing-part-33-transforming-word-vectors-37491a721ac7)| [**1**](https://drive.google.com/file/d/1n_AX9UaW-8T97jucHEr0Ge9sGJjTyhEG/view)|[![Colab icon](https://img.shields.io/badge/Colab-Open-blue.svg?logo=colab&logoColor=white)](https://github.com/hussain0048/Natural-language-processing/blob/main/Natural_Language_Processing.ipynb)| 221 | 222 |
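
The core algorithm behind this chapter's autocorrect topic is minimum edit distance computed by dynamic programming: `D[i][j]` holds the cheapest way to turn the first `i` characters of the source into the first `j` characters of the target. Below is a short sketch; the insert/delete cost of 1 and replace cost of 2 follow one common convention and should be treated as an assumption rather than the exact setup of the lectures.

```python
# Minimum edit distance (Levenshtein-style dynamic programming).
# Costs of 1 (insert/delete) and 2 (replace) are an assumed convention.
def min_edit_distance(source, target, ins_cost=1, del_cost=1, rep_cost=2):
    m, n = len(source), len(target)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):              # first column: delete all of source
        D[i][0] = D[i - 1][0] + del_cost
    for j in range(1, n + 1):              # first row: insert all of target
        D[0][j] = D[0][j - 1] + ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            r = 0 if source[i - 1] == target[j - 1] else rep_cost
            D[i][j] = min(D[i - 1][j] + del_cost,    # delete from source
                          D[i][j - 1] + ins_cost,    # insert into source
                          D[i - 1][j - 1] + r)       # replace (free on a match)
    return D[m][n]

print(min_edit_distance("play", "stay"))   # 4: two replacements at cost 2 each
```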
223 | 224 | 225 |
226 |

📕Course 05 - Building Chatbots in Python

227 | 228 | ## Week - [**Building Chatbots in Python**]() 229 | - [**Building Chatbots in Python**](https://github.com/hussain0048/Natural-language-processing/blob/main/Chatbot_using_python.ipynb) 230 |
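
The linked notebook walks through building chatbots in Python. As a taste of the idea, here is a minimal rule-based chatbot sketch; the regex rules and canned responses are illustrative assumptions and are not taken from `Chatbot_using_python.ipynb`.

```python
# A minimal rule-based chatbot sketch. Rules and responses are illustrative
# assumptions only, not the approach used in Chatbot_using_python.ipynb.
import re

rules = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help you?"),
    (re.compile(r"\bweather\b", re.I), "I can't check live weather, sorry."),
    (re.compile(r"\b(bye|goodbye)\b", re.I), "Goodbye!"),
]

def respond(message):
    """Return the first matching canned response, else a fallback."""
    for pattern, reply in rules:
        if pattern.search(message):
            return reply
    return "I'm not sure I understand. Could you rephrase?"

for msg in ["Hey there", "What's the weather like?", "bye"]:
    print(f"USER: {msg}\nBOT:  {respond(msg)}")
```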
231 | 232 |
233 |

📕 Natural Language Processing Resources

234 | 235 | ## 👁️ Chapter1: - **Free Courses** 236 | | Title/link| Description | Reading Status | 237 | |---|---|---| 238 | |[**✅ 1-Natural Language Processing Specialization**](https://www.coursera.org/specializations/natural-language-processing)|by Eddy Shyu, Coursera, Google| InProgress| 239 | |[**✅ 2-Applied Language Technology**](https://applied-language-technology.mooc.fi/html/index.html)|Free course with notes and videos| Pending| 240 | |[**✅ 3-Large Language Models for the General Audience**](https://www.learngood.com/#/youtube-series/Andrej%20Karpathy%20-%20Large%20Language%20Models%20for%20the%20General%20Audience)|Free course by Andrej Karpathy with notes and videos| Pending| 241 | ## 👁️ Chapter2: - **Important Websites** 242 | | Title/link| Description | Code | 243 | |---|---|---| 244 | |[**✅1- learngood**](https://www.learngood.com/#/youtube-series/Andrej%20Karpathy%20-%20Large%20Language%20Models%20for%20the%20General%20Audience)|Videos and GitHub resources|---| 245 | 246 | ## 👁️ Chapter3: - **Important Social Media Groups** 247 | | Title/link| Description | Code | 248 | |---|---|---| 249 | |[**🌐1- Computer Science courses with video lectures**]()|Videos and GitHub resources|---| 250 | 251 | ## 👁️ Chapter4: - **Free Books** 252 | | Title/link| Description | Code | 253 | |---|---|---| 254 | |[**🌐1- Computer Science courses with video lectures**]()|Videos and GitHub resources|---| 255 | 256 | ## 👁️ Chapter5: - **GitHub Repositories** 257 | | Title/link| Description | Status | 258 | |---|---|---| 259 | |[**✅ 1- awesome-time-series-papers**](https://github.com/hushuguo/awesome-time-series-papers?tab=readme-ov-file)|Curated list of time series papers| Pending| 260 | |[**✅ 2- ML YouTube Courses**](https://github.com/dair-ai/ML-YouTube-Courses?fbclid=IwAR26ZRVJyPC6_fFmcOy5IA-u4relyRSAxM5N-pleAD59VwrsSvOX8MsEpaQ)|GitHub repository of ML courses on YouTube| Pending| 261 | |[**✅ 3- ml-roadmap**](https://github.com/loganthorneloe/ml-roadmap?tab=readme-ov-file#mathematics)|GitHub repository with a machine learning roadmap| Pending| 262 | |[**✅ 4-courses & resources**](https://github.com/SkalskiP/courses)|Courses covering many AI domains| Pending| 263 | |[**✅ 5-GenAI Agents: Comprehensive Repository for Development and Implementation**](https://github.com/NirDiamant/GenAI_Agents)| Collection of Generative AI (GenAI) agent tutorials and implementations | Pending| 264 | |[**✅ 6-nlp-notebooks**](https://github.com/nlptown/nlp-notebooks/tree/master)| Notebooks implementing NLP concepts, by nlptown | Pending| 265 | |[**✅ 7-NLP with Python**](https://github.com/susanli2016/NLP-with-Python/tree/master)| Implements NLP concepts in Python | Pending| 266 | 267 | 268 | ## 👁️ Chapter6: - **Important Libraries and Packages** 269 | | Title/link| Description | Code | 270 | |---|---|---| 271 | |[**✅1- 10 Must-Know Python Libraries for LLMs in 2025**](https://machinelearningmastery.com/10-must-know-python-libraries-for-llms-in-2025/)|Article on must-know Python libraries for LLMs|---| 272 |
273 | 274 | ## 💻 [Workflow](https://www.youtube.com/watch?v=LuWAw-RBPys): 275 | 276 | - 🔹 Fork the repository so you can submit changes through Pull Requests (PRs). 277 | 278 | - 🔹 Clone your forked repository using the terminal or Git Bash. 279 | 280 | - 🔹 Make your changes in the cloned repository. 281 | 282 | - 🔹 Add, commit, and push your changes. 283 | - 🔹 Then on GitHub, open a pull request from your forked repository. 284 | - 🔹 Reviewers will approve or request changes before merging. 285 | - 🔹 Nobody can push directly to main (unless explicitly allowed in settings). 286 | > 🔹print("Start contributing for Natural Language Processing") 287 | 288 | ## ⚙️ Things to Note 289 | 290 | * Make sure you do not copy code from external sources, because such work will not be considered. Plagiarism is strictly not allowed. 291 | * You can only work on issues that have been assigned to you. 292 | * If you want to contribute an algorithm, it's preferable that you create a new issue before making a PR and link your PR to that issue. 293 | * If you have modified or added code, make sure it compiles before submitting. 294 | * Strictly use snake_case (underscore_separated) for file names and push them to the correct folder. 295 | * Do not update the **[README.md](https://github.com/prathimacode-hub/ML-ProjectKart/blob/main/README.md).** 296 | 297 | ## 🔍 Explore more 298 | Explore cutting-edge tools and Python libraries, access insightful slides and source code, and tap into a wealth of free online courses from top universities and organizations. Connect with like-minded individuals on Reddit, Facebook, and beyond, and stay updated with our YouTube channel and GitHub repository. Don't wait: enroll now and unleash your NLP potential! 299 | 300 | * [Natural Language Processing with Probabilistic Models Course](https://coursesteach.com/enrol/index.php?id=187) 301 | * [Natural Language Processing Course](https://coursesteach.com/enrol/index.php?id=46) 302 | 303 | 304 | 305 | ## **✨Top Contributors** 306 | We would love your help in making this repository even better! If you know of an amazing NLP course that isn't listed here, or if you have any suggestions for improving the course content, feel free to open an issue or submit a course contribution request. 307 | 308 | Together, let's make this the best AI learning hub! 🚀 309 | 310 | Thanks go to these wonderful people. Contributions of any kind are welcome! 🚀 311 | 312 | 313 | 314 | 315 | 316 | 317 | --------------------------------------------------------------------------------