# Learn-Natural-Language-Processing-Curriculum
This is the curriculum for "Learn Natural Language Processing" by Siraj Raval on YouTube.

## Course Objective
This is the curriculum for [this video](https://youtu.be/GazFsfcijXQ) on Learn Natural Language Processing by Siraj Raval on YouTube. After completing this course, start your own startup, do consulting work, or find a full-time job related to NLP.
Remember to believe in your ability to learn. You can learn NLP, you will learn NLP, and if you stick with it, eventually you will master it.

## Find a study buddy
Join the #NLP_curriculum channel in our Slack group to find one: http://wizards.herokuapp.com

## Components each week
- Video Lectures
- Reading Assignments
- Project(s)

## Course Length
- 8 weeks
- 2-3 hours of study per day

## Tools Used
- Python, PyTorch, NLTK

## Prerequisites

- Learn Python: https://www.edx.org/course/introduction-python-data-science-2
- Statistics: http://web.mit.edu/~csvoss/Public/usabo/stats_handout.pdf
- Probability: https://static1.squarespace.com/static/54bf3241e4b0f0d81bf7ff36/t/55e9494fe4b011aed10e48e5/1441352015658/probability_cheatsheet.pdf
- Calculus: http://tutorial.math.lamar.edu/pdf/Calculus_Cheat_Sheet_All.pdf
- Linear Algebra: https://www.souravsengupta.com/cds2016/lectures/Savov_Notes.pdf

## Week 1 - Language Terminology + Preprocessing Techniques
### Description
- Overview of NLP (pragmatics, semantics, syntax, morphology)
- Text preprocessing (stemming, lemmatization, tokenization, stopword removal)
### Video Lectures
- Videos 1-2.5 from https://web.stanford.edu/~jurafsky/slp3/
- https://www.youtube.com/watch?v=hyT-BzLyVdU&list=PLDcmCgguL9rxTEz1Rsy6x5NhlBjI8z3Gz
### Reading Assignments
- Ch. 1-2 of Speech and Language Processing, 3rd ed., plus the accompanying slides
### Project
- Work through notebooks 1-1 to 3-4 of https://github.com/hb20007/hands-on-nltk-tutorial to learn NLTK
- Then use NLTK to perform stemming, lemmatization, tokenization, and stopword removal on a dataset of your choice (a starter sketch follows below)
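To get moving on the Week 1 project, here is a minimal NLTK preprocessing sketch. The toy sentence and variable names are placeholders; swap in your own dataset.

```python
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time downloads of the NLTK data packages these steps rely on
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')

# Toy input -- replace with a dataset of your choice
text = "The cats were running quickly through the old libraries."

tokens = word_tokenize(text.lower())  # tokenization
stop_words = set(stopwords.words('english'))
content = [t for t in tokens if t.isalpha() and t not in stop_words]  # stopword removal

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
print([stemmer.stem(t) for t in content])        # stemming:      ['cat', 'run', 'quickli', 'old', 'librari']
print([lemmatizer.lemmatize(t) for t in content])  # lemmatization: ['cat', 'running', 'quickly', 'old', 'library']
```

Note how the stemmer chops suffixes crudely ("quickli", "librari") while the lemmatizer maps words to dictionary forms; the difference is the point of the exercise.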
## Week 2 - Language Models & Lexicons (Pre-Deep Learning)
### Description
- Lexicons
- Pre-deep-learning statistical language models (HMMs, topic modeling with LDA)
### Video Lectures
- Lectures 2-6 from https://courses.cs.washington.edu/courses/csep517/17sp/
### Reading Assignments
- Readings 4, 6, 7, 8, 9, and 10 from the UWash course
### Extra
- LDA blog post: https://medium.com/@lettier/how-does-lda-work-ill-explain-using-emoji-108abf40fa7d
### Project
- Build a Hidden Markov Model for weather prediction in PyTorch: https://github.com/TreB1eN/HiddenMarkovModel_Pytorch

## Week 3 - Word Embeddings (Word, Sentence, and Document)
### Video Lectures
- Lectures 1-5 from http://web.stanford.edu/class/cs224n/index.html#schedule
### Reading Assignments
- Suggested readings from the course
### Project
- 3 assignments: visualize and implement Word2Vec, then create a dependency parser, all in PyTorch (these are assignments from the Stanford course)

## Week 4-5 - Deep Sequence Modeling
### Description
- Sequence-to-sequence models (translation, summarization, question answering)
- Attention-based models
- Deep semantic similarity
### Video Lectures
- Week 4 of https://www.coursera.org/learn/language-processing
### Reading Assignments
- Deep Semantic Similarity Models: https://kishorepv.github.io/DSSM/
- Ch. 10 of the Deep Learning Book, on sequence modeling: http://www.deeplearningbook.org/contents/rnn.html
### Project
- 3 assignments: create a translator and a summarizer, all seq2seq models, in PyTorch

## Week 6 - Dialogue Systems
### Description
- Speech recognition
- Dialogue managers, NLU
### Video Lectures
- Week 5 of https://www.coursera.org/learn/language-processing
### Reading Assignments
- Ch. 24 of Speech and Language Processing, 3rd ed.: https://web.stanford.edu/~jurafsky/slp3/24.pdf
### Project
- Create a dialogue system in PyTorch (https://github.com/ywk991112/pytorch-chatbot), then build a task-oriented dialogue system with Dialogflow to order food

## Week 7 - Transfer Learning
### Video Lectures
- My videos on BERT, GPT-2, and how to build a biomedical startup:
- https://www.youtube.com/watch?v=bDxFvr1gpSU
- https://www.youtube.com/watch?v=J9kbZ5I8gdM
- https://www.youtube.com/watch?v=0n95f-eqZdw
- Transfer learning with BERT/GPT-2/ELMo
### Reading Assignments
- http://ruder.io/nlp-imagenet/
- https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html
- http://jalammar.github.io/illustrated-bert/
### Project
- Play with https://github.com/huggingface/pytorch-pretrained-BERT#examples: pick 2 models, use each on one of the 9 downstream tasks, and compare their results (see the warm-up sketch at the end of this README)

## Week 8 - Future NLP
### Description
- Visual semantics
- Deep reinforcement learning
### Video Lectures
- CMU video: https://www.youtube.com/watch?v=isxzsAelQX0
- Modules 5-6 of https://www.edx.org/course/natural-language-processing-nlp-3
### Reading Assignments
- https://cs.stanford.edu/people/karpathy/cvpr2015.pdf
- Hilarious: https://medium.com/@yoav.goldberg/an-adversarial-review-of-adversarial-generation-of-natural-language-409ac3378bd7
### Project
- Policy-gradient text summarization: reimplement https://github.com/yaserkl/RLSeq2Seq#policy-gradient-w-self-critic-learning-and-temporal-attention-and-intra-decoder-attention in PyTorch
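As a warm-up for the Week 7 project, here is a minimal sketch of loading a pretrained BERT with the pytorch-pretrained-BERT library and extracting features from a sentence, following the usage pattern in that repo's README. Mean-pooling the last layer is just one illustrative way to get a sentence vector, not the library's prescribed method.

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel

# Load the pretrained tokenizer and model (weights download on first run)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

# BERT expects [CLS]/[SEP] markers around the input
text = "[CLS] natural language processing is fun [SEP]"
tokens = tokenizer.tokenize(text)
token_ids = tokenizer.convert_tokens_to_ids(tokens)
tokens_tensor = torch.tensor([token_ids])          # batch of size 1
segments_tensor = torch.zeros_like(tokens_tensor)  # single sentence: all segment ids 0

with torch.no_grad():
    # encoded_layers: a list with one tensor per transformer layer,
    # each of shape (batch, sequence_length, 768)
    encoded_layers, pooled_output = model(tokens_tensor, segments_tensor)

# Mean-pool the final layer as a simple sentence embedding
sentence_vector = encoded_layers[-1].mean(dim=1)
print(sentence_vector.shape)  # torch.Size([1, 768])
```

Once this runs, move on to the repo's example scripts for the actual downstream tasks.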