├── _config.yml
├── index.html
└── README.md
/_config.yml:
--------------------------------------------------------------------------------
1 | theme: jekyll-theme-minimal
--------------------------------------------------------------------------------
/index.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 | Activity 1: Basic HTML Bio
6 |
7 |
8 |
9 |
10 | Your Name
11 |
12 |
13 |
14 | Write a short paragraph about yourself or some placeholder text.
15 | A second short paragraph about yourself or some placeholder text.
16 |
17 |
22 |
23 |
24 |
25 | | Books |
26 | Movies |
27 | Games |
28 |
29 |
30 | | The Hobbit |
31 | Hot Fuzz |
32 | Dark Souls |
33 |
34 |
35 | | The Name of the Wind |
36 | The Avengers |
37 | The Last of Us |
38 |
39 |
40 | | The Girl With All the Gifts |
41 | The Matrix |
42 | Dragon Age: Origins |
43 |
44 |
45 |
46 |
47 |
48 |
49 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # machine-learning-for-nlp-guide
2 | A guide for engineers interested in machine learning for NLP
3 |
4 | ## Path
5 | 1. Understand the possibilities and identify business applications
6 | 1. Everyone: [AI for Everyone](https://www.coursera.org/learn/ai-for-everyone)
7 |
8 | 1. Either level up through:
9 | 1. __Gaining theoretical foundation of Deep Learning for NLP__
10 | 1. Stanford Course Materials http://web.stanford.edu/class/cs224n/
11 | 1. Natural Language Processing with Deep Learning https://www.youtube.com/watch?v=8rXD5-xhemo&list=PLoROMvodv4rOhcuXMZkNm7j3fVwBBY42z
12 | 1. Stanford CS224U: Natural Language Understanding https://www.youtube.com/watch?v=tZ_Jrc_nRJY&list=PLoROMvodv4rObpMCir6rNNUlFAn56Js20
13 | 1. __Getting "Practical" Knowledge of Deep Learning for NLP__
14 | 1. [3Blue1Brown Neural Networks](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi)
15 | 1. [Rasa Whiteboard YouTube](https://www.youtube.com/watch?v=mWvnlVw_LiY&list=PL75e0qA87dlG-za8eLI6t0_Pbxafk-cxb&index=5)
16 | 1. [Rasa Whiteboard GitHub](https://github.com/RasaHQ/algorithm-whiteboard-resources)
17 |
18 | 1. Learn how to apply Deep Learning
19 | 1. [Nuts and Bolts of Applying Deep Learning](https://www.youtube.com/watch?v=F1ka6a13S9I)
20 | 1. "Everyday" Engineers: [fast.ai](https://www.fast.ai/)
21 | 1. Research Engineers: [DeepLearning.AI](https://www.deeplearning.ai/deep-learning-specialization/)
22 |
23 | 1. Learn about all the stuff "they don't teach"
24 | 1. Learn Production-Level Deep Learning: https://fullstackdeeplearning.com/
25 | 1. Resources: https://github.com/full-stack-deep-learning/fsdl-text-recognizer-project
26 | 1. Base Models to Use
27 | 1. [spaCy](https://spacy.io/) for general NLP tasks
28 | 1. [HuggingFace Transformers](https://github.com/huggingface/transformers)
29 |
30 | 1. Profit
31 |
32 | ## State of the Art Methods
33 | * [NLP Progress](https://github.com/sebastianruder/NLP-progress)
34 | * [GLUE](https://gluebenchmark.com/leaderboard)
35 | * [Papers with Code](https://paperswithcode.com/sota)
36 |
37 | ## Resources
38 | * Syntactic Search over Wikipedia: https://spike.wikipedia.apps.allenai.org/search/wikipedia
39 | * Odinson: Rapidly query a natural language knowledge base https://github.com/lum-ai/odinson
40 | * CheckList: Behavioral Testing NLP https://github.com/marcotcr/checklist
41 | * Data project checklist https://www.fast.ai/2020/01/07/data-questionnaire
42 | * BERT, ELMo, & GPT-2: How Contextual are Contextualized Word Representations? http://ai.stanford.edu/blog/contextual/
43 | * BERT commit log https://amitness.com/2020/05/git-log-of-bert/
44 | * Full Stack Deep Learning GitHub repo: https://github.com/full-stack-deep-learning/fsdl-text-recognizer-project
45 | * Expand Labeled Data Using Unlabeled Data
46 | * Blog: https://ai.googleblog.com/2019/03/harnessing-organizational-knowledge-for.html
47 | * Detailed Article: https://towardsdatascience.com/a-look-into-snorkel-drybell-8e9e781dc250
48 | * Explain Predictions
49 | * Curated list: https://github.com/jphall663/awesome-machine-learning-interpretability
50 | * Deploy models to production
51 | * Tutorial: https://hackernoon.com/enterprise-af-solution-for-text-classification-using-bert-9fe2b7234c46
52 | * Learn how to implement new models
53 | * Deep Learning from the Foundations: https://www.fast.ai/2019/06/28/course-p2v3/
54 | * More Learning Resources:
55 | * [The Best Artificial Intelligence, Machine Learning and Data Science Resources*](https://www.notion.so/b3b97fa097b747698e87fd3badc657cf)
56 | * [nlp-library curated list of papers](https://github.com/mihail911/nlp-library)
57 | * Machine Learning System Best Practice and Design:
58 | * The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction: https://ai.google/research/pubs/pub46555
59 | * Machine Learning: The High Interest Credit Card of Technical Debt: https://ai.google/research/pubs/pub43146
60 | * [An Interactive Visualization to Explore NLP Papers](https://saifmohammad.com/WebPages/nlpscholar-demo-basic.html)
61 | * [How Big Should My Language Model Be?](https://huggingface.co/calculator/)
62 | * [Accelerate your NLP pipelines using Hugging Face Transformers and ONNX Runtime](https://opendatascience.com/accelerate-your-nlp-pipelines-using-hugging-face-transformers-and-onnx-runtime/)
63 |
64 |
65 | ## Tools
66 | * https://prodi.gy/buy
67 | * Text and image annotation
68 | * https://github.com/chakki-works/doccano
69 | * Open source text annotation tool
70 | * https://www.media.mit.edu/projects/dive/overview/
71 | * DIVE is a web-based data exploration system that lets non-technical users create stories from their data without writing code. DIVE combines semantic data ingestion, recommendation-based visualization and analysis, and dynamic story sharing into a unified workflow.
72 |
73 |
74 | ## Infrastructure
75 | * Seldon
76 | * https://www.youtube.com/watch?v=cDtzu4WBzWA
77 | * https://github.com/kubeflow/example-seldon
78 | * https://docs.seldon.io/projects/seldon-core/en/latest/examples/nvidia_mnist.html
79 | * Kubeflow
80 | * https://www.kubeflow.org/docs/started/getting-started/
81 | * TFX
82 | * https://www.tensorflow.org/tfx
83 |
84 |
85 | ## Research Interest
86 | * Text Atlas
87 | * Feature Visualization https://distill.pub/2017/feature-visualization/
88 | * Activation Atlas https://distill.pub/2019/activation-atlas/
89 |
90 | ## Newsletters to Follow
91 | * NLP News http://newsletter.ruder.io
92 | * The Batch https://www.deeplearning.ai/thebatch/
93 |
94 | ## Podcasts to Listen To
95 | * NLP Highlights https://soundcloud.com/nlp-highlights
96 |
97 | ## Blogs to Follow
98 | * Google Data Analytics https://cloud.google.com/blog/products/data-analytics/
99 | * AWS Big Data Blog https://aws.amazon.com/blogs/big-data/
100 | * fast.ai http://www.fast.ai/
101 | * FastML http://fastml.com/
102 | * The Unofficial Google Data Science Blog http://www.unofficialgoogledatascience.com/
103 | * DeepMind https://deepmind.com/blog/
104 | * The Official Google Blog https://www.blog.google/
105 | * Distill https://distill.pub
106 | * DataCamp Community https://www.datacamp.com/community
107 | * AI Applications https://vaultanalytics.com/marketinganalytics
108 | * Google AI Blog http://ai.googleblog.com/
109 | * Google Developers Blog http://developers.googleblog.com/
110 | * the morning paper https://blog.acolyer.org
111 | * Machine Learning @ Berkeley https://medium.com/@ml.at.berkeley
112 | * NAACL http://naacl-org.github.com
113 | * Facebook Research https://research.fb.com
114 | * OpenAI https://blog.openai.com
115 | * Y Combinator http://www.ycombinator.com
116 | * The Berkeley Artificial Intelligence Research Blog http://bair.berkeley.edu/blog/
117 | * No Free Hunch http://blog.kaggle.com
118 | * Off the convex path http://offconvex.github.io/
119 |
120 | ## Datasets
121 | * ParlAI: a unified platform for sharing, training and evaluating dialogue models across many tasks. https://parl.ai/
122 |
123 | You can also follow me on Twitter: https://twitter.com/LeoApolonio
124 |
--------------------------------------------------------------------------------