├── Readme.md ├── [Basic] [Document Similarity] [Unsupervised] - TFIDF - BoW - Bag of N-Grams - Kmeans - LDA.ipynb ├── [Introduction] - Big tutorial - Text Classification.ipynb ├── [Supervised] [DL method] GRU_HAN.ipynb ├── [Unsupervised] LDA.ipynb └── pictures ├── LDA2VEC.png ├── characters_attention.gif ├── explainability.gif ├── generative_LDA.gif ├── pyldavis.png ├── tsne_lda.png ├── word_correlations.png └── word_frequency.png /Readme.md: -------------------------------------------------------------------------------- 1 | Multi-class text classification and LDA-based topic Recommender System 2 | ======================================================================== 3 | 4 | Here is **my winning strategy** for carrying out a multi-class text 5 | classification task. 6 | 7 | **Data Source** : 8 | https://catalog.data.gov/dataset/consumer-complaint-database 9 | 10 | 1 - Text Mining 11 | =============== 12 | 13 | - **Word Frequency Plot**: Compare frequencies across different texts 14 | and quantify how similar and different these sets of word 15 | frequencies are using a correlation test. How correlated are the 16 | word frequencies between text1 and text2, and between text1 and 17 | text3? 18 | 19 |  20 | 21 | - **Most discriminant and important words per category** 22 | 23 | - **Relationships between words & Pairwise correlations**: examining 24 | which words tend to follow others immediately, or which tend to 25 | co-occur within the same documents. 26 | 27 | Which word is associated with another word? Note that this is a 28 | visualization of a Markov chain, a common model in text processing. In a 29 | Markov chain, each choice of word depends only on the previous word. In 30 | this case, a random generator following this model might spit out 31 | “collect”, then “agency”, then “report/credit/score”, by following each 32 | word to the most common words that follow it. To make the visualization 33 | interpretable, we chose to show only the most common word-to-word 34 | connections, but one could imagine an enormous graph representing all 35 | connections that occur in the text. 36 | 37 | - **Distribution of words**: show that all texts have similar 38 | distributions, with many words that occur rarely and 39 | fewer words that occur frequently. This is the point of Zipf’s Law 40 | (extended with the harmonic mean) - Zipf’s Law is a statistical 41 | distribution in certain data sets, such as words in a linguistic 42 | corpus, in which the frequencies of certain words are inversely 43 | proportional to their ranks. 44 | 45 |  46 | 47 | - **Spelling variants of a given word** 48 | 49 | - **Chi-Square to see which words are associated with each category**: 50 | find the terms that are the most correlated with each of the 51 | categories (see the sketch at the end of this section) 52 | 53 | - **Part of Speech Tags** and **Frequency distribution of POS tags**: Noun 54 | Count, Verb Count, Adjective Count, Adverb Count and Pronoun Count 55 | 56 | - **Metrics of words**: *Word Count of the documents* – i.e. total 57 | number of words in the documents, *Character Count of the documents* 58 | – total number of characters in the documents, *Average Word Density 59 | of the documents* – average length of the words used in the 60 | documents, *Punctuation Count in the Complete Essay* – total number 61 | of punctuation marks in the documents, *Upper Case Count in the 62 | Complete Essay* – total number of upper case words in the 63 | documents, *Title Word Count in the Complete Essay* – total number 64 | of proper case (title) words in the documents 65 |
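The chi-square step above can be sketched with scikit-learn as follows. This is a minimal illustration only; the CSV path and the column names (`Consumer_complaint_narrative`, `Product`) are assumptions based on the Consumer Complaint Database linked above, not code taken from this repo.

```python
# Sketch: terms most correlated with each category via a chi-square test
# on TF-IDF features (one-vs-rest per category).
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import chi2

# Hypothetical export of the Consumer Complaint Database.
df = pd.read_csv("consumer_complaints.csv").dropna(subset=["Consumer_complaint_narrative"])

tfidf = TfidfVectorizer(sublinear_tf=True, min_df=5, stop_words="english",
                        ngram_range=(1, 2))
features = tfidf.fit_transform(df["Consumer_complaint_narrative"])
terms = np.array(tfidf.get_feature_names_out())

for category in df["Product"].unique():
    scores, _ = chi2(features, df["Product"] == category)  # one-vs-rest labels
    print(category, "->", ", ".join(terms[np.argsort(scores)[-5:]]))  # top 5 terms
```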
66 | 2 - Word Embedding 67 | ================== 68 | 69 | ### A - Frequency Based Embedding 70 | 71 | - Count Vector 72 | - TF-IDF 73 | - Co-Occurrence Matrix with a fixed context window (SVD) 74 | - TF-ICF 75 | - Function Aware Components 76 | 77 | ### B - Prediction Based Embedding 78 | 79 | - CBOW (word2vec) 80 | - Skip-Grams (word2vec) \[sketched at the end of this section\] 81 | - GloVe 82 | - At character level -> FastText 83 | - Topic Model as features // LDA features 84 | 85 | #### LDA 86 | 87 | Visualization provides a global view of the topics (and how they differ 88 | from each other), while at the same time allowing for a deep inspection 89 | of the terms most highly associated with each individual topic. It also 90 | offers a novel method for choosing which terms to present to a user to 91 | aid in the task of topic interpretation, in which the relevance of a 92 | term to a topic is defined. 93 | 94 |  95 | 96 |  97 | 98 |  99 | 100 | ### C - Poincaré Embedding \[Embeddings and Hyperbolic Geometry\] 101 | 102 | The main innovation here is that these embeddings are learnt in 103 | **hyperbolic space**, as opposed to the commonly used **Euclidean 104 | space**. The reason behind this is that hyperbolic space is more 105 | suitable for capturing any hierarchical information inherently present 106 | in the graph. Embedding nodes into a Euclidean space while preserving 107 | the distance between the nodes usually requires a very high number of 108 | dimensions. 109 | 110 | https://arxiv.org/pdf/1705.08039.pdf 111 | https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/Poincare%20Tutorial.ipynb 112 | 113 | **Learning representations** of symbolic data such as text, graphs and 114 | multi-relational data has become a central paradigm in machine learning 115 | and artificial intelligence. For instance, word embeddings such as 116 | **WORD2VEC**, **GLOVE** and **FASTTEXT** are widely used for tasks 117 | ranging from machine translation to sentiment analysis. 118 | 119 | Typically, the **objective of embedding methods** is to organize 120 | symbolic objects (e.g., words, entities, concepts) in a way such that 121 | **their similarity in the embedding space reflects their semantic or 122 | functional similarity**. For this purpose, the similarity of objects is 123 | usually measured either by their **distance** or by their **inner 124 | product** in the embedding space. For instance, Mikolov et al. embed 125 | words in *R^d* such that their **inner product** is maximized when 126 | words co-occur within similar contexts in text corpora. This is 127 | motivated by the **distributional hypothesis**, i.e., that the meaning 128 | of words can be derived from the contexts in which they appear.
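As a minimal sketch of the prediction-based embeddings listed in B: train a Word2Vec model and mean-pool its word vectors into document features. The toy tokenized corpus and the mean-pooling choice are illustrative assumptions, and the parameter names follow gensim 4.x.

```python
# Sketch: skip-gram Word2Vec, then document vectors by averaging word vectors.
import numpy as np
from gensim.models import Word2Vec

# Toy tokenized corpus; in practice this would be the complaint narratives.
tokenized_docs = [
    ["credit", "report", "contains", "outdated", "information"],
    ["debt", "collection", "agency", "refuses", "to", "provide", "verification"],
]

# sg=1 -> skip-gram, sg=0 -> CBOW (gensim 4.x uses vector_size instead of size).
w2v = Word2Vec(sentences=tokenized_docs, vector_size=100, window=5,
               min_count=1, sg=1, epochs=20)

def doc_vector(tokens, model):
    """Average the vectors of the tokens known to the model (zeros if none)."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.wv.vector_size)

X = np.vstack([doc_vector(doc, w2v) for doc in tokenized_docs])
print(X.shape)  # (n_documents, 100): ready for a Logistic / LightGBM / XGBoost classifier
```

The same pooling idea works with FastText or GloVe vectors; Doc2Vec (see the resources in section 6) learns the document vector directly instead.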
129 | 130 | 3 - Algorithms 131 | ============== 132 | 133 | ### A - Traditional Methods (see the baseline sketch at the end of this section) 134 | 135 | - CountVectorizer + Logistic 136 | - CountVectorizer + NB 137 | - CountVectorizer + LightGBM 138 | - HashingTF + IDF + Logistic Regression 139 | - TFIDF + NB 140 | - TFIDF + LightGBM 141 | - TF-IDF + SVM 142 | - Hashing Vectorizer + Logistic 143 | - Hashing Vectorizer + NB 144 | - Hashing Vectorizer + LightGBM 145 | - Bagging / Boosting 146 | - Word2Vec + Logistic 147 | - Word2Vec + LightGBM 148 | - Word2Vec + XGBoost 149 | - LSA + SVM 150 | 151 | ### B - Deep Learning Methods 152 | 153 | - GRU + Attention Mechanism 154 | - CNN + RNN + Attention Mechanism 155 | - CNN + LSTM/GRU + Attention Mechanism 156 |
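Below is a minimal baseline sketch for the traditional methods in A: a TF-IDF + Logistic Regression pipeline with a Multinomial Naive Bayes variant for comparison. The CSV path and column names are assumptions matching the Data Source above; none of this is taken verbatim from the notebooks.

```python
# Sketch: TF-IDF + Logistic Regression vs. TF-IDF + Multinomial Naive Bayes.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

df = pd.read_csv("consumer_complaints.csv").dropna(subset=["Consumer_complaint_narrative"])
X_train, X_test, y_train, y_test = train_test_split(
    df["Consumer_complaint_narrative"], df["Product"],
    test_size=0.2, random_state=42, stratify=df["Product"])

for name, clf in [("logistic_regression", LogisticRegression(max_iter=1000)),
                  ("multinomial_nb", MultinomialNB())]:
    pipe = Pipeline([
        ("tfidf", TfidfVectorizer(sublinear_tf=True, min_df=5, stop_words="english")),
        ("clf", clf),
    ])
    pipe.fit(X_train, y_train)
    print(name)
    print(classification_report(y_test, pipe.predict(X_test)))
```

Swapping `TfidfVectorizer` for `HashingVectorizer`, or the classifier for LightGBM/XGBoost or a linear SVM, covers the other combinations in the list above.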
157 | 4 - Explainability 158 | ================== 159 | 160 | **Goal**: explain predictions of arbitrary classifiers, including text 161 | classifiers (when it is hard to get an exact mapping between model 162 | coefficients and text features, e.g. if there is dimension reduction 163 | involved) 164 | 165 | - Lime 166 | - Skater 167 | - Shap 168 | 169 |  170 | 171 | 5 - My app for multi-class text classification with Attention mechanism 172 | ======================================================================= 173 | 174 |  175 | 176 | 6 - Resources / Bibliography 177 | ============================= 178 | 179 | - **All models** : 180 | https://www.analyticsvidhya.com/blog/2018/04/a-comprehensive-guide-to-understand-and-implement-text-classification-in-python/ 181 | 182 | - **CNN Text Classification**: 183 | https://github.com/cmasch/cnn-text-classification/blob/master/Evaluation.ipynb 184 | 185 | - **CNN Multichannel Text Classification + Hierarchical attention + 186 | …**: 187 | https://github.com/gaurav104/TextClassification/blob/master/CNN%20Multichannel%20Text%20Classification.ipynb 188 | 189 | - **Notes for Deep Learning** 190 | https://arxiv.org/pdf/1808.09772.pdf 191 | 192 | - **Doc classification with NLP** 193 | https://github.com/mdh266/DocumentClassificationNLP/blob/master/NLP.ipynb 194 | 195 | - **Paragraph Topic Classification** 196 | http://cs229.stanford.edu/proj2016/report/NhoNg-ParagraphTopicClassification-report.pdf 197 | 198 | - **1D convolutional neural networks for NLP** 199 | https://github.com/Tixierae/deep_learning_NLP/blob/master/cnn_imdb.ipynb 200 | 201 | - **Hierarchical Attention for text classification** 202 | https://github.com/Tixierae/deep_learning_NLP/blob/master/HAN/HAN_final.ipynb 203 | 204 | - **Multi-class classification with scikit-learn** (Random forest, SVM, 205 | logistic regression) 206 | https://towardsdatascience.com/multi-class-text-classification-with-scikit-learn-12f1e60e0a9f 207 | https://github.com/susanli2016/Machine-Learning-with-Python/blob/master/Consumer_complaints.ipynb 208 | 209 | - **Text feature extraction TFIDF mathematics** 210 | https://dzone.com/articles/machine-learning-text-feature-0 211 | 212 | - **Classification of Yelp Reviews (AWS)** 213 | http://www.developintelligence.com/blog/2017/06/practical-neural-networks-keras-classifying-yelp-reviews/ 214 | 215 | - **Convolutional Neural Networks for Text Classification (wow)** 216 | http://www.davidsbatista.net/blog/2018/03/31/SentenceClassificationConvNets/ 217 | https://github.com/davidsbatista/ConvNets-for-sentence-classification 218 | 219 | - **3 ways to interpret your NLP model \[Lime, ELI5, Skater\]** 220 | https://github.com/makcedward/nlp/blob/master/sample/nlp-model_interpretation.ipynb 221 | https://towardsdatascience.com/3-ways-to-interpretate-your-nlp-model-to-management-and-customer-5428bc07ce15 222 | https://medium.freecodecamp.org/how-to-improve-your-machine-learning-models-by-explaining-predictions-with-lime-7493e1d78375 223 | 224 | - **Deep Learning for text made easy with AllenNLP** 225 | https://medium.com/swlh/deep-learning-for-text-made-easy-with-allennlp-62bc79d41f31 226 | 227 | - **Ensemble Classifiers** 228 | https://www.learndatasci.com/tutorials/predicting-reddit-news-sentiment-naive-bayes-text-classifiers/ 229 | 230 | - **Classification Algorithms** \[tfidf, count features, logistic 231 | regression, naive bayes, svm, xgboost, grid search, word vectors, 232 | LSTM, GRU, Ensembling\] : 233 | https://www.kaggle.com/abhishek/approaching-almost-any-nlp-problem-on-kaggle/notebook 234 | 235 | - **Deep learning architectures** \[TextCNN, BiDirectional 236 | RNN(LSTM/GRU), Attention Models\] : 237 | https://mlwhiz.com/blog/2019/03/09/deeplearning_architectures_text_classification/ 238 | and 239 | https://www.kaggle.com/mlwhiz/attention-pytorch-and-keras 240 | 241 | - **CNN + Word2Vec and LSTM + Word2Vec** : 242 | https://www.kaggle.com/kakiac/deep-learning-4-text-classification-cnn-bi-lstm 243 | 244 | - **Comparison of models** \[Bag of Words - CountVectorizer Features, 245 | TFIDF Features, Hashing Features, Word2vec Features\] : 246 | https://mlwhiz.com/blog/2019/02/08/deeplearning_nlp_conventional_methods/ 247 | 248 | - **Embed, encode, attend, predict** : 249 | https://explosion.ai/blog/deep-learning-formula-nlp 250 | 251 | - Nice visualization for understanding CNNs for NLP : 252 | http://www.thushv.com/natural_language_processing/make-cnns-for-nlp-great-again-classifying-sentences-with-cnns-in-tensorflow/ 253 | 254 | - **Yelp comments classification \[LSTM, LSTM + CNN\]** : 255 | https://github.com/msahamed/yelp_comments_classification_nlp/blob/master/word_embeddings.ipynb 256 | 257 | - **RNN text classification** : 258 | https://karpathy.github.io/2015/05/21/rnn-effectiveness/ 259 | 260 | - **CNN for Sentence Classification** & **DCNN for Modelling 261 | Sentences** & **VDCNN for Text Classification** & **Multi Channel 262 | Variable size CNN** & **Multi Group Norm Constraint CNN** & **RACNN 263 | Neural Networks for Text Classification**: 264 | https://bicepjai.github.io/machine-learning/2017/11/10/text-class-part1.html 265 | 266 | - **Transformers** : 267 | https://towardsdatascience.com/transformers-141e32e69591 268 | 269 | - **Seq2Seq** : 270 | https://guillaumegenthial.github.io/sequence-to-sequence.html 271 | 272 | 273 | - **The Illustrated BERT, ELMo, and co. 
(How NLP Cracked Transfer 274 | Learning)** : 275 | https://jalammar.github.io/ 276 | 277 | - **LSTM & GRU explanation** : 278 | https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21 279 | 280 | - **Text classification using attention mechanism in Keras** : 281 | http://androidkt.com/text-classification-using-attention-mechanism-in-keras/ 282 | 283 | - **Bernoulli Naive Bayes & Multinomial Naive Bayes & Random Forests & 284 | Linear SVM & SVM with non-linear kernel** 285 | https://github.com/irfanelahi-ds/document-classification-python/blob/master/document_classification_python_sklearn_nltk.ipynb 286 | and 287 | https://richliao.github.io/ 288 | 289 | - **DL text classification** : 290 | https://gitlab.com/the_insighters/data-university/nuggets/document-classification-with-deep-learning 291 | 292 | - **1-D Convolutions over text** : 293 | http://www.davidsbatista.net/blog/2018/03/31/SentenceClassificationConvNets/ 294 | and 295 | https://github.com/davidsbatista/ConvNets-for-sentence-classification/blob/master/Convolutional-Neural-Networks-for-Sentence-Classification.ipynb 296 | 297 | - **\[Bonus\] Sentiment Analysis in PySpark** : 298 | https://github.com/tthustla/setiment_analysis_pyspark/blob/master/Sentiment%20Analysis%20with%20PySpark.ipynb 299 | 300 | - **RNN Text Generation** : 301 | https://github.com/priya-dwivedi/Deep-Learning/blob/master/RNN_text_generation/RNN_project.ipynb 302 | 303 | - **Finding similar documents with Word2Vec and Soft Cosine Measure**: 304 | https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/soft_cosine_tutorial.ipynb 305 | 306 | - **\[!! ESSENTIAL !!\] Text Classification with Hierarchical 307 | Attention Networks**: 308 | https://humboldt-wi.github.io/blog/research/information_systems_1819/group5_han/ 309 | 310 | - **\[ESSENTIAL for any NLP Project\]**: 311 | https://github.com/RaRe-Technologies/gensim/tree/develop/docs/notebooks 312 | 313 | - **Doc2Vec + Logistic Regression** : 314 | https://github.com/susanli2016/NLP-with-Python/blob/master/Doc2Vec%20Consumer%20Complaint_3.ipynb 315 | 316 | - **Doc2Vec -> just embedding**: 317 | https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-wikipedia.ipynb 318 | 319 | - **New way of embedding -> Poincaré Embeddings**: 320 | https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/Poincare%20Tutorial.ipynb 321 | 322 | - **Doc2Vec + Text similarity**: 323 | https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-lee.ipynb 324 | 325 | - **Graph Link Prediction + Part-of-Speech tagging tutorial with 326 | Keras**: 327 | https://github.com/Cdiscount/IT-Blog/tree/master/scripts/link-prediction 328 | & 329 | https://techblog.cdiscount.com/link-prediction-in-large-scale-networks/ 330 | 331 | 7 - Other Topics - Text Similarity \[Word Mover's Distance\] 332 | ========================================================= 333 | 334 | - **Finding similar documents with Word2Vec and WMD** : 335 | https://markroxor.github.io/gensim/static/notebooks/WMD_tutorial.html 336 | 337 | - **Introduction to Wasserstein metric (earth mover’s distance)**: 338 | https://yoo2080.wordpress.com/2015/04/09/introduction-to-wasserstein-metric-earth-movers-distance/ 339 | 340 | - **Earthmover Distance**: 341 | https://jeremykun.com/2018/03/05/earthmover-distance/ 342 | Problem: Compute the distance between points with uncertain locations 343 | (given by samples, or differing observations, or clusters). 
For 344 | example, if I have the following three “points” in the plane, as 345 | indicated by their colors, which is closer, blue to green, or blue 346 | to red? 347 | 348 | - **Word Mover’s distance calculation between word pairs of two 349 | documents**: 350 | https://stats.stackexchange.com/questions/303050/word-movers-distance-calculation-between-word-pairs-of-two-documents 351 | 352 | - **Word Mover’s Distance (WMD) for Python**: 353 | https://github.com/stephenhky/PyWMD/blob/master/WordMoverDistanceDemo.ipynb 354 | 355 | - \[LECTURES\] : **Computational Optimal Transport** : 356 | https://optimaltransport.github.io/pdf/ComputationalOT.pdf 357 | 358 | - **Computing the Earth Mover’s Distance under Transformations** : 359 | http://robotics.stanford.edu/~scohen/research/emdg/emdg.html 360 | 361 | - **\[LECTURES\] Slides WMD**: 362 | http://robotics.stanford.edu/~rubner/slides/sld014.htm 363 | 364 | Others \[Quora Datset\] : 365 | ------------------------- 366 | 367 | - **BOW + Xgboost Model** + **Word level TF-IDF + XgBoost** + **N-gram 368 | Level TF-IDF + Xgboost** + **Character Level TF-IDF + XGboost**: 369 | https://github.com/susanli2016/Machine-Learning-with-Python/blob/master/Xgboost_bow_tfidf.ipynb 370 | 371 | 8 - Other Topics - Topic Modeling [LDA](#lda) 372 | ============================================= 373 | 374 | https://github.com/FelixChop/MediumArticles/blob/master/LDA-BBC.ipynb 375 | 376 | https://github.com/priya-dwivedi/Deep-Learning/blob/master/topic_modeling/LDA_Newsgroup.ipynb 377 | 378 | - **TF-IDF + K-means & Latent Dirichlet Allocation (with Bokeh)**: 379 | https://ahmedbesbes.com/how-to-mine-newsfeed-data-and-extract-interactive-insights-in-python.html 380 | 381 | - **\[!! ESSENTIAL !!\] Building a LDA-based Book Recommender 382 | System**: 383 | https://humboldt-wi.github.io/blog/research/information_systems_1819/is_lda_final/ 384 | 385 | 9 - Variational Autoencoder 386 | =========================== 387 | 388 | - **Text generation with a Variational Autoencoder** : 389 | https://github.com/NicGian/text_VAE 390 | 391 | - **Variational\_text\_inference** : 392 | https://github.com/s4sarath/Deep-Learning-Projects/tree/master/variational_text_inference 393 | and 394 | https://s4sarath.github.io/2016/11/23/variational_autoenocder_for_Natural_Language_Processing 395 | -------------------------------------------------------------------------------- /[Basic] [Document Similarity] [Unsupervised] - TFIDF - BoW - Bag of N-Grams - Kmeans - LDA.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Import necessary dependencies and settings" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "import pandas as pd\n", 17 | "import numpy as np\n", 18 | "import re\n", 19 | "import nltk\n", 20 | "import matplotlib.pyplot as plt\n", 21 | "\n", 22 | "pd.options.display.max_colwidth = 200\n", 23 | "%matplotlib inline" 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "# Sample corpus of text documents" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 3, 36 | "metadata": {}, 37 | "outputs": [ 38 | { 39 | "data": { 40 | "text/html": [ 41 | "
\n", 59 | " | Document | \n", 60 | "Category | \n", 61 | "
---|---|---|
0 | \n", 66 | "The sky is blue and beautiful. | \n", 67 | "weather | \n", 68 | "
1 | \n", 71 | "Love this blue and beautiful sky! | \n", 72 | "weather | \n", 73 | "
2 | \n", 76 | "The quick brown fox jumps over the lazy dog. | \n", 77 | "animals | \n", 78 | "
3 | \n", 81 | "A king's breakfast has sausages, ham, bacon, eggs, toast and beans | \n", 82 | "food | \n", 83 | "
4 | \n", 86 | "I love green eggs, ham, sausages and bacon! | \n", 87 | "food | \n", 88 | "
5 | \n", 91 | "The brown fox is quick and the blue dog is lazy! | \n", 92 | "animals | \n", 93 | "
6 | \n", 96 | "The sky is very blue and the sky is very beautiful today | \n", 97 | "weather | \n", 98 | "
7 | \n", 101 | "The dog is lazy but the brown fox is quick! | \n", 102 | "animals | \n", 103 | "
\n", 268 | " | bacon | \n", 269 | "beans | \n", 270 | "beautiful | \n", 271 | "blue | \n", 272 | "breakfast | \n", 273 | "brown | \n", 274 | "dog | \n", 275 | "eggs | \n", 276 | "fox | \n", 277 | "green | \n", 278 | "ham | \n", 279 | "jumps | \n", 280 | "kings | \n", 281 | "lazy | \n", 282 | "love | \n", 283 | "quick | \n", 284 | "sausages | \n", 285 | "sky | \n", 286 | "toast | \n", 287 | "today | \n", 288 | "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", 293 | "0 | \n", 294 | "0 | \n", 295 | "1 | \n", 296 | "1 | \n", 297 | "0 | \n", 298 | "0 | \n", 299 | "0 | \n", 300 | "0 | \n", 301 | "0 | \n", 302 | "0 | \n", 303 | "0 | \n", 304 | "0 | \n", 305 | "0 | \n", 306 | "0 | \n", 307 | "0 | \n", 308 | "0 | \n", 309 | "0 | \n", 310 | "1 | \n", 311 | "0 | \n", 312 | "0 | \n", 313 | "
1 | \n", 316 | "0 | \n", 317 | "0 | \n", 318 | "1 | \n", 319 | "1 | \n", 320 | "0 | \n", 321 | "0 | \n", 322 | "0 | \n", 323 | "0 | \n", 324 | "0 | \n", 325 | "0 | \n", 326 | "0 | \n", 327 | "0 | \n", 328 | "0 | \n", 329 | "0 | \n", 330 | "1 | \n", 331 | "0 | \n", 332 | "0 | \n", 333 | "1 | \n", 334 | "0 | \n", 335 | "0 | \n", 336 | "
2 | \n", 339 | "0 | \n", 340 | "0 | \n", 341 | "0 | \n", 342 | "0 | \n", 343 | "0 | \n", 344 | "1 | \n", 345 | "1 | \n", 346 | "0 | \n", 347 | "1 | \n", 348 | "0 | \n", 349 | "0 | \n", 350 | "1 | \n", 351 | "0 | \n", 352 | "1 | \n", 353 | "0 | \n", 354 | "1 | \n", 355 | "0 | \n", 356 | "0 | \n", 357 | "0 | \n", 358 | "0 | \n", 359 | "
3 | \n", 362 | "1 | \n", 363 | "1 | \n", 364 | "0 | \n", 365 | "0 | \n", 366 | "1 | \n", 367 | "0 | \n", 368 | "0 | \n", 369 | "1 | \n", 370 | "0 | \n", 371 | "0 | \n", 372 | "1 | \n", 373 | "0 | \n", 374 | "1 | \n", 375 | "0 | \n", 376 | "0 | \n", 377 | "0 | \n", 378 | "1 | \n", 379 | "0 | \n", 380 | "1 | \n", 381 | "0 | \n", 382 | "
4 | \n", 385 | "1 | \n", 386 | "0 | \n", 387 | "0 | \n", 388 | "0 | \n", 389 | "0 | \n", 390 | "0 | \n", 391 | "0 | \n", 392 | "1 | \n", 393 | "0 | \n", 394 | "1 | \n", 395 | "1 | \n", 396 | "0 | \n", 397 | "0 | \n", 398 | "0 | \n", 399 | "1 | \n", 400 | "0 | \n", 401 | "1 | \n", 402 | "0 | \n", 403 | "0 | \n", 404 | "0 | \n", 405 | "
5 | \n", 408 | "0 | \n", 409 | "0 | \n", 410 | "0 | \n", 411 | "1 | \n", 412 | "0 | \n", 413 | "1 | \n", 414 | "1 | \n", 415 | "0 | \n", 416 | "1 | \n", 417 | "0 | \n", 418 | "0 | \n", 419 | "0 | \n", 420 | "0 | \n", 421 | "1 | \n", 422 | "0 | \n", 423 | "1 | \n", 424 | "0 | \n", 425 | "0 | \n", 426 | "0 | \n", 427 | "0 | \n", 428 | "
6 | \n", 431 | "0 | \n", 432 | "0 | \n", 433 | "1 | \n", 434 | "1 | \n", 435 | "0 | \n", 436 | "0 | \n", 437 | "0 | \n", 438 | "0 | \n", 439 | "0 | \n", 440 | "0 | \n", 441 | "0 | \n", 442 | "0 | \n", 443 | "0 | \n", 444 | "0 | \n", 445 | "0 | \n", 446 | "0 | \n", 447 | "0 | \n", 448 | "2 | \n", 449 | "0 | \n", 450 | "1 | \n", 451 | "
7 | \n", 454 | "0 | \n", 455 | "0 | \n", 456 | "0 | \n", 457 | "0 | \n", 458 | "0 | \n", 459 | "1 | \n", 460 | "1 | \n", 461 | "0 | \n", 462 | "1 | \n", 463 | "0 | \n", 464 | "0 | \n", 465 | "0 | \n", 466 | "0 | \n", 467 | "1 | \n", 468 | "0 | \n", 469 | "1 | \n", 470 | "0 | \n", 471 | "0 | \n", 472 | "0 | \n", 473 | "0 | \n", 474 | "
\n", 546 | " | bacon eggs | \n", 547 | "beautiful sky | \n", 548 | "beautiful today | \n", 549 | "blue beautiful | \n", 550 | "blue dog | \n", 551 | "blue sky | \n", 552 | "breakfast sausages | \n", 553 | "brown fox | \n", 554 | "dog lazy | \n", 555 | "eggs ham | \n", 556 | "... | \n", 557 | "lazy dog | \n", 558 | "love blue | \n", 559 | "love green | \n", 560 | "quick blue | \n", 561 | "quick brown | \n", 562 | "sausages bacon | \n", 563 | "sausages ham | \n", 564 | "sky beautiful | \n", 565 | "sky blue | \n", 566 | "toast beans | \n", 567 | "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", 572 | "0 | \n", 573 | "0 | \n", 574 | "0 | \n", 575 | "1 | \n", 576 | "0 | \n", 577 | "0 | \n", 578 | "0 | \n", 579 | "0 | \n", 580 | "0 | \n", 581 | "0 | \n", 582 | "... | \n", 583 | "0 | \n", 584 | "0 | \n", 585 | "0 | \n", 586 | "0 | \n", 587 | "0 | \n", 588 | "0 | \n", 589 | "0 | \n", 590 | "0 | \n", 591 | "1 | \n", 592 | "0 | \n", 593 | "
1 | \n", 596 | "0 | \n", 597 | "1 | \n", 598 | "0 | \n", 599 | "1 | \n", 600 | "0 | \n", 601 | "0 | \n", 602 | "0 | \n", 603 | "0 | \n", 604 | "0 | \n", 605 | "0 | \n", 606 | "... | \n", 607 | "0 | \n", 608 | "1 | \n", 609 | "0 | \n", 610 | "0 | \n", 611 | "0 | \n", 612 | "0 | \n", 613 | "0 | \n", 614 | "0 | \n", 615 | "0 | \n", 616 | "0 | \n", 617 | "
2 | \n", 620 | "0 | \n", 621 | "0 | \n", 622 | "0 | \n", 623 | "0 | \n", 624 | "0 | \n", 625 | "0 | \n", 626 | "0 | \n", 627 | "1 | \n", 628 | "0 | \n", 629 | "0 | \n", 630 | "... | \n", 631 | "1 | \n", 632 | "0 | \n", 633 | "0 | \n", 634 | "0 | \n", 635 | "1 | \n", 636 | "0 | \n", 637 | "0 | \n", 638 | "0 | \n", 639 | "0 | \n", 640 | "0 | \n", 641 | "
3 | \n", 644 | "1 | \n", 645 | "0 | \n", 646 | "0 | \n", 647 | "0 | \n", 648 | "0 | \n", 649 | "0 | \n", 650 | "1 | \n", 651 | "0 | \n", 652 | "0 | \n", 653 | "0 | \n", 654 | "... | \n", 655 | "0 | \n", 656 | "0 | \n", 657 | "0 | \n", 658 | "0 | \n", 659 | "0 | \n", 660 | "0 | \n", 661 | "1 | \n", 662 | "0 | \n", 663 | "0 | \n", 664 | "1 | \n", 665 | "
4 | \n", 668 | "0 | \n", 669 | "0 | \n", 670 | "0 | \n", 671 | "0 | \n", 672 | "0 | \n", 673 | "0 | \n", 674 | "0 | \n", 675 | "0 | \n", 676 | "0 | \n", 677 | "1 | \n", 678 | "... | \n", 679 | "0 | \n", 680 | "0 | \n", 681 | "1 | \n", 682 | "0 | \n", 683 | "0 | \n", 684 | "1 | \n", 685 | "0 | \n", 686 | "0 | \n", 687 | "0 | \n", 688 | "0 | \n", 689 | "
5 | \n", 692 | "0 | \n", 693 | "0 | \n", 694 | "0 | \n", 695 | "0 | \n", 696 | "1 | \n", 697 | "0 | \n", 698 | "0 | \n", 699 | "1 | \n", 700 | "1 | \n", 701 | "0 | \n", 702 | "... | \n", 703 | "0 | \n", 704 | "0 | \n", 705 | "0 | \n", 706 | "1 | \n", 707 | "0 | \n", 708 | "0 | \n", 709 | "0 | \n", 710 | "0 | \n", 711 | "0 | \n", 712 | "0 | \n", 713 | "
6 | \n", 716 | "0 | \n", 717 | "0 | \n", 718 | "1 | \n", 719 | "0 | \n", 720 | "0 | \n", 721 | "1 | \n", 722 | "0 | \n", 723 | "0 | \n", 724 | "0 | \n", 725 | "0 | \n", 726 | "... | \n", 727 | "0 | \n", 728 | "0 | \n", 729 | "0 | \n", 730 | "0 | \n", 731 | "0 | \n", 732 | "0 | \n", 733 | "0 | \n", 734 | "1 | \n", 735 | "1 | \n", 736 | "0 | \n", 737 | "
7 | \n", 740 | "0 | \n", 741 | "0 | \n", 742 | "0 | \n", 743 | "0 | \n", 744 | "0 | \n", 745 | "0 | \n", 746 | "0 | \n", 747 | "1 | \n", 748 | "1 | \n", 749 | "0 | \n", 750 | "... | \n", 751 | "0 | \n", 752 | "0 | \n", 753 | "0 | \n", 754 | "0 | \n", 755 | "0 | \n", 756 | "0 | \n", 757 | "0 | \n", 758 | "0 | \n", 759 | "0 | \n", 760 | "0 | \n", 761 | "
8 rows × 29 columns
\n", 765 | "\n", 867 | " | bacon | \n", 868 | "beans | \n", 869 | "beautiful | \n", 870 | "blue | \n", 871 | "breakfast | \n", 872 | "brown | \n", 873 | "dog | \n", 874 | "eggs | \n", 875 | "fox | \n", 876 | "green | \n", 877 | "ham | \n", 878 | "jumps | \n", 879 | "kings | \n", 880 | "lazy | \n", 881 | "love | \n", 882 | "quick | \n", 883 | "sausages | \n", 884 | "sky | \n", 885 | "toast | \n", 886 | "today | \n", 887 | "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", 892 | "0.00 | \n", 893 | "0.00 | \n", 894 | "0.60 | \n", 895 | "0.53 | \n", 896 | "0.00 | \n", 897 | "0.00 | \n", 898 | "0.00 | \n", 899 | "0.00 | \n", 900 | "0.00 | \n", 901 | "0.00 | \n", 902 | "0.00 | \n", 903 | "0.00 | \n", 904 | "0.00 | \n", 905 | "0.00 | \n", 906 | "0.00 | \n", 907 | "0.00 | \n", 908 | "0.00 | \n", 909 | "0.60 | \n", 910 | "0.00 | \n", 911 | "0.0 | \n", 912 | "
1 | \n", 915 | "0.00 | \n", 916 | "0.00 | \n", 917 | "0.49 | \n", 918 | "0.43 | \n", 919 | "0.00 | \n", 920 | "0.00 | \n", 921 | "0.00 | \n", 922 | "0.00 | \n", 923 | "0.00 | \n", 924 | "0.00 | \n", 925 | "0.00 | \n", 926 | "0.00 | \n", 927 | "0.00 | \n", 928 | "0.00 | \n", 929 | "0.57 | \n", 930 | "0.00 | \n", 931 | "0.00 | \n", 932 | "0.49 | \n", 933 | "0.00 | \n", 934 | "0.0 | \n", 935 | "
2 | \n", 938 | "0.00 | \n", 939 | "0.00 | \n", 940 | "0.00 | \n", 941 | "0.00 | \n", 942 | "0.00 | \n", 943 | "0.38 | \n", 944 | "0.38 | \n", 945 | "0.00 | \n", 946 | "0.38 | \n", 947 | "0.00 | \n", 948 | "0.00 | \n", 949 | "0.53 | \n", 950 | "0.00 | \n", 951 | "0.38 | \n", 952 | "0.00 | \n", 953 | "0.38 | \n", 954 | "0.00 | \n", 955 | "0.00 | \n", 956 | "0.00 | \n", 957 | "0.0 | \n", 958 | "
3 | \n", 961 | "0.32 | \n", 962 | "0.38 | \n", 963 | "0.00 | \n", 964 | "0.00 | \n", 965 | "0.38 | \n", 966 | "0.00 | \n", 967 | "0.00 | \n", 968 | "0.32 | \n", 969 | "0.00 | \n", 970 | "0.00 | \n", 971 | "0.32 | \n", 972 | "0.00 | \n", 973 | "0.38 | \n", 974 | "0.00 | \n", 975 | "0.00 | \n", 976 | "0.00 | \n", 977 | "0.32 | \n", 978 | "0.00 | \n", 979 | "0.38 | \n", 980 | "0.0 | \n", 981 | "
4 | \n", 984 | "0.39 | \n", 985 | "0.00 | \n", 986 | "0.00 | \n", 987 | "0.00 | \n", 988 | "0.00 | \n", 989 | "0.00 | \n", 990 | "0.00 | \n", 991 | "0.39 | \n", 992 | "0.00 | \n", 993 | "0.47 | \n", 994 | "0.39 | \n", 995 | "0.00 | \n", 996 | "0.00 | \n", 997 | "0.00 | \n", 998 | "0.39 | \n", 999 | "0.00 | \n", 1000 | "0.39 | \n", 1001 | "0.00 | \n", 1002 | "0.00 | \n", 1003 | "0.0 | \n", 1004 | "
5 | \n", 1007 | "0.00 | \n", 1008 | "0.00 | \n", 1009 | "0.00 | \n", 1010 | "0.37 | \n", 1011 | "0.00 | \n", 1012 | "0.42 | \n", 1013 | "0.42 | \n", 1014 | "0.00 | \n", 1015 | "0.42 | \n", 1016 | "0.00 | \n", 1017 | "0.00 | \n", 1018 | "0.00 | \n", 1019 | "0.00 | \n", 1020 | "0.42 | \n", 1021 | "0.00 | \n", 1022 | "0.42 | \n", 1023 | "0.00 | \n", 1024 | "0.00 | \n", 1025 | "0.00 | \n", 1026 | "0.0 | \n", 1027 | "
6 | \n", 1030 | "0.00 | \n", 1031 | "0.00 | \n", 1032 | "0.36 | \n", 1033 | "0.32 | \n", 1034 | "0.00 | \n", 1035 | "0.00 | \n", 1036 | "0.00 | \n", 1037 | "0.00 | \n", 1038 | "0.00 | \n", 1039 | "0.00 | \n", 1040 | "0.00 | \n", 1041 | "0.00 | \n", 1042 | "0.00 | \n", 1043 | "0.00 | \n", 1044 | "0.00 | \n", 1045 | "0.00 | \n", 1046 | "0.00 | \n", 1047 | "0.72 | \n", 1048 | "0.00 | \n", 1049 | "0.5 | \n", 1050 | "
7 | \n", 1053 | "0.00 | \n", 1054 | "0.00 | \n", 1055 | "0.00 | \n", 1056 | "0.00 | \n", 1057 | "0.00 | \n", 1058 | "0.45 | \n", 1059 | "0.45 | \n", 1060 | "0.00 | \n", 1061 | "0.45 | \n", 1062 | "0.00 | \n", 1063 | "0.00 | \n", 1064 | "0.00 | \n", 1065 | "0.00 | \n", 1066 | "0.45 | \n", 1067 | "0.00 | \n", 1068 | "0.45 | \n", 1069 | "0.00 | \n", 1070 | "0.00 | \n", 1071 | "0.00 | \n", 1072 | "0.0 | \n", 1073 | "
\n", 1149 | " | 0 | \n", 1150 | "1 | \n", 1151 | "2 | \n", 1152 | "3 | \n", 1153 | "4 | \n", 1154 | "5 | \n", 1155 | "6 | \n", 1156 | "7 | \n", 1157 | "
---|---|---|---|---|---|---|---|---|
0 | \n", 1162 | "1.000000 | \n", 1163 | "0.820599 | \n", 1164 | "0.000000 | \n", 1165 | "0.000000 | \n", 1166 | "0.000000 | \n", 1167 | "0.192353 | \n", 1168 | "0.817246 | \n", 1169 | "0.000000 | \n", 1170 | "
1 | \n", 1173 | "0.820599 | \n", 1174 | "1.000000 | \n", 1175 | "0.000000 | \n", 1176 | "0.000000 | \n", 1177 | "0.225489 | \n", 1178 | "0.157845 | \n", 1179 | "0.670631 | \n", 1180 | "0.000000 | \n", 1181 | "
2 | \n", 1184 | "0.000000 | \n", 1185 | "0.000000 | \n", 1186 | "1.000000 | \n", 1187 | "0.000000 | \n", 1188 | "0.000000 | \n", 1189 | "0.791821 | \n", 1190 | "0.000000 | \n", 1191 | "0.850516 | \n", 1192 | "
3 | \n", 1195 | "0.000000 | \n", 1196 | "0.000000 | \n", 1197 | "0.000000 | \n", 1198 | "1.000000 | \n", 1199 | "0.506866 | \n", 1200 | "0.000000 | \n", 1201 | "0.000000 | \n", 1202 | "0.000000 | \n", 1203 | "
4 | \n", 1206 | "0.000000 | \n", 1207 | "0.225489 | \n", 1208 | "0.000000 | \n", 1209 | "0.506866 | \n", 1210 | "1.000000 | \n", 1211 | "0.000000 | \n", 1212 | "0.000000 | \n", 1213 | "0.000000 | \n", 1214 | "
5 | \n", 1217 | "0.192353 | \n", 1218 | "0.157845 | \n", 1219 | "0.791821 | \n", 1220 | "0.000000 | \n", 1221 | "0.000000 | \n", 1222 | "1.000000 | \n", 1223 | "0.115488 | \n", 1224 | "0.930989 | \n", 1225 | "
6 | \n", 1228 | "0.817246 | \n", 1229 | "0.670631 | \n", 1230 | "0.000000 | \n", 1231 | "0.000000 | \n", 1232 | "0.000000 | \n", 1233 | "0.115488 | \n", 1234 | "1.000000 | \n", 1235 | "0.000000 | \n", 1236 | "
7 | \n", 1239 | "0.000000 | \n", 1240 | "0.000000 | \n", 1241 | "0.850516 | \n", 1242 | "0.000000 | \n", 1243 | "0.000000 | \n", 1244 | "0.930989 | \n", 1245 | "0.000000 | \n", 1246 | "1.000000 | \n", 1247 | "
\n", 1320 | " | Document\\Cluster 1 | \n", 1321 | "Document\\Cluster 2 | \n", 1322 | "Distance | \n", 1323 | "Cluster Size | \n", 1324 | "
---|---|---|---|---|
0 | \n", 1329 | "2 | \n", 1330 | "7 | \n", 1331 | "0.253098 | \n", 1332 | "2 | \n", 1333 | "
1 | \n", 1336 | "0 | \n", 1337 | "6 | \n", 1338 | "0.308539 | \n", 1339 | "2 | \n", 1340 | "
2 | \n", 1343 | "5 | \n", 1344 | "8 | \n", 1345 | "0.386952 | \n", 1346 | "3 | \n", 1347 | "
3 | \n", 1350 | "1 | \n", 1351 | "9 | \n", 1352 | "0.489845 | \n", 1353 | "3 | \n", 1354 | "
4 | \n", 1357 | "3 | \n", 1358 | "4 | \n", 1359 | "0.732945 | \n", 1360 | "2 | \n", 1361 | "
5 | \n", 1364 | "11 | \n", 1365 | "12 | \n", 1366 | "2.69565 | \n", 1367 | "5 | \n", 1368 | "
6 | \n", 1371 | "10 | \n", 1372 | "13 | \n", 1373 | "3.45108 | \n", 1374 | "8 | \n", 1375 | "
\n", 1465 | " | Document | \n", 1466 | "Category | \n", 1467 | "ClusterLabel | \n", 1468 | "
---|---|---|---|
0 | \n", 1473 | "The sky is blue and beautiful. | \n", 1474 | "weather | \n", 1475 | "2 | \n", 1476 | "
1 | \n", 1479 | "Love this blue and beautiful sky! | \n", 1480 | "weather | \n", 1481 | "2 | \n", 1482 | "
2 | \n", 1485 | "The quick brown fox jumps over the lazy dog. | \n", 1486 | "animals | \n", 1487 | "1 | \n", 1488 | "
3 | \n", 1491 | "A king's breakfast has sausages, ham, bacon, eggs, toast and beans | \n", 1492 | "food | \n", 1493 | "3 | \n", 1494 | "
4 | \n", 1497 | "I love green eggs, ham, sausages and bacon! | \n", 1498 | "food | \n", 1499 | "3 | \n", 1500 | "
5 | \n", 1503 | "The brown fox is quick and the blue dog is lazy! | \n", 1504 | "animals | \n", 1505 | "1 | \n", 1506 | "
6 | \n", 1509 | "The sky is very blue and the sky is very beautiful today | \n", 1510 | "weather | \n", 1511 | "2 | \n", 1512 | "
7 | \n", 1515 | "The dog is lazy but the brown fox is quick! | \n", 1516 | "animals | \n", 1517 | "1 | \n", 1518 | "
\n", 1600 | " | T1 | \n", 1601 | "T2 | \n", 1602 | "T3 | \n", 1603 | "
---|---|---|---|
0 | \n", 1608 | "0.832191 | \n", 1609 | "0.083480 | \n", 1610 | "0.084329 | \n", 1611 | "
1 | \n", 1614 | "0.863554 | \n", 1615 | "0.069100 | \n", 1616 | "0.067346 | \n", 1617 | "
2 | \n", 1620 | "0.047794 | \n", 1621 | "0.047776 | \n", 1622 | "0.904430 | \n", 1623 | "
3 | \n", 1626 | "0.037243 | \n", 1627 | "0.925559 | \n", 1628 | "0.037198 | \n", 1629 | "
4 | \n", 1632 | "0.049121 | \n", 1633 | "0.903076 | \n", 1634 | "0.047802 | \n", 1635 | "
5 | \n", 1638 | "0.054901 | \n", 1639 | "0.047778 | \n", 1640 | "0.897321 | \n", 1641 | "
6 | \n", 1644 | "0.888287 | \n", 1645 | "0.055697 | \n", 1646 | "0.056016 | \n", 1647 | "
7 | \n", 1650 | "0.055704 | \n", 1651 | "0.055689 | \n", 1652 | "0.888607 | \n", 1653 | "
\n", 1752 | " | Document | \n", 1753 | "Category | \n", 1754 | "ClusterLabel | \n", 1755 | "
---|---|---|---|
0 | \n", 1760 | "The sky is blue and beautiful. | \n", 1761 | "weather | \n", 1762 | "2 | \n", 1763 | "
1 | \n", 1766 | "Love this blue and beautiful sky! | \n", 1767 | "weather | \n", 1768 | "2 | \n", 1769 | "
2 | \n", 1772 | "The quick brown fox jumps over the lazy dog. | \n", 1773 | "animals | \n", 1774 | "1 | \n", 1775 | "
3 | \n", 1778 | "A king's breakfast has sausages, ham, bacon, eggs, toast and beans | \n", 1779 | "food | \n", 1780 | "0 | \n", 1781 | "
4 | \n", 1784 | "I love green eggs, ham, sausages and bacon! | \n", 1785 | "food | \n", 1786 | "0 | \n", 1787 | "
5 | \n", 1790 | "The brown fox is quick and the blue dog is lazy! | \n", 1791 | "animals | \n", 1792 | "1 | \n", 1793 | "
6 | \n", 1796 | "The sky is very blue and the sky is very beautiful today | \n", 1797 | "weather | \n", 1798 | "2 | \n", 1799 | "
7 | \n", 1802 | "The dog is lazy but the brown fox is quick! | \n", 1803 | "animals | \n", 1804 | "1 | \n", 1805 | "
\n", 229 | " | Product | \n", 230 | "Consumer_complaint_narrative | \n", 231 | "category_id | \n", 232 | "
---|---|---|---|
1 | \n", 237 | "Credit reporting | \n", 238 | "I have outdated information on my credit repor... | \n", 239 | "0 | \n", 240 | "
2 | \n", 243 | "Consumer Loan | \n", 244 | "I purchased a new car on XXXX XXXX. The car de... | \n", 245 | "1 | \n", 246 | "
7 | \n", 249 | "Credit reporting | \n", 250 | "An account on my credit report has a mistaken ... | \n", 251 | "0 | \n", 252 | "
12 | \n", 255 | "Debt collection | \n", 256 | "This company refuses to provide me verificatio... | \n", 257 | "2 | \n", 258 | "
16 | \n", 261 | "Debt collection | \n", 262 | "This complaint is in regards to Square Two Fin... | \n", 263 | "2 | \n", 264 | "
The generative classification model, such as Naive Bayes, tries to learn the probabilities and then predict by using Bayes' rule to calculate the posterior, \(p(y|\textbf{x})\). However, discriminative classifiers model the posterior directly. As one of the most popular discriminative classifiers, logistic regression directly models the linear decision boundary.
\n", 1097 | "Let us start with the binary case. For an M-dimensional feature vector \\(\\textbf{x}=[x_1,x_2,...,x_M]^T\\), the posterior probability of class \\(y\\in\\{\\pm{1}\\}\\) given \\(\\textbf{x}\\) is assumed to satisfy\n", 1099 | "
\n", 1100 | "\n", 1104 | "where \\(\\textbf{w}=[w_1,w_2,...,w_M]^T\\) is the weighting vector to be learned. Given the constraint that \\(p(y=1|\\textbf{x})+p(y=-1|\\textbf{x})=1\\), it follows that\n", 1105 | "
\n", 1106 | "\n", 1110 | "in which we can observe the logistic sigmoid function \\(\\sigma(a)=\\frac{1}{1+\\exp(-a)}\\).
\n", 1111 | "Based on the assumptions above, the weighting vector, \\(\\textbf{w}\\), can be learned by maximum likelihood estimation (MLE). More specifically, given training data set \\(\\mathcal{D}=\\{(\\textbf{x}_1,y_1),(\\textbf{x}_2,y_2),...,(\\textbf{x}_N,y_N)\\}\\),\n", 1112 | "
\n", 1113 | "\n", 1122 | "We have a convex objective function here, and we can calculate the optimal solution by applying gradient descent. The gradient can be drawn as\n", 1123 | "
\n", 1124 | "\n", 1131 | "Then, we can learn the optimal \\(\\textbf{w}\\) by starting with an initial \\(\\textbf{w}_0\\) and iterating as follows:\n", 1132 | "
\n", 1133 | "\n", 1137 | "where \\(\\eta_t\\) is the learning step size. It can be invariant to time, but time-varying step sizes could potential reduce the convergence time, e.g., setting \\(\\eta_t\\propto{1/\\sqrt{t}}\\) such that the step size decreases with an increasing time \\(t\\).
\n", 1138 | "When it is generalized to multiclass case, the logistic regression model needs to adapt accordingly. Now we have \\(K\\) possible classes, that is, \\(y\\in\\{1,2,..,K\\}\\). It is assumed that the posterior probability of class \\(y=k\\) given \\(\\textbf{x}\\) follows\n", 1140 | "
\n", 1141 | "\n", 1145 | "where \\(\\textbf{w}_k\\) is a column weighting vector corresponding to class \\(k\\). Considering all classes \\(k=1,2,...,K\\), we would have a weighting matrix that includes all \\(K\\) weighting vectors. That is, \\(\\textbf{W}=[\\textbf{w}_1,\\textbf{w}_2,...,\\textbf{w}_K]\\).\n", 1146 | "Under the constraint\n", 1147 | "
\n", 1148 | "\n", 1152 | "it then follows that\n", 1153 | "
\n", 1154 | "The weighting matrix, \\(\\textbf{W}\\), can be similarly learned by maximum likelihood estimation (MLE). More specifically, given training data set \\(\\mathcal{D}=\\{(\\textbf{x}_1,y_1),(\\textbf{x}_2,y_2),...(\\textbf{x}_N,y_N)\\}\\),\n", 1158 | "
\n", 1159 | "\n", 1167 | "The gradient of the objective function with respect to each \\(\\textbf{w}_k\\) can be calculated as\n", 1168 | "
\n", 1169 | "\n", 1176 | "where \\(I(\\cdot)\\) is a binary indicator function. Applying gradient descent, the optimal solution can be obtained by iterating as follows:\n", 1177 | "
\n", 1178 | "\n", 1182 | "Note that we have \"\\(+\\)\" instead of \"\\(-\\)\", because the maximum likelihood estimation in the binary case is eventually converted to a minimization problem, while here we keep performing maximization.
\n", 1183 | "Once the optimal weights are learned from the logistic regression model, for any new feature vector \\(\\textbf{x}\\), we can easily calculate the probability that it is associated to each class label \\(k\\) in the binary case in the multiclass case. With the probabilities for each class label available, we can then perform:
\n", 1185 | "\n", 2169 | " | id | \n", 2170 | "text | \n", 2171 | "author | \n", 2172 | "
---|---|---|---|
0 | \n", 2177 | "id26305 | \n", 2178 | "This process, however, afforded me no means of... | \n", 2179 | "EAP | \n", 2180 | "
1 | \n", 2183 | "id17569 | \n", 2184 | "It never once occurred to me that the fumbling... | \n", 2185 | "HPL | \n", 2186 | "
2 | \n", 2189 | "id11008 | \n", 2190 | "In his left hand was a gold snuff box, from wh... | \n", 2191 | "EAP | \n", 2192 | "
3 | \n", 2195 | "id27763 | \n", 2196 | "How lovely is spring As we looked from Windsor... | \n", 2197 | "MWS | \n", 2198 | "
4 | \n", 2201 | "id12958 | \n", 2202 | "Finding nothing else, not even gold, the Super... | \n", 2203 | "HPL | \n", 2204 | "
\n", 2253 | " | id | \n", 2254 | "text | \n", 2255 | "
---|---|---|
0 | \n", 2260 | "id02310 | \n", 2261 | "Still, as I urged our leaving Ireland with suc... | \n", 2262 | "
1 | \n", 2265 | "id24541 | \n", 2266 | "If a fire wanted fanning, it could readily be ... | \n", 2267 | "
2 | \n", 2270 | "id00134 | \n", 2271 | "And when they had broken down the frail door t... | \n", 2272 | "
3 | \n", 2275 | "id27757 | \n", 2276 | "While I was thinking how I should possibly man... | \n", 2277 | "
4 | \n", 2280 | "id04081 | \n", 2281 | "I am not sure to what limit his knowledge may ... | \n", 2282 | "