├── .gitignore ├── .pre-commit-config.yaml ├── README.md ├── baseline.py ├── data └── .gitkeep ├── differentiate_words.py ├── make_test_set.py ├── make_training_set.py ├── models └── .gitkeep ├── poetry.lock ├── poster.pdf ├── pyproject.toml ├── rnn.py ├── separate_clean_and_unclean.py ├── trump_predictor.py ├── word_cloud.py └── word_clouds ├── diff clouds ├── e.png ├── f.png ├── i.png ├── j.png ├── n.png ├── p.png ├── s.png └── t.png └── undiff clouds ├── e.png ├── f.png ├── i.png ├── j.png ├── n.png ├── p.png ├── s.png └── t.png /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | data 3 | models 4 | !data/.gitkeep 5 | !models/.gitkeep -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | - repo: https://github.com/psf/black 2 | rev: 20.8b1 # Replace by any tag/version: https://github.com/psf/black/tags 3 | hooks: 4 | - id: black 5 | language_version: python3 # Should be a command that runs python3.6+ -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks 2 | 3 | ## Authors 4 | 5 | ### Ian Scott Knight 6 | 7 | Department of Symbolic Systems, Stanford University 8 | 9 | isknight@stanford.edu 10 | 11 | ### Rayne Hernandez 12 | 13 | Department of Mathematical & Computational Science, Stanford University 14 | 15 | rayne@stanford.edu 16 | 17 | ## Abstract 18 | 19 | Our focus for this project is using machine learning to build a classifier capable of sorting people into their Myers-Briggs Type Index (MBTI) personality type based on text samples from their social media posts. The motivations for building such a classifier are twofold. First, the pervasiveness of social media means that such a classifier would have ample data on which to run personality assessments, allowing more people to gain access to their MBTI personality type, and perhaps far more reliably and more quickly. There is significant interest in this area within the academic realm of psychology as well as the private sector. For example, many employers wish to know more about the personality of potential hires, so as to better manage the culture of their firm. Our second motivation centers on the potential for our classifier to be more accurate than currently available tests as evinced by the fact that retest error rates for personality tests administered by trained psychologists currently hover around 0.5. That is, the there is a probability of about half that taking the test twice in two different contexts will yield different classifications. Thus, our classifier could serve as a verification system for these initial tests as a means of allowing people to have more confidence in their results. Indeed, a text-based classifier would be able to operate on a far larger amount of data than that given in a single personality test. 20 | 21 | ## 0 Instructions 22 | 23 | 1. Download dataset from https://www.kaggle.com/datasnaek/mbti-type and place here: data/mbti_1.csv 24 | 25 | 2. Download GloVe word vectors from https://www.kaggle.com/watts2/glove6b50dtxt?select=glove.6B.50d.txt and place here: data/glove.6B.50d.txt 26 | 27 | 3. 
Run these scripts in order to generate dataset: separate_clean_and_unclean.py, make_training_set.py, make_test_set.py 28 | 29 | 4. Run baseline.py to use a naive Bayes classifier to establish a lower bound of reasonable accuracy 30 | 31 | 5. Run rnn.py with the constants / control variables that you want. 32 | 33 | 6. Run trump_predictor.py for fun 34 | 35 | 7. Run differentiate_words.py and then word_cloud.py to get data for word cloud formation for each MBTI type letter (i.e. one for I, one for E, etc.) 36 | 37 | 8. Above all, make this your own. I developed this project when I was not very good at programming, so it is filled with bad style. Enjoy! 38 | 39 | ## 1 Introduction 40 | 41 | In the scientific field of psychology, the concept of personality is considered a powerful but imprecisely defined construct. Psychologists would therefore stand to gain much from the development of more concrete, empirical measures of extant models of personality. Our project seeks to improve the understanding of one such model: the Myers-Briggs Type Indicator (MBTI). We intend to use machine learning to build a classifier that will take in text (e.g. social media posts) as input and produce as output a prediction of the MBTI personality type of the author of said text. A successful implementation of such a classifier would demonstrate a strong linguistic basis for MBTI and potentially personality in general. Furthermore, the ability to produce an accurate text-based classifier has significant potential implications for the field of psychology itself, since the connection between natural language and personality type is non-trivial [11] 42 | 43 | ## 2 Background/Related Work 44 | 45 | The MBTI personality classification system grew out of Jungian psychoanalytic psychology as a systematization of archetypal personality types used in clinical practice. The system is divided along four binary orthogonal personality dimensions, altogether comprising a total of 16 distinct personality types. 46 | 47 | The dimensions are as follows: 48 | 49 | i. Extraversion (E) vs Introversion (I): a measure of how much an individual prefers their outer or inner world. 50 | 51 | ii. Sensing (S) vs Intuition (N): a measure of how much an individual processes information through the five senses versus impressions through patterns. 52 | 53 | iii. Thinking (T) vs Feeling (F): a measure of preference for objective principles and facts versus weighing the emotional perspectives of others. 54 | 55 | iv. Judging (J) vs Perceiving (P): a measure of how much an individual prefers a planned and ordered life versus a flexible and spontaneous life. 56 | 57 | There is current debate over the predictive validity of MBTI regarding persistent personality traits. In contrast to the MBTI system, the prevalent personality type system used in Psychometrics is the Big Five personality classification system. This personality system measures along five statistically orthogonal personality dimensions: Extraversion, Agreeableness, Openness, Conscientiousness, and Neuroticism. In contrast to MBTI, the Big Five personality type system is statistically derived to have predictive power over measurable features in an individual’s life, ie. income, education level, marital status, and is stable over an individuals lifetime. However work by Pennebaker, J. W. and King, L. 
A [10] indicates significant correlations between four of the Big Five personality traits and the four MBTI dimensions: Agreeableness with Thinking/Feeling; Conscientiousness with Judging/Perceiving; Extraversion with Extraversion/Introversion; and Openness with Sensing/Intuition. These correlations indicate a relative mapping of MBTI personality traits to persistent features of personality. In the context of our project and the popularity of the MBTI personality system, these correlations justify an attempt to model the relationship between writing style and persistent personality traits. 58 | Current research on predicting MBTI personality types from textual data is sparse. Nevertheless important strides have been made in both machine learning and neuroscience. Work by Jonathan S. Adelstein [5] has discovered the neural correlates of the Big Five personality domains. Specifically, the activation patterns of independent functional subdivisions in the brain, responsible for cognitive and affective processing, were shown to be statistically different among individuals differing on the various Big Five personality dimensions. Likewise there was a functional overlap between these identified regions responsible for differences in personality type and written communication. This justifies our attempt at predicting persistent personality traits from textual data. 59 | In the field of Machine Learning, deep feed forward neural networks have proven useful in successfully predicting MBTI personality types for relatively small textual datasets. Work by Mihai Gavrilescu [6] and Champa H N [9] used a three layer feed forward architecture on handwritten textual data. Although their models incorporated handwritten features in addition to just textual characters, and suffered from small sample sizes, they are nonetheless a proof of concept that deep neural architectures are quite capable of predicting MBTI with considerable accuracy. Alternatively work by Mike Komisin and Curry Guinn [8] using classical machine learning methods, including Naive Bayes and SVM, on word choice features using a bag-of-words model, were able to accurately predict MBTI personality type as well. Their research indicates that textual data alone is sufficient for predicting MBTI personality types, even if not using state-of-the-art methods. Altogether past work on this topic indicates a ripe opportunity to combine newer deep learning techniques with massive textual datasets to accurately predict individual persistent personality traits. 60 | 61 | ## 3 Approach 62 | 63 | ### 3.1 Preprocessing 64 | 65 | #### 3.1.1 Proportionality 66 | 67 | When we examined other studies of MBTI using machine learning, we were surprised to find that researchers rarely made a point of cleaning their data set to accord with the actual proportions of MBTI types in the general population (e.g. ISTJ = 0.11, ISFJ = 0.09, etc.)[3]. Since our raw data set is severely disproportional (see fig. 1) compared to the roughly uniform distribution for the general population, it was clear to us that some cleaning of the proportional representation of each MBTI type would be necessary. Therefore, we artificially made our test set reflect the proportions found for each type in the general population, so as to prevent any misinterpretation of results due to skewed representation of classes in the test set. 
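To make this rebalancing step concrete, here is a minimal sketch (not the code used in separate_clean_and_unclean.py or make_test_set.py) of how a raw DataFrame of users could be downsampled to match published population proportions. `POPULATION_PROPORTIONS` is a hypothetical, partially filled-in lookup standing in for the estimates cited in [3].

```python
# Minimal sketch only: downsample a raw MBTI DataFrame so that each type
# appears in (approximately) its general-population proportion.
# POPULATION_PROPORTIONS is a hypothetical placeholder for the estimates in [3].
import pandas as pd

POPULATION_PROPORTIONS = {"ISTJ": 0.11, "ISFJ": 0.09}  # ... remaining 14 types

def rebalance_to_population(df: pd.DataFrame, proportions: dict) -> pd.DataFrame:
    counts = df["type"].value_counts()
    # The type that is rarest relative to its target share limits the total size.
    total = min(counts[t] / p for t, p in proportions.items() if t in counts)
    samples = [
        df[df["type"] == t].sample(n=int(total * p), random_state=0)
        for t, p in proportions.items()
        if t in counts
    ]
    return pd.concat(samples).reset_index(drop=True)
```

The limiting factor is the type that is rarest relative to its target share; every other type is downsampled accordingly.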
68 | 69 | Figure 1: Non-uniform representation of MBTI types in the data set: 70 | 71 | screen shot 2018-06-23 at 5 02 55 am 72 | 73 | #### 3.1.2 Selective Word Removal 74 | 75 | Since the data set comes from an Internet forum where individuals communicate strictly via written text, some word removal was clearly necessary. For example, there were several instances of data points containing links to websites; since we want our model to generalize to the English language, we removed those data points. Next, since we want every word in the data to be as meaningful as possible, we removed so-called "stop words" from the text (e.g. very common filler words like "a", "the", "or", etc.) using Python's NLTK. Finally, since this particular data set comes from a website intended for explicit discussion of personality models, especially MBTI, we removed the type names themselves (e.g. 'INTJ', 'INFP', etc.), so as to prevent the model from "cheating" by learning to recognize mentions of MBTI by name. 76 | 77 | #### 3.1.3 Lemmatization 78 | 79 | We used nltk.stem.WordNetLemmatizer to lemmatize the text, meaning that inflected forms of the same root word were transformed into their dictionary form (e.g. "walking", "walked", and "walk" all become "walk"). This allows us to make use of the fact that inflected forms of the same word still carry one shared meaning. 80 | 81 | #### 3.1.4 Tokenization 82 | 83 | Using a Keras word tokenizer, we tokenized the 2500 most common words of the lemmatized text. That is, the most common word became 1, the second most common word became 2, and so on up to 2500. Any other words in the lemmatized text were removed, such that at this point the text is in the form of lists of integers (with a vocabulary of 1-2500). 84 | 85 | #### 3.1.5 Padding 86 | 87 | Since the tokenized posts are of highly variable lengths, it is necessary to make them all the same number of tokens long. We achieved this by "padding" every tokenized post so that it has exactly 40 integers. That is, if a tokenized post has fewer than 40 tokens we add zeros until it has 40, and if it has more than 40 tokens we truncate it to 40. At this point, our input is ready. 88 | 89 | ### 3.2 Model 90 | 91 | #### 3.2.1 Embedding Layer 92 | 93 | For our embedding layer, we use an embedding matrix in the form of a dictionary mapping every lemmatized word (following the same process described above up to lemmatization) to the 50-dimensional GloVe representation of that word. This produces an output of size 50 for every token of the padded input vector. 94 | 95 | #### 3.2.2 Recurrent Neural Network 96 | 97 | Because our data set is composed of sequential text data, we decided to use a recurrent neural network in order to capture information in the text that would otherwise be ignored (e.g. as with a naive Bayes classifier). 98 | 99 | We experimented with various types of recurrent neural networks (RNN) for this step. After testing the SimpleRNN, GRU, LSTM, and Bidirectional LSTM options for recurrent layers in Keras, we found the LSTM option to give the best results. 100 | 101 | We further found the best parameters for the LSTM layer to be a dropout of 0.1, a recurrent dropout of 0.1, sigmoid activation, and a zero kernel initializer.
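To show how the embedding layer and LSTM layer described in sections 3.2.1 and 3.2.2 might be wired up in Keras, here is a hedged sketch (not the exact contents of rnn.py). It assumes a fitted Keras `tokenizer` and a GloVe lookup `embeddings_index` mapping each word to its 50-dimensional vector; the LSTM hidden size of 64 is an assumption, since it is not stated here.

```python
# Sketch only (assumes a fitted Keras `tokenizer` and a GloVe lookup
# `embeddings_index`: word -> 50-dim vector); rnn.py may differ in detail.
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM

TOP_WORDS = 2500       # token vocabulary size
MAX_POST_LENGTH = 40   # padded input length
EMBEDDING_DIM = 50     # GloVe 6B.50d

# Build the embedding matrix: row i holds the GloVe vector of token i.
embedding_matrix = np.zeros((TOP_WORDS + 1, EMBEDDING_DIM))
for word, i in tokenizer.word_index.items():
    if i <= TOP_WORDS and word in embeddings_index:
        embedding_matrix[i] = embeddings_index[word]

model = Sequential()
model.add(
    Embedding(
        TOP_WORDS + 1,
        EMBEDDING_DIM,
        weights=[embedding_matrix],
        input_length=MAX_POST_LENGTH,
    )
)
model.add(
    LSTM(
        64,  # hidden size is an assumption; the text does not state it
        dropout=0.1,
        recurrent_dropout=0.1,
        activation="sigmoid",
        kernel_initializer="zeros",
    )
)
```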
102 | 103 | #### 3.2.3 Dense Layer 104 | 105 | Finally, we use a dense layer with sigmoid activation to produce a value between 0 and 1, representing the predicted class probability, since there are only two classes. 106 | 107 | #### 3.2.4 Other 108 | 109 | Furthermore, we use binary crossentropy for the loss function (since there are only two classes) and the Adam optimizer. 110 | 111 | ## 4 Experiments 112 | 113 | ### 4.1 Data Set 114 | 115 | Our main data set is a publicly available Kaggle data set containing 8600 rows of data [2]. Each row consists of two columns: (1) the MBTI personality type (e.g. INTJ, ESFP) of a given person, and (2) fifty of that person’s social media posts. Since there are fifty posts included for every user, the number of data points is 430,000. This data comes from the users of personalitycafe.com, an online forum where users first take a questionnaire that sorts them into their MBTI type and then chat publicly with other users. 116 | 117 | ### 4.2 Classification Task 118 | 119 | Due to the nature of the Myers-Briggs Type Indicator, we can break down the classification task with 16 classes into four smaller binary classification tasks. This is because an MBTI type is composed of four binary classes, where each binary class represents a dimension of personality as theorized by the inventors of the MBTI personality model. Therefore, instead of training a single multi-class classifier, we train four different binary classifiers, such that each specializes in one of the dimensions of personality. 120 | 121 | ### 4.3 Training Hyperparameters 122 | 123 | The following hyperparameter configurations were found to produce the best performance during testing. 124 | 125 | Learning rate: 0.001 126 | 127 | Model batch size: 128 128 | 129 | Number of epochs: 30 130 | 131 | Token vocabulary size: 2500 132 | 133 | Input vector length: 40 134 | 135 | Embedding vector length: 50 136 | 137 | ### 4.4 Evaluation 138 | 139 | #### 4.4.1 Post Classification Methodology 140 | 141 | For post classification, we preprocessed the test set and predicted the class for every individual post. We then produced an accuracy score and confusion matrix for every MBTI dimension. 142 | 143 | #### 4.4.2 User Classification Methodology 144 | 145 | In order to classify users, we needed a way of turning the class predictions for the individual posts authored by a user into a single prediction for the class of that user. We devised two different methods to accomplish this, one of which proved more effective than the other. 146 | 147 | The first method is to assign the most common class prediction for a given user’s corpus of posts as that user’s predicted MBTI class. 148 | 149 | The second method is to take the mean of the class probability predictions for all the posts in a user’s corpus and round it to either 0 or 1. 150 | 151 | The second method proved more effective (by one or two percentage points of accuracy), and so we use it for our reported findings.
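A minimal sketch of the second (averaging) method is given below. It assumes a trained binary `model` for one dimension and an array of padded, tokenized posts for one user; it is illustrative rather than a copy of the project code.

```python
# Illustrative only: aggregate per-post probabilities into one user-level
# prediction by averaging and rounding (the second method described above).
import numpy as np

def predict_user_class(model, user_posts_padded) -> int:
    """user_posts_padded: array of shape (num_posts, MAX_POST_LENGTH)."""
    post_probs = model.predict(user_posts_padded).ravel()  # one probability per post
    return int(round(float(post_probs.mean())))  # 0 or 1 for this dimension
```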
152 | 153 | ### 4.5 Results 154 | 155 | #### 4.5.1 Post Classification 156 | 157 | Accuracy: 158 | 159 | screen shot 2018-06-23 at 5 03 07 am 160 | 161 | Confusion matrices: 162 | 163 | screen shot 2018-06-23 at 5 03 22 am 164 | 165 | #### 4.5.2 User Classification 166 | 167 | Accuracy: 168 | 169 | screen shot 2018-06-23 at 5 03 33 am 170 | 171 | Confusion matrices: 172 | 173 | screen shot 2018-06-23 at 5 03 40 am 174 | 175 | ### 4.6 Analysis 176 | 177 | #### 4.6.1 Model Efficacy and Implications 178 | 179 | A quick glance at the classification accuracy results for both Post Classification and User Classification reveals that User Classification performed with a higher accuracy score across all four personality dimensions than Post Classification. Meanwhile, the confusion matrices indicate the same confusion pattern along the N/S and P/J dimensions for both Post Classification and User Classification. Along the N/S and P/J dimensions, treating N (or P) as positive and S (or J) as negative, we have more False Positives than False Negatives. This indicates a propensity for our model to incorrectly predict N (or P) rather than incorrectly predict S (or J). 180 | 181 | When classifying individual social media posts, the model struggled to accurately predict the binary class for each MBTI personality dimension. However, when one considers the brevity of the text and the inherent difficulty of gleaning underlying information from such brief text, our achieved accuracy actually seems impressive. After all, it is hard to believe that there is a huge separation in the ways people of even vastly different personalities use language that is discoverable in individual social media posts of relatively short length. 182 | 183 | Next, when classifying users based on their collection of social media posts, the model achieved considerable success in accurately predicting the binary class for each MBTI personality dimension. The working assumption that allowed us to accurately classify users was that their individual posts retained information on the microscopic level that, when considered all together, would indicate the macroscopic character of the author’s personality type. That is, averaging the probability predictions of individual social media posts proved to be an effective indicator of the author’s actual personality type. 184 | 185 | #### 4.6.2 Word Clouds 186 | 187 | For data visualization, we produced word clouds for the concepts most prevalently used by specific classes of the personality dimensions. These were created by extracting the posts with the most extreme class probability predictions (500 for each binary class). For words shared between both extractions for a given dimension, the extraction with the fewer instances of that word had all of its instances removed, and the same number of instances was removed from the other extraction (in order to better capture the disproportionate use of certain words by certain types). Only then are the word clouds created, where the size of each word is proportional to its frequency in the respective extraction. We consider these word clouds to be illustrative of some of the unique ways different MBTI types use language.
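The count-differencing step described above amounts to keeping, for each shared word, only its surplus count on whichever side uses it more. A compact sketch of that idea using Counter subtraction is shown below; differentiate_words.py implements the same logic explicitly, along with the filtering of MBTI-related words.

```python
# Compact sketch of the differencing step: keep, for each word, only the
# surplus count on whichever side uses it more (equivalent to removing the
# shared counts from both extractions). See differentiate_words.py.
from collections import Counter

def differenced_counts(posts_a: str, posts_b: str):
    counts_a = Counter(posts_a.split())
    counts_b = Counter(posts_b.split())
    # Counter subtraction drops words whose difference is zero or negative.
    return counts_a - counts_b, counts_b - counts_a
```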
188 | 189 | screen shot 2018-06-23 at 5 13 38 am 190 | 191 | screen shot 2018-06-23 at 5 13 50 am 192 | 193 | screen shot 2018-06-23 at 5 14 03 am 194 | 195 | screen shot 2018-06-23 at 5 14 12 am 196 | 197 | #### 4.6.3 Generalizability To Social Media: Donald Trump 198 | 199 | Our classifier should perform better as the number of individual social media posts available for a given user increases. This is because the presumed differences that affect a person’s use of language become more apparent as the amount of text they provide increases. While our test set only provided up to 50 posts for a given user, it is obviously possible to obtain several hundred or even thousands of such pieces of text for a given social media user, thereby drastically improving our ability to classify their MBTI. 200 | 201 | As a real-life test case of this hypothetical capability for greater abstraction with an increased quantity of text data, we decided to scrape 30,000 of Donald Trump’s tweets and use our model to predict his MBTI type. Our model produced the following average of the probabilities for each personality dimension: 202 | 203 | screen shot 2018-06-23 at 5 04 11 am 204 | 205 | Rounding these numbers, our model predicts that Donald Trump’s MBTI type is ESTP, which is his true MBTI type according to MBTI experts [4]. To drive this point home, it should be noted that the ESTP archetype is known as "the Entrepreneur"! [1] This is just one example, but at the very least it demonstrates the expansive areas of application available to text classifiers like the one we have developed. 206 | 207 | ## 5 Conclusion 208 | 209 | The overall accuracy of our trained model when classifying users is 0.208 (0.676 x 0.62 x 0.778 x 0.637). While this seems to indicate a weak overall ability of our model to correctly classify all four MBTI dimensions, it should be noted that this number represents the rate of perfect classification of all four dimensions at once, and it does not capture the effectiveness of our model at achieving approximately correct predictions of overall MBTI types. In fact, other models that focus on multi-class classification of MBTI may achieve higher accuracy of perfect classification, but they do so at the risk of getting their prediction completely wrong. That is, multi-class classification treats all classes as independent of each other, and so it fails to capture the in-built relatedness of some types to other types (e.g. INFP is much more similar to INTJ than it is to ESTJ). That being said, our model represents a trade-off between these two aspects: we achieve lower rates of perfect classification in exchange for higher rates of approximately correct classification (i.e. "good" classification). 210 | 211 | ## References 212 | 213 | [1] ESTP archetype. https://www.16personalities.com/estp-personality. 214 | 215 | [2] MBTI Kaggle data set. https://www.kaggle.com/datasnaek/mbti-type. 216 | 217 | [3] MBTI representation in the general population. http://aiweb.techfak.uni-bielefeld.de/content/bworld-robot-control-software/. 218 | 219 | [4] Trump’s MBTI. https://www.kaggle.com/datasnaek/mbti-type. 220 | 221 | [5] Jonathan S. Adelstein, Zarrar Shehzad, Maarten Mennes, Colin G. DeYoung, Xi-Nian Zuo, Clare Kelly, Daniel S. Margulies, Aaron Bloomfield, Jeremy R. Gray, F. Xavier Castellanos, and Michael P. Milham. Personality is reflected in the brain’s intrinsic functional architecture. PLOS ONE, 6(11):1–12, 11 2011. 222 | 223 | [6] Mihai Gavrilescu. Study on determining the Myers-Briggs personality type based on individual’s handwriting.
The 5th IEEE International Conference on E-Health and Bioengineering, 11 2015. 224 | 225 | [7] Mayuri P. Kalghatgi, Manjula Ramannavar, and Dr. Nandini S. Sidnal. A neural network approach to personality prediction based on the big-five model. International Journal of Innovative Research in Advanced Engineering, 2015. 226 | 227 | [8] M Komisin and Curry Guinn. Identifying personality types using document classification methods. Proceedings of the 25th International Florida Artificial Intelligence Research Society Conference, FLAIRS-25, pages 232–237, 01 2012. 228 | 229 | [9] ChampaHNandDr.KRAnandakumar.Artificialneuralnetworkforhumanbehaviorprediction through handwriting analysis. International Journal of Computer Applications, 2010. 230 | 231 | [10] James W Pennebaker and Laura A King. Linguistic styles: Language use as an individual difference. Journal of personality and social psychology, 77(6):1296, 1999. 232 | 233 | [11] K. R. Scherer. Personality markers in speech. Cambridge University Press., 1979. 234 | -------------------------------------------------------------------------------- /baseline.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import pandas as pd 4 | import csv 5 | import joblib 6 | from keras.preprocessing.text import Tokenizer 7 | from keras.preprocessing import sequence 8 | from sklearn.feature_extraction.text import CountVectorizer 9 | from sklearn.feature_extraction.text import TfidfTransformer 10 | from sklearn.pipeline import Pipeline 11 | from sklearn.model_selection import GridSearchCV, KFold 12 | from sklearn.metrics import confusion_matrix, accuracy_score 13 | from sklearn.naive_bayes import MultinomialNB 14 | from sklearn.tree import DecisionTreeClassifier 15 | from sklearn.svm import SVC 16 | from sklearn.linear_model import SGDClassifier, Perceptron 17 | from sklearn.neighbors import KNeighborsClassifier 18 | from nltk.stem import WordNetLemmatizer 19 | from nltk.corpus import stopwords 20 | 21 | 22 | MODELS_DIR = "models" 23 | DATA_DIR = "data" 24 | DIMENSIONS = ["IE", "NS", "FT", "PJ"] 25 | TOP_WORDS = 2500 26 | MAX_POST_LENGTH = 40 27 | CROSS_VALIDATION = False 28 | SAVE_MODEL = False 29 | 30 | for k in range(len(DIMENSIONS)): 31 | x_train = [] 32 | y_train = [] 33 | x_test = [] 34 | y_test = [] 35 | 36 | ### Read in data 37 | with open( 38 | os.path.join(DATA_DIR, "train_{}.csv".format(DIMENSIONS[k][0])), "r" 39 | ) as f: 40 | reader = csv.reader(f) 41 | for row in reader: 42 | for post in row: 43 | x_train.append(post) 44 | y_train.append(0) 45 | with open( 46 | os.path.join(DATA_DIR, "train_{}.csv".format(DIMENSIONS[k][1])), "r" 47 | ) as f: 48 | reader = csv.reader(f) 49 | for row in reader: 50 | for post in row: 51 | x_train.append(post) 52 | y_train.append(1) 53 | with open(os.path.join(DATA_DIR, "test_{}.csv".format(DIMENSIONS[k][0])), "r") as f: 54 | reader = csv.reader(f) 55 | for row in reader: 56 | for post in row: 57 | x_test.append(post) 58 | y_test.append(0) 59 | with open(os.path.join(DATA_DIR, "test_{}.csv".format(DIMENSIONS[k][1])), "r") as f: 60 | reader = csv.reader(f) 61 | for row in reader: 62 | for post in row: 63 | x_test.append(post) 64 | y_test.append(1) 65 | 66 | MBTI_TYPES = [ 67 | "INFJ", 68 | "ENTP", 69 | "INTP", 70 | "INTJ", 71 | "ENTJ", 72 | "ENFJ", 73 | "INFP", 74 | "ENFP", 75 | "ISFP", 76 | "ISTP", 77 | "ISFJ", 78 | "ISTJ", 79 | "ESTP", 80 | "ESFP", 81 | "ESTJ", 82 | "ESFJ", 83 | ] 84 | stop_words = stopwords.words("english") 85 | lemmatizer = 
WordNetLemmatizer() 86 | tokenizer = Tokenizer(num_words=TOP_WORDS, filters="") 87 | tokenizer.fit_on_texts(x_train + x_test) 88 | 89 | def lemmatize(x): 90 | lemmatized = [] 91 | for post in x: 92 | temp = post.lower() 93 | for mbti_type in MBTI_TYPES: 94 | mbti_type = mbti_type.lower() 95 | temp = temp.replace(" " + mbti_type, "") 96 | temp = " ".join( 97 | [ 98 | lemmatizer.lemmatize(word) 99 | for word in temp.split(" ") 100 | if (word not in stop_words) 101 | ] 102 | ) 103 | lemmatized.append(temp) 104 | return np.array(lemmatized) 105 | 106 | def preprocess(x): 107 | lemmatized = lemmatize(x) 108 | tokenized = tokenizer.texts_to_sequences(lemmatized) 109 | return sequence.pad_sequences(tokenized, maxlen=MAX_POST_LENGTH) 110 | 111 | x_train = lemmatize(x_train) 112 | x_test = lemmatize(x_test) 113 | 114 | ### Assign to dataframe 115 | df = pd.DataFrame(data={"text": x_train, "type": y_train}) 116 | df = df.sample(frac=1).reset_index(drop=True) ### Shuffle rows 117 | 118 | ### Make pipeline 119 | pipeline = Pipeline( 120 | [ 121 | ("vectorizer", CountVectorizer(stop_words="english")), ### Bag-of-words 122 | ("transformer", TfidfTransformer()), 123 | ("classifier", MultinomialNB()), 124 | ] 125 | ) ### Performs best 126 | # ('classifier', SVC()) ]) 127 | # ('classifier', DecisionTreeClassifier(max_depth=50)) ]) 128 | # ('classifier', SGDClassifier(loss='hinge', penalty='l2', alpha=1e-3, random_state=42, max_iter=5, tol=None)) ]) 129 | # ('classifier', Perceptron()) ]) 130 | # ('classifier', KNeighborsClassifier(n_neighbors=2)) ]) 131 | 132 | ### Cross-validation classification (individual posts) 133 | if CROSS_VALIDATION: 134 | k_fold = KFold(n_splits=6) 135 | scores_k = [] 136 | confusion_k = np.array([[0, 0], [0, 0]]) 137 | for train_indices, test_indices in k_fold: 138 | x_train_k = df.iloc[train_indices]["text"].values 139 | y_train_k = df.iloc[train_indices]["type"].values 140 | x_test_k = df.iloc[test_indices]["text"].values 141 | y_test_k = df.iloc[test_indices]["type"].values 142 | pipeline.fit(x_train_k, y_train_k) 143 | predictions_k = pipeline.predict(x_test_k) 144 | confusion_k += confusion_matrix(y_test_k, predictions_k) 145 | score_k = accuracy_score(y_test_k, predictions_k) 146 | scores_k.append(score_k) 147 | with open( 148 | os.path.join( 149 | MODELS_DIR, "baseline_cross_validation_{}.txt".format(DIMENSIONS[k]) 150 | ), 151 | "w", 152 | ) as f: 153 | f.write( 154 | "*** {}/{} TRAINING SET CROSS VALIDATION (POSTS) ***\n".format( 155 | DIMENSIONS[k][0], DIMENSIONS[k][1] 156 | ) 157 | ) 158 | f.write("Total posts classified: {}\n".format(len(df))) 159 | f.write("Accuracy: {}\n".format(sum(scores_k) / len(scores_k))) 160 | f.write("Confusion matrix: \n") 161 | f.write(np.array2string(confusion_k, separator=", ")) 162 | 163 | ### Test set classification (individual posts) 164 | pipeline.fit(df["text"].values, df["type"].values) 165 | predictions = pipeline.predict(x_test) 166 | confusion = confusion_matrix(y_test, predictions) 167 | score = accuracy_score(y_test, predictions) 168 | with open( 169 | os.path.join(MODELS_DIR, "baseline_accuracy_{}.txt".format(DIMENSIONS[k])), "w" 170 | ) as f: 171 | f.write( 172 | "*** {}/{} TEST SET CLASSIFICATION (POSTS) ***\n".format( 173 | DIMENSIONS[k][0], DIMENSIONS[k][1] 174 | ) 175 | ) 176 | f.write("Total posts classified: {}\n".format(len(x_test))) 177 | f.write("Accuracy: {}\n".format(score)) 178 | f.write("Confusion matrix: \n") 179 | f.write(np.array2string(confusion, separator=", ")) 180 | print( 181 | f"Wrote training / test 
results for {DIMENSIONS[k]} here: {os.path.join(MODELS_DIR, 'baseline_accuracy_{}.txt'.format(DIMENSIONS[k]))}" 182 | ) 183 | 184 | # Save model 185 | if SAVE_MODEL: 186 | pipeline.named_steps["classifier"].model.save( 187 | os.path.join(MODELS_DIR, "NB_classifier_{}.h5".format(DIMENSIONS[k])) 188 | ) 189 | pipeline.named_steps["classifier"].model = None 190 | joblib.dump( 191 | pipeline, 192 | os.path.join(MODELS_DIR, "baseline_pipeline_{}.pkl".format(DIMENSIONS[k])), 193 | ) 194 | del pipeline 195 | -------------------------------------------------------------------------------- /data/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/data/.gitkeep -------------------------------------------------------------------------------- /differentiate_words.py: -------------------------------------------------------------------------------- 1 | import os 2 | import collections 3 | import re 4 | 5 | 6 | DATA_DIR = "data" 7 | 8 | DIMENSIONS = ["IE", "NS", "FT", "PJ"] 9 | WORDS_TO_REMOVE = [ 10 | "intj", 11 | "intp", 12 | "infj", 13 | "infp", 14 | "istj", 15 | "istp", 16 | "isfj", 17 | "isfp", 18 | "entj", 19 | "entp", 20 | "enfj", 21 | "enfp", 22 | "estj", 23 | "estp", 24 | "esfj", 25 | "esfp", 26 | "si", 27 | "ni", 28 | "ti", 29 | "fi", 30 | "se", 31 | "ne", 32 | "te", 33 | "fe", 34 | "nt", 35 | "nf", 36 | "sxsp", 37 | "spsx", 38 | "spso", 39 | "sxso", 40 | "sosp", 41 | "sosx", 42 | "sp", 43 | "sx", 44 | "sj", 45 | "sf", 46 | "st", 47 | "le", 48 | "socionic", 49 | "socionics", 50 | "enneagram", 51 | "d", 52 | "w", 53 | "mbti", 54 | ] 55 | 56 | for k in range(len(DIMENSIONS)): 57 | 58 | wordcount_a = {} 59 | wordcount_b = {} 60 | 61 | with open( 62 | os.path.join(DATA_DIR, "extreme_examples_{}.txt".format(DIMENSIONS[k][0])), "r" 63 | ) as f: 64 | wordcount_a = collections.Counter(f.read().split()) 65 | 66 | with open( 67 | os.path.join(DATA_DIR, "extreme_examples_{}.txt".format(DIMENSIONS[k][1])), "r" 68 | ) as f: 69 | wordcount_b = collections.Counter(f.read().split()) 70 | 71 | cache = [] 72 | 73 | for key in wordcount_a.keys(): 74 | if key not in cache: 75 | cache.append(key) 76 | for key in wordcount_b.keys(): 77 | if key not in cache: 78 | cache.append(key) 79 | 80 | a = {} 81 | b = {} 82 | 83 | for key in cache: 84 | if key in wordcount_a.keys(): 85 | if key in wordcount_b.keys(): 86 | if wordcount_a[key] > wordcount_b[key]: 87 | a[key] = wordcount_a[key] - wordcount_b[key] 88 | elif wordcount_a[key] < wordcount_b[key]: 89 | b[key] = wordcount_b[key] - wordcount_a[key] 90 | else: 91 | a[key] = wordcount_a[key] 92 | elif key in wordcount_b.keys(): 93 | b[key] = wordcount_b[key] 94 | 95 | regex = re.compile("[^a-zA-Z]") 96 | 97 | with open( 98 | os.path.join(DATA_DIR, "special_words_{}.txt".format(DIMENSIONS[k][0])), "w" 99 | ) as f: 100 | for key in a.keys(): 101 | mod = regex.sub("", str(key)) 102 | if mod not in WORDS_TO_REMOVE: 103 | if a[key] > 2: 104 | for ___ in range(a[key]): 105 | f.write(mod + "\n") 106 | 107 | with open( 108 | os.path.join(DATA_DIR, "special_words_{}.txt".format(DIMENSIONS[k][1])), "w" 109 | ) as f: 110 | for key in b.keys(): 111 | mod = regex.sub("", str(key)) 112 | if mod not in WORDS_TO_REMOVE: 113 | if b[key] > 2: 114 | for ___ in range(b[key]): 115 | f.write(mod + "\n") 116 | -------------------------------------------------------------------------------- 
/make_test_set.py: -------------------------------------------------------------------------------- 1 | import os 2 | import collections 3 | import pandas as pd 4 | import csv 5 | import re 6 | 7 | 8 | DATA_DIR = "data" 9 | MBTI_CLEAN_CSV_PATH = os.path.join(DATA_DIR, "mbti_clean.csv") 10 | DIMENSIONS = ("IE", "NS", "TF", "PJ") 11 | 12 | 13 | df = pd.read_csv(MBTI_CLEAN_CSV_PATH) 14 | 15 | for dimension in DIMENSIONS: 16 | letter_1, letter_2 = dimension 17 | for letter in [letter_1, letter_2]: 18 | posts = [] 19 | for index, row in df.iterrows(): 20 | if letter in row["type"]: 21 | hundred_posts = row["posts"].split("|||") 22 | for post in hundred_posts: 23 | if ( 24 | ("http" in post) 25 | or (post == "") 26 | or (post == None) 27 | or (not re.search("[a-zA-Z]", post)) 28 | ): # ignore deformed posts 29 | continue 30 | posts.append(post) 31 | 32 | test_csv_path = os.path.join(DATA_DIR, f"test_{letter}.csv") 33 | with open(test_csv_path, "w") as f: 34 | writer = csv.writer(f) 35 | for post in posts: 36 | writer.writerow([post]) 37 | -------------------------------------------------------------------------------- /make_training_set.py: -------------------------------------------------------------------------------- 1 | import os 2 | import collections 3 | import pandas as pd 4 | import csv 5 | import re 6 | 7 | 8 | DATA_DIR = "data" 9 | MBTI_UNCLEAN_CSV_PATH = os.path.join(DATA_DIR, "mbti_unclean.csv") 10 | DIMENSIONS = ("IE", "NS", "TF", "PJ") 11 | 12 | 13 | df = pd.read_csv(MBTI_UNCLEAN_CSV_PATH) 14 | 15 | counts = collections.defaultdict(int) 16 | for dimension in DIMENSIONS: 17 | letter_1, letter_2 = dimension 18 | for index, row in df.iterrows(): 19 | mbti = row["type"] 20 | hundred_posts = row["posts"].split("|||") 21 | for post in hundred_posts: 22 | if ( 23 | ("http" in post) 24 | or (post == "") 25 | or (post == None) 26 | or (not re.search("[a-zA-Z]", post)) 27 | ): # ignore deformed posts 28 | continue 29 | if letter_1 in mbti: 30 | counts[letter_1] += 1 31 | if letter_2 in mbti: 32 | counts[letter_2] += 1 33 | 34 | for dimension in DIMENSIONS: 35 | letter_1, letter_2 = dimension 36 | if counts[letter_1] < counts[letter_2]: 37 | limit = counts[letter_1] 38 | else: 39 | limit = counts[letter_2] 40 | 41 | for letter in [letter_1, letter_2]: 42 | posts = [] 43 | i = 0 44 | for index, row in df.iterrows(): 45 | if letter in row["type"]: 46 | hundred_posts = row["posts"].split("|||") 47 | for post in hundred_posts: 48 | if i == limit: 49 | break 50 | if ( 51 | ("http" in post) 52 | or (post == "") 53 | or (post == None) 54 | or (not re.search("[a-zA-Z]", post)) 55 | ): # ignore deformed posts 56 | continue 57 | posts.append(post) 58 | i += 1 59 | 60 | train_csv_path = os.path.join(DATA_DIR, f"train_{letter}.csv") 61 | with open(train_csv_path, "w") as f: 62 | writer = csv.writer(f) 63 | for post in posts: 64 | writer.writerow([post]) 65 | -------------------------------------------------------------------------------- /models/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/models/.gitkeep -------------------------------------------------------------------------------- /poetry.lock: -------------------------------------------------------------------------------- 1 | [[package]] 2 | name = "absl-py" 3 | version = "0.11.0" 4 | description = "Abseil Python Common Libraries, see 
https://github.com/abseil/abseil-py." 5 | category = "main" 6 | optional = false 7 | python-versions = "*" 8 | 9 | [package.dependencies] 10 | six = "*" 11 | 12 | [[package]] 13 | name = "appdirs" 14 | version = "1.4.4" 15 | description = "A small Python module for determining appropriate platform-specific dirs, e.g. a \"user data dir\"." 16 | category = "dev" 17 | optional = false 18 | python-versions = "*" 19 | 20 | [[package]] 21 | name = "astunparse" 22 | version = "1.6.3" 23 | description = "An AST unparser for Python" 24 | category = "main" 25 | optional = false 26 | python-versions = "*" 27 | 28 | [package.dependencies] 29 | six = ">=1.6.1,<2.0" 30 | 31 | [[package]] 32 | name = "cachetools" 33 | version = "4.2.1" 34 | description = "Extensible memoizing collections and decorators" 35 | category = "main" 36 | optional = false 37 | python-versions = "~=3.5" 38 | 39 | [[package]] 40 | name = "certifi" 41 | version = "2020.12.5" 42 | description = "Python package for providing Mozilla's CA Bundle." 43 | category = "main" 44 | optional = false 45 | python-versions = "*" 46 | 47 | [[package]] 48 | name = "cfgv" 49 | version = "3.2.0" 50 | description = "Validate configuration and produce human readable error messages." 51 | category = "dev" 52 | optional = false 53 | python-versions = ">=3.6.1" 54 | 55 | [[package]] 56 | name = "chardet" 57 | version = "4.0.0" 58 | description = "Universal encoding detector for Python 2 and 3" 59 | category = "main" 60 | optional = false 61 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*" 62 | 63 | [[package]] 64 | name = "click" 65 | version = "7.1.2" 66 | description = "Composable command line interface toolkit" 67 | category = "main" 68 | optional = false 69 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*" 70 | 71 | [[package]] 72 | name = "distlib" 73 | version = "0.3.1" 74 | description = "Distribution utilities" 75 | category = "dev" 76 | optional = false 77 | python-versions = "*" 78 | 79 | [[package]] 80 | name = "filelock" 81 | version = "3.0.12" 82 | description = "A platform independent file lock." 
83 | category = "dev" 84 | optional = false 85 | python-versions = "*" 86 | 87 | [[package]] 88 | name = "flatbuffers" 89 | version = "1.12" 90 | description = "The FlatBuffers serialization format for Python" 91 | category = "main" 92 | optional = false 93 | python-versions = "*" 94 | 95 | [[package]] 96 | name = "gast" 97 | version = "0.3.3" 98 | description = "Python AST that abstracts the underlying Python version" 99 | category = "main" 100 | optional = false 101 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*" 102 | 103 | [[package]] 104 | name = "google-auth" 105 | version = "1.24.0" 106 | description = "Google Authentication Library" 107 | category = "main" 108 | optional = false 109 | python-versions = ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*" 110 | 111 | [package.dependencies] 112 | cachetools = ">=2.0.0,<5.0" 113 | pyasn1-modules = ">=0.2.1" 114 | rsa = {version = ">=3.1.4,<5", markers = "python_version >= \"3.6\""} 115 | six = ">=1.9.0" 116 | 117 | [package.extras] 118 | aiohttp = ["aiohttp (>=3.6.2,<4.0.0dev)"] 119 | 120 | [[package]] 121 | name = "google-auth-oauthlib" 122 | version = "0.4.2" 123 | description = "Google Authentication Library" 124 | category = "main" 125 | optional = false 126 | python-versions = ">=3.6" 127 | 128 | [package.dependencies] 129 | google-auth = "*" 130 | requests-oauthlib = ">=0.7.0" 131 | 132 | [package.extras] 133 | tool = ["click"] 134 | 135 | [[package]] 136 | name = "google-pasta" 137 | version = "0.2.0" 138 | description = "pasta is an AST-based Python refactoring library" 139 | category = "main" 140 | optional = false 141 | python-versions = "*" 142 | 143 | [package.dependencies] 144 | six = "*" 145 | 146 | [[package]] 147 | name = "grpcio" 148 | version = "1.32.0" 149 | description = "HTTP/2-based RPC framework" 150 | category = "main" 151 | optional = false 152 | python-versions = "*" 153 | 154 | [package.dependencies] 155 | six = ">=1.5.2" 156 | 157 | [package.extras] 158 | protobuf = ["grpcio-tools (>=1.32.0)"] 159 | 160 | [[package]] 161 | name = "h5py" 162 | version = "2.10.0" 163 | description = "Read and write HDF5 files from Python" 164 | category = "main" 165 | optional = false 166 | python-versions = "*" 167 | 168 | [package.dependencies] 169 | numpy = ">=1.7" 170 | six = "*" 171 | 172 | [[package]] 173 | name = "identify" 174 | version = "1.5.13" 175 | description = "File identification library for Python" 176 | category = "dev" 177 | optional = false 178 | python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,>=2.7" 179 | 180 | [package.extras] 181 | license = ["editdistance"] 182 | 183 | [[package]] 184 | name = "idna" 185 | version = "2.10" 186 | description = "Internationalized Domain Names in Applications (IDNA)" 187 | category = "main" 188 | optional = false 189 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*" 190 | 191 | [[package]] 192 | name = "joblib" 193 | version = "1.0.0" 194 | description = "Lightweight pipelining with Python functions" 195 | category = "main" 196 | optional = false 197 | python-versions = ">=3.6" 198 | 199 | [[package]] 200 | name = "keras" 201 | version = "2.4.3" 202 | description = "Deep Learning for humans" 203 | category = "main" 204 | optional = false 205 | python-versions = "*" 206 | 207 | [package.dependencies] 208 | h5py = "*" 209 | numpy = ">=1.9.1" 210 | pyyaml = "*" 211 | scipy = ">=0.14" 212 | 213 | [package.extras] 214 | tests = ["pytest", "pytest-pep8", "pytest-xdist", "flaky", "pytest-cov", "pandas", "requests", "markdown"] 215 | visualize 
= ["pydot (>=1.2.4)"] 216 | 217 | [[package]] 218 | name = "keras-preprocessing" 219 | version = "1.1.2" 220 | description = "Easy data preprocessing and data augmentation for deep learning models" 221 | category = "main" 222 | optional = false 223 | python-versions = "*" 224 | 225 | [package.dependencies] 226 | numpy = ">=1.9.1" 227 | six = ">=1.9.0" 228 | 229 | [package.extras] 230 | image = ["scipy (>=0.14)", "Pillow (>=5.2.0)"] 231 | pep8 = ["flake8"] 232 | tests = ["pandas", "pillow", "tensorflow", "keras", "pytest", "pytest-xdist", "pytest-cov"] 233 | 234 | [[package]] 235 | name = "markdown" 236 | version = "3.3.3" 237 | description = "Python implementation of Markdown." 238 | category = "main" 239 | optional = false 240 | python-versions = ">=3.6" 241 | 242 | [package.extras] 243 | testing = ["coverage", "pyyaml"] 244 | 245 | [[package]] 246 | name = "nltk" 247 | version = "3.5" 248 | description = "Natural Language Toolkit" 249 | category = "main" 250 | optional = false 251 | python-versions = "*" 252 | 253 | [package.dependencies] 254 | click = "*" 255 | joblib = "*" 256 | regex = "*" 257 | tqdm = "*" 258 | 259 | [package.extras] 260 | all = ["requests", "numpy", "python-crfsuite", "scikit-learn", "twython", "pyparsing", "scipy", "matplotlib", "gensim"] 261 | corenlp = ["requests"] 262 | machine_learning = ["gensim", "numpy", "python-crfsuite", "scikit-learn", "scipy"] 263 | plot = ["matplotlib"] 264 | tgrep = ["pyparsing"] 265 | twitter = ["twython"] 266 | 267 | [[package]] 268 | name = "nodeenv" 269 | version = "1.5.0" 270 | description = "Node.js virtual environment builder" 271 | category = "dev" 272 | optional = false 273 | python-versions = "*" 274 | 275 | [[package]] 276 | name = "numpy" 277 | version = "1.19.5" 278 | description = "NumPy is the fundamental package for array computing with Python." 279 | category = "main" 280 | optional = false 281 | python-versions = ">=3.6" 282 | 283 | [[package]] 284 | name = "oauthlib" 285 | version = "3.1.0" 286 | description = "A generic, spec-compliant, thorough implementation of the OAuth request-signing logic" 287 | category = "main" 288 | optional = false 289 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*" 290 | 291 | [package.extras] 292 | rsa = ["cryptography"] 293 | signals = ["blinker"] 294 | signedtoken = ["cryptography", "pyjwt (>=1.0.0)"] 295 | 296 | [[package]] 297 | name = "opt-einsum" 298 | version = "3.3.0" 299 | description = "Optimizing numpys einsum function" 300 | category = "main" 301 | optional = false 302 | python-versions = ">=3.5" 303 | 304 | [package.dependencies] 305 | numpy = ">=1.7" 306 | 307 | [package.extras] 308 | docs = ["sphinx (1.2.3)", "sphinxcontrib-napoleon", "sphinx-rtd-theme", "numpydoc"] 309 | tests = ["pytest", "pytest-cov", "pytest-pep8"] 310 | 311 | [[package]] 312 | name = "pandas" 313 | version = "1.2.1" 314 | description = "Powerful data structures for data analysis, time series, and statistics" 315 | category = "main" 316 | optional = false 317 | python-versions = ">=3.7.1" 318 | 319 | [package.dependencies] 320 | numpy = ">=1.16.5" 321 | python-dateutil = ">=2.7.3" 322 | pytz = ">=2017.3" 323 | 324 | [package.extras] 325 | test = ["pytest (>=5.0.1)", "pytest-xdist", "hypothesis (>=3.58)"] 326 | 327 | [[package]] 328 | name = "pre-commit" 329 | version = "2.10.0" 330 | description = "A framework for managing and maintaining multi-language pre-commit hooks." 
331 | category = "dev" 332 | optional = false 333 | python-versions = ">=3.6.1" 334 | 335 | [package.dependencies] 336 | cfgv = ">=2.0.0" 337 | identify = ">=1.0.0" 338 | nodeenv = ">=0.11.1" 339 | pyyaml = ">=5.1" 340 | toml = "*" 341 | virtualenv = ">=20.0.8" 342 | 343 | [[package]] 344 | name = "protobuf" 345 | version = "3.14.0" 346 | description = "Protocol Buffers" 347 | category = "main" 348 | optional = false 349 | python-versions = "*" 350 | 351 | [package.dependencies] 352 | six = ">=1.9" 353 | 354 | [[package]] 355 | name = "pyasn1" 356 | version = "0.4.8" 357 | description = "ASN.1 types and codecs" 358 | category = "main" 359 | optional = false 360 | python-versions = "*" 361 | 362 | [[package]] 363 | name = "pyasn1-modules" 364 | version = "0.2.8" 365 | description = "A collection of ASN.1-based protocols modules." 366 | category = "main" 367 | optional = false 368 | python-versions = "*" 369 | 370 | [package.dependencies] 371 | pyasn1 = ">=0.4.6,<0.5.0" 372 | 373 | [[package]] 374 | name = "python-dateutil" 375 | version = "2.8.1" 376 | description = "Extensions to the standard Python datetime module" 377 | category = "main" 378 | optional = false 379 | python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,>=2.7" 380 | 381 | [package.dependencies] 382 | six = ">=1.5" 383 | 384 | [[package]] 385 | name = "pytz" 386 | version = "2020.5" 387 | description = "World timezone definitions, modern and historical" 388 | category = "main" 389 | optional = false 390 | python-versions = "*" 391 | 392 | [[package]] 393 | name = "pyyaml" 394 | version = "5.4.1" 395 | description = "YAML parser and emitter for Python" 396 | category = "main" 397 | optional = false 398 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*" 399 | 400 | [[package]] 401 | name = "regex" 402 | version = "2020.11.13" 403 | description = "Alternative regular expression module, to replace re." 404 | category = "main" 405 | optional = false 406 | python-versions = "*" 407 | 408 | [[package]] 409 | name = "requests" 410 | version = "2.25.1" 411 | description = "Python HTTP for Humans." 412 | category = "main" 413 | optional = false 414 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*" 415 | 416 | [package.dependencies] 417 | certifi = ">=2017.4.17" 418 | chardet = ">=3.0.2,<5" 419 | idna = ">=2.5,<3" 420 | urllib3 = ">=1.21.1,<1.27" 421 | 422 | [package.extras] 423 | security = ["pyOpenSSL (>=0.14)", "cryptography (>=1.3.4)"] 424 | socks = ["PySocks (>=1.5.6,<1.5.7 || >1.5.7)", "win-inet-pton"] 425 | 426 | [[package]] 427 | name = "requests-oauthlib" 428 | version = "1.3.0" 429 | description = "OAuthlib authentication support for Requests." 
430 | category = "main" 431 | optional = false 432 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*" 433 | 434 | [package.dependencies] 435 | oauthlib = ">=3.0.0" 436 | requests = ">=2.0.0" 437 | 438 | [package.extras] 439 | rsa = ["oauthlib[signedtoken] (>=3.0.0)"] 440 | 441 | [[package]] 442 | name = "rsa" 443 | version = "4.7" 444 | description = "Pure-Python RSA implementation" 445 | category = "main" 446 | optional = false 447 | python-versions = ">=3.5, <4" 448 | 449 | [package.dependencies] 450 | pyasn1 = ">=0.1.3" 451 | 452 | [[package]] 453 | name = "scikit-learn" 454 | version = "0.24.1" 455 | description = "A set of python modules for machine learning and data mining" 456 | category = "main" 457 | optional = false 458 | python-versions = ">=3.6" 459 | 460 | [package.dependencies] 461 | joblib = ">=0.11" 462 | numpy = ">=1.13.3" 463 | scipy = ">=0.19.1" 464 | threadpoolctl = ">=2.0.0" 465 | 466 | [package.extras] 467 | benchmark = ["matplotlib (>=2.1.1)", "pandas (>=0.25.0)", "memory-profiler (>=0.57.0)"] 468 | docs = ["matplotlib (>=2.1.1)", "scikit-image (>=0.13)", "pandas (>=0.25.0)", "seaborn (>=0.9.0)", "memory-profiler (>=0.57.0)", "sphinx (>=3.2.0)", "sphinx-gallery (>=0.7.0)", "numpydoc (>=1.0.0)", "Pillow (>=7.1.2)", "sphinx-prompt (>=1.3.0)"] 469 | examples = ["matplotlib (>=2.1.1)", "scikit-image (>=0.13)", "pandas (>=0.25.0)", "seaborn (>=0.9.0)"] 470 | tests = ["matplotlib (>=2.1.1)", "scikit-image (>=0.13)", "pandas (>=0.25.0)", "pytest (>=5.0.1)", "pytest-cov (>=2.9.0)", "flake8 (>=3.8.2)", "mypy (>=0.770)", "pyamg (>=4.0.0)"] 471 | 472 | [[package]] 473 | name = "scipy" 474 | version = "1.6.0" 475 | description = "SciPy: Scientific Library for Python" 476 | category = "main" 477 | optional = false 478 | python-versions = ">=3.7" 479 | 480 | [package.dependencies] 481 | numpy = ">=1.16.5" 482 | 483 | [[package]] 484 | name = "six" 485 | version = "1.15.0" 486 | description = "Python 2 and 3 compatibility utilities" 487 | category = "main" 488 | optional = false 489 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*" 490 | 491 | [[package]] 492 | name = "tensorboard" 493 | version = "2.4.1" 494 | description = "TensorBoard lets you watch Tensors Flow" 495 | category = "main" 496 | optional = false 497 | python-versions = ">= 2.7, != 3.0.*, != 3.1.*" 498 | 499 | [package.dependencies] 500 | absl-py = ">=0.4" 501 | google-auth = ">=1.6.3,<2" 502 | google-auth-oauthlib = ">=0.4.1,<0.5" 503 | grpcio = ">=1.24.3" 504 | markdown = ">=2.6.8" 505 | numpy = ">=1.12.0" 506 | protobuf = ">=3.6.0" 507 | requests = ">=2.21.0,<3" 508 | six = ">=1.10.0" 509 | tensorboard-plugin-wit = ">=1.6.0" 510 | werkzeug = ">=0.11.15" 511 | 512 | [[package]] 513 | name = "tensorboard-plugin-wit" 514 | version = "1.8.0" 515 | description = "What-If Tool TensorBoard plugin." 516 | category = "main" 517 | optional = false 518 | python-versions = "*" 519 | 520 | [[package]] 521 | name = "tensorflow" 522 | version = "2.4.1" 523 | description = "TensorFlow is an open source machine learning framework for everyone." 
524 | category = "main" 525 | optional = false 526 | python-versions = "*" 527 | 528 | [package.dependencies] 529 | absl-py = ">=0.10,<1.0" 530 | astunparse = ">=1.6.3,<1.7.0" 531 | flatbuffers = ">=1.12.0,<1.13.0" 532 | gast = "0.3.3" 533 | google-pasta = ">=0.2,<1.0" 534 | grpcio = ">=1.32.0,<1.33.0" 535 | h5py = ">=2.10.0,<2.11.0" 536 | keras-preprocessing = ">=1.1.2,<1.2.0" 537 | numpy = ">=1.19.2,<1.20.0" 538 | opt-einsum = ">=3.3.0,<3.4.0" 539 | protobuf = ">=3.9.2" 540 | six = ">=1.15.0,<1.16.0" 541 | tensorboard = ">=2.4,<3.0" 542 | tensorflow-estimator = ">=2.4.0,<2.5.0" 543 | termcolor = ">=1.1.0,<1.2.0" 544 | typing-extensions = ">=3.7.4,<3.8.0" 545 | wrapt = ">=1.12.1,<1.13.0" 546 | 547 | [[package]] 548 | name = "tensorflow-estimator" 549 | version = "2.4.0" 550 | description = "TensorFlow Estimator." 551 | category = "main" 552 | optional = false 553 | python-versions = "*" 554 | 555 | [[package]] 556 | name = "termcolor" 557 | version = "1.1.0" 558 | description = "ANSII Color formatting for output in terminal." 559 | category = "main" 560 | optional = false 561 | python-versions = "*" 562 | 563 | [[package]] 564 | name = "threadpoolctl" 565 | version = "2.1.0" 566 | description = "threadpoolctl" 567 | category = "main" 568 | optional = false 569 | python-versions = ">=3.5" 570 | 571 | [[package]] 572 | name = "toml" 573 | version = "0.10.2" 574 | description = "Python Library for Tom's Obvious, Minimal Language" 575 | category = "dev" 576 | optional = false 577 | python-versions = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*" 578 | 579 | [[package]] 580 | name = "tqdm" 581 | version = "4.56.0" 582 | description = "Fast, Extensible Progress Meter" 583 | category = "main" 584 | optional = false 585 | python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,>=2.7" 586 | 587 | [package.extras] 588 | dev = ["py-make (>=0.1.0)", "twine", "wheel"] 589 | telegram = ["requests"] 590 | 591 | [[package]] 592 | name = "typing-extensions" 593 | version = "3.7.4.3" 594 | description = "Backported and Experimental Type Hints for Python 3.5+" 595 | category = "main" 596 | optional = false 597 | python-versions = "*" 598 | 599 | [[package]] 600 | name = "urllib3" 601 | version = "1.26.3" 602 | description = "HTTP library with thread-safe connection pooling, file post, and more." 
603 | category = "main" 604 | optional = false 605 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, <4" 606 | 607 | [package.extras] 608 | brotli = ["brotlipy (>=0.6.0)"] 609 | secure = ["pyOpenSSL (>=0.14)", "cryptography (>=1.3.4)", "idna (>=2.0.0)", "certifi", "ipaddress"] 610 | socks = ["PySocks (>=1.5.6,<1.5.7 || >1.5.7,<2.0)"] 611 | 612 | [[package]] 613 | name = "virtualenv" 614 | version = "20.4.0" 615 | description = "Virtual Python Environment builder" 616 | category = "dev" 617 | optional = false 618 | python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,>=2.7" 619 | 620 | [package.dependencies] 621 | appdirs = ">=1.4.3,<2" 622 | distlib = ">=0.3.1,<1" 623 | filelock = ">=3.0.0,<4" 624 | six = ">=1.9.0,<2" 625 | 626 | [package.extras] 627 | docs = ["proselint (>=0.10.2)", "sphinx (>=3)", "sphinx-argparse (>=0.2.5)", "sphinx-rtd-theme (>=0.4.3)", "towncrier (>=19.9.0rc1)"] 628 | testing = ["coverage (>=4)", "coverage-enable-subprocess (>=1)", "flaky (>=3)", "pytest (>=4)", "pytest-env (>=0.6.2)", "pytest-freezegun (>=0.4.1)", "pytest-mock (>=2)", "pytest-randomly (>=1)", "pytest-timeout (>=1)", "packaging (>=20.0)", "xonsh (>=0.9.16)"] 629 | 630 | [[package]] 631 | name = "werkzeug" 632 | version = "1.0.1" 633 | description = "The comprehensive WSGI web application library." 634 | category = "main" 635 | optional = false 636 | python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*" 637 | 638 | [package.extras] 639 | dev = ["pytest", "pytest-timeout", "coverage", "tox", "sphinx", "pallets-sphinx-themes", "sphinx-issues"] 640 | watchdog = ["watchdog"] 641 | 642 | [[package]] 643 | name = "wrapt" 644 | version = "1.12.1" 645 | description = "Module for decorators, wrappers and monkey patching." 646 | category = "main" 647 | optional = false 648 | python-versions = "*" 649 | 650 | [metadata] 651 | lock-version = "1.1" 652 | python-versions = "^3.8" 653 | content-hash = "aa97e0ecfb82147f807a99604f3012472d06d315e8dce99462a4c5e103fd5271" 654 | 655 | [metadata.files] 656 | absl-py = [ 657 | {file = "absl-py-0.11.0.tar.gz", hash = "sha256:673cccb88d810e5627d0c1c818158485d106f65a583880e2f730c997399bcfa7"}, 658 | {file = "absl_py-0.11.0-py3-none-any.whl", hash = "sha256:b3d9eb5119ff6e0a0125f6dabf2f9fae02f8acae7be70576002fac27235611c5"}, 659 | ] 660 | appdirs = [ 661 | {file = "appdirs-1.4.4-py2.py3-none-any.whl", hash = "sha256:a841dacd6b99318a741b166adb07e19ee71a274450e68237b4650ca1055ab128"}, 662 | {file = "appdirs-1.4.4.tar.gz", hash = "sha256:7d5d0167b2b1ba821647616af46a749d1c653740dd0d2415100fe26e27afdf41"}, 663 | ] 664 | astunparse = [ 665 | {file = "astunparse-1.6.3-py2.py3-none-any.whl", hash = "sha256:c2652417f2c8b5bb325c885ae329bdf3f86424075c4fd1a128674bc6fba4b8e8"}, 666 | {file = "astunparse-1.6.3.tar.gz", hash = "sha256:5ad93a8456f0d084c3456d059fd9a92cce667963232cbf763eac3bc5b7940872"}, 667 | ] 668 | cachetools = [ 669 | {file = "cachetools-4.2.1-py3-none-any.whl", hash = "sha256:1d9d5f567be80f7c07d765e21b814326d78c61eb0c3a637dffc0e5d1796cb2e2"}, 670 | {file = "cachetools-4.2.1.tar.gz", hash = "sha256:f469e29e7aa4cff64d8de4aad95ce76de8ea1125a16c68e0d93f65c3c3dc92e9"}, 671 | ] 672 | certifi = [ 673 | {file = "certifi-2020.12.5-py2.py3-none-any.whl", hash = "sha256:719a74fb9e33b9bd44cc7f3a8d94bc35e4049deebe19ba7d8e108280cfd59830"}, 674 | {file = "certifi-2020.12.5.tar.gz", hash = "sha256:1a4995114262bffbc2413b159f2a1a480c969de6e6eb13ee966d470af86af59c"}, 675 | ] 676 | cfgv = [ 677 | {file = "cfgv-3.2.0-py2.py3-none-any.whl", hash = 
"sha256:32e43d604bbe7896fe7c248a9c2276447dbef840feb28fe20494f62af110211d"}, 678 | {file = "cfgv-3.2.0.tar.gz", hash = "sha256:cf22deb93d4bcf92f345a5c3cd39d3d41d6340adc60c78bbbd6588c384fda6a1"}, 679 | ] 680 | chardet = [ 681 | {file = "chardet-4.0.0-py2.py3-none-any.whl", hash = "sha256:f864054d66fd9118f2e67044ac8981a54775ec5b67aed0441892edb553d21da5"}, 682 | {file = "chardet-4.0.0.tar.gz", hash = "sha256:0d6f53a15db4120f2b08c94f11e7d93d2c911ee118b6b30a04ec3ee8310179fa"}, 683 | ] 684 | click = [ 685 | {file = "click-7.1.2-py2.py3-none-any.whl", hash = "sha256:dacca89f4bfadd5de3d7489b7c8a566eee0d3676333fbb50030263894c38c0dc"}, 686 | {file = "click-7.1.2.tar.gz", hash = "sha256:d2b5255c7c6349bc1bd1e59e08cd12acbbd63ce649f2588755783aa94dfb6b1a"}, 687 | ] 688 | distlib = [ 689 | {file = "distlib-0.3.1-py2.py3-none-any.whl", hash = "sha256:8c09de2c67b3e7deef7184574fc060ab8a793e7adbb183d942c389c8b13c52fb"}, 690 | {file = "distlib-0.3.1.zip", hash = "sha256:edf6116872c863e1aa9d5bb7cb5e05a022c519a4594dc703843343a9ddd9bff1"}, 691 | ] 692 | filelock = [ 693 | {file = "filelock-3.0.12-py3-none-any.whl", hash = "sha256:929b7d63ec5b7d6b71b0fa5ac14e030b3f70b75747cef1b10da9b879fef15836"}, 694 | {file = "filelock-3.0.12.tar.gz", hash = "sha256:18d82244ee114f543149c66a6e0c14e9c4f8a1044b5cdaadd0f82159d6a6ff59"}, 695 | ] 696 | flatbuffers = [ 697 | {file = "flatbuffers-1.12-py2.py3-none-any.whl", hash = "sha256:9e9ef47fa92625c4721036e7c4124182668dc6021d9e7c73704edd395648deb9"}, 698 | {file = "flatbuffers-1.12.tar.gz", hash = "sha256:63bb9a722d5e373701913e226135b28a6f6ac200d5cc7b4d919fa38d73b44610"}, 699 | ] 700 | gast = [ 701 | {file = "gast-0.3.3-py2.py3-none-any.whl", hash = "sha256:8f46f5be57ae6889a4e16e2ca113b1703ef17f2b0abceb83793eaba9e1351a45"}, 702 | {file = "gast-0.3.3.tar.gz", hash = "sha256:b881ef288a49aa81440d2c5eb8aeefd4c2bb8993d5f50edae7413a85bfdb3b57"}, 703 | ] 704 | google-auth = [ 705 | {file = "google-auth-1.24.0.tar.gz", hash = "sha256:0b0e026b412a0ad096e753907559e4bdb180d9ba9f68dd9036164db4fdc4ad2e"}, 706 | {file = "google_auth-1.24.0-py2.py3-none-any.whl", hash = "sha256:ce752cc51c31f479dbf9928435ef4b07514b20261b021c7383bee4bda646acb8"}, 707 | ] 708 | google-auth-oauthlib = [ 709 | {file = "google-auth-oauthlib-0.4.2.tar.gz", hash = "sha256:65b65bc39ad8cab15039b35e5898455d3d66296d0584d96fe0e79d67d04c51d9"}, 710 | {file = "google_auth_oauthlib-0.4.2-py2.py3-none-any.whl", hash = "sha256:d4d98c831ea21d574699978827490a41b94f05d565c617fe1b420e88f1fc8d8d"}, 711 | ] 712 | google-pasta = [ 713 | {file = "google-pasta-0.2.0.tar.gz", hash = "sha256:c9f2c8dfc8f96d0d5808299920721be30c9eec37f2389f28904f454565c8a16e"}, 714 | {file = "google_pasta-0.2.0-py2-none-any.whl", hash = "sha256:4612951da876b1a10fe3960d7226f0c7682cf901e16ac06e473b267a5afa8954"}, 715 | {file = "google_pasta-0.2.0-py3-none-any.whl", hash = "sha256:b32482794a366b5366a32c92a9a9201b107821889935a02b3e51f6b432ea84ed"}, 716 | ] 717 | grpcio = [ 718 | {file = "grpcio-1.32.0-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:3afb058b6929eba07dba9ae6c5b555aa1d88cb140187d78cc510bd72d0329f28"}, 719 | {file = "grpcio-1.32.0-cp27-cp27m-manylinux2010_i686.whl", hash = "sha256:a8004b34f600a8a51785e46859cd88f3386ef67cccd1cfc7598e3d317608c643"}, 720 | {file = "grpcio-1.32.0-cp27-cp27m-manylinux2010_x86_64.whl", hash = "sha256:e6786f6f7be0937614577edcab886ddce91b7c1ea972a07ef9972e9f9ecbbb78"}, 721 | {file = "grpcio-1.32.0-cp27-cp27m-win32.whl", hash = "sha256:e467af6bb8f5843f5a441e124b43474715cfb3981264e7cd227343e826dcc3ce"}, 722 | {file = 
"grpcio-1.32.0-cp27-cp27m-win_amd64.whl", hash = "sha256:1376a60f9bfce781b39973f100b5f67e657b5be479f2fd8a7d2a408fc61c085c"}, 723 | {file = "grpcio-1.32.0-cp27-cp27mu-linux_armv7l.whl", hash = "sha256:ce617e1c4a39131f8527964ac9e700eb199484937d7a0b3e52655a3ba50d5fb9"}, 724 | {file = "grpcio-1.32.0-cp27-cp27mu-manylinux2010_i686.whl", hash = "sha256:99bac0e2c820bf446662365df65841f0c2a55b0e2c419db86eaf5d162ddae73e"}, 725 | {file = "grpcio-1.32.0-cp27-cp27mu-manylinux2010_x86_64.whl", hash = "sha256:6d869a3e8e62562b48214de95e9231c97c53caa7172802236cd5d60140d7cddd"}, 726 | {file = "grpcio-1.32.0-cp35-cp35m-linux_armv7l.whl", hash = "sha256:182c64ade34c341398bf71ec0975613970feb175090760ab4f51d1e9a5424f05"}, 727 | {file = "grpcio-1.32.0-cp35-cp35m-macosx_10_7_intel.whl", hash = "sha256:9c0d8f2346c842088b8cbe3e14985b36e5191a34bf79279ba321a4bf69bd88b7"}, 728 | {file = "grpcio-1.32.0-cp35-cp35m-manylinux2010_i686.whl", hash = "sha256:4775bc35af9cd3b5033700388deac2e1d611fa45f4a8dcb93667d94cb25f0444"}, 729 | {file = "grpcio-1.32.0-cp35-cp35m-manylinux2010_x86_64.whl", hash = "sha256:be98e3198ec765d0a1e27f69d760f69374ded8a33b953dcfe790127731f7e690"}, 730 | {file = "grpcio-1.32.0-cp35-cp35m-manylinux2014_i686.whl", hash = "sha256:378fe80ec5d9353548eb2a8a43ea03747a80f2e387c4f177f2b3ff6c7d898753"}, 731 | {file = "grpcio-1.32.0-cp35-cp35m-manylinux2014_x86_64.whl", hash = "sha256:f7d508691301027033215d3662dab7e178f54d5cca2329f26a71ae175d94b83f"}, 732 | {file = "grpcio-1.32.0-cp35-cp35m-win32.whl", hash = "sha256:25959a651420dd4a6fd7d3e8dee53f4f5fd8c56336a64963428e78b276389a59"}, 733 | {file = "grpcio-1.32.0-cp35-cp35m-win_amd64.whl", hash = "sha256:ac7028d363d2395f3d755166d0161556a3f99500a5b44890421ccfaaf2aaeb08"}, 734 | {file = "grpcio-1.32.0-cp36-cp36m-linux_armv7l.whl", hash = "sha256:c31e8a219650ddae1cd02f5a169e1bffe66a429a8255d3ab29e9363c73003b62"}, 735 | {file = "grpcio-1.32.0-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:e28e4c0d4231beda5dee94808e3a224d85cbaba3cfad05f2192e6f4ec5318053"}, 736 | {file = "grpcio-1.32.0-cp36-cp36m-manylinux2010_i686.whl", hash = "sha256:f03dfefa9075dd1c6c5cc27b1285c521434643b09338d8b29e1d6a27b386aa82"}, 737 | {file = "grpcio-1.32.0-cp36-cp36m-manylinux2010_x86_64.whl", hash = "sha256:c4966d746dccb639ef93f13560acbe9630681c07f2b320b7ec03fe2c8f0a1f15"}, 738 | {file = "grpcio-1.32.0-cp36-cp36m-manylinux2014_i686.whl", hash = "sha256:ec10d5f680b8e95a06f1367d73c5ddcc0ed04a3f38d6e4c9346988fb0cea2ffa"}, 739 | {file = "grpcio-1.32.0-cp36-cp36m-manylinux2014_x86_64.whl", hash = "sha256:28677f057e2ef11501860a7bc15de12091d40b95dd0fddab3c37ff1542e6b216"}, 740 | {file = "grpcio-1.32.0-cp36-cp36m-win32.whl", hash = "sha256:0f3f09269ffd3fded430cd89ba2397eabbf7e47be93983b25c187cdfebb302a7"}, 741 | {file = "grpcio-1.32.0-cp36-cp36m-win_amd64.whl", hash = "sha256:4396b1d0f388ae875eaf6dc05cdcb612c950fd9355bc34d38b90aaa0665a0d4b"}, 742 | {file = "grpcio-1.32.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:1ada89326a364a299527c7962e5c362dbae58c67b283fe8383c4d952b26565d5"}, 743 | {file = "grpcio-1.32.0-cp37-cp37m-manylinux2010_i686.whl", hash = "sha256:1d384a61f96a1fc6d5d3e0b62b0a859abc8d4c3f6d16daba51ebf253a3e7df5d"}, 744 | {file = "grpcio-1.32.0-cp37-cp37m-manylinux2010_x86_64.whl", hash = "sha256:e811ce5c387256609d56559d944a974cc6934a8eea8c76e7c86ec388dc06192d"}, 745 | {file = "grpcio-1.32.0-cp37-cp37m-manylinux2014_i686.whl", hash = "sha256:07b430fa68e5eecd78e2ad529ab80f6a234b55fc1b675fe47335ccbf64c6c6c8"}, 746 | {file = "grpcio-1.32.0-cp37-cp37m-manylinux2014_x86_64.whl", hash = 
"sha256:0e3edd8cdb71809d2455b9dbff66b4dd3d36c321e64bfa047da5afdfb0db332b"}, 747 | {file = "grpcio-1.32.0-cp37-cp37m-win32.whl", hash = "sha256:6f7947dad606c509d067e5b91a92b250aa0530162ab99e4737090f6b17eb12c4"}, 748 | {file = "grpcio-1.32.0-cp37-cp37m-win_amd64.whl", hash = "sha256:7cda998b7b551503beefc38db9be18c878cfb1596e1418647687575cdefa9273"}, 749 | {file = "grpcio-1.32.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:c58825a3d8634cd634d8f869afddd4d5742bdb59d594aea4cea17b8f39269a55"}, 750 | {file = "grpcio-1.32.0-cp38-cp38-manylinux2010_i686.whl", hash = "sha256:ef9bd7fdfc0a063b4ed0efcab7906df5cae9bbcf79d05c583daa2eba56752b00"}, 751 | {file = "grpcio-1.32.0-cp38-cp38-manylinux2010_x86_64.whl", hash = "sha256:1ce6f5ff4f4a548c502d5237a071fa617115df58ea4b7bd41dac77c1ab126e9c"}, 752 | {file = "grpcio-1.32.0-cp38-cp38-manylinux2014_i686.whl", hash = "sha256:f12900be4c3fd2145ba94ab0d80b7c3d71c9e6414cfee2f31b1c20188b5c281f"}, 753 | {file = "grpcio-1.32.0-cp38-cp38-manylinux2014_x86_64.whl", hash = "sha256:f53f2dfc8ff9a58a993e414a016c8b21af333955ae83960454ad91798d467c7b"}, 754 | {file = "grpcio-1.32.0-cp38-cp38-win32.whl", hash = "sha256:5bddf9d53c8df70061916c3bfd2f468ccf26c348bb0fb6211531d895ed5e4c72"}, 755 | {file = "grpcio-1.32.0-cp38-cp38-win_amd64.whl", hash = "sha256:14c0f017bfebbc18139551111ac58ecbde11f4bc375b73a53af38927d60308b6"}, 756 | {file = "grpcio-1.32.0.tar.gz", hash = "sha256:01d3046fe980be25796d368f8fc5ff34b7cf5e1444f3789a017a7fe794465639"}, 757 | ] 758 | h5py = [ 759 | {file = "h5py-2.10.0-cp27-cp27m-macosx_10_6_intel.whl", hash = "sha256:ecf4d0b56ee394a0984de15bceeb97cbe1fe485f1ac205121293fc44dcf3f31f"}, 760 | {file = "h5py-2.10.0-cp27-cp27m-manylinux1_i686.whl", hash = "sha256:86868dc07b9cc8cb7627372a2e6636cdc7a53b7e2854ad020c9e9d8a4d3fd0f5"}, 761 | {file = "h5py-2.10.0-cp27-cp27m-manylinux1_x86_64.whl", hash = "sha256:aac4b57097ac29089f179bbc2a6e14102dd210618e94d77ee4831c65f82f17c0"}, 762 | {file = "h5py-2.10.0-cp27-cp27m-win32.whl", hash = "sha256:7be5754a159236e95bd196419485343e2b5875e806fe68919e087b6351f40a70"}, 763 | {file = "h5py-2.10.0-cp27-cp27m-win_amd64.whl", hash = "sha256:13c87efa24768a5e24e360a40e0bc4c49bcb7ce1bb13a3a7f9902cec302ccd36"}, 764 | {file = "h5py-2.10.0-cp27-cp27mu-manylinux1_i686.whl", hash = "sha256:79b23f47c6524d61f899254f5cd5e486e19868f1823298bc0c29d345c2447172"}, 765 | {file = "h5py-2.10.0-cp27-cp27mu-manylinux1_x86_64.whl", hash = "sha256:cbf28ae4b5af0f05aa6e7551cee304f1d317dbed1eb7ac1d827cee2f1ef97a99"}, 766 | {file = "h5py-2.10.0-cp34-cp34m-manylinux1_i686.whl", hash = "sha256:c0d4b04bbf96c47b6d360cd06939e72def512b20a18a8547fa4af810258355d5"}, 767 | {file = "h5py-2.10.0-cp34-cp34m-manylinux1_x86_64.whl", hash = "sha256:549ad124df27c056b2e255ea1c44d30fb7a17d17676d03096ad5cd85edb32dc1"}, 768 | {file = "h5py-2.10.0-cp35-cp35m-macosx_10_6_intel.whl", hash = "sha256:a5f82cd4938ff8761d9760af3274acf55afc3c91c649c50ab18fcff5510a14a5"}, 769 | {file = "h5py-2.10.0-cp35-cp35m-manylinux1_i686.whl", hash = "sha256:3dad1730b6470fad853ef56d755d06bb916ee68a3d8272b3bab0c1ddf83bb99e"}, 770 | {file = "h5py-2.10.0-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:063947eaed5f271679ed4ffa36bb96f57bc14f44dd4336a827d9a02702e6ce6b"}, 771 | {file = "h5py-2.10.0-cp35-cp35m-win32.whl", hash = "sha256:c54a2c0dd4957776ace7f95879d81582298c5daf89e77fb8bee7378f132951de"}, 772 | {file = "h5py-2.10.0-cp35-cp35m-win_amd64.whl", hash = "sha256:6998be619c695910cb0effe5eb15d3a511d3d1a5d217d4bd0bebad1151ec2262"}, 773 | {file = "h5py-2.10.0-cp36-cp36m-macosx_10_6_intel.whl", 
hash = "sha256:ff7d241f866b718e4584fa95f520cb19405220c501bd3a53ee11871ba5166ea2"}, 774 | {file = "h5py-2.10.0-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:54817b696e87eb9e403e42643305f142cd8b940fe9b3b490bbf98c3b8a894cf4"}, 775 | {file = "h5py-2.10.0-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:d3c59549f90a891691991c17f8e58c8544060fdf3ccdea267100fa5f561ff62f"}, 776 | {file = "h5py-2.10.0-cp36-cp36m-win32.whl", hash = "sha256:d7ae7a0576b06cb8e8a1c265a8bc4b73d05fdee6429bffc9a26a6eb531e79d72"}, 777 | {file = "h5py-2.10.0-cp36-cp36m-win_amd64.whl", hash = "sha256:bffbc48331b4a801d2f4b7dac8a72609f0b10e6e516e5c480a3e3241e091c878"}, 778 | {file = "h5py-2.10.0-cp37-cp37m-macosx_10_6_intel.whl", hash = "sha256:51ae56894c6c93159086ffa2c94b5b3388c0400548ab26555c143e7cfa05b8e5"}, 779 | {file = "h5py-2.10.0-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:16ead3c57141101e3296ebeed79c9c143c32bdd0e82a61a2fc67e8e6d493e9d1"}, 780 | {file = "h5py-2.10.0-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:f0e25bb91e7a02efccb50aba6591d3fe2c725479e34769802fcdd4076abfa917"}, 781 | {file = "h5py-2.10.0-cp37-cp37m-win32.whl", hash = "sha256:f23951a53d18398ef1344c186fb04b26163ca6ce449ebd23404b153fd111ded9"}, 782 | {file = "h5py-2.10.0-cp37-cp37m-win_amd64.whl", hash = "sha256:8bb1d2de101f39743f91512a9750fb6c351c032e5cd3204b4487383e34da7f75"}, 783 | {file = "h5py-2.10.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:64f74da4a1dd0d2042e7d04cf8294e04ddad686f8eba9bb79e517ae582f6668d"}, 784 | {file = "h5py-2.10.0-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:d35f7a3a6cefec82bfdad2785e78359a0e6a5fbb3f605dd5623ce88082ccd681"}, 785 | {file = "h5py-2.10.0-cp38-cp38-win32.whl", hash = "sha256:6ef7ab1089e3ef53ca099038f3c0a94d03e3560e6aff0e9d6c64c55fb13fc681"}, 786 | {file = "h5py-2.10.0-cp38-cp38-win_amd64.whl", hash = "sha256:769e141512b54dee14ec76ed354fcacfc7d97fea5a7646b709f7400cf1838630"}, 787 | {file = "h5py-2.10.0.tar.gz", hash = "sha256:84412798925dc870ffd7107f045d7659e60f5d46d1c70c700375248bf6bf512d"}, 788 | ] 789 | identify = [ 790 | {file = "identify-1.5.13-py2.py3-none-any.whl", hash = "sha256:9dfb63a2e871b807e3ba62f029813552a24b5289504f5b071dea9b041aee9fe4"}, 791 | {file = "identify-1.5.13.tar.gz", hash = "sha256:70b638cf4743f33042bebb3b51e25261a0a10e80f978739f17e7fd4837664a66"}, 792 | ] 793 | idna = [ 794 | {file = "idna-2.10-py2.py3-none-any.whl", hash = "sha256:b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0"}, 795 | {file = "idna-2.10.tar.gz", hash = "sha256:b307872f855b18632ce0c21c5e45be78c0ea7ae4c15c828c20788b26921eb3f6"}, 796 | ] 797 | joblib = [ 798 | {file = "joblib-1.0.0-py3-none-any.whl", hash = "sha256:75ead23f13484a2a414874779d69ade40d4fa1abe62b222a23cd50d4bc822f6f"}, 799 | {file = "joblib-1.0.0.tar.gz", hash = "sha256:7ad866067ac1fdec27d51c8678ea760601b70e32ff1881d4dc8e1171f2b64b24"}, 800 | ] 801 | keras = [ 802 | {file = "Keras-2.4.3-py2.py3-none-any.whl", hash = "sha256:05e2faf6885f7899482a7d18fc00ba9655fe2c9296a35ad96949a07a9c27d1bb"}, 803 | {file = "Keras-2.4.3.tar.gz", hash = "sha256:fedd729b52572fb108a98e3d97e1bac10a81d3917d2103cc20ab2a5f03beb973"}, 804 | ] 805 | keras-preprocessing = [ 806 | {file = "Keras_Preprocessing-1.1.2-py2.py3-none-any.whl", hash = "sha256:7b82029b130ff61cc99b55f3bd27427df4838576838c5b2f65940e4fcec99a7b"}, 807 | {file = "Keras_Preprocessing-1.1.2.tar.gz", hash = "sha256:add82567c50c8bc648c14195bf544a5ce7c1f76761536956c3d2978970179ef3"}, 808 | ] 809 | markdown = [ 810 | {file = "Markdown-3.3.3-py3-none-any.whl", hash = 
"sha256:c109c15b7dc20a9ac454c9e6025927d44460b85bd039da028d85e2b6d0bcc328"}, 811 | {file = "Markdown-3.3.3.tar.gz", hash = "sha256:5d9f2b5ca24bc4c7a390d22323ca4bad200368612b5aaa7796babf971d2b2f18"}, 812 | ] 813 | nltk = [ 814 | {file = "nltk-3.5.zip", hash = "sha256:845365449cd8c5f9731f7cb9f8bd6fd0767553b9d53af9eb1b3abf7700936b35"}, 815 | ] 816 | nodeenv = [ 817 | {file = "nodeenv-1.5.0-py2.py3-none-any.whl", hash = "sha256:5304d424c529c997bc888453aeaa6362d242b6b4631e90f3d4bf1b290f1c84a9"}, 818 | {file = "nodeenv-1.5.0.tar.gz", hash = "sha256:ab45090ae383b716c4ef89e690c41ff8c2b257b85b309f01f3654df3d084bd7c"}, 819 | ] 820 | numpy = [ 821 | {file = "numpy-1.19.5-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:cc6bd4fd593cb261332568485e20a0712883cf631f6f5e8e86a52caa8b2b50ff"}, 822 | {file = "numpy-1.19.5-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:aeb9ed923be74e659984e321f609b9ba54a48354bfd168d21a2b072ed1e833ea"}, 823 | {file = "numpy-1.19.5-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:8b5e972b43c8fc27d56550b4120fe6257fdc15f9301914380b27f74856299fea"}, 824 | {file = "numpy-1.19.5-cp36-cp36m-manylinux2010_i686.whl", hash = "sha256:43d4c81d5ffdff6bae58d66a3cd7f54a7acd9a0e7b18d97abb255defc09e3140"}, 825 | {file = "numpy-1.19.5-cp36-cp36m-manylinux2010_x86_64.whl", hash = "sha256:a4646724fba402aa7504cd48b4b50e783296b5e10a524c7a6da62e4a8ac9698d"}, 826 | {file = "numpy-1.19.5-cp36-cp36m-manylinux2014_aarch64.whl", hash = "sha256:2e55195bc1c6b705bfd8ad6f288b38b11b1af32f3c8289d6c50d47f950c12e76"}, 827 | {file = "numpy-1.19.5-cp36-cp36m-win32.whl", hash = "sha256:39b70c19ec771805081578cc936bbe95336798b7edf4732ed102e7a43ec5c07a"}, 828 | {file = "numpy-1.19.5-cp36-cp36m-win_amd64.whl", hash = "sha256:dbd18bcf4889b720ba13a27ec2f2aac1981bd41203b3a3b27ba7a33f88ae4827"}, 829 | {file = "numpy-1.19.5-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:603aa0706be710eea8884af807b1b3bc9fb2e49b9f4da439e76000f3b3c6ff0f"}, 830 | {file = "numpy-1.19.5-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:cae865b1cae1ec2663d8ea56ef6ff185bad091a5e33ebbadd98de2cfa3fa668f"}, 831 | {file = "numpy-1.19.5-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:36674959eed6957e61f11c912f71e78857a8d0604171dfd9ce9ad5cbf41c511c"}, 832 | {file = "numpy-1.19.5-cp37-cp37m-manylinux2010_i686.whl", hash = "sha256:06fab248a088e439402141ea04f0fffb203723148f6ee791e9c75b3e9e82f080"}, 833 | {file = "numpy-1.19.5-cp37-cp37m-manylinux2010_x86_64.whl", hash = "sha256:6149a185cece5ee78d1d196938b2a8f9d09f5a5ebfbba66969302a778d5ddd1d"}, 834 | {file = "numpy-1.19.5-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:50a4a0ad0111cc1b71fa32dedd05fa239f7fb5a43a40663269bb5dc7877cfd28"}, 835 | {file = "numpy-1.19.5-cp37-cp37m-win32.whl", hash = "sha256:d051ec1c64b85ecc69531e1137bb9751c6830772ee5c1c426dbcfe98ef5788d7"}, 836 | {file = "numpy-1.19.5-cp37-cp37m-win_amd64.whl", hash = "sha256:a12ff4c8ddfee61f90a1633a4c4afd3f7bcb32b11c52026c92a12e1325922d0d"}, 837 | {file = "numpy-1.19.5-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:cf2402002d3d9f91c8b01e66fbb436a4ed01c6498fffed0e4c7566da1d40ee1e"}, 838 | {file = "numpy-1.19.5-cp38-cp38-manylinux1_i686.whl", hash = "sha256:1ded4fce9cfaaf24e7a0ab51b7a87be9038ea1ace7f34b841fe3b6894c721d1c"}, 839 | {file = "numpy-1.19.5-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:012426a41bc9ab63bb158635aecccc7610e3eff5d31d1eb43bc099debc979d94"}, 840 | {file = "numpy-1.19.5-cp38-cp38-manylinux2010_i686.whl", hash = "sha256:759e4095edc3c1b3ac031f34d9459fa781777a93ccc633a472a5468587a190ff"}, 841 | {file = 
"numpy-1.19.5-cp38-cp38-manylinux2010_x86_64.whl", hash = "sha256:a9d17f2be3b427fbb2bce61e596cf555d6f8a56c222bd2ca148baeeb5e5c783c"}, 842 | {file = "numpy-1.19.5-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:99abf4f353c3d1a0c7a5f27699482c987cf663b1eac20db59b8c7b061eabd7fc"}, 843 | {file = "numpy-1.19.5-cp38-cp38-win32.whl", hash = "sha256:384ec0463d1c2671170901994aeb6dce126de0a95ccc3976c43b0038a37329c2"}, 844 | {file = "numpy-1.19.5-cp38-cp38-win_amd64.whl", hash = "sha256:811daee36a58dc79cf3d8bdd4a490e4277d0e4b7d103a001a4e73ddb48e7e6aa"}, 845 | {file = "numpy-1.19.5-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:c843b3f50d1ab7361ca4f0b3639bf691569493a56808a0b0c54a051d260b7dbd"}, 846 | {file = "numpy-1.19.5-cp39-cp39-manylinux1_i686.whl", hash = "sha256:d6631f2e867676b13026e2846180e2c13c1e11289d67da08d71cacb2cd93d4aa"}, 847 | {file = "numpy-1.19.5-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:7fb43004bce0ca31d8f13a6eb5e943fa73371381e53f7074ed21a4cb786c32f8"}, 848 | {file = "numpy-1.19.5-cp39-cp39-manylinux2010_i686.whl", hash = "sha256:2ea52bd92ab9f768cc64a4c3ef8f4b2580a17af0a5436f6126b08efbd1838371"}, 849 | {file = "numpy-1.19.5-cp39-cp39-manylinux2010_x86_64.whl", hash = "sha256:400580cbd3cff6ffa6293df2278c75aef2d58d8d93d3c5614cd67981dae68ceb"}, 850 | {file = "numpy-1.19.5-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:df609c82f18c5b9f6cb97271f03315ff0dbe481a2a02e56aeb1b1a985ce38e60"}, 851 | {file = "numpy-1.19.5-cp39-cp39-win32.whl", hash = "sha256:ab83f24d5c52d60dbc8cd0528759532736b56db58adaa7b5f1f76ad551416a1e"}, 852 | {file = "numpy-1.19.5-cp39-cp39-win_amd64.whl", hash = "sha256:0eef32ca3132a48e43f6a0f5a82cb508f22ce5a3d6f67a8329c81c8e226d3f6e"}, 853 | {file = "numpy-1.19.5-pp36-pypy36_pp73-manylinux2010_x86_64.whl", hash = "sha256:a0d53e51a6cb6f0d9082decb7a4cb6dfb33055308c4c44f53103c073f649af73"}, 854 | {file = "numpy-1.19.5.zip", hash = "sha256:a76f502430dd98d7546e1ea2250a7360c065a5fdea52b2dffe8ae7180909b6f4"}, 855 | ] 856 | oauthlib = [ 857 | {file = "oauthlib-3.1.0-py2.py3-none-any.whl", hash = "sha256:df884cd6cbe20e32633f1db1072e9356f53638e4361bef4e8b03c9127c9328ea"}, 858 | {file = "oauthlib-3.1.0.tar.gz", hash = "sha256:bee41cc35fcca6e988463cacc3bcb8a96224f470ca547e697b604cc697b2f889"}, 859 | ] 860 | opt-einsum = [ 861 | {file = "opt_einsum-3.3.0-py3-none-any.whl", hash = "sha256:2455e59e3947d3c275477df7f5205b30635e266fe6dc300e3d9f9646bfcea147"}, 862 | {file = "opt_einsum-3.3.0.tar.gz", hash = "sha256:59f6475f77bbc37dcf7cd748519c0ec60722e91e63ca114e68821c0c54a46549"}, 863 | ] 864 | pandas = [ 865 | {file = "pandas-1.2.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:50e6c0a17ef7f831b5565fd0394dbf9bfd5d615ee4dd4bb60a3d8c9d2e872323"}, 866 | {file = "pandas-1.2.1-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:324e60bea729cf3b55c1bf9e88fe8b9932c26f8669d13b928e3c96b3a1453dff"}, 867 | {file = "pandas-1.2.1-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:37443199f451f8badfe0add666e43cdb817c59fa36bceedafd9c543a42f236ca"}, 868 | {file = "pandas-1.2.1-cp37-cp37m-win32.whl", hash = "sha256:23ac77a3a222d9304cb2a7934bb7b4805ff43d513add7a42d1a22dc7df14edd2"}, 869 | {file = "pandas-1.2.1-cp37-cp37m-win_amd64.whl", hash = "sha256:496fcc29321e9a804d56d5aa5d7ec1320edfd1898eee2f451aa70171cf1d5a29"}, 870 | {file = "pandas-1.2.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:30e9e8bc8c5c17c03d943e8d6f778313efff59e413b8dbdd8214c2ed9aa165f6"}, 871 | {file = "pandas-1.2.1-cp38-cp38-manylinux1_i686.whl", hash = 
"sha256:055647e7f4c5e66ba92c2a7dcae6c2c57898b605a3fb007745df61cc4015937f"}, 872 | {file = "pandas-1.2.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:9d45f58b03af1fea4b48e44aa38a819a33dccb9821ef9e1d68f529995f8a632f"}, 873 | {file = "pandas-1.2.1-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:b26e2dabda73d347c7af3e6fed58483161c7b87a886a4e06d76ccfe55a044aa9"}, 874 | {file = "pandas-1.2.1-cp38-cp38-win32.whl", hash = "sha256:47ec0808a8357ab3890ce0eca39a63f79dcf941e2e7f494470fe1c9ec43f6091"}, 875 | {file = "pandas-1.2.1-cp38-cp38-win_amd64.whl", hash = "sha256:57d5c7ac62925a8d2ab43ea442b297a56cc8452015e71e24f4aa7e4ed6be3d77"}, 876 | {file = "pandas-1.2.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:d7cca42dba13bfee369e2944ae31f6549a55831cba3117e17636955176004088"}, 877 | {file = "pandas-1.2.1-cp39-cp39-manylinux1_i686.whl", hash = "sha256:cfd237865d878da9b65cfee883da5e0067f5e2ff839e459466fb90565a77bda3"}, 878 | {file = "pandas-1.2.1-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:050ed2c9d825ef36738e018454e6d055c63d947c1d52010fbadd7584f09df5db"}, 879 | {file = "pandas-1.2.1-cp39-cp39-win32.whl", hash = "sha256:fe7de6fed43e7d086e3d947651ec89e55ddf00102f9dd5758763d56d182f0564"}, 880 | {file = "pandas-1.2.1-cp39-cp39-win_amd64.whl", hash = "sha256:2de012a36cc507debd9c3351b4d757f828d5a784a5fc4e6766eafc2b56e4b0f5"}, 881 | {file = "pandas-1.2.1.tar.gz", hash = "sha256:5527c5475d955c0bc9689c56865aaa2a7b13c504d6c44f0aadbf57b565af5ebd"}, 882 | ] 883 | pre-commit = [ 884 | {file = "pre_commit-2.10.0-py2.py3-none-any.whl", hash = "sha256:391ed331fdd0a21d0be48c1b9919921e9d372dfd60f6dc77b8f01dd6b13161c1"}, 885 | {file = "pre_commit-2.10.0.tar.gz", hash = "sha256:f413348d3a8464b77987e36ef6e02c3372dadb823edf0dfe6fb0c3dc2f378ef9"}, 886 | ] 887 | protobuf = [ 888 | {file = "protobuf-3.14.0-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:629b03fd3caae7f815b0c66b41273f6b1900a579e2ccb41ef4493a4f5fb84f3a"}, 889 | {file = "protobuf-3.14.0-cp27-cp27mu-manylinux1_x86_64.whl", hash = "sha256:5b7a637212cc9b2bcf85dd828b1178d19efdf74dbfe1ddf8cd1b8e01fdaaa7f5"}, 890 | {file = "protobuf-3.14.0-cp35-cp35m-macosx_10_9_intel.whl", hash = "sha256:43b554b9e73a07ba84ed6cf25db0ff88b1e06be610b37656e292e3cbb5437472"}, 891 | {file = "protobuf-3.14.0-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:5e9806a43232a1fa0c9cf5da8dc06f6910d53e4390be1fa06f06454d888a9142"}, 892 | {file = "protobuf-3.14.0-cp35-cp35m-win32.whl", hash = "sha256:1c51fda1bbc9634246e7be6016d860be01747354ed7015ebe38acf4452f470d2"}, 893 | {file = "protobuf-3.14.0-cp35-cp35m-win_amd64.whl", hash = "sha256:4b74301b30513b1a7494d3055d95c714b560fbb630d8fb9956b6f27992c9f980"}, 894 | {file = "protobuf-3.14.0-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:86a75477addde4918e9a1904e5c6af8d7b691f2a3f65587d73b16100fbe4c3b2"}, 895 | {file = "protobuf-3.14.0-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:ecc33531a213eee22ad60e0e2aaea6c8ba0021f0cce35dbf0ab03dee6e2a23a1"}, 896 | {file = "protobuf-3.14.0-cp36-cp36m-win32.whl", hash = "sha256:72230ed56f026dd664c21d73c5db73ebba50d924d7ba6b7c0d81a121e390406e"}, 897 | {file = "protobuf-3.14.0-cp36-cp36m-win_amd64.whl", hash = "sha256:0fc96785262042e4863b3f3b5c429d4636f10d90061e1840fce1baaf59b1a836"}, 898 | {file = "protobuf-3.14.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:4e75105c9dfe13719b7293f75bd53033108f4ba03d44e71db0ec2a0e8401eafd"}, 899 | {file = "protobuf-3.14.0-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:2a7e2fe101a7ace75e9327b9c946d247749e564a267b0515cf41dfe450b69bac"}, 900 | {file = 
"protobuf-3.14.0-cp37-cp37m-win32.whl", hash = "sha256:b0d5d35faeb07e22a1ddf8dce620860c8fe145426c02d1a0ae2688c6e8ede36d"}, 901 | {file = "protobuf-3.14.0-cp37-cp37m-win_amd64.whl", hash = "sha256:8971c421dbd7aad930c9bd2694122f332350b6ccb5202a8b7b06f3f1a5c41ed5"}, 902 | {file = "protobuf-3.14.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:9616f0b65a30851e62f1713336c931fcd32c057202b7ff2cfbfca0fc7d5e3043"}, 903 | {file = "protobuf-3.14.0-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:22bcd2e284b3b1d969c12e84dc9b9a71701ec82d8ce975fdda19712e1cfd4e00"}, 904 | {file = "protobuf-3.14.0-py2.py3-none-any.whl", hash = "sha256:0e247612fadda953047f53301a7b0407cb0c3cb4ae25a6fde661597a04039b3c"}, 905 | {file = "protobuf-3.14.0.tar.gz", hash = "sha256:1d63eb389347293d8915fb47bee0951c7b5dab522a4a60118b9a18f33e21f8ce"}, 906 | ] 907 | pyasn1 = [ 908 | {file = "pyasn1-0.4.8-py2.4.egg", hash = "sha256:fec3e9d8e36808a28efb59b489e4528c10ad0f480e57dcc32b4de5c9d8c9fdf3"}, 909 | {file = "pyasn1-0.4.8-py2.5.egg", hash = "sha256:0458773cfe65b153891ac249bcf1b5f8f320b7c2ce462151f8fa74de8934becf"}, 910 | {file = "pyasn1-0.4.8-py2.6.egg", hash = "sha256:5c9414dcfede6e441f7e8f81b43b34e834731003427e5b09e4e00e3172a10f00"}, 911 | {file = "pyasn1-0.4.8-py2.7.egg", hash = "sha256:6e7545f1a61025a4e58bb336952c5061697da694db1cae97b116e9c46abcf7c8"}, 912 | {file = "pyasn1-0.4.8-py2.py3-none-any.whl", hash = "sha256:39c7e2ec30515947ff4e87fb6f456dfc6e84857d34be479c9d4a4ba4bf46aa5d"}, 913 | {file = "pyasn1-0.4.8-py3.1.egg", hash = "sha256:78fa6da68ed2727915c4767bb386ab32cdba863caa7dbe473eaae45f9959da86"}, 914 | {file = "pyasn1-0.4.8-py3.2.egg", hash = "sha256:08c3c53b75eaa48d71cf8c710312316392ed40899cb34710d092e96745a358b7"}, 915 | {file = "pyasn1-0.4.8-py3.3.egg", hash = "sha256:03840c999ba71680a131cfaee6fab142e1ed9bbd9c693e285cc6aca0d555e576"}, 916 | {file = "pyasn1-0.4.8-py3.4.egg", hash = "sha256:7ab8a544af125fb704feadb008c99a88805126fb525280b2270bb25cc1d78a12"}, 917 | {file = "pyasn1-0.4.8-py3.5.egg", hash = "sha256:e89bf84b5437b532b0803ba5c9a5e054d21fec423a89952a74f87fa2c9b7bce2"}, 918 | {file = "pyasn1-0.4.8-py3.6.egg", hash = "sha256:014c0e9976956a08139dc0712ae195324a75e142284d5f87f1a87ee1b068a359"}, 919 | {file = "pyasn1-0.4.8-py3.7.egg", hash = "sha256:99fcc3c8d804d1bc6d9a099921e39d827026409a58f2a720dcdb89374ea0c776"}, 920 | {file = "pyasn1-0.4.8.tar.gz", hash = "sha256:aef77c9fb94a3ac588e87841208bdec464471d9871bd5050a287cc9a475cd0ba"}, 921 | ] 922 | pyasn1-modules = [ 923 | {file = "pyasn1-modules-0.2.8.tar.gz", hash = "sha256:905f84c712230b2c592c19470d3ca8d552de726050d1d1716282a1f6146be65e"}, 924 | {file = "pyasn1_modules-0.2.8-py2.4.egg", hash = "sha256:0fe1b68d1e486a1ed5473f1302bd991c1611d319bba158e98b106ff86e1d7199"}, 925 | {file = "pyasn1_modules-0.2.8-py2.5.egg", hash = "sha256:fe0644d9ab041506b62782e92b06b8c68cca799e1a9636ec398675459e031405"}, 926 | {file = "pyasn1_modules-0.2.8-py2.6.egg", hash = "sha256:a99324196732f53093a84c4369c996713eb8c89d360a496b599fb1a9c47fc3eb"}, 927 | {file = "pyasn1_modules-0.2.8-py2.7.egg", hash = "sha256:0845a5582f6a02bb3e1bde9ecfc4bfcae6ec3210dd270522fee602365430c3f8"}, 928 | {file = "pyasn1_modules-0.2.8-py2.py3-none-any.whl", hash = "sha256:a50b808ffeb97cb3601dd25981f6b016cbb3d31fbf57a8b8a87428e6158d0c74"}, 929 | {file = "pyasn1_modules-0.2.8-py3.1.egg", hash = "sha256:f39edd8c4ecaa4556e989147ebf219227e2cd2e8a43c7e7fcb1f1c18c5fd6a3d"}, 930 | {file = "pyasn1_modules-0.2.8-py3.2.egg", hash = "sha256:b80486a6c77252ea3a3e9b1e360bc9cf28eaac41263d173c032581ad2f20fe45"}, 931 | {file 
= "pyasn1_modules-0.2.8-py3.3.egg", hash = "sha256:65cebbaffc913f4fe9e4808735c95ea22d7a7775646ab690518c056784bc21b4"}, 932 | {file = "pyasn1_modules-0.2.8-py3.4.egg", hash = "sha256:15b7c67fabc7fc240d87fb9aabf999cf82311a6d6fb2c70d00d3d0604878c811"}, 933 | {file = "pyasn1_modules-0.2.8-py3.5.egg", hash = "sha256:426edb7a5e8879f1ec54a1864f16b882c2837bfd06eee62f2c982315ee2473ed"}, 934 | {file = "pyasn1_modules-0.2.8-py3.6.egg", hash = "sha256:cbac4bc38d117f2a49aeedec4407d23e8866ea4ac27ff2cf7fb3e5b570df19e0"}, 935 | {file = "pyasn1_modules-0.2.8-py3.7.egg", hash = "sha256:c29a5e5cc7a3f05926aff34e097e84f8589cd790ce0ed41b67aed6857b26aafd"}, 936 | ] 937 | python-dateutil = [ 938 | {file = "python-dateutil-2.8.1.tar.gz", hash = "sha256:73ebfe9dbf22e832286dafa60473e4cd239f8592f699aa5adaf10050e6e1823c"}, 939 | {file = "python_dateutil-2.8.1-py2.py3-none-any.whl", hash = "sha256:75bb3f31ea686f1197762692a9ee6a7550b59fc6ca3a1f4b5d7e32fb98e2da2a"}, 940 | ] 941 | pytz = [ 942 | {file = "pytz-2020.5-py2.py3-none-any.whl", hash = "sha256:16962c5fb8db4a8f63a26646d8886e9d769b6c511543557bc84e9569fb9a9cb4"}, 943 | {file = "pytz-2020.5.tar.gz", hash = "sha256:180befebb1927b16f6b57101720075a984c019ac16b1b7575673bea42c6c3da5"}, 944 | ] 945 | pyyaml = [ 946 | {file = "PyYAML-5.4.1-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:3b2b1824fe7112845700f815ff6a489360226a5609b96ec2190a45e62a9fc922"}, 947 | {file = "PyYAML-5.4.1-cp27-cp27m-win32.whl", hash = "sha256:129def1b7c1bf22faffd67b8f3724645203b79d8f4cc81f674654d9902cb4393"}, 948 | {file = "PyYAML-5.4.1-cp27-cp27m-win_amd64.whl", hash = "sha256:4465124ef1b18d9ace298060f4eccc64b0850899ac4ac53294547536533800c8"}, 949 | {file = "PyYAML-5.4.1-cp27-cp27mu-manylinux1_x86_64.whl", hash = "sha256:bb4191dfc9306777bc594117aee052446b3fa88737cd13b7188d0e7aa8162185"}, 950 | {file = "PyYAML-5.4.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:6c78645d400265a062508ae399b60b8c167bf003db364ecb26dcab2bda048253"}, 951 | {file = "PyYAML-5.4.1-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:4e0583d24c881e14342eaf4ec5fbc97f934b999a6828693a99157fde912540cc"}, 952 | {file = "PyYAML-5.4.1-cp36-cp36m-win32.whl", hash = "sha256:3bd0e463264cf257d1ffd2e40223b197271046d09dadf73a0fe82b9c1fc385a5"}, 953 | {file = "PyYAML-5.4.1-cp36-cp36m-win_amd64.whl", hash = "sha256:e4fac90784481d221a8e4b1162afa7c47ed953be40d31ab4629ae917510051df"}, 954 | {file = "PyYAML-5.4.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:5accb17103e43963b80e6f837831f38d314a0495500067cb25afab2e8d7a4018"}, 955 | {file = "PyYAML-5.4.1-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:e1d4970ea66be07ae37a3c2e48b5ec63f7ba6804bdddfdbd3cfd954d25a82e63"}, 956 | {file = "PyYAML-5.4.1-cp37-cp37m-win32.whl", hash = "sha256:dd5de0646207f053eb0d6c74ae45ba98c3395a571a2891858e87df7c9b9bd51b"}, 957 | {file = "PyYAML-5.4.1-cp37-cp37m-win_amd64.whl", hash = "sha256:08682f6b72c722394747bddaf0aa62277e02557c0fd1c42cb853016a38f8dedf"}, 958 | {file = "PyYAML-5.4.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:d2d9808ea7b4af864f35ea216be506ecec180628aced0704e34aca0b040ffe46"}, 959 | {file = "PyYAML-5.4.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:8c1be557ee92a20f184922c7b6424e8ab6691788e6d86137c5d93c1a6ec1b8fb"}, 960 | {file = "PyYAML-5.4.1-cp38-cp38-win32.whl", hash = "sha256:fa5ae20527d8e831e8230cbffd9f8fe952815b2b7dae6ffec25318803a7528fc"}, 961 | {file = "PyYAML-5.4.1-cp38-cp38-win_amd64.whl", hash = "sha256:0f5f5786c0e09baddcd8b4b45f20a7b5d61a7e7e99846e3c799b05c7c53fa696"}, 962 | {file = 
"PyYAML-5.4.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:294db365efa064d00b8d1ef65d8ea2c3426ac366c0c4368d930bf1c5fb497f77"}, 963 | {file = "PyYAML-5.4.1-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:74c1485f7707cf707a7aef42ef6322b8f97921bd89be2ab6317fd782c2d53183"}, 964 | {file = "PyYAML-5.4.1-cp39-cp39-win32.whl", hash = "sha256:49d4cdd9065b9b6e206d0595fee27a96b5dd22618e7520c33204a4a3239d5b10"}, 965 | {file = "PyYAML-5.4.1-cp39-cp39-win_amd64.whl", hash = "sha256:c20cfa2d49991c8b4147af39859b167664f2ad4561704ee74c1de03318e898db"}, 966 | {file = "PyYAML-5.4.1.tar.gz", hash = "sha256:607774cbba28732bfa802b54baa7484215f530991055bb562efbed5b2f20a45e"}, 967 | ] 968 | regex = [ 969 | {file = "regex-2020.11.13-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:8b882a78c320478b12ff024e81dc7d43c1462aa4a3341c754ee65d857a521f85"}, 970 | {file = "regex-2020.11.13-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:a63f1a07932c9686d2d416fb295ec2c01ab246e89b4d58e5fa468089cab44b70"}, 971 | {file = "regex-2020.11.13-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:6e4b08c6f8daca7d8f07c8d24e4331ae7953333dbd09c648ed6ebd24db5a10ee"}, 972 | {file = "regex-2020.11.13-cp36-cp36m-manylinux2010_i686.whl", hash = "sha256:bba349276b126947b014e50ab3316c027cac1495992f10e5682dc677b3dfa0c5"}, 973 | {file = "regex-2020.11.13-cp36-cp36m-manylinux2010_x86_64.whl", hash = "sha256:56e01daca75eae420bce184edd8bb341c8eebb19dd3bce7266332258f9fb9dd7"}, 974 | {file = "regex-2020.11.13-cp36-cp36m-manylinux2014_aarch64.whl", hash = "sha256:6a8ce43923c518c24a2579fda49f093f1397dad5d18346211e46f134fc624e31"}, 975 | {file = "regex-2020.11.13-cp36-cp36m-manylinux2014_i686.whl", hash = "sha256:1ab79fcb02b930de09c76d024d279686ec5d532eb814fd0ed1e0051eb8bd2daa"}, 976 | {file = "regex-2020.11.13-cp36-cp36m-manylinux2014_x86_64.whl", hash = "sha256:9801c4c1d9ae6a70aeb2128e5b4b68c45d4f0af0d1535500884d644fa9b768c6"}, 977 | {file = "regex-2020.11.13-cp36-cp36m-win32.whl", hash = "sha256:49cae022fa13f09be91b2c880e58e14b6da5d10639ed45ca69b85faf039f7a4e"}, 978 | {file = "regex-2020.11.13-cp36-cp36m-win_amd64.whl", hash = "sha256:749078d1eb89484db5f34b4012092ad14b327944ee7f1c4f74d6279a6e4d1884"}, 979 | {file = "regex-2020.11.13-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:b2f4007bff007c96a173e24dcda236e5e83bde4358a557f9ccf5e014439eae4b"}, 980 | {file = "regex-2020.11.13-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:38c8fd190db64f513fe4e1baa59fed086ae71fa45083b6936b52d34df8f86a88"}, 981 | {file = "regex-2020.11.13-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:5862975b45d451b6db51c2e654990c1820523a5b07100fc6903e9c86575202a0"}, 982 | {file = "regex-2020.11.13-cp37-cp37m-manylinux2010_i686.whl", hash = "sha256:262c6825b309e6485ec2493ffc7e62a13cf13fb2a8b6d212f72bd53ad34118f1"}, 983 | {file = "regex-2020.11.13-cp37-cp37m-manylinux2010_x86_64.whl", hash = "sha256:bafb01b4688833e099d79e7efd23f99172f501a15c44f21ea2118681473fdba0"}, 984 | {file = "regex-2020.11.13-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:e32f5f3d1b1c663af7f9c4c1e72e6ffe9a78c03a31e149259f531e0fed826512"}, 985 | {file = "regex-2020.11.13-cp37-cp37m-manylinux2014_i686.whl", hash = "sha256:3bddc701bdd1efa0d5264d2649588cbfda549b2899dc8d50417e47a82e1387ba"}, 986 | {file = "regex-2020.11.13-cp37-cp37m-manylinux2014_x86_64.whl", hash = "sha256:02951b7dacb123d8ea6da44fe45ddd084aa6777d4b2454fa0da61d569c6fa538"}, 987 | {file = "regex-2020.11.13-cp37-cp37m-win32.whl", hash = "sha256:0d08e71e70c0237883d0bef12cad5145b84c3705e9c6a588b2a9c7080e5af2a4"}, 988 | {file = 
"regex-2020.11.13-cp37-cp37m-win_amd64.whl", hash = "sha256:1fa7ee9c2a0e30405e21031d07d7ba8617bc590d391adfc2b7f1e8b99f46f444"}, 989 | {file = "regex-2020.11.13-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:baf378ba6151f6e272824b86a774326f692bc2ef4cc5ce8d5bc76e38c813a55f"}, 990 | {file = "regex-2020.11.13-cp38-cp38-manylinux1_i686.whl", hash = "sha256:e3faaf10a0d1e8e23a9b51d1900b72e1635c2d5b0e1bea1c18022486a8e2e52d"}, 991 | {file = "regex-2020.11.13-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:2a11a3e90bd9901d70a5b31d7dd85114755a581a5da3fc996abfefa48aee78af"}, 992 | {file = "regex-2020.11.13-cp38-cp38-manylinux2010_i686.whl", hash = "sha256:d1ebb090a426db66dd80df8ca85adc4abfcbad8a7c2e9a5ec7513ede522e0a8f"}, 993 | {file = "regex-2020.11.13-cp38-cp38-manylinux2010_x86_64.whl", hash = "sha256:b2b1a5ddae3677d89b686e5c625fc5547c6e492bd755b520de5332773a8af06b"}, 994 | {file = "regex-2020.11.13-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:2c99e97d388cd0a8d30f7c514d67887d8021541b875baf09791a3baad48bb4f8"}, 995 | {file = "regex-2020.11.13-cp38-cp38-manylinux2014_i686.whl", hash = "sha256:c084582d4215593f2f1d28b65d2a2f3aceff8342aa85afd7be23a9cad74a0de5"}, 996 | {file = "regex-2020.11.13-cp38-cp38-manylinux2014_x86_64.whl", hash = "sha256:a3d748383762e56337c39ab35c6ed4deb88df5326f97a38946ddd19028ecce6b"}, 997 | {file = "regex-2020.11.13-cp38-cp38-win32.whl", hash = "sha256:7913bd25f4ab274ba37bc97ad0e21c31004224ccb02765ad984eef43e04acc6c"}, 998 | {file = "regex-2020.11.13-cp38-cp38-win_amd64.whl", hash = "sha256:6c54ce4b5d61a7129bad5c5dc279e222afd00e721bf92f9ef09e4fae28755683"}, 999 | {file = "regex-2020.11.13-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:1862a9d9194fae76a7aaf0150d5f2a8ec1da89e8b55890b1786b8f88a0f619dc"}, 1000 | {file = "regex-2020.11.13-cp39-cp39-manylinux1_i686.whl", hash = "sha256:4902e6aa086cbb224241adbc2f06235927d5cdacffb2425c73e6570e8d862364"}, 1001 | {file = "regex-2020.11.13-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:7a25fcbeae08f96a754b45bdc050e1fb94b95cab046bf56b016c25e9ab127b3e"}, 1002 | {file = "regex-2020.11.13-cp39-cp39-manylinux2010_i686.whl", hash = "sha256:d2d8ce12b7c12c87e41123997ebaf1a5767a5be3ec545f64675388970f415e2e"}, 1003 | {file = "regex-2020.11.13-cp39-cp39-manylinux2010_x86_64.whl", hash = "sha256:f7d29a6fc4760300f86ae329e3b6ca28ea9c20823df123a2ea8693e967b29917"}, 1004 | {file = "regex-2020.11.13-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:717881211f46de3ab130b58ec0908267961fadc06e44f974466d1887f865bd5b"}, 1005 | {file = "regex-2020.11.13-cp39-cp39-manylinux2014_i686.whl", hash = "sha256:3128e30d83f2e70b0bed9b2a34e92707d0877e460b402faca908c6667092ada9"}, 1006 | {file = "regex-2020.11.13-cp39-cp39-manylinux2014_x86_64.whl", hash = "sha256:8f6a2229e8ad946e36815f2a03386bb8353d4bde368fdf8ca5f0cb97264d3b5c"}, 1007 | {file = "regex-2020.11.13-cp39-cp39-win32.whl", hash = "sha256:f8f295db00ef5f8bae530fc39af0b40486ca6068733fb860b42115052206466f"}, 1008 | {file = "regex-2020.11.13-cp39-cp39-win_amd64.whl", hash = "sha256:a15f64ae3a027b64496a71ab1f722355e570c3fac5ba2801cafce846bf5af01d"}, 1009 | {file = "regex-2020.11.13.tar.gz", hash = "sha256:83d6b356e116ca119db8e7c6fc2983289d87b27b3fac238cfe5dca529d884562"}, 1010 | ] 1011 | requests = [ 1012 | {file = "requests-2.25.1-py2.py3-none-any.whl", hash = "sha256:c210084e36a42ae6b9219e00e48287def368a26d03a048ddad7bfee44f75871e"}, 1013 | {file = "requests-2.25.1.tar.gz", hash = "sha256:27973dd4a904a4f13b263a19c866c13b92a39ed1c964655f025f3f8d3d75b804"}, 1014 | ] 1015 | requests-oauthlib = [ 
1016 | {file = "requests-oauthlib-1.3.0.tar.gz", hash = "sha256:b4261601a71fd721a8bd6d7aa1cc1d6a8a93b4a9f5e96626f8e4d91e8beeaa6a"}, 1017 | {file = "requests_oauthlib-1.3.0-py2.py3-none-any.whl", hash = "sha256:7f71572defaecd16372f9006f33c2ec8c077c3cfa6f5911a9a90202beb513f3d"}, 1018 | {file = "requests_oauthlib-1.3.0-py3.7.egg", hash = "sha256:fa6c47b933f01060936d87ae9327fead68768b69c6c9ea2109c48be30f2d4dbc"}, 1019 | ] 1020 | rsa = [ 1021 | {file = "rsa-4.7-py3-none-any.whl", hash = "sha256:a8774e55b59fd9fc893b0d05e9bfc6f47081f46ff5b46f39ccf24631b7be356b"}, 1022 | {file = "rsa-4.7.tar.gz", hash = "sha256:69805d6b69f56eb05b62daea3a7dbd7aa44324ad1306445e05da8060232d00f4"}, 1023 | ] 1024 | scikit-learn = [ 1025 | {file = "scikit-learn-0.24.1.tar.gz", hash = "sha256:a0334a1802e64d656022c3bfab56a73fbd6bf4b1298343f3688af2151810bbdf"}, 1026 | {file = "scikit_learn-0.24.1-cp36-cp36m-macosx_10_13_x86_64.whl", hash = "sha256:9bed8a1ef133c8e2f13966a542cb8125eac7f4b67dcd234197c827ba9c7dd3e0"}, 1027 | {file = "scikit_learn-0.24.1-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:a36e159a0521e13bbe15ca8c8d038b3a1dd4c7dad18d276d76992e03b92cf643"}, 1028 | {file = "scikit_learn-0.24.1-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:c658432d8a20e95398f6bb95ff9731ce9dfa343fdf21eea7ec6a7edfacd4b4d9"}, 1029 | {file = "scikit_learn-0.24.1-cp36-cp36m-manylinux2010_i686.whl", hash = "sha256:9dfa564ef27e8e674aa1cc74378416d580ac4ede1136c13dd555a87996e13422"}, 1030 | {file = "scikit_learn-0.24.1-cp36-cp36m-manylinux2010_x86_64.whl", hash = "sha256:9c6097b6a9b2bafc5e0f31f659e6ab5e131383209c30c9e978c5b8abdac5ed2a"}, 1031 | {file = "scikit_learn-0.24.1-cp36-cp36m-win32.whl", hash = "sha256:7b04691eb2f41d2c68dbda8d1bd3cb4ef421bdc43aaa56aeb6c762224552dfb6"}, 1032 | {file = "scikit_learn-0.24.1-cp36-cp36m-win_amd64.whl", hash = "sha256:1adf483e91007a87171d7ce58c34b058eb5dab01b5fee6052f15841778a8ecd8"}, 1033 | {file = "scikit_learn-0.24.1-cp37-cp37m-macosx_10_13_x86_64.whl", hash = "sha256:ddb52d088889f5596bc4d1de981f2eca106b58243b6679e4782f3ba5096fd645"}, 1034 | {file = "scikit_learn-0.24.1-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:a29460499c1e62b7a830bb57ca42e615375a6ab1bcad053cd25b493588348ea8"}, 1035 | {file = "scikit_learn-0.24.1-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:0567a2d29ad08af98653300c623bd8477b448fe66ced7198bef4ed195925f082"}, 1036 | {file = "scikit_learn-0.24.1-cp37-cp37m-manylinux2010_i686.whl", hash = "sha256:99349d77f54e11f962d608d94dfda08f0c9e5720d97132233ebdf35be2858b2d"}, 1037 | {file = "scikit_learn-0.24.1-cp37-cp37m-manylinux2010_x86_64.whl", hash = "sha256:83b21ff053b1ff1c018a2d24db6dd3ea339b1acfbaa4d9c881731f43748d8b3b"}, 1038 | {file = "scikit_learn-0.24.1-cp37-cp37m-win32.whl", hash = "sha256:c3deb3b19dd9806acf00cf0d400e84562c227723013c33abefbbc3cf906596e9"}, 1039 | {file = "scikit_learn-0.24.1-cp37-cp37m-win_amd64.whl", hash = "sha256:d54dbaadeb1425b7d6a66bf44bee2bb2b899fe3e8850b8e94cfb9c904dcb46d0"}, 1040 | {file = "scikit_learn-0.24.1-cp38-cp38-macosx_10_13_x86_64.whl", hash = "sha256:3c4f07f47c04e81b134424d53c3f5e16dfd7f494e44fd7584ba9ce9de2c5e6c1"}, 1041 | {file = "scikit_learn-0.24.1-cp38-cp38-manylinux1_i686.whl", hash = "sha256:c13ebac42236b1c46397162471ea1c46af68413000e28b9309f8c05722c65a09"}, 1042 | {file = "scikit_learn-0.24.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:4ddd2b6f7449a5d539ff754fa92d75da22de261fd8fdcfb3596799fadf255101"}, 1043 | {file = "scikit_learn-0.24.1-cp38-cp38-manylinux2010_i686.whl", hash = 
"sha256:826b92bf45b8ad80444814e5f4ac032156dd481e48d7da33d611f8fe96d5f08b"}, 1044 | {file = "scikit_learn-0.24.1-cp38-cp38-manylinux2010_x86_64.whl", hash = "sha256:259ec35201e82e2db1ae2496f229e63f46d7f1695ae68eef9350b00dc74ba52f"}, 1045 | {file = "scikit_learn-0.24.1-cp38-cp38-win32.whl", hash = "sha256:8772b99d683be8f67fcc04789032f1b949022a0e6880ee7b75a7ec97dbbb5d0b"}, 1046 | {file = "scikit_learn-0.24.1-cp38-cp38-win_amd64.whl", hash = "sha256:ed9d65594948678827f4ff0e7ae23344e2f2b4cabbca057ccaed3118fdc392ca"}, 1047 | {file = "scikit_learn-0.24.1-cp39-cp39-macosx_10_13_x86_64.whl", hash = "sha256:8aa1b3ac46b80eaa552b637eeadbbce3be5931e4b5002b964698e33a1b589e1e"}, 1048 | {file = "scikit_learn-0.24.1-cp39-cp39-manylinux1_i686.whl", hash = "sha256:c7f4eb77504ac586d8ac1bde1b0c04b504487210f95297235311a0ab7edd7e38"}, 1049 | {file = "scikit_learn-0.24.1-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:087dfede39efb06ab30618f9ab55a0397f29c38d63cd0ab88d12b500b7d65fd7"}, 1050 | {file = "scikit_learn-0.24.1-cp39-cp39-manylinux2010_i686.whl", hash = "sha256:895dbf2030aa7337649e36a83a007df3c9811396b4e2fa672a851160f36ce90c"}, 1051 | {file = "scikit_learn-0.24.1-cp39-cp39-manylinux2010_x86_64.whl", hash = "sha256:9a24d1ccec2a34d4cd3f2a1f86409f3f5954cc23d4d2270ba0d03cf018aa4780"}, 1052 | {file = "scikit_learn-0.24.1-cp39-cp39-win32.whl", hash = "sha256:fab31f48282ebf54dd69f6663cd2d9800096bad1bb67bbc9c9ac84eb77b41972"}, 1053 | {file = "scikit_learn-0.24.1-cp39-cp39-win_amd64.whl", hash = "sha256:4562dcf4793e61c5d0f89836d07bc37521c3a1889da8f651e2c326463c4bd697"}, 1054 | ] 1055 | scipy = [ 1056 | {file = "scipy-1.6.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:3d4303e3e21d07d9557b26a1707bb9fc065510ee8501c9bf22a0157249a82fd0"}, 1057 | {file = "scipy-1.6.0-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:1bc5b446600c4ff7ab36bade47180673141322f0febaa555f1c433fe04f2a0e3"}, 1058 | {file = "scipy-1.6.0-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:8840a9adb4ede3751f49761653d3ebf664f25195fdd42ada394ffea8903dd51d"}, 1059 | {file = "scipy-1.6.0-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:8629135ee00cc2182ac8be8e75643b9f02235942443732c2ed69ab48edcb6614"}, 1060 | {file = "scipy-1.6.0-cp37-cp37m-win32.whl", hash = "sha256:58731bbe0103e96b89b2f41516699db9b63066e4317e31b8402891571f6d358f"}, 1061 | {file = "scipy-1.6.0-cp37-cp37m-win_amd64.whl", hash = "sha256:876badc33eec20709d4e042a09834f5953ebdac4088d45a4f3a1f18b56885718"}, 1062 | {file = "scipy-1.6.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:c0911f3180de343643f369dc5cfedad6ba9f939c2d516bddea4a6871eb000722"}, 1063 | {file = "scipy-1.6.0-cp38-cp38-manylinux1_i686.whl", hash = "sha256:b8af26839ae343655f3ca377a5d5e5466f1d3b3ac7432a43449154fe958ae0e0"}, 1064 | {file = "scipy-1.6.0-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:4f1d9cc977ac6a4a63c124045c1e8bf67ec37098f67c699887a93736961a00ae"}, 1065 | {file = "scipy-1.6.0-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:eb7928275f3560d47e5538e15e9f32b3d64cd30ea8f85f3e82987425476f53f6"}, 1066 | {file = "scipy-1.6.0-cp38-cp38-win32.whl", hash = "sha256:31ab217b5c27ab429d07428a76002b33662f98986095bbce5d55e0788f7e8b15"}, 1067 | {file = "scipy-1.6.0-cp38-cp38-win_amd64.whl", hash = "sha256:2f1c2ebca6fd867160e70102200b1bd07b3b2d31a3e6af3c58d688c15d0d07b7"}, 1068 | {file = "scipy-1.6.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:155225621df90fcd151e25d51c50217e412de717475999ebb76e17e310176981"}, 1069 | {file = "scipy-1.6.0-cp39-cp39-manylinux1_i686.whl", hash = 
"sha256:f68d5761a2d2376e2b194c8e9192bbf7c51306ca176f1a0889990a52ef0d551f"}, 1070 | {file = "scipy-1.6.0-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:d902d3a5ad7f28874c0a82db95246d24ca07ad932741df668595fe00a4819870"}, 1071 | {file = "scipy-1.6.0-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:aef3a2dbc436bbe8f6e0b635f0b5fe5ed024b522eee4637dbbe0b974129ca734"}, 1072 | {file = "scipy-1.6.0-cp39-cp39-win32.whl", hash = "sha256:cdbc47628184a0ebeb5c08f1892614e1bd4a51f6e0d609c6eed253823a960f5b"}, 1073 | {file = "scipy-1.6.0-cp39-cp39-win_amd64.whl", hash = "sha256:313785c4dab65060f9648112d025f6d2fec69a8a889c714328882d678a95f053"}, 1074 | {file = "scipy-1.6.0.tar.gz", hash = "sha256:cb6dc9f82dfd95f6b9032a8d7ea70efeeb15d5b5fd6ed4e8537bb3c673580566"}, 1075 | ] 1076 | six = [ 1077 | {file = "six-1.15.0-py2.py3-none-any.whl", hash = "sha256:8b74bedcbbbaca38ff6d7491d76f2b06b3592611af620f8426e82dddb04a5ced"}, 1078 | {file = "six-1.15.0.tar.gz", hash = "sha256:30639c035cdb23534cd4aa2dd52c3bf48f06e5f4a941509c8bafd8ce11080259"}, 1079 | ] 1080 | tensorboard = [ 1081 | {file = "tensorboard-2.4.1-py3-none-any.whl", hash = "sha256:7b8c53c396069b618f6f276ec94fc45d17e3282d668979216e5d30be472115e4"}, 1082 | ] 1083 | tensorboard-plugin-wit = [ 1084 | {file = "tensorboard_plugin_wit-1.8.0-py3-none-any.whl", hash = "sha256:2a80d1c551d741e99b2f197bb915d8a133e24adb8da1732b840041860f91183a"}, 1085 | ] 1086 | tensorflow = [ 1087 | {file = "tensorflow-2.4.1-cp36-cp36m-macosx_10_11_x86_64.whl", hash = "sha256:e1f2799cc86861680d8515167f103e2207a8cab92a4afe5471e4839330591f08"}, 1088 | {file = "tensorflow-2.4.1-cp36-cp36m-manylinux2010_x86_64.whl", hash = "sha256:55368ba0bedb513ba0e36a2543a588b5276e9b2ca99fa3232a9a176601a7bab5"}, 1089 | {file = "tensorflow-2.4.1-cp36-cp36m-win_amd64.whl", hash = "sha256:0e427b1350be6dbe572f971947c5596fdbb152081f227808d8becd894bf40282"}, 1090 | {file = "tensorflow-2.4.1-cp37-cp37m-macosx_10_11_x86_64.whl", hash = "sha256:36d5acd60aac48e34bd545d0ce1fb8b3fceebff6b8782436defd0f71c12203bd"}, 1091 | {file = "tensorflow-2.4.1-cp37-cp37m-manylinux2010_x86_64.whl", hash = "sha256:22723b8e1fa83b34f56c349b16a57aaff913b404451fcf70981f2b1d6e0c64fc"}, 1092 | {file = "tensorflow-2.4.1-cp37-cp37m-win_amd64.whl", hash = "sha256:2357112319303da1b5459a621fd0503c2b2cd97b6c33c4903abd46b3c3e380e2"}, 1093 | {file = "tensorflow-2.4.1-cp38-cp38-macosx_10_11_x86_64.whl", hash = "sha256:4a04081647b89a8fb602895b29ffc559e3c20aac8bde1d4c5ecd2a65adce5d35"}, 1094 | {file = "tensorflow-2.4.1-cp38-cp38-manylinux2010_x86_64.whl", hash = "sha256:efa9daa4b3701a4e439b24b74c1e4b66844aee8ae5263fb3cc12281ac9cc9f67"}, 1095 | {file = "tensorflow-2.4.1-cp38-cp38-win_amd64.whl", hash = "sha256:eedcf578afde5e6e69c75d796bed41093451cd1ab54afb438760e40fb74a09de"}, 1096 | ] 1097 | tensorflow-estimator = [ 1098 | {file = "tensorflow_estimator-2.4.0-py2.py3-none-any.whl", hash = "sha256:5b7b7bf2debe19a8794adacc43e8ba6459daa4efaf54d3302623994a359b17f0"}, 1099 | ] 1100 | termcolor = [ 1101 | {file = "termcolor-1.1.0.tar.gz", hash = "sha256:1d6d69ce66211143803fbc56652b41d73b4a400a2891d7bf7a1cdf4c02de613b"}, 1102 | ] 1103 | threadpoolctl = [ 1104 | {file = "threadpoolctl-2.1.0-py3-none-any.whl", hash = "sha256:38b74ca20ff3bb42caca8b00055111d74159ee95c4370882bbff2b93d24da725"}, 1105 | {file = "threadpoolctl-2.1.0.tar.gz", hash = "sha256:ddc57c96a38beb63db45d6c159b5ab07b6bced12c45a1f07b2b92f272aebfa6b"}, 1106 | ] 1107 | toml = [ 1108 | {file = "toml-0.10.2-py2.py3-none-any.whl", hash = 
"sha256:806143ae5bfb6a3c6e736a764057db0e6a0e05e338b5630894a5f779cabb4f9b"}, 1109 | {file = "toml-0.10.2.tar.gz", hash = "sha256:b3bda1d108d5dd99f4a20d24d9c348e91c4db7ab1b749200bded2f839ccbe68f"}, 1110 | ] 1111 | tqdm = [ 1112 | {file = "tqdm-4.56.0-py2.py3-none-any.whl", hash = "sha256:4621f6823bab46a9cc33d48105753ccbea671b68bab2c50a9f0be23d4065cb5a"}, 1113 | {file = "tqdm-4.56.0.tar.gz", hash = "sha256:fe3d08dd00a526850568d542ff9de9bbc2a09a791da3c334f3213d8d0bbbca65"}, 1114 | ] 1115 | typing-extensions = [ 1116 | {file = "typing_extensions-3.7.4.3-py2-none-any.whl", hash = "sha256:dafc7639cde7f1b6e1acc0f457842a83e722ccca8eef5270af2d74792619a89f"}, 1117 | {file = "typing_extensions-3.7.4.3-py3-none-any.whl", hash = "sha256:7cb407020f00f7bfc3cb3e7881628838e69d8f3fcab2f64742a5e76b2f841918"}, 1118 | {file = "typing_extensions-3.7.4.3.tar.gz", hash = "sha256:99d4073b617d30288f569d3f13d2bd7548c3a7e4c8de87db09a9d29bb3a4a60c"}, 1119 | ] 1120 | urllib3 = [ 1121 | {file = "urllib3-1.26.3-py2.py3-none-any.whl", hash = "sha256:1b465e494e3e0d8939b50680403e3aedaa2bc434b7d5af64dfd3c958d7f5ae80"}, 1122 | {file = "urllib3-1.26.3.tar.gz", hash = "sha256:de3eedaad74a2683334e282005cd8d7f22f4d55fa690a2a1020a416cb0a47e73"}, 1123 | ] 1124 | virtualenv = [ 1125 | {file = "virtualenv-20.4.0-py2.py3-none-any.whl", hash = "sha256:227a8fed626f2f20a6cdb0870054989f82dd27b2560a911935ba905a2a5e0034"}, 1126 | {file = "virtualenv-20.4.0.tar.gz", hash = "sha256:219ee956e38b08e32d5639289aaa5bd190cfbe7dafcb8fa65407fca08e808f9c"}, 1127 | ] 1128 | werkzeug = [ 1129 | {file = "Werkzeug-1.0.1-py2.py3-none-any.whl", hash = "sha256:2de2a5db0baeae7b2d2664949077c2ac63fbd16d98da0ff71837f7d1dea3fd43"}, 1130 | {file = "Werkzeug-1.0.1.tar.gz", hash = "sha256:6c80b1e5ad3665290ea39320b91e1be1e0d5f60652b964a3070216de83d2e47c"}, 1131 | ] 1132 | wrapt = [ 1133 | {file = "wrapt-1.12.1.tar.gz", hash = "sha256:b62ffa81fb85f4332a4f609cab4ac40709470da05643a082ec1eb88e6d9b97d7"}, 1134 | ] 1135 | -------------------------------------------------------------------------------- /poster.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/poster.pdf -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [tool.poetry] 2 | name = "mbti-rnn" 3 | version = "0.1.0" 4 | description = "" 5 | authors = ["Ian Scott Knight "] 6 | license = "MIT" 7 | 8 | [tool.poetry.dependencies] 9 | python = "^3.8" 10 | scikit-learn = "^0.24.1" 11 | nltk = "^3.5" 12 | Keras = "^2.4.3" 13 | pandas = "^1.2.1" 14 | tensorflow = "^2.4.1" 15 | 16 | [tool.poetry.dev-dependencies] 17 | pre-commit = "^2.10.0" 18 | 19 | [build-system] 20 | requires = ["poetry-core>=1.0.0"] 21 | build-backend = "poetry.core.masonry.api" 22 | -------------------------------------------------------------------------------- /rnn.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import pandas as pd 4 | import csv 5 | import random 6 | import pickle 7 | import collections 8 | import tensorflow as tf 9 | from keras.preprocessing.text import Tokenizer 10 | from nltk import word_tokenize 11 | from nltk.stem import WordNetLemmatizer 12 | from nltk.corpus import stopwords 13 | import joblib 14 | from sklearn.pipeline import 
Pipeline 15 | from sklearn.model_selection import KFold 16 | from sklearn.metrics import confusion_matrix, accuracy_score 17 | from keras.wrappers.scikit_learn import KerasClassifier 18 | from keras.models import Sequential 19 | from keras.models import load_model 20 | from keras.layers import Dense 21 | from keras.layers import LSTM 22 | from keras.layers import Bidirectional 23 | from keras.layers import GRU 24 | from keras.layers import SimpleRNN 25 | from keras.layers.embeddings import Embedding 26 | from keras.preprocessing import sequence 27 | from keras.preprocessing import text 28 | from keras.optimizers import Adam 29 | 30 | 31 | MODELS_DIR = "models" 32 | DATA_DIR = "data" 33 | GLOVE_PATH = os.path.join(DATA_DIR, "glove.6B.50d.txt") 34 | DIMENSIONS = ["IE", "NS", "FT", "PJ"] 35 | 36 | ### Preprocessing variables 37 | MODEL_BATCH_SIZE = 128 38 | TOP_WORDS = 2500 39 | MAX_POST_LENGTH = 40 40 | EMBEDDING_VECTOR_LENGTH = 50 41 | 42 | ### Learning variables 43 | LEARNING_RATE = 0.01 44 | DROPOUT = 0.1 45 | NUM_EPOCHS = 1 46 | 47 | ### Control variables 48 | CROSS_VALIDATION = False 49 | SAMPLE = True 50 | WORD_CLOUD = True 51 | SAVE_MODEL = True 52 | 53 | 54 | for k in range(len(DIMENSIONS)): 55 | 56 | ########################### 57 | ### POST CLASSIFICATION ### 58 | ########################### 59 | 60 | x_train = [] 61 | y_train = [] 62 | x_test = [] 63 | y_test = [] 64 | 65 | ### Read in data 66 | with open( 67 | os.path.join(DATA_DIR, "train_{}.csv".format(DIMENSIONS[k][0])), "r" 68 | ) as f: 69 | reader = csv.reader(f) 70 | for row in reader: 71 | for post in row: 72 | x_train.append(post) 73 | y_train.append(0) 74 | with open( 75 | os.path.join(DATA_DIR, "train_{}.csv".format(DIMENSIONS[k][1])), "r" 76 | ) as f: 77 | reader = csv.reader(f) 78 | for row in reader: 79 | for post in row: 80 | x_train.append(post) 81 | y_train.append(1) 82 | with open(os.path.join(DATA_DIR, "test_{}.csv".format(DIMENSIONS[k][0])), "r") as f: 83 | reader = csv.reader(f) 84 | for row in reader: 85 | for post in row: 86 | x_test.append(post) 87 | y_test.append(0) 88 | with open(os.path.join(DATA_DIR, "test_{}.csv".format(DIMENSIONS[k][1])), "r") as f: 89 | reader = csv.reader(f) 90 | for row in reader: 91 | for post in row: 92 | x_test.append(post) 93 | y_test.append(1) 94 | 95 | ### Preprocessing (lemmatization, tokenization, and padding of input) 96 | MBTI_TYPES = [ 97 | "INFJ", 98 | "ENTP", 99 | "INTP", 100 | "INTJ", 101 | "ENTJ", 102 | "ENFJ", 103 | "INFP", 104 | "ENFP", 105 | "ISFP", 106 | "ISTP", 107 | "ISFJ", 108 | "ISTJ", 109 | "ESTP", 110 | "ESFP", 111 | "ESTJ", 112 | "ESFJ", 113 | ] 114 | stop_words = stopwords.words("english") 115 | lemmatizer = WordNetLemmatizer() 116 | tokenizer = Tokenizer(num_words=TOP_WORDS, filters="") 117 | tokenizer.fit_on_texts(x_train + x_test) 118 | 119 | def lemmatize(x): 120 | lemmatized = [] 121 | for post in x: 122 | temp = post.lower() 123 | for mbti_type in MBTI_TYPES: 124 | mbti_type = mbti_type.lower() 125 | temp = temp.replace(" " + mbti_type, "") 126 | temp = " ".join( 127 | [ 128 | lemmatizer.lemmatize(word) 129 | for word in temp.split(" ") 130 | if (word not in stop_words) 131 | ] 132 | ) 133 | lemmatized.append(temp) 134 | return np.array(lemmatized) 135 | 136 | def preprocess(x): 137 | lemmatized = lemmatize(x) 138 | tokenized = tokenizer.texts_to_sequences(lemmatized) 139 | return sequence.pad_sequences(tokenized, maxlen=MAX_POST_LENGTH) 140 | 141 | x_train = lemmatize(x_train) 142 | x_test = lemmatize(x_test) 143 | 144 | ### Assign to dataframe and 
shuffle rows 145 | df = pd.DataFrame(data={"x": x_train, "y": y_train}) 146 | df = df.sample(frac=1).reset_index(drop=True) ### Shuffle rows 147 | if SAMPLE: 148 | df = df.head(10000) ### Small sample for quick runs 149 | 150 | ### Load glove into memory for embedding 151 | embeddings_index = dict() 152 | with open(GLOVE_PATH) as f: 153 | for line in f: 154 | values = line.split() 155 | word = values[0] 156 | embeddings_index[word] = np.asarray(values[1:], dtype="float32") 157 | print("Loaded {} word vectors.".format(len(embeddings_index))) 158 | 159 | ### Create a weight matrix for words 160 | embedding_matrix = np.zeros((TOP_WORDS, EMBEDDING_VECTOR_LENGTH)) 161 | for word, i in tokenizer.word_index.items(): 162 | if i < TOP_WORDS: 163 | embedding_vector = embeddings_index.get(word) 164 | if embedding_vector is not None: 165 | embedding_matrix[i] = embedding_vector 166 | 167 | ### Construct model 168 | with tf.device("/gpu:0"): 169 | model = Sequential() 170 | model.add( 171 | Embedding( 172 | TOP_WORDS, 173 | EMBEDDING_VECTOR_LENGTH, 174 | input_length=MAX_POST_LENGTH, 175 | weights=[embedding_matrix], 176 | mask_zero=True, 177 | trainable=True, 178 | ) 179 | ) 180 | # model.add(SimpleRNN(EMBEDDING_VECTOR_LENGTH, dropout=DROPOUT, recurrent_dropout=DROPOUT, activation='sigmoid', kernel_initializer='zeros')) 181 | # model.add(GRU(EMBEDDING_VECTOR_LENGTH, dropout=DROPOUT, recurrent_dropout=DROPOUT, activation='sigmoid', kernel_initializer='zeros')) 182 | model.add( 183 | LSTM( 184 | EMBEDDING_VECTOR_LENGTH, 185 | dropout=DROPOUT, 186 | recurrent_dropout=DROPOUT, 187 | activation="sigmoid", 188 | kernel_initializer="zeros", 189 | ) 190 | ) 191 | # model.add(Bidirectional(LSTM(EMBEDDING_VECTOR_LENGTH, dropout=DROPOUT, recurrent_dropout=DROPOUT, activation='sigmoid', kernel_initializer='zeros'))) 192 | model.add(Dense(1, activation="sigmoid")) 193 | optimizer = Adam(lr=LEARNING_RATE, beta_1=0.9, beta_2=0.999, epsilon=1e-8) 194 | model.compile( 195 | loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"] 196 | ) 197 | print(model.summary()) 198 | 199 | ### Cross-validation classification (individual posts) 200 | if CROSS_VALIDATION: 201 | k_fold = KFold(n_splits=6) 202 | scores_k = [] 203 | confusion_k = np.array([[0, 0], [0, 0]]) 204 | for train_indices, test_indices in k_fold.split(df): ### KFold is not directly iterable; split() yields the fold indices 205 | x_train_k = df.iloc[train_indices]["x"].values 206 | y_train_k = df.iloc[train_indices]["y"].values 207 | x_test_k = df.iloc[test_indices]["x"].values 208 | y_test_k = df.iloc[test_indices]["y"].values 209 | model.fit( 210 | preprocess(x_train_k), 211 | y_train_k, 212 | epochs=NUM_EPOCHS, 213 | batch_size=MODEL_BATCH_SIZE, 214 | ) 215 | predictions_k = model.predict_classes(preprocess(x_test_k)) 216 | confusion_k += confusion_matrix(y_test_k, predictions_k) 217 | score_k = accuracy_score(y_test_k, predictions_k) 218 | scores_k.append(score_k) 219 | with open( 220 | os.path.join( 221 | DATA_DIR, "rnn_cross_validation_{}.txt".format(DIMENSIONS[k]) 222 | ), 223 | "w", 224 | ) as f: 225 | f.write( 226 | "*** {}/{} TRAINING SET CROSS VALIDATION (POSTS) ***\n".format( 227 | DIMENSIONS[k][0], DIMENSIONS[k][1] 228 | ) 229 | ) 230 | f.write("Total posts classified: {}\n".format(len(x_train))) 231 | f.write("Accuracy: {}\n".format(sum(scores_k) / len(scores_k))) 232 | f.write("Confusion matrix: \n") 233 | f.write(np.array2string(confusion_k, separator=", ")) 234 | 235 | ### Test set classification (individual posts) 236 | model.fit( 237 | preprocess(df["x"].values), 238 | df["y"].values, 239 |
epochs=NUM_EPOCHS, 240 | batch_size=MODEL_BATCH_SIZE, 241 | ) 242 | predictions = model.predict_classes(preprocess(x_test)) 243 | confusion = confusion_matrix(y_test, predictions) 244 | score = accuracy_score(y_test, predictions) 245 | with open( 246 | os.path.join(MODELS_DIR, "rnn_accuracy_{}.txt".format(DIMENSIONS[k])), "w" 247 | ) as f: 248 | f.write( 249 | "*** {}/{} TEST SET CLASSIFICATION (POSTS) ***\n".format( 250 | DIMENSIONS[k][0], DIMENSIONS[k][1] 251 | ) 252 | ) 253 | f.write("Total posts classified: {}\n".format(len(x_test))) 254 | f.write("Accuracy: {}\n".format(score)) 255 | f.write("Confusion matrix: \n") 256 | f.write(np.array2string(confusion, separator=", ")) 257 | print( 258 | f"\nWrote training / test results for {DIMENSIONS[k]} here: {os.path.join(MODELS_DIR, 'rnn_accuracy_{}.txt'.format(DIMENSIONS[k]))}\n" 259 | ) 260 | 261 | ### Get most a-like/b-like sentences 262 | if WORD_CLOUD: 263 | NUM_EXTREME_EXAMPLES = 500 264 | probs = model.predict_proba(preprocess(x_test)) 265 | scores = [] 266 | indices = [] 267 | for i, prob in enumerate(probs, 0): 268 | scores.append(prob[0]) 269 | indices.append(i) 270 | sorted_probs = sorted(zip(scores, indices)) 271 | min_prob_indices = sorted_probs[:NUM_EXTREME_EXAMPLES] 272 | max_prob_indices = sorted_probs[-NUM_EXTREME_EXAMPLES:] 273 | with open( 274 | os.path.join( 275 | DATA_DIR, "extreme_examples_{}.txt".format(DIMENSIONS[k][0]) 276 | ), 277 | "w", 278 | ) as f: 279 | for prob, i in min_prob_indices: 280 | # f.write(x_test[i]+'\n') 281 | f.write(x_test[i] + "\n") 282 | # f.write(str(prob)+'\n') 283 | f.write("\n") 284 | with open( 285 | os.path.join( 286 | DATA_DIR, "extreme_examples_{}.txt".format(DIMENSIONS[k][1]) 287 | ), 288 | "w", 289 | ) as f: 290 | for prob, i in max_prob_indices: 291 | # f.write(x_test[i]+'\n') 292 | f.write(x_test[i] + "\n") 293 | # f.write(str(prob)+'\n') 294 | f.write("\n") 295 | 296 | ### Save model and tokenizer for future use 297 | model.save(os.path.join(MODELS_DIR, "rnn_model_{}.h5".format(DIMENSIONS[k]))) 298 | with open( 299 | os.path.join(MODELS_DIR, "rnn_tokenizer_{}.pkl".format(DIMENSIONS[k])), "wb" 300 | ) as f: 301 | pickle.dump(tokenizer, f, protocol=pickle.HIGHEST_PROTOCOL) 302 | -------------------------------------------------------------------------------- /separate_clean_and_unclean.py: -------------------------------------------------------------------------------- 1 | import os 2 | import collections 3 | import pandas as pd 4 | import csv 5 | 6 | 7 | DATA_DIR = "data" 8 | MBTI_RAW_CSV_PATH = os.path.join(DATA_DIR, "mbti_1.csv") 9 | MBTI_CLEAN_CSV_PATH = os.path.join(DATA_DIR, "mbti_clean.csv") 10 | MBTI_UNCLEAN_CSV_PATH = os.path.join(DATA_DIR, "mbti_unclean.csv") 11 | MBTI_TO_FREQUENCY_DICT = { 12 | "ISTJ": 0.11, 13 | "ISFJ": 0.09, 14 | "INFJ": 0.04, 15 | "INTJ": 0.05, 16 | "ISTP": 0.05, 17 | "ISFP": 0.05, 18 | "INFP": 0.06, 19 | "INTP": 0.06, 20 | "ESTP": 0.04, 21 | "ESFP": 0.04, 22 | "ENFP": 0.08, 23 | "ENTP": 0.06, 24 | "ESTJ": 0.08, 25 | "ESFJ": 0.09, 26 | "ENFJ": 0.05, 27 | "ENTJ": 0.05, 28 | } 29 | 30 | 31 | df = pd.read_csv(MBTI_RAW_CSV_PATH) 32 | 33 | counts = collections.defaultdict(int) 34 | for mbti in df["type"]: 35 | counts[mbti] += 1 36 | 37 | limiting_type = None 38 | min_size = float("infinity") 39 | for mbti in counts.keys(): 40 | size = counts[mbti] / MBTI_TO_FREQUENCY_DICT[mbti] 41 | if size < min_size: 42 | min_size = size 43 | limiting_type = mbti 44 | 45 | dic = collections.defaultdict(list) 46 | for index, row in df.iterrows(): 47 | 
dic[row["type"]].append(row) 48 | 49 | unclean_list = [] 50 | with open(MBTI_CLEAN_CSV_PATH, "w") as f: 51 | writer = csv.writer(f) 52 | writer.writerow(["type", "posts"]) 53 | 54 | for mbti in MBTI_TO_FREQUENCY_DICT.keys(): 55 | list1 = dic[mbti] 56 | for x in range(0, int(round(min_size * MBTI_TO_FREQUENCY_DICT[mbti]))): 57 | writer.writerow(list1[x]) 58 | unclean_list.append( 59 | list1[int(round(min_size * MBTI_TO_FREQUENCY_DICT[mbti])) : len(list1)] 60 | ) 61 | 62 | with open(MBTI_UNCLEAN_CSV_PATH, "w") as f: 63 | writer = csv.writer(f) 64 | writer.writerow(["type", "posts"]) 65 | for mbti in unclean_list: 66 | for x in mbti: 67 | writer.writerow(x) 68 | -------------------------------------------------------------------------------- /trump_predictor.py: -------------------------------------------------------------------------------- 1 | import os 2 | import csv 3 | import pickle 4 | import collections 5 | import numpy as np 6 | from nltk import word_tokenize 7 | from nltk.stem import WordNetLemmatizer 8 | from nltk.corpus import stopwords 9 | from keras.preprocessing import sequence 10 | from keras.preprocessing import text 11 | from keras.models import load_model 12 | 13 | 14 | MODELS_DIR = "models" 15 | DATA_DIR = "data" 16 | TRUMP_TWEETS_PATH = os.path.join(DATA_DIR, "trumptweets.csv") 17 | 18 | DIMENSIONS = ["IE", "NS", "FT", "PJ"] 19 | MODEL_BATCH_SIZE = 128 20 | TOP_WORDS = 2500 21 | MAX_POST_LENGTH = 40 22 | EMBEDDING_VECTOR_LENGTH = 20 23 | 24 | final = "" 25 | 26 | x_test = [] 27 | with open(TRUMP_TWEETS_PATH, "r", encoding="ISO-8859-1") as f: 28 | reader = csv.reader(f) 29 | for row in reader:  # iterate parsed CSV rows rather than raw file lines 30 | x_test.append(" ".join(row)) 31 | 32 | types = [ 33 | "INFJ", 34 | "ENTP", 35 | "INTP", 36 | "INTJ", 37 | "ENTJ", 38 | "ENFJ", 39 | "INFP", 40 | "ENFP", 41 | "ISFP", 42 | "ISTP", 43 | "ISFJ", 44 | "ISTJ", 45 | "ESTP", 46 | "ESFP", 47 | "ESTJ", 48 | "ESFJ", 49 | ] 50 | types = [x.lower() for x in types] 51 | lemmatizer = WordNetLemmatizer() 52 | stop_words = stopwords.words("english") 53 | 54 | 55 | def lemmatize(x): 56 | lemmatized = [] 57 | for post in x: 58 | temp = post.lower() 59 | for type_ in types: 60 | temp = temp.replace(" " + type_, "") 61 | temp = " ".join( 62 | [ 63 | lemmatizer.lemmatize(word) 64 | for word in temp.split(" ") 65 | if (word not in stop_words) 66 | ] 67 | ) 68 | lemmatized.append(temp) 69 | return np.array(lemmatized) 70 | 71 | 72 | for k in range(len(DIMENSIONS)): 73 | model = load_model( 74 | os.path.join(MODELS_DIR, "rnn_model_{}.h5".format(DIMENSIONS[k])) 75 | ) 76 | tokenizer = None 77 | with open( 78 | os.path.join(MODELS_DIR, "rnn_tokenizer_{}.pkl".format(DIMENSIONS[k])), "rb" 79 | ) as f: 80 | tokenizer = pickle.load(f) 81 | 82 | def preprocess(x): 83 | lemmatized = lemmatize(x) 84 | tokenized = tokenizer.texts_to_sequences(lemmatized) 85 | return sequence.pad_sequences(tokenized, maxlen=MAX_POST_LENGTH) 86 | 87 | predictions = model.predict(preprocess(x_test)) 88 | prediction = float(sum(predictions) / len(predictions)) 89 | print(DIMENSIONS[k]) 90 | print(prediction) 91 | if prediction >= 0.5: 92 | final += DIMENSIONS[k][1] 93 | else: 94 | final += DIMENSIONS[k][0] 95 | 96 | print("") 97 | print("Final prediction: {}".format(final)) 98 | -------------------------------------------------------------------------------- /word_cloud.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import pandas as pd 4 | import csv 5 | import random 6 | import pickle 7 | import collections 8 | import
tensorflow as tf 9 | from nltk import word_tokenize 10 | from keras.preprocessing.text import Tokenizer 11 | from nltk import word_tokenize 12 | from nltk.stem import WordNetLemmatizer 13 | from nltk.corpus import stopwords 14 | import joblib 15 | from sklearn.pipeline import Pipeline 16 | from sklearn.model_selection import KFold 17 | from sklearn.metrics import confusion_matrix, accuracy_score 18 | from keras.wrappers.scikit_learn import KerasClassifier 19 | from keras.models import Sequential 20 | from keras.models import load_model 21 | from keras.layers import Dense 22 | from keras.layers import LSTM 23 | from keras.layers import Bidirectional 24 | from keras.layers import GRU 25 | from keras.layers import SimpleRNN 26 | from keras.layers.embeddings import Embedding 27 | from keras.preprocessing import sequence 28 | from keras.preprocessing import text 29 | from keras.models import load_model 30 | 31 | 32 | MODELS_DIR = "models" 33 | DATA_DIR = "data" 34 | 35 | DIMENSIONS = ["IE", "NS", "FT", "PJ"] 36 | MODEL_BATCH_SIZE = 128 37 | TOP_WORDS = 2500 38 | MAX_POST_LENGTH = 40 39 | EMBEDDING_VECTOR_LENGTH = 50 40 | 41 | types = [ 42 | "INFJ", 43 | "ENTP", 44 | "INTP", 45 | "INTJ", 46 | "ENTJ", 47 | "ENFJ", 48 | "INFP", 49 | "ENFP", 50 | "ISFP", 51 | "ISTP", 52 | "ISFJ", 53 | "ISTJ", 54 | "ESTP", 55 | "ESFP", 56 | "ESTJ", 57 | "ESFJ", 58 | ] 59 | types = [x.lower() for x in types] 60 | lemmatizer = WordNetLemmatizer() 61 | stop_words = stopwords.words("english") 62 | 63 | 64 | def lemmatize(x): 65 | lemmatized = [] 66 | for user in x: 67 | for post in user: 68 | temp = post.lower() 69 | for type_ in types: 70 | temp = temp.replace(" " + type_, "") 71 | temp = " ".join( 72 | [ 73 | lemmatizer.lemmatize(word) 74 | for word in temp.split(" ") 75 | if (word not in stop_words) 76 | ] 77 | ) 78 | lemmatized.append(temp) 79 | return np.array(lemmatized) 80 | 81 | 82 | for k in range(len(DIMENSIONS)): 83 | x_test_a = [] 84 | x_test_b = [] 85 | with open(os.path.join(DATA_DIR, "test_{}.csv".format(DIMENSIONS[k][0])), "r") as f: 86 | reader = csv.reader(f) 87 | for row in reader: 88 | x_test_a.append(row) 89 | with open(os.path.join(DATA_DIR, "test_{}.csv".format(DIMENSIONS[k][1])), "r") as f: 90 | reader = csv.reader(f) 91 | for row in reader: 92 | x_test_b.append(row) 93 | x_test = x_test_a + x_test_b 94 | 95 | model = load_model( 96 | os.path.join(MODELS_DIR, "rnn_model_{}.h5".format(DIMENSIONS[k])) 97 | ) 98 | tokenizer = None 99 | with open( 100 | os.path.join(MODELS_DIR, "rnn_tokenizer_{}.pkl".format(DIMENSIONS[k])), "rb" 101 | ) as f: 102 | tokenizer = pickle.load(f) 103 | 104 | def preprocess(x): 105 | lemmatized = lemmatize(x) 106 | tokenized = tokenizer.texts_to_sequences(lemmatized) 107 | return sequence.pad_sequences(tokenized, maxlen=MAX_POST_LENGTH) 108 | 109 | NUM_EXTREME_EXAMPLES = 500 110 | probs = model.predict_proba(preprocess(x_test)) 111 | scores = [] 112 | indices = [] 113 | for i, prob in enumerate(probs, 0): 114 | scores.append(prob[0]) 115 | indices.append(i) 116 | sorted_probs = sorted(zip(scores, indices)) 117 | min_prob_indices = sorted_probs[:NUM_EXTREME_EXAMPLES] 118 | max_prob_indices = sorted_probs[-NUM_EXTREME_EXAMPLES:] 119 | lemmatized = lemmatize(x_test) 120 | with open( 121 | os.path.join(DATA_DIR, "extreme_examples_{}.txt".format(DIMENSIONS[k][0])), "w" 122 | ) as f: 123 | for prob, i in min_prob_indices: 124 | # f.write(x_test[i]+'\n') 125 | f.write(lemmatized[i] + "\n") 126 | # f.write(str(prob)+'\n') 127 | f.write("\n") 128 | with open( 129 | 
os.path.join(DATA_DIR, "extreme_examples_{}.txt".format(DIMENSIONS[k][1])), "w" 130 | ) as f: 131 | for prob, i in max_prob_indices: 132 | # f.write(x_test[i]+'\n') 133 | f.write(lemmatized[i] + "\n") 134 | # f.write(str(prob)+'\n') 135 | f.write("\n") 136 | -------------------------------------------------------------------------------- /word_clouds/diff clouds/e.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/e.png -------------------------------------------------------------------------------- /word_clouds/diff clouds/f.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/f.png -------------------------------------------------------------------------------- /word_clouds/diff clouds/i.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/i.png -------------------------------------------------------------------------------- /word_clouds/diff clouds/j.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/j.png -------------------------------------------------------------------------------- /word_clouds/diff clouds/n.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/n.png -------------------------------------------------------------------------------- /word_clouds/diff clouds/p.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/p.png -------------------------------------------------------------------------------- /word_clouds/diff clouds/s.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/s.png -------------------------------------------------------------------------------- /word_clouds/diff clouds/t.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/diff clouds/t.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/e.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/e.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/f.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/f.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/i.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/i.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/j.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/j.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/n.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/n.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/p.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/p.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/s.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/s.png -------------------------------------------------------------------------------- /word_clouds/undiff clouds/t.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ianscottknight/Predicting-Myers-Briggs-Type-Indicator-with-Recurrent-Neural-Networks/14ac32333eae5442272ad6a0a1e082c799804a80/word_clouds/undiff clouds/t.png --------------------------------------------------------------------------------