├── .gitignore
├── BERT_summarizer.py
├── LICENSE
├── NLTK_summarizer.py
├── README.md
├── T5_BART_summarizer.py
├── Text_Summary_[Google_Colab].ipynb
├── header.png
├── requirements.txt
└── results
    ├── completeRseults.csv
    ├── firstRunRseults.csv
    └── secondRunRseults.csv
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | pip-wheel-metadata/
24 | share/python-wheels/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 | MANIFEST
29 |
30 | # PyInstaller
31 | # Usually these files are written by a python script from a template
32 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
33 | *.manifest
34 | *.spec
35 |
36 | # Installer logs
37 | pip-log.txt
38 | pip-delete-this-directory.txt
39 |
40 | # Unit test / coverage reports
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .coverage
45 | .coverage.*
46 | .cache
47 | nosetests.xml
48 | coverage.xml
49 | *.cover
50 | *.py,cover
51 | .hypothesis/
52 | .pytest_cache/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95 | __pypackages__/
96 |
97 | # Celery stuff
98 | celerybeat-schedule
99 | celerybeat.pid
100 |
101 | # SageMath parsed files
102 | *.sage.py
103 |
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 |
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 |
117 | # Rope project settings
118 | .ropeproject
119 |
120 | # mkdocs documentation
121 | /site
122 |
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 |
128 | # Pyre type checker
129 | .pyre/
130 |
131 | .idea
132 |
--------------------------------------------------------------------------------
/BERT_summarizer.py:
--------------------------------------------------------------------------------
1 | from summarizer import Summarizer
2 |
3 |
4 | class SummarizerBERT:
5 |     """
6 |     BERT text summarizer
7 |     :param max_length: The maximum length to accept as a sentence. (default to 500)
8 |     :param min_length: The minimum length to accept as a sentence. (default to 25)
9 |     :param use_first: Importance of first sentence
10 |     :param ratio: Ratio of sentences to summarize to from the original body. (default to 0.2)
11 |     :param model: Model to be used (default to bert-large-uncased)
12 |     :param clustering_algorithm: Which clustering algorithm to use (default to kmeans options kmeans OR gmm)
13 |     All models available:
14 |         'bert-base-uncased': (BertModel, BertTokenizer),
15 |         'bert-large-uncased': (BertModel, BertTokenizer),
16 |         'xlnet-base-cased': (XLNetModel, XLNetTokenizer),
17 |         'xlm-mlm-enfr-1024': (XLMModel, XLMTokenizer),
18 |         'distilbert-base-uncased': (DistilBertModel, DistilBertTokenizer),
19 |         'albert-base-v1': (AlbertModel, AlbertTokenizer),
20 |         'albert-large-v1': (AlbertModel, AlbertTokenizer)
21 |     """
22 |     def __init__(self, max_length=500, min_length=25, use_first=False, ratio=0.2, model="bert-large-uncased",
23 |                  clustering_algorithm='kmeans'):
24 |         self.max_length = max_length
25 |         self.min_length = min_length
26 |         self.use_first = use_first
27 |         self.ratio = ratio
28 |         self.model = model
29 |         self.clustering_algorithm = clustering_algorithm
30 |
31 |     def summary(self, text=''):
32 |         """
33 |         This returns the summary of the text using BERT Transformer
34 |         :param text: Text to be summarized
35 |         :return: Summarized Text
36 |         """
37 |         model = Summarizer(self.model)
38 |         return model(text, min_length=self.min_length, max_length=self.max_length, use_first=self.use_first,
39 |                      ratio=self.ratio, algorithm=self.clustering_algorithm)
40 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/NLTK_summarizer.py:
--------------------------------------------------------------------------------
1 | try:
2 |     import nltk
3 |     nltk.download('stopwords')
4 |     nltk.download('punkt')
5 | except Exception as exp:
6 |     print(f" ERROR {exp}")
7 |     raise Exception("NLTK download failed")
8 | from nltk.corpus import stopwords
9 | from nltk.cluster.util import cosine_distance
10 | import numpy as np
11 | import networkx as nx
12 |
13 |
14 | class SummarizerNLTK:
15 |     """
16 |     NLTK text summarizer
17 |     :param max_length: This is the maximum number of sentences in the summary
18 |     :param min_length: This is the minimum number of sentences in the summary. (Not being considered right now)
19 |     """
20 |     def __init__(self, max_length=2, min_length=1):
21 |         self.max_length = max_length
22 |         self.min_length = min_length
23 |
24 |     def read_text(self, data=""):
25 |         data = [data]
26 |         article = data[0].split(". ")
27 |         sentences = []
28 |
29 |         for sentence in article:
30 |             # NOTE: str.replace is literal, so the regex-looking pattern below matches nothing
31 |             sentences.append(sentence.replace("[^a-zA-Z]", " ").split(" "))
32 |         sentences.pop()  # drops the final fragment of the split
33 |
34 |         return sentences
35 |
36 |     def sentence_similarity(self, sent1, sent2, stopwords=None):
37 |         if stopwords is None:
38 |             stopwords = []
39 |
40 |         sent1 = [w.lower() for w in sent1]
41 |         sent2 = [w.lower() for w in sent2]
42 |
43 |         all_words = list(set(sent1 + sent2))
44 |
45 |         vector1 = [0] * len(all_words)
46 |         vector2 = [0] * len(all_words)
47 |
48 |         # build the vector for the first sentence
49 |         for w in sent1:
50 |             if w in stopwords:
51 |                 continue
52 |             vector1[all_words.index(w)] += 1
53 |
54 |         # build the vector for the second sentence
55 |         for w in sent2:
56 |             if w in stopwords:
57 |                 continue
58 |             vector2[all_words.index(w)] += 1
59 |
60 |         return 1 - cosine_distance(vector1, vector2)
61 |
62 |     def build_similarity_matrix(self, sentences, stop_words):
63 |         # Create an empty similarity matrix
64 |         similarity_matrix = np.zeros((len(sentences), len(sentences)))
65 |
66 |         for idx1 in range(len(sentences)):
67 |             for idx2 in range(len(sentences)):
68 |                 if idx1 == idx2:  # skip comparing a sentence with itself
69 |                     continue
70 |                 similarity_matrix[idx1][idx2] = self.sentence_similarity(sentences[idx1], sentences[idx2], stop_words)
71 |
72 |         return similarity_matrix
73 |
74 |     def summary(self, text=''):
75 |         """
76 |         This returns the summary of the text using NLTK corpus and sentence ranking
77 |         :param text: Text to be summarized
78 |         :return: Summarized Text
79 |         """
80 |         stop_words = stopwords.words('english')
81 |         summarize_text = []
82 |
83 |         # Step 1 - Read the text and split it into sentences
84 |         sentences = self.read_text(data=text)
85 |
86 |         # Step 2 - Generate the similarity matrix across sentences
87 |         sentence_similarity_martix = self.build_similarity_matrix(sentences, stop_words)
88 |
89 |         # Step 3 - Rank sentences in the similarity matrix
90 |         sentence_similarity_graph = nx.from_numpy_array(sentence_similarity_martix)
91 |         scores = nx.pagerank(sentence_similarity_graph)
92 |
93 |         # Step 4 - Sort by rank and pick the top sentences
94 |         ranked_sentence = sorted(((scores[i], s) for i, s in enumerate(sentences)), reverse=True)
95 |         # print("Indexes of top ranked_sentence order are ", ranked_sentence)
96 |
97 |         for i in range(min(self.max_length, len(ranked_sentence))):  # avoid IndexError on short texts
98 |             summarize_text.append(" ".join(ranked_sentence[i][1]))
99 |
100 |         # Step 5 - Output the summarized text
101 |         return ". ".join(summarize_text)
102 |
--------------------------------------------------------------------------------
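The ranking idea in `NLTK_summarizer.py` (bag-of-words cosine similarity between sentences, then PageRank over the similarity graph) can be sketched without any third-party packages. This is a simplified illustration, not the code the repo runs: the function names `cosine_similarity`, `pagerank` and `rank_sentences`, the damping factor 0.85, the fixed iteration count and the whitespace tokenization are all assumptions made for the example, and `nx.pagerank` handles convergence and edge cases in its own way.

```python
import math
from collections import Counter


def cosine_similarity(words_a, words_b, stop_words=frozenset()):
    """Cosine similarity between two sentences given as lists of words."""
    counts_a = Counter(w.lower() for w in words_a if w.lower() not in stop_words)
    counts_b = Counter(w.lower() for w in words_b if w.lower() not in stop_words)
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sentence with no countable words is similar to nothing
    return dot / (norm_a * norm_b)


def pagerank(matrix, damping=0.85, iterations=100):
    """Power-iteration PageRank over a weighted adjacency matrix (hypothetical helper)."""
    n = len(matrix)
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in matrix]
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(scores[j] for j in range(n) if out_weight[j] == 0)
        new_scores = []
        for i in range(n):
            incoming = sum(
                matrix[j][i] * scores[j] / out_weight[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * (incoming + dangling / n))
        scores = new_scores
    return scores


def rank_sentences(sentences, top_n=2):
    """Return the top_n sentences by PageRank score over their similarity graph."""
    tokenized = [s.split() for s in sentences]
    n = len(tokenized)
    matrix = [
        [0.0 if i == j else cosine_similarity(tokenized[i], tokenized[j]) for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(matrix)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in order[:top_n]]


if __name__ == "__main__":
    demo = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stocks fell sharply today",
    ]
    print(rank_sentences(demo, top_n=2))
```

Sentences that share vocabulary with many others accumulate rank, while a sentence with no overlap (the third one in the demo) only receives the baseline share, which is why it is left out of the top two.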
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | Text-Summarizer
7 | Comparing state of the art models for text summary generation
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
17 |
18 | ---
19 |
20 | ## Usage
21 |
22 | ### Setup
23 |
24 | Clone the repo, then install the dependencies:
25 |
26 | ```shell
27 | pip3 install -r requirements.txt
28 | python -m spacy download en_core_web_md
29 | ```
30 |
31 | ```Python
32 | from NLTK_summarizer import SummarizerNLTK
33 | print(SummarizerNLTK().summary(text = ""))
34 | ```
35 | ```Python
36 | from BERT_summarizer import SummarizerBERT
37 | print(SummarizerBERT().summary(text = ""))
38 | ```
39 | ```Python
40 | from T5_BART_summarizer import SummarizerT5BART
41 | print(SummarizerT5BART().summary(text = ""))
42 | ```
43 |
44 | *Look at the documentation in the files for better understanding*
45 |
46 | ## Text used
47 |
48 | Taken from Kaggle dataset: https://www.kaggle.com/snapcrack/all-the-news
49 |
50 | ```
51 | WASHINGTON — Congressional Republicans have a new fear when it comes to their health care lawsuit against the Obama administration: They might win. The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. That could lead to chaos in the insurance market and spur a political backlash just as Republicans gain full control of the government. To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. In another twist, Donald J. Trump’s administration, worried about preserving executive branch prerogatives, could choose to fight its Republican allies in the House on some central questions in the dispute. Eager to avoid an ugly political pileup, Republicans on Capitol Hill and the Trump transition team are gaming out how to handle the lawsuit, which, after the election, has been put in limbo until at least late February by the United States Court of Appeals for the District of Columbia Circuit. They are not yet ready to divulge their strategy. “Given that this pending litigation involves the Obama administration and Congress, it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effort. “Upon taking office, the Trump administration will evaluate this case and all related aspects of the Affordable Care Act. ” In a potentially decision in 2015, Judge Rosemary M. 
Collyer ruled that House Republicans had the standing to sue the executive branch over a spending dispute and that the Obama administration had been distributing the health insurance subsidies, in violation of the Constitution, without approval from Congress. The Justice Department, confident that Judge Collyer’s decision would be reversed, quickly appealed, and the subsidies have remained in place during the appeal. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” The suspension of the case, House lawyers said, will “provide the and his future administration time to consider whether to continue prosecuting or to otherwise resolve this appeal. ” Republican leadership officials in the House acknowledge the possibility of “cascading effects” if the payments, which have totaled an estimated $13 billion, are suddenly stopped. Insurers that receive the subsidies in exchange for paying costs such as deductibles and for eligible consumers could race to drop coverage since they would be losing money. Over all, the loss of the subsidies could destabilize the entire program and cause a lack of confidence that leads other insurers to seek a quick exit as well. Anticipating that the Trump administration might not be inclined to mount a vigorous fight against the House Republicans given the ’s dim view of the health care law, a team of lawyers this month sought to intervene in the case on behalf of two participants in the health care program. 
In their request, the lawyers predicted that a deal between House Republicans and the new administration to dismiss or settle the case “will produce devastating consequences for the individuals who receive these reductions, as well as for the nation’s health insurance and health care systems generally. ” No matter what happens, House Republicans say, they want to prevail on two overarching concepts: the congressional power of the purse, and the right of Congress to sue the executive branch if it violates the Constitution regarding that spending power. House Republicans contend that Congress never appropriated the money for the subsidies, as required by the Constitution. In the suit, which was initially championed by John A. Boehner, the House speaker at the time, and later in House committee reports, Republicans asserted that the administration, desperate for the funding, had required the Treasury Department to provide it despite widespread internal skepticism that the spending was proper. The White House said that the spending was a permanent part of the law passed in 2010, and that no annual appropriation was required — even though the administration initially sought one. Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch. But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions. It is a complicated set of dynamics illustrating how a quick legal victory for the House in the Trump era might come with costs that Republicans never anticipated when they took on the Obama White House.
52 | ```
53 |
54 | ## Results
55 |
56 | | # | Model | Time Taken(with Downloads) | Completed | Summary | Time Taken (without downloads) |
57 | | --- | --- | --- | --- | --- | --- |
58 | | 1 | NLTK Corpus | 0 | True | The incoming Trump administration could choose... | 0 |
59 | | 2 | bert-base-uncased kmeans | 27 | True | The incoming Trump administration could choose... | 15 |
60 | | 3 | bert-base-uncased gmm | 3 | True | The incoming Trump administration could choose... | 4 |
61 | | 4 | bert-large-uncased kmeans | 29 | True | The incoming Trump administration could choose... | 8 |
62 | | 5 | bert-large-uncased gmm | 9 | True | The Justice Department, confident that Judge C... | 9 |
63 | | 6 | xlnet-base-cased kmeans | 11 | True | But a sudden loss of the disputed subsidies co... | 3 |
64 | | 7 | xlnet-base-cased gmm | 3 | True | But a sudden loss of the disputed subsidies co... | 3 |
65 | | 8 | xlm-mlm-enfr-1024 kmeans | 20 | True | WASHINGTON --- Congressional Republicans have... | 4 |
66 | | 9 | xlm-mlm-enfr-1024 gmm | 25 | True | But a sudden loss of the disputed subsidies co... | 5 |
67 | | 10 | distilbert-base-uncased kmeans | 6 | True | But a sudden loss of the disputed subsidies co... | 2 |
68 | | 11 | distilbert-base-uncased gmm | 2 | True | But a sudden loss of the disputed subsidies co... | 2 |
69 | | 12 | albert-base-v1 kmeans | 3 | True | To stave off that outcome, Republicans could f... | 2 |
70 | | 13 | albert-base-v1 gmm | 2 | True | But a sudden loss of the disputed subsidies co... | 2 |
71 | | 14 | albert-large-v1 kmeans | 4 | True | The incoming Trump administration could choose... | 2 |
72 | | 15 | albert-large-v1 gmm | 3 | True | But a sudden loss of the disputed subsidies co... | 3 |
73 | | 16 | facebook/bart-large-cnn | 36 | False | ERROR | 56 |
74 | | 17 | t5-11b | 4 | False | ERROR | 2 |
75 | | 18 | t5-3b | SKIPPED | False | ERROR | SKIPPED |
76 | | 19 | t5-base | 30 | True | a sudden loss | 38 |
77 | | 20 | t5-large | SKIPPED | False | ERROR | SKIPPED |
78 | | 21 | t5-small | 9 | True | incoming administration could | 11 |
79 |
80 | *If you want to compare the outputs, go to `results` folder*
81 |
82 | > All time is in seconds
83 |
84 | > Skipped means it failed even after multiple attempts
85 |
86 | > ERROR means the process didn't complete
87 |
88 | **This code was run on Google Colab (GPU Runtime) which has fairly good hardware.**
89 |
90 | **Also it might take some time downloading the large pre-trained models**
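
To compare runs programmatically rather than by eye, the CSV files in `results` can be read with Python's standard `csv` module. The sketch below is an illustration only: the helper names `load_results` and `completed_by_speed` are made up for this example, and the `SAMPLE` string is a shortened three-row excerpt in the same column layout as the table above (the index value in its first column is assumed), not the full file.

```python
import csv
import io

TIME_COLUMN = "Time Taken (without downloads)"


def load_results(handle):
    """Parse a results CSV (open file or file-like object) into a list of row dicts."""
    return list(csv.DictReader(handle))


def completed_by_speed(rows):
    """Return (model, seconds) pairs for completed runs, fastest first."""
    timed = []
    for row in rows:
        if row["Completed"] != "True":
            continue
        try:
            seconds = float(row[TIME_COLUMN])
        except ValueError:
            continue  # e.g. "SKIPPED" instead of a number
        timed.append((row["Model"], seconds))
    return sorted(timed, key=lambda pair: pair[1])


# Shortened sample in the same layout as results/completeRseults.csv.
SAMPLE = """,#,Model,Time Taken(with Downloads),Completed,Summary,Time Taken (without downloads)
0,1,NLTK Corpus,0,True,"The incoming Trump administration could choose...",0
1,2,bert-base-uncased kmeans,27,True,"The incoming Trump administration could choose...",15
17,18,t5-3b,SKIPPED,False,ERROR,SKIPPED
"""

if __name__ == "__main__":
    for model, seconds in completed_by_speed(load_results(io.StringIO(SAMPLE))):
        print(f"{model}: {seconds:.0f}s")
```

For the real file, pass `open("results/completeRseults.csv", newline="", encoding="utf-8")` to `load_results` instead of the `io.StringIO` sample.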
91 |
92 | ## Conclusion
93 |
94 | NLTK is the fastest and, most of the time, gives the better summary.
95 |
96 | The next best is BERT, but because of how it tokenizes text it sometimes cuts a sentence off midway, losing its meaning. It works well with very large texts.
97 |
98 | T5 tries to generate new sentences, but it is almost impossible to run even on decent hardware. For T5 you can choose the size of the model; everything above t5-base is very slow, even on GPU or TPU.
99 |
100 | Facebook's BART does too many computations and exhausts memory very quickly.
101 |
102 | ## To-do
103 |
104 | - [ ] Run on full datasets
105 |
106 | - [ ] Publish a wrapper for PyPI
107 |
108 | - [x] Compare efficiency on GPUs
109 |
110 | - [x] Add facebook's BART
111 |
112 | - [x] Add Google T5
113 |
114 | - [x] Add BERT large
115 |
116 |
117 | 🌟⭐✨STAR ME✨⭐🌟
118 |
119 |
120 | You can give me a small 🤓 dopamine 🤝 boost by ⭐STARRING⭐ this project
121 |
122 |
123 |
124 |
125 |
126 | ## CREDITS
127 |
128 | >Kuldeep Singh Sidhu
129 |
130 | Github: [github/singhsidhukuldeep](https://github.com/singhsidhukuldeep)
131 | `https://github.com/singhsidhukuldeep`
132 |
133 | Website: [Kuldeep Singh Sidhu (Website)](http://kuldeepsinghsidhu.com)
134 | `http://kuldeepsinghsidhu.com`
135 |
136 | LinkedIn: [Kuldeep Singh Sidhu (LinkedIn)](https://www.linkedin.com/in/singhsidhukuldeep/)
137 | `https://www.linkedin.com/in/singhsidhukuldeep/`
138 |
--------------------------------------------------------------------------------
/T5_BART_summarizer.py:
--------------------------------------------------------------------------------
1 | from transformers import pipeline
2 |
3 |
4 | class SummarizerT5BART:
5 |     """
6 |     BART or T5 text summarizer
7 |     :param model: select the model that you want to use as model in summarization pipeline (default is t5-small)
8 |     :param max_length: Maximum length of the generated summary, in tokens. (default to 20)
9 |     :param min_length: Minimum length of the generated summary, in tokens. (default to 5)
10 |     Supported models:
11 |         facebook/bart-large-cnn
12 |         t5-11b
13 |         t5-3b
14 |         t5-base
15 |         t5-large
16 |         t5-small
17 |     Warning: for T5 you can choose the size of the model. Everything above t5-base is very slow, even on GPU or TPU
18 |     """
19 |
20 |     def __init__(self, model='t5-small', min_length=5, max_length=20):
21 |         self.model = model
22 |         self.min_length = min_length
23 |         self.max_length = max_length
24 |
25 |     def summary(self, text=''):
26 |         """
27 |         This returns the summary of the text using the selected BART or T5 model
28 |         :param text: Text to be summarized
29 |         :return: Summarized Text
30 |         """
31 |         summarizer = pipeline(task='summarization', model=self.model)
32 |         return summarizer(text, max_length=self.max_length, min_length=self.min_length)[0]['summary_text']
33 |
--------------------------------------------------------------------------------
/header.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/singhsidhukuldeep/Text-Summarizer/923c10ebe268e4f8ba901ca0153fa3929b45fd20/header.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | #pip3 install -r requirements.txt
2 |
3 | # NLTK
4 | nltk
5 | networkx
6 |
7 | # BERT
8 | bert-extractive-summarizer
9 | ## spacy setup
10 | spacy==2.1.3
11 | # python -m spacy download en_core_web_md
12 | transformers
13 | neuralcoref
14 |
15 | # BART / T5
16 | transformers
--------------------------------------------------------------------------------
/results/completeRseults.csv:
--------------------------------------------------------------------------------
1 | ,#,Model,Time Taken(with Downloads),Completed,Summary,Time Taken (without downloads)
2 | 0,1,NLTK Corpus,0,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. WASHINGTON — Congressional Republicans have a new fear when it comes to their health care lawsuit against the Obama administration: They might win",0
3 | 1,2,bert-base-uncased kmeans,27,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch.",15
4 | 2,3,bert-base-uncased gmm,3,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch.",4
5 | 3,4,bert-large-uncased kmeans,29,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions. It is a complicated set of dynamics illustrating how a quick legal victory for the House in the Trump era might come with costs that Republicans never anticipated when they took on the Obama White House.",8
6 | 4,5,bert-large-uncased gmm,9,True,"The Justice Department, confident that Judge Collyer’s decision would be reversed, quickly appealed, and the subsidies have remained in place during the appeal. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions. It is a complicated set of dynamics illustrating how a quick legal victory for the House in the Trump era might come with costs that Republicans never anticipated when they took on the Obama White House.",9
7 | 5,6,xlnet-base-cased kmeans,11,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Republican leadership officials in the House acknowledge the possibility of “cascading effects” if the payments, which have totaled an estimated $13 billion, are suddenly stopped. Anticipating that the Trump administration might not be inclined to mount a vigorous fight against the House Republicans given the ’s dim view of the health care law, a team of lawyers this month sought to intervene in the case on behalf of two participants in the health care program.",3
8 | 6,7,xlnet-base-cased gmm,3,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Republican leadership officials in the House acknowledge the possibility of “cascading effects” if the payments, which have totaled an estimated $13 billion, are suddenly stopped. Anticipating that the Trump administration might not be inclined to mount a vigorous fight against the House Republicans given the ’s dim view of the health care law, a team of lawyers this month sought to intervene in the case on behalf of two participants in the health care program.",3
9 | 7,8,xlm-mlm-enfr-1024 kmeans,20,True,"WASHINGTON — Congressional Republicans have a new fear when it comes to their health care lawsuit against the Obama administration: They might win. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. The suspension of the case, House lawyers said, will “provide the and his future administration time to consider whether to continue prosecuting or to otherwise resolve this appeal. ” Over all, the loss of the subsidies could destabilize the entire program and cause a lack of confidence that leads other insurers to seek a quick exit as well.",4
10 | 8,9,xlm-mlm-enfr-1024 gmm,25,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” The suspension of the case, House lawyers said, will “provide the and his future administration time to consider whether to continue prosecuting or to otherwise resolve this appeal. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions.",5
11 | 9,10,distilbert-base-uncased kmeans,6,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions.",2
12 | 10,11,distilbert-base-uncased gmm,2,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions.",2
13 | 11,12,albert-base-v1 kmeans,3,True,"To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. They are not yet ready to divulge their strategy. “ Given that this pending litigation involves the Obama administration and Congress, it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effort. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch.",2
14 | 12,13,albert-base-v1 gmm,2,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. Given that this pending litigation involves the Obama administration and Congress, it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effort. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions.",2
15 | 13,14,albert-large-v1 kmeans,4,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. In another twist, Donald J. Trump’s administration, worried about preserving executive branch prerogatives, could choose to fight its Republican allies in the House on some central questions in the dispute.",2
16 | 14,15,albert-large-v1 gmm,3,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. In another twist, Donald J. Trump’s administration, worried about preserving executive branch prerogatives, could choose to fight its Republican allies in the House on some central questions in the dispute. The Justice Department, confident that Judge Collyer’s decision would be reversed, quickly appealed, and the subsidies have remained in place during the appeal.",3
17 | 15,16,facebook/bart-large-cnn,36,False,ERROR,56
18 | 16,17,t5-11b,4,False,ERROR,2
19 | 17,18,t5-3b,SKIPPED,False,ERROR,SKIPPED
20 | 18,19,t5-base,30,True,a sudden loss,38
21 | 19,20,t5-large,SKIPPED,False,ERROR,SKIPPED
22 | 20,21,t5-small,9,True,incoming administration could,11
23 |
--------------------------------------------------------------------------------
/results/firstRunRseults.csv:
--------------------------------------------------------------------------------
1 | ,#,Model,Time Taken(with Downloads),Completed,Summary
2 | 0,1,NLTK Corpus,0,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. WASHINGTON — Congressional Republicans have a new fear when it comes to their health care lawsuit against the Obama administration: They might win"
3 | 1,2,bert-base-uncased kmeans,27,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch."
4 | 2,3,bert-base-uncased gmm,3,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch."
5 | 3,4,bert-large-uncased kmeans,29,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions. It is a complicated set of dynamics illustrating how a quick legal victory for the House in the Trump era might come with costs that Republicans never anticipated when they took on the Obama White House."
6 | 4,5,bert-large-uncased gmm,9,True,"The Justice Department, confident that Judge Collyer’s decision would be reversed, quickly appealed, and the subsidies have remained in place during the appeal. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions. It is a complicated set of dynamics illustrating how a quick legal victory for the House in the Trump era might come with costs that Republicans never anticipated when they took on the Obama White House."
7 | 5,6,xlnet-base-cased kmeans,11,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Republican leadership officials in the House acknowledge the possibility of “cascading effects” if the payments, which have totaled an estimated $13 billion, are suddenly stopped. Anticipating that the Trump administration might not be inclined to mount a vigorous fight against the House Republicans given the ’s dim view of the health care law, a team of lawyers this month sought to intervene in the case on behalf of two participants in the health care program."
8 | 6,7,xlnet-base-cased gmm,3,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Republican leadership officials in the House acknowledge the possibility of “cascading effects” if the payments, which have totaled an estimated $13 billion, are suddenly stopped. Anticipating that the Trump administration might not be inclined to mount a vigorous fight against the House Republicans given the ’s dim view of the health care law, a team of lawyers this month sought to intervene in the case on behalf of two participants in the health care program."
9 | 7,8,xlm-mlm-enfr-1024 kmeans,20,True,"WASHINGTON — Congressional Republicans have a new fear when it comes to their health care lawsuit against the Obama administration: They might win. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. The suspension of the case, House lawyers said, will “provide the and his future administration time to consider whether to continue prosecuting or to otherwise resolve this appeal. ” Over all, the loss of the subsidies could destabilize the entire program and cause a lack of confidence that leads other insurers to seek a quick exit as well."
10 | 8,9,xlm-mlm-enfr-1024 gmm,25,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” The suspension of the case, House lawyers said, will “provide the and his future administration time to consider whether to continue prosecuting or to otherwise resolve this appeal. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
11 | 9,10,distilbert-base-uncased kmeans,6,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
12 | 10,11,distilbert-base-uncased gmm,2,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
13 | 11,12,albert-base-v1 kmeans,3,True,"To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. They are not yet ready to divulge their strategy. “ Given that this pending litigation involves the Obama administration and Congress, it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effort. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch."
14 | 12,13,albert-base-v1 gmm,2,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. Given that this pending litigation involves the Obama administration and Congress, it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effort. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
15 | 13,14,albert-large-v1 kmeans,4,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. In another twist, Donald J. Trump’s administration, worried about preserving executive branch prerogatives, could choose to fight its Republican allies in the House on some central questions in the dispute."
16 | 14,15,albert-large-v1 gmm,3,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. In another twist, Donald J. Trump’s administration, worried about preserving executive branch prerogatives, could choose to fight its Republican allies in the House on some central questions in the dispute. The Justice Department, confident that Judge Collyer’s decision would be reversed, quickly appealed, and the subsidies have remained in place during the appeal."
17 | 15,16,facebook/bart-large-cnn,36,False,ERROR
18 | 16,17,t5-11b,4,False,ERROR
19 | 17,18,t5-3b,SKIPPED,False,ERROR
20 | 18,19,t5-base,30,True,a sudden loss
21 | 19,20,t5-large,SKIPPED,False,ERROR
22 | 20,21,t5-small,9,True,incoming administration could
23 |
--------------------------------------------------------------------------------
/results/secondRunRseults.csv:
--------------------------------------------------------------------------------
1 | ,#,Model,Time Taken(sec),Completed,Summary
2 | 0,1,NLTK Corpus,0,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. WASHINGTON — Congressional Republicans have a new fear when it comes to their health care lawsuit against the Obama administration: They might win"
3 | 1,2,bert-base-uncased kmeans,15,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch."
4 | 2,3,bert-base-uncased gmm,4,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch."
5 | 3,4,bert-large-uncased kmeans,8,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions. It is a complicated set of dynamics illustrating how a quick legal victory for the House in the Trump era might come with costs that Republicans never anticipated when they took on the Obama White House."
6 | 4,5,bert-large-uncased gmm,9,True,"The Justice Department, confident that Judge Collyer’s decision would be reversed, quickly appealed, and the subsidies have remained in place during the appeal. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions. It is a complicated set of dynamics illustrating how a quick legal victory for the House in the Trump era might come with costs that Republicans never anticipated when they took on the Obama White House."
7 | 5,6,xlnet-base-cased kmeans,3,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Republican leadership officials in the House acknowledge the possibility of “cascading effects” if the payments, which have totaled an estimated $13 billion, are suddenly stopped. Anticipating that the Trump administration might not be inclined to mount a vigorous fight against the House Republicans given the ’s dim view of the health care law, a team of lawyers this month sought to intervene in the case on behalf of two participants in the health care program."
8 | 6,7,xlnet-base-cased gmm,3,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ Republican leadership officials in the House acknowledge the possibility of “cascading effects” if the payments, which have totaled an estimated $13 billion, are suddenly stopped. Anticipating that the Trump administration might not be inclined to mount a vigorous fight against the House Republicans given the ’s dim view of the health care law, a team of lawyers this month sought to intervene in the case on behalf of two participants in the health care program."
9 | 7,8,xlm-mlm-enfr-1024 kmeans,4,True,"WASHINGTON — Congressional Republicans have a new fear when it comes to their health care lawsuit against the Obama administration: They might win. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. The suspension of the case, House lawyers said, will “provide the and his future administration time to consider whether to continue prosecuting or to otherwise resolve this appeal. ” Over all, the loss of the subsidies could destabilize the entire program and cause a lack of confidence that leads other insurers to seek a quick exit as well."
10 | 8,9,xlm-mlm-enfr-1024 gmm,5,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” The suspension of the case, House lawyers said, will “provide the and his future administration time to consider whether to continue prosecuting or to otherwise resolve this appeal. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
11 | 9,10,distilbert-base-uncased kmeans,2,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
12 | 10,11,distilbert-base-uncased gmm,2,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. They are not yet ready to divulge their strategy. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
13 | 11,12,albert-base-v1 kmeans,2,True,"To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. They are not yet ready to divulge their strategy. “ Given that this pending litigation involves the Obama administration and Congress, it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effort. “ Just as important to House Republicans, Judge Collyer found that Congress had the standing to sue the White House on this issue — a ruling that many legal experts said was flawed — and they want that precedent to be set to restore congressional leverage over the executive branch."
14 | 12,13,albert-base-v1 gmm,2,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. Given that this pending litigation involves the Obama administration and Congress, it would be inappropriate to comment,” said Phillip J. Blando, a spokesman for the Trump transition effort. “ In successfully seeking a temporary halt in the proceedings after Mr. Trump won, House Republicans last month told the court that they “and the ’s transition team currently are discussing potential options for resolution of this matter, to take effect after the ’s inauguration on Jan. 20, 2017. ” But on spending power and standing, the Trump administration may come under pressure from advocates of presidential authority to fight the House no matter their shared views on health care, since those precedents could have broad repercussions."
15 | 13,14,albert-large-v1 kmeans,2,True,"The incoming Trump administration could choose to no longer defend the executive branch against the suit, which challenges the administration’s authority to spend billions of dollars on health insurance subsidies for and Americans, handing House Republicans a big victory on issues. But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. In another twist, Donald J. Trump’s administration, worried about preserving executive branch prerogatives, could choose to fight its Republican allies in the House on some central questions in the dispute."
16 | 14,15,albert-large-v1 gmm,3,True,"But a sudden loss of the disputed subsidies could conceivably cause the health care program to implode, leaving millions of people without access to health insurance before Republicans have prepared a replacement. To stave off that outcome, Republicans could find themselves in the awkward position of appropriating huge sums to temporarily prop up the Obama health care law, angering conservative voters who have been demanding an end to the law for years. In another twist, Donald J. Trump’s administration, worried about preserving executive branch prerogatives, could choose to fight its Republican allies in the House on some central questions in the dispute. The Justice Department, confident that Judge Collyer’s decision would be reversed, quickly appealed, and the subsidies have remained in place during the appeal."
17 | 15,16,facebook/bart-large-cnn,56,False,ERROR
18 | 16,17,t5-11b,2,False,ERROR
19 | 17,18,t5-3b,SKIPPED,False,ERROR
20 | 18,19,t5-base,38,True,a sudden loss
21 | 19,20,t5-large,SKIPPED,False,ERROR
22 | 20,21,t5-small,11,True,incoming administration could
23 |
--------------------------------------------------------------------------------