├── .gitignore
├── LICENSE
├── Procfile
├── README.md
├── Text Summarizer.gif
├── app.py
├── cover.png
├── licence.jpg
├── requirements.txt
├── static
    └── cover.png
└── templates
    └── index1.html
/.gitignore:
--------------------------------------------------------------------------------
1 | *.agdai
2 | MAlonzo/**
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 2-Clause License
2 |
3 | Copyright (c) 2020, K Dheeraj Kumar
4 | All rights reserved.
5 |
6 | Redistribution and use in source and binary forms, with or without
7 | modification, are permitted provided that the following conditions are met:
8 |
9 | 1. Redistributions of source code must retain the above copyright notice, this
10 | list of conditions and the following disclaimer.
11 |
12 | 2. Redistributions in binary form must reproduce the above copyright notice,
13 | this list of conditions and the following disclaimer in the documentation
14 | and/or other materials provided with the distribution.
15 |
16 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
17 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
18 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
19 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
20 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
21 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
22 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
23 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
24 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
25 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
26 |
--------------------------------------------------------------------------------
/Procfile:
--------------------------------------------------------------------------------
1 | web: gunicorn app:app
2 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | 
2 |
3 | # Text-Summarizer-Flask-Deployment
4 | # Deployment Demo
5 |
6 | 
7 |
8 | # Table Of Contents
9 | - [Project Goal](#Project-Goal)
10 | - [Project Motivation](#Project-Motivation)
11 | - [Requirements Installation](#Requirements-Installation)
12 | - [File Sections](#File-Sections)
13 | - [Technologies Used](#Technologies-Used)
14 | - [License](#License)
15 |
16 | # Project Goal
17 |
18 | **This project is designed to summarize a given text and return the summary to the user, using Python, Flask, and HTML.**
19 |
20 | # Project Motivation
21 |
22 | **While browsing the internet we come across many topics, and each of them has to be covered. To reduce that effort, instead of studying the whole text we can study a brief summary of it.**
23 |
24 | # Requirements Installation
25 |
26 | **The code is written in Python 3.7. If you don't have Python installed you can find it here. If you are using an older version of Python, upgrade it first, and make sure you have the latest version of pip. To install the required packages and libraries, run this command in the project directory after cloning the repository:**
27 |
28 | pip install -r requirements.txt
29 |
30 | # File Sections
31 |
32 | ## app.py
33 |
34 | 1- **Data Preprocessing**
35 |
36 | 2- **Bag of Words Model**
37 |
38 | 3- **Sentence Similarity**
39 |
40 | 4- **Building Similarity Matrix**
41 |
42 | 5- **Generating Summary**
43 |
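The five steps above can be sketched end to end as follows. This is a minimal illustration, not the code in `app.py`: it uses only the standard library, so the tiny `STOP_WORDS` set stands in for NLTK's English stop-word list and the simplified power iteration in `rank` stands in for `networkx.pagerank`. All function names here are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (app.py uses NLTK's English list).
STOP_WORDS = {"a", "an", "and", "is", "of", "on", "the", "to"}

def split_sentences(text):
    """Step 1: split into sentences, then into alphanumeric word tokens."""
    sentences = []
    for raw in text.split(". "):
        words = re.sub(r"[^A-Za-z0-9]", " ", raw).split()
        if words:  # skip empty fragments
            sentences.append(words)
    return sentences

def cosine_similarity(words1, words2):
    """Steps 2-3: bag-of-words counts compared by cosine similarity."""
    c1 = Counter(w.lower() for w in words1 if w.lower() not in STOP_WORDS)
    c2 = Counter(w.lower() for w in words2 if w.lower() not in STOP_WORDS)
    dot = sum(c1[w] * c2[w] for w in c1)
    norm = (math.sqrt(sum(v * v for v in c1.values()))
            * math.sqrt(sum(v * v for v in c2.values())))
    return dot / norm if norm else 0.0  # all-stop-word sentences score 0

def similarity_matrix(sentences):
    """Step 4: pairwise similarities, with zeros on the diagonal."""
    n = len(sentences)
    return [[0.0 if i == j else cosine_similarity(sentences[i], sentences[j])
             for j in range(n)] for i in range(n)]

def rank(matrix, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration (isolated sentences keep only the base score)."""
    n = len(matrix)
    scores = [1.0 / n] * n
    row_sums = [sum(row) for row in matrix]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * matrix[i][j] / row_sums[i]
                            for i in range(n) if row_sums[i])
            for j in range(n)
        ]
    return scores

def summarize(text, top_n=2):
    """Step 5: join the top_n highest-ranked sentences."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    scores = rank(similarity_matrix(sentences))
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_n]
    return ". ".join(" ".join(sentences[i]) for i in best)

print(summarize("Cats sleep a lot. Cats sleep on sofas. Stocks fell today.", top_n=2))
```

Sentences that share vocabulary with many others reinforce each other's scores, so an unrelated sentence such as the one about stocks ranks last and is left out of a two-sentence summary.
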
44 | # Technologies Used
45 |
46 | 
48 | 
49 | 
50 |
51 | # License
52 |
53 | 
54 |
55 | Copyright 2020 DHEERAJ KUMAR
56 |
57 | https://opensource.org/licenses/MIT
58 |
59 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
60 |
--------------------------------------------------------------------------------
/Text Summarizer.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DheerajKumar97/Text-Summarizer-Flask-Deployment/f4b28758636083305da7c5d5400b83131371311b/Text Summarizer.gif
--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
1 | from nltk.corpus import stopwords
2 | import numpy as np
3 | import networkx as nx
4 | import regex
5 | from flask import Flask, request, jsonify, render_template
6 | import nltk
7 | # nltk.download('stopwords')
8 |
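9 | def read_article(data):
10 |     # Split the text into sentences, then each sentence into word tokens.
11 |     sentences = []
12 |     for sentence in data.split(". "):
13 |         # Replace every non-alphanumeric character with a space, then tokenize.
14 |         words = regex.sub("[^A-Za-z0-9]", " ", sentence).split()
15 |         if words:  # skip empty fragments instead of always dropping the last sentence
16 |             sentences.append(words)
17 |     return sentences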
18 |
19 | def sentence_similarity(sent1, sent2, stopwords=None):
20 |     if stopwords is None:
21 |         stopwords = []
22 |
23 |     sent1 = [w.lower() for w in sent1]
24 |     sent2 = [w.lower() for w in sent2]
25 |
26 |     all_words = list(set(sent1 + sent2))
27 |
28 |     vector1 = [0] * len(all_words)  # makes a vector of len(all_words)
29 |     vector2 = [0] * len(all_words)
30 |
31 |     # build the vector for the first sentence
32 |     for w in sent1:
33 |         if w in stopwords:
34 |             continue
35 |         vector1[all_words.index(w)] += 1
36 |
37 |     # build the vector for the second sentence
38 |     for w in sent2:
39 |         if w in stopwords:
40 |             continue
41 |         vector2[all_words.index(w)] += 1
42 |
43 |     return 1 - nltk.cluster.util.cosine_distance(vector1, vector2)
44 |
45 | def build_similarity_matrix(sentences, stop_words):
46 |     # Create an empty similarity matrix
47 |     similarity_matrix = np.zeros((len(sentences), len(sentences)))
48 |
49 |     for idx1 in range(len(sentences)):
50 |         for idx2 in range(len(sentences)):
51 |             if idx1 == idx2:  # ignore if both are the same sentence
52 |                 continue
53 |             similarity_matrix[idx1][idx2] = sentence_similarity(sentences[idx1], sentences[idx2], stop_words)
54 |
55 |     return similarity_matrix
56 |
57 |
58 | def generate_summary(file_name, top_n=10):
59 |     stop_words = stopwords.words('english')
60 |     summarize_text = []
61 |
62 |     # Step 1 - Read the text and split it (file_name holds the raw input text, not a path)
63 |     sentences = read_article(file_name)
64 |
65 |     # Step 2 - Generate the similarity matrix across sentences
66 |     sentence_similarity_matrix = build_similarity_matrix(sentences, stop_words)
67 |
68 |     # Step 3 - Rank sentences in the similarity matrix
69 |     sentence_similarity_graph = nx.from_numpy_array(sentence_similarity_matrix)
70 |     scores = nx.pagerank(sentence_similarity_graph)
71 |
72 |     # Step 4 - Sort by rank and pick the top sentences
73 |     ranked_sentence = sorted(((scores[i], s) for i, s in enumerate(sentences)), reverse=True)
74 |     # print("\n\n---------------\nIndexes of top ranked_sentence order are ", ranked_sentence)
75 |
76 |     for i in range(min(top_n, len(ranked_sentence))):  # never index past the available sentences
77 |         summarize_text.append(" ".join(ranked_sentence[i][1]))
78 |
79 |     # Step 5 - Of course, output the summarized text
80 |     # print("\n")
81 |     # print("*"*140)
82 |     # print("\n\nSUMMARY: \n---------\n\n", ". ".join(summarize_text))
83 |     a = ". ".join(summarize_text)
84 |     return a
85 |
86 | #----------FLASK-----------------------------#
87 |
88 | app = Flask(__name__)
89 | @app.route('/templates', methods =['POST'])
90 | def original_text_form():
91 |     text = request.form['input_text']
92 |     number_of_sent = request.form['num_sentences']
93 |     # print("TEXT:\n",text)
94 |     summary = generate_summary(text, int(number_of_sent))
95 |     # print("*"*30)
96 |     # print(summary)
97 |     return render_template('index1.html', title = "Summarizer", original_text = text, output_summary = summary, num_sentences = 5)
98 |
99 | @app.route('/')
100 | def homepage():
101 |     title = "TEXT summarizer"
102 |     return render_template('index1.html', title = title)
103 |
104 | if __name__ == "__main__":
105 |     app.debug = True
106 |     app.run()
107 |
--------------------------------------------------------------------------------
/cover.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DheerajKumar97/Text-Summarizer-Flask-Deployment/f4b28758636083305da7c5d5400b83131371311b/cover.png
--------------------------------------------------------------------------------
/licence.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DheerajKumar97/Text-Summarizer-Flask-Deployment/f4b28758636083305da7c5d5400b83131371311b/licence.jpg
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | Flask==1.1.1
2 | gunicorn==19.9.0
3 | itsdangerous==1.1.0
4 | Jinja2==2.10.1
5 | MarkupSafe==1.1.1
6 | Werkzeug==0.15.5
7 | numpy>=1.9.2
8 | scipy>=0.15.1
9 | scikit-learn>=0.18
10 | matplotlib>=1.4.3
11 | pandas>=0.19
12 | nltk>=3.4.5
13 | regex>=2.5.82
14 | networkx==2.3
15 |
16 |
17 |
--------------------------------------------------------------------------------
/static/cover.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/DheerajKumar97/Text-Summarizer-Flask-Deployment/f4b28758636083305da7c5d5400b83131371311b/static/cover.png
--------------------------------------------------------------------------------
/templates/index1.html:
--------------------------------------------------------------------------------
1 |
2 |