├── Headline-text.jpg
├── README.md
├── LaMini-TextSummarizer_mockup.py
├── main.py
├── LaMini-TextSummarizer.py
└── AutomaticTextSummarization.txt

/Headline-text.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/LaMini-TextSummarizer/main/Headline-text.jpg
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # LaMini-TextSummarizer
2 | Repo of the code from the Medium Article Mastering AI Summarization: Your Ultimate Productivity Hack
3 | 
--------------------------------------------------------------------------------
/LaMini-TextSummarizer_mockup.py:
--------------------------------------------------------------------------------
1 | import streamlit as st
2 | ############# Displaying images on the front end #################
3 | st.set_page_config(page_title="Mockup for single page webapp",
4 |                    page_icon='💻',
5 |                    layout="centered", #or wide
6 |                    initial_sidebar_state="expanded",
7 |                    menu_items={
8 |                        'Get Help': 'https://docs.streamlit.io/library/api-reference',
9 |                        'Report a bug': "https://www.extremelycoolapp.com/bug",
10 |                        'About': "# This is a header. This is an *extremely* cool app!"}
11 |                    )
12 | # Load image placeholder from the web
13 | st.image('https://placehold.co/750x150', width=750)
14 | # Set a Descriptive Title
15 | st.title("Your Beautiful App Name")
16 | st.divider()
17 | your_future_text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras rhoncus massa sit amet est congue dapibus. Duis dictum ac nulla sit amet sollicitudin. In non metus ac neque vehicula egestas. Vestibulum quis justo id enim vestibulum venenatis. Cras gravida ex vitae dignissim suscipit. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Duis efficitur, lorem ut fringilla commodo, lacus orci lobortis turpis, sit amet consequat ante diam ut libero."
18 | st.text_area('Summarized text', your_future_text, height = 150, key = 'result')
19 | 
20 | #col1, col0, col2 = st.columns(3) #for 3 columns even distribution
21 | col1, col2 = st.columns(2)
22 | btn1 = col1.button(" :star: Click ME ", use_container_width=True, type="secondary")
23 | btn2 = col2.button(" :smile: Click ME ", use_container_width=True, type="primary")
24 | 
25 | if btn1:
26 |     st.warning('You pressed the wrong one!', icon="⚠️")
27 | if btn2:
28 |     st.success('Good Choice!', icon="✅")
29 | st.divider()
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | ########### GUI IMPORTS ################
2 | import streamlit as st
3 | #### IMPORTS FOR AI PIPELINES ###############
4 | from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
5 | from transformers import pipeline
6 | from transformers import AutoModel, T5Tokenizer, T5Model
7 | from transformers import T5ForConditionalGeneration
8 | from langchain.llms import HuggingFacePipeline
9 | import torch
10 | 
11 | # SET THE MODEL PATH
12 | checkpoint = "./model/" #it is actually LaMini-Flan-T5-248M
13 | # INITIALIZE TOKENIZER AND MODEL
14 | tokenizer = T5Tokenizer.from_pretrained(checkpoint)
15 | base_model = T5ForConditionalGeneration.from_pretrained(
16 |     checkpoint,
17 |     device_map='auto',
18 |     torch_dtype=torch.float32)
19 | pipe_sum = pipeline('summarization',
20 |                     model = base_model,
21 |                     tokenizer = tokenizer,
22 |                     max_length = 350,
23 |                     min_length = 25)
24 | 
25 | text = " Automatic text summarization with machine learning is the task of condensing a piece of text to a shorter version, reducing the size of the initial text while at the same time preserving key informational elements and the meaning of content. It is a challenging task that requires extensive research in the NLP area. There are two different approaches for automatic text summarization: extraction and abstraction. The extraction method involves identifying important sections of the text and stitching together portions of the content to produce a condensed version. The scoring function assigns a value to each sentence denoting the probability with which it will get picked up in the summary. The process involves constructing an intermediate representation of the input text and scoring the sentences based on the representation. A typical flow of extractive summarization systems involves constructing intermediate representations of the input text, scoring sentences based on the representation, and using Latent semantic analysis (LSA) to identify semantically important sentences. Recent studies have also applied deep learning in extractive text summarization, such as Sukriti's approach for factual reports using a deep learning model, Yong Zhang's document summarizing framework using convolutional neural networks, and Y. Kim's regression process for sentence ranking. The neural architecture used in the paper is composed of a single convolution layer built on top of pre-trained word vectors, followed by a max-pooling layer. Experiments have shown the proposed model achieved competitive or even better performance compared with baselines. Abstractive summarization methods aim to produce a summary by interpreting the text using advanced natural language techniques to generate a new shorter text that conveys the most critical information from the original text. They take advantage of recent developments in deep learning and use an attention-based encoder-decoder method for generating abstractive summaries. Recent studies have argued that attention-based sequence-to-sequence models can suffer from repetition and semantic irrelevance, causing grammatical errors and insufficient reflection of the main idea of the source text. Junyang Lin et al propose a gated unit on top of the encoder outputs at each time step to tackle this problem. The code to reproduce the experiments from the NAMAS paper can be found here. The Pointer Network is a neural attention-based sequence-to-sequence architecture that learns the conditional probability of an output sequence with elements that are discrete tokens corresponding to positions in an input sequence. Other methods for abstractive summarization include Pointer-Generator, which allows copying words from the input sequence via pointing of specific positions, and a generator that generates words from a fixed vocabulary of 50k words. To overcome repetition problems, the paper adapts the coverage model of Tu et al. to overcome the lack of coverage of source words in neural machine translation models. To train the extractor on available document-summary pairs, the model uses policy-based reinforcement learning (RL) with sentence-level metric rewards to connect both extractor and abstractor networks and to learn sentence saliency. The abstractor network is an attention-based encoder-decoder which compresses and paraphrases an extracted document sentence to a concise summary sentence. The extractor agent is a convolutional sentence encoder that computes representations for each sentence based on input embedded word vectors. An RNN encoder computes a context-aware representation and then an RNN decoder selects a sentence at time step t. The method incorporates the abstractive approach's advantages of concisely rewriting sentences and generating novel words from the full vocabulary, while adopting intermediate extractive behavior to improve the overall model's quality, speed, and stability. Recent studies have applied a combination of adversarial processes and reinforcement learning to abstractive summarization. The extractive approach is easier because copying large chunks of text from the source document ensures good levels of grammaticality and accuracy, while the abstractive model generates new phrases, rephrasing or using words that were not in the original text. Recent developments in the deep learning area have made such more sophisticated abilities possible."
26 | # RUN THE PIPELINE ON THE TEXT AND PRINT RESULT
27 | result = pipe_sum(text)
28 | print(result)
29 | # print(result[0]['summary_text'])
--------------------------------------------------------------------------------
/LaMini-TextSummarizer.py:
--------------------------------------------------------------------------------
1 | ########### GUI IMPORTS ################
2 | import streamlit as st
3 | import ssl
4 | ############# Displaying images on the front end #################
5 | st.set_page_config(page_title="Summarize and Talk to your Text",
6 |                    page_icon='📖',
7 |                    layout="centered", #or wide
8 |                    initial_sidebar_state="expanded",
9 |                    menu_items={
10 |                        'Get Help': 'https://docs.streamlit.io/library/api-reference',
11 |                        'Report a bug': "https://www.extremelycoolapp.com/bug",
12 |                        'About': "# This is a header. This is an *extremely* cool app!"
13 |                    },
14 |                    )
15 | ########### SSL FOR PROXY ##############
16 | ssl._create_default_https_context = ssl._create_unverified_context
17 | 
18 | #### IMPORTS FOR AI PIPELINES ###############
19 | from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
20 | from transformers import pipeline
21 | 
22 | from transformers import AutoModel, T5Tokenizer, T5Model
23 | from transformers import T5ForConditionalGeneration
24 | from langchain.llms import HuggingFacePipeline
25 | import torch
26 | import datetime
27 | 
28 | #############################################################################
29 | # SIMPLE TEXT2TEXT GENERATION INFERENCE
30 | # checkpoint = "./models/LaMini-Flan-T5-783M.bin"
31 | # ###########################################################################
32 | checkpoint = "./model/" #it is actually LaMini-Flan-T5-248M
33 | LaMini = './model/'
34 | 
35 | ######################################################################
36 | #          SUMMARIZATION FROM TEXT STRING WITH HUGGINGFACE PIPELINE  #
37 | ######################################################################
38 | def AI_SummaryPL(checkpoint, text, chunks, overlap):
39 | 
40 |     """
41 |     checkpoint: relative path to the local model folder
42 |     example: checkpoint = "/content/model/" #it is actually LaMini-Flan-T5-248M #tested fine
43 |     text: either a long input string or a document loaded into a string
44 |     chunks: integer, character length of each chunk used for splitting
45 |     overlap: integer, character overlap between chunks, used to retain context across chunk boundaries
46 |     RETURNS full_summary (str), delta (datetime.timedelta) and reduction (str)
47 | 
48 |     post_summary14 = AI_SummaryPL(LaMini,doc2,3700,500)
49 |     USAGE EXAMPLE:
50 |     post_summary, post_time, post_percentage = AI_SummaryPL(LaMini,originalText,3700,500)
51 |     """
52 |     from langchain.text_splitter import RecursiveCharacterTextSplitter
53 |     text_splitter = RecursiveCharacterTextSplitter(
54 |         # Split the text into chunks of `chunks` characters with `overlap` characters of overlap
55 |         chunk_size = chunks,
56 |         chunk_overlap = overlap,
57 |         length_function = len,
58 |     )
59 |     texts = text_splitter.split_text(text)
60 |     #checkpoint = "/content/model/" #it is actually LaMini-Flan-T5-248M #tested fine
61 |     checkpoint = checkpoint
62 |     tokenizer = T5Tokenizer.from_pretrained(checkpoint)
63 |     base_model = T5ForConditionalGeneration.from_pretrained(checkpoint,
64 |                                                              device_map='auto',
65 |                                                              torch_dtype=torch.float32)
66 |     ### INITIALIZING PIPELINE
67 |     pipe_sum = pipeline('summarization',
68 |                         model = base_model,
69 |                         tokenizer = tokenizer,
70 |                         max_length = 350,
71 |                         min_length = 25
72 |                         )
73 |     ## START TIMER
74 |     start = datetime.datetime.now() #not used now but useful
75 |     ## START CHUNKING
76 |     full_summary = ''
77 |     for cnk in range(len(texts)):
78 |         result = pipe_sum(texts[cnk])
79 |         full_summary = full_summary + ' ' + result[0]['summary_text']
80 |     stop = datetime.datetime.now() #not used now but useful
81 |     ## TIMER STOPPED AND RETURN DURATION
82 |     delta = stop-start
83 |     ### Calculating Summarization PERCENTAGE
84 |     reduction = '{:.1%}'.format(len(full_summary)/len(text))
85 |     print(f"Completed in {delta}")
86 |     print(f"Reduction percentage: {reduction}")
87 | 
88 |     return full_summary, delta, reduction
89 | 
90 | 
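# ---------------------------------------------------------------------------
# Usage sketch: assuming the LaMini-Flan-T5-248M files are already in ./model/,
# the helper above can also be called outside Streamlit, e.g. from a plain
# Python session (the file name below is only an example):
#
#   with open('AutomaticTextSummarization.txt', encoding='utf-8') as f:
#       long_text = f.read()
#   summary, elapsed, pct = AI_SummaryPL(LaMini, long_text, 3700, 500)
#   print(summary)
# ---------------------------------------------------------------------------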
91 | global text_summary
92 | 
93 | ### HEADER section
94 | st.image('Headline-text.jpg', width=750)
95 | title = st.text_area('Insert here your Copy/Paste text', "", height = 350, key = 'copypaste')
96 | btt = st.button("1. Start Summarization")
97 | txt = st.empty()
98 | timedelta = st.empty()
99 | text_length = st.empty()
100 | redux_bar = st.empty()
101 | st.divider()
102 | down_title = st.empty()
103 | down_btn = st.button('2. Download Summarization')
104 | text_summary = ''
105 | 
106 | def start_sum(text):
107 |     if st.session_state.copypaste == "":
108 |         st.warning('You need to paste some text...', icon="⚠️")
109 |     else:
110 |         with st.spinner('Initializing pipelines...'):
111 |             st.success(' AI process started', icon="🤖")
112 |             print("Starting AI pipelines")
113 |             text_summary, duration, reduction = AI_SummaryPL(LaMini, text, 3700, 500)
114 |             txt.text_area('Summarized text', text_summary, height = 350, key='final')
115 |             timedelta.write(f'Completed in {duration}')
116 |             text_length.markdown(f"Initial length = {len(text.split(' '))} words / summarization = **{len(text_summary.split(' '))} words**")
117 |             redux_bar.progress(len(text_summary)/len(text), f'Reduction: **{reduction}**')
118 |             down_title.markdown(f"## Download your text Summarization")
119 | 
120 | 
121 | 
122 | if btt:
123 |     start_sum(st.session_state.copypaste)
124 | 
125 | if down_btn:
126 |     def savefile(generated_summary, filename):
127 |         st.write("Download in progress...")
128 |         with open(filename, 'w') as t:
129 |             t.write(generated_summary)
130 |             t.close()
131 |         st.success(f'AI Summarization saved in {filename}', icon="✅")
132 |     savefile(st.session_state.final, 'text_summarization.txt')
133 |     txt.text_area('Summarized text', st.session_state.final, height = 350)
134 | 
135 | 
136 | 
137 | 
--------------------------------------------------------------------------------
/AutomaticTextSummarization.txt:
--------------------------------------------------------------------------------
1 | Title: Automatic Text Summarization with Machine Learning — An overview | by Luís Gonçalves | luisfredgs | Medium
2 | -------------------------------------------------------------------------------
3 | 
4 | Summarization is the task of condensing a piece of text to a shorter version, reducing the size of the initial text while at the same time preserving key informational elements and the meaning of content. Since manual text summarization is a time-expensive and generally laborious task, automating it is gaining increasing popularity and therefore constitutes a strong motivation for academic research.
5 | 
6 | There are important applications for text summarization in various NLP-related tasks such as text classification, question answering, legal text summarization, news summarization, and headline generation. Moreover, the generation of summaries can be integrated into these systems as an intermediate stage which helps to reduce the length of the document.
7 | 
8 | In the big data era, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an inestimable source of information and knowledge which needs to be effectively summarized to be useful. This increasing availability of documents has demanded exhaustive research in the NLP area on automatic text summarization. Automatic text summarization is the task of producing a concise and fluent summary without any human help while preserving the meaning of the original text document.
9 | 
10 | It is very challenging, because when we as humans summarize a piece of text, we usually read it entirely to develop our understanding, and then write a summary highlighting its main points. Since computers lack human knowledge and language capability, automatic text summarization is a very difficult and non-trivial task.
11 | 
12 | Various models based on machine learning have been proposed for this task. Most of these approaches model the problem as a classification problem which outputs whether to include a sentence in the summary or not. Other approaches have used topic information, Latent Semantic Analysis (LSA), Sequence to Sequence models, Reinforcement Learning and Adversarial processes.
13 | 
14 | In general, there are two different approaches for automatic summarization: extraction and abstraction.
15 | 
16 | The extractive approach
17 | 
18 | Extractive summarization picks up sentences directly from the document based on a scoring function to form a coherent summary. This method works by identifying important sections of the text, cropping them out, and stitching together portions of the content to produce a condensed version.
19 | 
20 | Thus, extractive methods depend only on the extraction of sentences from the original text.
21 | 
22 | Most of the summarization research today has focused on extractive summarization, since it is easier and yields naturally grammatical summaries requiring relatively little linguistic analysis. Moreover, extractive summaries contain the most important sentences of the input, which can be a single document or multiple documents.
23 | 
24 | A typical flow of extractive summarization systems consists of the following steps (a minimal code sketch follows the list):
25 | 
26 | 1. Constructing an intermediate representation of the input text, intended to find salient content. Typically, this works by computing TF metrics for each sentence in a term-sentence matrix.
27 | 
28 | 2. Scoring the sentences based on the representation, assigning a value to each sentence denoting the probability with which it will get picked up in the summary.
29 | 
30 | 3. Producing a summary based on the top k most important sentences. Some studies have used Latent Semantic Analysis (LSA) to identify semantically important sentences.
31 | 
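A minimal, self-contained sketch of this pipeline in plain Python (naive sentence splitting and raw term-frequency scoring; an illustration of the flow rather than a production implementation):

    import re
    from collections import Counter

    def extractive_summary(text, k=3):
        # 1. Intermediate representation: sentences plus term frequencies
        sentences = re.split(r'(?<=[.!?])\s+', text.strip())
        tf = Counter(re.findall(r'\w+', text.lower()))
        # 2. Score each sentence by its length-normalized term-frequency sum
        def score(sentence):
            tokens = re.findall(r'\w+', sentence.lower())
            return sum(tf[t] for t in tokens) / (len(tokens) or 1)
        ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
        # 3. Keep the top-k sentences, restored to their original order
        return ' '.join(sentences[i] for i in sorted(ranked[:k]))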
32 | For a good starting point to the LSA models in summarization, check this paper and this one. An implementation of LSA for extractive text summarization in Python is available in this github repo. For example, I used this code to make the following summary:
33 | 
34 | Original text:
35 | 
36 | According to Certsys specialist Diego Howës (Certsys is a company that has been working on implementing and modifying the flows of these bots), companies have been looking to extend their internal-facing service bots with these new prevention demands, so that employees can have at hand information about the disease, types of care, good hygiene practices, and general guidance on optimizing work from home. Businesses that communicate with an external audience, on the other hand, see other needs. "We have retail clients who asked for new flows to be created addressing the topic and informing consumers that deliveries of products purchased online may be delayed," comments Howës of Certsys, which has been seeking to broaden the scope of these channels to fit this moment of heightened attention. Still according to the specialist, across the whole market it is possible to observe a trend toward automating service to the population, with demand for chatbots that work on high-traffic channels such as WhatsApp in the case of public agencies. In the healthcare area, spreading information about the virus pandemic has been an ongoing effort.
37 | 
38 | Summarized text:
39 | 
40 | According to Certsys specialist Diego Howës (Certsys is a company that has been working on implementing and modifying the flows of these bots), companies have been looking to extend their internal-facing service bots with these new prevention demands, so that employees can have at hand information about the disease, types of care, good hygiene practices, and general guidance on optimizing work from home. Businesses that communicate with an external audience, on the other hand, see other needs. In the healthcare area, spreading information about the virus pandemic has been an ongoing effort.
41 | 
42 | Recent studies have applied deep learning in extractive summarization as well. For instance, Sukriti proposes an extractive text summarization approach for factual reports using a deep learning model, exploring various features to improve the set of sentences selected for the summary.
43 | 
44 | Yong Zhang proposed a document summarization framework based on convolutional neural networks that learns sentence features and performs sentence ranking jointly with a CNN model. The authors adapt the original classification model of Y. Kim to address a regression process for sentence ranking. The neural architecture used in that paper is composed of a single convolution layer built on top of pre-trained word vectors, followed by a max-pooling layer. The authors carried out experiments on both single- and multi-document summarization tasks to evaluate the proposed model. Results have shown the method achieved competitive or even better performance compared with baselines. The source code used in the experiments can be found here.
45 | 
46 | Abstractive summarization
47 | 
48 | Abstractive summarization methods aim at producing a summary by interpreting the text using advanced natural language techniques in order to generate a new, shorter text that conveys the most critical information from the original text; parts of the summary may not appear in the original document at all. This requires rephrasing sentences and incorporating information from the full text, much as a human-written abstract does. In fact, an acceptable abstractive summary covers core information in the input and is linguistically fluent.
49 | 
50 | Thus, abstractive methods are not restricted to simply selecting and rearranging passages from the original text.
51 | 
52 | Abstractive methods take advantage of recent developments in deep learning. Since the task can be regarded as a sequence mapping task, where the source text should be mapped to the target summary, abstractive methods build on the recent success of sequence to sequence models. These models consist of an encoder and a decoder, where a neural network reads the text, encodes it, and then generates the target text.
53 | 
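To make the encoder-decoder setup concrete, here is a short sketch using the Hugging Face transformers summarization pipeline (the same pipeline API used in main.py of this repo; it assumes the local ./model/ copy of LaMini-Flan-T5-248M, and any other sequence to sequence summarization checkpoint could be substituted):

    from transformers import pipeline

    # Load a pretrained encoder-decoder (seq2seq) summarization model
    summarizer = pipeline('summarization', model='./model/')

    article = ("Automatic text summarization condenses a document into a shorter text "
               "while preserving its key information and overall meaning.")
    # The encoder reads and encodes the source text; the decoder generates the summary
    print(summarizer(article, max_length=60, min_length=10)[0]['summary_text'])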
Thus, they are still far away from reaching human-level quality in summary generation, despite recent progress using neural networks inspired by the progress of neural machine translation and sequence to sequence models. 55 | 56 | An example is the work of Alexander et al, which proposed a neural attention model for abstractive sentence summarization (NAMAS) by exploring a fully data-driven approach for generating abstractive summaries using an attention-based encoder-decoder method. Attention mechanism has been broadly used in sequence to sequence models where the decoder extracts information from the encoder based on the attention scores on the source-side information. The code to reproduce the experiments from the NAMAS paper can be found here. 57 | 58 | Example output of the attention-based summarization of Alexander et al. The heatmap represents a soft alignment between the input (right) and the generated summary (top). The columns represent the distribution over the input after generating each word. 59 | 60 | Recent studies have argued attention-based sequence to sequence models for abstractive summarization can suffer from repetition and semantic irrelevance, causing grammatical errors and insufficient reflection of the main idea of the source text. Junyang Lin et al propose to implement a gated unit on top of the encoder outputs at each time step, which is a CNN that convolves all the encoder outputs, in order to tackle this problem. 61 | 62 | Based on the convolution and self-attention of Vaswani et al., a convolutional gated unit sets a gate to filter the source annotations from the RNN encoder, in order to select information relevant to the global semantic meaning. In other words, it refines the representation of the source context with a CNN to improve the connection of the word representation with the global context. Their model is capable of reducing repetition compared with the sequence to sequence model outperforming the state-of-the-art methods. The source code of paper can be found here. 63 | 64 | Other methods for abstractive summarization have borrowed the concepts from the pointer network of Vinyals et al to addresses the undesirable behavior of sequence to sequence models. Pointer Network is a neural attention-based sequence-to-sequence architecture that learns the conditional probability of an output sequence with elements that are discrete tokens corresponding to positions in an input sequence. 65 | 66 | For example, Abigail See et al. presented an architecture called Pointer-Generator, which allows copying words from the input sequence via pointing of specific positions, whereas a generator allows generating words from a fixed vocabulary of 50k words. The architecture can be viewed as a balance between extractive and abstractive approaches. 67 | 68 | In order to overcome the repetition problems, the paper adapts the coverage model of Tu et al., which was proposed to overcome the lacking coverage of source words in neural machine translation models. Specifically, Abigail See et al. defined a flexible coverage loss to penalize repeatedly attending to the same locations, only penalizing the overlap between each attention distribution and the coverage up to the current time step helping to prevents repeated attention. The source code for the model can be found here. 69 | 70 | The Pointer-generator model. 
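A minimal sketch of these two ideas, assuming PyTorch tensors coming out of the decoder (tensor names and shapes below are illustrative, not taken from the released implementation):

    import torch

    def pointer_generator_dist(p_gen, vocab_dist, attn_dist, src_ids):
        # p_gen:      (batch, 1)          probability of generating from the fixed vocabulary
        # vocab_dist: (batch, vocab_size) softmax over the fixed vocabulary
        # attn_dist:  (batch, src_len)    attention over source token positions
        # src_ids:    (batch, src_len)    vocabulary ids of the source tokens
        final_dist = p_gen * vocab_dist
        # scatter the copy probabilities onto the vocabulary ids they point at
        return final_dist.scatter_add(1, src_ids, (1.0 - p_gen) * attn_dist)

    def coverage_loss(attn_dist, coverage):
        # coverage accumulates the attention distributions of previous decoder steps;
        # penalizing min(attention, coverage) discourages re-attending to the same positions
        return torch.min(attn_dist, coverage).sum(dim=1)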
71 | 
72 | Other studies in abstractive summarization have borrowed concepts from the reinforcement learning (RL) field to improve model accuracy. For example, Chen et al. proposed a hybrid extractive-abstractive architecture using two neural networks in a hierarchical way: it selects salient sentences from the source with an RL-guided extractor and then rewrites them abstractively to generate a summary.
73 | 
74 | In other words, the model simulates how humans summarize long documents: it first uses an extractor agent to select salient sentences or highlights, and then employs an abstractor network — an encoder-aligner-decoder model — to rewrite each of these extracted sentences. To train the extractor on available document-summary pairs, the model uses policy-based reinforcement learning (RL) with sentence-level metric rewards to connect both the extractor and abstractor networks and to learn sentence saliency.
75 | 
76 | [Figure] Reinforced training of the extractor (for one extraction step) and its interaction with the abstractor.
77 | 
78 | The abstractor network is an attention-based encoder-decoder which compresses and paraphrases an extracted document sentence into a concise summary sentence. Moreover, the abstractor has a useful mechanism to directly copy some out-of-vocabulary (OOV) words.
79 | 
80 | [Figure] The convolutional extractor agent.
81 | 
82 | The extractor agent is a convolutional sentence encoder that computes a representation for each sentence from its embedded word vectors. Further, an RNN encoder computes a context-aware representation, and then an RNN decoder selects a sentence at time step t. Once a sentence is selected, its context-aware representation is fed into the decoder at time t + 1.
83 | 
84 | Thus, the method incorporates the abstractive approach's advantages of concisely rewriting sentences and generating novel words from the full vocabulary, while adopting intermediate extractive behavior to improve the overall model's quality, speed, and stability. The authors argue that model training is 4x faster than the previous state-of-the-art. Both the source code and the best pre-trained models were released to promote future research.
85 | 
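The policy-gradient training of the extractor can be sketched as a single REINFORCE update, assuming a hypothetical rouge_l reward function and an extractor that returns a probability distribution over candidate sentences (all names below are illustrative, not taken from the released code):

    import torch

    def reinforce_step(extractor, abstractor, sentences, reference, optimizer, rouge_l):
        probs = extractor(sentences)                   # (num_sentences,) selection probabilities
        dist = torch.distributions.Categorical(probs)
        idx = dist.sample()                            # one extraction step: pick a sentence
        generated = abstractor(sentences[idx.item()])  # rewrite it into a summary sentence
        reward = rouge_l(generated, reference)         # sentence-level metric reward
        loss = -reward * dist.log_prob(idx)            # REINFORCE: reward-weighted log-probability
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return reward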
86 | Other recent studies have proposed applying a combination of adversarial processes and reinforcement learning to abstractive summarization. An example is Liu et al. (2017), whose work proposes an adversarial framework to jointly train a generative model and a discriminative model, similar to Goodfellow et al. (2014). In that framework, a generative model takes the original text as input and generates the summary, using reinforcement learning to optimize the generator for a highly rewarded summary. Further, a discriminator model tries to distinguish the ground-truth summaries from the summaries produced by the generator.
87 | 
88 | The discriminator is implemented as a text classifier that learns to classify the generated summaries as machine- or human-generated, while the training procedure of the generator is to maximize the probability of the discriminator making a mistake. The idea is that this adversarial process can eventually lead the generator to produce plausible and high-quality abstractive summaries. The authors provided supplementary material here. The source code is available in this github repo.
89 | 
90 | In short
91 | 
92 | Automatic text summarization is an exciting research area with several applications in industry. By condensing large quantities of information into short summaries, summarization can aid many downstream applications such as creating news digests, report generation, news summarization, and headline generation. There are two prominent types of summarization algorithms.
93 | 
94 | First, extractive summarization systems form summaries by copying and rearranging passages from the original text. Second, abstractive summarization systems generate new phrases, rephrasing or using words that were not in the original text. Due to the difficulty of abstractive summarization, the great majority of past work has been extractive.
95 | 
96 | The extractive approach is easier because copying large chunks of text from the source document ensures good levels of grammaticality and accuracy. On the other hand, sophisticated abilities that are crucial to high-quality summarization, such as paraphrasing, generalization, or the incorporation of real-world knowledge, are possible only in an abstractive framework. Even though abstractive summarization is a more challenging task, there have been a number of advances so far, thanks to recent developments in the deep learning area.
97 | 
--------------------------------------------------------------------------------