Auto-generated Summaries in Google Docs


For many of us, it can be challenging to keep up with the volume of documents that arrive in our inboxes every day: reports, reviews, briefs, policies, and the list goes on. When a new document is received, readers often wish it included a brief summary of the main points so they can prioritize it effectively. However, composing a document summary can be cognitively challenging and time-consuming, especially when a document writer is starting from scratch.

To help with this, we recently announced that Google Docs now automatically generates suggestions to aid document writers in creating content summaries, when they are available. Today we describe how this was enabled using a machine learning (ML) model that comprehends document text and, when confident, generates a 1-2 sentence natural language description of the document content. The document writer nevertheless maintains full control: accepting the suggestion as-is, making necessary edits to better capture the document summary, or ignoring the suggestion altogether. Readers can also use this section, along with the outline, to understand and navigate the document at a high level. While all users can add summaries, auto-generated suggestions are currently only available to Google Workspace business customers. Building on grammar suggestions, Smart Compose, and autocorrect, we see this as another valuable step toward improving written communication in the workplace.

A blue summary icon appears in the top left corner when a document summary suggestion is available. Document writers can then view, edit, or ignore the suggested document summary.

Model Details

Automatically generated summaries would not be possible without the tremendous advances in ML for natural language understanding (NLU) and natural language generation (NLG) over the past five years, especially with the introduction of Transformer and Pegasus.

Abstractive text summarization, which combines the individually challenging tasks of long document language understanding and generation, has been a long-standing problem in NLU and NLG research. A popular method for combining NLU and NLG is training an ML model using sequence-to-sequence learning, where the inputs are the document words and the outputs are the summary words. A neural network then learns to map input tokens to output tokens. Early applications of the sequence-to-sequence paradigm used recurrent neural networks (RNNs) for both the encoder and decoder.

The introduction of Transformers provided a promising alternative to RNNs because Transformers use self-attention to provide better modeling of long input and output dependencies, which is critical in document summarization. Still, these models require large amounts of manually labeled data to train sufficiently, so the advent of Transformers alone was not enough to significantly advance the state-of-the-art in document summarization.

The combination of Transformers with self-supervised pre-training (e.g., BERT, GPT, T5) led to a major breakthrough in many NLU tasks for which limited labeled data is available. In self-supervised pre-training, a model uses large amounts of unlabeled text to learn general language understanding and generation capabilities. Then, in a subsequent fine-tuning stage, the model learns to apply these abilities to a specific task, such as summarization or question answering.

The Pegasus work took this idea one step further by introducing a pre-training objective customized to abstractive summarization. In Pegasus pre-training, also called Gap Sentence Prediction (GSP), full sentences from unlabeled news articles and web documents are masked from the input and the model is required to reconstruct them, conditioned on the remaining unmasked sentences. In particular, GSP attempts to mask sentences that are considered essential to the document, selected through different heuristics. The intuition is to make the pre-training as close as possible to the summarization task. Pegasus achieved state-of-the-art results on a varied set of summarization datasets. However, a number of challenges remained in turning this research advancement into a product.
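
The masking step described above can be sketched in a few lines of Python. This is a minimal illustration, not the Pegasus implementation: the importance heuristic (the fraction of a sentence's words that also appear elsewhere in the document), the `<mask_1>` placeholder string, and the function names `tokenize`, `overlap_score`, and `make_gsp_example` are all assumptions chosen for the example.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def overlap_score(index, sentences):
    """Score a sentence by the fraction of its words that also appear
    in the rest of the document (a stand-in importance heuristic)."""
    words = tokenize(sentences[index])
    if not words:
        return 0.0
    rest = set()
    for i, other in enumerate(sentences):
        if i != index:
            rest |= tokenize(other)
    return len(words & rest) / len(words)


def make_gsp_example(sentences, num_masked=1, mask_token="<mask_1>"):
    """Build one (input, target) pair: mask the highest-scoring sentences
    in the input and use them, in document order, as the target."""
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: overlap_score(i, sentences),
        reverse=True,
    )
    masked = set(ranked[:num_masked])
    model_input = " ".join(
        mask_token if i in masked else s for i, s in enumerate(sentences)
    )
    target = " ".join(sentences[i] for i in sorted(masked))
    return model_input, target


document = [
    "The team shipped the summary feature.",
    "Lunch was served at noon.",
    "The summary feature helps.",
]
model_input, target = make_gsp_example(document)
print(model_input)
print(target)
```

In this toy document the third sentence shares the largest fraction of its words with the rest of the text, so it is the one that gets masked and becomes the reconstruction target, which mirrors how a summary restates content found elsewhere in a document.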

Applying Recent Research Advances to Google Docs

  • Data

    Self-supervised pre-training results in an ML model that has general language understanding and generation capabilities, but a subsequent fine-tuning stage is critical for the model to adapt to the application domain. We fine-tuned early versions of our model on a corpus of documents with manually-generated summaries that were consistent with typical use cases.

    However, early versions of this corpus suffered from inconsistencies and high variation because they included many types of documents, as well as many ways to write a summary. For example, academic abstracts are typically long and detailed, while executive summaries are brief and punchy. This led to a model that was easily confused because it had been trained on so many different types of documents and summaries that it struggled to learn the relationships between any of them.

    Fortunately, one of the key findings in the Pegasus work was that an effective pre-training phase required less supervised data in the fine-tuning stage. Some summarization benchmarks required as few as 1,000 fine-tuning examples for Pegasus to match the performance of Transformer baselines that saw 10,000+ supervised examples, suggesting that one could focus on quality rather than quantity.

    We carefully cleaned and filtered the fine-tuning data to contain training examples that were more consistent and represented a coherent definition of summaries. Even though we reduced the amount of training data, this led to a higher quality model. The key lesson, consistent with recent work in domains like dataset distillation, was that it was better to have a smaller, high-quality dataset than a larger, high-variance dataset.

  • Serving

    Once we had trained the high-quality model, we turned to the challenge of serving the model in production. While the Transformer version of the encoder-decoder architecture is the dominant approach for training models on sequence-to-sequence tasks like abstractive summarization, it can be inefficient and impractical to serve in real-world applications. The main inefficiency comes from the Transformer decoder, where we generate the output summary token by token through autoregressive decoding. The decoding process becomes noticeably slow when summaries get longer, since the decoder attends to all previously generated tokens at each step. RNNs are a more efficient architecture for decoding since there is no self-attention over previous tokens as in a Transformer model.

    We used knowledge distillation, which is the process of transferring knowledge from a large model to a smaller, more efficient model, to distill the Pegasus model into a hybrid architecture of a Transformer encoder and an RNN decoder. To improve efficiency we also reduced the number of RNN decoder layers. The resulting model had significant improvements in latency and memory footprint while the quality was still on par with the original model. To further improve latency and user experience, we serve the summarization model using TPUs, which provide significant speedups and allow more requests to be handled by a single machine.
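
To make the decoding cost argument under Serving concrete, the sketch below counts how many positions each decoder style touches while producing a summary. It is a simplified cost model under stated assumptions, not a measurement of the production system: a Transformer decoder step is charged one unit per token it attends to (all tokens generated so far, including the current one), an RNN decoder step is charged a single unit for reading its fixed-size hidden state, and the function names are hypothetical.

```python
def transformer_decoder_cost(num_tokens):
    """Total attended positions when step t attends to tokens 1..t."""
    return sum(step for step in range(1, num_tokens + 1))


def rnn_decoder_cost(num_tokens):
    """Total state reads when every step reads one fixed-size hidden state."""
    return num_tokens


for length in (16, 64, 256):
    t_cost = transformer_decoder_cost(length)
    r_cost = rnn_decoder_cost(length)
    print(f"{length:>4} tokens: transformer={t_cost:>6} rnn={r_cost:>4} "
          f"ratio={t_cost / r_cost:.1f}")
```

Under this model the Transformer total grows quadratically with summary length, as n(n+1)/2, while the RNN total grows linearly, which is consistent with decoding slowing down as summaries get longer.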

Ongoing Challenges and Next Steps

While we are excited by the progress so far, there are a few challenges we are continuing to tackle:

  • Document coverage: Developing a set of documents for the fine-tuning stage was difficult because of the tremendous variety that exists among documents, and the same challenge holds at inference time. Some of the documents our users create (e.g., meeting notes, recipes, lesson plans and resumes) are not suitable for summarization or can be difficult to summarize. Currently, our model only suggests a summary for documents where it is most confident, but we hope to continue broadening this set as our model improves.
  • Evaluation: Abstractive summaries need to capture the essence of a document while being fluent and grammatically correct. A specific document may have many summaries that can be considered correct, and different readers may prefer different ones. This makes it hard to evaluate summaries with automatic metrics only; user feedback and usage statistics will be critical for us to understand and keep improving quality.
  • Long documents: Long documents are some of the toughest documents for the model to summarize because it is harder to capture all the points and abstract them in a single summary, and they can also significantly increase memory usage during training and serving. However, long documents are perhaps the most useful for the model to summarize automatically because doing so can give document writers a head start on this tedious task. We hope we can apply the latest ML advancements to better address this challenge.
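
The confidence gating mentioned under document coverage can be illustrated with a small sketch. The article does not say how confidence is computed; the example assumes, purely for illustration, that the model exposes per-token log-probabilities for a candidate summary and that a suggestion is shown only when their length-normalized average clears a threshold. The names `mean_log_prob` and `should_suggest`, and the threshold value, are hypothetical.

```python
import math


def mean_log_prob(token_log_probs):
    """Average log-probability per generated token (length-normalized)."""
    if not token_log_probs:
        return float("-inf")
    return sum(token_log_probs) / len(token_log_probs)


def should_suggest(token_log_probs, threshold=math.log(0.5)):
    """Return True only when the geometric-mean token probability is at
    least 0.5 (an arbitrary illustrative threshold, not a real setting)."""
    return mean_log_prob(token_log_probs) >= threshold


# Hypothetical per-token probabilities for two candidate summaries.
confident = [math.log(0.9), math.log(0.8), math.log(0.95)]
uncertain = [math.log(0.9), math.log(0.1), math.log(0.3)]

print(should_suggest(confident))
print(should_suggest(uncertain))
```

Raising the threshold in such a scheme would trade coverage for precision: fewer documents receive a suggestion, but the suggestions shown are the ones the model scores most highly.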


Overall, we are thrilled that we can apply recent progress in NLU and NLG to continue assisting users with reading and writing. We hope the automatic suggestions now offered in Google Workspace make it easier for writers to annotate their documents with summaries, and help readers comprehend and navigate documents more easily.


The authors would like to thank the many people across Google who contributed to this work: AJ Motika, Matt Pearson-Beck, Mia Chen, Mahdis Mahdieh, Halit Erdogan, Benjamin Lee, Ali Abdelhadi, Michelle Danoff, Vishnu Sivaji, Sneha Keshav, Aliya Baptista, Karishma Damani, DJ Lick, Yao Zhao, Peter Liu, Aurko Roy, Yonghui Wu, Shubhi Sareen, Andrew Dai, Mekhola Mukherjee, Yinan Wang, Mike Colagrosso, and Behnoosh Hariri.

