Summarization
A summarization chain summarizes multiple documents. One approach is to split a long text into smaller chunks and process them with a MapReduceDocumentsChain. Alternatively, the summarization chain can be a StuffDocumentsChain or a RefineDocumentsChain.
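To make the difference between the three chain types concrete, here is a minimal, library-free sketch of each strategy. It is not how LangChain implements these chains: `Summarize`, `stuffSummary`, `mapReduceSummary`, `refineSummary`, and `countCalls` are hypothetical names, the "model" is a synchronous stand-in (a real model call would be asynchronous), and prompt templates and context-length handling are omitted.

```typescript
// Hypothetical stand-in for an LLM call: takes a prompt, returns a completion.
type Summarize = (prompt: string) => string;
type Strategy = (chunks: string[], summarize: Summarize) => string;

// "Stuff": put every chunk into one prompt and summarize in a single call.
const stuffSummary: Strategy = (chunks, summarize) =>
  summarize(chunks.join("\n\n"));

// "Map-reduce": summarize each chunk on its own (map), then summarize
// the concatenated partial summaries (reduce).
const mapReduceSummary: Strategy = (chunks, summarize) => {
  const partials = chunks.map((chunk) => summarize(chunk));
  return summarize(partials.join("\n\n"));
};

// "Refine": summarize the first chunk, then update the running summary
// with each following chunk, in order.
const refineSummary: Strategy = (chunks, summarize) => {
  const [first, ...rest] = chunks;
  if (first === undefined) return "";
  let summary = summarize(first);
  for (const chunk of rest) {
    summary = summarize(summary + "\n\n" + chunk);
  }
  return summary;
};

// Counts how many model calls a strategy makes, using a toy "model"
// that simply keeps the first 20 characters of its prompt.
function countCalls(strategy: Strategy, chunks: string[]): number {
  let calls = 0;
  const toyModel: Summarize = (prompt) => {
    calls += 1;
    return prompt.slice(0, 20);
  };
  strategy(chunks, toyModel);
  return calls;
}
```

In this sketch, for n non-empty chunks, the stuff strategy makes one call, map-reduce makes n + 1 calls (the n map calls are independent of each other), and refine makes n sequential calls.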
Install the packages used in the examples:
- npm: npm install @langchain/anthropic @langchain/openai
- Yarn: yarn add @langchain/anthropic @langchain/openai
- pnpm: pnpm add @langchain/anthropic @langchain/openai
import { OpenAI } from "@langchain/openai";
import { loadSummarizationChain } from "langchain/chains";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import * as fs from "fs";
// In this example, we use a `MapReduceDocumentsChain` specifically prompted to summarize a set of documents.
const text = fs.readFileSync("state_of_the_union.txt", "utf8");
const model = new OpenAI({ temperature: 0 });
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
const docs = await textSplitter.createDocuments([text]);
// This convenience function creates a document chain prompted to summarize a set of documents.
const chain = loadSummarizationChain(model, { type: "map_reduce" });
const res = await chain.invoke({
  input_documents: docs,
});
console.log({ res });
/*
{
res: {
text: ' President Biden is taking action to protect Americans from the COVID-19 pandemic and Russian aggression, providing economic relief, investing in infrastructure, creating jobs, and fighting inflation.
He is also proposing measures to reduce the cost of prescription drugs, protect voting rights, and reform the immigration system. The speaker is advocating for increased economic security, police reform, and the Equality Act, as well as providing support for veterans and military families.
The US is making progress in the fight against COVID-19, and the speaker is encouraging Americans to come together and work towards a brighter future.'
}
}
*/
API Reference:
- OpenAI from @langchain/openai
- loadSummarizationChain from langchain/chains
- RecursiveCharacterTextSplitter from langchain/text_splitter
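The examples on this page split the text with a chunkSize of 1000 or 5000 before summarizing. The sketch below shows how the chunk size determines the number of chunks, and therefore the number of per-chunk summaries a map-reduce approach has to produce. It is a naive fixed-size splitter written for illustration only, not the algorithm used by RecursiveCharacterTextSplitter; `chunkText` is a hypothetical helper.

```typescript
// Naive fixed-size splitter (illustration only): cuts the text every
// `chunkSize` characters, ignoring sentence and paragraph boundaries.
function chunkText(text: string, chunkSize: number): string[] {
  if (!(chunkSize > 0) || Math.floor(chunkSize) !== chunkSize) {
    throw new Error("chunkSize must be a positive integer");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}

// A 10-character text with chunkSize 4 yields three chunks.
console.log(chunkText("abcdefghij", 4)); // [ 'abcd', 'efgh', 'ij' ]
```

With this splitter, a text of 12,000 characters yields 12 chunks at a chunk size of 1000 and 3 chunks at 5000, so a larger chunk size means fewer but longer per-chunk prompts.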
Intermediate Steps
We can also return the intermediate steps of map_reduce chains in order to inspect them. This is done by setting the returnIntermediateSteps parameter to true.
import { OpenAI } from "@langchain/openai";
import { loadSummarizationChain } from "langchain/chains";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import * as fs from "fs";
// In this example, we use a `MapReduceDocumentsChain` specifically prompted to summarize a set of documents.
const text = fs.readFileSync("state_of_the_union.txt", "utf8");
const model = new OpenAI({ temperature: 0 });
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
const docs = await textSplitter.createDocuments([text]);
// This convenience function creates a document chain prompted to summarize a set of documents.
const chain = loadSummarizationChain(model, {
  type: "map_reduce",
  returnIntermediateSteps: true,
});
const res = await chain.invoke({
  input_documents: docs,
});
console.log({ res });
/*
{
res: {
intermediateSteps: [
"In response to Russia's aggression in Ukraine, the United States has united with other freedom-loving nations to impose economic sanctions and hold Putin accountable. The U.S. Department of Justice is also assembling a task force to go after the crimes of Russian oligarchs and seize their ill-gotten gains.",
"The United States and its European allies are taking action to punish Russia for its invasion of Ukraine, including seizing assets, closing off airspace, and providing economic and military assistance to Ukraine. The US is also mobilizing forces to protect NATO countries and has released 30 million barrels of oil from its Strategic Petroleum Reserve to help blunt gas prices. The world is uniting in support of Ukraine and democracy, and the US stands with its Ukrainian-American citizens.",
" President Biden and Vice President Harris ran for office with a new economic vision for America, and have since passed the American Rescue Plan and the Bipartisan Infrastructure Law to help struggling families and rebuild America's infrastructure. This includes creating jobs, modernizing roads, airports, ports, and waterways, replacing lead pipes, providing affordable high-speed internet, and investing in American products to support American jobs.",
],
text: "President Biden is taking action to protect Americans from the COVID-19 pandemic and Russian aggression, providing economic relief, investing in infrastructure, creating jobs, and fighting inflation.
He is also proposing measures to reduce the cost of prescription drugs, protect voting rights, and reform the immigration system. The speaker is advocating for increased economic security, police reform, and the Equality Act, as well as providing support for veterans and military families.
The US is making progress in the fight against COVID-19, and the speaker is encouraging Americans to come together and work towards a brighter future.",
},
}
*/
API Reference:
- OpenAI from @langchain/openai
- loadSummarizationChain from langchain/chains
- RecursiveCharacterTextSplitter from langchain/text_splitter
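Once the intermediate steps are returned, a small helper can make them easier to read. The sketch below assumes the result has the shape shown in the logged output above (an intermediateSteps array of strings plus a text field); the `MapReduceResult` interface and `formatSteps` function are hypothetical names, not part of the library.

```typescript
// Assumed shape of the result, inferred from the logged output above.
interface MapReduceResult {
  intermediateSteps: string[];
  text: string;
}

// Builds a readable report: one line per per-chunk summary (truncated to
// `previewLength` characters), followed by the final combined summary.
function formatSteps(res: MapReduceResult, previewLength = 60): string[] {
  const lines = res.intermediateSteps.map((step, i) => {
    const trimmed = step.trim();
    const preview =
      trimmed.length > previewLength
        ? trimmed.slice(0, previewLength) + "..."
        : trimmed;
    return `chunk ${i + 1}: ${preview}`;
  });
  lines.push(`final: ${res.text.trim()}`);
  return lines;
}
```

Assuming that shape holds, calling `formatSteps(res).forEach((line) => console.log(line))` after `chain.invoke` would print one line per chunk summary, which makes it easier to spot a chunk whose summary went off track.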
Streaming
By passing a custom LLM to the internal map_reduce chain through the combineLLM option, we can stream the final output:
import { loadSummarizationChain } from "langchain/chains";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import * as fs from "fs";
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";
// In this example, a separate LLM produces the final summary, so each stage of the chain can use a different model and only the final result is streamed.
const text = fs.readFileSync("state_of_the_union.txt", "utf8");
const model = new ChatAnthropic({ temperature: 0 });
const combineModel = new ChatOpenAI({
  model: "gpt-4",
  temperature: 0,
  streaming: true,
  callbacks: [
    {
      handleLLMNewToken(token: string): Promise<void> | void {
        console.log("token", token);
        /*
          token President
          token Biden
          ...
          ...
          token protections
          token .
        */
      },
    },
  ],
});
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 5000 });
const docs = await textSplitter.createDocuments([text]);
// This convenience function creates a document chain prompted to summarize a set of documents.
const chain = loadSummarizationChain(model, {
  type: "map_reduce",
  combineLLM: combineModel,
});
const res = await chain.invoke({
  input_documents: docs,
});
console.log({ res });
/*
{
res: {
text: "President Biden delivered his first State of the Union address, focusing on the Russian invasion of Ukraine, domestic economic challenges, and his administration's efforts to revitalize American manufacturing and infrastructure. He announced new sanctions against Russia and the deployment of U.S. forces to NATO countries. Biden also outlined his plan to fight inflation, lower costs for American families, and reduce the deficit. He emphasized the need to pass the Bipartisan Innovation Act, confirmed his Federal Reserve nominees, and called for the end of COVID shutdowns. Biden also addressed issues such as gun violence, voting rights, immigration reform, women's rights, and privacy protections."
}
}
*/
API Reference:
- loadSummarizationChain from langchain/chains
- RecursiveCharacterTextSplitter from langchain/text_splitter
- ChatOpenAI from @langchain/openai
- ChatAnthropic from @langchain/anthropic
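The streaming example logs each token as it arrives. If the streamed text should be accumulated instead, for instance to update a UI, a handler along the following lines could be used. This is a sketch: `createTokenCollector` is a hypothetical helper, and the only thing it takes from the example above is the handleLLMNewToken(token: string) callback signature.

```typescript
// Hypothetical helper: returns a handler object with the same
// handleLLMNewToken method as in the example above, plus a getter
// for the text received so far.
function createTokenCollector(onToken?: (token: string) => void) {
  const tokens: string[] = [];
  return {
    handler: {
      handleLLMNewToken(token: string): void {
        tokens.push(token);
        if (onToken) onToken(token); // optional per-token side effect
      },
    },
    getText: (): string => tokens.join(""),
  };
}

// Simulated stream: in the example above, the model would call the handler.
const collector = createTokenCollector();
collector.handler.handleLLMNewToken("President");
collector.handler.handleLLMNewToken(" Biden");
console.log(collector.getText()); // President Biden
```

Under these assumptions, `collector.handler` would take the place of the inline object in the callbacks array of the combine model, and `collector.getText()` would return the partial summary at any point during streaming.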