How can I merge intermediate chain step outputs into a flat JSON output using LangChain?

I have a two-step chain for processing names:

Extract names from a list
Validate these names against a second list

I want the final output to be a flattened JSON that includes all intermediate results along with the original input. Ideally, the output should look like this:

JSON
{"first_value": "Dave, John, carrot", "first_prompt_output": "Dave, John", "possible_values": "John"...}

However, using RunnableParallel or RunnablePassthrough.assign() has led to nested and hard-to-read outputs. I need a mechanism that behaves like dict.update() within the pipeline to merge results.

For context, here’s the chain I’ve set up:

PYTHON
first_prompt = PromptTemplate.from_template("""Find all names in the following text and extract it as JSON with a field `first_prompt_output`: {first_value}
                                                    first_prompt_output:""")

second_prompt = PromptTemplate.from_template("""Here is a list of possible values: {possible_values} and a list of found value {first_prompt_output}. Find values that are in both lists. Return a JSON with the fields `first_prompt_output` and `second_prompt_output` and `possible_values`.""")

first_value = "Dave, John"
possible_values = ["John"]

first_chain = ( 
    first_prompt
    | llm
    | SimpleJsonOutputParser()
)

second_chain = ( 
    second_prompt
    | llm
    | SimpleJsonOutputParser()
)

chain = ( 
    {"first_prompt_output": first_chain, "possible_values": RunnablePassthrough(), "first_value": RunnablePassthrough()} 
    | RunnableParallel(result={"second_prompt_output": second_chain, "first_value": itemgetter("first_value")})
)

chain.invoke({"first_value": first_value, "possible_values": possible_values})

I’ve spent a considerable amount of time (about half an hour) exploring different options, but I haven’t been able to achieve a merged output in a neat, flat format. Any guidance on how to merge the outputs (similar to dict.update()) in the LangChain pipeline would be greatly appreciated.

2 Answers

Answer by StackOverflow TychoAI Agent • 5 months ago

Below is a short explanation, with an example, of how to use RunnablePassthrough.assign() to merge each step's result into the input dictionary, much like dict.update() would.

When you run several sub‑chains with RunnableParallel, each branch returns its result and gets nested under its key. That’s why you saw a highly nested output. To “flatten” your combined information (i.e. merge the input with intermediate steps), you need a mechanism to update the input dictionary with the outputs of later steps. This is exactly what RunnablePassthrough.assign() offers: it “merges” computed outputs into the input dict so that you end up with one flat dictionary containing the original input and all the intermediary outputs.
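
To make the difference in output shape concrete, here is a plain-Python sketch with no LangChain involved. The helpers `run_parallel` and `assign`, and the stub `extract_names`, are hypothetical stand-ins that only mimic how results are keyed; they are not LangChain APIs.

```python
from typing import Any, Callable, Dict

Step = Callable[[Dict[str, Any]], Any]


def run_parallel(steps: Dict[str, Step], data: Dict[str, Any]) -> Dict[str, Any]:
    """Mimic a parallel map: every step sees the same input, and the output
    holds ONLY the step results, each under its own key."""
    return {key: step(data) for key, step in steps.items()}


def assign(steps: Dict[str, Step], data: Dict[str, Any]) -> Dict[str, Any]:
    """Mimic an assign-style merge: copy the input, then add each step result
    at the top level, much like dict.update()."""
    merged = dict(data)
    merged.update({key: step(data) for key, step in steps.items()})
    return merged


def extract_names(data: Dict[str, Any]) -> list:
    """Hypothetical stand-in for the first LLM step: keep capitalised words."""
    return [word for word in data["first_value"].split(", ") if word[0].isupper()]


inputs = {"first_value": "Dave, John, carrot", "possible_values": ["John"]}

# Wrapping one parallel map in another reproduces the nested shape.
nested = run_parallel(
    {"result": lambda d: run_parallel({"first_prompt_output": extract_names}, d)},
    inputs,
)
# The assign-style merge keeps the original keys and adds the new one beside them.
flat = assign({"first_prompt_output": extract_names}, inputs)

print(nested)
print(flat)
```

In the nested result the original inputs are gone unless a branch copies them explicitly, while the merged result carries them along automatically.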

For example, suppose you have these two operations (the first chain extracts names and the second chain validates them). Instead of running them in parallel (which nests the results), you can update the dictionary as the chain flows by using assign. Here’s an illustrative snippet:

PYTHON
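from langchain_core.output_parsers import SimpleJsonOutputParser
from langchain_core.runnables import RunnablePassthrough

# first_prompt, second_prompt, llm, first_value and possible_values
# are assumed to be defined as in the question.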

# Imagine these are your operations.
# The first chain extracts names from an input string.
first_chain = (
    first_prompt  # your PromptTemplate that extracts names
    | llm
    | SimpleJsonOutputParser()
)

# The second chain validates names against a second list.
second_chain = (
    second_prompt  # your PromptTemplate that compares lists
    | llm
    | SimpleJsonOutputParser()
)

# Start with a dictionary carrying the original inputs:
base = {"first_value": first_value, "possible_values": possible_values}

# First, run the extraction step and update (merge) the input dictionary.
chain1 = RunnablePassthrough.assign(first_prompt_output=first_chain)

# Then, run the validation step and merge its output as well.
chain2 = RunnablePassthrough.assign(second_prompt_output=second_chain)

# Chain them sequentially so that each step updates the dictionary.
# (This is equivalent to performing dict.update() as the data flows.)
final_chain = chain1 | chain2

# When you invoke the chain, the output would be a merged dictionary:
# {
#   "first_value": "Dave, John, carrot",
#   "possible_values": [...],
#   "first_prompt_output": ...,
#   "second_prompt_output": ...
# }
output = final_chain.invoke(base)
print(output)

In this example:

The first assignment using RunnablePassthrough.assign() runs your first_chain and “updates” the input dict with its result under the key first_prompt_output.
Then, the second assignment runs your second_chain and updates the dict with second_prompt_output.
The final resulting dictionary contains all intermediary steps as well as the original input.

Key Points:

RunnablePassthrough.assign() is used to merge new keys (and their computed values) into the input dictionary.
When you use RunnableParallel, the separate branches build a nested dictionary because each branch’s result is stored under its branch key.
If you prefer a flat, merged output (like dict.update()), then it’s best to use assign sequentially so that each new result is merged at the top level.

This approach keeps your data flow clear and ensures the final output is easier to read, while retaining all your intermediate steps along with the input.

Answer by NeutronWatcher599 • 7 months ago

I have not found a built-in way to do this in LangChain, but I found a function that flattens the output into the shape I want. It feels a bit clunky, and I suspect there is a better solution.

The key is to add the following function to the chain:

PYTHON
def flatten_dict(*vars) -> dict:
    '''
    Flatten one or more dictionaries by lifting the contents of nested
    dicts one level up, dropping the mid-level keys.
    Returns a plain dict; LangChain wraps the function in a Runnable
    when it is piped into a chain.
    '''
    flat = {}
    for var in vars:
        for key, value in var.items():
            if isinstance(value, dict):
                # Merge the nested dict's items into the top level.
                flat.update(value)
            else:
                flat[key] = value
    return flat

chain = (
    {"first_prompt_output": first_chain, "possible_values": RunnablePassthrough(), "first_value": RunnablePassthrough()} 
    | RunnableParallel(result={"second_prompt_output": second_chain, "first_value": itemgetter("first_value")})
    # The pipe must sit inside the parentheses, otherwise it is a syntax error.
    | flatten_dict
)
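
Note that flatten_dict lifts only one level of nesting. If the chain output is nested more deeply (for example a result key that holds a dict which itself holds the parser's dict), a recursive variant may be needed. The following is a minimal sketch of that idea; `flatten_recursive` is a hypothetical helper, and the sample data is made up to resemble the output shape described in the question.

```python
from typing import Any, Dict


def flatten_recursive(data: Dict[str, Any]) -> Dict[str, Any]:
    """Lift every non-dict value to the top level, dropping intermediate keys.

    On key collisions the value visited last wins, as with dict.update().
    """
    flat: Dict[str, Any] = {}
    for key, value in data.items():
        if isinstance(value, dict):
            # Recurse so that dicts nested at any depth are merged in.
            flat.update(flatten_recursive(value))
        else:
            flat[key] = value
    return flat


# Made-up sample resembling a doubly nested chain output.
nested = {
    "result": {
        "second_prompt_output": {
            "first_prompt_output": ["Dave", "John"],
            "second_prompt_output": ["John"],
            "possible_values": ["John"],
        },
        "first_value": "Dave, John, carrot",
    }
}

flat = flatten_recursive(nested)
print(flat)
```

Since LangChain coerces plain callables into runnables when they are piped, a function like this could be appended to a chain in the same way as flatten_dict. Be aware that dropping intermediate keys silently overwrites values whenever two branches use the same field name.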
