
I’m writing this from NVIDIA GTC 2025, reflecting on Jensen Huang’s keynote yesterday. As a Portland State University NVIDIA University Ambassador developing generative AI and robotics curriculum, I’m particularly interested in the software advances NVIDIA continues to make. With NVIDIA Inference Microservices (NIM), you get a digital playground of AI models backed by scalable inference, making them faster, more efficient, and easier to deploy and manage.
As a software engineer, I've used (and have a love-hate relationship with) LangChain, a framework for integrating Large Language Models (LLMs) into applications. Fortunately, LangChain pairs seamlessly with NVIDIA NIM, enabling developers to accelerate inference, reduce latency, and streamline AI-driven workflows. With NIM’s optimized AI runtimes and LangChain’s flexible architecture, building intelligent applications becomes more efficient without the traditional infrastructure overhead.
This article explores some of LangChain’s core components (tools, ReAct agents, action chaining, messages, and runnables) and highlights how NVIDIA NIM enhances AI pipelines for real-world deployment.
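The examples below use generic LangChain LLM wrappers so they’re easy to follow, but pointing LangChain at a NIM endpoint is a small swap. Here’s a minimal sketch using the langchain-nvidia-ai-endpoints integration; the model name and local base_url are assumptions, so substitute whatever your NIM deployment actually serves.

# pip install langchain-nvidia-ai-endpoints
from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Point a LangChain chat model at a NIM endpoint.
# The base_url assumes a NIM container running locally on port 8000;
# the model name is one example from the NVIDIA API catalog.
nim_llm = ChatNVIDIA(
    model="meta/llama3-8b-instruct",
    base_url="http://localhost:8000/v1",
)

print(nim_llm.invoke("In one sentence, what does NVIDIA NIM provide?").content)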
What Are LangChain Tools and How Do They Work?
LangChain tools act as external functions that LLMs can call to interact with APIs, databases, or other services. These tools extend the model’s functionality beyond text generation.
Example: Creating a Simple Tool in LangChain
With NVIDIA-powered AI acceleration, real-time processing of these tools becomes seamless:
from datetime import datetime
from langchain.tools import tool

@tool
def get_current_time() -> str:
    """Returns the current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

# Invoke the tool directly (functions decorated with @tool become Runnables)
print(get_current_time.invoke({}))
What is a ReAct Agent in LangChain?
A ReAct Agent (Reasoning + Acting) dynamically thinks step-by-step before executing an action. This approach allows AI models to iteratively refine their decision-making. Instead of blindly following a sequence of predefined steps, a ReAct agent thinks step-by-step and decides the next best action based on previous results.
Why NVIDIA’s AI Acceleration Matters for ReAct Agents
With NVIDIA’s TensorRT-LLM and NVIDIA Inference Microservices (NIM), ReAct agents can handle complex reasoning faster, reducing latency when calling multiple tools.
How is a ReAct Agent Unique?
A ReAct agent differs from other agents in LangChain because it:
- Thinks Before Acting - Uses chain-of-thought reasoning before executing a tool.
- Iterates Dynamically - Can call multiple tools in a sequence, refining its approach based on intermediate results.
- Handles Unknown Problems - Unlike fixed pipelines, it figures out how to solve problems dynamically.
- Explains Its Thought Process - The agent explicitly logs its reasoning before making a decision.
Example of a ReAct Agent
from datetime import datetime
from langchain.agents import initialize_agent, AgentType
from langchain.llms import OpenAI
from langchain.tools import Tool

llm = OpenAI(temperature=0)

tools = [
    Tool(
        name="Current Time",
        # The lambda ignores the agent's input string and returns the timestamp
        func=lambda _: datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        description="Returns the current time.",
    )
]

# Zero-shot ReAct agent: reasons step-by-step, then calls a tool
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

response = agent.run("What is the current time?")
print(response)
Chaining Actions in LangChain
LangChain enables developers to chain actions together, allowing LLMs to perform multi-step operations in a structured manner.
Example: Sequential Chain for AI-Powered Research
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate

# First LLM chain: generate a topic (reuses the llm defined earlier)
first_prompt = PromptTemplate(
    input_variables=["interest"],
    template="Suggest a technical topic related to {interest}.",
)
first_chain = LLMChain(llm=llm, prompt=first_prompt)

# Second LLM chain: generate a summary for the topic
second_prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a brief explanation about {topic}.",
)
second_chain = LLMChain(llm=llm, prompt=second_prompt)

# Create a Sequential Chain
overall_chain = SimpleSequentialChain(chains=[first_chain, second_chain], verbose=True)

# Run the chain
response = overall_chain.run("machine learning")
print(response)
What is a Runnable in LangChain?
A Runnable in LangChain is a standardized interface for processing inputs and outputs across multiple AI components. With NVIDIA AI frameworks, Runnables execute with higher efficiency, reducing computational bottlenecks.
Example: Runnable in LangChain
from langchain.schema.runnable import RunnableLambda

def process_data(x):
    return f"Processed data: {x}"

runnable = RunnableLambda(process_data)
output = runnable.invoke("AI-powered by NVIDIA")
print(output)
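Runnables also compose: the pipe operator chains them into a single pipeline, which is how LangChain builds multi-step flows from small pieces. A quick sketch, with two made-up processing steps for illustration:

from langchain.schema.runnable import RunnableLambda

# Two illustrative steps composed with the pipe operator
clean = RunnableLambda(lambda text: text.strip().lower())
shout = RunnableLambda(lambda text: text.upper() + "!")

pipeline = clean | shout
print(pipeline.invoke("  NVIDIA NIM  "))  # prints "NVIDIA NIM!"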
Understanding the Message Object in LangChain
LangChain's Message objects—including HumanMessage, AIMessage, and SystemMessage—serve as fundamental building blocks for structuring conversations between users and AI. These objects allow for context-aware interactions, enabling AI models to process and respond intelligently based on the conversation's flow.
- HumanMessage represents a user’s input, optionally enriched with metadata like location or request-specific details.
- AIMessage contains the model’s response, potentially including metadata like the source of the information.
- SystemMessage provides overarching instructions to guide the model’s behavior, ensuring more consistent responses.
By leveraging NVIDIA-powered inference with LangChain, message handling and conversational AI applications achieve faster processing times, making real-time, personalized interactions smoother and more scalable.
Attributes of a Message:
from langchain.schema import HumanMessage, AIMessage, SystemMessage

# HumanMessage with basic attributes
user_msg = HumanMessage(
    content="What's the weather like in New York?",
    additional_kwargs={"location": "New York"},
)

# AIMessage with additional metadata
ai_msg = AIMessage(
    content="The weather in New York is 22°C with clear skies.",
    additional_kwargs={"source": "Weather API"},
)

# SystemMessage with a role name
system_msg = SystemMessage(
    content="You are a helpful weather assistant.",
    name="assistant_instructions",
)

# Print attributes
print(user_msg.type)               # "human"
print(user_msg.content)            # "What's the weather like in New York?"
print(user_msg.additional_kwargs)  # {"location": "New York"}
print(system_msg.name)             # "assistant_instructions"
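These message objects are what you pass to a chat model to drive a conversation. A minimal sketch, assuming the NIM-backed ChatNVIDIA setup from the introduction (any LangChain chat model accepts the same list of messages):

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Assumes the NIM endpoint sketched in the introduction (or an NVIDIA API key)
chat = ChatNVIDIA(model="meta/llama3-8b-instruct", base_url="http://localhost:8000/v1")

# Messages are passed in conversation order; the model returns an AIMessage
reply = chat.invoke([system_msg, user_msg])
print(reply.content)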
Using LangChain’s Built-in Memory Components
LangChain provides several memory modules that allow AI models to store and retrieve past interactions. These help maintain context over extended conversations.
- ConversationBufferMemory – Stores the full history of previous interactions in a raw buffer.
- ConversationBufferWindowMemory – Keeps only the most recent interactions, discarding older ones.
- ConversationTokenBufferMemory – Maintains memory up to a specific token limit, ensuring it fits within model constraints.
- ConversationSummaryMemory – Generates a concise summary of past interactions instead of storing raw messages.
Example: Using ConversationSummaryMemory
from langchain.memory import ConversationSummaryMemory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0)
memory = ConversationSummaryMemory(llm=llm)

# Simulating conversation history
memory.save_context(
    {"input": "What is LangChain?"},
    {"output": "LangChain is a framework for building AI applications."},
)

# Retrieves summarized memory
print(memory.load_memory_variables({}))
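For conversations where you only need the most recent turns rather than a summary, a buffer window keeps just the last k exchanges. A short sketch (the value of k is arbitrary here):

from langchain.memory import ConversationBufferWindowMemory

# Keep only the two most recent exchanges
memory = ConversationBufferWindowMemory(k=2)

memory.save_context({"input": "Hi"}, {"output": "Hello! How can I help?"})
memory.save_context({"input": "What is NIM?"}, {"output": "NVIDIA Inference Microservices."})
memory.save_context({"input": "And LangChain?"}, {"output": "A framework for LLM applications."})

# Only the last two exchanges are returned; the first has been discarded
print(memory.load_memory_variables({}))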
The Future of AI with LangChain and NVIDIA NIM
With two more days at GTC 2025, there’s still plenty to explore, but one thing is already clear: NVIDIA NIM is redefining AI inference. The ability to deploy large language models with minimal latency and maximum efficiency is a major shift for developers looking to build scalable AI applications. LangChain’s modular framework, when paired with NIM’s optimized runtimes, streamlines everything from dynamic decision-making with ReAct agents to structured workflows with action chaining and runnables. As AI continues to push forward, NVIDIA and Portland State University are showing that innovation isn’t just theoretical: it’s deployable, efficient, and ready to scale.

Duncan Miller
Co-founder / CEO
Duncan is a software engineer with 20 years of experience as a founder in ed-tech, AI, and clean-tech. He has expertise in automation software, test-driven AI development, user interface design, and clean energy technologies. Duncan earned an MBA in Entrepreneurship from Babson College and teaches with the PSU Center for Entrepreneurship and Business Accelerator. He lives with his wife and two children in Portland, Oregon, on an extinct cinder cone volcano. He is passionate about artificial intelligence, climate solutions, public benefit companies, and social entrepreneurship.