
I’m writing this from NVIDIA GTC 2025, reflecting on Jensen Huang’s keynote yesterday. As a Portland State University NVIDIA University Ambassador developing generative AI and robotics curriculum, I’m particularly interested in the software advances NVIDIA continues to make. With NVIDIA Inference Microservices (NIM), you get a digital playground of AI models backed by scalable inference, making them faster, more efficient, and easier to deploy and manage.
As a software engineer, I've used (and have a love-hate relationship with) LangChain, a framework for integrating Large Language Models (LLMs) into applications. Fortunately, LangChain pairs seamlessly with NVIDIA NIM, enabling developers to accelerate inference, reduce latency, and streamline AI-driven workflows. With NIM’s optimized AI runtimes and LangChain’s flexible architecture, building intelligent applications becomes more efficient without the traditional infrastructure overhead.
This article explores some of LangChain’s core components (tools, ReAct agents, action chaining, messages, and runnables) and highlights how NVIDIA NIM enhances AI pipelines for real-world deployment.
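The examples below use generic LangChain LLM wrappers so they’re easy to follow, but pointing LangChain at a NIM endpoint is a small swap. Here’s a minimal sketch using the langchain-nvidia-ai-endpoints integration; the model name and local base_url are assumptions, so substitute whatever your NIM deployment actually serves.

# pip install langchain-nvidia-ai-endpoints
from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Point a LangChain chat model at a NIM endpoint.
# The base_url assumes a NIM container running locally on port 8000;
# the model name is one example from the NVIDIA API catalog.
nim_llm = ChatNVIDIA(
    model="meta/llama3-8b-instruct",
    base_url="http://localhost:8000/v1",
)

print(nim_llm.invoke("In one sentence, what does NVIDIA NIM provide?").content)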
What Are LangChain Tools and How Do They Work?
LangChain tools act as external functions that LLMs can call to interact with APIs, databases, or other services. These tools extend the model’s functionality beyond text generation.
Example: Creating a Simple Tool in LangChain
With NVIDIA-powered AI acceleration, real-time processing of these tools becomes seamless:
from datetime import datetime
from langchain.tools import tool

@tool
def get_current_time() -> str:
    """Returns the current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

# Invoke the tool directly (functions decorated with @tool become Runnables)
print(get_current_time.invoke({}))
What is a ReAct Agent in LangChain?
A ReAct Agent (Reasoning + Acting) dynamically thinks step-by-step before executing an action. This approach allows AI models to iteratively refine their decision-making. Instead of blindly following a sequence of predefined steps, a ReAct agent thinks step-by-step and decides the next best action based on previous results.
Why NVIDIA’s AI Acceleration Matters for ReAct Agents
With NVIDIA’s TensorRT-LLM and NVIDIA Inference Microservices (NIM), ReAct agents can handle complex reasoning faster, reducing latency when calling multiple tools.
How is a ReAct Agent Unique?
A ReAct agent differs from other agents in LangChain because it:
- Thinks Before Acting - Uses chain-of-thought reasoning before executing a tool.
- Iterates Dynamically - Can call multiple tools in a sequence, refining its approach based on intermediate results.
- Handles Unknown Problems - Unlike fixed pipelines, it figures out how to solve problems dynamically.
- Explains Its Thought Process - The agent explicitly logs its reasoning before making a decision.
Example of a ReAct Agent
from datetime import datetime
from langchain.agents import initialize_agent, AgentType
from langchain.llms import OpenAI
from langchain.tools import Tool

llm = OpenAI(temperature=0)

tools = [
    Tool(
        name="Current Time",
        # The lambda ignores the agent's input string and returns the timestamp
        func=lambda _: datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        description="Returns the current time.",
    )
]

# Zero-shot ReAct agent: reasons step-by-step, then calls a tool
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

response = agent.run("What is the current time?")
print(response)
Chaining Actions in LangChain
LangChain enables developers to chain actions together, allowing LLMs to perform multi-step operations in a structured manner.
Example: Sequential Chain for AI-Powered Research
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate

# First LLM chain: generate a topic (reuses the llm defined earlier)
first_prompt = PromptTemplate(
    input_variables=["interest"],
    template="Suggest a technical topic related to {interest}.",
)
first_chain = LLMChain(llm=llm, prompt=first_prompt)

# Second LLM chain: generate a summary for the topic
second_prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a brief explanation about {topic}.",
)
second_chain = LLMChain(llm=llm, prompt=second_prompt)

# Create a Sequential Chain
overall_chain = SimpleSequentialChain(chains=[first_chain, second_chain], verbose=True)

# Run the chain
response = overall_chain.run("machine learning")
print(response)
What is a Runnable in LangChain?
A Runnable in LangChain is a standardized interface for processing inputs and outputs across multiple AI components. With NVIDIA AI frameworks, Runnables execute with higher efficiency, reducing computational bottlenecks.
Example: Runnable in LangChain
from langchain.schema.runnable import RunnableLambda

def process_data(x):
    return f"Processed data: {x}"

runnable = RunnableLambda(process_data)
output = runnable.invoke("AI-powered by NVIDIA")
print(output)
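Runnables also compose: the pipe operator chains them into a single pipeline, which is how LangChain builds multi-step flows from small pieces. A quick sketch, with two made-up processing steps for illustration:

from langchain.schema.runnable import RunnableLambda

# Two illustrative steps composed with the pipe operator
clean = RunnableLambda(lambda text: text.strip().lower())
shout = RunnableLambda(lambda text: text.upper() + "!")

pipeline = clean | shout
print(pipeline.invoke("  NVIDIA NIM  "))  # prints "NVIDIA NIM!"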
Understanding the Message Object in LangChain
LangChain's Message objects—including HumanMessage, AIMessage, and SystemMessage—serve as fundamental building blocks for structuring conversations between users and AI. These objects allow for context-aware interactions, enabling AI models to process and respond intelligently based on the conversation's flow.
- HumanMessage represents a user’s input, optionally enriched with metadata like location or request-specific details.
- AIMessage contains the model’s response, potentially including metadata like the source of the information.
- SystemMessage provides overarching instructions to guide the model’s behavior, ensuring more consistent responses.
By leveraging NVIDIA-powered inference with LangChain, message handling and conversational AI applications achieve faster processing times, making real-time, personalized interactions smoother and more scalable.
Attributes of a Message:
from langchain.schema import HumanMessage, AIMessage, SystemMessage

# HumanMessage with basic attributes
user_msg = HumanMessage(
    content="What's the weather like in New York?",
    additional_kwargs={"location": "New York"},
)

# AIMessage with additional metadata
ai_msg = AIMessage(
    content="The weather in New York is 22°C with clear skies.",
    additional_kwargs={"source": "Weather API"},
)

# SystemMessage with a role name
system_msg = SystemMessage(
    content="You are a helpful weather assistant.",
    name="assistant_instructions",
)

# Print attributes
print(user_msg.type)               # "human"
print(user_msg.content)            # "What's the weather like in New York?"
print(user_msg.additional_kwargs)  # {"location": "New York"}
print(system_msg.name)             # "assistant_instructions"
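These message objects are what you pass to a chat model to drive a conversation. A minimal sketch, assuming the NIM-backed ChatNVIDIA setup from the introduction (any LangChain chat model accepts the same list of messages):

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Assumes the NIM endpoint sketched in the introduction (or an NVIDIA API key)
chat = ChatNVIDIA(model="meta/llama3-8b-instruct", base_url="http://localhost:8000/v1")

# Messages are passed in conversation order; the model returns an AIMessage
reply = chat.invoke([system_msg, user_msg])
print(reply.content)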
Using LangChain’s Built-in Memory Components
LangChain provides several memory modules that allow AI models to store and retrieve past interactions. These help maintain context over extended conversations.
- ConversationBufferMemory – Stores the full history of previous interactions in a raw buffer.
- ConversationBufferWindowMemory – Keeps only the most recent interactions, discarding older ones.
- ConversationTokenBufferMemory – Maintains memory up to a specific token limit, ensuring it fits within model constraints.
- ConversationSummaryMemory – Generates a concise summary of past interactions instead of storing raw messages.
Example: Using ConversationSummaryMemory
from langchain.memory import ConversationSummaryMemory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0)
memory = ConversationSummaryMemory(llm=llm)

# Simulating conversation history
memory.save_context(
    {"input": "What is LangChain?"},
    {"output": "LangChain is a framework for building AI applications."},
)

# Retrieves summarized memory
print(memory.load_memory_variables({}))
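For conversations where you only need the most recent turns rather than a summary, a buffer window keeps just the last k exchanges. A short sketch (the value of k is arbitrary here):

from langchain.memory import ConversationBufferWindowMemory

# Keep only the two most recent exchanges
memory = ConversationBufferWindowMemory(k=2)

memory.save_context({"input": "Hi"}, {"output": "Hello! How can I help?"})
memory.save_context({"input": "What is NIM?"}, {"output": "NVIDIA Inference Microservices."})
memory.save_context({"input": "And LangChain?"}, {"output": "A framework for LLM applications."})

# Only the last two exchanges are returned; the first has been discarded
print(memory.load_memory_variables({}))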
The Future of AI with LangChain and NVIDIA NIM
With two more days at GTC 2025, there’s still plenty to explore, but one thing is already clear: NVIDIA NIM is redefining AI inference. The ability to deploy large language models with minimal latency and maximum efficiency is a major shift for developers looking to build scalable AI applications. LangChain’s modular framework, when paired with NIM’s optimized runtimes, streamlines everything from dynamic decision-making with ReAct agents to structured workflows with action chaining and runnables. As AI continues to push forward, NVIDIA and Portland State University are showing that innovation isn’t just theoretical: it’s deployable, efficient, and ready to scale.

Duncan Miller
Co-founder / CEO
Duncan is a software engineer with 20 years of experience as a founder in ed-tech, AI, and clean-tech. He has expertise in automation software, test-driven AI development, user interface design, and clean energy technologies. Duncan earned an MBA in Entrepreneurship from Babson College and teaches with the PSU Center for Entrepreneurship and Business Accelerator. He lives with his wife and two children in Portland, Oregon, on an extinct cinder cone volcano. He is passionate about artificial intelligence, climate solutions, public benefit companies, and social entrepreneurship.