- Updated: February 17, 2026
- 28 min read
Human‑in‑the‑Loop AI Agents with LangGraph and Streamlit – A Complete Guide

Discord Linkedin Reddit Twitter Home Open Source/Weights AI Agents Tutorials Voice AI AINews.sh Sponsorship Search NewsHub NewsHub Premium Content Read our exclusive articles FacebookInstagramTwitter Home Open Source/Weights AI Agents Tutorials Voice AI AINews.sh Sponsorship NewsHub Search Home Open Source/Weights AI Agents Tutorials Voice AI AINews.sh Sponsorship Home Editors Pick Agentic AI How to Build Human-in-the-Loop Plan-and-Execute AI Agents with Explicit User Approval Using.Editors PickAgentic AIAI AgentsTutorials In this tutorial, we build a human-in-the-loop travel booking agent that treats the user as a teammate rather than a passive observer. We design the system so the agent first reasons openly by drafting a structured travel plan, then deliberately pauses before taking any action. We expose this proposed plan in a live interface where we can inspect, edit, or reject it, and only after explicit approval do we allow the agent to execute tools.By combining LangGraph interrupts with a Streamlit frontend, we create a workflow that makes agent reasoning visible, controllable, and trustworthy instead of opaque and autonomous. Copy CodeCopiedUse a different Browser!pip -q install -U langgraph openai streamlit pydantic !npm -q install -g localtunnel import os, getpass, textwrap, json, uuid, time if not os.environ.get(“OPENAI_API_KEY”): os.environ[“OPENAI_API_KEY”] = getpass.getpass(“OPENAI_API_KEY (hidden input): “) os.environ.setdefault(“OPENAI_MODEL”, “gpt-4.1-mini”) We set up the execution environment by installing all required libraries and utilities needed for agent orchestration and UI exposure. We securely collect the OpenAI API key at runtime so it is never hardcoded or leaked in the notebook. We also configure the model selection upfront to keep the rest of the pipeline clean and reproducible.Copy CodeCopiedUse a different Browserapp_code = r”’ import os, json, uuid import streamlit as st from typing import TypedDict, List, Dict, Any, Optional from pydantic import BaseModel, Field from openai import OpenAI from langgraph.graph import StateGraph, START, END from langgraph.types import Command, interrupt from langgraph.checkpoint.memory import InMemorySaver def tool_search_flights(origin: str, destination: str, depart_date: str, return_date: str, budget_usd: int) -> Dict[str, Any]: options = [ {“airline”: “SkyJet”, “route”: f”{origin}->{destination}”, “depart”: depart_date, “return”: return_date, “price_usd”: int(budget_usd*0.55)}, {“airline”: “AeroBlue”, “route”: f”{origin}->{destination}”, “depart”: depart_date, “return”: return_date, “price_usd”: int(budget_usd*0.70)}, {“airline”: “Nimbus Air”, “route”: f”{origin}->{destination}”, “depart”: depart_date, “return”: return_date, “price_usd”: int(budget_usd*0.62)}, ] options = sorted(options, key=lambda x: x[“price_usd”]) return {“tool”: “search_flights”, “top_options”: options[:2]} def tool_search_hotels(city: str, nights: int, budget_usd: int, preferences: List[str]) -> Dict[str, Any]: base = max(60, int(budget_usd / max(nights, 1))) picks = [ {“name”: “Central Boutique”, “city”: city, “nightly_usd”: int(base*0.95), “notes”: [“walkable”, “great reviews”]}, {“name”: “Riverside Stay”, “city”: city, “nightly_usd”: int(base*0.80), “notes”: [“quiet”, “good value”]}, {“name”: “Modern Loft Hotel”, “city”: city, “nightly_usd”: int(base*1.10), “notes”: [“new”, “gym”]}, ] if “luxury” in [p.lower() for p in preferences]: picks = sorted(picks, key=lambda x: -x[“nightly_usd”]) else: picks = sorted(picks, key=lambda x: x[“nightly_usd”]) return {“tool”: “search_hotels”, “top_options”: picks[:2]} def tool_build_day_by_day(city: str, days: int, vibe: str) -> Dict[str, Any]: blocks = [] for d in range(1, days+1): blocks.append({ “day”: d, “morning”: f”{city}: coffee + a must-see landmark”, “afternoon”: f”{city}: {vibe} activity + local lunch”, “evening”: f”{city}: sunset spot + dinner + optional night walk” }) return {“tool”: “draft_itinerary”, “days”: blocks} ”’ We define the Streamlit application core and implement safe, deterministic tool functions that simulate flights, hotels, and itinerary generation. We design these tools to behave like real-world APIs while still running fully in a Colab environment.We ensure all tool outputs are structured so they can be audited before execution. Copy CodeCopiedUse a different Browserapp_code += r”’ class TravelPlan(BaseModel): trip_title: str = Field(., description=”Short human-friendly title”) origin: str destination: str depart_date: str return_date: str travelers: int = 1 budget_usd: int = 1500 preferences: List[str] = Field(default_factory=list) vibe: str = “balanced” lodging_nights: int = 4 daily_outline: List[Dict[str, Any]] = Field(default_factory=list) tool_calls: List[Dict[str, Any]] = Field(default_factory=list) class State(TypedDict): user_request: str plan: Dict[str, Any] approval: Dict[str, Any] execution: Dict[str, Any] def make_llm_plan(state: State) -> Dict[str, Any]: client = OpenAI(api_key=os.environ[“OPENAI_API_KEY”]) model = os.environ.get(“OPENAI_MODEL”, “gpt-4.1-mini”) sys = ( “You are a travel planning agent. ” “Return a JSON travel plan that matches the provided schema. ” “Be realistic, concise, and include a tool_calls list describing what you want executed ” “(e.g., search_flights, search_hotels, draft_itinerary).” ) schema = TravelPlan.model_json_schema() resp = client.responses.create( model=model, input=[ {“role”:”system”,”content”: sys}, {“role”:”user”,”content”: state[“user_request”]}, {“role”:”user”,”content”: f”Schema (JSON): {json.dumps(schema)}”} ], ) text = resp.output_text.strip() start = text.find(“{“) end = text.rfind(“}”) if start == -1 or end == -1: raise ValueError(“Model did not return JSON. Try again or change model.”) raw = text[start:end+1] plan_obj = json.loads(raw) plan = TravelPlan(**plan_obj).model_dump() if not plan.get(“tool_calls”): plan[“tool_calls”] = [ {“name”:”search_flights”, “args”:{“origin”: plan[“origin”], “destination”: plan[“destination”], “depart_date”: plan[“depart_date”], “return_date”: plan[“return_date”], “budget_usd”: plan[“budget_usd”]}}, {“name”:”search_hotels”, “args”:{“city”: plan[“destination”], “nights”: plan[“lodging_nights”], “budget_usd”: int(plan[“budget_usd”]*0.35), “preferences”: plan[“preferences”]}}, {“name”:”draft_itinerary”, “args”:{“city”: plan[“destination”], “days”: max(2, plan[“lodging_nights”]+1), “vibe”: plan[“vibe”]}}, ] return {“plan”: plan} def wait_for_approval(state: State) -> Dict[str, Any]: payload = { “kind”: “approval”, “message”: “Review/edit the plan. Approve to execute tools.”, “plan”: state[“plan”], } decision = interrupt(payload) return {“approval”: decision} def execute_tools(state: State) -> Dict[str, Any]: approval = state.get(“approval”) or {} if not approval.get(“approved”): return {“execution”: {“status”: “not_executed”, “reason”: “User rejected or did not approve.”}} plan = approval.get(“edited_plan”) or state[“plan”] tool_calls = plan.get(“tool_calls”, []) results = [] for call in tool_calls: name = call.get(“name”) args = call.get(“args”, {}) if name == “search_flights”: results.append(tool_search_flights(**args)) elif name == “search_hotels”: results.append(tool_search_hotels(**args)) elif name == “draft_itinerary”: results.append(tool_build_day_by_day(**args)) else: results.append({“tool”: name, “error”: “Unknown tool (blocked for safety).”, “args”: args}) return {“execution”: {“status”: “executed”, “tool_results”: results, “final_plan”: plan}} ”’ We formalize the agent’s reasoning using a strict schema that requires the model to output an explicit travel plan rather than free-form text.We generate the plan using the OpenAI model and validate it before allowing it into the workflow. We also auto-inject tool calls if the model omits them to guarantee a complete execution path. Copy CodeCopiedUse a different Browserapp_code += r”’ def build_graph(): builder = StateGraph(State) builder.add_node(“plan”, make_llm_plan) builder.add_node(“approve”, wait_for_approval) builder.add_node(“execute”, execute_tools) builder.add_edge(START, “plan”) builder.add_edge(“plan”, “approve”) builder.add_edge(“approve”, “execute”) builder.add_edge(“execute”, END) memory = InMemorySaver() graph = builder.compile(checkpointer=memory) return graph st.set_page_config(page_title=”Plan → Approve → Execute Travel Agent”, layout=”wide”) st.title(“Human-in-the-Loop Travel Booking Agent (Plan → Approve/Edit → Execute)”) with st.sidebar: st.header(“Runtime”) if st.button(“New Session / Thread”): st.session_state.thread_id = str(uuid.uuid4()) st.session_state.ran_once = False st.session_state.interrupt_payload = None st.session_state.last_execution = None thread_id = st.session_state.get(“thread_id”) or str(uuid.uuid4()) st.session_state.thread_id = thread_id graph = build_graph() config = {“configurable”: {“thread_id”: thread_id}} st.caption(f”Thread ID: {thread_id}”) req = st.text_area( “Describe your trip request”, value=st.session_state.get(“user_request”, “Plan a 5-day trip from Dubai to Istanbul in April. Budget $1800.Prefer museums, street food, and a relaxed pace.”), height=120 ) st.session_state.user_request = req colA, colB = st.columns([1,1]) run_plan = colA.button(“1) Generate Plan (LLM)”) resume_btn = colB.button(“2) Resume After Approval”) if run_plan: st.session_state.ran_once = True st.session_state.interrupt_payload = None st.session_state.last_execution = None initial = {“user_request”: req, “plan”: {}, “approval”: {}, “execution”: {}} out = graph.invoke(initial, config=config) if “__interrupt__” in out and out[“__interrupt__”]: st.session_state.interrupt_payload = out[“__interrupt__”][0].value else: st.session_state.last_execution = out.get(“execution”) payload = st.session_state.get(“interrupt_payload”) if payload: st.subheader(“Plan proposed by agent (editable)”) plan = payload.get(“plan”, {}) left, right = st.columns([1,1]) with left: st.write(“**Edit JSON (advanced):**”) edited_text = st.text_area(“Plan JSON”, value=json.dumps(plan, indent=2), height=420) with right: st.write(“**Quick actions:**”) approved = st.radio(“Decision”, options=[“Approve”, “Reject”], index=0) st.write(“Tip: If you edit JSON, keep it valid. You can also reject and re-run planning.”) try: edited_plan = json.loads(edited_text) json_ok = True except Exception as e: json_ok = False st.error(f”Invalid JSON: {e}”) if resume_btn: if not json_ok: st.stop() decision = { “approved”: (approved == “Approve”), “edited_plan”: edited_plan } out2 = graph.invoke(Command(resume=decision), config=config) st.session_state.interrupt_payload = None st.session_state.last_execution = out2.get(“execution”) exec_result = st.session_state.get(“last_execution”) if exec_result: st.subheader(“Execution result”) st.json(exec_result) if exec_result.get(“status”) == “executed”: st.success(“Tools executed only AFTER approval ✅”) else: st.warning(“Not executed (rejected or not approved).”) ”’ We construct the LangGraph workflow by separating planning, approval, and execution into distinct nodes. We deliberately interrupt the graph after planning so we can review and control the agent’s intent. We only allow tool execution to proceed when explicit human approval is provided. Copy CodeCopiedUse a different Browserimport pathlib pathlib.Path(“app.py”).write_text(app_code) !streamlit run app.py –server.port 8501 –server.address 0.0.0.0 & sleep 2 !lt –port 8501 We connect the agent workflow to a live Streamlit interface that supports editing, approval, and rejection of plans. We persist the state across runs using a thread identifier so the agent behaves consistently across interactions. We finally launch the app and make it publicly available, enabling real human-in-the-loop collaboration.In conclusion, we demonstrated how plan-and-execute agents become significantly more reliable when humans remain in the loop at the right moment. We showed that interrupts are not just a technical feature but a design primitive for building trust, accountability, and collaboration into agent systems. By separating planning from execution and inserting a clear approval boundary, we ensured that tools run only with human consent and context.This pattern scales beyond travel planning to any high-stakes automation, giving us agents that think with us rather than act for us. Check out the Full Codes here. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter. Wait! are you on telegram? now you can join us on telegram as well. Michal Sutter+ postsBioMichal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova.With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.Michal SutterGoogle DeepMind Proposes New Framework for Intelligent AI Delegation to Secure the Emerging Agentic Web for Future EconomiesMichal SutterMoonshot AI Launches Kimi Claw: Native OpenClaw on Kimi.com with 5,000 Community Skills and 40GB Cloud Storage NowMichal SutterMeet ‘Kani-TTS-2’: A 400M Param Open Source Text-to-Speech Model that Runs in 3GB VRAM with Voice Cloning SupportMichal SutterGoogle AI Introduces the WebMCP to Enable Direct and Structured Website Interactions for New AI AgentsMichal Sutter[In-Depth Guide] The Complete CTGAN + SDV Pipeline for High-Fidelity Synthetic DataMichal SutterGoogle DeepMind Introduces Aletheia: The AI Agent Moving from Math Competitions to Fully Autonomous Professional Research DiscoveriesMichal SutterIs This AGI? Google’s Gemini 3 Deep Think Shatters Humanity’s Last Exam And Hits 84.6% On ARC-AGI-2 Performance TodayMichal SutterMeet OAT: The New Action Tokenizer Bringing LLM-Style Scaling and Flexible, Anytime Inference to the Robotics WorldMichal SutterWaymo Introduces the Waymo World Model: A New Frontier Simulator Model for Autonomous Driving and Built on Top of Genie 3Michal SutterMistral AI Launches Voxtral Transcribe 2: Pairing Batch Diarization And Open Realtime ASR For Multilingual Production Workloads At ScaleMichal SutterGoogle Introduces Agentic Vision in Gemini 3 Flash for Active Image UnderstandingMichal SutterGoogle Releases Conductor: a context driven Gemini CLI extension that stores knowledge as Markdown and orchestrates agentic workflowsMichal SutterMicrosoft Unveils Maia 200, An FP4 and FP8 Optimized AI Inference Accelerator for Azure DatacentersMichal SutterDeepSeek AI Releases DeepSeek-OCR 2 with Causal Visual Flow Encoder for Layout Aware Document UnderstandingMichal SutterAlibaba Introduces Qwen3-Max-Thinking, a Test Time Scaled Reasoning Model with Native Tool Use Powering Agentic WorkloadsMichal SutterTencent Hunyuan Releases HPC-Ops: A High Performance LLM Inference Operator LibraryMichal SutterDSGym Offers a Reusable Container Based Substrate for Building and Benchmarking Data Science AgentsMichal SutterWhat is Clawdbot? How a Local First Agent Stack Turns Chats into Real AutomationsMichal SutterGitHub Releases Copilot-SDK to Embed Its Agentic Runtime in Any AppMichal SutterSalesforce AI Introduces FOFPred: A Language-Driven Future Optical Flow Prediction Framework that Enables Improved Robot Control and Video GenerationMichal SutterZhipu AI Releases GLM-4.7-Flash: A 30B-A3B MoE Model for Efficient Local Coding and AgentsMichal SutterA Coding Guide to Understanding How Retries Trigger Failure Cascades in RPC and Event-Driven ArchitecturesMichal SutterVercel Releases Agent Skills: A Package Manager For AI Coding Agents With 10 Years of React and Next.js Optimisation RulesMichal SutterBlack Forest Labs Releases FLUX.2 [klein]: Compact Flow Models for Interactive Visual IntelligenceMichal SutterMeet SETA: Open Source Training Reinforcement Learning Environments for Terminal Agents with 400 Tasks and CAMEL ToolkitMichal SutterA Coding Implementation to Build a Unified Apache Beam Pipeline Demonstrating Batch and Stream Processing with Event-Time Windowing Using DirectRunnerMichal SutterTencent Researchers Release Tencent HY-MT1.5: A New Translation Models Featuring 1.8B and 7B Models Designed for Seamless on-Device and Cloud DeploymentMichal SutterHow Cloudflare’s tokio-quiche Makes QUIC and HTTP/3 a First Class Citizen in Rust BackendsMichal SutterHow to Build a Robust Multi-Agent Pipeline Using CAMEL with Planning, Web-Augmented Reasoning, Critique, and Persistent MemoryMichal SutterNVIDIA AI Researchers Release NitroGen: An Open Vision Action Foundation Model For Generalist Gaming AgentsMichal SutterThis AI Paper from Stanford and Harvard Explains Why Most ‘Agentic AI’ Systems Feel Impressive in Demos and then Completely Fall Apart in Real UseMichal SutterGoogle DeepMind Researchers Release Gemma Scope 2 as a Full Stack Interpretability Suite for Gemma 3 ModelsMichal SutterHow to Build a Fully Autonomous Local Fleet-Maintenance Analysis Agent Using SmolAgents and Qwen ModelMichal SutterMistral AI Releases OCR 3: A Smaller Optical Character Recognition (OCR) Model for Structured Document AI at ScaleMichal SutterNanbeige4-3B-Thinking: How a 23T Token Pipeline Pushes 3B Models Past 30B Class ReasoningMichal SutterThe Machine Learning Divide: Marktechpost’s Latest ML Global Impact Report Reveals Geographic Asymmetry Between ML Tool Origins and Research AdoptionMichal SutterGoogle LiteRT NeuroPilot Stack Turns MediaTek Dimensity NPUs into First Class Targets for on Device LLMsMichal SutterFrom Transformers to Associative Memory, How Titans and MIRAS Rethink Long Context ModelingMichal SutterGoogle Colab Integrates KaggleHub for One Click Access to Kaggle Datasets, Models and CompetitionsMichal SutterOpenAGI Foundation Launches Lux: A Foundation Computer Use Model that Tops Online Mind2Web with OSGym At ScaleMichal SutterGoogle DeepMind Researchers Introduce Evo-Memory Benchmark and ReMem Framework for Experience Reuse in LLM AgentsMichal SutterMeta AI Researchers Introduce Matrix: A Ray Native a Decentralized Framework for Multi Agent Synthetic Data GenerationMichal SutterBlack Forest Labs Releases FLUX.2: A 32B Flow Matching Transformer for Production Image PipelinesMichal SutterAgent0: A Fully Autonomous AI Framework that Evolves High-Performing Agents without External Data through Multi-Step Co-EvolutionMichal SutterGoogle DeepMind Introduces Nano Banana Pro: the Gemini 3 Pro Image Model for Text Accurate and Studio Grade VisualsMichal SutterAllen Institute for AI (AI2) Introduces Olmo 3: An Open Source 7B and 32B LLM Family Built on the Dolma 3 and Dolci StackMichal SuttervLLM vs TensorRT-LLM vs HF TGI vs LMDeploy, A Deep Technical Comparison for Production LLM InferenceMichal SutterOpenAI Debuts GPT-5.1-Codex-Max, a Long-Horizon Agentic Coding Model With Compaction for Multi-Window WorkflowsMichal SutterGoogle Antigravity Makes the IDE a Control Plane for Agentic CodingMichal SutterxAI’s Grok 4.1 Pushes Toward Higher Emotional Intelligence, Lower Hallucinations and Tighter Safety ControlsMichal SutterGoogle DeepMind’s WeatherNext 2 Uses Functional Generative Networks For 8x Faster Probabilistic Weather ForecastsMichal SutterComparing the Top 4 Agentic AI Browsers in 2025: Atlas vs Copilot Mode vs Dia vs CometMichal SutterOpenAI Researchers Train Weight Sparse Transformers to Expose Interpretable CircuitsMichal SutterComparing the Top 6 Agent-Native Rails for the Agentic Internet: MCP, A2A, AP2, ACP, x402, and KiteMichal SutterOpenAI Introduces GPT-5.1: Combining Adaptive Reasoning, Account Level Personalization, And Updated Safety Metrics In The GPT-5 StackMichal SutterMeta AI Releases Omnilingual ASR: A Suite of Open-Source Multilingual Speech Recognition Models for 1600+ LanguagesMichal SutterMoonshot AI Releases Kosong: The LLM Abstraction Layer that Powers Kimi CLIMichal SutterComparing Memory Systems for LLM Agents: Vector, Graph, and Event LogsMichal SutterMeet Kosmos: An AI Scientist that Automates Data-Driven DiscoveryMichal SutterAnthropic Turns MCP Agents Into Code First Systems With ‘Code Execution With MCP’ ApproachMichal SutterWhy Spatial Supersensing is Emerging as the Core Capability for Multimodal AI Systems?Michal SutterComparing the Top 6 Inference Runtimes for LLM Serving in 2025Michal SutterOpenAI Introduces IndQA: A Culture Aware Benchmark For Indian LanguagesMichal SutterComparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025Michal SutterAnyscale and NovaSky Team Releases SkyRL tx v0.1.0: Bringing Tinker Compatible Reinforcement Learning RL Engine To Local GPU ClustersMichal SutterLongCat-Flash-Omni: A SOTA Open-Source Omni-Modal Model with 560B Parameters with 27B activated, Excelling at Real-Time Audio-Visual InteractionMichal SutterComparing the Top 6 OCR (Optical Character Recognition) Models/Systems in 2025Michal SutterAnthropic’s New Research Shows Claude can Detect Injected Concepts, but only in Controlled LayersMichal SutterOpenAI Releases Research Preview of ‘gpt-oss-safeguard’: Two Open-Weight Reasoning Models for Safety Classification TasksMichal SutterMicrosoft Releases Agent Lightning: A New AI Framework that Enables Reinforcement Learning (RL)-based Training of LLMs for Any AI AgentMichal SutterMiniMax Releases MiniMax M2: A Mini Open Model Built for Max Coding and Agentic Workflows at 8% Claude Sonnet Price and ~2x FasterMichal SutterGoogle vs OpenAI vs Anthropic: The Agentic AI Arms Race BreakdownMichal SutterLiquid AI’s LFM2-VL-3B Brings a 3B Parameter Vision Language Model (VLM) to Edge-Class DevicesMichal SutterUltraCUA: A Foundation Computer-Use Agents Model that Bridges the Gap between General-Purpose GUI Agents and Specialized API-based AgentsMichal SutterAnthrogen Introduces Odyssey: A 102B Parameter Protein Language Model that Replaces Attention with Consensus and Trains with Discrete DiffusionMichal SutterOpenAI Introduces ChatGPT Atlas: A Chromium-based browser with a built-in AI agentMichal SutterGoogle AI Research Releases DeepSomatic: A New AI Model that Identifies Cancer Cell Genetic VariantsMichal SutterWeak-for-Strong (W4S): A Novel Reinforcement Learning Algorithm that Trains a weak Meta Agent to Design Agentic Workflows with Stronger LLMsMichal SutterKong Releases Volcano: A TypeScript, MCP-native SDK for Building Production Ready AI Agents with LLM Reasoning and Real-World actionsMichal SutterGoogle AI Releases C2S-Scale 27B Model that Translate Complex Single-Cell Gene Expression Data into ‘cell sentences’ that LLMs can UnderstandMichal Sutter7 LLM Generation Parameters—What They Do and How to Tune Them?Michal SutterMeta’s ARE + Gaia2 Set a New Bar for AI Agent Evaluation under Asynchronous, Event-Driven ConditionsMichal SutterMicrosoft AI Debuts MAI-Image-1: An In-House Text-to-Image Model that Enters LMArena’s Top-10Michal SutterGoogle Open-Sources an MCP Server for the Google Ads API, Bringing LLM-Native Access to Ads DataMichal SutterWhat are ‘Computer-Use Agents’?From Web to OS—A Technical ExplainerMichal SutterRA3: Mid-Training with Temporal Action Abstractions for Faster Reinforcement Learning (RL) Post-Training in Code LLMsMichal SutterModel Context Protocol (MCP) vs Function Calling vs OpenAPI Tools — When to Use Each?Michal SutterGoogle AI Introduces Gemini 2.5 ‘Computer Use’ (Preview): A Browser-Control Model to Power AI Agents to Interact with User InterfacesMichal SutterOpenAI Debuts Agent Builder and AgentKit: A Visual-First Stack for Building, Deploying, and Evaluating AI AgentsMichal SutterStreamTensor: A PyTorch-to-Accelerator Compiler that Streams LLM Intermediates Across FPGA DataflowsMichal SutterHow to Evaluate Voice Agents in 2025: Beyond Automatic Speech Recognition (ASR) and Word Error Rate (WER) to Task Success, Barge-In, and Hallucination-Under-NoiseMichal SutterThis AI Paper Proposes a Novel Dual-Branch Encoder-Decoder Architecture for Unsupervised Speech Enhancement (SE)Michal SutterNeuphonic Open-Sources NeuTTS Air: A 748M-Parameter On-Device Speech Language Model with Instant Voice CloningMichal SutterThinking Machines Launches Tinker: A Low-Level Training API that Abstracts Distributed LLM Fine-Tuning without Hiding the KnobsMichal SutterMLPerf Inference v5.1 (2025): Results Explained for GPUs, CPUs, and AI AcceleratorsMichal SutterThe Role of Model Context Protocol (MCP) in Generative AI Security and Red TeamingMichal SutterOpenAI Launches Sora 2 and a Consent-Gated Sora iOS AppMichal SutterDelinea Released an MCP Server to Put Guardrails Around AI Agents Credential AccessMichal SutterAnthropic Launches Claude Sonnet 4.5 with New Coding and Agentic State-of-the-Art ResultsMichal SutterTop 10 Local LLMs (2025): Context Windows, VRAM Targets, and Licenses ComparedMichal SutterThe Latest Gemini 2.5 Flash-Lite Preview is Now the Fastest Proprietary Model (External Tests) and 50% Fewer Output TokensMichal SutterGoogle AI Ships a Model Context Protocol (MCP) Server for Data Commons, Giving AI Agents First-Class Access to Public StatsMichal SutterOpenAI Releases ChatGPT ‘Pulse’: Proactive, Personalized Daily Briefings for Pro UsersMichal SutterOpenAI Introduces GDPval: A New Evaluation Suite that Measures AI on Real-World Economically Valuable TasksMichal SutterVision-RAG vs Text-RAG: A Technical Comparison for Enterprise SearchMichal SutterMicrosoft Brings MCP to Azure Logic Apps (Standard) in Public Preview, Turning Connectors into Agent ToolsMichal SutterTop 15 Model Context Protocol (MCP) Servers for Frontend Developers (2025)Michal SutterLLM-as-a-Judge: Where Do Its Signals Break, When Do They Hold, and What Should “Evaluation” Mean?Michal SutterAn Internet of AI Agents? Coral Protocol Introduces Coral v1: An MCP-Native Runtime and Registry for Cross-Framework AI AgentsMichal SutterXiaomi Released MiMo-Audio, a 7B Speech Language Model Trained on 100M+ Hours with High-Fidelity Discrete TokensMichal SutterGoogle’s Sensible Agent Reframes Augmented Reality (AR) Assistance as a Coupled “what+how” Decision—So What does that Change?Michal SutterTop Computer Vision CV Blogs & News Websites (2025)Michal SutterPhysical AI: Bridging Robotics, Material Science, and Artificial Intelligence for Next-Gen Embodied SystemsMichal SutterMIT’s LEGO: A Compiler for AI Chips that Auto-Generates Fast, Efficient Spatial AcceleratorsMichal SutterMeta AI Researchers Release MapAnything: An End-to-End Transformer Architecture that Directly Regresses Factored, Metric 3D Scene GeometryMichal SutterAi2 Researchers are Changing the Benchmarking Game by Introducing Fluid Benchmarking that Enhances Evaluation along Several DimensionsMichal SutterGoogle AI Ships TimesFM-2.5: Smaller, Longer-Context Foundation Model That Now Leads GIFT-Eval (Zero-Shot Forecasting)Michal SutterStanford Researchers Introduced MedAgentBench: A Real-World Benchmark for Healthcare AI AgentsMichal SutterOpenAI Introduces GPT-5-Codex: An Advanced Version of GPT-5 Further Optimized for Agentic Coding in CodexMichal SutterSoftware Frameworks Optimized for GPUs in AI: CUDA, ROCm, Triton, TensorRT—Compiler Paths and Performance ImplicationsMichal SutterTop 12 Robotics AI Blogs/NewsWebsites 2025Michal SutterDeepdub Introduces Lightning 2.5: A Real-Time AI Voice Model With 2.8x Throughput Gains for Scalable AI Agents and Enterprise AIMichal SutterTwinMind Introduces Ear-3 Model: A New Voice AI Model that Sets New Industry Records in Accuracy, Speaker Labeling, Languages and PriceMichal SutterWhat are Optical Character Recognition (OCR) Models?Top Open-Source OCR ModelsMichal SutterOpenAI Adds Full MCP Tool Support in ChatGPT Developer Mode: Enabling Write Actions, Workflow Automation, and Enterprise IntegrationsMichal SutterTop 7 Model Context Protocol (MCP) Servers for Vibe CodingMichal SutterParaThinker: Scaling LLM Test-Time Compute with Native Parallel Thinking to Overcome Tunnel Vision in Sequential ReasoningMichal SutterA New MIT Study Shows Reinforcement Learning Minimizes Catastrophic Forgetting Compared to Supervised Fine-TuningMichal SutterAlibaba AI Unveils Qwen3-Max Preview: A Trillion-Parameter Qwen Model with Super Fast Speed and QualityMichal SutterMeet Chatterbox Multilingual: An Open-Source Zero-Shot Text To Speech (TTS) Multilingual Model with Emotion Control and WatermarkingMichal SutterBiomni-R0: New Agentic LLMs Trained End-to-End with Multi-Turn Reinforcement Learning for Expert-Level Intelligence in Biomedical ResearchMichal SutterAI and the Brain: How DINOv3 Models Reveal Insights into Human Visual ProcessingMichal Sutter15 Most Relevant Operating Principles for Enterprise AI (2025)Michal SutterWhat is AI Agent Observability? Top 7 Best Practices for Reliable AIMichal SutterChunking vs.Tokenization: Key Differences in AI Text ProcessingMichal SutterAccenture Research Introduce MCP-Bench: A Large-Scale Benchmark that Evaluates LLM Agents in Complex Real-World Tasks via MCP ServersMichal SutterTop 20 Voice AI Blogs and News Websites 2025: The Ultimate Resource GuideMichal SutterThe State of Voice AI in 2025: Trends, Breakthroughs, and Market LeadersMichal SutterOpenAI Releases an Advanced Speech-to-Speech Model and New Realtime API Capabilities including MCP Server Support, Image Input, and SIP Phone Calling SupportMichal SutterAustralia’s Large Language Model Landscape: Technical AssessmentMichal SutterWhat is Agentic RAG? Use Cases and Top Agentic RAG Tools (2025)Michal SutterThe Evolution of AI Protocols: Why Model Context Protocol (MCP) Could Become the New HTTP for AIMichal SutterGoogle AI’s New Regression Language Model (RLM) Framework Enables LLMs to Predict Industrial System Performance Directly from Raw Text DataMichal SutterWhat is MLSecOps(Secure CI/CD for Machine Learning)?: Top MLSecOps Tools (2025)Michal SutterYour LLM is 5x Slower Than It Should Be. The Reason?Pessimism—and Stanford Researchers Just Showed How to Fix ItMichal SutterHow Do GPUs and TPUs Differ in Training Large Transformer Models? Top GPUs and TPUs with BenchmarkMichal SutterWhat is a Database? Modern Database Types, Examples, and Applications (2025)Michal SutterWhat is a Voice Agent in AI? Top 9 Voice Agent Platforms to Know (2025)Michal SutterLarge Language Models LLMs vs.Small Language Models SLMs for Financial Institutions: A 2025 Practical Enterprise AI GuideMichal SutterNative RAG vs. Agentic RAG: Which Approach Advances Enterprise AI Decision-Making?Michal SutterTop 10 AI Blogs and News Websites for AI Developers and Engineers in 2025Michal SutterWhat Is Speaker Diarization? A 2025 Technical Guide: Top 9 Speaker Diarization Libraries and APIs in 2025Michal SutterWhat is DeepSeek-V3.1 and Why is Everyone Talking About It?Michal SutterMeet South Korea’s LLM Powerhouses: HyperClova, AX, Solar Pro, and MoreMichal SutterMigrating to Model Context Protocol (MCP): An Adapter-First PlaybookMichal SutterHello, AI Formulas: Why =COPILOT() Is the Biggest Excel Upgrade in YearsMichal SutterEmerging Trends in AI Cybersecurity Defense: What’s Shaping 2025?Top AI Security ToolsMichal SutterBlackRock Introduces AlphaAgents: Advancing Equity Portfolio Construction with Multi-Agent LLM CollaborationMichal SutterMaster Vibe Coding: Pros, Cons, and Best Practices for Data EngineersMichal SutterIs Model Context Protocol MCP the Missing Standard in AI Infrastructure?Michal SutterWhat is AI Inference?A Technical Deep Dive and Top 9 AI Inference Providers (2025 Edition)Michal SutterHugging Face Unveils AI Sheets: A Free, Open-Source No-Code Toolkit for LLM-Powered DatasetsMichal SutterWhat Is AI Red Teaming? Top 18 AI Red Teaming Tools (2025)Michal SutterFrom Deployment to Scale: 11 Foundational Enterprise AI Concepts for Modern BusinessesMichal SutterMeet dots.ocr: A New 1.7B Vision-Language Model that Achieves SOTA Performance on Multilingual Document ParsingMichal SutterAmazon Unveils Bedrock AgentCore Gateway: Redefining Enterprise AI Agent Tool IntegrationMichal SutterTop 6 Model Context Protocol (MCP) News Blogs (2025 Update)Michal SutterTop 12 API Testing Tools For 2025Michal SutterTop 10 AI Agent and Agentic AI News Blogs (2025 Update)Michal SutterWhy Docker Matters for Artificial Intelligence AI Stack: Reproducibility, Portability, and Environment ParityMichal SutterMistral AI Unveils Mistral Medium 3.1: Enhancing AI with Superior Performance and UsabilityMichal SutterCase Studies: Real-World Applications of Context EngineeringMichal SutterNVIDIA AI Introduces End-to-End AI Stack, Cosmos Physical AI Models and New Omniverse Libraries for Advanced RoboticsMichal SutterThe Best Chinese Open Agentic/Reasoning Models (2025): Expanded Review, Comparative Insights & Use CasesMichal SutterFrom 100,000 to Under 500 Labels: How Google AI Cuts LLM Training Data by Orders of MagnitudeMichal Sutter9 Agentic AI Workflow Patterns Transforming AI Agents in 2025Michal SutterFAQs: Everything You Need to Know About AI Agents in 2025Michal SutterTechnical Deep Dive: Automating LLM Agent Mastery for Any MCP Server with MCP- RL and ARTMichal SutterAlibaba Qwen Unveils Qwen3-4B-Instruct-2507 and Qwen3-4B-Thinking-2507: Refreshing the Importance of Small Language ModelsMichal SutterProxy Servers Explained: Types, Use Cases & Trends in 2025 [Technical Deep Dive]Michal SutterNVIDIA XGBoost 3.0: Training Terabyte-Scale Datasets with Grace Hopper SuperchipMichal SutterMoE Architecture Comparison: Qwen3 30B-A3B vs.GPT-OSS 20BMichal SutterGoogle DeepMind Introduces Genie 3: A General Purpose World Model that can Generate an Unprecedented Diversity of Interactive EnvironmentsMichal SutterModel Context Protocol (MCP) FAQs: Everything You Need to Know in 2025Michal SutterNow It’s Claude’s World: How Anthropic Overtook OpenAI in the Enterprise AI RaceMichal Sutter7 Essential Layers for Building Real-World AI Agents in 2025: A Comprehensive FrameworkMichal SutterA Technical Roadmap to Context Engineering in LLMs: Mechanisms, Benchmarks, and Open ChallengesMichal SutterThe Ultimate Guide to CPUs, GPUs, NPUs, and TPUs for AI/ML: Performance, Use Cases, and Key DifferencesMichal SutterFalcon LLM Team Releases Falcon-H1 Technical Report: A Hybrid Attention–SSM Model That Rivals 70B LLMsMichal SutterThe Ultimate 2025 Guide to Coding LLM Benchmarks and Performance MetricsMichal SutterNext-Gen Privacy: How AI Is Transforming Secure Browsing and VPN Technologies (2025 Data-Driven Deep Dive)Michal SutterIs Vibe Coding Safe for Startups?A Technical Risk Audit Based on Real-World Use CasesMichal Sutter9 Open Source Cursor Alternatives You Should Use in 2025Michal SutterMicrosoft Edge Launches Copilot Mode to Redefine Web Browsing for the AI EraMichal SutterKey Factors That Drive Successful MCP Implementation and AdoptionMichal SutterHow Memory Transforms AI Agents: Insights and Leading Solutions in 2025Michal SutterNVIDIA AI Releases GraspGen: A Diffusion-Based Framework for 6-DOF Grasping in RoboticsMichal SutterGoogle DeepMind Introduces Aeneas: AI-Powered Contextualization and Restoration of Ancient Latin InscriptionsMichal SutterGitHub Introduces Vibe Coding with Spark: Revolutionizing Intelligent App Development in a FlashMichal SutterGoogle Researchers Introduced LSM-2 with Adaptive and Inherited Masking (AIM): Enabling Direct Learning from Incomplete Wearable DataMichal Sutter7 MCP Server Best Practices for Scalable AI Integrations in 2025Michal SutterAI Guardrails and Trustworthy LLM Evaluation: Building Responsible AI SystemsMichal SutterTop 15+ Most Affordable Proxy Providers 2025Michal SutterThe Ultimate Guide to Vibe Coding: Benefits, Tools, and Future TrendsMichal SutterModel Context Protocol (MCP) for Enterprises: Secure Integration with AWS, Azure, and Google Cloud- 2025 UpdateMichal SutterMaybe Physics-Based AI Is the Right Approach: Revisiting the Foundations of IntelligenceMichal SutterThe Definitive Guide to AI Agents: Architectures, Frameworks, and Real-World Applications (2025)Michal SutterOpenAI Introduces ChatGPT Agent: From Research to Real-World AutomationMichal SutterHow to Connect Google Colab with Google Drive (2025 Detailed & Updated Guide)Michal Sutter50+ Model Context Protocol (MCP) Servers Worth Exploring RELATED ARTICLESMORE FROM AUTHOR Alibaba Qwen Team Releases Qwen3.5-397B MoE Model with 17B Active Parameters and 1M Token Context for AI agents Google DeepMind Proposes New Framework for Intelligent AI Delegation to Secure the Emerging Agentic Web for Future Economies A Coding Implementation to Design a Stateful Tutor Agent with Long-Term Memory, Semantic Recall, and Adaptive Practice Generation Moonshot AI Launches Kimi Claw: Native OpenClaw on Kimi.com with 5,000 Community Skills and 40GB Cloud Storage Now Meet ‘Kani-TTS-2’: A 400M Param Open Source Text-to-Speech Model that Runs in 3GB VRAM with Voice Cloning Support Getting Started with OpenClaw and Connecting It with WhatsApp Alibaba Qwen Team Releases Qwen3.5-397B MoE Model with 17B Active Parameters and 1M Token. Asif Razzaq – February 16, 2026 0 Alibaba Cloud just updated the open-source landscape. Today, the Qwen team released Qwen3.5, the newest generation of their large language model (LLM) family. The. Google DeepMind Proposes New Framework for Intelligent AI Delegation to Secure the Emerging Agentic. Michal Sutter – February 15, 2026 0 The AI industry is currently obsessed with ‘agents’—autonomous programs that do more than just chat. However, most current multi-agent systems rely on brittle, hard-coded. A Coding Implementation to Design a Stateful Tutor Agent with Long-Term Memory, Semantic Recall,.Asif Razzaq – February 15, 2026 0 In this tutorial, we build a fully stateful personal tutor agent that moves beyond short-lived chat interactions and learns continuously over time. We design. Moonshot AI Launches Kimi Claw: Native OpenClaw on Kimi.com with 5,000 Community Skills and. Michal Sutter – February 15, 2026 0 Moonshot AI has officially brought the power of OpenClaw framework directly to the browser. The newly rebranded Kimi Claw is now native to kimi.com,.Meet ‘Kani-TTS-2’: A 400M Param Open Source Text-to-Speech Model that Runs in 3GB VRAM. Michal Sutter – February 15, 2026 0 The landscape of generative audio is shifting toward efficiency. A new open-source contender, Kani-TTS-2, has been released by the team at nineninesix.ai. This model.Getting Started with OpenClaw and Connecting It with WhatsApp Arham Islam – February 14, 2026 0 OpenClaw is a self-hosted personal AI assistant that runs on your own devices and communicates through the apps you already use—such as WhatsApp, Telegram,. Google AI Introduces the WebMCP to Enable Direct and Structured Website Interactions for New. Michal Sutter – February 14, 2026 0 Google is officially turning Chrome into a playground for AI agents.For years, AI ‘browsers’ have relied on a messy process: taking screenshots of. How to Build a Self-Organizing Agent Memory System for Long-Term AI Reasoning Asif Razzaq – February 14, 2026 0 In this tutorial, we build a self-organizing memory system for an agent that goes beyond storing raw conversation history and instead structures interactions into. Exa AI Introduces Exa Instant: A Sub-200ms Neural Search Engine Designed to Eliminate Bottlenecks.Asif Razzaq – February 13, 2026 0 In the world of Large Language Models (LLMs), speed is the only feature that matters once accuracy is solved. For a human, waiting 1. [In-Depth Guide] The Complete CTGAN + SDV Pipeline for High-Fidelity Synthetic Data Michal Sutter – February 13, 2026 0 In this tutorial, we build a complete, production-grade synthetic data pipeline using CTGAN and the SDV ecosystem. We start from raw mixed-type tabular data.Discord Linkedin Reddit Twitter miniCON Event 2025 Download AI Magazine/Report Privacy & TC Cookie Policy 🐝 Partnership and Promotion © Copyright Reserved @2025 Marktechpost AI Media Inc We use cookies on our website to give you the most relevant experience by remembering your preferences and repeat visits. By clicking “Accept”, you consent to the use of ALL the cookies. Do not sell my personal information.Cookie settingsACCEPTPrivacy & Cookies Policy