Verify any claim · lenz.io
Claim analyzed
Tech
“LiveKit agents can only listen and respond to humans in meetings held inside LiveKit rooms, so a Google Meet or Microsoft Teams meeting must be bridged into a LiveKit room for a LiveKit agent to interact with the meeting audio/video.”
Submitted by Calm Whale d012
The conclusion
LiveKit agents are built to join LiveKit rooms and can only hear or publish media that exists in those rooms. So if a Google Meet or Microsoft Teams meeting is to be handled by a LiveKit agent, that meeting's media must be relayed into a LiveKit room first. The caveat is that Meet and Teams also support non-LiveKit-native bot or interop approaches.
Caveats
- The claim applies specifically to LiveKit agents, not to all bots or integrations for Google Meet or Microsoft Teams.
- “Bridged” is a broad term here: the media can be brought into LiveKit through SIP, telephony, ingress, or other relay mechanisms.
- Teams and Meet may offer their own native bot or interop frameworks, which do not require LiveKit at all.
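The room-scoping constraint behind this conclusion can be illustrated with a small toy model. The sketch below is plain Python and does not use the LiveKit SDK; `Room`, `Track`, and `bridge_into` are hypothetical names chosen only to mirror the room/participant/track vocabulary in the sources, and the "bridge" stands in for whichever relay mechanism (SIP, ingress, or a custom relay) is actually used.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Track:
    publisher: str  # identity of the participant that published the track
    kind: str       # "audio" or "video"


@dataclass
class Room:
    """Toy stand-in for a room; not the LiveKit SDK's Room class."""
    name: str
    tracks: List[Track] = field(default_factory=list)

    def publish(self, publisher: str, kind: str) -> Track:
        track = Track(publisher, kind)
        self.tracks.append(track)
        return track

    def visible_to(self, identity: str) -> List[Track]:
        # A participant can only subscribe to tracks that other
        # participants have published in this same room.
        return [t for t in self.tracks if t.publisher != identity]


def bridge_into(room: Room, external_source: str, kind: str = "audio") -> Track:
    """Hypothetical relay: republish external media as a participant in the room."""
    return room.publish(f"bridge:{external_source}", kind)


# The agent has joined livekit_room only.
livekit_room = Room("support-call")

# Stands in for a Meet/Teams meeting; the agent is not a participant here.
external_meeting = Room("teams-meeting")
external_meeting.publish("alice", "audio")

before = livekit_room.visible_to("agent")   # nothing from the external meeting
bridge_into(livekit_room, "teams-meeting")  # relay the meeting audio into the room
after = livekit_room.visible_to("agent")    # the bridged track is now visible
```

In this model the agent sees Alice's audio only after a relay republishes it inside the room the agent joined, which is the sense of "bridged" used in the caveats above.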
Sources
Sources used in the analysis
“The Agents framework lets you add any Python or Node.js program to LiveKit rooms as full realtime participants… ### How agents connect to LiveKit When your agent code starts, it first registers with a LiveKit server (either self hosted or LiveKit Cloud) to run as an "agent server" process. The agent server waits until it receives a dispatch request. To fulfill this request, the agent server boots a "job" subprocess which joins the room… After your agent and user join a room, the agent and your frontend app can communicate using LiveKit WebRTC.”
“Rooms, participants, and tracks are the fundamental building blocks of every LiveKit app. - A **room** is a virtual space where realtime communication happens. - **Participants** are the users, agents, or services that join rooms to communicate. - **Tracks** are the media streams (audio, video, or data) that participants publish and subscribe to within a room.”
“A room is created automatically when the first participant joins, and is automatically closed when the last non-agent participant leaves. You connect to LiveKit through a `Room` object. A room is a core concept that represents an active LiveKit session. Your app joins a room — either one it creates or an existing one — as a participant.” “**Connecting to agents** If you're building a 1:1 agent application, see the Authentication overview to learn how to use Session APIs for simplified agent connection. The Session APIs handle token generation, room connection, and agent lifecycle automatically.”
Microsoft states: “You can build bots that participate in Teams calls and meetings using the Real‑time Media Platform APIs. These bots can send and receive audio and video streams in real time.” The page explains how to use Teams Graph APIs and media SDKs to create meeting bots. It doesn’t reference LiveKit or LiveKit agents, implying that any use of a LiveKit agent would have to go through a custom bridge that connects these APIs to a LiveKit room.
“LiveKit Agents is a framework for building AI applications that can **participate in real‑time audio/video rooms**. Agents connect to LiveKit, subscribe to media tracks in a room, and publish their own audio or video back into the room… Agents are implemented as services that join LiveKit rooms like any other participant.”
“A media agent joins a **LiveKit room** and subscribes to one or more audio or video tracks. The agent processes the incoming media (for example, speech‑to‑text + LLM + text‑to‑speech) and publishes its responses back into the **same room** as an audio track… The agent’s scope of perception is limited to the media that is present in the room it has joined.”
“LiveKit Telephony lets you **bridge external phone calls into LiveKit rooms** via SIP or PSTN providers such as Twilio… Once bridged, the call appears as a participant in a LiveKit room, where you can add other participants or **LiveKit Agents** to interact with the caller in real time.”
“You can bridge Twilio conferencing to LiveKit via SIP, allowing you to add agents and other LiveKit clients to an existing Twilio conference… Step 5. Create a dispatch rule to place each caller into their own room… This allows you to **automatically dispatch an agent to the Twilio conference**.” This shows the pattern that external voice conferences (here, Twilio) must first be **bridged into a LiveKit room**, after which a LiveKit agent is dispatched into that room to interact.
LiveKit trunks bridge your third-party SIP provider and LiveKit. To use LiveKit, you must configure your SIP provider's trunking service to work with LiveKit. Once configured, calls from the PSTN or other SIP endpoints can be routed into a LiveKit room, where they appear as participants alongside WebRTC clients and agents.
Agents connect to LiveKit rooms via the Server SDK, just like any other server-side participant. After joining a room, the agent subscribes to the relevant audio tracks, performs its processing (for example, transcription or LLM prompting), and then publishes audio or data tracks back into the same room. The agent’s scope of interaction is limited to the media and events available inside that LiveKit room.
Ingress lets you bring media from external systems into LiveKit. You can pipe RTMP, WHIP, and other supported protocols into a LiveKit room, where the incoming stream appears as a track that other participants (including agents) can subscribe to. This makes it possible to consume audio/video from third-party services by bridging them into a LiveKit room.
In this guide we’ll connect a LiveKit Agent to the public telephone network using LiveKit Telephony. Incoming calls from your SIP provider are bridged into a LiveKit room over SIP trunking. The agent joins that same room, listens to the caller’s audio, and responds using synthesized speech sent as an audio track in the room.
LiveKit provides APIs and infrastructure for building real-time audio and video applications over WebRTC. Third-party systems can interoperate with LiveKit via bridges such as SIP trunks, RTMP/WHIP ingress, or server-side SDKs that join rooms as participants. External services don’t communicate directly with agents; instead, all media is exchanged inside LiveKit rooms using tracks.
“This guide walks you through building a voice AI assistant with Google Gemini and LiveKit Agents. In less than 10 minutes, you have a voice assistant that you can talk to from your browser.” The guide’s workflow is: start the agent, run it in console mode, and then ‘Connect to Agent Console’ or ‘Deploy to LiveKit Cloud’—all of which assume the agent is running as a participant in a LiveKit room. The integration described is with Google Gemini (model provider), not with Google Meet as a conferencing platform.
If you wish to utilize the cross-platform SIP join capability, you need to enable SIP video calling on your Teams Rooms following these instructions: SIP and H.323 Dialing. Teams Rooms devices can join third-party meeting platforms (for example, Cisco Webex, Zoom, Google Meet) using SIP or cloud interop services. In these scenarios, Teams Rooms is effectively joining the external meeting using a SIP-based bridge.
“LiveKit Cloud includes built-in telephony support. Users can dial into a LiveKit room from the public telephone network (PSTN), or you can dial out from a room to a phone number. Once connected, the phone caller appears in the room as an audio participant like any other.” “This allows you to bridge traditional phone calls into a LiveKit room, where they can interact with other participants and agents.”
“An open source framework and developer platform for building, testing, deploying, scaling, and observing agents in production… Agents are first‑class participants in your LiveKit infrastructure, with access to the same media and data channels as human users.” The product overview emphasizes that agents operate within LiveKit’s own media infrastructure as participants. It does not advertise native connectors that allow agents to directly attach to Google Meet or Teams calls without going through LiveKit rooms or media bridges.
SIP is designed to be a full-featured SIP bridge, connecting LiveKit sessions with Telephony networks with SIP Trunking. SIP provides a way to bring SIP traffic into a LiveKit room. To accept inbound calls, the workflow goes like this: create an SIP Trunk with CreateSIPTrunk API (to livekit-server) … SIP service receives a call … SIP service connects to the LiveKit room and SIP caller is a participant.
Meet is a Zoom-inspired sample application we built to show developers how to use LiveKit and for internally dogfooding our infra. In this demo, KITT (our AI assistant) joins a LiveKit room as a participant, subscribes to the user’s audio, sends it to an LLM, and then publishes synthesized speech back into the same room. All of KITT’s interaction happens inside the LiveKit session; external meeting platforms would need to be bridged into LiveKit to be part of this experience.
“The developer platform for voice AI. Build agents in Python or TypeScript, deploy to the cloud, and observe every conversation in production.” “Agents are full participants in LiveKit rooms, with access to audio, video, screen share, and data in real time.”
“LiveKit Agents is a framework for building realtime voice (and video) AI applications. Agents run as participants in LiveKit rooms and can send and receive audio, video, and data in real time.” “Agents are designed to join LiveKit rooms, listen to media tracks, and respond based on application logic (LLMs, tools, etc.).”
“A **Worker** is a process that connects to LiveKit Cloud and waits for job assignments. When a user initiates a voice session (by joining a Room), LiveKit dispatches a job to an available Worker.” “The `ctx.connect()` method controls precisely when and how your agent joins the room.”
“We’ll use LiveKit Agents to create a realtime conversational AI that joins a LiveKit room, listens to the user’s audio, and responds in natural language… The agent connects to a LiveKit room via the server SDK, subscribes to the user’s audio track, and sends responses back as audio in the same room.” The tutorial shows the agent only consuming and producing media inside a LiveKit room; it does not discuss connecting the agent directly to external meeting platforms. Any external call participation is implied to require routing audio into the LiveKit room where the agent is present.
“We’ll walk through how Twilio manages incoming calls, how **LiveKit bridges them into real‑time audio rooms**, and how to connect it so your voice AI agent can answer… After you configure the SIP trunk and origination, you create an **Inbound Trunk in LiveKit** and then a **Dispatch Rule** so each incoming call is routed to a LiveKit room. From there, you attach a **LiveKit Agent** to that room so it can listen and respond to callers.”
Pipedream lets you integrate Google Meet and LiveKit with thousands of other APIs. Typical workflows involve using Google Meet webhooks or scheduled jobs to trigger actions in LiveKit, such as creating a room or sending a message. These integrations automate coordination between the platforms but do not directly mix Google Meet audio/video with LiveKit media without using a separate media bridge.
“1. **User & Session Start:** The user connects to a LiveKit room. An `AgentSession` is created, managing the overall interaction and holding shared `userdata`. The *initial* agent (`IntroAgent`) is added to the session.” “Running & Connecting: - Run the agent: `python agent.py` - Connect using the Agent Playground (link below) or your own client, pointing to your LiveKit instance, ensuring you join the room the agent is listening for (usually determined by how the agent job is launched or configured).”
“The LiveKit integration triggers when events occur in a LiveKit **room** (such as participant joined, track published) and then allows you to send messages or actions to Microsoft Teams… Typical workflows include: when a Teams event occurs, **create or interact with a LiveKit room** for real‑time audio/video via LiveKit. The integration works by connecting external events to **LiveKit rooms**, not by inserting LiveKit Agents directly into native Teams media streams.”
In the video, the presenter explains: “We’re going to build a speech‑to‑text agent that joins a LiveKit room, listens to whatever is said there, and prints out the transcription in realtime.” Later, they show the code that connects using the LiveKit Python SDK, subscribes to room tracks, and runs the agent logic on those tracks. The demonstrated agent only processes audio that flows through a LiveKit room; there is no example of the agent directly joining or listening to a Google Meet or Microsoft Teams meeting without going through LiveKit.
“I’m facing a consistent issue where **inbound Twilio voice calls bridged via SIP connect successfully to a LiveKit Agent, the room is created**… The SIP trunk brings the Twilio call into a LiveKit room, and the agent joins that room. The problem is with the agent logic after the introduction, not with the SIP bridge itself.” This issue discussion confirms that an external call is first brought **into a LiveKit room**, and the agent only interacts with media in that room.
“This guide will walk you through integrating Portkey with **LiveKit’s STT‑LLM‑TTS pipeline to build enterprise‑ready voice AI agents**… Your agent runs as a service that connects to a **LiveKit room**, subscribes to the caller’s audio track, and publishes generated speech back into that room. External telephony or meeting systems need to send audio into a LiveKit room (for example, via SIP) for the agent to process it.”
“You can join Google Meet calls from third‑party conferencing systems using SIP or H.323 interoperability services… The interop gateway **bridges the external SIP/H.323 system into the Meet meeting** as a participant. The external system receives and sends audio and video through the gateway rather than modifying the native Meet client or media servers.” This demonstrates that standard practice for external systems is to **bridge media in as a participant**, analogous to bridging external meetings into a LiveKit room for agents.
At ~5:01, the presenter explains: “On the back end, we're actually creating a room, a new room for every agent, and then we're returning the room as well as the token. So the front end can then connect to the LiveKit server using that token. The LiveKit server will then dispatch an agent into the room, and we'll show you how that happens.” Later (~7:41): “Then we go ahead and generate the token and we send back the token and the URL… so the front end can then subscribe to all the tracks and publish their own tracks… The LiveKit server will then dispatch an agent into the room, and then the agent will send a greeting back to the user.”
LiveKit’s documented model for agents is that they are participants inside LiveKit rooms receiving media tracks via WebRTC. To interact with calls on other platforms such as Google Meet or Microsoft Teams, current community guidance is to bridge those calls into a LiveKit room (for example via SIP gateways, RTMP ingest, or custom media relays) so that the agent can receive the audio/video as standard LiveKit tracks. Directly embedding a LiveKit Agent as a native participant inside Meet or Teams (without such a bridge or relay) is not part of the public LiveKit Agents feature set as documented.
The tutorial shows how to connect Twilio phone numbers to LiveKit using SIP trunks: “We create an inbound SIP trunk inside of LiveKit… set up a SIP trunk… and then create a dispatch rule… In here we in agent dispatch… add agent and we’ll select telephone agent… This is the command you need to write each time… if you want to speak to your agent using your phone.” Audio from the external telephony network (via Twilio) is first brought into LiveKit through a SIP trunk, then routed to a LiveKit agent by a dispatch rule. This illustrates that external systems interact with agents by bridging their media into LiveKit, rather than agents directly joining the external system.
In the walkthrough, the presenter creates a Twilio Elastic SIP Trunk and configures the origination SIP URI provided by LiveKit. Calls to the Twilio phone number are routed over this SIP trunk into LiveKit, where an inbound trunk and dispatch rule connect the call to a LiveKit agent. The agent itself is not connected directly to Twilio or the PSTN; it only interacts with the call once the audio has been bridged into a LiveKit room.
Around 5:35, the instructor describes: “When we start this script here… we're calling the CLI run app… where we have the whole agent session and room and greeting set up… This will create our agent, and we can immediately test talking to our LiveKit agent in the browser. So we just have to select our organization and it automatically connects us to the agent that we just deployed through the CLI.” The workflow they demonstrate assumes the user connects to a LiveKit-powered web client, which in turn connects to a LiveKit room where the agent is a participant.
One answer notes: “LiveKit agents can’t log in as a native Teams bot or client. They connect only to **LiveKit rooms**. If you want an agent in your Teams meeting, you’ll have to either use SIP to dial out from Teams to LiveKit or run a custom bridge that takes Teams media and forwards it into a LiveKit room where the agent is present.” This answer reflects developer experience that LiveKit agents themselves do not connect directly to Teams/Meet media; instead, audio is **bridged into a LiveKit room**.
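The dispatch flow that several of the LiveKit sources above describe (an agent process registers, waits for a dispatch request, then starts a job that joins the room) can be sketched as a toy model. This is plain Python, not the LiveKit API: `ToyServer` and its methods are hypothetical, and dispatching on room creation is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyServer:
    """Stand-in for a LiveKit server; illustrative only, not the real API."""
    workers: List[str] = field(default_factory=list)
    rooms: Dict[str, List[str]] = field(default_factory=dict)

    def register_worker(self, worker_id: str) -> None:
        # Step 1: the agent process registers, then waits for work.
        self.workers.append(worker_id)

    def join(self, room: str, identity: str) -> None:
        # A room is created when its first participant joins.
        is_new = room not in self.rooms
        self.rooms.setdefault(room, []).append(identity)
        if is_new and self.workers:
            self._dispatch(room)

    def _dispatch(self, room: str) -> None:
        # Steps 2 and 3: a dispatch request starts a job on a registered
        # worker, and that job joins the room as a participant.
        worker = self.workers[0]
        self.rooms[room].append(f"agent-job@{worker}")


server = ToyServer()
server.register_worker("worker-1")

# A browser user and a bridged SIP caller both reach the agent the same way:
# by becoming participants in a room.
server.join("room-a", "web-user")
server.join("room-b", "sip-caller")

room_a = server.rooms["room-a"]
room_b = server.rooms["room-b"]
```

In both rooms the agent job appears only as another participant, and nothing in this flow lets it attach to media outside a room on the server.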
Expert review
3 specialized AI experts evaluated the evidence and arguments.
Expert 1 — The Logic Examiner
LiveKit's docs consistently define Agents as participants that join LiveKit rooms and can only subscribe/publish to tracks inside the room they joined (Sources 1, 5, 6, 10, 21), and LiveKit's own interop patterns (telephony bridges, ingress) likewise route external media into a LiveKit room where agents then interact with it (Sources 7, 11, 12, 13). Therefore, while the Microsoft Teams bot API shows Teams has its own native way to build meeting bots (Source 4), it does not provide a logical counterexample to the narrower claim about LiveKit Agents; the claim that Meet/Teams media must be bridged/relayed into a LiveKit room for a LiveKit Agent to interact is logically supported and true in scope.
Expert 2 — The Context Analyst
The claim has two components: (1) LiveKit agents can only operate inside LiveKit rooms, and (2) external meetings like Google Meet or Teams 'must be bridged' into a LiveKit room for agent interaction. Component (1) is thoroughly supported by all LiveKit documentation sources. Component (2) is technically accurate from a LiveKit-agent perspective but omits important context: the claim implies that bridging into a LiveKit room is the only way to interact with Teams/Meet audio at all, when in fact Microsoft Teams natively supports its own meeting bots (Source 4) that can access meeting audio/video without any LiveKit room involved — these are simply not LiveKit agents. Additionally, the claim's phrasing 'must be bridged' slightly overstates the specificity, as LiveKit supports multiple ingress mechanisms (RTMP/WHIP, SIP, etc.) beyond a simple 'bridge,' though all do route media through a LiveKit room. The core architectural truth — that a LiveKit agent specifically requires media to be present in a LiveKit room — is accurate and well-documented, and the bridging requirement for external meetings is correctly stated for the specific use case of deploying a LiveKit agent. The missing context is that Teams/Meet have their own native bot frameworks that don't require LiveKit rooms at all, and that 'bridging' encompasses multiple technical mechanisms. Overall, the claim is mostly true as a description of LiveKit agent architecture, with minor framing issues around the implied exclusivity of the bridging approach.
Expert 3 — The Source Auditor
High-authority, primary LiveKit sources (1 LiveKit Agents intro, 5 Agents overview, 6 Media workflows, 10 Build an Agent, plus 2–3 on rooms/connection) consistently state agents join LiveKit rooms as participants and their perception/interaction is limited to tracks/events inside the room; LiveKit's own bridging docs (7–9, 11, 12, 13, 16) describe bringing external media into LiveKit specifically by ingest/bridges that terminate in a LiveKit room where agents can then subscribe/publish. Microsoft Learn (4) only shows Teams-native bots can join Teams meetings, but it does not show LiveKit Agents can join Meet/Teams directly, so the most reliable evidence supports the claim that for a LiveKit Agent to interact with Meet/Teams meeting media you must route/bridge that media into a LiveKit room (even if the bridge mechanism varies).
Expert summary
The arguments
Two AI advocates debated this claim using the research gathered.
Argument for
The claim is fully supported by LiveKit's own documentation: Source 5 explicitly states that 'agents are implemented as services that join LiveKit rooms like any other participant,' and Source 6 confirms that 'the agent's scope of perception is limited to the media that is present in the room it has joined,' making it architecturally impossible for an agent to directly access Google Meet or Microsoft Teams audio/video without a bridge. Sources 7, 8, 9, and 12 further demonstrate that even external telephony systems like Twilio must be bridged into a LiveKit room via SIP trunking before an agent can interact with them, and Source 37 directly confirms from developer experience that 'LiveKit agents can't log in as a native Teams bot or client' and require a custom bridge to route external meeting audio into a LiveKit room.
The Proponent equivocates between “an agent can only perceive media inside the LiveKit room it joined” (Sources 5–6, LiveKit Documentation) and the stronger motion that a Meet/Teams meeting “must be bridged into a LiveKit room,” ignoring that LiveKit explicitly supports bringing external media into rooms via non-meeting-specific mechanisms like RTMP/WHIP Ingress and other bridges (Sources 11 and 13, LiveKit Documentation), so the asserted necessity and exclusivity are not established. The Proponent's reliance on telephony-only SIP examples (Sources 7–9 and 12, LiveKit Documentation) and a low-authority anecdotal Stack Overflow post (Source 37) does not rebut the key counterpoint that Teams already supports native real-time meeting bots without any LiveKit room at all (Source 4, Microsoft Learn), which undercuts the motion's implied requirement for LiveKit-room bridging as the only path to interact with Teams meeting media.
Argument against
The motion overstates the limitation by claiming Google Meet or Microsoft Teams “must” be bridged into a LiveKit room, but LiveKit's own docs describe multiple ways to bring external media into LiveKit (e.g., RTMP/WHIP Ingress) rather than specifically “bridging the meeting,” so the asserted necessity and exclusivity are not supported as stated (Source 11, LiveKit Documentation; Source 13, LiveKit Documentation). Moreover, Microsoft explicitly supports native Teams meeting bots that send/receive real-time audio/video without any LiveKit room involved, which directly contradicts the implication that interaction with Teams meeting media inherently requires a LiveKit-room bridge (Source 4, Microsoft Learn).
The Opponent's invocation of Source 4 commits a category error: Microsoft's native Teams bot APIs demonstrate that Teams meetings can be accessed by Teams-native bots, but this says nothing about whether a LiveKit agent — which Sources 5, 6, and 10 consistently define as a process that exclusively joins and operates within LiveKit rooms — can do so without a bridge; the existence of a Teams-native pathway does not eliminate the architectural constraint on LiveKit agents specifically. Furthermore, the Opponent's appeal to Source 11 (RTMP/WHIP Ingress) actually reinforces the claim rather than undermining it, as that source explicitly states that external media must be piped 'into a LiveKit room' before agents can subscribe to it, confirming that regardless of the bridging mechanism chosen, the LiveKit room remains the mandatory intermediary.