Build a SwiftUI video chat app using the Zoom Video SDK on iOS
- Update: Feb 26, 2026
This post has been updated with the latest SPM integration details and a newly added script that helps generate the JWT.
Introduction
At Zoom, we strive to provide the best video conferencing experience possible. Our SDK libraries empower developers to create third-party apps powered by our world-class video technology platform. With the Zoom Video SDK, developers can build fully customizable, self-branded apps with nearly all of the features of the Zoom desktop client, from video calling to screen sharing.
In this guide we will build an iOS SwiftUI app with the Video SDK. We’ll cover:
- Introduction
- Prerequisites
- Getting the SDK and its contents
- Quickstart app contents
- JWT authentication
- Integrating the SDK
- Initializing the SDK
- Joining a session
- Set up delegate callbacks
- Video
- Audio
- Responding to user events
- Leaving a session
The app lets two or more users have a conversation with each other. Both video and audio-only communication are supported. For this project, we'll use Swift and SwiftUI. If you'd like to use UIKit, you can read our UIKit blog.
Prerequisites
To build this app, you should have:
- Xcode
- A physical 64-bit iOS device with iOS version 13.0+
- A Zoom Video SDK account with Video SDK credentials
- A valid provisioning profile certificate
Getting the SDK and its contents
Adding the SDK to your iOS app takes only a single step with Swift Package Manager. In Xcode, select File > Add Package Dependencies.... In the Search or Enter Package URL bar at the top right, enter https://github.com/zoom/zoom-video-sdk-iOS/, then enter swift-package-manager in the Branch field.

Tap the Add Package button, confirm the app target you are adding it to, and the Video SDK should be added to Package Dependencies accordingly. The Video SDK for iOS package includes the following XCFramework bundles under /Sample-Libs/lib that can be added to your project as needed:
- ZoomVideoSDK.xcframework and ZoomTask.xcframework: Interfaces to support all services related to Zoom sessions, such as initializing the SDK, creating and joining sessions, in-session services, and more.
For this tutorial, we do not need these xcframeworks:
- CptShare.xcframework: Interfaces to support screen sharing a single UIView. Required to receive annotation by others when sharing a single UIView, as opposed to full broadcasting.
- zm_annoter_dynamic.xcframework: Interfaces to support the annotation service when sharing.
- ZoomVideoSDKScreenShare.xcframework: Interfaces to support the full screen share service, for broadcasting a device screen.
- zoomcml.xcframework: Interfaces to support virtual background filter and 3D avatar.
- Whiteboard.xcframework: Interfaces to support whiteboard.
To add framework files manually via the Zoom Marketplace, see the documentation.
Quickstart app contents
MySwiftUIVideoSDK is a simple two-view navigation app.
StartView is the entry point for the app where the Video SDK is initialized.



SessionView contains a .toolbar which holds the controls for toggling the user’s video, toggling audio and ending the Zoom session. This view also contains a ScrollView that contains all of the participants in the session.


These three controls are implemented as three Button views inside the .toolbar's ToolbarItemGroup. Each button calls its respective action on the viewModel attached to the SessionView.
// Example of the video button, which toggles video on and off
Button(action: {
    viewModel.toggleVideo()
}, label: {
    Label {
        Text(viewModel.videoOn ? "Stop Video" : "Start Video")
    } icon: {
        Image(systemName: viewModel.videoOn ? "video.slash" : "video")
            .frame(width: 24, height: 24)
    }
})
JWT authentication
JSON Web Tokens (JWTs) are used to authorize Zoom Video SDK apps. They are always required for starting and joining sessions. Ideally, you should generate the JWT on the server side to keep your Video SDK credentials safe. However, for reference, we have added a Swift script in the /Scripts folder that generates the JWT needed for this demo.
Follow the README in the /Scripts folder to understand how to use the script.
// MARK: Session Information
/*
TODO: Enter the following variables needed to initialize the Video SDK and to start/join a session.
In production, you should sign your JWT with a backend service. For faster JWT generation, check out JWTGenerator.swift in the Scripts folder and its README for details on how to use it.
Once you have the token, you can simply copy and paste it below.
Ensure that the sessionName matches the session name used to generate the JWT Token.
*/
let jwtToken = "" // Leave this as empty if you choose to copy and paste your generated JWT token directly in the sample app's alert box after clicking on "Join Session"
let sessionName = "" // Also known as tpc in JWT
let userName = "" // Display name
let sessionPassword: String = "" // If needed
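Because the session name must match the one used to generate the token, a quick local check can catch a mismatch before joining. The following is a minimal sketch, not part of the sample app or the SDK: decodeJWTPayload and tokenMatchesSession are hypothetical helper names, and the code only reads the tpc claim from the payload. It does not verify the signature.

```swift
import Foundation

/// Decodes the payload (second segment) of a JWT without verifying its signature.
/// Hypothetical helper for local sanity checks only.
func decodeJWTPayload(_ token: String) -> [String: Any]? {
    let segments = token.split(separator: ".", omittingEmptySubsequences: false)
    guard segments.count == 3 else { return nil }
    // Convert base64url to standard base64 and restore the stripped padding.
    var base64 = String(segments[1])
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    while base64.count % 4 != 0 { base64.append("=") }
    guard let data = Data(base64Encoded: base64),
          let object = try? JSONSerialization.jsonObject(with: data) else { return nil }
    return object as? [String: Any]
}

/// Returns true when the token's tpc claim equals the given session name.
func tokenMatchesSession(_ token: String, sessionName: String) -> Bool {
    guard let tpc = decodeJWTPayload(token)?["tpc"] as? String else { return false }
    return tpc == sessionName
}
```

A call such as tokenMatchesSession(jwtToken, sessionName: sessionName) could run before attempting to join, so that a mismatch surfaces as a clear message rather than a failed join.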
Integrating the SDK
Ensure your app's General > Minimum Deployments target is set to at least iOS 13.0.

If you used Swift Package Manager to add the Zoom Video SDK, your Xcode project's Package Dependencies should look like this:

The General > Frameworks, Libraries, and Embedded Content settings should look like this:

If you added the Zoom Video SDK manually, do the following:
In the Video SDK package that was downloaded from the Zoom Marketplace, navigate to /Sample-Libs/lib.

The Video SDK is a dynamic library, so it must be included in the project as an embedded binary. In your Xcode project, navigate to your app's target and then General > Frameworks, Libraries, and Embedded Content, add ZoomVideoSDK.xcframework for the main SDK interfaces, and set it to Embed & Sign.

Finally, from the same target page, navigate to Info and add the required project permissions ("Privacy - * Usage Description") for Camera, Microphone, Bluetooth, and optionally Photo Library. The user will then explicitly grant these permissions to the app at runtime. For more information on media permissions, see here.

Initializing the SDK
Let’s get started by initializing the SDK so we have access to its functionality. For now, we’ll work in the StartView, where we first import ZoomVideoSDK to make the module available.
In the setupSDK function, we’ll create an instance of ZoomVideoSDKInitParams and set its domain to zoom.us. We then call the initialize function on the Video SDK from the main thread and verify that it initialized correctly.
let initParams = ZoomVideoSDKInitParams()
initParams.domain = "zoom.us"
let sdkInitReturnStatus = ZoomVideoSDK.shareInstance()?.initialize(initParams)
switch sdkInitReturnStatus {
case .Errors_Success:
    print("SDK initialization succeeded")
default:
    if let error = sdkInitReturnStatus {
        print("SDK initialization failed: \(error)")
        return
    }
}
Joining a session
For the remainder of the app, we'll use SessionView.swift for view-related code and SessionView+Extension.swift for ViewModel-related code. Import the Zoom Video SDK here as well.
To create or join a session, you need to instantiate a ZoomVideoSDKSessionContext object and provide the following required properties:
- token: JSON Web Token (JWT) created from Video SDK credentials during Authentication.
- sessionName: The session’s unique identifier, which must match the tpc field in the JWT. If the name is for a currently active session, then the SDK will join the session if all required parameters have been provided. If no active session exists with the name, then the SDK will create a new session for you.
- userName: Display name of the user shown in the session. Default value is "null".
Optional additional properties:
- sessionPassword: You may optionally specify a password for the session that attendees must enter.
- audioOption: Audio settings configurable in ZoomVideoSDKAudioOptions.
- videoOption: Video settings configurable in ZoomVideoSDKVideoOptions.
We will create the session context in a new method, joinSession, using the data that you previously entered in SessionView+Extension.swift. Once again, in a production app you should not hardcode the JWT or other credentials. Retrieve them from a backend server instead.
func joinSession() async {
    ZoomVideoSDK.shareInstance()?.delegate = self
    let sessionContext = ZoomVideoSDKSessionContext()
    sessionContext.token = jwtToken.isEmpty ? userInputJWT : jwtToken
    sessionContext.sessionName = sessionName
    sessionContext.userName = userName
    // Join with the local camera turned on
    let videoOption = ZoomVideoSDKVideoOptions()
    videoOption.localVideoOn = true
    sessionContext.videoOption = videoOption
    // Join with the microphone muted
    let audioOption = ZoomVideoSDKAudioOptions()
    audioOption.mute = true
    sessionContext.audioOption = audioOption
    if !sessionPassword.isEmpty {
        sessionContext.sessionPassword = sessionPassword
    }
    // Join Session
    if let session = ZoomVideoSDK.shareInstance()?.joinSession(sessionContext) {
        print("Session object: \(session)")
    } else {
        print("Join session failed")
        DispatchQueue.main.async {
            self.joinSessionFailed = true
        }
    }
}
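The token selection in joinSession falls back to the pasted value when the hardcoded one is empty. Pasted tokens often carry stray whitespace or a trailing newline, so a small helper could normalize both sources first. This is a sketch under that assumption: resolveToken is a hypothetical name and is not part of the sample app or the SDK.

```swift
import Foundation

/// Picks the token to join with: the hardcoded value wins, otherwise the pasted one.
/// Both are trimmed first; returns nil when neither contains a usable value.
func resolveToken(hardcoded: String, userInput: String) -> String? {
    let candidates = [hardcoded, userInput]
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
    return candidates.first { !$0.isEmpty }
}
```

Returning nil when both values are empty lets the caller show an error before ever calling into the SDK.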
In SessionView.swift, the view logic displays one of two views depending on whether the user is in a session.
struct SessionView: View {
    @StateObject private var viewModel = ViewModel()
    @Environment(\.dismiss) var dismiss
    var body: some View {
        if viewModel.inSession {
            // Display participants UI and toolbar
        } else {
            // Display the loading view. viewModel.joinSession() is called from here.
        }
    }
}
Set up delegate callbacks
The Video SDK uses delegate callbacks to share events and updates such as operation results or failures. You receive them by conforming to ZoomVideoSDKDelegate, which provides all available session callbacks. We do so by making the ViewModel class conform to ZoomVideoSDKDelegate and assigning it as the delegate in joinSession().
extension SessionView {
    @MainActor
    class ViewModel: NSObject, ObservableObject, @preconcurrency ZoomVideoSDKDelegate {
        // ...
        func joinSession() async {
            // Register the view model as the SDK delegate before joining
            ZoomVideoSDK.shareInstance()?.delegate = self
            // ...
        }
    }
}
Video
The app includes controls to toggle the camera and microphone, plus a button to leave the session. Let's go over each feature individually.
To display a user's video stream, we first need a UIViewRepresentable, which acts as a bridge between UIKit and SwiftUI, because the video canvas subscribes to and unsubscribes from a UIKit UIView. We'll create two UIViewRepresentable types in SessionView: one for the local user (LocalVideoView) and one for remote users (RemoteVideoView).
// Create the two views in SessionView.swift
public struct LocalVideoView: UIViewRepresentable {
    // Observe the shared view model rather than owning it with @State
    @ObservedObject var viewModel: SessionView.ViewModel
    public func makeUIView(context: Context) -> UIView {
        let videoView = UIView()
        viewModel.attachLocalVideo(to: videoView)
        return videoView
    }
    public func updateUIView(_ uiView: UIView, context: Context) {
        viewModel.updateLocalVideo(to: uiView)
    }
}
public struct RemoteVideoView: UIViewRepresentable {
    @ObservedObject var viewModel: SessionView.ViewModel
    let index: Int // To keep track of the remote users.
    public func makeUIView(context: Context) -> UIView {
        let videoView = UIView()
        return videoView
    }
    public func updateUIView(_ uiView: UIView, context: Context) {
        viewModel.updateRemoteVideo(to: uiView, index: index)
    }
}
We'll also create the corresponding methods in the view model that makeUIView and updateUIView call. The difference between LocalVideoView and RemoteVideoView is that a session can have multiple remote users, so we track them using an index.
// Create the four methods in the ViewModel in SessionView+Extension.swift
extension SessionView {
    @MainActor
    class ViewModel: NSObject, ObservableObject, @preconcurrency ZoomVideoSDKDelegate {
        // Error popup
        @Published var errorTitle: String = "Error"
        var errorMessage: String = "Message"
        // Local user
        @Published var userInputJWT = ""
        @Published var shouldJoin = false
        @MainActor weak var localView: UIView?
        @Published var joinSessionFailed: Bool = false
        @Published var inJWTInput: Bool = true
        @Published var inSession: Bool = false
        @Published var leftSession: Bool = false
        @Published var videoOn: Bool = false
        @Published var audioOn: Bool = false
        // Remote users
        @Published var remoteUsers: [ZoomVideoSDKUser] = []
        // ...
        // Attaching the local user's video view
        @MainActor func attachLocalVideo(to view: UIView) {
            self.localView = view
        }
        // Updating the local user's video view
        @MainActor func updateLocalVideo(to view: UIView) {
            guard let myUserVideoCanvas = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()?.getVideoCanvas(), let myVideoIsOn = myUserVideoCanvas.videoStatus()?.on else { return }
            if myVideoIsOn {
                myUserVideoCanvas.subscribe(with: localView, aspectMode: .panAndScan, andResolution: ._Auto)
            } else {
                myUserVideoCanvas.unSubscribe(with: localView)
            }
        }
        // Attaching the remote user's video view based on index
        @MainActor func attachRemoteUserVideo(index: Int, to view: UIView) {
            // Ignore indices that no longer exist in remoteUsers
            guard remoteUsers.indices.contains(index) else { return }
            if let currentUserVideoCanvas = self.remoteUsers[index].getVideoCanvas(), let videoStatus = currentUserVideoCanvas.videoStatus() {
                if videoStatus.on {
                    currentUserVideoCanvas.subscribe(with: view, aspectMode: .panAndScan, andResolution: ._Auto)
                } else {
                    currentUserVideoCanvas.unSubscribe(with: view)
                }
            }
        }
        // Updating the remote user's video view based on index
        @MainActor func updateRemoteVideo(to view: UIView, index: Int) {
            // Ignore indices that no longer exist in remoteUsers
            guard remoteUsers.indices.contains(index) else { return }
            if let currentUserVideoCanvas = self.remoteUsers[index].getVideoCanvas(), let videoStatus = currentUserVideoCanvas.videoStatus() {
                if videoStatus.on {
                    currentUserVideoCanvas.subscribe(with: view, aspectMode: .panAndScan, andResolution: ._Auto)
                } else {
                    currentUserVideoCanvas.unSubscribe(with: view)
                }
            }
        }
    }
}
We'll also create a PlaceholderView in SessionView.swift for when the user does not have their video turned on. It'll display a person icon and their username instead.
public struct PlaceholderView: View {
    // The name is passed in by the parent, so a plain constant is enough
    let name: String
    public var body: some View {
        VStack() {
            Image(systemName: "person.fill")
                .foregroundStyle(.white)
            Text(name)
                .foregroundStyle(.white)
        }
        .frame(maxHeight: .infinity)
    }
}
public struct VerticalLabelStyle: LabelStyle {
    public func makeBody(configuration: Configuration) -> some View {
        VStack {
            configuration.icon.font(.headline)
            configuration.title.font(.footnote)
        }
    }
}
Finally we will add the Views we created to the SessionView and add the .toolbar we mentioned earlier. We'll add a loading view for when the session is loading.
struct SessionView: View {
    @StateObject private var viewModel = ViewModel()
    @Environment(\.dismiss) var dismiss
    var body: some View {
        if viewModel.inSession {
            NavigationStack {
                ScrollView {
                    VStack() {
                        VStack() {
                            if viewModel.videoOn {
                                LocalVideoView(viewModel: viewModel)
                            } else {
                                PlaceholderView(name: viewModel.userName)
                            }
                        }
                        .aspectRatio(1, contentMode: .fill)
                        .frame(maxWidth: .infinity)
                        .padding()
                        ForEach(viewModel.remoteUsers.indices, id: \.self) { index in
                            VStack() {
                                if (viewModel.remoteUsers[index].getVideoCanvas()?.videoStatus()?.on ?? false) {
                                    RemoteVideoView(viewModel: viewModel, index: index)
                                } else {
                                    PlaceholderView(name: viewModel.remoteUsers[index].getName() ?? "")
                                }
                            }
                            .aspectRatio(1, contentMode: .fill)
                            .frame(maxWidth: .infinity)
                            .padding()
                        }
                    }
                }
            }
            .toolbar {
                if viewModel.inSession {
                    ToolbarItemGroup(placement: .bottomBar) {
                        Button(action: {
                            viewModel.toggleVideo()
                        }, label: {
                            Label {
                                Text(viewModel.videoOn ? "Stop Video" : "Start Video")
                            } icon: {
                                Image(systemName: viewModel.videoOn ? "video.slash" : "video")
                                    .frame(width: 24, height: 24)
                            }
                        })
                        .buttonStyle(.borderless)
                        Spacer()
                        Button(action: {
                            viewModel.toggleAudio()
                        }, label: {
                            Label {
                                Text(viewModel.audioOn ? "Mute" : "Sound On")
                            } icon: {
                                Image(systemName: viewModel.audioOn ? "mic.slash" : "mic")
                                    .frame(width: 24, height: 24)
                            }
                        })
                        .buttonStyle(.borderless)
                        Spacer()
                        Button(action: {
                            viewModel.leaveSession()
                            dismiss()
                        }, label: {
                            Label {
                                Text("End Session")
                            } icon: {
                                Image(systemName: "phone.down")
                                    .frame(width: 24, height: 24)
                            }
                        })
                        .buttonStyle(.borderless)
                    }
                }
            }
            .labelStyle(VerticalLabelStyle())
            .toolbarRole(.editor)
            .navigationBarBackButtonHidden(true)
        } else {
            NavigationStack {
                Text("Loading session...")
                    .font(.title)
                    .navigationBarBackButtonHidden(true)
                    .alert("JWT Token Required", isPresented: $viewModel.inJWTInput) {
                        TextField("Enter your JWT Token", text: $viewModel.userInputJWT)
                            .disableAutocorrection(true)
                        Button("Join") {
                            viewModel.shouldJoin = true
                        }
                        Button("Cancel", role: .cancel) {
                            dismiss()
                        }
                    } message: {
                        Text("You can choose to copy and paste your generated JWT Token here OR leave it as empty if you have added it in the SessionView+Extension jwtToken variable")
                    }
                    .task(id: viewModel.shouldJoin) {
                        guard viewModel.shouldJoin else { return }
                        await viewModel.joinSession()
                        viewModel.shouldJoin = false
                    }
                    .alert("Error", isPresented: $viewModel.joinSessionFailed, actions: {
                        Button(action: {
                            dismiss()
                        }) { Text("OK") }
                    }, message: {
                        Text("\(viewModel.errorMessage)")
                    })
            }
        }
    }
}
As soon as we have joined the session successfully, we set inSession to true, which automatically triggers a view update.
func onSessionJoin() {
    // Session joined successfully.
    print("Session joined")
    inSession = true
}
To start and stop displaying a user’s video, call the corresponding function with the video helper.
Note: Both return a ZoomVideoSDKError object. If the operation succeeds, a ZoomVideoSDKError with the value of Errors_Success is returned. It’s best to check the video canvas’ current status via videoStatus to determine which function to call. Starting and stopping video must be done on the main thread.
You can disregard the following Xcode warning. SDK actions must be called on the main thread, so this warning is expected for the Video SDK:
-[AVCaptureSession startRunning] should be called from background thread. Calling it on the main thread can lead to UI unresponsiveness
We define the toggleVideo function, called by the video button under the .toolbar, to turn the user's camera on and off.
// Local user - toggle video on/off
func toggleVideo() {
    if let usersVideoCanvas = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()?.getVideoCanvas(),
       // Get ZoomVideoSDKVideoHelper to control video
       let videoHelper = ZoomVideoSDK.shareInstance()?.getVideoHelper() {
        if let myVideoIsOn = usersVideoCanvas.videoStatus()?.on,
           myVideoIsOn == true {
            Task(priority: .background) {
                await MainActor.run {
                    let error = videoHelper.stopVideo()
                    print("Stop error: \(error.rawValue)")
                }
            }
        } else {
            Task(priority: .background) {
                await MainActor.run {
                    let error = videoHelper.startVideo()
                    print("Start error: \(error.rawValue)")
                }
            }
        }
    }
}
Audio
For the audio, we first check if the user is connected by fetching their ZoomVideoSDKAudioType. If they are not connected, then they must be connected before their microphone can be toggled. We define the toggleAudio function for turning the microphone on/off in the second button created under .toolbar.
// Local user - toggle microphone mute/unmute
func toggleAudio() {
    let myUser = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()
    // Get the user's audio status
    if let audioStatus = myUser?.audioStatus(),
       // Get ZoomVideoSDKAudioHelper to control audio
       let audioHelper = ZoomVideoSDK.shareInstance()?.getAudioHelper() {
        // Check if the user's audio type is none - Not connected yet
        if audioStatus.audioType == .none {
            Task(priority: .background) {
                await MainActor.run {
                    audioHelper.startAudio()
                    audioOn = true
                }
            }
        } else {
            // Audio is connected - Toggle audio based on mute status
            if audioStatus.isMuted {
                Task(priority: .background) {
                    await MainActor.run {
                        let error = audioHelper.unmuteAudio(myUser)
                        print("Unmute error: \(error.rawValue)")
                        audioOn = true
                    }
                }
            } else {
                Task(priority: .background) {
                    await MainActor.run {
                        let error = audioHelper.muteAudio(myUser)
                        print("Mute error: \(error.rawValue)")
                        audioOn = false
                    }
                }
            }
        }
    }
}
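The branching in toggleAudio boils down to a small decision table: connect first if audio is not connected, otherwise flip the mute state. The sketch below models only that decision with stand-in types so it can be reasoned about and tested without the SDK. AudioConnection, AudioAction, and nextAudioAction are hypothetical names, not SDK types.

```swift
/// Stand-in for whether the user's audio is connected (hypothetical, not an SDK type).
enum AudioConnection {
    case notConnected
    case connected
}

/// The action the toggle should take next (hypothetical, not an SDK type).
enum AudioAction {
    case startAudio
    case unmute
    case mute
}

/// Mirrors the branches of toggleAudio: connect first, otherwise flip the mute state.
func nextAudioAction(connection: AudioConnection, isMuted: Bool) -> AudioAction {
    switch (connection, isMuted) {
    case (.notConnected, _):
        return .startAudio
    case (.connected, true):
        return .unmute
    case (.connected, false):
        return .mute
    }
}
```

Keeping the decision separate from the SDK calls makes it easier to see that every combination of connection state and mute state maps to exactly one action.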
Responding to user events
We can use the callbacks to detect changes in user status and video status to create/update the necessary views.
First, we get a reference to each user object in the onUserJoin callback once they join a session. We can then add it to the remoteUsers array.
func onUserJoin(_ helper: ZoomVideoSDKUserHelper?, users: [ZoomVideoSDKUser]?) {
    // Get remote user
    if let userArray = users, let myself = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf() {
        for user in userArray {
            if (user.getID() != myself.getID()) {
                remoteUsers.append(user)
            }
        }
    }
}
The local/remote user can turn their video off/on. When the video has the off status, we update the UI accordingly to show the placeholder instead of a blank view. When the video toggles back to on status we can hide the placeholder. We do this in the callback that keeps track of when a user's video status has changed. For the local user it is done with the videoOn variable and for remote user it's based on their index in the remoteUsers array.
func onUserVideoStatusChanged(_ helper: ZoomVideoSDKVideoHelper?, user: [ZoomVideoSDKUser]?) {
    if let userArray = user, let myself = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf() {
        for user in userArray {
            // Get local user
            if (user.getID() == myself.getID()) {
                if let myUserVideoCanvas = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()?.getVideoCanvas(), let myVideoIsOn = myUserVideoCanvas.videoStatus()?.on {
                    if myVideoIsOn {
                        Task(priority: .background) {
                            await MainActor.run {
                                self.videoOn = true
                            }
                        }
                    } else {
                        Task(priority: .background) {
                            await MainActor.run {
                                videoOn = false
                            }
                        }
                    }
                }
            }
            // Get remote user
            if (user.getID() != myself.getID()), let remoteUserIndex = remoteUsers.firstIndex(where: { currentUser in
                currentUser.getID() == user.getID()
            }) {
                remoteUsers[remoteUserIndex] = user
            }
        }
    }
}
Finally, we need to clean up when a remote user leaves. In the onUserLeave callback, we remove the user from the remoteUsers array by matching their ID.
func onUserLeave(_ helper: ZoomVideoSDKUserHelper?, users: [ZoomVideoSDKUser]?) {
    // Get remote user
    if let userArray = users, let myself = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf() {
        for user in userArray {
            if (user.getID() != myself.getID()) {
                remoteUsers.removeAll { remoteUser in
                    remoteUser.getID() == user.getID()
                }
            }
        }
    }
}
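The three callbacks above all maintain the same list: add on join, replace on video status change, remove on leave, always skipping the local user. As a rough illustration, that bookkeeping can be modeled with plain value types and exercised without the SDK. Participant and Roster are hypothetical stand-ins for ZoomVideoSDKUser and the view model's remoteUsers array, and the duplicate check on join is an extra safeguard that the code above does not include.

```swift
/// Hypothetical stand-in for a session user; not an SDK type.
struct Participant: Equatable {
    let id: Int
    var name: String
    var videoOn: Bool
}

/// Hypothetical model of the remote user list kept by the view model.
struct Roster {
    let localID: Int
    private(set) var remoteUsers: [Participant] = []

    init(localID: Int) {
        self.localID = localID
    }

    /// Adds every joining user except the local one, skipping IDs already present.
    mutating func userJoined(_ users: [Participant]) {
        for user in users where user.id != localID {
            if !remoteUsers.contains(where: { $0.id == user.id }) {
                remoteUsers.append(user)
            }
        }
    }

    /// Replaces the stored entry for each remote user whose video status changed.
    mutating func videoStatusChanged(_ users: [Participant]) {
        for user in users where user.id != localID {
            if let index = remoteUsers.firstIndex(where: { $0.id == user.id }) {
                remoteUsers[index] = user
            }
        }
    }

    /// Removes every user whose ID appears in the leaving list.
    mutating func userLeft(_ users: [Participant]) {
        let leavingIDs = Set(users.map(\.id))
        remoteUsers.removeAll { leavingIDs.contains($0.id) }
    }
}
```

Matching on IDs rather than array positions keeps the list consistent even when users join and leave in quick succession.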
Leaving a session
To leave a session, call leaveSession, which is triggered by the third button created under the .toolbar. The Boolean argument indicates whether the host should end the session for everyone (true) or only leave it (false).
func leaveSession() {
    ZoomVideoSDK.shareInstance()?.leaveSession(true)
}
The callback onSessionLeave is triggered when the current user leaves the session.
func onSessionLeave() {
    leftSession = true
}
That’s how to make your first Video SDK app with SwiftUI! Thank you for following along and using the Zoom Video SDK for iOS. You can build other features like screen sharing, chat, cloud recording, and more. You can find more information under the Add Features section in our Video SDK docs.