Build a SwiftUI video chat app using the Zoom Video SDK on iOS

  • Update: Feb 26, 2026
    This post has been updated with the latest SPM integration details and a newly added script that helps generate the JWT.

Introduction

At Zoom, we strive to provide the best video conferencing experience possible. Our SDK libraries let developers create third-party apps powered by our world-class video technology platform. With the Zoom Video SDK, developers can build fully customizable, self-branded apps with nearly all of the features of the Zoom desktop client, from video calling to screen sharing.

In this guide we will build an iOS SwiftUI app with the Video SDK.

The app lets two or more users talk with each other. Both video and audio-only communication are supported. For this project, we'll use Swift and SwiftUI. If you'd prefer UIKit, you can read our UIKit blog.

Prerequisites

To build this app, you should have:

Getting the SDK and its contents

Adding the SDK to your iOS app takes only a single step with Swift Package Manager. In Xcode, select File > Add Package Dependencies.... In the Search or Enter Package URL bar on the top right, enter https://github.com/zoom/zoom-video-sdk-iOS/, then enter swift-package-manager in the Branch field.

SPM add

Tap the Add Package button, confirm the app target you are adding it to, and the Video SDK should be added to Package Dependencies accordingly. The Video SDK for iOS package includes the following XCFramework bundles under /Sample-Libs/lib that can be added to your project as needed:

  • ZoomVideoSDK.xcframework and ZoomTask.xcframework: Interfaces to support all services related to Zoom sessions, such as initializing the SDK, creating and joining sessions, in-session services, and more.

For this tutorial, we do not need these xcframeworks:

  • CptShare.xcframework: Interfaces to support screen sharing a single UIView. Required to receive annotation by others when sharing a single UIView, as opposed to full broadcasting.

  • zm_annoter_dynamic.xcframework: Interfaces to support the annotation service when sharing.

  • ZoomVideoSDKScreenShare.xcframework: Interfaces to support the full screen share service, for broadcasting a device screen.

  • zoomcml.xcframework: Interfaces to support virtual background filter and 3D avatar.

  • Whiteboard.xcframework: Interfaces to support whiteboard.

    Choose package product

To add framework files manually via the Zoom Marketplace, see the documentation.

Quickstart app contents

MySwiftUIVideoSDK is a simple two-view navigation app.

StartView is the entry point for the app where the Video SDK is initialized.

![Start View](/img/blog/boonjuntan/uikit-quick-start-swiftui-joinsession.png)
![Start View](/img/blog/boonjuntan/uikit-quick-start-swiftui-jwttokenprompt.png)
![Loading View](/img/blog/boonjuntan/uikit-quick-start-swiftui-sessionloading.jpg)

SessionView contains a .toolbar which holds the controls for toggling the user’s video, toggling audio and ending the Zoom session. This view also contains a ScrollView that contains all of the participants in the session.

![Default session view](/img/blog/boonjuntan/uikit-quick-start-swiftui-offvideo.jpg)

Default session view

The three controls are implemented as three Button views inside the .toolbar's ToolbarItemGroup. Each button calls the corresponding action on the viewModel attached to the SessionView.

// An example of video button - toggle on and off
Button(action: {
    viewModel.toggleVideo()
}, label: {
    Label {
        Text(viewModel.videoOn ? "Stop Video" : "Start Video")
    } icon: {
        Image(systemName: viewModel.videoOn ? "video.slash" : "video")
            .frame(width: 24, height: 24)
    }
})

JWT authentication

JSON Web Tokens (JWTs) are used to authorize Zoom Video SDK apps. They are always required for starting and joining sessions. Ideally, you should generate the JWT on the server side to keep your Video SDK credentials safe. However, for reference, we have added a Swift script in the /Scripts folder that generates the JWT needed for this demo.

Follow the README in the /Scripts folder to understand how to use the script.

// MARK: Session Information
/*
 TODO: Enter the following variables needed to initialize the VSDK and to start/join a session
 You should sign your JWT with a backend service in a production use case. For faster JWT generation, check out JWTGenerator.swift in the Scripts folder and its README for details on how to use it.
 Once you have the token, you can simply copy and paste it below.
 Ensure that the sessionName matches the session name used to generate the JWT Token.
 */
let jwtToken = "" // Leave this empty if you choose to paste your generated JWT directly into the sample app's alert box after tapping "Join Session"
let sessionName = "" // Also known as tpc in JWT
let userName = "" // Display name
let sessionPassword: String = "" // If needed
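Because the session name has to match the tpc claim in the token, a quick local check can catch a mismatch before joining. The following is a minimal sketch, not part of the sample app: the function names sessionTopic(inJWT:) and tokenMatchesSession(_:sessionName:) are hypothetical, and the code only decodes the payload segment. It does not verify the signature.

```swift
import Foundation

/// Decodes the payload segment of a JWT and returns its `tpc` claim, if present.
/// Hypothetical helper: it does NOT verify the signature, it only reads the claim.
func sessionTopic(inJWT token: String) -> String? {
    let segments = token.split(separator: ".", omittingEmptySubsequences: false)
    guard segments.count == 3 else { return nil }

    // JWT segments are base64url without padding; convert to standard base64.
    var base64 = segments[1]
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    while base64.count % 4 != 0 { base64.append("=") }

    guard let data = Data(base64Encoded: base64),
          let object = try? JSONSerialization.jsonObject(with: data),
          let claims = object as? [String: Any] else { return nil }
    return claims["tpc"] as? String
}

/// Returns true when the token's `tpc` claim equals the given session name.
func tokenMatchesSession(_ token: String, sessionName: String) -> Bool {
    sessionTopic(inJWT: token) == sessionName
}
```

Calling tokenMatchesSession(jwtToken, sessionName: sessionName) before joining would let the app show a clearer error than a generic join failure.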

Integrating the SDK

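Ensure your app's General > Minimum Deployments target is set to at least iOS 13.0. Note that the SwiftUI code in this guide uses NavigationStack and .toolbarRole, which require iOS 16.0 or later, so raise the target accordingly if you follow the code as written.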

![Minimum Deployment Version](/img/blog/richardyeh/minversion.png)

If you used Swift Package Manager to add the Zoom Video SDK, your Xcode project's Package Dependencies should look like this:

Package Dependencies

The General > Frameworks, Libraries, and Embedded Content settings should look like this:

Frameworks, Libraries, and Embedded Content with SPM install

If you added the Zoom Video SDK manually, do the following:

In the Video SDK package that was downloaded from the Zoom Marketplace, navigate to /Sample-Libs/lib.

![SDK framework libraries](/img/blog/boonjuntan/uikit-quick-start-swiftui-libs.png)

The Video SDK is a dynamic library, so it must be included in the project as an embedded binary. In your Xcode project, navigate to your app's target and then General > Frameworks, Libraries, and Embedded Content. Add ZoomVideoSDK.xcframework for the main SDK interfaces and set it to Embed & Sign.

![Embed frameworks](/img/blog/boonjuntan/uikit-quick-start-embed.png)

Finally, from the same target page, navigate to Info and add the required project permissions ("Privacy - * Usage Description") for Camera, Microphone, Bluetooth, and optionally Photo Library. The user will then explicitly grant these permissions to the app at runtime. For more information on media permissions, see here.

![Required project permissions](/img/blog/boonjuntan/uikit-quick-start-swiftui-permissions.png)
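A missing usage description typically surfaces as a crash the first time the camera or microphone is accessed, so it can help to check the keys at launch in debug builds. Below is a small sketch under stated assumptions: the helper missingUsageDescriptions(in:required:) is hypothetical, and the Bluetooth entry assumes the "Privacy - Bluetooth Always Usage Description" key (NSBluetoothAlwaysUsageDescription).

```swift
import Foundation

/// Raw Info.plist keys behind the "Privacy - * Usage Description" entries.
/// Assumption: the Bluetooth entry is the "Bluetooth Always" variant.
let requiredUsageKeys = [
    "NSCameraUsageDescription",
    "NSMicrophoneUsageDescription",
    "NSBluetoothAlwaysUsageDescription",
]

/// Returns the required keys that are absent or blank in the given dictionary.
func missingUsageDescriptions(in info: [String: Any],
                              required: [String] = requiredUsageKeys) -> [String] {
    required.filter { key in
        guard let text = info[key] as? String else { return true }
        return text.trimmingCharacters(in: .whitespaces).isEmpty
    }
}
```

In the app, this could be called as missingUsageDescriptions(in: Bundle.main.infoDictionary ?? [:]) and the result logged or asserted to be empty.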

Initializing the SDK

Let’s get started by initializing the SDK so we have access to its functionality. For now, we’ll work in the StartView, where we first import the ZoomVideoSDK module.

In the setupSDK function, we’ll create an instance of the ZoomVideoSDKInitParams object and set the domain of the context to zoom.us. Then call the initialize function on the Video SDK from the main thread and verify it was correctly initialized.

let initParams = ZoomVideoSDKInitParams()
initParams.domain = "zoom.us"
let sdkInitReturnStatus = ZoomVideoSDK.shareInstance()?.initialize(initParams)
switch sdkInitReturnStatus {
case .Errors_Success:
    print("SDK initialization succeeded")
default:
    if let error = sdkInitReturnStatus {
        print("SDK initialization failed: \(error)")
        return
    }
}

Joining a session

For the remainder of the app, we'll use SessionView.swift for view-related code and SessionView+Extension.swift for view model code. Import the Zoom Video SDK here as well.

To create or join a session, instantiate a ZoomVideoSDKSessionContext object and provide the following required properties:

  • token: JSON Web Token (JWT) created from Video SDK credentials during Authentication.
  • sessionName: The session’s unique identifier, which must match the tpc field in the JWT. If the name is for a currently active session, then the SDK will join the session if all required parameters have been provided. If no active session exists with the name, then the SDK will create a new session for you.
  • userName: Display name of the user shown in the session. Default value is "null".

Optional additional properties:

  • sessionPassword: You may optionally specify a password for the session that attendees must enter.
  • audioOption: Audio settings configurable in ZoomVideoSDKAudioOptions.
  • videoOption: Video settings configurable in ZoomVideoSDKVideoOptions.

We will create the session context in a new method, joinSession, using the data that you previously entered in SessionView+Extension.swift. Once again, in a production app you should not hardcode the JWT or other credentials. They should be retrieved from a backend server.


// MARK: Session Information
/*
 TODO: Enter the following variables needed to initialize the VSDK and to start/join a session
 You should sign your JWT with a backend service in a production use case. For faster JWT generation, check out JWTGenerator.swift in the Scripts folder and its README for details on how to use it.
 Once you have the token, you can simply copy and paste it below.
 Ensure that the sessionName matches the session name used to generate the JWT Token.
 */
let jwtToken = "" // Leave this empty if you choose to paste your generated JWT directly into the sample app's alert box after tapping "Join Session"
let sessionName = "" // Also known as tpc in JWT
let userName = "" // Display name
let sessionPassword: String = "" // If needed
func joinSession() async {
    ZoomVideoSDK.shareInstance()?.delegate = self
    let sessionContext = ZoomVideoSDKSessionContext()
    sessionContext.token = jwtToken.isEmpty ? userInputJWT : jwtToken
    sessionContext.sessionName = sessionName
    sessionContext.userName = userName
    let videoOption = ZoomVideoSDKVideoOptions()
    videoOption.localVideoOn = true
    sessionContext.videoOption = videoOption
    let audioOption = ZoomVideoSDKAudioOptions()
    audioOption.mute = true
    sessionContext.audioOption = audioOption
    if !sessionPassword.isEmpty {
        sessionContext.sessionPassword = sessionPassword
    }
    // Join Session
    if let session = ZoomVideoSDK.shareInstance()?.joinSession(sessionContext) {
        print("Session object: \(session)")
    } else {
        print("Join session failed")
        DispatchQueue.main.async {
            self.joinSessionFailed = true
        }
    }
}
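For the production case, where the JWT comes from a backend instead of a hardcoded constant, the retrieval could look roughly like the sketch below. Everything here is an assumption for illustration: the response shape {"token": "..."}, the TokenResponse and TokenError types, and the fetchToken(from:using:) helper are hypothetical and not part of the sample app or the SDK. The network call is injected as a closure so the parsing logic stays testable.

```swift
import Foundation

/// Hypothetical JSON body returned by your own token endpoint: {"token": "<jwt>"}
struct TokenResponse: Decodable {
    let token: String
}

enum TokenError: Error {
    case emptyToken
}

/// Parses the endpoint's JSON body and rejects an empty token.
func parseToken(from data: Data) throws -> String {
    let response = try JSONDecoder().decode(TokenResponse.self, from: data)
    guard !response.token.isEmpty else { throw TokenError.emptyToken }
    return response.token
}

/// Loads the token through an injected loader so the networking layer is swappable.
func fetchToken(from url: URL,
                using load: (URL) async throws -> Data) async throws -> String {
    let data = try await load(url)
    return try parseToken(from: data)
}
```

On iOS 15 or later, the loader could be a closure such as { url in try await URLSession.shared.data(from: url).0 }, and the returned string would then be assigned to sessionContext.token.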

In SessionView.swift, the view logic displays one of two views depending on whether the user is in a session.

struct SessionView: View {
    @StateObject private var viewModel = ViewModel()
    @Environment(\.dismiss) var dismiss
    var body: some View {
        if viewModel.inSession {
            // Display participants UI and toolbar
        } else {
            // Display loading session - This is when the viewModel.joinSession() get called.
        }
    }
}

Set up delegate callbacks

The Video SDK uses delegate callbacks to share events and updates such as operation results or failures. To receive all available session callbacks, conform to ZoomVideoSDKDelegate. We do so by making the ViewModel class conform to ZoomVideoSDKDelegate and assigning it as the delegate in joinSession().

extension SessionView {
    @MainActor
    class ViewModel: NSObject, ObservableObject, @preconcurrency ZoomVideoSDKDelegate {
        // ...
        func joinSession() async {
            // Register the view model to receive session callbacks
            ZoomVideoSDK.shareInstance()?.delegate = self
            // ...
        }
    }
}

Video

The app includes controls to toggle the camera and microphone, and a button to leave the session. Let's go over each feature individually.

To display a user's video stream, we first need a UIViewRepresentable, which acts as a bridge between UIKit and SwiftUI. The bridge is required because the video stream subscribes to and unsubscribes from a UIKit UIView. We'll create two UIViewRepresentable types in SessionView: one for the local user (LocalVideoView) and one for remote users (RemoteVideoView).

// Create the 2 Views under SessionView.swift
public struct LocalVideoView: UIViewRepresentable {
    @State var viewModel: SessionView.ViewModel
    public func makeUIView(context: Context) -> UIView {
        let videoView = UIView()
        viewModel.attachLocalVideo(to: videoView)
        return videoView
    }
    public func updateUIView(_ uiView: UIView, context: Context) {
        viewModel.updateLocalVideo(to: uiView)
    }
}
public struct RemoteVideoView: UIViewRepresentable {
    @State var viewModel: SessionView.ViewModel
    @State var index: Int // To keep track of the remote users.
    public func makeUIView(context: Context) -> UIView {
        let videoView = UIView()
        return videoView
    }
    public func updateUIView(_ uiView: UIView, context: Context) {
        viewModel.updateRemoteVideo(to: uiView, index: index)
    }
}

We'll also create the view model methods that makeUIView and updateUIView call. The difference between LocalVideoView and RemoteVideoView is that a session can have multiple remote users, so we track them by index.

// Create the 4 methods in the ViewModel at SessionView+Extension.swift
extension SessionView {
    @MainActor
    class ViewModel: NSObject, ObservableObject, @preconcurrency ZoomVideoSDKDelegate {
        // Error popup
        @Published var errorTitle: String = "Error"
        var errorMessage: String = "Message"
        // Local user
        @Published var userInputJWT = ""
        @Published var shouldJoin = false
        @MainActor weak var localView: UIView?
        @Published var joinSessionFailed: Bool = false
        @Published var inJWTInput: Bool = true
        @Published var inSession: Bool = false
        @Published var leftSession: Bool = false
        @Published var videoOn: Bool = false
        @Published var audioOn: Bool = false
        // Remote users
        @Published var remoteUsers: [ZoomVideoSDKUser] = []
        // ...
        // Attaching the local user's video view
        @MainActor func attachLocalVideo(to view: UIView) {
            self.localView = view
        }
        // Updating the local user's video view
        @MainActor func updateLocalVideo(to view: UIView) {
            guard let myUserVideoCanvas = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()?.getVideoCanvas(), let myVideoIsOn = myUserVideoCanvas.videoStatus()?.on else { return }
            if myVideoIsOn {
                myUserVideoCanvas.subscribe(with: view, aspectMode: .panAndScan, andResolution: ._Auto)
            } else {
                myUserVideoCanvas.unSubscribe(with: view)
            }
        }
        // Attaching the remote user's video view based on index
        @MainActor func attachRemoteUserVideo(index: Int, to view: UIView) {
            guard remoteUsers.indices.contains(index) else { return }
            if let currentUserVideoCanvas = self.remoteUsers[index].getVideoCanvas(), let videoStatus = currentUserVideoCanvas.videoStatus() {
                if videoStatus.on {
                    currentUserVideoCanvas.subscribe(with: view, aspectMode: .panAndScan, andResolution: ._Auto)
                } else {
                    currentUserVideoCanvas.unSubscribe(with: view)
                }
            }
        }
        // Updating the remote user's video view based on index
        @MainActor func updateRemoteVideo(to view: UIView, index: Int) {
            guard remoteUsers.indices.contains(index) else { return }
            if let currentUserVideoCanvas = self.remoteUsers[index].getVideoCanvas(), let videoStatus = currentUserVideoCanvas.videoStatus() {
                if videoStatus.on {
                    currentUserVideoCanvas.subscribe(with: view, aspectMode: .panAndScan, andResolution: ._Auto)
                } else {
                    currentUserVideoCanvas.unSubscribe(with: view)
                }
            }
        }
    }
}

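We'll also create a PlaceholderView in SessionView.swift for when a user does not have their video turned on. It displays a person icon and the user's name instead. The VerticalLabelStyle defined alongside it stacks a label's icon above its title, and is applied to the toolbar buttons.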

public struct PlaceholderView: View {
    let name: String // Plain property so the label updates when the parent passes a new name
    public var body: some View {
        VStack() {
            Image(systemName: "person.fill")
                .foregroundStyle(.white)
            Text(name)
                .foregroundStyle(.white)
        }
        .frame(maxHeight: .infinity)
    }
}
public struct VerticalLabelStyle: LabelStyle {
    public func makeBody(configuration: Configuration) -> some View {
        VStack {
            configuration.icon.font(.headline)
            configuration.title.font(.footnote)
        }
    }
}

Finally we will add the Views we created to the SessionView and add the .toolbar we mentioned earlier. We'll add a loading view for when the session is loading.

struct SessionView: View {
    @StateObject private var viewModel = ViewModel()
    @Environment(\.dismiss) var dismiss
    var body: some View {
        if viewModel.inSession {
            NavigationStack {
                ScrollView {
                    VStack() {
                        VStack() {
                            if viewModel.videoOn {
                                LocalVideoView(viewModel: viewModel)
                            } else {
                                PlaceholderView(name: viewModel.userName)
                            }
                        }
                        .aspectRatio(1, contentMode: .fill)
                        .frame(maxWidth: .infinity)
                        .padding()
                        ForEach(viewModel.remoteUsers.indices, id: \.self) { index in
                            VStack() {
                                if (viewModel.remoteUsers[index].getVideoCanvas()?.videoStatus()?.on ?? false) {
                                    RemoteVideoView(viewModel: viewModel, index: index)
                                } else {
                                    PlaceholderView(name: viewModel.remoteUsers[index].getName() ?? "")
                                }
                            }
                            .aspectRatio(1, contentMode: .fill)
                            .frame(maxWidth: .infinity)
                            .padding()
                        }
                    }
                }
            }
            .toolbar {
                if viewModel.inSession {
                    ToolbarItemGroup(placement: .bottomBar) {
                        Button(action: {
                            viewModel.toggleVideo()
                        }, label: {
                            Label {
                                Text(viewModel.videoOn ? "Stop Video" : "Start Video")
                            } icon: {
                                Image(systemName: viewModel.videoOn ? "video.slash" : "video")
                                    .frame(width: 24, height: 24)
                            }
                        })
                        .buttonStyle(.borderless)
                        Spacer()
                        Button(action: {
                            viewModel.toggleAudio()
                        }, label: {
                            Label {
                                Text(viewModel.audioOn ? "Mute" : "Sound On")
                            } icon: {
                                Image(systemName: viewModel.audioOn ? "mic.slash" : "mic")
                                    .frame(width: 24, height: 24)
                            }
                        })
                        .buttonStyle(.borderless)
                        Spacer()
                        Button(action: {
                            viewModel.leaveSession()
                            dismiss()
                        }, label: {
                            Label {
                                Text("End Session")
                            } icon: {
                                Image(systemName: "phone.down")
                                    .frame(width: 24, height: 24)
                            }
                        })
                        .buttonStyle(.borderless)
                    }
                }
            }
            .labelStyle(VerticalLabelStyle())
            .toolbarRole(.editor)
            .navigationBarBackButtonHidden(true)
        } else {
            NavigationStack {
                Text("Loading session...")
                    .font(.title)
                    .navigationBarBackButtonHidden(true)
                    .alert("JWT Token Required", isPresented: $viewModel.inJWTInput) {
                        TextField("Enter your JWT Token", text: $viewModel.userInputJWT)
                            .disableAutocorrection(true)
                        Button("Join") {
                            viewModel.shouldJoin = true
                        }
                        Button("Cancel", role: .cancel) {
                            dismiss()
                        }
                    } message: {
                        Text("You can choose to copy and paste your generated JWT Token here OR leave it as empty if you have added it in the SessionView+Extension jwtToken variable")
                    }
                    .task(id: viewModel.shouldJoin) {
                        guard viewModel.shouldJoin else { return }
                        await viewModel.joinSession()
                        viewModel.shouldJoin = false
                    }.alert("Error", isPresented: $viewModel.joinSessionFailed, actions: {
                        Button(action: {
                            dismiss()
                        }) { Text("OK") }
                    }, message: {
                        Text("\(viewModel.errorMessage)")
                    })
            }
        }
    }
}

As soon as we join the session successfully, the onSessionJoin callback sets inSession to true, which automatically triggers an update of the view.

func onSessionJoin() {
    // Session joined successfully.
    print("Session joined")
    inSession = true
}

To start and stop displaying a user’s video, call startVideo or stopVideo on the video helper (ZoomVideoSDKVideoHelper).

Note: Both return a ZoomVideoSDKError value. If the operation succeeds, the value is Errors_Success. It’s best to check the video canvas’ current status via videoStatus to determine which function to call. Starting and stopping video must be done on the main thread.

You can disregard the following Xcode warning. SDK actions must be called on the main thread, so this warning is expected with the Video SDK:

-[AVCaptureSession startRunning] should be called from background thread. Calling it on the main thread can lead to UI unresponsiveness

We define the toggleVideo function, which turns the user's camera on or off and is called by the first button in the .toolbar.

// Local user - toggle video on/off
func toggleVideo() {
    if let usersVideoCanvas = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()?.getVideoCanvas(),
        // Get ZoomVideoSDKVideoHelper to control video
        let videoHelper = ZoomVideoSDK.shareInstance()?.getVideoHelper() {
        if let myVideoIsOn = usersVideoCanvas.videoStatus()?.on,
            myVideoIsOn == true {
            Task(priority: .background) {
                await MainActor.run {
                    let error = videoHelper.stopVideo()
                    print("Stop error: \(error.rawValue)")
                }
            }
        } else {
            Task(priority: .background) {
                await MainActor.run {
                    let error = videoHelper.startVideo()
                    print("Start error: \(error.rawValue)")
                }
            }
        }
    }
}

Audio

For audio, we first check whether the user is connected to audio by reading the audioType (a ZoomVideoSDKAudioType) from their audio status. If the type is .none, the user is not connected yet, so we call startAudio before the microphone can be toggled. Otherwise we mute or unmute based on isMuted. The toggleAudio function below backs the second button in the .toolbar.

// Local user - toggle audio mic unmute/mute
func toggleAudio() {
    let myUser = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()
    // Get the user's audio status
    if let audioStatus = myUser?.audioStatus(),
        // Get ZoomVideoSDKAudioHelper to control audio
        let audioHelper = ZoomVideoSDK.shareInstance()?.getAudioHelper() {
        // Check if the user's audio type is none - Not connected yet
        if audioStatus.audioType == .none {
            Task(priority: .background) {
                await MainActor.run {
                    audioHelper.startAudio()
                    audioOn = true
                }
            }
        } else {
            // Audio is connected - Toggle audio based on mute status
            if audioStatus.isMuted {
                Task(priority: .background) {
                    await MainActor.run {
                        let error = audioHelper.unmuteAudio(myUser)
                        print("Unmute error: \(error.rawValue)")
                        audioOn = true
                    }
                }
            } else {
                Task(priority: .background) {
                    await MainActor.run {
                        let error = audioHelper.muteAudio(myUser)
                        print("Mute error: \(error.rawValue)")
                        audioOn = false
                    }
                }
            }
        }
    }
}
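The branching in toggleAudio boils down to a three-way decision, which can be easier to reason about (and unit test) when separated from the SDK calls. The sketch below is an illustration only: AudioState, AudioAction, and nextAudioAction(for:) are hypothetical stand-ins and not SDK types.

```swift
/// Stand-in for the two facts toggleAudio reads from the user's audio status.
struct AudioState {
    var isConnected: Bool   // false corresponds to an audio type of .none
    var isMuted: Bool
}

/// The three things the audio button can do.
enum AudioAction: Equatable {
    case connect, unmute, mute
}

/// Connect first if needed; otherwise flip the mute state.
func nextAudioAction(for state: AudioState) -> AudioAction {
    guard state.isConnected else { return .connect }
    return state.isMuted ? .unmute : .mute
}
```

In the view model, each case would map to startAudio, unmuteAudio, or muteAudio on the audio helper respectively.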

Responding to user events

We can use the callbacks to detect changes in user status and video status to create/update the necessary views.

First, we get a reference to each remote user's object in the onUserJoin callback when they join the session. We then add it to the remoteUsers array.

func onUserJoin(_ helper: ZoomVideoSDKUserHelper?, users: [ZoomVideoSDKUser]?) {
    // Get remote user
    if let userArray = users, let myself = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf() {
        for user in userArray {
            if (user.getID() != myself.getID()) {
                remoteUsers.append(user)
            }
        }
    }
}

The local user or a remote user can turn their video on or off. When a user's video is off, we show the placeholder instead of a blank view, and when it turns back on we hide the placeholder. We do this in the callback that fires when a user's video status changes. For the local user this is driven by the videoOn variable, and for a remote user it is based on their index in the remoteUsers array.

func onUserVideoStatusChanged(_ helper: ZoomVideoSDKVideoHelper?, user: [ZoomVideoSDKUser]?) {
    if let userArray = user, let myself = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf() {
        for user in userArray {
            // Get local user
            if (user.getID() == myself.getID()) {
                if let myUserVideoCanvas = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf()?.getVideoCanvas(), let myVideoIsOn = myUserVideoCanvas.videoStatus()?.on {
                    if myVideoIsOn {
                        Task(priority: .background) {
                            await MainActor.run {
                                self.videoOn = true
                            }
                        }
                    } else {
                        Task(priority: .background) {
                            await MainActor.run {
                                videoOn = false
                            }
                        }
                    }
                }
            }
            // Get remote user
            if (user.getID() != myself.getID()), let remoteUserIndex = remoteUsers.firstIndex(where: { currentUser in
                currentUser.getID() == user.getID()
            }) {
                remoteUsers[remoteUserIndex] = user
            }
        }
    }
}

Finally, we need to clean up when a remote user leaves. In the onUserLeave callback, we remove each departing user from the remoteUsers array by matching their ID.

func onUserLeave(_ helper: ZoomVideoSDKUserHelper?, users: [ZoomVideoSDKUser]?) {
    // Get remote user
    if let userArray = users, let myself = ZoomVideoSDK.shareInstance()?.getSession()?.getMySelf() {
        for user in userArray {
            if (user.getID() != myself.getID()) {
                remoteUsers.removeAll { remoteUser in
                    remoteUser.getID() == user.getID()
                }
            }
        }
    }
}
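The join and leave bookkeeping above can be modeled without the SDK, which makes the ID-matching rules easy to verify. The following is a sketch with hypothetical stand-in types: Participant replaces ZoomVideoSDKUser and Roster replaces the remoteUsers array. It also skips duplicate IDs on join, which is an addition of this sketch and not something the callbacks above do.

```swift
/// Stand-in for ZoomVideoSDKUser, reduced to the fields used for matching.
struct Participant: Equatable {
    let id: Int
    let name: String
}

/// Stand-in for the view model's remoteUsers bookkeeping.
struct Roster {
    let localID: Int
    private(set) var remote: [Participant] = []

    init(localID: Int) { self.localID = localID }

    /// Mirrors onUserJoin: add everyone except the local user.
    /// Skipping already-known IDs is an extra guard added in this sketch.
    mutating func join(_ users: [Participant]) {
        for user in users where user.id != localID {
            if !remote.contains(where: { $0.id == user.id }) {
                remote.append(user)
            }
        }
    }

    /// Mirrors onUserLeave: remove departing users by ID.
    mutating func leave(_ users: [Participant]) {
        let leaving = Set(users.map(\.id))
        remote.removeAll { leaving.contains($0.id) }
    }
}
```

Matching by ID rather than by array position matters because indices shift whenever an earlier user leaves.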

Leaving a session

To leave a session, call leaveSession, which is triggered by the third button in the .toolbar. The session itself ends once the last participant has left.

func leaveSession() {
    ZoomVideoSDK.shareInstance()?.leaveSession(true)
}

The callback onSessionLeave is triggered when the current user leaves the session.

func onSessionLeave() {
    leftSession = true
}

That’s how to make your first Video SDK app with SwiftUI! Thank you for following along and using Zoom Video SDK for iOS. You can build other features like screen sharing, chat, cloud recording, and more. You can find more information under the Add Features section in our Video SDK docs.