In a matter of months, COVID-19 took over the world, and the fitness industry was among the hardest hit. As a result, video conferencing became a must-have, and even a core feature, for most fitness apps.
This also meant that we had to figure out quickly which video conferencing platform worked best for our clients, who wanted their users (fitness professionals and coaches) to be able to conduct their classes virtually.
For that purpose, we analyzed 5 video conferencing platforms that can be easily integrated into your application (web, mobile, backend). The starting points for the analysis are the following requirements and assumptions:
When you get requirements for video conferencing, the first thing that comes to mind is WebRTC. But what does WebRTC actually do?
WebRTC is a technology supported in most browsers and, through libraries and SDKs, on mobile platforms. As a framework it covers session negotiation, p2p communication and video streaming, while the signaling channel itself is left to the application. On its own, this framework is not sufficient for a final solution: you’ll usually need more than the core features, and most probably a framework or platform that embeds WebRTC and wraps it with some more “final” touches.
Each service or solution uses different naming, so throughout this article we will use the following terms:
A video conferencing server is a server application that implements (wraps) the WebRTC framework, creates conference rooms, and sends connection details to participants, along with all other supporting services and tools. This server can be self-hosted (Jitsi) or used as a service (Amazon Chime, Twilio Video, Zoom, and so on).
An application server communicates between the client applications and the video conferencing server. The application server is responsible for creating a room, as well as creating access tokens and delivering them securely to the client apps. It can also listen for events from the video conferencing server/service and act upon them.
A client application is a mobile or web application that can connect to a video conferencing room. Client applications should integrate the appropriate SDK, which communicates with and connects to the video conferencing server. The integrated SDK (library) implements the communication and provides WebRTC support. It can also provide a basic or complete UI implementation for video conferencing features such as join, leave, mute, audio change, camera change, screen share, etc.
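The division of responsibilities described above can be sketched in a few lines. This is a minimal illustration, not any vendor's actual API: the function names, the token layout, and the shared `API_SECRET` are all assumptions made for the example, and real services define their own token formats and issue their own credentials.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical secret shared by the application server and the
# video conferencing server; real services issue their own keys.
API_SECRET = b"demo-secret"


def _sign(payload: bytes) -> str:
    """Return the HMAC-SHA256 signature of payload, URL-safe base64 encoded."""
    digest = hmac.new(API_SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def create_access_token(room: str, user: str, ttl_seconds: int = 3600,
                        now: Optional[float] = None) -> str:
    """Application server side: issue a signed, expiring token for one user and room."""
    issued_at = time.time() if now is None else now
    claims = {"room": room, "user": user, "exp": issued_at + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    return payload.decode() + "." + _sign(payload)


def verify_access_token(token: str, room: str,
                        now: Optional[float] = None) -> bool:
    """Video conferencing server side: admit a client only with a valid token."""
    current = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
        expected = _sign(payload.encode())
        if not hmac.compare_digest(signature.encode(), expected.encode()):
            return False
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        # Malformed token: wrong number of parts, bad base64, or bad JSON.
        return False
    return claims.get("room") == room and claims.get("exp", 0) > current
```

The client application never sees the secret. It only receives the token from the application server and presents it when joining the room, which is why the token has to be delivered securely and should expire.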
In the image below, you can see the required steps to establish a video conferencing call:
“Jitsi is a collection of open-source projects which provide state of the art video conferencing capabilities that are secure, easy to use and easy to self-host.” – Jitsi documentation.
As mentioned above, the solution is open-source and can be hosted on our own infrastructure. This makes it highly customizable, but it also means we should expect a significant implementation and maintenance effort on our side. The project is still under development and has a moderately sized community.
Our client is already using it in their product for secure video conferencing and runs a self-hosted solution, but they also have an entire team working on it. This led us to the conclusion that for this type of work we would need to bring in a dedicated backend developer with WebRTC experience.
Pros: It is open source and customizable, has an existing community, doesn’t rely on third-party services and their limitations
Cons: Increased effort and cost for hosting and ongoing maintenance; scaling has to be handled by us if the number of users increases
Zoom allows its platform and functionality to be integrated into other apps through the Zoom SDK and the Zoom API. The SDK should support all features that the Zoom application supports by default.
The Zoom SDK lets unauthenticated users join a meeting (room), but creating and hosting a meeting requires a user ID. The back-end application still needs to create the meetings and generate tokens for participants.
Pros: proven and well-known Zoom infrastructure and features, familiar Zoom UI can be reused
Cons: SDK size (~90 MB), poor documentation, SDK not up to date, unclear pricing, sample code not easy to integrate
*Update: In the meantime, Zoom released a new Video SDK that looks more promising. The main differences are more lightweight client SDKs, an easy-to-use and highly customizable UI, and access to raw audio and video streams. You can check the comparison here.
Twilio Programmable Video is a cloud platform that allows developers to add video and audio chat to web, Android and iOS applications. The platform provides REST APIs, SDKs, and helper tools that make it simple to capture, distribute, record, and render high-quality audio, video, and screen shares. Twilio Programmable Video has been built on WebRTC.
According to the documentation, the service scales reliably, and well-tested SDKs are available for all major platforms (web and mobile). Quickstarts and code samples make it fast to bootstrap and get the service running. Twilio supports E2E encrypted video for 2 participants, which is important for some solutions such as health tech, as well as small rooms (up to 4 participants) and group rooms (up to 50 participants). Since then, we have started working on a new project that successfully integrated the Twilio Video SDK into its application.
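The participant limits quoted above translate into a simple decision the application server has to make when it creates a room. The sketch below only encodes the limits as stated in this article; the function name and the returned labels are hypothetical and are not Twilio API values.

```python
def pick_room_kind(participants: int) -> str:
    """Map an expected participant count to the room kinds described above.

    The labels are illustrative only; check the provider's documentation
    for the actual room type identifiers and current limits.
    """
    if participants < 1:
        raise ValueError("a room needs at least one participant")
    if participants <= 2:
        return "e2e-encrypted"   # E2E encrypted video, 2 participants
    if participants <= 4:
        return "small"           # small rooms, up to 4 participants
    if participants <= 50:
        return "group"           # group rooms, up to 50 participants
    raise ValueError("more than 50 participants exceeds the limits quoted above")
```

For example, a one-on-one coaching session would map to the encrypted kind, while a class of 20 would need a group room.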
Pros: Good and detailed documentation, the architecture is described in detail, easy integration, lightweight SDK, complete apps as integration samples
Cons: Price, flexibility, reliability
As a company, Intertec relies on AWS services, so our expectations for the Amazon Chime SDK are high. The Amazon Chime SDK is a set of real-time communications components that developers can use to quickly add audio calling, video calling, and screen sharing capabilities to their web or mobile applications. Developers can leverage the same communication infrastructure and services that power Amazon Chime, an online meetings service from AWS, and deliver engaging experiences in their applications.
Pros: Amazon service (scalability and performance), our application servers are usually on AWS, price
Cons: poor documentation and sample code, a big integration effort in client apps, limited to 16 video streams because there is no central media server
*No central media server communication: the documentation is not explicit on this point, but judging by the limit of 16 video streams, the bandwidth requirements, and the way recording can be implemented, media streams appear to flow directly between peers, with no central media server for streaming.
** The new Zoom Video SDK was not tested in practice. All of its data in the table was gathered from the documentation page.
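The reasoning in the first note above relies on how stream counts grow with the number of participants. A rough way to see it is to count video streams per participant in the two topologies. The sketch assumes every participant publishes exactly one video stream and that a central server simply forwards streams without mixing them; the 1 Mbps per-stream figure is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamLoad:
    uploads: int     # video streams one participant sends
    downloads: int   # video streams one participant receives


def peer_to_peer_load(participants: int) -> StreamLoad:
    """Direct peer connections: send to and receive from every other peer."""
    if participants < 1:
        raise ValueError("need at least one participant")
    others = participants - 1
    return StreamLoad(uploads=others, downloads=others)


def central_server_load(participants: int) -> StreamLoad:
    """Forwarding media server: send once, receive everyone else's stream."""
    if participants < 1:
        raise ValueError("need at least one participant")
    return StreamLoad(uploads=1, downloads=participants - 1)


def upload_mbps(load: StreamLoad, per_stream_mbps: float = 1.0) -> float:
    """Estimated upload bandwidth for one participant (illustrative bitrate)."""
    return load.uploads * per_stream_mbps
```

With 16 participants, each peer in a direct topology uploads 15 streams, while with a forwarding server each client uploads only one. This is why upload bandwidth on the client is usually the first limit that peer-to-peer calls run into.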
The analysis is done. The proposed solution is Twilio and here are the reasons why:
If we were not limited in time and effort, and video conferencing were the core feature, we would probably consider Jitsi more seriously and give it a try, simply because we enjoy working on open-source projects and would get to control all aspects of the solution.