WebRTC Deployment and Technology

 WebRTC: Why and How?

 

1.   Introduction

Although web browsers were first designed as an interface for displaying information provided by web servers, they are now used to access social networks, play online games, exchange emails and messages, and stream audio and video content. Web browsers have thereby become the main access interface to the Internet and, for a large portion of Internet users, synonymous with the Internet itself.

Until recently, the communication capabilities of web applications were limited to either text-based communication, such as messaging or email, or non-real-time audio and video, e.g., streaming. Combining real-time services such as a voice call or a video conference with a web application was only possible by using either a separate application or proprietary plug-ins, which lack open specifications and interoperability and are often limited to certain platforms.

In order to add real-time capabilities to commercial browsers in a standardized manner and to move away from proprietary solutions, the major standardization groups responsible for the advancement of Internet protocols and applications have launched the HTML5 and real-time web (WebRTC) initiatives to complement web applications with real-time media features.

In this paper we first provide a brief introduction to the WebRTC technology and then explain its use in the enterprise as well as the service provider environment.

2.   Introduction to WebRTC

Current approaches for supporting real-time communication in web applications rely on either a separate application or a plug-in such as Flash. Using a separate application means leaving the browser and launching a new application, so the content presented by the browser and the real-time content cannot be truly integrated. Solutions based on plug-ins provide tighter integration between the real-time content and the provider’s web pages. However, plug-ins such as Flash are proprietary and do not work in all environments. In particular, Flash does not work on iOS, the operating system used on iPhones. Another issue with the Flash technology is its centralized model: a Flash plug-in that was downloaded from domain X can only communicate with a server in domain X. This restriction was introduced to prevent a malicious application from sending traffic to an arbitrary destination and thereby attacking it. It also means that an application provider offering a number of applications in the form of Flash plug-ins has to handle all the signalling and media traffic generated by those plug-ins.

WebRTC Standards and layers

Figure 1: WebRTC framework

New working groups have been created within the W3C and the IETF with the aim of defining the elements of real-time communication in the browser [1], [2], [3].

Based on the WebRTC framework proposed by the IETF and W3C, browser vendors are extending their browsers to support sending and receiving audio and video. The specified WebRTC framework, see Figure 1, consists of the following main parts:

  • Browser API: To provide application developers with the ability to send and receive audio and video streams directly from a browser, browsers must be enhanced with capabilities for controlling the local audio and video devices of the computing device on which the browser is running. These capabilities are exposed to application developers through a well-defined application programming interface (API).
  • Web application: The typical mode of running a web application is for the user to download a JavaScript program from a web server. This script then runs locally on the user’s system but interacts with the web server to execute the application logic. The web server can instruct the script to carry out certain actions, and the script can send feedback to the web server.
  • Web server: The server provides the JavaScript programs to the users and executes the application logic.
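
As a rough illustration of the browser API described above, the following sketch requests access to the microphone and camera. In current browsers this capability is exposed as navigator.mediaDevices.getUserMedia. The MediaDevicesLike, StreamLike and TrackLike interfaces, the captureLocalMedia function and the stub object are assumptions introduced here so that the example can also run outside a browser.

```typescript
// Minimal shapes mirroring the parts of the browser media API used here.
// These interfaces are illustrative; in a browser, navigator.mediaDevices
// can be passed wherever a MediaDevicesLike is expected.
interface TrackLike { kind: string; }
interface StreamLike { getTracks(): TrackLike[]; }
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<StreamLike>;
}

// Ask for microphone and camera access and report which track kinds were granted.
async function captureLocalMedia(devices: MediaDevicesLike): Promise<string[]> {
  const stream = await devices.getUserMedia({ audio: true, video: true });
  return stream.getTracks().map((track) => track.kind);
}

// Stub standing in for the browser (hypothetical, for demonstration only).
const requested: Array<{ audio: boolean; video: boolean }> = [];
const stubDevices: MediaDevicesLike = {
  getUserMedia(constraints) {
    requested.push(constraints);
    return Promise.resolve({
      getTracks: () => [{ kind: "audio" }, { kind: "video" }],
    });
  },
};

captureLocalMedia(stubDevices).then((kinds) => console.log(kinds.join(", ")));
```

In a real web application the browser would ask the user for permission before the promise resolves, and the returned stream would then be attached to a peer connection.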

An application developed in JavaScript would then use the browser API to capture camera and microphone data from the host computer and send it to some receiver. To avoid the restriction of the centralized model used by the Flash technology, the WebRTC framework allows a browser to send data to a host other than the one from which the application was downloaded, but only after that host, i.e., the callee, has consented to receiving the data.
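
The consent rule can be pictured with a toy model. Browsers typically establish this consent through connectivity checks before sending any media; the class below does not implement that mechanism and only models the rule itself. The MediaSender class, its methods and the host name are hypothetical.

```typescript
// Toy model of the consent rule: media may only be sent to hosts that have
// agreed to receive it. All names here are illustrative.
class MediaSender {
  private consenting = new Set<string>();
  readonly sent: string[] = [];

  // Called once the remote host has signalled that it accepts media.
  recordConsent(host: string): void {
    this.consenting.add(host);
  }

  // Send a packet only if the destination has consented; report whether it was sent.
  send(host: string, packet: string): boolean {
    if (!this.consenting.has(host)) return false;
    this.sent.push(`${host}:${packet}`);
    return true;
  }
}

const sender = new MediaSender();
const beforeConsent = sender.send("callee.example", "frame-1"); // refused
sender.recordConsent("callee.example");
const afterConsent = sender.send("callee.example", "frame-2"); // accepted
```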

With such a framework, a web telephony application is developed as a JavaScript program that is provided by a web server, see Figure 2. A user wishing to use this application downloads the script. When the user makes a call, the script informs the web server about the call destination and the web server contacts the final destination. Once the callee has answered, the web server forwards the callee’s response to the script running on the caller’s system. The script then instructs the browser to use the local audio and video devices to exchange audio and video content with the callee.
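
The call setup just described can be sketched as a small in-memory simulation. The message names ("call", "answer"), the WebServer and Client classes and the user names are assumptions made for illustration; the WebRTC framework does not prescribe a signalling protocol between the script and the web server.

```typescript
// Signalling messages exchanged between the scripts and the web server.
// The message names are illustrative, not part of any standard.
type Signal =
  | { type: "call"; from: string; to: string }
  | { type: "answer"; from: string; to: string };

// The web server only relays signalling; it never touches the media.
class WebServer {
  private clients = new Map<string, Client>();
  readonly log: string[] = [];

  register(client: Client): void {
    this.clients.set(client.user, client);
  }

  relay(msg: Signal): void {
    this.log.push(`${msg.type}:${msg.from}->${msg.to}`);
    const target = this.clients.get(msg.to);
    if (!target) throw new Error(`unknown user ${msg.to}`);
    target.receive(msg);
  }
}

// A client models the script running in one user's browser.
class Client {
  readonly user: string;
  private server: WebServer;
  mediaPeer: string | null = null; // set once media exchange may start

  constructor(user: string, server: WebServer) {
    this.user = user;
    this.server = server;
    server.register(this);
  }

  // Caller side: tell the web server whom to call.
  call(callee: string): void {
    this.server.relay({ type: "call", from: this.user, to: callee });
  }

  // Handle a message relayed by the web server.
  receive(msg: Signal): void {
    // In both cases the sender of the message becomes the media peer.
    this.mediaPeer = msg.from;
    if (msg.type === "call") {
      // Callee side: accept the call and answer via the web server.
      this.server.relay({ type: "answer", from: this.user, to: msg.from });
    }
  }
}

const server = new WebServer();
const alice = new Client("alice", server);
const bob = new Client("bob", server);
alice.call("bob");
console.log(server.log.join(", "));
```

In this model the callee answers automatically; a real application would wait for the user to accept the call before sending the answer.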

WebRTC call flow

Figure 2: High-level WebRTC flow

To ensure that the range of applications that can benefit from integrating real-time services with the browser is limited only by the imagination of developers, the WebRTC framework defines only the API to be provided by the browser and the minimal security requirements needed to prevent the misuse of WebRTC applications for launching denial-of-service attacks.

WebRTC Trapezoid

Figure 3: WebRTC Trapezoid

To enable browsers using different application providers to communicate with each other (e.g., a user logged in to Facebook wants to call someone who is logged in to LinkedIn), a so-called RTC trapezoid, see Figure 3, can be used. In this case the two providers federate with each other using a widely deployed VoIP signalling protocol such as the Session Initiation Protocol (SIP) [4]. Each provider’s browser-based client, however, signals to its own server using proprietary application protocols built on top of HTTP and WebSockets.
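
To make the federation step more concrete, the sketch below shows how a provider’s server might map a proprietary call request received from its browser client onto a SIP INVITE sent to the other provider. The CallRequest shape, the toSipInvite function, the host names and the identifier values are all hypothetical. The sketch reuses one identifier for the Call-ID, tag and branch values for brevity; a real gateway would generate each of them independently and would carry a session description in the message body.

```typescript
// Hypothetical proprietary call request, e.g. received as JSON over a WebSocket.
interface CallRequest {
  caller: string; // e.g. "alice@provider-a.example"
  callee: string; // e.g. "bob@provider-b.example"
  callId: string; // must be unique per call in a real deployment
}

// Build a minimal SIP INVITE (RFC 3261) without a message body.
function toSipInvite(req: CallRequest, gatewayHost: string): string {
  const lines = [
    `INVITE sip:${req.callee} SIP/2.0`,
    // The branch parameter must start with the magic cookie "z9hG4bK".
    `Via: SIP/2.0/TCP ${gatewayHost};branch=z9hG4bK${req.callId}`,
    "Max-Forwards: 70",
    `From: <sip:${req.caller}>;tag=${req.callId}`,
    `To: <sip:${req.callee}>`,
    `Call-ID: ${req.callId}@${gatewayHost}`,
    "CSeq: 1 INVITE",
    `Contact: <sip:${gatewayHost};transport=tcp>`,
    "Content-Length: 0",
  ];
  // SIP uses CRLF line endings; an empty line terminates the headers.
  return lines.join("\r\n") + "\r\n\r\n";
}

const invite = toSipInvite(
  { caller: "alice@provider-a.example", callee: "bob@provider-b.example", callId: "a84b4c76e66710" },
  "gw.provider-a.example",
);
console.log(invite);
```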

3.   WebRTC in the Enterprise

Most enterprises have moved or are planning to move their telephony system and call center services to a VoIP based solution.

A telephony system in the enterprise usually serves one or more of the following scenarios:

  • Onsite communication: The ability of the employees of the enterprise to call each other.
  • Offsite communication: The ability of the employees of the enterprise to call the rest of the world.
  • Remote employee communication: Enterprises need to enable employees who work from home, work in a smaller branch or are on the road to use the enterprise telephony system for communicating with other employees or for offsite communication. This is often achieved by providing the remote employee with a VoIP application that is connected to the enterprise through a VPN in order to ensure the security of the communication.
  • Customer communication: Customers can call the call center of an enterprise and communicate with sales and support employees. To offer a toll-free number, enterprises need to buy this rather expensive service from a telephony service provider.

By deploying WebRTC, enterprises can not only reduce the costs of their telephony services but also introduce video and chat with little effort:

  • Cost savings:

o   By offering WebRTC services instead of a toll-free number for their call center and support services, enterprises can significantly reduce the costs paid for the toll-free services.

o   Remote workers can connect to the enterprise telephony service over any IP connection without the enterprise having to provide them with any special devices, VPN or VoIP software application.

  • Ease of use: Services such as conferencing, messaging and connection to calendars and address books are just some of the common features offered by modern VoIP telephony solutions for the enterprise. These services are usually not available to remote employees and customers connecting to the enterprise through a telephone. WebRTC removes this barrier and can contribute significantly to more customer-friendly communication.

o   Rich communication: Telephony services are restricted to voice communication. WebRTC applications seamlessly integrate video and messaging. This can benefit both remote workers and customers.

o   Uninterrupted communication: A customer looking for a service often looks up the enterprise’s description and contact information in a browser. The customer then has to leave the browser and use the phone to call the enterprise. With WebRTC the customer remains on the enterprise’s web page and can discuss any products or issues without having to deal with both the phone and the browser.

  • Security: All communication between WebRTC applications is encrypted. While current VoIP solutions offer the possibility of using encryption, it is not often used. Even when it is used in the enterprise scenario, encryption is often only available to remote users. Customers contacting the enterprise can rely only on the security mechanisms provided by the public telephony system, which are poor at best. WebRTC is among the first communication systems to integrate state-of-the-art encryption standards as an intrinsic part of the system rather than as an add-on.

3.1.   Deploying WebRTC in the Enterprise

Remote workers are usually connected to the enterprise telephony system over a VPN or similar. Customers call the enterprise either over a VoIP call or directly over a PSTN connection.

Usually, in order to protect the telephony system of the enterprise, a session border controller (SBC) is located at the border of the enterprise network.

Figure 4 presents a high-level overview of the migration path from a SIP/PSTN-based solution to one that incorporates WebRTC as well.

Migration strategies to WebRTC

Figure 4: Migration to WebRTC

As a starting point we assume a generic enterprise structure built around the VoIP solution used at the enterprise, namely a PBX, a call center or something similar. Remote workers and small branches of the enterprise are often connected to the central enterprise infrastructure through an SBC. The same or another SBC also separates the enterprise infrastructure from the public Internet. Calls to public numbers or from customers are routed through the SBC. Direct PSTN connections are usually handled by the infrastructure directly.

One can in general distinguish three migration paths:

  • Replacement: In this case the SBC used by the enterprise is replaced by an SBC that supports WebRTC. Such a solution then provides both SBC and WebRTC gateway functionality.
  • Extension: In addition to the existing SBC, the enterprise deploys a WebRTC gateway in parallel. VoIP calls to and from remote workers and customers are still processed by the already deployed SBC. Calls using WebRTC are handled by the WebRTC gateway, which forwards them to the enterprise VoIP servers. The WebRTC gateway can be installed on a dedicated device or run as a virtual machine on already available hardware.
  • Cloud: This is probably the least intrusive migration path. Instead of being deployed directly as part of the enterprise infrastructure, the WebRTC gateway is deployed on a cloud service. WebRTC calls to the enterprise are routed to the WebRTC gateway in the cloud, which then routes the translated calls to the enterprise. From the point of view of the enterprise, the WebRTC gateway is similar to a remote worker or a public caller. Therefore, unless additional security mechanisms such as a VPN are introduced, the enterprise might want to route the calls from the WebRTC gateway through the already deployed SBC.
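
The three migration paths differ mainly in which elements an incoming call traverses before it reaches the enterprise VoIP servers. The following sketch expresses this as a lookup function. The type names, the element labels and the gatewayLinkSecured flag are assumptions made for illustration and do not correspond to any product configuration.

```typescript
type MigrationPath = "replacement" | "extension" | "cloud";
type CallKind = "sip" | "webrtc";

// Return the chain of elements an incoming call traverses, ending at the
// enterprise VoIP servers (PBX, call center or similar).
// gatewayLinkSecured models an additional mechanism such as a VPN between a
// cloud-hosted gateway and the enterprise (hypothetical flag).
function entryPath(
  path: MigrationPath,
  kind: CallKind,
  gatewayLinkSecured: boolean = false,
): string[] {
  const servers = "enterprise VoIP servers";
  if (path === "replacement") {
    // One combined device handles both SIP and WebRTC calls.
    return ["WebRTC-capable SBC", servers];
  }
  if (kind === "sip") {
    // Extension and cloud paths leave SIP handling with the existing SBC.
    return ["existing SBC", servers];
  }
  if (path === "extension") {
    return ["on-site WebRTC gateway", servers];
  }
  // Cloud path: without a secured link, calls pass through the existing SBC.
  return gatewayLinkSecured
    ? ["cloud WebRTC gateway", servers]
    : ["cloud WebRTC gateway", "existing SBC", servers];
}

console.log(entryPath("cloud", "webrtc").join(" > "));
```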

4.   WebRTC and the Service Provider

Being enterprises themselves, service providers can benefit from WebRTC in the same ways as any other enterprise (see Section 3). In addition, service providers can use WebRTC technology to generate new revenue. With a WebRTC gateway, service providers can offer the following services:

  • Mobile Telephony: While VoIP subscribers use their VoIP line at home or in the office, when on the road they rely on their mobile phone or an application running on their mobile devices. In order to support mobile users, operators usually need to develop or acquire a VoIP application. This requires an investment of time and money as well as the ongoing maintenance and support of rather complicated VoIP applications. WebRTC platforms such as JsSIP already provide a complete voice and video solution that runs natively in the browser. Operators can thereby keep their subscribers on their network and extend their service reach with a low investment.
  • Hosted Services: In order to reduce the costs of owning and maintaining a PBX or a call center, enterprises are increasingly outsourcing these services to service providers. By deploying a WebRTC gateway, service providers can extend their voice-based PBX and call center offerings with video and messaging capabilities. End users can then access the SIP-based hosted PBX and call centers without any changes to these services.
  • WebRTC as a Service: Enterprises wishing to deploy WebRTC applications have two options: either deploy their own WebRTC servers or outsource these servers to someone else. By offering WebRTC as a service, a service provider basically hosts the WebRTC gateway for the enterprise. WebRTC calls destined for the enterprise are handled by the WebRTC gateway of the service provider, which translates them into SIP calls and routes them to the enterprise. The enterprise does not have to change anything in its infrastructure, as it still handles only SIP calls.

5.   References

[1] “Web real-time communications working group charter”, W3C, Dec. 2010, http://www.w3.org/2010/12/webrtc-charter.html

[2] “RTC-Web IETF working charter proposal”, Mar. 2011, http://rtc-web.alvestrand.com/ietf-activity

[3] J. Rosenberg et al., “An architectural framework for browser based real-time communications”, IETF Internet draft, work in progress, Feb. 2011.

[4] J. Rosenberg et al., “SIP: Session Initiation Protocol”, IETF RFC 3261, June 2002.