5.15. Amazon Elastic Cloud Configuration Cookbook

An easy way to start and run an RTC gateway is “off a cloud”, i.e. on a hosted platform, without purchasing and operating your own physical infrastructure: computers, racks, disks and IP connectivity. A single click is enough to start a service, as long as you keep paying for the cloud services. While there are many “cloud platforms”, this section focuses on running the RTC gateway on the Amazon Web Services (AWS) platform.

The AWS platform is a mature system that allows you, among many other useful things, to start and run pre-built virtual machines, load-balance traffic among them, monitor their health and scale the infrastructure up and down to match user load. Applications, the RTC-to-SIP gateway in our case, come pre-installed and ready to start in the form of a virtual machine image, called an AMI (Amazon Machine Image).

The frafos RTC gateway installation is preconfigured to address a simple yet useful scenario: adding RTC connectivity to a running SIP PBX service. Once started, the RTC gateway passes RTC registrations and calls coming from RTC clients down to the PBX(s). In the reverse direction, the gateway routes calls coming from the SIP PBX(s) to the previously registered RTC browsers. No further configuration is needed.

The following subsections describe how to start the RTC-to-SIP gateway service on the Amazon platform. We offer several ways to do the same thing: the easiest is launching a cluster using a Cloud Formation template. This way you create a load-balanced, scalable infrastructure at the press of a button, without any further knowledge of how the components must be configured. If you want to understand in more detail how the gateway works, you can launch a single-instance service and/or configure it in detail step by step.

We suggest you explore our demo site https://go.frafos.com. It includes additional information about the use of AWS and WebRTC technologies, including live services and ready-made demo AMIs and Cloud Formation Templates. These can be launched with a single click without any need for further configuration. Note that the demo versions have a 90-second limit on maximum call duration.

5.15.1. Before you Start: Prerequisites and Important Warnings

Before you start, you should have the following:

  • Amazon Web Services (AWS) account. Note that accounts come with several service plans charged at different levels, and a credit card number and a telephone number must be at hand to verify identity and payment. Go to http://aws.amazon.com to sign up.
  • AWS EC2 SSH keypair. This is important for administering the virtual machines remotely. If you haven’t created or uploaded one, do so under “EC2‣Keypairs”. If you want to start the services in multiple regions, make sure that you have a keypair for every region before you start.
  • Amazon Machine Image (AMI) with the RTC-2-SIP gateway from frafos. You will find the right one for your geographic region on our experimental webpage, https://go.frafos.com/.
  • RTC-enabled browser for testing. frafos has tested the latest version of Chrome to work well, though other implementations exist as well.
  • Optional: Publicly available SIP service and a SIP account. You need a SIP URI and password with a SIP service to be able to make calls through the RTC-to-SIP gateway. Otherwise you can only make anonymous calls.
  • Optional: a DNS name under which your RTC-to-SIP gateway will be reachable.
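
If you prefer the command line, the SSH key pair can also be created with the AWS CLI instead of the EC2 console. The sketch below is an illustration only; the key name, output path and region are example values:

```shell
# Create a key pair in the region you will launch in and save the private key.
# Key name, output path and region are example values; repeat per region if
# you plan to deploy to several regions.
aws ec2 create-key-pair \
    --region eu-west-1 \
    --key-name frafos-aws-keypair \
    --query 'KeyMaterial' --output text > ~/.ssh/frafos-aws-keypair.pem

# Restrict permissions so that ssh accepts the key file.
chmod 600 ~/.ssh/frafos-aws-keypair.pem
```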

To begin, visit our experimental webpage https://go.frafos.com/. The webpage contains predefined links to available AMIs that allow you to launch quickly.



5.15.2. Quick Start Using Cloud Formation

The fastest way to launch your service is using Amazon’s Cloud Formation. The Cloud Formation service quickly starts a whole network based on a description included in a template. The template includes information about virtual instances, how to scale them up and down, how to spread the load across them using a load-balancer, and what firewall policy to use to filter IP traffic: quite some work if an administrator had to set all of this up manually.

frafos has created a starter template that starts a fail-safe cluster of one to four gateways behind a load-balancer. The template is available on our site, https://go.frafos.com.

During the process you will be prompted for very few parameters. Their scope may change as we keep developing the template; in most cases it is best to leave them at their default values. The only required parameter you must set is the name of your SSH key. Once you start the cloud formation process, it takes several minutes to complete. After the stack is launched, you will have one load-balancer and one to four gateways running behind it. A URI shown upon completion of the cloud formation process will allow users to download a demo JavaScript application and start using the service. Sometimes you may need to be patient for a couple of minutes until the service is really “warmed up”.
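
The same launch can also be scripted with the AWS CLI instead of the console. This is only a sketch: the stack name and template URL are placeholders, and only the SSH key name parameter is genuinely required:

```shell
# Launch the gateway cluster from a Cloud Formation template.
# Stack name and template URL are example values; KeyName must name an
# existing EC2 key pair in the target region.
aws cloudformation create-stack \
    --stack-name rtc-gateway \
    --template-url https://s3.amazonaws.com/example-bucket/rtc-gateway.template \
    --parameters ParameterKey=KeyName,ParameterValue=frafos-aws-keypair \
    --capabilities CAPABILITY_IAM   # needed only if the template creates IAM resources

# Wait until the load-balancer and gateways are up, then read the service
# URI from the stack outputs.
aws cloudformation wait stack-create-complete --stack-name rtc-gateway
aws cloudformation describe-stacks --stack-name rtc-gateway \
    --query 'Stacks[0].Outputs'
```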

When placing your first phone call, you may for example try calling sip:music@frafos.net. When opening the web page, accept the self-signed certificate and allow the browser to use your microphone and camera.


Figure 1: Screenshot: First Browser Call to music@frafos.net

You can also try out the built-in audio conferencing bridge by dialing an 8-digit number prefixed with *. Anyone calling the same address will appear in the same conferencing room.

As the next steps, you can follow the links shown in the Cloud Formation Output window: a WebRTC web telephony application and the ABC Monitor (use the sbcadmin username and default password). You can also administer the actual instances by going to their web address “https://IP/”, with username “sbcadmin” and password equal to the instance ID. For example, you can review the rules that remove video streams between WebRTC and legacy SIP so that at least audio works where video signaling often fails, or look at the dialing plan for the on-board conferencing.

5.15.3. Quick Start: Launch Single Instance

If beginning with a cluster appears too heavyweight, you can start a single RTC gateway instance instead. This can also be done from our site, https://go.frafos.com.

During the process you will be prompted for instance type, instance details, storage and security group. Leave everything at its default value except for the security group: it must be set to permit the following inbound flows:

  • TCP/5060-5069 — SIP service
  • UDP/5060-5069 — SIP service
  • TCP/443 — web user interface (/) and WebRTC Demo App (/tryit/)
  • TCP/22 — secure shell
  • UDP/10000-11000 — RTP media
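
If you set up the instance with the AWS CLI rather than the launch wizard, a security group with these rules could be created roughly as follows. The group name is an example value, and opening the ports to 0.0.0.0/0 is suitable only for a trial:

```shell
# Create a security group and open the ports the gateway needs.
SG_ID=$(aws ec2 create-security-group \
    --group-name rtc-gateway-sg \
    --description "RTC-to-SIP gateway" \
    --query 'GroupId' --output text)

aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
    --protocol tcp --port 5060-5069 --cidr 0.0.0.0/0   # SIP over TCP
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
    --protocol udp --port 5060-5069 --cidr 0.0.0.0/0   # SIP over UDP
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
    --protocol tcp --port 443 --cidr 0.0.0.0/0         # web UI and demo app
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
    --protocol tcp --port 22 --cidr 0.0.0.0/0          # secure shell
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
    --protocol udp --port 10000-11000 --cidr 0.0.0.0/0 # RTP media
```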

Finally, choose an existing key pair or create a new one, and store the private key securely.

Once the virtual machine is up and running, you can access its trial application by going to https://PUBLIC_IP/tryit/ or its administrative interface at https://PUBLIC_IP/. The administrative username is “sbcadmin”; the password is the ID of your Amazon instance. You can also access a remote shell by logging in with the private part of the AWS SSH key:

$ ssh -i .ssh/frafos-aws-keypair.pem -l ec2-user PUBLIC_IP

If you would like to use additional AWS features that the instance supports, such as CloudWatch and Systems Manager, you must enable an instance role that permits them. The easiest way to do so is to create an AWS/EC2 role with the predefined permissions “EC2 Role for Simple Systems Manager” (AmazonEC2RoleforSSM) and attach it to the instance.
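
With the AWS CLI, the role setup could look roughly as sketched below; the role name, profile name and instance ID are example values:

```shell
# Create a role that EC2 instances may assume and attach the SSM policy.
aws iam create-role --role-name abcsbc-ssm-role \
    --assume-role-policy-document '{
      "Version": "2012-10-17",
      "Statement": [{"Effect": "Allow",
                     "Principal": {"Service": "ec2.amazonaws.com"},
                     "Action": "sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name abcsbc-ssm-role \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM

# Wrap the role in an instance profile and attach it to the instance.
aws iam create-instance-profile --instance-profile-name abcsbc-ssm-profile
aws iam add-role-to-instance-profile \
    --instance-profile-name abcsbc-ssm-profile --role-name abcsbc-ssm-role
aws ec2 associate-iam-instance-profile \
    --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=abcsbc-ssm-profile
```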

5.15.4. Updating License

On the Amazon Cloud there is an easy way to centrally install a license file that is then used by all newly started ABC SBC instances. This is practical when you upgrade to a feature-richer license and do not want to configure the license individually in every new instance. The license is then used both by instances started individually and by instances started via Cloud Formation and AutoScaling. You only need to make sure the license file matches the AMIs you are using.

After obtaining the license file from frafos support, all you need to do is enable the instance’s access to Systems Manager (see http://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-access.html#sysman-configuring-access-role) and put the license in a parameter with a well-known name in the Parameter Store. The Parameter Store is located in the EC2 Dashboard under “System Manager Shared Resources ‣ Parameter Store”. The parameter name must be “/abcsbc/license” as shown in the screenshot below.

Note that setting this parameter does not affect running instances, only applies to the AWS Region for which you provisioned it, and must include a license specific to the AMIs you are using.


Figure 2: Screenshot: Setting License in Amazon Parameter Store
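
With the access role in place, the license can also be uploaded from the command line; the region and license file name below are example values:

```shell
# Store the license file under the well-known parameter name.
# This affects only the given region and only instances started afterwards.
aws ssm put-parameter \
    --region eu-west-1 \
    --name /abcsbc/license \
    --type String \
    --value file://abcsbc-license.txt \
    --overwrite
```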

5.15.5. Introducing Geographic Dispersion

Introducing geographic redundancy and dispersion may be useful to become resilient against regional disasters and/or to decrease VoIP latency. Latency can have a major impact on quality of service. For example, if an American user accesses a European RTC gateway to reach an American SIP PBX, media will travel across the Atlantic back and forth, resulting in noticeable latency and QoS degradation.

Fortunately, AWS offers an easy-to-manage way to build up geographic redundancy for both individual instances and whole clusters. All that needs to be done is to create the instances or whole stacks as described in the previous subsections multiple times in different regions, and link their addresses to a single latency-routed DNS name. This is a feature of the Amazon Route 53 DNS service, which returns the lowest-latency IP address associated with a DNS name.

We experimented with this Amazon feature and confirmed significant latency savings. In our example, we created two instances, one located in Ireland, the other in California. We created CNAME records “eu.areteasea.com” and “us.areteasea.com” for them. Finally, we created the latency-routed global DNS name entries “world” for both regions, as shown in Figure Screenshot: Creating DNS Latency-based Routing Records.


Figure 3: Screenshot: Creating DNS Latency-based Routing Records

Clients trying to open up a connection to “world.areteasea.com” resolve this DNS name to different IP addresses depending on where they ask from.
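
The same latency-routed records can also be provisioned with the AWS CLI. In this sketch the hosted zone ID and IP addresses are made-up example values:

```shell
# Create two latency-routed A records for the same name, one per region.
# Hosted zone ID and IP addresses are example values.
aws route53 change-resource-record-sets --hosted-zone-id Z0EXAMPLE \
    --change-batch '{
  "Changes": [
    {"Action": "CREATE", "ResourceRecordSet": {
       "Name": "world.areteasea.com", "Type": "A",
       "SetIdentifier": "eu", "Region": "eu-west-1", "TTL": 60,
       "ResourceRecords": [{"Value": "203.0.113.10"}]}},
    {"Action": "CREATE", "ResourceRecordSet": {
       "Name": "world.areteasea.com", "Type": "A",
       "SetIdentifier": "us", "Region": "us-west-1", "TTL": 60,
       "ResourceRecords": [{"Value": "198.51.100.20"}]}}
  ]}'

# Verify from different vantage points: the answer depends on where you ask.
dig +short world.areteasea.com
```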

One can easily verify the outcome by using services like Cloud Monitor (http://cloudmonitor.ca.com). Results shown in Figure Latency Measurements for Multiple Sites Served by Route-53 Latency-Routing prove that proximity makes a difference. Clients in geographic proximity of the two sites see minimum latency below 50 ms: US clients from California to Illinois show 30 to 50 ms, Western Europe shows 24-37 ms, Ireland 8 ms. Clients located outside the served continents see significantly higher latency, starting with 180 ms for Australia, slightly above 200 ms for Argentina and Egypt, and peaking at 329 ms in China – values that make VoIP quality poor.


Figure 4: Latency Measurements for Multiple Sites Served by Route-53 Latency-Routing

After we disabled the European site, all clients began to be served by the Californian server and the minimum latency of European clients increased to 160-180 ms, i.e. by about 130 ms! Therefore we recommend that anyone serving a global user population consider establishing presence in multiple Amazon regions.

5.15.6. Monitoring the Autoscaling Cluster Using CloudWatch

Once the cluster is up and running, it may be worthwhile to experiment with its autoscaling behaviour and monitor how the cluster reacts to varying load. There are various ways to observe the status of the cluster using Amazon’s CloudWatch facility. CloudWatch collects data from all related instances and load-balancers, aggregates it for whole autoscaling groups, and triggers alarms if critical values are exceeded. What data is collected, how it is aggregated and when it triggers autoscaling alarms is part of the Cloud Formation template definition, so if you started the cluster using the template, this is already in place. By default, the autoscaling alarms add a new instance when the average CPU load in the cluster exceeds 80% for several minutes, and remove an instance if it drops below 60%.
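
For illustration, an alarm and scaling policy of this kind could be defined manually with the AWS CLI as sketched below. The group and policy names are example values; the template already creates equivalent resources for you:

```shell
# Create a scale-up policy for the group and capture its ARN.
POLICY_ARN=$(aws autoscaling put-scaling-policy \
    --auto-scaling-group-name rtc-gateway-asg \
    --policy-name rtc-gateway-scale-up \
    --scaling-adjustment 1 --adjustment-type ChangeInCapacity \
    --query 'PolicyARN' --output text)

# Alarm: average group CPU above 80% for five minutes adds an instance.
aws cloudwatch put-metric-alarm \
    --alarm-name rtc-gateway-cpu-high \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=AutoScalingGroupName,Value=rtc-gateway-asg \
    --statistic Average --period 300 --evaluation-periods 1 \
    --threshold 80 --comparison-operator GreaterThanThreshold \
    --alarm-actions "$POLICY_ARN"
```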

The interesting data you can observe include the event-by-event history under “EC2 ‣ Autoscaling Group ‣ Scaling History”, the details of autoscaling alarms in the CloudWatch console, and graphs showing cluster changes along a timeline, also found in the CloudWatch console. The rest of this section shows typical autoscaling situations and how you can inspect them using these monitoring facilities.
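
The scaling history can also be fetched from the command line; the group name below is an example value:

```shell
# List the most recent autoscaling events for the group.
aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name rtc-gateway-asg \
    --max-items 10
```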

The first Figure Screenshot: Scaling History shows an example of scaling history. We interpret it in time order, from the bottom up. When the Autoscaling process started, it launched the first instance at 12:34. Because we kept the machine busy, some seven minutes later, at 12:40, the Autoscaling process chose to reinforce the cluster: it increased the desired capacity to 2 and launched a new instance. Then we rebooted an instance to simulate a failure. The ELB health checks detected the unresponsive instance at 12:48, terminated it, and started a new one. Eventually we relaxed the load; the low-CPU alarm was triggered, in response to which the Autoscaling process reduced the cluster size back to one.


Figure 5: Screenshot: Scaling History

The next Figure CloudWatch Screenshot: Low Load Alarm shows the details of a CloudWatch autoscaling alarm. It displays a situation where the cluster became idle after a period of congestion and an alarm was raised to scale the cluster down. The autoscaling process will remove an instance in response to this alarm.


Figure 6: CloudWatch Screenshot: Low Load Alarm

Developments can be shown at different time-scales using CloudWatch graphs. Figure CloudWatch Graphs: Correlation of Cluster CPU Load and Autoscaling shows how the detection of overload and idle conditions affects cluster size along the time axis. There are three lines in the graph: the orange line shows the average CPU load in the cluster, the blue line shows the autoscaling assessment of needed capacity, and the green line shows the actual number of available instances. The CPU-load line leads the changes: it must remain above the 80% threshold for a period of time before the auto-scaling process decides to increase the target capacity. It then takes some time until the capacity is ready: a new instance must be launched, detected as ready, and included in the load-balancer’s distribution list. Therefore the green line lags behind the blue line, and the blue line always lags behind the orange line.


Figure 7: CloudWatch Graphs: Correlation of Cluster CPU Load and Autoscaling

5.15.7. Performance Recommendations

In virtualized cloud environments, performance can vary significantly due to the “sharing” nature of these environments. It is therefore advisable to choose properly dimensioned computing instances. Amazon offers several “instance types” that vary in various performance aspects. Instance types vary by region and change over time. The current offering is available on the Amazon webpage http://aws.amazon.com/ec2/instance-types/.

For minimum-density trials, the most affordable t2.micro instance type is sufficient. This instance type provides very little CPU capacity, available only in short bursts. If the allowed burst is exceeded, the virtual machine slows down to the point of stalling. Experiments with onboard conferencing have shown that a single conference with more than three participants already brings the machine to a stall.

For predictable performance, you will need a Fixed Performance Instance (FPI) type.

In the mainstream case, when media anchoring is enabled and neither transcoding nor encryption takes place, the critical parameter is the number of parallel calls (PC). Our lab measurements in this configuration have shown the following capacities for instance types available on the Amazon Marketplace (the instance parameters are from https://aws.amazon.com/ec2/instance-types/):

Instance Type   PC    vCPU   Mem (GiB)   Networking Performance   Notes
m3.medium       180   1      3.75        Moderate                 CPU-constrained
m4.large        372   2      8           Moderate                 network-constrained

In the less usual case that SIP is processed without RTP, the number of call attempts becomes the critical parameter. This can be the case when the SBC is used as a signaling-only load-balancer. Then choosing a CPU-strong instance type makes sense. Our tests have shown that the m3.xlarge instance type delivers a signaling rate of about 40 calls per second (CPS), while c3.8xlarge delivers about 500 CPS.

Note that OS-reported CPU-load values may be misleading on virtualized machines. CPU time may be “stolen” by the virtualization hypervisor, and system tools may or may not accurately report the status. A more accurate method to determine the actual utilization of virtual instances is CloudWatch. We recommend that CloudWatch-observed CPU utilization not exceed 80% – if deployed in an Elastic Cluster, this should be the threshold value that triggers autoscaling cluster growth.
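
As an illustration, the hypervisor-side CPU utilization can be read with the AWS CLI; the instance ID below is an example value, and the date invocation assumes GNU date:

```shell
# Fetch average CPU utilization for the last hour in 5-minute samples.
# On the instance itself, the "st" (steal) column of top or mpstat hints
# at hypervisor-stolen CPU time.
aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time   "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 300 --statistics Average
```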