
Location Aware Scaling Rules

I would like to introduce the "locationAware" service recipe flag, which enhances the behavior of machine failover and scaling rules:

1. When a machine fails and "locationAware" is enabled, Cloudify starts the replacement machine in the same location as the failed machine.

2. Scaling rules are separately enforced on each location. Effectively, Cloudify starts a new machine in the location that triggered the scale-out rule.
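The second point can be sketched in a few lines of Python (not recipe syntax). The function name, the use of an average per location and the 70% threshold are assumptions made for illustration, not Cloudify's API:

```python
from statistics import mean

def locations_to_scale_out(cpu_by_location, high_threshold):
    """Return the locations whose own average CPU load breaches the threshold.

    cpu_by_location maps a location name to the CPU load (percent) of each
    machine in that location. Each location is evaluated on its own, so a
    new machine is started only where the rule actually fired.
    """
    return [
        location
        for location, loads in cpu_by_location.items()
        if loads and mean(loads) > high_threshold
    ]

# Hypothetical numbers: only "Location 2" is overloaded.
print(locations_to_scale_out(
    {"Location 1": [0.0, 0.0], "Location 2": [80.0, 80.0]},
    high_threshold=70.0,
))
```

With these made-up figures only "Location 2" is returned, so that is where the new machine would go.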

In the example below, we assume 2 Tomcat web servers are deployed in each location (a total of 4 web servers):

https://gist.github.com/4000673

Notice that each Web Server CPU load is 40%:

https://gist.github.com/4001155

One scenario is a network disconnection between the load balancer and the Tomcat servers in "Location 1". We assume the load balancer then directs all traffic to "Location 2". As a result, 2 web servers handle all of the traffic, which raises their CPU load to 80%.

https://gist.github.com/4000676

The high CPU load triggers a scaling rule which starts another web server virtual machine. Since the recipe is location-aware, the virtual machine is started in "Location 2" (and not in "Location 1", which is now disconnected from the load balancer). Notice how the CPU load then drops back to normal values:

https://gist.github.com/4000947

When the Location 1 network failure is fixed, we have one server too many.

https://gist.github.com/4001033

The scaling rules detect that the CPU usage is too low (meaning we can handle this load with fewer machines), and the extra machine is removed.

https://gist.github.com/4000673
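The CPU figures in this walkthrough follow from simple arithmetic if we assume that the total load stays constant and is spread evenly over the web servers that receive traffic. That assumption, and the helper name below, are illustrative and not part of the recipe:

```python
def cpu_per_server(total_load, servers):
    """CPU load (percent) per server when the load is shared evenly."""
    return total_load / servers

total_load = 4 * 40.0  # four servers at 40% each

print(cpu_per_server(total_load, 4))            # all four servers reachable
print(cpu_per_server(total_load, 2))            # only Location 2 reachable
print(round(cpu_per_server(total_load, 3), 1))  # after the scale-out in Location 2
print(cpu_per_server(total_load, 5))            # Location 1 back, one server too many
```

Under that assumption the script prints 40.0, 80.0, 53.3 and 32.0, which matches the sequence of normal load, overload, recovery and under-utilization described above.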

#availability zones #cloudify #locaiton aware #scaling rules #cloud automation #cloud computing #technology

Testing Cloudify Auto-Scaling Rules

Closed Loop Feedback Test

Cloudify is an open source PaaS software stack that automates deployment, monitoring and fault detection of applications running on the cloud. It can automatically add instances when a monitored statistic exceeds a certain threshold. This automatic scaling rules implementation is a closed-loop control system, which requires careful testing.

The diagram below shows the test load generator and the closed loop "system under test". The controller (the scaling rules) monitors each web server's throughput. When throughput exceeds a certain threshold, a new web server instance is started, and the throughput per instance drops back below the threshold.

+--------------+
|     Test     |
|      +       |
|              |
+------+-------+
       | http
       | requests
       V
+--------------+    +--------------+
|    Tomcat    |    |  Throughput  |
|  Instance(s) +--->|     JMX      |
|              |    |   Monitor    |
+--------------+    +-------+------+
       ^                    |
       | add                |
       | instance           |
+------+-----+              |
|  Scaling   |              |
|   Rule     |<-------------+
|            |
+------------+

Here is the JMX plugin configured to expose the total number of web requests (per instance):

https://gist.github.com/2788838

The scaling rule uses a 20-second sliding window to convert the total number of requests (X) into throughput (delta X divided by delta T) and compares the result against a predefined threshold.

https://gist.github.com/2788840
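To make the "delta X divided by delta T" conversion concrete, here is a minimal Python sketch of a sliding window over a monotonically growing request counter. The class and method names are hypothetical; the real rule lives in the gist above and is written in recipe syntax:

```python
from collections import deque

class ThroughputWindow:
    """Converts a growing request counter into requests per second."""

    def __init__(self, window_seconds=20.0):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp_seconds, total_requests)

    def add(self, timestamp, total_requests):
        self.samples.append((timestamp, total_requests))
        # Drop samples older than the window, but always keep at least
        # two so that a delta can still be computed.
        while (len(self.samples) > 2
               and timestamp - self.samples[0][0] > self.window_seconds):
            self.samples.popleft()

    def throughput(self):
        if len(self.samples) < 2:
            return 0.0
        (t0, x0), (t1, x1) = self.samples[0], self.samples[-1]
        # delta X divided by delta T
        return (x1 - x0) / (t1 - t0) if t1 > t0 else 0.0

# A counter that grows by 10 requests per second, sampled every 5 seconds.
window = ThroughputWindow(window_seconds=20.0)
for t in range(0, 45, 5):
    window.add(t, t * 10)
print(window.throughput())
```

The value returned by `throughput()` is what would be compared against the threshold.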

The closed loop test starts with "zero traffic", changes to "constant traffic" and then goes back to "zero traffic":

1. Start without any web traffic.

2. Wait until minimum number of instances.

3. Increase web traffic to a predefined level of requests per second.

4. Wait until expected number of instances.

5. Wait a little more, make sure no add/remove instance fluctuations.

6. Stop all web traffic.

7. Wait until minimum number of instances.

The test waits until the expected number of instances is reached (step 4) and then checks that the number stays there for a certain period of time (step 5). During that time we must verify that the scale-out completed without fluctuations. An unwanted fluctuation is when the controller adds an instance and then removes it without any change in the input (stable HTTP traffic).
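The fluctuation check in step 5 could be expressed as follows, assuming the test samples the number of instances periodically while the traffic is held constant. The function name and the sampling approach are assumptions, not part of the Cloudify test suite:

```python
def has_fluctuation(instance_counts):
    """True if the instance count changes direction while input is stable.

    instance_counts holds the number of instances sampled periodically.
    An add followed by a remove (or a remove followed by an add) counts
    as a fluctuation.
    """
    directions = []
    for previous, current in zip(instance_counts, instance_counts[1:]):
        if current != previous:
            directions.append(current > previous)  # True means "added"
    return any(a != b for a, b in zip(directions, directions[1:]))

print(has_fluctuation([3, 3, 3, 3]))  # stable
print(has_fluctuation([3, 4, 4, 3]))  # instance added, then removed
```

A stricter test could fail on any change at all during step 5; the sketch only flags the add-then-remove pattern described above and its mirror image.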

In more advanced test scenarios we may want to monitor resources such as number of busy threads, or CPU usage. This would require a more sophisticated HTTP load generator, which is usually used in stress/performance testing.

Open Loop Test

The problem with developing a closed loop feedback system is that you cannot test the controller (scaling rules) in an isolated environment. Every decision the controller makes affects the output of the system, which in turn affects the controller. The way to deal with that is to "open" the loop (non-feedback controller). The controller makes a decision, but the result does not affect the monitored data feeding the controller.

+--------------+
|     Test     |
|      +       |
|              |
+------+-------+
       | set
       | value
       V
+--------------+    +--------------+
|     Stub     |    |    value     |
|  Instance(s) |    |   monitor    |
|              |    |              |
+--------------+    +------+-------+
       ^                   |
       | add               |
       | instance          |
+------+-----+             |
|  Scaling   |             |
|   Rule     |<------------+
|            |
+------------+

Here is a little Cloudify recipe trick. Each Cloudify instance stores the recipe as a POJO in memory, which allows adding new properties. In this case we add a long value which mocks the web server throughput.

https://gist.github.com/2788844

This recipe allows the test to remotely inject the monitored values that the service exposes to the scaling rules controller. This mock value is not affected by the scaling rules decisions and does not require any actual web server instance running.

Here is what an open loop controller test looks like:

1. Set monitored value to 0.

2. Wait for minimum number of instances.

3. Set monitored value to "$highthreshold+1".

4. Wait for maximum number of instances.

5. Set monitored value to 0.

6. Wait for minimum number of instances.

Notice that step 4 expects the scale-out to be performed again and again until the maximum number of instances is reached. This is because there is no closed loop feedback: no matter how many web instances the scaling rules start, they always see the same monitored value (greater than the high threshold).
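The behavior can be simulated with a small Python sketch in which the controller is applied repeatedly to a fixed, injected value. The function, its parameters and the threshold numbers are hypothetical stand-ins for the real scaling rules:

```python
def run_open_loop(monitored_value, instances, low, high,
                  min_instances, max_instances, max_steps=100):
    """Apply the scaling rule repeatedly against a fixed monitored value.

    Because the value never changes in response to the controller's
    decisions, the rule keeps firing until a bound is reached.
    """
    for _ in range(max_steps):
        if monitored_value > high and instances < max_instances:
            instances += 1  # scale-out
        elif monitored_value < low and instances > min_instances:
            instances -= 1  # scale-in
        else:
            break  # no rule fires, or a bound was reached
    return instances

HIGH, LOW = 100, 10  # made-up thresholds

# Setting the value to HIGH + 1 drives the service to its maximum ...
print(run_open_loop(HIGH + 1, instances=1, low=LOW, high=HIGH,
                    min_instances=1, max_instances=5))
# ... and setting it back to 0 drives it to its minimum.
print(run_open_loop(0, instances=5, low=LOW, high=HIGH,
                    min_instances=1, max_instances=5))
```

With these numbers the first call ends at 5 instances and the second at 1, which is what the six test steps wait for.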

#scaling rules #test #cloudify

Cloudify Scaling Rules

Cloudify can start and stop machines automatically based on real-time statistics.

The first step is to define per-instance monitoring metrics. A metric defined in the recipe could be based on a JMX plugin, an HTTP request, custom command-line output, or custom Groovy code.

Statistics are used to normalize the data and provide a figure we can compare against the threshold. The metrics are aggregated over time (a per-instance time average in the example below). The per-instance statistics are then aggregated again (a maximum in the example below). The result is a service statistic that represents the whole cluster (a maximum of averages).
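As a rough illustration of the two aggregation stages, the following Python sketch computes a maximum of per-instance time averages. The function name and the sample values are invented for the example:

```python
from statistics import mean

def service_statistic(samples_by_instance):
    """Maximum over instances of each instance's time average."""
    return max(mean(samples) for samples in samples_by_instance.values())

print(service_statistic({
    "instance-1": [10.0, 20.0, 30.0],  # time average 20.0
    "instance-2": [50.0, 70.0],        # time average 60.0
}))
```

Here the service statistic is 60.0, the average of the busiest instance, and that single figure is what gets compared against the thresholds.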

This finally leads us to the scaling rules. When the statistic is below the low threshold, a scale-in operation is triggered (an existing instance is removed). When the statistic is above the high threshold, a scale-out operation is triggered (a new instance is added). The scaling rules are bounded by the minimum and maximum total number of service instances.

Some precaution is needed after an instance has been started. It may take some time until the new instance's metrics are usable. The instance could be getting low traffic due to load-balancer session stickiness or due to cache warm-up. During that time it would be wrong to trigger another scaling rule. The cooldown period disables the scaling rules until the instance has been started and warmed up.
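The thresholds, the instance bounds and the cooldown fit together roughly as in the Python sketch below. The class, its parameter names and the choice to start the cooldown after every scaling action (not only after a scale-out) are assumptions for illustration; see the documentation linked below for the actual recipe properties:

```python
class ScalingRules:
    def __init__(self, low, high, min_instances, max_instances, cooldown_seconds):
        self.low = low
        self.high = high
        self.min_instances = min_instances
        self.max_instances = max_instances
        self.cooldown_seconds = cooldown_seconds
        self.cooldown_until = float("-inf")  # no cooldown in effect yet

    def decide(self, now, statistic, instances):
        """Return "scale-out", "scale-in" or "none" for the current reading."""
        if now < self.cooldown_until:
            return "none"  # rules are disabled while the service settles
        if statistic > self.high and instances < self.max_instances:
            decision = "scale-out"
        elif statistic < self.low and instances > self.min_instances:
            decision = "scale-in"
        else:
            return "none"
        self.cooldown_until = now + self.cooldown_seconds
        return decision

rules = ScalingRules(low=20, high=80, min_instances=1, max_instances=4,
                     cooldown_seconds=60)
print(rules.decide(now=0, statistic=90, instances=2))   # above high threshold
print(rules.decide(now=30, statistic=90, instances=3))  # still in cooldown
print(rules.decide(now=60, statistic=90, instances=3))  # cooldown is over
```

The three calls yield "scale-out", "none" and "scale-out": the second reading is ignored even though the statistic is still high, because the new instance has not had time to warm up.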

For more information, consult the scaling rules documentation: http://www.cloudifysource.org/guide/developing/scaling_rules.

#cloudify #scaling rules

How to provision service instances on the private cloud, public cloud and the data center

Ops: For the sake of this discussion... What do you mean when you say "private cloud, public cloud and data center"?

Cloudify: A data center is a bunch of machines that are running most of the time. It could be a bare-bones data center or a virtualized data-center. A private cloud is a data center that has an API for provisioning machines on-demand. It takes away the need to manually decide which VM runs on which physical machine for which user. A public cloud is like a private cloud, but each machine is billed by the hour.

Ops: Ok, so how do you provision service instances to machines?

Cloudify: The simplest scenario for clouds is to start a new virtual machine before an instance starts, and stop the virtual machine after an instance stops.

Ops: And what if I don't have a cloud?

Cloudify: In that case the machines are always running, and the instance is started on a machine that is not used by any other instance. When the instance is stopped the machine becomes vacant (and another instance can use it).

Ops: On a public cloud I am billed for each machine by the hour. If I use a machine for only 20 minutes and start a new machine for 20 minutes, I pay double the price.

Cloudify vFuture: In that case, we use a mixed strategy. Before a new instance is started, look for a vacant machine. If there is no vacant machine, start one. When the instance is stopped the machine becomes vacant (and another instance can use it). The machine is left vacant until its hourly billing period is over, and then it is stopped.

Ops: So while it is vacant, if I start another instance then it will not start a new machine, but rather use the existing machine.

Cloudify vFuture: Correct. This time-sharing mode saves money. The downside is that when the instance is stopped, the machine is not deleted, and another instance could sniff it later.

Ops: Got it. So if I don't start another instance the machine is left vacant until the end of the billing hour?

Cloudify vFuture: Basically yes. You can define for each service the must-have number of instances and the nice-to-have number of instances. This way, if a vacant machine is scheduled to go down in 20 minutes (and no must-have instance needs it), a nice-to-have instance will use the machine for the remaining 20 minutes.

Ops: And what about multi-tenant scenarios?

Cloudify vFuture: Each service instance runs in a separate process (instead of a separate machine). An instance can share the same machine with other processes as long as they serve the same tenant. In a loose security environment you can allow processes that serve different tenants to run on the same machines.
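The mixed strategy from the dialogue (reuse a vacant machine, otherwise start one, and stop vacant machines at the end of their billing hour) can be sketched in Python as follows. This only illustrates the idea being discussed; the class names, the one-hour billing period and the polling-style `stop_expired` call are assumptions, not Cloudify code:

```python
BILLING_PERIOD_SECONDS = 3600  # assumed hourly billing

class Machine:
    def __init__(self, started_at):
        self.started_at = started_at
        self.vacant = False
        self.stop_at = None  # set when the machine becomes vacant

class MachinePool:
    def __init__(self):
        self.machines = []

    def acquire(self, now):
        """Reuse a vacant machine if there is one, otherwise start a new one."""
        for machine in self.machines:
            if machine.vacant:
                machine.vacant = False
                return machine
        machine = Machine(started_at=now)
        self.machines.append(machine)
        return machine

    def release(self, machine, now):
        """The instance stopped: keep the machine until its paid hour ends."""
        periods_used = (now - machine.started_at) // BILLING_PERIOD_SECONDS + 1
        machine.stop_at = machine.started_at + periods_used * BILLING_PERIOD_SECONDS
        machine.vacant = True

    def stop_expired(self, now):
        """Stop (remove) vacant machines whose billing period is over."""
        expired = [m for m in self.machines if m.vacant and now >= m.stop_at]
        self.machines = [m for m in self.machines if m not in expired]
        return expired

pool = MachinePool()
first = pool.acquire(now=0)
pool.release(first, now=20 * 60)   # instance ran for 20 minutes
second = pool.acquire(now=25 * 60)
print(second is first)             # the vacant machine is reused
```

In Ops's example this means the second 20-minute instance lands on the machine that is already paid for, so only one billing hour is consumed.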

#scaling rules #cloud #data center #private cloud #public cloud #cloudify
