As we saw in a previous post – Quality in the cloud – cost reduction remains the main motivation for moving to the cloud and virtualized infrastructures. The second main reason is capacity management.
When you have to size a physical infrastructure and an operating budget to manage it, the model is as follows:
- Estimate the maximum load required, based on the highest peaks of activity.
- Plan for the projected growth of resources over the mid to long term.
- Add a safety margin (don't fall short).
Following this model, an architecture based on physical servers is necessarily over-sized. Ask any production manager or system engineer: 20% to 40% of physical servers hosting applications use less than 10% of the available CPU. You can consolidate these applications as virtual machines (VMs) onto a single server and redistribute the others across your infrastructure.
Cost reduction: do more with less.
Capacity Management
The ability to deliver capacity is the second main reason: responding to users' demands as quickly as possible. Previously, when you needed a new server, you often had to wait several weeks for the machine to be ordered, delivered and installed. Now, three clicks are enough to install a virtual machine, and users have become used to seeing their requests met within 24 hours.
This leads to the well-known problem of VM proliferation, which can reverse the cost savings gained previously. Installing new VMs may be easy, but it still has a cost in terms of resources, licenses and the people needed to keep the virtual infrastructure running.
Now, capacity management does not only mean answering users' requests quickly, but also coping safely and as reactively as possible with peaks of activity. Do more, but also do better, with less. This is one advantage of virtualization: being able to move available resources where they are needed. Nothing is worse for a company's image than a web site that does not work, and if we are talking about a commercial site, any unavailability, even temporary, is the equivalent of an industrial accident or a catastrophe. Be able to face the tsunami.
Despite this, it is not uncommon for production staff, the person responsible for the infrastructure or the capacity manager, not to be informed that a new release of an application will generate an increase in activity. Moreover, these peaks are not always predictable.
Finally, from a development perspective, this ability to respond to variations in activity is not always included in the project objectives. Delivering an application on time, with new features implemented without faults or bugs: that is the priority. Ensuring that the application is 'elastic' enough to react correctly to changes in load is rarely an objective, nor a concern truly integrated into the project cycle, except in the case of a special event.
I once had a customer who asked for an audit of the quality of his code, to check whether the programmers knew and properly applied performance best practices. This was a critical banking application that had to triple its number of users, and thus its data. Apart from such cases where performance – in fact, scalability – was included in the project objectives, it is rarely considered as an impact of an application's new features.
Of course you care about performance best practices, but do you integrate the elasticity of your application into your project?
Elasticity
The term elasticity comes from the world of the cloud. It is an important criterion of the quality of a virtual infrastructure: the ability to add or remove, in short to move, resources (mainly CPU, memory, disk space and disk performance) in order to meet capacity needs.
Imagine that the virtual infrastructure is a balloon: how easily and quickly you can inflate or deflate it depends on its elasticity.
We measure elasticity with criteria such as:
- The speed at which the necessary resources can be moved – how fast the balloon inflates or deflates.
- The amount of resources that can be moved.
- The ability to move only certain resources – memory, CPU, disk space.
Elasticity is needed to meet workload peaks and to address scalability needs. However, these two notions should not be confused.
Scalability
Scalability is the ability to meet an increased workload by adding resources. It can be measured as a ratio between load and resources. If an application delivers good performance with N resources for X users, and its performance remains the same when both the resources and the number of users are doubled, then the application scales well. If this ratio remains constant as both factors increase, scalability is linear, which is optimal. It is more likely, however, that some components will not scale as desired, and that performance will plateau at a certain threshold despite the addition of further resources.
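As a rough illustration, here is a minimal sketch of how such a ratio could be computed from load-test measurements. The numbers and the capacity figures are purely hypothetical, not taken from any real benchmark:

```python
# Hypothetical load-test measurements: (resource units, users supported at the target response time).
measurements = [
    (1, 1000),   # N resources, X users
    (2, 1950),   # resources doubled, supported users almost doubled
    (4, 3100),   # scaling starts to flatten out
]

baseline_resources, baseline_users = measurements[0]

for resources, users in measurements:
    # Scalability ratio: gain in supported load vs. gain in resources.
    # 1.0 means linear scalability; values well below 1.0 indicate a plateau.
    ratio = (users / baseline_users) / (resources / baseline_resources)
    print(f"{resources}x resources -> {users} users, scalability ratio = {ratio:.2f}")
```

In this illustrative data set the ratio drops from 1.0 to about 0.78 as resources are quadrupled: the application scales, but not linearly.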
Elasticity is the ability to add, or remove, these resources in a virtual infrastructure. Elasticity includes scalability, but a scalable application is not necessarily an elastic one. For example, you use load balancing to distribute the load between two application servers: this allows you to ensure good response times and proper performance whatever the number of users, even during unexpected peaks.
But if you cannot automatically return to an architecture with a single application server and no load balancing, you cannot deflate the balloon: your application is not elastic.
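To make the difference concrete, here is a minimal, purely illustrative sketch of what elasticity adds: removing servers when the load drops, not only adding them during peaks. There is no real cloud API here, just a hypothetical scaling rule with an assumed capacity per server:

```python
MIN_SERVERS = 1
MAX_SERVERS = 10
USERS_PER_SERVER = 500  # hypothetical capacity of one application server


def desired_servers(current_users: int) -> int:
    """Return how many application servers the current load requires."""
    needed = -(-current_users // USERS_PER_SERVER)  # ceiling division
    return max(MIN_SERVERS, min(MAX_SERVERS, needed))


def adjust(current_servers: int, current_users: int) -> int:
    """Elastic behaviour: scale out on peaks AND scale back in when the load drops."""
    target = desired_servers(current_users)
    if target != current_servers:
        print(f"load={current_users} users: moving from {current_servers} to {target} server(s)")
    return target


servers = 1
for load in (300, 1200, 2600, 800, 200):  # a peak of activity followed by a quiet period
    servers = adjust(servers, load)
```

A scalable-but-not-elastic application would only ever execute the scale-out branch: it copes with the peak, but keeps paying for the extra servers afterwards.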
How do you take elasticity into account in your project? What is the elasticity of an application? Which applications are concerned? We are beginning to see announcements from software publishers on this subject, but I have not found any concrete answer to these questions from the perspective of a developer or an architect. This is what we will try to do in our next post.
Meanwhile, your comments are welcome.
This post is also available in Spanish (Leer este articulo en castellano) and in French (Lire cet article en français).