Cobol application assessment with Sonar (1/2)

Code quality has been a constant concern for ages. Bad practices generate defects that affect users and increase maintenance costs. Technical Debt, at first a simple metaphor, has since become a tool for measuring application quality and costs.

A few years ago, software that helped to identify these defects was rare and expensive. Today, Open Source tools such as Sonar allow everyone (project teams, providers, consultants, etc.) to detect these bad practices easily and cheaply.

The Open Source world has long suffered from its ‘geek’ image because these tools were first used by J2EE enthusiasts. But times have changed, and it is now possible to analyze Legacy code, such as Cobol and ABAP, with Sonar.

This is the objective of our series of posts: to show that it is possible to assess the quality of Cobol applications without knowing anything about the Mainframe world.

We have seen previously:

These posts have stirred curiosity (judging by the number of visitors to my blog) and attracted a number of questions about the results of these analyses. This is what we will look at now: what recommendations can we make based on the results of our analysis and the Sonar dashboard?

First, I will not carry out a complete assessment: that takes me between 15 and 20 hours, and it would mean a post of 30 or 40 pages. The objective is to show the approach I use and, most importantly, how to convert analysis results into value for our stakeholders by delivering information that supports their decisions.

Nor will we provide a thorough analysis of the results; we will only try to show, with a few simple examples, how to use the Sonar dashboard to evaluate the quality of code.

Beforehand, a reminder: we have focused our model on performance and reliability, to identify defects that may impact users. So issues of readability and maintainability of applications are not our main concern. They could be, but this is not the case, or at least not now.

Remember also: you’re not an expert in the mainframe world, but you must show how your analysis can help such experts. The easiest way to do this is to identify serious defects that can be corrected easily, not to calculate a technical debt of thousands and thousands of days that nobody can afford to repay.


First of all, I like to know what I am facing: a nice small application or a big ugly monster, the latter being very common in the Mainframe world. Remember that technical debt in this world is measured in decades.

The Sonar dashboard allows us to see that among the applications analyzed, ‘Cobol – Tax’ is the largest application followed closely by ‘Cobol – Big Banking app’. But ‘Cobol – Billing’ is the application with the lowest code quality.

Let’s take a closer look at this application:

Approximately 180 000 lines of code in fewer than 1 400 files: this is a small application. One can see that over half of the lines are in ‘Data Divisions’, the part of a Cobol program that defines the data used in the program.
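
If you want to cross-check this distribution yourself, the split between divisions can be approximated with a few lines of Python. This is only a rough sketch, not what Sonar does internally: it assumes fixed-format Cobol source (indicator in column 7, code in columns 8 to 72), and the function names `lines_per_division` and `division_shares` are made up for this illustration.

```python
import re
from collections import Counter

# Division headers such as "DATA DIVISION." or "PROCEDURE DIVISION USING ...".
DIVISION_RE = re.compile(
    r"^\s*(IDENTIFICATION|ID|ENVIRONMENT|DATA|PROCEDURE)\s+DIVISION\b",
    re.IGNORECASE,
)


def lines_per_division(source: str) -> Counter:
    """Count non-blank, non-comment lines per division (fixed format assumed)."""
    counts: Counter = Counter()
    current = None
    for line in source.splitlines():
        indicator = line[6:7]   # column 7: '*' or '/' marks a comment line
        body = line[7:72]       # columns 8 to 72: the code area
        if indicator in ("*", "/") or not body.strip():
            continue
        match = DIVISION_RE.match(body)
        if match:
            name = match.group(1).upper()
            current = "IDENTIFICATION" if name == "ID" else name
        if current is not None:
            counts[current] += 1
    return counts


def division_shares(counts: Counter) -> dict:
    """Turn line counts into percentages of the counted lines."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}
```

Summing the counters over every program of the application gives a distribution you can compare with the dashboard. The numbers will not match exactly, since the tool may count lines differently.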

This indicates a data-oriented application that manages batch files, for instance accounting journals that record overnight the entries made in bank accounts. You can ask the Cobol team when making your presentation. Most often, I take a look at the comments in the programs to check whether the application does what I think it does.

In most cases, your audience will be able to tell whether the application is small or large, complex or not, but they generally do not know these numbers precisely. They will probably be surprised and interested when they discover how the lines of code are distributed between the ‘divisions’ dedicated to data and those dedicated to processing (Procedure Divisions).

The level of documentation in this application is low: the average for mainframe applications is rather between 20% (minimum) and 30% (correct). At least these comments are located within the statements, but we can see there are a lot of ‘blank’ (empty) comments. Furthermore, it will probably be difficult to understand the data structures, even though they are prevalent in the code.
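
To make the notions of comment density and ‘blank’ comments concrete, here is a minimal sketch in Python. It rests on several assumptions: fixed-format source where `*` or `/` in column 7 marks a comment line, density computed as comment lines divided by the sum of comment lines and code lines (the usual definition, to be checked against your version of Sonar), and empty comment lines left out of the density. The name `comment_stats` is hypothetical.

```python
def comment_stats(source: str) -> dict:
    """Count code lines, real comments and empty comments in fixed-format Cobol."""
    code = comments = blank_comments = 0
    for line in source.splitlines():
        indicator = line[6:7]   # column 7: comment indicator
        rest = line[7:72]       # columns 8 to 72
        if indicator in ("*", "/"):
            if rest.strip():
                comments += 1
            else:
                blank_comments += 1   # a comment marker with no text behind it
        elif rest.strip():
            code += 1
    total = code + comments
    # Assumption: empty comments are not counted in the density.
    density = 100.0 * comments / total if total else 0.0
    return {
        "code_lines": code,
        "comment_lines": comments,
        "blank_comments": blank_comments,
        "comment_density": density,
    }
```

A result well under the 20% mentioned above, combined with a high `blank_comments` count, is the kind of figure worth showing to the team.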

We can see that this application is not very complex, with an average Cyclomatic Complexity of 17.5 per file. This is normal for an application that mostly manages files.

Now, the number of duplicated lines of code is very high, above 50%, almost certainly because of the 184 ‘Duplicated files’, or about 13% of the complete application. What is the origin of these duplicate files? Simply backups. The mainframe world is not the most advanced in terms of version management. When you’re not sure of the functionality to implement, the easiest way is to put the code in comments or, faster still, to copy the program so that you keep a backup allowing a possible rollback. These files accumulate over time if nobody takes care of some cleaning.
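
Backup copies of this kind are easy to spot outside of Sonar too. The sketch below groups files whose content is byte-for-byte identical by hashing them. It is deliberately simpler than Sonar's duplication detection, which works on blocks inside files, and it will miss a backup that was edited after being copied. The function names and the `*.cbl` pattern are assumptions made for the illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(contents: dict) -> list:
    """Group names whose byte content is identical; keep groups of 2 or more."""
    groups = defaultdict(list)
    for name, data in contents.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def find_duplicate_files(root: str, pattern: str = "*.cbl") -> list:
    """Scan a directory tree and report sets of byte-identical source files."""
    contents = {
        str(path): path.read_bytes()
        for path in Path(root).rglob(pattern)
        if path.is_file()
    }
    return group_identical(contents)


def duplicate_ratio(duplicated_files: int, total_files: int) -> float:
    """Percentage of files that are duplicates, e.g. 184 out of about 1 400."""
    return 100.0 * duplicated_files / total_files if total_files else 0.0
```

With the figures above, `duplicate_ratio(184, 1400)` gives roughly 13%, which matches what the dashboard shows.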

Message for the management: it might seem tempting at first to outsource this application because:

  • Its small size means low costs of outsourcing.
  • It is not complex and therefore easy to understand.
  • A batch application is rarely a critical application, so the risk of depending on the outsourcer is reduced.

Nevertheless, a low level of comments means a more difficult knowledge transfer and more time, and therefore money, for the provider that would accept such outsourcing. Moreover, the high percentage of duplicate code means more work, and therefore money, because each modification must be repeated in every duplicated block.

An outsourcer cannot afford to lose money, so these tasks will further reduce the time available to perform maintenance. It is not appropriate to hand this code over to a provider without prior refactoring work to remove duplicate files and re-document the application, particularly with regard to the data files.

More in our next post, which will focus on violations encountered and our final recommendations.

