Do Developers Dream of Automated Function Points? (II)

In our previous post, we talked about Function Points, a metric usually not known by developers, and asked whether it could be useful to them.

Our answer was rather negative, especially if we consider that such an estimate is performed manually by consultants who rely on a complex process. There are many certifications whose purpose is to validate that a consultant knows these concepts and how to implement them correctly.

Not really the kind of thing developers usually find attractive. In my opinion, they would rather learn a new technology, a new language, the latest framework, an open source project, etc.

Now, let’s imagine that the calculation of Function Points could be automated: would that be enough for developers to use this measure?

Automated Function Points (AFP)

The Object Management Group (OMG) has produced a specification describing the requirements for software that automates the measurement of Function Points.

Limitations

This document lists some differences from a manual assessment of Function Points. For example, the section « 1.3 Limitations » explains that this standard does not apply to « the sizing of enhancements to an application or maintained functionality (often called Enhancement Function Points) ».

If this is really the case, it would be a rather large limitation, as it means that Automated Function Points would concern only new developments, and not later versions of an application. We know that 80% or 90% of existing projects work on application maintenance and that new projects are pretty much in the minority.

Moreover, these specifications do not claim (see paragraph « 2.1 Conformance ») strict compliance with the manual measurement of Function Points, mainly because of the difficulty of fully automating a manual process. The section « 2.3 Consistency with IFPUG CPM » explains that this standard makes explicit choices when, in some situations, the rules enacted by the IFPUG (1) can be quite vague.

This same paragraph explains that an IFPUG consultant will need to make some interpretations regarding the design and the development of functional elements, which is obviously impossible for a software tool that cannot decide by itself whether a data structure or a transaction is important for the application or not: « any automated approach will be unable to capture information about the designer’s or developer’s intent that isn’t explicitly encoded in the application’s source code ».

So the standard itself makes clear that « Automated Function Point counts may differ from the manual counts produced by IFPUG Certified Function Point counters ».

Implementation of the standard

The rest of the document presents the steps to implement the standard and calculate AFP. I will not detail these operations but simply list what, based on my experience with code analysis tools, may differ from manual counting or have consequences for the implementation of this standard as part of a project life cycle.

Boundaries

The first step consists of defining the functional boundaries of the application from the user’s point of view, in order to determine what is inside the application – and thus the code to be analyzed – and what is external to it and therefore outside the scope of the automated Function Point analysis.

Let’s consider a simple application to manage timesheets for an IT company, in which someone inputs the tasks and time spent for a client, in order to produce an invoice. This application does not itself manage the employees of the company, but retrieves this information from the HR system. Similarly, the different types of activity, billable or not, are usually managed by the accounting system. And information about the customers and the services to be provided should be reflected in the commercial system.

If our Timesheet software performs its own accesses to the data of these other systems, then these accesses are internal to the application and must be considered in our estimate of Automated Function Points. If, however, it calls other components of these systems in order to get the necessary data, then this processing is external and outside the scope of analysis.
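
To make this boundary decision concrete, here is a minimal, purely illustrative sketch of how such a scope could be declared before the analysis. The component names and the structure of this configuration are my own assumptions, not the format of any particular tool or of the OMG standard.

```python
# Purely illustrative sketch: declaring the functional boundary of the (hypothetical)
# Timesheet application before an analysis. Component names and this configuration
# format are assumptions, not the conventions of any real tool or of the standard.

TIMESHEET_BOUNDARY = {
    "application": "Timesheet",
    "internal": [
        "timesheet-web",        # presentation layer: entry of tasks and time spent
        "timesheet-services",   # business layer: validation, preparation of invoicing
        "timesheet-db",         # data layer: TIMESHEET and ACTIVITY tables
    ],
    # External systems: analyzed only to reconstruct complete transactions,
    # but their data and processing are not counted for this application.
    "external": [
        "hr-system",            # employees are managed there, only read by Timesheet
        "accounting-system",    # billable / non-billable activity types
        "commercial-system",    # customers and services sold
    ],
}

def in_scope(component: str) -> bool:
    """True if the component's Function Points are attributed to the Timesheet application."""
    return component in TIMESHEET_BOUNDARY["internal"]

for component in ("timesheet-db", "hr-system"):
    print(component, "->", "counted" if in_scope(component) else "external, not counted")
```

Everything declared as external here remains visible to the analysis – we will come back to this point – but its Function Points are not attributed to the application.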

Sometimes, if not often, the choice will be difficult, if not impossible. If you have web services in your applications, where do you attach them? To which application? Should they be counted? You have Copybooks, the equivalent of ‘includes’ for a Mainframe-Cobol application, which describe the DB2 tables and are reusable by all the applications accessing these tables. If 15 applications use the same table, they all work with the same Copybook (well, they should). To which application do we link it? And what about an ERP or any software package consisting of various modules, highly integrated and embedded?

Set up the code

Obviously, these same questions come up every time you configure a new code analysis, whether for counting Function Points or not. But if you are only interested in the quality of the code, whether component X belongs to application Y is not (that) critical: you just want to check whether or not it violates good programming practices. This is what usually happens for Mainframe Cobol applications, although we often try to divide them according to criteria useful for our quality assessment, for example their Transactional or Batch nature, because a performance problem is worse for a TP application with a user in front of it than for a Batch application running overnight with no user.

When the application boundaries are not clear, the most common way to organize the code and configure the analysis is organizational: gather all the components managed by the same team, usually located in a space (network directory, mainframe library, SCM repository, etc.) managed by that team. This division is critical in some use cases, like a Quality Gate or a benchmark of code quality among outsourcers or in-house teams. In these cases, you are interested in the defects added or corrected in subsequent versions of the application, in order to decide on the acceptance or rejection of a new version, or whether your outsourcer is working properly, making the effort to improve, and how he compares with other providers.
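
As an aside, here is a minimal sketch of that kind of Quality Gate decision between two versions; the violation counts, severity levels and acceptance rule are invented for the example.

```python
# Illustrative sketch of a Quality Gate decision between two versions of an application.
# The violation counts, severity levels and the acceptance rule are invented for the example.

baseline  = {"blocker": 2, "critical": 10, "major": 120}   # violations in version N
candidate = {"blocker": 2, "critical": 14, "major": 115}   # violations in version N+1

def added(severity: str) -> int:
    """Violations of a given severity added since the baseline (never negative)."""
    return max(0, candidate.get(severity, 0) - baseline.get(severity, 0))

def quality_gate() -> bool:
    """Reject the new version as soon as a single blocking or critical violation is added."""
    return all(added(severity) == 0 for severity in ("blocker", "critical"))

print("Accepted" if quality_gate() else "Rejected")   # Rejected: 4 new critical violations
```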

It is also this organizational criterion – who manages which component – that we must focus on if we want to measure the productivity of developers. But I see some additional difficulties there. First, the need for precision is greater. If you tell a developer that you found the same violation of best programming practices in 10 components he has changed, and in fact one of them is not his responsibility, the conclusion is that he nevertheless repeated this bad practice in 9 out of 10 components, and therefore has to improve. In fact, in the case of a Quality Gate, a single occurrence is sufficient if it is a blocking or critical violation. But if you tell a developer or an outsourcer that he does not work enough, and you have forgotten a single component in your automated analysis of Function Points, your conclusion will inevitably be challenged. A developer can spend several hours or several days on a single method or a single function. Omitting a single object among many thousands may jeopardize the results, so extreme care is necessary for this particular use case of Automated Function Points.

Another problem is that you are interested only in the changes made by the project team: your outsourcer cannot be held responsible for the defects already present in the code he received for maintenance, and you don’t want to measure his productivity on code he has not produced. But you cannot measure Function Points only in the modified code: you need to identify functional structures and transactions related to them, from the presentation layer up to the data layer. This is only possible by re-analyzing the whole application and calculating the difference in number of Function Points with a previous version, assuming that this standard can be applied to new versions (see the limitation mentioned earlier).
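
Assuming the standard can be applied to successive versions, sizing an enhancement would then boil down to a difference between two full analyses, along the lines of the sketch below. The counts are invented; a real tool would of course produce them, and much more detail, for each version.

```python
# Illustrative sketch: sizing an enhancement by fully re-analyzing the application twice
# and taking the difference in Automated Function Points. The counts are invented.

afp_version_n  = {"data_functions": 42, "transactional_functions": 118}   # full analysis of version N
afp_version_n1 = {"data_functions": 45, "transactional_functions": 131}   # full analysis of version N+1

def total_afp(counts: dict) -> int:
    return counts["data_functions"] + counts["transactional_functions"]

delta = total_afp(afp_version_n1) - total_afp(afp_version_n)
print(f"Function Points delivered by the new version: {delta}")   # 16 FP in this invented example
```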

So it is more complicated, but also more difficult to interpret. Imagine two teams that each produce 100 Function Points, the first in an application of 1 000 FP, the second in an application of 10 000 FP – a 10% growth versus a 1% growth. I guess the effort is not the same. In fact, some experts believe that Function Point counting must be performed differently for applications of very small or very large size.

One last thing: you must exclude from your count the libraries, frameworks, reusable components and other dependencies external to your application, as well as all the tables or files considered outside of the application. However, you still need to analyze these objects in order to find the transactions between the different layers of the application, and therefore include them in the analysis, but « clearly identified as external », as described in the OMG specification. The document clearly highlights the importance of this step of defining the scope of analysis and, to do so, requires the presence of a Subject Matter Expert (SME).

Analysis

The rest of the document outlines the steps of the counting process. Basically:

  • Identify all the functions in the code.
  • Classify these functions into Data functions and Transactional functions, while taking into account the internal or external nature of the data structures.
  • Estimate the numbers of Function Points for each of them.

These requirements mean that the software doing the code analysis must be able to take into account databases, relational or not – some Cobol applications use hierarchical databases – but also all kinds of files, which is not the most natural thing for such a tool, by definition primarily oriented to code.
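
To give an idea of what the last of the three steps above looks like, here is a minimal, purely illustrative skeleton of a count once the functions found in the code have been identified and classified. The function list is invented, and the weights are the usual IFPUG unadjusted weights, used here only as an example; the OMG standard defines its own rules for assigning types and complexity, which are not reproduced here.

```python
# Minimal, purely illustrative skeleton of a Function Point count once the functions
# found in the code have been classified (steps 1 and 2 above). The function list is
# invented; the weights are the usual IFPUG unadjusted weights, used only as an example.

WEIGHTS = {
    # (function type, complexity) -> unadjusted Function Points
    ("ILF", "low"): 7,  ("ILF", "average"): 10, ("ILF", "high"): 15,   # Internal Logical File
    ("EIF", "low"): 5,  ("EIF", "average"): 7,  ("EIF", "high"): 10,   # External Interface File
    ("EI",  "low"): 3,  ("EI",  "average"): 4,  ("EI",  "high"): 6,    # External Input
    ("EO",  "low"): 4,  ("EO",  "average"): 5,  ("EO",  "high"): 7,    # External Output
    ("EQ",  "low"): 3,  ("EQ",  "average"): 4,  ("EQ",  "high"): 6,    # External Inquiry
}

functions = [
    {"name": "TIMESHEET",        "type": "ILF", "complexity": "low"},      # data function, internal
    {"name": "EMPLOYEE (HR)",    "type": "EIF", "complexity": "low"},      # data function, external
    {"name": "Enter time spent", "type": "EI",  "complexity": "average"},  # transactional function
    {"name": "Monthly invoice",  "type": "EO",  "complexity": "high"},     # transactional function
]

total = sum(WEIGHTS[(f["type"], f["complexity"])] for f in functions)
print(f"Automated Function Points (unadjusted): {total}")   # 7 + 5 + 4 + 7 = 23
```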

It is also necessary to identify each logical data structure and to map it to the transactions. However, these structures can be spread across multiple tables, and the document specifies the constraints to be respected in order to identify ‘master-detail’ structures. In our Timesheet application, one person fills in several timesheets, over a long period of time, possibly for different customers. Each timesheet consists of several activities, which means that the table ‘Timesheet’ will likely be a ‘detail’ of the entities ‘Provider’ or ‘Customer’ and a ‘master’ of ‘Activities’.
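
A sketch of this grouping, using the foreign keys of our hypothetical Timesheet tables to attach each ‘detail’ table to its ‘master’, could look like the following; a certified counter might well group these tables differently.

```python
# Illustrative sketch: attaching 'detail' tables to their 'master' so that a group of
# physical tables is counted as a single logical data function. The table names and
# foreign keys are assumptions taken from the Timesheet example above.

foreign_keys = {
    # detail table -> master table, as derived from the foreign keys
    "TIMESHEET": "PROVIDER",    # a provider fills in several timesheets
    "ACTIVITY":  "TIMESHEET",   # a timesheet is made up of several activities
}

def logical_entity(table: str) -> str:
    """Walk up the master-detail chain and return the top-level master."""
    while table in foreign_keys:
        table = foreign_keys[table]
    return table

for table in ("PROVIDER", "TIMESHEET", "ACTIVITY"):
    print(f"{table} -> logical data function '{logical_entity(table)}'")
# In this sketch the three tables end up grouped under 'PROVIDER';
# a certified counter might make a different choice.
```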

The OMG standard recommends relying on naming conventions in order to identify the tables, views, indexes, keys, etc. that constitute these data structures and relationships.
This implies that:

  • These naming conventions exist.
  • They are used by the project team.
  • The Source Code Analysis (SCA) software is configured to recognize these rules.

This tool should also be able to analyze any type of file and its internal structure – tabular (flat files, CSV files, etc.) or tree-like (XML, HTML, etc.). Except that in this case we do not have keys or indexes to identify the logical data structures, and the software must make a choice – usually considering each file as a single logical entity – which will not necessarily be the choice made by an IFPUG certified consultant.

It is also essential to define whether a file is temporary or not, and whether its data is managed by the user. For example, a log file usable only by the programmer should be excluded from the scope of analysis.
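
Putting these last points together, a tool configuration could rely on a few naming rules and exclusion patterns, for example as sketched below. All the patterns are assumptions for the example, not rules from the standard or from a real tool.

```python
# Illustrative configuration sketch: naming conventions to recognize database objects,
# each data file kept as a single logical entity, and temporary or log files excluded.
# All patterns are assumptions for the example.

import re

NAMING_RULES = {
    "table": re.compile(r"^T_[A-Z_]+$"),     # e.g. T_TIMESHEET
    "view":  re.compile(r"^V_[A-Z_]+$"),     # e.g. V_MONTHLY_ACTIVITY
    "index": re.compile(r"^IX_[A-Z_]+$"),    # e.g. IX_TIMESHEET_DATE
}

EXCLUDED_FILES = re.compile(r"(\.log$|\.tmp$|^temp_)", re.IGNORECASE)

def classify_db_object(name: str) -> str:
    """Classify a database object from its name, or flag it for review by the SME."""
    for kind, pattern in NAMING_RULES.items():
        if pattern.match(name):
            return kind
    return "unknown (to be reviewed)"

def file_data_functions(filenames):
    """Keep one logical entity per data file, dropping temporary and log files."""
    return [f for f in filenames if not EXCLUDED_FILES.search(f)]

print(classify_db_object("T_TIMESHEET"))                    # table
print(file_data_functions(["invoices.csv", "debug.log"]))   # ['invoices.csv']
```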

Once the data structures are identified, we need to do the same with the transactions that manage this data through the layers of the application, and distinguish those that input data from those that read or output it, according to the operations they implement: ‘read’, ‘write’, ‘edit’ or ‘delete’. This means, as defined in the standard:

  • Capture transactions from the user interface.
  • Identify the user events that interact with the data layer.
  • Identify the user outputs created from these functions.

Identifying such transactions means that we must (a sketch follows this list):

  • Be able to find any link between all kinds of objects that participates in a path from the beginning to the end of the transaction.
  • Assemble the existing paths, in order to connect all of them and the operations on the data structures into transactions.
  • Declare and configure in the tool all the external elements that are not in the scope of the analysis but are necessary to reconstruct the transactions, or at least identify the missing items. In order to constitute a complete transaction, we may have to analyze external components, thus including them in the scope of the analysis while excluding them from the AFP counting.
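
Here is the sketch announced above: a minimal illustration of assembling transactions by following the links from a user entry point down to a data access, keeping the components declared as external in the graph so paths can be completed, but excluding them from the count. The components and links are invented for the example.

```python
# Illustrative sketch: assembling transactions by following the links from a user entry
# point down to a data access. Components declared as external stay in the graph so that
# paths can be completed, but they are excluded from the count. The graph is invented.

links = {
    # component -> components it calls (links found by the analysis)
    "timesheet_form.jsp":       ["TimesheetController.save"],
    "TimesheetController.save": ["TimesheetDAO.insert", "HrClient.getEmployee"],
    "TimesheetDAO.insert":      ["TABLE:TIMESHEET"],
    "HrClient.getEmployee":     ["EXTERNAL:HR_SYSTEM"],   # declared external, needed to close the path
}

def transaction_paths(entry, path=None):
    """Depth-first walk from a user entry point, yielding every complete path to a leaf."""
    path = (path or []) + [entry]
    targets = links.get(entry, [])
    if not targets:                     # leaf: a table, a file or an external system
        yield path
    for target in targets:
        yield from transaction_paths(target, path)

for p in transaction_paths("timesheet_form.jsp"):
    counted = not p[-1].startswith("EXTERNAL:")
    print(" -> ".join(p), "| counted:", counted)
```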

Automated Function Points: which measure?

As we can see, the OMG standard document specifies the requirements for the use of Automated Function Points, as well as its limitations. The points I think are most important to remember are the following.

Perimeter

The standard does not « address the sizing of enhancements to an application or maintained functionality (often called Enhancement Function Points) » and would therefore only apply to new developments. If this is the case, it greatly limits its use and interest.

AFP and IFPUG

The AFP standard does not claim strict compliance with a manual count of Function Points: « Automated Function Point counts may differ from the manual counts produced by IFPUG Certified Function Point counters ».

This seems to me an important first point: Automated Function Points are not IFPUG Function Points. It is another measure, which has the advantage of being computed automatically by a tool, and therefore with less effort than a manual count, but also with a different result.

Configuration

Counting AFP requires identifying all the data structures and transactions, assembling them into functional components, and deciding which are internal or external, and which to take into account or not, when setting up the tool and configuring the analysis. This assumes you have people available with a good knowledge of:

  • The application (an SME or an expert on the project).
  • The process of counting Automated Function Points, to determine the scope of analysis and the factors to be considered.
  • The tool and its settings, to configure the analysis, check false positives and validate the results in line with the two previous points.

This configuration phase is obviously critical if we want to achieve a result that is as objective as possible, and therefore as credible as possible when it comes to measuring the productivity of a team.

Analysis and validation

Counting Automated Function Points with a tool assumes that this tool is able to:

  • Analyze any kind of component.
  • Identify any link between these components.
  • Assemble all these links into transactions with as few false positives as possible.

But it is rare that a code analysis tool has a parser able to recognize and analyze any type of file for a given technology, let alone across different technologies. A tool may be able to recognize operations like ‘read’ or ‘write’ on a flat file in a Batch application, or to identify the different kinds of links between the XML files of a Java framework, and yet be unable to analyze an HTML page or an Excel report. An important feature of our example Timesheet application will be to produce activity sheets for validation before invoicing, usually in different formats: Excel, PDF, etc. I know of no code analysis tool that manages this type of file.

Finding all the links between components can be difficult or impossible for some technologies. The use of frameworks (Spring, Hibernate, etc.) complicates the analysis, and this means significant work to validate the results in order to avoid false positives as much as possible, and then to check the identified transactions and the Function Point count for each of them.

Conclusion

In conclusion, I think that Automated Function Points is a different measure, which produces different results from a manual estimation conducted by an IFPUG consultant. In an ideal situation, it would be great to have such a consultant participate in defining the scope of analysis, the settings of the tool and the validation of the results. This assumes that the tool is able to identify all the components, the links between them, the data structures, the transactions, etc.

Even in such an ideal case, I believe that the difference between a manual estimation and Automated Function Points is at least 10% to 20%, and more often between 40% and 50%. As a minimum. The difference could even reach 200% or 300%, for example for a Cobol Batch application (many flat files), an integrated software package (ERP) with different modules, or a framework that makes it impossible to clearly identify transactions, etc.

Automated Function Points: for whom?

Implementing AFP requires a significant effort, which must be repeated each time the application undergoes major changes in its features (assuming that this standard also applies to new versions).

This immediately excludes the use of AFP in a Continuous Integration cycle, where code analysis is done repeatedly to identify critical or blocking violations of best programming practices, in order, for example, to correct them before a Quality Gate or to prevent the inflation of the Technical Debt. So I do not see how developers might be interested in AFP and integrate it into a software development cycle.

This does not mean that AFP has no interest. Two IFPUG consultants manually estimating Function Points will often arrive at different results, whereas an automated measurement will produce the same result in a repeatable way. The implementation effort, even if it is high, will nevertheless be smaller than the investment required for a manual estimation, especially for large applications.

But to put it simply, this is another measure, one that requires a significant validation of all elements – inputs/outputs, internal/external, data structures and transactions, false positives, etc. – to be used as a measure of the functional size of an application.

I think it is reserved for experts in code analysis and SCA tools, with a good knowledge of applications and development practices across different technologies, and also a good knowledge of Function Points, when one wants to use it for benchmarking different applications and technologies.

I believe it is dangerous to use it to measure the productivity of teams or outsourcers: it will be easy to challenge the results produced in such a use case.

I think it can be useful as part of a retro-documentation effort prior to the transfer of an application to an outsourcer, or before a refactoring or a reengineering (porting the application to another technology). In these cases, anything that can improve the functional knowledge of the application, the estimation of test coverage and the evaluation of the workload will be welcome, even if the measure lacks precision.

Do you see other use cases for Automated Function Points?

 

(1) IFPUG: the International Function Point Users Group, which publishes the Function Point Counting Practices Manual (CPM).

This post is also available in Spanish (Leer este artículo en castellano) and in French (Lire cet article en français).

4 thoughts on “Do Developers Dream of Automated Function Points? (II)”

  1. Clayton Weimer

    It’s not just that most developers don’t understand FPA; it’s not something they want to be an expert at 😉

    The premise for Function Points:

    1) Measuring software’s value and productivity in terms of code constructs (like LOCs) is erroneous at worst and inconsistent across implementations at best.
    2) What’s needed is a “Unified Software Measuring Theory” which breaks down software into fundamental building blocks common to all systems that can be measured.
    3) Since we can’t really do that (the “common to all systems” part), let’s fake it with complicated guesstimation formulas based on estimates of things we can sort of measure.
    4) It’s not perfect, but it’s better than nothing and better than what has been used before.

    Ok, but this is like measuring real estate in Silicon Valley. My neighbor once asked me what it would take to build a new 80 foot fence 1 foot into my home’s property line. That was easy, I thought: I did some research on sold properties and figured the top, low and average lot-size prices per sq. foot using a spreadsheet, with various factors weighted for our neighborhood. With this I gave him a ballpark estimate and the formula/figures I used. My wife, who is a real estate agent, said that was one of the stupidest things I had done and nixed the whole deal without further discussion. There were a million other considerations (some not measurable, such as the fact that we had no need for a new fence) that I failed to consider.

    FPA tries to be a standard, formal way of measuring software size/value/productivity, but these cannot be measured any more accurately than domain-expert developers can do themselves, and to remain a domain expert you probably don’t have the time to spend whatever X amount of effort it takes to become a Certified IFPUG specialist.

    The different results may be attributed to the garbage in, garbage out factor. So, AFP does make sense, minus the “IFPUG consultants” that are not domain experts. In fact, isn’t taking LOC and backfiring them into FPs, using language tables built from historical data and analysis, an acceptable (though primitive) form of AFP?

  2. admin

    Sorry Clayton, your comment went straight into the ‘Spam’ box and I only just realized it.
    Have a nice weekend.

  3. Ian Alyss

    Thanks Jean-Pierre for reopening the comments. Now I can respond here, instead of on the Nesma forum. You state:

    Automated Function Points would concern only new developments, and not later versions of an application. We know that 80% or 90% of existing projects work on application maintenance and that new projects are pretty much in the minority.

    AFP tools work on static code, so they measure the size of the code as it is. That’s either before or after the maintenance. Changing the software is an activity you can describe in a change document, but not in the code. The code is either changed, or not. In order to measure the size of a maintenance project you would need two static instances of the code, one before and one after, and then analyze the differences. I’m not sure whether that is technically possible. So I would not judge the AFP community negatively for not making this miracle happen.

    You start this blogpost with an observation:

    In our previous post, we talked about Function Points, a metric usually not known by developers, and asked whether it could be useful to them. Our answer was rather negative, especially if we consider that such an estimate is performed manually by consultants who rely on a complex process. There are many certifications whose purpose is to validate that a consultant knows these concepts and how to implement them correctly. Not really the kind of thing developers usually find attractive. In my opinion, they would rather learn a new technology, a new language, the latest framework, an open source project, etc.

    I’m not sure whether developers are the right target group for Function Points. Function Points are used to describe the value of the software, in the sense that you can put a number on the amount of functionality it offers, the amount of time the development team has put or will put into building the software, or the amount of money you have to pay the supplier to get the software you want. Those types of value are usually not the things a developer is interested in.

    I think the problem with Function Points is that they need “technical” input (from a business perspective) that Business Analysts, Information Managers or Project Managers do not have or understand. If they don’t have or understand the input for Function Points, they will not buy the number that is supposed to represent the value of the software they are interested in.

  4. Jean-Pierre FAYOLLE (post author)

    Ian, thanks for your comment.

    Sure, you cannot have all the versions of the code in just one version. As you say, an SCA tool needs at least 2 versions to tell the differences in terms of lines of code (LOC), number of components, Cyclomatic Complexity, etc., and this is how they work technically. And it will be the same for AFP.

    Now, some tools are able to do incremental analysis, because if you have changed only 1 LOC in a million, the cost of reanalyzing 1 MLOC is not worth it for just 1 line that has been modified. But, as I mentioned in my post: “you cannot measure Function Points only in the modified code: you need to identify functional structures and transactions related to them, from the presentation layer up to the data layer. This is only possible by re-analyzing the whole application and calculating the difference in number of Function Points with a previous version”.

    So why do the OMG specifications state that this standard does not apply to « the sizing of enhancements to an application or maintained functionality (often called Enhancement Function Points) »? I am used to working with SCA tools, and I cannot imagine a technical impediment that would prevent an SCA tool from analyzing 2 versions and storing the results and differences.
    Or is it because the formula for Enhancement Function Points is different? Is there anything specific to the process of counting FP for enhancements? Or any other reason?

    I don’t know; the document is not precise about this limitation or the reasons behind it. Now, I would be happy to understand it and to see this point clarified, because most of the companies that will use AFP will do it at portfolio level, on dozens or hundreds of applications. As 90% of these applications will be under maintenance vs. 10% new projects, this seriously limits the interest of AFP in my opinion.

    I answered the 2nd part of your comment in the previous post here.

    I wish you a nice weekend.
    Jean-Pierre

