Every time I come across a procedure or code file with 1000 lines of code, I just want to sentence the creator to permanent programming abstinence. To write elegant and maintainable code there are a couple of things you'll want to consider (and learn, if you don't know them already). This article covers some of them, with a focus on Object Oriented Programming. First, let's find a good reason to do this. Why would we want to write elegant code? What is our motivation for code metrics?
- Improve software quality: Applying well-known practices to your code will (with high probability) make your software more stable and usable.
- Software Readability: Most software applications are not the sole creation of an individual. Working with teams is always a challenge; code metrics allow teams to standardize the way they work together and read each other’s code more easily.
- Code flexibility: Applications with a good balance of cyclomatic complexity and design patterns are more malleable and adaptable to small and big changes in requirements and business rules.
- Reduce future maintenance needs: Most applications need some sort of review or improvement during their lifetime. Coming back to code written a year ago will be easier if the code has good metrics.
To have a good maintainability index in your code you should keep track of a couple of metrics. Ultimately the maintainability index is what works for your specific case. This is not a fixed industry set of rules, but rather a combination of them that works for the requirements of your organization, application and team. Let’s take a look at what I PERSONALLY CONSIDER good metrics for the ultimate maintainability index of software applications:
Cyclomatic Complexity (CC)
- Very popular code metric in software engineering.
- Measures the structural complexity of a function.
- It is computed by counting the decision statements, and therefore the different code paths, in the flow of a function.
- Often correlated to code size.
- A function with no decisions has CC = 1, which is the best-case scenario.
- A function with CC >= 8 should raise red flags and you should certainly review that code with a critical eye. Always remember, the closer to 1 the better.
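To make the counting rule concrete, here is a rough sketch in Python (chosen only for brevity; the idea carries over to any language). It assumes that `if`, `for`, `while`, conditional expressions, `except` handlers and each extra `and`/`or` operand count as one decision each. Real tools may count a slightly different set of constructs, and the names `cyclomatic_complexity`, `STRAIGHT` and `BRANCHY` are made up for this illustration.

```python
import ast

# Node types treated as decision points in this sketch (an assumption, not a standard list).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate CC of a snippet: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1  # a function with no decisions has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit branches
            complexity += len(node.values) - 1
    return complexity


STRAIGHT = """
def area(w, h):
    return w * h
"""

BRANCHY = """
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
"""

print(cyclomatic_complexity(STRAIGHT))  # 1
print(cyclomatic_complexity(BRANCHY))   # 5: if + for + if + and, plus the base path
```

The second function is still well under the red-flag value of 8, but it shows how quickly every extra condition adds to the count.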
Lines of Code (LOC)
- LOC is the oldest and easiest metric to measure.
- Measures the size of a program by counting the lines of code.
- Some commonly recommended values (for .NET languages):
- Code Files: LOC <= 600
- Classes: LOC <= 1000 (after excluding auto-generated code and combining all partial classes)
- Procedures/Methods/Functions: LOC <= 100
- Properties/Attributes: LOC <= 30
- If the numbers for your application are larger than these values, you should check your code again. A very high count indicates that a type or a procedure is trying to do too much work and you should try to split it up.
- The higher the LOC numbers the harder the code will be to maintain.
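Counting lines is simple enough to sketch in a few lines of Python. This version assumes that blank lines and pure `#` comment lines are not counted, which is one common convention but not the only one; `count_loc`, `METHOD_LIMIT` and `SAMPLE` are illustrative names.

```python
def count_loc(source: str) -> int:
    """Count lines that are neither blank nor pure '#' comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


METHOD_LIMIT = 100  # the per-method threshold quoted in the list above

SAMPLE = """
# compute a total
def total(items):

    result = 0
    for item in items:
        result += item
    return result
"""

loc = count_loc(SAMPLE)
print(loc, "OK" if loc <= METHOD_LIMIT else "too long")  # 5 OK
```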
Depth of Nesting
- Measures the nesting levels in a procedure.
- The deeper the nesting, the more complex the procedure is. Deep nesting levels often lead to errors and oversights in the program logic.
- Avoid having too many nested logic cases. Look for alternatives to deeply nested if/then/else, for, foreach and switch statements; they lose their logical sense in the context of a method when they run too deep.
- Reading: Vern A. French on Software Quality
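One common alternative to deep nesting is the guard clause: handle the edge cases first and return early, so the main path stays at a single level. The sketch below is in Python and uses a hypothetical `order` dictionary with `items` and `country` keys; both functions are meant to return the same result for every input.

```python
def shipping_cost_nested(order):
    # Three levels of nesting: each condition buries the real work deeper.
    if order is not None:
        if order["items"]:
            if order["country"] == "US":
                return 5.0
            else:
                return 15.0
        else:
            return 0.0
    else:
        return 0.0


def shipping_cost_flat(order):
    # Guard clause first: nothing to ship means nothing to pay.
    if order is None or not order["items"]:
        return 0.0
    if order["country"] == "US":
        return 5.0
    return 15.0


sample = {"items": ["book"], "country": "US"}
print(shipping_cost_nested(sample), shipping_cost_flat(sample))
```

The flat version never goes deeper than one level, and each rule can be read on its own.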
Depth of Inheritance (DIT)
- Measures how many classes lie between a given class and the root of its class hierarchy.
- Classes that are located too deep in the inheritance tree are relatively complex to develop and maintain. The more of those you have the more complex and hard to maintain your code will be.
- Avoid too many classes at deeper levels of inheritance. Avoid deep inheritance trees in general.
- Maintain your DIT <= 4. A value greater than 4 will compromise encapsulation and increase code density.
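As an illustration, DIT can be measured by walking up the base classes. The sketch below is in Python and assumes the root (`object`) has depth 0 and that, with multiple inheritance, the longest path counts. Tools differ on whether the root itself is counted, so treat the exact numbers as dependent on that convention. The shape classes are hypothetical.

```python
def depth_of_inheritance(cls: type) -> int:
    """Longest path from cls up to the root `object` (object itself is depth 0)."""
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)


# A hypothetical hierarchy that grows one level too deep.
class Shape: pass
class Polygon(Shape): pass
class Quadrilateral(Polygon): pass
class Rectangle(Quadrilateral): pass
class Square(Rectangle): pass


MAX_DIT = 4  # the limit suggested above

for cls in (Shape, Polygon, Quadrilateral, Rectangle, Square):
    dit = depth_of_inheritance(cls)
    print(cls.__name__, dit, "OK" if dit <= MAX_DIT else "too deep")
```

Under this convention `Square` comes out at 5, so it is the one class in the hierarchy that would be flagged.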
Coupling and Cohesion Index (Corey Ladas on Coupling)
- To keep it short, you always should: MINIMIZE COUPLING, MAXIMIZE COHESION
- Coupling (aka “dependency”) is the degree of reliance a functional unit of software has on another unit of the same category (types on types, DLLs on DLLs).
- The 2 most important types of coupling are Class Coupling (between object types) and Library Coupling (between DLLs).
- High coupling (BAD) indicates a design that is difficult to reuse and maintain because of its many interdependencies on other modules.
- Cohesion: Expresses how coherent a functional unit of software is. Cohesion measures the SEMANTIC strength between components within a functional unit. (Classes and Types within a DLL; properties and methods within a Class)
- If the members of your class (properties and methods) are strongly-related, then the class is said to have HIGH COHESION.
- Cohesion has a strong influence on coupling. Systems with poor (low) cohesion usually have poor (high) coupling.
- Design Patterns play an extremely important role in the architecture of complex software systems. Patterns are proven solutions to problems that repeat over and over again in software engineering. They provide the general arrangement of elements and communication to solve such problems without doing the same twice.
- I can only recommend one of the greatest books on this subject: “Design Patterns: Elements of Reusable Object-Oriented Software” by the Gang of Four. Look it up online, buy it, read it and I guarantee you’ll never look at software architecture and design the same way again.
- Depending on the needs of the problem, applying proven design patterns like Observer, Command and Singleton patterns will greatly improve the quality, readability, scalability and maintainability of your code.
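To show how a pattern can help with coupling, here is a minimal Observer sketch in Python. All class names (`Thermostat`, `Display`, `Alarm`) are hypothetical. The point is that the subject depends only on "something callable", not on the concrete listener types, so new listeners can be added without touching it, while each class keeps a single, cohesive responsibility.

```python
from typing import Callable, List


class Thermostat:
    """Subject: knows nothing about its listeners except that they are callables."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[float], None]] = []

    def subscribe(self, listener: Callable[[float], None]) -> None:
        self._listeners.append(listener)

    def set_temperature(self, value: float) -> None:
        # Notify every subscriber; the subject never names a concrete type.
        for listener in self._listeners:
            listener(value)


class Display:
    """Observer with one job: format the latest reading."""

    def __init__(self) -> None:
        self.last_shown = None

    def show(self, value: float) -> None:
        self.last_shown = f"{value:.1f} C"


class Alarm:
    """Observer with one job: compare the reading against a limit."""

    def __init__(self, limit: float) -> None:
        self.limit = limit
        self.triggered = False

    def check(self, value: float) -> None:
        self.triggered = value > self.limit


thermostat, display, alarm = Thermostat(), Display(), Alarm(limit=30.0)
thermostat.subscribe(display.show)
thermostat.subscribe(alarm.check)
thermostat.set_temperature(31.5)
print(display.last_shown, alarm.triggered)  # 31.5 C True
```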
Triex Index (TI - Performance Execution)
- Triex Index (TI) measures the algorithmic time complexity (Big O) of object types.
- TI matters most for object types that carry a large load of analysis, like collections and worker objects.
- This metric is strongly influenced by Cohesion and should only be applied to classes and types, not to individual members. How cohesive a type is also affects how consistent the time complexity is across its members.
- This is a personal metric. I haven't found this type of metric elsewhere, but I think it's singular and significant to the overall elegance and performance of your code.
- The Triex Index is calculated by dividing the product of the execution orders (Big O) of all members of a class by n to the power of c-1, where c is the number of members in the class/type.
- TI > n^2 - Bad. Needs algorithm revision
- n < TI < n^2 - Regular.
- TI <= n - Good
- If a member of a class is hurting the overall TI of the class, try splitting its logic into one or more methods with less costly execution orders. Be careful not to harm the coupling and cohesion metrics of the type while doing this step.
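The calculation above can be sketched in Python under one simplifying assumption: every member has a polynomial cost O(n^e), so it can be represented by its exponent e (0 for constant time, 1 for linear, 2 for quadratic). Multiplying the orders then means adding the exponents, and dividing by n^(c-1) means subtracting c-1. Logarithmic factors are not modelled, and because the thresholds above do not say how to rate a TI of exactly n^2, this sketch treats it as Regular. The function names are made up for the example.

```python
def triex_exponent(member_exponents):
    """Return k such that TI = n**k, given each member's cost as O(n**e)."""
    c = len(member_exponents)
    # product of n**e over all members = n**sum(e); then divide by n**(c - 1)
    return sum(member_exponents) - (c - 1)


def triex_rating(member_exponents):
    k = triex_exponent(member_exponents)
    if k <= 1:
        return "Good"      # TI <= n
    if k <= 2:
        return "Regular"   # n < TI, up to n^2 (the n^2 edge is an assumption)
    return "Bad"           # TI > n^2


# Hypothetical collection type: add is O(1), contains is O(n), sort is O(n^2).
print(triex_rating([0, 1, 2]))    # Good: n^3 / n^2 = n
# Hypothetical worker type whose three members are all quadratic.
print(triex_rating([2, 2, 2]))    # Bad: n^6 / n^2 = n^4
```

Note how a single quadratic member is tolerated when the rest of the type is cheap, while a type made up entirely of quadratic members is flagged.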
Just as I have my preferences, I have my disagreements with some colleagues when it comes to other well-known code metrics. I consider some of these code metrics a waste of time when measuring code elegance and maintainability:
- Law of Demeter (for Functions and Methods)
- The Principle of Least Knowledge is a design guideline I’ll always encourage people to use. What I consider unnecessary is enforcing the “use only one dot” rule for functions and methods.
- “a.b.Method()” breaks the law, where “a.Method()” does not. That's just ridiculous.
- Weighted Methods Per Class (WMC)
- Counts the total number of methods in a type.
- Number of Children
- Counts the number of immediate children in a hierarchy.
- Constructors defined by class
- Counts the number of constructors a class has.
- Kolmogorov complexity
- Measures the computational resources needed to represent an object.
- Number of interfaces implemented by class
- Counts the number of interfaces a class implements.
There are 2 main reasons I do not like to consider these metrics when looking at maintainability and code elegance:
- Many of them are obsolete metrics when we look at modern software engineering techniques, languages and features like C# and LINQ. Things like methods per class or number of children do not apply very well to the core concepts of these modern techniques. Just imagine measuring the “Weighted Methods Per Class” in a world full of extension methods that ultimately do not depend on the original creator of the object type.
- The second reason is that the complexity of the problem never changes. If we have to solve a problem that is complex by nature, it doesn’t matter how many constructors or methods we have, or whether the functions abide by the Law of Demeter. That is irrelevant if the solution does not solve the problem. The complexity of any given problem is a constant; the only way out is to change your perspective on the problem and abstract as much complexity as possible. When you abstract a complex problem, you end up with a large number of abstractions. Counting them is meaningless, BECAUSE THE PROBLEM IS A COMPLEX ONE and ITS COMPLEXITY WILL NOT CHANGE.
It’ll be extremely hard to come up with good numbers for all of these metrics. Yet, you should know about them and the reason for their existence. Then, when a developer says “Holy cow, I don’t understand the logic of this method, it has too many if then else, I’m lost.” you’ll say “Aha! That method may have a high CC” and go straight to the problem and solve it. Applying metrics to your source code is not magic; you still own what you write, and ultimately you have to know your business well to have elegant code. But hopefully with the help of these metrics and a couple of tools you'll make your code work like a charm and look neat as a pin.