Skip to main content

The Developer’s Guide to Clean Code: Tips and Techniques

What is clean code?

Clean code is a term used to describe code that is easy to read, understand, and maintain. It is written in a way that makes it simple, concise, and expressive. Clean code follows a set of conventions, standards, and practices that make it easy to read and follow. Here are some signs indicating that the code is not clean:

1. Poor names

  • The name is not clear to understand, meaningless, or misleading. It doesn't reveal the intention of what it want to achieve. Consider the following examples:
SqlDataReader drl;
int od;
void Button1_Click();
Class Pages1

In the examples above, it’s challenging to get the purpose of drl, od, or what Button1_Click() does. To enhance clarity, we can rename these identifiers as follows:

SqlDataReader dataReader/reader;
int  overdueDays;
void CheckAvailability_Click();
Class ViewCustomerPage {}

  • Ambiguous names
int? incidentNameId for instance. incidentNameId lacks clarity because if it represents the ID of an incident, then the inclusion of “Name” is misleading. Moreover, if it is indeed an ID, it should not be nullable.

A well-chosen variable name should be:

  • Concise and descriptive: Avoid names that are overly long or too cryptic.
  • Meaningful: The name should clearly reflect its purpose and content.
  • Intention-revealing: It should be evident what the variable represents and how it’s used.
  • Domain-specific: Select names that resonate with the problem domain and are familiar to those within the field.

2. Poor naming conventions

In C#, it’s essential to adhere to standard naming conventions to ensure code clarity and consistency. The example below demonstrate a mix of conventions, which can lead to confusion:
Customer GetCustomer (int _id);
Customer saveCustomer (Customer Customeer);
private int customerId;

The first method GetCustomer uses PascalCase, which is correct for methods, but the parameter _id incorrectly includes an underscore. The second method saveCustomer incorrectly uses camelCase for the method name. The third example, customerId, is a private field that should start with an underscore to differentiate it from public members.

To align with C# development standards, a Class, Property, and Method use PascalCase, while private field, method parameter, and local variable we use camelCase.

3. Poor Method Signatures

Consider this method void Parse(int command): The term “parse” typically implies the conversion of one data type to another, such as parsing a string into a DateTime object. However, the return type void indicates that this method does not return any value, which contradicts the expected behavior of a parsing operation.

To ensure method signatures are clear and meaningful, we should carefully check:

  • The return type: It should reflect what the method actually returns. 
  • The method name: It should accurately describe the action the method performs.
  • The parameters: They should be named and typed in a way that clearly indicates what inputs the method expects.
A better method signature might be: 
DateTime ConvertToDateTime(string command);

4. Long parameter list

CheckNotifications(null, 1, true, false, DateTime.Now)

As you can see CheckNotifications has too many parameters and without context, it’s unclear what null, 1, true, false, or even DateTime.Now represent.

To make the code clear, we can consider the following practices:

  • Limit number of parameters to 3 to minimize the complexity of the code.
  • Reduce number of parameters by creating a field for each parameter or a class to encapsulate relevant parameters and pass a new instance of the class with a meaningful name as parameter.

Applying these practices, we can refactor the method as follows:

public class NotificationCriteria
{
    public User User { get; set; }
    public int Priority { get; set; }
    public bool IncludeArchived { get; set; }
    public bool ExcludeDismissed { get; set; }
    public DateTime AsOfDate { get; set; }
}

var criteria = new NotificationCriteria
{
    User = currentUser,
    Priority = 1,
    IncludeArchived = true,
    ExcludeDismissed = false,
    AsOfDate = DateTime.Now
};
CheckNotifications(criteria);

5. Variable declarations on the top

In coding, opinions may vary on where to declare variables, but a common best practice is to declare them as close as possible to where they are used. This practice can significantly improve the readability and maintainability of the code. Consider the following method:

public decimal CalculateEmployeePay(decimal regularHoursWorked, decimal overtimeHoursWorked)
{
    decimal overtimeHours = 0; // Declared at the top
    decimal regularPay = 0; // Declared at the top
    regularPay = regularHoursWorked * hourlyRate;
	// Assume there are lots of code in beetween
    if (overtimeHoursWorked > 0)
    {
        overtimeHours = overtimeHoursWorked;
    }
    decimal totalPay = regularPay + (overtimeRate * overtimeHours);
    return totalPay;

}

In the example above, overtimeHours and regularPay are declared at the beginning of the method, which can lead to a cluttered list of variables and make it harder to track their usage throughout the code. 

To improve this code and eliminate the “code smell,” we should move the declarations of overtimeHours and regularPay closer to where they are used. By doing so, we provide immediate context for each variable, reducing the waste of reader's brainpower and making the code cleaner and more intuitive. Using this practice not only simplifies the code but also helps other developers who may work on it, allowing them to understand the code’s logic with minimal effort.
public decimal CalculateEmployeePay(decimal regularHoursWorked, decimal overtimeHoursWorked)

{
    decimal regularPay = regularHoursWorked * hourlyRate;
    // Imagine there are a large codebase in between
    decimal overtimeHours = 0; // Declared closer to where it's used
    if (overtimeHoursWorked > 0)
    {
        overtimeHours = overtimeHoursWorked;
    }

    decimal totalPay = regularPay + (overtimeRate * overtimeHours);
    return totalPay;
}

6. Magic number

In the example below status == 1 is considered as a magic number which is ambiguous and make it hard to understand what specific numbers like 1 or 2 represent:

public void ApproveDocument(int status) {
    if(status ==1) {
    //..
    }
    else if(status == 2) {
    }
}

To eliminate this ambiguity and enhance code readability, we can refactor the code by using named constants or an enumeration. This not only makes the code more meaningful but also easier to maintain:

public void ApproveDocument(int status) {
// We can also extract these two variables into a separate file as enum
    const int draft = 1;
    const int lodged = 2;
    if(status == draft) {

    }
    else if(status ==  lodged) {
    }
}

7. Nested Conditionals

Nested conditionals, also known as deeply nested if statements, are considered a “code smell” because:

  • Readability: Deeply nested code is difficult to read. As the level of nesting increases, the complexity of understanding the code also increases.
  • Maintainability: It’s easy to introduce bugs when working with nested conditionals. A small change in one condition can have unexpected effects in other places.
  • Testability: Each branch in a conditional statement represents a separate path the code execution can take. The more branches, the more tests you need to write to achieve good coverage.
Consider an example below:

public string ProcessData(string data)
{
    if (data != null)
    {
        if (data.Length > 0)
        {
            if (data[0] == 'A')
            {
                return "Data starts with A";
            }
            else
            {
                return "Data does not start with A";
            }
        }
        else
        {
            return "Data is empty";
        }
    }
    else
    {
        return "No data provided";
    }
}

As you can see, the nested if statement looks very confusing. To simplify this we can use “guard clauses,” which return early and reduce the need for nested else clauses. This pattern is a best practice to minimize the level of nesting, making the code more readable, understandable, and maintainable. It also simplifies testing.

public string ProcessData(string data)
{
    if (data == null)
    {
        return "No data provided";
    }
    if (data.Length == 0)
    {
        return "Data is empty";
    }
    if (data[0] == 'A')
    {
        return "Data starts with A";
    }
    return "Data does not start with A";
}

8. Switch statements

The problem when using the switch statement is if the statement is subject to change in the future for instance in the example below the switch statements handles bird type and if we want to add more bird types in the future which means we have to modify the codes this violate the open-closed principle in SOLID principle and also makes switch statement longer.
public class Bird
{
    private readonly BirdType birdType;
    public Bird(BirdType type)
    {
        birdType = type;
    }
    public BirdColor GetColor()
    {
        switch (birdType)
        {
            case BirdType.Cardinal:
                ¨// return 
            case BirdType.Goldfinch:
                // return
            case BirdType.
            // More cases for other bird types...
            default:
                throw new InvalidOperationException("Unknown bird type.");
        }
    }
}
To resolve this issue, we can refactor the code by creating an abstract Bird class and then subclassing it for each bird type. Each subclass will implement its own GetColor method. This way, when a new bird type is introduced, we simply create a new subclass without modifying the existing code.
public abstract class Bird
{
    public abstract BirdColor GetColor();
}
public class CardinalBird : Bird
{
    public override BirdColor GetColor()
    {
        return ...;
    }
}

public class GoldfinchBird : Bird
{
    public override BirdColor GetColor()
    {
        return ...;
    }
}

public class ChickadeeBird : Bird
{
    public override BirdColor GetColors()
    {
        return ...;
    }
}

9. Duplicated code

Duplicated code is considered a bad practice in clean coding for several reasons:

  • Maintenance: If a bug is found in a piece of duplicated code, it needs to be fixed in every place where the code is duplicated. This can be time-consuming and error-prone.
  • Readability: Duplicated code can make a program harder to read and understand, which can slow down development and increase the likelihood of errors.
  • Efficiency: Duplicated code can lead to larger codebase, which can slow down development and make the codebase harder to manage.

To handle duplicated code we can for example create a separate function containing the codes that are used in multiple places and call this function inside each class. This makes the code easier to read, maintain, and modify. It also reduces the risk of introducing bugs when changes are made. This is an example of the DRY (Don’t Repeat Yourself) principle, a fundamental concept in clean coding.

10. Comments

Comments are a crucial part of programming, but they should be used judiciously.
When comments are needed:
  • Complex code: If the code is complex and hard to understand, comments can be used to explain what the code does.
  • Important decisions: If a particular decision was made during development that might not be immediately obvious, it’s good to document this in a comment.

When comments are considered bad practice:
  • Obvious code: Comments should not state the obvious. If the code is self-explanatory, comments are unnecessary.
  • Outdated comments: Comments that are not updated along with the code can be misleading and cause confusion.
  • Substitute for bad code: Comments should not be used as a crutch to justify bad code. Instead of writing a comment to explain what a complex piece of code does, it’s better to refactor the code to make it more readable.
  • Too much information: Comments should be concise and to the point. Including too much information in a comment can make it harder to understand.
  • Dead Codes: Dead code makes the codebase harder to read and understand. It can slow down development and lead to confusion for other developers who may spend time trying to understand why the the is there. While modern compilers often optimize out dead code, it can still contribute to slower compile times and larger binary sizes.

11. Long Methods

It's common that you may have bumped into a codebase with hundreds or thousands lines of code in a single method. This is considered a bad practice in clean code for several reasons:
  • Hard to read and understand. When a method is short, it’s easier to see at a glance what the method does.
  • Harder to maintain. If a method does too many things, making changes can be risky because it’s easy to inadvertently affect other functionality.
  • Short methods that do one thing well are easier to reuse in other parts of the codebase.
  • Short methods are easier to test because they typically have fewer paths through the code and fewer edge cases to consider.

To handle long method problem, we can for example apply the Single Responsibility principle in SOLID principle to reduce the length of the methods. It ensures that each method in your codebase is responsible for one thing, you can make your code more understandable, maintainable, and flexible

Comments

Popular posts from this blog

Maximizing Efficiency: The Power of Database Indexing

What is database performance? There are two main aspects of database performance: response time and throughput . Response time is the total time it takes to process a single query and returns result to the user. It's critical metrics because it directly impacts the user's experience, especially in applications where fast access to data is essential. The response time includes CPU time (complex queries will require more computational power and increase processing time), disk access, lock waits in multiple-user environment (more about database transaction ), network traffic. Throughput refers to how many translations the system can handle per second (TPS). A transaction could include different activities to retrieve and manipulate data. A single command like SELECT, INSERT, UPDATE, DELETE or a series of commands could be used to trigger these activities. If you’re running an e-commerce site, a single transaction might include checking the inventory, confirming the payment, and...

LINQ - Deferred Execution

Deferred Execution means that queries are not executed immediately at the time it's being created. The benefit of this is to improve performance by avoiding unnecessary executions. It also allows the query to be extendable when needed such as sorting, filtering. In the example below, the queries to retrieve courses are not being executed yet var context = new DbContext(); var courses = context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name); The query is only executed when one of the following scenarios occurs: Iterating over query variable Calling ToList, ToArray, ToDictionary Calling First, Last, Single, Count, Max, Min, Average For example when we loop through the course or turn the result to a list: foreach ( var c in courses) Console.WriteLine(c.Name); OR context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name).ToList();

Solid Principles for Software Design - Part 2

Interface Segregation Principle The Interface Segregation Principle (ISP) is one of the five SOLID principles of object-oriented design, which recommends that "Clients should not be forced to depend on interfaces they do not use". This means we should avoid implementing an interface that has unnecessary methods and therefore not going to be implemented.  Some signs of violating ISP are: Having a "fat" interface, which means having a high number of methods in one interface that are not so related to each other or low cohesion. Empty implementation of methods, certain methods of interface are not needed for implementation. Considering the following example, we violate the principle because CannonPrinter is designed only with the functionality to print, leaving the scan and fax method unimplemented. interface IMultiFunction {      void print(); void scan(); void fax(); } public class HPPrinterNScanner implements ImultiFunction { @Override public void pr...