I’ve been meaning to make a brief note on software testing for a while now. Originally this article was going to be titled “Why UX testing is so important,” but I realised that I was just trying to explain the importance of collaboration at the design phase of the software lifecycle.
As a developer, I find a lot of value in unit testing because it lets me firmly define how I want my software to work before it’s even written, even though I more commonly hack it together first and test it afterwards (so naughty).
This is a great thing. Unit tests prove that something works. But they only matter so much, and mostly to developers: they show that the code behaves as advertised and as intended, and that it isn't littered with bugs. Think of them as a first line of defence before your software reaches its user.
But that’s definitely not the only kind of testing you want to have in place, unless what you’ve written is truly only for other developers to use, like a library. Quality assurance testing is absolutely key to ensuring that the product is complete and, of course, of an acceptable quality (though I believe we should strive for quality beyond expectations). We have a fantastic team of testers at Claromentis, and the product would suffer an immense loss without their thorough exploration of our systems and their keen sense of user experience.
This leads me on to user experience (UX) testing, which is the most important kind of testing in my eyes. Sit down at least five potential users (or peers following a user story) that haven’t seen the system before, and get them to use it before you've even started building it, and you will learn so much about what you've just designed: whether it makes any sense, whether it’s intuitive to use, how long it takes a user to reach their goal, any inconsistencies, design choices that slow a user down... assume the list goes on forever.
Having this kind of information at such an early stage in a product’s life is incredibly useful because you can build it with the best user experience in mind from the beginning, which is far more valuable than changing a product that has already been built.
But one designer or one tester is not enough to make sure you pick up on everything you might need to know, which is why I want to stress the importance of collaborative software design, and of testing that design.
Because there are so many different teams and areas of expertise that go into producing the software, it makes sense to involve them in the design process so that they can pick up on things that they know could be a problem in their part of the production, even if it’s just a cursory glance. But it has to be done before you start building it! QA testers are completely included here, because their expertise helps develop a product to its most issue-free state, and they’ll spot things that a developer or a designer might miss.
In the end the code will never be as important as the user’s experience, so work that out as early as you can. Quality software architecture certainly matters, but only because it allows developers to more easily and efficiently contribute to a better user experience.
How dependent is your code, and what does it communicate to your peers?
Software coupling and dependency are very popular topics among the PHP community. Conferences are awash with talks on avoiding the dependency trap and producing reusable code.
As we strive to improve our knowledge as developers, I think it's important to be aware not only of the impact that coupling has on future-proofing software, but also of what it communicates about the purpose of our code to other developers.
So what does strongly coupled code look like and how can we mitigate it?
Global variables & duplication
    class Foo
    {
        public function save($data)
        {
            global $db;

            $query = new QueryInsert('foo', $data);
            $db->query($query);
        }
    }
This class has a single method that saves some data to a database. Why is it tightly coupled?
global $db; assumes the existence of a global variable and tells us nothing about its implementation. We could add an annotation that signifies to developers the type of the variable.
    /**
     * @var DAL\Db
     */
    global $db;
This says more to the developer but doesn't alleviate the dependency on the existence and initialisation of this variable in the first place. Also, we might need the database in other methods.
    public function load($id)
    {
        global $db;

        $query = new Query('SELECT * FROM foo WHERE id = int:id', $id);
        $result = $db->query($query);

        return $result->fetchArray();
    }

    public function save($data)
    {
        global $db;

        $query = new QueryInsert('foo', $data);
        $db->query($query);
    }
We can see that the database is a dependency of the entire class, not just a single method. In case you didn't notice, so is the string 'foo'. This is a table that the class needs to access frequently. If this needs to change, currently we need to change it in multiple places. How do we improve this?
    class Foo
    {
        protected $database;
        protected $table = 'foo';

        public function __construct(DAL\Db $database)
        {
            $this->database = $database;
        }

        public function load($id)
        {
            $query = new Query("SELECT * FROM $this->table WHERE id = int:id", $id);
            $result = $this->database->query($query);

            return $result->fetchArray();
        }

        public function save($data)
        {
            $query = new QueryInsert($this->table, $data);
            $this->database->query($query);
        }
    }
By using a constructor we have defined what this class needs in order for it to work, and there is no dependency on externally defined variables. We could also improve the reusability of the class by making the table name mutable in some way.
What we've made use of here is known as dependency injection. Instead of actively retrieving the database when we need it, we've stated that it needs to be given to the class.
With this in place, there is no longer any looming doubt about the reliability or existence of a global variable, and the functionality offered by the class has become contained and predictable. You know that if this code is being run, a $this->database property has to be available.
Encapsulation & responsibility
I've seen cases where the database is used directly in classes that don't necessarily have anything to do with storage or persistence; they instead just represent some logic for a certain application or system.
    class FooController
    {
        public function __construct(DAL\Db $database)
        {
            $this->database = $database;
        }

        public function index(Request $request)
        {
            if ($request->method('post')) {
                $this->database->query(new QueryInsert('foo', $request->post('data')));
            }

            $data = $this->database->query('SELECT * FROM foo WHERE id = int:id', $request->get('id'));

            return $this->render('my_template.html', $data);
        }
    }
In situations like these, it makes a class easier to change (and to test) when you delegate data persistence to another class (like the one above) and instead require this as a dependency.
    class FooController
    {
        public function __construct(Foo $foo)
        {
            $this->foo = $foo;
        }

        public function index(Request $request)
        {
            if ($request->method('post')) {
                $this->foo->save($request->post('data'));
            }

            $data = $this->foo->load($request->get('id'));

            return $this->render('my_template.html', $data);
        }
    }

This relieves the class of multiple responsibilities like persisting the data and defining behaviour around it. It's no longer aware of how to implement data persistence because it doesn't need to be.
However, while this has relieved us of the implementation details of using the database, the class is still coupled to the database because the controller class will only work with the Foo implementation or anything that inherits its implementation.
We can completely decouple the controller from the database using an interface.
    interface FooInterface
    {
        public function load($id);

        public function save($data);
    }
Then we can just change the typehint in the controller's constructor and the code will still work when given the concrete Foo class.
    public function __construct(FooInterface $foo)
    {
        $this->foo = $foo;
    }
Moreover, it will work with anything that implements this interface, whether it's persisting data to a file or the database. We've also made it very easy to write unit tests for the class, because the dependencies we need to provide are clear and unavoidable.
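To illustrate how the interface helps with testing, here is a minimal sketch. InMemoryFoo is a hypothetical implementation that keeps rows in an array, and the controller is a simplified stand-in with store() and show() methods that take plain values rather than a Request, so that the example runs on its own. None of these names come from a real library.

```php
<?php

// The interface from the article.
interface FooInterface
{
    public function load($id);

    public function save($data);
}

// Hypothetical in-memory implementation: no database or file system needed.
class InMemoryFoo implements FooInterface
{
    private $rows = array();

    public function load($id)
    {
        return isset($this->rows[$id]) ? $this->rows[$id] : null;
    }

    public function save($data)
    {
        // Assumes each record carries its own 'id' key.
        $this->rows[$data['id']] = $data;
    }
}

// Simplified stand-in for the controller, depending only on the interface.
class FooController
{
    private $foo;

    public function __construct(FooInterface $foo)
    {
        $this->foo = $foo;
    }

    public function store(array $data)
    {
        $this->foo->save($data);
    }

    public function show($id)
    {
        return $this->foo->load($id);
    }
}

$controller = new FooController(new InMemoryFoo);
$controller->store(array('id' => 7, 'name' => 'example'));

$row = $controller->show(7);      // the record that was just stored
$missing = $controller->show(8);  // nothing stored under this id
```

Because the controller only knows about FooInterface, a test can hand it the in-memory version and never touch a real database.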
Dependency injection containers
Now that we've established how to extract our dependencies from our classes, what should we do from outside? Instantiating everything manually per request could become quite a chore.
This is where dependency injection containers become useful. They allow you to define all of the services for your application, including their dependencies, in one place.
Let's take a look at how you might implement this using Silex and its simple dependency injection container called Pimple.
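    $app = new Silex\Application;

    // Foo needs a database, so it is defined with a closure too.
    // This assumes a 'database' service has been registered elsewhere.
    $app['foo'] = function ($app) {
        return new Foo($app['database']);
    };

    $app['controller.foo'] = function ($app) {
        return new FooController($app['foo']);
    };

    /**
     * @var FooController
     */
    $controller = $app['controller.foo'];

    // $request is assumed to be the current Request object
    $response = $controller->index($request);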
Pimple uses strings to identify services in the application.
To define services and their dependencies, you can provide a closure as a service definition, which is executed every time you request that service. This is particularly effective performance-wise too; only the services you make use of will be instantiated. This is called lazy-loading.
The application is injected into these closures so that you can use other services as dependencies for the one you are defining.
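The lazy behaviour described above can be sketched without Pimple at all. TinyContainer below is a hypothetical class written for this illustration, not Pimple's implementation. It stores definitions and only runs a closure when the service is requested, passing itself in so that the closure can pull in other services.

```php
<?php

// Hypothetical stand-in for a container; not Pimple's real code.
class TinyContainer
{
    private $definitions = array();

    public function set($id, $definition)
    {
        $this->definitions[$id] = $definition;
    }

    public function get($id)
    {
        $definition = $this->definitions[$id];

        // Closures are treated as factories and run on request;
        // anything else is returned as-is.
        if ($definition instanceof Closure) {
            return $definition($this);
        }

        return $definition;
    }
}

// Records which factories have actually been executed.
$created = array();

$container = new TinyContainer;
$container->set('greeting', 'Hello');
$container->set('mailer', function ($c) use (&$created) {
    $created[] = 'mailer';

    return new ArrayObject(array('greeting' => $c->get('greeting')));
});
$container->set('reporter', function ($c) use (&$created) {
    $created[] = 'reporter';

    return new ArrayObject();
});

$createdBeforeUse = $created;         // nothing has been built yet
$mailer = $container->get('mailer');  // only now does the mailer closure run
$createdAfterUse = $created;          // 'reporter' was never requested
```

Defining a service costs almost nothing here. The work only happens for the services a request actually asks for.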
The beauty here is that we can now swap the foo service for anything else that implements the interface we defined.
    class OtherFoo implements FooInterface
    {
        // ...
    }

    $app['foo'] = new OtherFoo;
And our controller will still work absolutely fine. This is software design that enables painless substitution of different components, at its simplest.
Service locator pattern & responsibility
When your classes depend on a lot of things, it's tempting to inject the entire container into your class in different ways and access services from it directly.

    class FooController
    {
        public function __construct(Application $app)
        {
            $this->app = $app;

            // or
            $this->foo = $app['foo'];
            $this->logger = $app['logger'];
            $this->database = $app['database'];
        }

        // or
        public function Show(Request $request, Application $app)
        {
            $foo = $app['foo'];
            $bar = $this->app['bar'];
            // ...
        }
    }
While this seems tidier by reducing the number of method parameters, it actually starts communicating the wrong thing about the class by hiding its dependencies. This is known as the service locator pattern.
This means you have to understand how the class works before you understand what it needs in order to work. It cannot be a black box and it cannot be tested as such.
If you were writing unit tests for the above, you'd have to read the implementation to know what services it uses. If we use the constructor and explicitly state the dependencies, we don't need to worry about how it's implemented. We just need to prove that it does what it says on the tin.
The same applies to the Show() method above. The $request is specific to that method, but it doesn't need the entire application, just a few of its services. In fact, many other methods may need these as well. Distinguish what is a dependency of the class and of the method, even if not every method uses every dependency.
If your constructor parameters are becoming too numerous...
Then it helps to reconsider whether your class has too many responsibilities, and whether each dependency is truly required or merely optional.
A good example of an optional dependency is a logger. For this, you could use a setter method, and in the constructor simply use a NullLogger class that does nothing.
Also, if you can identify a component of behaviour among a selection of these many dependencies, you can encapsulate the behaviour in another class.
    class BooProvider
    {
        public function __construct(FooProvider $foo, BarProvider $bar, BazProvider $baz)
        {
            $this->foo = $foo;
            $this->bar = $bar;
            $this->baz = $baz;
        }

        public function doSomething()
        {
            // Voila...
            $fooSomething = $this->foo->doSomething();
            $this->bar->doSomething($fooSomething);
            $this->baz->doSomethingElse($this->foo);
        }
    }

    class FooController
    {
        public function __construct(Database $database, BooProvider $boo)
        {
            $this->database = $database;
            $this->logger = new NullLogger;
            $this->boo = $boo;
        }

        public function setLogger(LoggerInterface $logger)
        {
            $this->logger = $logger;
        }

        public function Show(Request $request)
        {
            $this->boo->doSomething();
        }
    }
Breaking down components of functionality and making full use of the constructor is one of the best things you can do for the comprehensibility, testability and reliability of your code.
Automatic dependency injection
So is that everything there is to know? I've explained some fundamentals of avoiding tightly coupled code, and a tool we can use to compose all of our software components and their dependencies.
But what if we didn't need to define everything in the container, and why would that be useful?
Consider the controller example above. If you're building a particularly complex project and have many controllers with lots of similar dependencies, you might have to register every controller as a service in the container. This could become tedious and repetitive.
Fortunately, it doesn't have to be that way. There are dependency injection containers that can attempt to automatically resolve the type-hinted dependencies of any given class or function.
    class Foo
    {
        public function __construct(Bar $bar)
        {
            $this->bar = $bar;
        }
    }

    class Bar
    {
        public function __construct(Baz $baz)
        {
            $this->baz = $baz;
        }
    }

    class Baz {}

    $container = new Container;
    $foo = $container->resolve('Foo');

    $foo instanceof Foo; // true
    $foo->bar instanceof Bar; // true
    $foo->bar->baz instanceof Baz; // true

However, this only works out of the box with concrete classes, and all required parameters need to be type-hinted.
To automatically resolve interface dependencies, we can provide the container with an implementation to use.
    class Foo implements FooInterface {}

    class OtherFoo implements FooInterface {}

    class FooController
    {
        public function __construct(FooInterface $foo)
        {
            $this->foo = $foo;
        }
    }

    $container = new Container;

    // Register a concrete implementation of FooInterface
    $container->set('FooInterface', 'Foo');

    // Create a controller with a Foo instance
    $controller = $container->resolve('FooController');

    // Change the concrete implementation of the interface
    $container->set('FooInterface', 'OtherFoo');

    // Create a controller with an OtherFoo instance
    $otherController = $container->resolve('FooController');

This works without having to define the controller as a service. In fact, it can work with anything that type-hints this interface; even functions.
This means you could have a set of common interfaces used throughout the system registered with concrete implementations in the container. Any class that depends on only these interfaces can be created automatically, with all of its dependencies satisfied.
But the most powerful thing about this is being able to swap out these implementations and still have everything work, as well as still being able to define services manually where necessary.
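For the curious, here is a rough sketch of how such a container might resolve dependencies, using PHP's reflection API. AutowiringContainer is a hypothetical class written for this post and is not taken from any particular library. It handles only class and interface type-hints on constructor parameters, and ignores default values, scalars and functions.

```php
<?php

// Hypothetical container; real libraries differ in naming and features.
class AutowiringContainer
{
    private $bindings = array();

    // Map an interface (or class) name to the concrete class to build for it.
    public function set($abstract, $concrete)
    {
        $this->bindings[$abstract] = $concrete;
    }

    public function resolve($class)
    {
        if (isset($this->bindings[$class])) {
            $class = $this->bindings[$class];
        }

        $reflection = new ReflectionClass($class);
        if (!$reflection->isInstantiable()) {
            throw new RuntimeException('No implementation registered for ' . $class);
        }

        $constructor = $reflection->getConstructor();
        if ($constructor === null) {
            return new $class;
        }

        // Recursively build each type-hinted constructor parameter.
        $arguments = array();
        foreach ($constructor->getParameters() as $parameter) {
            $type = $parameter->getType();
            if (!$type instanceof ReflectionNamedType || $type->isBuiltin()) {
                throw new RuntimeException(
                    'Cannot resolve parameter $' . $parameter->getName() . ' of ' . $class
                );
            }
            $arguments[] = $this->resolve($type->getName());
        }

        return $reflection->newInstanceArgs($arguments);
    }
}

interface FooInterface {}

class Foo implements FooInterface {}

class OtherFoo implements FooInterface {}

class FooController
{
    public $foo;

    public function __construct(FooInterface $foo)
    {
        $this->foo = $foo;
    }
}

$container = new AutowiringContainer;

$container->set('FooInterface', 'Foo');
$controller = $container->resolve('FooController');

$container->set('FooInterface', 'OtherFoo');
$otherController = $container->resolve('FooController');
```

The controller is never registered. The container reads its constructor, sees the FooInterface type-hint, and builds whatever is currently bound to it.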
Conclusion
This essay of a blog post goes into detail about my discoveries when it comes to bettering software design.
Dependency injection is a very effective way to compose the components of a software system and aids in keeping them independent and testable. It's especially helpful when you have a tool like a dependency injection container to do some of the leg work for you.
To summarise:
Use the constructor, that's what it's there for - compose your dependencies outside of your other classes
Encapsulate components of behaviour - don't make one class responsible for too many things
Use interfaces where you expect that an implementation could be interchangeable
Distinguish between required and optional dependencies
Distinguish whether a dependency belongs to a single method or a class as a whole
Software is a complex thing, and it will always be difficult to write and maintain if its complexity isn't well managed. Complexity can be measured in the problems we attempt to solve, or in the software we write to solve them.
In today's blog post I wanted to take a look at one particular software metric that I feel provides a reliable insight into how complex our code is. It measures the logical complexity of code as opposed to its time complexity.
Cyclomatic complexity
This fancy term refers to the measurement of the number of decisions made by a program. It is best applied to functions or methods to gain a quantitative understanding of how complex they are, from one perspective.
So when is our code making a decision?
If conditions
Loop conditions
Switch cases
Catch statements
Simply by counting the occurrences of these, plus one, we can determine the cyclomatic complexity of a program.
It is suggested that a complexity of 10 or less is acceptable as a simple, testable function or method without much risk.
Let's take a look at an example PHP method.
    public function escape($subject)
    {
        if (is_array($subject)) {
            $escaped = array();

            foreach ($subject as $key => $value) {
                $escaped[$key] = $this->escape($value);
            }

            return $escaped;
        }

        if (is_string($subject)) {
            return "'" . $this->database->escape($subject) . "'";
        }

        return $subject;
    }

The method has a cyclomatic complexity of 4, because there are two if conditions and one foreach loop, plus one. It's as simple as that.
As you can imagine, a method with more loops and conditions, nested or otherwise, would add to this complexity until it becomes a burden to understand when reading it for the first time.
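The counting rule can be automated, at least roughly. The sketch below uses PHP's built-in tokenizer to count the decision keywords listed earlier and adds one. estimateCyclomaticComplexity and the firstPositive sample are hypothetical names made up for this illustration, and the function is only an estimate: it deliberately ignores things that some tools also count, such as ternaries and boolean operators.

```php
<?php

// Rough estimate only: counts decision keywords in a PHP snippet and adds one.
function estimateCyclomaticComplexity($code)
{
    $decisionTokens = array(T_IF, T_ELSEIF, T_FOR, T_FOREACH, T_WHILE, T_CASE, T_CATCH);

    $decisions = 0;
    foreach (token_get_all('<?php ' . $code) as $token) {
        // Keyword tokens are arrays of [id, text, line]; punctuation is a plain string.
        if (is_array($token) && in_array($token[0], $decisionTokens, true)) {
            $decisions++;
        }
    }

    return $decisions + 1;
}

// A small sample with one loop and one condition.
$source = <<<'SAMPLE'
function firstPositive(array $numbers) {
    foreach ($numbers as $number) {
        if ($number > 0) {
            return $number;
        }
    }
    return null;
}
SAMPLE;

// One foreach plus one if gives two decisions, so the estimate is 3.
$complexity = estimateCyclomaticComplexity($source);
```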
To make it easier for fellow developers to understand your intent, and to avoid creating potential for things to go wrong (like bugs), it's helpful to minimise this complexity as much as possible. More explicitly, it helps to keep the program in a predictable state.
So how do you go about reducing such complexity? Refactoring.
Map, filter, reduce
If we start with the above example, there is one thing we can do to reduce its cyclomatic complexity, despite it already being so low.
We can replace the loop with a function call that does the same thing.
    public function escape($subject)
    {
        if (is_array($subject)) {
            return array_map(array($this, 'escape'), $subject);
        }

        if (is_string($subject)) {
            return "'" . $this->database->escape($subject) . "'";
        }

        return $subject;
    }
Now the method is only making two decisions. Why is this a good thing?
We aren't reimplementing what array_map already does for us
The method no longer has a loop, making it simpler
Our code is more concise to read
The same approach can be used for common scenarios with the array_filter and array_reduce functions in PHP. There are equivalents in other languages that can be used.
Before you write a foreach loop to apply (map) a function to every value of an array, or to filter elements out of the array, or to reduce an array to a single value, consider using one of these functions.
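As a quick illustration of the other two functions, here is a small sketch with made-up data. The $prices array and the threshold of 10 are arbitrary values chosen for the example.

```php
<?php

$prices = array(4, 15, 8, 23);

// Filter: keep only the elements for which the callback returns true.
// array_filter preserves keys, so array_values re-indexes the result.
$expensive = array_values(array_filter($prices, function ($price) {
    return $price > 10;
}));

// Reduce: fold the array into a single value, starting from 0.
$total = array_reduce($prices, function ($carry, $price) {
    return $carry + $price;
}, 0);

// Map, for comparison: apply a function to every element.
$doubled = array_map(function ($price) {
    return $price * 2;
}, $prices);
```

Each call replaces a foreach loop and a temporary variable, and the function name states the intent up front.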
Method Extraction
One of my most favoured approaches to refactoring code is method extraction. It involves extracting appropriate segments of a method into new methods.
Here are some helpful ways to determine which code can be extracted from a method:
Inline comments that explain the behaviour of a code segment
Code segments that perform tasks that can be isolated
Deeply nested loops and conditionals
Duplicate/similar code segments
Here's a very simple example we can perform method extraction on.
    public function set($attribute, $value = null)
    {
        $attribute = strtolower($attribute);

        if ($attribute === 'id') {
            $attribute = $this->key();
        }

        $this->data[$attribute] = $value;
    }

This method is used to set attribute data on some object, but if the attribute name is id then the name is taken from the key() method instead.
We could then write a get() method that's doing the opposite thing - retrieving the data - and use the same code to prepare this attribute variable, but this would introduce duplicate logic between these two methods.
    public function get($attribute)
    {
        $attribute = strtolower($attribute);

        if ($attribute === 'id') {
            $attribute = $this->key();
        }

        return $this->data[$attribute];
    }
Instead, we can extract the code that prepares $attribute to its own method, like so.
    protected function prepareAttribute($attribute)
    {
        $attribute = strtolower($attribute);

        if ($attribute === 'id') {
            return $this->key();
        }

        return $attribute;
    }

    public function get($attribute)
    {
        $attribute = $this->prepareAttribute($attribute);

        return $this->data[$attribute];
    }

    public function set($attribute, $value = null)
    {
        $attribute = $this->prepareAttribute($attribute);

        $this->data[$attribute] = $value;
    }

While this could seem like only an elimination of duplicate code, it's also the identification of a code segment that can be isolated. It makes perfect sense to do this whether the code will be reused or not, especially in more complex cases.
In this case, we have made sure that the same decision is not being made in more than one place. We haven't removed the complexity from the class as a whole, but separated it into simpler components.
Other software metrics
There are many ways other than cyclomatic complexity to analyse your code.
Here are some key metrics that I think are worth understanding, described in relation to classes:
Coupling - How tightly classes depend on each other
Cohesion - The strength of the relationship between the functionalities within a class
Number of lines of code - How long are you letting your class become?
Line count is a very simple one, but it's good to consider how large you let classes become. If you're rapidly approaching 1000 lines then it's helpful to ask yourself why, and what you can do about it (refactor), instead of just letting it happen. In some cases, classes really do need to be that large. If so, it can be useful to other developers to explain why in the class's documentation.
Analysis tools
There are many online services you can use to analyse the metrics of your own software. When using them, it's best to treat the problems they raise as a guide to improving your software, not as the one and only truth.
Scrutinizer CI
SensioLabs Insight
Code Climate
Codacy
Scrutinizer is my absolute favourite analysis service, and seems to be the most powerful. It runs multiple analysis tools on your code and presents you with the issues it finds in a simple interface. It also presents guides on how to improve specific code smells as they appear in your code. You'll get clear ratings for the coupling and cohesion of your classes.
Conclusion
The fewer decisions your methods make, the less chance they have to do something unexpected. This ultimately makes them easier to test. Minimising this risk by simplifying or separating complex code segments can prove really beneficial to maintaining a quality code base.
There are cases in which you wouldn't want to worry too much about certain metrics, and it's also important to use them together to gain a more general introspection into the complexity of your code.
Software metrics will never be the definitive answer to building software in the best possible way, but will always be useful tools that help us get there.
The way we use language is a very important part of software development. How well you name the parts of a codebase can have a drastic impact on how easy it is for other developers to understand.
The same applies to commit messages. If your commits are explained clearly, other people will know what your changes do before they read a single line of code. This is especially useful for people that don't read or write code because they can still understand what has been changed.
While there are many ways to discuss what makes a commit message useful, what I'd like to write about today is the context in which they are written; the grammatical mood of our commit messages.
Imperative
If we take a look at commit messages that git produces automatically, we can see that it uses imperative mood. This is explaining the commit as if it were a command or instruction.
Merge branch 'develop'
Following this style helps to enforce commits as a series of instructions that are being applied to a codebase.
Add a new table for logging
Remove deprecated code
Clean up coding style and whitespace
Fix a user picker bug
This approach can be used to complete the sentence:
This commit will...
Indicative
Something we see more commonly, especially in the Claromentis codebase, is a commit explained as an action.
Added a new table for logging
Removed deprecated code
Cleaned up coding style and whitespace
Fixed a user picker bug
Sometimes these work as a statement of what the commit is, as opposed to what the commit has changed.
Coding style and whitespace clean up
User picker bug fix
Fixes for the new query builder
These can be used to complete the sentences:
This commit has...
This commit is a...
Issue tracking
It's also important to consider the best way to reference issues.
Issue #123
The above doesn't tell you what the commit is for, other than providing a link to the issue if you happen to be viewing the commit logs on GitHub or GitLab.
Better approaches include prepending the issue ID to a descriptive commit message, or even including it in the sentence.
[#123] Fixed misaligned icon
Issue #123 - Fixed misaligned icon
Fixed misaligned icon for issue #123
My approach
Ultimately it boils down to a personal preference, as well as what works best for the whole team.
I can't get comfortable with writing commit messages as commands because they're meant to be read by people, not machines. I find it easier to think of a commit log as a history of what has been changed, rather than a set of instructions for potential changes. Most of the time we're reading these messages in the future, rather than before we've made the commit or if we've checked out an old tag.
My approach will always be like these:
This commit has... Fixed storage adapter bugs for issue #123
This commit is an... Initial implementation of eager loading
This commit has... Dynamic route parameter fixes.
I prefer to describe my commits as an action or a change, rather than an instruction. It makes more sense to me to explain what the commit is or what it has done, rather than what it will do.