I read The Daily WTF every now and then and one story about bad logging code in PHP stood out to me. The post looked at some PHP code that was created to log a string to a file, but would have progressively slowed down the application every time a log was generated.
PHP is a great language as it allows you to put together code without easily shooting yourself in the foot. One downside of this ease of access is that it can allow beginners to put together code that works, but will ultimately cause problems in the long run. The PHP code given in the post is certainly an example of this.
I know that the goal of The Daily WTF isn't to fix problems found in code and is more of a satirical look at bad coding examples. I thought, however, that it would be good to look at the code and fix the issues it had. It's quite easy for people to mock code, but I always look for a solution to these problems in an effort to educate.
Here is the code from the article. It is the relevant part of a log function in a class, one that presumably takes a $message parameter as a string and adds it to then end of log file (provided by the getPath() method).
$path = self::getPath();
if (file_exists($path)) {
$content = file_get_contents($path);
} else {
$content = "";
}
$content .= "\n" . date('Y-m-d H:i') . " | " . $message;
file_put_contents($path, $content);
It should go without saying that you should't use this code in production.
This code works by loading the contents of the log file into memory, appending a new line to the contents of the file, and then writing the content back to the file. The file_get_contents() function is a built in PHP function that loads a file into memory, so the ultimate problem with this code is that the longer the log file gets, the longer it takes to load this into memory.
In PHP, strings are classed as mutable, meaning that they can be freely altered. Internally, this works using a copy-on-write system where any changes to the string are done by copying the string into a new memory address along with the changes and discarding the old one. What this means for the code above is that not only is there a performance hit when pulling the data out of the file, but there is another one when the data is concatenated to the end of the string.
In a small application with minimal logging this approach to logging would have caused problems eventually; but this wasn't the case for the example. Apparently, the code was in a active application that quickly caused the log files to end up with multi-gigabyte files that caused the entire server to experience severe performance problems.
To summarise, there are two problems here:
- Loading the entire log file.
- Concatenating our log entry to the end of the log.
A Simple Fix
The immediate fix for these issues is to remove the file_get_contents() call and use just file_put_contents() with the FILE_APPEND parameter. This will append our log entry to the log file without us having to load the information from the file first, we also put the new line character at the end of the log entry so that each new log creates the blank line ready for the next log entry to be written.
$path = self::getPath();
$content = date('Y-m-d H:i') . " | " . $message . "\n";
file_put_contents($path, $content, FILE_APPEND);
This does, however, leave us open to a couple of problems as we don't make sure that the file is able to be written to before attempting to write to it.
Another approach to this is to use the other PHP file functions to ensure the file exists and can be written to. This is done using fopen() function, which takes the file we want to open as the first parameter and a mode setting as the second parameter. There are a few modes to chose from, but as we want to open the file and place the pointer at the end we need to use the "a" file mode.
$fp = fopen($path, 'a');
With this file handle open we can then write data to it using the fwrite() function. This just takes the file handle variable (that we created with the fopen() function) and the data we want to write. Instead of the date formatting string of "Y-m-d H:i" used in the date() function I have instead used "c", which will print out the current time in ISO 8601 format, which is a much more standard compliant way of rendering the date. Finally, instead of using "\n" to print a new line character we use the PHP_EOL constant, which is more platform agnostic. This newline character is placed at the end of the log entry, which has the same effect but instead leaves a blank line at the end of the log file.
fwrite($fp, date('c') . "|" . $message . PHP_EOL);
Adding the log message in this way allows us to inject log messages without having to first detect if the log has anything in it. Instead we can just start adding logs.
Here is the changed code in full, in which we also include some error checking code to prevent the log from producing errors if the log directory is not writable or the file couldn't be opened.
$path = self::getPath();
if (is_writable(dirname($path)) === false) {
// Log directory is not writable.
return;
}
$fp = fopen($path, 'a');
if ($fp === false) {
// Couldn't open file handle.
return;
}
fwrite($fp, date('c') . " | " . $message . PHP_EOL);
Using this method we reduce the log write time to around 0.00001 seconds, which doesn't change, even for multi-gigabyte log files.
Using PSR-3
What we have done do far is just fixed the underlying code. We can (and should) take this a step further by using a PSR-3 compliant logging system.
The PHP Standards Recommendations (or PSR) are a set of standards that aims to solve many of the common challenges faced by PHP developers. This includes things like coding standards (PSR-1), Autoloading (PSR-4), Caching Interfaces (PSR-6), Event Dispatchers (PSR-14), and Logging Interfaces (PSR-3).
I won't detail everything that PSR-3 covers here, but the key parts are that it should log to different levels (debug, info, notice, warning, error, critical, alert, emergency), and have a unified interface for creating messages within the log system.
The PHP library Monolog is a PSR-3 compliant logging package and is an excellent package for introducing logging to a system. It is able to write logs to a file (much like in our example) but you can opt to log to the syslog or introduce your own log handler to push your logs to an external system or even a database.
Monolog is a composer package, so it can be easily installed using the compose require command.
composer require monolog/monolog
To use the system we need to create a log channel, this object is where the main functionality of the system will be run from.
$log = new Monolog\Logger('MyWebsite');
We then need to push a handler to the logger, which will be used to records the logs wherever we want to. Since we want to write to a file we would use the Monolog\Handler\StreamHandler class, which allows us to write the logs to any arbitrary file. The SteamHandler takes a second parameter of level, which is the minimum level at which logs will be recorded.
$log->pushHandler(new Monolog\Handler\StreamHandler('./mono.log', Monolog\Level::Warning));
With this in place, we can now start writing logs, we can create logs of differing levels using methods like warning() or error().
$log->warning('Some log message.');
$log->error('Something bad happened');
We can even inject parameters into the logs so that they have more context and description of what happened.
$log->warning('The action {action} was performed.', ['action' => 'event']);
This will create a log in the file called "mono.log" with the information in our log message and a ISO 8601 date format.
Using the above, we would change our original log code to the following.
$log = new Logger('MyWebsite');
$log->pushHandler(new StreamHandler(self::getPath(), Level::Warning));
$log->warning($message);
Of course, we are settings the log level to be always "warning" here, but at least we are logging in a more unified way. The log is now written to a log file in a standards compliant way.
The original code we had has now been improved by adding a robust and well tested logging framework. This can be dropped into the existing class without causing a massive rewrite for the rest of the application. This can then be further improved by injecting the Logger object into the existing class so that we don't have to instantiate it and add a handler every time.
When faced with a legacy application with issues like this it is essential to fix fundamental problems in order to make the platform stable. It's often tempting to throw everything away and start again, but by solving fundamental issues like this you can start to improve the application without having to spend a fortune on a rewrite.
Do you have anything you want to submit to PHP:CSI? If you do then please get in touch by filling in our contact form or adding a comment.
Add new comment