Stream filters allow changes to be made to data as it passes through a stream. This means text can be changed while it is being written to or read from a stream, instead of after the whole stream has been processed. PHP has a built-in framework that allows custom filters to be added alongside the built-in filters.
Let's start by seeing what filters are built into PHP.
stream_get_filters()
To see what filters exist on your system you can run the stream_get_filters() function. This returns a list of the available filters.
$ php -r "print_r(stream_get_filters());"
Array
(
[0] => zlib.*
[1] => bzip2.*
[2] => convert.iconv.*
[3] => string.rot13
[4] => string.toupper
[5] => string.tolower
[6] => string.strip_tags
[7] => convert.*
[8] => consumed
[9] => dechunk
)
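Before writing our own, it can help to see one of these built-in filters in action. The following is a minimal sketch that assumes the string.toupper filter is present on your system (it appears in the list above); the in-memory stream and the sample text are just for illustration.

```php
<?php
// Put some sample text into an in-memory stream and rewind it.
$stream = fopen('php://memory', 'w+');
fwrite($stream, "antimony is a metalloid\n");
rewind($stream);

// Attach the built-in string.toupper filter to the read side only.
stream_filter_append($stream, 'string.toupper', STREAM_FILTER_READ);

// Everything read from the stream now passes through the filter.
$result = stream_get_contents($stream);
fclose($stream);

echo $result;
```

The text stored in the stream is not modified; only the data that is read back through the filter is converted.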
There are plenty of filters to select from here, but how do we create our own? Figuring out how to create custom filters is a little difficult as this is perhaps one of the least documented aspects of PHP. Many of the internal functions in use here are only sparsely documented, which makes it difficult to figure out what is going on.
There are a few things to go over before we get into creating our own filters.
The Bucket Brigade
A "bucket brigade" (if you don't know) is an English idiom that describes a line of people who pass buckets from one person to the next in order to carry water to a fire.
When a custom filter is called, a bucket brigade is passed to it. Internally this is a PHP resource of type userfilter.bucket brigade, which acts like a doubly linked list of userfilter.bucket resources. Each userfilter.bucket resource contains a piece of our data.
stream_bucket_make_writeable()
Using this function we can pass in a userfilter.bucket brigade resource and grab the next userfilter.bucket object from it.
$bucket = stream_bucket_make_writeable($in);
This object is a stdClass object and has the following structure.
- bucket - This is the userfilter.bucket resource.
- data - This will be either all of the data or a section of the data if there is too much to process all at once.
- datalen - The length of the data in the data property.
To demonstrate this, let's look at the structure of an actual object.
var_dump(stream_bucket_make_writeable($in));
This will print out something like the following (depending on what data we are filtering).
object(stdClass)#2 (3) {
["bucket"]=>resource(10) of type (userfilter.bucket)
["data"]=>string(119) "Antimony is a silvery, lustrous gray metalloid with a Mohs scale hardness of 3, which is too soft to make hard objects."
["datalen"]=>int(119)
}
If there is no data left in the resource then this function will return null. This means we can loop through the bucket brigade and keep grabbing bucket items until we find null.
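The loop described above can be sketched as a complete pass-through filter. This is only an illustration: the BucketLoggerFilter class, its static $lengths property and the bucket.logger filter name are all hypothetical, and the class extends php_user_filter, which is how the PHP manual describes user filters.

```php
<?php
// Hypothetical pass-through filter that records the size of every bucket it sees.
class BucketLoggerFilter extends php_user_filter
{
    /** @var int[] Lengths of the buckets seen so far (illustrative helper). */
    public static array $lengths = [];

    public function filter($in, $out, &$consumed, $closing): int
    {
        // stream_bucket_make_writeable() returns null once the brigade is empty.
        while (($bucket = stream_bucket_make_writeable($in)) !== null) {
            self::$lengths[] = $bucket->datalen;
            $consumed += $bucket->datalen;
            // Hand the untouched bucket on to the output brigade.
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register('bucket.logger', BucketLoggerFilter::class);

$stream = fopen('php://memory', 'w+');
fwrite($stream, 'Antimony');
rewind($stream);
stream_filter_append($stream, 'bucket.logger', STREAM_FILTER_READ);

// The data comes out unchanged; the filter only observed it.
$copy = stream_get_contents($stream);
fclose($stream);

echo $copy . PHP_EOL;
print_r(BucketLoggerFilter::$lengths);
```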
stream_bucket_prepend() And stream_bucket_append()
With the object we got from stream_bucket_make_writeable() we now need somewhere to put it. This is where the functions stream_bucket_prepend() and stream_bucket_append() come in.
The custom filter is given an $in parameter and an $out parameter (both of which are userfilter.bucket brigade resources). Data is pulled out of the $in parameter using stream_bucket_make_writeable() in the form of an object. The data in this object is then changed, and the object is passed to the $out parameter using one of these functions.
The following is an example of the whole process in action, using strtoupper() to change the data.
$bucket = stream_bucket_make_writeable($in);
$bucket->data = strtoupper($bucket->data);
stream_bucket_append($out, $bucket);
stream_bucket_new()
This function takes in a stream resource and a string and will return a bucket that we can use to add content to the output bucket brigade. The function has the following signature.
stream_bucket_new(resource $stream, string $buffer): stdClass
We can take any stream, use it to create a bucket with some string content and then append it to our filter output brigade. For example, we could create a memory stream and then create a new bucket like this.
$stream = fopen('php://memory', 'r');
$bucket = stream_bucket_new($stream, 'monkey');
stream_bucket_append($out, $bucket);
Any custom filters we create will automatically get a property called stream. This is essentially a link to the stream being worked on and gives us a handy resource we can use to generate a new bucket. For example, to add a string to the output stream we can do the following.
$bucket = stream_bucket_new($this->stream, PHP_EOL);
stream_bucket_append($out, $bucket);
Note that the stream property of the filter object is created the first time the filter() method is called. As stated, it will be set to the stream that we are currently working on.
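As a rough sketch of how stream_bucket_new() and the stream property fit together, the filter below adds one extra bucket in front of the first bucket of real data. The HeaderFilter class, its $headerSent flag and the header text are hypothetical names chosen for this example, and the class extends php_user_filter as the PHP manual describes.

```php
<?php
// Hypothetical filter that emits a header line before the first bucket of data.
class HeaderFilter extends php_user_filter
{
    /** Tracks whether the header bucket has already been written. */
    private bool $headerSent = false;

    public function filter($in, $out, &$consumed, $closing): int
    {
        while ($bucket = stream_bucket_make_writeable($in)) {
            if (!$this->headerSent) {
                // $this->stream is available here because we are inside filter().
                $header = stream_bucket_new($this->stream, '--- begin ---' . PHP_EOL);
                stream_bucket_append($out, $header);
                $this->headerSent = true;
            }
            // Only the data read from $in counts towards $consumed.
            $consumed += $bucket->datalen;
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register('header', HeaderFilter::class);

$stream = fopen('php://memory', 'w+');
fwrite($stream, 'Antimony');
rewind($stream);
stream_filter_append($stream, 'header', STREAM_FILTER_READ);

$result = stream_get_contents($stream);
fclose($stream);

echo $result . PHP_EOL;
```

The new bucket is created inside the loop, rather than in onCreate(), because the stream property is only populated while filter() is running.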
Creating Custom Filters
A custom filter is created as a class in which certain methods are called when the filter is applied. These methods are defined by the built-in php_user_filter class, which does nothing useful when used directly. To make the required methods explicit I have created an interface that forces filter classes to implement them.
interface CustomFilter
{
/**
* Called when applying the filter.
*
* @param resource $in
* in is a resource pointing to a bucket brigade which contains one or more bucket
* objects containing data to be filtered.
* @param resource $out
* out is a resource pointing to a second bucket brigade into which your modified
* buckets should be placed.
* @param int $consumed
* consumed, which must always be declared by reference, should be incremented by
* the length of the data which your filter reads in and alters. In most cases
* this means you will increment consumed by $bucket->datalen for each $bucket.
* @param bool $closing
* If the stream is in the process of closing (and therefore this is the last pass
* through the filterchain), the closing parameter will be set to TRUE.
*
* @return int
* The filter() method must return one of three values upon completion.
* - PSFS_PASS_ON: Filter processed successfully with data available in the out
* bucket brigade.
* - PSFS_FEED_ME: Filter processed successfully, however no data was available to
* return. More data is required from the stream or prior filter.
* - PSFS_ERR_FATAL (default): The filter experienced an unrecoverable error and
* cannot continue.
*/
public function filter($in, $out, &$consumed = NULL, bool $closing = false): int;
/**
* Called when creating the filter.
*
* @return bool
* Your implementation of this method should return FALSE on failure, or TRUE on success.
*/
public function onCreate(): bool;
/**
* Called when closing the filter.
*/
public function onClose(): void;
}
Using this interface I put together a class that filters a string. As an example, I have created a class that takes a string and converts every letter and digit to an X.
Note that all of the properties of this class are populated automatically by PHP.
class CensorFilter implements CustomFilter
{
/**
* Name of the filter registered by stream_filter_append().
*
* @var string
*/
public $filtername;
/**
* Additional parameters passed through the stream_filter_append() function.
*
* @var mixed
*/
public $params;
/**
* A resource of type 'userfilter.filter'.
*
* @var resource
*/
public $filter;
/**
* A resource of type 'stream'.
*
* @var resource
*/
public $stream;
public function filter($in, $out, &$consumed = NULL, bool $closing = false): int
{
while ($bucket = stream_bucket_make_writeable($in)) {
// Replace every single letter or digit with an X.
$bucket->data = preg_replace('/[a-zA-Z0-9]/', 'X', $bucket->data);
$consumed += $bucket->datalen;
stream_bucket_append($out, $bucket);
}
return PSFS_PASS_ON;
}
public function onCreate(): bool
{
return true;
}
public function onClose(): void
{
}
}
The filter() method in this class is where most of the work goes on in our censor filter. The two important parts of this function are $in, which is our incoming bucket brigade and $out, which is the outgoing bucket brigade. In the class above we are doing the following:
- Take the $in bucket brigade and call stream_bucket_make_writeable() on it until we get a null value.
- Grab the data from each bucket and swap it out for a censored version using preg_replace().
- Let PHP know how much we have consumed by adding the length of the bucket to the $consumed variable. This is passed by reference, so the change is visible to the caller. Note that $consumed tracks how much input was read; in this filter the text length does not change, so the input and output lengths are the same.
- Write the bucket to the $out bucket brigade using the stream_bucket_append() function.
- Return PSFS_PASS_ON, with the assumption that we have successfully gone through the stream.
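The steps above can also be sketched as a class that extends php_user_filter, which is how the PHP manual describes user filters. This is an alternative sketch rather than the class used in this article: the CensorUserFilter name and the censor.sketch filter name are made up for the example, and the sample sentence is written to an in-memory stream so that the block runs on its own.

```php
<?php
// Hypothetical variant of the censor filter that extends php_user_filter.
class CensorUserFilter extends php_user_filter
{
    public function filter($in, $out, &$consumed, $closing): int
    {
        // Keep pulling buckets until the input brigade is empty.
        while ($bucket = stream_bucket_make_writeable($in)) {
            // Record how much input we read before changing the data.
            $consumed += $bucket->datalen;
            // Replace every single letter or digit with an X.
            $bucket->data = preg_replace('/[a-zA-Z0-9]/', 'X', $bucket->data);
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register('censor.sketch', CensorUserFilter::class);

$stream = fopen('php://memory', 'w+');
fwrite($stream, 'Antimony is a silvery, lustrous gray metalloid with a Mohs scale hardness of 3.');
rewind($stream);
stream_filter_append($stream, 'censor.sketch', STREAM_FILTER_READ);

$censored = stream_get_contents($stream);
fclose($stream);

echo $censored . PHP_EOL;
```

Because the parent class already declares the filtername, params and stream properties and provides default onCreate() and onClose() methods, only filter() needs to be written here.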
One important aspect is that we MUST loop through all of the data in the bucket brigade or we will generate the following warning.
PHP Warning: fgets(): Unprocessed filter buckets remaining on input brigade in custom_filter.php on line 115
Using Custom Filters
To use this class as a filter we call two functions. The stream_filter_register() function will register the filter class and the stream_filter_append() function will append the filter to a stream.
Here is an example of everything in action.
// Register the CensorFilter class.
stream_filter_register("censor", "CensorFilter");
// Create a stream from a text file.
$stream = fopen('test.txt', 'r');
// Append the censor filter to our stream.
stream_filter_append($stream, "censor");
// Read out the stream to the command line.
while (false !== ($line = fgets($stream))) {
echo $line;
}
// Close the stream.
fclose($stream);
With a text file containing the following.
Antimony is a silvery, lustrous gray metalloid with a Mohs scale hardness of 3.
The output of this is.
XXXXXXXX XX X XXXXXXX, XXXXXXXX XXXX XXXXXXXXX XXXX X XXXX XXXXX XXXXXXXX XX X.
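The params property mentioned in the class comments can be used to configure a filter when it is attached. The sketch below assumes a hypothetical MaskFilter class that reads a 'mask' key from the fourth argument of stream_filter_append(); the class name, the key and the filter name are all invented for this example.

```php
<?php
// Hypothetical filter whose replacement character is configurable.
class MaskFilter extends php_user_filter
{
    /** Replacement character, defaulting to X. */
    private string $mask = 'X';

    public function onCreate(): bool
    {
        // $this->params holds whatever was passed to stream_filter_append().
        if (is_array($this->params) && isset($this->params['mask'])) {
            $this->mask = (string) $this->params['mask'];
        }
        // Refuse to create the filter unless the mask is a single character.
        return strlen($this->mask) === 1;
    }

    public function filter($in, $out, &$consumed, $closing): int
    {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            $bucket->data = preg_replace('/[a-zA-Z0-9]/', $this->mask, $bucket->data);
            stream_bucket_append($out, $bucket);
        }
        return PSFS_PASS_ON;
    }
}

stream_filter_register('mask', MaskFilter::class);

$stream = fopen('php://memory', 'w+');
fwrite($stream, 'Mohs hardness of 3.');
rewind($stream);

// The fourth argument is handed to the filter as $this->params.
stream_filter_append($stream, 'mask', STREAM_FILTER_READ, ['mask' => '*']);

$masked = stream_get_contents($stream);
fclose($stream);

echo $masked . PHP_EOL;
```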
We can also apply the filter to the stream using the php://filter syntax.
stream_filter_register("censor", "CensorFilter");
$input = fopen('php://filter/read=censor/resource=test.txt', 'r');
while (false !== ($line = fgets($input))) {
echo $line;
}
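The php://filter syntax also accepts more than one filter, separated by a pipe character and applied from left to right. The following sketch uses two of the built-in filters from the list at the start of this article; the temporary file is created only so that the example is self-contained.

```php
<?php
// Create a throwaway file to read back through php://filter.
$path = tempnam(sys_get_temp_dir(), 'flt');
file_put_contents($path, "antimony\n");

// string.toupper runs first, then string.rot13 is applied to its output.
$uri = 'php://filter/read=string.toupper|string.rot13/resource=' . $path;
$result = file_get_contents($uri);

// Clean up the temporary file.
unlink($path);

echo $result;
```

A registered custom filter name can be used in the same position, as long as stream_filter_register() has been called first.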
In closing, I'd like to say thanks to this post, which helped me with some of the unknowns in this part of PHP.
Comments
Hello. Thank you for a good article. So can we use custom filters with a big files without changing memory_limit in php.ini?
Submitted by Andrii on Fri, 03/12/2021 - 10:19
Can we use custom filters without extending memory limit in php.ini?
Submitted by Andrii on Fri, 03/12/2021 - 10:20
Hi Andrii,
I think so yes, although you'll probably want to combine this with PHP stream functions (https://www.hashbangcode.com/article/php-streams)
Submitted by giHlZp8M8D on Fri, 03/12/2021 - 11:08