Routes in Drupal can be altered as they are created, or even changed on the fly as the page request is being processed.
In addition to a routing system, Drupal has a path alias system where internal paths like "/node/123" can be given SEO-friendly paths like "/about-us". When the user visits the site at "/about-us" the path is rewritten internally to allow Drupal to serve the correct page. Modules like Pathauto will automatically generate the SEO-friendly paths using information from the item of content, without the user having to remember to enter them themselves.
This mechanism is made possible thanks to an internal Drupal service called "path processing". When Drupal receives a request it will pass the path through one or more path processors to allow them to change it to another path (which might be an internal route). The process is reversed when generating a link to the page, which allows the path processors to translate the internal path back into its public form.
It is possible to alter a route in Drupal using a route subscriber, but using path processors allows us to change or mask the route or path of a page in a Drupal site without actually changing the internal route itself.
In this article we will look at what types of path processors are available, how to create your own, what sort of uses they have in a Drupal site, and anything else you should look out for when creating path processors.
Types Of Path Processor
Path processors are managed by the Drupal class \Drupal\Core\PathProcessor\PathProcessorManager. When you add a path processor to a site this is the class that manages the order of the processors and calls them.
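To get a feel for what "managing the order" means, here is a minimal sketch in plain PHP. It is not the real manager class. The run_processors() function and the two closures are made up for illustration, and the sketch assumes that processors are grouped by priority and called from the highest priority down to the lowest.

```php
<?php

// Hypothetical stand-in for the manager: processors are grouped by priority
// and run from the highest priority down to the lowest.
function run_processors(array $processorsByPriority, string $path): string {
  // Sort by key in descending order so that priority 100 runs before 20.
  krsort($processorsByPriority);
  foreach ($processorsByPriority as $processors) {
    foreach ($processors as $processor) {
      $path = $processor($path);
    }
  }
  return $path;
}

$processors = [
  // A made-up low priority processor that rewrites one internal path.
  20 => [fn(string $path): string => $path === '/node/1' ? '/node/1/rss' : $path],
  // A made-up high priority processor that resolves an alias.
  100 => [fn(string $path): string => $path === '/about-us' ? '/node/1' : $path],
];

// The priority 100 processor runs first, so the priority 20 processor
// receives the internal path rather than the alias.
$result = run_processors($processors, '/about-us');
```

The point of the sketch is that each processor receives the output of the one before it, so the priority you pick decides which version of the path your processor gets to see.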
There are two types of path processor available in Drupal:
- Inbound - Processes an inbound path and allows it to be altered in some way before being processed by Drupal. This usually occurs when a user sends a request to the Drupal site to visit a page. Inbound path processors can also be triggered by certain internal processes, for example, when using a path validator. The path validator passes the path through the inbound path processors so that it checks the same internal path that a real request would resolve to.
- Outbound - An outbound path is any path that Drupal generates a URL for. The outbound path processors are called in order to change the path so that the URL can be generated correctly.
Basically, inbound processors are used when responding to a path and outbound processors are called when rendering a path.
Let's go through a couple of examples of each to show how they work.
Creating An Inbound Processor
To register an inbound processor with Drupal you need to create a service with a tag of path_processor_inbound, and you can optionally include a priority. This lets Drupal know that this service must be used when processing inbound paths.
It is normal for path processor classes to be kept in the "PathProcessor" directory in your custom module's "src" directory.
services:
  mymodule.path_processor_inbound:
    class: Drupal\mymodule\PathProcessor\InboundPathProcessor
    tags:
      - { name: path_processor_inbound, priority: 20 }
The priority you assign to the path_processor_inbound tag will depend on your setup. Processors with a higher priority are called first. The internal inbound processor that handles path aliases in Drupal has a priority of 100, so any setting less than 100 will cause our processing to be performed after Drupal's internal handler is called, and any setting greater than 100 will cause it to be performed before.
The InboundPathProcessor class we create must implement the \Drupal\Core\PathProcessor\InboundPathProcessorInterface interface, which requires a single method called processInbound() to be added to the class. Here are the arguments for that method.
- $path - This is a string for the path that is being processed, with a leading slash.
- $request - In addition to the path, the request object is also passed to the method. This allows us to perform any additional checks on query strings on the URL or other parameters that may have been added to the request.
The processInbound() method must return the processed path as a string (with the leading slash). If we don't want to alter the path then we need to return the path that was passed to the method.
To create a simple example let's make sure that when a user visits the path at "/some-random-path" we translate this internally to be "/node/1", which is the internal path for the page we want to show. In this example, if the path passed into the method isn't our required path then we just return it, effectively ignoring any path but the one we are looking for.
<?php

namespace Drupal\mymodule\PathProcessor;

use Drupal\Core\PathProcessor\InboundPathProcessorInterface;
use Symfony\Component\HttpFoundation\Request;

class InboundPathProcessor implements InboundPathProcessorInterface {

  /**
   * {@inheritdoc}
   */
  public function processInbound($path, Request $request): string {
    if ($path !== '/some-random-path') {
      // Not the path we are looking for, so pass it through unchanged.
      return $path;
    }
    // Translate our custom path to the internal path of the page.
    return '/node/1';
  }

}
Now, when the user visits the path "/some-random-path" they will see the output of the page at "/node/1". It is still possible to view the page at "/node/1" and see the output, so we have just created a duplicate path for the same page.
This is a simple example to show how the processInbound() method works; we'll look at a more concrete example later.
Creating An Outbound Processor
The outbound processor is defined in a similar way to the inbound processor, but in this case we tag the service with the tag path_processor_outbound.
services:
  mymodule.path_processor_outbound:
    class: Drupal\mymodule\PathProcessor\OutboundPathProcessor
    tags:
      - { name: path_processor_outbound, priority: 250 }
Outbound processors with a higher priority are called first, and you'll generally want your outbound processing to happen after Drupal has swapped internal paths for their aliases. Drupal's internal alias processor has an outbound priority of 300, so setting our priority to 250 means that we process our outbound links after Drupal has created any aliases.
The OutboundPathProcessor class we create must implement the \Drupal\Core\PathProcessor\OutboundPathProcessorInterface interface, which requires a single method called processOutbound() to be added to the class. Here are the arguments for that method.
- $path - This is a string for the path that is being processed, with a leading slash.
- $options - An associative array of additional options, which includes things like "query", "fragment", "absolute", and "language". These are the same options that get sent to the URL class when generating URLs and allow us to update the outbound path based on the passed options.
- $request - The current request object is also sent to the method and can make decisions based on the parameters passed to the current path.
- $bubbleable_metadata - An optional object to collect path processors' bubbleable metadata so that we can potentially pass cache information upstream.
The processOutbound() method must return the new path, with a starting slash. If we don't want to change the path then we just return the path that was sent to us, otherwise we can make any change we require and return this string.
Taking the simple example from the inbound processor further, let's change the path "/node/1" to be "/some-random-path". In this example we are looking for the internal path of "/node/1", and if we see this path then we return our new path.
<?php

namespace Drupal\mymodule\PathProcessor;

use Drupal\Core\PathProcessor\OutboundPathProcessorInterface;
use Drupal\Core\Render\BubbleableMetadata;
use Symfony\Component\HttpFoundation\Request;

class OutboundPathProcessor implements OutboundPathProcessorInterface {

  /**
   * {@inheritdoc}
   */
  public function processOutbound($path, &$options = [], ?Request $request = NULL, ?BubbleableMetadata $bubbleable_metadata = NULL) {
    if ($path !== '/node/1') {
      return $path;
    }
    return '/some-random-path';
  }

}
With this in place, when Drupal prints out a link to "/node/1" it will render the path as "/some-random-path".
On its own this example doesn't do much; we are just rewriting a path for a single page. The real power is when we combine inbound processing and outbound processing together. Let's do just that.
Creating A Single Class For Path Processing
It is possible to combine the inbound and outbound processors into a single class by adding both tags to a single service in the module's services file.
services:
  mymodule.path_processor:
    class: Drupal\mymodule\PathProcessor\MyModulePathProcessor
    tags:
      - { name: path_processor_inbound, priority: 20 }
      - { name: path_processor_outbound, priority: 250 }
The class we create from this definition implements both the InboundPathProcessorInterface and the OutboundPathProcessorInterface, and as such it includes both of the processInbound() and processOutbound() methods.
<?php

namespace Drupal\mymodule\PathProcessor;

use Drupal\Core\PathProcessor\InboundPathProcessorInterface;
use Drupal\Core\PathProcessor\OutboundPathProcessorInterface;
use Drupal\Core\Render\BubbleableMetadata;
use Symfony\Component\HttpFoundation\Request;

class MyModulePathProcessor implements InboundPathProcessorInterface, OutboundPathProcessorInterface {

  public function processInbound($path, Request $request): string {
    return $path;
  }

  public function processOutbound($path, &$options = [], ?Request $request = NULL, ?BubbleableMetadata $bubbleable_metadata = NULL): string {
    return $path;
  }

}
Now all you need to do is add in your path processing.
It's a good idea to create a construct like this so that you translate the path going into and coming out of Drupal. This creates a consistent path model and prevents duplicate content issues where the same page is available at different paths.
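The idea of a consistent path model can be sketched outside of Drupal in a few lines of plain PHP. The PATH_MAP constant and the two functions below are hypothetical stand-ins for the processInbound() and processOutbound() methods. The important property is that running a path inbound and then outbound gives you back the path you started with.

```php
<?php

// Hypothetical map of public paths to internal paths.
const PATH_MAP = [
  '/some-random-path' => '/node/1',
];

// Inbound: translate a public path to its internal path, if we know it.
function process_inbound(string $path): string {
  return PATH_MAP[$path] ?? $path;
}

// Outbound: translate an internal path back to its public path.
function process_outbound(string $path): string {
  $public = array_search($path, PATH_MAP, TRUE);
  return $public === FALSE ? $path : $public;
}

// A round trip through both functions returns the original public path.
$roundTrip = process_outbound(process_inbound('/some-random-path'));
```

Driving both directions from the same data means the two translations can't drift apart, which is one way of keeping the inbound and outbound logic balanced.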
The Redirect Module
If you are planning to use the inbound path processor system then you should be aware that the Redirect module will attempt to redirect your inbound path processor changes to the rewritten paths. The Redirect module is a great module, and I install it on every Drupal site I run, but in order to prevent this redirect you'll need to do something extra, which we'll go through in this section.
To prevent the Redirect module from redirecting a path you need to add the attribute _disable_route_normalizer to the request before the kernel.request event triggers in the Redirect module's RouteNormalizerRequestSubscriber class. We do this by creating our own event subscriber and giving it a higher priority.
The first thing to do is add our event subscriber to our custom module services.yml file.
services:
  mymodule.prevent_redirect_subscriber:
    class: Drupal\mymodule\EventSubscriber\PreventRedirectSubscriber
    tags:
      - { name: event_subscriber }
The event subscriber itself just needs to listen to the kernel.request event, which is stored in the KernelEvents::REQUEST constant. We need to trigger our custom module before the redirect module event, and so we set the priority of the event to be 40. This is higher than the Redirect module event, which is set at 30.
All the event subscriber needs to do is listen for our path and then set the _disable_route_normalizer attribute on the request if it is detected.
<?php

namespace Drupal\mymodule\EventSubscriber;

use Symfony\Component\EventDispatcher\EventSubscriberInterface;
use Symfony\Component\HttpKernel\Event\RequestEvent;
use Symfony\Component\HttpKernel\KernelEvents;

class PreventRedirectSubscriber implements EventSubscriberInterface {

  public static function getSubscribedEvents() {
    // Priority 40 runs before the Redirect module's subscriber at 30.
    $events[KernelEvents::REQUEST][] = ['preventInboundPathRedirect', 40];
    return $events;
  }

  public function preventInboundPathRedirect(RequestEvent $event) {
    // The raw request path still includes any language prefix ("/en" here).
    if ($event->getRequest()->getPathInfo() === '/en/some-random-path') {
      $event->getRequest()->attributes->set('_disable_route_normalizer', TRUE);
    }
  }

}
When the Redirect module event triggers it will see this attribute and ignore the redirect.
This will only happen if you are changing the path of an entity of some kind using only the inbound path processor. Creating only the inbound processor creates an imbalance between the outer path and the translated inner path, which we then need to let the Redirect module know about to prevent the redirect. If we also translated the outbound path in the same (and opposite) way then the redirect wouldn't occur.
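The imbalance can be illustrated with a small sketch in plain PHP. This is a deliberately simplified model and not the Redirect module's actual code. It assumes that a normalizer redirects whenever the requested path differs from the path generated for the matched page, and the needs_redirect() function and the closures are hypothetical.

```php
<?php

// Simplified model: a normalizer redirects when the requested path differs
// from the path that is generated for the page that was matched.
function needs_redirect(string $requested, callable $inbound, callable $outbound): bool {
  $internal = $inbound($requested);
  $canonical = $outbound($internal);
  return $canonical !== $requested;
}

// Made-up processors for the "/some-random-path" example.
$inbound = fn(string $p): string => $p === '/some-random-path' ? '/node/1' : $p;
$outbound = fn(string $p): string => $p === '/node/1' ? '/some-random-path' : $p;
// No outbound processing at all, so the path is returned as it is.
$identity = fn(string $p): string => $p;

// Inbound only: the generated path is "/node/1", so a redirect is triggered.
$inboundOnly = needs_redirect('/some-random-path', $inbound, $identity);
// Inbound and outbound: the generated path matches the requested path.
$balanced = needs_redirect('/some-random-path', $inbound, $outbound);
```

With only the inbound translation in place the two paths disagree, which is the situation where the _disable_route_normalizer attribute is needed.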
Doing Something Useful
We've looked at swapping paths and preventing redirects, but let's do something useful with this system.
I was recently tasked with creating a module that would allow any page to be rendered as RSS. It wasn't that we needed an RSS feed, but that each individual page should have an RSS version available.
This was required as there was an integration with an external system that was used to pull information out of a Drupal site for newsletters. Having RSS versions of pages made it much easier for the system to parse the content of the page and so produce the newsletter. This also meant that if the theme changed the system wouldn't be affected as it wouldn't be using the theme of the site.
Essentially, the requirement meant that we needed to add "/rss" after any page on the site and it would render the page accordingly.
The resulting module was dubbed "Node RSS" and made extensive use of path processors to produce the result.
The first step was to create a controller that would react to paths like "/node/123/rss" to render the page as an RSS feed. This required a simple route being set up to allow Drupal to listen to that path and also to inject the current node object into the controller. The route also contains a simple permission, which provided a convenient way of activating the system when it was ready.
node_rss.view:
  path: '/node/{node}/rss'
  defaults:
    _title: 'RSS'
    _controller: '\Drupal\node_rss\Controller\NodeRssController::rssView'
  requirements:
    _permission: 'node.view all rss feeds'
    node: \d+
  options:
    parameters:
      node:
        type: entity:node
The rssView action of the NodeRssController just needs to render the node and return it as part of an RSS document. Using this we can now go to a node page at "/node/123/rss" and see an RSS version of the page.
I won't go into detail about producing the RSS version of the page here as it contains a lot of boilerplate code that goes beyond the scope of this article.
So far we only have half the functionality required. Seeing an RSS version of the page via the node ID is fine, but what we really want is to visit the full path of the page with "/rss" appended to the end.
The next step is to set up our path processor so that we can change the paths on the fly. In addition to the tags we are also passing in two other services for us to use in the class. These services are the path_alias.manager service for translating paths and the language_manager service to ensure that we get the path with the correct language.
services:
  node_rss.path_processor:
    class: Drupal\node_rss\PathProcessor\NodeRssPathProcessor
    arguments:
      - '@path_alias.manager'
      - '@language_manager'
    tags:
      - { name: path_processor_inbound, priority: 20 }
      - { name: path_processor_outbound, priority: 220 }
The processInbound() method looks for the "/rss" string at the end of the passed path. If this is found then we remove that from the path and try to find the internal path of the page in the site. If we do find the path then it will be returned as "/node/123" instead of the full path alias and this means we can just append "/rss" to the end of the path to point the path at our NodeRssController::rssView action.
public function processInbound($path, Request $request): string {
  if (preg_match('/\/rss$/', $path) === 0) {
    // String is not an RSS feed string.
    return $path;
  }

  // Remove only the trailing "/rss" so the rest of the path is untouched.
  $nonRssPath = preg_replace('/\/rss$/', '', $path);
  $internalPath = $this->pathAliasManager->getPathByAlias($nonRssPath, $this->languageManager->getCurrentLanguage()->getId());

  if ($internalPath === $nonRssPath || preg_match('/^\/node\//', $internalPath) === 0) {
    // No matching path was found, or, it wasn't a node path that we have.
    return $path;
  }

  return $internalPath . '/rss';
}
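One detail worth trying out in isolation is how the "/rss" suffix is removed. The following is a minimal sketch in plain PHP, where strip_rss_suffix() is a hypothetical helper and the example path is made up. It compares an anchored regular expression with a plain str_replace(), which removes every occurrence of "/rss" in the path and not just the one at the end.

```php
<?php

// Remove a single trailing "/rss" segment and leave the rest of the path alone.
function strip_rss_suffix(string $path): string {
  return preg_replace('/\/rss$/', '', $path);
}

// Anchored removal keeps the leading "/rss-feeds" segment intact.
$anchored = strip_rss_suffix('/rss-feeds/about-us/rss');

// str_replace() also strips the "/rss" at the start of "/rss-feeds".
$unanchored = str_replace('/rss', '', '/rss-feeds/about-us/rss');
```

Here $anchored is "/rss-feeds/about-us", whereas $unanchored ends up as "-feeds/about-us", which is no longer a valid path to look up an alias for.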
The opposite process needs to happen for the processOutbound() method. In this case we look for a path that looks like "/node/123/rss" and convert this back into the full path alias of the page. If we find an alias for that path then we append "/rss" to the path and return it.
public function processOutbound($path, &$options = [], ?Request $request = NULL, ?BubbleableMetadata $bubbleable_metadata = NULL): string {
  if (preg_match('/^\/node\/.*?\/rss$/', $path) === 0) {
    // String is not an RSS feed string.
    return $path;
  }

  // Remove only the trailing "/rss" so the rest of the path is untouched.
  $nonRssPath = preg_replace('/\/rss$/', '', $path);
  $alias = $this->pathAliasManager->getAliasByPath($nonRssPath, $this->languageManager->getCurrentLanguage()->getId());

  if ($nonRssPath === $alias) {
    // An internal alias was not found.
    return $path;
  }

  return $alias . '/rss';
}
We now have an RSS feed for any content path on the website (as long as it is a node page of some kind).
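The whole round trip can be tried out without a Drupal site using a sketch like the one below. It is only an approximation of the processor. The ALIASES constant stands in for the alias storage, path_by_alias() and alias_by_path() are hypothetical stand-ins for the alias manager that return their input when nothing matches, and the language handling is left out entirely.

```php
<?php

// Hypothetical alias table standing in for the path alias storage.
const ALIASES = ['/node/123' => '/about-us'];

// Mimics an alias lookup that returns its input when nothing matches.
function path_by_alias(string $alias): string {
  $path = array_search($alias, ALIASES, TRUE);
  return $path === FALSE ? $alias : $path;
}

// Mimics a path lookup that returns its input when there is no alias.
function alias_by_path(string $path): string {
  return ALIASES[$path] ?? $path;
}

// Inbound: "/about-us/rss" becomes "/node/123/rss".
function rss_inbound(string $path): string {
  if (preg_match('/\/rss$/', $path) !== 1) {
    return $path;
  }
  $base = preg_replace('/\/rss$/', '', $path);
  $internal = path_by_alias($base);
  if ($internal === $base || preg_match('/^\/node\/\d+$/', $internal) !== 1) {
    // No alias was found, or the alias doesn't point at a node.
    return $path;
  }
  return $internal . '/rss';
}

// Outbound: "/node/123/rss" becomes "/about-us/rss".
function rss_outbound(string $path): string {
  if (preg_match('/^\/node\/\d+\/rss$/', $path) !== 1) {
    return $path;
  }
  $base = preg_replace('/\/rss$/', '', $path);
  $alias = alias_by_path($base);
  return $alias === $base ? $path : $alias . '/rss';
}
```

Passing "/about-us/rss" through rss_inbound() and then rss_outbound() returns the original path, while paths that don't end in "/rss" or that have no alias are passed through untouched.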
If we attempted to visit the RSS output of any other kind of page (like a taxonomy term) then we would receive a 404 error. This is possible thanks to the route we have in place as the parameter will only accept node paths.
As we have translated the path completely we do not need the Redirect module overrides here since there is a coherent input/output mechanism for these paths. It's only when there is an imbalance in the paths that we need to override the Redirect module to prevent redirects.
Don't worry if you are looking for the full source code for the above module as I have recently released the Node RSS module on Drupal.org. It only has a dev release for the time being as I would like to add the ability to pick what content types are available for the feeds. I'm also testing it with different setups to make sure that the feed works in different situations. Let me know if it is useful for you and please create a ticket if you have any issues.
If you want to see another module that makes use of this technique then there is the Dynamic Path Rewrites module. This allows the rewriting of any content path on the fly without creating path aliases. It is an alternative to modules like Pathauto that avoids creating path aliases within your system, and it uses a nice caching system to speed up the responses.
Conclusion
The path processing system in Drupal is really quite powerful and can be used to build some interesting features that rewrite paths on the fly. We can take any incoming request and map it internally to any path we like.
Without this system in place we would need to generate additional aliases for every path we wanted and add them to the database before we would be able to use the system. That is fine on smaller sites, but I manage sites with millions of nodes and that amount of data would bloat the database and probably not be used all that much.
Path processing does have some interactions with other modules (like Redirect) but these problems are easily overcome. Perhaps the most complex part of this is ensuring that you have the right priorities for your processors and event subscribers, as getting them wrong will likely lead to unwanted behavior.
Comments
Hello,
Enjoying your article. I think perhaps you were thinking about weights when you wrote the below ?
Unlike weights where lower number floats to the top, when considering priorities, the higher the number the higher the priority. So I think you meant to write: «any setting more than 100 will cause the processing to be performed before Drupal's internal handler is called.» ?
Submitted by Renaud on Wed, 10/30/2024 - 19:01