Tidy Up A URL With PHP

Lots of applications require a user to input a URL, and lots of problems occur as a result. I was recently looking for something that would take a URL as input and let me make sure that it was formatted properly. I couldn't find anything that did this, so I decided to write it myself.

The following function takes in a URL as a string and tries to clean it up. It essentially does this by splitting it apart with the parse_url() function and then putting it back together again. For parse_url() to pick out the host correctly, the URL needs a scheme (such as http://) in front of it, so the first thing the function does (after trimming the string) is to check that a scheme exists. If it doesn't, the function adds http:// onto the front.

function tidyUrl($url){
 // trim the string
 $url = trim($url);
 // check for a scheme (case-insensitively) and if there isn't one then add it
 if(!preg_match('#^(https?|ftp)://#i', $url)){
  $url = 'http://'.$url;
 }
 // parse the url
 $parsed = @parse_url($url);
 if(!is_array($parsed)){
  return false;
 }
 // rebuild url
 $url = isset($parsed['scheme']) ? $parsed['scheme'].':'.((strtolower($parsed['scheme']) == 'mailto') ? '' : '//') : '';
 $url .= isset($parsed['user']) ? $parsed['user'].(isset($parsed['pass']) ? ':'.$parsed['pass'] : '').'@' : '';
 $url .= isset($parsed['host']) ? $parsed['host'] : '';
 $url .= isset($parsed['port']) ? ':'.$parsed['port'] : '';
 // if no path exists then add a slash
 if(isset($parsed['path'])){
  $url .= (substr($parsed['path'],0,1) == '/') ? $parsed['path'] : ('/'.$parsed['path']);
 }else{
  $url .= '/';
 }
 // append query
 $url .= isset($parsed['query']) ? '?'.$parsed['query'] : '';
 // return url string
 return $url;
}

The parse_url() function should return an array if successful. If it doesn't (it returns false for seriously malformed URLs), tidyUrl() returns false as well.
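
To see what tidyUrl() has to work with, here is a minimal sketch of what parse_url() gives back for a well-formed URL, for a URL with no scheme, and for a seriously malformed one. The example URLs are made up purely for illustration.

```php
<?php
// Hypothetical example URL, chosen so that every component is present.
$parts = parse_url('http://user:secret@example.com:8080/docs/index.php?id=7#top');

// Each component found in the URL becomes a key in the returned array.
echo $parts['scheme'], "\n"; // http
echo $parts['host'], "\n";   // example.com
echo $parts['port'], "\n";   // 8080
echo $parts['path'], "\n";   // /docs/index.php
echo $parts['query'], "\n";  // id=7

// Without a scheme, parse_url() treats the whole string as a path,
// so there is no 'host' key to rebuild the URL from.
$noScheme = parse_url('example.com/docs');
var_dump(isset($noScheme['host'])); // bool(false)
echo $noScheme['path'], "\n";       // example.com/docs

// A seriously malformed URL produces false instead of an array.
var_dump(parse_url('http:///example.com')); // bool(false)
```

The second case is the reason tidyUrl() adds a scheme before parsing, and the third is the case that the is_array() check catches.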

This function is also useful if you want to keep a standard format for every URL that you store. To make this easier in the long term you should store any domain-only URL with the trailing slash. If the user doesn't add one then the function adds it onto the end.
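
As a rough illustration of the storage use case, the sketch below runs a few made-up inputs through tidyUrl(). The function is repeated here without comments only so that the snippet runs on its own, and the sample inputs are assumptions rather than real data.

```php
<?php
// Copy of tidyUrl() from above, included so this snippet is self-contained.
function tidyUrl($url){
 $url = trim($url);
 if(!preg_match('#^(https?|ftp)://#i', $url)){
  $url = 'http://'.$url;
 }
 $parsed = parse_url($url);
 if(!is_array($parsed)){
  return false;
 }
 $url = isset($parsed['scheme']) ? $parsed['scheme'].':'.((strtolower($parsed['scheme']) == 'mailto') ? '' : '//') : '';
 $url .= isset($parsed['user']) ? $parsed['user'].(isset($parsed['pass']) ? ':'.$parsed['pass'] : '').'@' : '';
 $url .= isset($parsed['host']) ? $parsed['host'] : '';
 $url .= isset($parsed['port']) ? ':'.$parsed['port'] : '';
 if(isset($parsed['path'])){
  $url .= (substr($parsed['path'],0,1) == '/') ? $parsed['path'] : ('/'.$parsed['path']);
 }else{
  $url .= '/';
 }
 $url .= isset($parsed['query']) ? '?'.$parsed['query'] : '';
 return $url;
}

// Hypothetical user input: stray whitespace, missing scheme,
// missing trailing slash, and one malformed URL.
$inputs = array(
 ' example.com ',
 'https://example.com',
 'www.example.com/about?page=2',
 'http:///example.com',
);

foreach($inputs as $input){
 $tidy = tidyUrl($input);
 // tidyUrl() returns false when parse_url() rejects the URL
 echo ($tidy === false ? '(invalid)' : $tidy), "\n";
}
// prints:
// http://example.com/
// https://example.com/
// http://www.example.com/about?page=2
// (invalid)
```

Because the first two inputs both come out with a trailing slash, values stored this way can be compared as plain strings later on.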
