Using Apache Rewrite Rules to make cleaner, prettier URLs
This is my explanation of how to turn ugly, auto-generated URLs into pretty ones using the magic of Apache's mod_rewrite module.
Just kidding! Regular expressions are not magic
My goal
- My goal was to transform an ugly URL which originally looked like this:
https://ankiewicz.com/photos/travel/hullbeach/slides/tree-in-sand.html
- ...into this prettier, cleaner URL:
https://ankiewicz.com/photos/travel/hullbeach/tree-in-sand/
- I wanted to do two things:
- Get rid of the term slides
- Get rid of the extension .html
You may have reasons for wanting to do this
- Your gallery generator (I use jAlbum) only outputs ugly URLs
- You have 10,000 ugly URLs and cannot be bothered to clean them up by hand
- Some third-party plugin creates ugly URLs
- You're OCD and the URLs drive you crazy
- It's okay. We get it. Let's make them pretty
Some things to know
- The mod_rewrite module might already be enabled
- If not, you may need to ask your system administrator to enable it for your website
- The URL is the domain followed by the path
- My domain is ankiewicz.com
- This is a sample ugly path as the user would see it in their browser when visiting my domain:
/photos/travel/hullbeach/slides/tree-in-sand.html
- This is a sample pretty path as the user would see it in their browser when visiting my domain:
/photos/travel/hullbeach/tree-in-sand/
First things first
- Create a plain-text file named .htaccess (don't forget the dot)
- Add as many comments as you want using the # symbol
- Upload it to the top level of your website (root, as they say)
- Include the following code:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
#### Prettify my URLs
RewriteRule ^photos/(.*)/(.*)/(.*)/$ /photos/$1/$2/slides/$3.html [L]
</IfModule>
The nitty gritty
- Domain consistency is not part of this discussion, but I included it so you can see how it works
- The first line that starts with RewriteRule is where the magic happens
- There are three parts to a Rewrite Rule:
- the pretty path on the left
- the ugly path on the right
- the flag in brackets at the end
- The flag [L] (short for "last") tells Apache to stop processing further rules once this one matches
- The path on the left is what I want my path to look like
- The path on the right is the original path that my users are used to seeing
- The symbols are part of a system of regular expressions for matching one string to another
What the symbols mean
^ | matches the beginning of the string |
. | matches any single character |
.* | matches any character zero or more times |
( ) | groups characters into a single unit and captures the match for use in a back-reference |
/ | is a normal slash in your path |
$ | matches the end of the string |
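One consequence of the table above: because . matches any character, slashes included, each (.*) is free to stretch across more than one path segment. The sketch below is a stricter variant, not the rule used on this page. It swaps (.*) for ([^/]+), which matches one or more characters that are not a slash, so each group can only ever hold a single segment. It assumes every pretty path has exactly three segments after photos/.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
#### Prettify my URLs (stricter variant, offered as an assumption)
# [^/]+ means "one or more characters that are not a slash",
# so $1, $2, and $3 each hold exactly one path segment
RewriteRule ^photos/([^/]+)/([^/]+)/([^/]+)/$ /photos/$1/$2/slides/$3.html [L]
</IfModule>
```

With this variant, a request such as /photos/a/b/c/d/ no longer matches at all, whereas the (.*) version would match it and fold two segments into one group.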
How it works
- There are 3 instances of (.*) in my pretty path
- Each instance of (.*) can be back-referenced in my ugly path as $1, $2, and $3
- The path on the right is a symbolic representation of a real, ugly file system path:
- $1 represents the 1st instance of (.*)
- $2 represents the 2nd instance of (.*)
- $3 represents the 3rd instance of (.*)
- You can now format your <a href> links to take the pretty form
- For instance, here is my new link format:
<a href="/photos/travel/hullbeach/tree-in-sand/">here is a link</a>
- You can link to your pretty path and it will find and serve up the file at the ugly path without the user being the wiser
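To make the back-references concrete, here is the same rule with one request traced by hand in the comments. The request is the sample pretty path from earlier. The trace is an illustration worked out from the pattern, not Apache log output.

```apache
# Requested path as the rule sees it in .htaccess (no leading slash):
#   photos/travel/hullbeach/tree-in-sand/
#
# What each (.*) captures:
#   $1 = travel
#   $2 = hullbeach
#   $3 = tree-in-sand
#
# Path that Apache serves after substitution:
#   /photos/travel/hullbeach/slides/tree-in-sand.html
RewriteRule ^photos/(.*)/(.*)/(.*)/$ /photos/$1/$2/slides/$3.html [L]
```

The address in the browser stays on the pretty path because this is an internal rewrite, not a redirect.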
Caveats
- For this to work seamlessly you must change all the links in your HTML to be pretty
- Otherwise your users will still be able to see the ugly paths
- The ugly paths didn't die; I am simply not linking to them in my web pages anymore
- If I typed the ugly path in by hand, it would still be accessible to me
- If Google had indexed my ugly paths, they would still be accessible via the search engine. I make sure this doesn't happen by establishing a canonical URL for each page on my site.
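If you would rather have the ugly paths forward visitors to the pretty ones, one common approach is an external 301 redirect placed ahead of the prettify rule. The block below is an untested sketch, not part of the original setup. It assumes every ugly URL has the shape photos/x/y/slides/name.html. The RewriteCond on %{THE_REQUEST} is there to avoid a redirect loop: that variable holds the request line the browser originally sent, so the condition is false when the prettify rule has internally rewritten a pretty path into an ugly one.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
#### Send visitors who ask for an ugly URL to the pretty one (assumption)
# Only fire when the browser itself requested the ugly path
RewriteCond %{THE_REQUEST} \s/photos/[^/]+/[^/]+/slides/[^/\s?]+\.html[\s?]
RewriteRule ^photos/([^/]+)/([^/]+)/slides/([^/]+)\.html$ /photos/$1/$2/$3/ [R=301,L]
#### Prettify my URLs
RewriteRule ^photos/(.*)/(.*)/(.*)/$ /photos/$1/$2/slides/$3.html [L]
</IfModule>
```

Browsers cache 301 responses aggressively, so it may be worth trying R=302 first and switching to 301 once the behavior looks right.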
End results