JENS MALMGREN I create, that is my hobby.

Porting my blog for the second time, setting up Apache2

This is post #18 of my series about how I port this blog from Blogengine.NET 2.5 ASPX on a Windows Server 2003 to a Linux Ubuntu server with Apache2, MySQL and PHP, a so-called LAMP stack. The introduction to this project can be found in this blog post: /post/Porting-my-blog-for-the-second-time-Project-can-start.

Working with MySQL was new to me and it was a really nice experience, except for the Schrödinger's Table failure, obviously (see post /post/Porting-my-blog-for-the-second-time-images-part-5). So far in this project we did "really simple things" such as parsing XML files and setting up databases. Actually not that simple, but in hindsight it was not that bad, don't you think? And now? Now we are going to configure Apache2, and that is absolutely much more... new to me.

These are the steps I will follow to set up the web server and create my own blog engine:

  1. Set up a local domain for test purposes on my local domain controller
  2. Create the website in Apache2
  3. Create a PHP test page
  4. Set up the URL structure of the website
  5. Start developing the PHP page that fetches the data from the database and presents the blog

Anything can change but this is the plan right now. Let's go!

Step 1.

To test the website I need to be able to send requests to the web server as if the website was available on the Internet, but right now it is not, so how do I solve that? I will create a domain name that is available locally on my own network. Later on, when the blog is operational, I will connect it to the Internet via a firewall and change it to work on my real domain names.

A web server can host several websites. When a request arrives at the web server via a domain name, the server decides which site to serve based on what that domain name was. My local domain controller right now is my Windows Server 2003.

So I added a zone, and in that zone I added an A record with the IP of the Ubuntu server in the same network. Now when I enter this domain name in a browser from within the network I arrive at the Apache2 of my Ubuntu server. Not to forget: the domain name travels along with the request, so Apache2 can recognize it and send the request to the correct site. But right now that is just theory, because first I need to set up the site in Apache.

Step 2.

Here is information about how to set up websites on Apache2: https://www.digitalocean.com/community/tutorials/how-to-set-up-apache-virtual-hosts-on-ubuntu-14-04-lts
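
As a rough sketch of where that tutorial leads for this setup, a virtual host file could look like the one below. The file name jensblog.nl.conf and the document root /var/www/jensblog/public_html are the ones used later in this post. The sites-available directory, the ServerAlias and the log file names are assumptions for illustration only.

```apache
# Sketch of /etc/apache2/sites-available/jensblog.nl.conf
# (sites-available is the usual Ubuntu location, assumed here)
≺VirtualHost *:80≻
    # The domain name from the local DNS zone. Apache compares it with the Host header of the request.
    ServerName jensblog.nl
    # Assumption: also answer to the www variant.
    ServerAlias www.jensblog.nl
    DocumentRoot /var/www/jensblog/public_html

    # Hypothetical log file names.
    ErrorLog ${APACHE_LOG_DIR}/jensblog-error.log
    CustomLog ${APACHE_LOG_DIR}/jensblog-access.log combined
≺/VirtualHost≻
```

On Ubuntu a site like this is typically enabled with a2ensite jensblog.nl.conf followed by a reload of Apache.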

Step 3.

I created a test page. I called my test page index.php and it was really simple:

≺html≻
≺body≻
Test of Jens Malmgren Blog
≺/body≻
≺/html≻

Here we already have a beginning, that is great! What can go wrong? Well... I tried to enter jensblog.nl in my browser and it came up with all sorts of things but not my test page. There are several layers where this can fail. So I tried to ping the name, and ping resolved it to the IP of my Ubuntu server!? After some experiments I figured out it was the browser trying to be clever. In the old days you had one address field where you entered a URL. These days you have a field that tries to be clever. When you enter www followed by a dot it will resolve it for you. If you don't start with www then you must be searching for something. Right? No! To circumvent this I had to enter the full address including http:// and there it was. My test page was alive and kicking!

Once the browser has found a page it remembers this, so the next time I can just enter jensblog.nl and it works. No need to enter http:// every time!

It is as if each step in my todo list is a hundredfold more complex than the previous step. Which is fine. Actually I will probably never finish step 5. Ever. Step 5 will grow and mutate into endless sub todo lists. But we are not there yet. Let's continue with step 3.

My test PHP page needs to read the arguments given to the page and show them.

≺?php
error_reporting(-1);
ini_set("display_errors", 1);

// Initialize the $arrayArgs hash so that we can look up which arguments were sent to the page.
// http://www.jens.malmgren.nl/post/Porting-my-blog-for-the-second-time-setting-up-Apache2.aspx
$arrayArgs = array();
if (!empty($_SERVER['QUERY_STRING']))
{
	$arrayArgsList = explode("&", strtolower($_SERVER['QUERY_STRING']));
	for ($i = 0; $i ≺ count($arrayArgsList); $i++)
	{
		// Split into key and value. An argument without "=" has no value and is skipped below.
		$arrayKeyValue = explode("=", $arrayArgsList[$i], 2);
		
		// Validating the args. Incorrect characters in args will not be accepted.
		// The query string was lowercased above, so the patterns only need a-z.
		if (count($arrayKeyValue) == 2 && preg_match("/^[a-z][a-z0-9]+$/", $arrayKeyValue[0]) && preg_match("/^[a-z][a-z0-9-]*$/", $arrayKeyValue[1]))
		{
			$arrayArgs[$arrayKeyValue[0]] = $arrayKeyValue[1];
		}
	}
}
?≻

≺html≻
≺body≻
Test Args≺br≻
≺?php
if (count($arrayArgs) ≻ 0)
{
	print_r($arrayArgs);
}
else
{
	print "No valid args≺br≻";
}
?≻
≺/body≻
≺/html≻

When I request the page without any arguments:

jensblog.nl

Then the result looks like this:

Test Args
No valid args

And when I request the page with for example two arguments:

http://jensblog.nl/?arg1=argument1&arg2=argument2

Then the result looks like this:

Test Args
Array ( [arg1] =≻ argument1 [arg2] =≻ argument2 ) 
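
PHP can also do the splitting itself. The sketch below applies the same whitelist idea on top of the built-in parse_str(). The function name parseValidArgs is made up for this illustration, and unlike the hand-written loop it URL-decodes names and values before validating them, so treat it as a variation under those assumptions and not as the code the page actually runs.

```php
≺?php
// Hypothetical helper (not part of the original test page):
// the same whitelist check, built on PHP's own query-string parser.
function parseValidArgs($queryString)
{
    $validArgs = array();

    // parse_str() splits on & and =, and URL-decodes names and values.
    parse_str(strtolower($queryString), $parsed);

    foreach ($parsed as $key =≻ $value) {
        // Skip nested values such as ?a[]=1 and anything outside the whitelist.
        if (is_string($value)
            && preg_match('/^[a-z][a-z0-9]+$/', (string) $key)
            && preg_match('/^[a-z][a-z0-9-]*$/', $value)) {
            $validArgs[$key] = $value;
        }
    }

    return $validArgs;
}

// Prints the same two arguments as the test page above.
print_r(parseValidArgs('arg1=argument1&arg2=argument2'));
```

Arguments without a value, such as ?flag, end up with an empty string and are rejected by the second pattern, so they never reach the result.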

 

With the test page in this state I finished step 3.

Step 4.

Now it is time to set up the URL structure for the blog. What I mean by this is that I want really easily readable URLs for my blog. No question marks, equal signs, '.php' or ampersands. I want really simple URLs. To achieve this I want to rewrite/transform the URLs before they are presented to the PHP engine.

A request comes to my website via my domain name. That domain name is resolved to the fixed IP address that my ISP has given me. The server receives the request, and on the server the web service looks at it and decides where to go. If sunneras.se was requested then that site is served. If my blog was requested then we come to my blog site. Normally the server gets the page from disk and sends it to the browser, but now I want to add something between the moment the request comes in and the moment the web service decides what page to serve. I want to transform the URL. This is new to me, I have never done it before, so I am excited!

I was thinking that it could be nice to set up the blog in this way:

www.jens.malmgren.nl/post/porting-my-blog-for-the-second-time-setting-up-apache2

Behind the scenes this would come to index.php like this:

www.jens.malmgren.nl?post=porting-my-blog-for-the-second-time-setting-up-apache2

Then when I want to serve a page about a category then it could look like this:

www.jens.malmgren.nl/category/about-blogging

And likewise that would come to index.php like this:

www.jens.malmgren.nl?category=about-blogging
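
To make the intended mapping concrete, here is a small PHP sketch of the translation from a readable path to a query string. The function name prettyPathToQuery is hypothetical and the allowed characters in the slug are an assumption. On the real site this translation is supposed to happen inside Apache before PHP ever sees the request, so this only illustrates the idea.

```php
≺?php
// Hypothetical helper, only to illustrate the intended mapping.
function prettyPathToQuery($path)
{
    // Accept /post/some-slug and /category/some-slug, nothing else.
    if (preg_match('#^/(post|category)/([a-z0-9-]+)$#i', $path, $matches)) {
        return strtolower($matches[1]) . '=' . strtolower($matches[2]);
    }

    // Anything else is not one of the readable URLs described above.
    return null;
}

echo prettyPathToQuery('/post/porting-my-blog-for-the-second-time-setting-up-apache2'), "\n";
echo prettyPathToQuery('/category/about-blogging'), "\n";
```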

I found this article about the subject: https://www.digitalocean.com/community/tutorials/how-to-set-up-mod_rewrite-for-apache-on-ubuntu-14-04.

This how-to talks about a site located at /var/www/html but my site is located at /var/www/jensblog/public_html.

Clueless but hopeful, I am guessing I should use my location and not the location from the instruction, so inside 000-default.conf I insert this block:

≺Directory /var/www/jensblog/public_html≻
    Options Indexes FollowSymLinks MultiViews
    AllowOverride All
    Order allow,deny
    allow from all
≺/Directory≻

Then I created the .htaccess file as the instruction told me. I made my own variation:

RewriteEngine on
RewriteRule ^about$ index.php/?page=about [NC]

This did not work at all. And then my newly created file disappeared! I use WinSCP from a Windows system to browse around, and files starting with a dot are supposed to be hidden there, but I need to work with this file so I changed the setting to show hidden files.

But that did not help me, because Apache would still not serve me the page. Still clueless, I started thinking: the instruction wanted me to insert that Directory block into the 000-default.conf file, but I have another configuration file for my blog, namely jensblog.nl.conf. Would it not be better if I inserted it there? IT WORKED!
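
One plausible reason, stated as an assumption because the post does not confirm it: a Directory block placed inside a VirtualHost section only applies to requests handled by that virtual host, so a block in 000-default.conf never reaches requests for jensblog.nl. A sketch of jensblog.nl.conf with the block in place could look like this. The ServerName line is assumed, and the access rule is written in the Apache 2.4 form.

```apache
# Sketch of jensblog.nl.conf with the override block included.
≺VirtualHost *:80≻
    # Assumed from the domain name used in this post.
    ServerName jensblog.nl
    DocumentRoot /var/www/jensblog/public_html

    ≺Directory /var/www/jensblog/public_html≻
        Options Indexes FollowSymLinks MultiViews
        # Lets the .htaccess file in this directory carry the rewrite rules.
        AllowOverride All
        # Apache 2.4 access syntax. "Order allow,deny" with "allow from all" is the 2.2 form,
        # which 2.4 still accepts through mod_access_compat.
        Require all granted
    ≺/Directory≻
≺/VirtualHost≻
```

mod_rewrite itself has to be enabled as well, which on Ubuntu is usually done with a2enmod rewrite followed by a restart of Apache.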

Hehe, to understand the rewrite instructions I need to rewrite the rewrite instructions. When I enter this URL in the web browser:

jensblog.nl/about

The page answers:

Test Args
Array ( [page] =≻ about )

Wonderful!

Do you see that my life is full of regular expressions? We have them all over the place. My absolute favorite is the zero-width negative lookahead assertion. But that is for another time.

Now that I had gotten started, I added a second rewrite rule to handle the URLs of my posts:

RewriteEngine on
RewriteRule ^about$ index.php/?page=about [NC]
RewriteRule ^post/(.*)$ index.php/?post=$1 [NC]

And this works like a charm! Now when I enter this URL in the address bar of my browser:

jensblog.nl/post/porting-my-blog-for-the-second-time-setting-up-apache2

My page replies with:

Test Args
Array ( [post] =≻ porting-my-blog-for-the-second-time-setting-up-apache2 )

Right now this is as far as I will go with rewriting. Later on I will come back to this subject when I want to add new features to my blog, like separate pages for the categories and tags.
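
Purely as a sketch of how that later step might look, the same pattern can be repeated for categories and tags. The last two rules are hypothetical and not part of the working setup in this post. The category parameter follows the URL idea from earlier in this post, while the tag parameter name is an assumption.

```apache
RewriteEngine on
RewriteRule ^about$ index.php/?page=about [NC]
RewriteRule ^post/(.*)$ index.php/?post=$1 [NC]
# Hypothetical additions, following the same pattern as the post rule.
RewriteRule ^category/(.*)$ index.php/?category=$1 [NC]
RewriteRule ^tag/(.*)$ index.php/?tag=$1 [NC]
```

Adding the L flag, as in [NC,L], would stop rule processing after the first match. The rules above keep the flags used in this post.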

So here ends step 4. Now I will start on step 5. But that is for next time.

I was born in 1967 in Stockholm, Sweden. I grew up in the small village of Vågdalen in northern Sweden. In 1989 I moved to Umeå to study Computer Science at the University of Umeå. In 1995 I moved to the Netherlands, where I live in Almere, not far from Amsterdam.

Here on this site I let you see my creations.
