JENS MALMGREN I create, that is my hobby.

Porting my blog for the second time, categories part 2

This is post #42 of my series about how I port this blog from Blogengine.NET 2.5 ASPX on a Windows Server 2003 to a Linux Ubuntu server, Apache2, MySQL and PHP. A so called LAMP. The introduction to this project can be found in this blog post /post/Porting-my-blog-for-the-second-time-Project-can-start.

So now it is time to start working on the categories. So I duplicated index.php and called the dupe category.php. Then I started thinking that perhaps I should rename index.php to post.php?

...hours later...

I figured out some things.

Since I changed the name of the default document I also had to change the rule defining the default document name. Changed the line with DirectoryIndex  in '/etc/apache2/sites-available/jensblog.nl.conf' so that the default document is called host.php. Then I also had to change the file '.htaccess' to this:

RewriteEngine on
RewriteRule ^post/(.*)$ /post.php/?post=$1 [NC]
RewriteRule ^category/(.*)$ /category.php/?category=$1 [NC]

But after I renamed index.php to post.php I only got 404 error messages. The error.log file had lines like these:

[Mon Jan 04 21:36:44.675538 2016] [negotiation:error] [pid 2067] [client 123.123.123.123] AH00687: Negotiation: discovered file(s) matching request: /var/www/jensblog/public_html/category (None could be negotiated)., referer: http://jensblog.nl/

The solution was to edit the configuration file, in my case '/etc/apache2/sites-available/jensblog.nl.conf'

≺VirtualHost *:80≻
	≺Directory /var/www/jensblog/public_html≻
                #Options Indexes FollowSymLinks MultiViews
                Options Indexes FollowSymLinks
                AllowOverride All
                Order allow,deny
                allow from all
	≺/Directory≻

...

I commented out the line 3 with MultiViews and replaced it with line 4.

The MultiViews makes it possible to serve different files based on the users language preferences etc but I don't want that. I just want things to work.

Then I restarted the apache2 webserver with '#service apache2 restart'. The hash indicates that I already had run the 'sudo su' command.

And then it worked again. Like this it sounds so simple. :)

The challenge caused by renaming the default page revealed something I had not realized before and that was the difference that the page was launched via the default mechanism versus when it is launched via the URL rewrite mechanism.

Here are a couple of examples of URLs:

  • jensblog.nl will launch the default page.
  • jensblog.nl/post will also launch the default page but not via the url rewrite.
  • jensblog.nl/post/test this will match the rewrite and launch page post.php?post=test
  • jensblog.nl/category would not match any rewrite and also in my case the default page would not be loaded. Fail.
  • jensblog.nl/category/test would match the rewrite and launch page category.php?category=test

So in the original situation it worked but because of the wrong reason. Changed the rewrite rules like this to resolve also the two missing cases:

RewriteEngine on
RewriteRule ^post$ /post.php [NC]
RewriteRule ^post/(.*)$ /post.php/?post=$1 [NC]
RewriteRule ^category$ /category.php [NC]
RewriteRule ^category/(.*)$ /category.php/?category=$1 [NC]

It worked.

In my old blog made with BlogEngine.NET it is possible to give a category a description but somehow this is a mess in my case. I would like to correct this before I proceed. I want each category to be like their own web pages so they need Slug, Title, Description and Content.

While we are at it I also decided to migrate my year categories away from the regular categories.

When I listen to myself it sounds like I need to stop work on the php page presenting the categories until I have my wishes implemented in the database. For that to happen I need to go back to the E-R and change that and also the perl program importing the data.

The Categories was the very first thing I worked on in this project in post #2. Back then this was my design:

And now I changed the ER to this:

 

The field ID is unchanged. Changed Name into Slug. Added the field Title, Description, Content and Type.

The Slug will work the same as for Posts so that when the blog url is jensblog.nl/category/test then test is the slug for that category identifying the summary page of that category.

The biggest difference is the new field TypeID. This is an ID referring to a record in the new CategoryType table. This new table makes it possible to have categories of several types. There will not be so many records in the CategoryType table. The first entry will be the called 'Category' and the second entry will be 'Year'. I could add the category type 'Tag' but I decided I will not use that. Instead I remove the Tag entities from the database entirely.

It took me really long time to understand the difference between Tag and Category and frankly BlogEngine.NET got it all wrong, that is why I could not understand tags. Most blog platforms got them wrong. What you really want is an unlimited number of dimensions of by which you can categorize the content like I have done now.

When I have two dimensions of categories then I can for example make a grid where I list the regular categories vertically and the year categories horizontally. Adding one dimension and yes we need a 3D representation of the blog categories. If I have any time over I might do that.

For now I have no time over at all. I have plenty of work to do. When I sync the database next time everything will disintegrate and fall apart and I will need to go back to all places where the Tags and Categories are handled from the import to presentation to make new queries and what not. Do I feel fear? No not really. I am excited!

The first anticipated challenge was to synchronize the database. But to my surprise the synchronization of the ER to the database was successful on the first attempt! How is that possible?

Next I had to revisit the import perl script. I started with creating the two CategoryType records. 1 for Category and 2 for Year.

Then I worked on the list of Categories. The existing old data from the old blog was just a big mess so I had to create a clean-up routine for it. This is the result:

ID: 1, TypeID: 2, Slug: '1980', Title: '1980', Description: 'Things I made 1980'
ID: 2, TypeID: 2, Slug: '1982', Title: '1982', Description: 'Things I made 1982'
ID: 3, TypeID: 2, Slug: '1983', Title: '1983', Description: 'Things I made 1983'
ID: 4, TypeID: 2, Slug: '1984', Title: '1984', Description: 'Things I made 1984'
ID: 5, TypeID: 2, Slug: '1988', Title: '1988', Description: 'Things I made 1988'
ID: 6, TypeID: 2, Slug: '1989', Title: '1989', Description: 'Things I made 1989'
ID: 7, TypeID: 2, Slug: '1996', Title: '1996', Description: 'Things I made 1996'
ID: 8, TypeID: 2, Slug: '1997', Title: '1997', Description: 'Things I made 1997'
ID: 9, TypeID: 2, Slug: '1998', Title: '1998', Description: 'Things I made 1998'
ID: 10, TypeID: 2, Slug: '1999', Title: '1999', Description: 'Things I made 1999'
ID: 11, TypeID: 2, Slug: '2005', Title: '2005', Description: 'Things I made 2005'
ID: 12, TypeID: 2, Slug: '2007', Title: '2007', Description: 'Things I made 2007'
ID: 13, TypeID: 2, Slug: '2008', Title: '2008', Description: 'Things I made 2008'
ID: 14, TypeID: 2, Slug: '2009', Title: '2009', Description: 'Things I made 2009'
ID: 15, TypeID: 2, Slug: '2010', Title: '2010', Description: 'Things I made 2010'
ID: 16, TypeID: 2, Slug: '2011', Title: '2011', Description: 'Things I made 2011'
ID: 17, TypeID: 2, Slug: '2012', Title: '2012', Description: 'Things I made 2012'
ID: 18, TypeID: 2, Slug: '2013', Title: '2013', Description: 'Things I made 2013'
ID: 19, TypeID: 2, Slug: '2014', Title: '2014', Description: 'Things I made 2014'
ID: 20, TypeID: 2, Slug: '2015', Title: '2015', Description: 'Things I made 2015'
ID: 21, TypeID: 1, Slug: 'about-artists', Title: 'About: Artists', Description: 'Blog posts about: Artists'
ID: 22, TypeID: 1, Slug: 'about-blogging', Title: 'About: Blogging', Description: 'Blog posts about: Blogging'
ID: 23, TypeID: 1, Slug: 'about-framing', Title: 'About: Framing', Description: 'Blog posts about: Framing'
ID: 24, TypeID: 1, Slug: 'about-material', Title: 'About: Material', Description: 'Blog posts about: Material'
ID: 25, TypeID: 1, Slug: 'motive-animals', Title: 'Motive: Animals', Description: 'Paintings and drawings of: Animals'
ID: 26, TypeID: 1, Slug: 'motive-fantasy', Title: 'Motive: Fantasy', Description: 'Paintings and drawings of: Fantasy'
ID: 27, TypeID: 1, Slug: 'motive-illusionistic', Title: 'Motive: Illusionistic', Description: 'Paintings and drawings of: Illusionistic'
ID: 28, TypeID: 1, Slug: 'motive-landscape', Title: 'Motive: Landscape', Description: 'Paintings and drawings of: Landscape'
ID: 29, TypeID: 1, Slug: 'motive-model', Title: 'Motive: Model', Description: 'Paintings and drawings of: Model'
ID: 30, TypeID: 1, Slug: 'motive-portrait', Title: 'Motive: Portrait', Description: 'Paintings and drawings of: Portrait'
ID: 31, TypeID: 1, Slug: 'motive-still-life', Title: 'Motive: Still Life', Description: 'Paintings and drawings of: Still Life'
ID: 32, TypeID: 1, Slug: 'org-aquarelmere', Title: 'Organization: Aquarelmere', Description: 'Paintings and drawings made at: Aquarelmere'
ID: 33, TypeID: 1, Slug: 'org-gisela', Title: 'Organization: Gisela', Description: 'Paintings and drawings made at: Gisela'
ID: 34, TypeID: 1, Slug: 'org-kunstlinie', Title: 'Organization: Kunstlinie', Description: 'Paintings and drawings made at: Kunstlinie'
ID: 35, TypeID: 1, Slug: 'org-stoker', Title: 'Organization: Stoker', Description: 'Paintings and drawings made at: Stoker'
ID: 36, TypeID: 1, Slug: 'type-acrylic', Title: 'Type: Acrylic', Description: 'Made with: Acrylic'
ID: 37, TypeID: 1, Slug: 'type-animation-2d', Title: 'Type: Animation 2d', Description: 'Made with: Animation 2d'
ID: 38, TypeID: 1, Slug: 'type-aquarelle', Title: 'Type: Aquarelle', Description: 'Made with: Aquarelle'
ID: 39, TypeID: 1, Slug: 'type-ceramic', Title: 'Type: Ceramic', Description: 'Made with: Ceramic'
ID: 40, TypeID: 1, Slug: 'type-drawing', Title: 'Type: Drawing', Description: 'Made with: Drawing'
ID: 41, TypeID: 1, Slug: 'type-linoleum-cut', Title: 'Type: Linoleum Cut', Description: 'Made with: Linoleum Cut'
ID: 42, TypeID: 1, Slug: 'type-music', Title: 'Type: Music', Description: 'Made with: Music'
ID: 43, TypeID: 1, Slug: 'type-oil', Title: 'Type: Oil', Description: 'Made with: Oil'
ID: 44, TypeID: 1, Slug: 'type-paper-cutting', Title: 'Type: Paper Cutting', Description: 'Made with: Paper Cutting'
ID: 45, TypeID: 1, Slug: 'type-photography-2d', Title: 'Type: Photography 2d', Description: 'Made with: Photography 2d'
ID: 46, TypeID: 1, Slug: 'type-youtube', Title: 'Type: YouTube', Description: 'Made with: YouTube'

And this entirely based on the old list with this transformation program:

# Load categories
# /post/Porting-my-blog-for-the-second-time-walk-the-old-data
# http://www.jens.malmgren.nl/post/Porting-my-blog-for-the-second-time-categories-part-2.aspx
my $filename = "/usr/local/bin/jensblog/App_Data/categories.xml";
my $parser = XML::LibXML-≻new("1.0", "UTF-8");
my $xmldoc = $parser-≻parse_file($filename);
my %dictGUIDToID = ();
my %dictTargetAndTagNameToURLID = ();

my $iCategoryType = 1;
if ($insertCategories)
{
	for my $category_node ($xmldoc-≻findnodes("//category"))
	{
		my $strGuidId = $category_node -≻ findnodes('@id') -≻ to_literal;
		my $strSlug = $category_node-≻textContent;
		my $strTitle = $strSlug;
		my $strDescription = "";
		
		if ($strTitle =~ /^[0-9]+$/)
		{
			$iCategoryType = 2;
			$strDescription = "Things I made " . $strTitle;
		}
		else
		{
			$iCategoryType = 1;
			$strTitle =~ s/org/organization/g;
			$strTitle =~ s/youtube/YouTube/g;
			$strTitle =~ s/^(.)/uc($1)/ge;
			$strTitle =~ s/(-)(.)/' '.uc($2)/ge;
			$strTitle =~ s/^([^s]+)/$1:/;
			
			$strDescription = $strTitle;
			$strDescription =~ s/About:/Blog posts about:/;
			$strDescription =~ s/Motive:/Paintings and drawings of:/;
			$strDescription =~ s/Organization:/Paintings and drawings made at:/;
			$strDescription =~ s/Type:/Made with:/;
		}
		
		$dbh-≻do('INSERT INTO Category (Slug, Title, Description, TypeID) VALUES (?, ?, ?, ?)', undef, $strSlug, $strTitle, $strDescription, $iCategoryType);
		my $intId = $dbh-≻{mysql_insertid};
		$dictGUIDToID{ $strGuidId } = $intId;
		print "ID: " . $intId . ", TypeID: " . $iCategoryType . ", Slug: '" . $strSlug . "', Title: '" . $strTitle . "', Description: '" . $strDescription . "'
";
	}
}

Especially line 30 and 31 are nice. Within a search and replace expression the function uc is transforming letters from lowercase to uppercase. I like that.

This way I have all the categories from my old blog ported to the new blog platform and I have the years in a separate category type.

Also I can present pages about each type with title and description. Later on I can also create specific content for the categories but that is later when I have finalized the blog engine. Either that or I make the content and inject during import.

After successfully importing the blog I launched the webpage of the blog. Did it work? No... Obviously the queries created to load the categories need to be adapted to the new table layout. I found that it is the easiest to work inside out with large queries like these. So I start with level 1 adapting to the new situation. Previously I selected only c1.Name but that field is now replaced by c1.Slug. Additionaly I select also c1.Title. It is only necessary to select categories where TypeID is equal to 1, regular categories. The fields c1Slug and c1Title are propagated to level 2 and from level two to three and finally to level 4.

SELECT u4.FileName as u4FileName, u4.Width as u4Width, u4.Height as u4Height,
    c.pc2rowNum, c.c1Slug, c.c1Title, c.p2Slug
FROM
(
    -- Level 3, subquery c
    -- All Posts from a limited to 9 Posts per Category
    SELECT b.*
    FROM
    (
        -- Level 2, subquery b
        -- All Posts with the same Categories as the current Post.
        -- pc2rowNum row numbers reset on change of Category.
        -- Get all Posts except current Post: pc2.PostID != a.p1ID
        SELECT p2.Slug as p2Slug, a.c1Slug, a.c1Title, pc2.PostID as pc2PostID,
            (SELECT @pc2rowNum := IF(@pc2cat = pc2.CategoryID,
            @pc2rowNum + 1, 1)) as pc2rowNum,
            (SELECT @pc2cat := pc2.CategoryID) as pc2Cat
        FROM PostCategory pc2 JOIN Post p2 JOIN
        (
            -- Level 1, subquery a.
            -- All Categories of the current Post. Initialize pc2rowNum and pc2cat.
            SELECT p1.ID as p1ID, p1.Slug as p1Slug, pc1.CategoryID as pc1CategoryID,
                c1.Slug as c1Slug, c1.Title as c1Title, p1.PublishedOn as p1PublishedOn,
                (SELECT @pc2rowNum := 0), (SELECT @pc2cat := 0)
            FROM Post p1, PostCategory pc1, Category c1
            WHERE p1.ID = pc1.PostID AND pc1.CategoryID = c1.ID AND
                p1.Slug = 'Jasmine-Tea' AND c1.TypeID = 1 /*Regular Categories*/
            ORDER BY CategoryID, PublishedOn DESC
        )
        AS a
        WHERE pc2.CategoryID = a.pc1CategoryID AND pc2.PostID != a.p1ID AND p2.ID = pc2.PostID
    ) AS b
    WHERE b.pc2rowNum ≺ 10
) AS c JOIN PostURL pu4 JOIN URL u4 ON pu4.PostID = c.pc2PostID AND pu4.URLID = u4.ID
WHERE u4.IsFileContent = 1 AND u4.TagName = 'img'

Level 3 still limits the number of pc2rowNum rows without adaptions.

In the rendering I previously had a little hack to avoid categories with years. That hack can be removed now.

Is it working? No, there is a second query for the case the first query yields no image result.

SELECT b.*
    FROM
    (
        -- Level 2, subquery b
        -- All Posts with the same Categories as the current Post.
        -- pc2rowNum row numbers reset on change of Category.
        -- Get all Posts except current Post: pc2.PostID != a.p1ID
        SELECT p2.Slug as p2Slug, p2.Title as p2Title,
            a.c1Slug, a.c1Title, pc2.PostID as pc2PostID,
            (SELECT @pc2rowNum := IF(@pc2cat = pc2.CategoryID,
            @pc2rowNum + 1, 1)) as pc2rowNum,
            (SELECT @pc2cat := pc2.CategoryID) as pc2Cat
        FROM PostCategory pc2 JOIN Post p2 JOIN
        (
            -- Level 1, subquery a.
            -- All Categories of the current Post. Initialize pc2rowNum and pc2cat.
            SELECT p1.ID as p1ID, p1.Slug as p1Slug, pc1.CategoryID as pc1CategoryID,
                c1.Slug as c1Slug, c1.Title as c1Title, p1.PublishedOn as p1PublishedOn,
                (SELECT @pc2rowNum := 0), (SELECT @pc2cat := 0)
            FROM Post p1, PostCategory pc1, Category c1
            WHERE p1.ID = pc1.PostID AND pc1.CategoryID = c1.ID
            AND p1.Slug = 'Jasmine-Tea' AND c1.TypeID = 1
            ORDER BY CategoryID, PublishedOn DESC
        )
        AS a
        WHERE pc2.CategoryID = a.pc1CategoryID AND pc2.PostID != a.p1ID AND p2.ID = pc2.PostID
    ) AS b
    WHERE b.pc2rowNum ≺ 6

In both queries here above 'Jasmine-Tea' should dynamically be replaced with the slug of the current page.

Now when rendering the left sidebar I made the category titles into hyperlinks pointing to the category page with the slug to the specific category.

Now when things are back it is about time to get going on those category pages. Wow, is this procrastination or is this busy waiting? No, wait! When I imported the old blog this time I deleted all images in the media directory before import. Now some images are gone. Hmm... I need to look into this before I can start on the categories page.

So next time I will start on the categories. I promise!

I was born 1967 in Stockholm, Sweden. I grew up in the small village Vågdalen in north Sweden. 1989 I moved to Umeå to study Computer Science at University of Umeå. 1995 I moved to the Netherlands where I live in Almere not far from Amsterdam.

Here on this site I let you see my creations.

I create, that is my hobby.