Porting my blog for the second time, walk the old data part 2

This is post #14 of my series about how I port this blog from BlogEngine.NET 2.5 ASPX on Windows Server 2003 to a Linux Ubuntu server with Apache2, MySQL and PHP, a so-called LAMP stack. The introduction to this project can be found in this blog post.

Now it is really time to walk the old data!

My program first deletes the data in the database and then goes on to load the categories. After that I can call the routine Walk, which I bring back from post #2 of this series.

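sub Walk
{
	my ($directory) = @_;
	opendir my $dirHandle, $directory or die "Failed to open $directory: $!";
	my @dirItems = readdir $dirHandle;
	closedir $dirHandle;
	for my $dirItem (@dirItems)
	{
		# Skip the current and the parent directory entries
		if ($dirItem =~ /^(\.|\.\.)$/) { next; }
		# Skip FrontPage metadata and theme directories
		if ($dirItem =~ /_vti_cnf$|themes$/) { next; }
		my $path = $directory . "/" . $dirItem;
		# Every XML file is a post to load
		if (-f $path && $path =~ /\.xml$/i) { LoadPost($path); }
		# Recurse into subdirectories
		if (-d $path) { Walk($path); }
	}
} # Walk()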
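
For comparison, the same kind of walk can be sketched with the core module File::Find instead of a hand-written recursion. This is not the code my program uses. The routine name FindPostFiles is made up for this illustration, and it only collects the file names instead of loading them.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical alternative to Walk: collect every .xml file below $directory,
# skipping anything whose name ends in _vti_cnf or themes.
sub FindPostFiles
{
	my ($directory) = @_;
	my @posts;
	find(sub {
		if (/_vti_cnf$|themes$/)
		{
			$File::Find::prune = 1;   # do not descend if this is a directory
			return;
		}
		push @posts, $File::Find::name if (-f $_ && /\.xml$/i);
	}, $directory);
	my @sorted = sort @posts;
	return @sorted;
}

my $root = '/usr/local/bin/jensblog/App_Data/posts';
if (-d $root)
{
	# The real program would call LoadPost($post) here
	print "Would load: $_\n" for FindPostFiles($root);
}
```

Collecting the names first makes it easy to count the posts before anything is written to the database.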

Whereas last time I only printed the files that passed the -f test, this time I call a routine LoadPost for each of them. This new routine handles all the processing of a file, but I build it from the source code I have already worked on. The beginning of the new LoadPost subroutine looks like this:

sub LoadPost
{
	my ($filenamePost) = @_;
	print "===============================\n";
	print "========== LOAD POST ==========\n";
	print "===============================\n";
	print "File: " . $filenamePost . "\n";
	# Load the fields of a post
	#my $filenamePost = "/usr/local/bin/jensblog/App_Data/posts/6d781b81-27ab-4ecf-a4ce-df2795128f0d.xml";
	$xmldoc = $parser->parse_file($filenamePost);

	my $strEncoding = $xmldoc->encoding();

When I run my program it goes on and on for a long time and then after 37 posts and 113 images it stops. It came across something it cannot handle.

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
  <title>141 blog posts</title>
  <description />
  <content><div class="separator"
style="text-align: left; clear: both;">Things are
going well with this blog. I made 141 blog posts with
this post. I learned how to handle the labels and how
to administrate the images. Today I made the QR code
for this blog. I created a couple of stickers with
the QR-image as well. Later on I might create handmade
“business cards” with the QR image
on aquarelle paper. That would be utterly cool. The
ultimate combination of art meets technology.</div>
<p><img src="/media/Jens+Blog+QR.png" alt="" /></p></content>
  <pubDate>2011-07-16 19:44:00</pubDate>
  <lastModified>2011-09-08 08:10:42</lastModified>
  <tags />
  <comments />
  <notifications />

It turns out I missed <img src="/image.axd?picture=2011%2f9%2fJens+Blog+QR.png" alt="" /> entirely. When I insert an image with the built-in "Insert Image" function in the old BlogEngine.NET, the link looks like this: "/image.axd?picture=2011%2f9%2fJens+Blog+QR.png". So... I am back to handling images again.
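
To make the link format concrete, here is a minimal sketch of how such a link could be taken apart. It is not the final solution. The routine name ImageAxdToPath is made up for this illustration, and the "/media/" target directory is an assumption based on the image link shown in the post above. The picture parameter is URL-encoded, so %2f is a slash and the file name is what follows the last slash.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a BlogEngine.NET image.axd link into a plain path.
sub ImageAxdToPath
{
	my ($link) = @_;
	# Leave every link that is not an image.axd link untouched
	return $link unless $link =~ m{^/image\.axd\?picture=(.+)$};
	my $picture = $1;
	# Decode the percent escapes, so that %2f becomes /
	$picture =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
	# Keep only the file name after the last slash
	my ($filename) = $picture =~ m{([^/]+)$};
	return $link unless defined $filename;
	return "/media/" . $filename;   # assumed target directory
}

print ImageAxdToPath('/image.axd?picture=2011%2f9%2fJens+Blog+QR.png'), "\n";
```

For the link above this prints /media/Jens+Blog+QR.png. The plus signs are kept as they are, because that is how the file name appears in the other image links.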

So now I comment out the call to my Walk routine and load only this blog entry. There is work to do!

# Walk('/usr/local/bin/jensblog/App_Data/posts');

But that is for next time.

