Migrating from Blogger to WordPress,
published at 4:08am on 08/10/06, with No Comments
As a birthday present to our dear friend Dawn, I helped her move her Blogspot blog over to her very own WordPress installation on her very own domain (hosted by the friendly folks over at FictCo, who happen to include me). For the most part, this was a fairly easy operation, especially with the assistance of the two fantastic resources that I found out there that cover just such a migration, as well as the Blogger importing tool that is built in to the latest version of WordPress.
- Moving from Blogger to WordPress: Best Practices
- Importing Haloscan comments into WordPress from Blogger
There is one little trick to the WordPress importer, however - apparently Blogger posts don’t have titles, and as such, the title of each post was set to the Blogger post ID, which just looked silly. So I hacked up this little script that, given a tab-delimited text file consisting of a post ID and the body of the post, grabs the first line of the post and, with some magic, spits out a line of SQL that updates that post with the new title (and the appropriate post slug).
Some rules that it uses to build the title from the first line of the post:
- Lines that are in all caps are assumed to be titles already, so keep them in their entirety
- Other lines should be limited to 9 words (an arbitrary number that seemed right)
- If there is punctuation in that first line, everything up to that punctuation is the title
- There need to be at least 5 characters before the first punctuation mark (to avoid things like a “D.C.” in the first line getting truncated to “D.”)
- Strip out all HTML from the first line
Anyway, this code is really hacky, and I take no responsibility for it, but I thought someone else out there might be able to use it.
First, the code to get the posts out of the database:
% mysql -u DB_USER -pDB_PASSWORD -e 'SELECT ID, post_content FROM wp_posts' DB_NAME > posts.txt
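One thing to watch for: when its output is redirected to a file like this, mysql writes a header row (ID, post_content) as the first line unless you pass -N (--skip-column-names). A quick check along the lines of the sketch below would flag that line, or any other line that isn't a numeric ID, a tab, and a body. The check_posts_file name is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: report every line of the dump that is not
# "<numeric id><TAB><body>". Exits non-zero if any such line is found.
check_posts_file() {
    awk -F '\t' '
        NF != 2 || $1 !~ /^[0-9]+$/ { printf "line %d looks wrong\n", NR; bad++ }
        END { exit (bad > 0) }
    ' "$1"
}

# Usage:
#   check_posts_file posts.txt
```

If the only complaint is about line 1, that is almost certainly the header row, and re-running the export with -N should make it go away.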
Next, the script itself, saved as “build_post_titles.pl”
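#!/usr/local/bin/perl -w
use strict;

# posts.txt holds one post per line: ID <TAB> post_content
open(POSTS, 'posts.txt') or die('Could not open posts');
while (<POSTS>) {
    chomp;
    my ($id, $body) = split /\t/;
    # Skip the header row and anything else without a numeric ID and a body
    next unless defined $body && $id =~ /^\d+$/;

    # Strip out all HTML tags
    $body =~ s/<.*?>//g;

    # mysql escapes newlines in its batch output as a literal backslash-n,
    # so the "first line" is everything up to the first \n sequence
    my $firstline = ($body =~ /^(.*?)\\n/ && length $1) ? $1 : $body;
    $firstline =~ s/\\n//g;

    # Lines with no lowercase letters are assumed to be titles already
    if ($firstline =~ /[a-z]/) {
        # Cut after the first punctuation mark, as long as at least
        # 5 characters come before it
        $firstline =~ s/^(.{5,}?)([.!?]+)(.*)/$1$2/;
        # Keep at most 9 words
        my @words = split / /, $firstline;
        my $last = $#words > 8 ? 8 : $#words;
        $firstline = join ' ', @words[0 .. $last];
    }

    # Escape single quotes for SQL by doubling them, then trim whitespace
    $firstline =~ s/'/''/g;
    $firstline =~ s/^\s+//;
    $firstline =~ s/\s+$//;

    # Build the slug: lowercase, dashes for spaces, nothing but a-z, 0-9 and -
    my $slug = $firstline;
    $slug =~ tr/A-Z/a-z/;
    $slug =~ s/\s+/-/g;
    $slug =~ s/[^-a-z0-9]//g;

    if ($firstline !~ /^\s*$/) {
        printf("UPDATE wp_posts SET post_title='%s', post_name='%s' WHERE ID=%d;\n", $firstline, $slug, $id);
    }
}
close(POSTS);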
Next, running the script:
% perl build_post_titles.pl > update_posts.sql
Finally, running the code against the database:
% mysql -u DB_USER -pDB_PASSWORD DB_NAME < update_posts.sql
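Before running that last command, it's probably worth taking a copy of the table and eyeballing the generated file. The sketch below assumes mysqldump is installed alongside mysql; summarize_sql is a made-up helper that just counts how many lines look like the UPDATE statements the script prints and how many don't.

```shell
#!/bin/sh
# Back up the table first (run by hand; prompts for the password):
#   mysqldump -u DB_USER -p DB_NAME wp_posts > wp_posts_backup.sql

# Hypothetical helper: count well-formed UPDATE lines versus everything else.
summarize_sql() {
    awk '
        /^UPDATE wp_posts SET post_title=.* WHERE ID=[0-9]+;$/ { ok++; next }
        { other++ }
        END { printf "%d update statements, %d other lines\n", ok, other }
    ' "$1"
}

# Usage:
#   summarize_sql update_posts.sql
```

Anything counted under "other lines" deserves a look before the file goes anywhere near the database.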
And that’s that. Hope this is somewhat useful.
And for anyone else looking for musings, on life and such, more of that coming soon.
Filed under: Technology, with No Comments
How I Fixed My Raid-1 Partition Size Error,
published at 12:07am on 07/23/05, with 8 Comments
How did it start?
The first indication that there was something wrong with the server came on July 10, 2005 in the form of error messages that were reported to me by the command that I have running hourly to mail me system anomalies.
Jul 10 04:16:11 loco kernel: attempt to access beyond end of device
Jul 10 04:16:11 loco kernel: 09:03: rw=0, want=56050716, limit=56050688
Every hour, at around the same time, these errors started cropping up. I looked through all the crontabs and found one command, a bounced mail queue processor that I run for one of my projects that was running at that time. Turning off the process stopped the errors from coming up, and I thought that perhaps we just had a couple of corrupted files. The next morning, the errors started cropping up again, one or two at a time.
Realizing that this could be a sign that the drives were eating themselves, I decided to head to the data center for a bit of one-on-one time with the server.
So what did we do?
The first thing I did was drop the system into single-user mode. We’re running ext3 filesystems on software RAID-1 on two 73gb SCSI drives. I decided that I would try e2fsck on the partition that was giving me problems, but I kept running into the following error:
The filesystem size (according to the superblock) is xxx
The physical size of the device is xxx
Either the superblock or the partition table is likely to be corrupt!
OK, so that was a bit puzzling. I spent a while longer on it, finding absolutely nothing on Google that gave any indication of what might be going on, until I found the following gem in an article about converting a running system into a RAID-1 system:
Step-11 - resize filesystem
When we created the raid device, the physical partion became slightly smaller because a second superblock is stored at the end of the partition. If you reboot the system now, the reboot will fail with an error indicating the superblock is corrupt.
http://howtos.linux.com/howtos/Software-RAID-HOWTO-7.shtml#ss7.6
Eureka!
It appears that when we originally set up the RAID, we never resized the partitions. For the past year or so, the system has been running along without any problems because it just never wrote to that part of the disk. A couple of files must have made it out to this portion of the disk where the RAID superblock is stored, and the RAID system wouldn’t let it write and was throwing the errors that I saw. However, resizing the partitions without repairing them first will throw the following error:
attempt to read block from filesystem resulted in short read while trying to resize
Obviously there was a problem with the drive that needed to be addressed.
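For what it's worth, the numbers in the kernel message from the start of this post line up with that explanation: the request (want) lands just past the end of the RAID device (limit). A bit of shell arithmetic shows by how much. I'm assuming both numbers are in the same units, which the message itself doesn't say.

```shell
#!/bin/sh
# Values copied from the kernel log line above
want=56050716
limit=56050688

# How far past the end of the device the request reached
overshoot=$((want - limit))
echo "request overshoots the device by $overshoot units"
```

That comes out to 28, a tiny sliver at the very end of the partition, which fits with the system running happily for a year before anything landed there.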
Fixing the problem
The solution was actually quite straightforward, once I got all the steps in place. There were two time-consuming parts to this process. First, I had to figure out what was wrong. Second, I needed to wait for the drive to be repaired. In the process of trying to write out beyond the RAID partition, some inconsistencies were introduced to the drive, and e2fsck was the way to fix them. The solution is as follows:
1. Unmount all partitions
2. Repair the partitions
3. Resize the partitions
Unmounting the partitions in single-user mode is a matter of running:
umount -a
I’m not really sure how this works, but it doesn’t matter what services are running or what happens to be in use - it just unmounts everything for you.
Once the partitions were unmounted, it was a simple matter of telling e2fsck to check for bad blocks when run on the offending partition. man e2fsck tells us the following:
-c This option causes e2fsck to run the badblocks(8) program to
find any blocks which are bad on the filesystem, and then marks
them as bad by adding them to the bad block inode. If this
option is specified twice, then the bad block scan will be done
using a non-destructive read-write test.
By running e2fsck -cc /dev/md3 we were able to do the repairs non-destructively. However, as expected, on our 53 gig /home partition, this badblocks scan took about 7 hours to run. The good news is that in that time, it did find errors, it did seem to fix them, and running e2fsck following that run seemed to return no other errors.
After the partitions were repaired, I ran resize2fs following the instructions in the above article. I first ran e2fsck again (but not in badblocks mode), just to make sure everything was clean, then I resized the partition and then I ran e2fsck again.
e2fsck -f /dev/md3
resize2fs /dev/md3
e2fsck -f /dev/md3
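If you have several partitions to get through, the same three steps could be wrapped up so that the sequence stops as soon as something goes wrong. This is only a sketch: run, fsck_ok and check_resize_check are names I made up, and the DRY_RUN switch is there so you can see what would be executed before letting it touch a real device.

```shell
#!/bin/sh
# Sketch only. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# e2fsck exits 0 for a clean filesystem and 1 when it found and corrected
# errors; anything higher is treated here as a reason to stop.
fsck_ok() {
    rc=0
    run e2fsck -f "$1" || rc=$?
    [ "$rc" -le 1 ]
}

# Check, resize, check again, stopping at the first step that fails.
check_resize_check() {
    fsck_ok "$1" || return 1
    run resize2fs "$1" || return 1
    fsck_ok "$1"
}

# Usage, on an unmounted device:
#   DRY_RUN=1 check_resize_check /dev/md3
```

Run without DRY_RUN, e2fsck will still stop and ask questions if it finds something, just as it does when typed by hand.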
This worked like a charm, and I did not get the “short read” error from earlier. I was not able to unmount the root partition, however, since the running system needed access to it, and I was not able to mount it read-only as was suggested in the article.
How to resize the root partition
Resizing the root partition turned out to be less of a pain than I might have expected, though it was by no means obvious when first thinking about the problem. The solution would be as follows:
1. Copy all files from the root partition to another, empty partition (/tmp worked nicely)
2. Reboot the server passing in the new, fake root partition to the boot loader
3. Unmount all partitions (including the real root partition, which is not running)
4. Repair and resize as above
Fortunately, /tmp had its own partition. I deleted the contents out of /tmp (which should be temporary anyway) and copied all of the files out of the root partition into this new, temporary root. Remember that you can copy /dev files, but should avoid /proc. The idea here is to copy all of the files out of /, excluding anything that is mounted from another partition. [Looking at the man page again, after the fact, -x would probably be exactly what’s needed here. -jcn]
1. cp -ax / /tmp (can't actually remember the cp command, but this should work)
2. Edit /tmp/etc/fstab to not mount the partition that /tmp resides on
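The fstab edit is easy enough to do by hand in an editor, but for the record, something like the sketch below would do the same thing. comment_out_mount is a made-up helper, and it assumes the usual fstab layout where the mount point is the second whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helper: comment out the fstab entry whose mount point is $2.
# Lines that are already comments are left alone.
comment_out_mount() {
    fstab=$1
    mountpoint=$2
    awk -v mp="$mountpoint" '
        $1 !~ /^#/ && $2 == mp { print "#" $0; next }
        { print }
    ' "$fstab" > "$fstab.new" && mv "$fstab.new" "$fstab"
}

# Usage, after the copy has finished:
#   comment_out_mount /tmp/etc/fstab /tmp
```

Commenting the line out rather than deleting it makes it easy to compare against the real fstab later.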
Once that is done, it is simply a matter of rebooting. At the LILO prompt, tell the existing kernel to use the new partition (which is normally /tmp) as the root partition.
LILO: kernel root=/dev/sd5 single
Once booted, I ran umount -a and proceeded as above.
Done!
This seems to have worked. resize2fs is, in fact, non-destructive and now when I run e2fsck, it just runs - it does not give me the error about a mismatched physical vs. filesystem partition.
Followup
Did this document help you? If so, I’d love if you would let me know, and let me know if there is anything I left out or was confusing. Thanks!
Filed under: Technology, with 8 Comments


