Archive for the ‘Development’ Category

Inserting large blobs in MySQL

on Monday 12th January, 2009 Gabe speculated thusly…

Last week I setup MySQL replication for a database with two slaves. I wanted to see how resilient it was, and one of the tests I wanted to perform was to create a record on the master and then interrupt the replication of the same record on the slave. This would allow me to see whether MySQL could recover from such an error.

So I thought the easiest way of doing this would be to to insert a large 700mb
file in to MySQL – to give me enough time to interrupt a slave before
replication had completed. Thus being able to test MySQL’s resilience to
loss of connectivity during UPDATE/INSERT queries.

I have never used BLOB columns before so my quest took my on a journey during which I learned a lot about these blobs. Now, even though a MySQL long blob can hold 4GB of data I have encountered
problem after problem dealing with it:

PHP scripts are limited to 32MB memory maximum, this can be upped, but
is not scalable and not a good solution. It means that you cannot simply
read the data of file and perform an INSERT since the file data is
stored in PHP’s memory first. This meant the development of a PHP script
that could read chunks (just a meg or so) from a file and perform UPDATE
queries along with MySQL CONCAT function to effectively append each
chunk.

CONCAT, as I eventually found out, returns NULL if any of it’s arguments
are NULL. In my case when attempting to UPDATE a row with additional
chunks the resulting blob size was always zero. This was because the
very first time CONCAT is used the blob field is empty (NULL), therefore
CONCAT always returned NULL – and so, the data in the blob column never
grew. The query looked like this:
UPDATE `$table` SET `data` = CONCAT(data, '{mysqli_real_escape_string($buffer)}') WHERE `id`=$id";

I then combined CONCAT with COALESCE which will return an alternative
value that you specify in the eventuality the first argument is NULL. By
combining these two I could overcome the above problem.
Resulting query structure:
UPDATE `$table` SET `data` = COALESCE(CONCAT(data, '{mysqli_real_escape_string($buffer)}'), '') WHERE `id`=$id";

Great, now we’re finally adding our chunks on. Then my computer ran out
of disc space – the reason: 7 gigs of MySQL replication logs. Took a
while to find that out. Fixed.

Next problem, although I was certain (via in-depth PHP debugging) that
PHP was correctly reading chunks from the file in a consecutive order,
of the correct size, and the SQL query was being properly executed
without any sly errors the blob column would never end up growing above
12MB (according the PHP MyAdmin). Through even more intensive debugging
I found that after concatenating each chunk the blob would grow by the
correct amount up to 16MB, then it would zero and start again. The
result of pushing a 700MB ISO in to it would always be less 16MB.

Working from a hunch I upped the max_packet_size in my.conf from 16MB to
36MB, the result was as above but this time the blob would grow to 32MB
before zeroing itself.

I then found this MySQL bug report:
http://bugs.mysql.com/bug.php?id=22853

It is considered a bug simply because by upping max_packet_size to
something like 4GB or anything substantial leaves the MySQL server open
to network DoS attacks. There is no work around. A fix is scheduled for
version 6.

Anyway, since we don’t care about security I maxed out the packet_size,
and my chunking script worked, but very quickly I noticed a huge amount of
slowdown once the blob had grown to just a few tens of megs. In fact it
took about 20 minutes to get to 200MB. Without supporting my theory with
any evidence I reckon the CONCAT function reads all the data out of the
database in to memory, does it’s stuff and then stuffs it back in to the
database. The result is an increasingly slow query.

It would seem that although MySQL supports blobs of about 4GB that you
can never get that size unless you:

1. Change the max_packet_size to a stupidly high number, leaving your
server open to DoS attacks so then you can just do a single INSERT. Using CONCAT
is out since it would take forever.

2. Change the memory limit of PHP scripts to be greater than 4GB.

3. Have absolutely bags of RAM since three copies of the file are stored
in RAM (certainly two copies). Since you cannot chunk the file and
perform CONCATS PHP will read the entire file in to it’s memory. This
will all then be buffered by the MySQL driver. Then this is passed to
the database. Therefore if you wish to insert a 4 GB file you will
require >12GB of RAM.

One work around is to chunk the file as before but each chunk is
represented by a single row. These can then be linked together by giving
each row a master file ID, so that all necessary rows can be retrieved.

For my purposes this doesn’t help, I needed queries that took a while,
inserting rows like this will take but a fraction of a second each.

Posted in Development, Information, MySQL, PHP, Programming, Server

3 Comments »

Resume file copy using cURL

on Monday 12th January, 2009 Gabe speculated thusly…

You can use cURL for normal copies, but it will also resume interrupted transfers:
curl -C - -O file:///foo/bar.dat

Posted in Development, Information, Linux

No Comments »

Beware of the Trailing Comma in JavaScript Prototypes

on Wednesday 3rd December, 2008 Gabe speculated thusly…

Came accross the following post at:
http://www.pluralsight.com/community/blogs/fritz/archive/2007/06/19/47771.aspx

“I spent more time than I care to admit tracking down this one, perhaps this post will save someone else the trouble…

When defining a number of functions in a prototype in JavaScript, do not include a trailing comma after the last function:

MyType.prototype = {
    foo : function() {
          // ...
    },

    bar : function() {
        //...
    }, //< - fails in IE!
 }

What was especially tricky about tracking this problem down was that FireFox works with or without the trailing comma, so it only fails in IE!"

Posted in Books, Development, JavaScript, Linux, Operating System, PHP, Programming

No Comments »

IE doesn’t recognise “select option”.click()

on Friday 14th November, 2008 Gabe speculated thusly…

Instead of:
$(’#ranges option’).click(function() {
var val = $(this).val();
//do something with val
});

Had to change it to:
$(’#ranges’).change(function() {
var val = $(this).val();
//do something with val
});

Posted in Development, JavaScript, Programming, jQuery

No Comments »

Git errors on Cygwin

on Tuesday 16th September, 2008 Gabe speculated thusly…

I’ve recently started experimenting with Git, previously being totally committed(!) to Bazaar. Git is difficult in comparison but I intend to stick with it. I have been using Git under Cygwin on Windows XP, it seemed to work fairly well except that often when I added a large batch of source code files they wouldn’t get committed.

Eventually I found out it is because Git is choking on one the line endings in some files. When trying a commit of these files I would see a whole load of errors fly by such as ‘trailing whitespace’ and “you have some suspicious patch lines”. Googling around I found three answers, but I could only fully understand two of them.

Firstly, the approach I took, was to run the command chmod a-x .git/hooks/pre-commit in the git repository.

The other approach, which I found here says to edit the pre-commit file above and comment out the lines:
if (/\s$/) {
bad_line("trailing whitespace", $_);
}

Which would mean changing it to:
#if (/\s$/) {
#bad_line("trailing whitespace", $_);
#}

Hope that can help someone!

Posted in Development, Git, Operating System, Revision Control, Windows, XP

1 Comment »

Using Python Doctests in Django with fixtures

on Tuesday 24th June, 2008 Gabe speculated thusly…

Django is a pretty decent web framework for Python. Having brushed up on my Python programming I started to fall in love with doctests. I then went ahead and wrote about 100 lines of doctest for model in Django, then found all tests were failing because the database fixtures weren’t being loaded.

I couldn’t find out how to install fixtures inside doctests from the official documentation, I did however, come across what seemed like a web page written in Japanese. I had to skip the Japanese but figured out the code samples. Getting fixtures working with doctests in django is fairly simple – once you know how!

At the top of your doctest you will need the following two lines:
>>> from django.core import management
>>> management.call_command("loaddata", "project/fixtures/test.json", \
verbosity=0)

Replace project and test.json with your project name and fixture. Then continue with the doctests as per usual. After they are done put the following line at the end of the doctest:
>>> management.call_command("flush", verbosity=0, interactive=False)

That should be just about it :)

Posted in Development, Django, Frameworks, Programming, Python

No Comments »

Setup a fileserver using RAID 1 & 5 on Ubuntu Hardy Heron 8.04 LTS Server

on Tuesday 13th May, 2008 Gabe speculated thusly…

I recently set up a new file server. It contains three 1 terabyte drives configured in a RAID 5 array but with RAID 1 for the boot partitions so in the event of drive failure the system could still boot.

Essentially you need to go through a normal installation process and make sure you choose a couple of things during parition setup. I split each of the 3 drives in to two partitions, one 200MB, and the other partition the remaining disc space. This is what I had:
sda1 = 200MB
sda2 = 9800MB
and the same for the other drives.

Set all sda1 partition to be ext3 and bootable and mounted as /boot, set sdb1 and sdc1 partitions to be Linux Software Raid, bootable, but not mounted. Configure the RAID and create a new RAID device, tell it to use three devices and add to it sda2, sdb2, and sdc2.

When I returned to the partition menu I configured md0 as a partition for a logical volume group. I then configured LVM and created logical volumes for /root, swap, and /home. Returning to the partitioning menu I formatted these and selected their mount points and installed the system.

Following system installation I went about creating a RAID 1 array for sda1, sdb1, and sdc1. First thing was to create a new array using mdadm and the two paritions we haven’t yet assigned:
# mdadm --create /dev/md1 --level=1 --raid-disks=3 missing /dev/sdb1 /dev/sdc1

Make a filesystem on our new RAID device:
# mkfs.ext3 -m 0 /dev/md1

Save the new configuration in to mdadm’s config file:
# mdadm --examine --scan >> /etc/mdadm/mdadm.conf

Edit mdadm.conf and remove the duplicate entry at the bottom:
# vim /etc/mdadm.conf

Mount the new RAID device so we can copy data to it:
# mkdir /media/md1
# mount /dev/md1 /media/md1

Copy over the boot partition to the RAID device:
# cd /boot
# cp -dpRx . /media/md1

Edit fstab and replace /dev/sda1 with /dev/md1:
# vim /etc/fstab

Edit mtab and do the same:
# vim /etc/mtab

Unmount /boot, change it’s system type and add it to the RAID array:
# umount /boot
# fdisk /dev/sda
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): f
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): w

# mdadm --add /dev/md1 /dev/sda1

Configure GRUB so that it can boot from any hard drive:
# grub
grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdc
grub> root (hd0,0)
grub> setup (hd0)
grub> quit

# update-initramfs -u

Change all occurance of (hd*) to (hd0) in the file /boot/grub/devices.map.

That should be all – good luck with the reboot!
# reboot

Posted in Books, Development, Guide, Information, Linux, Operating System, Server, Ubuntu

No Comments »

HTML Helper for generating a country drop-down list in CodeIgniter

on Friday 9th May, 2008 Gabe speculated thusly…

Frustrated at having to create my own drop-down country lists in CI I created a helper for it. It can even work with the CI Validation class, so if the form is submitted and fails validation when it is redisplayed the same country is selected as the form was submitted with.

Download the file MY_form_helper.php and put it in the folder: /system/application/helpers/.

Download the file countries.php and place that in /system/application/config/.

You will now be able to create drop-down lists of countries in your views like this:
< ?=form_countries( 'country', $this->validation->country, array( 'style' => 'width: 250px' ) )?>

Ensure that you create a validation rule called “country” in your controller and that you are loading the form helper.

form_countries() takes three arguments. The first is the name you would like to give the select box, this is required. The second is the country to select by default – this could be something like ‘GB’, or $this->validation->country, otherwise you can leave it blank. The third argument is any additional properties to give the select box, you can leave this blank, or pass an array. In the example above we also set the width to 250x. The country chosen by the user will be returned to the script as a two-letter ISO country code, if you are validating this then you can set the min and max length values as 2, and accept alphabetical characters only. If using MySQL as your database you will get best performance by creating a column of the type char with a length of 2.

Posted in CodeIgniter, Development, Frameworks, PHP, Programming

5 Comments »

Profiling in CodeIgniter

on Tuesday 6th May, 2008 Gabe speculated thusly…

Profiling an application is a great way to see how it is performing. CodeIgniter comes with a profiling class, which you can make calls to from your controllers. However, during development I find profiling so helpful that I want it on all my pages, without having to specifically call it each time. I also want to be able to deploy my development site to a production server – making calls to the profiler throughout the controllers means I would need to edit each and every controller to make sure the profiler wasn’t active on the live server.

After much searching I discovered a better solution. This allows you to add profiling to the bottom of every page (along with its debug info and SQL query info). When you copy the dev site live you just exclude 1 file, and profiling will be removed totally. All you need to do is create a file called MY_Output.php in system/application/libraries that extends the Output core class, with the following contents:

< ?php
# /system/application/libraries/MY_Output.php
if (!defined('BASEPATH')) exit('No direct script access allowed');
class MY_Output extends CI_Output {
function __construct() {
parent::__construct();
$this->enable_profiler( TRUE );
}
}
?>

This will enable profiling output on all your pages for development and debug purposes. When you copy your site to a production server just make sure you don’t copy MY_Output.php file. I normally use rsync for copying live and just add: --exclude="MY_Output.php" to the rsync command.

The condition to using the name MY_Output.php is that you left the variable $config['subclass_prefix'] in config.php as default. This defaults to $config['subclass_prefix'] = 'MY_';, so if you change it you will need to alter the filename of your new class.

Posted in CodeIgniter, Development, Frameworks, HowTo, PHP, Programming

No Comments »

Grope, a Ruby script for enhanced Grepping

on Wednesday 19th March, 2008 Gabe speculated thusly…

There was a time when I would recursively grep the contents of literally thousands of files at a time to search for particular occurances of characters. The usual starting place was something using grep, which we can time for crude benchmarking:

user@localhost$ time grep -rn 'hello' *
templates/temp.tpl:1:hello

real 1m56.190s
user 0m1.400s
sys 0m0.940s

Using Ruby to write a script, which I named Grope, I made the search process 450 times faster, reducing an operation that took more than a minute to taking a blink of an eye…

user@localhost$ time grope 'hello'
templates/festival/06/temp.tpl: 1
hello...

real 0m0.181s
user 0m0.100s
sys 0m0.070s

Posted in Development, Operating System, Programming, Ruby

No Comments »