21 Oct, 2006

Published at 08:59PM

Tagged with rails

This post has 8 comments

Rails migrations for BLOBs

I’m working on an application that requires the ability to upload images. I’m not using any fancy ways to upload images in Rails (such as the file_column plugin), so naturally I came across a problem. The images were uploading fine, but larger files would stop at 64 kB. I automatically assumed there was a problem with my code, or Rails was limiting file size uploads to a default somewhere (since 64 kB is a nice, even computer number). Well it turns out that Rails doesn’t do anything to limit my file uploads; it was MySQL.

In my initial schema migration file, I had my image table’s “data” field set to :binary (which is BLOB in MySQL). This is what I needed, but a BLOB column tops out at 64 kB (65,535 bytes). However, there are larger BLOB types: MEDIUMBLOB (16 MB) and LONGBLOB (4 GB). And that’s what I was looking for. So I added a new migration with the following line in self.up:

change_column :images, :data, :binary, :limit => 10.megabyte

Voila! And now the field is MEDIUMBLOB and I have all the space I need. I knew there was an easy fix—I love Rails and migrations :)
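Why does a 10 MB limit land on MEDIUMBLOB? As I understand it, MySQL creates a BLOB column declared with a length as the smallest BLOB type that can hold that many bytes. The sketch below only illustrates that selection in plain Ruby. It is not Rails' or MySQL's actual code, and the names BLOB_TIERS and smallest_blob_type are made up for this example.

```ruby
# Illustrative only: not Rails' or MySQL's actual code.
# Maximum byte capacity of each MySQL BLOB type, smallest first.
BLOB_TIERS = [
  ["TINYBLOB",   2**8  - 1],  # 255 bytes
  ["BLOB",       2**16 - 1],  # 65,535 bytes (the 64 kB ceiling)
  ["MEDIUMBLOB", 2**24 - 1],  # 16,777,215 bytes (about 16 MB)
  ["LONGBLOB",   2**32 - 1]   # 4,294,967,295 bytes (about 4 GB)
].freeze

# Returns the name of the smallest BLOB type that can hold `bytes` bytes.
def smallest_blob_type(bytes)
  tier = BLOB_TIERS.find { |_name, capacity| bytes <= capacity }
  raise ArgumentError, "#{bytes} bytes exceeds LONGBLOB" unless tier
  tier.first
end

ten_megabytes = 10 * 1024 * 1024  # the byte count 10.megabyte stands for
puts smallest_blob_type(ten_megabytes)  # prints MEDIUMBLOB
```

Anything from 65,536 bytes up to 16,777,215 bytes falls in the same tier, so the exact limit matters less than which boundary it crosses.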

Comments

Chris Sunday, 22 Oct, 2006 Posted at 04:13PM

Hmmm, why are you storing images in a database?

Ryan Sunday, 22 Oct, 2006 Posted at 07:24PM

Good question. The main reason: I was sent a schema and that was part of it. So it found a place in the migration file.

There are a couple of other reasons I thought it might be good…

Filename collisions: I thought this would be an easier way to deal with the users uploading a file with the same filename. Then I could use the built-in validates_uniqueness_of to check before saving.

Backups: Backing up the database would be easier, just because it would be in one place and I wouldn’t have to worry about file structures and directories.

However, I’m really new to handling files in an application, and this could very well be a rookie mistake. I’m aware that the database alternative adds some overhead since it would require a longer connection to pull larger files. I’ve never tried to handle file uploading straight to a file system, so maybe that kind of intimidated me, when in fact, maybe it’s easier? Bottom line is I’m completely open to suggestions on this one. What would you recommend?

Chris Tuesday, 24 Oct, 2006 Posted at 03:10PM

I would stay away from storing static files in a database, because of the performance overhead of content delivery. You end up causing more traffic between the app and the db than necessary, and the browser will likely not be able to cache the images (though I’m not sure on that).

In the next version of slate, I’m handling uploads (resources) with a database table AND the flat files. I call it “managed resources” because I am storing metadata about the file in the database, as well as the uploaded filename. I rename the file based on the created_on datetime for the record, thus eliminating filename collisions. Now, this does mean that links to the resources are like /resources/sitename/1029238162387.png, but you could easily make a controller which would send the actual file if you needed to (though that will add some overhead because you’ll be forcing Rails to handle the request rather than letting the webserver do it all).

Actually, to be a bit more specific, the renamed filename also contains the ID of the site that the resource belongs to (e.g. 014_7893783082.png). We also avoid all the crap of dealing with directories by storing the files in a flat directory, and using the database to map them to “folders” (a simple tree structure in the database called “Collections” which can be nested).

Also, another reason we are doing this is to generate multiple different sizes (exactly like flickr) when an image is uploaded. We simply add a special suffix to the managed filename for each size that we generate.
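A minimal sketch of what such a naming scheme could look like, in plain Ruby. This is not slate's code: the helper name managed_filename, the three-digit padding, the epoch-seconds timestamp, and the "thumb" suffix are all assumptions made for illustration.

```ruby
# Hypothetical helper, not slate's implementation.
# Builds "<site id>_<timestamp>[_<size suffix>]<extension>" from a record's
# site ID and created_on time, ignoring the uploaded name except for its
# extension.
def managed_filename(site_id, created_on, original_name, size_suffix = nil)
  extension = File.extname(original_name).downcase  # e.g. ".png"
  stamp = created_on.to_i                           # seconds since the epoch
  base = format("%03d_%d", site_id, stamp)          # zero-padded site ID
  base += "_#{size_suffix}" if size_suffix          # one suffix per generated size
  base + extension
end

created = Time.utc(2006, 10, 24, 15, 10, 0)
puts managed_filename(14, created, "Holiday Photo.PNG")
puts managed_filename(14, created, "Holiday Photo.PNG", "thumb")
```

With a timestamp that only has one-second resolution, two uploads to the same site in the same second would still collide, so a real version would probably want something finer-grained or the record ID in the name.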

The bottom line is that static files should really be stored in the file system, even though it is slightly more complicated at times.

Ryan Tuesday, 24 Oct, 2006 Posted at 08:15PM

Thanks for the explanation. That sounds like a good way to handle your resources for slate (as well as what I need to do). Is it hard to generate multiple sizes of an image when it’s uploaded?

And out of curiosity, back when you first “reloaded” your site (the white/maroon version) you mentioned you were doing RoR at work. Was this slate back then, too? From the pictures you linked on the tinyblog, it looks like slate has a lot of functionality, and I was wondering if it was a fairly new project or something you’ve been developing for a while.

Chris Wednesday, 25 Oct, 2006 Posted at 06:30AM

It’s been in development for about 2 years. However, it started as a PHP project that was eventually scrapped about a year ago, and that’s when I started rebuilding slate in RoR. Reloaded happened basically around the same time.

Handling multiple sizes wasn’t that hard, really. I used RMagick, and simply scaled the larger dimension (width or height) to various constraints (100, 240, 500, 1024, as well as a 75×75 square). RMagick is really easy to use, and it’s quite fast, as well.
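The arithmetic behind "scale the larger dimension to a constraint" can be sketched without RMagick at all. The helper name fit_within is made up here, and leaving images that are already small enough untouched is an assumption, not something described above. The 75×75 square would need a crop as well, so it is left out.

```ruby
# Hypothetical helper: works out target dimensions only, no image library.
# Scales so the larger side equals `constraint`, keeping the aspect ratio.
# Images already within the constraint are returned unchanged (an assumption).
def fit_within(width, height, constraint)
  longest = [width, height].max
  return [width, height] if longest <= constraint

  scale = constraint.to_f / longest
  [(width * scale).round, (height * scale).round]
end

CONSTRAINTS = [100, 240, 500, 1024].freeze

# Target sizes for a 1600x1200 upload at each constraint.
CONSTRAINTS.each do |constraint|
  width, height = fit_within(1600, 1200, constraint)
  puts "#{constraint}: #{width}x#{height}"
end
```

The resulting pair could then be handed to whatever does the actual resizing, for example RMagick's Image#resize.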

David Honsvick Sunday, 17 Dec, 2006 Posted at 07:16AM

Thank you for the info, Chris. I have been looking for the pros and cons of putting pictures in a BLOB field in a database. Personally, for a web app it never occurred to me to try putting images into the database directly until I started working with Rails.

Why do you think most examples and plug-ins for Rails want to store them in the database?

David Tuesday, 13 Feb, 2007 Posted at 06:52PM

I think putting images in the database is a great idea. There are performance issues if you don’t do some kind of caching to the filesystem, but writing a system to do this is relatively simple.

I keep metadata in one table (say, stored_files) and data in another (say, stored_file_datas) which has a key into the first table. This way, I can check the data in the filesystem against the metadata in stored_files (for staleness), not have to pay the overhead of reading the binary object into the Rails application stack to do this, and only reload the filesystem-cached data when it’s necessary.

If you are planning to do any kind of clustered deploy, this is a great way to go because the filesystem cache will autopopulate itself if you make it operate “on demand.” But even if your application will only ever be on a single host, doing this makes moving your data easier.

Combining the filesystem caching technique with some Apache mod_rewrite rules makes performance identical to using a pure filesystem store. As long as you make sure that filenames aren’t reused, and remove files from the filesystem as images are destroyed, you lose virtually nothing in terms of performance and gain (in my estimation) substantial manageability and scalability.
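A rough sketch of the on-demand cache with a staleness check, using only Ruby's standard library. The struct StoredFileMeta stands in for a row of the metadata table, and the block passed to ensure_cached stands in for the expensive read of the binary data. Both names, the id-prefixed cache filename, and the mtime comparison are assumptions for illustration, not the actual implementation.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for a row in the stored_files metadata table.
StoredFileMeta = Struct.new(:id, :filename, :updated_at)

# Stale when the cached copy is missing or older than the metadata row.
def cache_stale?(meta, cache_path)
  !File.exist?(cache_path) || File.mtime(cache_path) < meta.updated_at
end

# Writes the cached copy only when needed. The block is called only in
# that case, so the binary data is not loaded for a fresh cache entry.
def ensure_cached(meta, cache_dir)
  cache_path = File.join(cache_dir, "#{meta.id}_#{meta.filename}")
  if cache_stale?(meta, cache_path)
    FileUtils.mkdir_p(cache_dir)
    File.binwrite(cache_path, yield)
  end
  cache_path
end

Dir.mktmpdir do |dir|
  meta = StoredFileMeta.new(7, "logo.png", Time.now - 60)
  loads = 0
  2.times { ensure_cached(meta, dir) { loads += 1; "fake image bytes" } }
  puts loads  # prints 1: the second call finds a fresh cached copy
end
```

Prefixing the cached name with the record ID is one way to keep filenames from being reused. Deleting the cached file when the record is destroyed is not shown.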

To me, this makes a lot of sense!

FatCow Thursday, 10 Sep, 2009 Posted at 02:52PM

I am curious why we cannot do t.blob?

Fatcow


Ryan Heath | Site Management A Ruby on Rails production.
