Using charlock_holmes on Heroku

charlock_holmes is a useful gem if you have to deal with user-supplied data which may come in a variety of text encodings. Not only does it let you detect the encoding of a string, it also lets you transcode the string to a different encoding.

charlock_holmes uses libicu to deal with string encoding.

Unfortunately, the default Heroku buildpack for Ruby doesn’t include libicu, which prevents Bundler from compiling the charlock_holmes C extension.

There have been a few attempts at solving this problem, most of which are discussed over on Stack Overflow. The accepted answer is a common solution, which relies on a version of the gem that bundles its own copy of libicu. While this works, it results in very slow build times, both on Heroku and locally when doing a bundle install.

Another solution uses a custom version of the Ruby buildpack which includes libicu. While this is simple, it relies on the maintainer of that fork keeping it up to date with Heroku’s Ruby buildpack.

My favourite solution seems to move in the right direction: it uses heroku-buildpack-multi and heroku-buildpack-apt to install libicu using apt. Unfortunately, it uses a forked version of heroku-buildpack-apt which adds behaviour specific to charlock_holmes, telling Bundler where to find the version of libicu installed by apt.

My solution builds upon the previous one, but rather than use a custom version of heroku-buildpack-apt, I have added one more buildpack into the mix: heroku-bundle-config.

This buildpack allows you to keep your Heroku Bundler config in your repository, in the .heroku-bundle directory. During the build it moves this directory to .bundle and, most importantly, makes sure that all /app paths point to the temporary build directory instead, since that is where the app lives while it is being built.
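To make the path rewriting concrete, here is a minimal Ruby sketch of the idea. It is not the buildpack's actual code: the method name install_bundle_config and the runtime_root parameter are hypothetical, and it assumes that .bundle does not already exist in the build directory.

```ruby
require 'fileutils'

# Hypothetical helper, not taken from heroku-bundle-config itself.
# Moves build_dir/.heroku-bundle to build_dir/.bundle, then rewrites every
# path in the config that starts with runtime_root so it points at build_dir.
# Assumes build_dir/.bundle does not exist yet.
def install_bundle_config(build_dir, runtime_root: '/app')
  source = File.join(build_dir, '.heroku-bundle')
  target = File.join(build_dir, '.bundle')
  FileUtils.mv(source, target)

  config_path = File.join(target, 'config')
  config = File.read(config_path)

  # Match runtime_root only where it begins a path (not "vendor/app/...").
  pattern = %r{(?<![\w.-])#{Regexp.escape(runtime_root)}(?=/)}
  # Block form avoids backslash interpretation in the replacement string.
  File.write(config_path, config.gsub(pattern) { build_dir })

  config_path
end
```

With a build directory such as /tmp/build_123 (a made-up value), a flag like --with-icu-lib=/app/.apt/usr/lib would become --with-icu-lib=/tmp/build_123/.apt/usr/lib, which is where apt's files sit during the build.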

I’ve created a sample project that can be deployed to Heroku. The only thing you need to do is ensure that BUILDPACK_URL is set to https://github.com/ddollar/heroku-buildpack-multi.git:

$ heroku config:set BUILDPACK_URL=https://github.com/ddollar/heroku-buildpack-multi.git

When you push to Heroku, this buildpack will check for a .buildpacks file, which specifies the different buildpacks you want to use:

https://github.com/ddollar/heroku-buildpack-apt
https://github.com/timolehto/heroku-bundle-config
https://github.com/heroku/heroku-buildpack-ruby
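Assuming the buildpacks run in the order they are listed, the apt buildpack has to come before the Ruby one so that libicu is already present when Bundler compiles the gem. The following is a small sketch of a local check for that ordering; buildpack_urls, ordered? and REQUIRED_ORDER are hypothetical names, not part of any buildpack.

```ruby
# Order the buildpacks are assumed to need: apt installs libicu, the bundle
# config is put in place, and only then does the Ruby buildpack run Bundler.
REQUIRED_ORDER = %w[
  heroku-buildpack-apt
  heroku-bundle-config
  heroku-buildpack-ruby
].freeze

# Returns the non-blank lines of a .buildpacks file, stripped of whitespace.
def buildpack_urls(text)
  text.each_line.map(&:strip).reject(&:empty?)
end

# True when every name appears in some URL and the names appear in the
# given order.
def ordered?(urls, names)
  positions = names.map { |name| urls.index { |url| url.include?(name) } }
  positions.none?(&:nil?) && positions == positions.sort
end
```

Calling ordered?(buildpack_urls(File.read('.buildpacks')), REQUIRED_ORDER) from the project root before pushing would flag a missing or misplaced entry.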

heroku-buildpack-apt will then check for an Aptfile and install the specified packages:

libicu-dev

Finally, you need to configure your .heroku-bundle/config to make sure that Bundler can use your newly installed version of libicu:

---
BUNDLE_FROZEN: '1'
BUNDLE_PATH: vendor/bundle
BUNDLE_BIN: vendor/bundle/bin
BUNDLE_JOBS: 4
BUNDLE_WITHOUT: development:test
BUNDLE_DISABLE_SHARED_GEMS: '1'
BUNDLE_BUILD__CHARLOCK_HOLMES: --with-icu-lib=/app/.apt/usr/lib --with-icu-include=/app/.apt/usr/include
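
The BUNDLE_BUILD__CHARLOCK_HOLMES key holds the flags Bundler passes along when it compiles that gem's extension. As a rough sketch for checking what a config file will hand over, the hypothetical helper build_flags below parses the YAML and splits the flags into a hash. It assumes the key is derived by upcasing the gem name, which holds for charlock_holmes but is not a general rule for every gem name.

```ruby
require 'yaml'
require 'shellwords'

# Hypothetical helper: returns the build flags stored for gem_name in a
# Bundler config as a { "flag-name" => "value" } hash.
# Assumption: the config key is "BUNDLE_BUILD__" plus the upcased gem name.
def build_flags(config_yaml, gem_name)
  config = YAML.safe_load(config_yaml) || {}
  key = "BUNDLE_BUILD__#{gem_name.upcase}"

  Shellwords.split(config.fetch(key, '')).each_with_object({}) do |flag, flags|
    # "--with-icu-lib=/path" becomes name "with-icu-lib", value "/path".
    name, value = flag.sub(/\A--/, '').split('=', 2)
    flags[name] = value
  end
end
```

For the config above, build_flags(File.read('.heroku-bundle/config'), 'charlock_holmes') should yield the with-icu-lib and with-icu-include entries, each pointing under /app/.apt/usr.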

That should be all you need.