Getting Fancy with ElasticSearch

Bree Stanwyck

Jun 7, 2012

When an app requires full-text search, developers usually have two major contenders to choose from: Solr and ElasticSearch. Each addresses different use cases, but generally, ElasticSearch performs noticeably better when an app expects frequent reindexing, as is often the case. Gems like Tire make setting up ElasticSearch a breeze, but setting up more advanced indexes and interfacing with ActiveRecord can sometimes be a pain. Read on to see how to make your life easier with ElasticSearch and Tire.

Say an app needs an “omnibox” – a single search input that searches over multiple fields (for example, a user’s name, email address, and/or company). An initial attempt at setting this up in ElasticSearch with Tire would look like this:
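A minimal sketch of such a model, assuming a Rails app with the tire gem loaded and a users table with name, email, and company columns. The SEARCHABLE_FIELDS constant is a hypothetical helper introduced for illustration, and the guard exists only so the snippet also loads outside a Rails app:

```ruby
# Fields the omnibox should search over (hypothetical constant for this sketch).
SEARCHABLE_FIELDS = [:name, :email, :company].freeze

# Guarded so the file still loads where Tire and ActiveRecord are absent;
# in a real app the class would simply live in app/models/user.rb.
if defined?(Tire) && defined?(ActiveRecord::Base)
  class User < ActiveRecord::Base
    include Tire::Model::Search     # adds User.search and index helpers
    include Tire::Model::Callbacks  # keeps the index in sync on save/destroy

    # Declare which attributes end up in the ElasticSearch index.
    mapping do
      SEARCHABLE_FIELDS.each { |field| indexes field }
    end
  end
end
```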

After that, we can search for users with User.search('Highgroove', load: true) and get the expected response.

But what if we want to allow partial-string searches? This requires some custom analyzers, in this case n-grams over the strings, which match substrings between the given lengths:
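As a rough sketch, analysis settings of this kind can be written as nested Ruby hashes. The filter and analyzer names (partial_ngrams, partial_match) and the gram lengths are assumptions made for illustration. The ngrams helper is not part of ElasticSearch or Tire; it only imitates what an n-gram filter emits, to show why partial strings start to match:

```ruby
# Hypothetical analysis settings: an n-gram token filter plus a custom
# analyzer that lowercases tokens and then splits them into n-grams.
NGRAM_SETTINGS = {
  analysis: {
    filter: {
      partial_ngrams: { type: 'nGram', min_gram: 2, max_gram: 8 }
    },
    analyzer: {
      partial_match: {
        type: 'custom',
        tokenizer: 'standard',
        filter: %w[lowercase partial_ngrams]
      }
    }
  }
}.freeze

# Illustration only: every substring of `text` whose length lies between
# min_gram and max_gram, lowercased.
def ngrams(text, min_gram, max_gram)
  chars = text.downcase.chars
  (min_gram..max_gram).flat_map do |size|
    # each_cons yields nothing once size exceeds the token length
    chars.each_cons(size).map(&:join)
  end.uniq
end

filter = NGRAM_SETTINGS[:analysis][:filter][:partial_ngrams]
grams = ngrams('Groove', filter[:min_gram], filter[:max_gram])
puts grams.include?('roov') # a partial query such as "roov" now has a term to match
```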

This works, but we can do much better than the mess of hashes above. Personally, I prefer to wrap this setup in a YAML file and parse it separately in an initializer:
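One way this might look, as a sketch: the settings live in YAML and are parsed once at boot. The filter and analyzer names and the gram lengths are hypothetical. A heredoc stands in for the file so the example is self-contained; in an initializer, YAML.load_file on a path under config/ would be the more likely choice:

```ruby
require 'yaml'

# Stand-in for a YAML file such as config/elasticsearch.yml (assumed path).
ANALYSIS_YAML = <<~SETTINGS
  analysis:
    filter:
      partial_ngrams:
        type: nGram
        min_gram: 2
        max_gram: 8
    analyzer:
      partial_match:
        type: custom
        tokenizer: standard
        filter:
          - lowercase
          - partial_ngrams
SETTINGS

# Parse once and keep the resulting hash around for the models to use.
ELASTICSEARCH_SETTINGS = YAML.safe_load(ANALYSIS_YAML).freeze

puts ELASTICSEARCH_SETTINGS['analysis']['analyzer'].keys.inspect
```

The parsed hash can then be handed to the model's settings call, with each indexed field naming the analyzer it should use.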

We’re almost done. Unfortunately, adding custom analyzers interferes with ElasticSearch’s ability to search over all indexed fields in a single #search call. Instead, searches have to take the form `User.search("name:#{query} OR email:#{query} OR company:#{query}")`. We also have to tokenize queries to account for whitespace. When all is said and done, a finished full-text search might look like this:
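The query-building half of such a method can be sketched in plain Ruby. The helper name fulltext_query and the SEARCH_FIELDS constant are hypothetical, joining tokens with AND is an assumption (OR would widen the results), and no escaping of query-syntax special characters is attempted:

```ruby
SEARCH_FIELDS = %w[name email company].freeze

# Build a query string that requires every whitespace-separated token
# to appear in at least one of the searchable fields.
def fulltext_query(query, fields = SEARCH_FIELDS)
  tokens = query.to_s.split # splits on any run of whitespace
  clauses = tokens.map do |token|
    per_field = fields.map { |field| "#{field}:#{token}" }
    "(#{per_field.join(' OR ')})"
  end
  clauses.join(' AND ')
end

puts fulltext_query('bree groove')
# (name:bree OR email:bree OR company:bree) AND (name:groove OR email:groove OR company:groove)
```

A class method on the model could then pass this string to the search call, for example search(fulltext_query(query), load: true).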

With that, we finally have our omni-search, which we call like User.fulltext_search('groove').

Some final tips and tricks that make life with ElasticSearch that much nicer:

  • When setting up ElasticSearch on a development machine, it’s easy to mess up the index (for example, by running tests that involve ElasticSearch and add non-existent data to the index). Getting rid of this locally is as easy as sending a DELETE request to the ElasticSearch server, which usually looks like curl -XDELETE 'http://localhost:9200/users/', followed by a rake db:setup to re-seed the database and re-index (or User.index.import in the Rails console just to re-index).
  • n-grams can waste memory if you’re not careful; the min_gram and max_gram analyzer settings should be just large enough to narrow searches down to one record, and no larger (a max_gram of 15 over a name is probably wasteful, since very few names share a substring that long).
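To put rough numbers on the n-gram tip, the sketch below counts how many grams a single token produces for a given min_gram and max_gram. The ngram_count helper is hypothetical and ignores duplicate grams, so treat the result as an upper bound on the terms indexed per token:

```ruby
# Number of n-grams one token of `length` characters produces for gram
# sizes min_gram..max_gram; sizes longer than the token produce nothing.
def ngram_count(length, min_gram, max_gram)
  upper = [max_gram, length].min
  return 0 if upper < min_gram

  (min_gram..upper).sum { |size| length - size + 1 }
end

name = 'Highgroove'
puts ngram_count(name.length, 2, 5)  # 30
puts ngram_count(name.length, 2, 15) # 45
```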

Zack Simon

Reviewer Big Nerd Ranch

