acts_as_ferret tutorial
Ruby on Rails, May 13th, 2008
Lately I have been playing around with the acts_as_ferret plugin, which makes it dead simple to implement full-text search in your Ruby on Rails models. The plugin is built on the Ferret gem (a full-featured search engine based on Apache Lucene). In this article I will guide you through the steps needed to implement searching for your models and talk about some of the problems I faced along the way.
Installation
First, install the Ferret gem:
    sudo gem install ferret
Install the acts_as_ferret plugin:
    ruby script/plugin install svn://projects.jkraemer.net/acts_as_ferret/tags/stable/acts_as_ferret
Usage
Set up a model to be indexed and searchable by acts_as_ferret. This basic example indexes the title and body fields (weighting the title more heavily with a boost is covered in the Boost section):
    class Post < ActiveRecord::Base
      # Ferret fields
      acts_as_ferret :fields => [:title, :body]
    end
You may now search through your model using the find_with_ferret method inside your controller.
    @posts = Post.find_with_ferret("test")
The @posts object is actually an ActsAsFerret::SearchResults object rather than an array of ActiveRecord objects. It gives us a few extra attributes: the total number of results, the Ferret score of each record, and support for will_paginate.
    Total number of results: <%= @posts.total_hits %>
    <% for post in @posts %>
      <%= post.title %> with a score of <%= post.ferret_score %>
    <% end %>
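To picture what "an object that behaves like an array but carries extra attributes" means, here is a minimal sketch. ToySearchResults is a hypothetical stand-in I made up for illustration, not the plugin's own class, and the records are plain strings instead of models.

```ruby
require "delegate"

# Hypothetical miniature of a results wrapper (NOT ActsAsFerret::SearchResults).
# It delegates array behaviour to the wrapped records and adds total_hits.
class ToySearchResults < DelegateClass(Array)
  attr_reader :total_hits

  def initialize(records, total_hits)
    super(records)            # everything Array can do is forwarded to records
    @total_hits = total_hits  # hits across all pages, not just this one
  end
end

results = ToySearchResults.new(["first post", "second post"], 57)
results.each { |record| puts record }  # iterates like a normal array
puts results.size                      # records held in this object
puts results.total_hits                # total matches reported by the search
```

The point of the design is that views can loop over the object as usual while still asking it for search metadata.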
Conditions
The above query can be customised further with conditions. This is great for implementing an advanced search or simply for filtering a search by certain parameters. Let's say we want to run a new search on our Post model but only include posts where category_id = 4.
In our controller:
    category_id = 4
    # The empty hash fills the ferret options argument so that :conditions is passed as a find option
    @posts = Post.find_with_ferret("test", {}, :conditions => ['category_id = ?', category_id])
The view remains the same as above; only the results in @posts differ.
Paginate
Being able to paginate your search results is very important, and acts_as_ferret makes it easy by being compatible with the will_paginate gem. For the latest version of will_paginate, mislav recommends installing the mislav-will_paginate gem.
Install the mislav-will_paginate gem:
    # add GitHub to local gem sources
    sudo gem sources -a http://gems.github.com/
    # install the gem
    sudo gem install mislav-will_paginate
In config/environment.rb add the following line at the end:
    require "will_paginate"
In the controller:
    per_page = 10
    @posts = Post.find_with_ferret(query, :page => params[:page], :per_page => per_page)
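If you are wondering what :page and :per_page boil down to, the arithmetic is roughly the following. This is only an illustrative sketch: page_window is a hypothetical helper of mine, and the plugin together with will_paginate does the equivalent work for you.

```ruby
# Illustrative arithmetic only; page_window is a hypothetical helper,
# not part of acts_as_ferret or will_paginate.
def page_window(page, per_page, total_hits)
  page = [page.to_i, 1].max  # params[:page] arrives as nil or a string
  {
    :offset      => (page - 1) * per_page,             # records to skip
    :limit       => per_page,                          # records to return
    :total_pages => (total_hits.to_f / per_page).ceil  # last page may be partial
  }
end

p page_window("3", 10, 42)  # third page of 42 hits
p page_window(nil, 10, 42)  # missing page parameter falls back to page 1
```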
In the view:
    <h2>Search Results</h2>
    <%= @posts.total_hits %> results
    <% for post in @posts %>
      <h3><%= link_to(post.title, post_url(post)) %></h3>
      <%= post.body %>
    <% end %>
    <%= will_paginate @posts %>
When you add conditions to your paginated results, the pagination parameters must be grouped together in their own hash, because that hash is the second argument of find_with_ferret and the conditions belong in the third, e.g.
    per_page = 10
    @posts = Post.find_with_ferret(query, {:page => params[:page], :per_page => per_page}, :conditions => ['category_id = ?', category_id])
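The braces matter because of plain Ruby argument parsing, which a small stub can demonstrate. find_with_ferret_stub below is a hypothetical stand-in; I am assuming the plugin's method has the shape (q, options = {}, find_options = {}), so check the API docs for your version.

```ruby
# Hypothetical stand-in, assuming the argument shape
# find_with_ferret(q, options = {}, find_options = {}).
def find_with_ferret_stub(q, options = {}, find_options = {})
  { :ferret_options => options, :find_options => find_options }
end

# Without braces, Ruby collects every key into ONE hash: the second argument.
merged = find_with_ferret_stub("test", :page => 2, :per_page => 10,
                               :conditions => ["category_id = ?", 4])

# With braces, pagination stays in the second argument and the
# conditions travel separately as the third.
split = find_with_ferret_stub("test", { :page => 2, :per_page => 10 },
                              :conditions => ["category_id = ?", 4])

p merged[:find_options]  # empty: the conditions never reached the finder
p split[:find_options]   # the conditions, on their own
```

The same reasoning explains why a search with conditions but no pagination needs an empty {} as the second argument.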
Database Field Storage
For fields in your models that hold only a small amount of data, you may wish to store the data in the index itself to get better search performance. Performance improves because the search no longer needs to query the database to retrieve that data.
Improving on the Post model:
    class Post < ActiveRecord::Base
      # Ferret fields; :yes is the value Ferret documents for the :store option
      acts_as_ferret :fields => { :title => { :store => :yes }, :body => {} }
    end
In our controller specify a “lazy load” from the ferret index:
    per_page = 10
    @posts = Post.find_with_ferret(query, :lazy => [:title], :page => params[:page], :per_page => per_page)
Boost
Boosting a field gives matches in that field more weight. The higher the boost factor, the more a match in that field counts towards the score. The default boost value is 1.
Improve the Post model to place more importance on the title of the post rather than the body:
    class Post < ActiveRecord::Base
      # Ferret fields; the title counts four times as much as the body
      acts_as_ferret :fields => { :title => { :store => :yes, :boost => 4 }, :body => {} }
    end
Please note that a post whose body is an exact match for the query can still be ranked above a post that matches in the title. As far as I know there's no way to separate your results so that title matches always appear at the top.
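A toy calculation shows why a boost does not guarantee that title matches come first. The numbers and the toy_score helper are made up for illustration; Ferret's real scoring also involves term frequency, document frequency and length normalisation, so treat this purely as a sketch of the idea.

```ruby
# Toy model only, NOT Ferret's scoring formula.
BOOSTS = { :title => 4.0, :body => 1.0 }

# raw_scores maps a field name to a made-up per-field match score.
def toy_score(raw_scores)
  raw_scores.inject(0.0) do |sum, (field, raw)|
    sum + raw * BOOSTS.fetch(field, 1.0)  # unlisted fields get the default boost of 1
  end
end

weak_title_match  = toy_score(:title => 0.2)  # 0.2 * 4
strong_body_match = toy_score(:body => 0.9)   # 0.9 * 1

# A strong enough body match still beats a weak, boosted title match.
puts strong_body_match > weak_title_match
```

Raising the boost narrows the cases where this happens, but it only shifts the balance rather than splitting results into separate groups.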
Highlighting
If you wish to have words from your query appear bold in the search results, highlighting is what you need. Highlighting requires that the field you want highlighting on is stored in the index.
So using the previous example of the Post model, in our view we can highlight terms:
    <%# @query is assumed to hold the search string, set in the controller %>
    <h2>Search Results</h2>
    <%= @posts.total_hits %> results
    <% for post in @posts %>
      <h3><%= link_to(post.highlight(@query, :field => :title, :num_excerpts => 1, :pre_tag => "<strong>", :post_tag => "</strong>"), post_url(post)) %></h3>
      <%= post.body %>
    <% end %>
    <%= will_paginate @posts %>
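To get a feel for what :pre_tag and :post_tag do, here is a much simplified pure Ruby approximation. toy_highlight is a hypothetical helper of mine; the real highlighter works from the stored field in the index and also handles excerpts, which this sketch ignores.

```ruby
# Simplified stand-in for highlighting, NOT the plugin's implementation.
# Wraps whole-word, case-insensitive matches of each query term in the tags.
def toy_highlight(text, query, pre_tag = "<strong>", post_tag = "</strong>")
  terms = query.split.map { |term| Regexp.escape(term) }
  return text if terms.empty?
  text.gsub(/\b(#{terms.join('|')})\b/i) { "#{pre_tag}#{$1}#{post_tag}" }
end

puts toy_highlight("Testing a test post", "test")
# prints: Testing a <strong>test</strong> post
```

Note that "Testing" is left alone because only whole words are matched here; the real analyser may treat word forms differently.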
Re-indexing
When you make changes to your models, you will want to re-index your data so that searches stay accurate. To do this, simply stop your server, delete the index directory inside your Rails application and start the server again. Once you navigate to a page which uses the model, Ferret will automatically re-index it.
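If you find yourself deleting the index often, the step can be scripted. The sketch below assumes the index lives in a directory called index directly under the application root, as described above; clear_ferret_index is a hypothetical helper name, and the demonstration runs against a throwaway directory rather than a real app.

```ruby
require "fileutils"
require "tmpdir"

# Removes the on-disk index directory so it gets rebuilt on next use.
# The default name "index" follows this article; pass another name if
# your setup keeps the index elsewhere (an assumption to verify).
def clear_ferret_index(app_root, dir_name = "index")
  path = File.join(app_root, dir_name)
  return false unless File.directory?(path)
  FileUtils.rm_rf(path)
  true
end

# Demonstration against a temporary directory.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "index", "development", "post"))
  puts clear_ferret_index(root)  # the index existed and was removed
  puts clear_ferret_index(root)  # nothing left to remove
end
```

Stop the server before running something like this, otherwise the running process may still hold the old index open.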
Troubleshooting
DRBConn Error
In case you're getting an error like this:

This error is caused by the built-in DRb server (shipped with acts_as_ferret) not running. If you're not getting this error and your DRb server isn't running, then you are most likely running your application in the development environment.
May 23rd, 2008 at 2:18 am
Good post.
Just one question; how can I set a specific parameter to a field? For example if the fields are :title and :category_id, I would like to set a specific category when searching the title.
Regards!
May 23rd, 2008 at 11:01 am
Thanks and good question Nicolas. I will update the post to include that functionality.
To include conditions in your search, you can use the :conditions option of the find_with_ferret method.
eg.
per_page = 10
@posts= Post.find_with_ferret(query, {:page => params[:page], :per_page => per_page}, :conditions => ['category_id = ?', category_id])
Notice the {} around the pagination attributes. Look at the acts_as_ferret API to find out more.
June 22nd, 2008 at 5:45 am
Following the previous response, please note a small error in the conditional find code snippet (missing empty hash for the first options hash). It should be:
@posts = Post.find_with_ferret("test", {}, :conditions => ['category_id = ?', category_id])
Instead of:
@posts = Post.find_with_ferret("test", :conditions => ['category_id = ?', category_id])
In addition, when specifying a condition you must use the ["id = ?", my_id] form instead of the ["id = :my_id", {:my_id => 12}] form (took me a while to work that problem out :-).
Thanks for a great tutorial!
July 4th, 2008 at 12:40 pm
After following the steps in this post all my queries return nothing:
#
Any thoughts?
July 4th, 2008 at 4:46 pm
Hey Liam,
You might want to look at your development log file (/log/development.log). If you're using Linux/Mac it's good practice to run 'tail -f development.log' so you can see the errors, if any, as they occur.
Can you send me the find_with_ferret() call you're trying to execute?
July 5th, 2008 at 1:18 pm
I actually found the issue! After having several problems with acts_as_ferret I decided to reinstall, which I think I did incorrectly. It seems that the ‘index’ folder, left from the previous install, was negating the search on the current install. After deleting it the search works and a new index folder was automatically recreated.
cheers!
July 14th, 2008 at 3:50 am
Hey Andrew,
great tutorial here!
I have looked high and low to find out how to implement user-defined filters and use them with AAF, but nothing has turned up so far.
I am trying to search for terms by their assigned location (location-based search), so the filter should limit the results to an X mile radius.
Any ideas, please?
This is where I picked up the filter idea : http://blog.tourb.us/archives/ferret-and-location-based-searches
July 15th, 2008 at 5:16 pm
Arvind,
Sorry, but I have never looked into doing something like that before. I'm guessing you would like to pass in conditions such as :radius => 10, :long_lat => xxxxx.
I guess a start would be to pass in those as the find_options array into the find_with_ferret method. There is a retrieve_records method in /plugins/acts_as_ferret/lib/class_methods.rb you might want to look at. Have a play with the line:
July 15th, 2008 at 5:24 pm
I made an application for jobs. It has one table, say faq, with many fields, e.g. skills, interested_in and current_project. Users log in and fill in a form. Suppose Mr. A logs in and enters his skills as C, C++, Java, his interested_in field as .net and his current_project as Java. Then Mr. B logs in and fills in the same form with skills C, .net, interested_in Java and current_project .net. When I build a report from this data to see how many people know each skill, it looks like this:
Skills    No. of Person
C         02 (because both know C)
C++       01 (only Mr. A knows)
Java      01 (only Mr. A knows)
.net      01 (only Mr. B knows)
The report of which person is interested in which field looks like this:
Interested_in    Person
.net             Mr. A
Java             Mr. B
Now when I click on the .net link in the interested_in report I get both Mr. A and Mr. B (i.e. it shows that both A and B are interested in .net, which is not true). I think that, because Ferret takes everything from the same table, it matches .net in skills for Mr. B and .net in interested_in for Mr. A, and shows me both. Can anyone help me search only skills or only interested_in within the same table?
July 16th, 2008 at 6:00 pm
Andrew,
I checked that option on your suggestion. But for now I went back to Ferret and passed it a filter_proc to calculate distance from a starting-point (lat/long) and filter by that. This way if I store class-names then I don’t have to go back to the database to retrieve the lat/long for each record. With retrieve_records conditions it seems like I do have to… or am I wrong?
July 19th, 2008 at 1:57 am
Arvind,
There are some nice gems and acts_as_x plugins that do the hard geocoding work for you. Search for acts_as_geocodable and acts_as_mappable. They have their pluses and minuses, but both provide a simple way to search through a geocoded model for, say, "cafes within 1 mile of the London Eye". However, you'd need to parse the request to identify that it's a request for a location with a distance, find the lat/lng of the location, and then use the acts_as stuff to get a result from that.
I've not tried, but soon will be trying, to combine a search using acts_as_ferret with a search using acts_as_mappable or acts_as_geocodable. My guess is that it won't work because both expect the extra conditions on their search terms to be basic SQL. It could be that to get good performance and proper control over combining free-text search with spatial search you'd need to use the free-text and spatial indexing in the underlying database, if it has any. As far as I can see you can only do proper spatial searches in MySQL if your tables are MyISAM, which is also a requirement for free-text search in MySQL. However, the spatial searching in MySQL still seems to be primitive. I'm looking at PostgreSQL as that seems to have better spatial searching facilities and it also has a nice free-text search engine.
July 25th, 2008 at 12:23 am
Great tutorial! One question: How do I search _id fields? For example I have an expenses table with a field called vendor_id. I want to get the name of the vendor and add it to the expenses search. Is there an easy way to pass in the vendor_id to generate the name? Thanks!
August 7th, 2008 at 8:57 am
Thanks John and Andrew.
I actually needed to combine both - Text and Location based search!
Thus I finally just created a ferret filter proc and within it I calculate distances and filter via it.
For this (and so that I don’t hit the database for every text-match) I stored the Lat/Long within the index.
Doing all of this of course (as far as I could follow) meant that I did the actual search NOT through AAF. I DO use AAF for indexing and all other purposes though; at search time I pull out the index and search through it directly.
September 13th, 2008 at 2:27 am
Hello guys!
Has anyone encountered the following error?
    $ ./script/ferret_server -e production start
    starting ferret server…
    $ RAILS_ENV=production ./script/console
    Loading production environment (Rails 2.1.0)
    /usr/lib/ruby/1.8/drb/drb.rb:1093:in `method_missing':DRb::DRbUnknownError: Mysql::
The same error in mongrel:
    /usr/lib/ruby/1.8/drb/drb.rb:1093:in `method_missing': Mysql:: (DRb::DRbUnknownError)
    from /home/app/vendor/plugins/acts_as_ferret/lib/remote_index.rb:16:in `send'
    from /home/app/vendor/plugins/acts_as_ferret/lib/remote_index.rb:16:in `method_missing'
    from /home/app/vendor/plugins/acts_as_ferret/lib/act_methods.rb:189:in `acts_as_ferret'
    …
September 13th, 2008 at 2:56 am
It turned out that I had just mistyped the database name.
September 15th, 2008 at 4:03 am
I have ferret and will_paginate working well together but my log keeps showing this warning:
WillPaginate: You are using a paginated collection of class ActsAsFerret::SearchResults which conforms to the old API of WillPaginate::Collection by using `page_count`, while the current method name is `total_pages`. Please upgrade yours or 3rd-party code that provides the paginated collection.
Has anyone encountered and solved this?
September 15th, 2008 at 4:39 am
I was able to stop the WillPaginate warning by adding this method to the ActsAsFerret module:
    def total_pages
      @total_pages
    end
September 17th, 2008 at 9:58 pm
Hi,
I had the same problem as Francois. I added the total_pages method to the acts_as_ferret module, but the problem persists. Where exactly did you add it, Francois? And does the warning cause any problems in actual usage?
I also have a problem combining will_paginate and acts_as_ferret when I'm sorting the search results. I want the newest posts on the first page and the oldest on the last page. The will_paginate default seems to be the other way around. acts_as_ferret sorts each paginated page correctly (newest at the top and oldest at the bottom), but I get the oldest items on the first paginated page and the newest on the last.
Does anyone have any idea of what might cause this behavior?
October 1st, 2008 at 1:09 pm
In addition to the warning that Francois describes, I get an exception when I use the other view helper method will_paginate provides, e.g. , to show the "showing xx-yy of zz" information, which is a nice supplement to the basic paging controls.
Using this helper throws an exception “undefined method `total_entries’ for #”. Anyone have any ideas how to patch around this?
October 1st, 2008 at 1:11 pm
Err, that example of the view helper is supposed to be [%= page_entries_info @people %], looks like the usable version got escaped.
October 1st, 2008 at 1:31 pm
Let me answer my own question with what I finally figured out. Although I’m using the supposedly current 0.4.3 gem of acts_as_ferret, the “documentation for acts_as_ferret”:http://projects.jkraemer.net/rdoc/acts_as_ferret/classes/ActsAsFerret/SearchResults.html asserts that total_hits is aliased to total_entries.
Maybe that’s true in a more recent version of acts_as_ferret, but for my application, I had to add the following:
    module ActsAsFerret
      class SearchResults
        def total_entries
          @total_hits
        end
      end
    end
(I added it to an initializer, but you can add it almost anywhere and just require it in.)