nanoc and Tagging: Cookin' on Gas

Tuesday 23 September 2008


Getting Started

Since I've had a bit of spare time on my hands with Lehman Going Bankrupt, at least for this week, I've got back into finishing off all those small bits and bobs around my nanoc site here.

One of those items is to automatically generate tag index pages via nanoc. Wim did this on his groovy site over at Fixnum but sadly had lost the code to it. Since Denis has already done most of the work in exposing the API for nanoc 2, with a good example of how to generate tag indices in the nanoc Blog-o-utorial, I took Denis's basic setup and Wim's idea and fleshed it out a little more along with integrating it into my whole Rake shebang.

The Theory

Denis's original approach generates a placeholder page inside your site's content area for each tag. This page doesn't really contain anything other than metadata within the YAML file.

Initially, I thought this was a bit of a fudge and tried to use nanoc's super-duper API wizardry to compile pages on-the-fly within a Rake task without the need for generating an intermediate placeholder page for each tag inside content/tags/.

This seemed like a good idea at the time but didn't work too well, as you can see from my cry for help, "Fun with tags", posted on the nanoc Google Group.

I changed this around slightly and instead stuck with Denis's original idea of generating placeholder pages and then compiling these via nanoc. This seems to work a lot better. The main advantage is that nanoc then manages the regeneration of these indices via the same update/regenerate/change watching mechanism it uses for pages. In practice, this means if you re-tag a page, nanoc is then smart enough to realise and regenerate only the index pages that refer to the tag name that's changed. Pretty smart!

The Practice

I had to make some slight alterations, notably embedding an extra layout attribute within the tag index placeholder page, as I use a separate layout for tag indices.
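
For illustration, the metadata carried by one placeholder boils down to two attributes. Here's a minimal sketch, assuming a made-up tag called nanoc and using plain YAML from Ruby's standard library rather than nanoc's own page writer, whose exact on-disk format may well differ:

```ruby
require "yaml"

# Hypothetical metadata for the placeholder of a tag called "nanoc".
# tag_meta records which tag the index is for; layout selects the
# dedicated tag index layout instead of the default one.
placeholder_meta = { "tag_meta" => "nanoc", "layout" => "tag" }

# Serialise the hash the way it might sit in a YAML metadata file.
placeholder_yaml = placeholder_meta.to_yaml
puts placeholder_yaml
```

The page itself needs no body text; everything the layout needs is in those two keys.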

There are basically two stages to this whole process.

Tags Placeholder Page Generation

For the first part, you'll need to download the Rake task:

  1. Download tags.rake

This can then be run via rake tags and will scan your site looking for pages with a tags attribute set. Each unique tag gets its own index page generated under content/tags/tagname.

    require "nanoc"
    require "yaml"

    def generate_tags
      # Load nanoc site
      site = Nanoc::Site.new(YAML.load_file('config.yaml'))
      site.load_data
      pages = site.pages

      # Get all tags used on the site
      all_tags = pages.map { |p| p.attribute_named(:tags) }.flatten.compact.uniq

      # Get tags for which an index page exists
      # (placeholder pages carry the tag name in :tag_meta)
      tags_with_pages = pages.map { |p| p.attribute_named(:tag_meta) }.compact.uniq

      # Get tags for which no index page exists
      tags_without_pages = all_tags - tags_with_pages

      puts "Generating tags indices..."

      # Build pages for each tag
      tags_without_pages.each do |t|
        # Build page: content, attributes (tag name and layout), path
        page = Nanoc::Page.new("Tag: #{t}",
                               { :tag_meta => t, :layout => "tag" },
                               "/tags/#{t}")
        page.site = site
        page.build_reps
        site.pages << page
        page.save
        puts "    " + page.path
      end

      puts
      puts "Tags indices generated."
    end

    desc "Generate tags index pages."
    task :tags do
      generate_tags
    end
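
The bookkeeping at the heart of the task can be tried on its own, without nanoc. This is only a sketch: plain hashes with made-up data stand in for page attributes, and it assumes an existing placeholder is recognised by its tag_meta attribute.

```ruby
# Each hash stands in for one page's attributes; the data is made up.
pages = [
  { :tags => ["nanoc", "ruby"] },
  { :tags => ["ruby"] },
  {},                        # a page with no tags at all
  { :tag_meta => "ruby" }    # an existing placeholder for "ruby"
]

# Every tag used anywhere on the site, flattened and de-duplicated.
all_tags = pages.map { |p| p[:tags] }.flatten.compact.uniq

# Tags that already have a placeholder page.
tags_with_pages = pages.map { |p| p[:tag_meta] }.compact.uniq

# Only these tags still need a placeholder generated.
tags_without_pages = all_tags - tags_with_pages

p tags_without_pages
```

Here only nanoc is left over, since ruby already has its placeholder. The compact call matters: untagged pages yield nil, which would otherwise end up in the list of tags.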

There are a few things to note here:

Tag Indices Layout

This turns out to be very similar to adding a regular article index page, only instead of wanting to trawl all articles for a site, we only want the ones matching a certain tag name.

As luck would have it, the nanoc API already exposes such a function: pages_with_tag
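
To give a feel for what a helper like that does, here's a rough sketch in plain Ruby. It is not nanoc's implementation: SketchPage and pages_tagged are hypothetical names, and the pages are made up.

```ruby
# Hypothetical stand-in for a page: just an identifier and its tags.
SketchPage = Struct.new(:page_id, :tags)

# Return the pages whose tag list includes the given tag.
# Pages without any tags (tags is nil) are skipped.
def pages_tagged(pages, tag)
  pages.select { |page| (page.tags || []).include?(tag) }
end

pages = [
  SketchPage.new("first-post", ["nanoc", "ruby"]),
  SketchPage.new("second-post", ["rake"]),
  SketchPage.new("untagged", nil)
]

p pages_tagged(pages, "nanoc").map { |page| page.page_id }
```

Only first-post comes back for the nanoc tag; the untagged page is quietly ignored rather than raising an error.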

Using the tag_meta attribute we added to the tag index placeholder page, it now becomes pretty straightforward to generate the index page itself. nanoc kindly does all the heavy lifting for us. Here's the key ERB snippet from the layout:

    <div id="inner">

        <% pages_with_tag(@page.tag_meta).each do |article| %>

            <div class="article">

                <h1 class="title"><%= article.page_id %></h1>
                <p class="excerpt">
                   <%= auto_excerpt(article) %>...
                   <%= link_to "read " + article.page_id, article.page_id %>
                </p>
                <p class="tags">
                   Tags: <%= tags_for(article, :base_url => '/tags/') %>.
                </p>

            </div>

        <% end %>

    </div>

Unlike Denis, I don't use an explicit title attribute; instead I use the page_id so I don't have to repeat any uniquely identifying text for the page.

Similarly, I don't maintain a distinct summary or excerpt attribute, again to avoid duplication. Instead I have a small function called auto_excerpt, which uses Hpricot to peek at the beginning of the article text and extract the first few words.
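
The idea behind such a helper can be sketched with nothing but the standard library. This is not my auto_excerpt: excerpt_from_html is a hypothetical name, it takes a raw HTML string rather than a page, and a crude regex stands in for a proper parser such as Hpricot.

```ruby
# Hypothetical helper: strip markup and return the first few words.
# A regex is only a rough stand-in for a real HTML parser.
def excerpt_from_html(html, word_count = 20)
  text = html.gsub(/<[^>]*>/, " ")        # replace anything tag-like with a space
  text.split.first(word_count).join(" ")  # keep the leading words only
end

html = "<h1>Cookin' on Gas</h1><p>Tag indices are <em>finally</em> working.</p>"
puts excerpt_from_html(html, 5)
```

The helper returns the bare words and leaves the trailing ellipsis to the layout.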

And now, at last, all the tags index links work on my site as you can see for yourself - cookin' on gas!