Django: Hierarchical tags with taggit and treebeard

I’ve been using django-taggit to provide a tagging model for content items in my app. However, I wanted to arrange the tags into a hierarchy/taxonomy. It’s simple enough to use a custom through model to define a custom tag model with a parent pointer, which lets you arrange your tags into a tree:

from django.db import models
from taggit.models import TagBase, ItemBase
from taggit.managers import TaggableManager

...

# the custom tag model
class HierarchicalTag (TagBase):
    # CASCADE was the implicit default before on_delete became mandatory
    parent = models.ForeignKey('self', null=True, blank=True, on_delete=models.CASCADE)

# the through model
class TaggedContentItem (ItemBase):
    content_object = models.ForeignKey('ContentItem', on_delete=models.CASCADE)
    tag = models.ForeignKey('HierarchicalTag', related_name='tags', on_delete=models.CASCADE)

# the content item is an ordinary model; only the through model extends ItemBase
class ContentItem (models.Model):
    tags = TaggableManager(through=TaggedContentItem, blank=True)

However, suppose you have a tree of tags like this:

Vehicle
    Car
        BMW
            Z4
        Ford
            Fiesta
        Chevrolet
            Volt

and you have content items tagged with leaves (Z4, Fiesta, Volt), but you want to search for all items tagged with anything from the ‘Car’ branch of the tree. Chances are you’ll end up writing a recursive function to gather up all the descendants of ‘Car’, which doesn’t scale because it involves many SQL queries, or using esoteric SQL syntax available only in the big database engines (and certainly not sqlite3).
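To make the cost of the recursive approach concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The table layout, the ids and the descendants() helper are all made up for illustration; the point is only that a parent-pointer tree needs one query per node visited.

```python
import sqlite3

# Hypothetical schema: one row per tag, with a nullable parent pointer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO tag VALUES (?, ?, ?)",
    [
        (1, "Vehicle", None),
        (2, "Car", 1),
        (3, "BMW", 2),
        (4, "Z4", 3),
        (5, "Ford", 2),
        (6, "Fiesta", 5),
        (7, "Chevrolet", 2),
        (8, "Volt", 7),
    ],
)

query_count = 0  # incremented once per SELECT issued


def descendants(tag_id):
    """Collect the names of all descendants, one SELECT per visited node."""
    global query_count
    query_count += 1
    children = conn.execute(
        "SELECT id, name FROM tag WHERE parent_id = ? ORDER BY id", (tag_id,)
    ).fetchall()
    found = []
    for child_id, child_name in children:
        found.append(child_name)
        found.extend(descendants(child_id))
    return found


car_descendants = descendants(2)  # 2 is the id of 'Car' in this sketch
print(car_descendants, query_count)
```

Even on this tiny tree, gathering the six descendants of 'Car' takes seven SELECTs (one for 'Car' itself and one for each descendant), and the count grows with the size of the branch.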

At work, where we use a non-relational database engine, we long ago overcame the same issue (efficient manipulation and querying of hierarchical models), so I already had an idea of what I needed to do. But, as is the way with Python and Django, I figured there would probably already be packages that implement efficient hierarchical data — and there are.

The two main contenders seem to be django-mptt and django-treebeard. I tried mptt first, mainly because the consensus seemed to be that it was smaller and easier to use, but also because it purported to allow you to add hierarchical structure to existing models by configuration, which in my case would mean I didn’t have to define a custom tag model and could attach hierarchy directly to taggit’s Tag model.

However, my experience of mptt was poor – the documentation appeared to be out of date with respect to both the version of mptt I got from pip and the latest git trunk. Also, when I tried to use mptt’s admin classes for Django, I got exceptions (I admit I didn’t try very hard to overcome them).

So I gave treebeard a go, and had a much smoother time. Treebeard implements a number of hierarchy techniques with different performance characteristics (e.g. cheap querying but expensive insertion), allowing you to choose which one suits your application’s use of the trees. In my case I went for ‘Materialised Path Trees’ because it’s the relational equivalent of the technique I’m already familiar with. Implementing hierarchical tags was a straightforward case of having my custom tag model extend treebeard’s MP_Node model which, as the name suggests, implements a node in a Materialised Path Tree:

from django.db import models
from treebeard.mp_tree import MP_Node

...

class HierarchicalTag (TagBase, MP_Node):
    node_order_by = [ 'name' ]

class TaggedContentItem (ItemBase):
    content_object = models.ForeignKey('ContentItem', on_delete=models.CASCADE)
    tag = models.ForeignKey('HierarchicalTag', related_name='tags', on_delete=models.CASCADE)

# the content item is an ordinary model; only the through model extends ItemBase
class ContentItem (models.Model):
    tags = TaggableManager(through=TaggedContentItem, blank=True)

(The node_order_by is what treebeard uses to order siblings when a new node is added to the tree.) That was literally all that was needed. Going back to the ‘Car’ example, this is the code to find all ContentItems tagged with any of the descendants of ‘Car’:

# look up the Car term
car = HierarchicalTag.objects.get(name='Car')

# get a queryset of all its descendants: with treebeard this is 1 SQL statement
# use HierarchicalTag.get_tree(car) if you want to include 'Car'

treeqs = car.get_descendants()

# now find the ContentItems using an inner queryset; distinct() avoids
# returning an item twice when it carries more than one matching tag
qs = ContentItem.objects.filter(tags__in=treeqs).distinct()
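For anyone curious why a materialised path makes the descendant lookup a single statement, here is a rough sketch of the idea using Python's built-in sqlite3 module. It is not treebeard's actual implementation: the table, the fixed-width four-character path steps and the descendants() helper are assumptions chosen for illustration, and treebeard's real path encoding and SQL may differ.

```python
import sqlite3

# Hypothetical schema: each tag stores the full path from the root, built
# from fixed-width steps, so a child's path starts with its parent's path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (name TEXT, path TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO tag VALUES (?, ?)",
    [
        ("Vehicle", "0001"),
        ("Car", "00010001"),
        ("BMW", "000100010001"),
        ("Z4", "0001000100010001"),
        ("Chevrolet", "000100010002"),
        ("Volt", "0001000100020001"),
        ("Ford", "000100010003"),
        ("Fiesta", "0001000100030001"),
    ],
)


def descendants(path):
    """Return descendant names with a single prefix match on the path."""
    rows = conn.execute(
        "SELECT name FROM tag WHERE path LIKE ? AND path != ? ORDER BY path",
        (path + "%", path),
    )
    return [name for (name,) in rows]


car_path = conn.execute(
    "SELECT path FROM tag WHERE name = ?", ("Car",)
).fetchone()[0]
car_descendants = descendants(car_path)
print(car_descendants)
```

However deep the branch is, the lookup stays at one SELECT, and ordering by path returns the nodes in tree order. The trade-off is that inserting or moving nodes has to keep the paths consistent, which is the "cheap querying but expensive insertion" balance mentioned above.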
8 comments
  1. ISU534 said:

    I receive an error about an unknown keyword ‘content_object’:

    File "C:\Python27\lib\site-packages\django\db\models\sql\query.py", line 1316, in setup_joins
        "Choices are: %s" % (name, ", ".join(names)))
    FieldError: Cannot resolve keyword 'content_object' into field. Choices are: aid, id, tag

    My classes look like this:

    class HierarchicalTag(TagBase, MP_Node):
        """A custom tag model for photo application."""
        node_order_by = [ 'name' ]

    class TaggedAlbum(ItemBase):
        """taggit through model for album tags."""
        aid = models.ForeignKey('Album')
        tag = models.ForeignKey('HierarchicalTag', related_name='tags')

    class Album(ItemBase):
        name = models.CharField('Album Name', max_length=45, null=False, blank=False)
        description = models.CharField('Album Description', max_length=255, null=False, blank=False)
        tags = TaggableManager(through=TaggedAlbum, blank=True)

  2. ISU534 said:

    I’m sorry, that didn’t format correctly at all. It is three classes, HierarchicalTag, TaggedAlbum and Album.

  3. Anonymous said:

    It’s been a while since I wrote this, but looking at your code, I think you may just be missing the line:

    content_object = models.ForeignKey('ContentItem')

    from your TaggedAlbum.

  4. ISU534 said:

    That does appear to have been the issue. Thank you.

    A couple more questions, hopefully you are still willing to help.

    1.) How do I actually create the hierarchy now? I know the tags and taxonomy I want, but how do I associate a child with a parent (or parent with child)?

    2.) This line seems to hang django, should I be using a different method to add a tag now?

    a1.tags.add("pictures")

    In Django shell, it just hangs on that line.

  5. Anonymous said:

    It looks like I created the tags manually via admin screens, rather than programmatically. In my admin.py I have:

    from treebeard.admin import TreeAdmin
    …
    admin.site.register(HierarchicalTag, TreeAdmin)

    I’d imagine that to create the hierarchy, you need to assign the parent field of the child instance to the parent instance.

    To add tags to objects, I was using the same .add() method as you, so I’m afraid I don’t know why it would hang.

  6. ISU534 said:

    I can use the admin panel for the short term. If I figure out how to do this within the program, I’ll post back here too.

    Within the admin, I’m still confused how to build the hierarchy. For testing, I want to build something like this:

    -----
    Events
    ..Sports
    ....Baseball
    ....Football
    ..Vacation
    ....Beach
    ....Disney
    ....Grand Canyon
    -----

    I start by adding a tag, with a Name/Slug of ‘Events’. What should I select for the ‘Tagged Items’ settings? Object id appears to need to be an integer. Content Type is a list of classes in my application. I selected Tagged album.

    Then I want to create the ‘Sports’ tag and associate it as the child of Events. This is where I don’t see how to specify that relationship.

  7. Anonymous said:

    Just got it running again… in the admin interface for Hierarchical tags, I click ‘Add hierarchical tag’, then I’m presented with the fields: Name, Slug, Position, Relative to. (The latter two allow me to create the relationship to other tags.)

    So I’ve no idea why you’re seeing more fields than me, sorry! Might be best to persevere with adding the tags programmatically.
