Structuring your background jobs
This morning while dealing with a support issue, I asked this question on the twitters:
Do you create a single (background) job for an event (AfterPostUpdated), or multiple jobs for tasks (SendPostEmails, UpdatePostCounts)?
In Tender Support, updating a discussion spawned a job that looked something like this:
class Job::CommentNotifications < Job::Base.new(:comment_id)
  def comment
    @comment ||= Comment.find comment_id
  end

  def perform
    comment.notified_users.each do |user|
      UserMailer.deliver_notification(user, comment)
    end
  end
end
(In case Evan asks, I am referring to a secret military Base for Tender jobs. Don't judge me!)
As functionality has grown in Tender, the job looks more like this now:
class Job::CommentNotifications < Job::Base.new(:comment_id)
  def perform
    # update sphinx with comment contents
    # add comment author as a watcher to the discussion
    # send autoreply for first comment of a discussion
    # send notifications to all discussion watchers
  end
end
Wow, writing all that out really makes me realize how ridiculous things have gotten. This job has definitely turned into a post-create callback for valid comments (the spam check is another job). These are all things that need to happen, in no specific order.
Why didn't I create individual jobs for these tasks out the gate? Part of the reason for my aggressive queueing in Tender is to keep the frontend requests as fast as possible. I didn't want to have to worry about creating 3 extra Delayed::Job rows for tasks that all run at the same time.
One thing I'm running into is the fact that sphinx indexing is relatively slow, holding up things like comment notification. I'm also planning an upgrade to the Tender infrastructure, so there's a chance that something like sphinx indexing or asset processing would have to happen on certain instances.
Here were the results of the poll:
For Single Job Classes
@capitalist Single Job for an event, so you can replay the failed ones.
@shojberg AfterPostUpdated imo, less jobs to maintain :)
For Multiple Job Classes
@laserlemon Multiple jobs. Better for assigning priority
@lifo Multiple jobs. I think it's a bad idea to have jobs like AfterPostUpdated. Post could change again by the time job gets run.
@spiceee multiple jobs.
@marcjeanson multiples
@fowlduck if they're not order-dependent i'm for multiple ones
@fowlduck if it fails in the middle and it's rerun then the part that didn't fail is rerun as well
@mguterl multiple jobs for tasks and sometimes we just use #send_later.
@trevorturk most of my DJs are like @lifo's, but I figure I should be using handle_asynchronously where possible
@ATimberlake separate jobs are more resiliant against duplications when jobs are re-run after errors
Both @fowlduck and @ATimberlake brought up great points: a job error triggers the whole job to be re-run. There is some wasted work, but this is especially painful if you're sending duplicate emails every few minutes as the job tries to complete. Notice how I kept that task at the bottom :)
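To make the re-run problem concrete, here is a rough sketch of what happens when the email step does not come last. Nothing in it is Tender code: FakeMailer, FlakyIndexer and run_with_retries are made-up stand-ins, and the retry loop is only an approximation of how a queue worker re-runs a failed job from the top.

```ruby
# Hypothetical mailer that only records who would have been emailed.
class FakeMailer
  attr_reader :sent

  def initialize
    @sent = []
  end

  def deliver(user)
    @sent << user
  end
end

# Hypothetical indexing step that raises the first `failures` times it runs.
class FlakyIndexer
  def initialize(failures)
    @failures = failures
  end

  def index
    return if @failures.zero?
    @failures -= 1
    raise "indexer unavailable"
  end
end

# Re-runs the whole block on error, roughly like a worker retrying a job.
def run_with_retries(max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

# One big job: the indexing failure causes the emails to be sent again.
def run_single_job(users)
  mailer = FakeMailer.new
  indexer = FlakyIndexer.new(1)
  run_with_retries do
    users.each { |user| mailer.deliver(user) }
    indexer.index
  end
  mailer.sent
end

# Two jobs: only the indexing job is retried, so the emails go out once.
def run_separate_jobs(users)
  mailer = FakeMailer.new
  indexer = FlakyIndexer.new(1)
  run_with_retries { users.each { |user| mailer.deliver(user) } }
  run_with_retries { indexer.index }
  mailer.sent
end

p run_single_job(%w[alice bob])    # => ["alice", "bob", "alice", "bob"]
p run_separate_jobs(%w[alice bob]) # => ["alice", "bob"]
```

Putting the emails last in a single job avoids the duplicates only as long as the email step itself never fails halfway through the list.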
The Verdict
I'm going to have to say that multiple job classes are the way to go. It's definitely not crucial for day one, but it is something you should keep in mind for when you start to run into similar issues.
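For what it's worth, a split along those lines might look something like the sketch below. It is only an illustration: the job class names are invented, plain Structs stand in for Job::Base, and TinyQueue is a toy in-memory queue rather than Delayed::Job. The "lower number runs first" priority rule is an assumption of this sketch, not a claim about any real queue.

```ruby
# Hypothetical job classes, one per task from the big perform method.
module Job
  UpdateSearchIndex = Struct.new(:comment_id) do
    def perform
      "index comment #{comment_id}"
    end
  end

  AddAuthorAsWatcher = Struct.new(:comment_id) do
    def perform
      "watch comment #{comment_id}"
    end
  end

  SendAutoreply = Struct.new(:comment_id) do
    def perform
      "autoreply for comment #{comment_id}"
    end
  end

  SendNotifications = Struct.new(:comment_id) do
    def perform
      "notify watchers of comment #{comment_id}"
    end
  end
end

# Toy in-memory queue, standing in for a real backend.
class TinyQueue
  Entry = Struct.new(:job, :priority, :seq)

  def initialize
    @entries = []
    @seq = 0
  end

  # Assumption of this sketch: a lower priority number runs first.
  def enqueue(job, priority: 0)
    @seq += 1
    @entries << Entry.new(job, priority, @seq)
  end

  # Runs every pending job by priority, FIFO within the same priority.
  def work_off
    pending = @entries.sort_by { |entry| [entry.priority, entry.seq] }
    @entries = []
    pending.map { |entry| entry.job.perform }
  end
end

# The slow indexing job gets a worse priority so it cannot hold up the rest.
def enqueue_comment_jobs(queue, comment_id)
  queue.enqueue(Job::UpdateSearchIndex.new(comment_id), priority: 10)
  queue.enqueue(Job::AddAuthorAsWatcher.new(comment_id))
  queue.enqueue(Job::SendAutoreply.new(comment_id))
  queue.enqueue(Job::SendNotifications.new(comment_id))
end

queue = TinyQueue.new
enqueue_comment_jobs(queue, 42)
puts queue.work_off
# index comment 42 is printed last, after the three priority-0 jobs
```

The cost is the extra queue rows per comment that I was trying to avoid in the first place, so this is a trade-off rather than a free win.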
Maybe something is in the air, but @defunkt posted a blog post about Resque, GitHub's Redis-backed queue. The non-Redis stuff sounds great (workers, web UI, named queues). Redis is just icing on the cake.

Thanks for the fun twitter discussion, everyone! Any more thoughts? Comment below or @reply on twitter.