Structuring your background jobs
This morning while dealing with a support issue, I asked this question on the twitters:
Do you create a single (background) job for an event (AfterPostUpdated), or multiple jobs for tasks (SendPostEmails, UpdatePostCounts)?
In Tender Support, updating a discussion spawned a job that looked something like this:
class Job::CommentNotifications < Job::Base.new(:comment_id)
  def comment
    @comment ||= Comment.find comment_id
  end

  def perform
    comment.notified_users.each do |user|
      UserMailer.deliver_notification(user, comment)
    end
  end
end
(In case Evan asks, I am referring to a secret military Base for Tender jobs. Don’t judge me!)
As functionality has grown in Tender, the job looks more like this now:
class Job::CommentNotifications < Job::Base.new(:comment_id)
  def perform
    # update sphinx with comment contents
    # add comment author as a watcher to the discussion
    # send autoreply for first comment of a discussion
    # send notifications to all discussion watchers
  end
end
Wow, writing all that out really makes me realize how ridiculous things have gotten. This job has definitely turned into a post-create callback for valid comments (the spam check is another job). These are all things that need to happen, in no specific order.
Why didn’t I create individual jobs for these tasks out the gate? Part of the reason for my aggressive queueing in Tender is to keep the frontend requests as fast as possible. I didn’t want to have to worry about creating 3 extra Delayed::Job rows for tasks that all run at the same time.
One thing I’m running into is the fact that sphinx indexing is relatively slow, holding up things like comment notification. I’m also planning an upgrade to the Tender infrastructure, so there’s a chance that something like sphinx indexing or asset processing would have to happen on certain instances.
Here were the results of the poll:
For Single Job Classes
@capitalist Single Job for an event, so you can replay the failed ones.
@shojberg AfterPostUpdated imo, less jobs to maintain :)
For Multiple Job Classes
@laserlemon Multiple jobs. Better for assigning priority
@lifo Multiple jobs. I think it’s a bad idea to have jobs like AfterPostUpdated. Post could change again by the time job gets run.
@spiceee multiple jobs.
@marcjeanson multiples
@fowlduck if they’re not order-dependent i’m for multiple ones
@fowlduck if it fails in the middle and it’s rerun then the part that didn’t fail is rerun as well
@mguterl multiple jobs for tasks and sometimes we just use #send_later.
@trevorturk most of my DJs are like @lifo’s, but I figure I should be using handle_asynchronously where possible
@ATimberlake separate jobs are more resiliant against duplications when jobs are re-run after errors
Both @fowlduck and @ATimberlake brought up great points: a job error triggers the whole job to be re-run. There is some wasted work, but this is especially painful if you’re sending duplicate emails every few minutes as the job tries to complete. Notice how I kept that task at the bottom :)
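To make the re-run problem concrete, here is a rough simulation. It is only a sketch: `run_with_retries`, `combined_job_deliveries` and `split_job_deliveries` are hypothetical helpers I made up for illustration, plain arrays stand in for the mailer, and nothing here touches Delayed::Job. The combined job deliberately sends its emails first, the opposite of the ordering I use, to show what a failure in a later step does.

```ruby
# Hypothetical simulation: plain arrays stand in for the mailer and the worker.

# Runs the block until it succeeds, up to max_attempts times, passing the
# attempt number so a step can pretend to fail on the first try.
def run_with_retries(max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue RuntimeError
    retry if attempts < max_attempts
    raise
  end
end

# One combined job: emails go out first, then indexing fails on attempt 1,
# so the retry sends every email a second time.
def combined_job_deliveries
  sent = []
  run_with_retries do |attempt|
    %w[alice bob].each { |user| sent << user }
    raise "indexer unavailable" if attempt == 1
  end
  sent
end

# Two separate jobs: only the failing indexing job is retried.
def split_job_deliveries
  sent = []
  run_with_retries { |_attempt| %w[alice bob].each { |user| sent << user } }
  run_with_retries { |attempt| raise "indexer unavailable" if attempt == 1 }
  sent
end

p combined_job_deliveries # each user appears twice
p split_job_deliveries    # each user appears once
```

Splitting the work does not protect against a mailer that fails halfway through its own list, but it does keep an unrelated failure from re-sending mail that already went out.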
The Verdict
I’m going to have to say that multiple job classes are the way to go. It’s definitely not crucial for day one, but it is something you should keep in mind for when you start to run into similar issues.
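Here is roughly what the split could look like. This is a sketch under assumptions, not Tender's actual code: `FakeQueue` is a hypothetical in-memory stand-in for a real queue, the three job classes and `enqueue_comment_jobs` are names invented for this example, and the "lower number runs first" priority rule is simply the convention this fake queue uses.

```ruby
# Hypothetical in-memory queue standing in for a real job backend.
class FakeQueue
  Entry = Struct.new(:job, :priority)

  def initialize
    @entries = []
  end

  def enqueue(job, priority = 0)
    @entries << Entry.new(job, priority)
  end

  # Runs every job once, lowest priority number first (ties keep insertion
  # order), and returns the class names of the jobs that raised.
  def work_off
    failed = []
    ordered = @entries.each_with_index.sort_by { |entry, i| [entry.priority, i] }
    ordered.each do |entry, _i|
      begin
        entry.job.perform
      rescue StandardError
        failed << entry.job.class.name
      end
    end
    @entries.clear
    failed
  end
end

LOG = [] # records the work each job did, in place of real side effects

# One small job class per task; each takes only the comment id.
AddAuthorAsWatcher = Struct.new(:comment_id) do
  def perform
    LOG << [:watch, comment_id]
  end
end

SendCommentNotifications = Struct.new(:comment_id) do
  def perform
    LOG << [:notify, comment_id]
  end
end

UpdateSearchIndex = Struct.new(:comment_id) do
  def perform
    raise "indexer unavailable" # simulate the slow, flaky step failing
  end
end

# What the post-create hook would do: enqueue each task on its own.
def enqueue_comment_jobs(queue, comment_id)
  queue.enqueue(AddAuthorAsWatcher.new(comment_id), 0)
  queue.enqueue(SendCommentNotifications.new(comment_id), 0)
  queue.enqueue(UpdateSearchIndex.new(comment_id), 5) # slow work goes last
end
```

Calling `enqueue_comment_jobs(queue, 42)` and then `queue.work_off` runs the watcher and notification jobs and reports only `UpdateSearchIndex` as failed, so a retry would redo the indexing and nothing else. The cost is the one I was trying to avoid: three rows enqueued per comment instead of one.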
Maybe something is in the air, but @defunkt posted a blog post about Resque, GitHub’s Redis-backed queue. The non-Redis stuff sounds great (workers, web UI, named queues). Redis is just icing on the cake.

Thanks for the fun twitter discussion, everyone! Any more thoughts? Comment below or @reply on twitter.
