Add database backend for ActiveJob delayed job handling and use it #2193
Comments
I'm currently leaning towards GoodJob. It has an MIT license (it's OSS), and it's clearly active (many commits in the last week). The developer (Ben Sheldon) has a blog post arguing for GoodJob, and the GoodJob README has a comparison table. Both come from the GoodJob project, so of course they argue for it, but the arguments are very compelling:
In contrast, GoodJob works directly with Postgres, doesn't poll, and supports multi-threading. The adapter for GoodJob isn't built into ActiveJob, but that doesn't really matter; all that matters is that we can configure things so that ActiveJob calls invoke GoodJob (which it's designed to do). I continue to look for other ActiveJob backends and comparisons between them. It's possible to migrate ActiveJob backends, but let's try to make a reasonably good choice the first time :-). So this post isn't necessarily the "final answer", just a "summary of what I've learned so far".
I've looked at Backburner, one of the backends directly supported by ActiveJob. For persistence it depends on beanstalkd, which is available on Ubuntu. However, when I looked at the [beanstalkd FAQ](https://github.com/beanstalkd/beanstalkd/wiki/FAQ) I found that it persists by writing to log files. We can't really do that; our filesystem writes aren't persistent. We have to use a backend database, so that won't work for us.
I looked around for testimonials. One mentioned the use of GoodJob, so that's promising for GoodJob. It appears that the new ActiveJob default backend will be Solid Queue. Their arguments make sense for their case: it ports between databases. The top contenders seem to be Solid Queue and GoodJob. Solid Queue polls. GoodJob is Postgres-only. Evidence so far suggests both would work very well for our use case. We do have a small amount of code tied to Postgres (we use case-insensitive text and Postgres' text search mechanisms). Still, we try to limit that.
I looked more at Solid Queue. The fact that it will be the Rails 8 default and is officially part of Rails are strong arguments for it. We try to minimize oddities. It's quite active (most recent commit 13 hours ago!). MIT license. There are some configuration variations. I'll have to drill into those to make sure at least one of them will actually work for us (though I don't anticipate problems). We have 2 strong candidates so far: Solid Queue and GoodJob. The bad news: we have to do analysis to make a selection. The good news: we have some awesome options.
Here's a discussion about Sidekiq. Clearly, if your organization is all-in on Sidekiq, use Sidekiq. If you've committed to it, it might even make sense to call it directly (instead of calling through a portability shim like ActiveJob). However, that's not our circumstance. We aren't all-in on Sidekiq, and since we have not made such a commitment, keeping it easy to switch makes sense for us. Although ActiveJob can easily call on Sidekiq to perform jobs, it appears that Sidekiq does not provide some queue data to the ActiveJob stack. That's not a crisis, but clearly if you choose Sidekiq, there are incentives to call it directly (eliminating the value of the ActiveJob shim).
I've been searching for a list of ActiveJob backends and comparisons between them. This ActiveJob intro mentions "Resque". The Ruby on Rails guide to ActiveJob basics mentions "Delayed Job and Resque". The discussion on Delayed Job is above; we now should look at Resque. That's the only backend I've identified so far that I haven't considered. Resque is a "Redis-backed library for creating background jobs, placing those jobs on multiple queues, and processing them later." It looks promising if we were already using Redis, but we aren't. I don't think we need to add another major component just to do a little delayed job handling, so I don't think this is a good choice for our circumstance. Note: We aren't doing that much with jobs. I expect a small job to be created on each edit to a project, along with each email on new sign-ups or password resets. These are quite short tasks, and not an overwhelming number either. We just need to make sure they aren't dropped when the system is halted.
Regarding race conditions: Cache invalidations are much smaller than images (since they have less data), so they take less time overall to send and less time to process. Thus, if there are 2 TCP/IP streams in parallel and we send the image and then the cache invalidation, there's an increased chance the cache invalidation will be received & processed first, even though it was sent after the older image. We definitely need to send a cache invalidation later, to counter the race condition. There's no perfect delay time, but a delay is the best we can do: there's simply no mechanism to ensure that one packet sequence arrives before another once it gets on the Internet.
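To make the race concrete, here is a toy replay in plain Ruby. The timings are made-up numbers, and `Message` and `stale_after?` are hypothetical names used only for illustration; nothing here talks to the CDN. It assumes the CDN simply acts on messages in the order they arrive.

```ruby
# A message sent from the application to the CDN.
# sent_at and transit are in seconds; all values below are invented.
Message = Struct.new(:kind, :sent_at, :transit, keyword_init: true) do
  def arrives_at
    sent_at + transit
  end
end

# Replays the messages in arrival order and reports whether the CDN
# ends up holding the outdated image.
def stale_after?(messages)
  cached_stale = false
  messages.sort_by(&:arrives_at).each do |message|
    case message.kind
    when :old_image    then cached_stale = true   # CDN caches what it received
    when :invalidation then cached_stale = false  # CDN purges its copy
    end
  end
  cached_stale
end

# The old image is large, so it is slow; the invalidation is tiny and fast.
old_image   = Message.new(kind: :old_image,    sent_at: 0.00, transit: 0.30)
purge_now   = Message.new(kind: :invalidation, sent_at: 0.05, transit: 0.02)
purge_later = Message.new(kind: :invalidation, sent_at: 5.00, transit: 0.02)

puts stale_after?([old_image, purge_now])               # true: purge overtook the image
puts stale_after?([old_image, purge_now, purge_later])  # false: delayed purge cleans up
```

In ActiveJob terms, the later purge would presumably be a job enqueued with a delay (for example via `set(wait: ...)` before `perform_later`), which is only safe across restarts once the queue is backed by a database.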
I've looked at every reasonable option I could find. Our top contenders for an ActiveJob database backend are Solid Queue and GoodJob. Both are OSS (MIT license), actively maintained, and can use PostgreSQL as their store. After reviewing both, I've decided to start integrating Solid Queue. GoodJob doesn't use polling, which is a nice minor performance advantage for it. However, the jobs we're doing are trivial cache invalidation and email sending tasks, and not that many of them, so the performance advantage is expected to be generally invisible. Solid Queue has its own advantages:
That said, GoodJob is a very worthy alternative. If things go wrong with the Solid Queue integration, that's the backup plan. If anyone knows of an issue, please reply.
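For reference, a minimal sketch of what selecting the backend looks like in a Rails app. It assumes the `solid_queue` gem is already in the Gemfile and installed, and the file path shown is the conventional location, not necessarily where this project would set it.

```ruby
# config/environments/production.rb (conventional location; an assumption here)
Rails.application.configure do
  # Route ActiveJob's perform_later calls to Solid Queue.
  config.active_job.queue_adapter = :solid_queue

  # The backup plan would be a one-line change, assuming the good_job gem
  # is installed instead:
  # config.active_job.queue_adapter = :good_job
end
```

Because job classes only talk to ActiveJob, swapping the adapter should not require touching them; moving jobs that are already enqueued is the part that takes care.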
Resque is old. That's what we used before Sidekiq, and most people moved to Sidekiq. You can rule out both, as you articulated.
We need to add a database backend for delayed job handling and then use it. I believe this will address some caching problems as well as improve scaling. This description goes over the issues.
First: Every once in a great while a cached image is not updated in a timely way. E.g.: #2186 and #2072. When we update a value, we do send out a cache invalidation for that badge image. I've been trying to track down the problem, and I don't think there's a race condition inside the application. So I believe there's a race outside. I think what's happening is that packets sent from our application to our CDN (Fastly) are being sent in parallel, and in some circumstances the first one sent is the second received (from the point of view of "will be acted on"). Basically, someone requests an image and we send a cache invalidation. The CDN receives the cache invalidation and then the old image, which is now considered the good image. There's also a possibility of races within the CDN. I don't control the entire Internet, and I certainly don't control packet ordering on it :-).
Our framework (Rails) has a mechanism for supporting delayed jobs called ActiveJob. It has many methods for enqueuing. See ActiveJob basics. We use ActiveJob somewhat, but its current configuration (the default) is RAM-based, which means that any jobs scheduled in it get lost on a reboot. This has made me hesitant to use it more, e.g., as suggested in #1199.
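The difference between a RAM-based queue and a database-backed one can be sketched in plain Ruby. This is a toy model, not how ActiveJob or any backend is implemented: `InMemoryQueue` and `StoreBackedQueue` are hypothetical classes, and a plain Hash stands in for a database that outlives the process.

```ruby
# Jobs live only in this object's memory, like a RAM-based queue.
class InMemoryQueue
  def initialize
    @jobs = []
  end

  def enqueue(job)
    @jobs << job
  end

  def pending
    @jobs.dup
  end
end

# Jobs are written to an external store that outlives the process.
# `store` is a plain Hash standing in for a database table.
class StoreBackedQueue
  def initialize(store)
    @store = store
    @store[:jobs] ||= []
  end

  def enqueue(job)
    @store[:jobs] << job
  end

  def pending
    @store[:jobs].dup
  end
end

database = {} # hypothetical stand-in for PostgreSQL

ram_queue = InMemoryQueue.new
db_queue = StoreBackedQueue.new(database)
ram_queue.enqueue("purge badge 42")
db_queue.enqueue("purge badge 42")

# Simulate a reboot: the process's objects are gone, the database is not.
ram_queue = InMemoryQueue.new
db_queue = StoreBackedQueue.new(database)

puts ram_queue.pending.inspect # []
puts db_queue.pending.inspect  # ["purge badge 42"]
```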
ActiveJob supports lots of backends, but if we want it to be backed by a database, we must pick a backend. It's possible to switch backends, but it's a pain (you need to "flip" it), so it's better to pick well the first time.
ActiveJob has a few built-in adapters for backends. However, that's not a complete list. In fact, they no longer accept new adapters, because backends can provide their own. Another backend that looks promising is GoodJob.
Once we add a database-based backend for jobs, we can do more delayed email deliveries as suggested in #1199.
I will follow this up with some initial analysis of the backend options I've identified.