-
Although there is no logical error in your thinking (the task will indeed execute twice, once from each VM), there is a logical error in your setup: it is not "cloud-ready", or in other words, not a properly distributed setup. There are a few common ways to fix that, which you can think of as levels of complexity.

Level 1: a dedicated jobs VM. Run your background jobs on a separate VM that does nothing else. This gives you simplicity, isolation of concerns, and predictability about what runs where and why. The drawbacks are lower resource utilization and a single point of failure if you don't have a failover strategy.

Level 2: distribute tasks manually/algorithmically. Schedule the tasks on just one VM and disable scheduling on the other. This requires you to manage the scheduling yourself, but it guarantees that each task runs only on the chosen VM. It is fine for a couple of VMs, but it is not a neat way to proceed once you (inevitably) have many more VMs, especially if the selection is manual if/else logic.

Level 3: a distributed locking mechanism. Tools like Redis can be used to implement distributed locks. Before executing a task, each VM tries to acquire a lock; only one VM will succeed and run the task, while the other sees that the lock is already taken and skips it. This option is more complex, but it usually provides better resource utilization and, in some cases, redundancy.

Level 4: a combination of the above approaches.

About the right approach: you can think of complexity in levels, and the ultimate decision depends on your implementation details, familiarity, system complexity, and criticality. At the end of the day, it's all about tradeoffs. For a small app, a simple dedicated jobs VM (level 1) or manual job distribution (level 2) will definitely suffice. It all comes down to your app's complexity and criticality.
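For level 2 with the whenever gem, one simple pattern is to guard the schedule behind an environment variable that you set on only one VM, so `whenever --update-crontab` installs cron entries on that VM alone. A minimal sketch, where `RUN_SCHEDULED_JOBS` is a hypothetical variable name of my choosing (pick whatever fits your deploy setup):

```ruby
# Guard helper: only the VM where RUN_SCHEDULED_JOBS=1 is set
# should schedule and run the periodic tasks.
def scheduled_jobs_enabled?(env = ENV)
  env["RUN_SCHEDULED_JOBS"] == "1"
end

# In config/schedule.rb (the whenever DSL), you could then write:
#
#   if ENV["RUN_SCHEDULED_JOBS"] == "1"
#     every 1.day, at: "10:00 pm" do
#       rake "reindex"   # hypothetical rake task name
#     end
#   end
#
# so the crontab stays empty on the other web VM.

puts scheduled_jobs_enabled?({ "RUN_SCHEDULED_JOBS" => "1" })  # true
puts scheduled_jobs_enabled?({})                               # false
```

The same guard also works at task runtime (e.g. at the top of the rake task) if you prefer to keep the crontab identical on both VMs and bail out early instead.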
If there is some complexity involved, I would probably choose the locking approach for redundancy purposes, since you already have two VMs deployed. But if the app is small and/or simple, a dedicated jobs VM will most probably suffice. In fact, among others, we currently run a large e-commerce app with thousands of transactions each month that uses only two app VMs and one dedicated jobs VM. It very rarely breaks, and when it does, we simply spawn another one and/or fix it quickly. Although we sometimes have reasonable fears, things don't break as often or as easily as we fear, and a good response strategy suffices. Last but not least, for a large SaaS app, a combination of all three strategies, that is, dedicated jobs VMs with distributed locking (and possibly algorithmic task distribution) (level 4), can be a robust approach despite its greater upfront cost and technical complexity, since it guarantees you pretty much everything: isolation/separation of concerns, reliability, and scalability.
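To make the distributed-locking idea (level 3) concrete, here is a hedged sketch. It uses a tiny in-memory stand-in for Redis so it runs on its own; with real Redis you would use the redis-rb client, whose `redis.set(key, value, nx: true, ex: ttl)` call gives the same atomic "set only if absent" semantics (plus a TTL so a crashed VM can't hold the lock forever). The names here (`LockStore`, `with_job_lock`, the lock key) are illustrative, not from any specific library:

```ruby
# In-memory stand-in for the one Redis command we need:
# SET key value NX -- set the key only if it is absent.
# With redis-rb this would be: redis.set(key, token, nx: true, ex: ttl)
class LockStore
  def initialize
    @data = {}
  end

  # Returns true if the lock was acquired, false if already held.
  def set_nx(key, value)
    return false if @data.key?(key)
    @data[key] = value
    true
  end

  def del(key)
    @data.delete(key)
  end
end

# Run the block only if this process wins the lock; others skip.
def with_job_lock(store, key)
  return :skipped unless store.set_nx(key, "owner-#{Process.pid}")
  begin
    yield
    :ran
  ensure
    store.del(key)  # release so the next scheduled run can acquire it
  end
end

store = LockStore.new
# Simulate both VMs firing the same cron job at 10pm: the second
# acquire attempt fails because the first one still holds the lock.
first  = store.set_nx("jobs:reindex", "vm-1")
second = store.set_nx("jobs:reindex", "vm-2")
puts first   # true  -> this VM runs the reindex
puts second  # false -> the other VM skips it
```

In production you would also store a unique token per owner and only delete the key if the token still matches (otherwise a slow job could release a lock that has already expired and been re-acquired by another VM).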
-
Hello,
I have a load balancer (round robin) and 3 VMs (2 x web, 1 x mysql, elasticsearch and redis).
Both web VMs serve the identical domain.
I prefer to use the whenever gem to run some periodic tasks (cleanup, re-indexing, import ...).
For example, if I schedule a reindexing with whenever at 10pm, will the rake task be executed on both VMs, or is there an error in my thinking?
What is the preferred way to run the rake tasks only once?
Thanks for your tips and suggestions.
Thanx
Jochen