| 2012-10-31 18:36:35 utc | mburnett | since engines are basically just sets of workers, i imagine someone has implemented a system where there are multiple engines connected to a backend (maybe rabbitmq or redis) that handle multiple workflows in a way that allows a particular engine to go down without disrupting service |
| 2012-10-31 18:36:45 utc | mburnett | any leads on where to find examples of that approach? |
| 2012-10-31 21:06:03 utc | jmettraux | mburnett: well, that's how it works, you just start and stop workers |
| 2012-10-31 21:06:24 utc | jmettraux | if they share the same storage, then they're part of the same engine |
| 2012-10-31 21:07:20 utc | jmettraux | depending on the storage (http://ruote.rubyforge.org/configuration.html#storage) the multiple/remote workers approach is OK or not |
| 2012-10-31 21:08:34 utc | jmettraux | there is nothing special |
| 2012-10-31 21:13:34 utc | jmettraux | (the usual note: if there is no answer here, there is the mailing list at http://groups.google.com/group/openwferu-users) |
| 2012-10-31 21:32:22 utc | mburnett | awesome, it's going to be easier than i realized to do this! |
| 2012-10-31 22:21:17 utc | mburnett | are process definitions persisted in the storage, or just current state? |
| 2012-10-31 22:25:38 utc | jmettraux | no, process definitions are not persisted, you just pass them at launch time |
| 2012-10-31 22:26:12 utc | mburnett | ah ic |
| 2012-10-31 22:26:43 utc | mburnett | has anyone tried to persist process definitions until they are complete? |
| 2012-10-31 22:27:01 utc | mburnett | i'm trying to construct a system where an engine/worker can go down and its work can be finished by someone else |
| 2012-10-31 22:28:51 utc | jmettraux | well, that's how the system works |
| 2012-10-31 22:29:10 utc | mburnett | i must just be testing it wrong |
| 2012-10-31 22:29:31 utc | jmettraux | workers don't read process definitions from some kind of library |
| 2012-10-31 22:29:52 utc | jmettraux | each expression that composes a process has its own piece of the definition tree |
| 2012-10-31 22:30:11 utc | jmettraux | there is no "execution state" + "definition tree" |
| 2012-10-31 22:30:25 utc | jmettraux | the definition tree is embedded in the execution state |
| 2012-10-31 22:33:22 utc | mburnett | so in the case where there are two workers in separate processes and one launches a process and is subsequently killed, the rest of the process should be executed by the second worker? |
| 2012-10-31 22:33:57 utc | mburnett | (assuming a redis backend) |
| 2012-10-31 22:33:58 utc | jmettraux | provided that they share the same storage and that storage is OK with multiple workers, then yes |
| 2012-10-31 22:34:26 utc | mburnett | ok, then my test is doing something wrong (the best case :) |
| 2012-10-31 22:35:08 utc | jmettraux | in the extreme, you have a dashboard (no worker) that just places launch (and cancel, pause, etc.) orders in the storage, and workers in their own processes doing the execution |
| 2012-10-31 22:35:22 utc | mburnett | that's actually perfect |
| 2012-10-31 22:36:07 utc | mburnett | i'm trying to replace a home-grown workflow solution at a cancer research lab, and to sell it over pegasus i think it's going to need to be chaos-monkey friendly |
| 2012-10-31 22:36:29 utc | jmettraux | :-) |
| 2012-10-31 22:36:37 utc | mburnett | i really like what i understand of ruote so far, it's really nice :) |
| 2012-10-31 22:39:22 utc | jmettraux | thanks, it has its dark corners and issues, trying to overcome them one at a time, provided there is usable feedback |
| 2012-10-31 22:41:47 utc | mburnett | we have 5 teams of 2 people each investigating different solutions, and my partner and i are really excited about ruote |
| 2012-10-31 22:42:09 utc | mburnett | i'm sure the other solutions have dark corners too |
| 2012-10-31 22:42:18 utc | mburnett | and i'm sure they're all much better than our home-grown version |
| 2012-10-31 22:45:15 utc | mburnett | our typical use case is to have the workflow engine submit jobs to, say, grid engine and wait for them to complete |
| 2012-10-31 22:45:36 utc | mburnett | do you have any particular recommendations surrounding that? |
| 2012-10-31 22:46:19 utc | jmettraux | sorry, no experience with grid technology |
| 2012-10-31 22:46:25 utc | mburnett | ok no worries |
| 2012-10-31 22:47:17 utc | jmettraux | as long as tests are in place and there is a staging environment to hammer |
| 2012-10-31 22:48:33 utc | mburnett | luckily, we have a half-decent staging environment for this |
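
The point made in the log, that the definition tree travels inside the execution state held in a shared storage, can be sketched with a toy model. This is not ruote's API: `ToyStorage`, `ToyDashboard`, `ToyWorker` and the step names are hypothetical, and the storage is an in-memory array standing in for something like Redis. It only illustrates why a second worker can finish a process that a first worker started.

```ruby
# Toy model only, NOT ruote's API: every class and step name is hypothetical.

# Stands in for a storage shared by all workers (Redis, a database, ...).
class ToyStorage
  def initialize
    @msgs = []
  end

  # Queue a message; it carries its own slice of the definition.
  def put(msg)
    @msgs.push(msg)
  end

  # Hand out the next pending message, or nil when there is none.
  def take
    @msgs.shift
  end

  def empty?
    @msgs.empty?
  end
end

# Places launch orders in the storage and executes nothing itself.
class ToyDashboard
  def initialize(storage)
    @storage = storage
  end

  def launch(steps)
    @storage.put('remaining' => steps, 'done' => [])
  end
end

# Runs one step per call, then writes what is left back to the storage.
class ToyWorker
  def initialize(name, storage)
    @name = name
    @storage = storage
  end

  def step
    msg = @storage.take
    return nil unless msg

    current, *rest = msg['remaining']
    done = msg['done'] + ["#{current} by #{@name}"]
    @storage.put('remaining' => rest, 'done' => done) unless rest.empty?
    done
  end
end

storage = ToyStorage.new
ToyDashboard.new(storage).launch(%w[step1 step2 step3])

worker_a = ToyWorker.new('a', storage)
worker_b = ToyWorker.new('b', storage)

worker_a.step   # worker a runs the first step...
worker_a = nil  # ...and is then "killed"

# Worker b needs nothing from worker a: the rest of the definition
# is already sitting in the shared storage.
result = nil
result = worker_b.step until storage.empty?

p result # prints ["step1 by a", "step2 by b", "step3 by b"]
```

With real ruote, the same roles would be played by a dashboard and workers pointed at one storage, provided that storage supports multiple workers.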