ruote tmp/log_2012-11-06.html

2012-11-06 03:03:14 utc mburnett so i have a situation where i have N concurrent jobs that get submitted to a remote process via AMQP, then i need to wait a long time and then resume the process once those concurrent jobs have all finished
2012-11-06 03:03:34 utc mburnett what's a good/the right way to approach that?
2012-11-06 03:53:04 utc mburnett nevermind, i was being foolish about how receivers worked
2012-11-06 16:49:25 utc mburnett is there a typical way of reporting an error on a workitem received via AMQP? I see a thread from mid-late 2011 in the mailing list, but I'm having a hard time understanding how to apply that to my case.
2012-11-06 20:04:33 utc mburnett ah, it seems that Ruote::Amqp::Receiver flunk has a different interface from Ruote::Receiver flunk
2012-11-06 21:02:31 utc jmettraux mburnett: hello, yes, #flunk is used to pass errors back from the receivers
2012-11-06 21:02:52 utc mburnett yeah, i was just passing it all the wrong stuff :)
2012-11-06 21:03:59 utc mburnett now i just need to get a curl-friendly interface up to have a complete tracer bullet
2012-11-06 22:58:37 utc mburnett how do i abort components of a process that depend on a failed component without killing everything?
2012-11-06 22:58:50 utc jmettraux what is a component?
2012-11-06 22:59:09 utc mburnett an Amqp::Receiver in this case
2012-11-06 22:59:40 utc jmettraux it's not a component of a process
2012-11-06 22:59:42 utc mburnett i know that the idea is that failed processes will be administered and error sections corrected
2012-11-06 23:00:02 utc mburnett maybe I should just put up a gist
2012-11-06 23:00:09 utc jmettraux I can tell you how to cancel parts of a workflow instance
2012-11-06 23:00:29 utc mburnett ok
2012-11-06 23:00:32 utc jmettraux do you need a way to unregister an Amqp::Receiver?
2012-11-06 23:00:43 utc jmettraux and make it unsubscribe?
2012-11-06 23:00:47 utc mburnett maybe i should just fill in some background
2012-11-06 23:00:54 utc mburnett and you can tell me how that's the wrong design :)
2012-11-06 23:01:10 utc jmettraux maybe an email to the mailing list would be more appropriate
2012-11-06 23:01:28 utc mburnett ok
2012-11-06 23:02:37 utc jmettraux breakfast here
2012-11-06 23:02:49 utc jmettraux I'm OK to help via IRC, but please remember I cannot read your mind
2012-11-06 23:13:54 utc mburnett well, here's the gist: https://gist.github.com/4028287
2012-11-06 23:14:00 utc mburnett if you like, i'll post more details to the mailing list
2012-11-06 23:15:39 utc jmettraux what is the question?
2012-11-06 23:16:23 utc mburnett so the question is basically "what's the right way to handle failed grid jobs"
2012-11-06 23:16:35 utc mburnett right now i'm doing flunk()
2012-11-06 23:16:37 utc jmettraux that's very deep
2012-11-06 23:16:47 utc mburnett ok, so then let's narrow the scope
2012-11-06 23:16:52 utc jmettraux flunk() will pass the error to ruote
2012-11-06 23:16:54 utc mburnett what's a reasonable way to handle failed grid jobs here
2012-11-06 23:17:20 utc jmettraux so flunk() is read, IMHO
2012-11-06 23:17:27 utc jmettraux so flunk() is right, IMHO
2012-11-06 23:17:42 utc mburnett right, that seems to work. i guess the behavior that most closely matches our existing infrastructure is that the process is marked as failed, but any non-dependent parts of the process are still run
2012-11-06 23:18:02 utc jmettraux that's the default behaviour
2012-11-06 23:18:10 utc mburnett ah
2012-11-06 23:18:44 utc jmettraux if you have two concurrent ruote branches and one ends up in an error, the other will go on
2012-11-06 23:18:45 utc mburnett so basically i just need to monitor the failures so that i can flag the whole process as failed?
2012-11-06 23:18:53 utc mburnett right
2012-11-06 23:19:02 utc mburnett i just need to notify users that this process has failed
2012-11-06 23:19:30 utc jmettraux ruote-wise, a branch of the process failed
2012-11-06 23:19:32 utc mburnett so your initial recommendation would basically be to just flunk and do nothing else inside ruote?
2012-11-06 23:19:43 utc jmettraux yes
2012-11-06 23:19:48 utc mburnett ok
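
The behaviour discussed above (one concurrent branch fails, the others go on, and the process as a whole is then flagged as failed for the users) can be sketched in plain Ruby. This is only an illustration under assumptions: it does not use ruote or AMQP, and `BranchResult`, `run_branches`, `process_failed?` and the branch names are all hypothetical stand-ins.

```ruby
# Hypothetical model, not ruote: each branch is a callable and its outcome
# is recorded on its own, so one failure does not stop the other branches.
BranchResult = Struct.new(:name, :status, :error)

def run_branches(branches)
  branches.map do |name, job|
    begin
      job.call
      BranchResult.new(name, :completed, nil)
    rescue StandardError => e
      # Record the error against this branch only, then move on to the next.
      BranchResult.new(name, :failed, e.message)
    end
  end
end

# The process counts as failed as soon as any single branch failed.
def process_failed?(results)
  results.any? { |result| result.status == :failed }
end

results = run_branches({
  'align'  => -> { :ok },
  'index'  => -> { raise 'grid job exited with status 1' },
  'report' => -> { :ok }
})

results.each do |result|
  detail = result.error ? " (#{result.error})" : ''
  puts "#{result.name}: #{result.status}#{detail}"
end
puts "process failed: #{process_failed?(results)}"
```

In a real deployment the per-branch error would presumably come from flunk() on the receiver side, and something like the historian service mentioned just below would do the flagging.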
2012-11-06 23:20:09 utc mburnett i plan to set up a historian service listening to messages on amqp for stuff like this
2012-11-06 23:20:19 utc mburnett to create entries in our existing tracking system
2012-11-06 23:20:45 utc jmettraux as long as everything goes through AMQP, it's great
2012-11-06 23:31:52 utc jmettraux you're building an AMQP powered interface in front of your grid
2012-11-06 23:32:06 utc jmettraux your clients are ruote or whatever talks AMQP
2012-11-06 23:32:49 utc jmettraux services and orchestration of services
2012-11-06 23:38:58 utc mburnett that's right
2012-11-06 23:41:15 utc mburnett i really like this architecture
2012-11-06 23:42:15 utc mburnett is there a way to query ruote about whether a process has any possible ways to proceed without intervention? i.e. has every branch not blocked by an error completed?
2012-11-06 23:43:55 utc mburnett is leaves() the best attempt?
2012-11-06 23:44:39 utc mburnett and then check each one for error state
2012-11-07 00:05:20 utc jmettraux yes, leaves could help
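
The check described in the last few messages (take the leaves of a process, then look at each one for an error) might look roughly like the sketch below. It is a plain Ruby model, not ruote's API: the `Leaf` struct, its `error` field and the `needs_intervention?` helper are hypothetical, and in practice the leaves would have to come from ruote's process status.

```ruby
# Hypothetical model of a process leaf: a name plus an error message,
# where a nil error means the leaf is still running or waiting for a reply.
Leaf = Struct.new(:name, :error)

# A process cannot proceed on its own when it still has leaves and every
# one of them is blocked by an error.
def needs_intervention?(leaves)
  !leaves.empty? && leaves.all? { |leaf| !leaf.error.nil? }
end

still_running = [
  Leaf.new('grid_job_a', 'exit status 1'),
  Leaf.new('grid_job_b', nil)
]
all_blocked = [
  Leaf.new('grid_job_a', 'exit status 1'),
  Leaf.new('grid_job_b', 'timeout')
]

puts needs_intervention?(still_running)
puts needs_intervention?(all_blocked)
```

The empty case is treated as "no intervention needed" here, on the assumption that a process with no leaves left has simply finished.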