ruote tmp/log_2011-08-11.html

2011-08-11 00:47:35 utc lucas-howcast Hey guys, I have a Ruote question:
2011-08-11 00:48:06 utc lucas-howcast I'd like to send a reminder email to the participants 1 day before a certain task reaches its timeout threshold
2011-08-11 00:48:13 utc lucas-howcast I've noticed that Ruote uses RufusScheduler,
2011-08-11 00:48:32 utc lucas-howcast so I was wondering if there's already a way I can do that within ruote, without having to use the scheduler separately?
2011-08-11 00:48:39 utc lucas-howcast (let me know if it doesn't make sense)
2011-08-11 00:57:27 utc jmettraux hello lucas-howcast, let me try to come up with a gist
2011-08-11 00:57:42 utc lucas-howcast Hey jmettraux, that's awesome, thanks
2011-08-11 00:59:22 utc jmettraux in fact it's only using rufus-scheduler for parsing cron strings, it's not running a rufus-scheduler per se
2011-08-11 00:59:49 utc lucas-howcast oh I see
2011-08-11 01:02:57 utc jmettraux that's a [rough] variant :
2011-08-11 01:03:15 utc jmettraux ouch
2011-08-11 01:03:37 utc jmettraux it's not good, let me update it
2011-08-11 01:06:16 utc jmettraux this is better :
2011-08-11 01:06:40 utc jmettraux taken from
2011-08-11 01:07:20 utc lucas-howcast This looks great.. thanks a lot! it's definitely more Ruote-like than what I had in mind
2011-08-11 01:08:37 utc jmettraux I should probably cook something to make this pattern easier
2011-08-11 01:08:45 utc jmettraux it comes so often
2011-08-11 01:09:07 utc lucas-howcast We'll definitely help if you need a hand with that
2011-08-11 01:09:23 utc lucas-howcast maybe we could do a quick brainstorming session if you're up to it, whenever you want
2011-08-11 01:10:08 utc jmettraux I'm quite satu
2011-08-11 01:10:12 utc jmettraux rated these days
2011-08-11 01:10:19 utc lucas-howcast (by "We" I mean the devs at Howcast)
2011-08-11 01:10:53 utc jmettraux I was thinking something like participant 'lucas', :reminders => '1d,2d', :timeout => '3d'
2011-08-11 01:11:26 utc jmettraux but most of the time, people want the reminders to be different from the actual task emission
2011-08-11 01:11:42 utc jmettraux ACTION brainstorming with Lucas right now
2011-08-11 01:11:49 utc lucas-howcast hmm like for example?
2011-08-11 01:16:28 utc jmettraux the task is placed in some worklist, while the reminder is sent via an email
2011-08-11 01:19:49 utc jmettraux maybe something like that:
2011-08-11 01:21:21 utc lucas-howcast yeah that makes sense, I was thinking something like a callback system in general.. like participant 'foo', :callback1 => "1d", :callback2 => "2d"
2011-08-11 01:22:07 utc jmettraux what would get called back ?
2011-08-11 01:23:01 utc lucas-howcast :callback1 could be 'first_reminder' and you would define it like on your gist
2011-08-11 01:23:21 utc lucas-howcast so only difference would be
2011-08-11 01:23:35 utc lucas-howcast splitting this: '4d:first_reminder, 9d:last_reminder'
2011-08-11 01:23:41 utc lucas-howcast into 2 different "callbacks"
2011-08-11 01:24:39 utc jmettraux participant 'foo', :first_reminder => '1d', :second_reminder => '2d'
2011-08-11 01:25:34 utc lucas-howcast right
2011-08-11 01:28:15 utc jmettraux it would work nicely
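
The two notations discussed above carry the same information. A minimal plain-Ruby sketch of that equivalence follows; `parse_reminders` and `reminders_from_options` are hypothetical helper names, not part of ruote, and the `'delay:name'` format is taken only from the string quoted in the conversation.

```ruby
# Hypothetical helpers, not ruote API: both return [delay, name] pairs.

# Combined form: '4d:first_reminder, 9d:last_reminder'
def parse_reminders(spec)
  spec.split(',').map do |entry|
    delay, name = entry.strip.split(':', 2)
    if delay.to_s.empty? || name.to_s.empty?
      raise ArgumentError, "bad reminder entry: #{entry.inspect}"
    end
    [delay, name]
  end
end

# Split form: one option per reminder, e.g. 'first_reminder' => '4d'
def reminders_from_options(options)
  options.map { |name, delay| [delay, name] }
end

combined = parse_reminders('4d:first_reminder, 9d:last_reminder')
split = reminders_from_options('first_reminder' => '4d', 'last_reminder' => '9d')
puts combined == split
```
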
2011-08-11 01:29:00 utc lucas-howcast I'll mention this to the guys and see if we come up with something to implement it and send a pull request or something
2011-08-11 01:30:35 utc jmettraux ok, it shouldn't be too hard to implement, if I don't have news from you, I'll try to implement it during the week-end
2011-08-11 01:31:03 utc lucas-howcast cool, sure
2011-08-11 01:31:08 utc jmettraux but I'm still not sure about the 'reminder' => '1d' vs 'reminders' => '1d, 2d'
2011-08-11 01:32:30 utc lucas-howcast yeah I think both options would be OK, imo. I'll see what the other guys think, and let you know before we start doing any work on this
2011-08-11 01:35:13 utc jmettraux ok
2011-08-11 01:35:35 utc lucas-howcast Thanks again for the tips! have a good night!
2011-08-11 01:39:27 utc jmettraux 'reminder' => '1d' is nice, but the tricky thing is that we have to recognize that '1d' or whatever value we receive is a time string of some sort
2011-08-11 01:40:16 utc lucas-howcast that happens with timeout already, right?
2011-08-11 01:40:37 utc jmettraux yes, but the key is "timeout" so you know what to expect
2011-08-11 01:40:58 utc lucas-howcast ah, right
2011-08-11 01:41:30 utc jmettraux here it's "let's grab any key, value pair, where the value is a time string", it's nice but it means the participant expression will schedule a reminder for anything that has a time value
2011-08-11 01:41:52 utc jmettraux delivery_date => '2011/08/30'
2011-08-11 01:42:24 utc jmettraux is not in the "Xd" format, but...
2011-08-11 01:45:36 utc lucas-howcast yeah that makes sense
2011-08-11 01:45:40 utc jmettraux if I implement something about those reminders, I'll post about it on the mailing list
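
The concern about "any key, value pair where the value is a time string" can be illustrated in plain Ruby. This is only a sketch: the accepted units, the regular expression, and the helper names (`duration_string?`, `to_seconds`, `guess_reminders`) are assumptions made for illustration, not ruote's or rufus-scheduler's actual parser.

```ruby
# Assumed unit table for this sketch only.
UNIT_SECONDS = { 's' => 1, 'm' => 60, 'h' => 3600, 'd' => 86_400, 'w' => 604_800 }.freeze
DURATION = /\A(?:\d+[smhdw])+\z/

# True for strings such as '1d' or '1d12h', false for anything else.
def duration_string?(value)
  value.is_a?(String) && DURATION.match?(value)
end

# Convert a duration string to seconds.
def to_seconds(duration)
  raise ArgumentError, "not a duration: #{duration.inspect}" unless duration_string?(duration)
  duration.scan(/(\d+)([smhdw])/).sum { |count, unit| count.to_i * UNIT_SECONDS[unit] }
end

# Naive rule: every option (except 'timeout') whose value looks like a duration
# is treated as a reminder.
def guess_reminders(options)
  options.select { |key, value| key != 'timeout' && duration_string?(value) }
end

options = {
  'timeout' => '3d',
  'first_reminder' => '1d',
  'delivery_date' => '2011/08/30',  # a date, not an "Xd" duration: ignored
  'grace' => '2h'                   # not meant as a reminder, but picked up anyway
}
p guess_reminders(options)
```

With these assumptions, "one day before the timeout" is `to_seconds('3d') - to_seconds('1d')` seconds after the task is emitted. The `'grace'` entry shows the drawback raised in the conversation: the naive rule schedules a reminder for any option that happens to hold a duration.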
2011-08-11 09:56:44 utc _Taz_ Hi
2011-08-11 09:57:54 utc jmettraux hello
2011-08-11 09:58:13 utc _Taz_ I have a question about using ruote in a distributed way
2011-08-11 10:00:05 utc jmettraux please fire
2011-08-11 10:01:38 utc _Taz_ Let's say I have a "master" server and many "slaves". On the master, my process just asks the slaves to launch and send back data. Then the "master" gathers the data and prints it on my screen
2011-08-11 10:02:07 utc jmettraux map/reduce
2011-08-11 10:03:16 utc _Taz_ This : ?
2011-08-11 10:03:54 utc jmettraux yes, sorry, I couldn't prevent myself from commenting
2011-08-11 10:05:33 utc _Taz_ What kind of ruote architecture should I use? Is it possible to declare engine/worker on slaves server and tell the master ruby script to launch them?
2011-08-11 10:06:23 utc _Taz_ (To be honest, I haven't got a clue about how distributed ruote works. I think it has something to do with EngineParticipant and use of multiple workers, but... I'm totally lost)
2011-08-11 10:12:57 utc jmettraux why don't you use the basic worker architecture ?
2011-08-11 10:13:11 utc jmettraux the master is the workflow engine and the slaves are participants ?
2011-08-11 10:13:46 utc jmettraux then you choose a multi-worker-friendly storage and set up multiple workers that share this storage
2011-08-11 10:14:22 utc jmettraux workers will poll the storage for work, first come, first served
2011-08-11 10:14:37 utc jmettraux concurrence { a; b; c }
2011-08-11 10:15:25 utc jmettraux if you have four workers, one possible outcome is that worker 0 processes the concurrence which triggers 3 participants and those are handled by worker 1, 2 and 3
2011-08-11 10:15:51 utc jmettraux respectively
2011-08-11 10:16:42 utc jmettraux granted, that's not the same thing as choosing which slave does the work explicitly, but the work gets done
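
The "first come, first served" behaviour described above can be imitated with Ruby's standard library alone. The sketch below is not ruote code: a `Queue` stands in for the shared storage, four threads stand in for the workers, and the task names `a`, `b`, `c` mirror the concurrence example. Which worker ends up with which task is deliberately left undetermined.

```ruby
require 'thread'

# Stand-in for the shared storage: three pending tasks.
storage = Queue.new
%w[a b c].each { |task| storage << task }

# Collects [task, worker] pairs as they are handled.
handled = Queue.new

workers = 4.times.map do |i|
  Thread.new do
    loop do
      task = begin
        storage.pop(true)  # non-blocking pop, raises ThreadError when empty
      rescue ThreadError
        break              # nothing left to do, this worker stops polling
      end
      handled << [task, "worker #{i}"]
    end
  end
end
workers.each(&:join)

results = Array.new(handled.size) { handled.pop }
results.each { |task, worker| puts "#{task} handled by #{worker}" }
```

Every task is handled exactly once, but the assignment of tasks to workers can differ from run to run, which is the trade-off mentioned in the conversation.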
2011-08-11 10:22:12 utc _Taz_ Lunch time here. I'll write a script and gist it here to see if I'm correct.
2011-08-11 10:25:23 utc jmettraux there are also people who farm the work between ruote and the real participant via a message queue, like rabbitmq or resque... many possibilities
2011-08-11 10:25:45 utc jmettraux and then of course, there is the engine participant if you really want to specify a flow to run in a given, other engine
2011-08-11 12:40:30 utc _Taz_
2011-08-11 13:22:53 utc jmettraux _Taz_: do you want to share a storage between two engines ?
2011-08-11 13:23:20 utc jmettraux that doesn't work
2011-08-11 13:23:43 utc jmettraux same storage implies it's the same engine
2011-08-11 13:26:17 utc _Taz_ Our main problem is that we could ask ruote to launch hundreds of threads in our experiments on the computing grid (one for each node, maybe more)
2011-08-11 13:26:40 utc jmettraux ruote doesn't launch threads
2011-08-11 13:27:39 utc jmettraux so your idea is one ruote engine per node ?
2011-08-11 13:28:49 utc _Taz_ Grid5000 architecture is : 9 sites, 1 frontend per site (where you can reserve nodes)
2011-08-11 13:29:20 utc _Taz_ Our idea is to distribute a ruote job to each frontend
2011-08-11 13:29:32 utc jmettraux what is a job ?
2011-08-11 13:29:34 utc _Taz_ (My sentence could be weird or even non-english)
2011-08-11 13:31:11 utc jmettraux is a job a ruote process ? or is it the work of one participant ?
2011-08-11 13:31:21 utc jmettraux what is an example of a job ?
2011-08-11 13:31:24 utc _Taz_ One ruote process per frontend, each one sends SSH commands to nodes and gathers the data (like NFS server, IP address, etc.)
2011-08-11 13:31:53 utc _Taz_ One frontend is the "master"; it gathers all the data from every frontend and then prints it on the user's screen
2011-08-11 13:33:23 utc jmettraux how often does the main process run ?
2011-08-11 13:34:52 utc _Taz_ once
2011-08-11 13:35:04 utc jmettraux isn't ruote a bit overkill for that ?
2011-08-11 13:35:19 utc jmettraux you could run a small webservice on each frontend to collect the data
2011-08-11 13:35:48 utc jmettraux and have the master frontend query each sub-frontend for the data and present the cumulated results
2011-08-11 13:35:51 utc _Taz_ I gave an example of experiment, the basic one I had to implement
2011-08-11 13:36:31 utc _Taz_ Our idea is to use the workflow formalism to easily create experiments on the platform
2011-08-11 13:38:06 utc _Taz_ And thus use Ruote as it deals with concurrence and all that kind of things which are really annoying to code with a "dirty" bash script
2011-08-11 13:43:59 utc jmettraux _Taz_:
2011-08-11 13:44:13 utc jmettraux variant a, single worker, not really distributed
2011-08-11 13:45:54 utc _Taz_ That's how my ruote script is written, more or less
2011-08-11 13:47:17 utc jmettraux
2011-08-11 13:47:25 utc jmettraux variant b, 1 redis storage
2011-08-11 13:47:32 utc jmettraux shared by multiple workers
2011-08-11 13:47:41 utc jmettraux distributed
2011-08-11 13:48:06 utc jmettraux but you won't decide which worker gets which workitem/task
2011-08-11 13:49:57 utc jmettraux variant c, you set up 1 redis and 1 ruote per node
2011-08-11 13:50:20 utc jmettraux then you register as EngineParticipant each node in each engine
2011-08-11 13:51:16 utc jmettraux and then you write a process that forks the work to other engines
2011-08-11 13:53:19 utc jmettraux
2011-08-11 13:53:59 utc jmettraux in those two examples, the process is defined completely in the main engine, but when calling a subprocess you specify on which engine it should run
2011-08-11 13:54:12 utc jmettraux that's it
2011-08-11 13:56:16 utc _Taz_ Where is the other engine declared? What if it's declared on another computer? How do you tell the master engine where it is?
2011-08-11 13:59:04 utc jmettraux 22:50 jmettraux: then you register as EngineParticipant each node in each engine
2011-08-11 13:59:24 utc jmettraux
2011-08-11 14:02:41 utc _Taz_ Thank you for your explanations, I will try to code that correctly
2011-08-11 14:05:18 utc jmettraux ok, you gave me an idea, maybe I can come up with something to cut the need to register all those engines manually
2011-08-11 14:26:26 utc _Taz_ I think it will be simpler to explain what we want to do when lucas is back
2011-08-11 14:27:44 utc jmettraux ok, don't hesitate to wrap it all in a post to the mailing list, I hope to be away in 30 minutes
2011-08-11 14:31:21 utc _Taz_ He won't be back before Tuesday (Monday is a day off in France) anyway
2011-08-11 14:31:33 utc troy42 morning
2011-08-11 14:31:41 utc troy42 or whatever =)
2011-08-11 14:31:43 utc jmettraux troy42: morning
2011-08-11 15:05:15 utc jmettraux gentlemen, have a nice day/evening !