| 2012-04-10 17:26:01 utc | myron | I'm interested in trying out ruote for the first time. I noticed the newest release on rubygems.org is over a year old (2.2 from 2/28/2011) but github has lots of commits since then |
| 2012-04-10 17:26:16 utc | myron | should I stick with the release from last february? or try the latest on github? |
| 2012-04-10 18:46:01 utc | myron | anyone here? |
| 2012-04-10 18:55:54 utc | Mugatu | myron: I've been using the latest from github |
| 2012-04-10 18:56:00 utc | Mugatu | Has worked fine for me so far |
| 2012-04-10 18:56:04 utc | myron | good to hear |
| 2012-04-10 18:56:14 utc | myron | has it been tested on 1.9.3? |
| 2012-04-10 18:56:16 utc | Mugatu | It seems that much of the documentation is already switching to use the newer conventions |
| 2012-04-10 18:56:26 utc | myron | newer conventions? |
| 2012-04-10 18:56:48 utc | Mugatu | There appears to be some API transition occurring between the last release and what is in github |
| 2012-04-10 18:56:56 utc | Mugatu | the older APIs still work fine though |
| 2012-04-10 18:57:01 utc | Mugatu | participant methods for instance are changing |
| 2012-04-10 18:57:15 utc | Mugatu | It's referenced in the participant documentation |
| 2012-04-10 18:57:35 utc | myron | ok |
| 2012-04-10 18:57:36 utc | Mugatu | As far as 1.9.3 goes, the best I can tell you is that I've been using it on 1.9.3, no issues |
| 2012-04-10 18:57:42 utc | myron | I'm a total noob here but I'll take a look |
| 2012-04-10 18:57:46 utc | Mugatu | (I'm just a casual user, not expert by any means) |
| 2012-04-10 18:59:05 utc | myron | I'm trying to figure out if ruote will be a good fit for my project |
| 2012-04-10 19:06:58 utc | myron | @Mugatu -- do you have some recommended resources to help me get started with ruote? |
| 2012-04-10 19:07:14 utc | myron | It's a lot to take in initially and I'm not sure of the best way to start |
| 2012-04-10 22:09:30 utc | myron | jmettraux -- you around? |
| 2012-04-10 22:09:48 utc | jmettraux | myron: hello Myron, welcome to #ruote |
| 2012-04-10 22:09:53 utc | myron | thanks |
| 2012-04-10 22:10:16 utc | jmettraux | you're the author of VCR among other gems? |
| 2012-04-10 22:10:19 utc | myron | yep |
| 2012-04-10 22:10:29 utc | jmettraux | excellent, thanks for that |
| 2012-04-10 22:10:42 utc | myron | cool, I'm always glad to hear others find it useful :) |
| 2012-04-10 22:11:01 utc | myron | anyhow, I'm trying to evaluate ruote for a project I'm working on |
| 2012-04-10 22:11:05 utc | myron | got a second to answer a few questions? |
| 2012-04-10 22:11:19 utc | jmettraux | Mugatu: thanks for helping Myron |
| 2012-04-10 22:11:24 utc | jmettraux | yes |
| 2012-04-10 22:11:29 utc | myron | cool |
| 2012-04-10 22:11:38 utc | myron | let me say a bit about what we're trying to build.... |
| 2012-04-10 22:12:24 utc | myron | we collect lots of data for our users on a weekly schedule. the data is collected by backend services that use some large scalable datastore underneath (e.g. riak or cassandra) |
| 2012-04-10 22:12:51 utc | myron | now we're trying to build a middle-tier aggregating service that builds a weekly index on combined views of the data so we can serve it to our users in interesting ways |
| 2012-04-10 22:14:03 utc | myron | we're using MySQL for the middle tier and sharding the data on a per-user basis (since it's updated weekly and is just a cache of the canonical data), and I'm working on building a processing pipeline that will build a new shard anytime a backend has new data for the user |
| 2012-04-10 22:14:05 utc | myron | does that make sense? |
| 2012-04-10 22:14:16 utc | jmettraux | yes |
| 2012-04-10 22:14:55 utc | myron | so...I want to be able to define a bunch of independent steps, each of which has zero or more dependencies on backends or other previous steps |
| 2012-04-10 22:15:31 utc | myron | e.g. some steps may use data written to the DB in a previous step, and combine it with data from another previous step |
| 2012-04-10 22:15:57 utc | myron | I was going to start working on a little gem to help us define these steps when I found ruote, and it seems to support most of the sorts of things I was thinking of building |
| 2012-04-10 22:16:03 utc | myron | does this sound like a good use-case for ruote? |
| 2012-04-10 22:18:40 utc | jmettraux | the "dependency on previous step" thing makes me think you need something more like a rule system |
| 2012-04-10 22:19:00 utc | myron | right, it was one thing I wasn't quite sure how to achieve with ruote |
| 2012-04-10 22:19:09 utc | jmettraux | but the orchestration of the steps part sure is ruotesque |
| 2012-04-10 22:20:21 utc | jmettraux | I have the impression I have seen libraries on github that address some of the aspects of your use case, but I cannot remember their names |
| 2012-04-10 22:20:49 utc | myron | are they libraries that hook into ruote? Or standalone libs? |
| 2012-04-10 22:21:52 utc | jmettraux | more like frameworks |
| 2012-04-10 22:22:01 utc | jmettraux | nothing for ruote |
| 2012-04-10 22:22:06 utc | myron | gotcha |
| 2012-04-10 22:22:13 utc | jmettraux | (I'd have remembered) |
| 2012-04-10 22:22:54 utc | myron | so let's say I have these processing steps... |
| 2012-04-10 22:22:58 utc | myron | 1) fetch_social_data |
| 2012-04-10 22:23:01 utc | myron | 2) fetch_traffic_data |
| 2012-04-10 22:23:08 utc | myron | 3) aggregate_traffic_and_social |
| 2012-04-10 22:23:29 utc | myron | fetch_social_data can be run as soon as the social-data backend has new data |
| 2012-04-10 22:23:41 utc | myron | fetch_traffic_data can be run as soon as the traffic data backend has new data |
| 2012-04-10 22:24:06 utc | myron | aggregate_traffic_and_social should be run as soon as #1 and #2 are both done. |
| 2012-04-10 22:24:15 utc | myron | but the backends may update at different times of day, hours apart |
| 2012-04-10 22:24:23 utc | myron | is this doable with ruote? |
| 2012-04-10 22:25:06 utc | jmettraux | yes, but I'm not sure it'd behave as you'd wish it to |
| 2012-04-10 22:25:29 utc | jmettraux | Ruote.define { concurrence { fetch_social; fetch_traffic }; aggregate } |
| 2012-04-10 22:25:29 utc | myron | how so? |
| 2012-04-10 22:25:56 utc | myron | right, I've been playing with the concurrence stuff a bit |
| 2012-04-10 22:26:02 utc | jmettraux | this process would fetch social and traffic in parallel, when both are done, it would aggregate |
| 2012-04-10 22:26:19 utc | jmettraux | but maybe you want the fetch_social and fetch_traffic to run all the time |
| 2012-04-10 22:26:33 utc | jmettraux | and they'd emit to a queue of work for aggregate |
| 2012-04-10 22:26:55 utc | jmettraux | and then aggregate would decide on its own if it has enough data for an aggregation |
| 2012-04-10 22:27:25 utc | jmettraux | not sure if it's your use case, but it decouples well like this |
| 2012-04-10 22:27:39 utc | jmettraux | ruote is more like do this, this and then that |
| 2012-04-10 22:27:45 utc | myron | hmmm...I'll have to play with it, I think |
| 2012-04-10 22:28:02 utc | myron | so maybe not a good fit if my processing pipeline is primarily a dependency graph? |
| 2012-04-10 22:28:34 utc | jmettraux | if it's a "pipeline" then maybe setting up queues and consumers would serve you better |
| 2012-04-10 22:28:58 utc | jmettraux | if it's more punctual, like "fetch the data for today, then aggregate", ruote is OK |
| 2012-04-10 22:29:53 utc | myron | it kinda wants to be the latter...but it also needs to be tolerant of problems with one of the backends |
| 2012-04-10 22:29:56 utc | myron | tricky to find the right balance :( |
| 2012-04-10 22:30:06 utc | jmettraux | +1 |
| 2012-04-10 22:30:25 utc | myron | thanks for the advice, though, it's helpful |
| 2012-04-10 22:30:35 utc | jmettraux | you're welcome |
| 2012-04-10 22:31:08 utc | myron | does ruote provide any guarantees of not dropping messages or workitems on the floor? |
| 2012-04-10 22:31:37 utc | jmettraux | it tries hard not to, but in the end, it depends on the storage implementation you're using |
| 2012-04-10 22:31:56 utc | myron | right, we're likely to use redis and that's of course primarily an in-memory datastore |
| 2012-04-10 22:33:04 utc | myron | if a ruote worker is killed (or our datacenter has a power-outage...yes it's happened...), will it pick up exactly where it left off when we restart as long as the storage implementation didn't drop anything? |
| 2012-04-10 22:33:08 utc | jmettraux | I've seen it used by people on the mailing list, we had an issue with dropped messages, but it was my fault, seems to work like a charm |
| 2012-04-10 22:33:18 utc | myron | good to know |
| 2012-04-10 22:33:19 utc | jmettraux | it tries hard to |
| 2012-04-10 22:33:26 utc | myron | right, these are hard problems |
| 2012-04-10 22:33:36 utc | jmettraux | let me find a message about that |
| 2012-04-10 22:33:56 utc | jmettraux | https://groups.google.com/d/topic/openwferu-users/dPzoWeKZStw/discussion |
| 2012-04-10 22:34:25 utc | jmettraux | processes can get "stalled" in those cases |
| 2012-04-10 22:34:41 utc | jmettraux | the email thread explains one way of recovering |
| 2012-04-10 22:34:46 utc | myron | right, reading now :) |
| 2012-04-10 22:36:19 utc | myron | the last gem release is over a year ago--is the recommendation now just to use what's on github? |
| 2012-04-10 22:38:35 utc | jmettraux | please use what's on github, I have to release soon, but day job is interfering (though I use ruote there) |
| 2012-04-10 22:38:42 utc | myron | cool, will do |
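
The decoupled alternative jmettraux describes (fetchers emit to a queue, and the aggregate step decides on its own whether it has enough data) can be sketched in plain Ruby. This is only an illustration under stated assumptions: it does not use the ruote API, the names `Aggregator`, `notify`, `run_pipeline`, and `REQUIRED_SOURCES` are hypothetical, and the in-memory `Queue` stands in for whatever durable queue (redis or otherwise) a real deployment would need.

```ruby
require "set"
require "thread"

# Hypothetical names throughout; plain Ruby, not the ruote API.
# The sources that must all report before a user can be aggregated.
REQUIRED_SOURCES = Set[:social, :traffic].freeze

class Aggregator
  attr_reader :aggregated

  def initialize(required = REQUIRED_SOURCES)
    @required = required
    # user_id => set of sources that have reported so far
    @seen = Hash.new { |hash, user_id| hash[user_id] = Set.new }
    @aggregated = []
  end

  # Record that one backend finished for a user. Aggregation happens only
  # once every required source has reported, however far apart they arrive.
  # Returns true when this notification triggered an aggregation.
  def notify(user_id, source)
    @seen[user_id] << source
    return false unless @required.subset?(@seen[user_id])

    @seen.delete(user_id)    # reset for the next weekly run
    @aggregated << user_id   # stand-in for the real aggregation work
    true
  end
end

# Feed [user_id, source] events through a queue to a single consumer
# thread and return the user ids that were aggregated, in order.
def run_pipeline(events)
  queue = Queue.new
  aggregator = Aggregator.new

  consumer = Thread.new do
    while (event = queue.pop) != :done
      aggregator.notify(*event)
    end
  end

  events.each { |event| queue << event }
  queue << :done
  consumer.join
  aggregator.aggregated
end

events = [[42, :social], [7, :traffic], [42, :traffic], [7, :social]]
p run_pipeline(events) # prints [42, 7]
```

Because the aggregator tracks what it has seen per user, a backend that is hours late (or down for a while) simply delays that user's aggregation instead of blocking the other fetch step. The sketch says nothing about surviving a worker crash; that still depends on the storage behind the queue, as discussed above.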