| 2012-10-29 20:42:52 utc | weeb1e | Hello everyone |
| 2012-10-29 20:43:30 utc | weeb1e | jmettraux: I have been spending a lot of time debugging memory leaks in my EventMachine applications |
| 2012-10-29 20:44:07 utc | weeb1e | It seems the last leak I found was due to rufus-scheduler keeping references to instances which call scheduler#in |
| 2012-10-29 20:44:20 utc | weeb1e | Do you have any advice about such issues? |
| 2012-10-29 20:47:51 utc | weeb1e | I have an Event class; an instance of it is created to process each event parsed from data logged to a UDP server |
| 2012-10-29 20:48:10 utc | weeb1e | For every event a new instance is created, but no references to the instance are created so that they can be GC'd |
| 2012-10-29 20:48:56 utc | weeb1e | If I call an instance method from the instance's #initialize method which contains the following, rufus-scheduler keeps a reference to the instance and it is never GC'd |
| 2012-10-29 20:49:00 utc | weeb1e | scheduler.unschedule_by_tag "activity-timeout-#@address" |
| 2012-10-29 20:49:00 utc | weeb1e | scheduler.in(12, tags: "activity-timeout-#@address") { puts "[#@server_name] Server is inactive..." } |
| 2012-10-29 20:49:24 utc | weeb1e | note: scheduler is an instance stored on the class level, shared between all instances of Event |
| 2012-10-29 20:50:23 utc | weeb1e | After a few hundred thousand Event instances and their contents are leaked, the whole application stalls because it has used up all available RAM |
| 2012-10-29 21:05:01 utc | weeb1e | I've found the same issue in most of the EventMachine applications, and it is a big problem. Please let me know what you suggest I do about it, I love rufus-scheduler, but I need to solve this soon, especially in my critical applications |
| 2012-10-29 21:11:54 utc | jmettraux | weeb1e: hello |
| 2012-10-29 21:12:21 utc | weeb1e | jmettraux: Hi :) |
| 2012-10-29 21:12:30 utc | jmettraux | digesting your report |
| 2012-10-29 21:12:43 utc | weeb1e | Ok, take your time |
| 2012-10-29 21:12:47 utc | weeb1e | I'll be around |
| 2012-10-29 21:13:05 utc | jmettraux | how does the scheduler keep a reference to your event instance? |
| 2012-10-29 21:13:18 utc | jmettraux | how does it get its hand on it? |
| 2012-10-29 21:13:19 utc | weeb1e | Procs can hold a reference to the binding |
| 2012-10-29 21:13:34 utc | jmettraux | ah right |
| 2012-10-29 21:13:55 utc | jmettraux | tried any workaround? |
| 2012-10-29 21:13:57 utc | weeb1e | As to why it's never freed by rufus, I'm not sure |
| 2012-10-29 21:14:22 utc | weeb1e | I worked around it in a single case by moving the schedule block to a class method and calling that from the instance |
| 2012-10-29 21:14:29 utc | weeb1e | But that is far from ideal in most cases |
| 2012-10-29 21:14:59 utc | weeb1e | I have some critical services with huge leaks caused by this issue, which I will need to solve very soon |
| 2012-10-29 21:15:43 utc | weeb1e | I'm going to have to restart them manually (which also involves killing all the processes they have spawned) in a few hours, at about 4am, once they have no load |
| 2012-10-29 21:16:16 utc | weeb1e | One of my services, a process manager that spawns processes and monitors them, is currently using over 50% of 16GB RAM |
| 2012-10-29 21:16:27 utc | jmettraux | too much information |
| 2012-10-29 21:16:56 utc | weeb1e | Basically, I need rufus-scheduler to free these proc bindings when a job is unscheduled |
| 2012-10-29 21:18:10 utc | weeb1e | While right now there are only 16 instances of my SpawnedProcess class, there are 1168901 instances which have not been GC'd |
| 2012-10-29 21:18:23 utc | jmettraux | rufus-scheduler is, at the heart, very simple |
| 2012-10-29 21:20:20 utc | weeb1e | Well, if the proc holds the binding to self, why does the proc still exist after unschedule('tag') is called? |
| 2012-10-29 21:20:32 utc | jmettraux | what version of Ruby? |
| 2012-10-29 21:21:11 utc | jmettraux | other question: how do you initialize the scheduler? |
| 2012-10-29 21:22:08 utc | weeb1e | 1.9.2p180, 1.9.3p125 and 1.9.3p286 |
| 2012-10-29 21:22:36 utc | jmettraux | I guess some GNU/Linux |
| 2012-10-29 21:23:40 utc | weeb1e | Ubuntu and Debian |
| 2012-10-29 21:24:17 utc | jmettraux | I've been looking at the code, the proc is held in a Job, when you unschedule (if the unschedule is successful), the job is removed from its job queue |
| 2012-10-29 21:24:35 utc | jmettraux | there are two job queues, "at" and "cron", I guess you're only troubled by jobs in "at" |
| 2012-10-29 21:24:44 utc | jmettraux | you seem to be using unschedule_at |
| 2012-10-29 21:24:59 utc | jmettraux | maybe you could check that your job is unscheduled successfully at first |
| 2012-10-29 21:25:08 utc | jmettraux | if not, then, yes, the proc sticks around |
| 2012-10-29 21:25:25 utc | weeb1e | Ok, hopefully this is not my fault, I actually forgot I wrapped rufus-scheduler in a small wrapper a few years ago |
| 2012-10-29 21:25:53 utc | jmettraux | same here |
| 2012-10-29 21:26:06 utc | jmettraux | except for the wrapper part |
| 2012-10-29 21:26:10 utc | weeb1e | When I call unschedule_by_tag, that was actually before such a method existed |
| 2012-10-29 21:26:18 utc | weeb1e | So it calls find_by_tag(tag).each {|job| job.unschedule } |
| 2012-10-29 21:27:20 utc | weeb1e | I don't use cron for scheduling, mostly #in, and a few #at |
| 2012-10-29 21:28:25 utc | jmettraux | is the job.unschedule actually successful? |
| 2012-10-29 21:28:50 utc | weeb1e | Well if it wasn't, a lot would have gone wrong over the years jmettraux, so I can be sure that does work |
| 2012-10-29 21:29:04 utc | weeb1e | I use this wrapper extensively over many projects |
| 2012-10-29 21:29:30 utc | jmettraux | ok, so the job count does decrease, you're sure of it. |
| 2012-10-29 21:29:45 utc | jmettraux | how do you instantiate the scheduler? |
| 2012-10-29 21:29:49 utc | weeb1e | After looking over it briefly, I don't see anything that should cause leaking of procs |
| 2012-10-29 21:30:58 utc | weeb1e | jmettraux: This is my wrapper, http://pastebin.com/EhMrK53P |
| 2012-10-29 21:31:17 utc | jmettraux | argh, EMScheduler |
| 2012-10-29 21:31:21 utc | weeb1e | An instance of the wrapper is created per module that can be loaded and unloaded at runtime |
| 2012-10-29 21:31:45 utc | weeb1e | The wrapper shares the single rufus-scheduler between the instances |
| 2012-10-29 21:31:58 utc | weeb1e | It is an EM application, as are all of my other applications |
| 2012-10-29 21:33:08 utc | weeb1e | Considering I wrote that a few years ago, I would now have used class instance variables, but in this case, it would make no difference, so I still think rufus-scheduler must be leaking the procs |
| 2012-10-29 21:33:19 utc | jmettraux | you're using an array for the @queue |
| 2012-10-29 21:34:06 utc | weeb1e | The queue is just for any calls that come before EM.run |
| 2012-10-29 21:34:29 utc | weeb1e | Though more recent updates to my modular system would mean that is pretty much redundant |
| 2012-10-29 21:34:48 utc | weeb1e | Most likely nothing is ever queued as modules are only loaded inside the event loop |
| 2012-10-29 21:35:29 utc | jmettraux | what version of rufus-scheduler are you using? |
| 2012-10-29 21:35:46 utc | weeb1e | 2.0.9 |
| 2012-10-29 21:36:08 utc | weeb1e | At least on the machine I am currently debugging with |
| 2012-10-29 21:37:48 utc | jmettraux | ok, I don't have the time to look further now |
| 2012-10-29 21:37:59 utc | jmettraux | I will look at it later in the day |
| 2012-10-29 21:38:29 utc | weeb1e | Ok, thanks jmettraux, please let me know as soon as you have any more information |
| 2012-10-29 21:50:10 utc | jmettraux | weeb1e: line 56 of your scheduler |
| 2012-10-29 21:50:14 utc | jmettraux | @jobs = [] |
| 2012-10-29 21:50:31 utc | jmettraux | then line 24 |
| 2012-10-29 21:50:40 utc | jmettraux | scheduler.jobs << @@scheduler.send(name, *args, &block) |
| 2012-10-29 21:50:54 utc | weeb1e | ahh, fff |
| 2012-10-29 21:51:03 utc | weeb1e | I didn't notice that is never cleared |
| 2012-10-29 21:51:35 utc | weeb1e | I'd expect myself to have made a mistake like that back when I started using ruby, but wow, I can't believe I never noticed it :| |
| 2012-10-29 21:51:51 utc | jmettraux | no worries |
| 2012-10-29 21:52:16 utc | weeb1e | Not worried, but feel terrible for blaming rufus |
| 2012-10-29 21:52:34 utc | weeb1e | I shall fix the 6 projects using this wrapper and see how things go |
| 2012-10-29 21:53:28 utc | jmettraux | seriously, no worries, rufus is not yet cleared :-) |
| 2012-10-29 21:54:39 utc | weeb1e | Hehe |
| 2012-10-29 21:58:37 utc | weeb1e | I'll have to avoid using cron (not that I have been anyway), since #in and #at need the job removed from @jobs when their callback fires, #every never does, and cron could go either way |
| 2012-10-29 21:59:22 utc | jmettraux | right |
| 2012-10-29 22:06:04 utc | weeb1e | Leaks like this were never a real issue on my past smaller scale projects, but recently my projects have been serving hundreds of users, making even a small leak become a problem very quickly |
| 2012-10-29 22:08:09 utc | jmettraux | congrats! :-) |
| 2012-10-29 22:08:21 utc | weeb1e | I'll let you know if I notice any further issues, but hopefully this will solve things |
| 2012-10-29 22:10:43 utc | jmettraux | ok, excellent |
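
The root cause found above (a wrapper that appends every job to `@jobs` and never clears it) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pastebin and not rufus-scheduler itself: `FakeScheduler`, `LeakyWrapper`, `FixedWrapper`, `Event` and `retained_after` are hypothetical names, and the stand-in scheduler takes a single `tag:` keyword rather than the `tags:` option used in the log. The point is only that each stored job holds its block, and the block holds the `self` it was created in, so the instance cannot be collected while the job stays in the array.

```ruby
# Hypothetical stand-in for a scheduler; this is NOT rufus-scheduler's API.
class FakeScheduler
  Job = Struct.new(:tag, :block)

  def initialize
    @queue = []
  end

  # Registers a job and returns it; the delay is ignored in this sketch.
  def in(_delay, tag:, &block)
    job = Job.new(tag, block)
    @queue << job
    job
  end

  # Drops exactly this job object from the queue.
  def unschedule(job)
    @queue.delete_if { |queued| queued.equal?(job) }
  end
end

# Wrapper that remembers every job it creates and never forgets one.
class LeakyWrapper
  attr_reader :jobs

  def initialize(scheduler)
    @scheduler = scheduler
    @jobs = []
  end

  def in(delay, tag:, &block)
    job = @scheduler.in(delay, tag: tag, &block)
    @jobs << job # job -> block -> caller's self stays reachable from here
    job
  end

  # Unschedules matching jobs but leaves them in @jobs (the leak).
  def unschedule_by_tag(tag)
    @jobs.select { |job| job.tag == tag }.each { |job| @scheduler.unschedule(job) }
  end
end

# Same wrapper, except unscheduled jobs are also dropped from @jobs.
class FixedWrapper < LeakyWrapper
  def unschedule_by_tag(tag)
    stale, @jobs = @jobs.partition { |job| job.tag == tag }
    stale.each { |job| @scheduler.unschedule(job) }
  end
end

# Mirrors the pattern from the log: each new instance replaces the timeout job.
class Event
  def initialize(scheduler, address)
    @address = address
    scheduler.unschedule_by_tag("activity-timeout-#{@address}")
    # The block closes over self, so whoever keeps the block keeps this Event.
    scheduler.in(12, tag: "activity-timeout-#{@address}") do
      puts "[#{@address}] Server is inactive..."
    end
  end
end

# Creates `count` events against a fresh wrapper and reports how many jobs
# (and therefore how many Event instances) the wrapper still references.
def retained_after(wrapper_class, count)
  wrapper = wrapper_class.new(FakeScheduler.new)
  count.times { Event.new(wrapper, "10.0.0.1") }
  wrapper.jobs.size
end

puts retained_after(LeakyWrapper, 1000) # prints 1000
puts retained_after(FixedWrapper, 1000) # prints 1
```

As weeb1e notes, clearing on unschedule is only part of the fix: a one-shot `#in` or `#at` job would also need to be removed from `@jobs` when its callback fires, which this sketch does not attempt.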