ruote tmp/log_2012-10-29.html

2012-10-29 20:42:52 utc weeb1e Hello everyone
2012-10-29 20:43:30 utc weeb1e jmettraux: I have been spending a lot of time debugging memory leaks in my EventMachine applications
2012-10-29 20:44:07 utc weeb1e It seems the last leak I found was due to rufus-scheduler keeping references to instances which call scheduler#in
2012-10-29 20:44:20 utc weeb1e Do you have any advice about such issues?
2012-10-29 20:47:51 utc weeb1e I have an Event class; an instance is created to process each event parsed from data logged to a UDP server
2012-10-29 20:48:10 utc weeb1e For every event a new instance is created, but no references to the instance are created so that they can be GC'd
2012-10-29 20:48:56 utc weeb1e If I call an instance method from the instance's #initialize method which contains the following, rufus-scheduler keeps a reference to the instance and it is never GC'd
2012-10-29 20:49:00 utc weeb1e scheduler.unschedule_by_tag "activity-timeout-#@address"
2012-10-29 20:49:00 utc weeb1e, tags: "activity-timeout-#@address") { puts "[#@server_name] Server is inactive..." }
2012-10-29 20:49:24 utc weeb1e note: scheduler is an instance stored on the class level, shared between all instances of Event
2012-10-29 20:50:23 utc weeb1e After a few hundred thousand Event instances and their contents have leaked, the whole application stalls because it has used up all available RAM
2012-10-29 21:05:01 utc weeb1e I've found the same issue in most of the EventMachine applications, and it is a big problem. Please let me know what you suggest I do about it, I love rufus-scheduler, but I need to solve this soon, especially in my critical applications
2012-10-29 21:11:54 utc jmettraux weeb1e: hello
2012-10-29 21:12:21 utc weeb1e jmettraux: Hi :)
2012-10-29 21:12:30 utc jmettraux digesting your report
2012-10-29 21:12:43 utc weeb1e Ok, take your time
2012-10-29 21:12:47 utc weeb1e I'll be around
2012-10-29 21:13:05 utc jmettraux how does the scheduler keep a reference to your event instance?
2012-10-29 21:13:18 utc jmettraux how does it get its hand on it?
2012-10-29 21:13:19 utc weeb1e Procs can hold a reference to the binding
2012-10-29 21:13:34 utc jmettraux ah right
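The point about procs and bindings can be illustrated with a minimal sketch in plain Ruby. The Event class and its method names here are hypothetical stand-ins, not the code discussed in the log, and no scheduler is involved:

```ruby
# Hypothetical Event class; names are illustrative, not taken from the log.
class Event
  def initialize(address)
    @address = address
  end

  # Returns a block that was created inside an instance method.
  def timeout_callback
    proc { "timeout for #{@address}" }
  end
end

callback = Event.new("10.0.0.1").timeout_callback

# The proc's binding carries `self`, so the Event stays reachable for as
# long as anything (for example a list of scheduled jobs) holds the proc.
captured = callback.binding.eval("self")
```

As long as `callback` is referenced somewhere, the Event it closes over cannot be garbage collected.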
2012-10-29 21:13:55 utc jmettraux tried any workaround?
2012-10-29 21:13:57 utc weeb1e As to why it's never freed by rufus, I'm not sure
2012-10-29 21:14:22 utc weeb1e I worked around it in a single case by moving the schedule block to a class method and calling that from the instance
2012-10-29 21:14:29 utc weeb1e But that is far from ideal in most cases
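A rough sketch of the workaround described here, assuming a hypothetical class method `timeout_callback` (the log does not show the real code). A block created inside a class method has the class as `self`, so it captures the class and the method's local variables rather than the instance:

```ruby
# Hypothetical sketch; method and variable names are assumptions.
class Event
  # Inside a class method `self` is the Event class, so the block
  # closes over the class and the server_name string only.
  def self.timeout_callback(server_name)
    proc { "[#{server_name}] Server is inactive..." }
  end

  def initialize(server_name)
    @server_name = server_name
  end

  # The instance asks the class to build the block instead of building it.
  def build_callback
    self.class.timeout_callback(@server_name)
  end
end

class_level_callback = Event.new("srv1").build_callback
callback_self = class_level_callback.binding.eval("self")
```

The cost is that anything the block needs has to be passed in explicitly, which is likely why it is "far from ideal" for larger callbacks.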
2012-10-29 21:14:59 utc weeb1e I have some critical services with huge leaks caused by this issue, which I will need to solve very soon
2012-10-29 21:15:43 utc weeb1e I'm going to have to restart them manually (which also involves killing all the processes they have spawned) in a few hours, at about 4am, once they have no load
2012-10-29 21:16:16 utc weeb1e One of my services, a process manager which spawns processes and monitors them, is currently using over 50% of 16GB RAM
2012-10-29 21:16:27 utc jmettraux too much information
2012-10-29 21:16:56 utc weeb1e Basically, I need rufus-scheduler to free these proc bindings when a job is unscheduled
2012-10-29 21:18:10 utc weeb1e While right now there are only 16 instances of my SpawnedProcess class, there are 1168901 instances which have not been GC'd
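The log does not say how these instance counts were obtained. One way to get such a number in plain Ruby, shown here as an assumption rather than the method actually used, is `ObjectSpace.each_object`:

```ruby
# Stand-in class; the real SpawnedProcess is not shown in the log.
class SpawnedProcess; end

# Keep 16 instances referenced so they cannot be collected.
live = Array.new(16) { SpawnedProcess.new }

GC.start # ask for a collection so unreachable instances are less likely to be counted
instance_count = ObjectSpace.each_object(SpawnedProcess).count
```

A count far above the number of objects the application knowingly holds suggests that something else still references the rest.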
2012-10-29 21:18:23 utc jmettraux rufus-scheduler is, at the heart, very simple
2012-10-29 21:20:20 utc weeb1e Well, if the proc holds the binding to self, why does the proc still exist after unschedule('tag') is called?
2012-10-29 21:20:32 utc jmettraux what version of Ruby?
2012-10-29 21:21:11 utc jmettraux other question: how do you initialize the scheduler?
2012-10-29 21:22:08 utc weeb1e 1.9.2p180, 1.9.3p125 and 1.9.3p286
2012-10-29 21:22:36 utc jmettraux I guess some GNU/Linux
2012-10-29 21:23:40 utc weeb1e Ubuntu and Debian
2012-10-29 21:24:17 utc jmettraux I've been looking at the code, the proc is held in a Job, when you unschedule (if the unschedule is successful), the job is removed from its job queue
2012-10-29 21:24:35 utc jmettraux there are two job queues, "at" and "cron", I guess you're only troubled by jobs in "at"
2012-10-29 21:24:44 utc jmettraux you seem to be using unschedule_at
2012-10-29 21:24:59 utc jmettraux maybe you could check that your job is unscheduled successfully at first
2012-10-29 21:25:08 utc jmettraux if not, then, yes, the proc sticks around
2012-10-29 21:25:25 utc weeb1e Ok, hopefully this is not my fault, I actually forgot I wrapped rufus-scheduler in a small wrapper a few years ago
2012-10-29 21:25:53 utc jmettraux same here
2012-10-29 21:26:06 utc jmettraux except for the wrapper part
2012-10-29 21:26:10 utc weeb1e When I call unschedule_by_tag, that was actually before such a method existed
2012-10-29 21:26:18 utc weeb1e So it calls find_by_tag(tag).each {|job| job.unschedule }
2012-10-29 21:27:20 utc weeb1e I don't use cron for scheduling, mostly #in, and a few #at
2012-10-29 21:28:25 utc jmettraux is the job.unschedule actually successful?
2012-10-29 21:28:50 utc weeb1e Well if it wasn't, a lot would have gone wrong over the years jmettraux, so I can be sure that does work
2012-10-29 21:29:04 utc weeb1e I use this wrapper extensively over many projects
2012-10-29 21:29:30 utc jmettraux ok, so the job count does decrease, you're sure of it.
2012-10-29 21:29:45 utc jmettraux how do you instantiate the scheduler?
2012-10-29 21:29:49 utc weeb1e After looking over it briefly, I don't see anything that should cause leaking of procs
2012-10-29 21:30:58 utc weeb1e jmettraux: This is my wrapper,
2012-10-29 21:31:17 utc jmettraux argh, EMScheduler
2012-10-29 21:31:21 utc weeb1e An instance of the wrapper is created per module that can be loaded and unloaded at runtime
2012-10-29 21:31:45 utc weeb1e The wrapper shares the single rufus-scheduler between the instances
2012-10-29 21:31:58 utc weeb1e It is an EM application, as are all of my other applications
2012-10-29 21:33:08 utc weeb1e Considering I wrote that a few years ago, I would now have used class instance variables, but in this case, it would make no difference, so I still think rufus-scheduler must be leaking the procs
2012-10-29 21:33:19 utc jmettraux you're using an array for the @queue
2012-10-29 21:34:06 utc weeb1e The queue is just for any calls that come before
2012-10-29 21:34:29 utc weeb1e Though more recent updates to my modular system would mean that is pretty much redundant
2012-10-29 21:34:48 utc weeb1e Most likely nothing is ever queued as modules are only loaded inside the event loop
2012-10-29 21:35:29 utc jmettraux what version of rufus-scheduler are you using?
2012-10-29 21:35:46 utc weeb1e 2.0.9
2012-10-29 21:36:08 utc weeb1e At least on the machine I am currently debugging with
2012-10-29 21:37:48 utc jmettraux ok, I don't have the time to look further now
2012-10-29 21:37:59 utc jmettraux I will look at it later in the day
2012-10-29 21:38:29 utc weeb1e Ok, thanks jmettraux, please let me know as soon as you have any more information
2012-10-29 21:50:10 utc jmettraux weeb1e: line 56 of your scheduler
2012-10-29 21:50:14 utc jmettraux @jobs = []
2012-10-29 21:50:31 utc jmettraux then line 24
2012-10-29 21:50:40 utc jmettraux << @@scheduler.send(name, *args, &block)
2012-10-29 21:50:54 utc weeb1e ahh, fff
2012-10-29 21:51:03 utc weeb1e I didn't notice that is never cleared
2012-10-29 21:51:35 utc weeb1e I'd expect myself to have made a mistake like that back when I started using ruby, but wow, I can't believe I never noticed it :|
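The wrapper itself is not reproduced in the log, so the following is only a sketch of the pattern jmettraux points at: every job returned by the scheduler is appended to `@jobs` and nothing ever removes it. `FakeScheduler` and `LeakyWrapper` are hypothetical stand-ins; no rufus-scheduler code is used:

```ruby
# Stand-in scheduler: just enough to hand back a job that holds the block.
class FakeScheduler
  class Job
    def initialize(block)
      @block = block
    end
  end

  def in(_delay, &block)
    Job.new(block)
  end
end

# Hypothetical wrapper mirroring the two lines quoted in the log.
class LeakyWrapper
  attr_reader :jobs

  def initialize(scheduler)
    @scheduler = scheduler
    @jobs = [] # never cleared
  end

  def in(delay, &block)
    job = @scheduler.in(delay, &block)
    @jobs << job # each job keeps its block, and the block keeps its binding
    job
  end
end

leaky = LeakyWrapper.new(FakeScheduler.new)
1000.times { |i| leaky.in("10s") { i } }
retained = leaky.jobs.size
```

Even if the scheduler drops its own reference after a job fires or is unscheduled, the wrapper's array still keeps every job, its block, and whatever the block closed over.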
2012-10-29 21:51:51 utc jmettraux no worries
2012-10-29 21:52:16 utc weeb1e Not worried, but feel terrible for blaming rufus
2012-10-29 21:52:34 utc weeb1e I shall fix the 6 projects using this wrapper and see how things go
2012-10-29 21:53:28 utc jmettraux seriously, no worries, rufus is not yet cleared :-)
2012-10-29 21:54:39 utc weeb1e Hehe
2012-10-29 21:58:37 utc weeb1e I'll have to avoid using cron (not that I have been anyway), since #in and #at need the job removed from @jobs when their callback fires, #every never does, and cron could go either way
2012-10-29 21:59:22 utc jmettraux right
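One possible shape for the fix described here, sketched under the same assumptions as before (a hypothetical wrapper and a stand-in scheduler, not the real code): for one-shot jobs, wrap the caller's block so the job removes itself from `@jobs` when it fires.

```ruby
# Stand-in scheduler whose jobs can be fired by hand.
class FakeScheduler
  class Job
    def initialize(block)
      @block = block
    end

    def trigger
      @block.call
    end
  end

  def in(_delay, &block)
    Job.new(block)
  end
end

# Hypothetical wrapper that forgets one-shot jobs once they have run.
class TrackingWrapper
  attr_reader :jobs

  def initialize(scheduler)
    @scheduler = scheduler
    @jobs = []
  end

  def in(delay, &block)
    job = nil
    job = @scheduler.in(delay) do
      @jobs.delete(job) # drop the wrapper's reference first
      block.call
    end
    @jobs << job
    job
  end
end

tracking = TrackingWrapper.new(FakeScheduler.new)
fired = []
pending = 3.times.map { |i| tracking.in("10s") { fired << i } }
pending.first(2).each(&:trigger)
```

A repeating job would stay in `@jobs` until it is explicitly unscheduled, which matches the remark that #every never needs this removal.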
2012-10-29 22:06:04 utc weeb1e Leaks like this were never a real issue on my past smaller scale projects, but recently my projects have been serving hundreds of users, making even a small leak become a problem very quickly
2012-10-29 22:08:09 utc jmettraux congrats! :-)
2012-10-29 22:08:21 utc weeb1e I'll let you know if I notice any further issues, but hopefully this will solve things
2012-10-29 22:10:43 utc jmettraux ok, excellent