ruote tmp/log_2012-10-29.html

2012-10-29 20:42:52 utc

weeb1e

Hello everyone

2012-10-29 20:43:30 utc

weeb1e

jmettraux: I have been spending a lot of time debugging memory leaks in my EventMachine applications

2012-10-29 20:44:07 utc

weeb1e

It seems the last leak I found was due to rufus-scheduler keeping references to instances which call scheduler#in

2012-10-29 20:44:20 utc

weeb1e

Do you have any advice about such issues?

2012-10-29 20:47:51 utc

weeb1e

I have an Event class; an instance of it is created to process each event parsed from data logged to a UDP server

2012-10-29 20:48:10 utc

weeb1e

For every event a new instance is created, but no references to the instance are kept, so that it can be GC'd

2012-10-29 20:48:56 utc

weeb1e

If I call an instance method from the instance's #initialize method which contains the following, rufus-scheduler keeps a reference to the instance and it is never GC'd

2012-10-29 20:49:00 utc

weeb1e

scheduler.unschedule_by_tag "activity-timeout-#@address"

2012-10-29 20:49:00 utc

weeb1e

scheduler.in(12, tags: "activity-timeout-#@address") { puts "[#@server_name] Server is inactive..." }

2012-10-29 20:49:24 utc

weeb1e

note: scheduler is an instance stored on the class level, shared between all instances of Event

2012-10-29 20:50:23 utc

weeb1e

After a few hundred thousand Event instances and their contents have leaked, the whole application stalls because it has used up all available RAM

2012-10-29 21:05:01 utc

weeb1e

I've found the same issue in most of my EventMachine applications, and it is a big problem. Please let me know what you suggest I do about it. I love rufus-scheduler, but I need to solve this soon, especially in my critical applications

2012-10-29 21:11:54 utc

jmettraux

weeb1e: hello

2012-10-29 21:12:21 utc

weeb1e

jmettraux: Hi :)

2012-10-29 21:12:30 utc

jmettraux

digesting your report

2012-10-29 21:12:43 utc

weeb1e

Ok, take your time

2012-10-29 21:12:47 utc

weeb1e

I'll be around

2012-10-29 21:13:05 utc

jmettraux

how does the scheduler keep a reference to your event instance?

2012-10-29 21:13:18 utc

jmettraux

how does it get its hand on it?

2012-10-29 21:13:19 utc

weeb1e

Procs can hold a reference to the binding
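
[note] A minimal sketch of what weeb1e describes here: a block created inside an instance method keeps that instance reachable through its binding. The Event class and the timeout_block method below are hypothetical stand-ins, not code from the log.

```ruby
# Hypothetical stand-in for the Event class described in the log.
class Event
  def initialize(server_name)
    @server_name = server_name
  end

  # Returns a block similar to the one passed to scheduler#in above.
  def timeout_block
    proc { "[#{@server_name}] Server is inactive..." }
  end
end

event = Event.new("alpha")
blk = event.timeout_block

# The block's binding remembers the object it was created in, so
# anything holding the block also keeps that Event instance alive.
captured_self = blk.binding.eval("self")
puts captured_self.equal?(event)
puts blk.call
```

As long as something stores blk (a job list, for example), the Event it was created in cannot be collected.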

2012-10-29 21:13:34 utc

jmettraux

ah right

2012-10-29 21:13:55 utc

jmettraux

tried any workaround?

2012-10-29 21:13:57 utc

weeb1e

As to why it's never freed by rufus, I'm not sure

2012-10-29 21:14:22 utc

weeb1e

I worked around it in a single case by moving the schedule block to a class method and calling that from the instance
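
[note] A rough sketch of that workaround, assuming a stand-in scheduler that only stores the blocks it receives. FakeScheduler and schedule_timeout are hypothetical names, not rufus-scheduler API. Because the block is created inside a class method and only receives plain values, its self is the Event class rather than an Event instance.

```ruby
# Hypothetical stand-in scheduler: it only stores the blocks it is given.
class FakeScheduler
  attr_reader :blocks

  def initialize
    @blocks = []
  end

  def in(_seconds, &block)
    @blocks << block
  end
end

class Event
  SCHEDULER = FakeScheduler.new

  def initialize(server_name)
    # Pass plain values to the class method instead of creating the block here.
    self.class.schedule_timeout(server_name)
  end

  # The block is created in class scope, so it captures the Event class
  # and the server_name string, not the instance being initialized.
  def self.schedule_timeout(server_name)
    SCHEDULER.in(12) { "[#{server_name}] Server is inactive..." }
  end
end

Event.new("alpha")
blk = Event::SCHEDULER.blocks.last
puts blk.binding.eval("self") == Event
puts blk.call
```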

2012-10-29 21:14:29 utc

weeb1e

But that is far from ideal in most cases

2012-10-29 21:14:59 utc

weeb1e

I have some critical services with huge leaks caused by this issue, which I will need to solve very soon

2012-10-29 21:15:43 utc

weeb1e

I'm going to have to restart them manually (which also involves killing all the processes they have spawned) in a few hours, at about 4am, once they have no load

2012-10-29 21:16:16 utc

weeb1e

One of my services, a process manager which spawns processes and monitors them, is currently using over 50% of 16GB of RAM

2012-10-29 21:16:27 utc

jmettraux

too much information

2012-10-29 21:16:56 utc

weeb1e

Basically, I need rufus-scheduler to free these proc bindings when a job is unscheduled

2012-10-29 21:18:10 utc

weeb1e

While right now only 16 instances of my SpawnedProcess class are in use, there are 1168901 instances which have not been GC'd

2012-10-29 21:18:23 utc

jmettraux

rufus-scheduler is, at the heart, very simple

2012-10-29 21:20:20 utc

weeb1e

Well, if the proc holds the binding to self, why does the proc still exist after unschedule('tag') is called?

2012-10-29 21:20:32 utc

jmettraux

what version of Ruby?

2012-10-29 21:21:11 utc

jmettraux

other question: how do you initialize the scheduler?

2012-10-29 21:22:08 utc

weeb1e

1.9.2p180, 1.9.3p125 and 1.9.3p286

2012-10-29 21:22:36 utc

jmettraux

I guess some GNU/Linux

2012-10-29 21:23:40 utc

weeb1e

Ubuntu and Debian

2012-10-29 21:24:17 utc

jmettraux

I've been looking at the code, the proc is held in a Job, when you unschedule (if the unschedule is successful), the job is removed from its job queue

2012-10-29 21:24:35 utc

jmettraux

there are two job queues, "at" and "cron", I guess you're only troubled by jobs in "at"

2012-10-29 21:24:44 utc

jmettraux

you seem to be using unschedule_at

2012-10-29 21:24:59 utc

jmettraux

maybe you could check that your job is unscheduled successfully at first

2012-10-29 21:25:08 utc

jmettraux

if not, then, yes, the proc sticks around

2012-10-29 21:25:25 utc

weeb1e

Ok, hopefully this is not my fault, I actually forgot I wrapped rufus-scheduler in a small wrapper a few years ago

2012-10-29 21:25:53 utc

jmettraux

same here

2012-10-29 21:26:06 utc

jmettraux

except for the wrapper part

2012-10-29 21:26:10 utc

weeb1e

The unschedule_by_tag I call is from my wrapper; it was actually written before such a method existed

2012-10-29 21:26:18 utc

weeb1e

So it calls find_by_tag(tag).each {|job| job.unschedule }

2012-10-29 21:27:20 utc

weeb1e

I don't use cron for scheduling, mostly #in, and a few #at

2012-10-29 21:28:25 utc

jmettraux

is the job.unschedule actually successful?

2012-10-29 21:28:50 utc

weeb1e

Well if it wasn't, a lot would have gone wrong over the years jmettraux, so I can be sure that does work

2012-10-29 21:29:04 utc

weeb1e

I use this wrapper extensively over many projects

2012-10-29 21:29:30 utc

jmettraux

ok, so the job count does decrease, you're sure of it.

2012-10-29 21:29:45 utc

jmettraux

how do you instantiate the scheduler?

2012-10-29 21:29:49 utc

weeb1e

After looking over it briefly, I don't see anything that should cause leaking of procs

2012-10-29 21:30:58 utc

weeb1e

jmettraux: This is my wrapper, http://pastebin.com/EhMrK53P

2012-10-29 21:31:17 utc

jmettraux

argh, EMScheduler

2012-10-29 21:31:21 utc

weeb1e

An instance of the wrapper is created per module that can be loaded and unloaded at runtime

2012-10-29 21:31:45 utc

weeb1e

The wrapper shares the single rufus-scheduler between the instances

2012-10-29 21:31:58 utc

weeb1e

It is an EM application, as are all of my other applications

2012-10-29 21:33:08 utc

weeb1e

Considering I wrote that a few years ago, I would now have used class instance variables, but in this case, it would make no difference, so I still think rufus-scheduler must be leaking the procs

2012-10-29 21:33:19 utc

jmettraux

you're using an array for the @queue

2012-10-29 21:34:06 utc

weeb1e

The queue is just for any calls that come before EM.run

2012-10-29 21:34:29 utc

weeb1e

Though more recent updates to my modular system would mean that is pretty much redundant

2012-10-29 21:34:48 utc

weeb1e

Most likely nothing is ever queued as modules are only loaded inside the event loop

2012-10-29 21:35:29 utc

jmettraux

what version of rufus-scheduler are you using?

2012-10-29 21:35:46 utc

weeb1e

2.0.9

2012-10-29 21:36:08 utc

weeb1e

At least on the machine I am currently debugging with

2012-10-29 21:37:48 utc

jmettraux

ok, I don't have the time to look further now

2012-10-29 21:37:59 utc

jmettraux

I will look at it later in the day

2012-10-29 21:38:29 utc

weeb1e

Ok, thanks jmettraux, please let me know as soon as you have any more information

2012-10-29 21:50:10 utc

jmettraux

weeb1e: line 56 of your scheduler

2012-10-29 21:50:14 utc

jmettraux

@jobs = []

2012-10-29 21:50:31 utc

jmettraux

then line 24

2012-10-29 21:50:40 utc

jmettraux

scheduler.jobs << @@scheduler.send(name, *args, &block)

2012-10-29 21:50:54 utc

weeb1e

ahh, fff

2012-10-29 21:51:03 utc

weeb1e

I didn't notice that is never cleared
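
[note] A sketch of the kind of leak found here. It is not the pastebin wrapper itself: FakeJob, FakeScheduler and LeakyWrapper are hypothetical stand-ins. The wrapper records every job in its own @jobs array and never removes any, so jobs (and the blocks they hold) stay reachable even after the scheduler has dropped them.

```ruby
# Hypothetical job and scheduler stand-ins; not the rufus-scheduler API.
FakeJob = Struct.new(:block)

class FakeScheduler
  def initialize
    @queue = []
  end

  def in(_seconds, &block)
    job = FakeJob.new(block)
    @queue << job
    job
  end

  def unschedule(job)
    @queue.delete_if { |j| j.equal?(job) }
  end

  def size
    @queue.size
  end
end

class LeakyWrapper
  attr_reader :jobs

  def initialize(scheduler)
    @scheduler = scheduler
    @jobs = [] # only ever appended to
  end

  def in(seconds, &block)
    job = @scheduler.in(seconds, &block)
    @jobs << job
    job
  end

  def unschedule(job)
    @scheduler.unschedule(job) # the wrapper's own @jobs is left untouched
  end
end

scheduler = FakeScheduler.new
wrapper = LeakyWrapper.new(scheduler)

1000.times do
  job = wrapper.in(12) { :timeout }
  wrapper.unschedule(job)
end

puts scheduler.size    # jobs still held by the scheduler
puts wrapper.jobs.size # jobs still held by the wrapper
```

The scheduler's queue ends up empty while the wrapper still references every job it ever created.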

2012-10-29 21:51:35 utc

weeb1e

I'd expect myself to have made a mistake like that back when I started using ruby, but wow, I can't believe I never noticed it :|

2012-10-29 21:51:51 utc

jmettraux

no worries

2012-10-29 21:52:16 utc

weeb1e

Not worried, but feel terrible for blaming rufus

2012-10-29 21:52:34 utc

weeb1e

I shall fix the 6 projects using this wrapper and see how things go

2012-10-29 21:53:28 utc

jmettraux

seriously, no worries, rufus is not yet cleared :-)

2012-10-29 21:54:39 utc

weeb1e

Hehe

2012-10-29 21:58:37 utc

weeb1e

I'll have to avoid using cron (not that I have been anyway): #in and #at need the job removed from @jobs when their callback fires, #every never does, and cron could go either way
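
[note] One possible shape for the fix described here, again with hypothetical stand-ins (FakeJob, FakeScheduler, TrackingWrapper, fire_all) rather than the real wrapper or rufus-scheduler API: a one-shot job is dropped from @jobs when its callback fires or when it is unscheduled.

```ruby
# Hypothetical job and scheduler stand-ins; not the rufus-scheduler API.
FakeJob = Struct.new(:block)

class FakeScheduler
  def initialize
    @queue = []
  end

  def in(_seconds, &block)
    job = FakeJob.new(block)
    @queue << job
    job
  end

  def unschedule(job)
    @queue.delete_if { |j| j.equal?(job) }
  end

  # Simulates time passing: runs and drops every pending job.
  def fire_all
    pending, @queue = @queue, []
    pending.each { |job| job.block.call }
  end
end

class TrackingWrapper
  attr_reader :jobs

  def initialize(scheduler)
    @scheduler = scheduler
    @jobs = []
  end

  # One-shot job: forget it as soon as its callback has run.
  def in(seconds, &block)
    job = nil
    wrapped = proc do
      begin
        block.call
      ensure
        @jobs.delete_if { |j| j.equal?(job) }
      end
    end
    job = @scheduler.in(seconds, &wrapped)
    @jobs << job
    job
  end

  # Cancelled job: forget it together with the scheduler.
  def unschedule(job)
    @scheduler.unschedule(job)
    @jobs.delete_if { |j| j.equal?(job) }
  end
end

scheduler = FakeScheduler.new
wrapper = TrackingWrapper.new(scheduler)

wrapper.in(12) { :timeout }
cancelled = wrapper.in(12) { :timeout }
wrapper.unschedule(cancelled)
pending_before = wrapper.jobs.size

scheduler.fire_all
pending_after = wrapper.jobs.size

puts pending_before
puts pending_after
```

Repeating jobs such as #every would need different handling, since their callbacks are expected to fire again.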

2012-10-29 21:59:22 utc

jmettraux

right

2012-10-29 22:06:04 utc

weeb1e

Leaks like this were never a real issue on my past smaller-scale projects, but recently my projects have been serving hundreds of users, making even a small leak become a problem very quickly

2012-10-29 22:08:09 utc

jmettraux

congrats! :-)

2012-10-29 22:08:21 utc

weeb1e

I'll let you know if I notice any further issues, but hopefully this will solve things

2012-10-29 22:10:43 utc

jmettraux

ok, excellent