Ubuntu and magic mini-trampolines (crash problem)

Re: Ubuntu and magic mini-trampolines (crash problem)

Mic Bowman
I have the same freeze at times. In my case there's a gdb process
running that I can kill off to terminate the simulator.

Regarding r9323... I just upgraded this morning and restarted my most
script-rich region several times without the crash (using the patch I
had posted on the mantis entry). Not that it actually means anything
that I can't reproduce it today, given how timing-dependent the problem
is. If you are running your simulator in a screen process, turn on
logging and post the mono error on the mantis. I'll take a look at it.

--mic
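
For anyone collecting the error for the mantis entry, here is a minimal Python sketch that pulls the assertion line and the managed stack frames out of a captured console log. The sample text is copied from the paste quoted in this thread. The function name and the assumption that the assertion arrives on one unwrapped line are illustrative only, not part of OpenSim or Mono.

```python
import re

# Assertion line in the form seen in the paste in this thread:
# ERROR:<file>:<line>:<function>: assertion failed: <expression>
ASSERT_RE = re.compile(
    r"^ERROR:(?P<file>[^:\s]+):(?P<line>\d+):(?P<func>\w+): "
    r"assertion failed:\s*(?P<expr>.*)$",
    re.MULTILINE,
)
# Managed stack frames are printed as indented lines starting with "at ".
FRAME_RE = re.compile(r"^[ \t]*at (?P<frame>.+?)[ \t]*$", re.MULTILINE)


def extract_mono_assertion(log_text):
    """Return the first assertion and the frames that follow it, or None."""
    match = ASSERT_RE.search(log_text)
    if match is None:
        return None
    tail = log_text[match.end():]
    frames = [m.group("frame") for m in FRAME_RE.finditer(tail)]
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "function": match.group("func"),
        "expression": match.group("expr").strip(),
        "frames": frames,
    }


# Sample taken from the error paste in this thread (hex dump shortened).
SAMPLE_LOG = """\
Region (root) # 0x18 0x0 0x0 0x55
**
ERROR:mini-trampolines.c:122:mono_magic_trampoline: assertion failed: (vt)
Stacktrace:

  at log4net.LogManager.WrapLogger (log4net.Core.ILogger) <0xffffffff>
  at log4net.LogManager.WrapLogger (log4net.Core.ILogger) <0x0001d>
"""

report = extract_mono_assertion(SAMPLE_LOG)
```

The resulting dictionary can be pasted into a bug report alongside the full log; it is a convenience for spotting the failing function, not a replacement for the complete stack trace.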


On Mon, Apr 27, 2009 at 6:33 AM, John Hopkin
<[hidden email]> wrote:

> I'm still getting the problem in r9323, although OpenSim.exe now
> freezes rather than returning to the command line, and needs "kill -9"
> to terminate it.
>
> John Hopkin wrote:
>
>>Thanks, Mic.  I've put a monitor on that mantis, and I'll upgrade when
>>the patch comes through.  It's not a killer problem - if I can get the
>>sim up, which can take a couple of tries, it seems to stay up pretty
>>well, with only one crash so far during normal running, which was
>>during a border crossing.
>>
>>John
>>
>>Mic Bowman wrote:
>>
>>>there is a mantis for this (3237)...
>>>
>>>i've been going through the various script engine calls to log4net and
>>>commenting them out to get rid of the problem. two in particular
>>>seemed to help. i'll put up another patch on that mantis in the next
>>>couple days.
>>>
>>>the problem comes and goes. i've tried various combinations of mono
>>>threads, debug/release builds, and even rebuilt log4net. the more
>>>scripts i have in a region that is restarting, the more likely the
>>>problem occurs. there is clearly a race condition somewhere in the
>>>mono 2.2 and 2.4 internals which opensim is tweaking.
>>>
>>>--mic
>>>
>>>
>>>On Wed, Apr 22, 2009 at 6:09 PM, John Hopkin
>>><[hidden email]> wrote:
>>>> Thanks.  I'll try that.  At the moment, it's working OK - it seems to
>>>> be intermittent, whereas before it was during each bootup.  If/when it
>>>> happens again, I'll drop that value.
>>>>
>>>> John
>>>>
>>>> Snoopy Pfeffer wrote:
>>>>
>>>>>I have experienced the same when MONO_THREADS_PER_CPU is set to a very high number. 500 works for me.
>>>>>
>>>>>  Snoopy Pfeffer
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>________________________________
>>>>>From: John Hopkin <[hidden email]>
>>>>>To: [hidden email]
>>>>>Sent: Thursday, April 23, 2009 1:50:37 AM
>>>>>Subject: [Opensim-users] Ubuntu and magic mini-trampolines (crash problem)
>>>>>
>>>>>I've just upgraded Mono from 2.1 to 2.2 under Ubuntu Hardy, using Dr
>>>>>Scofield's script from here:
>>>>>
>>>>>http://xyzzyxyzzy.net/2009/02/17/updated-mono-build-script-mono-22/
>>>>>
>>>>>and all went smoothly with that.
>>>>>
>>>>>But now, when I run OpenSim.exe, it loads OK at first, but then I get
>>>>>the error:
>>>>>
>>>>><start paste>
>>>>>Region (root) # 0x18 0x0 0x0 0x55 0x8b 0xec 0x57 0x56 0x83 0xec 0x20
>>>>>0x8b 0x75 0x8 0x83 0xec 0x8 0x56 0x68 0x10 0x75 0xe8 0x9c 0x8b 0x5
>>>>>0x10 0x75 0xe8 0x9c 0xff 0x50 0x34
>>>>>**
>>>>>ERROR:mini-trampolines.c:122:mono_magic_trampoline: assertion failed:
>>>>>(vt)
>>>>>Stacktrace:
>>>>>
>>>>>  at log4net.LogManager.WrapLogger (log4net.Core.ILogger) <0xffffffff>
>>>>>  at log4net.LogManager.WrapLogger (log4net.Core.ILogger) <0x0001d>
>>>>><end paste>
>>>>>
>>>>>and a long stacktrace.  The server crashes, of course.  Presumably,
>>>>>the references to magic mini-trampolines are within the code; I don't
>>>>>have any scripts or objects that I'm aware of like that.
>>>>>
>>>>>Vital stats:
>>>>>
>>>>>* OpenSim revision 9205 (binary package from OSGrid)
>>>>>* Ubuntu Intrepid, kernel 2.6.27-11-generic
>>>>>* Mono run with MONO_THREADS_PER_CPU set to 2000
>>>>>* 7 regions run by one OpenSim.exe
>>>>>*  P4/3.6GHz, 2GB RAM, more than adequate (or were under Mono 2.1)
>>>>>* UGAIM through OSGrid
>>>>>* MySQL 14.12
>>>>>
>>>>>I'm completely at a loss as to what could be causing this.  Does
>>>>>anyone have any ideas?
>>>> --
>>>> John Hopkin
>>>>
>>>> _______________________________________________
>>>> Opensim-users mailing list
>>>> [hidden email]
>>>> https://lists.berlios.de/mailman/listinfo/opensim-users
>>>>
> --
> John Hopkin

Re: Ubuntu and magic mini-trampolines (crash problem)

John Hopkin
I'm not using screen - it's just a simple script running in a terminal
window.  I reinstalled (now on r9325), reapplied the patch, and tried
again with all 11 regions (several dozen scripts).  It crashed and ran
gdb as you say.  It seems pretty easy to reproduce still.

I've uploaded a paste of the full error message onto the mantis; hope
this helps.

John

--
John Hopkin


Re: Ubuntu and magic mini-trampolines (crash problem)

justincc
In reply to this post by Mic Bowman
Mic Bowman wrote:
> I have the same freeze at times. In my case there's a gdb process
> running that I can kill off to terminate the simulator.
>

I presume there's no revealing deadlock information in the thread dump (perhaps that's too much to hope for :) ?

> Regarding 9323... I just upgraded this morning and restarted my most
> script-rich region several times without the crash (using the patch I
> had posted on the mantis entry). Not that it actually means anything
> that I can't reproduce it today given the timing nature. If you are
> running your simulator in a screen process, turn on logging and post
> the mono error on the mantis. I'll take a look at it.

--
justincc
Justin Clark-Casey
http://justincc.wordpress.com

Re: Ubuntu and magic mini-trampolines (crash problem)

Dr Scofield
Justin Clark-Casey wrote:
> Mic Bowman wrote:
>> I have the same freeze at times. In my case there's a gdb process
>> running that I can kill off to terminate the simulator.
>
> I presume there's no revealing deadlock information in the thread dump
> (perhaps that's too much to hope for :) ?

what is your value of the MONO_THREADS_PER_CPU environment variable?

-- 
dr dirk husemann ---- math & computer science ---- ibm zurich research lab
RL: [hidden email] - +41 44 724 8573 - http://www.zurich.ibm.com/~hud/ 
SL: [hidden email] --------------------- http://xyzzyxyzzy.net/


Re: Ubuntu and magic mini-trampolines (crash problem)

Mic Bowman
we've seen the problem with MONO_THREADS_PER_CPU set across a range of
values (i think we've tried a number of values from 100 to 1000).
there doesn't appear to be any correlation.

--mic
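
To repeat this kind of comparison across several MONO_THREADS_PER_CPU values, a small Python sketch like the one below can prepare each run. It only builds the command and environment and does not start anything. The launch command `mono OpenSim.exe` and the function name are assumptions for illustration; the values are the ones mentioned in this thread.

```python
import os


def opensim_launch_plan(threads_per_cpu, base_env=None):
    """Build the command and environment for one simulator run without starting it."""
    if threads_per_cpu <= 0:
        raise ValueError("threads_per_cpu must be positive")
    # Copy the environment so the caller's (or the process's) copy is untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MONO_THREADS_PER_CPU"] = str(threads_per_cpu)
    command = ["mono", "OpenSim.exe"]  # assumed launch command, adjust to your setup
    return command, env


# Values that came up in this thread: 100 to 1000 tried, 500 reported as working,
# 2000 in the original report.
plans = {n: opensim_launch_plan(n, base_env={}) for n in (100, 500, 1000, 2000)}
```

Each plan can be handed to `subprocess.run(command, env=env)` from the OpenSim bin directory. Since the crash is intermittent, several restarts per value would be needed before reading anything into the results.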



Re: Ubuntu and magic mini-trampolines (crash problem)

Dr Scofield
Mic Bowman wrote:
> we've seen the problem with mono_threads_per_cpu set across a range of
> values (i think we've tried a number of values from 100 to 1000).
> there doesn't appear to be any correlation.

ok. we had freezes that got cured by increasing MONO_THREADS_PER_CPU, hence my
asking...

a real deadlock is nasty... :-( sigh.

        DrS/dirk

--
dr dirk husemann ---- virtual worlds research ---- ibm zurich research lab
SL: dr scofield ---- [hidden email] ---- http://xyzzyxyzzy.net/
RL: [hidden email] - +41 44 724 8573 - http://www.zurich.ibm.com/~hud/

Re: Ubuntu and magic mini-trampolines (crash problem)

Mic Bowman
in general we see crashes, not freezes. the apparent freeze is usually
caused by the gdb process that gets started when the crash occurs
(kill the gdb process and the simulator crashes "gracefully").
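
A rough Python sketch for locating that gdb process, assuming it shows up as a direct child of the simulator's mono process (an assumption, not something confirmed in this thread). It parses text in the three-column layout produced by `ps -eo pid,ppid,comm`; the function name and the sample PIDs are made up for illustration.

```python
def find_gdb_children(ps_output, parent_pid):
    """Return PIDs of gdb processes whose parent is parent_pid.

    ps_output is text in the form produced by `ps -eo pid,ppid,comm`.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue  # skip the header line and anything malformed
        pid, ppid, comm = int(fields[0]), int(fields[1]), fields[2].strip()
        if ppid == parent_pid and comm == "gdb":
            pids.append(pid)
    return pids


# Made-up process table: mono is PID 4242, with one gdb child.
SAMPLE_PS = """\
  PID  PPID COMMAND
 4100     1 screen
 4242  4100 mono
 4310  4242 gdb
 4500  4100 gdb
"""

stuck = find_gdb_children(SAMPLE_PS, parent_pid=4242)
```

The returned PIDs could then be passed to `os.kill(pid, signal.SIGTERM)`. Matching on the parent PID avoids killing an unrelated gdb session running on the same machine.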

given that the problem seems to happen when log4net attempts to get a
logger associated with a particular script class, my guess is that one
of a few problems is happening:

1) mono is somehow messing up the class that is passed in (there are
   nulls in the argument stack that look suspicious)
2) log4net isn't as thread-safe as it claims (though the routine that
   creates the wrapper object looks fine at first glance)
3) somehow mono is screwing up the assembly binding when log4net
   creates the wrapper

I've looked at the mono code that is throwing the exception... but it
wasn't very helpful without a LOT more investigation.

at this point, i'm building a custom log4net dll with some additional
debugging in it. which, given that this certainly appears to be a
timing issue, will almost certainly hide the problem. :-(

--mic
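
As a generic illustration of the thread-safety question raised above, the Python sketch below shows a get-or-create cache in which the check and the creation happen under one lock, so concurrent callers asking for the same name all receive the same wrapper. This is not log4net's code and says nothing about where the actual fault lies; every name in it is hypothetical.

```python
import threading


class LoggerWrapperCache:
    """Get-or-create cache where lookup and creation share a single lock."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._wrappers = {}

    def get(self, name):
        # Holding the lock across check and create closes the window in which
        # two threads could both see "missing" and both build a wrapper.
        with self._lock:
            wrapper = self._wrappers.get(name)
            if wrapper is None:
                wrapper = self._factory(name)
                self._wrappers[name] = wrapper
            return wrapper


created = []                      # names the factory was asked to build
created_lock = threading.Lock()


def make_wrapper(name):
    """Hypothetical factory standing in for wrapper creation."""
    with created_lock:
        created.append(name)
    return object()


cache = LoggerWrapperCache(make_wrapper)
results = []
results_lock = threading.Lock()


def worker():
    wrapper = cache.get("ScriptClass")
    with results_lock:
        results.append(wrapper)


# Simulate many scripts requesting a logger for the same class at once.
threads = [threading.Thread(target=worker) for _ in range(32)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads could both find the entry missing and each create a wrapper, which is the general shape of race that only shows up under load and tends to vanish once extra debugging changes the timing.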



Re: Ubuntu and magic mini-trampolines (crash problem)

John Hopkin
Mic Bowman <[hidden email]> wrote in
news:[hidden email]:

> in general we see crashes, not freezes. the apparent freeze is usually
> caused by the gdb process that gets started when the crash occurs
> (kill the gdb process and the simulator crashes "gracefully").
>
> given that the problem seems to happen when log4net attempts to get a
> logger associated with a particular script class, my guess is that one
> of a couple problems is happening... 1) mono is somehow messing up the
> class that is passed in (there are nulls in the argument stack that
> look suspicious), 2) log4net isn't as thread-safe as it claims (though
> the routine that creates the wrapper object looks fine at first
> glance), 3) somehow mono is screwing up the assembly binding when
> log4net creates the wrapper. I've looked at the mono code that is
> throwing the exception... but it wasn't very helpful without a LOT
> more investigation.
>
> at this point, i'm building a custom log4net dll with some additional
> debugging in it. which, given that this certainly appears to be a
> timing issue, will almost certainly hide the problem. :-(

If it's any help, you could send me a copy and I'll try to reproduce the
problem here and let you know what happens.
