
[ts-7000] Re: Memory buffers goes to zero - ts7260 hangs

To:
Subject: [ts-7000] Re: Memory buffers goes to zero - ts7260 hangs
From: "tedapt" <>
Date: Sat, 27 Jun 2009 13:14:23 -0000
It looks like I've found the problem: it does indeed seem to be a memory leak 
originating in the kernel's(?) NetWinder Floating Point Emulator code 
(/var/log/dmesg shows "NetWinder Floating Point Emulator V0.97 (double 
precision)").

In my last post I noted that I had discovered an exception message occurring in 
`dmesg` output every couple of seconds.  The messages were of this form:

   NWFPE: jamvm[7394] takes exception 00000002 at c034eec8 from 2ca4b84c

I Googled NWFPE and found that it stands for NetWinder Floating Point Emulator. 
A likely candidate for its use seemed to be my Java code involving BigDecimals, 
my preferred way of handling floating point math.
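
A minimal sketch of how the "from" address in such a message can be matched 
against a line of /proc/<pid>/maps, to see which library the faulting 
instruction lives in. The class and method names (NwfpeAddressLookup, 
userAddress, offsetInMapping) are made up for illustration, the regular 
expression only assumes the message format shown above, and the maps line is 
the one quoted from the earlier post further down:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: names are illustrative only.
final class NwfpeAddressLookup {
    // Matches the dmesg format seen in this thread:
    //   NWFPE: <comm>[<pid>] takes exception <flags> at <kernel addr> from <user addr>
    private static final Pattern NWFPE = Pattern.compile(
        "NWFPE: ([^\\[\\s]+)\\[(\\d+)\\] takes exception ([0-9a-fA-F]+)"
        + " at ([0-9a-fA-F]+) from ([0-9a-fA-F]+)");

    // Returns the user-space address (the "from" field) of an NWFPE message.
    static long userAddress(String dmesgLine) {
        Matcher m = NWFPE.matcher(dmesgLine);
        if (!m.find()) {
            throw new IllegalArgumentException("not an NWFPE message: " + dmesgLine);
        }
        return Long.parseLong(m.group(5), 16);
    }

    // Returns the offset of address from the start of the mapping described by
    // one /proc/<pid>/maps line, or -1 if the address is outside that mapping.
    static long offsetInMapping(long address, String mapsLine) {
        String[] fields = mapsLine.trim().split("\\s+");
        String[] range = fields[0].split("-");   // "start-end", both in hex
        long start = Long.parseLong(range[0], 16);
        long end = Long.parseLong(range[1], 16); // end is exclusive
        return (address >= start && address < end) ? address - start : -1;
    }

    public static void main(String[] args) {
        String msg = "NWFPE: jamvm[7394] takes exception 00000002 at c034eec8 from 2ca4b84c";
        String maps = "2ca43000-2ca54000 r-xp 00000000 fe:03 16900 "
            + "/usr/local/jamvm/lib/classpath/libjavalang.so.0.0.0";
        long offset = offsetInMapping(userAddress(msg), maps);
        System.out.println(Long.toHexString(offset)); // prints 884c
    }
}
```

In practice one would run offsetInMapping over every line of the maps file and 
keep the line that does not return -1.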

I tracked down the culprit to a utility method in my Java code which was 
instantiating BigDecimals from String representations of numbers. I switched to 
instantiation with doubles, e.g.:
   
   new BigDecimal("" + doubleVal);
becomes
   new BigDecimal(doubleVal);

The exceptions in dmesg ceased.  An overnight run of the code is still running 
strong and `free` shows healthy availability and use of memory buffers, quite a 
big change from earlier performance.
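
One caveat worth noting: the two constructors are not numerically equivalent. 
new BigDecimal(String) takes the short decimal text produced by the 
double-to-String conversion, while new BigDecimal(double) takes the exact 
binary value of the double. The sketch below only illustrates that difference; 
the class and method names (BigDecimalCtorDemo, viaString, viaDouble) are made 
up for the example, and rounding with setScale is one possible way to get back 
to a short value, not necessarily what the application should do:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative only: names are hypothetical, not from the application code.
final class BigDecimalCtorDemo {
    // Original form from the post: goes through a double-to-String conversion.
    static BigDecimal viaString(double v) {
        return new BigDecimal("" + v);
    }

    // Replacement form from the post: exact binary value of the double.
    static BigDecimal viaDouble(double v) {
        return new BigDecimal(v);
    }

    public static void main(String[] args) {
        double v = 0.1;
        System.out.println(viaString(v));
        // prints 0.1
        System.out.println(viaDouble(v));
        // prints 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(viaString(v).compareTo(viaDouble(v)) == 0);
        // prints false
        // Rounding to a fixed scale removes the binary tail if it is unwanted.
        System.out.println(viaDouble(v).setScale(4, RoundingMode.HALF_UP));
        // prints 0.1000
    }
}
```

If downstream code compares or prints these values, the extra digits may 
matter, so it is worth checking any equality tests or formatting that relied 
on the String-based form.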

Thanks Bret and Jim for your help.

--- In  "tedapt" <> wrote:
>
> Thanks, I'll look into the 2.6.21 kernel possibility.
> 
> I've now found that enabling swap only delays the inevitable.  Now after a 
> day (versus six hours), I run out of swap (free swap goes to zero and system 
> freezes).  Increasing swap size just extends the delay until the system locks 
> up. Sound like a memory or other resource leak?
> 
> I am curious about your question about peek/poke and /dev/mem.  I have 
> several configurations of my application. They share similar core code, but 
> differ in some add-on functionality. The problem I'm experiencing occurs in a 
> couple of my configurations. Notably one of them does analog/digital I/O 
> using the TS-ADC16 board.  While I am not doing I/O peek/poke myself in my 
> code (java & native), I've installed lsof to get a better handle on what's 
> active in the system, and I see that (only in this application configuration) 
> something under my java application is using the /dev/mem device:
> 
> :admin# lsof | grep /dev/mem
> COMMAND    PID   USER   FD      TYPE     DEVICE    SIZE      NODE NAME
> jamvm      566   root  mem       CHR        1,1                 9 /dev/mem
> jamvm      566   root   38u      CHR        1,1                 9 /dev/mem
> jamvm      582   root  mem       CHR        1,1                 9 /dev/mem
> etc...
> 
> That said, I also have another application configuration which is suffering a 
> similar problem but instead collects its data via Modbus TCP.  I've got 
> another version of this application which uses the RS-232 serial port for 
> Modbus RTU, and this version does not suffer this problem.
> 
> I have a new theory about this:  I've been watching free's buffers grow and 
> shrink, assuming that I "need" buffers and somehow can't get them (i.e., they 
> go to zero). I'm now thinking that instead, some processes which were happily 
> filling buffers are becoming bottle-necked and so using buffers at a slower 
> pace, until eventually they grind to a halt and no buffers are further 
> allocated as a result (or only very slowly). Meanwhile, the OS is flushing 
> the buffers via bdflush as data in them age. Ultimately, there's no recent 
> data left in the buffers (i.e., buffers are flushed to size of 0) because the 
> source of that data which was filling them has ground to a halt.
> 
> Given the symptoms I see (inability to connect to the device, some complaints 
> in its log files related to inability to make outbound TCP connections), I 
> wonder if the TCP networking service is having problems (perhaps failing to 
> make connections and blocking?)  I have little idea how to test this idea or 
> develop diagnostic data for it. Can anyone offer any suggestions? 
> 
> One more clue, I've just found that dmesg gives me many, many messages that 
> look something like this (present with both versions of my application that 
> experience these memory problems, but not on the version that doesn't):
> 
>    NWFPE: jamvm[642] takes exception 00000002 at c034eec8 from 2ca4b84c
> 
> When I `cat /proc/642/maps` I see:
> 
>    2ca43000-2ca54000 r-xp 00000000 fe:03 16900      /usr/local/jamvm/lib/classpath/libjavalang.so.0.0.0
> 
> So it looks to me like there is a problem in the native java libraries. Not 
> sure whether this is a symptom of the other problems I see or the cause 
> itself.  But if the cause, then perhaps this is an exception in native code, 
> which is the source of a memory leak?
> 
> --- In  Jim Jackson <jj@> wrote:
> >
> > 
> > 
> > 
> > On Thu, 25 Jun 2009, Breton M. Saunders wrote:
> > 
> > > Are you using the sdcard for disk storage on this device?  If so, can
> > > you try running your application completely off of a ramdisk
> > 
> > If he's having RAM exhaustion problems, I don't see how using some of that 
> > RAM for a ramdisk is likely to help matters :-) Maybe I'm missing 
> > something.
> > 
> > Given that having a 64M swap eases problems, it looks like the application, or 
> > the Java VM is just memory bound. The problem wants reanalysing and 
> > recoding to use less memory, which may not be possible if it's a java VM 
> > memory management problem.
> > 
> > > or off of
> > > an NFS partition and verify that the problem still occurs.
> > >> I suppose at this point I'm really looking for more insight into what 
> > >> "Buffers" really means? I've read that this is either related to data 
> > >> blocks or interprocess communications. Hoping to better understand what 
> > >> that means (and ideally to somehow connect that to a particular thread) 
> > >> to perhaps nail down the few lines of Java code somewhere in the vast 
> > >> sea that could be causing the problem and rewrite that.
> > >>
> > > I don't think it's a few Java lines that are causing the problem.  If the
> > > machine is locking up then you've uncovered a kernel bug; which isn't
> > > unlikely given the amount of dodgy code used to get memory working on
> > > the ep93xx with linux - due to the ep93xx's awkward layout of physical
> > > memory.
> > >> Otherwise, hoping to identify some OS configuration variables (e.g., 
> > >> through sysctl) that might make a difference if tweaked.
> > >>
> > >> Have also tried allocating more and less heap to Java, with no effect, 
> > >> as well have tried touching large chunks of memory when my app 
> > >> initializes to try to ensure that enough memory is being allocated. But 
> > >> once again, the problem seems most closely tied to the "Buffers" value, 
> > >> not overall memory allocation.
> > >>
> > > Can you try running your application under a 2.6 series kernel?  Either
> > > try with TS's 2.6.21, or try with some of the later revisions, like
> > > 2.6.27.  I don't believe that 2.6.28 or 2.6.29 are stable yet using
> > > sparsemem.
> > >
> > > Finally, are you doing I/O by peek/poke via /dev/mem?
> > >
> > >    Cheers,
> > >
> > >    -Brett
> > >
> >
>



