It looks like I've found the problem. It does indeed seem to be a memory leak,
apparently originating in the kernel's(?) NetWinder Floating Point Emulator code
(/var/log/dmesg shows "NetWinder Floating Point Emulator V0.97 (double
precision)").
In my last post I noted that I had discovered an exception message occurring in
`dmesg` output every couple of seconds. The messages were of this form:
NWFPE: jamvm[7394] takes exception 00000002 at c034eec8 from 2ca4b84c
I Googled NWFPE and found that it is the NetWinder Floating Point Emulator. A
likely candidate for its use seemed to be my Java code involving BigDecimals,
my preferred way of handling floating point math.
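
For anyone retracing the diagnosis: the "from" address in the NWFPE message can be checked against a line of /proc/<pid>/maps, as was done in the earlier message quoted below. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and the values are the ones that appear in this thread.

```java
class MapsLookup {
    // Given the address-range field of a /proc/<pid>/maps line
    // (e.g. "2ca43000-2ca54000") and a hex address, return the offset of
    // the address inside that mapping, or -1 if it lies outside the range.
    static long offsetInMapping(String range, String addrHex) {
        String[] parts = range.split("-");
        long start = Long.parseLong(parts[0], 16);
        long end = Long.parseLong(parts[1], 16); // end address is exclusive
        long addr = Long.parseLong(addrHex, 16);
        return (addr >= start && addr < end) ? addr - start : -1;
    }

    public static void main(String[] args) {
        // Range and address taken from the dmesg and maps output in this thread.
        long off = offsetInMapping("2ca43000-2ca54000", "2ca4b84c");
        System.out.println("offset in mapping: 0x" + Long.toHexString(off));
    }
}
```

Because the mapping's file offset field is 00000000, the result is also the offset into libjavalang.so.0.0.0, which could in principle be compared against the library's symbol table.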
I tracked down the culprit to a utility method in my Java code which was
instantiating BigDecimals from String representations of numbers. I switched to
instantiation with doubles, e.g.:
new BigDecimal("" + doubleVal);
becomes
new BigDecimal(doubleVal);
The exceptions in dmesg ceased. An overnight run of the code is still running
strong and `free` shows healthy availability and use of memory buffers, quite a
big change from earlier performance.
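
One caveat for anyone applying the same change: the two constructors do not always produce the same value. new BigDecimal(String) parses the decimal text, while new BigDecimal(double) keeps the exact binary value of the double. A minimal sketch of the difference (the class and method names are hypothetical, not from the application discussed here):

```java
import java.math.BigDecimal;

class BigDecimalFromDouble {
    // Equivalent to new BigDecimal("" + d): goes through the decimal string
    // produced by Double.toString.
    static BigDecimal viaString(double d) {
        return new BigDecimal(Double.toString(d));
    }

    // Uses the exact binary value stored in the double.
    static BigDecimal viaDouble(double d) {
        return new BigDecimal(d);
    }

    public static void main(String[] args) {
        double d = 0.1;
        System.out.println("via String: " + viaString(d));
        System.out.println("via double: " + viaDouble(d));
        // 0.5 is exactly representable in binary, so both routes agree.
        System.out.println("0.5 equal:  " + (viaString(0.5).compareTo(viaDouble(0.5)) == 0));
    }
}
```

Whether the extra digits matter depends on how the values are rounded or compared downstream, so results are worth spot-checking after the switch.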
Thanks Bret and Jim for your help.
--- In "tedapt" <> wrote:
>
> Thanks, I'll look into the 2.6.21 kernel possibility.
>
> I've now found that enabling swap only delays the inevitable. Now after a
> day (versus six hours), I run out of swap (free swap goes to zero and system
> freezes). Increasing swap size just extends the delay until the system locks
> up. Sound like a memory or other resource leak?
>
> I am curious about your question about peek/poke and /dev/mem. I have
> several configurations of my application. They share similar core code, but
> differ in some add-on functionality. The problem I'm experiencing occurs in a
> couple of my configurations. Notably one of them does analog/digital I/O
> using the TS-ADC16 board. While I am not doing I/O peek/poke myself in my
> code (java & native), I've installed lsof to get a better handle on what's
> active in the system, and I see that (only in this application configuration)
> something under my java application is using the /dev/mem device:
>
> :admin# lsof | grep /dev/mem
> COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
> jamvm 566 root mem CHR 1,1 9 /dev/mem
> jamvm 566 root 38u CHR 1,1 9 /dev/mem
> jamvm 582 root mem CHR 1,1 9 /dev/mem
> etc...
>
> That said, I also have another application configuration which is suffering a
> similar problem but instead collects its data via Modbus TCP. I've got
> another version of this application which uses the RS-232 serial port for
> Modbus RTU, and this version does not suffer this problem.
>
> I have a new theory about this: I've been watching free's buffers grow and
> shrink, assuming that I "need" buffers and somehow can't get them (i.e., they
> go to zero). I'm now thinking that instead, some processes which were happily
> filling buffers are becoming bottle-necked and so using buffers at a slower
> pace, until eventually they grind to a halt and no buffers are further
> allocated as a result (or only very slowly). Meanwhile, the OS is flushing
> the buffers via bdflush as data in them age. Ultimately, there's no recent
> data left in the buffers (i.e., buffers are flushed to size of 0) because the
> source of that data which was filling them has ground to a halt.
>
> Given the symptoms I see (inability to connect to the device, some complaints
> in its log files related to inability to make outbound TCP connections), I
> wonder if the TCP networking service is having problems (perhaps failing to
> make connections and blocking?). I have little idea how to test this idea or
> develop diagnostic data for it. Can anyone offer any suggestions?
>
> One more clue: I've just found that dmesg gives me many, many messages that
> look something like this (present with both versions of my application that
> experience these memory problems, but not on the version that doesn't):
>
> NWFPE: jamvm[642] takes exception 00000002 at c034eec8 from 2ca4b84c
>
> When I `cat /proc/642/maps` I see:
>
> 2ca43000-2ca54000 r-xp 00000000 fe:03 16900 /usr/local/jamvm/lib/classpath/libjavalang.so.0.0.0
>
> So it looks to me like there is a problem in the native java libraries. Not
> sure whether this is a symptom of the other problems I see or the cause
> itself. But if the cause, then perhaps this is an exception in native code,
> which is the source of a memory leak?
>
> --- In Jim Jackson <jj@> wrote:
> >
> > On Thu, 25 Jun 2009, Breton M. Saunders wrote:
> >
> > > Are you using the sdcard for disk storage on this device? If so, can
> > > you try running your application completely off of a ramdisk
> >
> > If he's having RAM exhaustion problems, I don't see how using some of that
> > RAM for a ramdisk is likely to help matters :-) Maybe I'm missing
> > something.
> >
> > Given that having a 64M swap eases the problems, it looks like the application, or
> > the Java VM is just memory bound. The problem wants reanalysing and
> > recoding to use less memory, which may not be possible if it's a java VM
> > memory management problem.
> >
> > > or off of
> > > an NFS partition and verify that the problem still occurs.
> > >> I suppose at this point I'm really looking for more insight into what
> > >> "Buffers" really means? I've read that this is either related to data
> > >> blocks or interprocess communications. Hoping to better understand what
> > >> that means (and ideally to somehow connect that to a particular thread)
> > >> to perhaps nail down the few lines of Java code somewhere in the vast
> > >> sea that could be causing the problem and rewrite that.
> > >>
> > > I don't think it's a few java lines that are causing the problem. If the
> > > machine is locking up then you've uncovered a kernel bug; which isn't
> > > unlikely given the amount of dodgy code used to get memory working on
> > > the ep93xx with linux - due to the ep93xx's awkward layout of physical
> > > memory.
> > >> Otherwise, hoping to identify some OS configuration variables (e.g.,
> > >> through sysctl) that might make a difference if tweaked.
> > >>
> > >> Have also tried allocating more and less heap to Java, with no effect,
> > >> as well have tried touching large chunks of memory when my app
> > >> initializes to try to ensure that enough memory is being allocated. But
> > >> once again, the problem seems most closely tied to the "Buffers" value,
> > >> not overall memory allocation.
> > >>
> > > Can you try running your application under a 2.6 series kernel? Either
> > > try with TS's 2.6.21, or try with some of the later revisions, like
> > > 2.6.27. I don't believe that 2.6.28 or 2.6.29 are stable yet using
> > > sparsemem.
> > >
> > > Finally, are you doing I/O by peek/poke via /dev/mem?
> > >
> > > Cheers,
> > >
> > > -Brett
> > >
> >
>