The pictures are in the TS7500 Memory Loss album in the Photos section of this
page.
--- In "al" <> wrote:
>
> We have completed our testing and have shown conclusively that there is a bug
> in Linux that creates what appears to be a memory leak or loss. The bug is
> not in the Ethernet stack as we had suspected, but in a particular section of
> the Linux memory manager.
>
> We created a generic process with two threads, one for a basic client and the
> other for a basic server. We also created two Windows applications to
> interact with the threads and communicate messages at a high rate so that
> failure would not take long to recreate (typically less than 3 hours on a
> TS-7500). A memory logger ran from a read-write partition on the TS-7500's
> SD card and wrote its output to the same card, recording memory statistics
> every 5 minutes.
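
For anyone who wants to reproduce the logging, here is a minimal sketch of a
logger of this kind. It is not Mitch's actual logger, which was not posted.
The function names, the MEMINFO variable, the CSV layout and the log path in
the usage comment are all my own assumptions; only the 5-minute interval and
the idea of sampling memory statistics come from the message.

```shell
#!/bin/sh
# Hypothetical memory logger sketch (names and CSV layout are assumptions).
# MEMINFO can be overridden to point at a saved copy of /proc/meminfo.
MEMINFO="${MEMINFO:-/proc/meminfo}"

# Print one CSV record: epoch seconds, MemFree, Cached, Slab (values in kB).
meminfo_record() {
    awk -v now="$(date +%s)" '
        $1 == "MemFree:" { free = $2 }
        $1 == "Cached:"  { cached = $2 }
        $1 == "Slab:"    { slab = $2 }
        END { printf "%s,%s,%s,%s\n", now, free, cached, slab }
    ' "${1:-$MEMINFO}"
}

# Append one record to the log file $1 every $2 seconds (default 300 s,
# i.e. 5 minutes) until the process is killed.
log_forever() {
    while :; do
        meminfo_record >> "$1"
        sleep "${2:-300}"
    done
}

# Usage (the log path is a made-up example of an SD card mount point):
#   log_forever /mnt/sd/memlog.csv 300 &
```

Plotting the MemFree and Slab columns against time should give the same kind
of graph described in the quoted message.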
>
> Our TS-7500 crashed after about 2 hours and 45 minutes, at which point it
> did not respond to ANY form of communication, including the serial console.
> Our memory logger showed a rapid decrease in free memory and a rapid increase
> in used and slab memory. The graph of memory stats showed the system
> releasing a small amount of memory when free memory dropped below about 1.5
> MB, after which free memory continued to decrease at the same rate. It
> did this twice, but when free memory hit about 1.2 MB (the last entry in the
> log), the system hung. After reading up on Linux memory management, we
> confirmed that Linux is designed to hoard memory and that storage is 'slab'
> (page) based. The network stack is not designed to work from a pre-allocated
> memory area, as embedded operating systems do. It allocates memory from the
> 'heap' for all socket descriptors, file descriptors, data buffers, etc. This
> is why memory fragments so quickly. The memory allocated is NOT released
> back to the heap when a socket is closed or a data packet has been processed.
> This is why free memory decreases so rapidly. But, the worst design is in
> the slab manager, kswapd. As slabs are marked as full (fully allocated),
> they are placed in a non-cached list of slabs that are released only
> partially when free memory drops below a certain limit (~1.2 MB in our
> experience). It is a very stingy routine and does not release enough memory
> to prevent the system from killing off tasks or crashing. That is the root
> cause of the memory 'loss' - the hoarding and mismanagement of memory by
> kswapd.
>
> The good news is that an arcane command sequence was found that forces the
> kernel to release its cached memory. This was very useful in that it allowed
> us to run a background task to periodically issue the command sequence,
> forcing the release of memory fast enough that it did not contribute to
> filling slabs. Once a slab is filled, it cannot be freed by the command
> sequence since the sequence is for cached memory only. The trick is to
> release cached memory at least as fast as it is consumed. Using this method,
> we got results from a recent test showing ZERO loss of memory over a 15-hour
> period.
>
> The background task issues a system call every 5 minutes using this sequence:
>
> sync; echo 3 > /proc/sys/vm/drop_caches
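
A minimal sketch of a background task built around that command sequence,
in case it saves someone some typing. The function names and the DROP_CACHES
variable are my own assumptions, not part of Mitch's setup; the command itself
and the 5-minute interval are from the message. Writing 3 asks the kernel to
drop the page cache plus reclaimable dentries and inodes, and the write needs
root.

```shell
#!/bin/sh
# Hypothetical wrapper around the quoted command sequence.
# DROP_CACHES is overridable so the script can be dry-run against a scratch
# file; the real target needs root privileges.
DROP_CACHES="${DROP_CACHES:-/proc/sys/vm/drop_caches}"

# Flush dirty pages to storage first, then ask the kernel to drop clean
# caches (3 = page cache plus dentries and inodes).
drop_caches_once() {
    sync
    echo 3 > "$DROP_CACHES"
}

# Repeat every $1 seconds (default 300 s, i.e. 5 minutes) until killed.
drop_caches_loop() {
    while :; do
        drop_caches_once || echo "drop_caches write failed" >&2
        sleep "${1:-300}"
    done
}

# Usage, e.g. from a startup script run as root:
#   drop_caches_loop 300 &
```

Whether 5 minutes is often enough presumably depends on how fast your own
traffic fills the cache, so treat the interval as something to tune.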
>
> This should help Paul "ptreos2" work around the problem he is having.
>
> If I can figure out how to post pictures to this group, I will post graphs of
> memory usage over days of typical operation before and after the workaround.
>
> Mitch