[ts-7000] Re: TS-7500 Memory Loss

To:
Subject:	[ts-7000] Re: TS-7500 Memory Loss
From:	"al" <>
Date:	Tue, 11 Sep 2012 18:52:48 -0000

The pictures are in the TS7500 Memory Loss album in the Photos section of this 
page.

--- In  "al" <> wrote:
>
> We have completed our testing and have shown conclusively that there is a bug 
> in Linux that creates what appears to be a memory leak or loss.  The bug is 
> not in the Ethernet stack as we had suspected, but in a particular section of 
> the Linux memory manager.
> 
> We created a generic process with two threads, one for a basic client and the 
> other for a basic server.  We also created two Windows applications to 
> interact with the threads and communicate messages at a high rate so that 
> failure would not take long to recreate (typically less than 3 hours on a 
> TS-7500).  We ran a memory logger from a read-write partition on the 
> TS-7500's SD card.  The logger writes memory statistics to that same card 
> every 5 minutes.
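
For anyone wanting to reproduce this kind of logging, here is a minimal sketch of a logger along those lines. It is not the logger used in the test above. The function names, the choice of fields (MemFree, Cached, Slab) and the log path in the usage comment are all assumptions made for illustration. It only reads /proc/meminfo, so it does not need root.

```shell
#!/bin/sh
# Hypothetical memory logger sketch; not the logger from the original test.

# Print "MemFree Cached Slab" (all in kB) from a meminfo-format file.
# Defaults to /proc/meminfo; the optional argument lets the parser be
# run against a saved sample instead.
mem_snapshot() {
    awk '
        $1 == "MemFree:" { free = $2 }
        $1 == "Cached:"  { cached = $2 }
        $1 == "Slab:"    { slab = $2 }
        END { printf "%d %d %d\n", free, cached, slab }
    ' "${1:-/proc/meminfo}"
}

# Append one timestamped sample to a log file.
#   $1 = log file, $2 = optional meminfo-format source
log_sample() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(mem_snapshot "$2")" >> "$1"
}

# Example use (log path is a placeholder), one sample every 5 minutes:
#   while :; do log_sample /path/to/memlog.txt; sleep 300; done
```

Each log line is a timestamp followed by three kB values, which is easy to graph later. A field that the running kernel does not report is logged as 0.
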
> 
> We saw our TS-7500 crash after about 2 hours and 45 minutes, at which point it 
> did not respond to ANY form of communication, including the serial console.  
> Our memory logger showed a rapid decrease in free memory and a rapid increase 
> in used and slab memory.  The graph of memory stats showed the system 
> releasing a small amount of memory when free memory dropped below about 1.5 
> MB, after which free memory continued to decrease at the same rate.  It 
> did this twice, but when free memory hit about 1.2 MB, the last entry in the 
> log, the system hung.
> 
> After reading up on Linux memory management, we confirmed that Linux is 
> designed to hoard memory and that storage is 'slab' (page) based.  Unlike the 
> network stacks of embedded operating systems, the Linux network stack is not 
> designed to work from a pre-allocated memory area.  It allocates memory from 
> the 'heap' for all socket descriptors, file descriptors, data buffers, etc.  
> This is why memory is so quickly fragmented.  The memory allocated is NOT 
> released back to the heap when a socket is closed or a data packet has been 
> processed.  This is why free memory decreases so rapidly.  But the worst 
> design is in the slab manager, kswapd.  As slabs are marked as full (fully 
> allocated), they are placed in a non-cached list of slabs that are released 
> only partially when free memory drops below a certain limit (~1.2 MB in our 
> experience).  It is a very stingy routine and does not release enough memory 
> to prevent the system from killing off tasks or crashing.  That is the root 
> cause of the memory 'loss': the hoarding and mismanagement of memory by 
> kswapd.
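
One way to see how much of the slab growth described above the kernel itself counts as reclaimable is to compare the Slab, SReclaimable and SUnreclaim lines of /proc/meminfo. The sketch below assumes the running kernel reports those fields (recent 2.6 kernels do, but this was not checked on the TS-7500). The function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarise slab usage from a meminfo-format file.
# Prints total, reclaimable and unreclaimable slab in kB.
# A field the kernel does not report is printed as 0.
slab_breakdown() {
    awk '
        $1 == "Slab:"         { slab = $2 }
        $1 == "SReclaimable:" { rec = $2 }
        $1 == "SUnreclaim:"   { unrec = $2 }
        END { printf "slab=%d reclaimable=%d unreclaimable=%d\n", slab, rec, unrec }
    ' "${1:-/proc/meminfo}"
}

# Example use:
#   slab_breakdown
```

If the unreclaimable figure is the one that keeps climbing, dropping caches alone would not be expected to bring it back down.
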
> 
> The good news is that an arcane command sequence was found that forces the 
> kernel to release its cached memory.  This was very useful in that it allowed 
> us to run a background task to periodically issue the command sequence, 
> forcing the release of memory fast enough that it did not contribute to 
> filling slabs.  Once a slab is filled, it cannot be freed by the command 
> sequence since the sequence is for cached memory only.  The trick is to 
> release cached memory at least as fast as it is consumed.  Using this method, 
> we got results from a recent test showing ZERO loss of memory over a 15-hour 
> period.
> 
> The background task issues a system call every 5 minutes using this sequence:
> 
> sync; echo 3 > /proc/sys/vm/drop_caches
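
A minimal sketch of how the sequence above could be wrapped for a background task. This is not the actual task from the test. The function name and the optional path argument are assumptions, added so the function can be pointed at a scratch file for a dry run. Writing to the real /proc/sys/vm/drop_caches requires root. Per the kernel documentation, writing 3 frees both the page cache and reclaimable slab objects (dentries and inodes), and only clean objects are dropped, which is why the sync comes first.

```shell
#!/bin/sh
# Hypothetical wrapper around the command sequence quoted above.
# Flush dirty pages to storage, then ask the kernel to drop clean caches.
#   $1 = optional control file; defaults to the real one (needs root).
drop_caches() {
    sync
    echo 3 > "${1:-/proc/sys/vm/drop_caches}"
}

# Example use as a background loop, every 5 minutes as in the post:
#   while :; do drop_caches; sleep 300; done &
```

On a system with a cron daemon, a root crontab entry such as `*/5 * * * * sync; echo 3 > /proc/sys/vm/drop_caches` should give the same 5-minute schedule without a long-running loop.
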
> 
> This should help Paul "ptreos2" work around the problem he is having.
> 
> If I can figure out how to post pictures to this group, I will post graphs of 
> memory usage over days of typical operation before and after the workaround.
> 
> Mitch





