[ts-7000] Re: Reboot issue with RTAI Application on the 7260

To:
Subject: [ts-7000] Re: Reboot issue with RTAI Application on the 7260
From: "jywmpg" <>
Date: Thu, 19 Apr 2007 20:04:56 -0000
I did a little more googling, then actually looked into the rtai-load
script and found that one should execute a 'sync' command before
loading the rtai modules.
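
For illustration, here is a minimal sketch of what such a load/unload
script might look like, with the sync placed before the first insmod.
The module list and path are taken from the load script quoted later in
this thread; the function names, the INSMOD/RMMOD override variables,
and passing bare module names to rmmod are assumptions made for the
sketch, not what rtai-load itself does.

```shell
#!/bin/sh
# Sketch of an /etc/init.d-style RTAI loader (hypothetical; adapt to your board).
MODDIR="${MODDIR:-/usr/realtime/modules}"   # path from the load script quoted in this thread
MODULES="rtai_hal rtai_ksched rtai_lxrt rtai_sem rtai_mbx rtai_msg"
INSMOD="${INSMOD:-/sbin/insmod}"            # overridable, e.g. INSMOD=echo for a dry run
RMMOD="${RMMOD:-/sbin/rmmod}"

load_modules() {
    sync    # flush pending disk writes before the RTAI modules go in
    for m in $MODULES; do
        $INSMOD "$MODDIR/$m.o" || return 1
    done
}

unload_modules() {
    # Build the reversed list so modules come out in the opposite order.
    rev=""
    for m in $MODULES; do
        rev="$m $rev"
    done
    for m in $rev; do
        $RMMOD "$m" || return 1
    done
}

case "${1:-}" in
    start) load_modules ;;
    stop)  unload_modules ;;
esac
```

Setting INSMOD=echo and RMMOD=echo before calling the functions prints
the commands instead of running them, which is a cheap way to check the
ordering before trying it on the board.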

This is similar to the subject of a previous post I made about crashes
involving the open of /dev/mem that is required to do DIO processing.  
(http://tech.groups.yahoo.com/group/ts-7000/message/6249). sejmke
pointed out that the answer was to add O_SYNC to the option list,
which cleared the problem up.

I understand that this flushes the output buffers, but don't really
know why that is important, just that it is.


I was pretty happy with this, until I tried to implement it on a second
7260 (I have half a dozen of them I need to get running) and ran into
a new issue:  The boot process hangs when loading rtai_hal.o from an
/etc/init.d script, even with the sync command.  The two machines are
clones of one another, with the only difference in hardware being the
USB WiFi Dongle version (requiring a different driver (zd1211 vs
zd_b)), and the brand of the USB CF.

I am making progress on this second machine however.  I disabled every
non-essential service out of /etc/init.d, and suppressed the loading
of every non-essential driver, and now the second 7260 boots and loads
the RTAI modules reliably (30+ successful boots in a row, including
one involving an fsck).  Now it is a matter of adding the drivers and
services back in one at a time until the problem recurs.
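
The add-back step could be scripted roughly as follows.  This is only a
sketch: it assumes the disabled scripts were parked in a separate
directory (the /etc/init.d.disabled path and the enable_next name are
hypothetical), and it ignores any runlevel symlinks a given
distribution might need.

```shell
#!/bin/sh
# Hypothetical layout: disabled scripts parked in $DISABLED_DIR, active ones in $INIT_DIR.
INIT_DIR="${INIT_DIR:-/etc/init.d}"
DISABLED_DIR="${DISABLED_DIR:-/etc/init.d.disabled}"

# Move the next parked script (in glob order) back into service and print its name.
# Returns 1 when nothing is left to re-enable.
enable_next() {
    for f in "$DISABLED_DIR"/*; do
        [ -e "$f" ] || return 1      # empty directory: the glob did not match anything
        mv "$f" "$INIT_DIR"/ || return 1
        basename "$f"
        return 0
    done
    return 1
}
```

Call enable_next, run the reboot test again, and repeat; the script
re-enabled just before the hang returns is the likely suspect.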


In any event, thanks again for your suggestions.

jw

--- In  "Julien" <> wrote:
>
> Ok cool !
> 
> Could the fsck be linked to your boot problem when loading your RTAI
> application? That seems strange... I didn't notice anything like
> this in my case.
> 
> By the way, you can configure the number of boots between each fsck,
> no? I think I saw something about this when partitioning the CF as
> EXT2.
> 
> Julien
> 
> --- In  "jywmpg" <jywmpg@> wrote:
> >
> > Yes, that helped a great deal.  Thanks very much.  Things are working
> > much, much better than before.
> > 
> > 1. I was using rtai-load to launch my programs.  I now realize that
> > its main job was to load and unload the rtai modules.  I suspect it
> > is just part of the test suite.
> > 
> > The system I am developing is an embedded system whose main purpose
> > is to perform this DIO task, so the RTAI application and modules
> > should be running all the time.  For ease of use, I just created an
> > /etc/init.d/rtai script that loads the modules I need at boot time.
> > 
> > This cleared up several of the symptoms.
> > 
> > 2. I took your suggestion and routed the RS-232 serial console
> > output to a file and examined it closely.  I was happy to find a few
> > relevant messages just before the crash that pointed to my non real
> > time task.
> > 
> > To make a long story short, at reboot the real time task exits
> > first.  Subsequently the non real time task tried to execute an rpc
> > on the task that had exited, trying to tell it to exit.  Nothing
> > good happened after that.
> > 
> > Adding a simple check before executing the rpc caused that crash to
> > go away.
> > 
> > 3. At that point, I was able to reboot the machine in question
> > remotely via a script about 30 times in a row without issue.
> > 
> > 4. Then the machine crashed on boot, just after the RTAI application
> > started.  
> > 
> > I noticed, however, that on this boot an fsck had been triggered on
> > the USB CF, since it was the 30th mounting without an fsck and that
> > is what it happens to be set for.
> > 
> > 5. I reran the '30 reboots' script and, sure enough, all reboots
> > were successful until about the 20th one, when an fsck was again
> > triggered, followed by the same sort of failure to boot as in #4.
> > 
> > I suspect that with a little more Googling I can track that one
> > down, but just wanted to let you know how things turned (are
> > turning) out.  I want to get to the bottom of the fsck issue, but I
> > can live with 20 reboots between crashes for right now.
> > 
> > 
> > Thanks again for the help.
> > 
> > 
> > 
> > jw
> > 
> > 
> > --- In  "Julien" <kaezar1@> wrote:
> > >
> > > Hello
> > > 
> > > > Do you control your board by telnet? If so, the error could be
> > > > that you don't unload the RTAI modules properly, and when you
> > > > try to reboot your system there are a lot of errors that you
> > > > don't see (lots of numbers, like a matrix).
> > > > I had this error after running the latency test and not
> > > > unloading the RTAI modules. Over telnet I at first didn't
> > > > understand why the connection was closed and the system didn't
> > > > reboot. Over RS232 I saw all the errors caused by the RTAI
> > > > modules still running.
> > > 
> > > > Just use a simple script to unload the modules, the opposite of
> > > > your load script.
> > > > For example, if the load script is:
> > > /sbin/insmod /usr/realtime/modules/rtai_hal.o
> > > /sbin/insmod /usr/realtime/modules/rtai_ksched.o
> > > /sbin/insmod /usr/realtime/modules/rtai_lxrt.o
> > > /sbin/insmod /usr/realtime/modules/rtai_sem.o
> > > /sbin/insmod /usr/realtime/modules/rtai_mbx.o
> > > /sbin/insmod /usr/realtime/modules/rtai_msg.o
> > > 
> > > > then the unload script is:
> > > /sbin/rmmod rtai_msg.o
> > > /sbin/rmmod rtai_mbx.o
> > > /sbin/rmmod rtai_sem.o
> > > /sbin/rmmod rtai_lxrt.o
> > > /sbin/rmmod rtai_ksched.o
> > > /sbin/rmmod rtai_hal.o
> > > 
> > > Hope that helps you.
> > > 
> > > Julien
> > > 
> > > --- In  sjanisch@ wrote:
> > > >
> > > > How are you stopping the RTAI application?
> > > > 
> > > > > I think I recall reading something on the RTAI board about
> > > > > issuing the stop_rt_timer, for instance, before one of the
> > > > > rt_task_wait_period calls in a timer loop... which causes
> > > > > things to hang.
> > > > > 
> > > > > Did you write a signal handler for the SIGTERM signal and
> > > > > write some internal code to shut everything down nicely?  How
> > > > > do you start/stop the server at will from the command line --
> > > > > ctrl-C or kill?  They really are not the same if the kill is
> > > > > handled correctly.
> > > > 
> > > > Probably not much help...
> > > > 
> > > >  wrote on 04/17/2007 08:02:26 AM:
> > > > 
> > > > > Hi all,
> > > > > 
> > > > > > My question is: Do I need to do something special to
> > > > > > start/stop RTAI applications from an /etc/init.d script?
> > > > > 
> > > > > Background:
> > > > > 
> > > > > > I am developing an RTAI application that runs under a server
> > > > > > that in turn is started/stopped via an /etc/init.d script.
> > > > > > The init.d script starts the server, which then spawns
> > > > > > rtai-load to run the RTAI application.
> > > > > 
> > > > > > The RTAI application is similar to the 'latency' and
> > > > > > 'display' test programs in the latency test suite.  The hard
> > > > > > real time part runs every half second to do some DIO
> > > > > > operations, then reports results to the non-real time part
> > > > > > via an rt mailbox.  There is a second rt mailbox for sending
> > > > > > commands to the real time component from the non-real time
> > > > > > side.
> > > > > 
> > > > > > I am having a problem with the reboot sequence.  When I
> > > > > > issue the 'reboot' command, about a third of the time the
> > > > > > system hangs just after the TERM signal is sent to all tasks
> > > > > > (literally half of the message is printed on the console).
> > > > > 
> > > > > > Cycling power usually clears the problem, although sometimes
> > > > > > there is a similar problem on boot-up: The system will boot
> > > > > > as normal, but hang as the RTAI application is started up.
> > > > > 
> > > > > > The init.d script works fine from the command line.  I can
> > > > > > start/stop the server and RTAI application at will without
> > > > > > seeing the problem.
> > > > > 
> > > > > > If I modify the server to not start the RTAI application,
> > > > > > the reboot sequence works just fine.
> > > > > 
> > > > > > My next test is to have the server actually start the
> > > > > > latency test suite program instead of the RTAI application
> > > > > > and see if the problem persists, but before I spend much
> > > > > > more time on the issue, I thought I would see if anyone has
> > > > > > had similar issues.
> > > > > 
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > jw
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > Yahoo! Groups Links
> > > > > 
> > > > > 
> > > > >
> > > >
> > >
> >
>




 