I did a little more googling, then actually looked into the rtai-load
script and found that one should execute a 'sync' command before
loading the rtai modules.
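A minimal sketch of that ordering for an /etc/init.d-style script: run sync, then insmod each module. The module list and path are borrowed from Julien's example quoted further down, not from rtai-load itself. The INSMOD, SYNC and MODDIR variables are assumptions added here so the script can be dry-run on a machine without RTAI.

```shell
#!/bin/sh
# Sketch only: flush filesystem buffers, then load the RTAI modules.
# INSMOD, SYNC and MODDIR are overridable (assumption, not part of rtai-load)
# so the function can be dry-run, e.g. with INSMOD=echo.
INSMOD=${INSMOD:-/sbin/insmod}
SYNC=${SYNC:-sync}
MODDIR=${MODDIR:-/usr/realtime/modules}

# Load order taken from the load script quoted later in this thread.
RTAI_MODULES="rtai_hal rtai_ksched rtai_lxrt rtai_sem rtai_mbx rtai_msg"

load_rtai() {
    # Write pending disk data out before touching the RTAI modules.
    $SYNC
    for m in $RTAI_MODULES; do
        # Stop at the first module that fails to load.
        $INSMOD "$MODDIR/$m.o" || return 1
    done
}

case "$1" in
    start) load_rtai ;;
esac
```

Running it with INSMOD=echo prints the module paths in load order without loading anything, which is a cheap way to check the list before a real boot.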
This is similar to the subject of a previous post I made about crashes
involving the open of /dev/mem that is required to do DIO processing.
(http://tech.groups.yahoo.com/group/ts-7000/message/6249). sejmke
pointed out that the answer was to add O_SYNC to the option list,
which cleared the problem up.
I understand that this flushes the output buffers, but don't really
know why that is important, just that it is.
I was pretty happy with this, until I tried to implement it on a second
7260 (I have half a dozen of them I need to get running) and ran into
a new issue: The boot process hangs when loading rtai_hal.o from an
/etc/init.d script, even with the sync command. The two machines are
clones of one another, with the only difference in hardware being the
USB WiFi Dongle version (requiring a different driver (zd1211 vs
zd_b)), and the brand of the USB CF.
I am making progress on this second machine however. I disabled every
non-essential service out of /etc/init.d, and suppressed the loading
of every non-essential driver, and now the second 7260 boots and loads
the RTAI modules reliably (30+ successful boots in a row, including
one involving an fsck). Now it is a matter of adding the drivers and
services back in one at a time until the problem recurs.
In any event, thanks again for your suggestions.
jw
--- In "Julien" <> wrote:
>
> Ok cool !
>
> Could the fsck be linked to your boot problem when loading your RTAI
> application? That seems strange... I didn't notice anything like
> this in my case.
>
> By the way, you can configure the number of boots between each fsck,
> can't you? I think I saw something about this when partitioning the CF
> in EXT2.
>
> Julien
>
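For reference, the mount-count limit Julien mentions is stored in the ext2 superblock and can be read and changed with tune2fs from e2fsprogs (`-l` lists the superblock, `-c` sets the maximum mount count). The sketch below assumes tune2fs and awk are available on the board; the device name /dev/sda1 and the helper name mounts_until_fsck are placeholders, not something from this thread.

```shell
#!/bin/sh
# Placeholder device: substitute the CF partition actually in use.
DEV=${DEV:-/dev/sda1}

# Read `tune2fs -l` output on stdin and print how many mounts remain
# before the mount-count-triggered fsck.
mounts_until_fsck() {
    awk -F: '
        /^Mount count/         { gsub(/ /, "", $2); count = $2 }
        /^Maximum mount count/ { gsub(/ /, "", $2); max = $2 }
        END { print max - count }
    '
}

# Typical use (needs root):
#   tune2fs -l "$DEV" | mounts_until_fsck
#   tune2fs -c 50 "$DEV"     # raise the limit to 50 mounts
```

Knowing how many mounts remain makes it possible to predict which boot will run the fsck instead of waiting for it to show up in a reboot loop.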
> --- In "jywmpg" <jywmpg@> wrote:
> >
> > Yes, that helped a great deal. Thanks very much. Things are working
> > much, much better than before.
> >
> > 1. I was using rtai-load to launch my programs. I now realize that
> > its main job was to load and unload the rtai modules. I suspect it is
> > just part of the test suite.
> >
> > The system I am developing is an embedded system whose main purpose is
> > to perform this DIO task, so the RTAI application and modules should
> > be running all the time. For ease of use, I just created an
> > /etc/init.d/rtai script that loads the modules I need at boot time.
> >
> > This cleared up several of the symptoms.
> >
> > 2. I took your suggestion and routed the RS-232 serial console output
> > to a file and examined it closely. I was happy to find a few relevant
> > messages just before the crash that pointed to my non real time task.
> >
> > To make a long story short, at reboot the real time task exits first.
> > Subsequently the non real time task tried to execute an rpc on the
> > task that had exited, trying to tell it to exit. Nothing good happened
> > after that.
> >
> > Adding a simple check before executing the rpc caused that crash to go
> > away.
> >
> > 3. At that point, I was able to reboot the machine in question
> > remotely via a script about 30 times in a row without issue.
> >
> > 4. Then the machine crashed on boot, just after the RTAI application
> > started.
> >
> > I noticed, however, that on this boot an fsck had been triggered on
> > the USB CF, since it was the 30th mounting without an fsck and that is
> > what it happens to be set for.
> >
> > 5. I reran the '30 reboots' script again and, sure enough, all reboots
> > were successful until about the 20th one, when an fsck was again
> > triggered, followed by the same sort of failure to boot as in #4.
> >
> > I suspect that with a little more Googling I can track that one down,
> > but just wanted to let you know how things turned (are turning) out.
> > I want to get to the bottom of the fsck issue, but I can live with 20
> > reboots between crashes for right now.
> >
> >
> > Thanks again for the help.
> >
> >
> >
> > jw
> >
> >
> > --- In "Julien" <kaezar1@> wrote:
> > >
> > > Hello
> > >
> > > Do you control your board by telnet? If yes, the error could be that
> > > you don't unload the RTAI modules properly, and when you try to
> > > reboot your system there are a lot of errors that you don't see
> > > (lots of numbers, like a matrix).
> > > I had this error after running the latency test and not unloading
> > > the RTAI modules. By telnet I first didn't understand why the
> > > connection was closed and the system didn't reboot. By RS232 I saw
> > > all the errors due to the RTAI modules still running.
> > >
> > > Just use a simple script to unload the modules, the opposite of your
> > > load script.
> > > For example if load script is :
> > > /sbin/insmod /usr/realtime/modules/rtai_hal.o
> > > /sbin/insmod /usr/realtime/modules/rtai_ksched.o
> > > /sbin/insmod /usr/realtime/modules/rtai_lxrt.o
> > > /sbin/insmod /usr/realtime/modules/rtai_sem.o
> > > /sbin/insmod /usr/realtime/modules/rtai_mbx.o
> > > /sbin/insmod /usr/realtime/modules/rtai_msg.o
> > >
> > > then the unload script is:
> > > /sbin/rmmod rtai_msg
> > > /sbin/rmmod rtai_mbx
> > > /sbin/rmmod rtai_sem
> > > /sbin/rmmod rtai_lxrt
> > > /sbin/rmmod rtai_ksched
> > > /sbin/rmmod rtai_hal
> > >
> > > Hope that helps you.
> > >
> > > Julien
> > >
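Since the unload order is just the load order reversed, one way to keep the two from drifting apart is to derive both from a single list. This is a sketch under that assumption; reverse_list and unload_rtai are hypothetical helper names, and RMMOD is made overridable only so the function can be dry-run.

```shell
#!/bin/sh
# RMMOD is overridable (assumption) so the function can be dry-run
# with RMMOD=echo.
RMMOD=${RMMOD:-/sbin/rmmod}

# Same load order as in the example above; unloading walks it backwards.
RTAI_MODULES="rtai_hal rtai_ksched rtai_lxrt rtai_sem rtai_mbx rtai_msg"

# Print the words given as arguments in reverse order.
reverse_list() {
    rev=""
    for m in "$@"; do
        rev="$m $rev"
    done
    # Unquoted on purpose: drops the trailing space.
    echo $rev
}

unload_rtai() {
    for m in $(reverse_list $RTAI_MODULES); do
        # rmmod takes the module name, without the .o suffix.
        $RMMOD "$m"
    done
}

case "$1" in
    stop) unload_rtai ;;
esac
```

With RMMOD=echo, unload_rtai prints rtai_msg first and rtai_hal last, so the modules that depend on the others are removed before the ones they depend on.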
> > > --- In sjanisch@ wrote:
> > > >
> > > > How are you stopping the RTAI application?
> > > >
> > > > I think I recall reading something on the RTAI board about issuing
> > > > stop_rt_timer, for instance, before one of the rt_task_wait_period
> > > > calls in a timer loop... which causes things to hang.
> > > >
> > > > Did you write a signal handler for the SIGTERM signal and write some
> > > > internal code to shut down everything nicely? How do you start/stop
> > > > the server at will from the command line -- ctrl-C or kill? They
> > > > really are not the same if the kill is handled correctly.
> > > >
> > > > Probably not much help...
> > > >
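To illustrate the SIGTERM point: if the server is launched from a shell wrapper, the wrapper can trap TERM and run an orderly shutdown before exiting. This is only a sketch of the idea in shell; the real server would install its own handler in its own language. stop_rt_app is a hypothetical placeholder for whatever tells the real-time side to finish, and on_term and shutdown_done are likewise illustrative names.

```shell
#!/bin/sh
# Set to 1 once the cleanup has run.
shutdown_done=0

# Hypothetical placeholder: replace with whatever asks the real-time
# task to exit (for example a command sent over the command mailbox).
stop_rt_app() {
    :
}

on_term() {
    # Guard so the cleanup runs once even if several signals arrive.
    [ "$shutdown_done" -eq 1 ] && return 0
    shutdown_done=1
    stop_rt_app
}

# Run the same cleanup for kill (TERM) and ctrl-C (INT).
trap on_term TERM INT
```

Routing both signals through one guarded function means ctrl-C and kill take the same shutdown path, and a second signal during reboot does not repeat the cleanup.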
> > > > wrote on 04/17/2007 08:02:26 AM:
> > > >
> > > > > Hi all,
> > > > >
> > > > > My question is: Do I need to do something special to start/stop
> > > > > RTAI applications from an /etc/init.d script?
> > > > >
> > > > > Background:
> > > > >
> > > > > I am developing an RTAI application that runs under a server that
> > > > > in turn is started/stopped via an /etc/init.d script. The init.d
> > > > > script starts the server, which then spawns rtai-load to run the
> > > > > RTAI application.
> > > > >
> > > > > The RTAI application is similar to the 'latency' and 'display'
> > > > > test programs in the latency test suite. The hard real time part
> > > > > runs every half second to do some DIO operations, then reports
> > > > > results to the non-real time part via an rt mailbox. There is a
> > > > > second rt mailbox for sending commands to the real time component
> > > > > from the non-real time side.
> > > > >
> > > > > I am having a problem with the reboot sequence. When I issue the
> > > > > 'reboot' command, about a third of the time the system hangs just
> > > > > after the TERM signal is sent to all tasks (literally half of the
> > > > > message is printed on the console).
> > > > >
> > > > > Cycling power usually clears the problem, although sometimes there
> > > > > is a similar problem on boot-up: The system will boot as normal,
> > > > > but hang as the RTAI application is started up.
> > > > >
> > > > > The init.d script works fine from the command line. I can
> > > > > start/stop the server and RTAI application at will without seeing
> > > > > the problem.
> > > > >
> > > > > If I modify the server to not start the RTAI application, the
> > > > > reboot sequence works just fine.
> > > > >
> > > > > My next test is to have the server actually start the latency test
> > > > > suite program instead of the RTAI application and see if the
> > > > > problem persists, but before I spend much more time on the issue,
> > > > > I thought I would see if anyone has had similar issues.
> > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > jw