--- In "xelaco" <> wrote:
>
> * When gcc inserts a call to memcpy(), everything works wonderfully.
> * When gcc decides it wants to inline its ARM assembly-optimized
> version of memcpy(), the result is as if I called memcpy() with a
> source address aligned to an int (haven't played with unaligned dest
> yet):
>
> memcpy(dest, src & ~3, n);
Ok, now I have a decent trace with the assembly output using -O0
-ggdb. I've put NOPs around memcpy() to identify the exact code.
1985 __asm__ __volatile__("mov r0, r0\n\tmov r0, r0\n\tmov r0, r0\n\t");
(gdb) n
1986 memcpy(&HoraInicial, &pTipus->HoraQuin, sizeof(HoraInicial));
(gdb) display HoraInicial
1: HoraInicial = 32
(gdb) print &HoraInicial
$1 = (long unsigned int *) 0x7fb60098
(gdb) print pTipus->HoraQuin
$2 = 1191413700
(gdb) print &pTipus->HoraQuin
$3 = (long int *) 0x7fb6099d
(gdb) display pTipus->HoraQuin
2: pTipus->HoraQuin = 1191413700
(gdb) print sizeof(HoraInicial)
$4 = 4
(gdb) print sizeof(pTipus->HoraQuin)
$5 = 4
(gdb) n
1987 __asm__ __volatile__("mov r0, r0\n\tmov r0, r0\n\tmov r0, r0\n\t");
2: pTipus->HoraQuin = 1191413700
1: HoraInicial = 201557956
Ok. The memcpy() just did the wrong thing.
(gdb) x/4x &pTipus->HoraQuin
0x7fb6099d: 0x470387c4 0x00002401 0x00000000 0x00000000
(gdb) x/4x &HoraInicial
0x7fb60098: 0x0c0387c4 0x00000073 0x00000100 0x00000000
(gdb) x/4x &pTipus->HoraQuin-1
0x7fb60999: 0x0c240006 0x470387c4 0x00002401 0x00000000
Memory contents from 0x7fb6099c, byte per byte:
0x0c 0xc4 0x87 0x03 0x47 0x01
(gdb) x/4x &HoraInicial-1
0x7fb60094: 0x7fb606a4 0x0c0387c4 0x00000073 0x00000100
Memory contents from 0x7fb60098, byte per byte:
0xc4 0x87 0x03 0x0c 0x73
As you can see, the destination gets 3 of the 4 bytes right, but its
top byte is 0x0c instead of 0x47. 0x0c happens to be the byte at
0x7fb6099c, the 4-byte-aligned address just below the source. I would
have expected 0x0387c40c at the destination, however.
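For what it's worth, the value that lands in the destination is consistent with how ARM cores before ARMv6 are documented to handle a word load from an unaligned address: the low two address bits are ignored for the memory access and the loaded word is then rotated right by 8 bits per byte of misalignment. Below is a minimal sketch in plain C that models this on the bytes from the dump. It assumes a pre-ARMv6 core and little-endian memory; the function and array names are made up for illustration and are not part of the traced code.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits. */
uint32_t ror32(uint32_t w, unsigned n)
{
    n &= 31u;
    return n ? (w >> n) | (w << (32u - n)) : w;
}

/* Assemble a little-endian 32-bit word from 4 consecutive bytes. */
uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Model of a word load at byte offset `off` into `mem`, ASSUMING the
 * pre-ARMv6 rule: fetch the enclosing aligned word, then rotate it
 * right by 8 * (off & 3) bits. `mem` stands for memory that starts at
 * a 4-byte-aligned address.
 */
uint32_t model_unaligned_ldr(const uint8_t *mem, unsigned off)
{
    uint32_t aligned = le32(mem + (off & ~3u));
    return ror32(aligned, 8u * (off & 3u));
}

/* Bytes starting at the aligned address 0x7fb6099c, taken from the dump. */
const uint8_t dump_from_99c[8] = {
    0x0c, 0xc4, 0x87, 0x03, 0x47, 0x01, 0x24, 0x00
};
```

With these bytes, model_unaligned_ldr(dump_from_99c, 1) gives 0x0c0387c4, which is what gdb shows in HoraInicial, while le32(dump_from_99c + 1) gives 0x470387c4, the value actually stored in pTipus->HoraQuin.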
Now, the assembly looks like this:
(gdb) disass
[...]
0x0000f5fc <ObtienePeriodica+240>: ldr r2, [r11, #-136]
0x0000f600 <ObtienePeriodica+244>: sub r3, r11, #1616 ; 0x650
0x0000f604 <ObtienePeriodica+248>: sub r3, r3, #4 ; 0x4
0x0000f608 <ObtienePeriodica+252>: sub r3, r3, #12 ; 0xc
0x0000f60c <ObtienePeriodica+256>: ldr r2, [r2]
0x0000f610 <ObtienePeriodica+260>: str r2, [r3]
[...]
(gdb) info registers
r0 0x7fb6098a 2142636426
r1 0x93 147
r2 0xc0387c4 201557956
r3 0x7fb60098 2142634136
r4 0x14 20
r5 0x2aac7e14 715947540
r6 0x7fb60cf4 2142637300
r7 0x2ee9c 192156
r8 0x2 2
r9 0x18b90 101264
r10 0x2ac4e000 717545472
r11 0x7fb606f8 2142635768
r12 0x7fb606fc 2142635772
sp 0x7fb6007c 0x7fb6007c
lr 0x12254 74324
pc 0xf614 0xf614 <ObtienePeriodica+264>
fps 0x1001000 16781312
cpsr 0x60000010 1610612752
Almost immediately afterwards comes another memcpy(), this time with
both addresses 4-byte aligned, which always gives the correct result:
[...]
0x0000f64c <ObtienePeriodica+320>: ldr r3, [r11, #-108]
0x0000f650 <ObtienePeriodica+324>: add r2, r3, #1 ; 0x1
0x0000f654 <ObtienePeriodica+328>: sub r3, r11, #1616 ; 0x650
0x0000f658 <ObtienePeriodica+332>: sub r3, r3, #4 ; 0x4
0x0000f65c <ObtienePeriodica+336>: sub r3, r3, #12 ; 0xc
0x0000f660 <ObtienePeriodica+340>: ldr r3, [r3]
0x0000f664 <ObtienePeriodica+344>: str r3, [r2]
[...]
gcc is generating an add here that was not there in the faulty code.
I'm no ARM assembly expert, but I _think_ ldr and str don't play nice
with non-word-aligned addresses. At least the first code snippet
is _guaranteed_ to issue an ldr on an unaligned address.
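Whatever the root cause turns out to be, one way to sidestep it is to read the field through an unsigned char pointer, so the compiler cannot assume more than byte alignment and has no reason to emit a single word-sized ldr. Below is a minimal sketch, assuming a 32-bit little-endian field as in the dump; load_u32_bytewise and sample_buf are hypothetical names for illustration and do not appear in the original code.

```c
#include <stdint.h>

/*
 * Read a 32-bit little-endian value one byte at a time. The pointer
 * is treated as unsigned char *, so the source address may have any
 * alignment.
 * ASSUMPTION: the field is stored little-endian, as in the dump.
 */
uint32_t load_u32_bytewise(const void *src)
{
    const unsigned char *p = (const unsigned char *)src;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Sample buffer mirroring the dump: the field starts at offset 1. */
const unsigned char sample_buf[8] = {
    0x0c, 0xc4, 0x87, 0x03, 0x47, 0x01, 0x24, 0x00
};
```

In the traced function the call would look something like HoraInicial = load_u32_bytewise(&pTipus->HoraQuin); this is untested on the target and only meant as a starting point.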
Ideas? This is starting to look like a gcc bug.
Alex