comp.lang.c - 26 new messages in 5 topics

comp.lang.c
http://groups.google.com/group/comp.lang.c?hl=en

comp.lang.c@googlegroups.com

Today's topics:

* 32/64 bit cc differences - 19 messages, 9 authors
http://groups.google.com/group/comp.lang.c/t/9f3fc73a50133ab9?hl=en
* What is "the instance structure must be manually initialized"? - 3 messages,
2 authors
http://groups.google.com/group/comp.lang.c/t/89ecdf01bc4a91d6?hl=en
* C can be very literal. - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/d81909b19825f4d7?hl=en
* What is wrong with this c-program? - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/18415cd95543b14b?hl=en
* A virtual trinary CPU - 2 messages, 2 authors
http://groups.google.com/group/comp.lang.c/t/1e016c1ac82a26be?hl=en

==============================================================================
TOPIC: 32/64 bit cc differences
http://groups.google.com/group/comp.lang.c/t/9f3fc73a50133ab9?hl=en
==============================================================================

== 1 of 19 ==
Date: Tues, Jan 14 2014 3:47 pm
From: Andrew Cooper

On 14/01/2014 23:31, Andrew Cooper wrote:
> On 13/01/2014 06:31, JohnF wrote:
>> J. Clarke <jclarkeusenet@cox.net> wrote:
>>> john@please.see.sig.for.email.com says...
>>>> <snip>
>>>> So I'm asking about weird-ish 32/64-bit cc differences
>>>> that might give rise to this kind of behavior. Presumably,
>>>> there's some subtle bug that I'm failing to see in the code,
>>>> and which the output isn't helping me to zero in on. Thanks,
>>>
>>> I'm no expert but one thing I learned <mumble> years ago was to make
>>> sure that the problem you're chasing really is the problem you _think_
>>> you're chasing. You've got three different versions of the compiler
>>> with two of them giving one behavior and the third, oldest one giving a
>>> different behavior, which you are attributing to 64 bit vs 32-bit. It
>>> could also be the result of some change made to the more recent releases
>>> of the compiler and I would want to rule that out rather than assuming
>>> that it's a 32- vs 64- bit issue.
>>
>> Problem found and fixed, as per earlier followups.
>> Turned out to be slightly different float behavior.
>> But you could be right that it wasn't a 64-bit issue,
>> per se. And I'd tried cc -m32-bit, as per previous
>> followups, but compiler barfed at that switch (not
>> sure why, man cc wasn't on that box). So I couldn't
>> try to get a finer-grained understanding of problem.
>>
>
> I know I am a little late to the thread here, but can explain your problem.
>
> I don't know whether you have updated the code on your website, but the
> code (as gotten 10 mins ago) won't work.
>
> When compiled as 32bit, ran1() uses x87 FPU instructions, but when
> compiled as 64bit, ran1() uses SSE instructions.
>
> The 32bit code keeps its intermediate values on the x87 register stack,
> causing rounding to occur at 80 bits worth of precision, which is
> different to the SSE code (which appears to be rounding at 64 bits, but
> frankly it's late and SSE instructions look far too similar for their own
> good)
>
>
> Basically, avoid any form of floating point calculations at all,
> especially if you are expecting something deterministic. You (like 99
> out of every 100 programmers, myself included) do not know how to use
> them correctly.
>
> ~Andrew
>

And in addition, using an identical binary, the chances are very good
that you would get a different stream of random numbers on Intel vs AMD
hardware, and you would get different random numbers from running the
set of instructions under different operating systems on identical
hardware. C itself does not provide you with an ability to set the
x87/SSE general control registers.

~Andrew

== 2 of 19 ==
Date: Wed, Jan 15 2014 1:06 am
From: JohnF

Keith Thompson <kst-u@mib.org> wrote:
> If CGI imposes a requirement to write binary data to stdout,
> then I'm sure there's a solution;

Oh, yeah, a well-known one that I summarized above (snipped here),
setmode(fileno(stdout),O_BINARY). But that only exists on windows,
so you need some #ifdef's to handle the non-portability.

Recall that I'd only mentioned this as a portability issue
because I had actually been tripped up by it, whereas 32-bit ints
hadn't ever been any problem for me (not since about 1989, anyway,
when I actually did have that problem, when requested to port
one of my programs from VAX to msdos pc).

> But first check for existing
> answers; you're unlikely to be the first person to run into this.

setmode() is indeed the existing answer everybody uses,
as far as I know, but there's no #ifdef-less portable
solution I'm aware of.
--
John Forkosh ( mailto: j@f.com where j=john and f=forkosh )

== 3 of 19 ==
Date: Wed, Jan 15 2014 1:19 am
From: JohnF

Robert Wessel <robertwessel2@yahoo.com> wrote:
> JohnF <john@please.see.sig.for.email.com> wrote:
>>>
>>> Thanks for the info. Here's the problem that I've encountered.
>>> Lots of my programs are cgi's that emit binary files, typically
>>> gifs, used in html as, e.g.,
>>> <img src="/cgi-bin/myprog.cgi?instructions and/or data for image">
>>> In this case, myprog >>has to<<, as I understand it, emit to stdout.
>>> Is that right? If so, I need to put stdout in "binary mode"
>>> (that's what windows calls it, the typical win C command being
>>> something like setmode(fileno(stdout),O_BINARY)).
>>> Got a fix for, or insight into, dealing with that without
>>> messy #ifdef stuff? Thanks,
>>
>>Sorry for following myself up:
>>I should have mentioned that several "intended-to-be-portable"
>>fixes that I've tried, in particular freopen("CON","wb",stdout)
>>and stdout=fdopen(STDOUT_FILENO,"wb"), don't work or don't work
>>portably, for one reason or another (tales of woe elided:)
>>So I'm asking for a pretty much known-to-portably-work fix.
>
> I don't believe there is a portable fix. The usual thing under
> Windows in a C CGI script is to use _setmode() to change stdout to
> binary. Bury it in your platform adaptation layer, and avoid the
> #ifdefs.

Thanks, Robert. But no such layer besides the #ifdef's.
I had thought about writing my own dummy setmode() that
does nothing, compiled only on unix, so I could call it
regardless of platform. That would minimize #ifdef's.
But some windows compilers call the func setmode() and
the constant O_BINARY, whereas others call it _setmode()
and _O_BINARY. Go figure. So I have to check that, via
additional #ifdef's. Just a big annoying mess, but not
a real problem, except that it's not a standard, so I
don't know when things will change, breaking that code.
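
A minimal sketch of the "one wrapper, callable everywhere" idea described
above. The helper name stdout_set_binary() is hypothetical, and the choice
of preprocessor tests is an assumption: _WIN32 is taken to cover the
compilers that spell it _setmode()/_O_BINARY, and __DJGPP__ the one that
spells it setmode()/O_BINARY. Other toolchains may need another branch.

```c
#include <stdio.h>

#if defined(_WIN32)
#  include <fcntl.h>   /* _O_BINARY */
#  include <io.h>      /* _setmode(), _fileno() */
#elif defined(__DJGPP__)
#  include <fcntl.h>   /* O_BINARY */
#  include <io.h>      /* setmode() */
#endif

/* Hypothetical helper: put stdout into binary mode where that concept
   exists. Returns 0 on success, -1 on failure. */
int stdout_set_binary(void)
{
    /* Flush first so nothing buffered in text mode is written later. */
    if (fflush(stdout) != 0)
        return -1;
#if defined(_WIN32)
    return _setmode(_fileno(stdout), _O_BINARY) == -1 ? -1 : 0;
#elif defined(__DJGPP__)
    return setmode(fileno(stdout), O_BINARY) == -1 ? -1 : 0;
#else
    return 0;   /* Unix-like: text and binary streams behave the same. */
#endif
}
```

The #ifdef's don't go away, but they all live in this one function, and
the cgi itself just calls stdout_set_binary() once before writing the gif.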
--
John Forkosh ( mailto: j@f.com where j=john and f=forkosh )

== 4 of 19 ==
Date: Wed, Jan 15 2014 1:25 am
From: JohnF

James Kuyper <jameskuyper@verizon.net> wrote:
>> JohnF <john@please.see.sig.for.email.com> wrote:
>> [...] several "intended-to-be-portable"
>> fixes that I've tried, in particular freopen("CON","wb",stdout)
>> and stdout=fdopen(STDOUT_FILENO,"wb"), don't work or don't work
>> portably, for one reason or another (tales of woe elided:)
>
> For freopen(), "It is implementation-defined which changes of mode are
> permitted (if any), and under what circumstances.", so you can't count
> on that to work.
>
> fdopen() is a POSIX function; I've no idea whether the function with
> that name that you're trying to use on a mswin system is supposed to
> conform fully to POSIX specifications for that function. More
> importantly, stdout is only required to be an expression of the type
> "pointer to FILE"; it's not required to be the name of a pointer
> variable that you can assign to. For instance, an implementation of
> <stdio.h> could have:
>
> extern FILE __std_streams[];
> #define stdout (&__std_streams[0])
> #define stdin (&__std_streams[1])
> #define stderr (&__std_streams[2])
>
> You could get around that problem, at least, by assigning the value
> returned by fdopen() in your own pointer, rather than trying to store it
> in stdout.

Thanks for the above info and suggestion, James.
I'll play around with it a little more to see if something
more portable than setmode() works, at least on the free
djgpp and mingw compilers.
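
For reference, a sketch of what "your own pointer" might look like on a
POSIX system. The helper name open_binary_stdout() is hypothetical, and
duplicating the descriptor with dup() is an added assumption, made so that
closing the new stream leaves the real stdout usable. Whether "wb" has the
desired effect through fdopen() on the Windows compilers is exactly the
open question, so this only shows the shape of the code.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fdopen() and dup() under strict -std= modes */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: return a separate "wb" stream that writes to the
   same place as stdout, or NULL on failure. stdout itself is never
   assigned to, since it need not be a modifiable variable. */
FILE *open_binary_stdout(void)
{
    FILE *out;
    int fd = dup(STDOUT_FILENO);   /* so fclose(out) does not close fd 1 */

    if (fd == -1)
        return NULL;
    out = fdopen(fd, "wb");
    if (out == NULL)
        close(fd);                 /* don't leak the duplicate */
    return out;
}
```

If both stdout and the returned stream are written to, each needs an
fflush() before switching to the other, or the output can interleave in
the wrong order.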

>> So I'm asking for a pretty much known-to-portably-work fix.
>
> I can't help you with that. The last time I did any CGI work was more
> than a decade ago, and the output was pure text, so the fact that stdout
> is in text mode wasn't a problem. Also, it was on a unix-like system
> where there's no difference between text mode and binary mode.

--
John Forkosh ( mailto: j@f.com where j=john and f=forkosh )

== 5 of 19 ==
Date: Wed, Jan 15 2014 1:47 am
From: JohnF

Andrew Cooper <root@127.0.0.1> wrote:
>> On 13/01/2014 06:31, JohnF wrote:
>>>
>>> Problem found and fixed, as per earlier followups.
>>> Turned out to be slightly different float behavior.
>>
>> I know I am a little late to the thread here,
>> but can explain your problem.
>> I don't know whether you have updated the code on your website,
>> but the code (as gotten 10 mins ago) won't work.
>> When compiled as 32bit, ran1() uses x87 FPU instructions,
>> but when compiled as 64bit, ran1() uses SSE instructions.
>>
>> The 32bit code keeps its intermediate values on the x87 register stack,
>> causing rounding to occur at 80 bits worth of precision, which is
>> different to the SSE code (which appears to be rounding at 64 bits, but
>> frankly it's late and SSE instructions look far too similar for their own
>> good)
>>
>> Basically, avoid any form of floating point calculations at all,
>> especially if you are expecting something deterministic.
>> You (like 99 out of every 100 programmers, myself included)
>> do not know how to use them correctly. ~Andrew
>
> And in addition, using an identical binary, the chances are very good
> that you would get a different stream of random numbers on Intel vs AMD
> hardware, and you would get different random numbers from running the
> set of instructions under different operating systems on identical
> hardware. C itself does not provide you with an ability to set the
> x87/SSE general control registers. ~Andrew

I believe your "as gotten 10 mins ago" code is current.
But if the following remark seems wrong, maybe download again.
I agree that the float result called "temp" in ran1() won't
be portable. But it's not used anywhere, any more. Instead,
there's now that "static long iran;" near the top of the module
that's the only result actually used now. And that's a
completely integer calculation.
If you look real, real carefully, you'll see you can
invoke it as fm -r 0 -etc, which will revert to the original
rng usage. That's there just so stuff which was previously
encrypted can still be decrypted. And then, yeah, in that
case you better not try to decrypt with a 64-bit executable
if you encrypted with a 32-bit one. I guess that's what
all that gpl stuff about no "warranty of merchantability"
is all about:)
--
John Forkosh ( mailto: j@f.com where j=john and f=forkosh )

== 6 of 19 ==
Date: Wed, Jan 15 2014 1:57 am
From: JohnF

Stephen Sprunk <stephen@sprunk.org> wrote:
> On 13-Jan-14 00:31, JohnF wrote:
>> Problem found and fixed, as per earlier followups. Turned out to be
>> slightly different float behavior. But you could be right that it
>> wasn't a 64-bit issue, per se. And I'd tried cc -m32-bit, as per
>> previous followups, but compiler barfed at that switch
>
> Shouldn't that be "-m32"?
> http://gcc.gnu.org/onlinedocs/gcc-4.7.3/gcc/
> i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options

Actually, I tried both, and neither was recognized by that
particular gcc. But what I particularly wanted was to compare
the behavior of 32-bit vs 64-bit float, regardless of other
architecture differences, and my reading was that -m32-bit
was the more relevant switch. But you're right that if both
worked, I'd have tried compiling all four possible ways,
seen what differences/samenesses in the output occurred,
and proceeded from there.

>> (not sure why, man cc wasn't on that box). So I couldn't try to get a
>> finer-grained understanding of problem.
> Just Google "man gcc"; that's available nearly everywhere. S

Sure, I looked on my own box, and elsewheres, too.
But that only showed both -m32's should have been recognized.
I'd hoped the man page on the box where they didn't work
would reveal why not.
--
John Forkosh ( mailto: j@f.com where j=john and f=forkosh )

== 7 of 19 ==
Date: Wed, Jan 15 2014 8:01 am
From: Kaz Kylheku

On 2014-01-14, Keith Thompson <kst-u@mib.org> wrote:
> If CGI imposes a requirement to write binary data to stdout,
> then I'm sure there's a solution; I just have no idea what
> it is.

The obvious solution (to that and any "host" of other problems, no pun
intended) is to use a Unix-like system for a webserver with CGI programs.

Which anyone who isn't out of their freaking mind generally does.

== 8 of 19 ==
Date: Wed, Jan 15 2014 8:36 am
From: Keith Thompson

JohnF <john@please.see.sig.for.email.com> writes:
> Stephen Sprunk <stephen@sprunk.org> wrote:
>> On 13-Jan-14 00:31, JohnF wrote:
>>> Problem found and fixed, as per earlier followups. Turned out to be
>>> slightly different float behavior. But you could be right that it
>>> wasn't a 64-bit issue, per se. And I'd tried cc -m32-bit, as per
>>> previous followups, but compiler barfed at that switch
>>
>> Shouldn't that be "-m32"?
>> http://gcc.gnu.org/onlinedocs/gcc-4.7.3/gcc/
>> i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options
>
> Actually, I tried both, and neither was recognized by that
> particular gcc. But what I particularly wanted was to compare
> the behavior of 32-bit vs 64-bit float, regardless of other
> architecture differences, and my reading was that -m32-bit
> was the more relevant switch. But you're right that if both
> worked, I'd have tried compiling all four possible ways,
> seen what differences/samenesses in the output occurred,
> and proceeded from there.

I don't think you're encountering 32-bit vs. 64-bit float. Types
`float` and `double` are typically 32 and 64 bits, respectively,
on both 32-bit and 64-bit systems. There are differences in the
representation of `long double`, and in how intermediate results
are stored.

(I've worked on systems where `float` and `double` are both 64 bits,
but they were Cray vector machines.)
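
A quick way to see this on a given box, sketched under the assumption of
a C99 compiler (FLT_EVAL_METHOD comes from <float.h> in C99). The function
name print_fp_layout() is made up for the example.

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Print the storage size and significand width of each floating type,
   plus how the implementation says it evaluates intermediate results.
   Returns 0 on success, -1 if the stream reports an error. */
int print_fp_layout(FILE *out)
{
    fprintf(out, "float:       %u bits, %d-bit significand\n",
            (unsigned)(sizeof(float) * CHAR_BIT), FLT_MANT_DIG);
    fprintf(out, "double:      %u bits, %d-bit significand\n",
            (unsigned)(sizeof(double) * CHAR_BIT), DBL_MANT_DIG);
    fprintf(out, "long double: %u bits, %d-bit significand\n",
            (unsigned)(sizeof(long double) * CHAR_BIT), LDBL_MANT_DIG);
    /* 0: evaluate in the operand type; 1: at least double;
       2: long double; -1: indeterminable. */
    fprintf(out, "FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
    return ferror(out) ? -1 : 0;
}
```

Calling print_fp_layout(stdout) from both the 32-bit and the 64-bit build
would typically show the same sizes for `float` and `double`, with any
difference showing up in the `long double` line and in FLT_EVAL_METHOD.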

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

== 9 of 19 ==
Date: Thurs, Jan 16 2014 2:54 pm
From: Andrew Cooper

On 15/01/2014 09:47, JohnF wrote:
> Andrew Cooper <root@127.0.0.1> wrote:
>>> On 13/01/2014 06:31, JohnF wrote:
>>>>
>>>> Problem found and fixed, as per earlier followups.
>>>> Turned out to be slightly different float behavior.
>>>
>>> I know I am a little late to the thread here,
>>> but can explain your problem.
>>> I don't know whether you have updated the code on your website,
>>> but the code (as gotten 10 mins ago) won't work.
>>> When compiled as 32bit, ran1() uses x87 FPU instructions,
>>> but when compiled as 64bit, ran1() uses SSE instructions.
>>>
>>> The 32bit code keeps its intermediate values on the x87 register stack,
>>> causing rounding to occur at 80 bits worth of precision, which is
>>> different to the SSE code (which appears to be rounding at 64 bits, but
>>> frankly it's late and SSE instructions look far too similar for their own
>>> good)
>>>
>>> Basically, avoid any form of floating point calculations at all,
>>> especially if you are expecting something deterministic.
>>> You (like 99 out of every 100 programmers, myself included)
>>> do not know how to use them correctly. ~Andrew
>>
>> And in addition, using an identical binary, the chances are very good
>> that you would get a different stream of random numbers on Intel vs AMD
>> hardware, and you would get different random numbers from running the
>> set of instructions under different operating systems on identical
>> hardware. C itself does not provide you with an ability to set the
>> x87/SSE general control registers. ~Andrew
>
> I believe your "as gotten 10 mins ago" code is current.
> But if the following remark seems wrong, maybe download again.
> I agree that the float result called "temp" in ran1() won't
> be portable. But it's not used anywhere, any more. Instead,
> there's now that "static long iran;" near the top of the module
> that's the only result actually used now. And that's a
> completely integer calculation.
> If you look real, real carefully, you'll see you can
> invoke it as fm -r 0 -etc, which will revert to the original
> rng usage. That's there just so stuff which was previously
> encrypted can still be decrypted. And then, yeah, in that
> case you better not try to decrypt with a 64-bit executable
> if you encrypted with a 32-bit one. I guess that's what
> all that gpl stuff about no "warranty of merchantability"
> is all about:)
>

Apologies for being curt, but:

I really don't care what your C code says. It compiles differently
under 32bit and 64bit, using different floating point units in the x86
architecture. It will not generate the same stream of random numbers.
Your 32bit and 64bit binaries will not be capable of decrypting each
other's outputs.

Looking at the thread so far, it is very likely that you have made your
"1 byte in 1000" problem substantially rarer, but you really have not
removed it entirely. You cannot use floating point numbers of any kind
and have any expectation of determinism.

~Andrew

== 10 of 19 ==
Date: Thurs, Jan 16 2014 3:16 pm
From: James Kuyper

On 01/16/2014 05:54 PM, Andrew Cooper wrote:
> On 15/01/2014 09:47, JohnF wrote:
...
>> I believe your "as gotten 10 mins ago" code is current.
>> But if the following remark seems wrong, maybe download again.
>> I agree that the float result called "temp" in ran1() won't
>> be portable. But it's not used anywhere, any more. Instead,
>> there's now that "static long iran;" near the top of the module
>> that's the only result actually used now. And that's a
>> completely integer calculation.
...
> Apologies for being curt, but:
>
> I really don't care what your C code says. It compiles differently
> under 32bit and 64bit, using different floating point units in the x86
> architecture.

Not caring about what his C code says seems rather odd, since that code
determines whether or not your assertions about the results of compiling
it are correct. He's asserted that the current version of his code
performs a "completely integer calculation". If that description is
correct (I haven't bothered to check), then why would it use the
floating point units at all, much less using different ones on the two
different machines?

If he's wrong, and it's not yet a "completely integer calculation", then
that's a different and very legitimate issue - but what his C code says
is very relevant to that issue.

== 11 of 19 ==
Date: Thurs, Jan 16 2014 3:31 pm
From: Robert Wessel

On Thu, 16 Jan 2014 22:54:27 +0000, Andrew Cooper <root@127.0.0.1>
wrote:

>On 15/01/2014 09:47, JohnF wrote:
>> Andrew Cooper <root@127.0.0.1> wrote:
>>>> On 13/01/2014 06:31, JohnF wrote:
>>>>>
>>>>> Problem found and fixed, as per earlier followups.
>>>>> Turned out to be slightly different float behavior.
>>>>
>>>> I know I am a little late to the thread here,
>>>> but can explain your problem.
>>>> I don't know whether you have updated the code on your website,
>>>> but the code (as gotten 10 mins ago) won't work.
>>>> When compiled as 32bit, ran1() uses x87 FPU instructions,
>>>> but when compiled as 64bit, ran1() uses SSE instructions.
>>>>
>>>> The 32bit code keeps its intermediate values on the x87 register stack,
>>>> causing rounding to occur at 80 bits worth of precision, which is
>>>> different to the SSE code (which appears to be rounding at 64 bits, but
>>>> frankly it's late and SSE instructions look far too similar for their own
>>>> good)
>>>>
>>>> Basically, avoid any form of floating point calculations at all,
>>>> especially if you are expecting something deterministic.
>>>> You (like 99 out of every 100 programmers, myself included)
>>>> do not know how to use them correctly. ~Andrew
>>>
>>> And in addition, using an identical binary, the chances are very good
>>> that you would get a different stream of random numbers on Intel vs AMD
>>> hardware, and you would get different random numbers from running the
>>> set of instructions under different operating systems on identical
>>> hardware. C itself does not provide you with an ability to set the
>>> x87/SSE general control registers. ~Andrew
>>
>> I believe your "as gotten 10 mins ago" code is current.
>> But if the following remark seems wrong, maybe download again.
>> I agree that the float result called "temp" in ran1() won't
>> be portable. But it's not used anywhere, any more. Instead,
>> there's now that "static long iran;" near the top of the module
>> that's the only result actually used now. And that's a
>> completely integer calculation.
>> If you look real, real carefully, you'll see you can
>> invoke it as fm -r 0 -etc, which will revert to the original
>> rng usage. That's there just so stuff which was previously
>> encrypted can still be decrypted. And then, yeah, in that
>> case you better not try to decrypt with a 64-bit executable
>> if you encrypted with a 32-bit one. I guess that's what
>> all that gpl stuff about no "warranty of merchantability"
>> is all about:)
>>
>
>Apologies for being curt, but:
>
>I really don't care what your C code says. It compiles differently
>under 32bit and 64bit, using different floating point units in the x86
>architecture. It will not generate the same stream of random numbers.
>Your 32bit and 64bit binaries will not be capable of decrypting each
>other's outputs.
>
>Looking at the thread so far, it is very likely that you have made your
>"1 byte in 1000" problem substantially rarer, but you really have not
>removed it entirely. You cannot use floating point numbers of any kind
>and have any expectation of determinism.

Surely that's a bit too strong a statement. Two compilers implementing
strict IEEE FP semantics should produce programs generating the same
results. Of course that's rarely the default mode for compilers.

But you're right, one should generally avoid depending on the exact
results of FP operations, especially the corner cases.

== 12 of 19 ==
Date: Thurs, Jan 16 2014 4:23 pm
From: glen herrmannsfeldt

Robert Wessel <robertwessel2@yahoo.com> wrote:

(snip, someone wrote)
>>Looking at the thread so far, it is very likely that you have made your
>>"1 byte in 1000" problem substantially rarer, but you really have not
>>removed it entirely. You cannot use floating point numbers of any kind
>>and have any expectation of determinism.

> Surely that's a bit too strong a statement. Two compilers implementing
> strict IEEE FP semantics should produce programs generating the same
> results. Of course that's rarely the default mode for compilers.

Hmmm. The extra precision used in the x87 is part of the IEEE
standard. In addition, IEEE 754 supports different rounding
modes, so you would also have to be sure that the same rounding
was done in all cases.

Also, while many systems support the IEEE floating point formats,
they may not necessarily claim everything else about the standard.

> But you're right, one should generally avoid depending on the exact
> results of FP operations, especially the corner cases.

I would have to read it more carefully to be sure, but I don't
believe that the intent of the standard was to generate bit exact
results, as this case requires.

-- glen

== 13 of 19 ==
Date: Thurs, Jan 16 2014 6:49 pm
From: Robert Wessel

On Fri, 17 Jan 2014 00:23:58 +0000 (UTC), glen herrmannsfeldt
<gah@ugcs.caltech.edu> wrote:

>Robert Wessel <robertwessel2@yahoo.com> wrote:
>
>(snip, someone wrote)
>>>Looking at the thread so far, it is very likely that you have made your
>>>"1 byte in 1000" problem substantially rarer, but you really have not
>>>removed it entirely. You cannot use floating point numbers of any kind
>>>and have any expectation of determinism.
>
>> Surely that's a bit too strong a statement. Two compilers implementing
>> strict IEEE FP semantics should produce programs generating the same
>> results. Of course that's rarely the default mode for compilers.
>
>Hmmm. The extra precision used in the x87 is part of the IEEE
>standard. In addition, IEEE 754 supports different rounding
>modes, so you would also have to be sure that the same rounding
>was done in all cases.

While extended formats are allowed for by IEEE, it's *not* compliant
in general to do intermediate operations with extra precision. So
many compilers for x87 mode have a "strict" option which results in
storing the result of each intermediate operation to a correct sized
memory item, in order to get the exact rounding. Often there's an
option as well that uses the x87 precision control to achieve part of
that (while leaving oddness happening with the exponents).

The rounding mode has a defined default (round to even), and the
implementation should not be altering that (although providing a way
for the program to change that is common).

>Also, while many systems support the IEEE floating point formats,
>they may not necessarily claim everything else about the standard.
>
>> But you're right, one should generally avoid depending on the exact
>> results of FP operations, especially the corner cases.
>
>I would have to read it more carefully to be sure, but I don't
>believe that the intent of the standard was to generate bit exact
>results, as this case requires.

The standard recommends that languages define a way of doing exactly
that (get the same results on all implementations). See the section
on reproducibility.
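
The default rounding mentioned above can be observed from plain C. This is
only a sketch, assuming `float` and `double` are the IEEE 754 32-bit and
64-bit binary formats and that nothing has changed the rounding mode; the
function name is made up.

```c
/* Returns 1 if double -> float conversion resolves exact ties toward
   the neighbour with an even significand, 0 otherwise. volatile keeps
   the conversions from being folded away or held in wider registers. */
int ties_round_to_even(void)
{
    /* Exactly midway between the floats 1 and 1 + 2^-23. */
    volatile double lo_tie = 1.0 + 0x1p-24;
    /* Exactly midway between the floats 1 + 2^-23 and 1 + 2^-22. */
    volatile double hi_tie = 1.0 + 3 * 0x1p-24;

    volatile float lo = (float)lo_tie;   /* even neighbour is 1 */
    volatile float hi = (float)hi_tie;   /* even neighbour is 1 + 2^-22 */

    return lo == 1.0f && hi == 1.0f + 0x1p-22f;
}
```

The first tie goes down and the second goes up, which is what
distinguishes round-to-even from always rounding halves in one direction.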

== 14 of 19 ==
Date: Thurs, Jan 16 2014 8:08 pm
From: JohnF

Andrew Cooper <root@127.0.0.1> wrote:
> On 15/01/2014 09:47, JohnF wrote:
>> Andrew Cooper <root@127.0.0.1> wrote:
>>>> On 13/01/2014 06:31, JohnF wrote:
>>>>>
>>>>> Problem found and fixed, as per earlier followups.
>>>>> Turned out to be slightly different float behavior.
>>>>
>>>> I know I am a little late to the thread here,
>>>> but can explain your problem.
>>>> I don't know whether you have updated the code on your website,
>>>> but the code (as gotten 10 mins ago) won't work.
>>>> When compiled as 32bit, ran1() uses x87 FPU instructions,
>>>> but when compiled as 64bit, ran1() uses SSE instructions.
>>>>
>>>> The 32bit code keeps its intermediate values on the x87 register stack,
>>>> causing rounding to occur at 80 bits worth of precision, which is
>>>> different to the SSE code (which appears to be rounding at 64 bits, but
>>>> frankly it's late and SSE instructions look far too similar for their own
>>>> good)
>>>>
>>>> Basically, avoid any form of floating point calculations at all,
>>>> especially if you are expecting something deterministic.
>>>> You (like 99 out of every 100 programmers, myself included)
>>>> do not know how to use them correctly. ~Andrew
>>>
>>> And in addition, using an identical binary, the chances are very good
>>> that you would get a different stream of random numbers on Intel vs AMD
>>> hardware, and you would get different random numbers from running the
>>> set of instructions under different operating systems on identical
>>> hardware. C itself does not provide you with an ability to set the
>>> x87/SSE general control registers. ~Andrew
>>
>> I believe your "as gotten 10 mins ago" code is current.
>> But if the following remark seems wrong, maybe download again.
>> I agree that the float result called "temp" in ran1() won't
>> be portable. But it's not used anywhere, any more. Instead,
>> there's now that "static long iran;" near the top of the module
>> that's the only result actually used now. And that's a
>> completely integer calculation.
>> If you look real, real carefully, you'll see you can
>> invoke it as fm -r 0 -etc, which will revert to the original
>> rng usage. That's there just so stuff which was previously
>> encrypted can still be decrypted. And then, yeah, in that
>> case you better not try to decrypt with a 64-bit executable
>> if you encrypted with a 32-bit one. I guess that's what
>> all that gpl stuff about no "warranty of merchantability"
>> is all about:)
>
> Apologies for being curt, but:
> I really don't care what your C code says. It compiles differently
> under 32bit and 64bit, using different floating point units in the x86
> architecture. It will not generate the same stream of random numbers.
> Your 32bit and 64bit binaries will not be capable of decrypting each
> other's outputs.
> Looking at the thread so far, it is very likely that you have made your
> "1 byte in 1000" problem substantially rarer, but you really have not
> removed it entirely. You cannot use floating point numbers of any kind
> and have any expectation of determinism. ~Andrew

Okay, you don't care what the code says, but you seem to have
ignored what I said, too, which leaves you little to go on:)
I >>agreed with you<< the fp results won't be the same.
But I went on to say >>I'm not using the fp results<< any more,
thanks to all the comments and suggestions in this thread.
For example, as Ben pointed out, the Park&Miller algorithm used
by the ran1() function is a >>completely integer<< calculation,
generating a random integer intran=1...maxintran.
At the end, just before return;, it merely does an fp divide,
fpran = ((float)intran)/((float)(maxintran+1)). That's the only
fp calculation, period. I now ignore it, totally. And I just
use that intran.
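
To make the integer-only part concrete, here is a sketch of one step of a
Park&Miller style generator. It is not the ran1() from the program under
discussion, which isn't shown in this thread and may do more (a shuffle
table, say). The constants 16807 and 2^31-1 are the published "minimal
standard" values and are assumed here; the names are made up.

```c
#include <stdint.h>

#define MINSTD_A 16807u        /* multiplier */
#define MINSTD_M 2147483647u   /* modulus, 2^31 - 1 */

/* One step: next = (A * state) mod M, for state in 1..M-1.
   The 64-bit product cannot overflow, so the result is identical on
   32-bit and 64-bit builds. No floating point is involved. */
uint32_t minstd_next(uint32_t state)
{
    return (uint32_t)(((uint64_t)state * MINSTD_A) % MINSTD_M);
}
```

Park and Miller's own check is that starting from 1, the value after
10000 steps is 1043618065. Any build that reproduces that is generating
the same integer stream, whatever it does with floats afterwards.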
--
John Forkosh ( mailto: j@f.com where j=john and f=forkosh )

== 15 of 19 ==
Date: Thurs, Jan 16 2014 8:57 pm
From: glen herrmannsfeldt

Robert Wessel <robertwessel2@yahoo.com> wrote:

(snip, I wrote)
>>Hmmm. The extra precision used in the x87 is part of the IEEE
>>standard. In addition, IEEE 754 supports different rounding
>>modes, so you would also have to be sure that the same rounding
>>was done in all cases.

> While extended formats are allowed for by IEEE, it's *not* compliant
> in general to do intermediate operations with extra precision. So
> many compilers for x87 mode have a "strict" option which results in
> storing the result of each intermediate operation to a correct sized
> memory item, in order to get the exact rounding. Often there's an
> option as well that uses the x87 precision control to achieve part of
> that (while leaving oddness happening with the exponents).

Well, one problem with extra precision is double rounding.
If you round from extra to double, then to single, the result
can be different (wrong) from the correctly rounded extended
to single.

But consistent use of extended, unlike with the x87 with 8
registers, should give more accurate results.
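
A sketch of the double rounding case, assuming `long double` has at least
a 64-bit significand (as with x87 extended), and that `double` and `float`
are the usual 53-bit and 24-bit IEEE formats with round-to-nearest-even.
Where `long double` is just `double`, both paths give the same answer and
the function returns 0. The function name is made up.

```c
#include <float.h>

/* Returns 1 if rounding extended -> double -> float gives a different
   float than rounding extended -> float directly, 0 otherwise. */
int double_rounding_differs(void)
{
    /* 1 + 2^-24 + 2^-60 lies just above the midpoint between the
       floats 1 and 1 + 2^-23. volatile forces real conversions. */
    volatile long double x = 1.0L + 0x1p-24L + 0x1p-60L;

    /* One rounding: above the midpoint, so up to 1 + 2^-23. */
    volatile float direct = (float)x;

    /* First rounding drops the 2^-60, leaving exactly 1 + 2^-24. */
    volatile double step = (double)x;
    /* Second rounding now sees an exact tie and goes to even: 1. */
    volatile float twice = (float)step;

    return direct != twice;
}
```

The two results differ by one unit in the last place of the float, which
is all it takes to break anything that depends on bit-exact output.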

> The rounding mode has a defined default (round to even), and the
> implementation should not be altering that (although providing a way
> for the program to change that is common).

(snip)

>>I would have to read it more carefully to be sure, but I don't
>>believe that the intent of the standard was to generate bit exact
>>results, as this case requires.

> The standard recommends that languages define a way of doing exactly
> that (get the same results on all implementations). See the section
> on reproducibility.

Well, in most cases what people want is the correct (closest)
printed decimal value. As the exponent shifts for binary are
different from decimal, you have to provide enough extra bits.

Or use the IEEE 754 decimal floating point.

Extra precision is most useful for sums, where cancelation and lost
precision can easily occur. There are some other cases where
the extra precision for intermediate values helps.

-- glen

== 16 of 19 ==
Date: Fri, Jan 17 2014 6:31 am
From: Robert Wessel

On Fri, 17 Jan 2014 04:57:02 +0000 (UTC), glen herrmannsfeldt
<gah@ugcs.caltech.edu> wrote:

>Robert Wessel <robertwessel2@yahoo.com> wrote:
>
>(snip, I wrote)
>>>Hmmm. The extra precision used in the x87 is part of the IEEE
>>>standard. In addition, IEEE 754 supports different rounding
>>>modes, so you would also have to be sure that the same rounding
>>>was done in all cases.
>
>> While extended formats are allowed for by IEEE, it's *not* compliant
>> in general to do intermediate operations with extra precision. So
>> many compilers for x87 mode have a "strict" option which results in
>> storing the result of each intermediate operation to a correct sized
>> memory item, in order to get the exact rounding. Often there's an
>> option as well that uses the x87 precision control to achieve part of
>> that (while leaving oddness happening with the exponents).
>
>Well, one problem with extra precision is double rounding.
>If you round from extra to double, then to single, the result
>can be different (wrong) from the correctly rounded extended
>to single.
>
>But consistent use of extended, unlike with the x87 with 8
>registers, should give more accurate results.
>
>> The rounding mode has a defined default (round to even), and the
>> implementation should not be altering that (although providing a way
>> for the program to change that is common).
>
>(snip)
>
>>>I would have to read it more carefully to be sure, but I don't
>>>believe that the intent of the standard was to generate bit exact
>>>results, as this case requires.
>
>> The standard recommends that languages define a way of doing exactly
>> that (get the same results on all implementations). See the section
>> on reproducibility.
>
>Well, in most cases what people want is the correct (closest)
>printed decimal value. As the exponent shifts for binary are
>different from decimal, you have to provide enough extra bits.
>
>Or use the IEEE 754 decimal floating point.
>
>Extra precision is most useful for sums, where cancelation and lost
>precision can easily occur. There are some other cases where
>the extra precision for intermediate values helps.

I don't really disagree with any of that, but per the standard it
really shouldn't be the default behavior.
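
For checking what a given compiler actually does by default, C99 provides the FLT_EVAL_METHOD macro in <float.h>. A minimal sketch follows; the function name is made up, and the strings paraphrase the C99 definitions rather than quote them.

```c
#include <float.h>

/* Describe how this implementation evaluates floating-point
   expressions, according to the C99 FLT_EVAL_METHOD macro. */
const char *eval_method_description(void)
{
    switch (FLT_EVAL_METHOD) {
    case 0:
        return "each operation evaluated in the range and precision of its type";
    case 1:
        return "float and double operations evaluated as double";
    case 2:
        return "all operations evaluated as long double";
    default:
        /* -1 means indeterminable; other negative values are
           implementation-defined. */
        return "indeterminable or implementation-defined";
    }
}
```

Typically a 32-bit x86 build using the x87 unit reports 2 and an x86-64 build using SSE2 reports 0, but that depends on the compiler and its options, so it is worth printing on each build being compared.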

== 17 of 19 ==
Date: Fri, Jan 17 2014 9:57 am
From: "J. Clarke"

In article <52D86866.60502@verizon.net>, jameskuyper@verizon.net says...
>
> On 01/16/2014 05:54 PM, Andrew Cooper wrote:
> > On 15/01/2014 09:47, JohnF wrote:
> ...
> >> I believe your "as gotten 10 mins ago" code is current.
> >> But if the following remark seems wrong, maybe download again.
> >> I agree that the float result called "temp" in ran1() won't
> >> be portable. But it's not used anywhere, any more. Instead,
> >> there's now that "static long iran;" near the top of the module
> >> that's the only result actually used now. And that's a
> >> completely integer calculation.
> ...
> > Apologies for being curt, but:
> >
> > I really don't care what your C code says. It compiles differently
> > under 32bit and 64bit, using different floating point units in the x86
> > architecture.
>
> Not caring about what his C code says seems rather odd, since that code
> determines whether or not your assertions about the results of compiling
> it are correct. He's asserted that the current version of his code
> performs a "completely integer calculation". If that description is
> correct (I haven't bothered to check), then why would it use the
> floating point units at all, much less using different ones on the two
> different machines?
>
> If he's wrong, and it's not yet a "completely integer calculation", then
> that's a different and very legitimate issue - but what his C code says
> is very relevant to that issue.

Intel hardware supports integer calculations in the general purpose
registers, the 80x87 stack, and the SSE registers. So there is no
guarantee that an integer operation will be performed using the general-
purpose registers--this would be something that was determined by the
design of the compiler, the optimizations in force, and the specific
instruction sequence.

== 18 of 19 ==
Date: Fri, Jan 17 2014 10:57 am
From: James Kuyper

On 01/17/2014 12:57 PM, J. Clarke wrote:
> In article <52D86866.60502@verizon.net>, jameskuyper@verizon.net says...
>>
>> On 01/16/2014 05:54 PM, Andrew Cooper wrote:
>>> On 15/01/2014 09:47, JohnF wrote:
>> ...
>>>> I believe your "as gotten 10 mins ago" code is current.
>>>> But if the following remark seems wrong, maybe download again.
>>>> I agree that the float result called "temp" in ran1() won't
>>>> be portable. But it's not used anywhere, any more. Instead,
>>>> there's now that "static long iran;" near the top of the module
>>>> that's the only result actually used now. And that's a
>>>> completely integer calculation.
>> ...
>>> Apologies for being curt, but:
>>>
>>> I really don't care what your C code says. It compiles differently
>>> under 32bit and 64bit, using different floating point units in the x86
>>> architecture.
>>
>> Not caring about what his C code says seems rather odd, since that code
>> determines whether or not your assertions about the results of compiling
>> it are correct. He's asserted that the current version of his code
>> performs a "completely integer calculation". If that description is
>> correct (I haven't bothered to check), then why would it use the
>> floating point units at all, much less using different ones on the two
>> different machines?
>>
>> If he's wrong, and it's not yet a "completely integer calculation", then
>> that's a different and very legitimate issue - but what his C code says
>> is very relevant to that issue.
>
> Intel hardware supports integer calculations in the general purpose
> registers, the 80x87 stack, and the SSE registers. So there is no
> guarantee that an integer operation will be performed using the general-
> purpose registers--this would be something that was determined by the
> design of the compiler, the optimizations in force, and the specific
> instruction sequence.

That can't be problematic if we're talking about conforming
implementations of C. If there's any possibility that a purely integer
operation with defined behavior will produce different results if it's
performed using the 80x87 stack or the SSE registers, then a conforming
implementation of C that chooses to use those things is pretty much
required to correct for that difference. There's no statement in the C
standard covering integer operations that gives implementations anything
like the same amount of freedom that 5.2.4.2.2p6 gives them for floating
point operations. Even implementations that pre-#define __STDC_IEC_559__
have less freedom in the evaluation of integer operations than floating
point ones.
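
To illustrate, here is a sketch of an integer-only generator written with fixed-width types. This is not JohnF's ran1(); it is the Park-Miller "minimal standard" generator, swapped in as a stand-in, and the function names are made up. Every operation has defined behavior and no type changes width between platforms, so a conforming 32-bit or 64-bit implementation must produce the same sequence, whichever registers it uses.

```c
#include <stdint.h>

/* One step of the Park-Miller generator:
   state = state * 16807 mod (2^31 - 1).
   The product is formed in 64 bits, so it cannot overflow. */
uint32_t minstd_next(uint32_t state)
{
    return (uint32_t)(((uint64_t)state * 16807u) % 2147483647u);
}

/* Advance the generator the given number of steps from seed. */
uint32_t minstd_after(uint32_t seed, unsigned steps)
{
    uint32_t s = seed;
    for (unsigned i = 0; i < steps; i++)
        s = minstd_next(s);
    return s;
}
```

Starting from seed 1, the value after 10000 steps is 1043618065, the check value published for this generator. By contrast, code that uses plain long can behave differently across builds if it overflows or otherwise depends on the width of long, which is 32 bits on typical 32-bit targets and 64 bits on typical 64-bit Unix-like targets.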

== 19 of 19 ==
Date: Sat, Jan 18 2014 11:42 pm
From: Rosario193

On Tue, 14 Jan 2014 08:19:49 +0000 (UTC), JohnF wrote:

>Sorry for following myself up:
>I should have mentioned that several "intended-to-be-portable"
>fixes that I've tried, in particular freopen("CON","wb",stdout)
>and stdout=fdopen(STDOUT_FILENO,"wb"), don't work or don't work

Why not use the Windows OS system API?

>portably, for one reason or another (tales of woe elided:)
>So I'm asking for a pretty much known-to-portably-work fix.
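
One Windows-specific route, sketched under the assumption that the goal is to put stdout into binary mode (the quoted text suggests this but does not say so outright). It uses the Microsoft C runtime calls _setmode() and _fileno() rather than the Win32 API proper; the wrapper name set_stdout_binary is made up.

```c
#include <stdio.h>
#ifdef _WIN32
#include <io.h>     /* _setmode */
#include <fcntl.h>  /* _O_BINARY */
#endif

/* Switch stdout to binary mode where text and binary streams differ.
   Returns 0 on success, -1 on failure. */
int set_stdout_binary(void)
{
#ifdef _WIN32
    /* _setmode returns the previous mode, or -1 on error. */
    return _setmode(_fileno(stdout), _O_BINARY) == -1 ? -1 : 0;
#else
    /* POSIX systems make no text/binary distinction, so there is
       nothing to change. */
    return 0;
#endif
}
```

Calling it once at the start of main(), before anything is written to stdout, keeps the rest of the program free of platform conditionals.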

==============================================================================
TOPIC: What is "the instance structure must be manually initialized"?
http://groups.google.com/group/comp.lang.c/t/89ecdf01bc4a91d6?hl=en
==============================================================================

== 1 of 3 ==
Date: Tues, Jan 14 2014 5:55 pm
From: fl

Hi,

I'm having a lot of difficulty understanding the FIR function description below.
1. I do not know what "4 different data type filter instance structures" means.

2. I do not know what "the instance structure must be manually initialized" means.

Is "statically initialize" the same as manual initialization? Or does it mean calling this function:

ne10_result_t ne10_fir_init_float0 (ne10_fir_instance_f32_t * S,
                                    ne10_uint16_t numTaps,
                                    ne10_float32_t * pCoeffs,
                                    ne10_float32_t * pState,
                                    ne10_uint32_t blockSize)

I have no idea at all about this question.

Thanks a lot.

--------------------------
Initialization Functions
There is also an associated initialization function for each data type. The initialization function performs the following operations:

Sets the values of the internal structure fields.
Zeros out the values in the state buffer.

Use of the initialization function is optional. However, if the initialization function is used, then the instance structure cannot be placed into a const data section. To place an instance structure into a const data section, the instance structure must be manually initialized. Set the values in the state buffer to zeros before static initialization. The code below statically initializes each of the 4 different data type filter instance structures

ne10_fir_instance_f32_t S = {numTaps, pState, pCoeffs};
.............................
ne10_result_t ne10_fir_init_float0 (ne10_fir_instance_f32_t * S,
                                    ne10_uint16_t numTaps,
                                    ne10_float32_t * pCoeffs,
                                    ne10_float32_t * pState,
                                    ne10_uint32_t blockSize)
{
    /* Assign filter taps */
    S->numTaps = numTaps;

    /* Assign coefficient pointer */
    S->pCoeffs = pCoeffs;

    /* Clear state buffer and the size of state buffer is (blockSize + numTaps - 1) */
    memset (pState, 0, (numTaps + (blockSize - 1u)) * sizeof (ne10_float32_t));

    /* Assign state pointer */
    S->pState = pState;
    return NE10_OK;
}

== 2 of 3 ==
Date: Tues, Jan 14 2014 6:07 pm
From: fl

On Tuesday, January 14, 2014 8:55:01 PM UTC-5, fl wrote:
> Hi,
>
> I'm having a lot of difficulty understanding the FIR function description below.
> 1. I do not know what "4 different data type filter instance structures" means.
>
> 2. I do not know what "the instance structure must be manually initialized" means.
>
> Is "statically initialize" the same as manual initialization? Or does it mean calling this function:
>
> ne10_result_t ne10_fir_init_float0 (ne10_fir_instance_f32_t * S,
>                                     ne10_uint16_t numTaps,
>                                     ne10_float32_t * pCoeffs,
>                                     ne10_float32_t * pState,
>                                     ne10_uint32_t blockSize)
>
> I have no idea at all about this question.
>
> Thanks a lot.
>
> --------------------------
> Initialization Functions
> There is also an associated initialization function for each data type. The initialization function performs the following operations:
>
> Sets the values of the internal structure fields.
> Zeros out the values in the state buffer.
>
> Use of the initialization function is optional. However, if the initialization function is used, then the instance structure cannot be placed into a const data section. To place an instance structure into a const data section, the instance structure must be manually initialized. Set the values in the state buffer to zeros before static initialization. The code below statically initializes each of the 4 different data type filter instance structures
>
> ne10_fir_instance_f32_t S = {numTaps, pState, pCoeffs};
> .............................
> ne10_result_t ne10_fir_init_float0 (ne10_fir_instance_f32_t * S,
>                                     ne10_uint16_t numTaps,
>                                     ne10_float32_t * pCoeffs,
>                                     ne10_float32_t * pState,
>                                     ne10_uint32_t blockSize)
> {
>     /* Assign filter taps */
>     S->numTaps = numTaps;
>
>     /* Assign coefficient pointer */
>     S->pCoeffs = pCoeffs;
>
>     /* Clear state buffer and the size of state buffer is (blockSize + numTaps - 1) */
>     memset (pState, 0, (numTaps + (blockSize - 1u)) * sizeof (ne10_float32_t));
>
>     /* Assign state pointer */
>     S->pState = pState;
>     return NE10_OK;
> }

Here is the FIR instance declaration. I do not see 4 different data type filter instance structures in it. Do you see something?

Thanks.

typedef struct
{
    ne10_uint16_t numTaps;      /**< Length of the filter. */
    ne10_float32_t *pState;     /**< Points to the state variable array. The array is of length numTaps+maxBlockSize-1. */
    ne10_float32_t *pCoeffs;    /**< Points to the coefficient array. The array is of length numTaps. */
} ne10_fir_instance_f32_t;

/**

== 3 of 3 ==
Date: Tues, Jan 14 2014 9:13 pm
From: Kaz Kylheku

On 2014-01-15, fl <rxjwg98@gmail.com> wrote:
> Here is the FIR instance declaration. I do not see 4 different data type filter instance structures in it. Do you see something?

The word "type" has a broader meaning than just "C language type", even
in the context of C.

Type is any aspect of a datum by which we categorize it, not
only the declared type known to the compiler.
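
On the "manually initialized" part of the question, a minimal sketch of what the quoted documentation seems to describe: filling in the structure with an initializer at its definition instead of calling the init function at run time, which is what allows the structure itself to be const. The typedefs below are stand-ins so the sketch compiles without the Ne10 headers, and the tap count, block size, coefficient values and the names fir_coeffs, fir_state, fir_instance and fir_instance_get are all hypothetical.

```c
#include <stdint.h>

/* Stand-ins for the Ne10 typedefs (assumed, not taken from the library). */
typedef uint16_t ne10_uint16_t;
typedef float    ne10_float32_t;

/* Same layout as the structure quoted in the thread. */
typedef struct
{
    ne10_uint16_t   numTaps;
    ne10_float32_t *pState;
    ne10_float32_t *pCoeffs;
} ne10_fir_instance_f32_t;

#define NUM_TAPS   4u   /* hypothetical filter length */
#define BLOCK_SIZE 8u   /* hypothetical maximum block size */

/* Hypothetical coefficients: a 4-tap moving average. */
static ne10_float32_t fir_coeffs[NUM_TAPS] = { 0.25f, 0.25f, 0.25f, 0.25f };

/* Objects with static storage duration are zero-initialized, which
   covers "set the values in the state buffer to zeros". */
static ne10_float32_t fir_state[NUM_TAPS + BLOCK_SIZE - 1u];

/* Static ("manual") initialization: no init call is needed, so the
   instance can be const. The buffers it points to stay writable. */
static const ne10_fir_instance_f32_t fir_instance =
    { NUM_TAPS, fir_state, fir_coeffs };

/* Accessor so other code can reach the instance. */
const ne10_fir_instance_f32_t *fir_instance_get(void)
{
    return &fir_instance;
}
```

The init function, by contrast, writes to S->numTaps, S->pCoeffs and S->pState at run time, so it can only be used on a structure that is not const.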

==============================================================================
TOPIC: C can be very literal.
http://groups.google.com/group/comp.lang.c/t/d81909b19825f4d7?hl=en
==============================================================================

== 1 of 1 ==
Date: Thurs, Jan 16 2014 11:49 am
From: DSF

On Tue, 05 Nov 2013 02:10:21 -0600, Stephen Sprunk
<stephen@sprunk.org> wrote:

Sorry this is a tiny bit late. Been busy.

>On 04-Nov-13 22:01, DSF wrote:
>> On Fri, 01 Nov 2013 21:25:59 +0100, Johannes Bauer
>> <dfnsonfsduifb@gmx.de> wrote:
>>> Am 01.11.2013 19:53, schrieb DSF:
>>>> So, at least with my current compiler, ganging up equals
>>>> (variables of the same type, of course) produces shorter and
>>>> faster code.
>>>
>>> If your current compiler misses such an obvious optimization, it
>>> is quite frankly a piece of shit.
>>
>> I would not disagree as far as code generation goes.
>>
>>> Just for reference, gcc 4.7 produces the expected:
>>> ...
>>
>> If we're going to be specific, here's what it compiles to for me:
>> ...
>
>If you're going to make vague complaints about what an unspecified
>compiler produces with unspecified settings, you should expect to see
>folks chime in with more specific examples to discuss.
>
>For instance, GCC 4.2.4 for Linux/x86 produces this with -O0:
>
>foo:
> pushl %ebp
> movl %esp, %ebp
> movl $0, a.2059
> movl $0, b.2060
> movl $0, c.2061
> movl $0, d.2062
> movl $0, e.2063
> movl $0, f.2064
> movl $0, g.2065
> movl $0, h.2066
> movl $0, i.2067
> popl %ebp
> ret
>...
>bar:
> pushl %ebp
> movl %esp, %ebp
> movl $0, i.2079
> movl i.2079, %eax
> movl %eax, h.2078
> movl h.2078, %eax
> movl %eax, g.2077
> movl g.2077, %eax
> movl %eax, f.2076
> movl f.2076, %eax
> movl %eax, e.2075
> movl e.2075, %eax
> movl %eax, d.2074
> movl d.2074, %eax
> movl %eax, c.2073
> movl c.2073, %eax
> movl %eax, b.2072
> movl b.2072, %eax
> movl %eax, a.2071
> popl %ebp
> ret
>
>The latter is definitely suboptimal.

The former suffers from an inefficiency as well.
movl $0, a.2059
I'm not proficient in AT&T syntax, but I assume it is the equivalent of:
mov dword ptr a, 0

A move of 0 to a static address translates to:
C705F0D0410000000000 mov [0x41D0F0], 0x00000000
10 bytes per store.

A stack-relative address is a little better:
C7450400000000 mov [ebp+4], 0
7 bytes per store.

As compared to:
33C0 xor eax, eax
A3F0D04100 mov [0x41D0F0], eax
7 bytes for initial store, 5 for each additional store.
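
To put totals on those figures, a small sketch that just does the arithmetic for n static 32-bit variables, using the instruction sizes listed above. The function names are made up.

```c
/* Code bytes to zero n static variables with an immediate store each:
   C7 05 + 4-byte address + 4-byte immediate = 10 bytes per store. */
unsigned bytes_immediate_stores(unsigned n)
{
    return 10u * n;
}

/* Code bytes using one "xor eax, eax" (2 bytes) followed by a
   5-byte "mov [addr], eax" per variable. */
unsigned bytes_via_eax(unsigned n)
{
    return n ? 2u + 5u * n : 0u;
}
```

For the nine variables in the quoted listing that comes to 90 bytes against 47.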

"'Later' is the beginning of what's not to be."
D.S. Fiscus

==============================================================================
TOPIC: What is wrong with this c-program?
http://groups.google.com/group/comp.lang.c/t/18415cd95543b14b?hl=en
==============================================================================

== 1 of 1 ==
Date: Fri, Jan 17 2014 11:39 pm
From: "Skybuck Flying"

When I saw this subject line the first thing I thought was:

"Probably a newb question/program".

The second thing was more funny:

"Everything because it was written in C :)"

Bye,
Skybuck.

==============================================================================
TOPIC: A virtual trinary CPU
http://groups.google.com/group/comp.lang.c/t/1e016c1ac82a26be?hl=en
==============================================================================

== 1 of 2 ==
Date: Sun, Jan 19 2014 4:50 am
From: Bjarki Hartjenstein Gunnarsson

I've started this project, chav, which aims to implement a trinary CPU. I was wondering if anyone would be interested to help me, give me suggestions, criticism or just something. I'd be super grateful and things.
https://github.com/bjarkig/chav

with lovely regards
Bjarki

== 2 of 2 ==
Date: Sun, Jan 19 2014 2:56 pm
From: rick.c.hodgin@gmail.com

On Sunday, January 19, 2014 7:50:14 AM UTC-5, Bjarki Hartjenstein Gunnarsson wrote:
> I've started this project, chav, which aims to implement a trinary CPU. I was
> wondering if anyone would be interested to help me, give me suggestions,
> criticism or just something. I'd be super grateful and things.
> https://github.com/bjarkig/chav
> with lovely regards
> Bjarki

Interesting. I came here to ask a particular question and happened across this post. What experience do you have writing emulators?

Best regards,
Rick C. Hodgin

==============================================================================

You received this message because you are subscribed to the Google Groups "comp.lang.c"
group.

To post to this group, visit http://groups.google.com/group/comp.lang.c?hl=en

To unsubscribe from this group, send email to comp.lang.c+unsubscribe@googlegroups.com

To change the way you get mail from this group, visit:
http://groups.google.com/group/comp.lang.c/subscribe?hl=en

To report abuse, send email explaining the problem to abuse@googlegroups.com

==============================================================================
Google Groups: http://groups.google.com/?hl=en


Monday, January 20, 2014
