comp.lang.c - 26 new messages in 3 topics - digest
comp.lang.c
http://groups.google.com/group/comp.lang.c?hl=en
comp.lang.c@googlegroups.com
Today's topics:
* Avoiding NaN and Inf on floating point division - 13 messages, 8 authors
http://groups.google.com/group/comp.lang.c/t/531eea84c94d0630?hl=en
* Alternatives to modifying loop var in the loop. - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/3512c75f82c2014b?hl=en
* tools for manipulating (or pre-processing) data structures to simplify
source - 12 messages, 8 authors
http://groups.google.com/group/comp.lang.c/t/92dddebc3cc6c262?hl=en
==============================================================================
TOPIC: Avoiding NaN and Inf on floating point division
http://groups.google.com/group/comp.lang.c/t/531eea84c94d0630?hl=en
==============================================================================
== 1 of 13 ==
Date: Sat, Jan 4 2014 4:28 am
From: Ben Bacarisse
ardi <ardillasdelmonte@gmail.com> writes:
> Am I right supposing that if a floating point variable x is normal
> (not denormal/subnormal) it is guaranteed that for any non-NaN and
> non-Inf variable called y, the result y/x is guaranteed to be non-NaN
> and non-Inf?
No. Assuming what goes by the name of IEEE floating point, you will get
NaN when y == x == 0, and Inf from all sorts of values for x and y
(DBL_MAX/0.5 for example).
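For instance (a minimal sketch, assuming IEEE-style doubles and the
standard <stdio.h> and <float.h> headers):

#include <stdio.h>
#include <float.h>

int main(void)
{
    double x = 0.5;            /* a perfectly normal divisor */
    double y = DBL_MAX;        /* finite, non-NaN numerator */
    printf("%g\n", y / x);     /* prints "inf" on IEEE systems */
    return 0;
}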
An excellent starting point is to search the web for Goldberg's paper
"What Every Computer Scientist Should Know About Floating-Point
Arithmetic". It will pay off the time spent in spades.
> If affirmative, I've two doubts about this. First, how efficient can
> one expect the isnormal() macro to be? I mean, should one expect it to
> be much slower than doing an equality comparison to zero (x==0.0) ? Or
> should the performance be similar?
I'd expect it to be fast. Probably not as fast as a test for zero, but
it can be done by simple bit testing.
However, you say "if affirmative" and the answer to your question is
"no" so maybe all the rest is moot.
> Second, how could I "emulate" isnormal() on older systems that lack
> it? For example, if I compile on IRIX 6.2, which AFAIK lacks
> isnormal(), is there some workaround which would also guarantee me
> that the division doesn't generate NaN nor Inf?
There are lots of ways. For example, IEEE double precision sub-normal
numbers have an absolute value less than DBL_MIN (defined in
float.h). You can also test normality by looking at the bits. For
example, a sub-normal IEEE number has an all-zero exponent field
and a non-zero fraction.
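In code, that DBL_MIN test is about as simple as it gets (a sketch only,
assuming x is already known to be finite and non-NaN; the name
is_subnormal_or_zero is made up here):

#include <math.h>
#include <float.h>

static int is_subnormal_or_zero(double x)
{
    return fabs(x) < DBL_MIN;   /* catches zero and sub-normals only */
}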
> Also, if the isnormal() macro can be slow, is there any other approach
> which would also give me the guarantee I'm asking for? Maybe comparing
> to some standard definition which holds the smallest normal value
> available for each data type?
The guarantee you want is that a division won't generate NaN or +/-Inf?
The simplest method is to do the division and test the result, but maybe
one or more of your systems generates a signal that you want to avoid?
I think you should say a bit more about what you are trying to do.
It's generally easy to test if you'll get a NaN from the division of
non-NaN numbers (you only get NaN from 0/0 and the four signed cases of
Inf/Inf), but pre-testing for Inf is harder.
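That part can be written down directly (a sketch, assuming a C99 <math.h>
with isinf(); div_would_be_nan is just an illustrative name):

#include <math.h>

/* y and x assumed non-NaN; nonzero if y/x would be NaN under IEEE rules */
static int div_would_be_nan(double y, double x)
{
    return (y == 0.0 && x == 0.0) || (isinf(y) && isinf(x));
}

Pre-testing for overflow to Inf still needs a separate check.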
> Are such definitions standardized in
> some way such that I can expect to find them in some standard header
> on most OSs/compilers? Would I be safe to test it this way rather than
> with the isnormal() macro?
Your C library should have float.h and that should define FLT_MIN,
DBL_MIN and LDBL_MIN but I don't think that helps you directly.
<snip>
--
Ben.
== 2 of 13 ==
Date: Sat, Jan 4 2014 4:46 am
From: Tim Prince
On 1/4/2014 6:07 AM, ardi wrote:
> Second, how could I "emulate" isnormal() on older systems that lack it? For example, if I compile on IRIX 6.2, which AFAIK lacks isnormal(), is there some workaround which would also guarantee me that the division doesn't generate NaN nor Inf?
>
> Also, if the isnormal() macro can be slow, is there any other approach which would also give me the guarantee I'm asking for? Maybe comparing to some standard definition which holds the smallest normal value available for each data type? Are such definitions standardized in some way such that I can expect to find them in some standard header on most OSs/compilers? Would I be safe to test it this way rather than with the isnormal() macro?
>
Maybe you could simply edit the glibc or OpenBSD implementation into
your working copy of your headers, if you aren't willing to update your
compiler or run-time library.
http://ftp.cc.uoc.gr/mirrors/OpenBSD/src/lib/libc/gen/isnormal.c
Is your compiler so old that it doesn't implement inline functions?
That's the kind of background you need to answer your own question about
speed. Then you may need to use an old-fashioned macro (with its
concerns about double evaluation of expressions).
--
Tim Prince
== 3 of 13 ==
Date: Sat, Jan 4 2014 8:35 am
From: James Kuyper
On 01/04/2014 06:07 AM, ardi wrote:
> Hi,
>
> Am I right supposing that if a floating point variable x is normal
> (not denormal/subnormal) it is guaranteed that for any non-NaN and
> non-Inf variable called y, the result y/x is guaranteed to be non-NaN
> and non-Inf?
How could that be true? If the mathematical value of y/x were greater
than DBL_MAX, or smaller than -DBL_MAX, what do you expect the floating
point value of y/x to be? What you're really trying to do is prevent
floating point overflow, and a test for isnormal() is not sufficient.
You must also check whether fabs(x) > fabs(y)/DBL_MAX (assuming that x
and y are both doubles).
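As a sketch of that pre-test (assuming x and y are finite, non-NaN
doubles; safe_divide is just an illustrative name, and glen's rounding
caveat later in this thread still applies to the marginal cases):

#include <math.h>
#include <float.h>

/* Store y/x in *result only if the quotient cannot be NaN or overflow. */
static int safe_divide(double y, double x, double *result)
{
    if (x == 0.0 || fabs(x) <= fabs(y) / DBL_MAX)
        return 0;              /* would give NaN or Inf, or is too close to call */
    *result = y / x;
    return 1;
}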
As far as the C standard is concerned, the accuracy of floating point
math is entirely implementation-defined, and it explicitly allows the
implementation-provided definition to be "the accuracy is unknown"
(5.2.4.2.2p6). Therefore, a fully conforming implementation of C is
allowed to implement math that is so inaccurate that DBL_MIN/DBL_MAX >
DBL_MAX. In practice, you wouldn't be able to sell such an
implementation to anyone who actually needed to perform floating point
math - but that issue is outside the scope of the standard.
However, if an implementation pre-#defines __STDC_IEC_559__, it is
required to conform to the requirements of Annex F (6.10.8.3p1), which
are based upon but not completely identical to the requirements of IEC
60559:1989, which in turn is essentially equivalent to IEEE 754:1985.
That implies fairly strict requirements on the accuracy; for the most
part, those requirements are as strict as they reasonably could be.
> If affirmative, I've two doubts about this. First, how efficient can
> one expect the isnormal() macro to be? I mean, should one expect it
> to be much slower than doing an equality comparison to zero (x==0.0)
> ? Or should the performance be similar?
The performance is inherently system-specific; for all I know there
might be floating point chips where isnormal() can be implemented by a
single floating point instruction; but at the very worst it shouldn't be
much more complicated than a few mask and shift operations on the bytes
of a copy of the argument.
> Second, how could I "emulate" isnormal() on older systems that lack
> it? For example, if I compile on IRIX 6.2, which AFAIK lacks
> isnormal(), is there some workaround which would also guarantee me
> that the division doesn't generate NaN nor Inf?
Find a precise definition of the floating point format implemented on
that machine (which might not fully conform to IEEE requirements), and
you can then implement isnormal() by performing a few mask and shift
operations on the individual bytes of the argument.
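A sketch of that mask-and-shift approach (it assumes a 64-bit IEEE 754
double whose byte order matches that of the integer type, and an
"unsigned long long" holding exactly 64 bits; my_isnormal is a made-up
name):

#include <string.h>

static int my_isnormal(double x)
{
    unsigned long long bits, exp;
    memcpy(&bits, &x, sizeof bits);      /* reinterpret the 8 bytes */
    exp = (bits >> 52) & 0x7FF;          /* the 11-bit exponent field */
    return exp != 0 && exp != 0x7FF;     /* 0: zero/sub-normal, 0x7FF: Inf/NaN */
}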
> Also, if the isnormal() macro can be slow, is there any other
> approach which would also give me the guarantee I'm asking for? ..
If you can find an alternative way of implementing the equivalent of
isnormal() that is significantly faster than calling the macro provided
by a given version of the C standard library, then you should NOT use
that alternative; what you should do is drop that version of the C
standard library and replace it with one that's better-implemented.
> ... Maybe
> comparing to some standard definition which holds the smallest normal
> value available for each data type?
Yes, that's what FLT_MIN, DBL_MIN, and LDBL_MIN are for.
> ... Are such definitions standardized
> in some way such that I can expect to find them in some standard
> header on most OSs/compilers? ...
Yes - the standard header is <float.h>.
> ... Would I be safe to test it this way
> rather than with the isnormal() macro?
It could be safe, if you handle correctly the possibility that the value
is a NaN. Keep in mind that all comparisons with a NaN fail, so
x>=DBL_MIN is not the same as !(x<DBL_MIN). If x is a NaN, the first
expression is false, while the second is true.
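A small demonstration of that asymmetry (assuming IEEE behaviour, where
0.0/0.0 yields a NaN at run time):

#include <stdio.h>
#include <float.h>

int main(void)
{
    double zero = 0.0;
    double nan_value = zero / zero;            /* NaN under IEEE rules */
    printf("%d %d\n", nan_value >= DBL_MIN,    /* 0: comparisons with NaN fail */
                      !(nan_value < DBL_MIN)); /* 1: so the negation is true */
    return 0;
}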
--
James Kuyper
== 4 of 13 ==
Date: Sat, Jan 4 2014 9:09 am
From: Tim Prince
On 1/4/2014 6:07 AM, ardi wrote:
> Am I right supposing that if a floating point variable x is normal (not denormal/subnormal) it is guaranteed that for any non-NaN and non-Inf variable called y, the result y/x is guaranteed to be non-NaN and non-Inf?
>
1/x is well-behaved when x is normal (only possible flag raised is
inexact). That is an important enough consideration to be part of
IEEE754 design, but not guaranteed in C without IEEE754 (the latter
being a reasonable expectation of a good quality platform, but there are
still exceptions). As others pointed out, your goal seems to be well
beyond that.
--
Tim Prince
== 5 of 13 ==
Date: Sat, Jan 4 2014 12:48 pm
From: Keith Thompson
James Kuyper <jameskuyper@verizon.net> writes:
> On 01/04/2014 06:07 AM, ardi wrote:
[...]
>> Also, if the isnormal() macro can be slow, is there any other
>> approach which would also give me the guarantee I'm asking for? ..
>
> If you can find a alternative way of implementing the equivalent of
> isnormal() that is significantly faster than calling the macro provided
> by a given version of the C standard library, then you should NOT use
> that alternative; what you should do is drop that version of the C
> standard library and replace it with one that's better-implemented.
That's not always an option. What you should probably do in
that case is (a) consider carefully whether your faster version
is actually correct, and (b) contact the maintainers of your
implementation.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
== 6 of 13 ==
Date: Sat, Jan 4 2014 2:28 pm
From: glen herrmannsfeldt
James Kuyper <jameskuyper@verizon.net> wrote:
> On 01/04/2014 06:07 AM, ardi wrote:
>> Am I right supposing that if a floating point variable x is normal
>> (not denormal/subnormal) it is guaranteed that for any non-NaN and
>> non-Inf variable called y, the result y/x is guaranteed to be non-NaN
>> and non-Inf?
> How could that be true? If the mathematical value of y/x were greater
> than DBL_MAX, or smaller than -DBL_MAX, what do you expect the floating
> point value of y/x to be? What you're really trying to do is prevent
> floating point overflow, and a test for isnormal() is not sufficient.
> You must also check whether fabs(x) > fabs(y)/DBL_MAX (assuming that x
> and y are both doubles).
> As far as the C standard is concerned, the accuracy of floating point
> math is entirely implementation-defined, and it explicitly allows the
> implementation-provided definition to be "the accuracy is unknown"
> (5.2.4.2.2p6). Therefore, a fully conforming implementation of C is
> allowed to implement math that is so inaccurate that DBL_MIN/DBL_MAX >
> DBL_MAX. In practice, you wouldn't be able to sell such an
> implementation to anyone who actually needed to perform floating point
> math - but that issue is outside the scope of the standard.
Yes, but it seems that it might not be so far off for rounding to allow
fabs(y)/(fabs(y)/DBL_MAX) to overflow, such that your test doesn't
guarantee no overflow.
-- glen
== 7 of 13 ==
Date: Sat, Jan 4 2014 2:32 pm
From: glen herrmannsfeldt
Tim Prince <tprince@computer.org> wrote:
> On 1/4/2014 6:07 AM, ardi wrote:
(snip)
> 1/x is well-behaved when x is normal (only possible flag raised is
> inexact). That is an important enough consideration to be part of
> IEEE754 design, but not guaranteed in C without IEEE754 (the latter
> being a reasonable expectation of a good quality platform,
> but there are still exceptions).
I haven't looked at IEEE754 in that much detail, but on many floating
point systems the exponent range is such that the smallest normal
floating point value will overflow on computing 1/x. If the exponent
range is symmetric, there is a factor of the base (2 or 10) to
consider.
> As others pointed out, your goal seems to be well beyond that.
-- glen
== 8 of 13 ==
Date: Sat, Jan 4 2014 2:41 pm
From: Tim Prince
On 1/4/2014 5:32 PM, glen herrmannsfeldt wrote:
> Tim Prince <tprince@computer.org> wrote:
>> On 1/4/2014 6:07 AM, ardi wrote:
>
> (snip)
>> 1/x is well-behaved when x is normal (only possible flag raised is
>> inexact). That is an important enough consideration to be part of
>> IEEE754 design, but not guaranteed in C without IEEE754 (the latter
>> being a reasonable expectation of a good quality platform,
>> but there are still exceptions).
>
> I haven't looked at IEEE754 in that much detail, but on many floating
> point systems the exponent range is such that the smallest normal
> floating point value will overflow on computing 1/x. If the exponent
> range is symmetric, there is a factor of the base (2 or 10) to
> consider.
>
IEEE754 specifically requires an asymmetric range such that 1/TINY(x)
(Fortran) or 1/FLT_MIN, 1/DBL_MIN, ... don't overflow. At the other
end, of course, there are large numbers whose reciprocal is sub-normal
and will "flush-to-zero" with Intel default compiler options. As far as
I know, CPUs other than Intel(r) Xeon Phi(tm) introduced in the last 3
years support sub-normal numbers with reasonable efficiency (it was
decades to fulfill the promise made when the standard was instituted).
--
Tim Prince
== 9 of 13 ==
Date: Sat, Jan 4 2014 4:49 pm
From: glen herrmannsfeldt
Tim Prince <tprince@computer.org> wrote:
(snip, I wrote)
>> I haven't looked at IEEE754 in that much detail, but on many floating
>> point systems the exponent range is such that the smallest normal
>> floating point value will overflow on computing 1/x. If the exponent
>> range is symmetric, there is a factor of the base (2 or 10) to
>> consider.
> IEEE754 specifically requires an asymmetric range such that 1/TINY(x)
> (Fortran) or 1/FLT_MIN, 1/DBL_MIN, ... don't overflow. At the other
> end, of course, there are large numbers whose reciprocal is sub-normal
> and will "flush-to-zero" with Intel default compiler options. As far as
> I know, CPUs other than Intel(r) Xeon Phi(tm) introduced in the last 3
> years support sub-normal numbers with reasonable efficiency (it was
> decades to fulfill the promise made when the standard was instituted).
So that is where the change in bias came from. IEEE754 is pretty similar
to VAX (other than the byte ordering), but the bias is off by one.
If you do it in the obvious way, there is one more value of negative
exponent than positive exponent, and an additional factor of (almost)
the base between the largest and smallest significand.
But it tends to take a lot of extra hardware to do denormals fast.
For people doing floating point in FPGAs, it is already pretty
inefficient to generate a floating point add/subtract unit.
(The barrel shifter for pre/post normalization is huge, compared
to the actual add/subtract.) It then takes a lot more logic to handle
denormals. In the end, denormals give a fraction of an additional bit
of exponent range for a large additional cost of logic. Better to add
one more exponent bit instead.
-- glen
== 10 of 13 ==
Date: Sat, Jan 4 2014 7:15 pm
From: James Kuyper
On 01/04/2014 05:28 PM, glen herrmannsfeldt wrote:
> James Kuyper <jameskuyper@verizon.net> wrote:
>> On 01/04/2014 06:07 AM, ardi wrote:
>>> Am I right supposing that if a floating point variable x is normal
>>> (not denormal/subnormal) it is guaranteed that for any non-NaN and
>>> non-Inf variable called y, the result y/x is guaranteed to be non-NaN
>>> and non-Inf?
>
>> How could that be true? If the mathematical value of y/x were greater
>> than DBL_MAX, or smaller than -DBL_MAX, what do you expect the floating
>> point value of y/x to be? What you're really trying to do is prevent
>> floating point overflow, and a test for isnormal() is not sufficient.
>> You must also check whether fabs(x) > fabs(y)/DBL_MAX (assuming that x
>> and y are both doubles).
>
>> As far as the C standard is concerned, the accuracy of floating point
>> math is entirely implementation-defined, and it explicitly allows the
>> implementation-provided definition to be "the accuracy is unknown"
>> (5.2.4.2.2p6). Therefore, a fully conforming implementation of C is
>> allowed to implement math that is so inaccurate that DBL_MIN/DBL_MAX >
>> DBL_MAX. In practice, you wouldn't be able to sell such an
>> implementation to anyone who actually needed to perform floating point
>> math - but that issue is outside the scope of the standard.
>
> Yes, but it seems that it might not be so far off for rounding to allow
> fabs(y)/(fabs(y)/DBL_MAX) to overflow, such that your test doesn't
> guarantee no overflow.
That is a real problem, though a marginal one. The best solution to that
problem is to allow the overflow to happen, and test for it afterwards,
but the OP seems uninterested in that option.
--
James Kuyper
== 11 of 13 ==
Date: Sun, Jan 5 2014 3:14 am
From: Malcolm McLean
On Saturday, January 4, 2014 12:06:12 PM UTC, ardi wrote:
>
> Ooops!!! I believe this means I forgot you can also get Inf from overflow...
> if a number is very big and a division turns it even larger, it can overflow,
> and then it becomes Inf even if the denominator is a normal value.
>
> This effectively breaks my quest for "healthy divisions". I guess I'm back
> to my old arbitrary epsilon checking approach (i.e.: check the denominator
> for fabs(x)>epsilon for deciding whether the division can be performed or
> not, where epsilon is left as an exercise for the reader ;-)
>
What you can do is call the function frexp(). This will give you an exponent.
Then chuck out any numbers outside of a reasonable range. IEEE doubles have
11 bits of exponent, so roughly -1024 to 1023. You're highly unlikely to need
anything bigger or smaller than +/- 100; anything else is either corrupt data
or an intermediate in a calculation which was unstable and has lost precision.
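A sketch of that idea (it assumes x is finite, frexp() from <math.h>, and
the +/- 100 window is as arbitrary as the post says; exponent_in_range is
a made-up name):

#include <math.h>

static int exponent_in_range(double x)
{
    int e;
    (void)frexp(x, &e);            /* x == m * 2^e with 0.5 <= |m| < 1 */
    return e > -100 && e < 100;    /* reject suspiciously huge or tiny values */
}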
== 12 of 13 ==
Date: Sun, Jan 5 2014 1:09 pm
From: christian.bau@cbau.wanadoo.co.uk
On Saturday, January 4, 2014 4:35:01 PM UTC, James Kuyper wrote:
> The performance is inherently system-specific; for all I know there
> might be floating point chips where isnormal() can be implemented by a
> single floating point instruction; but at the very worst it shouldn't be
> much more complicated than a few mask and shift operations on the bytes
> of a copy of the argument.
For example for double, and IEEE 754 compatible implementation, you can check
(x - x == x - x) && fabs (x) >= DBL_MIN
which is not quite trivial, but not that difficult. (A Not-a-Number x fails
both tests; if x is +/- infinity then x - x is NaN and x - x == x - x fails;
for zero or denormalised x the test fabs (x) >= DBL_MIN fails.) If the
compiler optimises x - x to 0 (which is incorrect because of NaN and Inf),
or optimises expr == expr to 1 (which is incorrect if expr is NaN), then you
have bigger problems.
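Wrapped up as a function it might look like this (same assumptions as the
post: an IEEE 754 compatible double, DBL_MIN from <float.h>, and a compiler
that does not "optimise" the test away; isnormal_ieee is just an
illustrative name):

#include <math.h>
#include <float.h>

/* Nonzero iff x is neither NaN, infinite, zero nor sub-normal. */
static int isnormal_ieee(double x)
{
    return (x - x == x - x) && fabs(x) >= DBL_MIN;
}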
== 13 of 13 ==
Date: Sun, Jan 5 2014 2:24 pm
From: Tim Prince
On 01/05/2014 04:09 PM, christian.bau@cbau.wanadoo.co.uk wrote:
> On Saturday, January 4, 2014 4:35:01 PM UTC, James Kuyper wrote:
>
>> The performance is inherently system-specific; for all I know there
>> might be floating point chips where isnormal() can be implemented by a
>> single floating point instruction; but at the very worst it shouldn't be
>> much more complicated than a few mask and shift operations on the bytes
>> of a copy of the argument.
>
> For example for double, and IEEE 754 compatible implementation, you can check
>
> (x - x == x - x) && fabs (x) >= DBL_MIN
>
> which is not quite trivial, but not that difficult. (A Not-a-Number x fails both tests; If x is +/- infinity then x - x is NaN and x - x == x - x fails; for zero or denormalised x the test fabs (x) >= DBL_MIN fails. If the compiler optimises x - x to 0 (which is incorrect because of NaN and INF), or optimises expr == expr to 1 (which is incorrect if expr is NaN), then you have bigger problems.
>
This looks attractive from the point of view of not requiring integer
bit-field examination of a memory copy of x. My concern is how to
prevent compiler optimization from breaking it; that is the point of
leaving a standard intrinsic to the implementation, which knows how to
avoid that.
==============================================================================
TOPIC: Alternatives to modifying loop var in the loop.
http://groups.google.com/group/comp.lang.c/t/3512c75f82c2014b?hl=en
==============================================================================
== 1 of 1 ==
Date: Sat, Jan 4 2014 9:03 am
From: Richard
gazelle@shell.xmission.com (Kenny McCormack) writes:
> In article <87lhyzvgrj.fsf@gmail.com>, Richard <rgrdev_@gmail.com> wrote:
>>gazelle@shell.xmission.com (Kenny McCormack) writes:
>>
>>> In article <l9pebv$s9s$1@dont-email.me>,
>>> Eric Sosman <esosman@comcast-dot-net.invalid> wrote:
>>> ...
>>>> I think that "elegance" means different things to the two
>>>>of us (eye of the beholder, again). The quality you describe
>>>>as "elegance" is something I'd prefer to call "clarity," and I
>>>>agree that it's a desirable attribute regardless of whether you
>>>>call it clarigance or eleganity.
>>>
>>> Interestingly enough, the actual, original definition of "elegant" is
>>> "minimalist". Which, in the context of the religion of CLC, ends up
>>> meaning exactly the opposite of "clarity" (since "clarity" usually ends up
>>> meaning "be as verbose as possible").
>>
>>It depends if you know C or not.
>>
>>One moron I had the displeasure of working with decided that something
>>like
>>
>>while(*d++=*s++);
>>
>>was not "clear" and needed to be expanded with long variable names and to
>>use array notation and an exploded loop.
>>
>>I pointed out that any C programmer that didn't understand that code
>>had no right being in the team since it's "bread and butter" for any half
>>decent C programmer.
>
> I'd be willing to assert that for most working C programmers today, the
> above code is (pick all that apply): strange, cryptic, weird, "clever", not
> "clear", etc, etc, and that an explicit form with a loop would be much
> better in their eyes.
>
> Because, as is true of just about every population, most working C
> programmers today are idiots. It's what management wants.
It's what people believe programmers should be. PC nonsense in the office
where QA and Doc resources are "empowered" to proof read code etc. The
world's gone mad.
--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c
==============================================================================
TOPIC: tools for manipulating (or pre-processing) data structures to simplify
source
http://groups.google.com/group/comp.lang.c/t/92dddebc3cc6c262?hl=en
==============================================================================
== 1 of 12 ==
Date: Sat, Jan 4 2014 9:42 am
From: Jens Schweikhardt
Richard Damon <Richard@damon-family.org> wrote
in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
...
# The big problem with a "beautifier" is that it plays havoc with revision
# control and looking at change history, as it can make so many "changes"
# that you lose track of what is really different.
This argument never convinced me. If you really wanted to see minimal
diffs, simply run the revisions to be compared through the beautifier
first and then compare.
Sure, it's an extra step. But computers are great at automation, aren't
they?
Regards,
Jens
--
Jens Schweikhardt http://www.schweikhardt.net/
SIGSIG -- signature too long (core dumped)
== 2 of 12 ==
Date: Sat, Jan 4 2014 1:01 pm
From: Keith Thompson
Jens Schweikhardt <usenet@schweikhardt.net> writes:
> Richard Damon <Richard@damon-family.org> wrote
> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
> ...
> # The big problem with a "beautifier" is that it plays havoc with revision
> # control and looking at change history, as it can make so many "changes"
> # that you lose track of what is really different.
>
> This argument never convinced me. If you really wanted to see minimal
> diffs, simply run the revisions to be compared through the beautifier
> first and then compare.
>
> Sure, it's an extra step. But computers are great at automation, aren't
> they?
Sure, but inserting that extra step into the middle of multiple
existing processes isn't likely to be quite so trivial.
Some revision control systems might let you configure how diffs are
performed, but sometimes you really need to see changes in what's
actually stored in the repository.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
== 3 of 12 ==
Date: Sat, Jan 4 2014 1:57 pm
From: Geoff
On 4 Jan 2014 17:42:34 GMT, Jens Schweikhardt
<usenet@schweikhardt.net> wrote:
>Richard Damon <Richard@damon-family.org> wrote
> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
>...
># The big problem with a "beautifier" is that it plays havoc with revision
># control and looking at change history, as it can make so many "changes"
># that you lose track of what is really different.
>
>This argument never convinced me. If you really wanted to see minimal
>diffs, simply run the revisions to be compared through the beautifier
>first and then compare.
>
>Sure, it's an extra step. But computers are great at automation, aren't
>they?
>
Or better, beautify and review before commit.
== 4 of 12 ==
Date: Sat, Jan 4 2014 2:17 pm
From: glen herrmannsfeldt
Jens Schweikhardt <usenet@schweikhardt.net> wrote:
> Richard Damon <Richard@damon-family.org> wrote
> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
> # The big problem with a "beautifier" is that it plays havoc with revision
> # control and looking at change history, as it can make so many "changes"
> # that you lose track of what is really different.
> This argument never convinced me. If you really wanted to see minimal
> diffs, simply run the revisions to be compared through the beautifier
> first and then compare.
That wouldn't be bad if version control systems knew how to do it.
You want to type
svn diff ...
and see the right thing.
Otherwise, if you filter (and consider a beautifier as a filter)
then check in the result, then modify that, you can diff the befores
and afters, but not across the line.
-- glen
== 5 of 12 ==
Date: Sat, Jan 4 2014 2:37 pm
From: Jens Schweikhardt
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote
in <laa19i$sb2$1@speranza.aioe.org>:
# Jens Schweikhardt <usenet@schweikhardt.net> wrote:
#> Richard Damon <Richard@damon-family.org> wrote
#> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
#
#> # The big problem with a "beautifier" is that it plays havoc with revision
#> # control and looking at change history, as it can make so many "changes"
#> # that you lose track of what is really different.
#
#> This argument never convinced me. If you really wanted to see minimal
#> diffs, simply run the revisions to be compared through the beautifier
#> first and then compare.
#
# That wouldn't be bad if version control systems knew how to do it.
#
# You want to type
#
# svn diff ...
#
# and see the right thing.
How hard is it to write a three line script named svndiff that checks
out the revisions, runs indent on them and diffs the result? Then it's
even one less character to type.
One could even replace the svn binary with a script that checks
whether a diff of a C file is requested and does the above or
calls the real binary otherwise. Not exactly rocket science :-)
Regards,
Jens
--
Jens Schweikhardt http://www.schweikhardt.net/
SIGSIG -- signature too long (core dumped)
== 6 of 12 ==
Date: Sat, Jan 4 2014 2:47 pm
From: Ian Collins
Geoff wrote:
> On 4 Jan 2014 17:42:34 GMT, Jens Schweikhardt
> <usenet@schweikhardt.net> wrote:
>
>> Richard Damon <Richard@damon-family.org> wrote
>> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
>> ...
>> # The big problem with a "beautifier" is that it plays havoc with revision
>> # control and looking at change history, as it can make so many "changes"
>> # that you lose track of what is really different.
>>
>> This argument never convinced me. If you really wanted to see minimal
>> diffs, simply run the revisions to be compared through the beautifier
>> first and then compare.
>>
>> Sure, it's an extra step. But computers are great at automation, aren't
>> they?
>>
>
> Or better, beautify and review before commit.
Or better still don't mix style and content changes in the same commit.
--
Ian Collins
== 7 of 12 ==
Date: Sat, Jan 4 2014 3:45 pm
From: Keith Thompson
Ian Collins <ian-news@hotmail.com> writes:
> Geoff wrote:
>> On 4 Jan 2014 17:42:34 GMT, Jens Schweikhardt
>> <usenet@schweikhardt.net> wrote:
>>
>>> Richard Damon <Richard@damon-family.org> wrote
>>> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
>>> ...
>>> # The big problem with a "beautifier" is that it plays havoc with revision
>>> # control and looking at change history, as it can make so many "changes"
>>> # that you lose track of what is really different.
>>>
>>> This argument never convinced me. If you really wanted to see minimal
>>> diffs, simply run the revisions to be compared through the beautifier
>>> first and then compare.
>>>
>>> Sure, it's an extra step. But computers are great at automation, aren't
>>> they?
>>
>> Or better, beautify and review before commit.
>
> Or better still don't mix style and content changes in the same commit.
Agreed, but there are still problems if you want to compare
non-consecutive revisions.
Suppose version 1.1 is the initial checkin, 1.2 is a 3-line bug fix,
1.3 changes 50% of the lines for style reasons, and 1.4 has another
3-line bug fix. Comparing versions 1.1 and 1.4 to see what was
actually fixed is going to be difficult. (You can compare 1.1 to
1.2 and 1.3 to 1.4 if you happen to know that the 1.2..1.3 change
was style only, but that's not always easy to determine -- and
there's always the risk that a style-only change can accidentally
break something.)
Summary: Maintaining software is hard.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
== 8 of 12 ==
Date: Sat, Jan 4 2014 4:13 pm
From: Ian Collins
Keith Thompson wrote:
> Ian Collins <ian-news@hotmail.com> writes:
>> Geoff wrote:
>>> On 4 Jan 2014 17:42:34 GMT, Jens Schweikhardt
>>> <usenet@schweikhardt.net> wrote:
>>>
>>>> Richard Damon <Richard@damon-family.org> wrote
>>>> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
>>>> ...
>>>> # The big problem with a "beautifier" is that it plays havoc with revision
>>>> # control and looking at change history, as it can make so many "changes"
>>>> # that you lose track of what is really different.
>>>>
>>>> This argument never convinced me. If you really wanted to see minimal
>>>> diffs, simply run the revisions to be compared through the beautifier
>>>> first and then compare.
>>>>
>>>> Sure, it's an extra step. But computers are great at automation, aren't
>>>> they?
>>>
>>> Or better, beautify and review before commit.
>>
>> Or better still don't mix style and content changes in the same commit.
>
> Agreed, but there are still problems if you want to compare
> non-consecutive revisions.
>
> Suppose version 1.1 is the initial checkin, 1.2 is a 3-line bug fix,
> 1.3 changes 50% of the lines for style reasons, and 1.4 has another
> 3-line bug fix. Comparing versions 1.1 and 1.4 to see what was
> actually fixed is going to be difficult.
On my projects that's normally done by checking which unit tests were
added between revisions. Comparing tests is a lot easier than diffing
code, which is something I seldom do.
Where unit tests aren't used (tut tut), the webrev or equivalent
generated for reviews is the best documentation for changes.
--
Ian Collins
== 9 of 12 ==
Date: Sat, Jan 4 2014 4:35 pm
From: glen herrmannsfeldt
Keith Thompson <kst-u@mib.org> wrote:
(snip)
>>>> Richard Damon <Richard@damon-family.org> wrote
>>>> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
>>>> ...
>>>> # The big problem with a "beautifier" is that it plays havoc with revision
>>>> # control and looking at change history, as it can make so many "changes"
>>>> # that you lose track of what is really different.
(snip, someone wrote)
>> Or better still don't mix style and content changes in the same commit.
> Agreed, but there are still problems if you want to compare
> non-consecutive revisions.
> Suppose version 1.1 is the initial checkin, 1.2 is a 3-line bug fix,
> 1.3 changes 50% of the lines for style reasons, and 1.4 has another
> 3-line bug fix. Comparing versions 1.1 and 1.4 to see what was
> actually fixed is going to be difficult.
You need a diff that can generate the appropriate differences.
> (You can compare 1.1 to
> 1.2 and 1.3 to 1.4 if you happen to know that the 1.2..1.3 change
> was style only, but that's not always easy to determine -- and
> there's always the risk that a style-only change can accidentally
> break something.)
If this problem is real, then version control systems should adapt
for it. It should be possible to commit a change as style only.
The system would remember that, and, if it cannot give the appropriate
diff directly, warn you where the style-only revisions are so you can do
the appropriate diff yourself.
I would expect most diff requests are between consecutive versions, but
one still has to be sure to check in the style-only change.
> Summary: Maintaining software is hard.
-- glen
== 10 of 12 ==
Date: Sat, Jan 4 2014 7:21 pm
From: Kaz Kylheku
On 2013-11-02, Richard Damon <Richard@Damon-Family.org> wrote:
> Normally, for a large project, this loss of history is much worse than
> the damage caused by inconsistent indentation (and sometime guidelines
> actually discourage fixing indentation in order to keep from polluting
> the change history).
I use this program, in conjunction with Vim:
http://www.kylheku.com/cgit/c-snippets/tree/autotab.c
This is geared toward languages which resemble C. (Patches, suggestions,
criticisms welcome).
With the help of this utility, I instantly conform to the indentation style of
the source file that I load into Vim's buffer.
Without this program, I would go crazy working with multiple code bases that
have different conventions, and with inconsistent conventions within the same
project.
Worse, I might accidentally introduce edits which use a different style; for
instance spaces when the existing code uses hard tabs or vice versa.
Conforming to the existing style is more important than enforcing one.
If a project contains numerous files that are inconsistently formatted
internally, it may be a good idea to fix them. It should be discussed with the
team; you don't just go in and reindent everything. Since the changes may
cause merge conflicts, cleanup work should be coordinated with the release
management process. For instance, if you still have an old branch that is
active (bugs are being fixed for customers and then integrated into the
mainline), then maybe hold off with the whitespace changes, so that changes can
back- and forward-port easily.
It's also a poor idea to mix whitespace changes with an actual code change,
needless to add.
== 11 of 12 ==
Date: Sat, Jan 4 2014 7:23 pm
From: James Kuyper
On 01/04/2014 07:35 PM, glen herrmannsfeldt wrote:
...
> If this problem is real, then version control systems should adapt
> for it. It should be possible to commit a change as style only.
> The system would remember that, and, if not give the appropriate
> diff, warn you where the style reversions are so you can do the
> appropriate diff yourself.
I'm familiar with only a few version control systems: RCS, CVS,
ClearCase, and SubVersion, and one other system whose name I can't
remember which was built on an RCS foundation. If any of them had the
capability of doing what you suggest, I wasn't aware of the fact. It
seems to me that, in order to work that way, the version control system
would have to know about the source code syntax, in order to properly
distinguish between meaningful and cosmetic changes - I don't believe
that any of the systems I'm familiar with knew or cared anything about
the syntax of the documents being controlled.
> I would expect most diff requests are consecutive versions, ...
That has not been my experience.
--
James Kuyper
== 12 of 12 ==
Date: Sat, Jan 4 2014 10:09 pm
From: ralph
On Sat, 04 Jan 2014 13:01:37 -0800, Keith Thompson <kst-u@mib.org>
wrote:
>Jens Schweikhardt <usenet@schweikhardt.net> writes:
>> Richard Damon <Richard@damon-family.org> wrote
>> in <PScdu.20235$SO1.8769@en-nntp-06.dc1.easynews.com>:
>> ...
>> # The big problem with a "beautifier" is that it plays havoc with revision
>> # control and looking at change history, as it can make so many "changes"
>> # that you lose track of what is really different.
>>
>> This argument never convinced me. If you really wanted to see minimal
>> diffs, simply run the revisions to be compared through the beautifier
>> first and then compare.
>>
>> Sure, it's an extra step. But computers are great at automation, aren't
>> they?
>
>Sure, but inserting that extra step into the middle of multiple
>existing processes isn't likely to be quite so trivial.
>
>Some revision control systems might let you configure how diffs are
>performed, but sometimes you really need to see changes in what's
>actually stored in the repository.
Wow! Wait long enough and everything DOES happen.
Something I can agree with Mr. Thompson on. <g>
Before every "cleanup" - people like me are only called in when it is
already a mess <g> - I try to make that very clear - if one is
dependent on tracking miniscule variance, enforcing beauty in the
middle is going to be very very annoying.
Everyone always signs off - then the emails start coming. LOL
If the object is just to be prettier exclusively, it's best to leave it
alone and keep complaining. Modern beautifiers are capable of a range
of adjustment (as well as modern SCCSs), but neither will solve
everything. It has to be combined with a complete library, resource,
etc. re-arrangement. Policies will be changed, new 'standards'
adopted. Resources get that way because no one was paying attention
from the beginning. Do it from the beginning and it goes well.
Sometimes you just go to a new beginning. Sometimes the budget just
won't let you.
And before we go inventing too many scenarios that may break and thus
invalidate going there - one needs to consider the industry. A large
agriculture or banking institution with a vast library of multiple
versioned bonded or certified products does not have to face the same
issues (risk) as a place selling ZombieBarbie games with only one or
two on the shelf at any one time.
It's like everything else in this industry. There are trade-offs, each
with its own risks, complexity, and headaches.
-ralph
==============================================================================
You received this message because you are subscribed to the Google Groups "comp.lang.c"
group.
To post to this group, visit http://groups.google.com/group/comp.lang.c?hl=en
To unsubscribe from this group, send email to comp.lang.c+unsubscribe@googlegroups.com
To change the way you get mail from this group, visit:
http://groups.google.com/group/comp.lang.c/subscribe?hl=en
To report abuse, send email explaining the problem to abuse@googlegroups.com
==============================================================================
Google Groups: http://groups.google.com/?hl=en