comp.lang.c - 10 new messages in 6 topics

comp.lang.c
http://groups.google.com/group/comp.lang.c?hl=en

comp.lang.c@googlegroups.com

Today's topics:

* Warning to newbies - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/9597fd702985dff4?hl=en
* Efficency and the standard library - 5 messages, 1 author
http://groups.google.com/group/comp.lang.c/t/ad9fea19f2f7dd61?hl=en
* Knowing the implementation, are all undefined behaviours become
implementation-defined behaviours? - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/4f8b56b26018cf4e?hl=en
* Cross reference tool - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/24be1b7f05b314ab?hl=en
* derived-type object - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/cfab7b19dac62701?hl=en
* substring finding problem! - 1 message, 1 author
http://groups.google.com/group/comp.lang.c/t/cf9bd97208e0c3a3?hl=en

==============================================================================
TOPIC: Warning to newbies
http://groups.google.com/group/comp.lang.c/t/9597fd702985dff4?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Feb 15 2010 10:53 pm
From: David Thompson

On 05 Feb 2010 09:30:13 GMT, Seebs <usenet-nospam@seebs.net> wrote:

> I'll contribute one [ObC].
>
> I recently wanted to write something which would derive a string value,
> then substitute it into another string. Actually, into several other strings.
>
> The problem is that it might be substituted several times, so I couldn't
> just do something like:
>
> char *replace = "replace_here:%s";
>
> and use sprintf.
>
Well you _could_ just pass many copies of the new value pointer, such
that they are more than enough for any 'reasonable' use. Extra varargs
are safely ignored, and the slight inefficiency almost certainly
doesn't matter. But that's even more hacky than what you did.

> (In this particular case, the values are "environment variables", but that's
> a detail of no particular relevance to C.)
>
> My solution, which is a bit hacky, but I sorta liked:
>
> struct { char *name, *template; } new_environs[] = {
> { "PERL5LIB", "%s/lib64/perl5/5.10.0/i686-linux:%s/lib64/perl5/5.10.0:%s/lib64/perl5/site_perl/5.10.0/i686-linux:%s/lib64/perl5/site_perl/5.10.0:%s/lib64/perl5/vendor_perl/5.10.0/i686-linux:%s/lib64/perl5/vendor_perl/5.10.0:%s/lib64/perl5/vendor_perl:." },
> { "LD_LIBRARY_PATH", "%s/lib64:%s/lib:%s/lib64" },
> { 0, 0 }
> };
>
> Then, down below:
>
> for (i = 0; new_environs[i].name; ++i) {
> char *new_environ;
> char *t;
> char *u;

Presumably s is (already) in the containing scope, and also i. I would
slightly prefer to keep all the vars/decls together. Of course if you
make this a separate function, which is pretty small, as the
downthread development does, you get that for free.

> size_t len = strlen(new_environs[i].template) + 5;
> for (s = new_environs[i].template; t = strchr(s, '%'); s = t + 1) {
> len += strlen(parent_path);
> }
> new_environ = malloc(len);

I would at least assert(nonnull) even in Q&D hack code. Yes, many
(probably most) machines will trap nicely enough, but I don't want to
even admit to my brain the dangerous thought that I can rely on it.

> u = new_environ;
> *u = '\0';
> for (s = new_environs[i].template; t = strchr(s, '%'); s = t + 2) {
> u += snprintf(u, (len - (u - new_environ)), "%.*s%s",
> t - s, s, parent_path);
> }
> snprintf(u, (len - (u - new_environ)), "%s", s);

As already changed downthread, you don't actually need snprintf
because you've ensured the output buffer is big enough.

But you say (reasonably) you were just being Q&D-safe here, and in
other situations it is needed. Given that, for some reason I always
stumble a little on reading N - (ptr-base) and prefer base+N - ptr.
Or use an offset rather than a pointer:
size_t u = 0; ... u += snprintf (base+u, N-u, ...); ...
/* I like the symmetry of +u and -u there;
especially if I also have, as I sometimes do, a struct to manage
such buffers so it's more like zorg.ptr +u, zorg.size -u */
Though here the input side strongly wants to be pointers, and symmetry
on the output side is nice.
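
The offset-based variant might look something like the sketch below. It
is only an illustration, not the code from the thread: the function name
expand_template and the parameter names tmpl and value are made up here,
and it assumes '%' occurs in the template only as part of "%s". The
length estimate is deliberately generous in the same spirit as the
original, and the precision argument is cast to int because "%.*s"
expects an int rather than a ptrdiff_t.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): returns a malloc'd copy of
   tmpl with every "%s" replaced by value, or NULL on failure.
   Assumes '%' appears in tmpl only as part of "%s". */
char *expand_template(const char *tmpl, const char *value)
{
    const char *s, *t;
    size_t len, u = 0;          /* u is an output offset, not a pointer */
    char *out;
    int n;

    assert(tmpl != NULL && value != NULL);

    /* Generous bound: template + NUL, plus one full value per '%'. */
    len = strlen(tmpl) + 1;
    for (s = tmpl; (t = strchr(s, '%')) != NULL; s = t + 1)
        len += strlen(value);

    out = malloc(len);
    if (out == NULL)
        return NULL;

    for (s = tmpl; (t = strchr(s, '%')) != NULL; s = t + 2) {
        /* out + u and len - u: the symmetric form discussed above */
        n = snprintf(out + u, len - u, "%.*s%s", (int)(t - s), s, value);
        if (n < 0 || (size_t)n >= len - u) {
            free(out);
            return NULL;
        }
        u += (size_t)n;
    }

    /* Copy whatever follows the last placeholder. */
    n = snprintf(out + u, len - u, "%s", s);
    if (n < 0 || (size_t)n >= len - u) {
        free(out);
        return NULL;
    }
    return out;
}
```

Called as expand_template("%s/lib64:%s/lib", "/opt"), this should yield
"/opt/lib64:/opt/lib"; the caller frees the result.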

> The calculations are very careless; my goal was not to calculate exactly
> the right length, but rather, to calculate a length which could be easily
> shown to be definitively long enough.
>
I understand Q&D bias to the safe side, but even so the 5 mystifies
me. Do you recall a reason, or is that too archeological?

==============================================================================
TOPIC: Efficency and the standard library
http://groups.google.com/group/comp.lang.c/t/ad9fea19f2f7dd61?hl=en
==============================================================================

== 1 of 5 ==
Date: Mon, Feb 15 2010 10:53 pm
From: spinoza1111

On Feb 16, 3:43 am, "James" <n...@spam.invalid> wrote:
> "spinoza1111" <spinoza1...@yahoo.com> wrote in message
>
> news:ee50aaa0-bbb6-44fd-b2d2-75efd8a4601e@k5g2000pra.googlegroups.com...
> On Feb 15, 2:44 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
> wrote:
>
> > On Feb 15, 5:28 am, spinoza1111 <spinoza1...@yahoo.com> wrote:
> > > > Part of the myth is that software is easy and a truly reusable and
> > > > efficient library routine such as this can be written by one magical
> > > > person in an hour: Kernighan contributes to this myth-making in the
> > > > O'Reilly anthology Beautiful Code with his story-telling about Rob
> > > > Pike writing a regex (which wasn't a real regex) by himself in an
> > > > hour.
>
> > > As I told my students, don't confuse doing the planning with the time
> > > it takes to write down the plan.
>
> > > Once you've decided on the specifications and behaviour of a string
> > > library, writing the code should be trivial - almost certainly all you
> > > will be doing is copying, concatenation, and a few simple searches.
> > > However getting the specifications right is hard.
> > Malcolm, hi. I don't agree with you at all. "Find all strings AND
> > replace them" is as we've seen two problems, and the specifications
> > may only seem precise. Remember the gotchas I mentioned? Right to left
> > as opposed to left to right? Overlapping strings?
>
> If a specification of the `replace()' function mentioned that it only
> operates on non-overlapping comparands, then the overlapping string gotcha
> is eliminated.
>
> If a specification of the `replace()' function mentioned that it scans for
> the comparand in a strict left-to-right fashion, then the
> left-to-right/right-to-left gotcha is eliminated.
>
> If a specification of the `replace()' function mentioned both of the above
> claims, then both of the "gotchas" are eliminated.

I guess it did, but I never wrote a spec. Probably should have. Of
course, in the so-called real world, one can specify left to right and
no overlap, the user can sign off, and the code can still be wrong.
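
For concreteness, here is one way a replace() written to that two-line
specification could look. It is a sketch under stated assumptions, not
the code discussed in the thread: the signature and the parameter names
src, find and repl are invented for illustration, an empty search string
is simply rejected, and "non-overlapping" is taken to mean that scanning
resumes immediately after each matched occurrence.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replace(): strict left-to-right scan, non-overlapping
   matches. Returns a malloc'd string, or NULL if find is empty or
   allocation fails. */
char *replace(const char *src, const char *find, const char *repl)
{
    size_t flen, rlen, count = 0, outlen;
    const char *s, *t;
    char *out, *u;

    assert(src != NULL && find != NULL && repl != NULL);

    flen = strlen(find);
    rlen = strlen(repl);
    if (flen == 0)
        return NULL;            /* outside this sketch's specification */

    /* Pass 1: count matches, skipping past each one (no overlap). */
    for (s = src; (t = strstr(s, find)) != NULL; s = t + flen)
        count++;

    outlen = strlen(src) - count * flen + count * rlen;
    out = malloc(outlen + 1);
    if (out == NULL)
        return NULL;

    /* Pass 2: copy the text between matches, then the replacement. */
    u = out;
    for (s = src; (t = strstr(s, find)) != NULL; s = t + flen) {
        memcpy(u, s, (size_t)(t - s));
        u += t - s;
        memcpy(u, repl, rlen);
        u += rlen;
    }
    strcpy(u, s);               /* tail after the last match */
    return out;
}
```

Under this reading, replace("banana", "ana", "X") gives "bXna" rather
than anything involving the second, overlapping "ana". That is exactly
the kind of case the specification has to settle before the code can be
called right or wrong.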

== 2 of 5 ==
Date: Mon, Feb 15 2010 10:56 pm
From: spinoza1111

On Feb 16, 2:57 am, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "spinoza1111" <spinoza1...@yahoo.com> wrote in message
>
> news:6ae1ff4f-5c5e-4111-ade6-5b7ca0b5a83c@k5g2000pra.googlegroups.com...
> [...]
>
> > It was hard to get my doubly linked list working, but I did. It would
> > have been nice to have a separate tool for linked lists. Richard
> > Heathfield appears to have published such a tool in 2000, in C
> > Unleashed, but as far as I can determine (see elsethread), the
> > Dickiemaster copies the actual values into the list. This would have
> > blown me to Kingdom Come. I THINK the Heathman wanted to avoid using
> > pointer to void (he is welcome to comment here).
>
> I don't know why Richard implemented a simple linked-list that way. I take
> it he is not that fond of intrusive linked lists.

But his solution IS intrusive, unless by "intrusive" you mean
something else.

A linked list of pointers (the right way) is only "intrusive" in that
it consists of a set of eyes looking at data.

Whereas to copy *anything* into your linked list is asking for
trouble, since you're both looking at data and grabbing it. Eyes and
paws.
>
> > However, preprocessor macro implementation of linked list would avoid
> > both copying entries and pointer to void. But it would be at that
> > point in which I would ask myself, like Ross Perot's running mate VAdm
> > Jim Stockdale in 1992, "Who am I? Why am I here? And why am I not
> > coding in C Sharp, or getting laid?"
>
> What about intrusive data-structures that do not require any copying and/or
> void pointers? I use them all the time.

That was my idea...use a macro for strong typing.
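
A common shape for such an intrusive list is sketched below. This is an
illustration only, not necessarily what either poster had in mind: the
names list_node, LIST_ENTRY, list_init, list_push_back and struct word
are all invented here. The link is embedded in the user's own struct, so
the list code copies no payload and needs no void pointer, and the macro
recovers the typed enclosing object from the link.

```c
#include <assert.h>
#include <stddef.h>

/* The link lives inside the user's struct; the list owns no data. */
struct list_node {
    struct list_node *prev, *next;
};

/* Recover a pointer to the enclosing object from its embedded link. */
#define LIST_ENTRY(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty circular list is a head that points at itself. */
void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

/* Constant-time append: only pointers are written, nothing is copied. */
void list_push_back(struct list_node *head, struct list_node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Example payload type (hypothetical) carrying its own link. */
struct word {
    const char *text;
    struct list_node link;
};
```

Walking the list then looks like: for (p = head.next; p != &head;
p = p->next) { struct word *w = LIST_ENTRY(p, struct word, link); ... }.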

== 3 of 5 ==
Date: Mon, Feb 15 2010 11:06 pm
From: spinoza1111

On Feb 16, 1:05 am, Seebs <usenet-nos...@seebs.net> wrote:
> On 2010-02-15, Malcolm McLean <malcolm.mcle...@btinternet.com> wrote:
>
> > The main concern is, is this library easy to use? Only when the
> > program hits the treacle do you start worrying about how efficient the
> > code is behind those nice interfaces.
>
> Yeah. My string library (like most C programmers, I wrote one at one point)
> actually does have, under some circumstances, linked lists in it. It never
> seems to have become an issue.
>
> They're used to provide familiar semantics. Consider:
> char s[256] = "hello, world!";
> char *t;
>
> t = strstr(s, "world");
> strcpy(t, "sailor!");
>
> You would expect this to leave s = "hello, sailor!" (note that it's a [], not
> a *, and sized to provide enough room).

At this point, it doesn't seem that "your" string library is what we
need, since you have to do its mallocs or fixed allocations. A true
solution would instead encapsulate all memory management.

Your library allows its users to make crappy and irresponsible secret
decisions as to "how big" things can get, which programmers are
rarely, if ever, qualified to make.
>
> If you are doing structures, rather than raw pointers to characters, you
> need some way to indicate that the same change must be made in both t and s.
> My solution was to have s contain a linked list of strings derived from it,
> and t contain a pointer to the string it is derived from. When you modify
> t, it observes that it is a substring of another string, so in fact, the
> modification is passed back up to s. Since t and s share storage, this
> works.

OK, your string library is something like a library of pointers
outside the library; a collection of maps. I did something like this
in a preliminary fashion in 1987, but found that it was too much
trouble. A string library needs ownership of the strings.

However, a full library MIGHT want to sponsor both "virtual" strings
(pointers to space the library has neither mallocated nor predefined)
and "actual" strings (where the library has done the memory
management).

>
> -s
> --
> Copyright 2010, all wrongs reversed. Peter Seebach / usenet-nos...@seebs.net
> http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
> http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

== 4 of 5 ==
Date: Mon, Feb 15 2010 11:28 pm
From: spinoza1111

On Feb 16, 12:30 am, Julienne Walker <happyfro...@hotmail.com> wrote:
> On Feb 15, 9:40 am, spinoza1111 <spinoza1...@yahoo.com> wrote:
>
> > However, for easy modification, I don't think a doubly linked list
> > representation of strings with a length code could be beat.
>
> Indeed. Too bad linked data structures have a tendency toward poor
> cache performance. And depending on how the list is implemented, many
> programmers (even in C#) would likely balk at the extra overhead of
> storing and following links for a "mere" string.

They wouldn't have to follow the links. And the "overhead" is
psychological. Frankly, Julienne, I'm tired of the facility with which
ordinary garden-variety programmers transform "the need to think" into
"wasting overhead" as if their brains == their computer.

But I can see where linked data structures have a tendency toward poor
cache performance, the blocks not being adjacent. If this is a
consideration, malloc() a big block and then do memory allocation
yourself inside the big block.
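
A minimal sketch of that idea, assuming a simple bump allocator in which
pieces are handed out front to back and released all at once. The names
struct arena, arena_init, arena_alloc and arena_free_all are invented
for illustration, and rounding each request up to the size of a union of
wide types is only an approximation of "suitably aligned for anything".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Rounding unit: the size of a union of wide types (an assumption that
   this is strict enough alignment for list nodes and similar). */
union arena_align {
    long double ld;
    long long ll;
    void *p;
    void (*fp)(void);
};
#define ARENA_ALIGN sizeof(union arena_align)

struct arena {
    unsigned char *base;   /* the one big malloc'd block */
    size_t size;           /* total bytes in the block */
    size_t used;           /* bytes handed out so far */
};

/* Returns 1 on success, 0 if the big block could not be allocated. */
int arena_init(struct arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out the next n bytes of the block, or NULL if it is exhausted.
   Consecutive allocations are adjacent, which is the cache-friendly part. */
void *arena_alloc(struct arena *a, size_t n)
{
    size_t rounded;
    void *p;

    if (n > a->size - a->used)
        return NULL;
    rounded = (n + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
    if (rounded > a->size - a->used)
        return NULL;
    p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Individual pieces are never freed; the whole block goes at once. */
void arena_free_all(struct arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

The trade-off is that nodes cannot be returned individually, which suits
a list that is built, used and then discarded as a whole.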

>
> > copies the actual values into the list. This would have
> > blown me to Kingdom Come.
>
> Can you elaborate on this? I'm not sure I fully understand what you
> mean.

It appeared to me that Richard's "general purpose linked list" of 2000
copies the values of anything passed to it into nodes of the linked
list. But if I were (in a blue moon) to use his code to re-present
strings as linked lists, then he'd copy unbounded amounts of data
inside his code.

For the same reason RISC designers refused to implement the "fancy"
instructions of the old DEC Vax, a library function needs IF POSSIBLE
to have a fixed upper bound on the time it will consume. If I am
correct about Richard's code, if I pass him a huge data structure, his
timings will blow up because he's copying my data. Whereas if he'd
done the job right, for each node I pass him, he would have taken
constant time to set four-byte values (essentially the node's data
pointer and the pointer to the next element).

It is not always possible for a library function to have such a fixed
upper bound. If it is a replace() or a strstr(), OF COURSE its time
will depend on a fact about the data.

But this dependency falls right out of the problem, it is a necessary
fact about the problem. Whereas Richard's copy-o-rama comes out of
left field and seems to me completely unnecessary. If your compiler
supports pointer to void, use it, I say, and be damned.

Every "textbook" linked list I can recall uses a pointer in the data
node when the data is non-trivial. I was shocked, *shocked* to find
Richard did not.

But it is true that C handles this poorly because "void pointer is not
a 'real' pointer". This fact, about the inadequacy of C, has been
transformed into barbaric knowledge by the C clerisy, where what is a
criticism is re-intoned in tones of holy dread.

The problem wasn't solved until C++ and the notion of generics.
Preprocessor macros can simulate the effect of generics but only
painfully.

As far as I'm concerned: as far as I can see
Richard Heathfield does not work tastefully:
If something is coyote ugly and unpleasant
He seems to me to regard it, as a Heaven-sent
Opportunity
To increase the great weight of the world's misery.

Thus in linking up a linky list
We see, with dread, we say, oh hist,
He's copying ANYTHING into each node
Oh what a weary thing that is, and what a weary road.

He seems to be the Puritan
Who makes us sad whenever he can.

== 5 of 5 ==
Date: Mon, Feb 15 2010 11:32 pm
From: spinoza1111

On Feb 16, 12:21 am, Tim Streater <timstrea...@waitrose.com> wrote:
> On 15/02/2010 14:24, spinoza1111 wrote:
>
> > On Feb 15, 6:42 pm, Tim Streater <timstrea...@waitrose.com> wrote:
> >> On 15/02/2010 09:17, Nick Keighley wrote:
>
> >>> On 13 Feb, 10:43, Tim Streater <timstrea...@waitrose.com> wrote:
> >>>> On 13/02/2010 10:30, spinoza1111 wrote:
>
> >>>>> My code is hard to read, it is said (on dit). In a sense, it is. I
> >>>>> find it hard to read at times myself.
>
> >>>> Then it's shitty code, pure and simple.
>
> >>> if it was hard to write then it should be hard to understand
>
> >> Yes, I've come across others with that view in the past (others besides
> >> Spinny, that is). In one egregious instance the writer declined to do
> >> any doccy for his code, for just the reason you cite. He was too
> >> impressed by his own cleverness, the offering was "neat", to use his
> >> words. Asked to write some code to take a text file and convert it to
> >> another form, he'd written a "compiler", so instead of a small piece of
> >> code to do the job, it was humongous - with the attendant issues of
> >> maintainability and modifiability.
>
> > The anthropological lesson of these myths is "don't be too smart." Or
> > as we say in China, "the tallest stalk of rice is cut down".
>
> Which myth is that then? I'm dealing with reality - that of being left
> with an undocumented mess to look after, because the clown in question
> preferred to do something "cool" and "neat", rather than what he was
> paid to do.

What was he paid to do? Something hot and sloppy?
>
> By the way, I don't object to his doing something "cool" and "neat", but
> on his own time, OK?

But on the job we must produce crap at speed?

>
> [irrelevant drivel deleted]
>
> > Sure, the guy may have gone overboard. But the effect of the myth is
> > to bias programmers to a "simplicity" which permits them to make silly
> > mistakes.
>
> Way overboard in this instance. And which mistakes might those be, then?
>
> >> It was beneath his dignity, you see, to consider others, just as it's
> >> beneath Spinny's dignity to consider others or to bother understanding
> >> feof() or whatever it was. Well, fuck to the lot of them, is what I say.
>
> > Or whatever. Yes, I had to refresh my memory (you might also consider
> > doing your homework on what's what before posting: there's really no
> > excuse not to since so much information is online).
>
> Which homework's that then, Spinny? There was nothing I needed to look up.
>
> > As I said, this is because intelligent people prefer learning and
> > remembering elegant things,
>
> Very likely.
>
> > and feof() is coyote ugly, and, as I said, a bug waiting to happen
> > when a C program has to do an extra read merely to check EOF from
> > a device.
>
> Well, gee, life's a bitch sometimes, ain't it though! I'll let the good
> people of Haiti know that while they might think *they* have problems,
> poor Spinny has to deal with feof().

Haiti? OK, an old argument: don't complain about anything, you have no
right given "the weight of the world". The problem being that Haiti's
problems were CAUSED by US corporations in the same system in which
they don't give programmers, *as a matter of principle*, enough time to
do a quality job.
>
> --
> Tim
>
> "That the freedom of speech and debates or proceedings in Parliament
> ought not to be impeached or questioned in any court or place out of
> Parliament"
>
> Bill of Rights 1689

==============================================================================
TOPIC: Knowing the implementation, are all undefined behaviours become
implementation-defined behaviours?
http://groups.google.com/group/comp.lang.c/t/4f8b56b26018cf4e?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Feb 15 2010 10:55 pm
From: Nobody

On Mon, 15 Feb 2010 21:28:29 +0000, Alan Curry wrote:

> |Dereferencing a NULL pointer is undefined behaviour, but, on Linux, the
> |program crashes with SIGSEGV. So, the behaviour of dereferencing a NULL
> |pointer is defined to "crash the program with SIGSEGV".
>
> Are you sure?
>
> Compile this with and without -DUSE_STDIO and explain the results.
>
> Both branches ask the system to do the exact same thing: fetch a byte from
> the address indicated by NULL, and write it to the standard output.

Neither branch *dereferences* a null pointer; they just pass it to a
function. It's unknown whether either function ultimately dereferences the
pointer; they may test it for validity first.

In any case, the argument about dereferencing null pointers being
"defined" to crash the program with SIGSEGV is bogus.

First, SIGSEGV isn't defined to "crash" the program; you can install a
handler for SIGSEGV, and even if you don't, not everyone would consider
terminating on a signal to be a "crash" (does abort() "crash" the program?).

Second, SIGSEGV typically arises from accessing an address to which
nothing is mapped (or which is mapped but without the desired access). But
you can have memory mapped to page zero; 8086 (DOS) emulators frequently
do this.

Finally, reading a null pointer (as that term is defined by C) doesn't
necessarily result in reading address zero, as the compiler can optimise
the access away. In particular, in a context where a pointer has already
been dereferenced, recent versions of gcc will assume that the pointer is
non-null (if it's null, you've invoked UB by dereferencing it, so the
compiler can do whatever it wants, including doing whatever it's supposed
to do when the pointer is non-null).
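
The pattern in that last point can be sketched as follows. The function
names are invented for illustration. Whether a particular gcc version
actually removes the late check depends on the version and on options
such as the optimisation level and -fdelete-null-pointer-checks, so
treat the comments as describing what the compiler is permitted to do,
not what it is guaranteed to do.

```c
#include <assert.h>
#include <stddef.h>

/* Risky ordering: the dereference comes first, so the compiler may
   assume p is non-null and treat the later test as dead code. */
int value_or_default_risky(const int *p)
{
    int v = *p;            /* undefined behaviour if p is null */
    if (p == NULL)         /* may legally be optimised away */
        return -1;
    return v;
}

/* Safe ordering: test first, dereference only once p is known good. */
int value_or_default(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}
```

Only the second function has defined behaviour for a null argument; the
first may trap, may return -1, or may do something else entirely.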

==============================================================================
TOPIC: Cross reference tool
http://groups.google.com/group/comp.lang.c/t/24be1b7f05b314ab?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Feb 15 2010 11:36 pm
From: Hans Lodder

Ben Bacarisse wrote:
> News groups <j.hans.d.lodder@planet.nl> writes:
> <snip>
>> So I need a cross referencer. I could not find a simple program for
>> win32. Somewhere around 1982 I developed a PASCAL cross reference
>> program, which is some time ago ;). Never done anything in C though.
>>
>> I wanted to do this in C, mainly because of this newsgroup, which I
>> find to have a lot of experts, providing usually good comments.
>>
>> So I Googled and borrowed code from a lot of places, putting it
>> together in one file. The code is here (360 lines):
>>
>> http://requirements-management.nl/files/download/xref-sources-0.99.zip
>
> It's hard to be motivated to do a review of code not posted here so
> you may get only a few replies. I'll make two suggestions after
> commenting that the code looks good overall.
>
> (1) An unbalanced tree is a good first approximation, but your list of
> common words (I'd not call them keywords -- too confusing) is sorted
> so you get, in essence, a linked list search not a tree-based one!
>
> (2) If you include \r in the excluded characters, your files will work
> on non Windows systems. Alternatively, find a way to distribute a
> plain text version of the common word list so that it has native
> line endings wherever it is used.
>
> <snip>
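
Point (1) can be demonstrated with a small sketch. None of this is taken
from xref.c: the names struct node, insert, insert_balanced, height and
free_tree are invented here. Inserting an already sorted word list into
a plain binary search tree produces a chain whose height equals the
number of words, while inserting the middle element of each range first
keeps the same tree shallow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    const char *word;
    struct node *left, *right;
};

/* Plain unbalanced BST insert; duplicates are ignored. */
struct node *insert(struct node *root, const char *word)
{
    if (root == NULL) {
        root = malloc(sizeof *root);
        if (root == NULL)
            abort();
        root->word = word;
        root->left = root->right = NULL;
    } else {
        int c = strcmp(word, root->word);
        if (c < 0)
            root->left = insert(root->left, word);
        else if (c > 0)
            root->right = insert(root->right, word);
    }
    return root;
}

/* Inserts sorted words[lo..hi) middle-first, so the tree stays balanced. */
struct node *insert_balanced(struct node *root, const char **words,
                             size_t lo, size_t hi)
{
    if (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        root = insert(root, words[mid]);
        root = insert_balanced(root, words, lo, mid);
        root = insert_balanced(root, words, mid + 1, hi);
    }
    return root;
}

/* Number of nodes on the longest root-to-leaf path. */
size_t height(const struct node *n)
{
    size_t l, r;
    if (n == NULL)
        return 0;
    l = height(n->left);
    r = height(n->right);
    return 1 + (l > r ? l : r);
}

void free_tree(struct node *n)
{
    if (n != NULL) {
        free_tree(n->left);
        free_tree(n->right);
        free(n);
    }
}
```

With seven sorted words, in-order insertion gives a tree of height 7 (a
linked list in all but name), while the middle-first order gives height 3.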

Thanks Ben! I will include the complete code. Here it is:

/********************************************************************************/
/* a. File : xref.c */
/* b. Version : v0-99 */
/* c. Author : Hans Lodder (2007-09-28) */
/* d. Modified : Hans Lodder (2010-01-16) */
/* e. Description : Cross references an input file */
/* Can exclude keywords through L switch. */
/* f. Usage : xref [-h] {[-Len] | [-Ldu] | [-Lde] | [ -LC]} <file> */
/* g. Default : Exclude English keywords -Len */
/* h. Algorithm : Uses a binary tree and linked lists */
/* i. Business requirements: */
/* 01. Produces a cross reference of any UTF-8 input file. */
/* 02. By default xref neglects frequent English keywords. */
/* 03. No input results in no output. */
/* 04. Fool data is isolated as early as feasible without */
/* sacrificing the design. */
/* 05. A modular and maintainable design */
/* j. Tools : gcc (4.4.0) -Wall -Wextra -pedantic -o xref.exe xref.c */
/* splint (3.1.1) -weak xref.c */
/* indent (2.2.9) -gnu xref.c */
/* notepad++ (5.6.6) xref.c */
/* k. Test cases : 01. No input No output */
/* 02. 1 word 1 line file listing, */
/* 1 line output list */
/* 03. 2 equal words 1 line output list, 2 line numbers */
/* 04. 2 different words 2 line output list */
/* l. Remarks : */
/********************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>

#define WORD_WIDTH 15
#define NUM_WIDTH 5

#define LINE_BUF_SIZE 65536

#ifndef NDEBUG
#define NDEBUG
#undef NDEBUG


Tuesday, February 16, 2010
