Discussion:
Why require SLOW_BUT_NO_HACKS for stubs?
Isaac Dunham
2012-06-10 06:05:41 UTC
Permalink
Hello,
I'm using musl as a libc, and have run into a number of cases where
gnulib stopped the build.
By defining SLOW_BUT_NO_HACKS, the software ended up working.
This is the documented behavior, but it doesn't seem like the right one:
if a stub is usable enough to be allowed at all, why shouldn't it be
available whenever there's no alternative?

Is there any reason not to merge the
#elif defined SLOW_BUT_NO_HACKS
sections with the
#else
#error
sections, either with #pragma warn instead of #error, or without any
messages?

Isaac Dunham
Paul Eggert
2012-06-11 02:00:54 UTC
Permalink
Post by Isaac Dunham
Is there any reason not to merge
Performance, surely. But if there's
consensus that performance does not matter that
much with musl, perhaps we should default to the
slow version with musl.

Is there any simple way to tell at compile-time,
or at configure-time, that musl is being used?
That would help us distinguish musl (where being
slow is acceptable) from other platforms (which may not
want that).
Isaac Dunham
2012-06-12 01:22:02 UTC
Permalink
On Sun, 10 Jun 2012 19:00:54 -0700
Post by Paul Eggert
Post by Isaac Dunham
Is there any reason not to merge
Performance, surely. But if there's
consensus that performance does not matter that
much with musl, perhaps we should default to the
slow version with musl.
The test as it stands is "error out on unsupported platforms unless
user specifies to use slow method".
My proposal is "On unsupported platforms, use the slow method instead
of erroring out."
All supported platforms are unaffected.
Post by Paul Eggert
Is there any simple way to tell at compile-time,
or at configure-time, that musl is being used?
That would help us distinguish musl (where being
slow is acceptable) from other platforms (which may not
want that).
First, the proposal is "Run slow anywhere current code would #error",
not "default to slow code".
Second, officially, no. musl is designed for standards conformance,
and the maintainer takes the perspective that #ifdef should be reserved
for non-standard-conformant libc versions.
Unofficially, I can think of a few oddities:
strl* are supported with -D_BSD_SOURCE, while __linux__ will be defined;
almost all symbols use the function name; only SUSv4 is supported with
_XOPEN_SOURCE (so -D_XOPEN_SOURCE=600 with <unistd.h> still gives
_XOPEN_VERSION=700).
None of these are guaranteed to stay the same, though no _XOPEN_VERSION
less than 700 is likely to be supported.

Isaac Dunham
Paolo Bonzini
2012-06-12 11:30:45 UTC
Permalink
Post by Isaac Dunham
Post by Paul Eggert
Performance, surely. But if there's
consensus that performance does not matter that
much with musl, perhaps we should default to the
slow version with musl.
The test as it stands is "error out on unsupported platforms unless
user specifies to use slow method".
My proposal is "On unsupported platforms, use the slow method instead
of erroring out."
I agree, downgrading to a #warning and removing SLOW_BUT_NO_HACKS should be enough.
That would be something like this, but it would fail the tests. What to do?

Paolo
------------ 8< ----------------
John Spencer
2012-06-12 12:05:05 UTC
Permalink
Please don't forget fseterr, the other rude #erroring spot which fails
even though there is deactivated portable fallback code.

Subject: [PATCH] 6 syscalls is still better than a failed build

---
lib/fseterr.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/fseterr.c b/lib/fseterr.c
index eaad702..37efa4f 100644
--- a/lib/fseterr.c
+++ b/lib/fseterr.c
@@ -45,7 +45,7 @@ fseterr (FILE *fp)
fp->_Mode |= 0x200 /* _MERR */;
#elif defined __MINT__ /* Atari FreeMiNT */
fp->__error = 1;
-#elif 0 /* unknown */
+#elif 1 /* unknown */
/* Portable fallback, based on an idea by Rich Felker.
Wow! 6 system calls for something that is just a bit operation!
Not activated on any system, because there is no way to repair FP
when
--
1.7.3.4
Bruno Haible
2012-06-17 21:40:51 UTC
Permalink
Post by Isaac Dunham
The test as it stands is "error out on unsupported platforms unless
user specifies to use slow method".
My proposal is "On unsupported platforms, use the slow method instead
of erroring out."
If we did this, nobody would report to bug-gnulib (or to the libc maintainer)
the need to port the functions. You would get a slow or buggy program
instead.
@@ -84,10 +85,10 @@ freadahead (FILE *fp)
if (fp->state == 4 /* WR */ || fp->rp >= fp->wp)
return 0;
return fp->wp - fp->rp;
-#elif defined SLOW_BUT_NO_HACKS /* users can define this */
- abort ();
- return 0;
#else
- #error "Please port gnulib freadahead.c to your platform! Look at the definition of fflush, fread, ungetc on your system, then report this to bug-gnulib."
+ /* This implementation is correct on any ANSI C platform. It is just
+ awfully slow. */
+ return freading(fp) && !feof(fp);
+ #warning "Please port gnulib freadahead.c to your platform! Look at the definition of fflush, fread, ungetc on your system, then report this to bug-gnulib."
#endif
}
This alternative code is not correct. On a stream freshly opened for reading
it returns 1 where it should return 0 instead.

Bruno
Paolo Bonzini
2012-06-23 14:56:34 UTC
Permalink
Post by Bruno Haible
Post by Isaac Dunham
The test as it stands is "error out on unsupported platforms unless
user specifies to use slow method".
My proposal is "On unsupported platforms, use the slow method instead
of erroring out."
If we did this, nobody would report to bug-gnulib (or to the libc maintainer)
the need to port the functions. You would get a slow or buggy program
instead.
You can add a test program that detects an unported-to libc. So they
would get a slow program but also a make check failure.
Post by Bruno Haible
- #error "Please port gnulib freadahead.c to your platform! Look at the definition of fflush, fread, ungetc on your system, then report this to bug-gnulib."
+  /* This implementation is correct on any ANSI C platform.  It is just
+     awfully slow.  */
+  return freading(fp) && !feof(fp);
+ #warning "Please port gnulib freadahead.c to your platform! Look at the definition of fflush, fread, ungetc on your system, then report this to bug-gnulib."
 #endif
 }
This alternative code is not correct. On a stream freshly opened for reading
it returns 1 where it should return 0 instead.
Indeed, it is only correct to use this replacement in close_stdin.

Paolo
Bruno Haible
2012-06-24 11:07:24 UTC
Permalink
Post by Paolo Bonzini
Post by Bruno Haible
Post by Isaac Dunham
The test as it stands is "error out on unsupported platforms unless
user specifies to use slow method".
My proposal is "On unsupported platforms, use the slow method instead
of erroring out."
If we did this, nobody would report to bug-gnulib (or to the libc maintainer)
the need to port the functions. You would get a slow or buggy program
instead.
You can add a test program that detects an unported-to libc. So they
would get a slow program but also a make check failure.
Unfortunately, a majority of the users (between 50% and 90%, I got the
impression) runs "make; make install" without "make check". And many of
them would also ignore a #warning. To catch the attention of the users
and let them get in touch with us for porting the code, one really has
to provoke a build failure.

Bruno
John Spencer
2012-06-24 22:42:12 UTC
Permalink
Post by Bruno Haible
Post by Paolo Bonzini
Post by Bruno Haible
Post by Isaac Dunham
The test as it stands is "error out on unsupported platforms unless
user specifies to use slow method".
My proposal is "On unsupported platforms, use the slow method instead
of erroring out."
If we did this, nobody would report to bug-gnulib (or to the libc maintainer)
the need to port the functions. You would get a slow or buggy program
instead.
You can add a test program that detects an unported-to libc. So they
would get a slow program but also a make check failure.
Unfortunately, a majority of the users (between 50% and 90%, I got the
impression) runs "make; make install" without "make check". And many of
them would also ignore a #warning. To catch the attention of the users
and let them get in touch with us for porting the code, one really has
to provoke a build failure.
better said, catch the HATE of users.

anything is better than a failed build.
nobody cares if single corner cases are slow, as long as the program works.
and if they care, *THEN* they can get in touch with you to improve a
slightly unoptimal situation, as opposed to a *catastrophic* situation.
even if they know C and autoconf well enough to find the cause of their
build failure, and English well enough to contact this list, they
probably don't have much interest in discussing with you guys for days
until they finally get a gnulib fix upstream (which will be in their
software of choice months or years later) or not, unless they know how
to apply patches manually.

you seem to be one of the very few who think it's good to create trouble
for other people.

in the case of musl, dozens of people ran into the compile error in the
last 2 years, wasted cumulative days or weeks of work to fix it (more
than once, for every package that uses another version of gnulib) and
to retry hour-long compiles that broke in the middle.

but apparently nobody bothered to come here and discuss the
issue (or they were simply ignored).
instead they came to the musl irc channel to complain.
so your approach is provably ineffective (and very arrogant).

Now go ahead and enable the portable fallback code, and add 100 lines of
warnings to it so that *it will* get noticed, and if the users care
about 1% faster speed when they use printf, they will come anyway.
let's finally put an end to this sadistic farce.
Paul Eggert
2012-06-25 06:31:53 UTC
Permalink
Post by John Spencer
anything is better than a failed build.
Isn't this discussion moot now, with respect to musl?
That is, I thought the problem with musl and gnulib
is fixed, so we don't have a failed build now.

If this discussion is about what to do with some other
new standard C library that gnulib isn't ported to yet,
let's wait until that happens before worrying about it.
Perhaps by then the necessary primitives will be standardized
so the problem won't come up then either.
John Spencer
2012-06-25 13:00:44 UTC
Permalink
Post by Paul Eggert
Post by John Spencer
anything is better than a failed build.
Isn't this discussion moot now, with respect to musl?
That is, I thought the problem with musl and gnulib
is fixed, so we don't have a failed build now.
we will still have failed builds until all software using gnulib updates
its in-tree copies and releases new versions.
this can take a long time. an optimistic estimate is ~2 years.
and when you don't want to use the latest version for whatever
reason, you will still have to fight this problem.
for example i prefer using gcc 3.4.6 or 4.2.4 on embedded machines as it
is much more lightweight (however in that specific case it's fighting
against the broken prototypes in libiberty).
Post by Paul Eggert
If this discussion is about what to do with some other
new standard C library that gnulib isn't ported to yet,
let's wait until that happens before worrying about it.
i'm thinking about the future, when i (or others) will run into the same
error when i use nuttx, aros, RTOS, QNX, or something else that comes my
way. unless a portable fallback is *activated* by default, any new
system will have to fight against gnulib with its arrogant attitude.
Post by Paul Eggert
Perhaps by then the necessary primitives will be standardized
so the problem won't come up then either.
the problem wouldn't come up in the first place if you activated
portable fallbacks for weirdo code like fseterr.
people would be happy and would think gnulib is a fine thing, because it'd
simply work as intended.
Philipp Thomas
2012-06-25 13:27:17 UTC
Permalink
Post by Bruno Haible
Unfortunately, a majority of the users (between 50% and 90%, I got the
impression) runs "make; make install" without "make check".
Bruno Haible
2012-06-17 22:49:44 UTC
Permalink
[CCing the musl list]
Isaac Dunham wrote in
Post by Isaac Dunham
musl is designed for standards conformance,
There is a recipe, in <http://sourceware.org/glibc/wiki/Testing/Gnulib>,
that explains how to use gnulib to check a libc against bugs. When I apply
this to musl-0.9.1, I get this list of problems:

Replacements of *printf, because of
checking whether printf supports infinite 'long double' arguments... no
checking whether printf supports the 'ls' directive... no
checking whether printf survives out-of-memory conditions... no

Replacement of duplocale, because of
checking whether duplocale(LC_GLOBAL_LOCALE) works... no

Replacement of fdopen, because of
checking whether fdopen sets errno... no

Replacement of futimens, because of
checking whether futimens works... no

Replacement of getcwd, because of
checking whether getcwd handles long file names properly... no, but it is partly working
checking whether getcwd aborts when 4k < cwd_length < 16k... no

Replacement of getopt, because of
checking whether getopt is POSIX compatible... no

Replacement of glob, because of
checking for GNU glob interface version 1... no
(not sure this is a bug or just an incompatibility compared to glibc)

Replacement of iconv and iconv_open, because of
checking whether iconv supports conversion between UTF-8 and UTF-{16,32}{BE,LE}... no

Replacement of mktime, because of
checking for working mktime... no

Replacement of perror, because of
checking whether perror matches strerror... no

Replacement of popen, because of
checking whether popen works with closed stdin... no

Replacement of regex, because of
checking for working re_compile_pattern... no

Replacement of strtod, because of
checking whether strtod obeys C99... no

For each of the replacements, first look at the test program's results
(in config.log), then look at the test program's source code (in m4/*.m4).

Furthermore we have test failures:

test-duplocale.c:70: assertion failed
FAIL: test-duplocale

test-fcntl.c:382: assertion failed
FAIL: test-fcntl

test-fdatasync.c:50: assertion failed
FAIL: test-fdatasync

test-fma2.h:116: assertion failed
FAIL: test-fma2

test-fsync.c:50: assertion failed
FAIL: test-fsync

test-fwrite.c:53: assertion failed
FAIL: test-fwrite

test-getlogin_r.c:88: assertion failed
FAIL: test-getlogin_r

test-grantpt.c:34: assertion failed
FAIL: test-grantpt

test-localeconv.c:41: assertion failed
FAIL: test-localeconv

Segmentation fault
FAIL: test-localename

test-ptsname_r.c:118: assertion failed
FAIL: test-ptsname_r

test-strerror_r.c:118: assertion failed
FAIL: test-strerror_r

test-wcwidth.c:71: assertion failed
FAIL: test-wcwidth

When I compile all of gnulib, I also get a compilation error
(may be a musl or a gnulib problem, haven't investigated):
fsusage.c: In function 'get_fs_usage':
fsusage.c:222:17: error: storage size of 'fsd' isn't known
fsusage.c:224:3: warning: implicit declaration of function 'statfs' [-Wimplicit-function-declaration]
fsusage.c:222:17: warning: unused variable 'fsd' [-Wunused-variable]
make[4]: *** [fsusage.o] Error 1

Bruno
i***@lavabit.com
2012-06-18 00:16:49 UTC
Permalink
Post by Bruno Haible
[CCing the musl list]
Isaac Dunham wrote in
Post by Isaac Dunham
musl is designed for standards conformance,
There is a recipe, in <http://sourceware.org/glibc/wiki/Testing/Gnulib>,
that explains how to use gnulib to check a libc against bugs.
Be warned: a bad test can cause failures as well.
It's been one of the musl developers' complaints about gnulib that the
tests are buggy and frequently check for glibc behavior instead of
standard behavior.
Post by Bruno Haible
Replacements of *printf, because of
checking whether printf supports infinite 'long double' arguments... no
checking whether printf supports the 'ls' directive... no
checking whether printf survives out-of-memory conditions... no
At least one of these (infinite long double, IIRC) is invalid or a test
for a GNU-ism. This was previously discussed on the musl ML. OOM behavior
is undefined AFAICT (feel free to point out a standard), and the scenario
is a lot less likely with musl than glibc for several reasons.
Post by Bruno Haible
Replacement of duplocale, because of
checking whether duplocale(LC_GLOBAL_LOCALE) works... no
Need to check this one
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
I presume this is nonconformance to POSIX ("otherwise, a null pointer
shall be returned and errno set...")?
Post by Bruno Haible
Replacement of futimens, because of
checking whether futimens works... no
Could be a bug.
Post by Bruno Haible
Replacement of getcwd, because of
checking whether getcwd handles long file names properly... no, but it is partly working
Is this a test for ERANGE handling (error on name >= size)? Other than
that, I see no specification covering this.
Post by Bruno Haible
checking whether getcwd aborts when 4k < cwd_length < 16k... no
AFAICT, getcwd is only required to error when size <= cwd_length. If size
is not less than (cwd_length + 1), succeeding is conformant behavior.
(See man 3posix getcwd)
Post by Bruno Haible
Replacement of getopt, because of
checking whether getopt is POSIX compatible... no
We'd need to see this test...(will look later).
Post by Bruno Haible
Replacement of glob, because of
checking for GNU glob interface version 1... no
(not sure this is a bug or just an incompatibility compared to glibc)
Looks like an incompatibility, since it specifies "GNU interface"...
Post by Bruno Haible
Replacement of iconv and iconv_open, because of
checking whether iconv supports conversion between UTF-8 and
UTF-{16,32}{BE,LE}... no
Not "nonconformant" from the standpoint of POSIX, AFAICT, but it is
incomplete. musl is UTF-8 native, but I don't think it supports
UTF-16/UTF-32 yet.
Post by Bruno Haible
Replacement of mktime, because of
checking for working mktime... no
Replacement of perror, because of
checking whether perror matches strerror... no
Replacement of popen, because of
checking whether popen works with closed stdin... no
Look like bugs, if the description is correct.
Post by Bruno Haible
Replacement of regex, because of
checking for working re_compile_pattern... no
This is #ifdef __USE_GNU
I'm not aware of any standard covering GNU APIs...
Post by Bruno Haible
Replacement of strtod, because of
checking whether strtod obeys C99... no
For each of the replacements, first look at the test program's results
(in config.log), then look at the test program's source code (in m4/*.m4).
Thanks,
Isaac Dunham
Rich Felker
2012-06-19 00:11:56 UTC
Permalink
Some updates...
Post by Bruno Haible
There is a recipe, in <http://sourceware.org/glibc/wiki/Testing/Gnulib>,
that explains how to use gnulib to check a libc against bugs. When I apply
Replacements of *printf, because of
[...]
checking whether printf survives out-of-memory conditions... no
No idea. Copying out the test and running it directly, it passes just
fine for me. Maybe gnulib has already replaced printf with its own
malloc-using version by the time it gets to this test??
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
There was one bug here (failure to set errno when mode string was
invalid) but I don't think that's the case gnulib was testing for. It
seems gnulib wants an error for the "may fail" when the fd is invalid.
Post by Bruno Haible
Replacement of futimens, because of
checking whether futimens works... no
gnulib always forces this test to fail if __linux__ is defined.

Rich
Eric Blake
2012-06-19 02:07:40 UTC
Permalink
Post by Rich Felker
Some updates...
Post by Bruno Haible
There is a recipe, in <http://sourceware.org/glibc/wiki/Testing/Gnulib>,
that explains how to use gnulib to check a libc against bugs. When I apply
Replacements of *printf, because of
[...]
checking whether printf survives out-of-memory conditions... no
No idea. Copying out the test and running it directly, it passes just
fine for me. Maybe gnulib has already replaced printf with its own
malloc-using version by the time it gets to this test??
No; configure-time tests are relatively independent, and all done prior
to any replacements at compile-time. You should be able to inspect
config.log to see the actual test that configure ran.
Post by Rich Felker
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
There was one bug here (failure to set errno when mode string was
invalid) but I don't think that's the case gnulib was testing for. It
seems gnulib wants an error for the "may fail" when the fd is invalid.
The 'EBADF may fail' condition is rather weak. And since glibc
guarantees a definite failure, it is nicer to program to the glibc
interface that guarantees immediate failure on a bad fd at fdopen() time
than it is to deal with the surprises that result when fdopen() succeeds
but later attempts to use the stream fail. Perhaps it might be worth
splitting this into two gnulib modules, one for the strict POSIX
compliance and one for the glibc interface, but that depends on how
likely people are to want to program to the weaker POSIX interface when
it is just as easy to make fdopen() guarantee failure on bad fds and
save efforts up front.
Post by Rich Felker
Post by Bruno Haible
Replacement of futimens, because of
checking whether futimens works... no
gnulib always forces this test to fail if __linux__ is defined.
That's because the Linux kernel got it wrong for quite some time, and
worse, it was file-system dependent - even if it worked on one machine
and file system, compiling in support for futimens and then running on
another file system would experience random compliance failures due to
the poor file system implementation.

It's been a while, so maybe we can finally graduate this module into
assuming that the Linux kernel is better behaved by default, and make
the user specifically request the fallbacks if they are worried about
using the binary on a file system that still has bugs. But don't take
it personally - this is not a case of musl getting it wrong, but of the
kernel getting it wrong.
--
Eric Blake ***@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Rich Felker
2012-06-19 02:52:32 UTC
Permalink
Post by Eric Blake
Post by Rich Felker
Some updates...
Post by Bruno Haible
There is a recipe, in <http://sourceware.org/glibc/wiki/Testing/Gnulib>,
that explains how to use gnulib to check a libc against bugs. When I apply
Replacements of *printf, because of
[...]
checking whether printf survives out-of-memory conditions... no
No idea. Copying out the test and running it directly, it passes just
fine for me. Maybe gnulib has already replaced printf with its own
malloc-using version by the time it gets to this test??
No; configure-time tests are relatively independent, and all done prior
to any replacements at compile-time. You should be able to inspect
config.log to see the actual test that configure ran.
OK, then no idea what's causing this. I was going to try running the
test but I didn't have autotools installed on the system I wanted to
test on, so I put it off..
Post by Eric Blake
Post by Rich Felker
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
There was one bug here (failure to set errno when mode string was
invalid) but I don't think that's the case gnulib was testing for. It
seems gnulib wants an error for the "may fail" when the fd is invalid.
The 'EBADF may fail' condition is rather weak. And since glibc
guarantees a definite failure, it is nicer to program to the glibc
interface that guarantees immediate failure on a bad fd at fdopen() time
than it is to deal with the surprises that result when fdopen() succeeds
but later attempts to use the stream fail. Perhaps it might be worth
The only real-world situation I can think of where you'd care that
fdopen detect EBADF is when you've just called a function that
allocates the file descriptor and directly passed it to fdopen without
first checking the return value. For instance:

FILE *f = fdopen(open(pathname, O_RDWR|O_CLOEXEC), "rb+");

instead of:

int fd = open(pathname, O_RDWR|O_CLOEXEC);
if (fd<0) goto error;
FILE *f = fdopen(fd, "rb+");

The former is rather lazy programming, but maybe gnulib intends to
support this kind of programming.
Post by Eric Blake
splitting this into two gnulib modules, one for the strict POSIX
compliance and one for the glibc interface, but that depends on how
likely people are to want to program to the weaker POSIX interface when
it is just as easy to make fdopen() guarantee failure on bad fds and
save efforts up front.
My thought in having musl skip the test is to maximize performance of
fdopen, assuming you might be using it in a situation like on a newly
accept()ed network connection where every syscall counts (think
multi-threaded httpd). For read-only fdopen, no syscalls are needed,
but if writing is possible, fdopen has to make a syscall to check
whether the fd is a tty in order to decide whether to enable line
buffering or full buffering, and in principle it could detect EBADF at
the same time for no cost.
Post by Eric Blake
Post by Rich Felker
Post by Bruno Haible
Replacement of futimens, because of
checking whether futimens works... no
gnulib always forces this test to fail if __linux__ is defined.
That's because the Linux kernel got it wrong for quite some time, and
worse, it was file-system dependent - even if it worked on one machine
and file system, compiling in support for futimens and then running on
another file system would experience random compliance failures due to
the poor file system implementation.
It's been a while, so maybe we can finally graduate this module into
assuming that the Linux kernel is better behaved by default, and make
the user specifically request the fallbacks if they are worried about
using the binary on a file system that still has bugs. But don't take
it personally - this is not a case of musl getting it wrong, but of the
kernel getting it wrong.
Yes, it might be nice if the test output made it clear that it was
hard-coded to fail, so nobody goes looking for nonexistent bugs..

Rich
Bruno Haible
2012-06-19 11:03:38 UTC
Permalink
Post by Rich Felker
Post by Eric Blake
Post by Rich Felker
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
There was one bug here (failure to set errno when mode string was
invalid) but I don't think that's the case gnulib was testing for. It
seems gnulib wants an error for the "may fail" when the fd is invalid.
Indeed, the possibility that fdopen(invalid fd, ...) can succeed is supported
by POSIX. When looking at the gnulib documentation
doc/posix-functions/fdopen.texi
and the unit test
tests/test-fdopen.c
it appears that it was not the intent of gnulib to prohibit this behaviour.
Rather, musl is the first platform to exhibit this behaviour, and gnulib's
intent was to make sure that fdopen(invalid fd, ...)
1. does not crash,
2. sets errno when it fails.
Post by Rich Felker
Post by Eric Blake
The 'EBADF may fail' condition is rather weak. And since glibc
guarantees a definite failure, it is nicer to program to the glibc
interface that guarantees immediate failure on a bad fd at fdopen() time
than it is to deal with the surprises that result when fdopen() succeeds
but later attempts to use the stream fail. Perhaps it might be worth
The glibc documentation contains this warning:

In some other systems, `fdopen' may fail to detect that the modes
for file descriptor do not permit the access specified by
`opentype'. The GNU C library always checks for this.

So, I think few programmers will explicitly want to exploit this glibc
specific behaviour.
Post by Rich Felker
The only real-world situation I can think of where you'd care that
fdopen detect EBADF is when you've just called a function that
allocates the file descriptor and directly passed it to fdopen without
first checking the return value. For instance:
FILE *f = fdopen(open(pathname, O_RDWR|O_CLOEXEC), "rb+");
instead of:
int fd = open(pathname, O_RDWR|O_CLOEXEC);
if (fd<0) goto error;
FILE *f = fdopen(fd, "rb+");
The former is rather lazy programming, but maybe gnulib intends to
support this kind of programming.
No, gnulib does not intend to encourage this kind of lazy programming.
Post by Rich Felker
My thought in having musl skip the test is to maximize performance of
fdopen, assuming you might be using it in a situation like on a newly
accept()ed network connection where every syscall counts (think
multi-threaded httpd). For read-only fdopen, no syscalls are needed,
Sounds reasonable.

Here's a proposed patch to remove gnulib's unintentional requirement.


2012-06-19 Bruno Haible <***@clisp.org>

fdopen: Allow implementations that don't reject invalid fd arguments.
* m4/fdopen.m4 (gl_FUNC_FDOPEN): Let the test pass if fdopen(-1,...)
succeeds.
Reported by Rich Felker <***@aerifal.cx>.

--- m4/fdopen.m4.orig Tue Jun 19 13:00:23 2012
+++ m4/fdopen.m4 Tue Jun 19 13:00:05 2012
@@ -1,4 +1,4 @@
-# fdopen.m4 serial 2
+# fdopen.m4 serial 3
dnl Copyright (C) 2011-2012 Free Software Foundation, Inc.
dnl This file is free software; the Free Software Foundation
dnl gives unlimited permission to copy and/or distribute it,
@@ -25,10 +25,8 @@
FILE *fp;
errno = 0;
fp = fdopen (-1, "r");
- if (fp != NULL)
+ if (fp == NULL && errno == 0)
return 1;
- if (errno == 0)
- return 2;
return 0;
}]])],
[gl_cv_func_fdopen_works=yes],
Jim Meyering
2012-06-19 11:09:48 UTC
Permalink
Bruno Haible wrote:
...
Post by Bruno Haible
Post by Rich Felker
My thought in having musl skip the test is to maximize performance of
fdopen, assuming you might be using it in a situation like on a newly
accept()ed network connection where every syscall counts (think
multi-threaded httpd). For read-only fdopen, no syscalls are needed,
Sounds reasonable.
Here's a proposed patch to remove gnulib's unintentional requirement.
fdopen: Allow implementations that don't reject invalid fd arguments.
* m4/fdopen.m4 (gl_FUNC_FDOPEN): Let the test pass if fdopen(-1,...)
succeeds.
Thanks, Bruno.
That patch looks perfect.
Bruno Haible
2012-06-20 20:52:00 UTC
Permalink
Post by Jim Meyering
Post by Bruno Haible
fdopen: Allow implementations that don't reject invalid fd arguments.
* m4/fdopen.m4 (gl_FUNC_FDOPEN): Let the test pass if fdopen(-1,...)
succeeds.
Thanks, Bruno.
That patch looks perfect.
I have applied the patch.

Bruno
Bruno Haible
2012-06-19 10:45:50 UTC
Permalink
Post by Rich Felker
Post by Bruno Haible
Replacements of *printf, because of
[...]
checking whether printf survives out-of-memory conditions... no
No idea. Copying out the test and running it directly, it passes just
fine for me. Maybe gnulib has already replaced printf with its own
malloc-using version by the time it gets to this test??
Strange indeed. With a testdir of all of gnulib, I got

configure:17615: checking whether printf survives out-of-memory conditions
configure:17786: /arch/x86-linux/inst-musl/bin/musl-gcc -std=gnu99 -o conftest -g -O2 -Wall conftest.c >&5
configure:17789: $? = 0
configure:17837: result: yes

but with a testdir of only the POSIX related modules of gnulib, I got

configure:13657: checking whether printf survives out-of-memory conditions
configure:13828: /arch/x86-linux/inst-musl/bin/musl-gcc -std=gnu99 -o conftest -g -O2 -Wall conftest.c >&5
configure:13831: $? = 0
configure:13879: result: no

The '$? = 0' line prints only the linker's exit code, not the runtime
exit code. I'm adding a second output line for the runtime exit code.
Then I get:

configure:8919: checking whether printf survives out-of-memory conditions
configure:9090: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:9093: $? = 0
configure:9097: $? = 1
configure:9142: result: no

After adding a printf to stderr: Once I get

configure:8919: checking whether printf survives out-of-memory conditions
configure:9093: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:9096: $? = 0
printf's return value = 5000002, errno = 0
configure:9100: $? = 0
configure:9145: result: yes

In another configure run I get:

configure:8919: checking whether printf survives out-of-memory conditions
configure:9093: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:9096: $? = 0
configure:9100: $? = 1
configure:9145: result: no

So, the exit code 1 must have come from the crash handler. Without this crash
handler: 7x I get

configure:8919: checking whether printf survives out-of-memory conditions
configure:8979: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:8982: $? = 0
printf's return value = 5000002, errno = 0
configure:8986: $? = 0
configure:9031: result: yes

but once I get

configure:8979: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:8982: $? = 0
configure:8986: $? = 139
configure:9031: result: no

So, apparently, under memory stress, musl's printf has a probability of
between 10% and 50% of crashing with SIGSEGV (139 = 128 + 11).

Bruno


2012-06-19 Bruno Haible <***@clisp.org>

*printf-posix: Put more info into config.log.
* m4/printf.m4 (gl_PRINTF_ENOMEM): Emit conftest's error output and
exit code into config.log.

--- m4/printf.m4.orig Tue Jun 19 12:41:56 2012
+++ m4/printf.m4 Tue Jun 19 12:41:53 2012
@@ -1,4 +1,4 @@
-# printf.m4 serial 48
+# printf.m4 serial 49
dnl Copyright (C) 2003, 2007-2012 Free Software Foundation, Inc.
dnl This file is free software; the Free Software Foundation
dnl gives unlimited permission to copy and/or distribute it,
@@ -1028,8 +1028,9 @@
changequote([,])dnl
])])
if AC_TRY_EVAL([ac_link]) && test -s conftest$ac_exeext; then
- (./conftest
+ (./conftest 2>&AS_MESSAGE_LOG_FD
result=$?
+ _AS_ECHO_LOG([\$? = $result])
if test $result != 0 && test $result != 77; then result=1; fi
exit $result
) >/dev/null 2>/dev/null
Rich Felker
2012-06-19 19:16:50 UTC
Permalink
Post by Bruno Haible
So, the exit code 1 must have come from the crash handler. Without this crash
handler: 7x I get
configure:8919: checking whether printf survives out-of-memory conditions
configure:8979: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:8982: $? = 0
printf's return value = 5000002, errno = 0
configure:8986: $? = 0
configure:9031: result: yes
but once I get
configure:8979: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:8982: $? = 0
configure:8986: $? = 139
configure:9031: result: no
So, apparently, under memory stress, musl's printf has a probability of
between 10% and 50% of crashing with SIGSEGV (139 = 128 + 11).
musl's printf does not do anything with memory except using a small
constant amount of stack space (a few hundred bytes for non-float,
somewhere around 5-7k for floating point). This is completely
independent of the width/padding/precision; the implementation
actually goes to a good bit of trouble to ensure that it can print any
amount of padding efficiently without large or unbounded stack space
usage.

Is there any way the rlimits put in place could be preventing the
stack from expanding even one page beyond its current size, etc.?

Rich
Bruno Haible
2012-06-19 20:04:57 UTC
Permalink
Post by Rich Felker
Post by Bruno Haible
but once I get
configure:8979: /arch/x86-linux/inst-musl/bin/musl-gcc -o conftest -g -O2 -Wall conftest.c >&5
configure:8982: $? = 0
configure:8986: $? = 139
configure:9031: result: no
So, apparently, under memory stress, musl's printf has a probability of
between 10% and 50% of crashing with SIGSEGV (139 = 128 + 11).
musl's printf does not do anything with memory except using a small
constant amount of stack space (a few hundred bytes for non-float,
somewhere around 5-7k for floating point). This is completely
independent of the width/padding/precision; the implementation
actually goes to a good bit of trouble to ensure that it can print any
amount of padding efficiently without large or unbounded stack space
usage.
Is there any way the rlimits put in place could be preventing the
stack from expanding beyond even one page the current number of pages,
etc.?
I can reduce the program and the compilation options:

=============================== conftest.c =============================
#include <stdio.h>
#include <errno.h>
int main()
{
int ret;
int err;
ret = printf ("%.5000000f", 1.0);
err = errno;
fprintf (stderr, "printf's return value = %d, errno = %d\n", ret, err);
return !(ret == 5000002 || (ret < 0 && err == ENOMEM));
}
========================================================================
$ musl-gcc -g -Wall conftest.c -o conftest
$ ./conftest > /dev/null ; echo $?
printf's return value = 5000002, errno = 0
0
$ ./conftest > /dev/null ; echo $?
printf's return value = 5000002, errno = 0
0
$ ./conftest > /dev/null ; echo $?
printf's return value = 5000002, errno = 0
0
$ ./conftest > /dev/null ; echo $?
Speicherzugriffsfehler (Speicherabzug geschrieben)
139
$ ./conftest > /dev/null ; echo $?
Speicherzugriffsfehler (Speicherabzug geschrieben)
139

I couldn't get useful info from gdb.

This is on Linux, 32-bit mode on a 64-bit system. Can you reproduce this?

Bruno
Rich Felker
2012-06-19 20:08:47 UTC
Permalink
Post by Bruno Haible
=============================== conftest.c =============================
#include <stdio.h>
#include <errno.h>
int main()
{
int ret;
int err;
ret = printf ("%.5000000f", 1.0);
err = errno;
fprintf (stderr, "printf's return value = %d, errno = %d\n", ret, err);
return !(ret == 5000002 || (ret < 0 && err == ENOMEM));
}
========================================================================
$ musl-gcc -g -Wall conftest.c -o conftest
$ ./conftest > /dev/null ; echo $?
printf's return value = 5000002, errno = 0
0
$ ./conftest > /dev/null ; echo $?
printf's return value = 5000002, errno = 0
0
$ ./conftest > /dev/null ; echo $?
printf's return value = 5000002, errno = 0
0
$ ./conftest > /dev/null ; echo $?
Speicherzugriffsfehler (Speicherabzug geschrieben)
139
$ ./conftest > /dev/null ; echo $?
Speicherzugriffsfehler (Speicherabzug geschrieben)
139
I couldn't get useful info from gdb.
This is on Linux, 32-bit mode on a 64-bit system. Can you reproduce this?
I can't reproduce it. Do you have a dynamic-linked musl or just
static? I tried both and couldn't reproduce with either. Did you set
resource limits before running it? Are you using any strange kernel
mods? I once heard of a patched kernel setting up other mappings over
top of the not-yet-expanded-into stack space, but I'd be surprised if
more weren't breaking on such a system...

What happened in gdb? Were you unable to get it to crash? What if you
run it under strace?

Rich
Bruno Haible
2012-06-19 21:17:33 UTC
Permalink
Do you have a dynamic-linked musl or just static?
Dynamically linked:

$ readelf -d conftest

Dynamic section at offset 0xf3c contains 18 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libc.so]
0x0000000c (INIT) 0x804832c
0x0000000d (FINI) 0x80484ec
0x00000004 (HASH) 0x80481a0
0x6ffffef5 (GNU_HASH) 0x80481dc
0x00000005 (STRTAB) 0x80482b0
0x00000006 (SYMTAB) 0x8048210
0x0000000a (STRSZ) 83 (bytes)
0x0000000b (SYMENT) 16 (bytes)
0x00000015 (DEBUG) 0x0
0x00000003 (PLTGOT) 0x8049ff4
0x00000002 (PLTRELSZ) 32 (bytes)
0x00000014 (PLTREL) REL
0x00000017 (JMPREL) 0x804830c
0x00000011 (REL) 0x8048304
0x00000012 (RELSZ) 8 (bytes)
0x00000013 (RELENT) 8 (bytes)
0x00000000 (NULL) 0x0
$ readelf -l conftest

Elf file type is EXEC (Executable file)
Entry point 0x8048390
There are 9 program headers, starting at offset 52

Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x08048154 0x08048154 0x00026 0x00026 R 0x1
[Requesting program interpreter: /arch/x86-linux/inst-musl/lib/libc.so]
LOAD 0x000000 0x08048000 0x08048000 0x00578 0x00578 R E 0x1000
LOAD 0x000f28 0x08049f28 0x08049f28 0x000ec 0x000f8 RW 0x1000
DYNAMIC 0x000f3c 0x08049f3c 0x08049f3c 0x000b8 0x000b8 RW 0x4
NOTE 0x00017c 0x0804817c 0x0804817c 0x00024 0x00024 R 0x4
GNU_EH_FRAME 0x000528 0x08048528 0x08048528 0x00014 0x00014 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4
GNU_RELRO 0x000f28 0x08049f28 0x08049f28 0x000d8 0x000d8 R 0x1

Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .ctors .dtors .jcr .dynamic .got.plt .data .bss
04 .dynamic
05 .note.gnu.build-id
06 .eh_frame_hdr
07
08 .ctors .dtors .jcr .dynamic
$ readelf --dyn-syms conftest

Symbol table '.dynsym' contains 10 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FUNC GLOBAL DEFAULT UND printf
2: 00000000 0 FUNC GLOBAL DEFAULT UND fprintf
3: 00000000 0 FUNC GLOBAL DEFAULT UND __errno_location
4: 00000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main
5: 0804a014 0 NOTYPE GLOBAL DEFAULT ABS _edata
6: 0804a020 0 NOTYPE GLOBAL DEFAULT ABS _end
7: 08048390 0 NOTYPE GLOBAL DEFAULT 11 _start
8: 0804a014 0 NOTYPE GLOBAL DEFAULT ABS __bss_start
9: 0804a014 4 OBJECT GLOBAL DEFAULT 22 stderr
Did you set resource limits before running it?
No.
$ ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 29019
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 29019
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Are you using any strange kernel mods?
No. Stock openSUSE 12.1.
$ uname -srv
Linux 3.1.10-1.9-desktop #1 SMP PREEMPT Thu Apr 5 18:48:38 UTC 2012 (4a97ec8)
What happened in gdb?
The stack trace in gdb is unusable.
$ gdb conftest
GNU gdb (GDB) SUSE (7.3-41.1.2)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /data/bruno/tmp/testdir3/conftest...done.
(gdb) set solib-search-path /arch/x86-linux/inst-musl/lib
(gdb) run
Starting program: /data/bruno/tmp/testdir3/conftest
warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?

Program received signal SIGSEGV, Segmentation fault.
0xf7fc76c3 in fmt_fp () from /data/arch/x86-linux/inst-musl/lib/libc.so
(gdb) where
#0 0xf7fc76c3 in fmt_fp () from /data/arch/x86-linux/inst-musl/lib/libc.so
#1 0x00000000 in ?? ()

This is a bit useless, since libc.so is compiled without debugging information.
If I rebuild with "-O1 -g" instead of "-Os" and "-O3", I get this stack trace:

$ gdb conftest
GNU gdb (GDB) SUSE (7.3-41.1.2)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /data/bruno/tmp/testdir3/conftest...done.
(gdb) set solib-search-path /arch/x86-linux/inst-musl/lib
(gdb) run
Starting program: /data/bruno/tmp/testdir3/conftest
warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?

Program received signal SIGSEGV, Segmentation fault.
fmt_fp (f=0xf7ff9200, y=0, w=0, p=5000000, fl=0, t=102) at src/stdio/vfprintf.c:326
326 x = *d % i;
(gdb) where
#0 fmt_fp (f=0xf7ff9200, y=0, w=0, p=5000000, fl=0, t=102) at src/stdio/vfprintf.c:326
#1 0xf7fcacf3 in printf_core (f=0xf7ff9200, fmt=<optimized out>, ap=0xffffc13c, nl_arg=0xffffc09c,
nl_type=0xffffc114) at src/stdio/vfprintf.c:614
#2 0xf7fcb0eb in vfprintf (f=0xf7ff9200, fmt=0x80484f4 "%.5000000f", ap=0xffffc1a4 "") at src/stdio/vfprintf.c:659
#3 0xf7fcd967 in vprintf (fmt=0x80484f4 "%.5000000f", ap=0xffffc1a4 "") at src/stdio/vprintf.c:5
#4 0xf7fc8463 in printf (fmt=0x80484f4 "%.5000000f") at src/stdio/printf.c:9
#5 0x0804845f in main () at conftest.c:7
(gdb) info locals
x = <optimized out>
big = {524288, 0 <repeats 1750 times>, 4160552156, 0, 0, 0, 0, 0, 0, 0, 4160720884, 8, 8, 134513329, 4160343432,
134513332, 4160609540, 1, 0 <repeats 46 times>, 134513908, 4160721408, 4160517969, 4160727464, 134513908, 0, 0, 0,
0, 0, 4160720884, 4160711907, 0, 0, 4160524786}
a = 0xffffa2b0
d = 0x218b40
r = 0xffffa2b0
z = 0x218b44
e2 = 0
e = 0
i = <optimized out>
j = 9
l = <optimized out>
buf = '\000' <repeats 24 times>
s = <optimized out>
prefix = 0xf7ff6cb4 "0X+0X 0X-0x+0x 0x"
pl = 0
ebuf0 = '\000' <repeats 11 times>
ebuf = 0xffffa293 ""
estr = <optimized out>
(gdb) up
#1 0xf7fcacf3 in printf_core (f=0xf7ff9200, fmt=<optimized out>, ap=0xffffc13c, nl_arg=0xffffc09c,
nl_type=0xffffc114) at src/stdio/vfprintf.c:614
614 l = fmt_fp(f, arg.f, w, p, fl, t);
(gdb) info locals
a = <optimized out>
z = 0xffffbff0 ""
s = 0x80484fe ""
l10n = 0
litpct = <optimized out>
fl = 0
w = 0
p = 5000000
arg = {i = 9223372036854775808, f = 1, p = 0x0}
argpos = -1
st = <optimized out>
ps = 0
cnt = 0
l = 0
i = <optimized out>
buf = "A\370\367\374\371\370\367\000\000\000\000\021", '\000' <repeats 27 times>, "\377", <incomplete sequence \367>
prefix = 0xf7ff6cd2 "-+ 0X0x"
t = 102
pl = 0
wc = L"\xf7f9c62d\xf7f899ac"
ws = <optimized out>
mb = "\271\202\004\b"
(gdb) up
#2 0xf7fcb0eb in vfprintf (f=0xf7ff9200, fmt=0x80484f4 "%.5000000f", ap=0xffffc1a4 "") at src/stdio/vfprintf.c:659
659 ret = printf_core(f, fmt, &ap2, nl_arg, nl_type);
(gdb) info locals
ap2 = 0xffffc1ac ""
nl_type = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
nl_arg = {{i = 150189233701, f = 0, p = 0xf7f9d625}, {i = 4307434622, f = <invalid float value>, p = 0xbe3c7e}, {
i = 4024693728518132, f = 0, p = 0x8049ff4}, {i = 0, f = <invalid float value>, p = 0x0}, {i = 98599429607984,
f = 0, p = 0xf7fa1230}, {i = 17868614760971370496, f = -0, p = 0x0}, {i = 17870160128724931592, f = 0,
p = 0xf7ff9408}, {i = 13791, f = 0, p = 0x35df}, {i = 47244701668, f = <invalid float value>, p = 0xefe4}, {
i = 824633720832, f = 0, p = 0x0}}
internal_buf = "h\334\375\367", '\000' <repeats 12 times>"\364, \217\377\367\340\216\377\367\270\300\377\377\"\000\000\000:\310\371\367\270\300\377\377\000\000\000\000\210\000\000\000\260\202\004\b\000\224\377\367\000\000\000\000\000\000\000\000\364\217\377\367H\224\377\367@\301\377\377\000\340\377\377"
saved_buf = 0x0
ret = <optimized out>
__need_unlock = 0
(gdb) up
#3 0xf7fcd967 in vprintf (fmt=0x80484f4 "%.5000000f", ap=0xffffc1a4 "") at src/stdio/vprintf.c:5
5 return vfprintf(stdout, fmt, ap);
(gdb) info locals
No locals.
(gdb) up
#4 0xf7fc8463 in printf (fmt=0x80484f4 "%.5000000f") at src/stdio/printf.c:9
9 ret = vprintf(fmt, ap);
(gdb) info locals
ret = 9
ap = 0xffffc1a4 ""
(gdb) up
#5 0x0804845f in main () at conftest.c:7
7 ret = printf ("%.5000000f", 1.0);
(gdb) info locals
ret = 0
err = 0

The SIGSEGV occurs because d = 0x218b40 but the address ranges are these:
08048000-08049000 r-xp 00000000 08:05 26174991 /data/bruno/tmp/testdir3/conftest
08049000-0804b000 rwxp 00000000 08:05 26174991 /data/bruno/tmp/testdir3/conftest
f7f84000-f7ff8000 r-xp 00000000 08:05 26168372 /data/arch/x86-linux/inst-musl/lib/libc.so
f7ff8000-f7ffa000 rwxp 00073000 08:05 26168372 /data/arch/x86-linux/inst-musl/lib/libc.so
f7ffa000-f7ffe000 rwxp 00000000 00:00 0
fffdc000-ffffe000 rwxp 00000000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
What if you run it under strace?
Yes. When it succeeds, the strace output looks normal. When it fails,
it's this:

$ strace ./conftest
execve("./conftest", ["./conftest"], [/* 133 vars */]) = 0
[ Process PID=2858 runs in 32 bit mode. ]
--- {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xe7664} (Segmentation fault) ---
+++ killed by SIGSEGV (core dumped) +++
Speicherzugriffsfehler (Speicherabzug geschrieben)

Hope this helps.

Bruno
Rich Felker
2012-06-20 01:52:49 UTC
Permalink
Post by Bruno Haible
[...]
08048000-08049000 r-xp 00000000 08:05 26174991 /data/bruno/tmp/testdir3/conftest
08049000-0804b000 rwxp 00000000 08:05 26174991 /data/bruno/tmp/testdir3/conftest
f7f84000-f7ff8000 r-xp 00000000 08:05 26168372 /data/arch/x86-linux/inst-musl/lib/libc.so
f7ff8000-f7ffa000 rwxp 00073000 08:05 26168372 /data/arch/x86-linux/inst-musl/lib/libc.so
f7ffa000-f7ffe000 rwxp 00000000 00:00 0
fffdc000-ffffe000 rwxp 00000000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
What if you run it under strace?
Yes. When it succeeds, the strace output looks normal. When it fails,
$ strace ./conftest
execve("./conftest", ["./conftest"], [/* 133 vars */]) = 0
[ Process PID=2858 runs in 32 bit mode. ]
--- {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xe7664} (Segmentation fault) ---
+++ killed by SIGSEGV (core dumped) +++
Speicherzugriffsfehler (Speicherabzug geschrieben)
Hope this helps.
Yes, it helped a lot. Thanks! The problem was an obscure
pointer-arithmetic overflow that could only happen in 32-bit binaries
running on a 64-bit kernel where the stack pointer is near the 4GB
boundary. This is why I couldn't reproduce it: I'm on a 32-bit
kernel where the stack is at 3GB and there's no way an offset bounded
by INT_MAX/9 could reach past 4GB. That's my excuse for why it was
never noticed before, but it still doesn't justify the bug, which is a
nasty instance of UB (pointer arithmetic outside array bounds).

Anyway, it's fixed now.

Rich


P.S. I just realized - I meant to credit you for finding it in the
commit message but somehow I forgot to. Sorry about that!
Bruno Haible
2012-06-20 09:35:28 UTC
Permalink
The problem was an obscure pointer-arithmetic overflow ...
where the stack pointer is near the 4GB boundary.
This explains also why it occurred only with a certain probability
outside gdb, but with 100% probability from within gdb: Apparently gdb
runs the program without address space layout randomization.
Anyway, it's fixed now.
I confirm that
http://git.etalabs.net/cgi-bin/gitweb.cgi?p=musl;a=commitdiff;h=914949d321448bd2189bdcbce794dbae2c8ed16e
fixes the bug.

Bruno
Jim Meyering
2012-06-20 11:00:30 UTC
Permalink
Post by Bruno Haible
The problem was an obscure pointer-arithmetic overflow ...
where the stack pointer is near the 4GB boundary.
This explains also why it occurred only with a certain probability
outside gdb, but with 100% probability from within gdb: Apparently gdb
runs the program without address space layout randomization.
That is correct. It is a feature of gdb-7.0 and newer.
You can inspect (watch/break-at/etc.) the same address and expect it
to refer to the same memory location in multiple invocations.
This makes gdb's command-line history even more useful.
Tom Tromey
2012-06-21 19:58:30 UTC
Permalink
Jim> That is correct. It is a feature of gdb-7.0 and newer.
Jim> You can inspect (watch/break-at/etc.) the same address and expect it
Jim> to refer to the same memory location in multiple invocations.
Jim> This makes gdb's command-line history even more useful.

gdb defaults to this for development convenience.
You can change it though, see "set disable-randomization".

Tom
Rich Felker
2012-06-20 03:04:45 UTC
Permalink
Some more updates..
Post by Bruno Haible
Replacements of *printf, because of
[...]
checking whether printf survives out-of-memory conditions... no
This was caused by the pointer-arithmetic overflow bug I just fixed in
git. It should no longer fail, and never failed before except in i386
binaries running on x86_64 kernels.
Post by Bruno Haible
Replacement of duplocale, because of
checking whether duplocale(LC_GLOBAL_LOCALE) works... no
POSIX does not specify any use of LC_GLOBAL_LOCALE except as an
argument to uselocale. Is there a reason it's needed? Perhaps more
importantly, is the replacement when libc doesn't provide this
functionality bloated/painful?
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
Seems to have been fixed in gnulib.
Post by Bruno Haible
Replacement of getcwd, because of
checking whether getcwd handles long file names properly... no, but it is partly working
checking whether getcwd aborts when 4k < cwd_length < 16k... no
Still unclear what the cause of these failures is. Anyone else looked
into them, or do I still need to?
Post by Bruno Haible
Replacement of iconv and iconv_open, because of
checking whether iconv supports conversion between UTF-8 and UTF-{16,32}{BE,LE}... no
I fixed all the UTF-16-related bugs that were breaking this test. It
should pass now.
Post by Bruno Haible
Replacement of mktime, because of
checking for working mktime... no
Replacement of perror, because of
checking whether perror matches strerror... no
Replacement of popen, because of
checking whether popen works with closed stdin... no
Replacement of regex, because of
checking for working re_compile_pattern... no
Replacement of strtod, because of
checking whether strtod obeys C99... no
For each of the replacements, first look at the test program's results
(in config.log), then look at the test program's source code (in m4/*.m4).
test-duplocale.c:70: assertion failed
FAIL: test-duplocale
test-fcntl.c:382: assertion failed
FAIL: test-fcntl
This is caused by the fact that the F_GETOWN fcntl on Linux is broken;
there's no way to distinguish error returns from non-error negative
return values. So we never set errno when calling F_GETOWN and assume
the return value is not an error. There's a new-ish Linux-specific
F_GETOWN_EX we could use when it's available, but the fallback code
would still fail just like it does now, because it's a fundamental
limitation in the API.
Post by Bruno Haible
test-fdatasync.c:50: assertion failed
FAIL: test-fdatasync
This function was dummied-out for some reason. Fixed.
Post by Bruno Haible
test-fsync.c:50: assertion failed
FAIL: test-fsync
Same.
Post by Bruno Haible
test-fwrite.c:53: assertion failed
FAIL: test-fwrite
This seems like it might be a real bug. On musl, unbuffered files
actually have a one-byte buffer, but on writing, the buffer is
supposed to be flushed as soon as it fills, rather than waiting for
another write when it's full. I'll have to run some tests...
Post by Bruno Haible
test-getlogin_r.c:88: assertion failed
FAIL: test-getlogin_r
This was broken; it should be fixed now.
Post by Bruno Haible
test-grantpt.c:34: assertion failed
FAIL: test-grantpt
This is an invalid test. POSIX specifies this function "may fail", not
"shall fail", and since the function is inherently a no-op, it would
be idiotic to make it perform a syscall to check the validity of the
file descriptor...
Post by Bruno Haible
test-localeconv.c:41: assertion failed
FAIL: test-localeconv
Fixed lots of issues; not sure if it works now.
Post by Bruno Haible
Segmentation fault
FAIL: test-localename
This might be due to our incomplete locale implementation, or because
the test uses locale names that don't exist. I doubt it should
segfault though. I'll look into this one later.
Post by Bruno Haible
test-ptsname_r.c:118: assertion failed
FAIL: test-ptsname_r
It's testing that ptsname_r both sets errno and returns the error
code, and that they're the same. Since this function is nonstandard,
there's no spec for it, so perhaps this is desirable; I was assuming
it should return -1 on failure.
Post by Bruno Haible
test-strerror_r.c:118: assertion failed
FAIL: test-strerror_r
This test is looking for a null terminator at the n-1 position of the
buffer if strerror_r fails with ERANGE (buffer too small). I don't see
anywhere the function is specified to write to the buffer AT ALL on
failure, so this test seems invalid.
Post by Bruno Haible
test-wcwidth.c:71: assertion failed
FAIL: test-wcwidth
It's checking for wcwidth(0x3000)==2. This definitely used to work,
but it might have been broken when I overhauled wcwidth. I'll look
into it..
Post by Bruno Haible
When I compile all of gnulib, I also get a compilation error
fsusage.c:222:17: error: storage size of 'fsd' isn't known
fsusage.c:224:3: warning: implicit declaration of function 'statfs' [-Wimplicit-function-declaration]
fsusage.c:222:17: warning: unused variable 'fsd' [-Wunused-variable]
make[4]: *** [fsusage.o] Error 1
This looks like a gnulib problem. On musl, statvfs should get used,
and this code should not even be compiled... Judging from the
comments, it looks like a hard-coded workaround for broken glibc
and/or Linux versions, but the header include seems to be missing in
the workaround case...

Rich
Eric Blake
2012-06-20 04:10:11 UTC
Permalink
Post by Rich Felker
Post by Bruno Haible
Replacement of duplocale, because of
checking whether duplocale(LC_GLOBAL_LOCALE) works... no
POSIX does not specify any use of LC_GLOBAL_LOCALE except as an
argument to uselocale. Is there a reason it's needed? Perhaps more
importantly, is the replacement when libc doesn't provide this
functionality bloated/painful?
Unfortunately, you are out of date. POSIX _does_ require
duplocale(LC_GLOBAL_LOCALE) to work:

http://austingroupbugs.net/view.php?id=301


If the locobj argument is LC_GLOBAL_LOCALE, duplocale() shall
create a new locale object containing a copy of the global locale
determined by the setlocale() function.

The behavior is undefined if the locobj argument is not a valid
locale object handle.

After line 24978 add a new paragraph to APPLICATION USAGE:

The duplocale() function can also be used in conjunction with
uselocale((locale_t)0). This returns the locale in effect for
the calling thread, but can have the value LC_GLOBAL_LOCALE.
Passing LC_GLOBAL_LOCALE to functions such as isalnum_l()
results in undefined behavior, but applications can convert
it into a usable locale object by using duplocale().
Post by Rich Felker
Post by Bruno Haible
test-fcntl.c:382: assertion failed
FAIL: test-fcntl
This is caused by the fact that the F_GETOWN fcntl on Linux is broken;
there's no way to distinguish error returns from non-error negative
return values. So we never set errno when calling F_GETOWN and assume
the return value is not an error. There's a new-ish Linux-specific
F_GETOWN_EX we could use when it's available, but the fallback code
would still fail just like it does now, because it's a fundamental
limitation in the API.
Yes, Linux 2.6.32 introduced F_GETOWN_EX for precisely this reason, and
you should be using it.
Post by Rich Felker
Post by Bruno Haible
test-grantpt.c:34: assertion failed
FAIL: test-grantpt
This is an invalid test. POSIX specifies this function "may fail", not
"shall fail", and since the function is inherently a no-op, it would
be idiotic to make it perform a syscall to check the validity of the
file descriptor...
This is one of the cases where gnulib prefers to emulate the shall fail
semantics of glibc, as they are more useful to program around.
Post by Rich Felker
Post by Bruno Haible
test-ptsname_r.c:118: assertion failed
FAIL: test-ptsname_r
It's testing that ptsname_r both sets errno and returns the error
code, and that they're the same. Since this function is nonstandard,
there's no spec for it, so perhaps this is desirable; I was assuming
it should return -1 on failure.
There _is_ a proposed standard for it now:

http://austingroupbugs.net/view.php?id=508

which requires only the return value to be 0 or an errno value, and not
that errno be set. gnulib should only be checking for a valid return value.
Post by Rich Felker
Post by Bruno Haible
test-strerror_r.c:118: assertion failed
FAIL: test-strerror_r
This test is looking for a null terminator at the n-1 position of the
buffer if strerror_r fails with ERANGE (buffer too small). I don't see
anywhere the function is specified to write to the buffer AT ALL on
failure, so this test seems invalid.
This is a case where POSIX is rather weak, but where quality of
implementation demands that the most useful interface is one that
provides the most information back to the user. glibc had a number of
bugs that were fixed in this area to improve QoI, and gnulib now prefers
to rely on those improvements.
--
Eric Blake ***@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Rich Felker
2012-06-20 13:27:00 UTC
Permalink
Post by Eric Blake
Unfortunately, you are out of date. POSIX _does_ require
http://austingroupbugs.net/view.php?id=301
OK. I'll add support. For now all it requires is avoiding
dereferencing the pointer, anyway.
Post by Eric Blake
Yes, Linux 2.6.32 introduced F_GETOWN_EX for precisely this reason, and
you should be using it.
When I wrote that code, 2.6.32 was new enough, and the issue seemed
minor enough, that I didn't bother. Now I agree it would be nice
though.
Post by Eric Blake
Post by Rich Felker
Post by Bruno Haible
test-grantpt.c:34: assertion failed
FAIL: test-grantpt
This is an invalid test. POSIX specifies this function "may fail", not
"shall fail", and since the function is inherently a no-op, it would
be idiotic to make it perform a syscall to check the validity of the
file descriptor...
This is one of the cases where gnulib prefers to emulate the shall fail
semantics of glibc, as they are more useful to program around.
I don't see how it's nicer. All it does is make pty acquisition
slightly slower (one extra useless syscall). The only time you would
call grantpt without knowing that the fd is valid is right after
calling posix_openpt without checking the return value, and in that
case, it seems unlikely that you'd check the return value of grantpt.
And last time I asked, I remember being told that gnulib does not
intend to facilitate this sort of lazy programming anyway.

In any case, if you are relying on lazy error checking like that,
unlockpt will already report the error...
Post by Eric Blake
Post by Rich Felker
Post by Bruno Haible
test-ptsname_r.c:118: assertion failed
FAIL: test-ptsname_r
It's testing that ptsname_r both sets errno and returns the error
code, and that they're the same. Since this function is nonstandard,
there's no spec for it, so perhaps this is desirable; I was assuming
it should return -1 on failure.
http://austingroupbugs.net/view.php?id=508
which requires only the return value to be 0 or an errno value, and not
that errno be set. gnulib should only be checking for a valid return value.
Okay, I'll update it to match this.

I wish they'd just standardized the superior BSD openpty function
instead...
Post by Eric Blake
Post by Rich Felker
Post by Bruno Haible
test-strerror_r.c:118: assertion failed
FAIL: test-strerror_r
This test is looking for a null terminator at the n-1 position of the
buffer if strerror_r fails with ERANGE (buffer too small). I don't see
anywhere the function is specified to write to the buffer AT ALL on
failure, so this test seems invalid.
This is a case where POSIX is rather weak, but where quality of
implementation demands that the most useful interface is one that
provides the most information back to the user. glibc had a number of
bugs that were fixed in this area to improve QoI, and gnulib now prefers
to rely on those improvements.
I don't see anything which forbids it from writing in this case, so I
suppose I could change it.

Rich
Bruno Haible
2012-06-22 10:39:33 UTC
Permalink
Post by Rich Felker
Post by Bruno Haible
test-grantpt.c:34: assertion failed
FAIL: test-grantpt
This is an invalid test. POSIX specifies this function "may fail", not
"shall fail", and since the function is inherently a no-op, it would
be idiotic to make it perform a syscall to check the validity of the
file descriptor...
Looking at the (few) callers of grantpt() in gnulib, it indeed seems
unlikely that people will want to rely on the failure for invalid file
descriptors. So I'm relaxing the requirements of gnulib.


2012-06-22 Bruno Haible <***@clisp.org>

grantpt: Relax requirement regarding invalid file descriptors.
* lib/grantpt.c: Don't include <fcntl.h>.
(grantpt): Don't verify the validity of the file descriptor.
* modules/grantpt (Depends-on): Remove fcntl-h.
* tests/test-grantpt.c (main): Allow grantpt to succeed for invalid
file descriptors.
* doc/posix-functions/grantpt.texi: Document more platforms on which
grantpt succeeds for invalid file descriptors.
Reported by Rich Felker <***@aerifal.cx>.

--- doc/posix-functions/grantpt.texi.orig Fri Jun 22 12:33:55 2012
+++ doc/posix-functions/grantpt.texi Fri Jun 22 12:33:52 2012
@@ -20,5 +20,5 @@
IRIX 5.3.
@item
This function reports success for invalid file descriptors on some platforms:
-Cygwin 1.7.9.
+OpenBSD, Cygwin 1.7.9, musl libc.
@end itemize
--- lib/grantpt.c.orig Fri Jun 22 12:33:55 2012
+++ lib/grantpt.c Fri Jun 22 12:14:57 2012
@@ -21,7 +21,6 @@

#include <assert.h>
#include <errno.h>
-#include <fcntl.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>
@@ -50,8 +49,6 @@
#if defined __OpenBSD__
/* On OpenBSD, master and slave of a pseudo-terminal are allocated together,
through an ioctl on /dev/ptm. There is no need for grantpt(). */
- if (fcntl (fd, F_GETFD) < 0)
- return -1;
return 0;
#else
/* This function is most often called from a process without 'root'
--- modules/grantpt.orig Fri Jun 22 12:33:55 2012
+++ modules/grantpt Fri Jun 22 12:15:14 2012
@@ -9,7 +9,6 @@
Depends-on:
stdlib
extensions
-fcntl-h [test $HAVE_GRANTPT = 0]
pt_chown [test $HAVE_GRANTPT = 0]
waitpid [test $HAVE_GRANTPT = 0]
configmake [test $HAVE_GRANTPT = 0]
--- tests/test-grantpt.c.orig Fri Jun 22 12:33:55 2012
+++ tests/test-grantpt.c Fri Jun 22 12:14:31 2012
@@ -28,22 +28,36 @@
int
main (void)
{
- /* Test behaviour for invalid file descriptors. */
+ /* Test behaviour for invalid file descriptors.
+ These calls don't fail on OpenBSD (with gnulib's replacement) and on
+ musl libc. */
{
+ int ret;
+
errno = 0;
- ASSERT (grantpt (-1) == -1);
- ASSERT (errno == EBADF
- || errno == EINVAL /* seen on FreeBSD 6.4 */
- || errno == 0 /* seen on Solaris 8 */
- );
+ ret = grantpt (-1);
+ if (ret != 0)
+ {
+ ASSERT (ret == -1);
+ ASSERT (errno == EBADF
+ || errno == EINVAL /* seen on FreeBSD 6.4 */
+ || errno == 0 /* seen on Solaris 8 */
+ );
+ }
}
{
+ int ret;
+
errno = 0;
- ASSERT (grantpt (99) == -1);
- ASSERT (errno == EBADF
- || errno == EINVAL /* seen on FreeBSD 6.4 */
- || errno == 0 /* seen on Solaris 8 */
- );
+ ret = grantpt (99);
+ if (ret != 0)
+ {
+ ASSERT (ret == -1);
+ ASSERT (errno == EBADF
+ || errno == EINVAL /* seen on FreeBSD 6.4 */
+ || errno == 0 /* seen on Solaris 8 */
+ );
+ }
}

return 0;
Pádraig Brady
2012-07-02 22:33:14 UTC
Permalink
Post by Rich Felker
Some more updates..
Post by Bruno Haible
When I compile all of gnulib, I also get a compilation error
fsusage.c:222:17: error: storage size of 'fsd' isn't known
fsusage.c:224:3: warning: implicit declaration of function 'statfs' [-Wimplicit-function-declaration]
fsusage.c:222:17: warning: unused variable 'fsd' [-Wunused-variable]
make[4]: *** [fsusage.o] Error 1
This looks like a gnulib problem. On musl, statvfs should get used,
and this code should not even be compiled... Judging from the
comments, it looks like a hard-coded workaround for broken glibc
and/or Linux versions, but the header include seems to be missing in
the workaround case...
This is addressed in different ways in these two commits:

http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=commit;h=defe5737
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=commit;h=2ab2617e

cheers,
Pádraig.

Rich Felker
2012-06-20 19:28:02 UTC
Permalink
Post by Bruno Haible
[CCing the musl list]
Isaac Dunham wrote in
Post by Isaac Dunham
musl is designed for standards conformance,
There is a recipe, in <http://sourceware.org/glibc/wiki/Testing/Gnulib>,
that explains how to use gnulib to check a libc against bugs. When I apply
Replacements of *printf, because of
checking whether printf supports infinite 'long double' arguments... no
Fixed. (Not really a bug, but fixed anyway.)
Post by Bruno Haible
checking whether printf supports the 'ls' directive... no
Previously fixed.
Post by Bruno Haible
checking whether printf survives out-of-memory conditions... no
Fixed.
Post by Bruno Haible
Replacement of duplocale, because of
checking whether duplocale(LC_GLOBAL_LOCALE) works... no
Fixed.
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
Not a bug. I believe this was fixed in gnulib.
Post by Bruno Haible
Replacement of futimens, because of
checking whether futimens works... no
Not a bug; just confusing message.
Post by Bruno Haible
Replacement of getcwd, because of
checking whether getcwd handles long file names properly... no, but it is partly working
checking whether getcwd aborts when 4k < cwd_length < 16k... no
Still open; probably not a bug.
Post by Bruno Haible
Replacement of getopt, because of
checking whether getopt is POSIX compatible... no
Not a bug.
Post by Bruno Haible
Replacement of glob, because of
checking for GNU glob interface version 1... no
(not sure this is a bug or just an incompatibility compared to glibc)
Not supported.
Post by Bruno Haible
Replacement of iconv and iconv_open, because of
checking whether iconv supports conversion between UTF-8 and UTF-{16,32}{BE,LE}... no
Fixed.
Post by Bruno Haible
Replacement of mktime, because of
checking for working mktime... no
Still open.
Post by Bruno Haible
Replacement of perror, because of
checking whether perror matches strerror... no
Fixed.
Post by Bruno Haible
Replacement of popen, because of
checking whether popen works with closed stdin... no
Fixed.
Post by Bruno Haible
Replacement of regex, because of
checking for working re_compile_pattern... no
Not supported.
Post by Bruno Haible
Replacement of strtod, because of
checking whether strtod obeys C99... no
Previously fixed.
Post by Bruno Haible
test-duplocale.c:70: assertion failed
FAIL: test-duplocale
Fixed.
Post by Bruno Haible
test-fcntl.c:382: assertion failed
FAIL: test-fcntl
Pending; intend to fix.
Post by Bruno Haible
test-fdatasync.c:50: assertion failed
FAIL: test-fdatasync
Fixed.
Post by Bruno Haible
test-fma2.h:116: assertion failed
FAIL: test-fma2
Unknown. Asking nsz..
Post by Bruno Haible
test-fsync.c:50: assertion failed
FAIL: test-fsync
Fixed.
Post by Bruno Haible
test-fwrite.c:53: assertion failed
FAIL: test-fwrite
Fixed.
Post by Bruno Haible
test-getlogin_r.c:88: assertion failed
FAIL: test-getlogin_r
Fixed.
Post by Bruno Haible
test-grantpt.c:34: assertion failed
FAIL: test-grantpt
Buggy/useless test.
Post by Bruno Haible
test-localeconv.c:41: assertion failed
FAIL: test-localeconv
Fixed.
Post by Bruno Haible
Segmentation fault
FAIL: test-localename
Still open.
Post by Bruno Haible
test-ptsname_r.c:118: assertion failed
FAIL: test-ptsname_r
Fixed.
Post by Bruno Haible
test-strerror_r.c:118: assertion failed
FAIL: test-strerror_r
Fixed.
Post by Bruno Haible
test-wcwidth.c:71: assertion failed
FAIL: test-wcwidth
Fixed.
Post by Bruno Haible
When I compile all of gnulib, I also get a compilation error
fsusage.c:222:17: error: storage size of 'fsd' isn't known
fsusage.c:224:3: warning: implicit declaration of function 'statfs' [-Wimplicit-function-declaration]
fsusage.c:222:17: warning: unused variable 'fsd' [-Wunused-variable]
make[4]: *** [fsusage.o] Error 1
OK, this is valid fallback code for when statvfs fails, but the
headers required for it have not been included.

Basically the only still-open issues are getcwd, mktime, fma,
localename, so I'll avoid future spam by just addressing them. I've
kept all the Cc's so far, but if this is getting OT for gnulib folks,
I'll be happy to drop the Cc. Just let me know.

Rich
Rich Felker
2012-06-21 02:21:16 UTC
Permalink
Post by Rich Felker
Post by Bruno Haible
Replacement of getcwd, because of
checking whether getcwd handles long file names properly... no, but it is partly working
This test is failing because musl uses the kernel to resolve the
current directory name, and the kernel does not support pathnames
longer than PATH_MAX. For some reason, the test only considers this an
error if AT_FDCWD is defined.

Does gnulib aim to provide a getcwd that always works regardless of
path depth? If so, replacing getcwd is the right action for gnulib on
musl.
Post by Rich Felker
Post by Bruno Haible
checking whether getcwd aborts when 4k < cwd_length < 16k... no
No is the correct result here. This test is looking for a bug that
only exists on some archs with large page sizes (>4k), and no means it
did not find the bug.
Post by Rich Felker
Post by Bruno Haible
Replacement of mktime, because of
checking for working mktime... no
This test is buggy; it goes into an infinite loop due to integer
overflow UB, because the condition to break out of the loop is only
checked when the test does not fail:

for (j = 1; ; j <<= 1)
if (! bigtime_test (j))
result |= 4;
else if (INT_MAX / 2 < j)
break;

However this does indicate a bug in musl. The relevant code is very
old and I suspect it's not checking for integer overflows at all, just
generating huge time_t values that get truncated rather than mapped to
(time_t)-1.

Both need to be fixed.
Post by Rich Felker
Post by Bruno Haible
test-fcntl.c:382: assertion failed
FAIL: test-fcntl
Pending; intend to fix.
Fixed.
Post by Rich Felker
Post by Bruno Haible
test-fma2.h:116: assertion failed
FAIL: test-fma2
Unknown. Asking nsz..
Fixed by nsz. :-)

Rich
Paul Eggert
2012-06-21 08:52:00 UTC
Permalink
Post by Rich Felker
Post by Bruno Haible
Replacement of mktime, because of
checking for working mktime... no
This test is buggy; it goes into an infinite loop due to integer
overflow UB, because the condition to break out of the loop is only
Thanks, I pushed the following patch into gnulib:

---
ChangeLog | 9 +++++++++
m4/mktime.m4 | 25 ++++++++++++++-----------
2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 199b06c..1661a62 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2012-06-21 Paul Eggert <***@cs.ucla.edu>
+
+ mktime: fix integer overflow in 'configure'-time test
+ * m4/mktime.m4 (gl_FUNC_MKTIME): Do not rely on undefined behavior
+ after integer overflow. Problem reported by Rich Felker in
+ <http://lists.gnu.org/archive/html/bug-gnulib/2012-06/msg00257.html>.
+ Also, don't look for further instances of a bug if we've already
+ found one instance; this helps 'configure' run faster.
+
2012-06-20 John Darrington <***@darrington.wattle.id.au> (tiny change)

tmpfile, clean-temp: Fix invocation of GetVersionEx.
diff --git a/m4/mktime.m4 b/m4/mktime.m4
index 5e05dfa..14fcf7f 100644
--- a/m4/mktime.m4
+++ b/m4/mktime.m4
@@ -1,4 +1,4 @@
-# serial 21
+# serial 22
dnl Copyright (C) 2002-2003, 2005-2007, 2009-2012 Free Software Foundation,
dnl Inc.
dnl This file is free software; the Free Software Foundation
@@ -192,20 +192,23 @@ main ()
if (tz_strings[i])
putenv (tz_strings[i]);

- for (t = 0; t <= time_t_max - delta; t += delta)
+ for (t = 0; t <= time_t_max - delta && (result & 1) == 0; t += delta)
if (! mktime_test (t))
result |= 1;
- if (! (mktime_test ((time_t) 1)
- && mktime_test ((time_t) (60 * 60))
- && mktime_test ((time_t) (60 * 60 * 24))))
+ if ((result & 2) == 0
+ && ! (mktime_test ((time_t) 1)
+ && mktime_test ((time_t) (60 * 60))
+ && mktime_test ((time_t) (60 * 60 * 24))))
result |= 2;

- for (j = 1; ; j <<= 1)
- if (! bigtime_test (j))
- result |= 4;
- else if (INT_MAX / 2 < j)
- break;
- if (! bigtime_test (INT_MAX))
+ for (j = 1; (result & 4) == 0; j <<= 1)
+ {
+ if (! bigtime_test (j))
+ result |= 4;
+ if (INT_MAX / 2 < j)
+ break;
+ }
+ if ((result & 8) == 0 && ! bigtime_test (INT_MAX))
result |= 8;
}
if (! irix_6_4_bug ())
--
1.7.6.5
i***@lavabit.com
2012-06-18 00:05:23 UTC
Permalink
Post by Bruno Haible
[CCing the musl list]
Isaac Dunham wrote in
Post by Isaac Dunham
musl is designed for standards conformance,
There is a recipe, in <http://sourceware.org/glibc/wiki/Testing/Gnulib>,
that explains how to use gnulib to check a libc against bugs.
Be warned: a bad test can cause failures as well.
It's been one of the musl developers' complaints about gnulib that the
tests are buggy and frequently check for glibc behavior instead of
standard behavior.
Post by Bruno Haible
Replacements of *printf, because of
checking whether printf supports infinite 'long double' arguments... no
checking whether printf supports the 'ls' directive... no
checking whether printf survives out-of-memory conditions... no
At least one of these (infinite long double, IIRC) is invalid or a test
for a GNU-ism. This was previously discussed on the musl ML. OOM behavior
is undefined AFAICT (feel free to point out a standard), and the scenario
is a lot less likely with musl than glibc for several reasons.
Post by Bruno Haible
Replacement of duplocale, because of
checking whether duplocale(LC_GLOBAL_LOCALE) works... no
Need to check this one
Post by Bruno Haible
Replacement of fdopen, because of
checking whether fdopen sets errno... no
I presume this is nonconformance to POSIX ("otherwise, a null pointer
shall be returned and errno set...")?
Post by Bruno Haible
Replacement of futimens, because of
checking whether futimens works... no
Could be a bug.
Post by Bruno Haible
Replacement of getcwd, because of
checking whether getcwd handles long file names properly... no, but it
is partly working
Is this a test for ERANGE handling (error on name >= size)? Other than
that, I see no specification covering this.
Post by Bruno Haible
checking whether getcwd aborts when 4k < cwd_length < 16k... no
AFAICT, only required to error when size =< cwd_length. If size !<
(cwd_length + 1), that is conformant behavior. (See man 3posix getcwd)
Post by Bruno Haible
Replacement of getopt, because of
checking whether getopt is POSIX compatible... no
We'd need to see this test...(will look later).
Post by Bruno Haible
Replacement of glob, because of
checking for GNU glob interface version 1... no
(not sure this is a bug or just an incompatibility compared to glibc)
Looks like an incompatibility, since it specifies "GNU interface"...
Post by Bruno Haible
Replacement of iconv and iconv_open, because of
checking whether iconv supports conversion between UTF-8 and
UTF-{16,32}{BE,LE}... no
Not "nonconformant" from the standpoint of POSIX, AFAICT, but it is
incomplete. musl is UTF8 native, but I don't think it supports UTF16/UTF32
yet.
Post by Bruno Haible
Replacement of mktime, because of
checking for working mktime... no
Replacement of perror, because of
checking whether perror matches strerror... no
Replacement of popen, because of
checking whether popen works with closed stdin... no
Look like bugs, if the description is correct.
Post by Bruno Haible
Replacement of regex, because of
checking for working re_compile_pattern... no
This is #ifdef __USE_GNU
I'm not aware of any standard covering GNU APIs...
Post by Bruno Haible
Replacement of strtod, because of
checking whether strtod obeys C99... no
For each of the replacements, first look at the test program's results
(in config.log), then look at the test program's source code (in m4/*.m4).
Thanks,
Isaac Dunham
Bruno Haible
2012-06-18 23:28:24 UTC
Permalink
Rich Felker wrote in
I've already discussed on this list why using an atexit function for
closing stdin or stdout is the heart of the problem and the source of
all the complexity. If the main program just closed the stream at the
natural point in the normal program flow, it would be trivial to do
correctly and portably.
Here's how this 'closein' module came about:

Jim, maintainer of coreutils (a set of ca. 100 programs), noticed that
the programs failed to report an error when writing to a full disk. Not only
$ cat file1 file2 > /full-partition/output
but also
$ id -u > /full-partition/output
or
$ sort --help > /full-partition/output

The problem is that not only the "normal program flow" needs to be
considered, but all program flows from the start of main() to exit().
He could have changed the source code of all 100 programs so that this
bug would be fixed, but that would not give a guarantee that the bug would
not be reintroduced as new code branches are added in existing programs,
or as new programs are being written. So he searched for a solution that
would prevent the bug from reappearing and also not increase the maintenance
burden, and came up with 'closeout'.

Jim made a presentation about this:
http://www.gnu.org/ghm/2011/paris/#sec-2-1
http://www.irill.org/events/ghm-gnu-hackers-meeting/videos/jim-meyering-goodbye-world-the-perils-of-relying-on-output-streams-in-c

'closein' is similar - an attempt to fix an issue that affects many programs,
once and for all. By Eric Blake.

Bruno
Rich Felker
2012-06-19 00:01:54 UTC
Permalink
Post by Bruno Haible
Rich Felker wrote in
I've already discussed on this list why using an atexit function for
closing stdin or stdout is the heart of the problem and the source of
all the complexity. If the main program just closed the stream at the
natural point in the normal program flow, it would be trivial to do
correctly and portably.
Jim, maintainer of coreutils (a set of ca. 100 programs), noticed that
the programs failed to report an error when writing to a full disk. Not only
$ cat file1 file2 > /full-partition/output
but also
$ id -u > /full-partition/output
or
$ sort --help > /full-partition/output
The problem is that not only the "normal program flow" needs to be
considered, but all program flows from the start of main() to exit().
He could have changed the source code of all 100 programs so that this
bug would be fixed, but that would not give a guarantee that the bug would
not be reintroduced as new code branches are added in existing programs,
or as new programs are being written. So he searched for a solution that
would prevent the bug from reappearing and also not increase the maintenance
burden, and came up with 'closeout'.
http://www.gnu.org/ghm/2011/paris/#sec-2-1
http://www.irill.org/events/ghm-gnu-hackers-meeting/videos/jim-meyering-goodbye-world-the-perils-of-relying-on-output-streams-in-c
Yes, I've seen it; it was discussed on our list. I wasn't aware of the
specific historic details, but I figured this was probably the general
story for how the idea came to be.

If the "closeout" approach works best for coreutils, that's the
business of the coreutils' maintainers, not my business. However, as I
discussed on the musl list, I think it's bad design, and I would
highly recommend other projects not follow it. Conceptually, you're
turning something that's a local variable (stdout is global, but if
it's only used from one point as a generic FILE standing in for
something that would otherwise have been obtained by fopen, it's
conceptually local) into a global, and thereby losing the _local_
information of whether it was used in the first place, which has to be
recovered with the non-portable __fpending.

If on the other hand programs just handle stdout as "yet another
FILE", the same code that checks for write errors and reports failure
for explicitly-opened files would also check and report write errors
on stdout. It's no longer a special case. And special cases are where
errors like to hide.
Post by Bruno Haible
'closein' is similar - an attempt to fix an issue that affects many programs,
once and for all. By Eric Blake.
I think closein is just a no-op for conformant implementations. exit
implicitly closes all streams, including stdin, and per POSIX, fclose
has the following effect:

If the file is not already at EOF, and the file is one capable of
seeking, the file offset of the underlying open file description
shall be adjusted so that the next operation on the open file
description deals with the byte after the last one read from or
written to the stream being closed.

As such, close_stdin's attempt to fix up the file position seems to be
redundant.
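The quoted fclose requirement can be checked with a small sketch: read one byte through stdio (which reads ahead into its buffer), close the stream, and inspect the offset of the shared open file description. On a conforming implementation the offset ends up at 1; per this thread, not every implementation gets it right, so a strict test may fail:

```c
/* Demonstrate the POSIX fclose repositioning rule on a seekable
   input stream.  Returns the offset seen by a dup'd descriptor
   after the stream is closed, or -1 on open failure. */
#include <stdio.h>
#include <unistd.h>

long offset_after_one_byte (const char *path)
{
  FILE *f = fopen (path, "r");
  if (f == NULL)
    return -1;
  int fd2 = dup (fileno (f));   /* shares the open file description */
  getc (f);                     /* consume 1 byte; stdio buffers more */
  fclose (f);                   /* should seek the description back to 1 */
  long pos = lseek (fd2, 0, SEEK_CUR);
  close (fd2);
  return pos;
}
```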

Incidentally, I suspect musl is _not_ currently handling this case
correctly. Does gnulib have some tests that assert the required
behavior, which I could use to test the current behavior and any
efforts to fix it if it's wrong?

(If it is broken in musl, it's due to stdin being a special case in
musl. Alas, special cases are where errors like to hide...)

Rich
Bruno Haible
2012-06-19 00:29:29 UTC
Permalink
... Conceptually, you're
turning something that's a local variable ... into a global, and
thereby losing the _local_
information of whether it was used in the first place, which has to be
recovered with the non-portable __fpending.
Correct. We do this because it would be too tedious to keep track,
in local variables outside of the stream, whether it was "used in the
first place".
the same code that checks for write errors and reports failure
for explicitly-opened files would also check and report write errors
on stdout.
The code has been generalized: There are two (quite similar) modules
'close-stream' and 'fwriteerror'.
Post by Bruno Haible
'closein' is similar - an attempt to fix an issue that affects many programs,
once and for all. By Eric Blake.
I think closein is just a no-op for conformant implementations. exit
implicitly closes all streams, including stdin, and per POSIX, fclose
If the file is not already at EOF, and the file is one capable of
seeking, the file offset of the underlying open file description
shall be adjusted so that the next operation on the open file
description deals with the byte after the last one read from or
written to the stream being closed.
As such, close_stdin's attempt to fix up the file position seems to be
redundant.
Incidentally, I suspect musl is _not_ currently handling this case
correctly.
And glibc is not handling it correctly either:
<http://sourceware.org/bugzilla/show_bug.cgi?id=12724>
Which is why 'closein' is needed in gnulib.
Does gnulib have some tests that assert the required
behavior, which I could use to test the current behavior and any
efforts to fix it if it's wrong?
Not in gnulib. But you find two test programs attached in the bug report
from Eric, cited above.

Bruno
Eric Blake
2012-06-19 02:17:25 UTC
Permalink
Post by Rich Felker
If the "closeout" approach works best for coreutils, that's the
business of the coreutils' maintainers, not my business. However, as I
discussed on the musl list, I think it's bad design, and I would
highly recommend other projects not follow it.
And that's where I disagree - the POSIX folks _specifically_ recommend
the closeout approach of an atexit() handler:

http://austingroupbugs.net/view.php?id=555

"Since after the call to fclose() any use of stream results in undefined
behavior, fclose() should not be used on stdin, stdout, or stderr except
immediately before process termination (see XBD 3.297 on page 81), so as
to avoid triggering undefined behavior in other standard interfaces that
rely on these streams. If there are any atexit() handlers registered by
the application, such a call to fclose() should not occur until the last
handler is finishing. Once fclose() has been used to close stdin,
stdout, or stderr, there is no standard way to reopen any of these streams."
Post by Rich Felker
Conceptually, you're
turning something that's a local variable (stdout is global, but if
it's only used from one point as a generic FILE standing in for
something that would otherwise have been obtained by fopen, it's
conceptually local) into a global, and thereby losing the _local_
information of whether it was used in the first place, which has to be
recovered with the non-portable __fpending.
But our argument is that __fpending (well, 'fpending') _should_ be
portable, and I am in the process of proposing it for standardization in
the next version of POSIX because it is so useful.
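What __fpending reports can be shown in a short sketch. Note the header: `<stdio_ext.h>` is a glibc/musl extension, which is precisely the portability gap being debated here; `pending_after_write` is an illustrative helper, not part of any library:

```c
/* __fpending returns the number of bytes sitting in a stream's
   output buffer, not yet handed to the OS.  This is the local
   "was it used?" information Rich says the closeout design loses. */
#include <stdio.h>
#include <stdio_ext.h>   /* __fpending: glibc/musl extension header */

size_t pending_after_write (FILE *f, const char *data, size_t n)
{
  fwrite (data, 1, n, f);
  return __fpending (f);   /* n, assuming n fits in the buffer */
}
```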
Post by Rich Felker
If on the other hand programs just handle stdout as "yet another
FILE", the same code that checks for write errors and reports failure
for explicitly-opened files would also check and report write errors
on stdout. It's not longer a special-case. And special-cases are where
errors like to hide.
Any program that treats stdout as just like any other file risks closing
stdout too early, and then causing undefined behavior in the rest of the
program when some other standard interface accidentally ends up using fd
1. The atexit() handler is the easiest way to guarantee that things are
closed properly on all normal exit paths.
Post by Rich Felker
Post by Bruno Haible
'closein' is similar - an attempt to fix an issue that affects many programs,
once and for all. By Eric Blake.
I think closein is just a no-op for conformant implementations. exit
implicitly closes all streams, including stdin,
But the implicit close by exit(), while properly repositioning stdin (on
working implementations; glibc is broken but I fixed cygwin to be
working), has the drawback of silently eating errors if something went
wrong. If you WANT to detect read errors, and consolidate the error
detection into a single ferror() location rather than littering the rest
of your code with harder-to-maintain checks, then closein is the way to
go. If you don't want to detect read errors in a central location, then
yes, avoiding the use of the closein module should have no effect on a
compliant environment; but it is still necessary to work around broken
environments such as glibc.
Post by Rich Felker
Incidentally, I suspect musl is _not_ currently handling this case
correctly. Does gnulib have some tests that assert the required
behavior, which I could use to test the current behavior and any
efforts to fix it if it's wrong?
tests/test-closein.{c,sh}

are sufficient to test the closein module; if you comment out the
atexit() call, it will be sufficient to demonstrate that glibc leaves
stdin at the wrong file offset on exit(); it is also sufficient to
demonstrate that even with a compliant exit(), the implicit close of
exit() eats errors and does not affect error status, whereas the use of
the closein module _does_ detect read errors at a central location.
--
Eric Blake ***@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Rich Felker
2012-06-19 03:30:51 UTC
Permalink
Post by Eric Blake
Post by Rich Felker
If the "closeout" approach works best for coreutils, that's the
business of the coreutils' maintainers, not my business. However, as I
discussed on the musl list, I think it's bad design, and I would
highly recommend other projects not follow it.
And that's where I disagree - the POSIX folks _specifically_ recommend
Yes, they also recommend invoking extremely serious UB (aliasing
violation, which GCC _will_ miscompile) when using dlsym to obtain a
function pointer...

With that said, they have a good point, but it's also arguable that
random parts of the program should not be using stdin/stdout.
Conceptually, these streams "belong to" the main program flow, and
except in really simple programs, they probably should not be used
explicitly (either by name, or by calling stdio functions that
explicitly use them) except in main or a similar function; other
functions should just get a FILE * from the caller.
Post by Eric Blake
But our argument is that __fpending (well, 'fpending') _should_ be
portable, and I am in the process of proposing it for standardization in
the next version of POSIX because it is so useful.
Are you proposing that it be called __fpending or fpending?
Post by Eric Blake
Any program that treats stdout as just like any other file risks closing
stdout too early, and then causing undefined behavior in the rest of the
program when some other standard interface accidentally ends up using fd
If you're not treating stdout special at all, just as an ordinary
stream like any other, then accessing it after closing it is a major
logic bug in your program equivalent to accessing a FILE* obtained by
fopen after calling fclose on it.

With that said, in order to avoid the situation, the way I would
actually deal with it is with minimal special-casing, something like:

fflush(f);
if (ferror(f)) report_error();
if (f!=stdout) fclose(f);

I'm aware that this does not detect errors from the actual close
syscall, but I'm unconvinced that it's beneficial to do so; after a
successful flush, I consider it the operating system's data loss, not
the application's, if the data fails to end up on permanent storage.
Even if close() returned zero, you could still get this situation with
hardware failures, etc.
Post by Eric Blake
1. The atexit() handler is the easiest way to guarantee that things are
closed properly on all normal exit paths.
But it also runs on failure exit paths, which is probably undesirable
for many programs. Unless you use extra global-var hacks to disable it
from running..
Post by Eric Blake
Post by Rich Felker
Post by Bruno Haible
'closein' is similar - an attempt to fix an issue that affects many programs,
once and for all. By Eric Blake.
I think closein is just a no-op for conformant implementations. exit
implicitly closes all streams, including stdin,
But the implicit close by exit(), while properly repositioning stdin (on
working implementations; glibc is broken but I fixed cygwin to be
working), has the drawback of silently eating errors if something went
wrong. If you WANT to detect read errors, and consolidate the error
detection into a single ferror() location rather than littering the rest
of your code with harder-to-maintain checks, then closein is the way to
Anywhere you read, you could hit EOF, so you already need to be
testing for that. Once you get to the "we can't read anymore" code,
it's trivial to check ferror(f). And again it can be done in general
without any special attention to whether the file is stdin.
Post by Eric Blake
Post by Rich Felker
Incidentally, I suspect musl is _not_ currently handling this case
correctly. Does gnulib have some tests that assert the required
behavior, which I could use to test the current behavior and any
efforts to fix it if it's wrong?
tests/test-closein.{c,sh}
are sufficient to test the closein module; if you comment out the
atexit() call, it will be sufficient to demonstrate that glibc leaves
stdin at the wrong file offset on exit(); it is also sufficient to
OK. I'll see if I can fix this on our side.

Rich
Eric Blake
2012-06-19 12:11:35 UTC
Permalink
Post by Rich Felker
Post by Eric Blake
And that's where I disagree - the POSIX folks _specifically_ recommend
Yes, they also recommend invoking extremely serious UB (aliasing
violation, which GCC _will_ miscompile) when using dlsym to obtain a
function pointer...
POSIX is at liberty to define semantics that are not guaranteed by
C99/C11, and dlsym() is one of those situations where POSIX has indeed
required more from the compiler (including that function pointers can be
cast to void* and back again without ill effects). As written in
http://austingroupbugs.net/view.php?id=74,

Note that conversion from a void * pointer to a function pointer
as in:

fptr = (int (*)(int))dlsym(handle, "my_function");

is not defined by the ISO C Standard. This standard requires
this conversion to work correctly on conforming implementations.

Do you have proof that gcc miscompiles dlsym() when used in the manner
recommended by the latest wording? And if so, have you filed it as a
gcc bug?

By the way, if you think there is a bug in POSIX, please file a defect
report - it is in everyone's best interest to improve the standards,
instead of griping about them.
Post by Rich Felker
With that said, they have a good point, but it's also arguable that
random parts of the program should not be using stdin/stdout.
Conceptually, these streams "belong to" the main program flow, and
except in really simple programs, they probably should not be used
explicitly (either by name, or by calling stdio functions that
explicitly use them) except in main or a similar function; other
functions should just get a FILE * from the caller.
As long as the rest of the program just reads and writes, rather than
closes, the FILE* argument, then that poses no problem for your approach
of isolating the use of the standard streams to the main() part of the
program.
Post by Rich Felker
Post by Eric Blake
But our argument is that __fpending (well, 'fpending') _should_ be
portable, and I am in the process of proposing it for standardization in
the next version of POSIX because it is so useful.
Are you proposing that it be called __fpending or fpending?
The POSIX proposal will be for a function named 'fpending'. But until
it is accepted as part of the standard, my recommendation would be that
libc writers implement it as '__fpending', so as to not pollute
namespace, and so that any minor differences between various libc
initial implementations and the final agreed-on POSIX requirements can
be dealt with as part of adding 'fpending' later. My hope is that the
wording I come up with for the POSIX proposal will accommodate both the
existing __fpending implementations and the usage patterns that gnulib
has encouraged through its fpending module in order to provide valid
performance improvements, and to use those performance improvements as
justification why POSIX should consider the addition to the standard.
--
Eric Blake ***@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Rich Felker
2012-06-19 17:19:54 UTC
Permalink
Post by Eric Blake
Post by Rich Felker
Post by Eric Blake
And that's where I disagree - the POSIX folks _specifically_ recommend
Yes, they also recommend invoking extremely serious UB (aliasing
violation, which GCC _will_ miscompile) when using dlsym to obtain a
function pointer...
POSIX is at liberty to define semantics that are not guaranteed by
C99/C11, and dlsym() is one of those situations where POSIX has indeed
required more from the compiler (including that function pointers can be
cast to void* and back again without ill effects). As written in
http://austingroupbugs.net/view.php?id=74,
Note that conversion from a void * pointer to a function pointer
fptr = (int (*)(int))dlsym(handle, "my_function");
is not defined by the ISO C Standard. This standard requires
this conversion to work correctly on conforming implementations.
Do you have proof that gcc miscompiles dlsym() when used in the manner
recommended by the latest wording? And if so, have you filed it as a
gcc bug?
I'm not talking about this; this is actually the correct way to do it.
But the POSIX documentation for dlsym contains an example:

*(void **)(&fptr) = dlsym(handle, "my_function");

which violates the aliasing rules, and which a compiler cannot support
without throwing away aliasing-related optimizations altogether.
Post by Eric Blake
Post by Rich Felker
Post by Eric Blake
But our argument is that __fpending (well, 'fpending') _should_ be
portable, and I am in the process of proposing it for standardization in
the next version of POSIX because it is so useful.
Are you proposing that it be called __fpending or fpending?
The POSIX proposal will be for a function named 'fpending'. But until
it is accepted as part of the standard, my recommendation would be that
libc writers implement it as '__fpending', so as to not pollute
namespace, and so that any minor differences between various libc
initial implementations and the final agreed-on POSIX requirements can
be dealt with as part of adding 'fpending' later. My hope is that the
Sounds very reasonable/clean.

Rich
Eric Blake
2012-06-19 17:43:57 UTC
Permalink
Post by Rich Felker
Post by Eric Blake
POSIX is at liberty to define semantics that are not guaranteed by
C99/C11, and dlsym() is one of those situations where POSIX has indeed
required more from the compiler (including that function pointers can be
cast to void* and back again without ill effects). As written in
http://austingroupbugs.net/view.php?id=74,
Note that conversion from a void * pointer to a function pointer
fptr = (int (*)(int))dlsym(handle, "my_function");
I'm not talking about this; this is actually the correct way to do it.
*(void **)(&fptr) = dlsym(handle, "my_function");
That documentation was rendered obsolete by
http://austingroupbugs.net/view.php?id=74, which was intentionally
written in part with the aliasing problem in mind. In other words,
POSIX no longer recommends writing *(void**)(&fptr), and you are
complaining about something that has already been fixed. When the POSIX
2008 Technical Corrigendum is released (most likely later this year, as
it is already in balloting), bug 74 is one of the bugs fixed in that
corrigendum.
--
Eric Blake ***@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Paul Eggert
2012-06-19 15:33:10 UTC
Permalink
Post by Rich Felker
after a
successful flush, I consider it the operating system's data loss, not
the application's, if the data fails to end up on permanent storage.
Many operating systems behave that way, alas. This is for
performance reasons. NFS is a classic example, but there
are others.