Discussion:
mbrtowc tests: don't make assumptions about the charset of the C locale
Bruno Haible
2018-02-24 11:02:02 UTC
On Alpine Linux 3.7.0, which uses musl libc, this test fails:

FAIL: test-mbrtowc5.sh
======================

../../gltests/test-mbrtowc.c:106: assertion 'wc == c' failed
Aborted
FAIL test-mbrtowc5.sh (exit status: 134)

The issue is that in the C locale, musl uses the encoding that maps
0x00..0x7F -> U+0000..U+007F
0x80..0xFF -> U+DF80..U+DFFF

By contrast, older platforms naturally used the ISO-8859-1 encoding:
0x00..0x7F -> U+0000..U+007F
0x80..0xFF -> U+0080..U+00FF

This patch fixes the test.
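
To check which of the two mappings a given libc uses, a small probe along these lines may help. It is only a sketch: probe_byte and print_c_locale_mapping are hypothetical helper names, not part of gnulib or of the test suite, and the output depends on the platform.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of gnulib: convert the single byte C with
   mbrtowc in the current locale, starting from the initial conversion
   state.  Stores the result in *PWC and returns mbrtowc's return value.  */
static size_t
probe_byte (unsigned char c, wchar_t *pwc)
{
  mbstate_t state;
  char buf[1];

  memset (&state, 0, sizeof state);
  buf[0] = (char) c;
  return mbrtowc (pwc, buf, 1, &state);
}

/* Print how the C locale maps a few sample bytes on the running libc.  */
static void
print_c_locale_mapping (void)
{
  static const unsigned char samples[] = { 0x41, 0x7F, 0x80, 0xFF };
  size_t i;

  setlocale (LC_ALL, "C");
  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      wchar_t wc = 0;
      size_t ret = probe_byte (samples[i], &wc);

      if (ret == 1)
        printf ("0x%02X -> U+%04lX\n",
                (unsigned int) samples[i], (unsigned long) wc);
      else
        printf ("0x%02X -> not convertible\n", (unsigned int) samples[i]);
    }
}
```

Calling print_c_locale_mapping () from main should show the ASCII bytes mapping to themselves everywhere, while the lines for 0x80 and 0xFF reveal whether the libc follows the ISO-8859-1 style mapping, the musl mapping, or something else.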


2018-02-24 Bruno Haible <***@clisp.org>

mbrtowc tests: Don't make assumptions about the charset the C locale.
* tests/test-mbrtowc.c (main): For bytes >= 0x80, don't assume a
particular mapping in the C locale.

diff --git a/tests/test-mbrtowc.c b/tests/test-mbrtowc.c
index a0b5231..54d52f8 100644
--- a/tests/test-mbrtowc.c
+++ b/tests/test-mbrtowc.c
@@ -103,7 +103,15 @@ main (int argc, char *argv[])
           wc = (wchar_t) 0xBADFACE;
           ret = mbrtowc (&wc, buf, 1, &state);
           ASSERT (ret == 1);
-          ASSERT (wc == c);
+          if (c < 0x80)
+            /* c is an ASCII character. */
+            ASSERT (wc == c);
+          else
+            /* argv[1] starts with '5', that is, we are testing the C or POSIX
+               locale.
+               On most platforms, the bytes 0x80..0xFF map to U+0080..U+00FF.
+               But on musl libc, the bytes 0x80..0xFF map to U+DF80..U+DFFF. */
+            ASSERT (wc == btowc (c));
           ASSERT (mbsinit (&state));
           ret = mbrtowc (NULL, buf, 1, &state);
           ASSERT (ret == 1);
Bernhard Voelker
2018-02-24 22:39:15 UTC
Post by Bruno Haible
FAIL: test-mbrtowc5.sh
======================
../../gltests/test-mbrtowc.c:106: assertion 'wc == c' failed
Aborted
FAIL test-mbrtowc5.sh (exit status: 134)
The issue is that in the C locale, musl uses the encoding that maps
0x00..0x7F -> U+0000..U+007F
0x80..0xFF -> U+DF80..U+DFFF
0x00..0x7F -> U+0000..U+007F
0x80..0xFF -> U+0080..U+00FF
This patch fixes the test.
Now this test fails on GNU/Linux - at least here on openSUSE-Tumbleweed.

FAIL: test-mbrtowc5.sh
======================

test-mbrtowc.c:114: assertion 'wc == btowc (c)' failed
./test-mbrtowc5.sh: line 4: 21847 Aborted (core dumped) LC_ALL=C ./test-mbrtowc${EXEEXT} 5
FAIL test-mbrtowc5.sh (exit status: 134)


(gdb) bt
#0 __GI_raise (sig=***@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007ffff7a546b1 in __GI_abort () at abort.c:79
#2 0x0000000000400a7d in main (argc=<optimized out>, argv=<optimized out>) at test-mbrtowc.c:58
(gdb) p wc
$2 = 128 L'\200'

Do you see the same?

Have a nice day,
Berny
Bruno Haible
2018-02-25 00:59:29 UTC
Post by Bernhard Voelker
Now this test fails on GNU/Linux - at least here on openSUSE-Tumbleweed.
FAIL: test-mbrtowc5.sh
======================
test-mbrtowc.c:114: assertion 'wc == btowc (c)' failed
./test-mbrtowc5.sh: line 4: 21847 Aborted (core dumped) LC_ALL=C ./test-mbrtowc${EXEEXT} 5
FAIL test-mbrtowc5.sh (exit status: 134)
Oops. Thanks for the rapid notice. This patch fixes it. This time, I've
tested it on glibc, musl, Mac OS X, FreeBSD, NetBSD, OpenBSD, AIX, HP-UX,
IRIX, Solaris, Cygwin.


2018-02-24 Bruno Haible <***@clisp.org>

mbrtowc tests: Fix regression on glibc.
Reported by Bernhard Voelker.
* tests/test-mbrtowc.c (main): Fix expected value of wc.

diff --git a/tests/test-mbrtowc.c b/tests/test-mbrtowc.c
index 54d52f8..44da295 100644
--- a/tests/test-mbrtowc.c
+++ b/tests/test-mbrtowc.c
@@ -111,7 +111,7 @@ main (int argc, char *argv[])
                locale.
                On most platforms, the bytes 0x80..0xFF map to U+0080..U+00FF.
                But on musl libc, the bytes 0x80..0xFF map to U+DF80..U+DFFF. */
-            ASSERT (wc == btowc (c));
+            ASSERT (wc == (btowc (c) == WEOF ? c : btowc (c)));
           ASSERT (mbsinit (&state));
           ret = mbrtowc (NULL, buf, 1, &state);
           ASSERT (ret == 1);
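
The expected value in the corrected assertion can be written as a standalone function, which may make the fallback easier to read. This is a sketch only: expected_wide_char is a hypothetical name that does not appear in the test. The gdb session above shows mbrtowc storing 128 (0x80) on glibc, so the comparison with btowc (c) failed there; the fix falls back to the byte value whenever btowc returns WEOF.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper mirroring the corrected assertion: the wide character
   that the test accepts for the single byte C in the C or POSIX locale.  */
static wint_t
expected_wide_char (unsigned char c)
{
  wint_t w = btowc (c);

  /* If btowc reports that the byte is not a valid single-byte character,
     fall back to the byte value itself, as the patched test does.
     Otherwise use the platform's own mapping, which covers musl.  */
  return (w == WEOF ? (wint_t) c : w);
}
```

With this helper, the check in the test would read ASSERT (wc == expected_wide_char (c)). It accepts whatever mapping btowc reports and only assumes the identity mapping when btowc gives no answer.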
Bernhard Voelker
2018-02-25 10:36:18 UTC
Post by Bruno Haible
This patch fixes it.
Thanks for the quick fix. Works for me.

Have a nice day,
Berny