Discussion:
localename on mingw
Bruno Haible
2018-05-03 03:42:17 UTC
Permalink
On a recent mingw (both 32-bit and 64-bit), I'm seeing this test failure:

FAIL: test-localename
=====================

../../tests/test-localename.c:158: assertion 'strcmp (name, "de_DE.UTF-8") == 0' failed

Once this is worked around, there is another one here:

/* Check that gl_locale_name_thread always returns NULL. */
ASSERT (gl_locale_name_thread (LC_CTYPE, "LC_CTYPE") == NULL);

It is due to the localename.c change from 2014-07-15 [1]. While the general
approach (to call GetThreadLocale only as a fallback, because GetThreadLocale
does not consider changes made by 'setlocale') is reasonable, the exact
placement of the new code is odd: Fixing the test failures in the tests
(attached: fix-tests.diff) makes the code even stranger.

[1] https://lists.gnu.org/archive/html/bug-gnulib/2014-07/msg00003.html

The better fix is this one:


2018-05-02 Bruno Haible <***@clisp.org>

localename: Fix test failures on mingw.
* lib/localename.c (gl_locale_name_thread): Remove code specific to
native Windows.
(gl_locale_name_posix): Move code specific to native Windows here.
* tests/test-localename.c (test_locale_name, test_locale_name_posix):
Accept result without charset suffix, as it appears on mingw.

diff --git a/lib/localename.c b/lib/localename.c
index 74c8ee0..f7a7fa5 100644
--- a/lib/localename.c
+++ b/lib/localename.c
@@ -2786,7 +2786,27 @@ gl_locale_name_thread (int category, const char *categoryname)
const char *name = gl_locale_name_thread_unsafe (category, categoryname);
if (name != NULL)
return struniq (name);
-#elif defined WINDOWS_NATIVE
+#endif
+ /* On WINDOWS_NATIVE, don't use GetThreadLocale() here, because when
+ SetThreadLocale has not been called - which is a very frequent case -
+ the value of GetThreadLocale() ignores past calls to 'setlocale'. */
+ return NULL;
+}
+
+/* XPG3 defines the result of 'setlocale (category, NULL)' as:
+ "Directs 'setlocale()' to query 'category' and return the current
+ setting of 'local'."
+ However it does not specify the exact format. Neither do SUSV2 and
+ ISO C 99. So we can use this feature only on selected systems (e.g.
+ those using GNU C Library). */
+#if defined _LIBC || ((defined __GLIBC__ && __GLIBC__ >= 2) && !defined __UCLIBC__)
+# define HAVE_LOCALE_NULL
+#endif
+
+const char *
+gl_locale_name_posix (int category, const char *categoryname)
+{
+#if defined WINDOWS_NATIVE
if (LC_MIN <= category && category <= LC_MAX)
{
char *locname = setlocale (category, NULL);
@@ -2796,8 +2816,8 @@ gl_locale_name_thread (int category, const char *categoryname)
separated list of locales. We need only one, so we take the
one corresponding to LC_CTYPE, as the most important for
character translations. */
- if (strchr (locname, ';'))
- locname = setlocale (LC_CTYPE, NULL);
+ if (category == LC_ALL && strchr (locname, ';'))
+ locname = setlocale (LC_CTYPE, NULL);

/* Convert locale name to LCID. We don't want to use
LocaleNameToLCID because (a) it is only available since Vista,
@@ -2808,22 +2828,6 @@ gl_locale_name_thread (int category, const char *categoryname)
return gl_locale_name_from_win32_LCID (lcid);
}
#endif
- return NULL;
-}
-
-/* XPG3 defines the result of 'setlocale (category, NULL)' as:
- "Directs 'setlocale()' to query 'category' and return the current
- setting of 'local'."
- However it does not specify the exact format. Neither do SUSV2 and
- ISO C 99. So we can use this feature only on selected systems (e.g.
- those using GNU C Library). */
-#if defined _LIBC || ((defined __GLIBC__ && __GLIBC__ >= 2) && !defined __UCLIBC__)
-# define HAVE_LOCALE_NULL
-#endif
-
-const char *
-gl_locale_name_posix (int category, const char *categoryname)
-{
/* Use the POSIX methods of looking to 'LC_ALL', 'LC_xxx', and 'LANG'.
On some systems this can be done by the 'setlocale' function itself. */
#if defined HAVE_SETLOCALE && defined HAVE_LC_MESSAGES && defined HAVE_LOCALE_NULL
diff --git a/tests/test-localename.c b/tests/test-localename.c
index c0952a1..7e866c7 100644
--- a/tests/test-localename.c
+++ b/tests/test-localename.c
@@ -155,7 +155,16 @@ test_locale_name (void)
if (setlocale (LC_ALL, "") != NULL)
{
name = gl_locale_name (LC_CTYPE, "LC_CTYPE");
+#if (defined _WIN32 || defined __WIN32__) && !defined __CYGWIN__
+ /* On native Windows, here,
+ gl_locale_name_thread (LC_CTYPE, "LC_CTYPE")
+ returns NULL and
+ gl_locale_name_posix (LC_CTYPE, "LC_CTYPE")
+ returns either "de_DE" or "de_DE.UTF-8". */
+ ASSERT (strcmp (name, "de_DE") == 0 || strcmp (name, "de_DE.UTF-8") == 0);
+#else
ASSERT (strcmp (name, "de_DE.UTF-8") == 0);
+#endif
name = gl_locale_name (LC_MESSAGES, "LC_MESSAGES");
ASSERT (strcmp (name, "fr_FR.UTF-8") == 0);
}
@@ -575,7 +584,11 @@ test_locale_name_posix (void)
if (setlocale (LC_ALL, "") != NULL)
{
name = gl_locale_name_posix (LC_CTYPE, "LC_CTYPE");
+#if (defined _WIN32 || defined __WIN32__) && !defined __CYGWIN__
+ ASSERT (strcmp (name, "de_DE") == 0 || strcmp (name, "de_DE.UTF-8") == 0);
+#else
ASSERT (strcmp (name, "de_DE.UTF-8") == 0);
+#endif
name = gl_locale_name_posix (LC_MESSAGES, "LC_MESSAGES");
ASSERT (strcmp (name, "fr_FR.UTF-8") == 0);
}
Gisle Vanem
2018-05-03 12:22:30 UTC
Permalink
Post by Bruno Haible
+#if (defined _WIN32 || defined __WIN32__) && !defined __CYGWIN__
+ /* On native Windows, here,
Just curious, which Windows compiler has '__WIN32__'
as a built-in, but not '_WIN32'?

AFAICS, all MinGW's have both. And clang-cl does not have '__WIN32__'.
--
--gv
Bruno Haible
2018-05-03 16:02:09 UTC
Permalink
Post by Gisle Vanem
Just curious, which Windows compiler has '__WIN32__'
as a built-in, but not '_WIN32'?
Borland C++, according to <https://sourceforge.net/p/predef/wiki/OperatingSystems/>.

When we began gnulib development, Borland C++ was still a reasonable portability
target to keep in mind. Probably this is no longer the case? Can you check all
"reasonable" C compilers for Windows whether they define '_WIN32'?

Bruno
Bruno Haible
2018-05-04 00:22:17 UTC
Permalink
Post by Bruno Haible
When we began gnulib development, Borland C++ was still a reasonable portability
target to keep in mind. Probably this is no longer the case? Can you check all
"reasonable" C compilers for Windows whether they define '_WIN32'?
I checked OpenWatcom 2.0 and Pelles-C. They both have '_WIN32', but
not '__WIN32__'. The rest (MSVC, MinGW) has it off-course.
Thanks for these investigations. So, it should be safe to test only
'_WIN32' nowadays. Done through the attached patch.
Eli Zaretskii
2018-05-04 13:22:43 UTC
Permalink
Date: Thu, 03 May 2018 05:42:17 +0200
FAIL: test-localename
=====================
../../tests/test-localename.c:158: assertion 'strcmp (name, "de_DE.UTF-8") == 0' failed
/* Check that gl_locale_name_thread always returns NULL. */
ASSERT (gl_locale_name_thread (LC_CTYPE, "LC_CTYPE") == NULL);
It is due to the localename.c change from 2014-07-15 [1]. While the general
approach (to call GetThreadLocale only as a fallback, because GetThreadLocale
does not consider changes made by 'setlocale') is reasonable, the exact
placement of the new code is odd: Fixing the test failures in the tests
(attached: fix-tests.diff) makes the code even stranger.
[1] https://lists.gnu.org/archive/html/bug-gnulib/2014-07/msg00003.html
Sorry, I've read the text and and diffs several times, and I still
cannot see the light. Can you explain the rationale for moving code
between gl_locale_name_thread and gl_locale_name_posix. I don't think
I understand why that matters, given that both are called from
gl_locale_name. I also don't understand why having a thread-level
code inside a function whose name ends with "_thread" makes less sense
than having non-Posix code in a function whose name ends with
"_posix".

As for the testing part, it sounds like you kept the changes I made?
And yet you call doing that "odd"? What am I missing?

And how all that is related to the failures which are the trigger for
your proposal?

Sorry for being so dumb this afternoon, and thanks in advance for
explaining this.
Bruno Haible
2018-05-04 20:25:55 UTC
Permalink
Hi Eli,
Post by Eli Zaretskii
Can you explain the rationale for moving code
between gl_locale_name_thread and gl_locale_name_posix.
gl_locale_name() is defined, basically (forgive the shorthand notation) as
gl_locale_name_thread()
|| gl_locale_name_posix()
|| gl_locale_name_default().

The semantics of your patch was to add specific code for native Windows:

gl_locale_name_thread()
|| gl_locale_name_windows()
|| gl_locale_name_posix()
|| gl_locale_name_default().

There are three ways to actually implement this:
(1) Define a function gl_locale_name_windows, explicitly.
(2) Move gl_locale_name_windows() into gl_locale_name_thread().
(3) Move gl_locale_name_windows() into gl_locale_name_posix().

All three options produce the same value for gl_locale_name(), which is
the only function that is used outside gnulib.

But the test suite tests gl_locale_name_thread and gl_locale_name_posix
individually.

Pro and cons of (1),(2),(3):
Pro (1): Very explicit.
Cons (1): Code for a specific platform becomes mentioned in the .h file.
Cons (2): gl_locale_name_windows() has nothing to do with threads.
Cons (2): Requires some #if in the test suite, around line 480.
Pro (3): Both gl_locale_name_windows() and gl_locale_name_posix()
are related to setlocale().
Summary of scores:
(1): 0
(2): -2
(3): +1
So, option (3) is the best one, for my taste.
Post by Eli Zaretskii
I also don't understand why having a thread-level
code inside a function whose name ends with "_thread" makes less sense
AFAICS, the body of gl_locale_name_windows() is not "thread-level code".
It produces a value that depends on the return value of setlocale(),
i.e. depends on global state of the libc, not on anything particular
to the thread.
Post by Eli Zaretskii
And how all that is related to the failures which are the trigger for
your proposal?
The test failures occur because the distinction between what happens
in gl_locale_name_thread and gl_locale_name_posix is visible
to the test suite.

Bruno
Eli Zaretskii
2018-05-05 06:36:40 UTC
Permalink
Date: Fri, 04 May 2018 22:25:55 +0200
gl_locale_name() is defined, basically (forgive the shorthand notation) as
gl_locale_name_thread()
|| gl_locale_name_posix()
|| gl_locale_name_default().
gl_locale_name_thread()
|| gl_locale_name_windows()
|| gl_locale_name_posix()
|| gl_locale_name_default().
No, my patch didn't add gl_locale_name_windows. What it did was make
gl_locale_name_thread do something useful on MS-Windows, instead of
returning NULL. The main part of the patch is the code added to
gl_locale_name_thread that calls gl_locale_name_from_win32_LCID; all
the rest is just bookkeeping and subroutines needed for making that
call.
(1) Define a function gl_locale_name_windows, explicitly.
(2) Move gl_locale_name_windows() into gl_locale_name_thread().
(3) Move gl_locale_name_windows() into gl_locale_name_posix().
All three options produce the same value for gl_locale_name(), which is
the only function that is used outside gnulib.
But the test suite tests gl_locale_name_thread and gl_locale_name_posix
individually.
Yes, the test suite tests those functions separately. If only
gl_locale_name is important, then why not remove all the tests for
internal functions? Faced with that question, my conclusion was that
Gnulib does want those internal functions to serve some useful
purpose, perhaps for future extensions or for Gnulib code other that
gl_locale_name -- and the code in setlocale.c does call some of those
functions separately. So I preferred to leave the test suite factored
as it was, and instead make sure each one of those functions produced
a meaningful value.
Pro (1): Very explicit.
Cons (1): Code for a specific platform becomes mentioned in the .h file.
Cons (2): gl_locale_name_windows() has nothing to do with threads.
Cons (2): Requires some #if in the test suite, around line 480.
Pro (3): Both gl_locale_name_windows() and gl_locale_name_posix()
are related to setlocale().
Cons (3): gl_locale_name_thread is useless on Windows.
Cons (3): Requires some #if in the test suite.

Your Cons (2) is IMO a weak argument, because the actual problem is
that the order of calling these functions is less useful on MS-Windows
than on Posix platforms, due to unportable assumptions.
(1): 0
(2): -2
(3): +1
So, option (3) is the best one, for my taste.
I don't see why, see above.
Post by Eli Zaretskii
I also don't understand why having a thread-level
code inside a function whose name ends with "_thread" makes less sense
AFAICS, the body of gl_locale_name_windows() is not "thread-level code".
Right, so let me rephrase: why having gl_locale_name_thread be useless
on MS-Windows makes more sense than what it does now? That is my main
problem with your proposal. Can we instead find a way that leaves
that function useful, and yet solves the problems in the test suite
(which I admit I don't yet understand clearly enough)?
Post by Eli Zaretskii
And how all that is related to the failures which are the trigger for
your proposal?
The test failures occur because the distinction between what happens
in gl_locale_name_thread and gl_locale_name_posix is visible
to the test suite.
Can you elaborate about this distinction and how it is visible in the
test suite? I must be missing something, because I don't think I see
that issue.

Thanks.
Bruno Haible
2018-05-05 08:27:31 UTC
Permalink
Post by Eli Zaretskii
my conclusion was that
Gnulib does want those internal functions to serve some useful
purpose, perhaps for future extensions or for Gnulib code other that
gl_locale_name
Correct. These functions are exposed in localename.h. If someone finds
them useful, they can use them.
Post by Eli Zaretskii
So I preferred to leave the test suite factored
as it was, and instead make sure each one of those functions produced
a meaningful value.
Yes, it was good to not remove two functions that are declared in
localename.h.
Post by Eli Zaretskii
Post by Bruno Haible
The test failures occur because the distinction between what happens
in gl_locale_name_thread and gl_locale_name_posix is visible
to the test suite.
Can you elaborate about this distinction and how it is visible in the
test suite?
There were two test failures:
- line 158: because in some cases, gl_locale_name() returns a string
without charset suffix. One could fix this by using the same logic
as in localcharset.c, but I left it as-is because the primary purpose
of gl_locale_name() is not the charset suffix.
- line 480: The test suite encodes the implication "uselocale() was
never called => gl_locale_name_thread() is NULL". Rather than removing
this assertion, I preferred to refactor the code so that this assertion
holds again.
Post by Eli Zaretskii
Cons (3): gl_locale_name_thread is useless on Windows.
No, this is not a 'Cons (3)'. It is perfectly OK to have a function that
returns NULL.

Also look at these descriptions in localename.h:

gl_locale_name_thread
Determine the current per-thread locale's name, as specified by uselocale()
calls.

gl_locale_name_posix
Determine the thread-independent current locale's name, as specified by
setlocale() calls or by environment variables.

My refactoring made the code on Windows match these descriptions.

Bruno
Eli Zaretskii
2018-05-05 09:14:20 UTC
Permalink
Date: Sat, 05 May 2018 10:27:31 +0200
Post by Eli Zaretskii
Post by Bruno Haible
The test failures occur because the distinction between what happens
in gl_locale_name_thread and gl_locale_name_posix is visible
to the test suite.
Can you elaborate about this distinction and how it is visible in the
test suite?
- line 158: because in some cases, gl_locale_name() returns a string
without charset suffix. One could fix this by using the same logic
as in localcharset.c, but I left it as-is because the primary purpose
of gl_locale_name() is not the charset suffix.
- line 480: The test suite encodes the implication "uselocale() was
never called => gl_locale_name_thread() is NULL". Rather than removing
this assertion, I preferred to refactor the code so that this assertion
holds again.
Those considerations seem less important to me than the functionality
of the code on various platforms. The functionality should make sense
first and foremost, whereas the test suite structure should IMO be
secondary to that.
Post by Eli Zaretskii
Cons (3): gl_locale_name_thread is useless on Windows.
No, this is not a 'Cons (3)'. It is perfectly OK to have a function that
returns NULL.
It's valid if the notion of a per-thread locale makes no sense.
That's not the situation on Windows. Returning NULL will require
callers to handle that problematic value specially, thus imposing an
additional burden on developers and providing a window for bugs on
MS-Windows.
gl_locale_name_thread
Determine the current per-thread locale's name, as specified by uselocale()
calls.
gl_locale_name_posix
Determine the thread-independent current locale's name, as specified by
setlocale() calls or by environment variables.
My refactoring made the code on Windows match these descriptions.
But the implementation implicitly assumes that per-thread locale is
more specific than the thread-independent locale. This assumption is
false on Windows, as long as the "target audience" is ports of Posix
software to Windows, because the thread-local locale is insensitive to
setlocale calls, something Posix programs will not be prepared to deal
with.

So I respectfully submit that your refactoring, while it formally
makes the Windows code more matching the description, is less useful
in practice.
Bruno Haible
2018-05-05 10:55:27 UTC
Permalink
Post by Eli Zaretskii
Those considerations seem less important to me than the functionality
of the code on various platforms. The functionality should make sense
first and foremost, whereas the test suite structure should IMO be
secondary to that.
In this case, the testsuite pointed to a problem within the code - namely
functions that did not implement their description in localename.h.
Post by Eli Zaretskii
Returning NULL will require
callers to handle that problematic value specially, thus imposing an
additional burden on developers and providing a window for bugs on
MS-Windows.
Yes, but this is unavoidable, given the fact that GetThreadLocale cannot
tell whether SetThreadLocale has been called or not.

I don't want to change the structure of the function gl_locale_name:
gl_locale_name_thread()
|| gl_locale_name_posix()
|| gl_locale_name_default().
to cater with a problem that
- occurs in a function that is not yet used outside gnulib,
- is limited to native Windows.

My primary objectives
- that gl_locale_name() performs well in most situations,
- that the functions adhere to their specification, as far as possible,
- that the test suite passes,
are now fulfilled, and that was the point of my change.

Bruno

Loading...