Discussion:
Changes in nl_langinfo() and strftime() API in glibc
Rafal Luzynski
2018-01-22 23:41:27 UTC
Hello,

I'd like to notify you that some changes were introduced today
to the nl_langinfo() and strftime() families, including strptime().
They should also be ported to the implementations in Gnulib. This is
not only to make the changes available on other systems but also to
port them to the fprintftime() function, which exists only in Gnulib
and which is used by the date(1) command-line utility on Linux as well
as in a few more utilities. I am the author of the changes, so I can
assist you with porting them to Gnulib.

I am not subscribed to this list, so please reply both to this list and
to my email address.

I'm sorry if this list is not the correct way to notify you; I haven't
found anything like a Bugzilla or a source code repository supporting
pull requests.

Regards,

Rafal Luzynski


Links:

https://sourceware.org/bugzilla/show_bug.cgi?id=10871
https://sourceware.org/git/?p=glibc.git;a=blob;f=NEWS;hb=HEAD
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=95cb863
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=761a585
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=2239076
Paul Eggert
2018-01-23 08:47:33 UTC
Thanks, I merged the glibc changes into gnulib/lib/nstrftime.c by installing the
attached into gnulib master. I don't know the nl_langinfo code as well, though,
and so would appreciate advice as to how to merge that change in.

PS. This doesn't matter for Gnulib, but why define ABALTMON_1 only for the
!COMPILE_WIDE case? Not that anyone ever uses the wide code....
Rafal Luzynski
2018-01-23 13:21:41 UTC
23.01.2018 09:47 Paul Eggert <***@cs.ucla.edu> wrote:
> Thanks, I merged the glibc changes into gnulib/lib/nstrftime.c by installing
> the
> attached into gnulib master. I don't know the nl_langinfo code as well,
> though,
> and so would appreciate advice as to how to merge that change in.
> [...]

Thank you, Paul. Your patch looks correct at first sight. Unfortunately,
I'm afraid it will not even compile without the changes in nl_langinfo()
because ALTMON_1 and _NL_ABALTMON_1 symbols are undefined.

Also, I'm not sure whether gnulib provides its own implementation of strptime();
if it does, it may need an update as well.

I will provide more details later in the evening, CET timezone.

Regards,

Rafal
Bruno Haible
2018-01-24 09:03:00 UTC
Rafal Luzynski wrote:
> Unfortunately,
> I'm afraid it will not even compile without the changes in nl_langinfo()
> because ALTMON_1 and _NL_ABALTMON_1 symbols are undefined.

No, Paul's patch was correct. Paul would never push a commit that does not
compile.

Paul Eggert wrote:
> I don't know the nl_langinfo code as well, though,

Done as follows:


2018-01-24 Bruno Haible <***@clisp.org>

langinfo, nl_langinfo: Add support for alternative month names.
* m4/langinfo_h.m4 (gl_LANGINFO_H): Define HAVE_LANGINFO_ALTMON.
* lib/langinfo.in.h (ALTMON_1...ALTMON_12): New macros.
* lib/nl_langinfo.c (rpl_nl_langinfo): Treat ALTMON_i like MON_i.
* tests/test-nl_langinfo.c (main): Test ALTMON_*.
* doc/posix-headers/langinfo.texi: Document support of ALTMON_*.
* doc/posix-functions/nl_langinfo.texi: Likewise.

diff --git a/doc/posix-functions/nl_langinfo.texi b/doc/posix-functions/nl_langinfo.texi
index cd9e523..529b0e9 100644
--- a/doc/posix-functions/nl_langinfo.texi
+++ b/doc/posix-functions/nl_langinfo.texi
@@ -15,6 +15,10 @@ Minix 3.1.8, mingw, MSVC 14, BeOS.
The constant @code{CODESET} is not supported on some platforms:
glibc 2.0.6, OpenBSD 3.8.
@item
+The constants @code{ALTMON_1} to @code{ALTMON_12} are not defined on some
+platforms:
+glibc 2.26 and many others.
+@item
The constants @code{ERA}, @code{ERA_D_FMT}, @code{ERA_D_T_FMT},
@code{ERA_T_FMT}, @code{ALT_DIGITS} are not supported on some platforms:
OpenBSD 3.8.
diff --git a/doc/posix-headers/langinfo.texi b/doc/posix-headers/langinfo.texi
index a4d1516..30ae66c 100644
--- a/doc/posix-headers/langinfo.texi
+++ b/doc/posix-headers/langinfo.texi
@@ -14,6 +14,10 @@ Minix 3.1.8, mingw, MSVC 14, BeOS.
The constant @code{CODESET} is not defined on some platforms:
glibc 2.0.6, OpenBSD 3.8.
@item
+The constants @code{ALTMON_1} to @code{ALTMON_12} are not defined on some
+platforms:
+glibc 2.26 and many others.
+@item
The constants @code{ERA}, @code{ERA_D_FMT}, @code{ERA_D_T_FMT},
@code{ERA_T_FMT}, @code{ALT_DIGITS} are not defined on some platforms:
OpenBSD 3.8.
diff --git a/lib/langinfo.in.h b/lib/langinfo.in.h
index e51bb57..31ac575 100644
--- a/lib/langinfo.in.h
+++ b/lib/langinfo.in.h
@@ -86,6 +86,18 @@ typedef int nl_item;
# define MON_10 (MON_1 + 9)
# define MON_11 (MON_1 + 10)
# define MON_12 (MON_1 + 11)
+# define ALTMON_1 10200
+# define ALTMON_2 (ALTMON_1 + 1)
+# define ALTMON_3 (ALTMON_1 + 2)
+# define ALTMON_4 (ALTMON_1 + 3)
+# define ALTMON_5 (ALTMON_1 + 4)
+# define ALTMON_6 (ALTMON_1 + 5)
+# define ALTMON_7 (ALTMON_1 + 6)
+# define ALTMON_8 (ALTMON_1 + 7)
+# define ALTMON_9 (ALTMON_1 + 8)
+# define ALTMON_10 (ALTMON_1 + 9)
+# define ALTMON_11 (ALTMON_1 + 10)
+# define ALTMON_12 (ALTMON_1 + 11)
# define ABMON_1 10035
# define ABMON_2 (ABMON_1 + 1)
# define ABMON_3 (ABMON_1 + 2)
@@ -138,6 +150,22 @@ typedef int nl_item;
# define GNULIB_defined_T_FMT_AMPM 1
# endif

+# if !@HAVE_LANGINFO_ALTMON@
+# define ALTMON_1 10200
+# define ALTMON_2 (ALTMON_1 + 1)
+# define ALTMON_3 (ALTMON_1 + 2)
+# define ALTMON_4 (ALTMON_1 + 3)
+# define ALTMON_5 (ALTMON_1 + 4)
+# define ALTMON_6 (ALTMON_1 + 5)
+# define ALTMON_7 (ALTMON_1 + 6)
+# define ALTMON_8 (ALTMON_1 + 7)
+# define ALTMON_9 (ALTMON_1 + 8)
+# define ALTMON_10 (ALTMON_1 + 9)
+# define ALTMON_11 (ALTMON_1 + 10)
+# define ALTMON_12 (ALTMON_1 + 11)
+# define GNULIB_defined_ALTMON 1
+# endif
+
# if !@HAVE_LANGINFO_ERA@
# define ERA 10047
# define ERA_D_FMT 10048
diff --git a/lib/nl_langinfo.c b/lib/nl_langinfo.c
index 725ccf6..b93f7be 100644
--- a/lib/nl_langinfo.c
+++ b/lib/nl_langinfo.c
@@ -100,6 +100,24 @@ rpl_nl_langinfo (nl_item item)
case T_FMT_AMPM:
return (char *) "%I:%M:%S %p";
# endif
+# if GNULIB_defined_ALTMON
+ case ALTMON_1:
+ case ALTMON_2:
+ case ALTMON_3:
+ case ALTMON_4:
+ case ALTMON_5:
+ case ALTMON_6:
+ case ALTMON_7:
+ case ALTMON_8:
+ case ALTMON_9:
+ case ALTMON_10:
+ case ALTMON_11:
+ case ALTMON_12:
+ /* We don't ship the appropriate localizations with gnulib. Therefore,
+ treat ALTMON_i like MON_i. */
+ item = item - ALTMON_1 + MON_1;
+ break;
+# endif
# if GNULIB_defined_ERA
case ERA:
/* The format is not standardized. In glibc it is a sequence of strings
@@ -228,28 +246,49 @@ nl_langinfo (nl_item item)
return (char *) abdays[item - ABDAY_1];
return nlbuf;
}
- case MON_1:
- case MON_2:
- case MON_3:
- case MON_4:
- case MON_5:
- case MON_6:
- case MON_7:
- case MON_8:
- case MON_9:
- case MON_10:
- case MON_11:
- case MON_12:
- {
- static char const months[][sizeof "September"] = {
- "January", "February", "March", "April", "May", "June", "July",
- "September", "October", "November", "December"
- };
+ {
+ static char const months[][sizeof "September"] = {
+ "January", "February", "March", "April", "May", "June", "July",
+ "September", "October", "November", "December"
+ };
+ case MON_1:
+ case MON_2:
+ case MON_3:
+ case MON_4:
+ case MON_5:
+ case MON_6:
+ case MON_7:
+ case MON_8:
+ case MON_9:
+ case MON_10:
+ case MON_11:
+ case MON_12:
tmm.tm_mon = item - MON_1;
if (!strftime (nlbuf, sizeof nlbuf, "%B", &tmm))
return (char *) months[item - MON_1];
return nlbuf;
- }
+ case ALTMON_1:
+ case ALTMON_2:
+ case ALTMON_3:
+ case ALTMON_4:
+ case ALTMON_5:
+ case ALTMON_6:
+ case ALTMON_7:
+ case ALTMON_8:
+ case ALTMON_9:
+ case ALTMON_10:
+ case ALTMON_11:
+ case ALTMON_12:
+ tmm.tm_mon = item - ALTMON_1;
+ /* The platforms without nl_langinfo() don't support strftime with %OB.
+ We don't even need to try. */
+ #if 0
+ if (!strftime (nlbuf, sizeof nlbuf, "%OB", &tmm))
+ #endif
+ if (!strftime (nlbuf, sizeof nlbuf, "%B", &tmm))
+ return (char *) months[item - ALTMON_1];
+ return nlbuf;
+ }
case ABMON_1:
case ABMON_2:
case ABMON_3:
diff --git a/m4/langinfo_h.m4 b/m4/langinfo_h.m4
index 9ae375c..de077c3 100644
--- a/m4/langinfo_h.m4
+++ b/m4/langinfo_h.m4
@@ -1,4 +1,4 @@
-# langinfo_h.m4 serial 7
+# langinfo_h.m4 serial 8
dnl Copyright (C) 2009-2018 Free Software Foundation, Inc.
dnl This file is free software; the Free Software Foundation
dnl gives unlimited permission to copy and/or distribute it,
@@ -17,6 +17,7 @@ AC_DEFUN([gl_LANGINFO_H],
dnl Determine whether <langinfo.h> exists. It is missing on mingw and BeOS.
HAVE_LANGINFO_CODESET=0
HAVE_LANGINFO_T_FMT_AMPM=0
+ HAVE_LANGINFO_ALTMON=0
HAVE_LANGINFO_ERA=0
HAVE_LANGINFO_YESEXPR=0
AC_CHECK_HEADERS_ONCE([langinfo.h])
@@ -24,6 +25,7 @@ AC_DEFUN([gl_LANGINFO_H],
HAVE_LANGINFO_H=1
dnl Determine what <langinfo.h> defines. CODESET and ERA etc. are missing
dnl on OpenBSD 3.8. T_FMT_AMPM and YESEXPR, NOEXPR are missing on IRIX 5.3.
+ dnl ALTMON_* are missing on glibc 2.26 and many other systems.
AC_CACHE_CHECK([whether langinfo.h defines CODESET],
[gl_cv_header_langinfo_codeset],
[AC_COMPILE_IFELSE(
@@ -48,6 +50,18 @@ int a = T_FMT_AMPM;
if test $gl_cv_header_langinfo_t_fmt_ampm = yes; then
HAVE_LANGINFO_T_FMT_AMPM=1
fi
+ AC_CACHE_CHECK([whether langinfo.h defines ALTMON_1],
+ [gl_cv_header_langinfo_altmon],
+ [AC_COMPILE_IFELSE(
+ [AC_LANG_PROGRAM([[#include <langinfo.h>
+int a = ALTMON_1;
+]])],
+ [gl_cv_header_langinfo_altmon=yes],
+ [gl_cv_header_langinfo_altmon=no])
+ ])
+ if test $gl_cv_header_langinfo_altmon = yes; then
+ HAVE_LANGINFO_ALTMON=1
+ fi
AC_CACHE_CHECK([whether langinfo.h defines ERA],
[gl_cv_header_langinfo_era],
[AC_COMPILE_IFELSE(
@@ -78,6 +92,7 @@ int a = YESEXPR;
AC_SUBST([HAVE_LANGINFO_H])
AC_SUBST([HAVE_LANGINFO_CODESET])
AC_SUBST([HAVE_LANGINFO_T_FMT_AMPM])
+ AC_SUBST([HAVE_LANGINFO_ALTMON])
AC_SUBST([HAVE_LANGINFO_ERA])
AC_SUBST([HAVE_LANGINFO_YESEXPR])

diff --git a/tests/test-nl_langinfo.c b/tests/test-nl_langinfo.c
index 8ad9df1..a13fb71 100644
--- a/tests/test-nl_langinfo.c
+++ b/tests/test-nl_langinfo.c
@@ -92,6 +92,32 @@ main (int argc, char *argv[])
ASSERT (strlen (nl_langinfo (MON_10)) > 0);
ASSERT (strlen (nl_langinfo (MON_11)) > 0);
ASSERT (strlen (nl_langinfo (MON_12)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_1)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_2)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_3)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_4)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_5)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_6)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_7)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_8)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_9)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_10)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_11)) > 0);
+ ASSERT (strlen (nl_langinfo (ALTMON_12)) > 0);
+ /* In the tested locales, alternate month names and month names ought to be
+ the same. */
+ ASSERT (strcmp (nl_langinfo (ALTMON_1), nl_langinfo (MON_1)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_2), nl_langinfo (MON_2)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_3), nl_langinfo (MON_3)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_4), nl_langinfo (MON_4)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_5), nl_langinfo (MON_5)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_6), nl_langinfo (MON_6)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_7), nl_langinfo (MON_7)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_8), nl_langinfo (MON_8)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_9), nl_langinfo (MON_9)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_10), nl_langinfo (MON_10)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_11), nl_langinfo (MON_11)) == 0);
+ ASSERT (strcmp (nl_langinfo (ALTMON_12), nl_langinfo (MON_12)) == 0);
ASSERT (strlen (nl_langinfo (ABMON_1)) > 0);
ASSERT (strlen (nl_langinfo (ABMON_2)) > 0);
ASSERT (strlen (nl_langinfo (ABMON_3)) > 0);
Rafal Luzynski
2018-01-25 01:39:59 UTC
24.01.2018 10:03 Bruno Haible <***@clisp.org> wrote:
>
>
> Rafal Luzynski wrote:
> > Unfortunately,
> > I'm afraid it will not even compile without the changes in nl_langinfo()
> > because ALTMON_1 and _NL_ABALTMON_1 symbols are undefined.
>
> No, Paul's patch was correct. Paul would never push a commit that does not
> compile.

I didn't mean it does not compile, I just wasn't sure because I did not
test this update.

> Paul Eggert wrote:
> > I don't know the nl_langinfo code as well, though,
>
> Done as follows:

Thank you, Bruno. I'm sorry I don't have time to compile and test
it now. But just at first sight: I can't see _NL_ABALTMON_* here.
Isn't it necessary?

Please also see the comments below:

>
> [...]
> diff --git a/doc/posix-functions/nl_langinfo.texi
> b/doc/posix-functions/nl_langinfo.texi
> index cd9e523..529b0e9 100644
> --- a/doc/posix-functions/nl_langinfo.texi
> +++ b/doc/posix-functions/nl_langinfo.texi
> @@ -15,6 +15,10 @@ Minix 3.1.8, mingw, MSVC 14, BeOS.
> The constant @code{CODESET} is not supported on some platforms:
> glibc 2.0.6, OpenBSD 3.8.
> @item
> +The constants @code{ALTMON_1} to @code{ALTMON_12} are not defined on some
> +platforms:
> +glibc 2.26 and many others.
> +@item

To be more precise: ALTMON_1 is being added in glibc 2.27 (as a GNU
extension), so it is not defined in glibc 2.26 and older.
(This is to avoid confusion: glibc 2.26 and newer? Just glibc 2.26?)

> [...]
> diff --git a/lib/nl_langinfo.c b/lib/nl_langinfo.c
> index 725ccf6..b93f7be 100644
> --- a/lib/nl_langinfo.c
> +++ b/lib/nl_langinfo.c
> [...]
> @@ -228,28 +246,49 @@ nl_langinfo (nl_item item)
> return (char *) abdays[item - ABDAY_1];
> return nlbuf;
> }
> - case MON_1:
> - case MON_2:
> - case MON_3:
> - case MON_4:
> - case MON_5:
> - case MON_6:
> - case MON_7:
> - case MON_8:
> - case MON_9:
> - case MON_10:
> - case MON_11:
> - case MON_12:
> - {
> - static char const months[][sizeof "September"] = {
> - "January", "February", "March", "April", "May", "June", "July",
> - "September", "October", "November", "December"
> - };
> + {
> + static char const months[][sizeof "September"] = {
> + "January", "February", "March", "April", "May", "June", "July",
> + "September", "October", "November", "December"
> + };
> + case MON_1:
> + case MON_2:
> + case MON_3:
> + case MON_4:
> + case MON_5:
> + case MON_6:
> + case MON_7:
> + case MON_8:
> + case MON_9:
> + case MON_10:
> + case MON_11:
> + case MON_12:
> tmm.tm_mon = item - MON_1;
> if (!strftime (nlbuf, sizeof nlbuf, "%B", &tmm))
> return (char *) months[item - MON_1];
> return nlbuf;

This looks correct: if nl_langinfo(MON_*) is not supported then you
try to retrieve the month name with strftime("%B"). Although I don't
understand why "case MON_*" has been removed and added. Formatting
changes maybe?

> - }
> + case ALTMON_1:
> + case ALTMON_2:
> + case ALTMON_3:
> + case ALTMON_4:
> + case ALTMON_5:
> + case ALTMON_6:
> + case ALTMON_7:
> + case ALTMON_8:
> + case ALTMON_9:
> + case ALTMON_10:
> + case ALTMON_11:
> + case ALTMON_12:
> + tmm.tm_mon = item - ALTMON_1;
> + /* The platforms without nl_langinfo() don't support strftime with %OB.
> + We don't even need to try. */
> + #if 0
> + if (!strftime (nlbuf, sizeof nlbuf, "%OB", &tmm))
> + #endif

Not really, I think that this removed implementation would be useful
sometimes. As far as I know OS X does support strftime("%OB") and
does support nl_langinfo() but does not support ALTMON_* series.

> + if (!strftime (nlbuf, sizeof nlbuf, "%B", &tmm))
> + return (char *) months[item - ALTMON_1];
> + return nlbuf;
> + }

Otherwise OK: if neither ALTMON_* nor strftime("%OB") is available
on a particular platform then falling back to MON_* series is correct.
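
Seen from the caller's side, the fallback order discussed here could be sketched roughly as follows. This is only an illustration, not Gnulib's code: the helper name standalone_month_name is hypothetical, and the sketch assumes that ALTMON_1 to ALTMON_12 (and MON_1 to MON_12) are consecutive nl_item values and that glibc needs _GNU_SOURCE to make the ALTMON_* macros visible.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1  /* assumption: glibc exposes ALTMON_* as a GNU extension */
#endif
#include <assert.h>
#include <langinfo.h>

/* Hypothetical helper: return the standalone form of a month name,
   where MONTH is 0 for January ... 11 for December.  */
const char *
standalone_month_name (int month)
{
  assert (month >= 0 && month <= 11);
#ifdef ALTMON_1
  /* The platform (or a replacement <langinfo.h>) defines ALTMON_*.  */
  return nl_langinfo (ALTMON_1 + month);
#else
  /* No ALTMON_* at all: MON_* is the best available approximation.  */
  return nl_langinfo (MON_1 + month);
#endif
}
```

In locales that make no distinction between the two forms, both branches are expected to return the same string.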

I'm sorry for this short review. I'm afraid I will not have time
to review more thoroughly this week.

Thank you for your support.

Regards,

Rafal
Bruno Haible
2018-01-25 08:04:42 UTC
Hi Rafal,

> I can't see _NL_ABALTMON_* here. Isn't it necessary?

Given the name of these constants (they start with an underscore),
it appears that they are not proposed for standardization. So,
as far as I understand, the primary way to use the abbreviated
alternate month names is through nstrftime %Ob, not through nl_langinfo.

I hope (haven't checked) that Paul's changes to nstrftime.c will,
on platforms that don't support strftime %Ob, fall back to strftime %b.

> > +The constants @code{ALTMON_1} to @code{ALTMON_12} are not defined on some
> > +platforms:
> > +glibc 2.26 and many others.
> > +@item
>
> To be more precise: we are adding ALTMON_1 since glibc 2.27 (as GNU
> extension) so it is not defined on glibc 2.26 and older.

Yup, that's what I understood from the glibc commit history.

> (This is to avoid confusion: glibc 2.26 and newer? just glibc 2.26?)

That's the style we use in the gnulib manual. When something is broken
in SW version x, the reader can't assume that it works in SW versions < x.
And we try to list the highest version in which it is broken (although
often we don't know precisely).

> Although I don't
> understand why "case MON_*" has been removed and added. Formatting
> changes maybe?

I added a level of braces. This required reindentation.

> > + case ALTMON_1:
> > + case ALTMON_2:
> > + case ALTMON_3:
> > + case ALTMON_4:
> > + case ALTMON_5:
> > + case ALTMON_6:
> > + case ALTMON_7:
> > + case ALTMON_8:
> > + case ALTMON_9:
> > + case ALTMON_10:
> > + case ALTMON_11:
> > + case ALTMON_12:
> > + tmm.tm_mon = item - ALTMON_1;
> > + /* The platforms without nl_langinfo() don't support strftime with %OB.
> > + We don't even need to try. */
> > + #if 0
> > + if (!strftime (nlbuf, sizeof nlbuf, "%OB", &tmm))
> > + #endif
>
> Not really, I think that this removed implementation would be useful
> sometimes.

No, this code is meant for the platforms Minix, mingw, MSVC, BeOS.
mingw and MSVC don't support strftime with %OB; I checked the documentation.
And for Minix and BeOS I can tell it without even checking the documentation.

Bruno
Paul Eggert
2018-01-25 01:07:23 UTC
On 01/23/2018 05:21 AM, Rafal Luzynski wrote:
> I'm afraid it will not even compile without the changes in nl_langinfo()
> because ALTMON_1 and _NL_ABALTMON_1 symbols are undefined.

It should work in the Gnulib context, because those symbols are never
used in that context. At least, it worked for me when I compiled it. I
think the patch will also work in the glibc context but have not tested
this (the patch would have to be backported to glibc anyway).
Rafal Luzynski
2018-01-30 22:11:56 UTC
Hello,

Sorry for this late response, I was focused on more urgent tasks.

23.01.2018 09:47 Paul Eggert <***@cs.ucla.edu> wrote:
> [...]
> PS. This doesn't matter for Gnulib, but why define ABALTMON_1 only for the
> !COMPILE_WIDE case?

This is only to make sure that the NLW() macro works as expected.
In the !COMPILE_WIDE case it just outputs its argument unchanged,
so NLW(ABDAY_1) will be ABDAY_1, NLW(MON_1) will be MON_1,
and so on. In the COMPILE_WIDE case it prepends _NL_W to the argument,
so NLW(ABDAY_1) will be _NL_WABDAY_1, NLW(MON_1) will be _NL_WMON_1,
and so on. For the abbreviated alternative month names we need
_NL_ABALTMON_1 and _NL_WABALTMON_1. If we passed _NL_ABALTMON_1
directly, it would yield _NL_ABALTMON_1 in the !COMPILE_WIDE case (correct)
and _NL_W_NL_ABALTMON_1 in the COMPILE_WIDE case (incorrect). So I've decided
to use NLW(ABALTMON_1), which in the COMPILE_WIDE case generates
_NL_WABALTMON_1 (correct); in the !COMPILE_WIDE case I define ABALTMON_1
as _NL_ABALTMON_1, so NLW() generates ABALTMON_1, which is actually
_NL_ABALTMON_1. Again correct.
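
The token pasting described above can be checked with a small standalone sketch. It is not the glibc source: glibc has a single NLW() whose definition depends on COMPILE_WIDE, whereas the names NLW_NARROW and NLW_WIDE below are hypothetical and exist only so that both variants can be compared in one file. The symbols are stringified rather than used as real constants.

```c
#include <assert.h>
#include <string.h>

/* Two-step stringification, so that the argument is macro-expanded first.  */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Hypothetical stand-ins for the two definitions of NLW().  */
#define NLW_NARROW(Sym) Sym         /* !COMPILE_WIDE: pass the symbol through */
#define NLW_WIDE(Sym)   _NL_W##Sym  /* COMPILE_WIDE: paste the _NL_W prefix */

/* The workaround: ABALTMON_1 is an alias for the internal symbol.  It only
   matters in the narrow case, because ## suppresses expansion of its operand.  */
#define ABALTMON_1 _NL_ABALTMON_1

const char *narrow_symbol (void) { return STR (NLW_NARROW (ABALTMON_1)); }
const char *wide_symbol (void) { return STR (NLW_WIDE (ABALTMON_1)); }
const char *wide_symbol_naive (void) { return STR (NLW_WIDE (_NL_ABALTMON_1)); }

/* Verify the three expansions discussed in the text.  */
void
check_nlw_workaround (void)
{
  assert (strcmp (narrow_symbol (), "_NL_ABALTMON_1") == 0);
  assert (strcmp (wide_symbol (), "_NL_WABALTMON_1") == 0);
  /* Passing the internal name directly produces a bogus symbol.  */
  assert (strcmp (wide_symbol_naive (), "_NL_W_NL_ABALTMON_1") == 0);
}
```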

It would be easier if ABALTMON_1 (and all ABALTMON_*) were defined
officially, which I hope happens one day, but for now this simple
workaround will do.

> Not that anyone ever uses the wide code....

AFAIK it's commonly used on the Windows platform. I guess this is a good
target if Gnulib is supposed to provide the GNU API on non-GNU platforms.

Thank you for your support. Best regards,

Rafal
Bruno Haible
2018-01-31 07:28:07 UTC
Rafal Luzynski wrote:
> > Not that anyone ever uses the wide code....
>
> AFAIK it's commonly used on Windows platform. I guess this is a good
> target if Gnulib is supposed to provide the GNU API on non-GNU platforms.

No. wchar_t[] APIs are too broken for application use in general [1].
On Windows platforms, the only reasonable use of wchar_t arrays you can
make is to interface to the Windows API functions, and *nothing else*.
Gnulib will *not* recommend or favour the use of wchar_t[] APIs in
applicative code.

Bruno

[1] https://www.gnu.org/software/libunistring/manual/html_node/The-wchar_005ft-mess.html
Bruno Haible
2018-01-24 09:51:37 UTC
Hi Rafal,

> https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=2239076

This documentation patch is too vague, IMO. It purports to document
"Specify when to use %OB instead of %B." But as a programmer who is not
aware of Polish and Greek grammar, it does not precisely answer the question:

When should I use %OB, and when should I use %B, in strftime?

I would look in time.texi; I find the answer too vague: "as part of a
complete date".
Is 24 January 2018 a complete date? I'd guess yes.
Is 24 January a complete date? I'd guess no.
Is January 2018 a complete date? I'd guess no.
Is January a complete date? I guess you meant no.

Is that what you intended to mean?

Bruno
Rafal Luzynski
2018-01-24 10:20:48 UTC
24.01.2018 10:51 Bruno Haible <***@clisp.org> wrote:
>
>
> Hi Rafal,
>
> > https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=2239076
>
> This documentation patch is too vague, IMO. It purports to document
> "Specify when to use %OB instead of %B." But as a programmer who is not
> aware of Polish and Greek grammar, it does not precisely answer the question:
>
> When should I use %OB, and when should I use %B, in strftime?

Since I am not a native English speaker, I have left the writing of the
documentation to native English speakers. While I am able to write
documentation that is more or less correct, I am unable to write better
documentation than they did. I think that this document strikes
a balance between being concise and being a book about Slavic or
Indo-European grammar, which would be too long and too boring for
most programmers.

Short answer to all your questions: whatever date format you use
you should make it translatable, like:

strftime (s, max, _("%A, %B %d, %y"), ...

so you leave the correct format for the translators. This should
have been true since forever because the local date formats are not
limited to whether the date is full or the month is standalone but
also includes things like whether there are dots and/or commas,
date-month order, leading zeros or spaces, etc.
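
A slightly fuller sketch of this pattern might look as follows. The helper name format_long_date is hypothetical, and the sketch assumes GNU gettext's <libintl.h> is available; a real program would also call setlocale(), bindtextdomain() and textdomain() at startup, which is omitted here.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>
#include <time.h>

/* The usual shorthand for marking and translating a message.  */
#define _(msgid) gettext (msgid)

/* Hypothetical helper: format TM with a date format that translators
   can replace.  Returns the number of bytes written, or 0 on failure.  */
size_t
format_long_date (char *buf, size_t size, const struct tm *tm)
{
  assert (buf != NULL && tm != NULL);
  /* TRANSLATORS: long date format; reorder the fields and adjust the
     punctuation as required by your language.  */
  return strftime (buf, size, _("%A, %B %d, %y"), tm);
}
```

Without a message catalog, gettext() returns the format unchanged, so the English layout is used.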

Indeed, as a programmer you are not obliged to know the native
languages. But it can be useful sometimes when you have to teach
the translators.

> I would look in time.texi; I find the answer too vague: "as part of a
> complete date".
> Is 24 January 2018 a complete date? I'd guess yes.

Correct.

> Is 24 January a complete date? I'd guess no.

Incorrect. Explanation below.

> Is January 2018 a complete date? I'd guess no.

Correct.

> Is January a complete date? I guess you meant no.

Correct.

> Is that what you intended to mean?

So, indeed, you did not understand correctly and it's not
your fault but the fault of the documentation. But did not
the documentation mention that a full date is a date with
the day number included?

The issue is when the day number and the month name appear
together. Some languages require a genitive case here, like
it can also be said in English: "24th of January", meaning
"the 24th day of January". It's similar in Spanish:
"24 de enero". But in English and Spanish this is easy:
just insert "of" and "de" everywhere and the problem is
fixed. It is more complex in Catalan: they also require
"de" but it is abbreviated to "d’" if a month name starts
with a vowel, like "abril": "24 d’abril" - this is already
too complex for glibc. It is even more complex in Slavic,
Baltic, Greek and a few more languages which feature heavy
declension: in Polish January is "styczeń" (standalone,
the nominative case) but when formatting a date it's obligatory
to say "24 stycznia" (the genitive case). A complete implementation
of this system would be larger than the whole implementation of
strftime(), I suppose. :-)

So, when there is no day number (and nothing similar, like
"the second week of" or "the first Sunday of") the month
name counts as standalone, a nominative case. Also when the
year number is included this still counts as standalone
because we are still talking about a month, not a day of
a month (or another part of a month).
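
In code, the rule above might be sketched like this. The two helper names are hypothetical, and the second one assumes a C library that understands %OB (glibc 2.27 or newer, or a BSD libc); on other platforms the result of %OB is not guaranteed. Real code would also mark both formats as translatable.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Day-of-month present: use %B, which gives the genitive form in
   locales that have one.  */
size_t
format_day_and_month (char *buf, size_t size, const struct tm *tm)
{
  assert (buf != NULL && tm != NULL);
  return strftime (buf, size, "%d %B %Y", tm);
}

/* No day-of-month: the month stands alone, so use %OB (nominative).
   Assumption: the C library supports the %OB conversion.  */
size_t
format_month_and_year (char *buf, size_t size, const struct tm *tm)
{
  assert (buf != NULL && tm != NULL);
  return strftime (buf, size, "%OB %Y", tm);
}
```

In a Polish locale on glibc 2.27 or newer, the first helper is expected to use "stycznia" and the second "styczeń" for January; in English both use "January".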

Sorry that this message is so long. As you can see, it is too
long to put into the documentation. I think I should write
a blog article about it.

Regards,

Rafal
Bruno Haible
2018-01-24 13:47:13 UTC
Hi Rafal,

> did not
> the documentation mention that a full date is a date with
> the day number included?

No, I don't see a definition of the term "full date" or "complete date"
there.

> The issue is when the day number and the month name appear
> together.
> ...
> So, when there is no day number (and nothing similar, like
> "the second week of" or "the first Sunday of") the month
> name counts as standalone, a nominative case.

Good. Here you have formulated the precise statement that I was looking for.

Can you please update the doc accordingly? Change
"when the month is used as part of a complete date"
to
"when the month appears together with a day-of-month".

(AFAICS strftime does not support week-of-month statements, only
week-of-year.)

> As you can see, it is too
> long to put it into the documentation. I think I should write
> a blog article about it.

A blog goes away someday; the documentation is there to stay and
to be improved. Please mention the essential statements in the doc;
the extra linguistic explanations with styczeń and stycznia can
indeed go in a blog.

> Short answer to all your questions: whatever date format you use
> you should make it translatable, like:
>
> strftime (s, max, _("%A, %B %d, %y"), ...
>
> so you leave the correct format for the translators.

Ah, but translators will not look in the glibc manual. They read only
the gettext manual. So do we need some text in the gettext manual as
well? In other words, is the %B / %OB distinction something that the
programmer can do, and the translator is not bothered about it? Or
is this distinction different according to language, so the translators
must deal with it?

Bruno
Rafal Luzynski
2018-01-25 01:13:46 UTC
24.01.2018 14:47 Bruno Haible <***@clisp.org> wrote:
>
> Hi Rafal,
>
> [...]
> > The issue is when the day number and the month name appear
> > together.
> > ...
> > So, when there is no day number (and nothing similar, like
> > "the second week of" or "the first Sunday of") the month
> > name counts as standalone, a nominative case.
>
> Good. Here you have formulated the precise statement that I sought for.
>
> Can you please update the doc accordingly? Change
> "when the month is used as part of a complete date"
> to
> "when the month appears together with a day-of-month".

OK, I have posted this suggestion to libc-alpha:

https://sourceware.org/ml/libc-alpha/2018-01/msg00773.html

> (AFAICS strftime does not support week-of-month statements, only
> week-of-year.)

That's true, I was thinking about possible constructions supported
by a native language. You are right, this particular construction
is not supported by strftime().

> [...]
> > Short answer to all your questions: whatever date format you use
> > you should make it translatable, like:
> >
> > strftime (s, max, _("%A, %B %d, %y"), ...
> >
> > so you leave the correct format for the translators.
>
> Ah, but translators will not look in the glibc manual. They read only
> the gettext manual. So do we need some text in the gettext manual as
> well?

I'm not sure which gettext manual you are thinking about but
this gettext manual is actually a part of glibc:

https://www.gnu.org/software/libc/manual/html_node/The-Uniforum-approach.html

> In other words, is the %B / %OB distinction something that the
> programmer can do, and the translator is not bothered about it?

I strongly believe that the format strings should be left for
the translators and the programmer's choice of a format string
should be correct for English but this is seldom correct for other
languages. This is not because of the genitive/nominative month
names but for the reasons like:

- English often uses the month-day order, most other languages
use the day-month order;
- many languages require a dot after the day number;
- English requires a comma after the day number if it is followed
by a year number;
- some languages (e.g., East Asian) do not have month names and
use the month numbers instead;
- and many more...

> Or
> is this distinction different according to language, so the translators
> must deal with it?

The reasons above are sufficient to say that translators have had
to deal with it since forever. If you are asking whether the rules
for where to use %OB and where to use %B are universal (so the translators
will not have to decide) or differ between languages, I must
say that I am not sure how these rules work in Czech,
Serbian, and Slovak. But let's take a look at these
numbers (they may be inaccurate, take them as an approximation):

- there are about 200 languages supported by glibc;
- about 20 of them (10%) need the nominative/genitive distinction,
in the rest of the languages there is no difference between %OB and %B;
- for about 3 of those 20 (1.5% of the total number) the rules for %OB/%B
may be different.

That means that if you (as a programmer) use %OB/%B correctly then
it will work correctly either immediately or with minor changes (reorder,
add/remove punctuation) in about 98.5% of languages. The other good
news is that if you use them incorrectly then 90% of languages
will not see any difference. :-)

Regards,

Rafal
Bruno Haible
2018-01-25 08:10:46 UTC
Hi Rafal,

> > "when the month is used as part of a complete date"
> > to
> > "when the month appears together with a day-of-month".
>
> OK, I have posted this suggestion to libc-alpha:
>
> https://sourceware.org/ml/libc-alpha/2018-01/msg00773.html

Thanks. As Carlos says, the manual can even talk about language specific
things; it already does for the plural forms [1].

> > Ah, but translators will not look in the glibc manual. They read only
> > the gettext manual. So do we need some text in the gettext manual as
> > well?
>
> I'm not sure which gettext manual you are thinking about but
> this gettext manual is actually a part of glibc:
>
> https://www.gnu.org/software/libc/manual/html_node/The-Uniforum-approach.html

I meant the GNU gettext manual [2].

> > In other words, is the %B / %OB distinction something that the
> > programmer can do, and the translator is not bothered about it?
>
> I strongly believe that the format strings should be left for
> the translators and the programmer's choice of a format string
> should be correct for English but this is seldom correct for other
> languages. This is not because of the genitive/nominative month
> names but for the reasons like:
>
> - English often uses the month-day order, most of other languages
> use the day-month order;
> - many languages require a dot after the day number;
> - English requires a comma after the day number if it is followed
> by a year number;
> - some languages (e.g., East Asian) do not have month names and
> use the month numbers instead;
> - and many more...

Interesting. Thanks for these thoughts. I have opened a ticket in the
gettext bug tracker to document these things. [3].

Bruno

[1] https://www.gnu.org/software/libc/manual/html_node/Advanced-gettext-functions.html
[2] https://www.gnu.org/software/gettext/manual/
[3] https://savannah.gnu.org/bugs/?52971