Discussion:
malloca, freea are not thread-safe
(too old to reply)
Florian Weimer
2018-01-10 14:02:17 UTC
Permalink
Raw Message
The mmalloca function used to implement malloca accesses a static global
array without synchronization:

#define HASH_TABLE_SIZE 257
static void * mmalloca_results[HASH_TABLE_SIZE];

mmalloca (size_t n)
{

/* Enter p into the hash table. */
slot = (uintptr_t) p % HASH_TABLE_SIZE;
h->next = mmalloca_results[slot];
mmalloca_results[slot] = p;

freea also causes valgrind warnings because it contains an out-of-bounds
access. This is very undesirable because it will cause programmers to
miss real bugs.

This code has been copied into libunistring and results in a thread
safety hazard there.

Thanks,
Florian
Ondřej Bílka
2018-01-10 21:08:29 UTC
Permalink
Raw Message
On Wed, Jan 10, 2018 at 03:02:17PM +0100, Florian Weimer wrote:
> The mmalloca function used to implement malloca accesses a static
> global array without synchronization:
>
> #define HASH_TABLE_SIZE 257
> static void * mmalloca_results[HASH_TABLE_SIZE];
> …
> mmalloca (size_t n)
> {
> …
> /* Enter p into the hash table. */
> slot = (uintptr_t) p % HASH_TABLE_SIZE;
> h->next = mmalloca_results[slot];
> mmalloca_results[slot] = p;
>
> freea also causes valgrind warnings because it contains an
> out-of-bounds access. This is very undesirable because it will
> cause programmers to miss real bugs.
>
> This code has been copied into libunistring and results in a thread
> safety hazard there.
>
> Thanks,
> Florian
First let S = sa_alignment_max;

This could be done faster without hash table by making alloca result
aligned to 2 * S and malloc ones not aligned to 2 * S by adding some padding.

It would make check on free simpler. For allocation its fastest with
__builtin_alloca_with_align(x, 2 * sa_alignment_max)

Without that it more depends on how much gcc messes up alloca and if it
could optimize x & c1 & c2 to x & (c1 & c2) for constants c1 and c2.

With downward growing stack a=alloca(n) could be done as (%rsp - n) & (~(S-1))
then we align it with ret = (a & (~(2*S-1))) + 2 * S
Bruno Haible
2018-01-11 04:26:58 UTC
Permalink
Raw Message
Ondřej Bílka wrote:
> This could be done faster without hash table by making alloca result
> aligned to 2 * S and malloc ones not aligned to 2 * S by adding some padding.

Nice trick. This can be done without violating the rules of how alloca() is
used.

A similar idea, that also consists in distinguishing the two cases by the
address, is if we know the stack bounds.
- For the current thread, glibc/nptl/allocatestack.c stores the stack
bounds in thread->stackblock and thread->stackblock_size. Unfortunately
we have no public accessor for it. It would be nice to have a
pthread_getstackbound(pthread_t, stack_t*)
function...
- For the main thread, the stack can grow unbounded, and it's the kernel
which decides when to stop its growth. GNU libsigsegv contains code
to determine the current bounds, but it involves many system calls.
=> This approach is probably not viable.

Another idea is to add some header (like the current implementation does),
but instead of storing a marker only in the malloc case, store a different
marker also in the alloca case. This should be done through a GCC statement
expression.
=> Should work with __builtin_alloca.

> It would make check on free simpler. For allocation its fastest with
> __builtin_alloca_with_align(x, 2 * sa_alignment_max)

Unfortunately we cannot use __builtin_alloca_with_align here, because the
GCC documentation [1] says regarding __builtin_alloca_with_align:
"The allocated storage is released no later than just before the
calling function returns to its caller, but may be released at the end
of the block in which the function was called."
whereas what we need is the lifetime speified for __builtin_alloca:
"The lifetime of the allocated object ends just before the calling
function returns to its caller. This is so even when __builtin_alloca
is called within a nested block."

Any idea ideas (before I go on to rewrite the malloca module)?

Bruno

[1] https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Other-Builtins.html
Florian Weimer
2018-02-01 12:28:30 UTC
Permalink
Raw Message
On 01/11/2018 05:26 AM, Bruno Haible wrote:
> Another idea is to add some header (like the current implementation does),
> but instead of storing a marker only in the malloc case, store a different
> marker also in the alloca case. This should be done through a GCC statement
> expression.
> => Should work with __builtin_alloca.

Could we please fix this issue along those lines? Thanks.

Florian
Bruno Haible
2018-02-02 18:35:59 UTC
Permalink
Raw Message
Hi Florian,

> Could we please fix this issue along those lines? Thanks.

Done:


2018-02-02 Bruno Haible <***@clisp.org>

malloca, xmalloca: Make multithread-safe.
Reported by Florian Weimer <***@redhat.com>.
Implements an idea by Ondřej Bílka <***@seznam.cz>.
* lib/malloca.h (malloca): In the stack allocation case, return a
pointer that is a multiple of 2 * sa_alignment_max.
(sa_increment): Remove enum item.
* lib/xmalloca.h (xmalloca): In the stack allocation case, return
a pointer that is a multiple of 2 * sa_alignment_max.
* lib/malloca.c (NO_SANITIZE_MEMORY): Remove macro.
(MAGIC_NUMBER, MAGIC_SIZE, preliminary_header, HEADER_SIZE, header,
HASH_TABLE_SIZE, mmalloca_results): Remove.
(small_t): New type.
(mmalloca, free): Rewritten.
* lib/malloca.valgrind: Remove file.
* modules/malloca (Files): Remove it.
(Depends-on): Remove verify.

diff --git a/lib/malloca.h b/lib/malloca.h
index 8278c58..cbc8fe7 100644
--- a/lib/malloca.h
+++ b/lib/malloca.h
@@ -56,8 +56,10 @@ extern "C" {
the function returns. Upon failure, it returns NULL. */
#if HAVE_ALLOCA
# define malloca(N) \
- ((N) < 4032 - sa_increment \
- ? (void *) ((char *) alloca ((N) + sa_increment) + sa_increment) \
+ ((N) < 4032 - (2 * sa_alignment_max - 1) \
+ ? (void *) (((uintptr_t) alloca ((N) + 2 * sa_alignment_max - 1) \
+ + (2 * sa_alignment_max - 1)) \
+ & ~(uintptr_t)(2 * sa_alignment_max - 1)) \
: mmalloca (N))
#else
# define malloca(N) \
@@ -119,10 +121,7 @@ enum
| (sa_alignment_longlong - 1)
#endif
| (sa_alignment_longdouble - 1)
- ) + 1,
-/* The increment that guarantees room for a magic word must be >= sizeof (int)
- and a multiple of sa_alignment_max. */
- sa_increment = ((sizeof (int) + sa_alignment_max - 1) / sa_alignment_max) * sa_alignment_max
+ ) + 1
};

#endif /* _MALLOCA_H */
diff --git a/lib/xmalloca.h b/lib/xmalloca.h
index 456f25b..14fc1b9 100644
--- a/lib/xmalloca.h
+++ b/lib/xmalloca.h
@@ -32,8 +32,10 @@ extern "C" {
the function returns. Upon failure, it exits with an error message. */
#if HAVE_ALLOCA
# define xmalloca(N) \
- ((N) < 4032 - sa_increment \
- ? (void *) ((char *) alloca ((N) + sa_increment) + sa_increment) \
+ ((N) < 4032 - (2 * sa_alignment_max - 1) \
+ ? (void *) (((uintptr_t) alloca ((N) + 2 * sa_alignment_max - 1) \
+ + (2 * sa_alignment_max - 1)) \
+ & ~(uintptr_t)(2 * sa_alignment_max - 1)) \
: xmmalloca (N))
extern void * xmmalloca (size_t n);
#else
diff --git a/lib/malloca.c b/lib/malloca.c
*** a/lib/malloca.c
--- b/lib/malloca.c
***************
*** 1,6 ****
/* Safe automatic memory allocation.
Copyright (C) 2003, 2006-2007, 2009-2018 Free Software Foundation, Inc.
! Written by Bruno Haible <***@clisp.org>, 2003.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
--- 1,6 ----
/* Safe automatic memory allocation.
Copyright (C) 2003, 2006-2007, 2009-2018 Free Software Foundation, Inc.
! Written by Bruno Haible <***@clisp.org>, 2003, 2018.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
***************
*** 21,112 ****
/* Specification. */
#include "malloca.h"

- #include <stdint.h>
-
- #include "verify.h"
-
- /* Silence a warning from clang's MemorySanitizer. */
- #if defined __has_feature
- # if __has_feature(memory_sanitizer)
- # define NO_SANITIZE_MEMORY __attribute__((no_sanitize("memory")))
- # endif
- #endif
- #ifndef NO_SANITIZE_MEMORY
- # define NO_SANITIZE_MEMORY
- #endif
-
/* The speed critical point in this file is freea() applied to an alloca()
result: it must be fast, to match the speed of alloca(). The speed of
mmalloca() and freea() in the other case are not critical, because they
! are only invoked for big memory sizes. */
!
! #if HAVE_ALLOCA
!
! /* Store the mmalloca() results in a hash table. This is needed to reliably
! distinguish a mmalloca() result and an alloca() result.
!
! Although it is possible that the same pointer is returned by alloca() and
! by mmalloca() at different times in the same application, it does not lead
! to a bug in freea(), because:
! - Before a pointer returned by alloca() can point into malloc()ed memory,
! the function must return, and once this has happened the programmer must
! not call freea() on it anyway.
! - Before a pointer returned by mmalloca() can point into the stack, it
! must be freed. The only function that can free it is freea(), and
! when freea() frees it, it also removes it from the hash table. */

! #define MAGIC_NUMBER 0x1415fb4a
! #define MAGIC_SIZE sizeof (int)
! /* This is how the header info would look like without any alignment
! considerations. */
! struct preliminary_header { void *next; int magic; };
! /* But the header's size must be a multiple of sa_alignment_max. */
! #define HEADER_SIZE \
! (((sizeof (struct preliminary_header) + sa_alignment_max - 1) / sa_alignment_max) * sa_alignment_max)
! union header {
! void *next;
! struct {
! char room[HEADER_SIZE - MAGIC_SIZE];
! int word;
! } magic;
! };
! verify (HEADER_SIZE == sizeof (union header));
! /* We make the hash table quite big, so that during lookups the probability
! of empty hash buckets is quite high. There is no need to make the hash
! table resizable, because when the hash table gets filled so much that the
! lookup becomes slow, it means that the application has memory leaks. */
! #define HASH_TABLE_SIZE 257
! static void * mmalloca_results[HASH_TABLE_SIZE];
!
! #endif

void *
mmalloca (size_t n)
{
#if HAVE_ALLOCA
! /* Allocate one more word, that serves as an indicator for malloc()ed
! memory, so that freea() of an alloca() result is fast. */
! size_t nplus = n + HEADER_SIZE;

if (nplus >= n)
{
! void *p = malloc (nplus);

! if (p != NULL)
{
! size_t slot;
! union header *h = p;
!
! p = h + 1;
!
! /* Put a magic number into the indicator word. */
! h->magic.word = MAGIC_NUMBER;
!
! /* Enter p into the hash table. */
! slot = (uintptr_t) p % HASH_TABLE_SIZE;
! h->next = mmalloca_results[slot];
! mmalloca_results[slot] = p;
!
return p;
}
}
--- 21,65 ----
/* Specification. */
#include "malloca.h"

/* The speed critical point in this file is freea() applied to an alloca()
result: it must be fast, to match the speed of alloca(). The speed of
mmalloca() and freea() in the other case are not critical, because they
! are only invoked for big memory sizes.
! Here we use a bit in the address as an indicator, an idea by Ondřej Bílka.
! malloca() can return three types of pointers:
! - Pointers ≡ 0 mod 2*sa_alignment_max come from stack allocation.
! - Pointers ≡ sa_alignment_max mod 2*sa_alignment_max come from heap
! allocation.
! - NULL comes from a failed heap allocation. */

! /* Type for holding very small pointer differences. */
! typedef unsigned char small_t;

void *
mmalloca (size_t n)
{
#if HAVE_ALLOCA
! /* Allocate one more word, used to determine the address to pass to freea(),
! and room for the alignment ≡ sa_alignment_max mod 2*sa_alignment_max. */
! size_t nplus = n + sizeof (small_t) + 2 * sa_alignment_max - 1;

if (nplus >= n)
{
! char *mem = (char *) malloc (nplus);

! if (mem != NULL)
{
! char *p =
! (char *)((((uintptr_t)mem + sizeof (small_t) + sa_alignment_max - 1)
! & ~(uintptr_t)(2 * sa_alignment_max - 1))
! + sa_alignment_max);
! /* Here p >= mem + sizeof (small_t),
! and p <= mem + sizeof (small_t) + 2 * sa_alignment_max - 1
! hence p + n <= mem + nplus.
! So, the memory range [p, p+n) lies in the allocated memory range
! [mem, mem + nplus). */
! ((small_t *) p)[-1] = p - mem;
! /* p ≡ sa_alignment_max mod 2*sa_alignment_max. */
return p;
}
}
***************
*** 122,159 ****
}

#if HAVE_ALLOCA
! void NO_SANITIZE_MEMORY
freea (void *p)
{
! /* mmalloca() may have returned NULL. */
! if (p != NULL)
{
! /* Attempt to quickly distinguish the mmalloca() result - which has
! a magic indicator word - and the alloca() result - which has an
! uninitialized indicator word. It is for this test that sa_increment
! additional bytes are allocated in the alloca() case. */
! if (((int *) p)[-1] == MAGIC_NUMBER)
! {
! /* Looks like a mmalloca() result. To see whether it really is one,
! perform a lookup in the hash table. */
! size_t slot = (uintptr_t) p % HASH_TABLE_SIZE;
! void **chain = &mmalloca_results[slot];
! for (; *chain != NULL;)
! {
! union header *h = p;
! if (*chain == p)
! {
! /* Found it. Remove it from the hash table and free it. */
! union header *p_begin = h - 1;
! *chain = p_begin->next;
! free (p_begin);
! return;
! }
! h = *chain;
! chain = &h[-1].next;
! }
! }
! /* At this point, we know it was not a mmalloca() result. */
}
}
#endif
--- 75,95 ----
}

#if HAVE_ALLOCA
! void
freea (void *p)
{
! /* Determine whether p was a non-NULL pointer returned by mmalloca(). */
! if ((uintptr_t) p & sa_alignment_max)
{
! void *mem = (char *) p - ((small_t *) p)[-1];
! free (mem);
}
}
#endif
+
+ /*
+ * Hey Emacs!
+ * Local Variables:
+ * coding: utf-8
+ * End:
+ */
diff --git a/lib/malloca.valgrind b/lib/malloca.valgrind
deleted file mode 100644
index 52f0a50..0000000
--- a/lib/malloca.valgrind
+++ /dev/null
@@ -1,7 +0,0 @@
-# Suppress a valgrind message about use of uninitialized memory in freea().
-# This use is OK because it provides only a speedup.
-{
- freea
- Memcheck:Cond
- fun:freea
-}
diff --git a/modules/malloca b/modules/malloca
index 9d4b40b..8f5ab64 100644
--- a/modules/malloca
+++ b/modules/malloca
@@ -4,7 +4,6 @@ Safe automatic memory allocation.
Files:
lib/malloca.h
lib/malloca.c
-lib/malloca.valgrind
m4/malloca.m4
m4/eealloc.m4
m4/longlong.m4
@@ -13,7 +12,6 @@ Depends-on:
alloca-opt
stdint
xalloc-oversized
-verify

configure.ac:
gl_MALLOCA
Paul Eggert
2018-02-02 22:10:08 UTC
Permalink
Raw Message
On 02/02/2018 10:35 AM, Bruno Haible wrote:
> Done:

Thanks. Some comments, with a proposed patch attached:
> # define malloca(N) \
> - ((N) < 4032 - sa_increment \
> - ? (void *) ((char *) alloca ((N) + sa_increment) + sa_increment) \
> + ((N) < 4032 - (2 * sa_alignment_max - 1) \
> + ? (void *) (((uintptr_t) alloca ((N) + 2 * sa_alignment_max - 1) \
> + + (2 * sa_alignment_max - 1)) \
> + & ~(uintptr_t)(2 * sa_alignment_max - 1)) \
> : mmalloca (N))

This can cause problems when -fcheck-pointer-bounds is in effect, since
converting a pointer to uintptr_t and back means that GCC won't connect
the resulting pointer to the original and this messes up bounds checking
on the result.
> ! /* Type for holding very small pointer differences. */
> ! typedef unsigned char small_t;
There should be a compile-time check guaranteeing that small_t is wide
enough.
> ! size_t nplus = n + sizeof (small_t) + 2 * sa_alignment_max - 1;
For expressions like these, it's a bit better to parenthesize the value
added to N, mostly because it makes it clearer to the reader that we're
just adding a constant. Also, on (admittedly-weird) platforms where
SIZE_MAX <= INT_MAX, it avoids undefined behavior in some
(admittedly-unusual) cases.

> ! void
> freea (void *p)
> {
> ! /* Determine whether p was a non-NULL pointer returned by mmalloca(). */
> ! if ((uintptr_t) p & sa_alignment_max)

This should be "((uintptr_t) p & (2 * sa_alignment_max - 1))", to make
it more likely that a runtime error is detected if a garbage pointer is
passed to freea.
Bruno Haible
2018-02-02 23:33:31 UTC
Permalink
Raw Message
Hi Paul,

> > ! void
> > freea (void *p)
> > {
> > ! /* Determine whether p was a non-NULL pointer returned by mmalloca(). */
> > ! if ((uintptr_t) p & sa_alignment_max)
>
> This should be "((uintptr_t) p & (2 * sa_alignment_max - 1))", to make
> it more likely that a runtime error is detected if a garbage pointer is
> passed to freea.

Changing the 'if' condition will not actually detect anything. The function
will still behave according to the "garbage in - garbage out" principle.
But you are right, it is possible here to detect invalid arguments. So let's
do so:


2018-02-02 Bruno Haible <***@clisp.org>

malloca: Add an argument check.
Suggested by Paul Eggert.
* lib/malloca.c (freea): Check against an invalid argument.

diff --git a/lib/malloca.c b/lib/malloca.c
index 5741cba..c5321d1 100644
--- a/lib/malloca.c
+++ b/lib/malloca.c
@@ -78,6 +78,12 @@ mmalloca (size_t n)
void
freea (void *p)
{
+ /* Check argument. */
+ if ((uintptr_t) p & (sa_alignment_max - 1))
+ {
+ /* p was not the result of a malloca() call. Invalid argument. */
+ abort ();
+ }
/* Determine whether p was a non-NULL pointer returned by mmalloca(). */
if ((uintptr_t) p & sa_alignment_max)
{
Bruno Haible
2018-02-02 23:41:47 UTC
Permalink
Raw Message
Paul Eggert wrote:
> > ! size_t nplus = n + sizeof (small_t) + 2 * sa_alignment_max - 1;
> For expressions like these, it's a bit better to parenthesize the value
> added to N, mostly because it makes it clearer to the reader that we're
> just adding a constant. Also, on (admittedly-weird) platforms where
> SIZE_MAX <= INT_MAX, it avoids undefined behavior in some
> (admittedly-unusual) cases.

Regarding the parentheses, I disagree: If we put parentheses they should
be like this:
size_t nplus = (n + sizeof (small_t)) + (2 * sa_alignment_max - 1);
because we want n + sizeof (small_t) consecutive bytes in memory, and the
other summand is for the alignment. Parenthesizing it in the way you suggest
would make the expression _more_ confusing.

I don't see any potential for undefined behaviour: we are taking a size_t
expression and adding a small constant (> 0, < 100). Undefined behaviour
in addition occurs only when signed integers overflow. If SIZE_MAX <= INT_MAX
we know that INT_MAX >= 2*SIZE_MAX-1 > SIZE_MAX + 100, therefore no 'int'
overflow is possible here.

Bruno
Paul Eggert
2018-02-02 23:59:01 UTC
Permalink
Raw Message
On 02/02/2018 03:41 PM, Bruno Haible wrote:
> Regarding the parentheses, I disagree: If we put parentheses they should
> be like this:
> size_t nplus = (n + sizeof (small_t)) + (2 * sa_alignment_max - 1);
> because we want n + sizeof (small_t) consecutive bytes in memory, and the
> other summand is for the alignment. Parenthesizing it in the way you suggest
> would make the expression_more_ confusing.

Well, it is a matter of style. Personally I find the expression
confusing and would find it even more confusing with the extra
parentheses. But perhaps that is because I am worried about integer
overflow.

> If SIZE_MAX <= INT_MAX
> we know that INT_MAX >= 2*SIZE_MAX-1 > SIZE_MAX + 100, therefore no 'int'
> overflow is possible here.

I was thinking about platforms where SIZE_MAX == INT_MAX, which POSIX
and ISO C both allow; on such platforms 'int' overflow is possible.
Admittedly platforms with idiosyncrasies like that are rare nowadays. I
think Unisys stopped selling their oddball platforms in late 2015.
Bruno Haible
2018-02-02 23:49:36 UTC
Permalink
Raw Message
Hi Paul,

> > ! /* Type for holding very small pointer differences. */
> > ! typedef unsigned char small_t;
> There should be a compile-time check guaranteeing that small_t is wide
> enough.

OK, if you like. Applied:


2018-02-02 Paul Eggert <***@cs.ucla.edu>

malloca: Add a compile-time verification.
* lib/malloca.c (small_t): Verify that it is wide enough.
* modules/malloca (Depends-on): Add verify.

diff --git a/lib/malloca.c b/lib/malloca.c
index c5321d1..c66e0c8 100644
--- a/lib/malloca.c
+++ b/lib/malloca.c
@@ -21,6 +21,8 @@
/* Specification. */
#include "malloca.h"

+#include "verify.h"
+
/* The speed critical point in this file is freea() applied to an alloca()
result: it must be fast, to match the speed of alloca(). The speed of
mmalloca() and freea() in the other case are not critical, because they
@@ -34,6 +36,8 @@

/* Type for holding very small pointer differences. */
typedef unsigned char small_t;
+/* Verify that it is wide enough. */
+verify (2 * sa_alignment_max - 1 <= (small_t) -1);

void *
mmalloca (size_t n)
diff --git a/modules/malloca b/modules/malloca
index 8f5ab64..0ae3fe0 100644
--- a/modules/malloca
+++ b/modules/malloca
@@ -11,6 +11,7 @@ m4/longlong.m4
Depends-on:
alloca-opt
stdint
+verify
xalloc-oversized

configure.ac:
Bruno Haible
2018-02-03 00:03:41 UTC
Permalink
Raw Message
Hi Paul,

> This can cause problems when -fcheck-pointer-bounds is in effect, since
> converting a pointer to uintptr_t and back means that GCC won't connect
> the resulting pointer to the original and this messes up bounds checking
> on the result.

To be precise: What do you mean by "cause problems" and "messes up bounds
checking"? As far as I understand, it will disable bounds checking on the
returned pointer and its derivatives, right?

Speaking of bounds checking, the code (with or without your patch) will
not provide optimal bounds checking, because a pointer access to the
memory range that we added merely for alignment will not be reported as
an error. AFAIU, we need to tell GCC about the actual bounds, by use of
the functions listed in [1].

[1] https://gcc.gnu.org/onlinedocs/gcc/Pointer-Bounds-Checker-builtins.html

How about this? Will this work?


diff --git a/lib/malloca.c b/lib/malloca.c
index c66e0c8..411bee0 100644
--- a/lib/malloca.c
+++ b/lib/malloca.c
@@ -64,7 +64,13 @@ mmalloca (size_t n)
[mem, mem + nplus). */
((small_t *) p)[-1] = p - mem;
/* p ≡ sa_alignment_max mod 2*sa_alignment_max. */
+# if __GNUC__ >= 5 && !defined __cplusplus && !defined __clang__
+ /* Tell GCC about the allowed memory accesses based on p,
+ if -fcheck-pointer-bounds is in effect. */
+ return __builtin___bnd_set_ptr_bounds (p, n);
+# else
return p;
+# endif
}
}
/* Out of memory. */
Paul Eggert
2018-02-03 00:27:18 UTC
Permalink
Raw Message
On 02/02/2018 04:03 PM, Bruno Haible wrote:
> What do you mean by "cause problems" and "messes up bounds
> checking"? As far as I understand, it will disable bounds checking on the
> returned pointer and its derivatives, right?

I'm operating from memory here (my work desktop doesn't have MPX, nor do
my school's servers), but as I recall GCC sometimes generated a pointer
that had no bounds checking, and sometimes generated a pointer that
could not be dereferenced. Both behaviors conform to ISO C.

> How about this? Will this work?

Yes and no. In the sense of just getting -fcheck-pointer-bounds to work
with GCC, it'll need some work and testing but is on the right path. For
example, it should be safer to narrow the pointer than to set its bounds
(this is assuming P currently has no bounds checking). Also, freea will
need to widen its argument to dereference the alignment byte that
precedes the memory block.

One other thing. An advantage of the #ifdef __CHKP__ code I suggested is
that it never calculates a pointer outside the bounds of the
newly-allocated block (or to just past the block). Such calculations
violate the C standard, and I wouldn't be surprised if GCC or some other
fancy optimizer exploits this to generate code to do the "wrong" thing
with these calculations. With that in mind, I suppose in hindsight that
my patch should have said "#ifdef __GNUC__" instead of "#ifdef __CHKP__".

By the way, why write "if __GNUC__ >= 5 && !defined __cplusplus &&
!defined __clang__" instead of "ifdef __CHKP__"? The latter is easier to
read....
Bruno Haible
2018-02-03 00:57:45 UTC
Permalink
Raw Message
Hi Paul,

> By the way, why write "if __GNUC__ >= 5 && !defined __cplusplus &&
> !defined __clang__" instead of "ifdef __CHKP__"? The latter is easier to
> read....

I couldn't find any documentation for this __CHKP__ macro.

> An advantage of the #ifdef __CHKP__ code I suggested is
> that it never calculates a pointer outside the bounds of the
> newly-allocated block (or to just past the block). Such calculations
> violate the C standard

The code that I committed does not have such "bad" pointers in intermediate
expressions either. It computes a valid pointer, converts it to uintptr_t,
does some arithmetic on it, and then converts back to a pointer. Since the
resulting pointer is in the malloc'ed range, it is valid.

Bruno
Paul Eggert
2018-02-04 16:58:50 UTC
Permalink
Raw Message
Bruno Haible wrote:
> Hi Paul,
>
>> By the way, why write "if __GNUC__ >= 5 && !defined __cplusplus &&
>> !defined __clang__" instead of "ifdef __CHKP__"? The latter is easier to
>> read....
>
> I couldn't find any documentation for this __CHKP__ macro.

It's not documented in the GCC manual. I found documentation for it here:

https://software.intel.com/en-us/articles/intel-memory-protection-extensions-enabling-guide

This points to a freely-readable PDF.

> The code that I committed does not have such "bad" pointers in intermediate
> expressions either. It computes a valid pointer, converts it to uintptr_t,
> does some arithmetic on it, and then converts back to a pointer. Since the
> resulting pointer is in the malloc'ed range, it is valid.

Ah, sorry, I was thinking about my translation of that into code that did not
convert pointers back from uintptr_t.
Bruno Haible
2018-02-03 12:07:58 UTC
Permalink
Raw Message
Paul Eggert wrote:
> converting a pointer to uintptr_t and back means that GCC won't connect
> the resulting pointer to the original and this messes up bounds checking
> on the result.

I don't observe this. This test program:

==============================================
#include <stdlib.h>
#include <stdint.h>

int main ()
{
int n = 100;
char *mem = malloc (n);
char* p = (char *) (uintptr_t) mem;
p[-2] = 'x';
p[n] = 'z';
return 0;
}
==============================================
when compiled with "gcc -mmpx -fcheck-pointer-bounds" (gcc 7.3.0),
produces

Saw a #BR! status 1 at 0x4005c8
Saw a #BR! status 1 at 0x4005e4

Bruno
Paul Eggert
2018-02-04 17:11:18 UTC
Permalink
Raw Message
Bruno Haible wrote:
> Paul Eggert wrote:
>> converting a pointer to uintptr_t and back means that GCC won't connect
>> the resulting pointer to the original and this messes up bounds checking
>> on the result.
> I don't observe this. This test program:

The problem typically doesn't occur with simple casts of pointer to integer and
back because these are optimized away. It occurs only when the compiler doesn't
determine that the result was derived from the original pointer. This requires
more-complicated test code. If you're interested in exploring this, you might
take a look at the GNU Emacs source code master branch, which has a few ifdefs
for __CHKP__ and which avoids the conversions in question for the __CHKP__ case.

I remember long ago I was skeptical when Jim Meyering changed some code to avoid
signed integer overflow when wraparound yielded the correct answer. I wondered,
"what compiler would ever care?" Well, Jim was right: eventually some compilers
started to care. I view conversion from intptr_t to pointer as being in a
similar category: if we avoid these conversions now we will be saving some work
in the future.
Bruno Haible
2018-02-04 18:33:54 UTC
Permalink
Raw Message
Hi Paul,

> I view conversion from intptr_t to pointer as being in a
> similar category: if we avoid these conversions now we will be saving some work
> in the future.

If you can't cast from 'void *' to 'uintptr_t' and back, the language is no
longer C any more. Indeed, with the added __bnd_* primitives, the language is
going to Ada way.

It will be useful to revisit the ca. 90 casts to [u]intptr_t that we have in gnulib
when the shape of the new language will become clearer. Maybe it will have
sizeof (type *) == 3 * sizeof (void *), who knows?

For now, given that the MPX stuff is considered experimental, I think it's too early
to rewrite code for its sake.

Bruno
Paul Eggert
2018-02-04 19:28:11 UTC
Permalink
Raw Message
Bruno Haible wrote:
> If you can't cast from 'void *' to 'uintptr_t' and back, the language is no
> longer C any more.

Not really. C has been like that ever since uintptr_t was introduced back in
C99, as uintptr_t has always been optional. Even when uintptr_t is present I
doubt whether gcc -fcheck-pointer-bounds is the only implementation where
arithmetic on uintptr_t does not work as it does in the classic
flat-address-space model. It would be nicer if Gnulib worked on these
implementations, as they do conform to the standard.

That being said, I'm not planning to worry about this any more unless it affects
the applications I help maintain. As long as Coreutils, GNU Emacs etc. do not
use the modules or do not run into problems on any of the platforms that I check
occasionally, it's not that big of a deal.
Florian Weimer
2018-02-03 12:28:45 UTC
Permalink
Raw Message
On 02/02/2018 11:10 PM, Paul Eggert wrote:
> This can cause problems when -fcheck-pointer-bounds is in effect, since
> converting a pointer to uintptr_t and back means that GCC won't connect
> the resulting pointer to the original and this messes up bounds checking
> on the result.

-fcheck-pointer-bounds in GCC doesn't really work. The existing
implementation is barely a research prototype (for example, most string
functions are not protected by it), and I don't think anyone knows how
to make it thread-safe. Its existence shouldn't be used as a guidance
for anything, really.

Thanks,
Florian
Bruno Haible
2018-02-03 12:58:34 UTC
Permalink
Raw Message
Hello Florian,

> -fcheck-pointer-bounds in GCC doesn't really work. The existing
> implementation is barely a research prototype (for example, most string
> functions are not protected by it)

Thanks for the info. I was just setting up a build of gnulib with

CC="gcc -mmpx -fcheck-pointer-bounds"
CXX="g++ -mmpx -fcheck-pointer-bounds"

Good to know that it would be a waste of time.

Bruno
Bruno Haible
2018-02-04 01:36:47 UTC
Permalink
Raw Message
> I was just setting up a build of gnulib with
>
> CC="gcc -mmpx -fcheck-pointer-bounds"
> CXX="g++ -mmpx -fcheck-pointer-bounds"

The build completed. It did not report any problems, except for
test-fprintf-posix2 and test-printf-posix2, which go into an endless
SIGSEGV loop right at
arg = atoi (argv[1]);
(apparently because the malloc() which is needed to allocate the memory
for bounds fails).

Bruno
Paul Eggert
2018-02-04 17:06:51 UTC
Permalink
Raw Message
Florian Weimer wrote:
>
> -fcheck-pointer-bounds in GCC doesn't really work.

I agree that -fcheck-pointer-bounds is not suitable for production code.
However, I am not as skeptical about its utility. I found a real bug in GNU
Emacs with it, a bug that was not found with other checkers. So I view it as a
sometimes-useful (if often-flawed) tool.
Bruno Haible
2018-01-11 04:31:42 UTC
Permalink
Raw Message
Florian Weimer wrote:
> freea also causes valgrind warnings because it contains an out-of-bounds
> access. This is very undesirable because it will cause programmers to
> miss real bugs.

As a mitigation, the 'malloca' module contains a file, malloca.valgrind, that
tells valgrind to ignore these accesses to "uninitialized memory".

Bruno
Loading...