Discussion:
Memleak in glob()
Tim Rühsen
2017-07-01 18:44:27 UTC
Permalink
Hi,

fuzzing glob.c immediately discovered a leak.

At ~L600 in glob.c, 'dirname' is heap allocated.
It is free'd at label 'out', but some code paths directly return without
jumping there.

Attached is a patch fixing the issue for me, but just take it as a proof of
concept. You might prefer a different approach.

Regards, Tim
Bruno Haible
2017-07-02 00:34:17 UTC
Permalink
Hi Tim,
Post by Tim Rühsen
Attached is a patch fixing the issue for me
The 'return 0;' in line 1104 is correct, because 'dirname' is stuffed into
the result array in the lines before. Therefore dirname must NOT be freed
here.

For the other ones, you may be right, but I wonder why scan.coverity.com did
not display them? Probably we need to upload a new build to scan.coverity.com.
Which brings me to a status question: Did you find volunteers for the triage
of the "defects" on scan.coverity.com? I stopped looking into it when you
said you would find some people.

Ultimately, for functions as complex as glob(), I now trust coverity.com
more than any person's review (including mine myself). Therefore I would like
to pursue the approach of using scan.coverity.com.

Bruno
Tim Rühsen
2017-07-02 10:40:32 UTC
Permalink
Hi Bruno,
Post by Bruno Haible
Hi Tim,
Post by Tim Rühsen
Attached is a patch fixing the issue for me
The 'return 0;' in line 1104 is correct, because 'dirname' is stuffed into
the result array in the lines before. Therefore dirname must NOT be freed
here.
This is right for one path, but not all the paths before 'return 0' stuff
'dirname' into that array.
IMO, the correct approach would be to reset 'malloc_dirname' whenever
'dirname' is stuffed/moved somewhere. In code that I am responsible for,
especially complex code as glob(), I also set pointers to NULL after free() or
when assigning/moving to another variable.
Post by Bruno Haible
For the other ones, you may be right, but I wonder why scan.coverity.com did
not display them? Probably we need to upload a new build to
scan.coverity.com. Which brings me to a status question: Did you find
volunteers for the triage of the "defects" on scan.coverity.com? I stopped
looking into it when you said you would find some people.
Sadly, I was too optimistic. I didn't forget about, and working on those is
still on my list. But currently I am overwhelmed with stuff that has higher
priority.
Post by Bruno Haible
Ultimately, for functions as complex as glob(), I now trust coverity.com
more than any person's review (including mine myself). Therefore I would
like to pursue the approach of using scan.coverity.com.
That's always a good idea. You could set up a cron job to automatically upload
to Coverity, e.g. once a week or once a day. You can find instructions one the
'upload for file analysis' page.

I am currently into fuzzing which is not only good for finding *real* issues
(leaks, slowness, illegal memory access, integer overflow, ...). Fuzzing also
provides you with test data that (nearly) fully covers all code paths. Those
test inputs can (and should) be integrated into a project's test suite.

Are you going to GHM ? Wish, I could meet you there.

Regards, Tim
Bruno Haible
2017-07-02 12:00:27 UTC
Permalink
Hi Tim,
Post by Tim Rühsen
Post by Bruno Haible
Did you find
volunteers for the triage of the "defects" on scan.coverity.com? I stopped
looking into it when you said you would find some people.
Sadly, I was too optimistic. I didn't forget about, and working on those is
still on my list. But currently I am overwhelmed with stuff that has higher
priority.
OK, this means the "find some people" is not going to happen soon. That I'll
continue the review on scan.coverity.com (as I don't find it _that_ low
priority, especially when I can concentrate on specific categories of issues).
Post by Tim Rühsen
You could set up a cron job to automatically upload
to Coverity, e.g. once a week or once a day. You can find instructions one the
'upload for file analysis' page.
Good idea. I haven't found a 'upload for file analysis' page, but the
instructions (including the mandatory, project specific token) are on the
project's "Upload a Project Build" page. In progress...
Post by Tim Rühsen
Post by Bruno Haible
The 'return 0;' in line 1104 is correct, because 'dirname' is stuffed into
the result array in the lines before. Therefore dirname must NOT be freed
here.
This is right for one path, but not all the paths before 'return 0' stuff
'dirname' into that array.
Ah, indeed. We should really rerun coverity after making changes to glob.c.
Post by Tim Rühsen
In code that I am responsible for,
especially complex code as glob(), I also set pointers to NULL after free() or
when assigning/moving to another variable.
Such a coding style does not help. What you need is to reduce the complexity,
either by introducing inline functions, or by at least reducing the scope
of the variables in the big function.

The only situations where "set pointers to NULL after free()" is useful are:
- when you're dealing with global variables or heap-allocated struct members
(or class members in C++), or
- when there is a (conservative) garbage collector around.

Bruno
Tim Rühsen
2017-07-02 15:20:45 UTC
Permalink
Hi Bruno,
Post by Bruno Haible
Post by Tim Rühsen
In code that I am responsible for,
especially complex code as glob(), I also set pointers to NULL after
free() or when assigning/moving to another variable.
Such a coding style does not help. What you need is to reduce the
complexity, either by introducing inline functions, or by at least reducing
the scope of the variables in the big function.
- when you're dealing with global variables or heap-allocated struct
members (or class members in C++), or
- when there is a (conservative) garbage collector around.
What I meant (in general) is the avoidance of 'use after free' [1] issues.
Not leaving around dangling pointers avoids this, so erroneous code is more
unlikely to become a security issue (it's just a segmentation fault then).

In glob() the code is not resetting the dirname memory policy flag when
'dirname' is stuffed into another variable. No doing so can easily lead to
either a double free and/or a use after free.

Regards, Tim

[1] https://www.owasp.org/index.php/Using_freed_memory
Bruno Haible
2017-07-06 21:25:39 UTC
Permalink
Post by Tim Rühsen
Post by Bruno Haible
Post by Tim Rühsen
Attached is a patch fixing the issue for me
The 'return 0;' in line 1104 is correct, because 'dirname' is stuffed into
the result array in the lines before. Therefore dirname must NOT be freed
here.
This is right for one path, but not all the paths before 'return 0' stuff
'dirname' into that array.
Fixed like this. Let's see what remaining issues Coverity reports in glob.c
(next Monday).


2017-07-06 Bruno Haible <***@clisp.org>

glob: Fix more memory leaks.
* lib/glob.c (glob): Free dirname before returning.
Reported by Coverity and Tim Rühsen.

diff --git a/lib/glob.c b/lib/glob.c
index dc0aff6..a38cf22 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -1091,6 +1091,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
p = mempcpy (pglob->gl_pathv[newcount], dirname, dirlen);
p[0] = '/';
p[1] = '\0';
+ if (__glibc_unlikely (malloc_dirname))
+ free (dirname);
}
else
{
Bruno Haible
2017-07-10 17:09:59 UTC
Permalink
Hi Tim,
Post by Bruno Haible
Fixed like this. Let's see what remaining issues Coverity reports in glob.c
(next Monday).
Now, coverity does not report any resource leak for lib/glob.c any more. But
IMO your analysis about the memory leak is still mostly correct. Except around
line 1070, where your patch would add a double-free, I think you're right.

So, I've applied this change. I'm not introducing a new macro RETURNVAL
because that does not seem to fit with the glibc coding style.


2017-07-10 Tim Rühsen <***@gmx.de>
Bruno Haible <***@clisp.org>

glob: Fix more memory leaks.
* lib/glob.c (glob): Use 'goto out' in order to free dirname before
returning.
Reported by Tim Rühsen.

diff --git a/lib/glob.c b/lib/glob.c
index a38cf22..3b3194a 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -1039,9 +1039,12 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
free (malloc_pwtmpbuf);

if (flags & GLOB_TILDE_CHECK)
- /* We have to regard it as an error if we cannot find the
- home directory. */
- return GLOB_NOMATCH;
+ {
+ /* We have to regard it as an error if we cannot find the
+ home directory. */
+ retval = GLOB_NOMATCH;
+ goto out;
+ }
}
}
}
@@ -1068,12 +1071,11 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
if (newcount > SIZE_MAX / sizeof (char *) - 2)
{
nospace:
- if (__glibc_unlikely (malloc_dirname))
- free (dirname);
free (pglob->gl_pathv);
pglob->gl_pathv = NULL;
pglob->gl_pathc = 0;
- return GLOB_NOSPACE;
+ retval = GLOB_NOSPACE;
+ goto out;
}

new_gl_pathv = realloc (pglob->gl_pathv,
@@ -1113,7 +1115,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
}

/* Not found. */
- return GLOB_NOMATCH;
+ retval = GLOB_NOMATCH;
+ goto out;
}

meta = __glob_pattern_type (dirname, !(flags & GLOB_NOESCAPE));
@@ -1159,7 +1162,10 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
if (status != 0)
{
if ((flags & GLOB_NOCHECK) == 0 || status != GLOB_NOMATCH)
- return status;
+ {
+ retval = status;
+ goto out;
+ }
goto no_matches;
}

@@ -1178,7 +1184,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
if (interrupt_state)
{
globfree (&dirs);
- return GLOB_ABORTED;
+ retval = GLOB_ABORTED;
+ goto out;
}
}
#endif /* SHELL. */
@@ -1197,7 +1204,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
globfree (&dirs);
globfree (pglob);
pglob->gl_pathc = 0;
- return status;
+ retval = status;
+ goto out;
}

/* Stick the directory on the front of each name. */
@@ -1208,7 +1216,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
globfree (&dirs);
globfree (pglob);
pglob->gl_pathc = 0;
- return GLOB_NOSPACE;
+ retval = GLOB_NOSPACE;
+ goto out;
}
}

@@ -1230,7 +1239,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
{
nospace2:
globfree (&dirs);
- return GLOB_NOSPACE;
+ retval = GLOB_NOSPACE;
+ goto out;
}

new_gl_pathv = realloc (pglob->gl_pathv,
@@ -1245,7 +1255,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
globfree (&dirs);
globfree (pglob);
pglob->gl_pathc = 0;
- return GLOB_NOSPACE;
+ retval = GLOB_NOSPACE;
+ goto out;
}

++pglob->gl_pathc;
@@ -1257,7 +1268,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
else
{
globfree (&dirs);
- return GLOB_NOMATCH;
+ retval = GLOB_NOMATCH;
+ goto out;
}
}

@@ -1303,7 +1315,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
flags = orig_flags;
goto no_matches;
}
- return status;
+ retval = status;
+ goto out;
}

if (dirlen > 0)
@@ -1315,7 +1328,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
{
globfree (pglob);
pglob->gl_pathc = 0;
- return GLOB_NOSPACE;
+ retval = GLOB_NOSPACE;
+ goto out;
}
}
}
@@ -1340,7 +1354,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
{
globfree (pglob);
pglob->gl_pathc = 0;
- return GLOB_NOSPACE;
+ retval = GLOB_NOSPACE;
+ goto out;
}
strcpy (&new[len - 2], "/");
pglob->gl_pathv[i] = new;

Paul Eggert
2017-07-02 23:22:52 UTC
Permalink
Post by Tim Rühsen
Hi,
fuzzing glob.c immediately discovered a leak.
At ~L600 in glob.c, 'dirname' is heap allocated.
It is free'd at label 'out', but some code paths directly return without
jumping there.
Attached is a patch fixing the issue for me, but just take it as a proof of
concept. You might prefer a different approach.
Regards, Tim
glob.c is taken from glibc, right? Have you investigated whether these
problems have been reported and/or fixed in glibc?
Tim Rühsen
2017-07-03 09:05:21 UTC
Permalink
Post by Paul Eggert
Post by Tim Rühsen
Hi,
fuzzing glob.c immediately discovered a leak.
At ~L600 in glob.c, 'dirname' is heap allocated.
It is free'd at label 'out', but some code paths directly return without
jumping there.
Attached is a patch fixing the issue for me, but just take it as a proof of
concept. You might prefer a different approach.
Regards, Tim
glob.c is taken from glibc, right? Have you investigated whether these
problems have been reported and/or fixed in glibc?
I don't know if glibc takes the code from gnulib or the other way round.
But a quick look at [1] around L1012 looks like the same issue in glibc.

[1] https://code.woboq.org/userspace/glibc/posix/glob.c.html

Regards, Tim
Tim Rühsen
2017-07-03 09:25:54 UTC
Permalink
Password reset at
https://sourceware.org/bugzilla/enter_bug.cgi?product=glibc currently
doesn't work (yes, also checked in spam dir). Don't remember my pw, too
long ago.


So maybe someone else can open a bug with a link to this thread !?


With Best Regards, Tim
Post by Paul Eggert
Post by Tim Rühsen
Hi,
fuzzing glob.c immediately discovered a leak.
At ~L600 in glob.c, 'dirname' is heap allocated.
It is free'd at label 'out', but some code paths directly return without
jumping there.
Attached is a patch fixing the issue for me, but just take it as a proof of
concept. You might prefer a different approach.
Regards, Tim
glob.c is taken from glibc, right? Have you investigated whether these
problems have been reported and/or fixed in glibc?
Loading...