Discussion:
[libvirt] [PATCH] build: exclude more files from all the syntax checks
(too old to reply)
Eric Blake
2017-10-06 14:47:53 UTC
Permalink
Raw Message
The majority of the syntax check is taylored for C sources, so some of
the checks already cause false positives for non-C sources (and thus
there are exclusion regexps in place).
- pot files: they are templates for po files (already excluded), and
they are automatically generated from sources
- pl files: Perl sources, which have own APIs, style, etc; they are
helper scripts, not "real" sources
- spec/spec.in files: RPM packaging files
- js/woff/html.in files: files for web pages
- diff/patch files: patches
- stp files: SystemTap scripts
- syms files: linker symbols files
- conf files: generic configuration files
- data/cpuinfo files: procinfo/cpuinfo files
There are still some useful syntax checks for performing on ALL files
(for example, prohibit_doubled_word). So I'm not quite sure that
blindly exempting these files from all possible checks makes sense.

Maybe it's worth teaching upstream gnulib syntax-check to make it easier
to auto-exclude non-C files from checks that ARE specific to the C
language, without weakening the global checks that are good on all
files. Maybe even something as simple as adding some sort of language=
tag to feed to $(_sc_search_regexp (if not specified, run on all files,
but if specified as C, the syntax-check is specific to C-like files, so
it limits to .h, .c. .y).

I'm adding the gnulib list to get feedback on the idea; maybe Jim
Meyering has an opinion as one of the original syntax-check authors.
Python files (.py) are left allowed, since there is at least one syntax
check specifically for them.
---
cfg.mk | 29 +++++++++++++----------------
1 file changed, 13 insertions(+), 16 deletions(-)
diff --git a/cfg.mk b/cfg.mk
index 0f4065b98..44a19594e 100644
--- a/cfg.mk
+++ b/cfg.mk
@@ -91,7 +91,7 @@ endif
# Files that should never cause syntax check failures.
VC_LIST_ALWAYS_EXCLUDE_REGEX = \
- (^(docs/(news(-[0-9]*)?\.html\.in|.*\.patch))|\.(po|fig|gif|ico|png))$$
+ \.(po|fig|gif|ico|png|pot|pl|spec|spec\.in|js|woff|diff|patch|html\.in|stp|syms|conf|data|cpuinfo)$$
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
Jim Meyering
2017-10-06 17:07:06 UTC
Permalink
Raw Message
Post by Eric Blake
The majority of the syntax check is taylored for C sources, so some of
the checks already cause false positives for non-C sources (and thus
there are exclusion regexps in place).
- pot files: they are templates for po files (already excluded), and
they are automatically generated from sources
- pl files: Perl sources, which have own APIs, style, etc; they are
helper scripts, not "real" sources
- spec/spec.in files: RPM packaging files
- js/woff/html.in files: files for web pages
- diff/patch files: patches
- stp files: SystemTap scripts
- syms files: linker symbols files
- conf files: generic configuration files
- data/cpuinfo files: procinfo/cpuinfo files
There are still some useful syntax checks for performing on ALL files
(for example, prohibit_doubled_word). So I'm not quite sure that
blindly exempting these files from all possible checks makes sense.
Maybe it's worth teaching upstream gnulib syntax-check to make it easier
to auto-exclude non-C files from checks that ARE specific to the C
language, without weakening the global checks that are good on all
files. Maybe even something as simple as adding some sort of language=
tag to feed to $(_sc_search_regexp (if not specified, run on all files,
but if specified as C, the syntax-check is specific to C-like files, so
it limits to .h, .c. .y).
I'm adding the gnulib list to get feedback on the idea; maybe Jim
Meyering has an opinion as one of the original syntax-check authors.
Hi Eric,
Is there some reason not to use a directive like this in a rule
applying exclusively to version-controlled C-like files?

in_vc_files='\.[chly]$$'

I looked at libvirt's cfg.mk, and if you add that line to the
sc_prohibit_sprintf rule, you can then remove the lines that exempt
files with unrelated suffixes from that rule:

exclude_file_name_regexp--sc_prohibit_sprintf = \
^(cfg\.mk|docs/hacking\.html\.in|.*\.stp|.*\.pl)$$

Another rule that can catch things in any non-binary file is
sc_prohibit_undesirable_word_seq, even if it's only for pet peeves
like detecting "can not".

...
Post by Eric Blake
# Files that should never cause syntax check failures.
VC_LIST_ALWAYS_EXCLUDE_REGEX = \
- (^(docs/(news(-[0-9]*)?\.html\.in|.*\.patch))|\.(po|fig|gif|ico|png))$$
+ \.(po|fig|gif|ico|png|pot|pl|spec|spec\.in|js|woff|diff|patch|html\.in|stp|syms|conf|data|cpuinfo)$$
Eric Blake
2017-10-06 17:57:38 UTC
Permalink
Raw Message
Post by Jim Meyering
Post by Eric Blake
The majority of the syntax check is taylored for C sources, so some of
the checks already cause false positives for non-C sources (and thus
there are exclusion regexps in place).
Maybe it's worth teaching upstream gnulib syntax-check to make it easier
to auto-exclude non-C files from checks that ARE specific to the C
language, without weakening the global checks that are good on all
files. Maybe even something as simple as adding some sort of language=
tag to feed to $(_sc_search_regexp (if not specified, run on all files,
but if specified as C, the syntax-check is specific to C-like files, so
it limits to .h, .c. .y).
I'm adding the gnulib list to get feedback on the idea; maybe Jim
Meyering has an opinion as one of the original syntax-check authors.
Hi Eric,
Is there some reason not to use a directive like this in a rule
applying exclusively to version-controlled C-like files?
in_vc_files='\.[chly]$$'
So we already have the mechanism - we just have to use it in the right
places ;)

Some of the existing maint.mk rules don't use it (for a quick example,
sc_cast_of_x_alloc_return_value, sc_cast_of_alloca_return_value) - but
it may also be a question of how often these rules are firing on non-C
files to let us know that we even have false positives.
Post by Jim Meyering
I looked at libvirt's cfg.mk, and if you add that line to the
sc_prohibit_sprintf rule, you can then remove the lines that exempt
Indeed, that's probably the best fix for libvirt: for any rule in cfg.mk
that we want to run only on C files, use in_vc_files= to make it obvious.
Post by Jim Meyering
exclude_file_name_regexp--sc_prohibit_sprintf = \
^(cfg\.mk|docs/hacking\.html\.in|.*\.stp|.*\.pl)$$
Another rule that can catch things in any non-binary file is
sc_prohibit_undesirable_word_seq, even if it's only for pet peeves
like detecting "can not".
And sc_space_tab is globally useful, etc.
Post by Jim Meyering
...
Post by Eric Blake
# Files that should never cause syntax check failures.
VC_LIST_ALWAYS_EXCLUDE_REGEX = \
- (^(docs/(news(-[0-9]*)?\.html\.in|.*\.patch))|\.(po|fig|gif|ico|png))$$
+ \.(po|fig|gif|ico|png|pot|pl|spec|spec\.in|js|woff|diff|patch|html\.in|stp|syms|conf|data|cpuinfo)$$
There ARE some files that are always worth excluding: the list of
graphic binary files (such as .png) makes obvious sense. But it's a
high bar to globally exclude a file from all syntax checks, especially
when it's relatively easy to write specific checks to be language specific.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
Loading...