Discussion:
[PATCH] dfa: addition of new state on demand
(too old to reply)
Norihiro Tanaka
2016-10-17 02:45:43 UTC
Permalink
Raw Message
When dfa builds a state, generates all next states. However, I believe
most of them are not used.

This patch changes as that when dfa builds a state, generates a next
state including next input character only.

The following test was improved from 2.52s to 0.67s by the patch in my
machine.

$ seq -f '%g bottles of beer on the wall' 600 >600
$ time -p env LC_ALL=C src/grep -vf 600 600

$ env LC_ALL=C gcc -v
Reading specs from /usr/local/lib/gcc/x86_64-pc-linux-gnu/4.4.7/specs
Target: x86_64-pc-linux-gnu
Configured with: ./configure --with-as=/usr/local/bin/as --with-ld=/usr/local/bin/ld --with-system-zlib --enable-__cxa_atexit
Thread model: posix
gcc version 4.4.7 (GCC)
$ uname -a
Linux rhel6 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Norihiro Tanaka
2016-10-17 13:00:33 UTC
Permalink
Raw Message
On Mon, 17 Oct 2016 11:45:43 +0900
Post by Norihiro Tanaka
When dfa builds a state, generates all next states. However, I believe
most of them are not used.
This patch changes as that when dfa builds a state, generates a next
state including next input character only.
The following test was improved from 2.52s to 0.67s by the patch in my
machine.
$ seq -f '%g bottles of beer on the wall' 600 >600
$ time -p env LC_ALL=C src/grep -vf 600 600
$ env LC_ALL=C gcc -v
Reading specs from /usr/local/lib/gcc/x86_64-pc-linux-gnu/4.4.7/specs
Target: x86_64-pc-linux-gnu
Configured with: ./configure --with-as=/usr/local/bin/as --with-ld=/usr/local/bin/ld --with-system-zlib --enable-__cxa_atexit
Thread model: posix
gcc version 4.4.7 (GCC)
$ uname -a
Linux rhel6 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Hi,

I updated comments in previous patch.

Thanks,
Norihiro
Norihiro Tanaka
2016-11-21 14:05:35 UTC
Permalink
Raw Message
On Mon, 17 Oct 2016 22:00:33 +0900
Post by Norihiro Tanaka
On Mon, 17 Oct 2016 11:45:43 +0900
Post by Norihiro Tanaka
When dfa builds a state, generates all next states. However, I believe
most of them are not used.
This patch changes as that when dfa builds a state, generates a next
state including next input character only.
The following test was improved from 2.52s to 0.67s by the patch in my
machine.
$ seq -f '%g bottles of beer on the wall' 600 >600
$ time -p env LC_ALL=C src/grep -vf 600 600
$ env LC_ALL=C gcc -v
Reading specs from /usr/local/lib/gcc/x86_64-pc-linux-gnu/4.4.7/specs
Target: x86_64-pc-linux-gnu
Configured with: ./configure --with-as=/usr/local/bin/as --with-ld=/usr/local/bin/ld --with-system-zlib --enable-__cxa_atexit
Thread model: posix
gcc version 4.4.7 (GCC)
$ uname -a
Linux rhel6 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Hi,
I updated comments in previous patch.
Thanks,
Norihiro
Can anyone review this change?
Paul Eggert
2016-11-25 19:03:44 UTC
Permalink
Raw Message
Post by Norihiro Tanaka
Can anyone review this change?
Thanks for doing all that, and sorry about the late review. I reviewed it,
tweaked its commit message and propagated that into ChangeLog (the gnulib
practice), and installed the result into gnulib, with two followup patches that
I hope are self-explanatory. All three patches are attached. Please let us know
of any problems you see with the result.

CC'ing this to grep-devel since that's where the hardy band of dfa consumers
hang out.
Jim Meyering
2016-11-25 19:52:09 UTC
Permalink
Raw Message
Post by Paul Eggert
Post by Norihiro Tanaka
Can anyone review this change?
Thanks for doing all that, and sorry about the late review. I reviewed it,
tweaked its commit message and propagated that into ChangeLog (the gnulib
practice), and installed the result into gnulib, with two followup patches
that I hope are self-explanatory. All three patches are attached. Please let
us know of any problems you see with the result.
CC'ing this to grep-devel since that's where the hardy band of dfa consumers
hang out.
Thanks to both of you.
I confirmed that with those, grep still passes all of its tests, even
with ASAN (and hence leak detection) enabled.

Paul, I see one typo in your 2nd change's log: s/cdalls/calls/

Omit unnecessary cdalls to zeroset.
Paul Eggert
2016-11-25 20:47:07 UTC
Permalink
Raw Message
Post by Jim Meyering
I see one typo in your 2nd change's log: s/cdalls/calls/
Omit unnecessary cdalls to zeroset.
Thanks, I fixed that.
Norihiro Tanaka
2016-11-26 15:57:23 UTC
Permalink
Raw Message
Thanks for the review. I confirmed your changes, and I found no problem.
Post by Paul Eggert
CC'ing this to grep-devel since that's where the hardy band of dfa
consumers hang out.
I will also do it in next post for dfa.

Loading...