Commit

released 4.3.2
- double short option -%% enables both --files --bool (a single -% enables --bool) for convenience
- updated thread pool scheduling and execution with thread affinity and priority settings
- improvements and fixes for minor (mostly cosmetic) issues
genivia-inc committed Nov 3, 2023
1 parent ead45cf commit e6274b9
Showing 14 changed files with 252 additions and 202 deletions.
2 changes: 1 addition & 1 deletion Makefile.am
@@ -74,7 +74,7 @@ install-data-hook:
echo "| |"; \
echo "| Thank you for using ugrep! |"; \
echo "| |"; \
- echo "| https://github.com/Genivia/ugrep |"; \
+ echo "| https://ugrep.com |"; \
echo "|______________________________________________________|";

uninstall-hook:
2 changes: 1 addition & 1 deletion Makefile.in
@@ -958,7 +958,7 @@ install-data-hook:
echo "| |"; \
echo "| Thank you for using ugrep! |"; \
echo "| |"; \
- echo "| https://github.com/Genivia/ugrep |"; \
+ echo "| https://ugrep.com |"; \
echo "|______________________________________________________|";

uninstall-hook:
77 changes: 38 additions & 39 deletions README.md
@@ -11,23 +11,23 @@ Ugrep is like grep, but much faster, user-friendly, and equipped with a ton of n

See [how to install ugrep](#install) on your system. Ugrep is always free.

- New website
- -----------
+ Website
+ -------

- **[ugrep.com](https://ugrep.com)** with a helpful and compact user guide.
+ **[ugrep.com](https://ugrep.com)** has a compact user guide.

Development roadmap
-------------------

- - #1 priority is quality assurance to continue to make sure ugrep has no bugs and is reliable
- *if something should be improved or added to ugrep, then let me know!*

+ - *if something should be improved or added to ugrep, then let me know!*
+ - #1 priority is quality assurance to continue to make sure ugrep has no bugs and is reliable

  - add new and updated features, including [indexing (in beta release state)](https://github.com/Genivia/ugrep-indexer)

  - share [reproducible performance results](https://github.com/Genivia/ugrep-benchmarks) with the community, showing that ugrep is almost always faster than other grep tools

- - make ugrep even faster in the near future, see [my latest article](https://www.genivia.com/ugrep.html) and planned optimizations [#288](https://github.com/Genivia/ugrep/issues/288) and [#305](https://github.com/Genivia/ugrep/issues/305)
+ - make ugrep even faster, see [my article](https://www.genivia.com/ugrep.html) and planned optimizations [#288](https://github.com/Genivia/ugrep/issues/288)

Overview
--------
@@ -1429,7 +1429,7 @@ To recursively list all shell scripts based on extensions only with `-tshell`:

### Boolean query patterns with --bool (-%), --and, --not

- --bool, -%
+ --bool, -%, -%%
Specifies Boolean query patterns. A Boolean query pattern is
composed of `AND', `OR', `NOT' operators and grouping with `(' `)'.
Spacing between subpatterns is the same as `AND', `|' is the same
@@ -1441,27 +1441,26 @@ To recursively list all shell scripts based on extensions only with `-tshell`:
lines with (`A' or `B') and (`C' or `D'), --bool 'A AND NOT B'
matches lines with `A' without `B'. Quoted subpatterns are matched
literally as strings. For example, --bool 'A "AND"|"OR"' matches
- lines with `A' and also either `AND' or `OR'. Parenthesis are used
+ lines with `A' and also either `AND' or `OR'. Parentheses are used
for grouping. For example, --bool '(A B)|C' matches lines with `A'
and `B', or lines with `C'. Note that all subpatterns in a Boolean
query pattern are regular expressions, unless -F is specified.
Options -E, -F, -G, -P and -Z can be combined with --bool to match
subpatterns as strings or regular expressions (-E is the default.)
- This option does not apply to -f FILE patterns. Option --stats
- displays the search patterns applied. See also options --and,
+ This option does not apply to -f FILE patterns. The double short
+ option -%% enables options --files --bool. Option --stats displays
+ the Boolean search patterns applied. See also options --and,
--andnot, --not, --files and --lines.
--files
- Apply Boolean queries to match files, the opposite of --lines. A
- file matches if all Boolean conditions are satisfied by the lines
- matched in the file. For example, --files -e A --and -e B -e C
- --andnot -e D matches a file if some lines match `A' and some lines
- match (`B' or `C') and no line in the file matches `D'. May also
- be specified as --files --bool 'A B|C -D'. Option -v cannot be
- specified with --files. See also options --and, --andnot, --not,
- --bool and --lines.
+ Boolean file matching mode, the opposite of --lines. When combined
+ with option --bool, matches a file if all Boolean conditions are
+ satisfied. For example, --files --bool 'A B|C -D' matches a file
+ if some lines match `A', and some lines match either `B' or `C',
+ and no line matches `D'. See also options --and, --andnot, --not,
+ --bool and --lines. The double short option -%% enables options
+ --files --bool.
--lines
- Apply Boolean queries to match lines, the opposite of --files.
- This is the default Boolean query mode to match specific lines.
+ Boolean line matching mode for option --bool, the default mode.
--and [[-e] PATTERN] ... -e PATTERN
Specify additional patterns to match. Patterns must be specified
with -e. Each -e PATTERN following this option is considered an
@@ -3871,8 +3870,9 @@ in markdown:
The default pattern syntax is an extended form of the POSIX ERE syntax,
same as option -E (--extended-regexp). Try ug --help regex for help with
pattern syntax and how to use logical connectives to specify Boolean
- search queries with option -% (--bool). Options -F (--fixed-strings), -G
- (--basic-regexp) and -P (--perl-regexp) specify other pattern syntaxes.
+ search queries with option -% (--bool) to match lines and -%% (--files
+ --bool) to match files. Options -F (--fixed-strings), -G
+ (--basic-regexp) and -P (--perl-regexp) specify other pattern syntaxes.

Option -i (--ignore-case) ignores case in ASCII patterns. Combine with
option -P for case-insensitive Unicode matching. Option -j (--smart-
@@ -4010,7 +4010,7 @@ in markdown:
zero byte or invalid UTF. Short options are -a, -I, -U, -W and
-X.

- --bool, -%
+ --bool, -%, -%%
Specifies Boolean query patterns. A Boolean query pattern is
composed of `AND', `OR', `NOT' operators and grouping with `('
`)'. Spacing between subpatterns is the same as `AND', `|' is the
@@ -4022,16 +4022,17 @@ in markdown:
matches lines with (`A' or `B') and (`C' or `D'), --bool 'A AND
NOT B' matches lines with `A' without `B'. Quoted subpatterns are
matched literally as strings. For example, --bool 'A "AND"|"OR"'
- matches lines with `A' and also either `AND' or `OR'. Parenthesis
+ matches lines with `A' and also either `AND' or `OR'. Parentheses
are used for grouping. For example, --bool '(A B)|C' matches
lines with `A' and `B', or lines with `C'. Note that all
subpatterns in a Boolean query pattern are regular expressions,
unless -F is specified. Options -E, -F, -G, -P and -Z can be
combined with --bool to match subpatterns as strings or regular
expressions (-E is the default.) This option does not apply to -f
- FILE patterns. Option --stats displays the search patterns
- applied. See also options --and, --andnot, --not, --files and
- --lines.
+ FILE patterns. The double short option -%% enables options
+ --files --bool. Option --stats displays the Boolean search
+ patterns applied. See also options --and, --andnot, --not,
+ --files and --lines.

--break
Adds a line break between results from different files. This
@@ -4426,8 +4427,7 @@ in markdown:
Force output to be line buffered instead of block buffered.

--lines
- Apply Boolean queries to match lines, the opposite of --files.
- This is the default Boolean mode to match specific lines.
+ Boolean line matching mode for option --bool, the default mode.

-M MAGIC, --file-magic=MAGIC
Only files matching the signature pattern MAGIC are searched. The
@@ -4503,15 +4503,14 @@ in markdown:
displaying the match. The line number counter is reset for each
file processed.

- --files
- Apply Boolean queries to match files, the opposite of --lines. A
- file matches if all Boolean conditions are satisfied by the lines
- matched in the file. For example, --files -e A --and -e B -e C
- --andnot -e D matches a file if some lines match `A' and some
- lines match (`B' or `C') and no line in the file matches `D'. May
- also be specified as --files --bool 'A B|C -D'. Option -v cannot
- be specified with --files. See also options --and, --andnot,
- --not, --bool and --lines.
+ --files, -%%
+ Boolean file matching mode, the opposite of --lines. When
+ combined with option --bool, matches a file if all Boolean
+ conditions are satisfied. For example, --files --bool 'A B|C -D'
+ matches a file if some lines match `A', and some lines match
+ either `B' or `C', and no line matches `D'. See also options
+ --and, --andnot, --not, --bool and --lines. The double short
+ option -%% enables options --files --bool.

-P, --perl-regexp
Interpret PATTERN as a Perl regular expression using PCRE2. Note
@@ -5334,7 +5333,7 @@ in markdown:



- ugrep 4.3.2 October 25, 2023 UGREP(1)
+ ugrep 4.3.2 November 3, 2023 UGREP(1)

🔝 [Back to table of contents](#toc)

Binary file modified bin/win32/ugrep.exe
Binary file not shown.
Binary file modified bin/win64/ugrep.exe
Binary file not shown.
7 changes: 6 additions & 1 deletion lib/matcher.cpp
@@ -102,14 +102,17 @@ size_t Matcher::match(Method method)
find:
int c1 = got_;
bool bol = at_bol(); // at begin of line?
+ #if !defined(WITH_NO_CODEGEN)
if (pat_->fsm_ != NULL)
fsm_.c1 = c1;
+ #endif
#if !defined(WITH_NO_INDENT)
redo:
#endif
lap_.resize(0);
cap_ = 0;
bool nul = method == Const::MATCH;
+ #if !defined(WITH_NO_CODEGEN)
if (pat_->fsm_ != NULL)
{
DBGLOG("FSM code %p", pat_->fsm_);
@@ -119,7 +122,9 @@ size_t Matcher::match(Method method)
nul = fsm_.nul;
c1 = fsm_.c1;
}
- else if (pat_->opc_ != NULL)
+ else
+ #endif
+ if (pat_->opc_ != NULL)
{
const Pattern::Opcode *pc = pat_->opc_;
Pattern::Index back = Pattern::Const::IMAX; // where to jump back to
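
The split of `else if` into `else`, `#endif`, `if` in this hunk lets the first branch be compiled out while the second branch stays valid. A minimal sketch of the same pattern follows; only the macro name `WITH_NO_CODEGEN` comes from the diff, while `choose_engine` and its parameters are hypothetical stand-ins for the checks on `pat_->fsm_` and `pat_->opc_`.

```cpp
#include <string>

// Hypothetical stand-in for the branch selection in Matcher::match().
// When WITH_NO_CODEGEN is defined, the first branch and its trailing `else`
// disappear, leaving a plain `if` for the opcode branch.
std::string choose_engine(bool have_fsm, bool have_opcodes)
{
  (void)have_fsm; // unused when WITH_NO_CODEGEN is defined
#if !defined(WITH_NO_CODEGEN)
  if (have_fsm)
  {
    return "fsm";
  }
  else
#endif
  if (have_opcodes)
  {
    return "opcodes";
  }
  return "none";
}
```

Keeping `else` inside the guarded region is what makes both configurations well-formed: an `else` left outside the guard would have no preceding `if` once the guarded branch is removed.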
4 changes: 2 additions & 2 deletions lib/pattern.cpp
@@ -3816,7 +3816,7 @@ void Pattern::predict_match_dfa(const DFA::State *start)
one_ = true;
while (state->accept == 0)
{
- if (state->edges.size() != 1)
+ if (state->edges.size() != 1 || !state->heads.empty())
{
one_ = false;
break;
@@ -3844,7 +3844,7 @@ void Pattern::predict_match_dfa(const DFA::State *start)
}
state = next;
}
- if (state != NULL && state->accept > 0 && !state->edges.empty())
+ if (state != NULL && ((state->accept > 0 && !state->edges.empty()) || state->redo))
one_ = false;
min_ = 0;
std::memset(bit_, 0xFF, sizeof(bit_));
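
The two changed conditions tighten the test that decides whether `one_` stays true. A simplified sketch of such a single-path check is given below. It is not the ugrep code: the `State` struct is hypothetical, edges are assumed to carry one character each (the real DFA may use character ranges), the lines elided between the two hunks are not reproduced, and the meaning of `heads` and `redo` is not shown in this diff, so they are modeled only as "non-empty" and "set".

```cpp
#include <map>
#include <set>

// Hypothetical, simplified DFA state; field names mirror those in the hunk.
struct State {
  int accept = 0;                // > 0 when the state is accepting
  bool redo = false;             // flag tested by the new second condition
  std::set<int> heads;           // tested by the new first condition
  std::map<char, State*> edges;  // assumption: one character per edge
};

// Returns true if the only path from `state` is a single chain of edges that
// ends in an accepting state with no further edges, no heads along the way,
// and no redo flag on the final state.
bool matches_one_string(const State* state)
{
  while (state != nullptr && state->accept == 0)
  {
    if (state->edges.size() != 1 || !state->heads.empty())
      return false;
    state = state->edges.begin()->second;
  }
  if (state != nullptr && ((state->accept > 0 && !state->edges.empty()) || state->redo))
    return false;
  return state != nullptr;
}
```

In this sketch a chain that runs into a null successor is treated as "not a single string"; how the original function handles that case lies in the part of the code the diff does not show.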
45 changes: 23 additions & 22 deletions man.sh
@@ -20,7 +20,7 @@ cat >> man/ugrep.1 << 'END'
.B ugrep
[\fIOPTIONS\fR] [\fB-i\fR] [\fB-Q\fR|\fIPATTERN\fR] [\fB-e\fR \fIPATTERN\fR] [\fB-N\fR \fIPATTERN\fR] [\fB-f\fR \fIFILE\fR]
[\fB-F\fR|\fB-G\fR|\fB-P\fR|\fB-Z\fR] [\fB-U\fR] [\fB-m\fR [\fIMIN,\fR][\fIMAX\fR]] [\fB--bool\fR [\fB--files\fR|\fB--lines\fR]]
- [\fB-r\fR|\fB-R\fR|\fB-1\fR|...|\fB-9\fR|\fB--10\fR|...] [\fB-t\fR \fITYPES\fR] [\fB-g\fR \fIGLOBS\fR] [\fB--sort\fR[=\fIKEY\fR]]
+ [\fB-r\fR|\fB-R\fR|\fB-1\fR|...|\fB-9\fR|\fB-10\fR|...] [\fB-t\fR \fITYPES\fR] [\fB-g\fR \fIGLOBS\fR] [\fB--sort\fR[=\fIKEY\fR]]
[\fB-l\fR|\fB-c\fR] [\fB-o\fR] [\fB-n\fR] [\fB-k\fR] [\fB-b\fR] [\fB-A\fR \fINUM\fR] [\fB-B\fR \fINUM\fR] [\fB-C \fR\fINUM\fR] [\fB-y\fR]
[\fB--color\fR[=\fIWHEN\fR]|\fB--colour\fR[=\fIWHEN\fR]] [\fB--pretty\fR] [\fB--pager\fR[=\fICOMMAND\fR]]
[\fB--hexdump\fR|\fB--csv\fR|\fB--json\fR|\fB--xml\fR] [\fB-I\fR] [\fB-z\fR] [\fB--zmax\fR=\fINUM\fR] [\fIFILE\fR \fI...\fR]
@@ -50,7 +50,8 @@ omit zero matches.
The default pattern syntax is an extended form of the POSIX ERE syntax, same as
option \fB-E\fR (\fB--extended-regexp\fR). Try \fBug --help regex\fR for help
with pattern syntax and how to use logical connectives to specify Boolean
- search queries with option \fB-%\fR (\fB--bool\fR). Options \fB-F\fR
+ search queries with option \fB-%\fR (\fB--bool\fR) to match lines and \fB-%%\fR
+ (\fB--files --bool\fR) to match files. Options \fB-F\fR
(\fB--fixed-strings\fR), \fB-G\fR (\fB--basic-regexp\fR) and \fB-P\fR
(\fB--perl-regexp\fR) specify other pattern syntaxes.
.PP
@@ -356,8 +357,8 @@ TUI regex braces.
.SH FORMAT
Option \fB--format\fR=\fIFORMAT\fR specifies an output format for file matches.
Fields may be used in \fIFORMAT\fR, which expand into the following values:
- .IP \fB%[\fR\fIARG\fR\fB]F\fR
- if option \fB-H\fR is used: \fIARG\fR, the file pathname and separator.
+ .IP \fB%[\fR\fITEXT\fR\fB]F\fR
+ if option \fB-H\fR is used: \fITEXT\fR, the file pathname and separator.
.IP \fB%f\fR
the file pathname.
.IP \fB%a\fR
@@ -366,33 +367,33 @@ the file basename without directory path.
the directory path to the file.
.IP \fB%z\fR
the file pathname in a (compressed) archive.
- .IP \fB%[\fR\fIARG\fR\fB]H\fR
- if option \fB-H\fR is used: \fIARG\fR, the quoted pathname and separator, \\"
+ .IP \fB%[\fR\fITEXT\fR\fB]H\fR
+ if option \fB-H\fR is used: \fITEXT\fR, the quoted pathname and separator, \\"
and \\\\ replace " and \\.
.IP \fB%h\fR
the quoted file pathname, \\" and \\\\ replace " and \\.
- .IP \fB%[\fR\fIARG\fR\fB]N\fR
- if option \fB-n\fR is used: \fIARG\fR, the line number and separator.
+ .IP \fB%[\fR\fITEXT\fR\fB]N\fR
+ if option \fB-n\fR is used: \fITEXT\fR, the line number and separator.
.IP \fB%n\fR
the line number of the match.
- .IP \fB%[\fR\fIARG\fR\fB]K\fR
- if option \fB-k\fR is used: \fIARG\fR, the column number and separator.
+ .IP \fB%[\fR\fITEXT\fR\fB]K\fR
+ if option \fB-k\fR is used: \fITEXT\fR, the column number and separator.
.IP \fB%k\fR
the column number of the match.
- .IP \fB%[\fR\fIARG\fR\fB]B\fR
- if option \fB-b\fR is used: \fIARG\fR, the byte offset and separator.
+ .IP \fB%[\fR\fITEXT\fR\fB]B\fR
+ if option \fB-b\fR is used: \fITEXT\fR, the byte offset and separator.
.IP \fB%b\fR
the byte offset of the match.
- .IP \fB%[\fR\fIARG\fR\fB]T\fR
- if option \fB-T\fR is used: \fIARG\fR and a tab character.
+ .IP \fB%[\fR\fITEXT\fR\fB]T\fR
+ if option \fB-T\fR is used: \fITEXT\fR and a tab character.
.IP \fB%t\fR
a tab character.
.IP \fB%[\fR\fISEP\fR\fB]$\fR
set field separator to \fISEP\fR for the rest of the format fields.
- .IP \fB%[\fR\fIARG\fR\fB]<\fR
- if the first match: \fIARG\fR.
- .IP \fB%[\fR\fIARG\fR\fB]>\fR
- if not the first match: \fIARG\fR.
+ .IP \fB%[\fR\fITEXT\fR\fB]<\fR
+ if the first match: \fITEXT\fR.
+ .IP \fB%[\fR\fITEXT\fR\fB]>\fR
+ if not the first match: \fITEXT\fR.
.IP \fB%,\fR
if not the first match: a comma, same as \fB%[,]>\fR.
.IP \fB%:\fR
@@ -401,10 +402,10 @@ if not the first match: a colon, same as \fB%[:]>\fR.
if not the first match: a semicolon, same as \fB%[;]>\fR.
.IP \fB%|\fR
if not the first match: a vertical bar, same as \fB%[|]>\fR.
- .IP \fB%[\fR\fIARG\fR\fB]S\fR
- if not the first match: \fIARG\fR and separator, see also \fB%[\fR\fISEP\fR\fB]$.
+ .IP \fB%[\fR\fITEXT\fR\fB]S\fR
+ if not the first match: \fITEXT\fR and separator, see also \fB%[\fR\fISEP\fR\fB]$.
.IP \fB%s\fR
- the separator, see also \fB%[\fR\fIARG\fR\fB]S\fR and \fB%[\fR\fISEP\fR\fB]$.
+ the separator, see also \fB%[\fR\fITEXT\fR\fB]S\fR and \fB%[\fR\fISEP\fR\fB]$.
.IP \fB%~\fR
a newline character.
.IP \fB%M\fR
@@ -487,7 +488,7 @@ the percentage sign.
Formatted output is written without a terminating newline, unless \fB%~\fR or
`\\n' is explicitly specified in the format string.
.PP
- The \fB[\fR\fIARG\fR\fB]\fR part of a field is optional and may be omitted.
+ The \fB[\fR\fITEXT\fR\fB]\fR part of a field is optional and may be omitted.
When present, the argument must be placed in \fB[]\fR brackets, for example
\fB%[,]F\fR to output a comma, the pathname, and a separator.
.PP
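
How a few of the format fields listed above expand can be sketched with a toy expander. It covers only `%f`, `%n`, `%~`, `%,`, `%[TEXT]<` and `%[TEXT]>` as they are described in this man page text. It is an illustration, not ugrep's formatter: the `Match` struct and the `expand` function are hypothetical, and fields that depend on options (such as `%[TEXT]F` with -H) are deliberately left unsupported and copied through unchanged.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record of one match; field names are illustrative only.
struct Match {
  std::string pathname;
  std::size_t line;
};

// Expands a small subset of format fields for one match.
// `first` tells whether this is the first match in the output.
std::string expand(const std::string& format, const Match& m, bool first)
{
  std::string out;
  for (std::size_t i = 0; i < format.size(); ++i)
  {
    if (format[i] != '%' || i + 1 == format.size())
    {
      out += format[i];                  // ordinary character or trailing '%'
      continue;
    }
    ++i;
    bool has_text = false;
    std::string text;                    // optional [TEXT] argument
    if (format[i] == '[')
    {
      std::size_t close = format.find(']', i);
      if (close == std::string::npos || close + 1 == format.size())
      {
        out += '%';
        out += format.substr(i);         // malformed field: copy the rest as is
        break;
      }
      has_text = true;
      text = format.substr(i + 1, close - i - 1);
      i = close + 1;
    }
    switch (format[i])
    {
      case 'f': out += m.pathname; break;
      case 'n': out += std::to_string(m.line); break;
      case '~': out += '\n'; break;
      case ',': if (!first) out += ','; break;
      case '<': if (first) out += text; break;
      case '>': if (!first) out += text; break;
      default:                           // unsupported field: copy it through
        out += '%';
        if (has_text) out += '[' + text + ']';
        out += format[i];
        break;
    }
  }
  return out;
}
```

For a match in `a.c` at line 3, the format `%,%f:%n` yields `a.c:3` for the first match and `,a.c:3` for later ones, which is how a comma-separated list can be built without a leading comma.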