├── .gitignore
├── .mds-list
├── FAQ.md
├── FAQ.zh.md
├── GUIDE.md
├── GUIDE.zh.md
├── en.md
├── globset
│   ├── README.md
│   └── README.zh.md
├── grep-cli
│   ├── README.md
│   └── README.zh.md
├── grep-matcher
│   ├── README.md
│   └── README.zh.md
├── grep-pcre2
│   ├── README.md
│   └── README.zh.md
├── grep-printer
│   ├── README.md
│   └── README.zh.md
├── grep-regex
│   ├── README.md
│   └── README.zh.md
├── grep-searcher
│   ├── README.md
│   └── README.zh.md
├── grep
│   ├── README.md
│   └── README.zh.md
├── ignore
│   ├── README.md
│   └── README.zh.md
├── readme.md
├── rg-0.10.0-h.md
├── rg-0.10.0-h.zh.md
├── sync-en.sh
├── termcolor
│   ├── README.md
│   └── README.zh.md
└── wincolor
    ├── README.md
    └── README.zh.md
/.gitignore:
--------------------------------------------------------------------------------
1 | node_modules/
2 | .DS_Store
3 | fork
4 | source
5 | hub-create.sh
--------------------------------------------------------------------------------
/.mds-list:
--------------------------------------------------------------------------------
1 | ./source/grep-regex/README.md
2 | ./source/globset/README.md
3 | ./source/ignore/README.md
4 | ./source/FAQ.md
5 | ./source/grep/README.md
6 | ./source/README.md
7 | ./source/termcolor/README.md
8 | ./source/grep-cli/README.md
9 | ./source/grep-pcre2/README.md
10 | ./source/GUIDE.md
11 | ./source/grep-searcher/README.md
12 | ./source/wincolor/README.md
13 | ./source/grep-printer/README.md
14 | ./source/grep-matcher/README.md
15 |
16 |
--------------------------------------------------------------------------------
/FAQ.md:
--------------------------------------------------------------------------------
1 | ## FAQ
2 |
3 | * [Does ripgrep support configuration files?](#config)
4 | * [What's changed in ripgrep recently?](#changelog)
5 | * [When is the next release?](#release)
6 | * [Does ripgrep have a man page?](#manpage)
7 | * [Does ripgrep have support for shell auto-completion?](#complete)
8 | * [How do I use lookaround and/or backreferences?](#fancy)
9 | * [How do I configure ripgrep's colors?](#colors)
10 | * [How do I enable true colors on Windows?](#truecolors-windows)
11 | * [How do I stop ripgrep from messing up colors when I kill it?](#stop-ripgrep)
12 | * [How can I get results in a consistent order?](#order)
13 | * [How do I search files that aren't UTF-8?](#encoding)
14 | * [How do I search compressed files?](#compressed)
15 | * [How do I search over multiple lines?](#multiline)
16 | * [How do I get around the regex size limit?](#size-limit)
17 | * [How do I make the `-f/--file` flag faster?](#dfa-size)
18 | * [How do I make the output look like The Silver Searcher's output?](#silver-searcher-output)
19 | * [Why does ripgrep get slower when I enable PCRE2 regexes?](#pcre2-slow)
20 | * [When I run `rg`, why does it execute some other command?](#rg-other-cmd)
21 | * [How do I create an alias for ripgrep on Windows?](#rg-alias-windows)
22 | * [How do I create a PowerShell profile?](#powershell-profile)
23 | * [How do I pipe non-ASCII content to ripgrep on Windows?](#pipe-non-ascii-windows)
24 | * [How can I search and replace with ripgrep?](#search-and-replace)
25 | * [How is ripgrep licensed?](#license)
26 | * [Can ripgrep replace grep?](#posix4ever)
27 | * [What does the "rip" in ripgrep mean?](#intentcountsforsomething)
28 |
29 |
30 |
31 | Does ripgrep support configuration files?
32 |
33 |
34 | Yes. See the
35 | [guide's section on configuration files](GUIDE.md#configuration-file).
36 |
37 |
38 |
39 | What's changed in ripgrep recently?
40 |
41 |
42 | Please consult ripgrep's [CHANGELOG](CHANGELOG.md).
43 |
44 |
45 |
46 | When is the next release?
47 |
48 |
49 | ripgrep is a project whose contributors are volunteers. A release schedule
50 | adds undue stress to said volunteers. Therefore, releases are made on a best
51 | effort basis and no dates **will ever be given**.
52 |
53 | One exception to this is high impact bugs. If a ripgrep release contains a
54 | significant regression, then there will generally be a strong push to get a
55 | patch release out with a fix.
56 |
57 |
58 |
59 | Does ripgrep have a man page?
60 |
61 |
62 | Yes! Whenever ripgrep is compiled on a system with `asciidoc` present, a man
63 | page is generated from ripgrep's argv parser. After compiling ripgrep, you can
64 | find the man page from the root of the repository like so:
65 |
66 | ```
67 | $ find ./target -name rg.1 -print0 | xargs -0 ls -t | head -n1
68 | ./target/debug/build/ripgrep-79899d0edd4129ca/out/rg.1
69 | ```
70 |
71 | Running `man -l ./target/debug/build/ripgrep-79899d0edd4129ca/out/rg.1` will
72 | show the man page in your normal pager.
73 |
74 | Note that the man page's documentation for options is equivalent to the output
75 | shown in `rg --help`. To see more condensed documentation (one line per flag),
76 | run `rg -h`.
77 |
78 | The man page is also included in all
79 | [ripgrep binary releases](https://github.com/BurntSushi/ripgrep/releases).
80 |
81 |
82 |
83 | Does ripgrep have support for shell auto-completion?
84 |
85 |
86 | Yes! Shell completions can be found in the
87 | [same directory as the man page](#manpage)
88 | after building ripgrep. Zsh completions are maintained separately and committed
89 | to the repository in `complete/_rg`.
90 |
91 | Shell completions are also included in all
92 | [ripgrep binary releases](https://github.com/BurntSushi/ripgrep/releases).
93 |
94 | For **bash**, move `rg.bash` to
95 | `$XDG_CONFIG_HOME/bash_completion` or `/etc/bash_completion.d/`.
96 |
97 | For **fish**, move `rg.fish` to `$HOME/.config/fish/completions/`.
98 |
99 | For **zsh**, move `_rg` to one of your `$fpath` directories.
100 |
101 | For **PowerShell**, add `. _rg.ps1` to your PowerShell
102 | [profile](https://technet.microsoft.com/en-us/library/bb613488(v=vs.85).aspx)
103 | (note the leading period). If the `_rg.ps1` file is not on your `PATH`, do
104 | `. /path/to/_rg.ps1` instead.
105 |
106 |
107 |
108 | How can I get results in a consistent order?
109 |
110 |
111 | By default, ripgrep uses parallelism to execute its search because this makes
112 | the search much faster on most modern systems. This in turn means that ripgrep
113 | has a non-deterministic aspect to it, since the interleaving of threads during
114 | the execution of the program is itself non-deterministic. This has the effect
115 | of printing results in a somewhat arbitrary order, and this order can change
116 | from run to run of ripgrep.
117 |
118 | The only way to make the order of results consistent is to ask ripgrep to
119 | sort the output. Currently, this will disable all parallelism. (On smaller
120 | repositories, you might not notice much of a performance difference!) You
121 | can achieve this with the `--sort path` flag.
122 |
123 | There is more discussion on this topic here:
124 | https://github.com/BurntSushi/ripgrep/issues/152
125 |
126 |
127 |
128 | How do I search files that aren't UTF-8?
129 |
130 |
131 | See the [guide's section on file encoding](GUIDE.md#file-encoding).
132 |
133 |
134 |
135 | How do I search compressed files?
136 |
137 |
138 | ripgrep's `-z/--search-zip` flag will cause it to search compressed files
139 | automatically. Currently, this supports gzip, bzip2, xz, lzma, lz4, Brotli and
140 | Zstd. Each of these requires the corresponding `gzip`, `bzip2`, `xz`,
141 | `lz4`, `brotli` and `zstd` binaries to be installed on your system. (That is,
142 | ripgrep does decompression by shelling out to another process.)
143 |
144 | ripgrep currently does not search archive formats, so `*.tar.gz` files, for
145 | example, are skipped.
146 |
147 |
148 |
149 | How do I search over multiple lines?
150 |
151 |
152 | The `-U/--multiline` flag enables ripgrep to report results that span over
153 | multiple lines.
154 |
155 |
156 |
157 | How do I use lookaround and/or backreferences?
158 |
159 |
160 | ripgrep's default regex engine does not support lookaround or backreferences.
161 | This is primarily because the default regex engine is implemented using finite
162 | state machines in order to guarantee a linear worst case time complexity on all
163 | inputs. Backreferences are not possible to implement in this paradigm, and
164 | lookaround appears difficult to do efficiently.
165 |
166 | However, ripgrep optionally supports using PCRE2 as the regex engine instead of
167 | the default one based on finite state machines. You can enable PCRE2 with the
168 | `-P/--pcre2` flag. For example, in the root of the ripgrep repo, you can easily
169 | find every run of ten word characters that is immediately repeated:
170 |
171 | ```
172 | $ rg -P '(\w{10})\1'
173 | tests/misc.rs
174 | 483: cmd.arg("--max-filesize").arg("44444444444444444444");
175 | globset/src/glob.rs
176 | 1206: matches!(match7, "a*a*a*a*a*a*a*a*a", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
177 | ```
178 |
179 | If your version of ripgrep doesn't support PCRE2, then you'll get an error
180 | message when you try to use the `-P/--pcre2` flag:
181 |
182 | ```
183 | $ rg -P '(\w{10})\1'
184 | PCRE2 is not available in this build of ripgrep
185 | ```
186 |
187 | Most of the releases distributed by the ripgrep project here on GitHub will
188 | come bundled with PCRE2 enabled. If you installed ripgrep through a different
189 | means (like your system's package manager), then please reach out to the
190 | maintainer of that package to see whether it's possible to enable the PCRE2
191 | feature.
192 |
193 |
194 |
195 | How do I configure ripgrep's colors?
196 |
197 |
198 | ripgrep has two flags related to colors:
199 |
200 | * `--color` controls *when* to use colors.
201 | * `--colors` controls *which* colors to use.
202 |
203 | The `--color` flag accepts one of the following possible values: `never`,
204 | `auto`, `always` or `ansi`. The `auto` value is the default and will cause
205 | ripgrep to only enable colors when it is printing to a terminal. But if you
206 | pipe ripgrep to a file or some other process, then it will suppress colors.
207 |
208 | The `--colors` flag is a bit more complicated. The general format is:
209 |
210 | ```
211 | --colors '{type}:{attribute}:{value}'
212 | ```
213 |
214 | * `{type}` should be one of `path`, `line`, `column` or `match`. These
215 | correspond to the four different types of things that ripgrep will add color
216 | to in its output. Select the type whose color you want to change.
217 | * `{attribute}` should be one of `fg`, `bg` or `style`, corresponding to
218 | foreground color, background color, or miscellaneous styling (such as whether
219 | to bold the output or not).
220 | * `{value}` is determined by the value of `{attribute}`. If
221 | `{attribute}` is `style`, then `{value}` should be one of `nobold`,
222 | `bold`, `nointense`, `intense`, `nounderline` or `underline`. If
223 | `{attribute}` is `fg` or `bg`, then `{value}` should be a color.
224 |
225 | A color is specified by one of eight English names, a single 8-bit number
226 | (one of 256 colors) or an RGB triple (with over 16 million possible values, or
227 | "true color").
228 |
229 | The color names are `red`, `blue`, `green`, `cyan`, `magenta`, `yellow`,
230 | `white` or `black`.
231 |
232 | A single 8-bit number is a value in the range 0-255 (inclusive). It can
233 | either be in decimal format (e.g., `62`) or hexadecimal format (e.g., `0x3E`).
234 |
235 | An RGB triple corresponds to three numbers (decimal or hexadecimal) separated
236 | by commas.
237 |
238 | As a special case, `--colors '{type}:none'` will clear all colors and styles
239 | associated with `{type}`, which lets you start with a clean slate (instead of
240 | building on top of ripgrep's default color settings).
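
Since decimal and hexadecimal values are interchangeable, the same color can be written either way. The following is only a sketch, assuming a shell whose `printf` accepts `0x`-prefixed integers (bash does); `to_decimal_color` is a hypothetical helper name and not something ripgrep provides:

```shell
# Convert a hexadecimal RGB triple such as 0x33,0x66,0xFF to decimal form.
# to_decimal_color is an illustrative helper, not part of ripgrep.
to_decimal_color() {
    old_ifs=$IFS
    IFS=,
    # Split the single argument on commas into $1, $2 and $3.
    set -- $1
    IFS=$old_ifs
    printf '%d,%d,%d\n' "$1" "$2" "$3"
}

to_decimal_color 0x33,0x66,0xFF   # prints 51,102,255
```

Either spelling should then be usable as a `{value}`, for instance `--colors "match:bg:$(to_decimal_color 0x33,0x66,0xFF)"`.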
241 |
242 | Here's an example that highlights the matches with a nice blue background
243 | with bolded white text:
244 |
245 | ```
246 | $ rg somepattern \
247 |     --colors 'match:none' \
248 |     --colors 'match:bg:0x33,0x66,0xFF' \
249 |     --colors 'match:fg:white' \
250 |     --colors 'match:style:bold'
251 | ```
252 |
253 | Colors are an ideal candidate to set in your
254 | [configuration file](GUIDE.md#configuration-file). See the
255 | [question on emulating The Silver Searcher's output style](#silver-searcher-output)
256 | for an example specific to colors.
257 |
258 |
259 |
260 | How do I enable true colors on Windows?
261 |
262 |
263 | First, see the previous question's
264 | [answer on configuring colors](#colors).
265 |
266 | Secondly, coloring on Windows is a bit complicated. If you're using a terminal
267 | like Cygwin, then it's likely true color support already works out of the box.
268 | However, if you are using a normal Windows console (`cmd` or `PowerShell`) and
269 | a version of Windows prior to 10, then there is no known way to get true
270 | color support. If you are on Windows 10 and using a Windows console, then
271 | true colors should work out of the box with one caveat: you might need to
272 | clear ripgrep's default color settings first. That is, instead of this:
273 |
274 | ```
275 | $ rg somepattern --colors 'match:fg:0x33,0x66,0xFF'
276 | ```
277 |
278 | you should do this:
279 |
280 | ```
281 | $ rg somepattern --colors 'match:none' --colors 'match:fg:0x33,0x66,0xFF'
282 | ```
283 |
284 | This is because ripgrep might set the default style for `match` to `bold`, and
285 | it seems like Windows 10's VT100 support doesn't permit bold and true color
286 | ANSI escapes to be used simultaneously. The work-around above will clear
287 | ripgrep's default styling, allowing you to craft it exactly as desired.
288 |
289 |
290 |
291 | How do I stop ripgrep from messing up colors when I kill it?
292 |
293 |
294 | Type in `color` in cmd.exe (Command Prompt) and `echo -ne "\033[0m"` on
295 | Unix-like systems to restore your original foreground color.
296 |
297 | In PowerShell, you can add the following code to your profile which will
298 | restore the original foreground color when `Reset-ForegroundColor` is called.
299 | Including the `Set-Alias` line will allow you to call it with simply `color`.
300 |
301 | ```powershell
302 | $OrigFgColor = $Host.UI.RawUI.ForegroundColor
303 | function Reset-ForegroundColor {
304 |     $Host.UI.RawUI.ForegroundColor = $OrigFgColor
305 | }
306 | Set-Alias -Name color -Value Reset-ForegroundColor
307 | ```
308 |
309 | PR [#187](https://github.com/BurntSushi/ripgrep/pull/187) fixed this, and it
310 | was later deprecated in
311 | [#281](https://github.com/BurntSushi/ripgrep/issues/281). A full explanation is
312 | available
313 | [here](https://github.com/BurntSushi/ripgrep/issues/281#issuecomment-269093893).
314 |
315 |
316 |
317 | How do I get around the regex size limit?
318 |
319 |
320 | If you've given ripgrep a particularly large pattern (or a large number of
321 | smaller patterns), then it is possible that it will fail to compile because it
322 | hit a pre-set limit. For example:
323 |
324 | ```
325 | $ rg '\pL{1000}'
326 | Compiled regex exceeds size limit of 10485760 bytes.
327 | ```
328 |
329 | (Note: `\pL{1000}` may look small, but `\pL` is the character class containing
330 | all Unicode letters, which is quite large. *And* it's repeated 1000 times.)
331 |
332 | In this case, you can work around this by simply increasing the limit:
333 |
334 | ```
335 | $ rg '\pL{1000}' --regex-size-limit 1G
336 | ```
337 |
338 | Increasing the limit to 1GB does not necessarily mean that ripgrep will use
339 | that much memory. The limit just says that it's allowed to (approximately) use
340 | that much memory for constructing the regular expression.
341 |
342 |
343 |
344 | How do I make the `-f/--file` flag faster?
345 |
346 |
347 | The `-f/--file` flag permits one to give a file to ripgrep which contains a pattern
348 | on each line. ripgrep will then report any line that matches any of the
349 | patterns.
350 |
351 | If this pattern file gets too big, then it is possible ripgrep will slow down
352 | dramatically. *Typically* this is because an internal cache is too small, and
353 | will cause ripgrep to spill over to a slower but more robust regular expression
354 | engine. If this is indeed the problem, then it is possible to increase this
355 | cache and regain speed. The cache can be controlled via the `--dfa-size-limit`
356 | flag. For example, using `--dfa-size-limit 1G` will set the cache size to 1GB.
357 | (Note that this doesn't mean ripgrep will use 1GB of memory automatically, but
358 | it will allow the regex engine to if it needs to.)
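
As a rough illustration of the two flags together, the sketch below builds a tiny pattern file and searches a throwaway file with a raised cache limit. The patterns, file contents and variable names are made up for the example, and the search only runs if `rg` is found on your `PATH`:

```shell
# Build a small pattern file, one pattern per line (contents are made up).
patterns=$(mktemp)
printf '%s\n' 'foo[0-9]+' 'bar' 'baz$' > "$patterns"

# A throwaway file to search.
haystack=$(mktemp)
printf '%s\n' 'foo42' 'nothing here' 'quux baz' > "$haystack"

matches=
if command -v rg >/dev/null 2>&1; then
    # Report lines matching any of the patterns, letting the DFA cache
    # grow to as much as 1GB if the regex engine needs it.
    matches=$(rg -f "$patterns" --dfa-size-limit 1G "$haystack" || true)
    printf '%s\n' "$matches"
fi
```

A three-line pattern file will not come close to the default cache size; the flag only starts to matter once the pattern file grows large.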
359 |
360 |
361 |
362 | How do I make the output look like The Silver Searcher's output?
363 |
364 |
365 | Use the `--colors` flag, like so:
366 |
367 | ```
368 | rg --colors line:fg:yellow \
369 | --colors line:style:bold \
370 | --colors path:fg:green \
371 | --colors path:style:bold \
372 | --colors match:fg:black \
373 | --colors match:bg:yellow \
374 | --colors match:style:nobold \
375 | foo
376 | ```
377 |
378 | Alternatively, add your color configuration to your ripgrep config file (which
379 | is activated by setting the `RIPGREP_CONFIG_PATH` environment variable to point
380 | to your config file). For example:
381 |
382 | ```
383 | $ cat $HOME/.config/ripgrep/rc
384 | --colors=line:fg:yellow
385 | --colors=line:style:bold
386 | --colors=path:fg:green
387 | --colors=path:style:bold
388 | --colors=match:fg:black
389 | --colors=match:bg:yellow
390 | --colors=match:style:nobold
391 | $ RIPGREP_CONFIG_PATH=$HOME/.config/ripgrep/rc rg foo
392 | ```
393 |
394 |
395 |
396 | Why does ripgrep get slower when I enable PCRE2 regexes?
397 |
398 |
399 | When you use the `--pcre2` (`-P` for short) flag, ripgrep will use the PCRE2
400 | regex engine instead of the default. Both regex engines are quite fast,
401 | but PCRE2 provides a number of additional features such as look-around and
402 | backreferences that many enjoy using. This is largely because PCRE2 uses
403 | a backtracking implementation whereas the default regex engine uses a finite
404 | automaton based implementation. The former provides the ability to add lots of
405 | bells and whistles over the latter, but the latter executes with worst case
406 | linear time complexity.
407 |
408 | With that out of the way, if you've used `-P` with ripgrep, you may have
409 | noticed that it can be slower. The reasons for this are quite complex,
410 | and they are complex because the optimizations that ripgrep uses to implement
411 | fast search are complex.
412 |
413 | The task ripgrep has before it is somewhat simple; all it needs to do is search
414 | a file for occurrences of some pattern and then print the lines containing
415 | those occurrences. The problem lies in what is considered a valid match and how
416 | exactly we read the bytes from a file.
417 |
418 | In terms of what is considered a valid match, remember that ripgrep will only
419 | report matches spanning a single line by default. The problem here is that
420 | some patterns can match across multiple lines, and ripgrep needs to prevent
421 | that from happening. For example, `foo\sbar` will match `foo\nbar`. The most
422 | obvious way to achieve this is to read the data from a file, and then apply
423 | the pattern search to that data for each line. The problem with this approach
424 | is that it can be quite slow; it would be much faster to let the pattern
425 | search across as much data as possible. It's faster because it gets rid of the
426 | overhead of finding the boundaries of every line, and also because it gets rid
427 | of the overhead of starting and stopping the pattern search for every single
428 | line. (This is operating under the general assumption that matching lines are
429 | much rarer than non-matching lines.)
430 |
431 | It turns out that we can use the faster approach by applying a very simple
432 | restriction to the pattern: *statically prevent* the pattern from matching
433 | through a `\n` character. Namely, when given a pattern like `foo\sbar`,
434 | ripgrep will remove `\n` from the `\s` character class automatically. In some
435 | cases, a simple removal is not so easy. For example, ripgrep will return an
436 | error when your pattern includes a `\n` literal:
437 |
438 | ```
439 | $ rg '\n'
440 | the literal '"\n"' is not allowed in a regex
441 | ```
442 |
443 | So what does this have to do with PCRE2? Well, ripgrep's default regex engine
444 | exposes APIs for doing syntactic analysis on the pattern in a way that makes
445 | it quite easy to strip `\n` from the pattern (or otherwise detect it and report
446 | an error if stripping isn't possible). PCRE2 seemingly does not provide a
447 | similar API, so ripgrep does not do any stripping when PCRE2 is enabled. This
448 | forces ripgrep to use the "slow" search strategy of searching each line
449 | individually.
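
The effect of removing `\n` from `\s` can be observed directly with the default regex engine. This is a minimal sketch, assuming `rg` is on your `PATH` (the block does nothing otherwise); the variable names are illustrative:

```shell
# Two lines that 'foo\sbar' could only match by crossing the line break.
input=$(printf 'foo\nbar')

single_line=
multi_line=
if command -v rg >/dev/null 2>&1; then
    # Default engine without -U: "\n" is stripped from "\s", so nothing
    # matches and rg exits with a non-zero status.
    single_line=$(printf '%s\n' "$input" | rg 'foo\sbar' || true)

    # With -U/--multiline, a match is allowed to span the line break.
    multi_line=$(printf '%s\n' "$input" | rg -U 'foo\sbar' || true)

    printf 'without -U: [%s]\n' "$single_line"
    printf 'with -U: [%s]\n' "$multi_line"
fi
```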
450 |
451 | OK, so if enabling PCRE2 slows down the default method of searching because it
452 | forces matches to be limited to a single line, then why is PCRE2 also sometimes
453 | slower when performing multiline searches? Well, that's because there are
454 | *multiple* reasons why using PCRE2 in ripgrep can be slower than the default
455 | regex engine. This time, blame PCRE2's Unicode support, which ripgrep enables
456 | by default. In particular, PCRE2 cannot simultaneously enable Unicode support
457 | and search arbitrary data. That is, when PCRE2's Unicode support is enabled,
458 | the data **must** be valid UTF-8 (to do otherwise is to invoke undefined
459 | behavior). This is in contrast to ripgrep's default regex engine, which can
460 | enable Unicode support and still search arbitrary data. ripgrep's default
461 | regex engine simply won't match invalid UTF-8 for a pattern that can otherwise
462 | only match valid UTF-8. Why doesn't PCRE2 do the same? This author isn't
463 | familiar with its internals, so we can't comment on it here.
464 |
465 | The bottom line here is that we can't enable PCRE2's Unicode support without
466 | simultaneously incurring a performance penalty for ensuring that we are
467 | searching valid UTF-8. In particular, ripgrep will transcode the contents
468 | of each file to UTF-8 while replacing invalid UTF-8 data with the Unicode
469 | replacement codepoint. ripgrep then disables PCRE2's own internal UTF-8
470 | checking, since we've guaranteed the data we hand it will be valid UTF-8. The
471 | reason why ripgrep takes this approach is because if we do hand PCRE2 invalid
472 | UTF-8, then it will report a match error if it comes across an invalid UTF-8
473 | sequence. This is not good news for ripgrep, since it will stop it from
474 | searching the rest of the file, and will also print potentially undesirable
475 | error messages to users.
476 |
477 | All right, the above is a lot of information to swallow if you aren't already
478 | familiar with ripgrep internals. Let's make this concrete with some examples.
479 | First, let's get some data big enough to magnify the performance differences:
480 |
481 | ```
482 | $ curl -O 'https://burntsushi.net/stuff/subtitles2016-sample.gz'
483 | $ gzip -d subtitles2016-sample
484 | $ md5sum subtitles2016-sample
485 | e3cb796a20bbc602fbfd6bb43bda45f5 subtitles2016-sample
486 | ```
487 |
488 | To search this data, we will use the pattern `^\w{42}$`, which contains exactly
489 | one hit in the file and has no literals. Having no literals is important,
490 | because it ensures that the regex engine won't use literal optimizations to
491 | speed up the search. In other words, it lets us reason coherently about the
492 | actual task that the regex engine is performing.
493 |
494 | Let's now walk through a few examples in light of the information above. First,
495 | let's consider the default search using ripgrep's default regex engine and
496 | then the same search with PCRE2:
497 |
498 | ```
499 | $ time rg '^\w{42}$' subtitles2016-sample
500 | 21225780:EverymajordevelopmentinthehistoryofAmerica
501 |
502 | real 0m1.783s
503 | user 0m1.731s
504 | sys 0m0.051s
505 |
506 | $ time rg -P '^\w{42}$' subtitles2016-sample
507 | 21225780:EverymajordevelopmentinthehistoryofAmerica
508 |
509 | real 0m2.458s
510 | user 0m2.419s
511 | sys 0m0.038s
512 | ```
513 |
514 | In this particular example, both pattern searches are using a Unicode aware
515 | `\w` character class and both are counting lines in order to report line
516 | numbers. The key difference here is that the first search will not search
517 | line by line, but the second one will. We can observe which strategy ripgrep
518 | uses by passing the `--trace` flag:
519 |
520 | ```
521 | $ rg '^\w{42}$' subtitles2016-sample --trace
522 | [... snip ...]
523 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
524 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:712: slice reader: searching via slice-by-line strategy
525 | TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:61: searcher core: will use fast line searcher
526 | [... snip ...]
527 |
528 | $ rg -P '^\w{42}$' subtitles2016-sample --trace
529 | [... snip ...]
530 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
531 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:705: slice reader: needs transcoding, using generic reader
532 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:685: generic reader: searching via roll buffer strategy
533 | TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:63: searcher core: will use slow line searcher
534 | [... snip ...]
535 | ```
536 |
537 | The first says it is using the "fast line searcher" whereas the latter says
538 | it is using the "slow line searcher." The latter also shows that we are
539 | decoding the contents of the file, which also impacts performance.
540 |
541 | Interestingly, in this case, the pattern does not match a `\n` and the file
542 | we're searching is valid UTF-8, so neither the slow line-by-line search
543 | strategy nor the decoding are necessary. We could fix the former issue with
544 | better PCRE2 introspection APIs. We can actually fix the latter issue with
545 | ripgrep's `--no-encoding` flag, which prevents the automatic UTF-8 decoding,
546 | but will enable PCRE2's own UTF-8 validity checking. Unfortunately, it's slower
547 | in my build of ripgrep:
548 |
549 | ```
550 | $ time rg -P '^\w{42}$' subtitles2016-sample --no-encoding
551 | 21225780:EverymajordevelopmentinthehistoryofAmerica
552 |
553 | real 0m3.074s
554 | user 0m3.021s
555 | sys 0m0.051s
556 | ```
557 |
558 | (Tip: use the `--trace` flag to verify that no decoding in ripgrep is
559 | happening.)
560 |
561 | A possible reason why PCRE2's UTF-8 checking is slower is because it might
562 | not be better than the highly optimized UTF-8 checking routines found in the
563 | [`encoding_rs`](https://github.com/hsivonen/encoding_rs) library, which is what
564 | ripgrep uses for UTF-8 decoding. Moreover, my build of ripgrep enables
565 | `encoding_rs`'s SIMD optimizations, which may be in play here.
566 |
567 | Also, note that using the `--no-encoding` flag can cause PCRE2 to report
568 | invalid UTF-8 errors, which causes ripgrep to stop searching the file:
569 |
570 | ```
571 | $ cat invalid-utf8
572 | foobar
573 |
574 | $ xxd invalid-utf8
575 | 00000000: 666f 6fff 6261 720a foo.bar.
576 |
577 | $ rg foo invalid-utf8
578 | 1:foobar
579 |
580 | $ rg -P foo invalid-utf8
581 | 1:foo�bar
582 |
583 | $ rg -P foo invalid-utf8 --no-encoding
584 | invalid-utf8: PCRE2: error matching: UTF-8 error: illegal byte (0xfe or 0xff)
585 | ```
586 |
587 | All right, so at this point, you might think that we could remove the penalty
588 | for line-by-line searching by enabling multiline search. After all, our
589 | particular pattern can't match across multiple lines anyway, so we'll still get
590 | the results we want. Let's try it:
591 |
592 | ```
593 | $ time rg -U '^\w{42}$' subtitles2016-sample
594 | 21225780:EverymajordevelopmentinthehistoryofAmerica
595 |
596 | real 0m1.803s
597 | user 0m1.748s
598 | sys 0m0.054s
599 |
600 | $ time rg -P -U '^\w{42}$' subtitles2016-sample
601 | 21225780:EverymajordevelopmentinthehistoryofAmerica
602 |
603 | real 0m2.962s
604 | user 0m2.246s
605 | sys 0m0.713s
606 | ```
607 |
608 | Search times remain the same with the default regex engine, but the PCRE2
609 | search gets _slower_. What happened? The secrets can be revealed with the
610 | `--trace` flag once again. In the former case, ripgrep actually detects that
611 | the pattern can't match across multiple lines, and so will fall back to the
612 | "fast line search" strategy as with our search without `-U`.
613 |
614 | However, for PCRE2, things are much worse. Namely, since Unicode mode is still
615 | enabled, ripgrep is still going to decode UTF-8 to ensure that it hands only
616 | valid UTF-8 to PCRE2. Unfortunately, one key downside of multiline search is
617 | that ripgrep cannot do it incrementally. Since matches can be arbitrarily long,
618 | ripgrep actually needs the entire file in memory at once. Normally, we can use
619 | a memory map for this, but because we need to UTF-8 decode the file before
620 | searching it, ripgrep winds up reading the entire contents of the file on to
621 | the heap before executing a search. Owch.
622 |
623 | OK, so Unicode is killing us here. The file we're searching is _mostly_ ASCII,
624 | so maybe we're OK with missing some data. (Try `rg '[\w--\p{ascii}]'` to see
625 | non-ASCII word characters that an ASCII-only `\w` character class would miss.)
626 | We can disable Unicode in both searches, but this is done differently depending
627 | on the regex engine we use:
628 |
629 | ```
630 | $ time rg '(?-u)^\w{42}$' subtitles2016-sample
631 | 21225780:EverymajordevelopmentinthehistoryofAmerica
632 |
633 | real 0m1.714s
634 | user 0m1.669s
635 | sys 0m0.044s
636 |
637 | $ time rg -P '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
638 | 21225780:EverymajordevelopmentinthehistoryofAmerica
639 |
640 | real 0m1.997s
641 | user 0m1.958s
642 | sys 0m0.037s
643 | ```
644 |
645 | For the most part, ripgrep's default regex engine performs about the same.
646 | PCRE2 does improve a little bit, and is now almost as fast as the default
647 | regex engine. If you look at the output of `--trace`, you'll see that ripgrep
648 | will no longer perform UTF-8 decoding, but it does still use the slow
649 | line-by-line searcher.
650 |
651 | At this point, we can combine all of our insights above: let's try to get off
652 | of the slow line-by-line searcher by enabling multiline mode, and let's stop
653 | UTF-8 decoding by disabling Unicode support:
654 |
655 | ```
656 | $ time rg -U '(?-u)^\w{42}$' subtitles2016-sample
657 | 21225780:EverymajordevelopmentinthehistoryofAmerica
658 |
659 | real 0m1.714s
660 | user 0m1.655s
661 | sys 0m0.058s
662 |
663 | $ time rg -P -U '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
664 | 21225780:EverymajordevelopmentinthehistoryofAmerica
665 |
666 | real 0m1.121s
667 | user 0m1.071s
668 | sys 0m0.048s
669 | ```
670 |
671 | Ah, there's PCRE2's JIT shining! ripgrep's default regex engine once again
672 | remains about the same, but PCRE2 no longer needs to search line-by-line and it
673 | no longer needs to do any kind of UTF-8 checks. This allows the file to get
674 | memory mapped and passed right through PCRE2's JIT at impressive speeds. (As
675 | a brief and interesting historical note, the configuration of "memory map +
676 | multiline + no-Unicode" is exactly the configuration used by The Silver
677 | Searcher. This analysis perhaps sheds some light on why that
678 | configuration is useful!)
679 |
680 | In summary, if you want PCRE2 to go as fast as possible and you don't care
681 | about Unicode and you don't care about matches possibly spanning across
682 | multiple lines, then enable multiline mode with `-U` and disable PCRE2's
683 | Unicode support with the `--no-pcre2-unicode` flag.
684 |
685 | Caveat emptor: This author is not a PCRE2 expert, so there may be APIs that can
686 | improve performance that the author missed. Similarly, there may be alternative
687 | designs for a searching tool that are more amenable to how PCRE2 works.
688 |
689 |
690 |
691 | When I run `rg`, why does it execute some other command?
692 |
693 |
694 | It's likely that you have a shell alias or even another tool called `rg` which
695 | is interfering with ripgrep. Run `which rg` to see what it is.
696 |
697 | (Notably, the Rails plug-in for
698 | [Oh My Zsh](https://github.com/robbyrussell/oh-my-zsh/wiki/Plugins#rails) sets
699 | up an `rg` alias for `rails generate`.)
700 |
701 | Problems like this can be resolved in one of several ways:
702 |
703 | * If you're using the OMZ Rails plug-in, disable it by editing the `plugins`
704 | array in your zsh configuration.
705 | * Temporarily bypass an existing `rg` alias by calling ripgrep as
706 | `command rg`, `\rg`, or `'rg'`.
707 | * Temporarily bypass an existing alias or another tool named `rg` by calling
708 | ripgrep by its full path (e.g., `/usr/bin/rg` or `/usr/local/bin/rg`).
709 | * Permanently disable an existing `rg` alias by adding `unalias rg` to the
710 | bottom of your shell configuration file (e.g., `.bash_profile` or `.zshrc`).
711 | * Give ripgrep its own alias that doesn't conflict with other tools/aliases by
712 | adding a line like the following to the bottom of your shell configuration
713 | file: `alias ripgrep='command rg'`.
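A quick way to tell which of these cases applies is to ask the shell how it resolves the name. The following is only a sketch: `describe_cmd` is a hypothetical helper name (not part of ripgrep), and the exact wording printed by `type` differs between shells.

```shell
# describe_cmd is a hypothetical helper: it reports how the current shell
# resolves a command name (alias, function, builtin, or file on $PATH).
describe_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        type "$1"
    else
        echo "$1: not found"
    fi
}

describe_cmd rg

# `command` skips aliases and shell functions for a single invocation,
# so this runs the rg executable found on $PATH, if there is one.
command rg --version 2>/dev/null || echo "no rg executable found on PATH"
```

If the first line of output mentions an alias or a function, one of the bypasses listed above should apply.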
714 |
715 |
716 |
717 | How do I create an alias for ripgrep on Windows?
718 |
719 |
720 | You will often want an alias for a frequently used command that sets certain
721 | flags. But PowerShell function aliases do not behave like a typical Linux
722 | shell alias: you always need to propagate arguments and `stdin` input
723 | yourself, and it cannot be done as simply as
724 | `function grep() { $input | rg.exe --hidden $args }`.
725 |
726 | Use the example below as a reference for setting up an alias in PowerShell.
727 |
728 | ```powershell
729 | function grep {
730 |     $count = @($input).Count
731 |     $input.Reset()
732 |
733 |     if ($count) {
734 |         $input | rg.exe --hidden $args
735 |     }
736 |     else {
737 |         rg.exe --hidden $args
738 |     }
739 | }
740 | ```
741 |
742 | PowerShell special variables:
743 |
744 | * `$input` is the PowerShell `stdin` object, which gives you access to its content.
745 | * `$args` is the array of arguments passed to this function.
746 |
747 | This alias checks whether there is any `stdin` input and pipes it to `rg` only
748 | if there are some lines. Otherwise, an empty `$input` would make PowerShell
749 | trigger `rg` to search an empty `stdin`.
750 |
751 |
752 |
753 | How do I create a PowerShell profile?
754 |
755 |
756 | To customize PowerShell on start-up, you need to create a special PowerShell
757 | script, known as a profile. To find its location, type `$profile`.
758 | See
759 | [Microsoft's documentation](https://technet.microsoft.com/en-us/library/bb613488(v=vs.85).aspx)
760 | for more details.
761 |
762 | Any PowerShell code in this file is evaluated when the console starts. This
763 | way, your own aliases are created at start-up.
764 |
765 |
766 |
767 | How do I pipe non-ASCII content to ripgrep on Windows?
768 |
769 |
770 | When piping input into native executables in PowerShell, the encoding of the
771 | input is controlled by the `$OutputEncoding` variable. By default, this is set
772 | to US-ASCII, and any characters in the pipeline that don't have encodings in
773 | US-ASCII are converted to `?` (question mark) characters.
774 |
775 | To change this setting, set `$OutputEncoding` to a different encoding, as
776 | represented by a .NET encoding object. Some common examples are below. The
777 | value of this variable is reset when PowerShell restarts, so to make this
778 | change take effect every time PowerShell is started, add a line setting the
779 | variable into your PowerShell profile.
780 |
781 | Example `$OutputEncoding` settings:
782 |
783 | * UTF-8 without BOM: `$OutputEncoding = [System.Text.UTF8Encoding]::new()`
784 | * The console's output encoding:
785 | `$OutputEncoding = [System.Console]::OutputEncoding`
786 |
787 | If you continue to have encoding problems, you can also force the encoding
788 | that the console will use for printing to UTF-8 with
789 | `[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8`. This
790 | will also reset when PowerShell is restarted, so you can add that line
791 | to your profile as well if you want to make the setting permanent.
792 |
793 |
794 | How can I search and replace with ripgrep?
795 |
796 |
797 | Using ripgrep alone, you can't. ripgrep is a search tool that will never
798 | touch your files. However, the output of ripgrep can be piped to other tools
799 | that do modify files on disk. See
800 | [this issue](https://github.com/BurntSushi/ripgrep/issues/74) for more
801 | information.
802 |
803 | sed is one such tool that can modify files on disk. sed can take a filename
804 | and a substitution command to search and replace in the specified file.
805 | Files containing matching patterns can be provided to sed using
806 |
807 | ```
808 | rg foo --files-with-matches
809 | ```
810 |
811 | The output of this command is a list of filenames that contain a match for
812 | the `foo` pattern.
813 |
814 | This list can be piped into `xargs`, which will split the filenames from
815 | standard input into arguments for the command following xargs. You can use this
816 | combination to pipe a list of filenames into sed for replacement. For example:
817 |
818 | ```
819 | rg foo --files-with-matches | xargs sed -i 's/foo/bar/g'
820 | ```
821 |
822 | will replace all instances of 'foo' with 'bar' in the files in which
823 | ripgrep finds the foo pattern. The `-i` flag to sed indicates that you are
824 | editing files in place, and `s/foo/bar/g` says that you are performing a
825 | **s**ubstitution of the pattern `foo` for `bar`, and that you are doing this
826 | substitution **g**lobally (all occurrences of the pattern in each file).
827 |
828 | Note: the above command assumes that you are using GNU sed. If you are using
829 | BSD sed (the default on macOS and FreeBSD) then you must modify the above
830 | command to be the following:
831 |
832 | ```
833 | rg foo --files-with-matches | xargs sed -i '' 's/foo/bar/g'
834 | ```
835 |
836 | The `-i` flag in BSD sed requires a file extension to be given to make backups
837 | for all modified files. Specifying the empty string prevents file backups from
838 | being made.
839 |
840 | Finally, if any of your file paths contain whitespace in them, then you might
841 | need to delimit your file paths with a NUL terminator. This requires telling
842 | ripgrep to output NUL bytes between each path, and telling xargs to read paths
843 | delimited by NUL bytes:
844 |
845 | ```
846 | rg foo --files-with-matches -0 | xargs -0 sed -i 's/foo/bar/g'
847 | ```
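The pieces above can be combined into one helper that behaves the same with GNU and BSD sed. This is a minimal sketch under stated assumptions: `replace_all` is a hypothetical name (not part of ripgrep), the pattern is assumed to mean the same thing to ripgrep and to sed, and neither argument may contain a `/`.

```shell
# replace_all is a hypothetical helper, not part of ripgrep.
# Usage: replace_all PATTERN REPLACEMENT
# Assumes PATTERN is valid for both ripgrep and sed, and that neither
# argument contains a '/'.
replace_all() {
    # The explicit "." makes ripgrep search the current directory even
    # when stdin is not a terminal.
    # -0 and xargs -0 pass paths NUL-delimited, so whitespace is safe.
    # -i.bak (suffix attached directly to -i) is accepted by both GNU and
    # BSD sed and leaves a FILE.bak copy next to every modified file.
    rg "$1" --files-with-matches -0 . |
        xargs -0 sed -i.bak "s/$1/$2/g"
}
```

For example, `replace_all foo bar` rewrites every file under the current directory that contains `foo`. The `.bak` copies allow the result to be reviewed, and they can be removed once you are satisfied.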
848 |
849 | To learn more about sed, see the sed manual
850 | [here](https://www.gnu.org/software/sed/manual/sed.html).
851 |
852 | Additionally, Facebook has a tool called
853 | [fastmod](https://github.com/facebookincubator/fastmod)
854 | that uses some of the same libraries as ripgrep and might provide a more
855 | ergonomic search-and-replace experience.
856 |
857 |
858 |
859 | How is ripgrep licensed?
860 |
861 |
862 | ripgrep is dual licensed under the
863 | [Unlicense](https://unlicense.org/)
864 | and MIT licenses. Specifically, you may use ripgrep under the terms of either
865 | license.
866 |
867 | The reason why ripgrep is dual licensed this way is two-fold:
868 |
869 | 1. I, as ripgrep's author, would like to participate in a small bit of
870 | ideological activism by promoting the Unlicense's goal: to disclaim
871 | copyright monopoly interest.
872 | 2. I, as ripgrep's author, would like as many people to use ripgrep as
873 | possible. Since the Unlicense is not a proven or well known license, ripgrep
874 | is also offered under the MIT license, which is ubiquitous and accepted by
875 | almost everyone.
876 |
877 | More specifically, ripgrep and all its dependencies are compatible with this
878 | licensing choice. In particular, ripgrep's dependencies (direct and transitive)
879 | will always be limited to permissive licenses. That is, ripgrep will never
880 | depend on code that is not permissively licensed. This means rejecting any
881 | dependency that uses a copyleft license such as the GPL, LGPL, MPL or any of
882 | the Creative Commons ShareAlike licenses. Whether the license is "weak"
883 | copyleft or not does not matter; ripgrep will **not** depend on it.
884 |
885 |
886 |
887 | Can ripgrep replace grep?
888 |
889 |
890 | Yes and no.
891 |
892 | If, upon hearing that "ripgrep can replace grep," you *actually* hear, "ripgrep
893 | can be used in every instance grep can be used, in exactly the same way, for
894 | the same use cases, with exactly the same bug-for-bug behavior," then no,
895 | ripgrep trivially *cannot* replace grep. Moreover, ripgrep will *never* replace
896 | grep.
897 |
898 | If, upon hearing that "ripgrep can replace grep," you *actually* hear, "ripgrep
899 | can replace grep in some cases and not in other use cases," then yes, that is
900 | indeed true!
901 |
902 | Let's go over some of those use cases in favor of ripgrep. Some of these may
903 | not apply to you. That's OK. There may be other use cases not listed here that
904 | do apply to you. That's OK too.
905 |
906 | (For all claims related to performance in the following words, see my
907 | [blog post](https://blog.burntsushi.net/ripgrep/)
908 | introducing ripgrep.)
909 |
910 | * Are you frequently searching a repository of code? If so, ripgrep might be a
911 | good choice since there's likely a good chunk of your repository that you
912 | don't want to search. grep can, of course, be made to filter files using
913 | recursive search, and if you don't mind writing out the requisite `--exclude`
914 | rules or writing wrapper scripts, then grep might be sufficient. (I'm not
915 | kidding, I myself did this with grep for almost a decade before writing
916 | ripgrep.) But if you instead enjoy having a search tool respect your
917 | `.gitignore`, then ripgrep might be perfect for you!
918 | * Are you frequently searching non-ASCII text that is UTF-8 encoded? One of
919 | ripgrep's key features is that it can handle Unicode features in your
920 | patterns in a way that tends to be faster than GNU grep. Unicode features
921 | in ripgrep are enabled by default; there is no need to configure your locale
922 | settings to use ripgrep properly because ripgrep doesn't respect your locale
923 | settings.
924 | * Do you need to search UTF-16 files and you don't want to bother explicitly
925 | transcoding them? Great. ripgrep does this for you automatically. No need
926 | to enable it.
927 | * Do you need to search a large directory of large files? ripgrep uses
928 | parallelism by default, which tends to make it faster than a standard
929 | `grep -r` search. However, if you're OK writing the occasional
930 | `find ./ -print0 | xargs -P8 -0 grep` command, then maybe grep is good
931 | enough.
932 |
933 | Here are some cases where you might *not* want to use ripgrep. The same caveats
934 | for the previous section apply.
935 |
936 | * Are you writing portable shell scripts intended to work in a variety of
937 | environments? Great, probably not a good idea to use ripgrep! ripgrep has
938 | nowhere near the ubiquity of grep, so if you do use ripgrep, you might need
939 | to futz with the installation process more than you would with grep.
940 | * Do you care about POSIX compatibility? If so, then you can't use ripgrep
941 | because it never was, isn't and never will be POSIX compatible.
942 | * Do you hate tools that try to do something smart? If so, ripgrep is all about
943 | being smart, so you might prefer to just stick with grep.
944 | * Is there a particular feature of grep you rely on that ripgrep either doesn't
945 | have or never will have? If the former, file a bug report, maybe ripgrep can
946 | do it! If the latter, well, then, just use grep.
947 |
948 |
949 |
950 | What does the "rip" in ripgrep mean?
951 |
952 |
953 | When I first started writing ripgrep, I called it `rep`, intending it to be a
954 | shorter variant of `grep`. Soon after, I renamed it to `xrep` since `rep`
955 | wasn't obvious enough of a name for my taste. And also because adding `x` to
956 | anything always makes it better, right?
957 |
958 | Before ripgrep's first public release, I decided that I didn't like `xrep`. I
959 | thought it was slightly awkward to type, and despite my previous praise of the
960 | letter `x`, I kind of thought it was pretty lame. Being someone who really
961 | likes Rust, I wanted to call it "rustgrep" or maybe "rgrep" for short. But I
962 | thought that was just as lame, and maybe a little too in-your-face. But I
963 | wanted to continue using `r` so I could at least pretend Rust had something to
964 | do with it.
965 |
966 | I spent a couple of days trying to think of very short words that began with
967 | the letter `r` that were even somewhat related to the task of searching. I
968 | don't remember how it popped into my head, but "rip" came up as something that
969 | meant "fast," as in, "to rip through your text." The fact that RIP is also
970 | an initialism for "Rest in Peace" (as in, "ripgrep kills grep") never really
971 | dawned on me. Perhaps the coincidence is too striking to believe that, but
972 | I didn't realize it until someone explicitly pointed it out to me after the
973 | initial public release. I admit that I found it mildly amusing, but if I had
974 | realized it myself before the public release, I probably would have pressed on
975 | and chosen a different name. Alas, renaming things after a release is hard, so I
976 | decided to mush on.
977 |
978 | Given the fact that
979 | [ripgrep never was, is or will be a 100% drop-in replacement for
980 | grep](#posix4ever),
981 | ripgrep is neither actually a "grep killer" nor was it ever intended to be. It
982 | certainly does eat into some of its use cases, but that's nothing that other
983 | tools like ack or The Silver Searcher weren't already doing.
984 |
--------------------------------------------------------------------------------
/FAQ.zh.md:
--------------------------------------------------------------------------------
1 | ## FAQ
2 |
3 |
4 |
5 |
6 |
7 | - [Does ripgrep support configuration files?](#ripgrep-%E6%94%AF%E6%8C%81-%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E5%90%97)
8 | - [What's changed in ripgrep recently?](#ripgrep-%E6%9C%89%E5%95%A5%E5%8F%98%E5%8C%96)
9 | - [When is the next release?](#%E4%B8%8B%E4%B8%80%E6%AC%A1%E5%8F%91%E5%B8%83%E6%98%AF%E4%BB%80%E4%B9%88%E6%97%B6%E5%80%99)
10 | - [Does ripgrep have a man page?](#ripgrep-%E6%98%AF%E5%90%A6-%E6%9C%89-man-%E9%A1%B5%E9%9D%A2)
11 | - [Does ripgrep have support for shell auto-completion?](#ripgrep-%E6%98%AF%E5%90%A6%E6%94%AF%E6%8C%81-shell-tab-%E8%A1%A5%E5%85%A8)
12 | - [How can I get results in a consistent order?](#%E6%88%91%E8%A6%81%E6%80%8E%E4%B9%88%E5%BE%97%E5%88%B0%E8%BF%9E%E8%B4%AF%E9%A1%BA%E5%BA%8F%E7%9A%84%E7%BB%93%E6%9E%9C)
13 | - [How do I search files that aren't UTF-8?](#%E6%88%91%E8%A6%81%E5%A6%82%E4%BD%95%E6%90%9C%E7%B4%A2%E4%B8%8D%E6%98%AF-utf-8-%E7%BC%96%E7%A0%81%E7%9A%84%E6%96%87%E4%BB%B6)
14 | - [How do I search compressed files?](#%E6%88%91%E8%A6%81%E5%A6%82%E4%BD%95%E6%90%9C%E7%B4%A2%E5%8E%8B%E7%BC%A9%E6%96%87%E4%BB%B6)
15 | - [How do I search over multiple lines?](#%E6%88%91%E8%A6%81%E5%A6%82%E4%BD%95%E6%90%9C%E7%B4%A2-%E5%A4%9A%E8%A1%8C%E5%86%85%E5%AE%B9)
16 | - [How do I use lookaround and/or backreferences?](#%E5%A6%82%E4%BD%95%E4%BD%BF%E7%94%A8-lookaround-%E5%92%8C%E6%88%96-backreferences)
17 | - [How do I configure ripgrep's colors?](#%E6%80%8E%E4%B9%88%E9%85%8D%E7%BD%AE-ripgrep-%E7%9A%84-%E9%A2%9C%E8%89%B2)
18 | - [How do I enable true colors on Windows?](#%E5%A6%82%E4%BD%95%E5%9C%A8-windows-%E4%B8%8A%E5%90%AF%E7%94%A8%E7%9C%9F%E5%BD%A9%E8%89%B2)
19 | - [How do I stop ripgrep from messing up colors when I kill it?](#%E5%BD%93%E6%88%91%E6%9D%80%E6%AD%BB%E5%AE%83%E6%97%B6%E5%A6%82%E4%BD%95%E9%98%B2%E6%AD%A2-ripgrep-%E5%BC%84%E4%B9%B1%E9%A2%9C%E8%89%B2)
20 | - [How do I get around the regex size limit?](#%E6%88%91%E8%AF%A5%E6%80%8E%E4%B9%88%E7%BB%95%E8%BF%87-%E6%AD%A3%E5%88%99%E5%BC%8F-%E7%9A%84%E5%A4%A7%E5%B0%8F%E9%99%90%E5%88%B6%E5%91%A2)
21 | - [How do I make the `-f/--file` flag faster?](#%E5%A6%82%E4%BD%95%E8%AE%A9-code-f--filecode-%E6%A0%87%E5%BF%97%E6%9B%B4%E5%BF%AB)
22 | - [How do I make the output look like The Silver Searcher's output?](#%E5%A6%82%E4%BD%95%E4%BD%BF%E8%BE%93%E5%87%BA%E7%9C%8B%E8%B5%B7%E6%9D%A5%E5%83%8F-silver-searcher-%E7%9A%84%E8%BE%93%E5%87%BA)
23 | - [Why does ripgrep get slower when I enable PCRE2 regexes?](#%E4%B8%BA%E4%BB%80%E4%B9%88%E5%90%AF%E7%94%A8-pcre2-%E6%AD%A3%E5%88%99%E5%BC%8F-%E6%97%B6ripgrep-%E4%BC%9A%E5%8F%98%E6%85%A2)
24 | - [When I run `rg`, why does it execute some other command?](#%E5%BD%93%E6%88%91%E8%BF%90%E8%A1%8Ccodergcode%E6%97%B6%E5%AE%83%E4%B8%BA%E4%BB%80%E4%B9%88%E6%89%A7%E8%A1%8C%E5%85%B6%E4%BB%96%E5%91%BD%E4%BB%A4)
25 | - [How do I create an alias for ripgrep on Windows?](#%E5%A6%82%E4%BD%95%E5%9C%A8-windows-%E4%B8%8A-%E4%B8%BA-ripgrep-%E5%88%9B%E5%BB%BA%E5%88%AB%E5%90%8D)
26 | - [How do I create a PowerShell profile?](#%E5%A6%82%E4%BD%95%E5%88%9B%E5%BB%BA-powershell-%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6)
27 | - [How do I pipe non-ASCII content to ripgrep on Windows?](#windows-%E5%A6%82%E4%BD%95%E5%B0%86-%E9%9D%9E-ascii-%E5%86%85%E5%AE%B9%E4%BC%A0%E8%BE%93%E5%88%B0-ripgrep)
28 | - [How can I search and replace with ripgrep?](#%E5%A6%82%E4%BD%95%E7%94%A8-ripgrep-%E6%90%9C%E7%B4%A2%E5%B9%B6%E6%9B%BF%E6%8D%A2)
29 | - [How is ripgrep licensed?](#ripgrep-%E6%98%AF%E5%A6%82%E4%BD%95%E8%8E%B7%E5%BE%97%E8%AE%B8%E5%8F%AF%E7%9A%84)
30 | - [Can ripgrep replace grep?](#ripgrep-%E8%83%BD%E4%BB%A3%E6%9B%BF-grep-%E5%90%97)
31 | - [What does the "rip" in ripgrep mean?](#ripgrep%E4%B8%AD%E7%9A%84rip%E6%98%AF%E4%BB%80%E4%B9%88%E6%84%8F%E6%80%9D)
32 |
33 |
34 |
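35 | ### Does ripgrep support configuration files?
36 |
37 | Yes. See the [guide's section on configuration files](GUIDE.zh.md#configuration-file).
38 |
39 | ### What's changed in ripgrep recently?
40 |
41 | Please consult ripgrep's [CHANGELOG](https://github.com/BurntSushi/ripgrep/blob/master/CHANGELOG.md).
42 |
43 | ### When is the next release?
44 |
45 | ripgrep is a project whose contributors are volunteers. A release schedule adds undue pressure to said volunteers. Therefore, releases are made on a best-effort basis and no dates **will ever be given**.
46 |
47 | One exception to this is high-impact bugs. If a ripgrep release contains a significant regression, then there will generally be a strong push to get a patch release out with a fix.
48 |
49 | ### Does ripgrep have a man page?
50 |
51 | Yes! Whenever ripgrep is compiled on a system with `asciidoc` present, a man page is generated from ripgrep's argv parser. After compiling ripgrep, you can find the man page like so from the root of the repository: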
52 |
53 | ```
54 | $ find ./target -name rg.1 -print0 | xargs -0 ls -t | head -n1
55 | ./target/debug/build/ripgrep-79899d0edd4129ca/out/rg.1
56 | ```
57 |
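58 | Running `man -l ./target/debug/build/ripgrep-79899d0edd4129ca/out/rg.1` will show the man page in your normal pager.
59 |
60 | Note that the man page's documentation for options is equivalent to the output shown in `rg --help`. To see more condensed documentation (one line per flag), run `rg -h`.
61 |
62 | The man page is also included in all [ripgrep binary releases](https://github.com/BurntSushi/ripgrep/releases).
63 |
64 | ### Does ripgrep have support for shell auto-completion?
65 |
66 | Yes! After building ripgrep, shell completions can be found in the [same directory as the man page](#manpage). Zsh completions are maintained separately and committed to the repository in `complete/_rg`.
67 |
68 | Shell completions are also included in all [ripgrep binary releases](https://github.com/BurntSushi/ripgrep/releases).
69 |
70 | For **bash**, move `rg.bash` to `$XDG_CONFIG_HOME/bash_completion` or `/etc/bash_completion.d/`.
71 |
72 | For **fish**, move `rg.fish` to `$HOME/.config/fish/completions/`.
73 |
74 | For **zsh**, move `_rg` to one of your `$fpath` directories.
75 |
76 | For **PowerShell**, add `. _rg.ps1` to your PowerShell [profile]() (note the leading period). If the `_rg.ps1` file is not on your `PATH`, use `. /path/to/_rg.ps1` instead.
77 |
78 | ### How can I get results in a consistent order?
79 |
80 | By default, ripgrep uses parallelism to execute its search, because this makes the search faster on most modern systems. This in turn means that ripgrep has a non-deterministic aspect to it, since the interleaving of threads during the execution of the program is itself non-deterministic. This has the effect of printing results in a somewhat arbitrary order, and this order can change from run to run of ripgrep.
81 |
82 | The only way to make the order of results consistent is to ask ripgrep to sort the output. Currently, this disables all parallelism. (On smaller repositories, you might not notice much of a performance difference!) You can achieve this with the `--sort path` flag.
83 |
84 | There is more discussion on this topic here:
85 |
86 | ### How do I search files that aren't UTF-8?
87 |
88 | See the [guide's section on file encoding](GUIDE.zh.md#file-encoding).
89 |
90 | ### How do I search compressed files?
91 |
92 | ripgrep's `-z/--search-zip` flag will cause it to search compressed files automatically. Currently, this supports gzip, bzip2, xz, lzma, lz4, Brotli and Zstd, and requires the corresponding `gzip`, `bzip2`, `xz`,
93 | `lz4`, `brotli` and `zstd` binaries to be installed on your system. (That is, ripgrep does decompression by shelling out to another process.)
94 |
95 | ripgrep currently does not search archive formats, so `*.tar.gz` files, for example, are skipped.
96 |
97 | ### How do I search over multiple lines?
98 |
99 | The `-U/--multiline` flag enables ripgrep to report results that span multiple lines.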
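
As a small illustration, the sketch below feeds three lines to ripgrep on stdin and asks for a match that crosses a line break. The sample text is made up for this example, and the guard only exists so the snippet does nothing when `rg` is not installed.

```shell
# Three sample lines; the pattern below should match across the first two.
sample='foo
bar
baz'

if command -v rg >/dev/null 2>&1; then
    # With -U, the \n in the pattern is allowed to match a real line break,
    # so both lines that take part in the match are printed.
    printf '%s\n' "$sample" | rg -U 'foo\nbar'
fi
```

Without `-U`, the same pattern is rejected, because a pattern may not match a `\n` in the default line-oriented mode.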
100 |
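101 | ### How do I use lookaround and/or backreferences?
102 |
103 | ripgrep's default regex engine does not support lookaround or backreferences. This is primarily because the default regex engine is implemented using finite state machines in order to guarantee linear worst-case time complexity on all inputs. Backreferences are not possible to implement in this paradigm, and lookaround appears difficult to implement efficiently.
104 |
105 | However, ripgrep optionally supports using PCRE2 as the regex engine instead of the default one based on finite state machines. You can enable PCRE2 with the `-P/--pcre2` flag. For example, in the root of the ripgrep repository, you can easily find every line where a run of 10 word characters is immediately repeated: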
106 |
107 | ```
108 | $ rg -P '(\w{10})\1'
109 | tests/misc.rs
110 | 483: cmd.arg("--max-filesize").arg("44444444444444444444");
111 | globset/src/glob.rs
112 | 1206: matches!(match7, "a*a*a*a*a*a*a*a*a", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
113 | ```
114 |
115 | If your version of ripgrep doesn't support PCRE2, then you'll get an error message when you try to use the `-P/--pcre2` flag:
116 |
117 | ```
118 | $ rg -P '(\w{10})\1'
119 | PCRE2 is not available in this build of ripgrep
120 | ```
121 |
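122 | Most of the releases distributed by the ripgrep project on GitHub come bundled with PCRE2 enabled. If you installed ripgrep through different means (like your system's package manager), then please reach out to the maintainer of that package to see whether it's possible to enable the PCRE2 feature.
123 |
124 | ### How do I configure ripgrep's colors?
125 |
126 | ripgrep has two flags related to colors:
127 |
128 | - `--color` controls *when* to use colors.
129 | - `--colors` controls *which* colors to use.
130 |
131 | The `--color` flag accepts one of the following possible values: `never`, `auto`, `always` or `ansi`. The `auto` value is the default and will cause ripgrep to enable colors only when it is printing to a terminal. If you pipe ripgrep to a file or some other process, then it will suppress colors.
132 |
133 | The `--colors` flag is a bit more complicated. The general format is: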
134 |
135 | ```
136 | --colors '{type}:{attribute}:{value}'
137 | ```
138 |
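139 | - `{type}` should be one of `path`, `line`, `column` or `match`. Each of these corresponds to one of the four different types of things that ripgrep adds color to in its output. Select the type whose color you want to change.
140 | - `{attribute}` should be one of `fg`, `bg` or `style`, corresponding to foreground color, background color, or miscellaneous styling (such as whether to bold the output or not).
141 | - `{value}` is determined by the value of `{attribute}`. If `{attribute}` is `style`, then `{value}` should be one of `nobold`, `bold`, `nointense`, `intense`, `nounderline` or `underline`. If `{attribute}` is `fg` or `bg`, then `{value}` should be a color.
142 |
143 | A color is specified by one of eight English names, a single 256-color number, or an RGB triple (with over 16 million possible values, or "true color").
144 |
145 | The color names are `red`, `blue`, `green`, `cyan`, `magenta`, `yellow`, `white` or `black`.
146 |
147 | A single 256-color number is a value in the range 0-255 (inclusive). It can be given in either decimal format (e.g., `62`) or hexadecimal format (e.g., `0x3E`).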
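
As a convenience sketch (nothing ripgrep itself requires), the shell's own `printf` can convert between the two spellings of the same value:

```shell
# Convert a 256-color value between its hexadecimal and decimal spellings.
printf '%d\n' 0x3E      # prints 62
printf '0x%X\n' 62      # prints 0x3E
```

Either spelling can then be passed to `--colors`, for example `--colors 'match:fg:62'`.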
148 |
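149 | An RGB triple corresponds to three numbers (decimal or hexadecimal) separated by commas.
150 |
151 | As a special case, `--colors '{type}:none'` will clear all colors and styles associated with `{type}`, which lets you start with a clean slate (instead of building on top of ripgrep's default color settings).
152 |
153 | Here's an example that highlights matches with a blue background and bold white text: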
154 |
155 | ```
156 | $ rg somepattern \
157 | --colors 'match:none' \
158 | --colors 'match:bg:0x33,0x66,0xFF' \
159 | --colors 'match:fg:white' \
160 | --colors 'match:style:bold'
161 | ```
162 |
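163 | Colors are an ideal candidate to set in your [configuration file](GUIDE.zh.md#configuration-file). See the [question on emulating The Silver Searcher's output style](#silver-searcher-output) for an example specific to colors.
164 |
165 | ### How do I enable true colors on Windows?
166 |
167 | First, see the previous question's [answer on configuring colors](#colors).
168 |
169 | Secondly, coloring on Windows is a bit complicated. If you're using a terminal like Cygwin, then it's likely that true color support already works out of the box. However, if you are using a normal Windows console (`cmd` or `PowerShell`) and a version of Windows prior to 10, then there is no known way to get true color support. If you are on Windows 10 and using a Windows console, then true colors should work out of the box with one caveat: you might need to clear ripgrep's default color settings first. That is, instead of this: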
170 |
171 | ```
172 | $ rg somepattern --colors 'match:fg:0x33,0x66,0xFF'
173 | ```
174 |
175 | you should do this:
176 |
177 | ```
178 | $ rg somepattern --colors 'match:none' --colors 'match:fg:0x33,0x66,0xFF'
179 | ```
180 |
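181 | This is because ripgrep might set the default style for `match` to `bold`, and it seems that Windows 10's VT100 support doesn't permit bold and true color ANSI escapes to be used simultaneously. The work-around above clears ripgrep's default styling, allowing you to craft it exactly as desired.
182 |
183 | ### How do I stop ripgrep from messing up colors when I kill it?
184 |
185 | Type `color` in cmd.exe (Command Prompt), or `echo -ne "\033[0m"` on Unix-like systems, to restore your original foreground color.
186 |
187 | In PowerShell, you can add the following code to your profile, which will restore the original foreground color when `Reset-ForegroundColor` is called. Including the `Set-Alias` line will allow you to call it with simply `color`.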
188 |
189 | ```powershell
190 | $OrigFgColor = $Host.UI.RawUI.ForegroundColor
191 | function Reset-ForegroundColor {
192 |     $Host.UI.RawUI.ForegroundColor = $OrigFgColor
193 | }
194 | Set-Alias -Name color -Value Reset-ForegroundColor
195 | ```
196 |
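197 | PR [#187](https://github.com/BurntSushi/ripgrep/pull/187) fixed this, and it was later deprecated in [#281](https://github.com/BurntSushi/ripgrep/issues/281). A full explanation is available [here](https://github.com/BurntSushi/ripgrep/issues/281#issuecomment-269093893).
198 |
199 | ### How do I get around the regex size limit?
200 |
201 | If you've given ripgrep a particularly large pattern (or a large number of smaller patterns), then it is possible that it will fail to compile because it hit a pre-set limit. For example: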
202 |
203 | ```
204 | $ rg '\pL{1000}'
205 | Compiled regex exceeds size limit of 10485760 bytes.
206 | ```
207 |
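208 | (Note: `\pL{1000}` may look small, but `\pL` is the character class containing all Unicode letters, which is quite large. *And* it's repeated 1000 times.)
209 |
210 | In this case, you can work around the problem by simply increasing the limit: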
211 |
212 | ```
213 | $ rg '\pL{1000}' --regex-size-limit 1G
214 | ```
215 |
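216 | Increasing the limit to 1GB does not necessarily mean that ripgrep will use that much memory. The limit just says that it's allowed to use (approximately) that much memory for constructing the regular expression.
217 |
218 | ### How do I make the `-f/--file` flag faster?
219 |
220 | The `-f/--file` flag lets you give ripgrep a file that contains one pattern per line. ripgrep will then report any line that matches any of the patterns.
221 |
222 | If this pattern file gets too big, then it is possible that ripgrep will slow down dramatically. *Typically* this is because an internal cache is too small, which causes ripgrep to fall back to a slower but more robust regex engine. If this is indeed the problem, then it is possible to increase this cache and regain speed. The cache can be controlled via the `--dfa-size-limit` flag. For example, using `--dfa-size-limit 1G` will set the cache size to 1GB. (Note that this doesn't mean ripgrep will use 1GB of memory automatically, but it will allow the regex engine to do so if it needs to.)
223 |
224 | ### How do I make the output look like The Silver Searcher's output?
225 |
226 | Use the `--colors` flag, like so: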
227 |
228 | ```
229 | rg --colors line:fg:yellow \
230 | --colors line:style:bold \
231 | --colors path:fg:green \
232 | --colors path:style:bold \
233 | --colors match:fg:black \
234 | --colors match:bg:yellow \
235 | --colors match:style:nobold \
236 | foo
237 | ```
238 |
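239 | Alternatively, add your color configuration to your ripgrep config file (which is activated by setting the `RIPGREP_CONFIG_PATH` environment variable to point to your config file). For example: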
240 |
241 | ```
242 | $ cat $HOME/.config/ripgrep/rc
243 | --colors=line:fg:yellow
244 | --colors=line:style:bold
245 | --colors=path:fg:green
246 | --colors=path:style:bold
247 | --colors=match:fg:black
248 | --colors=match:bg:yellow
249 | --colors=match:style:nobold
250 | $ RIPGREP_CONFIG_PATH=$HOME/.config/ripgrep/rc rg foo
251 | ```
252 |
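253 | ### Why does ripgrep get slower when I enable PCRE2 regexes?
254 |
255 | When you use the `--pcre2` (`-P` for short) flag, ripgrep will use the PCRE2 regex engine instead of the default. Both regex engines are quite fast, but PCRE2 provides a number of additional features, such as look-around and
256 | backreferences, that many enjoy using. This is largely because PCRE2 uses a backtracking implementation whereas the default regex engine uses a finite automaton based implementation. The former makes it possible to add lots of bells and whistles over the latter, but the latter executes with worst-case linear time complexity.
257 |
258 | > Translator's note: [Lookaround: an explanatory article found on Zhihu](https://zhuanlan.zhihu.com/p/50789818)
259 |
260 | With that out of the way, if you've used `-P` with ripgrep, you may have noticed that it can be slower. The reasons why are quite complex, because the optimizations ripgrep uses to implement fast search are themselves complex.
261 |
262 | The task ripgrep has before it is somewhat simple; all it needs to do is search a file for occurrences of some pattern and then print the lines containing those occurrences. The problem lies in what is considered a valid match and how exactly we read the bytes from a file.
263 |
264 | In terms of what is considered a valid match, remember that ripgrep will only report matches spanning a single line by default. The problem here is that some patterns can match across multiple lines, and ripgrep needs to prevent that from happening. For example, `foo\sbar` will match `foo\nbar`. The most obvious way to achieve this is to read the data from a file, and then apply the pattern search to that data line by line. The problem with this approach is that it can be quite slow; it would be much faster to let the pattern search across as much data as possible. It's faster because it gets rid of the overhead of finding the boundaries of every line, and also because it gets rid of the overhead of starting and stopping the pattern search for every single line. (This is operating under the general assumption that matching lines are much rarer than non-matching lines.)
265 |
266 | It turns out that we can use the faster approach by applying a very simple restriction to the pattern: *statically prevent* the pattern from matching through a `\n` character. Namely, when given a pattern like `foo\sbar`, ripgrep will remove `\n` from the `\s` character class automatically. In some cases, a simple removal is not so easy. For example, ripgrep will return an error when your pattern includes a `\n` literal: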
267 |
268 | ```
269 | $ rg '\n'
270 | the literal '"\n"' is not allowed in a regex
271 | ```
272 |
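273 | So what does this have to do with PCRE2? Well, ripgrep's default regex engine exposes APIs for doing syntactic analysis on the pattern in a way that makes it quite easy to strip `\n` from the pattern (or otherwise detect it and report an error if stripping isn't possible). PCRE2 seemingly does not provide a similar API, so ripgrep does not do any stripping when PCRE2 is enabled. This forces ripgrep to use the "slow" search strategy of searching each line individually.
274 |
275 | OK, so if enabling PCRE2 slows down the default method of searching because it forces matches to be limited to a single line, then why is PCRE2 also sometimes slower when performing multiline searches? Well, that's because there are *multiple* reasons why using PCRE2 in ripgrep can be slower than the default regex engine. This time, blame PCRE2's Unicode support, which ripgrep enables by default. In particular, PCRE2 cannot simultaneously enable Unicode support and search arbitrary data. That is, when PCRE2's Unicode support is enabled, the data **must** be valid UTF-8 (to do otherwise is to invoke undefined behavior). This is in contrast to ripgrep's default regex engine, which can enable Unicode support and still search arbitrary data. ripgrep's default regex engine simply won't match invalid UTF-8 for a pattern that can otherwise only match valid UTF-8. Why doesn't PCRE2 do the same? This author isn't familiar with its internals, so we can't comment on it here.
276 |
277 | The bottom line here is that we can't enable PCRE2's Unicode support without simultaneously incurring a performance penalty for ensuring that we are searching valid UTF-8. In particular, ripgrep will transcode the contents of each file to UTF-8 while replacing invalid UTF-8 data with the Unicode replacement codepoint. ripgrep then disables PCRE2's own internal UTF-8 checking, since we've guaranteed that the data we hand it will be valid UTF-8. The reason why ripgrep takes this approach is that if we do hand PCRE2 invalid UTF-8, then it will report a match error when it comes across an invalid UTF-8 sequence. This is not good news for ripgrep, since it will stop it from searching the rest of the file, and will also print potentially undesirable error messages to users.
278 |
279 | All right, the above is a lot of information to swallow if you aren't already familiar with ripgrep internals. Let's make this concrete with some examples. First, let's get some data big enough to magnify the performance differences: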
280 |
281 | ```
282 | $ curl -O 'https://burntsushi.net/stuff/subtitles2016-sample.gz'
283 | $ gzip -d subtitles2016-sample
284 | $ md5sum subtitles2016-sample
285 | e3cb796a20bbc602fbfd6bb43bda45f5 subtitles2016-sample
286 | ```
287 |
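288 | To search this data, we will use the pattern `^\w{42}$`, which has exactly one hit in the file and contains no literals. Having no literals is important, because it ensures that the regex engine won't use literal optimizations to speed up the search. In other words, it lets us reason coherently about the actual task that the regex engine is performing.
289 |
290 | Let's now walk through a few examples in light of the information above. First, let's consider the default search using ripgrep's default regex engine and then the same search with PCRE2: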
291 |
292 | ```
293 | $ time rg '^\w{42}$' subtitles2016-sample
294 | 21225780:EverymajordevelopmentinthehistoryofAmerica
295 |
296 | real 0m1.783s
297 | user 0m1.731s
298 | sys 0m0.051s
299 |
300 | $ time rg -P '^\w{42}$' subtitles2016-sample
301 | 21225780:EverymajordevelopmentinthehistoryofAmerica
302 |
303 | real 0m2.458s
304 | user 0m2.419s
305 | sys 0m0.038s
306 | ```
307 |
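308 | In this particular example, both pattern searches are using a Unicode-aware `\w` character class and both are counting lines in order to report line numbers. The key difference here is that the first search will not search line by line, but the second one will. We can observe which strategy ripgrep uses by passing the `--trace` flag: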
309 |
310 | ```
311 | $ rg '^\w{42}$' subtitles2016-sample --trace
312 | [... snip ...]
313 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
314 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:712: slice reader: searching via slice-by-line strategy
315 | TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:61: searcher core: will use fast line searcher
316 | [... snip ...]
317 |
318 | $ rg -P '^\w{42}$' subtitles2016-sample --trace
319 | [... snip ...]
320 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:622: Some("subtitles2016-sample"): searching via memory map
321 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:705: slice reader: needs transcoding, using generic reader
322 | TRACE|grep_searcher::searcher|grep-searcher/src/searcher/mod.rs:685: generic reader: searching via roll buffer strategy
323 | TRACE|grep_searcher::searcher::core|grep-searcher/src/searcher/core.rs:63: searcher core: will use slow line searcher
324 | [... snip ...]
325 | ```
326 |
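327 | The first says it is using the "fast line searcher" (`will use fast line searcher`), whereas the latter says it is using the "slow line searcher" (`will use slow line searcher`).
328 |
329 | Interestingly, in this case, the pattern does not match a `\n` and the file we're searching is valid UTF-8, so neither the slow line-by-line search strategy nor the decoding is necessary. We could fix the former issue with better PCRE2 introspection APIs. We can actually fix the latter issue with ripgrep's `--no-encoding` flag, which prevents the automatic UTF-8 decoding, but will enable PCRE2's own UTF-8 validity checking. Unfortunately, it's slower in my build of ripgrep: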
330 |
331 | ```
332 | $ time rg -P '^\w{42}$' subtitles2016-sample --no-encoding
333 | 21225780:EverymajordevelopmentinthehistoryofAmerica
334 |
335 | real 0m3.074s
336 | user 0m3.021s
337 | sys 0m0.051s
338 | ```
339 |
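340 | (Tip: use the `--trace` flag to verify that no decoding in ripgrep is happening.)
341 |
342 | A possible reason why PCRE2's UTF-8 checking is slower is that it might not be better than the highly optimized UTF-8 checking routines found in the [`encoding_rs`](https://github.com/hsivonen/encoding_rs) library, which is what ripgrep uses for UTF-8 decoding. Moreover, my build of ripgrep enables `encoding_rs`'s SIMD optimizations, which may be in play here.
343 |
344 | Also, note that using the `--no-encoding` flag can cause PCRE2 to report invalid UTF-8 errors, which causes ripgrep to stop searching the file: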
345 |
346 | ```
347 | $ cat invalid-utf8
348 | foobar
349 |
350 | $ xxd invalid-utf8
351 | 00000000: 666f 6fff 6261 720a foo.bar.
352 |
353 | $ rg foo invalid-utf8
354 | 1:foobar
355 |
356 | $ rg -P foo invalid-utf8
357 | 1:foo�bar
358 |
359 | $ rg -P foo invalid-utf8 --no-encoding
360 | invalid-utf8: PCRE2: error matching: UTF-8 error: illegal byte (0xfe or 0xff)
361 | ```
362 |
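363 | All right, so at this point, you might think that we could remove the penalty for line-by-line searching by enabling multiline search. After all, our particular pattern **can't match across multiple lines** anyway, so we'll still get the results we want. Let's try it: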
364 |
365 | ```
366 | $ time rg -U '^\w{42}$' subtitles2016-sample
367 | 21225780:EverymajordevelopmentinthehistoryofAmerica
368 |
369 | real 0m1.803s
370 | user 0m1.748s
371 | sys 0m0.054s
372 |
373 | $ time rg -P -U '^\w{42}$' subtitles2016-sample
374 | 21225780:EverymajordevelopmentinthehistoryofAmerica
375 |
376 | real 0m2.962s
377 | user 0m2.246s
378 | sys 0m0.713s
379 | ```
380 |
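381 | Search times remain the same with the default regex engine, but the PCRE2 search gets *slower*. What happened? The secrets can be revealed with the `--trace` flag once again. In the former case, ripgrep actually detects that the pattern can't match across multiple lines, and so falls back to the "fast line search" strategy, just as in our search without `-U`.
382 |
383 | However, for PCRE2, things are much worse. Namely, since Unicode mode is still enabled, ripgrep is still going to decode UTF-8 to ensure that it hands only valid UTF-8 to PCRE2. Unfortunately, one key downside of multiline search is that ripgrep cannot do it incrementally. Since matches can be arbitrarily long, ripgrep actually needs the entire file in memory at once. Normally, we can use a memory map for this, but because we need to UTF-8 decode the file before searching it, ripgrep winds up reading the entire contents of the file onto the heap before executing a search. Owch.
384 |
385 | OK, so Unicode is killing us here. The file we're searching is *mostly* ASCII, so maybe we're OK with missing some data. (Try `rg '[\w--\p{ascii}]'` to see the non-ASCII word characters that an ASCII-only `\w` character class would miss.) We can disable Unicode in both searches, but this is done differently depending on the regex engine we use: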
386 |
387 | ```
388 | $ time rg '(?-u)^\w{42}$' subtitles2016-sample
389 | 21225780:EverymajordevelopmentinthehistoryofAmerica
390 |
391 | real 0m1.714s
392 | user 0m1.669s
393 | sys 0m0.044s
394 |
395 | $ time rg -P '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
396 | 21225780:EverymajordevelopmentinthehistoryofAmerica
397 |
398 | real 0m1.997s
399 | user 0m1.958s
400 | sys 0m0.037s
401 | ```
402 |
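403 | For the most part, ripgrep's default regex engine performs about the same. PCRE2 does improve a little, and is now almost as fast as the default regex engine. If you look at the `--trace` output, you'll see that ripgrep no longer performs UTF-8 decoding, but it still uses the slow line-by-line searcher.
404 |
405 | At this point, we can combine all of our insights above: let's try to get off the slow line-by-line searcher by enabling multiline mode, and let's stop UTF-8 decoding by disabling Unicode support: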
406 |
407 | ```
408 | $ time rg -U '(?-u)^\w{42}$' subtitles2016-sample
409 | 21225780:EverymajordevelopmentinthehistoryofAmerica
410 |
411 | real 0m1.714s
412 | user 0m1.655s
413 | sys 0m0.058s
414 |
415 | $ time rg -P -U '^\w{42}$' subtitles2016-sample --no-pcre2-unicode
416 | 21225780:EverymajordevelopmentinthehistoryofAmerica
417 |
418 | real 0m1.121s
419 | user 0m1.071s
420 | sys 0m0.048s
421 | ```
422 |
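423 | Ah, there's PCRE2's JIT shining! ripgrep's default regex engine once again remains about the same, but PCRE2 no longer needs to search line by line, and it no longer needs to do any kind of UTF-8 checking. This allows the file to be memory mapped and passed right through PCRE2's JIT at impressive speeds. (As a brief and interesting historical note, the configuration of "memory map + multiline + no Unicode" is exactly the configuration used by The Silver Searcher. This analysis perhaps sheds some light on why that configuration is useful!)
424 |
425 | In summary, if you want PCRE2 to go as fast as possible, and you care neither about Unicode nor about matches that might span multiple lines, then enable multiline mode with `-U` and disable PCRE2's Unicode support with the `--no-pcre2-unicode` flag.
426 |
427 | Caveat emptor: this author is not a PCRE2 expert, so there may be APIs that improve performance that the author missed. Similarly, there may be alternative designs for a search tool that are more amenable to how PCRE2 works.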
428 |
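429 | ### When I run `rg`, why does it execute some other command?
430 |
431 | It's likely that you have a shell alias, or even another tool called `rg`, that is interfering with ripgrep. Run `which rg` to see what it is.
432 |
433 | (Notably, the Rails plug-in for [Oh My Zsh](https://github.com/robbyrussell/oh-my-zsh/wiki/Plugins#rails) sets up an `rg` alias for `rails generate`.)
434 |
435 | Problems like this can be resolved in one of several ways:
436 |
437 | - If you're using the OMZ Rails plug-in, disable it by editing the `plugins` array in your zsh configuration.
438 | - Temporarily bypass an existing `rg` alias by calling ripgrep as `command rg`, `\rg`, or `'rg'`.
439 | - Temporarily bypass an existing alias or another tool named `rg` by calling ripgrep by its full path (e.g., `/usr/bin/rg` or `/usr/local/bin/rg`).
440 | - Permanently disable an existing `rg` alias by adding `unalias rg` to the bottom of your shell configuration file (e.g., `.bash_profile` or `.zshrc`).
441 | - Give ripgrep its own alias that doesn't conflict with other tools or aliases by adding a line like the following to the bottom of your shell configuration file: `alias ripgrep='command rg'`.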
442 |
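443 | ### How do I create an alias for ripgrep on Windows?
444 |
445 | Often you will want to create an alias for a command that you use a lot with certain flags set. But PowerShell function aliases do not behave like a typical Linux shell alias: you always need to propagate the arguments and `stdin` input, and it cannot be done as simply as `function grep() { $input | rg.exe --hidden $args }`.
446 |
447 | Use the example below as a reference for how to set up an alias in PowerShell.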
448 |
449 | ```powershell
450 | function grep {
451 | $count = @($input).Count
452 | $input.Reset()
453 |
454 | if ($count) {
455 | $input | rg.exe --hidden $args
456 | }
457 | else {
458 | rg.exe --hidden $args
459 | }
460 | }
461 | ```
462 |
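463 | PowerShell special variables:
464 |
465 | - `input` - the PowerShell `stdin` object, which allows you to access its content.
466 | - `args` - the array of arguments passed to this function.
467 |
468 | This alias (a function) checks whether there is any `stdin` input and propagates it to `rg.exe` only if there is. Otherwise, an empty `$input` would make PowerShell trigger `rg` to search an empty `stdin`.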
469 |
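470 | ### How do I create a PowerShell profile?
471 |
472 | To customize PowerShell on start-up, you have to create a special PowerShell script. To find its location, type `$profile`. See [Microsoft's documentation]() for more details.
473 |
474 | Any PowerShell code in this file is evaluated when the console starts. This way, you can have your own aliases created at start-up.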
475 |
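476 | ### How do I pipe non-ASCII content to ripgrep on Windows?
477 |
478 | When piping input into native executables in PowerShell, the encoding of the input is controlled by the `$OutputEncoding` variable. By default, this is set to US-ASCII, and any characters in the pipeline that don't have encodings in US-ASCII are converted to `?` (question mark) characters.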
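
The effect of piping through a US-ASCII encoder can be reproduced in a few lines. The sketch below uses Python only as a stand-in for the behavior described above; it does not call PowerShell, and the sample string is an arbitrary assumption:

```python
# Two non-ASCII characters, written as escapes so the source stays ASCII.
text = "na\u00efve caf\u00e9"

# Encoding to US-ASCII with replacement turns every unrepresentable
# character into a single question mark.
piped = text.encode("ascii", errors="replace").decode("ascii")

print(piped)
```

Any pattern that contains the original non-ASCII characters can no longer match text that has been through this conversion, which is why changing the encoding matters before searching.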
479 |
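480 | To change this setting, set `$OutputEncoding` to a different encoding, as represented by a .NET encoding object. Some common examples are below. The value of this variable is reset when PowerShell restarts, so to make this change take effect every time PowerShell is started, add a line setting the variable to your PowerShell profile.
481 |
482 | Example `$OutputEncoding` settings:
483 |
484 | - UTF-8 without BOM: `$OutputEncoding = [System.Text.UTF8Encoding]::new()`
485 | - The console's output encoding: `$OutputEncoding = [System.Console]::OutputEncoding`
486 |
487 | If you continue to have encoding problems, you can also force the console to print in UTF-8 with `[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8`. This is also reset when PowerShell restarts, so you can add that line to your profile as well if you want to make the setting permanent.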
488 |
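489 | ### How can I search and replace with ripgrep?
490 |
491 | ripgrep is a search tool that will never touch your files. However, the output of ripgrep can be piped to other tools that do modify files on disk. See [this issue](https://github.com/BurntSushi/ripgrep/issues/74) for more information.
492 |
493 | sed is one such tool that can modify files on disk. sed can take a filename and a substitution command to search and replace in the specified file. Files containing matching patterns can be provided to sed using: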
494 |
495 | ```
496 | rg foo --files-with-matches
497 | ```
498 |
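499 | The output of this command is a list of filenames that contain a match for the `foo` pattern.
500 |
501 | This list can be piped into `xargs`, which splits the filenames from standard input into arguments for the command following xargs. You can use this combination to pipe a list of filenames into sed for replacement. For example: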
502 |
503 | ```
504 | rg foo --files-with-matches | xargs sed -i 's/foo/bar/g'
505 | ```
506 |
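507 | This replaces all instances of "foo" with "bar" in the files in which ripgrep finds the `foo` pattern. The `-i` flag tells sed to edit files in place, and `s/foo/bar/g` says that you are performing a **s**ubstitution of `foo` with `bar`, and that the substitution is **g**lobal (every occurrence of the pattern in each file is replaced).
508 |
509 | Note: the above command assumes that you are using GNU sed. If you are using BSD sed (the default on macOS and FreeBSD), then you must modify the above command as follows: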
510 |
511 | ```
512 | rg foo --files-with-matches | xargs sed -i '' 's/foo/bar/g'
513 | ```
514 |
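515 | The `-i` flag in BSD sed requires a file extension to be given so that backups can be made of all modified files. Specifying the empty string prevents backups from being made.
516 |
517 | Finally, if any of your file paths contain whitespace, then you might need to delimit your file paths with a NUL terminator. This requires telling ripgrep to output a NUL byte between each path, and telling xargs to read paths delimited by NUL bytes: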
518 |
519 | ```
520 | rg foo --files-with-matches -0 | xargs -0 sed -i 's/foo/bar/g'
521 | ```
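
To see why NUL delimiters help, consider how a list of paths is split back apart. The sketch below is a simplified model in Python, with hypothetical file names; real `xargs` also treats quotes and backslashes specially, which is not modeled here:

```python
paths = ["docs/read me.md", "src/main.rs"]

# One path per line, as printed without -0. Splitting on whitespace
# (a simplified model of default xargs behavior) breaks the first path apart.
newline_output = "\n".join(paths) + "\n"
by_whitespace = newline_output.split()

# NUL-terminated paths, as printed with -0 and consumed by `xargs -0`.
nul_output = "".join(p + "\0" for p in paths)
by_nul = [p for p in nul_output.split("\0") if p]

print(by_whitespace)
print(by_nul)
```

Only the NUL-delimited form recovers the original two paths, since a NUL byte cannot occur inside a file path.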
522 |
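523 | To learn more about sed, see the sed manual [here](https://www.gnu.org/software/sed/manual/sed.html).
524 |
525 | Additionally, Facebook has a tool called [fastmod](https://github.com/facebookincubator/fastmod) that uses some of the same libraries as ripgrep and might provide a more ergonomic search-and-replace experience.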
526 |
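527 | ### How is ripgrep licensed?
528 |
529 | ripgrep is dual licensed under the [Unlicense](https://unlicense.org/) and MIT licenses. Specifically, you may use ripgrep under the terms of either license.
530 |
531 | The reason ripgrep is dual licensed this way is two-fold:
532 |
533 | 1. I, as ripgrep's author, would like to participate in a small bit of ideological activism by promoting the Unlicense's goal: to disclaim copyright monopoly interest.
534 | 2. I, as ripgrep's author, would like as many people as possible to use ripgrep. Since the Unlicense is not a proven or well-known license, ripgrep is also offered under the MIT license, which is ubiquitous and accepted by almost everyone.
535 |
536 | More specifically, ripgrep and all of its dependencies are compatible with this licensing choice. In particular, ripgrep's dependencies (direct and transitive) will always be limited to permissive licenses. That is, ripgrep will never depend on code that is not permissively licensed. This means rejecting any dependency that uses a copyleft license such as the GPL, LGPL, MPL or any of the Creative Commons ShareAlike licenses. It does not matter whether the license is "weak" copyleft or not; ripgrep will **not** depend on it.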
537 |
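538 | ### Can ripgrep replace grep?
539 |
540 | Yes and no.
541 |
542 | If, upon hearing that "ripgrep can replace grep," you *actually* hear, "ripgrep can be used in every instance grep can be used, in exactly the same way, for the same use cases, with exactly the same bug-for-bug behavior," then no, ripgrep trivially *cannot* replace grep. Moreover, ripgrep will *never* replace grep.
543 |
544 | If, upon hearing that "ripgrep can replace grep," you *actually* hear, "ripgrep can replace grep in some cases and not in others," then yes, that is indeed true!
545 |
546 | Let's go over some of the use cases that favor ripgrep. Some of these may not apply to you. That's OK. There may be other use cases not listed here that do apply to you. That's OK too.
547 |
548 | (For all performance-related claims below, see my [blog post](https://blog.burntsushi.net/ripgrep/) introducing ripgrep.)
549 |
550 | - Do you frequently search code repositories? If so, ripgrep might be a good choice, since there's likely a good chunk of your repository that you don't want to search. grep can, of course, be made to filter files during a recursive search, and if you don't mind writing out the requisite `--exclude` rules or writing wrapper scripts, then grep might be sufficient. (I'm not kidding, I did this myself with grep for almost a decade before writing ripgrep.) But if you instead enjoy having a search tool respect your `.gitignore`, then ripgrep might be perfect for you!
551 | - Do you frequently search non-ASCII text that is UTF-8 encoded? One of ripgrep's key features is that it can handle Unicode features in your patterns in a way that tends to be faster than GNU grep. Unicode features in ripgrep are enabled by default; there is no need to configure your locale settings to use ripgrep properly, because ripgrep doesn't respect your locale settings.
552 | - Do you need to search UTF-16 files without bothering to transcode them explicitly? Great. ripgrep does this for you automatically. No need to enable it.
553 | - Do you need to search a large directory of large files? ripgrep uses parallelism by default, which tends to make it faster than a standard `grep -r` search. However, if you're OK writing the occasional `find ./ -print0 | xargs -P8 -0 grep` command, then maybe grep is good enough.
554 |
555 | Here are some cases where you might *not* want to use ripgrep. The same caveats from the previous section apply.
556 |
557 | - Are you writing portable shell scripts intended to work in a variety of environments? Great, then it's probably not a good idea to use ripgrep! ripgrep has nowhere near the ubiquity of grep, so if you do use ripgrep, you might need to futz with the installation process more than you would with grep.
558 | - Do you care about POSIX compatibility? If so, then you can't use ripgrep, because it never was, isn't and never will be POSIX compatible.
559 | - Do you hate tools that try to do something smart? If so, ripgrep is all about being smart, so you might prefer to just stick with grep.
560 | - Is there a particular feature of grep you rely on that ripgrep doesn't have? If so, file a bug report; maybe ripgrep can do it! If it never will, well, then, just use grep.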
561 |
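562 | ### What does the "rip" in ripgrep mean?
563 |
564 | When I first started writing ripgrep, I called it `rep`, intending it to be a shorter variant of `grep`. Soon after, I renamed it to `xrep`, since `rep` wasn't obvious enough of a name for my taste. And also because adding `x` to anything always makes it better, right?
565 |
566 | > Translator's note: a nod to companies that name a product generation "X".
567 |
568 | Before ripgrep's first public release, I decided that I didn't like `xrep`. I thought it was slightly awkward to type, and despite my previous praise of the letter `x`, I kind of thought it was pretty lame. Being someone who really likes Rust, I wanted to call it "rustgrep" or maybe "rgrep" for short. But I thought that was just as lame, and maybe a little too in-your-face. But I wanted to continue using `r` so I could at least pretend Rust had something to do with it.
569 |
570 | I spent a couple of days trying to think of very short words beginning with the letter `r` that were even somewhat related to the task of searching. I don't remember how it popped into my head, but "rip" came up as something that meant "fast," as in, "to rip through your text." The fact that RIP is also an initialism for "Rest in Peace" (as in, "ripgrep kills grep") never really dawned on me. Perhaps the coincidence is too striking to believe, but I didn't realize it until someone explicitly pointed it out to me after the initial public release. I admit that I found it mildly amusing, but if I had realized it myself before the public release, I probably would have pressed on and chosen a different name. Alas, renaming things after a release is hard, so I decided to leave it be. It was not intentional.
571 |
572 | Given the fact that [ripgrep never was, is not, and never will be a 100% drop-in replacement for grep](#posix4ever), ripgrep is neither actually a "grep killer" nor was it ever intended to be. It certainly does eat into some of grep's use cases, but that's nothing that other tools like ack or The Silver Searcher weren't already doing.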
573 |
--------------------------------------------------------------------------------
/GUIDE.md:
--------------------------------------------------------------------------------
1 | ## User Guide
2 |
3 | This guide is intended to give an elementary description of ripgrep and an
4 | overview of its capabilities. This guide assumes that ripgrep is
5 | [installed](README.md#installation)
6 | and that readers have passing familiarity with using command line tools. This
7 | also assumes a Unix-like system, although most commands are probably easily
8 | translatable to any command line shell environment.
9 |
10 |
11 | ### Table of Contents
12 |
13 | * [Basics](#basics)
14 | * [Recursive search](#recursive-search)
15 | * [Automatic filtering](#automatic-filtering)
16 | * [Manual filtering: globs](#manual-filtering-globs)
17 | * [Manual filtering: file types](#manual-filtering-file-types)
18 | * [Replacements](#replacements)
19 | * [Configuration file](#configuration-file)
20 | * [File encoding](#file-encoding)
21 | * [Binary data](#binary-data)
22 | * [Common options](#common-options)
23 |
24 |
25 | ### Basics
26 |
27 | ripgrep is a command line tool that searches your files for patterns that
28 | you give it. ripgrep behaves as if reading each file line by line. If a line
29 | matches the pattern provided to ripgrep, then that line will be printed. If a
30 | line does not match the pattern, then the line is not printed.
31 |
32 | The best way to see how this works is with an example. To show an example, we
33 | need something to search. Let's try searching ripgrep's source code. First
34 | grab a ripgrep source archive from
35 | https://github.com/BurntSushi/ripgrep/archive/0.7.1.zip
36 | and extract it:
37 |
38 | ```
39 | $ curl -LO https://github.com/BurntSushi/ripgrep/archive/0.7.1.zip
40 | $ unzip 0.7.1.zip
41 | $ cd ripgrep-0.7.1
42 | $ ls
43 | benchsuite grep tests Cargo.toml LICENSE-MIT
44 | ci ignore wincolor CHANGELOG.md README.md
45 | complete pkg appveyor.yml compile snapcraft.yaml
46 | doc src build.rs COPYING UNLICENSE
47 | globset termcolor Cargo.lock HomebrewFormula
48 | ```
49 |
50 | Let's try our first search by looking for all occurrences of the word `fast`
51 | in `README.md`:
52 |
53 | ```
54 | $ rg fast README.md
55 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
56 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
57 | 119:### Is it really faster than everything else?
58 | 124:Summarizing, `ripgrep` is fast because:
59 | 129: optimizations to make searching very fast.
60 | ```
61 |
62 | (**Note:** If you see an error message from ripgrep saying that it didn't
63 | search any files, then re-run ripgrep with the `--debug` flag. One likely cause
64 | of this is that you have a `*` rule in a `$HOME/.gitignore` file.)
65 |
66 | So what happened here? ripgrep read the contents of `README.md`, and for each
67 | line that contained `fast`, ripgrep printed it to your terminal. ripgrep also
68 | included the line number for each line by default. If your terminal supports
69 | colors, then your output might actually look something like this screenshot:
70 |
71 | [A screenshot of a sample search with ripgrep](https://burntsushi.net/stuff/ripgrep-guide-sample.png)
72 |
73 | In this example, we searched for something called a "literal" string. This
74 | means that our pattern was just some normal text that we asked ripgrep to
75 | find. But ripgrep supports the ability to specify patterns via [regular
76 | expressions](https://en.wikipedia.org/wiki/Regular_expression). As an example,
77 | what if we wanted to find all lines that have a word that contains `fast` followed
78 | by some number of other letters?
79 |
80 | ```
81 | $ rg 'fast\w+' README.md
82 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
83 | 119:### Is it really faster than everything else?
84 | ```
85 |
86 | In this example, we used the pattern `fast\w+`. This pattern tells ripgrep to
87 | look for any lines containing the letters `fast` followed by *one or more*
88 | word-like characters. Namely, `\w` matches characters that compose words (like
89 | `a` and `L` but unlike `.` and ` `). The `+` after the `\w` means, "match the
90 | previous pattern one or more times." This means that the word `fast` won't
91 | match because there are no word characters following the final `t`. But a word
92 | like `faster` will. `faste` would also match!
93 |
94 | Here's a different variation on this same theme:
95 |
96 | ```
97 | $ rg 'fast\w*' README.md
98 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
99 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
100 | 119:### Is it really faster than everything else?
101 | 124:Summarizing, `ripgrep` is fast because:
102 | 129: optimizations to make searching very fast.
103 | ```
104 |
105 | In this case, we used `fast\w*` for our pattern instead of `fast\w+`. The `*`
106 | means that it should match *zero* or more times. In this case, ripgrep will
107 | print the same lines as the pattern `fast`, but if your terminal supports
108 | colors, you'll notice that `faster` will be highlighted instead of just the
109 | `fast` prefix.
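
The difference between `+` and `*` can also be checked outside of ripgrep. The following sketch uses Python's `re` module, which agrees with ripgrep's regex syntax for these two simple patterns (that equivalence is only assumed for patterns this basic); the sample lines are shortened from the README output above:

```python
import re

lines = [
    "faster than both.",
    "`ripgrep` stays fast while",
    "nothing to see here",
]

# `fast\w+` needs at least one word character after "fast".
one_or_more = [l for l in lines if re.search(r"fast\w+", l)]

# `fast\w*` also accepts "fast" on its own.
zero_or_more = [l for l in lines if re.search(r"fast\w*", l)]

# The matched text is what a color terminal would highlight.
highlighted = re.search(r"fast\w*", lines[0]).group(0)

print(one_or_more)
print(zero_or_more)
print(highlighted)
```

Only the first line satisfies `fast\w+`, while `fast\w*` selects the same lines as the plain `fast` pattern and extends the match to the whole word `faster`.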
110 |
111 | It is beyond the scope of this guide to provide a full tutorial on regular
112 | expressions, but ripgrep's specific syntax is documented here:
113 | https://docs.rs/regex/0.2.5/regex/#syntax
114 |
115 |
116 | ### Recursive search
117 |
118 | In the previous section, we showed how to use ripgrep to search a single file.
119 | In this section, we'll show how to use ripgrep to search an entire directory
120 | of files. In fact, *recursively* searching your current working directory is
121 | the default mode of operation for ripgrep, which means doing this is very
122 | simple.
123 |
124 | Using our unzipped archive of ripgrep source code, here's how to find all
125 | function definitions whose name is `write`:
126 |
127 | ```
128 | $ rg 'fn write\('
129 | src/printer.rs
130 | 469: fn write(&mut self, buf: &[u8]) {
131 |
132 | termcolor/src/lib.rs
133 | 227: fn write(&mut self, b: &[u8]) -> io::Result<usize> {
134 | 250: fn write(&mut self, b: &[u8]) -> io::Result<usize> {
135 | 428: fn write(&mut self, b: &[u8]) -> io::Result<usize> { self.wtr.write(b) }
136 | 441: fn write(&mut self, b: &[u8]) -> io::Result<usize> { self.wtr.write(b) }
137 | 454: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
138 | 511: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
139 | 848: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
140 | 915: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
141 | 949: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
142 | 1114: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
143 | 1348: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
144 | 1353: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
145 | ```
146 |
147 | (**Note:** We escape the `(` here because `(` has special significance inside
148 | regular expressions. You could also use `rg -F 'fn write('` to achieve the
149 | same thing, where `-F` interprets your pattern as a literal string instead of
150 | a regular expression.)
151 |
152 | In this example, we didn't specify a file at all. Instead, ripgrep defaulted
153 | to searching your current directory in the absence of a path. In general,
154 | `rg foo` is equivalent to `rg foo ./`.
155 |
156 | This particular search showed us results in both the `src` and `termcolor`
157 | directories. The `src` directory is the core ripgrep code whereas `termcolor`
158 | is a dependency of ripgrep (and is used by other tools). What if we only wanted
159 | to search core ripgrep code? Well, that's easy, just specify the directory you
160 | want:
161 |
162 | ```
163 | $ rg 'fn write\(' src
164 | src/printer.rs
165 | 469: fn write(&mut self, buf: &[u8]) {
166 | ```
167 |
168 | Here, ripgrep limited its search to the `src` directory. Another way of doing
169 | this search would be to `cd` into the `src` directory and simply use `rg 'fn
170 | write\('` again.
171 |
172 |
173 | ### Automatic filtering
174 |
175 | After recursive search, ripgrep's most important feature is what it *doesn't*
176 | search. By default, when you search a directory, ripgrep will ignore all of
177 | the following:
178 |
179 | 1. Files and directories that match the rules in your `.gitignore` glob
180 | pattern.
181 | 2. Hidden files and directories.
182 | 3. Binary files. (ripgrep considers any file with a `NUL` byte to be binary.)
183 | 4. Symbolic links aren't followed.
184 |
185 | All of these things can be toggled using various flags provided by ripgrep:
186 |
187 | 1. You can disable `.gitignore` handling with the `--no-ignore` flag.
188 | 2. Hidden files and directories can be searched with the `--hidden` flag.
189 | 3. Binary files can be searched via the `--text` (`-a` for short) flag.
190 | Be careful with this flag! Binary files may emit control characters to your
191 | terminal, which might cause strange behavior.
192 | 4. ripgrep can follow symlinks with the `--follow` (`-L` for short) flag.
193 |
194 | As a special convenience, ripgrep also provides a flag called `--unrestricted`
195 | (`-u` for short). Repeated uses of this flag will cause ripgrep to disable
196 | more and more of its filtering. That is, `-u` will disable `.gitignore`
197 | handling, `-uu` will search hidden files and directories and `-uuu` will search
198 | binary files. This is useful when you're using ripgrep and you aren't sure
199 | whether its filtering is hiding results from you. Tacking on a couple `-u`
200 | flags is a quick way to find out. (Use the `--debug` flag if you're still
201 | perplexed, and if that doesn't help,
202 | [file an issue](https://github.com/BurntSushi/ripgrep/issues/new).)
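
The cumulative effect of repeating `-u` can be summarized as a small lookup. This is only a sketch of the description above in Python; the filter labels and the helper name are made up for illustration:

```python
# Filters switched off by each additional -u, in the order given in the text.
UNRESTRICTED_LEVELS = [
    ".gitignore handling",    # -u
    "hidden file filtering",  # -uu
    "binary file filtering",  # -uuu
]

def disabled_by(flag):
    """Return the filters disabled by a flag such as '-u', '-uu' or '-uuu'."""
    count = flag.count("u")
    return UNRESTRICTED_LEVELS[:count]

print(disabled_by("-u"))
print(disabled_by("-uuu"))
```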
203 |
204 | ripgrep's `.gitignore` handling actually goes a bit beyond just `.gitignore`
205 | files. ripgrep will also respect repository specific rules found in
206 | `$GIT_DIR/info/exclude`, as well as any global ignore rules in your
207 | `core.excludesFile` (which is usually `$XDG_CONFIG_HOME/git/ignore` on
208 | Unix-like systems).
209 |
210 | Sometimes you want to search files that are in your `.gitignore`, so it is
211 | possible to specify additional ignore rules or overrides in a `.ignore`
212 | (application agnostic) or `.rgignore` (ripgrep specific) file.
213 |
214 | For example, let's say you have a `.gitignore` file that looks like this:
215 |
216 | ```
217 | log/
218 | ```
219 |
220 | This generally means that any `log` directory won't be tracked by `git`.
221 | However, perhaps it contains useful output that you'd like to include in your
222 | searches, but you still don't want to track it in `git`. You can achieve this
223 | by creating a `.ignore` file in the same directory as the `.gitignore` file
224 | with the following contents:
225 |
226 | ```
227 | !log/
228 | ```
229 |
230 | ripgrep treats `.ignore` files with higher precedence than `.gitignore` files
231 | (and treats `.rgignore` files with higher precedence than `.ignore` files).
232 | This means ripgrep will see the `!log/` whitelist rule first and search that
233 | directory.
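
The precedence between the three kinds of ignore files can be modeled in a few lines. The sketch below is a deliberately simplified Python model, not ripgrep's implementation: it handles only plain directory rules such as `log/` and `!log/`, and the function name and data layout are assumptions:

```python
from fnmatch import fnmatchcase

# Highest precedence first, as described above.
PRECEDENCE = [".rgignore", ".ignore", ".gitignore"]

def is_ignored(dir_name, rules_by_file):
    """Return True if the directory `dir_name` would be skipped.

    `rules_by_file` maps an ignore file name to its list of rules.
    """
    for ignore_file in PRECEDENCE:
        decision = None
        for rule in rules_by_file.get(ignore_file, []):
            whitelisted = rule.startswith("!")
            pattern = (rule[1:] if whitelisted else rule).rstrip("/")
            if fnmatchcase(dir_name, pattern):
                # Within one file, the last matching rule wins.
                decision = not whitelisted
        if decision is not None:
            # A higher-precedence file has spoken; stop looking.
            return decision
    return False

print(is_ignored("log", {".gitignore": ["log/"]}))
print(is_ignored("log", {".gitignore": ["log/"], ".ignore": ["!log/"]}))
```

With only the `.gitignore` rule the `log` directory is skipped; adding the `!log/` rule in `.ignore` makes it searchable again.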
234 |
235 | Like `.gitignore`, a `.ignore` file can be placed in any directory. Its rules
236 | will be processed with respect to the directory it resides in, just like
237 | `.gitignore`.
238 |
239 | To process `.gitignore` and `.ignore` files case insensitively, use the flag
240 | `--ignore-file-case-insensitive`. This is especially useful on case insensitive
241 | file systems like those on Windows and macOS. Note though that this can come
242 | with a significant performance penalty, and is therefore disabled by default.
243 |
244 | For a more in depth description of how glob patterns in a `.gitignore` file
245 | are interpreted, please see `man gitignore`.
246 |
247 |
248 | ### Manual filtering: globs
249 |
250 | In the previous section, we talked about ripgrep's filtering that it does by
251 | default. It is "automatic" because it reacts to your environment. That is, it
252 | uses already existing `.gitignore` files to produce more relevant search
253 | results.
254 |
255 | In addition to automatic filtering, ripgrep also provides more manual or ad hoc
256 | filtering. This comes in two varieties: additional glob patterns specified in
257 | your ripgrep commands and file type filtering. This section covers glob
258 | patterns while the next section covers file type filtering.
259 |
260 | In our ripgrep source code (see [Basics](#basics) for instructions on how to
261 | get a source archive to search), let's say we wanted to see which things depend
262 | on `clap`, our argument parser.
263 |
264 | We could do this:
265 |
266 | ```
267 | $ rg clap
268 | [lots of results]
269 | ```
270 |
271 | But this shows us many things, and we're only interested in where we wrote
272 | `clap` as a dependency. Instead, we could limit ourselves to TOML files, which
273 | is how dependencies are communicated to Rust's build tool, Cargo:
274 |
275 | ```
276 | $ rg clap -g '*.toml'
277 | Cargo.toml
278 | 35:clap = "2.26"
279 | 51:clap = "2.26"
280 | ```
281 |
282 | The `-g '*.toml'` syntax says, "make sure every file searched matches this
283 | glob pattern." Note that we put `'*.toml'` in single quotes to prevent our
284 | shell from expanding the `*`.
285 |
286 | If we wanted, we could tell ripgrep to search anything *but* `*.toml` files:
287 |
288 | ```
289 | $ rg clap -g '!*.toml'
290 | [lots of results]
291 | ```
292 |
293 | This will give you a lot of results again as above, but they won't include
294 | files ending with `.toml`. Note that the use of a `!` here to mean "negation"
295 | is a bit non-standard, but it was chosen to be consistent with how globs in
296 | `.gitignore` files are written. (Although, the meaning is reversed. In
297 | `.gitignore` files, a `!` prefix means whitelist, and on the command line, a
298 | `!` means blacklist.)
299 |
300 | Globs are interpreted in exactly the same way as `.gitignore` patterns. That
301 | is, later globs will override earlier globs. For example, the following command
302 | will search only `*.toml` files:
303 |
304 | ```
305 | $ rg clap -g '!*.toml' -g '*.toml'
306 | ```
307 |
308 | Interestingly, reversing the order of the globs in this case will match
309 | nothing, since the presence of at least one non-blacklist glob will institute a
310 | requirement that every file searched must match at least one glob. In this
311 | case, the blacklist glob takes precedence over the previous glob and prevents
312 | any file from being searched at all!
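
The override rules above can be captured in a short model. This Python sketch is an approximation under stated assumptions: it matches bare file names with `fnmatchcase` rather than ripgrep's real glob engine, and `is_searched` is a hypothetical helper name:

```python
from fnmatch import fnmatchcase

def is_searched(filename, globs):
    """Decide whether `filename` is searched, given -g globs in command-line order."""
    has_whitelist = any(not g.startswith("!") for g in globs)
    # Later globs override earlier ones, so scan from the end.
    for g in reversed(globs):
        negated = g.startswith("!")
        pattern = g[1:] if negated else g
        if fnmatchcase(filename, pattern):
            return not negated
    # No glob matched: the file is searched only if no whitelist glob was given.
    return not has_whitelist

print(is_searched("Cargo.toml", ["!*.toml", "*.toml"]))  # later whitelist wins
print(is_searched("Cargo.toml", ["*.toml", "!*.toml"]))  # later blacklist wins
print(is_searched("main.rs", ["*.toml", "!*.toml"]))     # matches no whitelist glob
```

With the globs reversed, neither file is searched, which is the "match nothing" outcome described above.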
313 |
314 |
315 | ### Manual filtering: file types
316 |
317 | Over time, you might notice that you use the same glob patterns over and over.
318 | For example, you might find yourself doing a lot of searches where you only
319 | want to see results for Rust files:
320 |
321 | ```
322 | $ rg 'fn run' -g '*.rs'
323 | ```
324 |
325 | Instead of writing out the glob every time, you can use ripgrep's support for
326 | file types:
327 |
328 | ```
329 | $ rg 'fn run' --type rust
330 | ```
331 |
332 | or, more succinctly,
333 |
334 | ```
335 | $ rg 'fn run' -trust
336 | ```
337 |
338 | The way the `--type` flag functions is simple. It acts as a name that is
339 | assigned to one or more globs that match the relevant files. This lets you
340 | write a single type that might encompass a broad range of file extensions. For
341 | example, if you wanted to search C files, you'd have to check both C source
342 | files and C header files:
343 |
344 | ```
345 | $ rg 'int main' -g '*.{c,h}'
346 | ```
347 |
348 | or you could just use the C file type:
349 |
350 | ```
351 | $ rg 'int main' -tc
352 | ```
353 |
354 | Just as you can write blacklist globs, you can blacklist file types too:
355 |
356 | ```
357 | $ rg clap --type-not rust
358 | ```
359 |
360 | or, more succinctly,
361 |
362 | ```
363 | $ rg clap -Trust
364 | ```
365 |
366 | That is, `-t` means "include files of this type" whereas `-T` means "exclude
367 | files of this type."
368 |
369 | To see the globs that make up a type, run `rg --type-list`:
370 |
371 | ```
372 | $ rg --type-list | rg '^make:'
373 | make: *.mak, *.mk, GNUmakefile, Gnumakefile, Makefile, gnumakefile, makefile
374 | ```
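
Conceptually, a file type is just a name attached to a list of globs. The sketch below models that idea in Python. The `make` globs are copied from the output above, the `c` entry is a reduced illustration rather than ripgrep's full built-in definition, and letting `-T` win over `-t` is an assumption of this sketch:

```python
from fnmatch import fnmatchcase

TYPES = {
    "c": ["*.c", "*.h"],  # illustrative subset, not the full built-in list
    "make": ["*.mak", "*.mk", "GNUmakefile", "Gnumakefile",
             "Makefile", "gnumakefile", "makefile"],
}

def has_type(filename, type_name):
    """True if `filename` matches any glob registered for `type_name`."""
    return any(fnmatchcase(filename, g) for g in TYPES[type_name])

def is_searched(filename, include=(), exclude=()):
    """Model -t (include) and -T (exclude) selections."""
    if any(has_type(filename, t) for t in exclude):
        return False  # assumed: an excluded type always wins
    if include:
        return any(has_type(filename, t) for t in include)
    return True

print(is_searched("util.h", include=["c"]))
print(is_searched("Makefile", exclude=["make"]))
```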
375 |
376 | By default, ripgrep comes with a bunch of pre-defined types. Generally, these
377 | types correspond to well known public formats. But you can define your own
378 | types as well. For example, perhaps you frequently search "web" files, which
379 | consist of JavaScript, HTML and CSS:
380 |
381 | ```
382 | $ rg --type-add 'web:*.html' --type-add 'web:*.css' --type-add 'web:*.js' -tweb title
383 | ```
384 |
385 | or, more succinctly,
386 |
387 | ```
388 | $ rg --type-add 'web:*.{html,css,js}' -tweb title
389 | ```
390 |
391 | The above command defines a new type, `web`, corresponding to the glob
392 | `*.{html,css,js}`. It then applies the new filter with `-tweb` and searches for
393 | the pattern `title`. If you ran
394 |
395 | ```
396 | $ rg --type-add 'web:*.{html,css,js}' --type-list
397 | ```
398 |
399 | Then you would see your `web` type show up in the list, even though it is not
400 | part of ripgrep's built-in types.
401 |
402 | It is important to stress here that the `--type-add` flag only applies to the
403 | current command. It does not add a new file type and save it somewhere in a
404 | persistent form. If you want a type to be available in every ripgrep command,
405 | then you should either create a shell alias:
406 |
407 | ```
408 | alias rg="rg --type-add 'web:*.{html,css,js}'"
409 | ```
410 |
411 | or add `--type-add=web:*.{html,css,js}` to your ripgrep configuration file.
412 | ([Configuration files](#configuration-file) are covered in more detail later.)
413 |
414 |
415 | ### Replacements
416 |
417 | ripgrep provides a limited ability to modify its output by replacing matched
418 | text with some other text. This is easiest to explain with an example. Remember
419 | when we searched for the word `fast` in ripgrep's README?
420 |
421 | ```
422 | $ rg fast README.md
423 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
424 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
425 | 119:### Is it really faster than everything else?
426 | 124:Summarizing, `ripgrep` is fast because:
427 | 129: optimizations to make searching very fast.
428 | ```
429 |
430 | What if we wanted to *replace* all occurrences of `fast` with `FAST`? That's
431 | easy with ripgrep's `--replace` flag:
432 |
433 | ```
434 | $ rg fast README.md --replace FAST
435 | 75: FASTer than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
436 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays FAST while
437 | 119:### Is it really FASTer than everything else?
438 | 124:Summarizing, `ripgrep` is FAST because:
439 | 129: optimizations to make searching very FAST.
440 | ```
441 |
442 | or, more succinctly,
443 |
444 | ```
445 | $ rg fast README.md -r FAST
446 | [snip]
447 | ```
448 |
449 | In essence, the `--replace` flag applies *only* to the matching portion of text
450 | in the output. If you instead wanted to replace an entire line of text, then
451 | you need to include the entire line in your match. For example:
452 |
453 | ```
454 | $ rg '^.*fast.*$' README.md -r FAST
455 | 75:FAST
456 | 88:FAST
457 | 119:FAST
458 | 124:FAST
459 | 129:FAST
460 | ```
461 |
462 | Alternatively, you can combine the `--only-matching` (or `-o` for short) with
463 | the `--replace` flag to achieve the same result:
464 |
465 | ```
466 | $ rg fast README.md --only-matching --replace FAST
467 | 75:FAST
468 | 88:FAST
469 | 119:FAST
470 | 124:FAST
471 | 129:FAST
472 | ```
473 |
474 | or, more succinctly,
475 |
476 | ```
477 | $ rg fast README.md -or FAST
478 | [snip]
479 | ```
480 |
481 | Finally, replacements can include capturing groups. For example, let's say
482 | we wanted to find all occurrences of `fast` followed by another word and
483 | join them together with a dash. The pattern we might use for that is
484 | `fast\s+(\w+)`, which matches `fast`, followed by any amount of whitespace,
485 | followed by any number of "word" characters. We put the `\w+` in a "capturing
486 | group" (indicated by parentheses) so that we can reference it later in our
487 | replacement string. For example:
488 |
489 | ```
490 | $ rg 'fast\s+(\w+)' README.md -r 'fast-$1'
491 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast-while
492 | 124:Summarizing, `ripgrep` is fast-because:
493 | ```
494 |
495 | Our replacement string here, `fast-$1`, consists of `fast-` followed by the
496 | contents of the capturing group at index `1`. (Capturing groups actually start
497 | at index 0, but the `0`th capturing group always corresponds to the entire
498 | match. The capturing group at index `1` always corresponds to the first
499 | explicit capturing group found in the regex pattern.)
500 |
501 | Capturing groups can also be named, which is sometimes more convenient than
502 | using the indices. For example, the following command is equivalent to the
503 | above command:
504 |
505 | ```
506 | $ rg 'fast\s+(?P<word>\w+)' README.md -r 'fast-$word'
507 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast-while
508 | 124:Summarizing, `ripgrep` is fast-because:
509 | ```
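
Indexed and named capture groups behave the same way in most regex libraries. As a rough cross-check, here is the same substitution in Python's `re` module. Note the different replacement syntax, `\1` and `\g<word>` instead of ripgrep's `$1` and `$word`; the sample line is taken from the output above:

```python
import re

line = "Summarizing, `ripgrep` is fast because:"

# Reference the first explicit capturing group by index.
by_index = re.sub(r"fast\s+(\w+)", r"fast-\1", line)

# Reference the same group by name.
by_name = re.sub(r"fast\s+(?P<word>\w+)", r"fast-\g<word>", line)

print(by_index)
print(by_name)
```

Both substitutions produce the same text, mirroring the equivalence of the two ripgrep commands.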
510 |
511 | It is important to note that ripgrep **will never modify your files**. The
512 | `--replace` flag only controls ripgrep's output. (And there is no flag to let
513 | you do a replacement in a file.)
514 |
515 |
516 | ### Configuration file
517 |
518 | It is possible that ripgrep's default options aren't suitable in every case.
519 | For that reason, and because shell aliases aren't always convenient, ripgrep
520 | supports configuration files.
521 |
522 | Setting up a configuration file is simple. ripgrep will not look in any
523 | predetermined directory for a config file automatically. Instead, you need to
524 | set the `RIPGREP_CONFIG_PATH` environment variable to the file path of your
525 | config file. Once the environment variable is set, open the file and just type
526 | in the flags you want set automatically. There are only two rules for
527 | describing the format of the config file:
528 |
529 | 1. Every line is a shell argument, after trimming whitespace.
530 | 2. Lines starting with `#` (optionally preceded by any amount of whitespace)
531 | are ignored.
532 |
533 | In particular, there is no escaping. Each line is given to ripgrep as a single
534 | command line argument verbatim.
535 |
536 | Here's an example of a configuration file, which demonstrates some of the
537 | formatting peculiarities:
538 |
539 | ```
540 | $ cat $HOME/.ripgreprc
541 | # Don't let ripgrep vomit really long lines to my terminal, and show a preview.
542 | --max-columns=150
543 | --max-column-preview
544 |
545 | # Add my 'web' type.
546 | --type-add
547 | web:*.{html,css,js}*
548 |
549 | # Using glob patterns to include/exclude files or folders
550 | --glob=!git/*
551 |
552 | # or
553 | --glob
554 | !git/*
555 |
556 | # Set the colors.
557 | --colors=line:none
558 | --colors=line:style:bold
559 |
560 | # Because who cares about case!?
561 | --smart-case
562 | ```
563 |
564 | When we use a flag that has a value, we either put the flag and the value on
565 | the same line but delimited by an `=` sign (e.g., `--max-columns=150`), or we
566 | put the flag and the value on two different lines. This is because ripgrep's
567 | argument parser knows to treat the single argument `--max-columns=150` as a
568 | flag with a value, but if we had written `--max-columns 150` in our
569 | configuration file, then ripgrep's argument parser wouldn't know what to do
570 | with it.
571 |
572 | Putting the flag and value on different lines is exactly equivalent and is a
573 | matter of style.
574 |
575 | Comments are encouraged so that you remember what the config is doing. Empty
576 | lines are OK too.
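
The format rules above are small enough to express as a parser. The following Python sketch is not ripgrep's code; it only restates the rules (trim whitespace, skip comments and empty lines, one argument per line), and `parse_config` is a hypothetical name:

```python
def parse_config(text):
    """Turn the contents of a ripgrep config file into a list of arguments."""
    args = []
    for line in text.splitlines():
        arg = line.strip()
        # Skip empty lines and comments (optionally preceded by whitespace).
        if not arg or arg.startswith("#"):
            continue
        # No escaping or splitting: the whole line is one argument.
        args.append(arg)
    return args

SAMPLE = """\
# Limit long lines.
--max-columns=150

--type-add
web:*.{html,css,js}*
   # An indented comment is ignored too.
--smart-case
"""

print(parse_config(SAMPLE))
print(parse_config("--max-columns 150\n"))
```

The last call shows why `--max-columns 150` on a single line does not work: it arrives as one argument containing a space rather than as a flag followed by its value.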
577 |
578 | So let's say you're using the above configuration file, but while you're at a
579 | terminal, you really want to be able to see lines longer than 150 columns. What
580 | do you do? Thankfully, all you need to do is pass `--max-columns 0` (or `-M0`
581 | for short) on the command line, which will override your configuration file's
582 | setting. This works because ripgrep's configuration file is *prepended* to the
583 | explicit arguments you give it on the command line. Since flags given later
584 | override flags given earlier, everything works as expected. This works for most
585 | other flags as well, and each flag's documentation states which other flags
586 | override it.
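
The "prepend, then let later flags win" behavior can be illustrated with Python's `argparse` standing in for ripgrep's argument parser. This is an assumption-laden sketch: the parser below only knows two flags and is not how ripgrep is implemented; `effective_args` is a hypothetical helper.

```python
import argparse

# Stand-in parser used only to model "the last occurrence of a flag wins".
parser = argparse.ArgumentParser(prog="rg-model")
parser.add_argument("-M", "--max-columns", type=int, default=0)
parser.add_argument("-S", "--smart-case", action="store_true")


def effective_args(config_args, cli_args):
    """Prepend config-file arguments to the explicit command line arguments."""
    return parser.parse_args(list(config_args) + list(cli_args))


config_args = ["--max-columns=150", "--smart-case"]

# With no explicit flags, the config file's value applies.
print(effective_args(config_args, []).max_columns)
# An explicit -M0 comes later in the argument list, so it overrides the config.
print(effective_args(config_args, ["-M0"]).max_columns)
```

Flags that are not repeated on the command line (here `--smart-case`) keep the value they got from the configuration file.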
587 |
588 | If you're confused about what configuration file ripgrep is reading arguments
589 | from, then running ripgrep with the `--debug` flag should help clarify things.
590 | The debug output should note what config file is being loaded and the arguments
591 | that have been read from the configuration.
592 |
593 | Finally, if you want to make absolutely sure that ripgrep *isn't* reading a
594 | configuration file, then you can pass the `--no-config` flag, which will always
595 | prevent ripgrep from reading extraneous configuration from the environment,
596 | regardless of what other methods of configuration are added to ripgrep in the
597 | future.
598 |
599 |
600 | ### File encoding
601 |
602 | [Text encoding](https://en.wikipedia.org/wiki/Character_encoding) is a complex
603 | topic, but we can try to summarize its relevancy to ripgrep:
604 |
605 | * Files are generally just a bundle of bytes. There is no reliable way to know
606 | their encoding.
607 | * Either the encoding of the pattern must match the encoding of the files being
608 | searched, or a form of transcoding must be performed that converts either the
609 | pattern or the file to the same encoding as the other.
610 | * ripgrep tends to work best on plain text files, and among plain text files,
611 | the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
612 | a special exception, UTF-16 is prevalent in Windows environments.
613 |
614 | In light of the above, here is how ripgrep behaves when `--encoding auto` is
615 | given, which is the default:
616 |
617 | * All input is assumed to be ASCII compatible (which means every byte that
618 | corresponds to an ASCII codepoint actually is an ASCII codepoint). This
619 | includes ASCII itself, latin1 and UTF-8.
620 | * ripgrep works best with UTF-8. For example, ripgrep's regular expression
621 | engine supports Unicode features. Namely, character classes like `\w` will
622 | match all word characters by Unicode's definition and `.` will match any
623 | Unicode codepoint instead of any byte. These constructions assume UTF-8,
624 | so they simply won't match when they come across bytes in a file that aren't
625 | UTF-8.
626 | * To handle the UTF-16 case, ripgrep will do something called "BOM sniffing"
627 | by default. That is, the first three bytes of a file will be read, and if
628 | they correspond to a UTF-16 BOM, then ripgrep will transcode the contents of
629 | the file from UTF-16 to UTF-8, and then execute the search on the transcoded
630 | version of the file. (This incurs a performance penalty since transcoding
631 | is slower than regex searching.) If the file contains invalid UTF-16, then
632 | the Unicode replacement codepoint is substituted in place of invalid code
633 | units.
634 | * To handle other cases, ripgrep provides a `-E/--encoding` flag, which permits
635 | you to specify an encoding from the
636 | [Encoding Standard](https://encoding.spec.whatwg.org/#concept-encoding-get).
637 | ripgrep will assume *all* files searched are the encoding specified (unless
638 | the file has a BOM) and will perform a transcoding step just like in the
639 | UTF-16 case described above.
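
The BOM sniffing and transcoding steps described in the list above can be sketched roughly as follows. This is a simplified model, not ripgrep's code: the helper names are hypothetical, and only the two UTF-16 BOMs are handled.

```python
import codecs


def sniff_utf16_bom(head):
    """Return a codec name if the leading bytes are a UTF-16 BOM, else None."""
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


def to_searchable_bytes(data):
    """Transcode UTF-16 (detected via BOM) to UTF-8; leave other input as-is."""
    encoding = sniff_utf16_bom(data[:3])
    if encoding is None:
        return data  # assumed ASCII compatible, searched without transcoding
    # errors="replace" substitutes U+FFFD for invalid UTF-16 code units.
    text = data[2:].decode(encoding, errors="replace")
    return text.encode("utf-8")


utf16_file = codecs.BOM_UTF16_LE + "Шерлок Холмс".encode("utf-16-le")
print("Шерлок".encode("utf-8") in to_searchable_bytes(utf16_file))
print(to_searchable_bytes(b"plain ASCII text"))
```

After transcoding, a UTF-8 pattern can match content that was stored as UTF-16, at the cost of the extra decoding pass.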
640 |
641 | By default, ripgrep will not require its input be valid UTF-8. That is, ripgrep
642 | can and will search arbitrary bytes. The key here is that if you're searching
643 | content that isn't UTF-8, then the usefulness of your pattern will degrade. If
644 | you're searching bytes that aren't ASCII compatible, then it's likely the
645 | pattern won't find anything. With all that said, this mode of operation is
646 | important, because it lets you find ASCII or UTF-8 *within* files that are
647 | otherwise arbitrary bytes.
648 |
649 | As a special case, the `-E/--encoding` flag supports the value `none`, which
650 | will completely disable all encoding related logic, including BOM sniffing.
651 | When `-E/--encoding` is set to `none`, ripgrep will search the raw bytes of
652 | the underlying file with no transcoding step. For example, here's how you might
653 | search the raw UTF-16 encoding of the string `Шерлок`:
654 |
655 | ```
656 | $ rg '(?-u)\(\x045\x04@\x04;\x04>\x04:\x04' -E none -a some-utf16-file
657 | ```
658 |
659 | Of course, that's just an example meant to show how one can drop down into
660 | raw bytes. Namely, the simpler command works as you might expect automatically:
661 |
662 | ```
663 | $ rg 'Шерлок' some-utf16-file
664 | ```
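
To see where the escaped bytes in the raw `-E none` search come from, it helps to look at the little-endian UTF-16 encoding of the string. The snippet below is just a check using Python's built-in codecs and is independent of ripgrep.

```python
# Each Cyrillic letter becomes two bytes in UTF-16LE: a low byte that
# happens to be printable ASCII, followed by the high byte 0x04.
raw = "Шерлок".encode("utf-16-le")
print(raw)

# The same bytes written out explicitly, matching the regex in the
# raw search command: ( \x04 5 \x04 @ \x04 ; \x04 > \x04 : \x04
expected = b"(\x045\x04@\x04;\x04>\x04:\x04"
print(raw == expected)
```

This also explains why the pattern escapes the leading `(`: it is a literal byte of the encoded text, but a special character in a regex.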
665 |
666 | Finally, it is possible to disable ripgrep's Unicode support from within the
667 | regular expression. For example, let's say you wanted `.` to match any byte
668 | rather than any Unicode codepoint. (You might want this while searching a
669 | binary file, since `.` by default will not match invalid UTF-8.) You could do
670 | this by disabling Unicode via a regular expression flag:
671 |
672 | ```
673 | $ rg '(?-u:.)'
674 | ```
675 |
676 | This works for any part of the pattern. For example, the following will find
677 | any Unicode word character followed by any ASCII word character followed by
678 | another Unicode word character:
679 |
680 | ```
681 | $ rg '\w(?-u:\w)\w'
682 | ```
683 |
684 |
685 | ### Binary data
686 |
687 | In addition to skipping hidden files and files in your `.gitignore` by default,
688 | ripgrep also attempts to skip binary files. ripgrep does this by default
689 | because binary files (like PDFs or images) are typically not things you want to
690 | search when searching for regex matches. Moreover, if content in a binary file
691 | did match, then it's possible for undesirable binary data to be printed to your
692 | terminal and wreak havoc.
693 |
694 | Unfortunately, unlike skipping hidden files and respecting your `.gitignore`
695 | rules, a file cannot as easily be classified as binary. In order to figure out
696 | whether a file is binary, the most effective heuristic that balances
697 | correctness with performance is to simply look for `NUL` bytes. At that point,
698 | the determination is simple: a file is considered "binary" if and only if it
699 | contains a `NUL` byte somewhere in its contents.
700 |
701 | The issue is that while most binary files will have a `NUL` byte toward the
702 | beginning of their contents, this is not necessarily true. The `NUL` byte might
703 | be the very last byte in a large file, but that file is still considered
704 | binary. While this leads to a fair amount of complexity inside ripgrep's
705 | implementation, it also results in some unintuitive user experiences.
706 |
707 | At a high level, ripgrep operates in three different modes with respect to
708 | binary files:
709 |
710 | 1. The default mode is to attempt to remove binary files from a search
711 | completely. This is meant to mirror how ripgrep removes hidden files and
712 | files in your `.gitignore` automatically. That is, as soon as a file is
713 | detected as binary, searching stops. If a match was already printed (because
714 | it was detected long before a `NUL` byte), then ripgrep will print a warning
715 | message indicating that the search stopped prematurely. This default mode
716 | **only applies to files searched by ripgrep as a result of recursive
717 | directory traversal**, which is consistent with ripgrep's other automatic
718 | filtering. For example, `rg foo .file` will search `.file` even though it
719 | is hidden. Similarly, `rg foo binary-file` will search `binary-file` in "binary"
720 | mode automatically.
721 | 2. Binary mode is similar to the default mode, except it will not always
722 | stop searching after it sees a `NUL` byte. Namely, in this mode, ripgrep
723 | will continue searching a file that is known to be binary until the first
724 | of two conditions is met: 1) the end of the file has been reached or 2) a
725 | match is or has been seen. This means that in binary mode, if ripgrep
726 | reports no matches, then there are no matches in the file. When a match does
727 | occur, ripgrep prints a message similar to one it prints when in its default
728 | mode indicating that the search has stopped prematurely. This mode can be
729 | forcefully enabled for all files with the `--binary` flag. The purpose of
730 | binary mode is to provide a way to discover matches in all files, but to
731 | avoid having binary data dumped into your terminal.
732 | 3. Text mode completely disables all binary detection and searches all files
733 | as if they were text. This is useful when searching a file that is
734 | predominantly text but contains a `NUL` byte, or if you are specifically
735 | trying to search binary data. This mode can be enabled with the `-a/--text`
736 | flag. Note that when using this mode on very large binary files, it is
737 | possible for ripgrep to use a lot of memory.
738 |
739 | Unfortunately, there is one additional complexity in ripgrep that can make it
740 | difficult to reason about binary files. That is, the way binary detection works
741 | depends on the way that ripgrep searches your files. Specifically:
742 |
743 | * When ripgrep uses memory maps, then binary detection is only performed on the
744 | first few kilobytes of the file in addition to every matching line.
745 | * When ripgrep doesn't use memory maps, then binary detection is performed on
746 | all bytes searched.
747 |
748 | This means that whether a file is detected as binary or not can change based
749 | on the internal search strategy used by ripgrep. If you prefer to keep
750 | ripgrep's binary file detection consistent, then you can disable memory maps
751 | via the `--no-mmap` flag. (The cost will be a small performance regression when
752 | searching very large files on some platforms.)
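
The difference between the two detection strategies can be made concrete with a small model. Everything here is a sketch under stated assumptions: `PREFIX_LEN` is a made-up stand-in for "the first few kilobytes", and the functions are hypothetical, not ripgrep internals.

```python
PREFIX_LEN = 8 * 1024  # hypothetical size for "the first few kilobytes"


def binary_with_mmap(data, matching_lines):
    """Check only a prefix of the file plus every matching line."""
    if b"\x00" in data[:PREFIX_LEN]:
        return True
    return any(b"\x00" in line for line in matching_lines)


def binary_without_mmap(data):
    """Check every byte that is searched."""
    return b"\x00" in data


# A large, mostly-text file whose only NUL byte is the very last byte.
data = b"text line\n" * 2000 + b"\x00"
pattern = b"needle"
matching_lines = [line for line in data.split(b"\n") if pattern in line]

print(binary_with_mmap(data, matching_lines))  # NUL is past the prefix, no match
print(binary_without_mmap(data))               # NUL is seen eventually
```

The same file is classified differently by the two strategies, which is the inconsistency that `--no-mmap` avoids.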
753 |
754 |
755 | ### Common options
756 |
757 | ripgrep has a lot of flags. Too many to keep in your head at once. This section
758 | is intended to give you a sampling of some of the most important and frequently
759 | used options that will likely impact how you use ripgrep on a regular basis.
760 |
761 | * `-h`: Show ripgrep's condensed help output.
762 | * `--help`: Show ripgrep's longer form help output. (Nearly what you'd find in
763 | ripgrep's man page, so pipe it into a pager!)
764 | * `-i/--ignore-case`: When searching for a pattern, ignore case differences.
765 | That is `rg -i fast` matches `fast`, `fASt`, `FAST`, etc.
766 | * `-S/--smart-case`: This is similar to `--ignore-case`, but disables itself
767 | if the pattern contains any uppercase letters. Usually this flag is put into
768 | an alias or a config file.
769 | * `-w/--word-regexp`: Require that all matches of the pattern be surrounded
770 | by word boundaries. That is, given `pattern`, the `--word-regexp` flag will
771 | cause ripgrep to behave as if `pattern` were actually `\b(?:pattern)\b`.
772 | * `-c/--count`: Report a count of total matched lines.
773 | * `--files`: Print the files that ripgrep *would* search, but don't actually
774 | search them.
775 | * `-a/--text`: Search binary files as if they were plain text.
776 | * `-z/--search-zip`: Search compressed files (gzip, bzip2, lzma, xz, lz4,
777 | brotli, zstd). This is disabled by default.
778 | * `-C/--context`: Show the lines surrounding a match.
779 | * `--sort path`: Force ripgrep to sort its output by file name. (This disables
780 | parallelism, so it might be slower.)
781 | * `-L/--follow`: Follow symbolic links while recursively searching.
782 | * `-M/--max-columns`: Limit the length of lines printed by ripgrep.
783 | * `--debug`: Shows ripgrep's debug output. This is useful for understanding
784 | why a particular file might be ignored from search, or what kinds of
785 | configuration ripgrep is loading from the environment.
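
The `\b(?:pattern)\b` rewrite described for `-w/--word-regexp` can be tried out with Python's `re` module. Python's regex dialect is not identical to the one ripgrep uses, so treat this as an illustration of the wrapping idea only; `word_regexp` is a hypothetical helper.

```python
import re


def word_regexp(pattern):
    """Wrap a pattern the way the -w description says: \\b(?:pattern)\\b."""
    return r"\b(?:" + pattern + r")\b"


# Only the standalone word matches, not `faster` or `breakfast`.
print(re.findall(word_regexp("fast"), "fast faster breakfast"))

# The non-capturing group matters for alternations: without it, each
# word boundary binds to just one branch of the alternation.
print(re.findall(r"\bfoo|bar\b", "foobar"))
print(re.findall(word_regexp("foo|bar"), "foobar"))
```

Without the group, `foobar` yields partial matches on both branches; with it, neither `foo` nor `bar` is a whole word, so nothing matches.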
786 |
--------------------------------------------------------------------------------
/GUIDE.zh.md:
--------------------------------------------------------------------------------
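1 | ## User Guide
2 |
3 | This guide is intended to give an elementary description of ripgrep and an overview of its capabilities. It assumes that ripgrep is [installed](readme.md#installation) and that readers have passing familiarity with command line tools. It also assumes a Unix-like system, although most commands are probably easy to translate to any command line shell environment.
4 |
5 | ### Table of Contents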
6 |
7 |
8 |
9 |
10 | - [Basics](#basics)
11 | - [Recursive search](#recursive-search)
12 | - [Automatic filtering](#automatic-filtering)
13 | - [Manual filtering: globs](#manual-filtering-globs)
14 | - [Manual filtering: file types](#manual-filtering-file-types)
15 | - [Replacements](#replacements)
16 | - [Configuration file](#configuration-file)
17 | - [File encoding](#file-encoding)
18 | - [Binary data](#binary-data)
19 | - [Common options](#common-options)
20 |
21 |
22 |
23 | ### Basics
24 |
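27 | ripgrep is a command line tool that searches your files for patterns that you give it. ripgrep behaves as if it reads each file line by line. If a line matches the pattern provided to ripgrep, then that line is printed. If a line does not match the pattern, then it is not printed.
28 |
29 | The best way to see how this works is with an example. To show an example, we need something to search. Let's try searching ripgrep's source code. First grab a ripgrep source archive and extract it: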
30 |
31 | ```
32 | $ curl -LO https://github.com/BurntSushi/ripgrep/archive/0.7.1.zip
33 | $ unzip 0.7.1.zip
34 | $ cd ripgrep-0.7.1
35 | $ ls
36 | benchsuite grep tests Cargo.toml LICENSE-MIT
37 | ci ignore wincolor CHANGELOG.md README.md
38 | complete pkg appveyor.yml compile snapcraft.yaml
39 | doc src build.rs COPYING UNLICENSE
40 | globset termcolor Cargo.lock HomebrewFormula
41 | ```
42 |
43 | Let's try our first search by looking for all occurrences of the word `fast` in `README.md`:
44 |
45 | ```
46 | $ rg fast README.md
47 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
48 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
49 | 119:### Is it really faster than everything else?
50 | 124:Summarizing, `ripgrep` is fast because:
51 | 129: optimizations to make searching very fast.
52 | ```
53 |
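54 | (**Note:** If you see an error message from ripgrep saying that it didn't search any files, then re-run ripgrep with the `--debug` flag. One likely cause is a `*` rule in your `$HOME/.gitignore` file.)
55 |
56 | So what happened here? ripgrep read the contents of `README.md`, and for each line that contained `fast`, ripgrep printed it to your terminal. ripgrep also included the line number for each line by default. If your terminal supports colors, then your output might look something like this screenshot:
57 |
58 | [A screenshot of a sample search with ripgrep](https://burntsushi.net/stuff/ripgrep-guide-sample.png)
59 |
60 | In this example, we searched for something called a "literal" string. This means that our pattern was just some normal text that we asked ripgrep to find. But ripgrep also supports specifying patterns via [regular expressions](https://en.wikipedia.org/wiki/Regular_expression). As an example, what if we wanted to find all lines that contain the word `fast` followed by some number of other letters?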
61 |
62 | ```
63 | $ rg 'fast\w+' README.md
64 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
65 | 119:### Is it really faster than everything else?
66 | ```
67 |
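68 | In this example, we used the pattern `fast\w+`. This pattern tells ripgrep to look for any lines containing the letters `fast` followed by *one or more* word-like characters. Namely, `\w` matches characters that compose words (like `a` and `L` but unlike `.` and ` `). The `+` after the `\w` means, "match the previous pattern one or more times." This means that the word `fast` won't match because there are no word characters following the final `t`. But a word like `faster` will. `faste` would also match!
69 |
70 | Here's a different variation on this same theme: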
71 |
72 | ```
73 | $ rg 'fast\w*' README.md
74 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
75 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
76 | 119:### Is it really faster than everything else?
77 | 124:Summarizing, `ripgrep` is fast because:
78 | 129: optimizations to make searching very fast.
79 | ```
80 |
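81 | In this case, we used `fast\w*` for our pattern instead of `fast\w+`. The `*` means that it should match *zero* or more times. Here, ripgrep will print the same lines as the pattern `fast`, but if your terminal supports colors, you'll notice that `faster` will be highlighted instead of just the `fast` prefix.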
82 |
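83 | A full tutorial on regular expressions is beyond the scope of this guide, but ripgrep's specific syntax is documented here: https://docs.rs/regex/1/regex/#syntax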
84 |
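85 | ### Recursive search
86 |
87 | In the previous section, we showed how to use ripgrep to search a single file. In this section, we'll show how to use ripgrep to search an entire directory of files. In fact, *recursively* searching your current working directory is the default mode of operation for ripgrep, which means doing this is very simple.
88 |
89 | Using our unzipped archive of ripgrep source code, here's how to find all function definitions whose name is `write`: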
90 |
91 | ```
92 | $ rg 'fn write\('
93 | src/printer.rs
94 | 469: fn write(&mut self, buf: &[u8]) {
95 |
96 | termcolor/src/lib.rs
97 | 227: fn write(&mut self, b: &[u8]) -> io::Result<usize> {
98 | 250: fn write(&mut self, b: &[u8]) -> io::Result<usize> {
99 | 428: fn write(&mut self, b: &[u8]) -> io::Result<usize> { self.wtr.write(b) }
100 | 441: fn write(&mut self, b: &[u8]) -> io::Result<usize> { self.wtr.write(b) }
101 | 454: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
102 | 511: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
103 | 848: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
104 | 915: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
105 | 949: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
106 | 1114: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
107 | 1348: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
108 | 1353: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
110 |
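111 | (**Note:** We escape the `(` here because `(` has special significance inside regular expressions. You could also use `rg -F 'fn write('` to achieve the same thing, where `-F` interprets your pattern as a literal string instead of a regular expression.)
112 |
113 | In this example, we didn't specify a file at all. Instead, ripgrep defaults to searching your current directory in the absence of a path. In general, `rg foo` is equivalent to `rg foo ./`.
114 |
115 | This particular search showed us results in both the `src` and `termcolor` directories. The `src` directory is the core ripgrep code, whereas `termcolor` is a dependency of ripgrep (and is used by other tools). What if we only wanted to search core ripgrep code? Well, that's easy, just specify the directory you want: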
116 |
117 | ```
118 | $ rg 'fn write\(' src
119 | src/printer.rs
120 | 469: fn write(&mut self, buf: &[u8]) {
121 | ```
122 |
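123 | Here, ripgrep limited its search to the `src` directory. Another way of doing this search would be to `cd` into the `src` directory and simply use `rg 'fn write\('` again.
124 |
125 | ### Automatic filtering
126 |
127 | After recursive search, ripgrep's most important feature is what it *doesn't* search. By default, when you search a directory, ripgrep will ignore all of the following:
128 |
129 | **1.** Files and directories that match the glob rules in your `.gitignore`.
130 | **2.** Hidden files and directories.
131 | **3.** Binary files. (ripgrep considers any file with a `NUL` byte to be binary.)
132 | **4.** Symbolic links aren't followed.
133 |
134 | All of these things can be toggled using various flags provided by ripgrep:
135 |
136 | **1.** You can disable `.gitignore` handling with the `--no-ignore` flag.
137 | **2.** Hidden files and directories can be searched with the `--hidden` flag.
138 | **3.** Binary files can be searched via the `--text` (`-a` for short) flag.
139 | Be careful with this flag! Binary files may emit control characters to your terminal, which might cause strange behavior.
140 | **4.** ripgrep can follow symlinks with the `--follow` (`-L` for short) flag.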
141 |
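142 | As a special convenience, ripgrep also provides a flag called `--unrestricted` (`-u` for short). Repeated uses of this flag will cause ripgrep to disable more and more of its filtering. That is, `-u` will disable `.gitignore` handling, `-uu` will search hidden files and directories and `-uuu` will search binary files. This is useful when you're using ripgrep and you aren't sure whether its filtering is hiding results from you. Tacking on a couple of `-u` flags is a quick way to find out. (Use the `--debug` flag if you're still perplexed, and if that doesn't help, [file an issue](https://github.com/BurntSushi/ripgrep/issues/new).)
143 |
144 | ripgrep's `.gitignore` handling actually goes a bit beyond just `.gitignore` files. ripgrep will also respect repository specific rules found in `$GIT_DIR/info/exclude`, as well as any global ignore rules in your `core.excludesFile` (which is usually `$XDG_CONFIG_HOME/git/ignore` on Unix-like systems).
145 |
146 | Sometimes you want to search files that are in your `.gitignore`, so it is possible to specify additional ignore rules or overrides in a `.ignore` (application agnostic) or `.rgignore` (ripgrep specific) file.
147 |
148 | For example, let's say you have a `.gitignore` file that looks like this: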
149 |
150 | ```
151 | log/
152 | ```
153 |
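154 | This generally means that any `log` directory won't be tracked by `git`. However, perhaps it contains useful output that you'd like to include in your searches, but you still don't want to track it in `git`. You can achieve this by creating a `.ignore` file in the same directory as the `.gitignore` file with the following contents: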
155 |
156 | ```
157 | !log/
158 | ```
159 |
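160 | ripgrep treats `.ignore` files with higher precedence than `.gitignore` files (and treats `.rgignore` files with higher precedence than `.ignore` files). This means ripgrep will see the `!log/` whitelist rule first and search that directory.
161 |
162 | Like `.gitignore`, a `.ignore` file can be placed in any directory. Its rules will be processed with respect to the directory it resides in, just like `.gitignore`.
163 |
164 | To process `.gitignore` and `.ignore` files case insensitively, use the
165 | `--ignore-file-case-insensitive` flag. This is especially useful on case insensitive file systems like those on Windows and macOS. Note, though, that this can come with a significant performance penalty, and is therefore disabled by default.
166 |
167 | For a more in depth description of how glob patterns in a `.gitignore` file are interpreted, please see `man gitignore`.
168 |
169 | ### Manual filtering: globs
170 |
171 | In the previous section, we talked about the filtering that ripgrep does by default. It is "automatic" because it reacts to your environment. That is, it uses already existing `.gitignore` files to produce more relevant search results.
172 |
173 | In addition to automatic filtering, ripgrep also provides more manual or ad hoc filtering. This comes in two varieties: additional glob patterns specified in your ripgrep commands and file type filtering. This section covers glob patterns while the next section covers file type filtering.
174 |
175 | In our ripgrep source code (see [Basics](#basics) for instructions on how to get a source archive to search), let's say we wanted to see which things depend on `clap`, our argument parser.
176 |
177 | We could do this: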
178 |
179 | ```
180 | $ rg clap
181 | [lots of results]
182 | ```
183 |
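184 | But this shows us many things, and we're only interested in where we wrote `clap` as a dependency. Instead, we could limit ourselves to TOML files, which is how dependencies are communicated to Rust's build tool, Cargo: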
185 |
186 | ```
187 | $ rg clap -g '*.toml'
188 | Cargo.toml
189 | 35:clap = "2.26"
190 | 51:clap = "2.26"
191 | ```
192 |
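193 | The `-g '*.toml'` syntax says, "make sure every file searched matches this glob pattern." Note that we put `'*.toml'` in single quotes to prevent our shell from expanding the `*`.
194 |
195 | If we wanted, we could tell ripgrep to search anything *but* `*.toml` files: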
196 |
197 | ```
198 | $ rg clap -g '!*.toml'
199 | [lots of results]
200 | ```
201 |
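202 | This will give you a lot of results again, but they won't include files ending with `.toml`. Note that the use of a `!` here to mean "negation" is a bit non-standard, but it was chosen to be consistent with how globs in `.gitignore` files are written. (Although, the meaning is reversed. In `.gitignore` files, a `!` prefix means whitelist, and on the command line, a `!` means blacklist.)
203 |
204 | Globs are interpreted in exactly the same way as `.gitignore` patterns. That is, later globs will override earlier globs. For example, the following command will search only `*.toml` files: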
205 |
206 | ```
207 | $ rg clap -g '!*.toml' -g '*.toml'
208 | ```
209 |
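210 | Interestingly, reversing the order of the globs in this case will match nothing: the presence of a whitelist glob requires every searched file to match it, and the later blacklist glob then excludes exactly those files.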
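
The "later globs override earlier globs" rule can be approximated with a short Python model. This is a deliberately simplified sketch of the behavior described in this section, using `fnmatch` on bare file names; `is_searched` is a hypothetical helper and ripgrep's real glob matching is more involved.

```python
from fnmatch import fnmatch


def is_searched(filename, globs):
    """Decide whether a file is searched, letting the last matching glob win."""
    has_whitelist = any(not g.startswith("!") for g in globs)
    # If any whitelist glob exists, files are excluded unless a glob admits them.
    decision = not has_whitelist
    for g in globs:
        negated = g.startswith("!")
        if fnmatch(filename, g[1:] if negated else g):
            decision = not negated
    return decision


files = ["Cargo.toml", "main.rs"]

# -g '!*.toml' -g '*.toml': only TOML files are searched.
print([f for f in files if is_searched(f, ["!*.toml", "*.toml"])])
# Reversed order: nothing is searched at all.
print([f for f in files if is_searched(f, ["*.toml", "!*.toml"])])
# A lone blacklist glob searches everything except TOML files.
print([f for f in files if is_searched(f, ["!*.toml"])])
```

In the reversed case, the whitelist glob excludes `main.rs` by default and the later blacklist glob excludes `Cargo.toml`, so no file remains.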
211 |
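212 | ### Manual filtering: file types
213 |
214 | Over time, you might notice that you use the same glob patterns over and over. For example, you might find yourself doing a lot of searches where you only want to see results for Rust files: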
215 |
216 | ```
217 | $ rg 'fn run' -g '*.rs'
218 | ```
219 |
220 | Instead of writing out the glob every time, you can use ripgrep's support for file types:
221 |
222 | ```
223 | $ rg 'fn run' --type rust
224 | ```
225 |
226 | or, more succinctly,
227 |
228 | ```
229 | $ rg 'fn run' -trust
230 | ```
231 |
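232 | The way the `--type` flag functions is simple. It acts as a name that is assigned to one or more globs that match the relevant files. This lets you write a single type that might encompass a broad range of file extensions. For example, if you wanted to search C files, you'd have to check both C source files and C header files: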
233 |
234 | ```
235 | $ rg 'int main' -g '*.{c,h}'
236 | ```
237 |
238 | or you could just use the C file type:
239 |
240 | ```
241 | $ rg 'int main' -tc
242 | ```
243 |
244 | Just as you can write blacklist globs, you can blacklist file types too:
245 |
246 | ```
247 | $ rg clap --type-not rust
248 | ```
249 |
250 | or, more succinctly,
251 |
252 | ```
253 | $ rg clap -Trust
254 | ```
255 |
256 | That is, `-t` means "include files of this type" whereas `-T` means "exclude files of this type."
257 |
258 | To see the globs that make up a type, run `rg --type-list`:
259 |
260 | ```
261 | $ rg --type-list | rg '^make:'
262 | make: *.mak, *.mk, GNUmakefile, Gnumakefile, Makefile, gnumakefile, makefile
263 | ```
264 |
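265 | By default, ripgrep comes with a bunch of pre-defined types. Generally, these types correspond to well known public formats. But you can define your own types as well. For example, perhaps you frequently search "web" files, which consist of JavaScript, HTML and CSS: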
266 |
267 | ```
268 | $ rg --type-add 'web:*.html' --type-add 'web:*.css' --type-add 'web:*.js' -tweb title
269 | ```
270 |
271 | or, more succinctly,
272 |
273 | ```
274 | $ rg --type-add 'web:*.{html,css,js}' -tweb title
275 | ```
276 |
277 | The above command defines a new type, `web`, corresponding to the glob `*.{html,css,js}`. It then applies the new filter with `-tweb` and searches for the pattern `title`. If you ran
278 |
279 | ```
280 | $ rg --type-add 'web:*.{html,css,js}' --type-list
281 | ```
282 |
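283 | then you'd see your `web` type show up in the list, even though it is not part of ripgrep's built-in types.
284 |
285 | It is important to stress here that the `--type-add` flag only applies to the current command. It does not add a new file type and save it somewhere in a persistent form. If you want a type to be available in every ripgrep command, then you should either create a shell alias: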
286 |
287 | ```
288 | alias rg="rg --type-add 'web:*.{html,css,js}'"
289 | ```
290 |
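291 | or add `--type-add=web:*.{html,css,js}` to your ripgrep configuration file. ([Configuration files](#configuration-file) are covered in more detail later.)
292 |
293 | ### Replacements
294 |
295 | > ripgrep **will never modify your files**
296 |
297 | ripgrep provides a limited ability to modify its output by replacing matched text with some other text. This is easiest to explain with an example. Remember when we searched for the word `fast` in ripgrep's README?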
298 |
299 | ```
300 | $ rg fast README.md
301 | 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
302 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
303 | 119:### Is it really faster than everything else?
304 | 124:Summarizing, `ripgrep` is fast because:
305 | 129: optimizations to make searching very fast.
306 | ```
307 |
308 | What if we wanted to *replace* all occurrences of `fast` with `FAST`? That's easy with ripgrep's `--replace` flag:
309 |
310 | ```
311 | $ rg fast README.md --replace FAST
312 | 75: FASTer than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
313 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays FAST while
314 | 119:### Is it really FASTer than everything else?
315 | 124:Summarizing, `ripgrep` is FAST because:
316 | 129: optimizations to make searching very FAST.
317 | ```
318 |
319 | or, more succinctly,
320 |
321 | ```
322 | $ rg fast README.md -r FAST
323 | [snip]
324 | ```
325 |
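326 | In essence, the `--replace` flag applies *only* to the matching portion of text in the output. If you instead want to replace an entire line of text, then you need to include the entire line in your match. For example: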
327 |
328 | ```
329 | $ rg '^.*fast.*$' README.md -r FAST
330 | 75:FAST
331 | 88:FAST
332 | 119:FAST
333 | 124:FAST
334 | 129:FAST
335 | ```
336 |
337 | Alternatively, you can combine the `--only-matching` (or `-o` for short) flag with the `--replace` flag to achieve the same result:
338 |
339 | ```
340 | $ rg fast README.md --only-matching --replace FAST
341 | 75:FAST
342 | 88:FAST
343 | 119:FAST
344 | 124:FAST
345 | 129:FAST
346 | ```
347 |
348 | or, more succinctly,
349 |
350 | ```
351 | $ rg fast README.md -or FAST
352 | [snip]
353 | ```
354 |
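355 | Finally, replacements can include capturing groups. For example, let's say we wanted to find all occurrences of `fast` followed by another word and join them together with a dash. The pattern we might use for that is `fast\s+(\w+)`, which matches `fast`, followed by any amount of whitespace, followed by one or more "word" characters. We put the `\w+` in a "capturing group" (indicated by parentheses) so that we can reference it later in our replacement string. For example: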
356 |
357 | ```
358 | $ rg 'fast\s+(\w+)' README.md -r 'fast-$1'
359 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast-while
360 | 124:Summarizing, `ripgrep` is fast-because:
361 | ```
362 |
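363 | Our replacement string here, `fast-$1`, consists of `fast-` followed by the contents of the capturing group at index `1`. (Capturing groups actually start at index 0, but the `0`th capturing group always corresponds to the entire match. The capturing group at index `1` always corresponds to the first explicit capturing group found in the regex pattern.)
364 |
365 | Capturing groups can also be named, which is sometimes more convenient than using the indices. For example, the following command is equivalent to the above command: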
366 |
367 | ```
368 | $ rg 'fast\s+(?P<word>\w+)' README.md -r 'fast-$word'
369 | 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast-while
370 | 124:Summarizing, `ripgrep` is fast-because:
371 | ```
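
The same numbered and named capture-group substitutions can be reproduced with Python's `re` module, which may help when experimenting with a replacement before running it through ripgrep. Note the differing replacement syntax: Python writes `\1` and `\g<word>` where ripgrep's replacement string uses `$1` and `$word`. The sample line is taken from the README output shown earlier.

```python
import re

line = "color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while"

# Numbered group: Python's \1 plays the role of $1 in ripgrep's -r string.
print(re.sub(r"fast\s+(\w+)", r"fast-\1", line))

# Named group: Python's \g<word> plays the role of $word.
print(re.sub(r"fast\s+(?P<word>\w+)", r"fast-\g<word>", line))
```

Like `--replace`, `re.sub` only produces a new string here; nothing on disk is changed.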
372 |
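373 | It is important to note that ripgrep **will never modify your files**. The `--replace` flag only controls ripgrep's output. (And there is no flag to let you do a replacement in a file.)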
374 |
375 | ### Configuration file
376 |
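379 | It is possible that ripgrep's default options aren't suitable in every case. For that reason, and because shell aliases aren't always convenient, ripgrep supports configuration files.
380 |
381 | Setting up a configuration file is simple. ripgrep will not look in any predetermined directory for a config file automatically. Instead, you need to set the `RIPGREP_CONFIG_PATH` environment variable to the file path of your config file. Once the environment variable is set, open the file and just type in the flags you want set automatically. There are only two rules for describing the format of the config file:
382 |
383 | **1.** Every line is a shell argument, after trimming whitespace.
384 | **2.** Lines starting with `#` (optionally preceded by any amount of whitespace) are ignored.
385 |
386 | In particular, there is no escaping. Each line is given to ripgrep as a single command line argument verbatim.
387 |
388 | Here's an example of a configuration file, which demonstrates some of the formatting peculiarities: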
389 |
390 | ```bash
391 | $ cat $HOME/.ripgreprc
392 | # Don't let ripgrep vomit really long lines to my terminal,
393 | --max-columns=150
394 | # and show a preview.
395 | --max-column-preview
396 |
397 | # Add my 'web' type.
398 | --type-add
399 | web:*.{html,css,js}*
400 |
401 | # Using glob patterns to include/exclude files or folders
402 | --glob=!git/*
403 |
404 | # or
405 | --glob
406 | !git/*
407 |
408 | # Set the colors.
409 | --colors=line:none
410 | --colors=line:style:bold
411 |
412 | # Use smart case.
413 | --smart-case
414 | ```
415 |
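416 | When we use a flag that has a value, we either put the flag and the value on the same line but delimited by an `=` sign (e.g., `--max-columns=150`), or we put the flag and the value on two different lines. This is because ripgrep's argument parser knows to treat the single argument `--max-columns=150` as a flag with a value, but if we had written `--max-columns 150` in our configuration file, then ripgrep's argument parser wouldn't know what to do with it.
417 |
418 | Putting the flag and value on different lines is exactly equivalent and is a matter of style.
419 |
420 | Comments are encouraged so that you remember what the config is doing. Empty lines are OK too.
421 |
422 | So let's say you're using the above configuration file, but while you're at a terminal, you really want to be able to see lines longer than 150 columns. What do you do? Thankfully, all you need to do is pass `--max-columns 0` (or `-M0` for short) on the command line, which will override your configuration file's setting. This works because ripgrep's configuration file is *prepended* to the explicit arguments you give it on the command line. Since flags given later override flags given earlier, everything works as expected. This works for most other flags as well, and each flag's documentation states which other flags override it.
423 |
424 | If you're confused about what configuration file ripgrep is reading arguments from, then running ripgrep with the `--debug` flag should help clarify things. The debug output should note what config file is being loaded and the arguments that have been read from the configuration.
425 |
426 | Finally, if you want to make absolutely sure that ripgrep *isn't* reading a configuration file, then you can pass the `--no-config` flag, which will always prevent ripgrep from reading extraneous configuration from the environment, regardless of what other methods of configuration are added to ripgrep in the future.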
427 |
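428 | ### File encoding
429 |
430 | [Text encoding](https://en.wikipedia.org/wiki/Character_encoding) is a complex topic, but we can try to summarize its relevancy to ripgrep:
431 |
432 | - Files are generally just a bundle of bytes. There is no reliable way to know their encoding.
433 | - Either the encoding of the pattern must match the encoding of the files being searched, or a form of transcoding must be performed that converts either the pattern or the file to the same encoding as the other.
434 | - ripgrep tends to work best on plain text files, and among plain text files, the most popular encodings likely consist of ASCII, latin1 or UTF-8. As a special exception, UTF-16 is prevalent in Windows environments.
435 |
436 | In light of the above, here is how ripgrep behaves when `--encoding auto` is given, which is the default:
437 |
438 | - All input is assumed to be ASCII compatible (which means every byte that corresponds to an ASCII codepoint actually is an ASCII codepoint). This includes ASCII itself, latin1 and UTF-8.
439 | - ripgrep works best with UTF-8. For example, ripgrep's regular expression engine supports Unicode features. Namely, character classes like `\w` will match all word characters by Unicode's definition and `.` will match any Unicode codepoint instead of any byte. These constructions assume UTF-8, so they simply won't match when they come across bytes in a file that aren't UTF-8.
440 | - To handle the UTF-16 case, ripgrep will do something called "BOM sniffing" by default. That is, the first three bytes of a file will be read, and if they correspond to a UTF-16 BOM, then ripgrep will transcode the contents of the file from UTF-16 to UTF-8, and then execute the search on the transcoded version of the file. (This incurs a performance penalty since transcoding is slower than regex searching.) If the file contains invalid UTF-16, then the Unicode replacement codepoint is substituted in place of invalid code units.
441 | - To handle other cases, ripgrep provides a `-E/--encoding` flag, which permits you to specify an encoding from the [Encoding Standard](https://encoding.spec.whatwg.org/#concept-encoding-get). ripgrep will assume *all* files searched are the encoding specified (unless the file has a BOM) and will perform a transcoding step just like in the UTF-16 case described above.
442 |
443 | By default, ripgrep will not require its input be valid UTF-8. That is, ripgrep can and will search arbitrary bytes. The key here is that if you're searching content that isn't UTF-8, then the usefulness of your pattern will degrade. If you're searching bytes that aren't ASCII compatible, then it's likely the pattern won't find anything. With all that said, this mode of operation is important, because it lets you find ASCII or UTF-8 *within* files that are otherwise arbitrary bytes.
444 |
445 | As a special case, the `-E/--encoding` flag supports the value `none`, which will completely disable all encoding related logic, including BOM sniffing.
446 | When `-E/--encoding` is set to `none`, ripgrep will search the raw bytes of the underlying file with no transcoding step. For example, here's how you might search the raw UTF-16 encoding of the string `Шерлок`: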
447 |
448 | ```
449 | $ rg '(?-u)\(\x045\x04@\x04;\x04>\x04:\x04' -E none -a some-utf16-file
450 | ```
451 |
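452 | Of course, that's just an example meant to show how one can drop down into raw bytes. Namely, the simpler command works as you might expect automatically: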
453 |
454 | ```
455 | $ rg 'Шерлок' some-utf16-file
456 | ```
457 |
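458 | Finally, it is possible to disable ripgrep's Unicode support from within the regular expression. For example, let's say you wanted `.` to match any byte rather than any Unicode codepoint. (You might want this while searching a binary file, since `.` by default will not match invalid UTF-8.) You could do this by disabling Unicode via a regular expression flag: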
459 |
460 | ```
461 | $ rg '(?-u:.)'
462 | ```
463 |
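464 | This works for any part of the pattern. For example, the following will find any Unicode word character followed by any ASCII word character followed by another Unicode word character: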
465 |
466 | ```
467 | $ rg '\w(?-u:\w)\w'
468 | ```
469 |
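470 | ### Binary data
471 |
472 | In addition to skipping hidden files and files in your `.gitignore` by default, ripgrep also attempts to skip binary files. ripgrep does this by default because
473 | binary files (like PDFs or images) are typically not things you want to search when searching for regex matches. Moreover, if content in a binary file did match, then it's possible for undesirable binary data to be printed to your terminal and wreak havoc.
474 |
475 | Unfortunately, unlike skipping hidden files and respecting your `.gitignore`
476 | rules, a file cannot as easily be classified as binary. In order to figure out whether a file is binary, the most effective heuristic that balances correctness with performance is to simply look for `NUL` bytes. At that point, the determination is simple: a file is considered "binary" if and only if it contains a `NUL` byte somewhere in its contents.
477 |
478 | The issue is that while most binary files will have a `NUL` byte toward the beginning of their contents, this is not necessarily true. The `NUL` byte might be the very last byte in a large file, but that file is still considered binary. While this leads to a fair amount of complexity inside ripgrep's implementation, it also results in some unintuitive user experiences.
479 |
480 | At a high level, ripgrep operates in three different modes with respect to binary files:
481 |
482 | **1.** **The default mode** is to attempt to remove binary files from a search completely. This is meant to mirror how ripgrep removes hidden files and files in your `.gitignore` automatically. That is, as soon as a file is detected as binary, searching stops. If a match was already printed (because it was detected long before a `NUL` byte), then ripgrep will print a warning message indicating that the search stopped prematurely. This default mode
483 | **only applies to files searched by ripgrep as a result of recursive directory traversal**, which is consistent with ripgrep's other automatic filtering. For example, `rg foo .file` will search `.file` even though it is hidden. Similarly, `rg foo binary-file` will search `binary-file` in "binary" mode automatically.
484 | **2.** **Binary mode** is similar to the default mode, except it will not always stop searching after it sees a `NUL` byte. Namely, in this mode, ripgrep
485 | will continue searching a file that is known to be binary until the first of two conditions is met: **1)** the end of the file has been reached or **2)** a match is or has been seen. This means that in binary mode, if ripgrep reports no matches, then there are no matches in the file. When a match does occur, ripgrep prints a message similar to the one it prints in its default mode, indicating that the search has stopped prematurely.
486 | This mode can be forcefully enabled for all files with the `--binary` flag. The purpose of binary mode is to provide a way to discover matches in all files, but to avoid having binary data dumped into your terminal.
487 | **3.** **Text mode** completely disables all binary detection and searches all files as if they were text. This is useful when searching a file that is predominantly text but contains a `NUL` byte, or if you are specifically trying to search binary data. This mode can be enabled with the `-a/--text` flag.
488 | Note that when using this mode on very large binary files, it is possible for ripgrep to use a lot of memory.
489 |
490 | Unfortunately, there is one additional complexity in ripgrep that can make it difficult to reason about binary files. That is, the way binary detection works depends on the way that ripgrep searches your files. Specifically:
491 |
492 | - When ripgrep uses memory maps, then binary detection is only performed on the first few kilobytes of the file in addition to every matching line.
493 | - When ripgrep doesn't use memory maps, then binary detection is performed on all bytes searched.
494 |
495 | This means that whether a file is detected as binary or not can change based on the internal search strategy used by ripgrep. If you prefer to keep ripgrep's binary file detection consistent, then you can disable memory maps via the `--no-mmap` flag. (The cost will be a small performance regression when searching very large files on some platforms.)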
496 |
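497 | ### Common options
498 |
499 | ripgrep has a lot of flags. Too many to keep in your head at once. This section is intended to give you a sampling of some of the most important and frequently used options that will likely impact how you use ripgrep on a regular basis.
500 |
501 | | Flag | Description |
502 | | ------------------ | ----------- |
503 | | `-h` | Show ripgrep's condensed help output. |
504 | | `--help` | Show ripgrep's longer form help output. (Nearly what you'd find in ripgrep's man page, so pipe it into a pager!) |
505 | | `-i/--ignore-case` | When searching for a pattern, ignore case differences. That is, `rg -i fast` matches `fast`, `fASt`, `FAST`, etc. |
506 | | `-S/--smart-case` | This is similar to `--ignore-case`, but disables itself if the pattern contains any uppercase letters. Usually this flag is put into an alias or a config file. |
507 | | `-w/--word-regexp` | Require that all matches of the pattern be surrounded by word boundaries. That is, given `pattern`, the `--word-regexp` flag will cause ripgrep to behave as if `pattern` were actually `\b(?:pattern)\b`. |
508 | | `-c/--count` | Report a count of total matched lines. |
509 | | `--files` | Print the files that ripgrep *would* search, but don't actually search them. |
510 | | `-a/--text` | Search binary files as if they were plain text. |
511 | | `-z/--search-zip` | Search compressed files (gzip, bzip2, lzma, xz, lz4, brotli, zstd). This is disabled by default. |
512 | | `-C/--context` | Show the lines surrounding a match. |
513 | | `--sort path` | Force ripgrep to sort its output by file name. (This disables parallelism, so it might be slower.) |
514 | | `-L/--follow` | Follow symbolic links while recursively searching. |
515 | | `-M/--max-columns` | Limit the length of lines printed by ripgrep. |
516 | | `--debug` | Show ripgrep's debug output. This is useful for understanding why a particular file might be ignored from search, or what kinds of configuration ripgrep is loading from the environment. |
517 |
518 | [More: the `rg -h` help output for v0.10.0](./rg-0.10.0-h.zh.md)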
519 |
--------------------------------------------------------------------------------
/en.md:
--------------------------------------------------------------------------------
1 | ripgrep (rg)
2 | ------------
3 | ripgrep is a line-oriented search tool that recursively searches your current
4 | directory for a regex pattern while respecting your gitignore rules. ripgrep
5 | has first class support on Windows, macOS and Linux, with binary downloads
6 | available for [every release](https://github.com/BurntSushi/ripgrep/releases).
7 | ripgrep is similar to other popular search tools like The Silver Searcher,
8 | ack and grep.
9 |
10 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
11 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
12 | [crates.io](https://crates.io/crates/ripgrep)
13 |
14 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
15 |
16 |
17 | ### CHANGELOG
18 |
19 | Please see the [CHANGELOG](CHANGELOG.md) for a release history.
20 |
21 | ### Documentation quick links
22 |
23 | * [Installation](#installation)
24 | * [User Guide](GUIDE.md)
25 | * [Frequently Asked Questions](FAQ.md)
26 | * [Regex syntax](https://docs.rs/regex/1/regex/#syntax)
27 | * [Configuration files](GUIDE.md#configuration-file)
28 | * [Shell completions](FAQ.md#complete)
29 | * [Building](#building)
30 |
31 |
32 | ### Screenshot of search results
33 |
34 | [A screenshot of a sample search with ripgrep](http://burntsushi.net/stuff/ripgrep1.png)
35 |
36 |
37 | ### Quick examples comparing tools
38 |
39 | This example searches the entire Linux kernel source tree (after running
40 | `make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be
41 | words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz, and
42 | ripgrep was compiled with SIMD enabled.
43 |
44 | Please remember that a single benchmark is never enough! See my
45 | [blog post on ripgrep](http://blog.burntsushi.net/ripgrep/)
46 | for a very detailed comparison with more benchmarks and analysis.
47 |
48 | | Tool | Command | Line count | Time |
49 | | ---- | ------- | ---------- | ---- |
50 | | ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.106s** |
51 | | [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.553s |
52 | | [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.589s |
53 | | [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.266s |
54 | | [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.505s |
55 | | [ack](https://github.com/petdance/ack2) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 6.823s |
56 | | [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 14.208s |
57 |
58 | (Yes, `ack` [has](https://github.com/petdance/ack2/issues/445) a
59 | [bug](https://github.com/petdance/ack2/issues/14).)
60 |
61 | Here's another benchmark that disregards gitignore files and searches with a
62 | whitelist instead. The corpus is the same as in the previous benchmark, and the
63 | flags passed to each command ensure that they are doing equivalent work:
64 |
65 | | Tool | Command | Line count | Time |
66 | | ---- | ------- | ---------- | ---- |
67 | | ripgrep | `rg -L -u -tc -n -w '[A-Z]+_SUSPEND'` | 404 | **0.079s** |
68 | | [ucg](https://github.com/gvansickle/ucg) | `ucg --type=cc -w '[A-Z]+_SUSPEND'` | 390 | 0.163s |
69 | | [GNU grep](https://www.gnu.org/software/grep/) | `egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 404 | 0.611s |
70 |
71 | (`ucg` [has slightly different behavior in the presence of symbolic links](https://github.com/gvansickle/ucg/issues/106).)
72 |
73 | And finally, a straight-up comparison between ripgrep and GNU grep on a single
74 | large file (~9.3GB,
75 | [`OpenSubtitles2016.raw.en.gz`](http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.en.gz)):
76 |
77 | | Tool | Command | Line count | Time |
78 | | ---- | ------- | ---------- | ---- |
79 | | ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 5268 | **2.108s** |
80 | | [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C egrep -w 'Sherlock [A-Z]\w+'` | 5268 | 7.014s |
81 |
82 | In the above benchmark, passing the `-n` flag (for showing line numbers)
83 | increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
84 |
85 |
86 | ### Why should I use ripgrep?
87 |
88 | * It can replace many use cases served by other search tools
89 | because it contains most of their features and is generally faster. (See
90 | [the FAQ](FAQ.md#posix4ever) for more details on whether ripgrep can truly
91 | replace grep.)
92 | * Like other tools specialized to code search, ripgrep defaults to recursive
93 | directory search and won't search files ignored by your `.gitignore` files.
94 | It also ignores hidden and binary files by default. ripgrep also implements
95 | full support for `.gitignore`, whereas there are many bugs related to that
96 | functionality in other code search tools claiming to provide the same
97 | functionality.
98 | * ripgrep can search specific types of files. For example, `rg -tpy foo`
99 | limits your search to Python files and `rg -Tjs foo` excludes Javascript
100 | files from your search. ripgrep can be taught about new file types with
101 | custom matching rules.
102 | * ripgrep supports many features found in `grep`, such as showing the context
103 | of search results, searching multiple patterns, highlighting matches with
104 | color and full Unicode support. Unlike GNU grep, ripgrep stays fast while
105 | supporting Unicode (which is always on).
106 | * ripgrep has optional support for switching its regex engine to use PCRE2.
107 | Among other things, this makes it possible to use look-around and
108 | backreferences in your patterns, which are not supported in ripgrep's default
109 | regex engine. PCRE2 support is enabled with `-P`.
110 | * ripgrep supports searching files in text encodings other than UTF-8, such
111 | as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for
112 | automatically detecting UTF-16 is provided. Other text encodings must be
113 | specifically specified with the `-E/--encoding` flag.)
114 | * ripgrep supports searching files compressed in a common format (gzip, xz,
115 | lzma, bzip2 or lz4) with the `-z/--search-zip` flag.
116 | * ripgrep supports arbitrary input preprocessing filters which could be PDF
117 | text extraction, less supported decompression, decrypting, automatic encoding
118 | detection and so on.
119 |
120 | In other words, use ripgrep if you like speed, filtering by default, fewer
121 | bugs and Unicode support.
122 |
123 |
124 | ### Why shouldn't I use ripgrep?
125 |
126 | Despite initially not wanting to add every feature under the sun to ripgrep,
127 | over time, ripgrep has grown support for most features found in other file
128 | searching tools. This includes searching for results spanning across multiple
129 | lines, and opt-in support for PCRE2, which provides look-around and
130 | backreference support.
131 |
132 | At this point, the primary reasons not to use ripgrep probably consist of one
133 | or more of the following:
134 |
135 | * You need a portable and ubiquitous tool. While ripgrep works on Windows,
136 | macOS and Linux, it is not ubiquitous and it does not conform to any
137 | standard such as POSIX. The best tool for this job is good old grep.
138 | * There still exists some other feature (or bug) not listed in this README that
139 | you rely on that's in another tool that isn't in ripgrep.
140 | * There is a performance edge case where ripgrep doesn't do well where another
141 | tool does do well. (Please file a bug report!)
142 | * ripgrep can't be installed on your machine or isn't available for your
143 | platform. (Please file a bug report!)
144 |
145 |
146 | ### Is it really faster than everything else?
147 |
148 | Generally, yes. A large number of benchmarks with detailed analysis for each is
149 | [available on my blog](http://blog.burntsushi.net/ripgrep/).
150 |
151 | Summarizing, ripgrep is fast because:
152 |
153 | * It is built on top of
154 | [Rust's regex engine](https://github.com/rust-lang-nursery/regex).
155 | Rust's regex engine uses finite automata, SIMD and aggressive literal
156 | optimizations to make searching very fast. (PCRE2 support can be opted into
157 | with the `-P/--pcre2` flag.)
158 | * Rust's regex library maintains performance with full Unicode support by
159 | building UTF-8 decoding directly into its deterministic finite automaton
160 | engine.
161 | * It supports searching with either memory maps or by searching incrementally
162 | with an intermediate buffer. The former is better for single files and the
163 | latter is better for large directories. ripgrep chooses the best searching
164 | strategy for you automatically.
165 | * It applies your ignore patterns in `.gitignore` files using a
166 | [`RegexSet`](https://docs.rs/regex/1/regex/struct.RegexSet.html).
167 | That means a single file path can be matched against multiple glob patterns
168 | simultaneously.
169 | * It uses a lock-free parallel recursive directory iterator, courtesy of
170 | [`crossbeam`](https://docs.rs/crossbeam) and
171 | [`ignore`](https://docs.rs/ignore).
172 |
173 |
174 | ### Feature comparison
175 |
176 | Andy Lester, author of [ack](https://beyondgrep.com/), has published an
177 | excellent table comparing the features of ack, ag, git-grep, GNU grep and
178 | ripgrep: https://beyondgrep.com/feature-comparison/
179 |
180 | Note that ripgrep has grown a few significant new features recently that
181 | are not yet present in Andy's table. This includes, but is not limited to,
182 | configuration files, passthru, support for searching compressed files,
183 | multiline search and opt-in fancy regex support via PCRE2.
184 |
185 |
186 | ### Installation
187 |
188 | The binary name for ripgrep is `rg`.
189 |
190 | **[Archives of precompiled binaries for ripgrep are available for Windows,
191 | macOS and Linux.](https://github.com/BurntSushi/ripgrep/releases)** Users of
192 | platforms not explicitly mentioned below are advised to download one of these
193 | archives.
194 |
195 | Linux binaries are static executables. Windows binaries are available either as
196 | built with MinGW (GNU) or with Microsoft Visual C++ (MSVC). When possible,
197 | prefer MSVC over GNU, but you'll need to have the [Microsoft VC++ 2015
198 | redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145)
199 | installed.
200 |
201 | If you're a **macOS Homebrew** or a **Linuxbrew** user,
202 | then you can install ripgrep either
203 | from homebrew-core (compiled with Rust stable, no SIMD):
204 |
205 | ```
206 | $ brew install ripgrep
207 | ```
208 |
209 | or you can install a binary compiled with rust nightly (including SIMD and all
210 | optimizations) by utilizing a custom tap:
211 |
212 | ```
213 | $ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
214 | $ brew install ripgrep-bin
215 | ```
216 |
217 | If you're a **MacPorts** user, then you can install ripgrep from the
218 | [official ports](https://www.macports.org/ports.php?by=name&substr=ripgrep):
219 |
220 | ```
221 | $ sudo port install ripgrep
222 | ```
223 |
224 | If you're a **Windows Chocolatey** user, then you can install ripgrep from the
225 | [official repo](https://chocolatey.org/packages/ripgrep):
226 |
227 | ```
228 | $ choco install ripgrep
229 | ```
230 |
231 | If you're a **Windows Scoop** user, then you can install ripgrep from the
232 | [official bucket](https://github.com/lukesampson/scoop/blob/master/bucket/ripgrep.json):
233 |
234 | ```
235 | $ scoop install ripgrep
236 | ```
237 |
238 | If you're an **Arch Linux** user, then you can install ripgrep from the official repos:
239 |
240 | ```
241 | $ pacman -S ripgrep
242 | ```
243 |
244 | If you're a **Gentoo** user, you can install ripgrep from the
245 | [official repo](https://packages.gentoo.org/packages/sys-apps/ripgrep):
246 |
247 | ```
248 | $ emerge sys-apps/ripgrep
249 | ```
250 |
251 | If you're a **Fedora 27+** user, you can install ripgrep from official
252 | repositories.
253 |
254 | ```
255 | $ sudo dnf install ripgrep
256 | ```
257 |
258 | If you're a **Fedora 24+** user, you can install ripgrep from
259 | [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
260 |
261 | ```
262 | $ sudo dnf copr enable carlwgeorge/ripgrep
263 | $ sudo dnf install ripgrep
264 | ```
265 |
266 | If you're an **openSUSE Tumbleweed** user, you can install ripgrep from the
267 | [official repo](http://software.opensuse.org/package/ripgrep):
268 |
269 | ```
270 | $ sudo zypper install ripgrep
271 | ```
272 |
273 | If you're a **RHEL/CentOS 7** user, you can install ripgrep from
274 | [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
275 |
276 | ```
277 | $ sudo yum-config-manager --add-repo=https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/repo/epel-7/carlwgeorge-ripgrep-epel-7.repo
278 | $ sudo yum install ripgrep
279 | ```
280 |
281 | If you're a **Nix** user, you can install ripgrep from
282 | [nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/tools/text/ripgrep/default.nix):
283 |
284 | ```
285 | $ nix-env --install ripgrep
286 | $ # (Or using the attribute name, which is also ripgrep.)
287 | ```
288 |
289 | If you're a **Debian** user (or a user of a Debian derivative like **Ubuntu**),
290 | then ripgrep can be installed using a binary `.deb` file provided in each
291 | [ripgrep release](https://github.com/BurntSushi/ripgrep/releases).
292 |
293 | ```
294 | $ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/0.10.0/ripgrep_0.10.0_amd64.deb
295 | $ sudo dpkg -i ripgrep_0.10.0_amd64.deb
296 | ```
297 |
298 | If you run Debian Buster (currently Debian testing) or Debian sid, ripgrep is
299 | [officially maintained by Debian](https://tracker.debian.org/pkg/rust-ripgrep).
300 | ```
301 | $ sudo apt-get install ripgrep
302 | ```
303 |
304 | If you're an **Ubuntu Cosmic (18.10)** (or newer) user, ripgrep is
305 | [available](https://launchpad.net/ubuntu/+source/rust-ripgrep) using the same
306 | packaging as Debian:
307 |
308 | ```
309 | $ sudo apt-get install ripgrep
310 | ```
311 |
312 | (N.B. Various snaps for ripgrep on Ubuntu are also available, but none of them
313 | seem to work right and generate a number of very strange bug reports that I
314 | don't know how to fix and don't have the time to fix. Therefore, it is no
315 | longer a recommended installation option.)
316 |
317 | If you're a **FreeBSD** user, then you can install ripgrep from the
318 | [official ports](https://www.freshports.org/textproc/ripgrep/):
319 |
320 | ```
321 | # pkg install ripgrep
322 | ```
323 |
324 | If you're an **OpenBSD** user, then you can install ripgrep from the
325 | [official ports](http://openports.se/textproc/ripgrep):
326 |
327 | ```
328 | $ doas pkg_add ripgrep
329 | ```
330 |
331 | If you're a **NetBSD** user, then you can install ripgrep from
332 | [pkgsrc](http://pkgsrc.se/textproc/ripgrep):
333 |
334 | ```
335 | # pkgin install ripgrep
336 | ```
337 |
338 | If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
339 |
340 | * Note that the minimum supported version of Rust for ripgrep is **1.28.0**,
341 | although ripgrep may work with older versions.
342 | * Note that the binary may be bigger than expected because it contains debug
343 | symbols. This is intentional. To remove debug symbols and therefore reduce
344 | the file size, run `strip` on the binary.
345 |
346 | ```
347 | $ cargo install ripgrep
348 | ```
349 |
350 | When compiling with Rust 1.27 or newer, this will automatically enable SIMD
351 | optimizations for search.
352 |
353 | ripgrep isn't currently in any other package repositories.
354 | [I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
355 |
356 |
357 | ### Building
358 |
359 | ripgrep is written in Rust, so you'll need to grab a
360 | [Rust installation](https://www.rust-lang.org/) in order to compile it.
361 | ripgrep compiles with Rust 1.28.0 (stable) or newer. In general, ripgrep tracks
362 | the latest stable release of the Rust compiler.
363 |
364 | To build ripgrep:
365 |
366 | ```
367 | $ git clone https://github.com/BurntSushi/ripgrep
368 | $ cd ripgrep
369 | $ cargo build --release
370 | $ ./target/release/rg --version
371 | 0.1.3
372 | ```
373 |
374 | If you have a Rust nightly compiler and a recent Intel CPU, then you can enable
375 | additional optional SIMD acceleration like so:
376 |
377 | ```
378 | RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel avx-accel'
379 | ```
380 |
381 | If your machine doesn't support AVX instructions, then simply remove
382 | `avx-accel` from the features list. Similarly for SIMD (which corresponds
383 | roughly to SSE instructions).
384 |
385 | The `simd-accel` and `avx-accel` features enable SIMD support in certain
386 | ripgrep dependencies (responsible for counting lines and transcoding). They
387 | are not necessary to get SIMD optimizations for search; those are enabled
388 | automatically. Hopefully, some day, the `simd-accel` and `avx-accel` features
389 | will similarly become unnecessary.
390 |
391 | Finally, optional PCRE2 support can be built with ripgrep by enabling the
392 | `pcre2` feature:
393 |
394 | ```
395 | $ cargo build --release --features 'pcre2'
396 | ```
397 |
398 | (Tip: use `--features 'pcre2 simd-accel avx-accel'` to also include compile
399 | time SIMD optimizations, which will only work with a nightly compiler.)
400 |
401 | Enabling the PCRE2 feature works with a stable Rust compiler and will
402 | attempt to automatically find and link with your system's PCRE2 library via
403 | `pkg-config`. If one doesn't exist, then ripgrep will build PCRE2 from source
404 | using your system's C compiler and then statically link it into the final
405 | executable. Static linking can be forced even when there is an available PCRE2
406 | system library by either building ripgrep with the MUSL target or by setting
407 | `PCRE2_SYS_STATIC=1`.
408 |
409 | ripgrep can be built with the MUSL target on Linux by first installing the MUSL
410 | library on your system (consult your friendly neighborhood package manager).
411 | Then you just need to add MUSL support to your Rust toolchain and rebuild
412 | ripgrep, which yields a fully static executable:
413 |
414 | ```
415 | $ rustup target add x86_64-unknown-linux-musl
416 | $ cargo build --release --target x86_64-unknown-linux-musl
417 | ```
418 |
419 | Applying the `--features` flag from above works as expected. If you want to
420 | build a static executable with MUSL and with PCRE2, then you will need to have
421 | `musl-gcc` installed, which might be in a separate package from the actual
422 | MUSL library, depending on your Linux distribution.
423 |
424 |
425 | ### Running tests
426 |
427 | ripgrep is relatively well-tested, including both unit tests and integration
428 | tests. To run the full test suite, use:
429 |
430 | ```
431 | $ cargo test --all
432 | ```
433 |
434 | from the repository root.
435 |
--------------------------------------------------------------------------------
/globset/README.md:
--------------------------------------------------------------------------------
1 | globset
2 | =======
3 | Cross platform single glob and glob set matching. Glob set matching is the
4 | process of matching one or more glob patterns against a single candidate path
5 | simultaneously, and returning all of the globs that matched.
6 |
7 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
8 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
9 | [crates.io](https://crates.io/crates/globset)
10 |
11 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
12 |
13 | ### Documentation
14 |
15 | [https://docs.rs/globset](https://docs.rs/globset)
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`:
20 |
21 | ```toml
22 | [dependencies]
23 | globset = "0.3"
24 | ```
25 |
26 | and this to your crate root:
27 |
28 | ```rust
29 | extern crate globset;
30 | ```
31 |
32 | ### Example: one glob
33 |
34 | This example shows how to match a single glob against a single file path.
35 |
36 | ```rust
37 | use globset::Glob;
38 |
39 | let glob = Glob::new("*.rs")?.compile_matcher();
40 |
41 | assert!(glob.is_match("foo.rs"));
42 | assert!(glob.is_match("foo/bar.rs"));
43 | assert!(!glob.is_match("Cargo.toml"));
44 | ```
45 |
46 | ### Example: configuring a glob matcher
47 |
48 | This example shows how to use a `GlobBuilder` to configure aspects of match
49 | semantics. In this example, we prevent wildcards from matching path separators.
50 |
51 | ```rust
52 | use globset::GlobBuilder;
53 |
54 | let glob = GlobBuilder::new("*.rs")
55 | .literal_separator(true).build()?.compile_matcher();
56 |
57 | assert!(glob.is_match("foo.rs"));
58 | assert!(!glob.is_match("foo/bar.rs")); // no longer matches
59 | assert!(!glob.is_match("Cargo.toml"));
60 | ```
61 |
62 | ### Example: match multiple globs at once
63 |
64 | This example shows how to match multiple glob patterns at once.
65 |
66 | ```rust
67 | use globset::{Glob, GlobSetBuilder};
68 |
69 | let mut builder = GlobSetBuilder::new();
70 | // A GlobBuilder can be used to configure each glob's match semantics
71 | // independently.
72 | builder.add(Glob::new("*.rs")?);
73 | builder.add(Glob::new("src/lib.rs")?);
74 | builder.add(Glob::new("src/**/foo.rs")?);
75 | let set = builder.build()?;
76 |
77 | assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
78 | ```
79 |
80 | ### Performance
81 |
82 | This crate implements globs by converting them to regular expressions, and
83 | executing them with the
84 | [`regex`](https://github.com/rust-lang-nursery/regex)
85 | crate.
86 |
87 | For single glob matching, performance of this crate should be roughly on par
88 | with the performance of the
89 | [`glob`](https://github.com/rust-lang-nursery/glob)
90 | crate. (`*_regex` correspond to benchmarks for this library while `*_glob`
91 | correspond to benchmarks for the `glob` library.)
92 | Optimizations in the `regex` crate may propel this library past `glob`,
93 | particularly when matching longer paths.
94 |
95 | ```
96 | test ext_glob ... bench: 425 ns/iter (+/- 21)
97 | test ext_regex ... bench: 175 ns/iter (+/- 10)
98 | test long_glob ... bench: 182 ns/iter (+/- 11)
99 | test long_regex ... bench: 173 ns/iter (+/- 10)
100 | test short_glob ... bench: 69 ns/iter (+/- 4)
101 | test short_regex ... bench: 83 ns/iter (+/- 2)
102 | ```
103 |
104 | The primary performance advantage of this crate is when matching multiple
105 | globs against a single path. With the `glob` crate, one must match each glob
106 | sequentially, one after the other. In this crate, many can be matched
107 | simultaneously. For example:
108 |
109 | ```
110 | test many_short_glob ... bench: 1,063 ns/iter (+/- 47)
111 | test many_short_regex_set ... bench: 186 ns/iter (+/- 11)
112 | ```
113 |
114 | ### Comparison with the [`glob`](https://github.com/rust-lang-nursery/glob) crate
115 |
116 | * Supports alternate "or" globs, e.g., `*.{foo,bar}`.
117 | * Can match non-UTF-8 file paths correctly.
118 | * Supports matching multiple globs at once.
119 | * Doesn't provide a recursive directory iterator of matching file paths,
120 | although I believe this crate should grow one eventually.
121 | * Supports case insensitive and require-literal-separator match options, but
122 | **doesn't** support the require-literal-leading-dot option.
123 |
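The Performance section notes that this crate implements globs by converting them to regular expressions. The idea can be sketched with the standard library alone. This is not globset's actual translation: `glob_to_regex` is a hypothetical helper that supports only `*`, `?` and literal characters, treats `[`, `]`, `{` and `}` as literals (unlike real glob syntax), and lets `*` match path separators, as in the first example above.

```rust
/// Hypothetical helper: translate a very small subset of glob syntax
/// (`*`, `?` and literal characters) into an anchored regex string.
fn glob_to_regex(glob: &str) -> String {
    let mut re = String::from("^");
    for c in glob.chars() {
        match c {
            // `*` matches any run of characters, including `/` in this sketch.
            '*' => re.push_str(".*"),
            // `?` matches exactly one character.
            '?' => re.push('.'),
            // Escape characters that are special in regex syntax.
            '.' | '+' | '(' | ')' | '[' | ']' | '{' | '}' | '^' | '$' | '|' | '\\' => {
                re.push('\\');
                re.push(c);
            }
            // Everything else is copied through unchanged.
            _ => re.push(c),
        }
    }
    re.push('$');
    re
}

fn main() {
    // prints ^.*\.rs$
    println!("{}", glob_to_regex("*.rs"));
    // prints ^src/lib\.rs$
    println!("{}", glob_to_regex("src/lib.rs"));
}
```

Once every glob is a regex string, a set of them can be handed to a single multi-pattern matcher, which is what makes matching many globs against one path cheap.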
--------------------------------------------------------------------------------
/globset/README.zh.md:
--------------------------------------------------------------------------------
1 | # globset
2 |
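3 | Cross platform single glob and glob set matching. Glob set matching is the process of matching one or more glob patterns against a single candidate path simultaneously, and returning all of the globs that matched.
4 |
5 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
6 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [crates.io](https://crates.io/crates/globset)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/globset](https://docs.rs/globset)
14 |
15 | ### Usage
16 |
17 | Add this to your `Cargo.toml`: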
18 |
19 | ```toml
20 | [dependencies]
21 | globset = "0.3"
22 | ```
23 |
24 | and this to your crate root:
25 |
26 | ```rust
27 | extern crate globset;
28 | ```
29 |
30 | ### Example: one glob
31 |
32 | This example shows how to match a single glob against a single file path.
33 |
34 | ```rust
35 | use globset::Glob;
36 |
37 | let glob = Glob::new("*.rs")?.compile_matcher();
38 |
39 | assert!(glob.is_match("foo.rs"));
40 | assert!(glob.is_match("foo/bar.rs"));
41 | assert!(!glob.is_match("Cargo.toml"));
42 | ```
43 |
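44 | ### Example: configuring a glob matcher
45 |
46 | This example shows how to use a `GlobBuilder` to configure aspects of match semantics. In this example, we prevent wildcards from matching path separators.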
47 |
48 | ```rust
49 | use globset::GlobBuilder;
50 |
51 | let glob = GlobBuilder::new("*.rs")
52 | .literal_separator(true).build()?.compile_matcher();
53 |
54 | assert!(glob.is_match("foo.rs"));
55 | assert!(!glob.is_match("foo/bar.rs")); // no longer matches
56 | assert!(!glob.is_match("Cargo.toml"));
57 | ```
58 |
59 | ### Example: match multiple globs at once
60 |
61 | This example shows how to match multiple glob patterns at once.
62 |
63 | ```rust
64 | use globset::{Glob, GlobSetBuilder};
65 |
66 | let mut builder = GlobSetBuilder::new();
67 | // A GlobBuilder can be used to configure each glob's match semantics
68 | // independently.
69 | builder.add(Glob::new("*.rs")?);
70 | builder.add(Glob::new("src/lib.rs")?);
71 | builder.add(Glob::new("src/**/foo.rs")?);
72 | let set = builder.build()?;
73 |
74 | assert_eq!(set.matches("src/bar/baz/foo.rs"), vec![0, 2]);
75 | ```
76 |
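77 | ### Performance
78 |
79 | This crate implements globs by converting them to regular expressions, and executing them with the [`regex`](https://github.com/rust-lang-nursery/regex) crate.
80 |
81 | For single glob matching, performance of this crate should be roughly on par with the performance of the [`glob`](https://github.com/rust-lang-nursery/glob) crate. (`*_regex` correspond to benchmarks for this library while `*_glob` correspond to benchmarks for the `glob` library.)
82 | Optimizations in the `regex` crate may propel this library past `glob`, particularly when matching longer paths.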
83 |
84 | ```
85 | test ext_glob ... bench: 425 ns/iter (+/- 21)
86 | test ext_regex ... bench: 175 ns/iter (+/- 10)
87 | test long_glob ... bench: 182 ns/iter (+/- 11)
88 | test long_regex ... bench: 173 ns/iter (+/- 10)
89 | test short_glob ... bench: 69 ns/iter (+/- 4)
90 | test short_regex ... bench: 83 ns/iter (+/- 2)
91 | ```
92 |
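93 | The primary performance advantage of this crate is when matching multiple globs against a single path. With the `glob` crate, one must match each glob sequentially, one after the other. In this crate, many can be matched simultaneously. For example: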
94 |
95 | ```
96 | test many_short_glob ... bench: 1,063 ns/iter (+/- 47)
97 | test many_short_regex_set ... bench: 186 ns/iter (+/- 11)
98 | ```
99 |
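100 | ### Comparison with the [`glob`](https://github.com/rust-lang-nursery/glob) crate
101 |
102 | - Supports alternate "or" globs, e.g., `*.{foo,bar}`.
103 | - Can match non-UTF-8 file paths correctly.
104 | - Supports matching multiple globs at once.
105 | - Doesn't provide a recursive directory iterator of matching file paths, although I believe this crate should grow one eventually.
106 | - Supports case insensitive and require-literal-separator match options, but **doesn't** support the require-literal-leading-dot option.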
107 |
--------------------------------------------------------------------------------
/grep-cli/README.md:
--------------------------------------------------------------------------------
1 | grep-cli
2 | --------
3 | A utility library that provides common routines desired in search oriented
4 | command line applications. This includes, but is not limited to, parsing hex
5 | escapes, detecting whether stdin is readable and more. To the extent possible,
6 | this crate strives for compatibility across Windows, macOS and Linux.
7 |
8 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
9 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
10 | [crates.io](https://crates.io/crates/grep-cli)
11 |
12 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
13 |
14 |
15 | ### Documentation
16 |
17 | [https://docs.rs/grep-cli](https://docs.rs/grep-cli)
18 |
19 | **NOTE:** You probably don't want to use this crate directly. Instead, you
20 | should prefer the facade defined in the
21 | [`grep`](https://docs.rs/grep)
22 | crate.
23 |
24 |
25 | ### Usage
26 |
27 | Add this to your `Cargo.toml`:
28 |
29 | ```toml
30 | [dependencies]
31 | grep-cli = "0.1"
32 | ```
33 |
34 | and this to your crate root:
35 |
36 | ```rust
37 | extern crate grep_cli;
38 | ```
39 |
--------------------------------------------------------------------------------
/grep-cli/README.zh.md:
--------------------------------------------------------------------------------
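1 | ## grep-cli
2 |
3 | A utility library that provides common routines desired in search oriented command line applications. This includes, but is not limited to, parsing hex escapes, detecting whether stdin is readable and more. To the extent possible, this crate strives for compatibility across Windows, macOS and Linux.
4 |
5 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
6 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [crates.io](https://crates.io/crates/grep-cli)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/grep-cli](https://docs.rs/grep-cli)
14 |
15 | **NOTE:** You probably don't want to use this crate directly. Instead, you should prefer the facade defined in the [`grep`](https://docs.rs/grep) crate.
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`: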
20 |
21 | ```toml
22 | [dependencies]
23 | grep-cli = "0.1"
24 | ```
25 |
26 | and this to your crate root:
27 |
28 | ```rust
29 | extern crate grep_cli;
30 | ```
31 |
--------------------------------------------------------------------------------
/grep-matcher/README.md:
--------------------------------------------------------------------------------
1 | grep-matcher
2 | ------------
3 | This crate provides a low level interface for describing regular expression
4 | matchers. The `grep` crate uses this interface in order to make the regex
5 | engine it uses pluggable.
6 |
7 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
8 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
9 | [crates.io](https://crates.io/crates/grep-matcher)
10 |
11 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
12 |
13 | ### Documentation
14 |
15 | [https://docs.rs/grep-matcher](https://docs.rs/grep-matcher)
16 |
17 | **NOTE:** You probably don't want to use this crate directly. Instead, you
18 | should prefer the facade defined in the
19 | [`grep`](https://docs.rs/grep)
20 | crate.
21 |
22 |
23 | ### Usage
24 |
25 | Add this to your `Cargo.toml`:
26 |
27 | ```toml
28 | [dependencies]
29 | grep-matcher = "0.1"
30 | ```
31 |
32 | and this to your crate root:
33 |
34 | ```rust
35 | extern crate grep_matcher;
36 | ```
37 |
--------------------------------------------------------------------------------
/grep-matcher/README.zh.md:
--------------------------------------------------------------------------------
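1 | ## grep-matcher
2 |
3 | This crate provides a low level interface for describing regular expression matchers. The `grep` crate uses this interface in order to make the regex engine it uses pluggable.
4 |
5 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
6 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [crates.io](https://crates.io/crates/grep-matcher)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/grep-matcher](https://docs.rs/grep-matcher)
14 |
15 | **NOTE:** You probably don't want to use this crate directly. Instead, you should prefer the facade defined in the [`grep`](https://docs.rs/grep) crate.
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`: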
20 |
21 | ```toml
22 | [dependencies]
23 | grep-matcher = "0.1"
24 | ```
25 |
26 | and this to your crate root:
27 |
28 | ```rust
29 | extern crate grep_matcher;
30 | ```
31 |
--------------------------------------------------------------------------------
/grep-pcre2/README.md:
--------------------------------------------------------------------------------
1 | grep-pcre2
2 | ----------
3 | The `grep-pcre2` crate provides an implementation of the `Matcher` trait from
4 | the `grep-matcher` crate. This implementation permits PCRE2 to be used in the
5 | `grep` crate for fast line oriented searching.
6 |
7 | [Linux build status](https://travis-ci.org/BurntSushi/ripgrep)
8 | [Windows build status](https://ci.appveyor.com/project/BurntSushi/ripgrep)
9 | [crates.io](https://crates.io/crates/grep-pcre2)
10 |
11 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
12 |
13 | ### Documentation
14 |
15 | [https://docs.rs/grep-pcre2](https://docs.rs/grep-pcre2)
16 |
17 | **NOTE:** You probably don't want to use this crate directly. Instead, you
18 | should prefer the facade defined in the
19 | [`grep`](https://docs.rs/grep)
20 | crate.
21 |
22 | If you're looking to just use PCRE2 from Rust, then you probably want the
23 | [`pcre2`](https://docs.rs/pcre2)
24 | crate, which provides high level safe bindings to PCRE2.
25 |
26 | ### Usage
27 |
28 | Add this to your `Cargo.toml`:
29 |
30 | ```toml
31 | [dependencies]
32 | grep-pcre2 = "0.1"
33 | ```
34 |
35 | and this to your crate root:
36 |
37 | ```rust
38 | extern crate grep_pcre2;
39 | ```
40 |
--------------------------------------------------------------------------------
/grep-pcre2/README.zh.md:
--------------------------------------------------------------------------------
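1 | ## grep-pcre2
2 |
3 | The `grep-pcre2` crate provides an implementation of the `Matcher` trait from the `grep-matcher` crate. This implementation permits PCRE2 to be used in the `grep` crate for fast line-oriented searching.
4 |
5 | [](https://travis-ci.org/BurntSushi/ripgrep)
6 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [](https://crates.io/crates/grep-pcre2)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/grep-pcre2](https://docs.rs/grep-pcre2)
14 |
15 | **NOTE:** You probably don't want to use this crate directly. Instead, you should prefer the facade defined in the [`grep`](https://docs.rs/grep) crate.
16 |
17 | If you're looking to just use PCRE2 from Rust, then you probably want the [`pcre2`](https://docs.rs/pcre2) crate, which provides high-level safe bindings to PCRE2.
18 |
19 | ### Usage
20 |
21 | Add this to your `Cargo.toml`:
22 |
23 | ```toml
24 | [dependencies]
25 | grep-pcre2 = "0.1"
26 | ```
27 |
28 | and this to your crate root: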
29 |
30 | ```rust
31 | extern crate grep_pcre2;
32 | ```
33 |
--------------------------------------------------------------------------------
/grep-printer/README.md:
--------------------------------------------------------------------------------
1 | grep-printer
2 | ------------
3 | Print results from line oriented searching in a human readable, aggregate or
4 | JSON Lines format.
5 |
6 | [](https://travis-ci.org/BurntSushi/ripgrep)
7 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
8 | [](https://crates.io/crates/grep-printer)
9 |
10 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
11 |
12 | ### Documentation
13 |
14 | [https://docs.rs/grep-printer](https://docs.rs/grep-printer)
15 |
16 | **NOTE:** You probably don't want to use this crate directly. Instead, you
17 | should prefer the facade defined in the
18 | [`grep`](https://docs.rs/grep)
19 | crate.
20 |
21 |
22 | ### Usage
23 |
24 | Add this to your `Cargo.toml`:
25 |
26 | ```toml
27 | [dependencies]
28 | grep-printer = "0.1"
29 | ```
30 |
31 | and this to your crate root:
32 |
33 | ```rust
34 | extern crate grep_printer;
35 | ```
36 |
--------------------------------------------------------------------------------
/grep-printer/README.zh.md:
--------------------------------------------------------------------------------
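1 | ## grep-printer
2 |
3 | Print results from line-oriented searching in a human-readable, aggregate or JSON Lines format.
4 |
5 | [](https://travis-ci.org/BurntSushi/ripgrep)
6 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [](https://crates.io/crates/grep-printer)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/grep-printer](https://docs.rs/grep-printer)
14 |
15 | **NOTE:** You probably don't want to use this crate directly. Instead, you should prefer the facade defined in the [`grep`](https://docs.rs/grep) crate.
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`:
20 |
21 | ```toml
22 | [dependencies]
23 | grep-printer = "0.1"
24 | ```
25 |
26 | and this to your crate root: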
27 |
28 | ```rust
29 | extern crate grep_printer;
30 | ```
31 |
--------------------------------------------------------------------------------
/grep-regex/README.md:
--------------------------------------------------------------------------------
1 | grep-regex
2 | ----------
3 | The `grep-regex` crate provides an implementation of the `Matcher` trait from
4 | the `grep-matcher` crate. This implementation permits Rust's regex engine to
5 | be used in the `grep` crate for fast line oriented searching.
6 |
7 | [](https://travis-ci.org/BurntSushi/ripgrep)
8 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
9 | [](https://crates.io/crates/grep-regex)
10 |
11 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
12 |
13 | ### Documentation
14 |
15 | [https://docs.rs/grep-regex](https://docs.rs/grep-regex)
16 |
17 | **NOTE:** You probably don't want to use this crate directly. Instead, you
18 | should prefer the facade defined in the
19 | [`grep`](https://docs.rs/grep)
20 | crate.
21 |
22 | ### Usage
23 |
24 | Add this to your `Cargo.toml`:
25 |
26 | ```toml
27 | [dependencies]
28 | grep-regex = "0.1"
29 | ```
30 |
31 | and this to your crate root:
32 |
33 | ```rust
34 | extern crate grep_regex;
35 | ```
36 |
--------------------------------------------------------------------------------
/grep-regex/README.zh.md:
--------------------------------------------------------------------------------
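1 | ## grep-regex
2 |
3 | The `grep-regex` crate provides an implementation of the `Matcher` trait from the `grep-matcher` crate. This implementation permits Rust's regex engine to be used in the `grep` crate for fast line-oriented searching.
4 |
5 | [](https://travis-ci.org/BurntSushi/ripgrep)
6 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [](https://crates.io/crates/grep-regex)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/grep-regex](https://docs.rs/grep-regex)
14 |
15 | **NOTE:** You probably don't want to use this crate directly. Instead, you should prefer the facade defined in the [`grep`](https://docs.rs/grep) crate.
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`:
20 |
21 | ```toml
22 | [dependencies]
23 | grep-regex = "0.1"
24 | ```
25 |
26 | and this to your crate root: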
27 |
28 | ```rust
29 | extern crate grep_regex;
30 | ```
31 |
--------------------------------------------------------------------------------
/grep-searcher/README.md:
--------------------------------------------------------------------------------
1 | grep-searcher
2 | -------------
3 | A high level library for executing fast line oriented searches. This handles
4 | things like reporting contextual lines, counting lines, inverting a search,
5 | detecting binary data, automatic UTF-16 transcoding and deciding whether or not
6 | to use memory maps.
7 |
8 | [](https://travis-ci.org/BurntSushi/ripgrep)
9 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
10 | [](https://crates.io/crates/grep-searcher)
11 |
12 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
13 |
14 | ### Documentation
15 |
16 | [https://docs.rs/grep-searcher](https://docs.rs/grep-searcher)
17 |
18 | **NOTE:** You probably don't want to use this crate directly. Instead, you
19 | should prefer the facade defined in the
20 | [`grep`](https://docs.rs/grep)
21 | crate.
22 |
23 |
24 | ### Usage
25 |
26 | Add this to your `Cargo.toml`:
27 |
28 | ```toml
29 | [dependencies]
30 | grep-searcher = "0.1"
31 | ```
32 |
33 | and this to your crate root:
34 |
35 | ```rust
36 | extern crate grep_searcher;
37 | ```
38 |
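A few of the features named in the description at the top of this README (line numbers, inverted searches, binary detection) can be illustrated with a small sketch that uses only the standard library. It is not the `grep-searcher` API: `LineMatch`, `looks_binary` and `search_lines` are hypothetical names invented for this illustration, and treating a NUL byte as the sign of binary data is an assumed heuristic.

```rust
/// A matching line together with its 1-based line number.
#[derive(Debug, PartialEq)]
struct LineMatch<'a> {
    line_number: usize,
    line: &'a str,
}

/// Assumed heuristic for this sketch: data containing a NUL byte is binary.
fn looks_binary(haystack: &[u8]) -> bool {
    haystack.contains(&0)
}

/// Returns every line for which `is_match` holds or, when `invert` is true,
/// every line for which it does not.
fn search_lines<'a, F>(haystack: &'a str, is_match: F, invert: bool) -> Vec<LineMatch<'a>>
where
    F: Fn(&str) -> bool,
{
    haystack
        .lines()
        .enumerate()
        // Keep a line when its match status differs from `invert`.
        .filter(|&(_, line)| is_match(line) != invert)
        .map(|(index, line)| LineMatch { line_number: index + 1, line })
        .collect()
}
```

Calling `search_lines(text, |line| line.contains("mm"), true)` would return every line of `text` that does not contain `mm`, and the length of the returned vector gives the line count.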
--------------------------------------------------------------------------------
/grep-searcher/README.zh.md:
--------------------------------------------------------------------------------
1 | ## grep-searcher
2 |
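3 | A high-level library for executing fast line-oriented searches. This handles things like reporting contextual lines, counting lines, inverting a search, detecting binary data, automatic UTF-16 transcoding and deciding whether or not to use memory maps.
4 |
5 | [](https://travis-ci.org/BurntSushi/ripgrep)
6 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [](https://crates.io/crates/grep-searcher)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/grep-searcher](https://docs.rs/grep-searcher)
14 |
15 | **NOTE:** You probably don't want to use this crate directly. Instead, you should prefer the facade defined in the [`grep`](https://docs.rs/grep) crate.
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`:
20 |
21 | ```toml
22 | [dependencies]
23 | grep-searcher = "0.1"
24 | ```
25 |
26 | and this to your crate root: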
27 |
28 | ```rust
29 | extern crate grep_searcher;
30 | ```
31 |
--------------------------------------------------------------------------------
/grep/README.md:
--------------------------------------------------------------------------------
1 | grep
2 | ----
3 | ripgrep, as a library.
4 |
5 | [](https://travis-ci.org/BurntSushi/ripgrep)
6 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [](https://crates.io/crates/grep)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 |
12 | ### Documentation
13 |
14 | [https://docs.rs/grep](https://docs.rs/grep)
15 |
16 | NOTE: This crate isn't ready for wide use yet. Ambitious individuals can
17 | probably piece together the parts, but there is no high level documentation
18 | describing how all of the pieces fit together.
19 |
20 |
21 | ### Usage
22 |
23 | Add this to your `Cargo.toml`:
24 |
25 | ```toml
26 | [dependencies]
27 | grep = "0.2"
28 | ```
29 |
30 | and this to your crate root:
31 |
32 | ```rust
33 | extern crate grep;
34 | ```
35 |
36 |
37 | ### Features
38 |
39 | This crate provides a `pcre2` feature (disabled by default) which, when
40 | enabled, re-exports the `grep-pcre2` crate as an alternative `Matcher`
41 | implementation to the standard `grep-regex` implementation.
42 |
--------------------------------------------------------------------------------
/grep/README.zh.md:
--------------------------------------------------------------------------------
1 | ## grep
2 |
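3 | ripgrep, as a library.
4 |
5 | [](https://travis-ci.org/BurntSushi/ripgrep)
6 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [](https://crates.io/crates/grep)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/grep](https://docs.rs/grep)
14 |
15 | **NOTE:** This crate isn't ready for wide use yet. Ambitious individuals can probably piece together the parts, but there is no high-level documentation describing how all of the pieces fit together.
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`:
20 |
21 | ```toml
22 | [dependencies]
23 | grep = "0.2"
24 | ```
25 |
26 | and this to your crate root:
27 |
28 | ```rust
29 | extern crate grep;
30 | ```
31 |
32 | ### Features
33 |
34 | This crate provides a `pcre2` feature (disabled by default) which, when enabled, re-exports the `grep-pcre2` crate as an alternative `Matcher` implementation to the standard `grep-regex` implementation.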
35 |
--------------------------------------------------------------------------------
/ignore/README.md:
--------------------------------------------------------------------------------
1 | ignore
2 | ======
3 | The ignore crate provides a fast recursive directory iterator that respects
4 | various filters such as globs, file types and `.gitignore` files. This crate
5 | also provides lower level direct access to gitignore and file type matchers.
6 |
7 | [](https://travis-ci.org/BurntSushi/ripgrep)
8 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
9 | [](https://crates.io/crates/ignore)
10 |
11 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
12 |
13 | ### Documentation
14 |
15 | [https://docs.rs/ignore](https://docs.rs/ignore)
16 |
17 | ### Usage
18 |
19 | Add this to your `Cargo.toml`:
20 |
21 | ```toml
22 | [dependencies]
23 | ignore = "0.4"
24 | ```
25 |
26 | and this to your crate root:
27 |
28 | ```rust
29 | extern crate ignore;
30 | ```
31 |
32 | ### Example
33 |
34 | This example shows the most basic usage of this crate. This code will
35 | recursively traverse the current directory while automatically filtering out
36 | files and directories according to ignore globs found in files like
37 | `.ignore` and `.gitignore`:
38 |
39 |
40 | ```rust,no_run
41 | use ignore::Walk;
42 |
43 | for result in Walk::new("./") {
44 |     // Each item yielded by the iterator is either a directory entry or an
45 |     // error, so either print the path or the error.
46 |     match result {
47 |         Ok(entry) => println!("{}", entry.path().display()),
48 |         Err(err) => println!("ERROR: {}", err),
49 |     }
50 | }
51 | ```
52 |
53 | ### Example: advanced
54 |
55 | By default, the recursive directory iterator will ignore hidden files and
56 | directories. This can be disabled by building the iterator with `WalkBuilder`:
57 |
58 | ```rust,no_run
59 | use ignore::WalkBuilder;
60 |
61 | for result in WalkBuilder::new("./").hidden(false).build() {
62 |     println!("{:?}", result);
63 | }
64 | ```
65 |
66 | See the documentation for `WalkBuilder` for many other options.
67 |
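To make the effect of `hidden(false)` more concrete, here is a rough sketch of a recursive walk that can skip hidden entries, written against the standard library only. It is not how the `ignore` crate is implemented and it honors no `.gitignore` rules; `is_hidden` and `walk` are hypothetical helpers, and "hidden" is assumed to mean a file name that starts with a dot.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Assumption of this sketch: an entry is hidden when its final path
/// component starts with a dot (the Unix convention).
fn is_hidden(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.starts_with('.'))
        .unwrap_or(false)
}

/// Collects every entry below `root` and returns the paths sorted.
/// When `skip_hidden` is true, hidden files are dropped and hidden
/// directories are not descended into.
fn walk(root: &Path, skip_hidden: bool) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            if skip_hidden && is_hidden(&path) {
                continue;
            }
            // `file_type` does not follow symlinks, so a link cycle
            // cannot make this loop run forever.
            if entry.file_type()?.is_dir() {
                pending.push(path.clone());
            }
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}
```

With this sketch, `walk(Path::new("./"), true)` roughly mirrors the default behavior described above, while passing `false` plays the role of `hidden(false)`.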
--------------------------------------------------------------------------------
/ignore/README.zh.md:
--------------------------------------------------------------------------------
1 | # ignore
2 |
3 | The ignore crate provides a fast recursive directory iterator that respects various filters such as globs, file types and `.gitignore` files. This crate also provides lower-level direct access to gitignore and file type matchers.
4 |
5 | [](https://travis-ci.org/BurntSushi/ripgrep)
6 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
7 | [](https://crates.io/crates/ignore)
8 |
9 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
10 |
11 | ### Documentation
12 |
13 | [https://docs.rs/ignore](https://docs.rs/ignore)
14 |
15 | ### Usage
16 |
17 | Add this to your `Cargo.toml`:
18 |
19 | ```toml
20 | [dependencies]
21 | ignore = "0.4"
22 | ```
23 |
24 | and this to your crate root:
25 |
26 | ```rust
27 | extern crate ignore;
28 | ```
29 |
30 | ### Example
31 |
32 | This example shows the most basic usage of this crate. This code will recursively traverse the current directory while automatically filtering out files and directories according to ignore globs found in files like `.ignore` and `.gitignore`:
33 |
34 | ```rust,no_run
35 | use ignore::Walk;
36 |
37 | for result in Walk::new("./") {
38 |     // Each item yielded by the iterator is either a directory entry or an
39 |     // error, so either print the path or the error.
40 |     match result {
41 |         Ok(entry) => println!("{}", entry.path().display()),
42 |         Err(err) => println!("ERROR: {}", err),
43 |     }
44 | }
45 | ```
46 |
47 | ### Example: advanced
48 |
49 | By default, the recursive directory iterator will ignore hidden files and directories. This can be disabled by building the iterator with `WalkBuilder`:
50 |
51 | ```rust,no_run
52 | use ignore::WalkBuilder;
53 |
54 | for result in WalkBuilder::new("./").hidden(false).build() {
55 |     println!("{:?}", result);
56 | }
57 | ```
58 |
59 | See the documentation for `WalkBuilder` for many other options.
60 |
--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
1 | # BurntSushi/ripgrep [![translate-svg]][translate-list]
2 |
3 |
4 |
5 | [explain]: http://llever.com/explain.svg
6 | [source]: https://github.com/chinanf-boy/Source-Explain
7 | [translate-svg]: http://llever.com/translate.svg
8 | [translate-list]: https://github.com/chinanf-boy/chinese-translate-list
9 |
10 | "ripgrep is a line-oriented search tool"
11 |
12 | [Chinese](./readme.md) | [English](https://github.com/BurntSushi/ripgrep)
13 |
14 | ---
15 |
16 | ## Updates ✅
17 |
18 |
19 |
20 |
21 |
22 | Translated source | Date | Latest update | More
23 | ---|---|---|---
24 | [commit] | ⏰ 2019-04-14 | ![last] | [Chinese translations][translate-list]
25 |
26 | [last]: https://img.shields.io/github/last-commit/BurntSushi/ripgrep.svg
27 | [commit]: https://github.com/BurntSushi/ripgrep/tree/45d12abbc5f576d3b10ae13dc7410b14400a8d1e
28 |
29 |
30 |
31 | - [x] readme
32 | - [x] [Guide](./GUIDE.zh.md)
33 | - [x] [FAQ](./FAQ.zh.md)
34 | - [x] [`rg -h` v0.10.0 (Chinese)](./rg-0.10.0-h.zh.md)
35 |
36 |
37 |
38 | - [x] [grep-regex](./grep-regex/README.zh.md)
39 | - [x] [globset](./globset/README.zh.md)
40 | - [x] [ignore](./ignore/README.zh.md)
41 | - [x] [grep](./grep/README.zh.md)
42 | - [x] [termcolor](./termcolor/README.zh.md)
43 | - [x] [grep-cli](./grep-cli/README.zh.md)
44 | - [x] [grep-pcre2](./grep-pcre2/README.zh.md)
45 | - [x] [grep-searcher](./grep-searcher/README.zh.md)
46 | - [x] [wincolor](./wincolor/README.zh.md)
47 | - [x] [grep-printer](./grep-printer/README.zh.md)
48 | - [x] [grep-matcher](./grep-matcher/README.zh.md)
49 |
50 |
51 |
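52 | ### Contributing
53 |
54 | Corrections, proofreading and updates are welcome 👏 😊 [See the contribution guide](https://github.com/chinanf-boy/chinese-translate-list#贡献)
55 |
56 | ## Living
57 |
58 | [If this helps, **buy** me a coffee. I'm running low on nutrition, so a bottle of Nutri-Express would be great! 💰](https://github.com/chinanf-boy/live-need-money)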
59 |
60 | ---
61 |
62 | ## ripgrep (rg)
63 |
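64 | ripgrep is a line-oriented search tool that recursively searches your current directory for a regex pattern while respecting your gitignore rules. ripgrep has first-class support on Windows, macOS and Linux, with binary downloads available for [every release](https://github.com/BurntSushi/ripgrep/releases). ripgrep is similar to other popular search tools like ag, ack and grep.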
65 |
66 | [](https://travis-ci.org/BurntSushi/ripgrep)
67 | [](https://ci.appveyor.com/project/BurntSushi/ripgrep)
68 | [](https://crates.io/crates/ripgrep)
69 |
70 | Dual-licensed under MIT or the [UNLICENSE](http://unlicense.org).
71 |
72 | ### CHANGELOG
73 |
74 | Please see the [CHANGELOG](https://github.com/BurntSushi/ripgrep/blob/master/CHANGELOG.md) for a release history (in English).
75 |
76 | ### Documentation quick links
77 |
78 | - [Installation](#installation)
79 | - [User Guide](GUIDE.zh.md)
80 | - [Frequently Asked Questions](FAQ.zh.md)
81 | - [Regex syntax](https://docs.rs/regex/1/regex/#syntax)
82 | - [Configuration files](GUIDE.zh.md#configuration-file)
83 | - [Shell completions](FAQ.zh.md#ripgrep-%E6%98%AF%E5%90%A6%E6%94%AF%E6%8C%81-shell-tab-%E8%A1%A5%E5%85%A8)
84 | - [Building](#building)
85 |
86 | ### Screenshot of search results
87 |
88 | [](http://burntsushi.net/stuff/ripgrep1.png)
89 |
90 | ### Quick examples comparing tools
91 |
92 | This example searches the entire Linux kernel source tree (after running `make defconfig && make -j8`) for `[A-Z]+_SUSPEND`, where all matches must be words. Timings were collected on a system with an Intel i7-6900K 3.2 GHz, and ripgrep was compiled with SIMD enabled.
93 |
94 | Please remember that a single benchmark is never enough! See my [blog post on ripgrep](http://blog.burntsushi.net/ripgrep/) for a very detailed comparison with more benchmarks and analysis.
95 |
96 | | Tool | Command | Line count | Time |
97 | | ---- | ------- | ---------- | ---- |
98 | | ripgrep (Unicode) | `rg -n -w '[A-Z]+_SUSPEND'` | 450 | **0.106s** |
99 | | [git grep](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=C git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 0.55s |
100 | | [The Silver Searcher](https://github.com/ggreer/the_silver_searcher) | `ag -w '[A-Z]+_SUSPEND'` | 450 | 0.895s |
101 | | [git grep (Unicode)](https://www.kernel.org/pub/software/scm/git/docs/git-grep.html) | `LC_ALL=en_US.UTF-8 git grep -E -n -w '[A-Z]+_SUSPEND'` | 450 | 2.266s |
102 | | [sift](https://github.com/svent/sift) | `sift --git -n -w '[A-Z]+_SUSPEND'` | 450 | 3.505s |
103 | | [ack](https://github.com/petdance/ack2) | `ack -w '[A-Z]+_SUSPEND'` | 1878 | 6.823s |
104 | | [The Platinum Searcher](https://github.com/monochromegane/the_platinum_searcher) | `pt -w -e '[A-Z]+_SUSPEND'` | 450 | 14.208s |
105 |
106 | (Yes, `ack` [has](https://github.com/petdance/ack2/issues/445) a [bug](https://github.com/petdance/ack2/issues/14).)
107 |
108 | Here's another benchmark that disregards gitignore files and searches with a whitelist instead. The corpus is the same as in the previous benchmark, and the flags passed to each command ensure that they are doing equivalent work:
109 |
110 | | Tool | Command | Line count | Time |
111 | | ---------------------------------------------- | ----------------------------------------------------------------- | ------ | ---------- |
112 | | ripgrep | `rg -L -u -tc -n -w '[A-Z]+_SUSPEND'` | 404 | **0.079s** |
113 | | [ucg](https://github.com/gvansickle/ucg) | `ucg --type=cc -w '[A-Z]+_SUSPEND'` | 390 | 0.163s |
114 | | [GNU grep](https://www.gnu.org/software/grep/) | `egrep -R -n --include='*.c' --include='*.h' -w '[A-Z]+_SUSPEND'` | 404 | 0.611s |
115 |
116 | (`ucg` [has slightly different behavior in the presence of symbolic links](https://github.com/gvansickle/ucg/issues/106).)
117 |
118 | And finally, a straight-up comparison between ripgrep and GNU grep on a single large file (~9.3GB, [`OpenSubtitles2016.raw.en.gz`](http://opus.lingfil.uu.se/OpenSubtitles2016/mono/OpenSubtitles2016.raw.en.gz)):
119 |
120 | | Tool | Command | Line count | Time |
121 | | ---------------------------------------------- | --------------------------------------- | -------- | ---------- |
122 | | ripgrep | `rg -w 'Sherlock [A-Z]\w+'` | 5268 | **2.108s** |
123 | | [GNU grep](https://www.gnu.org/software/grep/) | `LC_ALL=C egrep -w 'Sherlock [A-Z]\w+'` | 5268 | 7.014s |
124 |
125 | In the above benchmark, passing the `-n` flag (for showing line numbers) increases the times to `2.640s` for ripgrep and `10.277s` for GNU grep.
126 |
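127 | ### Why should I use ripgrep?
128 |
129 | - It can replace many use cases served by other search tools because it contains most of their features and is generally faster. (See [the FAQ](FAQ.zh.md#posix4ever) for more details on whether ripgrep can truly replace grep.)
130 | - Like other tools specialized to code search, ripgrep defaults to recursive directory search and won't search files or directories matched by your `.gitignore`. It also ignores hidden and binary files by default. ripgrep also implements full support for `.gitignore`, whereas other code search tools that claim to provide the same functionality have many bugs related to it.
131 | - ripgrep can search specific types of files. For example, `rg -tpy foo` limits your search to Python files and `rg -Tjs foo` excludes JavaScript files from your search. ripgrep can be taught about new file types with custom matching rules.
132 | - ripgrep supports many features found in `grep`, such as showing the context of search results, searching multiple patterns, highlighting matches with color and full Unicode support. Unlike GNU grep, ripgrep stays fast while supporting Unicode (which is always on).
133 | - ripgrep has optional support for switching its regex engine to use PCRE2. Among other things, this makes it possible to use look-around and backreferences in your patterns, which are not supported in ripgrep's default regex engine. PCRE2 support is enabled with `-P`.
134 | - ripgrep supports searching files in text encodings other than UTF-8, such as UTF-16, latin-1, GBK, EUC-JP, Shift_JIS and more. (Some support for automatically detecting UTF-16 is provided. Other text encodings must be specified explicitly with the `-E/--encoding` flag.)
135 | - ripgrep supports searching files compressed in a common format (gzip, xz, lzma, bzip2 or lz4) with the `-z/--search-zip` flag.
136 | - ripgrep supports arbitrary input preprocessing filters, which could be PDF text extraction, less commonly supported decompression, decryption, automatic encoding detection and so on.
137 |
138 | In other words, use ripgrep if you like speed, filtering by default, fewer bugs and Unicode support.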
139 |
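140 | ### Why shouldn't I use ripgrep?
141 |
142 | Despite initially not wanting to add every feature to ripgrep, over time ripgrep has grown support for most features found in other file searching tools. This includes searching for results spanning multiple lines, and opt-in support for PCRE2, which provides look-around and backreference support.
143 |
144 | > Translator's note: [look-around: an explanatory article found on Zhihu](https://zhuanlan.zhihu.com/p/50789818)
145 |
146 | At this point, the primary reasons not to use ripgrep probably consist of one or more of the following:
147 |
148 | - You need a portable and ubiquitous tool. While ripgrep works on Windows, macOS and Linux, it is not ubiquitous and it does not conform to any standard such as POSIX. The best tool for this job is good old grep.
149 | - There is still some other feature (or bug) not listed in this README that you rely on, and it exists in another tool but not in ripgrep.
150 | - There is a performance edge case where ripgrep doesn't do well and another tool does. (Please file a bug report!)
151 | - ripgrep can't be installed on your machine or isn't available for your platform. (Please file a bug report!)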
152 |
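153 | ### Is it really faster than everything else?
154 |
155 | Generally, yes. A large number of benchmarks, each with detailed analysis, is [available on my blog](http://blog.burntsushi.net/ripgrep/).
156 |
157 | Summarizing, ripgrep is fast because:
158 |
159 | - It is built on top of [Rust's regex engine](https://github.com/rust-lang-nursery/regex). Rust's regex engine uses finite automata, SIMD and aggressive literal optimizations to make searching very fast. (PCRE2 support can be opted into with the `-P/--pcre2` flag.)
160 | - Rust's regex library maintains performance with full Unicode support by building UTF-8 decoding directly into its deterministic finite automaton engine.
161 | - It supports searching with either memory maps or by searching incrementally with an intermediate buffer. The former is better for single files and the latter is better for large directories. ripgrep chooses the best searching strategy for you automatically.
162 | - It applies the ignore patterns in your `.gitignore` files using a [`RegexSet`](https://docs.rs/regex/1/regex/struct.RegexSet.html). That means a single file path can be matched against multiple glob patterns simultaneously.
163 | - It uses a lock-free parallel recursive directory iterator, courtesy of [`crossbeam`](https://docs.rs/crossbeam) and [`ignore`](https://docs.rs/ignore).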
165 | ### 功能 比较
166 |
167 | Andy Lester,[ack](https://beyondgrep.com/)作者已发布了一个比较 ack、ag、git-grep、GNU grep 和 ripgrep 特性的优秀表:
168 |
169 | 请注意,ripgrep 最近增加了一些尚未出现在 Andy 表中的重要新特性。这包括但不限于,配置文件、passthru、对搜索压缩文件的支持、多行搜索和通过 PCRE2 支持选择高级正则表达式。
170 |
171 | ### Installation
172 |
173 |
174 |
175 | The binary name for ripgrep is `rg`.
176 |
177 | **[Archives of precompiled binaries for ripgrep are available for Windows, macOS and Linux.](https://github.com/BurntSushi/ripgrep/releases)** Users of platforms not explicitly mentioned below are advised to download one of these archives.
178 |
179 | Linux binaries are static executables. Windows binaries are available built either with MinGW (GNU) or with Microsoft Visual C++ (MSVC). When possible, prefer MSVC over GNU, but you'll need to have the [Microsoft VC++ 2015
180 | redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145) installed.
181 |
182 | If you're a **macOS Homebrew** or a **Linuxbrew** user, then you can install ripgrep from homebrew-core (compiled with Rust stable, no SIMD support):
183 |
184 | ```
185 | $ brew install ripgrep
186 | ```
187 |
188 | Or you can install a binary compiled with Rust nightly (including SIMD and all optimizations) by using a custom tap:
189 |
190 | ```
191 | $ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
192 | $ brew install ripgrep-bin
193 | ```
194 |
195 | If you're a **MacPorts** user, then you can install ripgrep from the [official ports](https://www.macports.org/ports.php?by=name&substr=ripgrep):
196 |
197 | ```
198 | $ sudo port install ripgrep
199 | ```
200 |
201 | If you're a **Windows Chocolatey** user, then you can install ripgrep from the [official repo](https://chocolatey.org/packages/ripgrep):
202 |
203 | ```
204 | $ choco install ripgrep
205 | ```
206 |
207 | If you're a **Windows Scoop** user, then you can install ripgrep from the [official bucket](https://github.com/lukesampson/scoop/blob/master/bucket/ripgrep.json):
208 |
209 | ```
210 | $ scoop install ripgrep
211 | ```
212 |
213 | If you're an **Arch Linux** user, then you can install ripgrep from the official repos:
214 |
215 | ```
216 | $ pacman -S ripgrep
217 | ```
218 |
219 | If you're a **Gentoo** user, you can install ripgrep from the [official repo](https://packages.gentoo.org/packages/sys-apps/ripgrep):
220 |
221 | ```
222 | $ emerge sys-apps/ripgrep
223 | ```
224 |
225 | If you're a **Fedora 27+** user, you can install ripgrep from the official repositories:
226 |
227 | ```
228 | $ sudo dnf install ripgrep
229 | ```
230 |
231 | If you're a **Fedora 24+** user, you can install ripgrep from [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
232 |
233 | ```
234 | $ sudo dnf copr enable carlwgeorge/ripgrep
235 | $ sudo dnf install ripgrep
236 | ```
237 |
238 | If you're an **openSUSE Tumbleweed** user, you can install ripgrep from the [official repo](http://software.opensuse.org/package/ripgrep):
239 |
240 | ```
241 | $ sudo zypper install ripgrep
242 | ```
243 |
244 | If you're a **RHEL/CentOS 7** user, you can install ripgrep from [copr](https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/):
245 |
246 | ```
247 | $ sudo yum-config-manager --add-repo=https://copr.fedorainfracloud.org/coprs/carlwgeorge/ripgrep/repo/epel-7/carlwgeorge-ripgrep-epel-7.repo
248 | $ sudo yum install ripgrep
249 | ```
250 |
251 | If you're a **Nix** user, you can install ripgrep from [nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/tools/text/ripgrep/default.nix):
252 |
253 | ```
254 | $ nix-env --install ripgrep
255 | $ # (Or using the attribute name, which is also ripgrep.)
256 | ```
257 |
258 | If you're a **Debian** user (or an **Ubuntu** user), then ripgrep can be installed using the binary `.deb` file provided with each [ripgrep release](https://github.com/BurntSushi/ripgrep/releases):
259 |
260 | ```
261 | $ curl -LO https://github.com/BurntSushi/ripgrep/releases/download/0.10.0/ripgrep_0.10.0_amd64.deb
262 | $ sudo dpkg -i ripgrep_0.10.0_amd64.deb
263 | ```
264 |
265 | If you run **Debian Buster (currently Debian testing) or Debian sid**, ripgrep is [available as an official Debian package](https://tracker.debian.org/pkg/rust-ripgrep):
266 |
267 | ```
268 | $ sudo apt-get install ripgrep
269 | ```
270 |
271 | If you're an **Ubuntu Cosmic (18.10)** (or newer) user, ripgrep is [available](https://launchpad.net/ubuntu/+source/rust-ripgrep) using the same packaging as Debian:
272 |
273 | ```
274 | $ sudo apt-get install ripgrep
275 | ```
276 |
277 | (N.B. Various snaps for ripgrep on Ubuntu are also available, but none of them seem to work right, and they generate a number of very strange bug reports that I don't know how to fix and don't have the time to fix. Therefore, it is **no longer a recommended installation option**.)
278 |
279 | If you're a **FreeBSD** user, then you can install ripgrep from the [official ports](https://www.freshports.org/textproc/ripgrep/):
280 |
281 | ```
282 | # pkg install ripgrep
283 | ```
284 |
285 | If you're an **OpenBSD** user, then you can install ripgrep from the [official ports](http://openports.se/textproc/ripgrep):
286 |
287 | ```
288 | $ doas pkg_add ripgrep
289 | ```
290 |
291 | If you're a **NetBSD** user, then you can install ripgrep from [pkgsrc](http://pkgsrc.se/textproc/ripgrep):
292 |
293 | ```
294 | # pkgin install ripgrep
295 | ```
296 |
297 | If you're a **Rust programmer**, ripgrep can be installed with `cargo`.
298 |
299 | - Note that the minimum supported version of Rust for ripgrep is **1.23.0**, although ripgrep may work with older versions.
300 | - Note that the binary may be bigger than expected because it contains debug symbols. This is intentional. To remove debug symbols and therefore reduce the file size, run `strip` on the binary.
301 |
302 | ```
303 | $ cargo install ripgrep
304 | ```
305 |
306 | When compiling with Rust 1.27 or newer, this will automatically enable SIMD optimizations for search.
307 |
308 | ripgrep isn't currently in any other package repositories. [I'd like to change that](https://github.com/BurntSushi/ripgrep/issues/10).
309 |
310 | ### Building
311 |
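312 |
313 |
314 | ripgrep is written in Rust, so you'll need a [Rust installation](https://www.rust-lang.org/) in order to compile it. ripgrep compiles with Rust 1.28.0 (stable) or newer. In general, ripgrep tracks the latest stable release of the Rust compiler.
315 |
316 | To build ripgrep: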
317 |
318 | ```
319 | $ git clone https://github.com/BurntSushi/ripgrep
320 | $ cd ripgrep
321 | $ cargo build --release
322 | $ ./target/release/rg --version
323 | 0.1.3
324 | ```
325 |
326 | If you have a Rust nightly compiler and a recent Intel CPU, then you can enable additional optional SIMD acceleration like so:
327 |
328 | ```
329 | RUSTFLAGS="-C target-cpu=native" cargo build --release --features 'simd-accel avx-accel'
330 | ```
331 |
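332 | If your machine doesn't support AVX instructions, then simply remove `avx-accel` from the features list. Similarly for SIMD (which corresponds roughly to SSE instructions).
333 |
334 | The `simd-accel` and `avx-accel` features enable SIMD support in certain ripgrep dependencies (responsible for counting lines and transcoding). They are not necessary to get SIMD optimizations for search; those are enabled automatically. Hopefully, some day, the `simd-accel` and `avx-accel` features will become unnecessary as well.
335 |
336 | Finally, optional PCRE2 support can be built into ripgrep by enabling the `pcre2` feature:
337 |
338 | ```
339 | $ cargo build --release --features 'pcre2'
340 | ```
341 |
342 | (Tip: use `--features 'pcre2 simd-accel avx-accel'` to also include compile-time SIMD optimizations, which will only work with a nightly compiler.)
343 |
344 | Enabling the PCRE2 feature works with a stable Rust compiler and will attempt to automatically find and link with your system's PCRE2 library via `pkg-config`. If one doesn't exist, then ripgrep will build PCRE2 from source using your system's C compiler and then statically link it into the final executable. Static linking can be forced even when a PCRE2 system library is available, either by building ripgrep with the MUSL target or by setting `PCRE2_SYS_STATIC=1`.
345 |
346 | ripgrep can be built with the MUSL target on Linux by first installing the MUSL library on your system (consult your friendly neighborhood package manager). Then you just need to add MUSL support to your Rust toolchain and rebuild ripgrep, which yields a fully static executable: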
347 |
348 | ```
349 | $ rustup target add x86_64-unknown-linux-musl
350 | $ cargo build --release --target x86_64-unknown-linux-musl
351 | ```
352 |
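353 | Applying the `--features` flag from above works as expected. If you want to build a static executable with MUSL and with PCRE2, then you will need to have `musl-gcc` installed, which might be in a separate package from the actual MUSL library, depending on your Linux distribution.
354 |
355 | ### Running tests
356 |
357 |
358 |
359 | ripgrep is relatively well tested, including both unit tests and integration tests. To run the full test suite, use:
360 |
361 | ```
362 | $ cargo test --all
363 | ```
364 |
365 | from the repository root.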
366 |
--------------------------------------------------------------------------------
/rg-0.10.0-h.md:
--------------------------------------------------------------------------------
1 | ripgrep 0.10.0
2 | Andrew Gallant
3 |
4 | ripgrep (rg) recursively searches your current directory for a regex pattern.
5 | By default, ripgrep will respect your .gitignore and automatically skip hidden
6 | files/directories and binary files.
7 |
8 | ripgrep's default regex engine uses finite automata and guarantees linear
9 | time searching. Because of this, features like backreferences and arbitrary
10 | look-around are not supported. However, if ripgrep is built with PCRE2, then
11 | the --pcre2 flag can be used to enable backreferences and look-around.
12 |
13 | ripgrep supports configuration files. Set RIPGREP_CONFIG_PATH to a
14 | configuration file. The file can specify one shell argument per line. Lines
15 | starting with '#' are ignored. For more details, see the man page or the
16 | README.
17 |
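The configuration file format described above (one argument per line, with lines starting with '#' ignored) can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not ripgrep's actual parser: `parse_config` is a hypothetical helper, and trimming surrounding whitespace and skipping blank lines are assumptions that go beyond what the help text says.

```rust
/// Splits the contents of a configuration file into arguments:
/// one argument per line, skipping comment lines that start with '#'.
/// Trimming whitespace and skipping empty lines are assumptions of this sketch.
fn parse_config(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}
```

Given a file containing a comment line followed by `--hidden` and `--smart-case` on separate lines, this sketch would yield those two flags in order, ready to be placed ahead of the arguments given on the command line.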
18 | Project home page: https://github.com/BurntSushi/ripgrep
19 |
20 | Use -h for short descriptions and --help for more details.
21 |
22 | ```bash
23 | USAGE:
24 | rg [OPTIONS] PATTERN [PATH ...]
25 | rg [OPTIONS] [-e PATTERN ...] [-f PATTERNFILE ...] [PATH ...]
26 | rg [OPTIONS] --files [PATH ...]
27 | rg [OPTIONS] --type-list
28 | command | rg [OPTIONS] PATTERN
29 |
30 | ARGS:
31 | <PATTERN> A regular expression used for searching.
32 | <PATH>... A file or directory to search.
33 |
34 | OPTIONS:
35 | -A, --after-context Show NUM lines after each match.
36 | -B, --before-context Show NUM lines before each match.
37 | --block-buffered Force block buffering.
38 | -b, --byte-offset Print the 0-based byte offset for each matching line.
39 | -s, --case-sensitive Search case sensitively (default).
40 | --color Controls when to use color.
41 | --colors ... Configure color settings and styles.
42 | --column Show column numbers.
43 | -C, --context Show NUM lines before and after each match.
44 | --context-separator Set the context separator string.
45 | -c, --count Only show the count of matching lines for each file.
46 | --count-matches Only show the count of individual matches for each file.
47 | --crlf Support CRLF line terminators (useful on Windows).
48 | --debug Show debug messages.
49 | --dfa-size-limit The upper size limit of the regex DFA.
50 | -E, --encoding Specify the text encoding of files to search.
51 | -f, --file ... Search for patterns from the given file.
52 | --files Print each file that would be searched.
53 | -l, --files-with-matches Only print the paths with at least one match.
54 | --files-without-match Only print the paths that contain zero matches.
55 | -F, --fixed-strings Treat the pattern as a literal string.
56 | -L, --follow Follow symbolic links.
57 | -g, --glob ... Include or exclude files.
58 | -h, --help Prints help information. Use --help for more details.
59 | --heading Print matches grouped by each file.
60 | --hidden Search hidden files and directories.
61 | --iglob ... Include or exclude files case insensitively.
62 | -i, --ignore-case Case insensitive search.
63 | --ignore-file ... Specify additional ignore files.
64 | -v, --invert-match Invert matching.
65 | --json Show search results in a JSON Lines format.
66 | --line-buffered Force line buffering.
67 | -n, --line-number Show line numbers.
68 | -x, --line-regexp Only show matches surrounded by line boundaries.
69 | -M, --max-columns Don't print lines longer than this limit.
70 | -m, --max-count Limit the number of matches.
71 | --max-depth Descend at most NUM directories.
72 | --max-filesize Ignore files larger than NUM in size.
73 | --mmap Search using memory maps when possible.
74 | -U, --multiline Enable matching across multiple lines.
75 | --multiline-dotall Make '.' match new lines when multiline is enabled.
76 | --no-config Never read configuration files.
77 | --no-filename Never print the file path with the matched lines.
78 | --no-heading Don't group matches by each file.
79 | --no-ignore Don't respect ignore files.
80 | --no-ignore-global Don't respect global ignore files.
81 | --no-ignore-messages Suppress gitignore parse error messages.
82 | --no-ignore-parent Don't respect ignore files in parent directories.
83 | --no-ignore-vcs Don't respect VCS ignore files.
84 | -N, --no-line-number Suppress line numbers.
85 | --no-messages Suppress some error messages.
86 | --no-mmap Never use memory maps.
87 | --no-pcre2-unicode Disable Unicode mode for PCRE2 matching.
88 | -0, --null Print a NUL byte after file paths.
89 | --null-data Use NUL as a line terminator instead of \n.
90 | --one-file-system Do not descend into directories on other file systems.
91 | -o, --only-matching Print only matches parts of a line.
92 | --passthru Print both matching and non-matching lines.
93 | --path-separator Set the path separator.
94 | -P, --pcre2 Enable PCRE2 matching.
95 | --pre search outputs of COMMAND FILE for each FILE
96 | --pre-glob ... Include or exclude files from a preprocessing command.
97 | -p, --pretty Alias for --color always --heading --line-number.
98 | -q, --quiet Do not print anything to stdout.
99 | --regex-size-limit The upper size limit of the compiled regex.
100 | -e, --regexp ... A pattern to search for.
101 | -r, --replace Replace matches with the given text.
102 | -z, --search-zip Search in compressed files.
103 | -S, --smart-case Smart case search.
104 | --sort Sort results in ascending order. Implies --threads=1.
105 | --sortr Sort results in descending order. Implies --threads=1.
106 | --stats Print statistics about this ripgrep search.
107 | -a, --text Search binary files as if they were text.
108 | -j, --threads The approximate number of threads to use.
109 | --trim Trim prefixed whitespace from matches.
110 | -t, --type ... Only search files matching TYPE.
111 | --type-add ... Add a new glob for a file type.
112 | --type-clear ... Clear globs for a file type.
113 | --type-list Show all supported file types.
114 | -T, --type-not ... Do not search files matching TYPE.
115 | -u, --unrestricted Reduce the level of "smart" searching.
116 | -V, --version Print version information.
117 | --vimgrep Show results in vim compatible format.
118 | -H, --with-filename Print the file path with the matched lines.
119 | -w, --word-regexp Only show matches surrounded by word boundaries.
120 | ```
--------------------------------------------------------------------------------
/rg-0.10.0-h.zh.md:
--------------------------------------------------------------------------------
1 | ## ripgrep 0.10.0
2 |
3 | Andrew Gallant
4 |
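5 | ripgrep (rg) recursively searches the current directory for a regex pattern. By default, ripgrep respects your .gitignore and automatically skips hidden files/directories and binary files.
6 |
7 | ripgrep's default regex engine uses finite automata and guarantees linear time searching. Because of this, features like backreferences and arbitrary look-around are not supported. However, if ripgrep is built with PCRE2, then the `--pcre2` flag can be used to enable backreferences and look-around.
8 |
9 | > [Look-around: an explanatory article found on Zhihu](https://zhuanlan.zhihu.com/p/50789818)
10 |
11 | ripgrep supports configuration files. Set RIPGREP_CONFIG_PATH to the path of a configuration file. The file can specify one shell argument per line. Lines starting with '#' are ignored. For more details, see the man page or the README.
12 |
13 | Project home page:
14 |
15 | Use `-h` for short descriptions and `--help` for more details.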
16 |
17 | ```bash
18 | USAGE:
19 | rg [OPTIONS] PATTERN [PATH ...]
20 | rg [OPTIONS] [-e PATTERN ...] [-f PATTERNFILE ...] [PATH ...]
21 | rg [OPTIONS] --files [PATH ...]
22 | rg [OPTIONS] --type-list
23 | command | rg [OPTIONS] PATTERN
24 |
25 | ARGS:
26 | A regular expression used for searching.
27 | ... A file or directory to search.
28 |
29 | OPTIONS:
30 | ```
31 |
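32 | | Flag | Description |
33 | | ---------------------------------- | ---------------------------------------------------------------------------------------- |
34 | | `-A, --after-context ` | Show NUM lines after each match. |
35 | | `-B, --before-context ` | Show NUM lines before each match. |
36 | | `--block-buffered` | Force block buffering. |
37 | | `-b, --byte-offset` | Print the 0-based byte offset for each matching line. |
38 | | `-s, --case-sensitive` | Search case sensitively (default). |
39 | | `--color ` | Control when to use color. |
40 | | `--colors ...` | Configure color settings and styles. |
41 | | `--column` | Show column numbers. |
42 | | `-C, --context ` | Show NUM lines before and after each match. |
43 | | `--context-separator ` | Set the context separator string. |
44 | | `-c, --count` | Only show the count of matching lines for each file. |
45 | | `--count-matches` | Only show the count of individual matches for each file. |
46 | | `--crlf` | Support CRLF line terminators (useful on Windows). |
47 | | `--debug` | Show debug messages. |
48 | | `--dfa-size-limit ` | The upper size limit of the regex DFA. |
49 | | `-E, --encoding ` | Specify the text encoding of files to search. |
50 | | `-f, --file ...` | Search for patterns from the given file. |
51 | | `--files` | Print each file that would be searched. |
52 | | `-l, --files-with-matches` | Only print the paths with at least one match. |
53 | | `--files-without-match` | Only print the paths that contain zero matches. |
54 | | `-F, --fixed-strings` | Treat the pattern as a literal string. |
55 | | `-L, --follow` | Follow symbolic links. |
56 | | `-g, --glob ...` | Include or exclude files. |
57 | | `-h, --help` | Print help information. Use `--help` for more details. |
58 | | `--heading` | Print matches grouped by each file. |
59 | | `--hidden` | Search hidden files and directories. |
60 | | `--iglob ...` | Include or exclude files by glob, case insensitively. |
61 | | `-i, --ignore-case` | Case insensitive search. |
62 | | `--ignore-file ...` | Specify additional ignore files. |
63 | | `-v, --invert-match` | Invert matching. |
64 | | `--json` | Show search results in a JSON Lines format. |
65 | | `--line-buffered` | Force line buffering. |
66 | | `-n, --line-number` | Show line numbers. |
67 | | `-x, --line-regexp` | Only show matches surrounded by line boundaries. |
68 | | `-M, --max-columns ` | Don't print lines longer than this limit. |
69 | | `-m, --max-count ` | Limit the number of matches. |
70 | | `--max-depth ` | Descend at most NUM directories. |
71 | | `--max-filesize ` | Ignore files larger than NUM in size. |
72 | | `--mmap` | Search using memory maps when possible. |
73 | | `-U, --multiline` | Enable matching across multiple lines. |
74 | | `--multiline-dotall` | Make '.' match new lines when multiline is enabled. |
75 | | `--no-config` | Never read configuration files. |
76 | | `--no-filename` | Never print the file path with the matched lines. |
77 | | `--no-heading` | Don't group matches by each file. |
78 | | `--no-ignore` | Don't respect ignore files. |
79 | | `--no-ignore-global` | Don't respect global ignore files. |
80 | | `--no-ignore-messages` | Suppress gitignore parse error messages. |
81 | | `--no-ignore-parent` | Don't respect ignore files in parent directories. |
82 | | `--no-ignore-vcs` | Don't respect VCS ignore files. |
83 | | `-N, --no-line-number` | Suppress line numbers. |
84 | | `--no-messages` | Suppress some error messages. |
85 | | `--no-mmap` | Never use memory maps. |
86 | | `--no-pcre2-unicode` | Disable Unicode mode for PCRE2 matching. |
87 | | `-0, --null` | Print a NUL byte after file paths. |
88 | | `--null-data` | Use NUL as a line terminator instead of `\n`. |
89 | | `--one-file-system` | Do not descend into directories on other file systems. |
90 | | `-o, --only-matching` | Print only the matched parts of a line. |
91 | | `--passthru` | Print both matching and non-matching lines. |
92 | | `--path-separator ` | Set the path separator. |
93 | | `-P, --pcre2` | Enable PCRE2 matching. |
94 | | `--pre ` | Search outputs of COMMAND FILE for each FILE. |
95 | | `--pre-glob ...` | Include or exclude files from a preprocessing command. |
96 | | `-p, --pretty` | Alias for `--color always --heading --line-number`. |
97 | | `-q, --quiet` | Do not print anything to stdout. |
98 | | `--regex-size-limit ` | The upper size limit of the compiled regex. |
99 | | `-e, --regexp ...` | A pattern to search for. |
100 | | `-r, --replace ` | Replace matches with the given text. |
101 | | `-z, --search-zip` | Search in compressed files. |
102 | | `-S, --smart-case` | Smart case search. (The query is treated as case sensitive if it contains an uppercase character, and as case insensitive otherwise.) |
103 | | `--sort ` | Sort results in ascending order. Implies `--threads=1`. |
104 | | `--sortr ` | Sort results in descending order. Implies `--threads=1`. |
105 | | `--stats` | Print statistics about this ripgrep search. |
106 | | `-a, --text` | Search binary files as if they were text. |
107 | | `-j, --threads ` | The approximate number of threads to use. |
108 | | `--trim` | Trim prefixed whitespace from matches. |
109 | | `-t, --type ...` | Only search files matching TYPE. |
110 | | `--type-add ...` | Add a new glob for a file type. |
111 | | `--type-clear ...` | Clear globs for a file type. |
112 | | `--type-list` | Show all supported file types. |
113 | | `-T, --type-not ...` | Do not search files matching TYPE. |
114 | | `-u, --unrestricted` | Reduce the level of "smart" searching. |
115 | | `-V, --version` | Print version information. |
116 | | `--vimgrep` | Show results in vim compatible format. |
117 | | `-H, --with-filename` | Print the file path with the matched lines. |
118 | | `-w, --word-regexp` | Only show matches surrounded by word boundaries. |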
119 |
--------------------------------------------------------------------------------
/sync-en.sh:
--------------------------------------------------------------------------------
1 | cat './.mds-list' | while read -r line || [[ -n ${line} ]]
2 | do
3 |   testseq="zh.md"
4 |   if [[ $line =~ $testseq || "$line" == "" ]]; then
5 |     echo "skip $line"
6 |   else
7 |     lowline=$(echo "$line" | awk '{print tolower($0)}')
8 |     # lowline: lowercase copy of the path, used for the case-insensitive README check below
9 |     zh=${line//source\//}
10 |     dir=$(dirname "$zh")
11 |
12 |     source_readme="./source/readme.md"
13 |     if [[ $lowline == "$source_readme" ]]; then
14 |       # source/[readme|README].md => en.md
15 |       filename="en.md"
16 |     else
17 |       # source/other.md => ./other.md
18 |       filename=$(basename "$zh")
19 |     fi
20 |     echo "$line >> $dir/$filename"
21 |     mkdir -p "$dir" && cp "$line" "$_/$filename"
22 |   fi
23 | done
--------------------------------------------------------------------------------
/termcolor/README.md:
--------------------------------------------------------------------------------
1 | termcolor has moved to its own repository:
2 | https://github.com/BurntSushi/termcolor
3 |
--------------------------------------------------------------------------------
/termcolor/README.zh.md:
--------------------------------------------------------------------------------
1 | termcolor has moved to its own repository:
2 |
--------------------------------------------------------------------------------
/wincolor/README.md:
--------------------------------------------------------------------------------
1 | wincolor has moved to the termcolor repository:
2 | https://github.com/BurntSushi/termcolor
3 |
--------------------------------------------------------------------------------
/wincolor/README.zh.md:
--------------------------------------------------------------------------------
1 | wincolor has moved to the termcolor repository:
2 |
--------------------------------------------------------------------------------