├── LICENSE
├── README.md
├── Version_changes.md
├── code_snippets
    ├── Alternation_and_Grouping.rb
    ├── Anchors.rb
    ├── Character_class.rb
    ├── Dot_metacharacter_and_Quantifiers.rb
    ├── Escaping_metacharacters.rb
    ├── Groupings_and_backreferences.rb
    ├── Interlude_Common_tasks.rb
    ├── Lookarounds.rb
    ├── Modifiers.rb
    ├── Regexp_introduction.rb
    ├── Unicode.rb
    └── Working_with_matched_portions.rb
├── exercises
    ├── Exercise_solutions.md
    ├── Exercises.md
    ├── expected.md
    └── sample.md
├── images
    ├── debuggex.png
    ├── find_replace.png
    ├── info.svg
    ├── password_check.png
    ├── rubular.png
    ├── ruby_regexp_ls.png
    └── warning.svg
├── ruby_regexp.md
└── sample_chapters
    └── ruby_regexp_sample.pdf
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2024 Sundeep Agarwal
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Understanding Ruby Regexp
2 |
3 | Learn Ruby Regular Expressions step-by-step from beginner to advanced levels with hundreds of examples and exercises. Visit https://youtu.be/QNsCzVeZH78 for a short video about the book.
4 |
5 |
6 |
7 | The book also includes exercises to test your understanding, which are presented together as a single file in this repo — [Exercises.md](./exercises/Exercises.md).
8 |
9 | For solutions to the exercises, see [Exercise_solutions.md](./exercises/Exercise_solutions.md).
10 |
11 | See [Version_changes.md](./Version_changes.md) to keep track of changes made to the book.
12 |
13 |
14 |
15 | # E-book
16 |
17 | * You can download the pdf/epub versions of the book for free using the links below (you can also pay if you wish):
18 | * https://learnbyexample.gumroad.com/l/rubyregexp
19 | * https://leanpub.com/rubyregexp
20 | * You can also get the book as part of these bundles:
21 | * **All books bundle** from https://learnbyexample.gumroad.com/l/all-books
22 | * Includes all my programming books
23 | * **Awesome Regex** bundle from https://learnbyexample.gumroad.com/l/regex or https://leanpub.com/b/regex
24 | * **Ruby text processing** bundle from https://learnbyexample.gumroad.com/l/ruby-textprocessing or https://leanpub.com/b/ruby-textprocessing
25 | * See https://learnbyexample.github.io/books/ for a list of other books
26 |
27 | For a preview of the book, see [sample chapters](./sample_chapters/ruby_regexp_sample.pdf).
28 |
29 | The book can also be [viewed as a single markdown file in this repo](./ruby_regexp.md). See my blogpost on [generating pdfs from markdown using pandoc](https://learnbyexample.github.io/customizing-pandoc/) if you are interested in the ebook creation process.
30 |
31 | For the web version of the book, visit https://learnbyexample.github.io/Ruby_Regexp/
32 |
33 |
34 |
35 | # Feedback
36 |
37 | ⚠️ ⚠️ Please DO NOT submit pull requests. The main reason is that any modification requires changes in multiple places.
38 |
39 | I would highly appreciate it if you'd let me know how you felt about this book. It could be anything from a simple thank you, pointing out a typo, mistakes in code snippets, which aspects of the book worked for you (or didn't!) and so on. Reader feedback is essential and especially so for self-published authors.
40 |
41 | You can reach me via:
42 |
43 | * Issue Manager: [https://github.com/learnbyexample/Ruby_Regexp/issues](https://github.com/learnbyexample/Ruby_Regexp/issues)
44 | * E-mail: `echo 'bGVhcm5ieWV4YW1wbGUubmV0QGdtYWlsLmNvbQo=' | base64 --decode`
45 | * Twitter: [https://twitter.com/learn_byexample](https://twitter.com/learn_byexample)
46 |
47 |
48 |
49 | # Table of Contents
50 |
51 | 1. Preface
52 | 2. Why is it needed?
53 | 3. Regexp introduction
54 | 4. Anchors
55 | 5. Alternation and Grouping
56 | 6. Escaping metacharacters
57 | 7. Dot metacharacter and Quantifiers
58 | 8. Interlude: Tools for debugging and visualization
59 | 9. Working with matched portions
60 | 10. Character class
61 | 11. Groupings and backreferences
62 | 12. Interlude: Common tasks
63 | 13. Lookarounds
64 | 14. Modifiers
65 | 15. Unicode
66 | 16. Further Reading
67 |
68 |
69 |
70 | # Acknowledgements
71 |
72 | * [ruby-lang documentation](https://www.ruby-lang.org/en/documentation/) — manuals and tutorials
73 | * [/r/ruby/](https://old.reddit.com/r/ruby/) and [/r/regex/](https://old.reddit.com/r/regex/) — helpful forum for beginners and experienced programmers alike
74 | * [stackoverflow](https://stackoverflow.com/) — for getting answers to pertinent questions on Ruby and regular expressions
75 | * [tex.stackexchange](https://tex.stackexchange.com/) — for help on [pandoc](https://github.com/jgm/pandoc/) and `tex` related questions
76 | * [canva](https://www.canva.com/) — cover image
77 | * [Warning](https://commons.wikimedia.org/wiki/File:Warning_icon.svg) and [Info](https://commons.wikimedia.org/wiki/File:Info_icon_002.svg) icons by [Amada44](https://commons.wikimedia.org/wiki/User:Amada44) under public domain
78 | * [oxipng](https://github.com/shssoichiro/oxipng), [pngquant](https://pngquant.org/) and [svgcleaner](https://github.com/RazrFalcon/svgcleaner) — optimizing images
79 | * [gmovchan](https://github.com/gmovchan) for spotting a typo
80 | * **KOTP** for spotting grammatical mistakes
81 | * [mdBook](https://github.com/rust-lang/mdBook) — for web version of the book
82 | * [mdBook-pagetoc](https://github.com/JorelAli/mdBook-pagetoc) — for adding table of contents for each chapter
83 | * [minify-html](https://github.com/wilsonzlin/minify-html) — for minifying html files
84 |
85 | Special thanks to Allen Downey. An attempt at translating his book [Think Python](https://greenteapress.com/wp/think-python-2e/) to [Think Ruby](https://github.com/learnbyexample/ThinkRubyBuild) gave me the confidence to publish my own book.
86 |
87 |
88 |
89 | # License
90 |
91 | The book is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/).
92 |
93 | The code snippets are licensed under MIT, see [LICENSE](./LICENSE) file.
94 |
95 |
--------------------------------------------------------------------------------
/Version_changes.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | ### 3.0
4 |
5 | * Ruby version updated to **3.3.0**
6 | * Corrected examples and descriptions for Atomic grouping, `\G` and `\K` features
7 | * In general, many of the examples, exercises, solutions, descriptions and external links were updated/corrected
8 | * Updated Acknowledgements section
9 | * Code snippets related to info/warning sections will now appear as a single block
10 | * Book title changed to **Understanding Ruby Regexp**
11 | * New cover image
12 | * Images centered for EPUB format
13 |
14 |
15 |
16 | ### 2.6
17 |
18 | * Updated `ruby` version to **3.0.0**
19 | * Added a further reading link for `\g` subexpression call usage
20 | * Typo corrections and miscellaneous changes
21 |
22 |
23 |
24 | ### 2.5
25 |
26 | * Added **epub** version of the book
27 | * Added real world regular expressions usage examples and overview of book in introduction chapter
28 | * Added plenty of new exercises, perhaps too many
29 | * Updated and clarified descriptions for many concepts, too many changes to list individually
30 | * Added separate section about escape sequences and differences compared to string literals, added details for `\R` character set
31 | * Added two interlude chapters to highlight external resources
32 | * Removed chapters *Miscellaneous* and *Gotchas* and merged their contents into other chapters
33 | * Added section for conditional group `(?(cond)yes-subexp|no-subexp)`
34 | * And many more typo corrections and miscellaneous changes
35 |
36 |
37 |
38 | ### 2.1
39 |
40 | * corrected formatting in a code snippet comment
41 | * changed a character class example to use possessive quantifier instead of greedy
42 | * added examples for `\k` method of backreferencing
43 | * added examples for `\X`
44 |
45 |
46 |
47 | ### 2.0
48 |
49 | * added Table of Contents
50 | * updated cover image
51 | * changed book formatting
52 | * better contrast for chapter and section names
53 | * changed background color of code snippets for better contrast
54 | * increased font size and page margins
55 | * added cheatsheet at end of chapters
56 | * improved descriptions and examples
57 | * corrected minor typos and improved grammar
58 |
59 |
60 |
61 | ### 1.1
62 |
63 | * changed cover image
64 | * added illustration for recursive matching section
65 | * external link updates and minor description changes
66 |
67 |
68 |
69 | ### 1.0
70 |
71 | * added exercises
72 | * some comments for code snippets improved, typos fixed
73 | * AND conditional examples made as a sub-heading
74 | * font for code snippets changed to accommodate some Unicode characters
75 |
76 |
77 |
78 | ### 0.6
79 |
80 | * second example for Character class chapter changed from `grep` to `gsub`
81 | * second example for String Encoding chapter changed from `gsub` to `scan`
82 | * better naming for `hash` examples
83 | * Recursive matching simplified to use `\g<0>` instead of capture group and `\g<1>`
84 | * also, description improved and added links for viewing the regexp online as railroad diagrams
85 |
86 |
87 |
88 | ### 0.5
89 |
90 | * First version
91 |
92 |
--------------------------------------------------------------------------------
/code_snippets/Alternation_and_Grouping.rb:
--------------------------------------------------------------------------------
1 | ## Alternation
2 |
3 | pet = /cat|dog/
4 |
5 | 'I like cats'.match?(pet)
6 |
7 | 'I like dogs'.match?(pet)
8 |
9 | 'I like parrots'.match?(pet)
10 |
11 | 'catapults concatenate cat scat cater'.gsub(/\Acat|cat\b/, 'X')
12 |
13 | 'cat dog bee parrot fox'.gsub(/cat|dog|fox/, 'mammal')
14 |
15 | ## Regexp.union method
16 |
17 | Regexp.union('car', 'jeep')
18 |
19 | words = %w[cat dog fox]
20 |
21 | pat = Regexp.union(words)
22 |
23 | pat
24 |
25 | 'cat dog bee parrot fox'.gsub(pat, 'mammal')
26 |
27 | ## Grouping
28 |
29 | 'red reform read arrest'.gsub(/reform|rest/, 'X')
30 |
31 | 'red reform read arrest'.gsub(/re(form|st)/, 'X')
32 |
33 | 'par spare part party'.gsub(/\bpar\b|\bpart\b/, 'X')
34 |
35 | 'par spare part party'.gsub(/\b(par|part)\b/, 'X')
36 |
37 | 'par spare part party'.gsub(/\bpar(|t)\b/, 'X')
38 |
39 | ## Regexp.source method
40 |
41 | words = %w[cat par]
42 |
43 | alt = Regexp.union(words)
44 |
45 | alt
46 |
47 | alt_w = /\b(#{alt.source})\b/
48 |
49 | alt_w
50 |
51 | 'cater cat concatenate par spare'.gsub(alt, 'X')
52 |
53 | 'cater cat concatenate par spare'.gsub(alt_w, 'X')
54 |
55 | ## Precedence rules
56 |
57 | words = 'lion elephant are rope not'
58 |
59 | words =~ /on/
60 |
61 | words =~ /ant/
62 |
63 | words.sub(/on|ant/, 'X')
64 |
65 | words.sub(/ant|on/, 'X')
66 |
67 | mood = 'best years'
68 |
69 | mood =~ /year/
70 |
71 | mood =~ /years/
72 |
73 | mood.sub(/year|years/, 'X')
74 |
75 | mood.sub(/years|year/, 'X')
76 |
77 | words = 'ear xerox at mare part learn eye'
78 |
79 | words.gsub(/ar|are|art/, 'X')
80 |
81 | words.gsub(/are|ar|art/, 'X')
82 |
83 | words.gsub(/are|art|ar/, 'X')
84 |
85 | words = %w[hand handy handful]
86 |
87 | alt = Regexp.union(words.sort_by { |w| -w.length })
88 |
89 | alt
90 |
91 | 'hands handful handed handy'.gsub(alt, 'X')
92 |
93 | 'hands handful handed handy'.gsub(Regexp.union(words), 'X')
94 |
95 |
--------------------------------------------------------------------------------
/code_snippets/Anchors.rb:
--------------------------------------------------------------------------------
1 | ## String anchors
2 |
3 | 'cater'.match?(/\Acat/)
4 |
5 | 'concatenation'.match?(/\Acat/)
6 |
7 | "hi hello\ntop spot".match?(/\Ahi/)
8 |
9 | "hi hello\ntop spot".match?(/\Atop/)
10 |
11 | 'spare'.match?(/are\z/)
12 |
13 | 'nearest'.match?(/are\z/)
14 |
15 | words = %w[surrender unicorn newer door empty eel pest]
16 |
17 | words.grep(/er\z/)
18 |
19 | words.grep(/t\z/)
20 |
21 | "spare\ndare".sub(/are\z/, 'X')
22 |
23 | "spare\ndare".sub(/are\Z/, 'X')
24 |
25 | "spare\ndare\n".sub(/are\z/, 'X')
26 |
27 | "spare\ndare\n".sub(/are\Z/, 'X')
28 |
29 | 'cat'.match?(/\Acat\z/)
30 |
31 | 'cater'.match?(/\Acat\z/)
32 |
33 | 'concatenation'.match?(/\Acat\z/)
34 |
35 | 'live'.sub(/\A/, 're')
36 |
37 | 'send'.sub(/\A/, 're')
38 |
39 | 'cat'.sub(/\z/, 'er')
40 |
41 | 'hack'.sub(/\z/, 'er')
42 |
43 | ## Line anchors
44 |
45 | pets = 'cat and dog'
46 |
47 | pets.match?(/^cat/)
48 |
49 | pets.match?(/^dog/)
50 |
51 | pets.match?(/dog$/)
52 |
53 | pets.match?(/^dog$/)
54 |
55 | "hi hello\ntop spot".match?(/^top/)
56 |
57 | "spare\npar\nera\ndare".match?(/er$/)
58 |
59 | "spare\npar\ndare".each_line.grep(/are$/)
60 |
61 | "spare\npar\ndare".match?(/^par$/)
62 |
63 | str = "catapults\nconcatenate\ncat"
64 |
65 | puts str.gsub(/^/, '1: ')
66 |
67 | puts str.gsub(/^/).with_index(1) { "#{_2}: " }
68 |
69 | puts str.gsub(/$/, '.')
70 |
71 | puts "1\n2\n".gsub(/^/, 'fig ')
72 |
73 | puts "1\n\n".gsub(/^/, 'fig ')
74 |
75 | puts "1\n2\n".gsub(/$/, ' banana')
76 |
77 | puts "1\n\n".gsub(/$/, ' banana')
78 |
79 | ## Word anchors
80 |
81 | words = 'par spar apparent spare part'
82 |
83 | words.gsub(/par/, 'X')
84 |
85 | words.gsub(/\bpar/, 'X')
86 |
87 | words.gsub(/par\b/, 'X')
88 |
89 | words.gsub(/\bpar\b/, 'X')
90 |
91 | words = 'par spar apparent spare part'
92 |
93 | puts words.gsub(/\b/, '"').tr(' ', ',')
94 |
95 | '-----hello-----'.gsub(/\b/, ' ')
96 |
97 | 'output=num1+35*42/num2'.gsub(/\b/, ' ')
98 |
99 | 'output=num1+35*42/num2'.gsub(/\b/, ' ').strip
100 |
101 | ## Opposite Word anchors
102 |
103 | words = 'par spar apparent spare part'
104 |
105 | words.gsub(/\Bpar/, 'X')
106 |
107 | words.gsub(/\Bpar\b/, 'X')
108 |
109 | words.gsub(/par\B/, 'X')
110 |
111 | words.gsub(/\Bpar\B/, 'X')
112 |
113 | 'copper'.gsub(/\b/, ':')
114 |
115 | 'copper'.gsub(/\B/, ':')
116 |
117 | '-----hello-----'.gsub(/\b/, ' ')
118 |
119 | '-----hello-----'.gsub(/\B/, ' ')
120 |
121 |
--------------------------------------------------------------------------------
/code_snippets/Character_class.rb:
--------------------------------------------------------------------------------
1 | ## Custom character sets
2 |
3 | %w[cute cat cot coat cost scuttle].grep(/c[ou]t/)
4 |
5 | 'meeting cute boat site foot'.scan(/.[aeo]+t/)
6 |
7 | ## Range of characters
8 |
9 | 'Sample123string42with777numbers'.scan(/[0-9]+/)
10 |
11 | 'coat Bin food tar12 best Apple fig_42'.scan(/\b[a-z0-9]+\b/)
12 |
13 | 'coat tin food put stoop best fig_42 Pet'.scan(/\b[p-z][a-z]*\b/)
14 |
15 | 'coat tin food put stoop best fig_42 Pet'.scan(/\b[a-fp-t]+\b/)
16 |
17 | ## Negating character sets
18 |
19 | 'Sample123string42with777numbers'.scan(/[^0-9]+/)
20 |
21 | 'apple:123:banana:cherry'.sub(/\A([^:]+:){2}/, '')
22 |
23 | 'apple=42; cherry=123'.sub(/=[^=]+\z/, '')
24 |
25 | dates = '2024/04/25,1986/Mar/02,77/12/31'
26 |
27 | dates.scan(%r{([^/]+)/([^/]+)/([^/,]+),?})
28 |
29 | words = %w[tryst fun glyph pity why]
30 |
31 | words.grep(/\A[^aeiou]+\z/)
32 |
33 | words.grep_v(/[aeiou]/)
34 |
35 | ## Set intersection
36 |
37 | 'tryst glyph pity why'.scan(/\b[^aeiou]+\b/)
38 |
39 | 'tryst glyph pity why'.scan(/\b[a-z&&[^aeiou]]+\b/)
40 |
41 | ## Matching metacharacters literally
42 |
43 | 'ab-cd gh-c 12-423'.scan(/\b[a-z-]{2,}\b/)
44 |
45 | 'ab-cd gh-c 12-423'.scan(/\b[a-z\-0-9]{2,}\b/)
46 |
47 | 'f*(a^b) - 3*(a+b)'.scan(/a[+^]b/)
48 |
49 | 'f*(a^b) - 3*(a+b)'.scan(/a[\^+]b/)
50 |
51 | 'words[5] = tea'[/[a-z\[\]0-9]+/]
52 |
53 | puts '5ba\babc2'[/[a\\b]+/]
54 |
55 | ## Escape sequence sets
56 |
57 | '128A foo1 fe32 34 bar'.scan(/\b\h+\b/)
58 |
59 | '128A foo1 fe32 34 bar'.scan(/\b\h+\b/).map(&:hex)
60 |
61 | 'Sample123string42with777numbers'.split(/\d+/)
62 |
63 | 'apple=5, banana=3; x=83, y=120'.scan(/\d+/).map(&:to_i)
64 |
65 | 'sea eat car rat eel tea'.scan(/\b\w/).join
66 |
67 | "tea sea-Pit Sit;(lean_2\tbean_3)".scan(/[\w\s]+/)
68 |
69 | 'Sample123string42with777numbers'.gsub(/\D+/, '-')
70 |
71 | 'apple=5, banana=3; x=83, y=120'.gsub(/\W+/, '')
72 |
73 | " 1..3 \v\f fig_tea 42\tzzz \r\n1-2-3 ".scan(/\S+/)
74 |
75 | "food\r\ngood\napple\vbanana".gsub(/\R/, " ")
76 |
77 | "food\r\ngood"[/\w+\R/]
78 |
79 | ip = ['#comment', 'c = "#"', "\t #comment", 'fig', '', " \t "]
80 |
81 | ip.grep(/\A\s*[^#]/)
82 |
83 | ip.grep(/\A\s*+[^#]/)
84 |
85 | ip.grep(/\A\s*[^#\s]/)
86 |
87 | ## Named character sets
88 |
89 | 'Sample123string42with777numbers'.split(/[[:digit:]]+/)
90 |
91 | " 1..3 \v\f fig_tea 42\tzzz \r\n1-2-3 ".scan(/[[:^space:]]+/)
92 |
93 | "tea sea-Pit Sit;(lean_2\tbean_3)".scan(/[[:word:][:space:]]+/)
94 |
95 | 'Sample123string42with777numbers'.scan(/[[:alpha:]]+/)
96 |
97 | ip = '"Hi", there! How *are* you? All fine here.'
98 |
99 | ip.gsub(/[[:punct:]]+/, '')
100 |
101 | ip.gsub(/[[^.!?]&&[:punct:]]+/, '')
102 |
103 | ## Numeric ranges
104 |
105 | '23 154 12 26 98234'.scan(/\b[12]\d\b/)
106 |
107 | '23 154 12 26 98234'.scan(/\b\d{3,}\b/)
108 |
109 | '0501 035 154 12 26 98234'.scan(/\b0*+\d{3,}\b/)
110 |
111 | '45 349 651 593 4 204'.scan(/\d+/).filter { _1.to_i < 350 }
112 |
113 | '45 349 651 593 4 204'.gsub(/\d+/) { (200..650) === $&.to_i ? 0 : 1 }
114 |
115 |
--------------------------------------------------------------------------------
/code_snippets/Dot_metacharacter_and_Quantifiers.rb:
--------------------------------------------------------------------------------
1 | ## Dot metacharacter
2 |
3 | 'tac tin c.t abc;tuv acute'.gsub(/c.t/, 'X')
4 |
5 | 'breadth markedly reported overrides'.gsub(/r..d/) { _1.upcase }
6 |
7 | "42\t35".sub(/2.3/, '8')
8 |
9 | "a\nb".match?(/a.b/)
10 |
11 | ## split method
12 |
13 | 'apple-85-mango-70'.split(/-/)
14 |
15 | 'bus:3:car:-:van'.split(/:.:/)
16 |
17 | 'apple-85-mango-70'.split(/-/, 2)
18 |
19 | ## Greedy quantifiers
20 |
21 | 'far feat flare fear'.gsub(/e?ar/, 'X')
22 |
23 | 'par spare part party'.gsub(/\bpart?\b/, 'X')
24 |
25 | words = %w[red read ready re;d road redo reed rod]
26 |
27 | words.grep(/\bre.?d\b/)
28 |
29 | 'par part parrot parent'.gsub(/par(ro)?t/, 'X')
30 |
31 | 'par part parrot parent'.gsub(/par(en|ro)?t/, 'X')
32 |
33 | 'tr tear tare steer sitaara'.gsub(/ta*r/, 'X')
34 |
35 | 'tr tear tare steer sitaara'.gsub(/t(e|a)*r/, 'X')
36 |
37 | '3111111111125111142'.gsub(/1*2/, 'X')
38 |
39 | '3111111111125111142'.split(/1*/)
40 |
41 | '3111111111125111142'.split(/1*/, -1)
42 |
43 | '3111111111125111142'.partition(/1*2/)
44 |
45 | '3111111111125111142'.rpartition(/1*2/)
46 |
47 | 'tr tear tare steer sitaara'.gsub(/ta+r/, 'X')
48 |
49 | 'tr tear tare steer sitaara'.gsub(/t(e|a)+r/, 'X')
50 |
51 | '3111111111125111142'.gsub(/1+2/, 'X')
52 |
53 | '3111111111125111142'.split(/1+/)
54 |
55 | repeats = %w[abc ac adc abbc xabbbcz bbb bc abbbbbc]
56 |
57 | repeats.grep(/ab{1,4}c/)
58 |
59 | repeats.grep(/ab{3,}c/)
60 |
61 | repeats.grep(/ab{,2}c/)
62 |
63 | repeats.grep(/ab{3}c/)
64 |
65 | 'a{5} = 10'.sub(/a\{5}/, 'a{6}')
66 |
67 | 'report_{a,b}.txt'.sub(/_{a,b}/, '-{c,d}')
68 |
69 | '# heading ### sub-heading'.gsub(/\#{2,}/, '%')
70 |
71 | ## Conditional AND
72 |
73 | 'Error: not a valid input'.match?(/Error.*valid/)
74 |
75 | 'Error: key not found'.match?(/Error.*valid/)
76 |
77 | seq1, seq2 = ['cat and dog', 'dog and cat']
78 |
79 | seq1.match?(/cat.*dog|dog.*cat/)
80 |
81 | seq2.match?(/cat.*dog|dog.*cat/)
82 |
83 | patterns = [/cat/, /dog/]
84 |
85 | patterns.all? { seq1.match?(_1) }
86 |
87 | patterns.all? { seq2.match?(_1) }
88 |
89 | ## What does greedy mean?
90 |
91 | 'foot'.sub(/f.?o/, 'X')
92 |
93 | puts 'blah \< fig < apple \< blah < cat'.gsub(/\\?</, '\<')
94 |
95 | 'hand handy handful'.gsub(/hand(y|ful)?/, 'X')
96 |
97 | sentence = 'that is quite a fabricated tale'
98 |
99 | sentence.sub(/t.*a/, 'X')
100 |
101 | 'star'.sub(/t.*a/, 'X')
102 |
103 | sentence.sub(/t.*a.*q.*f/, 'X')
104 |
105 | sentence.sub(/t.*a.*u/, 'X')
106 |
107 | ## Non-greedy quantifiers
108 |
109 | 'foot'.sub(/f.??o/, 'X')
110 |
111 | 'frost'.sub(/f.??o/, 'X')
112 |
113 | '123456789'.sub(/.{2,5}?/, 'X')
114 |
115 | 'green:3.14:teal::brown:oh!:blue'.split(/:.*:/)
116 |
117 | 'green:3.14:teal::brown:oh!:blue'.split(/:.*?:/)
118 |
119 | ## Possessive quantifiers
120 |
121 | ip = 'fig:mango:pineapple:guava:apples:orange'
122 |
123 | ip.gsub(/:.*+/, 'X')
124 |
125 | ip.match?(/:.*+apple/)
126 |
127 | numbers = '42 314 001 12 00984'
128 |
129 | numbers.scan(/0*\d{3,}/)
130 |
131 | numbers.scan(/0*+\d{3,}/)
132 |
133 | numbers.scan(/0*[1-9]\d{2,}/)
134 |
135 | ## Atomic grouping
136 |
137 | numbers = '42 314 001 12 00984'
138 |
139 | numbers.scan(/(?>0*)\d{3,}/)
140 |
141 | ip = 'fig::mango::pineapple::guava::apples::orange'
142 |
143 | ip.match(/::.*?::apple/)[0]
144 |
145 | ip.match(/(?>::.*?::)apple/)[0]
146 |
147 |
--------------------------------------------------------------------------------
/code_snippets/Escaping_metacharacters.rb:
--------------------------------------------------------------------------------
1 | ## Escaping with backslash
2 |
3 | 'a^2 + b^2 - C*3'.match?(/b^2/)
4 |
5 | 'a^2 + b^2 - C*3'.gsub(/(a|b)\^2/) { _1.upcase }
6 |
7 | '(a*b) + c'.gsub(/\(|\)/, '')
8 |
9 | '\learn\by\example'.gsub(/\\/, '/')
10 |
11 | eqn = 'f*(a^b) - 3*(a^b)'
12 |
13 | eqn.gsub('(a^b)', 'c')
14 |
15 | ## Regexp.escape method
16 |
17 | eqn = 'f*(a^b) - 3*(a^b)'
18 |
19 | expr = '(a^b)'
20 |
21 | puts Regexp.escape(expr)
22 |
23 | eqn.sub(/#{Regexp.escape(expr)}\z/, 'c')
24 |
25 | terms = %w[a_42 (a^b) 2|3]
26 |
27 | pat = Regexp.union(terms)
28 |
29 | pat
30 |
31 | 'ba_423 (a^b)c 2|3 a^b'.gsub(pat, 'X')
32 |
33 | Regexp.union(/^cat|dog$/, 'a^b')
34 |
35 | ## Escaping delimiter
36 |
37 | path = '/home/joe/report/sales/ip.txt'
38 |
39 | path.sub(/\A\/home\/joe\//, '~/')
40 |
41 | path.sub(%r#\A/home/joe/#, '~/')
42 |
43 | ## Escape sequences
44 |
45 | "a\tb\tc".gsub(/\t/, ':')
46 |
47 | "1\n2\n3".gsub(/\n/, ' ')
48 |
49 | 'h%x'.match?(/h\%x/)
50 |
51 | 'h\%x'.match?(/h\%x/)
52 |
53 | 'hello'.match?(/\l/)
54 |
55 | 'h e l l o'.gsub(/\x20/, '')
56 |
57 | 'a+b'.match?(/a\053b/)
58 |
59 | '12|30'.gsub(/2\x7c3/, '5')
60 |
61 | '12|30'.gsub(/2|3/, '5')
62 |
63 |
--------------------------------------------------------------------------------
/code_snippets/Groupings_and_backreferences.rb:
--------------------------------------------------------------------------------
1 | ## Backreferences
2 |
3 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/, '\1')
4 |
5 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/, '\15')
6 |
7 | '_apple_ __123__ _banana_'.gsub(/(_)?_/, '\1')
8 |
9 | '52 apples and 31 mangoes'.gsub(/\d+/, '(\0)')
10 |
11 | 'Hello world'.sub(/.*/, 'Hi. \0. Have a nice day')
12 |
13 | 'fork,42,nice,3.14'.sub(/,.+/, '\0,\`')
14 |
15 | 'good,bad 42,24 x,y'.gsub(/(\w+),(\w+)/, '\2,\1')
16 |
17 | %w[effort flee facade oddball rat tool].grep(/(\w)\1/)
18 |
19 | 'aa a a a 42 f_1 f_1 f_13.14'.gsub(/\b(\w+)( \1)+\b/, '\1')
20 |
21 | 'two one 5 one2 three'.match?(/([a-z]+).*\12/)
22 |
23 | 'two one 5 one2 three'.match?(/([a-z]+).*\k<1>2/)
24 |
25 | s = 'abcdefghijklmna1d'
26 |
27 | s.sub(/(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.).*\1\x31/, 'X')
28 |
29 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/) { $15 }
30 |
31 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/) { $1 + "5" }
32 |
33 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/) { "#{$1}5" }
34 |
35 | ## Non-capturing groups
36 |
37 | 'cost akin more east run against'.scan(/\b\w*(?:st|in)\b/)
38 |
39 | '123hand42handy777handful500'.split(/hand(?:y|ful)?/)
40 |
41 | '1,2,3,4,5,6,7'.sub(/\A(([^,]+,){3})([^,]+)/, '\1(\3)')
42 |
43 | '1,2,3,4,5,6,7'.sub(/\A((?:[^,]+,){3})([^,]+)/, '\1(\2)')
44 |
45 | s = 'hi 123123123 bye 456123456'
46 |
47 | s.scan(/(123)+/)
48 |
49 | s.scan(/(?:123)+/)
50 |
51 | s.gsub(/(123)+/, 'X')
52 |
53 | row = 'one,2,3.14,42,five'
54 |
55 | puts row.sub(/\A([^,]+,){3}([^,]+)/, '\1"\2"')
56 |
57 | puts row.sub(/\A((?:[^,]+,){3})([^,]+)/, '\1"\2"')
58 |
59 | 'cost akin more east run against'.gsub(/\b\w*(st|in)\b/).to_a
60 |
61 | 'cost akin more east run against'.gsub(/\b\w*(st|in)\b/).map(&:upcase)
62 |
63 | 'effort flee facade oddball rat tool'.gsub(/\b\w*(\w)\1\w*\b/).to_a
64 |
65 | ## Subexpression calls
66 |
67 | row = 'today,2008-03-24,food,2012-08-12,nice,5632'
68 |
69 | row[/(\d{4}-\d{2}-\d{2}).*\g<1>/]
70 |
71 | d = '2008-03-24,2012-08-12 2017-06-27,2018-03-25 1999-12-23,2001-05-08'
72 |
73 | d.scan(/(\d{4}-\d{2}-\d{2}),\g<1>/)
74 |
75 | d.gsub(/(\d{4}-\d{2}-\d{2}),\g<1>/, '\1')
76 |
77 | d.gsub(/((\d{4}-\d{2}-\d{2})),\g<2>/, '\1')
78 |
79 | ## Recursive matching
80 |
81 | eqn0 = 'a + (b * c) - (d / e)'
82 |
83 | eqn0.scan(/\([^()]++\)/)
84 |
85 | eqn1 = '((f+x)^y-42)*((3-g)^z+2)'
86 |
87 | eqn1.scan(/\([^()]++\)/)
88 |
89 | eqn1 = '((f+x)^y-42)*((3-g)^z+2)'
90 |
91 | eqn1.scan(/\((?:[^()]++|\([^()]++\))++\)/)
92 |
93 | eqn2 = 'a + (b) + ((c)) + (((d)))'
94 |
95 | eqn2.scan(/\((?:[^()]++|\([^()]++\))++\)/)
96 |
97 | lvl2 = /\( #literal (
98 | (?: #start of non-capturing group
99 | [^()]++ #non-parentheses characters
100 | | #OR
101 | \([^()]++\) #level-one regexp
102 | )++ #end of non-capturing group, 1 or more times
103 | \) #literal )
104 | /x
105 |
106 | eqn1.scan(lvl2)
107 |
108 | eqn2.scan(lvl2)
109 |
110 | lvln = /\( #literal (
111 | (?: #start of non-capturing group
112 | [^()]++ #non-parentheses characters
113 | | #OR
114 | \g<0> #recursive call
115 | )++ #end of non-capturing group, 1 or more times
116 | \) #literal )
117 | /x
118 |
119 | eqn0.scan(lvln)
120 |
121 | eqn1.scan(lvln)
122 |
123 | eqn2.scan(lvln)
124 |
125 | eqn3 = '(3+a) * ((r-2)*(t+2)/6) + 42 * (a(b(c(d(e)))))'
126 |
127 | eqn3.scan(lvln)
128 |
129 | ## Named capture groups
130 |
131 | 'good,bad 42,24 x,y'.gsub(/(?<fw>\w+),(?<sw>\w+)/, '\k<sw>,\k<fw>')
132 |
133 | 'good,bad 42,24 x,y'.gsub(/(?'fw'\w+),(?'sw'\w+)/, '\k<sw>,\k<fw>')
134 |
135 | row = 'today,2008-03-24,food,2012-08-12,nice,5632'
136 |
137 | row[/(?<date>\d{4}-\d{2}-\d{2}).*\g<date>/]
138 |
139 | details = '2018-10-25,car'
140 |
141 | /(?<date>[^,]+),(?<product>[^,]+)/ =~ details
142 |
143 | date
144 |
145 | product
146 |
147 | details = '2018-10-25,car,2346'
148 |
149 | details.match(/(?<date>[^,]+),(?<product>[^,]+)/).named_captures
150 |
151 | details.match(/(?<date>[^,]+),([^,]+)/).named_captures
152 |
153 | s = 'good,bad 42,24'
154 |
155 | s.gsub(/(?<fw>\w+),(?<sw>\w+)/).map { $~.named_captures }
156 |
157 | ## Negative backreferences
158 |
159 | '1,2,3,3,5'.match?(/\A([^,]+,){2}([^,]+),\k<-1>,/)
160 |
161 | ## Conditional groups
162 |
163 | words = %w[ bye bad> 42 <3]
164 |
165 | words.grep(/\A(<)?\w+(?(1)>)\z/)
166 |
167 | words.grep(/\A(?:<\w+>|\w+)\z/)
168 |
169 | words.grep(/\A(?:\w+>?)\z/)
170 |
171 | words = ['(hi)', 'good-bye', 'bad', '(42)', '-oh', 'i-j', '(-)', '(oh-no)']
172 |
173 | words.grep(/\A(?:(\()?\w+(?(1)\)|-\w+))\z/)
174 |
175 |
--------------------------------------------------------------------------------
/code_snippets/Interlude_Common_tasks.rb:
--------------------------------------------------------------------------------
1 | ## CommonRegexRuby
2 |
3 | require 'commonregex'
4 |
5 | data = 'hello 255.21.255.22 okay 23/04/96'
6 |
7 | parsed = CommonRegex.new(data)
8 |
9 | parsed.get_ipv4
10 |
11 | parsed.get_dates
12 |
13 | CommonRegex.get_ipv4(data)
14 |
15 | CommonRegex.get_dates(data)
16 |
17 | new_data = '23.14.2.4.2 255.21.255.22 567.12.2.1'
18 |
19 | CommonRegex.get_ipv4(new_data)
20 |
21 |
--------------------------------------------------------------------------------
/code_snippets/Lookarounds.rb:
--------------------------------------------------------------------------------
1 | ## Conditional expressions
2 |
3 | items = ['1,2,3,4', 'a,b,c,d', '#apple 123']
4 |
5 | items.filter { _1.match?(/\d/) && _1.include?('#') }
6 |
7 | items.filter_map { |s| s.sub(/,.+,/, ' ') if s[0] != '#' }
8 |
9 | ## Negative lookarounds
10 |
11 | 'hey cats! cat42 cat_5 catcat'.gsub(/cat(?!\d)/, 'dog')
12 |
13 | 'cat _cat 42catcat'.gsub(/(?<!_)cat/, 'dog')
151 | h = { '1' => 'one', '2' => 'two', '4' => 'four' }
152 |
153 | '9234012'.gsub(/1|2|4/, h)
154 |
155 | h.default = 'X'
156 |
157 | '9234012'.gsub(/./, h)
158 |
159 | swap = { 'cat' => 'tiger', 'tiger' => 'cat' }
160 |
161 | 'cat tiger dog tiger cat'.gsub(/cat|tiger/, swap)
162 |
163 | h = { 'hand' => 1, 'handy' => 2, 'handful' => 3, 'a^b' => 4 }
164 |
165 | pat = Regexp.union(h.keys.sort_by { |w| -w.length })
166 |
167 | pat
168 |
169 | 'handful hand pin handy (a^b)'.gsub(pat, h)
170 |
171 | ## Substitution in conditional expression
172 |
173 | num = '4'
174 |
175 | puts "#{num} apples" if num.sub!(/5/) { $&.to_i ** 2 }
176 |
177 | puts "#{num} apples" if num.sub!(/4/) { $&.to_i ** 2 }
178 |
179 | word, cnt = ['coffining', 0]
180 |
181 | cnt += 1 while word.sub!(/fin/, '')
182 |
183 | [word, cnt]
184 |
185 |
--------------------------------------------------------------------------------
/exercises/Exercise_solutions.md:
--------------------------------------------------------------------------------
1 | # Exercise solutions
2 |
3 | > Solutions for [Exercises.md](https://github.com/learnbyexample/Ruby_Regexp/blob/master/exercises/Exercises.md) are presented here.
4 |
5 |
6 |
7 | # Regexp introduction
8 |
9 | **1)** Check whether the given strings contain `0xB0`. Display a boolean result as shown below.
10 |
11 | ```ruby
12 | >> line1 = 'start address: 0xA0, func1 address: 0xC0'
13 | >> line2 = 'end address: 0xFF, func2 address: 0xB0'
14 |
15 | >> line1.match?(/0xB0/)
16 | => false
17 | >> line2.match?(/0xB0/)
18 | => true
19 | ```
20 |
21 | **2)** Check if the given input strings contain `two` irrespective of case.
22 |
23 | ```ruby
24 | >> s1 = 'Their artwork is exceptional'
25 | >> s2 = 'one plus tw0 is not three'
26 | >> s3 = 'TRUSTWORTHY'
27 |
28 | >> pat1 = /two/i
29 |
30 | >> pat1.match?(s1)
31 | => true
32 | >> pat1.match?(s2)
33 | => false
34 | >> pat1.match?(s3)
35 | => true
36 | ```
37 |
38 | **3)** Replace all occurrences of `5` with `five` for the given string.
39 |
40 | ```ruby
41 | >> ip = 'They ate 5 apples and 5 oranges'
42 |
43 | >> ip.gsub(/5/, 'five')
44 | => "They ate five apples and five oranges"
45 | ```
46 |
47 | **4)** Replace only the first occurrence of `5` with `five` for the given string.
48 |
49 | ```ruby
50 | >> ip = 'They ate 5 apples and 5 oranges'
51 |
52 | >> ip.sub(/5/, 'five')
53 | => "They ate five apples and 5 oranges"
54 | ```
55 |
56 | **5)** For the given array, filter all elements that do *not* contain `e`.
57 |
58 | ```ruby
59 | >> items = %w[goal new user sit eat dinner]
60 |
61 | >> items.grep_v(/e/)
62 | => ["goal", "sit"]
63 | ```
64 |
65 | **6)** Replace all occurrences of `note` irrespective of case with `X`.
66 |
67 | ```ruby
68 | >> ip = 'This note should not be NoTeD'
69 |
70 | >> ip.gsub(/note/i, 'X')
71 | => "This X should not be XD"
72 | ```
73 |
74 | **7)** For the given input string, print all lines NOT containing the string `2`.
75 |
76 | ```ruby
77 | '> purchases = %q{items qty
78 | '> apple 24
79 | '> mango 50
80 | '> guava 42
81 | '> onion 31
82 | >> water 10}
83 |
84 | >> num = /2/
85 |
86 | >> puts purchases.each_line.grep_v(num)
87 | items qty
88 | mango 50
89 | onion 31
90 | water 10
91 | ```
92 |
93 | **8)** For the given array, filter all elements that contain either `a` or `w`.
94 |
95 | ```ruby
96 | >> items = %w[goal new user sit eat dinner]
97 |
98 | >> items.filter { |e| e.match?(/a/) || e.match?(/w/) }
99 | => ["goal", "new", "eat"]
100 | ```
101 |
102 | **9)** For the given array, filter all elements that contain both `e` and `n`.
103 |
104 | ```ruby
105 | >> items = %w[goal new user sit eat dinner]
106 |
107 | >> items.filter { |e| e.match?(/e/) && e.match?(/n/) }
108 | => ["new", "dinner"]
109 | ```
110 |
111 | **10)** For the given string, replace `0xA0` with `0x7F` and `0xC0` with `0x1F`.
112 |
113 | ```ruby
114 | >> ip = 'start address: 0xA0, func1 address: 0xC0'
115 |
116 | >> ip.gsub(/0xA0/, '0x7F').gsub(/0xC0/, '0x1F')
117 | => "start address: 0x7F, func1 address: 0x1F"
118 | ```
119 |
120 | **11)** Find the starting index of the first occurrence of `is` for the given input string.
121 |
122 | ```ruby
123 | >> ip = 'match this after the history lesson'
124 |
125 | >> ip =~ /is/
126 | => 8
127 | ```
128 |
129 |
130 |
131 | # Anchors
132 |
133 | **1)** Check if the given strings start with `be`.
134 |
135 | ```ruby
136 | >> line1 = 'be nice'
137 | >> line2 = '"best!"'
138 | >> line3 = 'better?'
139 | >> line4 = 'oh no\nbear spotted'
140 |
141 | >> pat = /\Abe/
142 |
143 | >> pat.match?(line1)
144 | => true
145 | >> pat.match?(line2)
146 | => false
147 | >> pat.match?(line3)
148 | => true
149 | >> pat.match?(line4)
150 | => false
151 | ```
152 |
153 | **2)** For the given input string, change only the whole word `red` to `brown`.
154 |
155 | ```ruby
156 | >> words = 'bred red spread credible red.'
157 |
158 | >> words.gsub(/\bred\b/, 'brown')
159 | => "bred brown spread credible brown."
160 | ```
161 |
162 | **3)** For the given input array, filter elements that contain `42` surrounded by word characters.
163 |
164 | ```ruby
165 | >> items = ['hi42bye', 'nice1423', 'bad42', 'cool_42a', '42fake', '_42_']
166 |
167 | >> items.grep(/\B42\B/)
168 | => ["hi42bye", "nice1423", "cool_42a", "_42_"]
169 | ```
170 |
171 | **4)** For the given input array, filter elements that start with `den` or end with `ly`.
172 |
173 | ```ruby
174 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']
175 |
176 | >> items.filter { |e| e.match?(/\Aden/) || e.match?(/ly\z/) }
177 | => ["lovely", "2 lonely", "dent"]
178 | ```
179 |
180 | **5)** For the given input string, change whole word `mall` to `1234` only if it is at the start of a line.
181 |
182 | ```ruby
183 | '> para = %q{(mall) call ball pall
184 | '> ball fall wall tall
185 | '> mall call ball pall
186 | '> wall mall ball fall
187 | '> mallet wallet malls
188 | >> mall:call:ball:pall}
189 |
190 | >> puts para.gsub(/^mall\b/, '1234')
191 | (mall) call ball pall
192 | ball fall wall tall
193 | 1234 call ball pall
194 | wall mall ball fall
195 | mallet wallet malls
196 | 1234:call:ball:pall
197 | ```
198 |
199 | **6)** For the given array, filter elements having a line starting with `den` or ending with `ly`.
200 |
201 | ```ruby
202 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']
203 |
204 | >> items.filter { |e| e.match?(/^den/) || e.match?(/ly$/) }
205 | => ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
206 | ```
207 |
208 | **7)** For the given input array, filter all whole elements `12\nthree` irrespective of case.
209 |
210 | ```ruby
211 | >> items = ["12\nthree\n", "12\nThree", "12\nthree\n4", "12\nthree"]
212 |
213 | >> items.grep(/\A12\nthree\z/i)
214 | => ["12\nThree", "12\nthree"]
215 | ```
216 |
217 | **8)** For the given input array, replace `hand` with `X` for all elements that start with `hand` followed by at least one word character.
218 |
219 | ```ruby
220 | >> items = %w[handed hand handy unhanded handle hand-2]
221 |
222 | >> items.map { _1.sub(/\bhand\B/, 'X') }
223 | => ["Xed", "hand", "Xy", "unhanded", "Xle", "hand-2"]
224 | ```
225 |
226 | **9)** For the given input array, filter all elements starting with `h`. Additionally, replace `e` with `X` for these filtered elements.
227 |
228 | ```ruby
229 | >> items = %w[handed hand handy unhanded handle hand-2]
230 |
231 | >> items.filter_map { |e| e.gsub(/e/, 'X') if e.match?(/\Ah/) }
232 | => ["handXd", "hand", "handy", "handlX", "hand-2"]
233 | ```
234 |
235 |
236 |
237 | # Alternation and Grouping
238 |
239 | **1)** For the given input array, filter all elements that start with `den` or end with `ly`.
240 |
241 | ```ruby
242 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']
243 |
244 | >> items.grep(/\Aden|ly\z/)
245 | => ["lovely", "2 lonely", "dent"]
246 | ```
247 |
248 | **2)** For the given array, filter elements having a line starting with `den` or ending with `ly`.
249 |
250 | ```ruby
251 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']
252 |
253 | >> items.grep(/^den|ly$/)
254 | => ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
255 | ```
256 |
257 | **3)** For the given strings, replace all occurrences of `removed` or `reed` or `received` or `refused` with `X`.
258 |
259 | ```ruby
260 | >> s1 = 'creed refuse removed read'
261 | >> s2 = 'refused reed redo received'
262 |
263 | >> pat = /re(mov|ceiv|fus|)ed/
264 |
265 | >> s1.gsub(pat, 'X')
266 | => "cX refuse X read"
267 | >> s2.gsub(pat, 'X')
268 | => "X X redo X"
269 | ```
270 |
271 | **4)** For the given strings, replace all matches from the array `words` with `A`.
272 |
273 | ```ruby
274 | >> s1 = 'plate full of slate'
275 | >> s2 = "slated for later, don't be late"
276 | >> words = %w[late later slated]
277 |
278 | >> pat = Regexp.union(words.sort_by { |w| -w.length })
279 |
280 | >> s1.gsub(pat, 'A')
281 | => "pA full of sA"
282 | >> s2.gsub(pat, 'A')
283 | => "A for A, don't be A"
284 | ```
285 |
286 | **5)** Filter all whole elements from the input array `items` that exactly match any of the elements present in the array `words`.
287 |
288 | ```ruby
289 | >> items = ['slate', 'later', 'plate', 'late', 'slates', 'slated ']
290 | >> words = %w[late later slated]
291 |
292 | >> pat = Regexp.union(words.sort_by { |w| -w.length })
293 | >> pat = /\A(#{pat.source})\z/
294 |
295 | >> items.grep(pat)
296 | => ["later", "late"]
297 | ```
298 |
299 |
300 |
301 | # Escaping metacharacters
302 |
303 | **1)** Transform the given input strings to the expected output using the same logic on both strings.
304 |
305 | ```ruby
306 | >> str1 = '(9-2)*5+qty/3-(9-2)*7'
307 | >> str2 = '(qty+4)/2-(9-2)*5+pq/4'
308 |
309 | >> str1.gsub('(9-2)*5', '35')
310 | => "35+qty/3-(9-2)*7"
311 | >> str2.gsub('(9-2)*5', '35')
312 | => "(qty+4)/2-35+pq/4"
313 | ```
314 |
315 | **2)** Replace `(4)\|` with `2` only at the start or end of the given input strings.
316 |
317 | ```ruby
318 | >> s1 = '2.3/(4)\|6 fig 5.3-(4)\|'
319 | >> s2 = '(4)\|42 - (4)\|3'
320 | >> s3 = "two - (4)\\|\n"
321 |
322 | >> pat = /\A\(4\)\\\||\(4\)\\\|\z/
323 |
324 | >> s1.gsub(pat, '2')
325 | => "2.3/(4)\\|6 fig 5.3-2"
326 | >> s2.gsub(pat, '2')
327 | => "242 - (4)\\|3"
328 | >> s3.gsub(pat, '2')
329 | => "two - (4)\\|\n"
330 | ```
331 |
332 | **3)** Replace any matching item from the given array with `X` for the given input strings. Match the elements from `items` literally. Assume no two elements of `items` will result in any matching conflict.
333 |
334 | ```ruby
335 | >> items = ['a.b', '3+n', 'x\y\z', 'qty||price', '{n}']
336 |
337 | >> pat = Regexp.union(items)
338 |
339 | >> '0a.bcd'.gsub(pat, 'X')
340 | => "0Xcd"
341 | >> 'E{n}AMPLE'.gsub(pat, 'X')
342 | => "EXAMPLE"
343 | >> '43+n2 ax\y\ze'.gsub(pat, 'X')
344 | => "4X2 aXe"
345 | ```
346 |
347 | **4)** Replace the backspace character `\b` with a single space character for the given input string.
348 |
349 | ```ruby
350 | >> ip = "123\b456"
351 | >> puts ip
352 | 12456
353 |
354 | >> ip.gsub(/\x08/, ' ')
355 | => "123 456"
356 | ```
357 |
358 | **5)** Replace all occurrences of `\o` with `o`.
359 |
360 | ```ruby
361 | >> ip = 'there are c\omm\on aspects am\ong the alternati\ons'
362 |
363 | >> ip.gsub(/\\o/, 'o')
364 | => "there are common aspects among the alternations"
365 | ```
366 |
367 | **6)** Replace any matching item from the array `eqns` with `X` for the given string `ip`. Match the items from `eqns` literally.
368 |
369 | ```ruby
370 | >> ip = '3-(a^b)+2*(a^b)-(a/b)+3'
371 | >> eqns = %w[(a^b) (a/b) (a^b)+2]
372 |
373 | >> pat = Regexp.union(eqns.sort_by { |w| -w.length })
374 |
375 | >> ip.gsub(pat, 'X')
376 | => "3-X*X-X+3"
377 | ```
378 |
379 |
380 |
381 | # Dot metacharacter and Quantifiers
382 |
383 | > Since the `.` metacharacter doesn't match newline characters by default, assume that the input strings in the following exercises will not contain newline characters.
384 |
385 | **1)** Replace `42//5` or `42/5` with `8` for the given input.
386 |
387 | ```ruby
388 | >> ip = 'a+42//5-c pressure*3+42/5-14256'
389 |
390 | >> ip.gsub(%r{42//?5}, '8')
391 | => "a+8-c pressure*3+8-14256"
392 | ```
393 |
394 | **2)** For the array `items`, filter all elements starting with `hand` and ending immediately with at most one more character or `le`.
395 |
396 | ```ruby
397 | >> items = %w[handed hand handled handy unhand hands handle]
398 |
399 | >> items.grep(/\Ahand(.|le)?\z/)
400 | => ["hand", "handy", "hands", "handle"]
401 | ```
402 |
403 | **3)** Use the `split` method to get the output as shown for the given input strings.
404 |
405 | ```ruby
406 | >> eqn1 = 'a+42//5-c'
407 | >> eqn2 = 'pressure*3+42/5-14256'
408 | >> eqn3 = 'r*42-5/3+42///5-42/53+a'
409 |
410 | >> pat = %r{42//?5}
411 |
412 | >> eqn1.split(pat)
413 | => ["a+", "-c"]
414 | >> eqn2.split(pat)
415 | => ["pressure*3+", "-14256"]
416 | >> eqn3.split(pat)
417 | => ["r*42-5/3+42///5-", "3+a"]
418 | ```
419 |
420 | **4)** For the given input strings, remove everything from the first occurrence of `i` till the end of the string.
421 |
422 | ```ruby
423 | >> s1 = 'remove the special meaning of such constructs'
424 | >> s2 = 'characters while constructing'
425 | >> s3 = 'input output'
426 |
427 | >> pat = /i.*/
428 |
429 | >> s1.sub(pat, '')
430 | => "remove the spec"
431 | >> s2.sub(pat, '')
432 | => "characters wh"
433 | >> s3.sub(pat, '')
434 | => ""
435 | ```
436 |
437 | **5)** For the given strings, construct a regexp to get the output as shown below.
438 |
439 | ```ruby
440 | >> str1 = 'a+b(addition)'
441 | >> str2 = 'a/b(division) + c%d(#modulo)'
442 | >> str3 = 'Hi there(greeting). Nice day(a(b)'
443 |
444 | >> remove_parentheses = /\(.*?\)/
445 |
446 | >> str1.gsub(remove_parentheses, '')
447 | => "a+b"
448 | >> str2.gsub(remove_parentheses, '')
449 | => "a/b + c%d"
450 | >> str3.gsub(remove_parentheses, '')
451 | => "Hi there. Nice day"
452 | ```
453 |
454 | **6)** Correct the given regexp to get the expected output.
455 |
456 | ```ruby
457 | >> words = 'plink incoming tint winter in caution sentient'
458 |
459 | # wrong output
460 | >> change = /int|in|ion|ing|inco|inter|ink/
461 | >> words.gsub(change, 'X')
462 | => "plXk XcomXg tX wXer X cautX sentient"
463 |
464 | # expected output
465 | >> change = /in(ter|co|t|g|k)?|ion/
466 | >> words.gsub(change, 'X')
467 | => "plX XmX tX wX X cautX sentient"
468 | ```
469 |
470 | **7)** For the given greedy quantifiers, what would be the equivalent form using the `{m,n}` representation?
471 |
472 | * `?` is the same as `{,1}`
473 | * `*` is the same as `{0,}`
474 | * `+` is the same as `{1,}`
475 |
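The equivalences above can be checked with a quick sketch. The sample strings and the `\A`/`\z` anchors are assumptions added here for illustration (they aren't part of the exercise); the anchors force each pattern to account for the whole string.

```ruby
# Hypothetical sample strings, chosen only for illustration
samples = ['', 'a', 'aa', 'aaa', 'b']

# Each pair holds a shorthand quantifier and its {m,n} form
pairs = [
  [/\Aa?\z/, /\Aa{,1}\z/],
  [/\Aa*\z/, /\Aa{0,}\z/],
  [/\Aa+\z/, /\Aa{1,}\z/]
]

# true for a pair if both forms agree on every sample string
equivalent = pairs.map do |shorthand, braces|
  samples.all? { |s| shorthand.match?(s) == braces.match?(s) }
end

p equivalent
# [true, true, true]
```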
476 | **8)** `(a*|b*)` is the same as `(a|b)*`: true or false?
477 |
478 | False. `(a*|b*)` will match only sequences made of a single repeated character, like `a`, `aaa`, `bb`, `bbbbbbbb`, whereas `(a|b)*` can also match mixed sequences like `ababbba`.
479 |
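Here's a small sketch of the difference between the two patterns. The sample words and the `\A`/`\z` anchors are assumptions added for illustration, so that only whole-string matches count.

```ruby
# Hypothetical samples: two uniform sequences and one mixed sequence
samples = %w[aaa bb ababbba]

# Alternation of two separately quantified characters
only_uniform = samples.grep(/\A(a*|b*)\z/)

# Quantifier applied to the alternation as a whole
mixed_too = samples.grep(/\A(a|b)*\z/)

p only_uniform
# ["aaa", "bb"]
p mixed_too
# ["aaa", "bb", "ababbba"]
```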
480 | **9)** For the given input strings, remove everything from the first occurrence of `test` (irrespective of case) till the end of the string, provided `test` isn't at the end of the string.
481 |
482 | ```ruby
483 | >> s1 = 'this is a Test'
484 | >> s2 = 'always test your RE for corner cases'
485 | >> s3 = 'a TEST of skill tests?'
486 |
487 | >> pat = /test.+/i
488 |
489 | >> s1.sub(pat, '')
490 | => "this is a Test"
491 | >> s2.sub(pat, '')
492 | => "always "
493 | >> s3.sub(pat, '')
494 | => "a "
495 | ```
496 |
497 | **10)** For the input array `words`, filter all elements starting with `s` and containing `e` and `t` in any order.
498 |
499 | ```ruby
500 | >> words = ['sequoia', 'subtle', 'exhibit', 'a set', 'sets', 'tests', 'site']
501 |
502 | >> words.grep(/\As.*(e.*t|t.*e)/)
503 | => ["subtle", "sets", "site"]
504 | ```
505 |
506 | **11)** For the input array `words`, remove all elements having less than `6` characters.
507 |
508 | ```ruby
509 | >> words = %w[sequoia subtle exhibit asset sets tests site]
510 |
511 | >> words.grep(/.{6,}/)
512 | => ["sequoia", "subtle", "exhibit"]
513 | ```
514 |
515 | **12)** For the input array `words`, filter all elements starting with `s` or `t` and having a maximum of `6` characters.
516 |
517 | ```ruby
518 | >> words = ['sequoia', 'subtle', 'exhibit', 'asset', 'sets', 't set', 'site']
519 |
520 | >> words.grep(/\A(s|t).{,5}\z/)
521 | => ["subtle", "sets", "t set", "site"]
522 | ```
523 |
524 | **13)** Can you reason out why this code results in the output shown? The aim was to remove all `<characters>` patterns but not the `<>` ones. The expected result was `'a 1<> b 2<> c'`.
525 |
526 | Since `.+?` must consume at least one character after `<`, a bare `<>` can never satisfy `<.+?>`. So, after matching a `<` that is immediately followed by `>` (as happens after `1` and `2` in the given input string), the regular expression engine treats that `>` as the required character and keeps scanning for the next `>` to complete the match. To solve such cases, you need to use character classes (discussed in a later chapter) to specify which particular set of characters should be matched by the `+` quantifier (instead of the `.` metacharacter).
527 |
528 | ```ruby
529 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
530 |
531 | >> ip.gsub(/<.+?>/, '')
532 | => "a 1 2"
533 | ```
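To see this behavior in action, here's a minimal sketch using a made-up input string (`demo` is a hypothetical value, not the exercise input). `scan` shows what each match swallows, and the character class version is one possible fix along the lines mentioned above.

```ruby
# Hypothetical input: two non-empty tags with an empty <> in between
demo = 'x<one> 1<> y<two>'

# The '>' of the empty tag is consumed by '.+?', so the match runs on
lazy_matches = demo.scan(/<.+?>/)

# Restricting the characters between '<' and '>' avoids the overrun
fixed = demo.gsub(/<[^>]+>/, '')

p lazy_matches
# ["<one>", "<> y<two>"]
p fixed
# "x 1<> y"
```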
534 |
535 | **14)** Use the `split` method to get the output as shown below for the given input strings.
536 |
537 | ```ruby
538 | >> s1 = 'go there :: this :: that'
539 | >> s2 = 'a::b :: c::d e::f :: 4::5'
540 | >> s3 = '42:: hi::bye::see :: carefully'
541 |
542 | >> pat = / +:: +/
543 |
544 | >> s1.split(pat, 2)
545 | => ["go there", "this :: that"]
546 | >> s2.split(pat, 2)
547 | => ["a::b", "c::d e::f :: 4::5"]
548 | >> s3.split(pat, 2)
549 | => ["42:: hi::bye::see", "carefully"]
550 | ```
551 |
552 | **15)** For the given input strings, match if the string starts with optional space characters followed by at least two `#` characters.
553 |
554 | ```ruby
555 | >> s1 = ' ## header2'
556 | >> s2 = '#### header4'
557 | >> s3 = '# comment'
558 | >> s4 = 'normal string'
559 | >> s5 = 'nope ## not this'
560 |
561 | >> pat = /\A *\#{2,}/
562 |
563 | >> s1.match?(pat)
564 | => true
565 | >> s2.match?(pat)
566 | => true
567 | >> s3.match?(pat)
568 | => false
569 | >> s4.match?(pat)
570 | => false
571 | >> s5.match?(pat)
572 | => false
573 | ```
574 |
575 | **16)** Modify the given regular expression such that it gives the expected results.
576 |
577 | ```ruby
578 | >> s1 = 'appleabcabcabcapricot'
579 | >> s2 = 'bananabcabcabcdelicious'
580 |
581 | # wrong output
582 | >> pat = /(abc)+a/
583 | >> pat.match?(s1)
584 | => true
585 | >> pat.match?(s2)
586 | => true
587 |
588 | # expected output
589 | # 'abc' shouldn't be considered when trying to match 'a' at the end
590 | >> pat = /(abc)++a/
591 | >> pat.match?(s1)
592 | => true
593 | >> pat.match?(s2)
594 | => false
595 | ```
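As a side-by-side sketch, the greedy and possessive versions can be compared with an atomic group, which is another way to prevent backtracking into the `(abc)+` portion. The `atomic` pattern is an alternative added here for illustration, not part of the exercise solution.

```ruby
# Strings taken from the exercise above
s1 = 'appleabcabcabcapricot'
s2 = 'bananabcabcabcdelicious'

greedy     = /(abc)+a/      # can give back an 'abc' to match the final 'a'
possessive = /(abc)++a/     # never gives back what '(abc)+' matched
atomic     = /(?>(abc)+)a/  # atomic group, same effect as possessive here

results = [greedy, possessive, atomic].map do |pat|
  [pat.match?(s1), pat.match?(s2)]
end

p results
# [[true, true], [true, false], [true, false]]
```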
596 |
597 |
598 |
599 | # Working with matched portions
600 |
601 | **1)** For the given strings, extract the matching portion from the first `is` to the last `t`.
602 |
603 | ```ruby
604 | >> str1 = 'This the biggest fruit you have seen?'
605 | >> str2 = 'Your mission is to read and practice consistently'
606 |
607 | >> pat = /is.*t/
608 |
609 | >> str1[pat]
610 | => "is the biggest fruit"
611 | >> str2[pat]
612 | => "ission is to read and practice consistent"
613 | ```
614 |
615 | **2)** Find the starting index of the first occurrence of `is` or `the` or `was` or `to` for the given input strings.
616 |
617 | ```ruby
618 | >> s1 = 'match after the last newline character'
619 | >> s2 = 'and then you want to test'
620 | >> s3 = 'this is good bye then'
621 | >> s4 = 'who was there to see?'
622 |
623 | >> pat = /is|the|was|to/
624 |
625 | >> s1 =~ pat
626 | => 12
627 | >> s2 =~ pat
628 | => 4
629 | >> s3 =~ pat
630 | => 2
631 | >> s4 =~ pat
632 | => 4
633 | ```
634 |
635 | **3)** Find the starting index of the last occurrence of `is` or `the` or `was` or `to` for the given input strings.
636 |
637 | ```ruby
638 | >> s1 = 'match after the last newline character'
639 | >> s2 = 'and then you want to test'
640 | >> s3 = 'this is good bye then'
641 | >> s4 = 'who was there to see?'
642 |
643 | >> pat = /.*(is|the|was|to)/
644 |
645 | >> s1.match(pat).begin(1)
646 | => 12
647 | >> s2.match(pat).begin(1)
648 | => 18
649 | >> s3.match(pat).begin(1)
650 | => 17
651 | >> s4.match(pat).begin(1)
652 | => 14
653 | ```
654 |
655 | **4)** Extract everything after the `:` character, which occurs only once in the input.
656 |
657 | ```ruby
658 | >> ip = 'fruits:apple, mango, guava, blueberry'
659 |
660 | # can also use: ip[/:(.*)/, 1]
661 | # can also use: ip.sub(/.*:/, '')
662 | >> ip.match(/:(.*)/)[1]
663 | => "apple, mango, guava, blueberry"
664 | ```
665 |
666 | **5)** The given input strings contain some text followed by `-` followed by a number. Replace that number with its `log` value using `Math.log()`.
667 |
668 | ```ruby
669 | >> s1 = 'first-3.14'
670 | >> s2 = 'next-123'
671 |
672 | >> pat = /-(.+)/
673 |
674 | >> s1.sub(pat) { "-#{Math.log($1.to_f)}" }
675 | => "first-1.144222799920162"
676 | >> s2.sub(pat) { "-#{Math.log($1.to_f)}" }
677 | => "next-4.812184355372417"
678 | ```
679 |
680 | **6)** Replace all occurrences of `par` with `spar`, `spare` with `extra` and `park` with `garden` for the given input strings.
681 |
682 | ```ruby
683 | >> str1 = 'apartment has a park'
684 | >> str2 = 'do you have a spare cable'
685 | >> str3 = 'write a parser'
686 |
687 | >> pat = /park?|spare/
688 | >> h = { 'par' => 'spar', 'spare' => 'extra', 'park' => 'garden' }
689 |
690 | >> str1.gsub(pat, h)
691 | => "aspartment has a garden"
692 | >> str2.gsub(pat, h)
693 | => "do you have a extra cable"
694 | >> str3.gsub(pat, h)
695 | => "write a sparser"
696 | ```
697 |
698 | **7)** Extract all words between `(` and `)` from the given input string as an array. Assume that the input will not contain any broken parentheses.
699 |
700 | ```ruby
701 | >> ip = 'another (way) to reuse (portion) matched (by) capture groups'
702 |
703 | # as nested array
704 | >> ip.scan(/\((.*?)\)/)
705 | => [["way"], ["portion"], ["by"]]
706 |
707 | # as array of strings
708 | >> ip.gsub(/\((.*?)\)/).map { $1 }
709 | => ["way", "portion", "by"]
710 | ```
711 |
712 | **8)** Extract all occurrences of `<` up to the next occurrence of `>`, provided there is at least one character in between `<` and `>`.
713 |
714 | ```ruby
715 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
716 |
717 | >> ip.scan(/<.+?>/)
718 | => ["<apple>", "<> b<bye>", "<> c<cat>"]
719 | ```
720 |
721 | **9)** Use `scan` to get the output as shown below for the given input strings. Note the characters used in the input strings carefully.
722 |
723 | ```ruby
724 | >> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
725 | >> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '
726 |
727 | >> pat = /(.+?),(.+?) /
728 |
729 | >> row1.scan(pat)
730 | => [["-2", "5"], ["4", "+3"], ["+42", "-53"], ["4356246", "-357532354"]]
731 | >> row2.scan(pat)
732 | => [["1.32", "-3.14"], ["634", "5.63"], ["63.3e3", "9907809345343.235"]]
733 | ```
734 |
735 | **10)** This is an extension to the previous question.
736 |
737 | * For `row1`, find the sum of integers of each array element. For example, sum of `-2` and `5` is `3`.
738 | * For `row2`, find the sum of floating-point numbers of each array element. For example, sum of `1.32` and `-3.14` is `-1.82`.
739 |
740 | ```ruby
741 | >> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
742 | >> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '
743 |
744 | # should be same as the previous question
745 | >> pat = /(.+?),(.+?) /
746 |
747 | >> row1.scan(pat).map { |a, b| a.to_i + b.to_i }
748 | => [3, 7, -11, -353176108]
749 |
750 | >> row2.scan(pat).map { |a, b| a.to_f + b.to_f }
751 | => [-1.82, 639.63, 9907809408643.234]
752 | ```
753 |
754 | **11)** Use the `split` method to get the output as shown below.
755 |
756 | ```ruby
757 | >> ip = '42:no-output;1000:car-tr:u-ck;SQEX49801'
758 |
759 | >> ip.split(/:.+?-(.+?);/)
760 | => ["42", "output", "1000", "tr:u-ck", "SQEX49801"]
761 | ```
762 |
763 | **12)** Convert the comma separated strings to corresponding `hash` objects as shown below. Note that the input strings have an extra `,` at the end.
764 |
765 | ```ruby
766 | >> row1 = 'name:rohan,maths:75,phy:89,'
767 | >> row2 = 'name:rose,maths:88,phy:92,'
768 |
769 | >> pat = /(.+?):(.+?),/
770 |
771 | >> row1.scan(pat).to_h
772 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
773 | >> row2.scan(pat).to_h
774 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
775 | ```
776 |
777 |
778 |
779 | # Character class
780 |
781 | **1)** For the array `items`, filter all elements starting with `hand` and ending immediately with `s` or `y` or `le`.
782 |
783 | ```ruby
784 | >> items = %w[-handy hand handy unhand hands hand-icy handle]
785 |
786 | >> items.grep(/\Ahand([sy]|le)\z/)
787 | => ["handy", "hands", "handle"]
788 | ```
789 |
790 | **2)** Replace all whole words `reed` or `read` or `red` with `X`.
791 |
792 | ```ruby
793 | >> ip = 'redo red credible :read: rod reed'
794 |
795 | >> ip.gsub(/\bre[ae]?d\b/, 'X')
796 | => "redo X credible :X: rod X"
797 | ```
798 |
799 | **3)** For the array `words`, filter all elements containing `e` or `i` followed by `l` or `n`. Note that the order mentioned should be followed.
800 |
801 | ```ruby
802 | >> words = %w[surrender unicorn newer door empty eel pest]
803 |
804 | >> words.grep(/[ei].*[ln]/)
805 | => ["surrender", "unicorn", "eel"]
806 | ```
807 |
808 | **4)** For the array `words`, filter all elements containing `e` or `i` and `l` or `n` in any order.
809 |
810 | ```ruby
811 | >> words = %w[surrender unicorn newer door empty eel pest]
812 |
813 | >> words.grep(/[ei].*[ln]|[ln].*[ei]/)
814 | => ["surrender", "unicorn", "newer", "eel"]
815 | ```
816 |
817 | **5)** Convert the comma separated strings to corresponding `hash` objects as shown below.
818 |
819 | ```ruby
820 | >> row1 = 'name:rohan,maths:75,phy:89'
821 | >> row2 = 'name:rose,maths:88,phy:92'
822 |
823 | >> pat = /([^:]+):([^,]+),?/
824 |
825 | >> row1.scan(pat).to_h
826 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
827 | >> row2.scan(pat).to_h
828 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
829 | ```
830 |
831 | **6)** Delete from `(` to the next occurrence of `)` unless they contain parentheses characters in between.
832 |
833 | ```ruby
834 | >> str1 = 'def factorial()'
835 | >> str2 = 'a/b(division) + c%d(#modulo) - (e+(j/k-3)*4)'
836 | >> str3 = 'Hi there(greeting). Nice day(a(b)'
837 |
838 | >> remove_parentheses = /\([^()]*\)/
839 |
840 | >> str1.gsub(remove_parentheses, '')
841 | => "def factorial"
842 | >> str2.gsub(remove_parentheses, '')
843 | => "a/b + c%d - (e+*4)"
844 | >> str3.gsub(remove_parentheses, '')
845 | => "Hi there. Nice day(a"
846 | ```
847 |
848 | **7)** For the array `words`, filter all elements not starting with `e` or `p` or `u`.
849 |
850 | ```ruby
851 | >> words = %w[surrender unicorn newer door empty eel (pest)]
852 |
853 | >> words.grep(/\A[^epu]/)
854 | => ["surrender", "newer", "door", "(pest)"]
855 | ```
856 |
857 | **8)** For the array `words`, filter all elements not containing `u` or `w` or `ee` or `-`.
858 |
859 | ```ruby
860 | >> words = %w[p-t you tea heel owe new reed ear]
861 |
862 | >> words.grep_v(/[uw-]|ee/)
863 | => ["tea", "ear"]
864 | ```
865 |
866 | **9)** The given input strings contain fields separated by `,` and fields can be empty too. Replace the last three fields with `WHTSZ323`.
867 |
868 | ```ruby
869 | >> row1 = '(2),kite,12,,D,C,,'
870 | >> row2 = 'hi,bye,sun,moon'
871 |
872 | >> pat = /(,[^,]*){3}\z/
873 |
874 | >> row1.sub(pat, ',WHTSZ323')
875 | => "(2),kite,12,,D,WHTSZ323"
876 | >> row2.sub(pat, ',WHTSZ323')
877 | => "hi,WHTSZ323"
878 | ```
879 |
880 | **10)** Split the given strings based on consecutive sequences of digit or whitespace characters.
881 |
882 | ```ruby
883 | >> str1 = "lion \t Ink32onion Nice"
884 | >> str2 = "**1\f2\n3star\t7 77\r**"
885 |
886 | >> pat = /[\d\s]+/
887 |
888 | >> str1.split(pat)
889 | => ["lion", "Ink", "onion", "Nice"]
890 | >> str2.split(pat)
891 | => ["**", "star", "**"]
892 | ```
893 |
894 | **11)** Delete all occurrences of the sequence `<characters>` where `characters` is one or more non `>` characters and cannot be empty.
895 |
896 | ```ruby
897 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
898 |
899 | >> ip.gsub(/<[^>]+>/, '')
900 | => "a 1<> b 2<> c"
901 | ```
902 |
903 | **12)** `\b[a-z](on|no)[a-z]\b` is same as `\b[a-z][on]{2}[a-z]\b`. True or False? Sample input lines shown below might help to understand the differences, if any.
904 |
905 | False. `[on]{2}` will also match `oo` and `nn`.
906 |
907 | ```ruby
908 | >> puts "known\nmood\nknow\npony\ninns"
909 | known
910 | mood
911 | know
912 | pony
913 | inns
914 | ```
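Filtering the same sample words with both patterns makes the difference visible. This is just a sketch built on the words from the `puts` output above.

```ruby
# Sample words from the exercise
words = %w[known mood know pony inns]

# Only 'on' or 'no' allowed between the two letters
alternation = words.grep(/\b[a-z](on|no)[a-z]\b/)

# Any two characters from the set, so 'oo' and 'nn' qualify as well
char_class = words.grep(/\b[a-z][on]{2}[a-z]\b/)

p alternation
# ["know", "pony"]
p char_class
# ["mood", "know", "pony", "inns"]
```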
915 |
916 | **13)** For the given array, filter elements containing any number sequence greater than `624`.
917 |
918 | ```ruby
919 | >> items = ['h0000432ab', 'car00625', '42_624 0512', '96 foo1234baz 3.14 2']
920 |
921 | >> items.filter { _1.gsub(/\d+/).any? { $&.to_i > 624 } }
922 | => ["car00625", "96 foo1234baz 3.14 2"]
923 | ```
924 |
925 | **14)** Count the maximum depth of nested braces for the given strings. Unbalanced or wrongly ordered braces should return `-1`. Note that this will require a mix of regular expressions and Ruby code.
926 |
927 | ```ruby
928 | ?> def max_nested_braces(ip)
929 | ?> cnt = 0
930 | ?> cnt += 1 while ip.gsub!(/\{[^{}]*\}/, '')
931 | ?> return ip.match?(/[{}]/) ? -1 : cnt
932 | >> end
933 |
934 | >> max_nested_braces('a*b')
935 | => 0
936 | >> max_nested_braces('}a+b{')
937 | => -1
938 | >> max_nested_braces('a*b+{}')
939 | => 1
940 | >> max_nested_braces('{{a+2}*{b+c}+e}')
941 | => 2
942 | >> max_nested_braces('{{a+2}*{b+{c*d}}+e}')
943 | => 3
944 | >> max_nested_braces("{{a+2}*{\n{b+{c*d}}+e*d}}")
945 | => 4
946 | >> max_nested_braces('a*{b+c*{e*3.14}}}')
947 | => -1
948 | ```
949 |
950 | **15)** By default, the `split` method will split on whitespace and remove empty strings from the result. Which regexp based method would you use to replicate this functionality?
951 |
952 | ```ruby
953 | >> ip = " \t\r so pole\t\t\t\n\nlit in to \r\n\v\f "
954 |
955 | >> ip.split
956 | => ["so", "pole", "lit", "in", "to"]
957 |
958 | >> ip.scan(/\S+/)
959 | => ["so", "pole", "lit", "in", "to"]
960 | ```
961 |
962 | **16)** Convert the given input string to two different arrays as shown below. You can optimize the regexp based on characters present in the input string.
963 |
964 | ```ruby
965 | >> ip = "price_42 roast^\t\n^-ice==cat\neast"
966 |
967 | >> ip.split(/\W+/)
968 | => ["price_42", "roast", "ice", "cat", "east"]
969 |
970 | >> ip.split(/(\W+)/)
971 | => ["price_42", " ", "roast", "^\t\n^-", "ice", "==", "cat", "\n", "east"]
972 | ```
973 |
974 | **17)** Filter all elements whose first non-whitespace character is not a `#` character. Any element made up of only whitespace characters should be ignored as well.
975 |
976 | ```ruby
977 | >> items = [' #comment', "\t\napple #42", '#oops', 'sure', 'no#1', "\t\r\f"]
978 |
979 | # can also use: items.grep(/\A\s*[^#\s]/)
980 | >> items.grep(/\A\s*+[^#]/)
981 | => ["\t\napple #42", "sure", "no#1"]
982 | ```
983 |
984 | **18)** Extract all whole words for the given input strings. However, based on user input `ignore`, do not match words if they contain any character present in the `ignore` variable. Assume that `ignore` variable will not contain any regexp metacharacters.
985 |
986 | ```ruby
987 | >> s1 = 'match after the last newline character'
988 | >> s2 = 'and then you want to test'
989 |
990 | >> ignore = 'aty'
991 | >> pat = /\b[\w&&[^#{ignore}]]+\b/
992 | >> s1.scan(pat)
993 | => ["newline"]
994 | >> s2.scan(pat)
995 | => []
996 |
997 | >> ignore = 'esw'
998 | >> pat = /\b[\w&&[^#{ignore}]]+\b/
999 | >> s1.scan(pat)
1000 | => ["match"]
1001 | >> s2.scan(pat)
1002 | => ["and", "you", "to"]
1003 | ```
1004 |
1005 | **19)** Filter all whole elements with optional whitespaces at the start followed by three to five non-digit characters. Whitespaces at the start should not be part of the calculation for non-digit characters.
1006 |
1007 | ```ruby
1008 | >> items = ["\t \ncat", 'goal', ' oh', 'he-he', 'goal2', 'ok ', 'sparrow']
1009 |
1010 | >> items.grep(/\A\s*+\D{3,5}\z/)
1011 | => ["\t \ncat", "goal", "he-he", "ok "]
1012 | ```
1013 |
1014 | **20)** Modify the given regexp such that it gives the expected result.
1015 |
1016 | ```ruby
1017 | >> ip = '( S:12 E:5 S:4 and E:123 ok S:100 & E:10 S:1 - E:2 S:42 E:43 )'
1018 |
1019 | # wrong output
1020 | >> ip.scan(/S:\d+.*?E:\d{2,}/)
1021 | => ["S:12 E:5 S:4 and E:123", "S:100 & E:10", "S:1 - E:2 S:42 E:43"]
1022 |
1023 | # expected output
1024 | >> ip.scan(/(?>S:\d+.*?E:)\d{2,}/)
1025 | => ["S:4 and E:123", "S:100 & E:10", "S:42 E:43"]
1026 | ```
1027 |
1028 |
1029 |
1030 | # Groupings and backreferences
1031 |
1032 | **1)** Replace the space character that occurs after a word ending with `a` or `r` with a newline character.
1033 |
1034 | ```ruby
1035 | >> ip = 'area not a _a2_ roar took 22'
1036 |
1037 | >> puts ip.gsub(/([ar]) /, "\\1\n")
1038 | area
1039 | not a
1040 | _a2_ roar
1041 | took 22
1042 | ```
1043 |
1044 | **2)** Add `[]` around words starting with `s` and containing `e` and `t` in any order.
1045 |
1046 | ```ruby
1047 | >> ip = 'sequoia subtle exhibit asset sets2 tests si_te'
1048 |
1049 | >> ip.gsub(/\bs\w*(t\w*e|e\w*t)\w*/, '[\0]')
1050 | => "sequoia [subtle] exhibit asset [sets2] tests [si_te]"
1051 | ```
1052 |
1053 | **3)** Replace all whole words with `X` that start and end with the same word character (irrespective of case). Single character word should get replaced with `X` too, as it satisfies the stated condition.
1054 |
1055 | ```ruby
1056 | >> ip = 'oreo not a _a2_ Roar took 22'
1057 |
1058 | # can also use: ip.gsub(/\b(\w|(\w)\w*\2)\b/i, 'X')
1059 | >> ip.gsub(/\b(\w)(\w*\1)?\b/i, 'X')
1060 | => "X not X X X took X"
1061 | ```
1062 |
1063 | **4)** Convert the given *markdown* headers to corresponding *anchor* tags. Consider the input to start with one or more `#` characters followed by space and word characters. The `name` attribute is constructed by converting the header to lowercase and replacing spaces with hyphens. Can you do it without using a capture group?
1064 |
1065 | ```ruby
1066 | >> header1 = '# Regular Expressions'
1067 | >> header2 = '## Named capture groups'
1068 |
1069 | >> anchor = /\w.*/
1070 |
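1071 | >> header1.sub(anchor) { "<a name='#{$&.downcase.tr(' ', '-')}'></a>#{$&}" }
1072 | => "# <a name='regular-expressions'></a>Regular Expressions"
1073 | >> header2.sub(anchor) { "<a name='#{$&.downcase.tr(' ', '-')}'></a>#{$&}" }
1074 | => "## <a name='named-capture-groups'></a>Named capture groups"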
1075 | ```
1076 |
1077 | **5)** Convert the given *markdown* anchors to corresponding *hyperlinks*.
1078 |
1079 | ```ruby
1080 | >> anchor1 = "# <a name='regular-expressions'></a>Regular Expressions"
1081 | >> anchor2 = "## <a name='subexpression-calls'></a>Subexpression calls"
1082 |
1083 | >> hyperlink = %r{[^']+'([^']+)'></a>(.+)}
1084 |
1085 | >> anchor1.sub(hyperlink, '[\2](#\1)')
1086 | => "[Regular Expressions](#regular-expressions)"
1087 | >> anchor2.sub(hyperlink, '[\2](#\1)')
1088 | => "[Subexpression calls](#subexpression-calls)"
1089 | ```
1090 |
1091 | **6)** Count the number of whole words that have at least two occurrences of consecutive repeated alphabets. For example, words like `stillness` and `Committee` should be counted but not words like `root` or `readable` or `rotational`.
1092 |
1093 | ```ruby
1094 | '> ip = %q{oppressed abandon accommodation bloodless
1095 | '> carelessness committed apparition innkeeper
1096 | '> occasionally afforded embarrassment foolishness
1097 | '> depended successfully succeeded
1098 | >> possession cleanliness suppress}
1099 |
1100 | # can also use: ip.scan(/\b\w*(\w)\1\w*(\w)\2\w*\b/).size
1101 | >> ip.scan(/\b(\w*(\w)\2){2}\w*\b/).size
1102 | => 13
1103 | ```
1104 |
1105 | **7)** For the given input string, replace all occurrences of digit sequences with only the unique non-repeating sequence. For example, `232323` should be changed to `23` and `897897` should be changed to `897`. If there are no repeats (for example `1234`) or if the repeats end prematurely (for example `12121`), it should not be changed.
1106 |
1107 | ```ruby
1108 | >> ip = '1234 2323 453545354535 9339 11 60260260'
1109 |
1110 | >> ip.gsub(/\b(\d+)\1+\b/, '\1')
1111 | => "1234 23 4535 9339 1 60260260"
1112 | ```
1113 |
1114 | **8)** Replace sequences made up of words separated by `:` or `.` by the first word of the sequence. Such sequences will end when `:` or `.` is not followed by a word character.
1115 |
1116 | ```ruby
1117 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1118 |
1119 | >> ip.gsub(/([:.]\w*)+/, '')
1120 | => "wow hi-2 bye kite"
1121 | ```
1122 |
1123 | **9)** Replace sequences made up of words separated by `:` or `.` by the last word of the sequence. Such sequences will end when `:` or `.` is not followed by a word character.
1124 |
1125 | ```ruby
1126 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1127 |
1128 | >> ip.gsub(/((\w+)[:.])+/, '\2')
1129 | => "five hi-2 bye water"
1130 | ```
1131 |
1132 | **10)** Split the given input string on one or more repeated sequence of `cat`.
1133 |
1134 | ```ruby
1135 | >> ip = 'firecatlioncatcatcatbearcatcatparrot'
1136 |
1137 | >> ip.split(/(?:cat)+/)
1138 | => ["fire", "lion", "bear", "parrot"]
1139 | ```
1140 |
1141 | **11)** For the given input string, find all occurrences of digit sequences with at least one repeating sequence. For example, `232323` and `897897`. If the repeats end prematurely, for example `12121`, it should not be matched.
1142 |
1143 | ```ruby
1144 | >> ip = '1234 2323 453545354535 9339 11 60260260'
1145 |
1146 | >> pat = /\b(\d+)\1+\b/
1147 |
1148 | # entire sequences in the output
1149 | >> ip.gsub(pat).map { $& }
1150 | => ["2323", "453545354535", "11"]
1151 |
1152 | # only the unique sequence in the output
1153 | >> ip.gsub(pat).map { $1 }
1154 | => ["23", "4535", "1"]
1155 | ```
1156 |
1157 | **12)** Convert the comma separated strings to corresponding `hash` objects as shown below. The keys are `name`, `maths` and `phy` for the three fields in the input strings.
1158 |
1159 | ```ruby
1160 | >> row1 = 'rohan,75,89'
1161 | >> row2 = 'rose,88,92'
1162 |
1163 | >> pat = /(?<name>[^,]+),(?<maths>[^,]+),(?<phy>[^,]+)/
1164 |
1165 | >> row1.match(pat).named_captures
1166 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
1167 | >> row2.match(pat).named_captures
1168 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
1169 | ```
1170 |
1171 | **13)** Surround all whole words with `()`. Additionally, if the whole word is `imp` or `ant`, delete them. Can you do it with just a single substitution?
1172 |
1173 | ```ruby
1174 | >> ip = 'tiger imp goat eagle ant important'
1175 |
1176 | >> ip.gsub(/\b(?:imp|ant|(\w+))\b/, '(\1)')
1177 | => "(tiger) () (goat) (eagle) () (important)"
1178 | ```
1179 |
1180 | **14)** Filter all elements that contain a sequence of lowercase alphabets followed by `-` followed by digits. They can be optionally surrounded by `{{` and `}}`. Any partial match shouldn't be part of the output.
1181 |
1182 | ```ruby
1183 | >> ip = %w[{{apple-150}} {{mango2-100}} {{cherry-200 grape-87 {{go-to}}]
1184 |
1185 | >> ip.grep(/\A({{)?[a-z]+-\d+(?(1)}})\z/)
1186 | => ["{{apple-150}}", "grape-87"]
1187 | ```
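As a cross-check on the conditional group `(?(1)}})` used above, the same filter can be sketched with plain alternation that spells out both allowed shapes. The variable names `cond` and `alt` are illustrative only and not part of the original solution.

```ruby
ip = %w[{{apple-150}} {{mango2-100}} {{cherry-200 grape-87 {{go-to}}]

# conditional version from the solution: }} is required only if {{ was matched
cond = /\A({{)?[a-z]+-\d+(?(1)}})\z/

# alternation version (hypothetical): list the wrapped and unwrapped forms explicitly
alt = /\A(?:\{\{[a-z]+-\d+\}\}|[a-z]+-\d+)\z/

with_cond = ip.grep(cond)
with_alt = ip.grep(alt)

p with_cond
p with_alt
```

The conditional form avoids repeating `[a-z]+-\d+`, which matters more as the shared portion grows.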
1188 |
1189 | **15)** Extract all hexadecimal character sequences, with `0x` optional prefix. Match the characters case insensitively, and the sequences shouldn't be surrounded by other word characters.
1190 |
1191 | ```ruby
1192 | >> str1 = '128A foo 0xfe32 34 0xbar'
1193 | >> str2 = '0XDEADBEEF place 0x0ff1ce bad'
1194 |
1195 | >> hex_seq = /\b(?:0x)?\h+\b/i
1196 |
1197 | >> str1.scan(hex_seq)
1198 | => ["128A", "0xfe32", "34"]
1199 | >> str2.scan(hex_seq)
1200 | => ["0XDEADBEEF", "0x0ff1ce", "bad"]
1201 | ```
1202 |
1203 | **16)** Replace sequences made up of words separated by `:` or `.` by the first/last word of the sequence and the separator. Such sequences will end when `:` or `.` is not followed by a word character.
1204 |
1205 | ```ruby
1206 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1207 |
1208 | # first word of the sequence
1209 | >> ip.gsub(/((\w+[:.]))\g<2>+/, '\1')
1210 | => "wow: hi-2 bye kite."
1211 |
1212 | # last word of the sequence
1213 | >> ip.gsub(/(\w+[:.])\g<1>+/, '\1')
1214 | => "five: hi-2 bye water."
1215 | ```
1216 |
1217 | **17)** For the given input strings, extract `if` followed by any number of nested parentheses. Assume that there will be only one such pattern per input string.
1218 |
1219 | ```ruby
1220 | >> ip1 = 'for (((i*3)+2)/6) if(3-(k*3+4)/12-(r+2/3)) while()'
1221 | >> ip2 = 'if+while if(a(b)c(d(e(f)1)2)3) for(i=1)'
1222 |
1223 | >> pat = /if(\((?:[^()]++|\g<1>)++\))/
1224 |
1225 | >> ip1[pat]
1226 | => "if(3-(k*3+4)/12-(r+2/3))"
1227 | >> ip2[pat]
1228 | => "if(a(b)c(d(e(f)1)2)3)"
1229 | ```
1230 |
1231 | **18)** The given input string has sequences made up of words separated by `:` or `.` and such sequences will end when `:` or `.` is not followed by a word character. For all such sequences, display only the last word followed by `-` followed by the first word.
1232 |
1233 | ```ruby
1234 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1235 |
1236 | >> ip.scan(/(\w+)[:.](?:(\w+)[:.])+/).map { "#{_2}-#{_1}" }
1237 | => ["five-wow", "water-kite"]
1238 | ```
1239 |
1240 |
1241 |
1242 | # Lookarounds
1243 |
1244 | > Please use lookarounds for solving the following exercises even if you can do it without lookarounds, except where lookarounds cannot be used (for example, variable-length lookbehinds).
1245 |
1246 | **1)** Replace all whole words with `X` unless it is preceded by a `(` character.
1247 |
1248 | ```ruby
1249 | >> ip = '(apple) guava berry) apple (mango) (grape'
1250 |
1251 | >> ip.gsub(/(?<!\()\b\w+/, 'X')
1252 | => "(apple) X X) X (mango) (grape"
1253 | ```
1254 |
1255 | **2)** Replace all whole words with `X` unless it is followed by a `)` character.
1256 |
1257 | ```ruby
1258 | >> ip = '(apple) guava berry) apple (mango) (grape'
1259 |
1260 | >> ip.gsub(/\w+\b(?!\))/, 'X')
1261 | => "(apple) X berry) X (mango) (X"
1262 | ```
1263 |
1264 | **3)** Replace all whole words with `X` unless it is preceded by `(` or followed by `)` characters.
1265 |
1266 | ```ruby
1267 | >> ip = '(apple) guava berry) apple (mango) (grape'
1268 |
1269 | >> ip.gsub(/(?<!\()\b\w+\b(?!\))/, 'X')
1270 | => "(apple) X berry) X (mango) (grape"
1271 | ```
1272 |
1273 | **4)** Extract all whole words that do not end with `e` or `n`.
1274 |
1275 | ```ruby
1276 | >> ip = 'a_t row on Urn e note Dust n end a2-e|u'
1277 |
1278 | >> ip.scan(/\b\w+\b(?<![en])/)
1279 | => ["a_t", "row", "Dust", "end", "a2", "u"]
1280 | ```
1281 |
1282 | **5)** Extract all whole words that do not start with `a` or `d` or `n`.
1283 |
1284 | ```ruby
1285 | >> ip = 'a_t row on Urn e note Dust n end a2-e|u'
1286 |
1287 | >> ip.scan(/(?![adn])\b\w+\b/)
1288 | => ["row", "on", "Urn", "e", "Dust", "end", "e", "u"]
1289 | ```
1290 |
1291 | **6)** Extract all whole words only if they are followed by `:` or `,` or `-`.
1292 |
1293 | ```ruby
1294 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1295 |
1296 | >> ip.scan(/\w+(?=[:,-])/)
1297 | => ["Poke", "so_good", "ever2"]
1298 | ```
1299 |
1300 | **7)** Extract all whole words only if they are preceded by `=` or `/` or `-`.
1301 |
1302 | ```ruby
1303 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1304 |
1305 | # can also use: ip.scan(%r{[=/-]\K\w+})
1306 | >> ip.scan(%r{(?<=[=/-])\w+})
1307 | => ["so_good", "is", "sit"]
1308 | ```
1309 |
1310 | **8)** Extract all whole words only if they are preceded by `=` or `:` and followed by `:` or `.`.
1311 |
1312 | ```ruby
1313 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1314 |
1315 | # can also use: ip.scan(/[=:]\K\w+(?=[:.])/)
1316 | >> ip.scan(/(?<=[=:])\w+(?=[:.])/)
1317 | => ["so_good", "ink"]
1318 | ```
1319 |
1320 | **9)** Extract all whole words only if they are preceded by `=` or `:` or `.` or `(` or `-` and not followed by `.` or `/`.
1321 |
1322 | ```ruby
1323 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1324 |
1325 | # can also use: ip.scan(%r{[=:.(-]\K\w+\b(?![/.])})
1326 | >> ip.scan(%r{(?<=[=:.(-])\w+\b(?![/.])})
1327 | => ["so_good", "vast", "sit"]
1328 | ```
1329 |
1330 | **10)** Remove the leading and trailing whitespaces from all the individual fields where `,` is the field separator.
1331 |
1332 | ```ruby
1333 | >> csv1 = " comma ,separated ,values \t\r "
1334 | >> csv2 = 'good bad,nice ice , 42 , , stall small'
1335 |
1336 | >> remove_whitespace = /(?<![^,])\s+|\s+(?![^,])/
1337 |
1338 | >> csv1.gsub(remove_whitespace, '')
1339 | => "comma,separated,values"
1340 | >> csv2.gsub(remove_whitespace, '')
1341 | => "good bad,nice ice,42,,stall small"
1342 | ```
1343 |
1344 | **11)** Filter elements that satisfy all of these rules:
1345 |
1346 | * should have at least two alphabets
1347 | * should have at least three digits
1348 | * should have at least one special character among `%` or `*` or `#` or `$`
1349 | * should not end with a whitespace character
1350 |
1351 | ```ruby
1352 | >> pwds = ['hunter2', 'F2H3u%9', "*X3Yz3.14\t", 'r2_d2_42', 'A $B C1234']
1353 |
1354 | >> rule_chk = /(?=(.*[a-zA-Z]){2})(?=(.*\d){3})(?!.*\s\z).*[%*#$]/
1355 |
1356 | >> pwds.grep(rule_chk)
1357 | => ["F2H3u%9", "A $B C1234"]
1358 | ```
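The combined pattern above packs four rules into one regexp. As a rough sketch, each rule can also be written as its own pattern and applied with `all?`, which makes it easier to see which rule rejects an element. The `rules` hash and its keys are hypothetical names introduced here for illustration.

```ruby
pwds = ['hunter2', 'F2H3u%9', "*X3Yz3.14\t", 'r2_d2_42', 'A $B C1234']

# one pattern per rule (hypothetical names), mirroring the bullet list above
rules = {
  two_alphabets: /(?:.*[a-zA-Z]){2}/,
  three_digits: /(?:.*\d){3}/,
  special_char: /[%*#$]/,
  no_trailing_space: /(?<!\s)\z/
}

# keep elements that satisfy every rule
valid = pwds.select { |s| rules.values.all? { |r| s.match?(r) } }

# for each element, list the rules it fails
failed = pwds.to_h { |s| [s, rules.reject { |_, r| s.match?(r) }.keys] }

p valid
p failed
```

The single-regexp solution is more compact and works directly with `grep`, while the split form trades brevity for easier debugging.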
1359 |
1360 | **12)** For the given string, surround all whole words with `{}` except for whole words `par` and `cat` and `apple`.
1361 |
1362 | ```ruby
1363 | >> ip = 'part; cat {super} rest_42 par scatter apple spar'
1364 |
1365 | >> ip.gsub(/\b(?!(?:par|cat|apple)\b)\w+/, '{\0}')
1366 | => "{part}; cat {{super}} {rest_42} par {scatter} apple {spar}"
1367 | ```
1368 |
1369 | **13)** Extract the integer portion of floating-point numbers for the given string. Integers and numbers ending with `.` and no further digits should not be considered.
1370 |
1371 | ```ruby
1372 | >> ip = '12 ab32.4 go 5 2. 46.42 5'
1373 |
1374 | >> ip.scan(/\d+(?=\.\d)/)
1375 | => ["32", "46"]
1376 | ```
1377 |
1378 | **14)** For the given input strings, extract all overlapping two character sequences.
1379 |
1380 | ```ruby
1381 | >> s1 = 'apple'
1382 | >> s2 = '1.2-3:4'
1383 |
1384 | >> pat = /.(?=(.))/
1385 |
1386 | >> s1.gsub(pat).map { $& + $1 }
1387 | => ["ap", "pp", "pl", "le"]
1388 | >> s2.gsub(pat).map { $& + $1 }
1389 | => ["1.", ".2", "2-", "-3", "3:", ":4"]
1390 | ```
1391 |
1392 | **15)** The given input strings contain fields separated by the `:` character. Delete `:` and the last field if there is a digit character anywhere before the last field.
1393 |
1394 | ```ruby
1395 | >> s1 = '42:cat'
1396 | >> s2 = 'twelve:a2b'
1397 | >> s3 = 'we:be:he:0:a:b:bother'
1398 | >> s4 = 'apple:banana-42:cherry:'
1399 | >> s5 = 'dragon:unicorn:centaur'
1400 |
1401 | >> pat = /(\d.*):.*/
1402 |
1403 | >> s1.sub(pat, '\1')
1404 | => "42"
1405 | >> s2.sub(pat, '\1')
1406 | => "twelve:a2b"
1407 | >> s3.sub(pat, '\1')
1408 | => "we:be:he:0:a:b"
1409 | >> s4.sub(pat, '\1')
1410 | => "apple:banana-42:cherry"
1411 | >> s5.sub(pat, '\1')
1412 | => "dragon:unicorn:centaur"
1413 | ```
1414 |
1415 | **16)** Extract all whole words unless they are preceded by `:` or `<=>` or `----` or `#`.
1416 |
1417 | ```ruby
1418 | >> ip = '::very--at<=>row|in.a_b#b2c=>lion----east'
1419 |
1420 | >> ip.scan(/(?<![:#]|<=>|-{4})\b\w+/)
1421 | => ["at", "in", "a_b", "lion"]
1422 | ```
1423 |
1424 | **17)** Match strings that contain `qty` followed by `price`, but not if there is any **whitespace** character or the string `error` between them.
1425 |
1426 | ```ruby
1427 | >> str1 = '23,qty,price,42'
1428 | >> str2 = 'qty price,oh'
1429 | >> str3 = '3.14,qty,6,errors,9,price,3'
1430 | >> str4 = "42\nqty-6,apple-56,price-234,error"
1431 | >> str5 = '4,price,3.14,qty,4'
1432 | >> str6 = '(qtyprice) (hi-there)'
1433 |
1434 | # can also use: neg = /qty((?!\s|error).)*price/
1435 | >> neg = /qty(?~\s|error)price/
1436 |
1437 | >> str1.match?(neg)
1438 | => true
1439 | >> str2.match?(neg)
1440 | => false
1441 | >> str3.match?(neg)
1442 | => false
1443 | >> str4.match?(neg)
1444 | => true
1445 | >> str5.match?(neg)
1446 | => false
1447 | >> str6.match?(neg)
1448 | => true
1449 | ```
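The two patterns mentioned above (the absent operator and the negated-lookahead form) are expected to agree on these inputs. A minimal sketch to confirm that, assuming a Ruby version that supports the `(?~...)` absent operator; the names `absent` and `lookahead` are illustrative.

```ruby
strs = ['23,qty,price,42',
        'qty price,oh',
        '3.14,qty,6,errors,9,price,3',
        "42\nqty-6,apple-56,price-234,error",
        '4,price,3.14,qty,4',
        '(qtyprice) (hi-there)']

# absent operator: any text that does not contain a match for \s|error
absent = /qty(?~\s|error)price/

# lookahead form: consume one character at a time, checking the condition at each step
lookahead = /qty((?!\s|error).)*price/

absent_results = strs.map { |s| s.match?(absent) }
lookahead_results = strs.map { |s| s.match?(lookahead) }

p absent_results
p lookahead_results
```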
1450 |
1451 | **18)** Can you reason out why the following regular expressions behave differently?
1452 |
1453 | `\b` matches both the start and end of word locations. In the below example, `\b..\b` doesn't necessarily mean that the first `\b` will match only the start of word location and the second `\b` will match only the end of word location. They can be any combination! For example, `I` followed by space in the input string here is using the start of word location for both the conditions. Similarly, space followed by `2` is using the end of word location for both the conditions.
1454 |
1455 | In contrast, the negative lookarounds version ensures that there are no word characters around any two characters. Also, such assertions will always be satisfied at the start of string and the end of string respectively. But `\b` depends on the presence of word characters. For example, `!` at the end of the input string here matches the lookaround assertion but not word boundary.
1456 |
1457 | ```ruby
1458 | >> ip = 'I have 12, he has 2!'
1459 |
1460 | >> ip.gsub(/\b..\b/, '{\0}')
1461 | => "{I }have {12}{, }{he} has{ 2}!"
1462 |
1463 | >> ip.gsub(/(?<!\w)..(?!\w)/, '{\0}')
1464 | => "I have {12}, {he} has {2!}"
1465 | ```
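To make the point about `\b` matching both kinds of locations concrete, here is a small sketch that lists the match positions. It assumes that start-of-word and end-of-word can be expressed with the lookaround pairs shown; the names `word_start` and `word_end` are introduced here for illustration.

```ruby
ip = 'I have 12, he has 2!'

# start of word: no word character before, a word character after
word_start = /(?<!\w)(?=\w)/
# end of word: a word character before, no word character after
word_end = /(?<=\w)(?!\w)/

# collect the index of every zero-width match
starts = ip.gsub(word_start).map { $~.begin(0) }
ends = ip.gsub(word_end).map { $~.begin(0) }
boundaries = ip.gsub(/\b/).map { $~.begin(0) }

p starts
p ends
# \b matches at exactly the union of the two sets
p boundaries == (starts + ends).sort
```

Since `\b` cannot tell the two sets apart, `\b..\b` accepts any pairing of them, which is why `{I }` and `{ 2}` show up in the first output.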
1466 |
1467 | **19)** The given input strings have fields separated by the `:` character. Assume that each string has a minimum of two fields and cannot have empty fields. Extract all fields, but stop if a field with a digit character is found.
1468 |
1469 | ```ruby
1470 | >> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
1471 | >> row2 = 'um:no:low:3e:s4w:seer'
1472 | >> row3 = 'oh100:apple:banana:fig'
1473 | >> row4 = 'Dragon:Unicorn:Wizard-Healer'
1474 |
1475 | >> pat = /\G([^\d:]+)(?::|\z)/
1476 |
1477 | >> row1.gsub(pat).map { $1 }
1478 | => ["vast"]
1479 | >> row2.gsub(pat).map { $1 }
1480 | => ["um", "no", "low"]
1481 | >> row3.gsub(pat).map { $1 }
1482 | => []
1483 | >> row4.gsub(pat).map { $1 }
1484 | => ["Dragon", "Unicorn", "Wizard-Healer"]
1485 | ```
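One way to sanity-check the `\G` based solution is to compare it against a version without regexp anchoring tricks: split on `:` and keep fields until one contains a digit. This is only a cross-check sketch, and `fields_before_digit` is a hypothetical helper name.

```ruby
rows = ['vast:a2b2:ride:in:awe:b2b:3list:end',
        'um:no:low:3e:s4w:seer',
        'oh100:apple:banana:fig',
        'Dragon:Unicorn:Wizard-Healer']

pat = /\G([^\d:]+)(?::|\z)/

# hypothetical helper: plain split + take_while
def fields_before_digit(row)
  row.split(':').take_while { |f| !f.match?(/\d/) }
end

with_regexp = rows.map { |r| r.gsub(pat).map { $1 } }
with_split = rows.map { |r| fields_before_digit(r) }

p with_regexp
p with_split
```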
1486 |
1487 | **20)** The given input strings have fields separated by the `:` character. Extract all fields only after a field containing a digit character is found. Assume that each string has a minimum of two fields and cannot have empty fields.
1488 |
1489 | ```ruby
1490 | >> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
1491 | >> row2 = 'um:no:low:3e:s4w:seer'
1492 | >> row3 = 'oh100:apple:banana:fig'
1493 | >> row4 = 'Dragon:Unicorn:Wizard-Healer'
1494 |
1495 | >> pat = /(?:\d[^:]*|\G):\K[^:]+/
1496 |
1497 | >> row1.scan(pat)
1498 | => ["ride", "in", "awe", "b2b", "3list", "end"]
1499 | >> row2.scan(pat)
1500 | => ["s4w", "seer"]
1501 | >> row3.scan(pat)
1502 | => ["apple", "banana", "fig"]
1503 | >> row4.scan(pat)
1504 | => []
1505 | ```
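The same kind of cross-check works here: drop fields until one with a digit is found, then drop that field as well. A sketch, with `fields_after_digit` as a hypothetical helper name.

```ruby
rows = ['vast:a2b2:ride:in:awe:b2b:3list:end',
        'um:no:low:3e:s4w:seer',
        'oh100:apple:banana:fig',
        'Dragon:Unicorn:Wizard-Healer']

pat = /(?:\d[^:]*|\G):\K[^:]+/

# hypothetical helper: skip up to and including the first field with a digit
def fields_after_digit(row)
  row.split(':').drop_while { |f| !f.match?(/\d/) }.drop(1)
end

with_regexp = rows.map { |r| r.scan(pat) }
with_split = rows.map { |r| fields_after_digit(r) }

p with_regexp
p with_split
```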
1506 |
1507 | **21)** The given input string has comma separated fields and some of them can occur more than once. For the duplicated fields, retain only the rightmost one. Assume that there are no empty fields.
1508 |
1509 | ```ruby
1510 | >> row = '421,cat,2425,42,5,cat,6,6,42,61,6,6,scat,6,6,4,Cat,425,4'
1511 |
1512 | >> row.gsub(/(?<![^,])([^,]+),(?=.*(?<![^,])\1(?![^,]))/, '')
1513 | => "421,2425,5,cat,42,61,scat,6,Cat,425,4"
1514 | ```
1515 |
1516 |
1517 |
1518 | # Modifiers
1519 |
1520 | **1)** Remove from the first occurrence of `hat` to the last occurrence of `it` for the given input strings. Match these markers case insensitively.
1521 |
1522 | ```ruby
1523 | >> s1 = "But Cool THAT\nsee What okay\nwow quite"
1524 | >> s2 = 'it this hat is sliced HIT.'
1525 |
1526 | >> pat = /hat.*it/im
1527 |
1528 | >> s1.sub(pat, '')
1529 | => "But Cool Te"
1530 | >> s2.sub(pat, '')
1531 | => "it this ."
1532 | ```
1533 |
1534 | **2)** Delete from the string `start` if it is at the beginning of a line up to the next occurrence of the string `end` at the end of a line. Match these keywords irrespective of case.
1535 |
1536 | ```ruby
1537 | '> para = %q{good start
1538 | '> start working on that
1539 | '> project you always wanted
1540 | '> to, do not let it end
1541 | '> hi there
1542 | '> start and end the end
1543 | '> 42
1544 | '> Start and try to
1545 | '> finish the End
1546 | >> bye}
1547 |
1548 | >> pat = /^start.*?end$/im
1549 |
1550 | >> puts para.gsub(pat, '')
1551 | good start
1552 |
1553 | hi there
1554 |
1555 | 42
1556 |
1557 | bye
1558 | ```
1559 |
1560 | **3)** For the given *markdown* file, replace all occurrences of the string `ruby` (irrespective of case) with the string `Ruby`. However, any match within code blocks that start with the whole line ` ```ruby ` and end with the whole line ` ``` ` shouldn't be replaced. Consider the input file to be small enough to fit memory requirements.
1561 |
1562 | Refer to the [exercises folder](https://github.com/learnbyexample/Ruby_Regexp/tree/master/exercises) for input files required to solve this exercise.
1563 |
1564 | ```ruby
1565 | >> ip_str = File.open('sample.md').read
1566 | >> pat = /(^```ruby$.*?^```$)/m
1567 |
1568 | >> File.open('sample_mod.md', 'w') do |f|
1569 | ?> ip_str.split(pat).each_with_index do |s, i|
1570 | ?> f.write(i.odd? ? s : s.gsub(/ruby/i) { $&.capitalize })
1571 | >> end
1572 | >> end
1573 |
1574 | >> File.open('sample_mod.md').read == File.open('expected.md').read
1575 | => true
1576 | ```
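The solution relies on `split` keeping the separators when the pattern has a capture group, so the code blocks land at odd indexes. A small in-memory sketch of that behavior follows; the `sample` string is made up for illustration and is not the content of `sample.md`.

```ruby
# made-up input with one ruby code block between two prose lines
sample = "ruby intro\n```ruby\nputs 'ruby'\n```\nmore ruby\n"
pat = /(^```ruby$.*?^```$)/m

# the capture group makes split return the matched code block too
parts = sample.split(pat)
p parts

# even indexes are prose (replace), odd indexes are code blocks (leave as is)
result = parts.each_with_index.map do |s, i|
  i.odd? ? s : s.gsub(/ruby/i) { $&.capitalize }
end.join

puts result
```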
1577 |
1578 | **4)** Write a string method that changes the given input to alternate case (starting with lowercase first).
1579 |
1580 | ```ruby
1581 | ?> def aLtErNaTe_CaSe(ip_str)
1582 | ?> b = true
1583 | ?> return ip_str.gsub(/[a-z]/i) { (b = !b) ? $&.upcase : $&.downcase }
1584 | >> end
1585 |
1586 | >> aLtErNaTe_CaSe('HI THERE!')
1587 | => "hI tHeRe!"
1588 | >> aLtErNaTe_CaSe('good morning')
1589 | => "gOoD mOrNiNg"
1590 | >> aLtErNaTe_CaSe('Sample123string42with777numbers')
1591 | => "sAmPlE123sTrInG42wItH777nUmBeRs"
1592 | ```
1593 |
1594 | **5)** For the given input strings, match all of these three conditions:
1595 |
1596 | * `This` case sensitively
1597 | * `nice` and `cool` case insensitively
1598 |
1599 | ```ruby
1600 | >> s1 = 'This is nice and Cool'
1601 | >> s2 = 'Nice and cool this is'
1602 | >> s3 = 'What is so nice and cool about This?'
1603 | >> s4 = 'nice,cool,This'
1604 | >> s5 = 'not nice This?'
1605 | >> s6 = 'This is not cool'
1606 |
1607 | >> pat = /(?=.*nice)(?=.*cool)(?-i:.*This)/i
1608 |
1609 | >> s1.match?(pat)
1610 | => true
1611 | >> s2.match?(pat)
1612 | => false
1613 | >> s3.match?(pat)
1614 | => true
1615 | >> s4.match?(pat)
1616 | => true
1617 | >> s5.match?(pat)
1618 | => false
1619 | >> s6.match?(pat)
1620 | => false
1621 | ```
1622 |
1623 | **6)** For the given input strings, match if the string begins with `Th` and also contains a line that starts with `There`.
1624 |
1625 | ```ruby
1626 | >> s1 = "There there\nHave a cookie"
1627 | >> s2 = "This is a mess\nYeah?\nThereeeee"
1628 | >> s3 = "Oh\nThere goes the fun"
1629 | >> s4 = 'This is not\ngood\nno There'
1630 |
1631 | >> pat = /\A(?=Th)(?m:.*^There)/
1632 |
1633 | >> s1.match?(pat)
1634 | => true
1635 | >> s2.match?(pat)
1636 | => true
1637 | >> s3.match?(pat)
1638 | => false
1639 | >> s4.match?(pat)
1640 | => false
1641 | ```
1642 |
1643 |
1644 |
1645 | # Unicode
1646 |
1647 | **1)** Output `true` or `false` depending on whether the input string is made up entirely of ASCII characters. Consider the input to be non-empty strings and any character that isn't part of the 7-bit ASCII set should give `false`.
1648 |
1649 | ```ruby
1650 | >> str1 = '123—456'
1651 | >> str2 = 'good fοοd'
1652 | >> str3 = 'happy learning!'
1653 |
1654 | # can also use ! str1.match?(/[^\u{00}-\u{7f}]/)
1655 | >> str1.ascii_only?
1656 | => false
1657 | >> str2.ascii_only?
1658 | => false
1659 | >> str3.ascii_only?
1660 | => true
1661 | ```
1662 |
1663 | **2)** Retain only punctuation characters for the given strings (generated from codepoints). Use the Unicode character set definition for punctuation for solving this exercise.
1664 |
1665 | ```ruby
1666 | >> s1 = (0..0x7f).to_a.pack('U*')
1667 | >> s2 = (0x80..0xff).to_a.pack('U*')
1668 | >> s3 = (0x2600..0x27eb).to_a.pack('U*')
1669 |
1670 | >> pat = /\p{^P}/
1671 |
1672 | >> s1.gsub(pat, '')
1673 | => "!\"#%&'()*,-./:;?@[\\]_{}"
1674 | >> s2.gsub(pat, '')
1675 | => "¡§«¶·»¿"
1676 | >> s3.gsub(pat, '')
1677 | => "❨❩❪❫❬❭❮❯❰❱❲❳❴❵⟅⟆⟦⟧⟨⟩⟪⟫"
1678 | ```
1679 |
1680 | **3)** Explore the following Q&A threads.
1681 |
1682 | * [stackoverflow: remove emoji from string](https://stackoverflow.com/q/24672834/4082052)
1683 | * [stackoverflow: why am I seeing different results for these two nearly identical regexp](https://stackoverflow.com/q/13573136/4082052)
1684 | * [stackoverflow: convert unicode number to integer](https://stackoverflow.com/q/37338708/4082052)
1685 | * [stackoverflow: replacing %uXXXX to the corresponding unicode codepoint](https://stackoverflow.com/q/28773392/4082052)
1686 |
1687 |
--------------------------------------------------------------------------------
/exercises/Exercises.md:
--------------------------------------------------------------------------------
1 | # Exercises
2 |
3 | > Try to solve the exercises in every chapter using only the features discussed until that chapter. Some of the exercises will be easier to solve with techniques presented in the later chapters, but the aim of these exercises is to explore the features presented so far.
4 |
5 | > For solutions, see [Exercise_solutions.md](https://github.com/learnbyexample/Ruby_Regexp/blob/master/exercises/Exercise_solutions.md).
6 |
7 |
8 |
9 | # Regexp introduction
10 |
11 | **1)** Check whether the given strings contain `0xB0`. Display a boolean result as shown below.
12 |
13 | ```ruby
14 | >> line1 = 'start address: 0xA0, func1 address: 0xC0'
15 | >> line2 = 'end address: 0xFF, func2 address: 0xB0'
16 |
17 | >> line1.match?() ##### add your solution here
18 | => false
19 | >> line2.match?() ##### add your solution here
20 | => true
21 | ```
22 |
23 | **2)** Check if the given input strings contain `two` irrespective of case.
24 |
25 | ```ruby
26 | >> s1 = 'Their artwork is exceptional'
27 | >> s2 = 'one plus tw0 is not three'
28 | >> s3 = 'TRUSTWORTHY'
29 |
30 | >> pat1 = // ##### add your solution here
31 |
32 | >> pat1.match?(s1)
33 | => true
34 | >> pat1.match?(s2)
35 | => false
36 | >> pat1.match?(s3)
37 | => true
38 | ```
39 |
40 | **3)** Replace all occurrences of `5` with `five` for the given string.
41 |
42 | ```ruby
43 | >> ip = 'They ate 5 apples and 5 oranges'
44 |
45 | >> ip.gsub(//, 'five') ##### add your solution here
46 | => "They ate five apples and five oranges"
47 | ```
48 |
49 | **4)** Replace only the first occurrence of `5` with `five` for the given string.
50 |
51 | ```ruby
52 | >> ip = 'They ate 5 apples and 5 oranges'
53 |
54 | >> ip.sub(//, 'five') ##### add your solution here
55 | => "They ate five apples and 5 oranges"
56 | ```
57 |
58 | **5)** For the given array, filter all elements that do *not* contain `e`.
59 |
60 | ```ruby
61 | >> items = %w[goal new user sit eat dinner]
62 |
63 | >> items.grep_v(//) ##### add your solution here
64 | => ["goal", "sit"]
65 | ```
66 |
67 | **6)** Replace all occurrences of `note` irrespective of case with `X`.
68 |
69 | ```ruby
70 | >> ip = 'This note should not be NoTeD'
71 |
72 | >> ip.gsub(//, 'X') ##### add your solution here
73 | => "This X should not be XD"
74 | ```
75 |
76 | **7)** For the given input string, print all lines NOT containing the string `2`.
77 |
78 | ```ruby
79 | '> purchases = %q{items qty
80 | '> apple 24
81 | '> mango 50
82 | '> guava 42
83 | '> onion 31
84 | >> water 10}
85 |
86 | >> num = // ##### add your solution here
87 |
88 | >> puts purchases.each_line.grep_v(num)
89 | items qty
90 | mango 50
91 | onion 31
92 | water 10
93 | ```
94 |
95 | **8)** For the given array, filter all elements that contain either `a` or `w`.
96 |
97 | ```ruby
98 | >> items = %w[goal new user sit eat dinner]
99 |
100 | >> items.filter { } ##### add your solution here
101 | => ["goal", "new", "eat"]
102 | ```
103 |
104 | **9)** For the given array, filter all elements that contain both `e` and `n`.
105 |
106 | ```ruby
107 | >> items = %w[goal new user sit eat dinner]
108 |
109 | >> items.filter { } ##### add your solution here
110 | => ["new", "dinner"]
111 | ```
112 |
113 | **10)** For the given string, replace `0xA0` with `0x7F` and `0xC0` with `0x1F`.
114 |
115 | ```ruby
116 | >> ip = 'start address: 0xA0, func1 address: 0xC0'
117 |
118 | ##### add your solution here
119 | => "start address: 0x7F, func1 address: 0x1F"
120 | ```
121 |
122 | **11)** Find the starting index of the first occurrence of `is` for the given input string.
123 |
124 | ```ruby
125 | >> ip = 'match this after the history lesson'
126 |
127 | ##### add your solution here
128 | => 8
129 | ```
130 |
131 |
132 |
133 | # Anchors
134 |
135 | **1)** Check if the given strings start with `be`.
136 |
137 | ```ruby
138 | >> line1 = 'be nice'
139 | >> line2 = '"best!"'
140 | >> line3 = 'better?'
141 | >> line4 = 'oh no\nbear spotted'
142 |
143 | >> pat = ##### add your solution here
144 |
145 | >> pat.match?(line1)
146 | => true
147 | >> pat.match?(line2)
148 | => false
149 | >> pat.match?(line3)
150 | => true
151 | >> pat.match?(line4)
152 | => false
153 | ```
154 |
155 | **2)** For the given input string, change only the whole word `red` to `brown`.
156 |
157 | ```ruby
158 | >> words = 'bred red spread credible red.'
159 |
160 | >> words.gsub() ##### add your solution here
161 | => "bred brown spread credible brown."
162 | ```
163 |
164 | **3)** For the given input array, filter elements that contain `42` surrounded by word characters.
165 |
166 | ```ruby
167 | >> items = ['hi42bye', 'nice1423', 'bad42', 'cool_42a', '42fake', '_42_']
168 |
169 | >> items.grep() ##### add your solution here
170 | => ["hi42bye", "nice1423", "cool_42a", "_42_"]
171 | ```
172 |
173 | **4)** For the given input array, filter elements that start with `den` or end with `ly`.
174 |
175 | ```ruby
176 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']
177 |
178 | >> items.filter { } ##### add your solution here
179 | => ["lovely", "2 lonely", "dent"]
180 | ```
181 |
182 | **5)** For the given input string, change whole word `mall` to `1234` only if it is at the start of a line.
183 |
184 | ```ruby
185 | '> para = %q{(mall) call ball pall
186 | '> ball fall wall tall
187 | '> mall call ball pall
188 | '> wall mall ball fall
189 | '> mallet wallet malls
190 | >> mall:call:ball:pall}
191 |
192 | >> puts para.gsub() ##### add your solution here
193 | (mall) call ball pall
194 | ball fall wall tall
195 | 1234 call ball pall
196 | wall mall ball fall
197 | mallet wallet malls
198 | 1234:call:ball:pall
199 | ```
200 |
201 | **6)** For the given array, filter elements having a line starting with `den` or ending with `ly`.
202 |
203 | ```ruby
204 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']
205 |
206 | >> items.filter { } ##### add your solution here
207 | => ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
208 | ```
209 |
210 | **7)** For the given input array, filter all whole elements `12\nthree` irrespective of case.
211 |
212 | ```ruby
213 | >> items = ["12\nthree\n", "12\nThree", "12\nthree\n4", "12\nthree"]
214 |
215 | >> items.grep() ##### add your solution here
216 | => ["12\nThree", "12\nthree"]
217 | ```
218 |
219 | **8)** For the given input array, replace `hand` with `X` for all elements that start with `hand` followed by at least one word character.
220 |
221 | ```ruby
222 | >> items = %w[handed hand handy unhanded handle hand-2]
223 |
224 | >> items.map { } ##### add your solution here
225 | => ["Xed", "hand", "Xy", "unhanded", "Xle", "hand-2"]
226 | ```
227 |
228 | **9)** For the given input array, filter all elements starting with `h`. Additionally, replace `e` with `X` for these filtered elements.
229 |
230 | ```ruby
231 | >> items = %w[handed hand handy unhanded handle hand-2]
232 |
233 | >> items.filter_map { } ##### add your solution here
234 | => ["handXd", "hand", "handy", "handlX", "hand-2"]
235 | ```
236 |
237 |
238 |
239 | # Alternation and Grouping
240 |
241 | **1)** For the given input array, filter all elements that start with `den` or end with `ly`.
242 |
243 | ```ruby
244 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']
245 |
246 | >> items.grep() ##### add your solution here
247 | => ["lovely", "2 lonely", "dent"]
248 | ```
249 |
250 | **2)** For the given array, filter elements having a line starting with `den` or ending with `ly`.
251 |
252 | ```ruby
253 | >> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']
254 |
255 | >> items.grep() ##### add your solution here
256 | => ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
257 | ```
258 |
259 | **3)** For the given strings, replace all occurrences of `removed` or `reed` or `received` or `refused` with `X`.
260 |
261 | ```ruby
262 | >> s1 = 'creed refuse removed read'
263 | >> s2 = 'refused reed redo received'
264 |
265 | >> pat = ##### add your solution here
266 |
267 | >> s1.gsub(pat, 'X')
268 | => "cX refuse X read"
269 | >> s2.gsub(pat, 'X')
270 | => "X X redo X"
271 | ```
272 |
273 | **4)** For the given strings, replace all matches from the array `words` with `A`.
274 |
275 | ```ruby
276 | >> s1 = 'plate full of slate'
277 | >> s2 = "slated for later, don't be late"
278 | >> words = %w[late later slated]
279 |
280 | >> pat = ##### add your solution here
281 |
282 | >> s1.gsub(pat, 'A')
283 | => "pA full of sA"
284 | >> s2.gsub(pat, 'A')
285 | => "A for A, don't be A"
286 | ```
287 |
288 | **5)** Filter all whole elements from the input array `items` that exactly match any of the elements present in the array `words`.
289 |
290 | ```ruby
291 | >> items = ['slate', 'later', 'plate', 'late', 'slates', 'slated ']
292 | >> words = %w[late later slated]
293 |
294 | >> pat = ##### add your solution here
295 |
296 | >> items.grep(pat)
297 | => ["later", "late"]
298 | ```
299 |
300 |
301 |
302 | # Escaping metacharacters
303 |
304 | **1)** Transform the given input strings to the expected output using the same logic on both strings.
305 |
306 | ```ruby
307 | >> str1 = '(9-2)*5+qty/3-(9-2)*7'
308 | >> str2 = '(qty+4)/2-(9-2)*5+pq/4'
309 |
310 | >> str1.gsub() ##### add your solution here
311 | => "35+qty/3-(9-2)*7"
312 | >> str2.gsub() ##### add your solution here
313 | => "(qty+4)/2-35+pq/4"
314 | ```
315 |
316 | **2)** Replace `(4)\|` with `2` only at the start or end of the given input strings.
317 |
318 | ```ruby
319 | >> s1 = '2.3/(4)\|6 fig 5.3-(4)\|'
320 | >> s2 = '(4)\|42 - (4)\|3'
321 | >> s3 = "two - (4)\\|\n"
322 |
323 | >> pat = ##### add your solution here
324 |
325 | >> s1.gsub(pat, '2')
326 | => "2.3/(4)\\|6 fig 5.3-2"
327 | >> s2.gsub(pat, '2')
328 | => "242 - (4)\\|3"
329 | >> s3.gsub(pat, '2')
330 | => "two - (4)\\|\n"
331 | ```
332 |
333 | **3)** Replace any matching item from the given array with `X` for the given input strings. Match the elements from `items` literally. Assume no two elements of `items` will result in any matching conflict.
334 |
335 | ```ruby
336 | >> items = ['a.b', '3+n', 'x\y\z', 'qty||price', '{n}']
337 |
338 | >> pat = ##### add your solution here
339 |
340 | >> '0a.bcd'.gsub(pat, 'X')
341 | => "0Xcd"
342 | >> 'E{n}AMPLE'.gsub(pat, 'X')
343 | => "EXAMPLE"
344 | >> '43+n2 ax\y\ze'.gsub(pat, 'X')
345 | => "4X2 aXe"
346 | ```
347 |
348 | **4)** Replace the backspace character `\b` with a single space character for the given input string.
349 |
350 | ```ruby
351 | >> ip = "123\b456"
352 | >> puts ip
353 | 12456
354 |
355 | >> ip.gsub() ##### add your solution here
356 | => "123 456"
357 | ```
358 |
359 | **5)** Replace all occurrences of `\o` with `o`.
360 |
361 | ```ruby
362 | >> ip = 'there are c\omm\on aspects am\ong the alternati\ons'
363 |
364 | >> ip.gsub() ##### add your solution here
365 | => "there are common aspects among the alternations"
366 | ```
367 |
368 | **6)** Replace any matching item from the array `eqns` with `X` for the given string `ip`. Match the items from `eqns` literally.
369 |
370 | ```ruby
371 | >> ip = '3-(a^b)+2*(a^b)-(a/b)+3'
372 | >> eqns = %w[(a^b) (a/b) (a^b)+2]
373 |
374 | >> pat = ##### add your solution here
375 |
376 | >> ip.gsub(pat, 'X')
377 | => "3-X*X-X+3"
378 | ```
379 |
380 |
381 |
382 | # Dot metacharacter and Quantifiers
383 |
384 | > Since the `.` metacharacter doesn't match newline characters by default, assume that the input strings in the following exercises will not contain newline characters.
385 |
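The note above can be checked with a minimal sketch. The string `sample` is an illustrative assumption and is not one of the exercise inputs. It shows that `.` skips newline characters by default and matches them only when the `m` modifier is added.

```ruby
# Illustrative string, not taken from the exercises below
sample = "hi\nbye"

# By default, `.` matches any character except a newline
default_match = sample.match?(/hi.bye/)

# With the `m` modifier, `.` matches newline characters as well
multiline_match = sample.match?(/hi.bye/m)

puts default_match    # false
puts multiline_match  # true
```
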
386 | **1)** Replace `42//5` or `42/5` with `8` for the given input.
387 |
388 | ```ruby
389 | >> ip = 'a+42//5-c pressure*3+42/5-14256'
390 |
391 | >> ip.gsub() ##### add your solution here
392 | => "a+8-c pressure*3+8-14256"
393 | ```
394 |
395 | **2)** For the array `items`, filter all elements starting with `hand` and ending immediately with at most one more character or `le`.
396 |
397 | ```ruby
398 | >> items = %w[handed hand handled handy unhand hands handle]
399 |
400 | >> items.grep() ##### add your solution here
401 | => ["hand", "handy", "hands", "handle"]
402 | ```
403 |
404 | **3)** Use the `split` method to get the output as shown for the given input strings.
405 |
406 | ```ruby
407 | >> eqn1 = 'a+42//5-c'
408 | >> eqn2 = 'pressure*3+42/5-14256'
409 | >> eqn3 = 'r*42-5/3+42///5-42/53+a'
410 |
411 | >> pat = ##### add your solution here
412 |
413 | >> eqn1.split(pat)
414 | => ["a+", "-c"]
415 | >> eqn2.split(pat)
416 | => ["pressure*3+", "-14256"]
417 | >> eqn3.split(pat)
418 | => ["r*42-5/3+42///5-", "3+a"]
419 | ```
420 |
421 | **4)** For the given input strings, remove everything from the first occurrence of `i` till the end of the string.
422 |
423 | ```ruby
424 | >> s1 = 'remove the special meaning of such constructs'
425 | >> s2 = 'characters while constructing'
426 | >> s3 = 'input output'
427 |
428 | >> pat = ##### add your solution here
429 |
430 | >> s1.sub(pat, '')
431 | => "remove the spec"
432 | >> s2.sub(pat, '')
433 | => "characters wh"
434 | >> s3.sub(pat, '')
435 | => ""
436 | ```
437 |
438 | **5)** For the given strings, construct a regexp to get the output as shown below.
439 |
440 | ```ruby
441 | >> str1 = 'a+b(addition)'
442 | >> str2 = 'a/b(division) + c%d(#modulo)'
443 | >> str3 = 'Hi there(greeting). Nice day(a(b)'
444 |
445 | >> remove_parentheses = ##### add your solution here
446 |
447 | >> str1.gsub(remove_parentheses, '')
448 | => "a+b"
449 | >> str2.gsub(remove_parentheses, '')
450 | => "a/b + c%d"
451 | >> str3.gsub(remove_parentheses, '')
452 | => "Hi there. Nice day"
453 | ```
454 |
455 | **6)** Correct the given regexp to get the expected output.
456 |
457 | ```ruby
458 | >> words = 'plink incoming tint winter in caution sentient'
459 |
460 | # wrong output
461 | >> change = /int|in|ion|ing|inco|inter|ink/
462 | >> words.gsub(change, 'X')
463 | => "plXk XcomXg tX wXer X cautX sentient"
464 |
465 | # expected output
466 | >> change = ##### add your solution here
467 | >> words.gsub(change, 'X')
468 | => "plX XmX tX wX X cautX sentient"
469 | ```
470 |
471 | **7)** For the given greedy quantifiers, what would be the equivalent form using the `{m,n}` representation?
472 |
473 | * `?` is same as
474 | * `*` is same as
475 | * `+` is same as
476 |
477 | **8)** `(a*|b*)` is same as `(a|b)*` — true or false?
478 |
479 | **9)** For the given input strings, remove everything from the first occurrence of `test` (irrespective of case) till the end of the string, provided `test` isn't at the end of the string.
480 |
481 | ```ruby
482 | >> s1 = 'this is a Test'
483 | >> s2 = 'always test your RE for corner cases'
484 | >> s3 = 'a TEST of skill tests?'
485 |
486 | >> pat = ##### add your solution here
487 |
488 | >> s1.sub(pat, '')
489 | => "this is a Test"
490 | >> s2.sub(pat, '')
491 | => "always "
492 | >> s3.sub(pat, '')
493 | => "a "
494 | ```
495 |
496 | **10)** For the input array `words`, filter all elements starting with `s` and containing `e` and `t` in any order.
497 |
498 | ```ruby
499 | >> words = ['sequoia', 'subtle', 'exhibit', 'a set', 'sets', 'tests', 'site']
500 |
501 | >> words.grep() ##### add your solution here
502 | => ["subtle", "sets", "site"]
503 | ```
504 |
505 | **11)** For the input array `words`, remove all elements having less than `6` characters.
506 |
507 | ```ruby
508 | >> words = %w[sequoia subtle exhibit asset sets tests site]
509 |
510 | >> words.grep() ##### add your solution here
511 | => ["sequoia", "subtle", "exhibit"]
512 | ```
513 |
514 | **12)** For the input array `words`, filter all elements starting with `s` or `t` and having a maximum of `6` characters.
515 |
516 | ```ruby
517 | >> words = ['sequoia', 'subtle', 'exhibit', 'asset', 'sets', 't set', 'site']
518 |
519 | >> words.grep() ##### add your solution here
520 | => ["subtle", "sets", "t set", "site"]
521 | ```
522 |
523 | **13)** Can you reason out why this code results in the output shown? The aim was to remove all `<characters>` patterns but not the `<>` ones. The expected result was `'a 1<> b 2<> c'`.
524 |
525 | ```ruby
526 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
527 |
528 | >> ip.gsub(/<.+?>/, '')
529 | => "a 1 2"
530 | ```
531 |
532 | **14)** Use the `split` method to get the output as shown below for the given input strings.
533 |
534 | ```ruby
535 | >> s1 = 'go there :: this :: that'
536 | >> s2 = 'a::b :: c::d e::f :: 4::5'
537 | >> s3 = '42:: hi::bye::see :: carefully'
538 |
539 | >> pat = ##### add your solution here
540 |
541 | >> s1.split(pat, 2)
542 | => ["go there", "this :: that"]
543 | >> s2.split(pat, 2)
544 | => ["a::b", "c::d e::f :: 4::5"]
545 | >> s3.split(pat, 2)
546 | => ["42:: hi::bye::see", "carefully"]
547 | ```
548 |
549 | **15)** For the given input strings, match if the string starts with optional space characters followed by at least two `#` characters.
550 |
551 | ```ruby
552 | >> s1 = ' ## header2'
553 | >> s2 = '#### header4'
554 | >> s3 = '# comment'
555 | >> s4 = 'normal string'
556 | >> s5 = 'nope ## not this'
557 |
558 | >> pat = ##### add your solution here
559 |
560 | >> s1.match?(pat)
561 | => true
562 | >> s2.match?(pat)
563 | => true
564 | >> s3.match?(pat)
565 | => false
566 | >> s4.match?(pat)
567 | => false
568 | >> s5.match?(pat)
569 | => false
570 | ```
571 |
572 | **16)** Modify the given regular expression such that it gives the expected results.
573 |
574 | ```ruby
575 | >> s1 = 'appleabcabcabcapricot'
576 | >> s2 = 'bananabcabcabcdelicious'
577 |
578 | # wrong output
579 | >> pat = /(abc)+a/
580 | >> pat.match?(s1)
581 | => true
582 | >> pat.match?(s2)
583 | => true
584 |
585 | # expected output
586 | # 'abc' shouldn't be considered when trying to match 'a' at the end
587 | >> pat = ##### add your solution here
588 | >> pat.match?(s1)
589 | => true
590 | >> pat.match?(s2)
591 | => false
592 | ```
593 |
594 |
595 |
596 | # Working with matched portions
597 |
598 | **1)** For the given strings, extract the matching portion from the first `is` to the last `t`.
599 |
600 | ```ruby
601 | >> str1 = 'This the biggest fruit you have seen?'
602 | >> str2 = 'Your mission is to read and practice consistently'
603 |
604 | >> pat = ##### add your solution here
605 |
606 | ##### add your solution here for str1
607 | => "is the biggest fruit"
608 | ##### add your solution here for str2
609 | => "ission is to read and practice consistent"
610 | ```
611 |
612 | **2)** Find the starting index of the first occurrence of `is` or `the` or `was` or `to` for the given input strings.
613 |
614 | ```ruby
615 | >> s1 = 'match after the last newline character'
616 | >> s2 = 'and then you want to test'
617 | >> s3 = 'this is good bye then'
618 | >> s4 = 'who was there to see?'
619 |
620 | >> pat = ##### add your solution here
621 |
622 | ##### add your solution here for s1
623 | => 12
624 | ##### add your solution here for s2
625 | => 4
626 | ##### add your solution here for s3
627 | => 2
628 | ##### add your solution here for s4
629 | => 4
630 | ```
631 |
632 | **3)** Find the starting index of the last occurrence of `is` or `the` or `was` or `to` for the given input strings.
633 |
634 | ```ruby
635 | >> s1 = 'match after the last newline character'
636 | >> s2 = 'and then you want to test'
637 | >> s3 = 'this is good bye then'
638 | >> s4 = 'who was there to see?'
639 |
640 | >> pat = ##### add your solution here
641 |
642 | ##### add your solution here for s1
643 | => 12
644 | ##### add your solution here for s2
645 | => 18
646 | ##### add your solution here for s3
647 | => 17
648 | ##### add your solution here for s4
649 | => 14
650 | ```
651 |
652 | **4)** Extract everything after the `:` character, which occurs only once in the input.
653 |
654 | ```ruby
655 | >> ip = 'fruits:apple, mango, guava, blueberry'
656 |
657 | ##### add your solution here
658 | => "apple, mango, guava, blueberry"
659 | ```
660 |
661 | **5)** The given input strings contain some text followed by `-` followed by a number. Replace that number with its `log` value using `Math.log()`.
662 |
663 | ```ruby
664 | >> s1 = 'first-3.14'
665 | >> s2 = 'next-123'
666 |
667 | >> pat = ##### add your solution here
668 |
669 | ##### add your solution here for s1
670 | => "first-1.144222799920162"
671 | ##### add your solution here for s2
672 | => "next-4.812184355372417"
673 | ```
674 |
675 | **6)** Replace all occurrences of `par` with `spar`, `spare` with `extra` and `park` with `garden` for the given input strings.
676 |
677 | ```ruby
678 | >> str1 = 'apartment has a park'
679 | >> str2 = 'do you have a spare cable'
680 | >> str3 = 'write a parser'
681 |
682 | ##### add your solution here for str1
683 | => "aspartment has a garden"
684 | ##### add your solution here for str2
685 | => "do you have a extra cable"
686 | ##### add your solution here for str3
687 | => "write a sparser"
688 | ```
689 |
690 | **7)** Extract all words between `(` and `)` from the given input string as an array. Assume that the input will not contain any broken parentheses.
691 |
692 | ```ruby
693 | >> ip = 'another (way) to reuse (portion) matched (by) capture groups'
694 |
695 | # as nested array
696 | ##### add your solution here
697 | => [["way"], ["portion"], ["by"]]
698 |
699 | # as array of strings
700 | ##### add your solution here
701 | => ["way", "portion", "by"]
702 | ```
703 |
704 | **8)** Extract all occurrences of `<` up to the next occurrence of `>`, provided there is at least one character in between `<` and `>`.
705 |
706 | ```ruby
707 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
708 |
709 | ##### add your solution here
710 | => ["<apple>", "<> b<bye>", "<> c<cat>"]
711 | ```
712 |
713 | **9)** Use `scan` to get the output as shown below for the given input strings. Note the characters used in the input strings carefully.
714 |
715 | ```ruby
716 | >> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
717 | >> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '
718 |
719 | >> pat = ##### add your solution here
720 |
721 | >> row1.scan(pat)
722 | => [["-2", "5"], ["4", "+3"], ["+42", "-53"], ["4356246", "-357532354"]]
723 | >> row2.scan(pat)
724 | => [["1.32", "-3.14"], ["634", "5.63"], ["63.3e3", "9907809345343.235"]]
725 | ```
726 |
727 | **10)** This is an extension to the previous question.
728 |
729 | * For `row1`, find the sum of integers of each array element. For example, sum of `-2` and `5` is `3`.
730 | * For `row2`, find the sum of floating-point numbers of each array element. For example, sum of `1.32` and `-3.14` is `-1.82`.
731 |
732 | ```ruby
733 | >> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
734 | >> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '
735 |
736 | # should be same as the previous question
737 | >> pat = ##### add your solution here
738 |
739 | ##### add your solution here for row1
740 | => [3, 7, -11, -353176108]
741 |
742 | ##### add your solution here for row2
743 | => [-1.82, 639.63, 9907809408643.234]
744 | ```
745 |
746 | **11)** Use the `split` method to get the output as shown below.
747 |
748 | ```ruby
749 | >> ip = '42:no-output;1000:car-tr:u-ck;SQEX49801'
750 |
751 | >> ip.split() ##### add your solution here
752 | => ["42", "output", "1000", "tr:u-ck", "SQEX49801"]
753 | ```
754 |
755 | **12)** Convert the comma separated strings to corresponding `hash` objects as shown below. Note that the input strings have an extra `,` at the end.
756 |
757 | ```ruby
758 | >> row1 = 'name:rohan,maths:75,phy:89,'
759 | >> row2 = 'name:rose,maths:88,phy:92,'
760 |
761 | >> pat = ##### add your solution here
762 |
763 | ##### add your solution here for row1
764 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
765 | ##### add your solution here for row2
766 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
767 | ```
768 |
769 |
770 |
771 | # Character class
772 |
773 | **1)** For the array `items`, filter all elements starting with `hand` and ending immediately with `s` or `y` or `le`.
774 |
775 | ```ruby
776 | >> items = %w[-handy hand handy unhand hands hand-icy handle]
777 |
778 | ##### add your solution here
779 | => ["handy", "hands", "handle"]
780 | ```
781 |
782 | **2)** Replace all whole words `reed` or `read` or `red` with `X`.
783 |
784 | ```ruby
785 | >> ip = 'redo red credible :read: rod reed'
786 |
787 | ##### add your solution here
788 | => "redo X credible :X: rod X"
789 | ```
790 |
791 | **3)** For the array `words`, filter all elements containing `e` or `i` followed by `l` or `n`. Note that the order mentioned should be followed.
792 |
793 | ```ruby
794 | >> words = %w[surrender unicorn newer door empty eel pest]
795 |
796 | ##### add your solution here
797 | => ["surrender", "unicorn", "eel"]
798 | ```
799 |
800 | **4)** For the array `words`, filter all elements containing `e` or `i` and `l` or `n` in any order.
801 |
802 | ```ruby
803 | >> words = %w[surrender unicorn newer door empty eel pest]
804 |
805 | ##### add your solution here
806 | => ["surrender", "unicorn", "newer", "eel"]
807 | ```
808 |
809 | **5)** Convert the comma separated strings to corresponding `hash` objects as shown below.
810 |
811 | ```ruby
812 | >> row1 = 'name:rohan,maths:75,phy:89'
813 | >> row2 = 'name:rose,maths:88,phy:92'
814 |
815 | >> pat = ##### add your solution here
816 |
817 | ##### add your solution here for row1
818 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
819 | ##### add your solution here for row2
820 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
821 | ```
822 |
823 | **6)** Delete from `(` to the next occurrence of `)` unless they contain parentheses characters in between.
824 |
825 | ```ruby
826 | >> str1 = 'def factorial()'
827 | >> str2 = 'a/b(division) + c%d(#modulo) - (e+(j/k-3)*4)'
828 | >> str3 = 'Hi there(greeting). Nice day(a(b)'
829 |
830 | >> remove_parentheses = ##### add your solution here
831 |
832 | >> str1.gsub(remove_parentheses, '')
833 | => "def factorial"
834 | >> str2.gsub(remove_parentheses, '')
835 | => "a/b + c%d - (e+*4)"
836 | >> str3.gsub(remove_parentheses, '')
837 | => "Hi there. Nice day(a"
838 | ```
839 |
840 | **7)** For the array `words`, filter all elements not starting with `e` or `p` or `u`.
841 |
842 | ```ruby
843 | >> words = %w[surrender unicorn newer door empty eel (pest)]
844 |
845 | ##### add your solution here
846 | => ["surrender", "newer", "door", "(pest)"]
847 | ```
848 |
849 | **8)** For the array `words`, filter all elements not containing `u` or `w` or `ee` or `-`.
850 |
851 | ```ruby
852 | >> words = %w[p-t you tea heel owe new reed ear]
853 |
854 | ##### add your solution here
855 | => ["tea", "ear"]
856 | ```
857 |
858 | **9)** The given input strings contain fields separated by `,` and fields can be empty too. Replace the last three fields with `WHTSZ323`.
859 |
860 | ```ruby
861 | >> row1 = '(2),kite,12,,D,C,,'
862 | >> row2 = 'hi,bye,sun,moon'
863 |
864 | >> pat = ##### add your solution here
865 |
866 | ##### add your solution here for row1
867 | => "(2),kite,12,,D,WHTSZ323"
868 | ##### add your solution here for row2
869 | => "hi,WHTSZ323"
870 | ```
871 |
872 | **10)** Split the given strings based on a consecutive sequence of digit or whitespace characters.
873 |
874 | ```ruby
875 | >> str1 = "lion \t Ink32onion Nice"
876 | >> str2 = "**1\f2\n3star\t7 77\r**"
877 |
878 | >> pat = ##### add your solution here
879 |
880 | >> str1.split(pat)
881 | => ["lion", "Ink", "onion", "Nice"]
882 | >> str2.split(pat)
883 | => ["**", "star", "**"]
884 | ```
885 |
886 | **11)** Delete all occurrences of the sequence `<characters>` where `characters` is one or more non `>` characters and cannot be empty.
887 |
888 | ```ruby
889 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'
890 |
891 | ##### add your solution here
892 | => "a 1<> b 2<> c"
893 | ```
894 |
895 | **12)** `\b[a-z](on|no)[a-z]\b` is same as `\b[a-z][on]{2}[a-z]\b`. True or False? Sample input lines shown below might help to understand the differences, if any.
896 |
897 | ```ruby
898 | >> puts "known\nmood\nknow\npony\ninns"
899 | known
900 | mood
901 | know
902 | pony
903 | inns
904 | ```
905 |
906 | **13)** For the given array, filter elements containing any number sequence greater than `624`.
907 |
908 | ```ruby
909 | >> items = ['h0000432ab', 'car00625', '42_624 0512', '96 foo1234baz 3.14 2']
910 |
911 | ##### add your solution here
912 | => ["car00625", "96 foo1234baz 3.14 2"]
913 | ```
914 |
915 | **14)** Count the maximum depth of nested braces for the given strings. Unbalanced or wrongly ordered braces should return `-1`. Note that this will require a mix of regular expressions and Ruby code.
916 |
917 | ```ruby
918 | ?> def max_nested_braces(ip)
919 | ##### add your solution here
920 | >> end
921 |
922 | >> max_nested_braces('a*b')
923 | => 0
924 | >> max_nested_braces('}a+b{')
925 | => -1
926 | >> max_nested_braces('a*b+{}')
927 | => 1
928 | >> max_nested_braces('{{a+2}*{b+c}+e}')
929 | => 2
930 | >> max_nested_braces('{{a+2}*{b+{c*d}}+e}')
931 | => 3
932 | >> max_nested_braces("{{a+2}*{\n{b+{c*d}}+e*d}}")
933 | => 4
934 | >> max_nested_braces('a*{b+c*{e*3.14}}}')
935 | => -1
936 | ```
937 |
938 | **15)** By default, the `split` method will split on whitespace and remove empty strings from the result. Which regexp based method would you use to replicate this functionality?
939 |
940 | ```ruby
941 | >> ip = " \t\r so pole\t\t\t\n\nlit in to \r\n\v\f "
942 |
943 | >> ip.split
944 | => ["so", "pole", "lit", "in", "to"]
945 |
946 | ##### add your solution here
947 | => ["so", "pole", "lit", "in", "to"]
948 | ```
949 |
950 | **16)** Convert the given input string to two different arrays as shown below. You can optimize the regexp based on characters present in the input string.
951 |
952 | ```ruby
953 | >> ip = "price_42 roast^\t\n^-ice==cat\neast"
954 |
955 | ##### add your solution here
956 | => ["price_42", "roast", "ice", "cat", "east"]
957 |
958 | ##### add your solution here
959 | => ["price_42", " ", "roast", "^\t\n^-", "ice", "==", "cat", "\n", "east"]
960 | ```
961 |
962 | **17)** Filter all elements whose first non-whitespace character is not a `#` character. Any element made up of only whitespace characters should be ignored as well.
963 |
964 | ```ruby
965 | >> items = [' #comment', "\t\napple #42", '#oops', 'sure', 'no#1', "\t\r\f"]
966 |
967 | ##### add your solution here
968 | => ["\t\napple #42", "sure", "no#1"]
969 | ```
970 |
971 | **18)** Extract all whole words for the given input strings. However, based on user input `ignore`, do not match words if they contain any character present in the `ignore` variable. Assume that `ignore` variable will not contain any regexp metacharacters.
972 |
973 | ```ruby
974 | >> s1 = 'match after the last newline character'
975 | >> s2 = 'and then you want to test'
976 |
977 | >> ignore = 'aty'
978 | >> pat = ##### add your solution here
979 | >> s1.scan(pat)
980 | => ["newline"]
981 | >> s2.scan(pat)
982 | => []
983 |
984 | >> ignore = 'esw'
985 | >> pat = ##### add your solution here
986 | >> s1.scan(pat)
987 | => ["match"]
988 | >> s2.scan(pat)
989 | => ["and", "you", "to"]
990 | ```
991 |
992 | **19)** Filter all whole elements with optional whitespaces at the start followed by three to five non-digit characters. Whitespaces at the start should not be part of the calculation for non-digit characters.
993 |
994 | ```ruby
995 | >> items = ["\t \ncat", 'goal', ' oh', 'he-he', 'goal2', 'ok ', 'sparrow']
996 |
997 | ##### add your solution here
998 | => ["\t \ncat", "goal", "he-he", "ok "]
999 | ```
1000 |
1001 | **20)** Modify the given regexp such that it gives the expected result.
1002 |
1003 | ```ruby
1004 | >> ip = '( S:12 E:5 S:4 and E:123 ok S:100 & E:10 S:1 - E:2 S:42 E:43 )'
1005 |
1006 | # wrong output
1007 | >> ip.scan(/S:\d+.*?E:\d{2,}/)
1008 | => ["S:12 E:5 S:4 and E:123", "S:100 & E:10", "S:1 - E:2 S:42 E:43"]
1009 |
1010 | # expected output
1011 | ##### add your solution here
1012 | => ["S:4 and E:123", "S:100 & E:10", "S:42 E:43"]
1013 | ```
1014 |
1015 |
1016 |
1017 | # Groupings and backreferences
1018 |
1019 | **1)** Replace the space character that occurs after a word ending with `a` or `r` with a newline character.
1020 |
1021 | ```ruby
1022 | >> ip = 'area not a _a2_ roar took 22'
1023 |
1024 | >> puts ip.gsub() ##### add your solution here
1025 | area
1026 | not a
1027 | _a2_ roar
1028 | took 22
1029 | ```
1030 |
1031 | **2)** Add `[]` around words starting with `s` and containing `e` and `t` in any order.
1032 |
1033 | ```ruby
1034 | >> ip = 'sequoia subtle exhibit asset sets2 tests si_te'
1035 |
1036 | ##### add your solution here
1037 | => "sequoia [subtle] exhibit asset [sets2] tests [si_te]"
1038 | ```
1039 |
1040 | **3)** Replace all whole words with `X` that start and end with the same word character (irrespective of case). Single character word should get replaced with `X` too, as it satisfies the stated condition.
1041 |
1042 | ```ruby
1043 | >> ip = 'oreo not a _a2_ Roar took 22'
1044 |
1045 | ##### add your solution here
1046 | => "X not X X X took X"
1047 | ```
1048 |
1049 | **4)** Convert the given *markdown* headers to corresponding *anchor* tags. Consider the input to start with one or more `#` characters followed by space and word characters. The `name` attribute is constructed by converting the header to lowercase and replacing spaces with hyphens. Can you do it without using a capture group?
1050 |
1051 | ```ruby
1052 | >> header1 = '# Regular Expressions'
1053 | >> header2 = '## Named capture groups'
1054 |
1055 | >> anchor = ##### add your solution here
1056 |
1057 | ##### add your solution here for header1
1058 | => "# <a name='regular-expressions'></a>Regular Expressions"
1059 | ##### add your solution here for header2
1060 | => "## <a name='named-capture-groups'></a>Named capture groups"
1061 | ```
1062 |
1063 | **5)** Convert the given *markdown* anchors to corresponding *hyperlinks*.
1064 |
1065 | ```ruby
1066 | >> anchor1 = "# <a name='regular-expressions'></a>Regular Expressions"
1067 | >> anchor2 = "## <a name='subexpression-calls'></a>Subexpression calls"
1068 |
1069 | >> hyperlink = ##### add your solution here
1070 |
1071 | ##### add your solution here for anchor1
1072 | => "[Regular Expressions](#regular-expressions)"
1073 | ##### add your solution here for anchor2
1074 | => "[Subexpression calls](#subexpression-calls)"
1075 | ```
1076 |
1077 | **6)** Count the number of whole words that have at least two occurrences of consecutive repeated alphabets. For example, words like `stillness` and `Committee` should be counted but not words like `root` or `readable` or `rotational`.
1078 |
1079 | ```ruby
1080 | '> ip = %q{oppressed abandon accommodation bloodless
1081 | '> carelessness committed apparition innkeeper
1082 | '> occasionally afforded embarrassment foolishness
1083 | '> depended successfully succeeded
1084 | >> possession cleanliness suppress}
1085 |
1086 | ##### add your solution here
1087 | => 13
1088 | ```
1089 |
1090 | **7)** For the given input string, replace all occurrences of digit sequences with only the unique non-repeating sequence. For example, `232323` should be changed to `23` and `897897` should be changed to `897`. If there are no repeats (for example `1234`) or if the repeats end prematurely (for example `12121`), it should not be changed.
1091 |
1092 | ```ruby
1093 | >> ip = '1234 2323 453545354535 9339 11 60260260'
1094 |
1095 | ##### add your solution here
1096 | => "1234 23 4535 9339 1 60260260"
1097 | ```
1098 |
1099 | **8)** Replace sequences made up of words separated by `:` or `.` by the first word of the sequence. Such sequences will end when `:` or `.` is not followed by a word character.
1100 |
1101 | ```ruby
1102 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1103 |
1104 | ##### add your solution here
1105 | => "wow hi-2 bye kite"
1106 | ```
1107 |
1108 | **9)** Replace sequences made up of words separated by `:` or `.` by the last word of the sequence. Such sequences will end when `:` or `.` is not followed by a word character.
1109 |
1110 | ```ruby
1111 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1112 |
1113 | ##### add your solution here
1114 | => "five hi-2 bye water"
1115 | ```
1116 |
1117 | **10)** Split the given input string on one or more repeated sequences of `cat`.
1118 |
1119 | ```ruby
1120 | >> ip = 'firecatlioncatcatcatbearcatcatparrot'
1121 |
1122 | ##### add your solution here
1123 | => ["fire", "lion", "bear", "parrot"]
1124 | ```
1125 |
1126 | **11)** For the given input string, find all occurrences of digit sequences with at least one repeating sequence. For example, `232323` and `897897`. If the repeats end prematurely, for example `12121`, it should not be matched.
1127 |
1128 | ```ruby
1129 | >> ip = '1234 2323 453545354535 9339 11 60260260'
1130 |
1131 | >> pat = ##### add your solution here
1132 |
1133 | # entire sequences in the output
1134 | ##### add your solution here
1135 | => ["2323", "453545354535", "11"]
1136 |
1137 | # only the unique sequence in the output
1138 | ##### add your solution here
1139 | => ["23", "4535", "1"]
1140 | ```
1141 |
1142 | **12)** Convert the comma separated strings to corresponding `hash` objects as shown below. The keys are `name`, `maths` and `phy` for the three fields in the input strings.
1143 |
1144 | ```ruby
1145 | >> row1 = 'rohan,75,89'
1146 | >> row2 = 'rose,88,92'
1147 |
1148 | >> pat = ##### add your solution here
1149 |
1150 | ##### add your solution here for row1
1151 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
1152 | ##### add your solution here for row2
1153 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
1154 | ```
1155 |
1156 | **13)** Surround all whole words with `()`. Additionally, if the whole word is `imp` or `ant`, delete them. Can you do it with just a single substitution?
1157 |
1158 | ```ruby
1159 | >> ip = 'tiger imp goat eagle ant important'
1160 |
1161 | ##### add your solution here
1162 | => "(tiger) () (goat) (eagle) () (important)"
1163 | ```
1164 |
1165 | **14)** Filter all elements that contain a sequence of lowercase alphabets followed by `-` followed by digits. They can be optionally surrounded by `{{` and `}}`. Any partial match shouldn't be part of the output.
1166 |
1167 | ```ruby
1168 | >> ip = %w[{{apple-150}} {{mango2-100}} {{cherry-200 grape-87 {{go-to}}]
1169 |
1170 | ##### add your solution here
1171 | => ["{{apple-150}}", "grape-87"]
1172 | ```
1173 |
1174 | **15)** Extract all hexadecimal character sequences, with `0x` optional prefix. Match the characters case insensitively, and the sequences shouldn't be surrounded by other word characters.
1175 |
1176 | ```ruby
1177 | >> str1 = '128A foo 0xfe32 34 0xbar'
1178 | >> str2 = '0XDEADBEEF place 0x0ff1ce bad'
1179 |
1180 | >> hex_seq = ##### add your solution here
1181 |
1182 | >> str1.scan(hex_seq)
1183 | => ["128A", "0xfe32", "34"]
1184 | >> str2.scan(hex_seq)
1185 | => ["0XDEADBEEF", "0x0ff1ce", "bad"]
1186 | ```
1187 |
1188 | **16)** Replace sequences made up of words separated by `:` or `.` by the first/last word of the sequence and the separator. Such sequences will end when `:` or `.` is not followed by a word character.
1189 |
1190 | ```ruby
1191 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1192 |
1193 | # first word of the sequence
1194 | ##### add your solution here
1195 | => "wow: hi-2 bye kite."
1196 |
1197 | # last word of the sequence
1198 | ##### add your solution here
1199 | => "five: hi-2 bye water."
1200 | ```
1201 |
1202 | **17)** For the given input strings, extract `if` followed by any number of nested parentheses. Assume that there will be only one such pattern per input string.
1203 |
1204 | ```ruby
1205 | >> ip1 = 'for (((i*3)+2)/6) if(3-(k*3+4)/12-(r+2/3)) while()'
1206 | >> ip2 = 'if+while if(a(b)c(d(e(f)1)2)3) for(i=1)'
1207 |
1208 | >> pat = ##### add your solution here
1209 |
1210 | >> ip1[pat]
1211 | => "if(3-(k*3+4)/12-(r+2/3))"
1212 | >> ip2[pat]
1213 | => "if(a(b)c(d(e(f)1)2)3)"
1214 | ```
1215 |
1216 | **18)** The given input string has sequences made up of words separated by `:` or `.` and such sequences will end when `:` or `.` is not followed by a word character. For all such sequences, display only the last word followed by `-` followed by the first word.
1217 |
1218 | ```ruby
1219 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'
1220 |
1221 | ##### add your solution here
1222 | => ["five-wow", "water-kite"]
1223 | ```
1224 |
1225 |
1226 |
1227 | # Lookarounds
1228 |
1229 | > Please use lookarounds to solve the following exercises even if you can solve them without lookarounds. The exception is cases where lookarounds cannot be used, such as variable-length lookbehinds.
1230 |
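The variable-length lookbehind case mentioned in the note can be sketched as follows. The string `sample` and the variable names are illustrative assumptions, not exercise inputs. The sketch assumes a Ruby version that rejects a quantified lookbehind such as `(?<=[:=]\d+)`. If your version accepts it, `lookbehind_error` stays `nil`. `\K` is used as one possible alternative: it drops the text matched so far from the reported match.

```ruby
# Illustrative string, not taken from the exercises below
sample = '=314not :,2irk ,:3cool =42,error'

# Build the regexp from a string so that a rejected pattern raises at
# run time (RegexpError) instead of stopping the script at parse time
lookbehind_error = begin
  Regexp.new('(?<=[:=]\d+)[a-z]+')
  nil
rescue RegexpError => e
  e.message
end

# \K keeps only the portion matched after it, so the variable-length
# prefix `[:=]\d+` is required but not part of the result
words = sample.scan(/[:=]\d+\K[a-z]+/)

puts lookbehind_error.inspect
puts words.inspect
```
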
1231 | **1)** Replace all whole words with `X` unless they are preceded by a `(` character.
1232 |
1233 | ```ruby
1234 | >> ip = '(apple) guava berry) apple (mango) (grape'
1235 |
1236 | ##### add your solution here
1237 | => "(apple) X X) X (mango) (grape"
1238 | ```
1239 |
1240 | **2)** Replace all whole words with `X` unless they are followed by a `)` character.
1241 |
1242 | ```ruby
1243 | >> ip = '(apple) guava berry) apple (mango) (grape'
1244 |
1245 | ##### add your solution here
1246 | => "(apple) X berry) X (mango) (X"
1247 | ```
1248 |
1249 | **3)** Replace all whole words with `X` unless they are preceded by `(` or followed by `)` characters.
1250 |
1251 | ```ruby
1252 | >> ip = '(apple) guava berry) apple (mango) (grape'
1253 |
1254 | ##### add your solution here
1255 | => "(apple) X berry) X (mango) (grape"
1256 | ```
1257 |
1258 | **4)** Extract all whole words that do not end with `e` or `n`.
1259 |
1260 | ```ruby
1261 | >> ip = 'a_t row on Urn e note Dust n end a2-e|u'
1262 |
1263 | ##### add your solution here
1264 | => ["a_t", "row", "Dust", "end", "a2", "u"]
1265 | ```
1266 |
1267 | **5)** Extract all whole words that do not start with `a` or `d` or `n`.
1268 |
1269 | ```ruby
1270 | >> ip = 'a_t row on Urn e note Dust n end a2-e|u'
1271 |
1272 | ##### add your solution here
1273 | => ["row", "on", "Urn", "e", "Dust", "end", "e", "u"]
1274 | ```
1275 |
1276 | **6)** Extract all whole words only if they are followed by `:` or `,` or `-`.
1277 |
1278 | ```ruby
1279 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1280 |
1281 | ##### add your solution here
1282 | => ["Poke", "so_good", "ever2"]
1283 | ```
1284 |
1285 | **7)** Extract all whole words only if they are preceded by `=` or `/` or `-`.
1286 |
1287 | ```ruby
1288 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1289 |
1290 | ##### add your solution here
1291 | => ["so_good", "is", "sit"]
1292 | ```
1293 |
1294 | **8)** Extract all whole words only if they are preceded by `=` or `:` and followed by `:` or `.`.
1295 |
1296 | ```ruby
1297 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1298 |
1299 | ##### add your solution here
1300 | => ["so_good", "ink"]
1301 | ```
1302 |
1303 | **9)** Extract all whole words only if they are preceded by `=` or `:` or `.` or `(` or `-` and not followed by `.` or `/`.
1304 |
1305 | ```ruby
1306 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'
1307 |
1308 | ##### add your solution here
1309 | => ["so_good", "vast", "sit"]
1310 | ```
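
The four exercises above differ only in which lookarounds surround `\w+`. The following is one possible set of patterns, shown as a sketch rather than the reference solution; the names `e6` to `e9` are made up for illustration.

```ruby
ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

e6 = ip.scan(/\b\w+(?=[:,-])/)               # followed by : or , or -
e7 = ip.scan(/(?<=[=\/-])\w+/)               # preceded by = or / or -
e8 = ip.scan(/(?<=[=:])\w+\b(?=[:.])/)       # both conditions together
e9 = ip.scan(/(?<=[=:.(-])\w+\b(?![.\/])/)   # positive lookbehind, negative lookahead

p e6   # => ["Poke", "so_good", "ever2"]
p e7   # => ["so_good", "is", "sit"]
p e8   # => ["so_good", "ink"]
p e9   # => ["so_good", "vast", "sit"]
```

Since lookarounds are zero-width, the surrounding characters are checked but never become part of the extracted text.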
1311 |
1312 | **10)** Remove the leading and trailing whitespaces from all the individual fields where `,` is the field separator.
1313 |
1314 | ```ruby
1315 | >> csv1 = " comma ,separated ,values \t\r "
1316 | >> csv2 = 'good bad,nice ice , 42 , , stall small'
1317 |
1318 | >> remove_whitespace = ##### add your solution here
1319 |
1320 | >> csv1.gsub(remove_whitespace, '')
1321 | => "comma,separated,values"
1322 | >> csv2.gsub(remove_whitespace, '')
1323 | => "good bad,nice ice,42,,stall small"
1324 | ```
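
A minimal sketch of one approach to the exercise above: `(?<![^,])` succeeds at the start of the string or right after a comma, and `(?![^,])` succeeds at the end of the string or right before a comma. The inputs are copied as they appear above, so the exact amount of whitespace in them is an assumption.

```ruby
csv1 = " comma ,separated ,values \t\r "
csv2 = 'good bad,nice ice , 42 , , stall small'

# whitespace at the start of a field OR whitespace at the end of a field
remove_whitespace = /(?<![^,])\s+|\s+(?![^,])/

p csv1.gsub(remove_whitespace, '')   # => "comma,separated,values"
p csv2.gsub(remove_whitespace, '')   # => "good bad,nice ice,42,,stall small"
```

Whitespace inside a field is left alone because it has a non-comma character on both sides.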
1325 |
1326 | **11)** Filter elements that satisfy all of these rules:
1327 |
1328 | * should have at least two alphabets
1329 | * should have at least three digits
1330 | * should have at least one special character among `%` or `*` or `#` or `$`
1331 | * should not end with a whitespace character
1332 |
1333 | ```ruby
1334 | >> pwds = ['hunter2', 'F2H3u%9', "*X3Yz3.14\t", 'r2_d2_42', 'A $B C1234']
1335 |
1336 | >> rule_chk = ##### add your solution here
1337 |
1338 | >> pwds.grep(rule_chk)
1339 | => ["F2H3u%9", "A $B C1234"]
1340 | ```
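
The rules above can be sketched as a chain of lookaheads anchored at the start of the string, one per rule. This is one possible pattern, assuming that "alphabets" means the ASCII letters `a-z` and `A-Z`. The special characters are written as `[$%*#]` so that `#` is not followed by `$`, which Ruby could otherwise treat as the start of an interpolation.

```ruby
pwds = ['hunter2', 'F2H3u%9', "*X3Yz3.14\t", 'r2_d2_42', 'A $B C1234']

rule_chk = /\A(?=(?:.*[a-zA-Z]){2})  # at least two letters
              (?=(?:.*\d){3})        # at least three digits
              (?!.*\s\z)             # must not end with whitespace
              .*[$%*#]               # at least one special character
           /x

p pwds.grep(rule_chk)   # => ["F2H3u%9", "A $B C1234"]
```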
1341 |
1342 | **12)** For the given string, surround all whole words with `{}` except for whole words `par` and `cat` and `apple`.
1343 |
1344 | ```ruby
1345 | >> ip = 'part; cat {super} rest_42 par scatter apple spar'
1346 |
1347 | ##### add your solution here
1348 | => "{part}; cat {{super}} {rest_42} par {scatter} apple {spar}"
1349 | ```
1350 |
1351 | **13)** Extract the integer portion of floating-point numbers for the given string. Integers and numbers ending with `.` and no further digits should not be considered.
1352 |
1353 | ```ruby
1354 | >> ip = '12 ab32.4 go 5 2. 46.42 5'
1355 |
1356 | ##### add your solution here
1357 | => ["32", "46"]
1358 | ```
1359 |
1360 | **14)** For the given input strings, extract all overlapping two character sequences.
1361 |
1362 | ```ruby
1363 | >> s1 = 'apple'
1364 | >> s2 = '1.2-3:4'
1365 |
1366 | >> pat = ##### add your solution here
1367 |
1368 | ##### add your solution here for s1
1369 | => ["ap", "pp", "pl", "le"]
1370 | ##### add your solution here for s2
1371 | => ["1.", ".2", "2-", "-3", "3:", ":4"]
1372 | ```
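
One way to get overlapping matches is to consume nothing at all and capture inside a lookahead, so that the next match attempt starts just one character later. A sketch under that assumption, using `flatten` because `scan` returns nested arrays when the pattern has a capture group:

```ruby
s1 = 'apple'
s2 = '1.2-3:4'

# the lookahead is zero-width, the capture group holds the two characters
pat = /(?=(..))/

p s1.scan(pat).flatten   # => ["ap", "pp", "pl", "le"]
p s2.scan(pat).flatten   # => ["1.", ".2", "2-", "-3", "3:", ":4"]
```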
1373 |
1374 | **15)** The given input strings contain fields separated by the `:` character. Delete `:` and the last field if there is a digit character anywhere before the last field.
1375 |
1376 | ```ruby
1377 | >> s1 = '42:cat'
1378 | >> s2 = 'twelve:a2b'
1379 | >> s3 = 'we:be:he:0:a:b:bother'
1380 | >> s4 = 'apple:banana-42:cherry:'
1381 | >> s5 = 'dragon:unicorn:centaur'
1382 |
1383 | >> pat = ##### add your solution here
1384 |
1385 | ##### add your solution here for s1
1386 | => "42"
1387 | ##### add your solution here for s2
1388 | => "twelve:a2b"
1389 | ##### add your solution here for s3
1390 | => "we:be:he:0:a:b"
1391 | ##### add your solution here for s4
1392 | => "apple:banana-42:cherry"
1393 | ##### add your solution here for s5
1394 | => "dragon:unicorn:centaur"
1395 | ```
1396 |
1397 | **16)** Extract all whole words unless they are preceded by `:` or `<=>` or `----` or `#`.
1398 |
1399 | ```ruby
1400 | >> ip = '::very--at<=>row|in.a_b#b2c=>lion----east'
1401 |
1402 | ##### add your solution here
1403 | => ["at", "in", "a_b", "lion"]
1404 | ```
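
Ruby requires lookbehind patterns to be fixed-width, but top-level alternatives of different lengths are allowed. A hedged sketch that relies on this behavior for the exercise above:

```ruby
ip = '::very--at<=>row|in.a_b#b2c=>lion----east'

# three top-level alternatives of length 1, 3 and 4 inside the lookbehind
words = ip.scan(/(?<![:#]|<=>|----)\b\w+/)

p words   # => ["at", "in", "a_b", "lion"]
```

Note that `lion` is kept because it is preceded by `=>` and not `<=>`, and `at` is kept because only two `-` characters precede it.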
1405 |
1406 | **17)** Match strings that contain `qty` followed by `price`, but not if there is any **whitespace** character or the string `error` between them.
1407 |
1408 | ```ruby
1409 | >> str1 = '23,qty,price,42'
1410 | >> str2 = 'qty price,oh'
1411 | >> str3 = '3.14,qty,6,errors,9,price,3'
1412 | >> str4 = "42\nqty-6,apple-56,price-234,error"
1413 | >> str5 = '4,price,3.14,qty,4'
1414 | >> str6 = '(qtyprice) (hi-there)'
1415 |
1416 | >> neg = ##### add your solution here
1417 |
1418 | >> str1.match?(neg)
1419 | => true
1420 | >> str2.match?(neg)
1421 | => false
1422 | >> str3.match?(neg)
1423 | => false
1424 | >> str4.match?(neg)
1425 | => true
1426 | >> str5.match?(neg)
1427 | => false
1428 | >> str6.match?(neg)
1429 | => true
1430 | ```
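
A common way to express "not containing X between two markers" is to place a negative lookahead in front of every character consumed between them. The sketch below applies that idea to the exercise above; the array `strs` is only a convenience for this example.

```ruby
strs = ['23,qty,price,42',
        'qty price,oh',
        '3.14,qty,6,errors,9,price,3',
        "42\nqty-6,apple-56,price-234,error",
        '4,price,3.14,qty,4',
        '(qtyprice) (hi-there)']

# each character between qty and price must not be whitespace
# and must not be the start of the string 'error'
neg = /qty(?:(?!\s|error).)*price/

p strs.map { |s| s.match?(neg) }
# => [true, false, false, true, false, true]
```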
1431 |
1432 | **18)** Can you reason out why the following regular expressions behave differently?
1433 |
1434 | ```ruby
1435 | >> ip = 'I have 12, he has 2!'
1436 |
1437 | >> ip.gsub(/\b..\b/, '{\0}')
1438 | => "{I }have {12}{, }{he} has{ 2}!"
1439 |
1440 | >> ip.gsub(/(?<!\w)..(?!\w)/, '{\0}')
1441 | => "I have {12}, {he} has {2!}"
1442 | ```
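
As a hint toward the reasoning: `\b` needs a word character on exactly one side of the position, whereas negative lookarounds such as `(?<!\w)` and `(?!\w)` only require that there is no word character on the given side. The short inputs below are hypothetical and chosen only to isolate the difference.

```ruby
# after '!' there is no word character on either side, so the trailing \b fails
p '2!'.match?(/\b..\b/)             # => false
p '2!'.match?(/(?<!\w)..(?!\w)/)    # => true

# ', ' sits between two word characters, so \b succeeds on both sides of it,
# but the lookbehind rejects it because '2' comes right before the comma
p '12, he'.scan(/\b..\b/)           # => ["12", ", ", "he"]
p '12, he'.scan(/(?<!\w)..(?!\w)/)  # => ["12", "he"]
```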
1443 |
1444 | **19)** The given input strings have fields separated by the `:` character. Assume that each string has a minimum of two fields and cannot have empty fields. Extract all fields, but stop if a field with a digit character is found.
1445 |
1446 | ```ruby
1447 | >> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
1448 | >> row2 = 'um:no:low:3e:s4w:seer'
1449 | >> row3 = 'oh100:apple:banana:fig'
1450 | >> row4 = 'Dragon:Unicorn:Wizard-Healer'
1451 |
1452 | >> pat = ##### add your solution here
1453 |
1454 | >> row1.gsub(pat).map { $1 }
1455 | => ["vast"]
1456 | >> row2.gsub(pat).map { $1 }
1457 | => ["um", "no", "low"]
1458 | >> row3.gsub(pat).map { $1 }
1459 | => []
1460 | >> row4.gsub(pat).map { $1 }
1461 | => ["Dragon", "Unicorn", "Wizard-Healer"]
1462 | ```
1463 |
1464 | **20)** The given input strings have fields separated by the `:` character. Extract all fields only after a field containing a digit character is found. Assume that each string has a minimum of two fields and cannot have empty fields.
1465 |
1466 | ```ruby
1467 | >> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
1468 | >> row2 = 'um:no:low:3e:s4w:seer'
1469 | >> row3 = 'oh100:apple:banana:fig'
1470 | >> row4 = 'Dragon:Unicorn:Wizard-Healer'
1471 |
1472 | >> pat = ##### add your solution here
1473 |
1474 | >> row1.scan(pat)
1475 | => ["ride", "in", "awe", "b2b", "3list", "end"]
1476 | >> row2.scan(pat)
1477 | => ["s4w", "seer"]
1478 | >> row3.scan(pat)
1479 | => ["apple", "banana", "fig"]
1480 | >> row4.scan(pat)
1481 | => []
1482 | ```
1483 |
1484 | **21)** The given input string has comma separated fields and some of them can occur more than once. For the duplicated fields, retain only the rightmost one. Assume that there are no empty fields.
1485 |
1486 | ```ruby
1487 | >> row = '421,cat,2425,42,5,cat,6,6,42,61,6,6,scat,6,6,4,Cat,425,4'
1488 |
1489 | ##### add your solution here
1490 | => "421,2425,5,cat,42,61,scat,6,Cat,425,4"
1491 | ```
1492 |
1493 |
1494 |
1495 | # Modifiers
1496 |
1497 | **1)** Remove from the first occurrence of `hat` to the last occurrence of `it` for the given input strings. Match these markers case insensitively.
1498 |
1499 | ```ruby
1500 | >> s1 = "But Cool THAT\nsee What okay\nwow quite"
1501 | >> s2 = 'it this hat is sliced HIT.'
1502 |
1503 | >> pat = ##### add your solution here
1504 |
1505 | >> s1.sub(pat, '')
1506 | => "But Cool Te"
1507 | >> s2.sub(pat, '')
1508 | => "it this ."
1509 | ```
1510 |
1511 | **2)** Delete from the string `start` if it is at the beginning of a line up to the next occurrence of the string `end` at the end of a line. Match these keywords irrespective of case.
1512 |
1513 | ```ruby
1514 | '> para = %q{good start
1515 | '> start working on that
1516 | '> project you always wanted
1517 | '> to, do not let it end
1518 | '> hi there
1519 | '> start and end the end
1520 | '> 42
1521 | '> Start and try to
1522 | '> finish the End
1523 | >> bye}
1524 |
1525 | >> pat = ##### add your solution here
1526 |
1527 | >> puts para.gsub(pat, '')
1528 | good start
1529 |
1530 | hi there
1531 |
1532 | 42
1533 |
1534 | bye
1535 | ```
1536 |
1537 | **3)** For the given *markdown* file, replace all occurrences of the string `ruby` (irrespective of case) with the string `Ruby`. However, any match within code blocks that start with the whole line ` ```ruby ` and end with the whole line ` ``` ` shouldn't be replaced. Consider the input file to be small enough to fit memory requirements.
1538 |
1539 | Refer to the [exercises folder](https://github.com/learnbyexample/Ruby_Regexp/tree/master/exercises) for input files required to solve this exercise.
1540 |
1541 | ```ruby
1542 | >> ip_str = File.open('sample.md').read
1543 | >> pat = ##### add your solution here
1544 |
1545 | >> File.open('sample_mod.md', 'w') do |f|
1546 | ?> ip_str.split(pat).each_with_index do |s, i|
1547 | ?> f.write(i.odd? ? s : s.gsub(/ruby/i) { $&.capitalize })
1548 | >> end
1549 | >> end
1550 |
1551 | >> File.open('sample_mod.md').read == File.open('expected.md').read
1552 | => true
1553 | ```
1554 |
1555 | **4)** Write a string method that changes the given input to alternate case (starting with lowercase first).
1556 |
1557 | ```ruby
1558 | ?> def aLtErNaTe_CaSe(ip_str)
1559 | ##### add your solution here
1560 | >> end
1561 |
1562 | >> aLtErNaTe_CaSe('HI THERE!')
1563 | => "hI tHeRe!"
1564 | >> aLtErNaTe_CaSe('good morning')
1565 | => "gOoD mOrNiNg"
1566 | >> aLtErNaTe_CaSe('Sample123string42with777numbers')
1567 | => "sAmPlE123sTrInG42wItH777nUmBeRs"
1568 | ```
1569 |
1570 | **5)** For the given input strings, match all of these three conditions:
1571 |
1572 | * `This` case sensitively
1573 | * `nice` and `cool` case insensitively
1574 |
1575 | ```ruby
1576 | >> s1 = 'This is nice and Cool'
1577 | >> s2 = 'Nice and cool this is'
1578 | >> s3 = 'What is so nice and cool about This?'
1579 | >> s4 = 'nice,cool,This'
1580 | >> s5 = 'not nice This?'
1581 | >> s6 = 'This is not cool'
1582 |
1583 | >> pat = ##### add your solution here
1584 |
1585 | >> s1.match?(pat)
1586 | => true
1587 | >> s2.match?(pat)
1588 | => false
1589 | >> s3.match?(pat)
1590 | => true
1591 | >> s4.match?(pat)
1592 | => true
1593 | >> s5.match?(pat)
1594 | => false
1595 | >> s6.match?(pat)
1596 | => false
1597 | ```
1598 |
1599 | **6)** For the given input strings, match if the string begins with `Th` and also contains a line that starts with `There`.
1600 |
1601 | ```ruby
1602 | >> s1 = "There there\nHave a cookie"
1603 | >> s2 = "This is a mess\nYeah?\nThereeeee"
1604 | >> s3 = "Oh\nThere goes the fun"
1605 | >> s4 = 'This is not\ngood\nno There'
1606 |
1607 | >> pat = ##### add your solution here
1608 |
1609 | >> s1.match?(pat)
1610 | => true
1611 | >> s2.match?(pat)
1612 | => true
1613 | >> s3.match?(pat)
1614 | => false
1615 | >> s4.match?(pat)
1616 | => false
1617 | ```
1618 |
1619 |
1620 |
1621 | # Unicode
1622 |
1623 | **1)** Output `true` or `false` depending on whether the input string is made up entirely of ASCII characters. Assume the input strings are non-empty; any character that isn't part of the 7-bit ASCII set should give `false`.
1624 |
1625 | ```ruby
1626 | >> str1 = '123—456'
1627 | >> str2 = 'good fοοd'
1628 | >> str3 = 'happy learning!'
1629 |
1630 | ##### add your solution here for str1
1631 | => false
1632 | ##### add your solution here for str2
1633 | => false
1634 | ##### add your solution here for str3
1635 | => true
1636 | ```
1637 |
1638 | **2)** Retain only punctuation characters for the given strings (generated from codepoints). Use the Unicode character set definition for punctuation for solving this exercise.
1639 |
1640 | ```ruby
1641 | >> s1 = (0..0x7f).to_a.pack('U*')
1642 | >> s2 = (0x80..0xff).to_a.pack('U*')
1643 | >> s3 = (0x2600..0x27eb).to_a.pack('U*')
1644 |
1645 | >> pat = ##### add your solution here
1646 |
1647 | >> s1.gsub(pat, '')
1648 | => "!\"#%&'()*,-./:;?@[\\]_{}"
1649 | >> s2.gsub(pat, '')
1650 | => "¡§«¶·»¿"
1651 | >> s3.gsub(pat, '')
1652 | => "❨❩❪❫❬❭❮❯❰❱❲❳❴❵⟅⟆⟦⟧⟨⟩⟪⟫"
1653 | ```
1654 |
1655 | **3)** Explore the following Q&A threads.
1656 |
1657 | * [stackoverflow: remove emoji from string](https://stackoverflow.com/q/24672834/4082052)
1658 | * [stackoverflow: why am I seeing different results for these two nearly identical regexp](https://stackoverflow.com/q/13573136/4082052)
1659 | * [stackoverflow: convert unicode number to integer](https://stackoverflow.com/q/37338708/4082052)
1660 | * [stackoverflow: replacing %uXXXX to the corresponding unicode codepoint](https://stackoverflow.com/q/28773392/4082052)
1661 |
1662 |
--------------------------------------------------------------------------------
/exercises/expected.md:
--------------------------------------------------------------------------------
1 | # Introduction
2 |
3 | REPL is a good way to learn Ruby for beginners.
4 |
5 | ```ruby
6 | >> 3 + 7
7 | => 10
8 | >> 22 / 7
9 | => 3
10 | >> 22.0 / 7
11 | => 3.142857142857143
12 | ```
13 |
14 | ## String methods
15 |
16 | Ruby comes loaded with awesome methods. Enjoy learning Ruby.
17 |
18 | ```ruby
19 | >> 'ruby'.capitalize
20 | => "Ruby"
21 |
22 | >> ' comma '.strip
23 | => "comma"
24 | ```
25 |
26 |
--------------------------------------------------------------------------------
/exercises/sample.md:
--------------------------------------------------------------------------------
1 | # Introduction
2 |
3 | REPL is a good way to learn RUBY for beginners.
4 |
5 | ```ruby
6 | >> 3 + 7
7 | => 10
8 | >> 22 / 7
9 | => 3
10 | >> 22.0 / 7
11 | => 3.142857142857143
12 | ```
13 |
14 | ## String methods
15 |
16 | ruby comes loaded with awesome methods. Enjoy learning RuBy.
17 |
18 | ```ruby
19 | >> 'ruby'.capitalize
20 | => "Ruby"
21 |
22 | >> ' comma '.strip
23 | => "comma"
24 | ```
25 |
26 |
--------------------------------------------------------------------------------
/images/debuggex.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/debuggex.png
--------------------------------------------------------------------------------
/images/find_replace.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/find_replace.png
--------------------------------------------------------------------------------
/images/info.svg:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/images/password_check.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/password_check.png
--------------------------------------------------------------------------------
/images/rubular.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/rubular.png
--------------------------------------------------------------------------------
/images/ruby_regexp_ls.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/ruby_regexp_ls.png
--------------------------------------------------------------------------------
/images/warning.svg:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/sample_chapters/ruby_regexp_sample.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/sample_chapters/ruby_regexp_sample.pdf
--------------------------------------------------------------------------------