├── LICENSE ├── README.md ├── Version_changes.md ├── code_snippets ├── Alternation_and_Grouping.rb ├── Anchors.rb ├── Character_class.rb ├── Dot_metacharacter_and_Quantifiers.rb ├── Escaping_metacharacters.rb ├── Groupings_and_backreferences.rb ├── Interlude_Common_tasks.rb ├── Lookarounds.rb ├── Modifiers.rb ├── Regexp_introduction.rb ├── Unicode.rb └── Working_with_matched_portions.rb ├── exercises ├── Exercise_solutions.md ├── Exercises.md ├── expected.md └── sample.md ├── images ├── debuggex.png ├── find_replace.png ├── info.svg ├── password_check.png ├── rubular.png ├── ruby_regexp_ls.png └── warning.svg ├── ruby_regexp.md └── sample_chapters └── ruby_regexp_sample.pdf /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Sundeep Agarwal 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Understanding Ruby Regexp 2 | 3 | Learn Ruby Regular Expressions step-by-step from beginner to advanced levels with hundreds of examples and exercises. Visit https://youtu.be/QNsCzVeZH78 for a short video about the book. 4 | 5 |


6 | 7 | The book also includes exercises to test your understanding, which are presented together as a single file in this repo — [Exercises.md](./exercises/Exercises.md). 8 | 9 | For solutions to the exercises, see [Exercise_solutions.md](./exercises/Exercise_solutions.md). 10 | 11 | See [Version_changes.md](./Version_changes.md) to keep track of changes made to the book. 12 | 13 |
14 | 15 | # E-book 16 | 17 | * You can download the pdf/epub versions of the book for free using the links below (you can also pay if you wish): 18 | * https://learnbyexample.gumroad.com/l/rubyregexp 19 | * https://leanpub.com/rubyregexp 20 | * You can also get the book as part of these bundles: 21 | * **All books** bundle from https://learnbyexample.gumroad.com/l/all-books 22 | * Includes all my programming books 23 | * **Awesome Regex** bundle from https://learnbyexample.gumroad.com/l/regex or https://leanpub.com/b/regex 24 | * **Ruby text processing** bundle from https://learnbyexample.gumroad.com/l/ruby-textprocessing or https://leanpub.com/b/ruby-textprocessing 25 | * See https://learnbyexample.github.io/books/ for a list of other books 26 | 27 | For a preview of the book, see [sample chapters](./sample_chapters/ruby_regexp_sample.pdf). 28 | 29 | The book can also be [viewed as a single markdown file in this repo](./ruby_regexp.md). See my blogpost on [generating pdfs from markdown using pandoc](https://learnbyexample.github.io/customizing-pandoc/) if you are interested in the ebook creation process. 30 | 31 | For the web version of the book, visit https://learnbyexample.github.io/Ruby_Regexp/ 32 | 33 |
34 | 35 | # Feedback 36 | 37 | ⚠️ ⚠️ Please DO NOT submit pull requests. Main reason being any modification requires changes in multiple places. 38 | 39 | I would highly appreciate it if you'd let me know how you felt about this book. It could be anything from a simple thank you, pointing out a typo, mistakes in code snippets, which aspects of the book worked for you (or didn't!) and so on. Reader feedback is essential and especially so for self-published authors. 40 | 41 | You can reach me via: 42 | 43 | * Issue Manager: [https://github.com/learnbyexample/Ruby_Regexp/issues](https://github.com/learnbyexample/Ruby_Regexp/issues) 44 | * E-mail: `echo 'bGVhcm5ieWV4YW1wbGUubmV0QGdtYWlsLmNvbQo=' | base64 --decode` 45 | * Twitter: [https://twitter.com/learn_byexample](https://twitter.com/learn_byexample) 46 | 47 |
48 | 49 | # Table of Contents 50 | 51 | 1. Preface 52 | 2. Why is it needed? 53 | 3. Regexp introduction 54 | 4. Anchors 55 | 5. Alternation and Grouping 56 | 6. Escaping metacharacters 57 | 7. Dot metacharacter and Quantifiers 58 | 8. Interlude: Tools for debugging and visualization 59 | 9. Working with matched portions 60 | 10. Character class 61 | 11. Groupings and backreferences 62 | 12. Interlude: Common tasks 63 | 13. Lookarounds 64 | 14. Modifiers 65 | 15. Unicode 66 | 16. Further Reading 67 | 68 |
69 | 70 | # Acknowledgements 71 | 72 | * [ruby-lang documentation](https://www.ruby-lang.org/en/documentation/) — manuals and tutorials 73 | * [/r/ruby/](https://old.reddit.com/r/ruby/) and [/r/regex/](https://old.reddit.com/r/regex/) — helpful forum for beginners and experienced programmers alike 74 | * [stackoverflow](https://stackoverflow.com/) — for getting answers to pertinent questions on Ruby and regular expressions 75 | * [tex.stackexchange](https://tex.stackexchange.com/) — for help on [pandoc](https://github.com/jgm/pandoc/) and `tex` related questions 76 | * [canva](https://www.canva.com/) — cover image 77 | * [Warning](https://commons.wikimedia.org/wiki/File:Warning_icon.svg) and [Info](https://commons.wikimedia.org/wiki/File:Info_icon_002.svg) icons by [Amada44](https://commons.wikimedia.org/wiki/User:Amada44) under public domain 78 | * [oxipng](https://github.com/shssoichiro/oxipng), [pngquant](https://pngquant.org/) and [svgcleaner](https://github.com/RazrFalcon/svgcleaner) — optimizing images 79 | * [gmovchan](https://github.com/gmovchan) for spotting a typo 80 | * **KOTP** for spotting grammatical mistakes 81 | * [mdBook](https://github.com/rust-lang/mdBook) — for web version of the book 82 | * [mdBook-pagetoc](https://github.com/JorelAli/mdBook-pagetoc) — for adding table of contents for each chapter 83 | * [minify-html](https://github.com/wilsonzlin/minify-html) — for minifying html files 84 | 85 | Special thanks to Allen Downey, an attempt at translating his book [Think Python](https://greenteapress.com/wp/think-python-2e/) to [Think Ruby](https://github.com/learnbyexample/ThinkRubyBuild) gave me the confidence to publish my own book. 86 | 87 |
88 | 89 | # License 90 | 91 | The book is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/). 92 | 93 | The code snippets are licensed under MIT, see [LICENSE](./LICENSE) file. 94 | 95 | -------------------------------------------------------------------------------- /Version_changes.md: -------------------------------------------------------------------------------- 1 |
2 | 3 | ### 3.0 4 | 5 | * Ruby version updated to **3.3.0** 6 | * Corrected examples and descriptions for Atomic grouping, `\G` and `\K` features 7 | * In general, many of the examples, exercises, solutions, descriptions and external links were updated/corrected 8 | * Updated Acknowledgements section 9 | * Code snippets related to info/warning sections will now appear as a single block 10 | * Book title changed to **Understanding Ruby Regexp** 11 | * New cover image 12 | * Images centered for EPUB format 13 | 14 |
15 | 16 | ### 2.6 17 | 18 | * Updated `ruby` version to **3.0.0** 19 | * Added a further reading link for `\g` subexpression call usage 20 | * Typo corrections and miscellaneous changes 21 | 22 |
23 | 24 | ### 2.5 25 | 26 | * Added **epub** version of the book 27 | * Added real world regular expressions usage examples and overview of book in introduction chapter 28 | * Added plenty of new exercises, perhaps too many 29 | * Updated and clarified descriptions for many concepts, too many changes to list individually 30 | * Added separate section about escape sequences and differences compared to string literals, added details for `\R` character set 31 | * Added two interlude chapters to highlight external resources 32 | * Removed chapters *Miscellaneous* and *Gotchas* and merged their contents in other chapters 33 | * Added section for conditional group `(?(cond)yes-subexp|no-subexp)` 34 | * And many more typo corrections and miscellaneous changes 35 | 36 |
37 | 38 | ### 2.1 39 | 40 | * corrected formatting in a code snippet comment 41 | * changed a character class example to use possessive quantifier instead of greedy 42 | * added examples for `\k` method of backreferencing 43 | * added examples for `\X` 44 | 45 |
46 | 47 | ### 2.0 48 | 49 | * added Table of Contents 50 | * updated cover image 51 | * changed book formatting 52 | * better contrast for chapter and section names 53 | * changed background color of code snippets for better contrast 54 | * increased font size and page margins 55 | * added cheatsheet at end of chapters 56 | * improved descriptions and examples 57 | * corrected minor typos and improved grammar 58 | 59 |
60 | 61 | ### 1.1 62 | 63 | * changed cover image 64 | * added illustration for recursive matching section 65 | * external link updates and minor description changes 66 | 67 |
68 | 69 | ### 1.0 70 | 71 | * added exercises 72 | * some comments for code snippets improved, typos fixed 73 | * AND conditional examples made as a sub-heading 74 | * font for code snippets changed to accommodate some Unicode characters 75 | 76 |
77 | 78 | ### 0.6 79 | 80 | * second example for Character class chapter changed from `grep` to `gsub` 81 | * second example for String Encoding chapter changed from `gsub` to `scan` 82 | * better naming for `hash` examples 83 | * Recursive matching simplified to use `\g<0>` instead of capture group and `\g<1>` 84 | * also, description improved and added links for viewing the regexp online as railroad diagrams 85 | 86 |
87 | 88 | ### 0.5 89 | 90 | * First version 91 | 92 | -------------------------------------------------------------------------------- /code_snippets/Alternation_and_Grouping.rb: -------------------------------------------------------------------------------- 1 | ## Alternation 2 | 3 | pet = /cat|dog/ 4 | 5 | 'I like cats'.match?(pet) 6 | 7 | 'I like dogs'.match?(pet) 8 | 9 | 'I like parrots'.match?(pet) 10 | 11 | 'catapults concatenate cat scat cater'.gsub(/\Acat|cat\b/, 'X') 12 | 13 | 'cat dog bee parrot fox'.gsub(/cat|dog|fox/, 'mammal') 14 | 15 | ## Regexp.union method 16 | 17 | Regexp.union('car', 'jeep') 18 | 19 | words = %w[cat dog fox] 20 | 21 | pat = Regexp.union(words) 22 | 23 | pat 24 | 25 | 'cat dog bee parrot fox'.gsub(pat, 'mammal') 26 | 27 | ## Grouping 28 | 29 | 'red reform read arrest'.gsub(/reform|rest/, 'X') 30 | 31 | 'red reform read arrest'.gsub(/re(form|st)/, 'X') 32 | 33 | 'par spare part party'.gsub(/\bpar\b|\bpart\b/, 'X') 34 | 35 | 'par spare part party'.gsub(/\b(par|part)\b/, 'X') 36 | 37 | 'par spare part party'.gsub(/\bpar(|t)\b/, 'X') 38 | 39 | ## Regexp.source method 40 | 41 | words = %w[cat par] 42 | 43 | alt = Regexp.union(words) 44 | 45 | alt 46 | 47 | alt_w = /\b(#{alt.source})\b/ 48 | 49 | alt_w 50 | 51 | 'cater cat concatenate par spare'.gsub(alt, 'X') 52 | 53 | 'cater cat concatenate par spare'.gsub(alt_w, 'X') 54 | 55 | ## Precedence rules 56 | 57 | words = 'lion elephant are rope not' 58 | 59 | words =~ /on/ 60 | 61 | words =~ /ant/ 62 | 63 | words.sub(/on|ant/, 'X') 64 | 65 | words.sub(/ant|on/, 'X') 66 | 67 | mood = 'best years' 68 | 69 | mood =~ /year/ 70 | 71 | mood =~ /years/ 72 | 73 | mood.sub(/year|years/, 'X') 74 | 75 | mood.sub(/years|year/, 'X') 76 | 77 | words = 'ear xerox at mare part learn eye' 78 | 79 | words.gsub(/ar|are|art/, 'X') 80 | 81 | words.gsub(/are|ar|art/, 'X') 82 | 83 | words.gsub(/are|art|ar/, 'X') 84 | 85 | words = %w[hand handy handful] 86 | 87 | alt = Regexp.union(words.sort_by { |w| -w.length }) 
88 | 89 | alt 90 | 91 | 'hands handful handed handy'.gsub(alt, 'X') 92 | 93 | 'hands handful handed handy'.gsub(Regexp.union(words), 'X') 94 | 95 | -------------------------------------------------------------------------------- /code_snippets/Anchors.rb: -------------------------------------------------------------------------------- 1 | ## String anchors 2 | 3 | 'cater'.match?(/\Acat/) 4 | 5 | 'concatenation'.match?(/\Acat/) 6 | 7 | "hi hello\ntop spot".match?(/\Ahi/) 8 | 9 | "hi hello\ntop spot".match?(/\Atop/) 10 | 11 | 'spare'.match?(/are\z/) 12 | 13 | 'nearest'.match?(/are\z/) 14 | 15 | words = %w[surrender unicorn newer door empty eel pest] 16 | 17 | words.grep(/er\z/) 18 | 19 | words.grep(/t\z/) 20 | 21 | "spare\ndare".sub(/are\z/, 'X') 22 | 23 | "spare\ndare".sub(/are\Z/, 'X') 24 | 25 | "spare\ndare\n".sub(/are\z/, 'X') 26 | 27 | "spare\ndare\n".sub(/are\Z/, 'X') 28 | 29 | 'cat'.match?(/\Acat\z/) 30 | 31 | 'cater'.match?(/\Acat\z/) 32 | 33 | 'concatenation'.match?(/\Acat\z/) 34 | 35 | 'live'.sub(/\A/, 're') 36 | 37 | 'send'.sub(/\A/, 're') 38 | 39 | 'cat'.sub(/\z/, 'er') 40 | 41 | 'hack'.sub(/\z/, 'er') 42 | 43 | ## Line anchors 44 | 45 | pets = 'cat and dog' 46 | 47 | pets.match?(/^cat/) 48 | 49 | pets.match?(/^dog/) 50 | 51 | pets.match?(/dog$/) 52 | 53 | pets.match?(/^dog$/) 54 | 55 | "hi hello\ntop spot".match?(/^top/) 56 | 57 | "spare\npar\nera\ndare".match?(/er$/) 58 | 59 | "spare\npar\ndare".each_line.grep(/are$/) 60 | 61 | "spare\npar\ndare".match?(/^par$/) 62 | 63 | str = "catapults\nconcatenate\ncat" 64 | 65 | puts str.gsub(/^/, '1: ') 66 | 67 | puts str.gsub(/^/).with_index(1) { "#{_2}: " } 68 | 69 | puts str.gsub(/$/, '.') 70 | 71 | puts "1\n2\n".gsub(/^/, 'fig ') 72 | 73 | puts "1\n\n".gsub(/^/, 'fig ') 74 | 75 | puts "1\n2\n".gsub(/$/, ' banana') 76 | 77 | puts "1\n\n".gsub(/$/, ' banana') 78 | 79 | ## Word anchors 80 | 81 | words = 'par spar apparent spare part' 82 | 83 | words.gsub(/par/, 'X') 84 | 85 | words.gsub(/\bpar/, 'X') 86 | 87 | 
words.gsub(/par\b/, 'X') 88 | 89 | words.gsub(/\bpar\b/, 'X') 90 | 91 | words = 'par spar apparent spare part' 92 | 93 | puts words.gsub(/\b/, '"').tr(' ', ',') 94 | 95 | '-----hello-----'.gsub(/\b/, ' ') 96 | 97 | 'output=num1+35*42/num2'.gsub(/\b/, ' ') 98 | 99 | 'output=num1+35*42/num2'.gsub(/\b/, ' ').strip 100 | 101 | ## Opposite Word anchors 102 | 103 | words = 'par spar apparent spare part' 104 | 105 | words.gsub(/\Bpar/, 'X') 106 | 107 | words.gsub(/\Bpar\b/, 'X') 108 | 109 | words.gsub(/par\B/, 'X') 110 | 111 | words.gsub(/\Bpar\B/, 'X') 112 | 113 | 'copper'.gsub(/\b/, ':') 114 | 115 | 'copper'.gsub(/\B/, ':') 116 | 117 | '-----hello-----'.gsub(/\b/, ' ') 118 | 119 | '-----hello-----'.gsub(/\B/, ' ') 120 | 121 | -------------------------------------------------------------------------------- /code_snippets/Character_class.rb: -------------------------------------------------------------------------------- 1 | ## Custom character sets 2 | 3 | %w[cute cat cot coat cost scuttle].grep(/c[ou]t/) 4 | 5 | 'meeting cute boat site foot'.scan(/.[aeo]+t/) 6 | 7 | ## Range of characters 8 | 9 | 'Sample123string42with777numbers'.scan(/[0-9]+/) 10 | 11 | 'coat Bin food tar12 best Apple fig_42'.scan(/\b[a-z0-9]+\b/) 12 | 13 | 'coat tin food put stoop best fig_42 Pet'.scan(/\b[p-z][a-z]*\b/) 14 | 15 | 'coat tin food put stoop best fig_42 Pet'.scan(/\b[a-fp-t]+\b/) 16 | 17 | ## Negating character sets 18 | 19 | 'Sample123string42with777numbers'.scan(/[^0-9]+/) 20 | 21 | 'apple:123:banana:cherry'.sub(/\A([^:]+:){2}/, '') 22 | 23 | 'apple=42; cherry=123'.sub(/=[^=]+\z/, '') 24 | 25 | dates = '2024/04/25,1986/Mar/02,77/12/31' 26 | 27 | dates.scan(%r{([^/]+)/([^/]+)/([^/,]+),?}) 28 | 29 | words = %w[tryst fun glyph pity why] 30 | 31 | words.grep(/\A[^aeiou]+\z/) 32 | 33 | words.grep_v(/[aeiou]/) 34 | 35 | ## Set intersection 36 | 37 | 'tryst glyph pity why'.scan(/\b[^aeiou]+\b/) 38 | 39 | 'tryst glyph pity why'.scan(/\b[a-z&&[^aeiou]]+\b/) 40 | 41 | ## Matching metacharacters 
literally 42 | 43 | 'ab-cd gh-c 12-423'.scan(/\b[a-z-]{2,}\b/) 44 | 45 | 'ab-cd gh-c 12-423'.scan(/\b[a-z\-0-9]{2,}\b/) 46 | 47 | 'f*(a^b) - 3*(a+b)'.scan(/a[+^]b/) 48 | 49 | 'f*(a^b) - 3*(a+b)'.scan(/a[\^+]b/) 50 | 51 | 'words[5] = tea'[/[a-z\[\]0-9]+/] 52 | 53 | puts '5ba\babc2'[/[a\\b]+/] 54 | 55 | ## Escape sequence sets 56 | 57 | '128A foo1 fe32 34 bar'.scan(/\b\h+\b/) 58 | 59 | '128A foo1 fe32 34 bar'.scan(/\b\h+\b/).map(&:hex) 60 | 61 | 'Sample123string42with777numbers'.split(/\d+/) 62 | 63 | 'apple=5, banana=3; x=83, y=120'.scan(/\d+/).map(&:to_i) 64 | 65 | 'sea eat car rat eel tea'.scan(/\b\w/).join 66 | 67 | "tea sea-Pit Sit;(lean_2\tbean_3)".scan(/[\w\s]+/) 68 | 69 | 'Sample123string42with777numbers'.gsub(/\D+/, '-') 70 | 71 | 'apple=5, banana=3; x=83, y=120'.gsub(/\W+/, '') 72 | 73 | " 1..3 \v\f fig_tea 42\tzzz \r\n1-2-3 ".scan(/\S+/) 74 | 75 | "food\r\ngood\napple\vbanana".gsub(/\R/, " ") 76 | 77 | "food\r\ngood"[/\w+\R/] 78 | 79 | ip = ['#comment', 'c = "#"', "\t #comment", 'fig', '', " \t "] 80 | 81 | ip.grep(/\A\s*[^#]/) 82 | 83 | ip.grep(/\A\s*+[^#]/) 84 | 85 | ip.grep(/\A\s*[^#\s]/) 86 | 87 | ## Named character sets 88 | 89 | 'Sample123string42with777numbers'.split(/[[:digit:]]+/) 90 | 91 | " 1..3 \v\f fig_tea 42\tzzz \r\n1-2-3 ".scan(/[[:^space:]]+/) 92 | 93 | "tea sea-Pit Sit;(lean_2\tbean_3)".scan(/[[:word:][:space:]]+/) 94 | 95 | 'Sample123string42with777numbers'.scan(/[[:alpha:]]+/) 96 | 97 | ip = '"Hi", there! How *are* you? All fine here.' 98 | 99 | ip.gsub(/[[:punct:]]+/, '') 100 | 101 | ip.gsub(/[[^.!?]&&[:punct:]]+/, '') 102 | 103 | ## Numeric ranges 104 | 105 | '23 154 12 26 98234'.scan(/\b[12]\d\b/) 106 | 107 | '23 154 12 26 98234'.scan(/\b\d{3,}\b/) 108 | 109 | '0501 035 154 12 26 98234'.scan(/\b0*+\d{3,}\b/) 110 | 111 | '45 349 651 593 4 204'.scan(/\d+/).filter { _1.to_i < 350 } 112 | 113 | '45 349 651 593 4 204'.gsub(/\d+/) { (200..650) === $&.to_i ? 
0 : 1 } 114 | 115 | -------------------------------------------------------------------------------- /code_snippets/Dot_metacharacter_and_Quantifiers.rb: -------------------------------------------------------------------------------- 1 | ## Dot metacharacter 2 | 3 | 'tac tin c.t abc;tuv acute'.gsub(/c.t/, 'X') 4 | 5 | 'breadth markedly reported overrides'.gsub(/r..d/) { _1.upcase } 6 | 7 | "42\t35".sub(/2.3/, '8') 8 | 9 | "a\nb".match?(/a.b/) 10 | 11 | ## split method 12 | 13 | 'apple-85-mango-70'.split(/-/) 14 | 15 | 'bus:3:car:-:van'.split(/:.:/) 16 | 17 | 'apple-85-mango-70'.split(/-/, 2) 18 | 19 | ## Greedy quantifiers 20 | 21 | 'far feat flare fear'.gsub(/e?ar/, 'X') 22 | 23 | 'par spare part party'.gsub(/\bpart?\b/, 'X') 24 | 25 | words = %w[red read ready re;d road redo reed rod] 26 | 27 | words.grep(/\bre.?d\b/) 28 | 29 | 'par part parrot parent'.gsub(/par(ro)?t/, 'X') 30 | 31 | 'par part parrot parent'.gsub(/par(en|ro)?t/, 'X') 32 | 33 | 'tr tear tare steer sitaara'.gsub(/ta*r/, 'X') 34 | 35 | 'tr tear tare steer sitaara'.gsub(/t(e|a)*r/, 'X') 36 | 37 | '3111111111125111142'.gsub(/1*2/, 'X') 38 | 39 | '3111111111125111142'.split(/1*/) 40 | 41 | '3111111111125111142'.split(/1*/, -1) 42 | 43 | '3111111111125111142'.partition(/1*2/) 44 | 45 | '3111111111125111142'.rpartition(/1*2/) 46 | 47 | 'tr tear tare steer sitaara'.gsub(/ta+r/, 'X') 48 | 49 | 'tr tear tare steer sitaara'.gsub(/t(e|a)+r/, 'X') 50 | 51 | '3111111111125111142'.gsub(/1+2/, 'X') 52 | 53 | '3111111111125111142'.split(/1+/) 54 | 55 | repeats = %w[abc ac adc abbc xabbbcz bbb bc abbbbbc] 56 | 57 | repeats.grep(/ab{1,4}c/) 58 | 59 | repeats.grep(/ab{3,}c/) 60 | 61 | repeats.grep(/ab{,2}c/) 62 | 63 | repeats.grep(/ab{3}c/) 64 | 65 | 'a{5} = 10'.sub(/a\{5}/, 'a{6}') 66 | 67 | 'report_{a,b}.txt'.sub(/_{a,b}/, '-{c,d}') 68 | 69 | '# heading ### sub-heading'.gsub(/\#{2,}/, '%') 70 | 71 | ## Conditional AND 72 | 73 | 'Error: not a valid input'.match?(/Error.*valid/) 74 | 75 | 'Error: key not 
found'.match?(/Error.*valid/) 76 | 77 | seq1, seq2 = ['cat and dog', 'dog and cat'] 78 | 79 | seq1.match?(/cat.*dog|dog.*cat/) 80 | 81 | seq2.match?(/cat.*dog|dog.*cat/) 82 | 83 | patterns = [/cat/, /dog/] 84 | 85 | patterns.all? { seq1.match?(_1) } 86 | 87 | patterns.all? { seq2.match?(_1) } 88 | 89 | ## What does greedy mean? 90 | 91 | 'foot'.sub(/f.?o/, 'X') 92 | 93 | puts 'blah \< fig < apple \< blah < cat'.gsub(/\\?0*)\d{3,}/) 140 | 141 | ip = 'fig::mango::pineapple::guava::apples::orange' 142 | 143 | ip.match(/::.*?::apple/)[0] 144 | 145 | ip.match(/(?>::.*?::)apple/)[0] 146 | 147 | -------------------------------------------------------------------------------- /code_snippets/Escaping_metacharacters.rb: -------------------------------------------------------------------------------- 1 | ## Escaping with backslash 2 | 3 | 'a^2 + b^2 - C*3'.match?(/b^2/) 4 | 5 | 'a^2 + b^2 - C*3'.gsub(/(a|b)\^2/) { _1.upcase } 6 | 7 | '(a*b) + c'.gsub(/\(|\)/, '') 8 | 9 | '\learn\by\example'.gsub(/\\/, '/') 10 | 11 | eqn = 'f*(a^b) - 3*(a^b)' 12 | 13 | eqn.gsub('(a^b)', 'c') 14 | 15 | ## Regexp.escape method 16 | 17 | eqn = 'f*(a^b) - 3*(a^b)' 18 | 19 | expr = '(a^b)' 20 | 21 | puts Regexp.escape(expr) 22 | 23 | eqn.sub(/#{Regexp.escape(expr)}\z/, 'c') 24 | 25 | terms = %w[a_42 (a^b) 2|3] 26 | 27 | pat = Regexp.union(terms) 28 | 29 | pat 30 | 31 | 'ba_423 (a^b)c 2|3 a^b'.gsub(pat, 'X') 32 | 33 | Regexp.union(/^cat|dog$/, 'a^b') 34 | 35 | ## Escaping delimiter 36 | 37 | path = '/home/joe/report/sales/ip.txt' 38 | 39 | path.sub(/\A\/home\/joe\//, '~/') 40 | 41 | path.sub(%r#\A/home/joe/#, '~/') 42 | 43 | ## Escape sequences 44 | 45 | "a\tb\tc".gsub(/\t/, ':') 46 | 47 | "1\n2\n3".gsub(/\n/, ' ') 48 | 49 | 'h%x'.match?(/h\%x/) 50 | 51 | 'h\%x'.match?(/h\%x/) 52 | 53 | 'hello'.match?(/\l/) 54 | 55 | 'h e l l o'.gsub(/\x20/, '') 56 | 57 | 'a+b'.match?(/a\053b/) 58 | 59 | '12|30'.gsub(/2\x7c3/, '5') 60 | 61 | '12|30'.gsub(/2|3/, '5') 62 | 63 | 
-------------------------------------------------------------------------------- /code_snippets/Groupings_and_backreferences.rb: -------------------------------------------------------------------------------- 1 | ## Backreferences 2 | 3 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/, '\1') 4 | 5 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/, '\15') 6 | 7 | '_apple_ __123__ _banana_'.gsub(/(_)?_/, '\1') 8 | 9 | '52 apples and 31 mangoes'.gsub(/\d+/, '(\0)') 10 | 11 | 'Hello world'.sub(/.*/, 'Hi. \0. Have a nice day') 12 | 13 | 'fork,42,nice,3.14'.sub(/,.+/, '\0,\`') 14 | 15 | 'good,bad 42,24 x,y'.gsub(/(\w+),(\w+)/, '\2,\1') 16 | 17 | %w[effort flee facade oddball rat tool].grep(/(\w)\1/) 18 | 19 | 'aa a a a 42 f_1 f_1 f_13.14'.gsub(/\b(\w+)( \1)+\b/, '\1') 20 | 21 | 'two one 5 one2 three'.match?(/([a-z]+).*\12/) 22 | 23 | 'two one 5 one2 three'.match?(/([a-z]+).*\k<1>2/) 24 | 25 | s = 'abcdefghijklmna1d' 26 | 27 | s.sub(/(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.).*\1\x31/, 'X') 28 | 29 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/) { $15 } 30 | 31 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/) { $1 + "5" } 32 | 33 | '[52] apples [and] [31] mangoes'.gsub(/\[(\d+)\]/) { "#{$1}5" } 34 | 35 | ## Non-capturing groups 36 | 37 | 'cost akin more east run against'.scan(/\b\w*(?:st|in)\b/) 38 | 39 | '123hand42handy777handful500'.split(/hand(?:y|ful)?/) 40 | 41 | '1,2,3,4,5,6,7'.sub(/\A(([^,]+,){3})([^,]+)/, '\1(\3)') 42 | 43 | '1,2,3,4,5,6,7'.sub(/\A((?:[^,]+,){3})([^,]+)/, '\1(\2)') 44 | 45 | s = 'hi 123123123 bye 456123456' 46 | 47 | s.scan(/(123)+/) 48 | 49 | s.scan(/(?:123)+/) 50 | 51 | s.gsub(/(123)+/, 'X') 52 | 53 | row = 'one,2,3.14,42,five' 54 | 55 | puts row.sub(/\A([^,]+,){3}([^,]+)/, '\1"\2"') 56 | 57 | puts row.sub(/\A((?:[^,]+,){3})([^,]+)/, '\1"\2"') 58 | 59 | 'cost akin more east run against'.gsub(/\b\w*(st|in)\b/).to_a 60 | 61 | 'cost akin more east run against'.gsub(/\b\w*(st|in)\b/).map(&:upcase) 62 | 63 | 'effort flee facade oddball rat 
tool'.gsub(/\b\w*(\w)\1\w*\b/).to_a 64 | 65 | ## Subexpression calls 66 | 67 | row = 'today,2008-03-24,food,2012-08-12,nice,5632' 68 | 69 | row[/(\d{4}-\d{2}-\d{2}).*\g<1>/] 70 | 71 | d = '2008-03-24,2012-08-12 2017-06-27,2018-03-25 1999-12-23,2001-05-08' 72 | 73 | d.scan(/(\d{4}-\d{2}-\d{2}),\g<1>/) 74 | 75 | d.gsub(/(\d{4}-\d{2}-\d{2}),\g<1>/, '\1') 76 | 77 | d.gsub(/((\d{4}-\d{2}-\d{2})),\g<2>/, '\1') 78 | 79 | ## Recursive matching 80 | 81 | eqn0 = 'a + (b * c) - (d / e)' 82 | 83 | eqn0.scan(/\([^()]++\)/) 84 | 85 | eqn1 = '((f+x)^y-42)*((3-g)^z+2)' 86 | 87 | eqn1.scan(/\([^()]++\)/) 88 | 89 | eqn1 = '((f+x)^y-42)*((3-g)^z+2)' 90 | 91 | eqn1.scan(/\((?:[^()]++|\([^()]++\))++\)/) 92 | 93 | eqn2 = 'a + (b) + ((c)) + (((d)))' 94 | 95 | eqn2.scan(/\((?:[^()]++|\([^()]++\))++\)/) 96 | 97 | lvl2 = /\( #literal ( 98 | (?: #start of non-capturing group 99 | [^()]++ #non-parentheses characters 100 | | #OR 101 | \([^()]++\) #level-one regexp 102 | )++ #end of non-capturing group, 1 or more times 103 | \) #literal ) 104 | /x 105 | 106 | eqn1.scan(lvl2) 107 | 108 | eqn2.scan(lvl2) 109 | 110 | lvln = /\( #literal ( 111 | (?: #start of non-capturing group 112 | [^()]++ #non-parentheses characters 113 | | #OR 114 | \g<0> #recursive call 115 | )++ #end of non-capturing group, 1 or more times 116 | \) #literal ) 117 | /x 118 | 119 | eqn0.scan(lvln) 120 | 121 | eqn1.scan(lvln) 122 | 123 | eqn2.scan(lvln) 124 | 125 | eqn3 = '(3+a) * ((r-2)*(t+2)/6) + 42 * (a(b(c(d(e)))))' 126 | 127 | eqn3.scan(lvln) 128 | 129 | ## Named capture groups 130 | 131 | 'good,bad 42,24 x,y'.gsub(/(?<fw>\w+),(?<sw>\w+)/, '\k<sw>,\k<fw>') 132 | 133 | 'good,bad 42,24 x,y'.gsub(/(?'fw'\w+),(?'sw'\w+)/, '\k<sw>,\k<fw>') 134 | 135 | row = 'today,2008-03-24,food,2012-08-12,nice,5632' 136 | 137 | row[/(?\d{4}-\d{2}-\d{2}).*\g/] 138 | 139 | details = '2018-10-25,car' 140 | 141 | /(?<date>[^,]+),(?<product>[^,]+)/ =~ details 142 | 143 | date 144 | 145 | product 146 | 147 | details = '2018-10-25,car,2346' 148 | 149 | 
details.match(/(?<date>[^,]+),(?<product>[^,]+)/).named_captures 150 | 151 | details.match(/(?<date>[^,]+),([^,]+)/).named_captures 152 | 153 | s = 'good,bad 42,24' 154 | 155 | s.gsub(/(?<fw>\w+),(?<sw>\w+)/).map { $~.named_captures } 156 | 157 | ## Negative backreferences 158 | 159 | '1,2,3,3,5'.match?(/\A([^,]+,){2}([^,]+),\k<-1>,/) 160 | 161 | ## Conditional groups 162 | 163 | words = %w[ bye bad> 42 <3] 164 | 165 | words.grep(/\A(<)?\w+(?(1)>)\z/) 166 | 167 | words.grep(/\A(?:<\w+>|\w+)\z/) 168 | 169 | words.grep(/\A(?:?)\z/) 170 | 171 | words = ['(hi)', 'good-bye', 'bad', '(42)', '-oh', 'i-j', '(-)', '(oh-no)'] 172 | 173 | words.grep(/\A(?:(\()?\w+(?(1)\)|-\w+))\z/) 174 | 175 | -------------------------------------------------------------------------------- /code_snippets/Interlude_Common_tasks.rb: -------------------------------------------------------------------------------- 1 | ## CommonRegexRuby 2 | 3 | require 'commonregex' 4 | 5 | data = 'hello 255.21.255.22 okay 23/04/96' 6 | 7 | parsed = CommonRegex.new(data) 8 | 9 | parsed.get_ipv4 10 | 11 | parsed.get_dates 12 | 13 | CommonRegex.get_ipv4(data) 14 | 15 | CommonRegex.get_dates(data) 16 | 17 | new_data = '23.14.2.4.2 255.21.255.22 567.12.2.1' 18 | 19 | CommonRegex.get_ipv4(new_data) 20 | 21 | -------------------------------------------------------------------------------- /code_snippets/Lookarounds.rb: -------------------------------------------------------------------------------- 1 | ## Conditional expressions 2 | 3 | items = ['1,2,3,4', 'a,b,c,d', '#apple 123'] 4 | 5 | items.filter { _1.match?(/\d/) && _1.include?('#') } 6 | 7 | items.filter_map { |s| s.sub(/,.+,/, ' ') if s[0] != '#' } 8 | 9 | ## Negative lookarounds 10 | 11 | 'hey cats! cat42 cat_5 catcat'.gsub(/cat(?!\d)/, 'dog') 12 | 13 | 'cat _cat 42catcat'.gsub(/(? 
'one', '2' => 'two', '4' => 'four' } 152 | 153 | '9234012'.gsub(/1|2|4/, h) 154 | 155 | h.default = 'X' 156 | 157 | '9234012'.gsub(/./, h) 158 | 159 | swap = { 'cat' => 'tiger', 'tiger' => 'cat' } 160 | 161 | 'cat tiger dog tiger cat'.gsub(/cat|tiger/, swap) 162 | 163 | h = { 'hand' => 1, 'handy' => 2, 'handful' => 3, 'a^b' => 4 } 164 | 165 | pat = Regexp.union(h.keys.sort_by { |w| -w.length }) 166 | 167 | pat 168 | 169 | 'handful hand pin handy (a^b)'.gsub(pat, h) 170 | 171 | ## Substitution in conditional expression 172 | 173 | num = '4' 174 | 175 | puts "#{num} apples" if num.sub!(/5/) { $&.to_i ** 2 } 176 | 177 | puts "#{num} apples" if num.sub!(/4/) { $&.to_i ** 2 } 178 | 179 | word, cnt = ['coffining', 0] 180 | 181 | cnt += 1 while word.sub!(/fin/, '') 182 | 183 | [word, cnt] 184 | 185 | -------------------------------------------------------------------------------- /exercises/Exercise_solutions.md: -------------------------------------------------------------------------------- 1 | # Exercise solutions 2 | 3 | >![info](../images/info.svg) Solutions for [Exercises.md](https://github.com/learnbyexample/Ruby_Regexp/blob/master/exercises/Exercises.md) are presented here. 4 | 5 |
6 | 7 | # Regexp introduction 8 | 9 | **1)** Check whether the given strings contain `0xB0`. Display a boolean result as shown below. 10 | 11 | ```ruby 12 | >> line1 = 'start address: 0xA0, func1 address: 0xC0' 13 | >> line2 = 'end address: 0xFF, func2 address: 0xB0' 14 | 15 | >> line1.match?(/0xB0/) 16 | => false 17 | >> line2.match?(/0xB0/) 18 | => true 19 | ``` 20 | 21 | **2)** Check if the given input strings contain `two` irrespective of case. 22 | 23 | ```ruby 24 | >> s1 = 'Their artwork is exceptional' 25 | >> s2 = 'one plus tw0 is not three' 26 | >> s3 = 'TRUSTWORTHY' 27 | 28 | >> pat1 = /two/i 29 | 30 | >> pat1.match?(s1) 31 | => true 32 | >> pat1.match?(s2) 33 | => false 34 | >> pat1.match?(s3) 35 | => true 36 | ``` 37 | 38 | **3)** Replace all occurrences of `5` with `five` for the given string. 39 | 40 | ```ruby 41 | >> ip = 'They ate 5 apples and 5 oranges' 42 | 43 | >> ip.gsub(/5/, 'five') 44 | => "They ate five apples and five oranges" 45 | ``` 46 | 47 | **4)** Replace only the first occurrence of `5` with `five` for the given string. 48 | 49 | ```ruby 50 | >> ip = 'They ate 5 apples and 5 oranges' 51 | 52 | >> ip.sub(/5/, 'five') 53 | => "They ate five apples and 5 oranges" 54 | ``` 55 | 56 | **5)** For the given array, filter all elements that do *not* contain `e`. 57 | 58 | ```ruby 59 | >> items = %w[goal new user sit eat dinner] 60 | 61 | >> items.grep_v(/e/) 62 | => ["goal", "sit"] 63 | ``` 64 | 65 | **6)** Replace all occurrences of `note` irrespective of case with `X`. 66 | 67 | ```ruby 68 | >> ip = 'This note should not be NoTeD' 69 | 70 | >> ip.gsub(/note/i, 'X') 71 | => "This X should not be XD" 72 | ``` 73 | 74 | **7)** For the given input string, print all lines NOT containing the string `2`. 
75 | 76 | ```ruby 77 | '> purchases = %q{items qty 78 | '> apple 24 79 | '> mango 50 80 | '> guava 42 81 | '> onion 31 82 | >> water 10} 83 | 84 | >> num = /2/ 85 | 86 | >> puts purchases.each_line.grep_v(num) 87 | items qty 88 | mango 50 89 | onion 31 90 | water 10 91 | ``` 92 | 93 | **8)** For the given array, filter all elements that contain either `a` or `w`. 94 | 95 | ```ruby 96 | >> items = %w[goal new user sit eat dinner] 97 | 98 | >> items.filter { |e| e.match?(/a/) || e.match?(/w/) } 99 | => ["goal", "new", "eat"] 100 | ``` 101 | 102 | **9)** For the given array, filter all elements that contain both `e` and `n`. 103 | 104 | ```ruby 105 | >> items = %w[goal new user sit eat dinner] 106 | 107 | >> items.filter { |e| e.match?(/e/) && e.match?(/n/) } 108 | => ["new", "dinner"] 109 | ``` 110 | 111 | **10)** For the given string, replace `0xA0` with `0x7F` and `0xC0` with `0x1F`. 112 | 113 | ```ruby 114 | >> ip = 'start address: 0xA0, func1 address: 0xC0' 115 | 116 | >> ip.gsub(/0xA0/, '0x7F').gsub(/0xC0/, '0x1F') 117 | => "start address: 0x7F, func1 address: 0x1F" 118 | ``` 119 | 120 | **11)** Find the starting index of the first occurrence of `is` for the given input string. 121 | 122 | ```ruby 123 | >> ip = 'match this after the history lesson' 124 | 125 | >> ip =~ /is/ 126 | => 8 127 | ``` 128 | 129 |
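The solutions to exercises 8 and 10 above each run two separate regexp operations. As an alternative sketch (not the book's own solution), a single alternation can cover both cases in one pass, and `gsub` can take a hash as its replacement argument, the same form the code snippets in this repo use in `'9234012'.gsub(/1|2|4/, h)`. The names `either`, `addr_map` and `swapped` are made up for illustration.

```ruby
# Exercise 8: one alternation instead of two match? calls joined by ||
items = %w[goal new user sit eat dinner]
either = items.grep(/a|w/)

# Exercise 10: one pass over the string, the hash supplies each replacement
ip = 'start address: 0xA0, func1 address: 0xC0'
addr_map = { '0xA0' => '0x7F', '0xC0' => '0x1F' } # hypothetical lookup table
swapped = ip.gsub(/0xA0|0xC0/, addr_map)

p either
p swapped
```

For two fixed strings the chained `gsub` calls in the solution are just as readable. The hash form may be preferable when the list of replacements grows, or when one replacement could otherwise be matched again by a later `gsub` in the chain.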
# Anchors

**1)** Check if the given strings start with `be`.

```ruby
>> line1 = 'be nice'
>> line2 = '"best!"'
>> line3 = 'better?'
>> line4 = 'oh no\nbear spotted'

>> pat = /\Abe/

>> pat.match?(line1)
=> true
>> pat.match?(line2)
=> false
>> pat.match?(line3)
=> true
>> pat.match?(line4)
=> false
```

**2)** For the given input string, change only the whole word `red` to `brown`.

```ruby
>> words = 'bred red spread credible red.'

>> words.gsub(/\bred\b/, 'brown')
=> "bred brown spread credible brown."
```

**3)** For the given input array, filter elements that contain `42` surrounded by word characters.

```ruby
>> items = ['hi42bye', 'nice1423', 'bad42', 'cool_42a', '42fake', '_42_']

>> items.grep(/\B42\B/)
=> ["hi42bye", "nice1423", "cool_42a", "_42_"]
```

**4)** For the given input array, filter elements that start with `den` or end with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']

>> items.filter { |e| e.match?(/\Aden/) || e.match?(/ly\z/) }
=> ["lovely", "2 lonely", "dent"]
```

**5)** For the given input string, change whole word `mall` to `1234` only if it is at the start of a line.

```ruby
'> para = %q{(mall) call ball pall
'> ball fall wall tall
'> mall call ball pall
'> wall mall ball fall
'> mallet wallet malls
>> mall:call:ball:pall}

>> puts para.gsub(/^mall\b/, '1234')
(mall) call ball pall
ball fall wall tall
1234 call ball pall
wall mall ball fall
mallet wallet malls
1234:call:ball:pall
```

**6)** For the given array, filter elements having a line starting with `den` or ending with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']

>> items.filter { |e| e.match?(/^den/) || e.match?(/ly$/) }
=> ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
```

**7)** For the given input array, filter all whole elements `12\nthree` irrespective of case.

```ruby
>> items = ["12\nthree\n", "12\nThree", "12\nthree\n4", "12\nthree"]

>> items.grep(/\A12\nthree\z/i)
=> ["12\nThree", "12\nthree"]
```

**8)** For the given input array, replace `hand` with `X` for all elements that start with `hand` followed by at least one word character.

```ruby
>> items = %w[handed hand handy unhanded handle hand-2]

>> items.map { _1.sub(/\bhand\B/, 'X') }
=> ["Xed", "hand", "Xy", "unhanded", "Xle", "hand-2"]
```

**9)** For the given input array, filter all elements starting with `h`. Additionally, replace `e` with `X` for these filtered elements.

```ruby
>> items = %w[handed hand handy unhanded handle hand-2]

>> items.filter_map { |e| e.gsub(/e/, 'X') if e.match?(/\Ah/) }
=> ["handXd", "hand", "handy", "handlX", "hand-2"]
```
# Alternation and Grouping

**1)** For the given input array, filter all elements that start with `den` or end with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']

>> items.grep(/\Aden|ly\z/)
=> ["lovely", "2 lonely", "dent"]
```

**2)** For the given array, filter elements having a line starting with `den` or ending with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']

>> items.grep(/^den|ly$/)
=> ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
```

**3)** For the given strings, replace all occurrences of `removed` or `reed` or `received` or `refused` with `X`.

```ruby
>> s1 = 'creed refuse removed read'
>> s2 = 'refused reed redo received'

>> pat = /re(mov|ceiv|fus|)ed/

>> s1.gsub(pat, 'X')
=> "cX refuse X read"
>> s2.gsub(pat, 'X')
=> "X X redo X"
```

**4)** For the given strings, replace all matches from the array `words` with `A`.

```ruby
>> s1 = 'plate full of slate'
>> s2 = "slated for later, don't be late"
>> words = %w[late later slated]

>> pat = Regexp.union(words.sort_by { |w| -w.length })

>> s1.gsub(pat, 'A')
=> "pA full of sA"
>> s2.gsub(pat, 'A')
=> "A for A, don't be A"
```

**5)** Filter all whole elements from the input array `items` that exactly match any of the elements present in the array `words`.

```ruby
>> items = ['slate', 'later', 'plate', 'late', 'slates', 'slated ']
>> words = %w[late later slated]

>> pat = Regexp.union(words.sort_by { |w| -w.length })
>> pat = /\A(#{pat.source})\z/

>> items.grep(pat)
=> ["later", "late"]
```
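The solutions to exercises 4 and 5 sort `words` by length before calling `Regexp.union`. Here is a minimal sketch of why that might matter, reusing the sample data from exercise 4. Alternatives are tried from left to right, so a shorter alternative listed first can win over a longer one starting at the same position. The names `unsorted`, `longest_first`, `unsorted_result` and `sorted_result` are only illustrative and not part of the original solution.

```ruby
words = %w[late later slated]
s2 = "slated for later, don't be late"

# alternatives in the given order: /late|later|slated/
unsorted = Regexp.union(words)
# longest alternative first: /slated|later|late/
longest_first = Regexp.union(words.sort_by { |w| -w.length })

# 'late' is tried before 'later', so the trailing 'r' of 'later' is left behind
unsorted_result = s2.gsub(unsorted, 'A')
# with the longest alternative first, 'later' is replaced as a whole
sorted_result = s2.gsub(longest_first, 'A')

puts unsorted_result
puts sorted_result
```

With the unsorted pattern, `later` ends up as `Ar`, which is the kind of partial replacement the sorting step is meant to avoid.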
# Escaping metacharacters

**1)** Transform the given input strings to the expected output using the same logic on both strings.

```ruby
>> str1 = '(9-2)*5+qty/3-(9-2)*7'
>> str2 = '(qty+4)/2-(9-2)*5+pq/4'

>> str1.gsub('(9-2)*5', '35')
=> "35+qty/3-(9-2)*7"
>> str2.gsub('(9-2)*5', '35')
=> "(qty+4)/2-35+pq/4"
```

**2)** Replace `(4)\|` with `2` only at the start or end of the given input strings.

```ruby
>> s1 = '2.3/(4)\|6 fig 5.3-(4)\|'
>> s2 = '(4)\|42 - (4)\|3'
>> s3 = "two - (4)\\|\n"

>> pat = /\A\(4\)\\\||\(4\)\\\|\z/

>> s1.gsub(pat, '2')
=> "2.3/(4)\\|6 fig 5.3-2"
>> s2.gsub(pat, '2')
=> "242 - (4)\\|3"
>> s3.gsub(pat, '2')
=> "two - (4)\\|\n"
```

**3)** Replace any matching item from the given array with `X` for the given input strings. Match the elements from `items` literally. Assume no two elements of `items` will result in any matching conflict.

```ruby
>> items = ['a.b', '3+n', 'x\y\z', 'qty||price', '{n}']

>> pat = Regexp.union(items)

>> '0a.bcd'.gsub(pat, 'X')
=> "0Xcd"
>> 'E{n}AMPLE'.gsub(pat, 'X')
=> "EXAMPLE"
>> '43+n2 ax\y\ze'.gsub(pat, 'X')
=> "4X2 aXe"
```

**4)** Replace the backspace character `\b` with a single space character for the given input string.

```ruby
>> ip = "123\b456"
>> puts ip
12456

>> ip.gsub(/\x08/, ' ')
=> "123 456"
```

**5)** Replace all occurrences of `\o` with `o`.

```ruby
>> ip = 'there are c\omm\on aspects am\ong the alternati\ons'

>> ip.gsub(/\\o/, 'o')
=> "there are common aspects among the alternations"
```

**6)** Replace any matching item from the array `eqns` with `X` for the given string `ip`. Match the items from `eqns` literally.

```ruby
>> ip = '3-(a^b)+2*(a^b)-(a/b)+3'
>> eqns = %w[(a^b) (a/b) (a^b)+2]

>> pat = Regexp.union(eqns.sort_by { |w| -w.length })

>> ip.gsub(pat, 'X')
=> "3-X*X-X+3"
```
# Dot metacharacter and Quantifiers

>![info](../images/info.svg) Since the `.` metacharacter doesn't match newline characters by default, assume that the input strings in the following exercises will not contain newline characters.

**1)** Replace `42//5` or `42/5` with `8` for the given input.

```ruby
>> ip = 'a+42//5-c pressure*3+42/5-14256'

>> ip.gsub(%r{42//?5}, '8')
=> "a+8-c pressure*3+8-14256"
```

**2)** For the array `items`, filter all elements starting with `hand` and ending immediately with at most one more character or `le`.

```ruby
>> items = %w[handed hand handled handy unhand hands handle]

>> items.grep(/\Ahand(.|le)?\z/)
=> ["hand", "handy", "hands", "handle"]
```

**3)** Use the `split` method to get the output as shown for the given input strings.

```ruby
>> eqn1 = 'a+42//5-c'
>> eqn2 = 'pressure*3+42/5-14256'
>> eqn3 = 'r*42-5/3+42///5-42/53+a'

>> pat = %r{42//?5}

>> eqn1.split(pat)
=> ["a+", "-c"]
>> eqn2.split(pat)
=> ["pressure*3+", "-14256"]
>> eqn3.split(pat)
=> ["r*42-5/3+42///5-", "3+a"]
```

**4)** For the given input strings, remove everything from the first occurrence of `i` till the end of the string.

```ruby
>> s1 = 'remove the special meaning of such constructs'
>> s2 = 'characters while constructing'
>> s3 = 'input output'

>> pat = /i.*/

>> s1.sub(pat, '')
=> "remove the spec"
>> s2.sub(pat, '')
=> "characters wh"
>> s3.sub(pat, '')
=> ""
```

**5)** For the given strings, construct a regexp to get the output as shown below.

```ruby
>> str1 = 'a+b(addition)'
>> str2 = 'a/b(division) + c%d(#modulo)'
>> str3 = 'Hi there(greeting). Nice day(a(b)'

>> remove_parentheses = /\(.*?\)/

>> str1.gsub(remove_parentheses, '')
=> "a+b"
>> str2.gsub(remove_parentheses, '')
=> "a/b + c%d"
>> str3.gsub(remove_parentheses, '')
=> "Hi there. Nice day"
```

**6)** Correct the given regexp to get the expected output.

```ruby
>> words = 'plink incoming tint winter in caution sentient'

# wrong output
>> change = /int|in|ion|ing|inco|inter|ink/
>> words.gsub(change, 'X')
=> "plXk XcomXg tX wXer X cautX sentient"

# expected output
>> change = /in(ter|co|t|g|k)?|ion/
>> words.gsub(change, 'X')
=> "plX XmX tX wX X cautX sentient"
```

**7)** For the given greedy quantifiers, what would be the equivalent form using the `{m,n}` representation?

* `?` is same as `{,1}`
* `*` is same as `{0,}`
* `+` is same as `{1,}`

**8)** `(a*|b*)` is same as `(a|b)*`. True or false?

False. `(a*|b*)` will match only sequences made up of a single kind of character, like `a`, `aaa`, `bb`, `bbbbbbbb`. But `(a|b)*` can match mixed sequences like `ababbba` too.

**9)** For the given input strings, remove everything from the first occurrence of `test` (irrespective of case) till the end of the string, provided `test` isn't at the end of the string.

```ruby
>> s1 = 'this is a Test'
>> s2 = 'always test your RE for corner cases'
>> s3 = 'a TEST of skill tests?'

>> pat = /test.+/i

>> s1.sub(pat, '')
=> "this is a Test"
>> s2.sub(pat, '')
=> "always "
>> s3.sub(pat, '')
=> "a "
```

**10)** For the input array `words`, filter all elements starting with `s` and containing `e` and `t` in any order.

```ruby
>> words = ['sequoia', 'subtle', 'exhibit', 'a set', 'sets', 'tests', 'site']

>> words.grep(/\As.*(e.*t|t.*e)/)
=> ["subtle", "sets", "site"]
```

**11)** For the input array `words`, remove all elements having less than `6` characters.

```ruby
>> words = %w[sequoia subtle exhibit asset sets tests site]

>> words.grep(/.{6,}/)
=> ["sequoia", "subtle", "exhibit"]
```

**12)** For the input array `words`, filter all elements starting with `s` or `t` and having a maximum of `6` characters.

```ruby
>> words = ['sequoia', 'subtle', 'exhibit', 'asset', 'sets', 't set', 'site']

>> words.grep(/\A(s|t).{,5}\z/)
=> ["subtle", "sets", "t set", "site"]
```

**13)** Can you reason out why this code results in the output shown? The aim was to remove all `<characters>` patterns but not the `<>` ones. The expected result was `'a 1<> b 2<> c'`.

The use of the `.+` quantifier after `<` means that `<>` cannot be a possible match to satisfy `<.+?>`. So, after matching `<` (which occurs after `1` and `2` in the given input string) the regular expression engine will look for the next occurrence of the `>` character to satisfy the given pattern. To solve such cases, you need to use character classes (discussed in a later chapter) to specify which particular set of characters should be matched by the `+` quantifier (instead of the `.` metacharacter).

```ruby
>> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'

>> ip.gsub(/<.+?>/, '')
=> "a 1 2"
```

**14)** Use the `split` method to get the output as shown below for the given input strings.

```ruby
>> s1 = 'go there :: this :: that'
>> s2 = 'a::b :: c::d e::f :: 4::5'
>> s3 = '42:: hi::bye::see :: carefully'

>> pat = / +:: +/

>> s1.split(pat, 2)
=> ["go there", "this :: that"]
>> s2.split(pat, 2)
=> ["a::b", "c::d e::f :: 4::5"]
>> s3.split(pat, 2)
=> ["42:: hi::bye::see", "carefully"]
```

**15)** For the given input strings, match if the string starts with optional space characters followed by at least two `#` characters.

```ruby
>> s1 = ' ## header2'
>> s2 = '#### header4'
>> s3 = '# comment'
>> s4 = 'normal string'
>> s5 = 'nope ## not this'

>> pat = /\A *\#{2,}/

>> s1.match?(pat)
=> true
>> s2.match?(pat)
=> true
>> s3.match?(pat)
=> false
>> s4.match?(pat)
=> false
>> s5.match?(pat)
=> false
```

**16)** Modify the given regular expression such that it gives the expected results.

```ruby
>> s1 = 'appleabcabcabcapricot'
>> s2 = 'bananabcabcabcdelicious'

# wrong output
>> pat = /(abc)+a/
>> pat.match?(s1)
=> true
>> pat.match?(s2)
=> true

# expected output
# 'abc' shouldn't be considered when trying to match 'a' at the end
>> pat = /(abc)++a/
>> pat.match?(s1)
=> true
>> pat.match?(s2)
=> false
```
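The answers to exercises 7 and 8 in this section are stated without code, so here is a small sketch that checks them on a few sample strings. The `\A` and `\z` anchors, the literal `a` prefix in the quantifier patterns and all variable names are assumptions added for illustration. Without the anchors, `(a*|b*)` could trivially match an empty string anywhere.

```ruby
# exercise 7: each shorthand quantifier next to its {m,n} form
quantifier_pairs = [
  [/\Aab?\z/, /\Aab{,1}\z/],
  [/\Aab*\z/, /\Aab{0,}\z/],
  [/\Aab+\z/, /\Aab{1,}\z/]
]
inputs = %w[a ab abb abbb]

# true only if both forms agree on every sample input
equivalent = quantifier_pairs.all? do |short, long|
  inputs.all? { |s| short.match?(s) == long.match?(s) }
end
puts equivalent

# exercise 8: anchored so that the whole string has to match
one_kind = /\A(a*|b*)\z/
any_mix = /\A(a|b)*\z/
samples = %w[a aaa bb bbbbbbbb ababbba]

# only sequences made up of a single kind of character
one_kind_matches = samples.grep(one_kind)
# mixed sequences are accepted as well
any_mix_matches = samples.grep(any_mix)

p one_kind_matches
p any_mix_matches
```

Agreement on a handful of inputs is not a proof of equivalence, but it is a quick way to catch a wrong guess.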
# Working with matched portions

**1)** For the given strings, extract the matching portion from the first `is` to the last `t`.

```ruby
>> str1 = 'This the biggest fruit you have seen?'
>> str2 = 'Your mission is to read and practice consistently'

>> pat = /is.*t/

>> str1[pat]
=> "is the biggest fruit"
>> str2[pat]
=> "ission is to read and practice consistent"
```

**2)** Find the starting index of the first occurrence of `is` or `the` or `was` or `to` for the given input strings.

```ruby
>> s1 = 'match after the last newline character'
>> s2 = 'and then you want to test'
>> s3 = 'this is good bye then'
>> s4 = 'who was there to see?'

>> pat = /is|the|was|to/

>> s1 =~ pat
=> 12
>> s2 =~ pat
=> 4
>> s3 =~ pat
=> 2
>> s4 =~ pat
=> 4
```

**3)** Find the starting index of the last occurrence of `is` or `the` or `was` or `to` for the given input strings.

```ruby
>> s1 = 'match after the last newline character'
>> s2 = 'and then you want to test'
>> s3 = 'this is good bye then'
>> s4 = 'who was there to see?'

>> pat = /.*(is|the|was|to)/

>> s1.match(pat).begin(1)
=> 12
>> s2.match(pat).begin(1)
=> 18
>> s3.match(pat).begin(1)
=> 17
>> s4.match(pat).begin(1)
=> 14
```

**4)** Extract everything after the `:` character, which occurs only once in the input.

```ruby
>> ip = 'fruits:apple, mango, guava, blueberry'

# can also use: ip[/:(.*)/, 1]
# can also use: ip.sub(/.*:/, '')
>> ip.match(/:(.*)/)[1]
=> "apple, mango, guava, blueberry"
```

**5)** The given input strings contain some text followed by `-` followed by a number. Replace that number with its `log` value using `Math.log()`.

```ruby
>> s1 = 'first-3.14'
>> s2 = 'next-123'

>> pat = /-(.+)/

>> s1.sub(pat) { "-#{Math.log($1.to_f)}" }
=> "first-1.144222799920162"
>> s2.sub(pat) { "-#{Math.log($1.to_f)}" }
=> "next-4.812184355372417"
```

**6)** Replace all occurrences of `par` with `spar`, `spare` with `extra` and `park` with `garden` for the given input strings.

```ruby
>> str1 = 'apartment has a park'
>> str2 = 'do you have a spare cable'
>> str3 = 'write a parser'

>> pat = /park?|spare/
>> h = { 'par' => 'spar', 'spare' => 'extra', 'park' => 'garden' }

>> str1.gsub(pat, h)
=> "aspartment has a garden"
>> str2.gsub(pat, h)
=> "do you have a extra cable"
>> str3.gsub(pat, h)
=> "write a sparser"
```

**7)** Extract all words between `(` and `)` from the given input string as an array. Assume that the input will not contain any broken parentheses.

```ruby
>> ip = 'another (way) to reuse (portion) matched (by) capture groups'

# as nested array
>> ip.scan(/\((.*?)\)/)
=> [["way"], ["portion"], ["by"]]

# as array of strings
>> ip.gsub(/\((.*?)\)/).map { $1 }
=> ["way", "portion", "by"]
```

**8)** Extract all occurrences of `<` up to the next occurrence of `>`, provided there is at least one character in between `<` and `>`.

```ruby
>> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'

>> ip.scan(/<.+?>/)
=> ["<apple>", "<> b<bye>", "<> c<cat>"]
```

**9)** Use `scan` to get the output as shown below for the given input strings. Note the characters used in the input strings carefully.

```ruby
>> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
>> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '

>> pat = /(.+?),(.+?) /

>> row1.scan(pat)
=> [["-2", "5"], ["4", "+3"], ["+42", "-53"], ["4356246", "-357532354"]]
>> row2.scan(pat)
=> [["1.32", "-3.14"], ["634", "5.63"], ["63.3e3", "9907809345343.235"]]
```

**10)** This is an extension to the previous question.

* For `row1`, find the sum of integers of each array element. For example, sum of `-2` and `5` is `3`.
* For `row2`, find the sum of floating-point numbers of each array element. For example, sum of `1.32` and `-3.14` is `-1.82`.

```ruby
>> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 '
>> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 '

# should be same as the previous question
>> pat = /(.+?),(.+?) /

>> row1.scan(pat).map { |a, b| a.to_i + b.to_i }
=> [3, 7, -11, -353176108]

>> row2.scan(pat).map { |a, b| a.to_f + b.to_f }
=> [-1.82, 639.63, 9907809408643.234]
```

**11)** Use the `split` method to get the output as shown below.

```ruby
>> ip = '42:no-output;1000:car-tr:u-ck;SQEX49801'

>> ip.split(/:.+?-(.+?);/)
=> ["42", "output", "1000", "tr:u-ck", "SQEX49801"]
```

**12)** Convert the comma separated strings to corresponding `hash` objects as shown below. Note that the input strings have an extra `,` at the end.

```ruby
>> row1 = 'name:rohan,maths:75,phy:89,'
>> row2 = 'name:rose,maths:88,phy:92,'

>> pat = /(.+?):(.+?),/

>> row1.scan(pat).to_h
=> {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
>> row2.scan(pat).to_h
=> {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
```
# Character class

**1)** For the array `items`, filter all elements starting with `hand` and ending immediately with `s` or `y` or `le`.

```ruby
>> items = %w[-handy hand handy unhand hands hand-icy handle]

>> items.grep(/\Ahand([sy]|le)\z/)
=> ["handy", "hands", "handle"]
```

**2)** Replace all whole words `reed` or `read` or `red` with `X`.

```ruby
>> ip = 'redo red credible :read: rod reed'

>> ip.gsub(/\bre[ae]?d\b/, 'X')
=> "redo X credible :X: rod X"
```

**3)** For the array `words`, filter all elements containing `e` or `i` followed by `l` or `n`. Note that the order mentioned should be followed.

```ruby
>> words = %w[surrender unicorn newer door empty eel pest]

>> words.grep(/[ei].*[ln]/)
=> ["surrender", "unicorn", "eel"]
```

**4)** For the array `words`, filter all elements containing `e` or `i` and `l` or `n` in any order.

```ruby
>> words = %w[surrender unicorn newer door empty eel pest]

>> words.grep(/[ei].*[ln]|[ln].*[ei]/)
=> ["surrender", "unicorn", "newer", "eel"]
```

**5)** Convert the comma separated strings to corresponding `hash` objects as shown below.

```ruby
>> row1 = 'name:rohan,maths:75,phy:89'
>> row2 = 'name:rose,maths:88,phy:92'

>> pat = /([^:]+):([^,]+),?/

>> row1.scan(pat).to_h
=> {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
>> row2.scan(pat).to_h
=> {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
```

**6)** Delete from `(` to the next occurrence of `)` unless they contain parentheses characters in between.

```ruby
>> str1 = 'def factorial()'
>> str2 = 'a/b(division) + c%d(#modulo) - (e+(j/k-3)*4)'
>> str3 = 'Hi there(greeting). Nice day(a(b)'

>> remove_parentheses = /\([^()]*\)/

>> str1.gsub(remove_parentheses, '')
=> "def factorial"
>> str2.gsub(remove_parentheses, '')
=> "a/b + c%d - (e+*4)"
>> str3.gsub(remove_parentheses, '')
=> "Hi there. Nice day(a"
```

**7)** For the array `words`, filter all elements not starting with `e` or `p` or `u`.

```ruby
>> words = %w[surrender unicorn newer door empty eel (pest)]

>> words.grep(/\A[^epu]/)
=> ["surrender", "newer", "door", "(pest)"]
```

**8)** For the array `words`, filter all elements not containing `u` or `w` or `ee` or `-`.

```ruby
>> words = %w[p-t you tea heel owe new reed ear]

>> words.grep_v(/[uw-]|ee/)
=> ["tea", "ear"]
```

**9)** The given input strings contain fields separated by `,` and fields can be empty too. Replace the last three fields with `WHTSZ323`.

```ruby
>> row1 = '(2),kite,12,,D,C,,'
>> row2 = 'hi,bye,sun,moon'

>> pat = /(,[^,]*){3}\z/

>> row1.sub(pat, ',WHTSZ323')
=> "(2),kite,12,,D,WHTSZ323"
>> row2.sub(pat, ',WHTSZ323')
=> "hi,WHTSZ323"
```

**10)** Split the given strings based on consecutive sequence of digit or whitespace characters.

```ruby
>> str1 = "lion \t Ink32onion Nice"
>> str2 = "**1\f2\n3star\t7 77\r**"

>> pat = /[\d\s]+/

>> str1.split(pat)
=> ["lion", "Ink", "onion", "Nice"]
>> str2.split(pat)
=> ["**", "star", "**"]
```

**11)** Delete all occurrences of the sequence `<characters>` where `characters` is one or more non `>` characters and cannot be empty.

```ruby
>> ip = 'a<apple> 1<> b<bye> 2<> c<cat>'

>> ip.gsub(/<[^>]+>/, '')
=> "a 1<> b 2<> c"
```

**12)** `\b[a-z](on|no)[a-z]\b` is same as `\b[a-z][on]{2}[a-z]\b`. True or False? Sample input lines shown below might help to understand the differences, if any.

False. `[on]{2}` will also match `oo` and `nn`.

```ruby
>> puts "known\nmood\nknow\npony\ninns"
known
mood
know
pony
inns
```

**13)** For the given array, filter elements containing any number sequence greater than `624`.

```ruby
>> items = ['h0000432ab', 'car00625', '42_624 0512', '96 foo1234baz 3.14 2']

>> items.filter { _1.gsub(/\d+/).any? { $&.to_i > 624 } }
=> ["car00625", "96 foo1234baz 3.14 2"]
```

**14)** Count the maximum depth of nested braces for the given strings. Unbalanced or wrongly ordered braces should return `-1`. Note that this will require a mix of regular expressions and Ruby code.

```ruby
?> def max_nested_braces(ip)
?>   cnt = 0
?>   cnt += 1 while ip.gsub!(/\{[^{}]*\}/, '')
?>   return ip.match?(/[{}]/) ? -1 : cnt
>> end

>> max_nested_braces('a*b')
=> 0
>> max_nested_braces('}a+b{')
=> -1
>> max_nested_braces('a*b+{}')
=> 1
>> max_nested_braces('{{a+2}*{b+c}+e}')
=> 2
>> max_nested_braces('{{a+2}*{b+{c*d}}+e}')
=> 3
>> max_nested_braces("{{a+2}*{\n{b+{c*d}}+e*d}}")
=> 4
>> max_nested_braces('a*{b+c*{e*3.14}}}')
=> -1
```

**15)** By default, the `split` method will split on whitespace and remove empty strings from the result. Which regexp based method would you use to replicate this functionality?

```ruby
>> ip = " \t\r so pole\t\t\t\n\nlit in to \r\n\v\f "

>> ip.split
=> ["so", "pole", "lit", "in", "to"]

>> ip.scan(/\S+/)
=> ["so", "pole", "lit", "in", "to"]
```

**16)** Convert the given input string to two different arrays as shown below. You can optimize the regexp based on characters present in the input string.

```ruby
>> ip = "price_42 roast^\t\n^-ice==cat\neast"

>> ip.split(/\W+/)
=> ["price_42", "roast", "ice", "cat", "east"]

>> ip.split(/(\W+)/)
=> ["price_42", " ", "roast", "^\t\n^-", "ice", "==", "cat", "\n", "east"]
```

**17)** Filter all elements whose first non-whitespace character is not a `#` character. Any element made up of only whitespace characters should be ignored as well.

```ruby
>> items = [' #comment', "\t\napple #42", '#oops', 'sure', 'no#1', "\t\r\f"]

# can also use: items.grep(/\A\s*[^#\s]/)
>> items.grep(/\A\s*+[^#]/)
=> ["\t\napple #42", "sure", "no#1"]
```

**18)** Extract all whole words for the given input strings. However, based on user input `ignore`, do not match words if they contain any character present in the `ignore` variable. Assume that `ignore` variable will not contain any regexp metacharacters.

```ruby
>> s1 = 'match after the last newline character'
>> s2 = 'and then you want to test'

>> ignore = 'aty'
>> pat = /\b[\w&&[^#{ignore}]]+\b/
>> s1.scan(pat)
=> ["newline"]
>> s2.scan(pat)
=> []

>> ignore = 'esw'
>> pat = /\b[\w&&[^#{ignore}]]+\b/
>> s1.scan(pat)
=> ["match"]
>> s2.scan(pat)
=> ["and", "you", "to"]
```

**19)** Filter all whole elements with optional whitespaces at the start followed by three to five non-digit characters. Whitespaces at the start should not be part of the calculation for non-digit characters.

```ruby
>> items = ["\t \ncat", 'goal', ' oh', 'he-he', 'goal2', 'ok ', 'sparrow']

>> items.grep(/\A\s*+\D{3,5}\z/)
=> ["\t \ncat", "goal", "he-he", "ok "]
```

**20)** Modify the given regexp such that it gives the expected result.

```ruby
>> ip = '( S:12 E:5 S:4 and E:123 ok S:100 & E:10 S:1 - E:2 S:42 E:43 )'

# wrong output
>> ip.scan(/S:\d+.*?E:\d{2,}/)
=> ["S:12 E:5 S:4 and E:123", "S:100 & E:10", "S:1 - E:2 S:42 E:43"]

# expected output
>> ip.scan(/(?>S:\d+.*?E:)\d{2,}/)
=> ["S:4 and E:123", "S:100 & E:10", "S:42 E:43"]
```
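The answer to exercise 12 in this section only prints the sample words. As a rough sketch, the difference between the two patterns can be seen by filtering those same words with each one. The variable names here are illustrative and not part of the original solution.

```ruby
words = %w[known mood know pony inns]

# only the exact sequences 'on' or 'no' between two letters
alternation = /\b[a-z](on|no)[a-z]\b/
# any two characters from the set 'o' and 'n', so 'oo' and 'nn' qualify too
char_class = /\b[a-z][on]{2}[a-z]\b/

alternation_matches = words.grep(alternation)
char_class_matches = words.grep(char_class)

p alternation_matches
p char_class_matches
```

Here `mood` and `inns` pass only the character class version, while `known` fails both because it has five letters.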
# Groupings and backreferences

**1)** Replace the space character that occurs after a word ending with `a` or `r` with a newline character.

```ruby
>> ip = 'area not a _a2_ roar took 22'

>> puts ip.gsub(/([ar]) /, "\\1\n")
area
not a
_a2_ roar
took 22
```

**2)** Add `[]` around words starting with `s` and containing `e` and `t` in any order.

```ruby
>> ip = 'sequoia subtle exhibit asset sets2 tests si_te'

>> ip.gsub(/\bs\w*(t\w*e|e\w*t)\w*/, '[\0]')
=> "sequoia [subtle] exhibit asset [sets2] tests [si_te]"
```

**3)** Replace all whole words with `X` that start and end with the same word character (irrespective of case). Single character word should get replaced with `X` too, as it satisfies the stated condition.

```ruby
>> ip = 'oreo not a _a2_ Roar took 22'

# can also use: ip.gsub(/\b(\w|(\w)\w*\2)\b/i, 'X')
>> ip.gsub(/\b(\w)(\w*\1)?\b/i, 'X')
=> "X not X X X took X"
```

**4)** Convert the given *markdown* headers to corresponding *anchor* tags. Consider the input to start with one or more `#` characters followed by space and word characters. The `name` attribute is constructed by converting the header to lowercase and replacing spaces with hyphens. Can you do it without using a capture group?

```ruby
>> header1 = '# Regular Expressions'
>> header2 = '## Named capture groups'

>> anchor = /\w.*/

>> header1.sub(anchor) { "<a name='#{$&.downcase.tr(' ', '-')}'></a>#{$&}" }
=> "# <a name='regular-expressions'></a>Regular Expressions"
>> header2.sub(anchor) { "<a name='#{$&.downcase.tr(' ', '-')}'></a>#{$&}" }
=> "## <a name='named-capture-groups'></a>Named capture groups"
```

**5)** Convert the given *markdown* anchors to corresponding *hyperlinks*.

```ruby
>> anchor1 = "# <a name='regular-expressions'></a>Regular Expressions"
>> anchor2 = "## <a name='subexpression-calls'></a>Subexpression calls"

>> hyperlink = %r{[^']+'([^']+)'></a>(.+)}

>> anchor1.sub(hyperlink, '[\2](#\1)')
=> "[Regular Expressions](#regular-expressions)"
>> anchor2.sub(hyperlink, '[\2](#\1)')
=> "[Subexpression calls](#subexpression-calls)"
```

**6)** Count the number of whole words that have at least two occurrences of consecutive repeated alphabets. For example, words like `stillness` and `Committee` should be counted but not words like `root` or `readable` or `rotational`.

```ruby
'> ip = %q{oppressed abandon accommodation bloodless
'> carelessness committed apparition innkeeper
'> occasionally afforded embarrassment foolishness
'> depended successfully succeeded
>> possession cleanliness suppress}

# can also use: ip.scan(/\b\w*(\w)\1\w*(\w)\2\w*\b/).size
>> ip.scan(/\b(\w*(\w)\2){2}\w*\b/).size
=> 13
```

**7)** For the given input string, replace all occurrences of digit sequences with only the unique non-repeating sequence. For example, `232323` should be changed to `23` and `897897` should be changed to `897`. If there are no repeats (for example `1234`) or if the repeats end prematurely (for example `12121`), it should not be changed.

```ruby
>> ip = '1234 2323 453545354535 9339 11 60260260'

>> ip.gsub(/\b(\d+)\1+\b/, '\1')
=> "1234 23 4535 9339 1 60260260"
```

**8)** Replace sequences made up of words separated by `:` or `.` by the first word of the sequence. Such sequences will end when `:` or `.` is not followed by a word character.

```ruby
>> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'

>> ip.gsub(/([:.]\w*)+/, '')
=> "wow hi-2 bye kite"
```

**9)** Replace sequences made up of words separated by `:` or `.` by the last word of the sequence. Such sequences will end when `:` or `.` is not followed by a word character.

```ruby
>> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'

>> ip.gsub(/((\w+)[:.])+/, '\2')
=> "five hi-2 bye water"
```

**10)** Split the given input string on one or more repeated sequence of `cat`.

```ruby
>> ip = 'firecatlioncatcatcatbearcatcatparrot'

>> ip.split(/(?:cat)+/)
=> ["fire", "lion", "bear", "parrot"]
```

**11)** For the given input string, find all occurrences of digit sequences with at least one repeating sequence. For example, `232323` and `897897`. If the repeats end prematurely, for example `12121`, it should not be matched.

```ruby
>> ip = '1234 2323 453545354535 9339 11 60260260'

>> pat = /\b(\d+)\1+\b/

# entire sequences in the output
>> ip.gsub(pat).map { $& }
=> ["2323", "453545354535", "11"]

# only the unique sequence in the output
>> ip.gsub(pat).map { $1 }
=> ["23", "4535", "1"]
```

**12)** Convert the comma separated strings to corresponding `hash` objects as shown below. The keys are `name`, `maths` and `phy` for the three fields in the input strings.

```ruby
>> row1 = 'rohan,75,89'
>> row2 = 'rose,88,92'

>> pat = /(?<name>[^,]+),(?<maths>[^,]+),(?<phy>[^,]+)/

>> row1.match(pat).named_captures
=> {"name"=>"rohan", "maths"=>"75", "phy"=>"89"}
>> row2.match(pat).named_captures
=> {"name"=>"rose", "maths"=>"88", "phy"=>"92"}
```

**13)** Surround all whole words with `()`. Additionally, if the whole word is `imp` or `ant`, delete them. Can you do it with just a single substitution?

```ruby
>> ip = 'tiger imp goat eagle ant important'

>> ip.gsub(/\b(?:imp|ant|(\w+))\b/, '(\1)')
=> "(tiger) () (goat) (eagle) () (important)"
```

**14)** Filter all elements that contain a sequence of lowercase alphabets followed by `-` followed by digits. They can be optionally surrounded by `{{` and `}}`. Any partial match shouldn't be part of the output.

```ruby
>> ip = %w[{{apple-150}} {{mango2-100}} {{cherry-200 grape-87 {{go-to}}]

>> ip.grep(/\A({{)?[a-z]+-\d+(?(1)}})\z/)
=> ["{{apple-150}}", "grape-87"]
```

**15)** Extract all hexadecimal character sequences, with `0x` optional prefix. Match the characters case insensitively, and the sequences shouldn't be surrounded by other word characters.

```ruby
>> str1 = '128A foo 0xfe32 34 0xbar'
>> str2 = '0XDEADBEEF place 0x0ff1ce bad'

>> hex_seq = /\b(?:0x)?\h+\b/i

>> str1.scan(hex_seq)
=> ["128A", "0xfe32", "34"]
>> str2.scan(hex_seq)
=> ["0XDEADBEEF", "0x0ff1ce", "bad"]
```

**16)** Replace sequences made up of words separated by `:` or `.` by the first/last word of the sequence and the separator. Such sequences will end when `:` or `.` is not followed by a word character.

```ruby
>> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.'

# first word of the sequence
>> ip.gsub(/((\w+[:.]))\g<2>+/, '\1')
=> "wow: hi-2 bye kite."

# last word of the sequence
>> ip.gsub(/(\w+[:.])\g<1>+/, '\1')
=> "five: hi-2 bye water."
```

**17)** For the given input strings, extract `if` followed by any number of nested parentheses. Assume that there will be only one such pattern per input string.
1218 | 1219 | ```ruby 1220 | >> ip1 = 'for (((i*3)+2)/6) if(3-(k*3+4)/12-(r+2/3)) while()' 1221 | >> ip2 = 'if+while if(a(b)c(d(e(f)1)2)3) for(i=1)' 1222 | 1223 | >> pat = /if(\((?:[^()]++|\g<1>)++\))/ 1224 | 1225 | >> ip1[pat] 1226 | => "if(3-(k*3+4)/12-(r+2/3))" 1227 | >> ip2[pat] 1228 | => "if(a(b)c(d(e(f)1)2)3)" 1229 | ``` 1230 | 1231 | **18)** The given input string has sequences made up of words separated by `:` or `.` and such sequences will end when `:` or `.` is not followed by a word character. For all such sequences, display only the last word followed by `-` followed by the first word. 1232 | 1233 | ```ruby 1234 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.' 1235 | 1236 | >> ip.scan(/(\w+)[:.](?:(\w+)[:.])+/).map { "#{_2}-#{_1}" } 1237 | => ["five-wow", "water-kite"] 1238 | ``` 1239 | 1240 |
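The recursive subexpression call in exercise 17 can be tried outside of `irb` as well. The following is only a minimal sketch written as a plain script: the pattern and the first two inputs are taken from the solution above, while the variable name `nested_if` and the two extra inputs (`if(unbalanced` and `if()`) are assumptions added for illustration.

```ruby
# Group 1 matches one balanced (...) portion. Inside the parentheses, it
# accepts runs of non-parenthesis characters or, via \g<1>, another
# balanced group, which is what allows arbitrary nesting depth.
nested_if = /if(\((?:[^()]++|\g<1>)++\))/

ip1 = 'for (((i*3)+2)/6) if(3-(k*3+4)/12-(r+2/3)) while()'
ip2 = 'if+while if(a(b)c(d(e(f)1)2)3) for(i=1)'

puts ip1[nested_if]   # if(3-(k*3+4)/12-(r+2/3))
puts ip2[nested_if]   # if(a(b)c(d(e(f)1)2)3)

# Hypothetical extra inputs: no closing parenthesis, and empty parentheses.
# The ++ quantifier needs at least one item inside the parentheses,
# so neither of these produces a match.
p 'if(unbalanced'[nested_if]   # nil
p 'if()'[nested_if]            # nil
```

The possessive quantifiers (`++`) are not required for correctness here, but they prevent needless backtracking when the parentheses turn out to be unbalanced.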
# Lookarounds

>![info](../images/info.svg) Please use lookarounds for solving the following exercises even if you can do it without them. The exception is cases where lookarounds cannot be used, such as variable length lookbehinds.

**1)** Replace all whole words with `X` unless it is preceded by a `(` character.

```ruby
>> ip = '(apple) guava berry) apple (mango) (grape'

>> ip.gsub(/(?<!\()\b\w+/, 'X')
=> "(apple) X X) X (mango) (grape"
```

**2)** Replace all whole words with `X` unless it is followed by a `)` character.

```ruby
>> ip = '(apple) guava berry) apple (mango) (grape'

>> ip.gsub(/\w+\b(?!\))/, 'X')
=> "(apple) X berry) X (mango) (X"
```

**3)** Replace all whole words with `X` unless it is preceded by `(` or followed by `)` characters.

```ruby
>> ip = '(apple) guava berry) apple (mango) (grape'

>> ip.gsub(/(?<!\()\b\w+\b(?!\))/, 'X')
=> "(apple) X berry) X (mango) (grape"
```

**4)** Extract all whole words that do not end with `e` or `n`.

```ruby
>> ip = 'a_t row on Urn e note Dust n end a2-e|u'

>> ip.scan(/\b\w+\b(?<![en])/)
=> ["a_t", "row", "Dust", "end", "a2", "u"]
```

**5)** Extract all whole words that do not start with `a` or `d` or `n`.

```ruby
>> ip = 'a_t row on Urn e note Dust n end a2-e|u'

>> ip.scan(/(?![adn])\b\w+\b/)
=> ["row", "on", "Urn", "e", "Dust", "end", "e", "u"]
```

**6)** Extract all whole words only if they are followed by `:` or `,` or `-`.

```ruby
>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

>> ip.scan(/\w+(?=[:,-])/)
=> ["Poke", "so_good", "ever2"]
```

**7)** Extract all whole words only if they are preceded by `=` or `/` or `-`.

```ruby
>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

# can also use: ip.scan(%r{[=/-]\K\w+})
>> ip.scan(%r{(?<=[=/-])\w+})
=> ["so_good", "is", "sit"]
```

**8)** Extract all whole words only if they are preceded by `=` or `:` and followed by `:` or `.`.

```ruby
>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

# can also use: ip.scan(/[=:]\K\w+(?=[:.])/)
>> ip.scan(/(?<=[=:])\w+(?=[:.])/)
=> ["so_good", "ink"]
```

**9)** Extract all whole words only if they are preceded by `=` or `:` or `.` or `(` or `-` and not followed by `.` or `/`.

```ruby
>> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit'

# can also use: ip.scan(%r{[=:.(-]\K\w+\b(?![/.])})
>> ip.scan(%r{(?<=[=:.(-])\w+\b(?![/.])})
=> ["so_good", "vast", "sit"]
```

**10)** Remove the leading and trailing whitespaces from all the individual fields where `,` is the field separator.

```ruby
>> csv1 = " comma ,separated ,values \t\r "
>> csv2 = 'good bad,nice ice , 42 , , stall small'

>> remove_whitespace = /(?<![^,])\s+|\s+(?![^,])/

>> csv1.gsub(remove_whitespace, '')
=> "comma,separated,values"
>> csv2.gsub(remove_whitespace, '')
=> "good bad,nice ice,42,,stall small"
```

**11)** Filter elements that satisfy all of these rules:

* should have at least two alphabets
* should have at least three digits
* should have at least one special character among `%` or `*` or `#` or `$`
* should not end with a whitespace character

```ruby
>> pwds = ['hunter2', 'F2H3u%9', "*X3Yz3.14\t", 'r2_d2_42', 'A $B C1234']

>> rule_chk = /(?=(.*[a-zA-Z]){2})(?=(.*\d){3})(?!.*\s\z).*[%*#$]/

>> pwds.grep(rule_chk)
=> ["F2H3u%9", "A $B C1234"]
```

**12)** For the given string, surround all whole words with `{}` except for whole words `par` and `cat` and `apple`.

```ruby
>> ip = 'part; cat {super} rest_42 par scatter apple spar'

>> ip.gsub(/\b(?!(?:par|cat|apple)\b)\w+/, '{\0}')
=> "{part}; cat {{super}} {rest_42} par {scatter} apple {spar}"
```

**13)** Extract the integer portion of floating-point numbers for the given string. Integers and numbers ending with `.` and no further digits should not be considered.

```ruby
>> ip = '12 ab32.4 go 5 2. 46.42 5'

>> ip.scan(/\d+(?=\.\d)/)
=> ["32", "46"]
```

**14)** For the given input strings, extract all overlapping two character sequences.

```ruby
>> s1 = 'apple'
>> s2 = '1.2-3:4'

>> pat = /.(?=(.))/

>> s1.gsub(pat).map { $& + $1 }
=> ["ap", "pp", "pl", "le"]
>> s2.gsub(pat).map { $& + $1 }
=> ["1.", ".2", "2-", "-3", "3:", ":4"]
```

**15)** The given input strings contain fields separated by the `:` character. Delete `:` and the last field if there is a digit character anywhere before the last field.

```ruby
>> s1 = '42:cat'
>> s2 = 'twelve:a2b'
>> s3 = 'we:be:he:0:a:b:bother'
>> s4 = 'apple:banana-42:cherry:'
>> s5 = 'dragon:unicorn:centaur'

>> pat = /(\d.*):.*/

>> s1.sub(pat, '\1')
=> "42"
>> s2.sub(pat, '\1')
=> "twelve:a2b"
>> s3.sub(pat, '\1')
=> "we:be:he:0:a:b"
>> s4.sub(pat, '\1')
=> "apple:banana-42:cherry"
>> s5.sub(pat, '\1')
=> "dragon:unicorn:centaur"
```

**16)** Extract all whole words unless they are preceded by `:` or `<=>` or `----` or `#`.

```ruby
>> ip = '::very--at<=>row|in.a_b#b2c=>lion----east'

>> ip.scan(/(?<![:#]|<=>|-{4})\b\w+/)
=> ["at", "in", "a_b", "lion"]
```

**17)** Match strings if it contains `qty` followed by `price` but not if there is any **whitespace** character or the string `error` between them.

```ruby
>> str1 = '23,qty,price,42'
>> str2 = 'qty price,oh'
>> str3 = '3.14,qty,6,errors,9,price,3'
>> str4 = "42\nqty-6,apple-56,price-234,error"
>> str5 = '4,price,3.14,qty,4'
>> str6 = '(qtyprice) (hi-there)'

# can also use: neg = /qty((?!\s|error).)*price/
>> neg = /qty(?~\s|error)price/

>> str1.match?(neg)
=> true
>> str2.match?(neg)
=> false
>> str3.match?(neg)
=> false
>> str4.match?(neg)
=> true
>> str5.match?(neg)
=> false
>> str6.match?(neg)
=> true
```

**18)** Can you reason out why the following regular expressions behave differently?

`\b` matches both the start and end of word locations. In the below example, `\b..\b` doesn't necessarily mean that the first `\b` will match only the start of word location and the second `\b` will match only the end of word location. They can be any combination! For example, `I` followed by space in the input string here is using the start of word location for both the conditions. Similarly, space followed by `2` is using the end of word location for both the conditions.

In contrast, the negative lookarounds version ensures that there are no word characters around any two characters. Also, such assertions will always be satisfied at the start of string and the end of string respectively. But `\b` depends on the presence of word characters. For example, `!` at the end of the input string here matches the lookaround assertion but not word boundary.

```ruby
>> ip = 'I have 12, he has 2!'

>> ip.gsub(/\b..\b/, '{\0}')
=> "{I }have {12}{, }{he} has{ 2}!"

>> ip.gsub(/(?<!\w)..(?!\w)/, '{\0}')
=> "I have {12}, {he} has {2!}"
```

**19)** The given input strings have fields separated by the `:` character. Assume that each string has a minimum of two fields and cannot have empty fields. Extract all fields, but stop if a field with a digit character is found.

```ruby
>> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
>> row2 = 'um:no:low:3e:s4w:seer'
>> row3 = 'oh100:apple:banana:fig'
>> row4 = 'Dragon:Unicorn:Wizard-Healer'

>> pat = /\G([^\d:]+)(?::|\z)/

>> row1.gsub(pat).map { $1 }
=> ["vast"]
>> row2.gsub(pat).map { $1 }
=> ["um", "no", "low"]
>> row3.gsub(pat).map { $1 }
=> []
>> row4.gsub(pat).map { $1 }
=> ["Dragon", "Unicorn", "Wizard-Healer"]
```

**20)** The given input strings have fields separated by the `:` character. Extract all fields only after a field containing a digit character is found. Assume that each string has a minimum of two fields and cannot have empty fields.

```ruby
>> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end'
>> row2 = 'um:no:low:3e:s4w:seer'
>> row3 = 'oh100:apple:banana:fig'
>> row4 = 'Dragon:Unicorn:Wizard-Healer'

>> pat = /(?:\d[^:]*|\G):\K[^:]+/

>> row1.scan(pat)
=> ["ride", "in", "awe", "b2b", "3list", "end"]
>> row2.scan(pat)
=> ["s4w", "seer"]
>> row3.scan(pat)
=> ["apple", "banana", "fig"]
>> row4.scan(pat)
=> []
```

**21)** The given input string has comma separated fields and some of them can occur more than once. For the duplicated fields, retain only the rightmost one. Assume that there are no empty fields.

```ruby
>> row = '421,cat,2425,42,5,cat,6,6,42,61,6,6,scat,6,6,4,Cat,425,4'

>> row.gsub(/(?<![^,])([^,]+),(?=.*(?<![^,])\1(?![^,]))/, '')
=> "421,2425,5,cat,42,61,scat,6,Cat,425,4"
```

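The reasoning in exercise 18 is easier to follow once the word boundary locations are made visible. The sketch below is a plain script rather than an `irb` session; it reuses the input string and both patterns from that exercise. The `boundaries` line and the variable names are additions for illustration, not part of the original solution.

```ruby
ip = 'I have 12, he has 2!'

# Insert | at every location where \b matches. Each word contributes a
# start-of-word and an end-of-word location, and \b..\b accepts any pairing.
boundaries = ip.gsub(/\b/, '|')
puts boundaries          # |I| |have| |12|, |he| |has| |2|!

# \b on both sides: 'I ' and ' 2' qualify because both ends happen to be
# boundary locations, even though neither is a two-character word.
with_word_boundary = ip.gsub(/\b..\b/, '{\0}')
puts with_word_boundary  # {I }have {12}{, }{he} has{ 2}!

# Negative lookarounds: no word character is allowed on either side,
# and the assertions are also satisfied at the string edges.
with_lookarounds = ip.gsub(/(?<!\w)..(?!\w)/, '{\0}')
puts with_lookarounds    # I have {12}, {he} has {2!}
```

Comparing the first line of output with the second shows that every `{..}` pair starts and ends at a `|` location, whichever kind of boundary it is.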
# Modifiers

**1)** Remove from the first occurrence of `hat` to the last occurrence of `it` for the given input strings. Match these markers case insensitively.

```ruby
>> s1 = "But Cool THAT\nsee What okay\nwow quite"
>> s2 = 'it this hat is sliced HIT.'

>> pat = /hat.*it/im

>> s1.sub(pat, '')
=> "But Cool Te"
>> s2.sub(pat, '')
=> "it this ."
```

**2)** Delete from the string `start` if it is at the beginning of a line up to the next occurrence of the string `end` at the end of a line. Match these keywords irrespective of case.

```ruby
'> para = %q{good start
'> start working on that
'> project you always wanted
'> to, do not let it end
'> hi there
'> start and end the end
'> 42
'> Start and try to
'> finish the End
>> bye}

>> pat = /^start.*?end$/im

>> puts para.gsub(pat, '')
good start

hi there

42

bye
```

**3)** For the given *markdown* file, replace all occurrences of the string `ruby` (irrespective of case) with the string `Ruby`. However, any match within code blocks that start with the whole line ` ```ruby ` and end with the whole line ` ``` ` shouldn't be replaced. Consider the input file to be small enough to fit memory requirements.

Refer to the [exercises folder](https://github.com/learnbyexample/Ruby_Regexp/tree/master/exercises) for input files required to solve this exercise.

```ruby
>> ip_str = File.open('sample.md').read
>> pat = /(^```ruby$.*?^```$)/m

>> File.open('sample_mod.md', 'w') do |f|
?>   ip_str.split(pat).each_with_index do |s, i|
?>     f.write(i.odd? ? s : s.gsub(/ruby/i) { $&.capitalize })
>>   end
>> end

>> File.open('sample_mod.md').read == File.open('expected.md').read
=> true
```

**4)** Write a string method that changes the given input to alternate case (starting with lowercase first).

```ruby
?> def aLtErNaTe_CaSe(ip_str)
?>   b = true
?>   return ip_str.gsub(/[a-z]/i) { (b = !b) ? $&.upcase : $&.downcase }
>> end

>> aLtErNaTe_CaSe('HI THERE!')
=> "hI tHeRe!"
>> aLtErNaTe_CaSe('good morning')
=> "gOoD mOrNiNg"
>> aLtErNaTe_CaSe('Sample123string42with777numbers')
=> "sAmPlE123sTrInG42wItH777nUmBeRs"
```

**5)** For the given input strings, match all of these three conditions:

* `This` case sensitively
* `nice` and `cool` case insensitively

```ruby
>> s1 = 'This is nice and Cool'
>> s2 = 'Nice and cool this is'
>> s3 = 'What is so nice and cool about This?'
>> s4 = 'nice,cool,This'
>> s5 = 'not nice This?'
>> s6 = 'This is not cool'

>> pat = /(?=.*nice)(?=.*cool)(?-i:.*This)/i

>> s1.match?(pat)
=> true
>> s2.match?(pat)
=> false
>> s3.match?(pat)
=> true
>> s4.match?(pat)
=> true
>> s5.match?(pat)
=> false
>> s6.match?(pat)
=> false
```

**6)** For the given input strings, match if the string begins with `Th` and also contains a line that starts with `There`.

```ruby
>> s1 = "There there\nHave a cookie"
>> s2 = "This is a mess\nYeah?\nThereeeee"
>> s3 = "Oh\nThere goes the fun"
>> s4 = 'This is not\ngood\nno There'

>> pat = /\A(?=Th)(?m:.*^There)/

>> s1.match?(pat)
=> true
>> s2.match?(pat)
=> true
>> s3.match?(pat)
=> false
>> s4.match?(pat)
=> false
```

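The solution to exercise 3 relies on `split` keeping the text matched by a capture group in its output, which places the code blocks at odd indices. Here is a minimal sketch of that idea as a plain script, using a small in-memory string instead of the `sample.md` file. The sample text and the names `fence`, `text`, `parts` and `result` are assumptions made for illustration.

```ruby
# Build the fence marker programmatically to keep this example readable
fence = '`' * 3
text = "ruby intro\n#{fence}ruby\nputs 'ruby'\n#{fence}\nmore ruby\n"

# Same idea as the solution: the capture group makes split retain the
# code blocks, so the pieces alternate between prose and code
pat = /(^#{fence}ruby$.*?^#{fence}$)/m
parts = text.split(pat)
p parts.size   # 3, with the code block at index 1

# Change only the even-indexed (prose) pieces
result = parts.each_with_index.map do |s, i|
  i.odd? ? s : s.gsub(/ruby/i) { $&.capitalize }
end.join

puts result
```

With this sample, only the two occurrences outside the fenced block are changed to `Ruby`; the language hint on the fence line and the string inside the code block are left alone.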
# Unicode

**1)** Output `true` or `false` depending on input string made up of ASCII characters or not. Consider the input to be non-empty strings and any character that isn't part of the 7-bit ASCII set should give `false`.

```ruby
>> str1 = '123—456'
>> str2 = 'good fοοd'
>> str3 = 'happy learning!'

# can also use ! str1.match?(/[^\u{00}-\u{7f}]/)
>> str1.ascii_only?
=> false
>> str2.ascii_only?
=> false
>> str3.ascii_only?
=> true
```

**2)** Retain only punctuation characters for the given strings (generated from codepoints). Use the Unicode character set definition for punctuation for solving this exercise.

```ruby
>> s1 = (0..0x7f).to_a.pack('U*')
>> s2 = (0x80..0xff).to_a.pack('U*')
>> s3 = (0x2600..0x27eb).to_a.pack('U*')

>> pat = /\p{^P}/

>> s1.gsub(pat, '')
=> "!\"#%&'()*,-./:;?@[\\]_{}"
>> s2.gsub(pat, '')
=> "¡§«¶·»¿"
>> s3.gsub(pat, '')
=> "❨❩❪❫❬❭❮❯❰❱❲❳❴❵⟅⟆⟦⟧⟨⟩⟪⟫"
```

**3)** Explore the following Q&A threads.

* [stackoverflow: remove emoji from string](https://stackoverflow.com/q/24672834/4082052)
* [stackoverflow: why am I seeing different results for these two nearly identical regexp](https://stackoverflow.com/q/13573136/4082052)
* [stackoverflow: convert unicode number to integer](https://stackoverflow.com/q/37338708/4082052)
* [stackoverflow: replacing %uXXXX to the corresponding unicode codepoint](https://stackoverflow.com/q/28773392/4082052)

--------------------------------------------------------------------------------
/exercises/Exercises.md:
--------------------------------------------------------------------------------
# Exercises

>![info](../images/info.svg) Try to solve the exercises in every chapter using only the features discussed until that chapter. Some of the exercises will be easier to solve with techniques presented in the later chapters, but the aim of these exercises is to explore the features presented so far.

>![info](../images/info.svg) For solutions, see [Exercise_solutions.md](https://github.com/learnbyexample/Ruby_Regexp/blob/master/exercises/Exercise_solutions.md).

# Regexp introduction

**1)** Check whether the given strings contain `0xB0`. Display a boolean result as shown below.

```ruby
>> line1 = 'start address: 0xA0, func1 address: 0xC0'
>> line2 = 'end address: 0xFF, func2 address: 0xB0'

>> line1.match?()        ##### add your solution here
=> false
>> line2.match?()        ##### add your solution here
=> true
```

**2)** Check if the given input strings contain `two` irrespective of case.

```ruby
>> s1 = 'Their artwork is exceptional'
>> s2 = 'one plus tw0 is not three'
>> s3 = 'TRUSTWORTHY'

>> pat1 = //        ##### add your solution here

>> pat1.match?(s1)
=> true
>> pat1.match?(s2)
=> false
>> pat1.match?(s3)
=> true
```

**3)** Replace all occurrences of `5` with `five` for the given string.

```ruby
>> ip = 'They ate 5 apples and 5 oranges'

>> ip.gsub(//, 'five')        ##### add your solution here
=> "They ate five apples and five oranges"
```

**4)** Replace only the first occurrence of `5` with `five` for the given string.

```ruby
>> ip = 'They ate 5 apples and 5 oranges'

>> ip.sub(//, 'five')        ##### add your solution here
=> "They ate five apples and 5 oranges"
```

**5)** For the given array, filter all elements that do *not* contain `e`.

```ruby
>> items = %w[goal new user sit eat dinner]

>> items.grep_v(//)        ##### add your solution here
=> ["goal", "sit"]
```

**6)** Replace all occurrences of `note` irrespective of case with `X`.

```ruby
>> ip = 'This note should not be NoTeD'

>> ip.gsub(//, 'X')        ##### add your solution here
=> "This X should not be XD"
```

**7)** For the given input string, print all lines NOT containing the string `2`.

```ruby
'> purchases = %q{items qty
'> apple 24
'> mango 50
'> guava 42
'> onion 31
>> water 10}

>> num = //        ##### add your solution here

>> puts purchases.each_line.grep_v(num)
items qty
mango 50
onion 31
water 10
```

**8)** For the given array, filter all elements that contain either `a` or `w`.

```ruby
>> items = %w[goal new user sit eat dinner]

>> items.filter { }        ##### add your solution here
=> ["goal", "new", "eat"]
```

**9)** For the given array, filter all elements that contain both `e` and `n`.

```ruby
>> items = %w[goal new user sit eat dinner]

>> items.filter { }        ##### add your solution here
=> ["new", "dinner"]
```

**10)** For the given string, replace `0xA0` with `0x7F` and `0xC0` with `0x1F`.

```ruby
>> ip = 'start address: 0xA0, func1 address: 0xC0'

##### add your solution here
=> "start address: 0x7F, func1 address: 0x1F"
```

**11)** Find the starting index of the first occurrence of `is` for the given input string.

```ruby
>> ip = 'match this after the history lesson'

##### add your solution here
=> 8
```

# Anchors

**1)** Check if the given strings start with `be`.

```ruby
>> line1 = 'be nice'
>> line2 = '"best!"'
>> line3 = 'better?'
>> line4 = 'oh no\nbear spotted'

>> pat =        ##### add your solution here

>> pat.match?(line1)
=> true
>> pat.match?(line2)
=> false
>> pat.match?(line3)
=> true
>> pat.match?(line4)
=> false
```

**2)** For the given input string, change only the whole word `red` to `brown`.

```ruby
>> words = 'bred red spread credible red.'

>> words.gsub()        ##### add your solution here
=> "bred brown spread credible brown."
```

**3)** For the given input array, filter elements that contain `42` surrounded by word characters.

```ruby
>> items = ['hi42bye', 'nice1423', 'bad42', 'cool_42a', '42fake', '_42_']

>> items.grep()        ##### add your solution here
=> ["hi42bye", "nice1423", "cool_42a", "_42_"]
```

**4)** For the given input array, filter elements that start with `den` or end with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']

>> items.filter { }        ##### add your solution here
=> ["lovely", "2 lonely", "dent"]
```

**5)** For the given input string, change whole word `mall` to `1234` only if it is at the start of a line.

```ruby
'> para = %q{(mall) call ball pall
'> ball fall wall tall
'> mall call ball pall
'> wall mall ball fall
'> mallet wallet malls
>> mall:call:ball:pall}

>> puts para.gsub()        ##### add your solution here
(mall) call ball pall
ball fall wall tall
1234 call ball pall
wall mall ball fall
mallet wallet malls
1234:call:ball:pall
```

**6)** For the given array, filter elements having a line starting with `den` or ending with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']

>> items.filter { }        ##### add your solution here
=> ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
```

**7)** For the given input array, filter all whole elements `12\nthree` irrespective of case.

```ruby
>> items = ["12\nthree\n", "12\nThree", "12\nthree\n4", "12\nthree"]

>> items.grep()        ##### add your solution here
=> ["12\nThree", "12\nthree"]
```

**8)** For the given input array, replace `hand` with `X` for all elements that start with `hand` followed by at least one word character.

```ruby
>> items = %w[handed hand handy unhanded handle hand-2]

>> items.map { }        ##### add your solution here
=> ["Xed", "hand", "Xy", "unhanded", "Xle", "hand-2"]
```

**9)** For the given input array, filter all elements starting with `h`. Additionally, replace `e` with `X` for these filtered elements.

```ruby
>> items = %w[handed hand handy unhanded handle hand-2]

>> items.filter_map { }        ##### add your solution here
=> ["handXd", "hand", "handy", "handlX", "hand-2"]
```

# Alternation and Grouping

**1)** For the given input array, filter all elements that start with `den` or end with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\n", 'dent']

>> items.grep()        ##### add your solution here
=> ["lovely", "2 lonely", "dent"]
```

**2)** For the given array, filter elements having a line starting with `den` or ending with `ly`.

```ruby
>> items = ['lovely', "1\ndentist", '2 lonely', 'eden', "fly\nfar", 'dent']

>> items.grep()        ##### add your solution here
=> ["lovely", "1\ndentist", "2 lonely", "fly\nfar", "dent"]
```

**3)** For the given strings, replace all occurrences of `removed` or `reed` or `received` or `refused` with `X`.

```ruby
>> s1 = 'creed refuse removed read'
>> s2 = 'refused reed redo received'

>> pat =        ##### add your solution here

>> s1.gsub(pat, 'X')
=> "cX refuse X read"
>> s2.gsub(pat, 'X')
=> "X X redo X"
```

**4)** For the given strings, replace all matches from the array `words` with `A`.

```ruby
>> s1 = 'plate full of slate'
>> s2 = "slated for later, don't be late"
>> words = %w[late later slated]

>> pat =        ##### add your solution here

>> s1.gsub(pat, 'A')
=> "pA full of sA"
>> s2.gsub(pat, 'A')
=> "A for A, don't be A"
```

**5)** Filter all whole elements from the input array `items` that exactly matches any of the elements present in the array `words`.

```ruby
>> items = ['slate', 'later', 'plate', 'late', 'slates', 'slated ']
>> words = %w[late later slated]

>> pat =        ##### add your solution here

>> items.grep(pat)
=> ["later", "late"]
```

# Escaping metacharacters

**1)** Transform the given input strings to the expected output using the same logic on both strings.

```ruby
>> str1 = '(9-2)*5+qty/3-(9-2)*7'
>> str2 = '(qty+4)/2-(9-2)*5+pq/4'

>> str1.gsub()        ##### add your solution here
=> "35+qty/3-(9-2)*7"
>> str2.gsub()        ##### add your solution here
=> "(qty+4)/2-35+pq/4"
```

**2)** Replace `(4)\|` with `2` only at the start or end of the given input strings.

```ruby
>> s1 = '2.3/(4)\|6 fig 5.3-(4)\|'
>> s2 = '(4)\|42 - (4)\|3'
>> s3 = "two - (4)\\|\n"

>> pat =        ##### add your solution here

>> s1.gsub(pat, '2')
=> "2.3/(4)\\|6 fig 5.3-2"
>> s2.gsub(pat, '2')
=> "242 - (4)\\|3"
>> s3.gsub(pat, '2')
=> "two - (4)\\|\n"
```

**3)** Replace any matching item from the given array with `X` for the given input strings. Match the elements from `items` literally. Assume no two elements of `items` will result in any matching conflict.

```ruby
>> items = ['a.b', '3+n', 'x\y\z', 'qty||price', '{n}']

>> pat =        ##### add your solution here

>> '0a.bcd'.gsub(pat, 'X')
=> "0Xcd"
>> 'E{n}AMPLE'.gsub(pat, 'X')
=> "EXAMPLE"
>> '43+n2 ax\y\ze'.gsub(pat, 'X')
=> "4X2 aXe"
```

**4)** Replace the backspace character `\b` with a single space character for the given input string.

```ruby
>> ip = "123\b456"
>> puts ip
12456

>> ip.gsub()        ##### add your solution here
=> "123 456"
```

**5)** Replace all occurrences of `\o` with `o`.

```ruby
>> ip = 'there are c\omm\on aspects am\ong the alternati\ons'

>> ip.gsub()        ##### add your solution here
=> "there are common aspects among the alternations"
```

**6)** Replace any matching item from the array `eqns` with `X` for the given string `ip`. Match the items from `eqns` literally.

```ruby
>> ip = '3-(a^b)+2*(a^b)-(a/b)+3'
>> eqns = %w[(a^b) (a/b) (a^b)+2]

>> pat =        ##### add your solution here

>> ip.gsub(pat, 'X')
=> "3-X*X-X+3"
```

381 | 382 | # Dot metacharacter and Quantifiers 383 | 384 | >![info](../images/info.svg) Since the `.` metacharacter doesn't match newline characters by default, assume that the input strings in the following exercises will not contain newline characters. 385 | 386 | **1)** Replace `42//5` or `42/5` with `8` for the given input. 387 | 388 | ```ruby 389 | >> ip = 'a+42//5-c pressure*3+42/5-14256' 390 | 391 | >> ip.gsub() ##### add your solution here 392 | => "a+8-c pressure*3+8-14256" 393 | ``` 394 | 395 | **2)** For the array `items`, filter all elements starting with `hand` and ending immediately with at most one more character or `le`. 396 | 397 | ```ruby 398 | >> items = %w[handed hand handled handy unhand hands handle] 399 | 400 | >> items.grep() ##### add your solution here 401 | => ["hand", "handy", "hands", "handle"] 402 | ``` 403 | 404 | **3)** Use the `split` method to get the output as shown for the given input strings. 405 | 406 | ```ruby 407 | >> eqn1 = 'a+42//5-c' 408 | >> eqn2 = 'pressure*3+42/5-14256' 409 | >> eqn3 = 'r*42-5/3+42///5-42/53+a' 410 | 411 | >> pat = ##### add your solution here 412 | 413 | >> eqn1.split(pat) 414 | => ["a+", "-c"] 415 | >> eqn2.split(pat) 416 | => ["pressure*3+", "-14256"] 417 | >> eqn3.split(pat) 418 | => ["r*42-5/3+42///5-", "3+a"] 419 | ``` 420 | 421 | **4)** For the given input strings, remove everything from the first occurrence of `i` till the end of the string. 422 | 423 | ```ruby 424 | >> s1 = 'remove the special meaning of such constructs' 425 | >> s2 = 'characters while constructing' 426 | >> s3 = 'input output' 427 | 428 | >> pat = ##### add your solution here 429 | 430 | >> s1.sub(pat, '') 431 | => "remove the spec" 432 | >> s2.sub(pat, '') 433 | => "characters wh" 434 | >> s3.sub(pat, '') 435 | => "" 436 | ``` 437 | 438 | **5)** For the given strings, construct a regexp to get the output as shown below. 
439 | 440 | ```ruby 441 | >> str1 = 'a+b(addition)' 442 | >> str2 = 'a/b(division) + c%d(#modulo)' 443 | >> str3 = 'Hi there(greeting). Nice day(a(b)' 444 | 445 | >> remove_parentheses = ##### add your solution here 446 | 447 | >> str1.gsub(remove_parentheses, '') 448 | => "a+b" 449 | >> str2.gsub(remove_parentheses, '') 450 | => "a/b + c%d" 451 | >> str3.gsub(remove_parentheses, '') 452 | => "Hi there. Nice day" 453 | ``` 454 | 455 | **6)** Correct the given regexp to get the expected output. 456 | 457 | ```ruby 458 | >> words = 'plink incoming tint winter in caution sentient' 459 | 460 | # wrong output 461 | >> change = /int|in|ion|ing|inco|inter|ink/ 462 | >> words.gsub(change, 'X') 463 | => "plXk XcomXg tX wXer X cautX sentient" 464 | 465 | # expected output 466 | >> change = ##### add your solution here 467 | >> words.gsub(change, 'X') 468 | => "plX XmX tX wX X cautX sentient" 469 | ``` 470 | 471 | **7)** For the given greedy quantifiers, what would be the equivalent form using the `{m,n}` representation? 472 | 473 | * `?` is same as 474 | * `*` is same as 475 | * `+` is same as 476 | 477 | **8)** `(a*|b*)` is same as `(a|b)*` — true or false? 478 | 479 | **9)** For the given input strings, remove everything from the first occurrence of `test` (irrespective of case) till the end of the string, provided `test` isn't at the end of the string. 480 | 481 | ```ruby 482 | >> s1 = 'this is a Test' 483 | >> s2 = 'always test your RE for corner cases' 484 | >> s3 = 'a TEST of skill tests?' 485 | 486 | >> pat = ##### add your solution here 487 | 488 | >> s1.sub(pat, '') 489 | => "this is a Test" 490 | >> s2.sub(pat, '') 491 | => "always " 492 | >> s3.sub(pat, '') 493 | => "a " 494 | ``` 495 | 496 | **10)** For the input array `words`, filter all elements starting with `s` and containing `e` and `t` in any order. 
497 | 498 | ```ruby 499 | >> words = ['sequoia', 'subtle', 'exhibit', 'a set', 'sets', 'tests', 'site'] 500 | 501 | >> words.grep() ##### add your solution here 502 | => ["subtle", "sets", "site"] 503 | ``` 504 | 505 | **11)** For the input array `words`, remove all elements having fewer than `6` characters. 506 | 507 | ```ruby 508 | >> words = %w[sequoia subtle exhibit asset sets tests site] 509 | 510 | >> words.grep() ##### add your solution here 511 | => ["sequoia", "subtle", "exhibit"] 512 | ``` 513 | 514 | **12)** For the input array `words`, filter all elements starting with `s` or `t` and having a maximum of `6` characters. 515 | 516 | ```ruby 517 | >> words = ['sequoia', 'subtle', 'exhibit', 'asset', 'sets', 't set', 'site'] 518 | 519 | >> words.grep() ##### add your solution here 520 | => ["subtle", "sets", "t set", "site"] 521 | ``` 522 | 523 | **13)** Can you reason out why this code results in the output shown? The aim was to remove all `<characters>` patterns but not the `<>` ones. The expected result was `'a 1<> b 2<> c'`. 524 | 525 | ```ruby 526 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>' 527 | 528 | >> ip.gsub(/<.+?>/, '') 529 | => "a 1 2" 530 | ``` 531 | 532 | **14)** Use the `split` method to get the output as shown below for the given input strings. 533 | 534 | ```ruby 535 | >> s1 = 'go there :: this :: that' 536 | >> s2 = 'a::b :: c::d e::f :: 4::5' 537 | >> s3 = '42:: hi::bye::see :: carefully' 538 | 539 | >> pat = ##### add your solution here 540 | 541 | >> s1.split(pat, 2) 542 | => ["go there", "this :: that"] 543 | >> s2.split(pat, 2) 544 | => ["a::b", "c::d e::f :: 4::5"] 545 | >> s3.split(pat, 2) 546 | => ["42:: hi::bye::see", "carefully"] 547 | ``` 548 | 549 | **15)** For the given input strings, match if the string starts with optional space characters followed by at least two `#` characters.
550 | 551 | ```ruby 552 | >> s1 = ' ## header2' 553 | >> s2 = '#### header4' 554 | >> s3 = '# comment' 555 | >> s4 = 'normal string' 556 | >> s5 = 'nope ## not this' 557 | 558 | >> pat = ##### add your solution here 559 | 560 | >> s1.match?(pat) 561 | => true 562 | >> s2.match?(pat) 563 | => true 564 | >> s3.match?(pat) 565 | => false 566 | >> s4.match?(pat) 567 | => false 568 | >> s5.match?(pat) 569 | => false 570 | ``` 571 | 572 | **16)** Modify the given regular expression such that it gives the expected results. 573 | 574 | ```ruby 575 | >> s1 = 'appleabcabcabcapricot' 576 | >> s2 = 'bananabcabcabcdelicious' 577 | 578 | # wrong output 579 | >> pat = /(abc)+a/ 580 | >> pat.match?(s1) 581 | => true 582 | >> pat.match?(s2) 583 | => true 584 | 585 | # expected output 586 | # 'abc' shouldn't be considered when trying to match 'a' at the end 587 | >> pat = ##### add your solution here 588 | >> pat.match?(s1) 589 | => true 590 | >> pat.match?(s2) 591 | => false 592 | ``` 593 | 594 |
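The note at the start of this section says that the `.` metacharacter doesn't match newline characters by default. A minimal sketch of that behavior is shown here, using a made-up input string and the `m` modifier for comparison.

```ruby
# hypothetical sample input containing a newline character
ip = "hi there\nhave a nice day"

# by default, . stops at the newline character
default_match = ip[/.+/]

# with the m modifier, . matches newline characters as well
multiline_match = ip[/.+/m]

p default_match
p multiline_match
```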
595 | 596 | # Working with matched portions 597 | 598 | **1)** For the given strings, extract the matching portion from the first `is` to the last `t`. 599 | 600 | ```ruby 601 | >> str1 = 'This the biggest fruit you have seen?' 602 | >> str2 = 'Your mission is to read and practice consistently' 603 | 604 | >> pat = ##### add your solution here 605 | 606 | ##### add your solution here for str1 607 | => "is the biggest fruit" 608 | ##### add your solution here for str2 609 | => "ission is to read and practice consistent" 610 | ``` 611 | 612 | **2)** Find the starting index of the first occurrence of `is` or `the` or `was` or `to` for the given input strings. 613 | 614 | ```ruby 615 | >> s1 = 'match after the last newline character' 616 | >> s2 = 'and then you want to test' 617 | >> s3 = 'this is good bye then' 618 | >> s4 = 'who was there to see?' 619 | 620 | >> pat = ##### add your solution here 621 | 622 | ##### add your solution here for s1 623 | => 12 624 | ##### add your solution here for s2 625 | => 4 626 | ##### add your solution here for s3 627 | => 2 628 | ##### add your solution here for s4 629 | => 4 630 | ``` 631 | 632 | **3)** Find the starting index of the last occurrence of `is` or `the` or `was` or `to` for the given input strings. 633 | 634 | ```ruby 635 | >> s1 = 'match after the last newline character' 636 | >> s2 = 'and then you want to test' 637 | >> s3 = 'this is good bye then' 638 | >> s4 = 'who was there to see?' 639 | 640 | >> pat = ##### add your solution here 641 | 642 | ##### add your solution here for s1 643 | => 12 644 | ##### add your solution here for s2 645 | => 18 646 | ##### add your solution here for s3 647 | => 17 648 | ##### add your solution here for s4 649 | => 14 650 | ``` 651 | 652 | **4)** Extract everything after the `:` character, which occurs only once in the input. 
653 | 654 | ```ruby 655 | >> ip = 'fruits:apple, mango, guava, blueberry' 656 | 657 | ##### add your solution here 658 | => "apple, mango, guava, blueberry" 659 | ``` 660 | 661 | **5)** The given input strings contain some text followed by `-` followed by a number. Replace that number with its `log` value using `Math.log()`. 662 | 663 | ```ruby 664 | >> s1 = 'first-3.14' 665 | >> s2 = 'next-123' 666 | 667 | >> pat = ##### add your solution here 668 | 669 | ##### add your solution here for s1 670 | => "first-1.144222799920162" 671 | ##### add your solution here for s2 672 | => "next-4.812184355372417" 673 | ``` 674 | 675 | **6)** Replace all occurrences of `par` with `spar`, `spare` with `extra` and `park` with `garden` for the given input strings. 676 | 677 | ```ruby 678 | >> str1 = 'apartment has a park' 679 | >> str2 = 'do you have a spare cable' 680 | >> str3 = 'write a parser' 681 | 682 | ##### add your solution here for str1 683 | => "aspartment has a garden" 684 | ##### add your solution here for str2 685 | => "do you have a extra cable" 686 | ##### add your solution here for str3 687 | => "write a sparser" 688 | ``` 689 | 690 | **7)** Extract all words between `(` and `)` from the given input string as an array. Assume that the input will not contain any broken parentheses. 691 | 692 | ```ruby 693 | >> ip = 'another (way) to reuse (portion) matched (by) capture groups' 694 | 695 | # as nested array 696 | ##### add your solution here 697 | => [["way"], ["portion"], ["by"]] 698 | 699 | # as array of strings 700 | ##### add your solution here 701 | => ["way", "portion", "by"] 702 | ``` 703 | 704 | **8)** Extract all occurrences of `<` up to the next occurrence of `>`, provided there is at least one character in between `<` and `>`. 705 | 706 | ```ruby 707 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>' 708 | 709 | ##### add your solution here 710 | => ["<apple>", "<> b<bye>", "<> c<cat>"] 711 | ``` 712 | 713 | **9)** Use `scan` to get the output as shown below for the given input strings.
Note the characters used in the input strings carefully. 714 | 715 | ```ruby 716 | >> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 ' 717 | >> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 ' 718 | 719 | >> pat = ##### add your solution here 720 | 721 | >> row1.scan(pat) 722 | => [["-2", "5"], ["4", "+3"], ["+42", "-53"], ["4356246", "-357532354"]] 723 | >> row2.scan(pat) 724 | => [["1.32", "-3.14"], ["634", "5.63"], ["63.3e3", "9907809345343.235"]] 725 | ``` 726 | 727 | **10)** This is an extension to the previous question. 728 | 729 | * For `row1`, find the sum of integers of each array element. For example, sum of `-2` and `5` is `3`. 730 | * For `row2`, find the sum of floating-point numbers of each array element. For example, sum of `1.32` and `-3.14` is `-1.82`. 731 | 732 | ```ruby 733 | >> row1 = '-2,5 4,+3 +42,-53 4356246,-357532354 ' 734 | >> row2 = '1.32,-3.14 634,5.63 63.3e3,9907809345343.235 ' 735 | 736 | # should be same as the previous question 737 | >> pat = ##### add your solution here 738 | 739 | ##### add your solution here for row1 740 | => [3, 7, -11, -353176108] 741 | 742 | ##### add your solution here for row2 743 | => [-1.82, 639.63, 9907809408643.234] 744 | ``` 745 | 746 | **11)** Use the `split` method to get the output as shown below. 747 | 748 | ```ruby 749 | >> ip = '42:no-output;1000:car-tr:u-ck;SQEX49801' 750 | 751 | >> ip.split() ##### add your solution here 752 | => ["42", "output", "1000", "tr:u-ck", "SQEX49801"] 753 | ``` 754 | 755 | **12)** Convert the comma separated strings to corresponding `hash` objects as shown below. Note that the input strings have an extra `,` at the end. 
756 | 757 | ```ruby 758 | >> row1 = 'name:rohan,maths:75,phy:89,' 759 | >> row2 = 'name:rose,maths:88,phy:92,' 760 | 761 | >> pat = ##### add your solution here 762 | 763 | ##### add your solution here for row1 764 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"} 765 | ##### add your solution here for row2 766 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"} 767 | ``` 768 | 769 |
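Several exercises in this section depend on how `scan` changes its return value when the regexp has capture groups. A minimal sketch on a made-up input follows (the `key=value` sample and the variable names are assumptions for illustration only).

```ruby
# hypothetical sample input
ip = 'key1=val1 key2=val2'

# no capture groups: an array of the matched strings
whole = ip.scan(/\w+=\w+/)

# with capture groups: an array of arrays, one inner array per match
pairs = ip.scan(/(\w+)=(\w+)/)

# an array of two-element arrays can be converted with to_h
hash = pairs.to_h

p whole
p pairs
p hash
```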
770 | 771 | # Character class 772 | 773 | **1)** For the array `items`, filter all elements starting with `hand` and ending immediately with `s` or `y` or `le`. 774 | 775 | ```ruby 776 | >> items = %w[-handy hand handy unhand hands hand-icy handle] 777 | 778 | ##### add your solution here 779 | => ["handy", "hands", "handle"] 780 | ``` 781 | 782 | **2)** Replace all whole words `reed` or `read` or `red` with `X`. 783 | 784 | ```ruby 785 | >> ip = 'redo red credible :read: rod reed' 786 | 787 | ##### add your solution here 788 | => "redo X credible :X: rod X" 789 | ``` 790 | 791 | **3)** For the array `words`, filter all elements containing `e` or `i` followed by `l` or `n`. Note that the order mentioned should be followed. 792 | 793 | ```ruby 794 | >> words = %w[surrender unicorn newer door empty eel pest] 795 | 796 | ##### add your solution here 797 | => ["surrender", "unicorn", "eel"] 798 | ``` 799 | 800 | **4)** For the array `words`, filter all elements containing `e` or `i` and `l` or `n` in any order. 801 | 802 | ```ruby 803 | >> words = %w[surrender unicorn newer door empty eel pest] 804 | 805 | ##### add your solution here 806 | => ["surrender", "unicorn", "newer", "eel"] 807 | ``` 808 | 809 | **5)** Convert the comma separated strings to corresponding `hash` objects as shown below. 810 | 811 | ```ruby 812 | >> row1 = 'name:rohan,maths:75,phy:89' 813 | >> row2 = 'name:rose,maths:88,phy:92' 814 | 815 | >> pat = ##### add your solution here 816 | 817 | ##### add your solution here for row1 818 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"} 819 | ##### add your solution here for row2 820 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"} 821 | ``` 822 | 823 | **6)** Delete from `(` to the next occurrence of `)` unless they contain parentheses characters in between. 824 | 825 | ```ruby 826 | >> str1 = 'def factorial()' 827 | >> str2 = 'a/b(division) + c%d(#modulo) - (e+(j/k-3)*4)' 828 | >> str3 = 'Hi there(greeting). 
Nice day(a(b)' 829 | 830 | >> remove_parentheses = ##### add your solution here 831 | 832 | >> str1.gsub(remove_parentheses, '') 833 | => "def factorial" 834 | >> str2.gsub(remove_parentheses, '') 835 | => "a/b + c%d - (e+*4)" 836 | >> str3.gsub(remove_parentheses, '') 837 | => "Hi there. Nice day(a" 838 | ``` 839 | 840 | **7)** For the array `words`, filter all elements not starting with `e` or `p` or `u`. 841 | 842 | ```ruby 843 | >> words = %w[surrender unicorn newer door empty eel (pest)] 844 | 845 | ##### add your solution here 846 | => ["surrender", "newer", "door", "(pest)"] 847 | ``` 848 | 849 | **8)** For the array `words`, filter all elements not containing `u` or `w` or `ee` or `-`. 850 | 851 | ```ruby 852 | >> words = %w[p-t you tea heel owe new reed ear] 853 | 854 | ##### add your solution here 855 | => ["tea", "ear"] 856 | ``` 857 | 858 | **9)** The given input strings contain fields separated by `,` and fields can be empty too. Replace the last three fields with `WHTSZ323`. 859 | 860 | ```ruby 861 | >> row1 = '(2),kite,12,,D,C,,' 862 | >> row2 = 'hi,bye,sun,moon' 863 | 864 | >> pat = ##### add your solution here 865 | 866 | ##### add your solution here for row1 867 | => "(2),kite,12,,D,WHTSZ323" 868 | ##### add your solution here for row2 869 | => "hi,WHTSZ323" 870 | ``` 871 | 872 | **10)** Split the given strings based on consecutive sequence of digit or whitespace characters. 873 | 874 | ```ruby 875 | >> str1 = "lion \t Ink32onion Nice" 876 | >> str2 = "**1\f2\n3star\t7 77\r**" 877 | 878 | >> pat = ##### add your solution here 879 | 880 | >> str1.split(pat) 881 | => ["lion", "Ink", "onion", "Nice"] 882 | >> str2.split(pat) 883 | => ["**", "star", "**"] 884 | ``` 885 | 886 | **11)** Delete all occurrences of the sequence `<characters>` where `characters` is one or more non `>` characters and cannot be empty.
887 | 888 | ```ruby 889 | >> ip = 'a<apple> 1<> b<bye> 2<> c<cat>' 890 | 891 | ##### add your solution here 892 | => "a 1<> b 2<> c" 893 | ``` 894 | 895 | **12)** `\b[a-z](on|no)[a-z]\b` is same as `\b[a-z][on]{2}[a-z]\b`. True or False? Sample input lines shown below might help to understand the differences, if any. 896 | 897 | ```ruby 898 | >> puts "known\nmood\nknow\npony\ninns" 899 | known 900 | mood 901 | know 902 | pony 903 | inns 904 | ``` 905 | 906 | **13)** For the given array, filter elements containing any number sequence greater than `624`. 907 | 908 | ```ruby 909 | >> items = ['h0000432ab', 'car00625', '42_624 0512', '96 foo1234baz 3.14 2'] 910 | 911 | ##### add your solution here 912 | => ["car00625", "96 foo1234baz 3.14 2"] 913 | ``` 914 | 915 | **14)** Count the maximum depth of nested braces for the given strings. Unbalanced or wrongly ordered braces should return `-1`. Note that this will require a mix of regular expressions and Ruby code. 916 | 917 | ```ruby 918 | ?> def max_nested_braces(ip) 919 | ##### add your solution here 920 | >> end 921 | 922 | >> max_nested_braces('a*b') 923 | => 0 924 | >> max_nested_braces('}a+b{') 925 | => -1 926 | >> max_nested_braces('a*b+{}') 927 | => 1 928 | >> max_nested_braces('{{a+2}*{b+c}+e}') 929 | => 2 930 | >> max_nested_braces('{{a+2}*{b+{c*d}}+e}') 931 | => 3 932 | >> max_nested_braces("{{a+2}*{\n{b+{c*d}}+e*d}}") 933 | => 4 934 | >> max_nested_braces('a*{b+c*{e*3.14}}}') 935 | => -1 936 | ``` 937 | 938 | **15)** By default, the `split` method will split on whitespace and remove empty strings from the result. Which regexp based method would you use to replicate this functionality? 939 | 940 | ```ruby 941 | >> ip = " \t\r so pole\t\t\t\n\nlit in to \r\n\v\f " 942 | 943 | >> ip.split 944 | => ["so", "pole", "lit", "in", "to"] 945 | 946 | ##### add your solution here 947 | => ["so", "pole", "lit", "in", "to"] 948 | ``` 949 | 950 | **16)** Convert the given input string to two different arrays as shown below.
You can optimize the regexp based on characters present in the input string. 951 | 952 | ```ruby 953 | >> ip = "price_42 roast^\t\n^-ice==cat\neast" 954 | 955 | ##### add your solution here 956 | => ["price_42", "roast", "ice", "cat", "east"] 957 | 958 | ##### add your solution here 959 | => ["price_42", " ", "roast", "^\t\n^-", "ice", "==", "cat", "\n", "east"] 960 | ``` 961 | 962 | **17)** Filter all elements whose first non-whitespace character is not a `#` character. Any element made up of only whitespace characters should be ignored as well. 963 | 964 | ```ruby 965 | >> items = [' #comment', "\t\napple #42", '#oops', 'sure', 'no#1', "\t\r\f"] 966 | 967 | ##### add your solution here 968 | => ["\t\napple #42", "sure", "no#1"] 969 | ``` 970 | 971 | **18)** Extract all whole words for the given input strings. However, based on user input `ignore`, do not match words if they contain any character present in the `ignore` variable. Assume that `ignore` variable will not contain any regexp metacharacters. 972 | 973 | ```ruby 974 | >> s1 = 'match after the last newline character' 975 | >> s2 = 'and then you want to test' 976 | 977 | >> ignore = 'aty' 978 | >> pat = ##### add your solution here 979 | >> s1.scan(pat) 980 | => ["newline"] 981 | >> s2.scan(pat) 982 | => [] 983 | 984 | >> ignore = 'esw' 985 | >> pat = ##### add your solution here 986 | >> s1.scan(pat) 987 | => ["match"] 988 | >> s2.scan(pat) 989 | => ["and", "you", "to"] 990 | ``` 991 | 992 | **19)** Filter all whole elements with optional whitespaces at the start followed by three to five non-digit characters. Whitespaces at the start should not be part of the calculation for non-digit characters. 993 | 994 | ```ruby 995 | >> items = ["\t \ncat", 'goal', ' oh', 'he-he', 'goal2', 'ok ', 'sparrow'] 996 | 997 | ##### add your solution here 998 | => ["\t \ncat", "goal", "he-he", "ok "] 999 | ``` 1000 | 1001 | **20)** Modify the given regexp such that it gives the expected result. 
1002 | 1003 | ```ruby 1004 | >> ip = '( S:12 E:5 S:4 and E:123 ok S:100 & E:10 S:1 - E:2 S:42 E:43 )' 1005 | 1006 | # wrong output 1007 | >> ip.scan(/S:\d+.*?E:\d{2,}/) 1008 | => ["S:12 E:5 S:4 and E:123", "S:100 & E:10", "S:1 - E:2 S:42 E:43"] 1009 | 1010 | # expected output 1011 | ##### add your solution here 1012 | => ["S:4 and E:123", "S:100 & E:10", "S:42 E:43"] 1013 | ``` 1014 | 1015 |
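One of the exercises above states that `split` without arguments splits on whitespace and removes empty strings from the result. The sketch below, on a made-up input, only illustrates why passing a whitespace regexp to `split` is not an exact replacement. It deliberately stops short of the exercise answer.

```ruby
# hypothetical sample input with leading and trailing whitespace
ip = "  a \t b\n"

# default behavior: leading whitespace is ignored, no empty strings
default_split = ip.split

# regexp separator: the leading empty string is kept
regexp_split = ip.split(/\s+/)

p default_split
p regexp_split
```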
1016 | 1017 | # Groupings and backreferences 1018 | 1019 | **1)** Replace the space character that occurs after a word ending with `a` or `r` with a newline character. 1020 | 1021 | ```ruby 1022 | >> ip = 'area not a _a2_ roar took 22' 1023 | 1024 | >> puts ip.gsub() ##### add your solution here 1025 | area 1026 | not a 1027 | _a2_ roar 1028 | took 22 1029 | ``` 1030 | 1031 | **2)** Add `[]` around words starting with `s` and containing `e` and `t` in any order. 1032 | 1033 | ```ruby 1034 | >> ip = 'sequoia subtle exhibit asset sets2 tests si_te' 1035 | 1036 | ##### add your solution here 1037 | => "sequoia [subtle] exhibit asset [sets2] tests [si_te]" 1038 | ``` 1039 | 1040 | **3)** Replace all whole words with `X` that start and end with the same word character (irrespective of case). A single character word should get replaced with `X` too, as it satisfies the stated condition. 1041 | 1042 | ```ruby 1043 | >> ip = 'oreo not a _a2_ Roar took 22' 1044 | 1045 | ##### add your solution here 1046 | => "X not X X X took X" 1047 | ``` 1048 | 1049 | **4)** Convert the given *markdown* headers to corresponding *anchor* tags. Consider the input to start with one or more `#` characters followed by space and word characters. The `name` attribute is constructed by converting the header to lowercase and replacing spaces with hyphens. Can you do it without using a capture group? 1050 | 1051 | ```ruby 1052 | >> header1 = '# Regular Expressions' 1053 | >> header2 = '## Named capture groups' 1054 | 1055 | >> anchor = ##### add your solution here 1056 | 1057 | ##### add your solution here for header1 1058 | => "# <a name='regular-expressions'></a>Regular Expressions" 1059 | ##### add your solution here for header2 1060 | => "## <a name='named-capture-groups'></a>Named capture groups" 1061 | ``` 1062 | 1063 | **5)** Convert the given *markdown* anchors to corresponding *hyperlinks*.
1064 | 1065 | ```ruby 1066 | >> anchor1 = "# <a name='regular-expressions'></a>Regular Expressions" 1067 | >> anchor2 = "## <a name='subexpression-calls'></a>Subexpression calls" 1068 | 1069 | >> hyperlink = ##### add your solution here 1070 | 1071 | ##### add your solution here for anchor1 1072 | => "[Regular Expressions](#regular-expressions)" 1073 | ##### add your solution here for anchor2 1074 | => "[Subexpression calls](#subexpression-calls)" 1075 | ``` 1076 | 1077 | **6)** Count the number of whole words that have at least two occurrences of consecutive repeated alphabets. For example, words like `stillness` and `Committee` should be counted but not words like `root` or `readable` or `rotational`. 1078 | 1079 | ```ruby 1080 | '> ip = %q{oppressed abandon accommodation bloodless 1081 | '> carelessness committed apparition innkeeper 1082 | '> occasionally afforded embarrassment foolishness 1083 | '> depended successfully succeeded 1084 | >> possession cleanliness suppress} 1085 | 1086 | ##### add your solution here 1087 | => 13 1088 | ``` 1089 | 1090 | **7)** For the given input string, replace all occurrences of digit sequences with only the unique non-repeating sequence. For example, `232323` should be changed to `23` and `897897` should be changed to `897`. If there are no repeats (for example `1234`) or if the repeats end prematurely (for example `12121`), it should not be changed. 1091 | 1092 | ```ruby 1093 | >> ip = '1234 2323 453545354535 9339 11 60260260' 1094 | 1095 | ##### add your solution here 1096 | => "1234 23 4535 9339 1 60260260" 1097 | ``` 1098 | 1099 | **8)** Replace sequences made up of words separated by `:` or `.` by the first word of the sequence. Such sequences will end when `:` or `.` is not followed by a word character. 1100 | 1101 | ```ruby 1102 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.' 1103 | 1104 | ##### add your solution here 1105 | => "wow hi-2 bye kite" 1106 | ``` 1107 | 1108 | **9)** Replace sequences made up of words separated by `:` or `.` by the last word of the sequence.
Such sequences will end when `:` or `.` is not followed by a word character. 1109 | 1110 | ```ruby 1111 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.' 1112 | 1113 | ##### add your solution here 1114 | => "five hi-2 bye water" 1115 | ``` 1116 | 1117 | **10)** Split the given input string on one or more repeated sequence of `cat`. 1118 | 1119 | ```ruby 1120 | >> ip = 'firecatlioncatcatcatbearcatcatparrot' 1121 | 1122 | ##### add your solution here 1123 | => ["fire", "lion", "bear", "parrot"] 1124 | ``` 1125 | 1126 | **11)** For the given input string, find all occurrences of digit sequences with at least one repeating sequence. For example, `232323` and `897897`. If the repeats end prematurely, for example `12121`, it should not be matched. 1127 | 1128 | ```ruby 1129 | >> ip = '1234 2323 453545354535 9339 11 60260260' 1130 | 1131 | >> pat = ##### add your solution here 1132 | 1133 | # entire sequences in the output 1134 | ##### add your solution here 1135 | => ["2323", "453545354535", "11"] 1136 | 1137 | # only the unique sequence in the output 1138 | ##### add your solution here 1139 | => ["23", "4535", "1"] 1140 | ``` 1141 | 1142 | **12)** Convert the comma separated strings to corresponding `hash` objects as shown below. The keys are `name`, `maths` and `phy` for the three fields in the input strings. 1143 | 1144 | ```ruby 1145 | >> row1 = 'rohan,75,89' 1146 | >> row2 = 'rose,88,92' 1147 | 1148 | >> pat = ##### add your solution here 1149 | 1150 | ##### add your solution here for row1 1151 | => {"name"=>"rohan", "maths"=>"75", "phy"=>"89"} 1152 | ##### add your solution here for row2 1153 | => {"name"=>"rose", "maths"=>"88", "phy"=>"92"} 1154 | ``` 1155 | 1156 | **13)** Surround all whole words with `()`. Additionally, if the whole word is `imp` or `ant`, delete them. Can you do it with just a single substitution? 
1157 | 1158 | ```ruby 1159 | >> ip = 'tiger imp goat eagle ant important' 1160 | 1161 | ##### add your solution here 1162 | => "(tiger) () (goat) (eagle) () (important)" 1163 | ``` 1164 | 1165 | **14)** Filter all elements that contain a sequence of lowercase alphabets followed by `-` followed by digits. They can be optionally surrounded by `{{` and `}}`. Any partial match shouldn't be part of the output. 1166 | 1167 | ```ruby 1168 | >> ip = %w[{{apple-150}} {{mango2-100}} {{cherry-200 grape-87 {{go-to}}] 1169 | 1170 | ##### add your solution here 1171 | => ["{{apple-150}}", "grape-87"] 1172 | ``` 1173 | 1174 | **15)** Extract all hexadecimal character sequences, with `0x` optional prefix. Match the characters case insensitively, and the sequences shouldn't be surrounded by other word characters. 1175 | 1176 | ```ruby 1177 | >> str1 = '128A foo 0xfe32 34 0xbar' 1178 | >> str2 = '0XDEADBEEF place 0x0ff1ce bad' 1179 | 1180 | >> hex_seq = ##### add your solution here 1181 | 1182 | >> str1.scan(hex_seq) 1183 | => ["128A", "0xfe32", "34"] 1184 | >> str2.scan(hex_seq) 1185 | => ["0XDEADBEEF", "0x0ff1ce", "bad"] 1186 | ``` 1187 | 1188 | **16)** Replace sequences made up of words separated by `:` or `.` by the first/last word of the sequence and the separator. Such sequences will end when `:` or `.` is not followed by a word character. 1189 | 1190 | ```ruby 1191 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.' 1192 | 1193 | # first word of the sequence 1194 | ##### add your solution here 1195 | => "wow: hi-2 bye kite." 1196 | 1197 | # last word of the sequence 1198 | ##### add your solution here 1199 | => "five: hi-2 bye water." 1200 | ``` 1201 | 1202 | **17)** For the given input strings, extract `if` followed by any number of nested parentheses. Assume that there will be only one such pattern per input string. 
1203 | 1204 | ```ruby 1205 | >> ip1 = 'for (((i*3)+2)/6) if(3-(k*3+4)/12-(r+2/3)) while()' 1206 | >> ip2 = 'if+while if(a(b)c(d(e(f)1)2)3) for(i=1)' 1207 | 1208 | >> pat = ##### add your solution here 1209 | 1210 | >> ip1[pat] 1211 | => "if(3-(k*3+4)/12-(r+2/3))" 1212 | >> ip2[pat] 1213 | => "if(a(b)c(d(e(f)1)2)3)" 1214 | ``` 1215 | 1216 | **18)** The given input string has sequences made up of words separated by `:` or `.` and such sequences will end when `:` or `.` is not followed by a word character. For all such sequences, display only the last word followed by `-` followed by the first word. 1217 | 1218 | ```ruby 1219 | >> ip = 'wow:Good:2_two.five: hi-2 bye kite.777:water.' 1220 | 1221 | ##### add your solution here 1222 | => ["five-wow", "water-kite"] 1223 | ``` 1224 | 1225 |
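When a group is needed only for applying a quantifier, the choice between capturing and non-capturing groups affects methods like `split`. The following is a minimal sketch on a made-up input (the `--` separator is an assumption chosen for illustration, not taken from the exercises).

```ruby
# hypothetical sample input with a repeated separator
ip = 'a--b----c'

# capture group: the captured text is included in the split result
with_capture = ip.split(/(--)+/)

# non-capturing group: only the portions between separators are returned
without_capture = ip.split(/(?:--)+/)

p with_capture
p without_capture
```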
1226 | 1227 | # Lookarounds 1228 | 1229 | >![info](../images/info.svg) Please use lookarounds for solving the following exercises, even if a solution is possible without them. The exception is when lookarounds cannot be used, for example when a variable length lookbehind would be needed. 1230 | 1231 | **1)** Replace all whole words with `X` unless they are preceded by a `(` character. 1232 | 1233 | ```ruby 1234 | >> ip = '(apple) guava berry) apple (mango) (grape' 1235 | 1236 | ##### add your solution here 1237 | => "(apple) X X) X (mango) (grape" 1238 | ``` 1239 | 1240 | **2)** Replace all whole words with `X` unless they are followed by a `)` character. 1241 | 1242 | ```ruby 1243 | >> ip = '(apple) guava berry) apple (mango) (grape' 1244 | 1245 | ##### add your solution here 1246 | => "(apple) X berry) X (mango) (X" 1247 | ``` 1248 | 1249 | **3)** Replace all whole words with `X` unless they are preceded by `(` or followed by `)` characters. 1250 | 1251 | ```ruby 1252 | >> ip = '(apple) guava berry) apple (mango) (grape' 1253 | 1254 | ##### add your solution here 1255 | => "(apple) X berry) X (mango) (grape" 1256 | ``` 1257 | 1258 | **4)** Extract all whole words that do not end with `e` or `n`. 1259 | 1260 | ```ruby 1261 | >> ip = 'a_t row on Urn e note Dust n end a2-e|u' 1262 | 1263 | ##### add your solution here 1264 | => ["a_t", "row", "Dust", "end", "a2", "u"] 1265 | ``` 1266 | 1267 | **5)** Extract all whole words that do not start with `a` or `d` or `n`. 1268 | 1269 | ```ruby 1270 | >> ip = 'a_t row on Urn e note Dust n end a2-e|u' 1271 | 1272 | ##### add your solution here 1273 | => ["row", "on", "Urn", "e", "Dust", "end", "e", "u"] 1274 | ``` 1275 | 1276 | **6)** Extract all whole words only if they are followed by `:` or `,` or `-`. 1277 | 1278 | ```ruby 1279 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit' 1280 | 1281 | ##### add your solution here 1282 | => ["Poke", "so_good", "ever2"] 1283 | ``` 1284 | 1285 | **7)** Extract all whole words only if they are preceded by `=` or `/` or `-`.
1286 | 1287 | ```ruby 1288 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit' 1289 | 1290 | ##### add your solution here 1291 | => ["so_good", "is", "sit"] 1292 | ``` 1293 | 1294 | **8)** Extract all whole words only if they are preceded by `=` or `:` and followed by `:` or `.`. 1295 | 1296 | ```ruby 1297 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit' 1298 | 1299 | ##### add your solution here 1300 | => ["so_good", "ink"] 1301 | ``` 1302 | 1303 | **9)** Extract all whole words only if they are preceded by `=` or `:` or `.` or `(` or `-` and not followed by `.` or `/`. 1304 | 1305 | ```ruby 1306 | >> ip = 'Poke,on=-=so_good:ink.to/is(vast)ever2-sit' 1307 | 1308 | ##### add your solution here 1309 | => ["so_good", "vast", "sit"] 1310 | ``` 1311 | 1312 | **10)** Remove the leading and trailing whitespaces from all the individual fields where `,` is the field separator. 1313 | 1314 | ```ruby 1315 | >> csv1 = " comma ,separated ,values \t\r " 1316 | >> csv2 = 'good bad,nice ice , 42 , , stall small' 1317 | 1318 | >> remove_whitespace = ##### add your solution here 1319 | 1320 | >> csv1.gsub(remove_whitespace, '') 1321 | => "comma,separated,values" 1322 | >> csv2.gsub(remove_whitespace, '') 1323 | => "good bad,nice ice,42,,stall small" 1324 | ``` 1325 | 1326 | **11)** Filter elements that satisfy all of these rules: 1327 | 1328 | * should have at least two alphabets 1329 | * should have at least three digits 1330 | * should have at least one special character among `%` or `*` or `#` or `$` 1331 | * should not end with a whitespace character 1332 | 1333 | ```ruby 1334 | >> pwds = ['hunter2', 'F2H3u%9', "*X3Yz3.14\t", 'r2_d2_42', 'A $B C1234'] 1335 | 1336 | >> rule_chk = ##### add your solution here 1337 | 1338 | >> pwds.grep(rule_chk) 1339 | => ["F2H3u%9", "A $B C1234"] 1340 | ``` 1341 | 1342 | **12)** For the given string, surround all whole words with `{}` except for whole words `par` and `cat` and `apple`. 
1343 | 1344 | ```ruby 1345 | >> ip = 'part; cat {super} rest_42 par scatter apple spar' 1346 | 1347 | ##### add your solution here 1348 | => "{part}; cat {{super}} {rest_42} par {scatter} apple {spar}" 1349 | ``` 1350 | 1351 | **13)** Extract the integer portion of floating-point numbers for the given string. Integers and numbers ending with `.` and no further digits should not be considered. 1352 | 1353 | ```ruby 1354 | >> ip = '12 ab32.4 go 5 2. 46.42 5' 1355 | 1356 | ##### add your solution here 1357 | => ["32", "46"] 1358 | ``` 1359 | 1360 | **14)** For the given input strings, extract all overlapping two character sequences. 1361 | 1362 | ```ruby 1363 | >> s1 = 'apple' 1364 | >> s2 = '1.2-3:4' 1365 | 1366 | >> pat = ##### add your solution here 1367 | 1368 | ##### add your solution here for s1 1369 | => ["ap", "pp", "pl", "le"] 1370 | ##### add your solution here for s2 1371 | => ["1.", ".2", "2-", "-3", "3:", ":4"] 1372 | ``` 1373 | 1374 | **15)** The given input strings contain fields separated by the `:` character. Delete `:` and the last field if there is a digit character anywhere before the last field. 1375 | 1376 | ```ruby 1377 | >> s1 = '42:cat' 1378 | >> s2 = 'twelve:a2b' 1379 | >> s3 = 'we:be:he:0:a:b:bother' 1380 | >> s4 = 'apple:banana-42:cherry:' 1381 | >> s5 = 'dragon:unicorn:centaur' 1382 | 1383 | >> pat = ##### add your solution here 1384 | 1385 | ##### add your solution here for s1 1386 | => "42" 1387 | ##### add your solution here for s2 1388 | => "twelve:a2b" 1389 | ##### add your solution here for s3 1390 | => "we:be:he:0:a:b" 1391 | ##### add your solution here for s4 1392 | => "apple:banana-42:cherry" 1393 | ##### add your solution here for s5 1394 | => "dragon:unicorn:centaur" 1395 | ``` 1396 | 1397 | **16)** Extract all whole words unless they are preceded by `:` or `<=>` or `----` or `#`. 
1398 | 1399 | ```ruby 1400 | >> ip = '::very--at<=>row|in.a_b#b2c=>lion----east' 1401 | 1402 | ##### add your solution here 1403 | => ["at", "in", "a_b", "lion"] 1404 | ``` 1405 | 1406 | **17)** Match strings that contain `qty` followed by `price`, but not if there is any **whitespace** character or the string `error` between them. 1407 | 1408 | ```ruby 1409 | >> str1 = '23,qty,price,42' 1410 | >> str2 = 'qty price,oh' 1411 | >> str3 = '3.14,qty,6,errors,9,price,3' 1412 | >> str4 = "42\nqty-6,apple-56,price-234,error" 1413 | >> str5 = '4,price,3.14,qty,4' 1414 | >> str6 = '(qtyprice) (hi-there)' 1415 | 1416 | >> neg = ##### add your solution here 1417 | 1418 | >> str1.match?(neg) 1419 | => true 1420 | >> str2.match?(neg) 1421 | => false 1422 | >> str3.match?(neg) 1423 | => false 1424 | >> str4.match?(neg) 1425 | => true 1426 | >> str5.match?(neg) 1427 | => false 1428 | >> str6.match?(neg) 1429 | => true 1430 | ``` 1431 | 1432 | **18)** Can you reason out why the following regular expressions behave differently? 1433 | 1434 | ```ruby 1435 | >> ip = 'I have 12, he has 2!' 1436 | 1437 | >> ip.gsub(/\b..\b/, '{\0}') 1438 | => "{I }have {12}{, }{he} has{ 2}!" 1439 | 1440 | >> ip.gsub(/(?<!\w)..(?!\w)/, '{\0}') 1441 | => "I have {12}, {he} has {2!}" 1442 | ``` 1443 | 1444 | **19)** The given input strings have fields separated by the `:` character. Assume that each string has a minimum of two fields and cannot have empty fields. Extract all fields, but stop if a field with a digit character is found. 
1445 | 1446 | ```ruby 1447 | >> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end' 1448 | >> row2 = 'um:no:low:3e:s4w:seer' 1449 | >> row3 = 'oh100:apple:banana:fig' 1450 | >> row4 = 'Dragon:Unicorn:Wizard-Healer' 1451 | 1452 | >> pat = ##### add your solution here 1453 | 1454 | >> row1.gsub(pat).map { $1 } 1455 | => ["vast"] 1456 | >> row2.gsub(pat).map { $1 } 1457 | => ["um", "no", "low"] 1458 | >> row3.gsub(pat).map { $1 } 1459 | => [] 1460 | >> row4.gsub(pat).map { $1 } 1461 | => ["Dragon", "Unicorn", "Wizard-Healer"] 1462 | ``` 1463 | 1464 | **20)** The given input strings have fields separated by the `:` character. Extract all fields only after a field containing a digit character is found. Assume that each string has a minimum of two fields and cannot have empty fields. 1465 | 1466 | ```ruby 1467 | >> row1 = 'vast:a2b2:ride:in:awe:b2b:3list:end' 1468 | >> row2 = 'um:no:low:3e:s4w:seer' 1469 | >> row3 = 'oh100:apple:banana:fig' 1470 | >> row4 = 'Dragon:Unicorn:Wizard-Healer' 1471 | 1472 | >> pat = ##### add your solution here 1473 | 1474 | >> row1.scan(pat) 1475 | => ["ride", "in", "awe", "b2b", "3list", "end"] 1476 | >> row2.scan(pat) 1477 | => ["s4w", "seer"] 1478 | >> row3.scan(pat) 1479 | => ["apple", "banana", "fig"] 1480 | >> row4.scan(pat) 1481 | => [] 1482 | ``` 1483 | 1484 | **21)** The given input string has comma separated fields and some of them can occur more than once. For the duplicated fields, retain only the rightmost one. Assume that there are no empty fields. 1485 | 1486 | ```ruby 1487 | >> row = '421,cat,2425,42,5,cat,6,6,42,61,6,6,scat,6,6,4,Cat,425,4' 1488 | 1489 | ##### add your solution here 1490 | => "421,2425,5,cat,42,61,scat,6,Cat,425,4" 1491 | ``` 1492 | 1493 |
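A note on the scaffold used in exercise **19)** above: calling `gsub` with only a pattern (no replacement string and no block) returns an `Enumerator` over the matches, and `$1` inside the `map` block refers to the first capture group of the current match. The sketch below illustrates just that mechanism. The input string and the pattern are made up for illustration and are not a solution to any exercise.

```ruby
# Hypothetical input and pattern, chosen only to show the idiom.
s = 'k1=v1,k2=v2'

# gsub with a pattern alone returns an Enumerator that yields each match
enum = s.gsub(/(\w+)=\w+/)
enum.class
# => Enumerator

# inside the block, $1 is the first capture group of the current match
keys = s.gsub(/(\w+)=\w+/).map { $1 }
# => ["k1", "k2"]

# scan gives the same data here, but nested: one array of captures per match
nested = s.scan(/(\w+)=\w+/)
# => [["k1"], ["k2"]]
```

One plausible reason to prefer this form over `scan` is that the result stays a flat array even when the pattern contains a capture group.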
1494 | 1495 | # Modifiers 1496 | 1497 | **1)** Remove from the first occurrence of `hat` to the last occurrence of `it` for the given input strings. Match these markers case insensitively. 1498 | 1499 | ```ruby 1500 | >> s1 = "But Cool THAT\nsee What okay\nwow quite" 1501 | >> s2 = 'it this hat is sliced HIT.' 1502 | 1503 | >> pat = ##### add your solution here 1504 | 1505 | >> s1.sub(pat, '') 1506 | => "But Cool Te" 1507 | >> s2.sub(pat, '') 1508 | => "it this ." 1509 | ``` 1510 | 1511 | **2)** Delete from the string `start` if it is at the beginning of a line up to the next occurrence of the string `end` at the end of a line. Match these keywords irrespective of case. 1512 | 1513 | ```ruby 1514 | '> para = %q{good start 1515 | '> start working on that 1516 | '> project you always wanted 1517 | '> to, do not let it end 1518 | '> hi there 1519 | '> start and end the end 1520 | '> 42 1521 | '> Start and try to 1522 | '> finish the End 1523 | >> bye} 1524 | 1525 | >> pat = ##### add your solution here 1526 | 1527 | >> puts para.gsub(pat, '') 1528 | good start 1529 | 1530 | hi there 1531 | 1532 | 42 1533 | 1534 | bye 1535 | ``` 1536 | 1537 | **3)** For the given *markdown* file, replace all occurrences of the string `ruby` (irrespective of case) with the string `Ruby`. However, any match within code blocks that start with the whole line ` ```ruby ` and end with the whole line ` ``` ` shouldn't be replaced. Consider the input file to be small enough to fit memory requirements. 1538 | 1539 | Refer to the [exercises folder](https://github.com/learnbyexample/Ruby_Regexp/tree/master/exercises) for input files required to solve this exercise. 1540 | 1541 | ```ruby 1542 | >> ip_str = File.open('sample.md').read 1543 | >> pat = ##### add your solution here 1544 | 1545 | >> File.open('sample_mod.md', 'w') do |f| 1546 | ?> ip_str.split(pat).each_with_index do |s, i| 1547 | ?> f.write(i.odd? ? 
s : s.gsub(/ruby/i) { $&.capitalize }) 1548 | >> end 1549 | >> end 1550 | 1551 | >> File.open('sample_mod.md').read == File.open('expected.md').read 1552 | => true 1553 | ``` 1554 | 1555 | **4)** Write a string method that changes the given input to alternate case (starting with lowercase first). 1556 | 1557 | ```ruby 1558 | ?> def aLtErNaTe_CaSe(ip_str) 1559 | ##### add your solution here 1560 | >> end 1561 | 1562 | >> aLtErNaTe_CaSe('HI THERE!') 1563 | => "hI tHeRe!" 1564 | >> aLtErNaTe_CaSe('good morning') 1565 | => "gOoD mOrNiNg" 1566 | >> aLtErNaTe_CaSe('Sample123string42with777numbers') 1567 | => "sAmPlE123sTrInG42wItH777nUmBeRs" 1568 | ``` 1569 | 1570 | **5)** For the given input strings, match all of these three conditions: 1571 | 1572 | * `This` case sensitively 1573 | * `nice` and `cool` case insensitively 1574 | 1575 | ```ruby 1576 | >> s1 = 'This is nice and Cool' 1577 | >> s2 = 'Nice and cool this is' 1578 | >> s3 = 'What is so nice and cool about This?' 1579 | >> s4 = 'nice,cool,This' 1580 | >> s5 = 'not nice This?' 1581 | >> s6 = 'This is not cool' 1582 | 1583 | >> pat = ##### add your solution here 1584 | 1585 | >> s1.match?(pat) 1586 | => true 1587 | >> s2.match?(pat) 1588 | => false 1589 | >> s3.match?(pat) 1590 | => true 1591 | >> s4.match?(pat) 1592 | => true 1593 | >> s5.match?(pat) 1594 | => false 1595 | >> s6.match?(pat) 1596 | => false 1597 | ``` 1598 | 1599 | **6)** For the given input strings, match if the string begins with `Th` and also contains a line that starts with `There`. 1600 | 1601 | ```ruby 1602 | >> s1 = "There there\nHave a cookie" 1603 | >> s2 = "This is a mess\nYeah?\nThereeeee" 1604 | >> s3 = "Oh\nThere goes the fun" 1605 | >> s4 = 'This is not\ngood\nno There' 1606 | 1607 | >> pat = ##### add your solution here 1608 | 1609 | >> s1.match?(pat) 1610 | => true 1611 | >> s2.match?(pat) 1612 | => true 1613 | >> s3.match?(pat) 1614 | => false 1615 | >> s4.match?(pat) 1616 | => false 1617 | ``` 1618 | 1619 |
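A note on the scaffold used in exercise **3)** above: when the pattern given to `split` contains a capture group, the captured text is included in the result, so the pieces between separators land at even indices and the captured separators at odd indices. That is what lets the scaffold write odd-indexed pieces unchanged. A minimal sketch with a made-up input and delimiter pattern (not the exercise solution):

```ruby
# Hypothetical input and pattern, chosen only to show the idiom.
ip = 'ab<12>cd<34>ef'

# the capture group makes split keep the separators in the output
parts = ip.split(/(<\d+>)/)
# => ["ab", "<12>", "cd", "<34>", "ef"]

# transform only the even-indexed pieces, keep the captured ones as they are
result = parts.each_with_index.map { |s, i| i.odd? ? s : s.upcase }.join
# => "AB<12>CD<34>EF"
```

If the input starts with a match, the first piece is an empty string, so the even/odd parity still holds.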
1620 | 1621 | # Unicode 1622 | 1623 | **1)** Output `true` or `false` depending on whether the input string is made up entirely of ASCII characters. Assume the input strings are non-empty; any character that isn't part of the 7-bit ASCII set should give `false`. 1624 | 1625 | ```ruby 1626 | >> str1 = '123—456' 1627 | >> str2 = 'good fοοd' 1628 | >> str3 = 'happy learning!' 1629 | 1630 | ##### add your solution here for str1 1631 | => false 1632 | ##### add your solution here for str2 1633 | => false 1634 | ##### add your solution here for str3 1635 | => true 1636 | ``` 1637 | 1638 | **2)** Retain only punctuation characters for the given strings (generated from codepoints). Use the Unicode character set definition for punctuation for solving this exercise. 1639 | 1640 | ```ruby 1641 | >> s1 = (0..0x7f).to_a.pack('U*') 1642 | >> s2 = (0x80..0xff).to_a.pack('U*') 1643 | >> s3 = (0x2600..0x27eb).to_a.pack('U*') 1644 | 1645 | >> pat = ##### add your solution here 1646 | 1647 | >> s1.gsub(pat, '') 1648 | => "!\"#%&'()*,-./:;?@[\\]_{}" 1649 | >> s2.gsub(pat, '') 1650 | => "¡§«¶·»¿" 1651 | >> s3.gsub(pat, '') 1652 | => "❨❩❪❫❬❭❮❯❰❱❲❳❴❵⟅⟆⟦⟧⟨⟩⟪⟫" 1653 | ``` 1654 | 1655 | **3)** Explore the following Q&A threads. 1656 | 1657 | * [stackoverflow: remove emoji from string](https://stackoverflow.com/q/24672834/4082052) 1658 | * [stackoverflow: why am I seeing different results for these two nearly identical regexp](https://stackoverflow.com/q/13573136/4082052) 1659 | * [stackoverflow: convert unicode number to integer](https://stackoverflow.com/q/37338708/4082052) 1660 | * [stackoverflow: replacing %uXXXX to the corresponding unicode codepoint](https://stackoverflow.com/q/28773392/4082052) 1661 | 1662 | -------------------------------------------------------------------------------- /exercises/expected.md: -------------------------------------------------------------------------------- 1 | # Introduction 2 | 3 | REPL is a good way to learn Ruby for beginners. 
4 | 5 | ```ruby 6 | >> 3 + 7 7 | => 10 8 | >> 22 / 7 9 | => 3 10 | >> 22.0 / 7 11 | => 3.142857142857143 12 | ``` 13 | 14 | ## String methods 15 | 16 | Ruby comes loaded with awesome methods. Enjoy learning Ruby. 17 | 18 | ```ruby 19 | >> 'ruby'.capitalize 20 | => "Ruby" 21 | 22 | >> ' comma '.strip 23 | => "comma" 24 | ``` 25 | 26 | -------------------------------------------------------------------------------- /exercises/sample.md: -------------------------------------------------------------------------------- 1 | # Introduction 2 | 3 | REPL is a good way to learn RUBY for beginners. 4 | 5 | ```ruby 6 | >> 3 + 7 7 | => 10 8 | >> 22 / 7 9 | => 3 10 | >> 22.0 / 7 11 | => 3.142857142857143 12 | ``` 13 | 14 | ## String methods 15 | 16 | ruby comes loaded with awesome methods. Enjoy learning RuBy. 17 | 18 | ```ruby 19 | >> 'ruby'.capitalize 20 | => "Ruby" 21 | 22 | >> ' comma '.strip 23 | => "comma" 24 | ``` 25 | 26 | -------------------------------------------------------------------------------- /images/debuggex.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/debuggex.png -------------------------------------------------------------------------------- /images/find_replace.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/find_replace.png -------------------------------------------------------------------------------- /images/info.svg: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /images/password_check.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/password_check.png -------------------------------------------------------------------------------- /images/rubular.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/rubular.png -------------------------------------------------------------------------------- /images/ruby_regexp_ls.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/images/ruby_regexp_ls.png -------------------------------------------------------------------------------- /images/warning.svg: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /sample_chapters/ruby_regexp_sample.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learnbyexample/Ruby_Regexp/b32aa467bbc8192b1e7a97520a5cdce251a320b3/sample_chapters/ruby_regexp_sample.pdf --------------------------------------------------------------------------------