├── .github └── workflows │ └── ci.yml ├── .gitignore ├── CHANGELOG.md ├── Cargo.toml ├── LICENSE-APACHE ├── LICENSE-MIT ├── README.md ├── examples └── procmacro │ ├── .gitignore │ ├── Cargo.toml │ ├── README.md │ ├── examples │ └── main.rs │ └── src │ └── lib.rs └── src ├── bool ├── mod.rs └── tests.rs ├── byte ├── mod.rs └── tests.rs ├── bytestr ├── mod.rs └── tests.rs ├── char ├── mod.rs └── tests.rs ├── err.rs ├── escape.rs ├── float ├── mod.rs └── tests.rs ├── impls.rs ├── integer ├── mod.rs └── tests.rs ├── lib.rs ├── parse.rs ├── string ├── mod.rs └── tests.rs ├── test_util.rs └── tests.rs /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | name: CI 2 | 3 | on: 4 | pull_request: 5 | push: 6 | branches: [ main ] 7 | 8 | env: 9 | CARGO_TERM_COLOR: always 10 | RUSTFLAGS: --deny warnings 11 | 12 | jobs: 13 | style: 14 | name: Check basic style 15 | runs-on: ubuntu-latest 16 | steps: 17 | - uses: actions/checkout@v3 18 | - uses: LukasKalbertodt/check-basic-style@v0.1 19 | 20 | check: 21 | name: 'Build & test' 22 | runs-on: ubuntu-20.04 23 | steps: 24 | - uses: actions/checkout@v2 25 | 26 | # We test in release mode as two tests would take a long time otherwise. 
27 | - name: Build 28 | run: cargo build 29 | - name: Run tests 30 | run: | 31 | cargo test --release --lib -- --include-ignored 32 | cargo test --doc 33 | - name: Test procmacro example 34 | working-directory: examples/procmacro 35 | run: cargo test 36 | 37 | - name: Build without default features 38 | run: cargo build --no-default-features 39 | - name: Run tests without default features 40 | run: | 41 | cargo test --release --no-default-features --lib -- --include-ignored 42 | cargo test --doc --no-default-features 43 | 44 | - name: Build with check_suffix 45 | run: cargo build --features=check_suffix 46 | - name: Run tests with check_suffix 47 | run: | 48 | cargo test --release --features=check_suffix --lib -- --include-ignored 49 | cargo test --doc --features=check_suffix 50 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /target 2 | Cargo.lock 3 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # Changelog 2 | 3 | All notable changes to this project will be documented in this file. 4 | 5 | 6 | ## [Unreleased] 7 | 8 | ## [0.4.1] - 2023-10-18 9 | - Fixed incorrectly labeling `27f32` as a float literal in docs. 10 | - Added hint to integer literal docs about parsing as `u128`. 11 | 12 | ## [0.4.0] - 2023-03-05 13 | ### Added 14 | - Add ability to parse literals with arbitrary suffixes (e.g. 
`"foo"bla` or `23px`) 15 | - Add `suffix()` method to all literal types except `BoolLit` 16 | - Add `IntegerBase::value` 17 | - Add `from_suffix` and `suffix` methods to `FloatType` and `IntegerType` 18 | - Add `FromStr` and `Display` impls to `FloatType` and `IntegerType` 19 | 20 | ### Changed 21 | - **Breaking**: Mark `FloatType` and `IntegerType` as `#[non_exhaustive]` 22 | - **Breaking**: Fix integer parsing for cases like `27f32`. `Literal::parse` 23 | and `IntegerLit::parse` will both identify this as an integer literal. 24 | - **Breaking**: Fix float parsing by correctly rejecting inputs like `27f32`. A 25 | float literal must have a period OR an exponent part, according to the spec. 26 | Previously decimal integers were accepted in `FloatLit::parse`. 27 | - Improved some parts of the docs 28 | 29 | ### Removed 30 | - **Breaking**: Remove `OwnedLiteral` and `SharedLiteral` 31 | 32 | ## [0.3.0] - 2022-12-19 33 | ### Breaking 34 | - Bump MSRV (minimum supported Rust version) to 1.54 35 | 36 | ### Added 37 | - Add `raw_input` and `into_raw_input` to non-bool `*Lit` types 38 | - Add `impl From<*Lit> for pm::Literal` (for non-bool literals) 39 | - Add `impl From<BoolLit> for pm::Ident` 40 | 41 | ### Fixed 42 | - Fix link to reference and clarify bool literals ([#7](https://github.com/LukasKalbertodt/litrs/pull/7)) 43 | 44 | ### Internals 45 | - Move lots of parsing code into non-generic functions (this hopefully reduces compile times) 46 | - To implement `[into_]raw_input` for integer and float literals, their 47 | internals were changed a bit so that they store the full input string now. 
48 | 49 | ## [0.2.3] - 2021-06-09 50 | ### Changed 51 | - Minor internal code change to bring MSRV from 1.52 to 1.42 52 | 53 | ## [0.2.2] - 2021-06-09 54 | ### Changed 55 | - Fixed (byte) string literal parsing by: 56 | - Correctly handling "string continue" sequences 57 | - Correctly converting `\r\n` into `\n` 58 | 59 | ## [0.2.1] - 2021-06-04 60 | ### Changed 61 | - Fixed the `expected` value of the error returned from `TryFrom` impls in some cases 62 | 63 | ## [0.2.0] - 2021-05-28 64 | ### Changed 65 | - **Breaking**: rename `Error` to `ParseError`. That describes its purpose more 66 | closely and is particularly useful now that other error types exist in the library. 67 | 68 | ### Removed 69 | - **Breaking**: remove `proc-macro` feature and instead offer the corresponding 70 | `impl`s unconditionally. Since the feature didn't enable/disable a 71 | dependency (`proc-macro` is a compiler-provided crate) and since apparently 72 | it works fine in `no_std` environments, I dropped this feature. I don't 73 | currently see a reason why the corresponding impls should be conditional. 
74 | 75 | ### Added 76 | - `TryFrom for litrs::Literal` impls 77 | - `From<*Lit> for litrs::Literal` impls 78 | - `TryFrom for *Lit` 79 | - `TryFrom for *Lit` 80 | - `InvalidToken` error type for all new `TryFrom` impls 81 | 82 | 83 | ## [0.1.1] - 2021-05-25 84 | ### Added 85 | - `From` impls to create a `Literal` from references to proc-macro literal types: 86 | - `From<&proc_macro::Literal>` 87 | - `From<&proc_macro2::Literal>` 88 | - Better examples in README and repository 89 | 90 | ## 0.1.0 - 2021-05-24 91 | ### Added 92 | - Everything 93 | 94 | 95 | [Unreleased]: https://github.com/LukasKalbertodt/litrs/compare/v0.4.1...HEAD 96 | [0.4.1]: https://github.com/LukasKalbertodt/litrs/compare/v0.4.0...v0.4.1 97 | [0.4.0]: https://github.com/LukasKalbertodt/litrs/compare/v0.3.0...v0.4.0 98 | [0.3.0]: https://github.com/LukasKalbertodt/litrs/compare/v0.2.3...v0.3.0 99 | [0.2.3]: https://github.com/LukasKalbertodt/litrs/compare/v0.2.2...v0.2.3 100 | [0.2.2]: https://github.com/LukasKalbertodt/litrs/compare/v0.2.1...v0.2.2 101 | [0.2.1]: https://github.com/LukasKalbertodt/litrs/compare/v0.2.0...v0.2.1 102 | [0.2.0]: https://github.com/LukasKalbertodt/litrs/compare/v0.1.1...v0.2.0 103 | [0.1.1]: https://github.com/LukasKalbertodt/litrs/compare/v0.1.0...v0.1.1 104 | -------------------------------------------------------------------------------- /Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "litrs" 3 | version = "0.4.1" 4 | authors = ["Lukas Kalbertodt "] 5 | edition = "2018" 6 | rust-version = "1.54" 7 | 8 | description = """ 9 | Parse and inspect Rust literals (i.e. tokens in the Rust programming language 10 | representing fixed values). Particularly useful for proc macros, but can also 11 | be used outside of a proc-macro context. 
12 | """ 13 | documentation = "https://docs.rs/litrs/" 14 | repository = "https://github.com/LukasKalbertodt/litrs/" 15 | readme = "README.md" 16 | license = "MIT/Apache-2.0" 17 | 18 | keywords = ["literal", "parsing", "proc-macro", "type", "procedural"] 19 | categories = [ 20 | "development-tools::procedural-macro-helpers", 21 | "parser-implementations", 22 | "development-tools::build-utils", 23 | ] 24 | exclude = [".github"] 25 | 26 | 27 | [features] 28 | default = ["proc-macro2"] 29 | check_suffix = ["unicode-xid"] 30 | 31 | [dependencies] 32 | proc-macro2 = { version = "1", optional = true } 33 | unicode-xid = { version = "0.2.4", optional = true } 34 | -------------------------------------------------------------------------------- /LICENSE-APACHE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 
25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. 
For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. 
If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. 
You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 
175 | 176 | END OF TERMS AND CONDITIONS 177 | -------------------------------------------------------------------------------- /LICENSE-MIT: -------------------------------------------------------------------------------- 1 | Copyright (c) 2020 Project Developers 2 | 3 | Permission is hereby granted, free of charge, to any 4 | person obtaining a copy of this software and associated 5 | documentation files (the "Software"), to deal in the 6 | Software without restriction, including without 7 | limitation the rights to use, copy, modify, merge, 8 | publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software 10 | is furnished to do so, subject to the following 11 | conditions: 12 | 13 | The above copyright notice and this permission notice 14 | shall be included in all copies or substantial portions 15 | of the Software. 16 | 17 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF 18 | ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED 19 | TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A 20 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 21 | SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 22 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 23 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR 24 | IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 25 | DEALINGS IN THE SOFTWARE. 26 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # `litrs`: parsing and inspecting Rust literals 2 | 3 | [CI status of main](https://github.com/LukasKalbertodt/litrs/actions/workflows/ci.yml) 4 | [Crates.io Version](https://crates.io/crates/litrs) 5 | [docs.rs](https://docs.rs/litrs) 6 | 7 | `litrs` offers functionality to parse Rust literals, i.e. tokens in the Rust programming language that represent fixed values. 8 | For example: `27`, `"crab"`, `true`. 
9 | This is particularly useful for proc macros, but can also be used outside of a proc-macro context. 10 | 11 | **Why this library?** 12 | Unfortunately, the `proc_macro` API shipped with the compiler offers no easy way to inspect literals. 13 | There are mainly two libraries for this purpose: 14 | [`syn`](https://github.com/dtolnay/syn) and [`literalext`](https://github.com/mystor/literalext). 15 | The latter is deprecated. 16 | And `syn` is oftentimes overkill for the task at hand, especially when developing function-like proc-macros (e.g. `foo!(..)`). 17 | This crate is a lightweight alternative. 18 | Also, when it comes to literals, `litrs` offers a bit more flexibility and a few more features compared to `syn`. 19 | 20 | I'm interested in community feedback! 21 | If you consider using this, please speak your mind [in this issue](https://github.com/LukasKalbertodt/litrs/issues/1). 22 | 23 | ## Example 24 | 25 | ### In proc macro 26 | 27 | ```rust 28 | use std::convert::TryFrom; 29 | use proc_macro::TokenStream; 30 | use litrs::Literal; 31 | 32 | #[proc_macro] 33 | pub fn foo(input: TokenStream) -> TokenStream { 34 | // Please do proper error handling in your real code! 35 | let first_token = input.into_iter().next().expect("no input"); 36 | 37 | // `try_from` will return an error if the token is not a literal. 38 | match Literal::try_from(first_token) { 39 | // Convenient methods to produce decent errors via `compile_error!`. 40 | Err(e) => return e.to_compile_error(), 41 | 42 | // You can now inspect your literal! 43 | Ok(Literal::Integer(i)) => { 44 | println!("Got an integer specified in base {:?}", i.base()); 45 | 46 | let value = i.value::<u64>().expect("integer literal too large"); 47 | println!("Is your integer even? 
{}", value % 2 == 0); 48 | } 49 | Ok(other) => { 50 | println!("Got a non-integer literal"); 51 | } 52 | } 53 | 54 | TokenStream::new() // dummy output 55 | } 56 | ``` 57 | 58 | If you are expecting a specific kind of literal, you can also use this, which will return an error if the token is not a float literal. 59 | 60 | ```rust 61 | FloatLit::try_from(first_token) 62 | ``` 63 | 64 | ### Parsing from a `&str` 65 | 66 | Outside of a proc macro context you might want to parse a string directly. 67 | 68 | ```rust 69 | use litrs::{FloatLit, Literal}; 70 | 71 | let lit = Literal::parse("'🦀'").expect("failed to parse literal"); 72 | let float_lit = FloatLit::parse("2.7e3").expect("failed to parse as float literal"); 73 | ``` 74 | 75 | See [**the documentation**](https://docs.rs/litrs) or the `examples/` directory for more examples and information. 76 | 77 | 78 |
79 | 80 | --- 81 | 82 | ## License 83 | 84 | Licensed under either of Apache License, Version 85 | 2.0 or MIT license at your option. 86 | Unless you explicitly state otherwise, any contribution intentionally submitted 87 | for inclusion in this project by you, as defined in the Apache-2.0 license, 88 | shall be dual licensed as above, without any additional terms or conditions. 89 | -------------------------------------------------------------------------------- /examples/procmacro/.gitignore: -------------------------------------------------------------------------------- 1 | target 2 | Cargo.lock 3 | -------------------------------------------------------------------------------- /examples/procmacro/Cargo.toml: -------------------------------------------------------------------------------- 1 | [package] 2 | name = "procmacro-example" 3 | version = "0.1.0" 4 | authors = ["Lukas Kalbertodt "] 5 | edition = "2018" 6 | publish = false 7 | 8 | [lib] 9 | proc-macro = true 10 | 11 | [dependencies] 12 | litrs = { path = "../.." } 13 | -------------------------------------------------------------------------------- /examples/procmacro/README.md: -------------------------------------------------------------------------------- 1 | # Proc macro example 2 | 3 | Three simple function-like proc macros are defined [in `src/lib.rs`](src/lib.rs). 
4 | Run them with this command in this directory: 5 | 6 | ``` 7 | cargo run --example main 8 | ``` 9 | -------------------------------------------------------------------------------- /examples/procmacro/examples/main.rs: -------------------------------------------------------------------------------- 1 | use procmacro_example::{concat, dbg_and_swallow, repeat}; 2 | 3 | const FOO: &str = concat!(r#"Hello "# '🦊' "\nHere is a friend: \u{1F427}"); 4 | // const FOO: &str = concat!(::); 5 | // const FOO: &str = concat!(peter); 6 | 7 | const BAR: &str = repeat!(3 * "నా పిల్లి లావుగా ఉంది"); 8 | const BAZ: &str = repeat!(0b101 * "🦀"); 9 | // const BAZ: &str = repeat!(3.5 * "🦀"); 10 | 11 | dbg_and_swallow!(16px); 12 | 13 | fn main() { 14 | println!("{}", FOO); 15 | println!("{}", BAR); 16 | println!("{}", BAZ); 17 | } 18 | -------------------------------------------------------------------------------- /examples/procmacro/src/lib.rs: -------------------------------------------------------------------------------- 1 | use std::convert::TryFrom; 2 | use proc_macro::{Spacing, TokenStream, TokenTree}; 3 | use litrs::{Literal, IntegerLit, StringLit}; 4 | 5 | 6 | #[proc_macro] 7 | pub fn dbg_and_swallow(input: TokenStream) -> TokenStream { 8 | for token in input { 9 | println!("{} -> {:#?}", token, Literal::try_from(&token)); 10 | } 11 | TokenStream::new() 12 | } 13 | 14 | /// Concatenates all input string and char literals into a single output string 15 | /// literal. 16 | #[proc_macro] 17 | pub fn concat(input: TokenStream) -> TokenStream { 18 | let mut out = String::new(); 19 | 20 | for tt in input { 21 | let lit = match Literal::try_from(tt) { 22 | Ok(lit) => lit, 23 | Err(e) => return e.to_compile_error(), 24 | }; 25 | 26 | // Here we can match over the literal to inspect it. All literal kinds 27 | // have a `value` method to return the represented value. 
28 | println!("{:?}", lit); 29 | match lit { 30 | Literal::String(s) => out.push_str(s.value()), 31 | Literal::Char(c) => out.push(c.value()), 32 | _ => panic!("input has to be char or string literals, but this is not: {}", lit), 33 | } 34 | } 35 | 36 | TokenTree::Literal(proc_macro::Literal::string(&out)).into() 37 | } 38 | 39 | /// Repeats a given string a given number of times. Example: `repeat! 40 | /// (3 * "foo")` will result in `"foofoofoo"`. 41 | #[proc_macro] 42 | pub fn repeat(input: TokenStream) -> TokenStream { 43 | // Validate input 44 | let (int, string) = match &*input.into_iter().collect::<Vec<_>>() { 45 | [TokenTree::Literal(int), TokenTree::Punct(p), TokenTree::Literal(string)] => { 46 | if p.as_char() != '*' || p.spacing() != Spacing::Alone { 47 | panic!("second token has to be a single `*`"); 48 | } 49 | 50 | let int = match IntegerLit::try_from(int) { 51 | Ok(i) => i, 52 | Err(e) => return e.to_compile_error(), 53 | }; 54 | let string = match StringLit::try_from(string) { 55 | Ok(s) => s, 56 | Err(e) => return e.to_compile_error(), 57 | }; 58 | 59 | (int, string) 60 | } 61 | _ => panic!("expected three input tokens: `<int> * <string>`"), 62 | }; 63 | 64 | // Create the output string 65 | let times = int.value::<u32>().expect("integer value too large :("); 66 | let out = (0..times).map(|_| string.value()).collect::<String>(); 67 | TokenTree::Literal(proc_macro::Literal::string(&out)).into() 68 | } 69 | -------------------------------------------------------------------------------- /src/bool/mod.rs: -------------------------------------------------------------------------------- 1 | use std::fmt; 2 | 3 | use crate::{ParseError, err::{perr, ParseErrorKind::*}}; 4 | 5 | 6 | /// A bool literal: `true` or `false`. Also see [the reference][ref]. 7 | /// 8 | /// Notice that, strictly speaking, from Rust's point of view "boolean literals" are not 9 | /// actual literals but [keywords]. 
10 | /// 11 | /// [ref]: https://doc.rust-lang.org/reference/expressions/literal-expr.html#boolean-literal-expressions 12 | /// [keywords]: https://doc.rust-lang.org/reference/keywords.html#strict-keywords 13 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 14 | pub enum BoolLit { 15 | False, 16 | True, 17 | } 18 | 19 | impl BoolLit { 20 | /// Parses the input as a bool literal. Returns an error if the input is 21 | /// invalid or represents a different kind of literal. 22 | pub fn parse(s: &str) -> Result<Self, ParseError> { 23 | match s { 24 | "false" => Ok(Self::False), 25 | "true" => Ok(Self::True), 26 | _ => Err(perr(None, InvalidLiteral)), 27 | } 28 | } 29 | 30 | /// Returns the actual Boolean value of this literal. 31 | pub fn value(self) -> bool { 32 | match self { 33 | Self::False => false, 34 | Self::True => true, 35 | } 36 | } 37 | 38 | /// Returns the literal as string. 39 | pub fn as_str(&self) -> &'static str { 40 | match self { 41 | Self::False => "false", 42 | Self::True => "true", 43 | } 44 | } 45 | } 46 | 47 | impl fmt::Display for BoolLit { 48 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 49 | f.pad(self.as_str()) 50 | } 51 | } 52 | 53 | 54 | #[cfg(test)] 55 | mod tests; 56 | -------------------------------------------------------------------------------- /src/bool/tests.rs: -------------------------------------------------------------------------------- 1 | use crate::{ 2 | Literal, BoolLit, 3 | test_util::assert_parse_ok_eq, 4 | }; 5 | 6 | macro_rules! 
assert_bool_parse { 7 | ($input:literal, $expected:expr) => { 8 | assert_parse_ok_eq( 9 | $input, Literal::parse($input), Literal::Bool($expected), "Literal::parse"); 10 | assert_parse_ok_eq($input, BoolLit::parse($input), $expected, "BoolLit::parse"); 11 | }; 12 | } 13 | 14 | 15 | 16 | #[test] 17 | fn parse_ok() { 18 | assert_bool_parse!("false", BoolLit::False); 19 | assert_bool_parse!("true", BoolLit::True); 20 | } 21 | 22 | #[test] 23 | fn parse_err() { 24 | assert!(Literal::parse("fa").is_err()); 25 | assert!(Literal::parse("fal").is_err()); 26 | assert!(Literal::parse("fals").is_err()); 27 | assert!(Literal::parse(" false").is_err()); 28 | assert!(Literal::parse("false ").is_err()); 29 | assert!(Literal::parse("False").is_err()); 30 | 31 | assert!(Literal::parse("tr").is_err()); 32 | assert!(Literal::parse("tru").is_err()); 33 | assert!(Literal::parse(" true").is_err()); 34 | assert!(Literal::parse("true ").is_err()); 35 | assert!(Literal::parse("True").is_err()); 36 | } 37 | 38 | #[test] 39 | fn value() { 40 | assert!(!BoolLit::False.value()); 41 | assert!(BoolLit::True.value()); 42 | } 43 | 44 | #[test] 45 | fn as_str() { 46 | assert_eq!(BoolLit::False.as_str(), "false"); 47 | assert_eq!(BoolLit::True.as_str(), "true"); 48 | } 49 | -------------------------------------------------------------------------------- /src/byte/mod.rs: -------------------------------------------------------------------------------- 1 | use core::fmt; 2 | 3 | use crate::{ 4 | Buffer, ParseError, 5 | err::{perr, ParseErrorKind::*}, 6 | escape::unescape, 7 | parse::check_suffix, 8 | }; 9 | 10 | 11 | /// A (single) byte literal, e.g. `b'k'` or `b'!'`. 12 | /// 13 | /// See [the reference][ref] for more information. 14 | /// 15 | /// [ref]: https://doc.rust-lang.org/reference/tokens.html#byte-literals 16 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 17 | pub struct ByteLit<B: Buffer> { 18 | raw: B, 19 | /// Start index of the suffix or `raw.len()` if there is no suffix. 
20 | start_suffix: usize, 21 | value: u8, 22 | } 23 | 24 | impl<B: Buffer> ByteLit<B> { 25 | /// Parses the input as a byte literal. Returns an error if the input is 26 | /// invalid or represents a different kind of literal. 27 | pub fn parse(input: B) -> Result<Self, ParseError> { 28 | if input.is_empty() { 29 | return Err(perr(None, Empty)); 30 | } 31 | if !input.starts_with("b'") { 32 | return Err(perr(None, InvalidByteLiteralStart)); 33 | } 34 | 35 | let (value, start_suffix) = parse_impl(&input)?; 36 | Ok(Self { raw: input, value, start_suffix }) 37 | } 38 | 39 | /// Returns the byte value that this literal represents. 40 | pub fn value(&self) -> u8 { 41 | self.value 42 | } 43 | 44 | /// The optional suffix. Returns `""` if the suffix is empty/does not exist. 45 | pub fn suffix(&self) -> &str { 46 | &(*self.raw)[self.start_suffix..] 47 | } 48 | 49 | /// Returns the raw input that was passed to `parse`. 50 | pub fn raw_input(&self) -> &str { 51 | &self.raw 52 | } 53 | 54 | /// Returns the raw input that was passed to `parse`, potentially owned. 55 | pub fn into_raw_input(self) -> B { 56 | self.raw 57 | } 58 | 59 | } 60 | 61 | impl ByteLit<&str> { 62 | /// Makes a copy of the underlying buffer and returns the owned version of 63 | /// `Self`. 64 | pub fn to_owned(&self) -> ByteLit<String> { 65 | ByteLit { 66 | raw: self.raw.to_owned(), 67 | start_suffix: self.start_suffix, 68 | value: self.value, 69 | } 70 | } 71 | } 72 | 73 | impl<B: Buffer> fmt::Display for ByteLit<B> { 74 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 75 | f.pad(&self.raw) 76 | } 77 | } 78 | 79 | /// Precondition: must start with `b'`. 
80 | #[inline(never)] 81 | pub(crate) fn parse_impl(input: &str) -> Result<(u8, usize), ParseError> { 82 | let input_bytes = input.as_bytes(); 83 | let first = input_bytes.get(2).ok_or(perr(None, UnterminatedByteLiteral))?; 84 | let (c, len) = match first { 85 | b'\'' if input_bytes.get(3) == Some(&b'\'') => return Err(perr(2, UnescapedSingleQuote)), 86 | b'\'' => return Err(perr(None, EmptyByteLiteral)), 87 | b'\n' | b'\t' | b'\r' => return Err(perr(2, UnescapedSpecialWhitespace)), 88 | b'\\' => unescape::<u8>(&input[2..], 2)?, 89 | other if other.is_ascii() => (*other, 1), 90 | _ => return Err(perr(2, NonAsciiInByteLiteral)), 91 | }; 92 | 93 | match input[2 + len..].find('\'') { 94 | Some(0) => {} 95 | Some(_) => return Err(perr(None, OverlongByteLiteral)), 96 | None => return Err(perr(None, UnterminatedByteLiteral)), 97 | } 98 | 99 | let start_suffix = 2 + len + 1; 100 | let suffix = &input[start_suffix..]; 101 | check_suffix(suffix).map_err(|kind| perr(start_suffix, kind))?; 102 | 103 | Ok((c, start_suffix)) 104 | } 105 | 106 | #[cfg(test)] 107 | mod tests; 108 | -------------------------------------------------------------------------------- /src/byte/tests.rs: -------------------------------------------------------------------------------- 1 | use crate::{ByteLit, Literal, test_util::{assert_parse_ok_eq, assert_roundtrip}}; 2 | 3 | // ===== Utility functions ======================================================================= 4 | 5 | macro_rules! 
check { 6 | ($lit:literal) => { check!($lit, stringify!($lit), "") }; 7 | ($lit:literal, $input:expr, $suffix:literal) => { 8 | let input = $input; 9 | let expected = ByteLit { 10 | raw: input, 11 | start_suffix: input.len() - $suffix.len(), 12 | value: $lit, 13 | }; 14 | 15 | assert_parse_ok_eq(input, ByteLit::parse(input), expected.clone(), "ByteLit::parse"); 16 | assert_parse_ok_eq(input, Literal::parse(input), Literal::Byte(expected), "Literal::parse"); 17 | let lit = ByteLit::parse(input).unwrap(); 18 | assert_eq!(lit.value(), $lit); 19 | assert_eq!(lit.suffix(), $suffix); 20 | assert_roundtrip(expected.to_owned(), input); 21 | }; 22 | } 23 | 24 | 25 | // ===== Actual tests ============================================================================ 26 | 27 | #[test] 28 | fn alphanumeric() { 29 | check!(b'a'); 30 | check!(b'b'); 31 | check!(b'y'); 32 | check!(b'z'); 33 | check!(b'A'); 34 | check!(b'B'); 35 | check!(b'Y'); 36 | check!(b'Z'); 37 | 38 | check!(b'0'); 39 | check!(b'1'); 40 | check!(b'8'); 41 | check!(b'9'); 42 | } 43 | 44 | #[test] 45 | fn special_chars() { 46 | check!(b' '); 47 | check!(b'!'); 48 | check!(b'"'); 49 | check!(b'#'); 50 | check!(b'$'); 51 | check!(b'%'); 52 | check!(b'&'); 53 | check!(b'('); 54 | check!(b')'); 55 | check!(b'*'); 56 | check!(b'+'); 57 | check!(b','); 58 | check!(b'-'); 59 | check!(b'.'); 60 | check!(b'/'); 61 | check!(b':'); 62 | check!(b';'); 63 | check!(b'<'); 64 | check!(b'='); 65 | check!(b'>'); 66 | check!(b'?'); 67 | check!(b'@'); 68 | check!(b'['); 69 | check!(b']'); 70 | check!(b'^'); 71 | check!(b'_'); 72 | check!(b'`'); 73 | check!(b'{'); 74 | check!(b'|'); 75 | check!(b'}'); 76 | check!(b'~'); 77 | } 78 | 79 | #[test] 80 | fn quote_escapes() { 81 | check!(b'\''); 82 | check!(b'\"'); 83 | } 84 | 85 | #[test] 86 | fn ascii_escapes() { 87 | check!(b'\n'); 88 | check!(b'\r'); 89 | check!(b'\t'); 90 | check!(b'\\'); 91 | check!(b'\0'); 92 | 93 | check!(b'\x00'); 94 | check!(b'\x01'); 95 | check!(b'\x0c'); 96 | 
check!(b'\x0D'); 97 | check!(b'\x13'); 98 | check!(b'\x30'); 99 | check!(b'\x30'); 100 | check!(b'\x4B'); 101 | check!(b'\x6b'); 102 | check!(b'\x7F'); 103 | check!(b'\x7f'); 104 | } 105 | 106 | #[test] 107 | fn byte_escapes() { 108 | check!(b'\x80'); 109 | check!(b'\x8a'); 110 | check!(b'\x8C'); 111 | check!(b'\x99'); 112 | check!(b'\xa0'); 113 | check!(b'\xAd'); 114 | check!(b'\xfe'); 115 | check!(b'\xFe'); 116 | check!(b'\xfF'); 117 | check!(b'\xFF'); 118 | } 119 | 120 | #[test] 121 | fn suffixes() { 122 | check!(b'a', r##"b'a'peter"##, "peter"); 123 | check!(b'#', r##"b'#'peter"##, "peter"); 124 | check!(b'\n', r##"b'\n'peter"##, "peter"); 125 | check!(b'\'', r##"b'\''peter"##, "peter"); 126 | check!(b'\"', r##"b'\"'peter"##, "peter"); 127 | check!(b'\xFF', r##"b'\xFF'peter"##, "peter"); 128 | } 129 | 130 | #[test] 131 | fn invalid_escapes() { 132 | assert_err!(ByteLit, r"b'\a'", UnknownEscape, 2..4); 133 | assert_err!(ByteLit, r"b'\y'", UnknownEscape, 2..4); 134 | assert_err!(ByteLit, r"b'\", UnterminatedEscape, 2..3); 135 | assert_err!(ByteLit, r"b'\x'", UnterminatedEscape, 2..5); 136 | assert_err!(ByteLit, r"b'\x1'", InvalidXEscape, 2..6); 137 | assert_err!(ByteLit, r"b'\xaj'", InvalidXEscape, 2..6); 138 | assert_err!(ByteLit, r"b'\xjb'", InvalidXEscape, 2..6); 139 | } 140 | 141 | #[test] 142 | fn unicode_escape_not_allowed() { 143 | assert_err!(ByteLit, r"b'\u{0}'", UnicodeEscapeInByteLiteral, 2..4); 144 | assert_err!(ByteLit, r"b'\u{00}'", UnicodeEscapeInByteLiteral, 2..4); 145 | assert_err!(ByteLit, r"b'\u{b}'", UnicodeEscapeInByteLiteral, 2..4); 146 | assert_err!(ByteLit, r"b'\u{B}'", UnicodeEscapeInByteLiteral, 2..4); 147 | assert_err!(ByteLit, r"b'\u{7e}'", UnicodeEscapeInByteLiteral, 2..4); 148 | assert_err!(ByteLit, r"b'\u{E4}'", UnicodeEscapeInByteLiteral, 2..4); 149 | assert_err!(ByteLit, r"b'\u{e4}'", UnicodeEscapeInByteLiteral, 2..4); 150 | assert_err!(ByteLit, r"b'\u{fc}'", UnicodeEscapeInByteLiteral, 2..4); 151 | assert_err!(ByteLit, 
r"b'\u{Fc}'", UnicodeEscapeInByteLiteral, 2..4); 152 | assert_err!(ByteLit, r"b'\u{fC}'", UnicodeEscapeInByteLiteral, 2..4); 153 | assert_err!(ByteLit, r"b'\u{FC}'", UnicodeEscapeInByteLiteral, 2..4); 154 | assert_err!(ByteLit, r"b'\u{b10}'", UnicodeEscapeInByteLiteral, 2..4); 155 | assert_err!(ByteLit, r"b'\u{B10}'", UnicodeEscapeInByteLiteral, 2..4); 156 | assert_err!(ByteLit, r"b'\u{0b10}'", UnicodeEscapeInByteLiteral, 2..4); 157 | assert_err!(ByteLit, r"b'\u{2764}'", UnicodeEscapeInByteLiteral, 2..4); 158 | assert_err!(ByteLit, r"b'\u{1f602}'", UnicodeEscapeInByteLiteral, 2..4); 159 | assert_err!(ByteLit, r"b'\u{1F602}'", UnicodeEscapeInByteLiteral, 2..4); 160 | } 161 | 162 | #[test] 163 | fn parse_err() { 164 | assert_err!(ByteLit, r"b''", EmptyByteLiteral, None); 165 | assert_err!(ByteLit, r"b' ''", UnexpectedChar, 4..5); 166 | 167 | assert_err!(ByteLit, r"b'", UnterminatedByteLiteral, None); 168 | assert_err!(ByteLit, r"b'a", UnterminatedByteLiteral, None); 169 | assert_err!(ByteLit, r"b'\n", UnterminatedByteLiteral, None); 170 | assert_err!(ByteLit, r"b'\x35", UnterminatedByteLiteral, None); 171 | 172 | assert_err!(ByteLit, r"b'ab'", OverlongByteLiteral, None); 173 | assert_err!(ByteLit, r"b'a _'", OverlongByteLiteral, None); 174 | assert_err!(ByteLit, r"b'\n3'", OverlongByteLiteral, None); 175 | 176 | assert_err!(ByteLit, r"", Empty, None); 177 | 178 | assert_err!(ByteLit, r"b'''", UnescapedSingleQuote, 2); 179 | assert_err!(ByteLit, r"b''''", UnescapedSingleQuote, 2); 180 | 181 | assert_err!(ByteLit, "b'\n'", UnescapedSpecialWhitespace, 2); 182 | assert_err!(ByteLit, "b'\t'", UnescapedSpecialWhitespace, 2); 183 | assert_err!(ByteLit, "b'\r'", UnescapedSpecialWhitespace, 2); 184 | 185 | assert_err!(ByteLit, "b'న'", NonAsciiInByteLiteral, 2); 186 | assert_err!(ByteLit, "b'犬'", NonAsciiInByteLiteral, 2); 187 | assert_err!(ByteLit, "b'🦊'", NonAsciiInByteLiteral, 2); 188 | } 189 | -------------------------------------------------------------------------------- 
/src/bytestr/mod.rs: -------------------------------------------------------------------------------- 1 | use std::{fmt, ops::Range}; 2 | 3 | use crate::{ 4 | Buffer, ParseError, 5 | err::{perr, ParseErrorKind::*}, 6 | escape::{scan_raw_string, unescape_string}, 7 | }; 8 | 9 | 10 | /// A byte string or raw byte string literal, e.g. `b"hello"` or `br#"abc"def"#`. 11 | /// 12 | /// See [the reference][ref] for more information. 13 | /// 14 | /// [ref]: https://doc.rust-lang.org/reference/tokens.html#byte-string-literals 15 | #[derive(Debug, Clone, PartialEq, Eq)] 16 | pub struct ByteStringLit<B: Buffer> { 17 | /// The raw input. 18 | raw: B, 19 | 20 | /// The string value (with all escapes unescaped), or `None` if there were 21 | /// no escapes. In the latter case, the string value is a substring of `raw`. 22 | value: Option<Vec<u8>>, 23 | 24 | /// The number of hash signs in case of a raw string literal, or `None` if 25 | /// it's not a raw string literal. 26 | num_hashes: Option<u32>, 27 | 28 | /// Start index of the suffix or `raw.len()` if there is no suffix. 29 | start_suffix: usize, 30 | } 31 | 32 | impl<B: Buffer> ByteStringLit<B> { 33 | /// Parses the input as a (raw) byte string literal. Returns an error if the 34 | /// input is invalid or represents a different kind of literal. 35 | pub fn parse(input: B) -> Result<Self, ParseError> { 36 | if input.is_empty() { 37 | return Err(perr(None, Empty)); 38 | } 39 | if !input.starts_with(r#"b""#) && !input.starts_with("br") { 40 | return Err(perr(None, InvalidByteStringLiteralStart)); 41 | } 42 | 43 | let (value, num_hashes, start_suffix) = parse_impl(&input)?; 44 | Ok(Self { raw: input, value, num_hashes, start_suffix }) 45 | } 46 | 47 | /// Returns the string value this literal represents (where all escapes have 48 | /// been turned into their respective values). 49 | pub fn value(&self) -> &[u8] { 50 | self.value.as_deref().unwrap_or(&self.raw.as_bytes()[self.inner_range()]) 51 | } 52 | 53 | /// Like `value` but returns a potentially owned version of the value. 
54 | /// 55 | /// The return value is either `Cow<'static, [u8]>` if `B = String`, or 56 | /// `Cow<'a, [u8]>` if `B = &'a str`. 57 | pub fn into_value(self) -> B::ByteCow { 58 | let inner_range = self.inner_range(); 59 | let Self { raw, value, .. } = self; 60 | value.map(B::ByteCow::from).unwrap_or_else(|| raw.cut(inner_range).into_byte_cow()) 61 | } 62 | 63 | /// The optional suffix. Returns `""` if the suffix is empty/does not exist. 64 | pub fn suffix(&self) -> &str { 65 | &(*self.raw)[self.start_suffix..] 66 | } 67 | 68 | /// Returns whether this literal is a raw byte string literal (starting 69 | /// with `br`). 70 | pub fn is_raw_byte_string(&self) -> bool { 71 | self.num_hashes.is_some() 72 | } 73 | 74 | /// Returns the raw input that was passed to `parse`. 75 | pub fn raw_input(&self) -> &str { 76 | &self.raw 77 | } 78 | 79 | /// Returns the raw input that was passed to `parse`, potentially owned. 80 | pub fn into_raw_input(self) -> B { 81 | self.raw 82 | } 83 | 84 | /// The range within `self.raw` that excludes the quotes and potential `r#`. 85 | fn inner_range(&self) -> Range<usize> { 86 | match self.num_hashes { 87 | None => 2..self.start_suffix - 1, 88 | Some(n) => 2 + n as usize + 1..self.start_suffix - n as usize - 1, 89 | } 90 | } 91 | } 92 | 93 | impl ByteStringLit<&str> { 94 | /// Makes a copy of the underlying buffer and returns the owned version of 95 | /// `Self`. 96 | pub fn into_owned(self) -> ByteStringLit<String> { 97 | ByteStringLit { 98 | raw: self.raw.to_owned(), 99 | value: self.value, 100 | num_hashes: self.num_hashes, 101 | start_suffix: self.start_suffix, 102 | } 103 | } 104 | } 105 | 106 | impl<B: Buffer> fmt::Display for ByteStringLit<B> { 107 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 108 | f.pad(&self.raw) 109 | } 110 | } 111 | 112 | 113 | /// Precondition: input has to start with either `b"` or `br`. 
114 | #[inline(never)] 115 | fn parse_impl(input: &str) -> Result<(Option<Vec<u8>>, Option<u32>, usize), ParseError> { 116 | if input.starts_with("br") { 117 | scan_raw_string::<u8>(&input, 2) 118 | .map(|(v, num, start_suffix)| (v.map(String::into_bytes), Some(num), start_suffix)) 119 | } else { 120 | unescape_string::<u8>(&input, 2) 121 | .map(|(v, start_suffix)| (v.map(String::into_bytes), None, start_suffix)) 122 | } 123 | } 124 | 125 | #[cfg(test)] 126 | mod tests; 127 | -------------------------------------------------------------------------------- /src/bytestr/tests.rs: -------------------------------------------------------------------------------- 1 | use crate::{Literal, ByteStringLit, test_util::{assert_parse_ok_eq, assert_roundtrip}}; 2 | 3 | // ===== Utility functions ======================================================================= 4 | 5 | macro_rules! check { 6 | ($lit:literal, $has_escapes:expr, $num_hashes:expr) => { 7 | check!($lit, stringify!($lit), $has_escapes, $num_hashes, "") 8 | }; 9 | ($lit:literal, $input:expr, $has_escapes:expr, $num_hashes:expr, $suffix:literal) => { 10 | let input = $input; 11 | let expected = ByteStringLit { 12 | raw: input, 13 | value: if $has_escapes { Some($lit.to_vec()) } else { None }, 14 | num_hashes: $num_hashes, 15 | start_suffix: input.len() - $suffix.len(), 16 | }; 17 | 18 | assert_parse_ok_eq( 19 | input, ByteStringLit::parse(input), expected.clone(), "ByteStringLit::parse"); 20 | assert_parse_ok_eq( 21 | input, Literal::parse(input), Literal::ByteString(expected.clone()), "Literal::parse"); 22 | let lit = ByteStringLit::parse(input).unwrap(); 23 | assert_eq!(lit.value(), $lit); 24 | assert_eq!(lit.suffix(), $suffix); 25 | assert_eq!(lit.into_value().as_ref(), $lit); 26 | assert_roundtrip(expected.into_owned(), input); 27 | }; 28 | } 29 | 30 | 31 | // ===== Actual tests ============================================================================ 32 | 33 | #[test] 34 | fn simple() { 35 | check!(b"", false, None); 36 | 
check!(b"a", false, None); 37 | check!(b"peter", false, None); 38 | } 39 | 40 | #[test] 41 | fn special_whitespace() { 42 | let strings = ["\n", "\t", "foo\tbar", "baz\n"]; 43 | 44 | for &s in &strings { 45 | let input = format!(r#"b"{}""#, s); 46 | let input_raw = format!(r#"br"{}""#, s); 47 | for (input, num_hashes) in vec![(input, None), (input_raw, Some(0))] { 48 | let expected = ByteStringLit { 49 | raw: &*input, 50 | value: None, 51 | num_hashes, 52 | start_suffix: input.len(), 53 | }; 54 | assert_parse_ok_eq( 55 | &input, ByteStringLit::parse(&*input), expected.clone(), "ByteStringLit::parse"); 56 | assert_parse_ok_eq( 57 | &input, Literal::parse(&*input), Literal::ByteString(expected), "Literal::parse"); 58 | assert_eq!(ByteStringLit::parse(&*input).unwrap().value(), s.as_bytes()); 59 | assert_eq!(ByteStringLit::parse(&*input).unwrap().into_value(), s.as_bytes()); 60 | } 61 | } 62 | 63 | let res = ByteStringLit::parse("br\"\r\"").expect("failed to parse"); 64 | assert_eq!(res.value(), b"\r"); 65 | } 66 | 67 | #[test] 68 | fn simple_escapes() { 69 | check!(b"a\nb", true, None); 70 | check!(b"\nb", true, None); 71 | check!(b"a\n", true, None); 72 | check!(b"\n", true, None); 73 | 74 | check!(b"\x60foo \t bar\rbaz\n banana \0kiwi", true, None); 75 | check!(b"foo \\ferris", true, None); 76 | check!(b"baz \\ferris\"box", true, None); 77 | check!(b"\\foo\\ banana\" baz\"", true, None); 78 | check!(b"\"foo \\ferris \" baz\\", true, None); 79 | 80 | check!(b"\x00", true, None); 81 | check!(b" \x01", true, None); 82 | check!(b"\x0c foo", true, None); 83 | check!(b" foo\x0D ", true, None); 84 | check!(b"\\x13", true, None); 85 | check!(b"\"x30", true, None); 86 | } 87 | 88 | #[test] 89 | fn string_continue() { 90 | check!(b"foo\ 91 | bar", true, None); 92 | check!(b"foo\ 93 | bar", true, None); 94 | 95 | check!(b"foo\ 96 | 97 | banana", true, None); 98 | 99 | // Weird whitespace characters 100 | let lit = ByteStringLit::parse("b\"foo\\\n\r\t\n 
\n\tbar\"").expect("failed to parse"); 101 | assert_eq!(lit.value(), b"foobar"); 102 | 103 | // Raw strings do not handle "string continues" 104 | check!(br"foo\ 105 | bar", false, Some(0)); 106 | } 107 | 108 | #[test] 109 | fn crlf_newlines() { 110 | let lit = ByteStringLit::parse("b\"foo\r\nbar\"").expect("failed to parse"); 111 | assert_eq!(lit.value(), b"foo\nbar"); 112 | 113 | let lit = ByteStringLit::parse("b\"\r\nbar\"").expect("failed to parse"); 114 | assert_eq!(lit.value(), b"\nbar"); 115 | 116 | let lit = ByteStringLit::parse("b\"foo\r\n\"").expect("failed to parse"); 117 | assert_eq!(lit.value(), b"foo\n"); 118 | 119 | let lit = ByteStringLit::parse("br\"foo\r\nbar\"").expect("failed to parse"); 120 | assert_eq!(lit.value(), b"foo\nbar"); 121 | 122 | let lit = ByteStringLit::parse("br#\"\r\nbar\"#").expect("failed to parse"); 123 | assert_eq!(lit.value(), b"\nbar"); 124 | 125 | let lit = ByteStringLit::parse("br##\"foo\r\n\"##").expect("failed to parse"); 126 | assert_eq!(lit.value(), b"foo\n"); 127 | } 128 | 129 | #[test] 130 | fn raw_byte_string() { 131 | check!(br"", false, Some(0)); 132 | check!(br"a", false, Some(0)); 133 | check!(br"peter", false, Some(0)); 134 | check!(br"Greetings jason!", false, Some(0)); 135 | 136 | check!(br#""#, false, Some(1)); 137 | check!(br#"a"#, false, Some(1)); 138 | check!(br##"peter"##, false, Some(2)); 139 | check!(br###"Greetings # Jason!"###, false, Some(3)); 140 | check!(br########"we ## need #### more ####### hashtags"########, false, Some(8)); 141 | 142 | check!(br#"foo " bar"#, false, Some(1)); 143 | check!(br##"foo " bar"##, false, Some(2)); 144 | check!(br#"foo """" '"'" bar"#, false, Some(1)); 145 | check!(br#""foo""#, false, Some(1)); 146 | check!(br###""foo'"###, false, Some(3)); 147 | check!(br#""x'#_#s'"#, false, Some(1)); 148 | check!(br"#", false, Some(0)); 149 | check!(br"foo#", false, Some(0)); 150 | check!(br"##bar", false, Some(0)); 151 | check!(br###""##foo"##bar'"###, false, Some(3)); 152 | 153 
| check!(br"foo\n\t\r\0\\x60\u{123}doggo", false, Some(0)); 154 | check!(br#"cat\n\t\r\0\\x60\u{123}doggo"#, false, Some(1)); 155 | } 156 | 157 | #[test] 158 | fn suffixes() { 159 | check!(b"hello", r###"b"hello"suffix"###, false, None, "suffix"); 160 | check!(b"fox", r#"b"fox"peter"#, false, None, "peter"); 161 | check!(b"a\x0cb\\", r#"b"a\x0cb\\"_jürgen"#, true, None, "_jürgen"); 162 | check!(br"a\x0cb\\", r###"br#"a\x0cb\\"#_jürgen"###, false, Some(1), "_jürgen"); 163 | } 164 | 165 | #[test] 166 | fn parse_err() { 167 | assert_err!(ByteStringLit, r#"b""#, UnterminatedString, None); 168 | assert_err!(ByteStringLit, r#"b"cat"#, UnterminatedString, None); 169 | assert_err!(ByteStringLit, r#"b"Jurgen"#, UnterminatedString, None); 170 | assert_err!(ByteStringLit, r#"b"foo bar baz"#, UnterminatedString, None); 171 | 172 | assert_err!(ByteStringLit, r#"b"fox"peter""#, InvalidSuffix, 6); 173 | assert_err!(ByteStringLit, r###"br#"foo "# bar"#"###, UnexpectedChar, 10); 174 | 175 | assert_err!(ByteStringLit, "b\"\r\"", IsolatedCr, 2); 176 | assert_err!(ByteStringLit, "b\"fo\rx\"", IsolatedCr, 4); 177 | 178 | assert_err!(ByteStringLit, r##"br####""##, UnterminatedRawString, None); 179 | assert_err!(ByteStringLit, r#####"br##"foo"#bar"#####, UnterminatedRawString, None); 180 | assert_err!(ByteStringLit, r##"br####"##, InvalidLiteral, None); 181 | assert_err!(ByteStringLit, r##"br####x"##, InvalidLiteral, None); 182 | } 183 | 184 | #[test] 185 | fn non_ascii() { 186 | assert_err!(ByteStringLit, r#"b"న""#, NonAsciiInByteLiteral, 2); 187 | assert_err!(ByteStringLit, r#"b"foo犬""#, NonAsciiInByteLiteral, 5); 188 | assert_err!(ByteStringLit, r#"b"x🦊baz""#, NonAsciiInByteLiteral, 3); 189 | assert_err!(ByteStringLit, r#"br"న""#, NonAsciiInByteLiteral, 3); 190 | assert_err!(ByteStringLit, r#"br"foo犬""#, NonAsciiInByteLiteral, 6); 191 | assert_err!(ByteStringLit, r#"br"x🦊baz""#, NonAsciiInByteLiteral, 4); 192 | } 193 | 194 | #[test] 195 | fn invalid_escapes() { 196 | 
assert_err!(ByteStringLit, r#"b"\a""#, UnknownEscape, 2..4); 197 | assert_err!(ByteStringLit, r#"b"foo\y""#, UnknownEscape, 5..7); 198 | assert_err!(ByteStringLit, r#"b"\"#, UnterminatedEscape, 2); 199 | assert_err!(ByteStringLit, r#"b"\x""#, UnterminatedEscape, 2..4); 200 | assert_err!(ByteStringLit, r#"b"foo\x1""#, UnterminatedEscape, 5..8); 201 | assert_err!(ByteStringLit, r#"b" \xaj""#, InvalidXEscape, 3..7); 202 | assert_err!(ByteStringLit, r#"b"\xjbbaz""#, InvalidXEscape, 2..6); 203 | } 204 | 205 | #[test] 206 | fn unicode_escape_not_allowed() { 207 | assert_err!(ByteStringLit, r#"b"\u{0}""#, UnicodeEscapeInByteLiteral, 2..4); 208 | assert_err!(ByteStringLit, r#"b"\u{00}""#, UnicodeEscapeInByteLiteral, 2..4); 209 | assert_err!(ByteStringLit, r#"b"\u{b}""#, UnicodeEscapeInByteLiteral, 2..4); 210 | assert_err!(ByteStringLit, r#"b"\u{B}""#, UnicodeEscapeInByteLiteral, 2..4); 211 | assert_err!(ByteStringLit, r#"b"\u{7e}""#, UnicodeEscapeInByteLiteral, 2..4); 212 | assert_err!(ByteStringLit, r#"b"\u{E4}""#, UnicodeEscapeInByteLiteral, 2..4); 213 | assert_err!(ByteStringLit, r#"b"\u{e4}""#, UnicodeEscapeInByteLiteral, 2..4); 214 | assert_err!(ByteStringLit, r#"b"\u{fc}""#, UnicodeEscapeInByteLiteral, 2..4); 215 | assert_err!(ByteStringLit, r#"b"\u{Fc}""#, UnicodeEscapeInByteLiteral, 2..4); 216 | assert_err!(ByteStringLit, r#"b"\u{fC}""#, UnicodeEscapeInByteLiteral, 2..4); 217 | assert_err!(ByteStringLit, r#"b"\u{FC}""#, UnicodeEscapeInByteLiteral, 2..4); 218 | assert_err!(ByteStringLit, r#"b"\u{b10}""#, UnicodeEscapeInByteLiteral, 2..4); 219 | assert_err!(ByteStringLit, r#"b"\u{B10}""#, UnicodeEscapeInByteLiteral, 2..4); 220 | assert_err!(ByteStringLit, r#"b"\u{0b10}""#, UnicodeEscapeInByteLiteral, 2..4); 221 | assert_err!(ByteStringLit, r#"b"\u{2764}""#, UnicodeEscapeInByteLiteral, 2..4); 222 | assert_err!(ByteStringLit, r#"b"\u{1f602}""#, UnicodeEscapeInByteLiteral, 2..4); 223 | assert_err!(ByteStringLit, r#"b"\u{1F602}""#, UnicodeEscapeInByteLiteral, 2..4); 224 
| } 225 | -------------------------------------------------------------------------------- /src/char/mod.rs: -------------------------------------------------------------------------------- 1 | use std::fmt; 2 | 3 | use crate::{ 4 | Buffer, ParseError, 5 | err::{perr, ParseErrorKind::*}, 6 | escape::unescape, 7 | parse::{first_byte_or_empty, check_suffix}, 8 | }; 9 | 10 | 11 | /// A character literal, e.g. `'g'` or `'🦊'`. 12 | /// 13 | /// See [the reference][ref] for more information. 14 | /// 15 | /// [ref]: https://doc.rust-lang.org/reference/tokens.html#character-literals 16 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 17 | pub struct CharLit<B: Buffer> { 18 | raw: B, 19 | /// Start index of the suffix or `raw.len()` if there is no suffix. 20 | start_suffix: usize, 21 | value: char, 22 | } 23 | 24 | impl<B: Buffer> CharLit<B> { 25 | /// Parses the input as a character literal. Returns an error if the input 26 | /// is invalid or represents a different kind of literal. 27 | pub fn parse(input: B) -> Result<Self, ParseError> { 28 | match first_byte_or_empty(&input)? { 29 | b'\'' => { 30 | let (value, start_suffix) = parse_impl(&input)?; 31 | Ok(Self { raw: input, value, start_suffix }) 32 | }, 33 | _ => Err(perr(0, DoesNotStartWithQuote)), 34 | } 35 | } 36 | 37 | /// Returns the character value that this literal represents. 38 | pub fn value(&self) -> char { 39 | self.value 40 | } 41 | 42 | /// The optional suffix. Returns `""` if the suffix is empty/does not exist. 43 | pub fn suffix(&self) -> &str { 44 | &(*self.raw)[self.start_suffix..] 45 | } 46 | 47 | /// Returns the raw input that was passed to `parse`. 48 | pub fn raw_input(&self) -> &str { 49 | &self.raw 50 | } 51 | 52 | /// Returns the raw input that was passed to `parse`, potentially owned. 53 | pub fn into_raw_input(self) -> B { 54 | self.raw 55 | } 56 | 57 | } 58 | 59 | impl CharLit<&str> { 60 | /// Makes a copy of the underlying buffer and returns the owned version of 61 | /// `Self`. 
62 | pub fn to_owned(&self) -> CharLit<String> { 63 | CharLit { 64 | raw: self.raw.to_owned(), 65 | start_suffix: self.start_suffix, 66 | value: self.value, 67 | } 68 | } 69 | } 70 | 71 | impl<B: Buffer> fmt::Display for CharLit<B> { 72 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 73 | f.pad(&self.raw) 74 | } 75 | } 76 | 77 | /// Precondition: first character in input must be `'`. 78 | #[inline(never)] 79 | pub(crate) fn parse_impl(input: &str) -> Result<(char, usize), ParseError> { 80 | let first = input.chars().nth(1).ok_or(perr(None, UnterminatedCharLiteral))?; 81 | let (c, len) = match first { 82 | '\'' if input.chars().nth(2) == Some('\'') => return Err(perr(1, UnescapedSingleQuote)), 83 | '\'' => return Err(perr(None, EmptyCharLiteral)), 84 | '\n' | '\t' | '\r' 85 | => return Err(perr(1, UnescapedSpecialWhitespace)), 86 | 87 | '\\' => unescape::<char>(&input[1..], 1)?, 88 | other => (other, other.len_utf8()), 89 | }; 90 | 91 | match input[1 + len..].find('\'') { 92 | Some(0) => {} 93 | Some(_) => return Err(perr(None, OverlongCharLiteral)), 94 | None => return Err(perr(None, UnterminatedCharLiteral)), 95 | } 96 | 97 | let start_suffix = 1 + len + 1; 98 | let suffix = &input[start_suffix..]; 99 | check_suffix(suffix).map_err(|kind| perr(start_suffix, kind))?; 100 | 101 | Ok((c, start_suffix)) 102 | } 103 | 104 | #[cfg(test)] 105 | mod tests; 106 | -------------------------------------------------------------------------------- /src/char/tests.rs: -------------------------------------------------------------------------------- 1 | use crate::{Literal, test_util::{assert_parse_ok_eq, assert_roundtrip}}; 2 | use super::CharLit; 3 | 4 | // ===== Utility functions ======================================================================= 5 | 6 | macro_rules! 
check { 7 | ($lit:literal) => { check!($lit, stringify!($lit), "") }; 8 | ($lit:literal, $input:expr, $suffix:literal) => { 9 | let input = $input; 10 | let expected = CharLit { 11 | raw: input, 12 | start_suffix: input.len() - $suffix.len(), 13 | value: $lit, 14 | }; 15 | 16 | assert_parse_ok_eq(input, CharLit::parse(input), expected.clone(), "CharLit::parse"); 17 | assert_parse_ok_eq(input, Literal::parse(input), Literal::Char(expected), "Literal::parse"); 18 | let lit = CharLit::parse(input).unwrap(); 19 | assert_eq!(lit.value(), $lit); 20 | assert_eq!(lit.suffix(), $suffix); 21 | assert_roundtrip(expected.to_owned(), input); 22 | }; 23 | } 24 | 25 | 26 | // ===== Actual tests ============================================================================ 27 | 28 | #[test] 29 | fn alphanumeric() { 30 | check!('a'); 31 | check!('b'); 32 | check!('y'); 33 | check!('z'); 34 | check!('A'); 35 | check!('B'); 36 | check!('Y'); 37 | check!('Z'); 38 | 39 | check!('0'); 40 | check!('1'); 41 | check!('8'); 42 | check!('9'); 43 | } 44 | 45 | #[test] 46 | fn special_chars() { 47 | check!(' '); 48 | check!('!'); 49 | check!('"'); 50 | check!('#'); 51 | check!('$'); 52 | check!('%'); 53 | check!('&'); 54 | check!('('); 55 | check!(')'); 56 | check!('*'); 57 | check!('+'); 58 | check!(','); 59 | check!('-'); 60 | check!('.'); 61 | check!('/'); 62 | check!(':'); 63 | check!(';'); 64 | check!('<'); 65 | check!('='); 66 | check!('>'); 67 | check!('?'); 68 | check!('@'); 69 | check!('['); 70 | check!(']'); 71 | check!('^'); 72 | check!('_'); 73 | check!('`'); 74 | check!('{'); 75 | check!('|'); 76 | check!('}'); 77 | check!('~'); 78 | } 79 | 80 | #[test] 81 | fn unicode() { 82 | check!('న'); 83 | check!('犬'); 84 | check!('🦊'); 85 | } 86 | 87 | #[test] 88 | fn quote_escapes() { 89 | check!('\''); 90 | check!('\"'); 91 | } 92 | 93 | #[test] 94 | fn ascii_escapes() { 95 | check!('\n'); 96 | check!('\r'); 97 | check!('\t'); 98 | check!('\\'); 99 | check!('\0'); 100 | 101 | 
check!('\x00'); 102 | check!('\x01'); 103 | check!('\x0c'); 104 | check!('\x0D'); 105 | check!('\x13'); 106 | check!('\x30'); 107 | check!('\x30'); 108 | check!('\x4B'); 109 | check!('\x6b'); 110 | check!('\x7F'); 111 | check!('\x7f'); 112 | } 113 | 114 | #[test] 115 | fn unicode_escapes() { 116 | check!('\u{0}'); 117 | check!('\u{00}'); 118 | check!('\u{b}'); 119 | check!('\u{B}'); 120 | check!('\u{7e}'); 121 | check!('\u{E4}'); 122 | check!('\u{e4}'); 123 | check!('\u{fc}'); 124 | check!('\u{Fc}'); 125 | check!('\u{fC}'); 126 | check!('\u{FC}'); 127 | check!('\u{b10}'); 128 | check!('\u{B10}'); 129 | check!('\u{0b10}'); 130 | check!('\u{2764}'); 131 | check!('\u{1f602}'); 132 | check!('\u{1F602}'); 133 | 134 | check!('\u{0}'); 135 | check!('\u{0__}'); 136 | check!('\u{3_b}'); 137 | check!('\u{1_F_6_0_2}'); 138 | check!('\u{1_F6_02_____}'); 139 | } 140 | 141 | #[test] 142 | fn suffixes() { 143 | check!('a', r##"'a'peter"##, "peter"); 144 | check!('#', r##"'#'peter"##, "peter"); 145 | check!('\n', r##"'\n'peter"##, "peter"); 146 | check!('\'', r##"'\''peter"##, "peter"); 147 | check!('\"', r##"'\"'peter"##, "peter"); 148 | } 149 | 150 | #[test] 151 | fn invalid_ascii_escapes() { 152 | assert_err!(CharLit, r"'\x80'", NonAsciiXEscape, 1..5); 153 | assert_err!(CharLit, r"'\x81'", NonAsciiXEscape, 1..5); 154 | assert_err!(CharLit, r"'\x8a'", NonAsciiXEscape, 1..5); 155 | assert_err!(CharLit, r"'\x8F'", NonAsciiXEscape, 1..5); 156 | assert_err!(CharLit, r"'\xa0'", NonAsciiXEscape, 1..5); 157 | assert_err!(CharLit, r"'\xB0'", NonAsciiXEscape, 1..5); 158 | assert_err!(CharLit, r"'\xc3'", NonAsciiXEscape, 1..5); 159 | assert_err!(CharLit, r"'\xDf'", NonAsciiXEscape, 1..5); 160 | assert_err!(CharLit, r"'\xff'", NonAsciiXEscape, 1..5); 161 | assert_err!(CharLit, r"'\xfF'", NonAsciiXEscape, 1..5); 162 | assert_err!(CharLit, r"'\xFf'", NonAsciiXEscape, 1..5); 163 | assert_err!(CharLit, r"'\xFF'", NonAsciiXEscape, 1..5); 164 | } 165 | 166 | #[test] 167 | fn invalid_escapes() { 
168 | assert_err!(CharLit, r"'\a'", UnknownEscape, 1..3); 169 | assert_err!(CharLit, r"'\y'", UnknownEscape, 1..3); 170 | assert_err!(CharLit, r"'\", UnterminatedEscape, 1); 171 | assert_err!(CharLit, r"'\x'", UnterminatedEscape, 1..4); 172 | assert_err!(CharLit, r"'\x1'", InvalidXEscape, 1..5); 173 | assert_err!(CharLit, r"'\xaj'", InvalidXEscape, 1..5); 174 | assert_err!(CharLit, r"'\xjb'", InvalidXEscape, 1..5); 175 | } 176 | 177 | #[test] 178 | fn invalid_unicode_escapes() { 179 | assert_err!(CharLit, r"'\u'", UnicodeEscapeWithoutBrace, 1..3); 180 | assert_err!(CharLit, r"'\u '", UnicodeEscapeWithoutBrace, 1..3); 181 | assert_err!(CharLit, r"'\u3'", UnicodeEscapeWithoutBrace, 1..3); 182 | 183 | assert_err!(CharLit, r"'\u{'", UnterminatedUnicodeEscape, 1..5); 184 | assert_err!(CharLit, r"'\u{12'", UnterminatedUnicodeEscape, 1..7); 185 | assert_err!(CharLit, r"'\u{a0b'", UnterminatedUnicodeEscape, 1..8); 186 | assert_err!(CharLit, r"'\u{a0_b '", UnterminatedUnicodeEscape, 1..11); 187 | 188 | assert_err!(CharLit, r"'\u{_}'", InvalidStartOfUnicodeEscape, 4); 189 | assert_err!(CharLit, r"'\u{_5f}'", InvalidStartOfUnicodeEscape, 4); 190 | 191 | assert_err!(CharLit, r"'\u{x}'", NonHexDigitInUnicodeEscape, 4); 192 | assert_err!(CharLit, r"'\u{0x}'", NonHexDigitInUnicodeEscape, 5); 193 | assert_err!(CharLit, r"'\u{3bx}'", NonHexDigitInUnicodeEscape, 6); 194 | assert_err!(CharLit, r"'\u{3b_x}'", NonHexDigitInUnicodeEscape, 7); 195 | assert_err!(CharLit, r"'\u{4x_}'", NonHexDigitInUnicodeEscape, 5); 196 | 197 | assert_err!(CharLit, r"'\u{1234567}'", TooManyDigitInUnicodeEscape, 10); 198 | assert_err!(CharLit, r"'\u{1234567}'", TooManyDigitInUnicodeEscape, 10); 199 | assert_err!(CharLit, r"'\u{1_23_4_56_7}'", TooManyDigitInUnicodeEscape, 14); 200 | assert_err!(CharLit, r"'\u{abcdef123}'", TooManyDigitInUnicodeEscape, 10); 201 | 202 | assert_err!(CharLit, r"'\u{110000}'", InvalidUnicodeEscapeChar, 1..10); 203 | } 204 | 205 | #[test] 206 | fn parse_err() { 207 | 
assert_err!(CharLit, r"''", EmptyCharLiteral, None); 208 | assert_err!(CharLit, r"' ''", UnexpectedChar, 3); 209 | 210 | assert_err!(CharLit, r"'", UnterminatedCharLiteral, None); 211 | assert_err!(CharLit, r"'a", UnterminatedCharLiteral, None); 212 | assert_err!(CharLit, r"'\n", UnterminatedCharLiteral, None); 213 | assert_err!(CharLit, r"'\x35", UnterminatedCharLiteral, None); 214 | 215 | assert_err!(CharLit, r"'ab'", OverlongCharLiteral, None); 216 | assert_err!(CharLit, r"'a _'", OverlongCharLiteral, None); 217 | assert_err!(CharLit, r"'\n3'", OverlongCharLiteral, None); 218 | 219 | assert_err!(CharLit, r"", Empty, None); 220 | 221 | assert_err!(CharLit, r"'''", UnescapedSingleQuote, 1); 222 | assert_err!(CharLit, r"''''", UnescapedSingleQuote, 1); 223 | 224 | assert_err!(CharLit, "'\n'", UnescapedSpecialWhitespace, 1); 225 | assert_err!(CharLit, "'\t'", UnescapedSpecialWhitespace, 1); 226 | assert_err!(CharLit, "'\r'", UnescapedSpecialWhitespace, 1); 227 | } 228 | -------------------------------------------------------------------------------- /src/err.rs: -------------------------------------------------------------------------------- 1 | use std::{fmt, ops::Range}; 2 | 3 | 4 | /// An error signaling that a different kind of token was expected. Returned by 5 | /// the various `TryFrom` impls. 6 | #[derive(Debug, Clone, Copy)] 7 | pub struct InvalidToken { 8 | pub(crate) expected: TokenKind, 9 | pub(crate) actual: TokenKind, 10 | pub(crate) span: Span, 11 | } 12 | 13 | impl InvalidToken { 14 | /// Returns a token stream representing `compile_error!("msg");` where 15 | /// `"msg"` is the output of `self.to_string()`. 
**Panics if called outside 16 | /// of a proc-macro context!** 17 | pub fn to_compile_error(&self) -> proc_macro::TokenStream { 18 | use proc_macro::{Delimiter, Ident, Group, Punct, Spacing, TokenTree}; 19 | 20 | let span = match self.span { 21 | Span::One(s) => s, 22 | #[cfg(feature = "proc-macro2")] 23 | Span::Two(s) => s.unwrap(), 24 | }; 25 | let msg = self.to_string(); 26 | let tokens = vec![ 27 | TokenTree::from(Ident::new("compile_error", span)), 28 | TokenTree::from(Punct::new('!', Spacing::Alone)), 29 | TokenTree::from(Group::new( 30 | Delimiter::Parenthesis, 31 | TokenTree::from(proc_macro::Literal::string(&msg)).into(), 32 | )), 33 | ]; 34 | 35 | 36 | tokens.into_iter().map(|mut t| { t.set_span(span); t }).collect() 37 | } 38 | 39 | /// Like [`to_compile_error`][Self::to_compile_error], but returns a token 40 | /// stream from `proc_macro2` and does not panic outside of a proc-macro 41 | /// context. 42 | #[cfg(feature = "proc-macro2")] 43 | pub fn to_compile_error2(&self) -> proc_macro2::TokenStream { 44 | use proc_macro2::{Delimiter, Ident, Group, Punct, Spacing, TokenTree}; 45 | 46 | let span = match self.span { 47 | Span::One(s) => proc_macro2::Span::from(s), 48 | Span::Two(s) => s, 49 | }; 50 | let msg = self.to_string(); 51 | let tokens = vec![ 52 | TokenTree::from(Ident::new("compile_error", span)), 53 | TokenTree::from(Punct::new('!', Spacing::Alone)), 54 | TokenTree::from(Group::new( 55 | Delimiter::Parenthesis, 56 | TokenTree::from(proc_macro2::Literal::string(&msg)).into(), 57 | )), 58 | ]; 59 | 60 | 61 | tokens.into_iter().map(|mut t| { t.set_span(span); t }).collect() 62 | } 63 | } 64 | 65 | impl std::error::Error for InvalidToken {} 66 | 67 | impl fmt::Display for InvalidToken { 68 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 69 | fn kind_desc(kind: TokenKind) -> &'static str { 70 | match kind { 71 | TokenKind::Punct => "a punctuation character", 72 | TokenKind::Ident => "an identifier", 73 | TokenKind::Group => "a group", 
74 | TokenKind::Literal => "a literal", 75 | TokenKind::BoolLit => "a bool literal (`true` or `false`)", 76 | TokenKind::ByteLit => "a byte literal (e.g. `b'r'`)", 77 | TokenKind::ByteStringLit => r#"a byte string literal (e.g. `b"fox"`)"#, 78 | TokenKind::CharLit => "a character literal (e.g. `'P'`)", 79 | TokenKind::FloatLit => "a float literal (e.g. `3.14`)", 80 | TokenKind::IntegerLit => "an integer literal (e.g. `27`)", 81 | TokenKind::StringLit => r#"a string literal (e.g. "Ferris")"#, 82 | } 83 | } 84 | 85 | write!(f, "expected {}, but found {}", kind_desc(self.expected), kind_desc(self.actual)) 86 | } 87 | } 88 | 89 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 90 | pub(crate) enum TokenKind { 91 | Punct, 92 | Ident, 93 | Group, 94 | Literal, 95 | BoolLit, 96 | ByteLit, 97 | ByteStringLit, 98 | CharLit, 99 | FloatLit, 100 | IntegerLit, 101 | StringLit, 102 | } 103 | 104 | /// Unfortunately, we have to deal with both cases. 105 | #[derive(Debug, Clone, Copy)] 106 | pub(crate) enum Span { 107 | One(proc_macro::Span), 108 | #[cfg(feature = "proc-macro2")] 109 | Two(proc_macro2::Span), 110 | } 111 | 112 | impl From<proc_macro::Span> for Span { 113 | fn from(src: proc_macro::Span) -> Self { 114 | Self::One(src) 115 | } 116 | } 117 | 118 | #[cfg(feature = "proc-macro2")] 119 | impl From<proc_macro2::Span> for Span { 120 | fn from(src: proc_macro2::Span) -> Self { 121 | Self::Two(src) 122 | } 123 | } 124 | 125 | /// Errors during parsing. 126 | /// 127 | /// This type should be seen primarily for error reporting and not for catching 128 | /// specific cases. The span and error kind are not guaranteed to be stable 129 | /// over different versions of this library, meaning that a returned error can 130 | /// change from one version to the next. There are simply too many fringe cases 131 | /// that are not easy to classify as a specific error kind. It depends entirely 132 | /// on the specific parser code how an invalid input is categorized. 
133 | /// 134 | /// Consider these examples: 135 | /// - `'\` can be seen as 136 | /// - invalid escape in character literal, or 137 | /// - unterminated character literal. 138 | /// - `'''` can be seen as 139 | /// - empty character literal, or 140 | /// - unescaped quote character in character literal. 141 | /// - `0b64` can be seen as 142 | /// - binary integer literal with invalid digit 6, or 143 | /// - binary integer literal with invalid digit 4, or 144 | /// - decimal integer literal with invalid digit b, or 145 | /// - decimal integer literal 0 with unknown type suffix `b64`. 146 | /// 147 | /// If you want to see more of these examples, feel free to check out the unit 148 | /// tests of this library. 149 | /// 150 | /// While this library does its best to emit sensible and precise errors, and to 151 | /// keep the returned errors as stable as possible, full stability cannot be 152 | /// guaranteed. 153 | #[derive(Debug, Clone)] 154 | pub struct ParseError { 155 | pub(crate) span: Option<Range<usize>>, 156 | pub(crate) kind: ParseErrorKind, 157 | } 158 | 159 | impl ParseError { 160 | /// Returns a span of this error, if available. **Note**: the returned span 161 | /// might change in future versions of this library. See [the documentation 162 | /// of this type][ParseError] for more information. 163 | pub fn span(&self) -> Option<Range<usize>> { 164 | self.span.clone() 165 | } 166 | } 167 | 168 | /// This is a free-standing function instead of an associated one to reduce 169 | /// noise around parsing code. There are lots of places that create errors, so 170 | /// we want to keep them as short as possible. 
171 | pub(crate) fn perr(span: impl SpanLike, kind: ParseErrorKind) -> ParseError { 172 | ParseError { 173 | span: span.into_span(), 174 | kind, 175 | } 176 | } 177 | 178 | pub(crate) trait SpanLike { 179 | fn into_span(self) -> Option<Range<usize>>; 180 | } 181 | 182 | impl SpanLike for Option<Range<usize>> { 183 | #[inline(always)] 184 | fn into_span(self) -> Option<Range<usize>> { 185 | self 186 | } 187 | } 188 | impl SpanLike for Range<usize> { 189 | #[inline(always)] 190 | fn into_span(self) -> Option<Range<usize>> { 191 | Some(self) 192 | } 193 | } 194 | impl SpanLike for usize { 195 | #[inline(always)] 196 | fn into_span(self) -> Option<Range<usize>> { 197 | Some(self..self + 1) 198 | } 199 | } 200 | 201 | 202 | /// Kinds of errors. 203 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 204 | #[non_exhaustive] 205 | pub(crate) enum ParseErrorKind { 206 | /// The input was an empty string. 207 | Empty, 208 | 209 | /// An unexpected char was encountered. 210 | UnexpectedChar, 211 | 212 | /// Literal was not recognized. 213 | InvalidLiteral, 214 | 215 | /// Input does not start with decimal digit when trying to parse an integer. 216 | DoesNotStartWithDigit, 217 | 218 | /// A digit invalid for the specified integer base was found. 219 | InvalidDigit, 220 | 221 | /// Integer literal does not contain any valid digits. 222 | NoDigits, 223 | 224 | /// Exponent of a float literal does not contain any digits. 225 | NoExponentDigits, 226 | 227 | /// An unknown escape code, e.g. `\b`. 228 | UnknownEscape, 229 | 230 | /// A started escape sequence where the input ended before the escape was 231 | /// finished. 232 | UnterminatedEscape, 233 | 234 | /// An `\x` escape where the two digits are not valid hex digits. 235 | InvalidXEscape, 236 | 237 | /// A string or character literal using the `\xNN` escape where `NN > 0x7F`. 238 | NonAsciiXEscape, 239 | 240 | /// A `\u{...}` escape in a byte or byte string literal. 241 | UnicodeEscapeInByteLiteral, 242 | 243 | /// A Unicode escape that does not start with a hex digit. 
244 | InvalidStartOfUnicodeEscape, 245 | 246 | /// A `\u{...}` escape that lacks the opening brace. 247 | UnicodeEscapeWithoutBrace, 248 | 249 | /// In a `\u{...}` escape, a non-hex digit and non-underscore character was 250 | /// found. 251 | NonHexDigitInUnicodeEscape, 252 | 253 | /// More than 6 digits found in unicode escape. 254 | TooManyDigitInUnicodeEscape, 255 | 256 | /// The value from a unicode escape does not represent a valid character. 257 | InvalidUnicodeEscapeChar, 258 | 259 | /// A `\u{..` escape that is not terminated (lacks the closing brace). 260 | UnterminatedUnicodeEscape, 261 | 262 | /// A character literal that's not terminated. 263 | UnterminatedCharLiteral, 264 | 265 | /// A character literal that contains more than one character. 266 | OverlongCharLiteral, 267 | 268 | /// An empty character literal, i.e. `''`. 269 | EmptyCharLiteral, 270 | 271 | UnterminatedByteLiteral, 272 | OverlongByteLiteral, 273 | EmptyByteLiteral, 274 | NonAsciiInByteLiteral, 275 | 276 | /// A `'` character was not escaped in a character or byte literal, or a `"` 277 | /// character was not escaped in a string or byte string literal. 278 | UnescapedSingleQuote, 279 | 280 | /// A \n, \t or \r raw character in a char or byte literal. 281 | UnescapedSpecialWhitespace, 282 | 283 | /// When parsing a character, byte, string or byte string literal directly 284 | /// and the input does not start with the corresponding quote character 285 | /// (plus optional raw string prefix). 286 | DoesNotStartWithQuote, 287 | 288 | /// Unterminated raw string literal. 289 | UnterminatedRawString, 290 | 291 | /// String literal without a `"` at the end. 292 | UnterminatedString, 293 | 294 | /// Invalid start for a string literal. 295 | InvalidStringLiteralStart, 296 | 297 | /// Invalid start for a byte literal. 
298 | InvalidByteLiteralStart, 299 | 300 | InvalidByteStringLiteralStart, 301 | 302 | /// A literal `\r` character not followed by a `\n` character in a 303 | /// (raw) string or byte string literal. 304 | IsolatedCr, 305 | 306 | /// Literal suffix is not a valid identifier. 307 | InvalidSuffix, 308 | 309 | /// Returned by `FloatLit::parse` if an integer literal (neither fractional nor 310 | /// exponent part) is passed. 311 | UnexpectedIntegerLit, 312 | 313 | /// Integer suffixes cannot start with `e` or `E` as this conflicts with the 314 | /// grammar for float literals. 315 | IntegerSuffixStartingWithE, 316 | } 317 | 318 | impl std::error::Error for ParseError {} 319 | 320 | impl fmt::Display for ParseError { 321 | fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { 322 | use ParseErrorKind::*; 323 | 324 | let description = match self.kind { 325 | Empty => "input is empty", 326 | UnexpectedChar => "unexpected character", 327 | InvalidLiteral => "invalid literal", 328 | DoesNotStartWithDigit => "number literal does not start with decimal digit", 329 | InvalidDigit => "integer literal contains a digit invalid for its base", 330 | NoDigits => "integer literal does not contain any digits", 331 | NoExponentDigits => "exponent of floating point literal does not contain any digits", 332 | UnknownEscape => "unknown escape", 333 | UnterminatedEscape => "unterminated escape: input ended too soon", 334 | InvalidXEscape => r"invalid `\x` escape: not followed by two hex digits", 335 | NonAsciiXEscape => r"`\x` escape in char/string literal exceed ASCII range", 336 | UnicodeEscapeInByteLiteral => r"`\u{...}` escape in byte (string) literal not allowed", 337 | InvalidStartOfUnicodeEscape => r"invalid start of `\u{...}` escape", 338 | UnicodeEscapeWithoutBrace => r"`Unicode \u{...}` escape without opening brace", 339 | NonHexDigitInUnicodeEscape => r"non-hex digit found in `\u{...}` escape", 340 | TooManyDigitInUnicodeEscape => r"more than six digits in `\u{...}` escape", 341 | 
InvalidUnicodeEscapeChar => r"value specified in `\u{...}` escape is not a valid char", 342 | UnterminatedUnicodeEscape => r"unterminated `\u{...}` escape", 343 | UnterminatedCharLiteral => "character literal is not terminated", 344 | OverlongCharLiteral => "character literal contains more than one character", 345 | EmptyCharLiteral => "empty character literal", 346 | UnterminatedByteLiteral => "byte literal is not terminated", 347 | OverlongByteLiteral => "byte literal contains more than one byte", 348 | EmptyByteLiteral => "empty byte literal", 349 | NonAsciiInByteLiteral => "non ASCII character in byte (string) literal", 350 | UnescapedSingleQuote => "character literal contains unescaped ' character", 351 | UnescapedSpecialWhitespace => r"unescaped newline (\n), tab (\t) or cr (\r) character", 352 | DoesNotStartWithQuote => "invalid start for char/byte/string literal", 353 | UnterminatedRawString => "unterminated raw (byte) string literal", 354 | UnterminatedString => "unterminated (byte) string literal", 355 | InvalidStringLiteralStart => "invalid start for string literal", 356 | InvalidByteLiteralStart => "invalid start for byte literal", 357 | InvalidByteStringLiteralStart => "invalid start for byte string literal", 358 | IsolatedCr => r"`\r` not immediately followed by `\n` in string", 359 | InvalidSuffix => "literal suffix is not a valid identifier", 360 | UnexpectedIntegerLit => "expected float literal, but found integer", 361 | IntegerSuffixStartingWithE => "integer literal suffix must not start with 'e' or 'E'", 362 | }; 363 | 364 | description.fmt(f)?; 365 | if let Some(span) = &self.span { 366 | write!(f, " (at {}..{})", span.start, span.end)?; 367 | } 368 | 369 | Ok(()) 370 | } 371 | } 372 | -------------------------------------------------------------------------------- /src/escape.rs: -------------------------------------------------------------------------------- 1 | use crate::{ParseError, err::{perr, ParseErrorKind::*}, parse::{hex_digit_value, 
check_suffix}}; 2 | 3 | 4 | /// Must start with `\` 5 | pub(crate) fn unescape<E: Escapee>(input: &str, offset: usize) -> Result<(E, usize), ParseError> { 6 | let first = input.as_bytes().get(1) 7 | .ok_or(perr(offset, UnterminatedEscape))?; 8 | let out = match first { 9 | // Quote escapes 10 | b'\'' => (E::from_byte(b'\''), 2), 11 | b'"' => (E::from_byte(b'"'), 2), 12 | 13 | // ASCII escapes 14 | b'n' => (E::from_byte(b'\n'), 2), 15 | b'r' => (E::from_byte(b'\r'), 2), 16 | b't' => (E::from_byte(b'\t'), 2), 17 | b'\\' => (E::from_byte(b'\\'), 2), 18 | b'0' => (E::from_byte(b'\0'), 2), 19 | b'x' => { 20 | let hex_string = input.get(2..4) 21 | .ok_or(perr(offset..offset + input.len(), UnterminatedEscape))? 22 | .as_bytes(); 23 | let first = hex_digit_value(hex_string[0]) 24 | .ok_or(perr(offset..offset + 4, InvalidXEscape))?; 25 | let second = hex_digit_value(hex_string[1]) 26 | .ok_or(perr(offset..offset + 4, InvalidXEscape))?; 27 | let value = second + 16 * first; 28 | 29 | if E::SUPPORTS_UNICODE && value > 0x7F { 30 | return Err(perr(offset..offset + 4, NonAsciiXEscape)); 31 | } 32 | 33 | (E::from_byte(value), 4) 34 | }, 35 | 36 | // Unicode escape 37 | b'u' => { 38 | if !E::SUPPORTS_UNICODE { 39 | return Err(perr(offset..offset + 2, UnicodeEscapeInByteLiteral)); 40 | } 41 | 42 | if input.as_bytes().get(2) != Some(&b'{') { 43 | return Err(perr(offset..offset + 2, UnicodeEscapeWithoutBrace)); 44 | } 45 | 46 | let closing_pos = input.bytes().position(|b| b == b'}') 47 | .ok_or(perr(offset..offset + input.len(), UnterminatedUnicodeEscape))?; 48 | 49 | let inner = &input[3..closing_pos]; 50 | if inner.as_bytes().first() == Some(&b'_') { 51 | return Err(perr(4, InvalidStartOfUnicodeEscape)); 52 | } 53 | 54 | let mut v: u32 = 0; 55 | let mut digit_count = 0; 56 | for (i, b) in inner.bytes().enumerate() { 57 | if b == b'_' { 58 | continue; 59 | } 60 | 61 | let digit = hex_digit_value(b) 62 | .ok_or(perr(offset + 3 + i, NonHexDigitInUnicodeEscape))?; 63 | 64 | if digit_count == 6 { 65 
| return Err(perr(offset + 3 + i, TooManyDigitInUnicodeEscape)); 66 | } 67 | digit_count += 1; 68 | v = 16 * v + digit as u32; 69 | } 70 | 71 | let c = std::char::from_u32(v) 72 | .ok_or(perr(offset..closing_pos + 1, InvalidUnicodeEscapeChar))?; 73 | 74 | (E::from_char(c), closing_pos + 1) 75 | } 76 | 77 | _ => return Err(perr(offset..offset + 2, UnknownEscape)), 78 | }; 79 | 80 | Ok(out) 81 | } 82 | 83 | pub(crate) trait Escapee: Into<char> { 84 | const SUPPORTS_UNICODE: bool; 85 | fn from_byte(b: u8) -> Self; 86 | fn from_char(c: char) -> Self; 87 | } 88 | 89 | impl Escapee for u8 { 90 | const SUPPORTS_UNICODE: bool = false; 91 | fn from_byte(b: u8) -> Self { 92 | b 93 | } 94 | fn from_char(_: char) -> Self { 95 | panic!("bug: `<u8 as Escapee>::from_char` was called"); 96 | } 97 | } 98 | 99 | impl Escapee for char { 100 | const SUPPORTS_UNICODE: bool = true; 101 | fn from_byte(b: u8) -> Self { 102 | b.into() 103 | } 104 | fn from_char(c: char) -> Self { 105 | c 106 | } 107 | } 108 | 109 | /// Checks whether the character is skipped after a string continue start 110 | /// (unescaped backslash followed by `\n`). 111 | fn is_string_continue_skipable_whitespace(b: u8) -> bool { 112 | b == b' ' || b == b'\t' || b == b'\n' || b == b'\r' 113 | } 114 | 115 | /// Unescapes a whole string or byte string. 116 | #[inline(never)] 117 | pub(crate) fn unescape_string<E: Escapee>( 118 | input: &str, 119 | offset: usize, 120 | ) -> Result<(Option<String>, usize), ParseError> { 121 | let mut closing_quote_pos = None; 122 | let mut i = offset; 123 | let mut end_last_escape = offset; 124 | let mut value = String::new(); 125 | while i < input.len() { 126 | match input.as_bytes()[i] { 127 | // Handle "string continue". 128 | b'\\' if input.as_bytes().get(i + 1) == Some(&b'\n') => { 129 | value.push_str(&input[end_last_escape..i]); 130 | 131 | // Find the first non-whitespace character. 
132 | let end_escape = input[i + 2..].bytes() 133 | .position(|b| !is_string_continue_skipable_whitespace(b)) 134 | .ok_or(perr(None, UnterminatedString))?; 135 | 136 | i += 2 + end_escape; 137 | end_last_escape = i; 138 | } 139 | b'\\' => { 140 | let (c, len) = unescape::<E>(&input[i..input.len() - 1], i)?; 141 | value.push_str(&input[end_last_escape..i]); 142 | value.push(c.into()); 143 | i += len; 144 | end_last_escape = i; 145 | } 146 | b'\r' => { 147 | if input.as_bytes().get(i + 1) == Some(&b'\n') { 148 | value.push_str(&input[end_last_escape..i]); 149 | value.push('\n'); 150 | i += 2; 151 | end_last_escape = i; 152 | } else { 153 | return Err(perr(i, IsolatedCr)) 154 | } 155 | } 156 | b'"' => { 157 | closing_quote_pos = Some(i); 158 | break; 159 | }, 160 | b if !E::SUPPORTS_UNICODE && !b.is_ascii() 161 | => return Err(perr(i, NonAsciiInByteLiteral)), 162 | _ => i += 1, 163 | } 164 | } 165 | 166 | let closing_quote_pos = closing_quote_pos.ok_or(perr(None, UnterminatedString))?; 167 | 168 | let start_suffix = closing_quote_pos + 1; 169 | let suffix = &input[start_suffix..]; 170 | check_suffix(suffix).map_err(|kind| perr(start_suffix, kind))?; 171 | 172 | // `value` is only empty if there was no escape in the input string 173 | // (with the special case of the input being empty). This means the 174 | // string value basically equals the input, so we store `None`. 175 | let value = if value.is_empty() { 176 | None 177 | } else { 178 | // There was an escape in the string, so we need to push the 179 | // remaining unescaped part of the string still. 180 | value.push_str(&input[end_last_escape..closing_quote_pos]); 181 | Some(value) 182 | }; 183 | 184 | Ok((value, start_suffix)) 185 | } 186 | 187 | /// Reads and checks a raw (byte) string literal, converting `\r\n` sequences to 188 | /// just `\n`. Returns an optional new string (if the input contained any 189 | /// `\r\n`), the number of hashes used by the literal, and the suffix start index. 
190 | #[inline(never)] 191 | pub(crate) fn scan_raw_string<E: Escapee>( 192 | input: &str, 193 | offset: usize, 194 | ) -> Result<(Option<String>, u32, usize), ParseError> { 195 | // Raw string literal 196 | let num_hashes = input[offset..].bytes().position(|b| b != b'#') 197 | .ok_or(perr(None, InvalidLiteral))?; 198 | 199 | if input.as_bytes().get(offset + num_hashes) != Some(&b'"') { 200 | return Err(perr(None, InvalidLiteral)); 201 | } 202 | let start_inner = offset + num_hashes + 1; 203 | let hashes = &input[offset..num_hashes + offset]; 204 | 205 | let mut closing_quote_pos = None; 206 | let mut i = start_inner; 207 | let mut end_last_escape = start_inner; 208 | let mut value = String::new(); 209 | while i < input.len() { 210 | let b = input.as_bytes()[i]; 211 | if b == b'"' && input[i + 1..].starts_with(hashes) { 212 | closing_quote_pos = Some(i); 213 | break; 214 | } 215 | 216 | if b == b'\r' { 217 | // Convert `\r\n` into `\n`. This is currently not well documented 218 | // in the Rust reference, but is done even for raw strings. That's 219 | // because rustc simply converts all line endings when reading 220 | // source files. 221 | if input.as_bytes().get(i + 1) == Some(&b'\n') { 222 | value.push_str(&input[end_last_escape..i]); 223 | value.push('\n'); 224 | i += 2; 225 | end_last_escape = i; 226 | continue; 227 | } else if E::SUPPORTS_UNICODE { 228 | // If no \n follows the \r and we are scanning a raw string 229 | // (not raw byte string), we error. 
230 | return Err(perr(i, IsolatedCr)) 231 | } 232 | } 233 | 234 | if !E::SUPPORTS_UNICODE { 235 | if !b.is_ascii() { 236 | return Err(perr(i, NonAsciiInByteLiteral)); 237 | } 238 | } 239 | 240 | i += 1; 241 | } 242 | 243 | let closing_quote_pos = closing_quote_pos.ok_or(perr(None, UnterminatedRawString))?; 244 | 245 | let start_suffix = closing_quote_pos + num_hashes + 1; 246 | let suffix = &input[start_suffix..]; 247 | check_suffix(suffix).map_err(|kind| perr(start_suffix, kind))?; 248 | 249 | // `value` is only empty if there was no \r\n in the input string (with the 250 | // special case of the input being empty). This means the string value 251 | // equals the input, so we store `None`. 252 | let value = if value.is_empty() { 253 | None 254 | } else { 255 | // There was an \r\n in the string, so we need to push the remaining 256 | // unescaped part of the string still. 257 | value.push_str(&input[end_last_escape..closing_quote_pos]); 258 | Some(value) 259 | }; 260 | 261 | Ok((value, num_hashes as u32, start_suffix)) 262 | } 263 | -------------------------------------------------------------------------------- /src/float/mod.rs: -------------------------------------------------------------------------------- 1 | use std::{fmt, str::FromStr}; 2 | 3 | use crate::{ 4 | Buffer, ParseError, 5 | err::{perr, ParseErrorKind::*}, 6 | parse::{end_dec_digits, first_byte_or_empty, check_suffix}, 7 | }; 8 | 9 | 10 | 11 | /// A floating point literal, e.g. `3.14`, `8.`, `135e12`, or `1.956e2f64`. 12 | /// 13 | /// This kind of literal has several forms, but generally consists of a main 14 | /// number part, an optional exponent and an optional type suffix. See 15 | /// [the reference][ref] for more information. 16 | /// 17 | /// A leading minus sign `-` is not part of the literal grammar! `-3.14` are two 18 | /// tokens in the Rust grammar. Further, `27` and `27f32` are both not float, 19 | /// but integer literals! Consequently `FloatLit::parse` will reject them. 
20 | /// 21 | /// 22 | /// [ref]: https://doc.rust-lang.org/reference/tokens.html#floating-point-literals 23 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 24 | pub struct FloatLit<B: Buffer> { 25 | /// The whole raw input. The `usize` fields in this struct partition this 26 | /// string. Always true: `end_integer_part <= end_fractional_part`. 27 | /// 28 | /// ```text 29 | /// 12_3.4_56e789f32 30 | /// ╷ ╷ ╷ 31 | /// | | └ end_number_part = 13 32 | /// | └ end_fractional_part = 9 33 | /// └ end_integer_part = 4 34 | /// 35 | /// 246. 36 | /// ╷╷ 37 | /// |└ end_fractional_part = end_number_part = 4 38 | /// └ end_integer_part = 3 39 | /// 40 | /// 1234e89 41 | /// ╷ ╷ 42 | /// | └ end_number_part = 7 43 | /// └ end_integer_part = end_fractional_part = 4 44 | /// ``` 45 | raw: B, 46 | 47 | /// The first index not part of the integer part anymore. Since the integer 48 | /// part is at the start, this is also the length of that part. 49 | end_integer_part: usize, 50 | 51 | /// The first index after the fractional part. 52 | end_fractional_part: usize, 53 | 54 | /// The first index after the whole number part (everything except type suffix). 55 | end_number_part: usize, 56 | } 57 | 58 | impl<B: Buffer> FloatLit<B> { 59 | /// Parses the input as a floating point literal. Returns an error if the 60 | /// input is invalid or represents a different kind of literal. Will also 61 | /// reject decimal integer literals like `23` or `17f32`, in accordance 62 | /// with the spec. 63 | pub fn parse(s: B) -> Result<Self, ParseError> { 64 | match first_byte_or_empty(&s)? { 65 | b'0'..=b'9' => { 66 | // TODO: simplify once RFC 2528 is stabilized 67 | let FloatLit { 68 | end_integer_part, 69 | end_fractional_part, 70 | end_number_part, 71 | .. 
72 | } = parse_impl(&s)?; 73 | 74 | Ok(Self { raw: s, end_integer_part, end_fractional_part, end_number_part }) 75 | }, 76 | _ => Err(perr(0, DoesNotStartWithDigit)), 77 | } 78 | } 79 | 80 | /// Returns the number part (including integer part, fractional part and 81 | /// exponent), but without the suffix. If you want an actual floating 82 | /// point value, you need to parse this string, e.g. with `f32::from_str` 83 | /// or an external crate. 84 | pub fn number_part(&self) -> &str { 85 | &(*self.raw)[..self.end_number_part] 86 | } 87 | 88 | /// Returns the non-empty integer part of this literal. 89 | pub fn integer_part(&self) -> &str { 90 | &(*self.raw)[..self.end_integer_part] 91 | } 92 | 93 | /// Returns the optional fractional part of this literal. Does not include 94 | /// the period. If a period exists in the input, `Some` is returned, `None` 95 | /// otherwise. Note that `Some("")` might be returned, e.g. for `3.`. 96 | pub fn fractional_part(&self) -> Option<&str> { 97 | if self.end_integer_part == self.end_fractional_part { 98 | None 99 | } else { 100 | Some(&(*self.raw)[self.end_integer_part + 1..self.end_fractional_part]) 101 | } 102 | } 103 | 104 | /// Optional exponent part. Might be empty if there was no exponent part in 105 | /// the input. Includes the `e` or `E` at the beginning. 106 | pub fn exponent_part(&self) -> &str { 107 | &(*self.raw)[self.end_fractional_part..self.end_number_part] 108 | } 109 | 110 | /// The optional suffix. Returns `""` if the suffix is empty/does not exist. 111 | pub fn suffix(&self) -> &str { 112 | &(*self.raw)[self.end_number_part..] 113 | } 114 | 115 | /// Returns the raw input that was passed to `parse`. 116 | pub fn raw_input(&self) -> &str { 117 | &self.raw 118 | } 119 | 120 | /// Returns the raw input that was passed to `parse`, potentially owned. 
121 | pub fn into_raw_input(self) -> B { 122 | self.raw 123 | } 124 | } 125 | 126 | impl FloatLit<&str> { 127 | /// Makes a copy of the underlying buffer and returns the owned version of 128 | /// `Self`. 129 | pub fn to_owned(&self) -> FloatLit<String> { 130 | FloatLit { 131 | raw: self.raw.to_owned(), 132 | end_integer_part: self.end_integer_part, 133 | end_fractional_part: self.end_fractional_part, 134 | end_number_part: self.end_number_part, 135 | } 136 | } 137 | } 138 | 139 | impl<B: Buffer> fmt::Display for FloatLit<B> { 140 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 141 | write!(f, "{}", &*self.raw) 142 | } 143 | } 144 | 145 | /// Precondition: first byte of string has to be in `b'0'..=b'9'`. 146 | #[inline(never)] 147 | pub(crate) fn parse_impl(input: &str) -> Result<FloatLit<&str>, ParseError> { 148 | // Integer part. 149 | let end_integer_part = end_dec_digits(input.as_bytes()); 150 | let rest = &input[end_integer_part..]; 151 | 152 | 153 | // Fractional part. 154 | let end_fractional_part = if rest.as_bytes().get(0) == Some(&b'.') { 155 | // The fractional part must not start with `_`. 156 | if rest.as_bytes().get(1) == Some(&b'_') { 157 | return Err(perr(end_integer_part + 1, UnexpectedChar)); 158 | } 159 | 160 | end_dec_digits(rest[1..].as_bytes()) + 1 + end_integer_part 161 | } else { 162 | end_integer_part 163 | }; 164 | let rest = &input[end_fractional_part..]; 165 | 166 | // If we have a period that is not followed by decimal digits, the 167 | // literal must end now. 168 | if end_integer_part + 1 == end_fractional_part && !rest.is_empty() { 169 | return Err(perr(end_integer_part + 1, UnexpectedChar)); 170 | } 171 | 172 | // Optional exponent. 173 | let end_number_part = if rest.starts_with('e') || rest.starts_with('E') { 174 | // Strip single - or + sign at the beginning. 175 | let exp_number_start = match rest.as_bytes().get(1) { 176 | Some(b'-') | Some(b'+') => 2, 177 | _ => 1, 178 | }; 179 | 180 | // Find end of exponent and make sure there is at least one digit. 
181 | let end_exponent = end_dec_digits(rest[exp_number_start..].as_bytes()) + exp_number_start; 182 | if !rest[exp_number_start..end_exponent].bytes().any(|b| matches!(b, b'0'..=b'9')) { 183 | return Err(perr( 184 | end_fractional_part..end_fractional_part + end_exponent, 185 | NoExponentDigits, 186 | )); 187 | } 188 | 189 | end_exponent + end_fractional_part 190 | } else { 191 | end_fractional_part 192 | }; 193 | 194 | // Make sure the suffix is valid. 195 | let suffix = &input[end_number_part..]; 196 | check_suffix(suffix).map_err(|kind| perr(end_number_part..input.len(), kind))?; 197 | 198 | // A float literal needs either a fractional or exponent part, otherwise it's 199 | // an integer literal. 200 | if end_integer_part == end_number_part { 201 | return Err(perr(None, UnexpectedIntegerLit)); 202 | } 203 | 204 | Ok(FloatLit { 205 | raw: input, 206 | end_integer_part, 207 | end_fractional_part, 208 | end_number_part, 209 | }) 210 | } 211 | 212 | 213 | /// All possible float type suffixes. 214 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 215 | #[non_exhaustive] 216 | pub enum FloatType { 217 | F32, 218 | F64, 219 | } 220 | 221 | impl FloatType { 222 | /// Returns the type corresponding to the given suffix (e.g. `"f32"` is 223 | /// mapped to `Self::F32`). If the suffix is not a valid float type, `None` 224 | /// is returned. 225 | pub fn from_suffix(suffix: &str) -> Option<Self> { 226 | match suffix { 227 | "f32" => Some(FloatType::F32), 228 | "f64" => Some(FloatType::F64), 229 | _ => None, 230 | } 231 | } 232 | 233 | /// Returns the suffix for this type, e.g. `"f32"` for `Self::F32`. 
234 | pub fn suffix(self) -> &'static str { 235 | match self { 236 | Self::F32 => "f32", 237 | Self::F64 => "f64", 238 | } 239 | } 240 | } 241 | 242 | impl FromStr for FloatType { 243 | type Err = (); 244 | fn from_str(s: &str) -> Result<Self, Self::Err> { 245 | Self::from_suffix(s).ok_or(()) 246 | } 247 | } 248 | 249 | impl fmt::Display for FloatType { 250 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 251 | self.suffix().fmt(f) 252 | } 253 | } 254 | 255 | 256 | #[cfg(test)] 257 | mod tests; 258 | -------------------------------------------------------------------------------- /src/float/tests.rs: -------------------------------------------------------------------------------- 1 | use crate::{ 2 | Literal, ParseError, 3 | test_util::{assert_parse_ok_eq, assert_roundtrip}, 4 | }; 5 | use super::{FloatLit, FloatType}; 6 | 7 | 8 | // ===== Utility functions ======================================================================= 9 | 10 | /// Helper macro to check parsing a float. 11 | /// 12 | /// This macro contains quite a bit of logic itself (which can be buggy of 13 | /// course), so we have a few test functions below to test a bunch of cases 14 | /// manually. 15 | macro_rules! 
check { 16 | ($intpart:literal $fracpart:literal $exppart:literal $suffix:tt) => { 17 | let input = concat!($intpart, $fracpart, $exppart, check!(@stringify_suffix $suffix)); 18 | let expected_float = FloatLit { 19 | raw: input, 20 | end_integer_part: $intpart.len(), 21 | end_fractional_part: $intpart.len() + $fracpart.len(), 22 | end_number_part: $intpart.len() + $fracpart.len() + $exppart.len(), 23 | }; 24 | 25 | assert_parse_ok_eq( 26 | input, FloatLit::parse(input), expected_float.clone(), "FloatLit::parse"); 27 | assert_parse_ok_eq( 28 | input, Literal::parse(input), Literal::Float(expected_float), "Literal::parse"); 29 | assert_eq!(FloatLit::parse(input).unwrap().suffix(), check!(@ty $suffix)); 30 | assert_roundtrip(expected_float.to_owned(), input); 31 | }; 32 | (@ty f32) => { "f32" }; 33 | (@ty f64) => { "f64" }; 34 | (@ty -) => { "" }; 35 | (@stringify_suffix -) => { "" }; 36 | (@stringify_suffix $suffix:ident) => { stringify!($suffix) }; 37 | } 38 | 39 | 40 | // ===== Actual tests =========================================================================== 41 | 42 | #[test] 43 | fn manual_without_suffix() -> Result<(), ParseError> { 44 | let f = FloatLit::parse("3.14")?; 45 | assert_eq!(f.number_part(), "3.14"); 46 | assert_eq!(f.integer_part(), "3"); 47 | assert_eq!(f.fractional_part(), Some("14")); 48 | assert_eq!(f.exponent_part(), ""); 49 | assert_eq!(f.suffix(), ""); 50 | 51 | let f = FloatLit::parse("9.")?; 52 | assert_eq!(f.number_part(), "9."); 53 | assert_eq!(f.integer_part(), "9"); 54 | assert_eq!(f.fractional_part(), Some("")); 55 | assert_eq!(f.exponent_part(), ""); 56 | assert_eq!(f.suffix(), ""); 57 | 58 | let f = FloatLit::parse("8e1")?; 59 | assert_eq!(f.number_part(), "8e1"); 60 | assert_eq!(f.integer_part(), "8"); 61 | assert_eq!(f.fractional_part(), None); 62 | assert_eq!(f.exponent_part(), "e1"); 63 | assert_eq!(f.suffix(), ""); 64 | 65 | let f = FloatLit::parse("8E3")?; 66 | assert_eq!(f.number_part(), "8E3"); 67 | 
assert_eq!(f.integer_part(), "8"); 68 | assert_eq!(f.fractional_part(), None); 69 | assert_eq!(f.exponent_part(), "E3"); 70 | assert_eq!(f.suffix(), ""); 71 | 72 | let f = FloatLit::parse("8_7_6.1_23e15")?; 73 | assert_eq!(f.number_part(), "8_7_6.1_23e15"); 74 | assert_eq!(f.integer_part(), "8_7_6"); 75 | assert_eq!(f.fractional_part(), Some("1_23")); 76 | assert_eq!(f.exponent_part(), "e15"); 77 | assert_eq!(f.suffix(), ""); 78 | 79 | let f = FloatLit::parse("8.2e-_04_9")?; 80 | assert_eq!(f.number_part(), "8.2e-_04_9"); 81 | assert_eq!(f.integer_part(), "8"); 82 | assert_eq!(f.fractional_part(), Some("2")); 83 | assert_eq!(f.exponent_part(), "e-_04_9"); 84 | assert_eq!(f.suffix(), ""); 85 | 86 | Ok(()) 87 | } 88 | 89 | #[test] 90 | fn manual_with_suffix() -> Result<(), ParseError> { 91 | let f = FloatLit::parse("3.14f32")?; 92 | assert_eq!(f.number_part(), "3.14"); 93 | assert_eq!(f.integer_part(), "3"); 94 | assert_eq!(f.fractional_part(), Some("14")); 95 | assert_eq!(f.exponent_part(), ""); 96 | assert_eq!(FloatType::from_suffix(f.suffix()), Some(FloatType::F32)); 97 | 98 | let f = FloatLit::parse("8e1f64")?; 99 | assert_eq!(f.number_part(), "8e1"); 100 | assert_eq!(f.integer_part(), "8"); 101 | assert_eq!(f.fractional_part(), None); 102 | assert_eq!(f.exponent_part(), "e1"); 103 | assert_eq!(FloatType::from_suffix(f.suffix()), Some(FloatType::F64)); 104 | 105 | let f = FloatLit::parse("8_7_6.1_23e15f32")?; 106 | assert_eq!(f.number_part(), "8_7_6.1_23e15"); 107 | assert_eq!(f.integer_part(), "8_7_6"); 108 | assert_eq!(f.fractional_part(), Some("1_23")); 109 | assert_eq!(f.exponent_part(), "e15"); 110 | assert_eq!(FloatType::from_suffix(f.suffix()), Some(FloatType::F32)); 111 | 112 | let f = FloatLit::parse("8.2e-_04_9f64")?; 113 | assert_eq!(f.number_part(), "8.2e-_04_9"); 114 | assert_eq!(f.integer_part(), "8"); 115 | assert_eq!(f.fractional_part(), Some("2")); 116 | assert_eq!(f.exponent_part(), "e-_04_9"); 117 | 
assert_eq!(FloatType::from_suffix(f.suffix()), Some(FloatType::F64)); 118 | 119 | Ok(()) 120 | } 121 | 122 | #[test] 123 | fn simple() { 124 | check!("3" ".14" "" -); 125 | check!("3" ".14" "" f32); 126 | check!("3" ".14" "" f64); 127 | 128 | check!("3" "" "e987654321" -); 129 | check!("3" "" "e987654321" f64); 130 | 131 | check!("42_888" ".05" "" -); 132 | check!("42_888" ".05" "E5___" f32); 133 | check!("123456789" "" "e_1" f64); 134 | check!("123456789" ".99" "e_1" f64); 135 | check!("123456789" ".99" "" f64); 136 | check!("123456789" ".99" "" -); 137 | 138 | check!("147" ".3_33" "" -); 139 | check!("147" ".3_33__" "E3" f64); 140 | check!("147" ".3_33__" "" f32); 141 | 142 | check!("147" ".333" "e-10" -); 143 | check!("147" ".333" "e-_7" f32); 144 | check!("147" ".333" "e+10" -); 145 | check!("147" ".333" "e+_7" f32); 146 | 147 | check!("86" "." "" -); 148 | check!("0" "." "" -); 149 | check!("0_" "." "" -); 150 | check!("0" ".0000001" "" -); 151 | check!("0" ".000_0001" "" -); 152 | 153 | check!("0" ".0" "e+0" -); 154 | check!("0" "" "E+0" -); 155 | check!("34" "" "e+0" -); 156 | check!("0" ".9182" "E+0" f32); 157 | } 158 | 159 | #[test] 160 | fn non_standard_suffixes() { 161 | #[track_caller] 162 | fn check_suffix( 163 | input: &str, 164 | integer_part: &str, 165 | fractional_part: Option<&str>, 166 | exponent_part: &str, 167 | suffix: &str, 168 | ) { 169 | let lit = FloatLit::parse(input) 170 | .unwrap_or_else(|e| panic!("expected to parse '{}' but got {}", input, e)); 171 | assert_eq!(lit.integer_part(), integer_part); 172 | assert_eq!(lit.fractional_part(), fractional_part); 173 | assert_eq!(lit.exponent_part(), exponent_part); 174 | assert_eq!(lit.suffix(), suffix); 175 | 176 | let lit = match Literal::parse(input) { 177 | Ok(Literal::Float(f)) => f, 178 | other => panic!("Expected float literal, but got {:?} for '{}'", other, input), 179 | }; 180 | assert_eq!(lit.integer_part(), integer_part); 181 | assert_eq!(lit.fractional_part(), fractional_part); 182 
| assert_eq!(lit.exponent_part(), exponent_part); 183 | assert_eq!(lit.suffix(), suffix); 184 | } 185 | 186 | check_suffix("7.1f23", "7", Some("1"), "", "f23"); 187 | check_suffix("7.1f320", "7", Some("1"), "", "f320"); 188 | check_suffix("7.1f64_", "7", Some("1"), "", "f64_"); 189 | check_suffix("8.1f649", "8", Some("1"), "", "f649"); 190 | check_suffix("8.1f64f32", "8", Some("1"), "", "f64f32"); 191 | check_suffix("23e2_banana", "23", None, "e2_", "banana"); 192 | check_suffix("23.2_banana", "23", Some("2_"), "", "banana"); 193 | check_suffix("23e2pe55ter", "23", None, "e2", "pe55ter"); 194 | check_suffix("23e2p_e55ter", "23", None, "e2", "p_e55ter"); 195 | check_suffix("3.15Jürgen", "3", Some("15"), "", "Jürgen"); 196 | check_suffix("3e2e5", "3", None, "e2", "e5"); 197 | check_suffix("3e2e5f", "3", None, "e2", "e5f"); 198 | } 199 | 200 | #[test] 201 | fn parse_err() { 202 | assert_err!(FloatLit, "", Empty, None); 203 | assert_err_single!(FloatLit::parse("."), DoesNotStartWithDigit, 0); 204 | assert_err_single!(FloatLit::parse("+"), DoesNotStartWithDigit, 0); 205 | assert_err_single!(FloatLit::parse("-"), DoesNotStartWithDigit, 0); 206 | assert_err_single!(FloatLit::parse("e"), DoesNotStartWithDigit, 0); 207 | assert_err_single!(FloatLit::parse("e8"), DoesNotStartWithDigit, 0); 208 | assert_err!(FloatLit, "0e", NoExponentDigits, 1..2); 209 | assert_err_single!(FloatLit::parse("f32"), DoesNotStartWithDigit, 0); 210 | assert_err_single!(FloatLit::parse("foo"), DoesNotStartWithDigit, 0); 211 | 212 | assert_err_single!(FloatLit::parse("inf"), DoesNotStartWithDigit, 0); 213 | assert_err_single!(FloatLit::parse("nan"), DoesNotStartWithDigit, 0); 214 | assert_err_single!(FloatLit::parse("NaN"), DoesNotStartWithDigit, 0); 215 | assert_err_single!(FloatLit::parse("NAN"), DoesNotStartWithDigit, 0); 216 | 217 | assert_err_single!(FloatLit::parse("_2.7"), DoesNotStartWithDigit, 0); 218 | assert_err_single!(FloatLit::parse(".5"), DoesNotStartWithDigit, 0); 219 | 
assert_err!(FloatLit, "1e", NoExponentDigits, 1..2); 220 | assert_err!(FloatLit, "1.e4", UnexpectedChar, 2); 221 | assert_err!(FloatLit, "3._4", UnexpectedChar, 2); 222 | assert_err!(FloatLit, "3.f32", UnexpectedChar, 2); 223 | assert_err!(FloatLit, "3.e5", UnexpectedChar, 2); 224 | assert_err!(FloatLit, "12345._987", UnexpectedChar, 6); 225 | assert_err!(FloatLit, "46._", UnexpectedChar, 3); 226 | assert_err!(FloatLit, "46.f32", UnexpectedChar, 3); 227 | assert_err!(FloatLit, "46.e3", UnexpectedChar, 3); 228 | assert_err!(FloatLit, "46._e3", UnexpectedChar, 3); 229 | assert_err!(FloatLit, "46.e3f64", UnexpectedChar, 3); 230 | assert_err!(FloatLit, "23.4e_", NoExponentDigits, 4..6); 231 | assert_err!(FloatLit, "23E___f32", NoExponentDigits, 2..6); 232 | assert_err!(FloatLit, "55e3.1", UnexpectedChar, 4..6); 233 | 234 | assert_err!(FloatLit, "3.7+", UnexpectedChar, 3..4); 235 | assert_err!(FloatLit, "3.7+2", UnexpectedChar, 3..5); 236 | assert_err!(FloatLit, "3.7-", UnexpectedChar, 3..4); 237 | assert_err!(FloatLit, "3.7-2", UnexpectedChar, 3..5); 238 | assert_err!(FloatLit, "3.7e+", NoExponentDigits, 3..5); 239 | assert_err!(FloatLit, "3.7e-", NoExponentDigits, 3..5); 240 | assert_err!(FloatLit, "3.7e-+3", NoExponentDigits, 3..5); // suboptimal error 241 | assert_err!(FloatLit, "3.7e+-3", NoExponentDigits, 3..5); // suboptimal error 242 | assert_err_single!(FloatLit::parse("0x44.5"), InvalidSuffix, 1..6); 243 | 244 | assert_err_single!(FloatLit::parse("3"), UnexpectedIntegerLit, None); 245 | assert_err_single!(FloatLit::parse("35_389"), UnexpectedIntegerLit, None); 246 | assert_err_single!(FloatLit::parse("9_8_7f32"), UnexpectedIntegerLit, None); 247 | assert_err_single!(FloatLit::parse("9_8_7banana"), UnexpectedIntegerLit, None); 248 | assert_err_single!(FloatLit::parse("7f23"), UnexpectedIntegerLit, None); 249 | assert_err_single!(FloatLit::parse("7f320"), UnexpectedIntegerLit, None); 250 | assert_err_single!(FloatLit::parse("7f64_"), UnexpectedIntegerLit, None); 
251 | assert_err_single!(FloatLit::parse("8f649"), UnexpectedIntegerLit, None); 252 | assert_err_single!(FloatLit::parse("8f64f32"), UnexpectedIntegerLit, None); 253 | } 254 | -------------------------------------------------------------------------------- /src/impls.rs: -------------------------------------------------------------------------------- 1 | use std::convert::TryFrom; 2 | 3 | use crate::{Literal, err::{InvalidToken, TokenKind}}; 4 | 5 | 6 | /// Helper macro to call a `callback` macro four times for all combinations of 7 | /// `proc_macro`/`proc_macro2` and `&`/owned. 8 | macro_rules! helper { 9 | ($callback:ident, $($input:tt)*) => { 10 | $callback!([proc_macro::] => $($input)*); 11 | $callback!([&proc_macro::] => $($input)*); 12 | #[cfg(feature = "proc-macro2")] 13 | $callback!([proc_macro2::] => $($input)*); 14 | #[cfg(feature = "proc-macro2")] 15 | $callback!([&proc_macro2::] => $($input)*); 16 | }; 17 | } 18 | 19 | /// Like `helper!` but without reference types. 20 | macro_rules! helper_no_refs { 21 | ($callback:ident, $($input:tt)*) => { 22 | $callback!([proc_macro::] => $($input)*); 23 | #[cfg(feature = "proc-macro2")] 24 | $callback!([proc_macro2::] => $($input)*); 25 | }; 26 | } 27 | 28 | 29 | // ============================================================================================== 30 | // ===== `From<*Lit> for Literal` 31 | // ============================================================================================== 32 | 33 | macro_rules! 
impl_specific_lit_to_lit { 34 | ($ty:ty, $variant:ident) => { 35 | impl From<$ty> for Literal { 36 | fn from(src: $ty) -> Self { 37 | Literal::$variant(src) 38 | } 39 | } 40 | }; 41 | } 42 | 43 | impl_specific_lit_to_lit!(crate::BoolLit, Bool); 44 | impl_specific_lit_to_lit!(crate::IntegerLit, Integer); 45 | impl_specific_lit_to_lit!(crate::FloatLit, Float); 46 | impl_specific_lit_to_lit!(crate::CharLit, Char); 47 | impl_specific_lit_to_lit!(crate::StringLit, String); 48 | impl_specific_lit_to_lit!(crate::ByteLit, Byte); 49 | impl_specific_lit_to_lit!(crate::ByteStringLit, ByteString); 50 | 51 | 52 | 53 | // ============================================================================================== 54 | // ===== `From for Literal` 55 | // ============================================================================================== 56 | 57 | 58 | macro_rules! impl_tt_to_lit { 59 | ([$($prefix:tt)*] => ) => { 60 | impl From<$($prefix)* Literal> for Literal { 61 | fn from(src: $($prefix)* Literal) -> Self { 62 | // We call `expect` in all these impls: this library aims to implement exactly 63 | // the Rust grammar, so if we have a valid Rust literal, we should always be 64 | // able to parse it. 65 | Self::parse(src.to_string()) 66 | .expect("bug: failed to parse output of `Literal::to_string`") 67 | } 68 | } 69 | } 70 | } 71 | 72 | helper!(impl_tt_to_lit, ); 73 | 74 | 75 | // ============================================================================================== 76 | // ===== `TryFrom for Literal` 77 | // ============================================================================================== 78 | 79 | macro_rules! 
impl_tt_to_lit { 80 | ([$($prefix:tt)*] => ) => { 81 | impl TryFrom<$($prefix)* TokenTree> for Literal { 82 | type Error = InvalidToken; 83 | fn try_from(tt: $($prefix)* TokenTree) -> Result { 84 | let span = tt.span(); 85 | let res = match tt { 86 | $($prefix)* TokenTree::Group(_) => Err(TokenKind::Group), 87 | $($prefix)* TokenTree::Punct(_) => Err(TokenKind::Punct), 88 | $($prefix)* TokenTree::Ident(ref ident) if ident.to_string() == "true" 89 | => return Ok(Literal::Bool(crate::BoolLit::True)), 90 | $($prefix)* TokenTree::Ident(ref ident) if ident.to_string() == "false" 91 | => return Ok(Literal::Bool(crate::BoolLit::False)), 92 | $($prefix)* TokenTree::Ident(_) => Err(TokenKind::Ident), 93 | $($prefix)* TokenTree::Literal(ref lit) => Ok(lit), 94 | }; 95 | 96 | match res { 97 | Ok(lit) => Ok(From::from(lit)), 98 | Err(actual) => Err(InvalidToken { 99 | actual, 100 | expected: TokenKind::Literal, 101 | span: span.into(), 102 | }), 103 | } 104 | } 105 | } 106 | } 107 | } 108 | 109 | helper!(impl_tt_to_lit, ); 110 | 111 | 112 | // ============================================================================================== 113 | // ===== `TryFrom`, `TryFrom` for non-bool `*Lit` 114 | // ============================================================================================== 115 | 116 | fn kind_of(lit: &Literal) -> TokenKind { 117 | match lit { 118 | Literal::String(_) => TokenKind::StringLit, 119 | Literal::Bool(_) => TokenKind::BoolLit, 120 | Literal::Integer(_) => TokenKind::IntegerLit, 121 | Literal::Float(_) => TokenKind::FloatLit, 122 | Literal::Char(_) => TokenKind::CharLit, 123 | Literal::Byte(_) => TokenKind::ByteLit, 124 | Literal::ByteString(_) => TokenKind::ByteStringLit, 125 | } 126 | } 127 | 128 | macro_rules! 
impl_for_specific_lit { 129 | ([$($prefix:tt)*] => $ty:ty, $variant:ident, $kind:ident) => { 130 | impl TryFrom<$($prefix)* Literal> for $ty { 131 | type Error = InvalidToken; 132 | fn try_from(src: $($prefix)* Literal) -> Result { 133 | let span = src.span(); 134 | let lit: Literal = src.into(); 135 | match lit { 136 | Literal::$variant(s) => Ok(s), 137 | other => Err(InvalidToken { 138 | expected: TokenKind::$kind, 139 | actual: kind_of(&other), 140 | span: span.into(), 141 | }), 142 | } 143 | } 144 | } 145 | 146 | impl TryFrom<$($prefix)* TokenTree> for $ty { 147 | type Error = InvalidToken; 148 | fn try_from(tt: $($prefix)* TokenTree) -> Result { 149 | let span = tt.span(); 150 | let res = match tt { 151 | $($prefix)* TokenTree::Group(_) => Err(TokenKind::Group), 152 | $($prefix)* TokenTree::Punct(_) => Err(TokenKind::Punct), 153 | $($prefix)* TokenTree::Ident(_) => Err(TokenKind::Ident), 154 | $($prefix)* TokenTree::Literal(ref lit) => Ok(lit), 155 | }; 156 | 157 | match res { 158 | Ok(lit) => <$ty>::try_from(lit), 159 | Err(actual) => Err(InvalidToken { 160 | actual, 161 | expected: TokenKind::$kind, 162 | span: span.into(), 163 | }), 164 | } 165 | } 166 | } 167 | }; 168 | } 169 | 170 | helper!(impl_for_specific_lit, crate::IntegerLit, Integer, IntegerLit); 171 | helper!(impl_for_specific_lit, crate::FloatLit, Float, FloatLit); 172 | helper!(impl_for_specific_lit, crate::CharLit, Char, CharLit); 173 | helper!(impl_for_specific_lit, crate::StringLit, String, StringLit); 174 | helper!(impl_for_specific_lit, crate::ByteLit, Byte, ByteLit); 175 | helper!(impl_for_specific_lit, crate::ByteStringLit, ByteString, ByteStringLit); 176 | 177 | 178 | // ============================================================================================== 179 | // ===== `From<*Lit> for pm::Literal` 180 | // ============================================================================================== 181 | 182 | macro_rules! 
impl_specific_lit_to_pm_lit { 183 | ([$($prefix:tt)*] => $ty:ident, $variant:ident, $kind:ident) => { 184 | impl<B: crate::Buffer> From<crate::$ty<B>> for $($prefix)* Literal { 185 | fn from(l: crate::$ty<B>) -> Self { 186 | // This should never fail: an input that is parsed successfully 187 | // as one of our literal types should always parse as a 188 | // proc_macro literal as well! 189 | l.raw_input().parse().unwrap_or_else(|e| { 190 | panic!( 191 | "failed to parse `{}` as `{}`: {}", 192 | l.raw_input(), 193 | std::any::type_name::<Self>(), 194 | e, 195 | ) 196 | }) 197 | } 198 | } 199 | }; 200 | } 201 | 202 | helper_no_refs!(impl_specific_lit_to_pm_lit, IntegerLit, Integer, IntegerLit); 203 | helper_no_refs!(impl_specific_lit_to_pm_lit, FloatLit, Float, FloatLit); 204 | helper_no_refs!(impl_specific_lit_to_pm_lit, CharLit, Char, CharLit); 205 | helper_no_refs!(impl_specific_lit_to_pm_lit, StringLit, String, StringLit); 206 | helper_no_refs!(impl_specific_lit_to_pm_lit, ByteLit, Byte, ByteLit); 207 | helper_no_refs!(impl_specific_lit_to_pm_lit, ByteStringLit, ByteString, ByteStringLit); 208 | 209 | 210 | // ============================================================================================== 211 | // ===== `TryFrom<pm::TokenTree> for BoolLit` 212 | // ============================================================================================== 213 | 214 | macro_rules! 
impl_from_tt_for_bool { 215 | ([$($prefix:tt)*] => ) => { 216 | impl TryFrom<$($prefix)* TokenTree> for crate::BoolLit { 217 | type Error = InvalidToken; 218 | fn try_from(tt: $($prefix)* TokenTree) -> Result { 219 | let span = tt.span(); 220 | let actual = match tt { 221 | $($prefix)* TokenTree::Ident(ref ident) if ident.to_string() == "true" 222 | => return Ok(crate::BoolLit::True), 223 | $($prefix)* TokenTree::Ident(ref ident) if ident.to_string() == "false" 224 | => return Ok(crate::BoolLit::False), 225 | 226 | $($prefix)* TokenTree::Group(_) => TokenKind::Group, 227 | $($prefix)* TokenTree::Punct(_) => TokenKind::Punct, 228 | $($prefix)* TokenTree::Ident(_) => TokenKind::Ident, 229 | $($prefix)* TokenTree::Literal(ref lit) => kind_of(&Literal::from(lit)), 230 | }; 231 | 232 | Err(InvalidToken { 233 | actual, 234 | expected: TokenKind::BoolLit, 235 | span: span.into(), 236 | }) 237 | } 238 | } 239 | }; 240 | } 241 | 242 | helper!(impl_from_tt_for_bool, ); 243 | 244 | // ============================================================================================== 245 | // ===== `From for pm::Ident` 246 | // ============================================================================================== 247 | 248 | macro_rules! impl_bool_lit_to_pm_lit { 249 | ([$($prefix:tt)*] => ) => { 250 | impl From for $($prefix)* Ident { 251 | fn from(l: crate::BoolLit) -> Self { 252 | Self::new(l.as_str(), $($prefix)* Span::call_site()) 253 | } 254 | } 255 | }; 256 | } 257 | 258 | helper_no_refs!(impl_bool_lit_to_pm_lit, ); 259 | 260 | 261 | mod tests { 262 | //! # Tests 263 | //! 264 | //! ```no_run 265 | //! extern crate proc_macro; 266 | //! 267 | //! use std::convert::TryFrom; 268 | //! use litrs::Literal; 269 | //! 270 | //! fn give() -> T { 271 | //! panic!() 272 | //! } 273 | //! 274 | //! let _ = litrs::Literal::::from(give::()); 275 | //! let _ = litrs::Literal::::from(give::>()); 276 | //! let _ = litrs::Literal::::from(give::>()); 277 | //! 
let _ = litrs::Literal::::from(give::>()); 278 | //! let _ = litrs::Literal::::from(give::>()); 279 | //! let _ = litrs::Literal::::from(give::>()); 280 | //! let _ = litrs::Literal::::from(give::>()); 281 | //! 282 | //! let _ = litrs::Literal::<&'static str>::from(give::()); 283 | //! let _ = litrs::Literal::<&'static str>::from(give::>()); 284 | //! let _ = litrs::Literal::<&'static str>::from(give::>()); 285 | //! let _ = litrs::Literal::<&'static str>::from(give::>()); 286 | //! let _ = litrs::Literal::<&'static str>::from(give::>()); 287 | //! let _ = litrs::Literal::<&'static str>::from(give::>()); 288 | //! let _ = litrs::Literal::<&'static str>::from(give::>()); 289 | //! 290 | //! 291 | //! let _ = litrs::Literal::from(give::()); 292 | //! let _ = litrs::Literal::from(give::<&proc_macro::Literal>()); 293 | //! 294 | //! let _ = litrs::Literal::try_from(give::()); 295 | //! let _ = litrs::Literal::try_from(give::<&proc_macro::TokenTree>()); 296 | //! 297 | //! 298 | //! let _ = litrs::IntegerLit::try_from(give::()); 299 | //! let _ = litrs::IntegerLit::try_from(give::<&proc_macro::Literal>()); 300 | //! 301 | //! let _ = litrs::FloatLit::try_from(give::()); 302 | //! let _ = litrs::FloatLit::try_from(give::<&proc_macro::Literal>()); 303 | //! 304 | //! let _ = litrs::CharLit::try_from(give::()); 305 | //! let _ = litrs::CharLit::try_from(give::<&proc_macro::Literal>()); 306 | //! 307 | //! let _ = litrs::StringLit::try_from(give::()); 308 | //! let _ = litrs::StringLit::try_from(give::<&proc_macro::Literal>()); 309 | //! 310 | //! let _ = litrs::ByteLit::try_from(give::()); 311 | //! let _ = litrs::ByteLit::try_from(give::<&proc_macro::Literal>()); 312 | //! 313 | //! let _ = litrs::ByteStringLit::try_from(give::()); 314 | //! let _ = litrs::ByteStringLit::try_from(give::<&proc_macro::Literal>()); 315 | //! 316 | //! 317 | //! let _ = litrs::BoolLit::try_from(give::()); 318 | //! let _ = litrs::BoolLit::try_from(give::<&proc_macro::TokenTree>()); 319 | //! 
320 | //! let _ = litrs::IntegerLit::try_from(give::()); 321 | //! let _ = litrs::IntegerLit::try_from(give::<&proc_macro::TokenTree>()); 322 | //! 323 | //! let _ = litrs::FloatLit::try_from(give::()); 324 | //! let _ = litrs::FloatLit::try_from(give::<&proc_macro::TokenTree>()); 325 | //! 326 | //! let _ = litrs::CharLit::try_from(give::()); 327 | //! let _ = litrs::CharLit::try_from(give::<&proc_macro::TokenTree>()); 328 | //! 329 | //! let _ = litrs::StringLit::try_from(give::()); 330 | //! let _ = litrs::StringLit::try_from(give::<&proc_macro::TokenTree>()); 331 | //! 332 | //! let _ = litrs::ByteLit::try_from(give::()); 333 | //! let _ = litrs::ByteLit::try_from(give::<&proc_macro::TokenTree>()); 334 | //! 335 | //! let _ = litrs::ByteStringLit::try_from(give::()); 336 | //! let _ = litrs::ByteStringLit::try_from(give::<&proc_macro::TokenTree>()); 337 | //! ``` 338 | } 339 | 340 | #[cfg(feature = "proc-macro2")] 341 | mod tests_proc_macro2 { 342 | //! # Tests 343 | //! 344 | //! ```no_run 345 | //! extern crate proc_macro; 346 | //! 347 | //! use std::convert::TryFrom; 348 | //! use litrs::Literal; 349 | //! 350 | //! fn give() -> T { 351 | //! panic!() 352 | //! } 353 | //! 354 | //! let _ = litrs::Literal::from(give::()); 355 | //! let _ = litrs::Literal::from(give::<&proc_macro2::Literal>()); 356 | //! 357 | //! let _ = litrs::Literal::try_from(give::()); 358 | //! let _ = litrs::Literal::try_from(give::<&proc_macro2::TokenTree>()); 359 | //! 360 | //! 361 | //! let _ = litrs::IntegerLit::try_from(give::()); 362 | //! let _ = litrs::IntegerLit::try_from(give::<&proc_macro2::Literal>()); 363 | //! 364 | //! let _ = litrs::FloatLit::try_from(give::()); 365 | //! let _ = litrs::FloatLit::try_from(give::<&proc_macro2::Literal>()); 366 | //! 367 | //! let _ = litrs::CharLit::try_from(give::()); 368 | //! let _ = litrs::CharLit::try_from(give::<&proc_macro2::Literal>()); 369 | //! 370 | //! let _ = litrs::StringLit::try_from(give::()); 371 | //! 
let _ = litrs::StringLit::try_from(give::<&proc_macro2::Literal>()); 372 | //! 373 | //! let _ = litrs::ByteLit::try_from(give::()); 374 | //! let _ = litrs::ByteLit::try_from(give::<&proc_macro2::Literal>()); 375 | //! 376 | //! let _ = litrs::ByteStringLit::try_from(give::()); 377 | //! let _ = litrs::ByteStringLit::try_from(give::<&proc_macro2::Literal>()); 378 | //! 379 | //! 380 | //! let _ = litrs::BoolLit::try_from(give::()); 381 | //! let _ = litrs::BoolLit::try_from(give::<&proc_macro2::TokenTree>()); 382 | //! 383 | //! let _ = litrs::IntegerLit::try_from(give::()); 384 | //! let _ = litrs::IntegerLit::try_from(give::<&proc_macro2::TokenTree>()); 385 | //! 386 | //! let _ = litrs::FloatLit::try_from(give::()); 387 | //! let _ = litrs::FloatLit::try_from(give::<&proc_macro2::TokenTree>()); 388 | //! 389 | //! let _ = litrs::CharLit::try_from(give::()); 390 | //! let _ = litrs::CharLit::try_from(give::<&proc_macro2::TokenTree>()); 391 | //! 392 | //! let _ = litrs::StringLit::try_from(give::()); 393 | //! let _ = litrs::StringLit::try_from(give::<&proc_macro2::TokenTree>()); 394 | //! 395 | //! let _ = litrs::ByteLit::try_from(give::()); 396 | //! let _ = litrs::ByteLit::try_from(give::<&proc_macro2::TokenTree>()); 397 | //! 398 | //! let _ = litrs::ByteStringLit::try_from(give::()); 399 | //! let _ = litrs::ByteStringLit::try_from(give::<&proc_macro2::TokenTree>()); 400 | //! ``` 401 | } 402 | -------------------------------------------------------------------------------- /src/integer/mod.rs: -------------------------------------------------------------------------------- 1 | use std::{fmt, str::FromStr}; 2 | 3 | use crate::{ 4 | Buffer, ParseError, 5 | err::{perr, ParseErrorKind::*}, 6 | parse::{first_byte_or_empty, hex_digit_value, check_suffix}, 7 | }; 8 | 9 | 10 | /// An integer literal, e.g. `27`, `0x7F`, `0b101010u8` or `5_000_000i64`. 
11 | /// 12 | /// An integer literal consists of an optional base prefix (`0b`, `0o`, `0x`), 13 | /// the main part (digits and underscores), and an optional type suffix 14 | /// (e.g. `u64` or `i8`). See [the reference][ref] for more information. 15 | /// 16 | /// Note that integer literals are never negative: the grammar does not contain 17 | /// the minus sign at all. The minus sign is just the unary negate operator, 18 | /// not part of the literal. This matters for cases like `- 128i8`: 19 | /// here, the literal itself would overflow the specified type (`i8` cannot 20 | /// represent 128). That's why in rustc, the literal overflow check is 21 | /// performed as a lint after parsing, not during the lexing stage. Similarly, 22 | /// [`IntegerLit::parse`] does not perform an overflow check. 23 | /// 24 | /// [ref]: https://doc.rust-lang.org/reference/tokens.html#integer-literals 25 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 26 | #[non_exhaustive] 27 | pub struct IntegerLit<B: Buffer> { 28 | /// The raw literal. Grammar: `
`. 29 | raw: B, 30 | /// First index of the main number part (after the base prefix). 31 | start_main_part: usize, 32 | /// First index not part of the main number part. 33 | end_main_part: usize, 34 | /// Parsed `raw[..start_main_part]`. 35 | base: IntegerBase, 36 | } 37 | 38 | impl<B: Buffer> IntegerLit<B> { 39 | /// Parses the input as an integer literal. Returns an error if the input is 40 | /// invalid or represents a different kind of literal. 41 | pub fn parse(input: B) -> Result<Self, ParseError> { 42 | match first_byte_or_empty(&input)? { 43 | digit @ b'0'..=b'9' => { 44 | // TODO: simplify once RFC 2528 is stabilized 45 | let IntegerLit { 46 | start_main_part, 47 | end_main_part, 48 | base, 49 | .. 50 | } = parse_impl(&input, digit)?; 51 | 52 | Ok(Self { raw: input, start_main_part, end_main_part, base }) 53 | }, 54 | _ => Err(perr(0, DoesNotStartWithDigit)), 55 | } 56 | } 57 | 58 | /// Performs the actual string to int conversion to obtain the integer 59 | /// value. The optional type suffix of the literal **is ignored by this 60 | /// method**. This means `N` does not need to match the type suffix! 61 | /// 62 | /// Returns `None` if the literal overflows `N`. 63 | /// 64 | /// Hint: `u128` can represent all possible integer literal values, 65 | /// as there are no negative literals (see type docs). Thus you can, for 66 | /// example, safely use `lit.value::<u128>().to_string()` to get a decimal 67 | /// string. (Technically, Rust integer literals can represent arbitrarily 68 | /// large numbers, but those would be rejected at a later stage by the Rust 69 | /// compiler). 70 | pub fn value<N: FromIntegerLiteral>(&self) -> Option<N> { 71 | let base = N::from_small_number(self.base.value()); 72 | 73 | let mut acc = N::from_small_number(0); 74 | for digit in self.raw_main_part().bytes() { 75 | if digit == b'_' { 76 | continue; 77 | } 78 | 79 | // We don't actually need the base here: we already know this main 80 | // part only contains digits valid for the specified base. 
81 | let digit = hex_digit_value(digit) 82 | .unwrap_or_else(|| unreachable!("bug: integer main part contains non-digit")); 83 | 84 | acc = acc.checked_mul(base)?; 85 | acc = acc.checked_add(N::from_small_number(digit))?; 86 | } 87 | 88 | Some(acc) 89 | } 90 | 91 | /// The base of this integer literal. 92 | pub fn base(&self) -> IntegerBase { 93 | self.base 94 | } 95 | 96 | /// The main part containing the digits and potentially `_`. Do not try to 97 | /// parse this directly as that would ignore the base! 98 | pub fn raw_main_part(&self) -> &str { 99 | &(*self.raw)[self.start_main_part..self.end_main_part] 100 | } 101 | 102 | /// The optional suffix. Returns `""` if the suffix is empty/does not exist. 103 | /// 104 | /// If you want the type, try `IntegerType::from_suffix(lit.suffix())`. 105 | pub fn suffix(&self) -> &str { 106 | &(*self.raw)[self.end_main_part..] 107 | } 108 | 109 | /// Returns the raw input that was passed to `parse`. 110 | pub fn raw_input(&self) -> &str { 111 | &self.raw 112 | } 113 | 114 | /// Returns the raw input that was passed to `parse`, potentially owned. 115 | pub fn into_raw_input(self) -> B { 116 | self.raw 117 | } 118 | } 119 | 120 | impl IntegerLit<&str> { 121 | /// Makes a copy of the underlying buffer and returns the owned version of 122 | /// `Self`. 123 | pub fn to_owned(&self) -> IntegerLit { 124 | IntegerLit { 125 | raw: self.raw.to_owned(), 126 | start_main_part: self.start_main_part, 127 | end_main_part: self.end_main_part, 128 | base: self.base, 129 | } 130 | } 131 | } 132 | 133 | impl fmt::Display for IntegerLit { 134 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 135 | write!(f, "{}", &*self.raw) 136 | } 137 | } 138 | 139 | /// Integer literal types. *Implementation detail*. 140 | /// 141 | /// Implemented for all integer literal types. This trait is sealed and cannot 142 | /// be implemented outside of this crate. 
The trait's methods are implementation 143 | /// detail of this library and are not subject to semver. 144 | pub trait FromIntegerLiteral: self::sealed::Sealed + Copy { 145 | /// Creates itself from the given number. `n` is guaranteed to be `<= 16`. 146 | #[doc(hidden)] 147 | fn from_small_number(n: u8) -> Self; 148 | 149 | #[doc(hidden)] 150 | fn checked_add(self, rhs: Self) -> Option; 151 | 152 | #[doc(hidden)] 153 | fn checked_mul(self, rhs: Self) -> Option; 154 | 155 | #[doc(hidden)] 156 | fn ty() -> IntegerType; 157 | } 158 | 159 | macro_rules! impl_from_int_literal { 160 | ($( $ty:ty => $variant:ident ,)* ) => { 161 | $( 162 | impl self::sealed::Sealed for $ty {} 163 | impl FromIntegerLiteral for $ty { 164 | fn from_small_number(n: u8) -> Self { 165 | n as Self 166 | } 167 | fn checked_add(self, rhs: Self) -> Option { 168 | self.checked_add(rhs) 169 | } 170 | fn checked_mul(self, rhs: Self) -> Option { 171 | self.checked_mul(rhs) 172 | } 173 | fn ty() -> IntegerType { 174 | IntegerType::$variant 175 | } 176 | } 177 | )* 178 | }; 179 | } 180 | 181 | impl_from_int_literal!( 182 | u8 => U8, u16 => U16, u32 => U32, u64 => U64, u128 => U128, usize => Usize, 183 | i8 => I8, i16 => I16, i32 => I32, i64 => I64, i128 => I128, isize => Isize, 184 | ); 185 | 186 | mod sealed { 187 | pub trait Sealed {} 188 | } 189 | 190 | /// Precondition: first byte of string has to be in `b'0'..=b'9'`. 191 | #[inline(never)] 192 | pub(crate) fn parse_impl(input: &str, first: u8) -> Result, ParseError> { 193 | // Figure out base and strip prefix base, if it exists. 194 | let (end_prefix, base) = match (first, input.as_bytes().get(1)) { 195 | (b'0', Some(b'b')) => (2, IntegerBase::Binary), 196 | (b'0', Some(b'o')) => (2, IntegerBase::Octal), 197 | (b'0', Some(b'x')) => (2, IntegerBase::Hexadecimal), 198 | 199 | // Everything else is treated as decimal. 
Several cases are caught 200 | // by this: 201 | // - "123" 202 | // - "0" 203 | // - "0u8" 204 | // - "0r" -> this will error later 205 | _ => (0, IntegerBase::Decimal), 206 | }; 207 | let without_prefix = &input[end_prefix..]; 208 | 209 | 210 | // Scan input to find the first character that's not a valid digit. 211 | let is_valid_digit = match base { 212 | IntegerBase::Binary => |b| matches!(b, b'0' | b'1' | b'_'), 213 | IntegerBase::Octal => |b| matches!(b, b'0'..=b'7' | b'_'), 214 | IntegerBase::Decimal => |b| matches!(b, b'0'..=b'9' | b'_'), 215 | IntegerBase::Hexadecimal => |b| matches!(b, b'0'..=b'9' | b'a'..=b'f' | b'A'..=b'F' | b'_'), 216 | }; 217 | let end_main = without_prefix.bytes() 218 | .position(|b| !is_valid_digit(b)) 219 | .unwrap_or(without_prefix.len()); 220 | let (main_part, suffix) = without_prefix.split_at(end_main); 221 | 222 | check_suffix(suffix).map_err(|kind| { 223 | // This is just to have a nicer error kind for this special case. If the 224 | // suffix is invalid, it is non-empty -> unwrap ok. 225 | let first = suffix.as_bytes()[0]; 226 | if !is_valid_digit(first) && first.is_ascii_digit() { 227 | perr(end_main + end_prefix, InvalidDigit) 228 | } else { 229 | perr(end_main + end_prefix..input.len(), kind) 230 | } 231 | })?; 232 | if suffix.starts_with('e') || suffix.starts_with('E') { 233 | return Err(perr(end_main, IntegerSuffixStartingWithE)); 234 | } 235 | 236 | // Make sure main number part is not empty. 237 | if main_part.bytes().filter(|&b| b != b'_').count() == 0 { 238 | return Err(perr(end_prefix..end_prefix + end_main, NoDigits)); 239 | } 240 | 241 | Ok(IntegerLit { 242 | raw: input, 243 | start_main_part: end_prefix, 244 | end_main_part: end_main + end_prefix, 245 | base, 246 | }) 247 | } 248 | 249 | 250 | /// The bases in which an integer can be specified. 
251 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 252 | pub enum IntegerBase { 253 | Binary, 254 | Octal, 255 | Decimal, 256 | Hexadecimal, 257 | } 258 | 259 | impl IntegerBase { 260 | /// Returns the literal prefix that indicates this base, i.e. `"0b"`, 261 | /// `"0o"`, `""` and `"0x"`. 262 | pub fn prefix(self) -> &'static str { 263 | match self { 264 | Self::Binary => "0b", 265 | Self::Octal => "0o", 266 | Self::Decimal => "", 267 | Self::Hexadecimal => "0x", 268 | } 269 | } 270 | 271 | /// Returns the base value, i.e. 2, 8, 10 or 16. 272 | pub fn value(self) -> u8 { 273 | match self { 274 | Self::Binary => 2, 275 | Self::Octal => 8, 276 | Self::Decimal => 10, 277 | Self::Hexadecimal => 16, 278 | } 279 | } 280 | } 281 | 282 | /// All possible integer type suffixes. 283 | #[derive(Debug, Clone, Copy, PartialEq, Eq)] 284 | #[non_exhaustive] 285 | pub enum IntegerType { 286 | U8, 287 | U16, 288 | U32, 289 | U64, 290 | U128, 291 | Usize, 292 | I8, 293 | I16, 294 | I32, 295 | I64, 296 | I128, 297 | Isize, 298 | } 299 | 300 | impl IntegerType { 301 | /// Returns the type corresponding to the given suffix (e.g. `"u8"` is 302 | /// mapped to `Self::U8`). If the suffix is not a valid integer type, 303 | /// `None` is returned. 304 | pub fn from_suffix(suffix: &str) -> Option { 305 | match suffix { 306 | "u8" => Some(Self::U8), 307 | "u16" => Some(Self::U16), 308 | "u32" => Some(Self::U32), 309 | "u64" => Some(Self::U64), 310 | "u128" => Some(Self::U128), 311 | "usize" => Some(Self::Usize), 312 | "i8" => Some(Self::I8), 313 | "i16" => Some(Self::I16), 314 | "i32" => Some(Self::I32), 315 | "i64" => Some(Self::I64), 316 | "i128" => Some(Self::I128), 317 | "isize" => Some(Self::Isize), 318 | _ => None, 319 | } 320 | } 321 | 322 | /// Returns the suffix for this type, e.g. `"u8"` for `Self::U8`. 
323 | pub fn suffix(self) -> &'static str { 324 | match self { 325 | Self::U8 => "u8", 326 | Self::U16 => "u16", 327 | Self::U32 => "u32", 328 | Self::U64 => "u64", 329 | Self::U128 => "u128", 330 | Self::Usize => "usize", 331 | Self::I8 => "i8", 332 | Self::I16 => "i16", 333 | Self::I32 => "i32", 334 | Self::I64 => "i64", 335 | Self::I128 => "i128", 336 | Self::Isize => "isize", 337 | } 338 | } 339 | } 340 | 341 | impl FromStr for IntegerType { 342 | type Err = (); 343 | fn from_str(s: &str) -> Result { 344 | Self::from_suffix(s).ok_or(()) 345 | } 346 | } 347 | 348 | impl fmt::Display for IntegerType { 349 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 350 | self.suffix().fmt(f) 351 | } 352 | } 353 | 354 | 355 | #[cfg(test)] 356 | mod tests; 357 | -------------------------------------------------------------------------------- /src/integer/tests.rs: -------------------------------------------------------------------------------- 1 | use std::fmt::{Debug, Display}; 2 | use crate::{ 3 | FromIntegerLiteral, Literal, IntegerLit, IntegerType as Ty, IntegerBase, IntegerBase::*, 4 | test_util::{assert_parse_ok_eq, assert_roundtrip}, 5 | }; 6 | 7 | 8 | // ===== Utility functions ======================================================================= 9 | 10 | #[track_caller] 11 | fn check( 12 | input: &str, 13 | value: T, 14 | base: IntegerBase, 15 | main_part: &str, 16 | type_suffix: Option, 17 | ) { 18 | let expected_integer = IntegerLit { 19 | raw: input, 20 | start_main_part: base.prefix().len(), 21 | end_main_part: base.prefix().len() + main_part.len(), 22 | base, 23 | }; 24 | assert_parse_ok_eq( 25 | input, IntegerLit::parse(input), expected_integer.clone(), "IntegerLit::parse"); 26 | assert_parse_ok_eq( 27 | input, Literal::parse(input), Literal::Integer(expected_integer), "Literal::parse"); 28 | assert_roundtrip(expected_integer.to_owned(), input); 29 | assert_eq!(Ty::from_suffix(IntegerLit::parse(input).unwrap().suffix()), type_suffix); 30 | 31 | let 
actual_value = IntegerLit::parse(input) 32 | .unwrap() 33 | .value::<T>() 34 | .unwrap_or_else(|| panic!("unexpected overflow in `IntegerLit::value` for `{}`", input)); 35 | if actual_value != value { 36 | panic!( 37 | "Parsing int literal `{}` should give value `{}`, but actually resulted in `{}`", 38 | input, 39 | value, 40 | actual_value, 41 | ); 42 | } 43 | } 44 | 45 | 46 | // ===== Actual tests =========================================================================== 47 | 48 | #[test] 49 | fn parse_decimal() { 50 | check("0", 0u128, Decimal, "0", None); 51 | check("1", 1u8, Decimal, "1", None); 52 | check("8", 8u16, Decimal, "8", None); 53 | check("9", 9u32, Decimal, "9", None); 54 | check("10", 10u64, Decimal, "10", None); 55 | check("11", 11i8, Decimal, "11", None); 56 | check("123456789", 123456789i128, Decimal, "123456789", None); 57 | 58 | check("05", 5i16, Decimal, "05", None); 59 | check("00005", 5i32, Decimal, "00005", None); 60 | check("0123456789", 123456789i64, Decimal, "0123456789", None); 61 | 62 | check("123_456_789", 123_456_789, Decimal, "123_456_789", None); 63 | check("0___4", 4, Decimal, "0___4", None); 64 | check("0___4_3", 43, Decimal, "0___4_3", None); 65 | check("0___4_3", 43, Decimal, "0___4_3", None); 66 | check("123___________", 123, Decimal, "123___________", None); 67 | 68 | check( 69 | "340282366920938463463374607431768211455", 70 | 340282366920938463463374607431768211455u128, 71 | Decimal, 72 | "340282366920938463463374607431768211455", 73 | None, 74 | ); 75 | check( 76 | "340_282_366_920_938_463_463_374_607_431_768_211_455", 77 | 340282366920938463463374607431768211455u128, 78 | Decimal, 79 | "340_282_366_920_938_463_463_374_607_431_768_211_455", 80 | None, 81 | ); 82 | check( 83 | "3_40_282_3669_20938_463463_3746074_31768211_455___", 84 | 340282366920938463463374607431768211455u128, 85 | Decimal, 86 | "3_40_282_3669_20938_463463_3746074_31768211_455___", 87 | None, 88 | ); 89 | } 90 | 91 | #[test] 92 | fn parse_binary() { 93 |
check("0b0", 0b0, Binary, "0", None); 94 | check("0b000", 0b000, Binary, "000", None); 95 | check("0b1", 0b1, Binary, "1", None); 96 | check("0b01", 0b01, Binary, "01", None); 97 | check("0b101010", 0b101010, Binary, "101010", None); 98 | check("0b10_10_10", 0b10_10_10, Binary, "10_10_10", None); 99 | check("0b01101110____", 0b01101110____, Binary, "01101110____", None); 100 | 101 | check("0b10010u8", 0b10010u8, Binary, "10010", Some(Ty::U8)); 102 | check("0b10010i8", 0b10010u8, Binary, "10010", Some(Ty::I8)); 103 | check("0b10010u64", 0b10010u64, Binary, "10010", Some(Ty::U64)); 104 | check("0b10010i64", 0b10010i64, Binary, "10010", Some(Ty::I64)); 105 | check( 106 | "0b1011001_00110000_00101000_10100101u32", 107 | 0b1011001_00110000_00101000_10100101u32, 108 | Binary, 109 | "1011001_00110000_00101000_10100101", 110 | Some(Ty::U32), 111 | ); 112 | } 113 | 114 | #[test] 115 | fn parse_octal() { 116 | check("0o0", 0o0, Octal, "0", None); 117 | check("0o1", 0o1, Octal, "1", None); 118 | check("0o6", 0o6, Octal, "6", None); 119 | check("0o7", 0o7, Octal, "7", None); 120 | check("0o17", 0o17, Octal, "17", None); 121 | check("0o123", 0o123, Octal, "123", None); 122 | check("0o7654321", 0o7654321, Octal, "7654321", None); 123 | check("0o7_53_1", 0o7_53_1, Octal, "7_53_1", None); 124 | check("0o66_", 0o66_, Octal, "66_", None); 125 | 126 | check("0o755u16", 0o755u16, Octal, "755", Some(Ty::U16)); 127 | check("0o755i128", 0o755i128, Octal, "755", Some(Ty::I128)); 128 | } 129 | 130 | #[test] 131 | fn parse_hexadecimal() { 132 | check("0x0", 0x0, Hexadecimal, "0", None); 133 | check("0x1", 0x1, Hexadecimal, "1", None); 134 | check("0x9", 0x9, Hexadecimal, "9", None); 135 | 136 | check("0xa", 0xa, Hexadecimal, "a", None); 137 | check("0xf", 0xf, Hexadecimal, "f", None); 138 | check("0x17", 0x17, Hexadecimal, "17", None); 139 | check("0x1b", 0x1b, Hexadecimal, "1b", None); 140 | check("0x123", 0x123, Hexadecimal, "123", None); 141 | check("0xace", 0xace, Hexadecimal, "ace", 
None); 142 | check("0xfdb971", 0xfdb971, Hexadecimal, "fdb971", None); 143 | check("0xa_54_f", 0xa_54_f, Hexadecimal, "a_54_f", None); 144 | check("0x6d_", 0x6d_, Hexadecimal, "6d_", None); 145 | 146 | check("0xA", 0xA, Hexadecimal, "A", None); 147 | check("0xF", 0xF, Hexadecimal, "F", None); 148 | check("0x17", 0x17, Hexadecimal, "17", None); 149 | check("0x1B", 0x1B, Hexadecimal, "1B", None); 150 | check("0x123", 0x123, Hexadecimal, "123", None); 151 | check("0xACE", 0xACE, Hexadecimal, "ACE", None); 152 | check("0xFDB971", 0xFDB971, Hexadecimal, "FDB971", None); 153 | check("0xA_54_F", 0xA_54_F, Hexadecimal, "A_54_F", None); 154 | check("0x6D_", 0x6D_, Hexadecimal, "6D_", None); 155 | 156 | check("0xFdB97a1", 0xFdB97a1, Hexadecimal, "FdB97a1", None); 157 | check("0xfdB97A1", 0xfdB97A1, Hexadecimal, "fdB97A1", None); 158 | 159 | check("0x40u16", 0x40u16, Hexadecimal, "40", Some(Ty::U16)); 160 | check("0xffi128", 0xffi128, Hexadecimal, "ff", Some(Ty::I128)); 161 | } 162 | 163 | #[test] 164 | fn starting_underscore() { 165 | check("0b_1", 1, Binary, "_1", None); 166 | check("0b_010i16", 0b_010, Binary, "_010", Some(Ty::I16)); 167 | 168 | check("0o_5", 5, Octal, "_5", None); 169 | check("0o_750u128", 0o_750u128, Octal, "_750", Some(Ty::U128)); 170 | 171 | check("0x_c", 0xc, Hexadecimal, "_c", None); 172 | check("0x_cf3i8", 0x_cf3, Hexadecimal, "_cf3", Some(Ty::I8)); 173 | } 174 | 175 | #[test] 176 | fn parse_overflowing_just_fine() { 177 | check("256u8", 256u16, Decimal, "256", Some(Ty::U8)); 178 | check("123_456_789u8", 123_456_789u32, Decimal, "123_456_789", Some(Ty::U8)); 179 | check("123_456_789u16", 123_456_789u32, Decimal, "123_456_789", Some(Ty::U16)); 180 | 181 | check("123_123_456_789u8", 123_123_456_789u64, Decimal, "123_123_456_789", Some(Ty::U8)); 182 | check("123_123_456_789u16", 123_123_456_789u64, Decimal, "123_123_456_789", Some(Ty::U16)); 183 | check("123_123_456_789u32", 123_123_456_789u64, Decimal, "123_123_456_789", Some(Ty::U32)); 184 | } 185 | 
186 | #[test] 187 | fn suffixes() { 188 | [ 189 | ("123i8", Ty::I8), 190 | ("123i16", Ty::I16), 191 | ("123i32", Ty::I32), 192 | ("123i64", Ty::I64), 193 | ("123i128", Ty::I128), 194 | ("123u8", Ty::U8), 195 | ("123u16", Ty::U16), 196 | ("123u32", Ty::U32), 197 | ("123u64", Ty::U64), 198 | ("123u128", Ty::U128), 199 | ].iter().for_each(|&(s, ty)| { 200 | assert_eq!(Ty::from_suffix(IntegerLit::parse(s).unwrap().suffix()), Some(ty)); 201 | }); 202 | } 203 | 204 | #[test] 205 | fn overflow_u128() { 206 | let inputs = [ 207 | "340282366920938463463374607431768211456", 208 | "0x100000000000000000000000000000000", 209 | "0o4000000000000000000000000000000000000000000", 210 | "0b1000000000000000000000000000000000000000000000000000000000000000000\ 211 | 00000000000000000000000000000000000000000000000000000000000000", 212 | "340282366920938463463374607431768211456u128", 213 | "340282366920938463463374607431768211457", 214 | "3_40_282_3669_20938_463463_3746074_31768211_456___", 215 | "3_40_282_3669_20938_463463_3746074_31768211_455___1", 216 | "3_40_282_3669_20938_463463_3746074_31768211_455___0u128", 217 | "3402823669209384634633746074317682114570", 218 | ]; 219 | 220 | for &input in &inputs { 221 | let lit = IntegerLit::parse(input).expect("failed to parse"); 222 | assert!(lit.value::<u128>().is_none()); 223 | } 224 | } 225 | 226 | #[test] 227 | fn overflow_u8() { 228 | let inputs = [ 229 | "256", "0x100", "0o400", "0b100000000", 230 | "257", "0x101", "0o401", "0b100000001", 231 | "300", 232 | "1548", 233 | "2548985", 234 | "256u128", 235 | "256u8", 236 | "2_5_6", 237 | "256_____1", 238 | "256__", 239 | ]; 240 | 241 | for &input in &inputs { 242 | let lit = IntegerLit::parse(input).expect("failed to parse"); 243 | assert!(lit.value::<u8>().is_none()); 244 | } 245 | } 246 | 247 | #[test] 248 | fn parse_err() { 249 | assert_err!(IntegerLit, "", Empty, None); 250 | assert_err_single!(IntegerLit::parse("a"), DoesNotStartWithDigit, 0); 251 | assert_err_single!(IntegerLit::parse(";"),
DoesNotStartWithDigit, 0); 252 | assert_err_single!(IntegerLit::parse("0;"), UnexpectedChar, 1..2); 253 | assert_err!(IntegerLit, "0b", NoDigits, 2..2); 254 | assert_err_single!(IntegerLit::parse(" 0"), DoesNotStartWithDigit, 0); 255 | assert_err_single!(IntegerLit::parse("0 "), UnexpectedChar, 1); 256 | assert_err!(IntegerLit, "0b3", InvalidDigit, 2); 257 | assert_err_single!(IntegerLit::parse("_"), DoesNotStartWithDigit, 0); 258 | assert_err_single!(IntegerLit::parse("_3"), DoesNotStartWithDigit, 0); 259 | assert_err!(IntegerLit, "0x44.5", UnexpectedChar, 4..6); 260 | assert_err_single!(IntegerLit::parse("123em"), IntegerSuffixStartingWithE, 3); 261 | } 262 | 263 | #[test] 264 | fn invalid_digits() { 265 | assert_err!(IntegerLit, "0b10201", InvalidDigit, 4); 266 | assert_err!(IntegerLit, "0b9", InvalidDigit, 2); 267 | assert_err!(IntegerLit, "0b07", InvalidDigit, 3); 268 | 269 | assert_err!(IntegerLit, "0o12380", InvalidDigit, 5); 270 | assert_err!(IntegerLit, "0o192", InvalidDigit, 3); 271 | 272 | assert_err_single!(IntegerLit::parse("a_123"), DoesNotStartWithDigit, 0); 273 | assert_err_single!(IntegerLit::parse("B_123"), DoesNotStartWithDigit, 0); 274 | } 275 | 276 | #[test] 277 | fn no_valid_digits() { 278 | assert_err!(IntegerLit, "0x_", NoDigits, 2..3); 279 | assert_err!(IntegerLit, "0x__", NoDigits, 2..4); 280 | assert_err!(IntegerLit, "0x________", NoDigits, 2..10); 281 | assert_err!(IntegerLit, "0x_i8", NoDigits, 2..3); 282 | assert_err!(IntegerLit, "0x_u8", NoDigits, 2..3); 283 | assert_err!(IntegerLit, "0x_isize", NoDigits, 2..3); 284 | assert_err!(IntegerLit, "0x_usize", NoDigits, 2..3); 285 | 286 | assert_err!(IntegerLit, "0o_", NoDigits, 2..3); 287 | assert_err!(IntegerLit, "0o__", NoDigits, 2..4); 288 | assert_err!(IntegerLit, "0o________", NoDigits, 2..10); 289 | assert_err!(IntegerLit, "0o_i32", NoDigits, 2..3); 290 | assert_err!(IntegerLit, "0o_u32", NoDigits, 2..3); 291 | 292 | assert_err!(IntegerLit, "0b_", NoDigits, 2..3); 293 | 
assert_err!(IntegerLit, "0b__", NoDigits, 2..4); 294 | assert_err!(IntegerLit, "0b________", NoDigits, 2..10); 295 | assert_err!(IntegerLit, "0b_i128", NoDigits, 2..3); 296 | assert_err!(IntegerLit, "0b_u128", NoDigits, 2..3); 297 | } 298 | 299 | #[test] 300 | fn non_standard_suffixes() { 301 | #[track_caller] 302 | fn check_suffix( 303 | input: &str, 304 | value: T, 305 | base: IntegerBase, 306 | main_part: &str, 307 | suffix: &str, 308 | ) { 309 | check(input, value, base, main_part, None); 310 | assert_eq!(IntegerLit::parse(input).unwrap().suffix(), suffix); 311 | } 312 | 313 | check_suffix("5u7", 5, Decimal, "5", "u7"); 314 | check_suffix("5u7", 5, Decimal, "5", "u7"); 315 | check_suffix("5u9", 5, Decimal, "5", "u9"); 316 | check_suffix("5u0", 5, Decimal, "5", "u0"); 317 | check_suffix("33u12", 33, Decimal, "33", "u12"); 318 | check_suffix("84u17", 84, Decimal, "84", "u17"); 319 | check_suffix("99u80", 99, Decimal, "99", "u80"); 320 | check_suffix("1234uu16", 1234, Decimal, "1234", "uu16"); 321 | 322 | check_suffix("5i7", 5, Decimal, "5", "i7"); 323 | check_suffix("5i9", 5, Decimal, "5", "i9"); 324 | check_suffix("5i0", 5, Decimal, "5", "i0"); 325 | check_suffix("33i12", 33, Decimal, "33", "i12"); 326 | check_suffix("84i17", 84, Decimal, "84", "i17"); 327 | check_suffix("99i80", 99, Decimal, "99", "i80"); 328 | check_suffix("1234ii16", 1234, Decimal, "1234", "ii16"); 329 | 330 | check_suffix("0ui32", 0, Decimal, "0", "ui32"); 331 | check_suffix("1iu32", 1, Decimal, "1", "iu32"); 332 | check_suffix("54321a64", 54321, Decimal, "54321", "a64"); 333 | check_suffix("54321b64", 54321, Decimal, "54321", "b64"); 334 | check_suffix("54321x64", 54321, Decimal, "54321", "x64"); 335 | check_suffix("54321o64", 54321, Decimal, "54321", "o64"); 336 | 337 | check_suffix("0a", 0, Decimal, "0", "a"); 338 | check_suffix("0a3", 0, Decimal, "0", "a3"); 339 | check_suffix("0z", 0, Decimal, "0", "z"); 340 | check_suffix("0z3", 0, Decimal, "0", "z3"); 341 | check_suffix("0b0a", 0, 
Binary, "0", "a"); 342 | check_suffix("0b0A", 0, Binary, "0", "A"); 343 | check_suffix("0b01f", 1, Binary, "01", "f"); 344 | check_suffix("0b01F", 1, Binary, "01", "F"); 345 | check_suffix("0o7a_", 7, Octal, "7", "a_"); 346 | check_suffix("0o7A_", 7, Octal, "7", "A_"); 347 | check_suffix("0o72f_0", 0o72, Octal, "72", "f_0"); 348 | check_suffix("0o72F_0", 0o72, Octal, "72", "F_0"); 349 | 350 | check_suffix("0x8cg", 0x8c, Hexadecimal, "8c", "g"); 351 | check_suffix("0x8cG", 0x8c, Hexadecimal, "8c", "G"); 352 | check_suffix("0x8c1h_", 0x8c1, Hexadecimal, "8c1", "h_"); 353 | check_suffix("0x8c1H_", 0x8c1, Hexadecimal, "8c1", "H_"); 354 | check_suffix("0x8czu16", 0x8c, Hexadecimal, "8c", "zu16"); 355 | 356 | check_suffix("123_foo", 123, Decimal, "123_", "foo"); 357 | } 358 | -------------------------------------------------------------------------------- /src/lib.rs: -------------------------------------------------------------------------------- 1 | //! Parsing and inspecting Rust literal tokens. 2 | //! 3 | //! This library offers functionality to parse Rust literals, i.e. tokens in the 4 | //! Rust programming language that represent fixed values. The grammar for 5 | //! those is defined [here][ref]. 6 | //! 7 | //! This kind of functionality already exists in the crate `syn`. However, as 8 | //! you oftentimes don't need (nor want) the full power of `syn`, `litrs` was 9 | //! built. This crate also offers a bit more flexibility compared to `syn` 10 | //! (only regarding literals, of course). 11 | //! 12 | //! 13 | //! # Quick start 14 | //! 15 | //! | **`StringLit::try_from(tt)?.value()`** | 16 | //! | - | 17 | //! 18 | //! ... where `tt` is a `proc_macro::TokenTree` and where [`StringLit`] can be 19 | //! replaced with [`Literal`] or other types of literals (e.g. [`FloatLit`]). 20 | //! Calling `value()` returns the value that is represented by the literal. 21 | //! 22 | //! **Mini Example** 23 | //! 24 | //! ```ignore 25 | //! use proc_macro::TokenStream; 26 | //! 
27 | //! #[proc_macro] 28 | //! pub fn foo(input: TokenStream) -> TokenStream { 29 | //! let first_token = input.into_iter().next().unwrap(); // Do proper error handling! 30 | //! let string_value = match litrs::StringLit::try_from(first_token) { 31 | //! Ok(string_lit) => string_lit.value(), 32 | //! Err(e) => return e.to_compile_error(), 33 | //! }; 34 | //! 35 | //! // `string_value` is the string value with all escapes resolved. 36 | //! todo!() 37 | //! } 38 | //! ``` 39 | //! 40 | //! # Overview 41 | //! 42 | //! The main types of this library are [`Literal`], representing any kind of 43 | //! literal, and `*Lit`, like [`StringLit`] or [`FloatLit`], representing a 44 | //! specific kind of literal. 45 | //! 46 | //! There are different ways to obtain such a literal type: 47 | //! 48 | //! - **`parse`**: parses a `&str` or `String` and returns `Result<_, 49 | //! ParseError>`. For example: [`Literal::parse`] and 50 | //! [`IntegerLit::parse`]. 51 | //! 52 | //! - **`From<proc_macro::Literal> for Literal`**: turns a `Literal` value from 53 | //! the `proc_macro` crate into a `Literal` from this crate. 54 | //! 55 | //! - **`TryFrom<proc_macro::Literal> for *Lit`**: tries to turn a 56 | //! `proc_macro::Literal` into a specific literal type of this crate. If 57 | //! the input is a literal of a different kind, `Err(InvalidToken)` is 58 | //! returned. 59 | //! 60 | //! - **`TryFrom<proc_macro::TokenTree>`**: attempts to turn a token tree into a 61 | //! literal type of this crate. An error is returned if the token tree is 62 | //! not a literal, or if you are trying to turn it into a specific kind of 63 | //! literal and the token tree is a different kind of literal. 64 | //! 65 | //! All of the `From` and `TryFrom` conversions also work for references to 66 | //! `proc_macro` types. Additionally, if the crate feature `proc-macro2` is 67 | //! enabled (which it is by default), all these `From` and `TryFrom` impls also 68 | //! exist for the corresponding `proc_macro2` types. 69 | //! 70 | //!
**Note**: `true` and `false` are `Ident`s when passed to your proc macro. 71 | //! The `TryFrom<TokenTree>` impls check for those two special idents and 72 | //! return a [`BoolLit`] appropriately. For that reason, there is also no 73 | //! `TryFrom<proc_macro::Literal>` impl for [`BoolLit`]. The `proc_macro::Literal` 74 | //! simply cannot represent bool literals. 75 | //! 76 | //! 77 | //! # Examples 78 | //! 79 | //! In a proc-macro: 80 | //! 81 | //! ```ignore 82 | //! use std::convert::TryFrom; 83 | //! use proc_macro::TokenStream; 84 | //! use litrs::FloatLit; 85 | //! 86 | //! #[proc_macro] 87 | //! pub fn foo(input: TokenStream) -> TokenStream { 88 | //! let mut input = input.into_iter().collect::<Vec<_>>(); 89 | //! if input.len() != 1 { 90 | //! // Please do proper error handling in your real code! 91 | //! panic!("expected exactly one token as input"); 92 | //! } 93 | //! let token = input.remove(0); 94 | //! 95 | //! match FloatLit::try_from(token) { 96 | //! Ok(float_lit) => { /* do something */ } 97 | //! Err(e) => return e.to_compile_error(), 98 | //! } 99 | //! 100 | //! // Dummy output 101 | //! TokenStream::new() 102 | //! } 103 | //! ``` 104 | //! 105 | //! Parsing from string: 106 | //! 107 | //! ``` 108 | //! use litrs::{FloatLit, Literal}; 109 | //! 110 | //! // Parse a specific kind of literal (float in this case): 111 | //! let float_lit = FloatLit::parse("3.14f32"); 112 | //! assert!(float_lit.is_ok()); 113 | //! assert_eq!(float_lit.unwrap().suffix(), "f32"); 114 | //! assert!(FloatLit::parse("'c'").is_err()); 115 | //! 116 | //! // Parse any kind of literal. After parsing, you can inspect the literal 117 | //! // and decide what to do in each case. 118 | //! let lit = Literal::parse("0xff80").expect("failed to parse literal"); 119 | //! match lit { 120 | //! Literal::Integer(lit) => { /* ... */ } 121 | //! Literal::Float(lit) => { /* ... */ } 122 | //! Literal::Bool(lit) => { /* ... */ } 123 | //! Literal::Char(lit) => { /* ... */ } 124 | //! Literal::String(lit) => { /* ...
*/ } 125 | //! Literal::Byte(lit) => { /* ... */ } 126 | //! Literal::ByteString(lit) => { /* ... */ } 127 | //! } 128 | //! ``` 129 | //! 130 | //! 131 | //! 132 | //! # Crate features 133 | //! 134 | //! - `proc-macro2` (**default**): adds the dependency `proc_macro2`, a bunch of 135 | //! `From` and `TryFrom` impls, and [`InvalidToken::to_compile_error2`]. 136 | //! - `check_suffix`: if enabled, `parse` functions will exactly verify that the 137 | //! literal suffix is valid. Adds the dependency `unicode-xid`. If disabled, 138 | //! only an approximate check (only in ASCII range) is done. If you are 139 | //! writing a proc macro, you don't need to enable this as the suffix is 140 | //! already checked by the compiler. 141 | //! 142 | //! 143 | //! [ref]: https://doc.rust-lang.org/reference/tokens.html#literals 144 | //! 145 | 146 | #![deny(missing_debug_implementations)] 147 | 148 | extern crate proc_macro; 149 | 150 | #[cfg(test)] 151 | #[macro_use] 152 | mod test_util; 153 | 154 | #[cfg(test)] 155 | mod tests; 156 | 157 | mod bool; 158 | mod byte; 159 | mod bytestr; 160 | mod char; 161 | mod err; 162 | mod escape; 163 | mod float; 164 | mod impls; 165 | mod integer; 166 | mod parse; 167 | mod string; 168 | 169 | 170 | use std::{borrow::{Borrow, Cow}, fmt, ops::{Deref, Range}}; 171 | 172 | pub use self::{ 173 | bool::BoolLit, 174 | byte::ByteLit, 175 | bytestr::ByteStringLit, 176 | char::CharLit, 177 | err::{InvalidToken, ParseError}, 178 | float::{FloatLit, FloatType}, 179 | integer::{FromIntegerLiteral, IntegerLit, IntegerBase, IntegerType}, 180 | string::StringLit, 181 | }; 182 | 183 | 184 | // ============================================================================================== 185 | // ===== `Literal` and type defs 186 | // ============================================================================================== 187 | 188 | /// A literal. This is the main type of this library. 
189 | /// 190 | /// This type is generic over the underlying buffer `B`, which can be `&str` or 191 | /// `String`. 192 | /// 193 | /// To create this type, you have to either call [`Literal::parse`] with an 194 | /// input string or use the `From<_>` impls of this type. The impls are only 195 | /// available if the corresponding crate features are enabled (they are enabled 196 | /// by default). 197 | #[derive(Debug, Clone, PartialEq, Eq)] 198 | pub enum Literal<B: Buffer> { 199 | Bool(BoolLit), 200 | Integer(IntegerLit<B>), 201 | Float(FloatLit<B>), 202 | Char(CharLit<B>), 203 | String(StringLit<B>), 204 | Byte(ByteLit<B>), 205 | ByteString(ByteStringLit<B>), 206 | } 207 | 208 | impl<B: Buffer> Literal<B> { 209 | /// Parses the given input as a Rust literal. 210 | pub fn parse(input: B) -> Result<Self, ParseError> { 211 | parse::parse(input) 212 | } 213 | 214 | /// Returns the suffix of this literal or `""` if it doesn't have one. 215 | /// 216 | /// Rust token grammar actually allows suffixes for all kinds of tokens. 217 | /// Most Rust programmers only know the type suffixes for integers and 218 | /// floats, e.g. `0u32`. And in normal Rust code, everything else causes an 219 | /// error. But it is possible to pass literals with arbitrary suffixes to 220 | /// proc macros, for example: 221 | /// 222 | /// ```ignore 223 | /// some_macro!(3.14f33 16px '🦊'good_boy "toph"beifong); 224 | /// ``` 225 | /// 226 | /// Boolean literals, not actually being literals, but idents, cannot have 227 | /// suffixes and this method always returns `""` for those. 228 | /// 229 | /// There are some edge cases to be aware of: 230 | /// - Integer suffixes must not start with `e` or `E` as that conflicts with 231 | /// the exponent grammar for floats. `0e1` is a float; `0eel` is also 232 | /// parsed as a float and results in an error. 233 | /// - Hexadecimal integers eagerly parse digits, so `0x5abcdefgh` has a 234 | /// suffix of `gh`.
235 | /// - Suffixes can contain and start with `_`, but for integer and number 236 | /// literals, `_` is eagerly parsed as part of the number, so `1_x` has 237 | /// the suffix `x`. 238 | /// - The input `55f32` is regarded as integer literal with suffix `f32`. 239 | /// 240 | /// # Example 241 | /// 242 | /// ``` 243 | /// use litrs::Literal; 244 | /// 245 | /// assert_eq!(Literal::parse(r##"3.14f33"##).unwrap().suffix(), "f33"); 246 | /// assert_eq!(Literal::parse(r##"123hackerman"##).unwrap().suffix(), "hackerman"); 247 | /// assert_eq!(Literal::parse(r##"0x0fuck"##).unwrap().suffix(), "uck"); 248 | /// assert_eq!(Literal::parse(r##"'🦊'good_boy"##).unwrap().suffix(), "good_boy"); 249 | /// assert_eq!(Literal::parse(r##""toph"beifong"##).unwrap().suffix(), "beifong"); 250 | /// ``` 251 | pub fn suffix(&self) -> &str { 252 | match self { 253 | Literal::Bool(_) => "", 254 | Literal::Integer(l) => l.suffix(), 255 | Literal::Float(l) => l.suffix(), 256 | Literal::Char(l) => l.suffix(), 257 | Literal::String(l) => l.suffix(), 258 | Literal::Byte(l) => l.suffix(), 259 | Literal::ByteString(l) => l.suffix(), 260 | } 261 | } 262 | } 263 | 264 | impl Literal<&str> { 265 | /// Makes a copy of the underlying buffer and returns the owned version of 266 | /// `Self`. 
267 | pub fn into_owned(self) -> Literal<String> { 268 | match self { 269 | Literal::Bool(l) => Literal::Bool(l.to_owned()), 270 | Literal::Integer(l) => Literal::Integer(l.to_owned()), 271 | Literal::Float(l) => Literal::Float(l.to_owned()), 272 | Literal::Char(l) => Literal::Char(l.to_owned()), 273 | Literal::String(l) => Literal::String(l.into_owned()), 274 | Literal::Byte(l) => Literal::Byte(l.to_owned()), 275 | Literal::ByteString(l) => Literal::ByteString(l.into_owned()), 276 | } 277 | } 278 | } 279 | 280 | impl<B: Buffer> fmt::Display for Literal<B> { 281 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 282 | match self { 283 | Literal::Bool(l) => l.fmt(f), 284 | Literal::Integer(l) => l.fmt(f), 285 | Literal::Float(l) => l.fmt(f), 286 | Literal::Char(l) => l.fmt(f), 287 | Literal::String(l) => l.fmt(f), 288 | Literal::Byte(l) => l.fmt(f), 289 | Literal::ByteString(l) => l.fmt(f), 290 | } 291 | } 292 | } 293 | 294 | 295 | // ============================================================================================== 296 | // ===== Buffer 297 | // ============================================================================================== 298 | 299 | /// A shared or owned string buffer. Implemented for `String` and `&str`. *Implementation detail*. 300 | /// 301 | /// This trait is an implementation detail of this library, cannot be 302 | /// implemented in other crates and is not subject to semantic versioning. 303 | /// `litrs` only guarantees that this trait is implemented for `String` and 304 | /// `for<'a> &'a str`. 305 | pub trait Buffer: sealed::Sealed + Deref<Target = str> { 306 | /// This is `Cow<'static, str>` for `String`, and `Cow<'a, str>` for `&'a str`. 307 | type Cow: From<String> + AsRef<str> + Borrow<str> + Deref<Target = str>; 308 | 309 | #[doc(hidden)] 310 | fn into_cow(self) -> Self::Cow; 311 | 312 | /// This is `Cow<'static, [u8]>` for `String`, and `Cow<'a, [u8]>` for `&'a str`.
313 | type ByteCow: From<Vec<u8>> + AsRef<[u8]> + Borrow<[u8]> + Deref<Target = [u8]>; 314 | 315 | #[doc(hidden)] 316 | fn into_byte_cow(self) -> Self::ByteCow; 317 | 318 | /// Cuts away some characters at the beginning and some at the end. Given 319 | /// range has to be in bounds. 320 | #[doc(hidden)] 321 | fn cut(self, range: Range<usize>) -> Self; 322 | } 323 | 324 | mod sealed { 325 | pub trait Sealed {} 326 | } 327 | 328 | impl<'a> sealed::Sealed for &'a str {} 329 | impl<'a> Buffer for &'a str { 330 | #[doc(hidden)] 331 | fn cut(self, range: Range<usize>) -> Self { 332 | &self[range] 333 | } 334 | 335 | type Cow = Cow<'a, str>; 336 | #[doc(hidden)] 337 | fn into_cow(self) -> Self::Cow { 338 | self.into() 339 | } 340 | type ByteCow = Cow<'a, [u8]>; 341 | #[doc(hidden)] 342 | fn into_byte_cow(self) -> Self::ByteCow { 343 | self.as_bytes().into() 344 | } 345 | } 346 | 347 | impl sealed::Sealed for String {} 348 | impl Buffer for String { 349 | #[doc(hidden)] 350 | fn cut(mut self, range: Range<usize>) -> Self { 351 | // This is not the most efficient way, but it works. First we cut the 352 | // end, then the beginning. Note that `drain` also removes the range if 353 | // the iterator is not consumed.
354 | self.truncate(range.end); 355 | self.drain(..range.start); 356 | self 357 | } 358 | 359 | type Cow = Cow<'static, str>; 360 | #[doc(hidden)] 361 | fn into_cow(self) -> Self::Cow { 362 | self.into() 363 | } 364 | 365 | type ByteCow = Cow<'static, [u8]>; 366 | #[doc(hidden)] 367 | fn into_byte_cow(self) -> Self::ByteCow { 368 | self.into_bytes().into() 369 | } 370 | } 371 | -------------------------------------------------------------------------------- /src/parse.rs: -------------------------------------------------------------------------------- 1 | use crate::{ 2 | BoolLit, 3 | Buffer, 4 | ByteLit, 5 | ByteStringLit, 6 | CharLit, 7 | ParseError, 8 | FloatLit, 9 | IntegerLit, 10 | Literal, 11 | StringLit, 12 | err::{perr, ParseErrorKind::{*, self}}, 13 | }; 14 | 15 | 16 | pub fn parse<B: Buffer>(input: B) -> Result<Literal<B>, ParseError> { 17 | let (first, rest) = input.as_bytes().split_first().ok_or(perr(None, Empty))?; 18 | let second = input.as_bytes().get(1).copied(); 19 | 20 | match first { 21 | b'f' if &*input == "false" => Ok(Literal::Bool(BoolLit::False)), 22 | b't' if &*input == "true" => Ok(Literal::Bool(BoolLit::True)), 23 | 24 | // A number literal (integer or float). 25 | b'0'..=b'9' => { 26 | // To figure out whether this is a float or integer, we do some 27 | // quick inspection here. Yes, this is technically duplicate 28 | // work with what is happening in the integer/float parse 29 | // methods, but it makes the code way easier for now and won't 30 | // be a huge performance loss. 31 | // 32 | // The first non-decimal char in a float literal must 33 | // be '.', 'e' or 'E'.
34 | match input.as_bytes().get(1 + end_dec_digits(rest)) { 35 | Some(b'.') | Some(b'e') | Some(b'E') 36 | => FloatLit::parse(input).map(Literal::Float), 37 | 38 | _ => IntegerLit::parse(input).map(Literal::Integer), 39 | } 40 | }, 41 | 42 | b'\'' => CharLit::parse(input).map(Literal::Char), 43 | b'"' | b'r' => StringLit::parse(input).map(Literal::String), 44 | 45 | b'b' if second == Some(b'\'') => ByteLit::parse(input).map(Literal::Byte), 46 | b'b' if second == Some(b'r') || second == Some(b'"') 47 | => ByteStringLit::parse(input).map(Literal::ByteString), 48 | 49 | _ => Err(perr(None, InvalidLiteral)), 50 | } 51 | } 52 | 53 | 54 | pub(crate) fn first_byte_or_empty(s: &str) -> Result<u8, ParseError> { 55 | s.as_bytes().get(0).copied().ok_or(perr(None, Empty)) 56 | } 57 | 58 | /// Returns the index of the first byte in `input` that is neither `_` nor a 59 | /// decimal digit, or `input.len()` if every byte is `_` or a decimal digit. 60 | pub(crate) fn end_dec_digits(input: &[u8]) -> usize { 61 | input.iter() 62 | .position(|b| !matches!(b, b'_' | b'0'..=b'9')) 63 | .unwrap_or(input.len()) 64 | } 65 | 66 | pub(crate) fn hex_digit_value(digit: u8) -> Option<u8> { 67 | match digit { 68 | b'0'..=b'9' => Some(digit - b'0'), 69 | b'a'..=b'f' => Some(digit - b'a' + 10), 70 | b'A'..=b'F' => Some(digit - b'A' + 10), 71 | _ => None, 72 | } 73 | } 74 | 75 | /// Makes sure that `s` is a valid literal suffix. 76 | pub(crate) fn check_suffix(s: &str) -> Result<(), ParseErrorKind> { 77 | if s.is_empty() { 78 | return Ok(()); 79 | } 80 | 81 | let mut chars = s.chars(); 82 | let first = chars.next().unwrap(); 83 | let rest = chars.as_str(); 84 | if first == '_' && rest.is_empty() { 85 | return Err(InvalidSuffix); 86 | } 87 | 88 | // This is just an extra check to improve the error message. If the first 89 | // character of the "suffix" is already some invalid ASCII 90 | // char, "unexpected character" seems like the more fitting error.
91 | if first.is_ascii() && !(first.is_ascii_alphabetic() || first == '_') { 92 | return Err(UnexpectedChar); 93 | } 94 | 95 | // Proper check is optional as it's not really necessary in proc macro 96 | // context. 97 | #[cfg(feature = "check_suffix")] 98 | fn is_valid_suffix(first: char, rest: &str) -> bool { 99 | use unicode_xid::UnicodeXID; 100 | 101 | (first == '_' || first.is_xid_start()) 102 | && rest.chars().all(|c| c.is_xid_continue()) 103 | } 104 | 105 | // When avoiding the dependency on `unicode_xid`, we just do a best effort 106 | // to catch the most common errors. 107 | #[cfg(not(feature = "check_suffix"))] 108 | fn is_valid_suffix(first: char, rest: &str) -> bool { 109 | if first.is_ascii() && !(first.is_ascii_alphabetic() || first == '_') { 110 | return false; 111 | } 112 | for c in rest.chars() { 113 | if c.is_ascii() && !(c.is_ascii_alphanumeric() || c == '_') { 114 | return false; 115 | } 116 | } 117 | true 118 | } 119 | 120 | if is_valid_suffix(first, rest) { 121 | Ok(()) 122 | } else { 123 | Err(InvalidSuffix) 124 | } 125 | } 126 | -------------------------------------------------------------------------------- /src/string/mod.rs: -------------------------------------------------------------------------------- 1 | use std::{fmt, ops::Range}; 2 | 3 | use crate::{ 4 | Buffer, ParseError, 5 | err::{perr, ParseErrorKind::*}, 6 | escape::{scan_raw_string, unescape_string}, 7 | parse::first_byte_or_empty, 8 | }; 9 | 10 | 11 | /// A string or raw string literal, e.g. `"foo"`, `"Grüße"` or `r#"a🦊c"d🦀f"#`. 12 | /// 13 | /// See [the reference][ref] for more information. 14 | /// 15 | /// [ref]: https://doc.rust-lang.org/reference/tokens.html#string-literals 16 | #[derive(Debug, Clone, PartialEq, Eq)] 17 | pub struct StringLit<B: Buffer> { 18 | /// The raw input. 19 | raw: B, 20 | 21 | /// The string value (with all escapes unescaped), or `None` if there were 22 | /// no escapes. In the latter case, the string value is in `raw`.
23 | value: Option, 24 | 25 | /// The number of hash signs in case of a raw string literal, or `None` if 26 | /// it's not a raw string literal. 27 | num_hashes: Option, 28 | 29 | /// Start index of the suffix or `raw.len()` if there is no suffix. 30 | start_suffix: usize, 31 | } 32 | 33 | impl StringLit { 34 | /// Parses the input as a (raw) string literal. Returns an error if the 35 | /// input is invalid or represents a different kind of literal. 36 | pub fn parse(input: B) -> Result { 37 | match first_byte_or_empty(&input)? { 38 | b'r' | b'"' => { 39 | let (value, num_hashes, start_suffix) = parse_impl(&input)?; 40 | Ok(Self { raw: input, value, num_hashes, start_suffix }) 41 | } 42 | _ => Err(perr(0, InvalidStringLiteralStart)), 43 | } 44 | } 45 | 46 | /// Returns the string value this literal represents (where all escapes have 47 | /// been turned into their respective values). 48 | pub fn value(&self) -> &str { 49 | self.value.as_deref().unwrap_or(&self.raw[self.inner_range()]) 50 | } 51 | 52 | /// Like `value` but returns a potentially owned version of the value. 53 | /// 54 | /// The return value is either `Cow<'static, str>` if `B = String`, or 55 | /// `Cow<'a, str>` if `B = &'a str`. 56 | pub fn into_value(self) -> B::Cow { 57 | let inner_range = self.inner_range(); 58 | let Self { raw, value, .. } = self; 59 | value.map(B::Cow::from).unwrap_or_else(|| raw.cut(inner_range).into_cow()) 60 | } 61 | 62 | /// The optional suffix. Returns `""` if the suffix is empty/does not exist. 63 | pub fn suffix(&self) -> &str { 64 | &(*self.raw)[self.start_suffix..] 65 | } 66 | 67 | /// Returns whether this literal is a raw string literal (starting with 68 | /// `r`). 69 | pub fn is_raw_string(&self) -> bool { 70 | self.num_hashes.is_some() 71 | } 72 | 73 | /// Returns the raw input that was passed to `parse`. 74 | pub fn raw_input(&self) -> &str { 75 | &self.raw 76 | } 77 | 78 | /// Returns the raw input that was passed to `parse`, potentially owned. 
79 | pub fn into_raw_input(self) -> B { 80 | self.raw 81 | } 82 | 83 | /// The range within `self.raw` that excludes the quotes and potential `r#`. 84 | fn inner_range(&self) -> Range { 85 | match self.num_hashes { 86 | None => 1..self.start_suffix - 1, 87 | Some(n) => 1 + n as usize + 1..self.start_suffix - n as usize - 1, 88 | } 89 | } 90 | } 91 | 92 | impl StringLit<&str> { 93 | /// Makes a copy of the underlying buffer and returns the owned version of 94 | /// `Self`. 95 | pub fn into_owned(self) -> StringLit { 96 | StringLit { 97 | raw: self.raw.to_owned(), 98 | value: self.value, 99 | num_hashes: self.num_hashes, 100 | start_suffix: self.start_suffix, 101 | } 102 | } 103 | } 104 | 105 | impl fmt::Display for StringLit { 106 | fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { 107 | f.pad(&self.raw) 108 | } 109 | } 110 | 111 | /// Precondition: input has to start with either `"` or `r`. 112 | #[inline(never)] 113 | pub(crate) fn parse_impl(input: &str) -> Result<(Option, Option, usize), ParseError> { 114 | if input.starts_with('r') { 115 | scan_raw_string::(&input, 1) 116 | .map(|(v, hashes, start_suffix)| (v, Some(hashes), start_suffix)) 117 | } else { 118 | unescape_string::(&input, 1) 119 | .map(|(v, start_suffix)| (v, None, start_suffix)) 120 | } 121 | } 122 | 123 | 124 | #[cfg(test)] 125 | mod tests; 126 | -------------------------------------------------------------------------------- /src/string/tests.rs: -------------------------------------------------------------------------------- 1 | use crate::{Literal, StringLit, test_util::{assert_parse_ok_eq, assert_roundtrip}}; 2 | 3 | // ===== Utility functions ======================================================================= 4 | 5 | macro_rules! 
check { 6 | ($lit:literal, $has_escapes:expr, $num_hashes:expr) => { 7 | check!($lit, stringify!($lit), $has_escapes, $num_hashes, "") 8 | }; 9 | ($lit:literal, $input:expr, $has_escapes:expr, $num_hashes:expr, $suffix:literal) => { 10 | let input = $input; 11 | let expected = StringLit { 12 | raw: input, 13 | value: if $has_escapes { Some($lit.to_string()) } else { None }, 14 | num_hashes: $num_hashes, 15 | start_suffix: input.len() - $suffix.len(), 16 | }; 17 | 18 | assert_parse_ok_eq(input, StringLit::parse(input), expected.clone(), "StringLit::parse"); 19 | assert_parse_ok_eq( 20 | input, Literal::parse(input), Literal::String(expected.clone()), "Literal::parse"); 21 | let lit = StringLit::parse(input).unwrap(); 22 | assert_eq!(lit.value(), $lit); 23 | assert_eq!(lit.suffix(), $suffix); 24 | assert_eq!(lit.into_value(), $lit); 25 | assert_roundtrip(expected.into_owned(), input); 26 | }; 27 | } 28 | 29 | 30 | // ===== Actual tests ============================================================================ 31 | 32 | #[test] 33 | fn simple() { 34 | check!("", false, None); 35 | check!("a", false, None); 36 | check!("peter", false, None); 37 | check!("Sei gegrüßt, Bärthelt!", false, None); 38 | check!("أنا لا أتحدث العربية", false, None); 39 | check!("お前はもう死んでいる", false, None); 40 | check!("Пушки - интересные музыкальные инструменты", false, None); 41 | check!("lit 👌 😂 af", false, None); 42 | } 43 | 44 | #[test] 45 | fn special_whitespace() { 46 | let strings = ["\n", "\t", "foo\tbar", "🦊\n"]; 47 | 48 | for &s in &strings { 49 | let input = format!(r#""{}""#, s); 50 | let input_raw = format!(r#"r"{}""#, s); 51 | for (input, num_hashes) in vec![(input, None), (input_raw, Some(0))] { 52 | let expected = StringLit { 53 | raw: &*input, 54 | value: None, 55 | num_hashes, 56 | start_suffix: input.len(), 57 | }; 58 | assert_parse_ok_eq( 59 | &input, StringLit::parse(&*input), expected.clone(), "StringLit::parse"); 60 | assert_parse_ok_eq( 61 | &input, 
Literal::parse(&*input), Literal::String(expected), "Literal::parse"); 62 | assert_eq!(StringLit::parse(&*input).unwrap().value(), s); 63 | assert_eq!(StringLit::parse(&*input).unwrap().into_value(), s); 64 | } 65 | } 66 | } 67 | 68 | #[test] 69 | fn simple_escapes() { 70 | check!("a\nb", true, None); 71 | check!("\nb", true, None); 72 | check!("a\n", true, None); 73 | check!("\n", true, None); 74 | 75 | check!("\x60犬 \t 猫\r馬\n うさぎ \0ネズミ", true, None); 76 | check!("నా \\పిల్లి లావుగా ఉంది", true, None); 77 | check!("నా \\పిల్లి లావుగా 🐈\"ఉంది", true, None); 78 | check!("\\నా\\ పిల్లి లావుగా\" ఉంది\"", true, None); 79 | check!("\"నా \\🐈 పిల్లి లావుగా \" ఉంది\\", true, None); 80 | 81 | check!("\x00", true, None); 82 | check!(" \x01", true, None); 83 | check!("\x0c 🦊", true, None); 84 | check!(" 🦊\x0D ", true, None); 85 | check!("\\x13", true, None); 86 | check!("\"x30", true, None); 87 | } 88 | 89 | #[test] 90 | fn unicode_escapes() { 91 | check!("\u{0}", true, None); 92 | check!(" \u{00}", true, None); 93 | check!("\u{b} ", true, None); 94 | check!(" \u{B} ", true, None); 95 | check!("\u{7e}", true, None); 96 | check!("నక్క\u{E4}", true, None); 97 | check!("\u{e4} నక్క", true, None); 98 | check!(" \u{fc}నక్క ", true, None); 99 | check!("\u{Fc}", true, None); 100 | check!("\u{fC}🦊\nлиса", true, None); 101 | check!("лиса\u{FC}", true, None); 102 | check!("лиса\u{b10}నక్క🦊", true, None); 103 | check!("\"నక్క\u{B10}", true, None); 104 | check!("лиса\\\u{0b10}", true, None); 105 | check!("ли🦊са\\\"\u{0b10}", true, None); 106 | check!("నక్క\\\\u{0b10}", true, None); 107 | check!("\u{2764}Füchsin", true, None); 108 | check!("Füchse \u{1f602}", true, None); 109 | check!("cd\u{1F602}ab", true, None); 110 | 111 | check!("\u{0}🦊", true, None); 112 | check!("лиса\u{0__}", true, None); 113 | check!("\\🦊\u{3_b}", true, None); 114 | check!("🦊\u{1_F_6_0_2}Füchsin", true, None); 115 | check!("నక్క\\\u{1_F6_02_____}నక్క", true, None); 116 | } 117 | 118 | #[test] 119 | fn 
string_continue() { 120 | check!("నక్క\ 121 | bar", true, None); 122 | check!("foo\ 123 | 🦊", true, None); 124 | 125 | check!("foo\ 126 | 127 | banana", true, None); 128 | 129 | // Weird whitespace characters 130 | let lit = StringLit::parse("\"foo\\\n\r\t\n \n\tbar\"").expect("failed to parse"); 131 | assert_eq!(lit.value(), "foobar"); 132 | let lit = StringLit::parse("\"foo\\\n\u{85}bar\"").expect("failed to parse"); 133 | assert_eq!(lit.value(), "foo\u{85}bar"); 134 | let lit = StringLit::parse("\"foo\\\n\u{a0}bar\"").expect("failed to parse"); 135 | assert_eq!(lit.value(), "foo\u{a0}bar"); 136 | 137 | // Raw strings do not handle "string continues" 138 | check!(r"foo\ 139 | bar", false, Some(0)); 140 | } 141 | 142 | #[test] 143 | fn crlf_newlines() { 144 | let lit = StringLit::parse("\"foo\r\nbar\"").expect("failed to parse"); 145 | assert_eq!(lit.value(), "foo\nbar"); 146 | 147 | let lit = StringLit::parse("\"\r\nbar\"").expect("failed to parse"); 148 | assert_eq!(lit.value(), "\nbar"); 149 | 150 | let lit = StringLit::parse("\"лиса\r\n\"").expect("failed to parse"); 151 | assert_eq!(lit.value(), "лиса\n"); 152 | 153 | let lit = StringLit::parse("r\"foo\r\nbar\"").expect("failed to parse"); 154 | assert_eq!(lit.value(), "foo\nbar"); 155 | 156 | let lit = StringLit::parse("r#\"\r\nbar\"#").expect("failed to parse"); 157 | assert_eq!(lit.value(), "\nbar"); 158 | 159 | let lit = StringLit::parse("r##\"лиса\r\n\"##").expect("failed to parse"); 160 | assert_eq!(lit.value(), "лиса\n"); 161 | } 162 | 163 | #[test] 164 | fn raw_string() { 165 | check!(r"", false, Some(0)); 166 | check!(r"a", false, Some(0)); 167 | check!(r"peter", false, Some(0)); 168 | check!(r"Sei gegrüßt, Bärthelt!", false, Some(0)); 169 | check!(r"أنا لا أتحدث العربية", false, Some(0)); 170 | check!(r"お前はもう死んでいる", false, Some(0)); 171 | check!(r"Пушки - интересные музыкальные инструменты", false, Some(0)); 172 | check!(r"lit 👌 😂 af", false, Some(0)); 173 | 174 | check!(r#""#, false, Some(1)); 175 
| check!(r#"a"#, false, Some(1)); 176 | check!(r##"peter"##, false, Some(2)); 177 | check!(r###"Sei gegrüßt, Bärthelt!"###, false, Some(3)); 178 | check!(r########"lit 👌 😂 af"########, false, Some(8)); 179 | 180 | check!(r#"foo " bar"#, false, Some(1)); 181 | check!(r##"foo " bar"##, false, Some(2)); 182 | check!(r#"foo """" '"'" bar"#, false, Some(1)); 183 | check!(r#""foo""#, false, Some(1)); 184 | check!(r###""foo'"###, false, Some(3)); 185 | check!(r#""x'#_#s'"#, false, Some(1)); 186 | check!(r"#", false, Some(0)); 187 | check!(r"foo#", false, Some(0)); 188 | check!(r"##bar", false, Some(0)); 189 | check!(r###""##foo"##bar'"###, false, Some(3)); 190 | 191 | check!(r"さび\n\t\r\0\\x60\u{123}フェリス", false, Some(0)); 192 | check!(r#"さび\n\t\r\0\\x60\u{123}フェリス"#, false, Some(1)); 193 | } 194 | 195 | #[test] 196 | fn suffixes() { 197 | check!("hello", r###""hello"suffix"###, false, None, "suffix"); 198 | check!(r"お前はもう死んでいる", r###"r"お前はもう死んでいる"_banana"###, false, Some(0), "_banana"); 199 | check!("fox", r#""fox"peter"#, false, None, "peter"); 200 | check!("🦊", r#""🦊"peter"#, false, None, "peter"); 201 | check!("నక్క\\\\u{0b10}", r###""నక్క\\\\u{0b10}"jü_rgen"###, true, None, "jü_rgen"); 202 | } 203 | 204 | #[test] 205 | fn parse_err() { 206 | assert_err!(StringLit, r#"""#, UnterminatedString, None); 207 | assert_err!(StringLit, r#""犬"#, UnterminatedString, None); 208 | assert_err!(StringLit, r#""Jürgen"#, UnterminatedString, None); 209 | assert_err!(StringLit, r#""foo bar baz"#, UnterminatedString, None); 210 | 211 | assert_err!(StringLit, r#""fox"peter""#, InvalidSuffix, 5); 212 | assert_err!(StringLit, r###"r#"foo "# bar"#"###, UnexpectedChar, 9); 213 | 214 | assert_err!(StringLit, "\"\r\"", IsolatedCr, 1); 215 | assert_err!(StringLit, "\"fo\rx\"", IsolatedCr, 3); 216 | assert_err!(StringLit, "r\"\r\"", IsolatedCr, 2); 217 | assert_err!(StringLit, "r\"fo\rx\"", IsolatedCr, 4); 218 | 219 | assert_err!(StringLit, r##"r####""##, UnterminatedRawString, None); 220 | 
assert_err!(StringLit, r#####"r##"foo"#bar"#####, UnterminatedRawString, None); 221 | assert_err!(StringLit, r##"r####"##, InvalidLiteral, None); 222 | assert_err!(StringLit, r##"r####x"##, InvalidLiteral, None); 223 | } 224 | 225 | #[test] 226 | fn invald_ascii_escapes() { 227 | assert_err!(StringLit, r#""\x80""#, NonAsciiXEscape, 1..5); 228 | assert_err!(StringLit, r#""🦊\x81""#, NonAsciiXEscape, 5..9); 229 | assert_err!(StringLit, r#"" \x8a""#, NonAsciiXEscape, 2..6); 230 | assert_err!(StringLit, r#""\x8Ff""#, NonAsciiXEscape, 1..5); 231 | assert_err!(StringLit, r#""\xa0 ""#, NonAsciiXEscape, 1..5); 232 | assert_err!(StringLit, r#""నక్క\xB0""#, NonAsciiXEscape, 13..17); 233 | assert_err!(StringLit, r#""\xc3నక్క""#, NonAsciiXEscape, 1..5); 234 | assert_err!(StringLit, r#""\xDf🦊""#, NonAsciiXEscape, 1..5); 235 | assert_err!(StringLit, r#""నక్క\xffనక్క""#, NonAsciiXEscape, 13..17); 236 | assert_err!(StringLit, r#""\xfF ""#, NonAsciiXEscape, 1..5); 237 | assert_err!(StringLit, r#"" \xFf""#, NonAsciiXEscape, 2..6); 238 | assert_err!(StringLit, r#""నక్క \xFF""#, NonAsciiXEscape, 15..19); 239 | } 240 | 241 | #[test] 242 | fn invalid_escapes() { 243 | assert_err!(StringLit, r#""\a""#, UnknownEscape, 1..3); 244 | assert_err!(StringLit, r#""foo\y""#, UnknownEscape, 4..6); 245 | assert_err!(StringLit, r#""\"#, UnterminatedEscape, 1); 246 | assert_err!(StringLit, r#""\x""#, UnterminatedEscape, 1..3); 247 | assert_err!(StringLit, r#""🦊\x1""#, UnterminatedEscape, 5..8); 248 | assert_err!(StringLit, r#"" \xaj""#, InvalidXEscape, 2..6); 249 | assert_err!(StringLit, r#""నక్క\xjb""#, InvalidXEscape, 13..17); 250 | } 251 | 252 | #[test] 253 | fn invalid_unicode_escapes() { 254 | assert_err!(StringLit, r#""\u""#, UnicodeEscapeWithoutBrace, 1..3); 255 | assert_err!(StringLit, r#""🦊\u ""#, UnicodeEscapeWithoutBrace, 5..7); 256 | assert_err!(StringLit, r#""\u3""#, UnicodeEscapeWithoutBrace, 1..3); 257 | 258 | assert_err!(StringLit, r#""\u{""#, UnterminatedUnicodeEscape, 1..4); 259 | 
assert_err!(StringLit, r#""\u{12""#, UnterminatedUnicodeEscape, 1..6); 260 | assert_err!(StringLit, r#""🦊\u{a0b""#, UnterminatedUnicodeEscape, 5..11); 261 | assert_err!(StringLit, r#""\u{a0_b ""#, UnterminatedUnicodeEscape, 1..10); 262 | 263 | assert_err!(StringLit, r#""\u{_}నక్క""#, InvalidStartOfUnicodeEscape, 4); 264 | assert_err!(StringLit, r#""\u{_5f}""#, InvalidStartOfUnicodeEscape, 4); 265 | 266 | assert_err!(StringLit, r#""fox\u{x}""#, NonHexDigitInUnicodeEscape, 7); 267 | assert_err!(StringLit, r#""\u{0x}🦊""#, NonHexDigitInUnicodeEscape, 5); 268 | assert_err!(StringLit, r#""నక్క\u{3bx}""#, NonHexDigitInUnicodeEscape, 18); 269 | assert_err!(StringLit, r#""\u{3b_x}лиса""#, NonHexDigitInUnicodeEscape, 7); 270 | assert_err!(StringLit, r#""\u{4x_}""#, NonHexDigitInUnicodeEscape, 5); 271 | 272 | assert_err!(StringLit, r#""\u{1234567}""#, TooManyDigitInUnicodeEscape, 10); 273 | assert_err!(StringLit, r#""నక్క\u{1234567}🦊""#, TooManyDigitInUnicodeEscape, 22); 274 | assert_err!(StringLit, r#""నక్క\u{1_23_4_56_7}""#, TooManyDigitInUnicodeEscape, 26); 275 | assert_err!(StringLit, r#""\u{abcdef123}лиса""#, TooManyDigitInUnicodeEscape, 10); 276 | 277 | assert_err!(StringLit, r#""\u{110000}fox""#, InvalidUnicodeEscapeChar, 1..10); 278 | } 279 | -------------------------------------------------------------------------------- /src/test_util.rs: -------------------------------------------------------------------------------- 1 | use crate::*; 2 | use std::fmt::{Debug, Display}; 3 | 4 | 5 | #[track_caller] 6 | pub(crate) fn assert_parse_ok_eq( 7 | input: &str, 8 | result: Result, 9 | expected: T, 10 | parse_method: &str, 11 | ) { 12 | match result { 13 | Ok(actual) if actual == expected => { 14 | if actual.to_string() != input { 15 | panic!( 16 | "formatting does not yield original input `{}`: {:?}", 17 | input, 18 | actual, 19 | ); 20 | } 21 | } 22 | Ok(actual) => { 23 | panic!( 24 | "unexpected parsing result (with `{}`) for `{}`:\nactual: {:?}\nexpected: {:?}", 25 | 
parse_method, 26 | input, 27 | actual, 28 | expected, 29 | ); 30 | } 31 | Err(e) => { 32 | panic!( 33 | "expected `{}` to be parsed (with `{}`) successfully, but it failed: {:?}", 34 | input, 35 | parse_method, 36 | e, 37 | ); 38 | } 39 | } 40 | } 41 | 42 | // This is not ideal, but to perform this check we need `proc-macro2`. So we 43 | // just don't do anything if that feature is not enabled. 44 | #[cfg(not(feature = "proc-macro2"))] 45 | pub(crate) fn assert_roundtrip(_: T, _: &str) {} 46 | 47 | #[cfg(feature = "proc-macro2")] 48 | #[track_caller] 49 | pub(crate) fn assert_roundtrip(ours: T, input: &str) 50 | where 51 | T: std::convert::TryFrom + fmt::Debug + PartialEq + Clone, 52 | proc_macro2::Literal: From, 53 | >::Error: std::fmt::Display, 54 | { 55 | let pm_lit = input.parse::() 56 | .expect("failed to parse input as proc_macro2::Literal"); 57 | let t_name = std::any::type_name::(); 58 | 59 | // Unfortunately, `proc_macro2::Literal` does not implement `PartialEq`, so 60 | // this is the next best thing. 61 | if proc_macro2::Literal::from(ours.clone()).to_string() != pm_lit.to_string() { 62 | panic!( 63 | "Converting {} to proc_macro2::Literal has unexpected result:\ 64 | \nconverted: {:?}\nexpected: {:?}", 65 | t_name, 66 | proc_macro2::Literal::from(ours), 67 | pm_lit, 68 | ); 69 | } 70 | 71 | match T::try_from(pm_lit) { 72 | Err(e) => { 73 | panic!("Trying to convert proc_macro2::Literal to {} results in error: {}", t_name, e); 74 | } 75 | Ok(res) => { 76 | if res != ours { 77 | panic!( 78 | "Converting proc_macro2::Literal to {} has unexpected result:\ 79 | \nactual: {:?}\nexpected: {:?}", 80 | t_name, 81 | res, 82 | ours, 83 | ); 84 | } 85 | } 86 | } 87 | } 88 | 89 | macro_rules! assert_err { 90 | ($ty:ident, $input:literal, $kind:ident, $( $span:tt )+ ) => { 91 | assert_err_single!($ty::parse($input), $kind, $($span)+); 92 | assert_err_single!($crate::Literal::parse($input), $kind, $($span)+); 93 | }; 94 | } 95 | 96 | macro_rules! 
assert_err_single { 97 | ($expr:expr, $kind:ident, $( $span:tt )+ ) => { 98 | let res = $expr; 99 | let err = match res { 100 | Err(e) => e, 101 | Ok(v) => panic!( 102 | "Expected `{}` to return an error, but it returned Ok({:?})", 103 | stringify!($expr), 104 | v, 105 | ), 106 | }; 107 | if err.kind != $crate::err::ParseErrorKind::$kind { 108 | panic!( 109 | "Expected error kind {} for `{}` but got {:?}", 110 | stringify!($kind), 111 | stringify!($expr), 112 | err.kind, 113 | ) 114 | } 115 | let expected_span = assert_err_single!(@span $($span)+); 116 | if err.span != expected_span { 117 | panic!( 118 | "Expected error span {:?} for `{}` but got {:?}", 119 | expected_span, 120 | stringify!($expr), 121 | err.span, 122 | ) 123 | } 124 | }; 125 | (@span $start:literal .. $end:literal) => { Some($start .. $end) }; 126 | (@span $at:literal) => { Some($at.. $at + 1) }; 127 | (@span None) => { None }; 128 | } 129 | -------------------------------------------------------------------------------- /src/tests.rs: -------------------------------------------------------------------------------- 1 | use crate::Literal; 2 | 3 | 4 | #[test] 5 | fn empty() { 6 | assert_err!(Literal, "", Empty, None); 7 | } 8 | 9 | #[test] 10 | fn invalid_literals() { 11 | assert_err_single!(Literal::parse("."), InvalidLiteral, None); 12 | assert_err_single!(Literal::parse("+"), InvalidLiteral, None); 13 | assert_err_single!(Literal::parse("-"), InvalidLiteral, None); 14 | assert_err_single!(Literal::parse("e"), InvalidLiteral, None); 15 | assert_err_single!(Literal::parse("e8"), InvalidLiteral, None); 16 | assert_err_single!(Literal::parse("f32"), InvalidLiteral, None); 17 | assert_err_single!(Literal::parse("foo"), InvalidLiteral, None); 18 | assert_err_single!(Literal::parse("inf"), InvalidLiteral, None); 19 | assert_err_single!(Literal::parse("nan"), InvalidLiteral, None); 20 | assert_err_single!(Literal::parse("NaN"), InvalidLiteral, None); 21 | assert_err_single!(Literal::parse("NAN"), 
InvalidLiteral, None); 22 | assert_err_single!(Literal::parse("_2.7"), InvalidLiteral, None); 23 | assert_err_single!(Literal::parse(".5"), InvalidLiteral, None); 24 | } 25 | 26 | #[test] 27 | fn misc() { 28 | assert_err_single!(Literal::parse("0x44.5"), UnexpectedChar, 4..6); 29 | assert_err_single!(Literal::parse("a"), InvalidLiteral, None); 30 | assert_err_single!(Literal::parse(";"), InvalidLiteral, None); 31 | assert_err_single!(Literal::parse("0;"), UnexpectedChar, 1); 32 | assert_err_single!(Literal::parse(" 0"), InvalidLiteral, None); 33 | assert_err_single!(Literal::parse("0 "), UnexpectedChar, 1); 34 | assert_err_single!(Literal::parse("_"), InvalidLiteral, None); 35 | assert_err_single!(Literal::parse("_3"), InvalidLiteral, None); 36 | assert_err_single!(Literal::parse("a_123"), InvalidLiteral, None); 37 | assert_err_single!(Literal::parse("B_123"), InvalidLiteral, None); 38 | } 39 | 40 | macro_rules! assert_no_panic { 41 | ($input:expr) => { 42 | let arr = $input; 43 | let input = std::str::from_utf8(&arr).expect("not unicode"); 44 | let res = std::panic::catch_unwind(move || { 45 | let _ = Literal::parse(input); 46 | let _ = crate::BoolLit::parse(input); 47 | let _ = crate::IntegerLit::parse(input); 48 | let _ = crate::FloatLit::parse(input); 49 | let _ = crate::CharLit::parse(input); 50 | let _ = crate::StringLit::parse(input); 51 | let _ = crate::ByteLit::parse(input); 52 | let _ = crate::ByteStringLit::parse(input); 53 | }); 54 | 55 | if let Err(e) = res { 56 | println!("\n!!! panic for: {:?}", input); 57 | std::panic::resume_unwind(e); 58 | } 59 | }; 60 | } 61 | 62 | #[test] 63 | #[ignore] 64 | fn never_panic_up_to_3() { 65 | for a in 0..128 { 66 | assert_no_panic!([a]); 67 | for b in 0..128 { 68 | assert_no_panic!([a, b]); 69 | for c in 0..128 { 70 | assert_no_panic!([a, b, c]); 71 | } 72 | } 73 | } 74 | } 75 | 76 | // This test takes super long in debug mode, but in release mode it's fine. 
77 | #[test] 78 | #[ignore] 79 | fn never_panic_len_4() { 80 | for a in 0..128 { 81 | for b in 0..128 { 82 | for c in 0..128 { 83 | for d in 0..128 { 84 | assert_no_panic!([a, b, c, d]); 85 | } 86 | } 87 | } 88 | } 89 | } 90 | 91 | #[cfg(feature = "proc-macro2")] 92 | #[test] 93 | fn proc_macro() { 94 | use std::convert::TryFrom; 95 | use proc_macro2::{ 96 | self as pm2, TokenTree, Group, TokenStream, Delimiter, Spacing, Punct, Span, Ident, 97 | }; 98 | use crate::{ 99 | BoolLit, ByteLit, ByteStringLit, CharLit, FloatLit, IntegerLit, StringLit, err::TokenKind 100 | }; 101 | 102 | 103 | macro_rules! assert_invalid_token { 104 | ($input:expr, expected: $expected:path, actual: $actual:path $(,)?) => { 105 | let err = $input.unwrap_err(); 106 | if err.expected != $expected { 107 | panic!( 108 | "err.expected was expected to be {:?}, but is {:?}", 109 | $expected, 110 | err.expected, 111 | ); 112 | } 113 | if err.actual != $actual { 114 | panic!("err.actual was expected to be {:?}, but is {:?}", $actual, err.actual); 115 | } 116 | }; 117 | } 118 | 119 | 120 | let pm_u16_lit = pm2::Literal::u16_suffixed(2700); 121 | let pm_i16_lit = pm2::Literal::i16_unsuffixed(3912); 122 | let pm_f32_lit = pm2::Literal::f32_unsuffixed(3.14); 123 | let pm_f64_lit = pm2::Literal::f64_suffixed(99.3); 124 | let pm_string_lit = pm2::Literal::string("hello 🦊"); 125 | let pm_bytestr_lit = pm2::Literal::byte_string(b"hello \nfoxxo"); 126 | let pm_char_lit = pm2::Literal::character('🦀'); 127 | 128 | let u16_lit = Literal::parse("2700u16".to_string()).unwrap(); 129 | let i16_lit = Literal::parse("3912".to_string()).unwrap(); 130 | let f32_lit = Literal::parse("3.14".to_string()).unwrap(); 131 | let f64_lit = Literal::parse("99.3f64".to_string()).unwrap(); 132 | let string_lit = Literal::parse(r#""hello 🦊""#.to_string()).unwrap(); 133 | let bytestr_lit = Literal::parse(r#"b"hello \nfoxxo""#.to_string()).unwrap(); 134 | let char_lit = Literal::parse("'🦀'".to_string()).unwrap(); 135 | 136 | 
assert_eq!(Literal::from(&pm_u16_lit), u16_lit); 137 | assert_eq!(Literal::from(&pm_i16_lit), i16_lit); 138 | assert_eq!(Literal::from(&pm_f32_lit), f32_lit); 139 | assert_eq!(Literal::from(&pm_f64_lit), f64_lit); 140 | assert_eq!(Literal::from(&pm_string_lit), string_lit); 141 | assert_eq!(Literal::from(&pm_bytestr_lit), bytestr_lit); 142 | assert_eq!(Literal::from(&pm_char_lit), char_lit); 143 | 144 | 145 | let group = TokenTree::from(Group::new(Delimiter::Brace, TokenStream::new())); 146 | let punct = TokenTree::from(Punct::new(':', Spacing::Alone)); 147 | let ident = TokenTree::from(Ident::new("peter", Span::call_site())); 148 | 149 | assert_eq!( 150 | Literal::try_from(TokenTree::Literal(pm2::Literal::string("hello 🦊"))).unwrap(), 151 | Literal::String(StringLit::parse(r#""hello 🦊""#.to_string()).unwrap()), 152 | ); 153 | assert_invalid_token!( 154 | Literal::try_from(punct.clone()), 155 | expected: TokenKind::Literal, 156 | actual: TokenKind::Punct, 157 | ); 158 | assert_invalid_token!( 159 | Literal::try_from(group.clone()), 160 | expected: TokenKind::Literal, 161 | actual: TokenKind::Group, 162 | ); 163 | assert_invalid_token!( 164 | Literal::try_from(ident.clone()), 165 | expected: TokenKind::Literal, 166 | actual: TokenKind::Ident, 167 | ); 168 | 169 | 170 | assert_eq!(Literal::from(IntegerLit::try_from(pm_u16_lit.clone()).unwrap()), u16_lit); 171 | assert_eq!(Literal::from(IntegerLit::try_from(pm_i16_lit.clone()).unwrap()), i16_lit); 172 | assert_eq!(Literal::from(FloatLit::try_from(pm_f32_lit.clone()).unwrap()), f32_lit); 173 | assert_eq!(Literal::from(FloatLit::try_from(pm_f64_lit.clone()).unwrap()), f64_lit); 174 | assert_eq!(Literal::from(StringLit::try_from(pm_string_lit.clone()).unwrap()), string_lit); 175 | assert_eq!( 176 | Literal::from(ByteStringLit::try_from(pm_bytestr_lit.clone()).unwrap()), 177 | bytestr_lit, 178 | ); 179 | assert_eq!(Literal::from(CharLit::try_from(pm_char_lit.clone()).unwrap()), char_lit); 180 | 181 | 
assert_invalid_token!( 182 | StringLit::try_from(pm_u16_lit.clone()), 183 | expected: TokenKind::StringLit, 184 | actual: TokenKind::IntegerLit, 185 | ); 186 | assert_invalid_token!( 187 | StringLit::try_from(pm_f32_lit.clone()), 188 | expected: TokenKind::StringLit, 189 | actual: TokenKind::FloatLit, 190 | ); 191 | assert_invalid_token!( 192 | ByteLit::try_from(pm_bytestr_lit.clone()), 193 | expected: TokenKind::ByteLit, 194 | actual: TokenKind::ByteStringLit, 195 | ); 196 | assert_invalid_token!( 197 | ByteLit::try_from(pm_i16_lit.clone()), 198 | expected: TokenKind::ByteLit, 199 | actual: TokenKind::IntegerLit, 200 | ); 201 | assert_invalid_token!( 202 | IntegerLit::try_from(pm_string_lit.clone()), 203 | expected: TokenKind::IntegerLit, 204 | actual: TokenKind::StringLit, 205 | ); 206 | assert_invalid_token!( 207 | IntegerLit::try_from(pm_char_lit.clone()), 208 | expected: TokenKind::IntegerLit, 209 | actual: TokenKind::CharLit, 210 | ); 211 | 212 | 213 | assert_eq!( 214 | Literal::from(IntegerLit::try_from(TokenTree::from(pm_u16_lit.clone())).unwrap()), 215 | u16_lit, 216 | ); 217 | assert_eq!( 218 | Literal::from(IntegerLit::try_from(TokenTree::from(pm_i16_lit.clone())).unwrap()), 219 | i16_lit, 220 | ); 221 | assert_eq!( 222 | Literal::from(FloatLit::try_from(TokenTree::from(pm_f32_lit.clone())).unwrap()), 223 | f32_lit, 224 | ); 225 | assert_eq!( 226 | Literal::from(FloatLit::try_from(TokenTree::from(pm_f64_lit.clone())).unwrap()), 227 | f64_lit, 228 | ); 229 | assert_eq!( 230 | Literal::from(StringLit::try_from(TokenTree::from(pm_string_lit.clone())).unwrap()), 231 | string_lit, 232 | ); 233 | assert_eq!( 234 | Literal::from(ByteStringLit::try_from(TokenTree::from(pm_bytestr_lit.clone())).unwrap()), 235 | bytestr_lit, 236 | ); 237 | assert_eq!( 238 | Literal::from(CharLit::try_from(TokenTree::from(pm_char_lit.clone())).unwrap()), 239 | char_lit, 240 | ); 241 | 242 | assert_invalid_token!( 243 | StringLit::try_from(TokenTree::from(pm_u16_lit.clone())), 244 | 
expected: TokenKind::StringLit, 245 | actual: TokenKind::IntegerLit, 246 | ); 247 | assert_invalid_token!( 248 | StringLit::try_from(TokenTree::from(pm_f32_lit.clone())), 249 | expected: TokenKind::StringLit, 250 | actual: TokenKind::FloatLit, 251 | ); 252 | assert_invalid_token!( 253 | BoolLit::try_from(TokenTree::from(pm_bytestr_lit.clone())), 254 | expected: TokenKind::BoolLit, 255 | actual: TokenKind::ByteStringLit, 256 | ); 257 | assert_invalid_token!( 258 | BoolLit::try_from(TokenTree::from(pm_i16_lit.clone())), 259 | expected: TokenKind::BoolLit, 260 | actual: TokenKind::IntegerLit, 261 | ); 262 | assert_invalid_token!( 263 | IntegerLit::try_from(TokenTree::from(pm_string_lit.clone())), 264 | expected: TokenKind::IntegerLit, 265 | actual: TokenKind::StringLit, 266 | ); 267 | assert_invalid_token!( 268 | IntegerLit::try_from(TokenTree::from(pm_char_lit.clone())), 269 | expected: TokenKind::IntegerLit, 270 | actual: TokenKind::CharLit, 271 | ); 272 | 273 | assert_invalid_token!( 274 | StringLit::try_from(TokenTree::from(group)), 275 | expected: TokenKind::StringLit, 276 | actual: TokenKind::Group, 277 | ); 278 | assert_invalid_token!( 279 | BoolLit::try_from(TokenTree::from(punct)), 280 | expected: TokenKind::BoolLit, 281 | actual: TokenKind::Punct, 282 | ); 283 | assert_invalid_token!( 284 | FloatLit::try_from(TokenTree::from(ident)), 285 | expected: TokenKind::FloatLit, 286 | actual: TokenKind::Ident, 287 | ); 288 | } 289 | 290 | #[cfg(feature = "proc-macro2")] 291 | #[test] 292 | fn bool_try_from_tt() { 293 | use std::convert::TryFrom; 294 | use proc_macro2::{Ident, Span, TokenTree}; 295 | use crate::BoolLit; 296 | 297 | 298 | let ident = |s: &str| Ident::new(s, Span::call_site()); 299 | 300 | assert_eq!(BoolLit::try_from(TokenTree::Ident(ident("true"))).unwrap(), BoolLit::True); 301 | assert_eq!(BoolLit::try_from(TokenTree::Ident(ident("false"))).unwrap(), BoolLit::False); 302 | 303 | assert!(BoolLit::try_from(TokenTree::Ident(ident("falsex"))).is_err()); 
304 | assert!(BoolLit::try_from(TokenTree::Ident(ident("_false"))).is_err()); 305 | assert!(BoolLit::try_from(TokenTree::Ident(ident("False"))).is_err()); 306 | assert!(BoolLit::try_from(TokenTree::Ident(ident("True"))).is_err()); 307 | assert!(BoolLit::try_from(TokenTree::Ident(ident("ltrue"))).is_err()); 308 | 309 | 310 | assert_eq!( 311 | Literal::try_from(TokenTree::Ident(ident("true"))).unwrap(), 312 | Literal::Bool(BoolLit::True), 313 | ); 314 | assert_eq!( 315 | Literal::try_from(TokenTree::Ident(ident("false"))).unwrap(), 316 | Literal::Bool(BoolLit::False), 317 | ); 318 | 319 | assert!(Literal::try_from(TokenTree::Ident(ident("falsex"))).is_err()); 320 | assert!(Literal::try_from(TokenTree::Ident(ident("_false"))).is_err()); 321 | assert!(Literal::try_from(TokenTree::Ident(ident("False"))).is_err()); 322 | assert!(Literal::try_from(TokenTree::Ident(ident("True"))).is_err()); 323 | assert!(Literal::try_from(TokenTree::Ident(ident("ltrue"))).is_err()); 324 | } 325 | 326 | #[cfg(feature = "proc-macro2")] 327 | #[test] 328 | fn invalid_token_display() { 329 | use crate::{InvalidToken, err::TokenKind}; 330 | 331 | let span = crate::err::Span::Two(proc_macro2::Span::call_site()); 332 | assert_eq!( 333 | InvalidToken { 334 | actual: TokenKind::StringLit, 335 | expected: TokenKind::FloatLit, 336 | span, 337 | }.to_string(), 338 | r#"expected a float literal (e.g. `3.14`), but found a string literal (e.g. "Ferris")"#, 339 | ); 340 | 341 | assert_eq!( 342 | InvalidToken { 343 | actual: TokenKind::Punct, 344 | expected: TokenKind::Literal, 345 | span, 346 | }.to_string(), 347 | r#"expected a literal, but found a punctuation character"#, 348 | ); 349 | } 350 | --------------------------------------------------------------------------------