├── .github
│   ├── CODE_OF_CONDUCT
│   ├── CONTRIBUTING.md
│   └── workflows
│       └── ci.yaml
├── .gitignore
├── Cargo.toml
├── LICENSE-MIT
├── README.md
├── benches
│   ├── bench.rs
│   └── nfa.rs
└── src
    ├── lib.rs
    └── nfa.rs
/.github/CODE_OF_CONDUCT:
--------------------------------------------------------------------------------
1 | # Contributor Covenant Code of Conduct
2 |
3 | ## Our Pledge
4 |
5 | In the interest of fostering an open and welcoming environment, we as
6 | contributors and maintainers pledge to making participation in our project and
7 | our community a harassment-free experience for everyone, regardless of age, body
8 | size, disability, ethnicity, gender identity and expression, level of
9 | experience,
10 | education, socio-economic status, nationality, personal appearance, race,
11 | religion, or sexual identity and orientation.
12 |
13 | ## Our Standards
14 |
15 | Examples of behavior that contributes to creating a positive environment
16 | include:
17 |
18 | - Using welcoming and inclusive language
19 | - Being respectful of differing viewpoints and experiences
20 | - Gracefully accepting constructive criticism
21 | - Focusing on what is best for the community
22 | - Showing empathy towards other community members
23 |
24 | Examples of unacceptable behavior by participants include:
25 |
26 | - The use of sexualized language or imagery and unwelcome sexual attention or
27 | advances
28 | - Trolling, insulting/derogatory comments, and personal or political attacks
29 | - Public or private harassment
30 | - Publishing others' private information, such as a physical or electronic
31 | address, without explicit permission
32 | - Other conduct which could reasonably be considered inappropriate in a
33 | professional setting
34 |
35 |
36 | ## Our Responsibilities
37 |
38 | Project maintainers are responsible for clarifying the standards of acceptable
39 | behavior and are expected to take appropriate and fair corrective action in
40 | response to any instances of unacceptable behavior.
41 |
42 | Project maintainers have the right and responsibility to remove, edit, or
43 | reject comments, commits, code, wiki edits, issues, and other contributions
44 | that are not aligned to this Code of Conduct, or to ban temporarily or
45 | permanently any contributor for other behaviors that they deem inappropriate,
46 | threatening, offensive, or harmful.
47 |
48 | ## Scope
49 |
50 | This Code of Conduct applies both within project spaces and in public spaces
51 | when an individual is representing the project or its community. Examples of
52 | representing a project or community include using an official project e-mail
53 | address, posting via an official social media account, or acting as an appointed
54 | representative at an online or offline event. Representation of a project may be
55 | further defined and clarified by project maintainers.
56 |
57 | ## Enforcement
58 |
59 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
60 | reported by contacting the project team at yoshuawuyts@gmail.com, or through
61 | IRC. All complaints will be reviewed and investigated and will result in a
62 | response that is deemed necessary and appropriate to the circumstances. The
63 | project team is obligated to maintain confidentiality with regard to the
64 | reporter of an incident.
65 | Further details of specific enforcement policies may be posted separately.
66 |
67 | Project maintainers who do not follow or enforce the Code of Conduct in good
68 | faith may face temporary or permanent repercussions as determined by other
69 | members of the project's leadership.
70 |
71 | ## Attribution
72 |
73 | This Code of Conduct is adapted from the Contributor Covenant, version 1.4,
74 | available at
75 | https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
76 |
--------------------------------------------------------------------------------
/.github/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 | Contributions include code, documentation, answering user questions, running the
3 | project's infrastructure, and advocating for all types of users.
4 |
5 | The project welcomes all contributions from anyone willing to work in good faith
6 | with other contributors and the community. No contribution is too small and all
7 | contributions are valued.
8 |
9 | This guide explains the process for contributing to the project's GitHub
10 | Repository.
11 |
12 | - [Code of Conduct](#code-of-conduct)
13 | - [Bad Actors](#bad-actors)
14 |
15 | ## Code of Conduct
16 | The project has a [Code of Conduct](./CODE_OF_CONDUCT.md) that *all*
17 | contributors are expected to follow. This code describes the *minimum* behavior
18 | expectations for all contributors.
19 |
20 | As a contributor, how you choose to act and interact towards your
21 | fellow contributors, as well as to the community, will reflect back not only
22 | on yourself but on the project as a whole. The Code of Conduct is designed and
23 | intended, above all else, to help establish a culture within the project that
24 | allows anyone and everyone who wants to contribute to feel safe doing so.
25 |
26 | Should any individual act in any way that is considered in violation of the
27 | [Code of Conduct](./CODE_OF_CONDUCT.md), corrective actions will be taken. It is
28 | possible, however, for any individual to *act* in such a manner that is not in
29 | violation of the strict letter of the Code of Conduct guidelines while still
30 | going completely against the spirit of what that Code is intended to accomplish.
31 |
32 | Open, diverse, and inclusive communities live and die on the basis of trust.
33 | Contributors can disagree with one another so long as they trust that those
34 | disagreements are in good faith and everyone is working towards a common
35 | goal.
36 |
37 | ## Bad Actors
38 | All contributors tacitly agree to abide by both the letter and
39 | spirit of the [Code of Conduct](./CODE_OF_CONDUCT.md). Failure, or
40 | unwillingness, to do so will result in contributions being respectfully
41 | declined.
42 |
43 | A *bad actor* is someone who repeatedly violates the *spirit* of the Code of
44 | Conduct through consistent failure to self-regulate the way in which they
45 | interact with other contributors in the project. In doing so, bad actors
46 | alienate other contributors, discourage collaboration, and generally reflect
47 | poorly on the project as a whole.
48 |
49 | Being a bad actor may be intentional or unintentional. Typically, unintentional
50 | bad behavior can be easily corrected by being quick to apologize and correct
51 | course *even if you are not entirely convinced you need to*. Giving other
52 | contributors the benefit of the doubt and having a sincere willingness to admit
53 | that you *might* be wrong is critical for any successful open collaboration.
54 |
55 | Don't be a bad actor.
56 |
--------------------------------------------------------------------------------
/.github/workflows/ci.yaml:
--------------------------------------------------------------------------------
1 | name: CI
2 |
3 | on:
4 |   pull_request:
5 |   push:
6 |     branches:
7 |       - main
8 |       - staging
9 |       - trying
10 |
11 | env:
12 |   RUSTFLAGS: -Dwarnings
13 |
14 | jobs:
15 |   build_and_test:
16 |     name: Build and test
17 |     runs-on: ${{ matrix.os }}
18 |     strategy:
19 |       matrix:
20 |         os: [ubuntu-latest, windows-latest, macOS-latest]
21 |         rust: [stable]
22 |
23 |     steps:
24 |     - uses: actions/checkout@master
25 |
26 |     - name: Install ${{ matrix.rust }}
27 |       uses: actions-rs/toolchain@v1
28 |       with:
29 |         toolchain: ${{ matrix.rust }}
30 |         override: true
31 |
32 |     - name: check
33 |       uses: actions-rs/cargo@v1
34 |       with:
35 |         command: check
36 |         args: --all --bins --examples
37 |
38 |     - name: tests
39 |       uses: actions-rs/cargo@v1
40 |       with:
41 |         command: test
42 |         args: --all
43 |
44 |   check_fmt_and_docs:
45 |     name: Checking fmt and docs
46 |     runs-on: ubuntu-latest
47 |     steps:
48 |     - uses: actions/checkout@master
49 |     - uses: actions-rs/toolchain@v1
50 |       with:
51 |         toolchain: nightly
52 |         components: rustfmt, clippy
53 |         override: true
54 |
55 |     - name: fmt
56 |       run: cargo fmt --all -- --check
57 |
58 |     - name: Docs
59 |       run: cargo doc
60 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | /target
2 | /Cargo.lock
3 |
--------------------------------------------------------------------------------
/Cargo.toml:
--------------------------------------------------------------------------------
1 | [package]
2 | name = "route-recognizer"
3 | description = "Recognizes URL patterns with support for dynamic and wildcard segments"
4 | license = "MIT"
5 | repository = "https://github.com/rustasync/route-recognizer"
6 | keywords = ["router", "url"]
7 | edition = "2018"
8 |
9 | version = "0.3.1"
10 | authors = ["wycats", "rustasync"]
11 |
--------------------------------------------------------------------------------
/LICENSE-MIT:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2020 The http-rs contributors
4 | Copyright (c) 2019 The rustasync contributors
5 | Copyright (c) 2014 Yehuda Katz
6 |
7 | Permission is hereby granted, free of charge, to any person obtaining a copy
8 | of this software and associated documentation files (the "Software"), to deal
9 | in the Software without restriction, including without limitation the rights
10 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
11 | copies of the Software, and to permit persons to whom the Software is
12 | furnished to do so, subject to the following conditions:
13 |
14 | The above copyright notice and this permission notice shall be included in all
15 | copies or substantial portions of the Software.
16 |
17 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
18 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
19 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
20 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
21 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
22 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
23 | SOFTWARE.
24 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # route-recognizer
2 |
3 |
4 | Recognizes URL patterns with support for dynamic and wildcard segments
5 |
44 | ## Installation
45 | ```sh
46 | $ cargo add route-recognizer
47 | ```
48 |
49 | ## Safety
50 | This crate uses ``#![deny(unsafe_code)]`` to ensure everything is implemented in
51 | 100% Safe Rust.
52 |
53 | ## Contributing
54 | Want to join us? Check out our ["Contributing" guide][contributing] and take a
55 | look at some of these issues:
56 |
57 | - [Issues labeled "good first issue"][good-first-issue]
58 | - [Issues labeled "help wanted"][help-wanted]
59 |
60 | [contributing]: https://github.com/http-rs/route-recognizer/blob/master/.github/CONTRIBUTING.md
61 | [good-first-issue]: https://github.com/http-rs/route-recognizer/labels/good%20first%20issue
62 | [help-wanted]: https://github.com/http-rs/route-recognizer/labels/help%20wanted
63 |
64 | ## License
65 |
66 |
67 | Licensed under the MIT license.
68 |
69 |
70 |
71 |
72 |
73 | Unless you explicitly state otherwise, any contribution intentionally submitted
74 | for inclusion in this crate by you, as defined in the Apache-2.0 license, shall
75 | be dual licensed as MIT / Apache-2.0, without any additional terms or
76 | conditions.
77 |
78 |
--------------------------------------------------------------------------------
/benches/bench.rs:
--------------------------------------------------------------------------------
1 | #![feature(test)]
2 |
3 | extern crate route_recognizer;
4 | extern crate test;
5 |
6 | use route_recognizer::Router;
7 |
8 | #[bench]
9 | fn benchmark(b: &mut test::Bencher) {
10 | let mut router = Router::new();
11 | router.add("/posts/:post_id/comments/:id", "comment".to_string());
12 | router.add("/posts/:post_id/comments", "comments".to_string());
13 | router.add("/posts/:post_id", "post".to_string());
14 | router.add("/posts", "posts".to_string());
15 | router.add("/comments", "comments2".to_string());
16 | router.add("/comments/:id", "comment2".to_string());
17 |
18 | b.iter(|| router.recognize("/posts/100/comments/200"));
19 | }
20 |
--------------------------------------------------------------------------------
/benches/nfa.rs:
--------------------------------------------------------------------------------
1 | #![feature(test)]
2 |
3 | extern crate route_recognizer;
4 | extern crate test;
5 |
6 | use route_recognizer::nfa::CharSet;
7 | use std::collections::{BTreeSet, HashSet};
8 |
9 | #[bench]
10 | fn bench_char_set(b: &mut test::Bencher) {
11 | let mut set = CharSet::new();
12 | set.insert('p');
13 | set.insert('n');
14 | set.insert('/');
15 |
16 | b.iter(|| {
17 | assert!(set.contains('p'));
18 | assert!(set.contains('/'));
19 | assert!(!set.contains('z'));
20 | });
21 | }
22 |
23 | #[bench]
24 | fn bench_hash_set(b: &mut test::Bencher) {
25 | let mut set = HashSet::new();
26 | set.insert('p');
27 | set.insert('n');
28 | set.insert('/');
29 |
30 | b.iter(|| {
31 | assert!(set.contains(&'p'));
32 | assert!(set.contains(&'/'));
33 | assert!(!set.contains(&'z'));
34 | });
35 | }
36 |
37 | #[bench]
38 | fn bench_btree_set(b: &mut test::Bencher) {
39 | let mut set = BTreeSet::new();
40 | set.insert('p');
41 | set.insert('n');
42 | set.insert('/');
43 |
44 | b.iter(|| {
45 | assert!(set.contains(&'p'));
46 | assert!(set.contains(&'/'));
47 | assert!(!set.contains(&'z'));
48 | });
49 | }
50 |
--------------------------------------------------------------------------------
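The benchmarks above compare `CharSet` against `HashSet` and `BTreeSet` for membership checks. As a rough illustration of why a bitmask set can be cheap, here is a minimal, self-contained sketch. `AsciiSet` is a hypothetical stand-in, not the crate's type: it handles ASCII only and indexes bits by the raw code point, whereas the real `CharSet` in `src/nfa.rs` offsets the value by one and keeps non-ASCII characters in a `HashSet`.

```rust
/// Hypothetical ASCII-only set backed by two 64-bit masks.
/// A simplified stand-in for `CharSet`, not the crate's implementation.
#[derive(Default)]
struct AsciiSet {
    low: u64,  // code points 0..=63
    high: u64, // code points 64..=127
}

impl AsciiSet {
    /// Set the bit for `c`. This sketch rejects non-ASCII input.
    fn insert(&mut self, c: char) {
        let v = c as u32;
        assert!(v < 128, "sketch handles ASCII only");
        if v < 64 {
            self.low |= 1u64 << v;
        } else {
            self.high |= 1u64 << (v - 64);
        }
    }

    /// Membership is one shift and one AND, with no hashing or tree walk.
    fn contains(&self, c: char) -> bool {
        let v = c as u32;
        if v >= 128 {
            false
        } else if v < 64 {
            self.low & (1u64 << v) != 0
        } else {
            self.high & (1u64 << (v - 64)) != 0
        }
    }
}

fn main() {
    // Same characters as the benchmarks above.
    let mut set = AsciiSet::default();
    set.insert('p');
    set.insert('n');
    set.insert('/');

    assert!(set.contains('p'));
    assert!(set.contains('/'));
    assert!(!set.contains('z'));
}
```

Whether this is actually faster than the standard collections for a given workload is what the benchmarks are meant to measure; the sketch only shows the shape of the lookup.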
/src/lib.rs:
--------------------------------------------------------------------------------
1 | //! Recognizes URL patterns with support for dynamic and wildcard segments
2 | //!
3 | //! # Examples
4 | //!
5 | //! ```
6 | //! use route_recognizer::{Router, Params};
7 | //!
8 | //! let mut router = Router::new();
9 | //!
10 | //! router.add("/thomas", "Thomas".to_string());
11 | //! router.add("/tom", "Tom".to_string());
12 | //! router.add("/wycats", "Yehuda".to_string());
13 | //!
14 | //! let m = router.recognize("/thomas").unwrap();
15 | //!
16 | //! assert_eq!(m.handler().as_str(), "Thomas");
17 | //! assert_eq!(m.params(), &Params::new());
18 | //! ```
19 | //!
20 | //! # Routing params
21 | //!
22 | //! The router supports four kinds of route segments:
23 | //! - __segments__: these are of the format `/a/b`.
24 | //! - __params__: these are of the format `/a/:b`.
25 | //! - __named wildcards__: these are of the format `/a/*b`.
26 | //! - __unnamed wildcards__: these are of the format `/a/*`.
27 | //!
28 | //! The difference between a "named wildcard" and a "param" is how the
29 | //! matching rules apply. Given the route `/a/:b`, passing in `/a/bar/baz`
30 | //! will not match because `/baz` has no counterpart in the route.
31 | //!
32 | //! However, if we define the route `/a/*b` and pass in `/a/bar/baz`, we end up
33 | //! with a named param `"b"` that contains the value `"bar/baz"`. Wildcard
34 | //! routing rules are useful when you don't know which routes may follow. The
35 | //! difference between "named" and "unnamed" wildcards is that the former will
36 | //! show up in `Params`, while the latter won't.
37 |
38 | #![cfg_attr(feature = "docs", feature(doc_cfg))]
39 | #![deny(unsafe_code)]
40 | #![deny(missing_debug_implementations, nonstandard_style)]
41 | #![warn(missing_docs, unreachable_pub, future_incompatible, rust_2018_idioms)]
42 | #![doc(test(attr(deny(warnings))))]
43 | #![doc(test(attr(allow(unused_extern_crates, unused_variables))))]
44 | #![doc(html_favicon_url = "https://yoshuawuyts.com/assets/http-rs/favicon.ico")]
45 | #![doc(html_logo_url = "https://yoshuawuyts.com/assets/http-rs/logo-rounded.png")]
46 |
47 | use std::cmp::Ordering;
48 | use std::collections::{btree_map, BTreeMap};
49 | use std::ops::Index;
50 |
51 | use crate::nfa::{CharacterClass, NFA};
52 |
53 | #[doc(hidden)]
54 | pub mod nfa;
55 |
56 | #[derive(Clone, Eq, Debug)]
57 | struct Metadata {
58 | statics: u32,
59 | dynamics: u32,
60 | wildcards: u32,
61 | param_names: Vec<String>,
62 | }
63 |
64 | impl Metadata {
65 | pub(crate) fn new() -> Self {
66 | Self {
67 | statics: 0,
68 | dynamics: 0,
69 | wildcards: 0,
70 | param_names: Vec::new(),
71 | }
72 | }
73 | }
74 |
75 | impl Ord for Metadata {
76 | fn cmp(&self, other: &Self) -> Ordering {
77 | if self.statics > other.statics {
78 | Ordering::Greater
79 | } else if self.statics < other.statics {
80 | Ordering::Less
81 | } else if self.dynamics > other.dynamics {
82 | Ordering::Greater
83 | } else if self.dynamics < other.dynamics {
84 | Ordering::Less
85 | } else if self.wildcards > other.wildcards {
86 | Ordering::Greater
87 | } else if self.wildcards < other.wildcards {
88 | Ordering::Less
89 | } else {
90 | Ordering::Equal
91 | }
92 | }
93 | }
94 |
95 | impl PartialOrd for Metadata {
96 | fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
97 | Some(self.cmp(other))
98 | }
99 | }
100 |
101 | impl PartialEq for Metadata {
102 | fn eq(&self, other: &Self) -> bool {
103 | self.statics == other.statics
104 | && self.dynamics == other.dynamics
105 | && self.wildcards == other.wildcards
106 | }
107 | }
108 |
109 | /// Router parameters.
110 | #[derive(PartialEq, Clone, Debug, Default)]
111 | pub struct Params {
112 | map: BTreeMap<String, String>,
113 | }
114 |
115 | impl Params {
116 | /// Create a new instance of `Params`.
117 | pub fn new() -> Self {
118 | Self {
119 | map: BTreeMap::new(),
120 | }
121 | }
122 |
123 | /// Insert a new param into `Params`.
124 | pub fn insert(&mut self, key: String, value: String) {
125 | self.map.insert(key, value);
126 | }
127 |
128 | /// Find a param by name in `Params`.
129 | pub fn find(&self, key: &str) -> Option<&str> {
130 | self.map.get(key).map(|s| &s[..])
131 | }
132 |
133 | /// Iterate over all named params.
134 | ///
135 | /// This will return all named params and named wildcards.
136 | pub fn iter(&self) -> Iter<'_> {
137 | Iter(self.map.iter())
138 | }
139 | }
140 |
141 | impl Index<&str> for Params {
142 | type Output = String;
143 | fn index(&self, index: &str) -> &String {
144 | match self.map.get(index) {
145 | None => panic!("params[{}] did not exist", index),
146 | Some(s) => s,
147 | }
148 | }
149 | }
150 |
151 | impl<'a> IntoIterator for &'a Params {
152 | type IntoIter = Iter<'a>;
153 | type Item = (&'a str, &'a str);
154 |
155 | fn into_iter(self) -> Iter<'a> {
156 | self.iter()
157 | }
158 | }
159 |
160 | /// An iterator over `Params`.
161 | #[derive(Debug)]
162 | pub struct Iter<'a>(btree_map::Iter<'a, String, String>);
163 |
164 | impl<'a> Iterator for Iter<'a> {
165 | type Item = (&'a str, &'a str);
166 |
167 | #[inline]
168 | fn next(&mut self) -> Option<(&'a str, &'a str)> {
169 | self.0.next().map(|(k, v)| (&**k, &**v))
170 | }
171 |
172 | fn size_hint(&self) -> (usize, Option<usize>) {
173 | self.0.size_hint()
174 | }
175 | }
176 |
177 | /// The result of a successful match returned by `Router::recognize`.
178 | #[derive(Debug)]
179 | pub struct Match<T> {
180 | /// The endpoint handler.
181 | handler: T,
182 | /// The params captured for this match.
183 | params: Params,
184 | }
185 |
186 | impl<T> Match<T> {
187 | /// Create a new instance of `Match`.
188 | pub fn new(handler: T, params: Params) -> Self {
189 | Self { handler, params }
190 | }
191 |
192 | /// Get a handle to the handler.
193 | pub fn handler(&self) -> &T {
194 | &self.handler
195 | }
196 |
197 | /// Get a mutable handle to the handler.
198 | pub fn handler_mut(&mut self) -> &mut T {
199 | &mut self.handler
200 | }
201 |
202 | /// Get a handle to the params.
203 | pub fn params(&self) -> &Params {
204 | &self.params
205 | }
206 |
207 | /// Get a mutable handle to the params.
208 | pub fn params_mut(&mut self) -> &mut Params {
209 | &mut self.params
210 | }
211 | }
212 |
213 | /// Recognizes URL patterns with support for dynamic and wildcard segments.
214 | #[derive(Clone, Debug)]
215 | pub struct Router<T> {
216 | nfa: NFA<Metadata>,
217 | handlers: BTreeMap<usize, T>,
218 | }
219 |
220 | fn segments(route: &str) -> Vec<(Option<char>, &str)> {
221 | let predicate = |c| c == '.' || c == '/';
222 |
223 | let mut segments = vec![];
224 | let mut segment_start = 0;
225 |
226 | while segment_start < route.len() {
227 | let segment_end = route[segment_start + 1..]
228 | .find(predicate)
229 | .map(|i| i + segment_start + 1)
230 | .unwrap_or_else(|| route.len());
231 | let potential_sep = route.chars().nth(segment_start);
232 | let sep_and_segment = match potential_sep {
233 | Some(sep) if predicate(sep) => (Some(sep), &route[segment_start + 1..segment_end]),
234 | _ => (None, &route[segment_start..segment_end]),
235 | };
236 |
237 | segments.push(sep_and_segment);
238 | segment_start = segment_end;
239 | }
240 |
241 | segments
242 | }
243 |
244 | impl<T> Router<T> {
245 | /// Create a new instance of `Router`.
246 | pub fn new() -> Self {
247 | Self {
248 | nfa: NFA::new(),
249 | handlers: BTreeMap::new(),
250 | }
251 | }
252 |
253 | /// Add a route to the router.
254 | pub fn add(&mut self, mut route: &str, dest: T) {
255 | if !route.is_empty() && route.as_bytes()[0] == b'/' {
256 | route = &route[1..];
257 | }
258 |
259 | let nfa = &mut self.nfa;
260 | let mut state = 0;
261 | let mut metadata = Metadata::new();
262 |
263 | for (separator, segment) in segments(route) {
264 | if let Some(separator) = separator {
265 | state = nfa.put(state, CharacterClass::valid_char(separator));
266 | }
267 |
268 | if !segment.is_empty() && segment.as_bytes()[0] == b':' {
269 | state = process_dynamic_segment(nfa, state);
270 | metadata.dynamics += 1;
271 | metadata.param_names.push(segment[1..].to_string());
272 | } else if !segment.is_empty() && segment.as_bytes()[0] == b'*' {
273 | state = process_star_state(nfa, state);
274 | metadata.wildcards += 1;
275 | metadata.param_names.push(segment[1..].to_string());
276 | } else {
277 | state = process_static_segment(segment, nfa, state);
278 | metadata.statics += 1;
279 | }
280 | }
281 |
282 | nfa.acceptance(state);
283 | nfa.metadata(state, metadata);
284 | self.handlers.insert(state, dest);
285 | }
286 |
287 | /// Match a route on the router.
288 | pub fn recognize(&self, mut path: &str) -> Result<Match<&T>, String> {
289 | if !path.is_empty() && path.as_bytes()[0] == b'/' {
290 | path = &path[1..];
291 | }
292 |
293 | let nfa = &self.nfa;
294 | let result = nfa.process(path, |index| nfa.get(index).metadata.as_ref().unwrap());
295 |
296 | match result {
297 | Ok(nfa_match) => {
298 | let mut map = Params::new();
299 | let state = &nfa.get(nfa_match.state);
300 | let metadata = state.metadata.as_ref().unwrap();
301 | let param_names = metadata.param_names.clone();
302 |
303 | for (i, capture) in nfa_match.captures.iter().enumerate() {
304 | if !param_names[i].is_empty() {
305 | map.insert(param_names[i].to_string(), capture.to_string());
306 | }
307 | }
308 |
309 | let handler = self.handlers.get(&nfa_match.state).unwrap();
310 | Ok(Match::new(handler, map))
311 | }
312 | Err(str) => Err(str),
313 | }
314 | }
315 | }
316 |
317 | impl<T> Default for Router<T> {
318 | fn default() -> Self {
319 | Self::new()
320 | }
321 | }
322 |
323 | fn process_static_segment<T>(segment: &str, nfa: &mut NFA<T>, mut state: usize) -> usize {
324 | for char in segment.chars() {
325 | state = nfa.put(state, CharacterClass::valid_char(char));
326 | }
327 |
328 | state
329 | }
330 |
331 | fn process_dynamic_segment<T>(nfa: &mut NFA<T>, mut state: usize) -> usize {
332 | state = nfa.put(state, CharacterClass::invalid_char('/'));
333 | nfa.put_state(state, state);
334 | nfa.start_capture(state);
335 | nfa.end_capture(state);
336 |
337 | state
338 | }
339 |
340 | fn process_star_state<T>(nfa: &mut NFA<T>, mut state: usize) -> usize {
341 | state = nfa.put(state, CharacterClass::any());
342 | nfa.put_state(state, state);
343 | nfa.start_capture(state);
344 | nfa.end_capture(state);
345 |
346 | state
347 | }
348 |
349 | #[cfg(test)]
350 | mod tests {
351 | use super::{Params, Router};
352 |
353 | #[test]
354 | fn basic_router() {
355 | let mut router = Router::new();
356 |
357 | router.add("/thomas", "Thomas".to_string());
358 | router.add("/tom", "Tom".to_string());
359 | router.add("/wycats", "Yehuda".to_string());
360 |
361 | let m = router.recognize("/thomas").unwrap();
362 |
363 | assert_eq!(*m.handler, "Thomas".to_string());
364 | assert_eq!(m.params, Params::new());
365 | }
366 |
367 | #[test]
368 | fn root_router() {
369 | let mut router = Router::new();
370 | router.add("/", 10);
371 | assert_eq!(*router.recognize("/").unwrap().handler, 10)
372 | }
373 |
374 | #[test]
375 | fn empty_path() {
376 | let mut router = Router::new();
377 | router.add("/", 12);
378 | assert_eq!(*router.recognize("").unwrap().handler, 12)
379 | }
380 |
381 | #[test]
382 | fn empty_route() {
383 | let mut router = Router::new();
384 | router.add("", 12);
385 | assert_eq!(*router.recognize("/").unwrap().handler, 12)
386 | }
387 |
388 | #[test]
389 | fn ambiguous_router() {
390 | let mut router = Router::new();
391 |
392 | router.add("/posts/new", "new".to_string());
393 | router.add("/posts/:id", "id".to_string());
394 |
395 | let id = router.recognize("/posts/1").unwrap();
396 |
397 | assert_eq!(*id.handler, "id".to_string());
398 | assert_eq!(id.params, params("id", "1"));
399 |
400 | let new = router.recognize("/posts/new").unwrap();
401 | assert_eq!(*new.handler, "new".to_string());
402 | assert_eq!(new.params, Params::new());
403 | }
404 |
405 | #[test]
406 | fn ambiguous_router_b() {
407 | let mut router = Router::new();
408 |
409 | router.add("/posts/:id", "id".to_string());
410 | router.add("/posts/new", "new".to_string());
411 |
412 | let id = router.recognize("/posts/1").unwrap();
413 |
414 | assert_eq!(*id.handler, "id".to_string());
415 | assert_eq!(id.params, params("id", "1"));
416 |
417 | let new = router.recognize("/posts/new").unwrap();
418 | assert_eq!(*new.handler, "new".to_string());
419 | assert_eq!(new.params, Params::new());
420 | }
421 |
422 | #[test]
423 | fn multiple_params() {
424 | let mut router = Router::new();
425 |
426 | router.add("/posts/:post_id/comments/:id", "comment".to_string());
427 | router.add("/posts/:post_id/comments", "comments".to_string());
428 |
429 | let com = router.recognize("/posts/12/comments/100").unwrap();
430 | let coms = router.recognize("/posts/12/comments").unwrap();
431 |
432 | assert_eq!(*com.handler, "comment".to_string());
433 | assert_eq!(com.params, two_params("post_id", "12", "id", "100"));
434 |
435 | assert_eq!(*coms.handler, "comments".to_string());
436 | assert_eq!(coms.params, params("post_id", "12"));
437 | assert_eq!(coms.params["post_id"], "12".to_string());
438 | }
439 |
440 | #[test]
441 | fn wildcard() {
442 | let mut router = Router::new();
443 |
444 | router.add("*foo", "test".to_string());
445 | router.add("/bar/*foo", "test2".to_string());
446 |
447 | let m = router.recognize("/test").unwrap();
448 | assert_eq!(*m.handler, "test".to_string());
449 | assert_eq!(m.params, params("foo", "test"));
450 |
451 | let m = router.recognize("/foo/bar").unwrap();
452 | assert_eq!(*m.handler, "test".to_string());
453 | assert_eq!(m.params, params("foo", "foo/bar"));
454 |
455 | let m = router.recognize("/bar/foo").unwrap();
456 | assert_eq!(*m.handler, "test2".to_string());
457 | assert_eq!(m.params, params("foo", "foo"));
458 | }
459 |
460 | #[test]
461 | fn wildcard_colon() {
462 | let mut router = Router::new();
463 |
464 | router.add("/a/*b", "ab".to_string());
465 | router.add("/a/*b/c", "abc".to_string());
466 | router.add("/a/*b/c/:d", "abcd".to_string());
467 |
468 | let m = router.recognize("/a/foo").unwrap();
469 | assert_eq!(*m.handler, "ab".to_string());
470 | assert_eq!(m.params, params("b", "foo"));
471 |
472 | let m = router.recognize("/a/foo/bar").unwrap();
473 | assert_eq!(*m.handler, "ab".to_string());
474 | assert_eq!(m.params, params("b", "foo/bar"));
475 |
476 | let m = router.recognize("/a/foo/c").unwrap();
477 | assert_eq!(*m.handler, "abc".to_string());
478 | assert_eq!(m.params, params("b", "foo"));
479 |
480 | let m = router.recognize("/a/foo/bar/c").unwrap();
481 | assert_eq!(*m.handler, "abc".to_string());
482 | assert_eq!(m.params, params("b", "foo/bar"));
483 |
484 | let m = router.recognize("/a/foo/c/baz").unwrap();
485 | assert_eq!(*m.handler, "abcd".to_string());
486 | assert_eq!(m.params, two_params("b", "foo", "d", "baz"));
487 |
488 | let m = router.recognize("/a/foo/bar/c/baz").unwrap();
489 | assert_eq!(*m.handler, "abcd".to_string());
490 | assert_eq!(m.params, two_params("b", "foo/bar", "d", "baz"));
491 |
492 | let m = router.recognize("/a/foo/bar/c/baz/bay").unwrap();
493 | assert_eq!(*m.handler, "ab".to_string());
494 | assert_eq!(m.params, params("b", "foo/bar/c/baz/bay"));
495 | }
496 |
497 | #[test]
498 | fn unnamed_parameters() {
499 | let mut router = Router::new();
500 |
501 | router.add("/foo/:/bar", "test".to_string());
502 | router.add("/foo/:bar/*", "test2".to_string());
503 | let m = router.recognize("/foo/test/bar").unwrap();
504 | assert_eq!(*m.handler, "test");
505 | assert_eq!(m.params, Params::new());
506 |
507 | let m = router.recognize("/foo/test/blah").unwrap();
508 | assert_eq!(*m.handler, "test2");
509 | assert_eq!(m.params, params("bar", "test"));
510 | }
511 |
512 | fn params(key: &str, val: &str) -> Params {
513 | let mut map = Params::new();
514 | map.insert(key.to_string(), val.to_string());
515 | map
516 | }
517 |
518 | fn two_params(k1: &str, v1: &str, k2: &str, v2: &str) -> Params {
519 | let mut map = Params::new();
520 | map.insert(k1.to_string(), v1.to_string());
521 | map.insert(k2.to_string(), v2.to_string());
522 | map
523 | }
524 |
525 | #[test]
526 | fn dot() {
527 | let mut router = Router::new();
528 | router.add("/1/baz.:wibble", ());
529 | router.add("/2/:bar.baz", ());
530 | router.add("/3/:dynamic.:extension", ());
531 | router.add("/4/static.static", ());
532 |
533 | let m = router.recognize("/1/baz.jpg").unwrap();
534 | assert_eq!(m.params, params("wibble", "jpg"));
535 |
536 | let m = router.recognize("/2/test.baz").unwrap();
537 | assert_eq!(m.params, params("bar", "test"));
538 |
539 | let m = router.recognize("/3/any.thing").unwrap();
540 | assert_eq!(m.params, two_params("dynamic", "any", "extension", "thing"));
541 |
542 | let m = router.recognize("/3/this.performs.a.greedy.match").unwrap();
543 | assert_eq!(
544 | m.params,
545 | two_params("dynamic", "this.performs.a.greedy", "extension", "match")
546 | );
547 |
548 | let m = router.recognize("/4/static.static").unwrap();
549 | assert_eq!(m.params, Params::new());
550 |
551 | let m = router.recognize("/4/static/static");
552 | assert!(m.is_err());
553 |
554 | let m = router.recognize("/4.static.static");
555 | assert!(m.is_err());
556 | }
557 |
558 | #[test]
559 | fn test_chinese() {
560 | let mut router = Router::new();
561 | router.add("/crates/:foo/:bar", "Hello".to_string());
562 |
563 | let m = router.recognize("/crates/实打实打算/d's'd").unwrap();
564 | assert_eq!(m.handler().as_str(), "Hello");
565 | assert_eq!(m.params().find("foo"), Some("实打实打算"));
566 | assert_eq!(m.params().find("bar"), Some("d's'd"));
567 | }
568 | }
569 |
--------------------------------------------------------------------------------
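The `ambiguous_router` tests in `src/lib.rs` show `/posts/new` winning over `/posts/:id` regardless of insertion order. A plausible reading is that candidate matches are ranked by `Metadata`, which compares static, then dynamic, then wildcard counts. The sketch below illustrates that ranking on its own, without the NFA. `Kind`, `classify`, and `specificity` are hypothetical names introduced here, and the sketch splits on `/` only, whereas the crate's `segments` function also splits on `.`.

```rust
/// Kind of a single route segment, mirroring the prefix checks in `Router::add`.
#[derive(Debug, PartialEq)]
enum Kind {
    Static,
    Dynamic,  // segment starts with ':'
    Wildcard, // segment starts with '*'
}

/// Classify one segment by its first character.
fn classify(segment: &str) -> Kind {
    if segment.starts_with(':') {
        Kind::Dynamic
    } else if segment.starts_with('*') {
        Kind::Wildcard
    } else {
        Kind::Static
    }
}

/// Count (statics, dynamics, wildcards) for a route.
/// Hypothetical helper: splits on '/' only and ignores '.' separators.
fn specificity(route: &str) -> (u32, u32, u32) {
    let route = route.strip_prefix('/').unwrap_or(route);
    let mut counts = (0, 0, 0);
    for segment in route.split('/') {
        match classify(segment) {
            Kind::Static => counts.0 += 1,
            Kind::Dynamic => counts.1 += 1,
            Kind::Wildcard => counts.2 += 1,
        }
    }
    counts
}

fn main() {
    // Tuples compare lexicographically, which matches the field order
    // used by `Metadata::cmp`: statics first, then dynamics, then wildcards.
    let new = specificity("/posts/new");
    let id = specificity("/posts/:id");

    assert_eq!(new, (2, 0, 0));
    assert_eq!(id, (1, 1, 0));
    assert!(new > id);
}
```

Under the assumption that the greatest `Metadata` wins, the route with more static segments is preferred, which is consistent with what the tests assert.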
/src/nfa.rs:
--------------------------------------------------------------------------------
1 | use std::collections::HashSet;
2 |
3 | use self::CharacterClass::{Ascii, InvalidChars, ValidChars};
4 |
5 | #[derive(PartialEq, Eq, Clone, Default, Debug)]
6 | pub struct CharSet {
7 | low_mask: u64,
8 | high_mask: u64,
9 | non_ascii: HashSet<char>,
10 | }
11 |
12 | impl CharSet {
13 | pub fn new() -> Self {
14 | Self {
15 | low_mask: 0,
16 | high_mask: 0,
17 | non_ascii: HashSet::new(),
18 | }
19 | }
20 |
21 | pub fn insert(&mut self, char: char) {
22 | let val = char as u32 - 1;
23 |
24 | if val > 127 {
25 | self.non_ascii.insert(char);
26 | } else if val > 63 {
27 | let bit = 1 << (val - 64);
28 | self.high_mask |= bit;
29 | } else {
30 | let bit = 1 << val;
31 | self.low_mask |= bit;
32 | }
33 | }
34 |
35 | pub fn contains(&self, char: char) -> bool {
36 | let val = char as u32 - 1;
37 |
38 | if val > 127 {
39 | self.non_ascii.contains(&char)
40 | } else if val > 63 {
41 | let bit = 1 << (val - 64);
42 | self.high_mask & bit != 0
43 | } else {
44 | let bit = 1 << val;
45 | self.low_mask & bit != 0
46 | }
47 | }
48 | }
49 |
50 | #[derive(PartialEq, Eq, Clone, Debug)]
51 | pub enum CharacterClass {
52 | Ascii(u64, u64, bool),
53 | ValidChars(CharSet),
54 | InvalidChars(CharSet),
55 | }
56 |
57 | impl CharacterClass {
58 | pub fn any() -> Self {
59 | Ascii(u64::max_value(), u64::max_value(), true)
60 | }
61 |
62 | pub fn valid(string: &str) -> Self {
63 | ValidChars(Self::str_to_set(string))
64 | }
65 |
66 | pub fn invalid(string: &str) -> Self {
67 | InvalidChars(Self::str_to_set(string))
68 | }
69 |
70 | pub fn valid_char(char: char) -> Self {
71 | let val = char as u32 - 1;
72 |
73 | if val > 127 {
74 | ValidChars(Self::char_to_set(char))
75 | } else if val > 63 {
76 | Ascii(1 << (val - 64), 0, false)
77 | } else {
78 | Ascii(0, 1 << val, false)
79 | }
80 | }
81 |
82 | pub fn invalid_char(char: char) -> Self {
83 | let val = char as u32 - 1;
84 |
85 | if val > 127 {
86 | InvalidChars(Self::char_to_set(char))
87 | } else if val > 63 {
88 | Ascii(u64::max_value() ^ (1 << (val - 64)), u64::max_value(), true)
89 | } else {
90 | Ascii(u64::max_value(), u64::max_value() ^ (1 << val), true)
91 | }
92 | }
93 |
94 | pub fn matches(&self, char: char) -> bool {
95 | match *self {
96 | ValidChars(ref valid) => valid.contains(char),
97 | InvalidChars(ref invalid) => !invalid.contains(char),
98 | Ascii(high, low, unicode) => {
99 | let val = char as u32 - 1;
100 | if val > 127 {
101 | unicode
102 | } else if val > 63 {
103 | high & (1 << (val - 64)) != 0
104 | } else {
105 | low & (1 << val) != 0
106 | }
107 | }
108 | }
109 | }
110 |
111 | fn char_to_set(char: char) -> CharSet {
112 | let mut set = CharSet::new();
113 | set.insert(char);
114 | set
115 | }
116 |
117 | fn str_to_set(string: &str) -> CharSet {
118 | let mut set = CharSet::new();
119 | for char in string.chars() {
120 | set.insert(char);
121 | }
122 | set
123 | }
124 | }
125 |
126 | #[derive(Clone)]
127 | struct Thread {
128 | state: usize,
129 | captures: Vec<(usize, usize)>,
130 | capture_begin: Option<usize>,
131 | }
132 |
133 | impl Thread {
134 | pub(crate) fn new() -> Self {
135 | Self {
136 | state: 0,
137 | captures: Vec::new(),
138 | capture_begin: None,
139 | }
140 | }
141 |
142 | #[inline]
143 | pub(crate) fn start_capture(&mut self, start: usize) {
144 | self.capture_begin = Some(start);
145 | }
146 |
147 | #[inline]
148 | pub(crate) fn end_capture(&mut self, end: usize) {
149 | self.captures.push((self.capture_begin.unwrap(), end));
150 | self.capture_begin = None;
151 | }
152 |
153 | pub(crate) fn extract<'a>(&self, source: &'a str) -> Vec<&'a str> {
154 | self.captures
155 | .iter()
156 | .map(|&(begin, end)| &source[begin..end])
157 | .collect()
158 | }
159 | }
160 |
161 | #[derive(Clone, Debug)]
162 | pub struct State<T> {
163 | pub index: usize,
164 | pub chars: CharacterClass,
165 | pub next_states: Vec<usize>,
166 | pub acceptance: bool,
167 | pub start_capture: bool,
168 | pub end_capture: bool,
169 | pub metadata: Option<T>,
170 | }
171 |
172 | impl<T> PartialEq for State<T> {
173 | fn eq(&self, other: &Self) -> bool {
174 | self.index == other.index
175 | }
176 | }
177 |
178 | impl<T> State<T> {
179 | pub fn new(index: usize, chars: CharacterClass) -> Self {
180 | Self {
181 | index,
182 | chars,
183 | next_states: Vec::new(),
184 | acceptance: false,
185 | start_capture: false,
186 | end_capture: false,
187 | metadata: None,
188 | }
189 | }
190 | }
191 |
192 | #[derive(Debug)]
193 | pub struct Match<'a> {
194 | pub state: usize,
195 | pub captures: Vec<&'a str>,
196 | }
197 |
198 | impl<'a> Match<'a> {
199 | pub fn new(state: usize, captures: Vec<&'_ str>) -> Match<'_> {
200 | Match { state, captures }
201 | }
202 | }
203 |
204 | #[derive(Clone, Default, Debug)]
205 | pub struct NFA<T> {
206 | states: Vec<State<T>>,
207 | start_capture: Vec<bool>,
208 | end_capture: Vec<bool>,
209 | acceptance: Vec<bool>,
210 | }
211 |
212 | impl<T> NFA<T> {
213 | pub fn new() -> Self {
214 | let root = State::new(0, CharacterClass::any());
215 | Self {
216 | states: vec![root],
217 | start_capture: vec![false],
218 | end_capture: vec![false],
219 | acceptance: vec![false],
220 | }
221 | }
222 |
223 | pub fn process<'a, I, F>(&self, string: &'a str, mut ord: F) -> Result<Match<'a>, String>
224 | where
225 | I: Ord,
226 | F: FnMut(usize) -> I,
227 | {
228 | let mut threads = vec![Thread::new()];
229 |
230 | for (i, char) in string.char_indices() {
231 | let next_threads = self.process_char(threads, char, i);
232 |
233 | if next_threads.is_empty() {
234 | return Err(format!("Couldn't process {}", string));
235 | }
236 |
237 | threads = next_threads;
238 | }
239 |
240 | let returned = threads
241 | .into_iter()
242 | .filter(|thread| self.get(thread.state).acceptance);
243 |
244 | let thread = returned
245 | .fold(None, |prev, y| {
246 | let y_v = ord(y.state);
247 | match prev {
248 | None => Some((y_v, y)),
249 | Some((x_v, x)) => {
250 | if x_v < y_v {
251 | Some((y_v, y))
252 | } else {
253 | Some((x_v, x))
254 | }
255 | }
256 | }
257 | })
258 | .map(|p| p.1);
259 |
260 | match thread {
261 | None => Err("The string was exhausted before reaching an \
262 | acceptance state"
263 | .to_string()),
264 | Some(mut thread) => {
265 | if thread.capture_begin.is_some() {
266 | thread.end_capture(string.len());
267 | }
268 | let state = self.get(thread.state);
269 | Ok(Match::new(state.index, thread.extract(string)))
270 | }
271 | }
272 | }
273 |
274 | #[inline]
275 | fn process_char(&self, threads: Vec<Thread>, char: char, pos: usize) -> Vec<Thread> {
276 | let mut returned = Vec::with_capacity(threads.len());
277 |
278 | for mut thread in threads {
279 | let current_state = self.get(thread.state);
280 |
281 | let mut count = 0;
282 | let mut found_state = 0;
283 |
284 | for &index in &current_state.next_states {
285 | let state = &self.states[index];
286 |
287 | if state.chars.matches(char) {
288 | count += 1;
289 | found_state = index;
290 | }
291 | }
292 |
293 | if count == 1 {
294 | thread.state = found_state;
295 | capture(self, &mut thread, current_state.index, found_state, pos);
296 | returned.push(thread);
297 | continue;
298 | }
299 |
300 | for &index in &current_state.next_states {
301 | let state = &self.states[index];
302 | if state.chars.matches(char) {
303 | let mut thread = fork_thread(&thread, state);
304 | capture(self, &mut thread, current_state.index, index, pos);
305 | returned.push(thread);
306 | }
307 | }
308 | }
309 |
310 | returned
311 | }
312 |
313 | #[inline]
314 | pub fn get(&self, state: usize) -> &State<T> {
315 | &self.states[state]
316 | }
317 |
318 | pub fn get_mut(&mut self, state: usize) -> &mut State<T> {
319 | &mut self.states[state]
320 | }
321 |
322 | pub fn put(&mut self, index: usize, chars: CharacterClass) -> usize {
323 | {
324 | let state = self.get(index);
325 |
326 | for &index in &state.next_states {
327 | let state = self.get(index);
328 | if state.chars == chars {
329 | return index;
330 | }
331 | }
332 | }
333 |
334 | let state = self.new_state(chars);
335 | self.get_mut(index).next_states.push(state);
336 | state
337 | }
338 |
339 | pub fn put_state(&mut self, index: usize, child: usize) {
340 | if !self.states[index].next_states.contains(&child) {
341 | self.get_mut(index).next_states.push(child);
342 | }
343 | }
344 |
345 | pub fn acceptance(&mut self, index: usize) {
346 | self.get_mut(index).acceptance = true;
347 | self.acceptance[index] = true;
348 | }
349 |
350 | pub fn start_capture(&mut self, index: usize) {
351 | self.get_mut(index).start_capture = true;
352 | self.start_capture[index] = true;
353 | }
354 |
355 | pub fn end_capture(&mut self, index: usize) {
356 | self.get_mut(index).end_capture = true;
357 | self.end_capture[index] = true;
358 | }
359 |
360 | pub fn metadata(&mut self, index: usize, metadata: T) {
361 | self.get_mut(index).metadata = Some(metadata);
362 | }
363 |
364 | fn new_state(&mut self, chars: CharacterClass) -> usize {
365 | let index = self.states.len();
366 | let state = State::new(index, chars);
367 | self.states.push(state);
368 |
369 | self.acceptance.push(false);
370 | self.start_capture.push(false);
371 | self.end_capture.push(false);
372 |
373 | index
374 | }
375 | }
376 |
377 | #[inline]
378 | fn fork_thread<T>(thread: &Thread, state: &State<T>) -> Thread {
379 | let mut new_trace = thread.clone();
380 | new_trace.state = state.index;
381 | new_trace
382 | }
383 |
384 | #[inline]
385 | fn capture<T>(
386 | nfa: &NFA<T>,
387 | thread: &mut Thread,
388 | current_state: usize,
389 | next_state: usize,
390 | pos: usize,
391 | ) {
392 | if thread.capture_begin == None && nfa.start_capture[next_state] {
393 | thread.start_capture(pos);
394 | }
395 |
396 | if thread.capture_begin != None && nfa.end_capture[current_state] && next_state > current_state
397 | {
398 | thread.end_capture(pos);
399 | }
400 | }
401 |
402 | #[cfg(test)]
403 | mod tests {
404 | use super::{CharSet, CharacterClass, NFA};
405 |
406 | #[test]
407 | fn basic_test() {
408 | let mut nfa = NFA::<()>::new();
409 | let a = nfa.put(0, CharacterClass::valid("h"));
410 | let b = nfa.put(a, CharacterClass::valid("e"));
411 | let c = nfa.put(b, CharacterClass::valid("l"));
412 | let d = nfa.put(c, CharacterClass::valid("l"));
413 | let e = nfa.put(d, CharacterClass::valid("o"));
414 | nfa.acceptance(e);
415 |
416 | let m = nfa.process("hello", |a| a);
417 |
418 | assert!(
419 | m.unwrap().state == e,
420 | "You didn't get the right final state"
421 | );
422 | }
423 |
424 | #[test]
425 | fn multiple_solutions() {
426 | let mut nfa = NFA::<()>::new();
427 | let a1 = nfa.put(0, CharacterClass::valid("n"));
428 | let b1 = nfa.put(a1, CharacterClass::valid("e"));
429 | let c1 = nfa.put(b1, CharacterClass::valid("w"));
430 | nfa.acceptance(c1);
431 |
432 | let a2 = nfa.put(0, CharacterClass::invalid(""));
433 | let b2 = nfa.put(a2, CharacterClass::invalid(""));
434 | let c2 = nfa.put(b2, CharacterClass::invalid(""));
435 | nfa.acceptance(c2);
436 |
437 | let m = nfa.process("new", |a| a);
438 |
439 | assert!(m.unwrap().state == c2, "The two states were not found");
440 | }
441 |
442 | #[test]
443 | fn multiple_paths() {
444 | let mut nfa = NFA::<()>::new();
445 | let a = nfa.put(0, CharacterClass::valid("t")); // t
446 | let b1 = nfa.put(a, CharacterClass::valid("h")); // th
447 | let c1 = nfa.put(b1, CharacterClass::valid("o")); // tho
448 | let d1 = nfa.put(c1, CharacterClass::valid("m")); // thom
449 | let e1 = nfa.put(d1, CharacterClass::valid("a")); // thoma
450 | let f1 = nfa.put(e1, CharacterClass::valid("s")); // thomas
451 |
452 | let b2 = nfa.put(a, CharacterClass::valid("o")); // to
453 | let c2 = nfa.put(b2, CharacterClass::valid("m")); // tom
454 |
455 | nfa.acceptance(f1);
456 | nfa.acceptance(c2);
457 |
458 | let thomas = nfa.process("thomas", |a| a);
459 | let tom = nfa.process("tom", |a| a);
460 | let thom = nfa.process("thom", |a| a);
461 | let nope = nfa.process("nope", |a| a);
462 |
463 | assert!(thomas.unwrap().state == f1, "thomas was parsed correctly");
464 | assert!(tom.unwrap().state == c2, "tom was parsed correctly");
465 | assert!(thom.is_err(), "thom didn't reach an acceptance state");
466 | assert!(nope.is_err(), "nope wasn't parsed");
467 | }
468 |
469 | #[test]
470 | fn repetitions() {
471 | let mut nfa = NFA::<()>::new();
472 | let a = nfa.put(0, CharacterClass::valid("p")); // p
473 | let b = nfa.put(a, CharacterClass::valid("o")); // po
474 | let c = nfa.put(b, CharacterClass::valid("s")); // pos
475 | let d = nfa.put(c, CharacterClass::valid("t")); // post
476 | let e = nfa.put(d, CharacterClass::valid("s")); // posts
477 | let f = nfa.put(e, CharacterClass::valid("/")); // posts/
478 | let g = nfa.put(f, CharacterClass::invalid("/")); // posts/[^/]
479 | nfa.put_state(g, g);
480 |
481 | nfa.acceptance(g);
482 |
483 | let post = nfa.process("posts/1", |a| a);
484 | let new_post = nfa.process("posts/new", |a| a);
485 | let invalid = nfa.process("posts/", |a| a);
486 |
487 | assert!(post.unwrap().state == g, "posts/1 was parsed");
488 | assert!(new_post.unwrap().state == g, "posts/new was parsed");
489 | assert!(invalid.is_err(), "posts/ was invalid");
490 | }
491 |
492 | #[test]
493 | fn repetitions_with_ambiguous() {
494 | let mut nfa = NFA::<()>::new();
495 | let a = nfa.put(0, CharacterClass::valid("p")); // p
496 | let b = nfa.put(a, CharacterClass::valid("o")); // po
497 | let c = nfa.put(b, CharacterClass::valid("s")); // pos
498 | let d = nfa.put(c, CharacterClass::valid("t")); // post
499 | let e = nfa.put(d, CharacterClass::valid("s")); // posts
500 | let f = nfa.put(e, CharacterClass::valid("/")); // posts/
501 | let g1 = nfa.put(f, CharacterClass::invalid("/")); // posts/[^/]
502 | let g2 = nfa.put(f, CharacterClass::valid("n")); // posts/n
503 | let h2 = nfa.put(g2, CharacterClass::valid("e")); // posts/ne
504 | let i2 = nfa.put(h2, CharacterClass::valid("w")); // posts/new
505 |
506 | nfa.put_state(g1, g1);
507 |
508 | nfa.acceptance(g1);
509 | nfa.acceptance(i2);
510 |
511 | let post = nfa.process("posts/1", |a| a);
512 | let ambiguous = nfa.process("posts/new", |a| a);
513 | let invalid = nfa.process("posts/", |a| a);
514 |
515 | assert!(post.unwrap().state == g1, "posts/1 was parsed");
516 | assert!(ambiguous.unwrap().state == i2, "posts/new was ambiguous");
517 | assert!(invalid.is_err(), "posts/ was invalid");
518 | }
519 |
520 | #[test]
521 | fn captures() {
522 | let mut nfa = NFA::<()>::new();
523 | let a = nfa.put(0, CharacterClass::valid("n"));
524 | let b = nfa.put(a, CharacterClass::valid("e"));
525 | let c = nfa.put(b, CharacterClass::valid("w"));
526 |
527 | nfa.acceptance(c);
528 | nfa.start_capture(a);
529 | nfa.end_capture(c);
530 |
531 | let post = nfa.process("new", |a| a);
532 |
533 | assert_eq!(post.unwrap().captures, vec!["new"]);
534 | }
535 |
536 | #[test]
537 | fn capture_mid_match() {
538 | let mut nfa = NFA::<()>::new();
539 | let a = nfa.put(0, valid('p'));
540 | let b = nfa.put(a, valid('/'));
541 | let c = nfa.put(b, invalid('/'));
542 | let d = nfa.put(c, valid('/'));
543 | let e = nfa.put(d, valid('c'));
544 |
545 | nfa.put_state(c, c);
546 | nfa.acceptance(e);
547 | nfa.start_capture(c);
548 | nfa.end_capture(c);
549 |
550 | let post = nfa.process("p/123/c", |a| a);
551 |
552 | assert_eq!(post.unwrap().captures, vec!["123"]);
553 | }
554 |
555 | #[test]
556 | fn capture_multiple_captures() {
557 | let mut nfa = NFA::<()>::new();
558 | let a = nfa.put(0, valid('p'));
559 | let b = nfa.put(a, valid('/'));
560 | let c = nfa.put(b, invalid('/'));
561 | let d = nfa.put(c, valid('/'));
562 | let e = nfa.put(d, valid('c'));
563 | let f = nfa.put(e, valid('/'));
564 | let g = nfa.put(f, invalid('/'));
565 |
566 | nfa.put_state(c, c);
567 | nfa.put_state(g, g);
568 | nfa.acceptance(g);
569 |
570 | nfa.start_capture(c);
571 | nfa.end_capture(c);
572 |
573 | nfa.start_capture(g);
574 | nfa.end_capture(g);
575 |
576 | let post = nfa.process("p/123/c/456", |a| a);
577 | assert_eq!(post.unwrap().captures, vec!["123", "456"]);
578 | }
579 |
580 | #[test]
581 | fn test_ascii_set() {
582 | let mut set = CharSet::new();
583 | set.insert('?');
584 | set.insert('a');
585 | set.insert('é');
586 |
587 | assert!(set.contains('?'), "The set contains char 63");
588 | assert!(set.contains('a'), "The set contains char 97");
589 | assert!(set.contains('é'), "The set contains char 233");
590 | assert!(!set.contains('q'), "The set does not contain q");
591 | assert!(!set.contains('ü'), "The set does not contain ü");
592 | }
593 |
594 | fn valid(char: char) -> CharacterClass {
595 | CharacterClass::valid_char(char)
596 | }
597 |
598 | fn invalid(char: char) -> CharacterClass {
599 | CharacterClass::invalid_char(char)
600 | }
601 | }
602 |
--------------------------------------------------------------------------------