634 |
635 | This program is free software: you can redistribute it and/or modify
636 | it under the terms of the GNU Affero General Public License as published
637 | by the Free Software Foundation, either version 3 of the License, or
638 | (at your option) any later version.
639 |
640 | This program is distributed in the hope that it will be useful,
641 | but WITHOUT ANY WARRANTY; without even the implied warranty of
642 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
643 | GNU Affero General Public License for more details.
644 |
645 | You should have received a copy of the GNU Affero General Public License
646 | along with this program. If not, see <https://www.gnu.org/licenses/>.
647 |
648 | Also add information on how to contact you by electronic and paper mail.
649 |
650 | If your software can interact with users remotely through a computer
651 | network, you should also make sure that it provides a way for users to
652 | get its source. For example, if your program is a web application, its
653 | interface could display a "Source" link that leads users to an archive
654 | of the code. There are many ways you could offer source, and different
655 | solutions will be better for different programs; see section 13 for the
656 | specific requirements.
657 |
658 | You should also get your employer (if you work as a programmer) or school,
659 | if any, to sign a "copyright disclaimer" for the program, if necessary.
660 | For more information on this, and how to apply and follow the GNU AGPL, see
661 | <https://www.gnu.org/licenses/>.
662 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | .PHONY: md2sexpr-build tree-query-build
2 |
3 | # Compile a debug build for quick development
4 | tree-query: tree-query.d interp.d query.d parser.d
5 | 	dmd -g -debug -check=invariant -unittest tree-query.d interp.d query.d parser.d
6 |
7 | md2sexpr-build:
8 | 	~/dlang/ldc-1.23.0/bin/ldc2 --link-defaultlib-shared=false -O3 -release md2sexpr.d parser.d
9 |
10 | # Compile with LLVM
11 | tree-query-build:
12 | 	ldc2 --link-defaultlib-shared=false -O2 -release tree-query.d interp.d query.d parser.d
13 | 	strip tree-query
14 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # `tree-query`
2 |
3 | > This README is TODO. If you'd like to use this software in a project, **don't hesitate to reach out to me.**
4 |
5 | `tree-query` is a tool that lets you execute queries on ordinary directories of files like text and Markdown, inspired by Roam Research's query syntax.
6 |
7 | It is a replacement for Roam's query system. It supports everything Roam does, except for block references.
8 |
9 | [Join us on Discord](https://discord.gg/7B9ywS5x)
10 |
11 | ## Quickstart
12 |
13 | **Query in current directory:**
14 | ```
15 | tree-query '{and: [[Page 1]] [[Page 2]]}' .
16 | ```
17 | [*Learn to navigate to a working directory with `cd`*](https://linuxize.com/post/linux-cd-command/)
18 |
19 | **Query in a folder:**
20 | ```
21 | tree-query '{and: [[Page 1]] [[Page 2]]}' /Users/steve/myfoldername/
22 | ```
23 | *Learn to get the location of a folder on [macOS](https://osxdaily.com/2009/11/23/copy-a-files-path-to-the-terminal-by-dragging-and-dropping/), [Windows](https://www.top-password.com/blog/copy-full-path-of-a-folder-file-in-windows/), or [GNU+Linux](https://unix.stackexchange.com/questions/102551/mouse-shortcut-to-copy-the-path-to-a-file-in-the-gnome-file-manager).*
24 |
25 | Or query in multiple folders and files:
26 |
27 | ```
28 | tree-query '{and: [[Page 1]] [[Page 2]]}' /Users/steve/myfoldername/ file1
29 | ```
30 |
31 | **Query stdin with pipes:**
32 | ```
33 | cat myfile | tree-query '{and: [[Page 1]] [[Page 2]]}'
34 | ```
35 | [*Learn to build powerful no-code applications using pipes*](https://youtu.be/tc4ROCJYbm0?t=360)
36 |
37 | ## Installation
38 | ### [Download a copy](https://github.com/CrazyPython/tree-query/releases/tag/v0.1.1)
39 | The more convenient method. Click above for instructions.
40 |
41 | ### Build from source
42 |
43 | On FreeBSD, GNU+Linux, and macOS, open a terminal and run:
44 | ```
45 | curl https://dlang.org/install.sh | bash -s
46 | ```
47 |
48 | On Windows, download and install [Git Bash](https://gitforwindows.org/) and [7-Zip to C:\Program Files](https://www.7-zip.org/), then run:
49 | ```
50 | mkdir %USERPROFILE%\dlang
51 | set PATH="%PATH%;C:\Program Files\7-Zip"
52 | set BASH="\Program Files\Git\usr\bin\bash.exe"
53 | mkdir dlang
54 | powershell.exe -Command "wget https://dlang.org/install.sh -OutFile dlang\install.sh"
55 | ```
56 |
57 | Then:
58 | ```
59 | ~/dlang/install.sh install ldc-1.23.0
60 | ```
61 |
62 | Then `cd` into the directory where you cloned this repository (`git clone https://github.com/CrazyPython/tree-query.git && cd tree-query`) and type:
63 | ```
64 | ~/dlang/ldc-1.23.0/bin/ldc2 --link-defaultlib-shared=false -O2 -release tree-query.d interp.d query.d parser.d
65 | ```
66 |
67 | (Non-Windows) Install it to make it available everywhere:
68 | ```
69 | chmod 700 tree-query
70 | sudo mv tree-query /usr/local/bin
71 | ```
72 |
73 | ## Features
74 |
75 | This section is a work-in-progress.
76 |
77 | * **Fast**
78 |
79 | ## Basic usage with the command-line
80 |
81 | ### Copy result to clipboard
82 | macOS:
83 | ```
84 | tree-query '{and: [[Page 1]] [[Page 2]]}' . | pbcopy
85 | ```
86 |
87 | GNU+Linux:
88 | ```
89 | tree-query '{and: [[Page 1]] [[Page 2]]}' . | xclip
90 | ```
91 |
92 | ## Contributing
93 |
94 | **query.d**: The logic for executing Roam queries
95 |
96 | **parser.d**: The Markdown/org-mode parser, which detects the indentation level of each line and uses it to emit a stream of events for a handler like `query.d` to consume.
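
As a rough illustration of that event stream, here is a minimal, self-contained sketch. It is not the code in `parser.d`: the `indentOf` and `events` helpers are hypothetical, indentation is assumed to be tabs only, and events are collected into an array instead of being passed to a handler.

```d
import std.stdio : writeln;
import std.string : splitLines, stripLeft;

// Hypothetical helper: the indent level is the number of leading tabs.
size_t indentOf(string line) {
    size_t n;
    while (n < line.length && line[n] == '\t') ++n;
    return n;
}

// Turn an indented outline into a flat list of START/END events.
string[] events(string input) {
    string[] result;
    size_t[] openLevels; // indent levels of blocks that are still open
    foreach (line; input.splitLines) {
        auto level = indentOf(line);
        // Close every open block at the same or a deeper level.
        while (openLevels.length && openLevels[$ - 1] >= level) {
            result ~= "END";
            openLevels = openLevels[0 .. $ - 1];
        }
        result ~= "START " ~ line.stripLeft;
        openLevels ~= level;
    }
    // Close whatever is still open at the end of the input.
    foreach (_; openLevels) result ~= "END";
    return result;
}

void main() {
    // Prints: START - a, START - b, END, END, START - c, END (one per line)
    foreach (e; events("- a\n\t- b\n- c")) writeln(e);
}
```

A consumer such as a query handler only needs to react to these two kinds of events, which is what keeps the parser and the query logic separate.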
97 |
98 | **tree-query.d**: The command-line tool/wrapper. Recursively parses the string `{and: [[Page 1]] [[Page 2]]}` into a tree of expressions
99 |
100 | **interp.d**: Recursively evaluates boolean expressions on behalf of query.d
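
The bitfield idea behind this evaluation can be sketched in a few lines. The example below is a simplified, self-contained stand-in, not the actual `Form`/`eval_form` code: the `Node`, `atom`, `eval`, and `matchBits` names are hypothetical. Each search term gets one bit, a line's bitfield records which terms it contains, and the expression tree is evaluated against that bitfield.

```d
import std.algorithm.searching : canFind;
import std.stdio : writeln;

enum Op { AND, OR, NOT, ATOM }

// Hypothetical, simplified expression node.
struct Node {
    Op op;
    Node[] children; // operands of AND / OR / NOT
    uint bit;        // for ATOM: index of the search term
}

Node atom(uint bit) {
    return Node(Op.ATOM, null, bit);
}

// Evaluate an expression tree against a bitfield of matched terms.
bool eval(uint bits, Node n) {
    if (n.op == Op.ATOM)
        return (bits & (1u << n.bit)) != 0;
    if (n.op == Op.NOT)
        return !eval(bits, n.children[0]);
    foreach (child; n.children) {
        bool value = eval(bits, child);
        if (n.op == Op.AND && !value) return false; // short-circuit
        if (n.op == Op.OR && value) return true;    // short-circuit
    }
    // An empty AND is true, an empty OR is false.
    return n.op == Op.AND;
}

// Set bit i when the i-th term occurs in the line.
uint matchBits(string line, string[] terms) {
    uint bits;
    foreach (i, term; terms)
        if (line.canFind(term)) bits |= 1u << cast(uint) i;
    return bits;
}

void main() {
    auto terms = ["[[Page 1]]", "[[Page 2]]"];
    auto query = Node(Op.AND, [atom(0), atom(1)]);
    writeln(eval(matchBits("- see [[Page 1]] and [[Page 2]]", terms), query)); // true
    writeln(eval(matchBits("- only [[Page 1]]", terms), query));               // false
}
```

Because a child block can OR its own bits with its parent's bits, a query can match when its terms are spread across a line and its ancestors.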
101 |
102 | `tree-query` has doc comments on internals and example code demonstrating how internal APIs work. (Ctrl-F for "unittest")
103 |
104 | `tree-query` is written in [Dlang](https://dlang.org), but don't let that put you off: if you know C, C++, or Java, you'll pick it up very quickly.
105 |
106 | If you have any questions about D, feel free to ask in #d on freenode or on the D Forums. People there are very nice.
107 |
108 | ### Internals
109 |
110 | I've spent some time writing doc comments inside the code. They provide a conceptual explanation of how the system works. Look for `/**` and `/++`.
111 |
112 | Inside the unittests, there's an example guide on using `parser.d` as a library to build a Markdown to XML converter.
113 |
114 | ### Notes
115 |
116 | A string in D is a reference to a region of immutable memory. It is a length and a pointer. For this reason, it is very efficient to copy.
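
A small sketch of what this means in practice (the sample text is made up for illustration): taking a slice of a string produces a new pointer-and-length pair that refers to the same memory, so no characters are copied.

```d
import std.stdio : writeln;

void main() {
    string text = "- [[Page 1]] and [[Page 2]]";

    // Slicing creates a new (pointer, length) pair; no characters are copied.
    string slice = text[2 .. 12];
    writeln(slice);                              // [[Page 1]]
    writeln(slice.ptr == text.ptr + 2);          // true: same underlying memory
    writeln(string.sizeof == 2 * size_t.sizeof); // true: a pointer plus a length
}
```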
117 |
118 | A struct is like a Java record or class.
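
One difference from Java classes worth keeping in mind is that a D struct is a value type: assigning it or passing it to a function copies it. The `Point` type and `movedRight` function below are hypothetical, purely for illustration.

```d
import std.stdio : writeln;

struct Point {
    int x;
    int y;
}

// Receives its own copy of the struct; the caller's value is untouched.
Point movedRight(Point p, int dx) {
    p.x += dx;
    return p;
}

void main() {
    auto a = Point(1, 2);
    auto b = movedRight(a, 10);
    writeln(a.x); // 1
    writeln(b.x); // 11
}
```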
119 |
120 | ## Known bugs
121 | - Mixing tabs and spaces in the same file is not supported, unless an explicit spaces-per-indent value is specified
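
To see why mixed indentation is ambiguous, consider this simplified sketch of an indent calculation. It is a hypothetical `indentLevel` helper modeled loosely on the idea in `parser.d`, not the actual function: a tab counts as one full indent, while spaces are divided by the assumed number of spaces per indent.

```d
import std.stdio : writeln;

// Simplified indent calculation: a tab counts as one full indent,
// and spaces are divided by spacesPerIndent.
int indentLevel(string line, int spacesPerIndent = 4) {
    int width = 0;
    foreach (c; line) {
        if (c == ' ') width += 1;
        else if (c == '\t') width += spacesPerIndent;
        else break; // first non-indent character
    }
    return width / spacesPerIndent;
}

void main() {
    writeln(indentLevel("\t- item"));      // 1
    writeln(indentLevel("  - item"));      // 0 with the default of 4
    writeln(indentLevel("  - item", 2));   // 1 when told the file uses 2
    writeln(indentLevel("\t  - item"));    // 1 with the default of 4
    writeln(indentLevel("\t  - item", 2)); // 2 when told the file uses 2
}
```

The same line lands on a different level depending on the assumed width, which is why a file that mixes both styles needs the width stated explicitly.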
122 |
123 | ## Your rights
124 |
125 | This is open-source software.
126 |
127 | We use copyleft to guarantee these rights:
128 |
129 | 0. Free for commercial use and any other purpose
130 | 1. Freedom to remix to fit your needs: You (and if you can't code, by proxy a programmer you hire) have freedom to add new query keywords, completely change the query system, add support for new formats like org-mode, or anything else
131 | 2. Freedom to help friends by sharing
132 | 3. Freedom to share remixes, commercially and noncommercially
133 |
134 | I believe knowledge management is deeply personal, and everyone should have freedom over their "digital brain." A digital brain is a deeply intimate and personal thing. This means you are the sovereign of your digital brain.
135 |
136 | Compatible with permissive licenses like Apache License, MIT License, and Mozilla Public License.
137 |
138 | (C) 2021. Affero General Public License v3.0 or any later version
139 |
--------------------------------------------------------------------------------
/RELEASE_NOTES.md:
--------------------------------------------------------------------------------
1 | # Release notes
2 |
3 | ## 0.1.1
4 | - Added hyperlinked documentation on usage instructions for people new to the CLI
5 | - Added missing source files, so it is now possible to compile
6 | - Support for starting queries with `{query:` was enabled
7 | - Shows message when executed without a query
8 | - Fixed build instructions for Windows and added a binary release for Windows and Mac
9 |
10 | ## 0.1.0
11 | - Supports `{and:`, `{or:`, `{not:`, nested arbitrarily, querying using page references
12 | - Works on any kind of indentation, including tabs and spaces.
13 | - Caveat: Mixing tabs and spaces in one file may lead to undesired results, because a tab is interpreted as one space
14 | - Ignores text inside Markdown code blocks
15 | - Indentation is different for these code blocks. Searching with their text included as part of the parent block is not supported yet
16 |
--------------------------------------------------------------------------------
/interp.d:
--------------------------------------------------------------------------------
1 | /+ A boolean expression interpreter.
2 | + Used to evaluate boolean bitfields for the query.
3 | +/
4 | module interp;
5 |
6 | import std.variant;
7 | import std.typecons;
8 | // TCO
9 | alias BitType = uint;
10 | enum Op { AND, OR, NOT, ATOM };
11 | bool eval_form(BitType bitstring, Form form) {
12 | if (form.op == Op.AND) {
13 | return eval_and(bitstring, form.operands);
14 | } else if (form.op == Op.OR) {
15 | return eval_or(bitstring, form.operands);
16 | } else if (form.op == Op.NOT) {
17 | // Negate the operand, which may be an atom or a nested expression
18 | return !eval_form(bitstring, form.operands[0]);
19 | } else {
20 | return cast(bool)(bitstring & (1 << form.bitshift));
21 | }
22 | }
23 | struct Form {
24 | Op op;
25 | union {
26 | Form[] operands;
27 | // What is faster, a ubyte or a byte?
28 | ubyte bitshift;
29 | }
30 | }
31 | bool eval_and(BitType bitstring, Form[] terms) {
32 | if (terms.length == 0) {
33 | return true;
34 | }
35 | if (eval_form(bitstring, terms[0])) {
36 | return eval_and(bitstring, terms[1..$]);
37 | } else {
38 | // short-circuit evaluation
39 | return false;
40 | }
41 | }
42 | bool eval_or(BitType bitstring, Form[] terms) {
43 | if (terms.length == 0) {
44 | return false;
45 | }
46 | if (eval_form(bitstring, terms[0])) {
47 | // short-circuit evaluation
48 | return true;
49 | } else {
50 | return eval_or(bitstring, terms[1..$]);
51 | }
52 | }
53 | private Form Atom(ubyte bitshift) {
54 | Form form = { Op.ATOM, bitshift: bitshift };
55 | return form;
56 | }
57 | unittest {
58 | assert(eval_form(0b1, Form(Op.OR, [Atom(0)])) == true);
59 | assert(eval_form(0b11, Form(Op.AND, [Atom(0), Atom(1)])) == true);
60 | assert(eval_form(0b10, Form(Op.AND, [Atom(0), Atom(1)])) == false);
61 | assert(eval_form(0b10, Form(Op.OR , [Atom(0), Atom(1)])) == true);
62 | assert(eval_form(0b111,Form(Op.AND, [Form(Op.AND, [Atom(0), Atom(2)]), Atom(1)])) == true);
63 | }
64 |
--------------------------------------------------------------------------------
/md2sexpr.d:
--------------------------------------------------------------------------------
1 | import parser;
2 | import std.string;
3 | private string escape(string text) {
4 | return `"` ~ text.replace(`\`, `\\`).replace(`"`, `\"`) ~ `"`;
5 | }
6 | struct ConvertToSExprEventHandler {
7 | string start(string text) {
8 | return " (" ~ escape(text);
9 | }
10 | string end() {
11 | return ")";
12 | }
13 | }
14 | struct ConvertToSExprListEventHandler {
15 | int[] nchildren;
16 | string start(string text) {
17 | int current;
18 | if (nchildren.length) {
19 | nchildren[$-1]++;
20 | current = nchildren[$-1];
21 | } else {
22 | current = 0;
23 | }
24 | nchildren ~= 0;
25 | // If we are starting a list with more than one child, add 2 parens
26 | if (current == 1) {
27 | return " ((" ~ escape(text);
28 | } else {
29 | return " (" ~ escape(text);
30 | }
31 | }
32 | string end() {
33 | // If we are ending a list with more than one child, add 2 parens
34 | if (nchildren.length > 0 && nchildren[$-1] > 0) {
35 | nchildren.length--;
36 | return "))";
37 | } else {
38 | nchildren.length--;
39 | return ")";
40 | }
41 | }
42 | }
43 | void main(string[] args) {
44 | import std.stdio, std.file;
45 | string[] inputnames;
46 | string[] inputs;
47 | args = args[1..$]; // skip first arg, which is name of binary
48 | bool list = false;
49 | if (args.length > 0 && args[0] == "-l") {
50 | args = args[1..$];
51 | list = true;
52 | }
53 | if (args.length == 0) {
54 | // stdin
55 | string input;
56 | string line;
57 | while ((line = readln()) !is null)
58 | input ~= line;
59 | inputnames ~= "stdin";
60 | inputs ~= input;
61 | }
62 | void readRestArgs() {
63 | foreach (filename; args) {
64 | inputnames ~= filename;
65 | inputs ~= filename.readText;
66 | }
67 | }
68 | if (list) {
69 | readRestArgs();
70 | for (int i = 0; i < inputs.length; ++i) {
71 | string output;
72 | parse!(ConvertToSExprListEventHandler, s => output ~= s, s => output ~=s)(inputs[i], 4, inputnames[i]);
73 | // Write each output on a separate line
74 | writeln(output);
75 | }
76 | } else {
77 | readRestArgs();
78 | for (int i = 0; i < inputs.length; ++i) {
79 | string output;
80 | parse!(ConvertToSExprEventHandler, s => output ~= s, s => output ~=s)(inputs[i], 4, inputnames[i]);
81 | // Write each on separate lines
82 | writeln(output);
83 | }
84 | }
85 | }
86 |
--------------------------------------------------------------------------------
/parser.d:
--------------------------------------------------------------------------------
1 | /** Count the indent level and return where the indent level stops in "i"
2 | * A tab is counted as one indent.
3 | * Params:
4 | * i = the index to start looking from. This is mutated
5 | * by reference so that, on return, it holds the index
6 | * of the first character after the indentation.
7 | * spaces_per_indent = the number of spaces to treat as one indent
8 | */
9 | int count_indents(ref int i, string input, int spaces_per_indent=4) {
10 | int nspaces = 0;
11 | if (i == input.length || !(input[i] == ' ' || input[i] == '\t')) {
12 | return 0;
13 | }
14 | for (; i < input.length; ++i) {
15 | switch (input[i]) {
16 | case ' ': nspaces++; break;
17 | case '*': nspaces++; break;
18 | case '\t': nspaces += spaces_per_indent; break;
19 | // Commented out makes it only accept "-"
20 | //case '-': return nspaces / spaces_per_indent;
21 | case '\n': nspaces = 0; break;
22 | //default : return 0;
23 | default : return nspaces / spaces_per_indent;
24 | }
25 | }
26 | return nspaces / spaces_per_indent;
27 | }
28 | unittest {
29 | int i;
30 | assert(count_indents(i = 0, "    - hi") == 1);
31 | assert(count_indents(i = 0, "Foo") == 0);
32 | assert(count_indents(i = 0, "        - hi") == 2);
33 | assert(count_indents(i = 0, "\t - hey") == 1);
34 | }
35 | void doNothing(T)(T _=null) {}
36 | /** A streaming parser for Markdown trees. Emits a stream of parsing events.
37 | * Params:
38 | * ParseEventHandler = struct that handles a stream of parsing events.
39 | * See [ExampleParseEventHandler] for an example.
40 | * deleStart = function that is called with the return value of
41 | * the ParseEventHandler's start(). Defaults to
42 | * doNothing
43 | * deleEnd = function that is called with the return value of
44 | * the ParseEventHandler's end(). Defaults to
45 | * doNothing
46 | * These are all compile-time (template) parameters, thus the compiler will
47 | * inline functions into generated code.
48 | */
49 | void parse(ParseEventHandler,
50 | alias deleStart=doNothing,
51 | alias deleEnd=doNothing,
52 | bool relaxed=false,
53 | Args...)
54 | (string input, int spaces_per_indent, string title, Args args) {
55 | import std.traits;
56 | ParseEventHandler handler = ParseEventHandler(args);
57 | //const initial_indent_level =
58 | int current_indent_level;
59 | //BitType current_bits;
60 | //BitType[] stack;
61 | int line_content_start_index;
62 | int line_start_index;
63 | // Workaround for void not being a valid parameter type
64 | void emitBlockStart(string nodeContent) {
65 | static if (is(ReturnType!(handler.start) == void)) {
66 | handler.start(nodeContent);
67 | deleStart();
68 | } else deleStart(handler.start(nodeContent));
69 | }
70 | void emitBlockEnd() {
71 | static if (is(ReturnType!(handler.end) == void)) {
72 | handler.end();
73 | deleEnd();
74 | } else deleEnd(handler.end());
75 | }
76 | emitBlockStart(title);
77 | for (int i = 0; i < input.length; ++i) {
78 | // emit end when on same or lower indent level, number based on difference
79 | // start on new lines
80 | switch (input[i]) {
81 | case '\n':
82 | //stack[0] & current_bits;
83 | // FIXME: Start and end based on "-"
84 | const nodeContent = input[line_start_index..i];
85 | import std.stdio;
86 | emitBlockStart(nodeContent);
87 | i++;
88 | line_start_index = i;
89 | int new_indent_level =
90 | count_indents(i, input, spaces_per_indent);
91 | line_content_start_index = i;
92 | const extraEndings = i == input.length ? 0 : 1;
93 | foreach (_;
94 | 0..current_indent_level - new_indent_level + extraEndings) {
95 | emitBlockEnd();
96 | }
97 | if (relaxed && new_indent_level > current_indent_level + 1) {
98 | foreach (_; 0..new_indent_level - current_indent_level - 1)
99 | {
100 | emitBlockStart("");
101 | }
102 | }
103 | current_indent_level = new_indent_level;
104 | break;
105 | case '[':
106 | break;
107 | // Skip across multiline constructs so that indentation
108 | // inside them is not detected
109 | case '`':
110 | if (i + 2 < input.length && input[i+1] == '`' && input[i+2] == '`') {
111 | // Skip 3 at a time for efficiency
112 | // TODO: SIMD this and/or PGO it
113 | for (i += 3; i < input.length; ++i) {
114 | if (input[i-2] == '`' &&
115 | input[i-1] == '`' &&
116 | input[i-0] == '`') {
117 | break;
118 | }
119 | }
120 | }
121 | break;
122 | default: break;
123 | }
124 | }
125 | if (line_content_start_index != input.length) {
126 | emitBlockStart(input[line_start_index .. input.length]);
127 | }
128 | for (int i = 0; i < current_indent_level + 1; ++i) {
129 | emitBlockEnd();
130 | }
131 | // End again because we started for the title
132 | emitBlockEnd();
133 | }
134 | /// Test the stream of parser events
135 | unittest {
136 | import std.range, std.array;
137 | struct ExampleParseEventHandler {
138 | string start(string text) {
139 | return "START" ~ text;
140 | }
141 | string end() {
142 | return "END";
143 | }
144 | }
145 | string sample = "- ab\n\t- b\nhello\n\tworld\n\tfoo";
146 | // A trailing newline should not affect the result
147 | foreach (useTrailingNewline; [false, true]) {
148 | string result;
149 | parse!(ExampleParseEventHandler, s => result ~= s, s => result ~= s)
150 | (sample ~ (useTrailingNewline ? "\n" : ""), 4, "Title");
151 | import std.stdio;
152 | debug writeln(result);
153 | assert(result ==
154 | "STARTTitleSTART- abSTART\t- bENDENDSTARThelloSTART\tworldENDSTART\tfooENDENDEND"
155 | );
156 | }
157 | }
158 | struct A {
159 |
160 | }
161 | private struct WithConstructor {
162 | string[] member;
163 | this(string[] arg, A a) {
164 | member = arg;
165 | }
166 | void start(string text) {
167 | }
168 | void end() {
169 | }
170 | }
171 | unittest {
172 | A a;
173 | parse!(WithConstructor)
174 | ("Test", 4, "Title", ["arg"], a);
175 | }
176 | unittest {
177 | import std.stdio;
178 | struct ExampleParseEventHandler {
179 | string start(string text) {
180 | return "" ~ text;
181 | }
182 | string end() {
183 | return "
";
184 | }
185 | }
186 | string sample =
187 | `a
188 | b
189 | c
190 | d`;
191 | string result;
192 | parse!(ExampleParseEventHandler, s => result ~= s, s => result ~= s, true)
193 | (sample, 1, "Title");
194 | writeln(result);
195 | assert(result == "");
196 | }
197 | struct ConvertToXMLEventHandler {
198 | string start(string text) {
199 | return "" ~ text;
200 | }
201 | string end() {
202 | return "";
203 | }
204 | }
205 | /*struct ConvertToOPMLEventHandler {
206 | string start(string text) {
207 | // todo: escpae
208 | return ``;
212 | }
213 | }*/
214 |
--------------------------------------------------------------------------------
/query.d:
--------------------------------------------------------------------------------
1 | /++ Query system that implements Roam queries. +/
2 | module query;
3 | import interp;
4 | debug import std.stdio;
5 |
6 | // Unoptimized function to calculate matches for a set of
7 | // words
8 | // This should be replaced by something faster.
9 | // For instance, a regex library that scans once, instead of once per search term
10 | struct DumbMatcher(BitType) {
11 | string[] search_terms;
12 | this(string[] terms) {
13 | search_terms = terms;
14 | }
15 | BitType match(string str) {
16 | import std.algorithm.searching : canFind;
17 | BitType result;
18 | debug writeln("Query " ~ str);
19 | for (int i = 0; i < search_terms.length; ++i) {
20 | auto term = search_terms[i];
21 | if (canFind(str, term)) {
22 | debug writeln("Matched with " ~ term);
23 | result |= 1 << i;
24 | }
25 | }
26 | return result;
27 | }
28 | unittest {
29 | DumbMatcher!BitType matcher = DumbMatcher(["red", "green"]);
30 | assert(matcher.match("blue") == 0b00);
31 | assert(matcher.match("red") == 0b01);
32 | assert(matcher.match("green") == 0b10);
33 | assert(matcher.match("red green") == 0b11);
34 | assert(matcher.match("blue green") == 0b10);
35 | Form form = { Op.ATOM, bitshift: 0};
36 | assert(eval_form(matcher.match("red"), Form(Op.AND, [form])));
37 | }
38 | unittest {
39 | DumbMatcher!BitType matcher = DumbMatcher(["[[PRIME Theory]]", "[[PRIME: Motives]]"]);
40 | assert(matcher.match("- In every moment we [act]([[PRIME: Responses]]) in pursuit of what we most [want or need]([[PRIME: Motives]]) at that moment. Something can only exert [[behavioral influence]] if it is [[salient]] at the moment") == 0b10);
41 | }
42 | }
43 | struct RegexMatcher;
44 | struct TrieMatcher;
45 |
46 | alias BitType = uint;
47 |
48 | /++Roam query.
49 | + It handles a stream of START and END events.
50 | + START means an indented block or line started
51 | + END means an indented block or line ended
52 | +
53 | + It computes a bitfield for each line.
54 | + Each bit in the bitfield corresponds to a word in the query, and represents
55 | + whether that word was present in this line or one of its parents.
56 | +/
57 | struct QueryHandler {
58 | bool matching = false;
59 | string[] parent_lines; // Bookkeeping data structure. Holds parent lines as strings, so we can later print them out.
60 | //string[] to_print;
61 | BitType[] bit_stack = [0];
62 | DumbMatcher!BitType matcher;
63 | Form expression;
64 | this(string[] terms, Form form) { // Construct a QueryHandler struct
65 | debug writeln("Terms", terms);
66 | assert(terms.length < BitType.sizeof * 8);
67 | matcher = DumbMatcher!BitType(terms);
68 | expression = form;
69 | }
70 | void start(string line) {
71 | BitType own_bits = matcher.match(line);
72 | bit_stack ~= bit_stack[$-1] | own_bits;
73 | debug writeln("Own bitstring", own_bits);
74 | parent_lines ~= line;
75 | if (eval_form(bit_stack[$-1], expression)) {
76 | import std.stdio;
77 | foreach (parent_line; parent_lines)
78 | writeln(parent_line);
79 | // Simple way to avoid printing it twice
80 | parent_lines.length = 0;
81 | }
82 | }
83 | void end() {
84 | debug writeln("stacklen", bit_stack.length);
85 | bit_stack.length--;
86 | if (parent_lines.length > 0)
87 | parent_lines = parent_lines[0..$-1]; // Pop the last item
88 | }
89 | }
90 | unittest {
91 | //QueryHandler!uint handler = QueryHandler!uint(["red", "green"], "");
92 | }
93 |
--------------------------------------------------------------------------------
/tree-query.d:
--------------------------------------------------------------------------------
1 | module tree_query;
2 | import std.stdio;
3 | import std.variant;
4 | import std.string;
5 | import query;
6 | import interp;
7 | /++
8 | + Extract the query from the input.
9 | + These forms are supported, where QUERY represents the query itself.
10 | + - QUERY
11 | + - {{query: QUERY }}
12 | + - {{query:QUERY}}
13 | + - {{[[query]]: QUERY }}
14 | +
15 | + Leading and trailing whitespace around the query is allowed.
16 | +
17 | + Note that Roam does not accept {{ query: nor {{ query :
18 | +
19 | +/
20 | string extractQuery(string query) {
21 | import std.string;
22 | query = strip(query); // Allow leading and trailing whitespace
23 | if (query.startsWith("{{")) {
24 | if (query.endsWith("}}")) {
25 | // Skip the opening "{{" and any whitespace after it; slice relative to that below
26 | auto withoutWhitespace = stripLeft(query[2..$]);
27 | enum command = "query:";
28 | if (withoutWhitespace.startsWith(command)) {
29 | auto result = withoutWhitespace[command.length..$-2].strip;
30 | return result;
31 | } else {
32 | throw new Exception("Unrecognized command, command must be 'query:'");
33 | }
34 | } else {
35 | throw new Exception("Malformed query. Must end with '}}'");
36 | }
37 | } else {
38 | return query;
39 | }
40 | }
41 | unittest {
42 | const nakedQueries = [
43 | "{and: [[foo]] [[bar]] }",
44 | "{or: [[foo]] [[bar]] }",
45 | "{and: [[foo]] [[bar]]}",
46 | "{and: [[foo]] {and: [[bar]] [[spam]]}}"
47 | ];
48 | foreach (nakedQuery; nakedQueries) {
49 | assert(extractQuery(nakedQuery) == nakedQuery);
50 | assert(extractQuery("{{query:" ~ nakedQuery ~ "}}") == nakedQuery);
51 | assert(extractQuery(" {{query: " ~ nakedQuery ~ " }} ") == nakedQuery);
52 | }
53 | }
54 | struct ParsedQuery {
55 | string[] terms;
56 | Form form;
57 | }
58 |
59 | ParsedQuery parseQuery(string query) {
60 | ParsedQuery q;
61 | return q;
62 | }
63 |
64 | // TODO: Implement atom dedup
65 | // TODO: Implement support for arbitrary atoms
66 | // TODO: Allocate Form from an array for efficiency
67 | ParsedQuery booleanQuery(string query, ref ubyte bitshift) {
68 | import std.array;
69 | ParsedQuery q;
70 | debug writeln("subquery", query);
71 | query = query.strip();
72 | auto start = query.split(" ")[0];
73 | if (start == "{and:") {
74 | q.form.op = Op.AND;
75 | } else if (start == "{or:") {
76 | q.form.op = Op.OR;
77 | } else if (start == "{not:") {
78 | q.form.op = Op.NOT;
79 | } else {
80 | throw new Exception("Unrecognized keyword: " ~ start);
81 | }
82 | // (\[\[.+\]\])|(".+")
83 | // recursive call
84 | // using the rest of the words, build a form
85 | int[] indexes;
86 | int ntoskip = 0;
87 | foreach (i, token; query[start.length..$].split("]]")) {
88 | debug writeln("Parse token", token);
89 | if (ntoskip > 0) {
90 | ntoskip--;
91 | continue;
92 | }
93 | if (token.startsWith("[[") || token.startsWith(" [[")) {
94 | auto text = token.strip() ~ "]]";
95 | q.terms ~= text;
96 | Form inner = { Op.ATOM, bitshift: bitshift++ };
97 | debug bitshift.writeln;
98 | q.form.operands ~= inner;
99 | } else if (token.strip().startsWith("}")) {
100 | return q;
101 | } else {
102 | auto subquery = booleanQuery(
103 | query[start.length..$].split("]]")[i..$].join("]]"),
104 | bitshift
105 | );
106 | q.terms ~= subquery.terms;
107 | q.form.operands ~= subquery.form;
108 | ntoskip = cast(int)subquery.terms.length + 1;
109 | }
110 | }
111 | return q;
112 | }
113 | unittest {
114 | ubyte bitshift = 0;
115 | auto pq = booleanQuery("{and: [[Hi]] {or: [[Blue]] [[White]] } }", bitshift);
116 | assert(pq.terms == ["[[Hi]]", "[[Blue]]", "[[White]]"]);
117 | assert(pq.form.operands[1].op == Op.OR);
118 | assert(pq.form.operands[1].operands[0].bitshift == 1);
119 | assert(pq.form.operands[1].operands[1].bitshift == 2);
120 | }
121 |
122 | string escape(string text) {
123 | return text.replace(`\`, `\\`)
124 | .replace(`"`, `\"`)
125 | .replace(`\\n`, `\n`)
126 | .replace(`\\t`, `\t`);
127 | }
128 | unittest {
129 | import std.stdio;
130 | ubyte n = 0;
131 | auto parsed = booleanQuery("{and: [[Hi]] [[Hello]] }", n);
132 | assert(parsed.terms == ["[[Hi]]", "[[Hello]]"]);
133 | }
134 | private struct WithConstructor {
135 | string[] member;
136 | this(string[] arg, Form a) {
137 | member = arg;
138 | }
139 | void start(string text) {
140 | }
141 | void end() {
142 | }
143 | }
144 | unittest {
145 | import parser;
146 | Form form;
147 | parse!(WithConstructor)
148 | ("Test", 4, "Title", ["arg"], form);
149 | }
150 |
151 | int main(string[] args) {
152 | import std.getopt, std.file, std.stdio;
153 | // Parse arguments
154 | string query;
155 | if (args.length >= 2 && args[1].strip.startsWith("{")) {
156 | query = args[1];
157 | args = args[2..$];
158 | } else {
159 | throw new Exception("Query must be first parameter");
160 | }
161 | // Read query immediately. If the user has written an invalid query, show
162 | // an error before we read in all files.
163 | ubyte n = 0;
164 | ParsedQuery qu = booleanQuery(extractQuery(query), n);
165 | string[] inputs;
166 | string[] inputnames;
167 | if (args.length == 0) {
168 | // stdin
169 | string input;
170 | string line;
171 | while ((line = readln()) !is null)
172 | input ~= line;
173 | inputnames ~= "stdin";
174 | inputs ~= input;
175 | }
176 | foreach (filename; args) {
177 | if (filename.isDir) {
178 | import std.algorithm.iteration;
179 | filename.dirEntries(SpanMode.depth).filter!isFile.each!((string filename) {
180 | // TODO: Proper mechanism to detect and avoid binary files
181 | import std.utf;
182 | try {
183 | inputnames ~= filename;
184 | // TODO: Stream input from files for cache locality, instead of reading everything in at once
185 | inputs ~= filename.readText;
186 | } catch (UTFException) {
187 | inputnames.length--;
188 | }
189 | });
190 | } else {
191 | if (!filename.exists) throw new Exception("No such file: " ~ filename); // asserts are removed by -release
192 | inputnames ~= filename;
193 | inputs ~= filename.readText;
194 | }
195 | }
196 | for (int i = 0; i < inputs.length; ++i) {
197 | import parser;
198 | import std.meta;
199 | parse!(QueryHandler, doNothing, doNothing, true)(inputs[i], 4, inputnames[i], qu.terms, qu.form);
200 | // QueryHandler prints matching lines itself while handling events
201 | }
202 | return 0;
203 | }
204 |
--------------------------------------------------------------------------------