├── .gitignore
├── CHANGELOG
├── LICENSE
├── README.md
├── instantsearch.py
└── tests.py


/.gitignore:
--------------------------------------------------------------------------------
.idea

--------------------------------------------------------------------------------
/CHANGELOG:
--------------------------------------------------------------------------------
# 1.3.1 (unreleased)
* right position fix #50

# 1.3 (2022-12-15)
* part of the query might be divided between the page name and the page contents search #45
* search state is remembered (Ctrl+E will show last results)
* license info
* fix: open_when_unique parameter would not make the dialog re-close immediately any more
* fix: notebook located at the Windows drive root

# 1.2.1 (2021-11-02)
* main window resizing fix #46
* Zim 0.74 support #47

# 1.2 Python3.6 (2021-08-02)
* CHANGED: Check plugin preferences, some of them (using cache) reset to their defaults.
* preview: When navigating the search menu, the page is by default previewed first after 150 ms and opened after 1 500 ms. When the page is not very small, navigation gets stuck while a new page is being displayed.
* results are being updated every 200 ms, not instantly, so that the search process might take about 30 % less time
* window title shows if the search has finished
* external fulltext search
* internal Zim search dropped
* wildcard option removed
* searched texts are temporarily cached, drastically improving speed
* geometry fix, stays on position when results are longer than the dialog width
* page base name is in bold in results
* good caret behaviour
* good keyboard navigation
* fixed title search ('!t' would not match 'Other:Test' since 't' was incorrectly consumed in 'Other')

# 1.1 last Python3.5- (2020-06-09)

# 1.04 0.68 compatible (2018-06-18)
New config options, Tab and Shift+Tab for moving caret

# 1.03 Titles first (2017-01-16)
Since the changes in this commit, searching has become even smoother for me. Page titles are now given priority and ordered first.

# 1.02 History excluded (2016-08-27)
Traversing the search results should not fill up your Zim history now ;)

# 1.01 #13 (2016-08-27)
many bug fixes, plugin is more reliable

# 1.0 Instant search for Zim (2016-06-03)
* For the first few keystrokes, there is an ultraquick search in page titles, and then a reliable internal Zim search is run.
* more reliable search (fewer bugs)
* if I write "linux", the search for "linux" ends first, and then for "linu", "lin" in the background
* if I write "linu" and then "linux", the results are only updated, they are not cleared.
* if I write "linux" and then paste the word "liposuction" from the clipboard, the results are only updated, not cleared. So if there is a page with both words "linux" and "liposuction", the page does not vanish from the menu for a while. You wouldn't say you need this feature, but I am convinced that if it disappeared, you would be confused and disappointed.
* if the searched string occurs only in the page title, and there only as one of the parent names, the result gets omitted. It's useful. When I search for "linux", I want to see the page "os:linux", but not "os:linux:technologies:mydisk:foo:bar".
* While traversing the results with the arrows, the page is displayed with a tiny delay so that you can quickly traverse the whole list.

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
                    GNU GENERAL PUBLIC LICENSE
                       Version 2, June 1991

 Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users.  This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it.  (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.)  You can apply it to
your programs, too.

  When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.

  To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

  For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have.  You must make sure that they, too, receive or can get the
source code.  And you must show them these terms so they know their
rights.

  We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

  Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software.  If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

  Finally, any free program is threatened constantly by software
patents.  We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary.  To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.

  The precise terms and conditions for copying, distribution and
modification follow.

                    GNU GENERAL PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License.  The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language.  (Hereinafter, translation is included without limitation in
the term "modification".)  Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.

  1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.

You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.

  2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

    a) You must cause the modified files to carry prominent notices
    stating that you changed the files and the date of any change.

    b) You must cause any work that you distribute or publish, that in
    whole or in part contains or is derived from the Program or any
    part thereof, to be licensed as a whole at no charge to all third
    parties under the terms of this License.

    c) If the modified program normally reads commands interactively
    when run, you must cause it, when started running for such
    interactive use in the most ordinary way, to print or display an
    announcement including an appropriate copyright notice and a
    notice that there is no warranty (or else, saying that you provide
    a warranty) and that users may redistribute the program under
    these conditions, and telling the user how to view a copy of this
    License.  (Exception: if the Program itself is interactive but
    does not normally print such an announcement, your work based on
    the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

  3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

    a) Accompany it with the complete corresponding machine-readable
    source code, which must be distributed under the terms of Sections
    1 and 2 above on a medium customarily used for software interchange; or,

    b) Accompany it with a written offer, valid for at least three
    years, to give any third party, for a charge no more than your
    cost of physically performing source distribution, a complete
    machine-readable copy of the corresponding source code, to be
    distributed under the terms of Sections 1 and 2 above on a medium
    customarily used for software interchange; or,

    c) Accompany it with the information you received as to the offer
    to distribute corresponding source code.  (This alternative is
    allowed only for noncommercial distribution and only if you
    received the program in object code or executable form with such
    an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for
making modifications to it.  For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable.  However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.

  4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License.  Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.

  5. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Program or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.

  6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.

  7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all.  For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

  8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded.  In such case, this License incorporates
the limitation as if written in the body of this License.

  9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time.  Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number.  If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation.  If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.

  10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission.  For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this.  Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

                            NO WARRANTY

  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

                     END OF TERMS AND CONDITIONS

            How to Apply These Terms to Your New Programs

  If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

  To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program; if not, write to the Free Software Foundation, Inc.,
    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

    Gnomovision version 69, Copyright (C) year name of author
    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it
    under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License.  Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary.  Here is a sample; alter the names:

  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
  `Gnomovision' (which makes passes at compilers) written by James Hacker.

  <signature of Ty Coon>, 1 April 1989
  Ty Coon, President of Vice

This General Public License does not permit incorporating your program into
proprietary programs.  If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library.  If this is what you want to do, use the GNU Lesser General
Public License instead of this License.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Instant Search Plugin for Zim Wiki
Search as you type in Zim, in a similar manner to OneNote's Ctrl+E.

When you hit Ctrl+E, a small window opens where you can type. As you type the third letter, every page that matches your search is listed. You can walk through the results with the Up/Down arrows or the Home/End/Page Up/Page Down keys, hit Enter to stay on the page, or Esc to cancel.
Much quicker than the current Zim search.

## Working with & Feedback
Known to work on:

* Ubuntu 15.10+ (still working on 20.04)
* Win 7 Zim 0.63+
* Debian 8.9 Zim 0.62+

I'd be glad to hear whether it's working for you, either here in the issues or in the original bug on [launchpad](https://bugs.launchpad.net/zim/+bug/1409626).

With the old Zim 0.68 you may want to use the [last release](https://github.com/e3rd/zim-plugin-instantsearch/releases/tag/1.04) that is 0.68 compatible.
### Installation
Same as for the other plugins.
* Put instantsearch.py into the plugins folder
  * something like %appdata%\zim\data\zim\plugins on Windows, or ~/.local/share/zim/plugins/ on Linux
* Enable the plugin in Zim under Edit/Preferences/Plugins by checking Instant search.
* Press Ctrl+E and see if it's working; if not, report it here

## Demonstration on YouTube
Wanna see how it looks in action? In this example, I just search for the string "linux f" twice.
[![Demonstration](https://img.youtube.com/vi/nB2SfxDhEoM/0.jpg)](https://www.youtube.com/watch?v=nB2SfxDhEoM)

## Notes
* Prepend your string with an exclamation mark `!` to search in page names only
* A page is found if it contains every piece of the string you write, e.g. `tour hou` will match a page containing the words `contour` and `silhouette`
* Favours page names, headers and exact query string matches; those are ordered first.
* More reliable than the current version of the internal Zim search, where the query `economical` is not recognized if part of the text is bold: `economi**cal**` (however highlighting works great), if a link is inserted in the middle: `economi[[inserted link]]cal`, or if the query is hidden in the link: `[[http://economical.example.com|link]]`.

# Copyright and License
Edvard Rejthar, [CSIRT.cz](https://csirt.cz), released under [LICENSE](LICENSE).
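
# Query matching sketch
The matching rules from the Notes can be pictured with a few lines of Python. This is only a simplified illustration, not the plugin's code: the helper names `matches` and `search`, the sample `pages` dictionary, and the choice to search the page name and the contents as one joined text are all assumptions made for the example. Ranking, caching and highlighting are left out.

```python
from typing import Dict, List


def matches(query: str, text: str) -> bool:
    """Return True if every whitespace-separated piece of `query` occurs in `text`."""
    haystack = text.lower()
    return all(piece in haystack for piece in query.lower().split())


def search(query: str, pages: Dict[str, str], title_char: str = "!") -> List[str]:
    """Return the names of matching pages (hypothetical helper).

    A leading `title_char` restricts the search to page names only.
    """
    names_only = query.startswith(title_char)
    if names_only:
        query = query[len(title_char):]
    if not query.strip():
        return []
    found = []
    for name, contents in pages.items():
        # Assumption: pieces of the query may be spread over name and contents.
        searched = name if names_only else f"{name}\n{contents}"
        if matches(query, searched):
            found.append(name)
    return found


# Made-up notebook: page name -> page contents
pages = {
    "Art:Drawing": "A contour follows the silhouette of the figure.",
    "Travel:Tour": "Book the house early.",
    "OS:Linux": "Notes about kernels.",
}

print(search("tour hou", pages))  # ['Art:Drawing', 'Travel:Tour']
print(search("!tour", pages))     # ['Travel:Tour']
```

The first query finds `Art:Drawing` because `tour` sits inside `contour` and `hou` inside `silhouette`; the second query looks at page names only, so just `Travel:Tour` remains.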

--------------------------------------------------------------------------------
/instantsearch.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Search instantly as you type. Edvard Rejthar
# https://github.com/e3rd/zim-plugin-instantsearch
#
# Note that the search might not work well in case of case-folded letters
# because re.IGNORECASE seems to perform str.lower only. A fix might be implemented if requested.
# Use case:
#   re.match("tsChüß".casefold(), "Tschüß".casefold())  # matches
#   re.match("tsChüss", "Tschüß", re.IGNORECASE)  # does not match
#
#
import logging
from os.path import abspath
import re
from collections import defaultdict
from copy import deepcopy
from pathlib import Path
from time import time, perf_counter
from types import SimpleNamespace
from typing import Dict, List, DefaultDict, NamedTuple, Optional, Union

from gi.repository import GObject, Gtk, Gdk
from gi.repository.GLib import markup_escape_text
from zim.actions import action
from zim.gui.mainwindow import MainWindow, MainWindowExtension
from zim.gui.widgets import Dialog
from zim.gui.widgets import InputEntry
from zim.history import HistoryList
from zim.newfs import base, File, LocalFile
from zim.notebook import Path as ZimPath
from zim.plugins import PluginClass
from zim.search import Query, SearchSelection

logger = logging.getLogger('zim.plugins.instantsearch')


class _FileCache(NamedTuple):
    path: ZimPath
    contents: str


file_cache: Dict[Path, _FileCache] = {}
# if the search dialog closes, cached files are no longer fresh, they might have been changed meanwhile
file_cache_fresh = True


class InstantSearchPlugin(PluginClass):
    plugin_info = {
        'name': _('Instant Search'),  # T: plugin name
        'description': _('''\
Instant search allows you to filter as you type feature known from I.E. OneNote.
When you hit Ctrl+E, small window opens, in where you can type.
As you type third letter, every page that matches your search is listed.
You can walk through by UP/DOWN arrow, hit Enter to stay on the page, or Esc to cancel.
Much quicker than current Zim search.

(V1.2)
'''),
        'author': "Edvard Rejthar"

    }

    POSITION_CENTER = _('center')  # T: option value
    POSITION_RIGHT = _('right')  # T: option value

    PREVIEW_ONLY = "preview_only"
    PREVIEW_THEN_FULL = "preview_then_full"
    FULL_ONLY = "full_only"

    PREVIEW_MODE = (
        (PREVIEW_THEN_FULL, _('Preview then full view')),
        (PREVIEW_ONLY, _('Preview only')),
        (FULL_ONLY, _('Full view only')),
    )

    plugin_preferences = (
        # T: label for plugin preferences dialog
        ('title_match_char', 'string', _('Match title only if query starting by this char'), "!"),
        ('start_search_length', 'int', _('Start the search when number of letters written'), 3, (0, 10)),
        ('keystroke_delay', 'int', _('Keystroke delay before search'), 150, (0, 5000)),
        ('keystroke_delay_open', 'int', _('Keystroke delay for opening page in full view'
                                          '\n(Low value might prevent search list smooth navigation'
                                          ' if page is big.)'), 1500, (0, 5000)),
        ('preview_mode', 'choice', _('Preview mode'), PREVIEW_THEN_FULL, PREVIEW_MODE),
        ('preview_short', 'bool', _('Preview only matching lines'
                                    '\nOtherwise whole page is displayed if not too long.)'), False),
        ('highlight_search', 'bool', _('Highlight search'), True),
        ('ignore_subpages', 'bool', _("Ignore sub-pages (if ignored, search 'linux'"
                                      " would return page:linux but not page:linux:subpage"
                                      " (if in the subpage, there is no occurrence of string 'linux')"), True),
        # ('is_cached', 'bool',
        # _("Cache results of a search to be used in another search. (Till the end of zim process.)"), True),
        ('open_when_unique', 'bool', _('When only one page is found, open it automatically.'), True),
        ('position', 'choice', _('Popup position'), POSITION_RIGHT, (POSITION_RIGHT, POSITION_CENTER))
    )


class InstantSearchMainWindowExtension(MainWindowExtension):
    gui: "Dialog"
    state: "State"
    cached_titles: List[str]
    window: MainWindow
    prevent_closing = False  # if `open_when_unique` is active, having single query in the result would immediately re-close the dialog

    def __init__(self, plugin, window):
        super().__init__(plugin, window)
        self.timeout = None
        self.timeout_open_page = None  # will open page after keystroke delay
        self.timeout_open_page_preview = None  # will open page after keystroke delay
        self.last_query = None
        self.query_o = None
        self.caret = None
        self.original_page = None
        self.original_history = None
        self.selection = None
        self.menu_page = None
        self.is_closed = None
        self.last_page = self.last_page_preview = None
        self.label_object = None
        self.input_entry = None
        self.label_preview = None
        self.preview_pane = None
        self._last_update = 0
        self.state = None
        self.caret = SimpleNamespace(pos=0, text="", stick=False)  # cursor position

        # preferences
        State.title_match_char = self.plugin.preferences['title_match_char']
        State.start_search_length = self.plugin.preferences['start_search_length']
        self.keystroke_delay_open = self.plugin.preferences['keystroke_delay_open']
        self.keystroke_delay = self.plugin.preferences['keystroke_delay']

    # noinspection PyArgumentList,PyUnresolvedReferences
    @action(_('_Instant search'), accelerator='<ctrl>e', menuhints='tools')  # T: menu item
    def instant_search(self):

        # init
        self.cached_titles: List[ZimPathStr] = []
        self.last_query = ""  # previous user input
        self.query_o = None
        self.original_page = self.window.page.name  # we return here after escape
        self.original_history = list(self.window.history.uistate["list"])
        self.selection = None
        # if not self.plugin.preferences['is_cached']:
        # reset last search results
        # State.reset()
        self.menu_page = None
        self.is_closed = False
        self.last_page = None

        # building quick title cache
        def build(start=""):
            o = self.window.notebook.pages
            for s in o.list_pages(ZimPath(start or ":")):
                start2 = (start + ":" if start else "") + s.basename
                self.cached_titles.append(start2)
                build(start2)

        build()

        # Gtk
        self.gui = Dialog(self.window, _('Search'), buttons=None, defaultwindowsize=(300, -1))
        self.gui.resize(300, 100)  # reset size
        self.input_entry = InputEntry()
        self.input_entry.connect('key_press_event', self.move)
        self.input_entry.connect('changed', self.change)  # self.change is needed by GObject or something
        self.gui.vbox.pack_start(self.input_entry, expand=False, fill=True, padding=0)
        # noinspection PyArgumentList
        self.label_object = Gtk.Label(label='')
        self.label_object.set_size_request(300, -1)
        self.gui.vbox.pack_start(self.label_object, expand=False, fill=True, padding=0)

        # preview pane
        self.label_preview = Gtk.Label(label='...loading...')
        # not sure if this has effect, longer lines without spaces still make window inflate
        self.label_preview.set_line_wrap(True)
        self.label_preview.set_xalign(0)  # align to the left
        self.label_preview.set_valign(Gtk.Align.START)  # align to the top
        self.preview_pane = Gtk.VBox()

        inner_container = Gtk.ScrolledWindow()
        inner_container.set_policy(Gtk.PolicyType.AUTOMATIC, Gtk.PolicyType.AUTOMATIC)
        inner_container.add(self.label_preview)
        h = self.window.pageview.textview.get_allocated_height() - 25
        inner_container.set_min_content_height(h)
        inner_container.set_max_content_height(h)

        self.preview_pane.pack_start(inner_container, False, False, 5)
        self.window.pageview.pack_start(self.preview_pane, False, False, 5)

        # gui geometry
        self.geometry(init=True)

        self.gui.show_all()

        if self.state:
            self.prevent_closing = True
            self.input_entry.set_text(self.state.raw_query)
            self.input_entry.select_region(0, -1)
            self.change(None)
            self.prevent_closing = False

    def geometry(self, init=False, repeat=True, force=False):
        if repeat and not init:
            # I do not know how to catch callback when result list's width is final, so we align several times
            [GObject.timeout_add(x, lambda: self.geometry(repeat=False, force=force)) for x in (30, 50, 70, 400)]
            # it is not worth continuing now because the Gtk redraw is often delayed, which would mean
            # the Dialog dimensions change twice in a row
            return

        px, py = self.window.get_position()
        pw, ph = self.window.get_size()
        init_w, init_h = 300, 100
        if init:
            x, y = None, None
            w, h = init_w, init_h
        else:
            x, y = self.gui.get_position()
            w, h = self.gui.get_allocated_width(), self.gui.get_allocated_height()
        if self.plugin.preferences['position'] == InstantSearchPlugin.POSITION_RIGHT:
            x2, y2 = px + pw - w, py
        elif self.plugin.preferences['position'] == InstantSearchPlugin.POSITION_CENTER:
            x2, y2 = px + (pw / 2) - w / 2, py + (ph / 2) - 250
        else:
            raise AttributeError("Instant search: Wrong position preference.")

        if init or x != x2 or force:
            self.gui.resize(init_w, init_h)
            self.gui.move(x2, y2)

    def title(self, title=""):
        self.gui.set_title("Search " + title)

    def change(self, _):  # widget, event,text
        if self.timeout:
            GObject.source_remove(self.timeout)
            self.timeout = None
        q = self.input_entry.get_text()
        if q == self.last_query:
            return
        if q == State.title_match_char:
            return
        if q and q[-1] == "∀":  # easter egg: debug option for zim --standalone
            q = q[:-1]
            import ipdb
            ipdb.set_trace()
        self.state = State.set_current(q)

        if not self.state.is_finished:
            if self.start_search():
                self.process_menu()
        else:  # search completed before
            # If we did not clear the cache in .close(), we would have to reset scores
            # and re-start search by self.start_search() for the case a page changed meanwhile.
            self.check_last()
            self.sout_menu()

        self.last_query = q

    def start_search(self):
        """ Search string has certainly changed. We search in indexed titles and/or we start fulltext search.
        :rtype: True if no other search is needed and we may output the menu immediately.

        """

        query = self.state.query
        menu = self.state.menu

        if not query:
            return True

        SearchController.header_search(query, menu, self.cached_titles)

        if self.state.page_name_only:
            return True
        else:
            if not self.state.previous or len(query) == State.start_search_length:
                # quickly show page title search results before the longer fulltext search is ready
                # Either there is no previous state – query might have been copied into input
                # or the query is finally long enough to start fulltext search.
                # It is handy to show the filtered page names first because
                # jumping to queries matched in page names is a frequent use case.
285 | self.process_menu(ignore_geometry=True) 286 | 287 | self.title("..") 288 | self.timeout = GObject.timeout_add(self.keystroke_delay, 289 | self.start_zim_search) # ideal delay between keystrokes 290 | 291 | def start_zim_search(self): 292 | """ Starts search for the input. """ 293 | self.title("...") 294 | if self.timeout: 295 | GObject.source_remove(self.timeout) 296 | self.timeout = None 297 | self.query_o = Query(self.state.query) 298 | 299 | # it should be quicker to find the string, if we provide this subset from last time 300 | # (in the case we just added a letter, so that the subset gets smaller) 301 | # last_sel = self.selection if self.is_subset and self.state.previous and self.state.previous.is_finished 302 | # else None 303 | selection = self.selection = SearchSelection(self.window.notebook) 304 | state = self.state # this is a thread, so that self.state might change before search finishes 305 | 306 | # internal search disabled - it was way too slower 307 | # selection.search(self.query_o, selection=last_sel, callback=self._search_callback(state)) 308 | # self._update_results(selection, state, force=True) 309 | # self.title("....") 310 | 311 | # fulltext external search 312 | # Loop either all .txt files in the notebook or narrow the search with a previous state 313 | if state.previous and state.previous.is_finished and state.previous.matching_files is not None: 314 | paths = state.previous.matching_files 315 | # see below paths_cached_set = (p for p in files_set if p in InstantSearchPlugin.file_cache) 316 | else: 317 | extension = "*" + self.window.notebook.config["Notebook"]["default_file_extension"] # ex: "*.txt" 318 | # Why the slash "/" after the notebook folder? #51 319 | # If the notebook sits on the root dir in Windows, joining the notebook path "G:" 320 | # and the rglob produces path like "G:file.txt" which is a perfectly valid Windows path. 
321 | # Missing slash means relative CWD on the drive G in Windows system 322 | # but Zim seems not to be aware of such a strange Windows behaviour. Hence, putting it into 323 | # self.window.notebook.layout.map_file / base.FilePath.relpath gives ValueError 'Not a parent path G:'. 324 | # Resolving files to the absolute paths by `f.resolve()` might fail as well because the drive G: 325 | # may point to another folder like C:\mount, and C:\mount\file.txt is not under the notebook parent 326 | # path "G:" as well. 327 | # The best solution is to force the notebook folder to have the slash to be sure we get such 328 | # half-absolute paths. 329 | # It's IMHO the bug of the Zim that it does not include trailing slash which is ok till the dir 330 | # is the root drive, while the path reported becomes relative ("G:" – relative to CWD on G, "G:\\" – absolute). 331 | paths = (f for f in Path(abspath(str(self.window.notebook.folder))).rglob(extension) if f.is_file()) 332 | # see below paths_cached_set = (p for p in InstantSearchPlugin.file_cache) 333 | state.matching_files = [] 334 | 335 | # This cached search takes about 60 ms, so I let it commented. 336 | # However on HDD disks this may boost performance. 337 | # We may do an option: "empty cache immediately after close (default)", 338 | # "search cache first and then do the fresh search (HDD)" 339 | # "use cache always (empties cache after Zim restart)" 340 | # "empty cache after 5 minutes" 341 | # and then prevent to clear the cache in .close(). 342 | # Or rather we may read file mtime and re-read if only it has been changed since last search. 343 | # if not InstantSearchPlugin.file_cache_fresh: 344 | # # Cache might not be fresh but since it is quick, perform quick non-fresh-cached search 345 | # # and then do a fresh search. If we are lucky enough, results will not change. 
346 | # # using temporary selection so that files will not received double points for both cached and fresh loop 347 | # selection_temp = SearchSelection(self.window.notebook) 348 | # self.start_external_search(selection_temp, state, paths_cached_set) 349 | # InstantSearchPlugin.file_cache_fresh = True 350 | # InstantSearchPlugin.file_cache.clear() 351 | self.start_external_search(selection, state, paths) 352 | 353 | state.is_finished = True 354 | 355 | if state == self.state: 356 | self.check_last() 357 | 358 | self.process_menu(state=state) 359 | self.title() 360 | 361 | def start_external_search(self, selection, state: "State", paths: List[Path]): 362 | """ Zim internal search is not able to find out text with markup. 363 | Ex: 364 | 'economical' is not recognized as 'economi**cal**' (however highlighting works great), 365 | as 'economi[[inserted link]]cal' 366 | as 'any text with [[http://economical.example.com|link]]' 367 | 368 | This fulltext search loops all .txt files in the notebook directory 369 | and tries to recognize the patterns. 370 | """ 371 | 372 | # divide query to independent words "foo economical" -> "foo", "economical", page has to contain both 373 | # strip markup: **bold**, //italic//, __underline__, ''verbatim'', ~~strike through~~ 374 | # matches query "economi**cal**" 375 | 376 | def letter_split(q): 377 | """ Every letter is divided by a any-formatting-match-group and escaped. 378 | 'foo.' -> 'f[*/'_~]o[*/'_~]o[*/'_~]\\.' 
379 | """ 380 | return r"[*/'_~]*".join(re.escape(c) for c in q) 381 | 382 | sub_queries = state.query.split(" ") 383 | 384 | # regexes to check that every sub_query is present in the text 385 | queries = [(q, re.compile(letter_split(q), re.IGNORECASE)) for q in sub_queries] 386 | 387 | # regex to check whether the exact (whole) query is present 388 | exact_query = re.compile(letter_split(state.query), re.IGNORECASE) if len(sub_queries) > 1 else None 389 | 390 | # regex to count the number of the sub_queries present and to optionally add information about header used 391 | header_queries = [re.compile("(\n=+ .*)?" + letter_split(q), re.IGNORECASE) for q in sub_queries] 392 | 393 | # regex to identify inner link contents 394 | link = re.compile(r"\[\[(.*?)\]\]", re.IGNORECASE) # matches all links "economi[[inserted link]]cal" 395 | 396 | start = perf_counter() 397 | 398 | for path in paths: 399 | if path not in file_cache: 400 | try: 401 | contents = path.read_text(encoding='UTF-8', errors='replace') 402 | except UnicodeDecodeError as err: 403 | # Ignore the file and skip to the next path 404 | logger.warning("Skipping path %s due to invalid character encoding error: %s", path, err) 405 | continue 406 | # strip header 407 | if contents.startswith('Content-Type: text/x-zim-wiki'): 408 | # XX will that work on Win? 409 | # I should use more general separator IMHO in the whole file rather than '\n'.
410 | contents = contents[contents.find("\n\n"):] 411 | zim_path = self._path2zim(path) 412 | file_cache[path] = _FileCache(zim_path, contents) 413 | else: 414 | zim_path, contents = file_cache[path].path, file_cache[path].contents 415 | 416 | matched_links = [] 417 | 418 | def matched_link(match): 419 | matched_links.append(match.group(1)) 420 | return "" 421 | 422 | # pull out links "economi[[inserted link]]cal" -> "economical" + "inserted link" 423 | txt_body = link.sub(matched_link, contents) 424 | txt_links = "".join(matched_links) 425 | 426 | # wanted terms do not occur in the page name, waiting to be found in the page contents 427 | wanted = [(None, reg) for q, reg in queries if q not in str(zim_path).casefold()] 428 | 429 | def found(it): # whether sub queries are found in the text 430 | return (reg.search(txt_body) or reg.search(txt_links) for _, reg in it) 431 | 432 | # Process, if not all query-terms (pieces, words, bits) are found in the page name 433 | # and thus the page would be ignored, but all of the remaining terms are found withing the page contents. 434 | # Or if all the terms are included in the page name (anywhere in the page name, 435 | # it does not have to be in its final part, in the least subpage), process if any of the terms are found 436 | # within the page contents as a bonus. 437 | if wanted and all(found(wanted)) or not wanted and any(found(queries)): 438 | # if remaining and all(reg.search(txt_body) or reg.search(txt_links) for reg in remaining): 439 | # score = header order * 3 + body match count * 1 440 | # if there are '=' equal chars before the query, it is header. The bigger number, the bigger header. 441 | # Header 5 corresponds to 3 points, Header 1 to 7 points. XX it seems Header 5 ~ 3 points, Header 1 ~ 15 points. Might be more IMHO, like * 5 instead of * 3. 
442 | score = sum([len(m.group(1)) * 3 if m.group(1) else 1 443 | for q in header_queries for m in q.finditer(txt_body)]) 444 | if exact_query: # there are sub-queries, we favourize full-match 445 | score += 100 * len(exact_query.findall(txt_body)) 446 | 447 | # noinspection PyProtectedMember 448 | # score might be zero because we are not re-checking against txt_links matches 449 | selection._count_score(zim_path, score or 1) 450 | state.matching_files.append(path) 451 | elif not wanted: 452 | # The page is not eligible for fulltext search now. However, a term (part of the query) may appear 453 | # that will render the page thrown up from the page name search alone 454 | # but is included in the page contents. 455 | # Use case: 456 | # Step 1: Query "linux foo" matches page "Linux:foo" while neither term is in the page contents ('bar'). 457 | # Step 2: Query "linux foo b" matches page "Linux:foo" because 'bar' is in the page contents. 458 | state.matching_files.append(path) 459 | 460 | logger.info("[Instantsearch] External search: %g s", perf_counter() - start) 461 | self._update_results(selection, state, force=True) 462 | 463 | def check_last(self): 464 | """ opens the page if there is only one option in the menu """ 465 | if len(self.state.menu) == 1 and self.plugin.preferences['open_when_unique']: 466 | self._open_page(ZimPath(list(self.state.menu)[0]), exclude_from_history=False) 467 | if not self.prevent_closing: 468 | self.close() 469 | elif not len(self.state.menu): 470 | self._open_original() 471 | 472 | def _search_callback(self, state): 473 | def _(results, _path): 474 | if results is not None: 475 | # we finish the search even if another search is running. 
476 | # If returned False, the search would be cancelled 477 | self._update_results(results, state) 478 | while Gtk.events_pending(): 479 | Gtk.main_iteration() 480 | return True 481 | 482 | return _ 483 | 484 | def _update_results(self, results, state: "State", force=False): 485 | """ 486 | This method may run many times, due to the _update_results, which are updated many times, 487 | the results are appearing one by one. However, if called earlier than 0.2 s, ignored. 488 | 489 | Measures: 490 | If every callback would be counted, it takes 3500 ms to build a result set. 491 | If callbacks earlier than 0.6 s -> 2300 ms, 0.3 -> 2600 ms, 0.1 -> 2800 ms. 492 | """ 493 | if not force and time() < self._last_update + 0.2: # if update callback called earlier than 200 ms, ignore 494 | return 495 | self._last_update = time() 496 | 497 | changed = False 498 | 499 | for option in results.scores: 500 | if option.name not in state.menu or ( 501 | state.menu[option.name].page_score < 0 and state.menu[option.name].score == 0): 502 | changed = True 503 | o: _MenuItem = state.menu[option.name] 504 | o.score = results.scores[option] # includes into options 505 | o.path = option.name 506 | 507 | if changed: # we added a page 508 | self.process_menu(state=state, sort=False) 509 | else: 510 | pass 511 | 512 | def process_menu(self, state=None, sort=True, ignore_geometry=False): 513 | """ Sort menu and generate items and sout menu. """ 514 | if state is None: 515 | state = self.state 516 | 517 | if sort: 518 | state.items = sorted(state.menu.values(), reverse=True, key=lambda item: ( 519 | item.page_highlight, item.score + item.page_score, -item.path.count(":"), item.path)) 520 | else: 521 | # when search results are being updated, it's good when the order does not change all the time. 522 | # So that the first result does not become for a while 10th and then become first back. 
523 | state.items = sorted(state.menu.values(), reverse=True, 524 | key=lambda item: (item.page_highlight, -item.last_order)) 525 | 526 | # Items appear only if they have score either from the page contents or the page name search. 527 | # And if the score comes from the page name search only, page_insufficient must be True 528 | # (at least one term appears in the least subpage name). 529 | # Note: I do not know why there are items with score 0 if internal Zim search used 530 | state.items = [page for page in state.items if 531 | (page.score or not page.page_insufficient) 532 | and (page.score + page.page_score) > 0] 533 | 534 | if state == self.state: 535 | self.sout_menu(ignore_geometry=ignore_geometry) 536 | 537 | def sout_menu(self, display_immediately=False, caret_move=None, ignore_geometry=False): 538 | """ Displays menu and handles caret position. """ 539 | if self.timeout_open_page: 540 | GObject.source_remove(self.timeout_open_page) 541 | self.timeout_open_page = None 542 | if self.timeout_open_page_preview: 543 | GObject.source_remove(self.timeout_open_page_preview) 544 | self.timeout_open_page_preview = None 545 | 546 | # caret: 547 | # by default stays at position 0 548 | # If moved to a page, it keeps the page. 549 | # If moved back to position 0, stays there. 
550 | if caret_move is not None: 551 | if caret_move == 0: 552 | self.caret.pos = 0 553 | else: 554 | self.caret.pos += caret_move 555 | self.caret.stick = self.caret.pos != 0 556 | elif self.state.items and self.caret.stick: 557 | # identify current caret position, depending on the text 558 | self.caret.pos = next((i for i, item in enumerate(self.state.items) if item.path == self.caret.text), 0) 559 | # treat possible caret deflection 560 | if self.caret.pos < 0: 561 | # place the caret to the beginning or the end of list 562 | self.caret.pos = 0 563 | elif self.caret.pos >= len(self.state.items): 564 | self.caret.pos = 0 if caret_move == 1 else len(self.state.items) - 1 565 | 566 | text = [] 567 | for i, page in enumerate(self.state.items): 568 | score = page.score + page.page_score 569 | page.last_order = i 570 | pieces = page.path.split(":") 571 | pieces[-1] = f"<b>{pieces[-1]}</b>" 572 | s = ":".join(pieces) 573 | if i == self.caret.pos: 574 | self.caret.text = page.path # caret is at this position 575 | text.append(f'→ {s} ({score})') 576 | else: 577 | text.append(f'{s} ({score})') 578 | text = "No result" if not text and self.state.is_finished else "\n".join(text) 579 | 580 | self.label_object.set_markup(text) 581 | self.menu_page = ZimPath(self.caret.text if len(self.state.items) else self.original_page) 582 | 583 | if not display_immediately: 584 | if self.plugin.preferences['preview_mode'] != InstantSearchPlugin.PREVIEW_ONLY: 585 | self.timeout_open_page = GObject.timeout_add(self.keystroke_delay_open, self._open_page, 586 | self.menu_page) # ideal delay between keystrokes 587 | if self.plugin.preferences['preview_mode'] != InstantSearchPlugin.FULL_ONLY: 588 | self.timeout_open_page_preview = GObject.timeout_add(self.keystroke_delay, self._open_page_preview, 589 | self.menu_page) # ideal delay between keystrokes 590 | else: 591 | self._open_page(self.menu_page) 592 | # we force here geometry to redraw because often we end up with "No result" page that is very
tall 593 | # because of a many records just hidden 594 | 595 | if not ignore_geometry: 596 | self.geometry(force=True) 597 | 598 | def move(self, widget, event): 599 | """ Move caret up and down. Enter to confirm, Esc closes search.""" 600 | key_name = Gdk.keyval_name(event.keyval) 601 | 602 | # handle basic caret movement 603 | moves = {"Up": -1, "ISO_Left_Tab": -1, "Down": 1, "Tab": 1, "Page_Up": -10, "Page_Down": 10} 604 | if key_name in moves: 605 | self.sout_menu(display_immediately=False, caret_move=moves[key_name]) 606 | elif key_name in ("Home", "End"): 607 | if event.state & Gdk.ModifierType.CONTROL_MASK or event.state & Gdk.ModifierType.SHIFT_MASK: 608 | # Ctrl/Shift+Home jumps to the query input text start 609 | return 610 | if key_name == "Home": # Home jumps at the result list start 611 | self.sout_menu(display_immediately=False, caret_move=0) 612 | widget.emit_stop_by_name("key-press-event") 613 | else: 614 | self.sout_menu(display_immediately=False, caret_move=float("inf")) 615 | widget.emit_stop_by_name("key-press-event") 616 | 617 | # confirm or cancel 618 | elif key_name == "KP_Enter" or key_name == "Return": 619 | self._open_page(self.menu_page, exclude_from_history=False) 620 | self.close() 621 | elif key_name == "Escape": 622 | self._open_original() 623 | self.is_closed = True # few more timeouts are on the way probably 624 | self.close() 625 | 626 | return 627 | 628 | def close(self): 629 | """ Safely (closes gets called when hit Enter) """ 630 | if not self.is_closed: # if hit Esc, GTK has already emitted close itself 631 | self.is_closed = True 632 | self.gui.emit("close") 633 | 634 | # remove preview pane and show current text editor 635 | self._hide_preview() 636 | self.preview_pane.destroy() 637 | file_cache.clear() # until next search, pages might change 638 | 639 | def _open_original(self): 640 | self._open_page(ZimPath(self.original_page)) 641 | # we already have HistoryPath objects in the self.original_history, we cannot add them in 
the constructor 642 | # XX I do not know what is that good for 643 | hl = HistoryList([]) 644 | hl.extend(self.original_history) 645 | self.window.history.uistate["list"] = hl 646 | 647 | # noinspection PyProtectedMember 648 | def _open_page(self, page, exclude_from_history=True): 649 | """ Open page and highlight matches """ 650 | self._hide_preview() 651 | if self.timeout_open_page: # no delayed page will be open 652 | GObject.source_remove(self.timeout_open_page) 653 | self.timeout_open_page = None 654 | if self.timeout_open_page_preview: # no delayed preview page will be open 655 | GObject.source_remove(self.timeout_open_page_preview) 656 | self.timeout_open_page_preview = None 657 | 658 | # open page 659 | if page and page.name and page.name != self.last_page: 660 | self.last_page = page.name 661 | self.window.navigation.open_page(page) 662 | if exclude_from_history and list(self.window.history._history)[-1:][0].name != self.original_page: 663 | # there is no public API, so let's use protected _history instead 664 | self.window.history._history.pop() 665 | self.window.history._current = len(self.window.history._history) - 1 666 | if not exclude_from_history and self.window.history.get_current().name != page.name: 667 | # we insert the page to the history because it was likely to be just visited and excluded 668 | self.window.history.append(page) 669 | 670 | # Popup find dialog with same query 671 | if self.query_o: # and self.query_o.simple_match: 672 | string = self.state.query 673 | string = string.strip('*') # support partial matches 674 | if self.plugin.preferences['highlight_search']: 675 | # unfortunately, we can highlight single word only 676 | self.window.pageview.show_find(string.split(" ")[0], highlight=True) 677 | 678 | def _hide_preview(self): 679 | self.preview_pane.hide() 680 | # noinspection PyProtectedMember 681 | self.window.pageview._hack_hbox.show() 682 | 683 | def _path2zim(self, path: Path) -> ZimPath: 684 | return
self.window.notebook.layout.map_file(LocalFile(str(path)))[0] 685 | 686 | def _open_page_preview(self, page: ZimPath): 687 | """ Open preview which is far faster then loading and 688 | building big parse trees into text editor buffer when opening page. """ 689 | # note: if the dialog is already closed, we do not want a preview to open, but page still can be open 690 | # (ex: after hitting Enter the dialog can close before opening the page) 691 | 692 | if self.timeout_open_page_preview: 693 | # no delayed preview page will be open, however self.timeout_open_page might be still running 694 | GObject.source_remove(self.timeout_open_page_preview) 695 | self.timeout_open_page_preview = None 696 | 697 | # it does not pose a problem if we re-load preview on the same page; 698 | # the query text might got another letter to highlight 699 | if page and not self.is_closed: 700 | # show preview pane and hide current text editor 701 | self.last_page_preview = page.name 702 | 703 | local_file: File = self.window.notebook.layout.map_page(page)[0] 704 | path = Path(str(local_file)) 705 | if path in file_cache: 706 | s = file_cache[path].contents 707 | else: 708 | try: 709 | s = local_file.read() 710 | file_cache[path] = _FileCache(self._path2zim(path), s) 711 | except base.FileNotFoundError: 712 | s = f"page {page} has no content" # page has not been created yet 713 | lines = s.splitlines() 714 | 715 | # the file length is very small, prefer to not use preview here 716 | if self.plugin.preferences['preview_mode'] != InstantSearchPlugin.PREVIEW_ONLY and len(lines) < 50: 717 | return self._open_page(page, exclude_from_history=True) 718 | self.label_preview.set_markup(self._get_preview_text(lines, self.state.query)) 719 | 720 | # shows GUI (hidden in self._hide_preview() 721 | self.preview_pane.show_all() 722 | # noinspection PyProtectedMember 723 | self.window.pageview._hack_hbox.hide() 724 | 725 | def _get_preview_text(self, lines, query): 726 | max_lines = 200 727 | 728 | # check 
if the file is a Zim markup file and if so, skip header 729 | if lines[0] == 'Content-Type: text/x-zim-wiki': 730 | for i, line in enumerate(lines): 731 | if line == "": 732 | lines = lines[i + 1:] 733 | break 734 | 735 | if query.strip() == "": 736 | return "\n".join(line for line in lines[:max_lines]) 737 | 738 | # searching for "a" cannot match "&a", since markup_escape_text("&") -> "'" 739 | # Ignoring q == "b", it would interfere with multiple queries: 740 | # Ex: query "f b", text "foo", matched with "f" -> "foo", matched with "b" -> "<b>fb>" 741 | query_match = (re.compile("(" + re.escape(q) + ")", re.IGNORECASE) for q in query.split(" ") if q != "b") 742 | # too long lines caused strange Gtk behaviour – monitor brightness set to maximum, without any logged warning 743 | # so that I decided to put just extract of such long lines in preview 744 | # This regex matches query chunk in the line, prepends characters before and after. 745 | # When there should be the same query chunk after the first, it stops. 746 | # Otherwise, the second chunk might be halved and thus not highlighted. 747 | # Ex: query "test", text: "lorem ipsum text dolor text text sit amet consectetur" -> 748 | # ["ipsum text dolor ", "text ", "text sit amet"] (words "lorem" and "consectetur" are strip) 749 | line_extract = [re.compile("(.{0,80}" + re.escape(q) + "(?:(?!" 
+ re.escape(q) + ").){0,80})", re.IGNORECASE) 750 | for q in query.split(" ") if q != "b"] 751 | 752 | # grep some lines 753 | keep_all = not self.plugin.preferences["preview_short"] and len(lines) < max_lines 754 | lines_iter = iter(lines) 755 | chosen = [next(lines_iter)] # always include header as the first line, even if it does not contain the query 756 | for line in lines_iter: 757 | if len(chosen) > max_lines: # file is too long which would result the preview to not be smooth 758 | break 759 | elif keep_all or any(q in line.lower() for q in query.split(" ")): 760 | # keep this line since it contains a query chunk 761 | if len(line) > 100: 762 | # however, this line is too long to display, try to extract query and its neighbourhood 763 | s = "...".join("...".join(q.findall(line)) for q in line_extract).strip(".") 764 | if not s: # no query chunk was found on this line, the keep_all is True for sure 765 | chosen.append(line[:100] + "...") 766 | else: 767 | chosen.append("..." + s + "...") 768 | else: 769 | chosen.append(line) 770 | if not keep_all or len(chosen) > max_lines: 771 | # note that query might not have been found, ex: query "foo" would not find line with a bold 'o': "f**o**o" 772 | chosen.append("...") 773 | txt = markup_escape_text("\n".join(line for line in chosen)) 774 | 775 | # bold query chunks in the text 776 | for q in query_match: 777 | txt = q.sub(r"<b>\g<1></b>", txt) 778 | 779 | # preserve markup_escape_text entities 780 | # correct ex: '&amp;' -> '&' if searching for 'm' 781 | bold_tag = re.compile("") 782 | broken_entity = re.compile("&[a-z]* "State": 805 | """ Returns other state. 806 | raw_query may include '!'
sign for title only search 807 | """ 808 | raw_query = raw_query.lower() 809 | if raw_query not in State._states: 810 | State._states[raw_query] = State(raw_query) 811 | else: 812 | State._states[raw_query].first_seen = False 813 | State._current = State._states[raw_query] 814 | return State._current 815 | 816 | @classmethod 817 | def get(cls, query): 818 | return State._states[query.lower()] 819 | 820 | def __init__(self, raw_query): 821 | self.items: List[_MenuItem] = [] 822 | self.is_finished = False 823 | self.raw_query = r = raw_query # including '!' sign for title only search 824 | self.first_seen = True 825 | 826 | # we are subset of this state from the longest shorter query 827 | self.previous = next((State._states[r[:i]] for i in range(len(r), 0, -1) if r[:i] in State._states), None) 828 | 829 | # since having <= 3 letters uses less benevolent searching method, we cannot reduce the next step 830 | # ex: "!est" should not match "testing" but "!esti" should 831 | if self.previous and self.previous.page_name_only: 832 | self.previous = None 833 | 834 | if self.previous: 835 | self.menu = deepcopy(self.previous.menu) 836 | [item.reset_score() for item in self.menu.values()] 837 | else: 838 | self.menu: Menu = defaultdict(_MenuItem) 839 | 840 | # check if we query page titles only, based on the special '!' sign in the query text 841 | # first char is "!" 
-> searches in page name only 842 | self.page_name_only, self.query = (True, raw_query[len(State.title_match_char):].lower()) \ 843 | if raw_query.startswith(State.title_match_char) \ 844 | else (False, raw_query) 845 | if len(self.query) < State.start_search_length: 846 | self.page_name_only = True # search only in page names, not in page contents 847 | 848 | 849 | class _MenuItem: 850 | 851 | def __init__(self): 852 | self.path: Optional[ZimPathStr] = None 853 | self.score = 0 # score given by SearchSelection (page contents search) 854 | self.page_score = 0 # score from the page name search 855 | self.page_highlight = False # page name search priority match (query term is not in the middle of the word) 856 | self.last_order = 0 857 | 858 | # None of the query terms is in the least subpage name. Such results may appear only 859 | # if some of the term is found in the page context too. But the page search is insufficient. 860 | self.page_insufficient = False 861 | 862 | def reset_score(self): 863 | """ The item has been just copied from a previous state to narrow down the search. 864 | However, score will be re-counted. """ 865 | self.page_score = self.score = 0 866 | self.page_highlight = False 867 | 868 | 869 | ZimPathStr = str # may serve as an argument to the ZimPath constructor 870 | Menu = DefaultDict[ZimPathStr, _MenuItem] 871 | 872 | 873 | class SearchController: 874 | @staticmethod 875 | def header_search(query: str, menu: Menu, cached_titles: List[ZimPathStr]) -> None: 876 | # 'te' matches these page titles: 'test' or 'Journal:test' or 'foo test' or 'foo (test)' 877 | sub_queries_benevolent = [re.compile(r"(^|:|\s|\()?" + q, re.IGNORECASE) for q in query.split(" ")] 878 | # 'st' does not match those 879 | sub_queries_strict = [re.compile(r"(^|:|\s|\()" + q, re.IGNORECASE) for q in query.split(" ")] 880 | 881 | def in_query(txt) -> Union[int, bool]: 882 | """ False if any part of the query does not match. 
883 | If the query is longer >3 characters: 884 | * +10 for every query part that matches a title part beginning 885 | Ex: query 'te' -> +10 for these page titles: 886 | 'test' or 'Journal:test' or 'foo test' or 'foo (test)' 887 | * +1 for every query part 888 | Ex: query 'st' -> +1 for those page titles 889 | 890 | If the query is shorter <=3 characters: 891 | +10 for every query part that matches a title part beginning 'te' for 'test' 892 | False otherwise ('st' for 'test') so that you do not end up messed 893 | with page titles, after writing a single letter. 894 | """ 895 | try: 896 | if len(query) <= 3: 897 | # raises if subquery m does not match or is not at a page chunk beginning 898 | return sum(10 if m.group(1) is not None else None 899 | for m in (q.search(txt) for q in sub_queries_strict)) 900 | else: 901 | # raises if subquery m does not match 902 | return sum(10 if m.group(1) is not None else 1 903 | for m in (q.search(txt) for q in sub_queries_benevolent)) 904 | except (AttributeError, TypeError): # one of the sub_queries is not part of the page title 905 | return False 906 | 907 | # we loop either all cached page titles or menu that should be built from previous superset-query menu 908 | for path in list(menu) or cached_titles: # quick search in titles 909 | path_lower = path.casefold() 910 | path_end = path_lower[path_lower.rfind(":") + 1:] 911 | score = in_query(path_lower) 912 | 913 | if score: # 'te' matches 'test' or 'Journal:test' etc 914 | # "foo" in "foo:bar", but not in "bar" 915 | # when looping "foo:bar", page "foo" receives +1 for having a subpage 916 | # if all(q in path.lower() for q in query) \ 917 | # and any(q not in path.lower().split(":")[-1] for q in query): 918 | # menu[":".join(path.split(":")[:-1])].bonus += 1 # 1 point for having a subpage 919 | 920 | # Normally, zim search gives 11 points bonus if the search-string appears in the titles. 
                # If we are ignoring sub-pages, the search "foo" will match only page "journal:foo",
                # but not "journal:foo:subpage"
                # (and the score of the parent page gets higher by 1).
                # However, if there are occurrences of the string in the fulltext of the subpage,
                # the subpage remains in the result, but gets a bonus of only 2 points (not 11).
                # But internal zim search is now disabled.
                # menu[path].bonus = -11

                # 10 points for title (zim default) (so that it gets displayed before the search finishes)
                m = menu[path]
                m.page_score += score  # will be added to score (score will be reset)
                # if score > 9, this might be a priority match, not a fulltext page name match
                # ex "te" for "test" is priority, whereas "st" is just fulltext
                m.page_highlight = score > 9
                m.path = path

                # path_end is casefolded, so the query terms are casefolded before the comparison too
                m.page_insufficient = not any(q.casefold() in path_end for q in query.split())
            else:  # remove the item from menu if it was there before
                menu.pop(path, None)

--------------------------------------------------------------------------------
/tests.py:
--------------------------------------------------------------------------------
from collections import defaultdict
from unittest import TestCase, main

import gi

gi.require_version('Gtk', '3.0')

from instantsearch import SearchController, _MenuItem

cached_titles = [
    'Journal',
    'Journal:2021',
    'Journal:2021:12',
    'Journal:foo',
    'Journal:foo:bar',
    'Journal:foo:bar:fourth',
    'test',
    'Journal:test',
    'foo test',
    'foo (test)'
]


class TestSearch(TestCase):
    def _search(self, query, expected):
        menu = defaultdict(_MenuItem)
        SearchController.header_search(query, menu, cached_titles)
        self.assertListEqual(expected, [*menu])

    def test_header(self):
        self._search("foo", ['Journal:foo', 'Journal:foo:bar', 'Journal:foo:bar:fourth', 'foo test', 'foo (test)'])
        self._search("tes", ['test', 'Journal:test', 'foo test', 'foo (test)'])


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
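The title scoring rule described in the `in_query` docstring can be tried out without Zim or Gtk. The sketch below is a hypothetical standalone restatement, not part of the plugin: the name `title_score` is invented for illustration, and it escapes each query part with `re.escape`, which the plugin code does not do.

```python
import re
from typing import Union


def title_score(query: str, title: str) -> Union[int, bool]:
    """Hypothetical helper mirroring the scoring rule documented in in_query."""
    # Queries of up to 3 characters must start a title part (strict pattern);
    # longer queries may also match in the middle of a word (optional group).
    boundary = r"(^|:|\s|\()" if len(query) <= 3 else r"(^|:|\s|\()?"
    total = 0
    for part in query.split(" "):
        # Assumption: query parts are treated as literal text, hence re.escape.
        match = re.search(boundary + re.escape(part), title, re.IGNORECASE)
        if match is None:
            return False  # every query part has to be found in the title
        # Group 1 is set when the part follows a boundary: +10, otherwise +1.
        total += 10 if match.group(1) is not None else 1
    return total


if __name__ == "__main__":
    for query, title in [("te", "Journal:test"), ("st", "test"),
                         ("ourn", "Journal"), ("foo test", "foo (test)")]:
        print(repr(query), "in", repr(title), "->", title_score(query, title))
```

Short queries either score 10 per part or are rejected, while longer queries fall back to 1 point for a match in the middle of a word.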