22 |
23 | ## Introduction
24 |
25 | This package aims to index all fields the portal_catalog indexes and allows you to delete the `Title`, `Description` and `SearchableText` indexes, which can significantly improve performance and RAM usage.
26 |
27 | Elasticsearch queries are used ONLY when `Title`, `Description` or `SearchableText` are part of the query. Otherwise, Plone's default catalog is used, because it is faster than Elasticsearch for ordinary queries.
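
As a minimal sketch of what that routing means in practice (assuming a standard `portal_catalog` queried through `plone.api`; the actual decision logic lives inside this package):

```python
from plone import api

catalog = api.portal.get_tool("portal_catalog")

# Full-text criteria such as SearchableText are served by Elasticsearch
# once the add-on is enabled and activated.
es_backed = catalog(SearchableText="annual report")

# A plain metadata query like this one keeps using Plone's default catalog,
# which is faster for ordinary queries.
catalog_backed = catalog(portal_type="Document", review_state="published")
```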
28 |
29 |
30 | ## Install Elastic Search
31 |
32 | For comprehensive documentation about the different ways of installing Elasticsearch, please read [their documentation](https://www.elastic.co/guide/en/elasticsearch/reference/7.7/install-elasticsearch.html).
33 |
34 | A quick start using Docker would be:
35 |
36 | ```shell
37 | docker run \
38 | -e "discovery.type=single-node" \
39 | -e "cluster.name=docker-cluster" \
40 | -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
41 | -p 9200:9200 \
42 | elasticsearch:7.7.0
43 | ```
44 |
45 | ### Test the installation
46 |
47 | Run this in your shell:
48 |
49 | ```shell
50 | curl http://localhost:9200/
51 | ```
52 | You should see the Hudsucker Proxy reference: "You Know, for Search".
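
The response looks roughly like the following; names, UUIDs and version numbers will differ on your cluster:

```json
{
  "name": "0d1c5e4b2c2a",
  "cluster_name": "docker-cluster",
  "version": {
    "number": "7.7.0"
  },
  "tagline": "You Know, for Search"
}
```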
53 |
54 | ## Install collective.elasticsearch
55 |
56 | First, add `collective.elasticsearch` to your package dependencies, or install it with `pip` (the same one used by your Plone installation):
57 |
58 | ```shell
59 | pip install collective.elasticsearch
60 | ```
61 |
62 | Restart Plone, go to the `Control Panel`, click `Add-ons`, and install `Elastic Search`.
63 |
64 | Now, go to `Add-on Configuration` and:
65 |
66 | - Check "Enable"
67 | - Click "Convert Catalog"
68 | - Click "Rebuild Catalog"
69 |
70 | You now have an insanely scalable, modern search engine. Now live the life of the Mind!
71 |
72 |
73 | ## Redis queue integration with blob indexing support
74 |
75 | ### TLDR
76 |
77 | ```shell
78 | docker-compose -f docker-compose.dev.yaml up -d
79 | ```
80 |
81 | Your Plone site should be up and running: http://localhost:8080/Plone
82 |
83 | - Go to `Add-on Configuration`
84 | - Check "Enable"
85 | - Click "Convert Catalog"
86 | - Click "Rebuild Catalog"
87 |
88 | ### Why
89 |
90 | Having a queue that runs heavy, time-consuming jobs asynchronously improves the responsiveness of the website and lowers
91 | the risk of database conflicts. This implementation aims to have almost zero performance impact on any Plone
92 | installation, including installations that already use collective.elasticsearch.
93 |
94 | ### How does it work
95 |
96 | - Instead of indexing/reindexing/unindexing data while committing to the DB, jobs are added to a queue in an after-commit hook.
97 | - No data is extracted from any object at this point; that all happens later.
98 | - One or more workers execute the jobs, which gather the necessary data via the Plone REST API.
99 | - The extraction of the data and the indexing in Elasticsearch happen via the queue.
100 |
101 | Workflow:
102 |
103 | 1. Content gets created/updated.
104 | 2. Data is committed to the DB and the Plone catalog is updated.
105 | 3. Jobs are created via after-commit hooks.
106 | 4. The website is ready to use again - the request is done.
107 | 5. A worker is initialized.
108 | 6. A job collects the values to index via the Plone REST API and indexes those values in Elasticsearch (see the sketch below).
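
Conceptually, a job on the worker side does something like the following sketch. This is not the package's actual job implementation - the index name, field selection and URLs are placeholders for illustration - but `requests` and `elasticsearch` are real dependencies of this package:

```python
import os

import requests
from elasticsearch import Elasticsearch

# The real worker reads these from the environment (see step 3 under Requirements below).
PLONE_BACKEND = os.environ.get("PLONE_BACKEND", "http://localhost:8080/Plone")
AUTH = (os.environ.get("PLONE_USERNAME", "admin"), os.environ.get("PLONE_PASSWORD", "admin"))


def index_content(path: str, uid: str) -> None:
    """Fetch content data via the Plone REST API and index it in Elasticsearch."""
    response = requests.get(
        f"{PLONE_BACKEND}{path}",
        headers={"Accept": "application/json"},
        auth=AUTH,
    )
    response.raise_for_status()
    data = response.json()

    es = Elasticsearch(["http://localhost:9200"])
    # "plone" as index name and this field selection are assumptions for the sketch.
    es.index(
        index="plone",
        id=uid,
        body={"Title": data.get("title"), "Description": data.get("description")},
    )
```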
109 |
110 | There are two queues: one for normal indexing jobs and one for the heavy lifting of indexing binaries.
111 | Jobs from the second queue are only pulled if the normal indexing queue is empty (see the sketch below).
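
With python-rq, such a two-queue setup can be sketched as two named queues sharing one Redis connection; the job callables and paths below are hypothetical, not this package's real job API:

```python
from redis import Redis
from rq import Queue

connection = Redis.from_url("redis://localhost:6379/0")

# Fast metadata indexing goes to "normal", expensive blob extraction to "low".
normal = Queue("normal", connection=connection)
low = Queue("low", connection=connection)

normal.enqueue("myjobs.index_content", "/plone/front-page")  # hypothetical callable
low.enqueue("myjobs.index_blob", "/plone/big-file.pdf")      # hypothetical callable

# A worker started as `rq worker normal low` drains "normal" before pulling from "low".
```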
112 |
113 | Trade-off: instead of waiting for a fully indexed document in Elasticsearch, at least a partially indexed document is available there quickly.
114 |
115 | ### Requirements
116 |
117 | There are a couple of things that need to be done manually if you want Redis queue support.
118 |
119 |
120 | 1. Install the `redis` extra of collective.elasticsearch
121 | ```shell
122 | pip install collective.elasticsearch[redis]
123 | ```
124 |
125 |
126 | 2. Install the ingest-attachment plugin for Elasticsearch - by default, the Elasticsearch image does not have any plugins installed.
127 |
128 | ```shell
129 | docker exec CONTAINER_NAME /bin/sh -c "bin/elasticsearch-plugin install ingest-attachment -b"; \
130 | docker restart CONTAINER_NAME
131 | ```
132 |
133 | The container needs to be restarted, otherwise the plugin is not available.
134 |
135 | 3. Communication between the Redis server, Plone and the Redis worker is configured via environment variables.
136 |
137 | ```shell
138 | export PLONE_REDIS_DSN=redis://localhost:6379/0
139 | export PLONE_BACKEND=http://localhost:8080/Plone
140 | export PLONE_USERNAME=admin
141 | export PLONE_PASSWORD=admin
142 | ```
143 | This is an example configuration for local development only.
144 | You can use the `start-redis-support` command to spin up a Plone instance with the environment variables already set:
145 |
146 | ```shell
147 | make start-redis-support
148 | ```
149 |
150 | 4. Start a Redis Server
151 |
152 | Start your own, or use the `make redis` command:
153 | ```shell
154 | make redis
155 | ```
156 |
157 | 5. Start a Redis worker
158 |
159 | The redis worker does the "job" and indexes everything via two queues:
160 |
161 | - normal: normal indexing/reindexing/unindexing jobs - essentially the same work as without Redis support, just run via a queue.
162 | - low: holds jobs for expensive blob indexing.
163 |
164 | The priority is handled by the python-rq worker.
165 |
166 | The rq worker needs to be started with the same environment variables present as described in step 3.
167 |
168 | ```shell
169 | ./bin/rq worker normal low --with-scheduler
170 | ```
171 |
172 | `--with-scheduler` is needed in order to retry failed jobs after a certain time period.
173 |
174 | Or use the `make worker` command:
175 | ```shell
176 | make worker
177 | ```
178 |
179 | 6. Go to the control panel and repeat the following steps:
180 |
181 | - Check "Enable"
182 | - Click "Convert Catalog"
183 | - Click "Rebuild Catalog"
184 |
185 | ### Technical documentation for elasticsearch
186 |
187 | #### Pipeline
188 |
189 | If you hit "Convert Catalog" in the control panel and you meet all the requirements to index blobs as well,
190 | collective.elasticsearch installs a default pipeline for the Plone index.
191 | This pipeline converts the binary data to text (if possible) and extends the SearchableText index with the extracted data.
192 | The setup uses multiple nested processors in order to extract the binary data from all blob fields.
193 |
194 | The binary data is not stored in the index permanently: as a last step, the pipeline removes the binary itself.
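
For illustration, a pipeline of this shape (attachment extraction followed by removal of the binary field) can be created directly against Elasticsearch; the pipeline name and field names here are assumptions, not the exact pipeline this package installs:

```shell
curl -X PUT "http://localhost:9200/_ingest/pipeline/attachment-example" \
  -H 'Content-Type: application/json' -d '
{
  "description": "Extract text from a binary field, then drop the binary",
  "processors": [
    { "attachment": { "field": "data", "target_field": "attachment" } },
    { "remove": { "field": "data" } }
  ]
}'
```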
195 |
196 | #### ingest-attachment plugin
197 |
198 | The ingest-attachment plugin is used to extract text data from binaries using Apache Tika.
199 |
200 |
201 | ### Note on Performance
202 |
203 | Putting all the jobs into a queue is much faster than actually calculating all index values and sending them to Elasticsearch.
204 | This feature aims to have a minimal impact on the responsiveness of the Plone site.
205 |
206 |
207 | ## Compatibility
208 |
209 | - Python 3
210 | - Plone 5.2 and above
211 | - Tested with Elasticsearch 7.17.0
212 |
213 | ## State
214 |
215 | All index column types are supported EXCEPT the DateRecurringIndex column type. A full-text search combined with a query that contains a DateRecurringIndex column will not work.
216 |
217 |
218 | ## Search Highlighting
219 |
220 | If you want to make use of the [Elasticsearch highlight](https://www.elastic.co/guide/en/elasticsearch/reference/current/highlighting.html) feature, you can enable it in the control panel.
221 |
222 | When enabled, it will replace the description of search results with the highlighted fragments from Elasticsearch.
223 |
224 | ### Highlight Threshold
225 |
226 | This is the number of characters to show in the description. Fragments will be added until this threshold is met.
227 |
228 | ### Pre/Post Tags
229 |
230 | Highlighted terms can be wrapped in HTML, which can be used to enhance the results further, for example by adding a custom background color. Note that the default Plone search results will not render HTML, so to use this feature you will need to create a custom search result view.
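
For reference, the highlight section of an Elasticsearch query body using such pre/post tags looks roughly like this (the `<mark>` tags and the `SearchableText` match are example values, not the add-on's exact defaults):

```json
{
  "query": { "match": { "SearchableText": "annual report" } },
  "highlight": {
    "pre_tags": ["<mark>"],
    "post_tags": ["</mark>"],
    "fields": { "SearchableText": {} }
  }
}
```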
231 |
232 | ## Developing this package
233 |
234 | Create the virtual environment and install all dependencies:
235 |
236 | ```shell
237 | make build
238 | ```
239 |
240 | Start Plone in foreground:
241 |
242 | ```shell
243 | make start
244 | ```
245 |
246 |
247 | ### Running tests
248 |
249 | ```shell
250 | make tests
251 | ```
252 |
253 |
254 | ### Formatting the codebase
255 |
256 | ```shell
257 | make format
258 | ```
259 |
260 | ### Linting the codebase
261 |
262 | ```shell
263 | make lint
264 | ```
265 |
266 | ## License
267 |
268 | The project is licensed under the GPLv2.
269 |
--------------------------------------------------------------------------------
/docker-compose.dev.yaml:
--------------------------------------------------------------------------------
1 | version: "3.8"
2 |
3 | services:
4 | redis:
5 | image: redis:7.0.5
6 | command: redis-server --appendonly yes
7 | ports:
8 | - 6379:6379
9 | volumes:
10 | - redis_data:/data
11 |
12 | elasticsearch:
13 | build:
14 | context: .
15 | dockerfile: docker/elasticsearch.Dockerfile
16 | ports:
17 | - 9200:9200
18 | - 9300:9300
19 | environment:
20 | - discovery.type=single-node
21 | - cluster.name=docker-cluster
22 | - http.cors.enabled=true
23 | - http.cors.allow-origin=*
24 | - http.cors.allow-headers=X-Requested-With,X-Auth-Token,Content-Type,Content-Length,Authorization
25 | - http.cors.allow-credentials=true
26 | - ES_JAVA_OPTS=-Xms512m -Xmx512m
27 | volumes:
28 | - elasticsearch_data:/usr/share/elasticsearch/data
29 |
30 | worker:
31 | build:
32 | context: .
33 | dockerfile: docker/worker.Dockerfile
34 | environment:
35 | - PLONE_REDIS_DSN=redis://redis:6379/0
36 | - PLONE_BACKEND=http://plone:8080/Plone
37 | - PLONE_USERNAME=admin
38 | - PLONE_PASSWORD=admin
39 |
40 | plone:
41 | build:
42 | context: .
43 | dockerfile: docker/plone.Dockerfile
44 | environment:
45 | - PLONE_REDIS_DSN=redis://redis:6379/0
46 | - PLONE_BACKEND=http://127.0.0.1:8080/Plone
47 | - PLONE_USERNAME=admin
48 | - PLONE_PASSWORD=admin
49 | ports:
50 | - "8080:8080"
51 | depends_on:
52 | - redis
53 | - elasticsearch
54 | - worker
55 | volumes:
56 | - plone_data:/data
57 |
58 | volumes:
59 | redis_data:
60 | elasticsearch_data:
61 | plone_data:
62 |
--------------------------------------------------------------------------------
/docker/elasticsearch.Dockerfile:
--------------------------------------------------------------------------------
1 | FROM elasticsearch:7.17.7
2 |
3 | RUN bin/elasticsearch-plugin install ingest-attachment -b
4 |
--------------------------------------------------------------------------------
/docker/plone.Dockerfile:
--------------------------------------------------------------------------------
1 | FROM plone/plone-backend:6.0.0b3
2 |
3 | WORKDIR /app
4 |
5 | RUN /app/bin/pip install git+https://github.com/collective/collective.elasticsearch.git@mle-redis-rq#egg=collective.elasticsearch[redis]
6 |
7 | ENV PROFILES="collective.elasticsearch:default collective.elasticsearch:docker-dev"
8 | ENV TYPE="classic"
9 | ENV SITE="Plone"
10 |
--------------------------------------------------------------------------------
/docker/worker.Dockerfile:
--------------------------------------------------------------------------------
1 | FROM plone/plone-backend:6.0.0b3
2 |
3 | WORKDIR /app
4 |
5 | RUN /app/bin/pip install git+https://github.com/collective/collective.elasticsearch.git@mle-redis-rq#egg=collective.elasticsearch[redis]
6 |
7 | CMD /app/bin/rq worker normal low --with-scheduler --url=$PLONE_REDIS_DSN
8 |
--------------------------------------------------------------------------------
/docs/Makefile:
--------------------------------------------------------------------------------
1 | # Makefile for Sphinx documentation
2 | #
3 |
4 | # You can set these variables from the command line.
5 | SPHINXOPTS =
6 | SPHINXBUILD = sphinx-build
7 | PAPER =
8 | BUILDDIR = _build
9 |
10 | # User-friendly check for sphinx-build
11 | ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
12 | $(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
13 | endif
14 |
15 | # Internal variables.
16 | PAPEROPT_a4 = -D latex_paper_size=a4
17 | PAPEROPT_letter = -D latex_paper_size=letter
18 | ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
19 | # the i18n builder cannot share the environment and doctrees with the others
20 | I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
21 |
22 | .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext
23 |
24 | help:
25 | @echo "Please use \`make ' where is one of"
26 | @echo " html to make standalone HTML files"
27 | @echo " dirhtml to make HTML files named index.html in directories"
28 | @echo " singlehtml to make a single large HTML file"
29 | @echo " pickle to make pickle files"
30 | @echo " json to make JSON files"
31 | @echo " htmlhelp to make HTML files and a HTML help project"
32 | @echo " qthelp to make HTML files and a qthelp project"
33 | @echo " devhelp to make HTML files and a Devhelp project"
34 | @echo " epub to make an epub"
35 | @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
36 | @echo " latexpdf to make LaTeX files and run them through pdflatex"
37 | @echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
38 | @echo " text to make text files"
39 | @echo " man to make manual pages"
40 | @echo " texinfo to make Texinfo files"
41 | @echo " info to make Texinfo files and run them through makeinfo"
42 | @echo " gettext to make PO message catalogs"
43 | @echo " changes to make an overview of all changed/added/deprecated items"
44 | @echo " xml to make Docutils-native XML files"
45 | @echo " pseudoxml to make pseudoxml-XML files for display purposes"
46 | @echo " linkcheck to check all external links for integrity"
47 | @echo " doctest to run all doctests embedded in the documentation (if enabled)"
48 |
49 | clean:
50 | rm -rf $(BUILDDIR)/*
51 |
52 | html:
53 | $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
54 | @echo
55 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
56 |
57 | dirhtml:
58 | $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
59 | @echo
60 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
61 |
62 | singlehtml:
63 | $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
64 | @echo
65 | @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
66 |
67 | pickle:
68 | $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
69 | @echo
70 | @echo "Build finished; now you can process the pickle files."
71 |
72 | json:
73 | $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
74 | @echo
75 | @echo "Build finished; now you can process the JSON files."
76 |
77 | htmlhelp:
78 | $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
79 | @echo
80 | @echo "Build finished; now you can run HTML Help Workshop with the" \
81 | ".hhp project file in $(BUILDDIR)/htmlhelp."
82 |
83 | qthelp:
84 | $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
85 | @echo
86 | @echo "Build finished; now you can run "qcollectiongenerator" with the" \
87 | ".qhcp project file in $(BUILDDIR)/qthelp, like this:"
88 | @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/collectiveelasticsearch.qhcp"
89 | @echo "To view the help file:"
90 | @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/collectiveelasticsearch.qhc"
91 |
92 | devhelp:
93 | $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
94 | @echo
95 | @echo "Build finished."
96 | @echo "To view the help file:"
97 | @echo "# mkdir -p $$HOME/.local/share/devhelp/collectiveelasticsearch"
98 | @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/collectiveelasticsearch"
99 | @echo "# devhelp"
100 |
101 | epub:
102 | $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
103 | @echo
104 | @echo "Build finished. The epub file is in $(BUILDDIR)/epub."
105 |
106 | latex:
107 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
108 | @echo
109 | @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
110 | @echo "Run \`make' in that directory to run these through (pdf)latex" \
111 | "(use \`make latexpdf' here to do that automatically)."
112 |
113 | latexpdf:
114 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
115 | @echo "Running LaTeX files through pdflatex..."
116 | $(MAKE) -C $(BUILDDIR)/latex all-pdf
117 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
118 |
119 | latexpdfja:
120 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
121 | @echo "Running LaTeX files through platex and dvipdfmx..."
122 | $(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
123 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
124 |
125 | text:
126 | $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
127 | @echo
128 | @echo "Build finished. The text files are in $(BUILDDIR)/text."
129 |
130 | man:
131 | $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
132 | @echo
133 | @echo "Build finished. The manual pages are in $(BUILDDIR)/man."
134 |
135 | texinfo:
136 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
137 | @echo
138 | @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
139 | @echo "Run \`make' in that directory to run these through makeinfo" \
140 | "(use \`make info' here to do that automatically)."
141 |
142 | info:
143 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
144 | @echo "Running Texinfo files through makeinfo..."
145 | make -C $(BUILDDIR)/texinfo info
146 | @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
147 |
148 | gettext:
149 | $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
150 | @echo
151 | @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
152 |
153 | changes:
154 | $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
155 | @echo
156 | @echo "The overview file is in $(BUILDDIR)/changes."
157 |
158 | linkcheck:
159 | $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
160 | @echo
161 | @echo "Link check complete; look for any errors in the above output " \
162 | "or in $(BUILDDIR)/linkcheck/output.txt."
163 |
164 | doctest:
165 | $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
166 | @echo "Testing of doctests in the sources finished, look at the " \
167 | "results in $(BUILDDIR)/doctest/output.txt."
168 |
169 | xml:
170 | $(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
171 | @echo
172 | @echo "Build finished. The XML files are in $(BUILDDIR)/xml."
173 |
174 | pseudoxml:
175 | $(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
176 | @echo
177 | @echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."
178 |
--------------------------------------------------------------------------------
/docs/conf.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #
3 | # collective.elasticsearch documentation build configuration file, created by
4 | # sphinx-quickstart on Mon Mar 13 15:04:25 2017.
5 | #
6 | # This file is execfile()d with the current directory set to its
7 | # containing dir.
8 | #
9 | # Note that not all possible configuration values are present in this
10 | # autogenerated file.
11 | #
12 | # All configuration values have a default; values that are commented out
13 | # serve to show the default.
14 |
15 | import sys
16 | import os
17 |
18 | # If extensions (or modules to document with autodoc) are in another directory,
19 | # add these directories to sys.path here. If the directory is relative to the
20 | # documentation root, use os.path.abspath to make it absolute, like shown here.
21 | #sys.path.insert(0, os.path.abspath('.'))
22 |
23 | # -- General configuration ------------------------------------------------
24 |
25 | # If your documentation needs a minimal Sphinx version, state it here.
26 | #needs_sphinx = '1.0'
27 |
28 | # Add any Sphinx extension module names here, as strings. They can be
29 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
30 | # ones.
31 | extensions = []
32 |
33 | # Add any paths that contain templates here, relative to this directory.
34 | templates_path = ['_templates']
35 |
36 | # The suffix(es) of source filenames.
37 | # You can specify multiple suffix as a list of string:
38 | # source_suffix = ['.rst', '.md']
39 | source_suffix = '.rst'
40 |
41 | # The encoding of source files.
42 | #source_encoding = 'utf-8-sig'
43 |
44 | # The master toctree document.
45 | master_doc = 'index'
46 |
47 | # General information about the project.
48 | project = u'collective.elasticsearch'
49 | copyright = u'Nathan Van Gheem (vangheem)'
50 | author = u'Nathan Van Gheem (vangheem)'
51 |
52 | # The version info for the project you're documenting, acts as replacement for
53 | # |version| and |release|, also used in various other places throughout the
54 | # built documents.
55 | #
56 | # The short X.Y version.
57 | version = u'3.0'
58 | # The full version, including alpha/beta/rc tags.
59 | release = u'3.0'
60 |
61 | # The language for content autogenerated by Sphinx. Refer to documentation
62 | # for a list of supported languages.
63 | #
64 | # This is also used if you do content translation via gettext catalogs.
65 | # Usually you set "language" from the command line for these cases.
66 | language = None
67 |
68 | # There are two options for replacing |today|: either, you set today to some
69 | # non-false value, then it is used:
70 | #today = ''
71 | # Else, today_fmt is used as the format for a strftime call.
72 | #today_fmt = '%B %d, %Y'
73 |
74 | # List of patterns, relative to source directory, that match files and
75 | # directories to ignore when looking for source files.
76 | # This patterns also effect to html_static_path and html_extra_path
77 | exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
78 |
79 | # The reST default role (used for this markup: `text`) to use for all
80 | # documents.
81 | #default_role = None
82 |
83 | # If true, '()' will be appended to :func: etc. cross-reference text.
84 | #add_function_parentheses = True
85 |
86 | # If true, the current module name will be prepended to all description
87 | # unit titles (such as .. function::).
88 | #add_module_names = True
89 |
90 | # If true, sectionauthor and moduleauthor directives will be shown in the
91 | # output. They are ignored by default.
92 | #show_authors = False
93 |
94 | # The name of the Pygments (syntax highlighting) style to use.
95 | pygments_style = 'sphinx'
96 |
97 | # A list of ignored prefixes for module index sorting.
98 | #modindex_common_prefix = []
99 |
100 | # If true, keep warnings as "system message" paragraphs in the built documents.
101 | #keep_warnings = False
102 |
103 | # If true, `todo` and `todoList` produce output, else they produce nothing.
104 | todo_include_todos = False
105 |
106 |
107 | # -- Options for HTML output ----------------------------------------------
108 |
109 | # The theme to use for HTML and HTML Help pages. See the documentation for
110 | # a list of builtin themes.
111 | html_theme = 'alabaster'
112 |
113 | # Theme options are theme-specific and customize the look and feel of a theme
114 | # further. For a list of options available for each theme, see the
115 | # documentation.
116 | #html_theme_options = {}
117 |
118 | # Add any paths that contain custom themes here, relative to this directory.
119 | #html_theme_path = []
120 |
121 | # The name for this set of Sphinx documents.
122 | # " v documentation" by default.
123 | #html_title = u'bobtemplates.plone v3.0'
124 |
125 | # A shorter title for the navigation bar. Default is the same as html_title.
126 | #html_short_title = None
127 |
128 | # The name of an image file (relative to this directory) to place at the top
129 | # of the sidebar.
130 | #html_logo = None
131 |
132 | # The name of an image file (relative to this directory) to use as a favicon of
133 | # the docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
134 | # pixels large.
135 | #html_favicon = None
136 |
137 | # Add any paths that contain custom static files (such as style sheets) here,
138 | # relative to this directory. They are copied after the builtin static files,
139 | # so a file named "default.css" will overwrite the builtin "default.css".
140 | html_static_path = ['_static']
141 |
142 | # Add any extra paths that contain custom files (such as robots.txt or
143 | # .htaccess) here, relative to this directory. These files are copied
144 | # directly to the root of the documentation.
145 | #html_extra_path = []
146 |
147 | # If not None, a 'Last updated on:' timestamp is inserted at every page
148 | # bottom, using the given strftime format.
149 | # The empty string is equivalent to '%b %d, %Y'.
150 | #html_last_updated_fmt = None
151 |
152 | # If true, SmartyPants will be used to convert quotes and dashes to
153 | # typographically correct entities.
154 | #html_use_smartypants = True
155 |
156 | # Custom sidebar templates, maps document names to template names.
157 | #html_sidebars = {}
158 |
159 | # Additional templates that should be rendered to pages, maps page names to
160 | # template names.
161 | #html_additional_pages = {}
162 |
163 | # If false, no module index is generated.
164 | #html_domain_indices = True
165 |
166 | # If false, no index is generated.
167 | #html_use_index = True
168 |
169 | # If true, the index is split into individual pages for each letter.
170 | #html_split_index = False
171 |
172 | # If true, links to the reST sources are added to the pages.
173 | #html_show_sourcelink = True
174 |
175 | # If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
176 | #html_show_sphinx = True
177 |
178 | # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
179 | #html_show_copyright = True
180 |
181 | # If true, an OpenSearch description file will be output, and all pages will
182 | # contain a tag referring to it. The value of this option must be the
183 | # base URL from which the finished HTML is served.
184 | #html_use_opensearch = ''
185 |
186 | # This is the file name suffix for HTML files (e.g. ".xhtml").
187 | #html_file_suffix = None
188 |
189 | # Language to be used for generating the HTML full-text search index.
190 | # Sphinx supports the following languages:
191 | # 'da', 'de', 'en', 'es', 'fi', 'fr', 'hu', 'it', 'ja'
192 | # 'nl', 'no', 'pt', 'ro', 'ru', 'sv', 'tr', 'zh'
193 | #html_search_language = 'en'
194 |
195 | # A dictionary with options for the search language support, empty by default.
196 | # 'ja' uses this config value.
197 | # 'zh' user can custom change `jieba` dictionary path.
198 | #html_search_options = {'type': 'default'}
199 |
200 | # The name of a javascript file (relative to the configuration directory) that
201 | # implements a search results scorer. If empty, the default will be used.
202 | #html_search_scorer = 'scorer.js'
203 |
204 | # Output file base name for HTML help builder.
205 | htmlhelp_basename = 'collective.elasticsearchdoc'
206 |
207 | # -- Options for LaTeX output ---------------------------------------------
208 |
209 | latex_elements = {
210 | # The paper size ('letterpaper' or 'a4paper').
211 | #'papersize': 'letterpaper',
212 |
213 | # The font size ('10pt', '11pt' or '12pt').
214 | #'pointsize': '10pt',
215 |
216 | # Additional stuff for the LaTeX preamble.
217 | #'preamble': '',
218 |
219 | # Latex figure (float) alignment
220 | #'figure_align': 'htbp',
221 | }
222 |
223 | # Grouping the document tree into LaTeX files. List of tuples
224 | # (source start file, target name, title,
225 | # author, documentclass [howto, manual, or own class]).
226 | latex_documents = [
227 | ('index', 'collectiveelasticsearch.tex', u'collective.elasticsearch Documentation',
228 | u'Nathan Van Gheem', 'manual'),
229 | ]
230 |
231 | # The name of an image file (relative to this directory) to place at the top of
232 | # the title page.
233 | #latex_logo = None
234 |
235 | # For "manual" documents, if this is true, then toplevel headings are parts,
236 | # not chapters.
237 | #latex_use_parts = False
238 |
239 | # If true, show page references after internal links.
240 | #latex_show_pagerefs = False
241 |
242 | # If true, show URL addresses after external links.
243 | #latex_show_urls = False
244 |
245 | # Documents to append as an appendix to all manuals.
246 | #latex_appendices = []
247 |
248 | # If false, no module index is generated.
249 | #latex_domain_indices = True
250 |
251 |
252 | # -- Options for manual page output ---------------------------------------
253 |
254 | # One entry per manual page. List of tuples
255 | # (source start file, name, description, authors, manual section).
256 | man_pages = [
257 | ('index', 'collectiveelasticsearch', u'collective.elasticsearch Documentation',
258 | [u'Nathan Van Gheem'], 1)
259 | ]
260 |
261 | # If true, show URL addresses after external links.
262 | #man_show_urls = False
263 |
264 |
265 | # -- Options for Texinfo output -------------------------------------------
266 |
267 | # Grouping the document tree into Texinfo files. List of tuples
268 | # (source start file, target name, title, author,
269 | # dir menu entry, description, category)
270 | texinfo_documents = [
271 | ('index', 'collectiveelasticsearch', u'collective.elasticsearch Documentation',
272 | u'Nathan Van Gheem', 'collectiveelasticsearch', 'One line description of project.',
273 | 'Miscellaneous'),
274 | ]
275 |
276 | # Documents to append as an appendix to all manuals.
277 | #texinfo_appendices = []
278 |
279 | # If false, no module index is generated.
280 | #texinfo_domain_indices = True
281 |
282 | # How to display URL addresses: 'footnote', 'no', or 'inline'.
283 | #texinfo_show_urls = 'footnote'
284 |
285 | # If true, do not generate a @detailmenu in the "Top" node's menu.
286 | #texinfo_no_detailmenu = False
287 |
--------------------------------------------------------------------------------
/docs/config.rst:
--------------------------------------------------------------------------------
1 | Configuration
2 | =============
3 |
4 | Basic configuration
5 | -------------------
6 |
7 | - Go to the Control Panel
8 | - Add "Elastic Search" in Add-on Products
9 | - Click "Elastic Search" in "Add-on Configuration"
10 | - Enable
11 | - Click "Convert Catalog"
12 | - Click "Rebuild Catalog"
13 |
14 |
15 | Changing the index used for elasticsearch
16 | -----------------------------------------
17 |
18 | The index used for elasticsearch is the path to the portal_catalog by default, so you don't have anything to do if
19 | you have several Plone sites in the same instance (the Plone site id would be different).
20 |
21 | However, if you want to use the same elasticsearch instance with several Plone instances, you may
22 | end up having conflicts. In that case, you may want to manually set the index used by adding the following code
23 | to the ``__init__.py`` file of your module::
24 |
25 | from Products.CMFPlone.CatalogTool import CatalogTool
26 | from collective.elasticsearch.es import CUSTOM_INDEX_NAME_ATTR
27 |
28 | setattr(CatalogTool, CUSTOM_INDEX_NAME_ATTR, "my_elasticsearch_custom_index")
29 |
30 |
31 | Adding custom indexes which are not in the catalog
32 | --------------------------------------------------
33 |
34 | An adapter is used to define the mapping between the index and the elasticsearch properties. You can override
35 | the ``_default_mapping`` attribute to add your own indexes::
36 |
37 |
43 |
44 | ::
45 |
46 | @implementer(IMappingProvider)
47 | class MyMappingAdapter(object):
48 |
49 | _default_mapping = {
50 | 'SearchableText': {'store': False, 'type': 'text', 'index': True},
51 | 'Title': {'store': False, 'type': 'text', 'index': True},
52 | 'Description': {'store': False, 'type': 'text', 'index': True},
53 | 'MyOwnIndex': {'store': False, 'type': 'text', 'index': True},
54 | }
55 |
56 |
57 | Changing the settings of the index
58 | ----------------------------------
59 |
60 | If you want to customize your elasticsearch index, you can override the ``get_index_creation_body`` method on the ``MappingAdapter``::
61 |
62 | @implementer(IMappingProvider)
63 | class MyMappingAdapter(object):
64 |
65 | def get_index_creation_body(self):
66 | return {
67 | "settings" : {
68 | "number_of_shards": 1,
69 | "number_of_replicas": 0
70 | }
71 | }
72 |
73 |
74 | Changing the query made to elasticsearch
75 | ----------------------------------------
76 |
77 | The query generation is handled by another adapter::
78 |
79 |
84 |
85 | You will have to override the ``__call__`` method to change the query. Look at the original adapter to have a better
86 | idea of what you need to change.
87 |
--------------------------------------------------------------------------------
/docs/index.rst:
--------------------------------------------------------------------------------
1 | .. collective.elasticsearch documentation master file, created by
2 | sphinx-quickstart on Mon Mar 13 15:04:25 2017.
3 | You can adapt this file completely to your liking, but it should at least
4 | contain the root `toctree` directive.
5 |
6 | Welcome to collective.elasticsearch's documentation!
7 | ====================================================
8 |
9 | Overview
10 | --------
11 |
12 | This package aims to index all fields the portal_catalog indexes
13 | and allows you to delete the `Title`, `Description` and `SearchableText`
14 | indexes, which can significantly improve performance and RAM usage.
15 |
16 | Then, Elasticsearch queries are used ONLY when Title, Description or SearchableText
17 | are part of the query. Otherwise, Plone's default catalog will be used.
18 | This is because Plone's default catalog is faster on normal queries than
19 | Elasticsearch.
20 |
21 |
22 | Compatibility
23 | -------------
24 |
25 | Only unit tested with Plone 5 with Dexterity types and archetypes.
26 |
27 | It should also work with Plone 4.3 and Plone 5.1.
28 |
29 | Deployed with Elasticsearch 7.6.0
30 |
31 | State
32 | -----
33 |
34 | Support for all index column types is done EXCEPT for the DateRecurringIndex
35 | index column type. If you are doing a full text search along with a query that
36 | contains a DateRecurringIndex column, it will not work.
37 |
38 |
39 | Celery support
40 | --------------
41 |
42 | This package comes with Celery support where all indexing operations will be pushed
43 | into celery to be run asynchronously.
44 |
45 | Please see instructions for collective.celery to see how this works.
46 |
47 | Contents:
48 |
49 | .. toctree::
50 | :maxdepth: 2
51 |
52 | install
53 | config
54 | history
55 |
56 |
57 |
58 | Indices and tables
59 | ==================
60 |
61 | * :ref:`genindex`
62 | * :ref:`modindex`
63 | * :ref:`search`
64 |
--------------------------------------------------------------------------------
/docs/install.rst:
--------------------------------------------------------------------------------
1 | Installation
2 | ============
3 |
4 | collective.elasticsearch
5 | ------------------------
6 |
7 | To install collective.elasticsearch into the global Python environment (or a workingenv),
8 | using a traditional Zope 2 instance, you can do this:
9 |
10 | * When you're reading this you have probably already run
11 | ``easy_install collective.elasticsearch``. Find out how to install setuptools
12 | (and EasyInstall) here:
13 | http://peak.telecommunity.com/DevCenter/EasyInstall
14 |
15 | * If you are using Zope 2.9 (not 2.10), get `pythonproducts`_ and install it
16 | via::
17 |
18 | python setup.py install --home /path/to/instance
19 |
20 | into your Zope instance.
21 |
22 | * Create a file called ``collective.elasticsearch-configure.zcml`` in the
23 | ``/path/to/instance/etc/package-includes`` directory. The file
24 | should only contain this::
25 |
26 |
27 |
28 | .. _pythonproducts: http://plone.org/products/pythonproducts
29 |
30 |
31 | Alternatively, if you are using zc.buildout and the plone.recipe.zope2instance
32 | recipe to manage your project, you can do this:
33 |
34 | * Add ``collective.elasticsearch`` to the list of eggs to install, e.g.::
35 |
36 | [buildout]
37 | ...
38 | eggs =
39 | ...
40 | collective.elasticsearch
41 |
42 | * Tell the plone.recipe.zope2instance recipe to install a ZCML slug::
43 |
44 | [instance]
45 | recipe = plone.recipe.zope2instance
46 | ...
47 | zcml =
48 | collective.elasticsearch
49 |
50 | * Re-run buildout, e.g. with::
51 |
52 | $ ./bin/buildout
53 |
54 | You can skip the ZCML slug if you are going to explicitly include the package
55 | from another package's configure.zcml file.
56 |
57 | elasticsearch
58 | -------------
59 |
60 | Less than 5 minutes:
61 | - Download & install Java
62 | - Download & install Elastic Search
63 | - bin/elasticsearch
64 |
65 | Step by Step for Ubuntu:
66 | - add-apt-repository ppa:webupd8team/java
67 | - apt-get update
68 | - apt-get install git curl oracle-java7-installer
69 | - wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.6.0-linux-x86_64.tar.gz
70 | - wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.6.0-linux-x86_64.tar.gz.sha512
71 | - shasum -a 512 -c elasticsearch-7.6.0-linux-x86_64.tar.gz.sha512
72 | - tar -xzf elasticsearch-7.6.0-linux-x86_64.tar.gz
73 | - cd elasticsearch
74 | - bin/elasticsearch
75 |
76 | Step by Step for CentOS/RedHat:
77 | - yum -y install java-1.8.0-openjdk.x86_64
78 | - alternatives --auto java
79 | - curl -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.6.0.tar.gz
80 | - tar xfvz elasticsearch-7.6.0.tar.gz
81 | - cd elasticsearch
82 | - bin/elasticsearch
83 |
84 | Does it work?
85 | - curl http://localhost:9200/
86 | - Do you see the Hudsucker Proxy reference? "You Know, for Search"
87 |
--------------------------------------------------------------------------------
/docs/make.bat:
--------------------------------------------------------------------------------
1 | @ECHO OFF
2 |
3 | REM Command file for Sphinx documentation
4 |
5 | if "%SPHINXBUILD%" == "" (
6 | set SPHINXBUILD=sphinx-build
7 | )
8 | set BUILDDIR=_build
9 | set ALLSPHINXOPTS=-d %BUILDDIR%/doctrees %SPHINXOPTS% .
10 | set I18NSPHINXOPTS=%SPHINXOPTS% .
11 | if NOT "%PAPER%" == "" (
12 | set ALLSPHINXOPTS=-D latex_paper_size=%PAPER% %ALLSPHINXOPTS%
13 | set I18NSPHINXOPTS=-D latex_paper_size=%PAPER% %I18NSPHINXOPTS%
14 | )
15 |
16 | if "%1" == "" goto help
17 |
18 | if "%1" == "help" (
19 | :help
20 | echo.Please use `make ^` where ^ is one of
21 | echo. html to make standalone HTML files
22 | echo. dirhtml to make HTML files named index.html in directories
23 | echo. singlehtml to make a single large HTML file
24 | echo. pickle to make pickle files
25 | echo. json to make JSON files
26 | echo. htmlhelp to make HTML files and a HTML help project
27 | echo. qthelp to make HTML files and a qthelp project
28 | echo. devhelp to make HTML files and a Devhelp project
29 | echo. epub to make an epub
30 | echo. latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter
31 | echo. text to make text files
32 | echo. man to make manual pages
33 | echo. texinfo to make Texinfo files
34 | echo. gettext to make PO message catalogs
35 | echo. changes to make an overview over all changed/added/deprecated items
36 | echo. xml to make Docutils-native XML files
37 | echo. pseudoxml to make pseudoxml-XML files for display purposes
38 | echo. linkcheck to check all external links for integrity
39 | echo. doctest to run all doctests embedded in the documentation if enabled
40 | goto end
41 | )
42 |
43 | if "%1" == "clean" (
44 | for /d %%i in (%BUILDDIR%\*) do rmdir /q /s %%i
45 | del /q /s %BUILDDIR%\*
46 | goto end
47 | )
48 |
49 |
50 | %SPHINXBUILD% 2> nul
51 | if errorlevel 9009 (
52 | echo.
53 | echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
54 | echo.installed, then set the SPHINXBUILD environment variable to point
55 | echo.to the full path of the 'sphinx-build' executable. Alternatively you
56 | echo.may add the Sphinx directory to PATH.
57 | echo.
58 | echo.If you don't have Sphinx installed, grab it from
59 | echo.http://sphinx-doc.org/
60 | exit /b 1
61 | )
62 |
63 | if "%1" == "html" (
64 | %SPHINXBUILD% -b html %ALLSPHINXOPTS% %BUILDDIR%/html
65 | if errorlevel 1 exit /b 1
66 | echo.
67 | echo.Build finished. The HTML pages are in %BUILDDIR%/html.
68 | goto end
69 | )
70 |
71 | if "%1" == "dirhtml" (
72 | %SPHINXBUILD% -b dirhtml %ALLSPHINXOPTS% %BUILDDIR%/dirhtml
73 | if errorlevel 1 exit /b 1
74 | echo.
75 | echo.Build finished. The HTML pages are in %BUILDDIR%/dirhtml.
76 | goto end
77 | )
78 |
79 | if "%1" == "singlehtml" (
80 | %SPHINXBUILD% -b singlehtml %ALLSPHINXOPTS% %BUILDDIR%/singlehtml
81 | if errorlevel 1 exit /b 1
82 | echo.
83 | echo.Build finished. The HTML pages are in %BUILDDIR%/singlehtml.
84 | goto end
85 | )
86 |
87 | if "%1" == "pickle" (
88 | %SPHINXBUILD% -b pickle %ALLSPHINXOPTS% %BUILDDIR%/pickle
89 | if errorlevel 1 exit /b 1
90 | echo.
91 | echo.Build finished; now you can process the pickle files.
92 | goto end
93 | )
94 |
95 | if "%1" == "json" (
96 | %SPHINXBUILD% -b json %ALLSPHINXOPTS% %BUILDDIR%/json
97 | if errorlevel 1 exit /b 1
98 | echo.
99 | echo.Build finished; now you can process the JSON files.
100 | goto end
101 | )
102 |
103 | if "%1" == "htmlhelp" (
104 | %SPHINXBUILD% -b htmlhelp %ALLSPHINXOPTS% %BUILDDIR%/htmlhelp
105 | if errorlevel 1 exit /b 1
106 | echo.
107 | echo.Build finished; now you can run HTML Help Workshop with the ^
108 | .hhp project file in %BUILDDIR%/htmlhelp.
109 | goto end
110 | )
111 |
112 | if "%1" == "qthelp" (
113 | %SPHINXBUILD% -b qthelp %ALLSPHINXOPTS% %BUILDDIR%/qthelp
114 | if errorlevel 1 exit /b 1
115 | echo.
116 | echo.Build finished; now you can run "qcollectiongenerator" with the ^
117 | .qhcp project file in %BUILDDIR%/qthelp, like this:
118 | echo.^> qcollectiongenerator %BUILDDIR%\qthelp\collectiveelasticsearch.qhcp
119 | echo.To view the help file:
120 | echo.^> assistant -collectionFile %BUILDDIR%\qthelp\collectiveelasticsearch.ghc
121 | goto end
122 | )
123 |
124 | if "%1" == "devhelp" (
125 | %SPHINXBUILD% -b devhelp %ALLSPHINXOPTS% %BUILDDIR%/devhelp
126 | if errorlevel 1 exit /b 1
127 | echo.
128 | echo.Build finished.
129 | goto end
130 | )
131 |
132 | if "%1" == "epub" (
133 | %SPHINXBUILD% -b epub %ALLSPHINXOPTS% %BUILDDIR%/epub
134 | if errorlevel 1 exit /b 1
135 | echo.
136 | echo.Build finished. The epub file is in %BUILDDIR%/epub.
137 | goto end
138 | )
139 |
140 | if "%1" == "latex" (
141 | %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex
142 | if errorlevel 1 exit /b 1
143 | echo.
144 | echo.Build finished; the LaTeX files are in %BUILDDIR%/latex.
145 | goto end
146 | )
147 |
148 | if "%1" == "latexpdf" (
149 | %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex
150 | cd %BUILDDIR%/latex
151 | make all-pdf
152 | cd %BUILDDIR%/..
153 | echo.
154 | echo.Build finished; the PDF files are in %BUILDDIR%/latex.
155 | goto end
156 | )
157 |
158 | if "%1" == "latexpdfja" (
159 | %SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex
160 | cd %BUILDDIR%/latex
161 | make all-pdf-ja
162 | cd %BUILDDIR%/..
163 | echo.
164 | echo.Build finished; the PDF files are in %BUILDDIR%/latex.
165 | goto end
166 | )
167 |
168 | if "%1" == "text" (
169 | %SPHINXBUILD% -b text %ALLSPHINXOPTS% %BUILDDIR%/text
170 | if errorlevel 1 exit /b 1
171 | echo.
172 | echo.Build finished. The text files are in %BUILDDIR%/text.
173 | goto end
174 | )
175 |
176 | if "%1" == "man" (
177 | %SPHINXBUILD% -b man %ALLSPHINXOPTS% %BUILDDIR%/man
178 | if errorlevel 1 exit /b 1
179 | echo.
180 | echo.Build finished. The manual pages are in %BUILDDIR%/man.
181 | goto end
182 | )
183 |
184 | if "%1" == "texinfo" (
185 | %SPHINXBUILD% -b texinfo %ALLSPHINXOPTS% %BUILDDIR%/texinfo
186 | if errorlevel 1 exit /b 1
187 | echo.
188 | echo.Build finished. The Texinfo files are in %BUILDDIR%/texinfo.
189 | goto end
190 | )
191 |
192 | if "%1" == "gettext" (
193 | %SPHINXBUILD% -b gettext %I18NSPHINXOPTS% %BUILDDIR%/locale
194 | if errorlevel 1 exit /b 1
195 | echo.
196 | echo.Build finished. The message catalogs are in %BUILDDIR%/locale.
197 | goto end
198 | )
199 |
200 | if "%1" == "changes" (
201 | %SPHINXBUILD% -b changes %ALLSPHINXOPTS% %BUILDDIR%/changes
202 | if errorlevel 1 exit /b 1
203 | echo.
204 | echo.The overview file is in %BUILDDIR%/changes.
205 | goto end
206 | )
207 |
208 | if "%1" == "linkcheck" (
209 | %SPHINXBUILD% -b linkcheck %ALLSPHINXOPTS% %BUILDDIR%/linkcheck
210 | if errorlevel 1 exit /b 1
211 | echo.
212 | echo.Link check complete; look for any errors in the above output ^
213 | or in %BUILDDIR%/linkcheck/output.txt.
214 | goto end
215 | )
216 |
217 | if "%1" == "doctest" (
218 | %SPHINXBUILD% -b doctest %ALLSPHINXOPTS% %BUILDDIR%/doctest
219 | if errorlevel 1 exit /b 1
220 | echo.
221 | echo.Testing of doctests in the sources finished, look at the ^
222 | results in %BUILDDIR%/doctest/output.txt.
223 | goto end
224 | )
225 |
226 | if "%1" == "xml" (
227 | %SPHINXBUILD% -b xml %ALLSPHINXOPTS% %BUILDDIR%/xml
228 | if errorlevel 1 exit /b 1
229 | echo.
230 | echo.Build finished. The XML files are in %BUILDDIR%/xml.
231 | goto end
232 | )
233 |
234 | if "%1" == "pseudoxml" (
235 | %SPHINXBUILD% -b pseudoxml %ALLSPHINXOPTS% %BUILDDIR%/pseudoxml
236 | if errorlevel 1 exit /b 1
237 | echo.
238 | echo.Build finished. The pseudo-XML files are in %BUILDDIR%/pseudoxml.
239 | goto end
240 | )
241 |
242 | :end
243 |
--------------------------------------------------------------------------------
/instance.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | # This is a cookiecutter configuration context file for
3 | #
4 | # cookiecutter-zope-instance
5 | #
6 | # available options are documented at
7 | # https://github.com/bluedynamics/cookiecutter-zope-instance/
8 |
9 | default_context:
10 | debug_mode: true
11 | verbose_security: true
12 | wsgi_listen: 0.0.0.0:8080
13 | initial_user_name: admin
14 | initial_user_password: admin
15 | load_zcml:
16 | package_includes: ['collective.elasticsearch']
17 | db_storage: direct
18 |
--------------------------------------------------------------------------------
/pyproject.toml:
--------------------------------------------------------------------------------
1 | [tool.black]
2 | line-length = 88
3 | target-version = ['py38']
4 | include = '\.pyi?$'
5 |
6 | [tool.isort]
7 | profile = "black"
8 | force_alphabetical_sort = true
9 | force_single_line = true
10 | lines_after_imports = 2
11 | line_length = 88
12 |
13 | [tool.flakeheaven.plugins]
14 | # Disable some checks.
15 | # - E501 line too long
16 | # flake8 is already testing this, with max-line-length=100000 in .flake8,
17 | # so pycodestyle should not test it.
18 | # - W503 line break before binary operator
19 | # Outdated recommendation, see https://www.flake8rules.com/rules/W503.html
20 | mccabe = ["+*"]
21 | pycodestyle = ["+*", "-E501", "-W503"]
22 | pyflakes = ["+*"]
23 | pylint = ["+*"]
24 |
25 | [tool.plone-code-analysis]
26 | checkers = ["black", "flake8", "isort", "pyroma", "zpretty"]
27 | formatters = ["black", "isort", "zpretty"]
28 | paths = "setup.py src/ scripts/"
29 |
--------------------------------------------------------------------------------
/scripts/populate.py:
--------------------------------------------------------------------------------
1 | from AccessControl.SecurityManagement import newSecurityManager
2 | from AccessControl.SecurityManager import setSecurityPolicy
3 | from lxml.html import fromstring
4 | from lxml.html import tostring
5 | from multiprocessing.pool import ThreadPool as Pool
6 | from plone import api
7 | from plone.app.textfield.value import RichTextValue
8 | from Products.CMFCore.tests.base.security import OmnipotentUser
9 | from Products.CMFCore.tests.base.security import PermissiveSecurityPolicy
10 | from Testing.makerequest import makerequest
11 | from unidecode import unidecode
12 | from zope.component.hooks import setSite
13 |
14 | import os
15 | import random
16 | import requests
17 | import transaction
18 |
19 |
20 | SITE_ID = "Plone"
21 |
22 |
23 | def parse_url(url):
24 | resp = requests.get(url)
25 | return resp.content
26 |
27 |
28 | def spoofRequest(app): # NOQA W0621
29 | """
30 | Make REQUEST variable to be available on the Zope application server.
31 |
32 | This allows acquisition to work properly
33 | """
34 | _policy = PermissiveSecurityPolicy()
35 | setSecurityPolicy(_policy)
36 | newSecurityManager(None, OmnipotentUser().__of__(app.acl_users))
37 | return makerequest(app)
38 |
39 |
40 | # Enable Faux HTTP request object
41 | app = spoofRequest(app) # noqa
42 |
43 | _dir = os.path.join(os.getcwd(), "src")
44 |
45 | _links = [] # type: list
46 | _toparse = [] # type: list
47 |
48 |
49 | def parse_urls(urls):
50 | with Pool(8) as pool:
51 | return pool.map(parse_url, urls)
52 |
53 |
54 | class DataReader:
55 | base_url = "https://en.wikipedia.org"
56 | base_content_url = base_url + "/wiki/"
57 | start_page = base_content_url + "Main_Page"
58 | title_selector = "#firstHeading"
59 | content_selector = "#bodyContent"
60 |
61 | def __init__(self):
62 | self.parsed = []
63 | self.toparse = [self.start_page]
64 | self.toprocess = []
65 |
66 | def get_content(self, html, selector, text=False): # NOQA R0201
67 | els = html.cssselect(selector)
68 | if len(els) > 0:
69 | if text:
70 | return unidecode(els[0].text_content())
71 | return tostring(els[0])
72 | return None
73 |
74 | def __iter__(self):
75 | while len(self.toparse) > 0:
76 | if len(self.toprocess) == 0:
77 | toparse = [
78 | self.toparse.pop(0) for _ in range(min(20, len(self.toparse)))
79 | ]
80 | self.toprocess = parse_urls(toparse)
81 | self.parsed.extend(toparse)
82 | html = fromstring(self.toprocess.pop(0))
83 |
84 | # get more links!
85 | for el in html.cssselect("a"):
86 | url = el.attrib.get("href", "")
87 | if url.startswith("/"):
88 | url = self.base_url + url
89 | if url.startswith(self.base_content_url) and url not in self.parsed:
90 | self.toparse.append(url)
91 |
92 | title = self.get_content(html, self.title_selector, text=True)
93 | body = self.get_content(html, self.content_selector)
94 | if not title or not body:
95 | continue
96 |
97 | yield {
98 | "title": f"{title}",
99 | "text": RichTextValue(
100 | body.decode("utf-8"),
101 | mimeType="text/html",
102 | outputMimeType="text/x-html-safe",
103 | ),
104 | }
105 |
106 |
107 | def importit(app): # NOQA W0621
108 | site = app[SITE_ID]
109 | setSite(site)
110 | per_folder = 50
111 | num_folders = 6
112 | max_depth = 4
113 | portal_types = ["Document", "News Item", "Event"]
114 | data = iter(DataReader())
115 |
116 | def populate(parent, count=0, depth=0):
117 | if depth >= max_depth:
118 | return count
119 | for fidx in range(num_folders):
120 | count += 1
121 | fid = f"folder{fidx}"
122 | if fid in parent.objectIds():
123 | folder = parent[fid]
124 | else:
125 | folder = api.content.create(
126 | type="Folder",
127 | title=f"Folder {fidx}",
128 | id=fid,
129 | exclude_from_nav=True,
130 | container=parent,
131 | )
132 | for didx in range(per_folder):
133 | count += 1
134 | pid = f"page{didx}"
135 | if pid not in folder.objectIds():
136 | payload = next(data)
137 | try:
138 | api.content.create(
139 | type=random.choice(portal_types),
140 | id=pid,
141 | container=folder,
142 | exclude_from_nav=True,
143 | **payload,
144 | )
145 | print("created ", count)
146 | except Exception: # NOQA W0703
147 | print("skipping", count)
148 | print("commiting")
149 | transaction.commit()
150 | count = populate(folder, count, depth + 1)
151 | app._p_jar.cacheMinimize()
152 | return count
153 |
154 | populate(site)
155 |
156 |
157 | importit(app)
158 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | """Installer for the collective.elasticsearch package."""
2 | from pathlib import Path
3 | from setuptools import find_packages
4 | from setuptools import setup
5 |
6 |
7 | long_description = f"""
8 | {Path("README.md").read_text()}\n
9 | {Path("CHANGELOG.md").read_text()}\n
10 | """
11 |
12 |
13 | setup(
14 | name="collective.elasticsearch",
15 | version="5.0.1.dev0",
16 | description="elasticsearch integration with plone",
17 | long_description=long_description,
18 | long_description_content_type="text/markdown",
19 | # Get more from https://pypi.org/classifiers/
20 | classifiers=[
21 | "Development Status :: 5 - Production/Stable",
22 | "Environment :: Web Environment",
23 | "Framework :: Plone :: 5.2",
24 | "Framework :: Plone :: 6.0",
25 | "Framework :: Plone :: Addon",
26 | "Framework :: Plone",
27 | "Framework :: Zope :: 4",
28 | "Framework :: Zope :: 5",
29 | "Intended Audience :: System Administrators",
30 | "License :: OSI Approved :: GNU General Public License (GPL)",
31 | "License :: OSI Approved :: GNU General Public License v2 (GPLv2)",
32 | "Operating System :: OS Independent",
33 | "Programming Language :: Python :: 3 :: Only",
34 | "Programming Language :: Python :: 3.7",
35 | "Programming Language :: Python :: 3.8",
36 | "Programming Language :: Python :: 3.9",
37 | "Programming Language :: Python :: 3.10",
38 | "Programming Language :: Python",
39 | "Topic :: Software Development :: Libraries :: Python Modules",
40 | ],
41 | keywords="plone elasticsearch search indexing",
42 | author="Nathan Van Gheem",
43 | author_email="vangheem@gmail.com",
44 | url="https://github.com/collective/collective.elasticsearch",
45 | project_urls={
46 | "PyPI": "https://pypi.python.org/pypi/collective.elasticsearch",
47 | "Source": "https://github.com/collective/collective.elasticsearch",
48 | "Tracker": "https://github.com/collective/collective.elasticsearch/issues",
49 | },
50 | license="GPL version 2",
51 | packages=find_packages("src", exclude=["ez_setup"]),
52 | namespace_packages=["collective"],
53 | package_dir={"": "src"},
54 | include_package_data=True,
55 | zip_safe=False,
56 | python_requires=">=3.7",
57 | install_requires=[
58 | "setuptools",
59 | "elasticsearch==7.17.7",
60 | "plone.app.registry",
61 | "plone.api",
62 | "setuptools",
63 | ],
64 | extras_require={
65 | "test": [
66 | "plone.app.contentrules",
67 | "plone.app.contenttypes",
68 | "plone.restapi[test]",
69 | "plone.app.testing[robot]>=7.0.0a3",
70 | "plone.app.robotframework[test]>=2.0.0a5",
71 | "parameterized",
72 | ],
73 | "redis": [
74 | "redis",
75 | "rq",
76 | "requests",
77 | "cbor2",
78 | ],
79 | },
80 | entry_points="""
81 | [z3c.autoinclude.plugin]
82 | target = plone
83 | [plone.autoinclude.plugin]
84 | target = plone
85 | """,
86 | )
87 |
--------------------------------------------------------------------------------
/src/collective/__init__.py:
--------------------------------------------------------------------------------
1 | __import__("pkg_resources").declare_namespace(__name__)
2 |
--------------------------------------------------------------------------------
/src/collective/elasticsearch/__init__.py:
--------------------------------------------------------------------------------
1 | import logging
2 |
3 |
4 | logger = logging.getLogger("collective.elasticsearch")
5 |
--------------------------------------------------------------------------------
/src/collective/elasticsearch/browser/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/collective/collective.elasticsearch/58f3f479cac40f33e79348016da42fb34149886f/src/collective/elasticsearch/browser/__init__.py
--------------------------------------------------------------------------------
/src/collective/elasticsearch/browser/configure.zcml:
--------------------------------------------------------------------------------
1 |
7 |
8 |
9 |
10 |
17 |
18 |
26 |
27 |
35 |
36 |
44 |
45 |
46 |
--------------------------------------------------------------------------------
/src/collective/elasticsearch/browser/controlpanel.py:
--------------------------------------------------------------------------------
1 | from collective.elasticsearch.interfaces import IElasticSettings
2 | from collective.elasticsearch.manager import ElasticSearchManager
3 | from collective.elasticsearch.utils import is_redis_available
4 | from elasticsearch.exceptions import ConnectionError as conerror
5 | from plone import api
6 | from plone.app.registry.browser.controlpanel import ControlPanelFormWrapper
7 | from plone.app.registry.browser.controlpanel import RegistryEditForm
8 | from plone.z3cform import layout
9 | from Products.Five.browser.pagetemplatefile import ViewPageTemplateFile
10 | from urllib3.exceptions import NewConnectionError
11 | from z3c.form import form
12 |
13 |
14 | class ElasticControlPanelForm(RegistryEditForm):
15 | form.extends(RegistryEditForm)
16 | schema = IElasticSettings
17 |
18 | label = "Elastic Search Settings"
19 |
20 | control_panel_view = "@@elastic-controlpanel"
21 |
22 | def updateWidgets(self):
23 | super().updateWidgets()
24 | if not is_redis_available():
25 | self.widgets["use_redis"].disabled = "disabled"
26 |
27 |
28 | class ElasticControlPanelFormWrapper(ControlPanelFormWrapper):
29 | index = ViewPageTemplateFile("controlpanel_layout.pt")
30 |
31 | def __init__(self, *args, **kwargs):
32 | super().__init__(*args, **kwargs)
33 | self.portal_catalog = api.portal.get_tool("portal_catalog")
34 | self.es = ElasticSearchManager()
35 |
36 | @property
37 | def connection_status(self):
38 | try:
39 | return self.es.connection.status()["ok"]
40 | except conerror:
41 | return False
42 | except (
43 | conerror,
44 | ConnectionError,
45 | NewConnectionError,
46 | ConnectionRefusedError,
47 | AttributeError,
48 | ):
49 | try:
50 | health_status = self.es.connection.cluster.health()["status"]
51 | return health_status in ("green", "yellow")
52 | except (
53 | conerror,
54 | ConnectionError,
55 | NewConnectionError,
56 | ConnectionRefusedError,
57 | AttributeError,
58 | ):
59 | return False
60 |
61 | @property
62 | def es_info(self):
63 | return self.es.info
64 |
65 | @property
66 | def enabled(self):
67 | return self.es.enabled
68 |
69 | @property
70 | def active(self):
71 | return self.es.active
72 |
73 | @property
74 | def enable_data_sync(self):
75 | if self.es_info:
76 | info = dict((key, value) for key, value in self.es_info)
77 | elastic_docs = info["Number of docs"]
78 | catalog_objs = info["Number of docs (Catalog)"]
79 | if elastic_docs != catalog_objs:
80 | return dict(elastic_docs=elastic_docs, catalog_objs=catalog_objs)
81 | return False
82 |
83 |
84 | ElasticControlPanelView = layout.wrap_form(
85 | ElasticControlPanelForm, ElasticControlPanelFormWrapper
86 | )
87 |
--------------------------------------------------------------------------------
/src/collective/elasticsearch/browser/controlpanel_layout.pt:
--------------------------------------------------------------------------------
1 |
4 |
5 |
6 |