├── KNIMEZoBot_Updated.knwf
├── LICENSE
└── README.md


/KNIMEZoBot_Updated.knwf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dayanjan-lab/KNIMEZoBot/5c4c506a68b4064099abaa1df3ae0eee60788c15/KNIMEZoBot_Updated.knwf


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 dayanjan-lab

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# KNIMEZoBot: Enhancing Literature Review with Zotero and KNIME OpenAI Integration using Retrieval-Augmented Generation

## Description

KNIMEZoBot streamlines literature reviews and research by integrating Zotero, a widely adopted reference management tool, with OpenAI's natural language processing capabilities inside a Konstanz Information Miner (KNIME) workflow. It retrieves PDFs from Zotero group or user libraries, optionally filtered by collection, and then uses OpenAI to generate questions about the academic papers and extract answers from them.

KNIMEZoBot follows a Retrieval-Augmented Generation (RAG) architecture: a vector semantic similarity search identifies the passages in the retrieved PDFs that are relevant to a question, and a large language model synthesizes a natural language response from those passages. This lets KNIMEZoBot answer questions by searching across academic papers and extracting their salient facts and key points.

## Features

- Integration of Zotero and OpenAI within KNIME
- Retrieval of PDFs from Zotero libraries
- Filtering of PDFs by collections
- Generation of questions from academic papers
- Extraction of answers using large language models

## Getting Started

### Prerequisites

Before you begin, ensure you have the following prerequisites:

- **KNIME Analytics Platform (Version 5.1.1)**: Used as the core environment for building data workflows.
- **Python (Version 3.9)**: Integrated into KNIME to leverage Python libraries and machine learning capabilities.

### Installation

To install the necessary Python packages, run the following commands in the same Python environment that is configured in your KNIME settings:

```bash
pip install pandas
pip install openai
pip install langchain
pip install unstructured
pip install PyPDF2
# PyMuPDF provides the `fitz` import; the separate `fitz` package on PyPI
# is an unrelated project and can break `import fitz`, so it is not installed.
pip install PyMuPDF
pip install "unstructured[pdf]"
```
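The retrieval step of the RAG architecture outlined in the Description can be illustrated with a small, self-contained sketch. This is not the code inside the KNIME workflow: the function names (`embed`, `retrieve`, `build_prompt`) and the sample passages are hypothetical, and a bag-of-words count stands in for a real embedding model so that the example runs without an API key. It only shows the general idea of ranking passages by cosine similarity and assembling the top hits into a prompt for a language model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(
        passages,
        key=lambda p: cosine_similarity(q, embed(p)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the retrieved passages and the question into one prompt string."""
    joined = "\n\n".join(context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


# Made-up sample passages standing in for text extracted from PDFs.
passages = [
    "Statins lower LDL cholesterol by inhibiting HMG-CoA reductase.",
    "KNIME workflows are assembled from nodes connected by ports.",
    "Zotero stores references in user and group libraries.",
]
question = "How do statins lower cholesterol?"
top = retrieve(question, passages, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline, `embed` would call an embedding model and the resulting prompt would be sent to a large language model; both steps are left out here to keep the sketch free of external dependencies.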