├── .gitignore
├── CODE_OF_CONDUCT.md
├── LICENSE
├── README.md
├── SECURITY.md
├── app.py
├── app
    ├── __init__.py
    ├── prompts.py
    ├── routes.py
    ├── static
    │   ├── css
    │   │   └── style.css
    │   ├── favicon.ico
    │   ├── images
    │   │   ├── Home_Example_1.png
    │   │   ├── Home_Example_2.png
    │   │   ├── Home_Example_3.png
    │   │   ├── PDF_Management_Eaxmple.png
    │   │   ├── full_logo.png
    │   │   └── logo.png
    │   └── js
    │   │   ├── home.js
    │   │   └── script.js
    └── templates
    │   ├── index.html
    │   └── pdfManagement.html
└── requirements.txt


/.gitignore:
--------------------------------------------------------------------------------
  1 | # Ignore system files
  2 | .DS_Store
  3 | static/.DS_Store
  4 | static/js/.DS_Store
  5 | 
  6 | # Ignore virtual environment directories and files
  7 | venv/
  8 | bin/
  9 | share/
 10 | pyvenv.cfg
 11 | 
 12 | # Ignore database directory contents, but keep the directory
 13 | data/db/*
 14 | !data/db/.gitkeep
 15 | 
 16 | # Ignore PDF directory contents, but keep the directory
 17 | data/pdf/*
 18 | !data/pdf/.gitkeep
 19 | 
 20 | # Ignore results directory
 21 | results/
 22 | 
 23 | # Byte-compiled / optimized / DLL files
 24 | __pycache__/
 25 | *.py[cod]
 26 | *$py.class
 27 | 
 28 | # Distribution / packaging
 29 | .Python
 30 | build/
 31 | develop-eggs/
 32 | dist/
 33 | downloads/
 34 | eggs/
 35 | .eggs/
 36 | lib/
 37 | lib64/
 38 | parts/
 39 | sdist/
 40 | var/
 41 | wheels/
 42 | *.egg-info/
 43 | .installed.cfg
 44 | *.egg
 45 | 
 46 | # PyInstaller
 47 | #  Usually these files are written by a python script from a template
 48 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 49 | *.manifest
 50 | *.spec
 51 | 
 52 | # Installer logs
 53 | pip-log.txt
 54 | pip-delete-this-directory.txt
 55 | 
 56 | # Unit test / coverage reports
 57 | htmlcov/
 58 | .tox/
 59 | .nox/
 60 | .coverage
 61 | .coverage.*
 62 | .cache
 63 | nosetests.xml
 64 | coverage.xml
 65 | *.cover
 66 | *.py,cover
 67 | .hypothesis/
 68 | .pytest_cache/
 69 | 
 70 | # Translations
 71 | *.mo
 72 | *.pot
 73 | 
 74 | # Django stuff:
 75 | *.log
 76 | local_settings.py
 77 | 
 78 | # Flask stuff:
 79 | instance/
 80 | .webassets-cache
 81 | 
 82 | # Scrapy stuff:
 83 | .scrapy
 84 | 
 85 | # Sphinx documentation
 86 | docs/_build/
 87 | 
 88 | # PyBuilder
 89 | target/
 90 | 
 91 | # Jupyter Notebook
 92 | .ipynb_checkpoints
 93 | 
 94 | # IPython
 95 | profile_default/
 96 | ipython_config.py
 97 | 
 98 | # PyCharm
 99 | .idea/
100 | 
101 | # VS Code
102 | .vscode/
103 | 
104 | # mypy
105 | .mypy_cache/
106 | .dmypy.json
107 | dmypy.json
108 | 


--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
  1 | 
  2 | # Contributor Covenant Code of Conduct
  3 | 
  4 | ## Our Pledge
  5 | 
  6 | We as members, contributors, and leaders pledge to make participation in our
  7 | community a harassment-free experience for everyone, regardless of age, body
  8 | size, visible or invisible disability, ethnicity, sex characteristics, gender
  9 | identity and expression, level of experience, education, socio-economic status,
 10 | nationality, personal appearance, race, caste, color, religion, or sexual
 11 | identity and orientation.
 12 | 
 13 | We pledge to act and interact in ways that contribute to an open, welcoming,
 14 | diverse, inclusive, and healthy community.
 15 | 
 16 | ## Our Standards
 17 | 
 18 | Examples of behavior that contributes to a positive environment for our
 19 | community include:
 20 | 
 21 | * Demonstrating empathy and kindness toward other people
 22 | * Being respectful of differing opinions, viewpoints, and experiences
 23 | * Giving and gracefully accepting constructive feedback
 24 | * Accepting responsibility and apologizing to those affected by our mistakes,
 25 |   and learning from the experience
 26 | * Focusing on what is best not just for us as individuals, but for the overall
 27 |   community
 28 | 
 29 | Examples of unacceptable behavior include:
 30 | 
 31 | * The use of sexualized language or imagery, and sexual attention or advances of
 32 |   any kind
 33 | * Trolling, insulting or derogatory comments, and personal or political attacks
 34 | * Public or private harassment
 35 | * Publishing others' private information, such as a physical or email address,
 36 |   without their explicit permission
 37 | * Other conduct which could reasonably be considered inappropriate in a
 38 |   professional setting
 39 | 
 40 | ## Enforcement Responsibilities
 41 | 
 42 | Community leaders are responsible for clarifying and enforcing our standards of
 43 | acceptable behavior and will take appropriate and fair corrective action in
 44 | response to any behavior that they deem inappropriate, threatening, offensive,
 45 | or harmful.
 46 | 
 47 | Community leaders have the right and responsibility to remove, edit, or reject
 48 | comments, commits, code, wiki edits, issues, and other contributions that are
 49 | not aligned to this Code of Conduct, and will communicate reasons for moderation
 50 | decisions when appropriate.
 51 | 
 52 | ## Scope
 53 | 
 54 | This Code of Conduct applies within all community spaces, and also applies when
 55 | an individual is officially representing the community in public spaces.
 56 | Examples of representing our community include using an official email address,
 57 | posting via an official social media account, or acting as an appointed
 58 | representative at an online or offline event.
 59 | 
 60 | ## Enforcement
 61 | 
 62 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
 63 | reported to the community leaders responsible for enforcement at
 64 | [INSERT CONTACT METHOD].
 65 | All complaints will be reviewed and investigated promptly and fairly.
 66 | 
 67 | All community leaders are obligated to respect the privacy and security of the
 68 | reporter of any incident.
 69 | 
 70 | ## Enforcement Guidelines
 71 | 
 72 | Community leaders will follow these Community Impact Guidelines in determining
 73 | the consequences for any action they deem in violation of this Code of Conduct:
 74 | 
 75 | ### 1. Correction
 76 | 
 77 | **Community Impact**: Use of inappropriate language or other behavior deemed
 78 | unprofessional or unwelcome in the community.
 79 | 
 80 | **Consequence**: A private, written warning from community leaders, providing
 81 | clarity around the nature of the violation and an explanation of why the
 82 | behavior was inappropriate. A public apology may be requested.
 83 | 
 84 | ### 2. Warning
 85 | 
 86 | **Community Impact**: A violation through a single incident or series of
 87 | actions.
 88 | 
 89 | **Consequence**: A warning with consequences for continued behavior. No
 90 | interaction with the people involved, including unsolicited interaction with
 91 | those enforcing the Code of Conduct, for a specified period of time. This
 92 | includes avoiding interactions in community spaces as well as external channels
 93 | like social media. Violating these terms may lead to a temporary or permanent
 94 | ban.
 95 | 
 96 | ### 3. Temporary Ban
 97 | 
 98 | **Community Impact**: A serious violation of community standards, including
 99 | sustained inappropriate behavior.
100 | 
101 | **Consequence**: A temporary ban from any sort of interaction or public
102 | communication with the community for a specified period of time. No public or
103 | private interaction with the people involved, including unsolicited interaction
104 | with those enforcing the Code of Conduct, is allowed during this period.
105 | Violating these terms may lead to a permanent ban.
106 | 
107 | ### 4. Permanent Ban
108 | 
109 | **Community Impact**: Demonstrating a pattern of violation of community
110 | standards, including sustained inappropriate behavior, harassment of an
111 | individual, or aggression toward or disparagement of classes of individuals.
112 | 
113 | **Consequence**: A permanent ban from any sort of public interaction within the
114 | community.
115 | 
116 | ## Attribution
117 | 
118 | This Code of Conduct is adapted from the [Contributor Covenant][homepage],
119 | version 2.1, available at
120 | [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
121 | 
122 | Community Impact Guidelines were inspired by
123 | [Mozilla's code of conduct enforcement ladder][Mozilla CoC].
124 | 
125 | For answers to common questions about this code of conduct, see the FAQ at
126 | [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
127 | [https://www.contributor-covenant.org/translations][translations].
128 | 
129 | [homepage]: https://www.contributor-covenant.org
130 | [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
131 | [Mozilla CoC]: https://github.com/mozilla/diversity
132 | [FAQ]: https://www.contributor-covenant.org/faq
133 | [translations]: https://www.contributor-covenant.org/translations
134 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 |                                  Apache License
  2 |                            Version 2.0, January 2004
  3 |                         http://www.apache.org/licenses/
  4 | 
  5 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
  6 | 
  7 |    1. Definitions.
  8 | 
  9 |       "License" shall mean the terms and conditions for use, reproduction,
 10 |       and distribution as defined by Sections 1 through 9 of this document.
 11 | 
 12 |       "Licensor" shall mean the copyright owner or entity authorized by
 13 |       the copyright owner that is granting the License.
 14 | 
 15 |       "Legal Entity" shall mean the union of the acting entity and all
 16 |       other entities that control, are controlled by, or are under common
 17 |       control with that entity. For the purposes of this definition,
 18 |       "control" means (i) the power, direct or indirect, to cause the
 19 |       direction or management of such entity, whether by contract or
 20 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
 21 |       outstanding shares, or (iii) beneficial ownership of such entity.
 22 | 
 23 |       "You" (or "Your") shall mean an individual or Legal Entity
 24 |       exercising permissions granted by this License.
 25 | 
 26 |       "Source" form shall mean the preferred form for making modifications,
 27 |       including but not limited to software source code, documentation
 28 |       source, and configuration files.
 29 | 
 30 |       "Object" form shall mean any form resulting from mechanical
 31 |       transformation or translation of a Source form, including but
 32 |       not limited to compiled object code, generated documentation,
 33 |       and conversions to other media types.
 34 | 
 35 |       "Work" shall mean the work of authorship, whether in Source or
 36 |       Object form, made available under the License, as indicated by a
 37 |       copyright notice that is included in or attached to the work
 38 |       (an example is provided in the Appendix below).
 39 | 
 40 |       "Derivative Works" shall mean any work, whether in Source or Object
 41 |       form, that is based on (or derived from) the Work and for which the
 42 |       editorial revisions, annotations, elaborations, or other modifications
 43 |       represent, as a whole, an original work of authorship. For the purposes
 44 |       of this License, Derivative Works shall not include works that remain
 45 |       separable from, or merely link (or bind by name) to the interfaces of,
 46 |       the Work and Derivative Works thereof.
 47 | 
 48 |       "Contribution" shall mean any work of authorship, including
 49 |       the original version of the Work and any modifications or additions
 50 |       to that Work or Derivative Works thereof, that is intentionally
 51 |       submitted to Licensor for inclusion in the Work by the copyright owner
 52 |       or by an individual or Legal Entity authorized to submit on behalf of
 53 |       the copyright owner. For the purposes of this definition, "submitted"
 54 |       means any form of electronic, verbal, or written communication sent
 55 |       to the Licensor or its representatives, including but not limited to
 56 |       communication on electronic mailing lists, source code control systems,
 57 |       and issue tracking systems that are managed by, or on behalf of, the
 58 |       Licensor for the purpose of discussing and improving the Work, but
 59 |       excluding communication that is conspicuously marked or otherwise
 60 |       designated in writing by the copyright owner as "Not a Contribution."
 61 | 
 62 |       "Contributor" shall mean Licensor and any individual or Legal Entity
 63 |       on behalf of whom a Contribution has been received by Licensor and
 64 |       subsequently incorporated within the Work.
 65 | 
 66 |    2. Grant of Copyright License. Subject to the terms and conditions of
 67 |       this License, each Contributor hereby grants to You a perpetual,
 68 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 69 |       copyright license to reproduce, prepare Derivative Works of,
 70 |       publicly display, publicly perform, sublicense, and distribute the
 71 |       Work and such Derivative Works in Source or Object form.
 72 | 
 73 |    3. Grant of Patent License. Subject to the terms and conditions of
 74 |       this License, each Contributor hereby grants to You a perpetual,
 75 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 76 |       (except as stated in this section) patent license to make, have made,
 77 |       use, offer to sell, sell, import, and otherwise transfer the Work,
 78 |       where such license applies only to those patent claims licensable
 79 |       by such Contributor that are necessarily infringed by their
 80 |       Contribution(s) alone or by combination of their Contribution(s)
 81 |       with the Work to which such Contribution(s) was submitted. If You
 82 |       institute patent litigation against any entity (including a
 83 |       cross-claim or counterclaim in a lawsuit) alleging that the Work
 84 |       or a Contribution incorporated within the Work constitutes direct
 85 |       or contributory patent infringement, then any patent licenses
 86 |       granted to You under this License for that Work shall terminate
 87 |       as of the date such litigation is filed.
 88 | 
 89 |    4. Redistribution. You may reproduce and distribute copies of the
 90 |       Work or Derivative Works thereof in any medium, with or without
 91 |       modifications, and in Source or Object form, provided that You
 92 |       meet the following conditions:
 93 | 
 94 |       (a) You must give any other recipients of the Work or
 95 |           Derivative Works a copy of this License; and
 96 | 
 97 |       (b) You must cause any modified files to carry prominent notices
 98 |           stating that You changed the files; and
 99 | 
100 |       (c) You must retain, in the Source form of any Derivative Works
101 |           that You distribute, all copyright, patent, trademark, and
102 |           attribution notices from the Source form of the Work,
103 |           excluding those notices that do not pertain to any part of
104 |           the Derivative Works; and
105 | 
106 |       (d) If the Work includes a "NOTICE" text file as part of its
107 |           distribution, then any Derivative Works that You distribute must
108 |           include a readable copy of the attribution notices contained
109 |           within such NOTICE file, excluding those notices that do not
110 |           pertain to any part of the Derivative Works, in at least one
111 |           of the following places: within a NOTICE text file distributed
112 |           as part of the Derivative Works; within the Source form or
113 |           documentation, if provided along with the Derivative Works; or,
114 |           within a display generated by the Derivative Works, if and
115 |           wherever such third-party notices normally appear. The contents
116 |           of the NOTICE file are for informational purposes only and
117 |           do not modify the License. You may add Your own attribution
118 |           notices within Derivative Works that You distribute, alongside
119 |           or as an addendum to the NOTICE text from the Work, provided
120 |           that such additional attribution notices cannot be construed
121 |           as modifying the License.
122 | 
123 |       You may add Your own copyright statement to Your modifications and
124 |       may provide additional or different license terms and conditions
125 |       for use, reproduction, or distribution of Your modifications, or
126 |       for any such Derivative Works as a whole, provided Your use,
127 |       reproduction, and distribution of the Work otherwise complies with
128 |       the conditions stated in this License.
129 | 
130 |    5. Submission of Contributions. Unless You explicitly state otherwise,
131 |       any Contribution intentionally submitted for inclusion in the Work
132 |       by You to the Licensor shall be under the terms and conditions of
133 |       this License, without any additional terms or conditions.
134 |       Notwithstanding the above, nothing herein shall supersede or modify
135 |       the terms of any separate license agreement you may have executed
136 |       with Licensor regarding such Contributions.
137 | 
138 |    6. Trademarks. This License does not grant permission to use the trade
139 |       names, trademarks, service marks, or product names of the Licensor,
140 |       except as required for reasonable and customary use in describing the
141 |       origin of the Work and reproducing the content of the NOTICE file.
142 | 
143 |    7. Disclaimer of Warranty. Unless required by applicable law or
144 |       agreed to in writing, Licensor provides the Work (and each
145 |       Contributor provides its Contributions) on an "AS IS" BASIS,
146 |       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 |       implied, including, without limitation, any warranties or conditions
148 |       of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 |       PARTICULAR PURPOSE. You are solely responsible for determining the
150 |       appropriateness of using or redistributing the Work and assume any
151 |       risks associated with Your exercise of permissions under this License.
152 | 
153 |    8. Limitation of Liability. In no event and under no legal theory,
154 |       whether in tort (including negligence), contract, or otherwise,
155 |       unless required by applicable law (such as deliberate and grossly
156 |       negligent acts) or agreed to in writing, shall any Contributor be
157 |       liable to You for damages, including any direct, indirect, special,
158 |       incidental, or consequential damages of any character arising as a
159 |       result of this License or out of the use or inability to use the
160 |       Work (including but not limited to damages for loss of goodwill,
161 |       work stoppage, computer failure or malfunction, or any and all
162 |       other commercial damages or losses), even if such Contributor
163 |       has been advised of the possibility of such damages.
164 | 
165 |    9. Accepting Warranty or Additional Liability. While redistributing
166 |       the Work or Derivative Works thereof, You may choose to offer,
167 |       and charge a fee for, acceptance of support, warranty, indemnity,
168 |       or other liability obligations and/or rights consistent with this
169 |       License. However, in accepting such obligations, You may act only
170 |       on Your own behalf and on Your sole responsibility, not on behalf
171 |       of any other Contributor, and only if You agree to indemnify,
172 |       defend, and hold each Contributor harmless for any liability
173 |       incurred by, or claims asserted against, such Contributor by reason
174 |       of your accepting any such warranty or additional liability.
175 | 
176 |    END OF TERMS AND CONDITIONS
177 | 
178 |    APPENDIX: How to apply the Apache License to your work.
179 | 
180 |       To apply the Apache License to your work, attach the following
181 |       boilerplate notice, with the fields enclosed by brackets "[]"
182 |       replaced with your own identifying information. (Don't include
183 |       the brackets!)  The text should be enclosed in the appropriate
184 |       comment syntax for the file format. We also recommend that a
185 |       file or class name and description of purpose be included on the
186 |       same "printed page" as the copyright notice for easier
187 |       identification within third-party archives.
188 | 
189 |    Copyright [yyyy] [name of copyright owner]
190 | 
191 |    Licensed under the Apache License, Version 2.0 (the "License");
192 |    you may not use this file except in compliance with the License.
193 |    You may obtain a copy of the License at
194 | 
195 |        http://www.apache.org/licenses/LICENSE-2.0
196 | 
197 |    Unless required by applicable law or agreed to in writing, software
198 |    distributed under the License is distributed on an "AS IS" BASIS,
199 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 |    See the License for the specific language governing permissions and
201 |    limitations under the License.
202 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | <p align="center">
  2 |   <img src="https://github.com/AbdArdati/PDFQueryAI/blob/main/app/static/images/full_logo.png">
  3 | </p>
  4 | 
  5 | # AskItRight: AI-Powered PDF Query Application 🚀
  6 | 
  7 | This repository contains a Flask web application that allows users to upload PDF documents, query their contents, and retrieve answers using an AI language model. The application integrates several functionalities to manage PDFs, handle user queries, and maintain usage statistics.
  8 | 
  9 | ## High-Level Overview 🌟
 10 | 
 11 | The application provides an interface for:
 12 | - 📄 **Uploading and managing PDF documents**.
 13 | - ❓ **Submitting queries to retrieve information from uploaded PDFs**.
 14 | - 📊 **Tracking PDF usage statistics**.
 15 | - 🔧 **Performing administrative operations like clearing data and deleting files**.
 16 | - 🔍 **Viewing PDF files**
 17 | - 📝 **Prompt templates management**
 18 | 
 19 | ### Key Features 🔑
 20 | 
 21 | 1. **PDF Management**:
 22 |    - **Upload PDFs**: 📤 Users can upload PDF files through the upload interface. These files are processed and stored in the system.
 23 |    - **List PDFs**: 📋 Users can view a list of all uploaded PDF files through the available PDFs interface.
 24 |    - **Delete PDFs**: 🗑️ Users can remove specific PDF files using the delete functionality available in the PDF management interface.
 25 |    - **View PDFs**: 👁️ Users can open and view the content of PDF files in a new browser tab directly from the list of PDFs.
 26 | 
 27 | 2. **Query Handling**:
 28 |    - **Ask Questions to PDF**: 🤔 Users can submit questions about the content of uploaded PDFs using the query interface. The application uses the AI model to provide answers based on the PDF contents.
 29 |    - **AI Integration**: 🤖 The Ollama3.1 model is used to generate answers to queries from the content of the PDFs. This functionality is accessible through the AI query interface.
 30 |   - **Prompt Templates**: 📝 Users can view and select from various prompt templates to guide the AI's responses, ensuring they are tailored to specific needs. (Currently in progress, with frontend Create, Update, and Delete to be implemented.)
 31 | 
 32 | 3. **Statistics and Administration**:
 33 |    - **Clear Chat History**: 🧹 Users can clear previous chat interactions using the clear chat history button in the query section.
 34 |    - **Clear Database**: 🚮 Deletes all stored PDFs and related data, effectively resetting the application’s state. This action is available in the database management section.
 35 |    - **PDF Usage Statistics**: 📈 Provides information on how frequently each PDF has been queried, viewable through the statistics dashboard.
 36 | 
 37 | ### HTML Interfaces Overview 🖥️
 38 | 
 39 | 1. **Document Interaction Dashboard**:
 40 |    - **Homepage**: 🏠 Features interfaces for asking questions about PDF content and interacting with the Ollama3.1 AI model. It also displays query and PDF usage statistics.
 41 |    - **PDF Query Section**: ❓ Allows users to submit questions about PDFs and view responses.
 42 |    - **AI Query Section**: 🤖 Provides functionality to query the Ollama3.1 AI model independently of PDFs.
 43 |    - **Statistics Section**: 📊 Displays usage statistics for both queries and PDFs.
 44 | 
 45 | 2. **PDF Management and Statistics**:
 46 |    - **Upload and List PDFs**: 📤📋 An interface to upload new PDFs, view a list of all uploaded PDFs, and access each PDF file.
 47 |    - **Database Management**: 🗑️ Provides options to clear the database and manage stored PDFs.
 48 |    - **Statistics Dashboard**: 📈 Shows statistics related to the total number of PDFs and documents in the vector store.
 49 | 
 50 | ### Examples
 51 | 
 52 | | Image | Description |
 53 | |-------|-------------|
 54 | | <img src="https://github.com/AbdArdati/PDFQueryAI/blob/main/app/static/images/Home_Example_1.png" alt="Home Example" width="400" /> | This screenshot shows the functionality of using PDFs with the 'Essay Expert' prompt template. At the top, the system leverages PDF content for detailed responses, while the lower section illustrates responses generated without PDFs. |
 55 | | <img src="https://github.com/AbdArdati/PDFQueryAI/blob/main/app/static/images/Home_Example_2.png" alt="Home Example" width="400" /> | This example demonstrates the advanced capabilities of the 'Essays Expert' prompt template. The screenshot highlights how the system utilises PDF content to generate comprehensive responses at the top, while the lower section shows the output generated without PDFs, illustrating the impact of including detailed content. |
 56 | | <img src="https://github.com/AbdArdati/PDFQueryAI/blob/main/app/static/images/Home_Example_3.png" alt="Home Example" width="400" /> | This screenshot reveals the limitations of the current app, indicating that it may struggle with queries beyond the scope of the provided documents. It underscores the need for further improvements and extensive testing to enhance the model's accuracy and robustness. |
 57 | | <img src="https://github.com/AbdArdati/PDFQueryAI/blob/main/app/static/images/PDF_Management_Eaxmple.png" alt="Home Example" width="400" /> | This screenshot demonstrates the PDF Management & Statistics Dashboard, showcasing how users can view detailed statistics related to the uploaded PDFs and documents within the system. |
 58 | 
 59 | ## Installation Instructions ⚙️
 60 | 
 61 | ### System Requirements 🖥️
 62 | 
 63 | For Ollama3.1 8B, ensure your system has at least 16 GB of RAM, and reasonable disk space for a reasonable performance; a GPU is recommended for models with 70B parameters or higher.
 64 | 
 65 | ### 1. Download and Install Ollama
 66 | 
 67 | To use the `Ollama` model, follow these steps to download and install it based on your operating system:
 68 | 
 69 | 1. **Visit the Ollama Download Page:**
 70 |    Go to [Ollama Download Page](https://ollama.com/download).
 71 | 
 72 | 2. **Download the Installer:**
 73 |    Choose the appropriate installer for your operating system and download it.
 74 | 
 75 | 3. **Install Ollama:**
 76 |    Follow the installation instructions provided on the download page for your specific operating system.
 77 | 
 78 | 4. **Verify Installation:**
 79 |    After installation, verify that Ollama is installed correctly by running the following command in your terminal or command prompt:
 80 | 
 81 |    ```bash
 82 |    ollama --version
 83 | 
 84 | 5. **Run the Llama3 Model:**
 85 |    Once Ollama is installed, start the llama3 model by running the following command in your terminal or command prompt:
 86 | 
 87 |    ```bash
 88 |    ollama run llama3.1
 89 |    ```
 90 |   When you run `ollama run llama3.1`, it defaults to using configuration 8b unless you specify otherwise.
 91 | 
 92 | ### 2. **Clone the Repository**:
 93 |     
 94 |     git clone https://github.com/AbdArdati/PDFQueryAI.git
 95 |     cd PDFQueryAI 
 96 | 
 97 | ### 3. **Create a Virtual Environment**:
 98 |     
 99 |     python3 -m venv venv
100 |     source venv/bin/activate  # On Windows use `venv\Scripts\activate`
101 | 
102 | ### 4. **Install Dependencies**:
103 |     
104 |     pip install -r requirements.txt
105 | 
106 | ### 5. **Run the Application**:
107 |      
108 |     python app.py
109 | 
110 | ### Endpoints and Their Functions
111 | 
112 | - **`/`**: Serves the main HTML page for user interaction.
113 |   
114 | - **`/ai`**:
115 |   - **Method**: `POST`
116 |   - **Function**: Accepts a JSON request containing a query, passes it to the AI model, and returns the generated response.
117 | 
118 | - **`/ask_pdf`**:
119 |   - **Method**: `POST`
120 |   - **Function**: Processes queries against uploaded PDFs. It uses a vector store to find relevant documents, generates an answer based on the context, and returns the answer along with sources and usage statistics.
121 | 
122 | - **`/clear_chat_history`**:
123 |   - **Method**: `POST`
124 |   - **Function**: Clears the chat history stored in the global `chat_history` list.
125 | 
126 | - **`/clear_db`**:
127 |   - **Method**: `POST`
128 |   - **Function**: Deletes all stored vector data and PDF files from the filesystem, effectively resetting the application’s state.
129 | 
130 | - **`/list_pdfs`**:
131 |   - **Method**: `GET`
132 |   - **Function**: Lists all PDF files currently available in the storage directory.
133 | 
134 | - **`/pdf`**:
135 |   - **Method**: `POST`
136 |   - **Function**: Handles PDF uploads. The file is saved, processed, and split into chunks. These chunks are then stored in a vector database for later querying.
137 | 
138 | - **`/list_documents`**:
139 |   - **Method**: `GET`
140 |   - **Function**: Lists all documents in the vector store.
141 | 
142 | - **`/delete_pdf`**:
143 |   - **Method**: `POST`
144 |   - **Function**: Deletes a specific PDF file and its associated documents from the vector store.
145 | 
146 | - **`/delete_document`**:
147 |   - **Method**: `POST`
148 |   - **Function**: Deletes a specific document from the vector store by its ID.
149 | 
150 | - **`/pdf_usage`**:
151 |   - **Method**: `GET`
152 |   - **Function**: Provides usage statistics on how often each PDF has been queried.
153 | 
154 | ### Internal Functions 🔧
155 | 
156 | - **`file_exists(filename)`**: Checks if a file with the given name exists in the directory.
157 |   
158 | - **`compute_file_hash(file)`**: Computes the MD5 hash of a file to detect duplicates.
159 |   
160 | - **`perform_ocr(pdf_path)`**: Placeholder function for performing Optical Character Recognition (OCR) on a PDF if it is not structured. This needs to be implemented with an actual OCR library like `pytesseract`.
161 | 
162 | ### Error Handling ⚠️
163 | 
164 | - The application includes basic error handling for missing data, file operations, and vector store interactions. For example, if a file is not found or an operation fails, the application returns an appropriate error message and HTTP status code.
165 | 
166 | ### Release Information 🚀
167 | 
168 | **Current Version:** `v1.0.0-beta`
169 | 
170 | **Release Date:** `25/07/2024`
171 | 
172 | **Description:** This is the beta release of AskItRight. It includes features for uploading, managing, and querying PDF documents using an AI model, as well as basic statistics and administrative functionalities.
173 | 
174 | **Changelog:**
175 | - Initial beta release with core functionalities.
176 | - Added PDF upload, query, and management features.
177 | - Integrated AI model for querying PDF content.
178 | - Implemented basic statistics and database management features.
179 | 
180 | ### Summary 📝
181 | 
182 | Overall, the application provides a structured way to manage and interact with PDF documents using an AI model. It integrates file management, data processing, and querying capabilities into a Flask web service, allowing users to upload, query, and manage PDFs while also keeping track of usage statistics and providing administrative functionalities.
183 | 
184 | 
185 | ### Disclaimer 📝
186 | Please be aware that this application is provided "as is," without any guarantee of functionality or reliability. It's important to note that the codebase may not adhere to best practices in terms of tidiness and organisation, and there is significant room for improvement and bug fixes. While I try to enhance the application, I plan to release a new version in the future as time permits. 
187 | 
188 | The developers disclaim all responsibility for bugs or issues, stating use is at one's own risk. Users are encouraged to contribute but the developers are not liable for any damages from using the application.
189 | 
190 | ## License 🛡️
191 | 
192 | This project is licensed under the Apache 2.0 License. The core components of this application follow the work from [https://github.com/ThomasJay/RAG](https://github.com/ThomasJay/RAG) which is also licensed under Apache 2.0.
193 | 
194 | ```
195 | Licensed under the Apache License, Version 2.0 (the "License");
196 | you may not use this file except in compliance with the License.
197 | You may obtain a copy of the License at
198 | 
199 |     http://www.apache.org/licenses/LICENSE-2.0
200 | 
201 | Unless required by applicable law or agreed to in writing, software
202 | distributed under the License is distributed on an "AS IS" BASIS,
203 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
204 | See the License for the specific language governing permissions and
205 | limitations under the License.
206 | ```
207 | 
208 | ## Code of Conduct
209 | 
210 | We are committed to fostering a welcoming and inclusive community. Please read our [Code of Conduct](CODE_OF_CONDUCT.md) to understand our guidelines and expectations.
211 | 
212 | ---
213 | **Developed by Abd Alsattar Ardati for the sake of exploring, learning, and sharing.**  
214 | Visit my [website](https://abd.ardati.org) for more information or contact me at abd.alsattar.ardati @ gmail.
215 | 


--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
 1 | # Security Policy
 2 | 
 3 | ## Supported Versions
 4 | 
 5 | As this project is primarily for personal experimentation and not intended for production deployment, I do not provide security updates or support for any versions. The project is in a developmental stage, and no security considerations have been incorporated. Use it at your own risk, as mentioned in the README.md.
 6 | 
 7 | | Version | Supported          |
 8 | | ------- | ------------------ |
 9 | | All     | :x:                |
10 | 
11 | ## Reporting a Vulnerability
12 | 
13 | Since this project is for personal use and experimentation only, I am not actively managing or addressing security vulnerabilities. However, if you encounter any issues or potential vulnerabilities, you may report them through the project's [issue tracker](https://github.com/AbdArdati/PDFQueryAI/issues). 
14 | 
15 | Please be aware that due to the experimental nature of this project, there are no guarantees of updates or fixes for reported issues. Reports are reviewed on a case-by-case basis, and while I appreciate feedback, there is no commitment to address or resolve security concerns promptly.
16 | 
17 | Thank you for understanding.
18 | 


--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
1 | from app import create_app
2 | 
3 | if __name__ == "__main__":
4 |     app = create_app()
5 |     app.run(debug=True)  # Turn on debugging mode for development
6 |     print("Flask app is running on http://127.0.0.1:5000")
7 | 


--------------------------------------------------------------------------------
/app/__init__.py:
--------------------------------------------------------------------------------
 1 | from flask import Flask
 2 | __version__ = '1.0.0-beta'
 3 | 
 4 | def create_app():
 5 |     app = Flask(__name__, template_folder='templates')
 6 | 
 7 |     # Register blueprints
 8 |     from .routes import bp
 9 |     app.register_blueprint(bp)
10 | 
11 |     # Additional configuration if needed
12 | 
13 |     return app
14 | 


--------------------------------------------------------------------------------
/app/prompts.py:
--------------------------------------------------------------------------------
  1 | from langchain.prompts import PromptTemplate
  2 | 
  3 | # Define all prompts
  4 | PROMPTS = {
  5 |     "General AI Assistant": PromptTemplate.from_template(
  6 |         """
  7 |         <s>[INST] You are an exceptionally advanced AI assistant, equipped with state-of-the-art capabilities to understand and analyze technical documents. Your role is to deliver responses that are not only accurate and insightful but also enriched with a deep understanding of the context provided by the PDFs.
  8 | 
  9 |         **Instructions:**
 10 |         - Thoroughly analyze the provided context and input.
 11 |         - Extract and synthesize key information from the PDFs to provide a comprehensive and informed response.
 12 |         - Enhance your responses with detailed explanations, advanced insights, and contextually relevant examples.
 13 |         - Present information in a structured format using Markdown where applicable, but prioritize clarity and depth of content over formatting.
 14 |         - Address the query with a high level of detail and sophistication, demonstrating a deep understanding of the subject matter.
 15 |         - If any critical information is missing or if further context is needed, clearly indicate this in your response.
 16 | 
 17 |         **Response Guidelines:**
 18 |         - **Introduction:** Begin with a brief overview of the topic, setting the stage for a detailed analysis.
 19 |         - **Detailed Analysis:** Provide an in-depth examination of the topic, incorporating insights derived from the PDFs.
 20 |         - **Contextual Insights:** Relate the information to the context provided by the PDFs, making connections and highlighting relevant points.
 21 |         - **Examples and Explanations:** Include specific examples, detailed explanations, and any relevant data or findings from the PDFs.
 22 |         - **Conclusion:** Summarize the key points and provide a well-rounded conclusion based on the analysis.
 23 | 
 24 |         **Example Output:**
 25 | 
 26 |         # Overview
 27 |         The provided PDFs offer a comprehensive overview of ...
 28 | 
 29 |         # In-Depth Analysis
 30 |         Based on the documents, the key findings include ...
 31 | 
 32 |         # Contextual Insights
 33 |         The analysis reveals that ...
 34 | 
 35 |         # Examples and Explanations
 36 |         For instance, document A highlights ...
 37 | 
 38 |         # Conclusion
 39 |         In conclusion, the analysis demonstrates ...
 40 | 
 41 |         **Your Response:**
 42 |         [/INST]</s> {input} 
 43 |         Context: {context}
 44 |         """
 45 |     ),
 46 |     "Summary": PromptTemplate.from_template(
 47 |         """
 48 |         <s>[INST] You are an advanced AI assistant with expertise in summarizing technical documents. Your goal is to create a clear, concise, and well-organized summary using Markdown formatting. Focus on extracting and presenting the essential points of the document effectively.
 49 | 
 50 |         **Instructions:**
 51 |         - Analyze the provided context and input carefully.
 52 |         - Identify and highlight the key points, main arguments, and important details.
 53 |         - Format the summary using Markdown for clarity:
 54 |             - Use `#` for main headers and `##` for subheaders.
 55 |             - Use `**text**` for important terms or concepts.
 56 |             - Provide a brief introduction, followed by the main points, and a concluding summary if applicable.
 57 |         - Ensure the summary is easy to read and understand, avoiding unnecessary jargon.
 58 | 
 59 |         **Example Summary Format:**
 60 | 
 61 |         # Overview
 62 |         **Document Title:** *Technical Analysis Report*
 63 | 
 64 |         **Summary:**
 65 |         The report provides an in-depth analysis of the recent technical advancements in AI. It covers key areas such as ...
 66 | 
 67 |         # Key Findings
 68 |         - **Finding 1:** Description of finding 1.
 69 |         - **Finding 2:** Description of finding 2.
 70 | 
 71 |         # Conclusion
 72 |         The analysis highlights the significant advancements and future directions for AI technology.
 73 | 
 74 |         **Your Response:**
 75 |         [/INST]</s> {input} 
 76 |         Context: {context}
 77 |         """
 78 |     ),
 79 |     "Essays Expert": PromptTemplate.from_template(
 80 |         """
 81 |         <s>[INST] Your task is to compose a detailed and engaging essay on the provided topic. Begin by thoroughly examining the context derived from PDFs uploaded by the user, along with the given input. Your essay should be seamlessly structured, starting with an engaging introduction that sets the stage and highlights the significance of the topic. Follow with a comprehensive body where you delve into the subject matter, offering in-depth analysis, relevant examples, and detailed explanations. Conclude with a reflective summary that captures the essence of your discussion and considers potential future implications or directions.
 82 | 
 83 |         Ensure that your essay flows continuously and cohesively, avoiding the use of bullet points or lists. Construct your writing with smooth transitions and connected sentences, employing clear and descriptive language to effectively convey your insights and findings.
 84 | 
 85 |         For example, if addressing recent developments in artificial intelligence, you should explore how advancements are transforming various sectors and influencing societal interactions. Discuss the implications of technological progress in machine learning and natural language processing on business practices and everyday life. Your conclusion should provide thoughtful reflections on the future trajectory of AI and its broader implications.
 86 | 
 87 |         **Your Response:**
 88 |         [/INST]</s> Context derived from PDFs uploaded by the user: {context} {input}
 89 |         """
 90 |     ),
 91 |     "Technical": PromptTemplate.from_template(
 92 |         """
 93 |         <s>[INST] You are a highly skilled AI assistant in technical document summarization. Your task is to provide a detailed and well-organized response using Markdown formatting. The response should be informative and structured, presenting data and information in a clear manner.
 94 | 
 95 |         **Instructions:**
 96 |         - Analyze the provided context and input comprehensively.
 97 |         - Use Markdown to structure the response effectively:
 98 |             - Employ `#`, `##`, `###` headers for different sections.
 99 |             - Use `**text**` to emphasize key points.
100 |             - Include relevant links, code blocks, and tables if applicable.
101 |         - Ensure that each section of the response flows logically and that the information is presented clearly.
102 |         - Indicate if any critical information is missing and provide a structured layout for easy readability.
103 | 
104 |         **Markdown Formatting Guide:**
105 |         - Headers: Use `#` for main headings, `##` for subheadings, and `###` for detailed subheadings.
106 |         - Bold Text: Use `**text**` to highlight important terms or concepts.
107 |         - Italic Text: Use `*text*` for emphasis.
108 |         - Bulleted Lists: Use `-` or `*` for unordered lists where necessary.
109 |         - Numbered Lists: Use `1.`, `2.` for ordered lists when appropriate.
110 |         - Links: Include `[link text](URL)` to provide additional resources or references.
111 |         - Code Blocks: Use triple backticks (```) for code snippets.
112 |         - Tables: Use `|` to organize data into tables for clarity.
113 | 
114 |         **Example Output:**
115 | 
116 |         ## Introduction
117 |         The document provides a thorough analysis of ...
118 | 
119 |         ## Key Details
120 |         - **Aspect 1:** Detailed description of aspect 1.
121 |         - **Aspect 2:** Detailed description of aspect 2.
122 | 
123 |         ## Analysis
124 |         The analysis reveals ...
125 | 
126 |         ## Conclusion
127 |         The summary highlights the significance of ...
128 | 
129 |         **Your Response:**
130 |         [/INST]</s> {input} 
131 |         Context: {context}
132 |         """
133 |     ),
134 |     # Add more prompts as needed
135 | }
136 | 


--------------------------------------------------------------------------------
/app/routes.py:
--------------------------------------------------------------------------------
  1 | 
  2 | 
  3 | from flask import Flask, request, jsonify, render_template, Blueprint, send_from_directory
  4 | from langchain_community.llms import Ollama
  5 | from langchain_community.vectorstores import Chroma
  6 | from langchain.text_splitter import RecursiveCharacterTextSplitter
  7 | from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
  8 | from langchain_community.document_loaders import PDFPlumberLoader
  9 | from langchain.chains.combine_documents import create_stuff_documents_chain
 10 | from langchain.chains import create_retrieval_chain
 11 | from langchain_core.messages import HumanMessage, AIMessage
 12 | from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
 13 | from langchain.chains.history_aware_retriever import create_history_aware_retriever
 14 | from .prompts import PROMPTS
 15 | 
 16 | import os
 17 | import shutil
 18 | import hashlib
 19 | import logging
 20 | 
 21 | # Define a blueprint
 22 | bp = Blueprint('main', __name__)
 23 | 
 24 | # Folder paths
 25 | folder_path = "data/db"
 26 | pdf_dir = "data/pdf"
 27 | 
 28 | # Path to the PDF directory for viewing PDFs
 29 | PDF_DIRECTORY = os.path.join(os.path.dirname(__file__), '..', 'data', 'pdf')
 30 | 
 31 | # Dictionary to keep track of PDF usage counts
 32 | pdf_usage_count = {}
 33 | query_usage_count = {}
 34 | chat_history = []
 35 | 
 36 | # Initialize the Ollama model
 37 | cached_llm = Ollama(model="llama3.1")
 38 | 
 39 | # Initialize the embedding model
 40 | embedding = FastEmbedEmbeddings()
 41 | 
 42 | # Initialize text splitter
 43 | text_splitter = RecursiveCharacterTextSplitter(
 44 |     chunk_size=2048,  # Increased chunk size for better context
 45 |     chunk_overlap=100,  # Increased overlap to maintain context between chunks
 46 |     length_function=len,
 47 |     is_separator_regex=False
 48 | )
 49 | 
 50 | def initialize_vector_store():
 51 |     if not os.path.exists(folder_path):
 52 |         os.makedirs(folder_path)
 53 |     # Perform a test operation to ensure proper initialization
 54 |     try:
 55 |         vector_store = Chroma(persist_directory=folder_path, embedding_function=embedding)
 56 |         # Example check to validate database
 57 |         if not vector_store.get():
 58 |             print("Vector store is empty or not properly initialized.")
 59 |             # Additional initialization steps if needed
 60 |         return vector_store
 61 |     except Exception as e:
 62 |         print(f"Error initializing vector store: {str(e)}")
 63 |         return None
 64 | 
 65 | initialize_vector_store()
 66 | 
 67 | if not os.path.exists(folder_path):
 68 |     print(f"Error: Directory {folder_path} does not exist.")
 69 | 
 70 | def file_exists(file_path):
 71 |     return os.path.isfile(file_path)
 72 | 
 73 | def compute_file_hash(file):
 74 |     """Compute the MD5 hash of a file."""
 75 |     hash_md5 = hashlib.md5()
 76 |     for chunk in iter(lambda: file.read(4096), b""):
 77 |         hash_md5.update(chunk)
 78 |     file.seek(0)  # Reset file pointer
 79 |     return hash_md5.hexdigest()
 80 | 
 81 | 
 82 | if not os.path.exists(folder_path):
 83 |     os.makedirs(folder_path)
 84 | 
 85 | if not os.path.exists(pdf_dir):
 86 |     os.makedirs(pdf_dir)
 87 | 
 88 | def preprocess_text(text):
 89 |     # Implement text cleaning steps
 90 |     text = text.strip().replace('\n', ' ').replace('\r', '')
 91 |     return text
 92 | 
 93 | class Document:
 94 |     def __init__(self, page_content, metadata=None):
 95 |         self.page_content = page_content
 96 |         self.metadata = metadata or {}
 97 | 
 98 | @bp.route('/')
 99 | def home():
100 |     return render_template('index.html')
101 | 
102 | @bp.route('/prompts', methods=['GET'])
103 | def get_prompts():
104 |     try:
105 |         # Convert PromptTemplate objects to a serializable format
106 |         serializable_prompts = {key: str(value) for key, value in PROMPTS.items()}
107 |         return jsonify(serializable_prompts)
108 |     except Exception as e:
109 |         return jsonify({"error": str(e)}), 500
110 | 
111 | 
112 | @bp.route("/pdfManagement")
113 | def pdfManagement():
114 |     try:
115 |         # Fetch the PDF and document statistics
116 |         pdf_files = os.listdir("data/pdf")  # Adjust the path as needed
117 |         vector_store = Chroma(persist_directory="data/db", embedding_function=embedding)
118 |         db_data = vector_store.get()
119 |         document_count = len(db_data.get("metadatas", []))
120 |         
121 |         return render_template("pdfManagement.html", pdf_count=len(pdf_files), doc_count=document_count)
122 |     except Exception as e:
123 |         return jsonify({"error": str(e)}), 500
124 |     
125 | @bp.route("/ai", methods=["POST"])
126 | def aiPost():
127 |     query_usage_count = {}
128 |     print("POST /ai called")
129 |     json_content = request.json
130 |     query = json_content.get("query")
131 | 
132 |     if not query:
133 |         return jsonify({"error": "No 'query' found in JSON request"}), 400
134 | 
135 |     print(f"query: {query}")
136 | 
137 |     response = cached_llm.invoke(query)
138 | 
139 |     print(response)
140 | 
141 |     response_answer = {"answer": response}
142 |     return jsonify(response_answer)
143 | 
144 | @bp.route("/ask_pdf", methods=["POST"])
145 | def askPDFPost():
146 |     query_usage_count = {}# Added for tracking PDF usage count
147 |     print("POST /ask_pdf called")
148 | 
149 |     json_content = request.json
150 |     query = json_content.get("query")
151 |     prompt_type = json_content.get("promptType")  # Get the prompt type
152 | 
153 |     if not query:
154 |         return jsonify({"error": "No 'query' found in JSON request"}), 400
155 | 
156 |     print(f"**query**: {query}")
157 |     print(f"**prompt_type**: {prompt_type}")
158 | 
159 |     # Dynamically select the prompt based on prompt_type
160 |     prompt = PROMPTS.get(prompt_type)
161 |     if not prompt:
162 |         return jsonify({"error": "Unknown prompt type"}), 400
163 | 
164 |     try:
165 |         print("Loading vector store")
166 |         vector_store = Chroma(persist_directory=folder_path, embedding_function=embedding)
167 |         db_data = vector_store.get()
168 | 
169 |         if not db_data.get("metadatas"):
170 |             print("Call ended since there are no documents available to process the query.")
171 |             return jsonify({
172 |                 "answer": "No documents available to process your query.",
173 |                 "disclaimer": "No documents available to process your query. Upload some PDFs to enable document search.",
174 |                 "pdf_usage": {},
175 |                 "query_usage": {}
176 |             })
177 | 
178 |         print("Creating retrieval chain")
179 |         retriever = vector_store.as_retriever(
180 |             search_type="similarity_score_threshold",
181 |             search_kwargs={
182 |                 "k": 20,
183 |                 "score_threshold": 0.1,
184 |             },
185 |         )
186 | 
187 |         retriever_prompt = ChatPromptTemplate.from_messages(
188 |             [
189 |                 MessagesPlaceholder(variable_name="chat_history"),
190 |                 ("human", "{input}"),
191 |                 (
192 |                     "human",
193 |                     "Given the above conversation, generate a search query to lookup in order to get information relevant to the conversation",
194 |                 ),
195 |             ]
196 |         )
197 |         history_aware_retriever = create_history_aware_retriever(
198 |             llm=cached_llm, retriever=retriever, prompt=retriever_prompt
199 |         )
200 |         document_chain = create_stuff_documents_chain(cached_llm, prompt)
201 | 
202 |         retrieval_chain = create_retrieval_chain(
203 |             history_aware_retriever,
204 |             document_chain,
205 |         )
206 | 
207 |         result = retrieval_chain.invoke({"input": query})
208 |         chat_history.append(HumanMessage(content=query))
209 |         chat_history.append(AIMessage(content=result["answer"]))
210 | 
211 |         print(chat_history)
212 | 
213 |         sources = create_context_with_metadata(result.get("context", []))
214 | 
215 |         # Update PDF usage count and query usage count as before
216 |         for doc in result["context"]:
217 |             pdf_source = doc.metadata.get("source")
218 |             if pdf_source in pdf_usage_count:
219 |                 pdf_usage_count[pdf_source] += 1
220 |             else:
221 |                 pdf_usage_count[pdf_source] = 1
222 | 
223 |             if pdf_source in query_usage_count:
224 |                 query_usage_count[pdf_source] += 1
225 |             else:
226 |                 query_usage_count[pdf_source] = 1
227 | 
228 |         if not sources:
229 |             answer = f"No relevant documents found for the query: {query}. This answer is generated without any PDF context."
230 |         else:
231 |             answer = result["answer"]
232 | 
233 |         response_answer = {
234 |             "answer": answer,
235 |             "sources": sources,
236 |             "pdf_usage": {pdf: {"count": count, "percentage": (count / sum(pdf_usage_count.values()) * 100) if sum(pdf_usage_count.values()) > 0 else 0} for pdf, count in pdf_usage_count.items()},
237 |             "query_usage": {pdf: {"count": count, "percentage": (count / sum(query_usage_count.values()) * 100) if sum(query_usage_count.values()) > 0 else 0} for pdf, count in query_usage_count.items()},
238 |             "disclaimer": "This answer is not based on any available PDF documents." if not sources else None
239 |         }
240 | 
241 |         return jsonify(response_answer)
242 |     except Exception as e:
243 |         return jsonify({"error": str(e)}), 500
244 | 
245 | 
246 | def create_context_with_metadata(documents):
247 |     contexts = []
248 |     for doc in documents:
249 |         metadata = doc.metadata
250 |         context = f"Document Source: {metadata.get('source', 'Unknown')}\n"
251 |         context += f"Content: {doc.page_content}\n"
252 |         contexts.append({
253 |             "source": metadata.get("source", "Unknown"),
254 |             "page_content": doc.page_content
255 |         })
256 |     return contexts
257 | 
258 | @bp.route("/clear_chat_history", methods=["POST"])
259 | def clear_chat_history():
260 |     global chat_history
261 |     chat_history = []
262 |     print("Chat history cleared")
263 |     return jsonify({"status": "Chat history cleared successfully"})
264 | 
265 | @bp.route("/clear_db", methods=["POST"])
266 | def clear_db():
267 |     logging.info("POST /clear_db called")
268 |     try:
269 |         # Clear the vector store (i.e., delete all documents)
270 |         clear_vector_store()
271 | 
272 |         # Clear the PDF directory if necessary
273 |         clear_directory(pdf_dir)
274 | 
275 |         # Reinitialize the vector store
276 |         global vector_store
277 |         vector_store = initialize_vector_store()
278 | 
279 |         return jsonify({"status": "Database and files cleared successfully"})
280 |     except Exception as e:
281 |         logging.error(f"Error during clear_db operation: {str(e)}")
282 |         return jsonify({"error": str(e)}), 500
283 | 
284 | def clear_directory(directory_path):
285 |     """
286 |     Clears the specified directory by removing it and then recreating it.
287 |     """
288 |     if os.path.exists(directory_path):
289 |         shutil.rmtree(directory_path)
290 |         os.makedirs(directory_path)
291 |         logging.info(f"Directory cleared: {directory_path}")
292 | 
293 | def clear_vector_store():
294 |     """
295 |     Clears all documents from the vector store.
296 |     """
297 |     try:
298 |         global vector_store
299 |         # Initialize the vector store
300 |         vector_store = Chroma(persist_directory=folder_path, embedding_function=embedding)
301 | 
302 |         # Get all document IDs from the vector store
303 |         db_data = vector_store.get()  # Get the data from the vector store
304 |         ids = db_data.get("ids", [])
305 | 
306 |         # Log the number of documents found
307 |         logging.info(f"Found {len(ids)} documents in vector store")
308 | 
309 |         if ids:
310 |             # Delete all documents by their IDs
311 |             for doc_id in ids:
312 |                 if doc_id:
313 |                     vector_store.delete(doc_id)
314 |                     logging.info(f"Deleted document with ID: {doc_id}")
315 | 
316 |             # Persist changes to the vector store
317 |             vector_store.persist()
318 |             logging.info("Successfully deleted all documents and persisted changes.")
319 |         else:
320 |             logging.info("No documents found in vector store to delete.")
321 |     
322 |     except Exception as e:
323 |         logging.error(f"Error during vector store clearing: {str(e)}")
324 |         raise
325 | 
326 |     
327 | @bp.route("/list_pdfs", methods=["GET"])
328 | def list_pdfs():
329 |     print("GET /list_pdfs called")
330 |     print(f"pdf_dir: {pdf_dir}")
331 |     print(f"pdf_dir: {pdf_files}")
332 |     try:
333 |         pdf_files = os.listdir(pdf_dir)
334 |         return jsonify({"pdf_files": pdf_files})
335 |     except Exception as e:
336 |         return jsonify({"error": str(e)}), 500
337 |     
338 | @bp.route("/pdf", methods=["POST"])
339 | def pdfPost():
340 |     if 'file' not in request.files:
341 |         return jsonify({"error": "No file part in the request"}), 400
342 | 
343 |     file = request.files['file']
344 | 
345 |     if file.filename == '':
346 |         return jsonify({"error": "No selected file"}), 400
347 | 
348 |     file_name = file.filename
349 |     save_file = os.path.join(pdf_dir, file_name)
350 | 
351 |     # Check if the file already exists
352 |     if file_exists(save_file):
353 |         return jsonify({"error": "File already exists."}), 400
354 | 
355 |     # Compute file hash to check for duplicates
356 |     try:
357 |         file_hash = compute_file_hash(file)
358 |     except Exception as e:
359 |         return jsonify({"error": f"Error computing file hash: {str(e)}"}), 500
360 | 
361 |     existing_files = [f for f in os.listdir(pdf_dir) if f.endswith('.pdf')]
362 |     for existing_file in existing_files:
363 |         existing_file_path = os.path.join(pdf_dir, existing_file)
364 |         try:
365 |             if compute_file_hash(open(existing_file_path, 'rb')) == file_hash:
366 |                 return jsonify({"error": "File with identical content already exists."}), 400
367 |         except Exception as e:
368 |             return jsonify({"error": f"Error checking existing files: {str(e)}"}), 500
369 | 
370 |     # Save the file
371 |     try:
372 |         file.save(save_file)
373 |     except Exception as e:
374 |         return jsonify({"error": f"Error saving file: {str(e)}"}), 500
375 | 
376 |     # Load and preprocess the PDF file
377 |     try:
378 |         loader = PDFPlumberLoader(save_file)
379 |         docs = loader.load_and_split()
380 |         is_structured = True
381 |     except Exception as e:
382 |         print(f"Error loading structured text: {e}")
383 |         docs = []
384 |         is_structured = False
385 | 
386 |     if not docs:
387 |         # Perform OCR if the document is unstructured
388 |         try:
389 |             print("Performing OCR")
390 |             text = perform_ocr(save_file)
391 |             docs = [Document(page_content=preprocess_text(text), metadata={"source": file_name})]
392 |             is_structured = False
393 |         except Exception as e:
394 |             return jsonify({"error": f"Error during OCR processing: {str(e)}"}), 500
395 |     else:
396 |         # Preprocess text from documents
397 |         docs = [Document(page_content=preprocess_text(doc.page_content), metadata={"source": file_name}) for doc in docs]
398 | 
399 |     try:
400 |         chunks = text_splitter.split_documents(docs)
401 |         print(f"Loaded len={len(chunks)} chunks")
402 | 
403 |         # Add source metadata to each chunk
404 |         for chunk in chunks:
405 |             chunk.metadata = {"source": file_name}
406 |         global vector_store
407 |         # Initialize the vector store
408 |         vector_store = Chroma.from_documents(
409 |             documents=chunks, embedding=embedding, persist_directory=folder_path
410 |         )
411 |     except Exception as e:
412 |         return jsonify({"error": f"Error initializing vector store: {str(e)}"}), 500
413 | 
414 |     response = {
415 |         "status": "Successfully Uploaded",
416 |         "filename": file_name,
417 |         "doc_len": len(docs),
418 |         "chunk_len": len(chunks),
419 |         "is_structured": is_structured
420 |     }
421 |     return jsonify(response)
422 | 
423 | 
424 | @bp.route("/list_documents", methods=["GET"])
425 | def list_documents():
426 | 
427 |     try:
428 |         print("GET /list_pdfs called")
429 |         print(f"pdf_dir: {folder_path}")
430 |         print(f"pdf_dir: {pdf_dir}")
431 |         print(f"folder_path: {folder_path}")
432 |         print(f"embedding function: {embedding}")
433 | 
434 |         if not os.path.exists(folder_path):
435 |             print(f"The directory {folder_path} does not exist. Initializing vector store...")
436 |             os.makedirs(folder_path, exist_ok=True)  # Ensure the folder exists before initializing
437 | 
438 |         print("Initializing vector store...")
439 |         initialize_vector_store()
440 |         try:
441 |             vector_store = Chroma(persist_directory=folder_path, embedding_function=embedding)
442 |         except Exception as init_error:
443 |             print(f"Failed to initialize vector store: {init_error}")
444 |             return jsonify({"error": "Failed to initialize vector store"}), 500
445 |         print("Vector store initialized.")
446 | 
447 |         print("Fetching data from vector store...")
448 |         try:
449 |             db_data = vector_store.get()
450 |         except Exception as fetch_error:
451 |             print(f"Failed to fetch data from vector store: {fetch_error}")
452 |             return jsonify({"error": "Failed to fetch data from vector store"}), 500
453 | 
454 |         if not db_data or "metadatas" not in db_data or not db_data["metadatas"]:
455 |             print("No documents found in vector store.")
456 |             return jsonify({"message": "No documents found"}), 200
457 | 
458 |         documents = [{"source": metadata.get("source", "Unknown")} for metadata in db_data["metadatas"]]
459 |         response = {"documents": documents}
460 |         print("Documents extracted:", documents)
461 |         return jsonify(response), 200
462 | 
463 |     except Exception as e:
464 |         print(f"Error listing documents: {e}")
465 |         return jsonify({"error": "An error occurred while listing documents. Please try again later."}), 500
466 | 
467 | @bp.route("/delete_pdf", methods=["POST"])
468 | def delete_pdf():
469 |     json_content = request.json
470 |     file_name = json_content.get("file_name")
471 | 
472 |     if not file_name:
473 |         return jsonify({"error": "No 'file_name' found in JSON request"}), 400
474 | 
475 |     try:
476 |         # Construct the file path
477 |         file_path = os.path.join(pdf_dir, file_name)
478 |         print(f"Attempting to delete file at: {file_path}")
479 | 
480 |         # Create a path pattern to match against
481 |         path_pattern = file_name  # Use only file_name for pattern matching
482 |         print(f"Looking for documents with source: {path_pattern}")
483 | 
484 |         # Remove the PDF references from the vector store
485 |         vector_store = Chroma(persist_directory=folder_path, embedding_function=embedding)
486 | 
487 |         # Get all documents from the vector store
488 |         db_data = vector_store.get()  # Get the data from the vector store
489 |         print(f"db_data: {db_data}")
490 |         metadatas = db_data.get("metadatas", [])
491 |         ids = db_data.get("ids", [])
492 |         
493 |         # Log the number of documents and sample metadata
494 |         print(f"Found {len(metadatas)} documents in vector store")
495 |         if metadatas:
496 |             print(f"Sample metadata: {metadatas[0]}")
497 | 
498 |         # Find and delete documents with the matching source path
499 |         docs_to_delete = [id for id, metadata in zip(ids, metadatas) if metadata.get("source").strip().lower() == path_pattern.strip().lower()]
500 |         print(f"Documents to delete: {docs_to_delete}")
501 | 
502 |         if docs_to_delete:
503 |             for doc_id in docs_to_delete:
504 |                 if doc_id is None:
505 |                     print("Encountered None as doc_id, skipping deletion.")
506 |                     continue
507 |                 print(f"Deleting document with ID: {doc_id}")
508 |                 vector_store.delete(doc_id)
509 | 
510 |             # Persist changes to the vector store
511 |             vector_store.persist()
512 |             print(f"Successfully deleted documents and persisted changes.")
513 |         else:
514 |             print(f"No documents found for source: {path_pattern}")
515 |             return jsonify({"status": "No documents found for the provided file name"}), 404
516 | 
517 |         # Check if the file exists
518 |         if os.path.exists(file_path):
519 |             os.remove(file_path)
520 |             print(f"Successfully deleted file: {file_path}")
521 |         else:
522 |             print(f"File not found: {file_path}")
523 |             return jsonify({"error": "File not found"}), 404
524 |         
525 |         return jsonify({"status": "success"})
526 |     except Exception as e:
527 |         print(f"Error during deletion: {str(e)}")
528 |         return jsonify({"error": str(e)}), 500
529 | 
530 | @bp.route('/pdfs/<path:filename>')
531 | def serve_pdf(filename):
532 |     return send_from_directory(PDF_DIRECTORY, filename)
533 | 
534 | @bp.route("/delete_document", methods=["POST"])
535 | def delete_document():
536 |     print("POST /delete_document called")
537 |     json_content = request.json
538 |     doc_id = json_content.get("doc_id")
539 | 
540 |     if not doc_id:
541 |         return jsonify({"error": "No 'doc_id' found in JSON request"}), 400
542 | 
543 |     try:
544 |         vector_store = Chroma(persist_directory=folder_path, embedding_function=embedding)
545 |         vector_store.delete(doc_id)
546 |         return jsonify({"status": "Document deleted successfully"})
547 |     except Exception as e:
548 |         return jsonify({"error": str(e)}), 500
549 | 
550 | def perform_ocr(pdf_path):
551 |     # Placeholder function for OCR processing
552 |     # You can implement this using libraries like pytesseract or other OCR tools
553 |     return "OCR processed text from PDF"
554 | 
555 | @bp.route("/pdf_usage", methods=["GET"])
556 | def get_pdf_usage():
557 |     try:
558 |         # Calculate percentage influence
559 |         total_queries = sum(pdf_usage_count.values())
560 |         pdf_influence = {pdf: {"count": count, "percentage": (count / total_queries * 100) if total_queries > 0 else 0} for pdf, count in pdf_usage_count.items()}
561 |         return jsonify({"pdf_usage": pdf_influence})
562 |     except Exception as e:
563 |         return jsonify({"error": str(e)}), 500
564 | 
565 | # Configure logging
566 | logging.basicConfig(level=logging.INFO, filename='app.log', filemode='a')
567 | 


--------------------------------------------------------------------------------
/app/static/css/style.css:
--------------------------------------------------------------------------------
  1 | /* Root Variables */
  2 | :root {
  3 |     --primary-color: #007bff;
  4 |     --secondary-color: #0056b3;
  5 |     --highlight-color: #003d7a;
  6 |     --text-color: #333;
  7 |     --background-color: #f4f4f4;
  8 |     --card-background: #fff;
  9 |     --border-color: #ddd;
 10 |     --border-radius: 8px;
 11 |     --box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
 12 |     --transition-duration: 0.3s;
 13 | }
 14 | 
 15 | /* General Styles */
 16 | body {
 17 |     font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
 18 |     margin: 0;
 19 |     padding: 0;
 20 |     background-color: var(--background-color);
 21 |     color: var(--text-color);
 22 | }
 23 | 
 24 | .container {
 25 |     max-width: 1200px;
 26 |     margin: 0 auto;
 27 |     padding: 20px;
 28 | }
 29 | 
 30 | 
 31 | /* Header Styles */
 32 | header {
 33 |     background-color: var(--primary-color);
 34 |     color: #fff;
 35 |     padding: 10px 10px;
 36 |     text-align: center;
 37 |     box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
 38 |     border-bottom: 5px solid #0056b3;
 39 |     margin-bottom: 20px;
 40 | }
 41 | 
 42 | .title-wrapper {
 43 |     position: relative;
 44 |     display: inline-block;
 45 | }
 46 | 
 47 | header h1 {
 48 |     margin: 0;
 49 |     font-size: 2.5em;
 50 |     font-weight: 700;
 51 |     color: #ffffff;
 52 | }
 53 | 
 54 | .ai-powered {
 55 |     position: absolute;
 56 |     top: 100%;
 57 |     left: 101%;
 58 |     transform: translateX(-50%);
 59 |     font-weight: normal;
 60 |     color: #FFFFFF;
 61 |     background-color: transparent;
 62 |     border-radius: 5px;
 63 |     font-size: 13px;
 64 |     margin-top: -10px;
 65 |     white-space: nowrap;
 66 | }
 67 | 
 68 | header h2 {
 69 |     font-size: 20px;
 70 |     color: #ffffff;
 71 |     margin-top: 20px;
 72 | }
 73 | 
 74 | /* Media Query for Smaller Screens */
 75 | @media (max-width: 768px) {
 76 |     .ai-powered {
 77 |         position: static;
 78 |         transform: none;
 79 |         margin-top: 10px;
 80 |         white-space: normal; /* Allow text to wrap */
 81 |         display: block;
 82 |         text-align: center;
 83 |     }
 84 | 
 85 |     .title-wrapper {
 86 |         text-align: center;
 87 |     }
 88 | }
 89 | 
 90 | /* Navbar Styles */
 91 | .navbar {
 92 |     background-color: var(--secondary-color);
 93 |     padding: 10px 0;
 94 | }
 95 | 
 96 | .navbar .container {
 97 |     display: flex;
 98 |     align-items: center;
 99 |     justify-content: space-between; /* Space between logo and links */
100 | }
101 | 
102 | .navbar-logo {
103 |     flex-shrink: 0; /* Prevent logo from shrinking */
104 | }
105 | 
106 | .header-image {
107 |     width: 70px; /* Adjust as needed */
108 |     height: auto; /* Maintain aspect ratio */
109 |     margin-right: 10px; /* Space between the image and text */
110 | }
111 | 
112 | .navbar-links {
113 |     display: flex;
114 |     gap: 15px; /* Space between nav links */
115 | }
116 | 
117 | .nav-link {
118 |     color: #fff;
119 |     text-decoration: none;
120 |     padding: 10px 20px;
121 |     font-size: 18px;
122 |     transition: background-color var(--transition-duration), color var(--transition-duration);
123 | }
124 | 
125 | .nav-link:hover {
126 |     background-color: var(--highlight-color);
127 |     color: #ffd700;
128 | }
129 | 
130 | /* Card Styles */
131 | .card {
132 |     background: var(--card-background);
133 |     border-radius: var(--border-radius);
134 |     box-shadow: var(--box-shadow);
135 |     padding: 20px;
136 |     margin-bottom: 20px;
137 | }
138 | 
139 | /* Section Styles */
140 | .section-group {
141 |     display: flex;
142 |     flex-wrap: wrap;
143 |     gap: 20px;
144 | }
145 | 
146 | .section {
147 |     flex: 1;
148 |     min-width: calc(50% - 20px);
149 | }
150 | 
151 | h2 {
152 |     color: var(--text-color);
153 |     margin-top: 0;
154 | }
155 | 
156 | /* Button Styles */
157 | button {
158 |     padding: 10px 20px;
159 |     background: var(--primary-color);
160 |     color: #fff;
161 |     border: none;
162 |     cursor: pointer;
163 |     border-radius: 5px;
164 |     font-size: 16px;
165 |     transition: background-color var(--transition-duration), transform var(--transition-duration);
166 | }
167 | 
168 | button:hover {
169 |     background: var(--secondary-color);
170 |     transform: scale(1.05);
171 | }
172 | 
173 | /* Input Styles */
174 | input[type="file"], input[type="text"] {
175 |     width: 100%;
176 |     padding: 10px;
177 |     margin: 10px 0;
178 |     box-sizing: border-box;
179 |     border: 1px solid var(--border-color);
180 |     border-radius: 5px;
181 | }
182 | 
183 | /* List Styles */
184 | ul, ol {
185 |     padding: 0;
186 |     margin: 0;
187 |     list-style-position: inside;
188 | }
189 | 
190 | li {
191 |     background: var(--card-background);
192 |     margin: 5px 0;
193 |     padding: 10px;
194 |     border: 1px solid var(--border-color);
195 |     border-radius: 5px;
196 | }
197 | 
198 | /* Container for response and resize handle */
199 | .response-container {
200 |     position: relative;
201 |     margin-top: 20px;
202 |     width: 100%;
203 |     box-sizing: border-box;
204 | }
205 | 
206 | /* Response Section */
207 | .response {
208 |     padding: 15px;
209 |     border: 1px solid var(--border-color);
210 |     border-radius: 5px;
211 |     background: #fff;
212 |     height: 150px;
213 |     max-height: 500px;
214 |     min-height: 100px;
215 |     overflow-y: auto;
216 |     box-sizing: border-box;
217 |     width: 100%;
218 | }
219 | 
220 | /* Resize Handle */
221 | .resize-handle {
222 |     width: 20px;
223 |     height: 20px;
224 |     background: var(--primary-color);
225 |     position: absolute;
226 |     bottom: 0;
227 |     right: 0;
228 |     cursor: ns-resize;
229 |     z-index: 10;
230 |     border-top-left-radius: 5px;
231 |     border-top-right-radius: 5px;
232 |     box-shadow: 0 2px 5px rgba(0, 0, 0, 0.2);
233 | }
234 | 
235 | /* Loading Spinner */
236 | .spinner {
237 |     border: 4px solid rgba(0, 0, 0, 0.1);
238 |     border-radius: 50%;
239 |     border-top: 4px solid var(--primary-color);
240 |     width: 24px;
241 |     height: 24px;
242 |     animation: spin 1s linear infinite;
243 |     margin: 10px auto;
244 | }
245 | 
246 | /* Spin Animation */
247 | @keyframes spin {
248 |     0% { transform: rotate(0deg); }
249 |     100% { transform: rotate(360deg); }
250 | }
251 | 
252 | /* Loading Message */
253 | .loading-message {
254 |     font-size: 16px;
255 |     color: var(--primary-color);
256 |     text-align: center;
257 |     margin: 10px 0;
258 | }
259 | 
260 | /* Statistics Section */
261 | .stats {
262 |     padding: 10px;
263 |     background: #f9f9f9;
264 |     border: 1px solid var(--border-color);
265 |     border-radius: 5px;
266 |     margin-top: 20px;
267 | }
268 | 
269 | /* Status Messages */
270 | .status-message {
271 |     font-size: 16px;
272 |     color: var(--primary-color);
273 |     text-align: center;
274 |     margin-top: 10px;
275 | }
276 | 
277 | /* Fade Out Animation */
278 | .fade-out {
279 |     opacity: 0;
280 |     transition: opacity 2s ease-out;
281 | }
282 | 
283 | /* Button Styles (Additional) */
284 | .view-button {
285 |     background-color: #4CAF50;
286 |     color: white;
287 |     border: none;
288 |     padding: 10px 15px;
289 |     margin-right: 5px;
290 |     border-radius: 5px;
291 |     cursor: pointer;
292 | }
293 | 
294 | .delete-button {
295 |     background-color: #f44336;
296 |     color: white;
297 |     border: none;
298 |     padding: 10px 15px;
299 |     border-radius: 5px;
300 |     cursor: pointer;
301 | }
302 | 
303 | .view-button:hover, .delete-button:hover {
304 |     opacity: 0.8;
305 | }
306 | 
307 | /* Enhanced Styles for Prompt Container */
308 | #promptContainer {
309 |     max-width: 600px;
310 |     margin: 20px auto;
311 |     padding: 20px;
312 |     border: 1px solid var(--border-color);
313 |     border-radius: var(--border-radius);
314 |     background-color: #f9f9f9;
315 |     box-shadow: var(--box-shadow);
316 | }
317 | 
318 | #promptContainer p {
319 |     font-size: 16px;
320 |     color: var(--text-color);
321 |     margin-bottom: 20px;
322 |     line-height: 1.5;
323 | }
324 | 
325 | /* Style for the select dropdown */
326 | #promptType {
327 |     width: 100%;
328 |     padding: 10px;
329 |     font-size: 16px;
330 |     border: 1px solid var(--border-color);
331 |     border-radius: 4px;
332 |     background-color: #fff;
333 |     box-shadow: inset 0 1px 3px rgba(0, 0, 0, 0.12);
334 |     transition: border-color var(--transition-duration) ease;
335 | }
336 | 
337 | #promptType:focus {
338 |     border-color: var(--primary-color);
339 |     outline: none;
340 |     box-shadow: 0 0 0 2px rgba(0, 123, 255, 0.25);
341 | }
342 | 
343 | /* Container for each PDF item */
344 | .pdf-item {
345 |     display: flex;
346 |     justify-content: space-between; /* Space between content and buttons */
347 |     align-items: center; /* Align items vertically in the center */
348 |     margin-bottom: 20px;
349 |     padding: 10px;
350 |     border: 1px solid var(--border-color);
351 |     border-radius: 5px;
352 |     background-color: var(--card-background);
353 | }
354 | 
355 | 
356 | 
357 | 
358 | /* Button container to hold buttons */
359 | .button-container {
360 |     display: flex;
361 |     gap: 10px; /* Space between buttons */
362 | }
363 | 
364 | /* General button style */
365 | .button-container .btn {
366 |     padding: 8px 12px;
367 |     font-size: 14px;
368 |     color: #fff;
369 |     border: none;
370 |     border-radius: 4px;
371 |     cursor: pointer;
372 |     transition: background-color 0.3s ease;
373 | }
374 | 
375 | .button-container .btn {
376 |     padding: 8px 12px;
377 |     font-size: 14px;
378 |     border: none;
379 |     border-radius: var(--border-radius);
380 |     cursor: pointer;
381 |     transition: background-color var(--transition-duration), transform var(--transition-duration);
382 | }
383 | 
384 | 
385 | /* General button style */
386 | .button-container .btn {
387 |     padding: 8px 12px;
388 |     font-size: 14px;
389 |     color: #fff;
390 |     border: none;
391 |     border-radius: 4px;
392 |     cursor: pointer;
393 |     transition: background-color 0.3s ease;
394 | }
395 | 
396 | /* Specific button styles */
397 | .view-button {
398 |     background-color: #007bff;
399 | }
400 | 
401 | .view-button:hover {
402 |     background-color: #0056b3;
403 | }
404 | 
405 | .delete-button {
406 |     background-color: #dc3545;
407 | }
408 | 
409 | .delete-button:hover {
410 |     background-color: #c82333;
411 | }
412 | 
413 | #pdfList {
414 |     margin: 20px;
415 | }
416 | 
417 | /* Accessibility Improvements */
418 | :focus {
419 |     outline: 3px solid var(--highlight-color);
420 |     outline-offset: 2px;
421 | }
422 | 
423 | /* Dark Mode Support */
424 | @media (prefers-color-scheme: dark) {
425 |     :root {
426 |         --primary-color: #0056b3;
427 |         --secondary-color: #007bff;
428 |         --highlight-color: #ffd700;
429 |         --text-color: #f4f4f4;
430 |         --background-color: #333;
431 |         --card-background: #444;
432 |         --border-color: #555;
433 |     }
434 | 
435 |     body {
436 |         background-color: var(--background-color);
437 |         color: var(--text-color);
438 |     }
439 | 
440 |     .nav-link, button, .status-message, .loading-message {
441 |         color: #fff;
442 |     }
443 | 
444 |     .card, .response, .stats, .status-message, .loading-message, #promptContainer, .pdf-item {
445 |         background: var(--card-background);
446 |         border-color: var(--border-color);
447 |     }
448 | 
449 |     input[type="file"], input[type="text"], #promptType {
450 |         background-color: #555;
451 |         color: #fff;
452 |     }
453 | }
454 | 
455 | /* Smooth Transitions for Hover and Focus */
456 | a, button, .nav-link, input[type="file"], input[type="text"], #promptType {
457 |     transition: background-color 0.3s ease, color 0.3s ease, border-color 0.3s ease, box-shadow 0.3s ease;
458 | }
459 | 
460 | /* Consistent Spacing and Alignment */
461 | * {
462 |     box-sizing: border-box;
463 | }
464 | 
465 | .container, .section-group, .button-container, #pdfList {
466 |     padding-left: 15px;
467 |     padding-right: 15px;
468 | }
469 | 
470 | .section-group, .button-container {
471 |     justify-content: space-between;
472 | }
473 | 
474 | /* Enhancements for Readability */
475 | body, h1, h2, p, .nav-link, button, input, .status-message, .loading-message {
476 |     line-height: 1.6;
477 |     font-weight: normal;
478 | }
479 | 
480 | h1, h2 {
481 |     font-weight: bold;
482 | }
483 | 
484 | footer {
485 |     background-color: #007bff;
486 |     color: #fff;
487 |     padding: 20px 0;
488 |     text-align: center;
489 |     border-top: 5px solid #0056b3;
490 | }
491 | 
492 | footer a {
493 |     color: #ffd700;
494 |     text-decoration: none;
495 | }
496 | 
497 | footer a:hover {
498 |     color: #ffffff;
499 |     text-decoration: underline;
500 | }
501 | 
502 | .footer-content {
503 |     max-width: 1200px;
504 |     margin: 0 auto;
505 |     padding: 0 20px;
506 | }
507 | 
508 | @media (max-width: 768px) {
509 |     .footer-content {
510 |         text-align: center;
511 |     }
512 | }
513 | 
514 | .navigation {
515 |     display: flex;
516 |     justify-content: center;
517 |     padding: 20px 0;
518 | }
519 | 
520 | .navigation .button {
521 |     background-color: #6c757d;
522 |     color: #ffffff;
523 |     padding: 8px 16px;
524 |     text-decoration: none;
525 |     border-radius: 4px;
526 |     font-weight: normal;
527 |     font-size: 0.9rem;
528 |     transition: background-color 0.3s, transform 0.2s;
529 | }
530 | 
531 | .navigation .button:hover {
532 |     background-color: #5a6268;
533 |     transform: translateY(-2px);
534 | }
535 | 
536 | .app-name {
537 |     font-weight: bold;
538 |     color: #4CAF50;
539 |     margin-right: 10px;
540 | }
541 | 
542 | 
543 | .header-image {
544 |     width: 70px;
545 |     height: auto;
546 |     margin-right: 10px;
547 | }
548 | 
549 | .welcome {
550 |     font-family: 'Brush Script MT', cursive; /* Example of a handwritten-style font */
551 |     font-size: 35px; /* Adjust size as needed */
552 |     color: #ffffff; /* Example color, adjust as needed */
553 |     margin-right: 5px;
554 | }
555 | 
556 | @media (max-width: 768px) {
557 |     .pdf-item {
558 |         flex-direction: column; /* Stack items vertically on smaller screens */
559 |         padding: 15px;
560 |     }
561 | 
562 |     .button-container {
563 |         flex-direction: column; /* Stack buttons vertically */
564 |         gap: 10px;
565 |         justify-content: center; /* Center buttons on smaller screens */
566 |     }
567 | 
568 |     .button-container .btn {
569 |         width: 100%; /* Make buttons full width on small screens */
570 |         font-size: 16px;
571 |     }
572 | }
573 | 
574 | @media (max-width: 480px) {
575 |     .pdf-item {
576 |         padding: 10px;
577 |     }
578 | 
579 |     .button-container .btn {
580 |         padding: 10px;
581 |         font-size: 14px;
582 |     }
583 | }
584 | 


--------------------------------------------------------------------------------
/app/static/favicon.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AbdArdati/PDFQueryAI/5448eeb6ebfbf31f881b51227e1a7468518d7449/app/static/favicon.ico


--------------------------------------------------------------------------------
/app/static/images/Home_Example_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AbdArdati/PDFQueryAI/5448eeb6ebfbf31f881b51227e1a7468518d7449/app/static/images/Home_Example_1.png


--------------------------------------------------------------------------------
/app/static/images/Home_Example_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AbdArdati/PDFQueryAI/5448eeb6ebfbf31f881b51227e1a7468518d7449/app/static/images/Home_Example_2.png


--------------------------------------------------------------------------------
/app/static/images/Home_Example_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AbdArdati/PDFQueryAI/5448eeb6ebfbf31f881b51227e1a7468518d7449/app/static/images/Home_Example_3.png


--------------------------------------------------------------------------------
/app/static/images/PDF_Management_Eaxmple.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AbdArdati/PDFQueryAI/5448eeb6ebfbf31f881b51227e1a7468518d7449/app/static/images/PDF_Management_Eaxmple.png


--------------------------------------------------------------------------------
/app/static/images/full_logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AbdArdati/PDFQueryAI/5448eeb6ebfbf31f881b51227e1a7468518d7449/app/static/images/full_logo.png


--------------------------------------------------------------------------------
/app/static/images/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AbdArdati/PDFQueryAI/5448eeb6ebfbf31f881b51227e1a7468518d7449/app/static/images/logo.png


--------------------------------------------------------------------------------
/app/static/js/home.js:
--------------------------------------------------------------------------------
  1 | // Function to make API requests
  2 | async function apiRequest(url, options = {}) {
  3 |     try {
  4 |         const response = await fetch(url, options);
  5 | 
  6 |         // Check if the response status is not OK
  7 |         if (!response.ok) {
  8 |             const errorText = await response.text(); // Read response text for debugging
  9 |             throw new Error(`Failed to fetch from ${url}. Status: ${response.status}. Response: ${errorText}`);
 10 |         }
 11 | 
 12 |         // Try parsing JSON, if applicable
 13 |         try {
 14 |             return await response.json();
 15 |         } catch (jsonError) {
 16 |             throw new Error(`Failed to parse JSON response from ${url}. Response: ${await response.text()}`);
 17 |         }
 18 |     } catch (error) {
 19 |         console.error('API request error:', error.message);
 20 |         throw error; // Re-throw error to be handled by the calling function
 21 |     }
 22 | }
 23 | 
 24 | // Function to copy text from PDF field and submit to AI
 25 | async function copyTextAndSubmit() {
 26 |     const pdfText = document.getElementById('queryPDF').value;
 27 |     document.getElementById('query').value = pdfText;
 28 |     await askAI(); // Call askAI directly since it's defined in this file
 29 | }
 30 | 
 31 | // Function to ask AI a question
 32 | async function askAI() {
 33 |     console.log('askAI called'); // Debug log
 34 |     const query = document.getElementById('query').value;
 35 |     const responseDiv = document.getElementById('queryResponseAI');
 36 | 
 37 |     if (!query) {
 38 |         alert('Please enter a query.');
 39 |         return;
 40 |     }
 41 | 
 42 |     responseDiv.innerHTML = '<div class="spinner"></div><p class="loading-message">Fetching response, please wait...</p>';
 43 | 
 44 |     try {
 45 |         const result = await apiRequest('/ai', {
 46 |             method: 'POST',
 47 |             headers: { 'Content-Type': 'application/json' },
 48 |             body: JSON.stringify({ query }),
 49 |         });
 50 | 
 51 |         responseDiv.innerHTML = `<p>${await markdownToHTML(result.answer) || result.error}</p>`;
 52 |     } catch (error) {
 53 |         responseDiv.innerHTML = `<p>An error occurred while processing the query: ${error.message}</p>`;
 54 |     }
 55 | }
 56 | 
 57 | // Function to ask about PDF
 58 | async function askPDF() {
 59 |     console.log('askPDF called'); // Debug log
 60 |     const query = document.getElementById('queryPDF').value;
 61 |     const responseDiv = document.getElementById('queryResponse');
 62 |     const promptType = document.getElementById('promptType').value;
 63 | 
 64 |     if (!responseDiv) {
 65 |         console.error('Element with ID "queryResponse" is missing.');
 66 |         return;
 67 |     }
 68 | 
 69 |     if (!query) {
 70 |         alert('Please enter a query.');
 71 |         return;
 72 |     }
 73 | 
 74 |     responseDiv.innerHTML = '<div class="spinner"></div><p class="loading-message">Fetching response, please wait...</p>';
 75 | 
 76 |     try {
 77 |         console.log('Sending API request...'); // Debug log
 78 |         const result = await apiRequest('/ask_pdf', {
 79 |             method: 'POST',
 80 |             headers: { 'Content-Type': 'application/json' },
 81 |             body: JSON.stringify({ query, promptType }),
 82 |         });
 83 | 
 84 |         console.log('API response received:', result); // Debug log
 85 | 
 86 |         if (result.disclaimer) {
 87 |             responseDiv.innerHTML = `<p>${result.disclaimer}</p>`;
 88 |         } else {
 89 |             responseDiv.innerHTML = `<p>${await markdownToHTML(result.answer) || result.error}</p>`;
 90 |         }
 91 | 
 92 |         if (result.pdf_usage) displayPDFUsage(result.pdf_usage);
 93 |         if (result.query_usage) displayQueryUsage(result.query_usage);
 94 |     } catch (error) {
 95 |         responseDiv.innerHTML = `<p>An error occurred while processing the PDF query: ${error.message}</p>`;
 96 |         console.error('Error in askPDF:', error); // Debug log
 97 |     }
 98 | }
 99 | 
100 | // Function to display PDF usage statistics
101 | function displayPDFUsage(pdfUsage) {
102 |     const usageDiv = document.getElementById('pdfUsageStats');
103 |     usageDiv.innerHTML = '<h3>PDF Usage Statistics In Total</h3>';
104 | 
105 |     if (Object.keys(pdfUsage).length === 0) {
106 |         usageDiv.innerHTML += '<p>No usage statistics available.</p>';
107 |         return;
108 |     }
109 | 
110 |     usageDiv.innerHTML += '<ul>' + Object.entries(pdfUsage).map(([pdf, data]) => 
111 |         `<li>${pdf}: ${data.count} queries (${data.percentage.toFixed(2)}%)</li>`
112 |     ).join('') + '</ul>';
113 | }
114 | 
115 | // Function to display query usage statistics
116 | function displayQueryUsage(queryUsage) {
117 |     const usageDiv = document.getElementById('queryUsageStats');
118 |     usageDiv.innerHTML = '<h3>PDF Usage Statistics Per Query</h3>';
119 | 
120 |     if (Object.keys(queryUsage).length === 0) {
121 |         usageDiv.innerHTML += '<p>No usage statistics available.</p>';
122 |         return;
123 |     }
124 | 
125 |     usageDiv.innerHTML += '<ul>' + Object.entries(queryUsage).map(([pdf, data]) => 
126 |         `<li>${pdf}: ${data.count} queries (${data.percentage.toFixed(2)}%)</li>`
127 |     ).join('') + '</ul>';
128 | }
129 | 
130 | // Function to convert markdown to HTML
131 | async function markdownToHTML(markdown) {
132 |     if (!markdown) return '';
133 | 
134 |     let html = markdown
135 |         .replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>') // Bold
136 |         .replace(/\n{2,}/g, '</p><p>') // Paragraphs
137 |         .replace(/\n/g, '<br>') // New lines
138 |         .replace(/\t/g, '&nbsp;&nbsp;&nbsp;&nbsp;') // Tabs
139 |         .replace(/^(\d+)\.\s+(.*)$/gm, (match, number, text) => `<p>${number}. ${text}</p>`) // Numbered lists
140 |         .replace(/^\*\s+(.*)$/gm, (match, text) => `<ul><li>${text}</li></ul>`) // Bullet lists
141 |         .replace(/<\/ul>\s*<ul>/g, '</ul><ul>') // Nested lists
142 |         .replace(/<\/li>\s*<li>/g, '</li><li>') // List items
143 |         .replace(/<\/ul>\s*<\/li>/g, '</ul></li>'); // Nested lists closing
144 | 
145 |     return `<p>${html.replace(/<\/p>\s*<p>/g, '</p><p>').replace(/<p>\s*<\/p>/g, '')}</p>`;
146 | }
147 | 
148 | // Event listeners for form submission on Enter key press
149 | document.addEventListener('DOMContentLoaded', () => {
150 |     console.log('DOMContentLoaded fired'); // Debug log
151 | 
152 |     const queryPDFInput = document.getElementById('queryPDF');
153 |     const queryPDFButton = document.getElementById('askPDFButton');
154 |     if (queryPDFInput && queryPDFButton) {
155 |         queryPDFInput.addEventListener('keypress', function(event) {
156 |             if (event.key === 'Enter') {
157 |                 event.preventDefault();
158 |                 queryPDFButton.click();
159 |             }
160 |         });
161 |     } else {
162 |         console.log('queryPDFInput or queryPDFButton is missing'); // Debug log
163 |     }
164 | 
165 |     const queryInput = document.getElementById('query');
166 |     const queryButton = document.getElementById('askAIButton');
167 |     if (queryInput && queryButton) {
168 |         queryInput.addEventListener('keypress', function(event) {
169 |             if (event.key === 'Enter') {
170 |                 event.preventDefault();
171 |                 queryButton.click();
172 |             }
173 |         });
174 |     } else {
175 |         console.log('queryInput or queryButton is missing'); // Debug log
176 |     }
177 | 
178 |     const copyPromptButton = document.getElementById('copyPromptButton');
179 |     if (copyPromptButton) {
180 |         copyPromptButton.addEventListener('click', copyTextAndSubmit);
181 |     } else {
182 |         console.log('copyPromptButton is missing'); // Debug log
183 |     }
184 | 
185 |     // Fetch prompts when the page is loaded
186 |     fetchPrompts();
187 | });
188 | 
189 | // Function to fetch prompts from the server
190 | async function fetchPrompts() {
191 |     try {
192 |         const prompts = await apiRequest('/prompts');
193 |         const selectElement = document.getElementById('promptType');
194 | 
195 |         if (selectElement) {
196 |             selectElement.innerHTML = '';
197 |             for (const [key] of Object.entries(prompts)) {
198 |                 const option = document.createElement('option');
199 |                 option.value = key;
200 |                 option.textContent = key;
201 |                 selectElement.appendChild(option);
202 |             }
203 |         } else {
204 |             console.error('Prompt type select element is missing in the DOM.');
205 |         }
206 |     } catch (error) {
207 |         console.error('Error fetching prompts:', error);
208 |     }
209 | }
210 | 


--------------------------------------------------------------------------------
/app/static/js/script.js:
--------------------------------------------------------------------------------
  1 | async function apiRequest(url, options = {}) {
  2 |     try {
  3 |         const response = await fetch(url, options);
  4 | 
  5 |         // Check if the response status is not OK
  6 |         if (!response.ok) {
  7 |             // Attempt to parse the error response
  8 |             let errorMessage = `Failed to fetch from ${url}. Status: ${response.status}`;
  9 | 
 10 |             try {
 11 |                 // Try parsing the response as JSON
 12 |                 const errorData = await response.json();
 13 |                 errorMessage += `. Response: ${errorData.message || errorData.error || 'Unknown error'}`;
 14 |             } catch {
 15 |                 // If JSON parsing fails, fall back to plain text response
 16 |                 errorMessage += `. Response: ${await response.text()}`;
 17 |             }
 18 | 
 19 |             throw new Error(errorMessage);
 20 |         }
 21 | 
 22 |         // Try parsing JSON, if applicable
 23 |         try {
 24 |             return await response.json();
 25 |         } catch (jsonError) {
 26 |             throw new Error(`Failed to parse JSON response from ${url}. Response: ${await response.text()}`);
 27 |         }
 28 |     } catch (error) {
 29 |         console.error('API request error:', error.message);
 30 |         throw error; // Re-throw error to be handled by the calling function
 31 |     }
 32 | }
 33 | 
 34 | 
 35 | async function uploadPDF() {
 36 |     const fileInput = document.getElementById('pdfFile');
 37 |     const file = fileInput.files[0];
 38 | 
 39 |     if (!file) return showToast('Please select a PDF file to upload.');
 40 | 
 41 |     const formData = new FormData();
 42 |     formData.append('file', file);
 43 | 
 44 |     try {
 45 |         const result = await apiRequest('/pdf', {
 46 |             method: 'POST',
 47 |             body: formData,
 48 |         });
 49 | 
 50 |         const { status, filename, doc_len, chunk_len, error } = result;
 51 |         if (status === 'Successfully Uploaded') {
 52 |             showToast(`Success: ${status}\nFilename: ${filename}\nLoaded ${doc_len} documents\nLoaded len=${chunk_len} chunks`, type = 'success');
 53 |             
 54 |             // Call listPDFs function after successful upload
 55 |             listPDFs();
 56 |         } else {
 57 |             showToast(error || 'An error occurred during the upload.', type = 'error');
 58 |         }
 59 |     } catch (error) {
 60 |         // Display only the error message for 400 Bad Request
 61 |         const errorMessage = error.message.includes('Status: 400') 
 62 |             ? `${error.message.split('Response: ')[1]}`
 63 |             : `An error occurred while uploading the PDF: ${error.message}`;
 64 | 
 65 |         showToast(errorMessage, type = 'error');
 66 |     }
 67 | }
 68 | 
 69 | 
 70 | // Function to clear chat history
 71 | async function clearChatHistory() {
 72 |     try {
 73 |         const data = await apiRequest('/clear_chat_history', {
 74 |             method: 'POST',
 75 |             headers: { 'Content-Type': 'application/json' },
 76 |         });
 77 | 
 78 |         const statusElement = document.getElementById('chatHistoryStatus');
 79 |         statusElement.innerText = data.status === "Chat history cleared successfully" ? 'Chat history cleared successfully.' : 'Failed to clear chat history.';
 80 |         statusElement.classList.remove('fade-out');
 81 | 
 82 |         setTimeout(() => statusElement.classList.add('fade-out'), 1500);
 83 |     } catch (error) {
 84 |         const statusElement = document.getElementById('chatHistoryStatus');
 85 |         statusElement.innerText = `An error occurred while clearing chat history: ${error.message}`;
 86 |         statusElement.classList.remove('fade-out');
 87 |     }
 88 | 
 89 | }
 90 | // Function to list PDFs
 91 | async function listPDFs() {
 92 |     try {
 93 |         const result = await apiRequest('/list_documents');
 94 | 
 95 |         const pdfList = document.getElementById('pdfList');
 96 |         pdfList.innerHTML = '';
 97 | 
 98 |         if (result.documents && result.documents.length > 0) {
 99 |             const seenPDFs = new Set();
100 |             result.documents.forEach(doc => {
101 |                 const source = doc.source;
102 |                 if (!seenPDFs.has(source)) {
103 |                     seenPDFs.add(source);
104 | 
105 |                     const listItem = document.createElement('div');
106 |                     listItem.className = 'pdf-item'; // Apply the new class
107 |                     listItem.textContent = source;
108 | 
109 |                     const buttonContainer = document.createElement('div');
110 |                     buttonContainer.className = 'button-container';
111 | 
112 |                     buttonContainer.appendChild(createButton('View', 'btn view-button', () => window.open(`/pdfs/${source}`, '_blank').focus()));
113 |                     buttonContainer.appendChild(createButton('Delete', 'btn delete-button', () => deletePDF(source)));
114 | 
115 |                     listItem.appendChild(buttonContainer);
116 |                     pdfList.appendChild(listItem);
117 |                 }
118 |             });
119 |         } else {
120 |             pdfList.innerHTML = '<li>No documents found.</li>';
121 |         }
122 |     } catch (error) {
123 |         console.error('Error during listPDFs:', error);
124 |         showToast('An error occurred while listing documents. Please try again later.', type = 'error');
125 |     }
126 | }
127 | 
128 | // Create a button with text, class, and click handler
129 | function createButton(text, className, onClick) {
130 |     const button = document.createElement('button');
131 |     button.textContent = text;
132 |     button.className = className;
133 |     button.onclick = onClick;
134 |     return button;
135 | }
136 | 
137 | // Function to copy text from PDF field and submit to AI
138 | async function copyTextAndSubmit() {
139 |     const pdfText = document.getElementById('queryPDF').value;
140 |     document.getElementById('query').value = pdfText;
141 |     document.querySelector('.query-section button[onclick="askAI()"]').click();
142 | }
143 | 
144 | // Function to delete a PDF
145 | async function deletePDF(fileName) {
146 |     if (!confirm(`Are you sure you want to delete ${fileName}?`)) return;
147 | 
148 |     try {
149 |         const result = await apiRequest('/delete_pdf', {
150 |             method: 'POST',
151 |             headers: { 'Content-Type': 'application/json' },
152 |             body: JSON.stringify({ file_name: fileName }),
153 |         });
154 | 
155 |         if (result.status === 'success') {
156 |             showToast('PDF deleted successfully.', type = 'success');
157 |             listPDFs();
158 |         } else {
159 |             showToast('Failed to delete PDF: ' + (result.error || 'Unknown error'), type = 'error');
160 |         }
161 |     } catch (error) {
162 |         showToast('An error occurred while deleting the PDF: ' + error.message, type = 'error');
163 |     }
164 | }
165 | 
166 | // Function to clear the database
167 | async function clearDatabase() {
168 |     if (!confirm('Are you sure you want to delete all PDFs and clear the database?')) return;
169 | 
170 |     try {
171 |         const result = await apiRequest('/clear_db', {
172 |             method: 'POST',
173 |             headers: { 'Content-Type': 'application/json' }
174 |         });
175 | 
176 |         showToast(result.error ? `Error: ${result.error}` : 'Database and files cleared successfully', type = 'success');
177 | 
178 |         // Call listPDFs function after successful database clearance
179 |         if (!result.error) {
180 |             listPDFs();
181 |         }
182 |     } catch (error) {
183 |         showToast('Network Error: ' + error.message, type = 'error');
184 |     }
185 | }
186 | 
187 | 
188 | async function showToast(message, type = 'info') {
189 |     const toast = document.createElement('div');
190 |     toast.textContent = message;
191 | 
192 |     // Apply base styles
193 |     toast.style.position = 'fixed';
194 |     toast.style.top = '5%'; // Center vertically
195 |     toast.style.left = '50%'; // Center horizontally
196 |     toast.style.transform = 'translate(-50%, -50%)';
197 |     toast.style.padding = '25px 30px';
198 |     toast.style.borderRadius = '12x';
199 |     toast.style.color = '#fff';
200 |     toast.style.fontFamily = 'Arial, sans-serif';
201 |     toast.style.fontSize = '24px';
202 |     toast.style.zIndex = '1000';
203 |     toast.style.boxShadow = '0 2px 4px rgba(0,0,0,0.3)';
204 |     toast.style.transition = 'opacity 0.5s ease, top 0.5s ease'; // Add transition for top property
205 | 
206 |     // Style based on type
207 |     switch (type) {
208 |         case 'success':
209 |             toast.style.backgroundColor = '#4CAF50'; // Green
210 |             break;
211 |         case 'error':
212 |             toast.style.backgroundColor = '#F44336'; // Red
213 |             break;
214 |         case 'warning':
215 |             toast.style.backgroundColor = '#FFC107'; // Yellow
216 |             break;
217 |         case 'info':
218 |         default:
219 |             toast.style.backgroundColor = '#2196F3'; // Blue
220 |             break;
221 |     }
222 | 
223 |     // Append toast to the body
224 |     document.body.appendChild(toast);
225 | 
226 |     // Animate toast appearance
227 |     toast.style.opacity = '1';
228 | 
229 |     // Remove the toast after a timeout
230 |     setTimeout(() => {
231 |         toast.style.opacity = '0';
232 |         toast.style.top = '45%'; // Move up slightly before disappearing
233 |         setTimeout(() => {
234 |             document.body.removeChild(toast);
235 |         }, 500); // Allow time for fade-out animation
236 |     }, 3000); // Toast disappears after 3 seconds
237 | }
238 | 
239 | function handleNavigation(event) {
240 |     // Check if the spinner is still visible
241 |     var spinner = document.querySelector('.spinner');
242 |     if (spinner) {
243 |         // Show a confirmation dialog to the user
244 |         var confirmNavigation = confirm("You have an ongoing process. If you leave now, you may not get the answer you are waiting for. Do you want to continue?");
245 |         if (!confirmNavigation) {
246 |             // Prevent the default action if the user does not confirm
247 |             event.preventDefault();
248 |         }
249 |     }
250 | }
251 | 
252 | // Select all response containers and resize handles
253 | const responseContainers = document.querySelectorAll('.response-container');
254 | const resizeHandles = document.querySelectorAll('.resize-handle');
255 | 
256 | resizeHandles.forEach((resizeHandle, index) => {
257 |     let isResizing = false;
258 |     let startY, startHeight;
259 | 
260 |     // Function to start resizing
261 |     function startResizing(event) {
262 |         isResizing = true;
263 |         // Use 'touches[0].clientY' for touch events and 'clientY' for mouse events
264 |         startY = event.touches ? event.touches[0].clientY : event.clientY;
265 |         startHeight = parseInt(document.defaultView.getComputedStyle(responseContainers[index].querySelector('.response')).height, 10);
266 | 
267 |         document.addEventListener('mousemove', handleMouseMove);
268 |         document.addEventListener('mouseup', stopResizing);
269 |         // Add touch event listeners
270 |         document.addEventListener('touchmove', handleMouseMove, { passive: false });
271 |         document.addEventListener('touchend', stopResizing);
272 |     }
273 | 
274 |     // Function to handle mouse and touch move events
275 |     function handleMouseMove(event) {
276 |         if (isResizing) {
277 |             event.preventDefault(); // Prevent scrolling while resizing
278 |             let clientY = event.touches ? event.touches[0].clientY : event.clientY;
279 |             let newHeight = startHeight + (clientY - startY);
280 | 
281 |             if (newHeight < 100) { // Optional: Minimum height constraint
282 |                 newHeight = 100;
283 |             }
284 |             responseContainers[index].querySelector('.response').style.height = `${newHeight}px`;
285 |         }
286 |     }
287 | 
288 |     // Function to stop resizing
289 |     function stopResizing() {
290 |         isResizing = false;
291 |         document.removeEventListener('mousemove', handleMouseMove);
292 |         document.removeEventListener('mouseup', stopResizing);
293 |         // Remove touch event listeners
294 |         document.removeEventListener('touchmove', handleMouseMove);
295 |         document.removeEventListener('touchend', stopResizing);
296 |     }
297 | 
298 |     // Listen for both mouse and touch start events
299 |     resizeHandle.addEventListener('mousedown', startResizing);
300 |     resizeHandle.addEventListener('touchstart', startResizing, { passive: false });
301 | });


--------------------------------------------------------------------------------
/app/templates/index.html:
--------------------------------------------------------------------------------
  1 | <!DOCTYPE html>
  2 | <html lang="en">
  3 | <head>
  4 |     <meta charset="UTF-8">
  5 |     <meta name="viewport" content="width=device-width, initial-scale=1.0">
  6 |     <title>Home - PDF Interaction Dashboard</title>
  7 |     <link rel="icon" type="image/x-icon" href="{{ url_for('static', filename='favicon.ico') }}">
  8 |     <link rel="stylesheet" href="/static/css/style.css">
  9 |     <script src="/static/js/home.js" defer></script>
 10 |     <script src="/static/js/script.js" defer></script>
 11 | </head>
 12 | <body>
 13 |     <header>
 14 |         <div class="header-content">
 15 |             <div class="title-wrapper">
 16 |                 <h1>
 17 |                     <span class="welcome">Welcome to</span> AskItRight
 18 |                 </h1>
 19 |                 <div class="ai-powered">
 20 |                     AI-Powered PDF Query App (Ollama3.1) 🚀
 21 |                 </div>
 22 |             </div>
 23 |             <h2>Document Interaction Dashboard</h2>
 24 |             <p>Manage your PDFs, ask questions, and view usage statistics all in one place.</p>
 25 |         </div>
 26 |     </header>    
 27 |     <div class="container card">
 28 |         <nav class="navbar">
 29 |             <div class="container">
 30 |                 <div class="navbar-logo">
 31 |                     <img src="{{ url_for('static', filename='images/logo.png') }}" alt="Header Image" class="header-image">
 32 |                 </div>
 33 |                 <div class="navbar-links">
 34 |                     <a href="{{ url_for('main.home') }}" class="nav-link">Home</a>
 35 |                     <a href="{{ url_for('main.pdfManagement') }}" class="nav-link" onclick="handleNavigation(event)">PDF Management</a>
 36 |                     <!-- Add more links here as needed -->
 37 |                 </div>
 38 |             </div>
 39 |         </nav>
 40 |         <div class="section-group">
 41 |             <div class="card section">
 42 |                 <div class="pdf-query-section">
 43 |                     <h2>Ask About PDF Content</h2>
 44 |                     <div id="promptContainer">
 45 |                         <p>Use the dropdown menu below to choose an appropriate prompt template...</p>
 46 |                         <select id="promptType">
 47 |                             <!-- Options populated by JavaScript -->
 48 |                         </select>
 49 |                     </div>
 50 |                     <input type="text" id="queryPDF" placeholder="Type your question about the PDF/s">
 51 |                     <button id="askPDFButton" onclick="askPDF()">Submit</button>
 52 |                     <div id="queryResponseContainer" class="response-container">
 53 |                         <div id="queryResponse" class="response">
 54 |                             <!-- Response content goes here -->
 55 |                         </div>
 56 |                         <div class="resize-handle"></div>
 57 |                     </div>
 58 |                     <button onclick="clearChatHistory()" style="margin-top: 10px;">Clear Chat History</button>
 59 |                     <span style="margin-left: 10px;"></span>
 60 |                     <span id="chatHistoryStatus" class="status-message"></span>
 61 |                 </div>
 62 |             </div>
 63 |         </div>
 64 |         
 65 |         <div class="section-group">
 66 |             <div class="card section">
 67 |                 <div class="query-section">
 68 |                     <h2>Ask AI</h2>
 69 |                     <p>Submit your questions to Ollama3.1 AI model with no embedding from PDFs. I would use this mostly to compare answers</p>
 70 |                     <input type="text" id="query" placeholder="Type your question here to the AI model without PDF">
 71 |                     <button id="askAIButton" onclick="askAI()">Submit</button>
 72 |                     <button onclick="copyTextAndSubmit()">Copy Prompt from ASKPDF & Submit</button>
 73 |                     <div id="queryResponseContainerAI" class="response-container">
 74 |                         <div id="queryResponseAI" class="response">
 75 |                             <!-- Response content goes here -->
 76 |                         </div>
 77 |                         <div class="resize-handle"></div>
 78 |                     </div>
 79 |                 </div>
 80 |             </div>
 81 |         </div>
 82 |         
 83 |         <div class="section-group">
 84 |             <div class="card section">
 85 |                 <h2>Usage Statistics</h2>
 86 |                 <div class="query-usage-section">
 87 |                     <h3>Query Usage Statistics</h3>
 88 |                     <p>View statistics on how frequently different queries are made to the AI and PDF content.</p>
 89 |                     <div id="queryUsageStats" class="stats"></div>       
 90 |                 </div>
 91 |                 <div class="pdf-usage-section">
 92 |                     <h3>PDF Usage Statistics</h3>
 93 |                     <p>Monitor how often each uploaded PDF is accessed or queried within the system.</p>
 94 |                     <div id="pdfUsageStats" class="stats"></div>
 95 |                 </div>
 96 |             </div>
 97 |         </div>
 98 | 
 99 |         <div class="navigation">
100 |             <a href="{{ url_for('main.pdfManagement') }}" class="button" onclick="handleNavigation(event)">PDF Management</a>
101 |         </div>    
102 |     </div>
103 | </body>
104 | <footer>
105 |     <div class="footer-content">
106 |         <p>© 2024 Abd Alsattar Ardati. Created for exploration, learning, and sharing. 📚🔍</p>
107 |         <p>Licensed under Apache 2.0. Visit <a href="https://abd.ardati.org" target="_blank">abd.ardati.org</a> 🌐</p>
108 |     </div>
109 | </footer>
110 | </html>
111 | 


--------------------------------------------------------------------------------
/app/templates/pdfManagement.html:
--------------------------------------------------------------------------------
  1 | <!DOCTYPE html>
  2 | <html lang="en">
  3 | 
  4 | <head>
  5 |     <meta charset="UTF-8">
  6 |     <meta name="viewport" content="width=device-width, initial-scale=1.0">
  7 |     <title>PDF Management</title>
  8 |     <link rel="icon" type="image/x-icon" href="{{ url_for('static', filename='favicon.ico') }}">
  9 |     <link rel="stylesheet" href="{{ url_for('static', filename='css/style.css') }}">
 10 |     <script src="static/js/script.js" defer></script>
 11 | </head>
 12 | 
 13 | <body>
 14 |     <header>
 15 |         <div class="header-content">
 16 |             <div class="title-wrapper">
 17 |                 <h1>
 18 |                     <span class="welcome">Welcome to</span> AskItRight
 19 |                 </h1>
 20 |                 <div class="ai-powered">
 21 |                     AI-Powered PDF Query App (Ollama3.1) 🚀
 22 |                 </div>
 23 |             </div>
 24 |             <h2>PDF Management & Statistics Dashboard</h2>
 25 |             <p>View statistics related to the uploaded PDFs and documents in the system.</p>
 26 |         </div>
 27 |     </header> 
 28 | 
 29 |     <div class="container card">
 30 |         <nav class="navbar">
 31 |             <div class="container">
 32 |                 <div class="navbar-logo">
 33 |                     <img src="{{ url_for('static', filename='images/logo.png') }}" alt="Header Image" class="header-image">
 34 |                 </div>
 35 |                 <div class="navbar-links">
 36 |                     <a href="{{ url_for('main.home') }}" class="nav-link">Home</a>
 37 |                     <a href="{{ url_for('main.pdfManagement') }}" class="nav-link" onclick="handleNavigation(event)">PDF Management</a>
 38 |                     <!-- Add more links here as needed -->
 39 |                 </div>
 40 |             </div>
 41 |         </nav>
 42 | 
 43 |         <div class="section-group">
 44 |             <div class="card section">
 45 |                 <div class="upload-section">
 46 |                     <h2>Upload New PDF</h2>
 47 |                     <p>Upload a new PDF document to be processed and made available for querying.</p>
 48 |                     <input type="file" id="pdfFile">
 49 |                     <button onclick="uploadPDF()">Upload</button>
 50 |                 </div>
 51 |             </div>
 52 |         </div>
 53 | 
 54 |         <div class="section-group">
 55 |             <div class="card section">
 56 |                 <div class="pdf-list-section">
 57 |                     <h2>Available PDF Documents</h2>
 58 |                     <p>View and access the list of PDFs that have been uploaded to the system.</p>
 59 |                     <button onclick="listPDFs()">List PDFs</button>
 60 |                     <ul id="pdfList"></ul>
 61 |                 </div>
 62 |             </div>
 63 |         </div>
 64 | 
 65 |         <div class="section-group">
 66 |             <div class="card section">
 67 |                 <div class="clear-db-section">
 68 |                     <h2>Database and File Management</h2>
 69 |                     <p>Clear all stored data and uploaded files from the system. Use this carefully as it will delete all records.</p>
 70 |                     <button class="primary" onclick="clearDatabase()">Clear Database</button>
 71 |                 </div>
 72 |             </div>
 73 |         </div>
 74 | 
 75 |         <div class="section-group">
 76 |             <div class="card section">
 77 |                 <h2>PDF Statistics</h2>
 78 |                 <p>Total PDFs: {{ pdf_count }}</p>
 79 |             </div>
 80 |         </div>
 81 | 
 82 |         <div class="section-group">
 83 |             <div class="card section">
 84 |                 <h2>Document Statistics</h2>
 85 |                 <p>Total Documents in Chroma DB: {{ doc_count }}</p>
 86 |             </div>
 87 |         </div>
 88 | 
 89 |         <div class="navigation">
 90 |             <a href="{{ url_for('main.home') }}" class="button">Back to Home</a>
 91 |         </div>
 92 |     </div>
 93 | 
 94 |     <footer>
 95 |         <div class="footer-content">
 96 |             <p>© 2024 Abd Alsattar Ardati. Created for exploration, learning, and sharing. 📚🔍</p>
 97 |             <p>Licensed under Apache 2.0. Visit <a href="https://abd.ardati.org" target="_blank">abd.ardati.org</a> 🌐</p>
 98 |         </div>
 99 |     </footer>
100 | </body>
101 | 
102 | </html>
103 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
  1 | aiohttp==3.9.5
  2 | aiosignal==1.3.1
  3 | annotated-types==0.7.0
  4 | anyio==4.4.0
  5 | asgiref==3.8.1
  6 | attrs==23.2.0
  7 | backoff==2.2.1
  8 | bcrypt==4.1.3
  9 | blinker==1.8.2
 10 | build==1.2.1
 11 | cachetools==5.4.0
 12 | certifi==2024.7.4
 13 | cffi==1.16.0
 14 | charset-normalizer==3.3.2
 15 | chroma-hnswlib==0.7.5
 16 | chromadb==0.5.4
 17 | click==8.1.7
 18 | coloredlogs==15.0.1
 19 | cryptography==42.0.8
 20 | dataclasses-json==0.6.7
 21 | Deprecated==1.2.14
 22 | dnspython==2.6.1
 23 | email_validator==2.2.0
 24 | fastapi==0.111.1
 25 | fastapi-cli==0.0.4
 26 | fastembed==0.3.2
 27 | filelock==3.15.4
 28 | Flask==3.0.3
 29 | flatbuffers==24.3.25
 30 | frozenlist==1.4.1
 31 | fsspec==2024.6.1
 32 | google-auth==2.32.0
 33 | googleapis-common-protos==1.63.2
 34 | grpcio==1.64.1
 35 | h11==0.14.0
 36 | httpcore==1.0.5
 37 | httptools==0.6.1
 38 | httpx==0.27.0
 39 | huggingface-hub==0.23.5
 40 | humanfriendly==10.0
 41 | idna==3.7
 42 | importlib_metadata==7.1.0
 43 | importlib_resources==6.4.0
 44 | itsdangerous==2.2.0
 45 | Jinja2==3.1.4
 46 | jsonpatch==1.33
 47 | jsonpointer==3.0.0
 48 | kubernetes==30.1.0
 49 | langchain==0.2.9
 50 | langchain-community==0.2.6
 51 | langchain-core==0.2.20
 52 | langchain-text-splitters==0.2.2
 53 | langsmith==0.1.88
 54 | loguru==0.7.2
 55 | markdown-it-py==3.0.0
 56 | MarkupSafe==2.1.5
 57 | marshmallow==3.21.3
 58 | mdurl==0.1.2
 59 | mmh3==4.1.0
 60 | monotonic==1.6
 61 | mpmath==1.3.0
 62 | multidict==6.0.5
 63 | mypy-extensions==1.0.0
 64 | numpy==1.26.4
 65 | oauthlib==3.2.2
 66 | onnx==1.16.1
 67 | onnxruntime==1.18.1
 68 | opentelemetry-api==1.25.0
 69 | opentelemetry-exporter-otlp-proto-common==1.25.0
 70 | opentelemetry-exporter-otlp-proto-grpc==1.25.0
 71 | opentelemetry-instrumentation==0.46b0
 72 | opentelemetry-instrumentation-asgi==0.46b0
 73 | opentelemetry-instrumentation-fastapi==0.46b0
 74 | opentelemetry-proto==1.25.0
 75 | opentelemetry-sdk==1.25.0
 76 | opentelemetry-semantic-conventions==0.46b0
 77 | opentelemetry-util-http==0.46b0
 78 | orjson==3.10.6
 79 | overrides==7.7.0
 80 | packaging==24.1
 81 | pdfminer.six==20231228
 82 | pdfplumber==0.11.2
 83 | pillow==10.4.0
 84 | posthog==3.5.0
 85 | protobuf==4.25.3
 86 | pyasn1==0.6.0
 87 | pyasn1_modules==0.4.0
 88 | pycparser==2.22
 89 | pydantic==2.8.2
 90 | pydantic_core==2.20.1
 91 | Pygments==2.18.0
 92 | pypdfium2==4.30.0
 93 | PyPika==0.48.9
 94 | pyproject_hooks==1.1.0
 95 | PyStemmer==2.2.0.1
 96 | python-dateutil==2.9.0.post0
 97 | python-dotenv==1.0.1
 98 | python-multipart==0.0.9
 99 | PyYAML==6.0.1
100 | requests==2.32.3
101 | requests-oauthlib==2.0.0
102 | rich==13.7.1
103 | rsa==4.9
104 | setuptools==70.3.0
105 | shellingham==1.5.4
106 | six==1.16.0
107 | sniffio==1.3.1
108 | snowballstemmer==2.2.0
109 | SQLAlchemy==2.0.31
110 | starlette==0.37.2
111 | sympy==1.13.0
112 | tenacity==8.5.0
113 | tokenizers==0.19.1
114 | tqdm==4.66.4
115 | typer==0.12.3
116 | typing-inspect==0.9.0
117 | typing_extensions==4.12.2
118 | urllib3==2.2.2
119 | uvicorn==0.30.1
120 | uvloop==0.19.0
121 | watchfiles==0.22.0
122 | websocket-client==1.8.0
123 | websockets==12.0
124 | Werkzeug==3.0.3
125 | wrapt==1.16.0
126 | yarl==1.9.4
127 | zipp==3.19.2
128 | 


--------------------------------------------------------------------------------