├── .gitignore
├── README.md
├── app.py
├── file_crew
│   ├── .gitignore
│   ├── README.md
│   ├── knowledge
│   │   └── user_preference.txt
│   ├── pyproject.toml
│   └── src
│       └── file_crew
│           ├── __init__.py
│           ├── config
│           │   ├── agents.yaml
│           │   └── tasks.yaml
│           ├── crew.py
│           ├── main.py
│           └── tools
│               ├── __init__.py
│               └── custom_tool.py
└── requirements.txt

/.gitignore:
--------------------------------------------------------------------------------
__pycache__
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# PDF Processing with Docling and CrewAI

This is a FastAPI application that uses Docling to parse PDF files and extract their content, then analyzes that content with CrewAI to generate structured reports and insights.

## 5 Steps:
1. Create Frontend with Bolt.new
2. Create FastAPI Project
3. Create Crew
4. Ngrok Static Domain
5. Connect Frontend with FastAPI

## Features

- **Document Parsing**: Uses Docling to extract text content from PDF files
- **AI Analysis**: Processes extracted content with CrewAI to generate structured reports
- **API Interface**: Simple REST API for file upload and processing
- **Markdown Export**: Returns parsed content in markdown format
- **Clean Implementation**: Proper error handling and temporary file management

## Requirements

- Python 3.10+ (the bundled `file_crew` package requires >=3.10,<3.13)
- FastAPI
- Docling
- CrewAI
- Other dependencies as specified in requirements.txt

## Setup

1. Clone the repository:

```bash
git clone <repository-url>
cd <repository-directory>
```

2. Set up a virtual environment (recommended):

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. Install the required dependencies:

```bash
pip install -r requirements.txt
```

4. Environment variables:

The crew defined in `file_crew/src/file_crew/crew.py` uses Google's Gemini (`gemini/gemini-2.0-flash`) by default, so create a `.env` file with the appropriate settings:

```
GOOGLE_API_KEY=your_google_api_key
# Add other environment variables as needed (e.g. OPENAI_API_KEY if you switch the LLM)
```

5. Run the application:

```bash
python app.py
```

The API will be available at `http://<your-local-ip>:5010` (the server binds to the local IP address detected in `app.py` and prints the full URL at startup).
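To confirm the server is running, you can hit the root endpoint, which returns basic API information. A quick sanity check using `requests` (already pinned in `requirements.txt`; substitute the IP printed at startup):

```python
import requests

# The root endpoint returns a welcome message and the list of endpoints.
response = requests.get("http://127.0.0.1:5010/")
print(response.json())
```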
## API Endpoints

### Process PDF File

**Endpoint**: `/file-handler`
**Method**: POST
**Content-Type**: multipart/form-data

**Parameters**:
- `file`: The PDF file to parse and analyze

**Example using curl**:
```bash
curl -X POST "http://localhost:5010/file-handler" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/your/file.pdf"
```

**Example using JavaScript/Fetch**:
```javascript
const formData = new FormData();
formData.append('file', pdfFile); // pdfFile is a File object

fetch('http://localhost:5010/file-handler', {
  method: 'POST',
  body: formData
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
```

**Response** (the keys under `result` mirror the `ResearchOutput` model in `file_crew/src/file_crew/crew.py`):
```json
{
  "filename": "example.pdf",
  "markdown": "# Document Title\n\nContent in markdown format...",
  "result": {
    "summary": "Summary of the document...",
    "key_points": ["Point 1", "Point 2", "..."],
    "quick_summary": "Executive overview...",
    "extended_summary": "Detailed analysis...",
    "actionable_insights": ["Insight 1", "Insight 2", "..."],
    "source_document_list": ["Document title and description..."],
    "potential_biases_and_limitations": "Discussion of biases..."
  }
}
```

## Implementation Details

### How It Works

1. **PDF Processing**:
   - The application receives a PDF file through the `/file-handler` endpoint
   - Docling's DocumentConverter processes the PDF and extracts structured content
   - The content is converted to markdown format for further processing

2. **AI Analysis**:
   - The markdown content is passed to the CrewAI system
   - A researcher agent analyzes the content according to the task definition
   - The agent generates a structured report with key points, summaries, and insights

3. **Result Delivery**:
   - Both the original markdown and the AI-generated analysis are returned to the client
   - The response includes detailed sections as specified in the task configuration
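The core of this flow, condensed from `app.py`, fits in a few lines. A minimal sketch (run from the repository root, as `app.py` is, and assuming a local PDF path `example.pdf` instead of an uploaded file):

```python
from docling.document_converter import DocumentConverter
from file_crew.src.file_crew.main import run

# Step 1: Docling parses the PDF into a structured document.
converter = DocumentConverter()
docling_result = converter.convert("example.pdf")

# Step 2: export the parsed document to markdown and hand it to the crew.
markdown_content = docling_result.document.export_to_markdown()
crew_result = run(markdown_content)

# Step 3: the crew returns a ResearchOutput pydantic model.
print(crew_result.model_dump())
```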
### CrewAI Task Configuration

The CrewAI system is configured to analyze documents with a specific focus. The task configuration is defined in `file_crew/src/file_crew/config/tasks.yaml` and includes:

- Extracting key points from the document
- Creating executive summaries
- Providing detailed analysis
- Suggesting actionable insights
- Identifying potential biases in the source material

## Testing

You can use the included test script to try out the API:

```bash
python test_pdf_parser.py /path/to/your/file.pdf
```

This will:
1. Upload the specified PDF to the API
2. Process the file through both Docling and CrewAI
3. Display a summary of the results in the terminal
4. Save the full results to a JSON file

## Project Structure

```
.
├── app.py                 # FastAPI application
├── requirements.txt       # Project dependencies
├── test_pdf_parser.py     # Test script for the API
├── file_crew/
│   └── src/
│       └── file_crew/
│           ├── main.py    # CrewAI integration
│           └── config/
│               └── tasks.yaml  # Task definitions for the AI analysis
```

## Troubleshooting

- **PDF Parsing Issues**: Make sure the uploaded PDF is not corrupted and doesn't have DRM protection
- **CrewAI Errors**: Check that any required environment variables (e.g. `GOOGLE_API_KEY`) are set correctly
- **API Connection Issues**: Verify that the API is running and accessible from your client

## Acknowledgements

- [Docling](https://github.com/docling-project/docling) for PDF parsing
- [CrewAI](https://github.com/joaomdmoura/crewAI) for AI agent orchestration
--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
from fastapi import FastAPI, HTTPException, File, UploadFile
from fastapi.middleware.cors import CORSMiddleware
import socket
import json
from typing import Dict, Any
import uvicorn
import tempfile
import os
from docling.document_converter import DocumentConverter
from file_crew.src.file_crew.main import run

def get_local_ip() -> str:
    """
    Get the local IP address of the machine.

    Returns:
        str: The local IP address, or '127.0.0.1' if it can't be determined.
    """
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.connect(('8.8.8.8', 80))
        ip = s.getsockname()[0]
        s.close()
        return ip
    except Exception:
        return '127.0.0.1'

app = FastAPI(
    title="CrewAI API",
    description="API for processing documents with Docling and CrewAI",
    version="1.0.0"
)

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/file-handler", response_model=Dict[str, Any])
async def file_handler_endpoint(
    file: UploadFile = File(...)
):
    """
    Process a PDF file using Docling and CrewAI.

    Args:
        file (UploadFile): The PDF file to process

    Returns:
        Dict[str, Any]: The processed results including markdown content and CrewAI analysis

    Raises:
        HTTPException: If there's an error processing the file
    """
    temp_file_path = None

    try:
        print(f"Received file: {file.filename} (type: {file.content_type}, size: {file.size})")

        # Validate file type (content_type may be None, so guard against that too)
        if not (file.content_type and file.content_type.startswith('application/pdf')):
            raise HTTPException(
                status_code=400,
                detail=f"Invalid file type: {file.content_type}. Please upload a PDF file."
            )

        # Create a temporary file to save the uploaded PDF
        with tempfile.NamedTemporaryFile(delete=False, suffix='.pdf') as temp_file:
            # Write the uploaded file content to the temporary file
            content = await file.read()
            temp_file.write(content)
            temp_file_path = temp_file.name

        # Process the file
        try:
            # Initialize the DocumentConverter
            converter = DocumentConverter()

            # Convert the PDF document
            docling_result = converter.convert(temp_file_path)

            # Get the parsed content in markdown format
            markdown_content = docling_result.document.export_to_markdown()

            # Process with CrewAI
            crew_result = run(markdown_content)

            # Return the results
            return {
                "filename": file.filename,
                "markdown": markdown_content,
                "result": crew_result.model_dump()
            }

        finally:
            # Always clean up the temporary file
            if temp_file_path and os.path.exists(temp_file_path):
                os.unlink(temp_file_path)

    except HTTPException as e:
        # Re-raise HTTP exceptions
        raise e
    except json.JSONDecodeError as e:
        print(f"JSON decode error: {str(e)}")
        raise HTTPException(status_code=400, detail=f"Invalid JSON payload: {str(e)}")
    except Exception as e:
        print(f"Error in file_handler: {str(e)}")
        import traceback
        traceback.print_exc()
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/")
async def root():
    """Root endpoint that returns basic API information."""
    return {
        "message": "Welcome to the CrewAI Document Processing API",
        "endpoints": {
            "/file-handler": "POST endpoint for processing PDF files"
        }
    }

if __name__ == "__main__":
    host_ip = get_local_ip()
    print(f"Starting server at http://{host_ip}:5010")
    uvicorn.run("app:app", host=host_ip, port=5010, reload=True)
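For a Python client, the same upload can be done with `requests`. A minimal sketch (assuming the server runs on localhost and `sample.pdf` is a placeholder path):

```python
import requests

# POST a PDF to /file-handler as multipart/form-data; the field name
# must be "file" to match the endpoint's UploadFile parameter.
with open("sample.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:5010/file-handler",
        files={"file": ("sample.pdf", f, "application/pdf")},
    )

response.raise_for_status()
data = response.json()
print(data["result"]["quick_summary"])
```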
--------------------------------------------------------------------------------
/file_crew/.gitignore:
--------------------------------------------------------------------------------
.env
__pycache__/
.DS_Store
--------------------------------------------------------------------------------
/file_crew/README.md:
--------------------------------------------------------------------------------
# FileCrew Crew

Welcome to the FileCrew Crew project, powered by [crewAI](https://crewai.com). This template is designed to help you set up a multi-agent AI system with ease, leveraging the powerful and flexible framework provided by crewAI. Our goal is to enable your agents to collaborate effectively on complex tasks, maximizing their collective intelligence and capabilities.

## Installation

Ensure you have Python >=3.10 <3.13 installed on your system. This project uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.

First, if you haven't already, install uv:

```bash
pip install uv
```

Next, navigate to your project directory, then lock and install the dependencies with the crewAI CLI:

```bash
crewai install
```

### Customizing

**Add your `GOOGLE_API_KEY` into the `.env` file** (the default crew in `crew.py` uses a Gemini model; use `OPENAI_API_KEY` instead if you switch to an OpenAI model)

- Modify `src/file_crew/config/agents.yaml` to define your agents
- Modify `src/file_crew/config/tasks.yaml` to define your tasks
- Modify `src/file_crew/crew.py` to add your own logic, tools and specific args
- Modify `src/file_crew/main.py` to add custom inputs for your agents and tasks (see the sketch below)
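Each key passed to `kickoff` is interpolated into the matching `{placeholder}` in `agents.yaml` and `tasks.yaml`. A minimal sketch of supplying custom inputs (assuming the package has been installed with `crewai install`, and using stand-in markdown content):

```python
from file_crew.crew import FileCrew

# Each inputs key replaces the matching {placeholder} in the agent
# and task definitions before the crew runs ({file_text} here).
inputs = {"file_text": "# Sample Document\n\nSome markdown content to analyze."}

result = FileCrew().crew().kickoff(inputs=inputs)

# output_pydantic=ResearchOutput gives typed access to the report.
print(result.pydantic.key_points)
```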
## Running the Project

To kickstart your crew of AI agents and begin task execution, run this from the root folder of your project:

```bash
$ crewai run
```

This command initializes the file_crew Crew, assembling the agents and assigning them tasks as defined in your configuration.

Unmodified, this example will create a `report.md` file in the root folder with the output of a research run on LLMs.

## Understanding Your Crew

The file_crew Crew is composed of multiple AI agents, each with unique roles, goals, and tools. These agents collaborate on a series of tasks, defined in `config/tasks.yaml`, leveraging their collective skills to achieve complex objectives. The `config/agents.yaml` file outlines the capabilities and configurations of each agent in your crew.

## Support

For support, questions, or feedback regarding the FileCrew Crew or crewAI:
- Visit our [documentation](https://docs.crewai.com)
- Reach out to us through our [GitHub repository](https://github.com/joaomdmoura/crewai)
- [Join our Discord](https://discord.com/invite/X4JWnZnxPb)
- [Chat with our docs](https://chatg.pt/DWjSBZn)

Let's create wonders together with the power and simplicity of crewAI.
--------------------------------------------------------------------------------
/file_crew/knowledge/user_preference.txt:
--------------------------------------------------------------------------------
User name is John Doe.
User is an AI Engineer.
User is interested in AI Agents.
User is based in San Francisco, California.
--------------------------------------------------------------------------------
/file_crew/pyproject.toml:
--------------------------------------------------------------------------------
[project]
name = "file_crew"
version = "0.1.0"
description = "file_crew using crewAI"
authors = [{ name = "Your Name", email = "you@example.com" }]
requires-python = ">=3.10,<3.13"
dependencies = [
    "crewai[tools]>=0.105.0,<1.0.0"
]

[project.scripts]
file_crew = "file_crew.main:run"
run_crew = "file_crew.main:run"
train = "file_crew.main:train"
replay = "file_crew.main:replay"
test = "file_crew.main:test"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.crewai]
type = "crew"
--------------------------------------------------------------------------------
/file_crew/src/file_crew/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tylerprogramming/crewai-frontend/b9035966a2a460c05104404122f7e03d0faf9ef5/file_crew/src/file_crew/__init__.py
--------------------------------------------------------------------------------
/file_crew/src/file_crew/config/agents.yaml:
--------------------------------------------------------------------------------
researcher:
  role: >
    Markdown Miner & Insight Extractor
  goal: >
    To efficiently and accurately extract critical information from markdown documents, identify key themes and relationships within the text, and provide the user with a concise and actionable summary of the document's contents.
    The agent prioritizes accuracy and thoroughness, ensuring no vital detail is overlooked. {file_text}
  backstory: >
    Created to navigate the growing landscape of markdown-based knowledge, this agent is engineered for precision extraction and insightful synthesis.
    Its core strength lies in its ability to deeply understand markdown structures, identify crucial data points, and distill them into actionable summaries.
    It acts as a dedicated research assistant, transforming raw markdown data into readily accessible and understandable information.
--------------------------------------------------------------------------------
/file_crew/src/file_crew/config/tasks.yaml:
--------------------------------------------------------------------------------
research_task:
  description: |
    Parse and analyze the markdown document {file_text} to identify key developments and relevant information.
    Focus on extracting factual statements, key findings, and notable trends within the document.
  expected_output: |
    A structured report containing the following sections, derived from the analyzed markdown document:

    1. Key Points (Concise Summary):
    - A list of 5-7 bullet points summarizing the most critical and impactful findings in the document.
    - Each bullet point should be factually accurate and referenced to its source markdown document.
    - Aim for clarity and conciseness, capturing the essence of the information in a single sentence.

    2. Quick Summary (Executive Overview):
    - A brief paragraph (approximately 50-75 words) providing a high-level overview of the current state of the document's subject matter.
    - Highlight the most significant recent developments and their potential implications.
    - Avoid jargon and technical terms, making it accessible to a general audience.

    3. Extended Summary (Detailed Analysis):
    - A more in-depth summary (approximately 200-300 words) expanding on the Key Points and Quick Summary.
    - Provide more context for the key findings, explaining their significance and potential impact.
    - Include details on any conflicting viewpoints or ongoing debates within the analyzed documents.
    - Maintain clear references to the source markdown documents for each key point.

    4. Actionable Insights (What We Should Take From This):
    - A section outlining 3-5 specific actions or recommendations that can be derived from the research.
    - These should be practical and based on the identified trends, findings, and potential implications.
    - Examples:
      - "Further investigate the impact of {specific technology} on {specific application}"
      - "Develop a strategy to address the emerging challenges related to {specific problem}"
      - "Monitor the progress of {specific research area} due to its potential impact on {specific field}"
    - For each actionable insight, explain the rationale behind the recommendation and its potential benefits.

    5. Source Document List:
    - A comprehensive list of all markdown files analyzed, including their titles and brief descriptions.

    6. Potential Biases and Limitations:
    - A brief paragraph acknowledging any potential biases or limitations inherent in the analyzed markdown data.
    - This could include:
      - Overrepresentation of certain viewpoints
      - Lack of coverage in specific areas
      - Potential for outdated information
    - Acknowledge if the markdown files come from a specific source and therefore might have a bias
  agent: researcher
--------------------------------------------------------------------------------
/file_crew/src/file_crew/crew.py:
--------------------------------------------------------------------------------
from crewai import Agent, Crew, Process, Task, LLM
from crewai.project import CrewBase, agent, crew, task
from dotenv import load_dotenv
import os
from pydantic import BaseModel
from typing import List

load_dotenv()

google_flash_llm = LLM(
    model="gemini/gemini-2.0-flash",
    api_key=os.getenv("GOOGLE_API_KEY"),
)

class ResearchOutput(BaseModel):
    summary: str
    key_points: List[str]
    quick_summary: str
    extended_summary: str
    actionable_insights: List[str]
    source_document_list: List[str]
    potential_biases_and_limitations: str

@CrewBase
class FileCrew():

    agents_config = 'config/agents.yaml'
    tasks_config = 'config/tasks.yaml'

    @agent
    def researcher(self) -> Agent:
        return Agent(
            config=self.agents_config['researcher'],
            verbose=True,
            llm=google_flash_llm
        )

    @task
    def research_task(self) -> Task:
        return Task(
            config=self.tasks_config['research_task'],
            output_pydantic=ResearchOutput
        )

    @crew
    def crew(self) -> Crew:
        """Creates the FileCrew crew"""

        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,
            verbose=True,
        )
--------------------------------------------------------------------------------
/file_crew/src/file_crew/main.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
import sys
import warnings

from datetime import datetime

from .crew import FileCrew

warnings.filterwarnings("ignore", category=SyntaxWarning, module="pysbd")

# This main file is intended to be a way for you to run your
# crew locally, so refrain from adding unnecessary logic into this file.
# Replace the inputs below with the ones you want to test with; they are
# automatically interpolated into the task and agent definitions.

def run(file_text):
    """
    Run the crew.
    """
    inputs = {
        'file_text': file_text,
    }

    try:
        result = FileCrew().crew().kickoff(inputs=inputs)
        return result.pydantic
    except Exception as e:
        raise Exception(f"An error occurred while running the crew: {e}")


def train():
    """
    Train the crew for a given number of iterations.
    """
    inputs = {
        # Stand-in content for the {file_text} placeholder used by the configs
        "file_text": "Sample markdown content about AI LLMs."
    }
    try:
        FileCrew().crew().train(n_iterations=int(sys.argv[1]), filename=sys.argv[2], inputs=inputs)

    except Exception as e:
        raise Exception(f"An error occurred while training the crew: {e}")

def replay():
    """
    Replay the crew execution from a specific task.
    """
    try:
        FileCrew().crew().replay(task_id=sys.argv[1])

    except Exception as e:
        raise Exception(f"An error occurred while replaying the crew: {e}")

def test():
    """
    Test the crew execution and return the results.
    """
    inputs = {
        # Stand-in content for the {file_text} placeholder used by the configs
        "file_text": "Sample markdown content about AI LLMs.",
        "current_year": str(datetime.now().year)
    }
    try:
        FileCrew().crew().test(n_iterations=int(sys.argv[1]), openai_model_name=sys.argv[2], inputs=inputs)

    except Exception as e:
        raise Exception(f"An error occurred while testing the crew: {e}")
--------------------------------------------------------------------------------
/file_crew/src/file_crew/tools/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tylerprogramming/crewai-frontend/b9035966a2a460c05104404122f7e03d0faf9ef5/file_crew/src/file_crew/tools/__init__.py
--------------------------------------------------------------------------------
/file_crew/src/file_crew/tools/custom_tool.py:
--------------------------------------------------------------------------------
from crewai.tools import BaseTool
from typing import Type
from pydantic import BaseModel, Field


class MyCustomToolInput(BaseModel):
    """Input schema for MyCustomTool."""
    argument: str = Field(..., description="Description of the argument.")

class MyCustomTool(BaseTool):
    name: str = "Name of my tool"
    description: str = (
        "Clear description for what this tool is useful for, your agent will need this information to use it."
    )
    args_schema: Type[BaseModel] = MyCustomToolInput

    def _run(self, argument: str) -> str:
        # Implementation goes here
        return "this is an example of a tool output, ignore it and move along."
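To make a tool like this available to an agent, pass an instance via the `tools` argument. A sketch of how it could be wired up in `crew.py` (the template tool is not attached to FileCrew by default, and the role/goal/backstory strings here are abbreviated stand-ins for the ones in `agents.yaml`):

```python
from crewai import Agent
from file_crew.tools.custom_tool import MyCustomTool

# Tools listed here become callable by the agent while it works on a task.
researcher = Agent(
    role="Markdown Miner & Insight Extractor",
    goal="Extract critical information from markdown documents.",
    backstory="A dedicated research assistant for markdown data.",
    tools=[MyCustomTool()],
    verbose=True,
)
```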
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
agentops==0.3.26
agentstack==0.3.5
aiofiles==24.1.0
aiohappyeyeballs==2.4.4
aiohttp==3.11.10
aiohttp-retry==2.8.3
aiosignal==1.3.2
aiosqlite==0.21.0
alembic==1.14.0
altair==5.5.0
annotated-types==0.7.0
anyio==4.7.0
appdirs==1.4.4
arrow==1.3.0
art==6.4
asgiref==3.8.1
asn1crypto==1.5.1
astor==0.8.1
asttokens==3.0.0
atpublic==5.0
attrs==24.2.0
auth0-python==4.7.2
Authlib==1.3.1
autoflake==2.3.1
babel==2.17.0
backoff==2.2.1
backrefs==5.8
bcrypt==4.2.1
beautifulsoup4==4.12.3
binaryornot==0.4.4
blessed==1.20.0
blinker==1.9.0
build==1.2.2.post1
cached-property==2.0.1
cachetools==5.5.0
certifi==2024.12.14
cffi==1.17.1
chardet==5.2.0
charset-normalizer==3.4.0
chroma-hnswlib==0.7.6
chromadb==0.5.23
click==8.1.8
cohere==5.13.3
colorama==0.4.6
coloredlogs==15.0.1
composio==0.1.1
composio_core==0.7.2
composio_crewai==0.7.2
composio_langchain==0.7.2
cookiecutter==2.6.0
Crawl4AI==0.4.248
crewai==0.105.0
crewai-tools==0.37.0
cryptography==43.0.3
css-inline==0.14.3
cssselect==1.2.0
dataclasses-json==0.6.7
dateparser==1.2.0
decorator==5.1.1
deepsearch-glm==1.0.0
Deprecated==1.2.15
deprecation==2.1.0
dill==0.3.9
dirtyjson==1.0.8
distro==1.9.0
docker==7.1.0
docling==2.15.1
docling-core==2.14.0
docling-ibm-models==3.1.2
docling-parse==3.0.0
docstring_parser==0.16
docx2txt==0.8
durationpy==0.9
easyocr==1.7.2
editor==1.6.6
elevenlabs==1.53.0
embedchain==0.1.125
et_xmlfile==2.0.0
exa-py==1.8.8
executing==2.1.0
fake-http-header==0.3.5
fake-useragent==2.0.3
fastapi==0.115.11
fastavro==1.9.7
fastembed==0.5.1
feedparser==6.0.11
ffmpeg==1.4
ffmpy==0.5.0
filelock==3.16.1
filetype==1.2.0
firecrawl-py==1.6.4
Flask==3.1.0
Flask-Cors==5.0.0
flatbuffers==24.3.25
frozenlist==1.5.0
fsspec==2024.10.0
ghp-import==2.1.0
gitdb==4.0.12
GitPython==3.1.44
google-ai-generativelanguage==0.6.15
google-api-core==2.24.0
google-api-python-client==2.154.0
google-auth==2.37.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.1
google-cloud-aiplatform==1.74.0
google-cloud-bigquery==3.27.0
google-cloud-core==2.4.1
google-cloud-resource-manager==1.14.0
google-cloud-storage==2.19.0
google-crc32c==1.6.0
google-genai==1.0.0
google-generativeai==0.8.4
google-resumable-media==2.7.2
google_news_feed==1.1.0
googleapis-common-protos==1.66.0
gotrue==2.11.0
gptcache==0.1.44
gradio==5.13.1
gradio_client==1.6.0
greenlet==3.1.1
griffe==1.6.0
grpc-google-iam-v1==0.13.1
grpcio==1.69.0
grpcio-health-checking==1.69.0
grpcio-status==1.68.1
grpcio-tools==1.68.1
h11==0.14.0
h2==4.1.0
hpack==4.0.0
httpcore==1.0.7
httplib2==0.22.0
httptools==0.6.4
httpx==0.27.2
httpx-sse==0.4.0
huggingface-hub==0.26.5
humanfriendly==10.0
hyperframe==6.0.1
idna==3.10
ijson==3.3.0
imageio==2.36.1
imageio-ffmpeg==0.6.0
importlib_metadata==8.5.0
importlib_resources==6.4.5
inflection==0.5.1
iniconfig==2.0.0
inquirer==3.4.0
instructor==1.7.0
ipython==8.30.0
itsdangerous==2.2.0
jedi==0.19.2
Jinja2==3.1.4
jiter==0.6.1
joblib==1.4.2
json5==0.10.0
json_repair==0.31.0
jsonlines==3.1.0
jsonpatch==1.33
jsonpickle==4.0.1
jsonpointer==3.0.0
jsonref==1.1.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
kubernetes==31.0.0
lancedb==0.17.0
langchain==0.3.12
langchain-cohere==0.3.3
langchain-community==0.3.12
langchain-core==0.3.25
langchain-experimental==0.3.3
langchain-openai==0.2.12
langchain-text-splitters==0.3.3
langchainhub==0.1.21
langsmith==0.1.147
lazy_loader==0.4
linkup-sdk==0.2.1
litellm==1.60.2
llama-cloud==0.1.12
llama-cloud-services==0.6.1
llama-index-core==0.12.17
llama-index-readers-file==0.4.5
llvmlite==0.43.0
loguru==0.7.3
lxml==5.3.0
Mako==1.3.8
Markdown==3.7
markdown-it-py==3.0.0
marko==2.1.2
MarkupSafe==2.1.5
marshmallow==3.23.1
matplotlib-inline==0.1.7
mdurl==0.1.2
mem0ai==0.1.35
mergedeep==1.3.4
mkdocs==1.6.1
mkdocs-autorefs==1.4.1
mkdocs-get-deps==0.2.0
mkdocs-material==9.6.7
mkdocs-material-extensions==1.3.1
mkdocstrings==0.28.3
mkdocstrings-python==1.16.3
mmh3==4.1.0
monotonic==1.6
more-itertools==10.5.0
moviepy==2.1.2
mpire==2.10.2
mpmath==1.3.0
multidict==6.1.0
multiprocess==0.70.17
mypy-extensions==1.0.0
narwhals==1.22.0
nest-asyncio==1.6.0
networkx==3.4.2
ninja==1.11.1.3
nltk==3.9.1
nodeenv==1.9.1
notion-client==2.3.0
numba==0.60.0
numpy==1.26.4
oauthlib==3.2.2
onnxruntime==1.20.1
openai==1.66.2
openai-agents==0.0.3
openai-whisper==20240930
opencv-python-headless==4.10.0.84
openpyxl==3.1.5
opentelemetry-api==1.29.0
opentelemetry-exporter-otlp-proto-common==1.29.0
opentelemetry-exporter-otlp-proto-grpc==1.29.0
opentelemetry-exporter-otlp-proto-http==1.29.0
opentelemetry-instrumentation==0.50b0
opentelemetry-instrumentation-asgi==0.50b0
opentelemetry-instrumentation-fastapi==0.50b0
opentelemetry-proto==1.29.0
opentelemetry-sdk==1.29.0
opentelemetry-semantic-conventions==0.50b0
opentelemetry-util-http==0.50b0
orjson==3.10.12
outcome==1.3.0.post0
overrides==7.7.0
packaging==23.2
paginate==0.5.7
pandas==2.2.3
parameterized==0.9.0
paramiko==3.5.1
parso==0.8.4
pathspec==0.12.1
patronus==0.0.17
pdfminer.six==20231228
pdfplumber==0.11.4
pexpect==4.9.0
pillow==10.4.0
platformdirs==4.3.6
playwright==1.49.1
plotly==6.0.0
pluggy==1.5.0
portalocker==2.10.1
portkey-ai==1.10.2
postgrest==0.18.0
posthog==3.7.4
proglog==0.1.10
prompt_toolkit==3.0.48
propcache==0.2.1
proto-plus==1.25.0
protobuf==5.29.1
psutil==5.9.8
ptyprocess==0.7.0
pure_eval==0.2.3
py_rust_stemmers==0.1.5
pyairtable==3.0.0
pyarrow==18.1.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
pyclipper==1.3.0.post6
pycparser==2.22
pydantic==2.10.3
pydantic-settings==2.7.0
pydantic_core==2.27.1
pydeck==0.9.1
pydub==0.25.1
pyee==12.0.0
pyflakes==3.2.0
pygame==2.6.1
Pygments==2.18.0
PyJWT==2.10.1
pylance==0.20.0
pymdown-extensions==10.14.3
PyNaCl==1.5.0
pyOpenSSL==24.3.0
pyparsing==3.2.0
pypdf==5.1.0
pypdfium2==4.30.0
pyperclip==1.9.0
PyPika==0.48.9
pyproject_hooks==1.2.0
pyright==1.1.390
pysbd==0.3.4
Pysher==1.0.8
PySocks==1.7.1
pytest==8.3.4
python-bidi==0.6.3
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-dotenv==1.0.1
python-multipart==0.0.20
python-pptx==1.0.2
python-slugify==8.0.4
python-telegram-bot==21.10
pytube==15.0.0
pytz==2024.2
pyvis==0.3.2
PyYAML==6.0.2
pyyaml_env_tag==0.1
qdrant-client==1.12.1
rank-bm25==0.2.2
readchar==4.2.1
realtime==2.0.6
referencing==0.35.1
regex==2024.11.6
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
rich==13.9.4
rpds-py==0.22.3
rsa==4.9
Rtree==1.3.0
ruamel.yaml==0.18.6
ruamel.yaml.base==0.3.2
ruamel.yaml.clib==0.2.12
ruff==0.9.3
runs==1.2.2
safehttpx==0.1.6
safetensors==0.5.2
schedule==1.2.2
schema==0.7.7
scikit-image==0.25.0
scipy==1.15.1
scrapegraph_py==1.8.0
selenium==4.27.1
semantic-version==2.10.0
semchunk==2.2.2
semver==3.0.4
sentry-sdk==2.21.0
serpapi==0.1.5
sgmllib3k==1.0.0
shapely==2.0.6
shellingham==1.5.4
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
snowballstemmer==2.2.0
snowflake==1.0.2
snowflake-connector-python==3.12.4
snowflake._legacy==1.0.0
snowflake.core==1.0.2
sortedcontainers==2.4.0
soupsieve==2.6
spider-client==0.1.25
SQLAlchemy==2.0.36
stack-data==0.6.3
starlette==0.41.3
storage3==0.9.0
streamlit==1.41.1
striprtf==0.0.26
supabase==2.10.0
supafunc==0.7.0
sympy==1.13.3
tabulate==0.9.0
tenacity==9.0.0
termcolor==2.4.0
text-unidecode==1.3
tf-playwright-stealth==1.1.1
tifffile==2025.1.10
tiktoken==0.7.0
tokenizers==0.19.1
toml==0.10.2
tomli==2.2.1
tomli_w==1.1.0
tomlkit==0.13.2
torch==2.2.2
torchvision==0.17.2
tornado==6.4.2
tqdm==4.67.1
traitlets==5.14.3
transformers==4.42.4
trio==0.27.0
trio-websocket==0.11.1
twilio==9.4.3
typer==0.12.5
types-python-dateutil==2.9.0.20241206
types-requests==2.32.0.20241016
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.2
tzlocal==5.2
uritemplate==4.1.1
urllib3==2.2.3
uv==0.5.9
uvicorn==0.34.0
uvloop==0.21.0
validators==0.34.0
watchdog==6.0.0
watchfiles==1.0.3
wcwidth==0.2.13
weaviate-client==4.10.2
websocket-client==1.8.0
websockets==13.1
Werkzeug==3.1.3
whisper==1.1.10
wrapt==1.17.0
wsproto==1.2.0
XlsxWriter==3.2.0
xmod==1.8.1
xxhash==3.5.0
yarl==1.18.3
yt-dlp==2024.11.18
zipp==3.21.0
--------------------------------------------------------------------------------