└── README.md

/README.md:
--------------------------------------------------------------------------------
# Tree-KG

**Tree-KG** is an expandable framework for constructing and iteratively expanding knowledge graphs (KGs) in knowledge-intensive domains.

## Example Usage

### 1. Submit a Task

```bash
curl -X POST https://pacman.cs.tsinghua.edu.cn/api/treekg/submit_task \
  -F "user_config.json=@path/to/user_config.json" \
  -F "file_dir=@path/to/input_file"
```

Example book:
*《大学物理学(第4版)电磁学、光学、量子物理》* (*University Physics, 4th Edition: Electromagnetism, Optics, and Quantum Physics*)
Written by Zhang Sanhui (张三慧); revised by An Yu (安宇), Ruan Dong (阮东), and Li Yansong (李岩松)
Tsinghua University Press (清华大学出版社)

Example `user_config.json`:

```json
{
  "course_name": "Physics",
  "material_name": "Electromagnetic_Optics_Quantum_Physics.pdf",
  "book_start_page": 1,
  "book_end_page": 507,
  "cover_start_page": 1,
  "cover_end_page": 1,
  "toc_start_page": 2,
  "toc_end_page": 9,
  "text_start_page": 10,
  "text_end_page": 479,
  "appendix_start_page": 480,
  "appendix_end_page": 507,
  "toc_max_level": 3,
  "toc_re_expression": [
    "(第\\d+篇.*\\n\\n)",
    "(第\\d+章.*\\n\\n|.*物理趣闻.*\\n\\n|.*元素周期表.*\\n\\n|.*数值表.*\\n\\n|.*部分习题答案.*\\n\\n|.*索引.*\\n\\n)",
    "(\\*?\\d+\\.\\d+.*\\n\\n|.+?\\n\\n)"
  ]
}
```

#### Field Descriptions

* **course\_name**: name of the course, e.g., `"Physics"`
* **material\_name**: file name of the input material, e.g., `"Electromagnetic_Optics_Quantum_Physics.pdf"`
* **Page ranges** (`book_*`, `cover_*`, `toc_*`, `text_*`, `appendix_*`; TOC = table of contents): start and end pages, counted from the first page of the file, not from the book's printed page numbers.
* **toc\_max\_level**: maximum TOC depth (levels start from 1).
* **toc\_re\_expression**: regular expressions used to match TOC entries. The example gives one pattern per level, from level 1 up to `toc_max_level`.

### 2. Check Task Status

```bash
curl https://pacman.cs.tsinghua.edu.cn/api/treekg/task_status/${task_id}
```

Example response:

```json
{
  "task_id": "uuid-string",
  "status": "processing",
  "progress": 60,
  "current_step": "augmentation",
  "created_at": "2025-05-30T12:00:00",
  "updated_at": "2025-05-30T12:05:00"
}
```

### 3. Retrieve Task Results

```bash
curl -O https://pacman.cs.tsinghua.edu.cn/api/treekg/task_result/${task_id}
```
--------------------------------------------------------------------------------
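
The status check in step 2 can be repeated until the task leaves the `processing` state. The following is a minimal sketch, not part of Tree-KG itself: `TREEKG_API`, `json_field`, and `wait_for_task` are hypothetical names introduced here. It assumes the response is a flat JSON object like the example above, and it stops as soon as `status` is anything other than `processing`, the only value the example shows. If the service reports other in-progress states, the loop condition would need adjusting. The `sed`-based field extraction is deliberately simple; a JSON tool such as `jq` would be more robust if it is available.

```bash
#!/usr/bin/env bash
# Base URL taken from the curl examples above.
TREEKG_API="https://pacman.cs.tsinghua.edu.cn/api/treekg"

# json_field NAME JSON
# Print the value of a top-level string or number field from a flat JSON object.
# Assumption: values contain no escaped quotes, commas, or closing braces.
json_field() {
  printf '%s' "$2" | tr -d '\n' |
    sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\{0,1\}\([^\",}]*\)\"\{0,1\}.*/\1/p"
}

# wait_for_task TASK_ID [INTERVAL_SECONDS]
# Poll the task_status endpoint and print one line per poll.
# Returns once the status is no longer "processing", or non-zero if curl fails.
wait_for_task() {
  local task_id="$1" interval="${2:-10}" response status
  while :; do
    response="$(curl -fsS "${TREEKG_API}/task_status/${task_id}")" || return 1
    status="$(json_field status "$response")"
    echo "status=${status} progress=$(json_field progress "$response") step=$(json_field current_step "$response")"
    [ "$status" = "processing" ] || break
    sleep "$interval"
  done
}
```

After sourcing the script, `wait_for_task "$task_id" 30` polls every 30 seconds. Once it returns, check the last printed status before fetching the result as in step 3.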