├── README.md
├── booky.py
└── booky.sh

/README.md:
--------------------------------------------------------------------------------
# booky

This script creates PDF bookmarks from a simple text file. The tool `pdftk` can already do this; in fact,
this script uses it internally. But `pdftk` requires a format that is too tedious to write by hand, so this
script lets you enter the bookmark data in a simpler format.

## Dependencies
* bash
* python3
* pdftk
* dirname
* basename
* GNU sed (OSX users take note: you may have BSD sed. Install `gsed` instead)

## Bookmark format
* Every level starts with a `{` on a _separate line_ and ends with a `}` on a separate line.
* Each bookmark is a title and a page number separated by a comma.
* The title and page number must be on the same line. The last comma on the line is the separator, so a title may itself contain commas.
* All these are equivalent (i.e. the script ignores whitespace around the title and the page number).
```
title1, 1
title1,     1
title1    ,    1
```

### Example
```
{
Title1, 1
Title2, 2
    {
    Subtitle1, 3
    Subtitle2, 4
        {
        SubSubtitle1, 5
        ...
        }
    }
}
```

## How To Use it?
* First clone this repository and change your directory.
  Execute this in a terminal:

```
git clone https://github.com/SiddharthPant/booky.git
cd booky
```
* Now copy your pdf file to this directory.
* Create a new text file and write your bookmarks in the given format.
* Now your directory should contain 4 files: `booky.sh`, `booky.py`, `your_pdf_file.pdf`, `your_text_file.txt`
* Run the following command in the terminal:

```
./booky.sh your_pdf_file.pdf your_text_file.txt
```

If you add the `booky` directory to the `PATH` environment variable like:
```
export PATH=/path_to_the_booky:$PATH
```
then it can run from any directory:
```
booky.sh your_pdf_file.pdf your_text_file.txt
```

This creates a new pdf file `your_pdf_file_new.pdf` with your bookmarks.

This works on POSIX systems. If you are on a Windows machine instead, first install `python3` and `pdftk`, then use the `booky.py` file in the repo to convert `bkmrks.txt` to the `pdftk` compatible format:

    python3 booky.py < bkmrks.txt > output.txt

Then use the `dump_data` command to generate a dumped data file.

```
pdftk C:\Users\Sid\Desktop\doc.pdf dump_data output C:\Users\Sid\Desktop\doc_data.txt
```
Remove the previous bookmarks from that file and insert the content of `output.txt` in their place with a simple copy & paste.
Then import that data back.

```
pdftk C:\Users\Sid\Desktop\doc.pdf update_info C:\Users\Sid\Desktop\doc_data.txt output C:\Users\Sid\Desktop\updated.pdf
```
If this does not update your bookmarks, check that your `pdftk` version is greater than 1.45.

--------------------------------------------------------------------------------
/booky.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
"""Convert booky's bookmark format (read from stdin) to pdftk's bookmark format."""
import sys

level = 0  # current nesting depth; each "{" opens a level, each "}" closes one
startChar = "{"
endChar = "}"
for line in sys.stdin:
    line = line.strip()
    if line == startChar:
        level = level + 1
    elif line == endChar:
        level = level - 1
    elif line:
        # Split on the last comma so that titles may contain commas.
        commaIndex = line.rfind(',')
        if commaIndex == -1:
            # No comma means no page number: warn and skip the line.
            print("Skipping line without a page number:", line, file=sys.stderr)
            continue
        title = line[:commaIndex]
        pageNo = line[commaIndex + 1:]
        print("BookmarkBegin")
        print("BookmarkTitle:", title.strip())
        print("BookmarkLevel:", level)
        print("BookmarkPageNumber:", pageNo.strip())

--------------------------------------------------------------------------------
/booky.sh:
--------------------------------------------------------------------------------
#!/bin/bash

# Remember where this script lives so that booky.py can be found later
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
# Change to the directory of pdf file.
# Note: a relative bookmark file path ($2) is resolved from that directory afterwards.
cd "$(dirname "$1")" || exit 1
pdf=$(basename "$1")
pdf_data="${pdf%.*}_data.txt"
EXTRACT_FILE=booky_bookmarks_extract
bkFile="$2"


# On macOS, use gsed (GNU sed) instead of the default BSD sed
if [[ "$OSTYPE" == "darwin"* ]]; then
    SED=gsed
else
    SED=sed
fi

echo "Converting $bkFile to pdftk compatible format"
python3 "$SCRIPT_DIR/booky.py" < "$bkFile" > "$EXTRACT_FILE"

echo "Dumping pdf meta data..."
pdftk "$pdf" dump_data_utf8 output "$pdf_data"

echo "Clear dumped data of any previous bookmarks"
$SED -i '/Bookmark/d' "$pdf_data"

echo "Inserting your bookmarks in the data"
$SED -i "/NumberOfPages/r $EXTRACT_FILE" "$pdf_data"

echo "Creating new pdf with your bookmarks..."
pdftk "$pdf" update_info_utf8 "$pdf_data" output "${pdf%.*}_new.pdf"

echo "Deleting leftovers"
rm "$EXTRACT_FILE" "$pdf_data"

--------------------------------------------------------------------------------
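
The two `sed` steps in `booky.sh` (delete every line containing `Bookmark`, then insert the new bookmarks after the `NumberOfPages` line) can also be sketched in Python, which may help on systems without GNU sed. This is only a minimal sketch and not part of the repository: the function name `merge_bookmarks` and the sample lines are hypothetical, and it assumes the dumped data contains a `NumberOfPages` line.

```python
def merge_bookmarks(dump_lines, bookmark_lines):
    """Drop old bookmark lines and insert the new ones after NumberOfPages.

    Hypothetical helper mirroring the two sed commands in booky.sh.
    """
    merged = []
    for line in dump_lines:
        # Same effect as: sed '/Bookmark/d'
        if "Bookmark" in line:
            continue
        merged.append(line)
        # Same effect as: sed '/NumberOfPages/r EXTRACT_FILE'
        if "NumberOfPages" in line:
            merged.extend(bookmark_lines)
    return merged


# Made-up sample data for illustration only.
sample_dump = [
    "InfoBegin",
    "NumberOfPages: 10",
    "BookmarkBegin",
    "BookmarkTitle: Old title",
    "BookmarkLevel: 1",
    "BookmarkPageNumber: 1",
    "PageMediaBegin",
]
sample_bookmarks = [
    "BookmarkBegin",
    "BookmarkTitle: Title1",
    "BookmarkLevel: 1",
    "BookmarkPageNumber: 1",
]

for merged_line in merge_bookmarks(sample_dump, sample_bookmarks):
    print(merged_line)
```

In practice the two lists would come from reading the dumped data file and the output of `booky.py`, and the merged lines would be written back before running `pdftk ... update_info`.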