26 |
27 | There are many ways to search for a book: by title, author, ISBN, or any other relevant metadata. These all work, but the search process grows longer as more and more fields are added to the search system.
28 |
29 | This project takes a different approach: it uses deep learning to build a system that lets an end user take a picture of a book's cover and search for that book in the database.
30 |
31 | ### Built With
32 | * [TensorFlow](https://www.tensorflow.org/)
33 | * [Flask](https://www.palletsprojects.com/p/flask/)
34 |
35 |
36 |
37 | ## Getting Started
38 |
39 | To get a local copy up and running, follow these steps.
40 |
41 | ### Prerequisites
42 |
43 | To run this project you'll need **Python 3.5 or later** and all dependencies listed in the **requirements.txt**.
44 |
45 | To install all dependencies listed in the requirements file:
46 |
47 | ```sh
48 | pip install -r requirements.txt
49 | ```
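After installing, it can help to confirm that the modules the scripts import are actually available. The following is a minimal sketch, not part of the repository: the module list is taken from the imports and names used in `offline.py` and `server.py`, and the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names used by offline.py and server.py.
# Import names can differ from the package names in requirements.txt.
REQUIRED_MODULES = [
    "numpy", "pandas", "tqdm", "nmslib",
    "tensorflow", "tensorflow_hub", "flask", "werkzeug",
]


def missing_modules(names):
    """Return the subset of `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

An empty result only shows that the modules can be found; it does not check that their versions are compatible with the scripts.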
50 |
51 | ### Installation
52 |
53 | 1. Clone the repo
54 | ```sh
55 | git clone https://github.com/HANyangguang/ECBIR.git
56 | ```
57 | 2. Create the **dataset** folder and the other required folders inside the **static** folder
58 | ```sh
59 | mkdir static/dataset
60 | mkdir static/feature
61 | mkdir static/resized
62 | mkdir static/uploads
63 | ```
64 | 3. Download the book covers dataset from Kaggle and unpack it into the **dataset** folder
65 |
66 | Link to the [dataset](https://www.kaggle.com/lukaanicin/book-covers-dataset)
67 |
68 | 4. Run the script **offline.py** to index the database using DELF and HNSW (use `python` instead of `python3` if that is how Python 3 is invoked on your system)
69 | ```sh
70 | python3 offline.py
71 | ```
72 | 5. Start the Flask server with **server.py**
73 | ```sh
74 | python3 server.py
75 | ```
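
Before running **offline.py**, the folder layout from steps 2 and 3 can be sanity-checked. The sketch below is an illustration under assumptions, not part of the repository: it assumes it is run from the repository root, reuses the glob pattern that `offline.py` uses to collect covers, and the helper name `check_layout` is hypothetical.

```python
import glob
import os

# Folders created in step 2 of the installation.
REQUIRED_DIRS = ["dataset", "feature", "resized", "uploads"]

# Same pattern offline.py uses to collect cover images: <category>/<file>.jpg
DATASET_PATTERN = os.path.join("dataset", "*", "*.[Jj][Pp][Gg]")


def check_layout(static_root="static"):
    """Return (missing_dirs, image_count) for the given static folder."""
    missing = [d for d in REQUIRED_DIRS
               if not os.path.isdir(os.path.join(static_root, d))]
    images = glob.glob(os.path.join(static_root, DATASET_PATTERN))
    return missing, len(images)


if __name__ == "__main__":
    missing, count = check_layout()
    print("Missing folders:", missing or "none")
    print("Cover images found:", count)
```

If the image count is zero, the dataset was probably unpacked one level too deep or too shallow: the pattern expects one category folder between **dataset** and the image files.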
76 |
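The indexing script concatenates the DELF descriptors of all covers into one flat matrix and saves the cumulative descriptor counts as `accumulated_indexes_boundaries`. How `utils.find_close_books` uses these boundaries is not shown here, so the following is only a sketch of one plausible lookup: given a row of the flat matrix (for example, a neighbor returned by the HNSW index), find which cover it belongs to. The function names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def build_boundaries(descriptor_counts):
    """Cumulative descriptor counts, one entry per image (as in offline.py)."""
    return list(accumulate(descriptor_counts))


def descriptor_to_image(descriptor_index, boundaries):
    """Map a row of the flat descriptor matrix to the index of its image.

    Image i owns rows boundaries[i-1] .. boundaries[i]-1
    (rows 0 .. boundaries[0]-1 for the first image).
    """
    if not 0 <= descriptor_index < boundaries[-1]:
        raise IndexError("descriptor index out of range")
    return bisect_right(boundaries, descriptor_index)


counts = [3, 2, 4]                     # descriptors per image
boundaries = build_boundaries(counts)  # [3, 5, 9]
print([descriptor_to_image(i, boundaries) for i in range(9)])
# → [0, 0, 0, 1, 1, 2, 2, 2, 2]
```

The returned position can then be used to index the saved `paths` list, which is stored in the same image order as the boundaries.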
77 |
78 | ## Usage examples
79 |
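Besides the web page, the server can be queried from a script. According to `server.py`, the home route `/` accepts a `POST` with the image in a form field named `query_img` and listens on port 5000. The sketch below uses only the standard library; the helper names and the `localhost` URL are assumptions, and the response is the rendered HTML page rather than structured data.

```python
import os
import uuid
import urllib.request


def build_multipart(field_name, filename, payload, content_type="image/jpeg"):
    """Encode one file as multipart/form-data; returns (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def search_by_cover(image_path, url="http://localhost:5000/"):
    """POST a cover photo to the running server and return the result page HTML."""
    with open(image_path, "rb") as f:
        body, header = build_multipart(
            "query_img", os.path.basename(image_path), f.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST")
    with urllib.request.urlopen(req) as response:
        return response.read().decode("utf-8")
```

With the server running, `search_by_cover("my_cover.jpg")` (where `my_cover.jpg` is a placeholder for a photo of a book cover) returns the HTML of the results page.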
80 |
81 |
82 |
83 |
84 |
85 | ## License
86 | Distributed under the MIT License. See `LICENSE` for more information.
87 |
88 |
89 | ## Acknowledgements
90 | * [DEep Local Features (DELF) paper](https://arxiv.org/pdf/1612.06321.pdf)
91 | * [DELF Reference implementation](https://www.dlology.com/blog/easy-landmark-image-recognition-with-tensorflow-hub-delf-module/)
92 | * [Search Engine](https://github.com/lucko515/search-book-by-cover-server)
93 | * [Web HTML](https://github.com/matsui528/sis)
--------------------------------------------------------------------------------
/demo/ECBIR.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/demo/ECBIR.jpg
--------------------------------------------------------------------------------
/demo/ECBIRdemo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/demo/ECBIRdemo.png
--------------------------------------------------------------------------------
/offline.py:
--------------------------------------------------------------------------------
import glob
import pickle
from itertools import accumulate

import nmslib
import numpy as np
from tqdm import tqdm

# For DELF loading (this script uses the TensorFlow 1.x graph API)
import tensorflow as tf
import tensorflow_hub as hub

from utils import paths_to_image_loader


def generate_dataset_vectors(paths):
    '''
    Generate DELF feature vectors for each image in the dataset.

    Returns a dict that maps each image path to [locations, descriptors].
    '''
    tf.reset_default_graph()
    tf.logging.set_verbosity(tf.logging.FATAL)

    model = hub.Module('https://tfhub.dev/google/delf/1')

    image_placeholder = tf.placeholder(
        tf.float32, shape=(None, None, 3), name='input_image')

    module_inputs = {
        'image': image_placeholder,
        'score_threshold': 100.0,
        'image_scales': [0.25, 0.3536, 0.5, 0.7071, 1.0, 1.4142, 2.0],
        'max_feature_num': 1000,
    }

    module_outputs = model(module_inputs, as_dict=True)
    image_tf = paths_to_image_loader(list(paths))

    with tf.train.MonitoredSession() as sess:
        results_dict = {}
        for image_path in tqdm(paths):
            image = sess.run(image_tf)
            results_dict[image_path] = sess.run(
                [module_outputs['locations'], module_outputs['descriptors']],
                feed_dict={image_placeholder: image})
    return results_dict


# Collect every JPEG under static/dataset/<category>/
paths = sorted(glob.glob('static/dataset/*/*.[Jj][Pp][Gg]'))
rea_dict = generate_dataset_vectors(paths)

paths = list(rea_dict.keys())
# Stack the per-image results into flat arrays and record where each image's rows end
locations_agg = np.concatenate([rea_dict[img][0] for img in paths])
descriptors_agg = np.concatenate([rea_dict[img][1] for img in paths])
accumulated_indexes_boundaries = list(accumulate(
    [rea_dict[img][0].shape[0] for img in paths]))

# Space name
space_name = 'l2'
# Initialize the library, specify the space and the vector type, and add data points
index = nmslib.init(method='hnsw', space=space_name,
                    data_type=nmslib.DataType.DENSE_VECTOR)
index.addDataPointBatch(descriptors_agg)

# Index-time parameters (these are the most important ones)
M = 15
efC = 100
num_threads = 4
index_time_params = {'M': M, 'indexThreadQty': num_threads,
                     'efConstruction': efC, 'post': 0}

# Create and save the index
index.createIndex(index_time_params)
index.saveIndex('static/feature/feature_set.bin')

# Save the metadata that server.py loads at startup
with open('static/feature/locations_agg.pkl', 'wb') as f:
    pickle.dump(locations_agg, f)
with open('static/feature/accumulated_indexes_boundaries.pkl', 'wb') as f:
    pickle.dump(accumulated_indexes_boundaries, f)
with open('static/feature/paths.pkl', 'wb') as f:
    pickle.dump(paths, f)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
numpy
pandas
selenium
Flask
opencv_python
scipy
tensorflow_hub
scikit_image
matplotlib
Werkzeug
requests
tqdm
imutils
tensorflow
Pillow
beautifulsoup4
currency_converter
nmslib
--------------------------------------------------------------------------------
/server.py:
--------------------------------------------------------------------------------
import os
import pickle

import nmslib
from flask import Flask, request, render_template
from werkzeug.utils import secure_filename

from utils import find_close_books

# Set root dir
APP_ROOT = os.path.dirname(os.path.abspath(__file__))

# Load the metadata written by offline.py
with open("static/feature/locations_agg.pkl", 'rb') as f:
    locations_agg = pickle.load(f)
with open("static/feature/accumulated_indexes_boundaries.pkl", 'rb') as f:
    accumulated_indexes_boundaries = pickle.load(f)
with open("static/feature/paths.pkl", 'rb') as f:
    paths = pickle.load(f)

space_name = 'l2'
# Re-initialize the library, specify the space and the type of the vector.
newIndex = nmslib.init(method='hnsw', space=space_name,
                       data_type=nmslib.DataType.DENSE_VECTOR)
# For an optimized L2 index, there's no need to re-load data points, but this would be required for
# a non-optimized index or any method other than HNSW (other methods can save only meta indices)
# newIndex.addDataPointBatch(data_matrix)

# Re-load the index built by offline.py
newIndex.loadIndex('static/feature/feature_set.bin')

# Set query-time parameters
efS = 100
query_time_params = {'efSearch': efS}
newIndex.setQueryTimeParams(query_time_params)

# Define Flask app
app = Flask(__name__)


# Define the app's home page
@app.route("/", methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        # Make sure the working folders exist
        upload_dir = os.path.join(APP_ROOT, "static/uploads/")
        os.makedirs(upload_dir, exist_ok=True)
        resized_dir = os.path.join(APP_ROOT, "static/resized/")
        os.makedirs(resized_dir, exist_ok=True)
        # Save the uploaded query image under a sanitized file name
        imagefile = request.files['query_img']
        filename = secure_filename(imagefile.filename)
        imagefile.save(upload_dir + filename)
        # Perform the inference process on the uploaded image
        results = find_close_books(
            upload_dir + filename, resized_dir + filename,
            locations_agg, accumulated_indexes_boundaries, newIndex)
        book_covers = [(paths[result[0]], result[1]) for result in results]
        return render_template('index.html', query_path='static/uploads/' + filename, cover_results=book_covers)
    else:
        return render_template("index.html")


# Start the application
if __name__ == "__main__":
    # debug=True is intended for local development only
    app.run(host="0.0.0.0", port=5000, debug=True)
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000001.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000001.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000002.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000002.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000003.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000003.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000004.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000004.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000005.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000005.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000006.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000006.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000007.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000007.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000008.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000008.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000009.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000009.jpg
--------------------------------------------------------------------------------
/static/dataset/Art-Photography/0000010.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HANyangguang/ECBIR/2e80a2f84523be69dd0383aa5c7e228e2014db39/static/dataset/Art-Photography/0000010.jpg
--------------------------------------------------------------------------------
/templates/index.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
Exact Content Based Image Retrieval with Deep Learning