├── .github
│   └── ISSUE_TEMPLATE
│       ├── ---bug-report.md
│       └── -questions-help-support.md
├── .gitignore
├── LICENSE
├── README.md
├── data
│   ├── README.md
│   ├── coco_to_synset.json
│   ├── lvis_results_100.json
│   └── lvis_val_100.json
├── images
│   ├── examples.png
│   └── lvis_icon.svg
├── lvis
│   ├── __init__.py
│   ├── colormap.py
│   ├── eval.py
│   ├── lvis.py
│   ├── results.py
│   └── vis.py
├── requirements.txt
├── setup.py
└── test.py
/.github/ISSUE_TEMPLATE/---bug-report.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: "\U0001F41B Bug Report"
3 | about: Submit a bug report to help us improve LVIS API
4 | title: ''
5 | labels: bug
6 | assignees: ''
7 |
8 | ---
9 |
10 | ## 🐛 Bug
11 |
12 |
13 |
14 | ## To Reproduce
15 |
16 | Steps to reproduce the behavior.
17 |
18 |
19 |
20 | ## Expected behavior
21 |
22 |
23 |
24 |
25 | ## Additional context
26 |
27 |
28 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/-questions-help-support.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: "❓Questions/Help/Support"
3 | about: Do you need support?
4 | title: ''
5 | labels: ''
6 | assignees: ''
7 |
8 | ---
9 |
10 | ## ❓ Questions and Help
11 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .ipynb_checkpoints
2 | __pycache__
3 | .DS_Store
4 | dist/*
5 | lvis.egg-info/
6 | build/*
7 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright (c) 2019, Agrim Gupta and Ross Girshick
2 | All rights reserved.
3 |
4 | Redistribution and use in source and binary forms, with or without
5 | modification, are permitted provided that the following conditions are met:
6 |
7 | 1. Redistributions of source code must retain the above copyright notice, this
8 | list of conditions and the following disclaimer.
9 | 2. Redistributions in binary form must reproduce the above copyright notice,
10 | this list of conditions and the following disclaimer in the documentation
11 | and/or other materials provided with the distribution.
12 |
13 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
14 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
15 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
16 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
17 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
18 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
19 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
20 | ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
21 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
22 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
23 |
24 | The views and conclusions contained in the software and documentation are those
25 | of the authors and should not be interpreted as representing official policies,
26 | either expressed or implied, of the FreeBSD Project.
27 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # LVIS API
2 |
3 |
4 | LVIS (pronounced ‘el-vis’) is a new dataset for Large Vocabulary Instance Segmentation.
5 | When complete, it will feature more than 2 million high-quality instance segmentation masks for over 1200 entry-level object categories in 164k images. The LVIS API enables reading and interacting with annotation files, visualizing annotations, and evaluating results.
6 |
7 |
8 |
9 | ## LVIS v1.0
10 |
11 | For this release, we have annotated 159,623 images (100k train, 20k val, 20k test-dev, 20k test-challenge). Release v1.0 is publicly available on the [LVIS website](http://www.lvisdataset.org) and will be used in the second LVIS Challenge, to be held at the Joint COCO and LVIS Workshop at ECCV 2020.
12 |
13 | ## Setup
14 | You can set up a virtual environment and then install the `lvis` package using pip:
15 |
16 | ```bash
17 | python3 -m venv env # Create a virtual environment
18 | source env/bin/activate # Activate virtual environment
19 |
20 | # install the COCO API (it requires numpy, so make sure numpy is installed first)
21 | pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
22 | # install LVIS API
23 | pip install lvis
24 | # Work for a while ...
25 | deactivate # Exit virtual environment
26 | ```
27 |
28 | You can also clone the repo first and then run the following steps inside it:
29 | ```bash
30 | python3 -m venv env # Create a virtual environment
31 | source env/bin/activate # Activate virtual environment
32 |
33 | # install the COCO API (it requires numpy, so make sure numpy is installed first)
34 | pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
35 | # install LVIS API
36 | pip install .
37 | # test if the installation was correct
38 | python test.py
39 | # Work for a while ...
40 | deactivate # Exit virtual environment
41 | ```
42 | ## Citing LVIS
43 |
44 | If you find this code/data useful in your research then please cite our [paper](https://arxiv.org/abs/1908.03195):
45 | ```
46 | @inproceedings{gupta2019lvis,
47 | title={{LVIS}: A Dataset for Large Vocabulary Instance Segmentation},
48 | author={Gupta, Agrim and Dollar, Piotr and Girshick, Ross},
49 | booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
50 | year={2019}
51 | }
52 | ```
53 |
54 | ## Credit
55 |
56 | The code is a re-write of the Python API for [COCO](https://github.com/cocodataset/cocoapi).
57 | The core functionality is the same, with LVIS-specific changes.
58 |
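After installation, a typical session looks like the sketch below. This is a minimal, assumed usage pattern based on the classes exported by `lvis/__init__.py` and the `LVISEval` constructor signature; it assumes you have a ground-truth annotation file and a matching result file, such as the `lvis_val_100.json` / `lvis_results_100.json` samples under `data/` (paths here are illustrative).

```python
from lvis import LVISEval

# Paths are placeholders -- substitute your own annotation/result files.
lvis_eval = LVISEval(
    "data/lvis_val_100.json",       # ground-truth annotations (or an LVIS instance)
    "data/lvis_results_100.json",   # detection results (or an LVISResults instance / list of dicts)
    iou_type="segm",                # "segm" or "bbox"
)
lvis_eval.evaluate()    # per-image, per-category matching
lvis_eval.accumulate()  # aggregate precision/recall across images
```

`evaluate()` and `accumulate()` are the entry points visible in `lvis/eval.py`; `test.py` in the repo exercises the same flow.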
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
1 | ## Mapping between LVIS and COCO categories
2 |
3 | The json file `coco_to_synset.json` provides a mapping from each COCO category
4 | to a synset. The synset can then be used to find the corresponding category in
5 | LVIS. Matching based on synsets (instead of category id) allows this mapping
6 | to be correct even if LVIS category ids change (which will likely happen when
7 | upgrading from LVIS release v0.5 to v1.0).
8 |
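For example, the mapping can be loaded with the standard `json` module and used to look up the synset for a COCO category name. The snippet below inlines a two-entry excerpt copied from `coco_to_synset.json`; the helper function `coco_name_to_synset` is illustrative, not part of the API.

```python
import json

# A minimal excerpt of coco_to_synset.json (entries copied verbatim from the file).
excerpt = json.loads("""
{"bench": {"coco_cat_id": 15,
           "meaning": "a long seat for more than one person",
           "synset": "bench.n.01"},
 "couch": {"coco_cat_id": 63,
           "meaning": "an upholstered seat for more than one person",
           "synset": "sofa.n.01"}}
""")

def coco_name_to_synset(mapping, name):
    """Return the WordNet synset for a COCO category name."""
    return mapping[name]["synset"]

print(coco_name_to_synset(excerpt, "couch"))  # -> sofa.n.01
```

Note that the synset is the stable key: "couch" maps to `sofa.n.01`, which can then be matched against the `synset` field of an LVIS category regardless of its LVIS category id.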
--------------------------------------------------------------------------------
/data/coco_to_synset.json:
--------------------------------------------------------------------------------
1 | {"bench": {"coco_cat_id": 15, "meaning": "a long seat for more than one person", "synset": "bench.n.01"}, "baseball bat": {"coco_cat_id": 39, "meaning": "an implement used in baseball by the batter", "synset": "baseball_bat.n.01"}, "kite": {"coco_cat_id": 38, "meaning": "plaything consisting of a light frame covered with tissue paper; flown in wind at end of a string", "synset": "kite.n.03"}, "orange": {"coco_cat_id": 55, "meaning": "orange (FRUIT of an orange tree)", "synset": "orange.n.01"}, "boat": {"coco_cat_id": 9, "meaning": "a vessel for travel on water", "synset": "boat.n.01"}, "carrot": {"coco_cat_id": 57, "meaning": "deep orange edible root of the cultivated carrot plant", "synset": "carrot.n.01"}, "bicycle": {"coco_cat_id": 2, "meaning": "a wheeled vehicle that has two wheels and is moved by foot pedals", "synset": "bicycle.n.01"}, "book": {"coco_cat_id": 84, "meaning": "a written work or composition that has been published", "synset": "book.n.01"}, "toothbrush": {"coco_cat_id": 90, "meaning": "small brush; has long handle; used to clean teeth", "synset": "toothbrush.n.01"}, "tie": {"coco_cat_id": 32, "meaning": "neckwear consisting of a long narrow piece of material worn under a collar and tied in knot at the front", "synset": "necktie.n.01"}, "sandwich": {"coco_cat_id": 54, "meaning": "two (or more) slices of bread with a filling between them", "synset": "sandwich.n.01"}, "toilet": {"coco_cat_id": 70, "meaning": "a plumbing fixture for defecation and urination", "synset": "toilet.n.02"}, "stop sign": {"coco_cat_id": 13, "meaning": "a traffic sign to notify drivers that they must come to a complete stop", "synset": "stop_sign.n.01"}, "wine glass": {"coco_cat_id": 46, "meaning": "a glass that has a stem and in which wine is served", "synset": "wineglass.n.01"}, "clock": {"coco_cat_id": 85, "meaning": "a timepiece that shows the time of day", "synset": "clock.n.01"}, "bear": {"coco_cat_id": 23, "meaning": "large carnivorous or omnivorous mammals with 
shaggy coats and claws", "synset": "bear.n.01"}, "vase": {"coco_cat_id": 86, "meaning": "an open jar of glass or porcelain used as an ornament or to hold flowers", "synset": "vase.n.01"}, "microwave": {"coco_cat_id": 78, "meaning": "kitchen appliance that cooks food by passing an electromagnetic wave through it", "synset": "microwave.n.02"}, "oven": {"coco_cat_id": 79, "meaning": "kitchen appliance used for baking or roasting", "synset": "oven.n.01"}, "cake": {"coco_cat_id": 61, "meaning": "baked goods made from or based on a mixture of flour, sugar, eggs, and fat", "synset": "cake.n.03"}, "apple": {"coco_cat_id": 53, "meaning": "fruit with red or yellow or green skin and sweet to tart crisp whitish flesh", "synset": "apple.n.01"}, "bed": {"coco_cat_id": 65, "meaning": "a piece of furniture that provides a place to sleep", "synset": "bed.n.01"}, "skis": {"coco_cat_id": 35, "meaning": "sports equipment for skiing on snow", "synset": "ski.n.01"}, "dining table": {"coco_cat_id": 67, "meaning": "a table at which meals are served", "synset": "dining_table.n.01"}, "remote": {"coco_cat_id": 75, "meaning": "a device that can be used to control a machine or apparatus from a distance", "synset": "remote_control.n.01"}, "bird": {"coco_cat_id": 16, "meaning": "animal characterized by feathers and wings", "synset": "bird.n.01"}, "laptop": {"coco_cat_id": 73, "meaning": "a portable computer small enough to use in your lap", "synset": "laptop.n.01"}, "train": {"coco_cat_id": 7, "meaning": "public or private transport provided by a line of railway cars coupled together and drawn by a locomotive", "synset": "train.n.01"}, "mouse": {"coco_cat_id": 74, "meaning": "a computer input device that controls an on-screen pointer", "synset": "mouse.n.04"}, "pizza": {"coco_cat_id": 59, "meaning": "Italian open pie made of thin bread dough spread with a spiced mixture of e.g. 
tomato sauce and cheese", "synset": "pizza.n.01"}, "toaster": {"coco_cat_id": 80, "meaning": "a kitchen appliance (usually electric) for toasting bread", "synset": "toaster.n.02"}, "cell phone": {"coco_cat_id": 77, "meaning": "a hand-held mobile telephone", "synset": "cellular_telephone.n.01"}, "person": {"coco_cat_id": 1, "meaning": "a human being", "synset": "person.n.01"}, "sports ball": {"coco_cat_id": 37, "meaning": "a spherical object used as a plaything", "synset": "ball.n.06"}, "fire hydrant": {"coco_cat_id": 11, "meaning": "an upright hydrant for drawing water to use in fighting a fire", "synset": "fireplug.n.01"}, "umbrella": {"coco_cat_id": 28, "meaning": "a lightweight handheld collapsible canopy", "synset": "umbrella.n.01"}, "truck": {"coco_cat_id": 8, "meaning": "an automotive vehicle suitable for hauling", "synset": "truck.n.01"}, "knife": {"coco_cat_id": 49, "meaning": "tool with a blade and point used as a cutting instrument", "synset": "knife.n.01"}, "baseball glove": {"coco_cat_id": 40, "meaning": "the handwear used by fielders in playing baseball", "synset": "baseball_glove.n.01"}, "giraffe": {"coco_cat_id": 25, "meaning": "tall animal having a spotted coat and small horns and very long neck and legs", "synset": "giraffe.n.01"}, "airplane": {"coco_cat_id": 5, "meaning": "an aircraft that has a fixed wing and is powered by propellers or jets", "synset": "airplane.n.01"}, "parking meter": {"coco_cat_id": 14, "meaning": "a coin-operated timer located next to a parking space", "synset": "parking_meter.n.01"}, "couch": {"coco_cat_id": 63, "meaning": "an upholstered seat for more than one person", "synset": "sofa.n.01"}, "tennis racket": {"coco_cat_id": 43, "meaning": "a racket used to play tennis", "synset": "tennis_racket.n.01"}, "backpack": {"coco_cat_id": 27, "meaning": "a bag carried by a strap on your back or shoulder", "synset": "backpack.n.01"}, "hot dog": {"coco_cat_id": 58, "meaning": "a smooth-textured sausage, usually smoked, often served 
on a bread roll", "synset": "frank.n.02"}, "banana": {"coco_cat_id": 52, "meaning": "elongated crescent-shaped yellow fruit with soft sweet flesh", "synset": "banana.n.02"}, "bowl": {"coco_cat_id": 51, "meaning": "a dish that is round and open at the top for serving foods", "synset": "bowl.n.03"}, "skateboard": {"coco_cat_id": 41, "meaning": "a board with wheels that is ridden in a standing or crouching position and propelled by foot", "synset": "skateboard.n.01"}, "bottle": {"coco_cat_id": 44, "meaning": "a glass or plastic vessel used for storing drinks or other liquids", "synset": "bottle.n.01"}, "dog": {"coco_cat_id": 18, "meaning": "a common domesticated dog", "synset": "dog.n.01"}, "frisbee": {"coco_cat_id": 34, "meaning": "a light, plastic disk propelled with a flip of the wrist for recreation or competition", "synset": "frisbee.n.01"}, "broccoli": {"coco_cat_id": 56, "meaning": "plant with dense clusters of tight green flower buds", "synset": "broccoli.n.01"}, "elephant": {"coco_cat_id": 22, "meaning": "a common elephant", "synset": "elephant.n.01"}, "car": {"coco_cat_id": 3, "meaning": "a motor vehicle with four wheels", "synset": "car.n.01"}, "donut": {"coco_cat_id": 60, "meaning": "a small ring-shaped friedcake", "synset": "doughnut.n.02"}, "suitcase": {"coco_cat_id": 33, "meaning": "cases used to carry belongings when traveling", "synset": "bag.n.06"}, "cup": {"coco_cat_id": 47, "meaning": "a small open container usually used for drinking; usually has a handle", "synset": "cup.n.01"}, "hair drier": {"coco_cat_id": 89, "meaning": "a hand-held electric blower that can blow warm air onto the hair", "synset": "hand_blower.n.01"}, "surfboard": {"coco_cat_id": 42, "meaning": "a narrow buoyant board for riding surf", "synset": "surfboard.n.01"}, "traffic light": {"coco_cat_id": 10, "meaning": "a device to control vehicle traffic often consisting of three or more lights", "synset": "traffic_light.n.01"}, "tv": {"coco_cat_id": 72, "meaning": "an electronic 
device that receives television signals and displays them on a screen", "synset": "television_receiver.n.01"}, "spoon": {"coco_cat_id": 50, "meaning": "a piece of cutlery with a shallow bowl-shaped container and a handle", "synset": "spoon.n.01"}, "horse": {"coco_cat_id": 19, "meaning": "a common horse", "synset": "horse.n.01"}, "motorcycle": {"coco_cat_id": 4, "meaning": "a motor vehicle with two wheels and a strong frame", "synset": "motorcycle.n.01"}, "zebra": {"coco_cat_id": 24, "meaning": "any of several fleet black-and-white striped African equines", "synset": "zebra.n.01"}, "cat": {"coco_cat_id": 17, "meaning": "a domestic house cat", "synset": "cat.n.01"}, "teddy bear": {"coco_cat_id": 88, "meaning": "plaything consisting of a child's toy bear (usually plush and stuffed with soft materials)", "synset": "teddy.n.01"}, "handbag": {"coco_cat_id": 31, "meaning": "a container used for carrying money and small personal items or accessories", "synset": "bag.n.04"}, "sink": {"coco_cat_id": 81, "meaning": "plumbing fixture consisting of a water basin fixed to a wall or floor and having a drainpipe", "synset": "sink.n.01"}, "keyboard": {"coco_cat_id": 76, "meaning": "a keyboard that is a data input device for computers", "synset": "computer_keyboard.n.01"}, "bus": {"coco_cat_id": 6, "meaning": "a vehicle carrying many passengers; used for public transport", "synset": "bus.n.01"}, "fork": {"coco_cat_id": 48, "meaning": "cutlery used for serving and eating food", "synset": "fork.n.01"}, "chair": {"coco_cat_id": 62, "meaning": "a seat for one person, with a support for the back", "synset": "chair.n.01"}, "refrigerator": {"coco_cat_id": 82, "meaning": "a refrigerator in which the coolant is pumped around by an electric motor", "synset": "electric_refrigerator.n.01"}, "scissors": {"coco_cat_id": 87, "meaning": "a tool having two crossed pivoting blades with looped handles", "synset": "scissors.n.01"}, "sheep": {"coco_cat_id": 20, "meaning": "woolly usually horned ruminant 
mammal related to the goat", "synset": "sheep.n.01"}, "potted plant": {"coco_cat_id": 64, "meaning": "a container in which plants are cultivated", "synset": "pot.n.04"}, "snowboard": {"coco_cat_id": 36, "meaning": "a board that resembles a broad ski or a small surfboard; used in a standing position to slide down snow-covered slopes", "synset": "snowboard.n.01"}, "cow": {"coco_cat_id": 21, "meaning": "cattle that are reared for their meat", "synset": "beef.n.01"}}
--------------------------------------------------------------------------------
/images/examples.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lvis-dataset/lvis-api/7d7f07def11da91f8b2710ce352c62a78fd5a7ad/images/examples.png
--------------------------------------------------------------------------------
/images/lvis_icon.svg:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/lvis/__init__.py:
--------------------------------------------------------------------------------
1 | import logging
2 | from lvis.lvis import LVIS
3 | from lvis.results import LVISResults
4 | from lvis.eval import LVISEval
5 | from lvis.vis import LVISVis
6 |
7 | logging.basicConfig(
8 | format="[%(asctime)s] %(name)s %(levelname)s: %(message)s", datefmt="%m/%d %H:%M:%S",
9 | level=logging.WARN,
10 | )
11 |
12 | __all__ = ["LVIS", "LVISResults", "LVISEval", "LVISVis"]
13 |
--------------------------------------------------------------------------------
/lvis/colormap.py:
--------------------------------------------------------------------------------
1 | """An awesome colormap for really neat visualizations. Taken from detectron."""
2 |
3 | import numpy as np
4 |
5 | def colormap(rgb=False):
6 | color_list = np.array(
7 | [
8 | 0.000,
9 | 0.447,
10 | 0.741,
11 | 0.850,
12 | 0.325,
13 | 0.098,
14 | 0.929,
15 | 0.694,
16 | 0.125,
17 | 0.494,
18 | 0.184,
19 | 0.556,
20 | 0.466,
21 | 0.674,
22 | 0.188,
23 | 0.301,
24 | 0.745,
25 | 0.933,
26 | 0.635,
27 | 0.078,
28 | 0.184,
29 | 0.300,
30 | 0.300,
31 | 0.300,
32 | 0.600,
33 | 0.600,
34 | 0.600,
35 | 1.000,
36 | 0.000,
37 | 0.000,
38 | 1.000,
39 | 0.500,
40 | 0.000,
41 | 0.749,
42 | 0.749,
43 | 0.000,
44 | 0.000,
45 | 1.000,
46 | 0.000,
47 | 0.000,
48 | 0.000,
49 | 1.000,
50 | 0.667,
51 | 0.000,
52 | 1.000,
53 | 0.333,
54 | 0.333,
55 | 0.000,
56 | 0.333,
57 | 0.667,
58 | 0.000,
59 | 0.333,
60 | 1.000,
61 | 0.000,
62 | 0.667,
63 | 0.333,
64 | 0.000,
65 | 0.667,
66 | 0.667,
67 | 0.000,
68 | 0.667,
69 | 1.000,
70 | 0.000,
71 | 1.000,
72 | 0.333,
73 | 0.000,
74 | 1.000,
75 | 0.667,
76 | 0.000,
77 | 1.000,
78 | 1.000,
79 | 0.000,
80 | 0.000,
81 | 0.333,
82 | 0.500,
83 | 0.000,
84 | 0.667,
85 | 0.500,
86 | 0.000,
87 | 1.000,
88 | 0.500,
89 | 0.333,
90 | 0.000,
91 | 0.500,
92 | 0.333,
93 | 0.333,
94 | 0.500,
95 | 0.333,
96 | 0.667,
97 | 0.500,
98 | 0.333,
99 | 1.000,
100 | 0.500,
101 | 0.667,
102 | 0.000,
103 | 0.500,
104 | 0.667,
105 | 0.333,
106 | 0.500,
107 | 0.667,
108 | 0.667,
109 | 0.500,
110 | 0.667,
111 | 1.000,
112 | 0.500,
113 | 1.000,
114 | 0.000,
115 | 0.500,
116 | 1.000,
117 | 0.333,
118 | 0.500,
119 | 1.000,
120 | 0.667,
121 | 0.500,
122 | 1.000,
123 | 1.000,
124 | 0.500,
125 | 0.000,
126 | 0.333,
127 | 1.000,
128 | 0.000,
129 | 0.667,
130 | 1.000,
131 | 0.000,
132 | 1.000,
133 | 1.000,
134 | 0.333,
135 | 0.000,
136 | 1.000,
137 | 0.333,
138 | 0.333,
139 | 1.000,
140 | 0.333,
141 | 0.667,
142 | 1.000,
143 | 0.333,
144 | 1.000,
145 | 1.000,
146 | 0.667,
147 | 0.000,
148 | 1.000,
149 | 0.667,
150 | 0.333,
151 | 1.000,
152 | 0.667,
153 | 0.667,
154 | 1.000,
155 | 0.667,
156 | 1.000,
157 | 1.000,
158 | 1.000,
159 | 0.000,
160 | 1.000,
161 | 1.000,
162 | 0.333,
163 | 1.000,
164 | 1.000,
165 | 0.667,
166 | 1.000,
167 | 0.167,
168 | 0.000,
169 | 0.000,
170 | 0.333,
171 | 0.000,
172 | 0.000,
173 | 0.500,
174 | 0.000,
175 | 0.000,
176 | 0.667,
177 | 0.000,
178 | 0.000,
179 | 0.833,
180 | 0.000,
181 | 0.000,
182 | 1.000,
183 | 0.000,
184 | 0.000,
185 | 0.000,
186 | 0.167,
187 | 0.000,
188 | 0.000,
189 | 0.333,
190 | 0.000,
191 | 0.000,
192 | 0.500,
193 | 0.000,
194 | 0.000,
195 | 0.667,
196 | 0.000,
197 | 0.000,
198 | 0.833,
199 | 0.000,
200 | 0.000,
201 | 1.000,
202 | 0.000,
203 | 0.000,
204 | 0.000,
205 | 0.167,
206 | 0.000,
207 | 0.000,
208 | 0.333,
209 | 0.000,
210 | 0.000,
211 | 0.500,
212 | 0.000,
213 | 0.000,
214 | 0.667,
215 | 0.000,
216 | 0.000,
217 | 0.833,
218 | 0.000,
219 | 0.000,
220 | 1.000,
221 | 0.000,
222 | 0.000,
223 | 0.000,
224 | 0.143,
225 | 0.143,
226 | 0.143,
227 | 0.286,
228 | 0.286,
229 | 0.286,
230 | 0.429,
231 | 0.429,
232 | 0.429,
233 | 0.571,
234 | 0.571,
235 | 0.571,
236 | 0.714,
237 | 0.714,
238 | 0.714,
239 | 0.857,
240 | 0.857,
241 | 0.857,
242 | 1.000,
243 | 1.000,
244 | 1.000,
245 | ]
246 | ).astype(np.float32)
247 | color_list = color_list.reshape((-1, 3)) * 255
248 | if not rgb:
249 | color_list = color_list[:, ::-1]
250 | return color_list
251 |
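The flat list above packs RGB triplets in row-major order; `colormap()` reshapes it into an N x 3 array, scales to 0-255, and (by default) flips each row to BGR for OpenCV-style use. A standalone sketch of the same reshape-and-flip logic, using a toy two-color palette taken from the first six values above, is:

```python
import numpy as np

# Two colors stored flat, as in colormap.py: [r, g, b, r, g, b].
flat = [0.000, 0.447, 0.741,   # muted blue
        0.850, 0.325, 0.098]   # burnt orange

# Reshape to one row per color and scale from [0, 1] to [0, 255].
color_list = np.array(flat, dtype=np.float32).reshape((-1, 3)) * 255

# colormap(rgb=False) reverses each row, yielding BGR channel order.
bgr = color_list[:, ::-1]

print(color_list.shape)  # (2, 3)
```

In the visualization code a color for instance `i` would then be picked as `color_list[i % len(color_list)]`.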
--------------------------------------------------------------------------------
/lvis/eval.py:
--------------------------------------------------------------------------------
1 | import datetime
2 | import logging
3 | from collections import OrderedDict
4 | from collections import defaultdict
5 |
6 | import numpy as np
7 |
8 | from lvis.lvis import LVIS
9 | from lvis.results import LVISResults
10 |
11 | import pycocotools.mask as mask_utils
12 |
13 |
14 | class LVISEval:
15 | def __init__(self, lvis_gt, lvis_dt, iou_type="segm"):
16 | """Constructor for LVISEval.
17 | Args:
18 | lvis_gt (LVIS class instance, or str containing path of annotation file)
19 | lvis_dt (LVISResult class instance, or str containing path of result file,
20 | or list of dict)
21 | iou_type (str): segm or bbox evaluation
22 | """
23 | self.logger = logging.getLogger(__name__)
24 |
25 | if iou_type not in ["bbox", "segm"]:
26 | raise ValueError("iou_type: {} is not supported.".format(iou_type))
27 |
28 | if isinstance(lvis_gt, LVIS):
29 | self.lvis_gt = lvis_gt
30 | elif isinstance(lvis_gt, str):
31 | self.lvis_gt = LVIS(lvis_gt)
32 | else:
33 | raise TypeError("Unsupported type {} of lvis_gt.".format(type(lvis_gt)))
34 |
35 | if isinstance(lvis_dt, LVISResults):
36 | self.lvis_dt = lvis_dt
37 | elif isinstance(lvis_dt, (str, list)):
38 | self.lvis_dt = LVISResults(self.lvis_gt, lvis_dt)
39 | else:
40 | raise TypeError("Unsupported type {} of lvis_dt.".format(type(lvis_dt)))
41 |
42 | # per-image per-category evaluation results
43 | self.eval_imgs = defaultdict(list)
44 | self.eval = {} # accumulated evaluation results
45 | self._gts = defaultdict(list) # gt for evaluation
46 | self._dts = defaultdict(list) # dt for evaluation
47 | self.params = Params(iou_type=iou_type) # parameters
48 | self.results = OrderedDict()
49 | self.ious = {} # ious between all gts and dts
50 |
51 | self.params.img_ids = sorted(self.lvis_gt.get_img_ids())
52 | self.params.cat_ids = sorted(self.lvis_gt.get_cat_ids())
53 |
54 | def _to_mask(self, anns, lvis):
55 | for ann in anns:
56 | rle = lvis.ann_to_rle(ann)
57 | ann["segmentation"] = rle
58 |
59 | def _prepare(self):
60 | """Prepare self._gts and self._dts for evaluation based on params."""
61 |
62 | cat_ids = self.params.cat_ids if self.params.cat_ids else None
63 |
64 | gts = self.lvis_gt.load_anns(
65 | self.lvis_gt.get_ann_ids(img_ids=self.params.img_ids, cat_ids=cat_ids)
66 | )
67 | dts = self.lvis_dt.load_anns(
68 | self.lvis_dt.get_ann_ids(img_ids=self.params.img_ids, cat_ids=cat_ids)
69 | )
70 | # convert ground truth to mask if iou_type == 'segm'
71 | if self.params.iou_type == "segm":
72 | self._to_mask(gts, self.lvis_gt)
73 | self._to_mask(dts, self.lvis_dt)
74 |
75 | # set ignore flag
76 | for gt in gts:
77 | if "ignore" not in gt:
78 | gt["ignore"] = 0
79 |
80 | for gt in gts:
81 | self._gts[gt["image_id"], gt["category_id"]].append(gt)
82 |
83 | # For federated dataset evaluation we will filter out all dt for an
84 | # image which belong to categories not present in gt and not present in
85 | # the negative list for that image. In other words, the detector is not
86 | # penalized for categories for which we have no gt information about
87 | # presence or absence in an image.
88 | img_data = self.lvis_gt.load_imgs(ids=self.params.img_ids)
89 | # per image map of categories not present in image
90 | img_nl = {d["id"]: d["neg_category_ids"] for d in img_data}
91 | # per image list of categories present in image
92 | img_pl = defaultdict(set)
93 | for ann in gts:
94 | img_pl[ann["image_id"]].add(ann["category_id"])
95 | # per image map of categories which have missing gt. For these
96 | # categories we don't penalize the detector for false positives.
97 | self.img_nel = {d["id"]: d["not_exhaustive_category_ids"] for d in img_data}
98 |
99 | for dt in dts:
100 | img_id, cat_id = dt["image_id"], dt["category_id"]
101 | if cat_id not in img_nl[img_id] and cat_id not in img_pl[img_id]:
102 | continue
103 | self._dts[img_id, cat_id].append(dt)
104 |
105 | self.freq_groups = self._prepare_freq_group()
106 |
107 | def _prepare_freq_group(self):
108 | freq_groups = [[] for _ in self.params.img_count_lbl]
109 | cat_data = self.lvis_gt.load_cats(self.params.cat_ids)
110 | for idx, _cat_data in enumerate(cat_data):
111 | frequency = _cat_data["frequency"]
112 | freq_groups[self.params.img_count_lbl.index(frequency)].append(idx)
113 | return freq_groups
114 |
115 | def evaluate(self):
116 | """
117 | Run per image evaluation on given images and store results
118 | (a list of dict) in self.eval_imgs.
119 | """
120 | self.logger.info("Running per image evaluation.")
121 | self.logger.info("Evaluate annotation type *{}*".format(self.params.iou_type))
122 |
123 | self.params.img_ids = list(np.unique(self.params.img_ids))
124 |
125 | if self.params.use_cats:
126 | cat_ids = self.params.cat_ids
127 | else:
128 | cat_ids = [-1]
129 |
130 | self._prepare()
131 |
132 | self.ious = {
133 | (img_id, cat_id): self.compute_iou(img_id, cat_id)
134 | for img_id in self.params.img_ids
135 | for cat_id in cat_ids
136 | }
137 |
138 | # loop through images, area range, max detection number
139 | self.eval_imgs = [
140 | self.evaluate_img(img_id, cat_id, area_rng)
141 | for cat_id in cat_ids
142 | for area_rng in self.params.area_rng
143 | for img_id in self.params.img_ids
144 | ]
145 |
146 | def _get_gt_dt(self, img_id, cat_id):
147 | """Create gt, dt which are list of anns/dets. If use_cats is true
148 | only anns/dets corresponding to tuple (img_id, cat_id) will be
149 | used. Else, all anns/dets in image are used and cat_id is not used.
150 | """
151 | if self.params.use_cats:
152 | gt = self._gts[img_id, cat_id]
153 | dt = self._dts[img_id, cat_id]
154 | else:
155 | gt = [
156 | _ann
157 | for _cat_id in self.params.cat_ids
158 | for _ann in self._gts[img_id, _cat_id]
159 | ]
160 | dt = [
161 | _ann
162 | for _cat_id in self.params.cat_ids
163 | for _ann in self._dts[img_id, _cat_id]
164 | ]
165 | return gt, dt
166 |
167 | def compute_iou(self, img_id, cat_id):
168 | gt, dt = self._get_gt_dt(img_id, cat_id)
169 |
170 | if len(gt) == 0 and len(dt) == 0:
171 | return []
172 |
173 | # Sort detections in decreasing order of score.
174 | idx = np.argsort([-d["score"] for d in dt], kind="mergesort")
175 | dt = [dt[i] for i in idx]
176 |
177 | iscrowd = [int(False)] * len(gt)
178 |
179 | if self.params.iou_type == "segm":
180 | ann_type = "segmentation"
181 | elif self.params.iou_type == "bbox":
182 | ann_type = "bbox"
183 | else:
184 | raise ValueError("Unknown iou_type for iou computation.")
185 | gt = [g[ann_type] for g in gt]
186 | dt = [d[ann_type] for d in dt]
187 |
188 | # compute iou between each dt and gt region
189 | # will return array of shape len(dt), len(gt)
190 | ious = mask_utils.iou(dt, gt, iscrowd)
191 | return ious
192 |
193 | def evaluate_img(self, img_id, cat_id, area_rng):
194 | """Perform evaluation for single category and image."""
195 | gt, dt = self._get_gt_dt(img_id, cat_id)
196 |
197 | if len(gt) == 0 and len(dt) == 0:
198 | return None
199 |
200 | # Add another field _ignore to only consider anns based on area range.
201 | for g in gt:
202 | if g["ignore"] or (g["area"] < area_rng[0] or g["area"] > area_rng[1]):
203 | g["_ignore"] = 1
204 | else:
205 | g["_ignore"] = 0
206 |
207 | # Sort gt ignore last
208 | gt_idx = np.argsort([g["_ignore"] for g in gt], kind="mergesort")
209 | gt = [gt[i] for i in gt_idx]
210 |
211 | # Sort dt highest score first
212 | dt_idx = np.argsort([-d["score"] for d in dt], kind="mergesort")
213 | dt = [dt[i] for i in dt_idx]
214 |
215 | # load computed ious
216 | ious = (
217 | self.ious[img_id, cat_id][:, gt_idx]
218 | if len(self.ious[img_id, cat_id]) > 0
219 | else self.ious[img_id, cat_id]
220 | )
221 |
222 | num_thrs = len(self.params.iou_thrs)
223 | num_gt = len(gt)
224 | num_dt = len(dt)
225 |
226 | # Array to store the "id" of the matched dt/gt
227 | gt_m = np.zeros((num_thrs, num_gt))
228 | dt_m = np.zeros((num_thrs, num_dt))
229 |
230 | gt_ig = np.array([g["_ignore"] for g in gt])
231 | dt_ig = np.zeros((num_thrs, num_dt))
232 |
233 | for iou_thr_idx, iou_thr in enumerate(self.params.iou_thrs):
234 | if len(ious) == 0:
235 | break
236 |
237 | for dt_idx, _dt in enumerate(dt):
238 | iou = min([iou_thr, 1 - 1e-10])
239 | # information about best match so far (m=-1 -> unmatched)
240 | # store the gt_idx which matched for _dt
241 | m = -1
242 | for gt_idx, _ in enumerate(gt):
243 | # if this gt already matched continue
244 | if gt_m[iou_thr_idx, gt_idx] > 0:
245 | continue
246 | # if _dt matched a regular gt and we are now on an ignore gt, stop
247 | if m > -1 and gt_ig[m] == 0 and gt_ig[gt_idx] == 1:
248 | break
249 | # continue to next gt unless better match made
250 | if ious[dt_idx, gt_idx] < iou:
251 | continue
252 | # if match successful and best so far, store appropriately
253 | iou = ious[dt_idx, gt_idx]
254 | m = gt_idx
255 |
256 | # No match found for _dt, go to next _dt
257 | if m == -1:
258 | continue
259 |
260 | # if the matched gt is ignored for some reason, update dt_ig.
261 | # Such dt should not be used in evaluation.
262 | dt_ig[iou_thr_idx, dt_idx] = gt_ig[m]
263 | # _dt match found, update gt_m, and dt_m with "id"
264 | dt_m[iou_thr_idx, dt_idx] = gt[m]["id"]
265 | gt_m[iou_thr_idx, m] = _dt["id"]
266 |
267 | # For LVIS we will ignore any unmatched detection if that category was
268 | # not exhaustively annotated in gt.
269 | dt_ig_mask = [
270 | d["area"] < area_rng[0]
271 | or d["area"] > area_rng[1]
272 | or d["category_id"] in self.img_nel[d["image_id"]]
273 | for d in dt
274 | ]
275 | dt_ig_mask = np.array(dt_ig_mask).reshape((1, num_dt)) # 1 X num_dt
276 | dt_ig_mask = np.repeat(dt_ig_mask, num_thrs, 0) # num_thrs X num_dt
277 | # Based on dt_ig_mask ignore any unmatched detection by updating dt_ig
278 | dt_ig = np.logical_or(dt_ig, np.logical_and(dt_m == 0, dt_ig_mask))
279 | # store results for given image and category
280 | return {
281 | "image_id": img_id,
282 | "category_id": cat_id,
283 | "area_rng": area_rng,
284 | "dt_ids": [d["id"] for d in dt],
285 | "gt_ids": [g["id"] for g in gt],
286 | "dt_matches": dt_m,
287 | "gt_matches": gt_m,
288 | "dt_scores": [d["score"] for d in dt],
289 | "gt_ignore": gt_ig,
290 | "dt_ignore": dt_ig,
291 | }
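The nested matching loop above greedily assigns each detection (already sorted by score) to the best still-unmatched gt above the IoU threshold, preferring regular gt over ignore gt. A minimal standalone sketch of the same idea; `greedy_match` and the toy IoU matrix are illustrative, not part of the API:

```python
import numpy as np

def greedy_match(ious, gt_ignore, iou_thr=0.5):
    """Greedily match detections (rows) to gt (cols). Returns, per
    detection, the matched gt index or -1. Detections are assumed
    pre-sorted by descending score, gt with ignore flags last."""
    num_dt, num_gt = ious.shape
    gt_matched = np.zeros(num_gt, dtype=bool)
    dt_match = np.full(num_dt, -1)
    for d in range(num_dt):
        best_iou = min(iou_thr, 1 - 1e-10)
        m = -1
        for g in range(num_gt):
            if gt_matched[g]:
                continue
            # once matched to a regular gt, never switch to an ignore gt
            if m > -1 and not gt_ignore[m] and gt_ignore[g]:
                break
            if ious[d, g] < best_iou:
                continue
            best_iou = ious[d, g]
            m = g
        if m > -1:
            gt_matched[m] = True
            dt_match[d] = m
    return dt_match

# two detections, two gt: dt0 overlaps gt1 best, dt1 overlaps gt0
ious = np.array([[0.2, 0.9], [0.8, 0.3]])
print(greedy_match(ious, gt_ignore=[False, False]))  # [1 0]
```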
292 |
293 | def accumulate(self):
294 | """Accumulate per image evaluation results and store the result in
295 | self.eval.
296 | """
297 | self.logger.info("Accumulating evaluation results.")
298 |
299 | if not self.eval_imgs:
300 |             raise RuntimeError("Please run evaluate() first.")
301 |
302 | if self.params.use_cats:
303 | cat_ids = self.params.cat_ids
304 | else:
305 | cat_ids = [-1]
306 |
307 | num_thrs = len(self.params.iou_thrs)
308 | num_recalls = len(self.params.rec_thrs)
309 | num_cats = len(cat_ids)
310 | num_area_rngs = len(self.params.area_rng)
311 | num_imgs = len(self.params.img_ids)
312 |
313 | # -1 for absent categories
314 | precision = -np.ones(
315 | (num_thrs, num_recalls, num_cats, num_area_rngs)
316 | )
317 | recall = -np.ones((num_thrs, num_cats, num_area_rngs))
318 |
319 | # Initialize dt_pointers
320 | dt_pointers = {}
321 | for cat_idx in range(num_cats):
322 | dt_pointers[cat_idx] = {}
323 | for area_idx in range(num_area_rngs):
324 | dt_pointers[cat_idx][area_idx] = {}
325 |
326 | # Per category evaluation
327 | for cat_idx in range(num_cats):
328 | Nk = cat_idx * num_area_rngs * num_imgs
329 | for area_idx in range(num_area_rngs):
330 | Na = area_idx * num_imgs
331 | E = [
332 | self.eval_imgs[Nk + Na + img_idx]
333 | for img_idx in range(num_imgs)
334 | ]
335 | # Remove elements which are None
336 |                 E = [e for e in E if e is not None]
337 | if len(E) == 0:
338 | continue
339 |
340 | # Append all scores: shape (N,)
341 | dt_scores = np.concatenate([e["dt_scores"] for e in E], axis=0)
342 | dt_ids = np.concatenate([e["dt_ids"] for e in E], axis=0)
343 |
344 | dt_idx = np.argsort(-dt_scores, kind="mergesort")
345 | dt_scores = dt_scores[dt_idx]
346 | dt_ids = dt_ids[dt_idx]
347 |
348 | dt_m = np.concatenate([e["dt_matches"] for e in E], axis=1)[:, dt_idx]
349 | dt_ig = np.concatenate([e["dt_ignore"] for e in E], axis=1)[:, dt_idx]
350 |
351 | gt_ig = np.concatenate([e["gt_ignore"] for e in E])
352 | # num gt anns to consider
353 | num_gt = np.count_nonzero(gt_ig == 0)
354 |
355 | if num_gt == 0:
356 | continue
357 |
358 | tps = np.logical_and(dt_m, np.logical_not(dt_ig))
359 | fps = np.logical_and(np.logical_not(dt_m), np.logical_not(dt_ig))
360 |
361 | tp_sum = np.cumsum(tps, axis=1).astype(dtype=float)
362 | fp_sum = np.cumsum(fps, axis=1).astype(dtype=float)
363 |
364 | dt_pointers[cat_idx][area_idx] = {
365 | "dt_ids": dt_ids,
366 | "tps": tps,
367 | "fps": fps,
368 | }
369 |
370 | for iou_thr_idx, (tp, fp) in enumerate(zip(tp_sum, fp_sum)):
371 | tp = np.array(tp)
372 | fp = np.array(fp)
373 | num_tp = len(tp)
374 | rc = tp / num_gt
375 | if num_tp:
376 | recall[iou_thr_idx, cat_idx, area_idx] = rc[
377 | -1
378 | ]
379 | else:
380 | recall[iou_thr_idx, cat_idx, area_idx] = 0
381 |
382 | # np.spacing(1) ~= eps
383 | pr = tp / (fp + tp + np.spacing(1))
384 | pr = pr.tolist()
385 |
386 | # Replace each precision value with the maximum precision
387 | # value to the right of that recall level. This ensures
388 |                     # that the calculated AP value will be less susceptible
389 | # to small variations in the ranking.
390 | for i in range(num_tp - 1, 0, -1):
391 | if pr[i] > pr[i - 1]:
392 | pr[i - 1] = pr[i]
393 |
394 | rec_thrs_insert_idx = np.searchsorted(
395 | rc, self.params.rec_thrs, side="left"
396 | )
397 |
398 | pr_at_recall = [0.0] * num_recalls
399 |
400 | try:
401 | for _idx, pr_idx in enumerate(rec_thrs_insert_idx):
402 | pr_at_recall[_idx] = pr[pr_idx]
403 |                     except IndexError:
404 |                         pass  # recall never reaches the remaining thresholds
405 | precision[iou_thr_idx, :, cat_idx, area_idx] = np.array(pr_at_recall)
406 |
407 | self.eval = {
408 | "params": self.params,
409 | "counts": [num_thrs, num_recalls, num_cats, num_area_rngs],
410 | "date": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
411 | "precision": precision,
412 | "recall": recall,
413 | "dt_pointers": dt_pointers,
414 | }
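The precision/recall bookkeeping above implements COCO-style interpolated AP: precision is made non-increasing from the right, then sampled at fixed recall thresholds. A standalone sketch under the same conventions; `interpolated_precision` is a hypothetical helper, not part of the API:

```python
import numpy as np

def interpolated_precision(tp_flags, num_gt, rec_thrs):
    """tp_flags: detections in descending score order, True for true
    positive. Returns precision sampled at each recall threshold."""
    tp = np.cumsum(tp_flags).astype(float)
    fp = np.cumsum(np.logical_not(tp_flags)).astype(float)
    rc = tp / num_gt
    pr = (tp / (tp + fp + np.spacing(1))).tolist()
    # propagate the max precision to the right of each point leftwards
    for i in range(len(pr) - 1, 0, -1):
        if pr[i] > pr[i - 1]:
            pr[i - 1] = pr[i]
    pr_at_recall = [0.0] * len(rec_thrs)
    for idx, pr_idx in enumerate(np.searchsorted(rc, rec_thrs, side="left")):
        if pr_idx >= len(pr):
            break  # recall never reaches the remaining thresholds
        pr_at_recall[idx] = pr[pr_idx]
    return pr_at_recall

# 3 detections for a category with 2 gt: TP, FP, TP
out = interpolated_precision([True, False, True], num_gt=2,
                             rec_thrs=np.array([0.0, 0.4, 0.9]))
print([round(p, 4) for p in out])  # [1.0, 1.0, 0.6667]
```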
415 |
416 | def _summarize(
417 | self, summary_type, iou_thr=None, area_rng="all", freq_group_idx=None
418 | ):
419 | aidx = [
420 | idx
421 | for idx, _area_rng in enumerate(self.params.area_rng_lbl)
422 | if _area_rng == area_rng
423 | ]
424 |
425 | if summary_type == 'ap':
426 | s = self.eval["precision"]
427 | if iou_thr is not None:
428 | tidx = np.where(iou_thr == self.params.iou_thrs)[0]
429 | s = s[tidx]
430 | if freq_group_idx is not None:
431 | s = s[:, :, self.freq_groups[freq_group_idx], aidx]
432 | else:
433 | s = s[:, :, :, aidx]
434 | else:
435 | s = self.eval["recall"]
436 | if iou_thr is not None:
437 | tidx = np.where(iou_thr == self.params.iou_thrs)[0]
438 | s = s[tidx]
439 | s = s[:, :, aidx]
440 |
441 | if len(s[s > -1]) == 0:
442 | mean_s = -1
443 | else:
444 | mean_s = np.mean(s[s > -1])
445 | return mean_s
446 |
447 | def summarize(self):
448 | """Compute and display summary metrics for evaluation results."""
449 | if not self.eval:
450 | raise RuntimeError("Please run accumulate() first.")
451 |
452 | max_dets = self.params.max_dets
453 |
454 | self.results["AP"] = self._summarize('ap')
455 | self.results["AP50"] = self._summarize('ap', iou_thr=0.50)
456 | self.results["AP75"] = self._summarize('ap', iou_thr=0.75)
457 | self.results["APs"] = self._summarize('ap', area_rng="small")
458 | self.results["APm"] = self._summarize('ap', area_rng="medium")
459 | self.results["APl"] = self._summarize('ap', area_rng="large")
460 | self.results["APr"] = self._summarize('ap', freq_group_idx=0)
461 | self.results["APc"] = self._summarize('ap', freq_group_idx=1)
462 | self.results["APf"] = self._summarize('ap', freq_group_idx=2)
463 |
464 | key = "AR@{}".format(max_dets)
465 | self.results[key] = self._summarize('ar')
466 |
467 | for area_rng in ["small", "medium", "large"]:
468 | key = "AR{}@{}".format(area_rng[0], max_dets)
469 | self.results[key] = self._summarize('ar', area_rng=area_rng)
470 |
471 | def run(self):
472 | """Wrapper function which calculates the results."""
473 | self.evaluate()
474 | self.accumulate()
475 | self.summarize()
476 |
477 | def print_results(self):
478 |         template = " {:<18} {} @[ IoU={:<9} | area={:>6s} | maxDets={:>3d} | catIds={:>3s}] = {:0.3f}"
479 |
480 | for key, value in self.results.items():
481 | max_dets = self.params.max_dets
482 | if "AP" in key:
483 | title = "Average Precision"
484 | _type = "(AP)"
485 | else:
486 | title = "Average Recall"
487 | _type = "(AR)"
488 |
489 | if len(key) > 2 and key[2].isdigit():
490 | iou_thr = (float(key[2:]) / 100)
491 | iou = "{:0.2f}".format(iou_thr)
492 | else:
493 | iou = "{:0.2f}:{:0.2f}".format(
494 | self.params.iou_thrs[0], self.params.iou_thrs[-1]
495 | )
496 |
497 | if len(key) > 2 and key[2] in ["r", "c", "f"]:
498 | cat_group_name = key[2]
499 | else:
500 | cat_group_name = "all"
501 |
502 | if len(key) > 2 and key[2] in ["s", "m", "l"]:
503 | area_rng = key[2]
504 | else:
505 | area_rng = "all"
506 |
507 | print(template.format(title, _type, iou, area_rng, max_dets, cat_group_name, value))
508 |
509 | def get_results(self):
510 | if not self.results:
511 | self.logger.warn("results is empty. Call run().")
512 | return self.results
513 |
514 |
515 | class Params:
516 | def __init__(self, iou_type):
517 | """Params for LVIS evaluation API."""
518 | self.img_ids = []
519 | self.cat_ids = []
520 |         # np.arange causes trouble: floating point error can make the
521 |         # generated points slightly larger than the true values
522 | self.iou_thrs = np.linspace(
523 | 0.5, 0.95, int(np.round((0.95 - 0.5) / 0.05)) + 1, endpoint=True
524 | )
525 | self.rec_thrs = np.linspace(
526 | 0.0, 1.00, int(np.round((1.00 - 0.0) / 0.01)) + 1, endpoint=True
527 | )
528 | self.max_dets = 300
529 | self.area_rng = [
530 | [0 ** 2, 1e5 ** 2],
531 | [0 ** 2, 32 ** 2],
532 | [32 ** 2, 96 ** 2],
533 | [96 ** 2, 1e5 ** 2],
534 | ]
535 | self.area_rng_lbl = ["all", "small", "medium", "large"]
536 | self.use_cats = 1
537 |         # We bin categories into three bins based on how many images of the
538 |         # training set the category is present in.
539 | # r: Rare : < 10
540 | # c: Common : >= 10 and < 100
541 | # f: Frequent: >= 100
542 | self.img_count_lbl = ["r", "c", "f"]
543 | self.iou_type = iou_type
544 |
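The frequency bins and threshold grids defined in `Params` can be sketched as follows; `freq_group` is an illustrative helper, not part of the API:

```python
import numpy as np

def freq_group(image_count):
    """Map a category's training-image count to its LVIS frequency bin:
    rare (< 10), common (10-99), frequent (>= 100)."""
    if image_count < 10:
        return "r"
    if image_count < 100:
        return "c"
    return "f"

print([freq_group(n) for n in (3, 10, 99, 250)])  # ['r', 'c', 'c', 'f']

# linspace hits both endpoints exactly; arange's exclusive stop plus float
# error is why the thresholds are built with linspace instead
iou_thrs = np.linspace(0.5, 0.95, int(np.round((0.95 - 0.5) / 0.05)) + 1)
print(len(iou_thrs), iou_thrs[0], iou_thrs[-1])  # 10 0.5 0.95
```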
--------------------------------------------------------------------------------
/lvis/lvis.py:
--------------------------------------------------------------------------------
1 | """
2 | API for accessing LVIS Dataset: https://lvisdataset.org.
3 |
4 | LVIS API is a Python API that assists in loading, parsing and visualizing
5 | the annotations in LVIS. In addition to this API, please download
6 | images and annotations from the LVIS website.
7 | """
8 |
9 | import json
10 | import os
11 | import logging
12 | from collections import defaultdict
13 | from urllib.request import urlretrieve
14 |
15 | import pycocotools.mask as mask_utils
16 |
17 |
18 | class LVIS:
19 | def __init__(self, annotation_path):
20 | """Class for reading and visualizing annotations.
21 | Args:
22 | annotation_path (str): location of annotation file
23 | """
24 | self.logger = logging.getLogger(__name__)
25 | self.logger.info("Loading annotations.")
26 |
27 | self.dataset = self._load_json(annotation_path)
28 |
29 |         assert isinstance(
30 |             self.dataset, dict
31 |         ), "Annotation file format {} not supported.".format(type(self.dataset))
32 | self._create_index()
33 |
34 | def _load_json(self, path):
35 | with open(path, "r") as f:
36 | return json.load(f)
37 |
38 | def _create_index(self):
39 | self.logger.info("Creating index.")
40 |
41 | self.img_ann_map = defaultdict(list)
42 | self.cat_img_map = defaultdict(list)
43 |
44 | self.anns = {}
45 | self.cats = {}
46 | self.imgs = {}
47 |
48 | for ann in self.dataset["annotations"]:
49 | self.img_ann_map[ann["image_id"]].append(ann)
50 | self.anns[ann["id"]] = ann
51 |
52 | for img in self.dataset["images"]:
53 | self.imgs[img["id"]] = img
54 |
55 | for cat in self.dataset["categories"]:
56 | self.cats[cat["id"]] = cat
57 |
58 | for ann in self.dataset["annotations"]:
59 | self.cat_img_map[ann["category_id"]].append(ann["image_id"])
60 |
61 | self.logger.info("Index created.")
62 |
63 | def get_ann_ids(self, img_ids=None, cat_ids=None, area_rng=None):
64 | """Get ann ids that satisfy given filter conditions.
65 |
66 | Args:
67 | img_ids (int array): get anns for given imgs
68 | cat_ids (int array): get anns for given cats
69 | area_rng (float array): get anns for a given area range. e.g [0, inf]
70 |
71 | Returns:
72 | ids (int array): integer array of ann ids
73 | """
74 | anns = []
75 | if img_ids is not None:
76 | for img_id in img_ids:
77 | anns.extend(self.img_ann_map[img_id])
78 | else:
79 | anns = self.dataset["annotations"]
80 |
81 | # return early if no more filtering required
82 | if cat_ids is None and area_rng is None:
83 | return [_ann["id"] for _ann in anns]
84 |
85 |         cat_ids = set(cat_ids) if cat_ids is not None else None
86 |
87 |         if area_rng is None:
88 |             area_rng = [0, float("inf")]
89 |
90 |         ann_ids = [
91 |             _ann["id"]
92 |             for _ann in anns
93 |             if (cat_ids is None or _ann["category_id"] in cat_ids)
94 | and _ann["area"] > area_rng[0]
95 | and _ann["area"] < area_rng[1]
96 | ]
97 | return ann_ids
98 |
99 | def get_cat_ids(self):
100 | """Get all category ids.
101 |
102 | Returns:
103 | ids (int array): integer array of category ids
104 | """
105 | return list(self.cats.keys())
106 |
107 | def get_img_ids(self):
108 | """Get all img ids.
109 |
110 | Returns:
111 | ids (int array): integer array of image ids
112 | """
113 | return list(self.imgs.keys())
114 |
115 | def _load_helper(self, _dict, ids):
116 | if ids is None:
117 | return list(_dict.values())
118 | else:
119 | return [_dict[id] for id in ids]
120 |
121 | def load_anns(self, ids=None):
122 | """Load anns with the specified ids. If ids=None load all anns.
123 |
124 | Args:
125 | ids (int array): integer array of annotation ids
126 |
127 | Returns:
128 | anns (dict array) : loaded annotation objects
129 | """
130 | return self._load_helper(self.anns, ids)
131 |
132 |     def load_cats(self, ids=None):
133 | """Load categories with the specified ids. If ids=None load all
134 | categories.
135 |
136 | Args:
137 | ids (int array): integer array of category ids
138 |
139 | Returns:
140 | cats (dict array) : loaded category dicts
141 | """
142 | return self._load_helper(self.cats, ids)
143 |
144 |     def load_imgs(self, ids=None):
145 |         """Load images with the specified ids. If ids=None load all images.
146 |
147 | Args:
148 | ids (int array): integer array of image ids
149 |
150 | Returns:
151 | imgs (dict array) : loaded image dicts
152 | """
153 | return self._load_helper(self.imgs, ids)
154 |
155 | def download(self, save_dir, img_ids=None):
156 | """Download images from mscoco.org server.
157 | Args:
158 | save_dir (str): dir to save downloaded images
159 | img_ids (int array): img ids of images to download
160 | """
161 | imgs = self.load_imgs(img_ids)
162 |
163 | if not os.path.exists(save_dir):
164 | os.makedirs(save_dir)
165 |
166 | for img in imgs:
167 | file_name = os.path.join(save_dir, img["coco_url"].split("/")[-1])
168 | if not os.path.exists(file_name):
169 | urlretrieve(img["coco_url"], file_name)
170 |
171 | def ann_to_rle(self, ann):
172 |         """Convert annotation which can be polygons, uncompressed RLE, or RLE to RLE.
173 | Args:
174 | ann (dict) : annotation object
175 |
176 | Returns:
177 | ann (rle)
178 | """
179 | img_data = self.imgs[ann["image_id"]]
180 | h, w = img_data["height"], img_data["width"]
181 | segm = ann["segmentation"]
182 | if isinstance(segm, list):
183 | # polygon -- a single object might consist of multiple parts
184 | # we merge all parts into one mask rle code
185 | rles = mask_utils.frPyObjects(segm, h, w)
186 | rle = mask_utils.merge(rles)
187 | elif isinstance(segm["counts"], list):
188 | # uncompressed RLE
189 | rle = mask_utils.frPyObjects(segm, h, w)
190 | else:
191 | # rle
192 | rle = ann["segmentation"]
193 | return rle
194 |
195 | def ann_to_mask(self, ann):
196 | """Convert annotation which can be polygons, uncompressed RLE, or RLE
197 | to binary mask.
198 | Args:
199 | ann (dict) : annotation object
200 |
201 | Returns:
202 | binary mask (numpy 2D array)
203 | """
204 | rle = self.ann_to_rle(ann)
205 | return mask_utils.decode(rle)
206 |
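`_create_index` builds flat id-to-object maps plus an image-to-annotations multimap and a category-to-images multimap. The same scheme, sketched standalone on a toy dataset (field names follow the LVIS annotation format; the data itself is made up):

```python
from collections import defaultdict

# toy dataset in (abbreviated) LVIS annotation format
dataset = {
    "images": [{"id": 1}, {"id": 2}],
    "categories": [{"id": 7, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 7, "area": 50.0},
        {"id": 11, "image_id": 1, "category_id": 7, "area": 900.0},
        {"id": 12, "image_id": 2, "category_id": 7, "area": 120.0},
    ],
}

img_ann_map = defaultdict(list)  # image id -> its annotation dicts
cat_img_map = defaultdict(list)  # category id -> image ids containing it
anns = {a["id"]: a for a in dataset["annotations"]}
for ann in dataset["annotations"]:
    img_ann_map[ann["image_id"]].append(ann)
    cat_img_map[ann["category_id"]].append(ann["image_id"])

print(sorted(a["id"] for a in img_ann_map[1]))  # [10, 11]
print(cat_img_map[7])                           # [1, 1, 2]
```

Note that `cat_img_map` keeps one entry per annotation, so an image id repeats when a category appears multiple times in it, exactly as in `_create_index`.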
--------------------------------------------------------------------------------
/lvis/results.py:
--------------------------------------------------------------------------------
1 | from copy import deepcopy
2 | import logging
3 | from collections import defaultdict
4 | from lvis.lvis import LVIS
5 |
6 | import pycocotools.mask as mask_utils
7 |
8 |
9 | class LVISResults(LVIS):
10 | def __init__(self, lvis_gt, results, max_dets=300):
11 | """Constructor for LVIS results.
12 | Args:
13 | lvis_gt (LVIS class instance, or str containing path of
14 | annotation file)
15 | results (str containing path of result file or a list of dicts)
16 | max_dets (int): max number of detections per image. The official
17 | value of max_dets for LVIS is 300.
18 | """
19 | if isinstance(lvis_gt, LVIS):
20 | self.dataset = deepcopy(lvis_gt.dataset)
21 | elif isinstance(lvis_gt, str):
22 | self.dataset = self._load_json(lvis_gt)
23 | else:
24 | raise TypeError("Unsupported type {} of lvis_gt.".format(lvis_gt))
25 |
26 | self.logger = logging.getLogger(__name__)
27 | self.logger.info("Loading and preparing results.")
28 |
29 | if isinstance(results, str):
30 | result_anns = self._load_json(results)
31 | else:
32 |             # this pathway is provided to avoid saving and loading results
33 |             # during training.
34 |             self.logger.warn("Assuming user provided the results in the correct format.")
35 | result_anns = results
36 |
37 | assert isinstance(result_anns, list), "results is not a list."
38 |
39 | if max_dets >= 0:
40 | result_anns = self.limit_dets_per_image(result_anns, max_dets)
41 |
42 |         if result_anns and "bbox" in result_anns[0]:
43 | for id, ann in enumerate(result_anns):
44 | x1, y1, w, h = ann["bbox"]
45 | x2 = x1 + w
46 | y2 = y1 + h
47 |
48 | if "segmentation" not in ann:
49 | ann["segmentation"] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
50 |
51 | ann["area"] = w * h
52 | ann["id"] = id + 1
53 |
54 |         elif result_anns and "segmentation" in result_anns[0]:
55 | for id, ann in enumerate(result_anns):
56 | # Only support compressed RLE format as segmentation results
57 | ann["area"] = mask_utils.area(ann["segmentation"])
58 |
59 | if "bbox" not in ann:
60 | ann["bbox"] = mask_utils.toBbox(ann["segmentation"])
61 |
62 | ann["id"] = id + 1
63 |
64 | self.dataset["annotations"] = result_anns
65 | self._create_index()
66 |
67 | img_ids_in_result = [ann["image_id"] for ann in result_anns]
68 |
69 | assert set(img_ids_in_result) == (
70 | set(img_ids_in_result) & set(self.get_img_ids())
71 | ), "Results do not correspond to current LVIS set."
72 |
73 | def limit_dets_per_image(self, anns, max_dets):
74 | img_ann = defaultdict(list)
75 | for ann in anns:
76 | img_ann[ann["image_id"]].append(ann)
77 |
78 | for img_id, _anns in img_ann.items():
79 | if len(_anns) <= max_dets:
80 | continue
81 | _anns = sorted(_anns, key=lambda ann: ann["score"], reverse=True)
82 | img_ann[img_id] = _anns[:max_dets]
83 |
84 | return [ann for anns in img_ann.values() for ann in anns]
85 |
86 | def get_top_results(self, img_id, score_thrs):
87 | ann_ids = self.get_ann_ids(img_ids=[img_id])
88 | anns = self.load_anns(ann_ids)
89 | return list(filter(lambda ann: ann["score"] > score_thrs, anns))
90 |
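`limit_dets_per_image` keeps only the `max_dets` highest-scoring detections for each image. A self-contained sketch of the same logic on made-up detections:

```python
from collections import defaultdict

def limit_dets_per_image(anns, max_dets):
    """Keep the max_dets highest-scoring detections per image."""
    img_ann = defaultdict(list)
    for ann in anns:
        img_ann[ann["image_id"]].append(ann)
    for img_id, _anns in img_ann.items():
        if len(_anns) > max_dets:
            _anns = sorted(_anns, key=lambda a: a["score"], reverse=True)
            img_ann[img_id] = _anns[:max_dets]
    return [ann for anns_ in img_ann.values() for ann in anns_]

dets = [
    {"image_id": 1, "score": 0.9},
    {"image_id": 1, "score": 0.2},
    {"image_id": 1, "score": 0.6},
    {"image_id": 2, "score": 0.5},
]
kept = limit_dets_per_image(dets, max_dets=2)
print(sorted(d["score"] for d in kept))  # [0.5, 0.6, 0.9]
```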
--------------------------------------------------------------------------------
/lvis/vis.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import logging
3 | import os
4 |
5 | import numpy as np
6 | import matplotlib.pyplot as plt
7 | import pycocotools.mask as mask_utils
8 | from matplotlib.patches import Polygon
9 |
10 | from lvis.lvis import LVIS
11 | from lvis.results import LVISResults
12 | from lvis.colormap import colormap
13 |
14 |
15 | class LVISVis:
16 | def __init__(self, lvis_gt, lvis_dt=None, img_dir=None, dpi=75):
17 | """Constructor for LVISVis.
18 | Args:
19 | lvis_gt (LVIS class instance, or str containing path of annotation file)
20 | lvis_dt (LVISResult class instance, or str containing path of result file,
21 | or list of dict)
22 | img_dir (str): path of folder containing all images. If None, the image
23 | to be displayed will be downloaded to the current working dir.
24 | dpi (int): dpi for figure size setup
25 | """
26 | self.logger = logging.getLogger(__name__)
27 |
28 | if isinstance(lvis_gt, LVIS):
29 | self.lvis_gt = lvis_gt
30 | elif isinstance(lvis_gt, str):
31 | self.lvis_gt = LVIS(lvis_gt)
32 | else:
33 | raise TypeError("Unsupported type {} of lvis_gt.".format(lvis_gt))
34 |
35 | if lvis_dt is not None:
36 | if isinstance(lvis_dt, LVISResults):
37 | self.lvis_dt = lvis_dt
38 | elif isinstance(lvis_dt, (str, list)):
39 | self.lvis_dt = LVISResults(self.lvis_gt, lvis_dt)
40 | else:
41 | raise TypeError("Unsupported type {} of lvis_dt.".format(lvis_dt))
42 | else:
43 | self.lvis_dt = None
44 | self.dpi = dpi
45 | self.img_dir = img_dir if img_dir else '.'
46 | if self.img_dir == '.':
47 | self.logger.warn("img_dir not specified. Images will be downloaded.")
48 |
49 | def coco_segm_to_poly(self, _list):
50 | x = _list[0::2]
51 | y = _list[1::2]
52 | points = np.asarray([x, y])
53 | return np.transpose(points)
54 |
55 | def get_synset(self, idx):
56 | synset = self.lvis_gt.load_cats(ids=[idx])[0]["synset"]
57 | text = synset.split(".")
58 | text = "{}.{}".format(text[0], int(text[-1]))
59 | return text
60 |
61 | def setup_figure(self, img, title="", dpi=75):
62 | fig = plt.figure(frameon=False)
63 | fig.set_size_inches(img.shape[1] / dpi, img.shape[0] / dpi)
64 | ax = plt.Axes(fig, [0.0, 0.0, 1.0, 1.0])
65 | ax.set_title(title)
66 | ax.axis("off")
67 | fig.add_axes(ax)
68 | ax.imshow(img)
69 | return fig, ax
70 |
71 | def vis_bbox(self, ax, bbox, box_alpha=0.5, edgecolor="g", linestyle="--"):
72 | # bbox should be of the form x, y, w, h
73 | ax.add_patch(
74 | plt.Rectangle(
75 | (bbox[0], bbox[1]),
76 | bbox[2],
77 | bbox[3],
78 | fill=False,
79 | edgecolor=edgecolor,
80 | linewidth=2.5,
81 | alpha=box_alpha,
82 | linestyle=linestyle,
83 | )
84 | )
85 |
86 | def vis_text(self, ax, bbox, text, color="w"):
87 | ax.text(
88 | bbox[0],
89 | bbox[1] - 2,
90 | text,
91 | fontsize=15,
92 | family="serif",
93 | bbox=dict(facecolor="none", alpha=0.4, pad=0, edgecolor="none"),
94 | color=color,
95 | zorder=10,
96 | )
97 |
98 | def vis_mask(self, ax, segm, color):
99 | # segm is numpy array of shape Nx2
100 | polygon = Polygon(
101 | segm, fill=True, facecolor=color, edgecolor=color, linewidth=3, alpha=0.5
102 | )
103 | ax.add_patch(polygon)
104 |
105 | def get_color(self, idx):
106 | color_list = colormap(rgb=True) / 255
107 | return color_list[idx % len(color_list), 0:3]
108 |
109 | def load_img(self, img_id):
110 | img = self.lvis_gt.load_imgs([img_id])[0]
111 | img_path = os.path.join(self.img_dir, img["coco_url"].split("/")[-1])
112 | if not os.path.exists(img_path):
113 | self.lvis_gt.download(self.img_dir, img_ids=[img_id])
114 | img = cv2.imread(img_path)
115 |         # cv2.imread returns BGR; convert to RGB for matplotlib display
116 |         return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
117 |
118 | def vis_img(
119 | self, img_id, show_boxes=False, show_segms=True, show_classes=False,
120 | cat_ids_to_show=None
121 | ):
122 | ann_ids = self.lvis_gt.get_ann_ids(img_ids=[img_id])
123 | anns = self.lvis_gt.load_anns(ids=ann_ids)
124 | boxes, segms, classes = [], [], []
125 | for ann in anns:
126 | boxes.append(ann["bbox"])
127 | segms.append(ann["segmentation"])
128 | classes.append(ann["category_id"])
129 |
130 | if len(boxes) == 0:
131 | self.logger.warn("No gt anno found for img_id: {}".format(img_id))
132 | return
133 |
134 | boxes = np.asarray(boxes)
135 | areas = boxes[:, 2] * boxes[:, 3]
136 | sorted_inds = np.argsort(-areas)
137 |
138 | fig, ax = self.setup_figure(self.load_img(img_id))
139 |
140 | for idx in sorted_inds:
141 | if cat_ids_to_show is not None and classes[idx] not in cat_ids_to_show:
142 | continue
143 | color = self.get_color(idx)
144 | if show_boxes:
145 | self.vis_bbox(ax, boxes[idx], edgecolor=color)
146 | if show_classes:
147 | text = self.get_synset(classes[idx])
148 | self.vis_text(ax, boxes[idx], text)
149 | if show_segms:
150 | for segm in segms[idx]:
151 | self.vis_mask(ax, self.coco_segm_to_poly(segm), color)
152 |
153 | def vis_result(
154 | self, img_id, show_boxes=False, show_segms=True, show_classes=False,
155 | cat_ids_to_show=None, score_thrs=0.0, show_scores=True
156 | ):
157 | assert self.lvis_dt is not None, "lvis_dt was not specified."
158 | anns = self.lvis_dt.get_top_results(img_id, score_thrs)
159 | boxes, segms, classes, scores = [], [], [], []
160 | for ann in anns:
161 | boxes.append(ann["bbox"])
162 | segms.append(ann["segmentation"])
163 | classes.append(ann["category_id"])
164 | scores.append(ann["score"])
165 |
166 | if len(boxes) == 0:
167 |             self.logger.warn("No detections found for img_id: {}".format(img_id))
168 | return
169 |
170 | boxes = np.asarray(boxes)
171 | areas = boxes[:, 2] * boxes[:, 3]
172 | sorted_inds = np.argsort(-areas)
173 |
174 | fig, ax = self.setup_figure(self.load_img(img_id))
175 |
176 | for idx in sorted_inds:
177 | if cat_ids_to_show is not None and classes[idx] not in cat_ids_to_show:
178 | continue
179 | color = self.get_color(idx)
180 | if show_boxes:
181 | self.vis_bbox(ax, boxes[idx], edgecolor=color)
182 | if show_classes:
183 | text = self.get_synset(classes[idx])
184 | if show_scores:
185 | text = "{}: {:.2f}".format(text, scores[idx])
186 | self.vis_text(ax, boxes[idx], text)
187 | if show_segms:
188 | for segm in segms[idx]:
189 | self.vis_mask(ax, self.coco_segm_to_poly(segm), color)
190 |
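`coco_segm_to_poly` reshapes a flat COCO polygon `[x1, y1, x2, y2, ...]` into the N x 2 point array expected by matplotlib's `Polygon`. A standalone sketch of the reshaping:

```python
import numpy as np

def coco_segm_to_poly(flat):
    """Flat [x1, y1, x2, y2, ...] polygon -> (N, 2) array of points."""
    return np.transpose(np.asarray([flat[0::2], flat[1::2]]))

poly = coco_segm_to_poly([10.0, 20.0, 30.0, 20.0, 30.0, 40.0])
print(poly.shape)     # (3, 2)
print(poly.tolist())  # [[10.0, 20.0], [30.0, 20.0], [30.0, 40.0]]
```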
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | cycler>=0.10.0
2 | Cython>=0.29.12
3 | kiwisolver>=1.1.0
4 | matplotlib>=3.1.1
5 | numpy>=1.18.2
6 | opencv-python>=4.1.0.25
7 | pyparsing>=2.4.0
8 | python-dateutil>=2.8.0
9 | six>=1.12.0
10 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | """LVIS (pronounced ‘el-vis’) is a new dataset for Large Vocabulary Instance Segmentation.
2 | We collect over 2 million high-quality instance segmentation masks for over 1200 entry-level object categories in 164k images. LVIS API enables reading and interacting with annotation files,
3 | visualizing annotations, and evaluating results.
4 |
5 | """
6 | DOCLINES = (__doc__ or '')
7 |
8 | import os.path
9 | import sys
10 | import pip
11 |
12 | import setuptools
13 |
14 | sys.path.insert(0, os.path.join(os.path.dirname(__file__), "lvis"))
15 |
16 | with open("requirements.txt") as f:
17 | reqs = f.read()
18 |
19 | DISTNAME = "lvis"
20 | DESCRIPTION = "Python API for LVIS dataset."
21 | AUTHOR = "Agrim Gupta"
22 | REQUIREMENTS = reqs.strip().split("\n")
23 |
24 |
25 | if __name__ == "__main__":
26 | setuptools.setup(
27 | name=DISTNAME,
28 | install_requires=REQUIREMENTS,
29 | packages=setuptools.find_packages(),
30 | version="0.5.3",
31 | description=DESCRIPTION,
32 | long_description=DOCLINES,
33 | long_description_content_type='text/markdown',
34 | author=AUTHOR
35 | )
36 |
--------------------------------------------------------------------------------
/test.py:
--------------------------------------------------------------------------------
1 | import logging
2 | from lvis import LVIS, LVISResults, LVISEval
3 |
4 | # result and val files for 100 randomly sampled images.
5 | ANNOTATION_PATH = "./data/lvis_val_100.json"
6 | RESULT_PATH = "./data/lvis_results_100.json"
7 |
8 | ANN_TYPE = 'bbox'
9 |
10 | lvis_eval = LVISEval(ANNOTATION_PATH, RESULT_PATH, ANN_TYPE)
11 | lvis_eval.run()
12 | lvis_eval.print_results()
13 |
--------------------------------------------------------------------------------