├── .github └── ISSUE_TEMPLATE │ ├── ---bug-report.md │ └── -questions-help-support.md ├── .gitignore ├── LICENSE ├── README.md ├── data ├── README.md ├── coco_to_synset.json ├── lvis_results_100.json └── lvis_val_100.json ├── images ├── examples.png └── lvis_icon.svg ├── lvis ├── __init__.py ├── colormap.py ├── eval.py ├── lvis.py ├── results.py └── vis.py ├── requirements.txt ├── setup.py └── test.py /.github/ISSUE_TEMPLATE/---bug-report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: " \U0001F41BBug Report" 3 | about: Submit a bug report to help us improve LVIS API 4 | title: '' 5 | labels: bug 6 | assignees: '' 7 | 8 | --- 9 | 10 | ## 🐛 Bug 11 | 12 | 13 | 14 | ## To Reproduce 15 | 16 | Steps to reproduce the behavior. 17 | 18 | 19 | 20 | ## Expected behavior 21 | 22 | 23 | 24 | 25 | ## Additional context 26 | 27 | 28 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/-questions-help-support.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: "❓Questions/Help/Support" 3 | about: Do you need support? 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | ## ❓ Questions and Help 11 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .ipynb_checkpoints 2 | __pycache__ 3 | .DS_Store 4 | dist/* 5 | lvis.egg-info/ 6 | build/* 7 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2019, Agrim Gupta and Ross Girshick 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | 1. 
Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 2. Redistributions in binary form must reproduce the above copyright notice, 10 | this list of conditions and the following disclaimer in the documentation 11 | and/or other materials provided with the distribution. 12 | 13 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 14 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 15 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 16 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR 17 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES 18 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 19 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND 20 | ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 21 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 22 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 23 | 24 | The views and conclusions contained in the software and documentation are those 25 | of the authors and should not be interpreted as representing official policies, 26 | either expressed or implied, of the FreeBSD Project. 27 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # LVIS API 2 | 3 | 4 | LVIS (pronounced ‘el-vis’) is a new dataset for Large Vocabulary Instance Segmentation. 5 | When complete, it will feature more than 2 million high-quality instance segmentation masks for over 1200 entry-level object categories in 164k images. The LVIS API enables reading and interacting with annotation files, visualizing annotations, and evaluating results.
6 | 7 | 8 | 9 | ## LVIS v1.0 10 | 11 | For this release, we have annotated 159,623 images (100k train, 20k val, 20k test-dev, 20k test-challenge). Release v1.0 is publicly available on the [LVIS website](http://www.lvisdataset.org) and will be used in the second LVIS Challenge, to be held at the Joint COCO and LVIS Workshop at ECCV 2020. 12 | 13 | ## Setup 14 | You can set up a virtual environment and then install `lvisapi` using pip: 15 | 16 | ```bash 17 | python3 -m venv env # Create a virtual environment 18 | source env/bin/activate # Activate virtual environment 19 | 20 | # install COCO API. The COCO API requires numpy to build, so make sure numpy is installed first. 21 | pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI' 22 | # install LVIS API 23 | pip install lvis 24 | # Work for a while ... 25 | deactivate # Exit virtual environment 26 | ``` 27 | 28 | You can also clone the repo first and then run the following steps inside it: 29 | ```bash 30 | python3 -m venv env # Create a virtual environment 31 | source env/bin/activate # Activate virtual environment 32 | 33 | # install COCO API. The COCO API requires numpy to build, so make sure numpy is installed first. 34 | pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI' 35 | # install LVIS API 36 | pip install . 37 | # test that the installation was successful 38 | python test.py 39 | # Work for a while ...
40 | deactivate # Exit virtual environment 41 | ``` 42 | ## Citing LVIS 43 | 44 | If you find this code/data useful in your research, please cite our [paper](https://arxiv.org/abs/1908.03195): 45 | ``` 46 | @inproceedings{gupta2019lvis, 47 | title={{LVIS}: A Dataset for Large Vocabulary Instance Segmentation}, 48 | author={Gupta, Agrim and Dollar, Piotr and Girshick, Ross}, 49 | booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition}, 50 | year={2019} 51 | } 52 | ``` 53 | 54 | ## Credit 55 | 56 | The code is a rewrite of the Python API for [COCO](https://github.com/cocodataset/cocoapi). 57 | The core functionality is the same, with LVIS-specific changes. 58 | -------------------------------------------------------------------------------- /data/README.md: -------------------------------------------------------------------------------- 1 | ## Mapping between LVIS and COCO categories 2 | 3 | The JSON file `coco_to_synset.json` provides a mapping from each COCO category 4 | to a synset. The synset can then be used to find the corresponding category in 5 | LVIS. Matching based on synsets (instead of category ids) allows this mapping 6 | to remain correct even if LVIS category ids change (which will likely happen when 7 | upgrading from LVIS release v0.5 to v1.0).
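As a quick illustration of the mapping described above, here is a minimal sketch that inverts the file into a `coco_cat_id` → synset lookup. To keep it self-contained, a two-entry excerpt of `coco_to_synset.json` is inlined rather than read from disk:

```python
import json

# Two entries excerpted verbatim from data/coco_to_synset.json.
excerpt = """
{"bench": {"coco_cat_id": 15, "meaning": "a long seat for more than one person", "synset": "bench.n.01"},
 "couch": {"coco_cat_id": 63, "meaning": "an upholstered seat for more than one person", "synset": "sofa.n.01"}}
"""
coco_to_synset = json.loads(excerpt)

# Invert to a coco_cat_id -> synset lookup; the synset is the stable key
# for matching a COCO category to its LVIS counterpart.
coco_id_to_synset = {v["coco_cat_id"]: v["synset"] for v in coco_to_synset.values()}

print(coco_id_to_synset[63])  # sofa.n.01 -- COCO's "couch" maps to the synset sofa.n.01
```

Note that the COCO display name and the synset lemma can differ ("couch" vs. `sofa.n.01`), which is exactly why the matching goes through synsets rather than names or ids.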
8 | -------------------------------------------------------------------------------- /data/coco_to_synset.json: -------------------------------------------------------------------------------- 1 | {"bench": {"coco_cat_id": 15, "meaning": "a long seat for more than one person", "synset": "bench.n.01"}, "baseball bat": {"coco_cat_id": 39, "meaning": "an implement used in baseball by the batter", "synset": "baseball_bat.n.01"}, "kite": {"coco_cat_id": 38, "meaning": "plaything consisting of a light frame covered with tissue paper; flown in wind at end of a string", "synset": "kite.n.03"}, "orange": {"coco_cat_id": 55, "meaning": "orange (FRUIT of an orange tree)", "synset": "orange.n.01"}, "boat": {"coco_cat_id": 9, "meaning": "a vessel for travel on water", "synset": "boat.n.01"}, "carrot": {"coco_cat_id": 57, "meaning": "deep orange edible root of the cultivated carrot plant", "synset": "carrot.n.01"}, "bicycle": {"coco_cat_id": 2, "meaning": "a wheeled vehicle that has two wheels and is moved by foot pedals", "synset": "bicycle.n.01"}, "book": {"coco_cat_id": 84, "meaning": "a written work or composition that has been published", "synset": "book.n.01"}, "toothbrush": {"coco_cat_id": 90, "meaning": "small brush; has long handle; used to clean teeth", "synset": "toothbrush.n.01"}, "tie": {"coco_cat_id": 32, "meaning": "neckwear consisting of a long narrow piece of material worn under a collar and tied in knot at the front", "synset": "necktie.n.01"}, "sandwich": {"coco_cat_id": 54, "meaning": "two (or more) slices of bread with a filling between them", "synset": "sandwich.n.01"}, "toilet": {"coco_cat_id": 70, "meaning": "a plumbing fixture for defecation and urination", "synset": "toilet.n.02"}, "stop sign": {"coco_cat_id": 13, "meaning": "a traffic sign to notify drivers that they must come to a complete stop", "synset": "stop_sign.n.01"}, "wine glass": {"coco_cat_id": 46, "meaning": "a glass that has a stem and in which wine is served", "synset": "wineglass.n.01"}, 
"clock": {"coco_cat_id": 85, "meaning": "a timepiece that shows the time of day", "synset": "clock.n.01"}, "bear": {"coco_cat_id": 23, "meaning": "large carnivorous or omnivorous mammals with shaggy coats and claws", "synset": "bear.n.01"}, "vase": {"coco_cat_id": 86, "meaning": "an open jar of glass or porcelain used as an ornament or to hold flowers", "synset": "vase.n.01"}, "microwave": {"coco_cat_id": 78, "meaning": "kitchen appliance that cooks food by passing an electromagnetic wave through it", "synset": "microwave.n.02"}, "oven": {"coco_cat_id": 79, "meaning": "kitchen appliance used for baking or roasting", "synset": "oven.n.01"}, "cake": {"coco_cat_id": 61, "meaning": "baked goods made from or based on a mixture of flour, sugar, eggs, and fat", "synset": "cake.n.03"}, "apple": {"coco_cat_id": 53, "meaning": "fruit with red or yellow or green skin and sweet to tart crisp whitish flesh", "synset": "apple.n.01"}, "bed": {"coco_cat_id": 65, "meaning": "a piece of furniture that provides a place to sleep", "synset": "bed.n.01"}, "skis": {"coco_cat_id": 35, "meaning": "sports equipment for skiing on snow", "synset": "ski.n.01"}, "dining table": {"coco_cat_id": 67, "meaning": "a table at which meals are served", "synset": "dining_table.n.01"}, "remote": {"coco_cat_id": 75, "meaning": "a device that can be used to control a machine or apparatus from a distance", "synset": "remote_control.n.01"}, "bird": {"coco_cat_id": 16, "meaning": "animal characterized by feathers and wings", "synset": "bird.n.01"}, "laptop": {"coco_cat_id": 73, "meaning": "a portable computer small enough to use in your lap", "synset": "laptop.n.01"}, "train": {"coco_cat_id": 7, "meaning": "public or private transport provided by a line of railway cars coupled together and drawn by a locomotive", "synset": "train.n.01"}, "mouse": {"coco_cat_id": 74, "meaning": "a computer input device that controls an on-screen pointer", "synset": "mouse.n.04"}, "pizza": {"coco_cat_id": 59, "meaning": 
"Italian open pie made of thin bread dough spread with a spiced mixture of e.g. tomato sauce and cheese", "synset": "pizza.n.01"}, "toaster": {"coco_cat_id": 80, "meaning": "a kitchen appliance (usually electric) for toasting bread", "synset": "toaster.n.02"}, "cell phone": {"coco_cat_id": 77, "meaning": "a hand-held mobile telephone", "synset": "cellular_telephone.n.01"}, "person": {"coco_cat_id": 1, "meaning": "a human being", "synset": "person.n.01"}, "sports ball": {"coco_cat_id": 37, "meaning": "a spherical object used as a plaything", "synset": "ball.n.06"}, "fire hydrant": {"coco_cat_id": 11, "meaning": "an upright hydrant for drawing water to use in fighting a fire", "synset": "fireplug.n.01"}, "umbrella": {"coco_cat_id": 28, "meaning": "a lightweight handheld collapsible canopy", "synset": "umbrella.n.01"}, "truck": {"coco_cat_id": 8, "meaning": "an automotive vehicle suitable for hauling", "synset": "truck.n.01"}, "knife": {"coco_cat_id": 49, "meaning": "tool with a blade and point used as a cutting instrument", "synset": "knife.n.01"}, "baseball glove": {"coco_cat_id": 40, "meaning": "the handwear used by fielders in playing baseball", "synset": "baseball_glove.n.01"}, "giraffe": {"coco_cat_id": 25, "meaning": "tall animal having a spotted coat and small horns and very long neck and legs", "synset": "giraffe.n.01"}, "airplane": {"coco_cat_id": 5, "meaning": "an aircraft that has a fixed wing and is powered by propellers or jets", "synset": "airplane.n.01"}, "parking meter": {"coco_cat_id": 14, "meaning": "a coin-operated timer located next to a parking space", "synset": "parking_meter.n.01"}, "couch": {"coco_cat_id": 63, "meaning": "an upholstered seat for more than one person", "synset": "sofa.n.01"}, "tennis racket": {"coco_cat_id": 43, "meaning": "a racket used to play tennis", "synset": "tennis_racket.n.01"}, "backpack": {"coco_cat_id": 27, "meaning": "a bag carried by a strap on your back or shoulder", "synset": "backpack.n.01"}, "hot dog": 
{"coco_cat_id": 58, "meaning": "a smooth-textured sausage, usually smoked, often served on a bread roll", "synset": "frank.n.02"}, "banana": {"coco_cat_id": 52, "meaning": "elongated crescent-shaped yellow fruit with soft sweet flesh", "synset": "banana.n.02"}, "bowl": {"coco_cat_id": 51, "meaning": "a dish that is round and open at the top for serving foods", "synset": "bowl.n.03"}, "skateboard": {"coco_cat_id": 41, "meaning": "a board with wheels that is ridden in a standing or crouching position and propelled by foot", "synset": "skateboard.n.01"}, "bottle": {"coco_cat_id": 44, "meaning": "a glass or plastic vessel used for storing drinks or other liquids", "synset": "bottle.n.01"}, "dog": {"coco_cat_id": 18, "meaning": "a common domesticated dog", "synset": "dog.n.01"}, "frisbee": {"coco_cat_id": 34, "meaning": "a light, plastic disk propelled with a flip of the wrist for recreation or competition", "synset": "frisbee.n.01"}, "broccoli": {"coco_cat_id": 56, "meaning": "plant with dense clusters of tight green flower buds", "synset": "broccoli.n.01"}, "elephant": {"coco_cat_id": 22, "meaning": "a common elephant", "synset": "elephant.n.01"}, "car": {"coco_cat_id": 3, "meaning": "a motor vehicle with four wheels", "synset": "car.n.01"}, "donut": {"coco_cat_id": 60, "meaning": "a small ring-shaped friedcake", "synset": "doughnut.n.02"}, "suitcase": {"coco_cat_id": 33, "meaning": "cases used to carry belongings when traveling", "synset": "bag.n.06"}, "cup": {"coco_cat_id": 47, "meaning": "a small open container usually used for drinking; usually has a handle", "synset": "cup.n.01"}, "hair drier": {"coco_cat_id": 89, "meaning": "a hand-held electric blower that can blow warm air onto the hair", "synset": "hand_blower.n.01"}, "surfboard": {"coco_cat_id": 42, "meaning": "a narrow buoyant board for riding surf", "synset": "surfboard.n.01"}, "traffic light": {"coco_cat_id": 10, "meaning": "a device to control vehicle traffic often consisting of three or more lights", 
"synset": "traffic_light.n.01"}, "tv": {"coco_cat_id": 72, "meaning": "an electronic device that receives television signals and displays them on a screen", "synset": "television_receiver.n.01"}, "spoon": {"coco_cat_id": 50, "meaning": "a piece of cutlery with a shallow bowl-shaped container and a handle", "synset": "spoon.n.01"}, "horse": {"coco_cat_id": 19, "meaning": "a common horse", "synset": "horse.n.01"}, "motorcycle": {"coco_cat_id": 4, "meaning": "a motor vehicle with two wheels and a strong frame", "synset": "motorcycle.n.01"}, "zebra": {"coco_cat_id": 24, "meaning": "any of several fleet black-and-white striped African equines", "synset": "zebra.n.01"}, "cat": {"coco_cat_id": 17, "meaning": "a domestic house cat", "synset": "cat.n.01"}, "teddy bear": {"coco_cat_id": 88, "meaning": "plaything consisting of a child's toy bear (usually plush and stuffed with soft materials)", "synset": "teddy.n.01"}, "handbag": {"coco_cat_id": 31, "meaning": "a container used for carrying money and small personal items or accessories", "synset": "bag.n.04"}, "sink": {"coco_cat_id": 81, "meaning": "plumbing fixture consisting of a water basin fixed to a wall or floor and having a drainpipe", "synset": "sink.n.01"}, "keyboard": {"coco_cat_id": 76, "meaning": "a keyboard that is a data input device for computers", "synset": "computer_keyboard.n.01"}, "bus": {"coco_cat_id": 6, "meaning": "a vehicle carrying many passengers; used for public transport", "synset": "bus.n.01"}, "fork": {"coco_cat_id": 48, "meaning": "cutlery used for serving and eating food", "synset": "fork.n.01"}, "chair": {"coco_cat_id": 62, "meaning": "a seat for one person, with a support for the back", "synset": "chair.n.01"}, "refrigerator": {"coco_cat_id": 82, "meaning": "a refrigerator in which the coolant is pumped around by an electric motor", "synset": "electric_refrigerator.n.01"}, "scissors": {"coco_cat_id": 87, "meaning": "a tool having two crossed pivoting blades with looped handles", "synset": 
"scissors.n.01"}, "sheep": {"coco_cat_id": 20, "meaning": "woolly usually horned ruminant mammal related to the goat", "synset": "sheep.n.01"}, "potted plant": {"coco_cat_id": 64, "meaning": "a container in which plants are cultivated", "synset": "pot.n.04"}, "snowboard": {"coco_cat_id": 36, "meaning": "a board that resembles a broad ski or a small surfboard; used in a standing position to slide down snow-covered slopes", "synset": "snowboard.n.01"}, "cow": {"coco_cat_id": 21, "meaning": "cattle that are reared for their meat", "synset": "beef.n.01"}} -------------------------------------------------------------------------------- /images/examples.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lvis-dataset/lvis-api/7d7f07def11da91f8b2710ce352c62a78fd5a7ad/images/examples.png -------------------------------------------------------------------------------- /images/lvis_icon.svg: -------------------------------------------------------------------------------- 1 | LVIS-IconMark -------------------------------------------------------------------------------- /lvis/__init__.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from lvis.lvis import LVIS 3 | from lvis.results import LVISResults 4 | from lvis.eval import LVISEval 5 | from lvis.vis import LVISVis 6 | 7 | logging.basicConfig( 8 | format="[%(asctime)s] %(name)s %(levelname)s: %(message)s", datefmt="%m/%d %H:%M:%S", 9 | level=logging.WARN, 10 | ) 11 | 12 | __all__ = ["LVIS", "LVISResults", "LVISEval", "LVISVis"] 13 | -------------------------------------------------------------------------------- /lvis/colormap.py: -------------------------------------------------------------------------------- 1 | """An awesome colormap for really neat visualizations. 
Taken from detectron.""" 2 | 3 | import numpy as np 4 | 5 | def colormap(rgb=False): 6 | color_list = np.array( 7 | [ 8 | 0.000, 9 | 0.447, 10 | 0.741, 11 | 0.850, 12 | 0.325, 13 | 0.098, 14 | 0.929, 15 | 0.694, 16 | 0.125, 17 | 0.494, 18 | 0.184, 19 | 0.556, 20 | 0.466, 21 | 0.674, 22 | 0.188, 23 | 0.301, 24 | 0.745, 25 | 0.933, 26 | 0.635, 27 | 0.078, 28 | 0.184, 29 | 0.300, 30 | 0.300, 31 | 0.300, 32 | 0.600, 33 | 0.600, 34 | 0.600, 35 | 1.000, 36 | 0.000, 37 | 0.000, 38 | 1.000, 39 | 0.500, 40 | 0.000, 41 | 0.749, 42 | 0.749, 43 | 0.000, 44 | 0.000, 45 | 1.000, 46 | 0.000, 47 | 0.000, 48 | 0.000, 49 | 1.000, 50 | 0.667, 51 | 0.000, 52 | 1.000, 53 | 0.333, 54 | 0.333, 55 | 0.000, 56 | 0.333, 57 | 0.667, 58 | 0.000, 59 | 0.333, 60 | 1.000, 61 | 0.000, 62 | 0.667, 63 | 0.333, 64 | 0.000, 65 | 0.667, 66 | 0.667, 67 | 0.000, 68 | 0.667, 69 | 1.000, 70 | 0.000, 71 | 1.000, 72 | 0.333, 73 | 0.000, 74 | 1.000, 75 | 0.667, 76 | 0.000, 77 | 1.000, 78 | 1.000, 79 | 0.000, 80 | 0.000, 81 | 0.333, 82 | 0.500, 83 | 0.000, 84 | 0.667, 85 | 0.500, 86 | 0.000, 87 | 1.000, 88 | 0.500, 89 | 0.333, 90 | 0.000, 91 | 0.500, 92 | 0.333, 93 | 0.333, 94 | 0.500, 95 | 0.333, 96 | 0.667, 97 | 0.500, 98 | 0.333, 99 | 1.000, 100 | 0.500, 101 | 0.667, 102 | 0.000, 103 | 0.500, 104 | 0.667, 105 | 0.333, 106 | 0.500, 107 | 0.667, 108 | 0.667, 109 | 0.500, 110 | 0.667, 111 | 1.000, 112 | 0.500, 113 | 1.000, 114 | 0.000, 115 | 0.500, 116 | 1.000, 117 | 0.333, 118 | 0.500, 119 | 1.000, 120 | 0.667, 121 | 0.500, 122 | 1.000, 123 | 1.000, 124 | 0.500, 125 | 0.000, 126 | 0.333, 127 | 1.000, 128 | 0.000, 129 | 0.667, 130 | 1.000, 131 | 0.000, 132 | 1.000, 133 | 1.000, 134 | 0.333, 135 | 0.000, 136 | 1.000, 137 | 0.333, 138 | 0.333, 139 | 1.000, 140 | 0.333, 141 | 0.667, 142 | 1.000, 143 | 0.333, 144 | 1.000, 145 | 1.000, 146 | 0.667, 147 | 0.000, 148 | 1.000, 149 | 0.667, 150 | 0.333, 151 | 1.000, 152 | 0.667, 153 | 0.667, 154 | 1.000, 155 | 0.667, 156 | 1.000, 157 | 1.000, 158 | 1.000, 159 | 0.000, 
160 | 1.000, 161 | 1.000, 162 | 0.333, 163 | 1.000, 164 | 1.000, 165 | 0.667, 166 | 1.000, 167 | 0.167, 168 | 0.000, 169 | 0.000, 170 | 0.333, 171 | 0.000, 172 | 0.000, 173 | 0.500, 174 | 0.000, 175 | 0.000, 176 | 0.667, 177 | 0.000, 178 | 0.000, 179 | 0.833, 180 | 0.000, 181 | 0.000, 182 | 1.000, 183 | 0.000, 184 | 0.000, 185 | 0.000, 186 | 0.167, 187 | 0.000, 188 | 0.000, 189 | 0.333, 190 | 0.000, 191 | 0.000, 192 | 0.500, 193 | 0.000, 194 | 0.000, 195 | 0.667, 196 | 0.000, 197 | 0.000, 198 | 0.833, 199 | 0.000, 200 | 0.000, 201 | 1.000, 202 | 0.000, 203 | 0.000, 204 | 0.000, 205 | 0.167, 206 | 0.000, 207 | 0.000, 208 | 0.333, 209 | 0.000, 210 | 0.000, 211 | 0.500, 212 | 0.000, 213 | 0.000, 214 | 0.667, 215 | 0.000, 216 | 0.000, 217 | 0.833, 218 | 0.000, 219 | 0.000, 220 | 1.000, 221 | 0.000, 222 | 0.000, 223 | 0.000, 224 | 0.143, 225 | 0.143, 226 | 0.143, 227 | 0.286, 228 | 0.286, 229 | 0.286, 230 | 0.429, 231 | 0.429, 232 | 0.429, 233 | 0.571, 234 | 0.571, 235 | 0.571, 236 | 0.714, 237 | 0.714, 238 | 0.714, 239 | 0.857, 240 | 0.857, 241 | 0.857, 242 | 1.000, 243 | 1.000, 244 | 1.000, 245 | ] 246 | ).astype(np.float32) 247 | color_list = color_list.reshape((-1, 3)) * 255 248 | if not rgb: 249 | color_list = color_list[:, ::-1] 250 | return color_list 251 | -------------------------------------------------------------------------------- /lvis/eval.py: -------------------------------------------------------------------------------- 1 | import datetime 2 | import logging 3 | from collections import OrderedDict 4 | from collections import defaultdict 5 | 6 | import numpy as np 7 | 8 | from lvis.lvis import LVIS 9 | from lvis.results import LVISResults 10 | 11 | import pycocotools.mask as mask_utils 12 | 13 | 14 | class LVISEval: 15 | def __init__(self, lvis_gt, lvis_dt, iou_type="segm"): 16 | """Constructor for LVISEval. 
17 | Args: 18 | lvis_gt (LVIS class instance, or str containing path of annotation file) 19 | lvis_dt (LVISResults class instance, or str containing path of result file, 20 | or list of dict) 21 | iou_type (str): segm or bbox evaluation 22 | """ 23 | self.logger = logging.getLogger(__name__) 24 | 25 | if iou_type not in ["bbox", "segm"]: 26 | raise ValueError("iou_type: {} is not supported.".format(iou_type)) 27 | 28 | if isinstance(lvis_gt, LVIS): 29 | self.lvis_gt = lvis_gt 30 | elif isinstance(lvis_gt, str): 31 | self.lvis_gt = LVIS(lvis_gt) 32 | else: 33 | raise TypeError("Unsupported type {} of lvis_gt.".format(type(lvis_gt))) 34 | 35 | if isinstance(lvis_dt, LVISResults): 36 | self.lvis_dt = lvis_dt 37 | elif isinstance(lvis_dt, (str, list)): 38 | self.lvis_dt = LVISResults(self.lvis_gt, lvis_dt) 39 | else: 40 | raise TypeError("Unsupported type {} of lvis_dt.".format(type(lvis_dt))) 41 | 42 | # per-image per-category evaluation results 43 | self.eval_imgs = defaultdict(list) 44 | self.eval = {} # accumulated evaluation results 45 | self._gts = defaultdict(list) # gt for evaluation 46 | self._dts = defaultdict(list) # dt for evaluation 47 | self.params = Params(iou_type=iou_type) # parameters 48 | self.results = OrderedDict() 49 | self.ious = {} # ious between all gts and dts 50 | 51 | self.params.img_ids = sorted(self.lvis_gt.get_img_ids()) 52 | self.params.cat_ids = sorted(self.lvis_gt.get_cat_ids()) 53 | 54 | def _to_mask(self, anns, lvis): 55 | for ann in anns: 56 | rle = lvis.ann_to_rle(ann) 57 | ann["segmentation"] = rle 58 | 59 | def _prepare(self): 60 | """Prepare self._gts and self._dts for evaluation based on params.""" 61 | 62 | cat_ids = self.params.cat_ids if self.params.cat_ids else None 63 | 64 | gts = self.lvis_gt.load_anns( 65 | self.lvis_gt.get_ann_ids(img_ids=self.params.img_ids, cat_ids=cat_ids) 66 | ) 67 | dts = self.lvis_dt.load_anns( 68 | self.lvis_dt.get_ann_ids(img_ids=self.params.img_ids, cat_ids=cat_ids) 69 | ) 70 | # convert ground truth to
mask if iou_type == 'segm' 71 | if self.params.iou_type == "segm": 72 | self._to_mask(gts, self.lvis_gt) 73 | self._to_mask(dts, self.lvis_dt) 74 | 75 | # set ignore flag 76 | for gt in gts: 77 | if "ignore" not in gt: 78 | gt["ignore"] = 0 79 | 80 | for gt in gts: 81 | self._gts[gt["image_id"], gt["category_id"]].append(gt) 82 | 83 | # For federated dataset evaluation we will filter out all dt for an 84 | # image which belong to categories not present in the gt and not present in 85 | # the negative list for that image. In other words, the detector is not 86 | # penalized for categories for which we have no gt information about 87 | # presence or absence in an image. 88 | img_data = self.lvis_gt.load_imgs(ids=self.params.img_ids) 89 | # per image map of categories not present in image 90 | img_nl = {d["id"]: d["neg_category_ids"] for d in img_data} 91 | # per image list of categories present in image 92 | img_pl = defaultdict(set) 93 | for ann in gts: 94 | img_pl[ann["image_id"]].add(ann["category_id"]) 95 | # per image map of categories which have missing gt. For these 96 | # categories we don't penalize the detector for false positives.
97 | self.img_nel = {d["id"]: d["not_exhaustive_category_ids"] for d in img_data} 98 | 99 | for dt in dts: 100 | img_id, cat_id = dt["image_id"], dt["category_id"] 101 | if cat_id not in img_nl[img_id] and cat_id not in img_pl[img_id]: 102 | continue 103 | self._dts[img_id, cat_id].append(dt) 104 | 105 | self.freq_groups = self._prepare_freq_group() 106 | 107 | def _prepare_freq_group(self): 108 | freq_groups = [[] for _ in self.params.img_count_lbl] 109 | cat_data = self.lvis_gt.load_cats(self.params.cat_ids) 110 | for idx, _cat_data in enumerate(cat_data): 111 | frequency = _cat_data["frequency"] 112 | freq_groups[self.params.img_count_lbl.index(frequency)].append(idx) 113 | return freq_groups 114 | 115 | def evaluate(self): 116 | """ 117 | Run per image evaluation on given images and store results 118 | (a list of dict) in self.eval_imgs. 119 | """ 120 | self.logger.info("Running per image evaluation.") 121 | self.logger.info("Evaluate annotation type *{}*".format(self.params.iou_type)) 122 | 123 | self.params.img_ids = list(np.unique(self.params.img_ids)) 124 | 125 | if self.params.use_cats: 126 | cat_ids = self.params.cat_ids 127 | else: 128 | cat_ids = [-1] 129 | 130 | self._prepare() 131 | 132 | self.ious = { 133 | (img_id, cat_id): self.compute_iou(img_id, cat_id) 134 | for img_id in self.params.img_ids 135 | for cat_id in cat_ids 136 | } 137 | 138 | # loop through images, area range, max detection number 139 | self.eval_imgs = [ 140 | self.evaluate_img(img_id, cat_id, area_rng) 141 | for cat_id in cat_ids 142 | for area_rng in self.params.area_rng 143 | for img_id in self.params.img_ids 144 | ] 145 | 146 | def _get_gt_dt(self, img_id, cat_id): 147 | """Create gt, dt which are list of anns/dets. If use_cats is true 148 | only anns/dets corresponding to tuple (img_id, cat_id) will be 149 | used. Else, all anns/dets in image are used and cat_id is not used. 
150 | """ 151 | if self.params.use_cats: 152 | gt = self._gts[img_id, cat_id] 153 | dt = self._dts[img_id, cat_id] 154 | else: 155 | gt = [ 156 | _ann 157 | for _cat_id in self.params.cat_ids 158 | for _ann in self._gts[img_id, _cat_id] 159 | ] 160 | dt = [ 161 | _ann 162 | for _cat_id in self.params.cat_ids 163 | for _ann in self._dts[img_id, _cat_id] 164 | ] 165 | return gt, dt 166 | 167 | def compute_iou(self, img_id, cat_id): 168 | gt, dt = self._get_gt_dt(img_id, cat_id) 169 | 170 | if len(gt) == 0 and len(dt) == 0: 171 | return [] 172 | 173 | # Sort detections in decreasing order of score. 174 | idx = np.argsort([-d["score"] for d in dt], kind="mergesort") 175 | dt = [dt[i] for i in idx] 176 | 177 | iscrowd = [int(False)] * len(gt) 178 | 179 | if self.params.iou_type == "segm": 180 | ann_type = "segmentation" 181 | elif self.params.iou_type == "bbox": 182 | ann_type = "bbox" 183 | else: 184 | raise ValueError("Unknown iou_type for iou computation.") 185 | gt = [g[ann_type] for g in gt] 186 | dt = [d[ann_type] for d in dt] 187 | 188 | # compute iou between each dt and gt region 189 | # will return array of shape len(dt), len(gt) 190 | ious = mask_utils.iou(dt, gt, iscrowd) 191 | return ious 192 | 193 | def evaluate_img(self, img_id, cat_id, area_rng): 194 | """Perform evaluation for single category and image.""" 195 | gt, dt = self._get_gt_dt(img_id, cat_id) 196 | 197 | if len(gt) == 0 and len(dt) == 0: 198 | return None 199 | 200 | # Add another field _ignore to only consider anns based on area range.
201 | for g in gt: 202 | if g["ignore"] or (g["area"] < area_rng[0] or g["area"] > area_rng[1]): 203 | g["_ignore"] = 1 204 | else: 205 | g["_ignore"] = 0 206 | 207 | # Sort gt ignore last 208 | gt_idx = np.argsort([g["_ignore"] for g in gt], kind="mergesort") 209 | gt = [gt[i] for i in gt_idx] 210 | 211 | # Sort dt highest score first 212 | dt_idx = np.argsort([-d["score"] for d in dt], kind="mergesort") 213 | dt = [dt[i] for i in dt_idx] 214 | 215 | # load computed ious 216 | ious = ( 217 | self.ious[img_id, cat_id][:, gt_idx] 218 | if len(self.ious[img_id, cat_id]) > 0 219 | else self.ious[img_id, cat_id] 220 | ) 221 | 222 | num_thrs = len(self.params.iou_thrs) 223 | num_gt = len(gt) 224 | num_dt = len(dt) 225 | 226 | # Array to store the "id" of the matched dt/gt 227 | gt_m = np.zeros((num_thrs, num_gt)) 228 | dt_m = np.zeros((num_thrs, num_dt)) 229 | 230 | gt_ig = np.array([g["_ignore"] for g in gt]) 231 | dt_ig = np.zeros((num_thrs, num_dt)) 232 | 233 | for iou_thr_idx, iou_thr in enumerate(self.params.iou_thrs): 234 | if len(ious) == 0: 235 | break 236 | 237 | for dt_idx, _dt in enumerate(dt): 238 | iou = min([iou_thr, 1 - 1e-10]) 239 | # information about best match so far (m=-1 -> unmatched) 240 | # store the gt_idx which matched for _dt 241 | m = -1 242 | for gt_idx, _ in enumerate(gt): 243 | # if this gt already matched continue 244 | if gt_m[iou_thr_idx, gt_idx] > 0: 245 | continue 246 | # if _dt already matched a regular gt and only ignore gts remain, stop 247 | if m > -1 and gt_ig[m] == 0 and gt_ig[gt_idx] == 1: 248 | break 249 | # continue to next gt unless better match made 250 | if ious[dt_idx, gt_idx] < iou: 251 | continue 252 | # if match successful and best so far, store appropriately 253 | iou = ious[dt_idx, gt_idx] 254 | m = gt_idx 255 | 256 | # No match found for _dt, go to next _dt 257 | if m == -1: 258 | continue 259 | 260 | # if the matched gt is to be ignored for some reason, update dt_ig. 261 | # Such matches should not be used in evaluation.
262 | dt_ig[iou_thr_idx, dt_idx] = gt_ig[m] 263 | # _dt match found, update gt_m, and dt_m with "id" 264 | dt_m[iou_thr_idx, dt_idx] = gt[m]["id"] 265 | gt_m[iou_thr_idx, m] = _dt["id"] 266 | 267 | # For LVIS we will ignore any unmatched detection if that category was 268 | # not exhaustively annotated in gt. 269 | dt_ig_mask = [ 270 | d["area"] < area_rng[0] 271 | or d["area"] > area_rng[1] 272 | or d["category_id"] in self.img_nel[d["image_id"]] 273 | for d in dt 274 | ] 275 | dt_ig_mask = np.array(dt_ig_mask).reshape((1, num_dt)) # 1 X num_dt 276 | dt_ig_mask = np.repeat(dt_ig_mask, num_thrs, 0) # num_thrs X num_dt 277 | # Based on dt_ig_mask ignore any unmatched detection by updating dt_ig 278 | dt_ig = np.logical_or(dt_ig, np.logical_and(dt_m == 0, dt_ig_mask)) 279 | # store results for given image and category 280 | return { 281 | "image_id": img_id, 282 | "category_id": cat_id, 283 | "area_rng": area_rng, 284 | "dt_ids": [d["id"] for d in dt], 285 | "gt_ids": [g["id"] for g in gt], 286 | "dt_matches": dt_m, 287 | "gt_matches": gt_m, 288 | "dt_scores": [d["score"] for d in dt], 289 | "gt_ignore": gt_ig, 290 | "dt_ignore": dt_ig, 291 | } 292 | 293 | def accumulate(self): 294 | """Accumulate per image evaluation results and store the result in 295 | self.eval. 
296 | """ 297 | self.logger.info("Accumulating evaluation results.") 298 | 299 | if not self.eval_imgs: 300 | self.logger.warn("Please run evaluate first.") 301 | 302 | if self.params.use_cats: 303 | cat_ids = self.params.cat_ids 304 | else: 305 | cat_ids = [-1] 306 | 307 | num_thrs = len(self.params.iou_thrs) 308 | num_recalls = len(self.params.rec_thrs) 309 | num_cats = len(cat_ids) 310 | num_area_rngs = len(self.params.area_rng) 311 | num_imgs = len(self.params.img_ids) 312 | 313 | # -1 for absent categories 314 | precision = -np.ones( 315 | (num_thrs, num_recalls, num_cats, num_area_rngs) 316 | ) 317 | recall = -np.ones((num_thrs, num_cats, num_area_rngs)) 318 | 319 | # Initialize dt_pointers 320 | dt_pointers = {} 321 | for cat_idx in range(num_cats): 322 | dt_pointers[cat_idx] = {} 323 | for area_idx in range(num_area_rngs): 324 | dt_pointers[cat_idx][area_idx] = {} 325 | 326 | # Per category evaluation 327 | for cat_idx in range(num_cats): 328 | Nk = cat_idx * num_area_rngs * num_imgs 329 | for area_idx in range(num_area_rngs): 330 | Na = area_idx * num_imgs 331 | E = [ 332 | self.eval_imgs[Nk + Na + img_idx] 333 | for img_idx in range(num_imgs) 334 | ] 335 | # Remove elements which are None 336 | E = [e for e in E if not e is None] 337 | if len(E) == 0: 338 | continue 339 | 340 | # Append all scores: shape (N,) 341 | dt_scores = np.concatenate([e["dt_scores"] for e in E], axis=0) 342 | dt_ids = np.concatenate([e["dt_ids"] for e in E], axis=0) 343 | 344 | dt_idx = np.argsort(-dt_scores, kind="mergesort") 345 | dt_scores = dt_scores[dt_idx] 346 | dt_ids = dt_ids[dt_idx] 347 | 348 | dt_m = np.concatenate([e["dt_matches"] for e in E], axis=1)[:, dt_idx] 349 | dt_ig = np.concatenate([e["dt_ignore"] for e in E], axis=1)[:, dt_idx] 350 | 351 | gt_ig = np.concatenate([e["gt_ignore"] for e in E]) 352 | # num gt anns to consider 353 | num_gt = np.count_nonzero(gt_ig == 0) 354 | 355 | if num_gt == 0: 356 | continue 357 | 358 | tps = np.logical_and(dt_m, 
np.logical_not(dt_ig)) 359 | fps = np.logical_and(np.logical_not(dt_m), np.logical_not(dt_ig)) 360 | 361 | tp_sum = np.cumsum(tps, axis=1).astype(dtype=float) 362 | fp_sum = np.cumsum(fps, axis=1).astype(dtype=float) 363 | 364 | dt_pointers[cat_idx][area_idx] = { 365 | "dt_ids": dt_ids, 366 | "tps": tps, 367 | "fps": fps, 368 | } 369 | 370 | for iou_thr_idx, (tp, fp) in enumerate(zip(tp_sum, fp_sum)): 371 | tp = np.array(tp) 372 | fp = np.array(fp) 373 | num_tp = len(tp) 374 | rc = tp / num_gt 375 | if num_tp: 376 | recall[iou_thr_idx, cat_idx, area_idx] = rc[ 377 | -1 378 | ] 379 | else: 380 | recall[iou_thr_idx, cat_idx, area_idx] = 0 381 | 382 | # np.spacing(1) ~= eps 383 | pr = tp / (fp + tp + np.spacing(1)) 384 | pr = pr.tolist() 385 | 386 | # Replace each precision value with the maximum precision 387 | # value to the right of that recall level. This ensures 388 | # that the calculated AP value will be less susceptible 389 | # to small variations in the ranking. 390 | for i in range(num_tp - 1, 0, -1): 391 | if pr[i] > pr[i - 1]: 392 | pr[i - 1] = pr[i] 393 | 394 | rec_thrs_insert_idx = np.searchsorted( 395 | rc, self.params.rec_thrs, side="left" 396 | ) 397 | 398 | pr_at_recall = [0.0] * num_recalls 399 | 400 | try: 401 | for _idx, pr_idx in enumerate(rec_thrs_insert_idx): 402 | pr_at_recall[_idx] = pr[pr_idx] 403 | except IndexError: 404 | pass 405 | precision[iou_thr_idx, :, cat_idx, area_idx] = np.array(pr_at_recall) 406 | 407 | self.eval = { 408 | "params": self.params, 409 | "counts": [num_thrs, num_recalls, num_cats, num_area_rngs], 410 | "date": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), 411 | "precision": precision, 412 | "recall": recall, 413 | "dt_pointers": dt_pointers, 414 | } 415 | 416 | def _summarize( 417 | self, summary_type, iou_thr=None, area_rng="all", freq_group_idx=None 418 | ): 419 | aidx = [ 420 | idx 421 | for idx, _area_rng in enumerate(self.params.area_rng_lbl) 422 | if _area_rng == area_rng 423 | ] 424 | 425 | if summary_type == 
'ap': 426 | s = self.eval["precision"] 427 | if iou_thr is not None: 428 | tidx = np.where(iou_thr == self.params.iou_thrs)[0] 429 | s = s[tidx] 430 | if freq_group_idx is not None: 431 | s = s[:, :, self.freq_groups[freq_group_idx], aidx] 432 | else: 433 | s = s[:, :, :, aidx] 434 | else: 435 | s = self.eval["recall"] 436 | if iou_thr is not None: 437 | tidx = np.where(iou_thr == self.params.iou_thrs)[0] 438 | s = s[tidx] 439 | s = s[:, :, aidx] 440 | 441 | if len(s[s > -1]) == 0: 442 | mean_s = -1 443 | else: 444 | mean_s = np.mean(s[s > -1]) 445 | return mean_s 446 | 447 | def summarize(self): 448 | """Compute and display summary metrics for evaluation results.""" 449 | if not self.eval: 450 | raise RuntimeError("Please run accumulate() first.") 451 | 452 | max_dets = self.params.max_dets 453 | 454 | self.results["AP"] = self._summarize('ap') 455 | self.results["AP50"] = self._summarize('ap', iou_thr=0.50) 456 | self.results["AP75"] = self._summarize('ap', iou_thr=0.75) 457 | self.results["APs"] = self._summarize('ap', area_rng="small") 458 | self.results["APm"] = self._summarize('ap', area_rng="medium") 459 | self.results["APl"] = self._summarize('ap', area_rng="large") 460 | self.results["APr"] = self._summarize('ap', freq_group_idx=0) 461 | self.results["APc"] = self._summarize('ap', freq_group_idx=1) 462 | self.results["APf"] = self._summarize('ap', freq_group_idx=2) 463 | 464 | key = "AR@{}".format(max_dets) 465 | self.results[key] = self._summarize('ar') 466 | 467 | for area_rng in ["small", "medium", "large"]: 468 | key = "AR{}@{}".format(area_rng[0], max_dets) 469 | self.results[key] = self._summarize('ar', area_rng=area_rng) 470 | 471 | def run(self): 472 | """Wrapper function which calculates the results.""" 473 | self.evaluate() 474 | self.accumulate() 475 | self.summarize() 476 | 477 | def print_results(self): 478 | template = " {:<18} {} @[ IoU={:<9} | area={:>6s} | maxDets={:>3d} catIds={:>3s}] = {:0.3f}" 479 | 480 | for key, value in 
self.results.items(): 481 | max_dets = self.params.max_dets 482 | if "AP" in key: 483 | title = "Average Precision" 484 | _type = "(AP)" 485 | else: 486 | title = "Average Recall" 487 | _type = "(AR)" 488 | 489 | if len(key) > 2 and key[2].isdigit(): 490 | iou_thr = (float(key[2:]) / 100) 491 | iou = "{:0.2f}".format(iou_thr) 492 | else: 493 | iou = "{:0.2f}:{:0.2f}".format( 494 | self.params.iou_thrs[0], self.params.iou_thrs[-1] 495 | ) 496 | 497 | if len(key) > 2 and key[2] in ["r", "c", "f"]: 498 | cat_group_name = key[2] 499 | else: 500 | cat_group_name = "all" 501 | 502 | if len(key) > 2 and key[2] in ["s", "m", "l"]: 503 | area_rng = key[2] 504 | else: 505 | area_rng = "all" 506 | 507 | print(template.format(title, _type, iou, area_rng, max_dets, cat_group_name, value)) 508 | 509 | def get_results(self): 510 | if not self.results: 511 | self.logger.warn("results is empty. Call run().") 512 | return self.results 513 | 514 | 515 | class Params: 516 | def __init__(self, iou_type): 517 | """Params for LVIS evaluation API.""" 518 | self.img_ids = [] 519 | self.cat_ids = [] 520 | # np.arange causes trouble. the data point on arange is slightly 521 | # larger than the true value 522 | self.iou_thrs = np.linspace( 523 | 0.5, 0.95, int(np.round((0.95 - 0.5) / 0.05)) + 1, endpoint=True 524 | ) 525 | self.rec_thrs = np.linspace( 526 | 0.0, 1.00, int(np.round((1.00 - 0.0) / 0.01)) + 1, endpoint=True 527 | ) 528 | self.max_dets = 300 529 | self.area_rng = [ 530 | [0 ** 2, 1e5 ** 2], 531 | [0 ** 2, 32 ** 2], 532 | [32 ** 2, 96 ** 2], 533 | [96 ** 2, 1e5 ** 2], 534 | ] 535 | self.area_rng_lbl = ["all", "small", "medium", "large"] 536 | self.use_cats = 1 537 | # We bin categories in three bins based how many images of the training 538 | # set the category is present in. 
539 | # r: Rare : < 10 540 | # c: Common : >= 10 and < 100 541 | # f: Frequent: >= 100 542 | self.img_count_lbl = ["r", "c", "f"] 543 | self.iou_type = iou_type 544 | -------------------------------------------------------------------------------- /lvis/lvis.py: -------------------------------------------------------------------------------- 1 | """ 2 | API for accessing LVIS Dataset: https://lvisdataset.org. 3 | 4 | LVIS API is a Python API that assists in loading, parsing and visualizing 5 | the annotations in LVIS. In addition to this API, please download 6 | images and annotations from the LVIS website. 7 | """ 8 | 9 | import json 10 | import os 11 | import logging 12 | from collections import defaultdict 13 | from urllib.request import urlretrieve 14 | 15 | import pycocotools.mask as mask_utils 16 | 17 | 18 | class LVIS: 19 | def __init__(self, annotation_path): 20 | """Class for reading and visualizing annotations. 21 | Args: 22 | annotation_path (str): location of annotation file 23 | """ 24 | self.logger = logging.getLogger(__name__) 25 | self.logger.info("Loading annotations.") 26 | 27 | self.dataset = self._load_json(annotation_path) 28 | 29 | assert ( 30 | type(self.dataset) == dict 31 | ), "Annotation file format {} not supported.".format(type(self.dataset)) 32 | self._create_index() 33 | 34 | def _load_json(self, path): 35 | with open(path, "r") as f: 36 | return json.load(f) 37 | 38 | def _create_index(self): 39 | self.logger.info("Creating index.") 40 | 41 | self.img_ann_map = defaultdict(list) 42 | self.cat_img_map = defaultdict(list) 43 | 44 | self.anns = {} 45 | self.cats = {} 46 | self.imgs = {} 47 | 48 | for ann in self.dataset["annotations"]: 49 | self.img_ann_map[ann["image_id"]].append(ann) 50 | self.anns[ann["id"]] = ann 51 | 52 | for img in self.dataset["images"]: 53 | self.imgs[img["id"]] = img 54 | 55 | for cat in self.dataset["categories"]: 56 | self.cats[cat["id"]] = cat 57 | 58 | for ann in self.dataset["annotations"]: 59 | 
self.cat_img_map[ann["category_id"]].append(ann["image_id"]) 60 | 61 | self.logger.info("Index created.") 62 | 63 | def get_ann_ids(self, img_ids=None, cat_ids=None, area_rng=None): 64 | """Get ann ids that satisfy given filter conditions. 65 | 66 | Args: 67 | img_ids (int array): get anns for given imgs 68 | cat_ids (int array): get anns for given cats 69 | area_rng (float array): get anns for a given area range. e.g [0, inf] 70 | 71 | Returns: 72 | ids (int array): integer array of ann ids 73 | """ 74 | anns = [] 75 | if img_ids is not None: 76 | for img_id in img_ids: 77 | anns.extend(self.img_ann_map[img_id]) 78 | else: 79 | anns = self.dataset["annotations"] 80 | 81 | # return early if no more filtering required 82 | if cat_ids is None and area_rng is None: 83 | return [_ann["id"] for _ann in anns] 84 | 85 | cat_ids = set(cat_ids) 86 | 87 | if area_rng is None: 88 | area_rng = [0, float("inf")] 89 | 90 | ann_ids = [ 91 | _ann["id"] 92 | for _ann in anns 93 | if _ann["category_id"] in cat_ids 94 | and _ann["area"] > area_rng[0] 95 | and _ann["area"] < area_rng[1] 96 | ] 97 | return ann_ids 98 | 99 | def get_cat_ids(self): 100 | """Get all category ids. 101 | 102 | Returns: 103 | ids (int array): integer array of category ids 104 | """ 105 | return list(self.cats.keys()) 106 | 107 | def get_img_ids(self): 108 | """Get all img ids. 109 | 110 | Returns: 111 | ids (int array): integer array of image ids 112 | """ 113 | return list(self.imgs.keys()) 114 | 115 | def _load_helper(self, _dict, ids): 116 | if ids is None: 117 | return list(_dict.values()) 118 | else: 119 | return [_dict[id] for id in ids] 120 | 121 | def load_anns(self, ids=None): 122 | """Load anns with the specified ids. If ids=None load all anns. 
123 | 124 | Args: 125 | ids (int array): integer array of annotation ids 126 | 127 | Returns: 128 | anns (dict array) : loaded annotation objects 129 | """ 130 | return self._load_helper(self.anns, ids) 131 | 132 | def load_cats(self, ids): 133 | """Load categories with the specified ids. If ids=None load all 134 | categories. 135 | 136 | Args: 137 | ids (int array): integer array of category ids 138 | 139 | Returns: 140 | cats (dict array) : loaded category dicts 141 | """ 142 | return self._load_helper(self.cats, ids) 143 | 144 | def load_imgs(self, ids): 145 | """Load images with the specified ids. If ids=None load all images. 146 | 147 | Args: 148 | ids (int array): integer array of image ids 149 | 150 | Returns: 151 | imgs (dict array) : loaded image dicts 152 | """ 153 | return self._load_helper(self.imgs, ids) 154 | 155 | def download(self, save_dir, img_ids=None): 156 | """Download images from mscoco.org server. 157 | Args: 158 | save_dir (str): dir to save downloaded images 159 | img_ids (int array): img ids of images to download 160 | """ 161 | imgs = self.load_imgs(img_ids) 162 | 163 | if not os.path.exists(save_dir): 164 | os.makedirs(save_dir) 165 | 166 | for img in imgs: 167 | file_name = os.path.join(save_dir, img["coco_url"].split("/")[-1]) 168 | if not os.path.exists(file_name): 169 | urlretrieve(img["coco_url"], file_name) 170 | 171 | def ann_to_rle(self, ann): 172 | """Convert annotation which can be polygons, uncompressed RLE to RLE. 
173 | Args: 174 | ann (dict) : annotation object 175 | 176 | Returns: 177 | ann (rle) 178 | """ 179 | img_data = self.imgs[ann["image_id"]] 180 | h, w = img_data["height"], img_data["width"] 181 | segm = ann["segmentation"] 182 | if isinstance(segm, list): 183 | # polygon -- a single object might consist of multiple parts 184 | # we merge all parts into one mask rle code 185 | rles = mask_utils.frPyObjects(segm, h, w) 186 | rle = mask_utils.merge(rles) 187 | elif isinstance(segm["counts"], list): 188 | # uncompressed RLE 189 | rle = mask_utils.frPyObjects(segm, h, w) 190 | else: 191 | # rle 192 | rle = ann["segmentation"] 193 | return rle 194 | 195 | def ann_to_mask(self, ann): 196 | """Convert annotation which can be polygons, uncompressed RLE, or RLE 197 | to binary mask. 198 | Args: 199 | ann (dict) : annotation object 200 | 201 | Returns: 202 | binary mask (numpy 2D array) 203 | """ 204 | rle = self.ann_to_rle(ann) 205 | return mask_utils.decode(rle) 206 | -------------------------------------------------------------------------------- /lvis/results.py: -------------------------------------------------------------------------------- 1 | from copy import deepcopy 2 | import logging 3 | from collections import defaultdict 4 | from lvis.lvis import LVIS 5 | 6 | import pycocotools.mask as mask_utils 7 | 8 | 9 | class LVISResults(LVIS): 10 | def __init__(self, lvis_gt, results, max_dets=300): 11 | """Constructor for LVIS results. 12 | Args: 13 | lvis_gt (LVIS class instance, or str containing path of 14 | annotation file) 15 | results (str containing path of result file or a list of dicts) 16 | max_dets (int): max number of detections per image. The official 17 | value of max_dets for LVIS is 300. 
18 | """ 19 | if isinstance(lvis_gt, LVIS): 20 | self.dataset = deepcopy(lvis_gt.dataset) 21 | elif isinstance(lvis_gt, str): 22 | self.dataset = self._load_json(lvis_gt) 23 | else: 24 | raise TypeError("Unsupported type {} of lvis_gt.".format(lvis_gt)) 25 | 26 | self.logger = logging.getLogger(__name__) 27 | self.logger.info("Loading and preparing results.") 28 | 29 | if isinstance(results, str): 30 | result_anns = self._load_json(results) 31 | else: 32 | # this path way is provided to avoid saving and loading result 33 | # during training. 34 | self.logger.warn("Assuming user provided the results in correct format.") 35 | result_anns = results 36 | 37 | assert isinstance(result_anns, list), "results is not a list." 38 | 39 | if max_dets >= 0: 40 | result_anns = self.limit_dets_per_image(result_anns, max_dets) 41 | 42 | if "bbox" in result_anns[0]: 43 | for id, ann in enumerate(result_anns): 44 | x1, y1, w, h = ann["bbox"] 45 | x2 = x1 + w 46 | y2 = y1 + h 47 | 48 | if "segmentation" not in ann: 49 | ann["segmentation"] = [[x1, y1, x1, y2, x2, y2, x2, y1]] 50 | 51 | ann["area"] = w * h 52 | ann["id"] = id + 1 53 | 54 | elif "segmentation" in result_anns[0]: 55 | for id, ann in enumerate(result_anns): 56 | # Only support compressed RLE format as segmentation results 57 | ann["area"] = mask_utils.area(ann["segmentation"]) 58 | 59 | if "bbox" not in ann: 60 | ann["bbox"] = mask_utils.toBbox(ann["segmentation"]) 61 | 62 | ann["id"] = id + 1 63 | 64 | self.dataset["annotations"] = result_anns 65 | self._create_index() 66 | 67 | img_ids_in_result = [ann["image_id"] for ann in result_anns] 68 | 69 | assert set(img_ids_in_result) == ( 70 | set(img_ids_in_result) & set(self.get_img_ids()) 71 | ), "Results do not correspond to current LVIS set." 
72 | 73 | def limit_dets_per_image(self, anns, max_dets): 74 | img_ann = defaultdict(list) 75 | for ann in anns: 76 | img_ann[ann["image_id"]].append(ann) 77 | 78 | for img_id, _anns in img_ann.items(): 79 | if len(_anns) <= max_dets: 80 | continue 81 | _anns = sorted(_anns, key=lambda ann: ann["score"], reverse=True) 82 | img_ann[img_id] = _anns[:max_dets] 83 | 84 | return [ann for anns in img_ann.values() for ann in anns] 85 | 86 | def get_top_results(self, img_id, score_thrs): 87 | ann_ids = self.get_ann_ids(img_ids=[img_id]) 88 | anns = self.load_anns(ann_ids) 89 | return list(filter(lambda ann: ann["score"] > score_thrs, anns)) 90 | -------------------------------------------------------------------------------- /lvis/vis.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import logging 3 | import os 4 | 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | import pycocotools.mask as mask_utils 8 | from matplotlib.patches import Polygon 9 | 10 | from lvis.lvis import LVIS 11 | from lvis.results import LVISResults 12 | from lvis.colormap import colormap 13 | 14 | 15 | class LVISVis: 16 | def __init__(self, lvis_gt, lvis_dt=None, img_dir=None, dpi=75): 17 | """Constructor for LVISVis. 18 | Args: 19 | lvis_gt (LVIS class instance, or str containing path of annotation file) 20 | lvis_dt (LVISResult class instance, or str containing path of result file, 21 | or list of dict) 22 | img_dir (str): path of folder containing all images. If None, the image 23 | to be displayed will be downloaded to the current working dir. 
24 | dpi (int): dpi for figure size setup 25 | """ 26 | self.logger = logging.getLogger(__name__) 27 | 28 | if isinstance(lvis_gt, LVIS): 29 | self.lvis_gt = lvis_gt 30 | elif isinstance(lvis_gt, str): 31 | self.lvis_gt = LVIS(lvis_gt) 32 | else: 33 | raise TypeError("Unsupported type {} of lvis_gt.".format(lvis_gt)) 34 | 35 | if lvis_dt is not None: 36 | if isinstance(lvis_dt, LVISResults): 37 | self.lvis_dt = lvis_dt 38 | elif isinstance(lvis_dt, (str, list)): 39 | self.lvis_dt = LVISResults(self.lvis_gt, lvis_dt) 40 | else: 41 | raise TypeError("Unsupported type {} of lvis_dt.".format(lvis_dt)) 42 | else: 43 | self.lvis_dt = None 44 | self.dpi = dpi 45 | self.img_dir = img_dir if img_dir else '.' 46 | if self.img_dir == '.': 47 | self.logger.warn("img_dir not specified. Images will be downloaded.") 48 | 49 | def coco_segm_to_poly(self, _list): 50 | x = _list[0::2] 51 | y = _list[1::2] 52 | points = np.asarray([x, y]) 53 | return np.transpose(points) 54 | 55 | def get_synset(self, idx): 56 | synset = self.lvis_gt.load_cats(ids=[idx])[0]["synset"] 57 | text = synset.split(".") 58 | text = "{}.{}".format(text[0], int(text[-1])) 59 | return text 60 | 61 | def setup_figure(self, img, title="", dpi=75): 62 | fig = plt.figure(frameon=False) 63 | fig.set_size_inches(img.shape[1] / dpi, img.shape[0] / dpi) 64 | ax = plt.Axes(fig, [0.0, 0.0, 1.0, 1.0]) 65 | ax.set_title(title) 66 | ax.axis("off") 67 | fig.add_axes(ax) 68 | ax.imshow(img) 69 | return fig, ax 70 | 71 | def vis_bbox(self, ax, bbox, box_alpha=0.5, edgecolor="g", linestyle="--"): 72 | # bbox should be of the form x, y, w, h 73 | ax.add_patch( 74 | plt.Rectangle( 75 | (bbox[0], bbox[1]), 76 | bbox[2], 77 | bbox[3], 78 | fill=False, 79 | edgecolor=edgecolor, 80 | linewidth=2.5, 81 | alpha=box_alpha, 82 | linestyle=linestyle, 83 | ) 84 | ) 85 | 86 | def vis_text(self, ax, bbox, text, color="w"): 87 | ax.text( 88 | bbox[0], 89 | bbox[1] - 2, 90 | text, 91 | fontsize=15, 92 | family="serif", 93 | 
bbox=dict(facecolor="none", alpha=0.4, pad=0, edgecolor="none"), 94 | color=color, 95 | zorder=10, 96 | ) 97 | 98 | def vis_mask(self, ax, segm, color): 99 | # segm is numpy array of shape Nx2 100 | polygon = Polygon( 101 | segm, fill=True, facecolor=color, edgecolor=color, linewidth=3, alpha=0.5 102 | ) 103 | ax.add_patch(polygon) 104 | 105 | def get_color(self, idx): 106 | color_list = colormap(rgb=True) / 255 107 | return color_list[idx % len(color_list), 0:3] 108 | 109 | def load_img(self, img_id): 110 | img = self.lvis_gt.load_imgs([img_id])[0] 111 | img_path = os.path.join(self.img_dir, img["coco_url"].split("/")[-1]) 112 | if not os.path.exists(img_path): 113 | self.lvis_gt.download(self.img_dir, img_ids=[img_id]) 114 | img = cv2.imread(img_path) 115 | b, g, r = cv2.split(img) 116 | return cv2.merge([r, g, b]) 117 | 118 | def vis_img( 119 | self, img_id, show_boxes=False, show_segms=True, show_classes=False, 120 | cat_ids_to_show=None 121 | ): 122 | ann_ids = self.lvis_gt.get_ann_ids(img_ids=[img_id]) 123 | anns = self.lvis_gt.load_anns(ids=ann_ids) 124 | boxes, segms, classes = [], [], [] 125 | for ann in anns: 126 | boxes.append(ann["bbox"]) 127 | segms.append(ann["segmentation"]) 128 | classes.append(ann["category_id"]) 129 | 130 | if len(boxes) == 0: 131 | self.logger.warn("No gt anno found for img_id: {}".format(img_id)) 132 | return 133 | 134 | boxes = np.asarray(boxes) 135 | areas = boxes[:, 2] * boxes[:, 3] 136 | sorted_inds = np.argsort(-areas) 137 | 138 | fig, ax = self.setup_figure(self.load_img(img_id)) 139 | 140 | for idx in sorted_inds: 141 | if cat_ids_to_show is not None and classes[idx] not in cat_ids_to_show: 142 | continue 143 | color = self.get_color(idx) 144 | if show_boxes: 145 | self.vis_bbox(ax, boxes[idx], edgecolor=color) 146 | if show_classes: 147 | text = self.get_synset(classes[idx]) 148 | self.vis_text(ax, boxes[idx], text) 149 | if show_segms: 150 | for segm in segms[idx]: 151 | self.vis_mask(ax, self.coco_segm_to_poly(segm), 
color) 152 | 153 | def vis_result( 154 | self, img_id, show_boxes=False, show_segms=True, show_classes=False, 155 | cat_ids_to_show=None, score_thrs=0.0, show_scores=True 156 | ): 157 | assert self.lvis_dt is not None, "lvis_dt was not specified." 158 | anns = self.lvis_dt.get_top_results(img_id, score_thrs) 159 | boxes, segms, classes, scores = [], [], [], [] 160 | for ann in anns: 161 | boxes.append(ann["bbox"]) 162 | segms.append(ann["segmentation"]) 163 | classes.append(ann["category_id"]) 164 | scores.append(ann["score"]) 165 | 166 | if len(boxes) == 0: 167 | self.logger.warn("No gt anno found for img_id: {}".format(img_id)) 168 | return 169 | 170 | boxes = np.asarray(boxes) 171 | areas = boxes[:, 2] * boxes[:, 3] 172 | sorted_inds = np.argsort(-areas) 173 | 174 | fig, ax = self.setup_figure(self.load_img(img_id)) 175 | 176 | for idx in sorted_inds: 177 | if cat_ids_to_show is not None and classes[idx] not in cat_ids_to_show: 178 | continue 179 | color = self.get_color(idx) 180 | if show_boxes: 181 | self.vis_bbox(ax, boxes[idx], edgecolor=color) 182 | if show_classes: 183 | text = self.get_synset(classes[idx]) 184 | if show_scores: 185 | text = "{}: {:.2f}".format(text, scores[idx]) 186 | self.vis_text(ax, boxes[idx], text) 187 | if show_segms: 188 | for segm in segms[idx]: 189 | self.vis_mask(ax, self.coco_segm_to_poly(segm), color) 190 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | cycler>=0.10.0 2 | Cython>=0.29.12 3 | kiwisolver>=1.1.0 4 | matplotlib>=3.1.1 5 | numpy>=1.18.2 6 | opencv-python>=4.1.0.25 7 | pyparsing>=2.4.0 8 | python-dateutil>=2.8.0 9 | six>=1.12.0 10 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | """LVIS (pronounced ‘el-vis’): is a new dataset for Large Vocabulary 
Instance Segmentation. 2 | We collect over 2 million high-quality instance segmentation masks for over 1200 entry-level object categories in 164k images. LVIS API enables reading and interacting with annotation files, 3 | visualizing annotations, and evaluating results. 4 | 5 | """ 6 | DOCLINES = (__doc__ or '') 7 | 8 | import os.path 9 | import sys 10 | import pip 11 | 12 | import setuptools 13 | 14 | sys.path.insert(0, os.path.join(os.path.dirname(__file__), "lvis")) 15 | 16 | with open("requirements.txt") as f: 17 | reqs = f.read() 18 | 19 | DISTNAME = "lvis" 20 | DESCRIPTION = "Python API for LVIS dataset." 21 | AUTHOR = "Agrim Gupta" 22 | REQUIREMENTS = (reqs.strip().split("\n"),) 23 | 24 | 25 | if __name__ == "__main__": 26 | setuptools.setup( 27 | name=DISTNAME, 28 | install_requires=REQUIREMENTS, 29 | packages=setuptools.find_packages(), 30 | version="0.5.3", 31 | description=DESCRIPTION, 32 | long_description=DOCLINES, 33 | long_description_content_type='text/markdown', 34 | author=AUTHOR 35 | ) 36 | -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from lvis import LVIS, LVISResults, LVISEval 3 | 4 | # result and val files for 100 randomly sampled images. 5 | ANNOTATION_PATH = "./data/lvis_val_100.json" 6 | RESULT_PATH = "./data/lvis_results_100.json" 7 | 8 | ANN_TYPE = 'bbox' 9 | 10 | lvis_eval = LVISEval(ANNOTATION_PATH, RESULT_PATH, ANN_TYPE) 11 | lvis_eval.run() 12 | lvis_eval.print_results() 13 | --------------------------------------------------------------------------------
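The AP computation inside `LVISEval.accumulate()` in `lvis/eval.py` (cumulative TP/FP in score order, the right-to-left precision envelope, then sampling at fixed recall thresholds) can be exercised in isolation. The sketch below is illustrative only: `interpolated_ap` is a hypothetical helper, not part of the LVIS API, and it mirrors the single-category, single-IoU-threshold case of the loop above, assuming NumPy is available.

```python
import numpy as np


def interpolated_ap(scores, is_tp, num_gt, rec_thrs=None):
    """Sketch of the AP step in accumulate(): sort detections by score,
    cumulate TP/FP, take the right-to-left max of precision (the
    "envelope"), and average the envelope over fixed recall levels."""
    if rec_thrs is None:
        # Same 101-point recall grid that Params.rec_thrs uses.
        rec_thrs = np.linspace(0.0, 1.0, 101)
    order = np.argsort(-np.asarray(scores), kind="mergesort")
    tp = np.asarray(is_tp, dtype=float)[order]
    fp = 1.0 - tp
    tp_sum = np.cumsum(tp)
    fp_sum = np.cumsum(fp)
    rc = tp_sum / num_gt
    # np.spacing(1) ~= eps, as in eval.py, to avoid division by zero.
    pr = tp_sum / (tp_sum + fp_sum + np.spacing(1))
    # Envelope: each precision becomes the max precision to its right,
    # making AP less sensitive to small ranking perturbations.
    for i in range(len(pr) - 1, 0, -1):
        if pr[i] > pr[i - 1]:
            pr[i - 1] = pr[i]
    # Sample the envelope at the recall thresholds; thresholds beyond
    # the highest achieved recall contribute precision 0.
    idx = np.searchsorted(rc, rec_thrs, side="left")
    pr_at_recall = np.zeros_like(rec_thrs)
    valid = idx < len(pr)
    pr_at_recall[valid] = pr[idx[valid]]
    return pr_at_recall.mean()
```

For example, two detections that both match ground truth (with `num_gt=2`) yield an AP of 1.0, while a single false positive yields 0.0; the real `accumulate()` does the same computation per IoU threshold, category, and area range, then `_summarize()` averages over the non-`-1` entries.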