├── README.md
├── data
│   ├── demo
│   │   ├── mask.jpg
│   │   ├── pcd0100r_rgd_preprocessed_1.png
│   │   ├── pcd0266r_rgd_preprocessed_1.png
│   │   ├── pcd0882r_rgd_preprocessed_1.png
│   │   └── rgd_0000Cropped320.png
│   └── scripts
│       ├── dataPreprocessingTest_fasterrcnn_split.m
│       ├── dataPreprocessing_fasterrcnn.m
│       └── fetch_faster_rcnn_models.sh
├── experiments
│   ├── cfgs
│   │   ├── res101-lg.yml
│   │   ├── res101.yml
│   │   ├── res50.yml
│   │   └── vgg16.yml
│   ├── logs
│   │   └── .gitignore
│   └── scripts
│       ├── convert_vgg16.sh
│       ├── test_faster_rcnn.sh
│       ├── train_faster_rcnn.sh
│       └── train_faster_rcnn.sh~
├── fig
│   ├── ivalab_dataset.png
│   └── multi_grasp_multi_object.png
├── lib
│   ├── Makefile
│   ├── datasets
│   │   ├── VOCdevkit-matlab-wrapper
│   │   │   ├── get_voc_opts.m
│   │   │   ├── voc_eval.m
│   │   │   └── xVOCap.m
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── coco.py
│   │   ├── coco.pyc
│   │   ├── ds_utils.py
│   │   ├── ds_utils.pyc
│   │   ├── factory.py
│   │   ├── factory.pyc
│   │   ├── factory.py~
│   │   ├── graspRGB.py
│   │   ├── graspRGB.pyc
│   │   ├── graspRGB.py~
│   │   ├── imdb.py
│   │   ├── imdb.pyc
│   │   ├── pascal_voc.py
│   │   ├── pascal_voc.pyc
│   │   ├── tools
│   │   │   └── mcg_munge.py
│   │   ├── voc_eval.py
│   │   └── voc_eval.pyc
│   ├── layer_utils
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── anchor_target_layer.py
│   │   ├── anchor_target_layer.pyc
│   │   ├── generate_anchors.py
│   │   ├── generate_anchors.pyc
│   │   ├── proposal_layer.py
│   │   ├── proposal_layer.pyc
│   │   ├── proposal_target_layer.py
│   │   ├── proposal_target_layer.pyc
│   │   ├── proposal_top_layer.py
│   │   ├── proposal_top_layer.pyc
│   │   ├── snippets.py
│   │   └── snippets.pyc
│   ├── model
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── bbox_transform.py
│   │   ├── bbox_transform.pyc
│   │   ├── config.py
│   │   ├── config.pyc
│   │   ├── config.py~
│   │   ├── nms_wrapper.py
│   │   ├── nms_wrapper.pyc
│   │   ├── test.py
│   │   ├── test.pyc
│   │   ├── test.py~
│   │   ├── train_val.py
│   │   ├── train_val.pyc
│   │   └── train_val.py~
│   ├── nets
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── network.py
│   │   ├── network.pyc
│   │   ├── resnet_v1.py
│   │   ├── resnet_v1.pyc
│   │   ├── resnet_v1.py~
│   │   ├── vgg16.py
│   │   └── vgg16.pyc
│   ├── nms
│   │   ├── .gitignore
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── cpu_nms.c
│   │   ├── cpu_nms.pyx
│   │   ├── cpu_nms.so
│   │   ├── gpu_nms.cpp
│   │   ├── gpu_nms.hpp
│   │   ├── gpu_nms.pyx
│   │   ├── gpu_nms.so
│   │   ├── nms_kernel.cu
│   │   └── py_cpu_nms.py
│   ├── roi_data_layer
│   │   ├── __init__.py
│   │   ├── __init__.pyc
│   │   ├── layer.py
│   │   ├── layer.pyc
│   │   ├── minibatch.py
│   │   ├── minibatch.pyc
│   │   ├── minibatch.py~
│   │   ├── roidb.py
│   │   └── roidb.pyc
│   ├── setup.py
│   ├── setup.py~
│   └── utils
│       ├── .gitignore
│       ├── __init__.py
│       ├── __init__.pyc
│       ├── bbox.pyx
│       ├── blob.py
│       ├── blob.pyc
│       ├── boxes_grid.py
│       ├── boxes_grid.pyc
│       ├── cython_bbox.so
│       ├── cython_nms.so
│       ├── nms.py
│       ├── nms.pyx
│       ├── timer.py
│       └── timer.pyc
└── tools
    ├── _init_paths.py
    ├── _init_paths.pyc
    ├── demo.py~
    ├── demo_graspRGD.py
    ├── demo_graspRGD.py~
    ├── demo_graspRGD_socket.py
    ├── demo_graspRGD_socket.py~
    ├── demo_graspRGD_socket_drawer.py~
    ├── demo_graspRGD_socket_save_to_rgbd.py~
    ├── demo_graspRGD_vis_mask.py
    ├── demo_graspRGD_vis_select.py
    ├── eval_graspRGD.py~
    ├── mask_gen.py
    └── trainval_net.py
/README.md:
--------------------------------------------------------------------------------
1 | # grasp_multiObject_multiGrasp
2 |
3 | This is the implementation of our RA-L work 'Real-world Multi-object, Multi-grasp Detection'. The detector takes an RGB-D image as input and predicts multiple grasp candidates for a single object or multiple objects in a single shot. The original arXiv paper can be found [here](https://arxiv.org/pdf/1802.00520.pdf); the final version will be updated once the publication process is complete.
4 |
5 |
6 |
7 |
8 |
9 | If you find it helpful for your research, please consider citing:
10 |
11 | 	@article{chu2018deep,
12 | title = {Real-World Multiobject, Multigrasp Detection},
13 | author = {F. Chu and R. Xu and P. A. Vela},
14 | journal = {IEEE Robotics and Automation Letters},
15 | year = {2018},
16 | volume = {3},
17 | number = {4},
18 | pages = {3355-3362},
19 | DOI = {10.1109/LRA.2018.2852777},
20 | ISSN = {2377-3766},
21 | month = {Oct}
22 | }
23 |
24 |
25 | If you have any questions, please contact me at fujenchu[at]gatech[dot]edu
26 |
27 |
28 | ### Demo
29 | 1. Clone this repository
30 | ```
31 | git clone https://github.com/ivalab/grasp_multiObject_multiGrasp.git
32 | cd grasp_multiObject_multiGrasp
33 | ```
34 |
35 | 2. Build Cython modules
36 | ```
37 | cd lib
38 | make clean
39 | make
40 | cd ..
41 | ```
42 |
43 | 3. Install [Python COCO API](https://github.com/cocodataset/cocoapi)
44 | ```
45 | cd data
46 | git clone https://github.com/pdollar/coco.git
47 | cd coco/PythonAPI
48 | make
49 | cd ../../..
50 | ```
51 |
52 | 4. Download the pretrained model
53 | - trained grasp detection model, available on [Dropbox](https://www.dropbox.com/s/ldapcpanzqdu7tc/models.zip?dl=0)
54 | - put it under `output/res50/train/default/`
55 |
56 | 5. Run demo
57 | ```
58 | ./tools/demo_graspRGD.py --net res50 --dataset grasp
59 | ```
60 | You should see result images pop up.
61 |
62 | ### Train
63 | 1. Generate data
64 | 1-1. Download [Cornell Dataset](http://pr.cs.cornell.edu/grasping/rect_data/data.php)
65 | 1-2. Run `dataPreprocessingTest_fasterrcnn_split.m` (update the hard-coded paths to match your directory structure; the annotation format it writes is sketched in the 'Grasp representation' section below)
66 | 1-3. Follow the 'Format Your Dataset' section [here](https://github.com/zeyuanxy/fast-rcnn/tree/master/help/train) to check that your data follows the VOC format
67 |
68 | 2. Train
69 | ```
70 | ./experiments/scripts/train_faster_rcnn.sh 0 graspRGB res50
71 | ```
72 |
73 | ### ROS version?
74 | Yes! Please find it [HERE](https://github.com/ivaROS/ros_deep_grasp)
75 |
76 | ### Acknowledgment
77 |
78 | This repo borrows tons of code from
79 | - [tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn) by endernewton
80 |
81 | ### Resources
82 | - [multi-object grasp dataset](https://github.com/ivalab/grasp_multiObject)
83 | - [grasp annotation tool](https://github.com/ivalab/grasp_annotation_tool)
84 |
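85 | ### Grasp representation (sketch)
86 |
87 | The preprocessing script `data/scripts/dataPreprocessingTest_fasterrcnn_split.m` reduces each four-corner grasp rectangle to one annotation line `cls x_min y_min x_max y_max`: the oriented rectangle becomes its axis-aligned bounding box, and the grasp angle is quantized into 10-degree bins (classes 1 to 19). Below is a minimal Python sketch of that mapping and its inverse; it mirrors the MATLAB code for illustration only, and the helper names are ours, not part of this repo.
88 |
89 | ```
90 | import numpy as np
91 |
92 | def rectangle_to_annotation(A):
93 |     """A: 2x4 array of grasp-rectangle corners, ordered around the rectangle.
94 |     Returns (cls, x_min, y_min, x_max, y_max) as written by the MATLAB script."""
95 |     x_ctr, y_ctr = A.sum(axis=1) / 4.0           # rectangle center
96 |     width = np.linalg.norm(A[:, 0] - A[:, 1])    # first edge length
97 |     height = np.linalg.norm(A[:, 1] - A[:, 2])   # second edge length
98 |     # orientation of the first edge; the image y-axis points down
99 |     if A[0, 0] > A[0, 1]:
100 |         theta = np.arctan((A[1, 1] - A[1, 0]) / (A[0, 0] - A[0, 1]))
101 |     else:
102 |         theta = np.arctan((A[1, 0] - A[1, 1]) / (A[0, 1] - A[0, 0]))
103 |     # round half up, as MATLAB's round does for nonnegative values: classes 1..19
104 |     cls = int(np.floor((np.degrees(theta) + 90) / 10 + 0.5)) + 1
105 |     return (cls, x_ctr - width / 2, y_ctr - height / 2,
106 |             x_ctr + width / 2, y_ctr + height / 2)
107 |
108 | def class_to_angle_deg(cls):
109 |     """Inverse mapping: bin-center grasp angle (degrees) for class cls (1..19)."""
110 |     return (cls - 1) * 10 - 90
111 | ```
112 |
113 | A predicted class index can be mapped back to a grasp orientation with the inverse above (the same 10-degree binning the preprocessing uses).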
--------------------------------------------------------------------------------
/data/demo/mask.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/data/demo/mask.jpg
--------------------------------------------------------------------------------
/data/demo/pcd0100r_rgd_preprocessed_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/data/demo/pcd0100r_rgd_preprocessed_1.png
--------------------------------------------------------------------------------
/data/demo/pcd0266r_rgd_preprocessed_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/data/demo/pcd0266r_rgd_preprocessed_1.png
--------------------------------------------------------------------------------
/data/demo/pcd0882r_rgd_preprocessed_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/data/demo/pcd0882r_rgd_preprocessed_1.png
--------------------------------------------------------------------------------
/data/demo/rgd_0000Cropped320.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/data/demo/rgd_0000Cropped320.png
--------------------------------------------------------------------------------
/data/scripts/dataPreprocessingTest_fasterrcnn_split.m:
--------------------------------------------------------------------------------
1 | %% script to test dataPreprocessing
2 | %% created by Fu-Jen Chu on 09/15/2016
3 |
4 | close all
5 | clear
6 |
7 | %parpool(4)
8 | addpath('/media/fujenchu/home3/data/grasps/')
9 |
10 |
11 | % generate list for splits
12 | list = [100:949 1000:1034];
13 | list_idx = randperm(length(list));
14 | train_list_idx = list_idx(length(list)/5+1:end);
15 | test_list_idx = list_idx(1:length(list)/5);
16 | train_list = list(train_list_idx);
17 | test_list = list(test_list_idx);
18 |
19 |
20 | for folder = 1:10
21 | display(['processing folder ' int2str(folder)])
22 |
23 | imgDataDir = ['/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps/' sprintf('%02d',folder) '_rgd'];
24 | txtDataDir = ['/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps/' sprintf('%02d',folder)];
25 |
26 | %imgDataOutDir = ['/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps/' sprintf('%02d',folder) '_Cropped320_rgd'];
27 | imgDataOutDir = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/Images';
28 | annotationDataOutDir = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/Annotations';
29 | imgSetTrain = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/ImageSets/train.txt';
30 | imgSetTest = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_tf/data/ImageSets/test.txt';
31 |
32 | imgFiles = dir([imgDataDir '/*.png']);
33 | txtFiles = dir([txtDataDir '/*pos.txt']);
34 |
35 | logfileID = fopen('log.txt','a');
36 | mainfileID = fopen(['/home/fujenchu/projects/deepLearning/deepGraspExtensiveOffline/data/grasps/scripts/trainttt' sprintf('%02d',folder) '.txt'],'a');
37 | for idx = 1:length(imgFiles)
38 | %% display progress
39 | tic
40 | display(['processing folder: ' sprintf('%02d',folder) ', imgFiles: ' int2str(idx)])
41 |
42 | %% reading data
43 | imgName = imgFiles(idx).name;
44 | [pathstr,imgname] = fileparts(imgName);
45 |
46 | filenum = str2num(imgname(4:7));
47 | if(any(test_list == filenum))
48 | file_writeID = fopen(imgSetTest,'a');
49 | fprintf(file_writeID, '%s\n', [imgDataDir(1:end-3) 'Cropped320_rgd/' imgname '_preprocessed_1.png' ] );
50 | fclose(file_writeID);
51 | continue;
52 | end
53 |
54 | txtName = txtFiles(idx).name;
55 | [pathstr,txtname] = fileparts(txtName);
56 |
57 | img = imread([imgDataDir '/' imgname '.png']);
58 | fileID = fopen([txtDataDir '/' txtname '.txt'],'r');
59 | sizeA = [2 100];
60 | bbsIn_all = fscanf(fileID, '%f %f', sizeA);
61 | fclose(fileID);
62 |
63 | %% data pre-processing
64 | [imagesOut bbsOut] = dataPreprocessing_fasterrcnn(img, bbsIn_all, 227, 5, 5);
65 |
66 | % for each augmented image
67 | for i = 1:1:size(imagesOut,2)
68 |
69 | % for each bbs
70 | file_writeID = fopen([annotationDataOutDir '/' imgname '_preprocessed_' int2str(i) '.txt'],'w');
71 | printCount = 0;
72 | for ibbs = 1:1:size(bbsOut{i},2)
73 | A = bbsOut{i}{ibbs};
74 | xy_ctr = sum(A,2)/4; x_ctr = xy_ctr(1); y_ctr = xy_ctr(2);
75 | width = sqrt(sum((A(:,1) - A(:,2)).^2)); height = sqrt(sum((A(:,2) - A(:,3)).^2));
76 | if(A(1,1) > A(1,2))
77 | theta = atan((A(2,2)-A(2,1))/(A(1,1)-A(1,2)));
78 | else
79 | theta = atan((A(2,1)-A(2,2))/(A(1,2)-A(1,1))); % note y is facing down
80 | end
81 |
82 | % process to fasterrcnn
83 | x_min = x_ctr - width/2; x_max = x_ctr + width/2;
84 | y_min = y_ctr - height/2; y_max = y_ctr + height/2;
85 | %if(x_min < 0 || y_min < 0 || x_max > 227 || y_max > 227) display('yoooooooo'); end
86 | if((x_min < 0 && x_max < 0) || (y_min > 227 && y_max > 227) || (x_min > 227 && x_max > 227) || (y_min < 0 && y_max < 0)) display('xxxxxxxxx'); break; end
87 | cls = round((theta/pi*180+90)/10) + 1; % quantize theta in (-90,90] deg into 10-degree bins: classes 1..19
88 |
89 | % write as top-left and bottom-right corners, Xmin Ymin Xmax Ymax, e.g. 261 109 511 705 (x horizontal, y vertical)
90 | fprintf(file_writeID, '%d %f %f %f %f\n', cls, x_min, y_min, x_max, y_max );
91 | printCount = printCount+1;
92 | end
93 | if(printCount == 0) fprintf(logfileID, '%s\n', [imgname '_preprocessed_' int2str(i) ]);end
94 |
95 | fclose(file_writeID);
96 | imwrite(imagesOut{i}, [imgDataOutDir '/' imgname '_preprocessed_' int2str(i) '.png']);
97 |
98 | % write filename to imageSet
99 | file_writeID = fopen(imgSetTrain,'a');
100 | fprintf(file_writeID, '%s\n', [imgname '_preprocessed_' int2str(i) ] );
101 | fclose(file_writeID);
102 |
103 | end
104 |
105 |
106 | toc
107 | end
108 | fclose(mainfileID);
109 |
110 | end
111 |
--------------------------------------------------------------------------------
/data/scripts/dataPreprocessing_fasterrcnn.m:
--------------------------------------------------------------------------------
1 | function [imagesOut, bbsOut] = dataPreprocessing_fasterrcnn( imageIn, bbsIn_all, cropSize, translationShiftNumber, rotateAngleNumber)
2 | % dataPreprocessing_fasterrcnn performs
3 | %  1) cropping
4 | %  2) padding
5 | %  3) rotation
6 | %  4) shifting
7 | %
8 | % For an input image with bounding boxes given as 4 corner points each,
9 | % it outputs a set of augmented images with correspondingly transformed bbs.
10 | %
11 | %
12 | % Inputs:
13 | % imageIn: input image (480 by 640 by 3)
14 | %   bbsIn_all: bounding boxes (2 by 4n; four corner points per box)
15 | %   cropSize: output image size
16 | %   translationShiftNumber: number of translation shifts (per axis)
17 | %   rotateAngleNumber: number of rotation angles
18 | %
19 | % Outputs:
20 | % imagesOut: output images (n images)
21 | % bbsOut: output bbs according to shift and rotation
22 | %
23 | %% created by Fu-Jen Chu on 09/15/2016
24 |
25 | debug_dev = 0;
26 | debug = 0;
27 | %% show image and bbs
28 | if(debug_dev)
29 | figure(1); imshow(imageIn); hold on;
30 | x = bbsIn_all(1, [1:3]);
31 | y = bbsIn_all(2, [1:3]);
32 | plot(x,y); hold off;
33 | end
34 |
35 | %% crop image and padding image
36 | % cropping to 352 by 352 from center
37 | imgCrop = imcrop(imageIn, [145 65 351 351]);
38 |
39 | % padding to 501 by 501
40 | imgPadding = padarray(imgCrop, [75 75], 'replicate', 'both');
41 |
42 | count = 1;
43 | for i_rotate = 1:rotateAngleNumber*translationShiftNumber*translationShiftNumber
44 | % random rotation angle
45 | theta = randi(360)-1;
46 | %theta = 0;
47 |
48 | % random translationShift
49 | dx = randi(101)-51;
50 | %dx = 0;
51 |
52 | %% rotation and shifting
53 | % random translationShift
54 | dy = randi(101)-51;
55 | %dy = 0;
56 |
57 | imgRotate = imrotate(imgPadding, theta);
58 | if(debug_dev)figure(2); imshow(imgRotate);end
59 | imgCropRotate = imcrop(imgRotate, [size(imgRotate,1)/2-160-dx size(imgRotate,1)/2-160-dy 320 320]);
60 | if(debug_dev)figure(3); imshow(imgCropRotate);end
61 | imgResize = imresize(imgCropRotate, [cropSize cropSize]);
62 | if(debug)figure(4); imshow(imgResize); hold on;end
63 |
64 | %% modify bbs
65 | [m, n] = size(bbsIn_all);
66 | bbsNum = n/4;
67 |
68 | countbbs = 1;
69 | for idx = 1:bbsNum
70 | bbsIn = bbsIn_all(:,idx*4-3:idx*4);
71 | if(sum(sum(isnan(bbsIn)))) continue; end
72 |
73 | bbsInShift = bbsIn - repmat([320; 240], 1, 4);
74 | R = [cos(theta/180*pi) -sin(theta/180*pi); sin(theta/180*pi) cos(theta/180*pi)];
75 | bbsRotated = (bbsInShift'*R)';
76 | bbsInShiftBack = (bbsRotated + repmat([160; 160], 1, 4) + repmat([dx; dy], 1, 4))*cropSize/320;
77 | if(debug)
78 | figure(4)
79 | x = bbsInShiftBack(1, [1:4 1]);
80 | y = bbsInShiftBack(2, [1:4 1]);
81 | plot(x,y); hold on; pause(0.01);
82 | end
83 | bbsOut{count}{countbbs} = bbsInShiftBack;
84 | countbbs = countbbs + 1;
85 | end
86 |
87 | imagesOut{count} = imgResize;
88 | count = count +1;
89 |
90 |
91 | end
92 |
93 |
94 |
95 | end
96 |
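97 | % Example usage (illustrative), matching the call in
98 | % dataPreprocessingTest_fasterrcnn_split.m; the arguments (227, 5, 5) yield
99 | % 5*5*5 = 125 randomly rotated and shifted 227-by-227 crops per image:
100 | %   [imagesOut, bbsOut] = dataPreprocessing_fasterrcnn(img, bbsIn_all, 227, 5, 5);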
--------------------------------------------------------------------------------
/data/scripts/fetch_faster_rcnn_models.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )/../" && pwd )"
4 | cd $DIR
5 |
6 | NET=res101
7 | FILE=voc_0712_80k-110k.tgz
8 | # replace it with gs11655.sp.cs.cmu.edu if ladoga.graphics.cs.cmu.edu does not work
9 | URL=http://ladoga.graphics.cs.cmu.edu/xinleic/tf-faster-rcnn/$NET/$FILE
10 | CHECKSUM=cb32e9df553153d311cc5095b2f8c340
11 |
12 | if [ -f $FILE ]; then
13 | echo "File already exists. Checking md5..."
14 | os=`uname -s`
15 | if [ "$os" = "Linux" ]; then
16 | checksum=`md5sum $FILE | awk '{ print $1 }'`
17 | elif [ "$os" = "Darwin" ]; then
18 | checksum=`cat $FILE | md5`
19 | fi
20 | if [ "$checksum" = "$CHECKSUM" ]; then
21 | echo "Checksum is correct. No need to download."
22 | exit 0
23 | else
24 | echo "Checksum is incorrect. Need to download again."
25 | fi
26 | fi
27 |
28 | echo "Downloading Resnet 101 Faster R-CNN models Pret-trained on VOC 07+12 (340M)..."
29 |
30 | wget $URL -O $FILE
31 |
32 | echo "Unzipping..."
33 |
34 | tar zxvf $FILE
35 |
36 | echo "Done. Please run this command again to verify that checksum = $CHECKSUM."
37 |
--------------------------------------------------------------------------------
/experiments/cfgs/res101-lg.yml:
--------------------------------------------------------------------------------
1 | EXP_DIR: res101-lg
2 | TRAIN:
3 | HAS_RPN: True
4 | IMS_PER_BATCH: 1
5 | BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
6 | RPN_POSITIVE_OVERLAP: 0.7
7 | RPN_BATCHSIZE: 256
8 | PROPOSAL_METHOD: gt
9 | BG_THRESH_LO: 0.0
10 | DISPLAY: 20
11 | BATCH_SIZE: 256
12 | WEIGHT_DECAY: 0.0001
13 | DOUBLE_BIAS: False
14 | SNAPSHOT_PREFIX: res101_faster_rcnn
15 | SCALES: [800]
16 | MAX_SIZE: 1333
17 | TEST:
18 | HAS_RPN: True
19 | SCALES: [800]
20 | MAX_SIZE: 1333
21 | RPN_POST_NMS_TOP_N: 1000
22 | POOLING_MODE: crop
23 | ANCHOR_SCALES: [2,4,8,16,32]
24 |
--------------------------------------------------------------------------------
/experiments/cfgs/res101.yml:
--------------------------------------------------------------------------------
1 | EXP_DIR: res101
2 | TRAIN:
3 | HAS_RPN: True
4 | IMS_PER_BATCH: 1
5 | BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
6 | RPN_POSITIVE_OVERLAP: 0.7
7 | RPN_BATCHSIZE: 256
8 | PROPOSAL_METHOD: gt
9 | BG_THRESH_LO: 0.0
10 | DISPLAY: 20
11 | BATCH_SIZE: 256
12 | WEIGHT_DECAY: 0.0001
13 | DOUBLE_BIAS: False
14 | SNAPSHOT_PREFIX: res101_faster_rcnn
15 | TEST:
16 | HAS_RPN: True
17 | POOLING_MODE: crop
18 |
--------------------------------------------------------------------------------
/experiments/cfgs/res50.yml:
--------------------------------------------------------------------------------
1 | EXP_DIR: res50
2 | TRAIN:
3 | HAS_RPN: True
4 | IMS_PER_BATCH: 1
5 | BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
6 | RPN_POSITIVE_OVERLAP: 0.7
7 | RPN_BATCHSIZE: 256
8 | PROPOSAL_METHOD: gt
9 | BG_THRESH_LO: 0.0
10 | DISPLAY: 20
11 | BATCH_SIZE: 256
12 | WEIGHT_DECAY: 0.0001
13 | DOUBLE_BIAS: False
14 | SNAPSHOT_PREFIX: res50_faster_rcnn
15 | TEST:
16 | HAS_RPN: True
17 | POOLING_MODE: crop
18 |
--------------------------------------------------------------------------------
/experiments/cfgs/vgg16.yml:
--------------------------------------------------------------------------------
1 | EXP_DIR: vgg16
2 | TRAIN:
3 | HAS_RPN: True
4 | IMS_PER_BATCH: 1
5 | BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
6 | RPN_POSITIVE_OVERLAP: 0.7
7 | RPN_BATCHSIZE: 256
8 | PROPOSAL_METHOD: gt
9 | BG_THRESH_LO: 0.0
10 | DISPLAY: 20
11 | BATCH_SIZE: 256
12 | SNAPSHOT_PREFIX: vgg16_faster_rcnn
13 | TEST:
14 | HAS_RPN: True
15 | POOLING_MODE: crop
16 |
--------------------------------------------------------------------------------
/experiments/logs/.gitignore:
--------------------------------------------------------------------------------
1 | *.txt.*
2 |
--------------------------------------------------------------------------------
/experiments/scripts/convert_vgg16.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -x
4 | set -e
5 |
6 | export PYTHONUNBUFFERED="True"
7 |
8 | GPU_ID=$1
9 | DATASET=$2
10 | NET=vgg16
11 |
12 | array=( $@ )
13 | len=${#array[@]}
14 | EXTRA_ARGS=${array[@]:2:$len}
15 | EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
16 |
17 | case ${DATASET} in
18 | pascal_voc)
19 | TRAIN_IMDB="voc_2007_trainval"
20 | TEST_IMDB="voc_2007_test"
21 | STEPSIZE=50000
22 | ITERS=70000
23 | ANCHORS="[8,16,32]"
24 | RATIOS="[0.5,1,2]"
25 | ;;
26 | pascal_voc_0712)
27 | TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
28 | TEST_IMDB="voc_2007_test"
29 | STEPSIZE=80000
30 | ITERS=110000
31 | ANCHORS="[8,16,32]"
32 | RATIOS="[0.5,1,2]"
33 | ;;
34 | coco)
35 | TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
36 | TEST_IMDB="coco_2014_minival"
37 | STEPSIZE=350000
38 | ITERS=490000
39 | ANCHORS="[4,8,16,32]"
40 | RATIOS="[0.5,1,2]"
41 | ;;
42 | *)
43 | echo "No dataset given"
44 | exit
45 | ;;
46 | esac
47 |
48 | set +x
49 | NET_FINAL=${NET}_faster_rcnn_iter_${ITERS}
50 | set -x
51 |
52 | if [ ! -f ${NET_FINAL}.index ]; then
53 | if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
54 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/convert_from_depre.py \
55 | --snapshot ${NET_FINAL} \
56 | --imdb ${TRAIN_IMDB} \
57 | --iters ${ITERS} \
58 | --cfg experiments/cfgs/${NET}.yml \
59 | --tag ${EXTRA_ARGS_SLUG} \
60 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
61 | else
62 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/convert_from_depre.py \
63 | --snapshot ${NET_FINAL} \
64 | --imdb ${TRAIN_IMDB} \
65 | --iters ${ITERS} \
66 | --cfg experiments/cfgs/${NET}.yml \
67 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
68 | fi
69 | fi
70 |
71 |
--------------------------------------------------------------------------------
/experiments/scripts/test_faster_rcnn.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -x
4 | set -e
5 |
6 | export PYTHONUNBUFFERED="True"
7 |
8 | GPU_ID=$1
9 | DATASET=$2
10 | NET=$3
11 |
12 | array=( $@ )
13 | len=${#array[@]}
14 | EXTRA_ARGS=${array[@]:3:$len}
15 | EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
16 |
17 | case ${DATASET} in
18 | pascal_voc)
19 | TRAIN_IMDB="voc_2007_trainval"
20 | TEST_IMDB="voc_2007_test"
21 | ITERS=70000
22 | ANCHORS="[8,16,32]"
23 | RATIOS="[0.5,1,2]"
24 | ;;
25 | pascal_voc_0712)
26 | TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
27 | TEST_IMDB="voc_2007_test"
28 | ITERS=110000
29 | ANCHORS="[8,16,32]"
30 | RATIOS="[0.5,1,2]"
31 | ;;
32 | coco)
33 | TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
34 | TEST_IMDB="coco_2014_minival"
35 | ITERS=490000
36 | ANCHORS="[4,8,16,32]"
37 | RATIOS="[0.5,1,2]"
38 | ;;
39 | *)
40 | echo "No dataset given"
41 | exit
42 | ;;
43 | esac
44 |
45 | LOG="experiments/logs/test_${NET}_${TRAIN_IMDB}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
46 | exec &> >(tee -a "$LOG")
47 | echo Logging output to "$LOG"
48 |
49 | set +x
50 | if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
51 | NET_FINAL=output/${NET}/${TRAIN_IMDB}/${EXTRA_ARGS_SLUG}/${NET}_faster_rcnn_iter_${ITERS}.ckpt
52 | else
53 | NET_FINAL=output/${NET}/${TRAIN_IMDB}/default/${NET}_faster_rcnn_iter_${ITERS}.ckpt
54 | fi
55 | set -x
56 |
57 | if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
58 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/test_net.py \
59 | --imdb ${TEST_IMDB} \
60 | --model ${NET_FINAL} \
61 | --cfg experiments/cfgs/${NET}.yml \
62 | --tag ${EXTRA_ARGS_SLUG} \
63 | --net ${NET} \
64 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} ${EXTRA_ARGS}
65 | else
66 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/test_net.py \
67 | --imdb ${TEST_IMDB} \
68 | --model ${NET_FINAL} \
69 | --cfg experiments/cfgs/${NET}.yml \
70 | --net ${NET} \
71 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} ${EXTRA_ARGS}
72 | fi
73 |
74 |
--------------------------------------------------------------------------------
/experiments/scripts/train_faster_rcnn.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -x
4 | set -e
5 |
6 | export PYTHONUNBUFFERED="True"
7 |
8 | GPU_ID=$1
9 | DATASET=$2
10 | NET=$3
11 |
12 | array=( $@ )
13 | len=${#array[@]}
14 | EXTRA_ARGS=${array[@]:3:$len}
15 | EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
16 |
17 | case ${DATASET} in
18 | graspRGB)
19 | TRAIN_IMDB="graspRGB_train"
20 | TEST_IMDB="graspRGB_test"
21 | STEPSIZE=50000
22 | ITERS=240000
23 | #ANCHORS="[2,4,8,16,32]"
24 | ANCHORS="[8,16,32]"
25 | RATIOS="[0.5,1,2]"
26 | ;;
27 | pascal_voc)
28 | TRAIN_IMDB="voc_2007_trainval"
29 | TEST_IMDB="voc_2007_test"
30 | STEPSIZE=50000
31 | ITERS=70000
32 | ANCHORS="[8,16,32]"
33 | RATIOS="[0.5,1,2]"
34 | ;;
35 | pascal_voc_0712)
36 | TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
37 | TEST_IMDB="voc_2007_test"
38 | STEPSIZE=80000
39 | ITERS=110000
40 | ANCHORS="[8,16,32]"
41 | RATIOS="[0.5,1,2]"
42 | ;;
43 | coco)
44 | TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
45 | TEST_IMDB="coco_2014_minival"
46 | STEPSIZE=350000
47 | ITERS=490000
48 | ANCHORS="[4,8,16,32]"
49 | RATIOS="[0.5,1,2]"
50 | ;;
51 | *)
52 | echo "No dataset given"
53 | exit
54 | ;;
55 | esac
56 |
57 | LOG="experiments/logs/${NET}_${TRAIN_IMDB}_${EXTRA_ARGS_SLUG}_${NET}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
58 | exec &> >(tee -a "$LOG")
59 | echo Logging output to "$LOG"
60 |
61 | set +x
62 | if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
63 | NET_FINAL=output/${NET}/${TRAIN_IMDB}/${EXTRA_ARGS_SLUG}/${NET}_faster_rcnn_iter_${ITERS}.ckpt
64 | else
65 | NET_FINAL=output/${NET}/${TRAIN_IMDB}/default/${NET}_faster_rcnn_iter_${ITERS}.ckpt
66 | fi
67 | set -x
68 |
69 | if [ ! -f ${NET_FINAL}.index ]; then
70 | if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
71 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
72 | --weight data/imagenet_weights/${NET}.ckpt \
73 | --imdb ${TRAIN_IMDB} \
74 | --imdbval ${TEST_IMDB} \
75 | --iters ${ITERS} \
76 | --cfg experiments/cfgs/${NET}.yml \
77 | --tag ${EXTRA_ARGS_SLUG} \
78 | --net ${NET} \
79 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
80 | else
81 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
82 | --weight data/imagenet_weights/${NET}.ckpt \
83 | --imdb ${TRAIN_IMDB} \
84 | --imdbval ${TEST_IMDB} \
85 | --iters ${ITERS} \
86 | --cfg experiments/cfgs/${NET}.yml \
87 | --net ${NET} \
88 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
89 | fi
90 | fi
91 |
92 | ./experiments/scripts/test_faster_rcnn.sh $@
93 |
--------------------------------------------------------------------------------
/experiments/scripts/train_faster_rcnn.sh~:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -x
4 | set -e
5 |
6 | export PYTHONUNBUFFERED="True"
7 |
8 | GPU_ID=$1
9 | DATASET=$2
10 | NET=$3
11 |
12 | array=( $@ )
13 | len=${#array[@]}
14 | EXTRA_ARGS=${array[@]:3:$len}
15 | EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}
16 |
17 | case ${DATASET} in
18 | graspRGB)
19 | TRAIN_IMDB="graspRGB_train"
20 | TEST_IMDB="graspRGB_test"
21 | STEPSIZE=50000
22 | ITERS=160000
23 | #ANCHORS="[2,4,8,16,32]"
24 | ANCHORS="[8,16,32]"
25 | RATIOS="[0.5,1,2]"
26 | ;;
27 | pascal_voc)
28 | TRAIN_IMDB="voc_2007_trainval"
29 | TEST_IMDB="voc_2007_test"
30 | STEPSIZE=50000
31 | ITERS=70000
32 | ANCHORS="[8,16,32]"
33 | RATIOS="[0.5,1,2]"
34 | ;;
35 | pascal_voc_0712)
36 | TRAIN_IMDB="voc_2007_trainval+voc_2012_trainval"
37 | TEST_IMDB="voc_2007_test"
38 | STEPSIZE=80000
39 | ITERS=110000
40 | ANCHORS="[8,16,32]"
41 | RATIOS="[0.5,1,2]"
42 | ;;
43 | coco)
44 | TRAIN_IMDB="coco_2014_train+coco_2014_valminusminival"
45 | TEST_IMDB="coco_2014_minival"
46 | STEPSIZE=350000
47 | ITERS=490000
48 | ANCHORS="[4,8,16,32]"
49 | RATIOS="[0.5,1,2]"
50 | ;;
51 | *)
52 | echo "No dataset given"
53 | exit
54 | ;;
55 | esac
56 |
57 | LOG="experiments/logs/${NET}_${TRAIN_IMDB}_${EXTRA_ARGS_SLUG}_${NET}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
58 | exec &> >(tee -a "$LOG")
59 | echo Logging output to "$LOG"
60 |
61 | set +x
62 | if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
63 | NET_FINAL=output/${NET}/${TRAIN_IMDB}/${EXTRA_ARGS_SLUG}/${NET}_faster_rcnn_iter_${ITERS}.ckpt
64 | else
65 | NET_FINAL=output/${NET}/${TRAIN_IMDB}/default/${NET}_faster_rcnn_iter_${ITERS}.ckpt
66 | fi
67 | set -x
68 |
69 | if [ ! -f ${NET_FINAL}.index ]; then
70 | if [[ ! -z ${EXTRA_ARGS_SLUG} ]]; then
71 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
72 | --weight data/imagenet_weights/${NET}.ckpt \
73 | --imdb ${TRAIN_IMDB} \
74 | --imdbval ${TEST_IMDB} \
75 | --iters ${ITERS} \
76 | --cfg experiments/cfgs/${NET}.yml \
77 | --tag ${EXTRA_ARGS_SLUG} \
78 | --net ${NET} \
79 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
80 | else
81 | CUDA_VISIBLE_DEVICES=${GPU_ID} time python ./tools/trainval_net.py \
82 | --weight data/imagenet_weights/${NET}.ckpt \
83 | --imdb ${TRAIN_IMDB} \
84 | --imdbval ${TEST_IMDB} \
85 | --iters ${ITERS} \
86 | --cfg experiments/cfgs/${NET}.yml \
87 | --net ${NET} \
88 | --set ANCHOR_SCALES ${ANCHORS} ANCHOR_RATIOS ${RATIOS} TRAIN.STEPSIZE ${STEPSIZE} ${EXTRA_ARGS}
89 | fi
90 | fi
91 |
92 | ./experiments/scripts/test_faster_rcnn.sh $@
93 |
--------------------------------------------------------------------------------
/fig/ivalab_dataset.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/fig/ivalab_dataset.png
--------------------------------------------------------------------------------
/fig/multi_grasp_multi_object.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/fig/multi_grasp_multi_object.png
--------------------------------------------------------------------------------
/lib/Makefile:
--------------------------------------------------------------------------------
1 | all:
2 | python setup.py build_ext --inplace
3 | rm -rf build
4 | clean:
5 | rm -rf */*.pyc
6 | rm -rf */*.so
7 |
--------------------------------------------------------------------------------
/lib/datasets/VOCdevkit-matlab-wrapper/get_voc_opts.m:
--------------------------------------------------------------------------------
1 | function VOCopts = get_voc_opts(path)
2 |
3 | tmp = pwd;
4 | cd(path);
5 | try
6 | addpath('VOCcode');
7 | VOCinit;
8 | catch
9 | rmpath('VOCcode');
10 | cd(tmp);
11 | error(sprintf('VOCcode directory not found under %s', path));
12 | end
13 | rmpath('VOCcode');
14 | cd(tmp);
15 |
--------------------------------------------------------------------------------
/lib/datasets/VOCdevkit-matlab-wrapper/voc_eval.m:
--------------------------------------------------------------------------------
1 | function res = voc_eval(path, comp_id, test_set, output_dir)
2 |
3 | VOCopts = get_voc_opts(path);
4 | VOCopts.testset = test_set;
5 |
6 | for i = 1:length(VOCopts.classes)
7 | cls = VOCopts.classes{i};
8 | res(i) = voc_eval_cls(cls, VOCopts, comp_id, output_dir);
9 | end
10 |
11 | fprintf('\n~~~~~~~~~~~~~~~~~~~~\n');
12 | fprintf('Results:\n');
13 | aps = [res(:).ap]';
14 | fprintf('%.1f\n', aps * 100);
15 | fprintf('%.1f\n', mean(aps) * 100);
16 | fprintf('~~~~~~~~~~~~~~~~~~~~\n');
17 |
18 | function res = voc_eval_cls(cls, VOCopts, comp_id, output_dir)
19 |
20 | test_set = VOCopts.testset;
21 | year = VOCopts.dataset(4:end);
22 |
23 | addpath(fullfile(VOCopts.datadir, 'VOCcode'));
24 |
25 | res_fn = sprintf(VOCopts.detrespath, comp_id, cls);
26 |
27 | recall = [];
28 | prec = [];
29 | ap = 0;
30 | ap_auc = 0;
31 |
32 | do_eval = (str2num(year) <= 2007) | ~strcmp(test_set, 'test');
33 | if do_eval
34 | % Bug in VOCevaldet requires that tic has been called first
35 | tic;
36 | [recall, prec, ap] = VOCevaldet(VOCopts, comp_id, cls, true);
37 | ap_auc = xVOCap(recall, prec);
38 |
39 | % force plot limits
40 | ylim([0 1]);
41 | xlim([0 1]);
42 |
43 | print(gcf, '-djpeg', '-r0', ...
44 | [output_dir '/' cls '_pr.jpg']);
45 | end
46 | fprintf('!!! %s : %.4f %.4f\n', cls, ap, ap_auc);
47 |
48 | res.recall = recall;
49 | res.prec = prec;
50 | res.ap = ap;
51 | res.ap_auc = ap_auc;
52 |
53 | save([output_dir '/' cls '_pr.mat'], ...
54 | 'res', 'recall', 'prec', 'ap', 'ap_auc');
55 |
56 | rmpath(fullfile(VOCopts.datadir, 'VOCcode'));
57 |
--------------------------------------------------------------------------------
/lib/datasets/VOCdevkit-matlab-wrapper/xVOCap.m:
--------------------------------------------------------------------------------
1 | function ap = xVOCap(rec,prec)
2 | % From the PASCAL VOC 2011 devkit
3 |
4 | mrec=[0 ; rec ; 1];
5 | mpre=[0 ; prec ; 0];
6 | for i=numel(mpre)-1:-1:1
7 | mpre(i)=max(mpre(i),mpre(i+1));
8 | end
9 | i=find(mrec(2:end)~=mrec(1:end-1))+1;
10 | ap=sum((mrec(i)-mrec(i-1)).*mpre(i));
11 |
--------------------------------------------------------------------------------
/lib/datasets/__init__.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
--------------------------------------------------------------------------------
/lib/datasets/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/__init__.pyc
--------------------------------------------------------------------------------
/lib/datasets/coco.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/coco.pyc
--------------------------------------------------------------------------------
/lib/datasets/ds_utils.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast/er R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Ross Girshick
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import numpy as np
11 |
12 |
13 | def unique_boxes(boxes, scale=1.0):
14 | """Return indices of unique boxes."""
15 | v = np.array([1, 1e3, 1e6, 1e9])
16 | hashes = np.round(boxes * scale).dot(v)
17 | _, index = np.unique(hashes, return_index=True)
18 | return np.sort(index)
19 |
20 |
21 | def xywh_to_xyxy(boxes):
22 | """Convert [x y w h] box format to [x1 y1 x2 y2] format."""
23 | return np.hstack((boxes[:, 0:2], boxes[:, 0:2] + boxes[:, 2:4] - 1))
24 |
25 |
26 | def xyxy_to_xywh(boxes):
27 | """Convert [x1 y1 x2 y2] box format to [x y w h] format."""
28 | return np.hstack((boxes[:, 0:2], boxes[:, 2:4] - boxes[:, 0:2] + 1))
29 |
30 |
31 | def validate_boxes(boxes, width=0, height=0):
32 | """Check that a set of boxes are valid."""
33 | x1 = boxes[:, 0]
34 | y1 = boxes[:, 1]
35 | x2 = boxes[:, 2]
36 | y2 = boxes[:, 3]
37 | assert (x1 >= 0).all()
38 | assert (y1 >= 0).all()
39 | assert (x2 >= x1).all()
40 | assert (y2 >= y1).all()
41 | assert (x2 < width).all()
42 | assert (y2 < height).all()
43 |
44 |
45 | def filter_small_boxes(boxes, min_size):
46 | w = boxes[:, 2] - boxes[:, 0]
47 | h = boxes[:, 3] - boxes[:, 1]
48 | keep = np.where((w >= min_size) & (h >= min_size))[0]
49 | return keep
50 |
--------------------------------------------------------------------------------
/lib/datasets/ds_utils.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/ds_utils.pyc
--------------------------------------------------------------------------------
/lib/datasets/factory.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | """Factory method for easily getting imdbs by name."""
9 | from __future__ import absolute_import
10 | from __future__ import division
11 | from __future__ import print_function
12 | from datasets.graspRGB import graspRGB
13 |
14 | __sets = {}
15 | from datasets.pascal_voc import pascal_voc
16 | from datasets.coco import coco
17 |
18 | import numpy as np
19 |
20 | # Set up voc__
21 | for year in ['2007', '2012']:
22 | for split in ['train', 'val', 'trainval', 'test']:
23 | name = 'voc_{}_{}'.format(year, split)
24 | __sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
25 |
26 | # Set up coco_2014_
27 | for year in ['2014']:
28 | for split in ['train', 'val', 'minival', 'valminusminival', 'trainval']:
29 | name = 'coco_{}_{}'.format(year, split)
30 | __sets[name] = (lambda split=split, year=year: coco(split, year))
31 |
32 | # Set up coco_2015_
33 | for year in ['2015']:
34 | for split in ['test', 'test-dev']:
35 | name = 'coco_{}_{}'.format(year, split)
36 | __sets[name] = (lambda split=split, year=year: coco(split, year))
37 |
38 | # Set up graspRGB_ using selective search "fast" mode # added by FC
39 | graspRGB_devkit_path = '/media/fujenchu/home3/fasterrcnn_grasp/rgb_multibbs_5_5_5_object_tf'
40 | for split in ['train', 'test']:
41 | name = '{}_{}'.format('graspRGB', split)
42 | __sets[name] = (lambda split=split: graspRGB(split, graspRGB_devkit_path))
43 |
44 | def get_imdb(name):
45 | """Get an imdb (image database) by name."""
46 | if name not in __sets:
47 | raise KeyError('Unknown dataset: {}'.format(name))
48 | return __sets[name]()
49 |
50 |
51 | def list_imdbs():
52 | """List all registered imdbs."""
53 | return list(__sets.keys())
54 |
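55 | # Example usage (illustrative): names registered above resolve via get_imdb,
56 | # e.g. get_imdb('graspRGB_train') or get_imdb('voc_2007_trainval').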
--------------------------------------------------------------------------------
/lib/datasets/factory.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/factory.pyc
--------------------------------------------------------------------------------
/lib/datasets/factory.py~:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | """Factory method for easily getting imdbs by name."""
9 | from __future__ import absolute_import
10 | from __future__ import division
11 | from __future__ import print_function
12 | from datasets.graspRGB import graspRGB
13 |
14 | __sets = {}
15 | from datasets.pascal_voc import pascal_voc
16 | from datasets.coco import coco
17 |
18 | import numpy as np
19 |
20 | # Set up voc__
21 | for year in ['2007', '2012']:
22 | for split in ['train', 'val', 'trainval', 'test']:
23 | name = 'voc_{}_{}'.format(year, split)
24 | __sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
25 |
26 | # Set up coco_2014_
27 | for year in ['2014']:
28 | for split in ['train', 'val', 'minival', 'valminusminival', 'trainval']:
29 | name = 'coco_{}_{}'.format(year, split)
30 | __sets[name] = (lambda split=split, year=year: coco(split, year))
31 |
32 | # Set up coco_2015_
33 | for year in ['2015']:
34 | for split in ['test', 'test-dev']:
35 | name = 'coco_{}_{}'.format(year, split)
36 | __sets[name] = (lambda split=split, year=year: coco(split, year))
37 |
38 | # Set up graspRGB_ using selective search "fast" mode # added by FC
39 | graspRGB_devkit_path = '/media/fujenchu/home3/fasterrcnn_grasp/rgd_multibbs_5_5_5_object_tf'
40 | for split in ['train', 'test']:
41 | name = '{}_{}'.format('graspRGB', split)
42 | __sets[name] = (lambda split=split: graspRGB(split, graspRGB_devkit_path))
43 |
44 | def get_imdb(name):
45 | """Get an imdb (image database) by name."""
46 | if name not in __sets:
47 | raise KeyError('Unknown dataset: {}'.format(name))
48 | return __sets[name]()
49 |
50 |
51 | def list_imdbs():
52 | """List all registered imdbs."""
53 | return list(__sets.keys())
54 |
--------------------------------------------------------------------------------
/lib/datasets/graspRGB.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/graspRGB.pyc
--------------------------------------------------------------------------------
/lib/datasets/imdb.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick and Xinlei Chen
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import os
12 | import os.path as osp
13 | import PIL
14 | from utils.cython_bbox import bbox_overlaps
15 | import numpy as np
16 | import scipy.sparse
17 | from model.config import cfg
18 |
19 |
20 | class imdb(object):
21 | """Image database."""
22 |
23 | def __init__(self, name, classes=None):
24 | self._name = name
25 | self._num_classes = 0
26 | if not classes:
27 | self._classes = []
28 | else:
29 | self._classes = classes
30 | self._image_index = []
31 | self._obj_proposer = 'gt'
32 | self._roidb = None
33 | self._roidb_handler = self.default_roidb
34 | # Use this dict for storing dataset specific config options
35 | self.config = {}
36 |
37 | @property
38 | def name(self):
39 | return self._name
40 |
41 | @property
42 | def num_classes(self):
43 | return len(self._classes)
44 |
45 | @property
46 | def classes(self):
47 | return self._classes
48 |
49 | @property
50 | def image_index(self):
51 | return self._image_index
52 |
53 | @property
54 | def roidb_handler(self):
55 | return self._roidb_handler
56 |
57 | @roidb_handler.setter
58 | def roidb_handler(self, val):
59 | self._roidb_handler = val
60 |
61 | def set_proposal_method(self, method):
62 | method = eval('self.' + method + '_roidb')
63 | self.roidb_handler = method
64 |
65 | @property
66 | def roidb(self):
67 | # A roidb is a list of dictionaries, each with the following keys:
68 | # boxes
69 | # gt_overlaps
70 | # gt_classes
71 | # flipped
72 | if self._roidb is not None:
73 | return self._roidb
74 | self._roidb = self.roidb_handler()
75 | return self._roidb
76 |
77 | @property
78 | def cache_path(self):
79 | cache_path = osp.abspath(osp.join(cfg.DATA_DIR, 'cache'))
80 | if not os.path.exists(cache_path):
81 | os.makedirs(cache_path)
82 | return cache_path
83 |
84 | @property
85 | def num_images(self):
86 | return len(self.image_index)
87 |
88 | def image_path_at(self, i):
89 | raise NotImplementedError
90 |
91 | def default_roidb(self):
92 | raise NotImplementedError
93 |
94 | def evaluate_detections(self, all_boxes, output_dir=None):
95 | """
96 | all_boxes is a list of length number-of-classes.
97 | Each list element is a list of length number-of-images.
98 | Each of those list elements is either an empty list []
99 | or a numpy array of detection.
100 |
101 | all_boxes[class][image] = [] or np.array of shape #dets x 5
102 | """
103 | raise NotImplementedError
104 |
105 | def _get_widths(self):
106 | return [PIL.Image.open(self.image_path_at(i)).size[0]
107 | for i in range(self.num_images)]
108 |
109 | def append_flipped_images(self):
110 | num_images = self.num_images
111 | widths = self._get_widths()
112 | for i in range(num_images):
113 | boxes = self.roidb[i]['boxes'].copy()
114 | oldx1 = boxes[:, 0].copy()
115 | oldx2 = boxes[:, 2].copy()
116 | boxes[:, 0] = widths[i] - oldx2 - 1
117 | boxes[:, 2] = widths[i] - oldx1 - 1
118 | assert (boxes[:, 2] >= boxes[:, 0]).all()
119 | entry = {'boxes': boxes,
120 | 'gt_overlaps': self.roidb[i]['gt_overlaps'],
121 | 'gt_classes': self.roidb[i]['gt_classes'],
122 | 'flipped': True}
123 | self.roidb.append(entry)
124 | self._image_index = self._image_index * 2
125 |
126 | def evaluate_recall(self, candidate_boxes=None, thresholds=None,
127 | area='all', limit=None):
128 | """Evaluate detection proposal recall metrics.
129 |
130 | Returns:
131 | results: dictionary of results with keys
132 | 'ar': average recall
133 | 'recalls': vector recalls at each IoU overlap threshold
134 | 'thresholds': vector of IoU overlap thresholds
135 | 'gt_overlaps': vector of all ground-truth overlaps
136 | """
137 | # Record max overlap value for each gt box
138 | # Return vector of overlap values
139 | areas = {'all': 0, 'small': 1, 'medium': 2, 'large': 3,
140 | '96-128': 4, '128-256': 5, '256-512': 6, '512-inf': 7}
141 | area_ranges = [[0 ** 2, 1e5 ** 2], # all
142 | [0 ** 2, 32 ** 2], # small
143 | [32 ** 2, 96 ** 2], # medium
144 | [96 ** 2, 1e5 ** 2], # large
145 | [96 ** 2, 128 ** 2], # 96-128
146 | [128 ** 2, 256 ** 2], # 128-256
147 | [256 ** 2, 512 ** 2], # 256-512
148 | [512 ** 2, 1e5 ** 2], # 512-inf
149 | ]
150 | assert area in areas, 'unknown area range: {}'.format(area)
151 | area_range = area_ranges[areas[area]]
152 | gt_overlaps = np.zeros(0)
153 | num_pos = 0
154 | for i in range(self.num_images):
155 | # Checking for max_overlaps == 1 avoids including crowd annotations
156 | # (...pretty hacking :/)
157 | max_gt_overlaps = self.roidb[i]['gt_overlaps'].toarray().max(axis=1)
158 | gt_inds = np.where((self.roidb[i]['gt_classes'] > 0) &
159 | (max_gt_overlaps == 1))[0]
160 | gt_boxes = self.roidb[i]['boxes'][gt_inds, :]
161 | gt_areas = self.roidb[i]['seg_areas'][gt_inds]
162 | valid_gt_inds = np.where((gt_areas >= area_range[0]) &
163 | (gt_areas <= area_range[1]))[0]
164 | gt_boxes = gt_boxes[valid_gt_inds, :]
165 | num_pos += len(valid_gt_inds)
166 |
167 | if candidate_boxes is None:
168 | # If candidate_boxes is not supplied, the default is to use the
169 | # non-ground-truth boxes from this roidb
170 | non_gt_inds = np.where(self.roidb[i]['gt_classes'] == 0)[0]
171 | boxes = self.roidb[i]['boxes'][non_gt_inds, :]
172 | else:
173 | boxes = candidate_boxes[i]
174 | if boxes.shape[0] == 0:
175 | continue
176 | if limit is not None and boxes.shape[0] > limit:
177 | boxes = boxes[:limit, :]
178 |
179 | overlaps = bbox_overlaps(boxes.astype(np.float),
180 | gt_boxes.astype(np.float))
181 |
182 | _gt_overlaps = np.zeros((gt_boxes.shape[0]))
183 | for j in range(gt_boxes.shape[0]):
184 | # find which proposal box maximally covers each gt box
185 | argmax_overlaps = overlaps.argmax(axis=0)
186 | # and get the iou amount of coverage for each gt box
187 | max_overlaps = overlaps.max(axis=0)
188 | # find which gt box is 'best' covered (i.e. 'best' = most iou)
189 | gt_ind = max_overlaps.argmax()
190 | gt_ovr = max_overlaps.max()
191 | assert (gt_ovr >= 0)
192 | # find the proposal box that covers the best covered gt box
193 | box_ind = argmax_overlaps[gt_ind]
194 | # record the iou coverage of this gt box
195 | _gt_overlaps[j] = overlaps[box_ind, gt_ind]
196 | assert (_gt_overlaps[j] == gt_ovr)
197 | # mark the proposal box and the gt box as used
198 | overlaps[box_ind, :] = -1
199 | overlaps[:, gt_ind] = -1
200 | # append recorded iou coverage level
201 | gt_overlaps = np.hstack((gt_overlaps, _gt_overlaps))
202 |
203 | gt_overlaps = np.sort(gt_overlaps)
204 | if thresholds is None:
205 | step = 0.05
206 | thresholds = np.arange(0.5, 0.95 + 1e-5, step)
207 | recalls = np.zeros_like(thresholds)
208 | # compute recall for each iou threshold
209 | for i, t in enumerate(thresholds):
210 | recalls[i] = (gt_overlaps >= t).sum() / float(num_pos)
211 | # ar = 2 * np.trapz(recalls, thresholds)
212 | ar = recalls.mean()
213 | return {'ar': ar, 'recalls': recalls, 'thresholds': thresholds,
214 | 'gt_overlaps': gt_overlaps}
215 |
216 | def create_roidb_from_box_list(self, box_list, gt_roidb):
217 | assert len(box_list) == self.num_images, \
218 | 'Number of boxes must match number of ground-truth images'
219 | roidb = []
220 | for i in range(self.num_images):
221 | boxes = box_list[i]
222 | num_boxes = boxes.shape[0]
223 | overlaps = np.zeros((num_boxes, self.num_classes), dtype=np.float32)
224 |
225 | if gt_roidb is not None and gt_roidb[i]['boxes'].size > 0:
226 | gt_boxes = gt_roidb[i]['boxes']
227 | gt_classes = gt_roidb[i]['gt_classes']
228 | gt_overlaps = bbox_overlaps(boxes.astype(np.float),
229 | gt_boxes.astype(np.float))
230 | argmaxes = gt_overlaps.argmax(axis=1)
231 | maxes = gt_overlaps.max(axis=1)
232 | I = np.where(maxes > 0)[0]
233 | overlaps[I, gt_classes[argmaxes[I]]] = maxes[I]
234 |
235 | overlaps = scipy.sparse.csr_matrix(overlaps)
236 | roidb.append({
237 | 'boxes': boxes,
238 | 'gt_classes': np.zeros((num_boxes,), dtype=np.int32),
239 | 'gt_overlaps': overlaps,
240 | 'flipped': False,
241 | 'seg_areas': np.zeros((num_boxes,), dtype=np.float32),
242 | })
243 | return roidb
244 |
245 | @staticmethod
246 | def merge_roidbs(a, b):
247 | assert len(a) == len(b)
248 | for i in range(len(a)):
249 | a[i]['boxes'] = np.vstack((a[i]['boxes'], b[i]['boxes']))
250 | a[i]['gt_classes'] = np.hstack((a[i]['gt_classes'],
251 | b[i]['gt_classes']))
252 | a[i]['gt_overlaps'] = scipy.sparse.vstack([a[i]['gt_overlaps'],
253 | b[i]['gt_overlaps']])
254 | a[i]['seg_areas'] = np.hstack((a[i]['seg_areas'],
255 | b[i]['seg_areas']))
256 | return a
257 |
258 | def competition_mode(self, on):
259 | """Turn competition mode on or off."""
260 | pass
261 |
--------------------------------------------------------------------------------
/lib/datasets/imdb.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/imdb.pyc
--------------------------------------------------------------------------------
/lib/datasets/pascal_voc.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick and Xinlei Chen
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import os
12 | from datasets.imdb import imdb
13 | import datasets.ds_utils as ds_utils
14 | import xml.etree.ElementTree as ET
15 | import numpy as np
16 | import scipy.sparse
17 | import scipy.io as sio
18 | import utils.cython_bbox
19 | import pickle
20 | import subprocess
21 | import uuid
22 | from .voc_eval import voc_eval
23 | from model.config import cfg
24 |
25 |
26 | class pascal_voc(imdb):
27 | def __init__(self, image_set, year, devkit_path=None):
28 | imdb.__init__(self, 'voc_' + year + '_' + image_set)
29 | self._year = year
30 | self._image_set = image_set
31 | self._devkit_path = self._get_default_path() if devkit_path is None \
32 | else devkit_path
33 | self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
34 | self._classes = ('__background__', # always index 0
35 | 'aeroplane', 'bicycle', 'bird', 'boat',
36 | 'bottle', 'bus', 'car', 'cat', 'chair',
37 | 'cow', 'diningtable', 'dog', 'horse',
38 | 'motorbike', 'person', 'pottedplant',
39 | 'sheep', 'sofa', 'train', 'tvmonitor')
40 | self._class_to_ind = dict(list(zip(self.classes, list(range(self.num_classes)))))
41 | self._image_ext = '.jpg'
42 | self._image_index = self._load_image_set_index()
43 | # Default to roidb handler
44 | self._roidb_handler = self.gt_roidb
45 | self._salt = str(uuid.uuid4())
46 | self._comp_id = 'comp4'
47 |
48 | # PASCAL specific config options
49 | self.config = {'cleanup': True,
50 | 'use_salt': True,
51 | 'use_diff': False,
52 | 'matlab_eval': False,
53 | 'rpn_file': None}
54 |
55 | assert os.path.exists(self._devkit_path), \
56 | 'VOCdevkit path does not exist: {}'.format(self._devkit_path)
57 | assert os.path.exists(self._data_path), \
58 | 'Path does not exist: {}'.format(self._data_path)
59 |
60 | def image_path_at(self, i):
61 | """
62 | Return the absolute path to image i in the image sequence.
63 | """
64 | return self.image_path_from_index(self._image_index[i])
65 |
66 | def image_path_from_index(self, index):
67 | """
68 | Construct an image path from the image's "index" identifier.
69 | """
70 | image_path = os.path.join(self._data_path, 'JPEGImages',
71 | index + self._image_ext)
72 | assert os.path.exists(image_path), \
73 | 'Path does not exist: {}'.format(image_path)
74 | return image_path
75 |
76 | def _load_image_set_index(self):
77 | """
78 | Load the indexes listed in this dataset's image set file.
79 | """
80 | # Example path to image set file:
81 | # self._devkit_path + /VOCdevkit2007/VOC2007/ImageSets/Main/val.txt
82 | image_set_file = os.path.join(self._data_path, 'ImageSets', 'Main',
83 | self._image_set + '.txt')
84 | assert os.path.exists(image_set_file), \
85 | 'Path does not exist: {}'.format(image_set_file)
86 | with open(image_set_file) as f:
87 | image_index = [x.strip() for x in f.readlines()]
88 | return image_index
89 |
90 | def _get_default_path(self):
91 | """
92 | Return the default path where PASCAL VOC is expected to be installed.
93 | """
94 | return os.path.join(cfg.DATA_DIR, 'VOCdevkit' + self._year)
95 |
96 | def gt_roidb(self):
97 | """
98 | Return the database of ground-truth regions of interest.
99 |
100 | This function loads/saves from/to a cache file to speed up future calls.
101 | """
102 | cache_file = os.path.join(self.cache_path, self.name + '_gt_roidb.pkl')
103 | if os.path.exists(cache_file):
104 | with open(cache_file, 'rb') as fid:
105 | try:
106 | roidb = pickle.load(fid)
107 | except:
108 | roidb = pickle.load(fid, encoding='bytes')
109 | print('{} gt roidb loaded from {}'.format(self.name, cache_file))
110 | return roidb
111 |
112 | gt_roidb = [self._load_pascal_annotation(index)
113 | for index in self.image_index]
114 | with open(cache_file, 'wb') as fid:
115 | pickle.dump(gt_roidb, fid, pickle.HIGHEST_PROTOCOL)
116 | print('wrote gt roidb to {}'.format(cache_file))
117 |
118 | return gt_roidb
119 |
120 | def rpn_roidb(self):
121 | if int(self._year) == 2007 or self._image_set != 'test':
122 | gt_roidb = self.gt_roidb()
123 | rpn_roidb = self._load_rpn_roidb(gt_roidb)
124 | roidb = imdb.merge_roidbs(gt_roidb, rpn_roidb)
125 | else:
126 | roidb = self._load_rpn_roidb(None)
127 |
128 | return roidb
129 |
130 | def _load_rpn_roidb(self, gt_roidb):
131 | filename = self.config['rpn_file']
132 | print('loading {}'.format(filename))
133 | assert os.path.exists(filename), \
134 | 'rpn data not found at: {}'.format(filename)
135 | with open(filename, 'rb') as f:
136 | box_list = pickle.load(f)
137 | return self.create_roidb_from_box_list(box_list, gt_roidb)
138 |
139 | def _load_pascal_annotation(self, index):
140 | """
141 | Load image and bounding boxes info from XML file in the PASCAL VOC
142 | format.
143 | """
144 | filename = os.path.join(self._data_path, 'Annotations', index + '.xml')
145 | tree = ET.parse(filename)
146 | objs = tree.findall('object')
147 | if not self.config['use_diff']:
148 | # Exclude the samples labeled as difficult
149 | non_diff_objs = [
150 | obj for obj in objs if int(obj.find('difficult').text) == 0]
151 | # if len(non_diff_objs) != len(objs):
152 | # print 'Removed {} difficult objects'.format(
153 | # len(objs) - len(non_diff_objs))
154 | objs = non_diff_objs
155 | num_objs = len(objs)
156 |
157 | boxes = np.zeros((num_objs, 4), dtype=np.uint16)
158 | gt_classes = np.zeros((num_objs), dtype=np.int32)
159 | overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
160 | # "Seg" area for pascal is just the box area
161 | seg_areas = np.zeros((num_objs), dtype=np.float32)
162 |
163 | # Load object bounding boxes into a data frame.
164 | for ix, obj in enumerate(objs):
165 | bbox = obj.find('bndbox')
166 | # Make pixel indexes 0-based
167 | x1 = float(bbox.find('xmin').text) - 1
168 | y1 = float(bbox.find('ymin').text) - 1
169 | x2 = float(bbox.find('xmax').text) - 1
170 | y2 = float(bbox.find('ymax').text) - 1
171 | cls = self._class_to_ind[obj.find('name').text.lower().strip()]
172 | boxes[ix, :] = [x1, y1, x2, y2]
173 | gt_classes[ix] = cls
174 | overlaps[ix, cls] = 1.0
175 | seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)
176 |
177 | overlaps = scipy.sparse.csr_matrix(overlaps)
178 |
179 | return {'boxes': boxes,
180 | 'gt_classes': gt_classes,
181 | 'gt_overlaps': overlaps,
182 | 'flipped': False,
183 | 'seg_areas': seg_areas}
184 |
185 | def _get_comp_id(self):
186 | comp_id = (self._comp_id + '_' + self._salt if self.config['use_salt']
187 | else self._comp_id)
188 | return comp_id
189 |
190 | def _get_voc_results_file_template(self):
191 | # VOCdevkit/results/VOC2007/Main/<comp_id>_det_test_aeroplane.txt
192 | filename = self._get_comp_id() + '_det_' + self._image_set + '_{:s}.txt'
193 | path = os.path.join(
194 | self._devkit_path,
195 | 'results',
196 | 'VOC' + self._year,
197 | 'Main',
198 | filename)
199 | return path
200 |
201 | def _write_voc_results_file(self, all_boxes):
202 | for cls_ind, cls in enumerate(self.classes):
203 | if cls == '__background__':
204 | continue
205 | print('Writing {} VOC results file'.format(cls))
206 | filename = self._get_voc_results_file_template().format(cls)
207 | with open(filename, 'wt') as f:
208 | for im_ind, index in enumerate(self.image_index):
209 | dets = all_boxes[cls_ind][im_ind]
210 | if len(dets) == 0:
211 | continue
212 | # the VOCdevkit expects 1-based indices
213 | for k in range(dets.shape[0]):
214 | f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
215 | format(index, dets[k, -1],
216 | dets[k, 0] + 1, dets[k, 1] + 1,
217 | dets[k, 2] + 1, dets[k, 3] + 1))
218 |
219 | def _do_python_eval(self, output_dir='output'):
220 | annopath = os.path.join(
221 | self._devkit_path,
222 | 'VOC' + self._year,
223 | 'Annotations',
224 | '{:s}.xml')
225 | imagesetfile = os.path.join(
226 | self._devkit_path,
227 | 'VOC' + self._year,
228 | 'ImageSets',
229 | 'Main',
230 | self._image_set + '.txt')
231 | cachedir = os.path.join(self._devkit_path, 'annotations_cache')
232 | aps = []
233 | # The PASCAL VOC metric changed in 2010
234 | use_07_metric = int(self._year) < 2010
235 | print('VOC07 metric? ' + ('Yes' if use_07_metric else 'No'))
236 | if not os.path.isdir(output_dir):
237 | os.mkdir(output_dir)
238 | for i, cls in enumerate(self._classes):
239 | if cls == '__background__':
240 | continue
241 | filename = self._get_voc_results_file_template().format(cls)
242 | rec, prec, ap = voc_eval(
243 | filename, annopath, imagesetfile, cls, cachedir, ovthresh=0.5,
244 | use_07_metric=use_07_metric)
245 | aps += [ap]
246 | print(('AP for {} = {:.4f}'.format(cls, ap)))
247 | with open(os.path.join(output_dir, cls + '_pr.pkl'), 'wb') as f:
248 | pickle.dump({'rec': rec, 'prec': prec, 'ap': ap}, f)
249 | print(('Mean AP = {:.4f}'.format(np.mean(aps))))
250 | print('~~~~~~~~')
251 | print('Results:')
252 | for ap in aps:
253 | print(('{:.3f}'.format(ap)))
254 | print(('{:.3f}'.format(np.mean(aps))))
255 | print('~~~~~~~~')
256 | print('')
257 | print('--------------------------------------------------------------')
258 | print('Results computed with the **unofficial** Python eval code.')
259 | print('Results should be very close to the official MATLAB eval code.')
260 | print('Recompute with `./tools/reval.py --matlab ...` for your paper.')
261 | print('-- Thanks, The Management')
262 | print('--------------------------------------------------------------')
263 |
264 | def _do_matlab_eval(self, output_dir='output'):
265 | print('-----------------------------------------------------')
266 | print('Computing results with the official MATLAB eval code.')
267 | print('-----------------------------------------------------')
268 | path = os.path.join(cfg.ROOT_DIR, 'lib', 'datasets',
269 | 'VOCdevkit-matlab-wrapper')
270 | cmd = 'cd {} && '.format(path)
271 | cmd += '{:s} -nodisplay -nodesktop '.format(cfg.MATLAB)
272 | cmd += '-r "dbstop if error; '
273 | cmd += 'voc_eval(\'{:s}\',\'{:s}\',\'{:s}\',\'{:s}\'); quit;"' \
274 | .format(self._devkit_path, self._get_comp_id(),
275 | self._image_set, output_dir)
276 | print(('Running:\n{}'.format(cmd)))
277 | status = subprocess.call(cmd, shell=True)
278 |
279 | def evaluate_detections(self, all_boxes, output_dir):
280 | self._write_voc_results_file(all_boxes)
281 | self._do_python_eval(output_dir)
282 | if self.config['matlab_eval']:
283 | self._do_matlab_eval(output_dir)
284 | if self.config['cleanup']:
285 | for cls in self._classes:
286 | if cls == '__background__':
287 | continue
288 | filename = self._get_voc_results_file_template().format(cls)
289 | os.remove(filename)
290 |
291 | def competition_mode(self, on):
292 | if on:
293 | self.config['use_salt'] = False
294 | self.config['cleanup'] = False
295 | else:
296 | self.config['use_salt'] = True
297 | self.config['cleanup'] = True
298 |
299 |
300 | if __name__ == '__main__':
301 | from datasets.pascal_voc import pascal_voc
302 |
303 | d = pascal_voc('trainval', '2007')
304 | res = d.roidb
305 | from IPython import embed;
306 |
307 | embed()
308 |
--------------------------------------------------------------------------------
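`_load_pascal_annotation` above reads one PASCAL-style XML file per image and turns each `<object>` into a 0-based `(x1, y1, x2, y2)` row plus a class index. A minimal self-contained sketch of that conversion; the inline XML string and the `graspable` class name are illustrative placeholders, not taken from the dataset:

```python
import numpy as np
import xml.etree.ElementTree as ET

# Illustrative annotation in PASCAL VOC format (not a real dataset file).
XML = """<annotation>
  <object>
    <name>graspable</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""

class_to_ind = {'__background__': 0, 'graspable': 1}  # assumed class list

objs = ET.fromstring(XML).findall('object')
boxes = np.zeros((len(objs), 4), dtype=np.uint16)
gt_classes = np.zeros(len(objs), dtype=np.int32)
for ix, obj in enumerate(objs):
    bb = obj.find('bndbox')
    # Make pixel indexes 0-based, exactly as in _load_pascal_annotation
    x1, y1, x2, y2 = (float(bb.find(t).text) - 1
                      for t in ('xmin', 'ymin', 'xmax', 'ymax'))
    boxes[ix, :] = [x1, y1, x2, y2]
    gt_classes[ix] = class_to_ind[obj.find('name').text.lower().strip()]

print(boxes)       # [[ 47 239 194 370]]
print(gt_classes)  # [1]
```
--------------------------------------------------------------------------------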
/lib/datasets/pascal_voc.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/pascal_voc.pyc
--------------------------------------------------------------------------------
/lib/datasets/tools/mcg_munge.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | """Hacky tool to convert file system layout of MCG boxes downloaded from
5 | http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/mcg/
6 | so that it's consistent with those computed by Jan Hosang (see:
7 | http://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-
8 | computing/research/object-recognition-and-scene-understanding/how-
9 | good-are-detection-proposals-really/)
10 |
11 | NB: Boxes from the MCG website are in (y1, x1, y2, x2) order.
12 | Boxes from Hosang et al. are in (x1, y1, x2, y2) order.
13 | """
14 |
15 | def munge(src_dir):
16 | # stored as: ./MCG-COCO-val2014-boxes/COCO_val2014_000000193401.mat
17 | # want: ./MCG/mat/COCO_val2014_0/COCO_val2014_000000141/COCO_val2014_000000141334.mat
18 |
19 | files = os.listdir(src_dir)
20 | for fn in files:
21 | base, ext = os.path.splitext(fn)
22 | # first 14 chars / first 22 chars / all chars + .mat
23 | # COCO_val2014_0/COCO_val2014_000000447/COCO_val2014_000000447991.mat
24 | first = base[:14]
25 | second = base[:22]
26 | dst_dir = os.path.join('MCG', 'mat', first, second)
27 | if not os.path.exists(dst_dir):
28 | os.makedirs(dst_dir)
29 | src = os.path.join(src_dir, fn)
30 | dst = os.path.join(dst_dir, fn)
31 | print('MV: {} -> {}'.format(src, dst))
32 | os.rename(src, dst)
33 |
34 | if __name__ == '__main__':
35 | # src_dir should look something like:
36 | # src_dir = 'MCG-COCO-val2014-boxes'
37 | src_dir = sys.argv[1]
38 | munge(src_dir)
39 |
--------------------------------------------------------------------------------
/lib/datasets/voc_eval.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast/er R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Bharath Hariharan
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import xml.etree.ElementTree as ET
11 | import os
12 | import pickle
13 | import numpy as np
14 |
15 | def parse_rec(filename):
16 | """ Parse a PASCAL VOC xml file """
17 | tree = ET.parse(filename)
18 | objects = []
19 | for obj in tree.findall('object'):
20 | obj_struct = {}
21 | obj_struct['name'] = obj.find('name').text
22 | obj_struct['pose'] = obj.find('pose').text
23 | obj_struct['truncated'] = int(obj.find('truncated').text)
24 | obj_struct['difficult'] = int(obj.find('difficult').text)
25 | bbox = obj.find('bndbox')
26 | obj_struct['bbox'] = [int(bbox.find('xmin').text),
27 | int(bbox.find('ymin').text),
28 | int(bbox.find('xmax').text),
29 | int(bbox.find('ymax').text)]
30 | objects.append(obj_struct)
31 |
32 | return objects
33 |
34 |
35 | def voc_ap(rec, prec, use_07_metric=False):
36 | """ ap = voc_ap(rec, prec, [use_07_metric])
37 | Compute VOC AP given precision and recall.
38 | If use_07_metric is true, uses the
39 | VOC 07 11 point method (default:False).
40 | """
41 | if use_07_metric:
42 | # 11 point metric
43 | ap = 0.
44 | for t in np.arange(0., 1.1, 0.1):
45 | if np.sum(rec >= t) == 0:
46 | p = 0
47 | else:
48 | p = np.max(prec[rec >= t])
49 | ap = ap + p / 11.
50 | else:
51 | # correct AP calculation
52 | # first append sentinel values at the end
53 | mrec = np.concatenate(([0.], rec, [1.]))
54 | mpre = np.concatenate(([0.], prec, [0.]))
55 |
56 | # compute the precision envelope
57 | for i in range(mpre.size - 1, 0, -1):
58 | mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
59 |
60 | # to calculate area under PR curve, look for points
61 | # where X axis (recall) changes value
62 | i = np.where(mrec[1:] != mrec[:-1])[0]
63 |
64 | # and sum (\Delta recall) * prec
65 | ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
66 | return ap
67 |
68 |
69 | def voc_eval(detpath,
70 | annopath,
71 | imagesetfile,
72 | classname,
73 | cachedir,
74 | ovthresh=0.5,
75 | use_07_metric=False):
76 | """rec, prec, ap = voc_eval(detpath,
77 | annopath,
78 | imagesetfile,
79 | classname,
80 | [ovthresh],
81 | [use_07_metric])
82 |
83 | Top level function that does the PASCAL VOC evaluation.
84 |
85 | detpath: Path to detections
86 | detpath.format(classname) should produce the detection results file.
87 | annopath: Path to annotations
88 | annopath.format(imagename) should be the xml annotations file.
89 | imagesetfile: Text file containing the list of images, one image per line.
90 | classname: Category name (duh)
91 | cachedir: Directory for caching the annotations
92 | [ovthresh]: Overlap threshold (default = 0.5)
93 | [use_07_metric]: Whether to use VOC07's 11 point AP computation
94 | (default False)
95 | """
96 | # assumes detections are in detpath.format(classname)
97 | # assumes annotations are in annopath.format(imagename)
98 | # assumes imagesetfile is a text file with each line an image name
99 | # cachedir caches the annotations in a pickle file
100 |
101 | # first load gt
102 | if not os.path.isdir(cachedir):
103 | os.mkdir(cachedir)
104 | cachefile = os.path.join(cachedir, 'annots.pkl')
105 | # read list of images
106 | with open(imagesetfile, 'r') as f:
107 | lines = f.readlines()
108 | imagenames = [x.strip() for x in lines]
109 |
110 | if not os.path.isfile(cachefile):
111 | # load annots
112 | recs = {}
113 | for i, imagename in enumerate(imagenames):
114 | recs[imagename] = parse_rec(annopath.format(imagename))
115 | if i % 100 == 0:
116 | print('Reading annotation for {:d}/{:d}'.format(
117 | i + 1, len(imagenames)))
118 | # save
119 | print('Saving cached annotations to {:s}'.format(cachefile))
120 | with open(cachefile, 'wb') as f:
121 | pickle.dump(recs, f)
122 | else:
123 | # load
124 | with open(cachefile, 'rb') as f:
125 | try:
126 | recs = pickle.load(f)
127 | except:
128 | recs = pickle.load(f, encoding='bytes')
129 |
130 | # extract gt objects for this class
131 | class_recs = {}
132 | npos = 0
133 | for imagename in imagenames:
134 | R = [obj for obj in recs[imagename] if obj['name'] == classname]
135 | bbox = np.array([x['bbox'] for x in R])
136 | difficult = np.array([x['difficult'] for x in R]).astype(bool)
137 | det = [False] * len(R)
138 | npos = npos + sum(~difficult)
139 | class_recs[imagename] = {'bbox': bbox,
140 | 'difficult': difficult,
141 | 'det': det}
142 |
143 | # read dets
144 | detfile = detpath.format(classname)
145 | with open(detfile, 'r') as f:
146 | lines = f.readlines()
147 |
148 | splitlines = [x.strip().split(' ') for x in lines]
149 | image_ids = [x[0] for x in splitlines]
150 | confidence = np.array([float(x[1]) for x in splitlines])
151 | BB = np.array([[float(z) for z in x[2:]] for x in splitlines])
152 |
153 | nd = len(image_ids)
154 | tp = np.zeros(nd)
155 | fp = np.zeros(nd)
156 |
157 | if BB.shape[0] > 0:
158 | # sort by confidence
159 | sorted_ind = np.argsort(-confidence)
160 | sorted_scores = np.sort(-confidence)
161 | BB = BB[sorted_ind, :]
162 | image_ids = [image_ids[x] for x in sorted_ind]
163 |
164 | # go down dets and mark TPs and FPs
165 | for d in range(nd):
166 | R = class_recs[image_ids[d]]
167 | bb = BB[d, :].astype(float)
168 | ovmax = -np.inf
169 | BBGT = R['bbox'].astype(float)
170 |
171 | if BBGT.size > 0:
172 | # compute overlaps
173 | # intersection
174 | ixmin = np.maximum(BBGT[:, 0], bb[0])
175 | iymin = np.maximum(BBGT[:, 1], bb[1])
176 | ixmax = np.minimum(BBGT[:, 2], bb[2])
177 | iymax = np.minimum(BBGT[:, 3], bb[3])
178 | iw = np.maximum(ixmax - ixmin + 1., 0.)
179 | ih = np.maximum(iymax - iymin + 1., 0.)
180 | inters = iw * ih
181 |
182 | # union
183 | uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
184 | (BBGT[:, 2] - BBGT[:, 0] + 1.) *
185 | (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)
186 |
187 | overlaps = inters / uni
188 | ovmax = np.max(overlaps)
189 | jmax = np.argmax(overlaps)
190 |
191 | if ovmax > ovthresh:
192 | if not R['difficult'][jmax]:
193 | if not R['det'][jmax]:
194 | tp[d] = 1.
195 | R['det'][jmax] = 1
196 | else:
197 | fp[d] = 1.
198 | else:
199 | fp[d] = 1.
200 |
201 | # compute precision recall
202 | fp = np.cumsum(fp)
203 | tp = np.cumsum(tp)
204 | rec = tp / float(npos)
205 | # avoid divide by zero in case the first detection matches a difficult
206 | # ground truth
207 | prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
208 | ap = voc_ap(rec, prec, use_07_metric)
209 |
210 | return rec, prec, ap
211 |
--------------------------------------------------------------------------------
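`voc_ap` integrates the precision envelope over recall. A quick check of the `use_07_metric=False` branch with hand-made precision/recall arrays (values invented for illustration); the lines below repeat the same computation as the function body, so nothing needs to be imported:

```python
import numpy as np

# Toy ranked detections: recall/precision after each detection in score order.
rec  = np.array([0.25, 0.25, 0.50, 0.75])
prec = np.array([1.00, 0.50, 0.66, 0.75])

mrec = np.concatenate(([0.], rec, [1.]))
mpre = np.concatenate(([0.], prec, [0.]))
for i in range(mpre.size - 1, 0, -1):      # precision envelope
    mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
i = np.where(mrec[1:] != mrec[:-1])[0]     # recall change points
ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
print('AP = {:.4f}'.format(ap))            # AP = 0.6250
```
--------------------------------------------------------------------------------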
/lib/datasets/voc_eval.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/datasets/voc_eval.pyc
--------------------------------------------------------------------------------
/lib/layer_utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/__init__.py
--------------------------------------------------------------------------------
/lib/layer_utils/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/__init__.pyc
--------------------------------------------------------------------------------
/lib/layer_utils/anchor_target_layer.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Faster R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick and Xinlei Chen
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import os
12 | from model.config import cfg
13 | import numpy as np
14 | import numpy.random as npr
15 | from utils.cython_bbox import bbox_overlaps
16 | from model.bbox_transform import bbox_transform
17 |
18 | def anchor_target_layer(rpn_cls_score, gt_boxes, im_info, _feat_stride, all_anchors, num_anchors):
19 | """Same as the anchor target layer in the original Fast/er RCNN."""
20 | A = num_anchors
21 | total_anchors = all_anchors.shape[0]
22 | K = total_anchors // num_anchors
23 | im_info = im_info[0]
24 |
25 | # allow boxes to sit over the edge by a small amount
26 | _allowed_border = 0
27 |
28 | # map of shape (..., H, W)
29 | height, width = rpn_cls_score.shape[1:3]
30 |
31 | # only keep anchors inside the image
32 | inds_inside = np.where(
33 | (all_anchors[:, 0] >= -_allowed_border) &
34 | (all_anchors[:, 1] >= -_allowed_border) &
35 | (all_anchors[:, 2] < im_info[1] + _allowed_border) & # width
36 | (all_anchors[:, 3] < im_info[0] + _allowed_border) # height
37 | )[0]
38 |
39 | # keep only inside anchors
40 | anchors = all_anchors[inds_inside, :]
41 |
42 | # label: 1 is positive, 0 is negative, -1 is dont care
43 | labels = np.empty((len(inds_inside),), dtype=np.float32)
44 | labels.fill(-1)
45 |
46 | # overlaps between the anchors and the gt boxes
47 | # overlaps (ex, gt)
48 | overlaps = bbox_overlaps(
49 | np.ascontiguousarray(anchors, dtype=float),
50 | np.ascontiguousarray(gt_boxes, dtype=float))
51 | argmax_overlaps = overlaps.argmax(axis=1)
52 | max_overlaps = overlaps[np.arange(len(inds_inside)), argmax_overlaps]
53 | gt_argmax_overlaps = overlaps.argmax(axis=0)
54 | gt_max_overlaps = overlaps[gt_argmax_overlaps,
55 | np.arange(overlaps.shape[1])]
56 | gt_argmax_overlaps = np.where(overlaps == gt_max_overlaps)[0]
57 |
58 | if not cfg.TRAIN.RPN_CLOBBER_POSITIVES:
59 | # assign bg labels first so that positive labels can clobber them
60 | # first set the negatives
61 | labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0
62 |
63 | # fg label: for each gt, anchor with highest overlap
64 | labels[gt_argmax_overlaps] = 1
65 |
66 | # fg label: above threshold IOU
67 | labels[max_overlaps >= cfg.TRAIN.RPN_POSITIVE_OVERLAP] = 1
68 |
69 | if cfg.TRAIN.RPN_CLOBBER_POSITIVES:
70 | # assign bg labels last so that negative labels can clobber positives
71 | labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0
72 |
73 | # subsample positive labels if we have too many
74 | num_fg = int(cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE)
75 | fg_inds = np.where(labels == 1)[0]
76 | if len(fg_inds) > num_fg:
77 | disable_inds = npr.choice(
78 | fg_inds, size=(len(fg_inds) - num_fg), replace=False)
79 | labels[disable_inds] = -1
80 |
81 | # subsample negative labels if we have too many
82 | num_bg = cfg.TRAIN.RPN_BATCHSIZE - np.sum(labels == 1)
83 | bg_inds = np.where(labels == 0)[0]
84 | if len(bg_inds) > num_bg:
85 | disable_inds = npr.choice(
86 | bg_inds, size=(len(bg_inds) - num_bg), replace=False)
87 | labels[disable_inds] = -1
88 |
89 | bbox_targets = np.zeros((len(inds_inside), 4), dtype=np.float32)
90 | bbox_targets = _compute_targets(anchors, gt_boxes[argmax_overlaps, :])
91 |
92 | bbox_inside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)
93 | # only the positive ones have regression targets
94 | bbox_inside_weights[labels == 1, :] = np.array(cfg.TRAIN.RPN_BBOX_INSIDE_WEIGHTS)
95 |
96 | bbox_outside_weights = np.zeros((len(inds_inside), 4), dtype=np.float32)
97 | if cfg.TRAIN.RPN_POSITIVE_WEIGHT < 0:
98 | # uniform weighting of examples (given non-uniform sampling)
99 | num_examples = np.sum(labels >= 0)
100 | positive_weights = np.ones((1, 4)) * 1.0 / num_examples
101 | negative_weights = np.ones((1, 4)) * 1.0 / num_examples
102 | else:
103 | assert ((cfg.TRAIN.RPN_POSITIVE_WEIGHT > 0) &
104 | (cfg.TRAIN.RPN_POSITIVE_WEIGHT < 1))
105 | positive_weights = (cfg.TRAIN.RPN_POSITIVE_WEIGHT /
106 | np.sum(labels == 1))
107 | negative_weights = ((1.0 - cfg.TRAIN.RPN_POSITIVE_WEIGHT) /
108 | np.sum(labels == 0))
109 | bbox_outside_weights[labels == 1, :] = positive_weights
110 | bbox_outside_weights[labels == 0, :] = negative_weights
111 |
112 | # map up to original set of anchors
113 | labels = _unmap(labels, total_anchors, inds_inside, fill=-1)
114 | bbox_targets = _unmap(bbox_targets, total_anchors, inds_inside, fill=0)
115 | bbox_inside_weights = _unmap(bbox_inside_weights, total_anchors, inds_inside, fill=0)
116 | bbox_outside_weights = _unmap(bbox_outside_weights, total_anchors, inds_inside, fill=0)
117 |
118 | # labels
119 | labels = labels.reshape((1, height, width, A)).transpose(0, 3, 1, 2)
120 | labels = labels.reshape((1, 1, A * height, width))
121 | rpn_labels = labels
122 |
123 | # bbox_targets
124 | bbox_targets = bbox_targets \
125 | .reshape((1, height, width, A * 4))
126 |
127 | rpn_bbox_targets = bbox_targets
128 | # bbox_inside_weights
129 | bbox_inside_weights = bbox_inside_weights \
130 | .reshape((1, height, width, A * 4))
131 |
132 | rpn_bbox_inside_weights = bbox_inside_weights
133 |
134 | # bbox_outside_weights
135 | bbox_outside_weights = bbox_outside_weights \
136 | .reshape((1, height, width, A * 4))
137 |
138 | rpn_bbox_outside_weights = bbox_outside_weights
139 | return rpn_labels, rpn_bbox_targets, rpn_bbox_inside_weights, rpn_bbox_outside_weights
140 |
141 |
142 | def _unmap(data, count, inds, fill=0):
143 | """ Unmap a subset of item (data) back to the original set of items (of
144 | size count) """
145 | if len(data.shape) == 1:
146 | ret = np.empty((count,), dtype=np.float32)
147 | ret.fill(fill)
148 | ret[inds] = data
149 | else:
150 | ret = np.empty((count,) + data.shape[1:], dtype=np.float32)
151 | ret.fill(fill)
152 | ret[inds, :] = data
153 | return ret
154 |
155 |
156 | def _compute_targets(ex_rois, gt_rois):
157 | """Compute bounding-box regression targets for an image."""
158 |
159 | assert ex_rois.shape[0] == gt_rois.shape[0]
160 | assert ex_rois.shape[1] == 4
161 | assert gt_rois.shape[1] == 5
162 |
163 | return bbox_transform(ex_rois, gt_rois[:, :4]).astype(np.float32, copy=False)
164 |
--------------------------------------------------------------------------------
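The heart of `anchor_target_layer` is the IoU labeling rule: anchors below `RPN_NEGATIVE_OVERLAP` become background, anchors above `RPN_POSITIVE_OVERLAP` plus the best anchor for each ground-truth box become foreground, and everything else stays "don't care" (-1). A minimal numpy sketch of that rule (the non-clobbering branch); a plain-numpy IoU stands in for `utils.cython_bbox.bbox_overlaps`, and 0.3/0.7 are the usual config defaults, written as literals here:

```python
import numpy as np

def iou(a, b):
    """IoU matrix between (N, 4) and (M, 4) boxes, inclusive pixel coords."""
    ix = np.maximum(0., np.minimum(a[:, None, 2], b[None, :, 2])
                    - np.maximum(a[:, None, 0], b[None, :, 0]) + 1.)
    iy = np.maximum(0., np.minimum(a[:, None, 3], b[None, :, 3])
                    - np.maximum(a[:, None, 1], b[None, :, 1]) + 1.)
    inter = ix * iy
    area = lambda x: (x[:, 2] - x[:, 0] + 1.) * (x[:, 3] - x[:, 1] + 1.)
    return inter / (area(a)[:, None] + area(b)[None, :] - inter)

anchors  = np.array([[0, 0, 9, 9], [0, 0, 49, 49], [40, 40, 99, 99]], dtype=float)
gt_boxes = np.array([[0, 0, 44, 44]], dtype=float)

overlaps = iou(anchors, gt_boxes)
max_overlaps = overlaps.max(axis=1)

labels = np.full(len(anchors), -1.)    # -1: don't care
labels[max_overlaps < 0.3] = 0         # background first ...
labels[overlaps.argmax(axis=0)] = 1    # ... so fg can clobber: best anchor per gt
labels[max_overlaps >= 0.7] = 1        # and every high-IoU anchor
print(labels)                          # [0. 1. 0.]
```
--------------------------------------------------------------------------------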
/lib/layer_utils/anchor_target_layer.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/anchor_target_layer.pyc
--------------------------------------------------------------------------------
/lib/layer_utils/generate_anchors.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Faster R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick and Sean Bell
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import numpy as np
12 |
13 |
14 | # Verify that we compute the same anchors as Shaoqing's matlab implementation:
15 | #
16 | # >> load output/rpn_cachedir/faster_rcnn_VOC2007_ZF_stage1_rpn/anchors.mat
17 | # >> anchors
18 | #
19 | # anchors =
20 | #
21 | # -83 -39 100 56
22 | # -175 -87 192 104
23 | # -359 -183 376 200
24 | # -55 -55 72 72
25 | # -119 -119 136 136
26 | # -247 -247 264 264
27 | # -35 -79 52 96
28 | # -79 -167 96 184
29 | # -167 -343 184 360
30 |
31 | # array([[ -83., -39., 100., 56.],
32 | # [-175., -87., 192., 104.],
33 | # [-359., -183., 376., 200.],
34 | # [ -55., -55., 72., 72.],
35 | # [-119., -119., 136., 136.],
36 | # [-247., -247., 264., 264.],
37 | # [ -35., -79., 52., 96.],
38 | # [ -79., -167., 96., 184.],
39 | # [-167., -343., 184., 360.]])
40 |
41 | def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
42 | scales=2 ** np.arange(3, 6)):
43 | """
44 | Generate anchor (reference) windows by enumerating aspect ratios X
45 | scales wrt a reference (0, 0, 15, 15) window.
46 | """
47 |
48 | base_anchor = np.array([1, 1, base_size, base_size]) - 1
49 | ratio_anchors = _ratio_enum(base_anchor, ratios)
50 | anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
51 | for i in range(ratio_anchors.shape[0])])
52 | return anchors
53 |
54 |
55 | def _whctrs(anchor):
56 | """
57 | Return width, height, x center, and y center for an anchor (window).
58 | """
59 |
60 | w = anchor[2] - anchor[0] + 1
61 | h = anchor[3] - anchor[1] + 1
62 | x_ctr = anchor[0] + 0.5 * (w - 1)
63 | y_ctr = anchor[1] + 0.5 * (h - 1)
64 | return w, h, x_ctr, y_ctr
65 |
66 |
67 | def _mkanchors(ws, hs, x_ctr, y_ctr):
68 | """
69 | Given a vector of widths (ws) and heights (hs) around a center
70 | (x_ctr, y_ctr), output a set of anchors (windows).
71 | """
72 |
73 | ws = ws[:, np.newaxis]
74 | hs = hs[:, np.newaxis]
75 | anchors = np.hstack((x_ctr - 0.5 * (ws - 1),
76 | y_ctr - 0.5 * (hs - 1),
77 | x_ctr + 0.5 * (ws - 1),
78 | y_ctr + 0.5 * (hs - 1)))
79 | return anchors
80 |
81 |
82 | def _ratio_enum(anchor, ratios):
83 | """
84 | Enumerate a set of anchors for each aspect ratio wrt an anchor.
85 | """
86 |
87 | w, h, x_ctr, y_ctr = _whctrs(anchor)
88 | size = w * h
89 | size_ratios = size / ratios
90 | ws = np.round(np.sqrt(size_ratios))
91 | hs = np.round(ws * ratios)
92 | anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
93 | return anchors
94 |
95 |
96 | def _scale_enum(anchor, scales):
97 | """
98 | Enumerate a set of anchors for each scale wrt an anchor.
99 | """
100 |
101 | w, h, x_ctr, y_ctr = _whctrs(anchor)
102 | ws = w * scales
103 | hs = h * scales
104 | anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
105 | return anchors
106 |
107 |
108 | if __name__ == '__main__':
109 | import time
110 |
111 | t = time.time()
112 | a = generate_anchors()
113 | print(time.time() - t)
114 | print(a)
115 | from IPython import embed;
116 |
117 | embed()
118 |
--------------------------------------------------------------------------------
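Run standalone, `generate_anchors` reproduces the nine reference windows listed in the comment block above: three aspect ratios times three scales around the (0, 0, 15, 15) base window. A short usage check, assuming `lib/` is on `PYTHONPATH`:

```python
import numpy as np
from layer_utils.generate_anchors import generate_anchors

anchors = generate_anchors(base_size=16, ratios=[0.5, 1, 2],
                           scales=2 ** np.arange(3, 6))
print(anchors.shape)  # (9, 4)
print(anchors[3])     # [-55. -55.  72.  72.], the 128x128 square anchor
```
--------------------------------------------------------------------------------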
/lib/layer_utils/generate_anchors.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/generate_anchors.pyc
--------------------------------------------------------------------------------
/lib/layer_utils/proposal_layer.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Faster R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Ross Girshick and Xinlei Chen
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import numpy as np
11 | from model.config import cfg
12 | from model.bbox_transform import bbox_transform_inv, clip_boxes
13 | from model.nms_wrapper import nms
14 |
15 |
16 | def proposal_layer(rpn_cls_prob, rpn_bbox_pred, im_info, cfg_key, _feat_stride, anchors, num_anchors):
17 | """A simplified version compared to fast/er RCNN.
18 | For details, please see the technical report.
19 | """
20 | if type(cfg_key) == bytes:
21 | cfg_key = cfg_key.decode('utf-8')
22 | pre_nms_topN = cfg[cfg_key].RPN_PRE_NMS_TOP_N
23 | post_nms_topN = cfg[cfg_key].RPN_POST_NMS_TOP_N
24 | nms_thresh = cfg[cfg_key].RPN_NMS_THRESH
25 |
26 | im_info = im_info[0]
27 | # Get the scores and bounding boxes
28 | scores = rpn_cls_prob[:, :, :, num_anchors:]
29 | rpn_bbox_pred = rpn_bbox_pred.reshape((-1, 4))
30 | scores = scores.reshape((-1, 1))
31 | proposals = bbox_transform_inv(anchors, rpn_bbox_pred)
32 | proposals = clip_boxes(proposals, im_info[:2])
33 |
34 | # Pick the top region proposals
35 | order = scores.ravel().argsort()[::-1]
36 | if pre_nms_topN > 0:
37 | order = order[:pre_nms_topN]
38 | proposals = proposals[order, :]
39 | scores = scores[order]
40 |
41 | # Non-maximal suppression
42 | keep = nms(np.hstack((proposals, scores)), nms_thresh)
43 |
44 | # Pick the top region proposals after NMS
45 | if post_nms_topN > 0:
46 | keep = keep[:post_nms_topN]
47 | proposals = proposals[keep, :]
48 | scores = scores[keep]
49 |
50 | # Only support single image as input
51 | batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)
52 | blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))
53 |
54 | return blob, scores
55 |
--------------------------------------------------------------------------------
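Before NMS, the proposal layer simply ranks every anchor score and keeps the `RPN_PRE_NMS_TOP_N` best. The ranking idiom from above, shown on toy scores (values invented):

```python
import numpy as np

scores = np.array([[0.1], [0.9], [0.4], [0.7]], dtype=np.float32)
pre_nms_topN = 3

order = scores.ravel().argsort()[::-1]  # indices, best score first
order = order[:pre_nms_topN]
print(order)                   # [1 3 2]
print(scores[order].ravel())   # [0.9 0.7 0.4]
```
--------------------------------------------------------------------------------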
/lib/layer_utils/proposal_layer.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/proposal_layer.pyc
--------------------------------------------------------------------------------
/lib/layer_utils/proposal_target_layer.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Faster R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick, Sean Bell and Xinlei Chen
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import numpy as np
12 | import numpy.random as npr
13 | from model.config import cfg
14 | from model.bbox_transform import bbox_transform
15 | from utils.cython_bbox import bbox_overlaps
16 |
17 |
18 | def proposal_target_layer(rpn_rois, rpn_scores, gt_boxes, _num_classes):
19 | """
20 | Assign object detection proposals to ground-truth targets. Produces proposal
21 | classification labels and bounding-box regression targets.
22 | """
23 |
24 | # Proposal ROIs (0, x1, y1, x2, y2) coming from RPN
25 | # (i.e., rpn.proposal_layer.ProposalLayer), or any other source
26 | all_rois = rpn_rois
27 | all_scores = rpn_scores
28 |
29 | # Include ground-truth boxes in the set of candidate rois
30 | if cfg.TRAIN.USE_GT:
31 | zeros = np.zeros((gt_boxes.shape[0], 1), dtype=gt_boxes.dtype)
32 | all_rois = np.vstack(
33 | (all_rois, np.hstack((zeros, gt_boxes[:, :-1])))
34 | )
35 | # not sure if this is a wise append, but it is not used anyway
36 | all_scores = np.vstack((all_scores, zeros))
37 |
38 | num_images = 1
39 | rois_per_image = cfg.TRAIN.BATCH_SIZE / num_images
40 | fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
41 |
42 | # Sample rois with classification labels and bounding box regression
43 | # targets
44 | labels, rois, roi_scores, bbox_targets, bbox_inside_weights = _sample_rois(
45 | all_rois, all_scores, gt_boxes, fg_rois_per_image,
46 | rois_per_image, _num_classes)
47 |
48 | rois = rois.reshape(-1, 5)
49 | roi_scores = roi_scores.reshape(-1)
50 | labels = labels.reshape(-1, 1)
51 | bbox_targets = bbox_targets.reshape(-1, _num_classes * 4)
52 | bbox_inside_weights = bbox_inside_weights.reshape(-1, _num_classes * 4)
53 | bbox_outside_weights = np.array(bbox_inside_weights > 0).astype(np.float32)
54 |
55 | return rois, roi_scores, labels, bbox_targets, bbox_inside_weights, bbox_outside_weights
56 |
57 |
58 | def _get_bbox_regression_labels(bbox_target_data, num_classes):
59 | """Bounding-box regression targets (bbox_target_data) are stored in a
60 | compact form N x (class, tx, ty, tw, th)
61 |
62 | This function expands those targets into the 4-of-4*K representation used
63 | by the network (i.e. only one class has non-zero targets).
64 |
65 | Returns:
66 | bbox_target (ndarray): N x 4K blob of regression targets
67 | bbox_inside_weights (ndarray): N x 4K blob of loss weights
68 | """
69 |
70 | clss = bbox_target_data[:, 0]
71 | bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
72 | bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)
73 | inds = np.where(clss > 0)[0]
74 | for ind in inds:
75 | cls = clss[ind]
76 | start = int(4 * cls)
77 | end = start + 4
78 | bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
79 | bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
80 | return bbox_targets, bbox_inside_weights
81 |
82 |
83 | def _compute_targets(ex_rois, gt_rois, labels):
84 | """Compute bounding-box regression targets for an image."""
85 |
86 | assert ex_rois.shape[0] == gt_rois.shape[0]
87 | assert ex_rois.shape[1] == 4
88 | assert gt_rois.shape[1] == 4
89 |
90 | targets = bbox_transform(ex_rois, gt_rois)
91 | if cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED:
92 | # Optionally normalize targets by a precomputed mean and stdev
93 | targets = ((targets - np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS))
94 | / np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS))
95 | return np.hstack(
96 | (labels[:, np.newaxis], targets)).astype(np.float32, copy=False)
97 |
98 |
99 | def _sample_rois(all_rois, all_scores, gt_boxes, fg_rois_per_image, rois_per_image, num_classes):
100 | """Generate a random sample of RoIs comprising foreground and background
101 | examples.
102 | """
103 | # overlaps: (rois x gt_boxes)
104 | overlaps = bbox_overlaps(
105 | np.ascontiguousarray(all_rois[:, 1:5], dtype=float),
106 | np.ascontiguousarray(gt_boxes[:, :4], dtype=float))
107 | gt_assignment = overlaps.argmax(axis=1)
108 | max_overlaps = overlaps.max(axis=1)
109 | labels = gt_boxes[gt_assignment, 4]
110 |
111 | # Select foreground RoIs as those with >= FG_THRESH overlap
112 | fg_inds = np.where(max_overlaps >= cfg.TRAIN.FG_THRESH)[0]
113 | # Guard against the case when an image has fewer than fg_rois_per_image
114 | # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
115 | bg_inds = np.where((max_overlaps < cfg.TRAIN.BG_THRESH_HI) &
116 | (max_overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
117 |
118 | # Small modification to the original version where we ensure a fixed number of regions are sampled
119 | if fg_inds.size > 0 and bg_inds.size > 0:
120 | fg_rois_per_image = min(fg_rois_per_image, fg_inds.size)
121 | fg_inds = npr.choice(fg_inds, size=int(fg_rois_per_image), replace=False)
122 | bg_rois_per_image = rois_per_image - fg_rois_per_image
123 | to_replace = bg_inds.size < bg_rois_per_image
124 | bg_inds = npr.choice(bg_inds, size=int(bg_rois_per_image), replace=to_replace)
125 | elif fg_inds.size > 0:
126 | to_replace = fg_inds.size < rois_per_image
127 | fg_inds = npr.choice(fg_inds, size=int(rois_per_image), replace=to_replace)
128 | fg_rois_per_image = rois_per_image
129 | elif bg_inds.size > 0:
130 | to_replace = bg_inds.size < rois_per_image
131 | bg_inds = npr.choice(bg_inds, size=int(rois_per_image), replace=to_replace)
132 | fg_rois_per_image = 0
133 | else:
134 | import pdb
135 | pdb.set_trace()
136 |
137 | # The indices that we're selecting (both fg and bg)
138 | keep_inds = np.append(fg_inds, bg_inds)
139 | # Select sampled values from various arrays:
140 | labels = labels[keep_inds]
141 | # Clamp labels for the background RoIs to 0
142 | labels[int(fg_rois_per_image):] = 0
143 | rois = all_rois[keep_inds]
144 | roi_scores = all_scores[keep_inds]
145 |
146 | bbox_target_data = _compute_targets(
147 | rois[:, 1:5], gt_boxes[gt_assignment[keep_inds], :4], labels)
148 |
149 | bbox_targets, bbox_inside_weights = \
150 | _get_bbox_regression_labels(bbox_target_data, num_classes)
151 |
152 | return labels, rois, roi_scores, bbox_targets, bbox_inside_weights
153 |
--------------------------------------------------------------------------------
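`_get_bbox_regression_labels` expands the compact N x (class, tx, ty, tw, th) targets into the N x 4K layout the network regresses, with only the labeled class's four slots non-zero. A minimal sketch of that expansion, with `cfg.TRAIN.BBOX_INSIDE_WEIGHTS` replaced by a literal `(1., 1., 1., 1.)`:

```python
import numpy as np

num_classes = 3                                           # including background
bbox_target_data = np.array([[1, 0.5, -0.2, 0.1, 0.0],    # an RoI labeled class 1
                             [0, 0.0,  0.0, 0.0, 0.0]])   # a background RoI

clss = bbox_target_data[:, 0]
bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
bbox_inside_weights = np.zeros_like(bbox_targets)
for ind in np.where(clss > 0)[0]:
    start = int(4 * clss[ind])
    bbox_targets[ind, start:start + 4] = bbox_target_data[ind, 1:]
    bbox_inside_weights[ind, start:start + 4] = (1., 1., 1., 1.)

print(bbox_targets)
# row 0: zeros in cols 0-3, the four targets in cols 4-7, zeros in cols 8-11
# row 1: all zeros, background carries no regression target
```
--------------------------------------------------------------------------------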
/lib/layer_utils/proposal_target_layer.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/proposal_target_layer.pyc
--------------------------------------------------------------------------------
/lib/layer_utils/proposal_top_layer.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Faster R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Xinlei Chen
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import numpy as np
11 | from model.config import cfg
12 | from model.bbox_transform import bbox_transform_inv, clip_boxes
13 | import numpy.random as npr
14 |
15 | def proposal_top_layer(rpn_cls_prob, rpn_bbox_pred, im_info, _feat_stride, anchors, num_anchors):
16 | """A layer that just selects the top region proposals
17 | without using non-maximal suppression.
18 | For details, please see the technical report.
19 | """
20 | rpn_top_n = cfg.TEST.RPN_TOP_N
21 | im_info = im_info[0]
22 |
23 | scores = rpn_cls_prob[:, :, :, num_anchors:]
24 |
25 | rpn_bbox_pred = rpn_bbox_pred.reshape((-1, 4))
26 | scores = scores.reshape((-1, 1))
27 |
28 | length = scores.shape[0]
29 | if length < rpn_top_n:
30 | # Random selection, maybe unnecessary and loses good proposals
31 | # But such a case rarely happens
32 | top_inds = npr.choice(length, size=rpn_top_n, replace=True)
33 | else:
34 | top_inds = scores.argsort(0)[::-1]
35 | top_inds = top_inds[:rpn_top_n]
36 | top_inds = top_inds.reshape(rpn_top_n, )
37 |
38 | # Do the selection here
39 | anchors = anchors[top_inds, :]
40 | rpn_bbox_pred = rpn_bbox_pred[top_inds, :]
41 | scores = scores[top_inds]
42 |
43 | # Convert anchors into proposals via bbox transformations
44 | proposals = bbox_transform_inv(anchors, rpn_bbox_pred)
45 |
46 | # Clip predicted boxes to image
47 | proposals = clip_boxes(proposals, im_info[:2])
48 |
49 | # Output rois blob
50 | # Our RPN implementation only supports a single input image, so all
51 | # batch inds are 0
52 | batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)
53 | blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))
54 | return blob, scores
55 |
--------------------------------------------------------------------------------
/lib/layer_utils/proposal_top_layer.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/proposal_top_layer.pyc
--------------------------------------------------------------------------------
/lib/layer_utils/snippets.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Tensorflow Faster R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Xinlei Chen
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import numpy as np
11 | import numpy.random as npr
12 | from model.config import cfg
13 | from layer_utils.generate_anchors import generate_anchors
14 | from model.bbox_transform import bbox_transform_inv, clip_boxes
15 | from utils.cython_bbox import bbox_overlaps
16 |
17 | def generate_anchors_pre(height, width, feat_stride, anchor_scales=(8,16,32), anchor_ratios=(0.5,1,2)):
18 | """ A wrapper function to generate anchors given different scales.
19 | Also returns the number of anchors in the variable 'length'.
20 | """
21 | anchors = generate_anchors(ratios=np.array(anchor_ratios), scales=np.array(anchor_scales))
22 | A = anchors.shape[0]
23 | shift_x = np.arange(0, width) * feat_stride
24 | shift_y = np.arange(0, height) * feat_stride
25 | shift_x, shift_y = np.meshgrid(shift_x, shift_y)
26 | shifts = np.vstack((shift_x.ravel(), shift_y.ravel(), shift_x.ravel(), shift_y.ravel())).transpose()
27 | K = shifts.shape[0]
28 | # width changes faster, so here it is H, W, C
29 | anchors = anchors.reshape((1, A, 4)) + shifts.reshape((1, K, 4)).transpose((1, 0, 2))
30 | anchors = anchors.reshape((K * A, 4)).astype(np.float32, copy=False)
31 | length = np.int32(anchors.shape[0])
32 |
33 | return anchors, length
34 |
--------------------------------------------------------------------------------
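`generate_anchors_pre` tiles the A base anchors over every cell of an H x W feature map, so the output has K * A = H * W * A rows. A shape check on a toy map; this assumes `lib/` is on `PYTHONPATH` and its Cython extensions have been built with the provided Makefile, since `snippets.py` imports `utils.cython_bbox`:

```python
from layer_utils.snippets import generate_anchors_pre

height, width, feat_stride = 38, 50, 16  # e.g. a ~600x800 image at stride 16
anchors, length = generate_anchors_pre(height, width, feat_stride)
print(anchors.shape)  # (17100, 4) = (38 * 50 * 9, 4)
print(length)         # 17100
print(anchors[0])     # first base anchor, shifted to feature cell (0, 0)
```
--------------------------------------------------------------------------------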
/lib/layer_utils/snippets.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/layer_utils/snippets.pyc
--------------------------------------------------------------------------------
/lib/model/__init__.py:
--------------------------------------------------------------------------------
1 | from . import config
2 |
--------------------------------------------------------------------------------
/lib/model/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/model/__init__.pyc
--------------------------------------------------------------------------------
/lib/model/bbox_transform.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import numpy as np
12 |
13 | def bbox_transform(ex_rois, gt_rois):
14 | ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
15 | ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
16 | ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths
17 | ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights
18 |
19 | gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
20 | gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
21 | gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths
22 | gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights
23 |
24 | targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths
25 | targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights
26 | targets_dw = np.log(gt_widths / ex_widths)
27 | targets_dh = np.log(gt_heights / ex_heights)
28 |
29 | targets = np.vstack(
30 | (targets_dx, targets_dy, targets_dw, targets_dh)).transpose()
31 | return targets
32 |
33 |
34 | def bbox_transform_inv(boxes, deltas):
35 | if boxes.shape[0] == 0:
36 | return np.zeros((0, deltas.shape[1]), dtype=deltas.dtype)
37 |
38 | boxes = boxes.astype(deltas.dtype, copy=False)
39 | widths = boxes[:, 2] - boxes[:, 0] + 1.0
40 | heights = boxes[:, 3] - boxes[:, 1] + 1.0
41 | ctr_x = boxes[:, 0] + 0.5 * widths
42 | ctr_y = boxes[:, 1] + 0.5 * heights
43 |
44 | dx = deltas[:, 0::4]
45 | dy = deltas[:, 1::4]
46 | dw = deltas[:, 2::4]
47 | dh = deltas[:, 3::4]
48 |
49 | pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
50 | pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]
51 | pred_w = np.exp(dw) * widths[:, np.newaxis]
52 | pred_h = np.exp(dh) * heights[:, np.newaxis]
53 |
54 | pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype)
55 | # x1
56 | pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w
57 | # y1
58 | pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h
59 | # x2
60 | pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w
61 | # y2
62 | pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h
63 |
64 | return pred_boxes
65 |
66 |
67 | def clip_boxes(boxes, im_shape):
68 | """
69 | Clip boxes to image boundaries.
70 | """
71 |
72 | # x1 >= 0
73 | boxes[:, 0::4] = np.maximum(np.minimum(boxes[:, 0::4], im_shape[1] - 1), 0)
74 | # y1 >= 0
75 | boxes[:, 1::4] = np.maximum(np.minimum(boxes[:, 1::4], im_shape[0] - 1), 0)
76 | # x2 < im_shape[1]
77 | boxes[:, 2::4] = np.maximum(np.minimum(boxes[:, 2::4], im_shape[1] - 1), 0)
78 | # y2 < im_shape[0]
79 | boxes[:, 3::4] = np.maximum(np.minimum(boxes[:, 3::4], im_shape[0] - 1), 0)
80 | return boxes
81 |
--------------------------------------------------------------------------------
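`bbox_transform` and `bbox_transform_inv` are inverses: encoding a ground-truth box against an RoI and decoding the deltas must give the ground truth back. A round-trip check (assumes `lib/` is on `PYTHONPATH`; importing the `model` package also pulls in `model.config`, which needs `easydict`):

```python
import numpy as np
from model.bbox_transform import bbox_transform, bbox_transform_inv

ex_rois = np.array([[10., 10., 59., 49.]])  # a 50x40 RoI
gt_rois = np.array([[14., 12., 69., 55.]])  # a 56x44 ground-truth box

deltas = bbox_transform(ex_rois, gt_rois)   # (dx, dy, dw, dh)
recovered = bbox_transform_inv(ex_rois, deltas)
print(deltas)
print(recovered)                            # [[14. 12. 69. 55.]] up to float error
```
--------------------------------------------------------------------------------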
/lib/model/bbox_transform.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/model/bbox_transform.pyc
--------------------------------------------------------------------------------
/lib/model/config.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/model/config.pyc
--------------------------------------------------------------------------------
/lib/model/nms_wrapper.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | from model.config import cfg
12 | from nms.gpu_nms import gpu_nms
13 | from nms.cpu_nms import cpu_nms
14 |
15 | def nms(dets, thresh, force_cpu=False):
16 | """Dispatch to either CPU or GPU NMS implementations."""
17 |
18 | if dets.shape[0] == 0:
19 | return []
20 | if cfg.USE_GPU_NMS and not force_cpu:
21 | return gpu_nms(dets, thresh, device_id=cfg.GPU_ID)
22 | else:
23 | return cpu_nms(dets, thresh)
24 |
--------------------------------------------------------------------------------
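`nms` dispatches to the compiled CPU/GPU kernels, but the algorithm itself is greedy suppression: keep the highest-scoring box, drop every remaining box overlapping it above `thresh`, repeat. A pure-numpy sketch in the spirit of `nms/py_cpu_nms.py` (toy detections invented):

```python
import numpy as np

def py_nms(dets, thresh):
    """Greedy NMS over (x1, y1, x2, y2, score) rows; returns kept row indices."""
    x1, y1, x2, y2, scores = dets.T
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # IoU of the current best box with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        ovr = w * h / (areas[i] + areas[order[1:]] - w * h)
        order = order[1:][ovr <= thresh]
    return keep

dets = np.array([[10, 10, 50, 50, 0.9],
                 [12, 12, 52, 52, 0.8],      # heavy overlap with the first box
                 [100, 100, 140, 140, 0.7]], dtype=float)
print(py_nms(dets, 0.7))                     # [0, 2]
```
--------------------------------------------------------------------------------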
/lib/model/nms_wrapper.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/model/nms_wrapper.pyc
--------------------------------------------------------------------------------
/lib/model/test.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Tensorflow Faster R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Xinlei Chen
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import cv2
11 | import numpy as np
12 | try:
13 | import cPickle as pickle
14 | except ImportError:
15 | import pickle
16 | import os
17 | import math
18 |
19 | from utils.timer import Timer
20 | from utils.cython_nms import nms, nms_new
21 | from utils.boxes_grid import get_boxes_grid
22 | from utils.blob import im_list_to_blob
23 |
24 | from model.config import cfg, get_output_dir
25 | from model.bbox_transform import clip_boxes, bbox_transform_inv
26 |
27 | def _get_image_blob(im):
28 | """Converts an image into a network input.
29 | Arguments:
30 | im (ndarray): a color image in BGR order
31 | Returns:
32 | blob (ndarray): a data blob holding an image pyramid
33 | im_scale_factors (list): list of image scales (relative to im) used
34 | in the image pyramid
35 | """
36 | im_orig = im.astype(np.float32, copy=True)
37 | im_orig -= cfg.PIXEL_MEANS
38 |
39 | im_shape = im_orig.shape
40 | im_size_min = np.min(im_shape[0:2])
41 | im_size_max = np.max(im_shape[0:2])
42 |
43 | processed_ims = []
44 | im_scale_factors = []
45 |
46 | for target_size in cfg.TEST.SCALES:
47 | im_scale = float(target_size) / float(im_size_min)
48 | # Prevent the biggest axis from being more than MAX_SIZE
49 | if np.round(im_scale * im_size_max) > cfg.TEST.MAX_SIZE:
50 | im_scale = float(cfg.TEST.MAX_SIZE) / float(im_size_max)
51 | im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,
52 | interpolation=cv2.INTER_LINEAR)
53 | im_scale_factors.append(im_scale)
54 | processed_ims.append(im)
55 |
56 | # Create a blob to hold the input images
57 | blob = im_list_to_blob(processed_ims)
58 |
59 | return blob, np.array(im_scale_factors)
60 |
61 | def _get_blobs(im):
62 | """Convert an image and RoIs within that image into network inputs."""
63 | blobs = {}
64 | blobs['data'], im_scale_factors = _get_image_blob(im)
65 |
66 | return blobs, im_scale_factors
67 |
68 | def _clip_boxes(boxes, im_shape):
69 | """Clip boxes to image boundaries."""
70 | # x1 >= 0
71 | boxes[:, 0::4] = np.maximum(boxes[:, 0::4], 0)
72 | # y1 >= 0
73 | boxes[:, 1::4] = np.maximum(boxes[:, 1::4], 0)
74 | # x2 < im_shape[1]
75 | boxes[:, 2::4] = np.minimum(boxes[:, 2::4], im_shape[1] - 1)
76 | # y2 < im_shape[0]
77 | boxes[:, 3::4] = np.minimum(boxes[:, 3::4], im_shape[0] - 1)
78 | return boxes
79 |
80 | def _rescale_boxes(boxes, inds, scales):
81 | """Rescale boxes according to image rescaling."""
82 | for i in range(boxes.shape[0]):
83 | boxes[i,:] = boxes[i,:] / scales[int(inds[i])]
84 |
85 | return boxes
86 |
87 | def im_detect(sess, net, im):
88 | blobs, im_scales = _get_blobs(im)
89 | assert len(im_scales) == 1, "Only single-image batch implemented"
90 |
91 | im_blob = blobs['data']
92 | # seems to have height, width, and image scales
93 | # still not sure about the scale; for the full image it is presumably 1.
94 | blobs['im_info'] = np.array([[im_blob.shape[1], im_blob.shape[2], im_scales[0]]], dtype=np.float32)
95 |
96 | _, scores, bbox_pred, rois = net.test_image(sess, blobs['data'], blobs['im_info'])
97 |
98 | boxes = rois[:, 1:5] / im_scales[0]
99 | # print(scores.shape, bbox_pred.shape, rois.shape, boxes.shape)
100 | scores = np.reshape(scores, [scores.shape[0], -1])
101 | bbox_pred = np.reshape(bbox_pred, [bbox_pred.shape[0], -1])
102 | if cfg.TEST.BBOX_REG:
103 | # Apply bounding-box regression deltas
104 | box_deltas = bbox_pred
105 | pred_boxes = bbox_transform_inv(boxes, box_deltas)
106 | pred_boxes = _clip_boxes(pred_boxes, im.shape)
107 | else:
108 | # Simply repeat the boxes, once for each class
109 | pred_boxes = np.tile(boxes, (1, scores.shape[1]))
110 |
111 | return scores, pred_boxes
112 |
113 | def apply_nms(all_boxes, thresh):
114 | """Apply non-maximum suppression to all predicted boxes output by the
115 | test_net method.
116 | """
117 | num_classes = len(all_boxes)
118 | num_images = len(all_boxes[0])
119 | nms_boxes = [[[] for _ in range(num_images)] for _ in range(num_classes)]
120 | for cls_ind in range(num_classes):
121 | for im_ind in range(num_images):
122 | dets = all_boxes[cls_ind][im_ind]
123 | if len(dets) == 0:
124 | continue
125 |
126 | x1 = dets[:, 0]
127 | y1 = dets[:, 1]
128 | x2 = dets[:, 2]
129 | y2 = dets[:, 3]
130 | scores = dets[:, 4]
131 | inds = np.where((x2 > x1) & (y2 > y1) & (scores > cfg.TEST.DET_THRESHOLD))[0]
132 | dets = dets[inds,:]
133 | if len(dets) == 0:
134 | continue
135 |
136 | keep = nms(dets, thresh)
137 | if len(keep) == 0:
138 | continue
139 | nms_boxes[cls_ind][im_ind] = dets[keep, :].copy()
140 | return nms_boxes
141 |
142 | def test_net(sess, net, imdb, weights_filename, max_per_image=100, thresh=0.05):
143 | """Test a Fast R-CNN network on an image database."""
144 | np.random.seed(cfg.RNG_SEED)
145 | num_images = len(imdb.image_index)
146 | # all detections are collected into:
147 | # all_boxes[cls][image] = N x 5 array of detections in
148 | # (x1, y1, x2, y2, score)
149 | all_boxes = [[[] for _ in range(num_images)]
150 | for _ in range(imdb.num_classes)]
151 |
152 | output_dir = get_output_dir(imdb, weights_filename)
153 | # timers
154 | _t = {'im_detect' : Timer(), 'misc' : Timer()}
155 |
156 | for i in range(num_images):
157 | im = cv2.imread(imdb.image_path_at(i))
158 |
159 | _t['im_detect'].tic()
160 | scores, boxes = im_detect(sess, net, im)
161 | _t['im_detect'].toc()
162 |
163 | _t['misc'].tic()
164 |
165 | # skip j = 0, because it's the background class
166 | for j in range(1, imdb.num_classes):
167 | inds = np.where(scores[:, j] > thresh)[0]
168 | cls_scores = scores[inds, j]
169 | cls_boxes = boxes[inds, j*4:(j+1)*4]
170 | cls_dets = np.hstack((cls_boxes, cls_scores[:, np.newaxis])) \
171 | .astype(np.float32, copy=False)
172 | keep = nms(cls_dets, cfg.TEST.NMS)
173 | cls_dets = cls_dets[keep, :]
174 | all_boxes[j][i] = cls_dets
175 |
176 | # Limit to max_per_image detections *over all classes*
177 | if max_per_image > 0:
178 | image_scores = np.hstack([all_boxes[j][i][:, -1]
179 | for j in range(1, imdb.num_classes)])
180 | if len(image_scores) > max_per_image:
181 | image_thresh = np.sort(image_scores)[-max_per_image]
182 | for j in range(1, imdb.num_classes):
183 | keep = np.where(all_boxes[j][i][:, -1] >= image_thresh)[0]
184 | all_boxes[j][i] = all_boxes[j][i][keep, :]
185 | _t['misc'].toc()
186 |
187 | print('im_detect: {:d}/{:d} {:.3f}s {:.3f}s' \
188 | .format(i + 1, num_images, _t['im_detect'].average_time,
189 | _t['misc'].average_time))
190 |
191 | det_file = os.path.join(output_dir, 'detections.pkl')
192 | with open(det_file, 'wb') as f:
193 | pickle.dump(all_boxes, f, pickle.HIGHEST_PROTOCOL)
194 |
195 | print('Evaluating detections')
196 | imdb.evaluate_detections(all_boxes, output_dir)
197 |
198 |
--------------------------------------------------------------------------------
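`test_net` caps each image at `max_per_image` detections across all classes by taking the score of the `max_per_image`-th best detection as a threshold and filtering every class list against it. The capping logic in isolation, on invented scores:

```python
import numpy as np

# Per-class detection scores for one image (class 0, background, is skipped).
scores_by_class = {1: np.array([0.9, 0.3]), 2: np.array([0.8, 0.6, 0.2])}
max_per_image = 3

image_scores = np.hstack(list(scores_by_class.values()))
if len(image_scores) > max_per_image:
    image_thresh = np.sort(image_scores)[-max_per_image]  # 3rd best score, 0.6
    scores_by_class = {j: s[s >= image_thresh]
                       for j, s in scores_by_class.items()}

print(scores_by_class)  # {1: array([0.9]), 2: array([0.8, 0.6])}
```
--------------------------------------------------------------------------------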
/lib/model/test.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/model/test.pyc
--------------------------------------------------------------------------------
/lib/model/test.py~:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Tensorflow Faster R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Xinlei Chen
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import cv2
11 | import numpy as np
12 | try:
13 | import cPickle as pickle
14 | except ImportError:
15 | import pickle
16 | import os
17 | import math
18 |
19 | from utils.timer import Timer
20 | from utils.cython_nms import nms, nms_new
21 | from utils.boxes_grid import get_boxes_grid
22 | from utils.blob import im_list_to_blob
23 |
24 | from model.config import cfg, get_output_dir
25 | from model.bbox_transform import clip_boxes, bbox_transform_inv
26 |
27 | def _get_image_blob(im):
28 | """Converts an image into a network input.
29 | Arguments:
30 | im (ndarray): a color image in BGR order
31 | Returns:
32 | blob (ndarray): a data blob holding an image pyramid
33 | im_scale_factors (list): list of image scales (relative to im) used
34 | in the image pyramid
35 | """
36 | im_orig = im.astype(np.float32, copy=True)
37 | im_orig -= cfg.PIXEL_MEANS
38 |
39 | im_shape = im_orig.shape
40 | im_size_min = np.min(im_shape[0:2])
41 | im_size_max = np.max(im_shape[0:2])
42 |
43 | processed_ims = []
44 | im_scale_factors = []
45 |
46 | for target_size in cfg.TEST.SCALES:
47 | im_scale = float(target_size) / float(im_size_min)
48 | # Prevent the biggest axis from being more than MAX_SIZE
49 | if np.round(im_scale * im_size_max) > cfg.TEST.MAX_SIZE:
50 | im_scale = float(cfg.TEST.MAX_SIZE) / float(im_size_max)
51 | im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,
52 | interpolation=cv2.INTER_LINEAR)
53 | im_scale_factors.append(im_scale)
54 | processed_ims.append(im)
55 |
56 | # Create a blob to hold the input images
57 | blob = im_list_to_blob(processed_ims)
58 |
59 | return blob, np.array(im_scale_factors)
60 |
61 | def _get_blobs(im):
62 | """Convert an image and RoIs within that image into network inputs."""
63 | blobs = {}
64 | blobs['data'], im_scale_factors = _get_image_blob(im)
65 |
66 | return blobs, im_scale_factors
67 |
68 | def _clip_boxes(boxes, im_shape):
69 | """Clip boxes to image boundaries."""
70 | # x1 >= 0
71 | boxes[:, 0::4] = np.maximum(boxes[:, 0::4], 0)
72 | # y1 >= 0
73 | boxes[:, 1::4] = np.maximum(boxes[:, 1::4], 0)
74 | # x2 < im_shape[1]
75 | boxes[:, 2::4] = np.minimum(boxes[:, 2::4], im_shape[1] - 1)
76 | # y2 < im_shape[0]
77 | boxes[:, 3::4] = np.minimum(boxes[:, 3::4], im_shape[0] - 1)
78 | return boxes
79 |
80 | def _rescale_boxes(boxes, inds, scales):
81 | """Rescale boxes according to image rescaling."""
82 | for i in range(boxes.shape[0]):
83 | boxes[i,:] = boxes[i,:] / scales[int(inds[i])]
84 |
85 | return boxes
86 |
87 | def im_detect(sess, net, im):
88 | blobs, im_scales = _get_blobs(im)
89 | assert len(im_scales) == 1, "Only single-image batch implemented"
90 | print (im_scales)
91 | im_blob = blobs['data']
92 | # seems to have height, width, and image scales
93 | # still not sure about the scale, maybe full image it is 1.
94 | blobs['im_info'] = np.array([[im_blob.shape[1], im_blob.shape[2], im_scales[0]]], dtype=np.float32)
95 |
96 | _, scores, bbox_pred, rois = net.test_image(sess, blobs['data'], blobs['im_info'])
97 |
98 | boxes = rois[:, 1:5] / im_scales[0]
99 | # print(scores.shape, bbox_pred.shape, rois.shape, boxes.shape)
100 | scores = np.reshape(scores, [scores.shape[0], -1])
101 | bbox_pred = np.reshape(bbox_pred, [bbox_pred.shape[0], -1])
102 | if cfg.TEST.BBOX_REG:
103 | # Apply bounding-box regression deltas
104 | box_deltas = bbox_pred
105 | pred_boxes = bbox_transform_inv(boxes, box_deltas)
106 | pred_boxes = _clip_boxes(pred_boxes, im.shape)
107 | else:
108 | # Simply repeat the boxes, once for each class
109 | pred_boxes = np.tile(boxes, (1, scores.shape[1]))
110 |
111 | return scores, pred_boxes
112 |
113 | def apply_nms(all_boxes, thresh):
114 | """Apply non-maximum suppression to all predicted boxes output by the
115 | test_net method.
116 | """
117 | num_classes = len(all_boxes)
118 | num_images = len(all_boxes[0])
119 | nms_boxes = [[[] for _ in range(num_images)] for _ in range(num_classes)]
120 | for cls_ind in range(num_classes):
121 | for im_ind in range(num_images):
122 | dets = all_boxes[cls_ind][im_ind]
123 |       if len(dets) == 0:
124 | continue
125 |
126 | x1 = dets[:, 0]
127 | y1 = dets[:, 1]
128 | x2 = dets[:, 2]
129 | y2 = dets[:, 3]
130 | scores = dets[:, 4]
131 | inds = np.where((x2 > x1) & (y2 > y1) & (scores > cfg.TEST.DET_THRESHOLD))[0]
132 | dets = dets[inds,:]
133 |       if len(dets) == 0:
134 | continue
135 |
136 | keep = nms(dets, thresh)
137 | if len(keep) == 0:
138 | continue
139 | nms_boxes[cls_ind][im_ind] = dets[keep, :].copy()
140 | return nms_boxes
141 |
142 | def test_net(sess, net, imdb, weights_filename, max_per_image=100, thresh=0.05):
143 |   """Test a Fast R-CNN network on an image database."""
144 |   np.random.seed(cfg.RNG_SEED)
145 | num_images = len(imdb.image_index)
146 | # all detections are collected into:
147 | # all_boxes[cls][image] = N x 5 array of detections in
148 | # (x1, y1, x2, y2, score)
149 | all_boxes = [[[] for _ in range(num_images)]
150 | for _ in range(imdb.num_classes)]
151 |
152 | output_dir = get_output_dir(imdb, weights_filename)
153 | # timers
154 | _t = {'im_detect' : Timer(), 'misc' : Timer()}
155 |
156 | for i in range(num_images):
157 | im = cv2.imread(imdb.image_path_at(i))
158 |
159 | _t['im_detect'].tic()
160 | scores, boxes = im_detect(sess, net, im)
161 | _t['im_detect'].toc()
162 |
163 | _t['misc'].tic()
164 |
165 | # skip j = 0, because it's the background class
166 | for j in range(1, imdb.num_classes):
167 | inds = np.where(scores[:, j] > thresh)[0]
168 | cls_scores = scores[inds, j]
169 | cls_boxes = boxes[inds, j*4:(j+1)*4]
170 | cls_dets = np.hstack((cls_boxes, cls_scores[:, np.newaxis])) \
171 | .astype(np.float32, copy=False)
172 | keep = nms(cls_dets, cfg.TEST.NMS)
173 | cls_dets = cls_dets[keep, :]
174 | all_boxes[j][i] = cls_dets
175 |
176 | # Limit to max_per_image detections *over all classes*
177 | if max_per_image > 0:
178 | image_scores = np.hstack([all_boxes[j][i][:, -1]
179 | for j in range(1, imdb.num_classes)])
180 | if len(image_scores) > max_per_image:
181 | image_thresh = np.sort(image_scores)[-max_per_image]
182 | for j in range(1, imdb.num_classes):
183 | keep = np.where(all_boxes[j][i][:, -1] >= image_thresh)[0]
184 | all_boxes[j][i] = all_boxes[j][i][keep, :]
185 | _t['misc'].toc()
186 |
187 | print('im_detect: {:d}/{:d} {:.3f}s {:.3f}s' \
188 | .format(i + 1, num_images, _t['im_detect'].average_time,
189 | _t['misc'].average_time))
190 |
191 | det_file = os.path.join(output_dir, 'detections.pkl')
192 | with open(det_file, 'wb') as f:
193 | pickle.dump(all_boxes, f, pickle.HIGHEST_PROTOCOL)
194 |
195 | print('Evaluating detections')
196 | imdb.evaluate_detections(all_boxes, output_dir)
197 |
198 |
--------------------------------------------------------------------------------
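As an aside, the scale selection in `_get_image_blob` above is worth seeing in isolation: the short side is scaled toward the target in `cfg.TEST.SCALES`, then capped so the long side stays within `cfg.TEST.MAX_SIZE`. A minimal standalone sketch (the helper name and the 600/1000 defaults are illustrative, not read from this repo's cfgs):

    import numpy as np

    def compute_scale(h, w, target_size=600, max_size=1000):
        # Scale the shorter side to target_size...
        im_size_min, im_size_max = min(h, w), max(h, w)
        im_scale = float(target_size) / float(im_size_min)
        # ...but cap the factor so the longer side does not exceed max_size.
        if np.round(im_scale * im_size_max) > max_size:
            im_scale = float(max_size) / float(im_size_max)
        return im_scale

    print(compute_scale(480, 640))   # 1.25   -> image resized to 600 x 800
    print(compute_scale(480, 1600))  # 0.625  -> long side capped at 1000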
/lib/model/train_val.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/model/train_val.pyc
--------------------------------------------------------------------------------
/lib/nets/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nets/__init__.py
--------------------------------------------------------------------------------
/lib/nets/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nets/__init__.pyc
--------------------------------------------------------------------------------
/lib/nets/network.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nets/network.pyc
--------------------------------------------------------------------------------
/lib/nets/resnet_v1.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nets/resnet_v1.pyc
--------------------------------------------------------------------------------
/lib/nets/vgg16.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Tensorflow Faster R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Xinlei Chen
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import tensorflow as tf
11 | import tensorflow.contrib.slim as slim
12 | from tensorflow.contrib.slim import losses
13 | from tensorflow.contrib.slim import arg_scope
14 | import numpy as np
15 |
16 | from nets.network import Network
17 | from model.config import cfg
18 |
19 | class vgg16(Network):
20 | def __init__(self, batch_size=1):
21 | Network.__init__(self, batch_size=batch_size)
22 |
23 | def build_network(self, sess, is_training=True):
24 | with tf.variable_scope('vgg_16', 'vgg_16'):
25 | # select initializers
26 | if cfg.TRAIN.TRUNCATED:
27 | initializer = tf.truncated_normal_initializer(mean=0.0, stddev=0.01)
28 | initializer_bbox = tf.truncated_normal_initializer(mean=0.0, stddev=0.001)
29 | else:
30 | initializer = tf.random_normal_initializer(mean=0.0, stddev=0.01)
31 | initializer_bbox = tf.random_normal_initializer(mean=0.0, stddev=0.001)
32 |
33 | net = slim.repeat(self._image, 2, slim.conv2d, 64, [3, 3],
34 | trainable=False, scope='conv1')
35 | net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool1')
36 | net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3],
37 | trainable=False, scope='conv2')
38 | net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool2')
39 | net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3],
40 | trainable=is_training, scope='conv3')
41 | net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool3')
42 | net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
43 | trainable=is_training, scope='conv4')
44 | net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool4')
45 | net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
46 | trainable=is_training, scope='conv5')
47 | self._act_summaries.append(net)
48 | self._layers['head'] = net
49 | # build the anchors for the image
50 | self._anchor_component()
51 |
52 | # rpn
53 | rpn = slim.conv2d(net, 512, [3, 3], trainable=is_training, weights_initializer=initializer, scope="rpn_conv/3x3")
54 | self._act_summaries.append(rpn)
55 | rpn_cls_score = slim.conv2d(rpn, self._num_anchors * 2, [1, 1], trainable=is_training,
56 | weights_initializer=initializer,
57 | padding='VALID', activation_fn=None, scope='rpn_cls_score')
58 | # change it so that the score has 2 as its channel size
59 | rpn_cls_score_reshape = self._reshape_layer(rpn_cls_score, 2, 'rpn_cls_score_reshape')
60 | rpn_cls_prob_reshape = self._softmax_layer(rpn_cls_score_reshape, "rpn_cls_prob_reshape")
61 | rpn_cls_prob = self._reshape_layer(rpn_cls_prob_reshape, self._num_anchors * 2, "rpn_cls_prob")
62 | rpn_bbox_pred = slim.conv2d(rpn, self._num_anchors * 4, [1, 1], trainable=is_training,
63 | weights_initializer=initializer,
64 | padding='VALID', activation_fn=None, scope='rpn_bbox_pred')
65 | if is_training:
66 | rois, roi_scores = self._proposal_layer(rpn_cls_prob, rpn_bbox_pred, "rois")
67 | rpn_labels = self._anchor_target_layer(rpn_cls_score, "anchor")
68 |         # Try to have a deterministic order in the computation graph, for reproducibility
69 | with tf.control_dependencies([rpn_labels]):
70 | rois, _ = self._proposal_target_layer(rois, roi_scores, "rpn_rois")
71 | else:
72 | if cfg.TEST.MODE == 'nms':
73 | rois, _ = self._proposal_layer(rpn_cls_prob, rpn_bbox_pred, "rois")
74 | elif cfg.TEST.MODE == 'top':
75 | rois, _ = self._proposal_top_layer(rpn_cls_prob, rpn_bbox_pred, "rois")
76 | else:
77 | raise NotImplementedError
78 |
79 | # rcnn
80 | if cfg.POOLING_MODE == 'crop':
81 | pool5 = self._crop_pool_layer(net, rois, "pool5")
82 | else:
83 | raise NotImplementedError
84 |
85 | pool5_flat = slim.flatten(pool5, scope='flatten')
86 | fc6 = slim.fully_connected(pool5_flat, 4096, scope='fc6')
87 | if is_training:
88 | fc6 = slim.dropout(fc6, keep_prob=0.5, is_training=True, scope='dropout6')
89 | fc7 = slim.fully_connected(fc6, 4096, scope='fc7')
90 | if is_training:
91 | fc7 = slim.dropout(fc7, keep_prob=0.5, is_training=True, scope='dropout7')
92 | cls_score = slim.fully_connected(fc7, self._num_classes,
93 | weights_initializer=initializer,
94 | trainable=is_training,
95 | activation_fn=None, scope='cls_score')
96 | cls_prob = self._softmax_layer(cls_score, "cls_prob")
97 | bbox_pred = slim.fully_connected(fc7, self._num_classes * 4,
98 | weights_initializer=initializer_bbox,
99 | trainable=is_training,
100 | activation_fn=None, scope='bbox_pred')
101 |
102 | self._predictions["rpn_cls_score"] = rpn_cls_score
103 | self._predictions["rpn_cls_score_reshape"] = rpn_cls_score_reshape
104 | self._predictions["rpn_cls_prob"] = rpn_cls_prob
105 | self._predictions["rpn_bbox_pred"] = rpn_bbox_pred
106 | self._predictions["cls_score"] = cls_score
107 | self._predictions["cls_prob"] = cls_prob
108 | self._predictions["bbox_pred"] = bbox_pred
109 | self._predictions["rois"] = rois
110 |
111 | self._score_summaries.update(self._predictions)
112 |
113 | return rois, cls_prob, bbox_pred
114 |
115 | def get_variables_to_restore(self, variables, var_keep_dic):
116 | variables_to_restore = []
117 |
118 | for v in variables:
119 | # exclude the conv weights that are fc weights in vgg16
120 | if v.name == 'vgg_16/fc6/weights:0' or v.name == 'vgg_16/fc7/weights:0':
121 | self._variables_to_fix[v.name] = v
122 | continue
123 | # exclude the first conv layer to swap RGB to BGR
124 | if v.name == 'vgg_16/conv1/conv1_1/weights:0':
125 | self._variables_to_fix[v.name] = v
126 | continue
127 | if v.name.split(':')[0] in var_keep_dic:
128 |         print('Variables restored: %s' % v.name)
129 | variables_to_restore.append(v)
130 |
131 | return variables_to_restore
132 |
133 | def fix_variables(self, sess, pretrained_model):
134 |     print('Fixing VGG16 layers...')
135 | with tf.variable_scope('Fix_VGG16') as scope:
136 | with tf.device("/cpu:0"):
137 | # fix the vgg16 issue from conv weights to fc weights
138 | # fix RGB to BGR
139 | fc6_conv = tf.get_variable("fc6_conv", [7, 7, 512, 4096], trainable=False)
140 | fc7_conv = tf.get_variable("fc7_conv", [1, 1, 4096, 4096], trainable=False)
141 | conv1_rgb = tf.get_variable("conv1_rgb", [3, 3, 3, 64], trainable=False)
142 | restorer_fc = tf.train.Saver({"vgg_16/fc6/weights": fc6_conv,
143 | "vgg_16/fc7/weights": fc7_conv,
144 | "vgg_16/conv1/conv1_1/weights": conv1_rgb})
145 | restorer_fc.restore(sess, pretrained_model)
146 |
147 | sess.run(tf.assign(self._variables_to_fix['vgg_16/fc6/weights:0'], tf.reshape(fc6_conv,
148 | self._variables_to_fix['vgg_16/fc6/weights:0'].get_shape())))
149 | sess.run(tf.assign(self._variables_to_fix['vgg_16/fc7/weights:0'], tf.reshape(fc7_conv,
150 | self._variables_to_fix['vgg_16/fc7/weights:0'].get_shape())))
151 | sess.run(tf.assign(self._variables_to_fix['vgg_16/conv1/conv1_1/weights:0'],
152 | tf.reverse(conv1_rgb, [2])))
--------------------------------------------------------------------------------
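The `tf.reverse(conv1_rgb, [2])` at the end of `fix_variables` flips the first conv layer's kernels along the input-channel axis, so weights pretrained on RGB images line up with the BGR images that cv2.imread produces. A small numpy sketch of the same index flip (the [3, 3, 3, 64] shape is taken from the code above; the random kernel is just a stand-in):

    import numpy as np

    # conv1 kernels: (k_h, k_w, in_channels, out_channels), in_channels in RGB order
    rgb_kernel = np.random.randn(3, 3, 3, 64).astype(np.float32)

    # Equivalent of tf.reverse(conv1_rgb, [2]): reverse the input-channel axis
    bgr_kernel = rgb_kernel[:, :, ::-1, :]

    # The old R slice now sits where the B channel arrives, and vice versa
    assert np.array_equal(bgr_kernel[:, :, 0, :], rgb_kernel[:, :, 2, :])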
/lib/nets/vgg16.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nets/vgg16.pyc
--------------------------------------------------------------------------------
/lib/nms/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nms/.gitignore
--------------------------------------------------------------------------------
/lib/nms/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nms/__init__.py
--------------------------------------------------------------------------------
/lib/nms/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nms/__init__.pyc
--------------------------------------------------------------------------------
/lib/nms/cpu_nms.pyx:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | import numpy as np
9 | cimport numpy as np
10 |
11 | cdef inline np.float32_t max(np.float32_t a, np.float32_t b):
12 | return a if a >= b else b
13 |
14 | cdef inline np.float32_t min(np.float32_t a, np.float32_t b):
15 | return a if a <= b else b
16 |
17 | def cpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh):
18 | cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0]
19 | cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1]
20 | cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2]
21 | cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3]
22 | cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4]
23 |
24 | cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1)
25 | cdef np.ndarray[np.int_t, ndim=1] order = scores.argsort()[::-1]
26 |
27 | cdef int ndets = dets.shape[0]
28 | cdef np.ndarray[np.int_t, ndim=1] suppressed = \
29 | np.zeros((ndets), dtype=np.int)
30 |
31 | # nominal indices
32 | cdef int _i, _j
33 | # sorted indices
34 | cdef int i, j
35 | # temp variables for box i's (the box currently under consideration)
36 | cdef np.float32_t ix1, iy1, ix2, iy2, iarea
37 | # variables for computing overlap with box j (lower scoring box)
38 | cdef np.float32_t xx1, yy1, xx2, yy2
39 | cdef np.float32_t w, h
40 | cdef np.float32_t inter, ovr
41 |
42 | keep = []
43 | for _i in range(ndets):
44 | i = order[_i]
45 | if suppressed[i] == 1:
46 | continue
47 | keep.append(i)
48 | ix1 = x1[i]
49 | iy1 = y1[i]
50 | ix2 = x2[i]
51 | iy2 = y2[i]
52 | iarea = areas[i]
53 | for _j in range(_i + 1, ndets):
54 | j = order[_j]
55 | if suppressed[j] == 1:
56 | continue
57 | xx1 = max(ix1, x1[j])
58 | yy1 = max(iy1, y1[j])
59 | xx2 = min(ix2, x2[j])
60 | yy2 = min(iy2, y2[j])
61 | w = max(0.0, xx2 - xx1 + 1)
62 | h = max(0.0, yy2 - yy1 + 1)
63 | inter = w * h
64 | ovr = inter / (iarea + areas[j] - inter)
65 | if ovr >= thresh:
66 | suppressed[j] = 1
67 |
68 | return keep
69 |
--------------------------------------------------------------------------------
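The overlap test above uses the pixel-inclusive convention (the `+ 1` terms count both border pixels). A worked example with two 10x10 boxes offset by 5 pixels:

    a = (0, 0, 9, 9)    # area = (9 - 0 + 1) * (9 - 0 + 1) = 100
    b = (5, 5, 14, 14)  # area = 100

    w = min(a[2], b[2]) - max(a[0], b[0]) + 1  # 9 - 5 + 1 = 5
    h = min(a[3], b[3]) - max(a[1], b[1]) + 1  # 5
    inter = w * h                              # 25
    ovr = inter / float(100 + 100 - inter)     # 25 / 175 ~ 0.143
    print(ovr)  # below a typical 0.3 threshold, so neither box suppresses the other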
/lib/nms/cpu_nms.so:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nms/cpu_nms.so
--------------------------------------------------------------------------------
/lib/nms/gpu_nms.hpp:
--------------------------------------------------------------------------------
1 | void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num,
2 | int boxes_dim, float nms_overlap_thresh, int device_id);
3 |
--------------------------------------------------------------------------------
/lib/nms/gpu_nms.pyx:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Faster R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | import numpy as np
9 | cimport numpy as np
10 |
11 | assert sizeof(int) == sizeof(np.int32_t)
12 |
13 | cdef extern from "gpu_nms.hpp":
14 | void _nms(np.int32_t*, int*, np.float32_t*, int, int, float, int)
15 |
16 | def gpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh,
17 | np.int32_t device_id=0):
18 | cdef int boxes_num = dets.shape[0]
19 | cdef int boxes_dim = dets.shape[1]
20 | cdef int num_out
21 | cdef np.ndarray[np.int32_t, ndim=1] \
22 | keep = np.zeros(boxes_num, dtype=np.int32)
23 | cdef np.ndarray[np.float32_t, ndim=1] \
24 | scores = dets[:, 4]
25 | cdef np.ndarray[np.int_t, ndim=1] \
26 | order = scores.argsort()[::-1]
27 | cdef np.ndarray[np.float32_t, ndim=2] \
28 | sorted_dets = dets[order, :]
29 | _nms(&keep[0], &num_out, &sorted_dets[0, 0], boxes_num, boxes_dim, thresh, device_id)
30 | keep = keep[:num_out]
31 | return list(order[keep])
32 |
--------------------------------------------------------------------------------
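One subtlety in the wrapper above: `_nms` fills `keep` with indices into the score-sorted copy `sorted_dets`, so the final `list(order[keep])` is what maps the kept rows back to the caller's original ordering.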
/lib/nms/gpu_nms.so:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/nms/gpu_nms.so
--------------------------------------------------------------------------------
/lib/nms/nms_kernel.cu:
--------------------------------------------------------------------------------
1 | // ------------------------------------------------------------------
2 | // Faster R-CNN
3 | // Copyright (c) 2015 Microsoft
4 | // Licensed under The MIT License [see fast-rcnn/LICENSE for details]
5 | // Written by Shaoqing Ren
6 | // ------------------------------------------------------------------
7 |
8 | #include "gpu_nms.hpp"
9 | #include <vector>
10 | #include <iostream>
11 |
12 | #define CUDA_CHECK(condition) \
13 | /* Code block avoids redefinition of cudaError_t error */ \
14 | do { \
15 | cudaError_t error = condition; \
16 | if (error != cudaSuccess) { \
17 | std::cout << cudaGetErrorString(error) << std::endl; \
18 | } \
19 | } while (0)
20 |
21 | #define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0))
22 | int const threadsPerBlock = sizeof(unsigned long long) * 8;
23 |
24 | __device__ inline float devIoU(float const * const a, float const * const b) {
25 | float left = max(a[0], b[0]), right = min(a[2], b[2]);
26 | float top = max(a[1], b[1]), bottom = min(a[3], b[3]);
27 | float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f);
28 | float interS = width * height;
29 | float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1);
30 | float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1);
31 | return interS / (Sa + Sb - interS);
32 | }
33 |
34 | __global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh,
35 | const float *dev_boxes, unsigned long long *dev_mask) {
36 | const int row_start = blockIdx.y;
37 | const int col_start = blockIdx.x;
38 |
39 | // if (row_start > col_start) return;
40 |
41 | const int row_size =
42 | min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
43 | const int col_size =
44 | min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
45 |
46 | __shared__ float block_boxes[threadsPerBlock * 5];
47 | if (threadIdx.x < col_size) {
48 | block_boxes[threadIdx.x * 5 + 0] =
49 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];
50 | block_boxes[threadIdx.x * 5 + 1] =
51 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1];
52 | block_boxes[threadIdx.x * 5 + 2] =
53 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];
54 | block_boxes[threadIdx.x * 5 + 3] =
55 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];
56 | block_boxes[threadIdx.x * 5 + 4] =
57 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];
58 | }
59 | __syncthreads();
60 |
61 | if (threadIdx.x < row_size) {
62 | const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
63 | const float *cur_box = dev_boxes + cur_box_idx * 5;
64 | int i = 0;
65 | unsigned long long t = 0;
66 | int start = 0;
67 | if (row_start == col_start) {
68 | start = threadIdx.x + 1;
69 | }
70 | for (i = start; i < col_size; i++) {
71 | if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {
72 | t |= 1ULL << i;
73 | }
74 | }
75 | const int col_blocks = DIVUP(n_boxes, threadsPerBlock);
76 | dev_mask[cur_box_idx * col_blocks + col_start] = t;
77 | }
78 | }
79 |
80 | void _set_device(int device_id) {
81 | int current_device;
82 |   CUDA_CHECK(cudaGetDevice(&current_device));
83 | if (current_device == device_id) {
84 | return;
85 | }
86 | // The call to cudaSetDevice must come before any calls to Get, which
87 | // may perform initialization using the GPU.
88 | CUDA_CHECK(cudaSetDevice(device_id));
89 | }
90 |
91 | void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num,
92 | int boxes_dim, float nms_overlap_thresh, int device_id) {
93 | _set_device(device_id);
94 |
95 | float* boxes_dev = NULL;
96 | unsigned long long* mask_dev = NULL;
97 |
98 | const int col_blocks = DIVUP(boxes_num, threadsPerBlock);
99 |
100 | CUDA_CHECK(cudaMalloc(&boxes_dev,
101 | boxes_num * boxes_dim * sizeof(float)));
102 | CUDA_CHECK(cudaMemcpy(boxes_dev,
103 | boxes_host,
104 | boxes_num * boxes_dim * sizeof(float),
105 | cudaMemcpyHostToDevice));
106 |
107 | CUDA_CHECK(cudaMalloc(&mask_dev,
108 | boxes_num * col_blocks * sizeof(unsigned long long)));
109 |
110 | dim3 blocks(DIVUP(boxes_num, threadsPerBlock),
111 | DIVUP(boxes_num, threadsPerBlock));
112 | dim3 threads(threadsPerBlock);
113 |   nms_kernel<<<blocks, threads>>>(boxes_num,
114 | nms_overlap_thresh,
115 | boxes_dev,
116 | mask_dev);
117 |
118 |   std::vector<unsigned long long> mask_host(boxes_num * col_blocks);
119 | CUDA_CHECK(cudaMemcpy(&mask_host[0],
120 | mask_dev,
121 | sizeof(unsigned long long) * boxes_num * col_blocks,
122 | cudaMemcpyDeviceToHost));
123 |
124 |   std::vector<unsigned long long> remv(col_blocks);
125 | memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks);
126 |
127 | int num_to_keep = 0;
128 | for (int i = 0; i < boxes_num; i++) {
129 | int nblock = i / threadsPerBlock;
130 | int inblock = i % threadsPerBlock;
131 |
132 | if (!(remv[nblock] & (1ULL << inblock))) {
133 | keep_out[num_to_keep++] = i;
134 | unsigned long long *p = &mask_host[0] + i * col_blocks;
135 | for (int j = nblock; j < col_blocks; j++) {
136 | remv[j] |= p[j];
137 | }
138 | }
139 | }
140 | *num_out = num_to_keep;
141 |
142 | CUDA_CHECK(cudaFree(boxes_dev));
143 | CUDA_CHECK(cudaFree(mask_dev));
144 | }
145 |
--------------------------------------------------------------------------------
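The kernel packs its pairwise IoU decisions into 64-bit words: bit `j` of `dev_mask[i * col_blocks + col]` records whether box `i` (boxes are pre-sorted by score) suppresses the `j`-th box of column block `col`. The host-side loop at the end then keeps a box only if no previously kept box has set its bit. A pure-Python sketch of that final reduction (the function and argument names are ours, not part of the source):

    def reduce_mask(mask, n_boxes, threads_per_block=64):
        # mask[i][col]: 64-bit word of boxes in column block `col` suppressed by box i
        col_blocks = (n_boxes + threads_per_block - 1) // threads_per_block
        remv = [0] * col_blocks  # accumulated suppression bits, one word per block
        keep = []
        for i in range(n_boxes):  # boxes already sorted by descending score
            nblock, inblock = divmod(i, threads_per_block)
            if not (remv[nblock] >> inblock) & 1:  # box i not suppressed so far
                keep.append(i)
                for j in range(nblock, col_blocks):  # fold in what box i suppresses
                    remv[j] |= mask[i][j]
        return keep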
/lib/nms/py_cpu_nms.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | import numpy as np
9 |
10 | def py_cpu_nms(dets, thresh):
11 | """Pure Python NMS baseline."""
12 | x1 = dets[:, 0]
13 | y1 = dets[:, 1]
14 | x2 = dets[:, 2]
15 | y2 = dets[:, 3]
16 | scores = dets[:, 4]
17 |
18 | areas = (x2 - x1 + 1) * (y2 - y1 + 1)
19 | order = scores.argsort()[::-1]
20 |
21 | keep = []
22 | while order.size > 0:
23 | i = order[0]
24 | keep.append(i)
25 | xx1 = np.maximum(x1[i], x1[order[1:]])
26 | yy1 = np.maximum(y1[i], y1[order[1:]])
27 | xx2 = np.minimum(x2[i], x2[order[1:]])
28 | yy2 = np.minimum(y2[i], y2[order[1:]])
29 |
30 | w = np.maximum(0.0, xx2 - xx1 + 1)
31 | h = np.maximum(0.0, yy2 - yy1 + 1)
32 | inter = w * h
33 | ovr = inter / (areas[i] + areas[order[1:]] - inter)
34 |
35 | inds = np.where(ovr <= thresh)[0]
36 | order = order[inds + 1]
37 |
38 | return keep
39 |
--------------------------------------------------------------------------------
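A quick sanity check of `py_cpu_nms` on three detections, two of which overlap heavily (the 0.5 threshold is an illustrative value):

    import numpy as np
    # from nms.py_cpu_nms import py_cpu_nms  # importable once lib/ is on the path

    dets = np.array([
        [0.,  0.,  10., 10., 0.9],   # kept: highest score
        [1.,  1.,  11., 11., 0.8],   # suppressed: IoU with the first box ~ 0.70
        [20., 20., 30., 30., 0.7],   # kept: disjoint from both
    ], dtype=np.float32)

    print(py_cpu_nms(dets, thresh=0.5))  # -> [0, 2]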
/lib/roi_data_layer/__init__.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
--------------------------------------------------------------------------------
/lib/roi_data_layer/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/roi_data_layer/__init__.pyc
--------------------------------------------------------------------------------
/lib/roi_data_layer/layer.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick and Xinlei Chen
6 | # --------------------------------------------------------
7 |
8 | """The data layer used during training to train a Fast R-CNN network.
9 |
10 | RoIDataLayer implements a Caffe Python layer.
11 | """
12 | from __future__ import absolute_import
13 | from __future__ import division
14 | from __future__ import print_function
15 |
16 | from model.config import cfg
17 | from roi_data_layer.minibatch import get_minibatch
18 | import numpy as np
19 | import time
20 |
21 | class RoIDataLayer(object):
22 | """Fast R-CNN data layer used for training."""
23 |
24 | def __init__(self, roidb, num_classes, random=False):
25 | """Set the roidb to be used by this layer during training."""
26 | self._roidb = roidb
27 | self._num_classes = num_classes
28 | # Also set a random flag
29 | self._random = random
30 | self._shuffle_roidb_inds()
31 |
32 | def _shuffle_roidb_inds(self):
33 | """Randomly permute the training roidb."""
34 | # If the random flag is set,
35 | # then the database is shuffled according to system time
36 | # Useful for the validation set
37 | if self._random:
38 | st0 = np.random.get_state()
39 | millis = int(round(time.time() * 1000)) % 4294967295
40 | np.random.seed(millis)
41 |
42 | if cfg.TRAIN.ASPECT_GROUPING:
43 | widths = np.array([r['width'] for r in self._roidb])
44 | heights = np.array([r['height'] for r in self._roidb])
45 | horz = (widths >= heights)
46 | vert = np.logical_not(horz)
47 | horz_inds = np.where(horz)[0]
48 | vert_inds = np.where(vert)[0]
49 | inds = np.hstack((
50 | np.random.permutation(horz_inds),
51 | np.random.permutation(vert_inds)))
52 | inds = np.reshape(inds, (-1, 2))
53 | row_perm = np.random.permutation(np.arange(inds.shape[0]))
54 | inds = np.reshape(inds[row_perm, :], (-1,))
55 | self._perm = inds
56 | else:
57 | self._perm = np.random.permutation(np.arange(len(self._roidb)))
58 | # Restore the random state
59 | if self._random:
60 | np.random.set_state(st0)
61 |
62 | self._cur = 0
63 |
64 | def _get_next_minibatch_inds(self):
65 | """Return the roidb indices for the next minibatch."""
66 |
67 | if self._cur + cfg.TRAIN.IMS_PER_BATCH >= len(self._roidb):
68 | self._shuffle_roidb_inds()
69 |
70 | #print (len(self._perm))
71 |
72 | db_inds = self._perm[self._cur:self._cur + cfg.TRAIN.IMS_PER_BATCH]
73 | self._cur += cfg.TRAIN.IMS_PER_BATCH
74 |
75 | return db_inds
76 |
77 | def _get_next_minibatch(self):
78 | """Return the blobs to be used for the next minibatch.
79 |
80 | If cfg.TRAIN.USE_PREFETCH is True, then blobs will be computed in a
81 | separate process and made available through self._blob_queue.
82 | """
83 | db_inds = self._get_next_minibatch_inds()
84 | minibatch_db = [self._roidb[i] for i in db_inds]
85 | return get_minibatch(minibatch_db, self._num_classes)
86 |
87 | def forward(self):
88 | """Get blobs and copy them into this layer's top blob vector."""
89 | blobs = self._get_next_minibatch()
90 | return blobs
91 |
--------------------------------------------------------------------------------
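With `cfg.TRAIN.ASPECT_GROUPING` on, the shuffle above first permutes landscape and portrait images separately, concatenates them, and then shuffles consecutive pairs, so each pair in `self._perm` tends to share an orientation (and thus a padded blob shape). A condensed sketch of the same index gymnastics, with toy widths and heights (the reshape to (-1, 2) assumes an even image count, as the original does):

    import numpy as np

    widths  = np.array([640, 640, 480, 480, 640, 480])
    heights = np.array([480, 480, 640, 640, 480, 640])

    horz_inds = np.where(widths >= heights)[0]  # landscape
    vert_inds = np.where(widths <  heights)[0]  # portrait
    inds = np.hstack((np.random.permutation(horz_inds),
                      np.random.permutation(vert_inds)))
    inds = np.reshape(inds, (-1, 2))            # consecutive same-orientation pairs
    row_perm = np.random.permutation(np.arange(inds.shape[0]))
    perm = np.reshape(inds[row_perm, :], (-1,))
    print(perm)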
/lib/roi_data_layer/layer.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/roi_data_layer/layer.pyc
--------------------------------------------------------------------------------
/lib/roi_data_layer/minibatch.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick and Xinlei Chen
6 | # --------------------------------------------------------
7 |
8 | """Compute minibatch blobs for training a Fast R-CNN network."""
9 | from __future__ import absolute_import
10 | from __future__ import division
11 | from __future__ import print_function
12 |
13 | import numpy as np
14 | import numpy.random as npr
15 | import cv2
16 | from model.config import cfg
17 | from utils.blob import prep_im_for_blob, im_list_to_blob
18 |
19 | def get_minibatch(roidb, num_classes):
20 | """Given a roidb, construct a minibatch sampled from it."""
21 | num_images = len(roidb)
22 | # Sample random scales to use for each image in this batch
23 | random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
24 | size=num_images)
25 | assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
26 | 'num_images ({}) must divide BATCH_SIZE ({})'. \
27 | format(num_images, cfg.TRAIN.BATCH_SIZE)
28 |
29 | # Get the input image blob, formatted for caffe
30 | im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)
31 |
32 | blobs = {'data': im_blob}
33 |
34 | assert len(im_scales) == 1, "Single batch only"
35 | assert len(roidb) == 1, "Single batch only"
36 |
37 | # gt boxes: (x1, y1, x2, y2, cls)
38 | if cfg.TRAIN.USE_ALL_GT:
39 | # Include all ground truth boxes
40 | gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
41 | else:
42 | # For the COCO ground truth boxes, exclude the ones that are ''iscrowd''
43 |     gt_inds = np.where((roidb[0]['gt_classes'] != 0) & np.all(roidb[0]['gt_overlaps'].toarray() > -1.0, axis=1))[0]
44 | gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
45 | gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
46 | gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
47 | blobs['gt_boxes'] = gt_boxes
48 | blobs['im_info'] = np.array(
49 | [[im_blob.shape[1], im_blob.shape[2], im_scales[0]]],
50 | dtype=np.float32)
51 |
52 | return blobs
53 |
54 | def _get_image_blob(roidb, scale_inds):
55 | """Builds an input blob from the images in the roidb at the specified
56 | scales.
57 | """
58 | num_images = len(roidb)
59 | processed_ims = []
60 | im_scales = []
61 | for i in range(num_images):
62 | im = cv2.imread(roidb[i]['image'])
63 | if roidb[i]['flipped']:
64 | im = im[:, ::-1, :]
65 | target_size = cfg.TRAIN.SCALES[scale_inds[i]]
66 | im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
67 | cfg.TRAIN.MAX_SIZE)
68 | im_scales.append(im_scale)
69 | processed_ims.append(im)
70 |
71 | # Create a blob to hold the input images
72 | blob = im_list_to_blob(processed_ims)
73 |
74 | return blob, im_scales
75 |
--------------------------------------------------------------------------------
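The `gt_boxes` blob built above is just the annotated boxes rescaled into the resized image's coordinates, with the class index appended as a fifth column. A toy version with one box, one illustrative class label, and a 1.25x scale:

    import numpy as np

    boxes = np.array([[10., 10., 50., 50.]], dtype=np.float32)  # (x1, y1, x2, y2)
    gt_classes = np.array([3])
    im_scale = 1.25

    gt_boxes = np.empty((len(boxes), 5), dtype=np.float32)
    gt_boxes[:, 0:4] = boxes * im_scale  # coordinates in the resized image
    gt_boxes[:, 4] = gt_classes          # class label in the last column
    print(gt_boxes)  # [[12.5 12.5 62.5 62.5  3. ]]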
/lib/roi_data_layer/minibatch.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/roi_data_layer/minibatch.pyc
--------------------------------------------------------------------------------
/lib/roi_data_layer/minibatch.py~:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick and Xinlei Chen
6 | # --------------------------------------------------------
7 |
8 | """Compute minibatch blobs for training a Fast R-CNN network."""
9 | from __future__ import absolute_import
10 | from __future__ import division
11 | from __future__ import print_function
12 |
13 | import numpy as np
14 | import numpy.random as npr
15 | import cv2
16 | from model.config import cfg
17 | from utils.blob import prep_im_for_blob, im_list_to_blob
18 |
19 | def get_minibatch(roidb, num_classes):
20 | """Given a roidb, construct a minibatch sampled from it."""
21 | num_images = len(roidb)
22 | # Sample random scales to use for each image in this batch
23 | random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
24 | size=num_images)
25 |   print(num_images)
26 |   print(cfg.TRAIN.BATCH_SIZE)
27 | assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
28 | 'num_images ({}) must divide BATCH_SIZE ({})'. \
29 | format(num_images, cfg.TRAIN.BATCH_SIZE)
30 |
31 | # Get the input image blob, formatted for caffe
32 | im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)
33 |
34 | blobs = {'data': im_blob}
35 |
36 | assert len(im_scales) == 1, "Single batch only"
37 | assert len(roidb) == 1, "Single batch only"
38 |
39 | # gt boxes: (x1, y1, x2, y2, cls)
40 | if cfg.TRAIN.USE_ALL_GT:
41 | # Include all ground truth boxes
42 | gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
43 | else:
44 | # For the COCO ground truth boxes, exclude the ones that are ''iscrowd''
45 |     gt_inds = np.where((roidb[0]['gt_classes'] != 0) & np.all(roidb[0]['gt_overlaps'].toarray() > -1.0, axis=1))[0]
46 | gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
47 | gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
48 | gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
49 | blobs['gt_boxes'] = gt_boxes
50 | blobs['im_info'] = np.array(
51 | [[im_blob.shape[1], im_blob.shape[2], im_scales[0]]],
52 | dtype=np.float32)
53 |
54 | return blobs
55 |
56 | def _get_image_blob(roidb, scale_inds):
57 | """Builds an input blob from the images in the roidb at the specified
58 | scales.
59 | """
60 | num_images = len(roidb)
61 | processed_ims = []
62 | im_scales = []
63 | for i in range(num_images):
64 | im = cv2.imread(roidb[i]['image'])
65 | if roidb[i]['flipped']:
66 | im = im[:, ::-1, :]
67 | target_size = cfg.TRAIN.SCALES[scale_inds[i]]
68 | im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
69 | cfg.TRAIN.MAX_SIZE)
70 | im_scales.append(im_scale)
71 | processed_ims.append(im)
72 |
73 | # Create a blob to hold the input images
74 | blob = im_list_to_blob(processed_ims)
75 |
76 | return blob, im_scales
77 |
--------------------------------------------------------------------------------
/lib/roi_data_layer/roidb.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | """Transform a roidb into a trainable roidb by adding a bunch of metadata."""
9 | from __future__ import absolute_import
10 | from __future__ import division
11 | from __future__ import print_function
12 |
13 | import numpy as np
14 | from model.config import cfg
15 | from model.bbox_transform import bbox_transform
16 | from utils.cython_bbox import bbox_overlaps
17 | import PIL
18 |
19 | def prepare_roidb(imdb):
20 | """Enrich the imdb's roidb by adding some derived quantities that
21 | are useful for training. This function precomputes the maximum
22 | overlap, taken over ground-truth boxes, between each ROI and
23 | each ground-truth box. The class with maximum overlap is also
24 | recorded.
25 | """
26 | roidb = imdb.roidb
27 | if not (imdb.name.startswith('coco')):
28 | sizes = [PIL.Image.open(imdb.image_path_at(i)).size
29 | for i in range(imdb.num_images)]
30 | for i in range(len(imdb.image_index)):
31 | roidb[i]['image'] = imdb.image_path_at(i)
32 | if not (imdb.name.startswith('coco')):
33 | roidb[i]['width'] = sizes[i][0]
34 | roidb[i]['height'] = sizes[i][1]
35 | # need gt_overlaps as a dense array for argmax
36 | gt_overlaps = roidb[i]['gt_overlaps'].toarray()
37 | # max overlap with gt over classes (columns)
38 | max_overlaps = gt_overlaps.max(axis=1)
39 | # gt class that had the max overlap
40 | max_classes = gt_overlaps.argmax(axis=1)
41 | roidb[i]['max_classes'] = max_classes
42 | roidb[i]['max_overlaps'] = max_overlaps
43 | # sanity checks
44 | # max overlap of 0 => class should be zero (background)
45 | zero_inds = np.where(max_overlaps == 0)[0]
46 | assert all(max_classes[zero_inds] == 0)
47 | # max overlap > 0 => class should not be zero (must be a fg class)
48 | nonzero_inds = np.where(max_overlaps > 0)[0]
49 | assert all(max_classes[nonzero_inds] != 0)
50 |
--------------------------------------------------------------------------------
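The derived fields in `prepare_roidb` are plain row-wise reductions over the RoI-by-class overlap matrix. A toy example that also exercises the two sanity checks at the end (class indices are illustrative):

    import numpy as np

    # gt_overlaps: one row per RoI, one column per class (column 0 = background)
    gt_overlaps = np.array([
        [0.0, 0.0, 0.0],   # RoI 0: no overlap with any gt -> background
        [0.0, 0.8, 0.1],   # RoI 1: best overlap with class 1
        [0.0, 0.2, 0.6],   # RoI 2: best overlap with class 2
    ], dtype=np.float32)

    max_overlaps = gt_overlaps.max(axis=1)     # [0.0, 0.8, 0.6]
    max_classes = gt_overlaps.argmax(axis=1)   # [0, 1, 2]
    assert all(max_classes[np.where(max_overlaps == 0)[0]] == 0)
    assert all(max_classes[np.where(max_overlaps > 0)[0]] != 0)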
/lib/roi_data_layer/roidb.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/roi_data_layer/roidb.pyc
--------------------------------------------------------------------------------
/lib/setup.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | import os
9 | from os.path import join as pjoin
10 | import numpy as np
11 | from distutils.core import setup
12 | from distutils.extension import Extension
13 | from Cython.Distutils import build_ext
14 |
15 | def find_in_path(name, path):
16 | "Find a file in a search path"
17 |     # adapted from http://code.activestate.com/recipes/52224-find-a-file-given-a-search-path/
18 | for dir in path.split(os.pathsep):
19 | binpath = pjoin(dir, name)
20 | if os.path.exists(binpath):
21 | return os.path.abspath(binpath)
22 | return None
23 |
24 | def locate_cuda():
25 | """Locate the CUDA environment on the system
26 |
27 | Returns a dict with keys 'home', 'nvcc', 'include', and 'lib64'
28 | and values giving the absolute path to each directory.
29 |
30 | Starts by looking for the CUDAHOME env variable. If not found, everything
31 | is based on finding 'nvcc' in the PATH.
32 | """
33 |
34 | # first check if the CUDAHOME env variable is in use
35 | if 'CUDAHOME' in os.environ:
36 | home = os.environ['CUDAHOME']
37 | nvcc = pjoin(home, 'bin', 'nvcc')
38 | else:
39 | # otherwise, search the PATH for NVCC
40 | default_path = pjoin(os.sep, 'usr', 'local', 'cuda', 'bin')
41 | nvcc = find_in_path('nvcc', os.environ['PATH'] + os.pathsep + default_path)
42 | if nvcc is None:
43 | raise EnvironmentError('The nvcc binary could not be '
44 | 'located in your $PATH. Either add it to your path, or set $CUDAHOME')
45 | home = os.path.dirname(os.path.dirname(nvcc))
46 |
47 | cudaconfig = {'home':home, 'nvcc':nvcc,
48 | 'include': pjoin(home, 'include'),
49 | 'lib64': pjoin(home, 'lib64')}
50 | for k, v in cudaconfig.items():
51 | if not os.path.exists(v):
52 | raise EnvironmentError('The CUDA %s path could not be located in %s' % (k, v))
53 |
54 | return cudaconfig
55 | CUDA = locate_cuda()
56 |
57 | # Obtain the numpy include directory. This logic works across numpy versions.
58 | try:
59 | numpy_include = np.get_include()
60 | except AttributeError:
61 | numpy_include = np.get_numpy_include()
62 |
63 | def customize_compiler_for_nvcc(self):
64 | """inject deep into distutils to customize how the dispatch
65 | to gcc/nvcc works.
66 |
67 | If you subclass UnixCCompiler, it's not trivial to get your subclass
68 | injected in, and still have the right customizations (i.e.
69 | distutils.sysconfig.customize_compiler) run on it. So instead of going
70 |     the OO route, I have this. Note, it's kind of like a weird functional
71 | subclassing going on."""
72 |
73 |     # tell the compiler it can process .cu files
74 | self.src_extensions.append('.cu')
75 |
76 |     # save references to the default compiler_so and _compile methods
77 | default_compiler_so = self.compiler_so
78 | super = self._compile
79 |
80 | # now redefine the _compile method. This gets executed for each
81 | # object but distutils doesn't have the ability to change compilers
82 | # based on source extension: we add it.
83 | def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts):
84 | print(extra_postargs)
85 | if os.path.splitext(src)[1] == '.cu':
86 | # use the cuda for .cu files
87 | self.set_executable('compiler_so', CUDA['nvcc'])
88 | # use only a subset of the extra_postargs, which are 1-1 translated
89 | # from the extra_compile_args in the Extension class
90 | postargs = extra_postargs['nvcc']
91 | else:
92 | postargs = extra_postargs['gcc']
93 |
94 | super(obj, src, ext, cc_args, postargs, pp_opts)
95 | # reset the default compiler_so, which we might have changed for cuda
96 | self.compiler_so = default_compiler_so
97 |
98 | # inject our redefined _compile method into the class
99 | self._compile = _compile
100 |
101 | # run the customize_compiler
102 | class custom_build_ext(build_ext):
103 | def build_extensions(self):
104 | customize_compiler_for_nvcc(self.compiler)
105 | build_ext.build_extensions(self)
106 |
107 | ext_modules = [
108 | Extension(
109 | "utils.cython_bbox",
110 | ["utils/bbox.pyx"],
111 | extra_compile_args={'gcc': ["-Wno-cpp", "-Wno-unused-function"]},
112 | include_dirs = [numpy_include]
113 | ),
114 | Extension(
115 | "utils.cython_nms",
116 | ["utils/nms.pyx"],
117 | extra_compile_args={'gcc': ["-Wno-cpp", "-Wno-unused-function"]},
118 | include_dirs = [numpy_include]
119 | ),
120 | Extension(
121 | "nms.cpu_nms",
122 | ["nms/cpu_nms.pyx"],
123 | extra_compile_args={'gcc': ["-Wno-cpp", "-Wno-unused-function"]},
124 | include_dirs = [numpy_include]
125 | ),
126 | Extension('nms.gpu_nms',
127 | ['nms/nms_kernel.cu', 'nms/gpu_nms.pyx'],
128 | library_dirs=[CUDA['lib64']],
129 | libraries=['cudart'],
130 | language='c++',
131 | runtime_library_dirs=[CUDA['lib64']],
132 | # this syntax is specific to this build system
133 | # we're only going to use certain compiler args with nvcc and not with gcc
134 |               # the implementation of this trick is in customize_compiler_for_nvcc() above
135 | extra_compile_args={'gcc': ["-Wno-unused-function"],
136 | 'nvcc': ['-arch=sm_52',
137 | '--ptxas-options=-v',
138 | '-c',
139 | '--compiler-options',
140 | "'-fPIC'"]},
141 | include_dirs = [numpy_include, CUDA['include']]
142 | )
143 | ]
144 |
145 | setup(
146 | name='tf_faster_rcnn',
147 | ext_modules=ext_modules,
148 | # inject our custom trigger
149 | cmdclass={'build_ext': custom_build_ext},
150 | )
151 |
--------------------------------------------------------------------------------
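In practice this script is run in place to produce the `.so` files checked in alongside it (e.g. `python setup.py build_ext --inplace`, which is what the `lib/Makefile` in this layout conventionally wraps in upstream tf-faster-rcnn); note that the hard-coded `-arch=sm_52` may need adjusting for other GPU generations.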
/lib/utils/.gitignore:
--------------------------------------------------------------------------------
1 | *.c
2 | *.cpp
3 | *.h
4 | *.hpp
5 |
--------------------------------------------------------------------------------
/lib/utils/__init__.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
--------------------------------------------------------------------------------
/lib/utils/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/utils/__init__.pyc
--------------------------------------------------------------------------------
/lib/utils/bbox.pyx:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Sergey Karayev
6 | # --------------------------------------------------------
7 |
8 | cimport cython
9 | import numpy as np
10 | cimport numpy as np
11 |
12 | DTYPE = np.float
13 | ctypedef np.float_t DTYPE_t
14 |
15 | def bbox_overlaps(
16 | np.ndarray[DTYPE_t, ndim=2] boxes,
17 | np.ndarray[DTYPE_t, ndim=2] query_boxes):
18 | """
19 | Parameters
20 | ----------
21 | boxes: (N, 4) ndarray of float
22 | query_boxes: (K, 4) ndarray of float
23 | Returns
24 | -------
25 | overlaps: (N, K) ndarray of overlap between boxes and query_boxes
26 | """
27 | cdef unsigned int N = boxes.shape[0]
28 | cdef unsigned int K = query_boxes.shape[0]
29 | cdef np.ndarray[DTYPE_t, ndim=2] overlaps = np.zeros((N, K), dtype=DTYPE)
30 | cdef DTYPE_t iw, ih, box_area
31 | cdef DTYPE_t ua
32 | cdef unsigned int k, n
33 | for k in range(K):
34 | box_area = (
35 | (query_boxes[k, 2] - query_boxes[k, 0] + 1) *
36 | (query_boxes[k, 3] - query_boxes[k, 1] + 1)
37 | )
38 | for n in range(N):
39 | iw = (
40 | min(boxes[n, 2], query_boxes[k, 2]) -
41 | max(boxes[n, 0], query_boxes[k, 0]) + 1
42 | )
43 | if iw > 0:
44 | ih = (
45 | min(boxes[n, 3], query_boxes[k, 3]) -
46 | max(boxes[n, 1], query_boxes[k, 1]) + 1
47 | )
48 | if ih > 0:
49 | ua = float(
50 | (boxes[n, 2] - boxes[n, 0] + 1) *
51 | (boxes[n, 3] - boxes[n, 1] + 1) +
52 | box_area - iw * ih
53 | )
54 | overlaps[n, k] = iw * ih / ua
55 | return overlaps
56 |
57 | def bbox_overlaps_self(
58 | np.ndarray[DTYPE_t, ndim=2] boxes,
59 | np.ndarray[DTYPE_t, ndim=2] query_boxes):
60 | """
61 | Parameters
62 | ----------
63 | boxes: (N, 4) ndarray of float
64 | query_boxes: (K, 4) ndarray of float
65 | Returns
66 | -------
67 | overlaps: (N, K) ndarray of overlap between boxes and query_boxes
68 | """
69 | cdef unsigned int N = boxes.shape[0]
70 | cdef unsigned int K = query_boxes.shape[0]
71 | cdef np.ndarray[DTYPE_t, ndim=2] overlaps = np.zeros((N, K), dtype=DTYPE)
72 | cdef DTYPE_t iw, ih, box_area
73 | cdef DTYPE_t ua
74 | cdef unsigned int k, n
75 | for k in range(K):
76 | box_area = (
77 | (query_boxes[k, 2] - query_boxes[k, 0] + 1) *
78 | (query_boxes[k, 3] - query_boxes[k, 1] + 1)
79 | )
80 | for n in range(N):
81 | iw = (
82 | min(boxes[n, 2], query_boxes[k, 2]) -
83 | max(boxes[n, 0], query_boxes[k, 0]) + 1
84 | )
85 | if iw > 0:
86 | ih = (
87 | min(boxes[n, 3], query_boxes[k, 3]) -
88 | max(boxes[n, 1], query_boxes[k, 1]) + 1
89 | )
90 | if ih > 0:
91 | ua = float(box_area)
92 | overlaps[n, k] = iw * ih / ua
93 | return overlaps
--------------------------------------------------------------------------------
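A minimal usage sketch for `bbox_overlaps` once the extension is compiled; both arguments must be float arrays to match `DTYPE` above, and the expected values follow the same pixel-inclusive IoU convention as the NMS code:

    import numpy as np
    # from utils.cython_bbox import bbox_overlaps  # available after building lib/

    boxes = np.array([[0., 0., 9., 9.]])
    query_boxes = np.array([[0., 0., 9., 9.],
                            [5., 5., 14., 14.]])

    overlaps = bbox_overlaps(boxes, query_boxes)  # shape (1, 2)
    print(overlaps)  # approximately [[1.0, 0.143]]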
/lib/utils/blob.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | """Blob helper functions."""
9 | from __future__ import absolute_import
10 | from __future__ import division
11 | from __future__ import print_function
12 |
13 | import numpy as np
14 | import cv2
15 |
16 |
17 | def im_list_to_blob(ims):
18 | """Convert a list of images into a network input.
19 |
20 | Assumes images are already prepared (means subtracted, BGR order, ...).
21 | """
22 | max_shape = np.array([im.shape for im in ims]).max(axis=0)
23 | num_images = len(ims)
24 | blob = np.zeros((num_images, max_shape[0], max_shape[1], 3),
25 | dtype=np.float32)
26 | for i in range(num_images):
27 | im = ims[i]
28 | blob[i, 0:im.shape[0], 0:im.shape[1], :] = im
29 |
30 | return blob
31 |
32 |
33 | def prep_im_for_blob(im, pixel_means, target_size, max_size):
34 | """Mean subtract and scale an image for use in a blob."""
35 | im = im.astype(np.float32, copy=False)
36 | im -= pixel_means
37 | im_shape = im.shape
38 | im_size_min = np.min(im_shape[0:2])
39 | im_size_max = np.max(im_shape[0:2])
40 | im_scale = float(target_size) / float(im_size_min)
41 | # Prevent the biggest axis from being more than MAX_SIZE
42 | if np.round(im_scale * im_size_max) > max_size:
43 | im_scale = float(max_size) / float(im_size_max)
44 | im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,
45 | interpolation=cv2.INTER_LINEAR)
46 |
47 | return im, im_scale
48 |
--------------------------------------------------------------------------------
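Putting the two helpers together: `prep_im_for_blob` mean-subtracts and rescales each image, and `im_list_to_blob` zero-pads the results into a single NHWC batch. A sketch, assuming the usual Caffe BGR pixel means, the illustrative 600/1000 sizes, and that it is run from the repo root so the demo path resolves:

    import cv2
    import numpy as np
    # from utils.blob import prep_im_for_blob, im_list_to_blob  # inside lib/

    pixel_means = np.array([[[102.9801, 115.9465, 122.7717]]])  # assumed BGR means

    ims, im_scales = [], []
    for path in ['demo/pcd0100r_rgd_preprocessed_1.png']:
        im = cv2.imread(path)  # BGR, H x W x 3
        im, scale = prep_im_for_blob(im, pixel_means, target_size=600, max_size=1000)
        ims.append(im)
        im_scales.append(scale)

    blob = im_list_to_blob(ims)  # (1, H, W, 3), zero-padded to the largest image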
/lib/utils/blob.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/utils/blob.pyc
--------------------------------------------------------------------------------
/lib/utils/boxes_grid.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Subcategory CNN
3 | # Copyright (c) 2015 CVGL Stanford
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Yu Xiang
6 | # --------------------------------------------------------
7 | from __future__ import absolute_import
8 | from __future__ import division
9 | from __future__ import print_function
10 |
11 | import numpy as np
12 | import math
13 | from model.config import cfg
14 |
15 |
16 | def get_boxes_grid(image_height, image_width):
17 | """
18 | Return the boxes on image grid.
19 | """
20 |
21 | # height and width of the heatmap
22 | if cfg.NET_NAME == 'CaffeNet':
23 | height = np.floor((image_height * max(cfg.TRAIN.SCALES) - 1) / 4.0 + 1)
24 | height = np.floor((height - 1) / 2.0 + 1 + 0.5)
25 | height = np.floor((height - 1) / 2.0 + 1 + 0.5)
26 |
27 | width = np.floor((image_width * max(cfg.TRAIN.SCALES) - 1) / 4.0 + 1)
28 | width = np.floor((width - 1) / 2.0 + 1 + 0.5)
29 | width = np.floor((width - 1) / 2.0 + 1 + 0.5)
30 | elif cfg.NET_NAME == 'VGGnet':
31 | height = np.floor(image_height * max(cfg.TRAIN.SCALES) / 2.0 + 0.5)
32 | height = np.floor(height / 2.0 + 0.5)
33 | height = np.floor(height / 2.0 + 0.5)
34 | height = np.floor(height / 2.0 + 0.5)
35 |
36 | width = np.floor(image_width * max(cfg.TRAIN.SCALES) / 2.0 + 0.5)
37 | width = np.floor(width / 2.0 + 0.5)
38 | width = np.floor(width / 2.0 + 0.5)
39 | width = np.floor(width / 2.0 + 0.5)
40 | else:
41 |     assert False, 'The network architecture is not supported in utils.get_boxes_grid!'
42 |
43 | # compute the grid box centers
44 | h = np.arange(height)
45 | w = np.arange(width)
46 | y, x = np.meshgrid(h, w, indexing='ij')
47 | centers = np.dstack((x, y))
48 | centers = np.reshape(centers, (-1, 2))
49 | num = centers.shape[0]
50 |
51 | # compute width and height of grid box
52 | area = cfg.TRAIN.KERNEL_SIZE * cfg.TRAIN.KERNEL_SIZE
53 | aspect = cfg.TRAIN.ASPECTS # height / width
54 | num_aspect = len(aspect)
55 | widths = np.zeros((1, num_aspect), dtype=np.float32)
56 | heights = np.zeros((1, num_aspect), dtype=np.float32)
57 | for i in range(num_aspect):
58 | widths[0, i] = math.sqrt(area / aspect[i])
59 | heights[0, i] = widths[0, i] * aspect[i]
60 |
61 | # construct grid boxes
62 | centers = np.repeat(centers, num_aspect, axis=0)
63 | widths = np.tile(widths, num).transpose()
64 | heights = np.tile(heights, num).transpose()
65 |
66 | x1 = np.reshape(centers[:, 0], (-1, 1)) - widths * 0.5
67 | x2 = np.reshape(centers[:, 0], (-1, 1)) + widths * 0.5
68 | y1 = np.reshape(centers[:, 1], (-1, 1)) - heights * 0.5
69 | y2 = np.reshape(centers[:, 1], (-1, 1)) + heights * 0.5
70 |
71 | boxes_grid = np.hstack((x1, y1, x2, y2)) / cfg.TRAIN.SPATIAL_SCALE
72 |
73 | return boxes_grid, centers[:, 0], centers[:, 1]
74 |
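The per-aspect box dimensions above solve w * h = area under the constraint h / w = aspect; a quick standalone check with an assumed kernel size (the real value comes from cfg.TRAIN.KERNEL_SIZE):

    import math

    area = 4.0 * 4.0                # KERNEL_SIZE ** 2, with KERNEL_SIZE assumed to be 4
    for aspect in (0.5, 1.0, 2.0):  # aspect = height / width
        w = math.sqrt(area / aspect)
        h = w * aspect
        print(aspect, w, h, w * h)  # w * h recovers the same area for every aspect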
--------------------------------------------------------------------------------
/lib/utils/boxes_grid.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/utils/boxes_grid.pyc
--------------------------------------------------------------------------------
/lib/utils/cython_bbox.so:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/utils/cython_bbox.so
--------------------------------------------------------------------------------
/lib/utils/cython_nms.so:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/utils/cython_nms.so
--------------------------------------------------------------------------------
/lib/utils/nms.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | import numpy as np
9 |
10 | def nms(dets, thresh):
11 | x1 = dets[:, 0]
12 | y1 = dets[:, 1]
13 | x2 = dets[:, 2]
14 | y2 = dets[:, 3]
15 | scores = dets[:, 4]
16 |
17 | areas = (x2 - x1 + 1) * (y2 - y1 + 1)
18 | order = scores.argsort()[::-1]
19 |
20 | keep = []
21 | while order.size > 0:
22 | i = order[0]
23 | keep.append(i)
24 | xx1 = np.maximum(x1[i], x1[order[1:]])
25 | yy1 = np.maximum(y1[i], y1[order[1:]])
26 | xx2 = np.minimum(x2[i], x2[order[1:]])
27 | yy2 = np.minimum(y2[i], y2[order[1:]])
28 |
29 | w = np.maximum(0.0, xx2 - xx1 + 1)
30 | h = np.maximum(0.0, yy2 - yy1 + 1)
31 | inter = w * h
32 | ovr = inter / (areas[i] + areas[order[1:]] - inter)
33 |
34 | inds = np.where(ovr <= thresh)[0]
35 | order = order[inds + 1]
36 |
37 | return keep
38 |
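A small standalone check of the greedy suppression above (boxes and threshold are illustrative):

    import numpy as np
    from utils.nms import nms

    # [x1, y1, x2, y2, score]: two heavily overlapping boxes plus one isolated box
    dets = np.array([[ 10,  10,  50,  50, 0.9],
                     [ 12,  12,  52,  52, 0.8],
                     [100, 100, 150, 150, 0.7]], dtype=np.float32)
    print(nms(dets, thresh=0.3))  # [0, 2]: the lower-scoring overlap is suppressed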
--------------------------------------------------------------------------------
/lib/utils/nms.pyx:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | import numpy as np
9 | cimport numpy as np
10 |
11 | cdef inline np.float32_t max(np.float32_t a, np.float32_t b):
12 | return a if a >= b else b
13 |
14 | cdef inline np.float32_t min(np.float32_t a, np.float32_t b):
15 | return a if a <= b else b
16 |
17 | def nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh):
18 | cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0]
19 | cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1]
20 | cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2]
21 | cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3]
22 | cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4]
23 |
24 | cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1)
25 | cdef np.ndarray[np.int_t, ndim=1] order = scores.argsort()[::-1]
26 |
27 | cdef int ndets = dets.shape[0]
28 | cdef np.ndarray[np.int_t, ndim=1] suppressed = \
29 | np.zeros((ndets), dtype=np.int)
30 |
31 | # nominal indices
32 | cdef int _i, _j
33 | # sorted indices
34 | cdef int i, j
35 | # temp variables for box i's (the box currently under consideration)
36 | cdef np.float32_t ix1, iy1, ix2, iy2, iarea
37 | # variables for computing overlap with box j (lower scoring box)
38 | cdef np.float32_t xx1, yy1, xx2, yy2
39 | cdef np.float32_t w, h
40 | cdef np.float32_t inter, ovr
41 |
42 | keep = []
43 | for _i in range(ndets):
44 | i = order[_i]
45 | if suppressed[i] == 1:
46 | continue
47 | keep.append(i)
48 | ix1 = x1[i]
49 | iy1 = y1[i]
50 | ix2 = x2[i]
51 | iy2 = y2[i]
52 | iarea = areas[i]
53 | for _j in range(_i + 1, ndets):
54 | j = order[_j]
55 | if suppressed[j] == 1:
56 | continue
57 | xx1 = max(ix1, x1[j])
58 | yy1 = max(iy1, y1[j])
59 | xx2 = min(ix2, x2[j])
60 | yy2 = min(iy2, y2[j])
61 | w = max(0.0, xx2 - xx1 + 1)
62 | h = max(0.0, yy2 - yy1 + 1)
63 | inter = w * h
64 | ovr = inter / (iarea + areas[j] - inter)
65 | if ovr >= thresh:
66 | suppressed[j] = 1
67 |
68 | return keep
69 |
70 | def nms_new(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh):
71 | cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0]
72 | cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1]
73 | cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2]
74 | cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3]
75 | cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4]
76 |
77 | cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1)
78 | cdef np.ndarray[np.int_t, ndim=1] order = scores.argsort()[::-1]
79 |
80 | cdef int ndets = dets.shape[0]
81 | cdef np.ndarray[np.int_t, ndim=1] suppressed = \
82 | np.zeros((ndets), dtype=np.int)
83 |
84 | # nominal indices
85 | cdef int _i, _j
86 | # sorted indices
87 | cdef int i, j
88 | # temp variables for box i's (the box currently under consideration)
89 | cdef np.float32_t ix1, iy1, ix2, iy2, iarea
90 | # variables for computing overlap with box j (lower scoring box)
91 | cdef np.float32_t xx1, yy1, xx2, yy2
92 | cdef np.float32_t w, h
93 | cdef np.float32_t inter, ovr
94 |
95 | keep = []
96 | for _i in range(ndets):
97 | i = order[_i]
98 | if suppressed[i] == 1:
99 | continue
100 | keep.append(i)
101 | ix1 = x1[i]
102 | iy1 = y1[i]
103 | ix2 = x2[i]
104 | iy2 = y2[i]
105 | iarea = areas[i]
106 | for _j in range(_i + 1, ndets):
107 | j = order[_j]
108 | if suppressed[j] == 1:
109 | continue
110 | xx1 = max(ix1, x1[j])
111 | yy1 = max(iy1, y1[j])
112 | xx2 = min(ix2, x2[j])
113 | yy2 = min(iy2, y2[j])
114 | w = max(0.0, xx2 - xx1 + 1)
115 | h = max(0.0, yy2 - yy1 + 1)
116 | inter = w * h
117 | ovr = inter / (iarea + areas[j] - inter)
118 | ovr1 = inter / iarea
119 | ovr2 = inter / areas[j]
120 | if ovr >= thresh or ovr1 > 0.95 or ovr2 > 0.95:
121 | suppressed[j] = 1
122 |
123 | return keep
124 |
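nms_new extends plain IoU suppression with two containment ratios, so a box that lies almost entirely inside a kept box is removed even when the IoU is small; an illustrative arithmetic check (box sizes assumed, same +1 area convention as above):

    area_big, area_small = 101.0 * 101.0, 11.0 * 11.0  # a 0..100 box and a 40..50 box
    inter = 11.0 * 11.0                   # the small box is entirely contained
    iou = inter / (area_big + area_small - inter)
    print(iou)                            # ~0.012, far below a typical 0.3 threshold
    print(inter / area_small)             # 1.0 > 0.95, so nms_new still suppresses it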
--------------------------------------------------------------------------------
/lib/utils/timer.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Fast R-CNN
3 | # Copyright (c) 2015 Microsoft
4 | # Licensed under The MIT License [see LICENSE for details]
5 | # Written by Ross Girshick
6 | # --------------------------------------------------------
7 |
8 | import time
9 |
10 | class Timer(object):
11 | """A simple timer."""
12 | def __init__(self):
13 | self.total_time = 0.
14 | self.calls = 0
15 | self.start_time = 0.
16 | self.diff = 0.
17 | self.average_time = 0.
18 |
19 | def tic(self):
20 |     # using time.time instead of time.clock because time.clock
21 |     # does not normalize for multithreading
22 | self.start_time = time.time()
23 |
24 | def toc(self, average=True):
25 | self.diff = time.time() - self.start_time
26 | self.total_time += self.diff
27 | self.calls += 1
28 | self.average_time = self.total_time / self.calls
29 | if average:
30 | return self.average_time
31 | else:
32 | return self.diff
33 |
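A minimal usage sketch (the sleep is a stand-in for im_detect or any other timed call):

    import time
    from utils.timer import Timer

    t = Timer()
    for _ in range(3):
        t.tic()
        time.sleep(0.01)
        t.toc()       # with average=True (the default) returns the running average
    print(t.calls, t.total_time, t.average_time)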
--------------------------------------------------------------------------------
/lib/utils/timer.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/lib/utils/timer.pyc
--------------------------------------------------------------------------------
/tools/_init_paths.py:
--------------------------------------------------------------------------------
1 | import os.path as osp
2 | import sys
3 |
4 | def add_path(path):
5 | if path not in sys.path:
6 | sys.path.insert(0, path)
7 |
8 | this_dir = osp.dirname(__file__)
9 |
10 | # Add lib to PYTHONPATH
11 | lib_path = osp.join(this_dir, '..', 'lib')
12 | add_path(lib_path)
13 |
14 | coco_path = osp.join(this_dir, '..', 'data', 'coco', 'PythonAPI')
15 | add_path(coco_path)
16 |
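The scripts under tools/ import this module before anything from lib/, which is what makes the repository's packages importable without installation; the pattern, as used by the demo and training scripts below:

    import _init_paths            # prepends ../lib (and the COCO PythonAPI) to sys.path
    from model.config import cfg  # resolvable only after the path insertion above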
--------------------------------------------------------------------------------
/tools/_init_paths.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ivalab/grasp_multiObject_multiGrasp/806ad3d71c2f413a74294fe75fe26ba4f32c8813/tools/_init_paths.pyc
--------------------------------------------------------------------------------
/tools/demo.py~:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | # --------------------------------------------------------
4 | # Tensorflow Faster R-CNN
5 | # Licensed under The MIT License [see LICENSE for details]
6 | # Written by Xinlei Chen, based on code from Ross Girshick
7 | # --------------------------------------------------------
8 |
9 | """
10 | Demo script showing detections in sample images.
11 |
12 | See README.md for installation instructions before running.
13 | """
14 | from __future__ import absolute_import
15 | from __future__ import division
16 | from __future__ import print_function
17 |
18 | import _init_paths
19 | from model.config import cfg
20 | from model.test import im_detect
21 | from model.nms_wrapper import nms
22 |
23 | from utils.timer import Timer
24 | import tensorflow as tf
25 | import matplotlib.pyplot as plt
26 | import numpy as np
27 | import os, cv2
28 | import argparse
29 |
30 | from nets.vgg16 import vgg16
31 | from nets.resnet_v1 import resnetv1
32 |
33 | CLASSES = ('__background__',
34 | 'aeroplane', 'bicycle', 'bird', 'boat',
35 | 'bottle', 'bus', 'car', 'cat', 'chair',
36 | 'cow', 'diningtable', 'dog', 'horse',
37 | 'motorbike', 'person', 'pottedplant',
38 | 'sheep', 'sofa', 'train', 'tvmonitor')
39 |
40 | NETS = {'vgg16': ('vgg16_faster_rcnn_iter_70000.ckpt',),'res101': ('res101_faster_rcnn_iter_110000.ckpt',)}
41 | DATASETS= {'pascal_voc': ('voc_2007_trainval',),'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',)}
42 |
43 | def vis_detections(im, class_name, dets, thresh=0.5):
44 | """Draw detected bounding boxes."""
45 | inds = np.where(dets[:, -1] >= thresh)[0]
46 | if len(inds) == 0:
47 | return
48 |
49 | im = im[:, :, (2, 1, 0)]
50 | fig, ax = plt.subplots(figsize=(12, 12))
51 | ax.imshow(im, aspect='equal')
52 | for i in inds:
53 | bbox = dets[i, :4]
54 | score = dets[i, -1]
55 |
56 | ax.add_patch(
57 | plt.Rectangle((bbox[0], bbox[1]),
58 | bbox[2] - bbox[0],
59 | bbox[3] - bbox[1], fill=False,
60 | edgecolor='red', linewidth=3.5)
61 | )
62 | ax.text(bbox[0], bbox[1] - 2,
63 | '{:s} {:.3f}'.format(class_name, score),
64 | bbox=dict(facecolor='blue', alpha=0.5),
65 | fontsize=14, color='white')
66 |
67 | ax.set_title(('{} detections with '
68 | 'p({} | box) >= {:.1f}').format(class_name, class_name,
69 | thresh),
70 | fontsize=14)
71 | plt.axis('off')
72 | plt.tight_layout()
73 | plt.draw()
74 |
75 | def demo(sess, net, image_name):
76 | """Detect object classes in an image using pre-computed object proposals."""
77 |
78 | # Load the demo image
79 | im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
80 | im = cv2.imread(im_file)
81 |
82 | # Detect all object classes and regress object bounds
83 | timer = Timer()
84 | timer.tic()
85 | scores, boxes = im_detect(sess, net, im)
86 | timer.toc()
87 | print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
88 |
89 | # Visualize detections for each class
90 | CONF_THRESH = 0.8
91 | NMS_THRESH = 0.3
92 | for cls_ind, cls in enumerate(CLASSES[1:]):
93 | cls_ind += 1 # because we skipped background
94 | cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
95 | cls_scores = scores[:, cls_ind]
96 | dets = np.hstack((cls_boxes,
97 | cls_scores[:, np.newaxis])).astype(np.float32)
98 | keep = nms(dets, NMS_THRESH)
99 | dets = dets[keep, :]
100 | vis_detections(im, cls, dets, thresh=CONF_THRESH)
101 |
102 | def parse_args():
103 | """Parse input arguments."""
104 | parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
105 | parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101]',
106 | choices=NETS.keys(), default='res101')
107 | parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712]',
108 | choices=DATASETS.keys(), default='pascal_voc_0712')
109 | args = parser.parse_args()
110 |
111 | return args
112 |
113 | if __name__ == '__main__':
114 | cfg.TEST.HAS_RPN = True # Use RPN for proposals
115 | args = parse_args()
116 |
117 | # model path
118 | demonet = args.demo_net
119 | dataset = args.dataset
120 | tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default',
121 | NETS[demonet][0])
122 |
123 |
124 | if not os.path.isfile(tfmodel + '.meta'):
125 | raise IOError(('{:s} not found.\nDid you download the proper networks from '
126 | 'our server and place them properly?').format(tfmodel + '.meta'))
127 |
128 | # set config
129 | tfconfig = tf.ConfigProto(allow_soft_placement=True)
130 | tfconfig.gpu_options.allow_growth=True
131 |
132 | # init session
133 | sess = tf.Session(config=tfconfig)
134 | # load network
135 | if demonet == 'vgg16':
136 | net = vgg16(batch_size=1)
137 | elif demonet == 'res101':
138 | net = resnetv1(batch_size=1, num_layers=101)
139 | else:
140 | raise NotImplementedError
141 | net.create_architecture(sess, "TEST", 21,
142 | tag='default', anchor_scales=[8, 16, 32])
143 | saver = tf.train.Saver()
144 | saver.restore(sess, tfmodel)
145 |
146 | print('Loaded network {:s}'.format(tfmodel))
147 |
148 | im_names = ['0010.png','0012.png','0027.png','0038.png','0039.png','000456.jpg', '000542.jpg', '001150.jpg',
149 | '001763.jpg', '004545.jpg']
150 | for im_name in im_names:
151 | print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
152 | print('Demo for data/demo/{}'.format(im_name))
153 | demo(sess, net, im_name)
154 |
155 | plt.show()
156 |
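im_detect returns scores of shape (num_proposals, num_classes) and boxes of shape (num_proposals, 4 * num_classes), which is why demo() slices four columns per class; an illustrative shape check (sizes assumed):

    import numpy as np

    num_proposals, num_classes = 300, 21   # 20 PASCAL VOC classes plus background
    boxes = np.zeros((num_proposals, 4 * num_classes), dtype=np.float32)
    cls_ind = 15                           # 'person' in the CLASSES tuple above
    cls_boxes = boxes[:, 4 * cls_ind:4 * (cls_ind + 1)]
    print(cls_boxes.shape)                 # (300, 4)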
--------------------------------------------------------------------------------
/tools/demo_graspRGD.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | # --------------------------------------------------------
4 | # Tensorflow Faster R-CNN
5 | # Licensed under The MIT License [see LICENSE for details]
6 | # Written by Xinlei Chen, based on code from Ross Girshick
7 | # --------------------------------------------------------
8 |
9 | """
10 | Demo script showing detections in sample images.
11 |
12 | See README.md for installation instructions before running.
13 | """
14 | from __future__ import absolute_import
15 | from __future__ import division
16 | from __future__ import print_function
17 |
18 | import _init_paths
19 | from model.config import cfg
20 | from model.test import im_detect
21 | from model.nms_wrapper import nms
22 |
23 | from utils.timer import Timer
24 | import tensorflow as tf
25 | import matplotlib.pyplot as plt
26 | import numpy as np
27 | import os, cv2
28 | import argparse
29 |
30 | from nets.vgg16 import vgg16
31 | from nets.resnet_v1 import resnetv1
32 | import scipy
33 | from shapely.geometry import Polygon
34 |
35 | pi = scipy.pi
36 | dot = scipy.dot
37 | sin = scipy.sin
38 | cos = scipy.cos
39 | ar = scipy.array
40 |
41 | CLASSES = ('__background__',
42 | 'angle_01', 'angle_02', 'angle_03', 'angle_04', 'angle_05',
43 | 'angle_06', 'angle_07', 'angle_08', 'angle_09', 'angle_10',
44 | 'angle_11', 'angle_12', 'angle_13', 'angle_14', 'angle_15',
45 | 'angle_16', 'angle_17', 'angle_18', 'angle_19')
46 |
47 | NETS = {'vgg16': ('vgg16_faster_rcnn_iter_70000.ckpt',),'res101': ('res101_faster_rcnn_iter_110000.ckpt',),'res50': ('res50_faster_rcnn_iter_240000.ckpt',)}
48 | DATASETS= {'pascal_voc': ('voc_2007_trainval',),'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',),'grasp': ('train',)}
49 |
50 | def Rotate2D(pts,cnt,ang=scipy.pi/4):
51 |   '''Rotate points pts (n x 2) about center cnt (2,) by angle ang, in radians.'''
52 | return dot(pts-cnt,ar([[cos(ang),sin(ang)],[-sin(ang),cos(ang)]]))+cnt
53 |
54 | def vis_detections(ax, image_name, im, class_name, dets, thresh=0.5):
55 | """Draw detected bounding boxes."""
56 | inds = np.where(dets[:, -1] >= thresh)[0]
57 | if len(inds) == 0:
58 | return
59 |
60 | im = im[:, :, (2, 1, 0)]
61 | #fig, ax = plt.subplots(figsize=(12, 12))
62 | ax.imshow(im, aspect='equal')
63 | for i in inds:
64 | bbox = dets[i, :4]
65 | score = dets[i, -1]
66 |
67 | #ax.add_patch(
68 | # plt.Rectangle((bbox[0], bbox[1]),
69 | # bbox[2] - bbox[0],
70 | # bbox[3] - bbox[1], fill=False,
71 | # edgecolor='red', linewidth=3.5)
72 | # )
73 |
74 | # plot rotated rectangles
75 | pts = ar([[bbox[0],bbox[1]], [bbox[2], bbox[1]], [bbox[2], bbox[3]], [bbox[0], bbox[3]]])
76 | cnt = ar([(bbox[0] + bbox[2])/2, (bbox[1] + bbox[3])/2])
77 | angle = int(class_name[6:])
78 | r_bbox = Rotate2D(pts, cnt, -pi/2-pi/20*(angle-1))
79 | pred_label_polygon = Polygon([(r_bbox[0,0],r_bbox[0,1]), (r_bbox[1,0], r_bbox[1,1]), (r_bbox[2,0], r_bbox[2,1]), (r_bbox[3,0], r_bbox[3,1])])
80 | pred_x, pred_y = pred_label_polygon.exterior.xy
81 |
82 | plt.plot(pred_x[0:2],pred_y[0:2], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
83 | plt.plot(pred_x[1:3],pred_y[1:3], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
84 | plt.plot(pred_x[2:4],pred_y[2:4], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
85 | plt.plot(pred_x[3:5],pred_y[3:5], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
86 |
87 | #ax.text(bbox[0], bbox[1] - 2,
88 | # '{:s} {:.3f}'.format(class_name, score),
89 | # bbox=dict(facecolor='blue', alpha=0.5),
90 | # fontsize=14, color='white')
91 |
92 | #ax.set_title(('{} detections with '
93 | # 'p({} | box) >= {:.1f}').format(class_name, class_name,
94 | # thresh),
95 | # fontsize=14)
96 | #plt.axis('off')
97 | #plt.tight_layout()
98 |
99 | #save result
100 | #savepath = './data/demo/results/' + str(image_name) + str(class_name) + '.png'
101 | #plt.savefig(savepath)
102 |
103 | #plt.draw()
104 |
105 | def demo(sess, net, image_name):
106 | """Detect object classes in an image using pre-computed object proposals."""
107 |
108 | # Load the demo image
109 | im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
110 | im = cv2.imread(im_file)
111 | #print(im)
112 |
113 | # Detect all object classes and regress object bounds
114 | timer = Timer()
115 | timer.tic()
116 | scores, boxes = im_detect(sess, net, im)
117 |
118 | #scores_max = scores[:,1:-1].max(axis=1)
119 | #scores_max_idx = np.argmax(scores_max)
120 | #scores = scores[scores_max_idx:scores_max_idx+1,:]
121 | #boxes = boxes[scores_max_idx:scores_max_idx+1, :]
122 |
123 | #im = cv2.imread('/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps_ivalab/rgb_cropped320/rgb_0076Cropped320.png')
124 | timer.toc()
125 | print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
126 |
127 | fig, ax = plt.subplots(figsize=(12, 12))
128 | # Visualize detections for each class
129 | CONF_THRESH = 0.1
130 | NMS_THRESH = 0.3
131 | for cls_ind, cls in enumerate(CLASSES[1:]):
132 | cls_ind += 1 # because we skipped background
133 | cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
134 | cls_scores = scores[:, cls_ind]
135 | dets = np.hstack((cls_boxes,
136 | cls_scores[:, np.newaxis])).astype(np.float32)
137 | keep = nms(dets, NMS_THRESH)
138 | dets = dets[keep, :]
139 | vis_detections(ax, image_name, im, cls, dets, thresh=CONF_THRESH)
140 | #tmp = max(cls_scores)
141 |
142 | plt.axis('off')
143 | plt.tight_layout()
144 |
145 | #cv2.imshow('deepGrasp_top_score', im)
146 | #choice = cv2.waitKey(100)
147 |
148 | #save result
149 | savepath = './data/demo/results_all_cls/' + str(image_name) + '.png'
150 | plt.savefig(savepath)
151 |
152 | plt.draw()
153 |
154 | def parse_args():
155 | """Parse input arguments."""
156 | parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
157 |   parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101 res50]',
158 |                       choices=NETS.keys(), default='res101')
159 |   parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712 grasp]',
160 |                       choices=DATASETS.keys(), default='pascal_voc_0712')
161 | args = parser.parse_args()
162 |
163 | return args
164 |
165 | if __name__ == '__main__':
166 | cfg.TEST.HAS_RPN = True # Use RPN for proposals
167 | args = parse_args()
168 |
169 | # model path
170 | demonet = args.demo_net
171 | dataset = args.dataset
172 | tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default',
173 | NETS[demonet][0])
174 |
175 |
176 | if not os.path.isfile(tfmodel + '.meta'):
177 | raise IOError(('{:s} not found.\nDid you download the proper networks from '
178 | 'our server and place them properly?').format(tfmodel + '.meta'))
179 |
180 | # set config
181 | tfconfig = tf.ConfigProto(allow_soft_placement=True)
182 | tfconfig.gpu_options.allow_growth=True
183 |
184 | # init session
185 | sess = tf.Session(config=tfconfig)
186 | # load network
187 | if demonet == 'vgg16':
188 | net = vgg16(batch_size=1)
189 | elif demonet == 'res101':
190 | net = resnetv1(batch_size=1, num_layers=101)
191 | elif demonet == 'res50':
192 | net = resnetv1(batch_size=1, num_layers=50)
193 | else:
194 | raise NotImplementedError
195 | net.create_architecture(sess, "TEST", 20,
196 | tag='default', anchor_scales=[8, 16, 32])
197 | saver = tf.train.Saver()
198 | saver.restore(sess, tfmodel)
199 |
200 | print('Loaded network {:s}'.format(tfmodel))
201 |
202 | #im_names = ['rgd_0076Cropped320.png','rgd_0095.png','pcd0122r_rgd_preprocessed_1.png','pcd0875r_rgd_preprocessed_1.png','resized_0875_2.png']
203 | im_names = ['pcd0100r_rgd_preprocessed_1.png','pcd0266r_rgd_preprocessed_1.png','pcd0882r_rgd_preprocessed_1.png','rgd_0000Cropped320.png']
204 | for im_name in im_names:
205 | print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
206 | print('Demo for data/demo/{}'.format(im_name))
207 | demo(sess, net, im_name)
208 |
209 | plt.show()
210 |
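The class name encodes the grasp orientation: angle_k is drawn by rotating the axis-aligned box by -pi/2 - pi/20 * (k - 1) radians, so the 19 classes quantize orientation in 9-degree steps; a quick standalone check:

    import numpy as np

    for k in (1, 10, 19):
        theta = -np.pi / 2 - np.pi / 20 * (k - 1)
        print(k, np.degrees(theta))  # -90.0, -171.0, -252.0: 9-degree increments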
--------------------------------------------------------------------------------
/tools/demo_graspRGD.py~:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | # --------------------------------------------------------
4 | # Tensorflow Faster R-CNN
5 | # Licensed under The MIT License [see LICENSE for details]
6 | # Written by Xinlei Chen, based on code from Ross Girshick
7 | # --------------------------------------------------------
8 |
9 | """
10 | Demo script showing detections in sample images.
11 |
12 | See README.md for installation instructions before running.
13 | """
14 | from __future__ import absolute_import
15 | from __future__ import division
16 | from __future__ import print_function
17 |
18 | import _init_paths
19 | from model.config import cfg
20 | from model.test import im_detect
21 | from model.nms_wrapper import nms
22 |
23 | from utils.timer import Timer
24 | import tensorflow as tf
25 | import matplotlib.pyplot as plt
26 | import numpy as np
27 | import os, cv2
28 | import argparse
29 |
30 | from nets.vgg16 import vgg16
31 | from nets.resnet_v1 import resnetv1
32 | import scipy
33 | from shapely.geometry import Polygon
34 |
35 | pi = scipy.pi
36 | dot = scipy.dot
37 | sin = scipy.sin
38 | cos = scipy.cos
39 | ar = scipy.array
40 |
41 | CLASSES = ('__background__',
42 | 'angle_01', 'angle_02', 'angle_03', 'angle_04', 'angle_05',
43 | 'angle_06', 'angle_07', 'angle_08', 'angle_09', 'angle_10',
44 | 'angle_11', 'angle_12', 'angle_13', 'angle_14', 'angle_15',
45 | 'angle_16', 'angle_17', 'angle_18', 'angle_19')
46 |
47 | NETS = {'vgg16': ('vgg16_faster_rcnn_iter_70000.ckpt',),'res101': ('res101_faster_rcnn_iter_110000.ckpt',),'res50': ('res50_faster_rcnn_iter_240000.ckpt',)}
48 | DATASETS= {'pascal_voc': ('voc_2007_trainval',),'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',),'grasp': ('train',)}
49 |
50 | def Rotate2D(pts,cnt,ang=scipy.pi/4):
51 |   '''Rotate points pts (n x 2) about center cnt (2,) by angle ang, in radians.'''
52 | return dot(pts-cnt,ar([[cos(ang),sin(ang)],[-sin(ang),cos(ang)]]))+cnt
53 |
54 | def vis_detections(ax, image_name, im, class_name, dets, thresh=0.5):
55 | """Draw detected bounding boxes."""
56 | inds = np.where(dets[:, -1] >= thresh)[0]
57 | if len(inds) == 0:
58 | return
59 |
60 | im = im[:, :, (2, 1, 0)]
61 | #fig, ax = plt.subplots(figsize=(12, 12))
62 | ax.imshow(im, aspect='equal')
63 | for i in inds:
64 | bbox = dets[i, :4]
65 | score = dets[i, -1]
66 |
67 | #ax.add_patch(
68 | # plt.Rectangle((bbox[0], bbox[1]),
69 | # bbox[2] - bbox[0],
70 | # bbox[3] - bbox[1], fill=False,
71 | # edgecolor='red', linewidth=3.5)
72 | # )
73 |
74 | # plot rotated rectangles
75 | pts = ar([[bbox[0],bbox[1]], [bbox[2], bbox[1]], [bbox[2], bbox[3]], [bbox[0], bbox[3]]])
76 | cnt = ar([(bbox[0] + bbox[2])/2, (bbox[1] + bbox[3])/2])
77 | angle = int(class_name[6:])
78 | r_bbox = Rotate2D(pts, cnt, -pi/2-pi/20*(angle-1))
79 | pred_label_polygon = Polygon([(r_bbox[0,0],r_bbox[0,1]), (r_bbox[1,0], r_bbox[1,1]), (r_bbox[2,0], r_bbox[2,1]), (r_bbox[3,0], r_bbox[3,1])])
80 | pred_x, pred_y = pred_label_polygon.exterior.xy
81 |
82 | plt.plot(pred_x[0:2],pred_y[0:2], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
83 | plt.plot(pred_x[1:3],pred_y[1:3], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
84 | plt.plot(pred_x[2:4],pred_y[2:4], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
85 | plt.plot(pred_x[3:5],pred_y[3:5], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
86 |
87 | #ax.text(bbox[0], bbox[1] - 2,
88 | # '{:s} {:.3f}'.format(class_name, score),
89 | # bbox=dict(facecolor='blue', alpha=0.5),
90 | # fontsize=14, color='white')
91 |
92 | #ax.set_title(('{} detections with '
93 | # 'p({} | box) >= {:.1f}').format(class_name, class_name,
94 | # thresh),
95 | # fontsize=14)
96 | #plt.axis('off')
97 | #plt.tight_layout()
98 |
99 | #save result
100 | #savepath = './data/demo/results/' + str(image_name) + str(class_name) + '.png'
101 | #plt.savefig(savepath)
102 |
103 | #plt.draw()
104 |
105 | def demo(sess, net, image_name):
106 | """Detect object classes in an image using pre-computed object proposals."""
107 |
108 | # Load the demo image
109 | im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
110 | im = cv2.imread(im_file)
111 | #print(im)
112 |
113 | # Detect all object classes and regress object bounds
114 | timer = Timer()
115 | timer.tic()
116 | scores, boxes = im_detect(sess, net, im)
117 |
118 | #scores_max = scores[:,1:-1].max(axis=1)
119 | #scores_max_idx = np.argmax(scores_max)
120 | #scores = scores[scores_max_idx:scores_max_idx+1,:]
121 | #boxes = boxes[scores_max_idx:scores_max_idx+1, :]
122 |
123 | #im = cv2.imread('/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps_ivalab/rgb_cropped320/rgb_0076Cropped320.png')
124 | timer.toc()
125 | print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
126 |
127 | fig, ax = plt.subplots(figsize=(12, 12))
128 | # Visualize detections for each class
129 | CONF_THRESH = 0.1
130 | NMS_THRESH = 0.3
131 | for cls_ind, cls in enumerate(CLASSES[1:]):
132 | cls_ind += 1 # because we skipped background
133 | cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
134 | cls_scores = scores[:, cls_ind]
135 | dets = np.hstack((cls_boxes,
136 | cls_scores[:, np.newaxis])).astype(np.float32)
137 | keep = nms(dets, NMS_THRESH)
138 | dets = dets[keep, :]
139 | vis_detections(ax, image_name, im, cls, dets, thresh=CONF_THRESH)
140 | #tmp = max(cls_scores)
141 |
142 | plt.axis('off')
143 | plt.tight_layout()
144 |
145 | #cv2.imshow('deepGrasp_top_score', im)
146 | #choice = cv2.waitKey(100)
147 |
148 | #save result
149 | savepath = './data/demo/results_all_cls/' + str(image_name) + '.png'
150 | plt.savefig(savepath)
151 |
152 | plt.draw()
153 |
154 | def parse_args():
155 | """Parse input arguments."""
156 | parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
157 |   parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101 res50]',
158 |                       choices=NETS.keys(), default='res101')
159 |   parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712 grasp]',
160 |                       choices=DATASETS.keys(), default='pascal_voc_0712')
161 | args = parser.parse_args()
162 |
163 | return args
164 |
165 | if __name__ == '__main__':
166 | cfg.TEST.HAS_RPN = True # Use RPN for proposals
167 | args = parse_args()
168 |
169 | # model path
170 | demonet = args.demo_net
171 | dataset = args.dataset
172 | tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default',
173 | NETS[demonet][0])
174 |
175 |
176 | if not os.path.isfile(tfmodel + '.meta'):
177 | raise IOError(('{:s} not found.\nDid you download the proper networks from '
178 | 'our server and place them properly?').format(tfmodel + '.meta'))
179 |
180 | # set config
181 | tfconfig = tf.ConfigProto(allow_soft_placement=True)
182 | tfconfig.gpu_options.allow_growth=True
183 |
184 | # init session
185 | sess = tf.Session(config=tfconfig)
186 | # load network
187 | if demonet == 'vgg16':
188 | net = vgg16(batch_size=1)
189 | elif demonet == 'res101':
190 | net = resnetv1(batch_size=1, num_layers=101)
191 | elif demonet == 'res50':
192 | net = resnetv1(batch_size=1, num_layers=50)
193 | else:
194 | raise NotImplementedError
195 | net.create_architecture(sess, "TEST", 20,
196 | tag='default', anchor_scales=[8, 16, 32])
197 | saver = tf.train.Saver()
198 | saver.restore(sess, tfmodel)
199 |
200 | print('Loaded network {:s}'.format(tfmodel))
201 |
202 | #im_names = ['rgd_0076Cropped320.png','rgd_0095.png','pcd0122r_rgd_preprocessed_1.png','pcd0875r_rgd_preprocessed_1.png','resized_0875_2.png']
203 | im_names = ['0010.png','0012.png','0027.png','0038.png','0039.png','pcd0875r_rgd_preprocessed_1.png','pic_0010.png']
204 | for im_name in im_names:
205 | print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
206 | print('Demo for data/demo/{}'.format(im_name))
207 | demo(sess, net, im_name)
208 |
209 | plt.show()
210 |
--------------------------------------------------------------------------------
/tools/demo_graspRGD_vis_mask.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | # --------------------------------------------------------
4 | # Tensorflow Faster R-CNN
5 | # Licensed under The MIT License [see LICENSE for details]
6 | # Written by Xinlei Chen, based on code from Ross Girshick
7 | # --------------------------------------------------------
8 |
9 | """
10 | Demo script showing detections in sample images.
11 |
12 | See README.md for installation instructions before running.
13 | """
14 | from __future__ import absolute_import
15 | from __future__ import division
16 | from __future__ import print_function
17 |
18 | import _init_paths
19 | from model.config import cfg
20 | from model.test import im_detect
21 | from model.nms_wrapper import nms
22 |
23 | from utils.timer import Timer
24 | import tensorflow as tf
25 | import matplotlib.pyplot as plt
26 | import numpy as np
27 | import os, cv2
28 | import argparse
29 |
30 | from nets.vgg16 import vgg16
31 | from nets.resnet_v1 import resnetv1
32 | import scipy
33 | from shapely.geometry import Polygon
34 |
35 | pi = scipy.pi
36 | dot = scipy.dot
37 | sin = scipy.sin
38 | cos = scipy.cos
39 | ar = scipy.array
40 |
41 | CLASSES = ('__background__',
42 | 'angle_01', 'angle_02', 'angle_03', 'angle_04', 'angle_05',
43 | 'angle_06', 'angle_07', 'angle_08', 'angle_09', 'angle_10',
44 | 'angle_11', 'angle_12', 'angle_13', 'angle_14', 'angle_15',
45 | 'angle_16', 'angle_17', 'angle_18', 'angle_19')
46 |
47 | NETS = {'vgg16': ('vgg16_faster_rcnn_iter_70000.ckpt',),'res101': ('res101_faster_rcnn_iter_110000.ckpt',),'res50': ('res50_faster_rcnn_iter_240000.ckpt',)}
48 | DATASETS= {'pascal_voc': ('voc_2007_trainval',),'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',),'grasp': ('train',)}
49 |
50 | def Rotate2D(pts,cnt,ang=scipy.pi/4):
51 |   '''Rotate points pts (n x 2) about center cnt (2,) by angle ang, in radians.'''
52 | return dot(pts-cnt,ar([[cos(ang),sin(ang)],[-sin(ang),cos(ang)]]))+cnt
53 |
54 | def vis_detections(ax, image_name, im, class_name, dets, thresh=0.5):
55 | """Draw detected bounding boxes."""
56 | inds = np.where(dets[:, -1] >= thresh)[0]
57 | if len(inds) == 0:
58 | return
59 |
60 | im = im[:, :, (2, 1, 0)]
61 | #fig, ax = plt.subplots(figsize=(12, 12))
62 | ax.imshow(im, aspect='equal')
63 |
64 | for i in inds:
65 | bbox = dets[i, :4]
66 | score = dets[i, -1]
67 |
68 | #ax.add_patch(
69 | # plt.Rectangle((bbox[0], bbox[1]),
70 | # bbox[2] - bbox[0],
71 | # bbox[3] - bbox[1], fill=False,
72 | # edgecolor='red', linewidth=3.5)
73 | # )
74 |
75 | # plot rotated rectangles
76 | pts = ar([[bbox[0],bbox[1]], [bbox[2], bbox[1]], [bbox[2], bbox[3]], [bbox[0], bbox[3]]])
77 | cnt = ar([(bbox[0] + bbox[2])/2, (bbox[1] + bbox[3])/2])
78 | angle = int(class_name[6:])
79 | r_bbox = Rotate2D(pts, cnt, -pi/2-pi/20*(angle-1))
80 | pred_label_polygon = Polygon([(r_bbox[0,0],r_bbox[0,1]), (r_bbox[1,0], r_bbox[1,1]), (r_bbox[2,0], r_bbox[2,1]), (r_bbox[3,0], r_bbox[3,1])])
81 | pred_x, pred_y = pred_label_polygon.exterior.xy
82 |
83 | plt.plot(pred_x[0:2],pred_y[0:2], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
84 | plt.plot(pred_x[1:3],pred_y[1:3], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
85 | plt.plot(pred_x[2:4],pred_y[2:4], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
86 | plt.plot(pred_x[3:5],pred_y[3:5], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
87 |
88 | #ax.text(bbox[0], bbox[1] - 2,
89 | # '{:s} {:.3f}'.format(class_name, score),
90 | # bbox=dict(facecolor='blue', alpha=0.5),
91 | # fontsize=14, color='white')
92 |
93 | #ax.set_title(('{} detections with '
94 | # 'p({} | box) >= {:.1f}').format(class_name, class_name,
95 | # thresh),
96 | # fontsize=14)
97 | #plt.axis('off')
98 | #plt.tight_layout()
99 |
100 | #save result
101 | #savepath = './data/demo/results/' + str(image_name) + str(class_name) + '.png'
102 | #plt.savefig(savepath)
103 |
104 | #plt.draw()
105 |
106 | def demo(sess, net, image_name, mask_name):
107 | """Detect object classes in an image using pre-computed object proposals."""
108 |
109 | # Load the demo image
110 | im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
111 | mask_file = os.path.join(cfg.DATA_DIR, 'demo', mask_name)
112 | im = cv2.imread(im_file)
113 | mask = cv2.imread(mask_file,0)
114 | #im = im*np.stack([mask,mask,mask], axis=2)
115 | im = cv2.bitwise_and(im,im,mask = mask)
116 |
117 |
118 | # Detect all object classes and regress object bounds
119 | timer = Timer()
120 | timer.tic()
121 | scores, boxes = im_detect(sess, net, im)
122 |
123 | #scores_max = scores[:,1:-1].max(axis=1)
124 | #scores_max_idx = np.argmax(scores_max)
125 | #scores = scores[scores_max_idx:scores_max_idx+1,:]
126 | #boxes = boxes[scores_max_idx:scores_max_idx+1, :]
127 |
128 | #im = cv2.imread('/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps_ivalab/rgb_cropped320/rgb_0076Cropped320.png')
129 | timer.toc()
130 | print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
131 |
132 | fig, ax = plt.subplots(figsize=(12, 12))
133 | # Visualize detections for each class
134 | CONF_THRESH = 0.1
135 | NMS_THRESH = 0.3
136 | for cls_ind, cls in enumerate(CLASSES[1:]):
137 | cls_ind += 1 # because we skipped background
138 | cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
139 | cls_scores = scores[:, cls_ind]
140 | dets = np.hstack((cls_boxes,
141 | cls_scores[:, np.newaxis])).astype(np.float32)
142 | keep = nms(dets, NMS_THRESH)
143 | dets = dets[keep, :]
144 | vis_detections(ax, image_name, im, cls, dets, thresh=CONF_THRESH)
145 | #tmp = max(cls_scores)
146 |
147 | plt.axis('off')
148 | plt.tight_layout()
149 |
150 | #cv2.imshow('deepGrasp_top_score', im)
151 | #choice = cv2.waitKey(100)
152 |
153 | #save result
154 | savepath = './data/demo/results_all_cls/' + str(image_name) + '.png'
155 | plt.savefig(savepath)
156 |
157 | plt.draw()
158 |
159 | def parse_args():
160 | """Parse input arguments."""
161 | parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
162 |   parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101 res50]',
163 |                       choices=NETS.keys(), default='res101')
164 |   parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712 grasp]',
165 |                       choices=DATASETS.keys(), default='pascal_voc_0712')
166 | args = parser.parse_args()
167 |
168 | return args
169 |
170 | if __name__ == '__main__':
171 | cfg.TEST.HAS_RPN = True # Use RPN for proposals
172 | args = parse_args()
173 |
174 | # model path
175 | demonet = args.demo_net
176 | dataset = args.dataset
177 | tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default',
178 | NETS[demonet][0])
179 |
180 |
181 | if not os.path.isfile(tfmodel + '.meta'):
182 | raise IOError(('{:s} not found.\nDid you download the proper networks from '
183 | 'our server and place them properly?').format(tfmodel + '.meta'))
184 |
185 | # set config
186 | tfconfig = tf.ConfigProto(allow_soft_placement=True)
187 | tfconfig.gpu_options.allow_growth=True
188 |
189 | # init session
190 | sess = tf.Session(config=tfconfig)
191 | # load network
192 | if demonet == 'vgg16':
193 | net = vgg16(batch_size=1)
194 | elif demonet == 'res101':
195 | net = resnetv1(batch_size=1, num_layers=101)
196 | elif demonet == 'res50':
197 | net = resnetv1(batch_size=1, num_layers=50)
198 | else:
199 | raise NotImplementedError
200 | net.create_architecture(sess, "TEST", 20,
201 | tag='default', anchor_scales=[8, 16, 32])
202 | saver = tf.train.Saver()
203 | saver.restore(sess, tfmodel)
204 |
205 | print('Loaded network {:s}'.format(tfmodel))
206 |
207 | #im_names = ['rgd_0076Cropped320.png','rgd_0095.png','pcd0122r_rgd_preprocessed_1.png','pcd0875r_rgd_preprocessed_1.png','resized_0875_2.png']
208 | im_names = ['rgd_0000Cropped320.png']
209 | mask_name = 'mask.jpg'
210 |
211 | for im_name in im_names:
212 | print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
213 | print('Demo for data/demo/{}'.format(im_name))
214 | demo(sess, net, im_name, mask_name)
215 |
216 | plt.show()
217 |
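The masking step zeroes every pixel where the mask is 0, restricting detections to the unmasked region; a self-contained sketch of the cv2.bitwise_and call with an assumed image and mask:

    import cv2
    import numpy as np

    im = np.full((227, 227, 3), 255, dtype=np.uint8)  # stand-in for a demo image
    mask = np.zeros((227, 227), dtype=np.uint8)
    mask[70:120, 90:150] = 255                        # nonzero marks the region kept
    masked = cv2.bitwise_and(im, im, mask=mask)
    print(masked[0, 0], masked[80, 100])              # [0 0 0] vs [255 255 255]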
--------------------------------------------------------------------------------
/tools/demo_graspRGD_vis_select.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | # --------------------------------------------------------
4 | # Tensorflow Faster R-CNN
5 | # Licensed under The MIT License [see LICENSE for details]
6 | # Written by Xinlei Chen, based on code from Ross Girshick
7 | # --------------------------------------------------------
8 |
9 | """
10 | Demo script showing detections in sample images.
11 |
12 | See README.md for installation instructions before running.
13 | """
14 | from __future__ import absolute_import
15 | from __future__ import division
16 | from __future__ import print_function
17 |
18 | import _init_paths
19 | from model.config import cfg
20 | from model.test import im_detect
21 | from model.nms_wrapper import nms
22 |
23 | from utils.timer import Timer
24 | import tensorflow as tf
25 | import matplotlib.pyplot as plt
26 | import numpy as np
27 | import os, cv2
28 | import argparse
29 |
30 | from nets.vgg16 import vgg16
31 | from nets.resnet_v1 import resnetv1
32 | import scipy
33 | from shapely.geometry import Polygon
34 |
35 | pi = scipy.pi
36 | dot = scipy.dot
37 | sin = scipy.sin
38 | cos = scipy.cos
39 | ar = scipy.array
40 |
41 | CLASSES = ('__background__',
42 | 'angle_01', 'angle_02', 'angle_03', 'angle_04', 'angle_05',
43 | 'angle_06', 'angle_07', 'angle_08', 'angle_09', 'angle_10',
44 | 'angle_11', 'angle_12', 'angle_13', 'angle_14', 'angle_15',
45 | 'angle_16', 'angle_17', 'angle_18', 'angle_19')
46 |
47 | global SELECTION_COUNT
48 | SELECTION_COUNT = 0
49 | NETS = {'vgg16': ('vgg16_faster_rcnn_iter_70000.ckpt',),'res101': ('res101_faster_rcnn_iter_110000.ckpt',),'res50': ('res50_faster_rcnn_iter_240000.ckpt',)}
50 | DATASETS= {'pascal_voc': ('voc_2007_trainval',),'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',),'grasp': ('train',)}
51 |
52 |
53 | def Rotate2D(pts,cnt,ang=scipy.pi/4):
54 |   '''Rotate points pts (n x 2) about center cnt (2,) by angle ang, in radians.'''
55 | return dot(pts-cnt,ar([[cos(ang),sin(ang)],[-sin(ang),cos(ang)]]))+cnt
56 |
57 | def vis_detections(ax, image_name, im, class_name, dets, thresh=0.5):
58 | """Draw detected bounding boxes."""
59 | inds = np.where(dets[:, -1] >= thresh)[0]
60 | if len(inds) == 0:
61 | return
62 |
63 | im = im[:, :, (2, 1, 0)]
64 | #fig, ax = plt.subplots(figsize=(12, 12))
65 | ax.imshow(im, aspect='equal')
66 | for i in inds:
67 | bbox = dets[i, :4]
68 | score = dets[i, -1]
69 |
70 | #ax.add_patch(
71 | # plt.Rectangle((bbox[0], bbox[1]),
72 | # bbox[2] - bbox[0],
73 | # bbox[3] - bbox[1], fill=False,
74 | # edgecolor='red', linewidth=3.5)
75 | # )
76 |
77 | # plot rotated rectangles
78 | pts = ar([[bbox[0],bbox[1]], [bbox[2], bbox[1]], [bbox[2], bbox[3]], [bbox[0], bbox[3]]])
79 | cnt = ar([(bbox[0] + bbox[2])/2, (bbox[1] + bbox[3])/2])
80 | angle = int(class_name[6:])
81 | r_bbox = Rotate2D(pts, cnt, -pi/2-pi/20*(angle-1))
82 | pred_label_polygon = Polygon([(r_bbox[0,0],r_bbox[0,1]), (r_bbox[1,0], r_bbox[1,1]), (r_bbox[2,0], r_bbox[2,1]), (r_bbox[3,0], r_bbox[3,1])])
83 | pred_x, pred_y = pred_label_polygon.exterior.xy
84 |
85 | plt.plot(pred_x[0:2],pred_y[0:2], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
86 | plt.plot(pred_x[1:3],pred_y[1:3], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
87 | plt.plot(pred_x[2:4],pred_y[2:4], color='k', alpha = 0.7, linewidth=1, solid_capstyle='round', zorder=2)
88 | plt.plot(pred_x[3:5],pred_y[3:5], color='r', alpha = 0.7, linewidth=3, solid_capstyle='round', zorder=2)
89 |
90 | #ax.text(bbox[0], bbox[1] - 2,
91 | # '{:s} {:.3f}'.format(class_name, score),
92 | # bbox=dict(facecolor='blue', alpha=0.5),
93 | # fontsize=14, color='white')
94 | global SELECTION_COUNT
95 | ax.text(bbox[0], bbox[1] - 2,
96 | '{:s} {:.3f}'.format(str(SELECTION_COUNT), score),
97 | bbox=dict(facecolor='blue', alpha=0.5),
98 | fontsize=14, color='white')
99 | SELECTION_COUNT = SELECTION_COUNT + 1
100 |
101 | #ax.set_title(('{} detections with '
102 | # 'p({} | box) >= {:.1f}').format(class_name, class_name,
103 | # thresh),
104 | # fontsize=14)
105 | #plt.axis('off')
106 | #plt.tight_layout()
107 |
108 | #save result
109 | #savepath = './data/demo/results/' + str(image_name) + str(class_name) + '.png'
110 | #plt.savefig(savepath)
111 |
112 | #plt.draw()
113 |
114 | def demo(sess, net, image_name):
115 | """Detect object classes in an image using pre-computed object proposals."""
116 |
117 | # Load the demo image
118 | im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
119 | im = cv2.imread(im_file)
120 | #print(im)
121 |
122 | # Detect all object classes and regress object bounds
123 | timer = Timer()
124 | timer.tic()
125 | scores, boxes = im_detect(sess, net, im)
126 |
127 | #scores_max = scores[:,1:-1].max(axis=1)
128 | #scores_max_idx = np.argmax(scores_max)
129 | #scores = scores[scores_max_idx:scores_max_idx+1,:]
130 | #boxes = boxes[scores_max_idx:scores_max_idx+1, :]
131 |
132 | #im = cv2.imread('/home/fujenchu/projects/deepLearning/tensorflow-finetune-flickr-style-master/data/grasps_ivalab/rgb_cropped320/rgb_0076Cropped320.png')
133 | timer.toc()
134 | print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
135 |
136 | fig, ax = plt.subplots(figsize=(12, 12))
137 | # Visualize detections for each class
138 | CONF_THRESH = 0.1
139 | NMS_THRESH = 0.3
140 | global SELECTION_COUNT
141 | SELECTION_COUNT = 0
142 | for cls_ind, cls in enumerate(CLASSES[1:]):
143 | cls_ind += 1 # because we skipped background
144 | cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
145 | cls_scores = scores[:, cls_ind]
146 | dets = np.hstack((cls_boxes,
147 | cls_scores[:, np.newaxis])).astype(np.float32)
148 | keep = nms(dets, NMS_THRESH)
149 | dets = dets[keep, :]
150 | vis_detections(ax, image_name, im, cls, dets, thresh=CONF_THRESH)
151 | #tmp = max(cls_scores)
152 |
153 | plt.axis('off')
154 | plt.tight_layout()
155 |
156 | #cv2.imshow('deepGrasp_top_score', im)
157 | #choice = cv2.waitKey(100)
158 |
159 | #save result
160 | savepath = './data/demo/results_all_cls/' + str(image_name) + '.png'
161 | plt.savefig(savepath)
162 |
163 | plt.draw()
164 |
165 | def parse_args():
166 | """Parse input arguments."""
167 | parser = argparse.ArgumentParser(description='Tensorflow Faster R-CNN demo')
168 |   parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16 res101 res50]',
169 |                       choices=NETS.keys(), default='res101')
170 |   parser.add_argument('--dataset', dest='dataset', help='Trained dataset [pascal_voc pascal_voc_0712 grasp]',
171 |                       choices=DATASETS.keys(), default='pascal_voc_0712')
172 | args = parser.parse_args()
173 |
174 | return args
175 |
176 | if __name__ == '__main__':
177 | cfg.TEST.HAS_RPN = True # Use RPN for proposals
178 | args = parse_args()
179 |
180 | # model path
181 | demonet = args.demo_net
182 | dataset = args.dataset
183 | tfmodel = os.path.join('output', demonet, DATASETS[dataset][0], 'default',
184 | NETS[demonet][0])
185 |
186 |
187 | if not os.path.isfile(tfmodel + '.meta'):
188 | raise IOError(('{:s} not found.\nDid you download the proper networks from '
189 | 'our server and place them properly?').format(tfmodel + '.meta'))
190 |
191 | # set config
192 | tfconfig = tf.ConfigProto(allow_soft_placement=True)
193 | tfconfig.gpu_options.allow_growth=True
194 |
195 | # init session
196 | sess = tf.Session(config=tfconfig)
197 | # load network
198 | if demonet == 'vgg16':
199 | net = vgg16(batch_size=1)
200 | elif demonet == 'res101':
201 | net = resnetv1(batch_size=1, num_layers=101)
202 | elif demonet == 'res50':
203 | net = resnetv1(batch_size=1, num_layers=50)
204 | else:
205 | raise NotImplementedError
206 | net.create_architecture(sess, "TEST", 20,
207 | tag='default', anchor_scales=[8, 16, 32])
208 | saver = tf.train.Saver()
209 | saver.restore(sess, tfmodel)
210 |
211 | print('Loaded network {:s}'.format(tfmodel))
212 |
213 | #im_names = ['rgd_0076Cropped320.png','rgd_0095.png','pcd0122r_rgd_preprocessed_1.png','pcd0875r_rgd_preprocessed_1.png','resized_0875_2.png']
214 | im_names = ['pcd0100r_rgd_preprocessed_1.png','pcd0266r_rgd_preprocessed_1.png','pcd0882r_rgd_preprocessed_1.png','rgd_0000Cropped320.png']
215 | for im_name in im_names:
216 | print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
217 | print('Demo for data/demo/{}'.format(im_name))
218 | demo(sess, net, im_name)
219 |
220 | plt.show()
221 |
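The module-level SELECTION_COUNT gives every drawn grasp, across all angle classes of one image, a running index so a candidate can be selected by number; the pattern in isolation (scores assumed):

    SELECTION_COUNT = 0

    def label_detection(score):
        global SELECTION_COUNT
        print('{:s} {:.3f}'.format(str(SELECTION_COUNT), score))
        SELECTION_COUNT = SELECTION_COUNT + 1

    for s in (0.91, 0.52, 0.13):
        label_detection(s)  # labels the three detections 0, 1, 2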
--------------------------------------------------------------------------------
/tools/mask_gen.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import scipy.misc
3 |
4 | # binary mask: 1 inside the rectangular region of interest, 0 elsewhere
5 | mask = np.zeros((227,227))
6 | mask[70:120,90:150] = 1
7 | scipy.misc.imsave('mask.jpg', mask)
8 |
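scipy.misc.imsave was deprecated in SciPy 1.0 and removed in 1.2; on newer environments an equivalent (assuming imageio is installed) is:

    import imageio
    import numpy as np

    mask = np.zeros((227, 227), dtype=np.uint8)
    mask[70:120, 90:150] = 255   # 255 so the saved JPEG works directly as a cv2 mask
    imageio.imwrite('mask.jpg', mask)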
--------------------------------------------------------------------------------
/tools/trainval_net.py:
--------------------------------------------------------------------------------
1 | # --------------------------------------------------------
2 | # Tensorflow Faster R-CNN
3 | # Licensed under The MIT License [see LICENSE for details]
4 | # Written by Zheqi He, Xinlei Chen, based on code from Ross Girshick
5 | # --------------------------------------------------------
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import _init_paths
11 | from model.train_val import get_training_roidb, train_net
12 | from model.config import cfg, cfg_from_file, cfg_from_list, get_output_dir, get_output_tb_dir
13 | from datasets.factory import get_imdb
14 | import datasets.imdb
15 | import argparse
16 | import pprint
17 | import numpy as np
18 | import sys
19 |
20 | import tensorflow as tf
21 | from nets.vgg16 import vgg16
22 | from nets.resnet_v1 import resnetv1
23 |
24 | def parse_args():
25 | """
26 | Parse input arguments
27 | """
28 | parser = argparse.ArgumentParser(description='Train a Fast R-CNN network')
29 | parser.add_argument('--cfg', dest='cfg_file',
30 | help='optional config file',
31 | default=None, type=str)
32 | parser.add_argument('--weight', dest='weight',
33 | help='initialize with pretrained model weights',
34 | type=str)
35 | parser.add_argument('--imdb', dest='imdb_name',
36 | help='dataset to train on',
37 | default='voc_2007_trainval', type=str)
38 | parser.add_argument('--imdbval', dest='imdbval_name',
39 | help='dataset to validate on',
40 | default='voc_2007_test', type=str)
41 | parser.add_argument('--iters', dest='max_iters',
42 | help='number of iterations to train',
43 | default=70000, type=int)
44 | parser.add_argument('--tag', dest='tag',
45 | help='tag of the model',
46 | default=None, type=str)
47 | parser.add_argument('--net', dest='net',
48 | help='vgg16, res50, res101, res152',
49 | default='res50', type=str)
50 | parser.add_argument('--set', dest='set_cfgs',
51 | help='set config keys', default=None,
52 | nargs=argparse.REMAINDER)
53 |
54 | if len(sys.argv) == 1:
55 | parser.print_help()
56 | sys.exit(1)
57 |
58 | args = parser.parse_args()
59 | return args
60 |
61 |
62 | def combined_roidb(imdb_names):
63 | """
64 | Combine multiple roidbs
65 | """
66 |
67 | def get_roidb(imdb_name):
68 | imdb = get_imdb(imdb_name)
69 | print('Loaded dataset `{:s}` for training'.format(imdb.name))
70 | imdb.set_proposal_method(cfg.TRAIN.PROPOSAL_METHOD)
71 | print('Set proposal method: {:s}'.format(cfg.TRAIN.PROPOSAL_METHOD))
72 | roidb = get_training_roidb(imdb)
73 | return roidb
74 |
75 | roidbs = [get_roidb(s) for s in imdb_names.split('+')]
76 | roidb = roidbs[0]
77 | if len(roidbs) > 1:
78 | for r in roidbs[1:]:
79 | roidb.extend(r)
80 | tmp = get_imdb(imdb_names.split('+')[1])
81 | imdb = datasets.imdb.imdb(imdb_names, tmp.classes)
82 | else:
83 | imdb = get_imdb(imdb_names)
84 | return imdb, roidb
85 |
86 |
87 | if __name__ == '__main__':
88 | args = parse_args()
89 |
90 | print('Called with args:')
91 | print(args)
92 |
93 | if args.cfg_file is not None:
94 | cfg_from_file(args.cfg_file)
95 | if args.set_cfgs is not None:
96 | cfg_from_list(args.set_cfgs)
97 |
98 | print('Using config:')
99 | pprint.pprint(cfg)
100 |
101 | np.random.seed(cfg.RNG_SEED)
102 |
103 | # train set
104 | imdb, roidb = combined_roidb(args.imdb_name)
105 | print('{:d} roidb entries'.format(len(roidb)))
106 |
107 | # output directory where the models are saved
108 | output_dir = get_output_dir(imdb, args.tag)
109 | print('Output will be saved to `{:s}`'.format(output_dir))
110 |
111 | # tensorboard directory where the summaries are saved during training
112 | tb_dir = get_output_tb_dir(imdb, args.tag)
113 | print('TensorFlow summaries will be saved to `{:s}`'.format(tb_dir))
114 |
115 | # also add the validation set, but with no flipping images
116 | orgflip = cfg.TRAIN.USE_FLIPPED
117 | cfg.TRAIN.USE_FLIPPED = False
118 | _, valroidb = combined_roidb(args.imdbval_name)
119 | print('{:d} validation roidb entries'.format(len(valroidb)))
120 | cfg.TRAIN.USE_FLIPPED = orgflip
121 |
122 | # load network
123 | if args.net == 'vgg16':
124 | net = vgg16(batch_size=cfg.TRAIN.IMS_PER_BATCH)
125 | elif args.net == 'res50':
126 | net = resnetv1(batch_size=cfg.TRAIN.IMS_PER_BATCH, num_layers=50)
127 | elif args.net == 'res101':
128 | net = resnetv1(batch_size=cfg.TRAIN.IMS_PER_BATCH, num_layers=101)
129 | elif args.net == 'res152':
130 | net = resnetv1(batch_size=cfg.TRAIN.IMS_PER_BATCH, num_layers=152)
131 | else:
132 | raise NotImplementedError
133 |
134 | train_net(net, imdb, roidb, valroidb, output_dir, tb_dir,
135 | pretrained_model=args.weight,
136 | max_iters=args.max_iters)
137 |
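combined_roidb joins training sets with '+': each name is resolved through the dataset factory, the proposal method is set, and the roidbs are concatenated; a sketch of the call (dataset names follow the demos' DATASETS tables and assume the corresponding data has been prepared under data/):

    imdb, roidb = combined_roidb('voc_2007_trainval+voc_2012_trainval')
    print(imdb.name, len(roidb))  # merged imdb name and total training entries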
--------------------------------------------------------------------------------