├── README.md
└── example-annotation.png


/README.md:
--------------------------------------------------------------------------------
 1 | # PFN Picking Instructions for Commodities Dataset (PFN-PIC)
 2 | This dataset is a collection of spoken language instructions for a robotic system to pick and place common objects. Text instructions and corresponding object images are provided.
 3 | 
 4 | [Download (dataset-main.zip)](https://figshare.com/articles/figure/PFN_Picking_Instructions_for_Commodities_Dataset_PFN-PIC_/23675100)
 5 | 
 6 | We consider a situation where the robot is instructed by the operator to pick up a specific object and move it to another location: for example, _Move the blue and white tissue box to the top right bin_.
 7 | 
 8 | <img src="example-annotation.png" alt="An example image with annotations" title="An example image and its annotations" width="320" height="260">
 9 | 
10 | This dataset consists of RGBD images, bounding box annotations, destination box annotations, and text instructions.
11 | 
12 | ```
13 | dataset
14 | ├── en.train.jsonl
15 | ├── en.validation.jsonl
16 | ├── ja.train.jsonl
17 | ├── ja.validation.jsonl
18 | └── image_file/
19 |     ├── 1.png
20 |     ├── 2.png
21 |     ├── ....
22 |     └── 1180.png
23 | ```
24 | 
25 | All objects in each image are annotated with bounding boxes.
26 | Each bounding box is associated with a destination box and text instructions.
27 | In addition to RGB images, depth images are available in [PCD (Point Cloud Data) file format](http://pointclouds.org/documentation/tutorials/pcd_file_format.php).  Because the PCD files are relatively large (17 GB), we provide them only upon request.  To request them, please create a [GitHub issue](https://github.com/pfnet-research/picking-instruction/issues).
28 | 
29 | The bounding box annotations, destination box annotations, and text instructions are provided in `en.train.jsonl`, `en.validation.jsonl`, `ja.train.jsonl`, and `ja.validation.jsonl`, all of which are in
30 | [JSON Lines text file format](http://jsonlines.org/).
31 | Each line of these files represents the annotations for one image.  We recommend using [jq](https://stedolan.github.io/jq/) or another JSON tool for pretty-printing.
32 | 
33 | ```
34 | $ jq -r '.' dataset/en.train.jsonl | head -30
35 | {
36 |   "image_file": "1.png",
37 |   "pcd_file": "1.pcd",
38 |   "objects": [
39 |     {
40 |       "dest_box": "tl",
41 |       "bbox": {
42 |         "x": 649.7302,
43 |         "y": 654.038,
44 |         "width": 171.1864,
45 |         "height": 235.914
46 |       },
47 |       "instructions": [
48 |         "Put the green package next to the mustard in the first box on the left with the white circle.",
49 |         "pick up the green sachet and put it in the upper left box",
50 |         "Move the green and white package with asian scripture to the top left box."
51 |       ]
52 |     },
53 | ...
54 | ```
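
The records shown above can also be read without jq. The following is a minimal Python sketch, assuming the field names from the sample output (`image_file`, `objects`, `dest_box`, `bbox`, `instructions`). The helper names `parse_records` and `flatten` are hypothetical and not part of the dataset. It parses JSON Lines text and yields one tuple per instruction.

```python
import json
from typing import Iterable, Iterator


def parse_records(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSON Lines input: one JSON object (one image) per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def flatten(records: Iterable[dict]) -> Iterator[tuple]:
    """Yield (image_file, dest_box, bbox, instruction) for every instruction."""
    for record in records:
        for obj in record["objects"]:
            for instruction in obj["instructions"]:
                yield record["image_file"], obj["dest_box"], obj["bbox"], instruction


# A one-record sample mirroring the jq output above (truncated to one object
# and one instruction); it stands in for a line of the real file.
sample = json.dumps({
    "image_file": "1.png",
    "pcd_file": "1.pcd",
    "objects": [{
        "dest_box": "tl",
        "bbox": {"x": 649.7302, "y": 654.038, "width": 171.1864, "height": 235.914},
        "instructions": [
            "pick up the green sachet and put it in the upper left box",
        ],
    }],
})

rows = list(flatten(parse_records([sample])))
for image_file, dest_box, bbox, instruction in rows:
    print(image_file, dest_box, bbox["width"], instruction)
```

To read a real split, an open file handle can be passed instead of the sample list, for example `open("dataset/en.train.jsonl", encoding="utf-8")`; that path assumes the archive was extracted into a `dataset/` directory as in the listing above.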
55 | 
56 | ## Citation
57 | * [English] Jun Hatori, Yuta Kikuchi, Sosuke Kobayashi, Kuniyuki Takahashi, Yuta Tsuboi, Yuya Unno, Wilson Ko, Jethro Tan. 
58 | Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions, 
59 | _Proceedings of International Conference on Robotics and Automation ([ICRA2018](https://icra2018.org/))_, 2018. ICRA Best Paper Award on Human-Robot Interaction (HRI), [project page](https://pfnet.github.io/interactive-robot/), [paper content on arxiv](https://arxiv.org/abs/1710.06280) 
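60 | (The first six authors contributed equally and are listed in alphabetical order.)
61 | * [Japanese] Jun Hatori, Yuta Kikuchi, Sosuke Kobayashi, Kuniyuki Takahashi, Yuta Tsuboi, Yuya Unno, Wilson Ko, Jethro Tan. [実世界におけるインタラクティブな物体指示 (Interactive Object Instruction in the Real World)](http://anlp.jp/proceedings/annual_meeting/2018/pdf_dir/C5-1.pdf), _Proceedings of the 24th Annual Meeting of the Association for Natural Language Processing ([NLP2018](http://www.anlp.jp/nlp2018/))_, 2018.
62 | (The first six authors are all lead authors and contributed equally.)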
63 | 
64 | ## Statistics
65 | | file name | # images | # bounding boxes | # instructions |
66 | |:---|---:|---:|---:|
67 | | en.train.jsonl | 1060 | 25500 | 71701 |
68 | | en.validation.jsonl | 20 | 353 | 898 |
69 | | ja.train.jsonl | 1060 | 25500 | 76551 |
70 | | ja.validation.jsonl | 20 | 383 | 1149 |
71 | 
72 | Note that some annotations in the English validation set contained misspellings or did not adequately specify the target object, so we manually reviewed all the text instructions in the validation set and removed the inappropriate ones.
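
Counts like those in the table can be recomputed from any of the JSON Lines files. The sketch below is one way to do it, assuming each line holds one image record with an `objects` list whose entries each carry an `instructions` list, as in the sample output earlier. The function name `count_annotations` is hypothetical, and the demo input is synthetic rather than real dataset content.

```python
import json
from typing import Iterable


def count_annotations(lines: Iterable[str]) -> dict:
    """Count images, bounding boxes, and instructions in JSON Lines input."""
    counts = {"images": 0, "bounding_boxes": 0, "instructions": 0}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        counts["images"] += 1  # one record per image
        counts["bounding_boxes"] += len(record["objects"])  # one bbox per object
        counts["instructions"] += sum(
            len(obj["instructions"]) for obj in record["objects"]
        )
    return counts


# Synthetic two-image input (not taken from the dataset) to show the call shape.
demo_lines = [
    json.dumps({"image_file": "a.png", "objects": [
        {"dest_box": "tl", "bbox": {}, "instructions": ["x", "y"]},
        {"dest_box": "tl", "bbox": {}, "instructions": ["z"]},
    ]}),
    json.dumps({"image_file": "b.png", "objects": []}),
]
demo_counts = count_annotations(demo_lines)
print(demo_counts)
```

Passing an open handle for a real file, such as `dataset/en.validation.jsonl` (assuming that extraction path), in place of `demo_lines` should give the figures for that split.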
73 | 
76 | ## Terms of Use
77 | The images and annotations in this dataset belong to Preferred Networks, Inc. and 
78 | are licensed under a [Creative Commons Attribution 4.0 License](https://creativecommons.org/licenses/by/4.0/legalcode).
79 | 
80 | ![creative commons logo](https://mirrors.creativecommons.org/presskit/logos/cc.logo.png)
81 | 
82 | THESE IMAGES AND ANNOTATIONS ARE PROVIDED "AS IS" AND NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE IMAGES AND ANNOTATIONS, WHETHER EXPRESS, IMPLIED, STATUTORY, OR OTHER, ARE MADE. THIS INCLUDES, WITHOUT LIMITATION, WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT KNOWN OR DISCOVERABLE. 
83 | IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE ON ANY THEORY OR OTHERWISE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, COSTS, EXPENSES, OR DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) ARISING OUT OF THIS PUBLIC LICENSE OR USE OF THE IMAGES AND ANNOTATIONS EVEN IF THE COPYRIGHT HOLDER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR DAMAGES.
84 | 
85 | 


--------------------------------------------------------------------------------
/example-annotation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pfnet-research/picking-instruction/ce31facc164d3025e52ab428484b5eed722c0eb4/example-annotation.png


--------------------------------------------------------------------------------