--------------------------------------------------------------------------------
/papers/pdfs/cs_CL/2017-04-02/summary.csv:
--------------------------------------------------------------------------------
1 | arXiv:1703.10252 Linguistic Matrix Theory https://arxiv.org/pdf/1703.10252.pdf Dimitrios Kartsaklis, Sanjaye Ramgoolam, Mehrnoosh Sadrzadeh https://arxiv.org/find/cs/1/au:+Kartsaklis_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ramgoolam_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sadrzadeh_M/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10252.pdf
2 | arXiv:1703.10186 Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding https://arxiv.org/pdf/1703.10186.pdf Will Monroe, Robert X.D. Hawkins, Noah D. Goodman, Christopher Potts https://arxiv.org/find/cs/1/au:+Monroe_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hawkins_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Goodman_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Potts_C/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10186.pdf
3 | arXiv:1703.10476 Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training https://arxiv.org/pdf/1703.10476.pdf Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele https://arxiv.org/find/cs/1/au:+Shetty_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Rohrbach_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hendricks_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Fritz_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schiele_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10476.pdf
4 | arXiv:1703.10356 End-to-End MAP Training of a Hybrid HMM-DNN Model https://arxiv.org/pdf/1703.10356.pdf Lior Fritz, David Burshtein https://arxiv.org/find/cs/1/au:+Fritz_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Burshtein_D/0/1/0/all/0/1 Learning (cs.LG) https://arxiv.org/abs/1703.10356.pdf
5 | arXiv:1703.10344 Automated News Suggestions for Populating Wikipedia Entity Pages https://arxiv.org/pdf/1703.10344.pdf Besnik Fetahu, Katja Markert, Avishek Anand https://arxiv.org/find/cs/1/au:+Fetahu_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Markert_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Anand_A/0/1/0/all/0/1 Information Retrieval (cs.IR) https://arxiv.org/abs/1703.10344.pdf
6 | arXiv:1703.10339 Finding News Citations for Wikipedia https://arxiv.org/pdf/1703.10339.pdf Besnik Fetahu, Katja Markert, Wolfgang Nejdl, Avishek Anand https://arxiv.org/find/cs/1/au:+Fetahu_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Markert_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Nejdl_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Anand_A/0/1/0/all/0/1 Information Retrieval (cs.IR) https://arxiv.org/abs/1703.10339.pdf
7 | arXiv:1703.10152 Automatic Argumentative-Zoning Using Word2vec https://arxiv.org/pdf/1703.10152.pdf Haixia Liu https://arxiv.org/find/cs/1/au:+Liu_H/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10152.pdf
8 | arXiv:1703.10135 Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model https://arxiv.org/pdf/1703.10135.pdf Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous https://arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Skerry_Ryan_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Stanton_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wu_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Weiss_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jaitly_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yang_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xiao_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bengio_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Le_Q/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Agiomyrgiannakis_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Clark_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Saurous_R/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10135.pdf
9 | arXiv:1703.10090 A Short Review of Ethical Challenges in Clinical Natural Language Processing https://arxiv.org/pdf/1703.10090.pdf Simon Šuster, Stéphan Tulkens, Walter Daelemans https://arxiv.org/find/cs/1/au:+Suster_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Tulkens_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Daelemans_W/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10090.pdf
10 | arXiv:1703.10065 Hierarchical Classification for Spoken Arabic Dialect Identification using Prosody: Case of Algerian Dialects https://arxiv.org/pdf/1703.10065.pdf Soumia Bougrine, Hadda Cherroun, Djelloul Ziadi https://arxiv.org/find/cs/1/au:+Bougrine_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cherroun_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ziadi_D/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10065.pdf
11 | arXiv:1703.09902 Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation https://arxiv.org/pdf/1703.09902.pdf Albert Gatt, Emiel Krahmer https://arxiv.org/find/cs/1/au:+Gatt_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Krahmer_E/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09902.pdf
12 | arXiv:1703.09831 A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment https://arxiv.org/pdf/1703.09831.pdf Haonan Yu, Haichao Zhang, Wei Xu https://arxiv.org/find/cs/1/au:+Yu_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xu_W/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09831.pdf
13 | arXiv:1703.09825 Semi-Supervised Affective Meaning Lexicon Expansion Using Semantic and Distributed Word Representations https://arxiv.org/pdf/1703.09825.pdf Areej Alhothali, Jesse Hoey https://arxiv.org/find/cs/1/au:+Alhothali_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hoey_J/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09825.pdf
14 | arXiv:1703.09817 Learning Similarity Function for Pronunciation Variations https://arxiv.org/pdf/1703.09817.pdf Einat Naaman, Yossi Adi, Joseph Keshet https://arxiv.org/find/cs/1/au:+Naaman_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Adi_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Keshet_J/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09817.pdf
15 | arXiv:1703.09749 Developpement de Methodes Automatiques pour la Reutilisation des Composants Logiciels https://arxiv.org/pdf/1703.09749.pdf Kouakou Ive Arsene Koffi, Konan Marcellin Brou, Souleymane Oumtanaga https://arxiv.org/find/cs/1/au:+Koffi_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Brou_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Oumtanaga_S/0/1/0/all/0/1 Software Engineering (cs.SE) https://arxiv.org/abs/1703.09749.pdf
16 | arXiv:1703.09570 A Tidy Data Model for Natural Language Processing using cleanNLP https://arxiv.org/pdf/1703.09570.pdf Taylor Arnold https://arxiv.org/find/cs/1/au:+Arnold_T/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09570.pdf
17 | arXiv:1703.09527 Is This a Joke? Detecting Humor in Spanish Tweets https://arxiv.org/pdf/1703.09527.pdf Santiago Castro, Matías Cubero, Diego Garat, Guillermo Moncecchi https://arxiv.org/find/cs/1/au:+Castro_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cubero_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Garat_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moncecchi_G/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09527.pdf
18 | arXiv:1703.09439 A practical approach to dialogue response generation in closed domains https://arxiv.org/pdf/1703.09439.pdf Yichao Lu, Phillip Keung, Shaonan Zhang, Jason Sun, Vikas Bhardwaj https://arxiv.org/find/cs/1/au:+Lu_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Keung_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sun_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bhardwaj_V/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09439.pdf
19 | arXiv:1703.09684 An Analysis of Visual Question Answering Algorithms https://arxiv.org/pdf/1703.09684.pdf Kushal Kafle, Christopher Kanan https://arxiv.org/find/cs/1/au:+Kafle_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kanan_C/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09684.pdf
20 | arXiv:1703.09400 Diving Deep into Clickbaits: Who Use Them to What Extents in Which Topics with What Effects? https://arxiv.org/pdf/1703.09400.pdf Md Main Uddin Rony, Naeemul Hassan, Mohammad Yousuf https://arxiv.org/find/cs/1/au:+Rony_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hassan_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yousuf_M/0/1/0/all/0/1 Social and Information Networks (cs.SI) https://arxiv.org/abs/1703.09400.pdf
21 | arXiv:1703.09398 This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News https://arxiv.org/pdf/1703.09398.pdf Benjamin D. Horne, Sibel Adali https://arxiv.org/find/cs/1/au:+Horne_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Adali_S/0/1/0/all/0/1 Social and Information Networks (cs.SI) https://arxiv.org/abs/1703.09398.pdf
22 | arXiv:1703.09013 A Sentence Simplification System for Improving Relation Extraction https://arxiv.org/pdf/1703.09013.pdf Christina Niklaus, Bernhard Bermeitinger, Siegfried Handschuh, André Freitas https://arxiv.org/find/cs/1/au:+Niklaus_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bermeitinger_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Handschuh_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Freitas_A/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.09013.pdf
23 | arXiv:1703.08885 Question Answering from Unstructured Text by Retrieval and Comprehension https://arxiv.org/pdf/1703.08885.pdf Yusuke Watanabe, Bhuwan Dhingra, Ruslan Salakhutdinov https://arxiv.org/find/cs/1/au:+Watanabe_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dhingra_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Salakhutdinov_R/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08885.pdf
24 | arXiv:1703.08864 Learning Simpler Language Models with the Delta Recurrent Neural Network Framework https://arxiv.org/pdf/1703.08864.pdf Alexander G. Ororbia II, Tomas Mikolov, David Reitter https://arxiv.org/find/cs/1/au:+Ororbia_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mikolov_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Reitter_D/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08864.pdf
25 | arXiv:1703.08748 LEPOR: An Augmented Machine Translation Evaluation Metric https://arxiv.org/pdf/1703.08748.pdf Lifeng Han https://arxiv.org/find/cs/1/au:+Han_L/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08748.pdf
26 | arXiv:1703.08705 Comparing Rule-Based and Deep Learning Models for Patient Phenotyping https://arxiv.org/pdf/1703.08705.pdf Sebastian Gehrmann, Franck Dernoncourt, Yeran Li, Eric T. Carlson, Joy T. Wu, Jonathan Welt, John Foote Jr., Edward T. Moseley, David W. Grant, Patrick D. Tyler, Leo Anthony Celi https://arxiv.org/find/cs/1/au:+Gehrmann_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dernoncourt_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Carlson_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wu_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Welt_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Foote_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moseley_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Grant_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Tyler_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Celi_L/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08705.pdf
27 | arXiv:1703.08701 Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System https://arxiv.org/pdf/1703.08701.pdf Claudia Borg, Albert Gatt https://arxiv.org/find/cs/1/au:+Borg_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gatt_A/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08701.pdf
28 | arXiv:1703.08646 Simplifying the Bible and Wikipedia Using Statistical Machine Translation https://arxiv.org/pdf/1703.08646.pdf Yohan Jo https://arxiv.org/find/cs/1/au:+Jo_Y/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08646.pdf
29 | arXiv:1703.08581 Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech https://arxiv.org/pdf/1703.08581.pdf Ron J. Weiss, Jan Chorowski, Navdeep Jaitly, Yonghui Wu, Zhifeng Chen https://arxiv.org/find/cs/1/au:+Weiss_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chorowski_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jaitly_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wu_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_Z/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08581.pdf
30 | arXiv:1703.09137 Where to put the Image in an Image Caption Generator https://arxiv.org/pdf/1703.09137.pdf Marc Tanti (1), Albert Gatt (1), Kenneth P. Camilleri (1) ((1) University of Malta) https://arxiv.org/find/cs/1/au:+Tanti_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gatt_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Camilleri_K/0/1/0/all/0/1 Neural and Evolutionary Computing (cs.NE) https://arxiv.org/abs/1703.09137.pdf
31 | arXiv:1703.09046 Bootstrapping a Lexicon for Emotional Arousal in Software Engineering https://arxiv.org/pdf/1703.09046.pdf Mika V. Mäntylä, Nicole Novielli, Filippo Lanubile, Maëlick Claes, Miikka Kuutila https://arxiv.org/find/cs/1/au:+Mantyla_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Novielli_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lanubile_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Claes_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kuutila_M/0/1/0/all/0/1 Software Engineering (cs.SE) https://arxiv.org/abs/1703.09046.pdf
32 | arXiv:1703.08544 Data-Mining Textual Responses to Uncover Misconception Patterns https://arxiv.org/pdf/1703.08544.pdf Joshua J. Michalenko, Andrew S. Lan, Richard G. Baraniuk https://arxiv.org/find/stat/1/au:+Michalenko_J/0/1/0/all/0/1,https://arxiv.org/find/stat/1/au:+Lan_A/0/1/0/all/0/1,https://arxiv.org/find/stat/1/au:+Baraniuk_R/0/1/0/all/0/1 Machine Learning (stat.ML) https://arxiv.org/abs/1703.08544.pdf
33 | arXiv:1703.08537 Crowdsourcing Universal Part-Of-Speech Tags for Code-Switching https://arxiv.org/pdf/1703.08537.pdf Victor Soto, Julia Hirschberg https://arxiv.org/find/cs/1/au:+Soto_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hirschberg_J/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08537.pdf
34 | arXiv:1703.08513 Interactive Natural Language Acquisition in a Multi-modal Recurrent Neural Architecture https://arxiv.org/pdf/1703.08513.pdf Stefan Heinrich, Stefan Wermter https://arxiv.org/find/cs/1/au:+Heinrich_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wermter_S/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08513.pdf
35 | arXiv:1703.08471 Batch-normalized joint training for DNN-based distant speech recognition https://arxiv.org/pdf/1703.08471.pdf Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio https://arxiv.org/find/cs/1/au:+Ravanelli_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Brakel_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Omologo_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bengio_Y/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08471.pdf
36 | arXiv:1703.08244 TokTrack: A Complete Token Provenance and Change Tracking Dataset for the English Wikipedia https://arxiv.org/pdf/1703.08244.pdf Fabian Flöck, Kenan Erdogan, Maribel Acosta https://arxiv.org/find/cs/1/au:+Flock_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Erdogan_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Acosta_M/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.08244.pdf
37 | arXiv:1703.08428 Calendar.help: Designing a Workflow-Based Scheduling Agent with Humans in the Loop https://arxiv.org/pdf/1703.08428.pdf Justin Cranshaw, Emad Elwany, Todd Newman, Rafal Kocielnik, Bowen Yu, Sandeep Soni, Jaime Teevan, Andrés Monroy-Hernández https://arxiv.org/find/cs/1/au:+Cranshaw_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Elwany_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Newman_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kocielnik_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yu_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Soni_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Teevan_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Monroy_Hernandez_A/0/1/0/all/0/1 Human-Computer Interaction (cs.HC) https://arxiv.org/abs/1703.08428.pdf
38 | arXiv:1703.08324 Are crossing dependencies really scarce? https://arxiv.org/pdf/1703.08324.pdf Ramon Ferrer-i-Cancho, Carlos Gomez-Rodriguez, J.L. Esteban https://arxiv.org/find/physics/1/au:+Ferrer_i_Cancho_R/0/1/0/all/0/1,https://arxiv.org/find/physics/1/au:+Gomez_Rodriguez_C/0/1/0/all/0/1,https://arxiv.org/find/physics/1/au:+Esteban_J/0/1/0/all/0/1 Physics and Society (physics.soc-ph) https://arxiv.org/abs/1703.08324.pdf
39 | arXiv:1703.08314 Interacting Conceptual Spaces I : Grammatical Composition of Concepts https://arxiv.org/pdf/1703.08314.pdf Joe Bolt, Bob Coecke, Fabrizio Genovese, Martha Lewis, Dan Marsden, Robin Piedeleu https://arxiv.org/find/cs/1/au:+Bolt_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Coecke_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Genovese_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lewis_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Marsden_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Piedeleu_R/0/1/0/all/0/1 Logic in Computer Science (cs.LO) https://arxiv.org/abs/1703.08314.pdf
40 |
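Each summary.csv in this dump appears to share one row layout: arXiv id, title, PDF link, comma-separated authors, comma-separated author-search links, primary category, and an abstract-page link, with a blank line closing each file. Below is a minimal reading sketch, assuming the fields are tab-delimited (despite the .csv extension, since titles and author lists themselves contain commas) and that the column order matches the rows shown; the column names and the helper load_summary() are illustrative, not part of this repo.

import csv

COLUMNS = ["arxiv_id", "title", "pdf_url", "authors",
           "author_search_urls", "category", "abs_url"]

def load_summary(path):
    # Yield one dict per paper, skipping the blank line that ends each file.
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):  # assumed delimiter
            if not row or not row[0].strip():
                continue
            yield dict(zip(COLUMNS, row))

for paper in load_summary("/papers/pdfs/cs_CL/2017-04-02/summary.csv"):
    print(paper["arxiv_id"], "-", paper["title"])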
--------------------------------------------------------------------------------
/papers/pdfs/cs_CL/2017-04-03/summary.csv:
--------------------------------------------------------------------------------
1 | arXiv:1703.10252 Linguistic Matrix Theory https://arxiv.org/pdf/1703.10252.pdf Dimitrios Kartsaklis, Sanjaye Ramgoolam, Mehrnoosh Sadrzadeh https://arxiv.org/find/cs/1/au:+Kartsaklis_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ramgoolam_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sadrzadeh_M/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10252.pdf
2 | arXiv:1703.10186 Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding https://arxiv.org/pdf/1703.10186.pdf Will Monroe, Robert X.D. Hawkins, Noah D. Goodman, Christopher Potts https://arxiv.org/find/cs/1/au:+Monroe_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hawkins_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Goodman_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Potts_C/0/1/0/all/0/1 Computation and Language (cs.CL) https://arxiv.org/abs/1703.10186.pdf
3 | arXiv:1703.10476 Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training https://arxiv.org/pdf/1703.10476.pdf Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele https://arxiv.org/find/cs/1/au:+Shetty_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Rohrbach_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hendricks_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Fritz_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schiele_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10476.pdf
4 | arXiv:1703.10356 End-to-End MAP Training of a Hybrid HMM-DNN Model https://arxiv.org/pdf/1703.10356.pdf Lior Fritz, David Burshtein https://arxiv.org/find/cs/1/au:+Fritz_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Burshtein_D/0/1/0/all/0/1 Learning (cs.LG) https://arxiv.org/abs/1703.10356.pdf
5 | arXiv:1703.10344 Automated News Suggestions for Populating Wikipedia Entity Pages https://arxiv.org/pdf/1703.10344.pdf Besnik Fetahu, Katja Markert, Avishek Anand https://arxiv.org/find/cs/1/au:+Fetahu_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Markert_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Anand_A/0/1/0/all/0/1 Information Retrieval (cs.IR) https://arxiv.org/abs/1703.10344.pdf
6 |
--------------------------------------------------------------------------------
/papers/pdfs/cs_cv/2017-03-03/summary.csv:
--------------------------------------------------------------------------------
1 | 1,1,1,1,1,1
2 | 2,2,2,2,2,2
3 | 3,3,3,3,3,3
4 | 4,4,4,4,4,4
5 |
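The four rows above hold placeholder values rather than paper metadata, so any script aggregating these files needs a sanity filter. A minimal sketch, assuming rows shaped like the load_summary() output above; the id pattern is an assumption covering identifiers of the arXiv:1703.00862 form.

import re

ARXIV_ID = re.compile(r"^arXiv:\d{4}\.\d{4,5}$")

def is_paper_row(row):
    # Keep only rows whose id field matches arXiv:YYMM.NNNNN.
    return bool(ARXIV_ID.match(row.get("arxiv_id", "")))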
--------------------------------------------------------------------------------
/papers/pdfs/cs_cv/2017-03-05/summary.csv:
--------------------------------------------------------------------------------
1 | arXiv:1703.00862 Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources https://arxiv.org/pdf/1703.00862.pdf Adrian Bulat, Georgios Tzimiropoulos https://arxiv.org/find/cs/1/au:+Bulat_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Tzimiropoulos_G/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00862.pdf
2 | arXiv:1703.00856 Araguaia Medical Vision Lab at ISIC 2017 Skin Lesion Classification Challenge https://arxiv.org/pdf/1703.00856.pdf Rafael Teixeira Sousa, Larissa Vasconcellos de Moraes https://arxiv.org/find/cs/1/au:+Sousa_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moraes_L/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00856.pdf
3 | arXiv:1703.00848 Unsupervised Image-to-Image Translation Networks https://arxiv.org/pdf/1703.00848.pdf Ming-Yu Liu, Thomas Breuel, Jan Kautz https://arxiv.org/find/cs/1/au:+Liu_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Breuel_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kautz_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00848.pdf
4 | arXiv:1703.00845 Towards CNN Map Compression for camera relocalisation https://arxiv.org/pdf/1703.00845.pdf Luis Contreras, Walterio Mayol-Cuevas https://arxiv.org/find/cs/1/au:+Contreras_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mayol_Cuevas_W/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00845.pdf
5 | arXiv:1703.00832 Face Image Reconstruction from Deep Templates https://arxiv.org/pdf/1703.00832.pdf Guangcan Mai, Kai Cao, Pong C. Yuen, Anil K. Jain https://arxiv.org/find/cs/1/au:+Mai_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cao_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yuen_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jain_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00832.pdf
6 | arXiv:1703.00792 Robust Spatial Filtering with Graph Convolutional Neural Networks https://arxiv.org/pdf/1703.00792.pdf Felipe Petroski Such, Shagan Sah, Miguel Dominguez, Suhas Pillai, Chao Zhang, Andrew Michael, Nathan Cahill, Raymond Ptucha https://arxiv.org/find/cs/1/au:+Such_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sah_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dominguez_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Pillai_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Michael_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cahill_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ptucha_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00792.pdf
7 | arXiv:1703.00767 Attentive Recurrent Comparators https://arxiv.org/pdf/1703.00767.pdf Pranav Shyam, Shubham Gupta, Ambedkar Dukkipati https://arxiv.org/find/cs/1/au:+Shyam_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gupta_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dukkipati_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00767.pdf
8 |
--------------------------------------------------------------------------------
/papers/pdfs/cs_cv/2017-03-06/summary.csv:
--------------------------------------------------------------------------------
1 | arXiv:1703.00862 Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources https://arxiv.org/pdf/1703.00862.pdf Adrian Bulat, Georgios Tzimiropoulos https://arxiv.org/find/cs/1/au:+Bulat_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Tzimiropoulos_G/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00862.pdf
2 | arXiv:1703.00856 Araguaia Medical Vision Lab at ISIC 2017 Skin Lesion Classification Challenge https://arxiv.org/pdf/1703.00856.pdf Rafael Teixeira Sousa, Larissa Vasconcellos de Moraes https://arxiv.org/find/cs/1/au:+Sousa_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moraes_L/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00856.pdf
3 | arXiv:1703.00848 Unsupervised Image-to-Image Translation Networks https://arxiv.org/pdf/1703.00848.pdf Ming-Yu Liu, Thomas Breuel, Jan Kautz https://arxiv.org/find/cs/1/au:+Liu_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Breuel_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kautz_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00848.pdf
4 | arXiv:1703.00845 Towards CNN Map Compression for camera relocalisation https://arxiv.org/pdf/1703.00845.pdf Luis Contreras, Walterio Mayol-Cuevas https://arxiv.org/find/cs/1/au:+Contreras_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mayol_Cuevas_W/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00845.pdf
5 | arXiv:1703.00832 Face Image Reconstruction from Deep Templates https://arxiv.org/pdf/1703.00832.pdf Guangcan Mai, Kai Cao, Pong C. Yuen, Anil K. Jain https://arxiv.org/find/cs/1/au:+Mai_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cao_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yuen_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jain_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00832.pdf
6 | arXiv:1703.00792 Robust Spatial Filtering with Graph Convolutional Neural Networks https://arxiv.org/pdf/1703.00792.pdf Felipe Petroski Such, Shagan Sah, Miguel Dominguez, Suhas Pillai, Chao Zhang, Andrew Michael, Nathan Cahill, Raymond Ptucha https://arxiv.org/find/cs/1/au:+Such_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sah_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dominguez_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Pillai_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Michael_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cahill_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ptucha_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00792.pdf
7 | arXiv:1703.00767 Attentive Recurrent Comparators https://arxiv.org/pdf/1703.00767.pdf Pranav Shyam, Shubham Gupta, Ambedkar Dukkipati https://arxiv.org/find/cs/1/au:+Shyam_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gupta_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dukkipati_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00767.pdf
8 | arXiv:1703.00686 BoxCars: Improving Vehicle Fine-Grained Recognition using 3D Bounding Boxes in Traffic Surveillance https://arxiv.org/pdf/1703.00686.pdf Jakub Sochor, Jakub Špaňhel, Adam Herout https://arxiv.org/find/cs/1/au:+Sochor_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Spanhel_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Herout_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00686.pdf
9 |
--------------------------------------------------------------------------------
/papers/pdfs/cs_cv/2017-04-02/summary.csv:
--------------------------------------------------------------------------------
1 | arXiv:1703.10593 Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks https://arxiv.org/pdf/1703.10593.pdf Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros https://arxiv.org/find/cs/1/au:+Zhu_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Park_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Isola_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Efros_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10593.pdf
2 | arXiv:1703.10584 Geometric Affordances from a Single Example via the Interaction Tensor https://arxiv.org/pdf/1703.10584.pdf Eduardo Ruiz, Walterio Mayol-Cuevas https://arxiv.org/find/cs/1/au:+Ruiz_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mayol_Cuevas_W/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10584.pdf
3 | arXiv:1703.10580 MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction https://arxiv.org/pdf/1703.10580.pdf Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, Christian Theobalt https://arxiv.org/find/cs/1/au:+Tewari_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zollhofer_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kim_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Garrido_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bernard_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Perez_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Theobalt_C/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10580.pdf
4 | arXiv:1703.10571 Bootstrapping Labelled Dataset Construction for Cow Tracking and Behavior Analysis https://arxiv.org/pdf/1703.10571.pdf Aram Ter-Sarkisov, Robert Ross, John Kelleher https://arxiv.org/find/cs/1/au:+Ter_Sarkisov_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ross_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kelleher_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10571.pdf
5 | arXiv:1703.10553 Learning Convolutional Networks for Content-weighted Image Compression https://arxiv.org/pdf/1703.10553.pdf Mu Li, Wangmeng Zuo, Shuhang Gu, Debin Zhao, David Zhang https://arxiv.org/find/cs/1/au:+Li_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zuo_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gu_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhao_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_D/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10553.pdf
6 | arXiv:1703.10530 Efficient optimization for Hierarchically-structured Interacting Segments (HINTS) https://arxiv.org/pdf/1703.10530.pdf Hossam Isack, Olga Veksler, Ipek Oguz, Milan Sonka, Yuri Boykov https://arxiv.org/find/cs/1/au:+Isack_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Veksler_O/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Oguz_I/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sonka_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Boykov_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10530.pdf
7 | arXiv:1703.10501 A Paradigm Shift: Detecting Human Rights Violations Through Web Images https://arxiv.org/pdf/1703.10501.pdf Grigorios Kalliatakis, Shoaib Ehsan, Klaus D. McDonald-Maier https://arxiv.org/find/cs/1/au:+Kalliatakis_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ehsan_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+McDonald_Maier_K/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10501.pdf
8 | arXiv:1703.10480 A deep learning classification scheme based on augmented-enhanced features to segment organs at risk on the optic region in brain cancer patients https://arxiv.org/pdf/1703.10480.pdf Jose Dolz, Nicolas Reyns, Nacim Betrouni, Dris Kharroubi, Mathilde Quidet, Laurent Massoptier, Maximilien Vermandel https://arxiv.org/find/cs/1/au:+Dolz_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Reyns_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Betrouni_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kharroubi_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Quidet_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Massoptier_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Vermandel_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10480.pdf
9 | arXiv:1703.10476 Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training https://arxiv.org/pdf/1703.10476.pdf Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele https://arxiv.org/find/cs/1/au:+Shetty_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Rohrbach_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hendricks_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Fritz_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schiele_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10476.pdf
10 | arXiv:1703.10332 Dynamic Computational Time for Visual Attention https://arxiv.org/pdf/1703.10332.pdf Zhichao Li, Yi Yang, Xiao Liu, Shilei Wen, Wei Xu https://arxiv.org/find/cs/1/au:+Li_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wen_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xu_W/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10332.pdf
11 | arXiv:1703.10304 Planecell: Representing the 3D Space with Planes https://arxiv.org/pdf/1703.10304.pdf Lei Fan, Ziyu Pan, Long Chen, Kai Huang https://arxiv.org/find/cs/1/au:+Fan_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Pan_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Huang_K/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10304.pdf
12 | arXiv:1703.10295 DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling https://arxiv.org/pdf/1703.10295.pdf Lachlan Tychsen-Smith, Lars Petersson https://arxiv.org/find/cs/1/au:+Tychsen_Smith_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Petersson_L/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10295.pdf
13 | arXiv:1703.10277 Semantic Instance Segmentation via Deep Metric Learning https://arxiv.org/pdf/1703.10277.pdf Alireza Fathi, Zbigniew Wojna, Vivek Rathod, Peng Wang, Hyun Oh Song, Sergio Guadarrama, Kevin P. Murphy https://arxiv.org/find/cs/1/au:+Fathi_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wojna_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Rathod_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Song_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Guadarrama_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Murphy_K/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10277.pdf
14 | arXiv:1703.10239 SeGAN: Segmenting and Generating the Invisible https://arxiv.org/pdf/1703.10239.pdf Kiana Ehsani, Roozbeh Mottaghi, Ali Farhadi https://arxiv.org/find/cs/1/au:+Ehsani_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mottaghi_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Farhadi_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10239.pdf
15 | arXiv:1703.10217 Smartphone Based Colorimetric Detection via Machine Learning https://arxiv.org/pdf/1703.10217.pdf Ali Y. Mutlu, Volkan Kılıç, Gizem K. Özdemir, Abdullah Bayram, Nesrin Horzum, Mehmet E. Solmaz https://arxiv.org/find/cs/1/au:+Mutlu_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kilic_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ozdemir_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bayram_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Horzum_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Solmaz_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10217.pdf
16 | arXiv:1703.10200 Learning High Dynamic Range from Outdoor Panoramas https://arxiv.org/pdf/1703.10200.pdf Jinsong Zhang, Jean-François Lalonde https://arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lalonde_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10200.pdf
17 | arXiv:1703.10196 Detecting Human Interventions on the Landscape: KAZE Features, Poisson Point Processes, and a Construction Dataset https://arxiv.org/pdf/1703.10196.pdf Edward Boyda, Colin McCormick, Dan Hammer https://arxiv.org/find/cs/1/au:+Boyda_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+McCormick_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hammer_D/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10196.pdf
18 | arXiv:1703.10155 CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training https://arxiv.org/pdf/1703.10155.pdf Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua https://arxiv.org/find/cs/1/au:+Bao_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wen_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Li_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hua_G/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10155.pdf
19 | arXiv:1703.10131 Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation https://arxiv.org/pdf/1703.10131.pdf Matan Sela, Elad Richardson, Ron Kimmel https://arxiv.org/find/cs/1/au:+Sela_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Richardson_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kimmel_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10131.pdf
20 | arXiv:1703.10125 Google Map Aided Visual Navigation for UAVs in GPS-denied Environment https://arxiv.org/pdf/1703.10125.pdf Mo Shan, Fei Wang, Feng Lin, Zhi Gao, Ya Z. Tang, Ben M. Chen https://arxiv.org/find/cs/1/au:+Shan_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lin_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gao_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Tang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10125.pdf
21 | arXiv:1703.10114 Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks https://arxiv.org/pdf/1703.10114.pdf Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, Troy Chinen, Sung Jin Hwang, Joel Shor, George Toderici https://arxiv.org/find/cs/1/au:+Johnston_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Vincent_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Minnen_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Covell_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Singh_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chinen_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hwang_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shor_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Toderici_G/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10114.pdf
22 | arXiv:1703.10106 Pose-conditioned Spatio-Temporal Attention for Human Action Recognition https://arxiv.org/pdf/1703.10106.pdf Fabien Baradel, Christian Wolf, Julien Mille https://arxiv.org/find/cs/1/au:+Baradel_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wolf_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mille_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10106.pdf
23 | arXiv:1703.10025 Flow-Guided Feature Aggregation for Video Object Detection https://arxiv.org/pdf/1703.10025.pdf Xizhou Zhu, Yujie Wang, Jifeng Dai, Lu Yuan, Yichen Wei https://arxiv.org/find/cs/1/au:+Zhu_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dai_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yuan_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wei_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10025.pdf
24 | arXiv:1703.09983 Iterative Object and Part Transfer for Fine-Grained Recognition https://arxiv.org/pdf/1703.09983.pdf Zhiqiang Shen, Yu-Gang Jiang, Dequan Wang, Xiangyang Xue https://arxiv.org/find/cs/1/au:+Shen_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jiang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xue_X/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09983.pdf
25 | arXiv:1703.09971 A Geometric Framework for Stochastic Shape Analysis https://arxiv.org/pdf/1703.09971.pdf Alexis Arnaudon, Darryl D. Holm, Stefan Sommer https://arxiv.org/find/cs/1/au:+Arnaudon_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Holm_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sommer_S/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09971.pdf
26 | arXiv:1703.09964 Image Restoration using Autoencoding Priors https://arxiv.org/pdf/1703.09964.pdf Siavash Arjomand Bigdeli, Matthias Zwicker https://arxiv.org/find/cs/1/au:+Bigdeli_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zwicker_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09964.pdf
27 | arXiv:1703.09933 Sentiment Recognition in Egocentric Photostreams https://arxiv.org/pdf/1703.09933.pdf Estefania Talavera, Nicola Strisciuglio, Nicolai Petkov, Petia Radeva https://arxiv.org/find/cs/1/au:+Talavera_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Strisciuglio_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Petkov_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Radeva_P/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09933.pdf
28 | arXiv:1703.09928 Bundle Optimization for Multi-aspect Embedding https://arxiv.org/pdf/1703.09928.pdf Qiong Zeng, Wenzheng Chen, Zhuo Han, Mingyi Shi, Yanir Kleiman, Daniel Cohen-Or, Baoquan Chen, Yangyan Li https://arxiv.org/find/cs/1/au:+Zeng_Q/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Han_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shi_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kleiman_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cohen_Or_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09928.pdf
29 | arXiv:1703.09916 Towards thinner convolutional neural networks through Gradually Global Pruning https://arxiv.org/pdf/1703.09916.pdf Zhengtao Wang, Ce Zhu, Zhiqiang Xia, Qi Guo, Yipeng Liu https://arxiv.org/find/cs/1/au:+Wang_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhu_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xia_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Guo_Q/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09916.pdf
30 | arXiv:1703.09913 Who's Better, Who's Best: Skill Determination in Video using Deep Ranking https://arxiv.org/pdf/1703.09913.pdf Hazel Doughty, Dima Damen, Walterio Mayol-Cuevas https://arxiv.org/find/cs/1/au:+Doughty_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Damen_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mayol_Cuevas_W/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09913.pdf
31 | arXiv:1703.09912 One Network to Solve Them All --- Solving Linear Inverse Problems using Deep Projection Models https://arxiv.org/pdf/1703.09912.pdf J. H. Rick Chang, Chun-Liang Li, Barnabas Poczos, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan https://arxiv.org/find/cs/1/au:+Chang_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Li_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Poczos_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kumar_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sankaranarayanan_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09912.pdf
32 | arXiv:1703.09911 Learning with Privileged Information for Multi-Label Classification https://arxiv.org/pdf/1703.09911.pdf Shiyu Chen, Shangfei Wang, Tanfang Chen, Xiaoxiao Shi https://arxiv.org/find/cs/1/au:+Chen_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shi_X/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09911.pdf
33 | arXiv:1703.09891 LabelBank: Revisiting Global Perspectives for Semantic Segmentation https://arxiv.org/pdf/1703.09891.pdf Hexiang Hu, Zhiwei Deng, Guang-Tong Zhou, Fei Sha, Greg Mori https://arxiv.org/find/cs/1/au:+Hu_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Deng_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhou_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sha_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mori_G/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09891.pdf
34 | arXiv:1703.09880 Novel Structured Low-rank algorithm to recover spatially smooth exponential image time series https://arxiv.org/pdf/1703.09880.pdf Arvind Balachandrasekaran, Mathews Jacob https://arxiv.org/find/cs/1/au:+Balachandrasekaran_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jacob_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09880.pdf
35 | arXiv:1703.09859 Click Here: Human-Localized Keypoints as Guidance for Viewpoint Estimation https://arxiv.org/pdf/1703.09859.pdf Ryan Szeto, Jason J. Corso https://arxiv.org/find/cs/1/au:+Szeto_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Corso_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09859.pdf
36 | arXiv:1703.09856 Automatic Detection of Knee Joints and Quantification of Knee Osteoarthritis Severity using Convolutional Neural Networks https://arxiv.org/pdf/1703.09856.pdf Joseph Antony, Kevin McGuinness, Kieran Moran, Noel E O'Connor https://arxiv.org/find/cs/1/au:+Antony_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+McGuinness_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moran_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+OConnor_N/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09856.pdf
37 | arXiv:1703.09793 Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos https://arxiv.org/pdf/1703.09793.pdf Hossein Hosseini, Baicen Xiao, Radha Poovendran https://arxiv.org/find/cs/1/au:+Hosseini_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xiao_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Poovendran_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09793.pdf
38 | arXiv:1703.09788 ProcNets: Learning to Segment Procedures in Untrimmed and Unconstrained Videos https://arxiv.org/pdf/1703.09788.pdf Luowei Zhou, Chenliang Xu, Jason J. Corso https://arxiv.org/find/cs/1/au:+Zhou_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xu_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Corso_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09788.pdf
39 | arXiv:1703.09784 Perception Driven Texture Generation https://arxiv.org/pdf/1703.09784.pdf Yanhai Gan, Huifang Chi, Ying Gao, Jun Liu, Guoqiang Zhong, Junyu Dong https://arxiv.org/find/cs/1/au:+Gan_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chi_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gao_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhong_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dong_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09784.pdf
40 | arXiv:1703.09783 Two-Stream RNN/CNN for Action Recognition in 3D Videos https://arxiv.org/pdf/1703.09783.pdf Rui Zhao, Haider Ali, Patrick van der Smagt https://arxiv.org/find/cs/1/au:+Zhao_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ali_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Smagt_P/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09783.pdf
41 | arXiv:1703.09779 A Holistic Approach for Optimizing DSP Block Utilization of a CNN implementation on FPGA https://arxiv.org/pdf/1703.09779.pdf Kamel Abdelouahab, Cedric Bourrasset, Maxime Pelcat, François Berry, Jean-Charles Quinton, Jocelyn Serot https://arxiv.org/find/cs/1/au:+Abdelouahab_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bourrasset_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Pelcat_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Berry_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Quinton_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Serot_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09779.pdf
42 | arXiv:1703.09778 INTEL-TUT Dataset for Camera Invariant Color Constancy Research https://arxiv.org/pdf/1703.09778.pdf Caglar Aytekin, Jarno Nikkanen, Moncef Gabbouj https://arxiv.org/find/cs/1/au:+Aytekin_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Nikkanen_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gabbouj_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09778.pdf
43 | arXiv:1703.09771 Deep 6-DOF Tracking https://arxiv.org/pdf/1703.09771.pdf Mathieu Garon, Jean-François Lalonde https://arxiv.org/find/cs/1/au:+Garon_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lalonde_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09771.pdf
44 | arXiv:1703.09746 Coordinating Filters for Faster Deep Neural Networks https://arxiv.org/pdf/1703.09746.pdf Wei Wen, Cong Xu, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li https://arxiv.org/find/cs/1/au:+Wen_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xu_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wu_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Li_H/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09746.pdf
45 | arXiv:1703.09744 Feature Analysis and Selection for Training an End-to-End Autonomous Vehicle Controller Using the Deep Learning Approach https://arxiv.org/pdf/1703.09744.pdf Shun Yang, Wenshuo Wang, Chang Liu, Kevin Deng, J. Karl Hedrick https://arxiv.org/find/cs/1/au:+Yang_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Deng_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hedrick_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09744.pdf
46 | arXiv:1703.09725 An Epipolar Line from a Single Pixel https://arxiv.org/pdf/1703.09725.pdf Tavi Halperin, Michael Werman https://arxiv.org/find/cs/1/au:+Halperin_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Werman_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09725.pdf
47 | arXiv:1703.09833 Theory II: Landscape of the Empirical Risk in Deep Learning https://arxiv.org/pdf/1703.09833.pdf Tomaso Poggio, Qianli Liao https://arxiv.org/find/cs/1/au:+Poggio_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liao_Q/0/1/0/all/0/1 Learning (cs.LG) https://arxiv.org/abs/1703.09833.pdf
48 | arXiv:1703.09695 Semi and Weakly Supervised Semantic Segmentation Using Generative Adversarial Network https://arxiv.org/pdf/1703.09695.pdf Nasim Souly, Concetto Spampinato, Mubarak Shah https://arxiv.org/find/cs/1/au:+Souly_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Spampinato_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shah_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09695.pdf
49 | arXiv:1703.09690 Efficient Two-Dimensional Sparse Coding Using Tensor-Linear Combination https://arxiv.org/pdf/1703.09690.pdf Fei Jiang, Xiao-Yang Liu, Hongtao Lu, Ruimin Shen https://arxiv.org/find/cs/1/au:+Jiang_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lu_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shen_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09690.pdf
50 | arXiv:1703.09684 An Analysis of Visual Question Answering Algorithms https://arxiv.org/pdf/1703.09684.pdf Kushal Kafle, Christopher Kanan https://arxiv.org/find/cs/1/au:+Kafle_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kanan_C/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09684.pdf
51 | arXiv:1703.09625 Learning and Refining of Privileged Information-based RNNs for Action Recognition from Depth Sequences https://arxiv.org/pdf/1703.09625.pdf Zhiyuan Shi, Tae-Kyun Kim https://arxiv.org/find/cs/1/au:+Shi_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kim_T/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09625.pdf
52 | arXiv:1703.09554 Lucid Data Dreaming for Object Tracking https://arxiv.org/pdf/1703.09554.pdf Anna Khoreva, Rodrigo Benenson, Eddy Ilg, Thomas Brox, Bernt Schiele https://arxiv.org/find/cs/1/au:+Khoreva_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Benenson_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ilg_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Brox_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schiele_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09554.pdf
53 | arXiv:1703.09550 Important New Developments in Arabographic Optical Character Recognition (OCR) https://arxiv.org/pdf/1703.09550.pdf Maxim Romanov, Matthew Thomas Miller, Sarah Bowen Savant, Benjamin Kiessling https://arxiv.org/find/cs/1/au:+Romanov_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Miller_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Savant_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kiessling_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09550.pdf
54 | arXiv:1703.09529 Objects as context for part detection https://arxiv.org/pdf/1703.09529.pdf Abel Gonzalez-Garcia, Davide Modolo, Vittorio Ferrari https://arxiv.org/find/cs/1/au:+Gonzalez_Garcia_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Modolo_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ferrari_V/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09529.pdf
55 | arXiv:1703.09507 L2-constrained Softmax Loss for Discriminative Face Verification https://arxiv.org/pdf/1703.09507.pdf Rajeev Ranjan, Carlos D. Castillo, Rama Chellappa https://arxiv.org/find/cs/1/au:+Ranjan_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Castillo_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chellappa_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09507.pdf
56 | arXiv:1703.09499 Locally Preserving Projection on Symmetric Positive Definite Matrix Lie Group https://arxiv.org/pdf/1703.09499.pdf Yangyang Li, Ruqian Lu https://arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lu_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09499.pdf
57 | arXiv:1703.09474 Robust Depth-based Person Re-identification https://arxiv.org/pdf/1703.09474.pdf Ancong Wu, Wei-Shi Zheng, Jianhuang Lai https://arxiv.org/find/cs/1/au:+Wu_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zheng_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lai_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09474.pdf
58 | arXiv:1703.09471 Adversarial Image Perturbation for Privacy Protection -- A Game Theory Perspective https://arxiv.org/pdf/1703.09471.pdf Seong Joon Oh, Mario Fritz, Bernt Schiele https://arxiv.org/find/cs/1/au:+Oh_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Fritz_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schiele_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09471.pdf
59 | arXiv:1703.09470 Learned Spectral Super-Resolution https://arxiv.org/pdf/1703.09470.pdf Silvano Galliani, Charis Lanaras, Dimitrios Marmanis, Emmanuel Baltsavias, Konrad Schindler https://arxiv.org/find/cs/1/au:+Galliani_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lanaras_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Marmanis_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Baltsavias_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schindler_K/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09470.pdf
60 | arXiv:1703.09438 Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs https://arxiv.org/pdf/1703.09438.pdf Maxim Tatarchenko, Alexey Dosovitskiy, Thomas Brox https://arxiv.org/find/cs/1/au:+Tatarchenko_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dosovitskiy_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Brox_T/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09438.pdf
61 | arXiv:1703.09436 Evaluation of Classifiers for Image Segmentation: Applications for Eucalypt Forest Inventory https://arxiv.org/pdf/1703.09436.pdf Rodrigo M. Ferreira, Ricardo M. Marcacini https://arxiv.org/find/cs/1/au:+Ferreira_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Marcacini_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09436.pdf
62 | arXiv:1703.09393 Mixture of Counting CNNs: Adaptive Integration of CNNs Specialized to Specific Appearance for Crowd Counting https://arxiv.org/pdf/1703.09393.pdf Shohei Kumagai, Kazuhiro Hotta, Takio Kurita https://arxiv.org/find/cs/1/au:+Kumagai_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hotta_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kurita_T/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09393.pdf
63 | arXiv:1703.09379 Robust Guided Image Filtering https://arxiv.org/pdf/1703.09379.pdf Wei Liu, Xiaogang Chen, Chunhua Shen, Jingyi Yu, Qiang Wu, Jie Yang https://arxiv.org/find/cs/1/au:+Liu_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shen_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yu_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wu_Q/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yang_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09379.pdf
64 | arXiv:1703.09342 Graph Regularized Tensor Sparse Coding for Image Representation https://arxiv.org/pdf/1703.09342.pdf Fei Jiang, Xiao-Yang Liu, Hongtao Lu, Ruimin Shen https://arxiv.org/find/cs/1/au:+Jiang_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lu_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shen_R/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09342.pdf
65 | arXiv:1703.09296 Femoral ROIs and Entropy for Texture-based Detection of Osteoarthritis from High-Resolution Knee Radiographs https://arxiv.org/pdf/1703.09296.pdf Jiří Hladůvka, Bui Thi Mai Phuong, Richard Ljuhar, Davul Ljuhar, Ana M Rodrigues, Jaime C Branco, Helena Canhão https://arxiv.org/find/cs/1/au:+Hlad%5Cr%7Bu%7Dvka_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Phuong_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ljuhar_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ljuhar_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Rodrigues_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Branco_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Canhao_H/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09296.pdf
66 | arXiv:1703.09245 Discriminative Transfer Learning for General Image Restoration https://arxiv.org/pdf/1703.09245.pdf Lei Xiao, Felix Heide, Wolfgang Heidrich, Bernhard Schölkopf, Michael Hirsch https://arxiv.org/find/cs/1/au:+Xiao_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Heide_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Heidrich_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Scholkopf_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hirsch_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09245.pdf
67 | arXiv:1703.09387 Adversarial Transformation Networks: Learning to Generate Adversarial Examples https://arxiv.org/pdf/1703.09387.pdf Shumeet Baluja, Ian Fischer https://arxiv.org/find/cs/1/au:+Baluja_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Fischer_I/0/1/0/all/0/1 Neural and Evolutionary Computing (cs.NE) https://arxiv.org/abs/1703.09387.pdf
68 | arXiv:1703.09370 Ensembles of Deep LSTM Learners for Activity Recognition using Wearables https://arxiv.org/pdf/1703.09370.pdf Yu Guan, Thomas Ploetz https://arxiv.org/find/cs/1/au:+Guan_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ploetz_T/0/1/0/all/0/1 Learning (cs.LG) https://arxiv.org/abs/1703.09370.pdf
69 | arXiv:1703.09211 Coherent Online Video Style Transfer https://arxiv.org/pdf/1703.09211.pdf Dongdong Chen, Jing Liao, Lu Yuan, Nenghai Yu, Gang Hua https://arxiv.org/find/cs/1/au:+Chen_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liao_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yuan_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yu_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hua_G/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09211.pdf
70 | arXiv:1703.09210 StyleBank: An Explicit Representation for Neural Image Style Transfer https://arxiv.org/pdf/1703.09210.pdf Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, Gang Hua https://arxiv.org/find/cs/1/au:+Chen_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yuan_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liao_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yu_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hua_G/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09210.pdf
71 | arXiv:1703.09200 Deep Poincare Map For Robust Medical Image Segmentation https://arxiv.org/pdf/1703.09200.pdf Yuanhan Mo, Fangde Liu, Jingqing Zhang, Guang Yang, Taigang He, Yike Guo https://arxiv.org/find/cs/1/au:+Mo_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yang_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+He_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Guo_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09200.pdf
72 | arXiv:1703.09199 Introduction To The Monogenic Signal https://arxiv.org/pdf/1703.09199.pdf Christopher P. Bridge https://arxiv.org/find/cs/1/au:+Bridge_C/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09199.pdf
73 | arXiv:1703.09179 Transfer learning for music classification and regression tasks https://arxiv.org/pdf/1703.09179.pdf Keunwoo Choi, György Fazekas, Mark Sandler, Kyunghyun Cho https://arxiv.org/find/cs/1/au:+Choi_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Fazekas_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sandler_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cho_K/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09179.pdf
74 | arXiv:1703.09167 A Study on the Extraction and Analysis of a Large Set of Eye Movement Features during Reading https://arxiv.org/pdf/1703.09167.pdf Ioannis Rigas, Lee Friedman, Oleg Komogortsev https://arxiv.org/find/cs/1/au:+Rigas_I/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Friedman_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Komogortsev_O/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09167.pdf
75 | arXiv:1703.09157 Reweighted Infrared Patch-Tensor Model With Both Non-Local and Local Priors for Single-Frame Small Target Detection https://arxiv.org/pdf/1703.09157.pdf Yimian Dai, Yiquan Wu https://arxiv.org/find/cs/1/au:+Dai_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wu_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09157.pdf
76 | arXiv:1703.09145 Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained "Hard Faces" https://arxiv.org/pdf/1703.09145.pdf Yuguang Liu, Martin D. Levine https://arxiv.org/find/cs/1/au:+Liu_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Levine_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09145.pdf
77 | arXiv:1703.09076 Active Convolution: Learning the Shape of Convolution for Image Classification https://arxiv.org/pdf/1703.09076.pdf Yunho Jeon, Junmo Kim https://arxiv.org/find/cs/1/au:+Jeon_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kim_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09076.pdf
78 | arXiv:1703.09039 Efficient Processing of Deep Neural Networks: A Tutorial and Survey https://arxiv.org/pdf/1703.09039.pdf Vivienne Sze, Yu-Hsin Chen, Tien-Ju Yang, Joel Emer https://arxiv.org/find/cs/1/au:+Sze_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yang_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Emer_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09039.pdf
79 | arXiv:1703.09026 Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video https://arxiv.org/pdf/1703.09026.pdf Davide Moltisanti, Michael Wray, Walterio Mayol-Cuevas, Dima Damen https://arxiv.org/find/cs/1/au:+Moltisanti_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wray_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mayol_Cuevas_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Damen_D/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.09026.pdf
80 | arXiv:1703.08987 Simultaneous Perception and Path Generation Using Fully Convolutional Neural Networks https://arxiv.org/pdf/1703.08987.pdf Luca Caltagirone, Mauro Bellone, Lennart Svensson, Mattias Wahde https://arxiv.org/find/cs/1/au:+Caltagirone_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bellone_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Svensson_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wahde_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08987.pdf
81 | arXiv:1703.08966 Mastering Sketching: Adversarial Augmentation for Structured Prediction https://arxiv.org/pdf/1703.08966.pdf Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa https://arxiv.org/find/cs/1/au:+Simo_Serra_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Iizuka_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ishikawa_H/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08966.pdf
82 | arXiv:1703.08961 Scaling the Scattering Transform: Deep Hybrid Networks https://arxiv.org/pdf/1703.08961.pdf Edouard Oyallon (DI-ENS), Eugene Belilovsky (CVN, GALEN), Sergey Zagoruyko (ENPC) https://arxiv.org/find/cs/1/au:+Oyallon_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Belilovsky_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zagoruyko_S/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08961.pdf
83 | arXiv:1703.08919 MIHash: Online Hashing with Mutual Information https://arxiv.org/pdf/1703.08919.pdf Fatih Cakir, Kun He, Sarah Adel Bargal, Stan Sclaroff https://arxiv.org/find/cs/1/au:+Cakir_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+He_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bargal_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sclaroff_S/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08919.pdf
84 | arXiv:1703.08917 A Visual Measure of Changes to Weighted Self-Organizing Map Patterns https://arxiv.org/pdf/1703.08917.pdf Younjin Chung, Joachim Gudmundsson, Masahiro Takatsuka https://arxiv.org/find/cs/1/au:+Chung_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gudmundsson_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Takatsuka_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08917.pdf
85 | arXiv:1703.08912 Exploiting Color Name Space for Salient Object Detection https://arxiv.org/pdf/1703.08912.pdf Jing Lou, Huan Wang, Longtao Chen, Qingyuan Xia, Wei Zhu, Mingwu Ren https://arxiv.org/find/cs/1/au:+Lou_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xia_Q/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhu_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ren_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08912.pdf
86 | arXiv:1703.08897 Transductive Zero-Shot Learning with Adaptive Structural Embedding https://arxiv.org/pdf/1703.08897.pdf Yunlong Yu, Zhong Ji, Jichang Guo, Yanwei Pang https://arxiv.org/find/cs/1/au:+Yu_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ji_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Guo_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Pang_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08897.pdf
87 | arXiv:1703.08893 Transductive Zero-Shot Learning with a Self-training dictionary approach https://arxiv.org/pdf/1703.08893.pdf Yunlong Yu, Zhong Ji, Xi Li, Jichang Guo, Zhongfei Zhang, Haibin Ling, Fei Wu https://arxiv.org/find/cs/1/au:+Yu_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ji_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Li_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Guo_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ling_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wu_F/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08893.pdf
88 | arXiv:1703.08866 Multi-View Deep Learning for Consistent Semantic Mapping with RGB-D Cameras https://arxiv.org/pdf/1703.08866.pdf Lingni Ma, Jörg Stückler, Christian Kerl, Daniel Cremers https://arxiv.org/find/cs/1/au:+Ma_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Stuckler_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kerl_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cremers_D/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08866.pdf
89 | arXiv:1703.08837 Person Re-Identification by Camera Correlation Aware Feature Augmentation https://arxiv.org/pdf/1703.08837.pdf Ying-Cong Chen, Xiatian Zhu, Wei-Shi Zheng, Jian-Huang Lai https://arxiv.org/find/cs/1/au:+Chen_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhu_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zheng_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lai_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08837.pdf
90 | arXiv:1703.08836 Learned multi-patch similarity https://arxiv.org/pdf/1703.08836.pdf Wilfried Hartmann, Silvano Galliani, Michal Havlena, Konrad Schindler, Luc Van Gool https://arxiv.org/find/cs/1/au:+Hartmann_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Galliani_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Havlena_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schindler_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gool_L/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08836.pdf
91 | arXiv:1703.08770 SCAN: Structure Correcting Adversarial Network for Chest X-rays Organ Segmentation https://arxiv.org/pdf/1703.08770.pdf Wei Dai, Joseph Doyle, Xiaodan Liang, Hao Zhang, Nanqing Dong, Yuan Li, Eric P. Xing https://arxiv.org/find/cs/1/au:+Dai_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Doyle_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liang_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dong_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xing_E/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08770.pdf
92 | arXiv:1703.08769 Open Vocabulary Scene Parsing https://arxiv.org/pdf/1703.08769.pdf Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, Antonio Torralba https://arxiv.org/find/cs/1/au:+Zhao_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Puig_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhou_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Fidler_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Torralba_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08769.pdf
93 | arXiv:1703.08764 Structured Learning of Tree Potentials in CRF for Image Segmentation https://arxiv.org/pdf/1703.08764.pdf Fayao Liu, Guosheng Lin, Ruizhi Qiao, Chunhua Shen https://arxiv.org/find/cs/1/au:+Liu_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lin_G/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Qiao_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Shen_C/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08764.pdf
94 | arXiv:1703.08738 Sketch-based Face Editing in Video Using Identity Deformation Transfer https://arxiv.org/pdf/1703.08738.pdf Long Zhao, Fangda Han, Mubbasir Kapadia, Vladimir Pavlovic, Dimitris Metaxas https://arxiv.org/find/cs/1/au:+Zhao_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Han_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kapadia_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Pavlovic_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Metaxas_D/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08738.pdf
95 | arXiv:1703.08710 Count-ception: Counting by Fully Convolutional Redundant Counting https://arxiv.org/pdf/1703.08710.pdf Joseph Paul Cohen, Henry Z. Lo, Yoshua Bengio https://arxiv.org/find/cs/1/au:+Cohen_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lo_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bengio_Y/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08710.pdf
96 | arXiv:1703.08697 Improving the Accuracy of the CogniLearn System for Cognitive Behavior Assessment https://arxiv.org/pdf/1703.08697.pdf Amir Ghaderi, Srujana Gattupalli, Dylan Ebert, Ali Sharifara, Vassilis Athitsos, Fillia Makedon https://arxiv.org/find/cs/1/au:+Ghaderi_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gattupalli_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ebert_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sharifara_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Athitsos_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Makedon_F/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08697.pdf
97 | arXiv:1703.08653 Bayesian Optimization for Refining Object Proposals https://arxiv.org/pdf/1703.08653.pdf Anthony D. Rhodes, Jordan Witte, Melanie Mitchell, Bruno Jedynak https://arxiv.org/find/cs/1/au:+Rhodes_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Witte_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mitchell_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jedynak_B/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08653.pdf
98 | arXiv:1703.08651 More is Less: A More Complicated Network with Less Inference Complexity https://arxiv.org/pdf/1703.08651.pdf Xuanyi Dong, Junshi Huang, Yi Yang, Shuicheng Yan https://arxiv.org/find/cs/1/au:+Dong_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Huang_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yan_S/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08651.pdf
99 | arXiv:1703.08628 AMAT: Medial Axis Transform for Natural Images https://arxiv.org/pdf/1703.08628.pdf Stavros Tsogkas, Sven Dickinson https://arxiv.org/find/cs/1/au:+Tsogkas_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dickinson_S/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08628.pdf
100 | arXiv:1703.08617 Temporal Non-Volume Preserving Approach to Facial Age-Progression and Age-Invariant Face Recognition https://arxiv.org/pdf/1703.08617.pdf Chi Nhan Duong, Kha Gia Quach, Khoa Luu, T. Hoang Ngan le, Marios Savvides https://arxiv.org/find/cs/1/au:+Duong_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Quach_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Luu_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+le_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Savvides_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08617.pdf
101 | arXiv:1703.08603 Adversarial Examples for Semantic Segmentation and Object Detection https://arxiv.org/pdf/1703.08603.pdf Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, Alan Yuille https://arxiv.org/find/cs/1/au:+Xie_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhou_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xie_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yuille_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08603.pdf
102 | arXiv:1703.08580 Deep Residual Learning for Instrument Segmentation in Robotic Surgery https://arxiv.org/pdf/1703.08580.pdf Daniil Pakhomov, Vittal Premachandran, Max Allan, Mahdi Azizian, Nassir Navab https://arxiv.org/find/cs/1/au:+Pakhomov_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Premachandran_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Allan_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Azizian_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Navab_N/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08580.pdf
103 | arXiv:1703.09161 A Dynamic Programming Solution to Bounded Dejittering Problems https://arxiv.org/pdf/1703.09161.pdf Lukas F. Lang https://arxiv.org/find/math/1/au:+Lang_L/0/1/0/all/0/1 Optimization and Control (math.OC) https://arxiv.org/abs/1703.09161.pdf
104 | arXiv:1703.09137 Where to put the Image in an Image Caption Generator https://arxiv.org/pdf/1703.09137.pdf Marc Tanti (1), Albert Gatt (1), Kenneth P. Camilleri (1) ((1) University of Malta) https://arxiv.org/find/cs/1/au:+Tanti_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gatt_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Camilleri_K/0/1/0/all/0/1 Neural and Evolutionary Computing (cs.NE) https://arxiv.org/abs/1703.09137.pdf
105 | arXiv:1703.08840 Inferring The Latent Structure of Human Decision-Making from Raw Visual Inputs https://arxiv.org/pdf/1703.08840.pdf Yunzhu Li, Jiaming Song, Stefano Ermon https://arxiv.org/find/cs/1/au:+Li_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Song_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ermon_S/0/1/0/all/0/1 Learning (cs.LG) https://arxiv.org/abs/1703.08840.pdf
106 | arXiv:1703.08774 Who Said What: Modeling Individual Labelers Improves Classification https://arxiv.org/pdf/1703.08774.pdf Melody Y. Guan, Varun Gulshan, Andrew M. Dai, Geoffrey E. Hinton https://arxiv.org/find/cs/1/au:+Guan_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gulshan_V/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Dai_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Hinton_G/0/1/0/all/0/1 Learning (cs.LG) https://arxiv.org/abs/1703.08774.pdf
107 | arXiv:1703.08772 Multivariate Regression with Gross Errors on Manifold-valued Data https://arxiv.org/pdf/1703.08772.pdf Xiaowei Zhang, Xudong Shi, Yu Sun, Li Cheng https://arxiv.org/find/stat/1/au:+Zhang_X/0/1/0/all/0/1,https://arxiv.org/find/stat/1/au:+Shi_X/0/1/0/all/0/1,https://arxiv.org/find/stat/1/au:+Sun_Y/0/1/0/all/0/1,https://arxiv.org/find/stat/1/au:+Cheng_L/0/1/0/all/0/1 Machine Learning (stat.ML) https://arxiv.org/abs/1703.08772.pdf
108 | arXiv:1703.08516 Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer https://arxiv.org/pdf/1703.08516.pdf Martin Vallières (1), Emily Kay-Rivest (2), Léo Jean Perrin (3), Xavier Liem (4), Christophe Furstoss (5), Hugo J. W. L. Aerts (6), Nader Khaouam (5), Phuc Felix Nguyen-Tan (4), Chang-Shu Wang (3), Khalil Sultanem (2), Jan Seuntjens (1), Issam El Naqa (7) ((1) Medical Physics Unit, McGill University, Montréal, Canada, (2) Radiation Oncology Division, Hôpital général juif, Montréal, Canada, (3) Department of Radiation Oncology, Centre hospitalier universitaire de Sherbrooke, Montréal, Canada, (4) Department of Radiation Oncology, Centre hospitalier de l'Université de Montréal, Montréal, Canada, (5) Department of Radiation Oncology, Hôpital Maisonneuve-Rosemont, Montréal, Canada, (6) Departments of Radiation Oncology & Radiology, Dana-Farber Cancer Institute, Boston, USA, (7) Department of Radiation Oncology, Physics Division, University of Michigan, Ann Arbor, USA) https://arxiv.org/find/cs/1/au:+Vallieres_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kay_Rivest_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Perrin_L/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liem_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Furstoss_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Aerts_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Khaouam_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Nguyen_Tan_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sultanem_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Seuntjens_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Naqa_I/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08516.pdf
109 | arXiv:1703.08497 Local Deep Neural Networks for Age and Gender Classification https://arxiv.org/pdf/1703.08497.pdf Zukang Liao, Stavros Petridis, Maja Pantic https://arxiv.org/find/cs/1/au:+Liao_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Petridis_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Pantic_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08497.pdf
110 | arXiv:1703.08493 Multi-stage Multi-recursive-input Fully Convolutional Networks for Neuronal Boundary Detection https://arxiv.org/pdf/1703.08493.pdf Wei Shen, Bin Wang, Yuan Jiang, Yan Wang, Alan Yuille https://arxiv.org/find/cs/1/au:+Shen_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_B/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Jiang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yuille_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08493.pdf
111 | arXiv:1703.08492 Content-Based Image Retrieval Based on Late Fusion of Binary and Local Descriptors https://arxiv.org/pdf/1703.08492.pdf Nouman Ali, Danish Ali Mazhar, Zeshan Iqbal, Rehan Ashraf, Jawad Ahmed, Farrukh Zeeshan Khan https://arxiv.org/find/cs/1/au:+Ali_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mazhar_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Iqbal_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ashraf_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ahmed_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Khan_F/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08492.pdf
112 | arXiv:1703.08472 Medical Image Retrieval using Deep Convolutional Neural Network https://arxiv.org/pdf/1703.08472.pdf Adnan Qayyum, Syed Muhammad Anwar, Muhammad Awais, Muhammad Majid https://arxiv.org/find/cs/1/au:+Qayyum_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Anwar_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Awais_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Majid_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08472.pdf
113 | arXiv:1703.08448 Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach https://arxiv.org/pdf/1703.08448.pdf Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan https://arxiv.org/find/cs/1/au:+Wei_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Feng_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liang_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Cheng_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhao_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yan_S/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08448.pdf
114 | arXiv:1703.08388 DeepVisage: Making face recognition simple yet with powerful generalization skills https://arxiv.org/pdf/1703.08388.pdf Abul Hasnat, Julien Bohné, Stéphane Gentric, Liming Chen https://arxiv.org/find/cs/1/au:+Hasnat_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bohne_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gentric_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Chen_L/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08388.pdf
115 | arXiv:1703.08378 Feature Fusion using Extended Jaccard Graph and Stochastic Gradient Descent for Robot https://arxiv.org/pdf/1703.08378.pdf Shenglan Liu, Muxin Sun, Wei Wang, Feilong Wang https://arxiv.org/find/cs/1/au:+Liu_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Sun_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_F/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08378.pdf
116 | arXiv:1703.08366 A Hybrid Deep Learning Approach for Texture Analysis https://arxiv.org/pdf/1703.08366.pdf Hussein Adly, Mohamed Moustafa https://arxiv.org/find/cs/1/au:+Adly_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moustafa_M/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08366.pdf
117 | arXiv:1703.08359 Scalable Person Re-identification on Supervised Smoothed Manifold https://arxiv.org/pdf/1703.08359.pdf Song Bai, Xiang Bai, Qi Tian https://arxiv.org/find/cs/1/au:+Bai_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bai_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Tian_Q/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08359.pdf
118 | arXiv:1703.08338 Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition https://arxiv.org/pdf/1703.08338.pdf Michael Wray, Davide Moltisanti, Walterio Mayol-Cuevas, Dima Damen https://arxiv.org/find/cs/1/au:+Wray_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moltisanti_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mayol_Cuevas_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Damen_D/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08338.pdf
119 | arXiv:1703.08289 Deep Direct Regression for Multi-Oriented Scene Text Detection https://arxiv.org/pdf/1703.08289.pdf Wenhao He, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu https://arxiv.org/find/cs/1/au:+He_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_X/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yin_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Liu_C/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08289.pdf
120 | arXiv:1703.08274 View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data https://arxiv.org/pdf/1703.08274.pdf Pengfei Zhang, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jianru Xue, Nanning Zheng https://arxiv.org/find/cs/1/au:+Zhang_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Lan_C/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xing_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zeng_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Xue_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zheng_N/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08274.pdf
121 | arXiv:1703.08238 Semi-Automatic Segmentation and Ultrasonic Characterization of Solid Breast Lesions https://arxiv.org/pdf/1703.08238.pdf Mohammad Saad Billah, Tahmida Binte Mahmud https://arxiv.org/find/cs/1/au:+Billah_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mahmud_T/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08238.pdf
122 | arXiv:1703.08173 Single Image Super-resolution with a Parameter Economic Residual-like Convolutional Neural Network https://arxiv.org/pdf/1703.08173.pdf Yudong Liang, Ze Yang, Kai Zhang, Yihui He, Jinjun Wang, Nanning Zheng https://arxiv.org/find/cs/1/au:+Liang_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Yang_Z/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_K/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+He_Y/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Wang_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zheng_N/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.08173.pdf
123 | arXiv:1703.08245 On the Robustness of Convolutional Neural Networks to Internal Architecture and Weight Perturbations https://arxiv.org/pdf/1703.08245.pdf Nicholas Cheney, Martin Schrimpf, Gabriel Kreiman https://arxiv.org/find/cs/1/au:+Cheney_N/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Schrimpf_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kreiman_G/0/1/0/all/0/1 Learning (cs.LG) https://arxiv.org/abs/1703.08245.pdf
124 |
--------------------------------------------------------------------------------
/papers/pdfs/cs_cv/2017-04-03/summary.csv:
--------------------------------------------------------------------------------
1 | arXiv:1703.10593 Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks https://arxiv.org/pdf/1703.10593.pdf Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros https://arxiv.org/find/cs/1/au:+Zhu_J/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Park_T/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Isola_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Efros_A/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10593.pdf
2 | arXiv:1703.10584 Geometric Affordances from a Single Example via the Interaction Tensor https://arxiv.org/pdf/1703.10584.pdf Eduardo Ruiz, Walterio Mayol-Cuevas https://arxiv.org/find/cs/1/au:+Ruiz_E/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Mayol_Cuevas_W/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10584.pdf
3 | arXiv:1703.10580 MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction https://arxiv.org/pdf/1703.10580.pdf Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, Christian Theobalt https://arxiv.org/find/cs/1/au:+Tewari_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zollhofer_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kim_H/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Garrido_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Bernard_F/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Perez_P/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Theobalt_C/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10580.pdf
4 | arXiv:1703.10571 Bootstrapping Labelled Dataset Construction for Cow Tracking and Behavior Analysis https://arxiv.org/pdf/1703.10571.pdf Aram Ter-Sarkisov, Robert Ross, John Kelleher https://arxiv.org/find/cs/1/au:+Ter_Sarkisov_A/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Ross_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Kelleher_J/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10571.pdf
5 | arXiv:1703.10553 Learning Convolutional Networks for Content-weighted Image Compression https://arxiv.org/pdf/1703.10553.pdf Mu Li, Wangmeng Zuo, Shuhang Gu, Debin Zhao, David Zhang https://arxiv.org/find/cs/1/au:+Li_M/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zuo_W/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Gu_S/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhao_D/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Zhang_D/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.10553.pdf
6 |
--------------------------------------------------------------------------------
/send_email/README.md:
--------------------------------------------------------------------------------
1 | **How to send email in Python**
2 |
3 | [email-examples](https://docs.python.org/2/library/email-examples.html)
4 |
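A minimal sketch following those docs (the host, credentials, and addresses below are placeholders):

```python
import smtplib
from email.mime.text import MIMEText

msg = MIMEText('Hello from Python', 'plain', 'utf-8')
msg['Subject'] = 'Test'
msg['From'] = 'sender@example.com'
msg['To'] = 'receiver@example.com'

# Connect over SSL, authenticate, and send.
server = smtplib.SMTP_SSL('smtp.example.com')
server.login('sender@example.com', 'password')
server.sendmail(msg['From'], [msg['To']], msg.as_string())
server.quit()
```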
--------------------------------------------------------------------------------
/send_email/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/burness/arxiv_tools/0e3fe1bbd4cb26a4f1b5266c32e5b8e24d866c81/send_email/__init__.py
--------------------------------------------------------------------------------
/send_email/send_email.py:
--------------------------------------------------------------------------------
1 | # -*- coding: UTF-8 -*-
2 | import smtplib
3 | from email.mime.text import MIMEText
4 | from email.header import Header
5 | import logging
6 | logger = logging.getLogger(name='arxiv_tools')
7 | import time
8 | import requests
9 | # sh = logging.StreamHandler(stream=None)
10 | # logger.addHandler(sh)
11 |
12 | # Third-party SMTP service
13 | class SendEmail(object):
14 |     def __init__(self, mail_host, mail_user, mail_pass, area_week_file):
15 |         config = {}
16 |         config['mail_host'] = mail_host
17 |         config['mail_user'] = mail_user
18 |         config['mail_pass'] = mail_pass
19 |         self._set_config(**config)
20 |         self.area_week_file = area_week_file
21 |
22 |     def _set_config(self, **config):
23 |         self.mail_host = config['mail_host']
24 |         self.mail_user = config['mail_user']
25 |         self.mail_pass = config['mail_pass']
26 |
27 |     def set_sender(self, sender_email):
28 |         self.sender_email = sender_email
29 |
30 |     def set_receivers(self, receivers_email):
31 |         self.receivers_email = receivers_email
32 |
33 |     def get_daily_sentence(self):
34 |         try:
35 |             result = requests.get('http://open.iciba.com/dsapi/').json()
36 |             daily = result['note']
37 |         except Exception:
38 |             daily = 'Do Better Every Day'
39 |         return daily
40 |
41 |     def _format_text_html(self):
42 |         '''
43 |         return the html text of the area_week_file
44 |         '''
45 |         # TODO: How to format the html text from the area_week file
46 |
47 |         with open(self.area_week_file, 'r') as fread:
48 |             # derive the arXiv area (e.g. 'cs_cl') from the summary file path
49 |             area = self.area_week_file.split('/')[3]
50 |             print(area)
51 |             today = time.strftime('%Y-%m-%d', time.localtime(time.time()))
52 |             self.message_text = ''  # TODO: HTML body (original lines 53-73 are truncated in this copy)
74 |
75 |     def _format_head(self):
76 |         self.message = MIMEText(self.message_text, 'html', 'utf-8')
77 |         self.message['From'] = self.sender_email
78 |         self.message['To'] = ','.join(self.receivers_email)
79 |         subject = 'Arxiv Papers'
80 |         self.message['Subject'] = Header(subject, 'utf-8')
81 |
82 |     def send(self):
83 |         # try:
84 |         print('before format_text_html')
85 |         self._format_text_html()
86 |         print('message_text {0}'.format(self.message_text))
87 |         logger.info('message_text {0}'.format(self.message_text))
88 |         self._format_head()
89 |         smtpObj = smtplib.SMTP_SSL(self.mail_host)
90 |         logger.info('Trying Connect')
91 |         print('Trying Connect')
92 |         # logger.info('Connect Successfully')
93 |         smtpObj.login(self.mail_user, self.mail_pass)
94 |         logger.info('Login Successfully')
95 |         print('Login Successfully')
96 |         smtpObj.sendmail(self.message['From'], self.receivers_email, self.message.as_string())
97 |         print('send email successful')
98 |         # except Exception as e:
99 |         #     print('Error during send email')
100 |         #     print(str(e))
101 |
102 | if __name__ == '__main__':
103 |     mail_host = 'smtp.qq.com'
104 |     mail_user = '363544964@qq.com'
105 |     mail_pass = 'xxxxxxxxx'
106 |     send_email = SendEmail(mail_host=mail_host, mail_user=mail_user, mail_pass=mail_pass, area_week_file='../papers/pdfs/cs_cl/2017-04-03/summary.csv')
107 |     send_email.set_sender(sender_email='363544964@qq.com')
108 |     send_email.set_receivers(receivers_email=['dss_1990@sina.com','burness1990@163.com'])
109 |     send_email.send()
110 |
111 |
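The TODO in `_format_text_html` is left unimplemented above; here is a minimal sketch of one way to fill it, assuming the summary.csv rows are tab-separated with id, title, PDF link, and authors in the first four columns (the column layout is a guess, not confirmed by the repo):

    import time

    def format_text_html(area_week_file):
        # Build a simple HTML list from one summary.csv produced by the spider.
        rows = []
        with open(area_week_file, 'r') as fread:
            for line in fread:
                cols = line.rstrip('\n').split('\t')
                if len(cols) < 4:
                    continue  # skip blank or malformed rows
                # cols[1] = title, cols[2] = pdf url, cols[3] = authors (assumed)
                rows.append('<li><a href="{0}">{1}</a>: {2}</li>'.format(cols[2], cols[1], cols[3]))
        today = time.strftime('%Y-%m-%d', time.localtime(time.time()))
        return '<html><body><h3>Arxiv papers {0}</h3><ul>{1}</ul></body></html>'.format(today, ''.join(rows))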
--------------------------------------------------------------------------------
/send_email/temp:
--------------------------------------------------------------------------------
1 | Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources - Adrian Bulat, Georgios Tzimiropoulos
2 | Araguaia Medical Vision Lab at ISIC 2017 Skin Lesion Classification Challenge - Rafael Teixeira Sousa, Larissa Vasconcellos de Moraes
3 | Unsupervised Image-to-Image Translation Networks - Ming-Yu Liu, Thomas Breuel, Jan Kautz
4 | Towards CNN Map Compression for camera relocalisation - Luis Contreras, Walterio Mayol-Cuevas
5 | Face Image Reconstruction from Deep Templates - Guangcan Mai, Kai Cao, Pong C. Yuen, Anil K. Jain
6 | Robust Spatial Filtering with Graph Convolutional Neural Networks - Felipe Petroski Such, Shagan Sah, Miguel Dominguez, Suhas Pillai, Chao Zhang, Andrew Michael, Nathan Cahill, Raymond Ptucha
7 | Attentive Recurrent Comparators - Pranav Shyam, Shubham Gupta, Ambedkar Dukkipati
8 | BoxCars: Improving Vehicle Fine-Grained Recognition using 3D Bounding Boxes in Traffic Surveillance - Jakub Sochor, Jakub Špaňhel, Adam Herout
--------------------------------------------------------------------------------
/send_email/test:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/burness/arxiv_tools/0e3fe1bbd4cb26a4f1b5266c32e5b8e24d866c81/send_email/test
--------------------------------------------------------------------------------
/spider/1703.00686.txt:
--------------------------------------------------------------------------------
1 | Under review in IJCV manuscript No.
2 | (will be inserted by the editor)
3 |
4 | BoxCars: Improving Vehicle Fine-Grained Recognition using
5 | 3D Bounding Boxes in Traffic Surveillance
6 |
7 | Jakub Sochor · Jakub Špaňhel · Adam Herout
8 |
9 | arXiv:1703.00686v1 [cs.CV] 2 Mar 2017
10 |
57 | Received: date / Accepted: date
58 |
59 | Abstract In this paper, we focus on fine-grained recog-
60 | nition of vehicles mainly in traffic surveillance applica-
61 | tions. We propose an approach orthogonal to recent ad-
62 | vancement in fine-grained recognition (automatic part
63 | discovery, bilinear pooling). Also, in contrast to other
64 | methods focused on fine-grained recognition of vehicles,
65 | we do not limit ourselves to frontal/rear viewpoint but
66 | allow the vehicles to be seen from any viewpoint. Our
67 | approach is based on 3D bounding boxes built around
68 | the vehicles. The bounding box can be automatically
69 | constructed from traffic surveillance data. For scenarios
70 | where it is not possible to use the precise construction,
71 | we propose a method for estimation of the 3D bounding
72 | box. The 3D bounding box is used to normalize the im-
73 | age viewpoint by “unpacking” the image into a plane. We
74 | also propose to randomly alter the color of the image
75 | and add a rectangle with random noise at a random posi-
76 | tion in the image when training Convolutional Neural
77 | Networks. We have collected a large fine-grained vehi-
78 | cle dataset BoxCars116k, with 116k images of vehicles
79 | from various viewpoints taken by numerous surveillance
80 | cameras. We performed a number of experiments which
81 | show that our proposed method significantly improves
82 | CNN classification accuracy (the accuracy is increased
83 | by up to 12 percent points and the error is reduced by
84 | up to 50 % compared to CNNs without the proposed
85 | modifications). We also show that our method outper-
86 | forms state-of-the-art methods for fine-grained recogni-
87 | tion.
88 |
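A rough sketch of the augmentation described above (random color alteration plus a random-noise rectangle); the value ranges and rectangle sizes below are assumptions, not the paper's exact parameters:

    import numpy as np

    def augment(image, rng=np.random):
        # image: HxWx3 uint8 array
        out = image.astype(np.float32)
        # randomly alter the color: per-channel scaling around 1.0 (range is a guess)
        out = np.clip(out * rng.uniform(0.7, 1.3, size=(1, 1, 3)), 0, 255).astype(np.uint8)
        # paste a rectangle of random noise at a random position (size range is a guess)
        h, w = out.shape[:2]
        rh = rng.randint(h // 8, h // 3)
        rw = rng.randint(w // 8, w // 3)
        y = rng.randint(0, h - rh + 1)
        x = rng.randint(0, w - rw + 1)
        out[y:y + rh, x:x + rw] = rng.randint(0, 256, size=(rh, rw, 3)).astype(np.uint8)
        return out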
89 | Graph@FIT, Centre of Excellence IT4Innovations, Brno Uni-
90 | versity of Technology.
91 | Brno, Czech Republic
92 | Tel.: +420 54114-1414
93 | E-mail: {isochor,herout}@fit.vutbr.cz
94 | Jakub Sochor is a Brno Ph.D. Talent Scholarship Holder —
95 | Funded by the Brno City Municipality.
96 |
97 | Fig. 1 Example of automatically obtained 3D bounding box
98 | used for fine-grained vehicle classification. Top left: vehicle
99 | with 2D bounding box annotation, top right: estimated con-
100 | tour, bottom left: estimated directions to vanishing points,
101 | bottom right: 3D bounding box automatically obtained
102 | from surveillance video (green) and our estimated 3D bound-
103 | ing box (red).
104 |
105 | 1 Introduction
106 |
107 | Fine-grained recognition of vehicles is interesting both
108 | from the application point of view (surveillance, data
109 | retrieval, etc.) and from the point of view of research
110 | of general fine-grained recognition applicable in other
111 | fields. For example, Gebru et al (2017) proposed esti-
112 | mation of demographic statistics based on fine-grained
113 | recognition of vehicles. In this article, we are presenting
114 | a methodology which considerably increases the perfor-
115 | mance of multiple state-of-the-art CNN architectures in
116 | the task of fine-grained vehicle recognition. We target
117 |
118 | 2
119 |
120 | Jakub Sochor et al.
121 |
122 | the traffic surveillance context, namely images of vehi-
123 | cles taken from an arbitrary viewpoint – we do not
124 | limit ourselves to frontal/rear viewpoints. As the im-
125 | ages are obtained from surveillance cameras, they have
126 | challenging properties – they are often small and taken
127 | from very general viewpoints (high elevation). Also, we
128 | construct the training and testing sets from images from
129 | different cameras as it is common for surveillance ap-
130 | plications that it is not known a priori under which
131 | viewpoint the camera will be observing the road.
132 |
133 | Methods focused on fine-grained recognition of vehi-
134 | cles usually have some limitations – they can be limited
135 | to frontal/rear viewpoint or use 3D CAD models of all
136 | the vehicles. Both these limitations are rather impracti-
137 | cal for large scale deployment. There are also methods
138 | for fine-grained recognition in general which were ap-
139 | plied on vehicles. The methods recently follow several
140 | main directions – automatic discovery of parts (Krause
141 | et al, 2015; Simon and Rodner, 2015), bilinear pooling
142 | (Lin et al, 2015b; Gao et al, 2016), or exploiting struc-
143 | ture of fine-grained labels (Xie et al, 2015; Zhou and
144 | Lin, 2016). Our method is not limited to any particular
144 | viewpoint and it does not require 3D models of vehicles
146 | at all.
147 |
148 | We propose an orthogonal approach to these meth-
149 | ods and use CNNs with modified input to achieve better
150 | image normalization and data augmentation (therefore,
151 | our approach can be combined with other methods).
152 | We use 3D bounding boxes around vehicles to normal-
153 | ize vehicle image, see Figure 3 for examples. This work
154 | is based on our previous conference paper (Sochor et al,
155 | 2016a); it pushes the performance further and mainly
156 | we propose a new method to build the 3D bound-
157 | ing box without any prior knowledge, see Figure 1. Our
158 | input modifications are able to significantly increase the
159 | classification accuracy (up to 12 percent points, classi-
160 | fication error is reduced by up to 50 %).
161 |
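The viewpoint normalization can be sketched as a perspective warp of each visible face of the 3D bounding box to an axis-aligned rectangle; OpenCV is my choice here, and the corner ordering and output size are assumptions rather than the paper's exact setup:

    import cv2
    import numpy as np

    def unpack_face(image, corners, out_size=(96, 96)):
        # corners: four (x, y) image points of one visible box face,
        # ordered top-left, top-right, bottom-right, bottom-left
        w, h = out_size
        src = np.asarray(corners, dtype=np.float32)
        dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
        M = cv2.getPerspectiveTransform(src, dst)
        return cv2.warpPerspective(image, M, (w, h))

    # "unpacking": warp e.g. the front, side, and roof faces this way
    # and tile them into one normalized training image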
162 | The key contributions of the paper are:
163 |
164 | – Complex and thorough evaluation of our previous
165 |
166 | method (Sochor et al, 2016a).
167 |
168 | – Our novel data augmentation techniques further im-
169 | prove the results of the fine-grained recognition of
170 | vehicles relative both to our previous method and
171 | other state-of-the-art methods (Section 3.3).
172 |
173 | – We remove the requirement of the previous method
174 | (Sochor et al, 2016a) to know the 3D bounding box
175 | by estimating the bounding box both at training
176 | and test time (Section 3.4).
177 |
178 | – We collected more samples to the BoxCars dataset,
179 | increasing the dataset size almost twice, see Sec-
180 | tion 4.
181 |
182 | We make the collected dataset and source codes for
183 | the proposed algorithm publicly available1 for future
184 | reference and comparison.
185 |
186 | 2 Related Work
187 |
188 | To provide context to the proposed method, we provide
189 | a summary of existing fine-grained recognition methods
190 | (both general and focused on vehicles). We also briefly
191 | describe recent advancements in Convolutional Neural
192 | Networks.
193 |
194 | 2.1 General Fine-Grained Object Recognition
195 |
196 | We divide the fine-grained recognition methods from
197 | recent literature into several categories as they usually
198 | share some common traits. Methods exploiting anno-
199 | tated model parts (Parkhi et al, 2012; Liu et al, 2012;
200 | Göring et al, 2014; Zhang et al, 2013; Chai et al, 2013;
201 | Zhang et al, 2014, 2016a; Huang et al, 2016; Zhang et al,
202 | 2016b) are not discussed in detail as it is not common
203 | in fine-grained datasets of vehicles to have the parts
204 | annotated.
205 |
206 | 2.1.1 Automatic Part Discovery
207 |
208 | Parts of classified objects may be discriminatory and
209 | provide lots of information for the fine-grained clas-
210 | sification task. However, it is not practical to assume
211 | that the location of such parts is known a priori as
212 | it requires significantly more annotation work. There-
213 | fore, several papers have dealt with this problem and
214 | proposed methods how to automatically (during both
215 | training and test time) discover and localize such parts.
216 | The methods differ mainly in the approach used for
217 | the discovery. The features of the parts are usually clas-
218 | sified by SVMs (Yang et al, 2012; Duan et al, 2012; Yao,
219 | 2012; Simon and Rodner, 2015; Krause et al, 2015).
220 |
221 | Yang et al (2012) propose to use discriminative tem-
222 | plates based on template-image similarity with learnt
223 | co-occurrences to detect different common parts of clas-
224 | sified objects. Duan et al (2012) propose discovery of
225 | discriminative parts by optimization formulated as la-
226 | tent Conditional Random Field on hierarchical segmen-
227 | tation of the images. Krause et al (2014, 2015) use au-
228 | tomatic discovery of parts using pose aligned images
229 | based on nearest neighbors of features (HOG or conv4
230 | CNN activations) and select for the pose aligned clus-
231 | ters the parts which have discriminative power. Simon
232 | and Rodner (2015) use deep neural activation maps to
233 |
234 | 1 https://medusa.fit.vutbr.cz/traffic
235 |
240 | detect parts of objects which are used to build a star
241 | shape constellation model which is used for classifica-
242 | tion to fine-grained categories. Zhang et al (2016c) pro-
243 | pose to iteratively train and pick deep filters which cor-
244 | respond to parts.
245 |
246 | 2.1.2 Methods using Bilinear Pooling
247 |
248 | Lin et al (2015b) use only convolutional layers from
249 | the net for extraction of features which are classified
250 | by bilinear classifier (Pirsiavash et al, 2009). Gao et al
251 | (2016) followed the path of bilinear pooling and pro-
252 | posed a method for Compact Bilinear Pooling getting
253 | the same accuracy as the full bilinear pooling with a
254 | significantly lower number of features.
255 |
256 | 2.1.3 Other Methods
257 |
258 | Xie et al (2015) proposed to use hyper-class for data
259 | augmentation and regularization of fine-grained deep
260 | learning. Zhou and Lin (2016) use CNN with Bipartite
261 | Graph Labeling to achieve better accuracy by exploit-
262 | ing the fine-grained annotations and coarse body type
263 | (e.g. Sedan, SUV).
264 |
265 | Lin et al (2015a) use three neural networks for si-
266 | multaneous localization, alignment and classification of
267 | images. Each of these three networks does one of the
268 | three tasks and they are connected into one bigger net-
269 | work.
270 |
271 | Yao (2012) proposed an approach which is using re-
272 | sponses to random templates obtained from images and
273 | classify merged representations of the response maps by
274 | SVM. Zhang et al (2012) use pose normalization kernels
275 | and their responses warped into a feature vector.
276 |
277 | Chai et al (2012) propose to use segmentation for
278 | fine-grained recognition to obtain foreground parts of
279 | image. Similar approach was also proposed by Li et al
280 | (2015); however, the authors use a segmentation algo-
281 | rithm which is optimized and fine-tuned for the pur-
282 | pose of fine-grained recognition. Finally, Gavves et al
283 | (2015) propose to use object proposals to obtain fore-
284 | ground mask and unsupervised alignment to improve
285 | fine-grained classification accuracy.
286 |
287 | 2.2 Fine-Grained Recognition of Vehicles
288 |
289 | The goal of fine-grained recognition of vehicles is to
290 | identify the exact type of the vehicle, that is its make,
291 | model, submodel, and model year. The recognition sys-
292 | tem focused only on vehicles (in relation to general fine-
293 | grained classification of birds, dogs, etc.) can benefit
294 |
295 | from the fact that the vehicles are rigid, have some distinguish-
296 | able landmarks (e.g. license plates), and rigorous mo-
297 | dels (e.g. 3D CAD models) can be available.
298 |
299 | 2.2.1 Methods Limited to Frontal/Rear Images of
300 | Vehicles
301 |
302 | There is a multitude of papers (Petrovic and Cootes,
303 | 2004; Dlagnekov and Belongie, 2005; Clady et al, 2008;
304 | Pearce and Pears, 2011; Psyllos et al, 2011; Lee et al,
305 | 2013; Zhang, 2013; Llorca et al, 2014) using a common
306 | approach: they detect the license plate (as a common
307 | landmark) on the vehicle and extract features from the
308 | area around the license plate as the front/rear parts of
309 | vehicles are usually discriminative.
310 |
311 | There are also papers (Zhang, 2014; Hsieh et al,
312 | 2014; Hu et al, 2015; Liao et al, 2015; Baran et al, 2015;
313 | He et al, 2015) directly extracting features from frontal
314 | images of vehicles by different methods and optionally
315 | exploiting standard structure of parts on the frontal
316 | mask of car (e.g. headlights).
317 |
318 | 2.2.2 Methods based on 3D CAD Models
319 |
320 | There were several approaches how to deal with view-
321 | point variance using synthetic 3D models of vehicles.
322 | Lin et al (2014) propose to jointly optimize 3D model
323 | fitting and fine-grained classification, Hsiao et al (2014)
324 | use detected contour and align the 3D model using 3D
325 | chamfer matching. Krause et al (2013) propose to use
326 | synthetic data to train geometry and viewpoint classi-
327 | fiers for 3D model and 2D image alignment. Prokaj and
328 | Medioni (2009) propose to detect SIFT features on the
329 | vehicle image and on every 3D model seen from a set
330 | of discretized viewpoints.
331 |
332 | 2.2.3 Other Methods
333 |
334 | Gu and Lee (2013) propose to extract center of vehicle
335 | and roughly estimate the viewpoint from the bounding
336 | box aspect ratio. Then, they use different Active Shape
337 | Models for alignment in different viewpoints and use
338 | segmentation for background removal.
339 |
340 | Stark et al (2012) propose to use an extension of
341 | DPM (Felzenszwalb et al, 2010) to be able to handle
342 | multi-class recognition. The model is represented by
343 | latent linear multi-class SVM with HOG (Dalal and
344 | Triggs, 2005) features. The authors show that the sys-
345 | tem outperforms different methods based on LLC (Wang
346 | et al, 2010) and HOG. The recognized vehicles are used
347 | for eye-level camera calibration.
348 |
Liu et al (2016a) use a deep relative distance trained on the vehicle re-identification task and propose to train the neural net with the Coupled Clusters Loss instead of the triplet loss.
358 |
Boonsim and Prakoonwit (2016) propose a method for fine-grained recognition of vehicles at night. The authors use the relative position and shape of features visible at night (e.g. lights, license plates) to identify the make & model of a vehicle visible from the rear side.
365 |
Fang et al (2016) propose an approach based on detected parts. The parts are obtained in an unsupervised manner as high responses in the mean response across channels of the last convolutional layer of the used CNN.
371 |
372 | 2.2.4 Summary of Existing Methods
373 |
Existing methods for fine-grained classification of vehicles usually have significant limitations. They are either limited to frontal/rear viewpoints (Petrovic and Cootes, 2004; Dlagnekov and Belongie, 2005; Clady et al, 2008; Pearce and Pears, 2011; Psyllos et al, 2011; Lee et al, 2013; Zhang, 2013; Llorca et al, 2014; Zhang, 2014; Hsieh et al, 2014; Hu et al, 2015; Liao et al, 2015; Baran et al, 2015; He et al, 2015), or they require some knowledge about the 3D models of the vehicles (Prokaj and Medioni, 2009; Krause et al, 2013; Hsiao et al, 2014; Lin et al, 2014), which can be impractical when new models of vehicles emerge.
386 |
Our proposed method does not have such limitations. It works with arbitrary viewpoints and requires only the 3D bounding boxes of the vehicles. The 3D bounding boxes can either be automatically constructed from traffic video surveillance data (Dubská et al, 2014, 2015), or we propose a method for estimating the 3D bounding boxes both at training and test time (see Section 3.4).
395 |
396 | 2.3 Deep Convolutional Neural Networks
397 |
As our method exploits Convolutional Neural Networks (CNNs), we provide a brief summary of recent advancements in this area. The first version of Convolutional Neural Networks was proposed by LeCun et al (1998). Recently, CNNs have received much more attention, thanks to the paper by Krizhevsky et al (2012). Since then, the performance on ImageNet has been significantly improved by larger and deeper variants of CNNs (Simonyan and Zisserman, 2014; He et al, 2016).
407 |
Fig. 2 Example of a 3D bounding box and its unpacked version.

Recently, authors have also used input normalization (Taigman et al, 2014) and additional training data to improve the performance of CNNs. Also, parts of a CNN can be viewed as feature extractors and independently reused. Such trained feature extractors outperform hand-crafted features (Bluche et al, 2013; Taigman et al, 2014).
419 |
Deep Convolutional Neural Networks have also been used for fine-grained recognition. Xiao et al (2015) proposed to use two nets – one for object-level classification and the second one for part-level classification. Yang et al (2015) used CNNs for fine-grained recognition of vehicles.
426 |
427 | 3 Proposed Methodology for Fine-Grained
428 | Recognition of Vehicles
429 |
In agreement with the recent progress in Convolutional Neural Networks (Taigman et al, 2014; Krizhevsky et al, 2012; Chatfield et al, 2014), we use CNNs for both classification and verification. However, we propose several data normalization and augmentation techniques which significantly boost the classification performance (up to ∼50 % error reduction compared to the base net). We utilize information about the 3D bounding boxes obtained from the traffic surveillance camera (Dubská et al, 2014). Furthermore, we show data augmentation techniques which increase the performance and are applicable in general. Finally, to increase the applicability of our method to scenarios where the 3D bounding box is not known, we propose an algorithm for bounding box estimation both at training and test time.
445 |
446 | 3.1 Image Normalization by Unpacking the 3D
447 | Bounding Box
448 |
We base our work on the 3D bounding boxes proposed by Dubská et al (2014) (Fig. 3), which can be automatically obtained for each vehicle seen by a surveillance camera (see the original paper for further details). These boxes allow us to identify the side, roof, and front (or rear) side of the vehicle, in addition to other information about it. We use these localized segments to normalize the image of the observed vehicles (considerably boosting the recognition performance).
459 |
Fig. 3 Examples of data normalization and auxiliary data fed to the nets. Left to right: vehicle with 2D bounding box, computed 3D bounding box, vectors encoding viewpoints on the vehicle (View), unpacked image of the vehicle (Unpack), and rasterized 3D bounding box fed to the net (Rast).

The normalization is done by unpacking the image into a plane. The plane contains rectified versions of the front/rear (F), side (S), and roof (R). These parts are adjacent to each other (Fig. 2) and they are organized into the final matrix U:

U = \begin{pmatrix} 0 & R \\ F & S \end{pmatrix} \qquad (1)
484 |
The unpacking itself is done by obtaining the homography between points b_i (Fig. 2) and perspective warping of the corresponding parts of the original image. The top-left submatrix is filled with zeros. This unpacked version of the vehicle is used instead of the original image to feed the net. The unpacking is beneficial as it localizes the parts of the vehicles and normalizes their position in the image, all that without the necessity to use DPM or other algorithms for part localization. In the following text, we will refer to this normalization method as Unpack.
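
For illustration, a minimal sketch of the unpacking under an assumed corner indexing (the b_i ordering is taken to follow Fig. 2; the face sizes `w`, `h` are hypothetical parameters, not values from the paper):

```python
import cv2
import numpy as np

def unpack(image, b, w=96, h=96):
    """Unpack a vehicle image into the matrix U = [[0, R], [F, S]] (Eq. 1).

    b    -- (8, 2) array with the projected 3D bounding box corners b0..b7;
            the corner indexing is an assumption following Fig. 2.
    w, h -- edge lengths of the rectified faces (hypothetical choice).
    """
    def rectify(quad):
        # Homography from the source quadrilateral (TL, TR, BR, BL order)
        # to a w x h rectangle, followed by perspective warping.
        dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
        H = cv2.getPerspectiveTransform(np.float32(quad), dst)
        return cv2.warpPerspective(image, H, (w, h))

    F = rectify([b[0], b[1], b[5], b[4]])  # front/rear face (assumed corners)
    S = rectify([b[1], b[2], b[6], b[5]])  # side face (assumed corners)
    R = rectify([b[0], b[1], b[2], b[3]])  # roof face (assumed corners)
    zero = np.zeros_like(R)                # top-left submatrix is zeros
    return np.vstack([np.hstack([zero, R]), np.hstack([F, S])])
```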
495 |
496 | 3.2 Extended Input to the Neural Nets
497 |
It is possible to infer additional information about the vehicle from the 3D bounding box, and we found out that these data slightly improve the classification and verification performance. One piece of this auxiliary information is the encoded viewpoint (the direction from which the vehicle is observed). We also add the rasterized 3D bounding box as an additional input to the CNNs. Compared to our previously proposed auxiliary data fed to the net (Sochor et al, 2016a), we handle the frontal and rear vehicle sides differently.
509 |
View The viewpoint is extracted from the orientation of the 3D bounding box – Fig. 3. We encode the viewpoint as three 2D vectors v_i, where i ∈ {f, s, r} (front/rear, side, roof), and pass them to the net. Vectors v_i connect the center of the bounding box with the centers of the box's faces; therefore, they can be computed as v_i = \overrightarrow{C_c C_i}. Point C_c is the center of the bounding box and it can be obtained as the intersection of diagonals \overline{b_2 b_4} and \overline{b_5 b_3}. Points C_i for i ∈ {f, s, r} denote the centers of each face, again computed as intersections of face diagonals. In contrast to our previous approach (Sochor et al, 2016a), which did not take the direction of the vehicle into account, we encode information about the vehicle direction (d = 1 for vehicles going towards the camera, d = 0 for vehicles going away from the camera) to determine which side of the bounding box is the frontal one. The vectors are normalized to unit size; storing them with a different normalization (e.g. the front one normalized, the others in the proper ratio) did not improve the results.
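
A small sketch of this viewpoint encoding: the box-center diagonals follow the text, while the face-diagonal corner pairs are assumptions based on Fig. 2.

```python
import numpy as np

def intersect(p1, p2, p3, p4):
    # Intersection of lines p1p2 and p3p4 using homogeneous coordinates.
    l1 = np.cross([*p1, 1.0], [*p2, 1.0])
    l2 = np.cross([*p3, 1.0], [*p4, 1.0])
    x = np.cross(l1, l2)
    return x[:2] / x[2]

def view_vectors(b, d):
    """Encode the viewpoint as three unit vectors v_f, v_s, v_r plus d.

    b -- (8, 2) projected box corners; d -- 1 towards camera, 0 away.
    """
    Cc = intersect(b[2], b[4], b[5], b[3])   # box center (diagonals from text)
    Cf = intersect(b[0], b[5], b[1], b[4])   # front/rear face center (assumed)
    Cs = intersect(b[1], b[6], b[2], b[5])   # side face center (assumed)
    Cr = intersect(b[0], b[2], b[1], b[3])   # roof face center (assumed)
    vs = [(C - Cc) / np.linalg.norm(C - Cc) for C in (Cf, Cs, Cr)]
    return np.concatenate(vs + [[d]])        # six numbers + direction flag
```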
536 |
Fig. 4 Examples of the proposed data augmentation techniques. The leftmost image contains the original cropped image of the vehicle; the other images contain augmented versions of the image (top – Color, bottom – ImageDrop).

Rast Another way of encoding the viewpoint and also the relative dimensions of the vehicle is to rasterize the 3D bounding box and use it as an additional input to the net. The rasterization is done separately for all sides, each filled by one color. The final rasterized bounding box is then a four-channel image containing each visible face rasterized in a different channel. Formally, point p of the rasterized bounding box T is obtained as

T_p = \begin{cases}
(1, 0, 0, 0) & p \in \overline{b_0 b_1 b_4 b_5} \text{ and } d = 1 \\
(0, 1, 0, 0) & p \in \overline{b_0 b_1 b_4 b_5} \text{ and } d = 0 \\
(0, 0, 1, 0) & p \in \overline{b_1 b_2 b_5 b_6} \\
(0, 0, 0, 1) & p \in \overline{b_0 b_1 b_2 b_3} \\
(0, 0, 0, 0) & \text{otherwise}
\end{cases} \qquad (2)

where \overline{b_0 b_1 b_4 b_5} denotes the quadrilateral defined by points b_0, b_1, b_4 and b_5 in Figure 2.

Finally, the rasterized 3D bounding box is cropped by the 2D bounding box of the vehicle. For an example, see Figure 3, showing rasterized bounding boxes for different vehicles taken from different viewpoints.
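
A sketch of the rasterization in Eq. (2), assuming OpenCV and the Fig. 2 corner indexing; the output resolution is a hypothetical parameter:

```python
import cv2
import numpy as np

def rasterize_box(b, d, shape):
    """Rasterize the 3D bounding box into a 4-channel image T (Eq. 2).

    b     -- (8, 2) projected box corners, d -- vehicle direction flag,
    shape -- (height, width) of the output (cropped later by the 2D BB).
    """
    T = np.zeros((shape[0], shape[1], 4), np.float32)
    faces = [
        (np.array([b[0], b[1], b[5], b[4]]), 0 if d == 1 else 1),  # front/rear
        (np.array([b[1], b[2], b[6], b[5]]), 2),                   # side
        (np.array([b[0], b[1], b[2], b[3]]), 3),                   # roof
    ]
    for quad, channel in faces:
        mask = np.zeros(shape, np.uint8)
        cv2.fillPoly(mask, [quad.astype(np.int32)], 1)  # fill the face region
        T[..., channel] = mask
    return T
```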
586 |
587 | 3.3 Additional Training Data Augmentation
588 |
To increase the diversity of the training data, we propose additional data augmentation techniques. The first one (denoted Color) deals with the fact that for fine-grained recognition of vehicles (and some other objects), the color is irrelevant. The other method (ImageDrop) deals with potentially missing parts of the vehicle. Examples of the data augmentation are shown in Figure 4. Both augmentation techniques are applied only with a predefined probability during training; otherwise, the images are not modified. During testing, we do not modify the images at all.

The results presented in Section 5.5 show that both these modifications improve the classification accuracy, both in combination with the other presented techniques and by themselves.
605 |
Color To increase the color variability of the training samples, we propose to randomly alter the color of the image. The alteration is done in the HSV color space by adding the same random value to each pixel in the image (each HSV channel is processed separately).
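
A minimal sketch of the Color augmentation, assuming OpenCV HSV conversion; the jitter ranges are hypothetical, not values from the paper:

```python
import cv2
import numpy as np

def color_augment(image, max_shift=(10, 25, 25), rng=np.random):
    """Shift each HSV channel of the whole image by one random offset.

    image -- BGR uint8 image; max_shift -- per-channel limits (hypothetical).
    """
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV).astype(np.int32)
    for c in range(3):
        # The same random value is added to every pixel of the channel.
        hsv[..., c] += rng.randint(-max_shift[c], max_shift[c] + 1)
    hsv[..., 0] %= 180                              # hue wraps around in OpenCV
    hsv[..., 1:] = np.clip(hsv[..., 1:], 0, 255)    # clip saturation and value
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
```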
612 |
ImageDrop Inspired by Zeiler and Fergus (2014), who evaluated the influence of covering a part of the input image on the probability of the ground truth class, we take this step further: in order to deal with missing parts on the vehicles, we take a random rectangle in the image and fill it with random noise, effectively dropping any information contained in that part of the image.
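
A sketch of ImageDrop; the rectangle size range is a hypothetical parameter:

```python
import numpy as np

def image_drop(image, max_frac=0.25, rng=np.random):
    """Fill a random rectangle of the image with random noise (in place)."""
    h, w = image.shape[:2]
    rh = rng.randint(1, int(h * max_frac) + 1)   # rectangle height
    rw = rng.randint(1, int(w * max_frac) + 1)   # rectangle width
    y = rng.randint(0, h - rh + 1)
    x = rng.randint(0, w - rw + 1)
    # Random noise destroys any information in the selected region.
    image[y:y + rh, x:x + rw] = rng.randint(0, 256, (rh, rw) + image.shape[2:])
    return image
```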
621 |
622 | 3.4 Estimation of 3D Bounding Box at Test Time
623 |
As the results (Section 5) show, the most important part of the proposed algorithm is Unpack, followed by Color and ImageDrop. However, the 3D bounding box is required for the unpacking of the vehicles, and we acknowledge that there may be scenarios where such information is not available. Thus, we propose a method for estimating the 3D bounding box at both training and test time with only limited information available.

As proposed by Dubská et al (2014), the vehicle's contour and the vanishing points are required for the bounding box construction. Therefore, it is necessary to estimate the contour and the vanishing points for the vehicle. For estimating the vehicle contour, we use the Fully Convolutional Encoder-Decoder network designed by Yang et al (2016) for general object contour detection, which yields a mask with the probability of a vehicle contour for each image pixel. To obtain the final contour, we search for the global maxima along line segments from the 2D bounding box center to edge points of the 2D bounding box. For examples, see Figure 5.
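
A minimal sketch of the contour extraction from the probability map, assuming the map has already been computed; it approximates the edge-point segments with a fixed number of angular rays (a hypothetical parameter):

```python
import numpy as np

def contour_from_map(prob, bb, n_rays=360):
    """Pick the global maximum of the contour probability map along rays.

    prob -- (H, W) contour probability map from the detector,
    bb   -- 2D bounding box (x0, y0, x1, y1); rays are cast from its
            center towards points on its edges.
    """
    x0, y0, x1, y1 = bb
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    contour = []
    for angle in np.linspace(0, 2 * np.pi, n_rays, endpoint=False):
        # Sample the map along the ray, clipped to the 2D bounding box.
        r_max = max(x1 - x0, y1 - y0)
        ts = np.linspace(0, r_max, 256)
        xs = np.clip(cx + ts * np.cos(angle), x0, x1 - 1).astype(int)
        ys = np.clip(cy + ts * np.sin(angle), y0, y1 - 1).astype(int)
        k = np.argmax(prob[ys, xs])          # global maximum along the ray
        contour.append((xs[k], ys[k]))
    return np.array(contour)
```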
644 |
We found out that the exact positions of the vanishing points are not required for the 3D bounding box construction; the directions towards the vanishing points are much more important. Therefore, we use regression to obtain the directions towards the vanishing points and then assume that the vanishing points are at infinity.
652 |
656 |
657 | Fig. 5 Estimation of 3D bounding box. Left to right: image with vehicle 2D bounding box, output of contour object detector
658 | (Yang et al, 2016), our constructed contour, estimated directions towards vanishing points, ground truth (green) and estimated
659 | (red) 3D bounding box.
660 |
Following the work of Rothe et al (2016), we formulated the regression of the directions towards the vanishing points as a classification task into bins corresponding to angles, and we use ResNet50 (He et al, 2016) with three classification outputs. As the training data for the regression, we used the BoxCars116k dataset (Section 4) with the test samples omitted. To construct the lines on which the vanishing points lie, we use the center of the 2D bounding box.
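
A sketch of the angle binning used to cast the direction regression as classification; the bin width is a hypothetical choice, not a value from the paper:

```python
import numpy as np

BIN_WIDTH = 3.0                       # degrees per bin (hypothetical)

def angle_to_bin(direction):
    """Quantize a 2D direction towards a vanishing point into a class bin."""
    angle = np.degrees(np.arctan2(direction[1], direction[0])) % 360.0
    return int(angle // BIN_WIDTH)

def bin_to_direction(bin_index):
    """Map a predicted bin back to a unit direction (the bin center)."""
    angle = np.radians((bin_index + 0.5) * BIN_WIDTH)
    return np.array([np.cos(angle), np.sin(angle)])
```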
670 |
With all this estimated information, it is then possible to construct the 3D bounding box. It is important to note that with this 3D bounding box estimation, it is possible to use the method beyond the scope of traffic surveillance: it is only necessary to train the regressor of the vanishing point directions. For training such a regressor, it is possible to use either the directions themselves or the viewpoints on the vehicle and the focal lengths of the images.
680 |
681 | 4 BoxCars116k Dataset
682 |
There is a large number of datasets of vehicles (Russakovsky et al, 2015; Agarwal et al, 2004; Papageorgiou and Poggio, 1999; Everingham et al, 2010; Xiang and Savarese, 2012; Caraffi et al, 2012; Opelt et al, 2004; Leibe et al, 2007; Glasner et al, 2012; Savarese and Fei-Fei, 2007; Geiger et al, 2012; Özuysal et al, 2009; Matzen and Snavely, 2013) which are usable mainly for vehicle detection, pose estimation, and other tasks. However, these datasets do not contain annotations of the precise vehicle make & model.
693 |
When it comes to fine-grained datasets, a few of them exist and all are quite recent. Lin et al (2014) published the FG3DCar dataset (300 images, 30 classes); Stark et al (2012) made another dataset containing 1 904 vehicles from 14 classes. Krause et al (2013) published two datasets; one of them, called Car Types, contains 16k images and 196 classes. The other one, BMW 10, is made of 10 models of BMW vehicles and 500 images. Finally, Liao et al (2015) created a dataset of 1 482 vehicles from 8 classes. All these datasets are relatively small for training CNNs for real-world surveillance tasks.
707 |
Yang et al (2015) published the large dataset CompCars. The dataset consists of a web-nature part, made of 136k vehicles from 1 600 classes taken from different viewpoints, and a surveillance-nature part with 50k frontal images of vehicles taken from surveillance cameras.
714 |
Liu et al (2016b) published the VeRi-776 dataset for the vehicle re-identification task. The dataset contains over 50k images of 776 vehicles captured by 20 cameras covering a 1.0 km² area during 24 hours. Each vehicle is captured by 2–18 cameras in different viewpoints, illuminations, resolutions and occlusions, and various attributes like bounding boxes, vehicle types, colors and brands are provided.
723 |
We collected and annotated a new dataset, BoxCars116k. The dataset is focused on images taken from surveillance cameras, as it is meant to be useful for traffic surveillance applications. We do not restrict the vehicles to be captured from the frontal side (Fig. 6). We used surveillance cameras mounted near streets and tracked the passing vehicles. Each correctly detected vehicle is captured in multiple images as it is passing by the camera; therefore, we have more visual information about each vehicle.
734 |
738 |
Fig. 6 Collage of random samples from the dataset.
740 |
741 | 4.1 Dataset Acquisition
742 |
The dataset is formed by two parts. The first part consists of data from the BoxCars21k dataset (Sochor et al, 2016a), which were cleaned up and in which some imprecise annotations were corrected (e.g. missing model years for some uncommon vehicle types).
748 |
We also collected other data from videos relevant to our previous work (Dubská et al, 2014, 2015; Sochor et al, 2016b). We detected all vehicles, tracked them, and for each track collected images of the respective vehicle. We downsampled the framerate to ∼12.5 FPS to avoid collecting multiple almost identical images of the same vehicle.
756 |
The new dataset was annotated by multiple human annotators with an interest in vehicles and sufficient knowledge about vehicle types and models. The annotators were assigned to clean up the processed data from invalid detections and to assign the exact vehicle type (make, model, submodel, year) for each obtained track. While preparing the dataset for annotation, 3D bounding boxes were constructed for each detected vehicle using the method proposed by Dubská et al (2014). Invalid detections were then identified by the annotators based on these constructed 3D bounding boxes; if the 3D bounding boxes were not constructed precisely, the whole track was invalidated.
770 |
Vehicle type annotation reliability is guaranteed by providing multiple annotations for each valid track (∼4 annotations per vehicle). The annotation of the vehicle type is considered correct in the case of at least three identical annotations. Uncertain cases were authoritatively annotated by the authors.
777 |
The tracks in the BoxCars21k dataset consist of exactly 3 images per track. However, in the new part of the dataset, we collect an arbitrary number of images per track (usually more than 3).
782 |
# tracks                                    27 496
# samples                                  116 286
# cameras                                      137
# make                                          45
# make & model                                 341
# make & model & submodel                      421
# make & model & submodel & model year         693

Table 1 Statistics of our new BoxCars116k dataset.
800 |
Fig. 7 BoxCars116k dataset statistics – top left: 2D bounding box dimensions, top right: number of samples per fine-grained type, bottom left: azimuth distribution (0° denotes the frontal viewpoint), bottom right: elevation distribution.
805 |
806 | 4.2 Dataset Statistics
807 |
The dataset contains 27 496 vehicles (116 286 images) of 45 different makes with 693 fine-grained classes (make & model & submodel & model year) collected from 137 different cameras with a large variation in the viewpoints. Detailed statistics about the dataset can be found in Table 1 and in Figure 7. The distribution of types in the dataset is shown in Figure 7 (top right) and samples from the dataset are in Figure 6. The dataset also includes information about the 3D bounding box (Dubská et al, 2014) for each vehicle and an image with a foreground mask extracted by background subtraction (Stauffer and Grimson, 1999; Zivkovic, 2004).
824 |
Fig. 8 Viewpoints of the dataset samples (horizontal flips are not included). The red dot on the unit circle denotes the frontal viewpoint. Left: all samples with elevation color coding (in degrees); center: train samples for the hard split, color coded by the 2D BB area (in thousands of pixels); right: test samples for the hard split, color coded by the angle to the nearest training viewpoint sample (in degrees).
829 |
                         hard      medium
# classes                 107          79
# trainval cameras         81          81
# test cameras             56          56
# train tracks         11 653      12 084
# train samples        51 691      54 653
# validation tracks       637         611
# validation samples    2 763       2 802
# test tracks          11 125      11 456
# test samples         39 149      40 842

Table 2 Statistics about the splits with different difficulty (hard and medium).
873 |
The dataset is made publicly available² for future reference and evaluation.
876 |
Compared to “web-based” datasets, the new BoxCars116k dataset contains images of vehicles relevant to traffic surveillance: the viewpoints are specific (high elevation), the images are usually small, etc. Compared to other fine-grained surveillance datasets, our dataset provides data with a high variation of viewpoints, see Figure 8.
883 |
884 | 4.3 Training & Test Splits
885 |
886 | Our task is to provide a dataset for fine-grained recog-
887 | nition in traffic surveillance without any viewpoint con-
888 | straint. Therefore, we construct the splits for training
889 | and evaluation in a way which reflects the fact that it is
890 | usually not known a priori from which viewpoints the
891 | vehicles will be seen by the surveillance camera.
892 |
Thus, for the construction of the splits, we randomly selected cameras and used all tracks from these cameras for training, and the vehicles from the other cameras for testing. This way, we are testing the classification algorithms on images of vehicles from previously unseen cameras (viewpoints). However, this dataset selection process causes some of the vehicles in the testing set to be captured from slightly different viewpoints than those present in the training set; these differences are shown in Figure 8 (right).

² https://medusa.fit.vutbr.cz/traffic
906 |
We constructed two splits. In the first one (hard), we are interested in the recognition of the precise type, including the model year. In the other one (medium), we omit the difference in model years and all vehicles of the same subtype (and potentially different model years) are present in the same class. We selected only types which have at least 15 tracks in the training set and at least one track in the testing set. The statistics of the splits are shown in Table 2.
916 |
917 | 5 Experiments
918 |
We thoroughly evaluated our proposed algorithm on the BoxCars116k dataset. First, we evaluated how the proposed modifications improve the accuracy of different nets, compared them to the state of the art, and analyzed how using approximate 3D bounding boxes influences the achieved accuracy. Then, we searched for the main source of the improvements, analyzed the improvements of the different modifications separately, and also evaluated the usability of features from the trained nets for the task of vehicle type identity verification.
929 |
To show that our modifications improve the accuracy independently of the used net, we use several of them:

– AlexNet (Krizhevsky et al, 2012)
– VGG16, VGG19 (Simonyan and Zisserman, 2014)
– ResNet50, ResNet101, ResNet152 (He et al, 2016)
– CNNs with a Compact Bilinear Pooling layer (Gao et al, 2016) in combination with VGG nets, denoted as VGG16+CBL and VGG19+CBL.
943 |
As there are several options how to use the proposed modifications of the input data and add the auxiliary data, we define several labels which we will use:

– ALL – All five proposed modifications (Unpack, Color, ImageDrop, View, Rast).
– IMAGE – Modifications working only on the image level (Unpack, Color, ImageDrop).
– CVPR16 – Modifications as proposed in our previous CVPR paper (Sochor et al, 2016a) (Unpack, View, Rast); however, the View and Rast modifications differ from the ones used in this paper, as the original modifications do not distinguish the frontal/rear side of vehicles.
968 |
983 |
5.1 Improvements for Different CNNs

The first experiment evaluates how our modifications improve the classification accuracy of different CNNs.

All the nets were fine-tuned from models pre-trained on ImageNet (Russakovsky et al, 2015) for approximately 15 epochs, which was sufficient for the nets to converge. We used the same batch size (except for ResNet152, where we had to use a smaller batch size because of GPU memory limitations), the same initial learning rate and learning rate decay, and the same hyperparameters for every net (initial learning rate 2.5 · 10⁻³, weight decay 5 · 10⁻⁴, quadratic learning rate decay, loss averaged over 100 iterations). We also used standard data augmentation techniques such as horizontal flipping and random moving of the bounding box (Simonyan and Zisserman, 2014). As ResNets do not use fully connected layers, we only report the IMAGE modifications for them.
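
The quadratic decay corresponds to a polynomial schedule with power 2; a minimal sketch under the stated hyper-parameters (the total iteration count is an assumption):

```python
def quadratic_lr(iteration, max_iterations, base_lr=2.5e-3):
    """Quadratic (polynomial, power 2) learning rate decay."""
    return base_lr * (1.0 - float(iteration) / max_iterations) ** 2

# Example: the learning rate decays smoothly from 2.5e-3 to 0.
lrs = [quadratic_lr(i, 100000) for i in range(0, 100000, 10000)]
```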
1005 |
The results for both the medium and hard splits are shown in Table 3. As we have correspondences between the samples in the dataset and know which samples come from the same track, we are able to use the mean probability across the track samples and merge the classification for the whole track. Therefore, we always report the results in the form single-sample accuracy/whole-track accuracy. As expected, the results for whole tracks are much better than for single samples.
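
A sketch of the track-level evaluation, assuming per-sample softmax probabilities are available:

```python
import numpy as np

def track_prediction(sample_probs):
    """Classify a whole track by averaging per-sample class probabilities.

    sample_probs -- (num_samples, num_classes) softmax outputs of one track.
    """
    return int(np.argmax(np.mean(sample_probs, axis=0)))
```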
1015 |
There are several things which should be noted about the results. The most important one is that our modifications significantly improve the classification accuracy (up to +12 percent points) and reduce the classification error (up to 50 % error reduction). Another important fact is that our new modifications push the accuracy much further compared to the original method (Sochor et al, 2016a).

The table also shows that the difference between the ALL and IMAGE modifications is negligible, and it is therefore reasonable to use only the IMAGE modifications. This also results in CNNs which use just the Unpack modification at test time, as the other modifications (Color, ImageDrop) are used only during fine-tuning of the CNNs.

Also, the evaluation shows that the results are almost identical for the hard and medium splits; therefore, we will further report results only on the hard split, as the main goal is to distinguish also the model years. The names of the splits were chosen to be consistent with the original version of the dataset (Sochor et al, 2016a), and the small difference between the medium and hard split accuracies is caused mainly by the size of the new dataset.
1027 |
1028 | 5.2 Comparison with the State of the Art
1029 |
In order to examine the performance of our method, we also evaluated other state-of-the-art methods for fine-grained recognition. We used 3 different algorithms for general fine-grained recognition with published code. We always first used the code to reproduce the results in the respective papers, to ensure that we are using the published work correctly. All of the methods use CNNs and the used net influences the accuracy; therefore, the results should be compared with the respective base CNNs. It was impossible to evaluate the methods focused only on fine-grained recognition of vehicles, as they are usually limited to frontal/rear viewpoints or require 3D models of the vehicles for all the types. In the following text, we define a label for each evaluated state-of-the-art method and describe the details of each method separately.
1045 |
BCNN Lin et al (2015b) proposed to use Bilinear CNNs. We used the VGG-M and VGG16 networks in a symmetric setup (details in the original paper) and trained the nets for 30 epochs (the nets converged around the 20th epoch). We also used image flipping to augment the training set.
1052 |
CBL We modified compatible nets with the Compact BiLinear Pooling proposed by Gao et al (2016), which followed the work of Lin et al (2015b) and reduced the number of output features of the bilinear layers. We used the Caffe implementation of the layer provided by the authors and used 8 192 features. We trained the nets using the same hyper-parameters, protocol, and data augmentation as described in Section 5.1.
1061 |
PCM Simon and Rodner (2015) propose Part Constellation Models and use neural activations (see the paper for the details) to get the parts of the model. We used AlexNet (the BVLC Caffe reference version) and VGG19 as base nets for the method. We used the same hyper-parameters as the authors, with the exception of the number of fine-tuning iterations, which was increased,
1069 |
1073 |
SPLIT: HARD

net                                        accuracy [%]   improvement [pp]   error reduction [%]
AlexNet + ALL                              77.79/88.60    +11.15/+10.85      33.42/48.77
AlexNet + IMAGE                            77.67/88.28    +11.02/+10.53      33.04/47.31
AlexNet + CVPR16                           70.21/81.67    +3.56/+3.92        10.68/17.62
AlexNet (Krizhevsky et al, 2012)           66.65/77.75    —                  —
VGG16 + ALL                                84.13/92.27    +6.88/+5.56        30.24/41.85
VGG16 + IMAGE                              83.79/92.23    +6.53/+5.53        28.71/41.58
VGG16 + CVPR16                             79.58/89.27    +2.32/+2.56        10.22/19.27
VGG16 (Simonyan and Zisserman, 2014)       77.26/86.71    —                  —
VGG16+CBL + ALL                            75.06/83.42    +4.67/+3.31        15.78/16.63
VGG16+CBL + IMAGE                          75.04/83.16    +4.66/+3.05        15.73/15.32
VGG16+CBL + CVPR16                         70.94/81.08    +0.56/+0.97        1.88/4.88
VGG16+CBL (Gao et al, 2016)                70.38/80.11    —                  —
VGG19 + IMAGE                              83.91/92.17    +7.17/+6.11        30.83/43.84
VGG19 + ALL                                84.12/92.00    +7.38/+5.94        31.74/42.62
VGG19 + CVPR16                             79.69/89.42    +2.95/+3.36        12.69/24.11
VGG19 (Simonyan and Zisserman, 2014)       76.74/86.06    —                  —
VGG19+CBL + ALL                            75.62/83.76    +4.93/+3.50        16.82/17.71
VGG19+CBL + IMAGE                          75.47/83.56    +4.78/+3.30        16.31/16.71
VGG19+CBL + CVPR16                         71.92/81.64    +1.23/+1.38        4.20/6.97
VGG19+CBL (Gao et al, 2016)                70.69/80.26    —                  —
ResNet50 + IMAGE                           82.27/90.79    +6.79/+6.18        27.69/40.13
ResNet50 (He et al, 2016)                  75.48/84.61    —                  —
ResNet101 + IMAGE                          83.41/91.59    +6.95/+6.27        29.52/42.72
ResNet101 (He et al, 2016)                 76.46/85.31    —                  —
ResNet152 + IMAGE                          83.74/91.71    +6.06/+5.51        27.16/39.93
ResNet152 (He et al, 2016)                 77.68/86.20    —                  —

SPLIT: MEDIUM

net                                        accuracy [%]   improvement [pp]   error reduction [%]
AlexNet + IMAGE                            77.77/88.16    +12.09/+11.64      35.21/49.57
AlexNet + ALL                              77.52/87.52    +11.84/+10.99      34.49/46.82
AlexNet + CVPR16                           70.90/82.18    +5.23/+5.65        15.22/24.06
AlexNet (Krizhevsky et al, 2012)           65.68/76.53    —                  —
VGG16 + ALL                                83.89/91.75    +7.93/+6.36        32.99/43.55
VGG16 + IMAGE                              83.93/91.69    +7.96/+6.30        33.13/43.13
VGG16 + CVPR16                             79.50/88.58    +3.54/+3.19        14.71/21.86
VGG16 (Simonyan and Zisserman, 2014)       75.96/85.39    —                  —
VGG16+CBL + IMAGE                          75.67/83.49    +4.93/+3.27        16.84/16.55
VGG16+CBL + ALL                            75.47/83.23    +4.73/+3.01        16.15/15.23
VGG16+CBL + CVPR16                         71.07/81.02    +0.33/+0.80        1.12/4.06
VGG16+CBL (Gao et al, 2016)                70.74/80.22    —                  —
VGG19 + ALL                                84.43/92.22    +9.03/+7.88        36.70/50.33
VGG19 + IMAGE                              83.98/91.71    +8.58/+7.37        34.88/47.05
VGG19 + CVPR16                             80.26/89.39    +4.87/+5.05        19.78/32.27
VGG19 (Simonyan and Zisserman, 2014)       75.40/84.34    —                  —
VGG19+CBL + IMAGE                          76.88/84.63    +5.34/+3.95        18.75/20.46
VGG19+CBL + ALL                            75.47/83.88    +3.92/+3.20        13.79/16.58
VGG19+CBL + CVPR16                         72.53/81.90    +0.98/+1.22        3.46/6.32
VGG19+CBL (Gao et al, 2016)                71.54/80.67    —                  —
ResNet50 + IMAGE                           82.28/90.63    +7.21/+7.09        28.90/43.08
ResNet50 (He et al, 2016)                  75.07/83.55    —                  —
ResNet101 + IMAGE                          83.10/90.80    +6.05/+5.19        26.37/36.08
ResNet101 (He et al, 2016)                 77.05/85.61    —                  —
ResNet152 + IMAGE                          83.80/91.38    +5.36/+4.40        24.85/33.78
ResNet152 (He et al, 2016)                 78.44/86.98    —                  —

Table 3 Improvements of our proposed modifications for different CNNs. The accuracy is reported as single-sample accuracy/track accuracy. We also present the improvement in percent points and the classification error reduction in the same format.
1364 |
1368 |
method                                     accuracy [%]   speed [FPS]
AlexNet (Krizhevsky et al, 2012)           66.65/77.75    963
VGG16 (Simonyan and Zisserman, 2014)       77.26/86.71    173
VGG19 (Simonyan and Zisserman, 2014)       76.74/86.06    146
ResNet50 (He et al, 2016)                  75.48/84.61    155
ResNet101 (He et al, 2016)                 76.46/85.31    95
ResNet152 (He et al, 2016)                 77.68/86.20    66
BCNN (VGG-M) (Lin et al, 2015b)            64.83/72.22    87∗
BCNN (VGG16) (Lin et al, 2015b)            69.64/78.56    10∗
CBL (VGG16) (Gao et al, 2016)              70.38/80.11    165
CBL (VGG19) (Gao et al, 2016)              70.69/80.26    141
PCM (AlexNet) (Simon and Rodner, 2015)     63.24/73.94    15
PCM (VGG19) (Simon and Rodner, 2015)       75.99/85.24    4
AlexNet + ALL (ours)                       77.79/88.60    580
VGG16 + ALL (ours)                         84.13/92.27    154
VGG19 + ALL (ours)                         84.12/92.00    133
VGG16+CBL + ALL (ours)                     75.06/83.42    146
VGG19+CBL + ALL (ours)                     75.62/83.76    126
ResNet50 + IMAGE (ours)                    82.27/90.79    151
ResNet101 + IMAGE (ours)                   83.41/91.59    93
ResNet152 + IMAGE (ours)                   83.74/91.71    65

Table 4 Comparison of different vehicle fine-grained recognition methods. Accuracy is reported as single-image accuracy/whole-track accuracy. Processing speed was measured on a machine with a GTX1080 and CUDNN. ∗ FPS reported by the authors.
1445 |
net          no modification   GT 3D BB      estimated 3D BB
AlexNet      66.65/77.75       77.67/88.28   74.81/87.30
VGG16        77.26/86.71       83.79/92.23   80.60/90.59
VGG19        76.74/86.06       83.91/92.17   81.43/91.57
VGG16+CBL    70.38/80.11       75.04/83.16   72.83/82.92
VGG19+CBL    70.69/80.26       75.47/83.56   73.09/83.09
ResNet50     75.48/84.61       82.27/90.79   79.60/90.40
ResNet101    76.46/85.31       83.41/91.59   80.20/90.42
ResNet152    77.68/86.20       83.74/91.71   80.87/90.93

Table 5 Comparison of classification accuracy (percent) on the hard split for standard nets without any modifications, with IMAGE modifications using the 3D bounding boxes from surveillance data, and with IMAGE modifications using the estimated 3D BB (Section 3.4).
1489 |
and the C parameter of the used linear SVM was cross-validated on the training data.
1492 |
The results of all the comparisons can be found in Table 4. As the table shows, our method significantly outperforms both the standard CNNs (Krizhevsky et al, 2012; Simonyan and Zisserman, 2014; He et al, 2016) and the methods for fine-grained recognition (Lin et al, 2015b; Simon and Rodner, 2015; Gao et al, 2016). The results of the fine-grained recognition methods should be compared against the same base network, as they differ for different base networks. Our best accuracy (84 %) is better by a large margin compared to all other variants (both standard CNNs and fine-grained methods).
1505 |
In order to provide approximate information about the processing efficiency, we measured how many vehicle images the different methods and networks are able to process per second (referenced as FPS). The measurement was done with a GTX1080 and CUDNN whenever possible. In the case of BCNN, we report the numbers published by the authors, as we were forced to save some intermediate data to disk because we were not able to fit all the data into memory (∼200 GB). The results are also shown in Table 4; they show that our input modification decreases the processing speed; however, the speed penalty is small and the method is still well usable for real-time processing.
1520 |
5.3 Influence of Using the Estimated 3D Bounding Boxes instead of the Surveillance Ones

We also evaluated how the results are influenced when the estimated 3D bounding boxes (Section 3.4) are used instead of the 3D bounding boxes obtained from the surveillance data (long-term observation of the video (Dubská et al, 2014, 2015)).
1529 |
1533 |
Fig. 9 Correlation of the improvement relative to CNNs without modifications with respect to the train–test viewpoint difference. The x-axis contains viewpoint difference bins (in degrees), and the y-axis denotes the improvement compared to the base net in percent points; see Section 5.4 for details. The graphs show that with increasing viewpoint difference, the accuracy improvement of our method increases.
1538 |
The classification results are shown in Table 5; they show that the proposed modifications still significantly improve the accuracy even if only the estimated 3D bounding box – the less accurate one – is used. This result is fairly important, as it enables transferring this method to different (non-surveillance) scenarios. The only additional data which is then required is a reliable training set of directions towards the vanishing points (or viewpoints and focal lengths) for the vehicles (or other rigid objects).
1549 |
5.4 Impact of Training/Testing Viewpoint Difference

We were also interested in what the main source of the classification accuracy improvement is. We analyzed several possibilities and found out that the most important aspect is the viewpoint difference.

For every training and testing sample, we computed the viewpoint (the unit 3D vector from the center of the vehicle's 3D bounding box), and for each testing sample we found the training sample with the lowest angle between its viewpoint and the test sample viewpoint (see Figure 10). Then, we divided the testing samples into several bins based on the computed angle. For each of these bins, we computed the accuracy of the standard nets without any modifications and of the nets with the proposed modifications. Finally, for each of the nets with modifications and each bin, we subtracted the accuracy of the corresponding net without any modification, yielding the improvement (in percent points) for the given modifications and bin. The results are displayed in Figure 9.
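
A sketch of this analysis, assuming unit viewpoint vectors for all samples; the bin edges follow Figure 9:

```python
import numpy as np

def viewpoint_gap(test_views, train_views):
    """Angle (degrees) from each test viewpoint to its nearest training one.

    Both arguments are (N, 3) arrays of unit vectors from the 3D BB centers.
    """
    cos = np.clip(test_views @ train_views.T, -1.0, 1.0)
    return np.degrees(np.arccos(cos.max(axis=1)))

def per_bin_improvement(gaps, correct_mod, correct_base,
                        edges=(0, 2, 4, 6, 13)):
    """Improvement (pp) of a modified net over its base net, per angle bin."""
    bins = np.digitize(gaps, edges[1:-1])    # assign samples to bins
    return [100.0 * (correct_mod[bins == i].mean()
                     - correct_base[bins == i].mean())
            for i in range(len(edges) - 1)]
```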
There are several facts which should be noted. The first and the most important one is that the Unpack modification alone improves the accuracy significantly for larger viewpoint differences (the accuracy is improved by more than 20 percent points for the last bin). The other important fact is that the other modifications (mainly Color and ImageDrop) push the accuracy further, independently of the training–testing viewpoint difference.

Fig. 10 Left column: test samples; right column: samples from the train set with the lowest angle between their viewpoint and the test sample viewpoint.
1585 |
5.5 Impact of Individual Modifications

We were also curious how the different modifications by themselves help to improve the accuracy. We conducted two types of experiments, which focus on different aspects of the modifications. The evaluation is not done on ResNets, as we only use IMAGE-level modifications with ResNets; thus, we cannot evaluate the Rast and View modifications with ResNets.

The first experiment is focused on the influence of each modification by itself. Therefore, we compute the accuracy improvement (in accuracy percent points) for the modifications as [base net + modification] − [base net], where [. . .] stands for the accuracy of the classifier described by its contents. The results are shown in Table 6. As can be seen in the table, the most positive modifications are Color, Unpack, and ImageDrop.

The second experiment evaluates how a given modification contributes to the accuracy improvement when all of the modifications are used. Thus, the improvement is computed as [base net + all] − [base net + all − modification]. See Table 7 for the results, which confirm the previous findings: Color, Unpack, and ImageDrop are again the most positive modifications.
1595 |
             AlexNet       VGG16+CBL     VGG19+CBL     VGG16         VGG19         mean          best
Unpack       +3.47/+4.37   +0.69/+1.06   +2.07/+2.51   +3.29/+3.48   +2.11/+2.55   +1.02/+1.31   +3.47/+4.37
View         −0.96/−1.20   −0.19/−0.19   −0.46/−0.93   −0.19/+0.26   −0.32/−0.35   +0.19/+0.31   +0.19/+0.31
Rast         −0.80/−1.18   +0.30/+0.27   −0.20/−0.08   +0.28/+0.09   −0.03/−0.04   +0.28/+0.72   +0.30/+0.72
Color        +4.80/+3.60   +2.08/+0.97   +2.72/+1.38   +3.79/+2.55   +3.17/+2.03   +2.47/+1.65   +4.80/+3.60
ImageDrop    +0.05/−0.47   +0.29/−0.43   +0.63/+0.07   +1.00/+0.84   +0.70/+0.20   +1.53/+0.96   +1.53/+0.96

Table 6 Improvements for different nets and modifications, computed as [base net + modification] − [base net].
1628 |
             AlexNet       VGG16+CBL     VGG19+CBL     VGG16         VGG19         mean          best
Unpack       +6.93/+7.60   +2.18/+2.22   +2.82/+2.46   +3.07/+2.82   +3.41/+3.48   +2.06/+2.32   +6.93/+7.60
View         +0.09/+0.18   −0.41/−0.19   +0.36/+0.15   +0.05/−0.27   −0.14/−0.15   −0.78/−0.64   +0.36/+0.18
Rast         +0.22/+0.17   +0.11/−0.08   +0.30/+0.20   −0.01/−0.11   −0.03/−0.08   −0.76/−0.58   +0.30/+0.20
Color        +6.34/+6.18   +2.54/+1.28   +3.08/+1.73   +2.92/+1.67   +3.42/+2.43   +2.21/+1.31   +6.34/+6.18
ImageDrop    +1.07/+0.79   +4.24/+3.54   +0.89/+0.05   +1.19/+0.68   +1.32/+0.77   −0.79/−1.21   +4.24/+3.54

Table 7 Improvements for different nets and modifications, computed as [base net + all] − [base net + all − modification].
1661 |
net                    all types     merged types
AlexNet + ALL          77.79/88.60   79.08/89.70
VGG16 + ALL            84.13/92.27   85.42/93.28
VGG16+CBL + ALL        75.06/83.42   76.82/85.07
VGG19 + ALL            84.12/92.00   85.51/92.97
VGG19+CBL + ALL        75.62/83.76   78.56/86.62
ResNet50 + IMAGE       82.27/90.79   83.51/91.79
ResNet101 + IMAGE      83.41/91.59   84.65/92.55
ResNet152 + IMAGE      83.74/91.71   85.10/92.84

Table 8 Comparison of accuracy [%] with all types and with 8 pairs of types merged into supertypes.
1694 |
5.6 Vehicle Types Resisting Fine-Grained Recognition

As the possible applications of fine-grained recognition may vary, we merged pairs of fine-grained classes into one supertype during testing. The merge was done for vehicles which are made by the same concern, have the same dimensions, and are only differentiated by subtle branding details on the mask. This merge can be beneficial if the task is, for example, determining the dimensions of the vehicle.

We merged 8 pairs of vehicle types (see Figure 11 for an example), affecting 1 034 tracks and 5 567 image samples. We show the results in Table 8; the accuracy improves only slightly – by ∼1 percent point.
1711 |
1712 | Fig. 11 Example of vehicle types merged into one supertype.
1713 | Left: Renault Traffic, right: Opel Vivaro.
1714 |
1736 |
5.7 Vehicle Type Verification

Lastly, we evaluated the quality of the features extracted from the last layer of the convolutional nets for the verification task. Under the term verification, we understand the task of determining whether a pair of vehicle tracks share the same fine-grained type or not. In agreement with previous works in the field (Taigman et al, 2014), we use the cosine distance between the features for the verification.

Fig. 12 Precision-Recall curves for the verification of fine-grained types. Black dots represent the human performance. Average precisions (from the plot legends): AlexNet – ALL 0.665, IMAGE 0.657, CVPR16 0.603, base net 0.512; ResNet50 – IMAGE 0.734, base net 0.548; ResNet101 – IMAGE 0.766, base net 0.575; ResNet152 – IMAGE 0.764, base net 0.645; VGG16 – ALL 0.849, IMAGE 0.845, CVPR16 0.827, base net 0.756; VGG19 – ALL 0.869, IMAGE 0.857, CVPR16 0.837, base net 0.764; VGG16+CBL – ALL 0.888, IMAGE 0.891, CVPR16 0.845, base net 0.840; VGG19+CBL – ALL 0.905, IMAGE 0.898, CVPR16 0.876, base net 0.864.
1754 |
We collected 5 million random pairs of vehicle tracks from the test part of the BoxCars116k splits and evaluated the verification on these pairs. As the tracks can have different numbers of vehicle images, we used 9 random pairs of images for each pair of tracks and used the median distance between these image pairs as the distance between the whole tracks.
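
A sketch of this verification protocol, assuming per-image CNN features are available; the 9-pair sampling follows the text:

```python
import numpy as np

def track_distance(feats_a, feats_b, n_pairs=9, rng=np.random):
    """Median cosine distance over random image pairs of two tracks.

    feats_a, feats_b -- (n_images, dim) feature matrices of the two tracks.
    """
    dists = []
    for _ in range(n_pairs):
        fa = feats_a[rng.randint(len(feats_a))]
        fb = feats_b[rng.randint(len(feats_b))]
        cos = np.dot(fa, fb) / (np.linalg.norm(fa) * np.linalg.norm(fb))
        dists.append(1.0 - cos)              # cosine distance
    return np.median(dists)

# Two tracks are predicted to share the fine-grained type if the distance
# falls below a threshold; sweeping the threshold yields the PR curve.
```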
1762 |
Precision-Recall curves and Average Precisions are shown in Figure 12. As the results show, our modifications significantly improve the average precision of each CNN in the given task. Also, as the figure shows, the method outperforms the human performance (black dots in Figure 12) reported in our previous paper (Sochor et al, 2016a).
1770 |
1771 | 6 Conclusion
1772 |
This article presents and sums up multiple algorithmic modifications suitable for CNN-based fine-grained recognition of vehicles. Some of the modifications were originally proposed in a conference paper (Sochor et al, 2016a); some are results of the ongoing research. We also propose a method for obtaining the 3D bounding boxes necessary for the image unpacking (which has the largest impact on the performance improvement) without observing a surveillance video, working only with the individual input image. This considerably increases the application potential of the proposed methodology (and the performance for such estimated 3D bounding boxes is only somewhat lower than when the "proper" bounding boxes are used). We focused on a thorough evaluation of the methods: we coupled them with multiple state-of-the-art CNN architectures (Simonyan and Zisserman, 2014; He et al, 2016) and measured the contribution/influence of the individual modifications.
1793 |
Our method significantly improves the classification accuracy (up to +12 percent points) and reduces the classification error (up to 50 % error reduction) compared to the base CNNs. Also, our method outperforms other state-of-the-art methods (Lin et al, 2015b; Simon and Rodner, 2015; Gao et al, 2016) by 9 percent points in single-image accuracy and by 7 percent points in whole-track accuracy.
1802 |
We collected, processed, and annotated the dataset BoxCars116k, targeted at fine-grained recognition of vehicles in the surveillance domain. Contrary to the majority of existing vehicle recognition datasets, the viewpoints vary greatly and correspond to surveillance scenarios; the existing datasets are mostly collected from web images and the vehicles are typically captured from eye-level positions. This dataset is made publicly available for future research and evaluation.
1812 |
1813 | Acknowledgment
1814 |
1815 | This work was supported by The Ministry of Education,
1816 | Youth and Sports of the Czech Republic from the Na-
1817 | tional Programme of Sustainability (NPU II); project
1818 | IT4Innovations excellence in science – LQ1602.
1819 |
1820 | References
1821 |
Agarwal S, Awan A, Roth D (2004) Learning to detect objects in images via a sparse, part-based representation. IEEE PAMI 26(11):1475–1490
1832 |
Baran R, Glowacz A, Matiolanski A (2015) The efficient real- and non-real-time make and model recognition of cars. Multimedia Tools and Applications 74(12):4269–4288
Bluche T, Ney H, Kermorvant C (2013) Feature extraction with convolutional neural networks for handwritten word recognition. In: International Conference on Document Analysis and Recognition (ICDAR), pp 285–289
Boonsim N, Prakoonwit S (2016) Car make and model recognition under limited lighting conditions at night. Pattern Analysis and Applications pp 1–13
Caraffi C, Vojir T, Trefny J, Sochman J, Matas J (2012) A System for Real-time Detection and Tracking of Vehicles from a Single Car-mounted Camera. In: ITS Conference, pp 975–982
Chai Y, Rahtu E, Lempitsky V, Van Gool L, Zisserman A (2012) Tricos: A tri-level class-discriminative co-segmentation method for image classification. In: European Conference on Computer Vision
Chai Y, Lempitsky V, Zisserman A (2013) Symbiotic segmentation and part localization for fine-grained categorization. In: Computer Vision (ICCV), 2013 IEEE International Conference on, pp 321–328
Chatfield K, Simonyan K, Vedaldi A, Zisserman A (2014) Return of the devil in the details: Delving deep into convolutional nets. In: British Machine Vision Conference
Clady X, Negri P, Milgram M, Poulenard R (2008) Multi-class vehicle type recognition system. In: Proceedings of the 3rd IAPR Workshop on Artificial Neural Networks in Pattern Recognition, Springer-Verlag, Berlin, Heidelberg, ANNPR '08, pp 228–239
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, IEEE, vol 1, pp 886–893
Dlagnekov L, Belongie S (2005) Recognizing cars. Tech. rep., UCSD CSE Tech Report CS2005-0833
Duan K, Parikh D, Crandall D, Grauman K (2012) Discovering localized attributes for fine-grained recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Dubská M, Sochor J, Herout A (2014) Automatic camera calibration for traffic understanding. In: BMVC
Dubská M, Herout A, Juránek R, Sochor J (2015) Fully automatic roadside camera calibration for traffic surveillance. Intelligent Transportation Systems, IEEE Transactions on 16(3):1162–1171
Everingham M, Van Gool L, Williams CKI, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. IJCV 88(2):303–338
1902 |
1903 | Fang J, Zhou Y, Yu Y, Du S (2016) Fine-grained
1904 | vehicle model recognition using a coarse-to-fine
1905 | convolutional neural network architecture.
1906 | IEEE
1907 | Transactions on Intelligent Transportation Systems
1908 | PP(99):1–11 4
1909 |
1910 | Felzenszwalb P, Girshick R, McAllester D, Ramanan D
1911 | (2010) Object detection with discriminatively trained
1912 | part-based models. IEEE Transactions on Pattern
1913 | Analysis and Machine Intelligence 32(9):1627–1645
1914 | 3
1915 |
1916 | Gao Y, Beijbom O, Zhang N, Darrell T (2016) Com-
1917 | pact bilinear pooling. In: The IEEE Conference on
1918 | Computer Vision and Pattern Recognition (CVPR)
1919 | 2, 3, 9, 10, 11, 12, 15
1920 |
1921 | Gavves E, Fernando B, Snoek C, Smeulders A, Tuyte-
1922 | laars T (2015) Local alignments for fine-grained cate-
1923 | gorization. International Journal of Computer Vision
1924 | 111(2):191–212 3
1925 |
1926 | Gebru T, Krause J, Wang Y, Chen D, Deng J, Aiden
1927 | EL, Fei-Fei L (2017) Using deep learning and google
1928 | street view to estimate the demographic makeup of
1929 | the us, arXiv:1702.06683 1
1930 |
1931 | Geiger A, Lenz P, Urtasun R (2012) Are we ready for
1932 | autonomous driving? the KITTI vision benchmark
1933 | suite. In: CVPR 7
1934 |
1935 | Glasner D, Galun M, Alpert S, Basri R, Shakhnarovich
1936 | G (2012) Viewpoint-aware object detection and con-
1937 | tinuous pose estimation. Image&Vision Comp 7
1938 |
1939 | Gu HZ, Lee SY (2013) Car model recognition by utiliz-
1940 | ing symmetric property to overcome severe pose vari-
1941 | ation. Machine Vision and Applications 24(2):255–
1942 | 274 3
1943 |
1944 | G¨oring C, Rodner E, Freytag A, Denzler J (2014) Non-
1945 | parametric part transfer for fine-grained recognition.
1946 | In: IEEE Conference on Computer Vision and Pat-
1947 | tern Recognition (CVPR), pp 2489–2496 2
1948 |
1949 | He H, Shao Z, Tan J (2015) Recognition of car makes
1950 | and models from a single traffic-camera image. IEEE
1951 | Transactions on Intelligent Transportation Systems
1952 | PP(99):1–11 3, 4
1953 |
1954 | He K, Zhang X, Ren S, Sun J (2016) Deep residual
1955 | learning for image recognition. In: The IEEE Confer-
1956 | ence on Computer Vision and Pattern Recognition
1957 | (CVPR) 4, 7, 9, 11, 12, 15
1958 |
1959 | Hsiao E, Sinha S, Ramnath K, Baker S, Zitnick L,
1960 | Szeliski R (2014) Car make and model recognition
1961 | using 3D curve alignment. In: IEEE WACV 3, 4
1962 |
1963 | Hsieh JW, Chen LC, Chen DY (2014) Symmetrical surf
1964 | and its applications to vehicle detection and vehicle
1965 |
1966 | BoxCars: Improving Vehicle Fine-Grained Recognition using 3D Bounding Boxes in Traffic Surveillance
1967 |
1968 | 17
1969 |
1970 | make and model recognition. Intelligent Transporta-
1971 | tion Systems, IEEE Transactions on 15(1):6–20 3,
1972 | 4
1973 |
1974 | Hu C, Bai X, Qi L, Wang X, Xue G, Mei L (2015)
1975 | Learning discriminative pattern for real-time car
1976 | brand recognition. Intelligent Transportation Sys-
1977 | tems, IEEE Transactions on 16(6):3170–3181 3, 4
1978 |
1979 | Huang S, Xu Z, Tao D, Zhang Y (2016) Part-stacked
1980 | cnn for fine-grained visual categorization. In: The
1981 | IEEE Conference on Computer Vision and Pattern
1982 | Recognition (CVPR) 2
1983 |
1984 | Krause J, Stark M, Deng J, Fei-Fei L (2013) 3D ob-
1985 | ject representations for fine-grained categorization.
1986 | In: ICCV Workshop 3dRR-13 3, 4, 7
1987 |
1988 | Krause J, Gebru T, Deng J, Li LJ, Fei-Fei L (2014)
1989 | Learning features and parts for fine-grained recog-
1990 | nition. In: Pattern Recognition (ICPR), 2014 22nd
1991 | International Conference on, pp 26–33 2
1992 |
1993 | Krause J, Jin H, Yang J, Fei-Fei L (2015) Fine-grained
1994 | recognition without part annotations. In: IEEE Con-
1995 | ference on Computer Vision and Pattern Recognition
1996 | (CVPR) 2
1997 |
1998 | Krizhevsky A, Sutskever I, Hinton GE (2012) Ima-
1999 | genet classification with deep convolutional neural
2000 | networks. In: Pereira F, Burges C, Bottou L, Wein-
2001 | berger K (eds) Advances in Neural Information Pro-
2002 | cessing Systems 25, Curran Associates, Inc., pp 1097–
2003 | 1105 4, 9, 11, 12
2004 |
2005 | LeCun Y, Bottou L, Bengio Y, Haffner P (1998)
2006 | Gradient-based learning applied to document recog-
2007 | nition. Proceedings of the IEEE 86(11):2278–2324 4
2008 | Lee S, Gwak J, Jeon M (2013) Vehicle model recog-
2009 | nition in video. International Journal of Signal Pro-
2010 | cessing, Image Processing and Pattern Recognition
2011 | 6(2):175 3, 4
2012 |
2013 | Leibe B, Cornelis N, Cornelis K, Van Gool L (2007)
2014 | Dynamic 3D scene analysis from a moving vehicle.
2015 | In: CVPR, pp 1–8 7
2016 |
2017 | Li L, Guo Y, Xie L, Kong X, Tian Q (2015) Fine-
2018 | Grained Visual Categorization with Fine-Tuned Seg-
2019 | mentation. IEEE International Conference on Image
2020 | Processing 3
2021 |
2022 | Liao L, Hu R, Xiao J, Wang Q, Xiao J, Chen J (2015)
2023 | Exploiting effects of parts in fine-grained categoriza-
2024 | tion of vehicles. In: International Conference on Im-
2025 | age Processing (ICIP) 3, 4, 7
2026 |
2027 | Lin D, Shen X, Lu C, Jia J (2015a) Deep lac: Deep local-
2028 | ization, alignment and classification for fine-grained
2029 | recognition. In: The IEEE Conference on Computer
2030 | Vision and Pattern Recognition (CVPR) 3
2031 |
2032 | Lin TY, RoyChowdhury A, Maji S (2015b) Bilinear cnn
2033 | models for fine-grained visual recognition. In: Inter-
2034 | national Conference on Computer Vision (ICCV) 2,
2035 |
2036 | 3, 10, 12, 15
2037 |
2038 | Lin YL, Morariu VI, Hsu W, Davis LS (2014) Jointly
2039 | optimizing 3D model fitting and fine-grained classifi-
2040 | cation. In: ECCV 3, 4, 7
2041 |
2042 | Liu H, Tian Y, Yang Y, Pang L, Huang T (2016a) Deep
2043 | relative distance learning: Tell the difference between
2044 | similar vehicles. In: The IEEE Conference on Com-
2045 | puter Vision and Pattern Recognition (CVPR) 3
2046 |
2047 | Liu J, Kanazawa A, Jacobs D, Belhumeur P (2012)
2048 | Dog breed classification using part localization. In:
2049 | Fitzgibbon A, Lazebnik S, Perona P, Sato Y, Schmid
2050 | C (eds) ECCV 2012, Lecture Notes in Computer Sci-
2051 | ence, vol 7572, Springer Berlin Heidelberg, pp 172–
2052 | 185 2
2053 |
2054 | Liu X, Liu W, Ma H, Fu H (2016b) Large-scale vehi-
2055 | cle re-identification in urban surveillance videos. In:
2056 | Multimedia and Expo (ICME), 2016 IEEE Interna-
2057 | tional Conference on, IEEE, pp 1–6 7
2058 |
2059 | Llorca DF, Col´as D, Daza IG, Parra I, Sotelo MA
2060 | (2014) Vehicle model recognition using geometry and
2061 | appearance of car emblems from rear view images.
2062 | In: 17th International IEEE Conference on Intelli-
2063 | gent Transportation Systems (ITSC), pp 3094–3099
2064 | 3, 4
2065 |
2066 | Matzen K, Snavely N (2013) NYC3DCars: A dataset of
2067 | 3D vehicles in geographic context. In: International
2068 | Conference on Computer Vision (ICCV) 7
2069 |
2070 | Opelt A, Fussenegger M, Pinz A, Auer P (2004) Generic
2071 | object recognition with boosting. Tech. Rep. TR-
2072 | EMT-2004-01, EMT, TU Graz, Austria, submitted
2073 | to the IEEE Tr. PAMI 7
2074 |
2075 | ¨Ozuysal M, Lepetit V, Fua P (2009) Pose estimation
2076 | for category specific multiview object localization. In:
2077 | IEEE CVPR, pp 778–785 7
2078 |
2079 | Papageorgiou C, Poggio T (1999) A trainable object de-
2080 | tection system: Car detection in static images. Tech.
2081 | Rep. 1673, (CBCL Memo 180) 7
2082 |
2083 | Parkhi OM, Vedaldi A, Zisserman A, Jawahar CV
2084 | (2012) Cats and dogs. In: IEEE Conference on Com-
2085 | puter Vision and Pattern Recognition 2
2086 |
2087 | Pearce G, Pears N (2011) Automatic make and model
2088 | recognition from frontal images of cars. In: IEEE
2089 | AVSS, pp 373–378 3, 4
2090 |
2091 | Petrovic V, Cootes TF (2004) Analysis of features for
2092 | rigid structure vehicle type recognition. In: BMVC,
2093 | pp 587–596 3, 4
2094 |
2095 | Pirsiavash H, Ramanan D, Fowlkes CC (2009) Bilinear
2096 | classifiers for visual recognition. In: Bengio Y, Schu-
2097 | urmans D, Lafferty J, Williams C, Culotta A (eds)
2098 | Advances in Neural Information Processing Systems
2099 | 22, Curran Associates, Inc., pp 1482–1490 3
2100 |
2101 | Prokaj J, Medioni G (2009) 3-D model based vehicle
2102 |
2103 | recognition. In: IEEE WACV 3, 4
2104 |
2105 | 18
2106 |
2107 | Jakub Sochor et al.
2108 |
2109 | Psyllos A, Anagnostopoulos C, Kayafas E (2011) Ve-
2110 | hicle model recognition from frontal view image
2111 | measurements. Computer Standards & Interfaces
2112 | 33(2):142 – 151, {XVI} {IMEKO} {TC4} Sympo-
2113 | sium and {XIII} International Workshop on {ADC}
2114 | Modelling and Testing 3, 4
2115 |
2116 | Rothe R, Timofte R, Van Gool L (2016) Deep expecta-
2117 | tion of real and apparent age from a single image
2118 | without facial landmarks. International Journal of
2119 | Computer Vision pp 1–14 7
2120 |
2121 | Russakovsky O, Deng J, Su H, Krause J, Satheesh S,
2122 | Ma S, Huang Z, Karpathy A, Khosla A, Bernstein
2123 | M, Berg AC, Fei-Fei L (2015) ImageNet Large Scale
2124 | Visual Recognition Challenge. IJCV 7, 10
2125 |
2126 | Savarese S, Fei-Fei L (2007) 3D generic object catego-
2127 | rization, localization and pose estimation. In: ICCV,
2128 | IEEE 7
2129 |
2130 | Simon M, Rodner E (2015) Neural activation constella-
2131 | tions: Unsupervised part model discovery with con-
2132 | volutional networks. In: International Conference on
2133 | Computer Vision (ICCV) 2, 10, 12, 15
2134 |
2135 | Simonyan K, Zisserman A (2014) Very deep convo-
2136 | lutional networks for large-scale image recognition.
2137 | CoRR abs/1409.1556 4, 9, 10, 11, 12, 15
2138 |
2139 | Sochor J, Herout A, Havel J (2016a) Boxcars: 3d boxes
2140 | as cnn input for improved fine-grained vehicle recog-
2141 | nition. In: The IEEE Conference on Computer Vision
2142 | and Pattern Recognition (CVPR) 2, 5, 8, 10, 15
2143 |
2144 | Sochor J, Jur´anek R, ˇSpaˇnhel J, Marˇs´ık L, ˇSirok´y A,
2145 | Herout A, Zemˇc´ık P (2016b) BrnoCompSpeed: Re-
2146 | view of traffic camera calibration and a comprehen-
2147 | sive dataset for monocular speed measurement. Intel-
2148 | ligent Transportation Systems (under review), IEEE
2149 | Transactions on 8
2150 |
2151 | Stark M, Krause J, Pepik B, Meger D, Little J, Schiele
2152 | B, Koller D (2012) Fine-grained categorization for 3D
2153 | scene understanding. In: BMVC 3, 7
2154 |
2155 | Stauffer C, Grimson WEL (1999) Adaptive background
2156 | mixture models for real-time tracking. In: CVPR,
2157 | vol 2, pp 246–252 8
2158 |
2159 | Taigman Y, Yang M, Ranzato M, Wolf L (2014) Deep-
2160 | Face: Closing the gap to human-level performance in
2161 | face verification. In: CVPR, pp 1701–1708 4, 15
2162 |
2163 | Wang J, Yang J, Yu K, Lv F, Huang T, Gong Y (2010)
2164 | Locality-constrained linear coding for image classifi-
2165 | cation. In: CVPR, pp 3360–3367 3
2166 |
2167 | Xiang Y, Savarese S (2012) Estimating the aspect lay-
2168 | out of object categories. In: CVPR, pp 3410–3417 7
2169 | Xiao T, Xu Y, Yang K, Zhang J, Peng Y, Zhang Z
2170 | (2015) The application of two-level attention mod-
2171 | els in deep convolutional neural network for fine-
2172 | grained image classification. In: The IEEE Confer-
2173 | ence on Computer Vision and Pattern Recognition
2174 |
2175 | (CVPR) 4
2176 |
2177 | Xie S, Yang T, Wang X, Lin Y (2015) Hyper-class
2178 | augmented and regularized deep learning for fine-
2179 | grained image classification. In: The IEEE Confer-
2180 | ence on Computer Vision and Pattern Recognition
2181 | (CVPR) 2, 3
2182 |
2183 | Yang J, Price B, Cohen S, Lee H, Yang MH (2016)
2184 | Object contour detection with a fully convolutional
2185 | encoder-decoder network. In: Proceedings of the
2186 | IEEE Conference on Computer Vision and Pattern
2187 | Recognition, pp 193–202 6, 7
2188 |
2189 | Yang L, Luo P, Change Loy C, Tang X (2015) A large-
2190 | scale car dataset for fine-grained categorization and
2191 | verification. In: The IEEE Conference on Computer
2192 | Vision and Pattern Recognition (CVPR) 4, 7
2193 |
2194 | Yang S, Bo L, Wang J, Shapiro LG (2012) Unsuper-
2195 | vised template learning for fine-grained object recog-
2196 | nition. In: Pereira F, Burges C, Bottou L, Weinberger
2197 | K (eds) Advances in Neural Information Processing
2198 | Systems 25, Curran Associates, Inc., pp 3122–3130 2
2199 | Yao B (2012) A codebook-free and annotation-free
2200 | approach for fine-grained image categorization. In:
2201 | Proceedings of the 2012 IEEE Conference on Com-
2202 | puter Vision and Pattern Recognition (CVPR), IEEE
2203 | Computer Society, Washington, DC, USA, CVPR
2204 | ’12, pp 3466–3473 2, 3
2205 |
2206 | Zeiler MD, Fergus R (2014) Visualizing and under-
2207 | standing convolutional networks. In: European con-
2208 | ference on computer vision, Springer, pp 818–833 6
2209 | Zhang B (2013) Reliable classification of vehicle
2210 | types based on cascade classifier ensembles. IEEE
2211 | Transactions on Intelligent Transportation Systems
2212 | 14(1):322–332 3, 4
2213 |
2214 | Zhang B (2014) Classification and identification of ve-
2215 | hicle type and make by cortex-like image descriptor
2216 | HMAX. IJCVR 4:195–211 3, 4
2217 |
2218 | Zhang H, Xu T, Elhoseiny M, Huang X, Zhang S,
2219 | Elgammal A, Metaxas D (2016a) Spda-cnn: Unify-
2220 | ing semantic part detection and abstraction for fine-
2221 | grained recognition. In: The IEEE Conference on
2222 | Computer Vision and Pattern Recognition (CVPR)
2223 | 2
2224 |
2225 | Zhang L, Yang Y, Wang M, Hong R, Nie L, Li X
2226 | (2016b) Detecting densely distributed graph patterns
2227 | for fine-grained image categorization. IEEE Transac-
2228 | tions on Image Processing 25(2):553–565 2
2229 |
2230 | Zhang N, Farrell R, Darrell T (2012) Pose pooling ker-
2231 | nels for sub-category recognition. In: Computer Vi-
2232 | sion and Pattern Recognition (CVPR), 2012 IEEE
2233 | Conference on, pp 3665–3672 3
2234 |
2235 | Zhang N, Farrell R, Iandola F, Darrell T (2013) De-
2236 | formable part descriptors for fine-grained recognition
2237 | and attribute prediction. In: The IEEE International
2238 |
2239 | BoxCars: Improving Vehicle Fine-Grained Recognition using 3D Bounding Boxes in Traffic Surveillance
2240 |
2241 | 19
2242 |
2243 | Conference on Computer Vision (ICCV) 2
2244 |
2245 | Zhang N, Donahue J, Girshick R, Darrell T (2014) Part-
2246 | based R-CNNs for fine-grained category detection.
2247 | In: Proceedings of the European Conference on Com-
2248 | puter Vision (ECCV) 2
2249 |
2250 | Zhang X, Xiong H, Zhou W, Lin W, Tian Q (2016c)
2251 | Picking deep filter responses for fine-grained image
2252 | recognition. In: The IEEE Conference on Computer
2253 | Vision and Pattern Recognition (CVPR) 3
2254 |
2255 | Zhou F, Lin Y (2016) Fine-grained image classification
2256 | by exploring bipartite-graph labels. In: The IEEE
2257 | Conference on Computer Vision and Pattern Recog-
2258 | nition (CVPR) 2, 3
2259 |
2260 | Zivkovic Z (2004) Improved adaptive gaussian mixture
2261 | model for background subtraction. In: ICPR, pp 28–
2262 | 31 8
2263 |
2264 |
--------------------------------------------------------------------------------
/spider/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/burness/arxiv_tools/0e3fe1bbd4cb26a4f1b5266c32e5b8e24d866c81/spider/__init__.py
--------------------------------------------------------------------------------
/spider/download_pdfs.py:
--------------------------------------------------------------------------------
1 | #-*-coding:utf-8-*-
2 | import requests
3 | import sys
4 | import urllib2
5 | from lxml import html
6 | import Queue
7 | from threading import Thread
8 | import time
9 | import os
10 | import logging
11 | import codecs
12 |
13 | # logger = logging.getLogger()
14 | logger = logging.getLogger('arxiv_tools')
15 | # handler = logging.StreamHandler()
16 | # formatter = logging.Formatter('%(asctime)s - %(filename)s:%(lineno)s - %(name)s - %(message)s' )
17 | # handler.setFormatter(formatter)
18 | # logger.addHandler(handler)
19 | # logger.setLevel(logging.DEBUG)
20 |
21 | class ArxivPdfs(object):
22 | def __init__(self, url):
23 | self.url = url
24 |
25 |     def get_links(self):
26 |         try:
27 |             result = requests.get(self.url)
28 |         except requests.RequestException:
29 |             sys.exit('failed to fetch {0}'.format(self.url))
30 |
31 | content = html.fromstring(result.content)
32 |         print 'fetched the arXiv listing page successfully'
33 | pdf_ids = content.xpath('//span[@class="list-identifier"]//a[@title="Abstract"]/text()')
34 | pdf_links = ['https://arxiv.org'+i+'.pdf' for i in content.xpath('//span[@class="list-identifier"]//a[@title="Download PDF"]/@href')]
35 | pdf_describe_links = [link.replace('pdf', 'abs', 1) for link in pdf_links]
36 | pdf_titles = [i.strip().replace('$','') for i in filter(lambda x : x!='\n', content.xpath('//div[@class="list-title mathjax"]/text()'))]
37 | pdf_authors = content.xpath('//div[@class="list-authors"]')
38 | pdf_authors_links = [','.join(pdf_author.xpath('a/@href')) for pdf_author in pdf_authors]
39 | pdf_authors_links = [','.join(['https://arxiv.org'+j for j in i.split(',')]) for i in pdf_authors_links]
40 |         pdf_authors = [pdf_author.xpath('string(.)') for pdf_author in pdf_authors]  # full text of each authors div
41 |         pdf_authors = [author.replace('\n','\\') for author in pdf_authors]  # stash newline separators as backslashes
42 |         pdf_authors = [author.replace('\n,',' ') for author in pdf_authors]  # no-op once the line above has run; kept as a safety net
43 |         pdf_authors = [author.replace('Authors: ','') for author in pdf_authors]  # drop the "Authors: " prefix
44 |         pdf_authors = [author.replace(',','') for author in pdf_authors]  # remove commas inside the raw string
45 |         pdf_authors = [author.replace('\\',',') for author in pdf_authors]  # restore author separators as commas
46 | pdf_subjects = content.xpath('//span[@class="primary-subject"]/text()')
47 | return pdf_ids, pdf_describe_links, pdf_titles, pdf_links, pdf_authors, pdf_authors_links, pdf_subjects
48 |
49 | def download_pdf(url, area, pdf_dir='./papers/pdfs/'):
50 | area = area.replace('.','_')
51 | date = time.strftime('%Y-%m-%d',time.localtime(time.time()))
52 | pdf_dir = os.path.join(pdf_dir, area+'/'+date)
53 |
54 | filename = os.path.join(pdf_dir,url.split('/')[-1])
55 | try:
56 | f = urllib2.urlopen(url)
57 | data = f.read()
58 | with open(filename, "wb") as code:
59 | code.write(data)
60 | logger.info("Download {0} completed...".format(filename))
61 | except:
62 | logger.info("Download {1} error".format(filename))
63 |
64 |
65 | class DownloadWorker(Thread):
66 | def __init__(self, queue, area):
67 | Thread.__init__(self)
68 | self.queue = queue
69 | self.area = area
70 |
71 | def run(self):
72 | while True:
73 | # Get the work from the queue and expand the tuple
74 | url = self.queue.get()
75 | if url is None:
76 | break
77 | # download_link(directory, link)
78 | download_pdf(url, self.area)
79 | self.queue.task_done()
80 |
81 | def build_url(area, show_num=1000):
82 |     '''
83 |     Build the arXiv listing URL for the given area.
84 |     args:
85 |         area: arXiv category identifier, e.g. 'cs.CL'
86 |         show_num: number of entries the listing should show, default 1000
87 |     return:
88 |         url: pastweek listing URL for the given area and show_num
89 |     '''
90 | url = 'https://arxiv.org/list/{0}/pastweek?skip=0&show={1}'.format(area, show_num)
91 | return url
92 |
93 |
94 | def pdf_info_write(area,date=None, **pdf_info):
95 | pdf_num = pdf_info['pdf_num']
96 | area = area.replace('.','_')
97 | if not date:
98 | date = time.strftime('%Y-%m-%d',time.localtime(time.time()))
99 | summary_file = os.path.join('./papers/pdfs/',area+'/'+date+'/'+'summary.csv')
100 |
101 | with codecs.open(summary_file, 'w', encoding='utf-8') as fw:
102 | for index in xrange(pdf_num):
103 | print pdf_info['pdf_ids'][index], pdf_info['pdf_titles'][index], pdf_info['pdf_links'][index]
104 | print pdf_info['pdf_authors_links'][index]
105 | print pdf_info['pdf_subjects'][index]
106 | print pdf_info['pdf_describe_links'][index]
107 |             # build one tab-separated summary row; the old encode/decode round-trip
108 |             # could break on non-ASCII authors, so keep everything unicode throughout
109 |             line = u'{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{6}\n'.format(pdf_info['pdf_ids'][index], pdf_info['pdf_titles'][index], pdf_info['pdf_links'][index],
110 |                 pdf_info['pdf_authors'][index], pdf_info['pdf_authors_links'][index], pdf_info['pdf_subjects'][index], pdf_info['pdf_describe_links'][index])
111 |             logger.info(line)
112 |             fw.write(line)  # fw is a codecs UTF-8 writer, so it expects unicode text
112 | logger.info('Write to {0} successful'.format(summary_file))
113 |
114 | def run_all(area, show_num=2, max_size=100, parallel_num=8, download_pdfs=False, pdf_dir='./papers/pdfs/'):
115 |
116 | date = time.strftime('%Y-%m-%d',time.localtime(time.time()))
117 | pdf_dir = os.path.join(pdf_dir, area.replace('.','_')+'/'+date)
118 |     if not os.path.exists(pdf_dir):
119 |         try:
120 |             os.makedirs(pdf_dir)
121 |         except OSError:
122 |             logger.info('directory {0} was already created by another thread'.format(pdf_dir))
123 | url = build_url(area, show_num)
124 | area = area.replace('.','_')
125 | logger.info('url: {0}'.format(url))
126 | arxiv_pdfs = ArxivPdfs(url)
127 | if download_pdfs:
128 | download_queue = Queue.Queue(maxsize=max_size)
129 | for x in range(parallel_num):
130 | worker = DownloadWorker(download_queue, area)
131 | worker.daemon = True
132 | worker.start()
133 | pdf_ids, pdf_describe_links, pdf_titles, pdf_links, pdf_authors, pdf_authors_links, pdf_subjects = arxiv_pdfs.get_links()
134 | if download_pdfs:
135 | for link in pdf_links:
136 | download_queue.put(link)
137 | download_queue.join()
138 | # print pdf_titles
139 | pdf_info = {}
140 | pdf_info['pdf_num'] = len(pdf_ids)
141 | pdf_info['pdf_ids'] = pdf_ids
142 | pdf_info['pdf_describe_links'] = pdf_describe_links
143 | pdf_info['pdf_titles'] = pdf_titles
144 | pdf_info['pdf_links'] = pdf_links
145 | pdf_info['pdf_authors'] = pdf_authors
146 | pdf_info['pdf_authors_links'] = pdf_authors_links
147 | pdf_info['pdf_subjects'] = pdf_subjects
148 |     logger.info('extracted links for {0} pdfs'.format(len(pdf_links)))
149 | logger.info('subject: {0}'.format(area))
150 | pdf_info_write(area, **pdf_info)
151 |     # pdf downloads, when requested, are handled above by the worker queue
152 |
153 |
154 | if __name__ == '__main__':
155 | start = time.time()
156 | run_all('cs.cv', show_num=8, max_size=1)
157 | logger.info("took time : {0}".format(time.time() - start))
158 | # arXiv:1703.00856 Araguaia Medical Vision Lab at ISIC 2017 Skin Lesion Classification Challenge https://arxiv.org/pdf/1703.00856.pdf Rafael Teixeira Sousa, Larissa Vasconcellos de Moraes https://arxiv.org/find/cs/1/au:+Sousa_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moraes_L/0/1/0/all/0/1 Computer Vision and Pattern Recognition (cs.CV) https://arxiv.org/abs/1703.00856.pdf
159 |
160 | # test the pdf_info_write
161 | # pdf_num = 7
162 | # pdf_ids = ['arXiv:1703.00856']
163 | # pdf_titles = ['Araguaia Medical Vision Lab at ISIC 2017 Skin Lesion Classification Challenge']
164 | # pdf_links = ['https://arxiv.org/pdf/1703.00856.pdf']
165 | # pdf_authors = ['Rafael Teixeira Sousa, Larissa Vasconcellos de Moraes']
166 | # pdf_authors_links = ['https://arxiv.org/find/cs/1/au:+Sousa_R/0/1/0/all/0/1,https://arxiv.org/find/cs/1/au:+Moraes_L/0/1/0/all/0/1']
167 | # pdf_subjects = ['Computer Vision and Pattern Recognition (cs.CV)']
168 | # pdf_describe_links = ['https://arxiv.org/abs/1703.00856.pdf']
169 | # pdf_info = {}
170 | # pdf_info['pdf_num'] = len(pdf_ids)
171 | # pdf_info['pdf_ids'] = pdf_ids
172 | # pdf_info['pdf_titles'] = pdf_titles
173 | # pdf_info['pdf_describe_links'] = pdf_describe_links
174 | # pdf_info['pdf_links'] = pdf_links
175 | # pdf_info['pdf_authors'] = pdf_authors
176 | # pdf_info['pdf_authors_links'] = pdf_authors_links
177 | # pdf_info['pdf_subjects'] = pdf_subjects
178 | # pdf_info_write('cs.cv',date='2017-03-05', **pdf_info)
179 |
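A note on portability: the module above is written for Python 2 (urllib2, Queue, print statements, xrange). A rough Python 3 sketch of the same download step, assuming only the requests package the module already imports (download_pdf_py3 is an illustrative name, not part of the module):

    import os
    import time
    import requests

    def download_pdf_py3(url, area, pdf_dir='./papers/pdfs/'):
        # mirror the original layout: ./papers/pdfs/<area>/<YYYY-MM-DD>/<id>.pdf
        area = area.replace('.', '_')
        date = time.strftime('%Y-%m-%d')
        target_dir = os.path.join(pdf_dir, area, date)
        os.makedirs(target_dir, exist_ok=True)
        filename = os.path.join(target_dir, url.split('/')[-1])
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()  # surface HTTP errors instead of logging and moving on
        with open(filename, 'wb') as fh:
            fh.write(resp.content)
        return filename

The summary.csv that pdf_info_write emits is plain tab-separated text with seven columns per row, joined with '\t' and no quoting, so splitting on tabs is enough to read it back. A minimal sketch (Python 3; the path is hypothetical):

    with open('./papers/pdfs/cs_cv/2017-04-02/summary.csv', encoding='utf-8') as fh:
        for row in fh:
            arxiv_id, title, pdf_link, authors, author_links, subject, abs_link = row.rstrip('\n').split('\t')
            print(arxiv_id, title)
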
--------------------------------------------------------------------------------
/spider/read_pdfs.py:
--------------------------------------------------------------------------------
1 | import sys
2 | from pdfminer.pdfdocument import PDFDocument
3 | from pdfminer.pdfparser import PDFParser
4 | from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
5 | from pdfminer.pdfdevice import PDFDevice, TagExtractor
6 | from pdfminer.pdfpage import PDFPage
7 | from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
8 | from pdfminer.cmapdb import CMapDB
9 | from pdfminer.layout import LAParams
10 | from pdfminer.image import ImageWriter
11 |
12 | # main CLI entry point (this script closely mirrors pdfminer's pdf2txt.py tool)
13 | def main(argv):
14 | import getopt
15 | def usage():
16 | print ('usage: %s [-d] [-p pagenos] [-m maxpages] [-P password] [-o output]'
17 | ' [-C] [-n] [-A] [-V] [-M char_margin] [-L line_margin] [-W word_margin]'
18 | ' [-F boxes_flow] [-Y layout_mode] [-O output_dir] [-R rotation]'
19 | ' [-t text|html|xml|tag] [-c codec] [-s scale]'
20 | ' file ...' % argv[0])
21 | return 100
22 | try:
23 | (opts, args) = getopt.getopt(argv[1:], 'dp:m:P:o:CnAVM:L:W:F:Y:O:R:t:c:s:')
24 | except getopt.GetoptError:
25 | return usage()
26 | if not args: return usage()
27 | # debug option
28 | debug = 0
29 | # input option
30 | password = ''
31 | pagenos = set()
32 | maxpages = 0
33 | # output option
34 | outfile = None
35 | outtype = None
36 | imagewriter = None
37 | rotation = 0
38 | layoutmode = 'normal'
39 | codec = 'utf-8'
40 | pageno = 1
41 | scale = 1
42 | caching = True
43 | showpageno = True
44 | laparams = LAParams()
45 | for (k, v) in opts:
46 | if k == '-d': debug += 1
47 | elif k == '-p': pagenos.update( int(x)-1 for x in v.split(',') )
48 | elif k == '-m': maxpages = int(v)
49 | elif k == '-P': password = v
50 | elif k == '-o': outfile = v
51 | elif k == '-C': caching = False
52 | elif k == '-n': laparams = None
53 | elif k == '-A': laparams.all_texts = True
54 | elif k == '-V': laparams.detect_vertical = True
55 | elif k == '-M': laparams.char_margin = float(v)
56 | elif k == '-L': laparams.line_margin = float(v)
57 | elif k == '-W': laparams.word_margin = float(v)
58 | elif k == '-F': laparams.boxes_flow = float(v)
59 | elif k == '-Y': layoutmode = v
60 | elif k == '-O': imagewriter = ImageWriter(v)
61 | elif k == '-R': rotation = int(v)
62 | elif k == '-t': outtype = v
63 | elif k == '-c': codec = v
64 | elif k == '-s': scale = float(v)
65 | #
66 | PDFDocument.debug = debug
67 | PDFParser.debug = debug
68 | CMapDB.debug = debug
69 | PDFResourceManager.debug = debug
70 | PDFPageInterpreter.debug = debug
71 | PDFDevice.debug = debug
72 | #
73 | rsrcmgr = PDFResourceManager(caching=caching)
74 | if not outtype:
75 | outtype = 'text'
76 | if outfile:
77 | if outfile.endswith('.htm') or outfile.endswith('.html'):
78 | outtype = 'html'
79 | elif outfile.endswith('.xml'):
80 | outtype = 'xml'
81 | elif outfile.endswith('.tag'):
82 | outtype = 'tag'
83 | if outfile:
84 | outfp = file(outfile, 'w')
85 | else:
86 | outfp = sys.stdout
87 | if outtype == 'text':
88 | device = TextConverter(rsrcmgr, outfp, codec=codec, laparams=laparams,
89 | imagewriter=imagewriter)
90 | elif outtype == 'xml':
91 | device = XMLConverter(rsrcmgr, outfp, codec=codec, laparams=laparams,
92 | imagewriter=imagewriter)
93 | elif outtype == 'html':
94 | device = HTMLConverter(rsrcmgr, outfp, codec=codec, scale=scale,
95 | layoutmode=layoutmode, laparams=laparams,
96 | imagewriter=imagewriter)
97 | elif outtype == 'tag':
98 | device = TagExtractor(rsrcmgr, outfp, codec=codec)
99 | else:
100 | return usage()
101 | for fname in args:
102 | fp = file(fname, 'rb')
103 | interpreter = PDFPageInterpreter(rsrcmgr, device)
104 | for page in PDFPage.get_pages(fp, pagenos,
105 | maxpages=maxpages, password=password,
106 | caching=caching, check_extractable=True):
107 | page.rotate = (page.rotate+rotation) % 360
108 | interpreter.process_page(page)
109 | fp.close()
110 | device.close()
111 | outfp.close()
112 | return
113 |
114 | if __name__ == '__main__': sys.exit(main(sys.argv))
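
A note on the extraction script: it is essentially pdfminer's pdf2txt.py command-line tool and assumes Python 2 (note the file() builtin). Under the maintained pdfminer.six fork, the common plain-text case reduces to a single high-level helper; a minimal sketch, with a hypothetical input path:

    from pdfminer.high_level import extract_text

    # pdfminer.six (Python 3): extract the text layer of one downloaded PDF
    text = extract_text('paper.pdf')
    print(text[:500])
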
--------------------------------------------------------------------------------
/wechat/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/burness/arxiv_tools/0e3fe1bbd4cb26a4f1b5266c32e5b8e24d866c81/wechat/__init__.py
--------------------------------------------------------------------------------