└── README.md

/README.md:
--------------------------------------------------------------------------------
# Document-Image-Classification-with-Intra-Domain-Transfer-Learning-and-Stacked-Generalization-of-Deep

Blog post: https://medium.com/@shikharsambal3/how-i-built-a-document-classification-system-using-deep-convolutional-neural-networks-e1d9a83cbabd

RVL-CDIP can be regarded as the ImageNet of the document image community.
It’s certainly the largest dataset we’ve seen in the literature, with 400,000 document images in total.
The dataset is noisy, and the composition of documents varies widely within each class. Uncompressed, it is ~100GB
and comprises 16 classes of document types, with 25,000 samples per class. Example classes include email, resume,
and invoice. This project achieved an accuracy of over 93%, beating the benchmark score of 92% listed at
https://paperswithcode.com/sota/document-image-classification-on-rvl-cdip

--------------------------------------------------------------------------------