├── document_clustering.pdf
└── README.md

/document_clustering.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ONSBigData/Clustering_paper/HEAD/document_clustering.pdf

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Unsupervised Document Clustering with Cluster Topic Identification

This is a repository for the paper 'Unsupervised Document Clustering with Cluster Topic Identification', to be published in the Office for National Statistics' working paper series.

The paper details a pipeline for clustering documents in an unsupervised way, together with a suggested automated procedure for identifying each cluster's topic.

An accompanying Jupyter notebook works through an example of the pipeline as set out in the paper.
--------------------------------------------------------------------------------