Data Mining Projects

Abstract:

Text clustering is a key step in text data analysis and has been studied extensively by the text mining community. Most text clustering algorithms rely on the bag-of-words model, which produces high-dimensional, sparse representations and ignores the structure and word order of the text.
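
The two bag-of-words limitations mentioned above can be illustrated with a minimal sketch. The example sentences and the helper name `bag_of_words` are illustrative assumptions, not part of the project code.

```python
from collections import Counter

def bag_of_words(text):
    """Map a text to word counts; word order is discarded."""
    return Counter(text.lower().split())

# Limitation 1: sequence information is lost.
# Two sentences with opposite meanings get the same representation.
a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")
same_representation = (a == b)

# Limitation 2: vectors are high-dimensional and sparse.
# Each document uses only a small part of the shared vocabulary.
docs = [
    "dog bites man",
    "stock prices fell sharply",
    "the team won the final",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})
vectors = [[bag_of_words(doc)[word] for word in vocab] for doc in docs]
zero_fraction = sum(v.count(0) for v in vectors) / (len(vocab) * len(docs))

print(same_representation)
print(len(vocab), round(zero_fraction, 2))
```

Even with three short documents, most vector entries are zero; with a realistic vocabulary the vectors grow to tens of thousands of dimensions while staying almost entirely empty.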

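Deep learning models such as convolutional and recurrent neural networks treat texts as sequences, but they depend on supervised signals, which clustering tasks lack, and their results are hard to explain. We propose a deep feature-based text clustering (DFTC) framework that incorporates pretrained text encoders into the clustering task. Because it builds on the sequence representations these encoders produce, the model does not depend on supervision.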
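
One plausible reading of this idea is a three-step pipeline: encode each text into a dense vector, normalize the vectors, and cluster them without labels. The sketch below follows that reading under stated assumptions. The abstract does not name the clustering algorithm, so k-means is an assumption here, and `toy_encode` is a hypothetical stand-in that returns made-up vectors so the example runs without downloading a model. In practice `encode` would wrap a real pretrained text encoder.

```python
import math

def normalize(vector):
    """Scale a vector to unit length (no-op for the zero vector)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_texts(texts, encode, k):
    """Encode, normalize, cluster. No labels are used at any step."""
    features = [normalize(encode(text)) for text in texts]
    return kmeans(features, k)

# Hypothetical stand-in for a pretrained encoder: made-up 3-d vectors.
TOY_FEATURES = {
    "the match ended in a draw": [0.9, 0.1, 0.0],
    "the striker scored twice": [0.8, 0.2, 0.1],
    "shares fell after the earnings report": [0.1, 0.9, 0.2],
    "the central bank raised rates": [0.0, 0.8, 0.3],
}

def toy_encode(text):
    return TOY_FEATURES[text]

texts = list(TOY_FEATURES)
labels = cluster_texts(texts, toy_encode, k=2)
for text, label in zip(texts, labels):
    print(label, text)
```

Passing the encoder in as a callable keeps the clustering step independent of any particular model, so different pretrained encoders could be compared on the same data.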

In our experiments, the model outperforms classic text clustering algorithms and the state-of-the-art pretrained language model BERT on almost all of the datasets considered. Explaining the clustering results is also important for understanding how the deep learning approach works, so our clustering framework includes an explanation module that helps users understand the results.
