Data Mining Projects

Abstract:

Matrix decomposition is an essential tool for mining the big data generated by modern applications. However, processing very large data sets on a single machine is inefficient or infeasible. Moreover, big data are often collected and stored across multiple machines.

Thus, such data typically carry heterogeneous noise, and big data analytics needs a distributed form of matrix decomposition. Such a method should scale well, model the heterogeneous noise, and handle the communication cost of a distributed system.

We propose a distributed Bayesian matrix decomposition model (DBMD) for big data mining and clustering. We implement the distributed computation with three strategies: accelerated gradient descent, the alternating direction method of multipliers (ADMM), and statistical inference.
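
The abstract does not spell out the model, so the following is only a rough sketch of the general idea behind gradient-based distributed decomposition, under stated assumptions: each machine holds a row block X_k of the data, keeps its own factor W_k locally, and all machines share one factor H. It uses plain (not accelerated) gradient descent and omits the Bayesian noise model entirely; the function names `local_gradient` and `distributed_factorize` are hypothetical and are not taken from DBMD.

```python
import numpy as np

def local_gradient(X_k, W_k, H):
    """Gradient of ||X_k - W_k H||_F^2 with respect to the shared factor H."""
    return 2.0 * W_k.T @ (W_k @ H - X_k)

def distributed_factorize(blocks, rank, n_iters=50, seed=0):
    """Illustrative sketch: blocks[k] plays the role of the data on machine k."""
    rng = np.random.default_rng(seed)
    n_cols = blocks[0].shape[1]
    H = rng.standard_normal((rank, n_cols))  # shared factor
    losses = []
    for _ in range(n_iters):
        # Local step: each worker fits its own W_k by least squares
        # (solves H^T W_k^T ~= X_k^T), with no communication needed.
        Ws = [np.linalg.lstsq(H.T, X_k.T, rcond=None)[0].T for X_k in blocks]

        # Communication step: only small (rank x n_cols) gradients and
        # (rank x rank) Gram matrices are summed, never the raw data.
        grad = sum(local_gradient(X_k, W_k, H) for X_k, W_k in zip(blocks, Ws))
        gram = sum(W_k.T @ W_k for W_k in Ws)

        # Step size 1/L, where L = 2 * lambda_max(gram) is the Lipschitz
        # constant of the gradient in H; this keeps the loss from increasing.
        step = 1.0 / (2.0 * np.linalg.eigvalsh(gram)[-1])
        H = H - step * grad

        losses.append(sum(np.linalg.norm(X_k - W_k @ H) ** 2
                          for X_k, W_k in zip(blocks, Ws)))
    return Ws, H, losses
```

In this sketch the only quantities that cross machine boundaries are the summed gradient and Gram matrix, which is one simple way to keep communication independent of the number of rows each machine stores.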

We analyze the theoretical convergence of these algorithms. To address noise heterogeneity, we use an optimal plug-in weighted average that reduces the variance of the estimate. Our algorithms scale well to big data and outperform Scalable-NMF and scalable k-means++, two typical distributed methods.
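
The abstract does not give the weights used in the plug-in average. As a hedged illustration, the sketch below uses inverse-variance weighting, the standard minimum-variance way to combine independent unbiased estimates, with estimated noise variances "plugged in" for the unknown true ones. The function names and the example numbers are hypothetical and may differ from what DBMD actually does.

```python
import numpy as np

def plug_in_weighted_average(estimates, noise_vars):
    """Combine per-machine estimates with weights proportional to 1/variance.

    estimates:  array of shape (K, ...) holding one estimate per machine.
    noise_vars: length-K array of (estimated) noise variances, one per machine.
    """
    estimates = np.asarray(estimates, dtype=float)
    noise_vars = np.asarray(noise_vars, dtype=float)
    weights = 1.0 / noise_vars
    weights /= weights.sum()  # normalize so the weights sum to 1
    return np.tensordot(weights, estimates, axes=1)

def combined_variance(noise_vars):
    """Variance of the inverse-variance weighted average of independent estimates."""
    return 1.0 / np.sum(1.0 / np.asarray(noise_vars, dtype=float))

# Two machines: the second is three times noisier, so it gets weight 0.25.
estimate = plug_in_weighted_average([[1.0, 1.0], [3.0, 3.0]], [1.0, 3.0])
print(estimate)                       # [1.5 1.5]
print(combined_variance([1.0, 3.0]))  # 0.75
```

For comparison, the unweighted mean of the same two machines would have variance (1 + 3) / 4 = 1.0, so down-weighting the noisier machine gives a lower-variance estimate in this example.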

