Python Machine Learning Projects

Encoding high-cardinality string categorical variables

Abstract: One-hot encoding of categorical variables is often needed in statistical models. High-dimensional feature vectors make this strategy fail as categories increase. One-hot encoding also lacks morphological information for string entries. High-cardinality string categorical variables…

Python Machine Learning Projects

Deep Learning for Spatio-Temporal Data Mining A Survey

Abstract: Spatial-temporal data is increasingly available due to the rapid development of GPS, mobile devices, and remote sensing. Human mobility understanding, smart transportation, urban planning, public safety, health care, and environmental management depend on spatio-temporal…

Python Machine Learning Projects

Consensus One-step Multi-view Subspace Clustering

Abstract: Multimedia, machine learning, and data mining communities are focusing on multi-view clustering. Multi-view subspace clustering (MVSC) is a popular multi-view clustering algorithm because it can reveal the intrinsic low-dimensional clustering structure hidden across views….

Python Machine Learning Projects

Hypergraph Partitioning With Embeddings

Abstract: Distributing large sparse matrix operations in scientific computing is like hypergraph partitioning. Hypergraphs have “hyperedges” that connect any number of nodes. Thus, solving or approximating hypergraph partitioning is NP-hard. Current algorithms solve this problem…

Python Machine Learning Projects

HAM Hybrid Associations Models for Sequential Recommendation

Abstract: Given a user’s purchase/rating trajectories, sequential recommendation recommends the next few items they’re most likely to buy/review. It helps users choose their favorites. This manuscript uses hybrid associations models (HAM) to generate sequential recommendations…

Python Machine Learning Projects

Gradual Machine Learning for Entity Resolution

Abstract: Entity resolution (ER), a classification problem, is difficult on real data due to dirty values. The best ER solutions use learning models like deep neural networks, which need lots of labeled training data. High-quality…

Python Machine Learning Projects

Fully Dynamic k-Center Clustering with Improved Memory Efficiency

Abstract: Machine learning libraries need static and dynamic clustering algorithms. The sliding window model or simpler models have dominated dynamic machine learning and data mining algorithm development. Many real-world applications require arbitrary deletions and insertions….