Data Mining Projects

Large Scale Tensor Factorization via Parallel Sketches

Abstract: Tensor factorization methods have become popular because tensors can model multi-relational data directly. We propose ParaSketch, a massively parallel tensor factorization algorithm for large tensors. The idea is to compress the large tensor into multiple small ones, decompose…

Iterative Refinement for Multi-source Visual Domain Adaptation

Abstract: Multi-source domain adaptation requires reducing the domain discrepancy between the source and target domains, and then assessing domain relevance to determine how much knowledge should be transferred. Most previous approaches ignored domain discrepancies and relevance…

Improving I/O Complexity of Triangle Enumeration

Abstract: In the age of big data, many graph algorithms must operate in external memory and maintain performance regardless of problem size. Triangle listing algorithms must carefully combine edges from multiple partitions to detect cycles…

Identifying User Relationship on WeChat Money-Gifting Network

Abstract: With the rise of online social networks, identifying or classifying real-life relationships between users has become useful for financial fraud detection. People in different relationships usually give each other meaningful gifts on different dates…

Heuristic 3D Interactive Walks for Multilayer Network Embedding

Abstract: Network embedding is a tool for solving network analytics tasks. Most existing methods focus on single-layered homogeneous or heterogeneous networks. Multilayer networks, which are heterogeneous networks with multiple edge/relation types, can naturally represent many real-world complex systems. Multilayer network embedding struggles to capture and use…