Machine Learning Projects for Beginners
If you’re just starting out with machine learning, it can be overwhelming to choose a project that is both interesting and achievable. Here are twenty machine learning projects, each drawn from a recent research paper, that are great for beginners:
- Attention Mechanism-Based BP Neural Network Recommender Framework
ABSTRACT: Deep neural networks (DNNs) improve the prediction accuracy of recommender systems through nonlinear representation learning, but they incur high computational and storage costs, and with limited rating input they tend to overfit. BPAM++, a Back Propagation (BP) neural network recommendation framework with an attention mechanism, addresses these issues. The BP neural network learns the complex relationships between the target user and its neighbors and between the target item and its neighbors. Because BP networks are shallow, they reduce computational and storage costs and avoid the overfitting that DNNs suffer when ratings are scarce, while an attention mechanism captures the global effect that the target user’s nearest users exert on their own nearest-user sets. Experiments on eight benchmark datasets validate the model.
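To get a feel for the core idea behind attention-weighted neighborhood models like BPAM++, here is a minimal sketch (not the paper's implementation): it scores a target user's nearest neighbors by dot-product attention and predicts a rating as the attention-weighted average of their ratings. The function name, the softmax weighting, and all shapes are illustrative assumptions.

```python
import numpy as np

def attention_weighted_prediction(target_user_vec, neighbor_vecs, neighbor_ratings):
    """Predict a rating as an attention-weighted average of neighbor ratings.

    target_user_vec:  (d,)  embedding of the target user
    neighbor_vecs:    (k, d) embeddings of the k nearest users
    neighbor_ratings: (k,)  their ratings for the target item
    """
    # Attention scores: dot-product similarity between target and neighbors
    scores = neighbor_vecs @ target_user_vec
    # Softmax normalization turns scores into attention weights
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Prediction is the attention-weighted average of neighbor ratings
    return float(weights @ neighbor_ratings)

rng = np.random.default_rng(0)
user = rng.normal(size=8)
neighbors = rng.normal(size=(5, 8))
ratings = np.array([4.0, 3.5, 5.0, 2.0, 4.5])
print(attention_weighted_prediction(user, neighbors, ratings))
```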
- Crossing-City POI Recommendation Deep Neural Network
ABSTRACT: Smartphones generate large volumes of location-based social media data, such as check-ins, which has encouraged machine learning studies of point-of-interest (POI) recommendation. POIs in cities a user has not visited before, however, are rarely recommended well. This paper introduces ST-TransRec, a deep neural network for crossing-city POI recommendation that combines deep representation learning, transfer learning, and density-based resampling. The deep network learns POI embeddings and user preferences, transfer learning bridges the gap between city-specific features, and, because POI distributions are spatially imbalanced, density-based spatial resampling matches POIs across cities. Extensive experiments on two real-world datasets show that ST-TransRec outperforms leading crossing-city POI recommendation methods.
- High-Performance Real-Time Matrix Retrieval Index
ABSTRACT: Machine learning embeddings have changed how data are represented: words, images, and audio are commonly embedded, and matrices represent many real-world objects; each row of a matrix can represent, say, a word in a document. Many applications generate matrices continuously and need real-time query answering, so these matrices must be retrieved efficiently. This paper proposes a fast real-time matrix retrieval index. LSM-trees allow real-time insertion and query response, but matrix indexes take longer to search and use more memory. To address these challenges, we propose an index built on precise and fuzzy inverted lists together with a series of novel memory- and search-efficiency techniques: vector signatures, residual sorting, hashing-based lookup, and dictionary initialization ensure index quality. The index supports real-time matrix search while using less memory than the current state-of-the-art method.
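As a rough illustration of the inverted-list idea, here is a toy index that assumes nearest-centroid quantization of row vectors; the paper's vector signatures, residual sorting, and LSM-tree integration are not reproduced, and all class and variable names are hypothetical.

```python
from collections import defaultdict
import numpy as np

class InvertedMatrixIndex:
    """Toy inverted-list index: each row vector is quantized to a code word,
    and the inverted list maps code words to the IDs of matrices containing them."""

    def __init__(self, codebook):
        self.codebook = codebook          # (c, d) array of centroid vectors
        self.lists = defaultdict(set)     # code id -> matrix ids

    def _code(self, vec):
        # Nearest-centroid quantization of one row vector
        return int(np.argmin(np.linalg.norm(self.codebook - vec, axis=1)))

    def insert(self, matrix_id, matrix):
        for row in matrix:
            self.lists[self._code(row)].add(matrix_id)

    def query(self, matrix):
        # Candidate matrices share at least one quantized row with the query
        candidates = set()
        for row in matrix:
            candidates |= self.lists[self._code(row)]
        return candidates

rng = np.random.default_rng(0)
index = InvertedMatrixIndex(rng.normal(size=(16, 4)))   # 16 code words in 4-d
index.insert("doc1", rng.normal(size=(10, 4)))
index.insert("doc2", rng.normal(size=(12, 4)))
print(index.query(rng.normal(size=(3, 4))))
```

A query then scans only the inverted lists of its own row codes instead of comparing against every stored matrix.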
- Multi-criteria Representative Selection from Manifolds
ABSTRACT: Representative selection picks a few informative exemplars from a large dataset. Existing data selection methods rarely manage, all at once, to handle non-linear data structures, sample concise and non-redundant subsets, reject outliers, and produce interpretable results. This paper introduces MOSAIC, a representative selection method that produces descriptive sketches of arbitrary manifold structures. Using a novel quadratic formulation, MOSAIC selects samples that maximize global representation power, minimize redundancy, and reject disruptive information by detecting outliers. Theoretical analyses show that the sampled representatives maximize data coverage in a transformed space and geometrically characterize the sketch, and a highly scalable randomized implementation accelerates the algorithm. MOSAIC outperforms state-of-the-art algorithms in achieving the desired characteristics of a representative subset while remaining robust to various outlier types.
- A Novel Cluster Size Reduction and Diversity Method for Optimizing Ensemble Classifiers
This paper proposes an optimized ensemble classifier method. To reduce class imbalance, incremental clustering partitions the data into class-pure clusters, and each cluster is then balanced with samples from all classes drawn near its centroid. The balanced clusters train unbiased base classifiers, and the differing training inputs diversify the resulting classifier pool, from which an evolutionary algorithm combinatorially selects an optimal ensemble. The proposed method produces a highly accurate ensemble classifier with a small number of components and is compared to state-of-the-art ensemble classifiers on 31 benchmark datasets from the UCI machine learning repository.
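The cluster-then-balance-then-vote pipeline can be sketched with scikit-learn as below; this is a simplified stand-in that assumes k-means clustering, per-class nearest-to-centroid sampling, and majority voting, rather than the paper's incremental clustering and evolutionary pool optimization.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

def clustered_balanced_ensemble(X, y, n_clusters=5, per_class=20, seed=0):
    """Train one base classifier per cluster; each cluster's training set is
    balanced by taking, for every class, the per_class samples of that class
    closest to the cluster centroid (integer class labels assumed)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    models = []
    for centroid in km.cluster_centers_:
        train_idx = []
        for cls in np.unique(y):
            cls_idx = np.where(y == cls)[0]
            # Distance of every member of this class to the cluster centroid
            d = np.linalg.norm(X[cls_idx] - centroid, axis=1)
            train_idx.extend(cls_idx[np.argsort(d)[:per_class]])
        models.append(DecisionTreeClassifier(random_state=seed)
                      .fit(X[train_idx], y[train_idx]))
    return models

def ensemble_predict(models, X):
    # Majority vote over the base classifiers
    votes = np.stack([m.predict(X) for m in models])
    return np.apply_along_axis(lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
```

Each base learner sees a differently balanced slice of the data, which supplies the diversity the ensemble relies on.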
- Single-Source Deep Unsupervised Visual Domain Adaptation Review
Large labelled training datasets have helped deep neural networks perform well on many benchmark vision tasks, but in many applications labelled data is expensive and time-consuming to collect. To cope with limited labelled training data, models trained on a large-scale labelled source domain are often applied directly to a sparsely labelled or unlabelled target domain, yet domain shift and dataset bias hinder such direct transfer. Domain adaptation (DA) is the machine learning problem of learning a model from a source domain that works well on a related target domain. This article reviews recent single-source deep unsupervised DA methods for visual tasks, covering benchmark datasets and DA strategies, and proposes new research directions. We review discrepancy-based, adversarial discriminative, adversarial generative, and self-supervision-based single-source unsupervised DA methods, and finally discuss future research challenges and possible solutions.
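Discrepancy-based DA methods commonly minimize a distribution distance such as Maximum Mean Discrepancy (MMD) between source and target features. The snippet below is a standard biased RBF-kernel estimator of squared MMD, included for illustration rather than taken from the article.

```python
import numpy as np

def mmd_rbf(Xs, Xt, gamma=1.0):
    """Biased estimator of squared Maximum Mean Discrepancy with an RBF kernel.
    Discrepancy-based DA minimizes such a term between source (Xs) and
    target (Xt) feature distributions."""
    def k(A, B):
        # Pairwise squared Euclidean distances, then RBF kernel
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(Xs, Xs).mean() + k(Xt, Xt).mean() - 2 * k(Xs, Xt).mean()

rng = np.random.default_rng(0)
Xs = rng.normal(size=(100, 5))                 # source features
Xt = rng.normal(loc=0.5, size=(100, 5))        # shifted target features
print(mmd_rbf(Xs, Xt, gamma=0.5))
```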
- Knowledge-Graph-Based Recommender Systems Survey
Recommender systems model user preferences to reduce information overload and improve the user experience of online applications. Despite these personalization efforts, data sparsity and cold-start issues still plague recommender systems. Knowledge-graph-based recommendation has grown in popularity because it mitigates both issues and can explain why items are recommended. This paper analyzes knowledge-graph-based recommender systems, grouping recent work in the field into embedding-based, connection-based, and propagation-based methods, with each category subdivided by approach characteristics. We examine how the algorithms use the knowledge graph for accurate and explainable recommendation and conclude with several research avenues.
- Modern Deep Neural Networks for Traffic Prediction: Trends, Methods, and Challenges
Traffic congestion is a global economic and environmental issue, and predicting traffic helps reduce it. Traffic prediction research has grown since the late 1970s: early studies used ARIMA and its variants, researchers then embraced machine learning models for their versatility, and deep neural networks are now popular for their predictive power despite their complex structure. Although deep traffic prediction models abound, literature surveys of them are rare. We review deep neural network approaches to traffic prediction: we explain the popular deep architectures used in the traffic flow prediction literature, categorize and describe them, discuss their similarities and differences, and outline the field’s challenges and future directions.
- Adaptive Hierarchical Attention-Enhanced Gated Network Integrating Reviews for Item Recommendation
Many studies have improved recommendation performance by integrating reviews with ratings, but these works have flaws: (1) they often overlook the need to integrate review and interaction features dynamically, and treating the two equally may misinterpret user preferences; (2) soft attention models only the local semantic information of words, and irrelevant features leave the attention map neither discriminative nor detailed. AHAG, an adaptive hierarchical attention-enhanced gated network, integrates reviews for item recommendation. AHAG adaptively distills reviews to reveal users’ intentions; a gated network dynamically fuses the extracted features and selects those relevant to user preferences; a hierarchical attention mechanism identifies fine-grained semantic features and their dynamic interactions; and a neural factorization machine predicts ratings from the high-order non-linear interactions. AHAG outperforms state-of-the-art methods on seven real-world datasets, and its attention weights highlight the relevant parts of reviews, improving the interpretability of the recommendation task.
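Gated fusion of two feature vectors, the mechanism such models use to weigh review features against interaction features, fits in a few lines. The sigmoid gate below is a generic formulation with illustrative names and shapes, not AHAG's exact architecture.

```python
import numpy as np

def gated_fusion(review_feat, interaction_feat, Wg, bg):
    """Fuse two feature vectors with a learned sigmoid gate:
    g = sigmoid(Wg @ [review; interaction] + bg)
    fused = g * review_feat + (1 - g) * interaction_feat
    """
    z = np.concatenate([review_feat, interaction_feat])
    g = 1.0 / (1.0 + np.exp(-(Wg @ z + bg)))
    return g * review_feat + (1 - g) * interaction_feat

rng = np.random.default_rng(1)
d = 16
r, s = rng.normal(size=d), rng.normal(size=d)
Wg, bg = rng.normal(size=(d, 2 * d)) * 0.1, np.zeros(d)
print(gated_fusion(r, s, Wg, bg).shape)   # (16,)
```

Because the gate is learned per dimension, the model can lean on reviews for some latent factors and on interactions for others.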
- Adaptive Lower-level Driven Compaction Optimizes LSM-Tree Key-Value Stores
ABSTRACT: Many NoSQL and SQL systems use log-structured merge (LSM) tree key-value (KV) stores for online big-data applications such as social networking, graph processing, and machine learning. Lazy compaction improves write efficiency in LSM-tree KV stores by accumulating more data per batch, but online processing cannot tolerate the tail latency of batched writes. Lower-level Driven Compaction (LDC) reduces compaction granularity for lower latency and reduces write amplification for higher throughput, and Adaptive LDC (ALDC) adapts its key compaction threshold to workload characteristics. ALDC improves both tail latency and throughput.
- Deep Multi-View Anomaly Detection on Attributed Networks
ABSTRACT: The explosion of complex systems modeled as attributed networks has boosted anomaly detection research on such networks, with applications in many high-impact domains. Many approaches concatenate heterogeneous views into a single feature vector, ignoring their statistical incompatibility, even though multi-view data detects anomalies better than single-view data. Moreover, users often want to find specific kinds of abnormality based on particular attributes, abnormal patterns behave differently in different views, and most methods cannot adapt to such user preferences. We therefore propose Alarm, a multi-view framework that incorporates user preferences into anomaly detection and addresses heterogeneous attribute characteristics through multiple graph encoders and a well-designed aggregator supporting both self-learning and user-guided learning. Experiments on the Disney, Books, and Enron datasets demonstrate Alarm’s improved AUC-based detection accuracy and its support for user-oriented anomaly detection.
- Dynamic Selection-Based Hybrid Time Series Forecasting
ABSTRACT: Hybrid systems that combine statistical and machine learning (ML) techniques through residual (error-forecasting) modelling are known for their accuracy and their ability to forecast time series with different characteristics. Residual modelling is needed because of random fluctuations, complex nonlinear patterns, and heteroscedastic behavior, but selecting, specifying, and training an ML model to forecast residuals is costly and difficult: underfitting, overfitting, and misspecification can reduce system accuracy or even degrade the linear time series forecast. Dynamic residual forecasting (DReF) is a hybrid system that uses a modified dynamic selection (DS) algorithm to choose the ML model best suited to forecast a given residual series pattern and to decide whether that model is a promising candidate to improve the linear combination’s time series forecast. DReF thus reduces the uncertainty of ML model selection and improves the forecast, determining the best DS algorithm parameters for each data set. Five popular ML models are used: multilayer perceptron, support vector regression, radial basis function networks, long short-term memory, and convolutional neural networks. On ten well-known time series, DReF outperforms single and hybrid models from the literature on most data sets.
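The underlying hybrid idea, fitting a linear model first and an ML model on its residuals, is easy to prototype. This sketch uses an autoregressive linear model and a random forest on the residuals (evaluated in-sample for brevity) and omits DReF's dynamic selection entirely; the toy series and lag order are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

def lag_matrix(series, p):
    """Rows of p lagged values predicting the next point."""
    X = np.array([series[i:i + p] for i in range(len(series) - p)])
    return X, series[p:]

# Toy series: trend + nonlinearity + noise
rng = np.random.default_rng(0)
t = np.arange(300, dtype=float)
series = 0.05 * t + np.sin(t / 8.0) + rng.normal(scale=0.2, size=t.size)

X, y = lag_matrix(series, p=5)
linear = LinearRegression().fit(X, y)          # stage 1: linear forecast
residuals = y - linear.predict(X)              # what the linear model misses
Xr, yr = lag_matrix(residuals, p=5)            # stage 2: ML model on residuals
ml = RandomForestRegressor(n_estimators=50, random_state=0).fit(Xr, yr)

# Hybrid forecast = linear forecast + predicted residual correction
hybrid = linear.predict(X)[5:] + ml.predict(Xr)
print("hybrid MSE:", np.mean((hybrid - y[5:]) ** 2))
```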
- Weighted MinHash Review
ABSTRACT: Data similarity (or distance) computation is the foundation of many high-level machine learning and data mining applications based on similarity measures, but big data’s “3V” nature (volume, velocity, and variety) makes exact similarity computation difficult in large-scale real-world scenarios. Hashing estimates similarity efficiently in both theory and practice, and weighted MinHash efficiently estimates the generalized Jaccard similarity of weighted sets. This review categorizes weighted MinHash algorithms into quantization-based, “active index”-based, and other approaches, and traces their evolution and inherent connections from integer-valued to real-valued weights. A Python toolbox of the algorithms is provided on GitHub, and standard and weighted MinHash algorithms are compared on similarity estimation error and information retrieval tasks.
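Standard (unweighted) MinHash, which weighted MinHash generalizes, is compact enough to show in full: two sets share a minimum hash value with probability equal to their Jaccard similarity. The affine hash family below is a common textbook choice, not taken from the review.

```python
import numpy as np

P = 2_147_483_647                              # a large prime modulus

def minhash_signature(items, num_hashes=128, seed=0):
    """MinHash signature: for each random affine hash function,
    keep the minimum hash value over the set's items."""
    rng = np.random.default_rng(seed)
    a = rng.integers(1, P, size=num_hashes)
    b = rng.integers(0, P, size=num_hashes)
    sig = np.full(num_hashes, P, dtype=np.int64)
    for x in items:
        hx = hash(x) % P                       # map item into the prime field
        sig = np.minimum(sig, (a * hx + b) % P)
    return sig

def estimated_jaccard(sig_a, sig_b):
    # P(min hashes collide) equals the Jaccard similarity of the two sets
    return float(np.mean(sig_a == sig_b))

A, B = set("abcdefgh"), set("defghijk")
print(estimated_jaccard(minhash_signature(A), minhash_signature(B)),
      "vs exact", len(A & B) / len(A | B))
```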
- Large-scale Machine Learning Survey
ABSTRACT: Machine learning predicts accurately in text mining, visual classification, and recommender systems, but most advanced machine learning methods are slow on large datasets. Large-scale machine learning (LML) aims to learn patterns from big data efficiently. This paper surveys LML methods to lay a foundation for future research. First, we divide the methods by how they improve scalability: model simplification, which reduces computational complexity; optimization approximation, which improves efficiency; and computation parallelism, which increases capacity. We then categorize the methods of each perspective by their target scenario and present representative intrinsic strategies. Finally, we discuss their drawbacks and potential directions.
- Semi-supervised Dimensionality Reduction Adaptive Local Embedding Learning
Semi-supervised learning is a popular machine learning problem. Semi-supervised Adaptive Local Embedding learning (SALE) is a novel locality-preserving dimensionality reduction framework. It learns a local discriminative embedding by constructing a k1 Nearest Neighbours (k1NN) graph on the labelled data to explore its intrinsic structure, i.e., the sub-manifolds of non-Gaussian labelled data. All samples are then mapped into the learned embedding, and a second k2NN graph constructed on the embedded data explores the global sample structure. Clustering unlabelled and labelled data into the same sub-manifold improves the discrimination of the embedded data. We propose orthogonal and whitening semi-supervised dimensionality reduction methods under the SALE framework, solve the resulting NP-hard problems with an efficient alternating iterative optimization algorithm, and show strong performance on local structure exploration and classification tasks.
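Constructing a k-nearest-neighbour graph, the building block SALE uses twice (k1NN on the labelled data, k2NN on the embedding), looks like this; the dense pairwise distance computation is a simplification that assumes small data.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-nearest-neighbour adjacency matrix of the rows of X,
    the basic building block of locality-preserving embeddings."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)               # no self-loops
    W = np.zeros((n, n))
    nn = np.argsort(d2, axis=1)[:, :k]         # indices of the k nearest points
    rows = np.repeat(np.arange(n), k)
    W[rows, nn.ravel()] = 1.0
    return np.maximum(W, W.T)                  # symmetrize

X = np.random.default_rng(0).normal(size=(8, 3))
print(knn_graph(X, k=2).sum(axis=1))           # each node has >= 2 neighbours
```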
- An Experimentation Platform for On-Chip Integration of Analog Neural Networks: Toward Trusted and Robust Analog/RF ICs
ABSTRACT: Our platform prototypes low-cost analog neural networks for on-chip integration with analog/RF circuits. Classifying analog measurements from on-chip sensors supports self-test, self-tuning, and trust/aging monitoring. Priorities include neural network circuits with low energy and area budgets, learning that is robust to analog inaccuracies, and long-term retention of the learned functionality. Our sub-μW chip features a reconfigurable array of synapses and neurons operating below threshold. Dual-mode weight storage in the synapse circuits allows fast bidirectional weight updates during training and permanent storage of the learned functionality. We evaluate a robust learning strategy on benchmark problems such as XOR2-6 and two-spirals classification.
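The XOR2 benchmark mentioned above is small enough to solve in software. The sketch below trains a 2-4-1 sigmoid network by plain backpropagation, purely as a software analogue of the on-chip learning task; the layer sizes, learning rate, and iteration count are arbitrary assumptions.

```python
import numpy as np

# XOR2 truth table: the classic benchmark mentioned above
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):                      # plain gradient descent
    h = sigmoid(X @ W1 + b1)               # hidden layer
    out = sigmoid(h @ W2 + b2)             # output layer
    # Backpropagation of the squared error
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * X.T @ d_h;  b1 -= 0.5 * d_h.sum(0)

print(np.round(out, 2).ravel())            # should approach [0, 1, 1, 0]
```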
- Piecewise Linear Neural Network Linear Region Number Analysis
Deep neural networks (DNNs) solve complex machine learning problems, and their expressive power has helped them succeed. Piecewise linear neural networks (PLNNs) compose linear pieces to model complex patterns, so their expressive power can be measured by the number of linear regions they realize, and counting and bounding those regions analyzes PLNN expressive power theoretically. We first refine the count of linear regions of ReLU PLNNs, then calculate the maximum number of linear regions of single-layer PLNNs with general piecewise linear (PWL) activation functions. The upper and lower bounds on the linear regions of multilayer PLNNs scale polynomially with the number of neurons per layer and with the number of pieces of the PWL activation function, but exponentially with the number of layers. Deep PLNNs with complex activation functions are therefore better at computing complex and structured functions, which partially explains their classification and function fitting performance.
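For concreteness, the single-hidden-layer ReLU case has a clean closed-form bound: m ReLU units define m hyperplanes in the n-dimensional input space, and Zaslavsky's hyperplane-arrangement bound caps the number of linear regions. This is the classical bound for the single-layer case, not the paper's refined multilayer result.

```latex
% m ReLU units define m hyperplanes in \mathbb{R}^n; an arrangement of
% m hyperplanes partitions the space into at most
\[
  R(m, n) \;=\; \sum_{j=0}^{n} \binom{m}{j}
\]
% regions (Zaslavsky's bound). Worked example: m = 3 units on n = 2 inputs:
\[
  R(3, 2) \;=\; \binom{3}{0} + \binom{3}{1} + \binom{3}{2}
         \;=\; 1 + 3 + 3 \;=\; 7 .
\]
```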
- Deep Active Learning for Remote Sensing Bioinspired Scene Classification
Scene parsing, robot motion planning, and autonomous driving require accurate scene classification, and deep recognition models have performed well for a decade. Deep architectures, however, cannot encode human visual perception such as gaze movements and the accompanying cognitive processes. This article proposes a biologically inspired deep scene classification model that robustly discovers and represents human gaze behaviors within a unified deep active learning (UDAL) framework. To characterize objects of different sizes, an objectness measure decomposes each scene into semantically aware object patches, and local-global feature fusion automatically weights the multimodal features representing each region at a low level. UDAL mimics human visual perception by recognizing the semantically important regions of various scenes; importantly, it integrates semantically salient region detection and deep gaze shifting path (GSP) representation learning into a principled framework that requires only partial semantic tags, while a sparsity penalty suppresses contaminated or redundant low-level regional features. Finally, a kernel SVM classifies scenes using an image kernel machine built from the deep GSP features of all scene images. Our method was tested competitively on six famous scenery datasets, including remote sensing images.
- Consensus Accelerated Proximal Reweighted Iteration for Nonconvex Minimizations (Capri)
Nonconvex regularized optimization problems are common in machine learning and data processing, and, owing to the problem structure, consensus optimization for them has used iteratively reweighted schemes. This paper proposes adding an inertial term to each iteration to accelerate it. The classical decentralized algorithms can be implemented on a connected network whose agents communicate and compute locally, and decreasing stepsizes also accelerate the iteratively reweighted algorithm; we propose decentralized schemes for both. Both algorithms are proved to converge under several assumptions on the objective function, and the Kurdyka-Łojasiewicz property yields convergence rates with constant stepsizes. The algorithms also perform well numerically.
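Inertial acceleration of a proximal-gradient iteration can be seen on a simple convex example. The sketch below applies FISTA-style extrapolation to an L1-regularized least-squares problem, a centralized and convex stand-in for the paper's decentralized nonconvex setting; all names and parameters are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def inertial_proximal_gradient(A, b, lam=0.1, iters=200):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1, adding an inertial
    (momentum) extrapolation to each proximal-gradient step."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x_prev = x = np.zeros(A.shape[1])
    theta = 1.0
    for _ in range(iters):
        theta_next = (1 + np.sqrt(1 + 4 * theta**2)) / 2
        y = x + (theta - 1) / theta_next * (x - x_prev)   # inertial extrapolation
        grad = A.T @ (A @ y - b)
        x_prev, x = x, soft_threshold(y - grad / L, lam / L)
        theta = theta_next
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 100))
x_true = np.zeros(100); x_true[:5] = rng.normal(size=5)   # sparse ground truth
b = A @ x_true + 0.01 * rng.normal(size=50)
print(np.round(inertial_proximal_gradient(A, b)[:8], 3))
```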
- Cold-Start Active Sampling Via γ-Tube
Active learning (AL) queries unlabeled data to improve the generalization of a classification hypothesis, and evaluation policies assess the sampling. Because such a policy needs an initial labelled set, it may degenerate into a cold-start hypothesis. We first show that typical AL sampling is geometric sampling over the minimum enclosing balls (MEBs) of clusters, a conceptual geometry used in generalization analysis; in SVMs, the hard-margin support vector data description of a cluster yields its MEB. Following the γ-tube structure of geometric clustering, we divide the MEB covering a cluster into a γ-tube and a γ-ball. Our theoretical insight is that the γ-tube can estimate the error disagreement between hypotheses in the original space over the MEB and in the sampling space over the γ-ball, and generalization analysis shows that sampling in the tube yields higher-probability bounds on achieving nearly zero generalization error. Building on these analyses, we present a tube AL (TAL) algorithm against cold-start sampling that applies the informative sampling policy of AL over the γ-tube, so that queries and the evaluation policies of active sampling can be decoupled. Thanks to its tube structure, TAL handles cold-start sampling better than standard AL evaluation baselines, and an edge detection application extends our theory.
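TAL's γ-tube policy cannot be reconstructed from the abstract alone, but the pool-based active-learning loop it improves on can. Below is classic uncertainty sampling with a logistic regression learner; the dataset, learner, and all parameter choices are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Pool-based active learning with uncertainty sampling
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=10, replace=False))  # small initial set
pool = [i for i in range(len(X)) if i not in labeled]

clf = LogisticRegression(max_iter=1000)
for _ in range(20):                                   # 20 query rounds
    clf.fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])
    # Query the pool point whose prediction is closest to 0.5 (most uncertain)
    query = pool[int(np.argmin(np.abs(proba[:, 1] - 0.5)))]
    labeled.append(query)
    pool.remove(query)

print("accuracy:", clf.score(X, y))
```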
There are many machine learning projects like these that beginners can take on. Choose one that fits your skills, your interests, and the data you have access to, and consider joining online communities or attending machine learning events to network and learn from other practitioners. With persistence and practice, you can learn to build complex machine learning models and solve real-world problems.