Abstract:
While temporal point process (TPP) models are expressive, clustering event sequences by their temporal patterns has rarely been investigated. To address this problem, we take a reinforcement learning perspective and assume the observed sequences are generated by a mixture of latent policies. The problem is then to cluster sequences with different temporal patterns into the underlying policies while learning each of the policy models.
Our model is flexible in that: i) all components are neural networks, including the policy network that models the temporal point process; and ii) to handle event sequences of varying length, we use inverse reinforcement learning, decomposing each observed sequence into states (the RNN hidden embedding of the history) and actions (the time interval to the next event) and learning a reward function from them, which improves performance and efficiency compared to existing methods (see the sketch below).
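As a rough illustration of this state/action decomposition (the abstract does not specify an implementation, so all names and the choice of an exponential inter-event distribution are hypothetical), a TPP policy can be sketched as an RNN that embeds the event history into a state and parameterizes a distribution over the next inter-event time, which plays the role of the action:

```python
import torch
import torch.nn as nn

class PointProcessPolicy(nn.Module):
    """Sketch of a TPP policy: an RNN state encoder plus an action
    head that parameterizes the time interval to the next event."""

    def __init__(self, hidden_dim: int = 32):
        super().__init__()
        # State: RNN hidden embedding of the event history.
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden_dim, batch_first=True)
        # Action: rate of an exponential distribution over the next
        # inter-event time (one simple choice; not from the paper).
        self.rate_head = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Softplus())

    def forward(self, intervals: torch.Tensor):
        # intervals: (batch, seq_len, 1) past inter-event times.
        states, _ = self.rnn(intervals)           # (batch, seq_len, hidden)
        rates = self.rate_head(states)            # (batch, seq_len, 1)
        return torch.distributions.Exponential(rates)

policy = PointProcessPolicy()
history = torch.rand(4, 10, 1)                    # 4 sequences, 10 events each
action_dist = policy(history)
next_intervals = action_dist.sample()             # sampled actions
log_prob = action_dist.log_prob(next_intervals)   # usable for policy learning
```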
We use an Expectation-Maximization (EM) algorithm that alternates between estimating the cluster label of each sequence and learning the corresponding policies. Our method performs well on both synthetic and real-world datasets.
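A minimal sketch of one EM iteration under these assumptions (the per-cluster sequence log-likelihoods are assumed to come from the learned policies; the function and variable names are illustrative, not from the paper):

```python
import torch

def em_step(log_liks: torch.Tensor, log_prior: torch.Tensor):
    """One EM iteration for a mixture of K latent policies.

    log_liks:  (num_seqs, K) log-likelihood of each sequence
               under each policy.
    log_prior: (K,) log mixture weights.
    """
    # E-step: posterior responsibility of each cluster per sequence.
    log_post = log_liks + log_prior            # unnormalized log posterior
    resp = torch.softmax(log_post, dim=1)      # (num_seqs, K)

    # M-step (weights): update the mixture proportions.
    new_log_prior = torch.log(resp.mean(dim=0))

    # M-step (policies): each policy k would then be updated by
    # maximizing the responsibility-weighted objective
    # resp[:, k] * log_liks[:, k], e.g. via gradient ascent.
    weighted_obj = (resp * log_liks).sum()
    return resp, new_log_prior, weighted_obj
```

Hard cluster labels, if needed, would be read off as `resp.argmax(dim=1)` after convergence.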