Satishraju R.
MS in Data Science
GQP
2019 Spring
Project Sponsor
Kronos
Project
We were a team of four students, all in the MS in Data Science program. Our goal for the GQP was to analyze the knowledge base articles to determine what is covered and where the duplication lies, and to devise a strategy for improving the documentation. We were provided with the dataset; the corpus had more than 19,000 documents covering different articles. Our approach was to apply unsupervised learning to the data: we computed tf-idf features, reduced their dimensionality with PCA, and clustered the documents in the resulting low-dimensional space.
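The pipeline described above (tf-idf, then PCA, then clustering) might look roughly like the following minimal sketch. It assumes scikit-learn, which the project description does not name. It also assumes k-means as the clustering algorithm, since the description does not say which one was used. The function name `cluster_articles`, its parameters, and the four toy documents are hypothetical stand-ins for the real corpus.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_articles(texts, n_components=2, n_clusters=2, seed=0):
    """Hypothetical helper: tf-idf -> PCA -> k-means (k-means is an assumption)."""
    # Turn each document into a tf-idf weighted term vector.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    # PCA needs a dense matrix; fine for a toy corpus, costly for a large one.
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(
        tfidf.toarray()
    )
    # Group the documents in the reduced space.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(
        reduced
    )
    return reduced, labels


# Toy documents standing in for knowledge base articles (not real data).
docs = [
    "How to reset a forgotten password",
    "Reset your password when you have forgotten it",
    "Configure employee timecard approval",
    "Approve an employee timecard as a manager",
]
reduced, labels = cluster_articles(docs)
print(labels)
```

Documents that land in the same cluster and sit close together in the reduced space are candidates for duplicated coverage. For a corpus of this size, densifying the tf-idf matrix may be too memory-hungry, and a sparse-friendly reduction such as `TruncatedSVD` could be substituted; that would be a change from the PCA step described here.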