About Me

I'm a final-year CSE major who is constantly in awe of the way data shapes our lives. With a passion for solving complex real-world problems and a background in Statistics, Predictive Modelling and Machine Learning, I specialize in using data-driven solutions to form business strategies. I'm currently working on projects involving sales data analysis, the development of recommendation systems and modelling user preferences for e-commerce platforms.

Professional Experience

Machine Learning Intern | SpanIdea Systems Pvt. Ltd.

January 2022 - Ongoing

As part of the Machine Learning team, I am responsible for user personalisation, modelling customer preferences from past order data. I also work to uncover insights from sales and clickstream data to support better targeting.

  • Product sales analysis using e-commerce data.
  • Developing databases for non-personalized or loosely personalized recommendations.
  • Modelling customer preferences using content-based and collaborative filtering algorithms.
  • Model evaluation using precision and recall metrics.
  • Cohort analysis to gauge customer retention and loyalty.
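
For readers unfamiliar with how precision and recall apply to recommendations, here is a minimal sketch. It is illustrative only and is not the code used during the internship: the function name `precision_recall_at_k`, the cut-off `k` and the sample item IDs are all assumptions made for this example.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall for the top-k items of a ranked recommendation list.

    recommended: ranked list of unique item IDs (best first), hypothetical data
    relevant:    collection of item IDs the customer actually interacted with
    k:           how many recommendations to evaluate (must be positive)
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")

    top_k = recommended[:k]
    relevant_set = set(relevant)

    # A "hit" is a recommended item that the customer really did want.
    hits = sum(1 for item in top_k if item in relevant_set)

    # Precision: share of the recommendations that were relevant.
    # Dividing by len(top_k) rather than k handles lists shorter than k.
    precision = hits / len(top_k) if top_k else 0.0

    # Recall: share of the relevant items that were recommended.
    recall = hits / len(relevant_set) if relevant_set else 0.0

    return precision, recall


if __name__ == "__main__":
    ranked = ["shoes", "socks", "hat", "belt"]   # made-up recommendations
    bought = {"socks", "belt", "scarf"}          # made-up purchases
    print(precision_recall_at_k(ranked, bought, k=3))
```

Precision rewards lists with few irrelevant items, while recall rewards lists that cover most of what the customer wanted, so the two are usually reported together.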
Marketing Intern | Data Sutram

June 2019 - July 2019

I worked closely with the team developing the company's local intelligence platform and brainstormed ideas to market their DaaS solutions in sectors like Retail, Healthcare and Pharmaceuticals.


    Vehicle Route Optimizer (in progress)

    Solves a form of the Travelling Salesman Problem (TSP) by calculating optimal routes between two or more points. A LightGBM model is trained on the New York Taxi Trips dataset, and its predictions are fed to a genetic algorithm for optimization.

    Project Repository

    DataMetric - Customer Segmentation (in progress)

    A customer segmentation model with data preprocessing pipelines and K-means clustering that provides insight into an organization's customer base. It is built as a Flask web application, uses PostgreSQL and is deployed on Heroku.

    Project Repository

    Anomaly Detection In High Dimensional Data

    A research and development project that analyzes several outlier detection algorithms, with a focus on the Isolation Forest algorithm. A Cluster-Based Isolation Forest (CBIF) algorithm is proposed to overcome the drawbacks of the iForest algorithm.

    Project Repository

    Covid Deaths - Data Exploration

    An exploration of the Covid Deaths dataset from ourworldindata.org using SQL queries. The results are visualised in a dashboard on Tableau Public.

    Project Repository

    Measures of Dispersion for Streaming Data

    A research paper that explores measures for computing the variability of unidimensional and multidimensional data. It compares the accuracy and stability of methods such as the covariance matrix, Welford's online algorithm and the textbook one-pass algorithm.

    Project Report
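
To give a sense of two of the methods named above, here is a minimal Python sketch of Welford's online algorithm next to the textbook one-pass formula for sample variance. It is an illustration written for this page, not code from the paper; the names `RunningVariance` and `textbook_variance` and the sample values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class RunningVariance:
    """Welford's online algorithm: mean and variance of a stream, one value at a time."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        self.count += 1
        delta = x - self.mean               # deviation from the old mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)  # uses the updated mean

    def variance(self) -> float:
        """Sample variance (n - 1 denominator)."""
        if self.count < 2:
            raise ValueError("variance needs at least two observations")
        return self.m2 / (self.count - 1)


def textbook_variance(values) -> float:
    """Textbook one-pass formula: (sum(x^2) - sum(x)^2 / n) / (n - 1).

    Subtracting two large, nearly equal sums can lose precision when the
    data has a large mean relative to its spread.
    """
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    if n < 2:
        raise ValueError("variance needs at least two observations")
    return (total_sq - total * total / n) / (n - 1)


if __name__ == "__main__":
    stream = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]  # made-up values with a large offset
    rv = RunningVariance()
    for value in stream:
        rv.update(value)
    print("Welford: ", rv.variance())
    print("Textbook:", textbook_variance(stream))
```

Both functions read the data once and keep only a few running totals, which is what makes them usable on streams. They differ in how much rounding error they accumulate.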

    skribe - A Flask-based Blogging Platform

    A blogging web application that allows writers to publish articles on their own subdomain. The application uses Python's Flask framework with SQLite as the database.

    Project Repository

    Experiences and Achievements

    Get In Touch