CV
Education
Skills
- Analytical Tools: Tableau, Excel (LOOKUP, Pivot Tables, Charts, INDEX/MATCH), QlikView, SAS, Power BI, SSMS, SSRS, Looker
- Databases: MySQL, PostgreSQL, Oracle, MS SQL, Dremio SQL
- Programming & Tools: Python (Pandas, Matplotlib, NumPy), R, SQL, PL/SQL, Java, HTML, Bootstrap, LookML, Airflow, Segment
- Machine Learning: Regression, Time Series, Classification, Clustering, ARIMA, KNN, ANOVA
- Project Management: JIRA, Alation, Lucidchart, Confluence, Notion
- Certifications: Tableau for Data Science, Machine Learning A-Z, Master SQL for Data Science, Google Ad Grants
Industrial Experience
- Data Management Analyst, Avant, Apr 2022 - Jun 2023
- Redesigned 4 card collections campaigns by pushing data to Segment (customer data platform) and Braze (marketing tool), improving customer reach to reduce default rates and increase overall cash flow
- Audited customer data attributes during the data migration from Presto to Dremio via Python Jupyter notebooks and designed a customer financial reporting data view in Looker to improve financial reporting accuracy in the new environment
- Implemented data integrity checks on up-versioned customer account data via SQL validation scripts and designed a Looker data view that gives business users updated data for accurate financial reporting
- Reengineered the automated financial dispute resolution data pipeline and validated the changes, reducing manual effort by ~75% and enabling faster resolution of identified disputes
- Refactored SQL queries and migrated various Looker views to the Dremio platform to ensure accurate data values are reflected as key performance indicators in financial dashboards
- Optimized a SQL query for a financial reporting use case, reducing data computation time by 44% and improving performance
- Engineered SQL data validation scripts for changes to customer data pipelines by reviewing PySpark code, increasing pipeline robustness (illustrated in the sketch after this role)
- Executed an investigative analysis of patterns in missing data deliverables for the customer account pipeline, raising data completeness to ~98%
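A minimal sketch of the kind of SQL integrity check described above, run from Python; the schema, column names, and connection URL are hypothetical placeholders, not the actual production setup.

```python
# Illustrative only: table names, columns, and the connection URL are assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@host:5432/warehouse")  # placeholder

checks = {
    # Row-count parity between the legacy and migrated account tables.
    "row_count_match": """
        SELECT
            (SELECT COUNT(*) FROM legacy.customer_accounts)   AS legacy_rows,
            (SELECT COUNT(*) FROM migrated.customer_accounts) AS migrated_rows
    """,
    # Null checks on attributes that financial reporting depends on.
    "null_critical_fields": """
        SELECT COUNT(*) AS null_rows
        FROM migrated.customer_accounts
        WHERE current_balance IS NULL OR account_status IS NULL
    """,
}

for name, query in checks.items():
    print(name, pd.read_sql(query, engine).to_dict("records")[0])
```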
- Data Analyst, The University of Texas at Austin - Office of Strategy & Policy, Jan 2021 - Mar 2022
- Designed College Career Knowledge Assessment reports in SQL & SSRS for 52 districts that track key performance indicators (KPIs) and provide insights on students' as well as professors' performance on various academic programs supported by Texas public district schools
- Developed automated data integrity checks for 4 Post Learning Initiative reports to reduce data errors by ~15%
- Constructed End of Year reports for 197 Texas public district schools to monitor the overall program's growth over previous years by providing interactive visualizations in SQL & SSRS
- Debugged and re-engineered a Python web scraping script to extract around 1 million records of course data by implementing the Selenium automation package, reducing manual effort by 88% (see the sketch after this role)
- Conceptualized and modeled database schema diagram to revamp the internal database structure for the MapMyPath project supported by Texas Higher Education Coordinating Board
- Tuned and modified SQL scripts to reduce data loading time by ~15% for various Tableau visualizations on the Texas Higher Education Coordinating Board dashboard
- Acted as the department's key resource for translating business requirements from UT Austin's internal and external stakeholders into technical queries and furnishing accurate data for senior management reporting
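A minimal sketch of the Selenium-based scraping approach mentioned above; the URL, CSS selectors, and table structure are hypothetical.

```python
# Illustrative Selenium sketch; the URL and selectors are assumptions.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes a compatible chromedriver is available
driver.get("https://example.edu/course-catalog")  # placeholder URL

records = []
for row in driver.find_elements(By.CSS_SELECTOR, "table.courses tr"):
    cells = [cell.text for cell in row.find_elements(By.TAG_NAME, "td")]
    if cells:
        records.append(cells)

driver.quit()
print(f"Scraped {len(records)} course rows")
```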
- Assistant System Engineer, Tata Consultancy Services, Dec 2017 - Jun 2018
- Designed a report in SQL and Excel to track the number of defects identified and resolved across sprints, showcasing the team's performance over time
- Developed an automated Python script for the Sanity and Load Testing failure triage process, reducing manual effort by 18%
- Devised end-to-end testing for a storage systems project by designing several automated test scripts using Selenium
Projects
- Frito-Lay Data Analysis Using Microsoft Excel & SAS, Mar 2020 - May 2020
- Performed ad hoc data analysis on 9 million records to visualize market share and weekly sales by product categories
- Identified key factors affecting the odds of people buying Fritos products by designing a multinomial logit model with 83% accuracy
- Forecasted overall sales of the Fritos brand and drilled the forecast down to market and product level using an ARIMA model
- Predicted number of units at market level for various product types to efficiently manage inventory across 140 locations
- Combined demographics with customer segments derived from Recency-Frequency-Monetary (RFM) analysis to surface purchasing insights (see the sketch after this project)
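A minimal sketch of the RFM segmentation referenced above, using Pandas; the input file and column names are hypothetical.

```python
# Illustrative RFM scoring; file name and columns are assumptions.
import pandas as pd

sales = pd.read_csv("transactions.csv", parse_dates=["order_date"])  # placeholder

snapshot = sales["order_date"].max()
rfm = sales.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_id", "nunique"),
    monetary=("amount", "sum"),
)

# Quartile scores: lower recency is better; higher frequency/monetary is better.
rfm["r_score"] = pd.qcut(rfm["recency"], 4, labels=[4, 3, 2, 1]).astype(int)
rfm["f_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["m_score"] = pd.qcut(rfm["monetary"], 4, labels=[1, 2, 3, 4]).astype(int)
rfm["segment"] = rfm[["r_score", "f_score", "m_score"]].astype(str).agg("".join, axis=1)
print(rfm.head())
```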
- Predictive Analytics Using SAP Business Objects, Nov 2019 - Dec 2019
- Estimated the future value of liquid assets by performing a time series analysis of the cash flow data
- Created interaction variables to reduce forecasting error of the predictive model by 6.8%
- Established rules for customer search patterns and constructed a flow graph by conducting association rule mining
- Predicted number of customers responding to the promotional offer with 91.97% accuracy by defining a classification model
- Walmart Sales Prediction Using Python, Apr 2019 – Jun 2019
- Analyzed the impact of Markdown values and Holidays on Weekly Sales of various departments for 45 Walmart stores
- Forecasted overall sales for 3 months by store type using an Auto-ARIMA predictive model with an RMSE of 407.8 units (see the sketch after this project)
- Improved sales forecast accuracy by implementing a GRU predictive model, reducing the root mean squared error by 6.14%
- Created an ensemble regression model by averaging regressors, reducing RMSE and improving accuracy to 97.18%
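A minimal sketch of the Auto-ARIMA forecast described above, using the pmdarima package; the input file, column names, and seasonal settings are assumptions.

```python
# Illustrative Auto-ARIMA sketch; the CSV and columns are assumptions.
import pandas as pd
import pmdarima as pm

sales = pd.read_csv("store_type_weekly_sales.csv", parse_dates=["week"], index_col="week")  # placeholder

model = pm.auto_arima(
    sales["weekly_sales"],
    seasonal=True,
    m=52,                  # weekly data with yearly seasonality
    stepwise=True,
    suppress_warnings=True,
)

forecast = model.predict(n_periods=13)  # roughly three months of weekly periods
print(forecast)
```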
- Exploratory Data Analysis Using Tableau & Hive, Apr 2019 – Jun 2019
- Imported geolocation and truck information data into the Hadoop Distributed File System (HDFS) by incorporating external tables using Hive
- Eliminated skewness in the data by identifying and normalizing the risk factor associated with truck accidents
- Identified cities prone to a high number of accidents by integrating Apache Hadoop with Tableau dashboards
- Bitcoin Price Prediction Using ARIMA Forecasting in R, Mar 2019
- Utilized cryptocurrency, crude oil, and stock price variables to improve the forecasting accuracy of bitcoin prices by 9%
- Visualized periodogram and analyzed results to identify the trends in bitcoin prices based on seasonality
- Optimized ARIMA model to efficiently forecast bitcoin prices and visualized the predicted results using ggplot2
- Google AdWords Campaign, Feb 2019 – Mar 2019
- Strategized SEM campaigns for Thea’s Star of Hope to increase brand awareness and donations towards cancer research studies
- Optimized bidding strategies by conducting adequate keyword research and improving the quality score of the ads
- Improved click-through rate (CTR) of a brand awareness campaign by 5.12%
- Designed dynamic ad groups for an awareness campaign, leading to a click-through rate (CTR) of 45.94%
- Sales Prediction Using Multi-Linear Regression Analysis, Mar 2019
- Predicted purchase prices based on customer demographics by extracting and cleaning 2 million records of Black Friday sales data
- Improved the fit of the multiple linear regression model by 7.5% using techniques such as backward elimination (see the sketch after this project)
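A minimal sketch of p-value-based backward elimination with statsmodels; the dataset, target column, and 0.05 threshold are assumptions.

```python
# Illustrative backward elimination; file name and columns are assumptions.
import pandas as pd
import statsmodels.api as sm

data = pd.read_csv("black_friday.csv")  # placeholder
X = pd.get_dummies(data.drop(columns=["Purchase"]), drop_first=True).astype(float)
y = data["Purchase"]

features = list(X.columns)
while features:
    model = sm.OLS(y, sm.add_constant(X[features])).fit()
    pvalues = model.pvalues.drop("const")
    worst = pvalues.idxmax()
    if pvalues[worst] > 0.05:   # drop the least significant predictor and refit
        features.remove(worst)
    else:
        break

print(model.summary())
```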
- Accio Viand Food Ordering App, Aug 2018 – Dec 2018
- Produced a system analysis report by employing project management tools and techniques
- Identified functional and non-functional requirements and modeled an information system by designing UML diagrams
- Visual Analysis of Natural Disasters at ConocoPhillips Refinery Locations, Oct 2018
- Performed an exploratory design analysis of different ConocoPhillips refinery locations
- Designed a narrative visualization of refineries by developing a dashboard of different story points and annotations
- Cataloged appropriate rhetoric and established connections between the story points
- Standards Elimination Parser Using Natural Language Processing
- Designed a parser for converting units/standards into the target standards required by the user
- Engineered logic for converting number words to numerals, e.g., "four hundred" becomes 400 (see the sketch after this project)
- Strategized and implemented logic for converting times to the required time zone
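A minimal sketch of the words-to-number logic mentioned above; the vocabulary is abbreviated and edge-case handling is omitted.

```python
# Illustrative words-to-number conversion, e.g. "four hundred" -> 400.
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
         "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
         "twelve": 12, "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
         "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
MULTIPLIERS = {"thousand": 1_000, "million": 1_000_000}

def words_to_number(text: str) -> int:
    total, current = 0, 0
    for word in text.lower().replace("-", " ").split():
        if word in UNITS:
            current += UNITS[word]
        elif word == "hundred":
            current *= 100
        elif word in MULTIPLIERS:
            total += current * MULTIPLIERS[word]
            current = 0
    return total + current

print(words_to_number("four hundred"))                        # 400
print(words_to_number("twenty three thousand five hundred"))  # 23500
```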