Hi, my name is

Vinmathi Iyappan

Data Analyst | BI Professional

I'm passionate about transforming complex data into actionable insights that drive strategy and impact. With 7+ years of industry experience, I specialize in predictive modeling, dashboard development, and AI-driven public health solutions.

About Me

Hi, I'm Vinmathi Iyappan, a Data Analyst and Business Intelligence Professional with a Master's in Business Analytics from California State University, East Bay. With 7+ years of experience in Healthcare, Manufacturing, and Logistics, I specialize in data analysis, predictive modeling, and business intelligence.

I bring together technical expertise and data storytelling, from building ETL pipelines and ML models to designing dashboards that inform strategic decisions. I’m skilled in Python, SQL, Power BI, and Tableau, and experienced with mainframe technologies.

Experience

Research Assistant

California State University, East Bay • Sept 2023 – Present

Using Utility-driven Clustering for Analyzing U.S. Prescription Drug Expenditure
  • With prescription drug expenditure rising annually, this research aims to optimize healthcare resource allocation and guide policy decisions for cost reduction.
  • Wrangled 500K+ records, applied dimensionality reduction and utility-based clustering to understand national spending patterns.
  • Identifying cost-saving opportunities that could reduce drug expenditure by ~5%, offering insights for targeted policy reform.
Using LLMs to Understand the Impact of Epilepsy Disclosure or Concealment on Workplace Outcomes
  • Analyzed 540K+ Reddit discussions using NLP, sentiment analysis, and Large Language Models to uncover workplace struggles among individuals with epilepsy.
  • Performing hypothesis testing to compare the consequences of disclosure vs. concealment in professional environments.
  • Initial findings revealed that 67% of users reported difficulties due to inflexible policies and lack of workplace adjustments.

Data Analyst

California Department of Public Health • Sept 2024 – Dec 2024

  • Replaced manual data extraction with an AI-powered solution using Power Apps, achieving 95% accuracy and saving 10 hours per week.
  • Developed the team’s first SQL database with 30+ tables and migrated Excel-based datasets into ETL pipelines to enable scalable public health analysis.

Data Analyst

EssilorLuxottica – Oakley • May 2024 – Aug 2024

  • Built Power BI dashboards to replace Excel reports, saving $150K annually and improving warehouse monitoring for leadership.
  • Improved a backorder AI model by applying feature engineering to 1TB of ATP data, boosting prediction accuracy by 25% and reducing stockouts by 6%.

Senior Data Analyst

Thryve Digital Health LLP • Mar 2021 – Aug 2022

  • Resolved 50+ critical COBOL-based pricing bugs, cutting revenue leakage by 8% and saving ~$80K annually through early detection of claim errors.
  • Reduced manual rework by identifying root causes in claim eligibility logic across 30 legacy programs.

Associate Analyst - Mainframe

Cognizant Technology Solutions • Nov 2015 – Feb 2021

  • Optimized a CPU-intensive DB2 batch job, reducing runtime from 58 minutes to 30 seconds and saving $150K+ in mainframe costs.
  • Engineered ETL workflows using Informatica CDC on IMS DB, cutting data latency by 60% and streamlining batch processes.
  • Lowered post-release defects by 20% by proactively flagging issues during peer reviews.

Certifications

AWS Certified Cloud Practitioner

March 2025

View Certification

Google Analytics Certification

April 2025

View Certification

Skills

Languages

Python
SQL
R

Mainframe

COBOL
JCL

Databases

MySQL
DB2
IMS DB
Teradata

Project Management

Agile
Jira
Kanban
Git
Scrum
Rational ClearCase
ServiceNow
Smartsheet
PowerPoint

Tools

Power BI
Tableau
Informatica PowerCenter
Excel
AWS
Google Analytics

Machine Learning

Regression
Random Forest
Time Series
KMeans
DBSCAN
UMAP
Statistical Analysis
Hypothesis Testing
Causal Inference
A/B Testing
Sentiment Analysis
LLM

Things I've Worked On

Featured Project

Loan Default Prediction

Built and compared multiple machine learning models—including Logistic Regression (with Class Weights & Elastic Net), Random Forest, XGBoost, CatBoost, Bagging Classifier, Neural Networks, LDA, and QDA—on historical loan data to predict default risk and improve lending decisions.

  • Python
  • Scikit-learn
  • Pandas
  • Matplotlib
  • Data Preprocessing
View on GitHub
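
To illustrate the class-weighting idea mentioned for the Logistic Regression model, here is a minimal sketch rather than the project's actual code. It computes "balanced" class weights with the formula n_samples / (n_classes * class_count), which is the same heuristic scikit-learn applies for class_weight="balanced". The helper name balanced_class_weights and the toy labels are assumptions made for this example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight}, where weight = n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so their errors count more in training.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels (not the project's data): 1 = default, 0 = repaid.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With 10% defaults, the default class is weighted nine times more heavily than the repaid class, which helps keep a model from simply predicting "repaid" for every loan.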

Featured Project

Time Series Forecasting

Analyzed and forecasted quarterly U.S. state and local government tax revenues from 2009 to 2023 using time series models.

  • R
  • Quadratic Trend
  • Seasonal Trend
  • Exponential smoothing
  • ARIMA
  • Auto-ARIMA
View on GitHub
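
As a small illustration of the exponential smoothing item above, the sketch below implements simple exponential smoothing in Python. The project itself was done in R and may have used a different variant (for example one with trend or seasonal terms), so the function name, the smoothing factor alpha=0.5, and the revenue figures are all assumptions for this example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return smoothed levels: s[0] = y[0], s[t] = alpha*y[t] + (1-alpha)*s[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical quarterly values (not real tax revenue data).
revenue = [100.0, 110.0, 105.0, 120.0]
levels = simple_exponential_smoothing(revenue, alpha=0.5)

# In simple exponential smoothing, the one-step-ahead forecast is the last level.
forecast = levels[-1]
print(levels, forecast)
```

A larger alpha makes the forecast react faster to recent quarters, while a smaller alpha produces a smoother, slower-moving estimate.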

Featured Project

Real Estate Web Scraping

Extracted housing data using BeautifulSoup and Selenium to identify real estate market patterns. Analyzed scraped data to determine pricing trends and regional shifts.

  • Python
  • BeautifulSoup
  • Selenium
  • Pandas
View on GitHub

Featured Project

Healthcare Analytics using Medicare Data

Investigated Medicare prescription and claim data to uncover cost drivers in healthcare. Used SQL for extraction and Python for trend and anomaly detection.

  • SQL
  • BigQuery
  • CTE
  • Joins
View on GitHub
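
The CTE-plus-join pattern listed above can be sketched as follows. This is an illustrative example only: it uses Python's built-in sqlite3 module instead of BigQuery so that it runs anywhere, and the table names (drugs, claims), columns, and rows are hypothetical rather than taken from the Medicare dataset.

```python
import sqlite3

# In-memory database with hypothetical tables and rows for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drugs (drug_id INTEGER PRIMARY KEY, drug_name TEXT);
CREATE TABLE claims (claim_id INTEGER PRIMARY KEY, drug_id INTEGER, total_cost REAL);
INSERT INTO drugs VALUES (1, 'DrugA'), (2, 'DrugB');
INSERT INTO claims VALUES (1, 1, 100.0), (2, 1, 300.0), (3, 2, 50.0);
""")

# The CTE aggregates cost per drug; the join then attaches readable names.
query = """
WITH cost_by_drug AS (
    SELECT drug_id, SUM(total_cost) AS total_cost, COUNT(*) AS n_claims
    FROM claims
    GROUP BY drug_id
)
SELECT d.drug_name, c.total_cost, c.n_claims
FROM cost_by_drug AS c
JOIN drugs AS d ON d.drug_id = c.drug_id
ORDER BY c.total_cost DESC;
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)
```

Aggregating inside the CTE before joining keeps the final SELECT short and makes the top cost drivers easy to rank.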
View more projects on GitHub ↗

What's Next?

Get In Touch

Let's connect! Whether you have a project idea, a question, or just want to chat, feel free to reach out.