Data Analyst building SQL pipelines and BI dashboards
Graduate Data Science student at UT Arlington with hands-on experience analyzing large datasets and building analytics pipelines that convert operational data into business insights.
Featured Project
AI Data Governance Model
Designed and implemented an end-to-end governance system that ingests GitHub Archive data, models analytics datasets with dbt, executes automated quality rules, tracks incidents, computes dataset health scores, and surfaces governance insights through a Power BI monitoring dashboard with AI-generated explanations.
Impact
PostgreSQL Python dbt Power BI LLM
View GitHub Repo
Data Pipeline Architecture
Projects
Decision Intelligence Experimentation Platform
• Engineered an end-to-end experimentation system using PostgreSQL, dbt, and Python, simulating 50K users, 280K+ events, and real-world behavioral patterns.
• Built a dbt-powered metrics layer transforming raw event data into experiment-ready datasets with conversion rate, lift, and AOV calculations.
• Implemented statistical hypothesis testing (z-test) to evaluate experiment significance and automate product decisions based on p-value and lift.
• Developed a segment-aware decision layer identifying performance differences across platform and geography, enabling data-driven rollout strategies.
• Delivered a Streamlit dashboard presenting experiment metrics, statistical outcomes, and business-ready insights in a unified interface.
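The significance test and decision step described above can be sketched as a pooled two-proportion z-test. This is a minimal illustration, not the project's actual code: the function names, the 0.05 significance level, and the "ship / rollback / keep testing" labels are all assumptions made for the example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test for control (a) vs. variant (b).

    Returns (z, two_sided_p_value, relative_lift).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis p_a == p_b.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lift = (p_b - p_a) / p_a
    return z, p_value, lift


def decide(p_value, lift, alpha=0.05):
    """Hypothetical decision rule; alpha and the labels are assumptions."""
    if p_value < alpha and lift > 0:
        return "ship"
    if p_value < alpha and lift < 0:
        return "rollback"
    return "keep testing"


if __name__ == "__main__":
    # Illustrative counts only, not results from the project.
    z, p, lift = two_proportion_z_test(500, 10_000, 600, 10_000)
    print(f"z={z:.2f} p={p:.4f} lift={lift:.1%} -> {decide(p, lift)}")
```

Running the same test per segment (for example per platform or per region) would give the segment-aware view mentioned above, though comparing many segments usually calls for a multiple-comparison correction.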
GitHub Repo
AI Data Governance Model
• Built an AI-assisted data governance system using PostgreSQL, dbt, and Python to monitor datasets and enforce automated data quality rules on GitHub Archive data.
• Designed a governance metadata layer (dataset registry, rule catalog, rule runs, health scoring, incident tracking) to operationalize dataset monitoring.
• Implemented a rule execution engine detecting data quality failures and generating incidents with severity tracking.
• Developed an AI-based incident explanation assistant to translate data issues into clear, actionable insights for stakeholders.
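As a rough sketch of how a rule catalog, rule runs, incidents, and a health score might fit together: every name below (`Rule`, `Incident`, `run_rules`, `health_score`), the row fields, and the scoring formula (share of passing checks) are hypothetical and chosen only for illustration. The project's own schema and scoring are not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    severity: str                     # e.g. "low", "medium", "high"
    check: Callable[[Dict], bool]     # returns True when a row passes


@dataclass
class Incident:
    rule: str
    severity: str
    failed_rows: int


def run_rules(rows: List[Dict], rules: List[Rule]) -> List[Incident]:
    """Run every rule over every row; emit one incident per failing rule."""
    incidents = []
    for rule in rules:
        failed = sum(1 for row in rows if not rule.check(row))
        if failed:
            incidents.append(Incident(rule.name, rule.severity, failed))
    return incidents


def health_score(rows: List[Dict], rules: List[Rule]) -> float:
    """Assumed formula: percentage of (row, rule) checks that pass."""
    total = len(rows) * len(rules)
    if total == 0:
        return 100.0
    failed = sum(i.failed_rows for i in run_rules(rows, rules))
    return 100.0 * (total - failed) / total


if __name__ == "__main__":
    # Made-up event rows with hypothetical fields.
    rows = [
        {"id": "1", "type": "PushEvent"},
        {"id": "2", "type": None},
        {"id": None, "type": "WatchEvent"},
        {"id": "4", "type": "ForkEvent"},
    ]
    rules = [
        Rule("id_not_null", "high", lambda r: r.get("id") is not None),
        Rule("type_not_null", "medium", lambda r: r.get("type") is not None),
    ]
    for incident in run_rules(rows, rules):
        print(incident)
    print(health_score(rows, rules))
```

In a warehouse setting the checks would more plausibly be SQL queries or dbt tests writing to a rule-runs table. The in-memory version only shows the shape of the data flow.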
GitHub Repo
Retail Analytics Dashboard
• Architected a data pipeline to process and transform 541K+ retail transactions into analysis-ready datasets using SQL and Python.
• Engineered SQL aggregations and indexing strategies that reduced dashboard load times by 40%.
• Implemented ABC inventory analysis identifying the top 20% of products driving 75% of revenue, enabling data-driven stock decisions.
• Built an automated Streamlit and Plotly dashboard replacing manual reporting workflows, saving 3–4 hours/week.
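The ABC inventory analysis mentioned above can be illustrated with a short sketch that ranks products by revenue and classifies them by cumulative revenue share. The function name and the 75% / 95% class cutoffs are assumptions for this example; the project may use different thresholds or a SQL implementation.

```python
def abc_classify(revenue_by_product, a_cut=0.75, b_cut=0.95):
    """Assign each product to class A, B, or C by cumulative revenue share.

    Products are ranked by revenue (highest first). A product is classed by
    the cumulative share reached *before* it is added, so the product that
    crosses a cutoff still belongs to the higher class.
    """
    total = sum(revenue_by_product.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    ranked = sorted(revenue_by_product.items(), key=lambda kv: kv[1], reverse=True)
    classes = {}
    cumulative = 0.0
    for product, revenue in ranked:
        share_before = cumulative / total
        if share_before < a_cut:
            classes[product] = "A"
        elif share_before < b_cut:
            classes[product] = "B"
        else:
            classes[product] = "C"
        cumulative += revenue
    return classes


if __name__ == "__main__":
    # Made-up revenue figures, for illustration only.
    sample = {"P1": 500, "P2": 250, "P3": 150, "P4": 60, "P5": 40}
    print(abc_classify(sample))
```

With the made-up figures above, the two highest-revenue products account for 75% of revenue and land in class A, which is the kind of concentration the analysis is meant to surface.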
GitHub Repo
Customer Churn Prediction
• Developed an end-to-end churn prediction pipeline using XGBoost on 7,043 customers and 21 features, achieving a ROC-AUC of 0.84.
• Engineered predictive features and applied resampling techniques to address class imbalance, improving recall by 18%.
• Applied SHAP explainability to identify key churn drivers and translate model outputs into actionable retention strategies.
GitHub Repo
Skills
Programming
Python · SQL · R
Databases & Data Warehousing
PostgreSQL · MySQL · BigQuery
SQL & Data Manipulation
Joins · Aggregations · Window Functions · Query Optimization
Experimentation / Decision Intelligence
A/B Testing · Experiment Design · Statistical Inference · Hypothesis Testing · Lift Analysis · Decision Frameworks
Metrics & Product Analytics
KPI Design · Conversion Funnel Analysis · Cohort Analysis · Retention Metrics
Data Analysis
Pandas · NumPy · Exploratory Data Analysis (EDA) · Data Cleaning · Data Validation
Statistical Analysis
Hypothesis Testing · A/B Testing · Descriptive Statistics · Statistical Modeling · Experiment Evaluation
Data Visualization & Business Intelligence
Power BI · Tableau · Dashboard Development · KPI Reporting · Data Storytelling
Data Engineering & Analytics Engineering
dbt · ETL Pipelines · Data Transformation · Data Preparation · Data Modeling
Machine Learning & Artificial Intelligence
Scikit-learn · XGBoost · Feature Engineering · Model Evaluation · SHAP (Explainability)
Cloud Platforms
Google Cloud Platform
Tools
Git · Docker · Jira · Salesforce (CRM) · Microsoft Office Suite (Excel, PowerPoint, Word)