My Journey So Far

From mathematics in Mumbai to AI safety research in Toronto — the roles, training, and education that shaped how I think and build.

Work Experience

Associate AI Engineer

Feb 2026 – Present · Part-time

XEqualTo Analytics — Quper Platform (FinOps & Warehouse Intelligence)

Architecting an MCP server to enable structured, tool-driven LLM interactions across FinOps and warehouse observability systems, with security-first design at every layer.
Building FOCUS-aligned data pipelines ingesting cloud cost and usage data from AWS and Databricks into queryable intelligence layers; designing agent-compatible tool schemas for safe, auditable LLM interactions.
Exploring Redshift and Databricks performance telemetry to surface workload efficiency signals, cost anomalies, and capacity indicators within the Quper platform.
Contributing to multi-domain observability architecture spanning Spend, Efficiency, Allocation, Deviation, Compute, Capacity, and Storage performance pillars.

MCPLLM AgentsAWSDatabricksRedshiftFinOpsPython

Business Intelligence Engineer

Feb 2022 – Jul 2023

Contentstack Pvt. Ltd. — API-first Content Management System

Engineered production ETL pipelines using SQL, Snowflake, and Fivetran processing millions of records daily, reducing data processing time by 50%.
Architected a Customer Health Index system integrating S3, Azure Blob Storage, and Salesforce data — delivering KPIs that drove a 30% improvement in customer retention rates.
Built data quality frameworks across marketing and product, increasing dataset reliability by 45% and standardizing metric definitions organization-wide.
Developed proofs of concept for Heap, Tableau, and Salesforce integrations, validating technical feasibility for executive leadership approval.

SQLSnowflakeFivetranTableauETLSalesforcePython

AI Safety Training & Community

Independent AI Safety Researcher

Jun 2025 – Present

Trajectory Labs | AI Safety Workspace Canada

Designed LLM evaluation pipelines using judge-router architectures to assess reasoning quality, sycophancy, and behavioral consistency across prompt variants and model configurations.
Built and iterated on RAG systems with LangChain and LlamaIndex, experimenting with retrieval strategies and contextual grounding to measure downstream effects on model reliability and misgeneralization.
Developed prompt-based behavioral testing frameworks probing alignment-relevant behaviors including agreement bias, instruction following, and reasoning robustness.
Publishing research notes and technical breakdowns to contribute to open AI safety discourse and knowledge-sharing.

LLM EvaluationRAGSycophancy DetectionJudge-RouterLangChain

Research Fellow — Electric Sheep Fellowship

Nov 2025 – Feb 2026

Futurekind AI

Designing and building an AI Extinction Tracker web platform analyzing how AI-driven automation and agentic systems contribute to ecological and animal welfare risks.
Developing a structured risk-mapping framework connecting AI deployment pathways to downstream environmental and systemic impacts.
Exploring safety-aware deployment strategies with emphasis on uncertainty modeling, misuse risk, and constraint-based system design.

AI RiskAgentic SystemsRisk MappingSafety-Aware Design

Technical AI Safety — Mentored Cohort

Nov 2025 – Dec 2025

BlueDot Impact

Completed mentored technical training covering alignment theory, threat pathways, interpretability, scalable oversight, and AI risk evaluation.
Developed a concrete AI control and deployment safety roadmap focused on strengthening preventive and constraining layers in real-world systems.

Alignment TheoryInterpretabilityScalable OversightAI Risk Evaluation

Technical AI Safety Project — Mentored Cohort

Ongoing

BlueDot Impact

Building an Agent-Based Red Teaming Framework — a lightweight evaluation pipeline that automatically tests LLM-powered chatbots and applications for safety vulnerabilities prior to deployment.
Framework targets four risk categories: prompt injection, data leakage, hallucination, and unsafe tool usage — risks that standard benchmarks fail to surface under real-world adversarial conditions.
Four-stage pipeline: curated adversarial attack prompts → target LLM endpoint → rule-based safety judge (pattern matching + classification heuristics) → structured safety report with scores, failure list, and triggering prompts.
Representative output metrics include safety score (0–100), injection success rate, hallucination rate, and total prompts tested — designed to be repeatable and accessible to development teams pre-deployment.
Research goal: assess whether automated adversarial testing can surface safety risks earlier in the development lifecycle and move evaluation beyond fixed benchmarks toward dynamic misuse simulation.

Red TeamingAdversarial EvaluationPrompt InjectionLLM SafetyAgent FrameworkAutomated Testing

Adversarial Inputs Research

2025

Apart x Martian Hackathon

Mechanistic analysis of manipulation vulnerabilities in AI orchestration — identifying, ranking, and visualizing attack vectors against judge and router models.
Collaborated with the Trajectory Labs community to document findings and propose mitigations.

Adversarial MLRed TeamingJudge ModelsHackathon

Education

Postgraduate Diploma — Artificial Intelligence & Data Science

Sep 2023 – Apr 2025

Loyalist College, Toronto

GPA: 7.1 / 10

Developed practical expertise through projects involving model training, deployment, and responsible AI design.

Machine LearningDeep LearningNLPComputer VisionGenerative AIAgentic AI

Bachelor of Science — Mathematics (Major)

Jun 2018 – Jun 2021

Mumbai University

GPA: 9.21 / 10

Built a strong foundation in quantitative reasoning, probability, and analytical problem-solving.

Linear & Abstract AlgebraStatisticsMetric SpacesIntegral CalculusData Mining

Certifications

Technical AI Safety (Mentored Cohort)

Oct 2025

BlueDot Impact

Alignment theory, threat pathways, interpretability, scalable oversight, and AI risk evaluation.

Governing AI Agents

Oct 2025

DeepMind

AI governance frameworks, agent oversight, security and lifecycle control for autonomous systems.

Machine Learning Specialization

Sep 2025

DeepLearning.AI

Regression, clustering, anomaly detection, tree ensembles, and recommender systems.

Databricks Certified Data Engineer Associate

Feb 2025

Databricks

Big data processing with Apache Spark and the Databricks Lakehouse.

Skill Sets

Languages & Frameworks

Python

SQL

TypeScript

JavaScript

LangChain

LlamaIndex

Bash

AI / ML

RAG Systems

LLM Evaluation

Prompt Engineering

Agentic AI

MCP

Mechanistic Interpretability

Fine-tuning

MLflow

Docker

Kubernetes

Data Engineering

Snowflake

Databricks

Apache Spark

Fivetran

ETL

AWS S3

Azure Blob Storage

Redshift

Tools & Platforms

Git

Tailscale

WireGuard

Tableau

Salesforce

Raspberry Pi

Jupyter

Concepts

AI Safety & Alignment

LLM Behavioral Evaluation

FinOps

Observability

Data Quality Frameworks