/about · the long version

about.

Dipankar Das

fig.02 / portrait, b&w 2025

/intro

Dipankar Das,
a systems engineer who treats infrastructure like a product.

I design and build scalable, reliable systems. I ship sustainable platforms and write deep-dives on software development, DevOps, Kubernetes, and system design.

Most of my work lives at the seam between application code and the infrastructure that runs it — eliminating the "it works on my machine" class of problems for everyone, permanently.

/education

B.Tech in Computer Science, Class of 2024
/work-history

Roles, in reverse chronology

01 / 06 Jan 2024 - Present
rtCamp


DevOps Engineer

  1. .01

    Enterprise IaC & Migration — Cox Automotive

    why
S3 buckets, CloudFront distributions, and WAF policies for the prod, dev, and staging environments had been created manually through the AWS Console and carried legacy settings.
    what
Migrated and standardized the S3, CloudFront, and WAF configurations for all three environments into Terraform, with zero drift and zero downtime.
    how
    Established a multi-environment GitOps workflow with a rigorous Plan-Review-Apply SOP, eliminating click-ops and achieving consistent, repeatable infrastructure across prod, dev, and staging.
  2. .02

    Cloud-Native Scaling & Observability — Global FinTech

    why
    A monolithic Frappe/ERPNext platform was buckling under high-concurrency traffic with no visibility into where production bottlenecks occurred.
    what
    Migrated the platform to a distributed Kubernetes cluster and implemented full-stack OpenTelemetry instrumentation.
    how
    Decomposed the monolith into individual services and migrated to Kubernetes — gaining autohealing, rolling deployments, and the broader ecosystem benefits. Layered OTel logs, traces, and metrics across the stack while deliberately keeping the architecture as simple as possible to reduce operational overhead.
  3. .03

    Cloud FinOps & Cost Engineering

    why
    Memory-intensive background jobs and cron workloads were running on always-on instances, inflating cloud spend without any performance benefit.
    what
    Achieved a 20% reduction in cloud OpEx across Kubernetes compute.
    how
    Engineered specialized Node Groups with Spot Instances and implemented scale-to-zero logic for background and cron workloads, eliminating idle resource waste while maintaining throughput.
  4. .04

    Product Engineering — EasyDash / EasyEngine

    why
    Manual WordPress/PHP deployment processes were slow and error-prone, blocking a commercial product launch.
    what
    Co-developed a high-scale Cloud Provisioning Engine for dash.easyengine.io.
    how
    Built the automated backend with Python, Terraform, and Ansible — enabling rapid deployments that generated $200+ in subscription revenue within 60 days of launch.
  5. .05

    Developer Experience & CI Optimization

    why
    Shared CI runners were creating queue bottlenecks and long wait times that disrupted engineering flow across teams.
    what
    Optimized GitHub Self-Hosted Runners across the organization.
    how
    Applied resource-aware labeling and multi-container environments, drastically reducing CI/CD wait times and improving overall build reliability.
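The migration in .01 above leans on Terraform's ability to adopt resources that already exist. A minimal sketch using Terraform 1.5+ `import` blocks; the bucket name here is hypothetical, purely for illustration:

```hcl
# Adopt an existing, console-created S3 bucket into Terraform state
# without recreating it. "prod-assets-bucket" is a placeholder name.
import {
  to = aws_s3_bucket.assets
  id = "prod-assets-bucket"
}

resource "aws_s3_bucket" "assets" {
  bucket = "prod-assets-bucket"
}
```

`terraform plan` then previews the adoption and surfaces any drift before anything changes, which is exactly what a Plan-Review-Apply SOP reviews.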
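The scale-to-zero logic in .03 above boils down to a replica-count decision: no pending work means no running workers. A toy sketch of that decision, assuming a job queue as the scaling signal (names and the packing factor are illustrative, not the production logic):

```go
package main

import "fmt"

// desiredReplicas decides how many worker replicas a background-job
// deployment should run. Zero pending jobs means zero replicas, so
// idle workers stop costing money; bursts scale up until maxReplicas.
// jobsPerReplica is an illustrative packing factor.
func desiredReplicas(pendingJobs, jobsPerReplica, maxReplicas int) int {
	if pendingJobs == 0 {
		return 0 // scale to zero: no always-on idle instances
	}
	n := (pendingJobs + jobsPerReplica - 1) / jobsPerReplica // ceiling division
	if n > maxReplicas {
		return maxReplicas
	}
	return n
}

func main() {
	fmt.Println(desiredReplicas(0, 10, 5))  // 0
	fmt.Println(desiredReplicas(25, 10, 5)) // 3
	fmt.Println(desiredReplicas(99, 10, 5)) // 5
}
```

In a cluster, a function like this would drive an autoscaler or a cron-triggered patch of the deployment's replica count.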
02 / 06 Jun 2025 - Present
Kubmin - Ksctl


Founder & Principal Engineer

  1. .01

    Distributed Orchestration Engine

    why
    Managing Kubernetes cluster lifecycles across AWS and Azure required a reliable, cloud-agnostic solution without heavy infrastructure overhead.
    what
    Architected a Go-based cloud-agnostic provisioning engine for full Kubernetes lifecycle management.
    how
    Built a high-availability state layer with Turso (Edge SQLite) and Redis, enabling idempotent cluster operations with a minimal infrastructure footprint.
  2. .02

    Event-Driven Task Orchestration with NATS

    why
    Long-running infrastructure tasks needed guaranteed execution without introducing the operational burden of heavy frameworks like Temporal.
    what
    Developed a lightweight event-driven state machine using NATS JetStream.
    how
    Implemented custom NAK/ACK and retry logic ensuring 100% task reliability during long-running cluster operations while keeping the system operationally simple.
  3. .03

    Relationship-Based Access Control (ReBAC)

    why
    Flat RBAC couldn't model the complex sharing hierarchies needed for multi-tenant teams across shared infrastructure.
    what
    Implemented a full Relationship-Based Access Control system using Authzed (SpiceDB).
    how
    Designed a hierarchical model spanning Org, Cluster, and Workload levels, enabling fine-grained permission enforcement and quota management across distributed engineering teams.
  4. .04

    Workload Intelligence & Cost Tracking

    why
    Teams had no per-workload visibility into what each deploy actually costs in money, energy, and compute — waste compounded silently with every release.
    what
    Built a per-workload cost and efficiency tracking engine powered by Kepler (CNCF) with SCI/SEE sustainability scoring aligned to the Green Software Foundation.
    how
Built detection that surfaces idle workloads, overprovisioned containers, and efficiency regressions between deployment versions, each with dollar amounts and ready-to-use kubectl commands to fix them, and compares workload costs across regions and instance types.
  5. .05

    AI-Agents Orchestration

    why
    Platform delivery velocity needed to scale without proportionally growing the team.
    what
    Orchestrated AI agents (Claude Code, Gemini CLI) into the core development workflow to autonomously handle well-scoped engineering tasks.
    how
    Leveraged agentic patterns — task decomposition, tool use, and iterative feedback loops — to boost team productivity by 40% while maintaining high code quality and architectural consistency.
  6. .06

    Workload Recommendation Engine

    why
    Teams had no structured way to understand the true profile of their running workloads — efficiency waste, energy consumption, and resource behaviour were all invisible.
    what
    Developed a recommendation system that detects idle workloads, overprovisioned containers, and temporal waste patterns — each with dollar amounts attached.
    how
    Built profiling pipelines that compare workload efficiency across deployment versions (cost, energy, CPU, memory, SCI score), surface regional cost optimization opportunities, and generate ready-to-use kubectl commands for immediate savings.
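The idempotent cluster operations in .01 above can be sketched as steps checked against a state store before they run; a map stands in here for the external Turso/Redis layer, and all names are illustrative:

```go
package main

import "fmt"

// Step is one phase of a cluster operation (e.g. "vpc", "controlplane").
type Step struct {
	Name string
	Run  func() error
}

// StateStore records which steps already completed. In a real system
// this would be durable external state; a map stands in for it here.
type StateStore map[string]bool

// Apply runs each step at most once. Re-running the whole operation
// after a crash skips completed steps, which is what makes the
// operation idempotent.
func Apply(state StateStore, steps []Step) error {
	for _, s := range steps {
		if state[s.Name] {
			continue // already done in a previous attempt
		}
		if err := s.Run(); err != nil {
			return fmt.Errorf("step %s: %w", s.Name, err)
		}
		state[s.Name] = true
	}
	return nil
}

func main() {
	runs := 0
	steps := []Step{{Name: "vpc", Run: func() error { runs++; return nil }}}
	state := StateStore{}
	Apply(state, steps)
	Apply(state, steps) // second apply is a no-op
	fmt.Println(runs)   // 1
}
```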
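The NAK/ACK retry behaviour in .02 above, sketched with the broker's redelivery loop simulated inline. Real code would use the nats.go JetStream API; `maxDeliver` here mirrors JetStream's MaxDeliver setting, and the handler is a stand-in:

```go
package main

import (
	"errors"
	"fmt"
)

// handleWithRetry models JetStream-style delivery: a failed handler
// NAKs the message so the broker redelivers it; success ACKs and
// stops. After maxDeliver failed attempts the message is parked for
// inspection rather than retried forever.
func handleWithRetry(handler func(attempt int) error, maxDeliver int) (acked bool, attempts int) {
	for attempt := 1; attempt <= maxDeliver; attempt++ {
		attempts = attempt
		if err := handler(attempt); err != nil {
			continue // NAK: broker redelivers the message
		}
		return true, attempts // ACK: done
	}
	return false, attempts // delivery attempts exhausted
}

func main() {
	// A task that fails transiently twice, then succeeds.
	ok, n := handleWithRetry(func(attempt int) error {
		if attempt < 3 {
			return errors.New("transient failure")
		}
		return nil
	}, 5)
	fmt.Println(ok, n) // true 3
}
```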
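The hierarchical check in .03 above, reduced to its core: a SpiceDB-style relationship tuple plus a walk up the Org → Cluster → Workload chain, so a grant at the org level reaches every workload beneath it. The schema and names are illustrative, not the actual SpiceDB model:

```go
package main

import "fmt"

// Tuple is a SpiceDB-style relationship: subject has relation on object.
type Tuple struct{ Object, Relation, Subject string }

// check walks the hierarchy upward via parentOf: permission on a
// workload can be granted directly, on its cluster, or on the org.
func check(tuples []Tuple, parentOf map[string]string, object, relation, subject string) bool {
	for o := object; o != ""; o = parentOf[o] {
		for _, t := range tuples {
			if t.Object == o && t.Relation == relation && t.Subject == subject {
				return true
			}
		}
	}
	return false
}

func main() {
	parentOf := map[string]string{
		"workload:api": "cluster:prod",
		"cluster:prod": "org:acme",
	}
	// One org-level grant covers every cluster and workload below it.
	tuples := []Tuple{{"org:acme", "admin", "user:dip"}}
	fmt.Println(check(tuples, parentOf, "workload:api", "admin", "user:dip")) // true
	fmt.Println(check(tuples, parentOf, "workload:api", "admin", "user:eve")) // false
}
```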
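The idle and overprovision detection in .06 above, as a toy classifier over requested-versus-used CPU. The 5% and 30% thresholds and the struct shape are illustrative, not the engine's actual rules:

```go
package main

import "fmt"

// WorkloadSample holds requested vs. actually used CPU in millicores,
// averaged over some observation window (illustrative shape).
type WorkloadSample struct {
	Name           string
	RequestedMilli int
	UsedMilli      int
}

// recommend flags idle workloads (near-zero usage) and overprovisioned
// ones (usage far below request), the two waste patterns that turn
// directly into savings recommendations.
func recommend(w WorkloadSample) string {
	util := float64(w.UsedMilli) / float64(w.RequestedMilli)
	switch {
	case util < 0.05:
		return "idle: consider scale-to-zero"
	case util < 0.30:
		return fmt.Sprintf("overprovisioned: lower request toward %dm", w.UsedMilli*2)
	default:
		return "ok"
	}
}

func main() {
	fmt.Println(recommend(WorkloadSample{"batch", 1000, 10})) // idle
	fmt.Println(recommend(WorkloadSample{"api", 1000, 200}))  // overprovisioned
	fmt.Println(recommend(WorkloadSample{"db", 1000, 700}))   // ok
}
```

The real pipeline layers cost and energy data (Kepler, SCI scoring) on top of this kind of classification before emitting kubectl fixes.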
03 / 06 Feb 2025 - Apr 2025
Viamagus


DevSecMLOps Consultant

  1. .01

    API Security Testing with OWASP ZAP

    why
    Client web applications and APIs had undetected vulnerabilities posing real regulatory and business risk.
    what
    Implemented automated API security testing using OWASP ZAP.
    how
    Deployed a custom proxy to surface SQL injection, MITM, and other OWASP Top 10 risks across the application surface, giving the team actionable findings.
  2. .02

    Continuous Vulnerability Scanning with Snyk

    why
    Security issues were caught late in the development cycle, making remediation expensive and disruptive.
    what
    Integrated Snyk into CI/CD for shift-left vulnerability scanning.
    how
    Enabled detection and remediation of dependency and code vulnerabilities at pull request level, before they ever reached production.
  3. .03

    LLM Containerization & Performance Benchmarking

    why
    The team had no baseline data on containerization efficiency, making optimization decisions purely speculative.
    what
    Optimized internal LLM-based projects for containerization and produced detailed performance benchmarks.
    how
    Measured image size, network throughput, and disk I/O to give the team a clear, data-backed picture of real-world efficiency tradeoffs.
  4. .04

    Scalable LLM Deployment on AWS

    why
    LLM inference at scale required elastic capacity and strict network isolation without exposing endpoints publicly.
    what
    Deployed vLLM and Ollama on AWS for production-scale LLM inference.
    how
    Used Auto Scaling Groups and VPC PrivateLink to deliver secure, elastic inference capacity without public endpoint exposure.
04 / 06 Jul 2022 - Jul 2025
  1. .01

    High-Performance CLI Engineering

    why
    Provisioning Kubernetes clusters across multiple cloud providers required deep provider expertise and many tedious manual steps.
    what
    Architected kli — a multi-cloud CLI using Cobra and Viper — to abstract the full Kubernetes lifecycle.
    how
    Built a pluggable architecture with robust configuration management, enabling single-command cluster creation and teardown across AWS, Azure, and Civo.
  2. .02

    Custom Kubernetes Controllers

    why
    Cluster state could drift between local CLI metadata and remote cloud infrastructure, causing hard-to-debug inconsistencies.
    what
    Developed custom Kubernetes controllers and reconciliation logic using client-go.
    how
    Ensured idempotent state management that reliably kept local metadata and remote infrastructure in sync.
  3. .03

    Automated Addon & Helm Integration

    why
    Manual post-provisioning addon setup was repetitive, error-prone, and delayed clusters reaching a ready state.
    what
    Automated deployment of core cluster components — CNI, Storage Classes, and Ingress — as part of provisioning.
    how
    Leveraged native Helm SDKs and Go client packages to install components consistently on every cluster without manual intervention.
  4. .04

    Secure Distribution & Artifact Signing

    why
    Users needed confidence that CLI binaries were tamper-free and reliably distributed across all platforms.
    what
    Implemented secure artifact signing and automated multi-platform binary releases.
    how
    Integrated Cosign for signing and GitHub Actions for cross-platform builds, delivering a trusted and seamless developer distribution experience.
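The pluggable architecture in .01 above hinges on a provider interface plus a registry, so adding a cloud never touches the command layer. A stdlib-only sketch of that seam; the interface and names are illustrative of the pattern, not kli's actual API:

```go
package main

import "fmt"

// Provider is the seam that keeps the CLI cloud-agnostic: every cloud
// implements the same lifecycle surface.
type Provider interface {
	Name() string
	CreateCluster(name string) error
	DeleteCluster(name string) error
}

// registry maps a --provider flag value to an implementation, so the
// command layer needs no provider-specific code paths.
var registry = map[string]Provider{}

func Register(p Provider) { registry[p.Name()] = p }

// civo is one toy implementation; aws and azure would register the same way.
type civo struct{}

func (civo) Name() string { return "civo" }
func (civo) CreateCluster(name string) error {
	fmt.Println("civo: creating", name)
	return nil
}
func (civo) DeleteCluster(name string) error {
	fmt.Println("civo: deleting", name)
	return nil
}

func main() {
	Register(civo{})
	if p, ok := registry["civo"]; ok {
		p.CreateCluster("demo")
	}
}
```

In the real CLI, Cobra dispatches the subcommand and Viper resolves configuration before the lookup happens.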
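The reconciliation in .02 above is the controller pattern in miniature: diff desired state against observed state and emit only the actions needed to converge. A toy sketch (real controllers use client-go informers and workqueues; the resource names are illustrative):

```go
package main

import "fmt"

// reconcile compares desired against actual and returns converging
// actions. Running it again after convergence yields no actions,
// which is the idempotence property controllers rely on.
func reconcile(desired, actual map[string]bool) []string {
	var actions []string
	for name := range desired {
		if !actual[name] {
			actions = append(actions, "create "+name)
		}
	}
	for name := range actual {
		if !desired[name] {
			actions = append(actions, "delete "+name)
		}
	}
	return actions
}

func main() {
	desired := map[string]bool{"nodepool-a": true}
	actual := map[string]bool{"nodepool-b": true}
	fmt.Println(reconcile(desired, actual))  // [create nodepool-a delete nodepool-b]
	fmt.Println(reconcile(desired, desired)) // []
}
```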
05 / 06 Mar 2022 - Jul 2024
Kubesimplify


Ambassador

  1. .01

    Technical Content & Education

    why
    Developers onboarding to cloud-native lacked approachable, practical resources for Kubernetes and Go.
    what
    Authored blogs and tutorials on Kubernetes and Go.
    how
    Published accessible technical content aligned with real-world use cases for the Kubesimplify community.
  2. .02

    Live Cloud-Native Sessions

    why
    Complex topics like cloud-native architecture benefit more from interactive live walkthroughs than static articles.
    what
    Conducted Twitch live sessions on cloud-native development and Golang best practices.
    how
    Engaged the community through live demos and real-time Q&A, making advanced topics approachable.
  3. .03

    Open Source Maintenance

    why
    The CNCF ecosystem needed reliable, community-maintained tooling for Kubernetes workflows.
    what
    Maintained and improved Ksctl, contributing to CNCF-aligned open-source tooling.
    how
    Contributed code, fixes, and documentation to keep the project active and aligned with community standards.
06 / 06 Mar 2023 - Sep 2023
Viamagus


DevOps Intern

  1. .01

    Kubernetes Migration to EKS

    why
    The existing cluster lacked the scalability and traffic management needed for production workloads.
    what
    Supported Kubernetes migration to EKS with NGINX Gateway API for traffic management.
    how
    Configured EKS and set up Gateway API routing rules to handle production-grade traffic reliably.
  2. .02

    Application Containerization

    why
    Node.js applications weren't containerized, limiting deployment consistency and portability across environments.
    what
    Dockerized Node.js applications and migrated the reverse proxy from Apache to NGINX.
    how
    Wrote Dockerfiles for each service and reconfigured NGINX for improved routing and performance.
  3. .03

    CI/CD Automation

    why
    Manual Jenkins job setup for new projects was repetitive and slowed pipeline delivery for the team.
    what
    Automated Jenkins job creation using a CLI tool.
    how
    Built a CLI automation layer that provisioned jobs from config, eliminating manual setup overhead.
  4. .04

    Security & Observability Integration

    why
    The team had no visibility into vulnerabilities or runtime metrics before production deployments.
    what
    Integrated Snyk for vulnerability scanning and Prometheus for real-time observability.
    how
    Configured Snyk in CI for early detection and set up Prometheus scraping for live service metrics.
  5. .05

    Infrastructure & SSL Automation

    why
    Manual SSL provisioning and repository management were error-prone and time-consuming at scale.
    what
    Migrated repositories to AWS CodeCommit and automated SSL provisioning.
    how
    Set up Let's Encrypt automation for SSL certificates alongside the CodeCommit migration.
Workspace
fig.03 / workspace shutter open
/achievements

A few things I'm proud of.

  • 01 OSS contributions to Kubernetes, CNCF TAG Green, Kubescape, Monokle, and more 2024
  • 02 Invited to GitHub Maintainers repo 2024
  • 03 PR Wrangler in sig-docs (Kubernetes) 2024
  • 04 Member of Kubernetes and Kubernetes-Sigs Organization 2023
  • 05 Winner of Napptive + WeMakeDevs Cloud Native Hackathon (Track 2) 2022
/cta · the next move

Pick a thread.
Let's pull on it together.

DevOps consulting, Kubernetes audits, Go service design, or just a 30-minute sanity-check on your platform — start with whatever feels right.