Senior Software Developer | Applied AI Developer | Dual US Patent Inventor | 7-Yr Exp. in Software Dev | AI Application Development, RAG, AI Agents & Distributed Systems
I am a Senior Software Developer and Backend Expert with over seven years of experience engineering the high-performance distributed systems required to scale compute-intensive services and modern AI applications. While my core expertise is rooted in robust backend architecture and applied AI development, I maintain strong cross-stack frontend capabilities, enabling me to seamlessly bridge complex intelligent logic layers with polished, user-facing interfaces.
My engineering philosophy is anchored in structural integrity, predictable scalability, and clean abstractions. I specialize in identifying and resolving systemic bottlenecks—such as decoupling monolithic data schemas to eradicate $O(N)$ limitations—and establishing the asynchronous infrastructure necessary to power intelligent products without compromising enterprise stability.
Currently, my work focuses on the intersection of advanced backend infrastructure and the intelligent logic layer. I specialize in architecting autonomous AI agents and complex Retrieval-Augmented Generation (RAG) pipelines, and in integrating the Model Context Protocol (MCP), to build highly responsive, high-throughput applications.
Career Highlights:
AI Application & Agent Development: Extensive experience engineering fault-tolerant backend pipelines for AI integration. Independent R&D includes building autonomous discovery agents (such as “Hackathon Sniper,” utilizing Groq LLMs and Brave Search for the Notion MCP Hackathon) and engineering robust, RAG-driven inference pipelines utilizing LangChain.
Patentable R&D: Dual US Patent Holder (US-20250363378-A1, US-20250240262-A1). Architected a zone-based, production-grade algorithm to optimize spatial computing—successfully reducing distributed network message volume by up to 80%—and developed a separate conceptual Machine Learning (Reinforcement Learning) framework.
Enterprise Scale & Delivery: Served as a Senior Engineer for the HTC Viverse global launch, where I conceptualized and designed the dynamic edge-routing network architecture (AWS Lambda@Edge) required to scale to one million monthly active users.
Systemic Evolution: Designed, engineered from scratch, and open-sourced a complete asynchronous microservice architecture (FastAPI/Python). Established zero-trust data pipelines alongside this architecture to securely integrate programmatic services and aggressively drive enterprise technical execution.
Designing scalable and reliable systems requires understanding a core set of reusable building blocks. This article covers the most common components — from load balancers and databases to caches and CDNs — and the key trade-offs to consider when using them.
Load Balancer with Multiple Web Servers and Auto Scaling
A load balancer distributes incoming traffic across multiple web servers, providing a single public entry point to the system.
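To make the idea concrete, here is a minimal sketch of round-robin distribution over a pool of web servers. The class name `RoundRobinBalancer`, its methods, and the server names are hypothetical and chosen only for illustration; a real load balancer also handles health checks, connection draining, and other concerns that are omitted here.

```python
class RoundRobinBalancer:
    """Toy load balancer: hands out healthy servers in rotation."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._index = 0  # position of the next server to try

    def mark_down(self, server):
        # A health check would call this when a server stops responding.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def add_server(self, server):
        # Auto scaling would call this after launching a new instance.
        self.servers.append(server)
        self.healthy.add(server)

    def next_server(self):
        # Try each server at most once, skipping unhealthy ones.
        for _ in range(len(self.servers)):
            server = self.servers[self._index % len(self.servers)]
            self._index += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
for _ in range(4):
    print(balancer.next_server())
```

Clients only ever talk to the balancer, so servers can be added or taken out of rotation without changing the public entry point.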
RAG (Retrieval-Augmented Generation) is an AI technique that lets developers supply external information to an LLM without retraining it, improving the accuracy of its answers. As of 2026, it is a widely known technology.
The concept is roughly as follows: First, vectorize external information (the data you want the LLM to know) using an embedding model and store it. After a user enters a prompt, the prompt is also vectorized using the same embedding model. It is then compared against the previously stored vectors to retrieve the most similar pieces of data. The retrieved pieces are then passed to the LLM along with the prompt, and the LLM integrates them to generate a response for the user. This approach allows the LLM to answer using specific knowledge supplied by developers without the need for retraining.
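The retrieval half of this flow can be sketched in a few lines. This is a toy illustration under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the "vector store" is a plain Python list, and the final LLM call is left out entirely. All function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents):
    # Step 1: vectorize the external information and store it.
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index, query, top_k=2):
    # Step 2: vectorize the prompt with the same embedding function,
    # then rank stored documents by similarity to it.
    query_vector = embed(query)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:top_k]]


def build_prompt(query, context_docs):
    # Step 3: hand the retrieved text to the LLM together with the question.
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


documents = [
    "The office is closed on public holidays.",
    "Refunds are processed within 14 days of a return.",
    "The cafeteria serves lunch from noon to 2pm.",
]
index = build_index(documents)
question = "How long do refunds take to be processed?"
print(build_prompt(question, retrieve(index, question, top_k=1)))
```

In a real pipeline the count vectors would be replaced by dense embeddings and the list by a vector database, but the shape of the flow (embed, store, embed the query, rank, prompt) stays the same.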
RDBMS
Relational Database Management System
Used when there are strong relations between data: design a schema that is unlikely to change and relate tables to each other; you can then retrieve the desired data through SQL.
Used when data correctness is very important: usually provides ACID properties.
Changing the schema is a huge undertaking: it requires updating the table schema and migrating data, and all programs that use the changed table need to be modified.
Join operations can be performed across different tables.
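As a small illustration of these points, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `users` and `orders` tables, their columns, and the helper functions are hypothetical examples, not a schema from any particular system. It shows a join across related tables and a transaction that rolls back as a whole when one row violates a constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
);
""")

with conn:  # commits on success, rolls back if an exception escapes
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'Alice'), (2, 'Bob')")
    conn.executemany(
        "INSERT INTO orders (user_id, amount_cents) VALUES (?, ?)",
        [(1, 1200), (1, 800), (2, 500)],
    )


def totals_by_user(conn):
    """Join the two tables and sum each user's orders."""
    return conn.execute("""
        SELECT users.name, SUM(orders.amount_cents)
        FROM users
        JOIN orders ON orders.user_id = users.id
        GROUP BY users.id
        ORDER BY users.name
    """).fetchall()


def place_orders(conn, orders):
    """Insert all orders atomically; return False if any row is rejected."""
    try:
        with conn:
            conn.executemany(
                "INSERT INTO orders (user_id, amount_cents) VALUES (?, ?)",
                orders,
            )
    except sqlite3.IntegrityError:
        return False
    return True


print(totals_by_user(conn))
# The second row references a user that does not exist, so neither row is kept.
print(place_orders(conn, [(2, 300), (99, 100)]))
print(totals_by_user(conn))
```

The foreign key and `CHECK` constraint are what let the database, rather than application code, guarantee correctness; they are also part of why schema changes later become expensive.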
If you’re a developer who uses GitHub daily, you probably rely on notification badges to track issues, pull requests, and mentions. But what happens when the notification badge gets stuck — even after clearing everything?
For months, I saw a "1" notification badge that refused to disappear, even though there weren't any unread messages in any inbox folder. No archived items, no subscriptions, nothing hidden. Still, the badge remained.
Many developers started reporting the same issue as early as September 2025. This wasn’t just a UI bug — it was caused by a spam attack impersonating Gitcoin, leaving backend notification records that GitHub never automatically cleaned up.
Python’s performance bottlenecks were criticized for years,
but thanks to the hard work of developers,
asyncio was introduced in Python 3.4 to improve performance in specific scenarios.
By Python 3.13, the free-threaded build (PEP 703) had emerged,
allowing the GIL to be optionally disabled.
Together with the long-standing multiprocessing and multithreading options,
this led me to compile some notes on the principles, differences, and use cases of these three technologies.
This first post will briefly introduce the basic concepts and suitable scenarios for each.
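As a rough illustration of one scenario, the sketch below runs the same simulated I/O wait with asyncio and with a thread pool. The function names and job list are hypothetical, and `sleep` stands in for real network or disk I/O; CPU-bound work, where multiprocessing or a free-threaded build matters, is not covered by this sketch.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


async def fetch_async(name, delay):
    # Awaiting hands control back to the event loop while "waiting".
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_asyncio(jobs):
    # gather runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(fetch_async(n, d) for n, d in jobs))


def fetch_blocking(name, delay):
    # time.sleep blocks only this thread and releases the GIL while waiting.
    time.sleep(delay)
    return f"{name} done"


def run_threads(jobs):
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(fetch_blocking, n, d) for n, d in jobs]
        return [future.result() for future in futures]


jobs = [("a", 0.2), ("b", 0.2), ("c", 0.2)]

start = time.perf_counter()
print(asyncio.run(run_asyncio(jobs)), f"{time.perf_counter() - start:.2f}s")

start = time.perf_counter()
print(run_threads(jobs), f"{time.perf_counter() - start:.2f}s")
```

Both versions overlap the waits, so the total time should be close to the longest single delay rather than the sum; the difference is that asyncio does this on one thread with cooperative scheduling, while the thread pool relies on the operating system to switch between threads.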
I originally used Notion as my note-taking software.
It’s feature-rich and has a beautiful interface.
However, a few years ago, a privacy controversy arose around Notion:
the company was accused of looking at a customer company's content stored in Notion
and even of proposing a partnership based on that information.
So, I switched to Joplin for a while,
but eventually moved to Obsidian, which has a large number of plugins and strong community support, and is highly customizable.