[{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/edge-computing.html","section":"Tags","summary":"","title":"Edge Computing","type":"tags"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/gemma-4.html","section":"Tags","summary":"","title":"Gemma 4","type":"tags"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/next.js.html","section":"Tags","summary":"","title":"Next.js","type":"tags"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/openrouter.html","section":"Tags","summary":"","title":"OpenRouter","type":"tags"},{"content":"An AI-driven Progressive Web App (PWA) designed to process unstructured visual data and generate real-time recipes. Built to demonstrate secure, low-latency edge computing, the application autonomously parses pantry inventories via image recognition to structure outputs for highly specialized dietary profiles. Dispatched vision tasks to the Gemma 4 model, engineered a real-time Server-Sent Events (SSE) pipeline, and utilized Redis for sliding-window IP rate limiting.\n","date":"1 May 2026","externalUrl":"https://pantry-lens-one.vercel.app/","permalink":"/projects/pantry-lens.html","section":"Projects","summary":"An AI-driven Progressive Web App (PWA) designed to process unstructured visual data and generate real-time recipes. Built to demonstrate secure, low-latency edge computing, the application autonomously parses pantry inventories via image recognition to structure outputs for highly specialized dietary profiles. 
Dispatched vision tasks to the Gemma 4 model, engineered a real-time Server-Sent Events (SSE) pipeline, and utilized Redis for sliding-window IP rate limiting.\n","title":"PantryLens | Edge-Routed AI Vision PWA","type":"projects"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/projects/index.html","section":"Projects","summary":"","title":"Projects","type":"projects"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/redis.html","section":"Tags","summary":"","title":"Redis","type":"tags"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/server-sent-events-sse.html","section":"Tags","summary":"","title":"Server-Sent Events (SSE)","type":"tags"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/system-architecture..html","section":"Tags","summary":"","title":"System Architecture.","type":"tags"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/index.html","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"1 May 2026","externalUrl":null,"permalink":"/tags/vercel.html","section":"Tags","summary":"","title":"Vercel","type":"tags"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/api-integration.html","section":"Tags","summary":"","title":"API Integration","type":"tags"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/api-rate-limiting.html","section":"Tags","summary":"","title":"API Rate Limiting","type":"tags"},{"content":"A modern, full-stack app connecting real-time NOAA space weather data with Google Gemini to generate travel routes for aurora sightings deployed to the global edge via Vercel. Demonstrate secure AI integration with a dual-layer identity ecosystem (Auth0 M2M) to securely isolate end-user authentication from the Gemini AI inference agent, enforcing strict API quotas and telemetry ingestion via Upstash Redis. 
Provides scalable edge-routing within a consumer-facing product.\n","date":"1 April 2026","externalUrl":"https://aurora-path.vercel.app/","permalink":"/projects/aurora-path.html","section":"Projects","summary":"A modern, full-stack app connecting real-time NOAA space weather data with Google Gemini to generate travel routes for aurora sightings deployed to the global edge via Vercel. Demonstrate secure AI integration with a dual-layer identity ecosystem (Auth0 M2M) to securely isolate end-user authentication from the Gemini AI inference agent, enforcing strict API quotas and telemetry ingestion via Upstash Redis. Provides scalable edge-routing within a consumer-facing product.\n","title":"AuroraPath | Secure AI Routing Web Application","type":"projects"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/auth0.html","section":"Tags","summary":"","title":"Auth0","type":"tags"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/autonomous-agents.html","section":"Tags","summary":"","title":"Autonomous Agents","type":"tags"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/groq.html","section":"Tags","summary":"","title":"Groq","type":"tags"},{"content":"An autonomous AI agent engineered to dynamically discover, evaluate, and structure technology event data. Built for the Notion MCP Hackathon, this project demonstrates the transition from traditional, linear API and multiple Model Context Protocol (MCP) integrations to autonomous AI agent building.\n","date":"1 April 2026","externalUrl":"https://github.com/klee1611/HackathonSniper","permalink":"/projects/hackathon-sniper.html","section":"Projects","summary":"An autonomous AI agent engineered to dynamically discover, evaluate, and structure technology event data. 
Built for the Notion MCP Hackathon, this project demonstrates the transition from traditional, linear API and multiple Model Context Protocol (MCP) integrations to autonomous AI agent building.\n","title":"Hackathon Sniper | Autonomous Discovery AI Agent","type":"projects"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/llm.html","section":"Tags","summary":"","title":"LLM","type":"tags"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/model-context-protocol-mcp.html","section":"Tags","summary":"","title":"Model Context Protocol (MCP)","type":"tags"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/system-architecture.html","section":"Tags","summary":"","title":"System Architecture","type":"tags"},{"content":"","date":"1 April 2026","externalUrl":null,"permalink":"/tags/typescript.html","section":"Tags","summary":"","title":"Typescript","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/ai.html","section":"Tags","summary":"","title":"AI","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/categories/architecture.html","section":"Categories","summary":"","title":"Architecture","type":"categories"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/architecture.html","section":"Tags","summary":"","title":"Architecture","type":"tags"},{"content":"RAG (Retrieval Augmented Generation) is an AI framework that allows developers to add external information without retraining the LLM, improving the accuracy of its answers. As of 2026, it is a widely known technology.\nThe concept is roughly as follows: First, vectorize external information (the data you want the LLM to know) using an embedding model and store it. After a user enters a prompt, the prompt is also vectorized using the same embedding model. It is then compared against the previously stored vectors to retrieve the most similar pieces of data. 
These are then integrated by the LLM to generate a response for the user. This approach allows the LLM to answer using specific knowledge integrated by developers without the need for retraining.\nRecently, I used RAG to develop an AI bot that uses quotes from \u0026ldquo;Game of Thrones: A Song of Ice and Fire\u0026rdquo; to answer whether I should skip my gym workout today. This project is called \u0026ldquo;Iron Counsel\u0026rdquo;. The project goals were:\nUse a Telegram Bot as the frontend for easy access. Deploy the RAG system to the cloud; I didn\u0026rsquo;t want to keep my own computer on all day or deal with complex networking and security issues. Following goal 2, implement a whitelist for personal use or for a few friends to prevent billing explosions from excessive traffic. Bilingual support (Chinese and English). Low cost. The project code can be found on my GitHub. This post mainly explains the concepts and technical choices, operation, and implementation from an architectural perspective.\nConcept Explanation # Vector Search # Unlike traditional keyword search, vector search doesn\u0026rsquo;t look for exact matches. Instead, it converts every piece of data into a multi-dimensional vector. During a search, it calculates the similarity between the search criteria and the stored vectors to find the closest matches.\nEmbedding Model # A model that converts data into multi-dimensional vectors. Different embedding models have varying levels of support for different data types—some are multilingual, some only support a single language; some support images, while others only support text. The dimensions of the generated vectors also vary. When choosing an embedding model, you must match it to your needs. 
Furthermore, the same model must be used for both initial data processing and subsequent user prompt vectorization; otherwise, the comparison in vector search will be biased.\nLow-Cost Cloud RAG Architecture \u0026amp; Technical Choices # Phase 1: The Ingestion Pipeline # The primary goal of the Ingestion Pipeline is to vectorize and store data so it can be used later to match user prompts.\nBefore the data hits the cloud, I chose to perform the heavy lifting on my own computer. A MacBook with an M1 CPU or newer can handle vectorizing data with a lightweight embedding model with ease.\nLocal Embedding: I developed an ingest.py script. Instead of calling expensive cloud APIs, it uses the MacBook\u0026rsquo;s CPU directly to convert over 2,000 Game of Thrones script lines into vectors via FastEmbed (ONNX). Decoupled Upload: After generating vectors.json, it is uploaded in batches to the Firestore vector database. Q\u0026amp;A: Why not do it in the cloud? Because local computation is free! This \u0026ldquo;offline pre-processing\u0026rdquo; ensures the cloud only handles the core storage and querying. Embedding Model Choice # Why choose paraphrase-multilingual-MiniLM-L12-v2?\nMultilingual Support: I wanted a bot that understands both Chinese and English, and this model performs excellently in semantic alignment for both languages. Small User Base: Since the project has few users, I don\u0026rsquo;t need to use third-party pay-per-token embedding model APIs to balance load, which would cost more. Tiny but Mighty: It only has 384 dimensions. Lower dimensionality means lower storage costs in Firestore and faster queries. Compared to OpenAI\u0026rsquo;s 1,536-dimensional models, its computation speed on a CPU is extremely fast, making it ideal for running in Cloud Run containers without GPUs. ONNX Compatibility: Through FastEmbed, this model runs in ONNX format. 
This eliminates the need for the heavy PyTorch library, resulting in smaller container sizes and faster cold starts. Firestore Vector and History Database # Vector Store Implementation: Firestore supports KNN vector search. I converted script lines into 384-dimensional vectors locally and uploaded them to Firestore. When a user sends a prompt, executing a KNN vector comparison in Firestore retrieves the most relevant lines. For example, even if you don\u0026rsquo;t mention \u0026ldquo;dragons,\u0026rdquo; if your meaning relates to \u0026ldquo;powerful force,\u0026rdquo; Firestore might pull up a Daenerys quote. Chat History Persistence: Besides vectors, the project stores the chat history for each whitelisted user in Firestore. Using sub-collections, it can retrieve the last 10 messages with minimal latency and inject them into the LLM prompt. This gives the project long-term memory; it will remember you complained about your boss two minutes ago. Why not a dedicated vector database (like Pinecone)? At this scale, Firestore\u0026rsquo;s \u0026ldquo;one-stop shop\u0026rdquo; allows us to perform vector retrieval and history logging within the same ACID transaction space. This reduces a network hop, prioritizing low latency and easier maintenance. Phase 2: The Runtime # IQ and Speed — LLM (Groq) # I considered and tested several LLM options before choosing Groq with Llama 3.3 70B.\nFor a RAG project, the LLM doesn\u0026rsquo;t need to be \u0026ldquo;massive\u0026rdquo; because the core information is the data provided by the developer, not the LLM\u0026rsquo;s internal knowledge. In this scenario, the LLM just needs to answer quickly. Groq\u0026rsquo;s LPU (Language Processing Unit) architecture provides blazing-fast inference speeds. While typical GPUs are still loading model weights, Groq has already streamed hundreds of tokens. Model Selection: Llama 3.3 70B: I needed a model that supports both Chinese and English contexts and can maintain a specific persona. 
Llama 3.3 70B\u0026rsquo;s reasoning capabilities are close to GPT-4, and its response speed on Groq is remarkably fast. Pareto Efficiency of Cost and Performance: Groq is currently very, very generous to developers! It allows me to enjoy the logical reasoning power highly praised in the open-source world with minimal (or zero for low volume) API costs. LangChain # LangChain is responsible for coordinating the dialogue between the user, the embedding model, and the LLM. I used the FastEmbed wrapper within LangChain\u0026rsquo;s Embeddings interface. When a user asks, \u0026ldquo;Should I drink this wine?\u0026rdquo;, the process looks like this:\nLangChain calls FastEmbed to convert the sentence into a 384-dimensional array like [0.12, -0.05, ...]. LangChain passes this array to Firestore for vector comparison. The LLM then receives the relevant script lines retrieved by LangChain. GCP Cloud Run # Scale-to-Zero: I chose Google Cloud Run. Since the bot might only be called a dozen times a day, Cloud Run only charges when a request is active. When idle, the bill stays at $0. (Cloud Run would not be cost-effective for a high-traffic bot). Low Maintenance: With a small user base, there\u0026rsquo;s no need for a persistent GKE cluster and its complex maintenance. GCP Ecosystem: GCP offers Firestore, Cloud Run, and Artifact Registry as integrated services. Having everything in one place makes checking logs and managing images very convenient. CI/CD: Terraform \u0026amp; GitHub Actions # Infrastructure as Code (IaC): Everything is defined with Terraform, from Firestore index configurations and Secret Manager keys to Cloud Run permissions. This ensures reproducibility and reduces operational complexity. If I ever need to move the bot to another project, a terraform apply can rebuild everything in five minutes. It also prevents human errors associated with manual console configuration. 
CI/CD with GitHub Actions: Automated deployment pipelines ensure that every code push triggers a Docker build and push to Artifact Registry. Secure Environment Variables: Telegram tokens and Groq API keys are stored in GCP Secret Manager and authorized to the Cloud Run service account via Terraform. Sensitive info never appears in logs or source code, adhering to the Principle of Least Privilege. Telegram Bot Webhook # Low Overhead: Telegram bots can communicate via long-polling (inefficiently checking for messages every second) or Webhooks. Webhooks are passive triggers; the backend only reacts when a message is received. Scale to Zero: When someone messages the bot, Telegram sends an HTTPS POST request to our FastAPI backend on Cloud Run. This perfectly fits the \u0026ldquo;Scale-to-Zero\u0026rdquo; requirement. Token Verification: To prevent malicious calls, I implemented a Secret Token verification mechanism checking the X-Telegram-Bot-Api-Secret-Token header. Access Control (The Gatekeeper): I implemented a Telegram User ID whitelist in the code. Only authorized users can interact with the bot, protecting my budget from random traffic. 
Architecture Blueprint # graph TD subgraph \"Phase 1: Local Ingestion (Data Alchemy)\" CSV[GoT Script CSV] --\u003e|Local Read| ING[scripts/ingest.py] ING --\u003e|LangChain + FastEmbed| FE_L[MiniLM-L12 Model] FE_L --\u003e|Generate 384D Vectors| VEC[vectors.json] VEC --\u003e|Batch Upload| FS_V[(Firestore: Vector Script Store)] end subgraph \"Phase 2: Live Runtime (Cloud Phase)\" User[Whitelisted User] --\u003e|Telegram Message| App[Cloud Run: FastAPI] subgraph \"Google Cloud Citadel\" App --\u003e|Real-time Vectorization| FE_C[FastEmbed in Container] App --\u003e|Vector Retrieval| FS_V[(Firestore: Vector Script Store)] App --\u003e|Memory Extraction| FS_H[(Firestore: Chat History)] FS_V --\u003e|Relevant Lines| App FS_H --\u003e|Conversation Context| App end App --\u003e|Inject Prompt| Groq[Groq: Llama 3.3 70B] Groq --\u003e|Fast Response Generation| App App --\u003e|Return Result| User end Conclusion # This project successfully created an extremely low-cost RAG application with several key features:\nDecoupled Dev/Prod: Ingestion pipeline is local; runtime is in the cloud. Robustness: Low probability of GCP downtime or being overwhelmed by attacks. Security: Ensured by whitelisting, Telegram headers, and Secret Manager. Balanced Performance \u0026amp; Cost: Leveraging Groq\u0026rsquo;s speed and Cloud Run\u0026rsquo;s elasticity provides a responsive and nearly zero-cost implementation for a specific user scale. The design principle: Use minimal resources to design the most rational, secure, and stable architecture.\n","date":"20 March 2026","externalUrl":null,"permalink":"/posts/build-rag-system-firestore-vector-search-iron-counsel.html","section":"Posts","summary":"RAG (Retrieval Augmented Generation) is an AI framework that allows developers to add external information without retraining the LLM, improving the accuracy of its answers. 
As of 2026, it is a widely known technology.\nThe concept is roughly as follows: First, vectorize external information (the data you want the LLM to know) using an embedding model and store it. After a user enters a prompt, the prompt is also vectorized using the same embedding model. It is then compared against the previously stored vectors to retrieve the most similar pieces of data. These are then integrated by the LLM to generate a response for the user. This approach allows the LLM to answer using specific knowledge integrated by developers without the need for retraining.\n","title":"Beyond the Wall: Building a Low-Cost, High-Efficiency Cloud RAG Application with Firestore Vector Search","type":"posts"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/categories/index.html","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/embedding-model.html","section":"Tags","summary":"","title":"Embedding Model","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/gcp.html","section":"Tags","summary":"","title":"GCP","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/rag.html","section":"Tags","summary":"","title":"RAG","type":"tags"},{"content":"","date":"20 March 2026","externalUrl":null,"permalink":"/tags/telegram-bot.html","section":"Tags","summary":"","title":"Telegram Bot","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/developer-productivity.html","section":"Tags","summary":"","title":"Developer Productivity","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/categories/developer-tools.html","section":"Categories","summary":"","title":"Developer Tools","type":"categories"},{"content":"","date":"14 November 
2025","externalUrl":null,"permalink":"/tags/devtools.html","section":"Tags","summary":"","title":"DevTools","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/gitcoin.html","section":"Tags","summary":"","title":"Gitcoin","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/github.html","section":"Tags","summary":"","title":"GitHub","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/github-cli.html","section":"Tags","summary":"","title":"GitHub CLI","type":"tags"},{"content":"If you’re a developer who uses GitHub daily, you probably rely on notification badges to track issues, pull requests, and mentions. But what happens when the notification badge gets stuck — even after clearing everything?\nFor months, I saw a 1 notification badge that refused to disappear, even though there aren\u0026rsquo;t any unread messages in any inbox folder. No archived items, no subscriptions, nothing hidden. Still, the badge remained.\nMany developers started reporting the same issue as early as September 2025. This wasn’t just a UI bug — it was caused by a spam attack impersonating Gitcoin, leaving backend notification records that GitHub never automatically cleaned up.\nHere’s the full story and, most importantly, how to fix the phantom notification for real.\nWhy does this “ghost notification” exist? # A large spam wave impersonated Gitcoin by creating fake repositories such as:\nGitcoin-Developer/gitcoin.com Gitcoin-Developers/gitcoin.com Attackers added random users to issues and PR threads, which auto-generated GitHub notifications. 
When the repositories were later deleted or blocked, the notification entries remained in GitHub’s backend without corresponding UI elements — leaving many users with:\n🔴 A permanent unread badge\n❌ No visible notification in the UI\n🧠 A database thread that cannot be cleared normally\nSo if you’ve ignored the badge for months and now want to clean it up, there’s nothing you can do through the web interface.\nHow the scam worked # The scam process:\nAttackers create a fake repo or issue\nMass-mention developers to generate real GitHub notifications\nDirect users to a site asking them to connect a crypto wallet\nRequest approval or “verification” transaction\nDrain assets once permissions are granted\nWhile the scam itself isn’t the focus of this post, it explains why so many developers have the same persistent notification problem.\nOnce the scam repo or issue is deleted,\nGitHub’s UI has nothing to open → but still marks it unread → and it gets stuck forever.\nScam Technique Explanation Fake Gitcoin repo Designed to appear legitimate via similar organization name Auto-adding users People were tagged without consent to trigger notification curiosity Crypto-reward bait Threads referenced bounties or airdrops Phishing links Redirected to fake wallet-connect pages Even if you never clicked anything, the notification artifact remained.\nHow to remove the stuck GitHub notification # The solution comes from this community discussion\n✔ Best solution: Clear the backend notification record via GitHub CLI\nStep 1 — List all active notification threads # gh api notifications | jq \u0026#39;.[] | { id, title: .subject.title, repo: .repository.full_name }\u0026#39; This will show threads still registered even if invisible with their thread IDs.\nStep 2 — Delete the target thread # Replace $THREAD_ID with the ID you find:\ngh api --method DELETE notifications/threads/$THREAD_ID Refresh GitHub — the red badge disappears immediately.\nIf you don’t have jq installed # gh api 
notifications gh api --method DELETE notifications/threads/THREAD_ID Conclusion # The persistent notification badge wasn’t just an interface glitch — it was a leftover record from a wide-scale scam attack that GitHub never fully cleaned up. Many developers have dealt with this for months, and UI clearing methods don’t help because nothing actually remains in the visible inbox.\nThe GitHub CLI deletion method is currently the only reliable fix.\nReferences # GitHub community discussion\n","date":"14 November 2025","externalUrl":null,"permalink":"/posts/remove-ghost-notification-github-gitcoin-spam.html","section":"Posts","summary":"If you’re a developer who uses GitHub daily, you probably rely on notification badges to track issues, pull requests, and mentions. But what happens when the notification badge gets stuck — even after clearing everything?\nFor months, I saw a 1 notification badge that refused to disappear, even though there aren’t any unread messages in any inbox folder. No archived items, no subscriptions, nothing hidden. Still, the badge remained.\nMany developers started reporting the same issue as early as September 2025. 
This wasn’t just a UI bug — it was caused by a spam attack impersonating Gitcoin, leaving backend notification records that GitHub never automatically cleaned up.\n","title":"How I Finally Removed GitHub’s Persistent “Ghost Notification” — The Real Fix With GitHub CLI","type":"posts"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/notifications.html","section":"Tags","summary":"","title":"Notifications","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/phantom-notification.html","section":"Tags","summary":"","title":"Phantom Notification","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/scam.html","section":"Tags","summary":"","title":"Scam","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/security.html","section":"Tags","summary":"","title":"Security","type":"tags"},{"content":"","date":"14 November 2025","externalUrl":null,"permalink":"/tags/troubleshooting.html","section":"Tags","summary":"","title":"Troubleshooting","type":"tags"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/asyncio.html","section":"Tags","summary":"","title":"Asyncio","type":"tags"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/concurrency.html","section":"Tags","summary":"","title":"Concurrency","type":"tags"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/categories/concurrency-programming.html","section":"Categories","summary":"","title":"Concurrency Programming","type":"categories"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/coroutine.html","section":"Tags","summary":"","title":"Coroutine","type":"tags"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/gil.html","section":"Tags","summary":"","title":"GIL","type":"tags"},{"content":"","date":"25 October 
2025","externalUrl":null,"permalink":"/tags/multiprocessing.html","section":"Tags","summary":"","title":"Multiprocessing","type":"tags"},{"content":"Python\u0026rsquo;s performance bottlenecks were criticized for years,\nbut thanks to the hard work of developers,\nAsyncio was introduced in Python 3.4 to improve performance in specific scenarios.\nBy Python 3.13, the Free-threaded design (PEP-703) emerged,\nallowing the optional disabling of the GIL.\nCombined with the pre-existing Multiprocessing and Multithreading,\nI have compiled a few records on the principles, differences, and use cases for these three technologies.\nThis first post will briefly introduce the basic concepts and suitable scenarios for each.\nMultiprocessing # Multiple processes can be created and executed in parallel by a single program.\nEach process has its own independent memory space and therefore can completely bypass the limitations of Python\u0026rsquo;s GIL (Global Interpreter Lock).\nThis means that regardless of which version of Python you are using, you can execute multiple processes in parallel on multi-core CPUs, independently and without interference.\nUse Case:\nCPU-bound tasks, such as extensive mathematical calculations, data processing, image recognition, etc.\nIt can effectively utilize the computing power of multi-core CPUs.\nAdditionally, due to the isolation of process memory space,\nif a single process crashes, it won\u0026rsquo;t affect other running processes or the main program.\nimport multiprocessing import time def cpu_bound_task(n): count = 0 for i in range(n): count += i print(f\u0026#34;Finished task with {n}\u0026#34;) if __name__ == \u0026#39;__main__\u0026#39;: start_time = time.time() processes = [] for i in range(4): p = multiprocessing.Process(target=cpu_bound_task, args=(10**7,)) processes.append(p) p.start() for p in processes: p.join() end_time = time.time() print(f\u0026#34;Multiprocessing took {end_time - start_time:.2f} seconds.\u0026#34;) Pros: Can 
achieve true parallelism using multi-core CPUs. No GIL limitation. Independent memory space between processes leads to high stability and a lower likelihood of race conditions. Cons: Creating independent processes requires more resources (CPU, memory). Inter-process communication (IPC) is more complex, requiring mechanisms like Queue, Pipe, or shared memory, which results in higher latency. Multithreading # Multiple threads are created within a single process.\nThese threads share the same memory space (Heap) used by the process, allowing for easy data sharing and exchange.\nIn versions before Python 3.13, unlike other programming languages like C/C++,\nPython\u0026rsquo;s multithreading was limited by the Python GIL (Global Interpreter Lock).\nEven when running on a multi-core CPU,\nPython\u0026rsquo;s multithreading could not actually achieve parallel computing.\nGIL (Global Interpreter Lock): The GIL is a mechanism in CPython (the official Python implementation) designed to protect Python objects (like dicts, lists, etc.) 
from corruption.\nThis mechanism ensures that only one thread can execute Python bytecode at any given time.\nThis means that for CPU-bound tasks, even on a multi-core CPU,\nPython\u0026rsquo;s multithreading can only execute on a single core at a time,\nfailing to achieve true parallelism for speedup.\nHistorically, under the GIL, when a thread encountered an I/O operation (like reading/writing a file or a network request), it would release the GIL,\nallowing other threads a chance to run.\nTherefore, Multithreading was traditionally used for handling I/O-bound tasks.\nAs Python\u0026rsquo;s user base grew, demands like PEP-703 emerged.\nStarting from Python 3.13, an experimental feature to optionally disable the GIL (free-threading mode) was included.\nPython 3.14 introduced a GIL-free, Free-threaded version.\nIn these versions, Python\u0026rsquo;s Multithreading can finally break through the limits and handle CPU-bound tasks in parallel across multiple cores, avoiding the IPC overhead of Multiprocessing.\nUse Case:\nBefore Python 3.13: I/O-bound tasks, such as web scraping, file downloads, API requests, etc. Python 3.13+ (with Free-threaded mode): All types of tasks, including CPU-bound ones. import threading import requests import time def io_bound_task(url): try: response = requests.get(url) print(f\u0026#34;Downloaded {url} with status {response.status_code}\u0026#34;) except Exception as e: print(f\u0026#34;Error downloading {url}: {e}\u0026#34;) if __name__ == \u0026#39;__main__\u0026#39;: urls = [\u0026#34;https://www.google.com\u0026#34;] * 5 start_time = time.time() threads = [] for url in urls: t = threading.Thread(target=io_bound_task, args=(url,)) threads.append(t) t.start() for t in threads: t.join() end_time = time.time() print(f\u0026#34;Multithreading took {end_time - start_time:.2f} seconds.\u0026#34;) Pros: The overhead of creating a thread is smaller than that of a process. Shared memory makes data exchange convenient. 
Cons: Versions before Python 3.13 are limited by the GIL and cannot utilize multi-core CPUs for CPU-bound tasks. Requires handling thread synchronization issues, such as using Lock to avoid Race conditions. Asyncio I/O and Coroutines # Asyncio is a standard library introduced after Python 3.4.\nConceptually, it uses an Event Loop and Coroutines to achieve concurrency on a single thread.\nA Coroutine can be seen as a lightweight thread.\nIt can be controlled to pause at a point where it needs to wait for I/O (await),\nreturning control to the event loop to execute other coroutines.\nWhen the condition for the pause is met (e.g., the awaited I/O operation is complete),\nthe event loop returns to continue executing that coroutine.\nBesides being able to execute other coroutines during the pause,\nit also saves OS-level thread switching (context switch),\nwhich can significantly improve performance.\nUse Case:\nHighly concurrent I/O-bound tasks,\nespecially scenarios that require handling a large number of network connections simultaneously (like Web servers, chat applications, or massive API requests).\nimport asyncio import aiohttp import time async def async_io_bound_task(session, url): try: async with session.get(url) as response: print(f\u0026#34;Downloaded {url} with status {response.status}\u0026#34;) except Exception as e: print(f\u0026#34;Error downloading {url}: {e}\u0026#34;) async def main(): urls = [\u0026#34;https://www.google.com\u0026#34;] * 5 start_time = time.time() async with aiohttp.ClientSession() as session: tasks = [async_io_bound_task(session, url) for url in urls] await asyncio.gather(*tasks) end_time = time.time() print(f\u0026#34;Asyncio took {end_time - start_time:.2f} seconds.\u0026#34;) if __name__ == \u0026#39;__main__\u0026#39;: asyncio.run(main()) Pros: Extremely low context switching overhead, capable of handling a large number of I/O operations with high efficiency. 
Operates on a single thread, so there are no data races between OS threads (though application-layer race conditions are still possible across await points if not careful). Cons: Not suitable for CPU-bound tasks; a single CPU-bound task can block the entire Event Loop. Requires using async/await syntax and corresponding asynchronous library support (like aiohttp, asyncpg). Comparison Summary # Feature Multiprocessing Multithreading Asyncio Basic Unit Process Thread Coroutine Memory Space Independent Shared Shared (Single-threaded) GIL Impact None, bypassed Restricted in old versions; 3.13+ free-threaded builds can avoid None (Single-threaded) Parallelism/Concurrency Parallelism Old: Concurrency; 3.13+ free-threaded: Parallelism Concurrency Use Case CPU-bound, high fault tolerance/isolation General I/O-bound Massive/highly concurrent I/O-bound Pros Multi-core utility, high stability Shared memory, low overhead Extremely high I/O throughput, low overhead Cons High resource overhead, complex IPC Old versions (pre-3.13) limited by GIL; race conditions/Lock complexity Not for CPU-bound tasks (Event Loop will block) If your task is CPU-bound, requiring a lot of CPU computation, then multiprocessing in any version can handle it by fully utilizing multi-core CPUs. Multithreading on a free-threaded build of Python 3.13+ is also a viable option, reducing context switch overhead and IPC complexity. If your task is I/O-bound with relatively simple logic and a moderate number of connections, multithreading is a lightweight and straightforward choice. If your task is I/O-bound and requires handling a large number of concurrent connections (e.g., developing a Web server, API, or microservice), then asyncio typically provides the best performance and throughput. 
","date":"25 October 2025","externalUrl":null,"permalink":"/posts/python-concurrency-parallelism-multiprocessing-multithreading-asyncio.html","section":"Posts","summary":"Python’s performance bottlenecks were criticized for years,\nbut thanks to the hard work of developers,\nAsyncio was introduced in Python 3.4 to improve performance in specific scenarios.\nBy Python 3.13, the Free-threaded design (PEP-703) emerged,\nallowing the optional disabling of the GIL.\nCombined with the pre-existing Multiprocessing and Multithreading,\nI have compiled a few records on the principles, differences, and use cases for these three technologies.\nThis first post will briefly introduce the basic concepts and suitable scenarios for each.\n","title":"Multiprocessing, Multithreading and Asyncio in Python Part 1 - Basic Concept","type":"posts"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/multithreading.html","section":"Tags","summary":"","title":"Multithreading","type":"tags"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/parallelism.html","section":"Tags","summary":"","title":"Parallelism","type":"tags"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/performance.html","section":"Tags","summary":"","title":"Performance","type":"tags"},{"content":"","date":"25 October 2025","externalUrl":null,"permalink":"/tags/python.html","section":"Tags","summary":"","title":"Python","type":"tags"},{"content":"","date":"4 October 2025","externalUrl":null,"permalink":"/tags/debugging.html","section":"Tags","summary":"","title":"Debugging","type":"tags"},{"content":"","date":"4 October 2025","externalUrl":null,"permalink":"/categories/develop-environment.html","section":"Categories","summary":"","title":"Develop Environment","type":"categories"},{"content":"","date":"4 October 2025","externalUrl":null,"permalink":"/tags/macos.html","section":"Tags","summary":"","title":"Macos","type":"tags"},{"content":"A 
few months ago, I encountered an issue while using rsync to back up data from my MacBook to a NAS.\nrsync would appear to be running normally for a while and then suddenly hang indefinitely.\nThe terminal output showed it syncing files as usual, and then it just\u0026hellip; stopped.\nThere were no error messages, and rsync didn\u0026rsquo;t exit.\nInitially, I thought it might be a large file transfer or an unstable network connection. However, I discovered that if I killed the process and ran the rsync command again, it would resume smoothly from the file where it had previously stuck. This happened several times in a row!\nThis clearly wasn\u0026rsquo;t a network or file issue. I opened the built-in Activity Monitor on my Mac to see what was happening with rsync.\nWhat I saw was shocking:\nThe rsync status was \u0026ldquo;Terminated.\u0026rdquo; CPU usage was 0% because it had stopped, yet it was still hogging a massive amount of Virtual Memory (34 GB!!!!!). Its path was under /usr/bin, which meant it was the system\u0026rsquo;s built-in version.\nAfter some research, I found a Reddit thread mentioning that the built-in rsync on macOS is problematic. Following the advice there, I abandoned the outdated system version and switched to the Homebrew version.\nAfter that, over 20 GB of files synced without a hitch!\nI\u0026rsquo;m still quite puzzled. The Reddit comment was from three years ago, and the problem is very obvious whenever there\u0026rsquo;s a large number of files. 
Why hasn\u0026rsquo;t that outdated version been updated yet?\nI wonder if it\u0026rsquo;s due to legal considerations regarding GPL licenses that they refuse to update it?\n","date":"4 October 2025","externalUrl":null,"permalink":"/posts/resolve-macos-rsync-hangs.html","section":"Posts","summary":"A few months ago, I encountered an issue while using rsync to back up data from my MacBook to a NAS.\nrsync would appear to be running normally for a while and then suddenly hang indefinitely.\nThe terminal output showed it syncing files as usual, and then it just… stopped.\nThere were no error messages, and rsync didn’t exit.\nInitially, I thought it might be a large file transfer or an unstable network connection. However, I discovered that if I killed the process and ran the rsync command again, it would resume smoothly from the file where it had previously stuck. This happened several times in a row!\n","title":"MacOS Legacy Rsync Hangs","type":"posts"},{"content":"","date":"4 October 2025","externalUrl":null,"permalink":"/tags/productivity.html","section":"Tags","summary":"","title":"Productivity","type":"tags"},{"content":"","date":"4 October 2025","externalUrl":null,"permalink":"/tags/tools.html","section":"Tags","summary":"","title":"Tools","type":"tags"},{"content":"","date":"1 January 2025","externalUrl":null,"permalink":"/tags/ci/cd.html","section":"Tags","summary":"","title":"CI/CD","type":"tags"},{"content":"","date":"1 January 2025","externalUrl":null,"permalink":"/tags/fastapi.html","section":"Tags","summary":"","title":"FastAPI","type":"tags"},{"content":"Conversational AI app demonstrating accurate LLM inference with RAG over complex datasets. 
Tech stack: Serverless FastAPI backend on GCP (provisioned via Terraform), local ONNX embeddings, Firestore vector search, and LangChain orchestration.\n","date":"1 January 2025","externalUrl":"https://github.com/klee1611/iron-counsel","permalink":"/projects/iron-counsel.html","section":"Projects","summary":"Conversational AI app demonstrating accurate LLM inference with RAG over complex datasets. Tech stack: Serverless FastAPI backend on GCP (provisioned via Terraform), local ONNX embeddings, Firestore vector search, and LangChain orchestration.\n","title":"Iron Counsel: Enterprise RAG Architecture \u0026 Conversational AI","type":"projects"},{"content":"","date":"1 January 2025","externalUrl":null,"permalink":"/tags/langchain.html","section":"Tags","summary":"","title":"LangChain","type":"tags"},{"content":"","date":"1 January 2025","externalUrl":null,"permalink":"/tags/terraform.html","section":"Tags","summary":"","title":"Terraform","type":"tags"},{"content":"","date":"1 January 2025","externalUrl":null,"permalink":"/tags/vector-database.html","section":"Tags","summary":"","title":"Vector Database","type":"tags"},{"content":"","date":"25 December 2024","externalUrl":null,"permalink":"/tags/joplin.html","section":"Tags","summary":"","title":"Joplin","type":"tags"},{"content":"","date":"25 December 2024","externalUrl":null,"permalink":"/tags/obsidian.html","section":"Tags","summary":"","title":"Obsidian","type":"tags"},{"content":"I originally used Notion as my note-taking software.\nIt\u0026rsquo;s feature-rich and has a beautiful interface.\nHowever, a few years ago, a privacy controversy arose around Notion,\naccusing them of looking at a company\u0026rsquo;s content stored in Notion,\nand even proposing a partnership based on that information.\nSo, I switched to Joplin for a while,\nbut eventually moved to Obsidian, which has a large number of plugins, strong community support, and is highly customizable.\nJoplin natively supports WebDAV 
synchronization.\nAfter switching to Obsidian, I found the \u0026ldquo;Remotely Save\u0026rdquo; plugin, which also supports WebDAV.\nThis allows me to store my notes on my own NAS\nand keep them synchronized across different devices.\nWebDAV and HTTPS Certificate Setup - Connecting to WebDAV Server on an Internal NAS Through a Router # Equipment # Router: Synology RT2600AC NAS: Synology 918+ Environment Setup # After installing the WebDAV Server package on the Synology NAS,\nfor security reasons, I disabled HTTP connections in the WebDAV settings\nand only enabled HTTPS access.\n(Although I\u0026rsquo;m not entirely sure, the mobile version of Joplin seems to be restricted to HTTPS for security reasons as well).\nThen, I configured port forwarding on my router as shown in the image below.\nSince I had already set up DDNS for the router,\nI could already connect to the router via a URL.\nSo I went ahead and tested the WebDAV connection.\nIn Joplin\u0026rsquo;s connection settings, I selected WebDAV,\nand entered the URL, port number, and the folder to store the notes in the WebDAV URL field\n(e.g., https://.....:5xxx/homes/xxx/xxx/xxx).\nAfter clicking the test connection button, I found a certificate issue.\nAfter some googling, I discovered that this was because the domain name in the connection URL corresponded to the router,\nwhich then forwarded the connection to the NAS.\nHowever, the NAS did not have a certificate for that domain name,\ncausing a mismatch between the certificate and the URL when Joplin tried to connect.\n(You can use https://www.geocerts.com/ssl-checker to check this).\nSo, I exported the certificate from the router and imported it to the NAS.\nThen, in the NAS settings, I found the certificate configuration\nand changed the WebDAV certificate to the one imported from the router,\nas shown in the image below.\nThen I went to https://www.geocerts.com/ssl-checker and ran another test.\nThis time, I confirmed that the certificate for the port 
connected via HTTPS was correct.\nI went back to Joplin\u0026rsquo;s connection settings and tested the connection again to confirm it was successful, and synchronization worked smoothly.\nSupplementary Information - Notion Privacy Controversy # In a 2020 post in the Notion.Taiwan Official Community, someone mentioned that\nNotion would look at user data and even propose partnerships:\nIn the end, Notion\u0026rsquo;s official statement removed the controversial \u0026ldquo;Business Development and Strategic Partnerships\u0026rdquo; clause.\n","date":"25 December 2024","externalUrl":null,"permalink":"/posts/sync-obsidian-joplin-data-across-multiple-device-synology-webdav.html","section":"Posts","summary":"I originally used Notion as my note-taking software.\nIt’s feature-rich and has a beautiful interface.\nHowever, a few years ago, a privacy controversy arose around Notion,\naccusing them of looking at a company’s content stored in Notion,\nand even proposing a partnership based on that information.\nSo, I switched to Joplin for a while,\nbut eventually moved to Obsidian, which has a large number of plugins, strong community support, and is highly customizable.\n","title":"Sync Obsidian / Joplin Data Across Multiple Devices with Synology WebDAV","type":"posts"},{"content":"","date":"25 December 2024","externalUrl":null,"permalink":"/tags/synology.html","section":"Tags","summary":"","title":"Synology","type":"tags"},{"content":"","date":"25 December 2024","externalUrl":null,"permalink":"/categories/tools.html","section":"Categories","summary":"","title":"Tools","type":"categories"},{"content":"","date":"25 December 2024","externalUrl":null,"permalink":"/tags/webdav.html","section":"Tags","summary":"","title":"WebDAV","type":"tags"},{"content":"","date":"1 March 2022","externalUrl":null,"permalink":"/tags/docker.html","section":"Tags","summary":"","title":"Docker","type":"tags"},{"content":"A production-grade, asynchronous Python microservice boilerplate engineered 
to standardize enterprise backend development and eradicate legacy I/O bottlenecks. Enforced strict organizational engineering culture by integrating Docker and mandating a baseline of 95% test coverage.\n","date":"1 March 2022","externalUrl":"https://github.com/klee1611/cookiecutter-fastapi-mongo","permalink":"/projects/cookiecutter-fastapi-mongo.html","section":"Projects","summary":"A production-grade, asynchronous Python microservice boilerplate engineered to standardize enterprise backend development and eradicate legacy I/O bottlenecks. Enforced strict organizational engineering culture by integrating Docker and mandating a baseline of 95% test coverage.\n","title":"Enterprise Standard: Asynchronous FastAPI Architecture","type":"projects"},{"content":"","date":"1 March 2022","externalUrl":null,"permalink":"/tags/microservices.html","section":"Tags","summary":"","title":"Microservices","type":"tags"},{"content":"","date":"1 March 2022","externalUrl":null,"permalink":"/tags/mongodb.html","section":"Tags","summary":"","title":"MongoDB","type":"tags"},{"content":"","date":"1 March 2022","externalUrl":null,"permalink":"/tags/software-standardization.html","section":"Tags","summary":"","title":"Software Standardization","type":"tags"},{"content":"","date":"6 November 2021","externalUrl":null,"permalink":"/tags/global-packages.html","section":"Tags","summary":"","title":"Global Packages","type":"tags"},{"content":"Today I encountered a problem:\nAfter installing nvm, the path for installing global packages changed,\nmaking it impossible to directly remove previously installed global packages using npm uninstall -g.\nHow did I discover this?\nA long time ago, I installed a global package that could be executed directly from the terminal using a command.\nBut because it was so long ago,\nwhen I tried to upgrade that package, I found it wasn\u0026rsquo;t listed in npm list -g.\nSo, I first used which to find its location,\nthen discovered it was a symbolic link and used 
ls -al to see where that link pointed.\nI found it was under /usr/lib/node_modules,\nwhich clearly indicated it was installed with npm -g.\nThen I carefully re-examined the output of npm list -g,\nand found that other global packages were listed under /Users/\u0026lt;USER_NAME\u0026gt;/.nvm/versions/node/v16.5.0/lib.\nAfter some Googling, I found a way to list the global packages installed before nvm using nvm use system \u0026amp;\u0026amp; npm ls -g --depth=0.\nTragically, it showed:\nSystem version of node not found. It seems I had already removed node from the system\u0026hellip;\nSo, I found another command, nvm deactivate, to temporarily disable nvm.\nThen, I reinstalled node using brew.\nAfter that, I ran npm list -g again,\nand finally saw the package that was installed before nvm!!!\nHooray!!\nI could finally successfully remove/upgrade the previously installed global package.\nAfter resolving the issue, to bring nvm back, simply restart the shell with source ~/.zshrc or similar.\n","date":"6 November 2021","externalUrl":null,"permalink":"/posts/managing-pre-exist-global-npm-packages-after-installing-nvm.html","section":"Posts","summary":"Today I encountered a problem:\nAfter installing nvm, the path for installing global packages changed,\nmaking it impossible to directly remove previously installed global packages using npm uninstall -g.\nHow did I discover this?\nA long time ago, I installed a global package that could be executed directly from the terminal using a command.\nBut because it was so long ago,\nwhen I tried to upgrade that package, I found it wasn’t listed in npm list -g.\n","title":"Managing Pre-existing Global NPM Packages After Installing NVM","type":"posts"},{"content":"","date":"6 November 2021","externalUrl":null,"permalink":"/tags/node.js.html","section":"Tags","summary":"","title":"Node.js","type":"tags"},{"content":"","date":"6 November 
2021","externalUrl":null,"permalink":"/tags/npm.html","section":"Tags","summary":"","title":"NPM","type":"tags"},{"content":"","date":"6 November 2021","externalUrl":null,"permalink":"/tags/nvm.html","section":"Tags","summary":"","title":"NVM","type":"tags"},{"content":"","date":"1 November 2021","externalUrl":null,"permalink":"/tags/programming.html","section":"Tags","summary":"","title":"Programming","type":"tags"},{"content":"","date":"1 November 2021","externalUrl":null,"permalink":"/tags/pyenv.html","section":"Tags","summary":"","title":"Pyenv","type":"tags"},{"content":" Functions and Reasons for Using pyenv # pyenv is a tool used to install various versions of Python on a system,\nand to conveniently switch between Python versions.\nWhen you need to develop or maintain projects that require different Python versions simultaneously,\nyou will need to use pyenv to help switch Python versions.\nNew Python versions usually include syntax updates or new features.\nFor example, Python\u0026rsquo;s async / await feature appeared only in Python 3.5 and later.\nProjects developed with Python versions below 3.5 cannot use it.\nAnother example is having projects that use both Python 2 and Python 3.\nSince Python 2 and Python 3 are syntactically incompatible,\nit is necessary to install both Python 2 and Python 3 on the system.\nIn such cases, pyenv can be used to conveniently switch Python versions.\nInstallation and Initialization # Installation\nbrew install pyenv After installation, run initialization\npyenv init Then, follow the instructions to paste the displayed code into ~/.zshrc or ~/.bash_profile\nCommon Commands # List available Python versions for installation pyenv install --list This will show:\nAvailable versions: 2.1.3 2.2.3 2.3.7 2.4.0 ... 3.9.6 3.9.7 3.10.0 3.10-dev 3.11.0a1 ... 
Install a specific Python version pyenv install 3.10.0 Observe which Python versions have been installed pyenv versions This will show:\n* system (set by ......./.pyenv/version) 3.10.0 This indicates that the system\u0026rsquo;s default version and the recently installed 3.10.0 are available,\nbut the currently used Python version is the system\u0026rsquo;s default.\nSwitch the system-wide Python version pyenv global 3.10.0 Running pyenv versions anywhere in the system will show that the currently used Python version is 3.10.0.\nSwitch the Python version only for the current directory pyenv local 3.7.12 In the current directory, pyenv versions will show that version 3.7.12 is being used.\nHowever, in other directories,\nif a version was previously set using pyenv global (e.g., 3.10.0),\nthen pyenv versions will show the version set by pyenv global (3.10.0).\nIf pyenv global was not run to set a version, pyenv versions will show the system\u0026rsquo;s default version.\nUninstall a specific Python version pyenv uninstall 3.7.12 Reference # pyenv Github\n","date":"1 November 2021","externalUrl":null,"permalink":"/posts/pyenv-notes.html","section":"Posts","summary":"Functions and Reasons for Using pyenv # pyenv is a tool used to install various versions of Python on a system,\nand to conveniently switch between Python versions.\nWhen you need to develop or maintain projects that require different Python versions simultaneously,\nyou will need to use pyenv to help switch Python versions.\nNew Python versions usually include syntax updates or new features.\n","title":"Pyenv Notes","type":"posts"},{"content":"","date":"27 October 2021","externalUrl":null,"permalink":"/tags/concurrent-processing.html","section":"Tags","summary":"","title":"Concurrent Processing","type":"tags"},{"content":"Before the advent of asyncio,\nwhen a Python program had many tasks that needed to be executed concurrently,\nand wanted to improve program performance,\nthe only options were 
multiprocessing or threading.\nAfter Python 3.4, asyncio became another option.\nasyncio can be used to write coroutines,\nand execute coroutines concurrently using an event loop,\nreducing unnecessary waiting time in the program to improve performance.\nCoroutines # Coroutine Definition # In the Python official documentation,\nPython coroutines are defined as:\nCoroutines are a more generalized form of subroutines. Subroutines are entered at one point and exited at another point. Coroutines can be entered, exited, and resumed at many different points. They can be implemented with the async def statement. See also PEP 492.\nThis means that Python coroutines are quite similar to subroutines.\nThe difference is that a subroutine executes from start to finish in one go,\nand then terminates.\nA coroutine, on the other hand,\ncan execute up to a certain point, pause, and then resume execution later.\nDefining and Executing Coroutines with async, await, and asyncio.run # async can be used to define a coroutine.\nBy adding async before the def keyword when defining a function, you can define a coroutine using async def.\nawait is used to define a suspension point in a coroutine.\nWhen await is encountered,\nthe coroutine can pause,\nand then resume execution later.\nawait can only be followed by an awaitable object.\nAwaitable objects include coroutines or event loop tasks, etc.\nimport asyncio async def ten_sec_sleep(): await asyncio.sleep(10) print(\u0026#39;10 sec sleep finish\u0026#39;) if __name__ == \u0026#39;__main__\u0026#39;: asyncio.run(ten_sec_sleep()) Event Loop # What is an event loop # In the Python official documentation,\nthe event loop is introduced as:\nThe event loop is the core of every asyncio application. 
Event loops run asynchronous tasks and callbacks, perform network IO operations, and run subprocesses.\nSimply put, it is used to run asynchronously executing tasks.\nAn event loop executes only one task at a time.\nWhen running coroutines using an event loop,\nwhen a task reaches a programmer-defined suspension point,\nthe event loop pauses and schedules that task,\nthen switches to execute other work (which could be other tasks or callbacks, etc.).\nThis makes the combination of event loop and coroutine particularly suitable for handling I/O-bound tasks;\nby defining the I/O operations of a coroutine as suspension points,\nand using an event loop to run these coroutines,\nthe time spent waiting for I/O can be used to perform other work.\nExecuting Coroutines with an Event Loop # Executing a single coroutine:\nimport asyncio async def ten_sec_sleep(count): await asyncio.sleep(10) print(f\u0026#39;10 sec sleep finish, count: {count}\u0026#39;) if __name__ == \u0026#39;__main__\u0026#39;: loop = asyncio.get_event_loop() task = loop.create_task(ten_sec_sleep(0)) loop.run_until_complete(task) The execution time using the time command is 10.09 seconds.\n10 sec sleep finish, count: 0 10.09 real 0.06 user 0.01 sys Executing multiple coroutines concurrently:\nimport asyncio async def ten_sec_sleep(count): await asyncio.sleep(10) print(f\u0026#39;10 sec sleep finish, count: {count}\u0026#39;) if __name__ == \u0026#39;__main__\u0026#39;: loop = asyncio.get_event_loop() tasks = [] for i in range(10): tasks.append(loop.create_task(ten_sec_sleep(i))) loop.run_until_complete(asyncio.wait(tasks)) Whenever sleep(10) is executed,\nthe event loop can switch to another coroutine to execute.\nThe concurrent execution time for coroutines using the time command is 10.09 seconds.\n10 sec sleep finish, count: 0 10 sec sleep finish, count: 1 10 sec sleep finish, count: 2 10 sec sleep finish, count: 3 10 sec sleep finish, count: 4 10 sec sleep finish, count: 5 10 sec sleep finish, count: 6 
10 sec sleep finish, count: 7 10 sec sleep finish, count: 8 10 sec sleep finish, count: 9 10.09 real 0.07 user 0.01 sys Performance Measurement # A program that continuously sends 10 requests to Google,\nwithout using coroutines:\nimport requests def issue_req(count): resp = requests.get(\u0026#39;http://www.google.com.tw\u0026#39;) print(f\u0026#39;count: {count}, resp status: {resp.status_code}\u0026#39;) if __name__ == \u0026#39;__main__\u0026#39;: for i in range(10): issue_req(i) The time required using the time command is 0.83 seconds.\ncount: 0, resp status: 200 count: 1, resp status: 200 count: 2, resp status: 200 count: 3, resp status: 200 count: 4, resp status: 200 count: 5, resp status: 200 count: 6, resp status: 200 count: 7, resp status: 200 count: 8, resp status: 200 count: 9, resp status: 200 0.83 real 0.16 user 0.05 sys Using coroutines to send requests concurrently:\nimport requests import asyncio async def issue_req(count): loop = asyncio.get_event_loop() resp = await loop.run_in_executor( None, requests.get, \u0026#39;http://www.google.com.tw\u0026#39; ) print(f\u0026#39;count: {count}, resp status: {resp.status_code}\u0026#39;) if __name__ == \u0026#39;__main__\u0026#39;: loop = asyncio.get_event_loop() tasks = [] for i in range(10): tasks.append(loop.create_task(issue_req(i))) loop.run_until_complete(asyncio.wait(tasks)) The time required using the time command is 0.31 seconds.\ncount: 0, resp status: 200 count: 2, resp status: 200 count: 3, resp status: 200 count: 7, resp status: 200 count: 5, resp status: 200 count: 1, resp status: 200 count: 9, resp status: 200 count: 6, resp status: 200 count: 8, resp status: 200 count: 4, resp status: 200 0.31 real 0.18 user 0.05 sys Looking at these 10 requests,\nfrom 0.83 seconds to 0.31 seconds,\nthe performance improved by (0.83 - 0.31)/0.83 * 100% = 62.65%,\nwhich is quite significant.\nFor programs with many I/O-bound tasks,\nusing coroutines is a good choice.\n","date":"27 October 
2021","externalUrl":null,"permalink":"/posts/python-coroutine-asyncio.html","section":"Posts","summary":"Before the advent of asyncio,\nwhen a Python program had many tasks that needed to be executed concurrently,\nand wanted to improve program performance,\nthe only options were multiprocessing or threading.\nAfter Python 3.4, asyncio became another option.\nasyncio can be used to write coroutines,\nand execute coroutines concurrently using an event loop,\nreducing unnecessary waiting time in the program to improve performance.\n","title":"Python Coroutine Asyncio","type":"posts"},{"content":"","date":"26 September 2021","externalUrl":null,"permalink":"/tags/pipenv.html","section":"Tags","summary":"","title":"Pipenv","type":"tags"},{"content":"Why Pipenv # When maintaining many Python projects,\ndifferent projects might use different versions of the same Python libraries.\nNot using a virtual environment and installing all Python modules directly on your machine will lead to version conflicts.\nIn the past, the mechanism of virtualenv + requirements.txt allowed different projects to use different versions of the same package,\nand also enabled new developers or production environments to quickly install the packages required by the project.\nHowever, updating packages was quite troublesome,\nrequiring manual re-exporting of a new requirements.txt.\nFurthermore, when a project had different environment requirements (e.g., development environment and production environment),\ntwo sets of package configurations, requirements-prod.txt and requirements-dev.txt, had to be maintained.\nWithout pyenv, it was also impossible to switch between different Python versions.\nLater, the officially recommended pipenv from Python solved these problems,\nmaking it convenient to achieve the following with just commands:\nCreate independent Python versions and package virtual environments. 
Install and record package versions in automatically generated Pipfile and Pipfile.lock, while verifying package integrity through the hashes recorded in Pipfile.lock. Record package usage environments (separating development and production environments). Read .env files to set environment variables for the virtual environment. Automatically switch Python versions in the system (or Python versions installed with pyenv). Operating Pipenv # Install Pipenv # pip3 install pipenv Pipenv Commands # Create an independent virtual environment for a specific Python version:\nNavigate to the project directory.\npipenv --python 3.8 Note that the specified Python version must be available on the system; otherwise, you need to install it using pyenv. Install packages\npipenv install flask Install development packages\nPackages installed with --dev will be placed under [dev-packages] in the automatically generated Pipfile.\npipenv install pytest --dev Uninstall packages\npipenv uninstall flask Execute scripts in the created virtual environment\npipenv run python server.py Other commands like pytest can also be executed.\npipenv run pytest Enter the virtual environment\npipenv shell To exit the virtual environment, simply type exit.\nCreate a virtual environment using existing Pipfile and Pipfile.lock\npipenv install To install development environment packages as well:\npipenv install --dev Create Pipfile and Pipfile.lock from requirements.txt\npipenv install -r requirements.txt Output requirements.txt\nThis is generally not necessary, but in some special cases (e.g., requirements for specific platforms), it can still be done.\npipenv lock --requirements \u0026gt; requirements.txt (newer Pipenv releases replace this with pipenv requirements \u0026gt; requirements.txt) Upgrade packages in the virtual environment\npipenv update Delete the current virtual environment\npipenv --rm Upgrade Pipenv # pip3 install --upgrade pipenv","date":"26 September 2021","externalUrl":null,"permalink":"/posts/pipenv-notes.html","section":"Posts","summary":"Why Pipenv # When maintaining many Python projects,\ndifferent projects might 
use different versions of the same Python libraries.\nNot using a virtual environment and installing all Python modules directly on your machine will lead to version conflicts.\nIn the past, the mechanism of virtualenv + requirements.txt allowed different projects to use different versions of the same package,\nand also enabled new developers or production environments to quickly install the packages required by the project.\n","title":"Pipenv Notes","type":"posts"},{"content":"","date":"28 June 2021","externalUrl":null,"permalink":"/tags/cookie.html","section":"Tags","summary":"","title":"Cookie","type":"tags"},{"content":"","date":"28 June 2021","externalUrl":null,"permalink":"/tags/session.html","section":"Tags","summary":"","title":"Session","type":"tags"},{"content":" Stateless HTTP # HTTP is a stateless protocol,\nmeaning that each request/response is independent,\nand is unrelated to previous or subsequent requests/responses.\nThe same request will always receive the same response,\nand will not differ based on the content of previous requests/responses.\nThis allows the server to save a large amount of database and server storage space because it doesn\u0026rsquo;t need to store user information.\nIt also speeds up response times and saves considerable network bandwidth because the client doesn\u0026rsquo;t always have to connect to the same socket.\nHowever, when a website needs to perform continuous actions (e.g., requiring user authentication),\nsome mechanisms are needed to assist.\nAt this point, most websites utilize sessions or cookies.\nSession # A session is a stateful period of time.\nHTTP requests/responses are stateless,\nbut if state information is carried through stateless requests/responses,\nthe client and server can create stateful operations using the state information carried in the requests/responses.\nFor example, if a certain action requires the user to be logged in and have selected option A before it can be performed,\nit is 
desirable to have a stateful period (session) that represents the state \u0026ldquo;user is logged in and has selected option A.\u0026rdquo;\nThere are many ways to achieve this state.\nFor instance, during this period,\nrequests can carry an encrypted user ID and option A to inform the server that the user is logged in,\ntheir identity, and the selected option.\nThere are many ways to implement a session,\nthe most common being cookies.\nHowever, cookies are just one method;\nit doesn\u0026rsquo;t mean that sessions can only be implemented through cookies.\nSessions can also be created through other means,\nsuch as using query strings to record previous interactions.\nCookie # A cookie is a mechanism for implementing sessions.\nThe server can use the Set-Cookie header to instruct the browser to set a cookie and specify its content.\nSubsequently, when the browser sends a request to the same domain and path, it will include the cookie with the request.\nThis way, when certain states need to be remembered,\nthe server only needs to instruct the browser to set the necessary cookie.\nThen, when a request is sent, the server can understand the current state by examining the cookie\u0026rsquo;s content.\nSince the content of a cookie can be modified by the user,\nfor security considerations,\nthere are two common approaches when using cookies (they can also be used together):\nCookie-based session # Encrypt the cookie content.\nThe server decrypts the content after receiving it to understand what the cookie stores.\nContinuing the example above,\nthe user ID and option A would be encrypted and placed in the cookie.\nPoints to note:\nBecause cookies have a size limit, special attention must be paid to ensure the encrypted cookie size is not too large. The encryption key must be properly secured. 
Session ID # Use an ID (Session Identifier, Session ID) to record user identity,\nwhile all other data (Session Data) is stored on the server.\nContinuing the example above, the user\u0026rsquo;s selected option A would be stored on the server,\nand the user\u0026rsquo;s ID would be placed in the cookie.\nPoints to note:\nThe Session ID must be designed to be difficult to guess; if it is guessed, the user\u0026rsquo;s identity will be stolen. If the website is not secure enough, and the Session ID is stolen by another malicious website or hacker on a certain page, the user\u0026rsquo;s identity will be stolen. ","date":"28 June 2021","externalUrl":null,"permalink":"/posts/stateless-http-stateful-session-and-cookies.html","section":"Posts","summary":"Stateless HTTP # HTTP is a stateless protocol,\nmeaning that each request/response is independent,\nand is unrelated to previous or subsequent requests/responses.\nThe same request will always receive the same response,\nand will not differ based on the content of previous requests/responses.\nThis allows the server to save a large amount of database and server storage space because it doesn’t need to store user information.\nIt also speeds up response times and saves considerable network bandwidth because the client doesn’t always have to connect to the same socket.\n","title":"Stateless HTTP, Stateful Session and Cookies","type":"posts"},{"content":"","date":"28 June 2021","externalUrl":null,"permalink":"/categories/web.html","section":"Categories","summary":"","title":"Web","type":"categories"},{"content":"","date":"28 June 2021","externalUrl":null,"permalink":"/tags/web.html","section":"Tags","summary":"","title":"Web","type":"tags"},{"content":"","date":"11 April 2021","externalUrl":null,"permalink":"/tags/ubuntu.html","section":"Tags","summary":"","title":"Ubuntu","type":"tags"},{"content":"","date":"11 April 
2021","externalUrl":null,"permalink":"/tags/windows-terminal.html","section":"Tags","summary":"","title":"Windows Terminal","type":"tags"},{"content":"","date":"11 April 2021","externalUrl":null,"permalink":"/tags/wsl.html","section":"Tags","summary":"","title":"WSL","type":"tags"},{"content":" Bringing the terminal settings from Linux and Mac to Windows for easier operation.\nWindows Terminal Features # With Windows Terminal, you can:\nEnable multiple tabs (quickly switch between multiple Linux CLIs, Windows CLIs, PowerShell, etc.) Customize key bindings (shortcuts for opening/closing tabs, copy/paste, etc.) Use search functionality Customize themes These features offer much more than native WSL support, and allow for a setup similar to my Linux or Mac development environments, which is why I decided to use Windows Terminal.\nWindows Terminal Settings # After searching for and installing Windows Terminal from the Microsoft Store, you can start configuring it.\nSetting WSL as the Default Opening Environment for Windows Terminal # In the Windows Terminal\u0026rsquo;s [V] arrow menu, select \u0026ldquo;Settings,\u0026rdquo; which will open a JSON file for modification. 
From the profiles \u0026gt; list, find the Linux distribution you want to set as default, for example:\n{ \u0026#34;guid\u0026#34;: \u0026#34;{xxxxxxxxxxxxxxx}\u0026#34;, \u0026#34;hidden\u0026#34;: false, \u0026#34;name\u0026#34;: \u0026#34;Ubuntu-18.04\u0026#34;, \u0026#34;commandline\u0026#34;: \u0026#34;wsl.exe\u0026#34;, \u0026#34;source\u0026#34;: \u0026#34;Windows.Terminal.Wsl\u0026#34; } Copy the GUID string enclosed in curly braces after guid, and replace the ID of the originally default profile with that ID:\n\u0026#34;defaultProfile\u0026#34;: \u0026#34;{yyyyyy}\u0026#34; (Replace yyyyyy with the GUID of your Linux distribution)\nSetting the Default Starting Directory for Windows Terminal # In the Linux distribution profile within the JSON settings file, append the default directory to open (~ refers to the user\u0026rsquo;s Linux home directory) to the commandline:\n\u0026#34;commandline\u0026#34;: \u0026#34;wsl.exe ~\u0026#34;, Setting the Windows Terminal Scheme # Add this line to the Linux distribution profile in the JSON settings file:\n\u0026#34;colorScheme\u0026#34;: \u0026#34;One Half Dark\u0026#34;, One Half Dark is one of the color schemes provided by Windows. Other schemes can be found in Microsoft Doc: Color schemes in Windows Terminal.\nSetting the Windows Terminal Font # Add this to the Linux distribution profile in the JSON settings file:\n\u0026#34;fontFace\u0026#34;: \u0026#34;xxxxx\u0026#34;, xxxxx is the name of the font. 
If you need to use Powerline, you can first install Powerline fonts, then fill in the desired font name.\nReference # Microsoft Doc: Install and set up Windows Terminal\nMicrosoft Doc: Color schemes in Windows Terminal\nSet Windows Terminal as WSL operating interface\n","date":"11 April 2021","externalUrl":null,"permalink":"/posts/wsl-2-on-windows-part-2.html","section":"Posts","summary":" Bringing the terminal settings from Linux and Mac to Windows for easier operation.\nWindows Terminal Features # With Windows Terminal, you can:\nEnable multiple tabs (quickly switch between multiple Linux CLIs, Windows CLIs, PowerShell, etc.) Customize key bindings (shortcuts for opening/closing tabs, copy/paste, etc.) Use search functionality Customize themes These features offer much more than native WSL support, and allow for a setup similar to my Linux or Mac development environments, which is why I decided to use Windows Terminal.\n","title":"WSL 2 on Windows Part 2 - Terminal Interface Settings","type":"posts"},{"content":"I\u0026rsquo;m used to using Linux or Mac terminals for work.\nI took some time to set up the WSL environment on my home PC to easily switch work environments.\nDifferences between WSL 2 and WSL 1 # WSL 2 is based on Hyper-V and runs a full Linux kernel in a virtual machine.\nWSL 1 is a simulation of Linux functionalities on the Windows system.\nTherefore, WSL 2 supports more native Linux features and system calls than WSL 1.\nIf you need to use low-level Linux applications,\nWSL 2 offers better support than WSL 1.\nGenerally, WSL 2 also has better performance for starting processes,\nexcept when reading files from the host system.\nHowever, because WSL 2 runs a Linux kernel in a VM,\nits integration with Windows as the host is relatively poorer than WSL 1.\nProcesses within WSL 2 cannot be managed by Windows Task Manager,\nand there\u0026rsquo;s an additional layer in the network connection between Windows and WSL 2.\nDue to WSL 2\u0026rsquo;s use of 
Hyper-V,\nI\u0026rsquo;ve heard reports of incompatibility issues with VMWare.\nI don\u0026rsquo;t use VMWare, so I don\u0026rsquo;t know if this issue is real,\nbut I haven\u0026rsquo;t encountered any problems when using Docker.\nMicrosoft Doc lists a detailed comparison of WSL 1 and WSL 2.\nRequirements # Windows version must be Windows 10. If your version is lower, please use Windows Update:\nFor X64 systems: Version 1903 or higher, with Build 18362 or higher. For ARM64 systems: Version 2004 or higher, with Build 19041 or higher. The machine must have virtualization features enabled.\nThis can usually be found in the motherboard\u0026rsquo;s BIOS settings.\nLook for settings related to CPU configuration, such as Intel Virtualization,\nand enable it.\nInstalling and Activating WSL 2 # Open PowerShell with administrator privileges.\nEnable Windows Subsystem for Linux: Run dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart Enable Virtual Machine Platform optional feature: Run dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart Restart your computer. Download and install the WSL 2 Linux kernel update package. Set WSL 2 as the default version: wsl --set-default-version 2 Go to the Microsoft Store,\nfind the Linux distribution you want to install,\nand install it, setting up your account and password. 
You can check the WSL version of the installed Linux distribution in PowerShell: wsl -l -v You can also change the WSL version of a Linux distribution:\nwsl --set-version \u0026lt;distribution name\u0026gt; \u0026lt;versionNumber\u0026gt; Reference # Microsoft WSL 2 Installation Guide\nMicrosoft DOC: Compare WSL 1 and WSL 2\n","date":"10 April 2021","externalUrl":null,"permalink":"/posts/wsl-2-on-windows-part-1.html","section":"Posts","summary":"I’m used to using Linux or Mac terminals for work.\nI took some time to set up the WSL environment on my home PC to easily switch work environments.\nDifferences between WSL 2 and WSL 1 # WSL 2 is based on Hyper-V and runs a full Linux kernel in a virtual machine.\nWSL 1 is a simulation of Linux functionalities on the Windows system.\nTherefore, WSL 2 supports more native Linux features and system calls than WSL 1.\n","title":"WSL 2 on Windows Part 1 - Installation and Activation","type":"posts"},{"content":" Shallow Copy # Copies as little as possible.\nA new structure created by a shallow copy has the same structure as the old one,\nand they share the memory address of elements.\nFor example, in Java:\nint[] arr1 = {1, 2, 3}; int[] arr2 = arr1; arr2 is a shallow copy of arr1.\nIf one of the structures modifies an element, the other will also be affected. Deep Copy # Copies everything.\nA new structure created by a deep copy not only has the same structure as the old one,\nbut also copies all elements of the old structure to the new memory address.\nint[] arr1 = {1, 2, 3}; int[] arr2 = new int[arr1.length]; for (int i = 0; i \u0026lt; arr1.length; ++i) { arr2[i] = arr1[i]; } arr2 is a deep copy of arr1.\nIt occupies more memory space. 
","date":"21 January 2020","externalUrl":null,"permalink":"/posts/deep-copy-shallow-copy.html","section":"Posts","summary":"Shallow Copy # Copies as little as possible.\nA new structure created by a shallow copy has the same structure as the old one,\nand they share the memory address of elements.\nFor example, in Java:\nint[] arr1 = {1, 2, 3}; int[] arr2 = arr1; arr2 is a shallow copy of arr1.\nIf one of the structures modifies an element, the other will also be affected. Deep Copy # Copies everything.\nA new structure created by a deep copy not only has the same structure as the old one,\nbut also copies all elements of the old structure to the new memory address.\n","title":"Deep Copy and Shallow Copy","type":"posts"},{"content":"","date":"21 January 2020","externalUrl":null,"permalink":"/categories/programming.html","section":"Categories","summary":"","title":"Programming","type":"categories"},{"content":"","date":"12 January 2020","externalUrl":null,"permalink":"/categories/c++.html","section":"Categories","summary":"","title":"C++","type":"categories"},{"content":"","date":"12 January 2020","externalUrl":null,"permalink":"/tags/c++.html","section":"Tags","summary":"","title":"C++","type":"tags"},{"content":" Introduction # C++ provides various containers to store and manage data,\neach with its unique characteristics and applicable scenarios.\nThis article will delve into five common sequence containers: array, vector, deque, list, and forward_list,\ncomparing their features, performance, and offering selection advice.\nArray # std::array is a fixed-size array introduced in C++11,\ncombining the performance of C-style arrays with the interface of STL containers.\nCharacteristics # Fixed Size: Size is determined at compile time and cannot be changed dynamically. Stack Allocation: Typically allocates memory on the stack, offering high performance. Random Access: Supports O(1) time complexity for random access. 
Iterator Support: Provides iterators, allowing use with STL algorithms. When to Use # When the data size is known and fixed. When pursuing ultimate performance, avoiding heap allocation overhead. When interoperability with C-style arrays is required. Example # #include \u0026lt;array\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;numeric\u0026gt; // For std::accumulate int main() { std::array\u0026lt;int, 5\u0026gt; my_array = {1, 2, 3, 4, 5}; // Access elements std::cout \u0026lt;\u0026lt; \u0026#34;Element at index 2: \u0026#34; \u0026lt;\u0026lt; my_array[2] \u0026lt;\u0026lt; std::endl; // Output: 3 // Iteration for (int\u0026amp; x : my_array) { x *= 2; } // Sum int sum = std::accumulate(my_array.begin(), my_array.end(), 0); std::cout \u0026lt;\u0026lt; \u0026#34;Sum of elements: \u0026#34; \u0026lt;\u0026lt; sum \u0026lt;\u0026lt; std::endl; // Output: 30 return 0; } Vector # std::vector is a dynamic array that can change its size dynamically at runtime.\nIt is the most commonly used and flexible sequence container in C++.\nConcept # Allocates an array dynamically.\nWhen the capacity is not big enough,\nit reallocates a new array with sufficient memory space and moves the elements to the new one.\nThe actual capacity is usually a bit bigger than the number of elements in the vector.\nCharacteristics # Dynamic Size: Can automatically expand or shrink at runtime. Contiguous Memory: Elements are stored contiguously in memory, supporting efficient random access with O(1) time complexity. Automatic Memory Management: Automatically handles memory allocation and deallocation. Efficient Back Operations: Inserting and deleting elements at the back is typically O(1) amortized time complexity. Inefficient Middle Insertions/Deletions: Inserting or deleting elements in the middle requires moving many elements, resulting in O(n) time complexity. When to Use # When the data size is unknown or changes dynamically. When frequent additions or deletions at the back are needed. 
When random access to elements is required. Example # #include \u0026lt;vector\u0026gt; #include \u0026lt;iostream\u0026gt; int main() { std::vector\u0026lt;int\u0026gt; my_vector; // Add elements my_vector.push_back(10); my_vector.push_back(20); my_vector.push_back(30); // Access elements std::cout \u0026lt;\u0026lt; \u0026#34;Element at index 1: \u0026#34; \u0026lt;\u0026lt; my_vector[1] \u0026lt;\u0026lt; std::endl; // Output: 20 // Iteration for (int x : my_vector) { std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; // Output: 10 20 30 } std::cout \u0026lt;\u0026lt; std::endl; // Delete back element my_vector.pop_back(); // Size std::cout \u0026lt;\u0026lt; \u0026#34;Vector size: \u0026#34; \u0026lt;\u0026lt; my_vector.size() \u0026lt;\u0026lt; std::endl; // Output: 2 return 0; } Deque # std::deque is a double-ended queue that supports efficient insertion and deletion of elements at both ends.\nCharacteristics # Dynamic Size: Can automatically expand or shrink at runtime. Efficient Operations at Both Ends: Inserting and deleting elements at both the front and back are O(1) time complexity. Inserting or deleting elements that are not at the front or back is slower, at O(n). Random Access: Supports O(1) time complexity for random access, but typically slower than vector. Non-contiguous Memory: Elements are not guaranteed to be stored contiguously in memory, usually composed of multiple fixed-size blocks. This makes growing the container cheaper, since existing elements are not moved when new blocks are added. When to Use # When frequent additions or deletions at both ends are needed. When random access to elements is required, but performance requirements are not as strict as vector. As the underlying implementation for queues or stacks. 
Example # #include \u0026lt;deque\u0026gt; #include \u0026lt;iostream\u0026gt; int main() { std::deque\u0026lt;int\u0026gt; my_deque; // Add to front my_deque.push_front(10); // Add to back my_deque.push_back(20); my_deque.push_back(30); // Access elements std::cout \u0026lt;\u0026lt; \u0026#34;First element: \u0026#34; \u0026lt;\u0026lt; my_deque.front() \u0026lt;\u0026lt; std::endl; // Output: 10 std::cout \u0026lt;\u0026lt; \u0026#34;Element at index 1: \u0026#34; \u0026lt;\u0026lt; my_deque[1] \u0026lt;\u0026lt; std::endl; // Output: 20 // Iteration for (int x : my_deque) { std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; // Output: 10 20 30 } std::cout \u0026lt;\u0026lt; std::endl; // Delete from front my_deque.pop_front(); // Delete from back my_deque.pop_back(); std::cout \u0026lt;\u0026lt; \u0026#34;Deque size: \u0026#34; \u0026lt;\u0026lt; my_deque.size() \u0026lt;\u0026lt; std::endl; // Output: 1 return 0; } List # std::list is a doubly linked list that supports efficient insertion and deletion of elements at any position.\nCharacteristics # Efficient Operations at Any Position: Inserting and deleting elements at any position are O(1) time complexity. Good choice for sorting Efficient iterator: Move backward or forward efficiently with doubly linked list Slow Random Access: Can only be accessed sequentially via iterators, with O(n) time complexity. Non-contiguous Memory: Elements are not stored contiguously in memory; each element contains pointers to the previous and next elements. Additional Memory Overhead: Each element requires extra memory to store pointers. When to Use # When frequent insertions or deletions at any position are needed. When random access to elements is not required. When not sensitive to memory overhead. 
Example # #include \u0026lt;list\u0026gt; #include \u0026lt;iostream\u0026gt; #include \u0026lt;algorithm\u0026gt; // For std::find int main() { std::list\u0026lt;int\u0026gt; my_list; // Add elements my_list.push_back(10); my_list.push_back(20); my_list.push_back(40); // Insert in the middle (insert() places the new element before the iterator) auto it = std::find(my_list.begin(), my_list.end(), 40); my_list.insert(it, 30); // Iteration for (int x : my_list) { std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; // Output: 10 20 30 40 } std::cout \u0026lt;\u0026lt; std::endl; // Delete element my_list.remove(20); std::cout \u0026lt;\u0026lt; \u0026#34;List size: \u0026#34; \u0026lt;\u0026lt; my_list.size() \u0026lt;\u0026lt; std::endl; // Output: 3 return 0; } Forward_list # std::forward_list is a singly linked list, a lightweight version of std::list that only supports forward traversal.\nCharacteristics # Forward Traversal Only: Can only traverse from head to tail. Efficient Operations at Any Position: Inserting and deleting elements at any position are O(1) time complexity (requires an iterator to the element before the insertion/deletion point). No Random Access Support: Can only be accessed sequentially via iterators, with O(n) time complexity. Smaller Memory Overhead: Each element contains only one pointer to the next element, saving memory compared to list. When to Use # When only forward traversal is needed. When frequent insertions or deletions at any position are needed. When extremely sensitive to memory overhead. 
Example # #include \u0026lt;forward_list\u0026gt; #include \u0026lt;iostream\u0026gt; int main() { std::forward_list\u0026lt;int\u0026gt; my_forward_list; // Add elements (Note: forward_list does not have push_back) my_forward_list.push_front(40); my_forward_list.push_front(30); my_forward_list.push_front(10); // Insert after a specified position auto it = my_forward_list.begin(); // Points to 10 it++; // Points to 30 my_forward_list.insert_after(it, 20); // Insert 20 after 30 // Iteration for (int x : my_forward_list) { std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; // Output: 10 30 20 40 } std::cout \u0026lt;\u0026lt; std::endl; // Delete all elements equal to 30 (remove() takes a value, not an iterator) my_forward_list.remove(30); // Iteration for (int x : my_forward_list) { std::cout \u0026lt;\u0026lt; x \u0026lt;\u0026lt; \u0026#34; \u0026#34;; // Output: 10 20 40 } std::cout \u0026lt;\u0026lt; std::endl; return 0; } Summary and Selection Advice # Feature \\ Container Array Vector Deque List Forward_list Size Fixed Dynamic Dynamic Dynamic Dynamic Memory Allocation Stack Heap Heap Heap Heap Memory Contiguous Yes Yes No No No Random Access O(1) O(1) O(1) O(n) O(n) Front Insert/Delete N/A O(n) O(1) O(1) O(1) Back Insert/Delete N/A O(1) O(1) O(1) O(n) (no push_back) Middle Insert/Delete N/A O(n) O(n) O(1) O(1) Additional Overhead None Small Small Large Small Selection Advice # Array: When data size is fixed and known, and highest performance is desired. Vector: The most common and versatile choice. When data size changes dynamically, and operations are primarily at the back with a need for random access. Deque: When frequent insertions/deletions at both ends are needed, and random access is also required. List: When frequent insertions/deletions at any position are needed, and random access is not required. Forward_list: When all conditions for List are met, and only forward traversal is needed, with extreme sensitivity to memory overhead. 
References # C++ Reference - Containers GeeksforGeeks - C++ STL Containers http://www.cplusplus.com/reference/array/array/ http://www.cplusplus.com/reference/vector/vector/ http://www.cplusplus.com/reference/deque/deque/ http://www.cplusplus.com/reference/list/list/ http://www.cplusplus.com/reference/forward_list/forward_list/ ","date":"12 January 2020","externalUrl":null,"permalink":"/posts/c-stl-container-compare-array-vector-dequeue-list-forward_list.html","section":"Posts","summary":"Introduction # C++ provides various containers to store and manage data,\neach with its unique characteristics and applicable scenarios.\nThis article will delve into five common sequence containers: array, vector, deque, list, and forward_list,\ncomparing their features, performance, and offering selection advice.\nArray # std::array is a fixed-size array introduced in C++11,\ncombining the performance of C-style arrays with the interface of STL containers.\n","title":"C++ Container Characteristics and Usage Scenarios - array, vector, deque, list, forward_list","type":"posts"},{"content":"","date":"12 January 2020","externalUrl":null,"permalink":"/tags/container.html","section":"Tags","summary":"","title":"Container","type":"tags"},{"content":"","date":"12 January 2020","externalUrl":null,"permalink":"/tags/stl.html","section":"Tags","summary":"","title":"STL","type":"tags"},{"content":"","date":"9 January 2020","externalUrl":null,"permalink":"/categories/docker.html","section":"Categories","summary":"","title":"Docker","type":"categories"},{"content":"Continuing from Docker Operations Log (Part 1)\nBasic Docker Usage # Deleting a Container # Remember to stop the container using stop before deleting it.\ndocker rm CONTAINER_NAME Or\ndocker rm CONTAINER_ID After deletion, you can use\ndocker ps -a to confirm if the container has disappeared.\nCreating an Image from a Previously Exported Container # If you previously exported a container as c_test.tar, you can use it to create a new 
image:\ncat c_test.tar | docker import - ubuntu_test_repo:1.0 ubuntu_test_repo is the repository name, and 1.0 is the tag.\nYou can use\ndocker images to list and check it.\nOnce you have an image, you can create new containers from it.\nDeleting an Image # If I use\ndocker images and the listed images are:\nREPOSITORY TAG IMAGE ID CREATED SIZE aaa 2.0 b30c39fffb75 4 seconds ago 64.2MB aaa 1.0 6b8046192d83 8 seconds ago 64.2MB ubuntu_test_repo 1.0 864c36a752c3 5 hours ago 64.2MB ubuntu latest 549b9b86cb8d 2 weeks ago 1.84kB hello-world latest fce289e99eb9 12 months ago 1.84kB To delete the image with repository name aaa and tag 1.0:\ndocker rmi aaa:1.0 This will work.\nAll containers using this image must be rm\u0026rsquo;d first.\nDockerfile # A Dockerfile is a file that allows users to create images in a simpler way.\nIt is divided into four parts:\nImage Maintainer (who is responsible for this Dockerfile) Operation commands Command to run when the container starts Here\u0026rsquo;s an Nginx example:\n# This is how to comment in a Dockerfile # Image FROM ubuntu # Maintainer MAINTAINER user user@example.com # Operation commands RUN apt-get update \\ \u0026amp;\u0026amp; apt-get upgrade -y \\ \u0026amp;\u0026amp; apt-get install -y nginx # Container Start Command CMD [\u0026#34;nginx\u0026#34;, \u0026#34;-g\u0026#34;, \u0026#34;daemon off;\u0026#34;] Building an Image # You can use docker build to create an image.\nIf the Nginx Dockerfile mentioned above is located at /tmp/d_file and named test_d_file, to build it into an image and tag it as test-nginx-img/1.0:\ndocker build -t test-nginx-img/1.0 -f /tmp/d_file/test_d_file . 
After building, check with docker images:\nREPOSITORY TAG IMAGE ID CREATED SIZE test-nginx-img/1.0 latest 7293588d00a9 27 seconds ago 152MB Reference # Docker docs\n","date":"9 January 2020","externalUrl":null,"permalink":"/posts/docker-operating-2.html","section":"Posts","summary":"Continuing from Docker Operations Log (Part 1)\nBasic Docker Usage # Deleting a Container # Remember to stop the container using stop before deleting it.\ndocker rm CONTAINER_NAME Or\ndocker rm CONTAINER_ID After deletion, you can use\ndocker ps -a to confirm if the container has disappeared.\n","title":"Docker Operations Log (Part 2)","type":"posts"},{"content":"","date":"9 January 2020","externalUrl":null,"permalink":"/tags/virtual-environment.html","section":"Tags","summary":"","title":"Virtual Environment","type":"tags"},{"content":"","date":"9 January 2020","externalUrl":null,"permalink":"/tags/cluster.html","section":"Tags","summary":"","title":"Cluster","type":"tags"},{"content":"When designing systems with higher traffic, you will eventually encounter cluster-related issues.\nCluster # A collection of one or more machines (nodes) with three different purposes:\nLoad Balancing # Allows multiple machines to share tasks as evenly as possible, accelerating application execution.\nHigh Availability (HA) # For high availability and redundancy, if one machine suddenly fails, others can take over.\nHigh Performance Computing # High-performance/parallel computing systems, abbreviated as HPC clusters, combine the hardware of multiple machines to increase computing power, used to solve tasks that a single machine cannot handle.\nHA Operating Modes # There are many types, such as N+1, N+M, \u0026hellip; But the most common is a two-node cluster. A two-node cluster has two operating modes:\nActive-Passive Active-Active Active-Passive (AP) # A master-slave design. Under normal circumstances, only the master (Active) provides the service. 
When the master (Active) encounters a problem, the slave (Passive) takes over. Once the master (Active) recovers, it switches back, and the master (Active) continues to handle the service.\nAdvantages:\nFast fail-over speed. Relatively simple design and configuration. Disadvantages:\nCannot perform load balancing simultaneously, wasting some hardware resources. Active-Active (AA) # Both machines simultaneously run their own independent services (both are Active), and also provide mutual redundancy (acting as the other\u0026rsquo;s Passive). When one machine encounters a problem, the other takes over its service.\nAdvantages:\nNeither machine is idle during normal operation, resulting in high operational efficiency. Disadvantages:\nThe machine\u0026rsquo;s load increases after fail-over, leading to slower performance. Relatively complex design and configuration. Application Design # There needs to be a relatively simple way to start, stop, force-stop services, and check the current status of services.\n=\u0026gt; When designing the application, there should be a command-line interface or script to achieve this.\n=\u0026gt; Services on both machines should be able to know each other\u0026rsquo;s status and be able to start or stop in case of an accident. Shared storage is required, and the application should record its state as meticulously as possible to shared storage.\n=\u0026gt; This ensures nothing is lost when switching between the two machines. It should be possible to restart another node and restore it to the state before the failure occurred.\n=\u0026gt; Restoring to the pre-failure state can be done using the state saved to shared storage. When the application crashes, the data stored on shared storage must not be corrupted.\n=\u0026gt; The other side needs to use it. Remark # Consider scenarios that occur during application upgrades. 
Some SQL or NoSQL databases inherently support these types of configurations, which can be adopted to reduce a lot of trouble. ","date":"9 January 2020","externalUrl":null,"permalink":"/posts/ha-cluster-app-architecture.html","section":"Posts","summary":"When designing systems with higher traffic, you will eventually encounter cluster-related issues.\nCluster # A collection of one or more machines (nodes) with three different purposes:\nLoad Balancing # Allows multiple machines to share tasks as evenly as possible, accelerating application execution.\nHigh Availability (HA) # For high availability and redundancy, if one machine suddenly fails, others can take over.\n","title":"HA Cluster Notes and Application Design","type":"posts"},{"content":"","date":"9 January 2020","externalUrl":null,"permalink":"/tags/high-availability.html","section":"Tags","summary":"","title":"High Availability","type":"tags"},{"content":"","date":"9 January 2020","externalUrl":null,"permalink":"/categories/web-hosting.html","section":"Categories","summary":"","title":"Web Hosting","type":"categories"},{"content":"Each DNS zone in a DNS server has a zone file.\nA DNS zone is usually a single domain (though not always).\nA zone file is composed of many DNS resource records (RRs).\nThere are many different types of RRs.\nLet\u0026rsquo;s record some common ones.\nA record # Maps a hostname to an IPv4 address. (32-bit)\nhostname IN A xxx.xxx.xxx.xxx AAAA record # Maps a hostname to an IPv6 address. 
(128-bit)\nhostname IN AAAA xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx CNAME record # An alias for a hostname.\nalias IN CNAME hostname Note that an alias cannot have other A records or MX records.\nMX record # Mail exchanger record.\nSpecifies the mail domain name, the priority, and the hostname of the mail server.\nNote that it must be a hostname, not directly an IP address,\nand it cannot be a CNAME alias.\nTherefore, an additional RR (A or AAAA record) must be set up for the IP.\nA lower priority number indicates higher priority (mail is delivered to higher priority servers first).\nmail_domain_name IN MX priority hostname Example:\nexample.com. IN MX 10 mailserver.example.com.","date":"8 January 2020","externalUrl":null,"permalink":"/posts/common-dns-resource-record.html","section":"Posts","summary":"Each DNS zone in a DNS server has a zone file.\nA DNS zone is usually a single domain (though not always).\nA zone file is composed of many DNS resource records (RRs).\nThere are many different types of RRs.\nLet’s record some common ones.\nA record # Maps a hostname to an IPv4 address. 
(32-bit)\nhostname IN A xxx.xxx.xxx.xxx","title":"Common DNS Resource Records","type":"posts"},{"content":"","date":"8 January 2020","externalUrl":null,"permalink":"/tags/dns.html","section":"Tags","summary":"","title":"DNS","type":"tags"},{"content":"","date":"6 January 2020","externalUrl":null,"permalink":"/categories/database.html","section":"Categories","summary":"","title":"Database","type":"categories"},{"content":"","date":"6 January 2020","externalUrl":null,"permalink":"/tags/database.html","section":"Tags","summary":"","title":"Database","type":"tags"},{"content":"","date":"6 January 2020","externalUrl":null,"permalink":"/tags/nosql.html","section":"Tags","summary":"","title":"NoSQL","type":"tags"},{"content":"","date":"6 January 2020","externalUrl":null,"permalink":"/tags/rdbms.html","section":"Tags","summary":"","title":"RDBMS","type":"tags"},{"content":"RDBMS # Relational Database Management System\nUsed when there are strong Relations between data: Design a schema that is unlikely to change, relating tables to each other, then you can retrieve the desired data through SQL. Used when data correctness is very important: Usually provides ACID properties. Changing the schema is a huge undertaking: Requires updating the table schema and migrating data. All programs that use the table with the changed schema need to be modified. Vertical scaling is more effective (improving machine performance). ACID # RDBMS usually guarantees four properties for transactions:\nAtomicity Only two possibilities: all completed (Commit) or all not done (Abort). There is no \u0026ldquo;half-done\u0026rdquo; state. If there is an error during execution, it will Rollback to the state where nothing was done.\nConsistency The database will remain in a legal state before and after the transaction.\nIsolation When multiple transactions need to be executed, each transaction is separate and does not interfere with each other. 
For example, transaction A does not affect transaction B, and vice versa.\nDurability Once a transaction is completed, it is permanently valid and will not be lost, even if the system suddenly fails.\nNoSQL # Not only SQL\nLess concerned with relations between data: Does not require a fixed schema for data access. Each piece of data exists independently, without issues of who relates to whom. More concerned with the content of the data: Whether updates, additions, deletions, etc., are needed. Data can have different formats. More suitable for distributed systems. Usually provides two of the CAP properties. Horizontal scaling is more effective (adding more machines). CAP # For a distributed system, it is impossible to guarantee all three CAP properties simultaneously (though they might coexist when the network is stable). At most, two can be guaranteed simultaneously.\nConsistency Every read, if it doesn\u0026rsquo;t result in an error, will return the result of the most recent write. =\u0026gt; Data on every node is identical.\nAvailability Every request will receive a non-error response, regardless of whether the data returned by this response is the latest. =\u0026gt; Guarantees that data will always be returned, but the data might be old.\nPartition tolerance Even if some messages transmitted between nodes are delayed or lost, the system will continue to operate. =\u0026gt; When network issues occur, the normally connected part of the nodes can continue to operate.\nReference # Wiki ACID Wiki CAP\n","date":"6 January 2020","externalUrl":null,"permalink":"/posts/rdbms-acid-nosql-cap.html","section":"Posts","summary":"RDBMS # Relational Database Management System\nUsed when there are strong Relations between data: Design a schema that is unlikely to change, relating tables to each other, then you can retrieve the desired data through SQL. Used when data correctness is very important: Usually provides ACID properties. 
Changing the schema is a huge undertaking: Requires updating the table schema and migrating data. All programs that use the table with the changed schema need to be modified. ","title":"RDBMS and NoSQL Differences Notes","type":"posts"},{"content":"","date":"2 January 2020","externalUrl":null,"permalink":"/tags/linux.html","section":"Tags","summary":"","title":"Linux","type":"tags"},{"content":"","date":"2 January 2020","externalUrl":null,"permalink":"/categories/operating-system.html","section":"Categories","summary":"","title":"Operating System","type":"categories"},{"content":"","date":"2 January 2020","externalUrl":null,"permalink":"/tags/os.html","section":"Tags","summary":"","title":"OS","type":"tags"},{"content":"To understand what a daemon is,\nI consulted The Linux Programming Interface and took notes related to it.\nA process group is a collection of related processes. A session is a collection of related process groups. Process groups and sessions are defined for job control\nProcess Group # A process group is a collection of related processes that share the same process group identifier (PGID).\nEach process group has a process group leader,\nwhich is the process that created the group.\nThe PGID of the process group is the PID of its leader.\nThe PGID of any newly created process is inherited from its parent.\nThe lifetime of a process group begins when its leader creates it and ends when all processes have left the group.\nA process \u0026rsquo;leaves\u0026rsquo; a process group either by terminating or by being switched to another process group.\nSession # A session is a collection of related process groups that share the same session identifier (SID).\nEach session has a session leader,\nwhich is the process that created the session.\nThe SID of the session is the PID of its leader.\nWhenever a new process is created,\nits SID is inherited from its parent.\nAll processes in the same session share a controlling terminal.\nA controlling terminal is 
established the first time a session leader opens a terminal device,\nand a terminal can only be the controlling terminal for one session.\nHence, there is a one-to-one relationship between a session and a controlling terminal.\nAt any given time, there are:\nforeground process group:\nA process group within a session.\nOnly processes in this process group can receive input from the controlling terminal. background process groups:\nAll process groups that are not the foreground process group. Terminal # When a terminal is opened, a session leader is established,\nand this same session leader also acts as the controlling process.\nMeanwhile, the foreground process group waits for input from the terminal,\nwhich could be user input or signals from the user,\nwhile background process groups exist concurrently.\nWhen a terminal is terminated,\nthe kernel sends a SIGHUP signal to the session leader to indicate that the terminal session has ended.\nShell Job Control # Process groups and sessions are defined to explain shell job control.\nHere is an example of shell job control:\nThe terminal used by the user to log in is the controlling terminal,\nand the login shell acts as both the session leader and the controlling process.\nCommands executed from this shell will create one or more processes.\nThese processes form new process groups,\nand any other processes created by them become part of these process groups.\nAll these processes are created from this shell and thus belong to this login session.\nDaemon # Characteristics of daemons:\nlong-lived:\nUsually activated when the system starts,\nrunning until the system is turned off.\nRunning in the background and having no controlling terminal: To ensure that no job control or terminal-related signals generated by the kernel affect the daemon. 
Daemon names usually end with \u0026rsquo;d\u0026rsquo;.\nSeveral common daemons:\ncron sshd httpd ","date":"2 January 2020","externalUrl":null,"permalink":"/posts/linux-process-group-sessions-daemon.html","section":"Posts","summary":"To understand what a daemon is,\nI consulted The Linux Programming Interface and took notes related to it.\n","title":"Process Groups, Sessions and Daemon Overview","type":"posts"},{"content":"Basic Concept # Docker can be seen as a simplified virtual machine (VM).\nSince it doesn\u0026rsquo;t install a full operating system, it offers a smaller footprint and faster speed.\nImage # An Image contains a lightweight runtime environment,\nincluding its libraries and executables.\nImages can be thought of as the .iso file for a Docker VM.\nIt can only be read, not executed directly.\nIf one wants to modify an image, they can only create a new image based on the existing one.\nContainer # When an image is used to create a running environment,\nit becomes a container.\nLike a VM,\nthe Docker container is isolated from the host environment.\nWhatever is done within the container does not affect the host environment,\nunless specific settings are configured.\nFor example,\nWe can expose a port of a Docker container,\nwhile the host port remains closed.\nHowever, we can also expose the host\u0026rsquo;s port if desired.\nRepository # A repository is where images are stored.\nIt is similar to a Git repository:\nThere can be many repositories,\nand each serves as a place to store the code of a project.\nSimilarly, Docker repositories are where images are stored.\nEvery image within the same repository shares the same name but has different tags.\nAdditionally, there are many different repositories for various images.\nRegistry # A registry is also a place to store images.\nThe difference between a registry and a repository is that a registry is a service where users can push or pull images to and from their local machines, similar to GitHub for Git.\nThe most 
famous one is Docker Hub.\nWhereas a repository is a location to keep images with the same name but different tags.\nBasic Usage of Docker # Install # It is very simple in Ubuntu:\nsudo apt-get install docker.io Pull Image # There are many images on Docker Hub that can be used.\nIf I want a clean Ubuntu environment,\nI can pull an Ubuntu image to my local machine:\ndocker pull ubuntu Or if I want to specify a tag:\ndocker pull ubuntu:14.04 Run the Image # We can echo Hello world using the image we just obtained:\ndocker run ubuntu /bin/echo \u0026#39;Hello world\u0026#39; This should print Hello world on the terminal.\nWhat just happened is that the docker run command creates a temporary container,\nand terminates itself after the echo command completes.\nList Images at Local # docker images It should list the image we just pulled to our local machine.\nCreate a Container # Once we have a container,\nwe have a running environment that can be modified by our actions.\nCreating a container from an image is just like using an .iso to create a virtual machine.\nWe can create a container running Ubuntu with the image we pulled:\ndocker create -it ubuntu We can also create a container with a name:\ndocker create -it --name CONTAINER_NAME ubuntu i refers to \u0026lsquo;interactive\u0026rsquo; (opens stdin of the container).\nt refers to \u0026lsquo;TTY\u0026rsquo; (allocates a pseudo-TTY so we can interact with it via a terminal).\nOr if we want to create a container and run it:\ndocker run -itd ubuntu or\ndocker run -itd --name CONTAINER_NAME ubuntu d refers to \u0026lsquo;detach\u0026rsquo; (runs the container in the background).\nList Containers # docker ps -a This should list all containers on the host machine.\nAnd we can observe a difference in status between containers created with docker create and docker run:\nContainers created with docker create are only created and not yet running,\nso their status is created;\nwhereas containers created with docker run are 
both created and run,\nso their status is up.\nThere is a container ID that can be used to run or terminate the container.\nRun Containers # So,\nif a container is created with docker create, it must be run before we can access it:\nWe can use the container ID to run the container:\ndocker start \u0026#34;CONTAINER_ID\u0026#34; Or,\nif the container was created with a name,\nwe can use that name to run it:\ndocker start \u0026#34;CONTAINER_NAME\u0026#34; If the container\u0026rsquo;s status is exited,\nit also needs to be started to run before we can access it.\nUse\ndocker ps -a to check the status first.\nFor containers created with docker run,\nor those already started with docker start,\nwe can access them with docker exec:\ndocker exec -it \u0026#34;CONTAINER_ID\u0026#34; bash bash is the command we want to run;\nit can be replaced with other commands like echo or anything else.\nWe can also use the container\u0026rsquo;s name to access it:\ndocker exec -it \u0026#34;CONTAINER_NAME\u0026#34; bash If bash is the command used,\nwe should find ourselves inside the container.\nThe user becomes root,\nand we can start configuring settings or installing software within the container.\nIf we want to leave the container:\nexit The container remains running in the background after we exit it.\nStop Containers # This is quite similar to turning off a virtual machine.\nStopping a container only changes its status to exited;\nit does not remove the container entirely.\ndocker stop \u0026#34;CONTAINER_ID\u0026#34; Or,\ndocker stop \u0026#34;CONTAINER_NAME\u0026#34; If we inspect it with,\ndocker ps -a We will find that the container still exists,\nbut its status has changed to exited.\nExport Container # Once a container is exported,\nit can be moved to another host machine.\nWe can export a container as a .tar file.\nFor example,\nexport a container as exported.tar:\ndocker export \u0026#34;CONTAINER_ID\u0026#34; \u0026gt; exported.tar Or,\ndocker export 
\u0026#34;CONTAINER_NAME\u0026#34; \u0026gt; exported.tar Then we can move the .tar file to another machine.\nReference # Docker docs\n","date":"1 January 2020","externalUrl":null,"permalink":"/posts/docker-operating-1.html","section":"Posts","summary":"Basic Concept # Docker can be seen as a simplified virtual machine (VM).\nSince it doesn’t install a full operating system, it offers a smaller footprint and faster speed.\nImage # An Image contains a lightweight runtime environment,\nincluding its libraries and executables.\n","title":"Docker Notes 1 - Beginner","type":"posts"},{"content":"","date":"30 December 2019","externalUrl":null,"permalink":"/categories/blog.html","section":"Categories","summary":"","title":"Blog","type":"categories"},{"content":"","date":"30 December 2019","externalUrl":null,"permalink":"/tags/github-pages.html","section":"Tags","summary":"","title":"Github Pages","type":"tags"},{"content":"Update # I\u0026rsquo;ve moved from Jekyll to Hugo.\nThis method is only applicable to Jekyll.\nSitemap # A sitemap is an .xml file that contains links to all the pages within a website.\nWith a sitemap,\na search engine can discover the pages and subsequently create indexes for them.\nThen, people browsing the internet can find those pages using keywords.\nJekyll-sitemap # There is a plugin called jekyll-sitemap for Jekyll,\nwhich automatically generates a sitemap whenever the website is rebuilt.\nIt is a good choice if you build your website locally,\nbut with GitHub Pages,\nit doesn\u0026rsquo;t work as expected.\nIt\u0026rsquo;s unclear if this is due to the parameters or how GitHub builds websites;\nThe sitemap is generated,\nbut the URLs are incorrect.\nGenerates sitemap without plugin # So I found this,\nwhich seemed to work, so I decided to give it a try,\nmodified it, and placed it in sitemap.xml inside the repository,\nand it did work!\n","date":"30 December 
2019","externalUrl":null,"permalink":"/posts/jekyll-sitemap-github-pages.html","section":"Posts","summary":"Update # I’ve moved from Jekyll to Hugo.\nThis method is only applicable to Jekyll.\nSitemap # A sitemap is an .xml file that contains links to all the pages within a website.\nWith a sitemap,\na search engine can discover the pages and subsequently create indexes for them.\nThen, people browsing the internet can find those pages using keywords.\nJekyll-sitemap # There is a plugin called jekyll-sitemap for Jekyll,\nwhich automatically generates a sitemap whenever the website is rebuilt.\n","title":"Github Pages and Jekyll - sitemap","type":"posts"},{"content":"","date":"30 December 2019","externalUrl":null,"permalink":"/tags/jekyll.html","section":"Tags","summary":"","title":"Jekyll","type":"tags"},{"content":"","date":"30 December 2019","externalUrl":null,"permalink":"/tags/seo.html","section":"Tags","summary":"","title":"SEO","type":"tags"},{"content":"","date":"30 December 2019","externalUrl":null,"permalink":"/tags/c.html","section":"Tags","summary":"","title":"C","type":"tags"},{"content":"const with Normal Variables # Two ways to add const for normal variables:\nconst TYPE NAME = VALUE; // more common TYPE const NAME = VALUE; Both mean this variable cannot be assigned to another value.\nFor example,\n#include \u0026lt;iostream\u0026gt; using namespace std; int main(void) { const int i = 1; int const j = 1; i = 2; // error j = 2; // error cout \u0026lt;\u0026lt; \u0026#34;i = \u0026#34; \u0026lt;\u0026lt; i \u0026lt;\u0026lt; endl; cout \u0026lt;\u0026lt; \u0026#34;j = \u0026#34; \u0026lt;\u0026lt; j \u0026lt;\u0026lt; endl; return 0; } The same error occurs for i and j:\nconst.cpp:9:4: error: cannot assign to variable \u0026#39;i\u0026#39; with const-qualified type \u0026#39;const int\u0026#39; i = 2; ~ ^ const.cpp:7:12: note: variable \u0026#39;i\u0026#39; declared const here const int i = 1; ~~~~~~~~~~^~~~~ const.cpp:10:4: error: cannot assign to 
variable \u0026#39;j\u0026#39; with const-qualified type \u0026#39;const int\u0026#39; j = 2; ~ ^ const.cpp:8:12: note: variable \u0026#39;j\u0026#39; declared const here int const j = 1; ~~~~~~~~~~^~~~~ 2 errors generated. const and Reference # There are also two ways to add const to a reference:\nconst TYPE \u0026amp;NAME = VALUE; // more common TYPE const \u0026amp;NAME = VALUE; Both have the same meaning.\nThere are two restrictions for them:\nThis reference cannot be reassigned to another variable The variable being referenced cannot have its value changed through this reference,\nbut its value can be changed without using this reference. For example,\n#include \u0026lt;iostream\u0026gt; using namespace std; int main(void) { int i = 1, j = 2; int const \u0026amp;r1 = i; const int \u0026amp;r2 = i; // change value with reference r1 = 3; // error r2 = 3; // error // change value i = 4; // change reference object r1 = j; // error r2 = j; // error return 0; } A constant reference can only be read.\nIf the value of the referenced variable needs to change,\nit must be changed without going through that reference.\nconst and Pointer # This can be complicated.\nHowever, we can use the position of const to determine what it is modifying:\nTYPE* const pNAME; // 1 TYPE const *pNAME; // 2 const TYPE *pNAME; // 3 const TYPE* const pNAME; // 4 For 1,\nconst modifies pNAME,\nmeaning that pNAME cannot be changed (i.e., pNAME = ... is not allowed).\nFor 2,\nconst modifies *pNAME,\nso the value pointed to by pNAME cannot be changed (i.e., *pNAME = ... is not allowed).\nFor 3,\nconst modifies TYPE *pNAME.\nThis is the same as case 2, meaning that the value pointed to by pNAME cannot be changed (i.e., *pNAME = ... is not allowed).\nFor 4,\nconst modifies both pNAME and the TYPE it points to,\nso neither pNAME nor the value it points to can be changed (i.e., pNAME = ... or *pNAME = ... 
are not allowed).\n#include \u0026lt;iostream\u0026gt; using namespace std; int main(void) { int i = 1, j = 2; int* const p1 = \u0026amp;i; int const *p2 = \u0026amp;i; const int *p3 = \u0026amp;i; const int* const p4 = \u0026amp;i; // Change value through pointer *p1 = 2; *p2 = 2; // error *p3 = 2; // error *p4 = 2; // error // change value i = 3; // Change pointer\u0026#39;s target p1 = \u0026amp;j; // error p2 = \u0026amp;j; p3 = \u0026amp;j; p4 = \u0026amp;j; // error return 0; }","date":"30 December 2019","externalUrl":null,"permalink":"/posts/const-pointer-reference.html","section":"Posts","summary":"const with Normal Variables # Two ways to add const for normal variables:\nconst TYPE NAME = VALUE; // more common TYPE const NAME = VALUE; Both mean this variable cannot be assigned to another value.\nFor example,\n","title":"C/C++ - const with Pointer or Reference","type":"posts"},{"content":"Update # I\u0026rsquo;ve moved my blog from Jekyll to Hugo.\nThe method for adding a like button remains similar,\nbut the code and its placement need to be adjusted.\nLikeCoin # I came across LikeCoin and it piqued my interest.\nLikeCoin is a cryptocurrency,\nwhich was created to encourage content creators.\nContent creators can embed a Like Button within their content or web pages.\nAnyone with a LikeCoin account who appreciates the content can click the button,\nand the content creator will receive the corresponding LikeCoin.\nThe amount of LikeCoin a content creator receives depends on the account type of the likers.\nThe LikeCoin Foundation proportionally rewards free account likers,\nwhile payment accounts are proportionally charged based on the number of likes they click each month.\nMore details can be found on LikeCoin\u0026rsquo;s Medium.\nLike Rewards Button for Jekyll Theme # So I registered an account and found that the like button widget supports platforms like Medium, WordPress, Oice, Matters, etc. 
However, since Jekyll themes are custom-designed, we need to manually integrate the Like button.\nSo I found an article with an embedded Like button on Medium and inspected it using my browser\u0026rsquo;s developer console:\nIt appears to be using an iframe. To make it work, the src attribute needs to include our liker ID and the article address.\nSo, there are two different ways to add a like button to the blog post:\nAdd the iframe to every blog post Add the iframe to the template which generates the blog posts Clearly,\nthe second approach is better.\nSo I located the template that generates blog posts at _layouts/post.html in my Jekyll theme,\nand I added this code to the template after the content section:\n\u0026lt;div align=\u0026#34;center\u0026#34;\u0026gt; \u0026lt;iframe scrolling=\u0026#34;no\u0026#34; src=\u0026#34;https://button.like.co/in/embed/\u0026lt;MY_LIKER_ID\u0026gt;/button/?referrer={{ site.url }}{{ page.url }}\u0026#34; frameborder=\u0026#34;0\u0026#34;\u0026gt;\u0026lt;/iframe\u0026gt; \u0026lt;/div\u0026gt; Remember to replace the Liker ID with your own.\nsite.url and page.url are Liquid syntax,\nrepresenting the URL for the current blog post.\nAfter adding this, every generated blog post should have a like button at the end of the article.\n","date":"27 December 2019","externalUrl":null,"permalink":"/posts/likecoin-button-jekyll.html","section":"Posts","summary":"Update # I’ve moved my blog from Jekyll to Hugo.\nThe method for adding a like button remains similar,\nbut the code and its placement need to be adjusted.\nLikeCoin # I came across LikeCoin and it piqued my interest.\nLikeCoin is a cryptocurrency,\nwhich was created to encourage content creators.\nContent creators can embed a Like Button within their content or web pages.\nAnyone with a LikeCoin account who appreciates the content can click the button,\nand the content creator will receive the corresponding LikeCoin.\nThe amount of LikeCoin a content creator receives depends on 
the account type of the likers.\nThe LikeCoin Foundation proportionally rewards free account likers,\nwhile payment accounts are proportionally charged based on the number of likes they click each month.\nMore details can be found on LikeCoin’s Medium.\n","title":"Add LikeWidget to Jekyll theme","type":"posts"},{"content":"","date":"27 December 2019","externalUrl":null,"permalink":"/tags/frontend.html","section":"Tags","summary":"","title":"Frontend","type":"tags"},{"content":"","date":"27 December 2019","externalUrl":null,"permalink":"/tags/likecoin.html","section":"Tags","summary":"","title":"Likecoin","type":"tags"},{"content":"","date":"25 December 2019","externalUrl":null,"permalink":"/categories/git.html","section":"Categories","summary":"","title":"Git","type":"categories"},{"content":"","date":"25 December 2019","externalUrl":null,"permalink":"/tags/git.html","section":"Tags","summary":"","title":"Git","type":"tags"},{"content":"","date":"25 December 2019","externalUrl":null,"permalink":"/tags/gitignore.html","section":"Tags","summary":"","title":"Gitignore","type":"tags"},{"content":"Every time a new folder is created and files are added in macOS, a .DS_Store file is generated within that folder. This results in numerous .DS_Store files scattered across macOS, and it\u0026rsquo;s quite annoying to add **/.DS_Store to every .gitignore file each time a new Git repository is created. 
Therefore, I found a method to prevent .DS_Store files from being tracked in all Git repositories.\nThe git config command # The git config command can be used to set a variety of Git settings.\nThe most common uses are git config --global user.name and git config --global user.email,\nwhich set user.name and user.email globally,\nallowing all Git repositories to use these settings.\nThis command can also be used to set local settings for a single Git repository.\nIf we wish to use a different username or email within a specific repository,\nwe can use git config --local user.name and git config --local user.email to achieve this.\nThere is a core.excludesfile setting for git config,\nwhich can be set to an ignore configuration file to specify files that all Git repositories should ignore.\nTherefore, all we need to do is write .DS_Store and **/.DS_Store into a file,\nand then set core.excludesfile to point to that file.\nThen all Git repositories will ignore .DS_Store files.\nWe can achieve this by using these commands:\necho \u0026#34;.DS_Store\u0026#34; \u0026gt;\u0026gt; ~/.gitignore_global echo \u0026#34;**/.DS_Store\u0026#34; \u0026gt;\u0026gt; ~/.gitignore_global git config --global core.excludesfile ~/.gitignore_global","date":"25 December 2019","externalUrl":null,"permalink":"/posts/remove-ds_store-from-all-git-repo.html","section":"Posts","summary":"Every time a new folder is created and files are added in macOS, a .DS_Store file is generated within that folder. This results in numerous .DS_Store files scattered across macOS, and it’s quite annoying to add **/.DS_Store to every .gitignore file each time a new Git repository is created. 
Therefore, I found a method to prevent .DS_Store files from being tracked in all Git repositories.\n","title":"Remove .DS_Store tracking in all Git repositories","type":"posts"},{"content":"","date":"25 December 2019","externalUrl":null,"permalink":"/tags/tool.html","section":"Tags","summary":"","title":"Tool","type":"tags"},{"content":"","date":"25 December 2019","externalUrl":null,"permalink":"/categories/concurrency.html","section":"Categories","summary":"","title":"Concurrency","type":"categories"},{"content":"Both \u0026lsquo;Concurrent Processing\u0026rsquo; and \u0026lsquo;Parallel Processing\u0026rsquo; refer to multiple processes executing on the CPU within a period,\nbut they are two different things.\nAccording to The Art of Concurrency,\nConcurrent means:\ntwo or more processes are in progress at the same time\nWhile Parallel means:\ntwo or more processes executing simultaneously\nThey look pretty similar but are actually different.\nFor example,\ntwo processes are executing,\nprocess A and process B.\nParallel Processing # Parallel processing may look like this:\nBoth Process A and Process B are being executed.\nConcurrent Processing # However, for concurrent processing,\nthe execution might look like the diagram above,\nor it might look like this:\nBoth Process A and Process B are in progress,\nbut they are not executing at the same time.\nNotice # Parallel processing is only a type of concurrent processing.\nAs long as there are multiple processes in progress,\nit is concurrent processing.\nThere are many ways to achieve concurrent processing,\nand parallel processing is only one of them.\n","date":"25 December 2019","externalUrl":null,"permalink":"/posts/concurrent-process-parallel-process.html","section":"Posts","summary":"Both ‘Concurrent Processing’ and ‘Parallel Processing’ refer to multiple processes executing on the CPU within a period,\nbut they are two different things.\nAccording to The Art of Concurrency,\nConcurrent 
means:\n","title":"Difference between Concurrent Processing and Parallel Processing","type":"posts"},{"content":"","date":"25 December 2019","externalUrl":null,"permalink":"/tags/parallel-processing.html","section":"Tags","summary":"","title":"Parallel Processing","type":"tags"},{"content":" Summary # I am a Senior Software Developer and Backend Expert with over seven years of experience engineering the high-performance distributed systems required to scale compute-intensive services and modern AI applications. While my core expertise is rooted in robust backend architecture and applied AI development, I maintain strong cross-stack frontend capabilities, enabling me to seamlessly bridge complex intelligent logic layers with polished, user-facing interfaces.\nMy engineering philosophy is anchored in structural integrity, predictable scalability, and clean abstractions. I specialize in identifying and resolving systemic bottlenecks—such as decoupling monolithic data schemas to eradicate O(N) limitations—and establishing the asynchronous infrastructure necessary to power intelligent products without compromising enterprise stability.\nCurrently, my work focuses on the intersection of advanced backend infrastructure and the intelligent logic layer. I specialize in architecting autonomous AI agents, complex Retrieval-Augmented Generation (RAG) pipelines, and integrating Model Context Protocols (MCP) to build highly responsive, high-throughput applications.\nCareer Highlights:\nAI Application \u0026amp; Agent Development: Extensive experience engineering fault-tolerant backend pipelines for AI integration. Independent R\u0026amp;D includes building autonomous discovery agents (such as \u0026ldquo;Hackathon Sniper,\u0026rdquo; utilizing Groq LLMs and Brave Search for the Notion MCP Hackathon) and engineering robust, RAG-driven inference pipelines utilizing LangChain. Patentable R\u0026amp;D: Dual US Patent Holder (US-20250363378-A1, US-20250240262-A1). 
Architected a zone-based, production-grade algorithm to optimize spatial computing—successfully reducing distributed network message volume by up to 80%—and developed a separate conceptual Machine Learning (Reinforcement Learning) framework. Enterprise Scale \u0026amp; Delivery: Served as a Senior Engineer for the HTC Viverse global launch, where I conceptualized and designed the dynamic edge-routing network architecture (AWS Lambda@Edge) required to scale to one million monthly active users. Systemic Evolution: Designed, engineered from scratch, and open-sourced a complete asynchronous microservice architecture (FastAPI/Python). Established zero-trust data pipelines alongside this architecture to securely integrate programmatic services and aggressively drive enterprise technical execution. Experience # Senior Software Engineer · HTC Viverse Research Team May 2022 – Jul 2025 Go · Elixir · Python · Node.js · PostgreSQL · Docker · AWS · WebRTC · mediasoup High-concurrency Metaverse platform supporting massive concurrent requests.\nViverse Worlds Global Launch Scale: Architected the foundational backend and serverless edge-routing infrastructure (AWS Lambda@Edge) driving the platform’s global commercial release to support 1 million monthly active users. Technical Leadership: Directed multi-disciplinary engineering pods from system design to execution, transitioning the core platform from an isolated, single-scene architecture into an interconnected multiple-scene 3D ecosystem while maintaining strict backward compatibility. Database Concurrency: Eradicated a critical O(N) locking bottleneck during high-concurrency room allocations by redesigning the PostgreSQL schema. Replaced highly contested table-level locks with granular O(1) row-level locking, accelerating throughput by 10x-100x and stabilizing p99 latencies to 2ms-10ms. 
Real-Time AI Pipelines: Engineered an event-driven media bridge (mediasoup) connecting low-latency WebRTC/RTP streams to persistent cloud storage. Designed secure, asynchronous data pipelines for third-party AI moderation, directly driving a $33,000 USD revenue increase. Dual Patent Inventor (US-20250240262-A1, US-20250363378-A1): Invented and engineered a zone-based distributed spatial computing algorithm (combining frustum culling and dead reckoning) and a Machine Learning (Reinforcement Learning) framework for evaluating occlusion, slashing distributed network egress bandwidth by up to 80%. Compute Scalability: Spearheaded a multi-worker paradigm shift in the WebRTC core, breaking single-core CPU ceilings to expand concurrent capacity from 300 to 4,500+ streams per event. Load Simulation \u0026amp; Reliability: Engineered a custom WebRTC load-testing CLI tool to simulate massive concurrent traffic, proactively isolating memory leaks and optimizing hardware load-balancing prior to production. Senior Backend Engineer · HTC Vive Sep 2020 – May 2022 Python (FastAPI) · Golang · MongoDB · MySQL · Docker · EKS · AWS · React Centralized enterprise VR platform (Vive Business Training) supporting high-density concurrent sessions for hundreds of institutional clients, alongside an ecosystem managing complex 3D asset generation.\nArchitecture \u0026amp; Standardization: Drove a high-performance evolution by architecting an asynchronous FastAPI standard that slashed API latency by 85%. Enforced strict engineering rigor, achieving 95% test coverage and establishing containerized auto-scaling pipelines via Docker and AWS EKS to maintain 99.99% uptime across critical multi-platform services. Asynchronous Compute Pipelines: Architected a fault-tolerant microservice ecosystem leveraging Celery and AWS SQS to decouple heavy 3D model and media processing, guaranteeing high availability and non-blocking API execution.
Concurrency \u0026amp; State Management: Engineered core backend services driving the asset lifecycle. Designed strict transactional logic to completely eradicate database race conditions during high-concurrency, multi-tenant modifications. Polyglot Persistence \u0026amp; Identity: Engineered a highly scalable RBAC ecosystem and integrated multi-platform account management APIs with third-party cloud hosting to streamline SaaS provisioning. Enforced polyglot persistence, isolating rigid relational data in MySQL while utilizing MongoDB for flexible 3D asset schemas. Serverless Observability Architecture: Engineered an event-driven health-monitoring pipeline utilizing AWS Lambda (Golang/Python) and CloudWatch to continuously validate the identity system\u0026rsquo;s availability, enabling real-time fault detection and ensuring high availability. Zero-Trust IoT Edge Routing: Architected a Proof of Concept utilizing AWS IoT (MQTT) to manage telemetry for large fleets of VR headsets, incorporating a Certificate Vending Machine for X.509 device authentication. Senior Software Engineer \u0026amp; Technical Lead · InfoBoom Ltd Feb 2020 – Sep 2020 C\u0026#43;\u0026#43; · Node.js · MongoDB · RabbitMQ · WebRTC · GCP High-security, end-to-end encrypted SaaS communication platform developed in partnership with Taiwan’s Academia Sinica. Selected for presentation to the President of Taiwan at CYBERSEC 2020.\nAsynchronous System Modernization: Architected the enterprise teardown of a legacy WebRTC monolith, migrating the core infrastructure into a fault-tolerant, event-driven Node.js microservice ecosystem. Leveraged MongoDB and RabbitMQ to establish resilient, asynchronous inter-service communication pipelines. Cryptographic Architecture \u0026amp; Compliance: Spearheaded the mission-critical modernization of the 3rd-generation core C++ encryption library.
Enforced strict zero-regression API contracts across the entire backend, mobile, and client ecosystem while achieving full ISO27001 and Taiwan MAS compliance. Technical Pod Leadership: Directed a 4-person engineering team through complex architectural migrations, managing the critical path to guarantee the on-time delivery of mission-critical security milestones. Infrastructure Automation (CI/CD): Introduced and engineered fully automated CI/CD deployment pipelines for the business-critical, multi-platform cryptography library, accelerating developer velocity and resulting in a 3x reduction in average deployment times. Product Developer · Synology Dec 2016 – Jun 2018 C\u0026#43;\u0026#43; · Python · Redis · ExtJS · CI/CD Architected and delivered full-stack mail flow control features, including SMTP Relay, integrating low-level C++/Postfix/Dovecot components with Redis schemas, web APIs, and ExtJS front-end interfaces. Maintained complete ownership of the standalone NAS Mail Server product, ensuring long-term stability, patch management, and feature alignment with enterprise requirements. Delivered GDPR compliance features by implementing privacy-preserving data handling mechanisms and system-level audit controls in mail workflows. Built and maintained CI/CD pipelines for automated testing and deployment, improving release reliability and team velocity; enhanced system security and uptime by patching critical CVE vulnerabilities (e.g., Dovecot, memcached) and resolving race conditions and memory leaks. Drove high reliability and operational excellence through close collaboration with Technical Support, directly resolving escalated customer incidents — including backend data correction, permission fixes, and mail queue management — reducing mean time to resolution (MTTR) for critical issues.
Collaborated cross-functionally with Product Management, UX/UI designers, QA engineers, and Technical Writers to ensure functional correctness, usability, and documentation accuracy across all user-facing and system features. Research Assistant · National Taiwan University May 2014 – Aug 2014 Wireless Networking \u0026amp; Embedded Systems Lab Project: Information Delivery Middleware for Disaster Management over Heterogeneous Interwoven Communication Networks | Funded by National Science and Technology Council\nDistributed Systems Infrastructure: Built and deployed the foundational distributed testing platform required to execute large-scale, heterogeneous network simulations for the lab\u0026rsquo;s disaster management research. Algorithm Validation Environment: Operationalized the testing environment to conduct empirical validation of experimental service recovery algorithms, simulating network infrastructure failures and ensuring reliable telemetry collection across a publish/subscribe architecture. Linux Environment Provisioning: Configured and tuned the underlying Linux operating system and network environments to support high-concurrency distributed testing without I/O degradation. 
Skills # Languages: Python, C++, Golang, Node.js, Elixir, SQL, JavaScript, TypeScript System Architecture: Distributed Systems, Event-Driven Architecture, Microservices, Domain-Driven Design (DDD), Backend-For-Frontend (BFF), Zero-Trust Data Pipelines Database \u0026amp; State Management: MySQL, SQLite, PostgreSQL, MongoDB, Redis, Firestore Frameworks: FastAPI, LangChain, Gin, Node.js, Flask, React, TensorFlow, PyTorch Platforms, Tools \u0026amp; Network Protocols: Git, Docker, Kubernetes, AWS, GCP, Linux, Jenkins, CI/CD, WebRTC, gRPC, RESTful APIs, WebSockets, Terraform (IaC) Education # Master of Science in Computer Science # National Taiwan University 2014 – 2016 Thesis: Time-Sensitive Message Delivering with QoS in Named Data Networking Project: Information Delivery Middleware for Disaster Management over Heterogeneous Interwoven Communication Networks | Funded by National Science and Technology Council Project: Real-Time High-Performance Miniature Sensing Systems for Sleep Apnea | Funded by National Science and Technology Council Bachelor of Science in Computer Science - Program of Computer and Electrical Engineering # National Chiao Tung University 2009 – 2013 Publications # Data-driven IoT applications design for smart city and smart buildings # 2017 IEEE SmartWorld, Ubiquitous Intelligence \u0026amp; Computing, Advanced \u0026amp; Trusted Computed, Scalable Computing \u0026amp; Communications, Cloud \u0026amp; Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI)\nAuthors: Chi-Sheng Shih; Kuo-Hsiu (Kourtney) Lee; Jyun-Jhe Chou; Kwei-Jay Lin · DOI: 10.1109/UIC-ATC.2017.8397394\nPatents # METHOD AND SYSTEM FOR MANAGING POSITION INFORMATION, AND COMPUTER READABLE STORAGE MEDIUM # Patent No.: US-20250363378-A1\nInventors: Kuo-Hsiu, Lee · Filed: 2024-05-22\nReduced message exchange volume in distributed spatial computing using Machine Learning (Reinforcement Learning).
Patent applications have also been filed in the following regions: European Patent (EP24206749.4 2024-10-15), China (202411517088.4 2024-10-29), Taiwan (113133801 2024-09-06)\nMETHOD FOR MANAGING MESSAGE TRANSMISSION, CONTROL DEVICE, AND COMPUTER READABLE STORAGE MEDIUM # Patent No.: US-20250240262-A1\nInventors: Kuo-Hsiu, Lee · Filed: 2024-06-27\nAchieved message reduction in distributed spatial computing through innovative algorithms. Real-world application tests showed an 80% reduction in network message volume. Patent applications have also been filed in the following regions: China (202411455975.3 2024-10-18), Taiwan (113131471 2024-08-21)\nProjects # PantryLens | Edge-Routed AI Vision PWA \u0026#8599; \u0026#8598; An AI-driven Progressive Web App (PWA) designed to process unstructured visual data and generate real-time recipes. Built to demonstrate secure, low-latency edge computing, the application autonomously parses pantry inventories via image recognition to structure outputs for highly specialized dietary profiles. Dispatched vision tasks to the Gemma 4 model, engineered a real-time Server-Sent Events (SSE) pipeline, and utilized Redis for sliding-window IP rate limiting. Next.js Edge Computing Vercel Server-Sent Events (SSE) OpenRouter Gemma 4 Redis System Architecture. AuroraPath | Secure AI Routing Web Application \u0026#8599; \u0026#8598; A modern, full-stack app connecting real-time NOAA space weather data with Google Gemini to generate travel routes for aurora sightings, deployed to the global edge via Vercel. Demonstrates secure AI integration with a dual-layer identity ecosystem (Auth0 M2M) to securely isolate end-user authentication from the Gemini AI inference agent, enforcing strict API quotas and telemetry ingestion via Upstash Redis. Provides scalable edge-routing within a consumer-facing product.
TypeScript Next.js Vercel Auth0 System Architecture API Rate Limiting LLM Hackathon Sniper | Autonomous Discovery AI Agent \u0026#8599; \u0026#8598; An autonomous AI agent engineered to dynamically discover, evaluate, and structure technology event data. Built for the Notion MCP Hackathon, this project demonstrates the transition from traditional, linear API and multiple Model Context Protocol (MCP) integrations to autonomous AI agent building. Typescript Model Context Protocol (MCP) Autonomous Agents Groq API Integration System Architecture Iron Counsel: Enterprise RAG Architecture \u0026 Conversational AI \u0026#8599; \u0026#8598; Conversational AI app demonstrating accurate LLM inference with RAG over complex datasets. Tech stack: Serverless FastAPI backend on GCP (provisioned via Terraform), local ONNX embeddings, Firestore vector search, and LangChain orchestration. RAG LangChain LLM Vector Database FastAPI Python GCP CI/CD Terraform Enterprise Standard: Asynchronous FastAPI Architecture \u0026#8599; \u0026#8598; A production-grade, asynchronous Python microservice boilerplate engineered to standardize enterprise backend development and eradicate legacy I/O bottlenecks. Enforced strict organizational engineering culture by integrating Docker and mandating a baseline of 95% test coverage. Python FastAPI Microservices System Architecture MongoDB Docker Software Standardization ","externalUrl":null,"permalink":"/about.html","section":"Kourtney's Space","summary":"Summary # I am a Senior Software Developer and Backend Expert with over seven years of experience engineering the high-performance distributed systems required to scale compute-intensive services and modern AI applications.
While my core expertise is rooted in robust backend architecture and applied AI development, I maintain strong cross-stack frontend capabilities, enabling me to seamlessly bridge complex intelligent logic layers with polished, user-facing interfaces.\n","title":"About","type":"page"},{"content":" About # I am a Senior Software Developer and Backend Expert with over seven years of experience engineering the high-performance distributed systems required to scale compute-intensive services and modern AI applications. While my core expertise is rooted in robust backend architecture and applied AI development, I maintain strong cross-stack frontend capabilities, enabling me to seamlessly bridge complex intelligent logic layers with polished, user-facing interfaces.\nMy engineering philosophy is anchored in structural integrity, predictable scalability, and clean abstractions. I specialize in identifying and resolving systemic bottlenecks—such as decoupling monolithic data schemas to eradicate $O(N)$ limitations—and establishing the asynchronous infrastructure necessary to power intelligent products without compromising enterprise stability.\nCurrently, my work focuses on the intersection of advanced backend infrastructure and the intelligent logic layer. I specialize in architecting autonomous AI agents, complex Retrieval-Augmented Generation (RAG) pipelines, and integrating Model Context Protocols (MCP) to build highly responsive, high-throughput applications.\nCareer Highlights:\nAI Application \u0026amp; Agent Development: Extensive experience engineering fault-tolerant backend pipelines for AI integration. Independent R\u0026amp;D includes building autonomous discovery agents (such as \u0026ldquo;Hackathon Sniper,\u0026rdquo; utilizing Groq LLMs and Brave Search for the Notion MCP Hackathon) and engineering robust, RAG-driven inference pipelines utilizing LangChain. Patentable R\u0026amp;D: Dual US Patent Holder (US-20250363378-A1, US-20250240262-A1). 
Architected a zone-based, production-grade algorithm to optimize spatial computing—successfully reducing distributed network message volume by up to 80%—and developed a separate conceptual Machine Learning (Reinforcement Learning) framework. Enterprise Scale \u0026amp; Delivery: Served as a Senior Engineer for the HTC Viverse global launch, where I conceptualized and designed the dynamic edge-routing network architecture (AWS Lambda@Edge) required to scale to one million monthly active users. Systemic Evolution: Designed, engineered from scratch, and open-sourced a complete asynchronous microservice architecture (FastAPI/Python). Established zero-trust data pipelines alongside this architecture to securely integrate programmatic services and aggressively drive enterprise technical execution. Key Skills: Python · Go · Node.js · C++ · Elixir · GCP · AWS · Docker · MongoDB · PostgreSQL\nFull experience, projects \u0026amp; background →\nBlog Posts # I also write on dev.to → — shorter articles for hackathons and coding challenges. Beyond the Wall: Building a Low-Cost, High-Efficiency Cloud RAG Application with Firestore Vector Search 20 March 2026\u0026middot;1573 words\u0026middot;8 mins RAG (Retrieval Augmented Generation) is an AI framework that allows developers to add external information without retraining the LLM, improving the accuracy of its answers. As of 2026, it is a widely known technology. The concept is roughly as follows: First, vectorize external information (the data you want the LLM to know) using an embedding model and store it. After a user enters a prompt, the prompt is also vectorized using the same embedding model. It is then compared against the previously stored vectors to retrieve the most similar pieces of data. These are then integrated by the LLM to generate a response for the user. This approach allows the LLM to answer using specific knowledge integrated by developers without the need for retraining. 
How I Finally Removed GitHub’s Persistent “Ghost Notification” — The Real Fix With GitHub CLI 14 November 2025\u0026middot;538 words\u0026middot;3 mins If you’re a developer who uses GitHub daily, you probably rely on notification badges to track issues, pull requests, and mentions. But what happens when the notification badge gets stuck — even after clearing everything? For months, I saw a 1 notification badge that refused to disappear, even though there aren\u0026rsquo;t any unread messages in any inbox folder. No archived items, no subscriptions, nothing hidden. Still, the badge remained. Many developers started reporting the same issue as early as September 2025. This wasn’t just a UI bug — it was caused by a spam attack impersonating Gitcoin, leaving backend notification records that GitHub never automatically cleaned up. Multiprocessing, Multithreading and Asyncio in Python Part 1 - Basic Concept 25 October 2025\u0026middot;Updated: 9 June 2026\u0026middot;1131 words\u0026middot;6 mins Python\u0026rsquo;s performance bottlenecks were criticized for years, but thanks to the hard work of developers, Asyncio was introduced in Python 3.4 to improve performance in specific scenarios. By Python 3.13, the Free-threaded design (PEP-703) emerged, allowing the optional disabling of the GIL. Combined with the pre-existing Multiprocessing and Multithreading, I have compiled a few records on the principles, differences, and use cases for these three technologies. This first post will briefly introduce the basic concepts and suitable scenarios for each. MacOS Legacy Rsync Hangs 4 October 2025\u0026middot;270 words\u0026middot;2 mins A few months ago, I encountered an issue while using rsync to back up data from my MacBook to a NAS. rsync would appear to be running normally for a while and then suddenly hang indefinitely. The terminal output showed it syncing files as usual, and then it just\u0026hellip; stopped. 
There were no error messages, and rsync didn\u0026rsquo;t exit. Initially, I thought it might be a large file transfer or an unstable network connection. However, I discovered that if I killed the process and ran the rsync command again, it would resume smoothly from the file where it had previously stuck. This happened several times in a row! Sync Obsidian / Joplin Data Across Multiple Devices with Synology WebDAV 25 December 2024\u0026middot;Updated: 9 June 2026\u0026middot;459 words\u0026middot;3 mins I originally used Notion as my note-taking software. It\u0026rsquo;s feature-rich and has a beautiful interface. However, a few years ago, a privacy controversy arose around Notion, accusing them of looking at a company\u0026rsquo;s content stored in Notion, and even proposing a partnership based on that information. So, I switched to Joplin for a while, but eventually moved to Obsidian, which has a large number of plugins, strong community support, and is highly customizable. Managing Pre-existing Global NPM Packages After Installing NVM 6 November 2021\u0026middot;Updated: 9 June 2026\u0026middot;244 words\u0026middot;2 mins Today I encountered a problem: After installing nvm, the path for installing global packages changed, making it impossible to directly remove previously installed global packages using npm uninstall -g. How did I discover this? A long time ago, I installed a global package that could be executed directly from the terminal using a command. But because it was so long ago, when I tried to upgrade that package, I found it wasn\u0026rsquo;t listed in npm list -g. See all posts →\nLet\u0026rsquo;s Work Together — Available for consulting, research collaboration, speaking engagements, and contract work. 
Work With Me → ","externalUrl":null,"permalink":"/index.html","section":"Kourtney's Space","summary":"About # I am a Senior Software Developer and Backend Expert with over seven years of experience engineering the high-performance distributed systems required to scale compute-intensive services and modern AI applications. While my core expertise is rooted in robust backend architecture and applied AI development, I maintain strong cross-stack frontend capabilities, enabling me to seamlessly bridge complex intelligent logic layers with polished, user-facing interfaces.\n","title":"Kourtney's Space","type":"page"},{"content":" I also write on dev.to → — shorter articles for hackathons and coding challenges. ","externalUrl":null,"permalink":"/posts/index.html","section":"Posts","summary":" I also write on dev.to → — shorter articles for hackathons and coding challenges. ","title":"Posts","type":"posts"},{"content":" Software \u0026amp; Systems Development # Available for consulting and contract work on website development, full-stack systems, workflow automation, backend systems, and AI applications.\nTypical engagements: Technical consulting, contract development, code reviews, architecture design.\nResearch Collaboration # Open to academic or industry research collaboration in areas such as distributed systems, applied AI, and cloud computing, including co-authoring papers.\nTypical engagements: Co-authorship, research advising, dataset or system design contributions.\nGet In Touch # Have a project in mind? 
Feel free to reach out.\nkourtneylee1611@gmail.com\n","externalUrl":null,"permalink":"/work-with-me.html","section":"Kourtney's Space","summary":"Software \u0026 Systems Development # Available for consulting and contract work on website development, full-stack systems, workflow automation, backend systems, and AI applications.\nTypical engagements: Technical consulting, contract development, code reviews, architecture design.\nResearch Collaboration # Open to academic or industry research collaboration in areas such as distributed systems, applied AI, and cloud computing, including co-authoring papers.\n","title":"Work With Me","type":"page"}]