my-blogs

Understanding YouTube’s Image Processing Pipeline

Lavkush Maurya — Tue, 21 Apr 2026 06:09:22 GMT

When I had the idea to build my own movie review platform, I started wondering how I would handle images for different screen sizes. That question led me to learn about YouTube’s image processing pipeline, so I could build something similar for my project.

So, here’s what happens behind the scenes:

Step 1: Upload (Input Stage)

When a creator uploads a thumbnail with high resolution (e.g., 1280×720) in JPEG or PNG format, the image may be converted into more efficient formats such as WebP or AVIF to improve loading speed and optimize performance. The processed image is then stored as the master (original) image.

At this stage, YouTube also uses an NSFW detection system to filter out any images that are inappropriate.

Step 2: Derivative Generation (Multiple Resolutions)

After the first step, multiple versions of the master image are created. The system resizes the master image into different resolutions, such as:

default → 120×90
mqdefault → 320×180
hqdefault → 480×360
maxresdefault → up to 1280×720+

Each version is generated directly from the master image to maintain quality and avoid repeated compression.
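
Here is a minimal sketch of that idea. The sizes come from the list above, but `planDerivatives` and the rule of skipping variants larger than the master are my own assumptions, not YouTube's actual code. The point is that every resize job reads from the master, never from another variant:

```javascript
// Variant names and sizes as listed above.
const VARIANTS = [
  { name: "default", width: 120, height: 90 },
  { name: "mqdefault", width: 320, height: 180 },
  { name: "hqdefault", width: 480, height: 360 },
  { name: "maxresdefault", width: 1280, height: 720 },
];

// Hypothetical helper: plan one resize job per variant, always reading from the master.
// Assumption: variants larger than the master are skipped to avoid upscaling.
function planDerivatives(master) {
  return VARIANTS
    .filter((v) => v.width <= master.width && v.height <= master.height)
    .map((v) => ({
      source: master.id, // every job points at the master, never at another variant
      name: v.name,
      width: v.width,
      height: v.height,
    }));
}

const jobs = planDerivatives({ id: "master.webp", width: 1280, height: 720 });
console.log(jobs.map((j) => `${j.name}: ${j.width}x${j.height}`));
```

In a real project, each job would then be handed to an image library to do the actual resizing.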

Step 3: Compression (Perceptual Optimization)

Human vision is more sensitive to brightness (luminance), edges, faces, and text, and less sensitive to fine color details (chrominance). Modern image compression techniques take advantage of this:

1. Chroma Subsampling: Instead of storing full color detail, the image keeps full brightness information while reducing color resolution.

2. Frequency-Based Compression (DCT): The image is divided into small blocks (usually 8×8). Each block is transformed into frequency components:

Low frequency → smooth areas
High frequency → sharp edges and details

3. Quantization: This is the “lossy” stage, where small details are rounded off and less important frequency data is reduced to save space.

4. Perceptual Weighting: Different parts of the image are given different levels of importance during compression. Human eyes are:

Highly sensitive to edges (text, outlines)
Very sensitive to faces
Less sensitive to smooth backgrounds, noise, and minor color variations

Compression algorithms take advantage of these characteristics to preserve what matters most.

5. Adaptive Compression: Compression is applied unevenly across the image:

Complex areas → higher quality
Simple areas → more compression

6. Post-Processing: After resizing, slight sharpening is often applied to restore edge clarity and improve perceived image quality.
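
To make two of these ideas concrete, here is a small sketch of chroma subsampling and quantization. The function names, the sample values, and the step sizes are all made up for illustration. Real codecs are far more involved, so treat this as a toy model rather than how JPEG or WebP is actually implemented:

```javascript
// 4:2:0-style chroma subsampling: average each 2x2 block of a chroma plane.
// Assumes the plane has an even width and height.
function subsampleChroma(plane) {
  const out = [];
  for (let y = 0; y < plane.length; y += 2) {
    const row = [];
    for (let x = 0; x < plane[y].length; x += 2) {
      const sum = plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1];
      row.push(Math.round(sum / 4));
    }
    out.push(row);
  }
  return out;
}

// Quantization: divide each frequency coefficient by a step size and round.
// Larger steps (typically used for high frequencies) discard more detail.
function quantize(coefficients, steps) {
  return coefficients.map((c, i) => Math.round(c / steps[i]));
}

// Dequantization recovers only an approximation, which is why this stage is lossy.
function dequantize(levels, steps) {
  return levels.map((q, i) => q * steps[i]);
}

const chroma = [
  [100, 102, 50, 52],
  [104, 106, 54, 56],
];
console.log(subsampleChroma(chroma)); // one value per 2x2 block

const coeffs = [320, 45, -50, 7]; // made-up values, ordered low to high frequency
const steps = [16, 24, 40, 64];   // made-up step sizes
const levels = quantize(coeffs, steps);
console.log(levels, dequantize(levels, steps));
```

Notice that the low-frequency value survives the round trip while the high-frequency ones come back changed or as zero. That is where the file-size savings come from.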

Step 4: Storage

All generated image variants are stored within YouTube’s infrastructure, which relies on Google’s distributed storage systems.

Step 5: CDN Distribution

Images are delivered through Google’s CDN, which uses edge servers to serve content from locations closest to the user for faster loading.

Step 6: Device-Aware Delivery

YouTube selects the appropriate image based on factors such as screen size, network speed, and the UI context (e.g., whether the image is displayed in a grid or full view).
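
For my own project, I imagine the selection logic looking something like the sketch below. The rule (pick the smallest variant that covers the rendered width, and ignore pixel density on a slow connection) and the `pickThumbnail` function are my assumptions, not YouTube's real logic:

```javascript
// Variant widths taken from the sizes listed in Step 2.
const THUMBNAILS = [
  { name: "default", width: 120 },
  { name: "mqdefault", width: 320 },
  { name: "hqdefault", width: 480 },
  { name: "maxresdefault", width: 1280 },
];

// Hypothetical selection rule: pick the smallest variant that covers the
// rendered width in physical pixels; on a slow connection, ignore pixel density.
function pickThumbnail({ cssWidth, devicePixelRatio = 1, slowNetwork = false }) {
  const density = slowNetwork ? 1 : devicePixelRatio;
  const needed = cssWidth * density;
  const match = THUMBNAILS.find((t) => t.width >= needed);
  // Fall back to the largest variant when nothing is big enough.
  return (match || THUMBNAILS[THUMBNAILS.length - 1]).name;
}

console.log(pickThumbnail({ cssWidth: 160 }));                       // grid tile
console.log(pickThumbnail({ cssWidth: 160, devicePixelRatio: 3 }));  // high-density phone
console.log(pickThumbnail({ cssWidth: 1280, devicePixelRatio: 2 })); // full view
```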

In the end, YouTube’s thumbnail system isn’t built on magic. It’s built on smart engineering decisions. By combining efficient compression techniques, multiple image variants, perceptual optimization, and fast CDN delivery, it ensures that thumbnails look sharp while loading quickly across all devices.

If you’re building your own platform, the key takeaway is simple: store a high-quality master image, generate optimized versions, and deliver them intelligently based on user context. With these principles, you can create a system that feels fast, scalable, and professional, just like the best platforms on the web.

Sources

YouTube Data API – Thumbnails
Google Developers – WebP Image Format
JPEG Compression (DCT & Quantization) – Technical Overview
Cloudflare – How CDNs Work
Google Cloud Vision API – SafeSearch Detection

How Does React Actually Work?

Lavkush Maurya — Mon, 13 Apr 2026 07:55:53 GMT

In recent days, I’ve been diving deeper into React and exploring not just how to use it, but how it actually works under the hood. While most tutorials do a great job explaining concepts step by step, it’s easy to feel overwhelmed when trying to connect everything—components, hooks, reconciliation, and rendering—all at once.

This article is my attempt to simplify those internal concepts and build a clear mental model of how React works behind the scenes, especially focusing on Fiber, rendering phases, and how updates flow through the system.

In this blog, I assume that you already know the basics of React, such as:

How to create functional components.
How to use Vite to create a React project.
What props are.
The basic usage of React Hooks.

These are the prerequisites for understanding this blog.

If we take a look at the structure of a React project, we see a single index.html file. That is one of the reasons React applications are called Single Page Applications (SPAs).

This HTML file contains a div element with an id="root". This is the element where all React components are rendered.

React Element Objects

These are plain JavaScript objects that hold the properties of an element.

const App = () => {
    return (
        <div>App Component</div>
    )
}

When we use console.log(App()), it outputs an object:

{
    "$$typeof": Symbol(react.element),
    key: null,
    props: {children: "App Component"},
    ref: null,
    type: "div",
}

Each JSX tag is converted into a React.createElement() call. With the classic JSX transform, that is why we import the React module in every file that contains JSX (the automatic JSX runtime introduced with React 17 removes that requirement). Each call returns an element object, as shown above.

This conversion from JSX to React.createElement() calls happens at build time and is done by a transpiler such as Babel.

These elements are:

Immutable in nature
Lightweight

They are just a description of the UI, not the actual DOM. Now, using this object, React creates an element tree, also known as the Virtual DOM.
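
To see that there is nothing magical about these objects, here is a simplified stand-in for React.createElement(). It is my own sketch, not React's real implementation (it leaves out `$$typeof` and many details), but it produces an object of the same general shape:

```javascript
// A simplified stand-in for React.createElement (not React's real implementation).
function createElement(type, config, ...children) {
  const { key = null, ref = null, ...rest } = config || {};
  const props = { ...rest };
  if (children.length === 1) props.children = children[0];
  else if (children.length > 1) props.children = children;
  // Freezing mimics the immutability of element objects.
  return Object.freeze({ type, key, ref, props: Object.freeze(props) });
}

// Roughly what <div>App Component</div> becomes after transpilation.
const element = createElement("div", null, "App Component");
console.log(element);
```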

In React, we do not call a component as a function. Instead, we use it like an HTML tag. React first creates an object using React.createElement(). If the type property is a function, React calls it. This returns more JSX, which is again converted into React.createElement() calls. Eventually, we get an object that can be used to create the Virtual DOM.

Flow:

JSX
React.createElement(App)
Creates element object {type: App}
React sees type is a function
calls App()
Gets more JSX
Converts again into objects
Builds Virtual DOM Tree
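
The flow above can be sketched in a few lines. Here `h` is a hypothetical helper standing in for React.createElement(), and `resolve` is my own simplified version of "call function types until only host elements remain". React's real process also handles hooks, state, and much more:

```javascript
// Hypothetical helper that builds element objects (stand-in for React.createElement).
const h = (type, props, ...children) => ({ type, props: { ...props, children } });

// Two function components; each returns more element objects.
const Title = ({ text }) => h("h1", null, text);
const App = () => h("div", null, h(Title, { text: "Hello" }));

// Walk the tree: call function types until only host elements (string types) remain.
function resolve(node) {
  if (typeof node !== "object" || node === null) return node; // plain text
  if (typeof node.type === "function") return resolve(node.type(node.props));
  return {
    type: node.type,
    children: node.props.children.map((child) => resolve(child)),
  };
}

const tree = resolve(h(App, null));
console.log(JSON.stringify(tree));
```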

What is Virtual DOM?

You have heard the term Virtual DOM many times in this blog. The Virtual DOM is created by React. When React renders a component, it creates a tree of React elements (plain JavaScript objects), which are used to construct and update the Virtual DOM internally. After the creation of the Virtual DOM, it is synced with the browser DOM.

When the page loads for the first time, the Virtual DOM is fully rendered and inserted into the browser DOM.

If some part of the tree changes, updating the entire tree is not an efficient approach. Instead, React compares the previous and the updated trees and determines the minimum number of changes required to update the DOM. This process is done using a diffing algorithm.

The Diffing Algorithm

This is the main algorithm that compares changes between two trees and updates the DOM accordingly. It helps maintain the performance of the application because it reduces the time required to render elements.

Assumptions:

Different trees are created when there are elements of two different types.
In a list of child elements that often changes, we should provide a unique key as a prop. We can see that in the element object, key is a separate property. This is because it plays a crucial role in the diffing algorithm.

React rebuilds that part of the tree from scratch when an element inside the div changes to a different type.

When the element type stays the same and only its attributes change, React updates just those attributes on the existing node, so the state is preserved. The tree is not fully re-rendered.
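
These two cases (a type change versus an attribute change) can be sketched like this. `diffElement` is a hypothetical function I wrote for illustration, and the real algorithm works on Fiber nodes and is much more detailed:

```javascript
// Compare two element objects and describe what has to happen.
function diffElement(prev, next) {
  // Different type: throw away the old subtree and build a new one.
  if (prev.type !== next.type) {
    return { action: "replace", type: next.type };
  }
  // Same type: keep the node and collect only the props that changed.
  const changed = {};
  const names = new Set([...Object.keys(prev.props), ...Object.keys(next.props)]);
  for (const name of names) {
    if (prev.props[name] !== next.props[name]) changed[name] = next.props[name];
  }
  return { action: "update", changed };
}

console.log(diffElement(
  { type: "div", props: { className: "before", title: "stuff" } },
  { type: "div", props: { className: "after", title: "stuff" } }
));
console.log(diffElement({ type: "div", props: {} }, { type: "span", props: {} }));
```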

First, React compares the initial items in the list. If they are the same, it then detects any changes in the Virtual DOM and appends the new element to the end of the list.

If we add an element at the beginning, React will regenerate the tree, which is less efficient. Instead, we can use the key property to tell React which element is new.
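
Here is a toy comparison that shows why keys help. Each string acts as both the key and the content of a list item. Both functions are my own simplification and ignore reordering, so this is only meant to show the difference in the amount of work:

```javascript
// Compare two child lists position by position (what happens without keys).
function diffByIndex(oldList, newList) {
  const ops = [];
  const max = Math.max(oldList.length, newList.length);
  for (let i = 0; i < max; i++) {
    if (i >= oldList.length) ops.push(`insert ${newList[i]}`);
    else if (i >= newList.length) ops.push(`remove ${oldList[i]}`);
    else if (oldList[i] !== newList[i]) ops.push(`update ${oldList[i]} -> ${newList[i]}`);
  }
  return ops;
}

// Match children by key: existing keys are reused, only new or missing ones change.
function diffByKey(oldList, newList) {
  const oldKeys = new Set(oldList);
  const newKeys = new Set(newList);
  return [
    ...newList.filter((k) => !oldKeys.has(k)).map((k) => `insert ${k}`),
    ...oldList.filter((k) => !newKeys.has(k)).map((k) => `remove ${k}`),
  ];
}

const before = ["b", "c"];
const after = ["a", "b", "c"]; // "a" added at the beginning
console.log(diffByIndex(before, after));
console.log(diffByKey(before, after));
```

Without keys, every position looks different, so every item is touched. With keys, only the new item needs an insert.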

Rendering

React itself doesn’t directly update the UI. Instead, it relies on a renderer like React DOM to apply changes to the browser.

After React determines what needs to change during the render phase, it passes those updates to the renderer. The renderer then performs the actual DOM operations, such as creating, updating, or removing elements.

In simple terms:

React decides what should change
The renderer decides how to update the UI
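
A tiny sketch of this split is shown below. The update list and the Map-based renderer are hypothetical and only stand in for React's internals and for React DOM. The point is that the list of updates knows nothing about the host it will be applied to:

```javascript
// "React" side (hypothetical): describes WHAT changed, with no knowledge of the host.
const updates = [
  { op: "create", id: "title", text: "Hello" },
  { op: "update", id: "title", text: "Hello, world" },
  { op: "remove", id: "title" },
];

// Renderer side (hypothetical): knows HOW to apply each operation to its host.
// Here the "host" is a plain Map standing in for the browser DOM.
function createMapRenderer() {
  const host = new Map();
  return {
    host,
    apply(update) {
      if (update.op === "remove") host.delete(update.id);
      else host.set(update.id, update.text); // create and update both write the text
    },
  };
}

const renderer = createMapRenderer();
renderer.apply(updates[0]);
renderer.apply(updates[1]);
console.log(renderer.host.get("title"));
renderer.apply(updates[2]);
console.log(renderer.host.size);
```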

Understanding how React works internally can feel overwhelming at first, but breaking it down into concepts like Fiber, rendering phases, and the role of the renderer makes it much easier to grasp.

At a high level, React is all about efficiently calculating changes and updating the UI in a predictable way. Once you understand how it separates what to update from how to update, many advanced concepts start to make sense naturally.

This article reflects my learning journey, and there’s still more to explore. If you’re learning React too, take it step by step—these concepts become clearer with time and practice.

Resources

React internals - https://youtu.be/7YhdqIR2Yzo?si=FCVd0b9WwqZhqlh3
React Fiber - https://youtu.be/0ympFIwQFJw?si=WRmbB78F53atzRPu
Full Stack React Course - https://youtu.be/Bvwq_S0n2pk?si=Q8vaDNw4n_TivpIx

MongoDB Aggregation Pipeline: From Basics to Behind the Scenes

Lavkush Maurya — Tue, 07 Apr 2026 02:50:41 GMT

In this blog, you are going to understand the internals of the MongoDB aggregation pipeline. Most of us, when we first face the aggregation pipeline, get confused about why there are so many $ signs inside the [] brackets. The [] brackets hold the stages, and each stage is written as an object in {} braces. Each stage uses an operator to transform the data, e.g. $group, $project, $addFields, $lookup.

The aggregation pipeline looks like:
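
As a small illustrative sketch (the `orders` collection and its field names are my assumptions, not a real dataset), a pipeline is just an array of stage objects, each keyed by a $-prefixed operator:

```javascript
// Hypothetical pipeline for an `orders` collection (collection and field names are assumptions).
const pipeline = [
  // Stage 1: keep only completed orders.
  { $match: { status: "completed" } },
  // Stage 2: group by customer and sum the order amounts.
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  // Stage 3: order the groups by total, highest first.
  { $sort: { total: -1 } },
];

// In mongosh or with a driver, this array would be passed to
// db.orders.aggregate(pipeline).
console.log(pipeline.map((stage) => Object.keys(stage)[0]));
```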

Now we are going to see that behind this aggregation pipeline there is a mini query engine that handles each step of the aggregation process. I have also listed the technical terms for each step.

Step 1: Parsing the Data

Parsing, Logical Plan, AST (Abstract Syntax Tree)

First, MongoDB reads the pipeline and turns it into a structured sequence of steps. This helps it analyze and execute the query more efficiently.

Step 2: Query Optimizer

Predicate Pushdown, Stage Re-ordering

It reorders the stages of the pipeline so the query executes faster, improving performance without changing the result.
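
One such rewrite is moving a $match in front of a $sort, so that fewer documents have to be sorted. The function below is a toy version of that single rule. `pushMatchBeforeSort` is a name I made up, and MongoDB's real optimizer applies many more rules than this:

```javascript
// Toy optimizer rule: when a $match directly follows a $sort, swap them.
// Filtering first means fewer documents to sort; the final result is the same.
function pushMatchBeforeSort(pipeline) {
  const out = [...pipeline]; // copy, so the caller's pipeline is left untouched
  for (let i = 1; i < out.length; i++) {
    if ("$match" in out[i] && "$sort" in out[i - 1]) {
      const match = out[i];
      out[i] = out[i - 1];
      out[i - 1] = match;
    }
  }
  return out;
}

const original = [
  { $sort: { total: -1 } },
  { $match: { status: "completed" } },
];
const optimized = pushMatchBeforeSort(original);
console.log(optimized.map((stage) => Object.keys(stage)[0]));
```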

Step 3: Query Planner

IXSCAN, COLLSCAN, B-Tree Index

It decides the proper way to fetch data. IXSCAN refers to index-based scanning, whereas COLLSCAN means scanning the entire collection. The index used in IXSCAN can be on any field, not just _id, but also combinations like { _id, name }.
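
The difference in work between the two can be sketched with plain arrays. The documents below are made up, and a Map stands in for MongoDB's B-tree index, so this only shows how many documents each approach has to examine:

```javascript
// Made-up documents for illustration.
const docs = [
  { _id: 1, name: "Asha" },
  { _id: 2, name: "Ravi" },
  { _id: 3, name: "Meera" },
  { _id: 4, name: "Ravi" },
];

// COLLSCAN: examine every document in the collection.
function collScan(collection, name) {
  let examined = 0;
  const result = collection.filter((d) => {
    examined++;
    return d.name === name;
  });
  return { result, examined };
}

// Build a lookup from field value to document positions.
// A Map stands in for MongoDB's B-tree index here.
function buildIndex(collection, field) {
  const index = new Map();
  collection.forEach((d, pos) => {
    if (!index.has(d[field])) index.set(d[field], []);
    index.get(d[field]).push(pos);
  });
  return index;
}

// IXSCAN: consult the index first, then fetch only the matching documents.
function ixScan(collection, index, value) {
  const positions = index.get(value) || [];
  return { result: positions.map((p) => collection[p]), examined: positions.length };
}

const nameIndex = buildIndex(docs, "name");
console.log(collScan(docs, "Ravi").examined, ixScan(docs, nameIndex, "Ravi").examined);
```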

Step 4: Physical Plan

Execution Plan, Operators

It converts the optimized plan into a physical execution plan, where MongoDB decides how each stage will actually run using specific operators and algorithms.

Step 5: Execution Engine

Slot-Based Execution Engine (SBE)

It uses small memory boxes (slots) instead of moving full documents, which makes execution faster and more efficient.

{
    "customerId": 1,
    "amount": 10000,
    "total": 1000
}

Step 6: Iterator Model

Volcano Model, next()

Each stage processes data by pulling one document at a time from the previous stage using the next() function.
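
Here is a minimal sketch of the iterator model. The class names `Scan`, `Match`, and `Project` are my own, and the real engine is written in C++, but the pull-based next() idea is the same:

```javascript
// Each operator exposes next(), returning a document or null when exhausted.
class Scan {
  constructor(docs) { this.docs = docs; this.pos = 0; }
  next() { return this.pos < this.docs.length ? this.docs[this.pos++] : null; }
}

class Match {
  constructor(child, predicate) { this.child = child; this.predicate = predicate; }
  next() {
    // Keep pulling from the child until a document passes the filter.
    let doc;
    while ((doc = this.child.next()) !== null) {
      if (this.predicate(doc)) return doc;
    }
    return null;
  }
}

class Project {
  constructor(child, fields) { this.child = child; this.fields = fields; }
  next() {
    const doc = this.child.next();
    if (doc === null) return null;
    return Object.fromEntries(this.fields.map((f) => [f, doc[f]]));
  }
}

// The last stage pulls from the one before it, which pulls from the one before that.
const plan = new Project(
  new Match(new Scan([{ a: 1, b: 2 }, { a: 5, b: 6 }, { a: 9, b: 10 }]), (d) => d.a > 2),
  ["b"]
);

const results = [];
for (let doc = plan.next(); doc !== null; doc = plan.next()) results.push(doc);
console.log(results);
```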

Step 7: Streaming vs. Blocking

Streaming means a stage processes one document at a time and passes it on immediately. Blocking means a stage must receive all of its input before it can produce any output. Stages like $match and $project are streaming, while $group and $sort are blocking.
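
The difference shows up clearly with generators. In this sketch (the function names are mine), a counter records how many input documents were read before each stage produced its first output:

```javascript
// Streaming stage: yields each matching document as soon as it arrives.
function* match(input, predicate) {
  for (const doc of input) if (predicate(doc)) yield doc;
}

// Blocking stage: must consume the whole input before it can yield anything.
function* sortBy(input, field) {
  const all = [...input];
  all.sort((a, b) => a[field] - b[field]);
  yield* all;
}

// Source that records how many documents have been read so far.
let read = 0;
function* source() {
  for (const n of [3, 1, 2]) {
    read++;
    yield { n };
  }
}

const firstFromMatch = match(source(), () => true).next().value;
const readForMatch = read;

read = 0;
const firstFromSort = sortBy(source(), "n").next().value;
const readForSort = read;

console.log(firstFromMatch, readForMatch);
console.log(firstFromSort, readForSort);
```

The streaming stage produced its first result after reading one document, while the blocking stage had to read all three first.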

Step 8: Stage Algorithms

Hash Aggregation, External Merge Sort, Nested Loop Join

It defines which algorithm is used in each stage to perform the operation.

e.g. $group → hash table; $sort → chunks → sort → merge → result
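
Both ideas can be sketched in a few lines. This is my own simplified version: a JavaScript Map plays the role of the hash table, and the sorted chunks stay in memory, whereas a real external sort would write them to disk:

```javascript
// Hash aggregation for $group: one hash-table entry per group key.
function hashGroupSum(docs, keyField, valueField) {
  const table = new Map();
  for (const doc of docs) {
    table.set(doc[keyField], (table.get(doc[keyField]) || 0) + doc[valueField]);
  }
  return [...table].map(([_id, total]) => ({ _id, total }));
}

// Merge-sort idea for $sort: split into chunks, sort each chunk, then merge.
function chunkedSort(values, chunkSize) {
  const chunks = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    chunks.push(values.slice(i, i + chunkSize).sort((a, b) => a - b));
  }
  const result = [];
  while (chunks.some((c) => c.length > 0)) {
    // Pick the chunk whose first (smallest) element is the overall minimum.
    let best = null;
    for (const c of chunks) {
      if (c.length > 0 && (best === null || c[0] < best[0])) best = c;
    }
    result.push(best.shift());
  }
  return result;
}

console.log(hashGroupSum(
  [{ customerId: 1, amount: 10 }, { customerId: 2, amount: 5 }, { customerId: 1, amount: 7 }],
  "customerId",
  "amount"
));
console.log(chunkedSort([5, 3, 8, 1, 9, 2], 2));
```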

Step 9: Expression Evaluation

Expression Tree, Evaluation Engine

It uses an expression evaluation engine to process expressions (like $sum and $multiply) by converting them into expression trees and evaluating them for each document.

What happens internally:

Doc1 → price = 100, quantity = 2 → totalPrice = 200  
Doc2 → price = 50, quantity = 3 → totalPrice = 150

For each document, MongoDB evaluates expressions like multiply by reading field values, applying the operation, and storing the result.
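
A tiny evaluator makes this concrete. The `evaluate` function below is a sketch of mine that supports only field paths, literals, $multiply, and $sum, so it is nowhere near MongoDB's real expression engine, but it walks the expression tree in the same spirit:

```javascript
// Evaluate a small expression tree against one document.
// Supports field paths ("$price"), literals, $multiply and $sum (a hypothetical subset).
function evaluate(expr, doc) {
  if (typeof expr === "string" && expr.startsWith("$")) return doc[expr.slice(1)]; // field path
  if (typeof expr !== "object" || expr === null) return expr; // literal
  const [operator, args] = Object.entries(expr)[0];
  const values = args.map((arg) => evaluate(arg, doc)); // evaluate child nodes first
  switch (operator) {
    case "$multiply": return values.reduce((a, b) => a * b, 1);
    case "$sum": return values.reduce((a, b) => a + b, 0);
    default: throw new Error(`Unsupported operator: ${operator}`);
  }
}

const totalPrice = { $multiply: ["$price", "$quantity"] };
const docs = [
  { price: 100, quantity: 2 },
  { price: 50, quantity: 3 },
];
console.log(docs.map((doc) => ({ ...doc, totalPrice: evaluate(totalPrice, doc) })));
```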

Step 10: Memory Management

Memory Threshold, Disk Spill, External Processing

It processes data in RAM when it fits, but when a stage exceeds its memory limit (100 MB per stage by default), it can temporarily spill to disk (controlled by the allowDiskUse option).

Step 11: Final output (Cursor)

Cursor, Batching, Lazy Fetching

Instead of sending all results at once, it sends results gradually using a cursor. This saves memory, provides a faster response, and is scalable for large datasets.
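
Here is a rough sketch of batching and lazy fetching. `createCursor` and `nextBatch` are hypothetical names of mine, not the driver's API. The idea is simply that the next batch is only produced when the consumer asks for it:

```javascript
// Hypothetical cursor: hands out results in batches and only produces the next
// batch when the consumer asks for it (lazy fetching).
function createCursor(source, batchSize) {
  const iterator = source[Symbol.iterator]();
  let done = false;
  return {
    nextBatch() {
      const batch = [];
      while (!done && batch.length < batchSize) {
        const step = iterator.next();
        if (step.done) done = true;
        else batch.push(step.value);
      }
      return batch;
    },
    hasMore: () => !done,
  };
}

const cursor = createCursor([1, 2, 3, 4, 5], 2);
console.log(cursor.nextBatch());
console.log(cursor.nextBatch());
console.log(cursor.nextBatch());
```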

The aggregation pipeline is not just a sequence of operations. It is a mini query engine that processes data efficiently using different strategies and algorithms.

Understanding this makes you a better developer when working with MongoDB.