What's in this blog
Ever wondered how YouTube generates videos in different resolutions and how those videos are sent to us in chunks? When you start watching a YouTube video, only part of it is downloaded, and as you progress beyond that part, more chunks of the video are downloaded. This ensures smooth playback and efficient use of bandwidth for the viewer.
Video transcoding: It is the process of converting a video file from one format to another. This is important for delivering video content in various resolutions and formats to support different devices and network conditions.
In this blog, I will share how I built a video transcoding service using Node.js. This service simply takes a video as input and outputs it in four different resolutions (360p, 480p, 720p, and 1080p) in HLS format. We'll discuss HLS (HTTP Live Streaming) later in this blog.
I've also built a frontend that fetches the generated video from the Node.js server and shows it in a video player in different resolutions (you can toggle between resolutions too).
How I got the idea to build this
I was planning to build a side project, basically an LMS website using Next.js & Node.js, and was writing down the features I was going to include. While listing the admin panel features, I wrote create course, create content, video uploading, and... wait wait wait.
I was stuck for a minute thinking "How will I allow users to change resolutions?", "Do we just upload the whole 2 GB video and let the video player handle this feature?", "What does YouTube do in such a case?", and bang!
So I decided to study a bit and see how YouTube's video uploading works, and after reading some articles and watching some videos I got to know about some interesting things like FFmpeg, HLS, video transcoding, the m3u8 file, and so on.
These things were interesting enough to make me build a video transcoding service for myself and then use it in the LMS project instead of directly uploading mp4 files to the server.
Let me tell you about some topics that you must know before proceeding to build a basic video transcoding service.
What is HLS (HTTP Live Streaming)
HLS, or HTTP Live Streaming, is a protocol developed by Apple in 2009 for streaming audio/video over the internet.
It breaks the overall video into small segments (typically 2-10 seconds long) that are delivered as plain HTTP file downloads, each segment containing a short piece of the overall video.
It creates a master playlist (.m3u8 file) and segmented files (.ts files).
Manifest File (m3u8): The segmented files are listed in a playlist or manifest file with a .m3u8 extension. This file provides the front end with URLs to the media segments and can include multiple playlists for different bitrates. You can think of it as a book's index page, where the chapters are listed with their page numbers and you use that page to navigate through the book. In the same way, we use the m3u8 file to navigate through the segments of the video.
Segment Files (.ts or MPEG-2 Transport Stream): The chunks of the video, i.e. the segmented video files, are in either .ts or fragmented .mp4 format. As mentioned earlier, these are typically 2-10 second long segments of the original video.
On the front end, we just serve the .m3u8 file, and the video player uses it to fetch the segments and show the video to the user.
HLS supports adaptive bitrate streaming, which means it can automatically adjust the quality of the video stream in real-time based on the viewer's network conditions. If the network bandwidth decreases, HLS switches to a lower bitrate version of the video to prevent buffering.
HLS uses standard HTTP for delivery, making it widely compatible across devices and platforms. It supports both live streaming and video-on-demand content, and includes features for content encryption and authentication.
HLS supports encryption and digital rights management (DRM) to protect content from unauthorized access and copying.
What is a video transcoding service and what does it have to do with HLS
As we learnt earlier, a video transcoding service primarily converts video files from one format or encoding to another. This includes changing codecs, adjusting bitrates, changing resolutions, modifying frame rates, and so on.
So in our case, we want to generate videos in HLS format, given the advantages we have read about.
What this service will do for us:
Encoding: The original video is encoded at multiple bitrates and multiple resolutions of the video is generated.
Segmentation: The encoded video files are broken into small segments (typically 6-10 seconds each).
Manifest File Creation: An .m3u8 playlist file is created, listing the URLs of the video segments.
Distribution: The segments(.ts files) and the playlist file(.m3u8 files) are uploaded to a web server or CDN.
Playback: The client (e.g., a web browser or media player) downloads the playlist file, retrieves the segments sequentially, and assembles them for playback.
FFmpeg : A gem for video transcoding!
FFmpeg is a powerful, open-source software suite for handling multimedia data. It is used for recording, converting, and streaming audio and video.
It offers a command-line tool that is used to convert video and audio in different formats. Along with that it is capable of scaling, cropping, and filtering video files.
It can also resize videos to different resolutions and frame rates.
It can alter video and audio bitrates to optimize for different use cases.
It can extract frames to create video thumbnails.
It offers various filters for tasks like de-noising, color correction, or adding watermarks.
FFmpeg can be integrated into different software to utilize its various features. It is used by many video platforms, media players, and transcoding services.
It is the core of a video transcoding service.
Video transcoding services often build their systems around FFmpeg, using it as the core engine for media processing while adding custom interfaces, job management, and scalability features.
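To get a feel for the command-line tool before we use it for HLS, here are two common one-liners (the input and output file names are just examples):
# Resize a video to 720p
ffmpeg -i input.mp4 -vf "scale=1280:720" output_720p.mp4
# Extract a frame at the 5-second mark to use as a thumbnail
ffmpeg -i input.mp4 -ss 00:00:05 -frames:v 1 thumbnail.jpg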
Decision Making for the MVP
So after learning the theory behind these topics, I started with a very basic set of features to build the MVP of my video transcoding service.
Functionality
Instead of going with a complex AWS architecture (S3 buckets, ECS, CDNs, etc.), I chose to keep it very simple: a single API endpoint that I can hit with a video file, which then generates multiple resolutions of that video and stores them directly on my local system in the backend folder. That's it.
Obviously, I'll have to deploy it someday, but first I wanted to at least have a working MVP.
Language Selection
I've been writing Node.js backends for most of my clients' projects and have a good grip on and comfort with it, so I chose it.
For the front end, I decided to go with React.js, as I just have to make an API call and integrate a media player with a resolution toggling feature.
Storage Selection
Storing gigabytes of files on the backend or in the local file system is obviously a bad idea, but this is an MVP, so for now I can go with it. Later on, I can switch to uploading to cloud storage like AWS S3.
FFmpeg Installation
To use FFmpeg, we need to install it: either use Homebrew (macOS) or download it directly from the official website. Verify your installation by running the ffmpeg -version command in the terminal.
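For example, that typically looks like this:
# macOS (Homebrew)
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Verify the installation
ffmpeg -version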
Beginning to code
The first step is to set up Node.js and create an API that accepts a video as payload and uploads it directly to the server, without converting it to HLS yet. That's it.
Required packages
We just require some basic packages (nothing fancy):
Dependencies
- cors - Enables Cross-Origin Resource Sharing (CORS) in Express applications. Allows the API to accept requests from different domains.
- express - A minimal Node.js framework for handling routing, requests, and responses.
- multer - Middleware for handling multipart/form-data, primarily used for uploading files. Manages video file uploads in the application.
- uuid - Generates unique identifiers (UUIDs); will be used for naming uploaded videos to avoid naming conflicts.
DevDependencies
- nodemon - Monitors for changes in code and automatically restarts the server. Improves development workflow with automatic application restarts.
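Once the project is initialized (next section), the runtime dependencies can be installed in one go:
npm install cors express multer uuid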
Node.js and Express setup
- Initialize a new Node.js project using npm init -y
- Install the development dependency using npm install --save-dev nodemon
- Update the package.json "scripts" object:
"scripts": {
  "dev": "nodemon"
}
- Start the development server using npm run dev
- To run the server without nodemon, add a "start" script and use npm run start
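One note: the code below uses ES module import syntax, so package.json must declare "type": "module" (otherwise you'd need .mjs files or a bundler). A minimal package.json could look roughly like this (the project name and entry point are just my assumptions, adjust them to your setup):
{
  "name": "video-transcoding-service",
  "version": "1.0.0",
  "type": "module",
  "main": "src/index.js",
  "scripts": {
    "dev": "nodemon",
    "start": "node src/index.js"
  }
}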
Making API for uploading video
- src/index.js
// Import necessary modules
import express from 'express'
import cors from 'cors'
import { v4 as uuid } from 'uuid'
import { uploader } from './middlewares/uploader.js'
// Set the port for the server, using an environment variable or defaulting to 2000
const port = process.env.PORT || 2000
// Initialize Express application
const app = express()
// Enable CORS for all routes
app.use(cors())
// Parse JSON bodies
app.use(express.json())
// Parse URL-encoded bodies
app.use(express.urlencoded({ extended: false }))
// Define the upload route
app.post('/api/upload', uploader('video'), (req, res) => {
// Check if a file was uploaded
if (!req.file) {
// If no file was uploaded, return a 400 Bad Request status
return res.status(400).send('Video not sent!')
}
// Generate a unique ID for the video
const videoId = uuid()
// Get the path where the file was uploaded
const uploadedVideoPath = req.file.path
// Return a success response
return res.status(200).json({
success: true,
message: 'Video uploaded successfully',
videoId: videoId,
filePath: uploadedVideoPath
})
})
// Start the server
app.listen(port, () => {
console.log(`Server is running at ${port}`)
})
- src/middlewares/uploader.js (create an uploads folder in the project root directory)
import multer from 'multer'
import { v4 as uuid } from 'uuid'
// Configure Multer for file storage
const multerConfig = () => {
// Set up disk storage for uploaded files
const storage = multer.diskStorage({
// Define the destination directory for uploaded files
destination: function (req, file, cb) {
cb(null, `./uploads/`)
},
// Define the filename for uploaded files
filename: function (req, file, cb) {
// Extract the file extension
const fileExtension = file.originalname.split('.').pop()
// Generate a unique filename using timestamp and UUID
cb(null, `${Date.now()}-${uuid()}.${fileExtension}`)
},
})
// Create and return a Multer instance with the configured storage
const upload = multer({ storage })
return upload
}
// Export the uploader middleware
export const uploader = (fieldName) => {
// Return a middleware function
return (req, res, next) => {
// Get the Multer upload instance
const upload = multerConfig()
// Set up Multer to handle a single file upload for the specified field
const isUploaded = upload.single(fieldName)
// Execute the file upload
isUploaded(req, res, function (error) {
if (error) {
// If there's an error, return a 400 status with an error message
return res
.status(400)
.json({ error: error.message ?? 'File upload failed!' })
}
// If upload is successful, move to the next middleware
next()
})
}
}
This much code takes the video as input and uploads it to the uploads folder on the backend.
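You can quickly sanity-check the endpoint with curl (the file name here is just an example); the field name must be video because that is what we pass to uploader('video'):
curl -F "video=@sample.mp4" http://localhost:2000/api/upload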
In the above code, we generate the videoId but never use it; now is the time. We are going to change this API to run FFmpeg commands and generate 360p, 480p, 720p, and 1080p resolutions of the uploaded video in a separate folder.
The explanation of the FFmpeg command is given below the code.
Using FFmpeg in the API
- src/index.js (final code)
// Import necessary modules
import express from 'express'
import cors from 'cors'
import { v4 as uuid } from 'uuid'
import { uploader } from './middlewares/uploader.js'
import fs from 'fs' // NEW
// exec - Needed to execute command in shell
import { exec } from 'child_process' // NEW
import path from 'path' // NEW
// Set up port, defaulting to 2000 if not specified in environment
const port = process.env.PORT || 2000
// Initialize Express application
const app = express()
// Enable CORS for all routes
app.use(cors())
// Parse JSON and URL-encoded bodies
app.use(express.json())
app.use(express.urlencoded({ extended: false }))
// Serve HLS output files statically (NEW)
app.use('/hls-output', express.static(path.join(process.cwd(), 'hls-output')))
// Define route for video upload
app.post('/api/upload', uploader('video'), (req, res) => {
// Check if a file was uploaded
if (!req.file) {
return res.status(400).send('Video not sent!')
}
// Generate a unique ID for the video
const videoId = uuid()
const uploadedVideoPath = req.file.path
// Define output folder structure (NEW)
const outputFolderRootPath = `./hls-output/${videoId}`
const outputFolderSubDirectoryPath = {
'360p': `${outputFolderRootPath}/360p`,
'480p': `${outputFolderRootPath}/480p`,
'720p': `${outputFolderRootPath}/720p`,
'1080p': `${outputFolderRootPath}/1080p`,
}
// Create directories if they don't exist, for storing output video (NEW)
if (!fs.existsSync(outputFolderRootPath)) {
// ./hls-output/video-id/360p/
fs.mkdirSync(outputFolderSubDirectoryPath['360p'], { recursive: true })
// ./hls-output/video-id/480p/
fs.mkdirSync(outputFolderSubDirectoryPath['480p'], { recursive: true })
// ./hls-output/video-id/720p/
fs.mkdirSync(outputFolderSubDirectoryPath['720p'], { recursive: true })
// ./hls-output/video-id/1080p/
fs.mkdirSync(outputFolderSubDirectoryPath['1080p'], { recursive: true })
}
// Define FFmpeg commands for different resolutions (NEW)
const ffmpegCommands = [
// 360p resolution
`ffmpeg -i ${uploadedVideoPath} -vf "scale=w=640:h=360" -c:v libx264 -b:v 800k -c:a aac -b:a 96k -f hls -hls_time 15 -hls_playlist_type vod -hls_segment_filename "${outputFolderSubDirectoryPath['360p']}/segment%03d.ts" -start_number 0 "${outputFolderSubDirectoryPath['360p']}/index.m3u8"`,
// 480p resolution
`ffmpeg -i ${uploadedVideoPath} -vf "scale=w=854:h=480" -c:v libx264 -b:v 1400k -c:a aac -b:a 128k -f hls -hls_time 15 -hls_playlist_type vod -hls_segment_filename "${outputFolderSubDirectoryPath['480p']}/segment%03d.ts" -start_number 0 "${outputFolderSubDirectoryPath['480p']}/index.m3u8"`,
// 720p resolution
`ffmpeg -i ${uploadedVideoPath} -vf "scale=w=1280:h=720" -c:v libx264 -b:v 2800k -c:a aac -b:a 128k -f hls -hls_time 15 -hls_playlist_type vod -hls_segment_filename "${outputFolderSubDirectoryPath['720p']}/segment%03d.ts" -start_number 0 "${outputFolderSubDirectoryPath['720p']}/index.m3u8"`,
// 1080p resolution
`ffmpeg -i ${uploadedVideoPath} -vf "scale=w=1920:h=1080" -c:v libx264 -b:v 5000k -c:a aac -b:a 192k -f hls -hls_time 15 -hls_playlist_type vod -hls_segment_filename "${outputFolderSubDirectoryPath['1080p']}/segment%03d.ts" -start_number 0 "${outputFolderSubDirectoryPath['1080p']}/index.m3u8"`,
]
// Function to execute a single FFmpeg command (NEW)
const executeCommand = (command)=> {
return new Promise((resolve, reject) => {
// Execute ffmpeg command in shell
exec(command, (error, stdout, stderr) => {
if (error) {
console.error(`exec error: ${error}`)
reject(error)
} else {
resolve()
}
})
})
}
// Execute all FFmpeg commands concurrently (NEW)
Promise.all(ffmpegCommands.map((cmd) => executeCommand(cmd)))
.then(() => {
// Create master playlist
const masterPlaylistPath = `${outputFolderRootPath}/index.m3u8` // ./hls-output/video-id/index.m3u8
const masterPlaylistContent = `
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480
480p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/index.m3u8
`.trim()
fs.writeFileSync(masterPlaylistPath, masterPlaylistContent) // write the above content in the index.m3u8 file
// Creating URLs for accessing the video streams
const videoUrls = {
master: `http://localhost:${port}/hls-output/${videoId}/index.m3u8`,
'360p': `http://localhost:${port}/hls-output/${videoId}/360p/index.m3u8`,
'480p': `http://localhost:${port}/hls-output/${videoId}/480p/index.m3u8`,
'720p': `http://localhost:${port}/hls-output/${videoId}/720p/index.m3u8`,
'1080p': `http://localhost:${port}/hls-output/${videoId}/1080p/index.m3u8`,
}
// Send success response with video URLs
return res.status(200).json({ videoId, videoUrls })
})
.catch((error) => {
console.error(`HLS conversion error: ${error}`)
// Clean up: Delete the uploaded video file
try {
fs.unlinkSync(uploadedVideoPath)
} catch (err) {
console.error(`Failed to delete original video file: ${err}`)
}
// Clean up: Delete the generated HLS files and folders
try {
fs.rmSync(outputFolderRootPath, { recursive: true, force: true }) // unlinkSync cannot remove directories
} catch (err) {
console.error(`Failed to delete generated HLS files: ${err}`)
}
// Send error response
return res.status(500).send('HLS conversion failed!')
})
})
// Start the server
app.listen(port, () => {
console.log(`Server is running at ${port}`)
})
First, read the comments in the above code carefully to get a high-level idea of what is being done.
We are creating an hls-output folder. Whenever you upload a video, a parent folder named with the uuid (the video-id) is created inside it, and this folder contains the master index.m3u8 file and the 360p, 480p, 720p, and 1080p sub-folders. These sub-folders contain the video segments (.ts files), generated by running FFmpeg through the exec command. Every sub-folder also has its own index.m3u8 file.
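To visualise it, the output for one upload should look roughly like this (segment counts depend on the video length):
hls-output/
└── <video-id>/
    ├── index.m3u8        (master playlist, written by us)
    ├── 360p/
    │   ├── index.m3u8
    │   ├── segment000.ts
    │   ├── segment001.ts
    │   └── ...
    ├── 480p/
    ├── 720p/
    └── 1080p/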
If any error occurs for any reason, the uploaded video file and the generated output folders are deleted.
The master index.m3u8 file is very important: it contains references to all the generated resolutions, and it is the file that will be served to the front end and used in the video player. The index.m3u8 files in the sub-folders (360p, 480p, 720p, etc.) contain the segment-related data, i.e. they help the player navigate through the video segments.
Let's see what the FFmpeg command and the index.m3u8 content mean.
FFmpeg commands & index.m3u8 content explanation
FFmpeg Commands
Let's take the command that generates the 360p resolution of the video.
ffmpeg -i ${uploadedVideoPath} -vf "scale=w=640:h=360" -c:v libx264 -b:v 800k -c:a aac -b:a 96k -f hls -hls_time 15 -hls_playlist_type vod -hls_segment_filename "${outputFolderSubDirectoryPath['360p']}/segment%03d.ts" -start_number 0 "${outputFolderSubDirectoryPath['360p']}/index.m3u8"
1. ffmpeg: Runs FFmpeg from the command line.
2. -i ${uploadedVideoPath}: Specifies the input video file.
3. -vf "scale=w=width:h=height": Applies a video filter to scale the video to the specified resolution, in this case 640x360.
4. -c:v libx264: Specifies the video codec to use, in this case H.264.
5. -b:v video_bitrate: Sets the target video bitrate.
6. -c:a aac: Specifies the audio codec to use, in this case AAC.
7. -b:a audio_bitrate: Sets the target audio bitrate, in this case 96 kbps.
8. -f hls: Specifies the output format, in this case HLS (HTTP Live Streaming).
9. -hls_time segment_duration: Sets the duration of each HLS segment in seconds, in this case 15 seconds.
10. -hls_playlist_type playlist_type: Sets the playlist type, in this case VOD (Video on Demand).
11. -hls_segment_filename "segment_filename_pattern": Specifies the pattern for naming the segment files, in this case segment000.ts, segment001.ts, segment002.ts, and so on.
12. -start_number start_number: Sets the starting number for the segments, in this case 0.
13. "output_playlist.m3u8": Specifies the output playlist file for this resolution, in this case index.m3u8.
The same command is used with different parameters for other resolutions.
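Since the four commands differ only in resolution and bitrates, you could also build them from a small config instead of hard-coding each one. Here is a rough sketch of that idea (the renditions table and the buildFfmpegCommand helper are my own additions, not part of the code above):
// Rendition settings matching the four hard-coded commands
const renditions = [
  { name: '360p', width: 640, height: 360, videoBitrate: '800k', audioBitrate: '96k' },
  { name: '480p', width: 854, height: 480, videoBitrate: '1400k', audioBitrate: '128k' },
  { name: '720p', width: 1280, height: 720, videoBitrate: '2800k', audioBitrate: '128k' },
  { name: '1080p', width: 1920, height: 1080, videoBitrate: '5000k', audioBitrate: '192k' },
]
// Build the FFmpeg command for one rendition
const buildFfmpegCommand = (inputPath, outputRoot, r) =>
  `ffmpeg -i ${inputPath} -vf "scale=w=${r.width}:h=${r.height}" ` +
  `-c:v libx264 -b:v ${r.videoBitrate} -c:a aac -b:a ${r.audioBitrate} ` +
  `-f hls -hls_time 15 -hls_playlist_type vod ` +
  `-hls_segment_filename "${outputRoot}/${r.name}/segment%03d.ts" ` +
  `-start_number 0 "${outputRoot}/${r.name}/index.m3u8"`
// Usage (inside the upload route):
// const ffmpegCommands = renditions.map((r) => buildFfmpegCommand(uploadedVideoPath, outputFolderRootPath, r))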
index.m3u8 (Master Playlist)
This master playlist allows the frontend to select from multiple variant playlists based on the current network conditions and device capabilities. The frontend can switch between these streams seamlessly to provide an optimal viewing experience, ensuring the video plays smoothly even if the network conditions change.
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480
480p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/index.m3u8
1. #EXTM3U - This is the standard opening tag for M3U8 files, indicating that this is an extended M3U playlist.
2. #EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
- #EXT-X-STREAM-INF: This tag introduces a media stream within the playlist.
- BANDWIDTH=800000: Specifies the peak bitrate of the stream in bits per second (bps). For this stream, the peak bitrate is 800,000 bps (800 kbps).
- RESOLUTION=640x360: Specifies the resolution of the video stream. For this stream, the resolution is 640 pixels wide by 360 pixels high.
- 360p/index.m3u8: This is the URI to the index.m3u8 file for the 360p resolution stream. The client will use this playlist to fetch the 360p media segments for playback.
3. #EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=854x480 - Similar to the previous tag but for a higher bitrate and resolution stream.
- BANDWIDTH=1400000: Specifies the peak bitrate of the stream as 1,400,000 bps (1,400 kbps).
- RESOLUTION=854x480: Specifies the resolution as 854 pixels wide by 480 pixels high.
- 480p/index.m3u8: URI to the index.m3u8 file for the 480p resolution stream.
4. #EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1280x720
- BANDWIDTH=2800000: Specifies the peak bitrate of the stream as 2,800,000 bps (2,800 kbps).
- RESOLUTION=1280x720: Specifies the resolution as 1280 pixels wide by 720 pixels high.
- 720p/index.m3u8: URI to the index.m3u8 file for the 720p resolution stream.
5. #EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
- BANDWIDTH=5000000: Specifies the peak bitrate of the stream as 5,000,000 bps (5,000 kbps).
- RESOLUTION=1920x1080: Specifies the resolution as 1920 pixels wide by 1080 pixels high.
- 1080p/index.m3u8: URI to the index.m3u8 file for the 1080p resolution stream.
index.m3u8 (Sub folder Playlist)
Let's see what 360p/index.m3u8, 480p/index.m3u8, 720p/index.m3u8, and 1080p/index.m3u8 will contain.
This content is generated by the FFmpeg command; we only have to create the master playlist ourselves. There will be minor differences between the index.m3u8 files, but most of them will have almost the same content.
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:21
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-PLAYLIST-TYPE:VOD
#EXTINF:20.833333,
segment000.ts
#EXTINF:10.416667,
segment001.ts
#EXTINF:20.833333,
segment002.ts
#EXTINF:10.416667,
segment003.ts
#EXTINF:20.833333,
segment004.ts
#EXTINF:10.416667,
segment005.ts
#EXTINF:4.625000,
segment006.ts
#EXT-X-ENDLIST
1. #EXTM3U: This is the standard opening tag for M3U8 files, indicating that this is an extended M3U playlist.
2. #EXT-X-VERSION:3: Specifies the version of the M3U8 file. Version 3 is being used in this playlist.
3. #EXT-X-TARGETDURATION:21: Specifies the maximum duration (in seconds) of any segment in the playlist. The longest segment in this playlist is 21 seconds.
4. #EXT-X-MEDIA-SEQUENCE:0: Indicates the sequence number of the first segment in the playlist. This helps the client to keep track of the segments it has already played.
5. #EXT-X-PLAYLIST-TYPE:VOD: Indicates that this playlist is for video on demand (VOD) content. This means the playlist is complete and will not change.
6. #EXTINF:20.833333,: Specifies the duration of the following media segment (segment000.ts) in seconds. This segment is approximately 20.833333 seconds long.
7. segment000.ts: URI to the first media segment.
8. #EXTINF:10.416667,: Specifies the duration of the following media segment (segment001.ts) in seconds. This segment is approximately 10.416667 seconds long.
9. segment001.ts: URI to the second media segment.
10. #EXTINF:20.833333,: Specifies the duration of the following media segment (segment002.ts) in seconds. This segment is approximately 20.833333 seconds long.
11. #EXT-X-ENDLIST: Indicates that this is the end of the playlist. No more media segments will be added.
Conclusion
So, after building the backend, I've learnt several things: FFmpeg commands, how to create a master playlist file to navigate through the different resolutions of the video segments, and what the index.m3u8 files generated by FFmpeg for each resolution contain.
Next Step
Creating a video player with a resolution toggling feature and integrating it with this service.
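As a rough preview (not the exact player covered in the video below), the master playlist can be played in the browser with the hls.js library, which also exposes the quality levels needed for a resolution toggle:
// Minimal hls.js sketch, assuming a <video> element on the page and hls.js installed
import Hls from 'hls.js'
const video = document.querySelector('video')
// URL returned by our /api/upload response (videoUrls.master); <video-id> is illustrative
const masterUrl = 'http://localhost:2000/hls-output/<video-id>/index.m3u8'
if (Hls.isSupported()) {
  const hls = new Hls()
  hls.loadSource(masterUrl)
  hls.attachMedia(video)
  hls.on(Hls.Events.MANIFEST_PARSED, () => {
    // Each level corresponds to one variant in the master playlist (360p, 480p, 720p, 1080p)
    console.log(hls.levels.map((level) => level.height))
  })
  // To force a specific resolution, set hls.currentLevel to its index; -1 means automatic (adaptive)
  // hls.currentLevel = 2
} else if (video.canPlayType('application/vnd.apple.mpegurl')) {
  // Safari plays HLS natively
  video.src = masterUrl
}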
To learn it you need to watch the YouTube video: CLICK HERE
For complete source code (GitHub): CLICK HERE