- The current hardware bottleneck isn't I/O anymore but system calls.
Each system call causes a CPU mode switch between user mode and kernel mode. The switch costs 1000-1500 CPU cycles.
On a 3GHz processor, 1000-1500 cycles is about 330-500 nanoseconds. This might sound negligibly fast, but modern SSDs can handle over 1 million operations per second. If each operation requires a system call, you're burning up to 1.5 billion cycles per second just on mode switching.
A package manager can trigger 50k+ system calls to install React, for example.
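The arithmetic above can be checked with a short sketch. The function names are made up for illustration, and the install estimate assumes every system call pays the upper-bound cost of 1500 cycles, which is a simplification.

```python
CPU_HZ = 3_000_000_000  # 3 GHz processor, as in the note above


def cycles_to_ns(cycles, cpu_hz=CPU_HZ):
    """Convert a number of CPU cycles into nanoseconds."""
    return cycles / cpu_hz * 1e9


def cycles_burned_per_second(ops_per_second, cycles_per_op):
    """Cycles spent on mode switches if every operation needs one syscall."""
    return ops_per_second * cycles_per_op


def install_overhead_ms(syscalls, cycles_per_syscall, cpu_hz=CPU_HZ):
    """Mode-switch time for an install, in milliseconds (assumed uniform cost)."""
    return cycles_to_ns(syscalls * cycles_per_syscall, cpu_hz) / 1e6


low_ns = cycles_to_ns(1000)    # roughly 333 ns
high_ns = cycles_to_ns(1500)   # roughly 500 ns
burned = cycles_burned_per_second(1_000_000, 1500)   # 1.5 billion cycles
install_ms = install_overhead_ms(50_000, 1500)       # roughly 25 ms
```

Under these assumptions, 50k system calls cost about 25 ms in mode switching alone, before any real work is done.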
- JS adds overhead, especially in NodeJS, which goes through several layers. Reading the content of a file takes more steps in the pipeline, which is why Bun reads package.json 2.2x faster than NodeJS.
Another use case is string optimization. package-lock files have an expected format with predefined strings (MIT, license, etc.). These repeated strings can be optimized, for example by storing each one only once.
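One common way to optimize repeated strings is interning: each distinct string is stored once and referred to by a small integer. The sketch below is only an illustration of that idea. `StringPool` is a hypothetical name, not Bun's actual implementation.

```python
class StringPool:
    """Stores each distinct string once and hands out integer ids for it."""

    def __init__(self):
        self._ids = {}       # string -> id
        self._strings = []   # id -> string

    def intern(self, value):
        """Return the id for value, adding it to the pool on first sight."""
        existing = self._ids.get(value)
        if existing is not None:
            return existing
        new_id = len(self._strings)
        self._ids[value] = new_id
        self._strings.append(value)
        return new_id

    def lookup(self, string_id):
        """Return the string that an id stands for."""
        return self._strings[string_id]

    def __len__(self):
        return len(self._strings)


pool = StringPool()
# Three packages, two distinct license strings: only two are stored.
license_ids = [pool.intern(s) for s in ["MIT", "ISC", "MIT"]]
```

Comparing two interned strings then becomes an integer comparison instead of a byte-by-byte one.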
The manifest of each package is stored in a binary format
Bun stores the response's ETag and sends it back in an If-None-Match header.
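The ETag / If-None-Match exchange can be sketched as follows. This is a minimal illustration of HTTP conditional requests, not Bun's code: `ManifestCache` and `fake_registry` are hypothetical names, and the network is replaced by a stand-in function so the example runs offline.

```python
class ManifestCache:
    """Caches response bodies by URL and revalidates them with their ETag."""

    def __init__(self, fetch):
        # fetch(url, headers) -> (status, response_headers, body); assumed shape
        self._fetch = fetch
        self._entries = {}  # url -> (etag, body)

    def get(self, url):
        headers = {}
        cached = self._entries.get(url)
        if cached is not None:
            # Ask the server to send a body only if the resource changed.
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and cached is not None:
            return cached[1]  # Not Modified: reuse the cached body
        etag = response_headers.get("ETag")
        if etag is not None:
            self._entries[url] = (etag, body)
        return body


calls = []


def fake_registry(url, headers):
    """Stand-in for a registry: answers 304 when the client's ETag matches."""
    calls.append(dict(headers))
    current_etag = '"v1"'
    if headers.get("If-None-Match") == current_etag:
        return 304, {"ETag": current_etag}, b""
    return 200, {"ETag": current_etag}, b'{"name": "example"}'


cache = ManifestCache(fake_registry)
first = cache.get("https://registry.example/example")
second = cache.get("https://registry.example/example")  # served from cache
```

The second request still hits the server, but the 304 response has no body, so nothing has to be downloaded or parsed again.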
- The buffer for tarball decompression is sized in advance. When the data size is unknown, the buffer has to be reallocated repeatedly as it grows. Bun buffers the entire tarball before decompressing. Most JS packages are around 1MB, so this is fine (even the ts package, at 50MB, is OK).
The uncompressed size is known up front because the gzip format stores it in its last 4 bytes.
Bun uses libdeflate, which is optimized with SIMD instructions.
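Reading the size from the gzip trailer can be sketched like this. It uses Python's zlib rather than libdeflate, so it only illustrates the "size first, then decompress" idea. The function names are hypothetical. Note that the trailer holds the size modulo 2^32 and describes only the last member of a multi-member file.

```python
import gzip
import struct
import zlib


def gzip_uncompressed_size(data):
    """Read ISIZE: the last 4 bytes of a gzip file, little-endian."""
    if len(data) < 18:  # 10-byte header plus 8-byte trailer at minimum
        raise ValueError("not a complete gzip file")
    (size,) = struct.unpack("<I", data[-4:])
    return size


def decompress_with_known_size(data):
    """Decompress a fully buffered gzip file, sizing the output up front."""
    size = gzip_uncompressed_size(data)
    # wbits=31 selects the gzip wrapper; bufsize is the initial output size.
    return zlib.decompress(data, wbits=31, bufsize=max(size, 1))


payload = b"hello tarball " * 1000
archive = gzip.compress(payload)
expected_size = gzip_uncompressed_size(archive)
restored = decompress_with_known_size(archive)
```

Because the whole archive is already in memory, getting the size is a slice of the last 4 bytes rather than a pass over the data.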
The NodeJS equivalent is a readStream, which is less efficient than a seek operation.
- Cache-friendly data layout
Parsed JSON is inefficient to traverse because reaching each string takes several pointer hops. "The CPU accesses a pointer that tells it where Next's data is located in memory. This data then contains yet another pointer to where its dependencies live, which in turn contains more pointers to the actual dependency strings."
Fetching data from RAM is slow, so the CPU loads data into its cache one cache line at a time.
Because JSON (and especially JS objects) is scattered randomly in RAM, cache lines are used inefficiently: only a few bytes of each loaded line are actually needed.
Cache lines work great for data that's stored sequentially, but they backfire when your data is scattered randomly across memory.
The nested structure of objects creates what's called "pointer chasing", a common anti-pattern in systems programming.
For a project with 1000 packages averaging 5 dependencies, that's 2ms of pure memory latency.
- Structure of arrays (SoA) instead of array of structs (AoS)
Bun uses large contiguous buffers. Since each package entry is 8 bytes, accessing packages[0] makes the CPU load an entire 64-byte cache line, which covers packages[0] to packages[7].
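The difference between the two layouts can be sketched as below. This is an illustration of the data layout only: Python cannot show the cache behaviour, and `PackageTable` with its field names is a hypothetical design, not Bun's lockfile structure.

```python
from array import array

# Array of structs: one separate object per package, each with its own
# nested list, so following dependencies means chasing pointers.
packages_aos = [
    {"name_id": 0, "version_id": 10, "deps": [1, 2]},
    {"name_id": 1, "version_id": 11, "deps": []},
    {"name_id": 2, "version_id": 12, "deps": [1]},
]


class PackageTable:
    """Structure of arrays: one contiguous typed buffer per field."""

    def __init__(self):
        self.name_id = array("I")
        self.version_id = array("I")
        self.deps_start = array("I")  # offset of this package's deps in self.deps
        self.deps_len = array("I")    # number of deps for this package
        self.deps = array("I")        # all dependency indices, back to back

    def add(self, name_id, version_id, dep_indices):
        """Append a package and return its index."""
        index = len(self.name_id)
        self.name_id.append(name_id)
        self.version_id.append(version_id)
        self.deps_start.append(len(self.deps))
        self.deps_len.append(len(dep_indices))
        self.deps.extend(dep_indices)
        return index

    def dependencies(self, index):
        """Return the dependency indices of a package as a list."""
        start = self.deps_start[index]
        return list(self.deps[start:start + self.deps_len[index]])


table = PackageTable()
for pkg in packages_aos:
    table.add(pkg["name_id"], pkg["version_id"], pkg["deps"])
```

Scanning one field, such as every `version_id`, now walks a single contiguous buffer instead of visiting one object per package.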
As a sidenote: Bun originally used a binary lockfile format (bun.lockb) to avoid JSON parsing overhead entirely, but binary files are impossible to review in pull requests and can't be merged when conflicts happen.
- File copying
Copying a file can be expensive because the data first passes through kernel memory. There are ways to optimize it, though.
On macOS, clonefile can clone an entire directory in a single system call, so the work drops from O(n) system calls to O(1).
Linux has hardlinks, with fallbacks such as ioctl_ficlone for Btrfs and XFS, copy_file_range, or sendfile.
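A fallback chain of this kind can be sketched in Python. This is a simplified illustration, not Bun's logic: it tries a hardlink, then `os.copy_file_range` where the platform provides it, then a plain copy, and it skips the ioctl_ficlone and sendfile steps. `install_file` is a hypothetical name.

```python
import os
import shutil


def install_file(src, dst):
    """Place src at dst with the cheapest strategy that works.

    Returns the name of the strategy that succeeded.
    """
    try:
        os.link(src, dst)  # hardlink: no data is copied at all
        return "hardlink"
    except OSError:
        pass  # e.g. different filesystems, or dst already exists

    if hasattr(os, "copy_file_range"):  # Linux only, Python 3.8+
        try:
            with open(src, "rb") as fin, open(dst, "wb") as fout:
                remaining = os.fstat(fin.fileno()).st_size
                while remaining > 0:
                    # The kernel copies the bytes without a user-space buffer.
                    copied = os.copy_file_range(
                        fin.fileno(), fout.fileno(), remaining
                    )
                    if copied == 0:
                        break
                    remaining -= copied
            if remaining == 0:
                return "copy_file_range"
        except OSError:
            pass  # unsupported filesystem or kernel

    shutil.copyfile(src, dst)  # last resort: ordinary copy
    return "copyfile"
```

Each step is cheaper than the next one, so the order matters: the expensive plain copy only runs when nothing else is available.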
- Multi-Core parallelism
Bun uses lock-free data structures. It also uses a thread pool, with up to 64 concurrent HTTP connections.
Each thread gets its own memory pool.
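The combination of a bounded pool and per-thread memory can be sketched as below. This is only an approximation in Python: it does not show lock-free data structures, the download is faked, and the names (`thread_buffer`, `fake_download`, `download_all`) are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 64  # upper bound on concurrent downloads, as in the note
_local = threading.local()


def thread_buffer(size=64 * 1024):
    """Return this thread's private scratch buffer, creating it on first use."""
    buf = getattr(_local, "buffer", None)
    if buf is None:
        buf = bytearray(size)
        _local.buffer = buf
    return buf


def fake_download(name):
    """Stand-in for an HTTP download that writes into the thread's buffer."""
    buf = thread_buffer()
    payload = name.encode()
    buf[:len(payload)] = payload  # no other thread touches this buffer
    return name, len(payload)


def download_all(names):
    """Run the downloads on at most MAX_CONNECTIONS worker threads."""
    with ThreadPoolExecutor(max_workers=MAX_CONNECTIONS) as pool:
        return list(pool.map(fake_download, names))


results = download_all(["react", "next", "typescript"])
```

Since each worker only writes to its own buffer, the workers never need to coordinate over memory.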
- Conclusion
[...] npm gave us a foundation to build on, yarn made managing workspaces less painful, and pnpm came up with a clever way to save space and speed things up with hardlinks. Each worked hard to solve the problems developers were actually hitting at the time. But that world no longer exists. SSDs are 70× faster, CPUs have dozens of cores, and memory is cheap. The real bottleneck shifted from hardware speed to software abstractions. [...] The tools that will define the next decade of developer productivity are being written right now, by teams who understand that performance bottlenecks shifted when storage got fast and memory got cheap. Installing packages 25x faster isn't "magic": it's what happens when tools are built for the hardware we actually have.
An alternative to Deno and NodeJS. It seems promising, but it's too soon to use it yet.