This is huge:
Cores may stay idle for seconds while ready threads are waiting in runqueues. In our experiments, these performance bugs caused many-fold performance degradation for synchronization-heavy scientific applications, 13% higher latency for kernel make, and a 14-23% decrease in TPC-H throughput for a widely used commercial database.
DOI: https://dl.acm.org/doi/10.1145/2901318.2901326
It may be useful to read it completely.
Fixes:
- compare the minimum load of each scheduling group instead of the average
- Linux spawns threads on the same core as their parent thread: a node can steal threads from another node by comparing the average load
and two others
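A minimal sketch of why the first fix matters, assuming a toy model where each scheduling group is just a list of per-core loads. The function name and the load numbers are made up for illustration; this is not the kernel's load-balancing code.

```python
from statistics import mean

def should_steal(local_loads, remote_loads, metric):
    """Return True when the local group would try to steal from the remote group."""
    return metric(local_loads) < metric(remote_loads)

# Hypothetical per-core loads: node A has one very heavy thread and three idle cores,
# node B has four moderately loaded cores.
node_a = [1000, 0, 0, 0]
node_b = [200, 200, 200, 200]

# Average: node A looks busier (250 vs 200), so its idle cores never steal work.
steals_with_average = should_steal(node_a, node_b, mean)

# Minimum: node A clearly has an idle core (0 vs 200), so it steals work.
steals_with_minimum = should_steal(node_a, node_b, min)
```

With the average, a single heavy thread hides three idle cores; with the minimum, the idle core is visible.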
It is also worth looking at their tools (an online sanity checker for invariants such as "No core remains idle while another core is overloaded").
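The quoted invariant can be sketched as a check over a snapshot of runqueue lengths. This is a simplified illustration, not the authors' tool: the function name and the "two or more runnable threads means overloaded" threshold are assumptions, and a real checker also has to tolerate short-lived violations.

```python
def find_violations(runqueue_lengths):
    """Return (idle_core, overloaded_core) pairs that break the invariant.

    runqueue_lengths[i] is the number of runnable threads on core i.
    Assumption of this sketch: a core is overloaded when it has at least
    two runnable threads, because one of them could run elsewhere.
    """
    idle = [core for core, n in enumerate(runqueue_lengths) if n == 0]
    overloaded = [core for core, n in enumerate(runqueue_lengths) if n >= 2]
    return [(i, o) for i in idle for o in overloaded]

# Core 0 is idle while core 1 has three runnable threads.
violations = find_violations([0, 3, 1])
```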
During the 00s, dozens of papers described new scheduling algorithms, [... but] few of them were adopted in mainstream operating systems, mainly because it is not clear how to integrate all these ideas in scheduler safely.
Similarly, the Related Work section describes the current state of research in other domains: performance bugs, kernel correctness, tracing.
The resources are available on Github: https://github.com/jplozi/wastedcores
We can expect an 8x speedup for a big transaction.
Optimization is not always progress in every field.
Similar to https://shaarli.lyokolux.space/shaare/xwiTHQ
Bit operators are the fastest, then static arrays, then dynamic arrays.
Objects are heavy in comparison.
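To make the comparison concrete, here is the same membership test written once with a bit mask and once with a hash set. The flag names are hypothetical, and the ranking in these notes comes from the linked article; relative costs in Python may differ, so this only illustrates the two representations.

```python
# Hypothetical permission flags, one bit each.
READ, WRITE, EXEC = 1 << 0, 1 << 1, 1 << 2

def has_all_mask(granted: int, required: int) -> bool:
    """Bit mask version: one AND and one comparison, no allocation."""
    return granted & required == required

def has_all_set(granted: set, required: set) -> bool:
    """Hash set version: same answer, but every member is hashed and looked up."""
    return required <= granted

mask_ok = has_all_mask(READ | WRITE, READ)
set_ok = has_all_set({"read", "write"}, {"read"})
```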
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA busy_timeout = 5000;
PRAGMA cache_size = -20000;
PRAGMA foreign_keys = ON;
PRAGMA auto_vacuum = INCREMENTAL;
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 2147483648;
PRAGMA page_size = 8192;
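A hedged sketch of applying these settings from Python's standard sqlite3 module. The helper name open_configured is made up. One assumption differs from the listing above: page_size and auto_vacuum are issued first, because SQLite only honours page_size on a new database (or after a VACUUM) and does not allow changing it once the database is in WAL mode. Most of the other settings are per connection, so they are re-applied on every open.

```python
import sqlite3

# Same values as the listing; only the order is changed (assumption of this sketch).
PRAGMAS = [
    "PRAGMA page_size = 8192",           # must come before the database file is created
    "PRAGMA auto_vacuum = INCREMENTAL",  # must come before any table is created
    "PRAGMA journal_mode = WAL",
    "PRAGMA synchronous = NORMAL",
    "PRAGMA busy_timeout = 5000",        # milliseconds
    "PRAGMA cache_size = -20000",        # negative value means KiB, so about 20 MB
    "PRAGMA foreign_keys = ON",
    "PRAGMA temp_store = MEMORY",
    "PRAGMA mmap_size = 2147483648",
]

def open_configured(path: str) -> sqlite3.Connection:
    """Open a database file and apply the pragmas in order."""
    conn = sqlite3.connect(path)
    for pragma in PRAGMAS:
        conn.execute(pragma)
    return conn
```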
Hashsets < Dynamic Array < Static Array < Bit Mask
How to cache? It depends on the context: push vs pull and owned vs user.
Push means that the asset is pushed to a central server and then distributed.
Pull means the asset is referenced and the central server has to “pull” the content.
Owned means it’s owned by the central server.
User means it’s user-submitted content.
Push + owned
Make everything push + owned content if possible. "It turns out, however, that you can make a shit ton of other stuff push + owned if you try a little harder. "
How does the client check if they're expired?
Use “stale while re-validate”. Ur welc’
In summary:
- store asset
- use stale-while-re-validate access patterns
- should work offline
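The three points above can be sketched as a small client-side cache. This is an illustration under assumptions, not the article's code: the class name SWRCache, the fetch callback and the injected clock are hypothetical, and the revalidation runs inline where a real client would do it in the background.

```python
import time

class SWRCache:
    """Serve the stored copy immediately; refresh it once it is older than max_age."""

    def __init__(self, fetch, max_age, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callback: key -> fresh value
        self._max_age = max_age  # seconds a copy counts as fresh
        self._clock = clock
        self._entries = {}       # key -> (value, stored_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return self._refresh(key)  # cold miss: nothing to serve yet
        value, stored_at = entry
        if self._clock() - stored_at > self._max_age:
            try:
                self._refresh(key)     # revalidate for the next caller
            except OSError:
                pass                   # offline: keep the stale copy
        return value                   # always the copy we already had

    def _refresh(self, key):
        value = self._fetch(key)
        self._entries[key] = (value, self._clock())
        return value
```

A stale read returns the old value right away and the next read sees the refreshed one, which is what keeps the asset usable offline.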
Push + User & Pull + Owned
Handle these with hash URLs. Hash the URL and treat it immutably.
Push + User: Forum comment -> hash URL
Pull + Owned: "in-content" assets.
Summary:
- Load asset
- Use infinite TTL + hashed URLs
- Should not re-fetch across page/app reloads
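A minimal sketch of the hashed-URL idea, assuming SHA-256 over the source URL and a hypothetical /assets/ prefix. The Cache-Control value is one common way to express an "infinite" TTL; the function and constant names are made up for illustration.

```python
import hashlib

# One year plus "immutable": the usual stand-in for an infinite TTL.
IMMUTABLE_HEADERS = {"Cache-Control": "public, max-age=31536000, immutable"}

def hashed_asset_url(source_url: str, prefix: str = "/assets/") -> str:
    """Derive a stable URL from the source URL.

    The same source URL always maps to the same hashed URL, so the response
    can be cached forever; a different source URL gets a different hashed URL.
    """
    digest = hashlib.sha256(source_url.encode("utf-8")).hexdigest()
    return f"{prefix}{digest}"

url = hashed_asset_url("https://example.com/comment/42/avatar.png")
```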
Pull + User
That’s where it’s user generated content, but not owned by the server. Posting gifs into the chat is a prime example; linking a blog post and generating a media upload for that is another.
Guess what: this pattern fits for highly dynamic user-generated content, which means it’s the content users link to each other in-platform.
Stable URL, short TTL. YES, SHORT TTL. [...] Debounce + throttle? Sure. Micro-TTL? Yes. Cache? Never.
See also https://jpegxl.info/art/
Also called German strings. This is a great data structure that explains how handling strings can be diverse.
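A rough sketch of the layout as it is usually described: a fixed 16-byte header holding a 4-byte length, then either up to 12 bytes stored inline, or a 4-byte prefix plus an 8-byte pointer. Python has no raw pointers, so this sketch uses an offset into a bytearray instead; StringHeap, encode, decode and maybe_equal are hypothetical names.

```python
import struct

INLINE_MAX = 12  # bytes that fit in the header next to the 4-byte length

class StringHeap:
    """Stand-in for out-of-line storage; the 'pointer' is an offset into buf."""
    def __init__(self):
        self.buf = bytearray()

    def store(self, data: bytes) -> int:
        offset = len(self.buf)
        self.buf += data
        return offset

def encode(data: bytes, heap: StringHeap) -> bytes:
    """Build the 16-byte header for a string."""
    if len(data) <= INLINE_MAX:
        return struct.pack("<I12s", len(data), data)  # zero-padded inline bytes
    return struct.pack("<I4sQ", len(data), data[:4], heap.store(data))

def decode(header: bytes, heap: StringHeap) -> bytes:
    (length,) = struct.unpack_from("<I", header)
    if length <= INLINE_MAX:
        return header[4:4 + length]
    (offset,) = struct.unpack_from("<Q", header, 8)
    return bytes(heap.buf[offset:offset + length])

def maybe_equal(a: bytes, b: bytes) -> bool:
    """Cheap pre-check on length + 4-byte prefix; False means definitely different."""
    return a[:8] == b[:8]
```

Short strings never touch the heap, and many comparisons are settled by the first 8 bytes of the header.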
Read ahead of time of the safety bound of the kernel...
An optimisation that I don't really understand.
10 to 20% performance boost... that's great!
About running Blue Dwarf
A lightweight 70 KB implementation of the Jinja template engine. It was 130 MB with the Python environment, and moustache divided the payload size by 1857! It is useful to run it in CI/CD pipelines if only a subset of Jinja is needed.
How to configure SQLite for
Using a simple INT with Unix millisecond timestamps is the best for performance.
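For illustration, a small sketch of that timestamp convention with Python's sqlite3 module. The events table, its columns and the helper names are hypothetical.

```python
import sqlite3
import time

def now_ms() -> int:
    """Current Unix time in milliseconds, as a plain integer."""
    return time.time_ns() // 1_000_000

def make_events_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, created_at INT NOT NULL)"
    )
    conn.execute("CREATE INDEX events_created_at ON events (created_at)")
    return conn

def events_since(conn: sqlite3.Connection, since_ms: int) -> list:
    """Range queries are plain integer comparisons, no date parsing involved."""
    rows = conn.execute(
        "SELECT name FROM events WHERE created_at >= ? ORDER BY created_at",
        (since_ms,),
    )
    return [name for (name,) in rows]
```

New rows would use now_ms() for created_at.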
COUNT is slow, so it can be useful to keep track of row counts in a separate table.
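One possible way to maintain such a count is with triggers, sketched below. The posts and row_counts tables and the trigger names are made up for this example; the linked article may do it differently.

```python
import sqlite3

SCHEMA = """
CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE row_counts (table_name TEXT PRIMARY KEY, n INTEGER NOT NULL);
INSERT INTO row_counts VALUES ('posts', 0);

-- Keep the counter in sync on every insert and delete.
CREATE TRIGGER posts_count_ins AFTER INSERT ON posts
BEGIN
    UPDATE row_counts SET n = n + 1 WHERE table_name = 'posts';
END;
CREATE TRIGGER posts_count_del AFTER DELETE ON posts
BEGIN
    UPDATE row_counts SET n = n - 1 WHERE table_name = 'posts';
END;
"""

def make_counted_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def post_count(conn: sqlite3.Connection) -> int:
    """Read the maintained counter instead of scanning the table."""
    row = conn.execute(
        "SELECT n FROM row_counts WHERE table_name = 'posts'"
    ).fetchone()
    return row[0]
```

The trade-off is a small extra write on every insert and delete in exchange for a constant-time count.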
Distributed SQLite databases can be achieved the same way as PostgreSQL: one writer and multiple replicated readers.
Great insights too :)