It's absolutely possible to beat even the best sort implementations with domain-specific knowledge, careful benchmarking and an understanding of CPU micro-architectures. At the same time, assumptions will become invalid, mistakes can creep in silently and good sort implementations can be surprisingly fast even without prior domain knowledge. If you have access to a high-quality sort implementation, think twice about replacing it with something home-grown.
Understanding Big O notation for software complexity.
Bloom filters are excellent data structures to check whether an element may be in a set: a "no" is definitive, while a "yes" can be a false positive.
Examples can be found on Wikipedia
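To make the "may be in a set" part concrete, here is a minimal sketch of a Bloom filter. The class name, the default sizes and the choice of salted BLAKE2b as the hash are my own assumptions for illustration, not taken from the linked page.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: k hash positions over an m-bit array."""

    def __init__(self, m_bits: int = 1024, k_hashes: int = 3):
        self.m = m_bits
        self.k = k_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, item: str):
        # Derive k positions by salting a single hash function with the index.
        for i in range(self.k):
            digest = hashlib.blake2b(item.encode(), salt=bytes([i])).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))  # always True for an added item
```

An item that was added is always reported as possibly present; an item that was never added is usually rejected, but not always.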
A piece of engineering to "display" every UUID on one page: https://everyuuid.com/
Modern cryptography
Hashing: BLAKE3, Keccak-based functions (SHA-3, SHAKE256) or BLAKE2b.
Encryption: XChaCha20-Poly1305 or ChaCha20-BLAKE3; I would also like to see Keccak-based AEAD constructions.
Key Exchange: X25519, X448
Digital Signatures: Ed25519, Ed448
Password Hashing / Key Derivation: Argon2id
About the new Anki algorithm to schedule flashcard reviews. I haven't read it yet, but it could come in handy.
Instead of plain hash functions to store passwords, use Password-Based Key Derivation Functions (PBKDF) such as Argon2id.
bcrypt should be avoided due to its huge footgun: it truncates inputs longer than 72 bytes. Okta AD/LDAP was vulnerable because of it.
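A minimal sketch of the store-and-verify flow with a password-based KDF. Argon2id is not in the Python standard library, so this swaps in `hashlib.scrypt` as a stand-in; the function names and the cost parameters (`n=2**14, r=8, p=1`) are assumptions chosen for illustration, not a recommendation from the linked article.

```python
import hashlib
import hmac
import os


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key); store both next to the user record."""
    salt = os.urandom(16)  # unique random salt per password
    key = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    key = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    # Constant-time comparison to avoid leaking how many bytes matched.
    return hmac.compare_digest(key, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Unlike bcrypt, scrypt (and Argon2id) take the whole input into account, so long passphrases are not silently truncated.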
Checksum functions such as CRC32 and xxh3 are optimized for pure speed and don't provide any security guarantees about their output, and it's easy to find collisions for a given checksum.
As of 2024, given typical I/O speeds, a hash function with a throughput of 1 GB/s per core is considered fast enough for most use cases.
I skip the speed part because it is not relevant to me: 100 MB/s or 1 GB/s does not make much difference.
SHA3 and the BLAKE family provide secure hash functions that are also misuse-resistant.
A strength >= 128 bits is considered secure. The security agencies' recommendations differ a bit: hash lengths range from 256 bits (NIST) to 512 bits (ECRYPT-CSA).
SHA3 has many functions to choose from, and SHA2 is vulnerable to length extension attacks when used as hash(secret || message),
but BLAKE3 has none of these issues.
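To illustrate the `secret || message` pitfall, here is a small sketch comparing the naive construction with two safer alternatives available in the Python standard library. BLAKE3 itself is not in `hashlib`, so keyed BLAKE2b is used as a stand-in; the secret and message values are hypothetical.

```python
import hashlib
import hmac

secret = b"hypothetical-secret-key"  # assumption: any shared secret
message = b"amount=100&to=alice"

# Naive MAC: with SHA-256 this is the construction exposed to length extension,
# because the digest is the full internal state after hashing secret || message.
naive_tag = hashlib.sha256(secret + message).hexdigest()

# HMAC wraps the hash in two keyed passes, so extending the message
# no longer yields a valid tag.
hmac_tag = hmac.new(secret, message, hashlib.sha256).hexdigest()

# BLAKE2b has a built-in keyed mode and needs no HMAC wrapper.
blake_tag = hashlib.blake2b(message, key=secret, digest_size=32).hexdigest()

print(naive_tag)
print(hmac_tag)
print(blake_tag)
```

The three tags differ; only the last two are meant to be used as message authentication codes.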
Post-quantum security: Grover's algorithm halves the bit strength of preimage and 2nd-preimage resistance (from 2^n to 2^(n/2) operations). The BHT algorithm predicts, however, that a quantum computer can find a collision in about 2^(n/3) operations instead of 2^(n/2).
So SHA2 for convenience or BLAKE for the rest. Only C and Rust have official support for BLAKE, though.
Judging from this experience, adding a proof-of-work algorithm can work.
I guess that the main lesson was that these particular spammers are really low-effort creatures. You raise the bar a little, and they stop being effective.
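A hashcash-style sketch of what such a proof-of-work check could look like. The challenge format, the function names and the default difficulty are assumptions for illustration; the linked post may use a different scheme.

```python
import hashlib
from itertools import count


def _digest_value(challenge: str, nonce: int) -> int:
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve(challenge: str, difficulty_bits: int = 16) -> int:
    """Client side: brute-force a nonce whose hash has enough leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int = 16) -> bool:
    """Server side: a single hash is enough to check the work."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


nonce = solve("hypothetical-form-token", difficulty_bits=12)
print(verify("hypothetical-form-token", nonce, difficulty_bits=12))
```

The asymmetry is the point: solving costs about 2^difficulty_bits hashes on average, verifying costs one.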
Hashsets < Dynamic Array < Static Array < Bit Mask
There is also a dedicated section for JS https://shaarli.lyokolux.space/shaare/DhH-Zw
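A tiny sketch of the bit mask end of that ranking, assuming the ordering is about membership tests on a small, fixed universe of values. The vowel example is my own. In Python this only shows the technique; the speed benefit applies to compiled languages where the mask fits in a register.

```python
# Build a 26-bit mask with one bit per lowercase letter; set the vowel bits.
VOWELS = 0
for ch in "aeiou":
    VOWELS |= 1 << (ord(ch) - ord("a"))


def is_vowel(ch: str) -> bool:
    """Membership test with one shift and one AND, no hashing or scanning."""
    idx = ord(ch) - ord("a")
    return 0 <= idx < 26 and bool((VOWELS >> idx) & 1)


print(is_vowel("e"), is_vowel("z"))
```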
Outline:
Using a BKTree data structure to identify and correct typos
Writing the Business Logic to Perform Typo Corrections
Pulling from Redis and caching it with lazy_static!
Identifying English words (among others, BKTree Search for Non-Dictionary Words)
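The BK-tree part of that outline can be sketched as follows. The article itself uses Rust; this is a Python sketch with my own class and function names, using Levenshtein distance as the metric (an assumption, since the outline does not say which distance is used).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class BKTree:
    def __init__(self):
        self.root = None  # a node is (word, {distance_to_parent: child_node})

    def add(self, word: str) -> None:
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = levenshtein(word, node[0])
            if d == 0:
                return  # already present
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, word: str, max_dist: int) -> list:
        results = []
        stack = [self.root] if self.root else []
        while stack:
            candidate, children = stack.pop()
            d = levenshtein(word, candidate)
            if d <= max_dist:
                results.append((d, candidate))
            # Triangle inequality: only subtrees within [d - max, d + max] can match.
            for dist, child in children.items():
                if d - max_dist <= dist <= d + max_dist:
                    stack.append(child)
        return sorted(results)


tree = BKTree()
for w in ["book", "books", "cake", "boo", "cape", "cart"]:
    tree.add(w)
print(tree.search("bok", 1))  # [(1, 'boo'), (1, 'book')]
```

The pruning step is what makes the tree useful for typo correction: most of the dictionary is never compared against the query.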
Also called German strings. This is a great data structure that explains how handling strings can be diverse.
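A rough sketch of the layout as I understand it: a fixed 16-byte header holding a 4-byte length, then either up to 12 bytes of inline content (short strings) or a 4-byte prefix plus an 8-byte pointer (long strings). Here an offset into a `bytearray` stands in for the pointer, and all names are hypothetical.

```python
import struct

INLINE_MAX = 12  # bytes that fit next to the 4-byte length in a 16-byte header


def encode(data: bytes, heap: bytearray) -> bytes:
    """Return a 16-byte header; long strings are appended to `heap`."""
    if len(data) <= INLINE_MAX:
        # length + content stored inline, zero-padded to 12 bytes
        return struct.pack("<I12s", len(data), data)
    offset = len(heap)
    heap.extend(data)
    # length + 4-byte prefix + 8-byte "pointer" (here: offset into heap)
    return struct.pack("<I4sQ", len(data), data[:4], offset)


def decode(header: bytes, heap: bytearray) -> bytes:
    (length,) = struct.unpack_from("<I", header)
    if length <= INLINE_MAX:
        return header[4:4 + length]
    _, _prefix, offset = struct.unpack("<I4sQ", header)
    return bytes(heap[offset:offset + length])


def prefix(header: bytes) -> bytes:
    """First 4 content bytes, readable without touching the heap."""
    return header[4:8]


heap = bytearray()
short = encode(b"hello", heap)
long_ = encode(b"a much longer string value", heap)
print(len(short), len(long_), prefix(long_))
```

Many comparisons can be decided from the length and prefix alone, which is why the header keeps them next to each other.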
An arena is a way to store your data somewhere without directly going through the system allocator. If you have a lot of small objects that you don't mind deallocating together instead of individually, this can be a lot faster. You could use a Vec for this. However, if you store data in a Vec, its address might change whenever the Vec grows.
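A sketch of the handle-based flavour of this idea. Python does not expose manual allocation, so this only illustrates the shape: objects live in one backing list, callers hold integer handles instead of addresses (handles stay valid when the list grows), and everything is released at once. The `Arena` class and its methods are hypothetical names.

```python
class Arena:
    """Stores objects in one growable list and hands out stable integer handles."""

    def __init__(self):
        self._items = []

    def alloc(self, value) -> int:
        self._items.append(value)
        return len(self._items) - 1  # the handle is the index, not an address

    def get(self, handle: int):
        return self._items[handle]

    def clear(self) -> None:
        # Everything is dropped together; all old handles become invalid.
        self._items.clear()

    def __len__(self) -> int:
        return len(self._items)


# Usage: a linked list whose nodes refer to each other by handle.
arena = Arena()
tail = arena.alloc({"value": "c", "next": None})
mid = arena.alloc({"value": "b", "next": tail})
head = arena.alloc({"value": "a", "next": mid})

node, values = head, []
while node is not None:
    values.append(arena.get(node)["value"])
    node = arena.get(node)["next"]
print(values)  # ['a', 'b', 'c']
```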
A great thing would be an implementation in #rust as a small #project #idea
See pages 38-45 of Andy Pavlo's computer science slides (PDF): https://15721.courses.cs.cmu.edu/spring2024/slides/05-execution2.pdf
About ULIDs, UUIDv4 & UUIDv7. The dynamic examples are great!
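To see why UUIDv7 sorts by creation time while UUIDv4 does not, here is a sketch that assembles a UUIDv7 by hand: a 48-bit Unix timestamp in milliseconds, the 4-bit version, 12 random bits, the 2-bit variant, and 62 more random bits. The function name and the optional `unix_ms` parameter are my own.

```python
import os
import time
import uuid


def uuid7(unix_ms=None) -> uuid.UUID:
    """Build a UUIDv7: timestamp in the high bits, randomness in the low bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                 # top 12 bits
    rand_b = rand & ((1 << 62) - 1)     # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                         # bits 79..76: version 7
    value |= rand_a << 64                      # bits 75..64: random
    value |= 0b10 << 62                        # bits 63..62: RFC variant
    value |= rand_b                            # bits 61..0: random
    return uuid.UUID(int=value)


print(uuid7())
print(uuid.uuid4())  # fully random, no ordering
```

Because the timestamp occupies the most significant bits, IDs generated later compare greater, both as integers and as strings.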
How to remove XML comments in JavaScript?
How regexes can solve the issue, and why they can be slow. There is a category for this weakness: CWE-1333 "Inefficient Regular Expression Complexity".
Other workarounds are also proposed, such as using efficient engines with backtracking.
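One regex-free way to sidestep the problem is a plain linear scan for the comment delimiters. This is my own sketch in Python rather than the JavaScript of the linked post, and it deliberately ignores CDATA sections and other places where `<!--` is not a comment.

```python
def strip_xml_comments(text: str) -> str:
    """Remove <!-- ... --> spans with a single left-to-right scan."""
    out = []
    pos = 0
    while True:
        start = text.find("<!--", pos)
        if start == -1:
            out.append(text[pos:])  # no more comments
            break
        out.append(text[pos:start])
        end = text.find("-->", start + 4)
        if end == -1:
            # Unterminated comment: keep the remainder untouched.
            out.append(text[start:])
            break
        pos = end + 3
    return "".join(out)


print(strip_xml_comments("<a><!-- x --><b/><!--y--></a>"))  # <a><b/></a>
```

Each character is visited a bounded number of times, so there is no input that triggers the blow-up described by CWE-1333.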