Daily Shaarli

All links of one day in a single page.

October 23, 2025

Octothorp - Enjoy the World Wide Web

Octothorpes are hashtags and backlinks that can be used on regular websites, connecting pages across the open internet regardless of where they're hosted.

A Secret Web

Outside the grasp of social media and the commercial web sits a broad community of people with personal websites and blogs. [...] The community has received many names:

  • The Small Web contrasts this community with the “Big Web”, valuing personal ownership over scale.
  • The IndieWeb also values personal ownership of websites, providing numerous technical standards and proposals to help facilitate interaction between different people’s blogs.
  • Web 1.0 rejects the hype of “Web 2.0” apps, using simple, straightforward technologies to build websites.
  • The Blogosphere is an old term that’s been around since 1999, referencing the community of bloggers.
  • The Web Revival is the concept shared by many that this community has been growing and making a comeback.

This web relies on hyperlinks.

Classic web discovery works through blogrolls, webrings and feeds.

Search engines are wonderful tools for finding a specific thing, but they shouldn't be the only discovery tool, because they only show a subset of the available information.

That's why Clew highlights small independent websites "to make discovering what real people think easier". Other search engines do the same:

  • Marginalia
  • Unobtanium
  • Stract
  • Lieu - focuses on webrings.
  • Mwmbl - curated by the users.
  • Search My Site crawls user-submitted sites
  • Wiby for websites using older technology, great for use on vintage computers.
  • YaCy - a decentralized search engine
  • PeARS - A search engine that can be run in the browser, without needing a server.
  • Mojeek - an independent search engine

Another idea to bring back a healthier web is to provide blogrolls in the OPML format directly: https://opml.org/blogroll.opml.
Jamesg.blog created the Artemis Link Graph web extension. It lists the web pages authored by people you follow that link to the page you are viewing.

All of these have one limitation: much of the independent web today is made up of people with similar interests, in technology in particular.

A decade of dotfiles

Some useful CLI tips for handling dotfiles and customizing the shell.

Why "alias" is my last resort for aliases

But over time, I think I discovered a better way: a script in my $PATH.

Benefits of scripts over aliases:

  • No reloading; changes are picked up immediately
  • Choice of programming language
  • Complex logic can be implemented
  • More portable between shells

Aliases have certain benefits:

  • special syntax (cd.. for cd .. with space)
  • completion
  • conditional definition
  • easier to bypass with unalias
  • brevity: it's a one liner
  • performance: aliases are 100x faster.
Foreign hackers breached a US nuclear weapons plant via SharePoint flaws | CSO Online
Why you should delete your personal info (and how to do it without spending 200 hours on it) | Cybersecurity | Le site de Korben

The GDPR is impractical, because you have to identify every site that holds your personal data, find its deletion form, send a compliant request, follow up if there is no reply, and start over every 3 to 6 months because they collect the information again.

Incogni handles exactly that.

88% of American apps and 92% of Chinese apps share your data with third parties, compared with only 54% of European apps.

What Made Blogging Different? - TPM – Talking Points Memo

So if you wanted people to read your blog, you had to make it compelling enough that they would visit it, directly, because they wanted to. And if they wanted to respond to you, they had to do it on their own blog, and link back.

There are bright spots [for blogging], though. I fear we’re in a newsletter bubble (how many subscriptions can one person pay for?)

Some of the best blogs have evolved and expanded. Independent media is more important than ever, and Donald Trump’s recent attempts to censor mainstream outlets, comedians he doesn’t like, and “leftist” professors underscore the fact that speech is critical.

it’s actually a lot harder to intimidate a million different outlets, each run by a single determined person.

Behind the Scenes of Pingoo: Slashing Rust allocations with mimalloc and heapless to build the fastest reverse proxy

The heap is a performance killer in Rust. One workaround is to swap to a more efficient memory allocator such as mimalloc or jemalloc.

In Cargo.toml:

[dependencies]
mimalloc = "0.1"

In main.rs:

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

The best performance optimisation is to avoid the heap. There is the heapless crate for that. "The only thing to know is that the size of heapless types and collections needs to be known at compile-time."
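
A minimal sketch of what "size known at compile-time" means, using only the standard library. FixedStack is a hypothetical type written for illustration; it is not the heapless API, and the real crate's collections may behave differently. The capacity is a const generic, so the storage is an inline array and a full buffer has to report an error instead of growing.

```rust
/// Hypothetical fixed-capacity stack (illustration only, not the heapless crate).
/// The capacity N is part of the type, so the size is known at compile time
/// and the storage is an inline array instead of a heap allocation.
pub struct FixedStack<T: Copy + Default, const N: usize> {
    items: [T; N],
    len: usize,
}

impl<T: Copy + Default, const N: usize> FixedStack<T, N> {
    /// Create an empty stack; all N slots are pre-filled with T::default().
    pub fn new() -> Self {
        Self { items: [T::default(); N], len: 0 }
    }

    /// Push a value. When the buffer is full the value is handed back as Err,
    /// because there is no heap to grow into.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    /// Remove and return the most recently pushed value, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        Some(self.items[self.len])
    }

    /// View the occupied part of the buffer.
    pub fn as_slice(&self) -> &[T] {
        &self.items[..self.len]
    }
}
```

With a capacity of 2, a third push returns Err with the rejected value, which is the trade-off of giving up the heap: the caller has to decide what to do when the buffer is full.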

Forest fires - Clear the brush on your land now to prevent fires during the summer | Service Public
Between Switzerland and France, the “middle-class exiles”

The Geneva Council of State no longer wants cross-border pupils in the canton's schools.

More than 30,000 people have left the canton to move to Ain or Haute-Savoie, the “Genevois français”.

211 BILLION EUROS: a gift with nothing in return? [ARGENT MAGIQUE]
A look at search engines with their own indexes - Seirdy

I lost all my notes, so I rewrote them from this blog post. The author focuses on English-language search engines.

The Common Crawl can be used by a search engine that does not have its own index, or to enrich an existing one. The dominant engines, Google, Bing and Yandex, are abbreviated GBY.

General indexing search engines

  1. Google: the biggest index. It powers other search engines: a former version of Startpage, GMX Search (run by a popular German email provider), Mullvad Leta, SAPO (Portuguese UI), DSearch, 13TABS, Zarebin (Persian), Ecosia, and a host of other engines using Programmable Search Engine’s client-side scripts.
  2. Bing powers many engines: Yahoo, DuckDuckGo, AOL, Qwant, Ekoru, Privado, Findx, Disconnect Search, Lilo, ...
  3. Yandex: a Russian search engine.
  4. Mojeek: privacy-oriented, with billions of pages.

Smaller indexes or less relevant results

Stract: an OSS project.
Right Dao: very fast, with good results. Focuses on large established sites rather than smaller, independent ones.
Alexandria: a non-profit, ad-free engine built from the Common Crawl.
Yep: also shows results linked by pages containing the query. In other words, not all results contain relevant keywords. This makes it excellent for less precise searches and discovery of “related sites”.
SeSe Engine: a Chinese engine. Good results for such a low-budget project.
greppr: “Search the Internet with no filters, no tracking, no ads.”

Smaller indexes, hit-and-miss

Peekr: a SearXNG metasearch engine that now returns results from its own growing Elasticsearch index. Self-hostable.
Seekport: German UI, with its own small index.
ExactSeek: disproportionately dominated by big sites. Webmaster tools seem to heavily push for paid SEO options.
Burf.co: very small index.
ChatNoir: an experimental OSS engine by researchers that uses the Common Crawl index.
Secret Search Engine Labs: avoids spam.
Gabanza: small index from a hosting company.
Jambo: biased towards older content. Not updated since 2006.
search.dxhub.de: an open source version of Gigablast.
Fynd

Fledgling engines

Yessle
Bloopish
Artado Search: primarily Turkish.
Active Search Results: biased towards commercial sites.
Crawlson: index cap of 10 URLs per domain. Has some downtime.
Anoox: vote on listings to alter rankings.
Yioop!: a FLOSS search engine with an impressive feature set. Yioop’s results are few and irrelevant due to its small index. It allows submitting sites for crawling. Like Meorca, Yioop has social features such as blogs, wikis, and a chat bot API.
Spyda: a small Go engine made by James Mills.
Slzii.com: a new web portal with a search engine. It has a tiny index dominated by SEO spam.
Weblog DataBase: a metadata search engine for technical blogs. Its index is small and its ranking seems poor, but it has different goals from most search engines. It encourages filtering search results iteratively until finding the desired subset of results.

Semi-independent indexes

Brave Search: reuses Google and Bing search results. The company has its own history.
Plumb: returns almost no results of its own and falls back to Google.
Qwant: own index but still relies on Bing for most results.
Kagi Search: requires an account and limits use without payment. It has its own Teclis index. The company also seems to use Brave's commercial API.
PriEco: a metasearch engine. Other sources can be turned off, but its own index is quite tiny.

Non-generalist search

They’re trying to do something different. You aren’t supposed to use these engines the same way you use GBY.

Marginalia Search: has its own crawler and is strongly biased towards non-commercial, personal, and/or minimal sites.
Ichido: rolled out its own independent index with a lot of care given to its ranking algorithm. Biased towards the non-commercial web.
Teclis: uses its own crawler that measures content blocked by uBlock Origin, and extracts content with the open-source article scrapers Trafilatura and Readability.js.
Clew: a new FOSS engine with a small index. It focuses on independent content and seems to have a real focus on quality over quantity.
Lixia Labs Search: indexes technical websites and blogs, with a minimal JavaScript-free front-end.

Site finders

Kozmonavt: 8 million sites. It lacks contact information, a privacy policy or any other information about the organisation.
search.tl: limits searches to a specific TLD. It seems to be connected to Amidalla.
Thunderstone: a combined website catalog and search engine that focuses on categorization. "It is very good at finding companies and organizations by purpose, product, subject matter, or location."
sengine.info: only shows domains. Made by netEstate GmbH.
Gnomit: allows single-keyword queries and returns sites that seem to cover a related topic. The results are typically old (from 2009).

Other

High Browse: introduces non-SEO-optimized serendipity into search results. One of the author's favorite surf-engines.
Keybot: crawls the web for multilingual sites. Part of the TTN Translation Network.

Semantic Scholar: by the Allen Institute for AI, focused on academic PDFs.
Bonzamate: focuses on Australian websites.
Searchcode: focuses on... code searching.
StarFinder: focuses on Open Graph Protocol metadata.

Other languages

Big Indexes

Baidu: Chinese. It's a major engine alongside GBY.
Qihoo: Chinese. How independent?
Toutiao: Chinese. The index seems limited outside of its own content distribution.
Sogou: Chinese.
Yisou: Chinese by Yahoo. Defunct.
Naver: Korean.
Daum: Korean.
Seznam: Czech, seems relatively privacy-friendly. It uses IndexNow.
Cốc Cốc: Vietnamese
go.mail.ru: Russian
LetSearch.ru: Russian.

Smaller indexes

ALibw.com: Chinese.
Vuhuv: Turkish.
search.ch: regional search engines for Switzerland.
fastbot: German
SOLOFIELD: Japanese
kaz.kz: Kazakh and Russian

Almost qualified

These engines come close enough to passing my inclusion criteria that I felt I had to mention them. They all display original organic results that you can’t find on other engines, and maintain their own indexes. Unfortunately, they don’t quite pass because they don’t crawl the Web; most limit themselves to a specific set of sites.

Wiby.me: focuses on smaller independent sites that capture the spirit of the "early" web. It's more focused on discovering new interesting pages. Great for surfing. It is also available via wiby.org.
Mwmbl: an open-source engine whose crawling is community-driven. It crawls only pages from hand-picked sites. It allows users to help crawl webpages in its index backlog via the Mwmbl Donate Firefox extension.
Search My Site: indexes user-submitted personal and independent sites. It supports IndieAuth.
Kukei.eu: a curated search engine for web developers. It crawls hand-picked sites.
Unobtanium Search: a fledgling search engine by Slatian. It crawls hand-curated sites: personal, technical, indie wiki, and German hacker community sites.

Infinity Search: young, and split between a paid offering with the main index and Infinity Decentralized, a set of community-hosted crawlers.

Graveyard

Petal Search, Neeva, Gigablast, wbsrch, Gowiki, Meorca, Ninfex, Marlo, Entfer, Siik, Blog Surf, Infotiger

Rationale behind the post

Google, Bing and Yandex have conflicts of interest, so they won't deliver the "best" of the web to users. Information diversity also matters: most search engines' ranking algorithms incorporate a method similar to PageRank, which biases them towards sites with many backlinks.
The author also describes their methodology.
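
To see why a PageRank-like method favours sites with many backlinks, here is a rough sketch of the textbook power-iteration form. It is not taken from the post or from any engine listed here; the function name, the damping value and the tiny link graph in the usage note are all assumptions made for illustration.

```rust
/// Simplified PageRank by power iteration (illustrative sketch only).
/// `links[i]` lists the pages that page `i` links to; every index is
/// assumed to be smaller than `links.len()`.
pub fn pagerank(links: &[Vec<usize>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = links.len();
    if n == 0 {
        return Vec::new();
    }
    // Start from a uniform distribution over all pages.
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Every page gets a small baseline share, whatever its backlinks.
        let mut next = vec![(1.0 - damping) / n as f64; n];
        for (page, outgoing) in links.iter().enumerate() {
            if outgoing.is_empty() {
                // A page without outgoing links spreads its rank evenly.
                let share = damping * rank[page] / n as f64;
                for r in next.iter_mut() {
                    *r += share;
                }
            } else {
                // Otherwise its rank is split among the pages it links to.
                let share = damping * rank[page] / outgoing.len() as f64;
                for &target in outgoing {
                    next[target] += share;
                }
            }
        }
        rank = next;
    }
    rank
}
```

On a graph where pages 0, 1 and 3 all link to page 2 and page 2 links back to page 0, calling pagerank with a damping of 0.85 ranks page 2 highest, while pages 1 and 3, which nobody links to, stay at the baseline. A small independent site without backlinks ends up in that last group.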

Findings

  1. Using one engine for everything ignores the fact that different engines have different strengths
  2. When talking to search engine founders, I found that the biggest obstacle to growing an index is getting blocked by sites.
  3. Too many people optimize sites specifically for Google without considering the long-term consequences of their actions. Almost no non-GBY engines on this list are JavaScript-aware.
  4. When building webpages, authors need to consider the barriers to entry for a new search engine.
  5. Try a “bad” engine from lower in the list. It might show you utter crap. But every garbage heap has an undiscovered treasure.

From Teclis: Using Trafilatura and Readability.js encourages the use of semantic HTML and Semantic Web standards such as microformats, microdata, and RDFa. It claims to also use some results from Marginalia. The Web interface has been shut down, but its standalone API is still available for Kagi customers.

it's time to reinvent the wheel

Reinventing the wheel, but differently each time. The author shares their experience.
Consider standards when doing so: they are powerful.

Some wheels I see that I think could use some new takes but which I don’t have the time/energy to do myself:

  • Web browsers - probably the most significant. The browser market is essentially a monopoly right now. And Firefox is pretty much the only alternative option, somewhat of a monopoly in itself. We need to have many independent browser projects going on, not just an alternative.
  • Higher education - this is probably too big a project for any one person, but I think there’s a lot of ground that needs new work and reevaluating in the world’s current higher education system.
  • Task management - there are a lot of task management systems out there, but I think there’s still definitely room for more. I’m personally beginning to settle on a hybrid analog/digital task management system I’m designing myself.
readable.css

Rather than helping you build a sitewide design, readable.css provides a base default that is both sensible and beautiful.

Scrolls

Scrolls is a weekly newsletter / link roundup / information digest at the intersection of the IndieWeb and the Fediverse, with a splash of Cybersecurity stuff. It is published on the web every Friday, completely free. Check out the latest edition and get scrollin'!

manifestos [MelonLand Wiki]
Kat Marchán 🐈: "you benchmark your node/ruby/python software on y…" - Toot.Cat

You benchmark your node/ruby/python software on a fancy Macbook M4 and celebrate 500ms response time.

I benchmark my rust software on a $30 potato computer that may as well have 256MB of RAM and celebrate 800ms response time.

AWS outage reminds us why $2,449 Internet-dependent beds are a bad idea - Ars Technica
Encroaching on someone else's land: an infringement of property rights - UNPI

But for this tolerance to be valid, there must be a clear, written and formalized agreement (for example, an easement deed). A simple verbal exchange is not enough.

Millionaires and Billionaires · Jens Oliver Meiert

From what we can observe, most people with significant wealth seem to be peculiar in two particular ways: Appearing not wise enough to recognize and know that their wealth means another’s poverty and that that’s actually relevant because ultimately, they can only be truly well if everyone is well, and that they, too, live in a climate catastrophe from which they cannot escape, even if they built themselves the most sophisticated bunker.

Appearing not courageous enough to act to use their fortunes for the greater good and for everyone’s well-being, because they seem so afraid they would not have enough, even though they already have way more than enough (and will keep enough) to live a comfortable and fulfilling life, and to move away from their ways of “making” money, especially when these ways include exploiting and damaging people, animals, or planet, out of the same fear of not having enough, or other fears like not being able to replicate their success or being admired for it.

There’s some superb (and superbly sad) irony here that millionaires and billionaires are in the best position to be role models, by doing amazing things for the well-being and advancement of mankind (and all species)

How popular foreign mobile apps are quietly exporting European's most sensitive personal data—by country [2025] | Incogni
Scripts I wrote that I use all the time

getsong makes sense, and there are so many other useful scripts.

  • copy and pasta
  • mkcd
  • tempe
  • trash
  • mksh
  • serveit starts a static file server
  • getsong
  • getpod to download something from a podcast player
  • getsubs
  • wifi off, wifi on and wifi toggle
  • url "my_url" parses a URL into its parts.
  • markdownquote to add > before every line
  • u+ 2025 to get the associated Unicode character
  • snippets to run some snippets
  • some REPL launchers for Clojure, Deno, PHP, Python and SQLite
  • hoy prints the current date in ISO format
  • timer
  • ocr to extract text from an image
  • removeexif to delete EXIF data from images
  • emoji fuzzy finder helper https://codeberg.org/EvanHahn/dotfiles/src/commit/843b9ee13d949d346a4a73ccee2a99351aed285b/home/bin/bin/emoji

and more process management scripts.

What Made Blogging Different?

If you want to be read, your blog has to be compelling enough that visitors come to it.

GitHub - pingooio/pingoo: The fast and secure Load Balancer / API Gateway / Reverse Proxy with built-in service discovery, GeoIP, WAF, bot protection and much more - https://pingoo.io

Documentation: https://pingoo.io/

Developed by Sylvain Kerkour: https://kerkour.com/

Cleanup your lifetime annotations in Rust with Rc and Arc

Lifetime annotations are needed to tell the compiler that we are manipulating some kind of long-lived reference and let it assert that we are not going to screw ourselves

The only downside is that smart pointers, in Rust, are a little bit verbose (but still way less ugly than lifetime annotations). [They add some runtime overhead.]

When to use lifetime annotations?

When performance really matters or when your code will be used in no_std environments.
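
A small sketch of the trade-off described above, using only the standard library. Config, BorrowingHandler, SharedHandler and greet_in_thread are hypothetical names invented for this example; they do not come from the article or from Pingoo.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical shared data.
pub struct Config {
    pub name: String,
}

/// Borrowing version: the lifetime 'a ties the handler to the Config it
/// points at, and the annotation spreads to every type that stores it.
pub struct BorrowingHandler<'a> {
    config: &'a Config,
}

impl<'a> BorrowingHandler<'a> {
    pub fn new(config: &'a Config) -> Self {
        Self { config }
    }

    pub fn greeting(&self) -> String {
        format!("hello from {}", self.config.name)
    }
}

/// Smart-pointer version: no lifetime parameter, at the cost of a
/// reference count that is updated at runtime.
pub struct SharedHandler {
    config: Arc<Config>,
}

impl SharedHandler {
    pub fn new(config: Arc<Config>) -> Self {
        Self { config }
    }

    pub fn greeting(&self) -> String {
        format!("hello from {}", self.config.name)
    }
}

/// The Arc-based handler owns its pointer, so it can move into a spawned
/// thread; the borrowing handler could not, unless the Config were 'static.
pub fn greet_in_thread(config: &Arc<Config>) -> String {
    let handler = SharedHandler::new(Arc::clone(config));
    thread::spawn(move || handler.greeting())
        .join()
        .expect("worker thread panicked")
}
```

Rc would work the same way in single-threaded code. Arc is used here because its atomic counter is what allows the handler to cross a thread boundary.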

Wikipedia loses 8% of human page views in one year and blames AI - Next

This is reportedly due in large part to better crawler detection. Search engines also provide answers directly, based on Wikipedia's content.

Crawlers are more aggressive, and some do not respect robots.txt.

“this means that people are reading the knowledge created by Wikimedia volunteers all over the Internet, even if they don't visit wikipedia.org. This human-created knowledge has become even more important for spreading reliable information online.”

There is, however, a long-term risk:

“with fewer visits to Wikipedia, fewer volunteers will develop and enrich the content, and fewer individual donors will support this work.”