https://bibwild.wordpress.com/2025/01/16/using-cloudflare-turnstile-to-protect-certain-pages-on-a-rails-app/

There is a growing problem of so-called "bots" aggressively harvesting content from repositories. They are often poorly designed and show no regard for the bandwidth or processing capacity they demand from the repository as they attempt to "hoover" up all available content (increasingly for AI training purposes). The result is a sometimes severe impact on the performance of the repository.

This piece by Jonathan Rochkind describes approaches to mitigating this problem. In particular, it describes an interesting approach to blocking machine processes from only certain resources (e.g. the search function), while leaving content resources freely available for them to access.
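The core idea of selective protection can be sketched in a few lines. This is not Rochkind's implementation (his piece covers a Rails app with Cloudflare Turnstile). It is a minimal illustration of the decision only, and the path prefixes and the function name `requires_challenge` are assumptions made up for this example.

```python
# Hypothetical path prefixes for expensive, bot-attracting routes.
# A real deployment would list its own.
PROTECTED_PREFIXES = ("/search", "/catalog")


def requires_challenge(path: str, challenge_passed: bool) -> bool:
    """Return True when a request should be sent to a bot challenge first."""
    if challenge_passed:
        # The client has already passed the challenge in this session.
        return False
    # Only the protected routes are gated; content URLs stay open to machines.
    return path.startswith(PROTECTED_PREFIXES)
```

With these assumed prefixes, a request for `/search?q=fediverse` would be challenged, while a request for a content page or file would pass straight through.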

https://arxiv.org/abs/2407.09237

I have been refreshing my knowledge of some of these (often contested) issues with URL management. I’m still more sceptical than the author about how well Content Negotiation works in practice, but this is useful nonetheless.


I've just got back from the excellent Open Repositories 2024 conference in Gothenburg. Lots of interesting work was described, and I met some great new people. I was involved in several sessions there. One talk I gave was a rapid "lightning" talk (max 7 minutes, max 24 slides) called "Should repositories participate in the Fediverse?"

There are plugins available for adding icons to the Obsidian file-explorer, but I decided not to use them for two reasons:

  1. I'm trying to maintain a "plugin diet", on the grounds that the more plugins I add to Obsidian, the more likely I am to hit problems with incompatibility and performance.
  2. The plugins I tried, although mostly functional, were nonetheless buggy.

Instead, it is possible to decorate the file-explorer with icons using nothing other than CSS, via the Obsidian CSS "snippets" feature and the standard set of Unicode emoji characters. This appeals to me because it avoids the need for a plugin just for what is largely a cosmetic concern.

There are two ways to add emoji to files:

This website is served as static HTML, compiled by a really capable "static-site-generator" called Hugo. I had a problem to solve with rendering images here: sometimes I wanted an image which is local to a particular blog post to also show up on the homepage, which serves the most recent posts. The problem is that the relative URL for the image is different when that content is served on the homepage. I don't want to hard-code absolute URLs, but I do want to use the sources of the webpage in different parts of the website. Therefore, I needed Hugo to somehow intelligently re-write those URLs when compiling the website.

This is where Hugo's relatively new Markdown render hooks come in. I've added the following code to a partial under layouts/_default/_markup/render-image.html

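{{/* Parse the image destination so absolute URLs can be told apart from page-relative paths. */}}
{{ $url := urls.Parse .Destination }}
{{ if or $url.IsAbs (hasPrefix .Destination "/") }}
    {{/* External or webroot-relative image: leave the URL unchanged. */}}
    <img src="{{ .Destination }}" title="{{ .Title }}" alt="{{ .Text }}"/>
{{ else }}
    {{/* Page-local image: prepend the page's permalink, avoiding a double slash. */}}
    <img src="{{ strings.TrimSuffix "/" .Page.Permalink }}/{{ .Destination }}" title="{{ .Title }}" alt="{{ .Text }}"/>
{{ end }}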

This has the effect of prepending the page's absolute URL to the image path at compile time. It is invoked every time a Markdown image element is encountered in the sources. If the image is not local to the page, then the image HTML tag is rendered with the URL unchanged (e.g. for external images, or for images served from a folder relative to the webroot rather than the current page's folder).
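For readers who don't use Hugo's template language, the same decision can be sketched in Python. This is an illustration of the logic only, not something Hugo runs. The function name `rewrite_image_url` and the example URLs are assumptions for this sketch, and it joins the two parts with a single slash.

```python
from urllib.parse import urlparse


def rewrite_image_url(destination: str, page_permalink: str) -> str:
    """Return the URL to use in the <img> tag for a Markdown image."""
    parsed = urlparse(destination)
    # A URL with a scheme (http, https, ...) is absolute, and a leading "/"
    # means the path is relative to the webroot: both are left unchanged.
    if parsed.scheme or destination.startswith("/"):
        return destination
    # Otherwise the image is local to the page, so prepend the page's
    # absolute URL, joining with exactly one slash.
    return page_permalink.rstrip("/") + "/" + destination


# Example with a hypothetical post URL:
print(rewrite_image_url("diagram.png", "https://example.org/posts/hugo-hooks/"))
```

Because the rewritten URL is absolute, it resolves to the same file whether the post is rendered on its own page or on the homepage.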

Hugo's render hooks are an interesting and useful addition. You can specify render hooks for:

  • image
  • link
  • heading
  • codeblock

The term "Gell-Mann Amnesia Effect" was coined by the late Michael Crichton in a talk entitled "Why Speculate?", given to the International Leadership Forum, La Jolla, in 2002. Below is an excerpt from that talk:

Media carries with it a credibility that is totally undeserved. You have all experienced this, in what I call the Murray Gell-Mann Amnesia effect. (I call it by this name because I once discussed it with Murray Gell-Mann, and by dropping a famous name I imply greater importance to myself, and to the effect, than it would otherwise have.)

Briefly stated, the Gell-Mann Amnesia effect works as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward - reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them.

In any case, you read with exasperation or amusement the multiple errors in a story - and then turn the page to national or international affairs, and read with renewed interest as if the rest of the newspaper was somehow more accurate about far-off Palestine than it was about the story you just read. You turn the page, and forget what you know.

That is the Gell-Mann Amnesia effect. I'd point out it does not operate in other arenas of life. In ordinary life, if somebody consistently exaggerates or lies to you, you soon discount everything they say. In court, there is the legal doctrine of falsus in uno, falsus in omnibus, which means untruthful in one part, untruthful in all.

I think this has become worse in recent years. Much of the mainstream press and TV news seems to dwell in the realm of speculation, more than dry, objective reportage. The important lesson is, frankly, to doubt everything you read in the news unless you have reason to trust the source. This is exhausting, and makes the whole business of actually reading "speculative" news reporting somewhat pointless.

As Crichton said, introducing the transcript of the talk on his website:

In recent years, media has increasingly turned away from reporting what has happened to focus on speculation about what may happen in the future. Paying attention to modern media is thus a waste of time.

In recent months I have successfully weaned myself off daily news consumption. I pick up bits and pieces here and there, but I no longer intentionally go to news sources. At the weekend, I catch up with digests from a few trusted sources. I do not think this has significantly impaired my awareness of current affairs, while it has certainly saved me from wasting a lot of time!

Using the excellent Dataview Obsidian plugin, inserting the snippet below into a note will create a table listing:

  • all notes in the same folder as the note, and in all sub-folders, recursively
  • all notes which link to the note
  • all notes linked to by the note

Particularly when used in a "folder note" (a note which serves as the key note in any given folder), this is a simple way to create a kind of "section index" for that part of the folder hierarchy.

```dataview
TABLE rows.file.link AS Pages
WHERE
	(contains(file.folder, this.file.folder)
	OR contains(file.inlinks, this.file.link)
	OR contains(file.outlinks, this.file.link))
	AND file != this.file
GROUP BY file.folder AS Folder
```
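To make the selection logic explicit, here is a rough Python model of what the query does. It is a sketch, not how Dataview works internally. The `Note` class, its fields and the function `section_index` are assumptions invented for this illustration. The folder test mirrors `contains()` on strings, so it is a simple substring match.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Note:
    path: str
    folder: str
    inlinks: set = field(default_factory=set)   # paths of notes linking to this note
    outlinks: set = field(default_factory=set)  # paths of notes this note links to


def section_index(notes, current):
    """Group the notes related to `current` by folder, as the query above does."""
    groups = defaultdict(list)
    for note in notes:
        in_subtree = current.folder in note.folder           # same folder or a sub-folder
        linked_from_current = current.path in note.inlinks   # current links to this note
        links_to_current = current.path in note.outlinks     # this note links to current
        related = in_subtree or linked_from_current or links_to_current
        if related and note.path != current.path:
            groups[note.folder].append(note.path)
    return dict(groups)
```

A note is listed if any one of the three conditions holds, and the current note itself is always excluded.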

I attended the British Library's annual Open and Engaged conference on 2023-10-30, held in their conference centre in St Pancras, London. At the time, the British Library had just discovered that they had been subjected to a cyber attack (this is ongoing at the time of writing). Despite the ensuing disruption, with BL staff being unable to access their email or documents, and with the BL's internet access being offline, the staff there managed the remarkable feat of hosting the event with little evidence of the chaos in the background. I found the day interesting, and made the following notes from the various speakers' presentations.

COAR Infographic

I very much like this infographic from COAR. I've been working with COAR on the Next Generation Repositories Working Group, where we have been building a picture of a technological future for repository systems. As this work has progressed over the last year or so, it has become clear that there is an opportunity to describe a sustainable knowledge commons. While the group is assembling a picture of the technical components and protocols which can make this work, this infographic covers some other, non-technical aspects which will also be required.

I recommend taking a look at the document from which I have taken this image - it adds some useful context.