Hi, I'm Martin. I am an archivist and computer scientist working on digital preservation at the State Archives of North Rhine-Westphalia (Germany). Before doing digital preservation exclusively I was a sysadmin and before that a "non-digital" archivist. I'm particularly interested in file formats and other tech stuff.

@mickylindlar Probably my ugliest code so far with all that array indexing ... 😞

I just completed "Treetop Tree House" - Day 8 - Advent of Code 2022 adventofcode.com/2022/day/8

DPC's Paul Wheatley on restrictive file format policies (I totally agree with him):
"We [...] almost certainly don’t have the resource to pre-emptively migrate in a dependable, well documented, accurate and verified manner. If we (the preservers) don’t have the resource, how can we expect the depositors to be able to afford it, never mind do a quality job, given that they will likely have little digipres expertise or tech?"
dpconline.org/blog/file-format

You know how the Eiffel Tower won the Grand Prize at the 1889 World Fair? Well, it had to share the glory with a book.

Not any book: A book ENTIRELY WOVEN IN SILK.

You heard right. And nerds, get this: All pages of this book were produced on the Jacquard loom in 1889, using thousands (200k-500k) of punch cards. Only 50-60 copies were made.

I would love to shake the hand of whoever wrote this file format specification.

File format dissection 

I uploaded a small revision of my JPEG image format dissection, along with a PDF version - and a minor bugfix.
https://github.com/corkami/pics

Really useful (also humbling) for us Mastodon newbs on the cultural differences between this place and that other place (via @dzorich): https://www.hughrundle.net/home-invasion/

The UNIX Pipe Card Game -- umm, this is the cutest?? I think I'd definitely incorporate this into learning if I were still teaching digipres students punkx.org/unix-pipe-game/

Character encoding 

I just discovered that there is a Japanese word for the garbled text that is the result of text being decoded using an unintended character encoding: Mojibake.

As someone with a diacritic in my name, I am all too used to Mojibaked versions of my family name.

More about it on Wikipedia: en.wikipedia.org/wiki/Mojibake
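
A minimal Python sketch of how Mojibake comes about, assuming the text was written as UTF-8 and then read as Latin-1 (a common pairing, but only one of many). The name "Müller" and the helper names `garble` and `repair` are made up for illustration.

```python
def garble(text: str, intended: str = "utf-8", wrong: str = "latin-1") -> str:
    """Encode text with the intended encoding, then decode it with the wrong one."""
    return text.encode(intended).decode(wrong)


def repair(garbled: str, intended: str = "utf-8", wrong: str = "latin-1") -> str:
    """Reverse garble(); this only works if no bytes were lost along the way."""
    return garbled.encode(wrong).decode(intended)


name = "Müller"  # hypothetical example name with a diacritic
broken = garble(name)
print(broken)          # MÃ¼ller
print(repair(broken))  # Müller
```

The "ü" becomes two characters because UTF-8 stores it as two bytes, and Latin-1 reads every byte as a character of its own.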

A proposal for an for

digipres.club/@anj/10931606075

Additional notes and context from me here:

github.com/orgs/digipres/discu

If you have a GitHub account, please add your notes & comments. If not, reply on Twitter or Mastodon with permission to translate to the discussion.

If you download your #Twitter archive, it arrives wrapped as a static HTML page, which is not very useful for doing anything else with. Worse, it requires the original account to still be active for useful things like enlarging the images, since they use t.co links.

So here's a #Python script to convert a Twitter archive to #markdown or other formats: https://github.com/timhutton/twitter-archive-parser

Now you can archive your tweets in any way you want.
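
For a rough idea of what such a conversion involves, here is a small Python sketch. It is not the linked script, just an illustration under assumptions: that the archive's data file is a JavaScript assignment like `window.YTD.tweets.part0 = [ ... ]`, and that each entry wraps a `tweet` object with `created_at` and `full_text` fields. The function names `parse_archive_js` and `to_markdown` are hypothetical.

```python
import json


def parse_archive_js(text: str) -> list:
    """Parse the contents of an archive data file.

    Assumption: the file is a JavaScript assignment whose right-hand side
    is a JSON array, so everything from the first "[" onward is valid JSON.
    """
    start = text.index("[")
    return json.loads(text[start:])


def to_markdown(entries: list) -> str:
    """Render each tweet as a Markdown list item: date, then text."""
    lines = []
    for entry in entries:
        tweet = entry["tweet"]  # assumed wrapper key
        text = tweet["full_text"].replace("\n", " ")
        lines.append(f"- **{tweet['created_at']}**: {text}")
    return "\n".join(lines)


# Made-up sample standing in for the contents of a real archive file.
sample = (
    'window.YTD.tweets.part0 = [{"tweet": {"created_at": '
    '"Thu Dec 08 10:00:00 +0000 2022", "full_text": "Hello\\narchive"}}]'
)
print(to_markdown(parse_archive_js(sample)))
```

A real archive would also need its images and t.co links handled, which this sketch leaves out.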
