I would love to shake the hand of whoever wrote this file format specification.

File format dissection 

I uploaded a small revision of my JPEG image format dissection, along with a PDF version - and a minor bugfix.

Really useful (also humbling) for us Mastodon newbs on the cultural differences between this place and that other place (via @dzorich)

The UNIX Pipe Card Game -- umm, this is the cutest?? I think I'd definitely incorporate this into learning if I were still teaching digipres students

Character encoding 

I just discovered that there is a Japanese word for the garbled text that is the result of text being decoded using an unintended character encoding: Mojibake.

As someone with a diacritic in my name, I am all too used to Mojibaked versions of my family name.
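For anyone who wants to see Mojibake happen, here is a minimal sketch in Python. It assumes the most common case I know of: text that was written as UTF-8 but read back as Latin-1. The name "Müller" and the function names `garble` and `repair` are made up for illustration.

```python
def garble(text: str, intended: str = "utf-8", wrong: str = "latin-1") -> str:
    """Encode text with the intended encoding, then decode it with the wrong one."""
    return text.encode(intended).decode(wrong)


def repair(garbled: str, intended: str = "utf-8", wrong: str = "latin-1") -> str:
    """Undo garble() by reversing the two steps (only works if no bytes were lost)."""
    return garbled.encode(wrong).decode(intended)


name = "Müller"            # hypothetical name with a diacritic
broken = garble(name)      # the "ü" becomes two unrelated characters
print(broken)              # MÃ¼ller
print(repair(broken))      # Müller
```

The repair only works when you can guess which two encodings were mixed up and nothing was dropped or replaced along the way, which in practice is not always the case.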

More about it on Wikipedia:

Additional notes and context from me here:

If you have GitHub please add your notes & comments. If not, reply on Twitter or Mastodon with permission to translate to the discussion.

If you download your #Twitter archive, it arrives wrapped as a static HTML page, which is not very useful for doing anything with. Worse, it requires the original account to still be active to do useful things like enlarging the images, since they use links.

So here's a #Python script to convert a Twitter archive to #markdown or other formats:

Now you can archive your tweets in any way you want.
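To give a rough idea of what such a conversion involves, here is a small sketch. It is not the script mentioned above, just an illustration under assumptions: that the archive stores tweets in a JavaScript file whose content is a JSON array behind an assignment like `window.YTD.tweets.part0 = `, and that each entry has `created_at` and `full_text` fields. The function names and the sample data are made up.

```python
import json
from datetime import datetime


def parse_archive_js(raw: str) -> list:
    """Drop the assumed JavaScript assignment prefix and parse the JSON array."""
    start = raw.index("[")  # everything before the first "[" is treated as prefix
    return json.loads(raw[start:])


def tweet_to_markdown(entry: dict) -> str:
    """Render one archive entry as a small Markdown section."""
    tweet = entry.get("tweet", entry)  # assumed: entries wrap the data in "tweet"
    created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
    heading = created.strftime("%Y-%m-%d %H:%M")
    return f"### {heading}\n\n{tweet['full_text']}\n"


# Made-up sample in the assumed archive layout
sample = (
    'window.YTD.tweets.part0 = [ {"tweet": {'
    '"created_at": "Wed Oct 10 20:19:24 +0000 2018", '
    '"full_text": "Hello, archive!"}} ]'
)

for entry in parse_archive_js(sample):
    print(tweet_to_markdown(entry))
```

A real converter would also need to handle media files, shortened links and replies, which this sketch leaves out.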

Hi, I'm Martin. I am an archivist and computer scientist working on digital preservation at the State Archives of North Rhine-Westphalia (Germany). Before doing digital preservation exclusively, I was a sysadmin, and before that a "non-digital" archivist. I'm particularly interested in file formats and other tech stuff.

