For those in search of some next-level digital preservation weirdness, look no further than my review of the ISO/IEC TS 22424 standard on EPUB 3 preservation bitsgalore.org/2020/04/30/iso-

Does Microsoft OneDrive export large ZIP files that are corrupt? UPDATED version of my earlier blog: it turns out that one unexpected ZIP64 field value results in ZIP files that are unreadable by most extraction tools and libraries bitsgalore.org/2020/03/11/does
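For anyone who wants to look at the ZIP64 fields of a suspect file themselves, here is a minimal Python sketch that reads the "zip64 end of central directory locator" near the end of an archive. The field layout follows the public ZIP specification. The function names and dictionary keys are made up for illustration, and the sketch does not assume which field is the problematic one.

```python
import struct

# Signature 0x07064b50, stored little-endian, marks the ZIP64 EOCD locator.
LOCATOR_SIG = b"PK\x06\x07"
# signature, disk holding the ZIP64 EOCD record, offset of that record, total disks
LOCATOR_FORMAT = "<4sIQI"
LOCATOR_SIZE = struct.calcsize(LOCATOR_FORMAT)  # 20 bytes


def parse_zip64_locator(tail: bytes):
    """Return the locator fields found in `tail`, or None if no locator is present."""
    pos = tail.rfind(LOCATOR_SIG)
    if pos == -1 or pos + LOCATOR_SIZE > len(tail):
        return None
    _, disk, offset, total_disks = struct.unpack_from(LOCATOR_FORMAT, tail, pos)
    return {
        "disk_with_eocd64": disk,
        "eocd64_offset": offset,
        "total_disks": total_disks,
    }


def read_zip64_locator(path: str):
    """Read the end of a file (hypothetical helper) and parse the locator from it."""
    # The locator sits just before the 22-byte EOCD record plus an optional
    # comment of at most 65535 bytes, so this tail is always large enough.
    tail_size = 65535 + 22 + LOCATOR_SIZE
    with open(path, "rb") as f:
        f.seek(0, 2)
        size = f.tell()
        f.seek(max(0, size - tail_size))
        return parse_zip64_locator(f.read())
```

Searching for the signature with `rfind` is a shortcut: an archive comment that happens to contain the same four bytes would confuse it, so treat the result as a quick diagnostic rather than a validator.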

My father’s hammer, and a small story of maintenance and digital preservation: blogs.bl.uk/webarchive/2020/03

@mtnsic Oh, that's really interesting, might give this a try at some point. Very strange though, since as far as I'm aware most open-source zip libraries have supported ZIP64 for ages, which makes me wonder what those cloud providers are using instead.

Does Microsoft OneDrive export large ZIP files that are corrupt? New blog: bitsgalore.org/2020/03/11/does (if anyone has any more info or test results on this I'd love to know!)

Really useful blog post on how to add tracker-free commenting functionality to static Jekyll web sites using GitHub issues aristath.github.io/blog/static (I can confirm this actually works, as I used it for my own blog at bitsgalore.org/)

@andrewjbtw Like all Java applications I'm aware of, initialization of the Java VM is a pain if you want to process large numbers of files, but you can get around that by running the Tika server application. Once that's fired up, you can then process individual files using HTTP requests, which is pretty fast. See here for more info + some examples: cwiki.apache.org/confluence/di
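A minimal Python sketch of the HTTP approach, assuming a Tika server is already running locally on its default port (9998) and that plain-text extraction via the `/tika` endpoint is what you want. The function names are illustrative only.

```python
import urllib.request

# Assumption: Tika server started separately and listening on its default port.
TIKA_URL = "http://localhost:9998"


def build_tika_request(data: bytes, endpoint: str = "tika",
                       base_url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that sends raw file bytes to a Tika server endpoint."""
    return urllib.request.Request(
        f"{base_url}/{endpoint}",
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path: str) -> str:
    """Send one file to the running server and return the extracted text."""
    with open(path, "rb") as f:
        request = build_tika_request(f.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Because the JVM starts only once (with the server), calling `extract_text` in a loop over many files avoids the per-file startup cost described above.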

@andrewjbtw FWIW Apache Tika is pretty great for a wide range of (document and office) formats, including PDF tika.apache.org/ It is a Java tool though. Any reason in particular why you're excluding Java tools?

We just published the jpylyzer 2 release candidate, which includes (among other things) support for raw codestream validation. More info here: openpreservation.org/news/jpyl

Here's a weird and slightly spooky 1-minute trailer(!) I made for the upcoming jpylyzer 2.0.0 release (don't miss out on the audio on this one) youtu.be/gIutpFxGy28

Here's a video of the presentation I did on recovering '90s data tapes at (audio is not from the actual presentation, but from a rehearsal I did a day earlier) vimeo.com/362061955

Recovering '90s Data Tapes - Experiences From the KB Web Archaeology project bitsgalore.org/2019/09/09/reco Web-friendly version of the paper I wrote for the upcoming conference in Amsterdam. Contains links to lots of tape-related resources

@The_BFOOL Can't you just set up an autoforward in gmail (explained here lifewire.com/how-to-forward-yo) to your ProtonMail address? Then always use your ProtonMail address for sending/replying, so over time people will stop using the gmail one.

@ashley @The_BFOOL Funny thing is I used to be terrified of any GUI-related coding, and I've actively avoided going anywhere near it for almost 20 years!

Attention jpylyzer users - I'm considering some changes to jpylyzer's output format for an upcoming 2.0 release in November. See this note for details: gist.github.com/bitsgalore/300 If you have any comments or suggestions just let me know!

One for the floppy crowd - a simple disk imaging workflow tool (basically a simple graphical front-end around dd and ddrescue) bitsgalore.org/2019/04/10/a-si
