new project for converting a #warc file into #graphviz, implemented in #java and made available on #github - thanks to inspiration from @despens ...

Dear web archiving colleagues, is there a tool that you would recommend for creating a graphical site map? For instance, a tool that would crawl a site or ingest a WARC file and export graphviz, or SVG?
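For anyone wondering what the export half of such a tool might look like, here is a minimal sketch in Python. It assumes the link pairs (source page, target page) have already been pulled out of a crawl or a WARC file by some other step; the `to_dot` helper and the example.org URLs are hypothetical, made up for illustration. The output is plain Graphviz DOT text, which the `dot` command can then render to SVG.

```python
def to_dot(edges, name="sitemap"):
    """Render (source_url, target_url) pairs as a Graphviz DOT digraph."""

    def quote(value):
        # DOT string literals need backslashes and double quotes escaped.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"digraph {name} {{", "  rankdir=LR;"]
    seen = set()
    for src, dst in edges:
        # Skip repeated links so each edge is drawn only once.
        if (src, dst) in seen:
            continue
        seen.add((src, dst))
        lines.append(f"  {quote(src)} -> {quote(dst)};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical link pairs; a real tool would extract these from the crawl.
edges = [
    ("https://example.org/", "https://example.org/about"),
    ("https://example.org/", "https://example.org/blog"),
    ("https://example.org/blog", "https://example.org/about"),
    ("https://example.org/", "https://example.org/about"),  # duplicate link
]

dot = to_dot(edges)
print(dot)
```

Saved to a file such as `sitemap.dot`, the result can be rendered with `dot -Tsvg sitemap.dot -o sitemap.svg`.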

Doc-minded colleagues: does anyone know of good + accessible tools or libraries for creating help-text overlays or guided actions on a web site?

Like, something that can be integrated into a front-end client that enables a "tutorial" mode with scripted actions and text you can advance through to explain the interface

Follow-up to my earlier post about archiving and deleting my Facebook account, with suggestions on how I'd approach doing the same thing today

Watching in amazement as Ilya Kreymer talks at the Archiving the Black Web workshop about which lets you crawl a website with multiple headless browsers and take the collected data and view it with -- while letting you even view a screencast of what the browsers are doing.

Government Labels

In the lead-up to the 2020 US Presidential Election, Twitter implemented new labels for government officials, organizations and state-affiliated media accounts. This was a follow-on from their previous ban on state-backed political advertising in 2019.
By their own description Twitter apply these labels to:

Accounts of key government officials, including foreign ministers, institutional entities, ambassadors, official spokespeople, and key

Does anyone have a good workflow for converting articles & blog posts on the web to epub (preferred) or pdf for later reading on an ereader? #crosspost

Floppy archivist friends, looking for opinions: I have a floppy that has read errors on certain sectors with a cheap drive. Do I risk damaging it by retrying on a better-quality drive before I’m able to read it with a low-level drive like a Greaseweazle?

PDF processing and analysis with open-source tools (an attempt at working various PDF-related working notes and lists into something more structured)

Some thoughts about *that* talk _Why "Web Archiving" is No Longer a Useful Concept or Phrase_

Friends, what is the lossless RGB video codec du jour that ffmpeg can handle? I used to do qtrle, maybe there's a better way?

anyone have a favourite project management tool for tracking both software development + non-software dev tasks?

sh is a kinda genius replacement for Python's subprocess module:
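A rough side-by-side of what that looks like, as a sketch rather than a full tour. It assumes a POSIX system with an `echo` binary on the PATH, and that the third-party `sh` package may or may not be installed (hence the guarded import). The two helper function names are made up for this example.

```python
import subprocess


def echo_with_subprocess(text):
    """Run `echo` through the standard library and return its output."""
    result = subprocess.run(
        ["echo", text], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def echo_with_sh(text):
    """Run `echo` through sh, where each program is called like a function."""
    try:
        import sh  # third-party package, POSIX only
    except ImportError:
        return None  # sh is not available in this environment
    return str(sh.echo(text)).strip()


print(echo_with_subprocess("hello"))
print(echo_with_sh("hello"))
```

The appeal is mostly in the second function: the program name becomes an attribute lookup and its arguments become ordinary function arguments, with no argument list or `run` options to assemble.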

Does anyone happen to know what is and how it works? I've been noticing it popping up in my search results more and more lately, and am ... confused.

hello fediverse! I'm looking for openstreetmap users (or rather: editors) that would be willing to talk to me for a journalistic article (in the Luxembourgish leftist weekly "woxx") about their experience.

it doesn't matter if you edit #osm #openstreetmap a lot or only casually, I'd just be interested to hear different voices.

(you can also just reply to this toot and tell me why you started to edit osm/what's your motivation to do it/what problems you see)

boosts more than welcome

I'm just noticing that browsertrix-crawler added an experimental feature to provide a browser screencast of what web content is being crawled. I can see how this could be super useful for site-specific behaviors that scroll pages, and click on detail links, etc.

I have some exciting professional news to share!

The Sloan Foundation has generously funded "Collaborating on Software Archiving for Institutions" #CoSAI, with 3 streams of work:

1) technical development on open source, decentralized toolkit for software archiving & replay; 2) community building and education on software archiving; and 3) optimizing machine and human workflows

Psyched that this is a big collaborative project, feat. the one and only @wilkie

Read more:

