Here's an interesting discussion about why it's difficult to archive Facebook, at least with current web archiving tech.
FB's user interface is driven by HTTP POSTs to facebook.com/api/graphql/, so the usual web archiving crawlers (which typically discover URLs and GET them) won't work. Archiving bots or tools like Webrecorder that load the page and interact with the DOM have more luck recording.
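To make the problem concrete, here is a minimal sketch of why POSTs to a single endpoint are awkward for a URL-keyed archive. The request bodies, payload strings, and the `post_aware_key` helper are all made up for illustration; this is not how any particular archiving tool is implemented.

```python
import hashlib

GRAPHQL = "https://facebook.com/api/graphql/"

def url_only_key(method, url, body=b""):
    # Crawler-style index: the URL alone identifies a capture.
    return url

def post_aware_key(method, url, body=b""):
    # Hypothetical alternative: fold the method and a digest of the
    # request body into the key so distinct POSTs stay distinct.
    digest = hashlib.sha256(body).hexdigest()[:16]
    return f"{method} {url} {digest}"

def build_index(captures, key_fn):
    # Later captures overwrite earlier ones that share a key.
    index = {}
    for method, url, body, response in captures:
        index[key_fn(method, url, body)] = response
    return index

# Two invented requests to the same endpoint with different bodies.
captures = [
    ("POST", GRAPHQL, b"query=timeline", "timeline payload"),
    ("POST", GRAPHQL, b"query=comments", "comments payload"),
]

by_url = build_index(captures, url_only_key)
by_body = build_index(captures, post_aware_key)
print(len(by_url), len(by_body))
```

With the URL-only key the second capture overwrites the first, so only one response survives; keying on the body as well keeps both.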
@anj nice, thanks Andy! Is the fuzzy matching kind of like a Levenshtein distance type of thing? I guess I could take a peek in the code...
@edsu I’m not 100% sure TBH - lemme know what you find! 🙂
@anj it looks like fuzzy matching was started for HTTP GETs. Here are a bunch of rules that get compiled in:
Maybe I'm missing it or looking in the wrong place, but I don't actually see one for FB's GraphQL.
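For illustration only, rule-based fuzzy matching for GETs (as opposed to an edit-distance comparison) could look roughly like the sketch below. The rule, the `example.com` URLs, and the `fuzzy_key` helper are hypothetical and are not taken from the rules mentioned above.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical rules: a URL pattern plus the query parameters that
# matter for matching. Everything else (e.g. cache-busters) is ignored.
RULES = [
    (re.compile(r"^https?://example\.com/search"), ("q",)),
]

def fuzzy_key(url):
    # Reduce a URL to a lookup key using the first rule that applies.
    for pattern, keep in RULES:
        if pattern.search(url):
            parts = urlsplit(url)
            params = dict(parse_qsl(parts.query))
            kept = "&".join(f"{k}={params[k]}" for k in keep if k in params)
            return f"{parts.netloc}{parts.path}?{kept}"
    # No rule applies: fall back to exact matching.
    return url

a = fuzzy_key("https://example.com/search?q=warc&_=1700000001")
b = fuzzy_key("https://example.com/search?q=warc&_=1700000002")
c = fuzzy_key("https://example.com/search?q=wacz&_=1700000001")
print(a == b, a == c)
```

Under these assumptions, two requests that differ only in a timestamp-like parameter resolve to the same capture, while a different search term does not.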