Anyway, I hope webrecorder desktop clears up most of the hassle. There was literally no other way I could think of to get posts and comments from my account.

Facebook's own download function was useful mainly to get me the URLs for my posts, which I fed into webrecorder to take the uncertainty out of the news feed algorithm and make sure I really got every post.

When I was done, I changed my Facebook password to be safe. Towards the end of the process, which was a full weekend of clicking, Facebook started treating my behavior as suspicious and I was having to sign in and complete captchas every few minutes.

Access to those WARCs isn't great: I don't really have a search and I have to spin up a VM to serve them, but the rendering with pywb is remarkably faithful to vintage 2016 Facebook.

Years ago, when I was about to delete my Facebook account, webrecorder was architecturally a bit different and I was routinely in and out of virtual machines, so I set up my own webrecorder VM to capture just my own Facebook to WARCs.

The fact that it was possible to do this without knowing the internals of how webrecorder worked has always impressed me. It was a little weird to have all the account management functionality on my own locally hosted version, but I just ignored it.

I don't have anything to capture with webrecorder at the moment but it's interesting to see the mention here that the server version could potentially capture passwords.

blog.webrecorder.io/2019/10/28

I'd always wondered about that and avoided the hosted version for anything requiring a login.

This San Jose State MS admissions FAQ is unintentionally revealing of not just the types of questions they must get, but also the phrasing:

Q: I need to know NOW whether you will accept me later, so that I can decide whether I should try applying at a different department or university.

Q: I was accepted with a condition, but I don't like the condition. Can I negotiate to have it waived or changed?

sjsu.edu/cs/programs/mscs/faq/

I know that this program to get a CS BS for people who already have a bachelor's degree exists

ecampus.oregonstate.edu/online

and even predates the MOOC craze. But it's basically at least a two-year degree if you also have a job.

Trying to figure out if graduate work in computer science is:

a) possible without a CS undergrad degree
b) something that would be useful as a step towards doing research around digital preservation and/or digital archives

Even if not (b), I still wonder about (a).

I forgot AMIA-L is a moderated list. I guess that's probably not a bad thing. I was just confused by Gmail's filters putting my sent message in my AMIA folder.

Not entirely related, it took a while for me to remember that moderation online generally used to mean identifiable people visibly doing moderation things, unlike today's hellscape of deliberately obfuscated labor. I need to read yalebooks.yale.edu/book/978030

I do think there are many benefits to thinking about storage as its own thing. It becomes a problem when you think it's the whole thing.

Two ideas I've found harmful in digital preservation:

1. "digital preservation" can be reduced to how you set up your storage

2. "digital preservation" can be considered separate from how the stuff you store gets into storage

Maybe that's just one thing.

predominantly white supremacists at predominantly white university campuses 

I was in such a rush to unsubscribe from the Internet History mailing list, which I subscribed to too late to see at its worst, that I initially deleted the confirmation email with the link I needed to click to finish unsubscribing.

Second time in my digital preservation career I've seen a machine intermittently fail to generate the correct checksums. As in, if you checksum the same file multiple times, you will occasionally see a wrong result.

The other time I saw this, hardware was failing. This is why you check checksums.
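For anyone who wants to try this kind of check, here is a rough sketch of what "checksum the same file multiple times" could look like. It is not the tooling from either incident; the function names, the choice of SHA-256, and the default of five runs are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20):
    """Hash one file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_checksums(path, runs=5):
    """Hash the same file `runs` times and count how often each digest appears.

    `runs` is an arbitrary illustrative default, not a recommendation.
    """
    return Counter(sha256_of(path) for _ in range(runs))


def is_consistent(path, runs=5):
    """True if every run produced the same digest."""
    return len(repeated_checksums(path, runs)) == 1
```

On a healthy machine `is_consistent` should always be true. More than one distinct digest for an unchanged file suggests the machine itself is the problem, not the file. Repeated reads are often served from the operating system's cache, so a check like this probably exercises memory and CPU more than the disk.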

andrew boosted

Here's a weird and slightly spooky 1-minute trailer(!) I made for the upcoming jpylyzer 2.0.0 release (don't miss out on the audio on this one) youtu.be/gIutpFxGy28

I really want someone to publish a paper with the title: "The Old Flickr Versions More"

Signs you are working with custom-built in-house software:

it's easy to do one thing at a time but not a batch

you can do batches but the batch interface is totally different from one at a time

you can add data in a batch, but you can't make changes in a batch

you can add data in a batch, but you can't delete in a batch

you can add data but you can't delete

if there's an administrative interface, it was clearly built after the public-facing interface, and probably by different people

Fedora Commons-based repository adoption and then (potentially) non-adoption would make an interesting study in organizational development.

I'm sorry I'm no longer in a BitCurator-adjacent position, because last year I went to the BitCurator Users Forum (it was within driving distance of my parents' house, making it affordable without institutional support) and it was probably the best conference I've attended in years.

It makes me think what we were doing at the museum was a lot more on the edge of things than I'd thought.

blog.suppliedtitle.org/2019/02

