Does anyone know of resources about working for Code for America? (Besides their own, of course.) Many years ago I applied for something there, but that was long before I had much tech experience. As I think about what I'd do if not libraries or an academic program, civic tech is high on my list of potential directions.
Because the repository I manage lacks some functionality but also scales really well, I always feel like an outlier in digipres conversations.
Topic: file format analysis
Many: we're doing dedicated analysis with metadata extraction tools and developing format plans
Me: we get some minimal metadata, one day we'll analyze it
Topic: research data
Many: we're not really responsible for it right now
Me: [tells a story about a 400 GB file called DATA.TAR]
I only follow one bot, and it's just old roadside pictures, but it never fails to bring me some joy
One thing I've learned from doing various types of technical work is that "detail-oriented" is too generic a descriptor. Some people are really good at spotting typos in code and some are really good at checking for data processing errors but there isn't necessarily an overlap between the two. Either skill is learnable, anyway.
Thinking about my attachment to using an old-school digital alarm clock as I drift off to sleep reminded me of a story from my youth: for about a year in college my alarm clock, at least 10 years old by then, stopped sounding the alarm at any time other than midnight. So every night when I got ready for bed I did the math backwards and reset the clock so it would show midnight when I wanted to get up.
Yes, I was a history major, thanks for asking.
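For anyone who wants to skip the bedtime math, here's a rough sketch of that backwards calculation in Python. The function names and the 24-hour "HH:MM" format are my own assumptions for illustration; the real clock only cared that it hit midnight at the right moment.

```python
def to_minutes(hhmm):
    """Convert an 'HH:MM' string (24-hour) to minutes after midnight."""
    hours, minutes = map(int, hhmm.split(":"))
    return hours * 60 + minutes


def clock_setting(bedtime, wake_time):
    """Return what the clock should read at bedtime so that it
    shows 00:00 at the real wake-up time.

    Both arguments are real times as 'HH:MM' strings (hypothetical format).
    """
    minutes_per_day = 24 * 60
    # How long from going to bed until the alarm should sound.
    until_wake = (to_minutes(wake_time) - to_minutes(bedtime)) % minutes_per_day
    # Count that far back from midnight on the clock face.
    setting = (-until_wake) % minutes_per_day
    return f"{setting // 60:02d}:{setting % 60:02d}"


# Going to bed at 23:00 and getting up at 07:00 means eight hours of sleep,
# so the clock gets set eight hours before midnight.
print(clock_setting("23:00", "07:00"))
```

The modulo does the wrap-around, so it doesn't matter whether bedtime falls before or after real midnight.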
Second Moderna dose hit me hard. Feels like a whole day disappeared as I tried to sleep through it. Probably wouldn't have been so bad if I had taken something for the symptoms, but I've been following the theory of letting the immune system work. Finally took Tylenol last night since that's supposed to be OK, and I slept better.
I'm glad I planned for a strong reaction but I wouldn't have minded being wrong.
I'm not an OCR or Python expert so here's how I evaluated it:
1. Linked from official documentation (not a guarantee of quality, but usually a good sign)
2. Active maintenance as indicated by the commit history (not just a readme update 3 years later or whatever)
3. Works on the sample they provided (you may be surprised how often people's own examples fail)
4. Worked on a sampling of files I have
This seems like a generally useful tool for working with ALTO XML, especially if you just want the actual plain text, so I'm reposting from a thread:
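To give a sense of what "just the plain text" means here, this is a minimal sketch using only Python's standard library. It is not the tool from the thread, just an illustration under some assumptions: that the words live in the `CONTENT` attribute of `String` elements inside `TextLine` elements (as in ALTO), and that ignoring hyphenation and reading order is acceptable. The function names and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(xml_string):
    """Return the text of an ALTO document, one output line per TextLine."""
    root = ET.fromstring(xml_string)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Each word is stored in the CONTENT attribute of a String element.
        words = [
            child.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return "\n".join(lines)


# A tiny hand-written sample, not real OCR output.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="Hello"/><SP/><String CONTENT="world"/></TextLine>
    <TextLine><String CONTENT="DATA.TAR"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

print(alto_to_text(SAMPLE))
```

A dedicated tool will handle things this sketch skips, like hyphenated words split across lines and multi-page files.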
"bio" : "basic input output"
Digital Archivist, posting mostly digipres-related content at this account, with some general spillover from work/life. (He/him)
digipres.club is a space for folks interested in productive conversations about, well, digital preservation! If you enjoy talking about how to do memory work with computers, or even with cardboard boxes of old photos, you belong with us on digipres.club. Many of us are/were Twitter users looking for an inclusive and community-supported approach to social media. If any of these things sound good to you, consider joining us now.