The issue of redactions in the Jeffrey Epstein documents, both redacting the names of victims and not redacting participants in Epstein’s “social circle,” has been in the news. There have also been questions about why it took so long for DOJ to release around 3.5 million pages of documents, plus about 2,000 videos and 180,000 images.
Working with the UCSF Library, I created the first collection of previously secret tobacco industry documents in 1995, which consisted of a few thousand pages of internal Brown and Williamson tobacco company documents.
Today, that collection has grown into the UCSF Industry Documents Library, which (as of January 29, 2026) contained 145 million pages in 27 million documents (including videos and photographs) from the tobacco, opioid, chemical, drug, food, and fossil fuel industries.
I asked the archivist who manages redactions in the collections how long it would take her to redact the names and personal information (like personal emails, Social Security numbers, and phone numbers) of victims in the 3.5 million pages DOJ has released so far, assuming she had a list of the victims’ names and that the files were born digital or at least scanned with good-quality optical character recognition (OCR) software.
She explained that this was a straightforward task that redaction software can handle. She said it would take a couple days, followed by time to conduct quality checks for accuracy.
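To make concrete what this kind of list-driven redaction involves, here is a minimal Python sketch. It is not the software that UCSF or DOJ uses; the function names, the mask text, and the patterns for emails, Social Security numbers, and US-style phone numbers are all illustrative assumptions, and it presumes the pages are already available as searchable text.

```python
import re

# Illustrative patterns only (assumptions for this sketch); commercial
# redaction tools use far more robust detection than these.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(
    r"(?<!\d)(?:\(\d{3}\)\s*|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
)


def build_name_pattern(names):
    """Compile one case-insensitive pattern matching any listed name."""
    # Longest names first, so a full name wins over a shorter name it contains.
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def redact(text, names, mask="[REDACTED]"):
    """Mask listed names, then emails, SSNs, and phone numbers in text."""
    if names:
        text = build_name_pattern(names).sub(mask, text)
    for pattern in (EMAIL_RE, SSN_RE, PHONE_RE):
        text = pattern.sub(mask, text)
    return text


if __name__ == "__main__":
    sample = "Jane Doe (jane.doe@example.com, 555-123-4567) met him."
    print(redact(sample, ["Jane Doe"]))
```

Even a sketch like this shows why the automated pass is quick and why the quality checks afterward still matter: exact matching will miss misspellings, nicknames, and OCR errors.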
Depending on their quality and format, videos and photos might take a bit longer. Having photos of the victims to train the software would speed the process.
She also told me that the UCSF Library does not subscribe to the most powerful tier of redaction software available, which DOJ would likely have. That software would almost certainly process videos and photos faster than UCSF could check them.
Given the quality of modern software, there was no excuse for failing to redact all instances of known victims’ material.
Reviewing the documents to make sure that everyone in Epstein’s “social circle” was disclosed while appropriate details were redacted would take a little longer, since the software’s results would have to be reviewed. She thought it would take her and one assistant less than two months.
If the DOJ had “hundreds” of lawyers and others working on it, they should have had no problem meeting the Epstein Files Transparency Act’s 30-day deadline of December 19, 2025.
Given the power of redaction software, she also questioned the need to redact large blocks of text.
The bottom line: The victims, their lawyers, Congress, the media and the public should demand that the current DOJ redactions be independently reviewed and that the remaining Epstein documents (reported to be around 3 million more pages) be properly redacted and released quickly.
They should also talk to archivists who are used to handling such collections, to lawyers and paralegals at law firms (and DOJ), and to redaction software companies, all of whom routinely deal with such issues.
While 3 or 6 million pages sounds like a lot, it’s not. Compared with what we have been dealing with at UCSF, the Epstein files are a small collection.
WOW – a very good blog, thank you Stan.
CB
Very interesting perspective, thank you!