n1gr3d0 t1_j8qt5x3 wrote on February 16, 2023 at 7:47 AM

Reply to comment by SketchyApothecary in TIL that back in 2013, Xerox had scanners that would randomly change numbers after scanning a document. by COMPUTER1313

The fun part is that it was not about recognition. Scanning shouldn't do any OCR, so in that context any meaningful character manipulation (like replacing one character with another) looks shady as hell. Thankfully it turned out to be just an overzealous compression algorithm.

surelythisisfree t1_j8rdhnl wrote on February 16, 2023 at 12:17 PM

Most copier manufacturers have their own PDF compression that generally puts a scanned page between 50 and 150 kB, down from about a meg if they don’t do anything fancy. I only realised after years of working with them that it isolates things that look like letters and basically averages each letter's representations on the page to allow better compression. The only reason I realised was a bug in a released firmware that only affected compact PDF (it was pulled within 24 hours). The bug basically made the letters not line up in a row on each page, so they’d move up and down the line a bit.
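
To make that concrete, here is a toy sketch of the general idea, not any vendor's actual algorithm. Everything in it is made up for illustration: the 3x5 bitmaps, the `compress` function, and the `threshold` parameter. It also reuses the first matching glyph as the representative instead of averaging, which is simpler but shows the same failure mode: once the match tolerance is too loose, a 6 gets stored as a reference to an 8.

```python
# Toy symbol-matching compression (hypothetical sketch, not a real codec).
# Glyphs are tiny binary bitmaps: '#' is ink, '.' is paper.
EIGHT = ("###", "#.#", "###", "#.#", "###")
SIX = ("###", "#..", "###", "#.#", "###")


def hamming(a, b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def compress(glyphs, threshold):
    """Store each distinct-looking glyph once and reference it afterwards.

    A glyph counts as 'the same' as a stored one when at most `threshold`
    pixels differ. threshold=0 keeps every distinct shape; a larger value
    merges shapes that merely look alike.
    """
    dictionary = []  # representative bitmaps, stored once
    refs = []        # for each input glyph, an index into dictionary
    for g in glyphs:
        for i, rep in enumerate(dictionary):
            if hamming(g, rep) <= threshold:
                refs.append(i)
                break
        else:
            dictionary.append(g)
            refs.append(len(dictionary) - 1)
    return dictionary, refs


page = [EIGHT, SIX, EIGHT]
exact = compress(page, threshold=0)  # two symbols kept, refs [0, 1, 0]
loose = compress(page, threshold=1)  # one symbol kept, refs [0, 0, 0]
```

With `threshold=1` the page decodes as 8, 8, 8: the 6 differs from the 8 by a single pixel in this toy font, so it silently becomes an 8.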

Gathorall t1_j8r7dsg wrote on February 16, 2023 at 11:06 AM

That tracks. It's orders of magnitude easier (though it shouldn't matter nowadays) to tell the head to put "8 in black" in a certain spot than to specify the precise location and color of every constituent dot.
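
Rough back-of-the-envelope numbers, all of them assumptions picked for illustration (a 30x40 pixel glyph at 1 bit per pixel, a 2-byte symbol index, 2-byte x and y coordinates); the helper names are hypothetical too.

```python
# Illustrative size comparison: raw dots vs. "symbol N at (x, y)".
def raw_bytes(width_px, height_px):
    """Bytes needed to store a 1-bit-per-pixel bitmap, rounded up."""
    return (width_px * height_px + 7) // 8


def ref_bytes(index_bytes=2, coord_bytes=2):
    """Bytes needed for a symbol index plus an x and a y coordinate."""
    return index_bytes + 2 * coord_bytes


raw = raw_bytes(30, 40)  # 150 bytes for every dot of one glyph
ref = ref_bytes()        # 6 bytes to say "symbol N at (x, y)"
```

Under those assumptions, each repeated glyph costs 6 bytes instead of 150, before any further compression of either form.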