Daniel Reetz, the founder of the DIY Book Scanner community, has recently started making videos of prototyping and shop tips. If you are tinkering with a book scanner (or any other project) in your home shop, these tips will come in handy. https://www.youtube.com/channel/UCn0gq8 ... g_8K1nfInQ
Search found 58 matches
- 31 Jan 2021, 10:51
- Forum: Scan Tailor
- Topic: Combining Split output into PDF
- Replies: 9
- Views: 658
Re: Combining Split output into PDF
What I don't understand, or rather don't know, is how to achieve the two steps after Scan Tailor: PDF size 50% JPEG -G4 and PDF at 150dpi scaling (I am on Linux). Can you explain what I have to do? What do you mean by 'PDF size 50% JPEG -G4'? Do you want to have JPEG compression applied to color ...
- 22 Jan 2021, 08:02
- Forum: Scan Tailor
- Topic: Combining Split output into PDF
- Replies: 9
- Views: 658
Re: Combining Split output into PDF
Recent versions of Adobe Acrobat do a good job with Scan Tailor output and create such 'optimized' PDF files, in which color and B&W content is segmented and compressed separately. The result seems to me to be similar to the ABBYY FineReader MRC compression model.
- 26 Nov 2020, 07:57
- Forum: Tutorials/How-To's
- Topic: How to convert a book to serchable pdf using open source software
- Replies: 28
- Views: 125561
Re: How to convert a book to serchable pdf using open source software
Actually, every book I have created with the described method has a color front cover and back cover. The contents between the covers are binarized (B&W). Color pictures could be added, but it would be necessary to manually convert them to an appropriate format, turn them into PDF, and afterwards insert ...
- 24 Nov 2020, 15:40
- Forum: Scan Tailor
- Topic: output dimensions ratio
- Replies: 2
- Views: 964
Re: output dimensions ratio
Information on the image dimensions is shown at the bottom, near the right-hand corner. Width and height can be adjusted at the 'Margins' stage. You can change the units by selecting 'Tools' and then 'Units'.
- 14 May 2020, 14:21
- Forum: Tutorials/How-To's
- Topic: How to convert a book to serchable pdf using open source software
- Replies: 28
- Views: 125561
Re: How to convert a book to serchable pdf using open source software
My bad. I was convinced that OCRmyPDF supports JBIG2, but apparently this applies only to regular PDFs.
- 06 May 2020, 09:34
- Forum: Tutorials/How-To's
- Topic: How to convert a book to serchable pdf using open source software
- Replies: 28
- Views: 125561
Re: How to convert a book to serchable pdf using open source software
Thank you for your comments and for sharing details of your workflow. Nice to see that someone found the thread I wrote useful. First I think the cover should be at the same size when scrolling the pdf file. I had some problems since I scanned the covers at higher resolutions. You are right. The scripts...
- 22 Feb 2020, 19:06
- Forum: Tutorials/How-To's
- Topic: From tiff-scans, ScanTailor and Tesseract to djvu-files - how?
- Replies: 2
- Views: 2299
Re: From tiff-scans, ScanTailor and Tesseract to djvu-files - how?
By far the most time-consuming part is the OCR. I am wondering whether the -j option of ocrodjvu (number of OCR threads) would speed this up. Is there a relation between the threads and the CPU cores? What number of threads would be meaningful (I have an AMD CPU with 6 cores and an NVIDIA GPU)? I gu...
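One way to reason about the thread count asked about above is to tie it to the number of CPU cores. The following is only a sketch: it assumes the OCR step is CPU-bound, the helper names `pick_jobs` and `ocrodjvu_command` are made up for illustration, and the `--in-place` option and file name are assumptions to be checked against your ocrodjvu version. Only `-j` comes from the post itself.

```python
import os


def pick_jobs(reserve=1):
    """Suggest a job count: all detected cores minus a few kept free for the desktop."""
    cores = os.cpu_count() or 1  # cpu_count() may return None on some systems
    return max(1, cores - reserve)


def ocrodjvu_command(djvu_path, jobs):
    """Build (but do not run) an ocrodjvu command line.

    '-j' is the option mentioned in the post; '--in-place' is an assumption.
    """
    return ["ocrodjvu", "-j", str(jobs), "--in-place", djvu_path]


# Example: print the command that would be used for a hypothetical file.
print(" ".join(ocrodjvu_command("book.djvu", pick_jobs())))
```

On a 6-core CPU this would suggest 5 jobs. Whether that is the fastest setting depends on the machine, so timing a few values on a short test file is the safer guide.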
- 20 Jan 2020, 18:15
- Forum: Scan Tailor
- Topic: What to do with a page with text and graphics
- Replies: 4
- Views: 2056
Re: What to do with a page with text and graphics
I attach some screenshots and hope you will find them useful. 1. Let's say there is a page with pictures and text (it doesn't matter that they are in grayscale; the same applies to color ones) Zrzut ekranu z 2020-01-17 22-00-59.png 2. In the "Output" stage: a. Change "Mode" from "Black and White" to "Mixed" ...
- 09 Jan 2020, 21:17
- Forum: Tutorials/How-To's
- Topic: How to convert a book to serchable pdf using open source software
- Replies: 28
- Views: 125561
Re: How to convert a book to serchable pdf using open source software
As you already wrote, it was a sorting problem caused by inconsistent file naming. BTW, this is not a Tesseract issue: Tesseract cannot process a batch of separate files directly, so a workaround is needed, namely creating a list of files in the right order for Tesseract to follow. This list was created ...
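The sorting problem described above can be illustrated with a short sketch. It is not the script from the thread: the file names are invented, and `natural_key` and `write_file_list` are hypothetical helpers. The idea is to sort names so that embedded numbers compare as numbers, then write one path per line to a list file.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so 'page10' sorts after 'page9'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def write_file_list(names, list_path):
    """Write the naturally sorted names, one per line, and return the ordered list."""
    ordered = sorted(names, key=natural_key)
    with open(list_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(ordered) + "\n")
    return ordered


# Invented example names with inconsistent zero padding.
pages = ["page10.tif", "page2.tif", "page1.tif"]
print(sorted(pages))                   # plain string sort puts page10 before page2
print(sorted(pages, key=natural_key))  # natural sort keeps the page order
```

The resulting text file can then be passed to Tesseract in place of a single image, assuming your Tesseract version accepts a list file as input.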
- 09 Jan 2020, 06:50
- Forum: Scan Tailor
- Topic: What to do with a page with text and graphics
- Replies: 4
- Views: 2056
Re: What to do with a page with text and graphics
Where there are pages with text and photos, it is possible to apply the "mixed output" (text areas are binarized but pictures remain in color). Picture areas should be selected and marked as picture zones, and the "Rectangular picture shape" mode from ST Advanced is really helpful for tha...