abbyy 12 - how to clean up and center pages after split?


Post by glenleslie » 23 Aug 2020, 13:22

I scanned a few pages of a book in 2-up format.

abbyy 12 pro does a great job of splitting the pages automatically.

Once they're split, I don't see a way to:

1. remove page edges (Page edge detection is in the PDF output tab)
2. center the recognized text - abbyy is seeing all the text on the split pages but it's keeping it in its exact original position on the split page (biased towards the center margin of the original book). Is there some way to force it to center the recognized text on the output page?

example here: http://oldsaw.org/temp/test.pdf

Re: abbyy 12 - how to clean up and center pages after split?

Post by BruceG » 24 Aug 2020, 05:55

I am using FineReader 15 at the moment. I do not need pages to be centred, so I have not looked into how to do it before. Looking now, I have not found a way either. A workaround would be to use cropping: crop the odd pages, then crop the even pages, if you want to use just 2 steps. The pages would need to be scanned consistently, with the same white space, for this to work 100%. The book would likewise need to be printed with the same white space on all odd pages and on all even pages. The other way is to crop pages individually and do the best you can with the scans. It doesn't take that long.
I am either making ebooks, so only the text is saved, or retaining the look of the newspaper, magazine or book.
I also use Sprint12 and did not find a way to centre text.
I find Sprint12 useful in straightening up a page before using Finereader. I also turn off auto page splitting in Image Processing when opening a new image as I have found tables being split into 2 pages when it is only one.
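
To make the odd/even cropping workaround concrete, here is a rough sketch of the arithmetic in Python. It is not anything FineReader exposes. The function name, the parameter names, and the assumption that odd pages are right-hand pages (gutter on the left) are mine, purely for illustration.

```python
def crop_box(page_number, width, height, trim_gutter, trim_outer, trim_top, trim_bottom):
    """Return a (left, top, right, bottom) crop box for one split page.

    Assumption: odd-numbered pages are right-hand pages, so their gutter
    (the inner margin of the book) is on the left; even pages are mirrored.
    All values are in pixels; right and bottom are exclusive.
    """
    if page_number % 2 == 1:
        left, right = trim_gutter, width - trim_outer
    else:
        left, right = trim_outer, width - trim_gutter
    return (left, trim_top, right, height - trim_bottom)


# Hypothetical numbers: trim more from the outer edge than from the gutter,
# because the text sits closer to the gutter after the split.
odd_box = crop_box(1, 1000, 1500, 20, 120, 50, 50)
even_box = crop_box(2, 1000, 1500, 20, 120, 50, 50)
```

Both boxes come out the same size, so the pages stay a constant size after cropping. This only works if the white space really is consistent across all odd pages and across all even pages.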

Re: abbyy 12 - how to clean up and center pages after split?

Post by cday » 29 Aug 2020, 15:17

I have Abbyy FineReader Pro 12 although I don't use it much now, and in the absence of any replies since BruceG's above I have fired up my Windows computer and taken a look at the tools available.

When I view scanned book pages on the screen, I like the text on successive pages to remain centered on the page, the page to be a constant size, and as far as possible the text to have a constant top margin unless there is a reason for not using one. That avoids the text jumping around or moving noticeably on the screen as successive pages are viewed.

Looking in the FineReader 'Page' menu, in the 'Edit Image...' option there is a crop tool available which enables a selection of a specified size to be set up, then moved using the mouse to the desired position on the image to make the crop. The crop size set should be remembered when the next page is selected, so it should be possible to crop successive pages quickly to a fixed size with the text in the desired position on the page. Ideally, it would also be possible to set up vertical and horizontal guidelines to aid accurate positioning on the page.
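
For anyone who would rather calculate the crop position than place it by eye, the idea can be sketched in a few lines of Python. This is only an illustration, not how FineReader works internally. It assumes the page is available as rows of grayscale values (0 = black, 255 = white), and the function names and the threshold of 128 are my own choices.

```python
def content_bbox(pixels, threshold=128):
    """Bounding box (left, top, right, bottom) of all pixels darker than
    threshold, with right and bottom exclusive. Returns None for a blank page."""
    rows = [y for y, row in enumerate(pixels) if any(v < threshold for v in row)]
    if not rows:
        return None
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] < threshold for row in pixels)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def centered_crop_origin(bbox, crop_w, crop_h, top_margin=None):
    """Top-left corner of a crop_w x crop_h window placed so the content is
    centered horizontally. Vertically the content is centered too, unless
    top_margin is given, in which case the text starts that far from the top."""
    left, top, right, bottom = bbox
    x0 = left - (crop_w - (right - left)) // 2
    if top_margin is None:
        y0 = top - (crop_h - (bottom - top)) // 2
    else:
        y0 = top - top_margin
    return (x0, y0)


# Tiny made-up page: 6 rows x 10 columns, with "text" at rows 2-3, columns 5-8.
page = [[255] * 10 for _ in range(6)]
for y in (2, 3):
    for x in range(5, 9):
        page[y][x] = 0

box = content_bbox(page)
origin = centered_crop_origin(box, crop_w=6, crop_h=4)
```

The origin can come out negative or run past the image when the crop window is larger than the space around the text, in which case the page would need padding with white. Dark page edges would also be picked up as content, so they would have to be erased before measuring.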

The 'Edit Image...' menu also has an eraser tool, which could be used when needed after the crop to quickly erase any unwanted page edges, and optionally also any stray dark marks on the page.

I notice the 'Edit Image...' tools also include a Levels tool, which if you are not already familiar with it can provide an easy way to enhance scans, both by enabling any gray background to be reduced or eliminated, and by enabling text to be darkened slightly when needed.
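
If you have not used a Levels tool before, the basic idea is a linear stretch between a chosen black point and white point. The sketch below shows the generic mapping for a single grayscale value. I am not claiming this is FineReader's exact implementation, and the function name and the sample numbers are just for illustration.

```python
def apply_levels(value, black_point, white_point):
    """Map a grayscale value (0-255) so that black_point becomes 0 and
    white_point becomes 255, clipping everything outside that range."""
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    if value <= black_point:
        return 0
    if value >= white_point:
        return 255
    return (value - black_point) * 255 // (white_point - black_point)


# Hypothetical scan: the paper reads as light gray (220), the text as dark gray (40).
paper = apply_levels(220, 40, 200)   # gray background pushed to pure white
ink = apply_levels(40, 40, 200)      # text pushed to pure black
midtone = apply_levels(104, 40, 200)
```

Lowering the white point is what removes a gray background, and raising the black point is what darkens the text.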

I had expected someone to suggest you look at the latest ScanTailor versions (plural) which I suspect might also provide a viable and possibly more automated solution, but if you are using FineReader that would seem to provide everything you need to easily get the results you desire. Maybe a future version of FineReader will also include an option to automatically centre the text on successive pages. ;)

