korben1 wrote: I would like to reduce the pdf file size of my documents... a lot of scans are color scans... at the moment I have between 2-3 M for a page.
I found this link:
http://en.flossmanuals.net/e-book-enlightenment/index/
They explain it's possible to reduce the file size by using a multi-layer technique. The text layer is stored in high quality, the images are downsampled and the background is highly compressed. But unfortunately I'm unable to find how I can make it, which software I have to use. Anybody has an idea ?
The multi-layer technique is commonly referred to as MRC for Mixed Raster Content. An image of a colour or grayscale page normally has a much larger file size than that of a black and white page, both because it requires a 24-bit or 8-bit depth image rather than a 1-bit depth image, and because black and white images can be stored particularly efficiently if an optimum compression method is used.
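To put rough numbers on the bit-depth point, here is a small sketch of the uncompressed size of one page at each depth. The page size (A4) and the 300 dpi resolution are assumptions chosen for illustration, and row padding is ignored, so the figures are approximate.

```python
# Uncompressed size of one scanned page at different bit depths.
# Assumed for illustration: A4 paper (8.27 x 11.69 in) scanned at 300 dpi.

def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Return the approximate uncompressed image size in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

for label, bits in [("colour", 24), ("grayscale", 8), ("black and white", 1)]:
    size = raw_page_bytes(8.27, 11.69, 300, bits)
    print(f"{label:>15}: {size / 1e6:.1f} MB")
```

Before any compression is applied the three depths are therefore in the ratio 24:8:1, which for this assumed page is roughly 26 MB, 9 MB and 1 MB. Compression then widens the gap further in favour of black and white.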
When a page contains both colour or grayscale content and black and white content, it is normally necessary to store the whole page as a colour (or grayscale) image even when much of the page area is actually black and white, with a resulting large file size. With MRC, the different types of content are separated and stored as separate layers, with the result that only areas of the page that need to be colour or grayscale are stored as such. There is therefore a double benefit, in that the area of the page that needs to be stored in colour or grayscale is reduced, while the remainder of the page can be stored very efficiently in black and white.
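The double benefit can be illustrated with a very simplified estimate. This is only a sketch: real MRC files use a mask plus foreground and background layers and compress each one, whereas the function below (a hypothetical helper, not part of any tool) just adds an uncompressed 1-bit mask for the whole page to a colour layer covering part of it. The 20% colour area and the halved colour resolution are assumed figures.

```python
# Rough uncompressed comparison of single-layer colour storage versus an
# MRC-style split. All figures are illustrative assumptions, not measurements.

def single_layer_bytes(page_px, bits=24):
    """Whole page stored as one colour image."""
    return page_px * bits // 8

def mrc_bytes(page_px, colour_fraction, colour_scale=0.5, bits=24):
    """1-bit mask over the whole page, plus a colour layer that covers only
    colour_fraction of the area and is downsampled by colour_scale per axis."""
    mask = page_px // 8
    colour_px = int(page_px * colour_fraction * colour_scale ** 2)
    return mask + colour_px * bits // 8

page_px = 2481 * 3507  # A4 at 300 dpi (assumed)
flat = single_layer_bytes(page_px)
layered = mrc_bytes(page_px, colour_fraction=0.2)
print(f"single layer: {flat / 1e6:.1f} MB, MRC-style: {layered / 1e6:.1f} MB")
```

The smaller the colour area, the closer the total gets to the size of a plain black and white page.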
The common file types that support multi-layer content are PDF and DjVu, the latter much less widely used but sometimes considered to produce smaller file sizes.
Note that Scantailor 'mixed-mode' (from memory...) converts text to pure black while leaving other image content unchanged, but the output file produced is still a single-layer image at a colour or grayscale bit depth, so there is no direct file size reduction. It is in effect an image enhancement process rather than a compression process.
For the moment I discovered an alternative, but I still don't understand why it works. I can scan at 600dpi. Then I downsample to 300dpi and then I resize the image of the file to 50%. If I print my scan, the size remain the same. And on my computer the quality I get is the same for me as the scan at 600 dpi. So I can reduce the file size a lot.
It isn't clear why you are obtaining that result. When file size is important, colour and grayscale images are normally best stored as JPEG files, although in principle the format isn't well suited to images containing sharp edges, such as text. The compression level or 'Quality' setting used each time the file is saved can have a considerable effect on the resulting file size, and it seems likely that the reduction you are seeing is either the result of the file being resaved at a lower setting, or some other unidentified effect.
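One part of the picture that can be quantified is the pixel count, which is separate from the JPEG setting. The sketch below assumes an A4 page and assumes that the dpi value stored in the file is changed along with the pixel dimensions; whether the software used actually does this isn't known, so treat it as an illustration rather than an explanation of the result.

```python
# Pixel count and printed size for the same page at two resolutions.
# Assumed for illustration: A4 paper (8.27 x 11.69 in).

def pixel_dims(width_in, height_in, dpi):
    """Pixel dimensions of a page scanned at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)

def print_size_in(width_px, height_px, dpi):
    """Printed size in inches when the file is tagged with the given dpi."""
    return width_px / dpi, height_px / dpi

w600, h600 = pixel_dims(8.27, 11.69, 600)
w300, h300 = pixel_dims(8.27, 11.69, 300)

print("pixel ratio:", (w600 * h600) / (w300 * h300))
print("print size at 600 dpi:", print_size_in(w600, h600, 600))
print("print size at 300 dpi:", print_size_in(w300, h300, 300))
```

Halving the resolution leaves a quarter of the pixels to store, while the printed size is unchanged as long as the dpi tag is halved too.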
When file size is important, JPEG images even of text can often be quite heavily compressed with little visible change in appearance; the strategy would normally be to try increasing the compression until visible deterioration is seen, and to then back off a bit. Over-compression typically produces compression artefacts in the form of faint gray smudges between the characters. It is as well to leave something in reserve, though, remembering that the resolution of screens is increasing and may increase further in the future.
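The "increase the compression, then back off a bit" strategy can be sketched as a simple loop. Everything named here is hypothetical: `encode` stands for whatever saves the page at a given quality (with Pillow it might wrap `Image.save(buffer, format="JPEG", quality=q)`), and `looks_ok` stands for your own judgement of the result, which in practice means looking at the page. The starting value, step and reserve are assumed defaults.

```python
def lowest_acceptable_quality(encode, looks_ok, start=95, stop=10, step=5, reserve=10):
    """Lower the quality until looks_ok fails, then back off by `reserve`.

    encode(quality) -> bytes and looks_ok(data) -> bool are hypothetical
    callables supplied by the caller. Returns None if nothing is acceptable.
    """
    last_good = None
    for quality in range(start, stop - 1, -step):
        if looks_ok(encode(quality)):
            last_good = quality
        else:
            break  # first visible deterioration: stop lowering
    if last_good is None:
        return None
    # Keep something in reserve rather than using the bare minimum.
    return min(start, last_good + reserve)

# Demonstration with stand-ins only: the fake encoder returns `quality`
# bytes, and the fake check accepts anything of 40 bytes or more.
fake_encode = lambda quality: bytes(quality)
fake_check = lambda data: len(data) >= 40
print(lowest_acceptable_quality(fake_encode, fake_check))
```

The `reserve` argument is the margin left for future, higher-resolution screens.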