For the project I'm currently working on, I needed to rescale some images automatically. As the big fan of Python that I am, I chose to use PIL (Python Imaging Library) to do the image processing. Everything would have been fine except for the small fact that the scaling can only be specified by the target image size in pixels (which needs to be integral) and not by a scaling factor, which in theory could be fractional.
Saturday, January 14. 2012
Arbitrary scaling of images with integral sizes
For example, we can scale a 100x100px to 50% its size with no problem; the resulting image would be 50x50px in size. But if we wanted to scale it down to 1/3rd of its size, the correct size would then be ~33.33 pixels on each side, so to have integer pixels, we'd need to choose either 33 or 34 pixels.
It seems logical to pick 33, as it is much closer to 100/3 and thus, the error should be smaller. The image would then be <# \frac{33 - 100/3}{100/3} = - \frac{1}{100} #>, or 1% too small. Just for comparison, if we choose 34, it would be <# \frac{34 - 100/3}{100/3} = \frac{1}{50} #> or 2% too large.
When using integer coordinates and rounding to the nearest integer, the maximum error is never greater than 0.5 pixels, and the smaller the final image the larger the maximum relative error. For some purposes, such error might acceptable, but for my project I needed more precision and I didn't want to look for alternatives to PIL.
Method
The solution is simple, and the idea should become clear with an example: If we want to scale an image down to 1/n of its size, with n integer, all we need to do is make sure the original image has dimensions which are divisible by n. That can be done by embedding (pasting) the original image onto one which is just a little bit bigger, scaling down, and finally, cropping the image.
Of course, things get a little more interesting if the scaling factor is not of the form 1/n, but the idea is the same, even when the scaling is > 1: embed, resize and crop
Analysis
As both dimensions (horizontal and vertical) are independent, I'll discuss only the problem of scaling a single dimension. The same procedure applies to the other dimension.
If we want to scale an image with a side of X pixels by a scaling factor F, we need the algorithm to calculate E (extended size) and S (scaled size). All dimensions must be integer values <# X,E,S \in \mathbb{N} #>. The scaling factor F can be any positive real value [1], while the actual scaling factor F' will be a rational approximation to F given by F' <# = \frac{S}{E} \approx #> F
We would also like to impose the following restrictions, particularly on E:
- it should be bounded: we need to set a maximum size for E because otherwise E could be very large for some values of F, as F' approaches F. We'll denote that high bound on E as B, determined by a "maximum growth factor" q. On the lower bound we don't want E to be smaller than X, after all, we want to embed the image in a larger one, not to lose information by cropping it.
\[ B = q X \]
\[ X \leq E \leq B \]
- it should be optimal: F' (that is, <# \frac{S}{E} #> ) should be the best approximation to F for all (allowed) values of E and S.
Algorithm
To obtain E and S, we can use continued fractions, or more precisely, the convergents obtained from the continued fraction expansion of F.
Let (NumN DenN) = CN(F) be the numerator and denominator of the Nth convergent of F, for N = 1,2,3... up to a possible Nmax
To get the embedding and scaling sizes, we can use the following algorithm:
- Take the first convergent C1(F) and store Num1 → S and Den1 → E.
- Scale both S and E by the smallest integer factor such that E <# \geq #> X, and keep this as our first approximation. Note E might now be larger than B, but hopefully we'll have a better (lower E) later on.
- It there are more convergents, then take the next, otherwise stop
- If DenN is greater than B then stop.
- Scale the numerator and denominator as we did with E and S in 2. If the denominator is lower than or equal to B or if it isn't but it is still lower than the previous E, then store the numerator and denominator in S and E.
- keep repeating from step 3
With S and E we can embed the image and scale it. After that, we might just need to crop it. The cropping size should then be simply C = <# \lceil #> X F <# \rceil #>
Nicely enough, the algorithm works both for scaling up or down. The value of q should be picked to give enough margin so the algorithm can find a good denominator. A value of 2 should give good results for most purposes.
Here's a sample implementation of the algorithm: int_scaling.py
[1] In practice, F can never be irrational because of the way numbers are represented digitally. For example, in a IEEE-754 32-bit float, <# \pi #> is precise to ~7.6 digits, with the actual stored value = 3.1415927410125732421875 which is exactly equal to <# \frac{13176795}{4194304} #>. However, it wouldn't be practical to use 4194304 as E. If we wanted to scale a 100x100px image by a factor of <# \pi #>, using convergents we would get a more practical approximation of <# \frac{355}{113} #>, which is precise to ~7.1 digits.


