Weblog
More alpha bits
While the ColorMatrix class wasn't widely used, it was not the only place where libgdiplus suffered from a lack of pre-multiplication.
Loading bitmaps, e.g. 32bpp PNG images, also requires pre-multiplying every R, G and B value by the alpha value.
Otherwise things start to look strange or bad.
The pre-multiplication process is simple but CPU intensive. It requires, for every pixel
(and everyone likes bitmaps with a lot of pixels), a division (alpha / 0xff = float) and
three multiplications (float * R, G, B).
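A minimal sketch of that per-pixel work, to make the cost concrete. This is not the libgdiplus source: the function name, the packed 0xAARRGGBB pixel layout and the truncation back to an integer are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libgdiplus code): pre-multiply packed
 * 0xAARRGGBB pixels in place, using one division and three
 * multiplications per pixel. Results are truncated, not rounded. */
static void premultiply_naive(uint32_t *pixels, size_t count)
{
    assert(pixels != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        uint32_t p = pixels[i];
        uint32_t a = p >> 24;

        float f = (float) a / 255.0f;   /* the division: alpha / 0xff */

        /* the three multiplications: float * R, G, B */
        uint32_t r = (uint32_t) (f * (float) ((p >> 16) & 0xff));
        uint32_t g = (uint32_t) (f * (float) ((p >> 8) & 0xff));
        uint32_t b = (uint32_t) (f * (float) (p & 0xff));

        pixels[i] = (a << 24) | (r << 16) | (g << 8) | b;
    }
}
```

A fully opaque pixel comes out unchanged and a fully transparent one comes out as all zeros; everything in between pays for the division on every pixel.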
Divisions are slow, so removing the division, using 256 pre-computed
float values (1 KB), can have a very visible impact on the time required
to apply a ColorMatrix or, more commonly, to load a transparent bitmap.
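The 256-float table could look something like the sketch below. The names (alpha_scale, init_alpha_scale, premultiply_float_table), the packed 0xAARRGGBB layout and the truncation are assumptions for illustration, not what libgdiplus actually contains; a real implementation would more likely ship the table as constant data than fill it at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: alpha_scale[a] == a / 255.0f.
 * 256 entries of 4 bytes each is the 1 KB mentioned above. */
static float alpha_scale[256];
static int alpha_scale_ready = 0;

static void init_alpha_scale(void)
{
    for (int a = 0; a < 256; a++)
        alpha_scale[a] = (float) a / 255.0f;  /* division paid once per alpha */
    alpha_scale_ready = 1;
}

/* Pre-multiply packed 0xAARRGGBB pixels in place: a table lookup
 * replaces the division, the three multiplications remain. */
static void premultiply_float_table(uint32_t *pixels, size_t count)
{
    assert(alpha_scale_ready);

    for (size_t i = 0; i < count; i++) {
        uint32_t p = pixels[i];
        uint32_t a = p >> 24;
        float f = alpha_scale[a];             /* lookup instead of a / 255 */

        uint32_t r = (uint32_t) (f * (float) ((p >> 16) & 0xff));
        uint32_t g = (uint32_t) (f * (float) ((p >> 8) & 0xff));
        uint32_t b = (uint32_t) (f * (float) (p & 0xff));

        pixels[i] = (a << 24) | (r << 16) | (g << 8) | b;
    }
}
```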
Time: 0:44.1136470 seconds (see previous benchmark)
to
Time: 0:40.8741020 seconds
I suspect this difference will vary a lot depending on how well your CPU architecture handles divisions.
Multiplications are faster than divisions, but we have three of them. Sadly,
removing the multiplications requires a bigger table of 65536 bytes (it could be made
smaller but would require more time to compute, negating part of the advantage of using a
table). This table also removes the need for the previous, albeit a lot smaller, table.
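One way such a 65536-byte table could be laid out is one byte per (alpha, component) pair, 256 x 256. The sketch below assumes that layout, packed 0xAARRGGBB pixels and truncating integer arithmetic when filling the table; the names are hypothetical and the actual libgdiplus table may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: premul[a][c] == (a * c) / 255, truncated.
 * 256 * 256 one-byte entries is the 65536 bytes mentioned above. */
static uint8_t premul[256][256];
static int premul_ready = 0;

static void init_premul(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned c = 0; c < 256; c++)
            premul[a][c] = (uint8_t) ((a * c) / 255u);
    premul_ready = 1;
}

/* Pre-multiply packed 0xAARRGGBB pixels in place with three byte
 * lookups per pixel: no division, no multiplication, no float. */
static void premultiply_byte_table(uint32_t *pixels, size_t count)
{
    assert(premul_ready);

    for (size_t i = 0; i < count; i++) {
        uint32_t p = pixels[i];
        const uint8_t *row = premul[p >> 24];   /* row for this alpha */

        pixels[i] = (p & 0xff000000u)
                  | ((uint32_t) row[(p >> 16) & 0xff] << 16)
                  | ((uint32_t) row[(p >> 8) & 0xff] << 8)
                  | (uint32_t) row[p & 0xff];
    }
}
```

Since each lookup already returns the final byte, the per-alpha float factor is never needed, which is why the 1 KB table becomes redundant.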
New time: 0:35.7004520 seconds
That's almost another 20% reduction from the 44.1 seconds this post started with (45% from the original ColorMatrix).
Now, is it worth the extra 64 KB of space in the (already more than 2.5 MB) libgdiplus binary?
If it was only for the ColorMatrix then probably not.
But we get a lot more comments/bugs about libgdiplus performance than about its size,
and loading transparent bitmaps correctly, without getting (too much) slower,
looks worthy enough :-)
1/3/2007 15:21:09
The views expressed on this website/weblog are mine alone and do not necessarily reflect the views of my employer.
