Fix generic ceil for SSE2.
The existing implementation returned the wrong sign for negative inputs that round to zero, since
-0 + 0 = +0, whereas we need pceil(-0.01) = -0 for consistency with std::ceil.
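
The sign loss can be reproduced in scalar code. The sketch below is only an illustration, not the packet code touched by this change: ceil_via_add and ceil_sign_preserving are hypothetical names, and it assumes the old path ended with an addition of 0 or 1 to a truncated value. Restoring the sign with std::copysign is one possible remedy and is not necessarily what this commit does.

```cpp
#include <cmath>

// Hypothetical scalar stand-in for a ceil that finishes with an addition.
// Assumption: truncate toward zero, then add 1 if the result is below x.
double ceil_via_add(double x) {
  double t = std::trunc(x);           // trunc(-0.01) == -0.0
  double inc = (t < x) ? 1.0 : 0.0;   // -0.0 < -0.01 is false, so inc == 0.0
  return t + inc;                     // -0.0 + 0.0 == +0.0, sign is lost
}

// One possible remedy (an assumption, not taken from the commit):
// ceil(x) always carries the sign of x, so copy it back onto the result.
double ceil_sign_preserving(double x) {
  return std::copysign(ceil_via_add(x), x);
}
```

With default IEEE 754 rounding (and without -ffast-math), std::signbit(std::ceil(-0.01)) is true, std::signbit(ceil_via_add(-0.01)) is false, and std::signbit(ceil_sign_preserving(-0.01)) is true again.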