孤独明镜

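Bodhi is fundamentally not a tree, nor is the bright mirror a stand; since originally there is not a single thing, where could dust alight?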

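Hi, I'm Dragon, a full-stack developer. I live in Shanghai and work at a game development company. I have worked on Linux/Qt, web (PHP/Python), and game development, and I currently focus on graphics and image processing. I am also very fond of mathematics, literature, philosophy, and photography.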


Converting RGB to the HSL Color Space with CUDA, and Optimizing It

__device__ float
_cuda_max(float r, float g, float b)
{
    float ret = g;

    if (ret <= r)
        ret = r;
    if (ret < b)
        ret = b;

    return ret;
}

__device__ float
_cuda_min(float r, float g, float b)
{
    float ret = g;

    if (r < ret)
        ret = r;
    if (b < ret)
        ret = b;

    return ret;
}

__global__ void
_cuda_hsl(const float *A, float *B, int w, int h)
{
    // As in the original indexing, i is the row and j is the column,
    // so the launch must cover rows with x and columns with y.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;

    // The grid may be larger than the image: skip threads outside it.
    if (i >= h || j >= w)
        return;

    int idx = (i * w + j) * 3;  // interleaved R, G, B

    // Normalize the 0..255 channels to [0, 1].
    float r = A[idx] / 255.0f;
    float g = A[idx + 1] / 255.0f;
    float b = A[idx + 2] / 255.0f;

    // Named cmax/cmin so they do not shadow CUDA's max()/min().
    float cmax = _cuda_max(r, g, b);
    float cmin = _cuda_min(r, g, b);

    float C = cmax - cmin;           // chroma
    float L = (cmax + cmin) / 2.0f;  // lightness
    float S = 0.0f;
    float H = 0.0f;

    // Gray pixels (C == 0) keep S = 0 and H = 0.
    if (C != 0.0f)
    {
        if (L >= 0.5f)
            S = C / (2.0f - cmax - cmin);
        else
            S = C / (cmax + cmin);

        if (cmax == r)
            H = (g - b) / C;
        else if (cmax == g)
            H = (b - r) / C + 2.0f;
        else
            H = (r - g) / C + 4.0f;

        // (g - b) / C is negative when b > g; wrap into [0, 6).
        if (H < 0.0f)
            H += 6.0f;
    }

    B[idx] = H * 60.0f;  // hue in degrees
    B[idx + 1] = S;
    B[idx + 2] = L;
}
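To check the kernel's output, it can help to have a host-side reference that applies the same formulas to one pixel at a time. The following is only a minimal sketch under my own assumptions, not part of the original program: the `HSL` struct and the `rgb_to_hsl_ref` function are hypothetical names, the channels are assumed to be floats in the range 0 to 255, and negative hues are wrapped into [0, 360).

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type, not part of the original program.
struct HSL {
    float h;  // hue in degrees, [0, 360)
    float s;  // saturation, [0, 1]
    float l;  // lightness, [0, 1]
};

// CPU reference: RGB channels in 0..255 converted to HSL.
HSL rgb_to_hsl_ref(float red, float green, float blue)
{
    // Precondition assumed by this sketch: channels lie in 0..255.
    assert(red >= 0.0f && red <= 255.0f);
    assert(green >= 0.0f && green <= 255.0f);
    assert(blue >= 0.0f && blue <= 255.0f);

    const float r = red / 255.0f;
    const float g = green / 255.0f;
    const float b = blue / 255.0f;

    const float cmax = std::max(r, std::max(g, b));
    const float cmin = std::min(r, std::min(g, b));
    const float chroma = cmax - cmin;

    HSL out = {0.0f, 0.0f, (cmax + cmin) / 2.0f};
    if (chroma == 0.0f)
        return out;  // gray: hue and saturation stay 0

    out.s = (out.l >= 0.5f) ? chroma / (2.0f - cmax - cmin)
                            : chroma / (cmax + cmin);

    float sector;  // hue in units of 60 degrees
    if (cmax == r)
        sector = (g - b) / chroma;
    else if (cmax == g)
        sector = (b - r) / chroma + 2.0f;
    else
        sector = (r - g) / chroma + 4.0f;

    if (sector < 0.0f)
        sector += 6.0f;  // wrap negative hues, e.g. magenta

    out.h = sector * 60.0f;
    return out;
}
```

Comparing a handful of pixels from the GPU output buffer against this function, with a small tolerance rather than exact equality, should be enough to catch indexing or formula mistakes.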

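An SM has no branch prediction, so when the threads of a warp disagree on a branch, the warp has to execute every branch path in turn while the threads that did not take the current path sit idle. The program above contains many branches, which hurts performance, so branches should be avoided where possible. Let's start by optimizing the min and max functions.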

__device__ float
_cuda_max(float r, float g, float b)
{
    // (r + g) + |r - g| is twice the larger of the two values,
    // so halving it gives the maximum.
    float ret = ((r + g) + fabsf(r - g)) / 2.0f;
    ret = ((ret + b) + fabsf(ret - b)) / 2.0f;

    return ret;
}

__device__ float
_cuda_min(float r, float g, float b)
{
    // (r + g) - |r - g| is twice the smaller of the two values,
    // so halving it gives the minimum.
    float ret = ((r + g) - fabsf(r - g)) / 2.0f;
    ret = ((ret + b) - fabsf(ret - b)) / 2.0f;

    return ret;
}

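The branches in the original two functions have been replaced by the two arithmetic expressions in the rewritten versions, so every thread now executes the same instructions.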
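One caveat is worth checking before relying on the arithmetic form. It goes through rounded floating-point additions, so its result is not guaranteed to be bit-identical to one of the inputs, and the hue computation compares the maximum against `r` and `g` with `==`. The sketch below puts the arithmetic form next to the standard `fmaxf`/`fminf` math functions, which CUDA also provides for device code and which, as far as I know, involve no explicit branch either. The names `max3_arith`, `min3_arith`, `max3_builtin`, `min3_builtin`, and `worst_max_gap` are hypothetical, and the macro guard is just one way to let the same source build as plain C++ or under nvcc.

```cpp
#include <cassert>
#include <math.h>

// Let the same source build as plain C++ (host only) or under nvcc.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Arithmetic form from the text: max(a, b) = ((a + b) + |a - b|) / 2.
__host__ __device__ inline float max3_arith(float r, float g, float b)
{
    float m = ((r + g) + fabsf(r - g)) / 2.0f;
    return ((m + b) + fabsf(m - b)) / 2.0f;
}

// Arithmetic form from the text: min(a, b) = ((a + b) - |a - b|) / 2.
__host__ __device__ inline float min3_arith(float r, float g, float b)
{
    float m = ((r + g) - fabsf(r - g)) / 2.0f;
    return ((m + b) - fabsf(m - b)) / 2.0f;
}

// Alternative built on the standard fmaxf/fminf math functions.
__host__ __device__ inline float max3_builtin(float r, float g, float b)
{
    return fmaxf(r, fmaxf(g, b));
}

__host__ __device__ inline float min3_builtin(float r, float g, float b)
{
    return fminf(r, fminf(g, b));
}

// Hypothetical host-side check: largest gap between the two max
// variants over channel values k/255, k = 0, step, 2*step, ...
inline float worst_max_gap(int step)
{
    assert(step > 0);
    float worst = 0.0f;
    for (int i = 0; i <= 255; i += step)
        for (int j = 0; j <= 255; j += step)
            for (int k = 0; k <= 255; k += step) {
                float r = i / 255.0f, g = j / 255.0f, b = k / 255.0f;
                float gap = fabsf(max3_arith(r, g, b) - max3_builtin(r, g, b));
                if (gap > worst)
                    worst = gap;
            }
    return worst;
}
```

Calling `worst_max_gap(1)` on the host sweeps every 8-bit channel combination. If it ever returns a nonzero value, the `==` comparisons in the hue step are better done against a maximum that is exactly one of the inputs, for example the `fmaxf` variant.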
The results of running the program are shown below:

Please credit the source when reposting: 孤独明镜