♻️ 资源大小17.4MB➡️资源下载https://download.csdn.net/download/s1t16/87430301SLIC 超像素分割算法复现实验目的阅读 SLIC 论文使用 c 复现论文中的算法并对图片进行实际操作论文详读这里从原文章出发文章中介绍算法的部分在第三段SLIC SUPERPIXELS We propose a new method for generating superpixels which is faster than existing methods, more memory efficient, exhibits state-of-the-art boundary adherence, and improves the perfor- mance of segmentation algorithms. Simple linear iterative clustering is an adaptation of k-means for superpixel generation, with two important distinctions:The number of distance calculations in the optimization is dramatically reduced by limiting the search space to a region proportional to the superpixel size. This reduces the complexity to be linear in the number of pixels N—and independent of the number of superpixels k.weighted distance measure combines color and spatial proximity while simultaneously providing control over the size and compactness of the superpixels. SLIC is similar to the approach used as a preprocessing step for depth estimation described in [30], which was not fully explored in the context of superpixel generation.超像素我们提出一种新的生成超像素的方法比现有方法更快更高的记忆效率展示了目前最优的边界依从性并提高了分割算法的性能。简单线性迭代聚类SLIC采用 K 均值算法生成超像素相较与其他算法具有两个重要的区别通过将搜索空间限制为与超像素大小成比例的区域显着地减少了优化中的距离计算的数量。这降低了像素数 N 的线性复杂度并且与超像素 k 的数量无关。加权距离度量组合颜色和空间接近度同时提供对超像素的尺寸和紧凑性的控制。SLIC 类似于[30]中描述的用于深度估计的预处理步骤的方法其没有在超像素方向进行研究。Algorithm SLIC is simple to use and understand. By default, the only parameter of the algorithm is k, the desired number of approximately equally sized superpixels. For color images in the CIELAB color space, the clustering procedure begins with an initialization step where k initial cluster centersare sampled on a regular grid spaced S pixels apart. To produce roughly roughly equally sized superpixels, the grid interval isThe centers are moved to seed locations corresponding to the lowest gradient position in a 3 * 3 neighborhood. This is done to avoid centering a superpixel on an edge and to reduce the chance of seeding a superpixel with a noisy pixel. Next, in the assignment step, each pixel i is associated with the nearest cluster center whose search region overlaps its location, as depicted in Fig. 2. This is the key to speeding up our algorithm because limiting the size of the search region significantly reduces the number of distance calculations, and results in a significant speed advantage over conventional k-means clustering where each pixel must be compared with all cluster centers. This is only possible through the introduction of a distance measure D, which determines the nearest cluster center for each pixel, as discussed in Section 3.2. Since the expected spatial extent of a superpixel is a region of approximate size S * S, the search for similar pixels is done in a region 2S * 2S around the superpixel center. Once each pixel has been associated to the nearest cluster center, an update step adjusts the cluster centers to be the meanvector of all the pixels belonging to the cluster. The L2 norm is used to compute a residual error E between the new cluster center locations and previous cluster center locations. The assignment and update steps can be repeated iteratively until the error converges, but we have found that 10 iterations suffices for most images, and report all results in this paper using this criteria. Finally, a postprocessing step enforces connectivity by reassigning disjoint pixels to nearby superpixels. The entire algorithm is summarized in Algorithm 1.算法SLIC 使用简单易懂。默认情况下算法的唯一参数是 k其含义是大小大致相等的超像素的个数。对于 CIELAB 色彩空间中的彩色图像聚类过程从初始化步骤开始其中 k 个初始聚类中心在间隔 S 个像素的规则网格上采样。为了产生大致相等大小的超像素网格间隔为将中心移动到与 3×3 邻域中的最低梯度位置相对应的种子位置。这样做是为了避免将超像素定位在边缘上并且减少用噪声像素接种超像素的机会。接下来在分配步骤中每个像素 i 与搜索区域与其位置重叠的最近聚类中心相关联如图 2 所示。这是加速我们的算法的关键因为限制搜索区域的大小显着地减少了距离计算的数量并且导致相对于常规 kmeans 聚类的显着的速度优势其中每个像素必须与所有聚类中心比较。这只能通过引入距离测量 D 来实现该距离测量 D 确定每个像素的最近聚类中心如第 III-B 节中所讨论的。由于超像素的预期空间范围是近似尺寸 S×S 的区域因此在超像素中心周围的区域 2S×2S 中进行类似像素的搜索。一旦每个像素已经与最近的聚类中心相关联更新步骤将聚类中心调整为属于该聚类的所有像素的平均向量。L2 范数用于计算新聚类中心位置和先前聚类中心位置之间的残差误差 E.分配和更新步骤可以迭代重复直到错误收敛但我们发现 10 次迭代足够大多数图像并报告本文中使用此标准的所有结果。最后后处理步骤通过将不相交像素重新分配给附近的超像素来实施连通性。算法 1 中总结了整个算法。Fig. 2. Reducing the superpixel search regions. The complexity of SLIC is linear in the number of pixels in the image O(N), while the conventional k-means algorithm is O(kNI), where I is the number of iterations. This is achieved by limiting the search space of each cluster center in the assignment step. (a) In the conventional k-means algorithm, distances are computed from each cluster center to every pixel in the image. (b) SLIC only computes distances from each cluster center to pixels within a 2S2S region. Note that the expected superpixel size is only SS, indicated by the smaller square. This approach not only reduces distance computations but also makes SLIC’s complexity independent of the number of superpixels.图.2减少超像素搜索区域。SLIC 的复杂性在图像 O(N)中的像素数目中是线性的而常规的 k 均值算法是 O(kNI)其中 I 是迭代次数。这在分配步骤中提供了每个聚类中心的搜索空间。a在常规 k 均值算法中从每个聚类中心到图像中的每个像素计算距离。(b)SLIC 仅计算从每个聚类中心到 2S×2S 区域内的像素的距离。注意期望的超像素大小仅为 S×S由较小的正方形表示。这种方法不仅减少了距离计算而且使得 SLIC 的复杂性与超像素的数量无关。Distance MeasureSLIC superpixels correspond to clusters in the labxy color-image plane space. This presents a problem in defining the distance measure D, which may not be immediately obvious. D computes the distance between a pixel i and cluster center Ck in Algorithm 1. A pixel’s color is represented in the CIELAB color space, whose range of possible values is known. The pixel’s position position, on the other hand, may take a range of values that varies according to the size of the image. Simply defining D to be the 5D euclidean distance in labxy space will cause inconsistencies in clustering behavior for different superpixel sizes. For large superpixels, spatial distances outweigh color proximity, giving more relative importance to spatial proximity than color. This produces compact superpixels that do not adhere well to image boundaries. For smaller superpixels, the converse is true.To combine the two distances into a single measure, it is necessary to normalize color proximity and spatial proximity by their respective maximum distances within a cluster, Ns and Nc. Doing so, D’ is written距离测量SLIC 超像素对应于 labxy 色像平面空间中的簇。这提出了定义距离测量 D 的问题这可能不是立即显而易见的。口在算法 1 中计算像素 i 和聚类中心 Ck 之间的距离。像素的颜色在 CIELAB 颜色空间中表示其取值范围是己知的。另一方面像素的位置的取值范国随着图像的尺寸变化而变化。简单定义 D 为 labary 空间中的五维欧氏距离将导致不同超像素大小的聚类行为的不一致。对于大的超像素空间距离超过颜色接近度给出比空间接近度比颜色更多的相对重要性。这产生不良好地粘附到图像边界的紧凑超像素。对于较小的超像素则相反。为了将两个距离组合成单个测量有必要通过它们在簇內的各自的最大距离 Ns 和 Nc 来标准化颜色接近度和空间接近度。这样做D’写为(1) The maximum spatial distance expected within a given cluster should correspond to the sampling interval,Determining the maximum color distance Nc is not so straightfor- ward, as color distances can vary significantly from cluster to cluster and image to image. This problem can be avoided by fixing Nc to a constant m so that (1) becomes给定群集内预期的最大空间距离应对应于采样间隔。确定最大颜色距离 Nc 不是那么简单因为颜色距离可以从簇到簇和图像到图像显著不同。这个问题可以通过将 Nc 固定为常数 m 来避免变为(2) which simplifies to the distance measure we use in practice:这简化了我们在实践中使用的距离测量(3) By defining D in this manner, m also allows us to weigh the relative importance between color similarity and spatial proximity. When m is large, spatial proximity is more important and the resulting superpixels are more compact (i.e., they have a lower area to perimeter ratio). When m is small, the resulting superpixels adhere more tightly to image boundaries, but have less regular size and shape. When using the CIELAB color space, m can be in the range[1, 40]. Equation (3) can be adapted for grayscale images by setting通过以这种方式定义 Dm 还允许我们权衡颜色相似性和空间邻近度之间的相对重要性。当 m 大时空间邻近性更重要并且所得到的超像素更紧凑即它们具有更低的面积与周长比。当 m 小时所得到的超像素更紧密地粘附到图像边界但是具有较小的规则尺寸和形状。当使用 CIELAB 色彩空间时m 可以在[1,40]的范围内等式 3 可以通过设置适用于灰度图像。It can also be extended to handle 3D supervoxels, as depicted in Fig. 3, by including the depth dimension to the spatial proximity term of (3):它也可以扩展到处理 3D 超体像素如图 3 所示通过包括深度维度到空间邻近项的方程 3(4)Fig. 3. SLIC supervoxels computed for a video sequence. (Top) Frames from a short video sequence of a flag waving. (Bottom left) A volume containing the video. The last frame appears at the top of the volume. (Bottom right) A supervoxel segmentation of the video. Supervoxels with orange cluster centers are removed for display purposes图 3为视频序列计算的 SLIC 超体元。顶部短波的短视频序列所产生的帧。左下包含视频的卷。最后一帧出现在卷的顶部。右下视频的超像素分割。为便于显示具有橙色聚类中心的超体素被去除PostprocessingLike some other superpixel algorithms [8], SLIC does not explicitly enforce connectivity. At the end of the clustering procedure, some “orphaned” pixels that do not belong to the same connected component as their cluster center may remain. To correct for this, such pixels are assigned the label of the nearest cluster center using a connected components algorithm.像一些其他超像素算法[8]SLIC 没有明确强制连接。在聚类过程结束时可能保留不属于与其聚类中心相同的连接分量的一些“孤立”像素。为了对此进行校正使用连通分量算法向这些像素分配最近聚类中心的标签。Complexity By localizing the search in the clustering procedure, SLIC avoids performing thousands of redundant distance calculations. In practice, a pixel falls in the neighborhood of less than eight cluster centers, meaning that SLIC is O(N) complex. In contrast, the trivial upper bound for the classical k-means algorithm is O(k^N), and the practical time complexity is O(NkI), where I is the number of iterations required for convergence. While schemes to reduce the complexity of k-means have been proposed using prime number length sampling, random sampling, local cluster swapping, and by setting lower and upper bounds, these methods are very general in nature. SLIC is specifically tailored to the problem of superpixel clustering. Finally, unlike most superpixel methods and the aforementioned approaches to speed up k-means, the complexity of SLIC is linear in the number of pixels, irrespective of k.复杂度通过在聚类过程中定位搜索SLIC 避免执行数千个冗余距离计算。在实践中像素落在小于 8 个聚类中心的附近这意味著 SLIC 是 O(N)复杂度。相比之下经典 k 均值算法的平凡上限是 O(kN)实际时间复杂度为 O(KNI)其中 I 是收敛所需的选代次数。虽然己经提出了使用素数长度采样随机抽样局部簇交换以及通过设置下限和上限来降低 k 均值复杂度的方案一般性质。SLIC 专门针对超像素聚类的问题。最后与大多数超像素方法和上述加速 k-均值的方法不同SLIC 的复杂性在像素数量上是线性的与 k 无关。实验原理基本步骤已知一副图像大小 M*N,可以从 RGB 空间转换为 LAB 空间LAB 颜色空间表现的颜色更全面假如预定义参数 KK 为预生成的超像素数量即预计将 MN 大小的图像(像素数目即为 MN)分隔为 K 个超像素块每个超像素块范围大小包含[(M*N)/K]个像素假设每个超像素区域长和宽都均匀分布的话那么每个超像素块的长和宽均可定义为 SSsqrt(M*N/K)遍历操作将每个像素块的中心点的坐标(x,y)及其 lab 的值保存起来加入到事先定义好的集合中每个像素块的中心点默认是(S/2,S/2)进行获取的有可能落在噪音点或者像素边缘所谓像素边缘即指像素突变处比如从黑色过渡到白色的交界处这里利用差分方式进行梯度计算调整中心点算法中使用中心点的 8 领域像素点计算获得最小梯度值的像素点并将其作为新的中心点差分计算梯度的公式Gradient(x,y)dx(i,j) dy(i,j); dx(i,j) I(i1,j) - I(i,j); dy(i,j) I(i,j1) - I(i,j);遍历现中心点的 8 领域像素点将其中计算得到最小 Gradient 值的像素点作为新的中心点调整完中心点后即需要进行像素点的聚类操作通过聚类的方式迭代计算新的聚类中心首先需要借助 K-means 聚类算法将像素点进行归类通过变换的欧氏聚距离公式进行公式如下通过两个参数 m 和 S 来协调两种距离的比例分配。参数 S 即是上面第 ③ 步计算得出的每个像素块的长度值而参数 M 为 LAB 空间的距离可能最大值其可取的范围建议为[1,40]为了节省时间只遍历每个超像素块中心点周边的 2S*2S 区域内的像素点计算该区域内每个像素点距离哪一个超像素块的中心点最近并将其划分到其中完成一次迭代后重新计算每个超像素块的中心点坐标并重新进行迭代注衡量效率和效果后一般选择迭代 10 次实验环境实验语言使用 C调用 opencv 库仅为便于将图片读入成矩阵和将矩阵输出为图片程序中其它的图像操作等均未调用任何 opencv 的函数。所有操作均面向提取后的矩阵。实验设备MacBook M1程序设计和实现# include iostream # include opencv2/opencv.hpp # include vector # include map const float param_13 1.0f / 3.0f; const float param_16116 16.0f / 116.0f; const float Xn 0.950456f; const float Yn 1.0f; const float Zn 1.088754f; using namespace std; using namespace cv; float gamma(float x) { return x 0.04045 ? powf((x 0.055f) / 1.055f, 2.4f) : (x / 12.92); } float gamma_XYZ2RGB(float x) { return x 0.0031308 ? (1.055f * powf(x, (1 / 2.4f)) - 0.055) : (x * 12.92); } void XYZ2RGB(float X, float Y, float Z, int *R, int *G, int *B) { float RR, GG, BB; RR 3.2404542f * X - 1.5371385f * Y - 0.4985314f * Z; GG -0.9692660f * X 1.8760108f * Y 0.0415560f * Z; BB 0.0556434f * X - 0.2040259f * Y 1.0572252f * Z; RR gamma_XYZ2RGB(RR); GG gamma_XYZ2RGB(GG); BB gamma_XYZ2RGB(BB); RR int(RR * 255.0 0.5); GG int(GG * 255.0 0.5); BB int(BB * 255.0 0.5); *R RR; *G GG; *B BB; } void Lab2XYZ(float L, float a, float b, float *X, float *Y, float *Z) { float fX, fY, fZ; fY (L 16.0f) / 116.0; fX a / 500.0f fY; fZ fY - b / 200.0f; if (powf(fY, 3.0) 0.008856) *Y powf(fY, 3.0); else *Y (fY - param_16116) / 7.787f; if (powf(fX, 3) 0.008856) *X fX * fX * fX; else *X (fX - param_16116) / 7.787f; if (powf(fZ, 3.0) 0.008856) *Z fZ * fZ * fZ; else *Z (fZ - param_16116) / 7.787f; (*X) * (Xn); (*Y) * (Yn); (*Z) * (Zn); } void RGB2XYZ(int R, int G, int B, float *X, float *Y, float *Z) { float RR gamma((float) R / 255.0f); float GG gamma((float) G / 255.0f); float BB gamma((float) B / 255.0f); *X 0.4124564f * RR 0.3575761f * GG 0.1804375f * BB; *Y 0.2126729f * RR 0.7151522f * GG 0.0721750f * BB; *Z 0.0193339f * RR 0.1191920f * GG 0.9503041f * BB; } void XYZ2Lab(float X, float Y, float Z, float *L, float *a, float *b) { float fX, fY, fZ; / Xn; / Yn; / Zn; if (Y 0.008856f) fY pow(Y, param_13); else fY 7.787f * Y param_16116; *L 116.0f * fY - 16.0f; *L *L 0.0f ? *L : 0.0f; if (X 0.008856f) fX pow(X, param_13); else fX 7.787f * X param_16116; if (Z 0.008856) fZ pow(Z, param_13); else fZ 7.787f * Z param_16116; *a 500.0f * (fX - fY); *b 200.0f * (fY - fZ); } void RGB2Lab(int R, int G, int B, float *L, float *a, float *b) { float X, Y, Z; RGB2XYZ(R, G, B, X, Y, Z); XYZ2Lab(X, Y, Z, L, a, b); } void Lab2RGB(float L, float a, float b, int *R, int *G, int *B) { float X, Y, Z; Lab2XYZ(L, a, b, X, Y, Z); XYZ2RGB(X, Y, Z, R, G, B); } int main() { Mat raw_image imread(../pic6.jpg); if (raw_image.empty()) { cout read error endl; return 0; } vectorvectorvectorfloat image;//x,y,(L,a,b) int rows raw_image.rows; int cols raw_image.cols; cout rows: rows cols: cols endl; int N rows * cols; //K个超像素 int K 150; cout cluster num: K endl; int M 40; //以步距为S的距离划分超像素 int S (int) sqrt(N / K); cout S: S endl; //RGB2Lab for (int i 0; i rows; i) { vectorvectorfloat line; for (int j 0; j cols; j) { vectorfloat pixel; float L; float a; float b; RGB2Lab(raw_image.atVec3b(i, j)[2], raw_image.atVec3b(i, j)[1], raw_image.atVec3b(i, j)[0], L, a, b); pixel.push_back(L); pixel.push_back(a); pixel.push_back(b); line.push_back(pixel); } image.push_back(line); } cout RGB2Lab is finished endl; //聚类中心[x y l a b] vectorvectorfloat Cluster; //生成所有聚类中心 for (int i S / 2; i rows; i S) { for (int j S / 2; j cols; j S) { vectorfloat c; push_back((float) i); push_back((float) j); push_back(image[i][j][0]); push_back(image[i][j][1]); push_back(image[i][j][2]); Cluster.push_back(c); } } int cluster_num Cluster.size(); cout init cluster is finished endl; //获得最小梯度值作为新中心点 for (int c 0; c cluster_num; c) { int c_row (int) Cluster[c][0]; int c_col (int) Cluster[c][1]; //梯度以右侧和下侧两个像素点来计算分别计算Lab三个的梯度来求和 //需要保证当前点右侧和下侧是存在的点否则就向左上移动来替代梯度值 if (c_row 1 rows) { c_row rows - 2; } if (c_col 1 cols) { c_col cols - 2; } float c_gradient image[c_row 1][c_col][0] image[c_row][c_col 1][0] - 2 * image[c_row][c_col][0] image[c_row 1][c_col][1] image[c_row][c_col 1][1] - 2 * image[c_row][c_col][1] image[c_row 1][c_col][2] image[c_row][c_col 1][2] - 2 * image[c_row][c_col][2]; for (int i -1; i 1; i) { for (int j -1; j 1; j) { int tmp_row c_row i; int tmp_col c_col j; if (tmp_row 1 rows) { tmp_row rows - 2; } if (tmp_col 1 cols) { tmp_col cols - 2; } float tmp_gradient image[tmp_row 1][tmp_col][0] image[tmp_row][tmp_col 1][0] - * image[tmp_row][tmp_col][0] image[tmp_row 1][tmp_col][1] image[tmp_row][tmp_col 1][1] - 2 * image[tmp_row][tmp_col][1] image[tmp_row 1][tmp_col][2] image[tmp_row][tmp_col 1][2] - * image[tmp_row][tmp_col][2]; if (tmp_gradient c_gradient) { Cluster[c][0] (float) tmp_row; Cluster[c][1] (float) tmp_col; Cluster[c][2] image[tmp_row][tmp_col][0]; Cluster[c][3] image[tmp_row][tmp_col][1]; Cluster[c][3] image[tmp_row][tmp_col][2]; c_gradient tmp_gradient; } } } } cout move cluster is finished; //创建一个dis的矩阵for each pixel ∞ vectorvectordouble distance; for (int i 0; i rows; i) { vectordouble tmp; for (int j 0; j cols; j) { tmp.push_back(INT_MAX); } distance.push_back(tmp); } //创建一个dis的矩阵for each pixel -1 vectorvectorint label; for (int i 0; i rows; i) { vectorint tmp; for (int j 0; j cols; j) { tmp.push_back(-1); } label.push_back(tmp); } //为每一个Cluster创建一个pixel集合 vectorvectorvectorint pixel(Cluster.size()); //核心代码部分迭代计算 for (int t 0; t 10; t) { cout endl iteration num: t 1 ; //遍历所有的中心点,在2S范围内进行像素搜索 int c_num 0; for (int c 0; c cluster_num; c) { if (c - c_num (cluster_num / 10)) { cout ; c_num c; } int c_row (int) Cluster[c][0]; int c_col (int) Cluster[c][1]; float c_L Cluster[c][2]; float c_a Cluster[c][3]; float c_b Cluster[c][4]; for (int i c_row - 2 * S; i c_row 2 * S; i) { if (i 0 || i rows) { continue; } for (int j c_col - 2 * S; j c_col 2 * S; j) { if (j 0 || j cols) { continue; } float tmp_L image[i][j][0]; float tmp_a image[i][j][1]; float tmp_b image[i][j][2]; double Dc sqrt((tmp_L - c_L) * (tmp_L - c_L) (tmp_a - c_a) * (tmp_a - c_a) (tmp_b - c_b) * (tmp_b - c_b)); double Ds sqrt((i - c_row) * (i - c_row) (j - c_col) * (j - c_col)); double D sqrt((Dc / (double) M) * (Dc / (double) M) (Ds / (double) S) * (Ds / (double) S)); if (D distance[i][j]) { if (label[i][j] -1) { //还没有被标记过 label[i][j] c; vectorint point; point.push_back(i); point.push_back(j); pixel[c].push_back(point); } else { int old_cluster label[i][j]; vectorvectorint::iterator iter; for (iter pixel[old_cluster].begin(); iter ! pixel[old_cluster].end(); iter) { if ((*iter)[0] i (*iter)[1] j) { pixel[old_cluster].erase(iter); break; } } label[i][j] c; vectorint point; point.push_back(i); point.push_back(j); pixel[c].push_back(point); } distance[i][j] D; } } } } cout start update cluster; for (int c 0; c Cluster.size(); c) { int sum_i 0; int sum_j 0; int number 0; for (int p 0; p pixel[c].size(); p) { sum_i pixel[c][p][0]; sum_j pixel[c][p][1]; number; } int tmp_i (int) ((double) sum_i / (double) number); int tmp_j (int) ((double) sum_j / (double) number); Cluster[c][0] (float) tmp_i; Cluster[c][1] (float) tmp_j; Cluster[c][2] image[tmp_i][tmp_j][0]; Cluster[c][3] image[tmp_i][tmp_j][1]; Cluster[c][4] image[tmp_i][tmp_j][2]; } } //导出Lab空间的矩阵 vectorvectorvectorfloat out_image image;//x,y,(L,a,b) for (int c 0; c Cluster.size(); c) { for (int p 0; p pixel[c].size(); p) { out_image[pixel[c][p][0]][pixel[c][p][1]][0] Cluster[c][2]; out_image[pixel[c][p][0]][pixel[c][p][1]][1] Cluster[c][3]; out_image[pixel[c][p][0]][pixel[c][p][1]][2] Cluster[c][4]; } out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][0] 0; out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][1] 0; out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][2] 0; } cout endl export image mat finished endl; Mat print_image raw_image.clone(); for (int i 0; i rows; i) { for (int j 0; j cols; j) { float L out_image[i][j][0]; float a out_image[i][j][1]; float b out_image[i][j][2]; int R, G, B; Lab2RGB(L, a, b, R, G, B); Vec3b vec3b; vec3b[0] B; vec3b[1] G; vec3b[2] R; print_image.atVec3b(i, j) vec3b; } } imshow(print_image, print_image); waitKey(0); //暂停保持图像显示等待按键结束 return 0; }实验结果和分析直接查看前后对比效果我们将中心点设置成黑色其余点均设置成原中心点的颜色这样就可以预览效果。
基于C++实现SLIC 超像素分割算法
♻️ 资源大小17.4MB➡️资源下载https://download.csdn.net/download/s1t16/87430301SLIC 超像素分割算法复现实验目的阅读 SLIC 论文使用 c 复现论文中的算法并对图片进行实际操作论文详读这里从原文章出发文章中介绍算法的部分在第三段SLIC SUPERPIXELS We propose a new method for generating superpixels which is faster than existing methods, more memory efficient, exhibits state-of-the-art boundary adherence, and improves the perfor- mance of segmentation algorithms. Simple linear iterative clustering is an adaptation of k-means for superpixel generation, with two important distinctions:The number of distance calculations in the optimization is dramatically reduced by limiting the search space to a region proportional to the superpixel size. This reduces the complexity to be linear in the number of pixels N—and independent of the number of superpixels k.weighted distance measure combines color and spatial proximity while simultaneously providing control over the size and compactness of the superpixels. SLIC is similar to the approach used as a preprocessing step for depth estimation described in [30], which was not fully explored in the context of superpixel generation.超像素我们提出一种新的生成超像素的方法比现有方法更快更高的记忆效率展示了目前最优的边界依从性并提高了分割算法的性能。简单线性迭代聚类SLIC采用 K 均值算法生成超像素相较与其他算法具有两个重要的区别通过将搜索空间限制为与超像素大小成比例的区域显着地减少了优化中的距离计算的数量。这降低了像素数 N 的线性复杂度并且与超像素 k 的数量无关。加权距离度量组合颜色和空间接近度同时提供对超像素的尺寸和紧凑性的控制。SLIC 类似于[30]中描述的用于深度估计的预处理步骤的方法其没有在超像素方向进行研究。Algorithm SLIC is simple to use and understand. By default, the only parameter of the algorithm is k, the desired number of approximately equally sized superpixels. For color images in the CIELAB color space, the clustering procedure begins with an initialization step where k initial cluster centersare sampled on a regular grid spaced S pixels apart. To produce roughly roughly equally sized superpixels, the grid interval isThe centers are moved to seed locations corresponding to the lowest gradient position in a 3 * 3 neighborhood. This is done to avoid centering a superpixel on an edge and to reduce the chance of seeding a superpixel with a noisy pixel. Next, in the assignment step, each pixel i is associated with the nearest cluster center whose search region overlaps its location, as depicted in Fig. 2. This is the key to speeding up our algorithm because limiting the size of the search region significantly reduces the number of distance calculations, and results in a significant speed advantage over conventional k-means clustering where each pixel must be compared with all cluster centers. This is only possible through the introduction of a distance measure D, which determines the nearest cluster center for each pixel, as discussed in Section 3.2. Since the expected spatial extent of a superpixel is a region of approximate size S * S, the search for similar pixels is done in a region 2S * 2S around the superpixel center. Once each pixel has been associated to the nearest cluster center, an update step adjusts the cluster centers to be the meanvector of all the pixels belonging to the cluster. The L2 norm is used to compute a residual error E between the new cluster center locations and previous cluster center locations. The assignment and update steps can be repeated iteratively until the error converges, but we have found that 10 iterations suffices for most images, and report all results in this paper using this criteria. Finally, a postprocessing step enforces connectivity by reassigning disjoint pixels to nearby superpixels. The entire algorithm is summarized in Algorithm 1.算法SLIC 使用简单易懂。默认情况下算法的唯一参数是 k其含义是大小大致相等的超像素的个数。对于 CIELAB 色彩空间中的彩色图像聚类过程从初始化步骤开始其中 k 个初始聚类中心在间隔 S 个像素的规则网格上采样。为了产生大致相等大小的超像素网格间隔为将中心移动到与 3×3 邻域中的最低梯度位置相对应的种子位置。这样做是为了避免将超像素定位在边缘上并且减少用噪声像素接种超像素的机会。接下来在分配步骤中每个像素 i 与搜索区域与其位置重叠的最近聚类中心相关联如图 2 所示。这是加速我们的算法的关键因为限制搜索区域的大小显着地减少了距离计算的数量并且导致相对于常规 kmeans 聚类的显着的速度优势其中每个像素必须与所有聚类中心比较。这只能通过引入距离测量 D 来实现该距离测量 D 确定每个像素的最近聚类中心如第 III-B 节中所讨论的。由于超像素的预期空间范围是近似尺寸 S×S 的区域因此在超像素中心周围的区域 2S×2S 中进行类似像素的搜索。一旦每个像素已经与最近的聚类中心相关联更新步骤将聚类中心调整为属于该聚类的所有像素的平均向量。L2 范数用于计算新聚类中心位置和先前聚类中心位置之间的残差误差 E.分配和更新步骤可以迭代重复直到错误收敛但我们发现 10 次迭代足够大多数图像并报告本文中使用此标准的所有结果。最后后处理步骤通过将不相交像素重新分配给附近的超像素来实施连通性。算法 1 中总结了整个算法。Fig. 2. Reducing the superpixel search regions. The complexity of SLIC is linear in the number of pixels in the image O(N), while the conventional k-means algorithm is O(kNI), where I is the number of iterations. This is achieved by limiting the search space of each cluster center in the assignment step. (a) In the conventional k-means algorithm, distances are computed from each cluster center to every pixel in the image. (b) SLIC only computes distances from each cluster center to pixels within a 2S2S region. Note that the expected superpixel size is only SS, indicated by the smaller square. This approach not only reduces distance computations but also makes SLIC’s complexity independent of the number of superpixels.图.2减少超像素搜索区域。SLIC 的复杂性在图像 O(N)中的像素数目中是线性的而常规的 k 均值算法是 O(kNI)其中 I 是迭代次数。这在分配步骤中提供了每个聚类中心的搜索空间。a在常规 k 均值算法中从每个聚类中心到图像中的每个像素计算距离。(b)SLIC 仅计算从每个聚类中心到 2S×2S 区域内的像素的距离。注意期望的超像素大小仅为 S×S由较小的正方形表示。这种方法不仅减少了距离计算而且使得 SLIC 的复杂性与超像素的数量无关。Distance MeasureSLIC superpixels correspond to clusters in the labxy color-image plane space. This presents a problem in defining the distance measure D, which may not be immediately obvious. D computes the distance between a pixel i and cluster center Ck in Algorithm 1. A pixel’s color is represented in the CIELAB color space, whose range of possible values is known. The pixel’s position position, on the other hand, may take a range of values that varies according to the size of the image. Simply defining D to be the 5D euclidean distance in labxy space will cause inconsistencies in clustering behavior for different superpixel sizes. For large superpixels, spatial distances outweigh color proximity, giving more relative importance to spatial proximity than color. This produces compact superpixels that do not adhere well to image boundaries. For smaller superpixels, the converse is true.To combine the two distances into a single measure, it is necessary to normalize color proximity and spatial proximity by their respective maximum distances within a cluster, Ns and Nc. Doing so, D’ is written距离测量SLIC 超像素对应于 labxy 色像平面空间中的簇。这提出了定义距离测量 D 的问题这可能不是立即显而易见的。口在算法 1 中计算像素 i 和聚类中心 Ck 之间的距离。像素的颜色在 CIELAB 颜色空间中表示其取值范围是己知的。另一方面像素的位置的取值范国随着图像的尺寸变化而变化。简单定义 D 为 labary 空间中的五维欧氏距离将导致不同超像素大小的聚类行为的不一致。对于大的超像素空间距离超过颜色接近度给出比空间接近度比颜色更多的相对重要性。这产生不良好地粘附到图像边界的紧凑超像素。对于较小的超像素则相反。为了将两个距离组合成单个测量有必要通过它们在簇內的各自的最大距离 Ns 和 Nc 来标准化颜色接近度和空间接近度。这样做D’写为(1) The maximum spatial distance expected within a given cluster should correspond to the sampling interval,Determining the maximum color distance Nc is not so straightfor- ward, as color distances can vary significantly from cluster to cluster and image to image. This problem can be avoided by fixing Nc to a constant m so that (1) becomes给定群集内预期的最大空间距离应对应于采样间隔。确定最大颜色距离 Nc 不是那么简单因为颜色距离可以从簇到簇和图像到图像显著不同。这个问题可以通过将 Nc 固定为常数 m 来避免变为(2) which simplifies to the distance measure we use in practice:这简化了我们在实践中使用的距离测量(3) By defining D in this manner, m also allows us to weigh the relative importance between color similarity and spatial proximity. When m is large, spatial proximity is more important and the resulting superpixels are more compact (i.e., they have a lower area to perimeter ratio). When m is small, the resulting superpixels adhere more tightly to image boundaries, but have less regular size and shape. When using the CIELAB color space, m can be in the range[1, 40]. Equation (3) can be adapted for grayscale images by setting通过以这种方式定义 Dm 还允许我们权衡颜色相似性和空间邻近度之间的相对重要性。当 m 大时空间邻近性更重要并且所得到的超像素更紧凑即它们具有更低的面积与周长比。当 m 小时所得到的超像素更紧密地粘附到图像边界但是具有较小的规则尺寸和形状。当使用 CIELAB 色彩空间时m 可以在[1,40]的范围内等式 3 可以通过设置适用于灰度图像。It can also be extended to handle 3D supervoxels, as depicted in Fig. 3, by including the depth dimension to the spatial proximity term of (3):它也可以扩展到处理 3D 超体像素如图 3 所示通过包括深度维度到空间邻近项的方程 3(4)Fig. 3. SLIC supervoxels computed for a video sequence. (Top) Frames from a short video sequence of a flag waving. (Bottom left) A volume containing the video. The last frame appears at the top of the volume. (Bottom right) A supervoxel segmentation of the video. Supervoxels with orange cluster centers are removed for display purposes图 3为视频序列计算的 SLIC 超体元。顶部短波的短视频序列所产生的帧。左下包含视频的卷。最后一帧出现在卷的顶部。右下视频的超像素分割。为便于显示具有橙色聚类中心的超体素被去除PostprocessingLike some other superpixel algorithms [8], SLIC does not explicitly enforce connectivity. At the end of the clustering procedure, some “orphaned” pixels that do not belong to the same connected component as their cluster center may remain. To correct for this, such pixels are assigned the label of the nearest cluster center using a connected components algorithm.像一些其他超像素算法[8]SLIC 没有明确强制连接。在聚类过程结束时可能保留不属于与其聚类中心相同的连接分量的一些“孤立”像素。为了对此进行校正使用连通分量算法向这些像素分配最近聚类中心的标签。Complexity By localizing the search in the clustering procedure, SLIC avoids performing thousands of redundant distance calculations. In practice, a pixel falls in the neighborhood of less than eight cluster centers, meaning that SLIC is O(N) complex. In contrast, the trivial upper bound for the classical k-means algorithm is O(k^N), and the practical time complexity is O(NkI), where I is the number of iterations required for convergence. While schemes to reduce the complexity of k-means have been proposed using prime number length sampling, random sampling, local cluster swapping, and by setting lower and upper bounds, these methods are very general in nature. SLIC is specifically tailored to the problem of superpixel clustering. Finally, unlike most superpixel methods and the aforementioned approaches to speed up k-means, the complexity of SLIC is linear in the number of pixels, irrespective of k.复杂度通过在聚类过程中定位搜索SLIC 避免执行数千个冗余距离计算。在实践中像素落在小于 8 个聚类中心的附近这意味著 SLIC 是 O(N)复杂度。相比之下经典 k 均值算法的平凡上限是 O(kN)实际时间复杂度为 O(KNI)其中 I 是收敛所需的选代次数。虽然己经提出了使用素数长度采样随机抽样局部簇交换以及通过设置下限和上限来降低 k 均值复杂度的方案一般性质。SLIC 专门针对超像素聚类的问题。最后与大多数超像素方法和上述加速 k-均值的方法不同SLIC 的复杂性在像素数量上是线性的与 k 无关。实验原理基本步骤已知一副图像大小 M*N,可以从 RGB 空间转换为 LAB 空间LAB 颜色空间表现的颜色更全面假如预定义参数 KK 为预生成的超像素数量即预计将 MN 大小的图像(像素数目即为 MN)分隔为 K 个超像素块每个超像素块范围大小包含[(M*N)/K]个像素假设每个超像素区域长和宽都均匀分布的话那么每个超像素块的长和宽均可定义为 SSsqrt(M*N/K)遍历操作将每个像素块的中心点的坐标(x,y)及其 lab 的值保存起来加入到事先定义好的集合中每个像素块的中心点默认是(S/2,S/2)进行获取的有可能落在噪音点或者像素边缘所谓像素边缘即指像素突变处比如从黑色过渡到白色的交界处这里利用差分方式进行梯度计算调整中心点算法中使用中心点的 8 领域像素点计算获得最小梯度值的像素点并将其作为新的中心点差分计算梯度的公式Gradient(x,y)dx(i,j) dy(i,j); dx(i,j) I(i1,j) - I(i,j); dy(i,j) I(i,j1) - I(i,j);遍历现中心点的 8 领域像素点将其中计算得到最小 Gradient 值的像素点作为新的中心点调整完中心点后即需要进行像素点的聚类操作通过聚类的方式迭代计算新的聚类中心首先需要借助 K-means 聚类算法将像素点进行归类通过变换的欧氏聚距离公式进行公式如下通过两个参数 m 和 S 来协调两种距离的比例分配。参数 S 即是上面第 ③ 步计算得出的每个像素块的长度值而参数 M 为 LAB 空间的距离可能最大值其可取的范围建议为[1,40]为了节省时间只遍历每个超像素块中心点周边的 2S*2S 区域内的像素点计算该区域内每个像素点距离哪一个超像素块的中心点最近并将其划分到其中完成一次迭代后重新计算每个超像素块的中心点坐标并重新进行迭代注衡量效率和效果后一般选择迭代 10 次实验环境实验语言使用 C调用 opencv 库仅为便于将图片读入成矩阵和将矩阵输出为图片程序中其它的图像操作等均未调用任何 opencv 的函数。所有操作均面向提取后的矩阵。实验设备MacBook M1程序设计和实现# include iostream # include opencv2/opencv.hpp # include vector # include map const float param_13 1.0f / 3.0f; const float param_16116 16.0f / 116.0f; const float Xn 0.950456f; const float Yn 1.0f; const float Zn 1.088754f; using namespace std; using namespace cv; float gamma(float x) { return x 0.04045 ? powf((x 0.055f) / 1.055f, 2.4f) : (x / 12.92); } float gamma_XYZ2RGB(float x) { return x 0.0031308 ? (1.055f * powf(x, (1 / 2.4f)) - 0.055) : (x * 12.92); } void XYZ2RGB(float X, float Y, float Z, int *R, int *G, int *B) { float RR, GG, BB; RR 3.2404542f * X - 1.5371385f * Y - 0.4985314f * Z; GG -0.9692660f * X 1.8760108f * Y 0.0415560f * Z; BB 0.0556434f * X - 0.2040259f * Y 1.0572252f * Z; RR gamma_XYZ2RGB(RR); GG gamma_XYZ2RGB(GG); BB gamma_XYZ2RGB(BB); RR int(RR * 255.0 0.5); GG int(GG * 255.0 0.5); BB int(BB * 255.0 0.5); *R RR; *G GG; *B BB; } void Lab2XYZ(float L, float a, float b, float *X, float *Y, float *Z) { float fX, fY, fZ; fY (L 16.0f) / 116.0; fX a / 500.0f fY; fZ fY - b / 200.0f; if (powf(fY, 3.0) 0.008856) *Y powf(fY, 3.0); else *Y (fY - param_16116) / 7.787f; if (powf(fX, 3) 0.008856) *X fX * fX * fX; else *X (fX - param_16116) / 7.787f; if (powf(fZ, 3.0) 0.008856) *Z fZ * fZ * fZ; else *Z (fZ - param_16116) / 7.787f; (*X) * (Xn); (*Y) * (Yn); (*Z) * (Zn); } void RGB2XYZ(int R, int G, int B, float *X, float *Y, float *Z) { float RR gamma((float) R / 255.0f); float GG gamma((float) G / 255.0f); float BB gamma((float) B / 255.0f); *X 0.4124564f * RR 0.3575761f * GG 0.1804375f * BB; *Y 0.2126729f * RR 0.7151522f * GG 0.0721750f * BB; *Z 0.0193339f * RR 0.1191920f * GG 0.9503041f * BB; } void XYZ2Lab(float X, float Y, float Z, float *L, float *a, float *b) { float fX, fY, fZ; / Xn; / Yn; / Zn; if (Y 0.008856f) fY pow(Y, param_13); else fY 7.787f * Y param_16116; *L 116.0f * fY - 16.0f; *L *L 0.0f ? *L : 0.0f; if (X 0.008856f) fX pow(X, param_13); else fX 7.787f * X param_16116; if (Z 0.008856) fZ pow(Z, param_13); else fZ 7.787f * Z param_16116; *a 500.0f * (fX - fY); *b 200.0f * (fY - fZ); } void RGB2Lab(int R, int G, int B, float *L, float *a, float *b) { float X, Y, Z; RGB2XYZ(R, G, B, X, Y, Z); XYZ2Lab(X, Y, Z, L, a, b); } void Lab2RGB(float L, float a, float b, int *R, int *G, int *B) { float X, Y, Z; Lab2XYZ(L, a, b, X, Y, Z); XYZ2RGB(X, Y, Z, R, G, B); } int main() { Mat raw_image imread(../pic6.jpg); if (raw_image.empty()) { cout read error endl; return 0; } vectorvectorvectorfloat image;//x,y,(L,a,b) int rows raw_image.rows; int cols raw_image.cols; cout rows: rows cols: cols endl; int N rows * cols; //K个超像素 int K 150; cout cluster num: K endl; int M 40; //以步距为S的距离划分超像素 int S (int) sqrt(N / K); cout S: S endl; //RGB2Lab for (int i 0; i rows; i) { vectorvectorfloat line; for (int j 0; j cols; j) { vectorfloat pixel; float L; float a; float b; RGB2Lab(raw_image.atVec3b(i, j)[2], raw_image.atVec3b(i, j)[1], raw_image.atVec3b(i, j)[0], L, a, b); pixel.push_back(L); pixel.push_back(a); pixel.push_back(b); line.push_back(pixel); } image.push_back(line); } cout RGB2Lab is finished endl; //聚类中心[x y l a b] vectorvectorfloat Cluster; //生成所有聚类中心 for (int i S / 2; i rows; i S) { for (int j S / 2; j cols; j S) { vectorfloat c; push_back((float) i); push_back((float) j); push_back(image[i][j][0]); push_back(image[i][j][1]); push_back(image[i][j][2]); Cluster.push_back(c); } } int cluster_num Cluster.size(); cout init cluster is finished endl; //获得最小梯度值作为新中心点 for (int c 0; c cluster_num; c) { int c_row (int) Cluster[c][0]; int c_col (int) Cluster[c][1]; //梯度以右侧和下侧两个像素点来计算分别计算Lab三个的梯度来求和 //需要保证当前点右侧和下侧是存在的点否则就向左上移动来替代梯度值 if (c_row 1 rows) { c_row rows - 2; } if (c_col 1 cols) { c_col cols - 2; } float c_gradient image[c_row 1][c_col][0] image[c_row][c_col 1][0] - 2 * image[c_row][c_col][0] image[c_row 1][c_col][1] image[c_row][c_col 1][1] - 2 * image[c_row][c_col][1] image[c_row 1][c_col][2] image[c_row][c_col 1][2] - 2 * image[c_row][c_col][2]; for (int i -1; i 1; i) { for (int j -1; j 1; j) { int tmp_row c_row i; int tmp_col c_col j; if (tmp_row 1 rows) { tmp_row rows - 2; } if (tmp_col 1 cols) { tmp_col cols - 2; } float tmp_gradient image[tmp_row 1][tmp_col][0] image[tmp_row][tmp_col 1][0] - * image[tmp_row][tmp_col][0] image[tmp_row 1][tmp_col][1] image[tmp_row][tmp_col 1][1] - 2 * image[tmp_row][tmp_col][1] image[tmp_row 1][tmp_col][2] image[tmp_row][tmp_col 1][2] - * image[tmp_row][tmp_col][2]; if (tmp_gradient c_gradient) { Cluster[c][0] (float) tmp_row; Cluster[c][1] (float) tmp_col; Cluster[c][2] image[tmp_row][tmp_col][0]; Cluster[c][3] image[tmp_row][tmp_col][1]; Cluster[c][3] image[tmp_row][tmp_col][2]; c_gradient tmp_gradient; } } } } cout move cluster is finished; //创建一个dis的矩阵for each pixel ∞ vectorvectordouble distance; for (int i 0; i rows; i) { vectordouble tmp; for (int j 0; j cols; j) { tmp.push_back(INT_MAX); } distance.push_back(tmp); } //创建一个dis的矩阵for each pixel -1 vectorvectorint label; for (int i 0; i rows; i) { vectorint tmp; for (int j 0; j cols; j) { tmp.push_back(-1); } label.push_back(tmp); } //为每一个Cluster创建一个pixel集合 vectorvectorvectorint pixel(Cluster.size()); //核心代码部分迭代计算 for (int t 0; t 10; t) { cout endl iteration num: t 1 ; //遍历所有的中心点,在2S范围内进行像素搜索 int c_num 0; for (int c 0; c cluster_num; c) { if (c - c_num (cluster_num / 10)) { cout ; c_num c; } int c_row (int) Cluster[c][0]; int c_col (int) Cluster[c][1]; float c_L Cluster[c][2]; float c_a Cluster[c][3]; float c_b Cluster[c][4]; for (int i c_row - 2 * S; i c_row 2 * S; i) { if (i 0 || i rows) { continue; } for (int j c_col - 2 * S; j c_col 2 * S; j) { if (j 0 || j cols) { continue; } float tmp_L image[i][j][0]; float tmp_a image[i][j][1]; float tmp_b image[i][j][2]; double Dc sqrt((tmp_L - c_L) * (tmp_L - c_L) (tmp_a - c_a) * (tmp_a - c_a) (tmp_b - c_b) * (tmp_b - c_b)); double Ds sqrt((i - c_row) * (i - c_row) (j - c_col) * (j - c_col)); double D sqrt((Dc / (double) M) * (Dc / (double) M) (Ds / (double) S) * (Ds / (double) S)); if (D distance[i][j]) { if (label[i][j] -1) { //还没有被标记过 label[i][j] c; vectorint point; point.push_back(i); point.push_back(j); pixel[c].push_back(point); } else { int old_cluster label[i][j]; vectorvectorint::iterator iter; for (iter pixel[old_cluster].begin(); iter ! pixel[old_cluster].end(); iter) { if ((*iter)[0] i (*iter)[1] j) { pixel[old_cluster].erase(iter); break; } } label[i][j] c; vectorint point; point.push_back(i); point.push_back(j); pixel[c].push_back(point); } distance[i][j] D; } } } } cout start update cluster; for (int c 0; c Cluster.size(); c) { int sum_i 0; int sum_j 0; int number 0; for (int p 0; p pixel[c].size(); p) { sum_i pixel[c][p][0]; sum_j pixel[c][p][1]; number; } int tmp_i (int) ((double) sum_i / (double) number); int tmp_j (int) ((double) sum_j / (double) number); Cluster[c][0] (float) tmp_i; Cluster[c][1] (float) tmp_j; Cluster[c][2] image[tmp_i][tmp_j][0]; Cluster[c][3] image[tmp_i][tmp_j][1]; Cluster[c][4] image[tmp_i][tmp_j][2]; } } //导出Lab空间的矩阵 vectorvectorvectorfloat out_image image;//x,y,(L,a,b) for (int c 0; c Cluster.size(); c) { for (int p 0; p pixel[c].size(); p) { out_image[pixel[c][p][0]][pixel[c][p][1]][0] Cluster[c][2]; out_image[pixel[c][p][0]][pixel[c][p][1]][1] Cluster[c][3]; out_image[pixel[c][p][0]][pixel[c][p][1]][2] Cluster[c][4]; } out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][0] 0; out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][1] 0; out_image[(int) Cluster[c][0]][(int) Cluster[c][1]][2] 0; } cout endl export image mat finished endl; Mat print_image raw_image.clone(); for (int i 0; i rows; i) { for (int j 0; j cols; j) { float L out_image[i][j][0]; float a out_image[i][j][1]; float b out_image[i][j][2]; int R, G, B; Lab2RGB(L, a, b, R, G, B); Vec3b vec3b; vec3b[0] B; vec3b[1] G; vec3b[2] R; print_image.atVec3b(i, j) vec3b; } } imshow(print_image, print_image); waitKey(0); //暂停保持图像显示等待按键结束 return 0; }实验结果和分析直接查看前后对比效果我们将中心点设置成黑色其余点均设置成原中心点的颜色这样就可以预览效果。