Power iteration clustering (PIC)

Published 2019-04-15 13:01

Power iteration clustering (PIC) is a scalable and efficient algorithm for clustering the vertices of a graph given pairwise similarities as edge properties, described in Lin and Cohen, Power Iteration Clustering. It computes a pseudo-eigenvector of the normalized affinity matrix of the graph via power iteration and uses it to cluster vertices. MLlib includes an implementation of PIC using GraphX as its backend.

The implementation takes an RDD of (srcId, dstId, similarity) tuples and outputs a model with the clustering assignments. The similarities must be nonnegative. PIC assumes that the similarity measure is symmetric, so a pair (srcId, dstId) should appear at most once in the input data, regardless of ordering. If a pair is missing from the input, its similarity is treated as zero.

MLlib’s PIC implementation takes the following (hyper-)parameters:

k: number of clusters
maxIterations: maximum number of power iterations
initializationMode: initialization mode. This can be either “random” (the default), which uses a random vector as the initial vertex properties, or “degree”, which uses the normalized sum of similarities.
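
To make the input contract and the three parameters concrete, here is a minimal plain-Python sketch of the idea. It is not MLlib's GraphX implementation: the names pic_sketch and kmeans_1d, the seed argument, the fixed iteration count, and the simple 1-D k-means step are all assumptions made for illustration. It checks the input rules described above, runs power iteration on the row-normalized affinity matrix, and clusters the resulting vector.

```python
import random


def kmeans_1d(values, k, iterations=20):
    """Cluster scalars with a basic 1-D k-means; returns one label per value."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly between the smallest and largest value.
    centers = [lo + (hi - lo) * c / max(k - 1, 1) for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def normalize(vec):
    """Scale a {vertex: value} mapping so the absolute values sum to 1."""
    total = sum(abs(x) for x in vec.values())
    return {v: x / total for v, x in vec.items()} if total else vec


def pic_sketch(similarities, k, max_iterations=10, init_mode="random", seed=0):
    """Toy PIC over (src_id, dst_id, similarity) tuples; returns {vertex: cluster}."""
    weights, seen = {}, set()
    for src, dst, sim in similarities:
        if sim < 0:
            raise ValueError("similarities must be nonnegative")
        pair = (min(src, dst), max(src, dst))
        if pair in seen:
            raise ValueError(f"pair {pair} appears more than once")
        seen.add(pair)
        # The measure is assumed symmetric, so store the edge in both directions.
        weights.setdefault(src, {})[dst] = sim
        weights.setdefault(dst, {})[src] = sim

    nodes = sorted(weights)
    degree = {v: sum(weights[v].values()) for v in nodes}

    if init_mode == "degree":
        vec = normalize(dict(degree))          # normalized sum of similarities
    elif init_mode == "random":
        rng = random.Random(seed)              # seed is an illustrative extra
        vec = normalize({v: rng.random() for v in nodes})
    else:
        raise ValueError("init_mode must be 'random' or 'degree'")

    for _ in range(max_iterations):
        # One power-iteration step with the row-normalized affinity matrix.
        vec = normalize({
            v: (sum(w * vec[u] for u, w in weights[v].items()) / degree[v])
            if degree[v] > 0 else vec[v]
            for v in nodes
        })

    labels = kmeans_1d([vec[v] for v in nodes], k)
    return dict(zip(nodes, labels))


edges = [
    (0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),   # tightly connected group
    (3, 4, 0.5), (3, 5, 0.5), (4, 5, 0.5),   # second group
    (2, 3, 0.01),                            # weak bridge between the groups
]
assignments = pic_sketch(edges, k=2, max_iterations=10, init_mode="degree")
print(assignments)
```

With the "degree" initialization, the two triangles in this toy graph should end up in different clusters. The iteration count is kept small on purpose: if power iteration runs to full convergence on a connected graph, the vector flattens toward a constant and the cluster structure is lost, which is why PIC stops early and why maxIterations matters.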
