社群发现算法实现:CPM,基于SPARK+SCALA+MAVEN+Hadoop
选择此框架实现原因:
(1)SPARK的Graphx对于图操作较为便捷。
(2)SPARK下的并行处理,便于对以后大型数据集进行扩充。
详情见我的Github:传送门
代码和数据集都在里面,需要的自己clone。
ps:对我的算法或结果有什么疑问,或者搬代码。请评论留言。
以下是我的Readme陈述算法思路,还没写完,先发上来,增加浏览量,之后部分我近几天补充。
💡 The latest and final version of this algorithm, its route is BK\src\main\scala\BBB\Final.scala ✂️ I'm so sorry that I donn't have sufficient time to sort up my code. I'll gradually update it in the next few days after my first Commit.😝
use [spark+scala] to implement CPM(Cluster Percolation Method)😃 As we konw, the CPM algorithm has three steps on the whole, but I have made a little alteration for the dataset and my project. My algorithm process is shown below.
1.Find all kliques of size k(or larger than k) in the graph, As a result, Bron-kerbosch algorithm is an efficient choice for this step. ps: There are two versions of BK that are most commonly used: (1)the basic form; (2)with pivoting. For my experience, the second one has a better performance 😏
2.Fliter noisy nodes to make the final evaluation(use the modularity) valuable, so I create a new subgraph based on the initial graph. ps:the noisy nodes are refer to the ones who has few connections with others (such as: the isolated points)
3.Statistic whether these two maximal cliques are 'related'(if they contain the same nodes more than n), If they are 'related', they could be merged together.
😕ps: (1)the author of the paper use the matrix to finally merge all the 'related' maximal cliques together, I use the set to store which maximal cliques are 'related' with the current one. However, this alteratin doesn't make any difference essentially. (2)n is just a variable, you can adjust it with your result. ☺️
4.Use the ans of step 3 to merge the 'related' maximal cliques into one community.
5.Use the modularity to evaluate the outcome.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 👊 Now I generalize my function in this project (1)bronKerboschl the basic form of bronkerbosch algorithm (2)bronkerbosch2 the bronkerbosch algorithm with pivoting
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。
原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。
如有侵权,请联系 cloudcommunity@tencent.com 删除。