Community search


Discovering communities in a network, known as community detection or community discovery, is a fundamental problem in network science that has attracted much attention over the past several decades. In recent years{{when|date=September 2024}}, with the growth of research on big data, a related but distinct problem called community search has attracted great attention from both academia and industry. Community search aims to find the most likely community that contains a given query node; it is a query-dependent variant of the community detection problem. A detailed survey of community search{{cite arXiv |eprint=1904.12539|last1=Fang|first1=Yixiang|last2=Huang|first2=Xin|last3=Qin|first3=Lu|last4=Zhang|first4=Ying|last5=Zhang|first5=Wenjie|last6=Cheng|first6=Reynold|last7=Lin|first7=Xuemin|title=A Survey of Community Search over Big Graphs|year=2019|class=cs.DB}} reviews the recent studies.

{{cite book|doi=10.1145/1835804.1835923|chapter=The community-search problem and how to plan a successful cocktail party|title=Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '10|year=2010|last1=Sozio|first1=Mauro|last2=Gionis|first2=Aristides|page=939|isbn=9781450300551|s2cid=11484255}}{{cite book|doi=10.1145/2463676.2463722|chapter=Online search of overlapping communities|title=Proceedings of the 2013 international conference on Management of data - SIGMOD '13|year=2013|last1=Cui|first1=Wanyun|last2=Xiao|first2=Yanghua|last3=Wang|first3=Haixun|last4=Lu|first4=Yiqi|last5=Wang|first5=Wei|page=277|isbn=9781450320375|s2cid=953025}}{{cite book|doi=10.1145/2588555.2612179|chapter=Local search of communities in large graphs|title=Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data|year=2014|last1=Cui|first1=Wanyun|last2=Xiao|first2=Yanghua|last3=Wang|first3=Haixun|last4=Wang|first4=Wei|pages=991–1002|isbn=9781450323765|s2cid=4653380}}{{cite book|doi=10.1145/2588555.2610495|chapter=Querying k-truss community in large and dynamic graphs|title=Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data|year=2014|last1=Huang|first1=Xin|last2=Cheng|first2=Hong|last3=Qin|first3=Lu|last4=Tian|first4=Wentao|last5=Yu|first5=Jeffrey Xu|pages=1311–1322|isbn=9781450323765|s2cid=207211829}}{{cite journal|doi=10.14778/2735479.2735484|title=Influential community search in large networks|year=2015|last1=Li|first1=Rong-Hua|last2=Qin|first2=Lu|last3=Yu|first3=Jeffrey Xu|last4=Mao|first4=Rui|journal=Proceedings of the VLDB Endowment|volume=8|issue=5|pages=509–520|s2cid=17672355 |citeseerx=10.1.1.667.4074}}{{cite journal|doi=10.1007/s10618-015-0422-1|title=Efficient and effective community search|year=2015|last1=Barbieri|first1=Nicola|last2=Bonchi|first2=Francesco|last3=Galimberti|first3=Edoardo|last4=Gullo|first4=Francesco|journal=Data Mining and Knowledge 
Discovery|volume=29|issue=5|pages=1406–1433|s2cid=13440433}}{{cite journal|doi=10.14778/2856318.2856323|title=Approximate closest community search in networks|year=2015|last1=Huang|first1=Xin|last2=Lakshmanan|first2=Laks V. S.|last3=Yu|first3=Jeffrey Xu|last4=Cheng|first4=Hong|journal=Proceedings of the VLDB Endowment|volume=9|issue=4|pages=276–287|arxiv=1505.05956|s2cid=2905457}}

{{cite journal|doi=10.14778/2994509.2994538|title=Effective community search for large attributed graphs|year=2016|last1=Fang|first1=Yixiang|last2=Cheng|first2=Reynold|last3=Luo|first3=Siqiang|last4=Hu|first4=Jiafeng|journal=Proceedings of the VLDB Endowment|volume=9|issue=12|pages=1233–1244|hdl=10722/232839|hdl-access=free}}

{{cite journal|doi=10.14778/3055330.3055337|title=Effective community search over large spatial graphs|year=2017|last1=Fang|first1=Yixiang|last2=Cheng|first2=Reynold|last3=Li|first3=Xiaodong|last4=Luo|first4=Siqiang|last5=Hu|first5=Jiafeng|journal=Proceedings of the VLDB Endowment|volume=10|issue=6|pages=709–720|hdl=10722/243528|hdl-access=free}}

{{Cite web|url=http://i.cs.hku.hk/~yxfang/acq.html|title = Effective Community Search for Large Attributed Graphs}}

Main advantages

As pointed out by the first work on community search, published at SIGKDD 2010, many existing community detection/discovery methods consider the static community detection problem, in which the graph must be partitioned a priori with no reference to query nodes. Community search, in contrast, focuses on the most likely communities containing the query vertex. The main advantages of community search over community detection/discovery are listed below:

(1) High personalization. Community detection/discovery often uses the same global criterion to decide whether a subgraph qualifies as a community; in other words, the criterion is fixed and predetermined. In reality, however, communities for different vertices may have very different characteristics. Community search allows query users to specify more personalized query conditions, which also make the resulting communities easier to interpret.

For example, a recent work focuses on attributed graphs, in which nodes are associated with attributes such as keywords, and tries to find communities, called attributed communities, that exhibit both strong structural and keyword cohesiveness. Query users can specify a query node and further query conditions: (1) a value k, the minimum degree within the expected communities; and (2) a set of keywords, which controls the semantics of the expected communities. The returned communities can be easily interpreted through the keywords shared by all community members. More details can be found in that work.
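The query described above can be illustrated with a minimal Python sketch. It is a simplified variant, not the algorithm of the cited work: it assumes the graph is given as a symmetric adjacency dictionary `adj` and a dictionary `keywords` mapping each node to its keyword set (both names are hypothetical), keeps only nodes that carry every query keyword, peels nodes of degree below k, and returns the connected part containing the query node.

```python
from collections import deque


def attributed_community(adj, keywords, q, k, query_keywords):
    """Connected subgraph containing q in which every node has at least k
    neighbours and carries all query keywords (empty set if none exists)."""
    required = set(query_keywords)
    # Keep only nodes whose keyword set covers the query keywords.
    alive = {v for v in adj if required <= keywords.get(v, set())}
    if q not in alive:
        return set()
    # Degree of each surviving node inside the filtered graph.
    deg = {v: sum(1 for u in adj[v] if u in alive) for v in alive}
    # Repeatedly remove nodes whose degree is below k (k-core peeling).
    queue = deque(v for v in alive if deg[v] < k)
    while queue:
        v = queue.popleft()
        if v not in alive:
            continue
        alive.discard(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                if deg[u] < k:
                    queue.append(u)
    if q not in alive:
        return set()
    # Collect the connected component of q among the survivors.
    community, frontier = {q}, deque([q])
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:
            if u in alive and u not in community:
                community.add(u)
                frontier.append(u)
    return community


# Toy data (assumed for illustration): a triangle a-b-c with a tail c-d-e.
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e"}, "e": {"d"},
}
keywords = {
    "a": {"db", "ml"}, "b": {"db", "ml"}, "c": {"db", "ml", "ir"},
    "d": {"db"}, "e": {"ml"},
}
result = attributed_community(adj, keywords, "a", 2, {"db", "ml"})
```

In this toy graph the query for node `a` with k = 2 and keywords `db` and `ml` yields the triangle `a`, `b`, `c`, and the shared keywords serve as a direct explanation of why these nodes belong together.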

(2) High efficiency. With the rapid growth of social networks in recent years, many real-world graphs have become very large. For example, social networks such as Facebook and Twitter have hundreds of millions to billions of users. As community detection/discovery often finds all the communities in an entire social network, it can be very costly and time-consuming. In contrast, community search often works on a subgraph, which is much more efficient. Moreover, detecting all the communities in an entire social network is often unnecessary. In real applications such as recommendation and social media marketing, people often focus on the communities they are really interested in, rather than on all the communities.

Some recent studies have shown that, for million-scale graphs, community search often takes less than 1 second to find a well-defined community, which is generally much faster than many existing community detection/discovery methods. This also implies that community search is more suitable for finding communities in big graphs.

(3) Support for dynamically evolving graphs. Almost all real-life graphs evolve over time. Since community detection often uses the same global criterion to find communities, it is not sensitive to updates of nodes and edges in graphs. In other words, the detected communities may lose freshness after a short period of time. In contrast, community search handles this easily, since it searches for communities in an online manner, based on a query request.
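The online behaviour can be sketched as follows. This is an illustrative example under stated assumptions rather than a method from the cited studies: the class `DynamicGraph` and its methods are hypothetical names, the community model is the connected k-core, and each query is answered on the graph as it exists at query time. The search first collects a local candidate set around the query node, so it touches only part of the graph.

```python
from collections import deque


class DynamicGraph:
    """Undirected simple graph that accepts edge updates between queries."""

    def __init__(self):
        self.adj = {}  # node -> set of neighbours

    def add_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)

    def remove_edge(self, u, v):
        self.adj.get(u, set()).discard(v)
        self.adj.get(v, set()).discard(u)

    def search(self, q, k):
        """Connected k-core containing q in the current graph (may be empty)."""
        adj = self.adj
        if q not in adj or len(adj[q]) < k:
            return set()
        # Candidates: nodes reachable from q through nodes of degree >= k.
        cand, frontier = {q}, deque([q])
        while frontier:
            v = frontier.popleft()
            for u in adj[v]:
                if u not in cand and len(adj[u]) >= k:
                    cand.add(u)
                    frontier.append(u)
        # Peel candidates whose degree inside the candidate set drops below k.
        deg = {v: len(adj[v] & cand) for v in cand}
        queue = deque(v for v in cand if deg[v] < k)
        while queue:
            v = queue.popleft()
            if v not in cand:
                continue
            cand.discard(v)
            for u in adj[v]:
                if u in cand:
                    deg[u] -= 1
                    if deg[u] < k:
                        queue.append(u)
        if q not in cand:
            return set()
        # Peeling can disconnect the survivors, so keep only q's component.
        community, frontier = {q}, deque([q])
        while frontier:
            v = frontier.popleft()
            for u in adj[v]:
                if u in cand and u not in community:
                    community.add(u)
                    frontier.append(u)
        return community


g = DynamicGraph()
g.add_edge("a", "b")
g.add_edge("b", "c")
before = g.search("a", 2)  # path a-b-c: no node set with minimum degree 2
g.add_edge("a", "c")       # the update closes a triangle
after = g.search("a", 2)   # the next query already reflects the new edge
```

Because nothing is precomputed in this sketch, no stored partition can become stale; the trade-off is that every query repeats the local search, which index-based methods in the literature aim to avoid.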

Metrics for community search

Community search often uses well-defined, fundamental graph metrics to formulate the cohesiveness of communities. The commonly used metrics include k-core (minimum degree), k-truss, and k-edge-connectivity,{{Cite book|doi=10.1145/2723372.2746486|chapter=Index-based Optimal Algorithms for Computing Steiner Components with Maximum Connectivity|title=Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data|year=2015|last1=Chang|first1=Lijun|last2=Lin|first2=Xuemin|last3=Qin|first3=Lu|last4=Yu|first4=Jeffrey Xu|last5=Zhang|first5=Wenjie|pages=459–474|isbn=9781450327589|s2cid=18282516}}{{cite journal|doi=10.1109/TKDE.2017.2730873|title=On Minimal Steiner Maximum-Connected Subgraph Queries|year=2017|last1=Hu|first1=Jiafeng|last2=Wu|first2=Xiaowei|last3=Cheng|first3=Reynold|last4=Luo|first4=Siqiang|last5=Fang|first5=Yixiang|journal=IEEE Transactions on Knowledge and Data Engineering|volume=29|issue=11|pages=2455–2469|s2cid=40432915}} among others. Among these measures, the k-core metric is the most popular one and has been used in many recent studies.
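As an illustration of a metric other than minimum degree, the following Python sketch computes a k-truss based community. A k-truss is taken here in its standard sense: a maximal subgraph in which every edge lies in at least k − 2 triangles. The function name `k_truss_community` and the input format (a symmetric adjacency dictionary of a simple graph, with every node present as a key) are assumptions made for this example, and the result is simply the part of the k-truss connected to the query node, which is looser than the triangle-connected community models used in some studies.

```python
from collections import deque


def k_truss_community(adj, q, k):
    """Nodes connected to q inside the k-truss of the graph (may be empty)."""
    # Work on a copy so the caller's graph is left untouched.
    nbrs = {v: set(ns) for v, ns in adj.items()}
    # Support of an edge = number of triangles that contain it.
    support = {frozenset((u, v)): len(nbrs[u] & nbrs[v])
               for u in nbrs for v in nbrs[u]}
    # Repeatedly delete edges that lie in fewer than k - 2 triangles.
    queue = deque(e for e, s in support.items() if s < k - 2)
    removed = set()
    while queue:
        e = queue.popleft()
        if e in removed:
            continue
        removed.add(e)
        u, v = tuple(e)
        # Each common neighbour w closes a triangle that now disappears.
        for w in nbrs[u] & nbrs[v]:
            for f in (frozenset((u, w)), frozenset((v, w))):
                support[f] -= 1
                if support[f] < k - 2:
                    queue.append(f)
        nbrs[u].discard(v)
        nbrs[v].discard(u)
    if not nbrs.get(q):
        return set()  # q has no edge left in the k-truss
    # Collect everything still reachable from q over surviving edges.
    community, frontier = {q}, deque([q])
    while frontier:
        v = frontier.popleft()
        for u in nbrs[v]:
            if u not in community:
                community.add(u)
                frontier.append(u)
    return community


# Toy graph (assumed): a 4-clique a, b, c, d plus a triangle d, e, f.
graph = {
    "a": {"b", "c", "d"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
dense = k_truss_community(graph, "a", 4)
loose = k_truss_community(graph, "a", 3)
```

With k = 4 only the 4-clique survives, since each of its edges lies in two triangles, whereas k = 3 also admits the attached triangle. Raising k therefore tightens the cohesiveness requirement in the same way that raising the minimum degree does for k-core.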

References
