L2-005 Set Similarity (25 points) (set container)

Here_SDUT

Published 2022-08-08 16:25:46


This article is included in the column 机器学习炼丹之旅.

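Given two sets of integers, their similarity is defined as N_c/N_t×100%, where N_c is the number of distinct integers that appear in both sets and N_t is the number of distinct integers that appear in the two sets combined. Your task is to compute the similarity of any given pair of sets.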
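The definition can be sketched directly for a single pair of sets. This is only an illustration: the function name `similarityPercent` is hypothetical and does not appear in the original post, and it assumes at least one of the two sets is non-empty.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper (not from the original post): similarity of two sets
// as a percentage, following the definition N_c / N_t * 100.
double similarityPercent(const std::set<int>& a, const std::set<int>& b) {
    assert(!a.empty() || !b.empty());  // N_t must be positive
    int common = 0;                    // N_c: values present in both sets
    for (int x : a) {
        if (b.count(x)) ++common;
    }
    // N_t: distinct values overall, by inclusion-exclusion
    int total = static_cast<int>(a.size() + b.size()) - common;
    return 100.0 * common / total;
}
```

Because `std::set` discards duplicates on insertion, repeated values in the input do not affect either count.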

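Input format: the first line gives a positive integer N (≤50), the number of sets. N lines follow, one per set. Each set starts with a positive integer M (≤10^4), the number of elements in the set, followed by the M elements themselves.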

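The next line gives a positive integer K (≤2000), followed by K lines, each containing the indices of a pair of sets whose similarity must be computed (sets are numbered from 1 to N). Numbers are separated by spaces.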

Output format: for each queried pair of sets, print their similarity on its own line as a percentage rounded to 2 decimal places.
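One detail of this output format is easy to get wrong in C or C++: in a printf-style format string a literal percent sign has to be written as `%%`. A minimal sketch, with `formatPercent` being a hypothetical helper name used only for illustration:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format common/total as a percentage with two decimals.
std::string formatPercent(int common, int total) {
    assert(total > 0);
    char buf[32];
    // "%.2f" rounds to two decimal places; "%%" emits a literal '%'
    std::snprintf(buf, sizeof buf, "%.2f%%", 100.0 * common / total);
    return buf;
}
```

Multiplying by `100.0` rather than `100` forces floating-point division, so the fraction is not truncated to an integer.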

Sample input:

3
3 99 87 101
4 87 101 5 87
7 99 101 18 5 135 18 99
2
1 2
1 3

Sample output:

50.00%
33.33%
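Working the sample by hand shows where the two numbers come from. After removing duplicates, set 1 is {87, 99, 101}, set 2 is {5, 87, 101}, and set 3 is {5, 18, 99, 101, 135}:

```latex
\[
\begin{aligned}
\text{Sets 1 and 2:}\quad & N_c = |\{87, 101\}| = 2, \quad N_t = 3 + 3 - 2 = 4, \quad \tfrac{2}{4} \times 100\% = 50.00\% \\
\text{Sets 1 and 3:}\quad & N_c = |\{99, 101\}| = 2, \quad N_t = 3 + 5 - 2 = 6, \quad \tfrac{2}{6} \times 100\% \approx 33.33\%
\end{aligned}
\]
```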

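Analysis: there are at most 50 sets, so every pair can be preprocessed; the number of pairs is C_{50}^2 = 49 * 25 = 1225. Store each collection in a set. During preprocessing, for each pair iterate over the smaller of the two sets and look each element up in the larger one, storing the number of elements shared by set i and set j in both[i][j]. N_c is then both[i][j], and N_t is the sum of the two sets' sizes minus both[i][j]. Time complexity: at most 25 * 49 * 10000 ≈ 1.2 * 10^7 lookups, each costing O(log 10000). Taking the logarithm base 10 (log 10000 = 4) gives the figure 49000000; with base 2 (about 13.3 comparisons per lookup) the worst case is roughly 1.6 * 10^8 comparisons.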
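The per-pair step of this analysis can be sketched as a small function. The name `countCommon` is hypothetical and used only for illustration; the logic is the same smaller-into-larger lookup described above.

```cpp
#include <set>

// Hypothetical helper: number of values shared by two sets.
// Iterates over the smaller set and looks each value up in the larger one,
// so the cost is O(min(|a|, |b|) * log(max(|a|, |b|))).
int countCommon(const std::set<int>& a, const std::set<int>& b) {
    const std::set<int>& smaller = (a.size() <= b.size()) ? a : b;
    const std::set<int>& larger = (a.size() <= b.size()) ? b : a;
    int common = 0;
    for (int x : smaller) {
        if (larger.find(x) != larger.end()) ++common;
    }
    return common;
}
```

Binding references to the smaller and larger set avoids duplicating the loop body for the two cases.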

Code

//                              _ooOoo_
//                             o8888888o
//                             88" . "88
//                             (| -_- |)
//                              O\ = /O
//                           ____/`---'\____
//                        .   ' \| |// `.
//                         / \||| : |||// \
//                        / _||||| -:- |||||- \
//                         | | \\ - /// | |
//                       | \_| ''\---/'' | |
//                        \ .-\__ `-` ___/-. /
//                    ___`. .' /--.--\ `. . __
//                  ."" '< `.___\_<|>_/___.' >'"".
//                 | | : `- \`.;`\ _ /`;.`/ - ` : | |
//                    \ \ `-. \_ __\ /__ _/ .-` / /
//           ======`-.____`-.___\_____/___.-`____.-'======
//                              `=---='
//
//           .............................................
//                   Buddha bless     AC on first try     never a BUG
#include <bits/stdc++.h>
#define LL long long
using namespace std;
const int maxn = 1e5 + 10;
const int inf = 0x3f3f3f3f;
const double PI = acos(-1.0);
typedef pair<int, int> PII;

int both[100][100];
set<int> s[110];
int main(int argc, char const *argv[]) {
    int n;
    cin >> n;
    for (int i = 1; i <= n; i++) {
        int m;
        cin >> m;
        while (m--) {
            int x;
            cin >> x;
            s[i].insert(x);
        }
    }
    // Precompute the number of shared elements for every pair of sets
    for (int i = 1; i <= n; i++) {
        for (int j = i + 1; j <= n; j++) {
            if (s[i].size() <= s[j].size()) {  // iterate over the smaller set, look up in the larger one
                for (auto p : s[i]) {
                    if (s[j].find(p) != s[j].end()) {
                        both[i][j]++;
                        both[j][i]++;
                    }
                }
            }
            else {
                for (auto p : s[j]) {
                    if (s[i].find(p) != s[i].end()) {
                        both[i][j]++;
                        both[j][i]++;
                    }
                }
            }
        }
    }
    int k;
    cin >> k;
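    while (k--) {
        int a, b;
        cin >> a >> b;
        // N_t = |A| + |B| - N_c
        int total = (int)s[a].size() + (int)s[b].size() - both[a][b];
        // "%%" prints a literal percent sign; a lone '%' before '\n' is invalid
        printf("%.2f%%\n", both[a][b] * 100.0 / total);
    }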
    return 0;
}
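As a point of comparison, and not the author's approach, the shared-element count can also be obtained without `std::set`: keep each collection as a sorted, duplicate-free `std::vector` and walk both with two indices, which costs O(|a| + |b|) per pair. The names `normalize` and `countCommonSorted` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Alternative sketch (an assumption, not from the original post):
// sort the values and drop duplicates so the vector behaves like a set.
std::vector<int> normalize(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}

// Count values present in both sorted, duplicate-free vectors
// with a linear two-pointer merge.
int countCommonSorted(const std::vector<int>& a, const std::vector<int>& b) {
    std::size_t i = 0, j = 0;
    int common = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;
        } else if (b[j] < a[i]) {
            ++j;
        } else {  // equal values: shared element
            ++common;
            ++i;
            ++j;
        }
    }
    return common;
}
```

With this variant N_t is still the two normalized sizes added together minus the shared count.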


Originally published: 2021-3-13 9. In case of infringement, please contact cloudcommunity@tencent.com for removal.
