BigQuery中的重复字段(REPEATED fields)是指可以包含多个值的字段类型,类似于其他数据库系统中的数组类型。重复字段在BigQuery的表结构中表示为可以存储多个值的列。
BigQuery支持以下类型的重复字段:
CREATE TABLE dataset.repeated_example (
user_id INT64,
user_name STRING,
tags ARRAY<STRING>, -- 简单重复字段
orders ARRAY<STRUCT< -- 嵌套重复字段
order_id STRING,
amount FLOAT64,
items ARRAY<STRUCT<
item_id STRING,
quantity INT64
>>
>>
);
使用UNNEST函数展开重复字段:
SELECT user_id, tag
FROM dataset.repeated_example,
UNNEST(tags) AS tag;
INSERT INTO dataset.repeated_example
(user_id, user_name, tags, orders)
VALUES
(1, 'John', ['premium', 'vip'],
[
STRUCT('ord123' AS order_id, 99.99 AS amount,
[
STRUCT('item1' AS item_id, 2 AS quantity),
STRUCT('item2' AS item_id, 1 AS quantity)
] AS items)
]);
UPDATE dataset.repeated_example
SET tags = ARRAY_CONCAT(tags, ['new_tag'])
WHERE user_id = 1;
SELECT user_id, ARRAY_LENGTH(tags) AS tag_count
FROM dataset.repeated_example;
-- 或者计算嵌套重复字段中的总和
SELECT
user_id,
(SELECT SUM(amount) FROM UNNEST(orders)) AS total_spent
FROM dataset.repeated_example;
BigQuery的重复字段功能为处理复杂数据结构提供了强大支持,合理使用可以显著简化数据模型并提高查询效率。
没有搜到相关的文章