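To obtain a subset of an array that contains an equal number of values from each of the intervals (0.0, 0.2), (0.2, 0.4), (0.4, 0.8), and (0.8, 1.0), proceed as follows.
Assume we have an array data containing a large number of floats, and we want to draw n samples from each interval.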
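First, create an empty list result to store the final subset.
Then, for each interval, select the values of data that fall inside it and randomly draw n samples.
Finally, add the drawn samples to the result array.
import random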
def equal_distribution_sampling(data, n):
    intervals = [(0.0, 0.2), (0.2, 0.4), (0.4, 0.8), (0.8, 1.0)]
    result = []
    for start, end in intervals:
        # Filter data within the current interval
        interval_data = [x for x in data if start < x < end]
        # Randomly sample n elements from the interval data
        sampled_data = random.sample(interval_data, min(n, len(interval_data)))
        # Append to the result
        result.extend(sampled_data)
    return result

# Example usage
data = [random.uniform(0, 1) for _ in range(1000)]  # Generate a list of 1000 random floats between 0 and 1
n = 100  # Number of samples per interval
sampled_subset = equal_distribution_sampling(data, n)
print(sampled_subset)
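To check how a sample is distributed, it can help to count how many values land in each interval. The following is a minimal sketch; the helper name count_per_interval is hypothetical and not part of the original answer, and it uses the same strict comparison (start < x < end) as the sampling function above, so values lying exactly on a boundary are not counted.

```python
import random

# Hypothetical helper (not from the original answer): counts the values
# that fall strictly inside each (start, end) interval.
def count_per_interval(values, intervals):
    return {
        (start, end): sum(1 for x in values if start < x < end)
        for start, end in intervals
    }

intervals = [(0.0, 0.2), (0.2, 0.4), (0.4, 0.8), (0.8, 1.0)]
values = [random.uniform(0, 1) for _ in range(1000)]

# Print one line per interval with the number of values it contains
for interval, count in count_per_interval(values, intervals).items():
    print(interval, count)
```

Running the same helper on the returned subset shows whether every interval actually contributed n values.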
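If an interval contains fewer than n values, n samples cannot be drawn from it.
In that case, either draw fewer than n samples from that interval or borrow samples from neighboring intervals.
With the method above, each specified interval contributes the same number of values to the subset, provided every interval holds at least n values.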
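When some interval holds fewer than n values, one possible way to keep the counts strictly equal is to cap the per-interval sample size at the size of the smallest interval. The sketch below illustrates that idea; the function name balanced_sampling and its return shape are assumptions made for illustration, and it is an alternative to borrowing from neighboring intervals, not the original answer's method.

```python
import random

# Hypothetical variant (an assumption, not the original answer's code):
# every interval contributes exactly k values, where k is n capped at the
# size of the smallest interval. Assumes intervals is non-empty.
def balanced_sampling(data, n, intervals):
    # Group the data by interval, using the same strict bounds as before
    buckets = [[x for x in data if start < x < end] for start, end in intervals]
    # The smallest bucket limits how many values each interval can provide
    k = min(n, min(len(bucket) for bucket in buckets))
    result = []
    for bucket in buckets:
        result.extend(random.sample(bucket, k))
    return result, k

intervals = [(0.0, 0.2), (0.2, 0.4), (0.4, 0.8), (0.8, 1.0)]
data = [0.05, 0.1, 0.15, 0.25, 0.3, 0.5, 0.6, 0.7, 0.9]
subset, k = balanced_sampling(data, 3, intervals)
print(k, subset)
```

The trade-off is that a single sparse interval reduces the sample size for all intervals, so this only suits cases where equal counts matter more than total sample size.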