stat (Statistics), 29 papers in total
【1】 Clustering Areal Units at Multiple Levels of Resolution to Model Crime Incidence in Philadelphia
Link: https://arxiv.org/abs/2112.02059
Authors: Cecilia Balocchi, Edward I. George, Shane T. Jensen
Abstract: Estimation of the spatial heterogeneity in crime incidence across an entire city is an important step towards reducing crime and increasing our understanding of the physical and social functioning of urban environments. This is a difficult modeling endeavor since crime incidence can vary smoothly across space and time but there also exist physical and social barriers that result in discontinuities in crime rates between different regions within a city. A further difficulty is that there are different levels of resolution that can be used for defining regions of a city in order to analyze crime. To address these challenges, we develop a Bayesian non-parametric approach for the clustering of urban areal units at different levels of resolution simultaneously. Our approach is evaluated with an extensive synthetic data study and then applied to the estimation of crime incidence at various levels of resolution in the city of Philadelphia.
【2】 Bayesian nonparametric strategies for power maximization in rare variants association studies
Link: https://arxiv.org/abs/2112.02032
Authors: Lorenzo Masoero, Joshua Schraiber, Tamara Broderick
Abstract: Rare variants are hypothesized to be largely responsible for heritability and susceptibility to disease in humans, so rare variants association studies hold promise for understanding disease. Conversely, though, the rareness of the variants poses practical challenges: since these variants are present in few individuals, it can be difficult to develop data-collection and statistical methods that effectively leverage their sparse information. In this work, we develop a novel Bayesian nonparametric model to capture how design choices in rare variants association studies can impact their usefulness. We then show how to use our model to guide design choices under a fixed experimental budget in practice. In particular, we provide a practical workflow and illustrative experiments on simulated data.
【3】 Power analysis for cluster randomized trials with continuous co-primary endpoints
Link: https://arxiv.org/abs/2112.01981
Authors: Siyun Yang, Mirjam Moerbeek, Monica Taljaard, Fan Li
Abstract: Pragmatic trials evaluating health care interventions often adopt cluster randomization due to scientific or logistical considerations. Previous reviews have shown that co-primary endpoints are common in pragmatic trials but infrequently recognized in sample size or power calculations. While methods for power analysis based on $K$ ($K\geq 2$) binary co-primary endpoints are available for cluster randomized trials (CRTs), to our knowledge, methods for continuous co-primary endpoints are not yet available. Assuming a multivariate linear mixed model that accounts for multiple types of intraclass correlation coefficients (endpoint-specific ICCs, intra-subject ICCs, and inter-subject between-endpoint ICCs) among the observations in each cluster, we derive the closed-form joint distribution of the $K$ treatment effect estimators to facilitate sample size and power determination with different types of null hypotheses under equal cluster sizes. We characterize the relationship between the power of each test and the different types of correlation parameters. We further relax the equal cluster size assumption and approximate the joint distribution of the $K$ treatment effect estimators through the mean and coefficient of variation of cluster sizes. Our simulation studies with a finite number of clusters indicate that the power predicted by our method agrees well with the empirical power when the parameters in the multivariate linear mixed model are estimated via the expectation-maximization algorithm. An application to a real CRT is presented to illustrate the proposed method.
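For orientation, here is a minimal sketch of the single-endpoint ($K=1$), equal-cluster-size special case, where the variance of the difference in means is inflated by the familiar design effect $1+(m-1)\rho$. The function name and toy numbers are illustrative assumptions; the paper's multivariate derivation for co-primary endpoints goes well beyond this.

```python
import numpy as np
from scipy.stats import norm

def crt_power(delta, sigma, n_clusters_per_arm, cluster_size, icc, alpha=0.05):
    """Approximate power of a two-arm CRT with one continuous endpoint.

    Single-endpoint special case with equal cluster sizes: the variance of
    the difference in means is inflated by the design effect 1 + (m - 1)*icc.
    """
    deff = 1 + (cluster_size - 1) * icc                      # design effect
    var = 2 * sigma**2 * deff / (n_clusters_per_arm * cluster_size)
    z_alpha = norm.ppf(1 - alpha / 2)
    return norm.cdf(abs(delta) / np.sqrt(var) - z_alpha)

# e.g., 15 clusters of 20 subjects per arm, ICC 0.05, effect of 0.3 SD
print(round(crt_power(delta=0.3, sigma=1.0, n_clusters_per_arm=15,
                      cluster_size=20, icc=0.05), 3))
```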
【4】 A Gaussian copula joint model for longitudinal and time-to-event data with random effects
Link: https://arxiv.org/abs/2112.01941
Authors: Zili Zhang, Christiana Charalambous, Peter Foster
Comments: 31 pages, 4 figures; presented as a speed presentation at the 2021 Joint Statistical Meetings
Abstract: Longitudinal and survival sub-models are two building blocks for the joint modelling of longitudinal and time-to-event data. Extensive research indicates that separate analysis of these two processes could result in biased outputs due to their association. Conditional independence between measurements of biomarkers and the event time process, given latent classes or random effects, is a common approach for characterising the association between the two sub-models while taking the heterogeneity of the population into account. However, this assumption is tricky to validate because of the unobservable latent variables. Thus a Gaussian copula joint model with random effects is proposed to accommodate scenarios where the conditional independence assumption is questionable. In our proposed model, the conventional joint model assuming conditional independence is a special case in which the association parameter in the Gaussian copula shrinks to zero. Simulation studies and a real data application are carried out to evaluate the performance of the proposed model. In addition, personalised dynamic predictions of survival probabilities are obtained based on the proposed model and compared to the predictions obtained under the conventional joint model.
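For reference (background, not this paper's specific derivation), the Gaussian copula underlying such models is the standard construction
$$C_R(u_1,\dots,u_d)=\Phi_R\big(\Phi^{-1}(u_1),\dots,\Phi^{-1}(u_d)\big),\qquad u_j\in(0,1),$$
where $\Phi_R$ is the joint CDF of a $\mathcal{N}_d(0,R)$ vector and $\Phi^{-1}$ is the standard normal quantile function. Consistent with the abstract, letting the off-diagonal association parameters of $R$ shrink to zero factorizes the copula and recovers the conditional-independence joint model.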
【5】 Near-optimal estimation of smooth transport maps with kernel sums-of-squares
Link: https://arxiv.org/abs/2112.01907
Authors: Boris Muzellec, Adrien Vacher, Francis Bach, François-Xavier Vialard, Alessandro Rudi
Abstract: It was recently shown that, under smoothness conditions, the squared Wasserstein distance between two distributions can be efficiently computed with appealing statistical error upper bounds. However, rather than the distance itself, the object of interest for applications such as generative modeling is the underlying optimal transport map. Hence, computational and statistical guarantees need to be obtained for the estimated maps themselves. In this paper, we propose the first tractable algorithm for which the statistical $L^2$ error on the maps nearly matches the existing minimax lower bounds for smooth map estimation. Our method is based on solving the semi-dual formulation of optimal transport with an infinite-dimensional sums-of-squares reformulation, and leads to an algorithm with dimension-free polynomial rates in the number of samples, with potentially exponentially dimension-dependent constants.
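As a point of reference only, the sketch below computes a plug-in transport map estimate by barycentric projection of an entropic coupling between two toy samples, using the POT library. This is a common baseline estimator, not the paper's kernel sums-of-squares method.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
n, d = 500, 2
X = rng.normal(size=(n, d))             # source samples
Y = rng.normal(size=(n, d)) * 0.5 + 1.0 # target samples

a = b = np.full(n, 1 / n)               # uniform weights
M = ot.dist(X, Y)                       # squared Euclidean cost matrix
G = ot.sinkhorn(a, b, M, reg=0.05)      # entropic optimal coupling

# Barycentric projection: a plug-in estimate of the transport map at each X_i
T_hat = (G @ Y) / G.sum(axis=1, keepdims=True)
print(T_hat[:3])
```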
【6】 Bayes in Wonderland! Predictive supervised classification inference hits unpredictability
Link: https://arxiv.org/abs/2112.01880
Authors: Ali Amiryousefi, Ville Kinnula, Jing Tang
Comments: arXiv admin note: text overlap with arXiv:2101.10950
Abstract: The marginal Bayesian predictive classifier (mBpc), as opposed to the simultaneous Bayesian predictive classifier (sBpc), handles each data point separately and hence tacitly assumes independence of the observations. However, due to saturation in the learning of the generative model parameters, the adverse effect of this false assumption on the accuracy of the mBpc tends to wear off as the amount of training data increases, guaranteeing the convergence of the two classifiers under de Finetti-type exchangeability. This result, however, is far from trivial for sequences generated under partition exchangeability (PE), where even an arbitrarily large amount of training data does not rule out the possibility of an unobserved outcome (Wonderland!). We provide a computational scheme that allows the generation of sequences under PE. Based on that, with a controlled increase of the training data, we show the convergence of the sBpc and mBpc. This underlies the use of the simpler yet computationally more efficient marginal classifier instead of the simultaneous one. We also provide a parameter estimate for the generative model giving rise to the partition-exchangeable sequence, as well as a testing paradigm for the equality of this parameter across different samples. The package for Bayesian predictive supervised classification, parameter estimation, and hypothesis testing under the Ewens Sampling Formula generative model is deposited on CRAN as the PEkit package, freely available from https://github.com/AmiryousefiLab/PEkit.
【7】 Chronological Causal Bandits
Link: https://arxiv.org/abs/2112.01819
Authors: Neil Dhir
Comments: 10 pages; accepted at the NeurIPS 2021 workshop Causal Inference Challenges in Sequential Decision Making: Bridging Theory and Practice
Abstract: This paper studies an instance of the multi-armed bandit (MAB) problem, specifically where several causal MABs operate chronologically in the same dynamical system. Practically, the reward distribution of each bandit is governed by the same non-trivial dependence structure, which is a dynamic causal model. Dynamic, because we allow each causal MAB to depend on the preceding MAB, and in doing so we are able to transfer information between agents. Our contribution, the Chronological Causal Bandit (CCB), is useful in discrete decision-making settings where the causal effects are changing across time and can be informed by earlier interventions in the same system. In this paper, we present some early findings for the CCB as demonstrated on a toy problem.
【8】 Data-driven stabilizations of goodness-of-fit tests
Link: https://arxiv.org/abs/2112.01808
Authors: Alberto Fernández-de-Marcos, Eduardo García-Portugués
Comments: 21 pages, 4 figures, 8 tables
Abstract: Exact null distributions of goodness-of-fit test statistics are generally challenging to obtain in tractable forms. Practitioners are therefore usually obliged to rely on asymptotic null distributions or Monte Carlo methods, either in the form of a lookup table or carried out on demand, to apply a goodness-of-fit test. Stephens (1970) provided remarkably simple and useful transformations of several classic goodness-of-fit test statistics that stabilized their exact-$n$ critical values for varying sample sizes $n$. However, detail on the accuracy of these and subsequent transformations in yielding exact $p$-values, or even a deep understanding of the derivation of several transformations, is still scarce nowadays. We illuminate and automate, using modern tools, the latter stabilization approach to (i) expand its scope of applicability and (ii) yield semi-continuous exact $p$-values, as opposed to exact critical values for fixed significance levels. We show improvements in the stabilization accuracy of the exact null distributions of the Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling, Kuiper, and Watson test statistics. In addition, we provide a parameter-dependent exact-$n$ stabilization for several novel statistics for testing uniformity on the hypersphere of arbitrary dimension. A data application in astronomy illustrates the benefits of the advocated stabilization for quickly analyzing small-to-moderate sequentially-measured samples.
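A minimal sketch of the kind of stabilization being studied, using the classic Stephens (1970) modification of the Kolmogorov-Smirnov statistic (constants 0.12 and 0.11 as commonly tabulated): the modified statistic is referred directly to the asymptotic null distribution even for small $n$. The paper's contribution is to automate and refine such transformations.

```python
import numpy as np
from scipy.special import kolmogorov   # survival function of the Kolmogorov law
from scipy.stats import kstest

rng = np.random.default_rng(1)
x = rng.normal(size=40)

D = kstest(x, "norm").statistic
# Stephens-type stabilization: rescale D so the asymptotic (n -> infinity)
# null distribution applies across sample sizes n.
n = len(x)
D_mod = D * (np.sqrt(n) + 0.12 + 0.11 / np.sqrt(n))
print("approximate p-value:", kolmogorov(D_mod))
```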
【9】 Computation of conditional expectations with guarantees
Link: https://arxiv.org/abs/2112.01804
Authors: Patrick Cheridito, Balint Gersey
Abstract: Theoretically, the conditional expectation of a square-integrable random variable $Y$ given a $d$-dimensional random vector $X$ can be obtained by minimizing the mean squared distance between $Y$ and $f(X)$ over all Borel measurable functions $f \colon \mathbb{R}^d \to \mathbb{R}$. However, in many applications this minimization problem cannot be solved exactly, and instead, a numerical method that computes an approximate minimum over a suitable subfamily of Borel functions has to be used. The quality of the result depends on the adequacy of the subfamily and the performance of the numerical method. In this paper, we derive an expected value representation of the minimal mean squared distance which in many applications can efficiently be approximated with a standard Monte Carlo average. This enables us to provide guarantees for the accuracy of any numerical approximation of a given conditional expectation. We illustrate the method by assessing the quality of approximate conditional expectations obtained by linear, polynomial, and neural network regression in different concrete examples.
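To illustrate the underlying approximation problem (not the paper's guarantee itself), the sketch below estimates the mean squared distance of linear and polynomial regressions to $Y$ by a Monte Carlo average on held-out data, on a toy model where the minimal value equals the known noise variance. The toy model and family choices are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, size=(20_000, 1))
Y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.3, size=len(X))

# Here E[Y|X] = sin(2X), so the minimal mean squared distance is 0.3**2 = 0.09.
models = {
    "linear": LinearRegression(),
    "poly-7": make_pipeline(PolynomialFeatures(7), LinearRegression()),
}
for name, m in models.items():
    m.fit(X[:10_000], Y[:10_000])                      # fit the subfamily member
    mse = np.mean((Y[10_000:] - m.predict(X[10_000:])) ** 2)  # Monte Carlo average
    print(f"{name}: estimated mean squared distance = {mse:.4f}")
# The gap above 0.09 quantifies how well each subfamily approximates E[Y|X].
```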
【10】 Optimized variance estimation under interference and complex experimental designs
Link: https://arxiv.org/abs/2112.01709
Authors: Christopher Harshaw, Joel A. Middleton, Fredrik Sävje
Abstract: Unbiased and consistent variance estimators generally do not exist for design-based treatment effect estimators because experimenters never observe more than one potential outcome for any unit. The problem is exacerbated by interference and complex experimental designs. In this paper, we consider variance estimation for linear treatment effect estimators under interference and arbitrary experimental designs. Experimenters must accept conservative estimators in this setting, but they can strive to minimize the conservativeness. We show that this task can be interpreted as an optimization problem in which one aims to find the lowest estimable upper bound of the true variance given one's risk preference and knowledge of the potential outcomes. We characterize the set of admissible bounds in the class of quadratic forms, and we demonstrate that the optimization problem is a convex program for many natural objectives. This allows experimenters to construct less conservative variance estimators, making inferences about treatment effects more informative. The resulting estimators are guaranteed to be conservative regardless of whether the background knowledge used to construct the bound is correct, but the estimators are less conservative if the knowledge is reasonably accurate.
【11】 Learning Curves for Sequential Training of Neural Networks: Self-Knowledge Transfer and Forgetting
Link: https://arxiv.org/abs/2112.01653
Authors: Ryo Karakida, Shotaro Akaho
Comments: 31 pages, 6 figures
Abstract: Sequential training from task to task is becoming one of the major objects of study in deep learning applications such as continual learning and transfer learning. Nevertheless, it remains unclear under what conditions the trained model's performance improves or deteriorates. To deepen our understanding of sequential training, this study provides a theoretical analysis of generalization performance in a solvable case of continual learning. We consider neural networks in the neural tangent kernel (NTK) regime that continually learn target functions from task to task, and investigate the generalization by using an established statistical-mechanical analysis of kernel ridgeless regression. We first show characteristic transitions from positive to negative transfer. Targets that are more similar, above a specific critical value, can achieve positive knowledge transfer to the subsequent task, while catastrophic forgetting occurs even with very similar targets. Next, we investigate a variant of continual learning in which the model learns the same target function in multiple tasks. Even for the same target, the trained model shows some transfer and forgetting depending on the sample size of each task. We can guarantee that the generalization error monotonically decreases from task to task for equal sample sizes, while unbalanced sample sizes deteriorate the generalization. We respectively refer to these improvement and deterioration as self-knowledge transfer and forgetting, and empirically confirm them in realistic training of deep neural networks as well.
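For context, in the NTK regime the trained network's predictor coincides with kernel ridgeless regression under the (fixed) neural tangent kernel $\Theta$,
$$f(x)=\Theta(x,X)\,\Theta(X,X)^{-1}y,$$
and it is the generalization of this interpolant, tracked from task to task, that the statistical-mechanical analysis studies.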
【12】 Empirical phi-divergence test statistics in the logistic regression model
Link: https://arxiv.org/abs/2112.01636
Authors: A. Felipe, P. Garcia-Segador, N. Martin, P. Miranda, L. Pardo
Abstract: In this paper we apply divergence measures to the empirical likelihood of logistic regression models. We define a family of empirical test statistics based on divergence measures, called empirical phi-divergence test statistics, extending the empirical likelihood ratio test. We study the asymptotic distribution of these empirical test statistics, showing that it is the same for all the test statistics in this family, and coincides with that of the classical empirical likelihood ratio test. Next, we study the power function for the members of this family, showing that the empirical phi-divergence tests introduced in the paper are consistent in the Fraser sense. In order to compare the differences in behavior among the empirical phi-divergence test statistics in this new family, considered for the first time in this paper, we carry out a simulation study.
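For reference, a phi-divergence between a fitted distribution $p(\theta)$ and an empirical distribution $\hat p$ takes the generic form
$$d_\phi\big(\hat p, p(\theta)\big)=\sum_i p_i(\theta)\,\phi\!\big(\hat p_i/p_i(\theta)\big),$$
with $\phi$ convex and $\phi(1)=0$; the choice $\phi(t)=t\log t-t+1$ yields the Kullback-Leibler divergence, which recovers the (empirical) likelihood ratio test as one member of the family.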
【13】 Bayesian supervised predictive classification and hypothesis testing toolkit for partition exchangeability
Link: https://arxiv.org/abs/2112.01618
Authors: Ville Kinnula, Jing Tang, Ali Amiryousefi
Abstract: Bayesian supervised predictive classifiers, hypothesis testing, and parametric estimation under partition exchangeability are implemented. The two classifiers presented are the marginal classifier (which assumes test data are i.i.d.) alongside a more computationally costly but accurate simultaneous classifier (which finds a labelling for the entire test dataset at once, based on simultaneous use of all the test data to predict each label). We also provide the Maximum Likelihood Estimate (MLE) of the single underlying parameter of the partition exchangeability generative model, as well as hypothesis testing statistics for the equality of this parameter with a single value, an alternative, or multiple samples. We present functions to simulate sequences from the Ewens Sampling Formula as realisations of the Poisson-Dirichlet distribution, together with their respective probabilities.
【14】 Change-point detection in the covariance kernel of functional data using data depth
Link: https://arxiv.org/abs/2112.01611
Authors: Kelly Ramsay, Shoja'eddin Chenouri
Comments: 40 pages, 7 figures
Abstract: We investigate several rank-based change-point procedures for the covariance operator in a sequence of observed functions, called FKWC change-point procedures. Our methods allow the user to test for one change-point, to test for an epidemic period, or to detect an unknown number of change-points in the data. Our methodology combines functional data depth values with the traditional Kruskal-Wallis test statistic. By taking this approach we have no need to estimate the covariance operator, which makes our methods computationally cheap. For example, our procedure can identify multiple change-points in $O(n\log n)$ time. Our procedure is fully non-parametric and is robust to outliers through the use of data depth ranks. We show that when $n$ is large, our methods have simple behaviour under the null hypothesis. We also show that the FKWC change-point procedures are $n^{-1/2}$-consistent. In addition to asymptotic results, we provide a finite-sample accuracy result for our at-most-one change-point estimator. In simulation, we compare our methods against several others. We also present an application of our methods to intraday asset returns and fMRI scans.
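A toy sketch of the FKWC idea, with an admittedly crude stand-in depth (negative $L^2$ distance to the pointwise mean) in place of the proper functional data depths used in the paper: ranking curves by depth and applying Kruskal-Wallis across a candidate split requires no covariance-operator estimate. Scanning over candidate split points would mimic the at-most-one change-point procedure.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(3)
# 100 curves observed on a 50-point grid; the variance doubles after curve 60
curves = rng.normal(size=(100, 50))
curves[60:] *= np.sqrt(2.0)

# Crude depth proxy: curves far from the cross-sectional mean get low depth.
depth = -np.linalg.norm(curves - curves.mean(axis=0), axis=1)
ranks = depth.argsort().argsort() + 1            # depth ranks 1..n

# Kruskal-Wallis on the depth ranks for a candidate change point at t = 60
print(kruskal(ranks[:60], ranks[60:]))
```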
【15】 Recovering Hölder smooth functions from noisy modulo samples
Link: https://arxiv.org/abs/2112.01610
Authors: Michaël Fanuel, Hemant Tyagi
Comments: 9 pages, 2 figures; Asilomar Conference on Signals, Systems, and Computers 2021
Abstract: In signal processing, several applications involve the recovery of a function given noisy modulo samples. The setting considered in this paper is that samples corrupted by additive Gaussian noise are wrapped due to the modulo operation. Typical examples of this problem arise in phase unwrapping problems or in the context of self-reset analog-to-digital converters. We consider a fixed design setting where the modulo samples are given on a regular grid. A three-stage recovery strategy is then proposed to recover the ground truth signal up to a global integer shift. The first stage denoises the modulo samples by using local polynomial estimators. In the second stage, an unwrapping algorithm is applied to the denoised modulo samples on the grid. Finally, a spline-based quasi-interpolant operator is used to yield an estimate of the ground truth function up to a global integer shift. For a function in the Hölder class, uniform error rates are given for the recovery performance with high probability. This extends recent results obtained by Fanuel and Tyagi for Lipschitz smooth functions, wherein $k$NN regression was used in the denoising step.
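A minimal simulation of the sampling model (noisy samples wrapped modulo 1 on a regular grid), followed by naive unwrapping. The period argument of np.unwrap requires NumPy >= 1.21 (an assumption of this sketch); with larger noise this naive step fails, which is why the paper's three-stage method denoises the modulo samples first.

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 400)                      # regular grid (fixed design)
f = 3.0 * np.sin(2 * np.pi * t)                 # smooth ground-truth signal
y = np.mod(f + rng.normal(scale=0.05, size=t.size), 1.0)  # noisy modulo-1 samples

# Naive unwrapping of the (lightly noisy) wrapped samples
f_hat = np.unwrap(y, period=1.0)

# Recovery is only possible up to a global integer shift
shift = np.round(np.median(f_hat - f))
print("max error after shift:", np.max(np.abs(f_hat - shift - f)))
```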
【16】 On the Reliability of Multiple Systems Estimation for the Quantification of Modern Slavery
Link: https://arxiv.org/abs/2112.01594
Authors: Olivier Binette, Rebecca C. Steorts
Abstract: The quantification of modern slavery has received increased attention recently as organizations have come together to produce global estimates, where multiple systems estimation (MSE) is often used to this end. Echoing a long-standing controversy, disagreements have re-surfaced regarding the underlying MSE assumptions, the robustness of MSE methodology, and the accuracy of MSE estimates in this application. Our goal is to help address and move past these controversies. To do so, we review MSE, its assumptions, and commonly used models for modern slavery applications. We introduce all of the publicly available modern slavery datasets in the literature, providing a reproducible analysis and highlighting current issues. Specifically, we utilize an internal consistency approach that constructs subsets of data for which ground truth is available, allowing us to evaluate the accuracy of MSE estimators. Next, we propose a characterization of the large sample bias of estimators as a function of misspecified assumptions. Then, we propose an alternative to traditional (e.g., bootstrap-based) assessments of reliability, which allows us to visualize trajectories of MSE estimates to illustrate the robustness of estimates. Finally, our complementary analyses are used to provide guidance regarding the application and reliability of MSE methodology.
【17】 Invariant Priors for Bayesian Quadrature
Link: https://arxiv.org/abs/2112.01578
Authors: Masha Naslidnyk, Javier Gonzalez, Maren Mahsereci
Abstract: Bayesian quadrature (BQ) is a model-based numerical integration method that is able to increase sample efficiency by encoding and leveraging known structure of the integration task at hand. In this paper, we explore priors that encode invariance of the integrand under a set of bijective transformations in the input domain, in particular some unitary transformations, such as rotations, axis flips, or point symmetries. We show initial results demonstrating superior performance in comparison to standard Bayesian quadrature on several synthetic applications and one real-world application.
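One standard way to encode such invariance in a GP/BQ prior (shown here for the sign-flip group in one dimension; not necessarily the paper's exact construction) is to symmetrize the kernel over the group:

```python
import numpy as np

def rbf(x, y, ell=1.0):
    # Squared-exponential base kernel
    return np.exp(-0.5 * (x - y) ** 2 / ell**2)

def rbf_flip_invariant(x, y, ell=1.0):
    """Kernel symmetrized over the sign-flip group G = {+1, -1}:
    k_inv(x, y) = (1/|G|^2) * sum over g, g' of k(g*x, g'*y).
    A GP/BQ prior with this kernel is supported on even functions."""
    return 0.25 * (rbf(x, y, ell) + rbf(-x, y, ell)
                   + rbf(x, -y, ell) + rbf(-x, -y, ell))

# The symmetrized kernel cannot distinguish x from -x:
print(rbf_flip_invariant(0.7, -0.7), rbf_flip_invariant(0.7, 0.7))
```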
【18】 Dimension-Free Average Treatment Effect Inference with Deep Neural Networks
Link: https://arxiv.org/abs/2112.01574
Authors: Xinze Du, Yingying Fan, Jinchi Lv, Tianshu Sun, Patrick Vossler
Comments: 56 pages, 22 figures
Abstract: This paper investigates the estimation of and inference about the average treatment effect (ATE) using deep neural networks (DNNs) in the potential outcomes framework. Under some regularity conditions, the observed response can be formulated as the response of a mean regression problem with both the confounding variables and the treatment indicator as independent variables. Using this formulation, we investigate two methods for ATE estimation and inference based on the estimated mean regression function via DNN regression with a specific network architecture. We show that both DNN estimates of the ATE are consistent with dimension-free consistency rates under some assumptions on the underlying true mean regression model. Our model assumptions accommodate a potentially complicated dependence structure of the observed response on the covariates, including latent factors and nonlinear interactions between the treatment indicator and confounding variables. We also establish the asymptotic normality of our estimators based on the idea of sample splitting, ensuring precise inference and uncertainty quantification. Simulation studies and a real data application justify our theoretical findings and support our DNN estimation and inference methods.
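A minimal sketch of the plug-in idea on simulated data: fit an outcome regression $\mu(t,x)$ with a small neural network on one half of the sample and average the predicted treated-minus-control contrast on the other half. The data-generating process and network settings are illustrative assumptions; the paper's estimators, architecture, and inference theory are considerably more refined.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(5)
n = 4000
X = rng.normal(size=(n, 5))
T = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))          # confounded treatment
Y = X[:, 0] + 2.0 * T + 0.5 * T * X[:, 1] + rng.normal(size=n)

# Sample splitting: fit mu(t, x) on the first half, evaluate on the second.
half = n // 2
mu = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
mu.fit(np.column_stack([T[:half], X[:half]]), Y[:half])

X_eval = X[half:]
ate_hat = np.mean(mu.predict(np.column_stack([np.ones(half), X_eval]))
                  - mu.predict(np.column_stack([np.zeros(half), X_eval])))
print("ATE estimate:", round(ate_hat, 3), "(true ATE = 2.0 in this toy model)")
```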
【19】 Causal-based Time Series Domain Generalization for Vehicle Intention Prediction
Link: https://arxiv.org/abs/2112.02093
Authors: Yeping Hu, Xiaogang Jia, Masayoshi Tomizuka, Wei Zhan
Comments: Accepted by the NeurIPS 2021 Workshop on Distribution Shifts
Abstract: Accurately predicting the possible behaviors of traffic participants is an essential capability for autonomous vehicles. Since autonomous vehicles need to navigate in dynamically changing environments, they are expected to make accurate predictions regardless of where they are and what driving circumstances they encounter. Therefore, generalization capability to unseen domains is crucial for prediction models when autonomous vehicles are deployed in the real world. In this paper, we aim to address the domain generalization problem for vehicle intention prediction tasks, and a causal-based time series domain generalization (CTSDG) model is proposed. We construct a structural causal model for vehicle intention prediction tasks to learn an invariant representation of input driving data for domain generalization. We further integrate a recurrent latent variable model into our structural causal model to better capture temporal latent dependencies from time-series input data. The effectiveness of our approach is evaluated via real-world driving data. We demonstrate that our proposed method yields consistent improvements in prediction accuracy compared to other state-of-the-art domain generalization and behavior prediction methods.
【20】 Hierarchical Optimal Transport for Unsupervised Domain Adaptation
Link: https://arxiv.org/abs/2112.02073
Authors: Mourad El Hamri, Younès Bennani, Issam Falih, Hamid Ahaggach
Abstract: In this paper, we propose a novel approach for unsupervised domain adaptation that relates notions of optimal transport, learning of probability measures, and unsupervised learning. The proposed approach, HOT-DA, is based on a hierarchical formulation of optimal transport that leverages, beyond the geometrical information captured by the ground metric, richer structural information in the source and target domains. The additional information in the labeled source domain is formed instinctively by grouping samples into structures according to their class labels, while exploring hidden structures in the unlabeled target domain is reduced to the problem of learning probability measures through the Wasserstein barycenter, which we prove to be equivalent to spectral clustering. Experiments on a toy dataset with controllable complexity and two challenging visual adaptation datasets show the superiority of the proposed approach over the state of the art.
【21】 Graph Neural Networks for Charged Particle Tracking on FPGAs
Link: https://arxiv.org/abs/2112.02048
Authors: Abdelrahman Elabd, Vesal Razavimaleki, Shi-Yu Huang, Javier Duarte, Markus Atkinson, Gage DeZoort, Peter Elmer, Jin-Xuan Hu, Shih-Chieh Hsu, Bo-Cheng Lai, Mark Neubauer, Isobel Ojalvo, Savannah Thais
Comments: 26 pages, 17 figures, 1 table
Abstract: The determination of charged particle trajectories in collisions at the CERN Large Hadron Collider (LHC) is an important but challenging problem, especially in the high interaction density conditions expected during the future high-luminosity phase of the LHC (HL-LHC). Graph neural networks (GNNs) are a type of geometric deep learning algorithm that has successfully been applied to this task by embedding tracker data as a graph -- nodes represent hits, while edges represent possible track segments -- and classifying the edges as true or fake track segments. However, their study in hardware- or software-based trigger applications has been limited due to their large computational cost. In this paper, we introduce an automated translation workflow, integrated into a broader tool called $\texttt{hls4ml}$, for converting GNNs into firmware for field-programmable gate arrays (FPGAs). We use this translation tool to implement GNNs for charged particle tracking, trained using the TrackML challenge dataset, on FPGAs with designs targeting different graph sizes, task complexities, and latency/throughput requirements. This work could enable the inclusion of charged particle tracking GNNs at the trigger level for HL-LHC experiments.
【22】 Could AI Democratise Education? Socio-Technical Imaginaries of an EdTech Revolution
Link: https://arxiv.org/abs/2112.02034
Authors: Sahan Bulathwela, María Pérez-Ortiz, Catherine Holloway, John Shawe-Taylor
Comments: To be presented at the Workshop on Machine Learning for the Developing World (ML4D) at the Conference on Neural Information Processing Systems 2021
Abstract: Artificial Intelligence (AI) in Education has been said to have the potential for building more personalised curricula, as well as democratising education worldwide and creating a Renaissance of new ways of teaching and learning. Millions of students are already starting to benefit from the use of these technologies, but millions more around the world are not. If this trend continues, the first delivery of AI in Education could be greater educational inequality, along with a global misallocation of educational resources motivated by the current technological determinism narrative. In this paper, we focus on speculating and posing questions around the future of AI in Education, with the aim of starting the pressing conversation that would set the right foundations for the new generation of education that is permeated by technology. This paper starts by synthesising how AI might change how we learn and teach, focusing specifically on the case of personalised learning companions, and then moves on to discuss some socio-technical features that will be crucial for avoiding the perils of these AI systems worldwide (and perhaps ensuring their success). This paper also discusses the potential of using AI together with free, participatory and democratic resources, such as Wikipedia, Open Educational Resources and open-source tools. We also emphasise the need for collectively designing human-centered, transparent, interactive and collaborative AI-based algorithms that empower and give complete agency to stakeholders, as well as support new emerging pedagogies. Finally, we ask what it would take for this educational revolution to provide egalitarian and empowering access to education, beyond any political, cultural, language, geographical and learning-ability barriers.
【23】 Practitioner-Centric Approach for Early Incident Detection Using Crowdsourced Data for Emergency Services
Link: https://arxiv.org/abs/2112.02012
Authors: Yasas Senarath, Ayan Mukhopadhyay, Sayyed Mohsen Vazirizade, Hemant Purohit, Saideep Nannapaneni, Abhishek Dubey
Comments: Accepted at the IEEE International Conference on Data Mining (ICDM) 2021
Abstract: Emergency response is highly dependent on the time of incident reporting. Unfortunately, the traditional approach to receiving incident reports (e.g., calling 911 in the USA) has time delays. Crowdsourcing platforms such as Waze provide an opportunity for early identification of incidents. However, detecting incidents from crowdsourced data streams is difficult due to the challenges of noise and uncertainty associated with such data. Further, simply optimizing over detection accuracy can compromise spatial-temporal localization of the inference, thereby making such approaches infeasible for real-world deployment. This paper presents a novel problem formulation and solution approach for practitioner-centered incident detection using crowdsourced data, using emergency response management as a case study. The proposed approach, CROME (Crowdsourced Multi-objective Event Detection), quantifies the relationship between the performance metrics of incident classification (e.g., F1 score) and the requirements of model practitioners (e.g., a 1 km radius for incident detection). First, we show how crowdsourced reports, ground-truth historical data, and other relevant determinants such as traffic and weather can be used together in a Convolutional Neural Network (CNN) architecture for early detection of emergency incidents. Then, we use a Pareto optimization-based approach to optimize the output of the CNN in tandem with practitioner-centric parameters to balance detection accuracy and spatial-temporal localization. Finally, we demonstrate the applicability of this approach using crowdsourced data from Waze and traffic accident reports from Nashville, TN, USA. Our experiments demonstrate that the proposed approach outperforms existing approaches in incident detection while simultaneously optimizing for the needs of real-world deployment and usability.
【24】 A network analysis of decision strategies of human experts in steel manufacturing
Link: https://arxiv.org/abs/2112.01991
Authors: Daniel Christopher Merten, Prof. Dr. Marc-Thorsten Hütt, Prof. Dr. Yilmaz Uygun
Comments: Submitted to Computers & Industrial Engineering; 29 pages, 12 figures, 3 tables
Abstract: Steel production scheduling is typically accomplished by human expert planners. Hence, instead of fully automated scheduling systems, steel manufacturers prefer auxiliary recommendation algorithms. Through the suggestion of suitable orders, these algorithms assist human expert planners who are tasked with the selection and scheduling of production orders. However, it is hard to estimate what degree of complexity these algorithms should have, as steel campaign planning lacks precise rule-based procedures; in fact, it requires extensive domain knowledge as well as intuition that can only be acquired by years of business experience. Here, instead of developing new algorithms or improving older ones, we introduce a shuffling-aided network method to assess the complexity of the selection patterns established by a human expert. This technique allows us to formalize and represent the tacit knowledge that enters the campaign planning. As a result of the network analysis, we have discovered that the choice of production orders is primarily determined by the orders' carbon content. Surprisingly, trace elements like manganese, silicon, and titanium have a lesser impact on the selection decision than assumed by the pertinent literature. Our approach can serve as an input to a range of decision-support systems, whenever a human expert needs to create groups of orders ("campaigns") that fulfill certain implicit selection criteria.
【25】 Estimating the Value-at-Risk by Temporal VAE
Link: https://arxiv.org/abs/2112.01896
Authors: Robert Sicks, Stefanie Grimm, Ralf Korn, Ivo Richert
Comments: 35 pages
Abstract: Estimation of the value-at-risk (VaR) of a large portfolio of assets is an important task for financial institutions. As the joint log-returns of asset prices can often be projected to a latent space of much smaller dimension, the use of a variational autoencoder (VAE) for estimating the VaR is a natural suggestion. To ensure the bottleneck structure of autoencoders when learning sequential data, we use a temporal VAE (TempVAE) that avoids an autoregressive structure for the observation variables. However, the low signal-to-noise ratio of financial data, in combination with the auto-pruning property of a VAE, typically makes the use of a VAE prone to posterior collapse. We therefore propose annealing of the regularization to mitigate this effect. As a result, the auto-pruning of the TempVAE works properly, which also leads to excellent VaR estimates that beat classical GARCH-type and historical simulation approaches when applied to real data.
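For context, the historical simulation benchmark mentioned in the abstract is simply an empirical lower quantile of portfolio returns; a toy sketch (portfolio and parameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
# Toy joint log-returns for a 50-asset portfolio over 1000 trading days
returns = rng.multivariate_normal(np.zeros(50), 1e-4 * np.eye(50), size=1000)
weights = np.full(50, 1 / 50)          # equally weighted portfolio
pnl = returns @ weights                # daily portfolio returns

# Historical-simulation VaR at level alpha: empirical lower quantile
alpha = 0.01
var_99 = -np.quantile(pnl, alpha)
print(f"99% one-day VaR (historical simulation): {var_99:.5f}")
```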
【26】 Inference for ROC Curves Based on Estimated Predictive Indices
Link: https://arxiv.org/abs/2112.01772
Authors: Yu-Chin Hsu, Robert P. Lieli
Abstract: We provide a comprehensive theory of conducting in-sample statistical inference about receiver operating characteristic (ROC) curves that are based on predicted values from a first-stage model with estimated parameters (such as a logit regression). The term "in-sample" refers to the practice of using the same data for model estimation (training) and subsequent evaluation, i.e., the construction of the ROC curve. We show that in this case the first-stage estimation error has a generally non-negligible impact on the asymptotic distribution of the ROC curve and develop the appropriate pointwise and functional limit theory. We propose methods for simulating the distribution of the limit process and show how to use the results in practice in comparing ROC curves.
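A small sketch of the in-sample practice the paper analyzes: the same data are used both to fit the first-stage logit and to construct the ROC curve, so the plotted curve is a function of estimated predictive indices (the toy data-generating process is an assumption):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 3))
y = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 1]))))

# First stage and evaluation share the same sample ("in-sample"), so the
# ROC curve inherits the first-stage estimation error studied in the paper.
scores = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]
fpr, tpr, _ = roc_curve(y, scores)
print("in-sample AUC:", round(auc(fpr, tpr), 3))
```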
【27】 Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research
Link: https://arxiv.org/abs/2112.01716
Authors: Bernard Koch, Emily Denton, Alex Hanna, Jacob G. Foster
Comments: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), Sydney, Australia
Abstract: Benchmark datasets play a central role in the organization of machine learning research. They coordinate researchers around shared research problems and serve as a measure of progress towards shared goals. Despite the foundational role of benchmarking practices in this field, relatively little attention has been paid to the dynamics of benchmark dataset use and reuse, within or across machine learning subcommunities. In this paper, we dig into these dynamics. We study how dataset usage patterns differ across machine learning subcommunities and across time from 2015-2020. We find increasing concentration on fewer and fewer datasets within task communities, significant adoption of datasets from other tasks, and concentration across the field on datasets that have been introduced by researchers situated within a small number of elite institutions. Our results have implications for scientific evaluation, AI ethics, and equity/access within the field.
【28】 On the Existence of the Adversarial Bayes Classifier (Extended Version)
Link: https://arxiv.org/abs/2112.01694
Authors: Pranjal Awasthi, Natalie S. Frank, Mehryar Mohri
Comments: 49 pages, 8 figures; extended version of the paper "On the Existence of the Adversarial Bayes Classifier" published at NeurIPS
Abstract: Adversarial robustness is a critical property in a variety of modern machine learning applications. While it has been the subject of several recent theoretical studies, many important questions related to adversarial robustness are still open. In this work, we study a fundamental question regarding Bayes optimality for adversarial robustness. We provide general sufficient conditions under which the existence of a Bayes optimal classifier can be guaranteed for adversarial robustness. Our results can provide a useful tool for a subsequent study of surrogate losses in adversarial robustness and their consistency properties. This manuscript is the extended version of the paper "On the Existence of the Adversarial Bayes Classifier" published at NeurIPS. The results of the original paper did not apply to some non-strictly convex norms; here we extend our results to all possible norms.
【29】 The Linear Template Fit
Link: https://arxiv.org/abs/2112.01548
Authors: Daniel Britzger
Comments: 44 pages, 11 figures, 3 tables
Abstract: A matrix formalism for the determination of the best estimator in certain simulation-based parameter estimation problems is presented and discussed. The equations, termed the Linear Template Fit, combine a linear regression with a least-squares method and its optimization. The Linear Template Fit employs only predictions that are calculated beforehand and which are provided for a few values of the parameter of interest. It is therefore particularly suited for parameter estimation with computationally intensive simulations that are otherwise often limited in their usability for statistical inference, or for performance-critical applications. Equations for error propagation are discussed, and the analytic form provides comprehensive insights into the parameter estimation problem. Furthermore, the quickly converging algorithm of the Quadratic Template Fit is presented, which is suitable for a nonlinear dependence on the parameters. As an example application, a determination of the strong coupling constant, $\alpha_s(m_Z)$, from inclusive jet cross-section data at the CERN Large Hadron Collider is studied and compared with previously published results.
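A minimal one-parameter sketch of the idea (names and numbers are illustrative): regress the precomputed templates on the parameter bin by bin, then solve the resulting weighted least-squares problem in closed form. The paper's matrix formalism generalizes this to multiple parameters, full covariance treatment, and the quadratic extension.

```python
import numpy as np

rng = np.random.default_rng(8)
alpha_ref = np.array([0.9, 1.0, 1.1])      # reference parameter values
nbins = 10
truth = 1.25                               # parameter behind the toy "data"

def model(alpha):                          # stand-in for a costly simulation
    return alpha * np.linspace(1.0, 2.0, nbins)

templates = np.stack([model(a) for a in alpha_ref])   # precomputed predictions
data = model(truth) + rng.normal(scale=0.02, size=nbins)
w = 1.0 / 0.02**2                                     # weights 1/sigma_i^2

# Step 1: per-bin linear regression of the templates on alpha,
#         m_i(alpha) ~ a_i + b_i * alpha.
A = np.column_stack([np.ones_like(alpha_ref), alpha_ref])
coef, *_ = np.linalg.lstsq(A, templates, rcond=None)  # shape (2, nbins)
a_i, b_i = coef

# Step 2: closed-form weighted least squares for the best estimator.
alpha_hat = np.sum(w * b_i * (data - a_i)) / np.sum(w * b_i**2)
print("alpha_hat =", round(alpha_hat, 4))
```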