© Author | Wang Xiaolei
Affiliation | Gaoling School of Artificial Intelligence, Renmin University of China
Research Area | Conversational Information Seeking
From | RUC AI Box
Overview:
NeurIPS 2022 is a CCF Class-A conference and one of the top international conferences in artificial intelligence. The 36th Conference on Neural Information Processing Systems will be held this year from November 28 to December 9. The official list of accepted papers is available at: https://nips.cc/Conferences/2022/Schedule?type=Poster.
From the 2,000+ accepted papers, this article selects more than 200 papers related to natural language processing and organizes them by research topic for reference. The paper list is also kept up to date on GitHub; follows and stars are welcome: github.com/RUCAIBox/Top-conference-paper-list.
Contents:
1. Model
2. Interpretability, Analysis and Evaluation
3. Robustness and Safety
4. Knowledge and Reasoning
5. Information Extraction
6. Information Retrieval
7. Text Classification
8. Text Generation
9. Machine Translation and Multilinguality
10. Multimodality
11. Special Tasks

01
Model
1. Model Design
Recurrent Memory Transformer
Jump Self-attention: Capturing High-order Statistics in Transformers
Block-Recurrent Transformers
Staircase Attention for Recurrent Processing of Sequences
Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling
Mixture-of-Experts with Expert Choice Routing
On the Representation Collapse of Sparse Mixture of Experts
Improving Transformer with an Admixture of Attention Heads
Your Transformer May Not be as Powerful as You Expect
Confident Adaptive Language Modeling
Decoupled Context Processing for Context Augmented Language Modeling
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Revisiting Neural Scaling Laws in Language and Vision
Learning to Scaffold: Optimizing Model Explanations for Teaching
2. Model Compression
Information-Theoretic Generative Model Compression with Variational Energy-based Model
Towards Efficient Post-training Quantization of Pre-trained Language Models
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Deep Compression of Pre-trained Transformer Models
LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models
GPT3.int8(): 8-bit Matrix Multiplication for Transformers at Scale
MorphTE: Injecting Morphology in Tensorized Embeddings
Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
A Fast Post-Training Pruning Framework for Transformers
3. Model Training
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
A Data-Augmentation Is Worth A Thousand Samples
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Training and Inference on Any-Order Autoregressive Models the Right Way
Decentralized Training of Foundation Models in Heterogeneous Environment
4. Model Usage
The Unreliability of Explanations in Few-Shot In-Context Learning
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Training language models to follow instructions with human feedback
LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
How to talk to your model: Instructions, descriptions, and learning
Data Distributional Properties Drive Emergent In-Context Learning in Transformers
Sparse Structure Search for Parameter-Efficient Tuning
Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
LIFT: Language-Interfaced Fine-Tuning for Non-language Machine Learning Tasks
Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts
02
Interpretability, Analysis and Evaluation
CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior
Rule-Based but Flexible? Evaluating and Improving Language Models as Accounts of Human Moral Judgment
Understanding the Failure of Batch Normalization for Transformers in NLP
AttCAT: Explaining Transformers via Attentive Class Activation Tokens
An empirical analysis of compute-optimal large language model training
Why GANs are overkill for NLP
Exploring Length Generalization in Large Language Models
Capturing Failures of Large Language Models via Human Cognitive Biases
Pre-Trained Model Reusability Evaluation for Small-Data Transfer Learning
First is Better Than Last for Language Data Influence
What are the best Systems? New Perspectives on NLP Benchmarking
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
FETA: Towards Specializing Foundational Models for Expert Task Applications
This is the way - lessons learned from designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
Rethinking Knowledge Graph Evaluation Under the Open-World Assumption
A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction

03
Robustness and Safety
Active Learning Helps Pretrained Models Learn the Intended Task
Improving Certified Robustness via Statistical Learning with Logical Reasoning
Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models
BadPrompt: Backdoor Attacks on Continuous Prompts
A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
AD-DROP: Attribution Driven Dropout for Robust Language Model Finetuning
Large (robust) models from computational constraints
Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve
A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks
Recovering Private Text in Federated Learning of Language Models
LAMP: Extracting Text from Gradients with Language Model Priors
SeqPATE: Differentially Private Text Generation via Knowledge Distillation
Differentially Private Model Compression
Federated Learning from Pre-Trained Models: A Contrastive Learning Approach

04
Knowledge and Reasoning
Learning to Sample and Aggregate: Few-shot Reasoning over Temporal Knowledge Graph
Retaining Knowledge for Learning with Dynamic Definition
Shadow Knowledge Distillation: Bridging Offline and Online Knowledge Transfer
What Makes a "Good" Data Augmentation in Knowledge Distillation - A Statistical Perspective
Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures
Roadblocks for Temporarily Disabling Shortcuts and Learning New Knowledge
PALBERT: Teaching ALBERT to Ponder
Locating and Editing Factual Associations in GPT
OTKGE: Multi-modal Knowledge Graph Embeddings via Optimal Transport
Large Language Models are Zero-Shot Reasoners
STaR: Bootstrapping Reasoning With Reasoning
Chain of Thought Prompting Elicits Reasoning in Large Language Models
ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Inductive Logical Query Answering in Knowledge Graphs
Formalizing Coherence and Consistency Applied to Transfer Learning in Neuro-Symbolic Autoencoders
CoNSoLe: Convex Neural Symbolic Learning
Deep Bidirectional Language-Knowledge Pretraining
Neurosymbolic Deep Generative Models for Sequence Data with Relational Constraints
Instance-based Learning for Knowledge Base Completion
LogiGAN: Learning Logical Reasoning via Adversarial Pre-training
Learning robust rule representations for abstract reasoning via internal inferences
Solving Quantitative Reasoning Problems with Language Models
Towards Better Evaluation for Dynamic Link Prediction
Predictive Querying for Autoregressive Neural Sequence Models
Semantic Probabilistic Layers for Neuro-Symbolic Learning
End-to-end Symbolic Regression with Transformers
A Unified Framework for Deep Symbolic Regression
ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time

05
Information Extraction
Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model
TweetNERD - End to End Entity Linking Benchmark for Tweets
METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets

06
Information Retrieval
Transformer Memory as a Differentiable Search Index
Autoregressive Search Engines: Generating Substrings as Document Identifiers
A Neural Corpus Indexer for Document Retrieval

07
Text Classification
CascadeXML: End-to-end Multi-Resolution Learning for Extreme Multi-Label Text Classification
Text Classification with Born's Rule
Public Wisdom Matters! Discourse-Aware Hyperbolic Fourier Co-Attention for Social Text Classification

08
Text Generation
CoNT: Contrastive Neural Text Generation
A Character-Level Length Control Algorithm for Non-Autoregressive Sentence Summarization
Towards Improving Faithfulness in Abstractive Summarization
QUARK: Controllable Text Generation with Reinforced Unlearning
Teacher Forcing Recovers Reward Functions for Text Generation
Retrieve, Reason, and Refine: Generating Accurate and Faithful Patient Instructions
A Contrastive Framework for Neural Text Generation
Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation
COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
Diffusion-LM Improves Controllable Text Generation
Factuality Enhanced Language Models for Open-Ended Text Generation
Controllable Text Generation with Neurally-Decomposed Oracle
InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model
Relation-Constrained Decoding for Text Generation
EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
TGEA 2.0: A Large-Scale Diagnostically Annotated Dataset with Benchmark Tasks for Text Generation of Pretrained Language Models

09
Machine Translation and Multilinguality
Exploring Non-Monotonic Latent Alignments for Non-Autoregressive Machine Translation
A new dataset for multilingual keyphrase generation
Less-forgetting Multi-lingual Fine-tuning
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
Refining Low-Resource Unsupervised Translation by Language Disentanglement of Multilingual Translation Model
OccGen: Selection of Real-world Multilingual Parallel Data Balanced in Gender within Occupations
Multilingual Abusive Comment Detection at Scale for Indic Languages
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Addressing Resource Scarcity across Sign Languages with Multilingual Pretraining and Unified-Vocabulary Datasets

10
Multimodality
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
GLIPv2: Unifying Localization and Vision-Language Understanding
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval
Egocentric Video-Language Pretraining
Flamingo: a Visual Language Model for Few-Shot Learning
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning
TVLT: Textless Vision-Language Transformer
Divert More Attention to Vision-Language Tracking
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Text-Adaptive Multiple Visual Prototype Matching for Video-Text Retrieval
BMU-MoCo: Bidirectional Momentum Update For Continual Video-Language Modeling
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs
Self-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
UniCLIP: Unified Framework for Contrastive Language-Image Pre-training
Contrastive Language-Image Pre-Training with Knowledge Graphs
PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining
Enhancing and Scaling Cross-Modality Alignment for Contrastive Multimodal Pre-Training via Gradient Harmonization
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching
HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes
CyCLIP: Cyclic Contrastive Language-Image Pretraining
S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning
Delving into OOD Detection with Vision-Language Representations
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection
Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation
Relational Language-Image Pre-training for Human-Object Interaction Detection
Fine-Grained Semantically Aligned Vision-Language Pre-Training
Cross-Linked Unified Embedding for cross-modality representation learning
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Kernel Multimodal Continuous Attention
Paraphrasing Is All You Need for Novel Object Captioning
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders
One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations
LGDN: Language-Guided Denoising Network for Video-Language Modeling
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models
VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
LAION-5B: An open large-scale dataset for training next generation image-text models
Towards Video Text Visual Question Answering: Benchmark and Baseline
TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment
Multi-modal Robustness Analysis Against Language and Visual Perturbations
CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression

11
Special Tasks
1. Code
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Fault-Aware Neural Code Rankers
NS3: Neuro-symbolic Semantic Code Search
Pyramid Attention For Source Code Summarization
2. Mathematics
HyperTree Proof Search for Neural Theorem Proving
NaturalProver: Grounded Mathematical Proof Generation with Language Models
Autoformalization with Large Language Models
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
3. Others
Measuring and Reducing Model Update Regression in Structured Prediction for NLP
Learning to Follow Instructions in Text-Based Games
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
LISA: Learning Interpretable Skill Abstractions from Language
Inherently Explainable Reinforcement Learning in Natural Language
Using natural language and program abstractions to instill human inductive biases in machines
Semantic Exploration from Language Abstractions and Pretrained Representations
Pre-Trained Language Models for Interactive Decision-Making
Knowledge-Aware Bayesian Deep Topic Model
Improving Intrinsic Exploration with Language Abstractions
Improving Policy Learning via Language Dynamics Distillation
Meta-Complementing the Semantics of Short Texts in Neural Topic Models
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset
BigBio: A Framework for Data-Centric Biomedical Natural Language Processing