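Livestream time: Sunday, February 27, 2022, 21:00-22:00
Xinyi Wang (王心怡): PhD student at the University of California, Santa Barbara, whose research focuses on machine learning and natural language processing, using causal inference tools to analyze and solve problems in machine learning.
Talk title: Training Deep Learning Networks with Counterfactual Maximum Likelihood Estimation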
Abstract: We propose a causality-based training framework to reduce the spurious correlations caused by observed confounders. We give theoretical analysis on the underlying general Structural Causal Model (SCM) and propose to perform Maximum Likelihood Estimation (MLE) on the interventional distribution instead of the observational distribution, namely Counterfactual Maximum Likelihood Estimation (CMLE). As the interventional distribution, in general, is hidden from the observational data, we then derive two different upper bounds of the expected negative log-likelihood and propose two general algorithms, Implicit CMLE and Explicit CMLE, for causal predictions of deep learning models using observational data. We conduct experiments on both simulated data and two real-world tasks: Natural Language Inference (NLI) and Image Captioning. The results show that CMLE methods outperform the regular MLE method in terms of out-of-domain generalization performance and reducing spurious correlations, while maintaining comparable performance on the regular evaluations.
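To make the contrast between the two objectives concrete, here is a rough sketch in illustrative notation, which is an assumption of this note and not necessarily the paper's own: X is the observed confounder, T a treatment-like input taking values in a finite set 𝒯, Y the outcome, and p_θ the model. Averaging uniformly over 𝒯 is a simplifying assumption.

```latex
% Regular MLE: (x, t) pairs come from the observational distribution,
% so t depends on x through p(t | x).
\hat{\theta}_{\mathrm{MLE}}
  = \arg\min_{\theta}\;
    \mathbb{E}_{x \sim p(x)}\,
    \mathbb{E}_{t \sim p(t \mid x)}\,
    \mathbb{E}_{y \sim p(y \mid x, t)}
    \bigl[-\log p_{\theta}(y \mid x, t)\bigr]

% MLE on the interventional distribution (illustrative form):
% t is set by intervention, independently of x.
\hat{\theta}_{\mathrm{int}}
  = \arg\min_{\theta}\;
    \mathbb{E}_{x \sim p(x)}\,
    \frac{1}{|\mathcal{T}|}\sum_{t \in \mathcal{T}}
    \mathbb{E}_{y \sim p(y \mid x,\, \mathrm{do}(T = t))}
    \bigl[-\log p_{\theta}(y \mid x, t)\bigr]
```

For most (x, t) pairs the outcome under do(T = t) was never observed, so the second objective cannot be evaluated directly from observational data. This is the gap that the upper bounds mentioned in the abstract are meant to close.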
Paper title: Counterfactual Maximum Likelihood Estimation for Training Deep Networks
Highlights:
1. We formalize the spurious correlation problem as a confounding problem using a structural causal model.
2. We propose a new general training scheme to reduce the unwanted effect of observed confounders with provable bounds.
3. The proposed method is evaluated on both simulated data and real-world tasks and is shown to be superior to the regular ERM training scheme in terms of out-of-domain generalization performance and reducing spurious correlations, while maintaining comparable performance on the regular evaluations.
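1. Invariant risk minimization (IRM).
This is probably the best-known paper in the line of work that uses causal invariance to address ERM's tendency to learn spurious correlations. Later papers have pointed out a number of problems with it, both in theory and in practice, but it did launch a series of works on causal invariance. If you are interested, browse the papers that cite it for an overview.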
2. Estimating individual treatment effect: generalization bounds and algorithms.
This paper addresses selection bias in individual treatment effect (ITE) estimation and largely inspired our paper. Variants of its key proofs appear in our paper, and it is a good way to learn about ITE, a classic causal inference problem.
3. Learning the difference that makes a difference with counterfactually augmented data.
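This paper proposes having humans rewrite a sentence with minimal edits so that the sentence's label changes. The authors argue that this yields a counterfactual example of the original sentence. Obtaining counterfactuals by changing the causal factors is the opposite of the IRM approach, which changes the non-causal factors while keeping the causal factors fixed.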
4. Explaining the efficacy of counterfactually augmented data.
This paper is a follow-up to the previous one and offers some theoretical explanation for the earlier data augmentation method.
5. On Calibration and Out-of-domain Generalization.
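This paper looks at out-of-domain generalization from the perspective of causal invariance and neatly connects causal invariance with multi-domain calibration. It is a fairly recent paper that approaches the problem from the causal invariance angle without focusing on representation learning.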
6. Desiderata for Representation Learning: A Causal Perspective.
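This fairly recent, long paper discusses how to systematically evaluate and learn causal representations. It argues that data contain internal confounders, for example confounding among the different pixels of an image. The authors put forward many high-level ideas and definitions that can prompt further thinking.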