Inherent Trade-Offs in the Fair Determination of Risk Scores

Jon Kleinberg

Intro

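In many settings, people arrive before a decision-maker who must make a judgment about them based on observable features. In many practical scenarios, these decisions draw on a combination of human expertise and algorithmic and statistical frameworks.

Related research examines bias and discrimination in these settings, which can arise when decisions rely on large amounts of data drawn from historical records. We consider three examples below, all of which raise similar issues.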

A set of example domains

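In the criminal justice system, risk tools are used to estimate the probability that a defendant will reoffend, based on the defendant's characteristics and past history. One report argued that the COMPAS risk tool, a commonly used statistical method for computing such risk, is biased against African-American defendants, while the reoffense risk of white defendants is underestimated. Follow-up analyses disputed this report, observing that although the COMPAS risk tool makes errors, its reoffense predictions are equally well calibrated for African-American and white defendants.

The second scenario concerns the ads and economic content that people of different genders and races are shown online. For example, if a man and a woman are equally interested in a product, are they equally likely to be shown information about it? This has broader implications, for instance for job ads.

The third scenario is medical testing and diagnosis. Doctors choose treatment plans for patients based on tests that screen for various diseases and conditions. There is no guarantee that the resulting decisions carry the same costs of error across different groups of patients.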

Providing guarantees for decision procedures

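There are other applications, but we focus on the three above. They share some common features. First, the algorithmic prediction is usually one input to a larger system that makes the final decision: it is provided to an expert as a risk score, or it determines which ads are served online. Second, the underlying task is to classify whether a person has some property, such as recidivism, a medical condition, or interest in a product. We call a person with the property a positive instance, and a person without it a negative instance.

We now consider how to guarantee that such decisions do not exhibit potential bias.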

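The first goal is that the probability estimates be well-calibrated: if a set of people is assigned probability z of being positive instances, then roughly a z fraction of them should indeed be positive. Moreover, this should hold within each group separately.

The second goal is balance for the positive class: the average score received by positive instances should be roughly the same in each group. The analogous condition for negative instances is balance for the negative class.

Note the difference from statistical parity, which requires that the average probability assigned be the same across all groups, with positive and negative members pooled together. Although statistical parity is a goal in many settings, it is often neither feasible nor desirable. The balance conditions for the positive and negative classes can be discussed independently of statistical parity, because balance is a requirement on how accurately individuals are judged: the score a person receives, given their true class, should not depend on the group they belong to.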

The present work: Trade-offs among the guarantees

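Although they are stated differently, the calibration and balance conditions can all be viewed intuitively as variants of one goal: the risk estimates should be equally effective regardless of group.

The main result is that these conditions are in general incompatible: they can hold simultaneously only in highly constrained special cases. Moreover, the incompatibility also applies to approximate versions of the conditions.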

Formulating the Goal

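Each person has a feature vector $$\sigma$$, and $$p_\sigma$$ denotes the probability that a person with feature vector $$\sigma$$ is positive.

Each person belongs to one of two groups, labeled 1 and 2, and we want our decisions not to be skewed by which group a person belongs to. The groups may encode attributes such as gender or race, and we do not want this information to introduce bias. Within group $$t$$, a fraction $$a_{t\sigma}$$ of the members have feature vector $$\sigma$$, and this distribution may differ between groups. However, the probability $$p_\sigma$$ of belonging to the positive class given $$\sigma$$ is the same in every group.

Each person is thus described by two parameters: the feature vector $$\sigma$$ and the group. $$a_{t\sigma}$$ gives the distribution of feature vectors within group $$t$$, and $$p_\sigma$$ is the probability that a person with feature vector $$\sigma$$ has a positive label (values of 0% and 100% are common).

A risk assessment is a way of dividing people into sets based on their feature vectors $$\sigma$$ and assigning each set an estimate of the probability that its members are positive. Each set, or bin, $$b$$ gets a score $$v_b$$. Judging from the later sections, $$v_b$$ is first of all the estimated probability that a person in bin $$b$$ belongs to the positive class, and that probability is then used as the score.

Next we set up a rule that assigns each person to a bin based on their feature vector $$\sigma$$. Note that the rule may split people with the same feature vector across several bins according to a probability distribution. The rule is represented by $$X_{\sigma b}$$: of the people with feature vector $$\sigma$$, a fraction $$X_{\sigma b}$$ is assigned to bin $$b$$. The rule has no access to the group $$t$$ a person belongs to; it sees only the feature vector $$\sigma$$.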

Fairness Properties

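We now discuss three conditions, each capturing a different notion of a fair risk assignment.

Calibration within groups: $$P(Y=1, T=t \mid B=b) = P(T=t \mid B=b) \cdot P(Y=1 \mid B=b) = v_b \cdot P(T=t \mid B=b)$$, where $$Y$$ is the label, $$T$$ the group, and $$B$$ the bin.
Balance for the negative class: the average score of members of the negative class is the same in every group.
Balance for the positive class: the average score of members of the positive class is the same in every group.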
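
As a concrete check of these three conditions, the following sketch evaluates them on a small made-up instance. Everything in it is an assumption for illustration: the function name `fairness_report`, the toy counts, and the use of exact fractions. It uses the identity assignment (one bin per feature vector, scored by $$p_\sigma$$), and it assumes every group contains both positive and negative members so the averages are defined.

```python
from fractions import Fraction as F

def fairness_report(n, p, X, v):
    """Return {group: (calibrated, avg_score_negatives, avg_score_positives)}.

    n[t][s]: number of people in group t with feature vector s
    p[s]:    probability that feature vector s is positive
    X[s][b]: fraction of people with feature vector s sent to bin b
    v[b]:    score of bin b
    """
    report = {}
    bins = range(len(v))
    feats = range(len(p))
    for t, counts in n.items():
        # Expected people and expected positives of group t in each bin.
        people = [sum(counts[s] * X[s][b] for s in feats) for b in bins]
        positives = [sum(counts[s] * p[s] * X[s][b] for s in feats) for b in bins]
        # Calibration within the group: positives in bin b == v_b * people in bin b.
        calibrated = all(positives[b] == v[b] * people[b] for b in bins)
        total_pos = sum(positives)
        total_neg = sum(people) - total_pos
        avg_pos = sum(positives[b] * v[b] for b in bins) / total_pos
        avg_neg = sum((people[b] - positives[b]) * v[b] for b in bins) / total_neg
        report[t] = (calibrated, avg_neg, avg_pos)
    return report

# Hypothetical instance: two feature vectors, two groups with different base rates.
p = [F(1, 4), F(3, 4)]
n = {1: [30, 10], 2: [10, 30]}
X = [[1, 0], [0, 1]]   # identity assignment: each feature vector gets its own bin
v = p                  # each bin is scored by its p_sigma

report = fairness_report(n, p, X, v)
for t, (calibrated, avg_neg, avg_pos) in report.items():
    print(t, calibrated, avg_neg, avg_pos)
```

In this toy instance both groups are calibrated, but the average score is 3/10 for negatives and 1/2 for positives in group 1, against 1/2 and 7/10 in group 2, so neither balance condition holds.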

Determining What is Achievable: A Characterization Theorem

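We now ask when all three conditions can hold simultaneously. Here are two examples.

Perfect prediction. Suppose that for every feature vector $$\sigma$$, $$p_\sigma$$ is either 0 or 1, so the feature vector fully determines the label. Assign all feature vectors $$\sigma$$ with $$p_\sigma=0$$ to a bin $$b$$ with score $$v_b=0$$, and all feature vectors with $$p_\sigma=1$$ to a bin $$b'$$ with score $$v_{b'}=1$$. One can check that all three conditions hold.
Equal base rates. Suppose both groups have the same fraction of members in the positive class, that is, the average of $$p_\sigma$$ is the same in each group; we call this having equal base rates. In this case, we can create a single bin $$b$$ with score equal to the average of $$p_\sigma$$, and assign everyone to bin $$b$$.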

To summarize.

Perfect prediction: everyone with $$p_\sigma=0$$ goes to bin $$b$$ with $$v_b=0$$ (a negative has probability 0 of being positive), and everyone with $$p_\sigma=1$$ goes to bin $$b'$$ with $$v_{b'}=1$$.
1. Condition (A): in each of the bins $$b$$ and $$b'$$, the expected number of people from group $$t$$ who are positive equals a $$v_b$$ (respectively $$v_{b'}$$) fraction of the expected number of people from group $$t$$ assigned to that bin.
2. Condition (B): the average score of the negative class is 0 in every group.
3. Condition (C): the average score of the positive class is 1 in every group.

Equal base rates: both groups have the same fraction of positive members. Use one bin $$b$$ with score $$v_b$$ equal to the common base rate, and assign everyone to bin $$b$$.
1. Condition (A): the expected number of people from group $$t$$ in bin $$b$$ who are positive equals a $$v_b$$ fraction of the expected number of people from group $$t$$ assigned to bin $$b$$, because $$v_b$$ is the base rate of each group.
2. Condition (B): the average score of the negative class is $$v_b$$ in every group, since everyone receives the same score.
3. Condition (C): the average score of the positive class is also $$v_b$$ in every group.

Theorem 1.1. If a risk assignment satisfies all three fairness conditions, then the instance must have either perfect prediction or equal base rates.

The Characterization Theorems

informal overview

$$N_t$$ people in group $$t$$, $$\mu_t$$ is the number of positive people in group $$t$$.

Suppose there are $$N_{b}$$ people in bin b.

According to calibration, the expected number of people in group $$t$$ and bin $$b$$ who are positive equals a fraction $$v_b$$ of the expected number of people in group $$t$$ assigned to bin $$b$$, which is the total score of the people in group $$t$$ assigned to bin $$b$$.

$$N_b \cdot P(Y=1, T=t \mid B=b) = N_b \cdot P(Y=1 \mid B=b) \cdot P(T=t \mid B=b) = v_b \cdot N_b \cdot P(T=t \mid B=b)$$

Summing over the bins $$b$$ (a single bin is the simplest case), $$\mu_t = \sum_b N_b \cdot P(Y=1, T=t \mid B=b)$$ = total score of all people in group $$t$$.

Let $$x$$ be the average score of a member of the negative class and $$y$$ the average score of a member of the positive class. By the balance conditions, $$x$$ and $$y$$ are the same for both groups.

The total score in group $$t$$ is the score of its $$N_t - \mu_t$$ negative members plus the score of its $$\mu_t$$ positive members, so we have

$$(N_1 - \mu_1) x + \mu_1 y = \mu_1$$
$$(N_2 - \mu_2) x + \mu_2 y = \mu_2$$

Both lines pass through $$(x, y) = (0, 1)$$. If the base rates are equal, $$\mu_1 / N_1 = \mu_2 / N_2$$, the two lines coincide. Otherwise they are distinct lines that intersect only at $$(x, y) = (0, 1)$$, which is perfect prediction.
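
The intersection argument can be tried numerically. The sketch below solves the two balance equations by Cramer's rule with exact arithmetic; the function name `balanced_scores` and the sample counts are assumptions made for illustration, not part of the paper.

```python
from fractions import Fraction as F

def balanced_scores(N1, mu1, N2, mu2):
    """Solve (N_t - mu_t) * x + mu_t * y = mu_t for t = 1, 2.

    Returns the unique solution (x, y), or None when the determinant is zero,
    which happens exactly when the base rates mu_1/N_1 and mu_2/N_2 are equal
    and the two lines coincide.
    """
    a1, b1, c1 = N1 - mu1, mu1, mu1
    a2, b2, c2 = N2 - mu2, mu2, mu2
    det = a1 * b2 - a2 * b1          # equals N1*mu2 - N2*mu1
    if det == 0:
        return None
    x = F(c1 * b2 - c2 * b1, det)    # average score of negatives
    y = F(a1 * c2 - a2 * c1, det)    # average score of positives
    return x, y

print(balanced_scores(40, 15, 40, 25))   # unequal base rates
print(balanced_scores(40, 10, 80, 20))   # equal base rates (1/4 in both groups)
```

With unequal base rates the only solution is $$x = 0$$, $$y = 1$$; with equal base rates the function returns `None` because every point on the shared line satisfies both equations.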

Note: $$N_b \cdot P(Y=1, T=t \mid B=b) = N_{all} \cdot P(Y=1, T=t, B=b) = \dots$$ can be written in many equivalent ways; the form above is used for ease of exposition.

formal proof

This part introduces a few variables and briefly analyzes why only the perfect prediction and equal base rates cases are possible.

Some important definitions

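$$p_\sigma$$ is the probability that a person with feature vector $$\sigma$$ belongs to the positive class
group $$t$$ has $$N_t$$ people
a fraction $$a_{t\sigma}$$ of the people in group $$t$$ have feature vector $$\sigma$$
$$n_{t\sigma} = a_{t\sigma} N_t$$ is the number of people in group $$t$$ with feature vector $$\sigma$$
there are $$|\sigma|$$ distinct feature vectors (features are assumed to be discrete)
$$n_t \in \mathbb{R}^{|\sigma|}$$ is the vector indexed by feature vector, with the coordinate in position $$\sigma$$ equal to $$n_{t\sigma}$$, the number of people in group $$t$$ with that feature vector
$$P \in \mathbb{R}^{|\sigma| \times |\sigma|}$$ is the diagonal matrix with $$P_{\sigma\sigma} = p_\sigma$$
there are $$B$$ bins
$$v \in \mathbb{R}^B$$ is indexed by the bins, with the coordinate in position $$b$$ equal to the score $$v_b$$ of bin $$b$$
$$V \in \mathbb{R}^{B \times B}$$ is the diagonal matrix with $$V_{bb} = v_b$$
$$X \in \mathbb{R}^{|\sigma| \times B}$$, where each entry $$X_{\sigma b}$$ is the fraction of people with feature vector $$\sigma$$ who are assigned to bin $$b$$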

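From these definitions:

$$n_t^T P$$ gives, for each feature vector $$\sigma$$, the expected number of people in group $$t$$ with that feature vector who belong to the positive class
$$n_t^T X \in \mathbb{R}^B$$ gives the expected number of people from group $$t$$ in each bin $$b$$
$$n_t^T XV \in \mathbb{R}^B$$ gives, in position $$b$$, the expected total score of the members of group $$t$$ in bin $$b$$
$$n_t^T PX \in \mathbb{R}^B$$ gives, in position $$b$$, the expected number of positive-class members of group $$t$$ who are assigned to bin $$b$$

The first condition, calibration within groups, says that $$n_t^T XV = n_t^T PX$$.

We also write $$\mu_t = n_t^T XV e = n_t^T PX e$$, where $$e$$ is the all-ones vector, so multiplying by $$e$$ sums the coordinates. $$n_t^T XV e$$ is the total expected score in group $$t$$, and $$n_t^T PX e$$ is the expected number of positive members of group $$t$$; calibration makes the two equal.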
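
The matrix expressions can be evaluated directly on a small instance. The sketch below uses plain Python lists and exact fractions; the helper names `vec_mat` and `diag` and all the numbers are assumptions chosen for illustration.

```python
from fractions import Fraction as F

def vec_mat(u, M):
    """Row vector times matrix: (u^T M)_j = sum_i u_i * M[i][j]."""
    return [sum(u[i] * M[i][j] for i in range(len(u))) for j in range(len(M[0]))]

def diag(d):
    """Diagonal matrix with the entries of d on the diagonal."""
    return [[d[i] if i == j else 0 for j in range(len(d))] for i in range(len(d))]

# Hypothetical instance: two feature vectors, one bin per feature vector.
p = [F(1, 4), F(3, 4)]   # p_sigma
v = [F(1, 4), F(3, 4)]   # bin scores v_b
X = [[1, 0], [0, 1]]     # assignment matrix (identity assignment)
n_t = [30, 10]           # people in group t per feature vector

score_per_bin = vec_mat(vec_mat(n_t, X), diag(v))       # n_t^T X V
positives_per_bin = vec_mat(vec_mat(n_t, diag(p)), X)   # n_t^T P X
mu_t = sum(positives_per_bin)                           # n_t^T P X e

print(score_per_bin == positives_per_bin)   # calibration within group t
print(mu_t, sum(score_per_bin))             # both equal mu_t when calibrated

# Changing one bin score breaks calibration for the same group.
v_bad = [F(1, 2), F(3, 4)]
score_bad = vec_mat(vec_mat(n_t, X), diag(v_bad))
print(score_bad == positives_per_bin)
```

In this instance $$n_t^T XV$$ and $$n_t^T PX$$ agree coordinate by coordinate and both sum to $$\mu_t = 15$$, while the altered scores `v_bad` no longer satisfy the calibration identity.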

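To express the other two conditions, fairness to the positive and negative classes, note that $$n_t^T PXv$$ is the total expected score received by the positive members of group $$t$$. Since $$\mu_t$$ is the expected number of positive members, their average score is $$\frac{1}{\mu_t} n_t^T PXv$$.

The third condition can therefore be written as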

$$\frac{1}{\mu_1} n_1^T PXv = \frac{1}{\mu_2} n_2^T PXv$$

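Since $$p_\sigma$$ is the fraction of positives among people with feature vector $$\sigma$$, $$1-p_\sigma$$ is the fraction of negatives. With $$I$$ the identity matrix, the second condition can therefore be written as

$$\frac{1}{N_1 - \mu_1} n_1^T (I-P)Xv = \frac{1}{N_2 - \mu_2} n_2^T (I-P)Xv$$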

Since the first condition requires $$n_t^T XV = n_t^T PX$$, these two conditions can be rewritten as

$$\frac{1}{\mu_1} n_1^T XVv = \frac{1}{\mu_2} n_2^T XVv$$
$$\frac{1}{N_1 - \mu_1} (\mu_1 - n_1^T XVv) = \frac{1}{N_2 - \mu_2} (\mu_2 - n_2^T XVv)$$
- Note: $$n_t^T (I-P) X v = n_t^T Xv - n_t^T PXv = n_t^T XVe - n_t^T XVv = \mu_t - n_t^T XVv$$, using $$v = Ve$$.

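Let $$\gamma_t = \frac{1}{\mu_t} n_t^T PXv$$ denote the average score of the positive members of group $$t$$.

$$\gamma_t=1$$ corresponds to perfect prediction.

The third condition says that $$\gamma_1 = \gamma_2 = \gamma$$. Substituting $$n_t^T XVv = n_t^T PXv = \gamma \mu_t$$ into the second condition gives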

$$\frac{1}{N_1 - \mu_1} (\mu_1 - \gamma \mu_1) = \frac{1}{N_2 - \mu_2} (\mu_2 - \gamma \mu_2)$$
$$\frac{1}{N_1 - \mu_1} \mu_1(1 - \gamma) = \frac{1}{N_2 - \mu_2} \mu_2 (1 - \gamma)$$
$$\frac{\mu_1 / N_1}{1-\mu_1 / N_1} (1 - \gamma) = \frac{\mu_2 / N_2}{1-\mu_2 / N_2} (1 - \gamma)$$

The last equation holds in two cases: either $$\gamma=1$$, which is perfect prediction, or

$$\frac{\mu_1 / N_1}{1-\mu_1 / N_1} = \frac{\mu_2 / N_2}{1-\mu_2 / N_2}$$

that is, $$\mu_1 / N_1 = \mu_2 / N_2$$, which is equal base rates.
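
To spell out this last step, write $$r_t = \mu_t / N_t$$ for the base rate of group $$t$$ (the symbol $$r_t$$ is introduced here only for illustration) and assume both base rates are strictly less than 1. Cross-multiplying the equality of odds gives

$$\frac{r_1}{1-r_1} = \frac{r_2}{1-r_2} \;\Longrightarrow\; r_1 (1 - r_2) = r_2 (1 - r_1) \;\Longrightarrow\; r_1 - r_1 r_2 = r_2 - r_1 r_2 \;\Longrightarrow\; r_1 = r_2$$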

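Relation to statistical parity: the two balance conditions differ from statistical parity. Statistical parity requires the average score of each group as a whole to be the same, whereas the conditions here are more fine-grained: they require the averages to match separately for the positive class and for the negative class.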

Reducing Loss with Equal Base Rates

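In a risk assignment, we would like as much of the score as possible to go to members with a positive label. For a member with label $$y \in \{0, 1\}$$ who receives score $$v$$, we define the individual loss as $$y(1-v) + (1-y)v$$. The loss of a risk assignment for group $$t$$ is the sum of the expected individual losses of its members, namely

$$\ell_t(X) = n_t^T(I-P)Xv + (\mu_t - n_t^T PX v) = 2 (\mu_t - n_t^T PXv)$$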
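
The second equality can be checked in two steps, sketched here under the assumption that the assignment is calibrated, with $$I$$ the identity matrix and $$V$$ the diagonal matrix of bin scores so that $$v = Ve$$:

$$n_t^T (I-P) X v = n_t^T X v - n_t^T P X v = n_t^T X V e - n_t^T P X v = \mu_t - n_t^T P X v$$

$$\ell_t(X) = (\mu_t - n_t^T P X v) + (\mu_t - n_t^T P X v) = 2(\mu_t - n_t^T P X v) = 2\mu_t (1 - \gamma_t)$$

The last form uses $$\gamma_t = \frac{1}{\mu_t} n_t^T PXv$$ from the proof above, so a higher average score for the positive members means a lower loss.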

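We call an assignment that satisfies all three conditions a fair assignment. When the two groups have equal base rates, the set of fair assignments is non-empty, because the calibrated risk assignment that places everyone in a single bin is fair. We can then ask whether there is a fair assignment with lower loss than this one-bin assignment. A fair assignment has lower loss than the one-bin assignment if and only if it uses more than one bin; we call such an assignment a non-trivial assignment.

Note that the loss is minimized by assigning each feature vector $$\sigma$$ to its own bin with score $$p_\sigma$$, which means that $$X$$ is the identity matrix. This identity assignment $$I$$ is well-calibrated, but it may violate the two balance conditions even when the base rates are equal.

The conclusion is that unless the identity assignment $$I$$ happens to be fair as well, every fair assignment has higher loss than $$I$$, which creates a trade-off between performance and fairness.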
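
The trade-off can be seen on a toy instance with equal base rates. The sketch below computes the loss directly from the individual-loss definition (a negative loses its score $$v$$, a positive loses $$1 - v$$) for the identity assignment and for the one-bin assignment. The function name `group_loss` and all the numbers are assumptions for illustration.

```python
from fractions import Fraction as F

def group_loss(counts, p, X, v):
    """Sum of expected individual losses over one group.

    A person with feature vector s placed in bin b is positive with
    probability p[s] (losing 1 - v[b]) and negative otherwise (losing v[b]).
    """
    loss = 0
    for s, count in enumerate(counts):
        for b, score in enumerate(v):
            mass = count * X[s][b]   # expected people with feature s in bin b
            loss += mass * (p[s] * (1 - score) + (1 - p[s]) * score)
    return loss

# Hypothetical instance: three feature vectors; both groups have base rate 1/2.
p = [F(0), F(1, 2), F(1)]
groups = {1: [10, 0, 10], 2: [0, 20, 0]}

assignments = {
    "identity": ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], p),   # one bin per feature vector
    "one_bin": ([[1], [1], [1]], [F(1, 2)]),              # everyone in one bin
}

losses = {
    name: {t: group_loss(counts, p, X, v) for t, counts in groups.items()}
    for name, (X, v) in assignments.items()
}
print(losses)
```

Here the identity assignment has loss 0 in group 1 and 10 in group 2, while the one-bin assignment has loss 10 in each group. The identity assignment is cheaper overall but not fair: positive members of group 1 all receive score 1, while positive members of group 2 all receive score 1/2.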

Appendix

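The first half of the paper never explains bins very clearly; their role only becomes clear at the end. Put simply, binning makes the predictions deliberately less precise by rounding: feature vectors are rounded into bins, and each bin has its own distribution over positive and negative labels. The final conclusion is the trade-off between performance and fairness.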
