<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>NotionNext BLOG</title>
        <link>https://notion-next-eta-weld.vercel.app//</link>
        <description>This is a site generated by NotionNext</description>
        <lastBuildDate>Mon, 08 May 2023 13:48:49 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>zh-CN</language>
        <copyright>All rights reserved 2023, on the way</copyright>
        <item>
            <title><![CDATA[None]]></title>
            <link>https://notion-next-eta-weld.vercel.app//article/59a215f9-6ead-4a3b-9817-31fa9b41e734</link>
            <guid>https://notion-next-eta-weld.vercel.app//article/59a215f9-6ead-4a3b-9817-31fa9b41e734</guid>
            <pubDate>Mon, 17 Apr 2023 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<div id="container" class="mx-auto undefined"><main class="notion light-mode notion-page notion-block-59a215f96ead4a3b981731fa9b41e734"><div class="notion-viewport"></div><div class="notion-collection-page-properties"></div></main></div>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Potential Outcome Causal Model]]></title>
            <link>https://notion-next-eta-weld.vercel.app//article/art-1</link>
            <guid>https://notion-next-eta-weld.vercel.app//article/art-1</guid>
            <pubDate>Mon, 17 Apr 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[Causal inference models]]></description>
            <content:encoded><![CDATA[<div id="container" class="mx-auto undefined"><main class="notion light-mode notion-page notion-block-36f66247cac94b709be825bc43afaeea"><div class="notion-viewport"></div><div class="notion-collection-page-properties"></div><h2 class="notion-h notion-h1 notion-h-indent-0 notion-block-73243b189501499a92a4286f45aa4abb" data-id="73243b189501499a92a4286f45aa4abb"><span><div id="73243b189501499a92a4286f45aa4abb" class="notion-header-anchor"></div><a class="notion-hash-link" href="#73243b189501499a92a4286f45aa4abb" title="Potential Outcome Causal Model"><svg viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M7.775 3.275a.75.75 0 001.06 1.06l1.25-1.25a2 2 0 112.83 2.83l-2.5 2.5a2 2 0 01-2.83 0 .75.75 0 00-1.06 1.06 3.5 3.5 0 004.95 0l2.5-2.5a3.5 3.5 0 00-4.95-4.95l-1.25 1.25zm-4.69 9.64a2 2 0 010-2.83l2.5-2.5a2 2 0 012.83 0 .75.75 0 001.06-1.06 3.5 3.5 0 00-4.95 0l-2.5 2.5a3.5 3.5 0 004.95 4.95l1.25-1.25a.75.75 0 00-1.06-1.06l-1.25 1.25a2 2 0 01-2.83 0z"></path></svg></a><span class="notion-h-title">Potential Outcome Causal Model</span></span></h2><blockquote class="notion-quote notion-block-1dd4ca2749784cf787f637bfc2005773"><div>Variable definitions:
<span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y_i$</span></span>: the observed outcome of individual <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span>
<span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y_i(t)$</span></span>: the potential outcome of individual <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span> when the treatment variable is <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = t$</span></span></div></blockquote><h3 class="notion-h notion-h2 notion-h-indent-1 notion-block-cda0710324014f908f73147b6584b674" data-id="cda0710324014f908f73147b6584b674"><span><div id="cda0710324014f908f73147b6584b674" class="notion-header-anchor"></div><a class="notion-hash-link" href="#cda0710324014f908f73147b6584b674" title="1. Definitions in the Potential Outcome Model"><svg viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M7.775 3.275a.75.75 0 001.06 1.06l1.25-1.25a2 2 0 112.83 2.83l-2.5 2.5a2 2 0 01-2.83 0 .75.75 0 00-1.06 1.06 3.5 3.5 0 004.95 0l2.5-2.5a3.5 3.5 0 00-4.95-4.95l-1.25 1.25zm-4.69 9.64a2 2 0 010-2.83l2.5-2.5a2 2 0 012.83 0 .75.75 0 001.06-1.06 3.5 3.5 0 00-4.95 0l-2.5 2.5a3.5 3.5 0 004.95 4.95l1.25-1.25a.75.75 0 00-1.06-1.06l-1.25 1.25a2 2 0 01-2.83 0z"></path></svg></a><span class="notion-h-title">1. Definitions in the Potential Outcome Model</span></span></h3><details class="notion-toggle notion-block-b4776d5d38e84b9989a61206ebc051c2"><summary><b>1.1 Potential outcomes</b></summary><div><div class="notion-text notion-block-00d33707b3454eb6b0890766a4dea1bf">Consider two random variables <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T, Y$</span></span>. When we study the causal effect <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T \rightarrow Y$</span></span> and the treatment variable is <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = t$</span></span>, the potential outcome of individual <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span> is written <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y_i(t)$</span></span>. It denotes the value the outcome variable takes for individual <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span> when the treatment variable is <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = t$</span></span>.</div></div></details><div class="notion-blank notion-block-3a3dd5ff9ef04cf199f0b92823653d4a"> </div><div class="notion-callout notion-gray_background_co notion-block-5ab89098a34f4cec96ef25f1acc77908"><div class="notion-page-icon-inline notion-page-icon-span"><span class="notion-page-icon" role="img" aria-label="💡">💡</span></div><div class="notion-callout-text">A potential outcome is a causal quantity defined for a single individual, so the individual causal effect is easy to define from it. The counterpart of a potential outcome is the "observed outcome", i.e. the outcome under the treatment actually applied to individual <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span>, written <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y_i$</span></span>. Assuming a binary treatment <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T \in \{0, 1\}$</span></span>, we have <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y_i = T_i Y_i(1) + (1 - T_i) Y_i(0)$</span></span>.</div></div><div class="notion-blank notion-block-a92bd8f4a68241878d1d38dc6d916dc4"> </div><details class="notion-toggle notion-block-75b36ba382034eb499cca0f14d28de4d"><summary><b>1.2 Individual treatment effect (ITE)</b></summary><div><div class="notion-text notion-block-17b53fe42cb04bb8b80f2e03bf36dac7">Assume a treatment variable <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T \in \{0, 1\}$</span></span> and an outcome variable <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span>. The ITE of individual <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span> is the difference between that individual's two potential outcomes, one in the treatment group and one in the control group:</div><span role="button" tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathrm{ITE}_i = Y_i(1) - Y_i(0)$$</span></span></div></details><details class="notion-toggle notion-block-330295b40b8a4721bd6e167abc8a1e76"><summary><b>1.3 Average treatment effect (ATE)</b></summary><div><div class="notion-text notion-block-b519e03e5b584be881f761098c2bbbe5">The average treatment effect is the causal effect at the <b>population</b> level, i.e. the expectation of the ITE over the whole population:</div><span role="button" tabindex="0"
class="notion-equation notion-equation-block"><span>$$\mathrm{ATE} = \mathbb{E}[Y(1) - Y(0)]$$</span></span></div></details><details class="notion-toggle notion-block-67ab82be6afd423d8214238952ccd559"><summary><b>1.4 Conditional average treatment effect (CATE)</b></summary><div><div class="notion-text notion-block-79ba8a34ae6c43df967b4fea5c9e49f5">The average treatment effect over the population whose feature variables take the value <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X = x$</span></span>:</div><span role="button" tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathrm{CATE}(x) = \mathbb{E}[Y(1) - Y(0) \mid X = x]$$</span></span><div class="notion-text notion-block-91c8ff9d4a114716981b0c98e7f6eeea">When the treatment effect differs across subgroups, CATE is a commonly used measure of the treatment effect; it is also called the heterogeneous treatment effect.</div></div></details><details class="notion-toggle notion-block-6af4d5f74f5541978ab4f71ee7d53008"><summary><b>1.5 Average treatment effect on the treated (ATT)</b></summary><div><div class="notion-text notion-block-0c3ecf2810a24aaab291afb59d43390d">The average treatment effect restricted to the treated subgroup:</div><span role="button" tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathrm{ATT} = \mathbb{E}[Y(1) - Y(0) \mid T = 1]$$</span></span></div></details><h3 class="notion-h notion-h2 notion-h-indent-1 notion-block-ca5abbb66c7941b7a74336a10d76b4da" data-id="ca5abbb66c7941b7a74336a10d76b4da"><span><div id="ca5abbb66c7941b7a74336a10d76b4da" class="notion-header-anchor"></div><a class="notion-hash-link" href="#ca5abbb66c7941b7a74336a10d76b4da" title="2. The Goal of Causal Inference"><svg viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M7.775 3.275a.75.75 0 001.06 1.06l1.25-1.25a2 2 0 112.83 2.83l-2.5 2.5a2 2 0 01-2.83 0 .75.75 0 00-1.06 1.06 3.5 3.5 0 004.95 0l2.5-2.5a3.5 3.5 0 00-4.95-4.95l-1.25 1.25zm-4.69 9.64a2 2 0 010-2.83l2.5-2.5a2 2 0 012.83 0 .75.75 0 001.06-1.06 3.5 3.5 0 00-4.95 0l-2.5 2.5a3.5 3.5 0 004.95 4.95l1.25-1.25a.75.75 0 00-1.06-1.06l-1.25 1.25a2 2 0 01-2.83 0z"></path></svg></a><span class="notion-h-title">2. The Goal of Causal Inference</span></span></h3><div class="notion-text notion-block-a3397853a79c4b63b959c47ce8bfa518">In causal inference, our goal is to estimate treatment effects from observational data. Formally, given an observational dataset <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathcal{D} = \{(X_i, T_i, Y_i)\}_{i=1}^{N}$</span></span>, the task is to estimate the treatment effects defined above.</div><h3 class="notion-h notion-h2 notion-h-indent-1 notion-block-5a5f2c658cb8492a83e878df43ee1f11" data-id="5a5f2c658cb8492a83e878df43ee1f11"><span><div id="5a5f2c658cb8492a83e878df43ee1f11" class="notion-header-anchor"></div><a class="notion-hash-link" href="#5a5f2c658cb8492a83e878df43ee1f11" title="3. Three Assumptions"><svg viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M7.775 3.275a.75.75 0 001.06 1.06l1.25-1.25a2 2 0 112.83 2.83l-2.5 2.5a2 2 0 01-2.83 0 .75.75 0 00-1.06 1.06 3.5 3.5 0 004.95 0l2.5-2.5a3.5 3.5 0 00-4.95-4.95l-1.25 1.25zm-4.69 9.64a2 2 0 010-2.83l2.5-2.5a2 2 0 012.83 0 .75.75 0 001.06-1.06 3.5 3.5 0 00-4.95 0l-2.5 2.5a3.5 3.5 0 004.95 4.95l1.25-1.25a.75.75 0 00-1.06-1.06l-1.25 1.25a2 2 0 01-2.83 0z"></path></svg></a><span class="notion-h-title">3. Three Assumptions</span></span></h3><div class="notion-text notion-block-108071b10bc04487834d9778b340f9f9">Causal identification in the potential outcome model rests mainly on the following assumptions:</div><ol start="1" class="notion-list notion-list-numbered notion-block-48df0a884e1848ba9ff22d8d2e9c1948"><li><b>Stable unit treatment value assumption (SUTVA).</b> This assumption has two parts:</li><ol class="notion-list notion-list-numbered notion-block-48df0a884e1848ba9ff22d8d2e9c1948"><li>The first part is no interference: <b>the potential outcomes of different individuals are independent of one another</b>, i.e. the treatment applied to any individual <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span> does not affect other individuals. For example, my headache should depend only on whether I take aspirin myself; whether anyone else takes aspirin should have no effect on my headache. SUTVA lets us treat each individual's response in the sample as an independent event, which reduces the sample size, model size and modeling time we need.</li><li>The second part is <b>consistency</b>: the outcome <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y_i$</span></span> observed for an individual under treatment <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = t$</span></span> (the factual outcome) is exactly that individual's potential outcome <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y_i(t)$</span></span> for <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = t$</span></span>, that is, <span role="button" tabindex="0"
class="notion-equation notion-equation-inline"><span>$T_i = t \Rightarrow Y_i = Y_i(t)$</span></span>. For example, if a person took aspirin and recovered as a result, then had that person been randomly assigned to the treatment group of a clinical trial and taken aspirin, they would have recovered just the same.</li></ol></ol><ol start="2" class="notion-list notion-list-numbered notion-block-a3822109b7a24c078dc5be5e37a578d3"><li><b>Ignorability assumption</b></li><ol class="notion-list notion-list-numbered notion-block-a3822109b7a24c078dc5be5e37a578d3"><li>Conditional on the confounders <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, the potential outcomes are independent of whether treatment is applied. This is usually written <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$(Y(0), Y(1)) \perp\!\!\!\perp T \mid X$</span></span>. It can be described in two parts:</li><ol class="notion-list notion-list-numbered notion-block-6347a098a5df4522b911a2c238c1ff21"><li>Given <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, the distribution of the potential outcome <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y(t)$</span></span> is the same whatever value <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> takes, i.e. <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$P(Y(t) \mid X, T) = P(Y(t) \mid X)$</span></span>. In other words, if two patients share the same background variables, their potential outcomes follow the same distribution whichever treatment they actually received; equivalently, the causal effect of aspirin on my headache should be the same whether or not I actually took it.</li><li>Given <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, the treatment <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> has the same distribution for any two individuals <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$i$</span></span> and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$j$</span></span>, whatever their potential outcomes, so assignment can be regarded as random, i.e. <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$P(T \mid X, Y_i(0), Y_i(1)) = P(T \mid X, Y_j(0), Y_j(1))$</span></span>.</li></ol><li>Put simply, within a subgroup that shares the same value of the confounders <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, whether treatment is applied is random, which approximates a randomized controlled trial. Within a subgroup with the same <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, the observed outcome is therefore equivalent to the potential outcome, and the conditional average treatment effect becomes:</li><span role="button" tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathrm{CATE}(x) = \mathbb{E}[Y(1) - Y(0) \mid X = x] = \mathbb{E}[Y(1) \mid X = x] - \mathbb{E}[Y(0) \mid X = x] = \mathbb{E}[Y(1) \mid T = 1, X = x] - \mathbb{E}[Y(0) \mid T = 0, X = x] = \mathbb{E}[Y \mid T = 1, X = x] - \mathbb{E}[Y \mid T = 0, X = x]$$</span></span><div class="notion-callout notion-gray_background_co notion-block-b021e504d92645f0b9c6b77dce72284c"><div class="notion-page-icon-inline notion-page-icon-span"><span class="notion-page-icon" role="img" aria-label="💡">💡</span></div><div class="notion-callout-text">The second equality follows from linearity of expectation: the expectation of a difference equals the difference of the expectations. The third equality follows from the ignorability assumption: once the value of <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> is controlled for, the potential outcomes are independent of the treatment. The fourth equality follows from the consistency assumption: the observed factual outcome is the potential outcome under the same treatment.</div></div><div class="notion-callout notion-gray_background_co notion-block-672f19be517b4c46b43ea7e75a57fe55"><div class="notion-page-icon-inline notion-page-icon-span"><span class="notion-page-icon" role="img" aria-label="💡">💡</span></div><div class="notion-callout-text">In terms of the causal graph, ignorability holds if conditioning on <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> blocks the backdoor paths between <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span> and no member of <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> is a descendant of <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span>. If so, <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> is ignorable with respect to <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span>. For the ignorability assumption this means that <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> contains all confounders, so that no unobserved confounder exists.</div></div><div class="notion-text notion-block-2a56ecbe11414d7aa3c8f9e7a9350840">Compare whether the following two causal graphs satisfy ignorability:</div><li>In this graph, for the causal relationship between <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span>, <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> is a confounder, and there is a backdoor path from <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> to <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span>, namely <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T \leftarrow X \rightarrow Y$</span></span>. Conditioning on <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> blocks the backdoor path from <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> to <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span>, and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> is not a descendant of <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span>, so ignorability holds. Conditioning on <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> is therefore enough to estimate the causal effect between <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span> (the backdoor criterion).</li><pre class="notion-code"><div class="notion-code-copy"><div class="notion-code-copy-button"><svg fill="currentColor" viewBox="0 0 16 16" width="1em" version="1.1"><path fill-rule="evenodd" d="M0 6.75C0 5.784.784 5 1.75 5h1.5a.75.75 0 010 1.5h-1.5a.25.25 0 00-.25.25v7.5c0 .138.112.25.25.25h7.5a.25.25 0 00.25-.25v-1.5a.75.75 0 011.5 0v1.5A1.75 1.75 0 019.25 16h-7.5A1.75 1.75 0 010 14.25v-7.5z"></path><path fill-rule="evenodd" d="M5 1.75C5 .784 5.784 0 6.75 0h7.5C15.216 0 16 .784 16 1.75v7.5A1.75 1.75 0 0114.25 11h-7.5A1.75 1.75 0 015
9.25v-7.5zm1.75-.25a.25.25 0 00-.25.25v7.5c0 .138.112.25.25.25h7.5a.25.25 0 00.25-.25v-7.5a.25.25 0 00-.25-.25h-7.5z"></path></svg></div></div><code class="language-mermaid">graph LR;
  T--&gt;Y
  X--&gt;T
  X--&gt;Y</code></pre><li>In this graph, for the causal relationship between <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span>, there is no backdoor path from <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> to <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span>. Even if we condition on <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, ignorability does not hold, because <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> is a descendant of <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span>. Conditioning on <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> therefore cannot be used to estimate the causal effect between <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T$</span></span> and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$Y$</span></span> (the front-door criterion is needed instead).</li><pre class="notion-code"><div class="notion-code-copy"><div class="notion-code-copy-button"><svg fill="currentColor" viewBox="0 0 16 16" width="1em" version="1.1"><path fill-rule="evenodd" d="M0 6.75C0 5.784.784 5 1.75 5h1.5a.75.75 0 010 1.5h-1.5a.25.25 0 00-.25.25v7.5c0 .138.112.25.25.25h7.5a.25.25 0 00.25-.25v-1.5a.75.75 0 011.5 0v1.5A1.75 1.75 0 019.25 16h-7.5A1.75 1.75 0 010 14.25v-7.5z"></path><path fill-rule="evenodd" d="M5 1.75C5 .784 5.784 0 6.75 0h7.5C15.216 0 16 .784 16 1.75v7.5A1.75 1.75 0 0114.25 11h-7.5A1.75 1.75 0 015 9.25v-7.5zm1.75-.25a.25.25 0 00-.25.25v7.5c0 .138.112.25.25.25h7.5a.25.25 0 00.25-.25v-7.5a.25.25 0 00-.25-.25h-7.5z"></path></svg></div></div><code class="language-mermaid">graph LR;
  T--&gt;Y
  T--&gt;X
  X--&gt;Y</code></pre></ol></ol><ol start="3" class="notion-list notion-list-numbered notion-block-00a4af5f874e4917be26023e9b96d4c0"><li><b>Positivity assumption</b></li><ol class="notion-list notion-list-numbered notion-block-00a4af5f874e4917be26023e9b96d4c0"><li>For any value of <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, treatment assignment is not deterministic, i.e. <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$P(T = t \mid X = x) &gt; 0,\ \forall\, t, x$</span></span>. In other words, every subgroup <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X = x$</span></span> must contain outcomes for both <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = 1$</span></span> and <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = 0$</span></span>. This avoids the situation where, for a given <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>, the data contain outcomes only for <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = 1$</span></span>, so that the outcome under <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T = 0$</span></span> cannot be estimated.</li></ol></ol><div class="notion-blank notion-block-734c0470199b45108b0b3e6b3f550ca7"> </div><h3 class="notion-h notion-h2 notion-h-indent-1 notion-block-4f808b04b844413caef846c33d6f89f0" data-id="4f808b04b844413caef846c33d6f89f0"><span><div id="4f808b04b844413caef846c33d6f89f0" class="notion-header-anchor"></div><a class="notion-hash-link" href="#4f808b04b844413caef846c33d6f89f0" title="4. Estimating Causal Effects"><svg viewBox="0 0 16 16" width="16" height="16"><path fill-rule="evenodd" d="M7.775 3.275a.75.75 0 001.06 1.06l1.25-1.25a2 2 0 112.83 2.83l-2.5 2.5a2 2 0 01-2.83 0 .75.75 0 00-1.06 1.06 3.5 3.5 0 004.95 0l2.5-2.5a3.5 3.5 0 00-4.95-4.95l-1.25 1.25zm-4.69 9.64a2 2 0 010-2.83l2.5-2.5a2 2 0 012.83 0 .75.75 0 001.06-1.06 3.5 3.5 0 00-4.95 0l-2.5 2.5a3.5 3.5 0 004.95 4.95l1.25-1.25a.75.75 0 00-1.06-1.06l-1.25 1.25a2 2 0 01-2.83 0z"></path></svg></a><span
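class="notion-h-title">4. Estimating Causal Effects</span></span></h3><div class="notion-text notion-block-a7f84fb4884c4412a4e02a9f0170251c">Assume a treatment variable <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$T \in \{0, 1\}$</span></span> and suppose we want to estimate <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathrm{ATE}$</span></span>. If we compute it directly from observational data, the presence of <b>confounders</b> will very likely make the estimate include entirely spurious causal effects:</div><span role="button" tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathbb{E}[Y \mid T = 1] - \mathbb{E}[Y \mid T = 0] \neq \mathbb{E}[Y(1)] - \mathbb{E}[Y(0)] = \mathrm{ATE} \tag{4.1}$$</span></span><div class="notion-text notion-block-f4f5c97c66f646afabda77bd260ee572">Decomposing the left-hand side shows why the equality fails:</div><span role="button" tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathbb{E}[Y \mid T = 1] - \mathbb{E}[Y \mid T = 0] = \mathrm{ATE} + \big(\mathbb{E}[Y(0) \mid T = 1] - \mathbb{E}[Y(0) \mid T = 0]\big) + (1 - \pi)(\mathrm{ATT} - \mathrm{ATC}) \tag{4.2}$$</span></span><div class="notion-text notion-block-abb86d8e004f41ac9011cba09981772f">where:</div><ul class="notion-list notion-list-disc notion-block-536cf0eb3f0c429593ebf620c8ab58ff"><li><span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathrm{ATT} = \mathbb{E}[Y(1) - Y(0) \mid T = 1]$</span></span>: the average treatment effect in the treated group;</li></ul><ul class="notion-list notion-list-disc notion-block-deb5c16978354382834bb3b8c933595a"><li><span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathrm{ATC} = \mathbb{E}[Y(1) - Y(0) \mid T = 0]$</span></span>: the average treatment effect in the control group;</li></ul><ul class="notion-list notion-list-disc notion-block-957336491c4c46fd9637ab510ea2f149"><li><span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathbb{E}[Y(0) \mid T = 1] - \mathbb{E}[Y(0) \mid T = 0]$</span></span>: the <b>selection bias</b>, which describes the difference between the treated and control groups in the distribution of potential outcomes;</li></ul><ul class="notion-list notion-list-disc notion-block-671bc032cab44e9583337b50f9b7c98f"><li><span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathrm{ATT} - \mathrm{ATC}$</span></span>: the difference in causal effect between the treated and control groups, called the <b>confounding bias.</b></li></ul><ul class="notion-list notion-list-disc notion-block-6fe7becebacd455abab34622a3cd0983"><li><span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\pi = P(T = 1)$</span></span>: the probability of being treated. It follows that:</li><ul class="notion-list notion-list-disc notion-block-6fe7becebacd455abab34622a3cd0983"><span role="button"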
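tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathrm{ATE} = \pi \cdot \mathrm{ATT} + (1 - \pi) \cdot \mathrm{ATC}$$</span></span></ul></ul><div class="notion-text notion-block-8c9284ec3ffc4b54b2adee872cf25192">In general, therefore, we cannot compute <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathrm{ATE}$</span></span> directly from observational data. Only under fairly strong assumptions can observational data be used to estimate <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathrm{ATE}$</span></span>:</div><span role="button" tabindex="0" class="notion-equation notion-equation-block"><span>$$\mathbb{E}[Y(1) - Y(0)] = \mathbb{E}_X\big[\mathbb{E}[Y(1) \mid X] - \mathbb{E}[Y(0) \mid X]\big] = \mathbb{E}_X\big[\mathbb{E}[Y(1) \mid T = 1, X] - \mathbb{E}[Y(0) \mid T = 0, X]\big] = \mathbb{E}_X\big[\mathbb{E}[Y \mid T = 1, X] - \mathbb{E}[Y \mid T = 0, X]\big]$$</span></span><div class="notion-text notion-block-b971d315b7cc4575bd5e93cfebbc13f3">In the expression above, the step from the second to the third expression uses the <b>"ignorability"</b> assumption, and the step from the third to the fourth uses the <b>"consistency"</b> part of <b>SUTVA</b>.</div><div class="notion-text notion-block-f68200d7444647e78cd45279ec4126aa">In practice the <b>"ignorability"</b> assumption is hard to satisfy, because we cannot observe every confounder. As the derivation of equation <b>(4.2)</b> shows, when <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$\mathrm{ATE}$</span></span> is computed directly from observational data, unobserved confounders give rise to selection bias and confounding bias (both can in fact be understood as selection bias: the treatment and control groups are not sufficiently random).</div><div class="notion-text notion-block-b6f2e5d2ba914b78a32ef99f3f8f9a7c">Simpson's paradox is a typical example. Patient age (<span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>) is a confounder: patients of different ages differ in their propensity to take the drug, so computing the causal effect of taking the drug on recovery directly yields a spurious effect driven by age. If, however, patient age (<span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>) is the only confounder between taking the drug and recovery, we can first estimate the treatment effect conditional on patient age (<span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span>) and then take a weighted average over the distribution of the confounder. In other words, conditioning on <span role="button" tabindex="0" class="notion-equation notion-equation-inline"><span>$X$</span></span> blocks the backdoor path between taking the drug and recovery.</div><div class="notion-text notion-block-095da473a03c470d9222a393019c8cea">When not all confounders can be observed, what are good ways to estimate causal effects?</div><div class="notion-text notion-block-5f34ebed30ef42f0b00d32e752dd1113">Two kinds of solutions are commonly used:</div><ul class="notion-list notion-list-disc notion-block-4efd1e5f6d7741e5862638cd07c5e799"><li>The first creates a <b>pseudo group</b> that approximates the true distribution of the target group. Common methods include sample re-weighting, matching, tree-based methods, confounder balancing, balanced representation learning and multi-task methods. The pseudo group mitigates the negative effect of selection bias and so yields more reliable estimates of counterfactual outcomes;</li></ul><ul class="notion-list notion-list-disc notion-block-44edc294ff384d4c90f6a8cec69e9ad5"><li>The second first trains a base potential-outcome estimation model on the observational data alone, then corrects the estimation bias caused by selection bias. The representative methods of this kind are meta-learning-based methods.</li></ul></main></div>]]></content:encoded>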
        </item>
    </channel>
</rss>