{"id":1473,"date":"2025-05-17T02:04:38","date_gmt":"2025-05-17T02:04:38","guid":{"rendered":"https:\/\/tiemensfamily.com\/timoncs\/?p=1473"},"modified":"2025-05-17T02:04:38","modified_gmt":"2025-05-17T02:04:38","slug":"reaction-to-cuped-article","status":"publish","type":"post","link":"https:\/\/tiemensfamily.com\/timoncs\/2025\/05\/17\/reaction-to-cuped-article\/","title":{"rendered":"Reaction to CUPED article"},"content":{"rendered":"\n<p>On September 15, 2024, Craig Sexauer posted &#8220;<a href=\"https:\/\/www.statsig.com\/blog\/cuped\">CUPED Explained<\/a>&#8221;, where CUPED stands for Controlled-experiment Using Pre-Experiment Data, billed as &#8220;one of the most powerful algorithmic tools for increasing the speed and accuracy of experimentation programs.&#8221;<\/p>\n\n\n\n<p>Here is a list of some of the things that just make me want to vomit about this explanation of the tool (if not the actual tool itself):<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>As an experiment matures and hits its target date for readout, it\u2019s not uncommon to see a result that seems to be\u00a0<em><strong>only barely<\/strong><\/em>\u00a0outside the range where it would be treated as statistically significant.\u00a0<\/p><\/blockquote>\n\n\n\n<p>&#8220;<strong>Only barely<\/strong>&#8221;.  As in, we already gave you a 1-in-20 chance of a false positive, and you just need a little bit more.  This is very reminiscent of this <a href=\"https:\/\/xkcd.com\/882\/\">XKCD comic<\/a>, in which green jelly beans are linked to acne and &#8220;only 5% chance of coincidence&#8221; is prominently displayed, while the 19 other studies (red, turquoise, magenta, yellow, grey, etc.) are buried.  
Any study that admits to using CUPED would instantly get a &#8220;what are you not telling me?&#8221; rejection.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>Waiting for more samples delays your ability to make an informed decision, and it doesn\u2019t guarantee you\u2019ll observe a statistically significant result when there is a real effect.<\/p><\/blockquote>\n\n\n\n<p>See above &#8211; you have a weak hypothesis, and adding more samples doesn&#8217;t guarantee the result you want.  This is all true &#8211; it just hides the &#8220;use CUPED to solve this problem&#8221; conclusion.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>Conceptually, if one group has a\u00a0<strong>faster average baseline<\/strong>, their experiment results will also be faster. When we apply a CUPED correction, the\u00a0<strong>faster group\u2019s metric will be adjusted downwards relative to the slower group.<\/strong><\/p><\/blockquote>\n\n\n\n<p>Before you just pick a random variable and &#8220;correct&#8221; for it (like, by using CUPED), I would recommend you read (and understand) <a href=\"https:\/\/www.amazon.com\/Book-Why-Science-Cause-Effect\/dp\/1541698967\">The Book of Why<\/a> by Judea Pearl.  This book has tons of examples where a &#8220;confounding variable&#8221; is corrected for incorrectly.  The book describes do-calculus as a way to model causality.  [All of this is not to say I understand do-calculus.  
I only understand enough to tell when other people are wrong.]<\/p>\n\n\n\n<p>Python libraries on causality:<\/p>\n\n\n\n<ul><li>gCastle &#8211; <a href=\"https:\/\/github.com\/huawei-noah\/trustworthyAI\/tree\/master\/gcastle\">https:\/\/github.com\/huawei-noah\/trustworthyAI\/tree\/master\/gcastle<\/a><\/li><li>DoWhy &#8211; <a href=\"https:\/\/www.pywhy.org\/dowhy\/v0.11.1\/\">https:\/\/www.pywhy.org\/dowhy\/v0.11.1\/<\/a><\/li><li>EconML &#8211; <a href=\"https:\/\/github.com\/py-why\/EconML\">https:\/\/github.com\/py-why\/EconML<\/a><\/li><li>causal-learn &#8211; <a href=\"https:\/\/github.com\/py-why\/causal-learn\">https:\/\/github.com\/py-why\/causal-learn<\/a> and tetrad (<a href=\"https:\/\/github.com\/cmu-phil\/tetrad\">Java<\/a>)<\/li><li>PyWhy &#8211; <a href=\"https:\/\/www.pywhy.org\/\">https:\/\/www.pywhy.org\/<\/a><\/li><\/ul>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>On September 15, 2024, Craig Sexauer posted &#8220;CUPED Explained&#8221; where CUPED stands for Controlled-experiment Using Pre-Experiment Data, billed as &#8220;one of the most powerful algorithmic tools for increasing the speed and accuracy of experimentation programs.&#8221; Here is a list of &hellip; <a href=\"https:\/\/tiemensfamily.com\/timoncs\/2025\/05\/17\/reaction-to-cuped-article\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[18],"tags":[],"_links":{"self":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/posts\/1473"}],"collection":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/comments?post=1473"}],"version-history":[{"count":3,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/posts\/1473\/revisions"}],"predecessor-version":[{"id":1476,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/posts\/1473\/revisions\/1476"}],"wp:attachment":[{"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/media?parent=1473"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/categories?post=1473"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tiemensfamily.com\/timoncs\/wp-json\/wp\/v2\/tags?post=1473"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}