├── LICENSE
├── README.md
├── did_imputation.ado
├── did_imputation.sthlp
├── event_plot.ado
├── event_plot.sthlp
├── five_estimators_example.do
└── five_estimators_example.png

/LICENSE:
--------------------------------------------------------------------------------

                    GNU GENERAL PUBLIC LICENSE
                       Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

                            Preamble

The GNU General Public License is a free, copyleft license for software and other kinds of works.

The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.

To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.

Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions.

Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary.
To prevent this, the GPL assures that patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and modification follow.

                       TERMS AND CONDITIONS

0. Definitions.

"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.

"The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations.

To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work.

A "covered work" means either the unmodified Program or a work based on the Program.

To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.

To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.

1. Source Code.

The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work.

A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.

The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.

The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.

The Corresponding Source for a work in source code form is that same work.

2. Basic Permissions.

All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.

You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force.
You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.

Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.

3. Protecting Users' Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.

When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures.

4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:

    a) The work must carry prominent notices stating that you modified it, and giving a relevant date.

    b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices".

    c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.
    d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so.

A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:

    a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange.
    b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge.

    c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.

    d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.
    e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d.

A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.

A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.

"Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).

The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.

7. Additional Terms.

"Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law.
If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.

When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.

Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:

    a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or

    b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or

    c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or

    d) Limiting the use for publicity purposes of names of licensors or authors of the material; or

    e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or

    f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors.

All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.

If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.

Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.

8. Termination.

You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).

However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.

Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.

An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations.
If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.

You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.

11. Patents.

A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version".

A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.

In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.

If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.

A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.

Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.

12. No Surrender of Others' Freedom.

If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License.
If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.

13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 
613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | 621 | END OF TERMS AND CONDITIONS 622 | 623 | How to Apply These Terms to Your New Programs 624 | 625 | If you develop a new program, and you want it to be of the greatest 626 | possible use to the public, the best way to achieve this is to make it 627 | free software which everyone can redistribute and change under these terms. 628 | 629 | To do so, attach the following notices to the program. It is safest 630 | to attach them to the start of each source file to most effectively 631 | state the exclusion of warranty; and each file should have at least 632 | the "copyright" line and a pointer to where the full notice is found. 633 | 634 | 635 | Copyright (C) 636 | 637 | This program is free software: you can redistribute it and/or modify 638 | it under the terms of the GNU General Public License as published by 639 | the Free Software Foundation, either version 3 of the License, or 640 | (at your option) any later version. 641 | 642 | This program is distributed in the hope that it will be useful, 643 | but WITHOUT ANY WARRANTY; without even the implied warranty of 644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 645 | GNU General Public License for more details. 646 | 647 | You should have received a copy of the GNU General Public License 648 | along with this program. If not, see . 649 | 650 | Also add information on how to contact you by electronic and paper mail. 
651 | 652 | If the program does terminal interaction, make it output a short 653 | notice like this when it starts in an interactive mode: 654 | 655 | Copyright (C) 656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 657 | This is free software, and you are welcome to redistribute it 658 | under certain conditions; type `show c' for details. 659 | 660 | The hypothetical commands `show w' and `show c' should show the appropriate 661 | parts of the General Public License. Of course, your program's commands 662 | might be different; for a GUI interface, you would use an "about box". 663 | 664 | You should also get your employer (if you work as a programmer) or school, 665 | if any, to sign a "copyright disclaimer" for the program, if necessary. 666 | For more information on this, and how to apply and follow the GNU GPL, see 667 | . 668 | 669 | The GNU General Public License does not permit incorporating your program 670 | into proprietary programs. If your program is a subroutine library, you 671 | may consider it more useful to permit linking proprietary applications with 672 | the library. If this is what you want to do, use the GNU Lesser General 673 | Public License instead of this License. But first, please read 674 | . 675 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # did_imputation 2 | Event studies: robust and efficient estimation, testing, and plotting 3 | 4 | This is a Stata package for Borusyak, Jaravel, and Spiess (2023), "Revisiting Event Study Designs: Robust and Efficient Estimation" 5 | 6 | The package includes: 7 | 1) *did_imputation* command: for estimating causal effects & testing for pre-trends with the imputation method of Borusyak et al. 
8 | 2) *event_plot* command: for plotting event study graphs after did_imputation, after other robust estimators 9 | (by de Chaisemartin and D'Haultfoeuille, Callaway and Sant'Anna, and Sun and Abraham), and after conventional event study OLS 10 | 3) an example of using all five estimators in a simulated dataset and plotting the coefficients & confidence intervals for all of them at once. 11 | 12 | Please contact Kirill Borusyak at k.borusyak@berkeley.edu with any questions. 13 | -------------------------------------------------------------------------------- /did_imputation.ado: -------------------------------------------------------------------------------- 1 | *! did_imputation: Treatment effect estimation and pre-trend testing in staggered adoption diff-in-diff designs with the imputation approach of Borusyak, Jaravel, and Spiess (2023) 2 | *! Version: November 22, 2023 3 | *! Author: Kirill Borusyak 4 | *! Recent updates: project() option can no longer be combined with autosample, a combination that could lead to errors 5 | *!
Citation: Borusyak, Jaravel, and Spiess, "Revisiting Event Study Designs: Robust and Efficient Estimation" (2023) 6 | program define did_imputation, eclass sortpreserve 7 | version 13.0 8 | syntax varlist(min=4 max=4) [if] [in] [aw iw] [, wtr(varlist) sum Horizons(numlist >=0) ALLHorizons HBALance HETby(varname) PROject(varlist numeric) /// 9 | minn(integer 30) shift(integer 0) /// 10 | AUTOSample SAVEestimates(name) SAVEWeights LOADWeights(varlist) SAVEResid(name) /// 11 | AVGEFFectsby(varlist) fe(string) Controls(varlist) UNITControls(varlist) TIMEControls(varlist) /// 12 | CLUSter(varname) leaveout tol(real 0.000001) maxit(integer 100) verbose nose PREtrends(integer 0) delta(integer 0) alpha(real 0.05)] 13 | qui { 14 | if ("`verbose'"!="") noi di "Starting" 15 | ms_get_version reghdfe, min_version("5.7.3") 16 | ms_get_version ftools, min_version("2.37.0") 17 | // Part 1: Initialize 18 | marksample touse, novarlist 19 | if ("`controls'"!="") markout `touse' `controls' 20 | if ("`unitcontrols'"!="") markout `touse' `unitcontrols' 21 | if ("`timecontrols'"!="") markout `touse' `timecontrols' 22 | // if ("`timeinteractions'"!="") markout `touse' `timeinteractions' 23 | if ("`cluster'"!="") markout `touse' `cluster', strok 24 | if ("`saveestimates'"!="") confirm new variable `saveestimates' 25 | if ("`saveweights'"!="") confirm new variable `saveweights' 26 | if ("`verbose'"!="") noi di "#00" 27 | tempvar wei 28 | if ("`weight'"=="") { 29 | gen `wei' = 1 30 | local weiexp "" 31 | } 32 | else { 33 | gen `wei' `exp' 34 | replace `wei' = . 
if `wei'==0 35 | markout `touse' `wei' 36 | 37 | if ("`sum'"=="") { // unless a weighted sum is requested, normalize the weights to a reasonable scale for better numerical convergence 38 | sum `wei' if `touse' 39 | replace `wei' = `wei' * r(N)/r(sum) 40 | if ("`verbose'"!="") noi di "Normalizing weights by " %12.8f r(N)/r(sum) 41 | } 42 | local weiexp "[`weight'=`wei']" 43 | } 44 | local debugging = ("`verbose'"=="verbose") 45 | 46 | tokenize `varlist' 47 | local Y `1' 48 | local i `2' 49 | local t `3' 50 | local ei `4' 51 | markout `touse' `Y' `t' // missing `ei' is fine, indicates the never-treated group 52 | markout `touse' `i', strok 53 | 54 | tempvar D K 55 | 56 | // Process FE 57 | if ("`fe'"=="") local fe `i' `t' 58 | if ("`fe'"==".") { 59 | tempvar constant 60 | gen `constant' = 1 61 | local fe `constant' 62 | } 63 | local fecount = 0 64 | foreach fecurrent of local fe { 65 | if (("`fecurrent'"!="`i'" | "`unitcontrols'"=="") & ("`fecurrent'"!="`t'" | "`timecontrols'"=="")) { // skip i and t if there are corresponding interacted controls 66 | local ++fecount 67 | local fecopy `fecopy' `fecurrent' 68 | local fe`fecount' = subinstr("`fecurrent'","#"," ",.) 69 | markout `touse' `fe`fecount'', strok 70 | } 71 | } 72 | local fe `fecopy' 73 | 74 | // Figure out the delta 75 | if (`delta'==0) { 76 | cap tsset, noquery 77 | if (_rc==0) { 78 | if (r(timevar)=="`t'") { 79 | local delta = r(tdelta) 80 | if (`delta'!=1) noi di "Note: setting delta = `delta'" 81 | } 82 | } 83 | else local delta = 1 84 | } 85 | if (`delta'<=0 | mi(`delta')) { 86 | di as error "A problem has occurred while determining delta. Please specify it explicitly." 87 | error 198 88 | } 89 | 90 | if (`debugging') noi di "#1" 91 | gen `K' = (`t'-`ei'+`shift')/`delta' if `touse' 92 | cap assert mi(`K') | mod(`K',1)==0 93 | if (_rc!=0) { 94 | di as error "There are non-integer values of the number of periods since treatment. Please check the time dimension of your data."
95 | error 198 96 | } 97 | 98 | gen `D' = (`K'>=0 & !mi(`K')) if `touse' 99 | 100 | if ("`avgeffectsby'"=="") local avgeffectsby = "`ei' `t'" 101 | if ("`cluster'"=="") local cluster = "`i'" 102 | 103 | if ("`autosample'"!="" & "`sum'"!="") { 104 | di as error "Autosample cannot be combined with sum. Please specify the sample explicitly" 105 | error 184 106 | } 107 | if ("`autosample'"!="" & "`hbalance'"!="") { 108 | di as error "Autosample cannot be combined with hbalance. Please specify the sample explicitly" 109 | error 184 110 | } 111 | if ("`autosample'"!="" & "`project'"!="") { 112 | di as error "Autosample cannot be combined with project. Please specify the sample explicitly" 113 | error 184 114 | } 115 | if ("`project'"!="" & "`hetby'"!="") { 116 | di as error "Options project and hetby cannot be combined." 117 | error 184 118 | } 119 | if ("`project'"!="" & "`sum'"!="") { 120 | di as error "Options project and sum cannot be combined." // hetby and sum are fine: just add them up separately 121 | error 184 122 | } 123 | if ("`se'"=="nose" & "`saveweights'"!="") { 124 | di as error "Option saveweights is not available if nose is specified." 125 | error 184 126 | } 127 | if ("`se'"=="nose" & "`loadweights'"!="") { 128 | di as error "Option loadweights is not available if nose is specified." 129 | error 184 130 | } 131 | if ("`se'"=="nose" & "`saveresid'"!="") { 132 | di as error "Option saveresid is not available if nose is specified." 133 | error 184 134 | } 135 | if (`debugging') noi di "#2 `fe'" 136 | 137 | // Part 2: Prepare the variables with weights on the treated units (e.g. 
by horizon) 138 | local wtr_count : word count `wtr' 139 | local wtr_count_init = `wtr_count' 140 | if (`wtr_count'==0) { // if no wtr, use the simple average 141 | tempvar wtr 142 | gen `wtr' = 1 if (`touse') & (`D'==1) 143 | local wtrnames tau 144 | local wtr_count = 1 145 | } 146 | else { // create copies of the specified variables so that I can modify them later (adjust for weights, normalize) 147 | if (`wtr_count'==1) local wtrnames tau 148 | else local wtrnames "" // will fill it in the loop 149 | 150 | local wtr_new_list 151 | foreach v of local wtr { 152 | tempvar `v'_new 153 | gen ``v'_new' = `v' if `touse' 154 | local wtr_new_list `wtr_new_list' ``v'_new' 155 | if (`wtr_count'>1) local wtrnames `wtrnames' tau_`v' 156 | } 157 | local wtr `wtr_new_list' 158 | } 159 | 160 | * Horizons 161 | if (("`horizons'"!="" | "`allhorizons'"!="") & `wtr_count'>1) { 162 | di as error "Options horizons and allhorizons cannot be combined with multiple wtr variables" 163 | error 184 164 | } 165 | 166 | if ("`allhorizons'"!="") { 167 | if ("`horizons'"!="") { 168 | di as error "Options horizons and allhorizons cannot be combined" 169 | error 184 170 | } 171 | if ("`hbalance'"!="") di as error "Warning: combining hbalance with allhorizons may lead to very restricted samples. Consider specifying a smaller subset of horizons." 
172 | 173 | levelsof `K' if `touse' & `D'==1 & `wtr'!=0 & !mi(`wtr'), local(horizons) 174 | } 175 | 176 | if ("`horizons'"!="") { // Create a weights var for each horizon 177 | if ("`hbalance'"=="hbalance") { 178 | // Put zero weight on units for which we don't have all horizons 179 | tempvar in_horizons num_horizons_by_i min_weight_by_i max_weight_by_i 180 | local n_horizons = 0 181 | gen `in_horizons'=0 if `touse' 182 | foreach h of numlist `horizons' { 183 | replace `in_horizons'=1 if (`K'==`h') & `touse' 184 | local ++n_horizons 185 | } 186 | egen `num_horizons_by_i' = sum(`in_horizons') if `in_horizons'==1, by(`i') 187 | replace `wtr' = 0 if `touse' & (`in_horizons'==0 | (`num_horizons_by_i'<`n_horizons')) 188 | 189 | // Now check whether wtr and wei weights are identical across periods 190 | egen `min_weight_by_i' = min(`wtr'*`wei') if `touse' & `in_horizons'==1 & (`num_horizons_by_i'==`n_horizons'), by(`i') 191 | egen `max_weight_by_i' = max(`wtr'*`wei') if `touse' & `in_horizons'==1 & (`num_horizons_by_i'==`n_horizons'), by(`i') 192 | cap assert `max_weight_by_i'<=1.000001*`min_weight_by_i' if `touse' & `in_horizons'==1 & (`num_horizons_by_i'==`n_horizons') 193 | if (_rc>0) { 194 | di as error "Weights must be identical across periods for units in the balanced sample" 195 | error 498 196 | } 197 | drop `in_horizons' `num_horizons_by_i' `min_weight_by_i' `max_weight_by_i' 198 | } 199 | foreach h of numlist `horizons' { 200 | tempvar wtr`h' 201 | gen `wtr`h'' = `wtr' * (`K'==`h') 202 | local horlist `horlist' `wtr`h'' 203 | local hornameslist `hornameslist' tau`h' 204 | } 205 | local wtr `horlist' 206 | local wtrnames `hornameslist' 207 | } 208 | 209 | if ("`hetby'"!="") { // Split each wtr by values of hetby 210 | local hetby_type : type `hetby' 211 | local hetby_string = substr("`hetby_type'",1,3)=="str" 212 | if (`hetby_string'==0) { 213 | sum `hetby' if `touse' & (`D'==1) 214 | if r(min)<0 { 215 | di as error "The hetby variable cannot take negative 
values." 216 | error 411 217 | } 218 | cap assert `hetby'==round(`hetby') if `touse' & (`D'==1) 219 | if (_rc>0) { 220 | di as error "The hetby variable cannot take non-integer values." 221 | error 452 222 | } 223 | } 224 | levelsof `hetby' if `touse' & (`D'==1), local(hetby_values) 225 | if (`debugging') noi di `"Hetby_values: `hetby_values'"' 226 | if (r(r)>30) { 227 | di as error "The hetby variable takes too many (over 30) values" 228 | error 149 229 | } 230 | if (r(r)==0) { 231 | di as error "The hetby variable is always missing." 232 | error 148 233 | } 234 | local wtr_split 235 | local wtrnames_split 236 | local index = 1 237 | foreach v of local wtr { 238 | local wtrname : word `index' of `wtrnames' 239 | foreach g of local hetby_values { 240 | if (`hetby_string') gen `v'_`g' = `v' if `hetby'==`"`g'"' 241 | else gen `v'_`g' = `v' if `hetby'==`g' 242 | local wtr_split `wtr_split' `v'_`g' 243 | local wtrnames_split `wtrnames_split' `wtrname'_`g' 244 | } 245 | local ++index 246 | drop `v' 247 | } 248 | local wtr `wtr_split' 249 | local wtrnames `wtrnames_split' 250 | } 251 | 252 | if ("`sum'"=="" & "`project'"=="") { // If computing the mean (and not projecting), normalize each wtr variable such that sum(wei*wtr*(D==1))==1 253 | foreach v of local wtr { 254 | cap assert `v'>=0 if (`touse') & (`D'==1) 255 | if (_rc!=0) { 256 | di as error "Negative wtr weights are only allowed if the sum option is specified" 257 | error 9 258 | } 259 | sum `v' `weiexp' if (`touse') & (`D'==1) 260 | replace `v' = `v'/r(sum) // r(sum)=sum(`v'*`wei') 261 | } 262 | } 263 | 264 | if ("`project'"!="") { // So far assume all wtr have to be 0/1, e.g. as coming from horizons 265 | if (`wtr_count_init'>0) { 266 | di as error "The option project can be combined with horizons/allhorizons but not with wtr." 
// To rethink if they could be combined 267 | error 184 268 | } 269 | local wtr_project 270 | local wtrnames_project 271 | local index = 1 272 | tempvar one wtrsq 273 | gen `one' = 1 274 | gen `wtrsq' = . 275 | foreach v of local wtr { 276 | local wtrname : word `index' of `wtrnames' 277 | 278 | * Process the constant via FWL 279 | reg `one' `project' `weiexp' if `touse' & (`D'==1) & !mi(`v') & (`v'>0), nocon 280 | tempvar wtr_curr 281 | predict `wtr_curr' if `touse' & (`D'==1) & !mi(`v') & (`v'>0), resid 282 | replace `wtrsq' = `wei'*`wtr_curr'^2 283 | sum `wtrsq' 284 | if (r(sum)<1e-6) noi di "WARNING: Dropping `wtrname'_cons because of collinearity" 285 | else { 286 | replace `wtr_curr' = `wtr_curr'/r(sum) 287 | local wtr_project `wtr_project' `wtr_curr' 288 | local wtrnames_project `wtrnames_project' `wtrname'_cons 289 | } 290 | 291 | * Process other vars via FWL 292 | foreach r of local project { 293 | local otherproject : list project - r 294 | reg `r' `otherproject' `weiexp' if `touse' & (`D'==1) & !mi(`v') & (`v'>0) 295 | tempvar wtr_curr 296 | predict `wtr_curr' if `touse' & (`D'==1) & !mi(`v') & (`v'>0), resid 297 | replace `wtrsq' = `wei' * (`wtr_curr'^2) 298 | sum `wtrsq' 299 | if (r(sum)<1e-6) noi di "WARNING: Dropping `wtrname'_`r' because of collinearity" 300 | else { 301 | replace `wtr_curr' = `wtr_curr'/r(sum) 302 | local wtr_project `wtr_project' `wtr_curr' 303 | local wtrnames_project `wtrnames_project' `wtrname'_`r' 304 | } 305 | } 306 | local ++index 307 | } 308 | local wtr `wtr_project' 309 | local wtrnames `wtrnames_project' 310 | if ("`wtr'"=="") { 311 | di as error "Projection is not possible, most likely because of collinearity." 
312 | error 498 313 | } 314 | } 315 | 316 | if (`debugging') noi di "List: `wtr'" 317 | if (`debugging') noi di "Namelist: `wtrnames'" 318 | 319 | // Part 2A: initialize the matrices [used to be just before Part 5] 320 | local tau_num : word count `wtr' 321 | local ctrl_num : word count `controls' 322 | if (`debugging') noi di `tau_num' 323 | if (`debugging') noi di `"`wtr' | `wtrnames' | `controls'"' 324 | tempname b Nt 325 | matrix `b' = J(1,`tau_num'+`pretrends'+`ctrl_num',.) 326 | matrix `Nt' = J(1,`tau_num',.) 327 | if (`debugging') noi di "#4.0" 328 | 329 | // Part 3: Run the imputation regression and impute the controls for treated obs 330 | if ("`unitcontrols'"!="") local fe_i `i'##c.(`unitcontrols') 331 | if ("`timecontrols'"!="") local fe_t `t'##c.(`timecontrols') 332 | 333 | count if (`D'==0) & (`touse') 334 | if (r(N)==0) { 335 | if (`shift'==0) noi di as error "There are no untreated observations, i.e. those with `t'<`ei' or mi(`ei')." 336 | else noi di as error "There are no untreated observations, i.e. those with `t'<`ei'-`shift' or mi(`ei')." 337 | noi di as error "Please double-check the period & event time variables." 338 | noi di 339 | error 459 340 | } 341 | 342 | tempvar imput_resid 343 | if (`debugging') noi di "#4: reghdfe `Y' `controls' if (`D'==0) & (`touse') `weiexp', a(`fe_i' `fe_t' `fe', savefe) nocon keepsing resid(`imput_resid') cluster(`cluster')" 344 | if (`debugging') noi reghdfe `Y' `controls' if (`D'==0) & (`touse') `weiexp', a(`fe_i' `fe_t' `fe', savefe) nocon keepsing resid(`imput_resid') cluster(`cluster') 345 | else reghdfe `Y' `controls' if (`D'==0) & (`touse') `weiexp', a(`fe_i' `fe_t' `fe', savefe) nocon keepsing resid(`imput_resid') cluster(`cluster') verbose(-1) 346 | // nocon makes the constant recorded in the first FE 347 | // keepsing is important for when there are units available in only one period (e.g.
treated in period 2) which are fine 348 | // verbose(-1) suppresses singleton warnings 349 | local dof_adj = (e(N)-1)/(e(N)-e(df_m)-e(df_a)) * (e(N_clust)/(e(N_clust)-1)) // that's how reghdfe does dof adjustment with clusters, see reghdfe_common.mata line 634 350 | 351 | * Extrapolate the controls to the treatment group and construct Y0 (do it right away before the next reghdfe kills __hdfe*) 352 | if (`debugging') noi di "#5" 353 | tempvar Y0 354 | gen `Y0' = 0 if `touse' 355 | 356 | local feset = 1 // indexing as in reghdfe 357 | if ("`unitcontrols'"!="") { 358 | recover __hdfe`feset'__*, from(`i') 359 | replace `Y0' = `Y0' + __hdfe`feset'__ if `touse' 360 | local j=1 361 | foreach v of local unitcontrols { 362 | replace `Y0' = `Y0'+__hdfe`feset'__Slope`j'*`v' if `touse' 363 | local ++j 364 | } 365 | local ++feset 366 | } 367 | if ("`timecontrols'"!="") { 368 | recover __hdfe`feset'__*, from(`t') 369 | replace `Y0' = `Y0' + __hdfe`feset'__ if `touse' 370 | local j=1 371 | foreach v of local timecontrols { 372 | replace `Y0' = `Y0'+__hdfe`feset'__Slope`j'*`v' if `touse' 373 | local ++j 374 | } 375 | local ++feset 376 | } 377 | forvalues feindex = 1/`fecount' { // indexing as in the fe option 378 | recover __hdfe`feset'__, from(`fe`feindex'') 379 | replace `Y0' = `Y0' + __hdfe`feset'__ if `touse' 380 | local ++feset 381 | } 382 | foreach v of local controls { 383 | replace `Y0' = `Y0'+_b[`v']*`v' if `touse' 384 | } 385 | if (`debugging') noi di "#7" 386 | 387 | if ("`saveestimates'"=="") tempvar effect 388 | else { 389 | local effect `saveestimates' 390 | cap confirm var `effect', exact 391 | if (_rc==0) drop `effect' 392 | } 393 | gen `effect' = `Y' - `Y0' if (`D'==1) & (`touse') 394 | 395 | drop __hdfe* 396 | if (`debugging') noi di "#8" 397 | 398 | * Save control coefs and prepare weights corresponding to the controls to report them later 399 | if (`ctrl_num'>0) { 400 | forvalues h = 1/`ctrl_num' { 401 | local ctrl_current : word `h' of `controls' 402 | matrix
`b'[1,`tau_num'+`pretrends'+`h'] = _b[`ctrl_current'] 403 | local ctrlb`h' = _b[`ctrl_current'] 404 | local ctrlse`h' = _se[`ctrl_current'] 405 | } 406 | local ctrl_df = e(df_r) 407 | if (`debugging') noi di "#4B" 408 | local list_ctrl_weps 409 | if ("`se'"!="nose") { // Construct weights behind control estimates. [Could speed up by residualizing all relevant vars on FE first?] 410 | if (`debugging') noi di "#4C3" 411 | local ctrlvars "" // drop omitted vars from controls (so that residualization works correctly when computing SE?) 412 | forvalues h = 1/`ctrl_num' { 413 | local ctrl_current : word `h' of `controls' 414 | if (`ctrlb`h''!=0 | `ctrlse`h''!=0) local ctrlvars `ctrlvars' `ctrl_current' 415 | } 416 | if (`debugging') noi di "#4C4 `ctrlvars'" 417 | 418 | tempvar ctrlweight ctrlweight_product // ctrlweight_product=ctrlweight * ctrl_current 419 | forvalues h = 1/`ctrl_num' { 420 | if (`debugging') noi di "#4D `h'" 421 | tempvar ctrleps_w`h' 422 | if (`ctrlb`h''==0 & `ctrlse`h''==0) gen `ctrleps_w`h'' = 0 // omitted 423 | else { 424 | local ctrl_current : word `h' of `controls' 425 | //local rhsvars = subinstr(" `ctrlvars' "," `ctrl_current' "," ",.) 
426 | local rhsvars : list ctrlvars - ctrl_current 427 | reghdfe `ctrl_current' `rhsvars' `weiexp' if `touse' & `D'==0, a(`fe_i' `fe_t' `fe') cluster(`cluster') resid(`ctrlweight') 428 | replace `ctrlweight' = `ctrlweight' * `wei' 429 | gen `ctrlweight_product' = `ctrlweight' * `ctrl_current' 430 | sum `ctrlweight_product' if `touse' & `D'==0 431 | replace `ctrlweight' = `ctrlweight'/r(sum) 432 | egen `ctrleps_w`h'' = total(`ctrlweight' * `imput_resid') if `touse', by(`cluster') 433 | replace `ctrleps_w`h'' = `ctrleps_w`h'' * sqrt(`dof_adj') 434 | drop `ctrlweight' `ctrlweight_product' 435 | } 436 | local list_ctrl_weps `list_ctrl_weps' `ctrleps_w`h'' 437 | } 438 | } 439 | if (`debugging') noi di "#4.75 `list_ctrl_weps'" 440 | } 441 | 442 | // Check if imputation was successful, and apply autosample 443 | * For FE can just check they have been imputed everywhere 444 | tempvar need_imputation 445 | gen byte `need_imputation' = 0 446 | foreach v of local wtr { 447 | replace `need_imputation'=1 if `touse' & `D'==1 & `v'!=0 & !mi(`v') 448 | } 449 | replace `touse' = (`touse') & (`D'==0 | `need_imputation') // View as e(sample) all controls + relevant treatments only 450 | 451 | count if mi(`effect') & `need_imputation' 452 | if r(N)>0 { 453 | if (`debugging') noi di "#8b `wtr'" 454 | cap drop cannot_impute 455 | gen byte cannot_impute = mi(`effect') & `need_imputation' 456 | count if cannot_impute==1 457 | if ("`autosample'"=="") { 458 | noi di as error "Could not impute FE for " r(N) " observations. Those are saved in the cannot_impute variable. Use the autosample option if you would like those observations to be dropped from the sample automatically." 
459 | error 198 460 | } 461 | else { // drop the subsample where it didn't work and renormalize all wtr variables 462 | assert "`sum'"=="" 463 | local j = 1 464 | qui foreach v of local wtr { 465 | if (`debugging') noi di "#8d sum `v' `weiexp' if `touse' & `D'==1" 466 | local outputname : word `j' of `wtrnames' 467 | sum `v' `weiexp' if `touse' & `D'==1 // just a test that it added up to one first 468 | if (`debugging') noi di "#8dd " r(sum) 469 | assert abs(r(sum)-1)<10^-5 | abs(r(sum))<10^-5 // if this variable is always zero/missing, then the sum would be zero 470 | 471 | count if `touse' & `D'==1 & cannot_impute==1 & `v'!=0 & !mi(`v') 472 | local n_cannot_impute = r(N) // count the dropped units 473 | if (`n_cannot_impute'>0) { 474 | sum `v' `weiexp' if `touse' & `D'==1 & cannot_impute!=1 & `v'!=0 & !mi(`v') // those still remaining 475 | if (r(N)==0) { 476 | replace `v' = 0 if `touse' & `D'==1 // totally drop the wtr 477 | local autosample_drop `autosample_drop' `outputname' 478 | } 479 | else { 480 | replace `v' = `v'/r(sum) if `touse' & `D'==1 & cannot_impute!=1 481 | replace `v' = 0 if cannot_impute==1 482 | local autosample_trim `autosample_trim' `outputname' 483 | } 484 | } 485 | local ++j 486 | } 487 | if (`debugging') noi di "#8e" 488 | replace `touse' = `touse' & cannot_impute!=1 489 | if ("`autosample_drop'"!="") noi di "Warning: suppressing the following coefficients because FE could not be imputed for any units: `autosample_drop'." 490 | if ("`autosample_trim'"!="") noi di "Warning: part of the sample was dropped for the following coefficients because FE could not be imputed: `autosample_trim'." 
491 | } 492 | } 493 | * Compare model degrees of freedom [does not work correctly for timecontrols and unitcontrols, need to recompute] 494 | if (`debugging') noi di "#8c" 495 | tempvar tnorm 496 | gen `tnorm' = rnormal() if (`touse') & (`D'==0 | `need_imputation') 497 | reghdfe `tnorm' `controls' if (`D'==0) & (`touse'), a(`fe_i' `fe_t' `fe') nocon keepsing verbose(-1) 498 | local df_m_control = e(df_m) // model DoF corresponding to explicitly specified controls 499 | local df_a_control = e(df_a) // DoF for FE 500 | reghdfe `tnorm' `controls' , a(`fe_i' `fe_t' `fe') nocon keepsing verbose(-1) 501 | local df_m_full = e(df_m) 502 | local df_a_full = e(df_a) 503 | if (`debugging') noi di "#9 `df_m_control' `df_m_full' `df_a_control' `df_a_full'" 504 | if (`df_m_control'<`df_m_full') { 505 | di as error "Could not run imputation for some observations because some controls are collinear in the D==0 subsample but not in the full sample" 506 | if ("`autosample'"!="") di as error "Please note that autosample does not know how to deal with this. Please correct the sample manually" 507 | error 481 508 | } 509 | if (`df_a_control'<`df_a_full') { 510 | di as error "Could not run imputation for some observations because some absorbed variables/FEs are collinear in the D==0 subsample but not in the full sample" 511 | if ("`autosample'"!="") di as error "Please note that autosample does not know how to deal with this. Please correct the sample manually" 512 | error 481 513 | } 514 | 515 | 516 | // Part 4: Suppress wtr which have an effective sample size (for absolute weights of treated obs) that is too small 517 | local droplist 518 | tempvar abswei 519 | gen `abswei' = . 520 | local j = 1 521 | foreach v of local wtr { 522 | local outputname : word `j' of `wtrnames' 523 | replace `abswei' = abs(`v') if (`touse') & (`D'==1) 524 | sum `abswei' `weiexp' 525 | if (r(sum)!=0) { // o/w dropped earlier 526 | replace `abswei' = (`v'*`wei'/r(sum))^2 if (`touse') & (`D'==1) // !! 
Probably doesn't work with fw, not sure about pw; probably ok for aw 527 | sum `abswei' 528 | if (r(sum)>1/`minn') { // HHI is large => effective sample size is too small 529 | local droplist `droplist' `outputname' 530 | replace `v' = 0 if `touse' 531 | } 532 | } 533 | else local droplist `droplist' `outputname' // not ideal: should report those with no data at all separately (maybe together with autosample_drop?) 534 | local ++j 535 | } 536 | if ("`droplist'"!="") noi di "WARNING: suppressing the following coefficients from estimation because of insufficient effective sample size: `droplist'. To report them nevertheless, set the minn option to a smaller number or 0, but keep in mind that the estimates may be unreliable and their SE may be downward biased." 537 | 538 | if (`debugging') noi di "#9.5" 539 | 540 | // Part 5: pre-tests 541 | if (`pretrends'>0) { 542 | tempname pretrendvar 543 | tempvar preresid 544 | forvalues h = 1/`pretrends' { 545 | gen `pretrendvar'`h' = (`K'==-`h') if `touse' 546 | local pretrendvars `pretrendvars' `pretrendvar'`h' 547 | local prenames `prenames' pre`h' 548 | } 549 | if (`debugging') noi di "#9A reghdfe `Y' `controls' `pretrendvars' `weiexp' if `touse' & `D'==0, a(`fe_i' `fe_t' `fe') cluster(`cluster') resid(`preresid')" 550 | reghdfe `Y' `controls' `pretrendvars' `weiexp' if `touse' & `D'==0, a(`fe_i' `fe_t' `fe') cluster(`cluster') resid(`preresid') 551 | forvalues h = 1/`pretrends' { 552 | matrix `b'[1,`tau_num'+`h'] = _b[`pretrendvar'`h'] 553 | local preb`h' = _b[`pretrendvar'`h'] 554 | local prese`h' = _se[`pretrendvar'`h'] 555 | } 556 | local pre_df = e(df_r) 557 | if (`debugging') noi di "#9B" 558 | local list_pre_weps 559 | if ("`se'"!="nose") { // Construct weights behind pre-trend estimators.
Could speed up by residualizing all relevant vars on FE first 560 | matrix pre_b = e(b) 561 | if (`debugging') noi di "#9C1" 562 | matrix pre_V = e(V) 563 | if (`debugging') noi di "#9C2" 564 | local dof_adj = (e(N)-1)/(e(N)-e(df_m)-e(df_a)) * (e(N_clust)/(e(N_clust)-1)) // that's how reghdfe does dof adjustment with clusters, see reghdfe_common.mata line 634 565 | if (`debugging') noi di "#9C3" 566 | local pretrendvars "" // drop omitted vars from pretrendvars (so that residualization works correctly when computing SE) 567 | forvalues h = 1/`pretrends' { 568 | if (`preb`h''!=0 | `prese`h''!=0) local pretrendvars `pretrendvars' `pretrendvar'`h' 569 | } 570 | if (`debugging') noi di "#9C4 `pretrendvars'" 571 | 572 | tempvar preweight 573 | forvalues h = 1/`pretrends' { 574 | if (`debugging') noi di "#9D `h'" 575 | tempvar preeps_w`h' 576 | if (`preb`h''==0 & `prese`h''==0) gen `preeps_w`h'' = 0 // omitted 577 | else { 578 | local rhsvars = subinstr(" `pretrendvars' "," `pretrendvar'`h' "," ",.) 579 | reghdfe `pretrendvar'`h' `controls' `rhsvars' `weiexp' if `touse' & `D'==0, a(`fe_i' `fe_t' `fe') cluster(`cluster') resid(`preweight') 580 | replace `preweight' = `preweight' * `wei' 581 | sum `preweight' if `touse' & `D'==0 & `pretrendvar'`h'==1 582 | replace `preweight' = `preweight'/r(sum) 583 | egen `preeps_w`h'' = total(`preweight' * `preresid') if `touse', by(`cluster') 584 | replace `preeps_w`h'' = `preeps_w`h'' * sqrt(`dof_adj') 585 | drop `preweight' 586 | } 587 | local list_pre_weps `list_pre_weps' `preeps_w`h'' 588 | } 589 | } 590 | if (`debugging') noi di "#9.75" 591 | } 592 | 593 | // Part 6: Compute the effects 594 | count if `D'==0 & `touse' 595 | local Nc = r(N) 596 | 597 | count if `touse' 598 | local Nall = r(N) 599 | 600 | tempvar effectsum 601 | gen `effectsum' = . 
602 | local j = 1 603 | foreach v of local wtr { 604 | local outputname : word `j' of `wtrnames' 605 | if (`debugging') noi di "Reporting `j' `v' `outputname'" 606 | 607 | replace `effectsum' = `effect'*`v'*`wei' if (`D'==1) & (`touse') 608 | sum `effectsum' 609 | //ereturn scalar `outputname' = r(sum) 610 | matrix `b'[1,`j'] = r(sum) 611 | 612 | count if `D'==1 & `touse' & `v'!=0 & !mi(`v') 613 | matrix `Nt'[1,`j'] = r(N) 614 | 615 | local ++j 616 | } 617 | 618 | if (`debugging') noi di "#10" 619 | 620 | // Part 7: Report SE [can add a check that there are no conflicts in the residuals] 621 | if ("`se'"!="nose") { 622 | cap drop __w_* 623 | tempvar tag_clus resid0 624 | egen `tag_clus' = tag(`cluster') if `touse' 625 | gen `resid0' = `Y' - `Y0' if (`touse') & (`D'==0) 626 | if ("`loadweights'"=="") { 627 | local weightvars = "" 628 | foreach vn of local wtrnames { 629 | local weightvars `weightvars' __w_`vn' 630 | } 631 | if (`debugging') noi di "#11a imputation_weights `i' `t' `D' , touse(`touse') wtr(`wtr') saveweights(`weightvars') wei(`wei') fe(`fe') controls(`controls') unitcontrols(`unitcontrols') timecontrols(`timecontrols') tol(`tol') maxit(`maxit')" 632 | noi imputation_weights `i' `t' `D', touse(`touse') wtr(`wtr') saveweights(`weightvars') wei(`wei') /// 633 | fe(`fe') controls(`controls') unitcontrols(`unitcontrols') timecontrols(`timecontrols') /// 634 | tol(`tol') maxit(`maxit') `verbose' 635 | local Niter = r(iter) 636 | } 637 | else { 638 | local weightvars `loadweights' 639 | // Here can verify the supplied weights 640 | } 641 | 642 | local list_weps = "" 643 | local j = 1 644 | foreach v of local wtr { // to do: speed up by sorting for all wtr together 645 | if (`debugging') noi di "#11b `v'" 646 | local weightvar : word `j' of `weightvars' 647 | local wtrname : word `j' of `wtrnames' 648 | tempvar clusterweight smartweight smartdenom avgtau eps_w`j' // Need to regenerate every time in case the weights on treated are in conflict 649 | egen 
`clusterweight' = total(`wei'*`v') if `touse' & (`D'==1), by(`cluster' `avgeffectsby') 650 | egen `smartdenom' = total(`clusterweight' * `wei' * `v') if `touse' & (`D'==1), by(`avgeffectsby') 651 | gen `smartweight' = `clusterweight' * `wei' * `v' / `smartdenom' if `touse' & (`D'==1) 652 | replace `smartweight' = 0 if mi(`smartweight') & `touse' & (`D'==1) // if the denominator is zero, this avgtau won't matter 653 | egen `avgtau' = sum(`effect'*`smartweight') if (`touse') & (`D'==1), by(`avgeffectsby') 654 | 655 | if ("`saveresid'"=="") tempvar resid 656 | else local resid `saveresid'_`wtrname' 657 | 658 | gen `resid' = `resid0' 659 | replace `resid' = `effect'-`avgtau' if (`touse') & (`D'==1) 660 | if ("`leaveout'"!="") { 661 | if (`debugging') noi di "#11LO" 662 | count if `smartdenom'>0 & ((`clusterweight'^2)/`smartdenom'>0.99999) & (`touse') & (`D'==1) 663 | if (r(N)>0) { 664 | local outputname : word `j' of `wtrnames' // is this the correct variable name when some coefs have been dropped? 665 | di as error `"Cannot compute leave-out standard errors because of "' r(N) `" observations for coefficient "`outputname'""' 666 | di as error "This most likely happened because there are cohorts with only one unit or cluster (and the default value for avgeffectsby is used)." 667 | di as error "Consider using the avgeffectsby option with broader observation groups. Do not address this problem by using non-leave-out standard errors, as they may be downward biased for the same reason." 
668 | error 498 669 | } 670 | replace `resid' = `resid' * `smartdenom' / (`smartdenom'-(`clusterweight'^2)) if (`touse') & (`D'==1) 671 | } 672 | egen `eps_w`j'' = sum(`wei'*`weightvar'*`resid') if `touse', by(`cluster') 673 | 674 | local list_weps `list_weps' `eps_w`j'' 675 | drop `clusterweight' `smartweight' `smartdenom' `avgtau' 676 | if ("`saveresid'"=="") drop `resid' 677 | local ++j 678 | } 679 | if (`debugging') noi di "11c" 680 | tempname V 681 | if (`debugging') noi di "11d `list_weps' | `list_pre_weps' | `list_ctrl_weps'" 682 | matrix accum `V' = `list_weps' `list_pre_weps' `list_ctrl_weps' if `tag_clus', nocon 683 | if (`debugging') noi di "11e `wtrnames' | `prenames' | `controls'" 684 | matrix rownames `V' = `wtrnames' `prenames' `controls' 685 | matrix colnames `V' = `wtrnames' `prenames' `controls' 686 | if ("`saveweights'"=="" & "`loadweights'"=="") drop __w_* 687 | } 688 | 689 | // Part 8: report everything 690 | if (`debugging') noi di "#12" 691 | matrix colnames `b' = `wtrnames' `prenames' `controls' 692 | matrix colnames `Nt' = `wtrnames' 693 | ereturn post `b' `V', esample(`touse') depname(`Y') obs(`Nall') 694 | ereturn matrix Nt = `Nt' 695 | ereturn scalar Nc = `Nc' 696 | ereturn local depvar `Y' 697 | ereturn local cmd did_imputation 698 | ereturn local droplist `droplist' 699 | ereturn local autosample_drop `autosample_drop' 700 | ereturn local autosample_trim `autosample_trim' 701 | if ("`Niter'"!="") ereturn scalar Niter = `Niter' 702 | if (`pretrends'>0 & "`se'"!="nose") { 703 | test `prenames', df(`pre_df') 704 | ereturn scalar pre_F = r(F) 705 | ereturn scalar pre_p = r(p) 706 | ereturn scalar pre_df = `pre_df' 707 | } 708 | } 709 | 710 | local level = 100*(1-`alpha') 711 | _coef_table_header 712 | ereturn display, level(`level') 713 | 714 | end 715 | 716 | // Additional program that computes the weights corresponding to the imputation estimator and saves them in a variable 717 | cap program drop imputation_weights 718 | program define 
imputation_weights, rclass sortpreserve 719 | syntax varlist(min=3 max=3), touse(varname) wtr(varlist) SAVEWeights(namelist) wei(varname) /// 720 | [tol(real 0.000001) maxit(integer 1000) fe(string) Controls(varlist) UNITControls(varlist) TIMEControls(varlist) verbose] 721 | // Weights of the imputation procedure given wtr for controls = - X0 * (X0'X0)^-1 * X1' * wtr but we get them via iterative procedure 722 | // k<0 | k==. is control 723 | // Observation weights are in wei; wtr should be specified BEFORE applying the wei, and the output is before applying them too, i.e. estimator = sum(wei*saveweights*Y) 724 | qui { 725 | // Part 1: Initialize 726 | local debugging = ("`verbose'"=="verbose") 727 | if (`debugging') noi di "#IW1" 728 | tokenize `varlist' 729 | local i `1' 730 | local t `2' 731 | local D `3' 732 | 733 | local wcount : word count `wtr' 734 | local savecount : word count `saveweights' 735 | assert `wcount'==`savecount' 736 | forvalues j = 1/`wcount' { 737 | local wtr_j : word `j' of `wtr' 738 | local saveweights_j : word `j' of `saveweights' 739 | gen `saveweights_j' = `wtr_j' 740 | replace `saveweights_j' = 0 if mi(`saveweights_j') & `touse' 741 | tempvar copy`saveweights_j' 742 | gen `copy`saveweights_j'' = `saveweights_j' 743 | } 744 | 745 | local fecount = 0 746 | foreach fecurrent of local fe { 747 | local ++fecount 748 | local fe`fecount' = subinstr("`fecurrent'","#"," ",.) 
749 | } 750 | 751 | if (`debugging') noi di "#IW2" 752 | 753 | // Part 2: Demean & construct denom for weight updating 754 | if ("`unitcontrols'"!="") { 755 | tempvar N0i 756 | egen `N0i' = sum(`wei') if (`touse') & `D'==0, by(`i') 757 | } 758 | if ("`timecontrols'"!="") { 759 | tempvar N0t 760 | egen `N0t' = sum(`wei') if (`touse') & `D'==0, by(`t') 761 | } 762 | forvalues feindex = 1/`fecount' { 763 | tempvar N0fe`feindex' 764 | egen `N0fe`feindex'' = sum(`wei') if (`touse') & `D'==0, by(`fe`feindex'') 765 | } 766 | 767 | foreach v of local controls { 768 | tempvar dm_`v' c`v' 769 | sum `v' [aw=`wei'] if `D'==0 & `touse' // demean such that the mean is zero in the control sample 770 | gen `dm_`v'' = `v'-r(mean) if `touse' 771 | egen `c`v'' = sum(`wei' * `dm_`v''^2) if `D'==0 & `touse' 772 | } 773 | 774 | foreach v of local unitcontrols { 775 | tempvar u`v' dm_u`v' s_u`v' 776 | egen `s_u`v'' = pc(`wei') if `D'==0 & `touse', by(`i') prop 777 | egen `dm_u`v'' = sum(`s_u`v'' * `v') if `touse', by(`i') // this automatically includes it in `D'==1 as well 778 | replace `dm_u`v'' = `v' - `dm_u`v'' if `touse' 779 | egen `u`v'' = sum(`wei' * `dm_u`v''^2) if `D'==0 & `touse', by(`i') 780 | drop `s_u`v'' 781 | } 782 | foreach v of local timecontrols { 783 | tempvar t`v' dm_t`v' s_t`v' 784 | egen `s_t`v'' = pc(`wei') if `D'==0 & `touse', by(`t') prop 785 | egen `dm_t`v'' = sum(`s_t`v'' * `v') if `touse', by(`t') // this automatically includes it in `D'==1 as well 786 | replace `dm_t`v'' = `v' - `dm_t`v'' if `touse' 787 | egen `t`v'' = sum(`wei' * `dm_t`v''^2) if `D'==0 & `touse', by(`t') 788 | drop `s_t`v'' 789 | } 790 | if (`debugging') noi di "#IW3" 791 | 792 | // Part 3: Iterate 793 | local it = 0 794 | local keepiterating `saveweights' 795 | tempvar delta 796 | gen `delta' = 0 797 | while (`it'<`maxit' & "`keepiterating'"!="") { 798 | if (`debugging') noi di "#IW it `it': `keepiterating'" 799 | // Simple controls 800 | foreach v of local controls { 801 | update_weights 
`dm_`v'' , w(`keepiterating') wei(`wei') d(`D') touse(`touse') denom(`c`v'')
802 | }
803 | 
804 | // Unit-interacted continuous controls
805 | foreach v of local unitcontrols {
806 | update_weights `dm_u`v'' , w(`keepiterating') wei(`wei') d(`D') touse(`touse') denom(`u`v'') by(`i')
807 | }
808 | if ("`unitcontrols'"!="") update_weights , w(`keepiterating') wei(`wei') d(`D') touse(`touse') denom(`N0i') by(`i') // could speed up a bit by skipping this if we have i#something later
809 | 
810 | // Time-interacted continuous controls
811 | foreach v of local timecontrols {
812 | update_weights `dm_t`v'' , w(`keepiterating') wei(`wei') d(`D') touse(`touse') denom(`t`v'') by(`t')
813 | }
814 | if ("`timecontrols'"!="") update_weights , w(`keepiterating') wei(`wei') d(`D') touse(`touse') denom(`N0t') by(`t') // could speed up a bit by skipping this if we have t#something later
815 | 
816 | // FEs
817 | forvalues feindex = 1/`fecount' {
818 | update_weights , w(`keepiterating') wei(`wei') d(`D') touse(`touse') denom(`N0fe`feindex'') by(`fe`feindex'')
819 | }
820 | 
821 | // Check for which coefs the weights have changed, keep iterating for them
822 | local newkeepit
823 | foreach w of local keepiterating {
824 | replace `delta' = abs(`w'-`copy`w'')
825 | sum `delta' if `D'==0 & `touse'
826 | if (`debugging') noi di "#IW it `it' `w' " r(sum)
827 | if (r(sum)>`tol') local newkeepit `newkeepit' `w'
828 | replace `copy`w'' = `w'
829 | }
830 | local keepiterating `newkeepit'
831 | local ++it
832 | }
833 | if ("`keepiterating'"!="") {
834 | noi di as error "Convergence of standard errors is not achieved for coefs: `keepiterating'."
835 | noi di as error "Try increasing the tolerance or the number of iterations, or use the nose option to obtain point estimates without SE."
836 | error 430
837 | }
838 | return scalar iter = `it'
839 | }
840 | end
841 | 
842 | cap program drop update_weights // warning: intentionally destroys sorting
843 | program define update_weights, rclass
844 | syntax [varname(default=none)] , w(varlist) wei(varname) d(varname) touse(varname) denom(varname) [by(varlist)]
845 | // varlist = variable on which to residualize (if empty, a constant is assumed, as for any FE) [currently at most one variable is supported]
846 | // w = variable storing the weights to be updated
847 | // wei = observation weights
848 | // touse = variable defining sample
849 | // denom = variable storing sum(`wei'*`varlist'^2) if `d'==0, by(`by')
850 | qui {
851 | tempvar sumw
852 | tokenize `varlist'
853 | if ("`1'"=="") local 1 = "1"
854 | if ("`by'"!="") sort `by'
855 | foreach w_j of local w {
856 | // noi di "#UW 5 `w_j': `1' by(`by') " // debug output, disabled
857 | egen `sumw' = total(`wei' * `w_j' * `1') if `touse', by(`by')
858 | replace `w_j' = `w_j'-`sumw'*`1'/`denom' if `d'==0 & `denom'!=0 & `touse'
859 | assert !mi(`w_j') if `touse'
860 | drop `sumw'
861 | }
862 | }
863 | end
864 | 
865 | // When there is a variable that only varies by `from' but is missing for some observations, fill in its missing values wherever possible
866 | cap program drop recover
867 | program define recover, sortpreserve
868 | syntax varlist, from(varlist)
869 | foreach var of local varlist {
870 | gsort `from' -`var'
871 | by `from' : replace `var' = `var'[1] if mi(`var')
872 | }
873 | end
874 | 
875 | 
--------------------------------------------------------------------------------
/did_imputation.sthlp:
--------------------------------------------------------------------------------
1 | {smcl}
2 | {* *! 
version 3.1 2023-11-22}{...} 3 | {vieweralsosee "reghdfe" "help reghdfe"}{...} 4 | {vieweralsosee "event_plot" "help event_plot"}{...} 5 | {viewerjumpto "Syntax" "did_imputation##syntax"}{...} 6 | {viewerjumpto "Description" "did_imputation##description"}{...} 7 | {viewerjumpto "What if imputation is not possible" "did_imputation##impfails"}{...} 8 | {viewerjumpto "Options" "did_imputation##options"}{...} 9 | {viewerjumpto "Weights" "did_imputation##weights"}{...} 10 | {viewerjumpto "Stored results" "did_imputation##results"}{...} 11 | {viewerjumpto "Usage examples" "did_imputation##usage"}{...} 12 | {title:Title} 13 | 14 | {pstd} 15 | {bf:did_imputation} - Treatment effect estimation and pre-trend testing in event studies: difference-in-differences designs with staggered adoption of treatment, using the imputation approach of Borusyak, Jaravel, and Spiess (April 2023) 16 | 17 | {marker syntax}{...} 18 | {title:Syntax} 19 | 20 | {phang} 21 | {cmd: did_imputation} {it:Y i t Ei} [if] [in] [{help did_imputation##weights:estimation weights}] [{cmd:,} {help did_imputation##options:options}] 22 | {p_end} 23 | 24 | {synoptset 8 tabbed}{...} 25 | {synopt : {it:Y}}outcome variable {p_end} 26 | {synopt : {it:i}}variable for unique unit id{p_end} 27 | {synopt : {it:t}}variable for calendar period {p_end} 28 | {synopt : {it:Ei}}variable for unit-specific date of treatment (missing = never-treated) {p_end} 29 | 30 | {phang} {it: Note:} These main parameters imply: {p_end} 31 | {phang3}- the treatment indicator: {it:D=1[t>=Ei]}; {p_end} 32 | {phang3}- "relative time", i.e. the number of periods since treatment: {it:K=(t-Ei)} (possibly adjusted by the {opt shift} and {opt delta} options described below). {p_end} 33 | 34 | {phang} 35 | {it: Note}: {cmd:did_imputation} requires a recent version of {help reghdfe}. If you get error messages (e.g. {it:r(123)} or {it:"verbose must be between 0 and 5"}), please (re)install {cmd:reghdfe} to make sure you have the most recent version. 
36 | {p_end} 37 | 38 | {phang} 39 | {it: Note}: Before emailing the authors about errors, please read the {help did_imputation##bugs:Bug reporting} section of this helpfile. 40 | {p_end} 41 | 42 | {marker description}{...} 43 | {title:Description} 44 | 45 | {pstd} 46 | {bf:did_imputation} estimates the effects of a binary treatment with staggered rollout allowing for arbitrary heterogeneity and dynamics of causal effects, using the imputation estimator of Borusyak et al. (2023). 47 | {p_end} 48 | 49 | {pstd} 50 | The benchmark case is with panel data, in which each unit {it:i} that gets treated as of period {it:Ei} stays treated forever; 51 | some units may never be treated. Other types of data (e.g. repeated cross-sections) and other designs (e.g. triple-diffs) are also allowed; 52 | see {help did_imputation##usage:Usage examples}. 53 | {p_end} 54 | 55 | {pstd}Estimation proceeds in three steps:{p_end} 56 | 57 | {p2col 5 8 8 0 : 1.}{ul:Estimate} a model for non-treated potential outcomes using the non-treated (i.e. never-treated or not-yet-treated) 58 | observations only. The benchmark model for diff-in-diff designs is a two-way fixed effect (FE) model: Y_it = a_i + b_t + eps_it, 59 | but other FEs, controls, etc., are also allowed.{p_end} 60 | {p2col 5 8 8 0 : 2.}{ul:Extrapolate} the model from Step 1 to treated observations, {ul:imputing} non-treated potential outcomes Y_it(0), 61 | and obtain an estimate of the treatment effect {it: tau_it = Y_it - Y_it(0)} for each treated observation. (See {help did_imputation##impfails:What if imputation is not possible}){p_end} 62 | {p2col 5 8 8 0 : 3.}{ul:Take averages} of estimated treatment effects corresponding to the estimand of interest.{p_end} 63 | 64 | {pstd} 65 | A pre-trend test (for the assumptions of parallel trends and no anticipation) is a separate exercise 66 | (see the {opt pretrends} option). 
Regardless of whether the pre-trend test is performed, the reference group
67 | for estimation is always all pre-treatment (or never-treated) observations.
68 | {p_end}
69 | 
70 | {pstd}
71 | To make "event study" plots, please use the accompanying command {help event_plot}.
72 | {p_end}
73 | 
74 | {marker impfails}{...}
75 | {title:What if imputation is not possible}
76 | 
77 | {phang}The imputation step (Step 2) is not always possible for all treated observations:{p_end}
78 | {phang2}- With unit FEs, imputation is not possible for units treated in all periods in the sample;{p_end}
79 | {phang2}- With period FEs, it is impossible to isolate the period FE from the variation in treatment effects
80 | in a period when all units have already been treated (and if there are never-treated units);{p_end}
81 | {phang2}- If you include group#period FEs, imputation is further impossible once all units {it:in the group} have been treated;{p_end}
82 | {phang2}- Similar issues arise with other covariates in the model of Y(0).{p_end}
83 | 
84 | {phang}This is a fundamental issue: the model you specified does not allow one to obtain unbiased estimates of treatment effects
85 | for those observations (without restrictions on treatment effects; see Borusyak et al. 2023).{p_end}
86 | 
87 | {phang}
88 | If this problem arises (i.e. there is at least one treated observation that enters one of the estimands of interest with a non-zero weight
89 | and for which the treatment effect cannot be imputed),
90 | the command will throw an error and generate a dummy variable {it:cannot_impute} which equals one for those observations.{p_end}
91 | 
92 | {phang}You have two ways to proceed:{p_end}
93 | {phang}- Modify the estimand, excluding those observations manually: via the {opt if} clause or by setting the weights on them to zero
94 | (via the {opt wtr} option);{p_end}
95 | {phang}- Specify the {opt autosample} option that will do this automatically in most cases.
But we recommend that you still review {it:cannot_impute}
96 | to understand what estimand you will be getting.{p_end}
97 | 
98 | {marker options}{...}
99 | {title:Options}
100 | 
101 | {dlgtab:Model of Y(0)}
102 | 
103 | {phang}{opt fe(list of FE)}: which FE to include in the model of Y(0). Default is {opt fe(i t)} for the diff-in-diff (two-way FE) model.
104 | But you can include fewer FEs, e.g. just period FE {opt fe(t)} or, for repeated cross-sections at the individual level, {opt fe(state t)}.
105 | Or you can have more FEs: e.g. {bf:fe(}{it:i t{cmd:#}state}{bf:)} (for state-by-year FE with county-level data)
106 | or {bf:fe(}{it:i{cmd:#}dow t}{bf:)} (for unit by day-of-week FE). Each member of the list has to look like
107 | {it:v1{cmd:#}v2{cmd:#}...{cmd:#}vk}. If you want no FE at all, specify {opt fe(.)}.
108 | {p_end}
109 | 
110 | {phang}{opt c:ontrols(varlist)}: list of continuous time-varying controls. (For dummy-variable controls, e.g. gender, please use the
111 | {opt fe} option for better convergence.){p_end}
112 | 
113 | {phang}{opt unitc:ontrols(varlist)}: list of continuous controls (often unit-invariant) to be included {it:interacted} with unit dummies.
114 | E.g. with {opt unitcontrols(year)} the regression includes unit-specific trends.
115 | (For binary controls interacted with unit dummies, use the {opt fe} option.) {p_end}
116 | 
117 | {pmore}{it:Use with caution}: the command may not recognize that imputation is not possible for some treated observations.
118 | For example, a unit-specific trend cannot be estimated if only one pre-treatment observation is available for the unit, but it is not
119 | guaranteed that the command will throw an error.
120 | {p_end}
121 | 
122 | {phang}{opt timec:ontrols(varlist)}: list of continuous controls (often time-invariant) to be included {it:interacted} with period dummies.
123 | E.g. with {opt timecontrols(population)} the regression includes {it:i.year#c.population}.
124 | (For binary controls interacted with period dummies, use the {opt fe} option.){p_end}
125 | 
126 | {pmore}{it:Use with caution}: the command may not recognize that imputation is not possible for some treated observations.
127 | {p_end}
128 | 
129 | {dlgtab:Estimands and Pre-trends}
130 | 
131 | {phang}{opt wtr(varlist)}: A list of variables, manually defining estimands of interest by storing the weights on the treated observations.{p_end}
132 | {phang2}- If {help did_imputation##weights:estimation weights} ({bf:aw/iw/fw}) are used, {opt wtr} weights will be applied {it:in addition}
133 | to those weights in defining the estimand.{p_end}
134 | {phang2}- If nothing is specified, the default is the simple ATT across all treated observations (or, with {opt horizons} or {opt allhorizons}, by horizon). So {opt wtr}=1/number of the relevant observations. {p_end}
135 | {phang2}- Values of {opt wtr} for untreated observations are ignored (except as initial values in the iterative procedure for computing SE).{p_end}
136 | {phang2}- Values below 0 are only allowed if {opt sum} is also specified (i.e. for weighted sums and not weighted averages).{p_end}
137 | {phang2}- Using multiple {opt wtr} variables is faster than running {cmd:did_imputation} for each of them separately, and produces a joint variance-covariance matrix.{p_end}
138 | 
139 | {phang}{opt sum}: if specified, the weighted {it:sum}, rather than average, of treatment effects is computed (overall or by horizons).
140 | With {opt sum} specified, it's OK for some {opt wtr} values to be negative or even to add up to zero;
141 | this is useful, for example, to estimate the difference between two weighted averages of treatment effects
142 | (e.g. across horizons or between men and women).{p_end}
143 | 
144 | {phang}{opt h:orizons(numlist)}: if specified, weighted averages/sums of treatment effects will be reported for each of these horizons separately
145 | (i.e.
tau0 for the treatment period, tau1 for one period after treatment, etc.). Horizons which are not specified will be ignored.
146 | Each horizon must be a non-negative integer.{p_end}
147 | 
148 | {phang}{opt allh:orizons}: picks all non-negative horizons available in the sample.{p_end}
149 | 
150 | {phang}{opt hbal:ance}: if specified together with a list of horizons, estimands for each of the horizons will be based
151 | only on the subset of units for which observations for all chosen horizons are available
152 | (note that by construction this means that the estimands will be based on different periods).
153 | If {opt wtr} or estimation weights are specified, the researcher needs to make sure that the weights are constant over time
154 | for the relevant units---otherwise proper balancing is impossible and an error will be thrown.
155 | Note that excluded units will still be used in Step 1 (e.g. to recover the period FEs) and for pre-trend tests.{p_end}
156 | 
157 | {phang}{opt het:by(varname)}: reports estimands separately by subgroups defined by the discrete (non-negative integer or string)
158 | variable provided. This is the preferred option for treatment effect heterogeneity analyses (but see also {opt project}).{p_end}
159 | 
160 | {phang}{opt pro:ject(varlist)}: projects (i.e., regresses) treatment effect estimates on a set of numeric variables
161 | and reports the constant and slope coefficients. The variables should not be collinear. (To analyse effect heterogeneity
162 | by subgroup, option {opt hetby} is preferred. Note that standard errors may not agree exactly between {opt hetby} and
163 | {opt project}.){p_end}
164 | 
165 | {phang}{opt minn(#)}: the minimum effective number (i.e. inverse Herfindahl index) of treated observations,
166 | below which a coefficient is suppressed and a warning is issued.
167 | Inference on coefficients based on a small number of observations is unreliable. The default is {opt minn(30)}.
168 | Set to {opt minn(0)} to report all coefficients nevertheless.{p_end}
169 | 
170 | {phang}{opt autos:ample}: if specified, the observations for which FE cannot be imputed will be automatically dropped from the sample,
171 | with a warning issued. Otherwise an error will be thrown if any such observations are found.
172 | {opt autosample} cannot be combined with {opt sum} or {opt hbalance};
173 | please specify the sample explicitly if using one of those options and you get an error that imputation has failed (see {help did_imputation##impfails:What if imputation is not possible}).{p_end}
174 | 
175 | {phang}{opt shift(integer)}: specify to allow for anticipation effects.
176 | The command will pretend that treatment happened {opt shift} periods earlier for each treated unit.{p_end}
177 | {phang2}- Do NOT use this option for pre-trend testing;
178 | use it if anticipation effects are expected in your setting.
179 | (This option {it:can} be used for a placebo test but we recommend a pretrend test instead; see Section 4.4 of Borusyak et al. 2023.){p_end}
180 | {phang2}- The command's output will be labeled relative to the shifted treatment date {it:Ei}-shift.
181 | For example, with {opt horizons(0/10)} {opt shift(3)} you will get coefficients {it:_b[tau0]}...{it:_b[tau10]} where tau{it:h}
182 | is the effect {it:h} periods after the shifted treatment. That is, {it:tau1} corresponds to the average anticipation effect 2 periods before
183 | the actual treatment, while {it:tau8} to the average effect 5 periods after the actual treatment.{p_end}
184 | 
185 | {phang}{opt pre:trends(integer)}: if some value {it:k}>0 is specified, the command will perform a test for parallel trends,
186 | by a {bf:separate} regression on nontreated observations only: of the outcome on the dummies for 1,...,{it:k} periods before treatment,
187 | in addition to all the FE and controls. The coefficients are reported as {bf:pre}{it:1},...,{bf:pre}{it:k}.
188 | The F-statistic (from the cluster-robust Wald test), the corresponding pvalue, and the degrees of freedom are reported in {res:e(pre_F)},
189 | {res:e(pre_p)}, and {res:e(pre_df)} resp.{p_end}
190 | {phang2}- Use a reasonable number of pre-trends; do not use all of the available ones unless you have a really large never-treated group. With too many pre-trend coefficients, the power of the joint test will be lower.{p_end}
191 | {phang2}- The entire sample of nontreated observations is always used for pre-trend tests, regardless of {opt hbalance} and other options that restrict the sample for post-treatment effect estimation.{p_end}
192 | {phang2}- The number of pretrend coefficients does not affect the post-treatment effect estimates, which are always computed under the assumption of parallel trends and no anticipation.{p_end}
193 | {phang2}- The reference group for the pretrend test is all periods more than {it:k} periods prior to the event date (and all never-treated
194 | observations, if available).{p_end}
195 | {phang2}- Because of this reference group, it is expected that the SE are the largest for pre1 (opposite from some conventional tests).{p_end}
196 | {phang2}- This is only one of many tests for the parallel trends and no anticipation assumptions. Others are easy to implement manually; please see
197 | the paper for the discussion.{p_end}
198 | 
199 | {dlgtab:Standard errors}
200 | 
201 | {phang}{opt clus:ter(varname)}: cluster SE within groups defined by this variable. Default is {it:i}. {p_end}
202 | 
203 | {phang}{opt avgeff:ectsby(varlist)}: Use this option (and/or {opt leaveout}) if you have small cohorts of treated observations, and after reviewing
204 | Section 4.3 of Borusyak et al. (2023). In brief, SE computation requires averaging the treatment effects by groups of treated observations.{p_end}
205 | {phang2}- These groups should be large enough, so that there is no downward bias from overfitting.
{p_end}
206 | {phang2}- But the larger they are, the more conservative SE will be, unless treatment effects are homogeneous within these groups. {p_end}
207 | {phang2}- The varlist in {opt avgeffectsby} defines these groups.{p_end}
208 | {phang2}- The default is cohort-years {opt avgeffectsby(Ei t)}, which is appropriate for large cohorts.{p_end}
209 | {phang2}- With small cohorts, specify coarser groupings: e.g. {opt avgeffectsby(K)} (to pool across cohorts) or
210 | {opt avgeffectsby(D)} (to pool across cohorts and periods when computing the overall ATT). {p_end}
211 | {phang2}- The averages are computed using the "smart" formula from Section 4.3, adjusted for any clustering and any choice of {opt avgeffectsby}. {p_end}
212 | 
213 | {phang}{opt leaveout}: {it:Recommended option}. In particular, use it (and/or {opt avgeffectsby}) if you have small cohorts of treated observations.
214 | The averages of treatment effects will be computed excluding the own unit (or, more generally, cluster).
215 | See Section 4.3 of Borusyak et al. (2023) for details.{p_end}
216 | 
217 | {phang}{opt alpha(real)}: confidence intervals will be displayed corresponding to that significance level. Default is 0.05.{p_end}
218 | 
219 | {phang}{opt nose}: do not produce standard errors (much faster).{p_end}
220 | 
221 | {dlgtab:Miscellaneous}
222 | 
223 | {phang}{opt save:estimates(newvarname)}: if specified, a new variable will be created, storing the estimate of the treatment effect for each observation.
224 | The researcher can then construct weighted averages of interest manually (but without SE). {p_end}
225 | {pmore}{it:Note}: Individual estimates are of course not consistent. But weighted sums of many of them typically are, which is what {cmd:did_imputation} generally reports. {p_end}
226 | 
227 | {phang}{opt savew:eights}: if specified, new variables {it:__w_*} are generated, storing the weights corresponding to (each) coefficient.
228 | Recall that the imputation estimator is a linear estimator that can be represented as a weighted sum of the outcomes.{p_end}
229 | {phang2}- These weights are applied on top of any {help did_imputation##weights:estimation weights}.{p_end}
230 | {phang2}- For treated observations these weights equal the corresponding {opt wtr} - that's why the estimator is unbiased
231 | under arbitrary treatment effect heterogeneity.{p_end}
232 | {phang2}- If a weighted average is estimated (i.e. {opt sum} is not specified) and there are no estimation weights,
233 | the weights add up to one across all treated observations.{p_end}
234 | {phang2}- With unit and period FEs, weights add up to zero for every unit and time period (when weighted by estimation weights).{p_end}
235 | 
236 | {phang}{opt loadw:eights(varlist)}: use this to speed up the analysis of different outcome variables with an identical specification
237 | on an identical sample. To do so, provide the set of the weight variables (__w*, but can be renamed),
238 | saved using the {opt saveweights} option when running the analysis for the first outcome.
239 | [Warning: the validity of the weights is assumed and not double-checked.] [Currently works only if the variables have been renamed so that their names do not match __w*.]{p_end}
240 | 
241 | {phang}{opt saver:esid(name)}: if specified, a new variable {it:name}_* is generated for each estimand (e.g. {it:name}_tau0) to store
242 | model residuals used in the computation of standard errors. For untreated observations, they are residuals from the estimation step. For
243 | treated observations, they are the epsilon-tildes from the BJS Theorem 3 (based on the {opt avgeffectsby} option) or the leave-out
244 | versions from Appendix A.6. The residuals may be heterogeneous across estimands in general --- see equation (8) in the paper
245 | which depends on the estimator weights, and thus on the estimand.
This option can be helpful for reproducing the standard errors manually.{p_end}
246 | 
247 | {phang}{opt delta(integer)}: indicates that one period should correspond to {opt delta} steps of {it:t} and {it:Ei}.
248 | Default is 1, except when the time dimension of the data is set (via {help tsset} or {help xtset});
249 | in that case the default is the corresponding delta.{p_end}
250 | 
251 | {phang}{opt tol(real)}, {opt maxit(integer)}: tolerance and the maximum number of iterations.
252 | This affects the iterative procedure used to search for the weights underlying the estimator (to produce SE).
253 | Defaults are 10^-6 and 100, resp. If convergence is not achieved otherwise, try increasing them.{p_end}
254 | 
255 | {phang}{opt verbose}: specify for the debugging mode.{p_end}
256 | 
257 | 
258 | {marker weights}{...}
259 | {title:Estimation weights}
260 | 
261 | {phang} Estimation weights (only {opt aw} or {opt iw} are allowed) play two roles:
262 | 
263 | {p2col 5 8 8 0 : 1.}Step 1 estimation is done by a weighted regression. This is most efficient when the variance
264 | of the error terms is inversely proportional to the weights.
265 | 
266 | {p2col 5 8 8 0 : 2.}In Step 3, the average that defines the estimand is weighted by these weights.
267 | If {opt wtr} is specified explicitly, estimation weights are applied on top of them.{p_end}
268 | {phang2}- If {opt sum} is not specified, the estimand is the average of treatment effects weighted by {opt wtr}*{opt aw} {p_end}
269 | {phang2}- If {opt sum} is specified, the estimand is the sum of treatment effects multiplied by {opt wtr}*{opt aw} {p_end}
270 | 
271 | {phang} Do NOT specify estimation weights if you have only the second motivation for using them,
272 | e.g. if you want to measure the ATT weighted by county size but you have no reason to think that the outcomes of larger counties have less noise.
273 | Instead specify the estimand of your interest via {opt wtr}.{p_end} 274 | 275 | {phang} All weight types ({opt aw/iw/fw}) produce identical results. Weights should always be non-negative. {p_end} 276 | 277 | {marker results}{...} 278 | {title:Stored results} 279 | 280 | {pstd} 281 | {cmd:did_imputation} stores the following in {cmd:e()}: 282 | 283 | {synoptset 10 tabbed}{...} 284 | {p2col 5 15 15 2: Matrices}{p_end} 285 | {synopt:{cmd:e(b)}}A row-vector of (i) the estimates, (ii) pre-trend coefficients, and (iii) coefficients on controls from Step 1:{p_end} 286 | {pmore3}- If {opt horizons} is specified, the program returns {bf:tau}{it:h} for each {it:h} in the list of horizons.{p_end} 287 | {pmore3}- If multiple {opt wtr} are specified, the program returns {bf:tau_}{it:v} for each {it:v} in the list of {opt wtr} variables. {p_end} 288 | {pmore3}- Otherwise the single returned coefficient is called {bf:tau}. {p_end} 289 | {pmore3}- If {opt hetby} is specified, the coefficient names are appended with underscore and the values of the grouping variable. For instance, with {opt hetby(female)} where {it:female} takes values 0 and 1, the command will return {bf:tau}{it:h}_0 and {bf:tau}{it:h}_1 for each horizon, corresponding to average treatment effects by horizon and sex.{p_end} 290 | {pmore3}- If {opt project} is specified, the coefficient names are appended with underscore and the constant plus the list of slope coefficients. For instance, with {opt project(female income)}, the command will return {bf:tau}{it:h}{bf:_cons}, {bf:tau}{it:h}{bf:_female}, and {bf:tau}{it:h}{bf:_income}.{p_end} 291 | {pmore3}- In addition, if {cmd:pretrends} is specified, the command returns {bf:pre}{it:h} for each pre-trend coefficient {it:h}=1..{opt pretrends}. {p_end} 292 | {pmore3}- And if {cmd:controls} is specified, the command returns the coefficients on those controls. 
(Estimated fixed effects, {cmd:unitcontrols}, and {cmd:timecontrols} are not reported in the {cmd:e(b)}.){p_end} 293 | {synopt:{cmd:e(V)}}Corresponding variance-covariance matrix {p_end} 294 | {synopt:{cmd:e(Nt)}}A row-vector of the number of treated observations used to compute each estimator {p_end} 295 | 296 | {p2col 5 15 15 2: Scalars}{p_end} 297 | {synopt:{cmd:e(Nc)}} the number of control observations used in imputation (scalar) {p_end} 298 | {synopt:{cmd:e(pre_F), e(pre_p), e(pre_df)}} if {opt pretrends} is specified, the F-statistic, pvalue, and dof for the joint test for no pre-trends {p_end} 299 | {synopt:{cmd:e(Niter)}} the # of iterations to compute SE {p_end} 300 | 301 | {p2col 5 15 15 2: Macros}{p_end} 302 | {synopt:{cmd:e(cmd)}} {cmd:did_imputation} {p_end} 303 | {synopt:{cmd:e(droplist)}} the set of coefficients suppressed to zero because of insufficient effective sample size (see the {opt minn} option){p_end} 304 | {synopt:{cmd:e(autosample_drop)}} the set of coefficients suppressed to zero because treatment effects could not be imputed for any observation (if {opt autosample} is specified) {p_end} 305 | {synopt:{cmd:e(autosample_trim)}} the set of coefficients where the sample was partially reduced because treatment effects could not be imputed for some observations (if {opt autosample} is specified) {p_end} 306 | 307 | {p2col 5 15 15 2: Functions}{p_end} 308 | {synopt:{cmd:e(sample)}} Marks the estimation sample: all treated observations for which imputation was successful and the weights are non-zero for at least one coefficient + all non-treated observations used in Step 1 {p_end} 309 | 310 | {marker usage}{...} 311 | {title:Usage Examples} 312 | 313 | {phang}{ul:Conventional panels}{p_end} 314 | 315 | 1) Estimate the single average treatment-on-the-treated (ATT) across all treated observations, assuming that FE can be imputed for all treated observations (which is rarely the case) 316 | {cmd:. 
did_imputation Y i t Ei} 317 | 318 | 2) Same but dropping the observations for which the FE cannot be imputed. (After running, verify that the resulting sample is what you expect!) 319 | {cmd:. did_imputation Y i t Ei, autosample} 320 | 321 | 3) Estimate the ATT by horizon 322 | {cmd:. did_imputation Y i t Ei, allhorizons autosample} 323 | 324 | 4) Estimate the ATT at horizons 0..+6 only 325 | {cmd:. did_imputation Y i t Ei, horizons(0/6)} 326 | 327 | 5) Estimate the ATT at horizons 0..+6 for the subset of units available for all of these horizons (such that the dynamics are not driven by compositional effects) 328 | {cmd:. did_imputation Y i t Ei, horizons(0/6) hbalance} 329 | 330 | 6) Include time-varying controls: 331 | {cmd:. did_imputation Y i t Ei, controls(w_first w_other*)} 332 | 333 | 7) Include state-by-year FE 334 | {cmd:. did_imputation Y county year Ei, fe(county state#year)} 335 | 336 | 8) Drop unit FE 337 | {cmd:. did_imputation Y i t Ei, fe(t)} 338 | 339 | 9) Additionally report pre-trend coefficients for leads 1,...,5. The estimates for post-treatment effects will NOT change. 340 | {cmd:. did_imputation Y i t Ei, horizons(0/6) pretrends(5)} 341 | 342 | 10) Estimate the difference between the ATT at horizons +2 vs +1 [this can equivalently be done via {help lincom} after estimating ATT by horizon] 343 | {cmd:. count if K==1} 344 | {cmd:. gen wtr1 = (K==1)/r(N)} 345 | {cmd:. count if K==2} 346 | {cmd:. gen wtr2 = (K==2)/r(N)} 347 | {cmd:. gen wtr_diff = wtr2-wtr1} 348 | {cmd:. did_imputation Y i t Ei, wtr(wtr_diff) sum} 349 | 350 | 11) Reduce estimation time by using {opt loadweights} when analyzing several outcomes with identical specifications on identical samples: 351 | {cmd:. did_imputation Y1 i t Ei, horizons(0/10) saveweights} 352 | {cmd:. rename __* myweights* // optional} 353 | {cmd:. 
did_imputation Y2 i t Ei, horizons(0/10) loadweights(myweights*)} 354 | 355 | {phang}12) {ul:Treatment effect heterogeneity}: {p_end} 356 | {pmore}To estimate heterogeneity by individuals' sex {it:female}, you could obtain individual treatment effect estimates 357 | (via {opt saveestimates}) and simply run a second-step regression of the estimates on {it:female}: {p_end} 358 | {pmore}{cmd:. did_imputation Y i t Ei, saveestimates(tau)}{p_end} 359 | {pmore}{cmd:. reg tau female} (DON'T DO THIS!){p_end} 360 | 361 | {pmore}HOWEVER, standard errors will be incorrect. Instead, use {opt hetby} or {opt project}:{p_end} 362 | {pmore}{cmd:. did_imputation Y i t Ei, hetby(female)}{p_end} 363 | {pmore}will produce the ATT for males (tau_0) and females (tau_1). You can further use{p_end} 364 | {pmore}{cmd:. lincom tau_1-tau_0}{p_end} 365 | {pmore}to compute the difference between them with a SE. Alternatively,{p_end} 366 | {pmore}{cmd:. did_imputation Y i t Ei, project(female)}{p_end} 367 | {pmore}will produce the ATT for males (tau_cons) and the difference in ATTs between females and males (tau_female).{p_end} 368 | 369 | {pmore}Both options can be combined with {opt horizons} to do heterogeneity analysis within each horizon.{p_end} 370 | 371 | {pmore}If you also want to allow different period effects for the two groups, you can use the {opt fe()} option as usual, e.g.:{p_end} 372 | {pmore}{cmd:. did_imputation Y i t Ei, hetby(female) fe(i t#female)}{p_end} 373 | 374 | {phang}21) {ul:Repeated cross-sections}: {p_end} 375 | {pmore}When in each period you have a different sample of individuals {it:i} in the same groups (e.g. regions), 376 | replace individual FEs with group FEs and consider clustering at the regional level:{p_end} 377 | {phang2}{cmd:. 
did_imputation Y i t Ei, fe(region t) cluster(region) ...}{p_end} 378 | 379 | {pmore}Note that the main parameters still include {it:i}, and not {it:region}, as the unit identifier.{p_end} 380 | 381 | {phang}22) {ul:Triple-diffs}: {p_end} 382 | {pmore}When observations are defined by {it:i,g,t} where, say, {it:i} are counties and {it:g} are age groups, 383 | specify a variable {it:ig} identifying the {it:(i,g)} pairs as the unit identifier, add appropriate FEs, and choose your clustering level, e.g.:{p_end} 384 | {phang2}{cmd:. did_imputation Y ig t Eig, fe(ig i#t g#t) cluster(i) ...}{p_end} 385 | 386 | {pmore}Note that the event time {it:Eig} should be specific to the {it:(i,g)} pairs, not to the {it:i}. For instance, {it:Eig} is missing for a never-treated age group in a county where other groups are treated at some point.{p_end} 387 | 388 | {title:Missing Features} 389 | 390 | {phang}- Save imputed Y(0) in addition to treatment effects {p_end} 391 | {phang}- Making {opt hbalance} work with {opt autosample} {p_end} 392 | {phang}- Throw an error if imputation is not possible with complicated controls {p_end} 393 | {phang}- Allow using treatment effect averages for SE computation which rely on observations even 394 | outside the estimation sample for the current {opt wtr} {p_end} 395 | {phang}- Estimation when treatment switches on and off {p_end} 396 | {phang}- More general interactions between FEs and continuous controls than with {opt timecontrols} and {opt unitcontrols}{p_end} 397 | {phang}- Frequency weights{p_end} 398 | {phang}- Verify that the unit ID variable is numeric{p_end} 399 | {phang}- In Stata 13 there may be a problem with the {opt df} option of {cmd:test} {p_end} 400 | {phang}- {opt loadweights} doesn't work when the weights are saved with the default names {it:__w*}{p_end} 401 | {phang}- Allow for designs in which treatment is not binary{p_end} 402 | {phang}- Add a check that ranges of {opt t} and {opt Ei} match{p_end} 403 | 404 | {pstd} 405 | If you are 
interested in discussing these or others, please {help did_imputation##author:contact me}. 406 | 407 | {marker bugs}{...} 408 | {title:Bug reporting} 409 | 410 | {phang}If you get an error message, please:{p_end} 411 | {phang}- Reinstall {cmd:reghdfe} and {cmd:ftools} to have the most recent version{p_end} 412 | {phang}- Double check the syntax: e.g. make sure that {it:Ei} is the event date and not the treatment dummy, and that 413 | the treatment dummy can be obtained as {it:t>=Ei}{p_end} 414 | {phang}- If it's a message with an explanation, read the message carefully.{p_end} 415 | 416 | {phang}If this doesn't help:{p_end} 417 | {phang}- Rerun your command adding the {opt verbose} option and save the log-file{p_end} 418 | {phang}- If possible, create a version of the dataset in which you can replicate the error (e.g. with a fake outcome variable).{p_end} 419 | {phang}- If you can't share a fake dataset, summarize all the relevant variables in the log-file before calling {cmd:did_imputation}.{p_end} 420 | {phang}- Report this on {browse "https://github.com/borusyak/did_imputation/issues":github} or 421 | {help did_imputation##author:email me} with all of this.{p_end} 422 | 423 | {title:References} 424 | 425 | {phang} 426 | Borusyak, Kirill, Xavier Jaravel, and Jann Spiess (2023). "Revisiting Event Study Designs: Robust and Efficient Estimation," Working paper. 427 | {p_end} 428 | 429 | {title:Acknowledgements} 430 | 431 | {pstd} 432 | We thank Kyle Butts for the help in preparing this helpfile. 433 | 434 | {marker author}{...} 435 | {title:Author} 436 | 437 | {pstd} 438 | Kirill Borusyak (UC Berkeley), k.borusyak@berkeley.edu 439 | 440 | -------------------------------------------------------------------------------- /event_plot.ado: -------------------------------------------------------------------------------- 1 | *! event_plot: Plot coefficients from a staggered adoption event study analysis 2 | *! Version: June 1, 2021 3 | *! Author: Kirill Borusyak 4 | *! 
Please check the latest version at https://github.com/borusyak/did_imputation/ 5 | *! Citation: Borusyak, Jaravel, and Spiess, "Revisiting Event Study Designs: Robust and Efficient Estimation" (2021) 6 | program define event_plot 7 | version 13.0 8 | syntax [anything(name=eqlist)] [, trimlag(numlist integer) trimlead(numlist integer) default_look stub_lag(string) stub_lead(string) plottype(string) ciplottype(string) together /// 9 | graph_opt(string asis) noautolegend legend_opt(string) perturb(numlist) shift(numlist integer) /// 10 | lag_opt(string) lag_ci_opt(string) lead_opt(string) lead_ci_opt(string) /// 11 | lag_opt1(string) lag_ci_opt1(string) lead_opt1(string) lead_ci_opt1(string) /// 12 | lag_opt2(string) lag_ci_opt2(string) lead_opt2(string) lead_ci_opt2(string) /// 13 | lag_opt3(string) lag_ci_opt3(string) lead_opt3(string) lead_ci_opt3(string) /// 14 | lag_opt4(string) lag_ci_opt4(string) lead_opt4(string) lead_ci_opt4(string) /// 15 | lag_opt5(string) lag_ci_opt5(string) lead_opt5(string) lead_ci_opt5(string) /// 16 | lag_opt6(string) lag_ci_opt6(string) lead_opt6(string) lead_ci_opt6(string) /// 17 | lag_opt7(string) lag_ci_opt7(string) lead_opt7(string) lead_ci_opt7(string) /// 18 | lag_opt8(string) lag_ci_opt8(string) lead_opt8(string) lead_ci_opt8(string) /// 19 | savecoef reportcommand noplot verbose alpha(real 0.05)] 20 | qui { 21 | // to-do: read dcdh or K_95; compatibility with the code from Goodman-Bacon, eventdd(?), did_multiplegt; use eventstudy_siegloch on options for many graphs; Burtch: ib4.rel_period_pos 22 | // Part 1: Initialize 23 | local verbose = ("`verbose'"=="verbose") 24 | if ("`plottype'"=="") local plottype connected 25 | if ("`ciplottype'"=="" & ("`plottype'"=="connected" | "`plottype'"=="line")) local ciplottype rarea 26 | if ("`ciplottype'"=="" & "`plottype'"=="scatter") local ciplottype rcap 27 | if (`verbose') noi di "#1" 28 | if ("`eqlist'"=="") local eqlist . 
29 | if ("`shift'"=="") local shift 0 30 | if ("`savecoef'"=="savecoef") cap drop __event* 31 | 32 | tempname dot bmat Vmat bmat_current Vmat_current 33 | cap estimates store `dot' // cap in case there are no current estimate (but plotting is done based on previously saved ones) 34 | local rc_current = _rc 35 | local eq_n : word count `eqlist' 36 | if (`eq_n'>8) { 37 | di as error "Combining at most 8 graphs are currently supported" 38 | error 198 39 | } 40 | 41 | if ("`perturb'"=="") { 42 | local perturb 0 43 | if (`eq_n'>1) forvalues eq=1/`eq_n' { 44 | local perturb `perturb' `=0.2*`eq'/`eq_n'' 45 | } 46 | } 47 | 48 | tokenize `eqlist' 49 | forvalues eq = 1/`eq_n' { 50 | local hashpos = strpos("``eq''","#") 51 | if (`hashpos'==0) { // e() syntax 52 | if ("``eq''"==".") { 53 | if (`rc_current'==0) estimates restore `dot' 54 | else error 301 55 | } 56 | else estimates restore ``eq'' 57 | 58 | matrix `bmat' = e(b) 59 | cap matrix `Vmat' = e(V) 60 | if (_rc==0) local vregime = "matrix" 61 | else local vregime = "none" 62 | } 63 | else { // bmat#Vmat syntax 64 | matrix `bmat' = `=substr("``eq''",1,`hashpos'-1)' 65 | if (colsof(`bmat')==1) matrix `bmat' = `bmat'' 66 | 67 | cap matrix `Vmat' = `=substr("``eq''",`hashpos'+1,.)' 68 | if (_rc==0) { 69 | if (rowsof(`Vmat')==1) local vregime = "row" 70 | else if (colsof(`Vmat')==1) { 71 | matrix `Vmat' = `Vmat'' 72 | local vregime = "row" 73 | } 74 | else if (rowsof(`Vmat')==colsof(`Vmat')) local vregime = "matrix" 75 | else { 76 | di as error "The variance matrix " substr("``eq''",`hashpos'+1,.) 
" does not have an expected format in model `eq'" 77 | error 198 78 | } 79 | } 80 | else local vregime = "none" 81 | } 82 | 83 | * extract prefix and suffix 84 | foreach o in lag lead { 85 | local currstub_`o' : word `eq' of `stub_`o'' 86 | if ("`currstub_`o''"=="") local currstub_`o' : word 1 of `stub_`o'' 87 | if ("`currstub_`o''"=="" & e(cmd)=="did_imputation" & "`o'"=="lag") local currstub_`o' tau# 88 | if ("`currstub_`o''"=="" & e(cmd)=="did_imputation" & "`o'"=="lead") local currstub_`o' pre# 89 | 90 | if ("`currstub_`o''"!="") { 91 | local hashpos = strpos("`currstub_`o''","#") 92 | if (`hashpos'==0) { 93 | di as error "stub_`o' is incorrectly specified for model `eq'" 94 | error 198 95 | } 96 | local prefix_`o' = substr("`currstub_`o''",1,`hashpos'-1) 97 | local postfix_`o' = substr("`currstub_`o''",`hashpos'+1,.) 98 | local lprefix_`o' = length("`prefix_`o''") 99 | local lpostfix_`o' = length("`postfix_`o''") 100 | local have_`o' = 1 101 | } 102 | else local have_`o' = 0 103 | } 104 | if (`have_lag'==0 & `have_lead'==0) { 105 | di as error "At least one of stub_lag and stub_lead has to be specified for model `eq'" 106 | error 198 107 | } 108 | if ("`currstub_lag'"=="`currstub_lead'") { 109 | di as error "stub_lag and stub_lead have to be different for model `eq'" 110 | error 198 111 | } 112 | 113 | // Part 2: Compute the number of available lags&leads 114 | local maxlag = -1 115 | local maxlead = 0 // zero leads = nothing since they start from 1, while lags start from 0 116 | local allvars : colnames `bmat' 117 | foreach v of local allvars { 118 | if (substr("`v'",1,2)=="o.") local v = substr("`v'",3,.) 
119 | if (`have_lag') { 120 | if (substr("`v'",1,`lprefix_lag')=="`prefix_lag'" & substr("`v'",-`lpostfix_lag',.)=="`postfix_lag'") { 121 | if !mi(real(substr("`v'",`lprefix_lag'+1,length("`v'")-`lprefix_lag'-`lpostfix_lag'))) { 122 | local maxlag = max(`maxlag',real(substr("`v'",`lprefix_lag'+1,length("`v'")-`lprefix_lag'-`lpostfix_lag'))) 123 | } 124 | } 125 | } 126 | if (`have_lead') { 127 | if (substr("`v'",1,`lprefix_lead')=="`prefix_lead'" & substr("`v'",-`lpostfix_lead',.)=="`postfix_lead'") { 128 | if !mi(real(substr("`v'",`lprefix_lead'+1,length("`v'")-`lprefix_lead'-`lpostfix_lead'))) { 129 | local maxlead = max(`maxlead',real(substr("`v'",`lprefix_lead'+1,length("`v'")-`lprefix_lead'-`lpostfix_lead'))) 130 | } 131 | } 132 | } 133 | } 134 | 135 | local curr_trimlag : word `eq' of `trimlag' 136 | if mi("`curr_trimlag'") local curr_trimlag : word 1 of `trimlag' 137 | if mi("`curr_trimlag'") local curr_trimlag = -2 138 | local curr_trimlead : word `eq' of `trimlead' 139 | if mi("`curr_trimlead'") local curr_trimlead : word 1 of `trimlead' 140 | if mi("`curr_trimlead'") local curr_trimlead = -1 141 | 142 | local maxlag = cond(`curr_trimlag'>=-1, min(`maxlag',`curr_trimlag'), `maxlag') 143 | local maxlead = cond(`curr_trimlead'>=0, min(`maxlead',`curr_trimlead'), `maxlead') 144 | if (_N<`maxlag'+`maxlead'+1) { 145 | di as err "Not enough observations to store `=`maxlag'+`maxlead'+1' coefficient estimates for model `eq'" 146 | error 198 147 | } 148 | if (`verbose') noi di "#2 Model `eq': `maxlag' lags, `maxlead' leads" 149 | 150 | // Part 3: Fill in coefs & CIs 151 | if ("`savecoef'"=="") tempvar H`eq' pos`eq' coef`eq' hi`eq' lo`eq' 152 | else { 153 | local H`eq' __event_H`eq' 154 | local pos`eq' __event_pos`eq' 155 | local coef`eq' __event_coef`eq' 156 | local hi`eq' __event_hi`eq' 157 | local lo`eq' __event_lo`eq' 158 | } 159 | 160 | local shift`eq' : word `eq' of `shift' 161 | if ("`shift`eq''"=="") local shift`eq' 0 162 | 163 | gen `H`eq'' = _n-1-`maxlead' 
if _n<=`maxlag'+`maxlead'+1 164 | gen `coef`eq'' = . 165 | gen `hi`eq'' = . 166 | gen `lo`eq'' = . 167 | label var `H`eq'' "Periods since treatment" 168 | if (`maxlag'>=0) forvalues h=0/`maxlag' { 169 | matrix `bmat_current' = J(1,1,.) 170 | cap matrix `bmat_current' = `bmat'[1,"`prefix_lag'`h'`postfix_lag'"] 171 | cap replace `coef`eq'' = `bmat_current'[1,1] if `H`eq''==`h' // because `bmat'[1,"`prefix_lag'`h'`postfix_lag'"] is only a matrix expression on macs 172 | 173 | if ("`ciplottype'"!="none" & "`vregime'"!="none") { 174 | matrix `Vmat_current' = J(1,1,.) 175 | if ("`vregime'"=="matrix") cap matrix `Vmat_current' = `Vmat'["`prefix_lag'`h'`postfix_lag'","`prefix_lag'`h'`postfix_lag'"] 176 | else cap matrix `Vmat_current' = `Vmat'[1,"`prefix_lag'`h'`postfix_lag'"] 177 | local se = `Vmat_current'[1,1]^0.5 178 | cap replace `hi`eq'' = `bmat_current'[1,1]+invnorm(1-`alpha'/2)*`se' if `H`eq''==`h' 179 | cap replace `lo`eq'' = `bmat_current'[1,1]-invnorm(1-`alpha'/2)*`se' if `H`eq''==`h' 180 | } 181 | } 182 | if (`maxlead'>0) forvalues h=1/`maxlead' { 183 | matrix `bmat_current' = J(1,1,.) 184 | cap matrix `bmat_current' = `bmat'[1,"`prefix_lead'`h'`postfix_lead'"] 185 | cap replace `coef`eq'' = `bmat_current'[1,1] if `H`eq''==-`h' 186 | 187 | if ("`ciplottype'"!="none" & "`vregime'"!="none") { 188 | matrix `Vmat_current' = J(1,1,.) 189 | if ("`vregime'"=="matrix") cap matrix `Vmat_current' = `Vmat'["`prefix_lead'`h'`postfix_lead'","`prefix_lead'`h'`postfix_lead'"] 190 | else cap matrix `Vmat_current' = `Vmat'[1,"`prefix_lead'`h'`postfix_lead'"] 191 | local se = `Vmat_current'[1,1]^0.5 192 | cap replace `hi`eq'' = `bmat_current'[1,1]+invnorm(1-`alpha'/2)*`se' if `H`eq''==-`h' 193 | cap replace `lo`eq'' = `bmat_current'[1,1]-invnorm(1-`alpha'/2)*`se' if `H`eq''==-`h' 194 | } 195 | } 196 | count if !mi(`coef`eq'') 197 | if (r(N)==0) { 198 | if (`eq_n'==1) noi di as error `"No estimates found. 
Make sure you have specified stub_lag and stub_lead correctly."' 199 | else noi di as error `"No estimates found for the model "``eq''". Make sure you have specified stub_lag and stub_lead correctly."' 200 | error 498 201 | } 202 | if (`verbose') noi di "#3 `perturb'" 203 | 204 | local perturb_now : word `eq' of `perturb' 205 | if ("`perturb_now'"=="") local perturb_now = 0 206 | if (`verbose') noi di "#3A gen `pos`eq''=`H`eq''+`perturb_now'-`shift`eq''" 207 | gen `pos`eq''=`H`eq''+`perturb_now'-`shift`eq'' 208 | if (`verbose') noi di "#3B" 209 | 210 | } 211 | cap estimates restore `dot' 212 | cap estimates drop `dot' 213 | 214 | // Part 4: Prepare graphs 215 | if ("`default_look'"!="") { 216 | local graph_opt xline(0, lcolor(gs8) lpattern(dash)) yline(0, lcolor(gs8)) graphregion(color(white)) bgcolor(white) ylabel(, angle(horizontal)) `graph_opt' 217 | if (`eq_n'==1) { 218 | local lag_opt color(navy) `lag_opt' 219 | local lead_opt color(maroon) msymbol(S) `lead_opt' 220 | local lag_ci_opt color(navy%45 navy%45) `lag_ci_opt' // color repeated twice only for connected/scatter, o/w doesn't matter 221 | local lead_ci_opt color(maroon%45 maroon%45) `lead_ci_opt' 222 | } 223 | else { 224 | local lag_opt1 color(navy) `lag_opt1' 225 | local lag_opt2 color(maroon) `lag_opt2' 226 | local lag_opt3 color(forest_green) `lag_opt3' 227 | local lag_opt4 color(dkorange) `lag_opt4' 228 | local lag_opt5 color(teal) `lag_opt5' 229 | local lag_opt6 color(cranberry) `lag_opt6' 230 | local lag_opt7 color(lavender) `lag_opt7' 231 | local lag_opt8 color(khaki) `lag_opt8' 232 | local lead_opt1 color(navy) `lead_opt1' 233 | local lead_opt2 color(maroon) `lead_opt2' 234 | local lead_opt3 color(forest_green) `lead_opt3' 235 | local lead_opt4 color(dkorange) `lead_opt4' 236 | local lead_opt5 color(teal) `lead_opt5' 237 | local lead_opt6 color(cranberry) `lead_opt6' 238 | local lead_opt7 color(lavender) `lead_opt7' 239 | local lead_opt8 color(khaki) `lead_opt8' 240 | local lag_ci_opt1 
color(navy%45 navy%45) `lag_ci_opt1' 241 | local lag_ci_opt2 color(maroon%45 maroon%45) `lag_ci_opt2' 242 | local lag_ci_opt3 color(forest_green%45 forest_green%45) `lag_ci_opt3' 243 | local lag_ci_opt4 color(dkorange%45 dkorange%45) `lag_ci_opt4' 244 | local lag_ci_opt5 color(teal%45 teal%45) `lag_ci_opt5' 245 | local lag_ci_opt6 color(cranberry%45 cranberry%45) `lag_ci_opt6' 246 | local lag_ci_opt7 color(lavender%45 lavender%45) `lag_ci_opt7' 247 | local lag_ci_opt8 color(khaki%45 khaki%45) `lag_ci_opt8' 248 | local lead_ci_opt1 color(navy%45 navy%45) `lead_ci_opt1' 249 | local lead_ci_opt2 color(maroon%45 maroon%45) `lead_ci_opt2' 250 | local lead_ci_opt3 color(forest_green%45 forest_green%45) `lead_ci_opt3' 251 | local lead_ci_opt4 color(dkorange%45 dkorange%45) `lead_ci_opt4' 252 | local lead_ci_opt5 color(teal%45 teal%45) `lead_ci_opt5' 253 | local lead_ci_opt6 color(cranberry%45 cranberry%45) `lead_ci_opt6' 254 | local lead_ci_opt7 color(lavender%45 lavender%45) `lead_ci_opt7' 255 | local lead_ci_opt8 color(khaki%45 khaki%45) `lead_ci_opt8' 256 | } 257 | local legend_opt region(lstyle(none)) `legend_opt' 258 | } 259 | 260 | local plotindex = 0 261 | local legend_order 262 | 263 | forvalues eq = 1/`eq_n' { 264 | local lead_cmd 265 | local leadci_cmd 266 | local lag_cmd 267 | local lagci_cmd 268 | 269 | if ("`together'"=="") { // lead graph commands only when they are separate from lags 270 | count if !mi(`coef`eq'') & `H`eq''<0 271 | if (r(N)>0) { 272 | local ++plotindex 273 | local lead_cmd (`plottype' `coef`eq'' `pos`eq'' if !mi(`coef`eq'') & `H`eq''<0, `lead_opt' `lead_opt`eq'') 274 | local legend_order = `"`legend_order' `plotindex' "Pre-trend coefficients""' 275 | } 276 | 277 | count if !mi(`hi`eq'') & `H`eq''<0 278 | if (r(N)>0) { 279 | local ++plotindex 280 | local leadci_cmd (`ciplottype' `hi`eq'' `lo`eq'' `pos`eq'' if !mi(`hi`eq'') & `H`eq''<0, `lead_ci_opt' `lead_ci_opt`eq'') 281 | } 282 | } 283 | 284 | local lag_filter = cond("`together'"=="", 
"`H`eq''>=0", "1") 285 | count if !mi(`coef') & `lag_filter' 286 | if (r(N)>0) { 287 | local ++plotindex 288 | local lag_cmd (`plottype' `coef`eq'' `pos`eq'' if !mi(`coef`eq'') & `lag_filter', `lag_opt' `lag_opt`eq'') 289 | if ("`together'"=="") local legend_order = `"`legend_order' `plotindex' "Treatment effects""' 290 | } 291 | 292 | count if !mi(`hi`eq'') & `lag_filter' 293 | if (r(N)>0) { 294 | local ++plotindex 295 | local lagci_cmd (`ciplottype' `hi`eq'' `lo`eq'' `pos`eq'' if !mi(`hi`eq'') & `lag_filter', `lag_ci_opt' `lag_ci_opt`eq'') 296 | } 297 | if ("`autolegend'"=="noautolegend") local legend = "" 298 | else if ("`together'"=="together") local legend = "legend(off)" // show auto legend only for separate, o/w just one item 299 | else local legend legend(order(`legend_order') `legend_opt') 300 | local maincmd `maincmd' `lead_cmd' `leadci_cmd' `lag_cmd' `lagci_cmd' 301 | if (`verbose') noi di `"#4a ``eq'': `lead_cmd' `leadci_cmd' `lag_cmd' `lagci_cmd'"' 302 | } 303 | if (`verbose' | "`reportcommand'"!="") noi di `"twoway `maincmd' , `legend' `graph_opt'"' 304 | if ("`plot'"!="noplot") twoway `maincmd', `legend' `graph_opt' 305 | } 306 | end 307 | -------------------------------------------------------------------------------- /event_plot.sthlp: -------------------------------------------------------------------------------- 1 | {smcl} 2 | {* *! 
version 1 2021-05-26}{...} 3 | {vieweralsosee "did_imputation" "help did_imputation"}{...} 4 | {vieweralsosee "csdid" "help csdid"}{...} 5 | {vieweralsosee "did_multiplegt" "help did_multiplegt"}{...} 6 | {vieweralsosee "eventstudyinteract" "help eventstudyinteract"}{...} 8 | {vieweralsosee "estimates store" "help estimates store"}{...} 9 | {viewerjumpto "Syntax" "event_plot##syntax"}{...} 10 | {viewerjumpto "The list of models" "event_plot##listmodels"}{...} 11 | {viewerjumpto "Options" "event_plot##options"}{...} 12 | {viewerjumpto "Combining plots" "event_plot##combine"}{...} 13 | {viewerjumpto "Usage examples" "event_plot##usage"}{...} 14 | {title:Description} 15 | 16 | {pstd} 17 | {bf:event_plot} - Plot the staggered-adoption diff-in-diff ("event study") estimates: coefficients post treatment ("lags") and, if available, pre-trend coefficients ("leads") along with confidence intervals (CIs). 18 | 19 | {pstd} 20 | This command is used once estimates have been produced by the imputation estimator of Borusyak et al. 2021 ({help did_imputation}), 21 | other methods robust to treatment effect heterogeneity ({help did_multiplegt}, {help csdid}, {help eventstudyinteract}), and conventional event-study OLS. 
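{pstd}
As a minimal illustration (the variable names {it:Y i t Ei} and the chosen options are hypothetical), one can run the imputation estimator and immediately plot the current {cmd:e()} results:

{phang2}{cmd:. did_imputation Y i t Ei, allhorizons pretrends(5)}{p_end}
{phang2}{cmd:. event_plot, default_look graph_opt(xtitle("Periods since the event"))}{p_end}

{pstd}
After {cmd:did_imputation}, the templates {opt stub_lag(tau#)} and {opt stub_lead(pre#)} are assumed automatically, so they need not be specified.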
22 | 23 | 24 | {marker syntax}{...} 25 | {title:Syntax} 26 | 27 | {phang} 28 | {cmd: event_plot} [{help event_plot##listmodels:list of models}] [, {help event_plot##options:options}] 29 | 30 | 31 | {marker listmodels}{...} 32 | {title:The List of Models} 33 | 34 | {phang} 35 | Each term in the list of models specifies where to read the coefficient estimates (and variances) from.{p_end} 36 | 37 | {phang}1) Leave empty or specify a dot ({bf:.}) to plot the current estimates, stored in the {cmd:e()} output;{p_end} 38 | {phang}2) To show previously constructed estimates which were saved by {help estimates store}, provide their name;{p_end} 39 | {phang}3) To read the estimates from an arbitrary row-vector, specify {it:bmat}{bf:#}{it:vmat} where:{p_end} 40 | {pmore}- {it:bmat} is the name of the coefficient matrix or an expression to access it, e.g. r(myestimates) (with no internal spaces). 41 | This should be a row-vector;{p_end} 42 | {pmore}- {it:vmat} is the name of the variance matrix or an expression to access it. 43 | This can be a square matrix or a row-vector of individual coefficient variances, and it is optional 44 | (i.e. {it:bmat}{bf:#} would plot the coefs without CIs).{p_end} 45 | 46 | {phang}By including several terms like this, you can combine several sets of estimates on one plot; see {help event_plot##combine:Combining plots}. 47 | 48 | 49 | {marker options}{...} 50 | {title:Options} 51 | 52 | {pstd} 53 | These options are designed for showing a single plot. Please see {help event_plot##combine:Combining plots} for adjustments and additional options when plots are combined. 54 | 55 | {dlgtab:Which Coefficients to Show} 56 | 57 | {phang}{opt stub_lag(prefix#postfix)}: a template for how the relevant coefficients are called in the estimation output. 58 | No lag coefficients will be shown if {opt stub_lag} is not specified, except after {cmd:did_imputation} (in which case {opt stub_lag(tau#)} is assumed). 
59 | The template must include the symbol {it:#} indicating where the number is located (running from 0).{p_end} 60 | 61 | {pmore}{it:Examples:}{p_end} 62 | {phang2}{opt stub_lag(tau#)} means that the relevant coefficients are called tau0, tau1, ..., as with {cmd:did_imputation} (note that the postfix is empty in this example);{p_end} 63 | {phang2}{opt stub_lag(L#xyz)} means they are called L0xyz, L1xyz, ... (note that just specifying {opt stub_lag(L#)} will not be enough in this case). 64 | 65 | {phang}{opt stub_lead(prefix#postfix)}: same for the leads. Here the number runs from 1. {it:Examples:} {opt stub_lead(pre#)} or {opt stub_lead(F#xyz)}. 66 | 67 | {phang}{opt trimlag(integer)}: lags 0..{bf:trimlag} will be shown, while others will be suppressed. To show none (i.e. pre-trends only), specify {opt trimlag(-1)}. The default is to show all available lags. 68 | 69 | {phang}{opt trimlead(integer)}: leads 1..{bf:trimlead} will be shown, while others will be suppressed. To show none (i.e. no pre-trends), specify {opt trimlead(0)}. The default is to show all available leads. 70 | 71 | {dlgtab:How to Show The Coefficients} 72 | 73 | {phang}{opt plottype(string)}: the {help twoway} plot type used to show coefficient estimates. Supported options: {help twoway connected:connected} (by default), {help line}, {help scatter}.{p_end} 74 | 75 | {phang}{opt ciplottype(string)}: the {help twoway} plot type used to show CI estimates. Supported options:{p_end} 76 | {phang2}- {help rarea} (default for {opt plottype(connected)} and {opt plottype(line)});{p_end} 77 | {phang2}- {help rcap} (default for {opt plottype(scatter)});{p_end} 78 | {phang2}- {help twoway connected:connected};{p_end} 79 | {phang2}- {help scatter};{p_end} 80 | {phang2}- {bf:none} (i.e. don't show CIs at all; default if SE are not available).{p_end} 81 | 82 | {phang}{opt together}: by default the leads and lags are shown as two separate lines (as recommended by Borusyak, Jaravel, and Spiess 2021). 
83 | If {opt together} is specified, they are shown as one line, and the options for the lags are used for this line 84 | (while the options for the leads are ignored). {p_end} 85 | 86 | {phang}{opt shift(integer)}: Shift all coefficients to the left (when {opt shift}>0) or right (when {opt shift}<0). Specify if lag 0 actually corresponds to period -{opt shift} relative to the event time, as in the case of anticipation effects. This is similar to the {opt shift} option in {help did_imputation}. The default is zero. {p_end} 87 | 88 | {dlgtab:Graph options} 89 | 90 | {phang}{opt default_look}: sets default graph parameters. Additional graph options can still be specified and will be combined with these, but options cannot be repeated. See details in the {help event_plot##defaultlook:Default Look} section below. {p_end} 91 | 92 | {phang}{opt graph_opt(string)}: additional {help twoway options} for the graph overall (e.g. {opt title}, {opt xlabel}).{p_end} 93 | 94 | {phang}{opt lag_opt(string)}: additional options for the lag coefficient graph (e.g. {opt msymbol}, {opt lpattern}, {opt color}).{p_end} 95 | 96 | {phang}{opt lag_ci_opt(string)}: additional options for the lag CI graph (e.g. {opt color}) {p_end} 97 | 98 | {phang}{opt lead_opt(string)}, {opt lead_ci_opt(string)}: same for lead coefficients and CIs. Ignored if {opt together} is specified.{p_end} 99 | 100 | {dlgtab:Legend options} 101 | 102 | {pstd}A legend is shown by default, unless {opt together} is specified. You can either adjust the automatic legend by using {opt legend_opt()} 103 | , or suppress or replace it by specifying {opt noautolegend} and modifying {opt graph_opt()}.{p_end} 104 | {pmore}{it:Notes:}{p_end} 105 | {phang2}- the order of graphs for the legend: lead coefs, lead CIs, lag coefs, lag CIs, excluding those not applicable 106 | (e.g. 
CIs with {opt ciplottype(none)} or leads with {opt together}).{p_end} 107 | {phang2}- with {opt ciplottype(connected)} or {opt ciplottype(scatter)}, each CI is two lines instead of one.{p_end} 108 | {phang2}- if {opt together} is specified, the legend is automatically off. Use {opt noautolegend} to add a manual legend.{p_end} 109 | 110 | {phang}{opt legend_opt(string)}: additional options for the automatic legend.{p_end} 111 | 112 | {phang}{opt noautolegend}: suppresses the automatic legend. A manual legend (or the {opt legend(off)} option) should be added to {opt graph_opt()}.{p_end} 113 | 114 | {dlgtab:Miscellaneous} 115 | 116 | {phang}{opt savecoef}: save the data underlying the plot in the current dataset, e.g. to later use it in more elaborate manual plots. 117 | Variables {it:__event_H#}, {it:__event_pos#}, {it:__event_coef#}, {it:__event_lo#}, and {it:__event_hi#} will be created for each model {it:#}=1,..., where:{p_end} 118 | {phang2}- {it:H} is the number of periods relative to treatment;{p_end} 119 | {phang2}- {it:pos} is the x-coordinate (equal to {it:H} by default but modified by {opt perturb} and {opt shift});{p_end} 120 | {phang2}- {it:coef} is the point estimate;{p_end} 121 | {phang2}- [{it:lo},{it:hi}] is the CI.{p_end} 122 | 123 | {phang}{opt reportcommand}: report the command for the plot. Use it together with {opt savecoef} to then create more elaborate manual plots.{p_end} 124 | 125 | {phang}{opt noplot}: do not show the plot (useful together with {opt savecoef}).{p_end} 126 | 127 | {phang}{opt alpha(real)}: CIs will be shown at the (1-{it:alpha}) confidence level. The default is 0.05, i.e. 95% CIs.{p_end} 128 | 129 | {phang}{opt verbose}: debugging mode.{p_end} 130 | 131 | 132 | {marker combine}{...} 133 | {title:Combining plots} 134 | 135 | {phang}Up to 8 models can be combined, e.g. to show how the estimates differ between {cmd:did_imputation} and OLS, or between males and females.
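{pmore}For instance, a minimal combined call might look like this (a sketch, assuming two sets of estimates have been stored via {cmd:estimates store} as {cmd:bjs}, from {cmd:did_imputation}, and {cmd:ols}, from a conventional event-study regression with stubs as in the Usage examples below):{p_end} {phang2}{cmd:event_plot bjs ols, stub_lag(tau# L#event) stub_lead(pre# F#event) together plottype(scatter) default_look}{p_end}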
136 | 137 | {phang}With several models, additional options are available, while the syntax and meaning of others are modified: {p_end} 138 | 139 | {phang2}{opt perturb(numlist)}: shifts the plots horizontally relative to each other, so that the estimates are easier to read. The numlist is the list of x-shifts, and the default is an equally spaced sequence from 0 to 0.2 (but negative numbers are allowed). To prevent the shifts, specify {opt perturb(0)}.{p_end} 140 | 141 | {phang2}{opt lag_opt#(string)}, {opt lag_ci_opt#(string)}, {opt lead_opt#(string)}, {opt lead_ci_opt#(string)} for #=1,...,5: extra parameters 142 | for individual models (e.g. colors). Similar options without an index, e.g. {opt lag_opt()}, are passed to all relevant graphs.{p_end} 143 | 144 | {phang2}{opt stub_lag}, {opt stub_lead}, {opt trimlag}, {opt trimlead}, {opt shift} can be specified either as a list of values (one per plot), or as just one value to be used for all plots.{p_end} 145 | 146 | {phang2}{opt plottype} and {opt together} are currently required to be the same for all graphs.{p_end} 147 | 148 | 149 | {marker defaultlook}{...} 150 | {title:Default Look} 151 | 152 | {phang} With one model, specifying {opt default_look} is equivalent to including these options:{p_end} 153 | 154 | {phang2}{opt graph_opt(xline(0, lcolor(gs8) lpattern(dash)) yline(0, lcolor(gs8)) graphregion(color(white)) bgcolor(white) ylabel(, angle(horizontal)))} 155 | {opt lag_opt(color(navy))} {opt lead_opt(color(maroon) msymbol(S))} 156 | {opt lag_ci_opt(color(navy%45 navy%45))} {opt lead_ci_opt(color(maroon%45 maroon%45))} 157 | {opt legend_opt(region(lstyle(none)))} 158 | 159 | {phang}With multiple models, the only difference is in colors.
Both lags and leads use the same color: navy for the first plot, maroon for the second, etc.{p_end} 160 | 161 | {marker usage}{...} 162 | {title:Usage examples} 163 | 164 | 1) Estimation + plotting via {help did_imputation}: 165 | 166 | {cmd:did_imputation Y i t Ei, autosample hor(0/20) pretrend(14)} 167 | {cmd:estimates store bjs} {it:// you only need to store the coefs to combine the plots, see Example 3} 168 | {cmd:event_plot, default_look graph_opt(xtitle("Days since the event") ytitle("Coefficients") xlabel(-14(7)14 20))} 169 | 170 | 2) Estimation + plotting via conventional OLS-based event study estimation: 171 | 172 | {it:// creating dummies for the lags 0..19, based on K = number of periods since treatment (or missing if there is a never-treated group)} 173 | {cmd:forvalues l = 0/19} { 174 | {cmd:gen L`l'event = K==`l'} 175 | } 176 | {cmd:gen L20event = K>=20} {it:// binning K=20 and above} 177 | 178 | {it:// creating dummies for the leads 1..14} 179 | {cmd:forvalues l = 1/13} { 180 | {cmd:gen F`l'event = K==-`l'} 181 | } 182 | {cmd:gen F14event = K<=-14} {it:// binning K=-14 and below} 183 | 184 | {it:// running the event study regression. Drop leads 1 and 2 to avoid underidentification} 185 | {it:// if there is no never-treated group (could instead drop any others); see Borusyak et al.
2021} 186 | {cmd:reghdfe outcome o.F1event o.F2event F3event-F14event L*event, a(i t) cluster(i)} 187 | 188 | {it:// plotting the coefficients} 189 | {cmd:event_plot, default_look stub_lag(L#event) stub_lead(F#event) together plottype(scatter)} /// 190 | {cmd:graph_opt(xtitle("Days since the event") ytitle("OLS coefficients") xlabel(-14(7)14 20))} 191 | 192 | 3) Combining estimates from {help did_imputation} and OLS: 193 | 194 | {cmd:event_plot bjs ., stub_lag(tau# L#event) stub_lead(pre# F#event) together plottype(scatter) default_look} /// 195 | {cmd:graph_opt(xtitle("Days since the event") ytitle("OLS coefficients") xlabel(-14(7)14 20))} 196 | 197 | 4) For estimation + plotting with {help csdid}, {help did_multiplegt}, and {help eventstudyinteract}, as well as {help did_imputation} 198 | and traditional OLS, see our example on GitHub: five_estimators_example.do at {browse "https://github.com/borusyak/did_imputation"} 199 | 200 | 201 | {title:Missing Features} 202 | 203 | {phang}- More flexibility for {opt stub_lag} and {opt stub_lead} for reading the coefficients of conventional event studies{p_end} 204 | {phang}- Automatic support of alternative robust estimators: {cmd:did_multiplegt}, {cmd:csdid}, and {cmd:eventstudyinteract}{p_end} 205 | {phang}- Allow {opt plottype} and {opt together} to vary across the combined plots{p_end} 206 | {phang}- Make the command consistent with {cmd:did_multiplegt} with the {opt longdiff_placebo} option{p_end} 207 | {phang}- Throw an error when neither default_look nor graphical options are specified{p_end} 208 | {phang}- In old Stata versions, avoid using transparent colors{p_end} 209 | {phang}- After {cmd:eventstudyinteract}, allow displaying omitted categories{p_end} 210 | {phang}- Add the addzero() option to accommodate the omitted category in, e.g., {cmd:eventstudyinteract}{p_end} 211 | 212 | {pstd} 213 | If you are interested in discussing these or other features, please {help event_plot##author:contact me}.
214 | 215 | {title:References} 216 | 217 | {phang}{it:If using this command, please cite:} 218 | 219 | {phang} 220 | Borusyak, Kirill, Xavier Jaravel, and Jann Spiess (2021). "Revisiting Event Study Designs: Robust and Efficient Estimation," Working paper. 221 | {p_end} 222 | 223 | {title:Acknowledgements} 224 | 225 | {pstd} 226 | We thank Kyle Butts for his help in preparing this helpfile. 227 | 228 | {marker author}{...} 229 | {title:Author} 230 | 231 | {pstd} 232 | Kirill Borusyak (UCL Economics), k.borusyak@ucl.ac.uk 233 | 234 | -------------------------------------------------------------------------------- /five_estimators_example.do: -------------------------------------------------------------------------------- 1 | /* 2 | This simulated example illustrates how to estimate causal effects with event studies using a range of methods 3 | and plot the coefficients & confidence intervals using the event_plot command. 4 | 5 | Date: 28/05/2021 6 | Author: Kirill Borusyak (UCL), k.borusyak@ucl.ac.uk 7 | 8 | You'll need the following commands: 9 | - did_imputation (Borusyak et al. 2021): available on SSC 10 | - did_multiplegt (de Chaisemartin and D'Haultfoeuille 2020): available on SSC 11 | - eventstudyinteract (Sun and Abraham 2020): available on SSC 12 | - csdid (Callaway and Sant'Anna 2020): available on SSC 13 | 14 | */ 15 | 16 | // Generate a complete panel of 300 units observed in 15 periods 17 | clear all 18 | timer clear 19 | set seed 10 20 | global T = 15 21 | global I = 300 22 | 23 | set obs `=$I*$T' 24 | gen i = int((_n-1)/$T )+1 // unit id 25 | gen t = mod((_n-1),$T )+1 // calendar period 26 | tsset i t 27 | 28 | // Randomly generate treatment rollout years uniformly across Ei=10..16 (note that periods t>=16 would not be useful since all units are treated by then) 29 | gen Ei = ceil(runiform()*7)+$T -6 if t==1 // year when unit is first treated 30 | bys i (t): replace Ei = Ei[1] 31 | gen K = t-Ei // "relative time", i.e.
the number of periods since treatment (could be missing if never-treated) 32 | gen D = K>=0 & Ei!=. // treatment indicator 33 | 34 | // Generate the outcome with parallel trends and heterogeneous treatment effects 35 | gen tau = cond(D==1, (t-12.5), 0) // heterogeneous treatment effects (in this case varying over calendar periods) 36 | gen eps = rnormal() // error term 37 | gen Y = i + 3*t + tau*D + eps // the outcome (FEs play no role since all methods control for them) 38 | //save five_estimators_data, replace 39 | 40 | // Estimation with did_imputation of Borusyak et al. (2021) 41 | did_imputation Y i t Ei, allhorizons pretrend(5) 42 | event_plot, default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") /// 43 | title("Borusyak et al. (2021) imputation estimator") xlabel(-5(1)5)) 44 | 45 | estimates store bjs // storing the estimates for later 46 | 47 | // Estimation with did_multiplegt of de Chaisemartin and D'Haultfoeuille (2020) 48 | did_multiplegt Y i t D, robust_dynamic dynamic(5) placebo(5) breps(100) cluster(i) 49 | event_plot e(estimates)#e(variances), default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") /// 50 | title("de Chaisemartin and D'Haultfoeuille (2020)") xlabel(-5(1)5)) stub_lag(Effect_#) stub_lead(Placebo_#) together 51 | 52 | matrix dcdh_b = e(estimates) // storing the estimates for later 53 | matrix dcdh_v = e(variances) 54 | 55 | // Estimation with csdid of Callaway and Sant'Anna (2020) 56 | gen gvar = cond(Ei==., 0, Ei) // group variable as required for the csdid command 57 | csdid Y, ivar(i) time(t) gvar(gvar) notyet 58 | estat event, estore(cs) // this produces and stores the estimates at the same time 59 | event_plot cs, default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") xlabel(-14(1)5) /// 60 | title("Callaway and Sant'Anna (2020)")) stub_lag(Tp#) stub_lead(Tm#) together 61 | 62 | // Estimation with eventstudyinteract of Sun and Abraham
(2020) 63 | sum Ei 64 | gen lastcohort = Ei==r(max) // dummy for the latest- or never-treated cohort 65 | forvalues l = 0/5 { 66 | gen L`l'event = K==`l' 67 | } 68 | forvalues l = 1/14 { 69 | gen F`l'event = K==-`l' 70 | } 71 | drop F1event // normalize K=-1 (and also K=-15) to zero 72 | eventstudyinteract Y L*event F*event, vce(cluster i) absorb(i t) cohort(Ei) control_cohort(lastcohort) 73 | event_plot e(b_iw)#e(V_iw), default_look graph_opt(xtitle("Periods since the event") ytitle("Average causal effect") xlabel(-14(1)5) /// 74 | title("Sun and Abraham (2020)")) stub_lag(L#event) stub_lead(F#event) together 75 | 76 | matrix sa_b = e(b_iw) // storing the estimates for later 77 | matrix sa_v = e(V_iw) 78 | 79 | // TWFE OLS estimation (which is biased here because of treatment effect heterogeneity). Some groups could be binned. 80 | reghdfe Y F*event L*event, a(i t) cluster(i) 81 | event_plot, default_look stub_lag(L#event) stub_lead(F#event) together graph_opt(xtitle("Periods since the event") ytitle("OLS coefficients") xlabel(-14(1)5) /// 82 | title("OLS")) 83 | 84 | estimates store ols // saving the estimates for later 85 | 86 | // Construct the vector of true average treatment effects by the number of periods since treatment 87 | matrix btrue = J(1,6,.)
88 | matrix colnames btrue = tau0 tau1 tau2 tau3 tau4 tau5 89 | qui forvalues h = 0/5 { 90 | sum tau if K==`h' 91 | matrix btrue[1,`h'+1]=r(mean) 92 | } 93 | 94 | // Combine all plots using the stored estimates 95 | event_plot btrue# bjs dcdh_b#dcdh_v cs sa_b#sa_v ols, /// 96 | stub_lag(tau# tau# Effect_# Tp# L#event L#event) stub_lead(pre# pre# Placebo_# Tm# F#event F#event) plottype(scatter) ciplottype(rcap) /// 97 | together perturb(-0.325(0.13)0.325) trimlead(5) noautolegend /// 98 | graph_opt(title("Event study estimators in a simulated panel (300 units, 15 periods)", size(medlarge)) /// 99 | xtitle("Periods since the event") ytitle("Average causal effect") xlabel(-5(1)5) ylabel(0(1)3) /// 100 | legend(order(1 "True value" 2 "Borusyak et al." 4 "de Chaisemartin-D'Haultfoeuille" /// 101 | 6 "Callaway-Sant'Anna" 8 "Sun-Abraham" 10 "OLS") rows(3) region(style(none))) /// 102 | /// the following lines replace default_look with something more elaborate 103 | xline(-0.5, lcolor(gs8) lpattern(dash)) yline(0, lcolor(gs8)) graphregion(color(white)) bgcolor(white) ylabel(, angle(horizontal)) /// 104 | ) /// 105 | lag_opt1(msymbol(+) color(cranberry)) lag_ci_opt1(color(cranberry)) /// 106 | lag_opt2(msymbol(O) color(cranberry)) lag_ci_opt2(color(cranberry)) /// 107 | lag_opt3(msymbol(Dh) color(navy)) lag_ci_opt3(color(navy)) /// 108 | lag_opt4(msymbol(Th) color(forest_green)) lag_ci_opt4(color(forest_green)) /// 109 | lag_opt5(msymbol(Sh) color(dkorange)) lag_ci_opt5(color(dkorange)) /// 110 | lag_opt6(msymbol(Oh) color(purple)) lag_ci_opt6(color(purple)) 111 | graph export "five_estimators_example.png", replace 112 | -------------------------------------------------------------------------------- /five_estimators_example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/borusyak/did_imputation/767c8d6670a751170910d419bbafd323df92ef08/five_estimators_example.png 
--------------------------------------------------------------------------------