├── .gitignore
├── BayesianPrevalence.pdf
├── BayesianPrevalenceDifference.pdf
├── LICENSE
├── R
    ├── bayesprev.R
    ├── bayesprev_example.R
    ├── example_csv.R
    └── example_data.csv
├── README.md
├── matlab
    ├── bayesprev_bound.m
    ├── bayesprev_diff_between.m
    ├── bayesprev_diff_within.m
    ├── bayesprev_example.m
    ├── bayesprev_hpdi.m
    ├── bayesprev_logodds.m
    ├── bayesprev_map.m
    ├── bayesprev_plotposterior.m
    ├── bayesprev_posterior.m
    ├── currfig1seed.mat
    ├── example_csv.m
    ├── example_data.csv
    └── fig1_group_vs_ind.m
├── paper
    ├── bayes_scale.mat
    ├── bayes_scale_between.mat
    ├── bayes_scale_within.mat
    ├── fig1_group_vs_ind.m
    ├── fig1seed.mat
    ├── fig2_simEEG.m
    ├── fig3_diff_scaling.m
    ├── fig4_group_diffs.m
    ├── fig5_effectsize_examples.m
    ├── fig5seed.mat
    ├── fig6_effectsize_examples.m
    ├── fig6seed.mat
    ├── fig7_scaling.m
    ├── figsubjectalingment.mat
    ├── figsubjectprop.mat
    ├── generate_data.m
    ├── prev_curve_onesided.m
    ├── prevbayes_normal.mat
    ├── run_T_vs_N_bayes_contour.m
    ├── run_T_vs_N_ttest_power.m
    ├── run_bayesian_scaling.m
    ├── run_scaling_between.m
    ├── run_scaling_within.m
    └── tpow.mat
└── python
    ├── bayesprev
        ├── LICENSE
        ├── bayesprev.py
        └── pyproject.toml
    ├── bayesprev_example.py
    ├── example_csv.py
    └── example_data.csv

/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | MANIFEST
27 |
28 | # PyInstaller
29 | # Usually these files are written by a python script from a template
30 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 |
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 |
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .coverage
42 | .coverage.*
43 | .cache
44 | nosetests.xml
45 | coverage.xml
46 | *.cover
47 | .hypothesis/
48 | .pytest_cache/
49 |
50 | # Translations
51 | *.mo
52 | *.pot
53 |
54 | # Django stuff:
55 | *.log
56 | local_settings.py
57 | db.sqlite3
58 |
59 | # Flask stuff:
60 | instance/
61 | .webassets-cache
62 |
63 | # Scrapy stuff:
64 | .scrapy
65 |
66 | # Sphinx documentation
67 | docs/_build/
68 |
69 | # PyBuilder
70 | target/
71 |
72 | # Jupyter Notebook
73 | .ipynb_checkpoints
74 |
75 | # pyenv
76 | .python-version
77 |
78 | # celery beat schedule file
79 | celerybeat-schedule
80 |
81 | # SageMath parsed files
82 | *.sage.py
83 |
84 | # Environments
85 | .env
86 | .venv
87 | env/
88 | venv/
89 | ENV/
90 | env.bak/
91 | venv.bak/
92 |
93 | # Spyder project settings
94 | .spyderproject
95 | .spyproject
96 |
97 | # Rope project settings
98 | .ropeproject
99 |
100 | # mkdocs documentation
101 | /site
102 |
103 | # mypy
104 | .mypy_cache/
105 |
106 | # matlab
107 | *~
108 | *.swp
109 | *~
110 | *.DS_Store
111 | *.mex*
112 | *.asv
113 |
--------------------------------------------------------------------------------
/BayesianPrevalence.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/BayesianPrevalence.pdf
--------------------------------------------------------------------------------
/BayesianPrevalenceDifference.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/BayesianPrevalenceDifference.pdf
--------------------------------------------------------------------------------
/LICENSE:
-------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | Preamble 9 | 10 | The GNU General Public License is a free, copyleft license for 11 | software and other kinds of works. 12 | 13 | The licenses for most software and other practical works are designed 14 | to take away your freedom to share and change the works. By contrast, 15 | the GNU General Public License is intended to guarantee your freedom to 16 | share and change all versions of a program--to make sure it remains free 17 | software for all its users. We, the Free Software Foundation, use the 18 | GNU General Public License for most of our software; it applies also to 19 | any other work released this way by its authors. You can apply it to 20 | your programs, too. 21 | 22 | When we speak of free software, we are referring to freedom, not 23 | price. Our General Public Licenses are designed to make sure that you 24 | have the freedom to distribute copies of free software (and charge for 25 | them if you wish), that you receive source code or can get it if you 26 | want it, that you can change the software or use pieces of it in new 27 | free programs, and that you know you can do these things. 28 | 29 | To protect your rights, we need to prevent others from denying you 30 | these rights or asking you to surrender the rights. Therefore, you have 31 | certain responsibilities if you distribute copies of the software, or if 32 | you modify it: responsibilities to respect the freedom of others. 33 | 34 | For example, if you distribute copies of such a program, whether 35 | gratis or for a fee, you must pass on to the recipients the same 36 | freedoms that you received. 
You must make sure that they, too, receive 37 | or can get the source code. And you must show them these terms so they 38 | know their rights. 39 | 40 | Developers that use the GNU GPL protect your rights with two steps: 41 | (1) assert copyright on the software, and (2) offer you this License 42 | giving you legal permission to copy, distribute and/or modify it. 43 | 44 | For the developers' and authors' protection, the GPL clearly explains 45 | that there is no warranty for this free software. For both users' and 46 | authors' sake, the GPL requires that modified versions be marked as 47 | changed, so that their problems will not be attributed erroneously to 48 | authors of previous versions. 49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 
70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". "Licensees" and 82 | "recipients" may be individuals or organizations. 83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. 
If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. "Object code" means any non-source 116 | form of a work. 117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. 
For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 
174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 
209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. 
Conveying Non-Source Forms. 246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. 
If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 
309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 
336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 
360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 386 | those licensors and authors. 387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. 
If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 
428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. 
If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. The 475 | work thus licensed is called the contributor's "contributor version". 476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 
486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. "Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 
512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. 
If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 
578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 
613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | 621 | END OF TERMS AND CONDITIONS 622 | 623 | How to Apply These Terms to Your New Programs 624 | 625 | If you develop a new program, and you want it to be of the greatest 626 | possible use to the public, the best way to achieve this is to make it 627 | free software which everyone can redistribute and change under these terms. 628 | 629 | To do so, attach the following notices to the program. It is safest 630 | to attach them to the start of each source file to most effectively 631 | state the exclusion of warranty; and each file should have at least 632 | the "copyright" line and a pointer to where the full notice is found. 633 | 634 | 635 | Copyright (C) 636 | 637 | This program is free software: you can redistribute it and/or modify 638 | it under the terms of the GNU General Public License as published by 639 | the Free Software Foundation, either version 3 of the License, or 640 | (at your option) any later version. 641 | 642 | This program is distributed in the hope that it will be useful, 643 | but WITHOUT ANY WARRANTY; without even the implied warranty of 644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 645 | GNU General Public License for more details. 646 | 647 | You should have received a copy of the GNU General Public License 648 | along with this program. If not, see . 649 | 650 | Also add information on how to contact you by electronic and paper mail. 
651 | 652 | If the program does terminal interaction, make it output a short 653 | notice like this when it starts in an interactive mode: 654 | 655 | Copyright (C) 656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 657 | This is free software, and you are welcome to redistribute it 658 | under certain conditions; type `show c' for details. 659 | 660 | The hypothetical commands `show w' and `show c' should show the appropriate 661 | parts of the General Public License. Of course, your program's commands 662 | might be different; for a GUI interface, you would use an "about box". 663 | 664 | You should also get your employer (if you work as a programmer) or school, 665 | if any, to sign a "copyright disclaimer" for the program, if necessary. 666 | For more information on this, and how to apply and follow the GNU GPL, see 667 | . 668 | 669 | The GNU General Public License does not permit incorporating your program 670 | into proprietary programs. If your program is a subroutine library, you 671 | may consider it more useful to permit linking proprietary applications with 672 | the library. If this is what you want to do, use the GNU Lesser General 673 | Public License instead of this License. But first, please read 674 | . 
--------------------------------------------------------------------------------
/R/bayesprev.R:
--------------------------------------------------------------------------------

# nleqslv package required for HPDI optimization
library(nleqslv)

bayesprev_map <- function(k, n, a=0.05, b=1) {
  # Bayesian maximum a posteriori estimate of population prevalence gamma
  # under a uniform prior
  #
  # Args:
  #  k: number of participants/tests significant out of
  #  n: total number of participants/tests
  #  a: alpha value of within-participant test (default=0.05)
  #  b: sensitivity/beta of within-participant test (default=1)

  gm <- (k/n - a)/(b - a)
  if (gm < 0) gm <- 0
  if (gm > 1) gm <- 1
  return(gm)
}

bayesprev_posterior <- function(x, k, n, a=0.05, b=1) {
  # Bayesian posterior of population prevalence gamma under a uniform prior
  #
  # Args:
  #  x : values of gamma at which to evaluate the posterior density
  #  k : number of participants significant out of
  #  n : total number of participants
  #  a : alpha value of within-participant test (default=0.05)
  #  b : sensitivity/beta of within-participant test (default=1)

  m1 <- k + 1
  m2 <- n - k + 1
  theta <- a + (b - a)*x
  post <- (b - a)*dbeta(theta, m1, m2)
  post <- post/(pbeta(b, m1, m2) - pbeta(a, m1, m2))
  return(post)
}


bayesprev_bound <- function(p, k, n, a=0.05, b=1) {
  # Bayesian lower bound of population prevalence gamma under a uniform prior
  #
  # Args:
  #  p : posterior mass that should lie above the lower bound (e.g. 0.95)
  #  k : number of participants significant out of
  #  n : total number of participants
  #  a : alpha value of within-participant test (default=0.05)
  #  b : sensitivity/beta of within-participant test (default=1)

  m1 <- k + 1
  m2 <- n - k + 1
  th_c <- qbeta( p*pbeta(a, m1, m2) + (1-p)*pbeta(b, m1, m2), m1, m2 )
  g_c <- (th_c - a)/(b - a)
  return(g_c)
}


bayesprev_hpdi <- function(p, k, n, a=0.05, b=1) {
  # Bayesian highest posterior density interval of population prevalence gamma
  # under a uniform prior
  #
  # Args:
  #  p : HPDI to return (e.g. 0.95 for 95%)
  #  k : number of participants significant out of
  #  n : total number of participants
  #  a : alpha value of within-participant test (default=0.05)
  #  b : sensitivity/beta of within-participant test (default=1)

  m1 <- k+1
  m2 <- n-k+1

  if (m1 == 1) {
    endpts <- c(a, qbeta( (1-p)*pbeta(a, m1, m2) + p*pbeta(b, m1, m2), m1, m2 ) )
    return((endpts - a)/(b - a))
  }

  if (m2 == 1) {
    endpts <- c( qbeta( p*pbeta(a, m1, m2) + (1-p)*pbeta(b, m1, m2), m1, m2 ), b)
    return((endpts - a)/(b - a))
  }

  if (k <= n*a) {
    endpts <- c(a, qbeta( (1-p)*pbeta(a, m1, m2) + p*pbeta(b, m1, m2), m1, m2 ) )
    return((endpts - a)/(b - a))
  }

  if (k >= n*b) {
    endpts <- c( qbeta( p*pbeta(a, m1, m2) + (1-p)*pbeta(b, m1, m2), m1, m2 ), b)
    return((endpts - a)/(b - a))
  }


  # System solved for the interval end points x[1], x[2]:
  #  y[1] = 0 : the interval holds mass p of the truncated posterior
  #  y[2] = 0 : the density is equal at both end points
  g <- function(x, m1, m2, a, b, p ) {
    y <- numeric(2)
    y[1] <- pbeta(x[2], m1, m2) - pbeta(x[1], m1, m2) - p*(pbeta(b, m1, m2) - pbeta(a, m1, m2))
    y[2] <- log(dbeta(x[2], m1, m2)) - log(dbeta(x[1], m1, m2))
    return(y)
  }

  x_init <- numeric(2)

  p1 <- (1-p)/2
  p2 <- (1+p)/2

  x_init[1] <- qbeta( (1-p1)*pbeta(a, m1, m2) + p1*pbeta(b, m1, m2), m1, m2 )
  x_init[2] <- qbeta( (1-p2)*pbeta(a, m1, m2) + p2*pbeta(b, m1, m2), m1, m2 )

  opt <- nleqslv(x_init, g, method="Newton", control=list(maxit=1000), m1=m1, m2=m2, a=a, b=b, p=p)

  if (opt$termcd == 1) print("convergence achieved")
  if (opt$termcd != 1) print("failed to converge")

  temp <- opt$x
  # lower end point fell below a: clamp it to a (mirrors the k <= n*a case above)
  if (temp[1] < a) {
    temp[1] <- a
    temp[2] <- qbeta( (1-p)*pbeta(a, m1, m2) + p*pbeta(b, m1, m2), m1, m2 )
  }
  # upper end point exceeded b: clamp it to b (mirrors the k >= n*b case above)
  if (temp[2] > b) {
    temp[1] <- qbeta( p*pbeta(a, m1, m2) + (1-p)*pbeta(b, m1, m2), m1, m2 )
    temp[2] <- b
  }
  endpts <- (temp - a)/(b - a)
  return(endpts)
}


# Functions for Bayesian inference of difference in prevalence

bayes_prev_diff_between <- function(k1, n1, k2, n2, p, a = 0.05, b = 1, Nsamp){

  # Bayesian posterior inference for the difference in prevalence
  # when the same test is applied to two different groups

  # Inputs:
  #  k1 : number of participants significant in group 1, out of n1
  #  n1 : total number of participants in group 1
  #  k2 : number of participants significant in group 2, out of n2
  #  n2 : total number of participants in group 2
  #  p  : coverage for highest-posterior density interval (in [0 1])
  #  a  : alpha value of within-participant test (default=0.05)
  #  b  : sensitivity/beta of within-participant test (default=1)
  #  Nsamp : number of samples from the posterior

  # Outputs:
  #  map : maximum a posteriori estimate of the difference in prevalence:
  #        gamma_1 - gamma_2
  #  hpdi : highest-posterior density interval with coverage p
  #  probGT: estimated posterior probability that the prevalence is higher in group 1
  #  logoddsGT: estimated log odds in favour of the hypothesis that the prevalence
  #             is higher in group 1
  #  gpost : ggplot object of posterior distribution of the prevalence difference

  # Load the necessary libraries

  library(HDInterval)
  library(ggplot2)

  # Parameters for uniform priors

  r1 = 1 ; s1 = 1 ; r2 = 1 ; s2 = 1

  # Parameters for Beta posteriors

  m11 = k1 + r1 ; m12 = n1 - k1 + s1
  m21 = k2 + r2 ; m22 = n2 - k2 + s2

  # Generate truncated Beta values for theta_1

  vec1 = runif(Nsamp, pbeta(a, m11, m12), pbeta(b, m11, m12))
  th1 = qbeta(vec1, m11, m12)

  # Generate truncated Beta values for theta_2

  vec2 = runif(Nsamp, pbeta(a, m21, m22), pbeta(b, m21, m22))
  th2 = qbeta(vec2, m21, m22)

  # Compute vector of estimates of prevalence difference

  delta = (th1 - th2)/ (b-a)

  # Estimate the posterior probability, and logodds, that the prevalence is higher for group 1.
  # Laplace's rule of succession used to avoid estimates of 0 or 1


  probGT = (sum(delta > 0)+1)/(Nsamp +2)

  logoddsGT = log(probGT / (1 - probGT))

  # Compute the HPD interval, coverage = p

  hpdi = hdi(delta, credMass = p)

  # Approximate MAP estimate

  dens = density(delta, bw = "SJ")
  map = dens$x[dens$y == max(dens$y)]

  # Produce a plot of the posterior density

  fill <- "gold1"
  line <- "goldenrod2"
  dd = data.frame(delta)

  gpost <- ggplot(dd, aes(x=delta)) +
    geom_density(fill = fill, colour = line, bw ="SJ") +
    scale_x_continuous(name = "Prevalence difference",
                       limits=c(-1, 1)) +
    scale_y_continuous(name = "Posterior density") +
    ggtitle("Posterior density of difference in Prevalence")

  return(list(map = map, hpdi = hpdi, probGT = probGT, logoddsGT = logoddsGT, gpost = gpost))
}


bayesprev_diff_within <- function(k11, k10, k01, n, p, a = 0.05, b = 1, Nsamp){

  # Bayesian posterior inference for the difference in prevalence
  # when two different tests are applied to the same group


  # Inputs:
  #  k11 : number of participants significant in both tests
  #  k01 : number of participants significant in test 2 and not test 1
  #  k10 : number of participants significant in test 1 and not test 2
  #  n   : total number of participants
  #  p   : coverage for highest-posterior density interval (in [0 1])
  #  a   : alpha value of within-participant test (default=0.05)
  #  b   : sensitivity/beta of within-participant test (default=1)
  #  Nsamp : number of samples from the posterior

  # Outputs:
  #  map : maximum a posteriori estimate of the difference in prevalence:
  #        gamma_1 - gamma_2
  #  hpdi : highest-posterior density interval with coverage p
  #  probGT: estimated posterior probability that the prevalence is higher on test 1
  #  logoddsGT: estimated log odds in favour of the hypothesis that the prevalence
  #             is higher on test 1
  #  gpost : ggplot object of the posterior distribution

  # Load the necessary libraries

  library(HDInterval)
  library(ggplot2)

  # Define the parameters for the Dirichlet prior distribution --
  # here giving a uniform distribution

  r11 = 1; r10 = 1; r01 = 1; r00 = 1

  # Compute the parameters for the posterior Dirichlet distribution

  k00 = n - k11 - k10 - k01

  m11 = k11 + r11 ; m10 = k10 + r10
  m01 = k01 + r01 ; m00 = k00 + r00


  # Compute samples from the truncated Dirichlet posterior

  z11 = runif(Nsamp, pbeta(0, m11, m10 + m01 + m00), pbeta(b, m11, m10 + m01 + m00))
  u11 = qbeta(z11, m11, m10 + m01 + m00); th11 = u11

  lo = pmax( (a - th11)/(1-th11), 0); hi = (b - th11)/(1-th11)
  z10 = runif(Nsamp, pbeta(lo, m10, m01 + m00), pbeta(hi, m10, m01 + m00))
  u10 = qbeta(z10, m10, m01 + m00); th10 = (1 - th11)*u10


  lo = pmax( (a - th11)/(1-th11 - th10), 0); hi = pmin( (b - th11)/(1-th11- th10), 1 )
  z01 = runif(Nsamp, pbeta(lo, m01, m00), pbeta(hi, m01, m00))
  u01 = qbeta(z01, m01, m00); th01 = (1 -th11 - th10)*u01

  th00 = 1 - th11 -th10 - th01

  theta = cbind(th11, th10, th01, th00)

  # Compute vector of estimates of prevalence difference

  delta = (th10 - th01)/(b-a)

  # Estimate the posterior probability, and logodds, that the prevalence is higher on test 1.
  # Laplace's rule of succession used to avoid estimates of 0 or 1

  probGT = (sum(delta > 0) +1)/(Nsamp +2)

  logoddsGT = log(probGT / (1 - probGT))


  # Compute the HPD interval, coverage = p

  hpdi = hdi(delta, credMass = p)

  # Approximate MAP estimate

  dens = density(delta, bw = "SJ")
  map = dens$x[dens$y == max(dens$y)]

  # Produce a plot of the posterior density

  fill <- "gold1"
  line <- "goldenrod2"

  dd = data.frame(delta)

  gpost <- ggplot(dd, aes(x=delta)) +
    geom_density(fill = fill, colour = line, bw ="SJ") +
    scale_x_continuous(name = "Prevalence difference",
                       limits=c(-1, 1)) +
    scale_y_continuous(name = "Posterior density") +
    ggtitle("Posterior density of difference in Prevalence")

  return(list(map = map, hpdi = hpdi, probGT = probGT, logoddsGT = logoddsGT, gpost = gpost))
}

sim_binom <- function(g1, n1, g2, n2, a = 0.05, b = 1){

  # Simulation of random binomial data for each of two groups
  # who undertake the same test

  th1 = a + (b -a) * g1
  th2 = a + (b -a) * g2

  k1 = rbinom(1, n1, th1)
  k2 = rbinom(1, n2, th2)

  return(list(k1 = k1, k2 = k2))
}


sim_multinom <- function(g1, g2, r12, n, a = 0.05, b = 1){

  # Simulation of random multinomial data for the case of
  # prevalence difference between tests, with a single group

  # Input:

  #  g1 : prevalence of the effect associated with test 1
  #  g2 : prevalence of the effect associated with test 2
  #  r12 : correlation between the presence of the effect on test 1 and
  #        the presence of the effect on test 2.
  #  n : total number of participants
  #  a : alpha value of within-participant test (default=0.05)
  #  b : sensitivity/beta of within-participant test (default=1)

  # Output:

  #  k11, k10, k01, k00 : the numbers of participants who provide
  #         a significant result on both tests, on only the first test,
  #         on only the second test, on neither test, respectively


  # Compute the prevalence of the effect on both tests

  g11 = g1 * g2 + r12 * sqrt( g1 *(1-g1) *g2 *(1-g2))
  g10 = g1 - g11
  g01 = g2 - g11
  g00 = 1 - g11 - g10 - g01

  # Compute the parameters of the Multinomial distribution

  th11 = b^2 * g11 + a * b * g10 + a * b * g01 + a^2 * g00
  th10 = a + (b-a)* g1 - th11
  th01 = a + (b-a)* g2 - th11
  th00 = 1 - th11 - th10 - th01

  theta = c(th11, th10, th01, th00)

  mdat = as.vector(rmultinom(1, n, theta))

  return(list(k11 = mdat[1], k10 = mdat[2], k01 = mdat[3], k00 = mdat[4]))
}


--------------------------------------------------------------------------------
/R/bayesprev_example.R:
--------------------------------------------------------------------------------
# Example of how to use Bayesian prevalence functions
#
# 1. Simulate or load within-participant raw experimental data
# 2. LEVEL 1: Apply statistical test at the individual level
# 3. LEVEL 2: Apply Bayesian Prevalence to the outcomes of Level 1

source("bayesprev.R")

#
# 1. Simulate or load within-participant raw experimental data
#

# 1.1. Simulate within-participant raw experimental data
Nsub <- 20     # number of participants
Nsamp <- 100   # trials/samples per participant
sigma_w <- 10  # within-participant SD
sigma_b <- 2   # between-participant SD
mu_g <- 1      # population mean

# per participant mean drawn from population normal distribution
submeanstrue <- rnorm(Nsub, mu_g, sigma_b)
# rawdat holds trial data for each participant
rawdat = matrix(0, nrow=Nsamp, ncol=Nsub)
for( si in 1:Nsub ){
  # generate trials for each participant
  rawdat[, si] = rnorm(Nsamp, submeanstrue[si], sigma_w)
}

# 1.2. Load within-participant raw experimental data
# Load your own data into the variable rawdat with dimensions [Nsamp Nsub],
# setting Nsamp and Nsub accordingly.

#
# 2. LEVEL 1
#

# 2.1. Within-participant statistical test
# This loop performs a within-participant statistical test: here a t-test for
# non-zero mean, which is the simplest statistical test. In general, any
# statistical test can be used at Level 1.

p <- vector(mode="numeric",length=Nsub)
for( si in 1:Nsub ){
  t = t.test(rawdat[,si], mu=0)
  # extract the p-value as a plain number (the third element of the htest object)
  p[si] = t$p.value
}
# p holds p-values of test for each participant
alpha = 0.05
indsig = p<alpha

[...]

`g1-g2>0`, log odds `g1>g2`, and posterior samples for the population prevalence difference `g1-g2` between two populations.

`bayesprev_diff_within(k11,k10,k01,n,p,a,b,Nsamp)` *Matlab*, *R*
`bayesprev.diff_within(k11,k10,k01,n,p,a,b,Nsamp)` *Python*
Returns MAP, kernel density fit to posterior density, HPDI with coverage `p`, posterior probability that `g1-g2>0`, log odds in favour of `g1>g2`, and posterior samples for the population prevalence difference `g1-g2` for two tests applied to the same sample.
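
As a rough illustration of what the within-sample difference computes, the sketch below draws from the Dirichlet posterior over the four outcome probabilities (both tests significant, only test 1, only test 2, neither) under a uniform prior, keeps only draws whose marginal positive rates lie in `[a, b]`, and rescales `th10 - th01` by `b - a`. It is not the package's implementation: the function name `diff_within_sketch`, its `seed` argument and the example counts are assumptions made for illustration, and it uses plain rejection sampling from the standard library rather than the sequential inverse-CDF sampling of truncated Beta variables used in the R and Matlab sources.

```python
import math
import random


def diff_within_sketch(k11, k10, k01, n, a=0.05, b=1.0, nsamp=5000, seed=0):
    """Rejection-sampling sketch of the within-sample prevalence difference.

    Hypothetical helper for illustration; not part of the bayesprev package.
    """
    k00 = n - k11 - k10 - k01
    if k00 < 0:
        raise ValueError("k11 + k10 + k01 must not exceed n")
    rng = random.Random(seed)
    # Posterior Dirichlet parameters under a uniform Dirichlet(1, 1, 1, 1) prior.
    m = (k11 + 1, k10 + 1, k01 + 1, k00 + 1)
    samples = []
    while len(samples) < nsamp:
        # A Dirichlet draw is a vector of normalised Gamma(m_i, 1) variates.
        g = [rng.gammavariate(mi, 1.0) for mi in m]
        total = sum(g)
        th11, th10, th01 = g[0] / total, g[1] / total, g[2] / total
        # Keep the draw only if both marginal positive rates lie in [a, b].
        # (Acceptance can be slow if the data sit far outside that range.)
        if a <= th11 + th10 <= b and a <= th11 + th01 <= b:
            samples.append((th10 - th01) / (b - a))
    # Laplace's rule of succession keeps the estimate away from 0 and 1.
    prob_gt = (sum(d > 0 for d in samples) + 1) / (nsamp + 2)
    log_odds_gt = math.log(prob_gt / (1 - prob_gt))
    return samples, prob_gt, log_odds_gt


# Made-up counts: 10 significant on both tests, 20 on test 1 only, 5 on test 2 only.
samples, prob_gt, log_odds_gt = diff_within_sketch(k11=10, k10=20, k01=5, n=50)
print(f"P(g1 > g2) ~ {prob_gt:.3f}, log odds ~ {log_odds_gt:.2f}")
```

Rejection sampling is chosen here only because it needs nothing beyond the standard library; for real analyses the packaged functions listed above are the ones to use.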

## Installation

### Python

`pip install bayesprev`

### Matlab

Clone or download this repository, then add the `matlab` folder to the Matlab path:

`addpath('/path/to/bayesprev/matlab')`

### R

Clone or download this repository, then in R:

`source("bayesprev.R")`

## Contact

Robin Ince, robince@gmail.com (Python code, MATLAB code)
Jim Kay, jimkay049@gmail.com (R code, Technical note on Bayesian prevalence)

--------------------------------------------------------------------------------
/matlab/bayesprev_bound.m:
--------------------------------------------------------------------------------
function g_c = bayesprev_bound(p, k, n, a, b)
% Bayesian lower bound of population prevalence gamma
% under a uniform prior
%
% p : posterior mass that should lie above the lower bound (e.g. 0.95)
% k : number of participants significant out of
% n : total number of participants
% a : alpha value of within-participant test (default=0.05)
% b : sensitivity/beta of within-participant test (default=1)

if nargin<=4
    b = 1;
end
if nargin<=3
    a = 0.05;
end

% gamma prior = Beta(r,s)
r = 1;
s = 1;

b1 = k+r;
b2 = n-k+s;
cdfp = (1-p)*betacdf(b,b1,b2) + p*betacdf(a,b1,b2);
the_c = betainv(cdfp,b1,b2);
g_c = (the_c - a)./(b-a);

--------------------------------------------------------------------------------
/matlab/bayesprev_diff_between.m:
--------------------------------------------------------------------------------
function [map, post_x, post_p, hpi, probGT, logoddsGT, samples] = bayesprev_diff_between(k1, n1, k2, n2, p, a, b, Nsamp)
% Bayesian maximum a posteriori estimate of the difference in prevalence
% when the same test is applied to two groups
%
% k1 : number of participants significant in group 1 out of
% n1 : total number of participants in group 1
% k2 : number of participants significant in group 2 out of
% n2 : total number of participants in group 2
% p  : coverage for highest-posterior density interval (in [0 1])
% a  : alpha value of within-participant test (default=0.05)
% b  : sensitivity/beta of within-participant test (default=1)
% Nsamp : number of samples from the posterior
%
% Outputs:
% map : maximum a posteriori estimate of the difference in prevalence:
%       gamma_1 - gamma_2
% post_x : x-axis for kernel density fit of posterior distribution of the
%       above
% post_p : posterior density from kernel density fit, evaluated at post_x
% hpi : highest-posterior density interval with coverage p
% probGT : estimated posterior probability that the prevalence is higher in group 1
% logoddsGT : estimated log odds in favour of the hypothesis that the prevalence is
%       higher in group 1
% samples : posterior samples
if nargin<=6
    b = 1;
end
if nargin<=5
    a = 0.05;
end
if nargin<=4
    p = 0.96;
end
if nargin<=7
    Nsamp = 10000;
end

% gamma priors = Beta(r,s)
r1 = 1;
s1 = 1;
r2 = 1;
s2 = 1;

the1d = makedist('Beta','a',k1+r1,'b',n1-k1+s1);
the1d = the1d.truncate(a,b);

the2d = makedist('Beta','a',k2+r2,'b',n2-k2+s2);
the2d = the2d.truncate(a,b);

the1samp = the1d.random(Nsamp,1);
the2samp = the2d.random(Nsamp,1);

g_diff_samples = (the1samp - the2samp)./(b-a);

x = linspace(-1,1,200);
g_diff_post = ksdensity(g_diff_samples,x);

[~, idx] = max(g_diff_post);
g_diff_map = x(idx);

map = g_diff_map;
post_p = g_diff_post;
post_x = x;
samples = g_diff_samples;
hpi = hpdi(g_diff_samples,100*p);
% Estimate the posterior probability, and logodds, that the prevalence is higher for group 1.
% Laplace's rule of succession used to avoid estimates of 0 or 1
probGT = (sum(samples>0)+1)/(Nsamp+2);
logoddsGT = log(probGT / (1-probGT));

function hpdi = hpdi(x, p)
% HPDI - Estimates the Bayesian HPD intervals
%
%   Y = HPDI(X,P) returns a Highest Posterior Density (HPD) interval
%   for each column of X. P must be a scalar. Y is a 2 row matrix
%   where ith column is HPDI for ith column of X.

%   References:
%    [1] Chen, M.-H., Shao, Q.-M., and Ibrahim, J. Q., (2000).
%        Monte Carlo Methods in Bayesian Computation. Springer-Verlag.

% Copyright (C) 2001 Aki Vehtari
%
% This software is distributed under the GNU General Public
% Licence (version 2 or later); please refer to the file
% Licence.txt, included with the software, for details.

if nargin < 2
    error('Not enough arguments')
end

m=size(x,2);
pts=linspace(0,100-p,100);
pt1=prctile(x,pts);
pt2=prctile(x,p+pts);
cis=abs(pt2-pt1);
[~,hpdpi]=min(cis);
if m==1
    hpdi=[pt1(hpdpi); pt2(hpdpi)];
else
    hpdpi=sub2ind(size(pt1),hpdpi,1:m);
    hpdi=[pt1(hpdpi); pt2(hpdpi)];
end

--------------------------------------------------------------------------------
/matlab/bayesprev_diff_within.m:
--------------------------------------------------------------------------------
function [map, post_x, post_p, hpi, probGT, logoddsGT, samples] = bayesprev_diff_within(k11, k10, k01, n, p, a, b, Nsamp)
% Bayesian maximum a posteriori estimate of the difference in prevalence
% when two tests are applied to the same group
%
% k11 : number of participants significant in both tests
% k10 : number of participants significant in test 1 and not test 2
% k01 : number of participants significant in test 2 and not test 1
% n   : total number of participants
% p   : coverage for highest-posterior density interval (in [0 1])
% a   : alpha value of within-participant test (default=0.05)
%       can be a length 2 array for the two tests if different
% b   : sensitivity/beta of within-participant test (default=1)
% Nsamp : number of samples from the posterior
%
% Outputs:
% map : maximum a posteriori estimate of the difference in prevalence:
%       gamma_1 - gamma_2
% post_x : x-axis for kernel density fit of posterior distribution of the
%       above
% post_p : posterior density from kernel density fit, evaluated at post_x
% hpi : highest-posterior density interval with coverage p
% probGT : estimated posterior probability that the prevalence is higher on test 1
% logoddsGT : estimated log odds in favour of the hypothesis that the prevalence is
%       higher on test 1
% samples : posterior samples

if nargin<=6
    b = 1;
end
if nargin<=5
    a = 0.05;
end
if nargin<=4
    p = 0.96;
end
if nargin<=7
    Nsamp = 10000;
end

% default both tests have the same properties
if length(a)==1
    a = repmat(a,1,2);
end
if length(b)==1
    b = repmat(b,1,2);
end

% gamma priors, Dirichlet parameters
r11 = 1;
r10 = 1;
r01 = 1;
r00 = 1;

% posterior dirichlet parameters
k00 = n - k11 - k01 - k10;

if k00<0
    error("Input test results don't sum to n")
end

m11 = k11 + r11;
m10 = k10 + r10;
m01 = k01 + r01;
m00 = k00 + r00;

the11d = makedist('Beta','a',m11,'b',m01+m10+m00);
the11d = the11d.truncate(0,min(b));

the10d = makedist('Beta','a',m10,'b',m01+m00);
the01d = makedist('Beta','a',m01,'b',m00);

the11 = the11d.random(Nsamp,1);

low = max((a(1)-the11)./(1-the11),0);
high = (b(1)-the11)./(1-the11);
zlow = the10d.cdf(low);
zhigh = the10d.cdf(high);
z10 = zlow + (zhigh-zlow).*rand(Nsamp,1);
u10 = the10d.icdf(z10);
the10 = (1-the11).*u10;

82 | low = max((a(2)-the11)./(1-the11-the10), 0); 83 | high = min((b(2)-the11)./(1-the11-the10), 1); 84 | zlow = the01d.cdf(low); 85 | zhigh = the01d.cdf(high); 86 | z01 = zlow + (zhigh-zlow).*rand(Nsamp,1); 87 | u01 = the01d.icdf(z01); 88 | the01 = (1-the11-the10).*u01; 89 | 90 | g_diff_samples = (the11 + the10 - a(1))./(b(1)-a(1)) - (the11 + the01 - a(2))./(b(2)-a(2)); 91 | 92 | x = linspace(-1,1,200); 93 | g_diff_post = ksdensity(g_diff_samples,x); 94 | 95 | [~, idx] = max(g_diff_post); 96 | g_diff_map = x(idx); 97 | 98 | map = g_diff_map; 99 | post_p = g_diff_post; 100 | post_x = x; 101 | samples = g_diff_samples; 102 | hpi = hpdi(g_diff_samples,100*p); 103 | % Estimate the posterior probability, and logodds, that the prevalence is higher for group 1. 104 | % Laplace's rule of succession used to avoid estimates of 0 or 1 105 | probGT = (sum(samples>0)+1)/(Nsamp+2); 106 | logoddsGT = log(probGT / (1-probGT)); 107 | 108 | 109 | 110 | function hpdi = hpdi(x, p) 111 | % HPDI - Estimates the Bayesian HPD intervals 112 | % 113 | % Y = HPDI(X,P) returns a Highest Posterior Density (HPD) interval 114 | % for each column of X. P must be a scalar. Y is a 2 row matrix 115 | % where ith column is HPDI for ith column of X. 116 | 117 | % References: 118 | % [1] Chen, M.-H., Shao, Q.-M., and Ibrahim, J. Q., (2000). 119 | % Monte Carlo Methods in Bayesian Computation. Springer-Verlag. 120 | 121 | % Copyright (C) 2001 Aki Vehtari 122 | % 123 | % This software is distributed under the GNU General Public 124 | % Licence (version 2 or later); please refer to the file 125 | % Licence.txt, included with the software, for details. 
126 | 127 | if nargin < 2 128 | error('Not enough arguments') 129 | end 130 | 131 | m=size(x,2); 132 | pts=linspace(0,100-p,100); 133 | pt1=prctile(x,pts); 134 | pt2=prctile(x,p+pts); 135 | cis=abs(pt2-pt1); 136 | [~,hpdpi]=min(cis); 137 | if m==1 138 | hpdi=[pt1(hpdpi); pt2(hpdpi)]; 139 | else 140 | hpdpi=sub2ind(size(pt1),hpdpi,1:m); 141 | hpdi=[pt1(hpdpi); pt2(hpdpi)]; 142 | end 143 | -------------------------------------------------------------------------------- /matlab/bayesprev_example.m: -------------------------------------------------------------------------------- 1 | % Example of how to use Bayesian prevalence functions 2 | % 3 | % 1. Simulate or load within-participant raw experimental data 4 | % 2. LEVEL 1: Apply statistical test at the individual level 5 | % 3. LEVEL 2: Apply Bayesian Prevalence to the outcomes of Level 1 6 | 7 | % 8 | % 1. Simulate or load within-participant raw experimental data 9 | % 10 | 11 | % 1.1. Simulate within-participant raw experimental data 12 | Nsub = 20; % number of participants 13 | Nsamp = 100; % trials/samples per participant 14 | sigma_w = 10; % within-participant SD 15 | sigma_b = 2; % between-participant SD 16 | mu_g = 1; % population mean 17 | 18 | % per participant mean drawn from population normal distribution 19 | submeanstrue = normrnd(mu_g, sigma_b, [Nsub 1]); 20 | % rawdat holds trial data for each participant 21 | rawdat = zeros(Nsamp, Nsub); 22 | for si=1:Nsub 23 | % generate trials for each participant 24 | rawdat(:,si) = normrnd(submeanstrue(si), sigma_w, [Nsamp 1]); 25 | end 26 | 27 | % 1.2. Load within-participant raw experimental data 28 | % Load your own data into the variable rawdat with dimensions [Nsamp Nsub], 29 | % setting Nsamp and Nsub accordingly. 30 | 31 | % 32 | % 2. LEVEL 1 33 | % 34 | 35 | % 2.1. Within-participant statistical test 36 | % The loop performs a within-participant statistical test. Here, a t-test for 37 | % non-zero mean which is the simplest statistical test.
In general, any 38 | % statistical test can be used at Level 1. 39 | 40 | indsig = zeros(1,Nsub); 41 | for si=1:Nsub 42 | % within-participant t-test significance 43 | [indsig(si) p ci stats] = ttest(rawdat(:,si)); 44 | end 45 | % the binary variable indsig indicates whether the within-participant 46 | % t-test is, or not, significant for each participant (1 entry for each 47 | % participant) 48 | 49 | % 2.2. Loading within-participant statistical test. 50 | % You can also load your own within-participant statistical test results here. 51 | % Load binary results into binary indsig vector with one entry per participant. 52 | % See also example_csv.m for an example. 53 | 54 | % 55 | % 3. LEVEL 2 56 | % 57 | 58 | % Bayesian prevalence inference is performed with three numbers: 59 | % k, the number of significant participants (e.g. sum of binary indicator 60 | % variable) 61 | % n, the number of participants in the sample 62 | % alpha, the false positive rate 63 | k = sum(indsig); 64 | n = Nsub; 65 | alpha = 0.05; % default value see 'help ttest' 66 | 67 | % plot posterior distribution of population prevalence 68 | figure 69 | co = get(gca,'ColorOrder'); ci=1; 70 | hold on 71 | 72 | x = linspace(0,1,100); 73 | posterior = bayesprev_posterior(x,k,n,alpha); 74 | plot(x, posterior,'Color',co(ci,:)); 75 | 76 | % add MAP as a point 77 | xmap = bayesprev_map(k,n,alpha); 78 | pmap = bayesprev_posterior(xmap,k,n,alpha); 79 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(ci,:)); 80 | 81 | % add lower bound as a vertical line 82 | bound = bayesprev_bound(0.95,k,n,alpha); 83 | line([bound bound], [0 bayesprev_posterior(bound,k,n,alpha)],'Color',co(ci,:),'LineStyle',':') 84 | 85 | % add 95% HPDI 86 | oil = 2; 87 | iil = 4; 88 | h = bayesprev_hpdi(0.95,k,n,alpha); 89 | plot([h(1) h(2)],[pmap pmap],'Color',co(ci,:),'LineWidth',oil) 90 | % add 50% HPDI 91 | h = bayesprev_hpdi(0.5,k,n,alpha); 92 | plot([h(1) h(2)],[pmap pmap],'Color',co(ci,:),'LineWidth',iil) 93 | 94 | 
xlabel('Population prevalence proportion') 95 | ylabel('Posterior density') 96 | -------------------------------------------------------------------------------- /matlab/bayesprev_hpdi.m: -------------------------------------------------------------------------------- 1 | function hpdi = bayesprev_hpdi(p, k, n, a, b) 2 | % Bayesian highest posterior density interval of population prevalence gamma 3 | % under a uniform prior 4 | % 5 | % p : HPDI to return (e.g. 0.95 for 95%) 6 | % k : number of participants significant out of 7 | % n : total number of participants 8 | % a : alpha value of within-participant test (default=0.05) 9 | % b : sensitivity/beta of within-participant test (default=1) 10 | 11 | if nargin<=4 12 | b = 1; 13 | end 14 | if nargin<=3 15 | a = 0.05; 16 | end 17 | if nargin<=3 18 | inc = 0.01; 19 | end 20 | 21 | % gamma prior = Beta(r,s) 22 | r = 1; 23 | s = 1; 24 | 25 | td = makedist('Beta','a',k+r,'b',n-k+s); 26 | td = td.truncate(a,b); 27 | 28 | if k==0 29 | x = [a td.icdf(p)]; 30 | elseif k==n 31 | x = [td.icdf(1-p) b]; 32 | else 33 | f = @(x) [td.cdf(x(2))-td.cdf(x(1))-p, td.pdf(x(2))-td.pdf(x(1))]; 34 | opt.Display = 'off'; 35 | opt.FunctionTolerance = 1e-10; 36 | [x, fval, exitflag, output] = fsolve(f, [td.icdf((1-p)/2) td.icdf((1+p)/2)],opt); 37 | end 38 | 39 | % limit to valid theta values 40 | if x(1) < a 41 | x = [a td.icdf(p)]; 42 | end 43 | if x(2) > b 44 | x = [td.icdf(1-p) b]; 45 | end 46 | hpdi = (x-a)./(b-a); 47 | -------------------------------------------------------------------------------- /matlab/bayesprev_logodds.m: -------------------------------------------------------------------------------- 1 | function lo = bayesprev_logodds(k, n, x, a, b) 2 | % Posterior log-odds in favor of the population prevalence gamma being 3 | % greater than x 4 | % 5 | % k : number of participants significant out of 6 | % n : total number of participants 7 | % x : log-odds threshold (default=0.5) 8 | % a : alpha value of within-participant test (default=0.05) 9 | % b : sensitivity/beta of
within-participant test (default=1) 10 | 11 | if nargin<=4 12 | b = 1; 13 | end 14 | if nargin<=3 15 | a = 0.05; 16 | end 17 | if nargin<=2 18 | x = 0.5; 19 | end 20 | 21 | % gamma prior = Beta(r,s) 22 | r = 1; 23 | s = 1; 24 | 25 | the = a + (b-a)*x; 26 | 27 | % d = makedist('Beta','a',k+r,'b',n-k+s); 28 | % d.truncate(a,b); 29 | % p = 1-d.cdf(the); 30 | 31 | b1 = k+r; 32 | b2 = n-k+s; 33 | p = (betacdf(b,b1,b2)-betacdf(the,b1,b2))./(betacdf(b,b1,b2)-betacdf(a,b1,b2)); 34 | lo = log(p/(1-p)); 35 | -------------------------------------------------------------------------------- /matlab/bayesprev_map.m: -------------------------------------------------------------------------------- 1 | function gamma = bayesprev_map(k, n, a, b) 2 | % Bayesian maximum a posteriori estimate of population prevalence gamma 3 | % under a uniform prior 4 | % 5 | % k : number of participants significant out of 6 | % n : total number of participants 7 | % a : alpha value of within-participant test (default=0.05) 8 | % b : sensitivity/beta of within-participant test (default=1) 9 | 10 | if nargin<=3 11 | b = 1; 12 | end 13 | if nargin<=2 14 | a = 0.05; 15 | end 16 | 17 | % gamma prior = Beta(r,s) 18 | r = 1; 19 | s = 1; 20 | 21 | theta = (k+r-1)./(n+r+s-2); 22 | if theta<=a 23 | gamma = 0; 24 | elseif theta>=b 25 | gamma = 1; 26 | else 27 | gamma = (theta-a)/(b-a); 28 | end 29 | 30 | -------------------------------------------------------------------------------- /matlab/bayesprev_plotposterior.m: -------------------------------------------------------------------------------- 1 | function h = bayesprev_plotposterior(k, n, a) 2 | % Helper function to plot posterior distribution with MAP, 50% and 96% HPDI 3 | % and 1st percentile of posterior. 
4 | % 5 | % k : number of participants significant out of 6 | % n : total number of participants 7 | % a : alpha value of within-participant test (default=0.05) 8 | 9 | 10 | b = 1; 11 | if nargin<3 12 | a = 0.05; 13 | end 14 | 15 | figure 16 | 17 | % widths of HPDI indicators 18 | oil = 3; 19 | iil = 10; 20 | % yaxis height of HPDI indicator 21 | yp = 0.15; 22 | 23 | x = linspace(0,1,100); 24 | lw = 2; 25 | 26 | co = get(gca,'ColorOrder'); 27 | cidx = get(gca,'ColorOrderIndex'); 28 | 29 | lh(1) = plot(x, bayesprev_posterior(x, k, n, a, b),'LineWidth',lw,'Color',co(cidx,:)); 30 | hold on 31 | xmap = bayesprev_map(k,n, a, b); 32 | pmap = bayesprev_posterior(xmap,k,n, a, b); 33 | h96 = bayesprev_hpdi(0.96,k,n, a, b); 34 | 35 | plot(xmap, yp,'.','MarkerSize',60,'Color',co(cidx,:)); 36 | 37 | plot([h96(1) h96(2)],[yp yp],'LineWidth',oil,'Color',co(cidx,:)) 38 | h50 = bayesprev_hpdi(0.5,k,n, a, b); 39 | plot([h50(1) h50(2)],[yp yp],'LineWidth',iil,'Color',co(cidx,:)) 40 | % xline(xmap,'k') 41 | box off 42 | 43 | xlabel('Prevalence Proportion') 44 | ylabel('Posterior Density') 45 | 46 | title(sprintf('Posterior Prevalence from %d / %d at a=%.02f',k,n,a)) 47 | 48 | % 1st percentile 49 | lb1 = bayesprev_bound(0.99,k,n,a); 50 | xline(lb1,'Color',co(cidx,:)) 51 | 52 | fprintf(1,'\n %d / %d significant at a=%0.2f\n',k,n,a) 53 | fprintf(1,'MAP [96%% HPDI]: %.3f [%.3f %.3f] [50%% HPDI]: [%.3f %.3f]\n',xmap,h96(1),h96(2),h50(1),h50(2)) 54 | fprintf(1,'1st percentile: %.3f\n\n', lb1) -------------------------------------------------------------------------------- /matlab/bayesprev_posterior.m: -------------------------------------------------------------------------------- 1 | function post = bayesprev_posterior(x, k, n, a, b) 2 | % Bayesian posterior of population prevalence gamma 3 | % under a uniform prior 4 | % 5 | % x : values of gamma at which to evaluate the posterior density 6 | % k : number of participants significant out of 7 | % n : total number of participants 8 | % a :
alpha value of within-participant test (default=0.05) 9 | % b : sensitivity/beta of within-participant test (default=1) 10 | 11 | if nargin<=4 12 | b = 1; 13 | end 14 | if nargin<=3 15 | a = 0.05; 16 | end 17 | 18 | if any(x<0) || any(x>1) 19 | error('prev_posterior: requested value out of range') 20 | end 21 | % gamma prior = Beta(r,s) 22 | r = 1; 23 | s = 1; 24 | 25 | theta = a+(b-a).*x; 26 | % d = makedist('Beta','a',k+r,'b',n-k+s); 27 | % d.truncate(a,b); 28 | % post = d.pdf(x); 29 | post = (b-a).*betapdf(theta, k+r, n-k+s); 30 | post = post ./ (betacdf(b, k+r, n-k+s) - betacdf(a, k+r, n-k+s)); 31 | 32 | -------------------------------------------------------------------------------- /matlab/currfig1seed.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/matlab/currfig1seed.mat -------------------------------------------------------------------------------- /matlab/example_csv.m: -------------------------------------------------------------------------------- 1 | % 2 | % Example of loading data from a CSV and applying Bayesian prevalence 3 | % second level 4 | 5 | % example_data.csv is a file with one binary value for each experimental 6 | % unit (participant, neuron, voxel etc.) where 1 indicates the within-unit 7 | % null hypothesis was rejected and 0 indicates it was not rejected 8 | 9 | % load the data 10 | sigdat = csvread('example_data.csv'); 11 | 12 | alpha = 0.05; % this specifies the alpha value used for the within-unit tests 13 | Ntests = numel(sigdat); % number of tests (e.g. 
participants) 14 | Nsigtests = sum(sigdat(:)); % number of significant tests 15 | 16 | 17 | % example of Bayesian prevalence analyses on a single plot 18 | figure 19 | k = Nsigtests; 20 | n = Ntests; 21 | co = get(gca,'ColorOrder'); ci=1; 22 | hold on 23 | 24 | x = linspace(0,1,100); 25 | posterior = bayesprev_posterior(x,k,n,alpha); 26 | plot(x, posterior,'Color',co(ci,:)); 27 | 28 | % add MAP as a point 29 | xmap = bayesprev_map(k,n,alpha); 30 | pmap = bayesprev_posterior(xmap,k,n,alpha); 31 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(ci,:)); 32 | 33 | % add lower bound as a vertical line 34 | bound = bayesprev_bound(0.95,k,n,alpha); 35 | line([bound bound], [0 bayesprev_posterior(bound,k,n,alpha)],'Color',co(ci,:),'LineStyle',':') 36 | 37 | % add 95% HPDI 38 | oil = 2; 39 | iil = 4; 40 | h = bayesprev_hpdi(0.95,k,n,alpha); 41 | plot([h(1) h(2)],[pmap pmap],'Color',co(ci,:),'LineWidth',oil) 42 | % add 50% HPDI 43 | h = bayesprev_hpdi(0.5,k,n,alpha); 44 | plot([h(1) h(2)],[pmap pmap],'Color',co(ci,:),'LineWidth',iil) 45 | 46 | xlabel('Population prevalence proportion') 47 | ylabel('Posterior density') 48 | -------------------------------------------------------------------------------- /matlab/example_data.csv: -------------------------------------------------------------------------------- 1 | 1 2 | 0 3 | 0 4 | 1 5 | 1 6 | 0 7 | 1 8 | 1 9 | 0 10 | 1 11 | 0 12 | 0 13 | 1 14 | 1 15 | 0 16 | 1 17 | 1 18 | 0 19 | 0 20 | 0 21 | 0 22 | 1 23 | 0 24 | 1 25 | 1 26 | 1 27 | 0 28 | 1 29 | 1 30 | 0 31 | 0 32 | 0 33 | 1 34 | 1 35 | 0 36 | 0 37 | 1 38 | 1 39 | 0 40 | 0 41 | 1 42 | 1 43 | 1 44 | 1 45 | 0 46 | 1 47 | 1 48 | 0 49 | 1 50 | 1 51 | -------------------------------------------------------------------------------- /matlab/fig1_group_vs_ind.m: -------------------------------------------------------------------------------- 1 | % 2 | % Figure 1: Population vs individual inference. For each simulation we sample N
3 | % = 50 individual participant mean effects from a normal distribution with 4 | % population mean μ (A,B: μ = 0; C,D: μ = 1) and between-participant 5 | % standard deviation σ_b = 2. Within each participant, T trials (A,C: T = 20; 6 | % B,D: T = 500) are drawn from a normal distribution with the 7 | % participant-specific mean and a common within-participant standard deviation 8 | % σ_w = 10. Orange and blue indicate, respectively, exceeding or not exceeding a 9 | % p=0.05 threshold for a t-test at the population level (on the within-participant 10 | % means, population normal density curves) or at the individual participant 11 | % level (individual sample means +/- s.e.m.). E: Bayesian posterior population 12 | % prevalence distributions for the 4 simulated data sets. Points show Bayesian 13 | % maximum a posteriori estimate. Thick and thin horizontal lines indicate 50% 14 | % and 96% highest posterior density intervals respectively. 15 | 16 | % Simulations inspired by 17 | % Baker, Vilidaite, Lygo, Smith, Flack, Gouws and Andrews 18 | % "Power contours: optimising sample size and precision in experimental 19 | % psychology and human neuroscience" 20 | 21 | x = []; 22 | 23 | figure 24 | 25 | Nsub = 50; 26 | sigma_w = 10; 27 | sigma_b = 2; 28 | 29 | % s = rng; 30 | % save('currfig1seed','s') 31 | load currfig1seed 32 | rng(s); 33 | 34 | % Generate data from hierarchical normal distribution 35 | % and plot per-participant means with standard errors of the mean 36 | % together with population distribution 37 | subplot(3,2,1) 38 | Nsamp = 20; 39 | mu_g = 0; 40 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 41 | datA = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 42 | datA.sigma_g = sigma_g; 43 | plot_data(mu_g, sigma_g, datA, datA.groupsig+1); 44 | 45 | subplot(3,2,2) 46 | Nsamp = 500; 47 | mu_g = 0; 48 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 49 | datB = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 50 | datB.sigma_g = sigma_g; 51 | 
plot_data(mu_g, sigma_g, datB, datB.groupsig+1); 52 | 53 | subplot(3,2,3) 54 | Nsamp = 20; 55 | mu_g = 1; 56 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 57 | datC = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 58 | datC.sigma_g = sigma_g; 59 | plot_data(mu_g, sigma_g, datC, datC.groupsig+1); 60 | 61 | subplot(3,2,4) 62 | Nsamp = 500; 63 | mu_g = 1; 64 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 65 | datD = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 66 | datD.sigma_g = sigma_g; 67 | plot_data(mu_g, sigma_g, datD, datD.groupsig+1); 68 | 69 | 70 | 71 | subplot(3,1,3); 72 | x = linspace(0,1,200); 73 | co = get(gca,'ColorOrder'); 74 | 75 | oil = 2; 76 | iil = 4; 77 | a = 0.05; 78 | lh = []; 79 | 80 | k = sum(datA.indsig);i=3;hy = 0.3; 81 | b = 1; 82 | dat = datA; 83 | lh(1) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 84 | hold on 85 | xmap = bayesprev_map(k,Nsub, a, b); 86 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 87 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 88 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 89 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 90 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 91 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 92 | 93 | k = sum(datB.indsig);i=4;hy = 0.5; 94 | b = 1; 95 | dat = datB; 96 | lh(2) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 97 | hold on 98 | xmap = bayesprev_map(k,Nsub, a, b); 99 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 100 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 101 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 102 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 103 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 104 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 105 | 106 | k = sum(datC.indsig);i=5;hy = 0.5; 107 | b = 1; 108 | dat = datC; 109 | lh(3) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 110 | hold on 111 | xmap = 
bayesprev_map(k,Nsub, a, b); 112 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 113 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 114 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 115 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 116 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 117 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 118 | 119 | k = sum(datD.indsig);i=6;hy = 0.3; 120 | b = 1; 121 | dat = datD; 122 | lh(4) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 123 | hold on 124 | xmap = bayesprev_map(k,Nsub, a, b); 125 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 126 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 127 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 128 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 129 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 130 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 131 | 132 | xlim([0 1]) 133 | 134 | legend(lh,{'A','B','C','D'}) 135 | 136 | % generate data from hierarchical normal model 137 | function dat = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub) 138 | % mu_g - ground truth group mean 139 | % sigma_b - between participant standard deviation 140 | % sigma_w - within participant standard deviation 141 | % Nsamp - number of trials per participant 142 | % Nsub - number of participants 143 | 144 | % generate individual subject means from population normal distribution 145 | submeanstrue = normrnd(mu_g, sigma_b, [Nsub 1]); 146 | rawdat = zeros(Nsamp, Nsub); 147 | dat.indsig = false(1,Nsub); 148 | dat.indt = zeros(1,Nsub); 149 | for si=1:Nsub 150 | % generate within-participant data 151 | rawdat(:,si) = normrnd(submeanstrue(si), sigma_w, [Nsamp 1]); 152 | % within-participant t-test significance 153 | [dat.indsig(si) p ci stats] = ttest(rawdat(:,si)); 154 | % within-participant t-score 155 | dat.indt(si) = stats.tstat; 156 | end 157 | % within-participant mean 158 | dat.submeans = mean(rawdat,1); 159 | dat.subsem =
std(rawdat,[],1) ./ sqrt(Nsamp); 160 | % second level t-test on within-participant means 161 | [h p ci stats] = ttest(mean(rawdat,1)); 162 | % population level t-test significance 163 | dat.groupsig = h; 164 | dat.groupp = p; 165 | dat.Nsub = Nsub; 166 | dat.Nsamp = Nsamp; 167 | dat.tdf = stats.df; 168 | % population level t-score 169 | dat.t = stats.tstat; 170 | end 171 | 172 | % plot group and individual data 173 | function plot_data(mu_g, sigma_g, d, gc) 174 | 175 | x = -15:0.1:15; 176 | y = normpdf(x,mu_g,sigma_g); 177 | co = get(gca,'ColorOrder'); 178 | % plot population model (normal) distribution 179 | plot(x,y,'Color',co(gc,:)) 180 | hold on 181 | ypos = unifrnd(zeros(1,d.Nsub)+0.0005, 0.95*normpdf(d.submeans, mu_g, sigma_g)); 182 | % plot(submeans, ypos, '.' , 'Color',co(1,:)) 183 | % plot significant within-participant means in orange 184 | errorbar(d.submeans(d.indsig),ypos(d.indsig),d.subsem(d.indsig),'horizontal','.',... 185 | 'capsize',0,... 186 | 'Color',co(2,:),... 187 | 'MarkerSize',10) 188 | % plot non-significant within-participant means in blue 189 | errorbar(d.submeans(~d.indsig),ypos(~d.indsig),d.subsem(~d.indsig),'horizontal','.',... 190 | 'capsize',0,... 191 | 'Color',co(1,:),... 
192 | 'MarkerSize',10) 193 | xline(0,'k:'); 194 | if mu_g~=0 195 | xline(mu_g,'k--'); 196 | end 197 | axis square 198 | xlim([-15 15]) 199 | title(sprintf('%d trials, %d/50 sig, group t(%d)=%.2f p=%.3f',d.Nsamp,sum(d.indsig),d.tdf,d.t,d.groupp)) 200 | 201 | end 202 | -------------------------------------------------------------------------------- /paper/bayes_scale.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/bayes_scale.mat -------------------------------------------------------------------------------- /paper/bayes_scale_between.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/bayes_scale_between.mat -------------------------------------------------------------------------------- /paper/bayes_scale_within.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/bayes_scale_within.mat -------------------------------------------------------------------------------- /paper/fig1_group_vs_ind.m: -------------------------------------------------------------------------------- 1 | % 2 | % Ince, Paton, Kay and Schyns 3 | % "Bayesian inference of population prevalence" 4 | % biorxiv: https://doi.org/10.1101/2020.07.08.191106 5 | % 6 | % Figure 1: Population vs individual inference. For each simulation, we 7 | % sample N=50 individual participant mean effects from a normal 8 | % distribution with population mean μ (A,B: μ=0; C,D: μ=1) and 9 | % between-participant standard deviation σ_b=2. 
Within each participant, 10 | % T trials (A,C: T=20; B,D: T=500) are drawn from a normal distribution 11 | % with the participant-specific mean and a common within-participant 12 | % standard deviation σ_w=10 (Baker et al. 2020). Orange and blue indicate, 13 | % respectively, exceeding or not exceeding a p=0.05 threshold for a t-test 14 | % at the population level (on the within-participant means, population 15 | % normal density curves) or at the individual participant level (individual 16 | % sample means +/- s.e.m.). E: Bayesian posterior distributions of 17 | % population prevalence of true positive results for the 4 simulated data 18 | % sets (A-D). Circles show Bayesian maximum a posteriori estimates. 19 | % Thick and thin horizontal lines indicate 50% and 96% highest posterior 20 | % density intervals, respectively. MAP [96% HPDI] values are shown in the 21 | % legend. 22 | 23 | % Simulations inspired by 24 | % Baker, Vilidaite, Lygo, Smith, Flack, Gouws and Andrews 25 | % "Power contours: optimising sample size and precision in experimental 26 | % psychology and human neuroscience" 27 | % Psychological Methods 28 | % https://doi.org/10.1037/met0000337 29 | % http://arxiv.org/abs/1902.06122 30 | 31 | x = []; 32 | 33 | figure 34 | 35 | Nsub = 50; 36 | sigma_w = 10; 37 | sigma_b = 2; 38 | 39 | % s = rng; 40 | % save('fig1seed','s') 41 | load fig1seed 42 | rng(s); 43 | 44 | % Generate data from hierarchical normal distribution 45 | subplot(3,2,1) 46 | Nsamp = 20; 47 | mu_g = 0; 48 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 49 | datA = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 50 | datA.sigma_g = sigma_g; 51 | plot_data(mu_g, sigma_g, datA, datA.groupsig+1); 52 | 53 | subplot(3,2,2) 54 | Nsamp = 500; 55 | mu_g = 0; 56 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 57 | datB = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 58 | datB.sigma_g = sigma_g; 59 | plot_data(mu_g, sigma_g, datB, datB.groupsig+1); 60 | 61 | subplot(3,2,3) 62 | 
Nsamp = 20; 63 | mu_g = 1; 64 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 65 | datC = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 66 | datC.sigma_g = sigma_g; 67 | plot_data(mu_g, sigma_g, datC, datC.groupsig+1); 68 | 69 | subplot(3,2,4) 70 | Nsamp = 500; 71 | mu_g = 1; 72 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/Nsamp); 73 | datD = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub); 74 | datD.sigma_g = sigma_g; 75 | plot_data(mu_g, sigma_g, datD, datD.groupsig+1); 76 | 77 | 78 | 79 | subplot(3,1,3); 80 | x = linspace(0,1,200); 81 | co = get(gca,'ColorOrder'); 82 | 83 | oil = 2; 84 | iil = 4; 85 | a = 0.05; 86 | lh = []; 87 | 88 | k = sum(datA.indsig);i=3;hy = 0.3; 89 | b = 1; 90 | dat = datA; 91 | % b = sampsizepwr('t',[0 sigma_w],max(abs(dat.submeans)),[],dat.Nsamp) 92 | lh(1) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 93 | hold on 94 | xmap = bayesprev_map(k,Nsub, a, b); 95 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 96 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 97 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 98 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 99 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 100 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 101 | % plot([p.g p.g],[0 freqy],'Color',co(i,:)) 102 | 103 | k = sum(datB.indsig);i=4;hy = 0.5; 104 | b = 1; 105 | dat = datB; 106 | % b = sampsizepwr('t',[0 sigma_w],max(abs(dat.submeans)),[],dat.Nsamp) 107 | lh(2) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 108 | hold on 109 | xmap = bayesprev_map(k,Nsub, a, b); 110 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 111 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 112 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 113 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 114 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 115 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 116 | % plot([p.g p.g],[0 freqy],'Color',co(i,:)) 117 | 
118 | k = sum(datC.indsig);i=5;hy = 0.5; 119 | b = 1; 120 | dat = datC; 121 | % b = sampsizepwr('t',[0 sigma_w],max(abs(dat.submeans)),[],dat.Nsamp) 122 | lh(3) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 123 | hold on 124 | xmap = bayesprev_map(k,Nsub, a, b); 125 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 126 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 127 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 128 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 129 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 130 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 131 | % plot([p.g p.g],[0 freqy],'Color',co(i,:)) 132 | 133 | k = sum(datD.indsig);i=6;hy = 0.3; 134 | b = 1; 135 | dat = datD; 136 | % b = sampsizepwr('t',[0 sigma_w],max(abs(dat.submeans)),[],dat.Nsamp) 137 | lh(4) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color',co(i,:)); 138 | hold on 139 | xmap = bayesprev_map(k,Nsub, a, b); 140 | pmap = bayesprev_posterior(xmap,k,Nsub, a, b); 141 | plot(xmap, pmap,'.','MarkerSize',20,'Color',co(i,:)); 142 | h = bayesprev_hpdi(0.96,k,Nsub, a, b); 143 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',oil) 144 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 145 | plot([h(1) h(2)],[pmap pmap],'Color',co(i,:),'LineWidth',iil) 146 | % plot([p.g p.g],[0 freqy],'Color',co(i,:)) 147 | 148 | xlim([0 1]) 149 | 150 | legend(lh,{'A','B','C','D'}) 151 | 152 | % generate data from hierarchical normal model 153 | function dat = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub) 154 | % mu_g - ground truth group mean 155 | % sigma_b - between participant standard deviation 156 | % sigma_w - within participant standard deviation 157 | % Nsamp - number of trials per participant 158 | % Nsub - number of participants 159 | 160 | % generate individual subject means from population normal distribution 161 | submeanstrue = normrnd(mu_g, sigma_b, [Nsub 1]); 162 | rawdat = zeros(Nsamp, Nsub); 163 | dat.indsig = false(1,Nsub); 164 | 
dat.indt = zeros(1,Nsub); 165 | for si=1:Nsub 166 | % generate within-participant data 167 | rawdat(:,si) = normrnd(submeanstrue(si), sigma_w, [Nsamp 1]); 168 | % within-participant t-test significance 169 | [dat.indsig(si) p ci stats] = ttest(rawdat(:,si)); 170 | % within-participant t-score 171 | dat.indt(si) = stats.tstat; 172 | end 173 | % within-participant mean 174 | dat.submeans = mean(rawdat,1); 175 | dat.subsem = std(rawdat,[],1) ./ sqrt(Nsamp); 176 | % second level t-test on within-participant means 177 | [h p ci stats] = ttest(mean(rawdat,1)); 178 | % population level t-test significance 179 | dat.groupsig = h; 180 | dat.groupp = p; 181 | dat.Nsub = Nsub; 182 | dat.Nsamp = Nsamp; 183 | dat.tdf = stats.df; 184 | % population level t-score 185 | dat.t = stats.tstat; 186 | end 187 | 188 | % plot group and individual data 189 | function plot_data(mu_g, sigma_g, d, gc) 190 | 191 | x = -15:0.1:15; 192 | y = normpdf(x,mu_g,sigma_g); 193 | co = get(gca,'ColorOrder'); 194 | % plot population model (normal) distribution 195 | plot(x,y,'Color',co(gc,:)) 196 | hold on 197 | ypos = unifrnd(zeros(1,d.Nsub)+0.0005, 0.95*normpdf(d.submeans, mu_g, sigma_g)); 198 | % plot(submeans, ypos, '.' , 'Color',co(1,:)) 199 | % plot significant within-participant means in orange 200 | errorbar(d.submeans(d.indsig),ypos(d.indsig),d.subsem(d.indsig),'horizontal','.',... 201 | 'capsize',0,... 202 | 'Color',co(2,:),... 203 | 'MarkerSize',10) 204 | % plot non-significant within-participant means in blue 205 | errorbar(d.submeans(~d.indsig),ypos(~d.indsig),d.subsem(~d.indsig),'horizontal','.',... 206 | 'capsize',0,... 207 | 'Color',co(1,:),... 
208 | 'MarkerSize',10) 209 | xline(0,'k:'); 210 | if mu_g~=0 211 | xline(mu_g,'k--'); 212 | end 213 | axis square 214 | xlim([-15 15]) 215 | title(sprintf('%d trials, %d/50 sig, group t(%d)=%.2f p=%.3f',d.Nsamp,sum(d.indsig),d.tdf,d.t,d.groupp)) 216 | 217 | end -------------------------------------------------------------------------------- /paper/fig1seed.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/fig1seed.mat -------------------------------------------------------------------------------- /paper/fig2_simEEG.m: -------------------------------------------------------------------------------- 1 | % Ince, Paton, Kay and Schyns 2 | % "Bayesian inference of population prevalence" 3 | % biorxiv: https://doi.org/10.1101/2020.07.08.191106 4 | % 5 | % Figure 2: Simulated examples where Bayesian prevalence and second-level 6 | % t-tests diverge. EEG traces are simulated for 100 trials from 20 7 | % participants as white noise [N(0,1)] with an additive Gaussian activation 8 | % (σ = 20 ms) with amplitudes drawn from a uniform distribution. For each 9 | % simulation, mean traces are shown per participant (top left). A 10 | % second-level t-test is performed at each time point separately (blue 11 | % curve, bottom panel), dashed line shows the p=0.05 threshold, Bonferroni 12 | % corrected over time points. A within-participant t-test is performed at 13 | % each time point and for each participant separately (right hand panel); 14 | % the blue points show the maximum T-statistic over time points for each 15 | % participant, and the dashed line shows the p=0.05 Bonferroni corrected 16 | % threshold. Bottom right panel shows posterior population prevalence for 17 | % an effect in the analysis window. Black curves (bottom panel) show the 18 | % prevalence posterior at each time point (black line MAP, shaded area 19 | % 96% HPDI).
A: An effect is simulated in all participants, with a peak 20 | % time uniformly distributed in the range 100-400 ms. B: An effect is 21 | % simulated in 10 participants, with a peak time uniformly distributed in 22 | % the range 200-275 ms. 23 | 24 | Nsub = 20; 25 | Ntrl = 100; 26 | 27 | % Fs = 1000Hz (1ms bins) 28 | Ntime = 600; 29 | 30 | % 31 | % Panel A: all subjects have an effect but at different times 32 | % 33 | % s = rng; 34 | % save('figsubjectalingment','s') 35 | load figsubjectalingment 36 | rng(s); 37 | 38 | % peak effects uniformly distributed between 100 and 500 ms 39 | effect_range_width = 400; 40 | effect_range_start = 100; 41 | sub_effect_time = round(rand(Nsub,1)*effect_range_width + effect_range_start); 42 | effect_size = 0.6; 43 | % generate data 44 | dat = randn(Ntime,Ntrl,Nsub); 45 | % add effect 46 | effect_bins = 40; 47 | effect_sd = 20; 48 | for si=1:Nsub 49 | effect = normpdf(-effect_bins:effect_bins,0,effect_sd)'; 50 | effect = effect./max(effect); 51 | % uniformly distributed random amplitude on each trial 0-effect_size 52 | trial_amp = effect_size*rand(1,Ntrl); 53 | idx = sub_effect_time(si)-effect_bins:sub_effect_time(si)+effect_bins; 54 | dat(idx,:,si) = dat(idx,:,si) + effect.*trial_amp; 55 | end 56 | 57 | stemlim = [3 14]; 58 | plot_results(dat, Nsub, Ntrl, Ntime, stemlim) 59 | 60 | 61 | % 62 | % Panel B: 50% of subjects have an effect 63 | % 64 | % s = rng; 65 | % save('figsubjectprop','s') 66 | load figsubjectprop 67 | rng(s); 68 | 69 | Nsubeff = 10; 70 | Nsubnoeff = 10; 71 | Nsub = Nsubeff + Nsubnoeff; 72 | 73 | % peak effect uniformly distributed between 200 and 275 ms 74 | effect_range_width = 75; 75 | effect_range_start = 200; 76 | sub_effect_time = round(rand(Nsub,1)*effect_range_width + effect_range_start); 77 | effect_size = 0.6; 78 | % generate data 79 | dat = randn(Ntime,Ntrl,Nsub); 80 | % add effect 81 | effect_bins = 40; 82 | effect_sd = 20; 83 | for si=1:Nsubeff 84 | effect = 
normpdf(-effect_bins:effect_bins,0,effect_sd)'; 85 | effect = effect./max(effect); 86 | % uniformly distributed random amplitude on each trial 0-effect_size 87 | trial_amp = effect_size*rand(1,Ntrl); 88 | idx = sub_effect_time(si)-effect_bins:sub_effect_time(si)+effect_bins; 89 | dat(idx,:,si) = dat(idx,:,si) + effect.*trial_amp; 90 | end 91 | 92 | stemlim = [0 14]; 93 | plot_results(dat, Nsub, Ntrl, Ntime, stemlim) 94 | 95 | 96 | function plot_results(dat, Nsub, Ntrl, Ntime, stemlim) 97 | 98 | % filter <30Hz as typical ERP analysis 99 | Fs = 1000; 100 | [b,a] = butter(3,30/(Fs/2),'low'); 101 | for si=1:Nsub 102 | for ti=1:Ntrl 103 | dat(:,ti,si) = filtfilt(b,a,dat(:,ti,si)); 104 | end 105 | end 106 | 107 | % group results (t-test across participants at each time point) 108 | group_t = zeros(Ntime,1); 109 | group_p = zeros(Ntime,1); 110 | for ti=1:Ntime 111 | [sig p ci stats] = ttest(squeeze(mean(dat(ti,:,:),2))); 112 | group_t(ti) = stats.tstat; 113 | group_p(ti) = p; 114 | end 115 | 116 | % group results (average over time points within each participant) 117 | tavg = squeeze(mean(dat)); 118 | [sig p ci stats] = ttest(mean(tavg)); 119 | group_tavg_t = stats.tstat; 120 | group_tavg_p = p; 121 | 122 | % within participant results 123 | ind_t = zeros(Ntime,Nsub); 124 | ind_p = zeros(Ntime,Nsub); 125 | for si=1:Nsub 126 | for ti=1:Ntime 127 | [sig p ci stats] = ttest(dat(ti,:,si)); 128 | ind_t(ti,si) = stats.tstat; 129 | ind_p(ti,si) = p; 130 | end 131 | end 132 | ind_sig = ind_p<0.05/Ntime; 133 | 134 | % % sig time points, unc and bonf 135 | % [sum(group_p<0.05) sum(group_p<0.05/Ntime)] 136 | % % sig subjects, unc and bonf 137 | % [sum(sum(ind_p<0.05)>0) sum(min(ind_p)<0.05/Ntime)] 138 | 139 | % close all 140 | figure 141 | cm = flipud(cbrewer('div','RdBu',128)); 142 | co = get(gca,'ColorOrder'); 143 | 144 | imah = subplot(3,3,[1 2 4 5]); 145 | imagesc(squeeze(mean(dat,2))') 146 | colorbar 147 | colormap(cm) 148 | caxis([-1 1]*max(abs(caxis))) 149 | 
set(gca,'YDir','normal') 150 | yl = ylim; 151 | % axis square 152 | ylabel('Participant') 153 | 154 | % yyaxis right 155 | % ylabel('Prev') 156 | % ah = gca; 157 | % ah.YAxis(2).Color = 'k'; 158 | % ah.YAxis(2).Label.Color = 'w'; 159 | % t = ah; 160 | 161 | ah = subplot(3,3,[7 8]); 162 | plot(group_t,'LineWidth',2) 163 | cb = colorbar; 164 | set(cb,'Vis','off') 165 | ylim([-3 7]) 166 | xlabel('Time') 167 | ylabel('T(19)') 168 | yline(tinv(1-(0.05/Ntime),Nsub-1),'k--') 169 | hold on 170 | yyaxis right 171 | 172 | 173 | x = 1:Ntime; 174 | kt = sum(ind_sig,2); 175 | y = zeros(1,Ntime); 176 | lo = zeros(1,Ntime); 177 | hi = zeros(1,Ntime); 178 | for ti=1:Ntime 179 | y(ti) = bayesprev_map(kt(ti),Nsub); 180 | hp = bayesprev_hpdi(0.96,kt(ti),Nsub); 181 | lo(ti) = hp(1); 182 | hi(ti) = hp(2); 183 | end 184 | shadedErrorBar(x,y,[hi-y; y-lo],'lineprops',{'LineWidth',2,'Color','k'}) 185 | ylabel('Prevalence') 186 | 187 | ah = gca; 188 | ah.YAxis(1).Color = co(1,:); 189 | ah.YAxis(2).Color = 'k'; 190 | 191 | subplot(3,3,[3 6]) 192 | % plot(max(ind_t,[],1),'s','MarkerSize',10,'LineWidth',2) 193 | stem(max(ind_t,[],1),'filled','LineWidth',1); 194 | set(gca,'view',[90 -90]) 195 | ylim(stemlim) 196 | xlim(yl) 197 | ylabel('T(99)') 198 | h = hline(tinv(1-(0.05/Ntime),Ntrl-1),'k--'); 199 | % set(h,'LineWidth',1.5) 200 | 201 | subplot(3,3,9) 202 | k = sum((min(ind_p)<0.05/Ntime)); 203 | i=1; 204 | oil = 5; 205 | iil = 15; 206 | hy = 0.3; 207 | a = 0.05; 208 | b = 1; 209 | co = get(gca,'ColorOrder'); 210 | x = linspace(0,1,100); 211 | lw = 2; 212 | 213 | lh(1) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color','k','LineWidth',lw); 214 | hold on 215 | 216 | xmap = bayesprev_map(k, Nsub, a, b); 217 | pmap = bayesprev_posterior(xmap, k, Nsub, a, b); 218 | h = bayesprev_hpdi(0.96,k, Nsub, a, b); 219 | 220 | % yp = pmap; 221 | yp = 0.5; 222 | yp = 0.25; 223 | c = [0 0 0 0.4]; 224 | plot(xmap, yp,'.','MarkerSize',20,'Color','k'); 225 | plot([h(1) h(2)],[yp 
yp],'Color',c,'LineWidth',oil) 226 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 227 | plot([h(1) h(2)],[yp yp],'Color',c,'LineWidth',iil) 228 | % xline(xmap,'k') 229 | box off 230 | xlabel('Population Prevalence') 231 | ylabel('Posterior Density') 232 | 233 | end 234 | 235 | -------------------------------------------------------------------------------- /paper/fig3_diff_scaling.m: -------------------------------------------------------------------------------- 1 | % Ince, Paton, Kay and Schyns 2 | % "Bayesian inference of population prevalence" 3 | % biorxiv: https://doi.org/10.1101/2020.07.08.191106 4 | % 5 | % Figure 3: Bayesian inference of difference of prevalence. 6 | % A,B: We consider two independent groups of participants with population 7 | % prevalence of true positives [γ_1,γ_2] of [25% 25%] (blue), 8 | % [25% 50%] (red) and [25% 75%] (yellow). We show how A: the Bayesian MAP 9 | % estimate, and B: 96% HPDI width, of the estimated between-group 10 | % prevalence difference γ_1-γ_2 scale with the number of participants. 11 | % C,D: We consider two tests applied to the same sample of participants. 12 | % Here, each simulation is parameterised by the population prevalence of 13 | % true positives for the two tests, [γ_1,γ_2], as well as ρ_12, the 14 | % correlation between the (binary) test results across the population. We 15 | % show this for [50% 50%] with ρ_12=0.2 (blue), [50% 0%] with ρ_12=0 (red), 16 | % and [75% 50%] with ρ_12=-0.2 (yellow). We show how C: the Bayesian MAP 17 | % estimate, and D: 96% HPDI width, of the estimated within-group prevalence 18 | % difference γ_1-γ_2 scale with the number of participants. 
19 | 20 | a = 0.05; 21 | b = 1; 22 | 23 | figure 24 | hold all 25 | ax = []; 26 | c = get(0, 'DefaultAxesColorOrder'); 27 | 28 | % 29 | % Panels A and B 30 | % 31 | % bayes_scale_between.mat from run_scaling_between.m 32 | load bayes_scale_between 33 | ax(1) = subplot(2,2,1); 34 | vi = 1; 35 | gi=1; 36 | dat = squeeze(res(vi,:,gi,:)); 37 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 38 | gi=2; 39 | dat = squeeze(res(vi,:,gi,:)); 40 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 41 | gi=3; 42 | dat = squeeze(res(vi,:,gi,:)); 43 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 44 | 45 | % legend({'[25% 25%]' '[25% 50%]' '[25% 75%]'},'location','southeast') 46 | legend({'[0.25 0.25]' '[0.25 0.5]' '[0.25 0.75]'},'location','southeast') 47 | xlabel('Participants (N)') 48 | ylabel('MAP') 49 | title('Between - Bayesian MAP') 50 | axis square 51 | grid off 52 | 53 | ax(2) = subplot(2,2,2); 54 | vi = 2; 55 | gi=1; 56 | dat = squeeze(res(vi,:,gi,:)); 57 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 58 | gi=2; 59 | dat = squeeze(res(vi,:,gi,:)); 60 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 61 | gi=3; 62 | dat = squeeze(res(vi,:,gi,:)); 63 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 64 | 65 | % legend({'[25% 25%]' '[25% 50%]' '[25% 75%]'},'location','southeast') 66 | legend({'[0.25 0.25]' '[0.25 0.5]' '[0.25 0.75]'},'location','northeast') 67 | xlabel('Participants (N)') 68 | ylabel('HPDI Width') 69 | title('Between - 96% HPDI Width') 70 | axis square 71 | grid off 72 | 73 | 74 | % 75 | % Panels C and D 76 | % 77 | % bayes_scale_within.mat from run_scaling_within.m 78 | load bayes_scale_within 79 | gts = [0.5 0.5 0.2; 0 0.5 0; 0.5 0.75 -0.2]; 80 | a = 0.05; 81 | b = 1; 82 | 83 | ax(3) = subplot(2,2,3); 84 | vi = 1; 85 | gi=1; 86 | dat = squeeze(res(vi,:,gi,:)); 87 | 
shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 88 | gi=2; 89 | dat = squeeze(res(vi,:,gi,:)); 90 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 91 | gi=3; 92 | dat = squeeze(res(vi,:,gi,:)); 93 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 94 | % legend({'[0.5 0.5] \rho=0' ... 95 | % '[0 0.5]\rho=0'... 96 | % '[0.5 0.75] \rho=0.2'},'location','southeast') 97 | xlabel('Participants (N)') 98 | ylabel('MAP') 99 | title('Within - Bayesian MAP') 100 | axis square 101 | grid off 102 | 103 | ax(4) = subplot(2,2,4); 104 | vi = 2; 105 | gi=1; 106 | dat = squeeze(res(vi,:,gi,:)); 107 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 108 | gi=2; 109 | dat = squeeze(res(vi,:,gi,:)); 110 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 111 | gi=3; 112 | dat = squeeze(res(vi,:,gi,:)); 113 | shadedErrorBar(Nvals, mean(dat), std(dat),'lineprops',{'-' 'color' c(gi,:)}) 114 | 115 | legend({'[0.5 0.5] \rho=0.2' ... 116 | '[0 0.5]\rho=0'... 117 | '[0.5 0.75] \rho=-0.2'},'location','northeast') 118 | xlabel('Participants (N)') 119 | ylabel('HPDI Width') 120 | title('Within - 96% HPDI Width') 121 | axis square 122 | grid off 123 | 124 | -------------------------------------------------------------------------------- /paper/fig4_group_diffs.m: -------------------------------------------------------------------------------- 1 | % 2 | % Ince, Paton, Kay and Schyns 3 | % "Bayesian inference of population prevalence" 4 | % biorxiv: https://doi.org/10.1101/2020.07.08.191106 5 | % 6 | % Figure 4: Example where between-group prevalence diverges from two-sample 7 | % t-test. 8 | % We simulate standard hierarchical Gaussian data for two groups of 20 9 | % participants, T=100, σ_w=10, N=20 per group. A: Group 1 participants are 10 | % drawn from a single population Gaussian distribution with μ=4,σ_b=1. 
11 | % Group 2 participants are drawn from two Gaussian distributions. 75% of 12 | % participants are drawn from N(0,0.01) and 25% of participants are drawn 13 | % from N(16,0.5). Dashed line shows the p=0.05 within-participant 14 | % threshold (one-sample t-test). The means of these two groups are not 15 | % significantly different (B), but they have very different prevalence 16 | % posteriors (C). The posterior distribution for the difference in 17 | % prevalence shows the higher prevalence in Group 1: 18 | % 0.61 [0.36 0.85] (MAP [96% HPDI]) (D). 19 | 20 | Ngrp1 = 20; 21 | Ngrp2 = 20; 22 | 23 | sig_w = 10; 24 | Nsamp = 100; 25 | 26 | % group 1 27 | % all members show an effect 28 | % narrow between participant variance 29 | % medium effect size 30 | g1_sig_b = 1; 31 | g1_mu_effect = 4; 32 | g1dat = generate_data(g1_mu_effect, g1_sig_b, sig_w, Nsamp, Ngrp1); 33 | 34 | % group 2 35 | % only 25% of members show an effect 36 | % narrow between participant variance for this effect 37 | % strong effect size 38 | g2_sig_b = 0.5; 39 | g2_mu_effect = 16; 40 | g2_prev = 0.25; 41 | g2_Neff = round(Ngrp2*g2_prev); 42 | g2_Nnoeff = Ngrp2 - g2_Neff; 43 | g2dat_effect = generate_data(g2_mu_effect, g2_sig_b, sig_w, Nsamp, g2_Neff); 44 | g2dat_noeffect = generate_data(0, 0.01, sig_w, Nsamp, g2_Nnoeff); 45 | g2dat = cat(2,g2dat_effect,g2dat_noeffect); 46 | 47 | % between group t-test 48 | [tsig grp_p ci stats] = ttest2(mean(g1dat),mean(g2dat)); 49 | grp_t = stats.tstat; 50 | 51 | % within participant tests 52 | g1_indsig = false(1,Ngrp1); 53 | g1_indt = zeros(1,Ngrp1); 54 | for si=1:Ngrp1 55 | % within-participant t-test significance 56 | [g1_indsig(si) p ci stats] = ttest(g1dat(:,si)); 57 | % within-participant t-score 58 | g1_indt(si) = stats.tstat; 59 | end 60 | g2_indsig = false(1,Ngrp2); 61 | g2_indt = zeros(1,Ngrp2); 62 | for si=1:Ngrp2 63 | % within-participant t-test significance 64 | [g2_indsig(si) p ci stats] = ttest(g2dat(:,si)); 65 | % within-participant t-score 
66 | g2_indt(si) = stats.tstat; 67 | end 68 | 69 | [map, px, p, hpdi, probGT, loGT, samples] = bayesprev_diff_between(sum(g1_indsig), Ngrp1, sum(g2_indsig), Ngrp2, 0.96); 70 | 71 | % plots 72 | figure 73 | mg1 = mean(g1dat); 74 | mg2 = mean(g2dat); 75 | 76 | subplot(1,3,1) 77 | violins = violinplot([mg1' mg2'],{'Group 1','Group 2'}); 78 | % title('Participant means in each group') 79 | c1 = violins(1).ViolinColor; 80 | c2 = violins(2).ViolinColor; 81 | hold on 82 | yline(tinv(1-0.05,99),'k--') 83 | 84 | subplot(1,3,2) 85 | hold on 86 | h = bar(1,mean(mg1),'k'); 87 | set(h,'facecolor',c1) 88 | set(h,'facealpha',violins(1).ViolinAlpha) 89 | h = bar(2,mean(mg2),'k'); 90 | set(h,'facecolor',c2); 91 | set(h,'facealpha',violins(1).ViolinAlpha) 92 | 93 | er = errorbar(1 , mean(mg1), std(mg1)./sqrt(Ngrp1) ); 94 | er.LineStyle = 'none'; 95 | er.Color = 'k'; 96 | er.LineWidth = 2; 97 | set(er.Bar, 'ColorType', 'truecoloralpha', 'ColorData', [er.Line.ColorData(1:3); 255*0.3]) 98 | 99 | er = errorbar(2, mean(mg2), std(mg2)/sqrt(Ngrp2)); 100 | er.LineStyle = 'none'; 101 | er.Color = 'k'; 102 | er.LineWidth = 2; 103 | set(er.Bar, 'ColorType', 'truecoloralpha', 'ColorData', [er.Line.ColorData(1:3); 255*0.3]) 104 | ylabel('Mean') 105 | % title('T-test not significant p=0.92') 106 | set(gca,'XTick', [1 2]) 107 | set(gca,'XTickLabels',{'Group 1', 'Group 2'}) 108 | 109 | 110 | subplot(2,3,3) 111 | box off 112 | xlabel('Prevalence Proportion') 113 | ylabel('Posterior Density') 114 | hold on 115 | a = 0.05; 116 | b = 1; 117 | 118 | oil = 2; 119 | iil = 8; 120 | hy = 0.3; 121 | lw = 2; 122 | alf = violins(1).ViolinAlpha; 123 | % co = get(gca,'ColorOrder'); 124 | % co = [c1;c2]; 125 | co = cat(1,[c1 alf], [c2 alf]); 126 | x = linspace(0,1,100); 127 | 128 | k = sum(g1_indsig); 129 | N = Ngrp1; 130 | i=1; 131 | 132 | lh(1) = plot(x, bayesprev_posterior(x, k, N, a, b),'Color',co(i,:),'LineWidth',lw); 133 | xmap = bayesprev_map(k, N, a, b); 134 | pmap = bayesprev_posterior(xmap, k, N, a, b); 
%#ok 135 | h = bayesprev_hpdi(0.96,k, N, a, b); 136 | 137 | yp = 0.5; 138 | yp = 0.25; 139 | plot(xmap, yp,'.','MarkerSize',20,'Color',co(i,:)); 140 | 141 | plot([h(1) h(2)],[yp yp],'Color',co(i,:),'LineWidth',oil) 142 | h = bayesprev_hpdi(0.5,k,N, a, b); 143 | plot([h(1) h(2)],[yp yp],'Color',co(i,:),'LineWidth',iil) 144 | 145 | k = sum(g2_indsig); 146 | N = Ngrp2; 147 | i=2; 148 | 149 | lh(1) = plot(x, bayesprev_posterior(x, k, N, a, b),'Color',co(i,:),'LineWidth',lw); 150 | 151 | xmap = bayesprev_map(k, N, a, b); 152 | pmap = bayesprev_posterior(xmap, k, N, a, b); %#ok 153 | h = bayesprev_hpdi(0.96,k, N, a, b); 154 | 155 | % yp = pmap; 156 | yp = 0.5; 157 | yp = 0.25; 158 | plot(xmap, yp,'.','MarkerSize',20,'Color',co(i,:)); 159 | 160 | plot([h(1) h(2)],[yp yp],'Color',co(i,:),'LineWidth',oil) 161 | h = bayesprev_hpdi(0.5,k,N, a, b); 162 | plot([h(1) h(2)],[yp yp],'Color',co(i,:),'LineWidth',iil) 163 | 164 | subplot(2,3,6) 165 | hold on 166 | plot(px,p,'color',[0 0 0 0.3],'LineWidth',lw) 167 | yp = 0.25; 168 | plot(map, yp,'.','MarkerSize',20,'Color',[0 0 0 0.3]); 169 | 170 | plot([hpdi(1) hpdi(2)],[yp yp],'Color',[0 0 0 0.3],'LineWidth',iil) 171 | box off 172 | xlabel('Prevalence Difference \gamma_1 - \gamma_2') 173 | ylabel('Posterior Density') 174 | 175 | 176 | % generate data from hierarchical normal model 177 | function dat = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub) 178 | % mu_g - ground truth group mean 179 | % sigma_b - between participant standard deviation 180 | % sigma_w - within participant standard deviation 181 | % Nsamp - number of trials per participant 182 | % Nsub - number of participants 183 | 184 | % generate individual subject means from population normal distribution 185 | submeanstrue = normrnd(mu_g, sigma_b, [Nsub 1]); 186 | dat = zeros(Nsamp, Nsub); 187 | for si=1:Nsub 188 | % generate within-participant data 189 | dat(:,si) = normrnd(submeanstrue(si), sigma_w, [Nsamp 1]); 190 | end 191 | end 
-------------------------------------------------------------------------------- /paper/fig5_effectsize_examples.m: -------------------------------------------------------------------------------- 1 | % Ince, Paton, Kay and Schyns 2 | % "Bayesian inference of population prevalence" 3 | % biorxiv: https://doi.org/10.1101/2020.07.08.191106 4 | % 5 | % Figure 5: One-sided prevalence as a function of effect size. 6 | % We consider the same simulated systems shown in Figure 1, showing both 7 | % right-tailed (E_p>E ̂) and left-tailed (E_p0) sum(min(ind_p)<0.05/Ntime)] 144 | 145 | % close all 146 | figure 147 | cm = flipud(cbrewer('div','RdBu',128)); 148 | co = get(gca,'ColorOrder'); 149 | 150 | imah = subplot(3,3,[1 2 4 5]); 151 | imagesc(squeeze(mean(dat,2))') 152 | colorbar 153 | colormap(cm) 154 | caxis([-1 1]*max(abs(caxis))) 155 | set(gca,'YDir','normal') 156 | yl = ylim; 157 | % axis square 158 | ylabel('Participant') 159 | 160 | ah = subplot(3,3,[7 8]); 161 | d.indt = max(ind_t); 162 | d.Nsamp = size(dat,2); 163 | d.Nsub = size(dat,3); 164 | [es pmap hpdi] = prev_curve_onesided(d,1); 165 | posbar = hpdi(2,:) - pmap; 166 | negbar = pmap - hpdi(1,:); 167 | shadedErrorBar(es,pmap,cat(1,posbar,negbar)) 168 | % xline(mu_g./(sigma_w./sqrt(Nsamp)),'b') 169 | p = 0.05 ./ Ntime; 170 | xline(tinv(1-p, d.Nsamp-1),'r'); 171 | % xline(tinv(p/2,d.Nsamp-1),'r'); 172 | ylim([0 1]) 173 | xlim([0 20]) 174 | xlabel('Threshold T(99)') 175 | ylabel('Prevalence (> Threshold)') 176 | cb = colorbar; 177 | set(cb,'Vis','off') 178 | 179 | subplot(3,3,[3 6]) 180 | % plot(max(ind_t,[],1),'s','MarkerSize',10,'LineWidth',2) 181 | stem(max(ind_t,[],1),'filled','LineWidth',1); 182 | set(gca,'view',[90 -90]) 183 | ylim(stemlim) 184 | xlim(yl) 185 | ylabel('T(99)') 186 | h = hline(tinv(1-(0.05/Ntime),Ntrl-1),'k--'); 187 | % set(h,'LineWidth',1.5) 188 | 189 | subplot(3,3,9) 190 | k = sum((min(ind_p)<0.05/Ntime)); 191 | i=1; 192 | oil = 5; 193 | iil = 15; 194 | hy = 0.3; 195 | a = 0.05; 196 | b = 
1; 197 | co = get(gca,'ColorOrder'); 198 | x = linspace(0,1,100); 199 | lw = 2; 200 | 201 | lh(1) = plot(x, bayesprev_posterior(x, k, Nsub, a, b),'Color','k','LineWidth',lw); 202 | hold on 203 | 204 | xmap = bayesprev_map(k, Nsub, a, b); 205 | pmap = bayesprev_posterior(xmap, k, Nsub, a, b); 206 | h = bayesprev_hpdi(0.96,k, Nsub, a, b); 207 | 208 | % yp = pmap; 209 | yp = 0.5; 210 | yp = 0.25; 211 | c = [0 0 0 0.4]; 212 | plot(xmap, yp,'.','MarkerSize',20,'Color','k'); 213 | plot([h(1) h(2)],[yp yp],'Color',c,'LineWidth',oil) 214 | h = bayesprev_hpdi(0.5,k,Nsub, a, b); 215 | plot([h(1) h(2)],[yp yp],'Color',c,'LineWidth',iil) 216 | % xline(xmap,'k') 217 | box off 218 | xlabel('Population Prevalence') 219 | ylabel('Posterior Density') 220 | 221 | end 222 | 223 | 224 | 225 | -------------------------------------------------------------------------------- /paper/fig6seed.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/fig6seed.mat -------------------------------------------------------------------------------- /paper/fig7_scaling.m: -------------------------------------------------------------------------------- 1 | % Ince, Paton, Kay and Schyns 2 | % "Bayesian inference of population prevalence" 3 | % biorxiv: https://doi.org/10.1101/2020.07.08.191106 4 | % 5 | % Figure 7: Characterisation of Bayesian prevalence inference. 6 | % A,B,C: We consider the binomial model of within-participant testing for 7 | % three ground truth population proportions: 25%, 50% and 75% (blue, 8 | % orange, yellow, respectively). We show how A: the Bayesian MAP estimate, 9 | % B: 95% Bayesian lower bound and C: 96% HPDI width, scale with the number 10 | % of participants. Lines show theoretical expectation, coloured regions 11 | % show +/- 1 s.d.. D,E,F: We consider the population model from 12 | % Figure 1C,D (μ=1). 
D: Power contours for the population inference using 13 | % a t-test (Baker et al. 2020). Colour scale shows statistical power 14 | % (probability of rejecting the null hypothesis). E: Contours of average 15 | % Bayesian MAP estimate for γ. Colour scale shows MAP prevalence 16 | % proportion. F: Contours of average 95% Bayesian lower bound for γ. 17 | % Colour scale shows lower bound prevalence. From the prevalence 18 | % perspective, the number of trials obtained per participant has a larger 19 | % effect on the resulting population inference than does the number of 20 | % participants. 21 | 22 | 23 | figure 24 | 25 | % 26 | % Panel D, T-test power scaling with T and N 27 | % 28 | % tpow.mat from run_T_vs_N_ttest_power.m 29 | subplot(2,3,4) 30 | load tpow 31 | 32 | cmap = cbrewer('seq','Oranges',100); 33 | set(gca,'YDir','normal') 34 | cb = colorbar; 35 | colormap(cmap); 36 | caxis([-0.5 1]) 37 | set(cb,'YLim',[0 1]) 38 | hold on 39 | st = 'on'; 40 | contour(Nvals,kvals,tpow,[0.1:0.1:0.9 0.99 0.9999],'ShowText',st) 41 | xlabel('Participants (N)') 42 | ylabel('Trials per participant (k)') 43 | title('t-test power') 44 | axis square 45 | 46 | % 47 | % Panel E: Bayesian prevalence MAP 48 | % 49 | % prevbayes_normal.mat from run_T_vs_N_bayes_contour.m 50 | subplot(2,3,5) 51 | load prevbayes_normal 52 | 53 | cmap = cbrewer('seq','Oranges',100); 54 | % cmap = flipud(cmap); 55 | % cmap = cmap(300:400,:); 56 | % imagesc(kvals,Nvals,tpow) 57 | set(gca,'YDir','normal') 58 | cb = colorbar; 59 | colormap(cmap); 60 | caxis([-0.5 1]) 61 | set(cb,'YLim',[0 1]) 62 | hold on 63 | st = 'on'; 64 | t = squeeze(mean(gmap,1)); 65 | xGrid = repmat(Nvals,numel(kvals),1); 66 | yGrid = repmat(kvals',1,numel(Nvals)); 67 | [xQuery, yQuery] = meshgrid(1:250,1:500); 68 | vq = interp2(xGrid,yGrid,t,xQuery,yQuery,'makima'); 69 | contour(xQuery,yQuery,vq,[0.1:0.1:0.6],'ShowText',st) 70 | xlabel('Participants (N)') 71 | ylabel('Trials per participant (k)') 72 | title('MAP g') 73 | axis square 74 | 
75 | % 76 | % Panel F: Bayesian prevalence lower bound 77 | % 78 | % prevbayes_normal.mat from run_T_vs_N_bayes_contour.m 79 | subplot(2,3,6) 80 | 81 | cmap = cbrewer('seq','Oranges',100); 82 | set(gca,'YDir','normal') 83 | cb = colorbar; 84 | colormap(cmap); 85 | caxis([-0.5 1]) 86 | set(cb,'YLim',[0 1]) 87 | hold on 88 | st = 'on'; 89 | t = squeeze(mean(glb,1)); 90 | xGrid = repmat(Nvals,numel(kvals),1); 91 | yGrid = repmat(kvals',1,numel(Nvals)); 92 | [xQuery, yQuery] = meshgrid(1:250,1:500); 93 | vq = interp2(xGrid,yGrid,t,xQuery,yQuery,'makima'); 94 | contour(xQuery,yQuery,vq,[0.1:0.1:0.6],'ShowText',st) 95 | xlabel('Participants (N)') 96 | ylabel('Trials per participant (k)') 97 | title('95% lower bound') 98 | axis square 99 | 100 | 101 | % 102 | % Panel A: scaling of MAP with N 103 | % 104 | ax2 = []; 105 | ax2(1) = subplot(2,3,1); 106 | hold all 107 | 108 | Nvals = 2:2:256; 109 | a = 0.05; 110 | b = 1; 111 | c = get(0, 'DefaultAxesColorOrder'); 112 | 113 | gt = 0.25; 114 | theta = a + (b-a)*gt; 115 | dat = zeros(length(Nvals),2); 116 | for ni=1:length(Nvals) 117 | N = Nvals(ni); 118 | k = 0:N; 119 | vk = zeros(1,N+1); 120 | for ki=1:N+1 121 | vk(ki) = bayesprev_map(k(ki),N); 122 | end 123 | pk = binopdf(k, N, theta); 124 | mu = sum(pk.*vk); % mean 125 | sigma = sqrt(sum(pk.*(vk-mu).^2)); 126 | dat(ni,1) = mu; 127 | dat(ni,2) = sigma; 128 | end 129 | shadedErrorBar(Nvals, dat(:,1), dat(:,2),'lineprops',{'-' 'color' c(1,:)}) 130 | 131 | gt = 0.5; 132 | theta = a + (b-a)*gt; 133 | dat = zeros(length(Nvals),2); 134 | for ni=1:length(Nvals) 135 | N = Nvals(ni); 136 | k = 0:N; 137 | vk = zeros(1,N+1); 138 | for ki=1:N+1 139 | vk(ki) = bayesprev_map(k(ki),N); 140 | end 141 | pk = binopdf(k, N, theta); 142 | mu = sum(pk.*vk); % mean 143 | sigma = sqrt(sum(pk.*(vk-mu).^2)); 144 | dat(ni,1) = mu; 145 | dat(ni,2) = sigma; 146 | end 147 | shadedErrorBar(Nvals, dat(:,1), dat(:,2),'lineprops',{'-' 'color' c(2,:)}) 148 | 149 | gt = 0.75; 150 | theta = a + (b-a)*gt; 151 | 
dat = zeros(length(Nvals),2); 152 | for ni=1:length(Nvals) 153 | N = Nvals(ni); 154 | k = 0:N; 155 | vk = zeros(1,N+1); 156 | for ki=1:N+1 157 | vk(ki) = bayesprev_map(k(ki),N); 158 | end 159 | pk = binopdf(k, N, theta); 160 | mu = sum(pk.*vk); % mean 161 | sigma = sqrt(sum(pk.*(vk-mu).^2)); 162 | dat(ni,1) = mu; 163 | dat(ni,2) = sigma; 164 | end 165 | shadedErrorBar(Nvals, dat(:,1), dat(:,2),'lineprops',{'-' 'color' c(3,:)}) 166 | 167 | % legend({'25%' '50%' '75%'},'location','southeast') 168 | xlabel('Participants (N)') 169 | ylabel('\gamma') 170 | title('Bayesian MAP') 171 | axis square 172 | 173 | % 174 | % Panel B: scaling of 95% lower bound with N 175 | % 176 | % bayes_scale.mat from run_bayesian_scaling.m 177 | ax2(2) = subplot(2,3,2); 178 | hold all 179 | load bayes_scale 180 | 181 | plt = 2; 182 | gi=1; 183 | shadedErrorBar(Nvals, res(1,plt,gi,:), res(2,plt,gi,:),'lineprops',{'-' 'color' c(gi,:)}) 184 | gi=2; 185 | shadedErrorBar(Nvals, res(1,plt,gi,:), res(2,plt,gi,:),'lineprops',{'-' 'color' c(gi,:)}) 186 | gi=3; 187 | shadedErrorBar(Nvals, res(1,plt,gi,:), res(2,plt,gi,:),'lineprops',{'-' 'color' c(gi,:)}) 188 | 189 | % legend({'25%' '50%' '75%'},'location','southeast') 190 | xlabel('Participants (N)') 191 | ylabel('95% lower bound') 192 | title('Bayesian 95% lower bound') 193 | axis square 194 | 195 | 196 | % 197 | % Panel C: scaling of 96% HPDI width with N 198 | % 199 | % bayes_scale.mat from run_bayesian_scaling.m 200 | ax2(3) = subplot(2,3,3); 201 | hold all 202 | load bayes_scale 203 | 204 | plt = 1; 205 | gi=1; 206 | shadedErrorBar(Nvals, res(1,plt,gi,:), res(2,plt,gi,:),'lineprops',{'-' 'color' c(gi,:)}) 207 | gi=2; 208 | shadedErrorBar(Nvals, res(1,plt,gi,:), res(2,plt,gi,:),'lineprops',{'-' 'color' c(gi,:)}) 209 | gi=3; 210 | shadedErrorBar(Nvals, res(1,plt,gi,:), res(2,plt,gi,:),'lineprops',{'-' 'color' c(gi,:)}) 211 | 212 | legend({'25%' '50%' '75%'},'location','northeast') 213 | xlabel('Participants (N)') 214 | ylabel('HPDI width') 215 | 
title('96% HPDI width') 216 | axis square 217 | 218 | set(ax2,'xlim',[0 max(Nvals)]) 219 | set(ax2,'ylim',[-0.05 1.1]) 220 | -------------------------------------------------------------------------------- /paper/figsubjectalingment.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/figsubjectalingment.mat -------------------------------------------------------------------------------- /paper/figsubjectprop.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/figsubjectprop.mat -------------------------------------------------------------------------------- /paper/generate_data.m: -------------------------------------------------------------------------------- 1 | function dat = generate_data(mu_g, sigma_b, sigma_w, Nsamp, Nsub) 2 | % generate data from hierarchical normal model 3 | % mu_g - ground truth group mean 4 | % sigma_b - between participant standard deviation 5 | % sigma_w - within participant standard deviation 6 | % Nsamp - number of trials per participant 7 | % Nsub - number of participants 8 | 9 | % generate individual subject means from population normal distribution 10 | submeanstrue = normrnd(mu_g, sigma_b, [Nsub 1]); 11 | rawdat = zeros(Nsamp, Nsub); 12 | dat.indsig = false(1,Nsub); 13 | dat.indt = zeros(1,Nsub); 14 | for si=1:Nsub 15 | % generate within-participant data 16 | rawdat(:,si) = normrnd(submeanstrue(si), sigma_w, [Nsamp 1]); 17 | % within-participant t-test significance 18 | [dat.indsig(si) p ci stats] = ttest(rawdat(:,si)); 19 | % within-participant t-score 20 | dat.indt(si) = stats.tstat; 21 | end 22 | % within-participant mean 23 | dat.submeans = mean(rawdat,1); 24 | dat.subsem = std(rawdat,[],1) ./ sqrt(Nsamp); 25 | % second level t-test on 
within-participant means 26 | [h p ci stats] = ttest(mean(rawdat,1)); 27 | % population level t-test significance 28 | dat.groupsig = h; 29 | dat.groupp = p; 30 | dat.Nsub = Nsub; 31 | dat.Nsamp = Nsamp; 32 | dat.tdf = stats.df; 33 | % population level t-score 34 | dat.t = stats.tstat; 35 | end 36 | -------------------------------------------------------------------------------- /paper/prev_curve_onesided.m: -------------------------------------------------------------------------------- 1 | function [es pmap hpdi] = prev_curve_onesided(dat,side) 2 | % prevalence of a one-sided effect size threshold for a t-test 3 | 4 | Nsamp = dat.Nsamp; 5 | Nsub = dat.Nsub; 6 | Nx = 100; 7 | edat = dat.indt; 8 | esx = linspace(min(edat),max(edat),Nx); 9 | emap = zeros(1,Nx); 10 | eh = zeros(2, Nx); 11 | b = 1; 12 | for xi=1:Nx 13 | % if xi==24 14 | % keyboard 15 | % end 16 | % number greater than threshold 17 | if side>0 18 | k = sum(edat>esx(xi)); 19 | a = 1 - tcdf(esx(xi),Nsamp-1); 20 | elseif side<0 21 | k = sum(edat<esx(xi)); 22 | a = tcdf(esx(xi),Nsamp-1); 23 | end 24 | if a>=0.8 || abs(b-a) < 2*eps(b) 25 | emap(xi) = NaN; 26 | eh(:,xi) = NaN; 27 | continue 28 | end 29 | emap(xi) = bayesprev_map(k,Nsub,a,b); 30 | try 31 | eh(:,xi) = bayesprev_hpdi(0.96,k,Nsub,a,b); 32 | catch 33 | % a,b too close together, distribution 34 | emap(xi) = NaN; 35 | eh(:,xi) = NaN; 36 | continue 37 | end 38 | end 39 | 40 | es = esx; 41 | pmap = emap; 42 | hpdi = eh; -------------------------------------------------------------------------------- /paper/prevbayes_normal.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/prevbayes_normal.mat -------------------------------------------------------------------------------- /paper/run_T_vs_N_bayes_contour.m: -------------------------------------------------------------------------------- 1 | 2 | sigma_w = 10; 3 | sigma_b = 2; 4 | mu_g = 1; 5 | % Nvals = 2:2:200; 6 | % 
kvals = 2:2:500; 7 | 8 | % Nvals = 2.^[1:8]; 9 | % kvals = 2.^[1:9]; 10 | 11 | Nvals = [2 4 8 16 32 64 100 150 200 250]; 12 | kvals = [2 4 8 16 32 64 128 200 250 300 350 400 450 500]; 13 | 14 | NN = length(Nvals); 15 | Nk = length(kvals); 16 | Nperm = 1000; 17 | 18 | gmap = zeros(Nperm,Nk,NN); 19 | glb = zeros(Nperm,Nk,NN); 20 | 21 | tic 22 | parfor ni=1:NN 23 | ni 24 | for ki=1:Nk 25 | Nsub = Nvals(ni); 26 | Nsamp = kvals(ki); 27 | for pi=1:Nperm 28 | submeanstrue = normrnd(mu_g, sigma_b, [Nsub 1]); 29 | indsig = zeros(1,Nsub); 30 | for si=1:Nsub 31 | dat = normrnd(submeanstrue(si), sigma_w, [Nsamp 1]); 32 | indsig(si) = ttest(dat); 33 | end 34 | k = sum(indsig); 35 | gmap(pi,ki,ni) = bayesprev_map(k,Nsub); 36 | glb(pi,ki,ni) = bayesprev_bound(0.95,k,Nsub); 37 | end 38 | end 39 | end 40 | toc 41 | 42 | 43 | save prevbayes_normal Nvals kvals gmap glb -------------------------------------------------------------------------------- /paper/run_T_vs_N_ttest_power.m: -------------------------------------------------------------------------------- 1 | sigma_w = 20; 2 | sigma_b = 2; 3 | mu_g = 1; 4 | 5 | Nvals = 2:2:200; 6 | kvals = 2:2:500; 7 | 8 | NN = length(Nvals); 9 | Nk = length(kvals); 10 | 11 | tpow = zeros(Nk,NN); 12 | parfor ni=1:NN 13 | for ki=1:Nk 14 | sigma_g = sqrt(sigma_b.^2 + ((sigma_w).^2)/kvals(ki)); 15 | tpow(ki,ni) = sampsizepwr('t',[0 sigma_g],mu_g,[],Nvals(ni)); 16 | end 17 | end 18 | 19 | %% 20 | save tpow tpow Nvals kvals 21 | -------------------------------------------------------------------------------- /paper/run_bayesian_scaling.m: -------------------------------------------------------------------------------- 1 | 2 | Nvals = 2:2:256; 3 | a = 0.05; 4 | b = 1; 5 | 6 | gts = [0.25 0.5 0.75]; 7 | Ngt = length(gts); 8 | hpd = 0.96; 9 | 10 | parres = cell(1,length(Nvals)); 11 | 12 | parfor ni=1:length(Nvals) 13 | ni 14 | N = Nvals(ni); 15 | k = 0:N; 16 | hpdiwidthk = zeros(1,N+1); 17 | lboundk = zeros(1,N+1); 18 | for ki=1:N+1 19 | hpdi = 
bayesprev_hpdi(hpd,k(ki),N); 20 | hpdiwidthk(ki) = hpdi(2)-hpdi(1); 21 | lboundk(ki) = bayesprev_bound(0.95, k(ki), N); 22 | end 23 | 24 | res = zeros(2,2,Ngt); 25 | % calculate mean and s.d. for different ground truths 26 | for gi=1:Ngt 27 | theta = a + (b-a)*gts(gi); 28 | pk = binopdf(k, N, theta); 29 | mu = sum(pk.*hpdiwidthk); 30 | res(1,1,gi) = mu; 31 | res(2,1,gi) = sqrt(sum(pk.*(hpdiwidthk-mu).^2)); 32 | mu = sum(pk.*lboundk); 33 | res(1,2,gi) = mu; 34 | res(2,2,gi) = sqrt(sum(pk.*(lboundk-mu).^2)); 35 | end 36 | parres{ni} = res; 37 | end 38 | 39 | %% 40 | res = cell2mat(reshape(parres,[1 1 1 length(Nvals)])); 41 | save bayes_scale res Nvals gts Ngt hpd 42 | -------------------------------------------------------------------------------- /paper/run_scaling_between.m: -------------------------------------------------------------------------------- 1 | gts = [0.25 0.25; 0.25 0.5; 0.25 0.75]; 2 | Nvals = 2:2:256; 3 | % Nvals = 10; 4 | a = 0.05; 5 | b = 1; 6 | Ngt = size(gts,1); 7 | hpd = 0.96; 8 | 9 | Nsamp = 1000; 10 | 11 | parres = cell(1,length(Nvals)); 12 | parfor ni=1:length(Nvals) 13 | ni 14 | N = Nvals(ni); 15 | 16 | res = zeros(2,Nsamp,Ngt); 17 | % calculate mean and s.d. 
for different ground truths 18 | for gi=1:Ngt 19 | theta1 = a + (b-a)*gts(gi,1); 20 | theta2 = a + (b-a)*gts(gi,2); 21 | 22 | k1 = binornd(N, theta1, [Nsamp 1]); 23 | k2 = binornd(N, theta2, [Nsamp 1]); 24 | 25 | for si=1:Nsamp 26 | [map, x, post, hpdi] = bayesprev_diff_between(k1(si),N,k2(si),N,hpd); 27 | res(1,si,gi) = map; 28 | res(2,si,gi) = hpdi(2) - hpdi(1); 29 | end 30 | end 31 | parres{ni} = res; 32 | end 33 | 34 | %% 35 | res = cell2mat(reshape(parres,[1 1 1 length(Nvals)])); 36 | save bayes_scale_between res Nvals gts Ngt hpd 37 | -------------------------------------------------------------------------------- /paper/run_scaling_within.m: -------------------------------------------------------------------------------- 1 | 2 | gts = [0.5 0.5 0.2; 0 0.5 0; 0.5 0.75 -0.2]; 3 | 4 | Nvals = 2:2:256; 5 | % Nvals = 10; 6 | a = 0.05; 7 | b = 1; 8 | Ngt = size(gts,1); 9 | hpd = 0.96; 10 | 11 | Nsamp = 1000; 12 | 13 | parres = cell(1,length(Nvals)); 14 | parfor ni=1:length(Nvals) 15 | ni 16 | N = Nvals(ni); 17 | 18 | res = zeros(2,Nsamp,Ngt); 19 | % calculate mean and s.d. 
for different ground truths 20 | for gi=1:Ngt 21 | g1 = gts(gi,1); 22 | g2 = gts(gi,2); 23 | g11 = g1*g2 + gts(gi,3)*sqrt(g1*(1-g1)*g2*(1-g2)); 24 | g10 = g1 - g11; 25 | g01 = g2 - g11; 26 | g00 = 1 - g11 - g10 - g01; 27 | 28 | the11 = (b^2)*g11 + a*b*g10 + a*b*g01 + a*a*g00; 29 | the10 = a + (b-a)*g1 - the11; 30 | the01 = a + (b-a)*g2 - the11; 31 | the00 = 1 - the11 - the10 - the01; 32 | theta = [the11 the10 the01 the00]; 33 | 34 | for si=1:Nsamp 35 | k = mnrnd(N, theta); 36 | [map, x, post, hpdi] = bayesprev_diff_within(k(1),k(2),k(3),N,hpd); 37 | res(1,si,gi) = map; 38 | res(2,si,gi) = hpdi(2) - hpdi(1); 39 | end 40 | end 41 | parres{ni} = res; 42 | end 43 | 44 | %% 45 | res = cell2mat(reshape(parres,[1 1 1 length(Nvals)])); 46 | save bayes_scale_within res Nvals gts Ngt hpd 47 | -------------------------------------------------------------------------------- /paper/tpow.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robince/bayesian-prevalence/2f6954779206b1914760434095fa4678d53f23a6/paper/tpow.mat -------------------------------------------------------------------------------- /python/bayesprev/LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | Preamble 9 | 10 | The GNU General Public License is a free, copyleft license for 11 | software and other kinds of works. 12 | 13 | The licenses for most software and other practical works are designed 14 | to take away your freedom to share and change the works. By contrast, 15 | the GNU General Public License is intended to guarantee your freedom to 16 | share and change all versions of a program--to make sure it remains free 17 | software for all its users. 
We, the Free Software Foundation, use the 18 | GNU General Public License for most of our software; it applies also to 19 | any other work released this way by its authors. You can apply it to 20 | your programs, too. 21 | 22 | When we speak of free software, we are referring to freedom, not 23 | price. Our General Public Licenses are designed to make sure that you 24 | have the freedom to distribute copies of free software (and charge for 25 | them if you wish), that you receive source code or can get it if you 26 | want it, that you can change the software or use pieces of it in new 27 | free programs, and that you know you can do these things. 28 | 29 | To protect your rights, we need to prevent others from denying you 30 | these rights or asking you to surrender the rights. Therefore, you have 31 | certain responsibilities if you distribute copies of the software, or if 32 | you modify it: responsibilities to respect the freedom of others. 33 | 34 | For example, if you distribute copies of such a program, whether 35 | gratis or for a fee, you must pass on to the recipients the same 36 | freedoms that you received. You must make sure that they, too, receive 37 | or can get the source code. And you must show them these terms so they 38 | know their rights. 39 | 40 | Developers that use the GNU GPL protect your rights with two steps: 41 | (1) assert copyright on the software, and (2) offer you this License 42 | giving you legal permission to copy, distribute and/or modify it. 43 | 44 | For the developers' and authors' protection, the GPL clearly explains 45 | that there is no warranty for this free software. For both users' and 46 | authors' sake, the GPL requires that modified versions be marked as 47 | changed, so that their problems will not be attributed erroneously to 48 | authors of previous versions. 
49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". "Licensees" and 82 | "recipients" may be individuals or organizations. 83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. 
The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. "Object code" means any non-source 116 | form of a work. 117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 
122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 
155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 
186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. 
This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. Conveying Non-Source Forms. 246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 
256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 
287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 
317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. 
If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 
386 | those licensors and authors. 387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 
421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. 
If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. The 475 | work thus licensed is called the contributor's "contributor version". 476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 
486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. "Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 
512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. 
If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 
578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 
613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | -------------------------------------------------------------------------------- /python/bayesprev/bayesprev.py: -------------------------------------------------------------------------------- 1 | """Bayesian estimation of population prevalence 2 | 3 | Bayesian inference of population prevalence 4 | RAA Ince, AT Paton, JW Kay & PG Schyns 5 | (2021) eLife 10:e62461 doi: 10.7554/eLife.62461 6 | 7 | If a statistical test with false positive rate alpha is performed in 8 | N participants (or other replication units) and k are found to be significant 9 | then we can estimate the population prevalence of a true positive effect. 10 | This is the population level within-participant replication probability. 
11 | 12 | """ 13 | __version__ = "0.1.1" 14 | 15 | import numpy as np 16 | import scipy as sp 17 | from scipy.optimize import fsolve 18 | from scipy.stats import beta 19 | 20 | # parameters of the Beta(r, s) prior on the prevalence gamma (r = s = 1 is uniform) 21 | r = 1 22 | s = 1 23 | 24 | 25 | def map(k, n, a=0.05, b=1): 26 | """Bayesian maximum a posteriori estimate of population prevalence gamma 27 | under a uniform prior 28 | 29 | k : number of participants significant out of 30 | n : total number of participants 31 | a : alpha value of within-participant test (default=0.05) 32 | b : sensitivity/beta of within-participant test (default=1) 33 | 34 | """ 35 | 36 | theta = (k + r - 1.0) / (n + r + s - 2.0) 37 | if theta <= a: 38 | return 0.0 39 | elif theta >= b: 40 | return 1.0 41 | else: 42 | return (theta - a) / (b - a) 43 | 44 | 45 | def posterior(x, k, n, a=0.05, b=1): 46 | """Bayesian posterior of population prevalence gamma 47 | under a uniform prior 48 | 49 | x : values of gamma at which to evaluate the posterior density 50 | k : number of participants significant out of 51 | n : total number of participants 52 | a : alpha value of within-participant test (default=0.05) 53 | b : sensitivity/beta of within-participant test (default=1) 54 | 55 | """ 56 | 57 | theta = a + (b - a) * x 58 | post = (b - a) * beta.pdf(theta, k + r, n - k + s) 59 | post = post / (beta.cdf(b, k + r, n - k + s) - beta.cdf(a, k + r, n - k + s)) 60 | return post 61 | 62 | 63 | def bound(p, k, n, a=0.05, b=1): 64 | """Bayesian lower bound of population prevalence gamma under a uniform prior 65 | 66 | p : posterior probability mass above the lower bound (e.g.
0.95) 67 | k : number of participants significant out of 68 | n : total number of participants 69 | a : alpha value of within-participant test (default=0.05) 70 | b : sensitivity/beta of within-participant test (default=1) 71 | 72 | """ 73 | 74 | b1 = k + r 75 | b2 = n - k + s 76 | cdfp = (1 - p) * beta.cdf(b, b1, b2) + p * beta.cdf(a, b1, b2) 77 | the_c = beta.ppf(cdfp, b1, b2) 78 | g_c = (the_c - a) / (b - a) 79 | return g_c 80 | 81 | 82 | def hpdi(p, k, n, a=0.05, b=1): 83 | """Bayesian highest posterior density interval of population prevalence gamma 84 | under a uniform prior 85 | 86 | p : coverage of the HPDI to return (e.g. 0.95 for 95%) 87 | k : number of participants significant out of 88 | n : total number of participants 89 | a : alpha value of within-participant test (default=0.05) 90 | b : sensitivity/beta of within-participant test (default=1) 91 | 92 | """ 93 | 94 | b1 = k + r 95 | b2 = n - k + s 96 | 97 | # truncated beta pdf/cdf/icdf 98 | tbpdf = lambda x: beta.pdf(x, b1, b2) / (beta.cdf(b, b1, b2) - beta.cdf(a, b1, b2)) 99 | tbcdf = lambda x: (beta.cdf(x, b1, b2) - beta.cdf(a, b1, b2)) / ( 100 | beta.cdf(b, b1, b2) - beta.cdf(a, b1, b2) 101 | ) 102 | tbicdf = lambda x: beta.ppf((1 - x) * beta.cdf(a, b1, b2) + x * beta.cdf(b, b1, b2), b1, b2) 103 | 104 | if k == 0: # no significant participants: posterior is decreasing, interval starts at a 105 | x = np.array([a, tbicdf(p)]) 106 | elif k == n: # all participants significant: posterior is increasing, interval ends at b 107 | x = np.array([tbicdf(1 - p), b]) 108 | else: 109 | f = lambda x: np.array([tbcdf(x[1]) - tbcdf(x[0]) - p, tbpdf(x[1]) - tbpdf(x[0])]) 110 | x, info, ier, mesg = fsolve( 111 | f, np.array([tbicdf((1 - p) / 2), tbicdf((1 + p) / 2)]), full_output=True 112 | ) 113 | 114 | # limit to valid theta values 115 | if (x[0] < a) or (x[1] < x[0]): 116 | x = np.array([a, tbicdf(p)]) 117 | if x[1] > b: 118 | x = np.array([tbicdf(1 - p), b]) 119 | hpdi = (x - a) / (b - a) 120 | return hpdi 121 | 122 | def logodds(k, n, x=0.5, a=0.05, b=1): 123 | """Posterior log-odds in favor of the population prevalence gamma being 124 | greater than x 125 | 126 | k : number
of participants significant out of 127 | n : total number of participants 128 | x : prevalence threshold for the log-odds (default=0.5) 129 | a : alpha value of within-participant test (default=0.05) 130 | b : sensitivity/beta of within-participant test (default=1) 131 | 132 | """ 133 | 134 | theta = a + (b - a) * x 135 | b1 = k + r 136 | b2 = n - k + s 137 | p = (beta.cdf(b, b1, b2) - beta.cdf(theta, b1, b2)) / (beta.cdf(b, b1, b2) - beta.cdf(a, b1, b2)) 138 | lo = np.log(p / (1 - p)) 139 | return lo 140 | 141 | def diff_between(k1, n1, k2, n2, p=0.96, a=0.05, b=1, Nsamp=10000): 142 | """Bayesian maximum a posteriori estimate of the difference in prevalence 143 | when the same test is applied to two groups 144 | 145 | k1 : number of participants significant in group 1 out of 146 | n1 : total number of participants in group 1 147 | k2 : number of participants significant in group 2 out of 148 | n2 : total number of participants in group 2 149 | p : coverage for highest-posterior density interval (in [0, 1], default=0.96) 150 | a : alpha value of within-participant test (default=0.05) 151 | b : sensitivity/beta of within-participant test (default=1) 152 | Nsamp : number of samples from the posterior (default=10000) 153 | 154 | Outputs: 155 | map : maximum a posteriori estimate of the difference in prevalence: 156 | gamma_1 - gamma_2 157 | post_x : x-axis for kernel density fit of posterior distribution of the 158 | above 159 | post : posterior distribution from kernel density fit 160 | hpdi : highest-posterior density interval with coverage p 161 | probGT : estimated posterior probability that the prevalence is higher in group 1 162 | logoddsGT : estimated log odds in favour of the hypothesis that the prevalence is higher in group 1 163 | samples : posterior samples 164 | 165 | """ 166 | 167 | # gamma priors = Beta(r,s) 168 | r1 = 1 169 | s1 = 1 170 | r2 = 1 171 | s2 = 1 172 | 173 | # Parameters for Beta posteriors 174 | m11 = k1 + r1 175 | m12 = n1 - k1 + s1 176 | m21 = k2 + r2 177 | m22 = n2 - k2 + s2 178 | 179 | # Generate
truncated beta samples 180 | # fix numerical issue: return NaNs if the truncation range [a, b] holds (almost) no posterior mass 181 | r1 = (beta.cdf(a, m11, m12), beta.cdf(b, m11, m12)) 182 | r2 = (beta.cdf(a, m21, m22), beta.cdf(b, m21, m22)) 183 | if np.any([np.isclose(*r, rtol=1e-12, atol=1e-12) for r in [r1, r2]]): 184 | res = { 185 | x: np.nan for x in ["map", "post_x", "post", "hpdi", "probGT", "logoddsGT", "samples"] 186 | } 187 | return res 188 | 189 | th1 = beta.ppf(np.random.uniform(r1[0], r1[1], Nsamp), m11, m12) 190 | th2 = beta.ppf(np.random.uniform(r2[0], r2[1], Nsamp), m21, m22) 191 | 192 | # vector of estimates of prevalence differences 193 | samples = (th1 - th2) / (b - a) 194 | 195 | # kernel density estimate of posterior 196 | post_x = np.linspace(-1, 1, 200) 197 | kde = sp.stats.gaussian_kde(samples) 198 | post = kde(post_x) 199 | map = post_x[np.argmax(post)] 200 | 201 | # Estimate the posterior probability, and logodds, that the prevalence is higher for group 1. 202 | # Laplace's rule of succession used to avoid estimates of 0 or 1 203 | probGT = (np.sum(samples > 0) + 1) / (Nsamp + 2) 204 | logoddsGT = np.log(probGT / (1 - probGT)) 205 | hpdi = _hpdi(samples, p) 206 | 207 | res = { 208 | "map": map, 209 | "post_x": post_x, 210 | "post": post, 211 | "hpdi": hpdi, 212 | "probGT": probGT, 213 | "logoddsGT": logoddsGT, 214 | "samples": samples, 215 | } 216 | return res 217 | 218 | 219 | def diff_within(k11, k10, k01, n, p=0.96, a=0.05, b=1, Nsamp=10000): 220 | """Bayesian maximum a posteriori estimate of the difference in prevalence 221 | when two tests are applied to the same group of participants 222 | 223 | k11 : number of participants significant in both tests 224 | k10 : number of participants significant in test 1 only 225 | k01 : number of participants significant in test 2 only 226 | n : total number of participants 227 | p : coverage for highest-posterior density interval (in [0, 1], default=0.96) 228 | a : alpha value of within-participant test (default=0.05) 229 | b : sensitivity/beta of within-participant test (default=1) 230 | Nsamp :
number of samples from the posterior 231 | 232 | Outputs: 233 | map : maximum a posteriori estimate of the difference in prevalence: 234 | gamma_1 - gamma_2 235 | post_x : x-axis for kernel density fit of posterior distribution of the 236 | above 237 | post : posterior distribution from kernel density fit 238 | hpdi : highest-posterior density interval with coverage p 239 | probGT : estimated posterior probability that the prevalence is higher for test 1 240 | logoddsGT : estimated log odds in favour of the hypothesis that the prevalence is higher for test 1 241 | samples : posterior samples 242 | 243 | """ 244 | 245 | # Parameters for the Dirichlet prior distribution (1,1,1,1) = uniform 246 | r11 = 1 247 | r10 = 1 248 | r01 = 1 249 | r00 = 1 250 | 251 | # parameters for posterior Dirichlet distribution 252 | k00 = n - k11 - k10 - k01 253 | m11 = k11 + r11 254 | m10 = k10 + r10 255 | m01 = k01 + r01 256 | m00 = k00 + r00 257 | 258 | r11 = (beta.cdf(0, m11, m10 + m01 + m00), beta.cdf(b, m11, m10 + m01 + m00)) 259 | if np.any(np.isclose(*r11, rtol=1e-12, atol=1e-12)): 260 | res = { 261 | x: np.nan for x in ["map", "post_x", "post", "hpdi", "probGT", "logoddsGT", "samples"] 262 | } 263 | return res 264 | # samples from the truncated Dirichlet posterior 265 | z11 = np.random.uniform(r11[0], r11[1], Nsamp) 266 | th11 = beta.ppf(z11, m11, m10 + m01 + m00) 267 | 268 | lo = np.maximum((a - th11) / (1 - th11), 0) 269 | hi = (b - th11) / (1 - th11) 270 | 271 | r10 = (beta.cdf(lo, m10, m01 + m00), beta.cdf(hi, m10, m01 + m00)) 272 | if np.any(np.isclose(*r10, rtol=1e-12, atol=1e-12)): 273 | res = { 274 | x: np.nan for x in ["map", "post_x", "post", "hpdi", "probGT", "logoddsGT", "samples"] 275 | } 276 | return res 277 | z10 = np.random.uniform(r10[0], r10[1], Nsamp) 278 | u10 = beta.ppf(z10, m10, m01 + m00) 279 | th10 = (1 - th11) * u10 280 | 281 | lo = np.maximum((a - th11) / (1 - th11 - th10), 0) 282 | hi = np.minimum((b - th11) / (1 - th11 - th10), 1) 283 | r01 =
(beta.cdf(lo, m01, m00), beta.cdf(hi, m01, m00)) 284 | if np.any(np.isclose(*r01, rtol=1e-12, atol=1e-12)): 285 | res = { 286 | x: np.nan for x in ["map", "post_x", "post", "hpdi", "probGT", "logoddsGT", "samples"] 287 | } 288 | return res 289 | z01 = np.random.uniform(r01[0], r01[1], Nsamp) 290 | u01 = beta.ppf(z01, m01, m00) 291 | th01 = (1 - th11 - th10) * u01 292 | 293 | th00 = 1 - th11 - th10 - th01 294 | 295 | # samples of posterior prevalence difference (test 1 minus test 2) 296 | samples = (th10 - th01) / (b - a) 297 | 298 | # kernel density estimate of posterior 299 | post_x = np.linspace(-1, 1, 200) 300 | kde = sp.stats.gaussian_kde(samples) 301 | post = kde(post_x) 302 | map = post_x[np.argmax(post)] 303 | 304 | # Estimate the posterior probability, and logodds, that the prevalence is higher for test 1. 305 | # Laplace's rule of succession used to avoid estimates of 0 or 1 306 | probGT = (np.sum(samples > 0) + 1) / (Nsamp + 2) 307 | logoddsGT = np.log(probGT / (1 - probGT)) 308 | hpdi = _hpdi(samples, p) 309 | 310 | res = { 311 | "map": map, 312 | "post_x": post_x, 313 | "post": post, 314 | "hpdi": hpdi, 315 | "probGT": probGT, 316 | "logoddsGT": logoddsGT, 317 | "samples": samples, 318 | } 319 | return res 320 | 321 | 322 | def _hpdi(data, p): 323 | """HPDI modified from https://arviz-devs.github.io/arviz/""" 324 | data = data.flatten() 325 | n = len(data) 326 | data = np.sort(data) 327 | interval_idx_inc = int(np.floor(p * n)) 328 | n_intervals = n - interval_idx_inc 329 | interval_width = data[interval_idx_inc:] - data[:n_intervals] 330 | 331 | if len(interval_width) == 0: 332 | raise ValueError("Too few elements for interval calculation.") 333 | 334 | min_idx = np.argmin(interval_width) 335 | hdi_min = data[min_idx] 336 | hdi_max = data[min_idx + interval_idx_inc] 337 | 338 | hpdi = np.array([hdi_min, hdi_max]) 339 | 340 | return hpdi 341 | -------------------------------------------------------------------------------- /python/bayesprev/pyproject.toml:
-------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["flit_core >=3.2,<4"] 3 | build-backend = "flit_core.buildapi" 4 | 5 | [project] 6 | name = "bayesprev" 7 | authors = [{name = "Robin Ince", email = "robince@gmail.com"}] 8 | license = {file = "LICENSE"} 9 | classifiers = ["License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)"] 10 | dynamic = ["version", "description"] 11 | dependencies = [ 12 | "numpy", 13 | "scipy", 14 | ] 15 | 16 | [project.urls] 17 | Home = "https://github.com/robince/bayesian-prevalence" 18 | -------------------------------------------------------------------------------- /python/bayesprev_example.py: -------------------------------------------------------------------------------- 1 | # Example of how to use Bayesian prevalence functions 2 | # 3 | # 1. Simulate or load within-participant raw experimental data 4 | # 2. LEVEL 1: Apply statistical test at the individual level 5 | # 3. LEVEL 2: Apply Bayesian Prevalence to the outcomes of Level 1 6 | 7 | import numpy as np 8 | import scipy as sp 9 | import bayesprev 10 | import matplotlib.pyplot as plt 11 | 12 | # 13 | # 1. Simulate or load within-participant raw experimental data 14 | # 15 | 16 | # 1.1.
Simulate within-participant raw experimental data 17 | Nsub = 20 # number of participants 18 | Nsamp = 100 # trials/samples per participant 19 | sigma_w = 10 # within-participant SD 20 | sigma_b = 2 # between-participant SD 21 | mu_g = 1 # population mean 22 | 23 | # per-participant mean drawn from population normal distribution 24 | submeanstrue = np.random.normal(mu_g, sigma_b, Nsub) 25 | # rawdat holds trial data for each participant 26 | rawdat = np.zeros((Nsamp, Nsub)) 27 | for si in range(Nsub): 28 | # generate trials for each participant 29 | rawdat[:, si] = np.random.normal(submeanstrue[si], sigma_w, Nsamp) 30 | 31 | # 1.2. Load within-participant raw experimental data 32 | # Load your own data into the variable rawdat with shape (Nsamp, Nsub), 33 | # setting Nsamp and Nsub accordingly. 34 | 35 | # 36 | # 2. LEVEL 1 37 | # 38 | 39 | # 2.1. Within-participant statistical test 40 | # This step performs a within-participant statistical test. Here we use a t-test for 41 | # non-zero mean, which is one of the simplest statistical tests. In general, any 42 | # statistical test can be used at Level 1. 43 | 44 | # calculates a t-test against 0 mean independently for each participant 45 | t, p = sp.stats.ttest_1samp(rawdat, 0) 46 | # p holds p-values of test for each participant 47 | alpha = 0.05 # false positive rate of test 48 | indsig = p