├── README.md
├── links.md
├── registered-reports_guide.md
├── replication-packages.md
├── review-criteria.md
└── rr
    ├── README.md
    ├── icsme_rr_guide.md
    ├── registered-reports_ICSME_CFP.md
    ├── registered-reports_MSR_CFP.md
    └── rr_policies.md
/README.md: -------------------------------------------------------------------------------- 1 | # EMSE Open Science Initiative 2 | 3 | 4 | Openness in science is key to fostering progress via transparency, reproducibility, and replicability. Open data and open source, especially, are two fundamental pillars of open science, as both build the core for excellence in evidence-based research. The Empirical Software Engineering journal (EMSE) has therefore decided to explicitly foster open science and reproducible research by encouraging and supporting authors to share their (anonymised and curated) empirical data and source code in the form of replication packages. The overall goals are: 5 | * Increasing the transparency, reproducibility, and replicability of research endeavours. This supports the immediate credibility of authors' work, and it also provides a common basis for joint community efforts grounded on shared data. 6 | * Building up an overall body of knowledge in the community, leading to widely accepted and well-formed software engineering theories in the long run. 7 | 8 | This document describes the principles, the process, and the infrastructure that support this initiative. 9 | 10 | This information is also available in the editorial [The open science initiative of the Empirical Software 11 | Engineering journal](https://link.springer.com/epdf/10.1007/s10664-019-09712-x) (Published May 2019, [doi:10.1007/s10664-019-09712-x](https://doi.org/10.1007/s10664-019-09712-x)) 12 | 13 | ## Open Science Principles at EMSE 14 | 15 | As for any initiative in a research community, the success of the Open Science Initiative depends on the willingness and the ability of authors to disclose their data.
Therefore, we strive to implement the Open Science Initiative at EMSE as a community effort, with services that aim at encouraging and supporting authors of EMSE articles in opening up their research. The steering and motivating principle is that only openness in empirical research increases the transparency of research in such a way that the authors' empirical analyses can be reproduced, fully understood, and ideally replicated by others not involved in the research. To this end, we aim at promoting a data-sharing culture where authors publicly archive their data and related material required to understand and reproduce the claims and analyses presented in their manuscripts. Our hope is to move our community as a whole forward to the point where open science becomes the norm. 16 | 17 | All submissions to EMSE will undergo the same known review process regardless of whether authors decide to disclose their data or not. Yet, as the leading journal in empirical research methodologies and their application to software engineering, we strongly encourage all authors to make an effort in supporting this initiative by making data available upon submission (either privately or publicly) and especially upon acceptance (publicly). Authors who cannot disclose their data (e.g. industrial data sets that fall under non-disclosure agreements) are asked to provide an explicit and short statement in their manuscript. 18 | 19 | To make research data sets and research software accessible and citable, we encourage authors to: 20 | * archive data on preserved archives such as [zenodo.org](https://zenodo.org/) and [figshare.com](https://figshare.com/) so that replication packages remain available in the very long term (on Zenodo, there is a [dedicated community for empirical software engineering](https://zenodo.org/communities/empirical-software-engineering/)).
21 | * use an appropriate license, e.g., the [CC-BY 4.0 license](https://creativecommons.org/licenses/by/4.0/) for data and the [MIT License](https://choosealicense.com/licenses/mit/) for code. Look at [choosealicense.com](https://choosealicense.com/) for more information about suitable open source licenses. 22 | 23 | Those *replication packages* disclosed by the authors will then undergo an additional, short review by the open science board as described next. When archiving data as part of a replication package, we ask authors to attend to the [FAIR principles](https://www.force11.org/group/fairgroup/fairprinciples), i.e. data should be: 24 | * Findable, 25 | * Accessible, 26 | * Interoperable, and 27 | * Reusable. 28 | 29 | Authors should therefore use archival repositories and avoid putting data and software on their own (institutional or private) websites or on systems like Dropbox, version control systems (SVN, Git), or services like Academia.edu and ResearchGate. Personal websites are prone to changes and errors, and more than 30% of them stop working within a four-year period. Moreover, nobody should have the ability to delete data once it is public. Finally, the package disclosed via an archival repository should link to the paper (DOI) upon final production of the manuscript. 30 | 31 | ## Open Science Board 32 | 33 | * [Daniel Méndez](https://www.mendezfe.org) (Chair), Blekinge Institute of Technology, Sweden, and fortiss GmbH, Germany 34 | * [René Just](https://homes.cs.washington.edu/~rjust/), University of Washington, USA 35 | * [Daniel Graziotin](https://ineed.coffee), University of Stuttgart, Germany 36 | * [Neil Ernst](https://www.neilernst.net/), University of Victoria, Canada 37 | * [Chakkrit Tantithamthavorn](http://chakkrit.com/), Monash University, Australia 38 | 39 | If you are interested in joining the board and contributing to open science at EMSE, contact Daniel and Martin by email. 40 | 41 | ## Open Science Process 42 | 43 | 1. 
Once a manuscript gets "Minor revision" or "Acceptance", the decision email contains the following text: 44 | * "EMSE encourages open science and reproducible research. We are happy to invite you to submit your open data, open material, or open source code (in the following referred to as "replication package") for an additional, short review by the open science board. Provided you agree to participate, the board will then review the replication package and check its eligibility to publicly recognise your open science effort with an open science badge. The board will provide you with constructive feedback on the content and documentation of the package. To submit your replication package, please send an email to mendezfe@acm.org. Should you have any questions, please do not hesitate to contact the open science chair, Daniel Mendez. Note that your decision to participate in the open science initiative will not affect the remaining reviewing and editorial process in any way." 45 | 1. The authors are given two weeks to submit their replication package after the final acceptance. 46 | 1. When the authors submit a replication package, the Open Science Chairs ask one member of the Open Science board to review the package. 47 | * The review is made according to transparent [review criteria](review-criteria.md) 48 | * The open science reviewer is given two weeks to accept the package or consolidate a list of questions to the authors 49 | * The open science review is blinded: the open science reviewer does not sign their review 50 | 1. If necessary, the open science reviewer asks for changes by sending an email to the authors 51 | * The authors are given another two weeks to make the changes. 52 | 1. The open science reviewer makes the final decision. 
53 | * If the replication package is rated as insufficient, the manuscript is still accepted, and the authors are given a list of constructive comments on how to improve their open science practices 54 | * If the replication package is considered to be of good or excellent quality, the authors can add to their final version: "Open Science Replication Package validated by the Open Science Board". 55 | 56 | Throughout the whole communication process, the Open Science co-Chairs serve as mediators between the authors and the Open Science Board members in a, for now, single-blind process. 57 | 58 | The [Frequently Asked Questions](FAQ.md) provides additional information. 59 | 60 | ## EMSE papers with the Open Science Badge 61 | 62 | When papers are awarded the "Open Science" badge, the following text is added as a separate title note; the badge note appears after the "Communicated by" line: 63 | 64 | "This paper has been awarded the Empirical Software Engineering (EMSE) open science badge." 65 | 66 | Springer maintains the 'Open Science' topical collection for papers that have been awarded the EMSE open science badge: 67 | 68 | 69 | ## FAQ 70 | 71 | **How should the replication packages be disclosed?** 72 | We encourage authors to archive their data as part of replication packages on preserved archives such as [zenodo.org](https://zenodo.org/) or [figshare.com](https://figshare.com/) so that the data will receive a [DOI](https://www.doi.org/) and become citable. Further, we recommend that authors use the [CC0](https://creativecommons.org/publicdomain/zero/1.0/) dedication (or the [CC-BY 4.0 license](https://creativecommons.org/licenses/by/4.0/)) when publishing the data (automatic when using, for instance, zenodo.org or figshare.com). 73 | 74 | Those archives allow updating published replication packages at any time. 
We strongly recommend that authors update the package information after the review process, once the manuscript is in production and receives a DOI, with a reference to the published manuscript so that the package is citable alongside the published article. 75 | 76 | **In this EMSE open science process, what’s the difference between reproducibility and replicability?** 77 | There is no consensus across disciplines about the difference between reproducibility and replicability. Often, replicability is seen as the ability to repeat the same study under the very same conditions, yielding the same results. 78 | Reproducibility is seen as the ability to independently reproduce the study, yielding the same or similar results with a given precision. In the EMSE open science process, we make no specific distinction for now. The goal is to encourage open data and code so that researchers can reproduce the results (partially or completely), and/or perform further research using this data and code. 79 | 80 | **What happens if the data violates one or more of the FAIR principles?** 81 | [FAIR](https://www.force11.org/group/fairgroup/fairprinciples) is an interesting initiative which we follow attentively. It is good if authors get to know, and ideally follow, those principles, but it is not required. 82 | 83 | **Is restricted access sharing allowed (e.g., data is stored online but the link is not public and provided only if the interested party explicitly asks for it or signs a formal agreement with the data owner)?** 84 | We certainly understand that in certain contexts, unconditional data disclosure to the public is not possible. Open science, however, means that data is publicly accessible by anyone, which is why the open science badges cannot be granted when data is shared in a restricted manner. 
85 | 86 | **When it comes to data about humans, do we want to encourage or adhere to some kind of privacy regulations such as GDPR?** 87 | Privacy is a very important concern taken seriously by the open science board. Where required, proper consent and anonymisation of the data are mandatory, in compliance with existing ethical codes of conduct and regulations such as [GDPR](https://en.wikipedia.org/wiki/General_Data_Protection_Regulation). Note that besides regional/local regulations and policies, and besides institutional entities such as an [IRB](https://en.wikipedia.org/wiki/Institutional_review_board) and any approvals of a study required by such entities, sharing data about humans (e.g. interview or survey data) must always be based on the explicit consent given by the participants. That is, anonymising the resulting data sets is not enough to meet our ethical standards. Further note that such consents may also be withdrawn by participants during a study (e.g., after presenting the results). While the open science board is advised to check for proper anonymisation when deemed necessary, it remains the sole responsibility of the authors to give careful consideration to privacy-related and ethical concerns, to obtain the necessary approvals and consents prior to submission of their data sets, and to ensure compliance of their actions with existing regulations. 88 | 89 | 90 | **What happens if data or access to it are modified by the authors after the approval of the Open Science chairs?** 91 | We count on the authors’ ethics to avoid this inappropriate behaviour. Note that once data has been released on an archival website (e.g. Zenodo), later modifications are no longer possible. This is also one of the reasons why we refrain from disclosing data sets only on institutional websites or personal webpages. 
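One practical complement to archival immutability is to publish a checksum manifest with the package, so that anyone can later verify that the released files are unchanged. A minimal stdlib-only sketch (the directory layout and file names are hypothetical):

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """SHA-256 digest of a single file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def checksum_manifest(package_dir):
    """Map each file in the replication package to its SHA-256 digest."""
    root = Path(package_dir)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Shipping the resulting manifest (e.g., as `SHA256SUMS`, a hypothetical file name) inside the archived package lets a reviewer recompute the digests and detect any post-release change.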
92 | 93 | **What does replication mean for qualitative studies, say ethnography research or action research?** 94 | Complete replication of qualitative human studies is challenging, as human practices are rarely purely rational and reproducible. In our understanding, however, data sharing can at least support comprehending the analysis results yielded by the researchers and the conclusions they have drawn based on their data. We consider, therefore, a qualitative study to be (sufficiently) reproducible when the shared data allows other researchers to understand the claims and analyses presented by the authors. For interview research, for example, the shared data should include the instrumentation, transcripts (potentially anonymised), field notes, and codebooks so that others not involved in the study fully understand how the authors inferred their conclusions. 95 | 96 | **What does replication mean for systematic reviews?** 97 | As with qualitative studies, secondary studies are difficult to fully reproduce. There are multiple reasons for this, such as the different functionalities of different search engines yielding different results when repeatedly used in an independent manner. We consider a secondary study to be sufficiently reproducible when the reporting in the manuscript and the shared data and scripts allow others to understand the claims and analyses presented by the authors. 98 | 99 | **Why is the open science review process single blind?** 100 | It is for now single blind in order to be the least disruptive and to maximise acceptance (both from authors and from reviewers). We will consider alternatives in the near future. 101 | 102 | **Will the absence of badges on research papers, as a result of confidential research data/artefacts, give the impression of lower-class science?** 103 | No, the badge signals open data and open science. 
The absence of a badge does not mean that the work is lower-class science; it only means that the authors could not participate in the process or that the data was not open enough. In the presence of confidential research data or artefacts, authors will be encouraged to state in their paper the reasons for confidentiality in order to prevent misleading impressions. 104 | 105 | **Will the absence of badges discourage research done on industrial proprietary datasets, which are potentially very valuable?** 106 | The journal does encourage and support high-quality research based on proprietary datasets that provides novel and unique insights. 107 | 108 | **Are there multiple badges of different flavours? What is the connection to the [ACM badging](https://www.acm.org/publications/policies/artifact-review-badging) (used by [ROSE](https://2018.fseconference.org/track/rosefest-2018))?** 109 | First, there will be a single badge, "Open Science", in order to keep things simple. Then, multiple badges (up to the five levels of ACM) may be introduced. The badges are also being discussed with the publisher (Springer). 110 | 111 | **Do artifacts need to be executable? What about models or diagrams?** 112 | Many areas of software engineering do not generate executable artifacts. Non-code artifacts such as UML diagrams/models, requirements text, design documents, etc. are all valid artifacts. If the artifact addresses the topic of the paper and supports replication, that's fine. See here for a lengthier, but not complete, [list of artifacts](https://github.com/researchart/all/blob/master/ListOfArtifacts.md). Finally, we prefer such artifacts to be machine-readable and open, e.g., using an open format such as JSON or XMI, as opposed to Visio, PNG, or PDF. 113 | 114 | ## See also 115 | 116 | * [How to make a good open-science repository? 
(Martin Monperrus)](https://www.monperrus.net/martin/how-to-open-science-software-research) 117 | * [Open Science in Software Engineering (Daniel Mendez, Daniel Graziotin, Stefan Wagner, Heidi Seibold)](https://arxiv.org/abs/1904.06499) 118 | * [Reproducibility News Feed](https://vida-nyu.github.io/reproducibility-news/feed.rss) 119 | * [Awesome open-science software](https://github.com/INRIA/awesome-open-science-software) 120 | 121 | -------------------------------------------------------------------------------- /links.md: -------------------------------------------------------------------------------- 1 | # Links on Open Science, Reproducibility, and Replication 2 | 3 | ## General introductions 4 | - [Open Science in Software Engineering](https://arxiv.org/abs/1904.06499), book chapter highlighting general principles, concepts and terms, and how-tos for open science (with focus on registered reports, open access, and open data) 5 | 6 | ## Reproducibility 7 | 8 | ### What different types of reproducibility are there? 9 | - [rOpenSci's Introduction](https://ropensci.github.io/reproducibility-guide/sections/introduction/) with an overview of reproducibility 10 | - Stodden et al (2013). "[Setting the Default to Reproducible: Reproducibility in Computational and Experimental Mathematics](http://stodden.net/icerm_report.pdf)", ICERM workshop. 11 | 12 | ### How to share credit (attribution) 13 | - Stodden's (2009) proposal for a [Reproducible Research Standard](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1362040) (RRS). 14 | 15 | ### New publication models 16 | - [ReScience Journal](http://rescience.github.io/) for publishing the replication of existing computational research. 17 | 18 | ### Research on Reproducibility 19 | - Allison et al (2018). "[Reproducibility of research: Issues and 20 | proposed remedies](https://www.pnas.org/content/pnas/115/11/2561.full.pdf)", introduction to special issue, PNAS, 2018. 
21 | 22 | ### Tools for Reproducibility 23 | 24 | * [Open Science Framework (OSF.io)](https://osf.io) provides project management support for researchers across the entire research lifecycle. Developed by the [Center for Open Science](https://cos.io). 25 | * [CodeCheck](https://codecheck.org.uk/): guidelines and tools to evaluate computer programs underlying scientific papers 26 | 27 | ## Registered/Pre-registered reports 28 | 29 | - [Registered Reports (RR) FAQ](https://osf.io/gha9f/) from OSF 30 | - [OSF's RR resources](https://osf.io/gha9f/) 31 | - [RR Facts for Editors](https://osf.io/rux9a/) 32 | - [What's next for Registered Reports?](https://www.nature.com/articles/d41586-019-02674-6) by Chris Chambers in Nature, September 2019. 33 | - Overview and summary of what has been learnt about Registered Reports so far. 34 | - [The Seven Deadly Sins of Psychology: A Manifesto for Reforming the Culture of Scientific Practice](https://www.amazon.com/Seven-Deadly-Sins-Psychology-Scientific/dp/0691192278/ref=sr_1_2?keywords=chris+chambers&qid=1568746957&s=gateway&sr=8-2) by Chris Chambers. 35 | - Detailed summary of the many problems with bias and unreliability in psychology (but applicable to many empirical fields). Proposes pre-registration as a main remedy but also discusses other steps that can be taken. 36 | - Does pre-registration lead to different results? Yes: without pre-registration, the effect size was more than double (0.36 without RR (N=900) vs. 0.16 with RR (N=93)) [in this study within psychology by Schäfer and Schwarz](https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00813/full): "The Meaningfulness of Effect Sizes in Psychological Research: Differences Between Sub-Disciplines and the Impact of Potential Biases", April 2019, Frontiers in Psychology. 
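As a reminder of how standardized effect sizes of this general kind are computed, here is a small sketch of Cohen's d, one common standardized mean difference (illustrative toy code only; the study cited above reports its own effect size measures, which are not necessarily Cohen's d):

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d: difference of means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Sample variances (Bessel-corrected).
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (n_b - 1)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean_a - mean_b) / pooled_sd
```

Rough conventional interpretation thresholds (0.2 small, 0.5 medium, 0.8 large) put the difference between 0.36 and 0.16 into perspective.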
37 | 38 | ## Replications in SE 39 | 40 | - [Incomplete but growing list of replications in SE](https://github.com/researchart/rose/blob/master/replications_in_se.md) 41 | 42 | ## Other lists of Open Science resources 43 | 44 | - [Open Science Software](https://github.com/INRIA/awesome-open-science-software) list collected by Martin Monperrus 45 | -------------------------------------------------------------------------------- /registered-reports_guide.md: -------------------------------------------------------------------------------- 1 | **NB: Please [contact the MSR RR track chairs](mailto:nernst@uvic.ca) with any questions, feedback, or requests for clarification. Specific analysis approaches mentioned below are intended as examples, not mandatory components.** 2 | 3 | # Title (required) 4 | Provide the working title of your study. It may be the same title that you submit for publication of your final manuscript, but it is not a requirement. 5 | 6 | * **Example**: Should your family travel with you on the enterprise? 7 | * Subtitle (optional): Effect of accompanying families on the work habits of crew members 8 | 9 | # Authors (required) 10 | At this stage, we believe that an unblinded/single-blind review is most productive. 11 | 12 | # Structured Abstract (required) 13 | The abstract should describe in 200 words or so: 14 | 15 | ## Background/Context 16 | What is your research about? Why are you doing this research, and why is it interesting? 17 | 18 | **Example**: The enterprise is the flagship of the federation, and it allows families to travel onboard. However, there are no studies that evaluate how this affects the crew members. 19 | 20 | ## Objective/Aim 21 | What exactly are you studying/investigating/evaluating? What are the objects of the study? 22 | 23 | We welcome both confirmatory and exploratory types of studies. 
24 | 25 | **Example**: We evaluate whether the frequency of sick days and the work effectiveness and efficiency differ between science officers who bring their family with them and science officers who are serving without their family. 26 | **Example**: We investigate the problem of frequent Holodeck use on interpersonal relationships with an ethnographic study using participant observation, in order to derive specific hypotheses about Holodeck usage. 27 | 28 | ## Method 29 | How are you addressing your objective? What data sources are you using? 30 | 31 | **Example**: We conduct an observational study and use a between-subject design. To analyze the data, we use a t test or Wilcoxon test, depending on the underlying distribution. Our data come from computer monitoring of Enterprise crew members. 32 | 33 | ## Limitations 34 | 35 | # Hypotheses / research questions (required) 36 | Clearly state the research hypotheses that you want to test with your study, and a rationale for the hypotheses. 37 | 38 | * **Example**: Science officers with their family on board have more sick days than science officers without their family. 39 | 40 | Rationale: Since toddlers are often sick, we can expect that crew members with their family onboard need to take sick days more often. 41 | 42 | # Introduction 43 | Give more details on the bigger picture of your study and how it contributes to this bigger picture. An important component of phase 1 review is assessing the importance and relevance of the study questions, so be sure to explain this. 
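The Method example above (choosing a t test or a Wilcoxon-type test based on the underlying distribution) can be pre-registered as an executable decision rule, which removes one degree of analytic freedom. A minimal sketch assuming SciPy is available; the Shapiro-Wilk check and the α = 0.05 threshold are illustrative choices, and the rank-based branch uses the Mann-Whitney test (the unpaired Wilcoxon rank-sum equivalent):

```python
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Pick the two-sample test as in the Method example: an unpaired
    t test if both samples pass a Shapiro-Wilk normality check,
    a Mann-Whitney (rank-sum) test otherwise."""
    _, p_a = stats.shapiro(a)
    _, p_b = stats.shapiro(b)
    if p_a > alpha and p_b > alpha:
        test_name = "t-test"
        _, p_value = stats.ttest_ind(a, b)
    else:
        test_name = "mann-whitney"
        _, p_value = stats.mannwhitneyu(a, b, alternative="two-sided")
    return test_name, p_value
```

Committing such a script alongside the phase 1 submission makes the eventual analysis auditable against the plan.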
44 | 45 | # Variables (required) 46 | * Independent Variable(s) and their operationalization 47 | * Dependent Variable(s) and their operationalization (e.g., time to solve a specified task) 48 | * Confounding Variable(s) and how their effect will be controlled (e.g., species type (Vulcan, Human, Tribble) might be a confounding factor; we control for it by separating our sample additionally into Human/Non-Human and using an ANOVA (normal distribution) or Friedman test (non-normal distribution) to distill its effect). 49 | 50 | For each variable, you should give: 51 | - name (e.g., presence of family) 52 | - abbreviation (if you intend to use one) 53 | - description (whether the family of the crew members travels on board) 54 | - scale type (nominal: either the family is present or not) 55 | - operationalization (crew members without family on board vs. crew members with family onboard) 56 | 57 | # Material/objects (required) 58 | Describe any material that you plan to use; be specific on whether you developed it (and how) or whether it is already defined (e.g., a standard Myers-Briggs-type indicator). 59 | 60 | **Example**: For sick days, we use the medical records from sick bay (ethics approval pending). For efficiency, we conduct standard interviews with the superior officer and crew members. The questions are the following: / can be found on the Web site / Appendix. Furthermore, we observe their performance during a simulated task. 61 | 62 | # Tasks (optional) 63 | If you use tasks, describe them, how they were designed or from where they are taken, and why they are suitable to evaluate the hypotheses / research questions. 64 | 65 | **Example**: For effectiveness of the crew members, we ask them to sweep a class 2 nebula. We simulate an error in the primary sensor array. Crew members should then run a level 3 diagnostic to spot the error, fix the error, and complete the sweep of the nebula. 
We measure the time to (i) spot that there is an error, (ii) decide on the correct diagnostic protocol, (iii) fix the error, and (iv) complete the sweep. 66 | 67 | # Participants/Subjects/sample (required) 68 | Describe how and why you select the sample. When you conduct a meta-analysis, describe the primary studies / work on which you base your meta-analysis. 69 | 70 | **Example**: We recruit crew members from the science department on a voluntary basis. They are our targeted population. 71 | 72 | # Execution Plan (required) 73 | Describe the experimental protocol. 74 | 75 | **Example**: Each crew member needs to sign the informed consent and agreement to process their data according to GDPR. Then, we conduct the interviews. Afterwards, participants need to complete the simulated task. 76 | 77 | # Analysis Plan (required) 78 | ## Descriptive statistics 79 | How do you describe the data? How do you handle outliers? 80 | 81 | **Example**: To represent the number of sick days, we use histograms. Depending on the distribution, we remove values that are 2 standard deviations above the mean as outliers (normal distribution). If the data are non-normal, we use the median and treat values below the 10th / above the 90th percentile as outliers. 82 | 83 | ## How do you evaluate the practical significance of the hypotheses? 84 | How are you testing the significance of your results? Be specific about the epistemological paradigm and statistical paradigm you are using. This will help us assign reviewers familiar with the relevant research strategies. See [Neto et al.](https://arxiv.org/pdf/1706.00933.pdf) for more information. 85 | 86 | * **Example**: (Frequentist) To test for normality, we use a Shapiro-Wilk test. For efficiency, we use a t test / Wilcoxon test depending on the distribution. To evaluate the effect of species type, we use a two-way ANOVA / Friedman test, depending on the distribution. 
87 | * **Example**: (Bayesian) We derive a posterior predictive distribution by choosing a weakly informative prior, with sick days modeled using a Poisson distribution and the likelihood of species influence modeled using a normal distribution with mean 0 and s.d. σ. We then calculate the 95% and 99% uncertainty intervals and the median m and mean μ of the posterior. 88 | 89 | # Examples 90 | * Final studies (phase 1 and 2) are available at [this Zotero page](https://www.zotero.org/groups/479248/osf/items/collectionKey/KEJP68G9?) 91 | * Example phase 1 registrations can be found at [OSF Registry](https://osf.io/registries/discover?provider=OSF&type=OSF%20Preregistration) 92 | * A sample phase 1 registration is [a study of tax in economics](https://osf.io/5g7hv/) 93 | -------------------------------------------------------------------------------- /replication-packages.md: -------------------------------------------------------------------------------- 1 | Replication packages from EMSE articles 2 | ======================================= 3 | 4 | Volume 23: 5 | * [Overfitting in semantics-based program repair](https://link.springer.com/article/10.1007/s10664-017-9577-2): 6 | 7 | -------------------------------------------------------------------------------- /review-criteria.md: -------------------------------------------------------------------------------- 1 | EMSE Open science - Evaluation Criteria 2 | ======================= 3 | 4 | Authors: the EMSE open science board, see 5 | 6 | 7 | This document contains the points that will be evaluated by the open science reviewer. It can be considered as the review template. 8 | 9 | 10 | Is the replication package ... 11 | -------- 12 | 13 | 14 | Downloadable behind a public URL? 15 | 16 | - Do the data and code lie behind a single URL? (Recommendation: it 17 | should be the case.) 18 | 19 | Archived? 20 | 21 | - Is the replication package hosted on a persistent, 22 | archived repository? 
(Recommendation: even the submitted version 23 | should be hosted on an archived repository, such as [Zenodo](http://zenodo.org/) or [archive.org](https://archive.org/).) 24 | 25 | Documented? 26 | 27 | - Is the replication package properly documented? 28 | 29 | - does the replication package contain an inventory of artifacts (files and folders)? 30 | - are the used file formats documented? 31 | - are the naming conventions documented? 32 | 33 | Complete? 34 | 35 | - Does the replication package contain everything required to understand and/or recompute all data, numbers, and figures presented in the paper? 36 | 37 | Exercisable? (if the paper contains results based on code) 38 | 39 | - Does the code compile and execute given the instructions in the package? 40 | - Does the code only depend on publicly available modules and libraries? 41 | 42 | Licensed? 43 | 44 | - Does the replication package contain an appropriate license for the code or data? 45 | - We strongly encourage that the replication package contains a license. 46 | - The Open Science board suggests the CC-BY version 4.0 license, which is suitable for data. 47 | 48 | -------------------------------------------------------------------------------- /rr/README.md: -------------------------------------------------------------------------------- 1 | This folder hosts registered report (RR) details for the EMSE journal. 2 | 3 | See the main [journal website](https://emsejournal.github.io/registered_reports/) for more information on RR. 4 | 5 | Read the journal [policies](rr_policies.md). 6 | 7 | Reuse previous Calls for Phase 1 Submissions: 8 | - International Conference on Software Maintenance and Evolution (ICSME 2020) [guide](icsme_rr_guide.md) and [CFP](registered-reports_ICSME_CFP.md). 9 | - International Conference on Mining Software Repositories (MSR 2020) [CFP](registered-reports_MSR_CFP.md). 
10 | 11 | Contact EMSE registered report editors: 12 | - Neil Ernst (nernst@uvic.ca) 13 | - Maria Teresa Baldassarre -------------------------------------------------------------------------------- /rr/icsme_rr_guide.md: -------------------------------------------------------------------------------- 1 | ### Author's Guide 2 | 3 | **NB: Please [contact the ICSME RR track chairs](mailto:nernst@uvic.ca) with any questions, feedback, or requests for clarification. Specific analysis approaches mentioned below are intended as examples, not mandatory components.** 4 | 5 | # [Title (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#title-required) 6 | 7 | Provide the working title of your study. It may be the same title that you submit for publication of your final manuscript, but it is not a requirement. 8 | 9 | - **Example**: Should your family travel with you on the enterprise? 10 | - Subtitle (optional): Effect of accompanying families on the work habits of crew members 11 | 12 | # [Authors (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#authors-required) 13 | 14 | At this stage, we believe that an unblinded/single-blind review is most productive. 15 | 16 | # [Structured Abstract (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#structured-abstract-required) 17 | 18 | The abstract should describe in 200 words or so: 19 | 20 | ## [Background/Context](https://2020.msrconf.org/track/msr-2020-Registered-Reports#backgroundcontext) 21 | 22 | What is your research about? Why are you doing this research, and why is it interesting? 23 | 24 | **Example**: The enterprise is the flagship of the federation, and it allows families to travel onboard. However, there are no studies that evaluate how this affects the crew members. 25 | 26 | ## [Objective/Aim](https://2020.msrconf.org/track/msr-2020-Registered-Reports#objectiveaim) 27 | 28 | What exactly are you studying/investigating/evaluating? 
What are the objects of the study? 29 | 30 | We welcome both confirmatory and exploratory types of studies. 31 | 32 | **Example**: We evaluate whether the frequency of sick days, work effectiveness, and efficiency differ between science officers who bring their family with them and science officers who serve without their family. 33 | 34 | **Example**: We investigate the effect of frequent Holodeck use on interpersonal relationships with an ethnographic study using participant observation, in order to derive specific hypotheses about Holodeck usage. 35 | 36 | ## [Method](https://2020.msrconf.org/track/msr-2020-Registered-Reports#method) 37 | 38 | How are you addressing your objective? What data sources are you using? 39 | 40 | **Example**: We conduct an observational study and use a between-subjects design. To analyze the data, we use a t test or Wilcoxon test, depending on the underlying distribution. Our data come from computer monitoring of Enterprise crew members. 41 | 42 | ## [Limitations](https://2020.msrconf.org/track/msr-2020-Registered-Reports#limitations) 43 | 44 | # [Hypotheses / research questions (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#hypotheses-research-questions-required) 45 | 46 | Clearly state the research hypotheses that you want to test with your study, and a rationale for the hypotheses. 47 | 48 | - **Example**: Science officers with their family on board have more sick days than science officers without their family. 49 | - **Rationale**: Since toddlers are often sick, we can expect that crew members with their family onboard need to take sick days more often. 50 | 51 | # [Introduction](https://2020.msrconf.org/track/msr-2020-Registered-Reports#introduction) 52 | 53 | Give more details on the bigger picture of your study and how it contributes to this bigger picture.
An important component of phase 1 review is assessing the importance and relevance of the study questions, so be sure to explain this. 54 | 55 | # [Variables (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#variables-required) 56 | 57 | - Independent Variable(s) and their operationalization 58 | - Dependent Variable(s) and their operationalization (e.g., time to solve a specified task) 59 | - Confounding Variable(s) and how their effect will be controlled (e.g., species type (Vulcan, Human, Tribble) might be a confounding factor; we control for it by additionally separating our sample into Human/Non-Human and using an ANOVA (normal distribution) or Friedman test (non-normal distribution) to distill its effect). 60 | 61 | For each variable, you should give: name (e.g., presence of family); abbreviation (if you intend to use one); description (whether the family of the crew members travels on board); scale type (nominal: either the family is present or not); and operationalization (crew members without family on board vs. crew members with family onboard). 62 | 63 | # [Material/objects (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#materialobjects-required) 64 | 65 | Describe any material that you plan to use. Be specific about whether you developed it (and how) or whether it is already defined (e.g., a standard Myers-Briggs-type indicator). 66 | 67 | **Example**: For sick days, we retrieve the medical records from sick bay (ethics approval pending). For efficiency, we conduct standard interviews with the superior officer and crew members. The questions are the following / can be found on the Web site / in the Appendix. Furthermore, we observe their performance during a simulated task.
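The distribution-dependent choice of analysis that runs through this guide (e.g., a t test for normal data, a rank-based test otherwise) can be sketched in Python. This is an illustrative sketch only, not a required analysis: the function name, variable names, and data are hypothetical, and the use of SciPy is our own assumption.

```python
# Illustrative sketch only: selecting a parametric vs. non-parametric
# two-group test based on a normality check, as in "a t test or
# Wilcoxon test, depending on the underlying distribution".
# All names and data here are hypothetical, not part of the guide.
from scipy import stats

def compare_sick_days(with_family, without_family, alpha=0.05):
    """Return (test_name, p_value) for an unpaired two-group comparison."""
    # Shapiro-Wilk normality check on each group; index [1] is the p-value.
    normal = all(stats.shapiro(group)[1] > alpha
                 for group in (with_family, without_family))
    if normal:
        # Parametric route: independent-samples t test.
        statistic, p_value = stats.ttest_ind(with_family, without_family)
        return "t test", p_value
    # Non-parametric fallback: Mann-Whitney U, the unpaired analogue
    # of the Wilcoxon test mentioned in the guide.
    statistic, p_value = stats.mannwhitneyu(with_family, without_family)
    return "Mann-Whitney U", p_value
```

In a Phase 1 registration, the point is to pre-commit to such a decision rule (including the significance level and the fallback test) before any data are collected.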
68 | 69 | # [Tasks (optional)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#tasks-optional) 70 | 71 | If you use tasks, describe them: how they were designed or where they are taken from, and why they are suitable for evaluating the hypotheses / research questions. 72 | 73 | **Example**: For effectiveness of the crew members, we ask them to sweep a class 2 nebula. We simulate an error in the primary sensor array. Crew members should then run a level 3 diagnostic to spot the error, fix the error, and complete the sweep of the nebula. We measure the time to (i) spot that there is an error, (ii) decide on the correct diagnostic protocol, (iii) fix the error, and (iv) complete the sweep. 74 | 75 | # [Participants/Subjects/sample (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#participantssubjectssample-required) 76 | 77 | Describe how and why you select the sample. When you conduct a meta-analysis, describe the primary studies / work on which you base your meta-analysis. 78 | 79 | **Example**: We recruit crew members from the science department on a voluntary basis. They are our targeted population. 80 | 81 | # [Execution Plan (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#execution-plan-required) 82 | 83 | Describe the experimental protocol. 84 | 85 | **Example**: Each crew member needs to sign the informed consent form and agree to the processing of their data according to the GDPR. Then, we conduct the interviews. Afterwards, participants need to complete the simulated task. 86 | 87 | # [Analysis Plan (required)](https://2020.msrconf.org/track/msr-2020-Registered-Reports#analysis-plan-required) 88 | 89 | ## [Descriptive statistics](https://2020.msrconf.org/track/msr-2020-Registered-Reports#descriptive-statistics) 90 | 91 | How do you describe the data? How do you handle outliers? 92 | 93 | **Example**: To represent the number of sick days, we use histograms.
Depending on the distribution, we remove values that are 2 standard deviations above the mean as outliers (normal distribution). If the data are non-normal, we use the median and treat values below the 10th or above the 90th percentile as outliers. 94 | 95 | ## [How do you evaluate the practical significance of the hypotheses?](https://2020.msrconf.org/track/msr-2020-Registered-Reports#how-do-you-evaluate-the-practical-significance-of-the-hypotheses) 96 | 97 | How are you testing the significance of your results? Be specific about the epistemological paradigm and statistical paradigm you are using. This will help us assign reviewers familiar with the relevant research strategies. See [Neto et al.](https://arxiv.org/pdf/1706.00933.pdf) for more information. 98 | 99 | - **Example**: (Frequentist) To test for normality, we use a Shapiro-Wilk test. For efficiency, we use a t test / Wilcoxon test, depending on the distribution. To evaluate the effect of species type, we use a two-way ANOVA / Friedman test, depending on the distribution. 100 | - **Example**: (Bayesian) We derive a posterior predictive distribution by choosing a weakly informative prior, with sick days modeled using a Poisson distribution and the likelihood of species influence modeled using a normal distribution with mean 0 and standard deviation σ. We then calculate the 95% and 99% uncertainty intervals and the median m and mean μ of the posterior. 101 | 102 | # [Examples](https://2020.msrconf.org/track/msr-2020-Registered-Reports#examples) 103 | 104 | - Final studies (phase 1 and 2) are available at [this Zotero page](https://www.zotero.org/groups/479248/osf/items/collectionKey/KEJP68G9?)
105 | - Example phase 1 registrations can be found at the [OSF Registry](https://osf.io/registries/discover?provider=OSF&type=OSF%20Preregistration) 106 | - A sample phase 1 registration is [a study of tax in economics](https://osf.io/5g7hv/) -------------------------------------------------------------------------------- /rr/registered-reports_ICSME_CFP.md: -------------------------------------------------------------------------------- 1 | # Call for Registrations: ICSME/EMSE Registered Reports 2 | EMSE, in conjunction with the International Conference on Software Maintenance and Evolution (ICSME), is conducting a pilot RR track. This is the second such pilot, after the very successful effort at [MSR](https://2020.msrconf.org/track/msr-2020-Registered-Reports). 3 | 4 | See the associated [Author's Guide](icsme_rr_guide.md). Please email the ICSME track chairs - [Neil Ernst](mailto:neil@neilernst.net) or Tim Menzies - for any questions, clarifications, or comments. 5 | 6 | **Please note: small changes to the CFP and guide may occur once the MSR pilot is concluded.** 7 | 8 | ## What are Registered Reports 9 | * Methods and proposed analyses are pre-registered and reviewed prior to the research being conducted. 10 | * Reduce/eliminate under-powered, selectively reported, researcher-biased studies. 11 | 12 | ## Two-Phase Review 13 | * (ICSME 2020) Phase 1: Introduction, Methods (including proposed analyses), and Pilot Data (where applicable). In Principle Acceptance. 14 | * (EMSE) Phase 2: full study, after data collection and analysis. Results may be negative! 15 | 16 | Final publication is straightforward if the original protocol is adhered to, regardless of positive or negative results. Additional exploratory analyses may be conducted if they are justified. Any deviation from the protocol must be justified and logged in detail to ensure replicability. EMSE journal editors reserve the right to tighten eligibility criteria if necessary.
17 | 18 | ## Phase 1 Criteria 19 | * Importance of the research question(s). 20 | * Logic, rationale, and plausibility of the proposed hypotheses. 21 | * Soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis where appropriate). 22 | * Clarity and degree of methodological detail for replication. 23 | * Will the results obtained test the stated hypotheses? 24 | 25 | ## Phase 2 Criteria (via https://osf.io/pukzy/) 26 | 27 | - Whether the data are able to test the authors’ proposed hypotheses by satisfying the approved outcome-neutral conditions (such as quality checks, positive controls) 28 | - Whether the Introduction, rationale, and stated hypotheses are the same as the approved Stage 1 submission (required) 29 | - Whether the authors adhered precisely to the registered experimental procedures 30 | - Whether any unregistered post hoc analyses added by the authors are justified, methodologically sound, and informative 31 | - Whether the authors’ conclusions are justified given the data 32 | 33 | ## Qualitative Research and RR 34 | 35 | * There is no reason to assume pre-registration cannot be used for qualitative methods such as card sorting, grounded theory, coding, member checking, etc. 36 | * E.g., phase 1 may include details on survey respondents, survey instrument design, and data collection techniques. 37 | * [OSF Qualitative Pre-Registration](https://osf.io/j7ghv/) 38 | 39 | ## Organizers 40 | * [Tim Menzies](https://menzies.us) 41 | * [Neil Ernst](https://www.neilernst.net) 42 | 43 | ## Program Committee 44 | (PC members guide the formulation of the CFP and review Phase 1 and 2 studies) 45 | 46 | 1.
TBD 47 | 48 | ## Timeline 49 | |Date|Milestone| 50 | |----|----| 51 | |1 June 2020 | study protocols and plans due| 52 | |21 June 2020 |initial protocol reviews| 53 | |7 July 2020| rebuttals/clarifications due| 54 | |21 July 2020 | In Principle Acceptance (IPA) decision notifications| 55 | |7 Aug 2020 | 1-page summary plan / camera-ready| 56 | |21 Aug 2020|Reports registered with OSF registry| 57 | 58 | ## Submissions 59 | 60 | Submit via [this EasyChair link](). EasyChair will be used to handle Phase 1 reviews and feedback/rebuttal. EMSE's EditorialManager system will be used for the Phase 2 submissions, with OSF managing the registration. Reviews from Phase 1 will be shared with the Phase 2 reviewers. 61 | 62 | ### Submission Details 63 | Papers must strictly adhere to the two-column IEEE conference proceedings format. Please use the templates available [here](https://www.ieee.org/conferences/publishing/templates.html). LaTeX users should use the following configuration: \documentclass[conference]{IEEEtran}. Microsoft Word users should use the US Letter format template. 64 | 65 | Follow the template requested in [the author's guide to RR submissions](icsme_rr_guide.md). 66 | 67 | The page limit is 4 pages, including references. 68 | 69 | Review will be unblinded or single-blind. There will be a lightweight rebuttal phase, in which authors have the opportunity to clarify unclear parts of the report. However, the rebuttal is not an opportunity to make changes to the experimental design. 70 | 71 | ## FAQ 72 | Q. *How will self-plagiarism be handled?*
73 | A. Self-plagiarism is where an author includes verbatim text from their own, already published work. We expect this to be managed using the existing workshop/extension model; there will be sufficient new content in Phase 2 to clearly indicate that this is a new piece of work. 74 | 75 | Q. *What if I publish my Phase 1 proposal, and then someone scoops me by following the protocol?*
76 | A. In practice, this seems quite uncommon. OSF has mechanisms to manage embargo periods, so this might be something we consider in the future; currently, the MSR/EMSE model makes embargoes impractical. However, tracks such as "new ideas" already pose this potential risk, so we don't anticipate extensive problems. 77 | 78 | Q. *How does this process deal with exploratory studies, where there is no well-defined hypothesis?*
79 | A. For now, we strongly suggest such studies target the New Ideas and Emerging Results Track of MSR. We will focus on studies that have a clear, well-formulated hypothesis. 80 | 81 | Q. *What if my study changes as I gather data?*
82 | A. RRs have the flexibility to deviate from the analysis plan. However, authors will need to provide solid reasons for why they deviated from the plan. 83 | 84 | Other FAQs on RRs in general are at the bottom of [the OSF page](https://cos.io/prereg/). 85 | 86 | ## Links 87 | 1. https://cos.io/prereg/ 88 | 2. See these [links](https://github.com/emsejournal/openscience/blob/master/links.md#registeredpre-registered-reports) 89 | -------------------------------------------------------------------------------- /rr/registered-reports_MSR_CFP.md: -------------------------------------------------------------------------------- 1 | # Call for Registrations: MSR/EMSE Registered Reports 2 | EMSE, in conjunction with the conference on [Mining Software Repositories](https://www.msrconf.org) (MSR), is conducting a pilot RR track. 3 | 4 | See the associated [Author's Guide](msr_rr_cfp.md). Please email the MSR track chairs - [Neil Ernst](mailto:neil@neilernst.net) or Janet Siegmund - for any questions, clarifications, or comments. 5 | 6 | ## What are Registered Reports 7 | * Methods and proposed analyses are pre-registered and reviewed prior to the research being conducted. 8 | * Reduce/eliminate under-powered, selectively reported, researcher-biased studies. 9 | 10 | ## Two-Phase Review 11 | * (MSR 2020) Phase 1: Introduction, Methods (including proposed analyses), and Pilot Data (where applicable). In Principle Acceptance. 12 | * (EMSE) Phase 2: full study, after data collection and analysis. Results may be negative! 13 | 14 | Final publication is straightforward if the original protocol is adhered to, regardless of positive or negative results. Additional exploratory analyses may be conducted if they are justified. Any deviation from the protocol must be justified and logged in detail to ensure replicability. EMSE journal editors reserve the right to tighten eligibility criteria if necessary. 15 | 16 | ## Phase 1 Criteria 17 | * Importance of the research question(s).
18 | * Logic, rationale, and plausibility of the proposed hypotheses. 19 | * Soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis where appropriate). 20 | * Clarity and degree of methodological detail for replication. 21 | * Will the results obtained test the stated hypotheses? 22 | 23 | ## Qualitative Research and RR 24 | * There is no reason to assume pre-registration cannot be used for qualitative methods such as card sorting, grounded theory, coding, member checking, etc. 25 | * E.g., phase 1 may include details on survey respondents, survey instrument design, and data collection techniques. 26 | * [OSF Qualitative Pre-Registration](https://osf.io/j7ghv/) 27 | 28 | ## Organizers 29 | * [Janet Siegmund](https://www.infosun.fim.uni-passau.de/se/people-jsiegmund.php) 30 | * [Neil Ernst](https://www.neilernst.net) 31 | 32 | ## Publicity 33 | * Norman Peitek 34 | 35 | ## Program Committee 36 | (PC members guide the formulation of the CFP and review Phase 1 and 2 studies) 37 | 38 | 1. Jane Doe 39 | 2. Joe Blø 40 | 41 | ## Timeline 42 | |Date|Milestone| 43 | |----|----| 44 | |January 10, 2020 | study protocols and plans due| 45 | |January 31, 2020 |initial protocol reviews| 46 | |February 14, 2020| rebuttals/clarifications due| 47 | |March 2, 2020 | In Principle Acceptance (IPA) decision notifications| 48 | |March 16, 2020 | 1-page summary plan / camera-ready| 49 | |March 31, 2020|Reports registered with OSF registry| 50 | 51 | ## Submissions 52 | EasyChair will be used to handle Phase 1 reviews and feedback/rebuttal. EMSE's EditorialManager system will be used for the Phase 2 submissions, with OSF managing the registration. Reviews from Phase 1 will be shared with the Phase 2 reviewers. 53 | 54 | ### Submission Details 55 | Submissions should follow the [ACM Conference Proceedings Formatting Guidelines](https://www.acm.org/publications/proceedings-template).
LaTeX users must use the provided acmart.cls and ACM-Reference-Format.bst without modification, enable the conference format in the preamble of the document (i.e., \documentclass[sigconf,review]{acmart}), and use the ACM reference format for the bibliography (i.e., \bibliographystyle{ACM-Reference-Format}). The review option adds line numbers, thereby allowing referees to refer to specific lines in their comments. 56 | 57 | Follow the template requested in [the author's guide to MSR RR submissions](msr_rr_cfp.md). 58 | 59 | The page limit is 4 pages, including references. 60 | 61 | Review will be unblinded or single-blind. There will be a lightweight rebuttal phase, in which authors have the opportunity to clarify unclear parts of the report. However, the rebuttal is not an opportunity to make changes to the experimental design. 62 | 63 | ## FAQ 64 | Q. *How will self-plagiarism be handled?*
65 | A. Self-plagiarism is where an author includes verbatim text from their own, already published work. We expect this to be managed using the existing workshop/extension model; there will be sufficient new content in Phase 2 to clearly indicate that this is a new piece of work. 66 | 67 | Q. *What if I publish my Phase 1 proposal, and then someone scoops me by following the protocol?*
68 | A. In practice, this seems quite uncommon. OSF has mechanisms to manage embargo periods, so this might be something we consider in the future; currently, the MSR/EMSE model makes embargoes impractical. However, tracks such as "new ideas" already pose this potential risk, so we don't anticipate extensive problems. 69 | 70 | Q. *How does this process deal with exploratory studies, where there is no well-defined hypothesis?*
71 | A. For now, we strongly suggest such studies target the New Ideas and Emerging Results Track of MSR. We will focus on studies that have a clear, well-formulated hypothesis. 72 | 73 | Q. *What if my study changes as I gather data?*
74 | A. RRs have the flexibility to deviate from the analysis plan. However, authors will need to provide solid reasons for why they deviated from the plan. 75 | 76 | Other FAQs on RRs in general are at the bottom of [the OSF page](https://cos.io/prereg/). 77 | 78 | ## Links 79 | 1. https://cos.io/prereg/ 80 | 2. See these [links](https://github.com/emsejournal/openscience/blob/master/links.md#registeredpre-registered-reports) 81 | -------------------------------------------------------------------------------- /rr/rr_policies.md: -------------------------------------------------------------------------------- 1 | # Registered Reports - Journal Policies 2 | 3 | ## Timelines 4 | In general, one year is the expected time between Phase 1 submission and Phase 2 submission; reviewer continuity becomes challenging beyond this point. Each conference will outline the specific dates. 5 | 6 | ## Submission 7 | Submit Phase 2 registered report papers to EMSE via the [standard submission site](https://www.editorialmanager.com/emse/default2.aspx). 8 | 9 | Under special issues, select "RR" as your special issue. 10 | 11 | Include in your submission a cover letter which explains: 12 | 1. that this is a phase 2 submission; 13 | 2. the original venue at which phase 1 was approved; 14 | 3. any details about study protocol changes between phase 1 and phase 2. 15 | 16 | 17 | ## Authorship 18 | Changes in authorship are permitted; however: 19 | 20 | 1. All authors must fill in and sign the authorship form, which explains who did what *and* confirms that the authors satisfied and understood the ACM and IEEE authorship criteria 21 | 2. Authors cannot advertise “in principle acceptance” and authorship as an incentive for participation in research studies 22 | 3. IPA (Stage 1) reviewers and their students cannot become authors on the final paper (since they ‘accepted’ it, there are ethical concerns) 23 | 4.
New conflict-of-interest (COI) checks will be needed for the new authors (with a large number of authors, this can become a real problem) 24 | 5. Authors should be made aware of second-order effects (e.g., having many conflicts can complicate the review of their submissions) before participating 25 | 26 | The author form can be obtained by contacting the journal editorial office. 27 | 28 | Updated: July 2023 29 | --------------------------------------------------------------------------------