├── .gitignore ├── .gitmodules ├── 01-onboarding ├── example │ ├── 01-code-of-conduct.md │ ├── 02-research-integrity.md │ ├── 03-contributions-and-authorship.md │ ├── 04-nih-public-access-policy.md │ ├── 05-key-readings │ │ ├── Gernsbacher_2016_Language_and_Speech_in_Autism.pdf │ │ ├── Jones_2013_Diagnosing_autism_in_neurobiological_research_studies.pdf │ │ ├── Kaufmann_2017_Autism_Spectrum_Disorder_in_Fragile_X_Syndrome_Cooccurring_Conditions_and_Current_Treatment.pdf │ │ ├── Lord_2012_Annual_Research_Review_Re-thinking_the_classification_of_autism_spectrum_disorders.pdf │ │ ├── Lord_2015_Recent_Advances_in_Autism_Research_as_Reflected_in_DSM-5_Criteria_for_Autism_Spectrum_Disorder.pdf │ │ ├── Presmanes_Hill_2015_Epidemiology_of_Autism_Spectrum_Disorders.pdf │ │ └── README.md │ ├── 06-individual-onboarding │ │ ├── gra-onboarding.md │ │ └── ra-onboarding.md │ ├── README.md │ └── onboarding.wiki │ │ ├── 01-about-cslu.md │ │ ├── 02-seminars-and-journal-clubs.md │ │ ├── 03-servers-and-data-repositories.md │ │ ├── 04-CSLU-acronyms.md │ │ ├── 05-food-and-coffee.md │ │ ├── 06-office-stuff.md │ │ ├── 07-transportation-and-parking.md │ │ └── home.md └── template │ └── onboarding.md ├── 02-protocols ├── example │ ├── README.md │ ├── datastorage │ │ ├── files.md │ │ └── recordings.md │ ├── scripts │ │ └── upload.sh │ ├── suggested-reading │ │ ├── BucholtzPoliticsofTranscription.pdf │ │ └── ochs1979.pdf │ ├── transcription.wiki │ │ ├── datastorage │ │ │ ├── files.md │ │ │ └── recordings.md │ │ ├── home.md │ │ ├── transcription.md │ │ └── transcription │ │ │ ├── activities.md │ │ │ ├── adhd.md │ │ │ ├── ados.md │ │ │ ├── audio2.md │ │ │ ├── bash.md │ │ │ ├── cleaning.md │ │ │ ├── elan.md │ │ │ ├── erpa.md │ │ │ ├── formatting-uploading.md │ │ │ ├── guideline-changelog.md │ │ │ ├── guidelines.md │ │ │ ├── textgrids.md │ │ │ ├── tracking.md │ │ │ └── uw.md │ └── transcription │ │ ├── activities.md │ │ ├── adhd.md │ │ ├── ados.md │ │ ├── audio2.md │ │ ├── bash.md │ │ ├── 
cleaning.md │ │ ├── elan.md │ │ ├── erpa.md │ │ ├── formatting-uploading.md │ │ ├── guideline-changelog.md │ │ ├── guidelines.md │ │ ├── textgrids.md │ │ ├── tracking.md │ │ ├── transcription.md │ │ └── uw.md └── template │ └── protocol.md ├── 03-housekeeping ├── example │ ├── IRB │ │ ├── .gitkeep │ │ └── 2017-notes.md │ ├── NIH-progress-reports │ │ └── README.md │ ├── README.md │ ├── meetings │ │ ├── .gitkeep │ │ ├── 2017-08-16.md │ │ ├── 2017-09-06.md │ │ ├── 2017-10-04.md │ │ └── 2017-11-01.md │ └── team-contacts.md └── template │ ├── IRB │ ├── .gitkeep │ └── delete-me.md │ ├── NIH-progress-reports │ └── template.md │ ├── README.md │ ├── meetings │ ├── .gitkeep │ └── template.md │ └── team-contacts.md ├── README.md ├── labhub.Rproj └── labhub.wiki ├── 01-about-cslu.md ├── 02-seminars-and-journal-clubs.md ├── 03-servers-and-data-repositories.md ├── 04-CSLU-acronyms.md ├── 05-food-and-coffee.md ├── 06-office-stuff.md ├── 07-transportation-and-parking.md ├── Home.md └── _Sidebar.md /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | .Ruserdata 5 | **/.DS_Store 6 | .DS_Store -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /01-onboarding/example/01-code-of-conduct.md: -------------------------------------------------------------------------------- 1 | The purpose of this Code of Conduct is to affirm the good nature of our community while also acknowledging 2 | that harassment happens even in seemingly safe spaces, and to provide clear definitions of what harassment 3 | entails and how to report it if it does occur. 
This Code of Conduct applies to all team members at all CSLU/OHSU offices, presentations, 4 | talks, meetings, and all discussion forums including online and in person, Slack channels, GitHub, and 5 | email. 6 | 7 | OHSU and CSLU value the personal and academic contributions of every member of our community. Accordingly, we insist 8 | that all lab members treat each other with respect, kindness, and dignity. Every community participant is 9 | expected to uphold the standards of this Code of Conduct at all times. We are dedicated to providing a 10 | welcoming, safe, and productive space for all, regardless of race, ethnicity, color, immigration status, 11 | gender, gender identity or expression, sex, sexual orientation, age, disability, socio-economic class, 12 | education level, political belief, body size, physical appearance, or religion. 13 | 14 | We do not tolerate harassment in any form, including: offensive comments, violent language, 15 | discriminatory language including jokes, deliberate intimidation, stalking, unwanted sexual or romantic 16 | advances, unwanted photography or recording, sustained or willful disruption of talks or other events, 17 | inappropriate or non-consensual physical contact, or use of sexual or discriminatory imagery, comments, or 18 | jokes. 19 | 20 | We strive for open communication, respect for all, and collaboration of work and ideas. We expect 21 | everyone to conduct themselves professionally in all communication, both in person and online. Every 22 | participant is an equally important member of our community and should be treated with kindness and respect. 23 | 24 | While it is natural that both professional and personal disagreements will arise at the workplace, these 25 | should be dealt with respectfully and in a non-threatening manner. Do not use ad hominem attacks when debating 26 | issues. Use language that is productive in achieving a resolution. 
Understand that everyone has different 27 | experiences, preferences, and perspectives. This diversity is a strength in our community and 28 | miscommunications and disagreements should be handled in ways that resolve the issues, rather than inflaming 29 | them. We all make mistakes--learning from them is what fosters a healthy and inclusive community. 30 | 31 | Report violations of this code or harassment of any form to Alison, the project PI. 32 | All communication will be treated with complete confidentiality. Please also reference this recent OHSU podcast on [preventing sexual harassment at OHSU](https://o2.ohsu.edu/blogs/staffnews/2017/11/28/ohsu-week-podcast-preventing-sexual-harassment-at-ohsu/). If you do not feel comfortable speaking with Alison, please contact the [OHSU Ombudsman](https://www.ohsu.edu/xd/about/services/ombudsman/), who is available to all faculty, staff, administrators, students, post-doctoral fellows, trainees and volunteers. We take harassment very seriously, no matter the 33 | experience, seniority, or role in the lab of either the perpetrator or the victim. People who report incidents will be believed and will not be dismissed. 34 | -------------------------------------------------------------------------------- /01-onboarding/example/02-research-integrity.md: -------------------------------------------------------------------------------- 1 | # Research Integrity at OHSU 2 | 3 | The OHSU Research Integrity Office (ORIO) is charged with protecting and assuring compliance under the laws that govern the rights and welfare of human and animal subjects, and the oversight of basic and applied scientific research at OHSU. 4 | 5 | Basic research integrity training at OHSU is provided via [Compass](https://o2.ohsu.edu/compass). You are required to complete the following trainings in order to receive your OHSU ID badge: 6 | 7 | 1. [HIPAA training](https://o2.ohsu.edu/compass) through Compass 8 | 2. 
[Respect at the University training](https://o2.ohsu.edu/compass) through Compass 9 | 3. [Integrity Foundations training](https://o2.ohsu.edu/compass) through Compass 10 | 4. [Integrity Booster training](https://o2.ohsu.edu/compass) through Compass 11 | 12 | # Human Subjects Research at OHSU 13 | 14 | The Institutional Review Board (IRB) oversees human subjects research at OHSU, where human subjects are defined as "...living individual(s) about whom an investigator (whether professional or student) conducting research obtains (1) data through intervention or interaction with the individual , or (2) identifiable private information." [45 CFR 46.102(f)(1-2)] 15 | 16 | All those involved in research at OHSU must complete the [Conflict of Interest (eCoI)](https://bigbrain.ohsu.edu/coi/) disclosure and the Responsible Conduct of Research education training through [CITI](http://www.ohsu.edu/xd/about/services/integrity/training/rcr-training.cfm). Additionally, because you will be involved in Human Subjects Research, you must also complete this education module through [CITI](https://www.citiprogram.org/members/index.cfm?pageID=50). 17 | 18 | # Summary of required trainings 19 | 20 | 1. [HIPAA training](https://o2.ohsu.edu/compass) through Compass 21 | 2. [Respect at the University training](https://o2.ohsu.edu/compass) through Compass 22 | 3. [Integrity Foundations training](https://o2.ohsu.edu/compass) through Compass 23 | 4. [Integrity Booster training](https://o2.ohsu.edu/compass) through Compass 24 | 5. [Responsible Conduct of Research (RCR) training](http://www.ohsu.edu/xd/about/services/integrity/training/rcr-training.cfm) through CITI 25 | 6. [Human Subjects Research (HSR) training](https://www.citiprogram.org/members/index.cfm?pageID=50) through CITI 26 | 7. 
[Electronic Conflict of Interest (eCoI)](https://bigbrain.ohsu.edu/coi/) 27 | -------------------------------------------------------------------------------- /01-onboarding/example/03-contributions-and-authorship.md: -------------------------------------------------------------------------------- 1 | We appreciate your contributions to our ongoing research projects, and we are happy to discuss with you how you can make authorable contributions to a future paper, poster, or conference presentation. For research staff and volunteers, please keep in mind that these endeavors often involve work above and beyond your job or volunteer position descriptions, and may require additional hours outside of your defined responsibilities. 2 | 3 | # Guidelines for authorship 4 | 5 | In this project, we follow the [Harvard Authorship guidelines](https://www.hsph.harvard.edu/faculty-affairs/authorship-guidelines/). The guidelines are as follows: 6 | 7 | 1. Everyone who is listed as an author should have made a substantial, direct, and intellectual contribution to the work. For example, they should have contributed to the conception, design, analysis and/or interpretation of data. Honorary or guest authorship is not acceptable. Acquisition of funding and provision of technical services, patients, or materials, while they may be essential to the work, are not in themselves sufficient contributions to justify authorship. 8 | 9 | 2. Everyone who has made substantial and direct intellectual contributions to the work should be an author. Everyone who has made other substantial contributions should be acknowledged. 10 | 11 | 3. When research is done by teams whose members are highly specialized, individuals’ contributions and responsibility may be limited to the specific aspects of the work described in the publication. 12 | 13 | 4. All authors should contribute to writing the manuscript, reviewing drafts and approving the final version. 14 | 15 | 5. 
One author (usually the Principal Investigator) should take primary responsibility for the work as a whole even if he or she does not have an in-depth understanding of every part of the work. This individual should assure that all authors meet the basic criteria for authorship outlined in guideline 1. 16 | 17 | 6. The authors should make every effort to decide the order of authorship together. Research teams should discuss authorship issues frankly early in the course of their work together and at other times during their collaboration as needed. It is recommended that the PI write up a summary of the authorship agreement. To assist with this, these guidelines should be distributed to all team members at the start of the collaboration. 18 | 19 | 7. If there is an authorship dispute, every effort should be made to settle it at the local level by the authors themselves, the research PI and/or the Department Chair. 20 | 21 | # Order of authorship 22 | 23 | Also from the [Harvard guidelines](https://hms.harvard.edu/sites/default/files/assets/Sites/Ombuds/files/AUTHORSHIP%20GUIDELINES.pdf): 24 | 25 | 1. The authors should decide the order of authorship together. 26 | 27 | 2. Authors should specify in their manuscript a description of the contributions of each author and how they have assigned the order in which they are listed so that readers can interpret their roles correctly. 28 | 29 | 3. The primary author should prepare a concise, written description of how order of authorship was decided. 30 | 31 | # Who is an author? 32 | 33 | According to the [International Committee of Medical Journal Editors' (ICMJE) "Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals"](http://www.icmje.org/icmje-recommendations.pdf), authorship should be based on the following 4 criteria: 34 | 35 | 1. 
Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND 36 | 37 | 2. Drafting the work or revising it critically for important intellectual content; AND 38 | 39 | 3. Final approval of the version to be published; AND 40 | 41 | 4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. 42 | 43 | Some specific peer-reviewed journals enforce their own additional authorship criteria, so the above are necessary but not sufficient criteria. 44 | 45 | In addition to being accountable for the parts of the work he or she has done, an author should be able to identify which co-authors are responsible for specific other parts of the work. 46 | 47 | In addition, authors should have confidence in the integrity of the contributions of their co-authors. 48 | 49 | # Non-author contributors 50 | 51 | Contributors who meet fewer than all 4 of the above criteria for authorship should not be listed as authors, but they should be acknowledged. Examples of activities that alone (without other contributions) do not qualify a contributor for authorship are acquisition of funding; general supervision of a research group or general administrative support; and writing assistance, technical editing, language editing, and proofreading. Those whose contributions do not justify authorship may be acknowledged individually or together as a group under a single heading (e.g. “Clinical Investigators” or “Participating Investigators”), and their contributions should be specified (e.g., “served as scientific advisors,” “critically reviewed the study proposal,” “collected data,” “provided and cared for study patients”, “participated in writing or technical editing of the manuscript”). 
52 | 53 | # More reading: 54 | 55 | 56 | * [Programmers, Professors, and Parasites: Credit and Co-Authorship in Computer Science](https://link.springer.com/article/10.1007/s11948-009-9119-4) 57 | 58 | * [Authorship: why not just toss a coin?](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2544445/) 59 | 60 | * [Vanderbilt Institute for Integrative Biosystems Research and Education Review of Authorship Guidelines](http://www.vanderbilt.edu/viibre/VIIBRE_Author_Policy_Background.pdf) 61 | -------------------------------------------------------------------------------- /01-onboarding/example/04-nih-public-access-policy.md: -------------------------------------------------------------------------------- 1 | The [National Institutes of Health (NIH)](https://www.nih.gov) issued an open access mandate in 2008, which states that research papers describing research funded by the [NIH](https://www.nih.gov) must be available to the public free through [PubMed Central](https://www.ncbi.nlm.nih.gov/pmc/) within 12 months of publication. Since our research is funded by an NIH R01 through the [National Institute on Deafness and Other Communication Disorders (NIDCD)](https://www.nidcd.nih.gov), this policy applies to all of our peer-reviewed papers and conference proceedings. 2 | 3 | In this project, if you are the first author of a paper that falls under the NIH public access policy, it will be your sole responsibility to follow all steps necessary to comply with this policy. 4 | 5 | Read more about the [NIH public access policy here](https://publicaccess.nih.gov), and view the [video training](https://www.youtube.com/watch?v=PVG_lkkoJuw&list=PLOEUwSnjvqBJS9LZs1vMoG6vcAbTAxq0H). 6 | 7 | **Notes from Robin Champieux's Presentation:** 8 | 9 | - Background: 10 | - The NIH has a Public Access Policy for peer-reviewed publications, to ensure public access to publicly funded research. 
11 | - This means that these NIH-funded papers must be available on PubMed Central (PMC) within 12 months of publication. 12 | - For this project, it is the job of the first author of the paper to make this happen. 13 | - At the end of this process your paper will have a PMC id number. 14 | - Process for publishers who submit it on your behalf: 15 | - Most publishers will be willing to submit your paper to PMC on your behalf. 16 | - You will know this is the case if, when submitting the paper, you can check an option that says you would like them to submit it on your behalf. 17 | - If that's the case, you should know these things: 18 | - You will need to provide your NIH grant number to them when checking that option. 19 | - Once you check that option, you will get a notification (usually within a month) where they provide a pdf of your submission for you to approve. 20 | - You do want to proof the pdf they provide before approving it--sometimes formatting can be altered in this process. 21 | - Process for submitting it independently: 22 | 1. You need an NCBI account. Create one if you don't already have one. 23 | 1. You can do that here: https://www.ncbi.nlm.nih.gov 24 | 2. In the upper right hand corner of the page is a link, "Sign in to NCBI" 25 | 3. This will give you an option to "Register for an NCBI account" 26 | 2. Go to https://www.nihms.nih.gov/db/sub.cgi. Go down to the bottom, and sign in through NCBI under the label "Publishers and Others" 27 | 3. Click on Submit New Manuscript 28 | 4. Follow the steps listed there: 29 | 1. Title 30 | 2. Funding 31 | 3. Choose your files (You have to upload a manuscript file, and you may also upload separate figures, etc. Your manuscript file cannot have the publisher's formatting, and must instead be the file you generated yourself.) 32 | 4. Check files 33 | 5. Set reviewer and embargo (The reviewer must be an author. As the first author, you should make yourself the reviewer. 
The embargo is either 12 months (the standard) if it's not given, or a specified and potentially shorter period given by the journal.) 34 | - Other things to be careful about: 35 | - If the publisher does not submit your paper on your behalf, you will want to submit it independently *as soon as possible*. There is a 3-month grace period, but it is safest to get it submitted as soon as your paper is accepted. 36 | - PubMed Central (PMC) and PubMed (PM) are different. PMC is the NIH's repository of full-text articles, while PM is a searchable index of citations to the biomedical literature. Additionally, a PMC id number is different from a PM number. To be compliant with this Public Access Policy, you need a PMC id number. 37 | - Your publication date may be the first time your paper is put online--e.g., after a conference, in an online journal before the printed version comes out. 38 | -------------------------------------------------------------------------------- /01-onboarding/example/05-key-readings/Gernsbacher_2016_Language_and_Speech_in_Autism.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/01-onboarding/example/05-key-readings/Gernsbacher_2016_Language_and_Speech_in_Autism.pdf -------------------------------------------------------------------------------- /01-onboarding/example/05-key-readings/Jones_2013_Diagnosing_autism_in_neurobiological_research_studies.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/01-onboarding/example/05-key-readings/Jones_2013_Diagnosing_autism_in_neurobiological_research_studies.pdf -------------------------------------------------------------------------------- /01-onboarding/example/05-key-readings/Kaufmann_2017_Autism_Spectrum_Disorder_in_Fragile_X_Syndrome_Cooccurring_Conditions_and_Current_Treatment.pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/01-onboarding/example/05-key-readings/Kaufmann_2017_Autism_Spectrum_Disorder_in_Fragile_X_Syndrome_Cooccurring_Conditions_and_Current_Treatment.pdf -------------------------------------------------------------------------------- /01-onboarding/example/05-key-readings/Lord_2012_Annual_Research_Review_Re-thinking_the_classification_of_autism_spectrum_disorders.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/01-onboarding/example/05-key-readings/Lord_2012_Annual_Research_Review_Re-thinking_the_classification_of_autism_spectrum_disorders.pdf -------------------------------------------------------------------------------- /01-onboarding/example/05-key-readings/Lord_2015_Recent_Advances_in_Autism_Research_as_Reflected_in_DSM-5_Criteria_for_Autism_Spectrum_Disorder.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/01-onboarding/example/05-key-readings/Lord_2015_Recent_Advances_in_Autism_Research_as_Reflected_in_DSM-5_Criteria_for_Autism_Spectrum_Disorder.pdf -------------------------------------------------------------------------------- /01-onboarding/example/05-key-readings/Presmanes_Hill_2015_Epidemiology_of_Autism_Spectrum_Disorders.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/01-onboarding/example/05-key-readings/Presmanes_Hill_2015_Epidemiology_of_Autism_Spectrum_Disorders.pdf -------------------------------------------------------------------------------- /01-onboarding/example/05-key-readings/README.md: 
-------------------------------------------------------------------------------- 1 | Note that the PDF viewer in GitLab can really make a mess when viewing some article PDFs. For best viewing, please download then read! -------------------------------------------------------------------------------- /01-onboarding/example/06-individual-onboarding/gra-onboarding.md: -------------------------------------------------------------------------------- 1 | To do: 2 | 3 | 1. ~~Turn in Immunization Form~~ 4 | 2. ~~Get ID badge~~ 5 | 3. Read general onboarding materials 6 | 4. ~~Get new OHSU ID (can't get this until you finish trainings in Compass and Conflict of interest form)~~ 7 | 1. ~~Email to set up time to observe ADOS administration ~~ (November 20th) 8 | 1. ~~[Get your flu shot](https://o2.ohsu.edu/blogs/staffnews/2017/10/02/2017-flu-season-walk-in-flu-vaccine-clinics-begin-oct-3/?utm_source=mc_7298669)~~ 9 | 1. ~~Get on email lists for all [seminars and journal clubs](https://repo.cslu.ohsu.edu/language-outcomes/onboarding/blob/master/seminars-and-journal-clubs.md)~~ 10 | 1. ~~After starting work, all new employees must complete the following online training modules through OHSU's online education portal, [Compass](https://o2.ohsu.edu/compass):~~ 11 | - ~~HIPAA~~ 12 | - ~~Respect at the University~~ 13 | - ~~Integrity Foundations~~ 14 | - ~~Integrity Booster~~ 15 | - ~~Responsible Conduct of Research~~ 16 | - ~~Human Subjects Research~~ 17 | 1. ~~Complete [OHSU conflict of interest form](https://bigbrain.ohsu.edu/coi/)~~ 18 | 1. ~~Create eIRB account registration~~ 19 | 1. ~~Update your contact info [here](https://repo.cslu.ohsu.edu/language-outcomes/housekeeping/blob/master/contact-info.md)~~ 20 | 1. ~~Create an ORCID (https://orcid.org)~~ 21 | 1. 
Start on individual development plan (https://myidp.sciencecareers.org)- talk to PI about this at some point -------------------------------------------------------------------------------- /01-onboarding/example/06-individual-onboarding/ra-onboarding.md: -------------------------------------------------------------------------------- 1 | To do: 2 | 3 | 1. ~~Get OHSU ID badge~~ 4 | 1. ~~Complete all IRB research integrity online trainings~~ 5 | 1. ~~Email to set up time to observe ADOS administration~~ 6 | 1. ~~Work with to get trained on transcribing ADOSes~~ {now transcribing independently} 7 | * Airtable transcript tracker (doesn't have to link with your OHSU acct) 8 | 1. ~~Work with phdstudent to start on programming project(s)~~ {have a working program, continuing edits of it} 9 | 1. ~~Plan to attend next REDCap training session~~ {on my cal} 10 | 1. ~~Set up important accounts linked to your OHSU username/password:~~ 11 | * ~~Box.com (box sync app for your desktop)~~ 12 | * ~~eIRB~~ 13 | * ~~email client + calendar {set up mail + ical}~~ 14 | 1. Read general onboarding materials {finished autism papers, working through other required reading} 15 | 1. ~~[Get your flu shot](https://o2.ohsu.edu/blogs/staffnews/2017/10/02/2017-flu-season-walk-in-flu-vaccine-clinics-begin-oct-3/?utm_source=mc_7298669)~~ -------------------------------------------------------------------------------- /01-onboarding/example/README.md: -------------------------------------------------------------------------------- 1 | Welcome! This project contains resources for new graduate research assistants, research staff, and volunteers. 
2 | 3 | # How to use this repository 4 | 5 | ## Please review the following links first: 6 | 7 | * [OHSU's New Hire Resources](https://o2.ohsu.edu/human-resources/employment/new-hire-resources.cfm) 8 | * [OHSU's Code of Conduct](https://o2.ohsu.edu/integrity-department/code-of-conduct/index.cfm) 9 | * [Conflicts of Interest and OHSU Disclosure Requirements](https://o2.ohsu.edu/integrity-department/all-ohsu/conflict-of-interest/index.cfm) 10 | * [OHSU Employee Onboarding Checklist](https://www.ohsu.edu/xd/about/services/human-resources/working-at-ohsu/upload/employee-onboarding-checklist.pdf) 11 | * [OHSU New Employee Paperwork](http://www.ohsu.edu/xd/about/services/human-resources/working-at-ohsu/new-employee-paperwork.cfm) 12 | 13 | ## Read more about our lab: 14 | 15 | * [Our lab's code of conduct](01-code-of-conduct.md) 16 | * [OHSU's requirements for research integrity training](02-research-integrity.md) 17 | * [Scholarly contributions and authorship](03-contributions-and-authorship.md) 18 | * [Complying with NIH public access policy](04-nih-public-access-policy.md) 19 | * Work through the [key readings](05-key-readings) to get up to speed on the scientific background for our research 20 | 21 | ## Get to work! 22 | * Make your own onboarding to-do list [here](06-individual-onboarding) and @apreshill will collaborate on the list with you 23 | * Update progress on your onboarding [list](06-individual-onboarding) by crossing things off your list as you complete them 24 | * Edit our onboarding materials! Please edit any files to clarify or improve them so that the information in this repository will be useful to our next new team member! 25 | * Add yourself and your contact information to our [project team directory](../../03-housekeeping/example/team-contacts.md)- we are excited to have you on board! 26 | * Don't miss our [project wiki](https://github.com/apreshill/labhub/wiki), which provides info on human logistics concerns--coffee, transportation, etc. 
27 | 28 | 29 | -------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/01-about-cslu.md: -------------------------------------------------------------------------------- 1 | # CSLU 2 | 3 | CSLU stands for the [Center for Spoken Language Understanding](https://cslu.ohsu.edu). We do research, and have a [Computer Science & Electrical Engineering (CSEE) education program](http://www.ohsu.edu/csee). 4 | 5 | # People 6 | 7 | You may see some of these folks around: https://www.ohsu.edu/xd/research/centers-institutes/center-for-spoken-language-understanding/people.cfm 8 | 9 | # CSLU Seminar Series 10 | 11 | The seminar series is generally Tuesdays from 12-1pm; lunch is provided. The [events calendar](https://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/basic-science-departments/csee/events.cfm) lists upcoming seminars. 12 | 13 | # Websites 14 | 15 | * CSLU: https://cslu.ohsu.edu 16 | * CSEE education program: http://www.ohsu.edu/csee 17 | 18 | # Mailing Address 19 | 20 | 3181 SW Sam Jackson Park Rd, GH40 21 | 22 | Portland, OR 97239-3098 23 | 24 | # Physical Address (for mapping) 25 | 26 | 840 SW Gaines Street 27 | 28 | Portland, OR 97239-3098 -------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/02-seminars-and-journal-clubs.md: -------------------------------------------------------------------------------- 1 | All graduate students, research staff, and volunteers are invited to attend: 2 | 3 | 1. [OHSU's monthly Autism Seminar Series](https://www.ohsu.edu/xd/research/about/calendar.cfm#/?i=1); email Eric Fombonne to be added to the mailing list 4 | 5 | 2. [CSLU's weekly CSEE Seminar Series](https://www.ohsu.edu/xd/about/news_events/events/index.cfm#/?i=4); email Patricia Dickerson to be added to the mailing list 6 | 7 | 3. 
Our weekly(-ish) Natural Language Processing reading group; email Steven Bedrick 8 | -------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/03-servers-and-data-repositories.md: -------------------------------------------------------------------------------- 1 | In this project, we'll use data from several different sources, and the file location depends on the type of data (text files, flat files like .csvs). We'll organize this by corpora: 2 | 3 | # ERPA 4 | 5 | This was data collected at OHSU between 6 | 7 | **Text files of ADOS transcripts** are stored on the asd server: asd.cslu.ohsu.edu:transcripts/TextGrid/chronTextGrid_merged_by_subject/ 8 | 9 | **Participant-level data** is stored in REDCap (Email Alison for access using your OHSU username and password): https://octri.ohsu.edu/redcap/ 10 | The project is called `OCTRI 10001 ERPA: Expressive & Receptive Prosody in Autism - Version 2` 11 | 12 | 2. 13 | 14 | # Additional cloud storage 15 | 16 | [OHSU provides](http://www.ohsu.edu/blogs/researchnews/2014/08/05/cloud-storage-now-available-for-ohsu-researchers/) you an institutional Box.com account, which is the only approved cloud storage that fits OHSU's data protection policy: https://ohsu.app.box.com/login -------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/04-CSLU-acronyms.md: -------------------------------------------------------------------------------- 1 | This page provides a list of the acronyms used at the Center for Spoken Language Understanding (CSLU)--see, they come up a lot ;). These may or may not come up in your work. 
2 | 3 | # Project-specific jargon 4 | 5 | - **ERPA**: Expressive and Receptive Prosody in Autism (old CSLU study) 6 | - **MIND Institute**: Medical Investigation of Neurodevelopmental Disorders Institute at UC Davis (source of transcripts) 7 | - **FNL**: Fair Neuroimaging Lab at OHSU (source of ADOS recordings) 8 | - **CON**: Conversation (one of the MIND Institute's language samples) 9 | - **NAR**: Narrative (one of the MIND Institute's language samples) 10 | - **PPs**: Pivotal Parameters 11 | - **ACW**: Affirmative Cue Words 12 | - **ADMs**: Automated Discourse Measures 13 | - **DMs**: Discourse Markers (e.g. um, uh) 14 | - **REC**: Replication/Extension Corpora (there are 2: MIND and FNL!) 15 | - **SOR**: Semantic Overlap Ratio 16 | - **WRRs**: Word Repetition Ratios 17 | - **NLs**: Natural Language Samples 18 | 19 | # Labs and Institutions 20 | 21 | - **CDRC**: Child Development and Rehabilitation Center 22 | - **CSEE**: Computer Science and Electrical Engineering Department at OHSU (educational counterpart to CSLU) 23 | - **CSLU**: Center for Spoken Language Understanding (here!) 24 | - **IRB**: Institutional Review Board 25 | - **NIH**: National Institutes of Health 26 | - **OGI**: Oregon Graduate Institute (defunct graduate institution that used to host CSLU) 27 | - **OHSU**: Oregon Health and Science University 28 | - **UW**: University of Washington (their autism lab sends ADOS recordings) 29 | 30 | # Buildings 31 | 32 | Also see [here](http://www.ohsu.edu/xd/about/visiting/directions/upload/OHSU_ext_map_BW_8-5x11_FNL.pdf) for a map of campus with commonly used campus acronyms for various buildings.
33 | 34 | - **BICC**: Biomedical Information and Communication Center (the school library) 35 | - **CHH**: Center for Health and Healing (the building next to the bottom of the tram) 36 | - **GH**: Gaines Hall (location of CSLU) 37 | - **SON**: School of Nursing (across the street from CSLU) 38 | 39 | # Journals 40 | 41 | - **ACL**: Association for Computational Linguistics 42 | - **ACM**: Association for Computing Machinery 43 | - **IEEE**: Institute of Electrical and Electronics Engineers 44 | - **NAACL**: North American Chapter of the Association for Computational Linguistics (subset of ACL) 45 | - **PLoS**: Public Library of Science (open access) 46 | 47 | # Technical (transcription, statistics, etc.) 48 | 49 | - **NLP**: Natural Language Processing 50 | - **tf-idf**: Term Frequency - Inverse Document Frequency (how often a term appears in this document compared to how often it occurs in your corpus) 51 | - **CFA**: Confirmatory Factor Analysis 52 | - **GEEs**: Generalized Estimating Equations 53 | - **LDA**: Latent Dirichlet Allocation (generative model in NLP) or Linear Discriminant Analysis (machine learning dimensionality reduction technique) 54 | - **RMSEA**: Root Mean Square Error of Approximation 55 | - **SVM**: Support Vector Machine (machine learning classification technique) 56 | - **MLU**: Mean Length of Utterance 57 | - **MLUM**: Mean Length of Utterance in Morphemes 58 | - **SALT**: Systematic Analysis of Language Transcripts (protocol for transcription) 59 | 60 | 61 | # Neurodevelopmental disorders/diagnostic categories (or lack thereof) 62 | 63 | - **ASD**: Autism Spectrum Disorder 64 | - **ALI**: Autism with Language Impairment (subset used in ERPA) 65 | - **ALN**: Autism with Language Normal (subset used in ERPA) 66 | - **DD**: Developmentally Delayed 67 | - **DS**: Down Syndrome 68 | - **FXS**: Fragile X Syndrome 69 | - **TD**: Typically Developing 70 | - **SLI**: Specific Language Impairment 71 | 72 | # Measures (clinical assessments, parent-reported
questionnaires, etc.) 73 | 74 | - **ADOS**: Autism Diagnostic Observation Schedule (semi-structured standardized play-based assessment for ASD) 75 | - **BRIEF**: Behavior Rating Inventory of Executive Function (parent questionnaire; measure of executive function) 76 | - **CCC-2**: Children's Communication Checklist, Second Edition (parent questionnaire; measures child's language use in natural settings) 77 | - **CELF**: Clinical Evaluation of Language Fundamentals (test for expressive and receptive language abilities) 78 | - **SCQ**: Social Communication Questionnaire (parent questionnaire; assessment of core autism symptoms) 79 | - **SDQ**: Strengths and Difficulties Questionnaire 80 | - **SRS-2**: Social Responsiveness Scale, Second Edition 81 | - **VABS-II**: Vineland Adaptive Behavior Scales, Second Edition (parent questionnaire; assessment of daily life skills used to estimate general adaptive functioning) 82 | - **FSIQ**: Full Scale IQ 83 | - **NVIQ**: Nonverbal IQ 84 | - **PIQ**: Performance IQ 85 | 86 | # Other 87 | 88 | - **R01**: NIH Research Project Grant Program (a type of grant: see https://grants.nih.gov/grants/funding/r01.htm) 89 | - **IFSP**: Individualized Family Service Plan (a plan for special services for young children with developmental delays) 90 | - **IEP**: Individualized Education Program (a document that is developed for each public school child who needs special education) 91 | - **IDP**: Individual Development Plan (see: https://myidp.sciencecareers.org; http://www.sciencemag.org/careers/2012/09/you-need-game-plan) 92 | - **RPPR**: Research Performance Progress Report (see: https://grants.nih.gov/grants/rppr/index.htm) 93 | - **myNCBI**: [my National Center for Biotechnology Information](https://www.ncbi.nlm.nih.gov/myncbi/) 94 | - **sciENcv**: [Science Experts Network Curriculum Vitae](https://www.ncbi.nlm.nih.gov/sciencv/) 95 | - **NIHMS**: [NIH Manuscript Submission System](https://www.nihms.nih.gov/)
-------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/05-food-and-coffee.md: -------------------------------------------------------------------------------- 1 | OHSU has a [formal page on this](http://www.ohsu.edu/xd/about/services/food-and-nutrition/where-to-eat/). 2 | 3 | # Food (may also sell coffee) 4 | 5 | * **Mac Hall** is on the first floor of Mackenzie Hall and has cafeteria-style lunch. They close at 4. 6 | * **Thai Yummy** is a Thai food cart. They close at 3 officially but usually more like 2. 7 | * There's a **[farmer's market](http://www.ohsu.edu/xd/about/services/food-and-nutrition/farmers-market/index.cfm)** June-September on Tuesdays from 10-2 with several food carts/stalls. 8 | * **It's All Good** is the fancy natural food store, next to the gift store in the main hospital building. It has some prepackaged food and lots of snacks. 9 | * **Hatfield Cafe** is in Hatfield; they have deli fare at lunch and an espresso and pastry window. 10 | * At the base of the tram: **Pizzicato/Lovejoy** (pizza mostly), **Cha! Cha! Cha!** (Mexican food), **Greenleaf** (fancy juice) 11 | * On the waterfront: **Let's Eat Thai Food** (food cart; cheap and giant but a fair walk), **Bambuza** (Vietnamese), **Little Big Burger** (burgers). These are variously long walks, so they won't really fit in a standard lunch break. 12 | * There is a **vending machine** outside GH40. There is some formal process to request your money back if it eats it. 13 | * **The Feathered Nesst** is a quick walk from Gaines Hall and sells burgers, sandwiches, salad, etc.; however, they are often slow. This is a popular destination for going-away lunches. 14 | 15 | # Coffee (may also sell food) 16 | 17 | * Pat makes **coffee** (or you can too!) 18 | * We have an **espresso maker** in the kitchen now--but bring your own beans! 19 | * **Nightingale Cafe** is in the School of Nursing (get it?) on the first floor.
They close at 2pm and have coffee and pastries. 20 | * **Sky Bridge Espresso** is where the VA skybridge and Doernbecher meet. It closes at 3pm. 21 | * The **Summit Cafe**, at the top of the tram, is open until 4pm. 22 | * There's a **Starbucks** in the lobby of Doernbecher that is usually open until 8pm, making it the last coffee shop to close in the area. -------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/06-office-stuff.md: -------------------------------------------------------------------------------- 1 | [Pat Dickerson](mailto:dickersp@ohsu.edu) is your resource for office needs! 2 | 3 | # Office Supplies 4 | 5 | The office supplies are located in the cabinets near the sink in GH 40. These cabinets are stocked with the basics--paper, pencils, folders, highlighters, tissues, etc. If you want anything not available there, ask Pat and she can show you the secret storage or order items we don't have. (Pro-tip: ask about the fancy pens). 6 | 7 | # Office Protocol 8 | If you work in GH 30 or 40, please be sure to shut off lights and lock up if you are the last person out for the day. If you’re unsure if anyone else is in, lock the door anyway. If you need after hours access, Pat can request keys for you. 9 | 10 | Locked out? Call public safety at 503-494-7744. -------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/07-transportation-and-parking.md: -------------------------------------------------------------------------------- 1 | ## Transportation 2 | 3 | ### The Great Tram Debate 4 | 5 | If you take the tram up the hill, there are two ways to get to Gaines Hall. 6 | 7 | * The Outdoor Way: walk straight away from the tram, either out the front doors toward the ER or juke around to the left until you hit the next set of doors. Walk past the library then turn left before phys plant. 
Walk on the swoopy path past the apartment building/Feathered Nesst. Cross the street to School of Nursing. Either enter the SoN and go up via the internal stairs/elevator, or go to the right and take the steep stairs up. Cross the street to Gaines Hall. 8 | 9 | * The Indoor Way: walk straight away from the tram, juke left then turn left at the gift store. Turn right at the coffee shop which puts you into the Doernbecher skybridge. Cross the skybridge then take the stairs to your left one floor up. Turn left from the stairs and walk outside. Cross the street, then walk up to the enclosed bridge. Take a left out of the enclosed bridge to the parking lot and walk up Gaines Street to Gaines Hall on the left. 10 | 11 | People have opinions about which of these you should take. Steven Bedrick timed them and says they take the same amount of time. It's up to you. 12 | 13 | ### Parking 14 | 15 | Parking for non-patients is intentionally hard up here (the city doesn't want the traffic jam of everyone driving up the hill), so if you're going to work a full day it's usually easier to use transit and/or bike. If you do drive, there's 3-hour pay parking on Gaines and a 2-hour visitor parking zone up towards the nature park. The parking is enforced pretty strictly so you do risk a ticket if you overstay. If you move your car within the visitor zone they can still ticket you for overstaying. In the pay parking you have to switch blocks and/or sides of the street after 3 hours, and buy a new ticket. 16 | 17 | Also, if you're coming from the east it's easier to take Barbur and turn up Bancroft than it is to take the front way; this skips most of the traffic. However it is a very sharp turn and a somewhat steep hill so be careful. 18 | 19 | ### GPS 20 | 21 | Sometimes telling people "Gaines Hall" will erroneously send them to the waterfront, but "SW 9th and Gaines" usually works. 
22 | 23 | The physical address for Gaines Hall is 840 SW Gaines Street, Portland, OR, 97239 24 | 25 | ### Go By Bike 26 | 27 | There is a [free bicycle valet](http://www.gobybikepdx.com) at the base of the tram, open from 6 AM to 7:30 pm when the tram is running. You can swipe your badge and they'll use that to store which bike is yours; otherwise you can get a physical claim check for your bike. Bikes are stored in a guarded lot, so you don't have to lock them or strip off your bike accessories. They'll usually have bags to cover your seat if it's rainy, but you may wish to take your helmet with you on wet days. 28 | 29 | They have a small attached bike shop that can do minor repairs like fixing a flat tire. -------------------------------------------------------------------------------- /01-onboarding/example/onboarding.wiki/home.md: -------------------------------------------------------------------------------- 1 | Welcome to the Onboarding Wiki! This Wiki provides additional information on the human logistics concerns for new graduate research assistants, research staff, and volunteers at CSLU. 2 | 3 | # Topics 4 | - Confused about why people keep saying ADOS like it's a real word? Read through [CSLU Acronyms](./CSLU acronyms). 5 | - About to buy your own office supplies? Don't!! Check out [Office Stuff](./Office stuff). 6 | - Experiencing caffeine withdrawal? Go immediately to [Food and Coffee](./Food and coffee). 7 | - Don't know how you got here/how to get home/where anything is? Look at [Transportation and Parking](./Transportation and Parking). -------------------------------------------------------------------------------- /01-onboarding/template/onboarding.md: -------------------------------------------------------------------------------- 1 | Use this template to document and reference resources, expectations, and policies for new lab members. Adapt and refine the sections and examples provided here. 
2 | 3 | ## Welcome Message 4 | > Provide a brief introduction to your team. Consider including the name and focus of your lab, and where it sits within the larger university environment. **Let new members know how happy you are to have them on board and how they can use this information.** 5 | 6 | ## Parent Organization Resources & Processes 7 | > Whether you are a new student, postdoc, or staff member, getting started at a new organization can be confusing and overwhelming. Use this section to highlight the processes and resources relevant to new lab members. 8 | 9 | Examples: 10 | 11 | * University intranet sites 12 | * New employee onboarding checklists and paperwork 13 | * Public Safety Office location and contact information 14 | * University codes of conduct 15 | * Staff and student organizations 16 | * Parking and transportation 17 | * Library website :book: 18 | * Research development and grant management resources 19 | 20 | ## Department and Lab Specific Information 21 | > Information about your department and lab that will help new members learn about your colleagues, connect with peers, and leverage educational opportunities. 22 | 23 | Examples: 24 | 25 | * Department website, faculty, and student pages 26 | * Primary collaborators 27 | * Seminars and journal clubs 28 | * Lab publications list 29 | * Professional development materials 30 | 31 | :100: Pro-tip! You can create and include links to your publication list from PubMed and other databases. Ask your library for help :smile: 32 | 33 | ## Lab Policies and Guidelines 34 | > Save time, reduce conflict, and build inclusion by clearly describing and documenting lab expectations and processes. 35 | 36 | Examples: 37 | 38 | * Lab code of conduct 39 | * Scholarly contributions and authorship 40 | * Protocols 41 | * Data management and repositories 42 | * Project organization 43 | * Open access policy 44 | * Data privacy and data sharing 45 | 46 | :100: Pro-tip! 
A lab code of conduct, or COC, provides an opportunity to codify the organizational culture you want to foster, by clearly describing desired and unacceptable kinds of behavior and interpersonal communication. 47 | 48 | For example, like the [Hill Lab](https://github.com/apreshill/labhub/blob/master/01-onboarding/example/01-code-of-conduct.md) and the [Whitaker Lab](https://github.com/WhitakerLab/Onboarding/blob/master/CODE_OF_CONDUCT.md), you can make clear that sexist and racist language or imagery are not acceptable, and that your lab is dedicated to providing a safe and inclusive environment for everyone, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion. 49 | 50 | Check out all of the [Whitaker Lab's onboarding materials.](https://github.com/WhitakerLab/Onboarding) 51 | 52 | ## Additional Resources and Readings 53 | > Maybe your lab has a wiki, or your university sits at the top of a hill and one might need a little help finding the good food and drink :pizza: Think through the things you wish you knew when you first arrived - if they don't belong in one of the sections above, add them here. 54 | 55 | Examples: 56 | 57 | * [Where to find the best coffee on campus](https://github.com/apreshill/labhub/wiki/05-food-and-coffee) 58 | * How to decipher the endless [list of acronyms](https://github.com/apreshill/labhub/wiki/04-CSLU-acronyms) your new colleagues are throwing at you :stuck_out_tongue_winking_eye: 59 | 60 | ## Next Steps 61 | > Make this material more actionable by asking new lab members to collaborate with you on a personalized onboarding plan. And don't underestimate the value of their feedback on this material - welcome their contributions. 62 | 63 | :100: Pro-tip! Contributions will not only improve the quality of your labhub material, they help make it sustainable. 
64 | -------------------------------------------------------------------------------- /02-protocols/example/README.md: -------------------------------------------------------------------------------- 1 | Welcome! This folder contains resources for team members who transcribe audio/video files for the project. We track transcription progress via the [Airtable Transcript Tracker](https://airtable.com). Here's some info about [what to put there](transcription/tracking). 2 | 3 | In this folder you'll find: 4 | 5 | * [Useful scripts](scripts): for example, `upload.sh` is a Bash script that uploads your file to the right folder using rsync, assuming you have your username stored and `asd` aliased to the asd server (`asd.cslu.ohsu.edu`). 6 | * [Suggested readings](suggested-reading): these are papers about transcription itself that might be helpful to you (and are generally 7 | interesting). Not required -- just here if you want them! Note that in-browser PDF rendering can scramble pages, so for the best reading experience, download and read locally. 8 | * Data storage: there may come a day when you want to consult the raw source data for a project. There are two types of raw data: 9 | * [Paper Files](datastorage/files.md) 10 | * [Recordings](datastorage/recordings.md) 11 | * All about transcribing: includes lots of good-to-know info such as...
12 | 13 | * [Transcription Overview](transcription/transcription.md) 14 | * [Formatting and Uploading](transcription/formatting-uploading.md) 15 | * [ELAN Tips and Tricks](transcription/elan.md) 16 | * [Activity Details](transcription/activities.md) 17 | * [ADOS overview](transcription/ados.md) 18 | * [Transcription Guidelines](transcription/guidelines.md) 19 | * [UW Specific Information](transcription/uw.md) 20 | * [ADHD Specific Information](transcription/adhd.md) 21 | * [ERPA Specific Information](transcription/erpa.md) 22 | * [Getting Audio from audio2](transcription/audio2.md) 23 | * [More about the TextGrid file format](transcription/textgrids.md) 24 | * [Cool (and relevant) bash tricks](transcription/bash.md) 25 | * [Audio cleaning](transcription/cleaning.md) 26 | 27 | -------------------------------------------------------------------------------- /02-protocols/example/datastorage/files.md: -------------------------------------------------------------------------------- 1 | You may come into possession of a set of filing cabinet keys (Pat has copies of these as well). The green-capped key (labeled '4-tier') unlocks the cabinet inside the double-door closet next to the mail cubbies in GH40. This cabinet contains paper files for ERPA and PAIRS subjects. The blue-capped key (labeled 'Tall') opens the tall five-drawer cabinet next to this closet, which also contains ERPA files. The key with no colored cap but a red tag labeled '103R' opens the black filing cabinet in GH30, which contains the remainder of the ERPA files. At this point, all of this data has been entered into REDCap, so there isn't really a reason you would need to look at it, but it's there if you need it. 
-------------------------------------------------------------------------------- /02-protocols/example/datastorage/recordings.md: -------------------------------------------------------------------------------- 1 | # Location 2 | 3 | The Mini DVs containing all the video recordings from ERPA can be found in the supply closet in the back of GH30 (this can be opened with a master key). These are organized in labeled storage bins by subject -- if you remove one from the shelves, be sure to put it back in numerical order, and don't put tapes back in the wrong bins! 4 | 5 | # Digitization Efforts 6 | 7 | ## ERPA ADOSes 8 | 9 | All of the ERPA ADOS video recordings that existed in our storage have been digitized and uploaded to the server. They are located in 10 | 11 | ``` 12 | asd:/home/asd-lang/ERPA/ADOS/video/ 13 | ``` 14 | 15 | ## ERPA CELFs 16 | There has been a partial digitization effort for the ERPA CELF video recordings -- you may be asked to resume this endeavor. 17 | 18 | Consult the [Airtable CELF digitization tracker](https://airtable.com) to see the digitization progress, and update it as you upload more files. Note that you should focus first on those labeled 'Completed' in the 'status' column. These are guaranteed to be CELF-4's (for older children), rather than CELF-2's (for preschoolers), and are more useful for our analyses. 19 | 20 | The digitized videos are located in 21 | 22 | ``` 23 | asd:/home/asd-lang/ERPA/CELF/videos/ 24 | ``` 25 | 26 | If you are asked to continue digitization: 27 | 28 | * Locate the CELF cassette in a subject's bin. They may be labeled **OGI-XXX mm/dd/yy CELF**, but more commonly **OGI-XXX mm/dd/yy Language Testing**. 29 | 30 | * There are multiple Canon Vixia HV40 camcorders floating around the office -- locate a working one and a charger (they need to be plugged in while digitizing). If these have all died, anything that plays Mini DVs and can connect to a computer should work.
Put the tape in, switch it from **OFF** to **PLAY** (**not** to **CAMERA**, we don't want to record over them!!) 31 | 32 | * Play through until you find where the CELF starts. Consult the CELF manual for reference if you're not familiar with the tasks, and look for the CELF administration book in the video. The other common tasks on this tape are: 33 | - the PPVT, a picture-naming task. 34 | - a task in which the child is asked to list things while being timed. These include the days of the week, months, and number patterns. The examiner uses a stopwatch in these tasks. 35 | 36 | * Connect the camera to your computer with a FireWire cable. 37 | 38 | * Start iMovie (or your preferred software). In iMovie, hit Import Media, find the connected camcorder in the Devices list, and start importing it. 39 | 40 | * Digitization is done in real-time, so I recommend starting it and leaving it to progress for about an hour while you do other work. 41 | 42 | * Again, these camcorders are old and on their last legs, so once it's finished, make sure it all made it on there. Sometimes it gets broken into lots of clips, which is super annoying and necessitates manually pasting them together and/or finding and re-uploading missed sections to make a single unbroken movie. 43 | 44 | * Export your finished movie from iMovie, saved as a .mp4 for consistency, and upload it to the server 45 | 46 | * Note your progress on the CELF Digitization progress spreadsheet. 
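
The export-and-upload step above could be sketched roughly as follows. This is a minimal sketch, not the lab's actual procedure: it assumes `asd` is set up as an alias for the server (the same assumption `upload.sh` makes), the function name `celf_upload_cmd` is hypothetical, and the file name is a placeholder. It only prints the rsync command so you can check the destination before running it yourself.

```shell
#!/usr/bin/env bash
# Sketch only: build the rsync command for a digitized CELF video.
# Assumes `asd` is aliased to the asd server; the file name below is a placeholder.

CELF_DEST="asd:/home/asd-lang/ERPA/CELF/videos/"

# Print the rsync command for one file; refuse anything that is not an .mp4.
celf_upload_cmd() {
    file="$1"
    case "$file" in
        *.mp4)
            printf 'rsync -avP "%s" "%s"\n' "$file" "$CELF_DEST"
            ;;
        *)
            echo "Expected an .mp4 file, got: $file" >&2
            return 1
            ;;
    esac
}

# Prints the command instead of running it, so nothing is transferred yet.
celf_upload_cmd "OGI-XXX_CELF.mp4"
```

The `-avP` flags match the ones `upload.sh` uses; `-P` keeps partial transfers, which helps with hour-long video files on a flaky connection.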
47 | 48 | -------------------------------------------------------------------------------- /02-protocols/example/scripts/upload.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | #upload.sh: smart upload transcription files to the right folder 4 | #requires username be stored and asd be aliased to the asd server 5 | #todo: currently doesn't handle being run from another dir 6 | #because it looks at the front of the address it's given 7 | #so seeing /otherdir/ in front of the address confuses its tiny brain 8 | 9 | #fnl 10 | case "$1" in 11 | FNL-*) 12 | echo "FNL file detected." 13 | rsync -avP "./$1" asd:/home/asd-lang/FNL/ADOS/transcripts/TextGrid 14 | ;; 15 | #uw-gendaar - tricky because they just start with numbers 16 | [0-9][0-9][0-9].03*) 17 | echo "UW GENDAAR file detected." 18 | rsync -avP "./$1" asd:/home/asd-lang/UW_GENDAAR/transcripts 19 | ;; 20 | #uw-estes - done being transcribed, unlikely to come up 21 | UW-*) 22 | echo "UW Estes file detected." 23 | rsync -avP "./$1" asd:/home/asd-lang/UW/ADOS/transcripts/TextGrid 24 | ;; 25 | *) 26 | echo "Friend, I'm not sure what this is."
27 | ;; 28 | esac -------------------------------------------------------------------------------- /02-protocols/example/suggested-reading/BucholtzPoliticsofTranscription.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/02-protocols/example/suggested-reading/BucholtzPoliticsofTranscription.pdf -------------------------------------------------------------------------------- /02-protocols/example/suggested-reading/ochs1979.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/02-protocols/example/suggested-reading/ochs1979.pdf -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/datastorage/files.md: -------------------------------------------------------------------------------- 1 | You may come into possession of a set of filing cabinet keys (Pat has copies of these as well). The green-capped key (labeled '4-tier') unlocks the cabinet inside the double-door closet next to the mail cubbies in GH40. This cabinet contains paper files for ERPA and PAIRS subjects. The blue-capped key (labeled 'Tall') opens the tall five-drawer cabinet next to this closet, which also contains ERPA files. The key with no colored cap but a red tag labeled '103R' opens the black filing cabinet in GH30, which contains the remainder of the ERPA files. At this point, all of this data has been entered into REDCap, so there isn't really a reason you would need to look at it, but it's there if you need it. 
-------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/datastorage/recordings.md: -------------------------------------------------------------------------------- 1 | # Location 2 | 3 | The Mini DVs containing all the video recordings from ERPA can be found in the supply closet in the back of GH30 (this can be opened with a master key). These are organized in labeled storage bins by subject -- if you remove one from the shelves, be sure to put it back in numerical order, and don't put tapes back in the wrong bins! 4 | 5 | # Digitization Efforts 6 | 7 | ## ERPA ADOSes 8 | 9 | All of the ERPA ADOS video recordings that existed in our storage have been digitized and uploaded to the server. They are located in 10 | 11 | ``` 12 | asd:/home/asd-lang/ERPA/ADOS/video/ 13 | ``` 14 | 15 | ## ERPA CELFs 16 | There has been a partial digitization effort for the ERPA CELF video recordings -- you may be asked to resume this endeavor. 17 | 18 | Consult the [Airtable CELF digitization tracker](https://airtable.com/invite/l?inviteId=invSCS7qWQ2CmDbPr&inviteToken=ec920973ff253744b078448797db31d6) to see the digitization progress, and update it as you upload more files. Note that you should focus first on those labeled 'Completed' in the 'status' column. These are guaranteed to be CELF-4's (for older children), rather than CELF-2's (for preschoolers), and are more useful for our analyses. 19 | 20 | The digitized videos are located in 21 | 22 | ``` 23 | asd:/home/asd-lang/ERPA/CELF/videos/ 24 | ``` 25 | 26 | If you are asked to continue digitization: 27 | 28 | * Locate the CELF cassette in a subject's bin. They may be labeled **OGI-XXX mm/dd/yy CELF**, but more commonly **OGI-XXX mm/dd/yy Language Testing**. 29 | 30 | * There are multiple Canon Vixia HV40 camcorders floating around the office -- locate a working one and a charger (they need to be plugged in while digitizing).
If these have all died, anything that plays Mini DVs and can connect to a computer should work. Put the tape in, switch it from **OFF** to **PLAY** (**not** to **CAMERA**, we don't want to record over them!!) 31 | 32 | * Play through until you find where the CELF starts. Consult the CELF manual for reference if you're not familiar with the tasks, and look for the CELF administration book in the video. The other common tasks on this tape are: 33 | - the PPVT, a picture-naming task. 34 | - a task in which the child is asked to list things while being timed. These include the days of the week, months, and number patterns. The examiner uses a stopwatch in these tasks. 35 | 36 | * Connect the camera to your computer with a FireWire cable. 37 | 38 | * Start iMovie (or your preferred software). In iMovie, hit Import Media, find the connected camcorder in the Devices list, and start importing it. 39 | 40 | * Digitization is done in real-time, so I recommend starting it and leaving it to progress for about an hour while you do other work. 41 | 42 | * Again, these camcorders are old and on their last legs, so once it's finished, make sure it all made it on there. Sometimes it gets broken into lots of clips, which is super annoying and necessitates manually pasting them together and/or finding and re-uploading missed sections to make a single unbroken movie. 43 | 44 | * Export your finished movie from iMovie, saved as a .mp4 for consistency, and upload it to the server 45 | 46 | * Note your progress on the CELF Digitization progress spreadsheet. 47 | 48 | -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/home.md: -------------------------------------------------------------------------------- 1 | Welcome to the transcription wiki! 2 | 3 | --- 4 | 5 | # Introduction 6 | 7 | Welcome to the CSLU Research Assistant Institutional Knowledge Wiki! 
The goal of this wiki is to collect all the tips, tricks, and good-to-know-s to make your fellow RA's day a little bit easier, without making the transcriber guidelines too overwhelmingly detailed. 8 | 9 | This is a work in progress! Feel free to contribute and add to the knowledge base, but right now there may be a lot of drastic changes as everything gets set up. Throw redlinks around, it'll be fun! 10 | 11 | # Transcription 12 | 13 | * [Transcription Overview](transcription) 14 | * [Formatting and Uploading](transcription/formatting-uploading) 15 | * [ELAN Tips and Tricks](transcription/elan) 16 | * [Activity Details](transcription/activities) 17 | * [ADOS overview](transcription/ados) 18 | * [Transcription Guidelines](transcription/guidelines) 19 | * [UW Specific Information](transcription/uw) 20 | * [ADHD Specific Information](transcription/adhd) 21 | * [ERPA Specific Information](transcription/erpa) 22 | * [Getting Audio from audio2](transcription/audio2) 23 | * [More about the TextGrid file format](transcription/textgrids) 24 | * [Cool (and relevant) bash tricks](transcription/bash) 25 | * [Audio cleaning](transcription/cleaning) 26 | * Transcriptions are tracked via the [Airtable Transcript Tracker](https://airtable.com). Here's some info about [what to put there](transcription/tracking). 27 | 28 | # Servers 29 | 30 | The main server for transcription related purposes is asd, formally `asd.cslu.ohsu.edu`. 31 | 32 | Some other servers you may hear about are bergamot (`login.cslu.ohsu.edu`), which is used to tunnel into internal servers from outside the network, and the Big Birds (`bigbirdN.cslu.ohsu.edu`), which are the center's computing cluster. If someone wants you to work on one of these, they'll probably give you more information about it at the time. 33 | 34 | # Raw Data Storage 35 | 36 | There may come a day when you want to consult the raw source data for a project.
The main data stored on-site is from ERPA -- UW and other shared projects do not have such raw data available.
37 |
38 | * [Paper Files](datastorage/files)
39 | * [Recordings](datastorage/recordings)
40 |
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription.md:
--------------------------------------------------------------------------------
1 | This page is a big picture overview of how to transcribe and why we do it. We hope at least one experienced transcriber will be around to talk about this in person too.
2 |
3 | # Transcription Goals
4 |
5 | Most transcribers have historically had a linguistics background. This section may be a bit redundant for those who do.
6 |
7 | The goal of transcription is to preserve what the child actually said, in a form amenable to study. The whole idea of _actually said_ is not nearly as simple as it sounds; we necessarily make choices about what details are relevant to include in a transcript. That said, speech is messy. Some of the children whose tapes we're transcribing may have language disorders and other conditions that influence their speech, but even the adult examiners will display some non-standard speech at times -- everyone does. If you haven't transcribed before you will be tempted to clean up the child's speech into something closer to a novel or script. One of the main tasks of learning to be a transcriber is learning to hear the words as they were said, and write them down, messy as they may be.
8 |
9 | # Uses for the Transcripts
10 |
11 | The transcripts are used for a variety of automated analyses. We can't prepare for every possible future use, and there will probably be new analyses conducted that we can't predict yet. We don't want to focus on only a few predicted uses and thus unduly influence the transcripts, but knowing what transcripts will be used for downstream can help us choose which phenomena deserve special notation.
It's also helpful to have enough information in a transcript that a human can read it and figure out what's going on, but this is more for troubleshooting; most of the publications based on our transcripts have to do with a computational analysis.
12 |
13 | One set of measures commonly used to study the transcripts includes MLU (mean length of utterance -- an important measure and the reason so many rules revolve around what counts as one word) as well as `other stuff`.
14 |
15 | Some things that have been studied include turn-taking, mazes, and pedantic speech.
16 |
17 | # Transcriber Training
18 |
19 | New transcribers start off by studying the transcription guidelines, then transcribe the gold standard file. The gold standard file is a file that already has a transcription generally agreed to be of high quality. The current one is ADHD-83322-ADOS (no peeking!). Then experienced transcribers compare the new transcriber's transcription to the gold standard and offer advice and guidance based on the differences. There are certain parts of transcribing that are subjective, so the goal isn't necessarily to have character-for-character the same transcript, but this is a good way to get used to the conventions and learn to avoid common mistakes.
20 |
21 | In addition to the version on asd, there are two other versions of the gold standard file made by other transcribers and revised in a group discussion. All canonical versions of the gold standard transcription can be found at `asd:/home/langauge/Transcription/gold_standard`. The point of keeping multiple versions is to make it easier for trainers to see if the new transcriber is showing normal human variation in judgement or making actual mistakes.
22 |
23 | # Standardization Measures
24 |
25 | Every September and March all working transcribers need to do a standardization check. Anyone who is transcribing at the time needs to share a transcript with their fellow transcribers at some point during that month.
Transcribers should review each other's work and discuss any discrepancies. This is to ensure that transcribers remain standardized to both the guidelines and each other.
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription/activities.md:
--------------------------------------------------------------------------------
1 | This page is for more specific details about ADOS activities that tend to come up a lot. Also check out the actual ADOS manual/scoring guidelines for directions on how the instructors are taught to do the task.
2 |
3 |
4 | | ADOS task | ADOS task number | Transcription activity label | Cue words/questions | Transcribe |
5 | |---|---|---|---|---|
6 | | Construction Task | 1 | Construction Task | Beginning of audio | N |
7 | | Make Believe Play & Joint Interactive Play | 2&3 | Play | Loud background noise of toys being brought out; "I have some toys for you" | Y |
8 | | Demonstration Task | 4 | Demonstration Task | "I want you to pretend like I don't know how to brush my teeth" | N |
9 | | Description of a Picture | 5 | Picture Description | "Let's look at this picture" | Y |
10 | | Telling a Story from a Book | 6 | Wordless Picture Book | "Let's look at this book" | Y |
11 | | Cartoons | 7 | Cartoons | "I've got some cartoons here" | Y |
12 | | Conversation and Reporting | 8 | Conversation and Reporting | (Various and optional) | Y |
13 | | Emotions | 9 | xEmotions Conversation | "What do you like doing that makes you feel happy and cheerful?" | Y |
14 | | Social Difficulties and Annoyance | 10 | xSocial Difficulties and Annoyance Conversation | "Have you ever had problems getting along with people at school?" | Y |
15 | | Break | 11 | Break | "We're going to take a quick break" | N |
16 | | Friends, Relationships and Marriage | 12 | xFriends and Marriage Conversation | "Do you have some friends?" | Y |
17 | | Loneliness | 13 | xLoneliness Conversation | "Do you ever feel lonely?" | Y |
18 | | Creating a Story | 14 | Creating a Story | "One last thing" | N |
19 |
20 | # Construction Task
21 |
22 | **Not transcribed**
23 |
24 | * This is the first task. The examiner will ask the kid to put some blocks on a picture. At the end they might ask the kid what the shape looks like to them.
25 |
26 | * Cues:
27 |
28 |     * [start of the tape]
29 |
30 | # Play
31 |
32 | * The sound of taking out the toys is usually the biggest cue to play.
33 | * The most common set of toys has three action figures (two male one female), a dinosaur, and a collection of stuff: fire-truck, hot-dog, chocolate bar, miniature CD-type disc, etc. The toy set for FNL has the wrong sword for the female action figure and it looks notably mis-sized.
34 | * There's an alternate set of toys that have a family including a baby. Some administrations use these instead, usually when a Module-3 administration occurs with a very young child (3, 4, maybe 5).
35 | * They are supposed to introduce the characters as "wrestler, superhero, warrior, and their things."
36 | * The task is actually in two parts. First the kid plays on their own then the examiner asks to join.
37 | * Some kids really just make sound effects, particularly in the first part.
38 | * This is probably the most commonly refused task. Older kids will sometimes do a very short one, too.
39 |
40 | * Cues:
41 |
42 |     * So I've got some things here.
43 |     * So I've got like a ... space guy, army guy, warrior princess, and their pet dinosaur, and here are some of their things.
44 |     * I've got some toys for you to play with.
45 |
46 | # Demonstration Task
47 |
48 | **Not transcribed**
49 |
50 | * This task is often framed as the examiner being an alien who needs to be taught how to brush their (shockingly human-like) teeth.
51 |
52 | * Cues:
53 |
54 |     * So now I'd like to play a different kind of pretend game.
55 |
56 | # Picture Description
57 |
58 | * There are three pictures they might use.
59 |
60 |     * The most common is a novelty map of the United States with different cartoons showing stuff iconic for each state -- landmarks like the Space Needle and the Golden Gate Bridge as well as activities and objects like a gambler for Vegas and an oil well for Texas. The examiner will typically try to start a conversation about vacations the kid's been on and/or where the kid is from.
61 |
62 |     * The second most common is a picture of people doing different activities at a beach resort. This one will also typically segue into a chat about vacations.
63 |
64 |     * The rarest (used more in ERPA) is a picture of a Thanksgiving feast. The examiner might ask how the kid celebrates Thanksgiving. (This is usually for Mod-2.)
65 |
66 | * Cues:
67 |
68 |     * Now I've got a picture here...
69 |
70 | # Wordless Picture Book
71 |
72 | * There are three different books, but it's almost always the first.
73 | * The main book is _Tuesday_ by David Wiesner. Pictures from it can be found online. It's a story about a night where frogs spontaneously start to fly on their lily-pads and have adventures around a town. At the end we see that pigs fly the next week. The examiner may ask the kid if they've heard the phrase "when pigs fly" and if they know what it means. (This is not part of a standard administration.)
74 | * An alternative book is _Free Fall_, also by David Wiesner, which is a story about a kid daydreaming. They often offer the kid the choice between this book and _Tuesday_, and the kids tend to go for the frogs.
75 | * A few files will use _Good Night, Gorilla_ by Peggy Rathmann.
This is often a sign that the kid is less verbal and that they might switch to the ADOS Module 2, so check the rest of the file if you see it! (It is also used with very young children who are appropriate for a Module 3: ages 3, 4, maybe 5.)
76 |
77 | * Cues:
78 |
79 |     * Now I've got this book here...
80 |
81 | # Cartoons
82 |
83 | * There are three different cartoons. FNL tends to use the fisherman, UW often uses the others.
84 |
85 |     * A fisherman and his cat are fishing. The fisherman puts the fish into the bucket. The cat steals it and puts it in what looks like his own bucket, but is actually the beak of a pelican. The pelican flies away and the cat is angry.
86 |     * A monkey is in a tree. He drops coconuts out of the tree and they are stolen by another monkey. The tree monkey waits for the thief monkey to come back and drops a coconut on his head.
87 |     * [UW-Estes only, timepoint 2 & 3] A Spy vs Spy cartoon involving using a "peanut potion" to lure elephants. The examiner may not recognize it and may describe the spies as mosquitoes. (I have never seen this cartoon; it does not come with the ADOS-2 kit.)
88 |
89 | * The reason they ask the kids to stand up and take their hands out of their pockets is actually to see if they'll spontaneously gesture as they tell the story.
90 |
91 | * Cues:
92 |
93 |     * Now I've got some cartoons...
94 |
95 | # Conversation and Reporting
96 |
97 | * This gets a long description in the main guidelines because it's hard to define. The main point is that it's for when the examiner initiates a conversation that's not really related to the last activity. It is generally used when there has not been enough spontaneous conversation and reporting by the time the examiner gets to this section. If plenty has already occurred, it is generally not used as a stand-alone "activity."
98 |
99 | * One of the examiners for UW GENDAAR often conducts C&R as the very last activity after Creating a Story, so make sure to check the very end of the file!
100 |
101 | * Common Topics:
102 |
103 |     * Does the kid have pets
104 |     * Describe a day at school/general questions at school
105 |     * What will the kid do after the visit
106 |     * The kid's hobbies
107 |
108 | * UWG examiner's questions:
109 |
110 |     * Does the kid have any hobbies
111 |     * How did the kid get into (kid's hobby)
112 |     * Has the kid done anything related to (kid's hobby) recently?
113 |     * "Can you tell me about a time when you felt bullied or picked on or treated unfairly?"
114 |     * Does the kid have siblings?
115 |     * (If so) What's nice about being a sibling?
116 |     * (If so) What's hard about being a sibling?
117 |     * (If so) What's unique about being a sibling?
118 |
119 | # Emotions
120 |
121 | * Relaxed/content is optional; it's there in case ending on sad is too depressing. (The manual states that if the first two responses are excellent, the rest are all optional.)
122 |
123 | * Cues:
124 |
125 |     * So now I just have some questions about feelings kids have
126 |
127 | # Social Difficulties and Annoyance
128 |
129 | * Cues:
130 |
131 |     * Have you ever had problems getting along with people at school?
132 |
133 | # Break
134 |
135 | **Not transcribed**
136 |
137 | * There's a bunch of things for the kid to play with including a radio. If you hear the radio on it's probably Break.
138 | * The examiner approaches the kid at the end. There will often be some conversation at the end of Break, which is currently not transcribed.
139 |
140 | * Cues:
141 |
142 |     * Now I have some stuff to write down here so I need to take a little break.
143 |     * Now I need to take a break...
144 |
145 | # Friends and Marriage
146 |
147 | * Cues:
148 |
149 |     * Do you have some friends? Tell me about them.
150 |
151 | # Loneliness
152 |
153 | * This section is almost always much shorter than the others. There are a lot fewer prompts to it.
154 |
155 | * Cues:
156 |
157 |     * Do you ever feel lonely?
158 |
159 | # Creating a Story
160 |
161 | **Transcription status TBD**
162 |
163 | * The examiner pulls five things out of a box and makes a story with them. Then the kid is asked to do the same.
164 | * Since this is the last activity, there's often other stuff at the end of it, e.g. the kid packing up, getting a sticker, etc.
165 | * Cues:
166 |     * One last thing...
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription/adhd.md:
--------------------------------------------------------------------------------
1 | # ID conventions
2 | Most files from ADHD have the prefix ADHD. A few have the prefix ADHD02, which is for follow-up ADOSes administered later.
3 |
4 | The ID number after the prefix is a subject ID. The last digit is almost always 1 because it's used to track siblings in the study; if it's greater than 1, then there are multiple siblings in the study sharing the rest of the ID. This doesn't affect anything, it's just a fun fact.
5 |
6 | The software used to record the ADHD ADOSes and send them to us requires that they use a pre-existing ID to record the file. Sometimes this causes issues when they can't connect to our server to make a new ID. To deal with this, they have pre-designed IDs to temporarily save files until they can get the file linked to the proper ID number. This is what the ADHDNA prefix is for. It's also the reason behind odd ID numbers like 99999.
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription/ados.md:
--------------------------------------------------------------------------------
1 | The main things we transcribe are examiner-child interactions during a semi-structured, standardized, play-based assessment instrument called the ADOS-2, or the Autism Diagnostic Observation Schedule (Second Edition).
The ADOS is the “gold standard” for observational assessment of autism spectrum disorders (ASDs).
2 |
3 | The ADOS-2 includes 5 modules, each requiring 40 to 60 minutes to administer. The individual being evaluated is given only one module, selected on the basis of his or her expressive language level and chronological age. The 5 modules are:
4 |
5 | * Toddler Module -- for ambulatory children between 12 and 30 months of age who do not consistently use phrase speech
6 | * Module 1 -- for children 31 months and older who do not consistently use phrase speech
7 | * Module 2 -- for children of any age who use phrase speech but are not verbally fluent
8 | * Module 3 -- for verbally fluent children and young adolescents (i.e., complex utterances)
9 | * Module 4 -- for verbally fluent older adolescents and adults
10 |
11 | We mainly transcribe the ADOS-2 Module 3 assessments. As implied from the phrasing, subjects can stay on modules 1 and 2 if their speech doesn't become fluent as they get older. Typically we only transcribe module 3 because module 1 and 2 administrations don't have enough speech data, and our sources don't tend to use module 4 even for older teenagers.
12 |
13 | Modules 3 and 4 are supposed to be conducted without a parent in the room if the subject is over 6. Sometimes parents end up sitting in because the kid isn't comfortable without them, but these administrations are then considered nonstandard and can't be included in a lot of our analyses. So if the parents come in, the file probably shouldn't be transcribed (or should be transcribed last).
14 |
15 | Some module 3 administrations will switch to module 2 because the kid isn't using enough complex language to be eligible for a module 3. These are low priority to transcribe unless instructed otherwise.
16 |
17 | The most detailed information can be found in the ADOS-2 manual, a large spiral-bound book that should be found with the other assessment manuals, on the little set of shelves between the mail cubbies and the admin office in GH40.
18 |
19 | # ADOS Observations
20 |
21 | All CSLU transcribers are asked to observe an ADOS Module 3 administration at OHSU's Child Development & Rehabilitation Center early in the training process. Email [faculty member name here] to set up a date/time to observe. We also encourage experienced transcribers to observe an ADOS in person every once in a while!
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription/audio2.md:
--------------------------------------------------------------------------------
1 | If you're just transcribing, you should be getting your audio from `language/ADOS/audio`. You should probably only pull from audio2 if someone has told you to.
2 |
3 | The full address of audio2 is `language/autism/audio2`. It contains folders for [OGI](./Erpa) and [FNL](./Fnl) audio (and some other things). Note that there are other assessments than the ADOS that are conducted for these kids, and not all folders will even have an ADOS -- many will have VerbalFluency or some other test.
4 |
5 | **An odd but important note: the "date modified" of a file is actually meaningful because it is set by the upload software. Do not touch -- i.e. literally use the `touch` command or similar -- the files.**
6 |
7 | Recordings in audio2 are often not as neat and clean as the recordings that have already been vetted and moved into language. It may be easier to use the Mac "Connect to Server" interface than to browse via Terminal, so that you can easily preview each file to find the ones you want.
8 |
9 | There will often be multiple recordings of the ADOS.
There are typically two mics used to record each session, so there will often be two recordings of the same size that differ in the second to last digit, e.g.:
10 |
11 | ```
12 | OHSU-12345-1-ADOS-0-0-0-1-0.wav
13 | OHSU-12345-1-ADOS-0-0-0-2-0.wav
14 | ```
15 |
16 | The one from mic number 2 is often clearer (it is intended to be the mic positioned nearer to the child), but you should double check this.
17 |
18 | Also, there may be multiple files of different sizes that differ in the digit between the ID and the ADOS, e.g.
19 |
20 | ```
21 | OHSU-12345-0-ADOS-0-0-0-1-0.wav
22 | OHSU-12345-1-ADOS-0-0-0-1-0.wav
23 | ```
24 |
25 | (Actually twice this because there'll be one per mic as well).
26 |
27 | These are different sound files. Sometimes the ADOS went long and was split into two recordings, which should be concatenated before they're put into language (`sox` is good for this). Sometimes, however, the short recording is a test recording, which shouldn't be combined with the real ADOS. The only way to be sure is to check both files.
28 |
29 | When copying files into language, make sure to set permissions so that others have read access to the files you upload, or else other transcribers won't be allowed to download them.
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription/bash.md:
--------------------------------------------------------------------------------
1 | You will likely want to use bash at least for some retrieving/storing files on the server, if not for other work at CSLU. It's a good idea to get used to bash in general ([here](http://wellformedness.com/courses/CS606-RP/PDFs/L1-2-UNIX-environment.pdf) is past CSLU prof Kyle Gorman's basic bash guide, and there are innumerable others out there) but here are a few specific tricks you may find useful.
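
As a first small example of the kind of trick meant here, this is a minimal sketch for picking apart audio2-style file names without opening (or modifying) the files. The function names `mic_of` and `part_of` are made up for illustration and are not part of any CSLU script. The sketch assumes names follow exactly the pattern shown on the audio2 page (prefix, ID, part digit, `ADOS`, then trailing fields with the mic number second to last) and that the prefix and ID contain no hyphens of their own.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (names invented for this sketch) for file names like
#   OHSU-12345-1-ADOS-0-0-0-2-0.wav
# Assumption: fields are hyphen-separated, the part digit is the third field,
# and the mic number is the second-to-last field.

# Print the mic number of a recording.
mic_of() {
  local base="${1##*/}"   # strip any leading directory
  base="${base%.wav}"     # strip the extension
  local IFS='-'
  local -a fields
  read -r -a fields <<< "$base"
  echo "${fields[${#fields[@]}-2]}"
}

# Print the part digit (the field between the ID and the task name).
part_of() {
  local base="${1##*/}"
  base="${base%.wav}"
  local IFS='-'
  local -a fields
  read -r -a fields <<< "$base"
  echo "${fields[2]}"
}

# Example: summarize two of the names used on the audio2 page.
for f in OHSU-12345-0-ADOS-0-0-0-1-0.wav OHSU-12345-1-ADOS-0-0-0-2-0.wav; do
  echo "$f: part $(part_of "$f"), mic $(mic_of "$f")"
done
```

Because this only reads the names, it leaves the files and their "date modified" alone. It does not tell you whether a short recording is a test or a continuation; you still have to listen to both.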
2 |
3 | # Connecting to the Server Easily
4 |
5 | Many of these assume you're on MacOS as that's what most of the work machines are, but they can be easily adapted to Linux and even Windows (with newer versions or Cygwin). This also assumes you're working from the command line.
6 |
7 | ## ssh/config
8 |
9 | Edit the file `~/.ssh/config` to make aliases for servers, making logging in easy. As an example, here's how to set up asd. Open the file (it's ok if it didn't exist before) and
10 | add:
11 |
12 | ```
13 | Host asd
14 |     User yourusername
15 |     HostName asd.cslu.ohsu.edu
16 | ```
17 |
18 | Now, instead of typing `ssh yourusername@asd.cslu.ohsu.edu`, you can just type `ssh asd`! You can make the alias (what you type after `Host`) whatever is easiest for you to remember and type, and you can add aliases for other servers to the end of the file.
19 |
20 | This also works when specifying paths; for example, this is why our example `rsync` code begins with just `language...`
21 |
22 | ## Connecting via SSH key
23 |
24 | You can add an SSH key to allow a given computer to connect to a server without requiring your password every time. This works for servers like asd as well as for Gitlab and other services. There are lots of tutorials on how to do this on the web, but here's a quick one.
25 |
26 | If you think you may have done this before, type `cat ~/.ssh/id_rsa.pub`. This will try to print to the Terminal screen whatever is in the file that would hold your public key. If it prints a long string of characters starting with `ssh-rsa` onto the screen, you do have one.
27 |
28 | If you don't have a key yet, type `ssh-keygen`.
29 | Hit return when prompted for a path; this will put it in the default place, which will make later configuration easier. You can optionally associate a passphrase with your key, but you don't have to.
30 |
31 | Once you have a key (new or not), you can use the nifty `ssh-copy-id` command to copy your key onto the remote server.
Type `ssh-copy-id yourusername@asd.cslu.ohsu.edu` (or just `ssh-copy-id asd` if you've got the alias set up already). It will prompt you for your password for asd one more time. You're set up! With this in place you don't need to enter your password to use utilities like `ssh`, `scp`, and `rsync` to the server. You will need to repeat this process if you want to connect to the server from another machine, or to another server (like the BigBirds).
32 |
33 | # Text Editing
34 |
35 | While you are on the command line, the text editor you should obviously use is *insert holy war here*.
36 |
37 | Now that that's out of the way, on Mac [TextMate](http://macromates.com) is a useful local editor that can be set up to launch from the server, allowing you to edit files in a handy GUI from your local machine. It also provides syntax coloring and lots of other nifty features. For Windows folks, it's similar to Notepad++. (While the site includes payment information for those who choose to purchase a license, version 2.0 is open source and the currently available prebuilt binaries work without a license key. In other words: you can just use it.)
38 |
39 | Info about setting up ssh tunneling with TextMate can be found [here](http://blog.macromates.com/2011/mate-and-rmate/)
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription/cleaning.md:
--------------------------------------------------------------------------------
1 | [Audacity](http://www.audacityteam.org/home/) is an excellent free tool for audio manipulation which can be useful for cleaning and otherwise manipulating files to make them easier to transcribe. There are plenty of tutorials online for Audacity, but here are a couple things you may want to do.
2 |
3 | Note that Audacity works with its own file format, .aup, by default, so when you load in the wav file it will need to import it, and when you're done you will need to export rather than save the file.
4 |
5 | # File too quiet
6 |
7 | If the entire file is too quiet, you can amplify it in Audacity using `Effect>Amplify...` Make sure `Allow clipping` is _not_ checked to avoid introducing distortion.
8 |
9 | # File has static
10 |
11 | This is helpful only for relatively constant, unchanging background noise such as static, a fan in the background, or a tone throughout the recording. It will not help with dynamic background noise such as a kid frequently banging on the table or people talking in the hallway. Results are mixed for line noise/corruption that gets a file marked "bad audio".
12 |
13 | *Disclaimer:* Audacity has tools to remove/reduce background noise, _but_ they are not perfect. ELAN can handle two audio files simultaneously associated with the same transcript. If you clean a file this way it's recommended that you associate _both_ files with your transcript. Use `Edit>Linked Files...` in ELAN to add the second sound file, then use the Controls tab in Annotation or Segmentation mode to control which file is being played by muting the other file. (It may otherwise try to play both at once, which is odd and echoey and unhelpful.) Check with both files to make sure that speech isn't lost or distorted by the noise removal process, especially when segmenting.
14 |
15 | To reduce a constant background noise throughout the entire file, select a few seconds of just noise (where neither the examiner nor the child is speaking, and ideally there also aren't significant background noises from toys, feet, etc), then go to `Effect>Noise Reduction` and click `Get Noise Profile`. Then select the entire file, open `Effect>Noise Reduction` again, and click `OK`.
If too much or too little was filtered you can revert the change and change the Step 2 settings for how much noise is filtered.
--------------------------------------------------------------------------------
/02-protocols/example/transcription.wiki/transcription/elan.md:
--------------------------------------------------------------------------------
1 | The suggested tool for transcription is [ELAN](https://tla.mpi.nl/tools/tla-tools/elan/), a free professional grade tool for transcription and annotation made by the Max Planck Institute. Versions for MacOS, Windows, and Linux can be downloaded from the link above.
2 |
3 | One of the great advantages to ELAN is that it is very customizable; each transcriber will likely find their own specific set of shortcuts that are most comfortable to them. There are many guides to ELAN online. For example, the official handbook is [here](http://www.mpi.nl/corpus/manuals/manual-elan.pdf) (pdf) while a guide from UPenn with some useful suggested shortcuts is [here](http://fave.ling.upenn.edu/downloads/ELAN_Introduction.pdf) (pdf). Most non-official guides will include some stuff specific to their lab's workflow, so here's ours.
4 |
5 | ELAN can be kind of buggy. We recommend that you configure the autosave (as described below). If you have trouble with the beta, feel free to use the last stable version instead.
6 |
7 | # Autosave
8 |
9 | This gets its own section! ELAN has an autosave, but _it is not on by default_ (R.I. discovered this exactly the way you'd think). Go to `File>Automatic Backup` and select the frequency with which you want it to autosave. The autosave will be in another file with 001 appended to the end of the filename.
10 |
11 | # Overview of Modes
12 |
13 | This is an overview of the three modes relevant to our transcriptions here.
The other two, Media Synchronization and Interlinearization, are used for other types of annotation (multiple media types in the first case and multiple tiers per speaker in the second).
14 |
15 | ## Segmentation Mode
16 |
17 | This mode is typically where you want to start. It allows you to mark off segments of speech in different tiers, as well as to change the segment times and merge segments, but not to enter text in the annotations.
18 |
19 | ## Transcription Mode
20 |
21 | This mode presents you with a list of all the segments you've made in other modes. It is meant for entering transcription text quickly and easily.
22 |
23 | Hint: the colors assigned to the different tiers seem to be random. Sometimes it'll assign very similar colors to two tiers; the right-click menu gives you the option to change them (per transcription).
24 |
25 | ## Annotation Mode
26 |
27 | You can segment and transcribe in this mode. It is possible to do your entire workflow in this mode; however, this is not recommended because it involves using a lot of typing chords and can be very rough on your hands. This mode can be useful for correcting/tweaking annotations, for example adding a note in the Comments tier about a sound overlapping the child's speech, or extending an annotation that accidentally ended before the utterance did.
28 |
29 | # Setting Up
30 |
31 | When you have your audio file, select `File>New...` and pick the audio file, then click OK to create your transcription.
32 |
33 | For your first transcription, you'll see a Default tier. Go to `Tier>Add New Tier...` and use the options there to remove or rename Default and add other tiers until your tiers are the four required in the transcription guidelines (Child, Examiner, Activity, Comments). You don't need to fill out any other information in the tiers.
34 |
35 | Once you have your tier information, you can select `File>Save As Template...` to create a template to store your tier information.
For subsequent transcriptions you can add this in the new file dialogue under `Add Template File`, and the tiers will automatically be in the right order.
36 |
37 | If you move the audio file before you are done with your transcription, you will have to tell ELAN where to find it again.
38 |
39 | # Suggested Workflow
40 |
41 | There are probably as many workflows as transcribers. However, we all follow the same basic segmentation > transcription order. This section lists some options; find what works for you. Definitely pay attention to ergonomics, as transcription can be hard on the hands if you're doing it for long stretches.
42 |
43 | Note that ELAN allows you to change the rate at which it plays back audio. Slowing down can be helpful for hearing fast speech or finding difficult speech boundaries; speeding up can help with broad activity segmentation. Play around with it and find what works for you.
44 |
45 | ## Activity Segmentation
46 |
47 | You may want to segment activities across the file first. This is most easily done in segmentation mode. One way is to use the `One keystroke per annotation (adjacent annotations)` option, which automatically makes the long annotations connected to each other (though it can be a little annoying if you get off by a bit.) Another method is to make short annotations at the beginning of each section, make another short annotation next to it at the end of the last section, then click on the ending annotation and select `Merge with Annotation Before`. A third option is to segment the file with one long activity annotation then as you listen to the audio, split the annotation at the appropriate points between activities.
48 |
49 | ## Segmentation
50 |
51 | You might want to segment the whole file before transcribing, or you might want to go section by section. It's up to you.
52 |
53 | Segmentation mode is typically the easiest here. You can configure shortcuts to switch tiers (configured by default in the beta).
You can segment audio while the audio is playing, or you can pause and mark off segments. You can also click and drag parts of the annotation to fix the timestamps, though only when the annotation's tier is highlighted. Make sure that overlapping speech gets overlapping annotations. 54 | 55 | ## Transcription 56 | 57 | At some point you will have segmented enough audio that you want to transcribe it. Switch to Transcription Mode. (The first time, you will have to select a "type"; the only choice will probably be "default-lt", which is the right one.) You can go down the list of segments and each will play the associated audio. Tab will play the audio again, or start/stop it, to help with transcribing tricky parts. 58 | 59 | You may find that some of the segmentation wasn't quite right, or that you missed an annotation. It's easiest to fix these in either Segmentation or Annotation Mode, your choice. Also, if one end of the audio sounds cut off or odd, it's highly recommended that you go back to one of the other modes and listen around the start/end of the annotation -- it's very easy to mess up on one side, and Transcription Mode won't play that audio for you. If you want to split an utterance in Annotation Mode, note that the right click option `Split Annotation` makes the split at the cursor location. 60 | 61 | ## Checking 62 | 63 | Once you've transcribed the whole file, you will probably want to check your work. There are many ways to do this and you can find the one that works for you. 64 | 65 | You can click on the activity segment and let the whole section play while you scroll down the line, following along with your transcription. This will help you find any audio you missed. Alternatively, you can go down the file in transcription mode listening to each annotation; this may miss un-segmented audio but will help you see if any of your annotations are misaligned. 
66 | 67 | When you're done with your work, you're ready to [export and upload](transcription/formatting-uploading). 68 | 69 | # Importing TextGrids 70 | 71 | Since all work is saved in [TextGrid](transcription/textgrids) format you may need to import TextGrid files into ELAN, for example if you need to transcribe an un-transcribed section of a pre-existing transcription. Select `Import>Praat TextGrid File...` and browse for the TextGrid. Make sure `Skip empty intervals/annotations` is _checked_ or else you will have a bunch of empty annotations between the meaningful ones. Leave everything else as default and click through the dialogue. 72 | 73 | If you're checking against the audio file, you will have to re-link the audio file manually after you've imported the TextGrid. Once you've imported the TextGrid go to `Edit>Linked Files...`, select `Add...` then select your sound file and click Apply. Then the two should be lined up properly. -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/transcription/erpa.md: -------------------------------------------------------------------------------- 1 | ERPA, Expressive and Receptive Prosody in Autism, was the old study done at CSLU long ago. The prefix for ERPA is OGI-. ERPA has been fully transcribed for quite some time; you'll only need to interact with it if someone specifically asks you to find or fix something. 2 | 3 | # Guidelines 4 | 5 | ERPA was transcribed according to an older form of the guidelines that used significantly different conventions from the ones used now. These guidelines are more like traditional SALT guidelines and include a lot more error marking and grammar marking as well as long comments. If you need to do something in an ERPA file it's worth reading the original guidelines to make sure you follow that style. 6 | 7 | # Chronological 8 | 9 | ERPA files, unlike the others, are saved as _chronological_ textgrids. 
If you want to edit them in ELAN you have to load them into Praat first and export to non-chronological format before importing them into ELAN. Similarly, when you finish you have to load them into Praat and convert them back to chronological. -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/transcription/formatting-uploading.md: -------------------------------------------------------------------------------- 1 | # Saving 2 | 3 | All transcriptions are saved as Praat [TextGrids](transcription/textgrids). UW and FNL are non-chronological, ERPA is chronological. To export a non-chronological TextGrid from ELAN: 4 | 5 | * `File>Export As>Praat TextGrid...` 6 | 7 | Naming conventions are the same as the .wav file you transcribed, with the .TextGrid suffix instead of .wav. FNL files just have an ID number; UW Estes files have an ID number and a visit number because it's a longitudinal study! 8 | 9 | # Checking 10 | 11 | Use the scripts from Kyle Gorman's ados-scripts, which are hosted here in the folder /ados-scripts/. In Terminal, put the TextGrid in the ados-scripts folder (or change addresses accordingly) and do: 12 | 13 | ``` 14 | ./serialize.py your-file-here.TextGrid 15 | ./validate.rb your-file-here.txt 16 | ``` 17 | 18 | This will check your file for common typos and syntax errors (omitted punctuation, mismatched brackets, and the like) and give you feedback in Terminal. When you've fixed any errors (remember to re-export the TextGrid!) `validate.rb` will give no output. Then you're good to upload! Upload the TextGrid, not the .txt file -- the latter is just to make it easier on the validation script. 19 | 20 | In the rare case where you had to do something very weird with a transcription file, such as adding an extra tier for another speaker, the script will likely throw errors at you about it. 
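The two-step check above can be wrapped in a small helper. This is a minimal sketch, not part of ados-scripts: the function names `check_commands` and `run_check` are hypothetical, and it assumes (as the example commands suggest) that `serialize.py` writes a `.txt` file with the same base name as the TextGrid, and that the TextGrid sits in the scripts folder.

```python
import subprocess
from pathlib import Path


def check_commands(textgrid_path):
    """Build the serialize/validate command lines for one TextGrid.

    Assumption: serialize.py produces a .txt file with the same base
    name as its input, as in the example commands above.
    """
    textgrid = Path(textgrid_path)
    if textgrid.suffix != ".TextGrid":
        raise ValueError(f"expected a .TextGrid file, got {textgrid.name}")
    serialized = textgrid.with_suffix(".txt")
    return [
        ["./serialize.py", str(textgrid)],
        ["./validate.rb", str(serialized)],
    ]


def run_check(textgrid_path, scripts_dir="."):
    """Run both steps from scripts_dir and return whatever validate.rb printed.

    An empty string means validate.rb gave no output, i.e. the file is
    ready to upload.
    """
    serialize_cmd, validate_cmd = check_commands(textgrid_path)
    # Stop early if serialization itself fails.
    subprocess.run(serialize_cmd, cwd=scripts_dir, check=True)
    result = subprocess.run(
        validate_cmd, cwd=scripts_dir, capture_output=True, text=True
    )
    return (result.stdout + result.stderr).strip()
```

Calling `run_check("your-file-here.TextGrid", scripts_dir="ados-scripts")` after each re-export gives a quick yes/no: anything other than an empty string is feedback to fix in ELAN.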
21 | 22 | Tip: If you need to make changes in ELAN, after your last change in Transcription Mode click over to another annotation before you export. Sometimes if you change an annotation but don't click another annotation, the exported file won't have the change you made. 23 | 24 | # Uploading 25 | 26 | Transcriptions go on the `asd` server. The locations are as follows: 27 | 28 | * FNL: `asd:language/FNL/ADOS/transcripts/TextGrid` 29 | * UW Estes: `asd:language/UW/ADOS/transcripts/TextGrid` 30 | * UW GENDAAR: `asd:language/UW_GENDAAR/transcripts/` 31 | 32 | You can connect to server and drag the files in, or use your preferred Terminal file upload utility. 33 | 34 | Sample code using rsync: 35 | 36 | ``` 37 | rsync -avP ./your-file-here.TextGrid asd:language/FNL/ADOS/transcripts/TextGrid 38 | ``` 39 | 40 | or using scp: 41 | ``` 42 | scp your-file-here.TextGrid yourusername@asd:/home/language/FNL/ADOS/transcripts/TextGrid/ 43 | ``` 44 | 45 | for an FNL file. -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/transcription/guidelines.md: -------------------------------------------------------------------------------- 1 | Here are the current transcription guidelines: [SALT Conventions 2017] 2 | (https://repo.cslu.ohsu.edu/language-outcomes/transcription/blob/master/guidelines/SALT_Conventions_2017.pdf) 3 | 4 | What follows is a collection of weird examples that aren't covered in the guidelines, and don't necessarily merit their own section in the official document. This page should be used as a supplement to, but not in place of, the formal guidelines. 5 | 6 | # Content Note 7 | Sometimes what the kids talk about can be fairly distressing: intense bullying, trouble with their parents, violent tendencies. 
Be aware that there is a protocol for addressing suspected child abuse. Generally the examiner will try to finish the ADOS as per usual, but then at the end of their meeting may ask the child more pointed questions about it. Often this wouldn't be on the audio file that we receive, but that doesn't mean the examiner wasn't worried and didn't ask! The ADOS examiners are mandatory reporters and will follow up with anything they find concerning. 8 | 9 | # C-Unit Segmentation 10 | If a speaker says a laundry list of items of identical phrase type (i.e., they are on equal tiers of a syntax tree), the items should be segmented as separate C-units. For example, "I mean like chips or apples or something" should be transcribed as "I mean like chips. Or apples. Or something." whereas "I mean like chips or something or apples" should be transcribed as "I mean like chips or something. Or apples." This is a confusing distinction, and looking up syntax trees or discussing it with a linguist can help a lot. 11 | 12 | 13 | # Spelling 14 | For consistency, if one of these phrases comes up, transcribe it as it appears here even if that's different from your own intuition. 15 | * P_B_and_J (connecting words are not capitalized) 16 | * Calvin_and_Hobbes 17 | * As/Bs/Cs/Ds/Fs (for grades) 18 | * Four-hundred-and-something 19 | * Nine-and-a-half 20 | * Xbox_Three_Sixty 21 | * Big_Island (when talking about Hawaii) 22 | * Ann_Arbor, Michigan (place names follow the format City, State) 23 | * Valentines_Day (no apostrophe) 24 | * super-frog (often said in the context of Tuesday) 25 | * scooch (as in "scooch up your chair a bit") 26 | * Sometimes the indefinite article "a" is hard to distinguish from the filled pause "uh". There's no magic rule to clarify this--but don't worry, it's not just you! People mumble. It sucks. 27 | * Don't mark g-dropping (e.g.
"tryin'", "doin'") 28 | 29 | Generally beware of [eye dialect](https://en.wikipedia.org/wiki/Eye_dialect), or the use of nonstandard spellings for normal phonological processes. 30 | 31 | If a child mispronounces a proper noun, transcribe it as it sounds, then place the correct spelling in square brackets. E.g. "Poke_Place_Market[=Pike_Place_Market]" or "A_F_O_Schwarz[=F_A_O_Schwarz]". 32 | 33 | # Hyphenation 34 | Cake types always count as two words for the purposes of MLU: adjective and noun. If the cake type adjective is two words, then it must be hyphenated. Chocolate cake, carrot cake, white cake, birthday cake, wedding cake, pound cake, sponge cake; but red-velvet cake, key-lime pie, Boston-creme pie, blueberry-chocolate cake, black-forest cake. 35 | 36 | Note that "cheesecake" is an exception. 37 | 38 | # Media 39 | 40 | A lot of kids will talk about games, TV shows, movies, etc. that they like. If they mention an identifiable name of something you're not familiar with, it can help a lot to do a quick Google search for that thing; a lot of weird names/words will be much easier to recognize if you've seen them in print first. 41 | 42 | # Play Sounds 43 | 44 | The .ps marking covers a variety of sounds the kid might make during play. The same sound acoustically might be marked as .ns or another tag if used outside the play context; for example, the kid having one of the action figures yell in a fight would be .ps[child yell], whereas the kid yelling while the examiner is talking would be .ns[child yell]. 45 | 46 | The line between play sounds and sound effects is indeed a bit fuzzy. 47 | 48 | # Letter Sequences 49 | 50 | The rules for letter sequences are to be used when the speaker actually pronounces each letter separately; letter sequences that are pronounced like words are spelled as proper nouns. E.g. (for most speakers) Nasa vs the N_S_A. 
51 | 52 | # Bracket Capitals 53 | 54 | The _only_ notation in brackets that gets all caps is the "sounds like" notation. E.g. "It was a XX[sounds like GERMAN_SHEPHERD]" vs "It was a German_Shep*[=Shepherd]" 55 | 56 | # Repeated Activities 57 | 58 | Sometimes an activity will happen twice. For example, the examiner may return to an unfinished activity later in the ADOS to complete it. Mark each of them with the activity name as normal, and make a note of it in the spreadsheet. This includes multiple breaks (usually one is the Break activity and any others are breaks for food or medication) -- in this case, label them each as Break. -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/transcription/textgrids.md: -------------------------------------------------------------------------------- 1 | **NB:** this is more information than you will need if you are just transcribing. If you are manipulating or using TextGrids in other ways, it might be helpful. 2 | 3 | TextGrids come in two flavors: object-oriented and chronological. If you open a chronological TextGrid in a text editor, you will see that the annotations are ordered by their start time in the file. Object-oriented TextGrids order annotations by tier, so in our ADOSes it would be all the Child annotations first, then Examiner, etc. This difference is not visible when opening them in Praat, but some scripts might require one type or the other. As noted on the [Formatting and Uploading](transcription/formatting-uploading) page, all current projects ask that you save them as the default, object-oriented type. Elan doesn't even have an export option for chronological TextGrids, so you don't have to worry about messing this up. 4 | 5 | There is an NLTK TextGrid module called `textgrid.py`. 
It can be found on the server at 6 | 7 | ``` 8 | asd:/home/language/ERPA/ADOS/transcripts/Scripts/textgrid.py 9 | ``` 10 | 11 | The contents are well-documented within. Two useful methods it contains are to_chron() and to_oo(). As you might guess, you can use these methods to convert easily between the two TextGrid types (another way would be to open the TextGrid in Praat and save it as the other type). `oo_to_chron.py` and `chron_to_oo.py` in that same directory take an input TextGrid and convert it to the other type. -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/transcription/tracking.md: -------------------------------------------------------------------------------- 1 | We use a service called Airtable to keep track of transcription progress. It's basically a spreadsheet with more detailed options than a traditional spreadsheet. The [tracker is located here](https://airtable.com). 2 | 3 | # The Simple Version 4 | 5 | Find a file to transcribe. Put your name and today's date under date started. If there's anything to note about the file put it under "transcriber notes". Transcribe! When you're done and have uploaded it, mark the date you finish and select "complete" in the status column for that file. 6 | 7 | ## Reasons Not to Transcribe a File (Yet) 8 | 9 | * If the audio_location is not "ASD server", then we don't actually have an audio file associated with that ID/visit. Entries without a file are for completeness' sake so we know what happened to that ID. These we always mark as "non-transcribable" in the status column, since they can never be transcribed. 10 | * If it already says "complete" in the status column, someone's already done that one. 11 | * By default we only transcribe module 3. ADHD and UWG are almost all module 3, so the module field is blank -- these are fine to transcribe. 
However, _if_ there's an entry in "module" that is a number _other_ than 3 then don't transcribe it unless specifically asked. (There's discussion about eventually including module 2s as well. Module 4 would presumably be transcribable, but we never get them.) We don't mark these as non-transcribable in the status column, since they technically could be transcribed--instead we leave it blank. 12 | * Most audio_flags are reasons that we shouldn't transcribe the tape yet. The nonstandard administrations mean we can't use the data for most results reporting, so those files shouldn't be transcribed while we have standard administrations left. Incomplete tapes might still have enough to transcribe, and "bad audio" can mean anything from untranscribably corrupted to just hard to hear. (Those tapes often will give you a headache, though.) We don't automatically mark these as non-transcribable in the status column, since many of them could technically be transcribed when we run out of standard administrations. If you do try to transcribe one and there is reason to think it could never be transcribed--e.g., because the audio is so bad--then mark it as "non-transcribable" in the status column. 13 | 14 | # Column Guide 15 | 16 | ## file_name 17 | Self-explanatory -- formatted as the prefix plus ID, so it'll leave off the "ADOS.wav" parts. This uniquely identifies files from UWG and ADHD; UWL also needs visit_number. 18 | 19 | ## corpus 20 | Which corpus it came from: ADHD, UWL, UWG. 21 | 22 | ## visit_number 23 | This is for UWL which was longitudinal. It tracks the different visits (1, 2, 3, 4) for each kid, because each kid has up to 4 files. Visit 1 didn't include any Module 3s so Visit 1 tapes are pretty much never going to be transcribed. For UWL you need both the file_name and visit_number to identify the wav file/textgrid that corresponds to the row. 24 | 25 | ## status 26 | A drop-down representing which files have a complete transcription up on the asd server! 
This option can be blank (meaning it hasn't been transcribed yet), complete (meaning it's transcribed and on the asd server) or non-transcribable (meaning the audio location is not on the asd server, or the file is non-transcribable for another reason, as described above). This helps keep track of what's done and gives transcribers a sense of accomplishment after finishing each file. Also, some of the old UW files for timepoint 4 don't have information for transcriber or transcription dates, so this is the most reliable way to see if files are done or not. 27 | 28 | ## audio_location 29 | Either the audio is on the asd server where you can download it, or something happened to it such that we don't have it. 30 | This keeps track of that. From a transcriber's point of view, if it's on "ASD Server" then it's available, otherwise it isn't. 31 | 32 | ## transcriber 33 | This keeps track of which transcriber transcribed which file. 34 | 35 | ## date_started 36 | The date a transcriber started work on the file. 37 | 38 | ## date_finished 39 | The date a transcriber finished work on the file. 40 | 41 | ## ados_module 42 | If it's not filled out, assume 3. For studies where we have multiple modules (mostly UWL) this tracks which module each file is. A few ADHD files switch to module 2 halfway through and these are marked here too. 43 | 44 | ## audio_flags 45 | Flags for common issues with tape, reasons audio can't be transcribed, or reasons why we don't have a tape at all. If a tape turns out to be untranscribable for the reasons listed flag it so other transcribers don't have to repeat your check. 46 | 47 | ## transcriber_notes 48 | A place for relevant notes about the file -- whatever you think would be relevant to analysis. The most common notes are if activities are out of order or if some activities are missing. This is also the place to put more details about why a file is untranscribable if you feel it would be helpful. 
49 | 50 | For a while it wasn't normal to see Conversation and Reporting, so some of the first tapes to have it are marked. You don't have to mark if a tape has C&R (or not); it's now considered normal either way. 51 | 52 | If you have to concatenate two audio files when [bringing over audio from audio2](./audio2) that is often mentioned here. 53 | 54 | Audio problems that affect much or all of a file should get some marking here. If marking substantial background or line noise in a file, timestamps should be in seconds. (This is for noise that persists through most of the file; short periods are marked only in the transcript itself.) If the kid is regularly too quiet for the mic to pick up or so loud the mic is overloading, those are often noted here as well. 55 | 56 | Low-verbal or less verbal are just tags transcriptionists include for kids who take module 3 but really don't talk much/display much language. They don't reflect anything clinical and are just a transcriber intuition thing. 57 | 58 | If there are two examiners (definitely not a parent sitting in as described above) that's typically marked here, too. 59 | 60 | 61 | ## additional_notes 62 | These are for notes that came from the lab that gave us the tapes; some of them are outdated e.g. "may be coming" for files we now have. Kept for historical reasons. 63 | 64 | 65 | ## Conversation and Reporting 66 | 67 | There was a long issue where we should have been transcribing Conversation and Reporting but we weren't. Every transcript that has a C&R section as of August 2017 that wasn't transcribed has been marked here as "Not yet transcribed". One of the ongoing projects is to go through those transcripts, transcribe C&R, then upload the full transcript and change this marking to "Transcribed". If nothing is marked here you can assume the file is fine. For new transcripts going forward, transcribe C&R like any other activity and don't mark anything here. 
Once they're all transcribed we can forget about this whole column. 68 | 69 | # Airtable Hints 70 | 71 | * The dropdown list options have an autocomplete, so you can start typing e.g. your name in the transcriber column and then click to auto-complete it 72 | * You can do a lot of sorting and filtering to find the transcripts you want. For example, one of the views will show you the "To do" to easily choose a new transcription file. 73 | * Sometimes one of the columns will hide itself; look for a slider that says "Drag to adjust the number of frozen columns" and drag it a bit to find the hidden column. This is a mystery. -------------------------------------------------------------------------------- /02-protocols/example/transcription.wiki/transcription/uw.md: -------------------------------------------------------------------------------- 1 | There are two UW studies we have data from, UWL and UWG. UWL is pretty much transcribed, UWG is a work in progress. 2 | 3 | # UWL 4 | 5 | This was a longitudinal study. The files will have the format UWL-###-ADOS-#.wav . The first set of three numbers is an ID number which stays consistent across timepoints. The last number between 1 and 4 is the timepoint. The kids were roughly 5, 8, 11, 14 across the timepoints. (**this may not be right, I'm pretty sure about spacing but not absolute ages!***) Many kids dropped out at different points, which is noted on the AirTable spreadsheet. 6 | 7 | Time point 1 isn't usable because all the kids were too young for module 3, and only some of the kids ever made it to module 3. Also, a lot of time point 1 has the parent interview in the background while the kid plays. There are a bunch of kids who came in just for time point 4 and only have one file; they tend to have the highest ID numbers. 
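The `UWL-###-ADOS-#.wav` naming convention described above can be checked mechanically. The following is a rough sketch rather than an existing lab script: `UWL_PATTERN`, `UWLFile`, and `parse_uwl_name` are hypothetical names, and the pattern assumes exactly three ID digits and a time point of 1 to 4, as stated above. It also accepts `.TextGrid`, since transcripts take the name of the `.wav` file they were made from.

```python
import re
from typing import NamedTuple

# UWL-###-ADOS-#.wav: three-digit subject ID, then a time point from 1 to 4.
# Assumption: .TextGrid files follow the same pattern as their .wav files.
UWL_PATTERN = re.compile(r"^UWL-(\d{3})-ADOS-([1-4])\.(wav|TextGrid)$")


class UWLFile(NamedTuple):
    subject_id: str  # kept as a string so leading zeros survive
    time_point: int


def parse_uwl_name(file_name):
    """Return the subject ID and time point, or None if the name doesn't match."""
    match = UWL_PATTERN.match(file_name)
    if match is None:
        return None
    return UWLFile(subject_id=match.group(1), time_point=int(match.group(2)))
```

For example, `parse_uwl_name("UWL-042-ADOS-3.wav")` yields subject ID `"042"` at time point 3, while a name with a time point outside 1 to 4 returns `None`, which makes misnamed files easy to spot before uploading.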
8 | 9 | Many of the time point 4 transcripts were made before the current generation of transcribers, and the record-keeping was less strict in those days so some of the information about who transcribed which files and when has been lost forever. That's why there are gaps on the spreadsheet about time point 4. 10 | 11 | Time point 1-3 recordings came off of VHS tapes shipped to us by UW and digitized at CSLU. Some of the tapes were lost or damaged, or didn't contain an ADOS, and so those weren't transferred. Time point 4 files came on DVD and are therefore a bit higher quality. 12 | 13 | # UWG 14 | 15 | This study is currently being transcribed. Files will have the format ###.03-ADOS.wav . -------------------------------------------------------------------------------- /02-protocols/example/transcription/activities.md: -------------------------------------------------------------------------------- 1 | This page is for more specific details about ADOS activities that tend to come up a lot. Also check out the actual ADOS manual/scoring guidelines for directions on how the instructors are taught to do the task. 
2 | 3 | 4 | | ADOS task | ADOS task number | Transcription activity label | Cue words/questions | Transcribe | 5 | |--------------------------------------------|------------------|-------------------------------------------------|------------------------------------------------------------------------------|------------| 6 | | Construction Task | 1 | Construction Task | Beginning of audio | N | 7 | | Make Believe Play & Joint Interactive Play | 2&3 | Play | Loud background noise of toys being brought out; "I have some toys for you" | Y | 8 | | Demonstration Task | 4 | Demonstration Task | "I want you to pretend like I don't know how to brush my teeth" | N | 9 | | Description of a Picture | 5 | Picture Description | "Let's look at this picture" | Y | 10 | | Telling a Story from a Book | 6 | Wordless Picture Book | "Let's look at this book" | Y | 11 | | Cartoons | 7 | Cartoons | "I've got some cartoons here" | Y | 12 | | Conversation and Reporting | 8 | Conversation and Reporting | (Various and optional) | Y | 13 | | Emotions | 9 | xEmotions Conversation | "What do you like doing that makes you feel happy and cheerful?" | Y | 14 | | Social Difficulties and Annoyance | 10 | xSocial Difficulties and Annoyance Conversation | "Have you ever had problems getting along with people at school?" | Y | 15 | | Break | 11 | Break | "We're going to take a quick break" | N | 16 | | Friends, Relationships and Marriage | 12 | xFriends and Marriage Conversation | "Do you have some friends?" | Y | 17 | | Loneliness | 13 | xLoneliness Conversation | "Do you ever feel lonely?" | Y | 18 | | Creating a Story | 14 | Creating a Story | "One last thing" | N | 19 | 20 | # Construction Task 21 | 22 | **Not transcribed** 23 | 24 | * This is the first task. The examiner will ask the kid to put some blocks on a picture. At the end they might ask the kid what the shape looks like to them. 
25 | 26 | * Cues: 27 | 28 | * [start of the tape] 29 | 30 | # Play 31 | 32 | * The sound of taking out the toys is usually the biggest cue to play. 33 | * The most common set of toys has three action figures (two male one female), a dinosaur, and a collection of stuff: fire-truck, hot-dog, chocolate bar, miniature CD-type disc, etc. The toy set for FNL has the wrong sword for the female action figure and it looks notably mis-sized. 34 | * There's an alternate set of toys that have a family including a baby. Some administrations use these instead, usually when a Module-3 administration occurs with a very young child (3,4, maybe 5). 35 | * They are supposed to introduce the characters as "wrestler, superhero, warrior, and their things." 36 | * The task is actually in two parts. First the kid plays on their own then the examiner asks to join. 37 | * Some kids really just make sound effects, particularly in the first part. 38 | * This is probably the most commonly refused task. Older kids will sometimes do a very short one, too. 39 | 40 | * Cues: 41 | 42 | * So I've got some things here. 43 | * So I've got like a ... space guy, army guy, warrior princess, and their pet dinosaur, and here are some of their things. 44 | * I've got some toys for you to play with. 45 | 46 | # Demonstration Task 47 | 48 | **Not transcribed** 49 | 50 | * This task is often framed as the examiner being an alien who needs to be taught how to brush their (shockingly human-like) teeth. 51 | 52 | * Cues: 53 | 54 | * So now I'd like to play a different kind of pretend game. 55 | 56 | # Picture Description 57 | 58 | * There are three pictures they might use. 59 | 60 | * The most common is a novelty map of the United States with different cartoons showing stuff iconic for each state -- landmarks like the Space Needle and the Golden Gate Bridge as well as activities and objects like a gambler for Vegas and an oil well for Texas. 
The examiner will typically try to start a conversation about vacations the kid's been on and/or where the kid is from. 61 | 62 | * The second most common is a picture of people doing different activities at a beach resort. This one will also typically segue into a chat about vacations. 63 | 64 | * The rarest (used more in ERPA) is a picture of a Thanksgiving feast. The examiner might ask how the kid celebrates Thanksgiving. (This is usually for Module 2.) 65 | 66 | * Cues: 67 | 68 | * Now I've got a picture here... 69 | 70 | # Wordless Picture Book 71 | 72 | * There are three different books, but it's almost always the first. 73 | * The main book is _Tuesday_ by David Wiesner. Pictures from it can be found online. It's a story about a night where frogs spontaneously start to fly on their lily-pads and have adventures around a town. At the end we see that pigs fly the next week. The examiner may ask the kid if they've heard the phrase "when pigs fly" and if they know what it means. (This is not part of a standard administration.) 74 | * An alternative book is _Free Fall_, also by David Wiesner, which is a story about a kid daydreaming. They often offer the kid the choice between this book and _Tuesday_, and the kids tend to go for the frogs. 75 | * A few files will use _Good Night, Gorilla_ by Peggy Rathmann. This is often a sign that the kid is less verbal and that they might switch to the ADOS Module 2, so check the rest of the file if you see it! (It is also used with very young children who are appropriate for a Module 3: ages 3, 4, maybe 5.) 76 | 77 | * Cues: 78 | 79 | * Now I've got this book here... 80 | 81 | # Cartoons 82 | 83 | * There are three different cartoons. FNL tends to use the fisherman; UW often uses the others. 84 | 85 | * A fisherman and his cat are fishing. The fisherman puts the fish into the bucket. The cat steals it and puts it in what looks like his own bucket, but is actually the beak of a pelican. The pelican flies away and the cat is angry.
86 | * A monkey is in a tree. He drops coconuts out of the tree and they are stolen by another monkey. The tree monkey waits for the thief monkey to come back and drops a coconut on his head. 87 | * [UW-Estes only, timepoint 2 & 3] A Spy vs Spy cartoon involving using a "peanut potion" to lure elephants. The examiner may not recognize it and may describe the spies as mosquitoes. (I have never seen this cartoon; it does not come with the ADOS-2 kit.) 88 | 89 | * The reason they ask the kids to stand up and take their hands out of their pockets is actually to see if they'll spontaneously gesture as they tell the story. 90 | 91 | * Cues: 92 | 93 | * Now I've got some cartoons... 94 | 95 | # Conversation and Reporting 96 | 97 | * This gets a long description in the main guidelines because it's hard to define. The main point is that it's for when the examiner initiates a conversation that's not really related to the last activity. It is generally used when there has not been enough spontaneous conversation and reporting once the examiner gets to this section. If plenty has already occurred, it is generally not used as a stand-alone "activity." 98 | 99 | * One of the examiners for UW GENDAAR often conducts C&R as the very last activity after Creating a Story, so make sure to check the very end of the file! 100 | 101 | * Common Topics: 102 | 103 | * Does the kid have pets 104 | * Describe a day at school/general questions about school 105 | * What will the kid do after the visit 106 | * The kid's hobbies 107 | 108 | * UWG examiner's questions: 109 | 110 | * Does the kid have any hobbies 111 | * How did the kid get into (kid's hobby) 112 | * Has the kid done anything related to (kid's hobby) recently? 113 | * "Can you tell me about a time when you felt bullied or picked on or treated unfairly?" 114 | * Does the kid have siblings? 115 | * (If so) What's nice about being a sibling? 116 | * (If so) What's hard about being a sibling?
117 | * (If so) What's unique about being a sibling? 118 | 119 | # Emotions 120 | 121 | * Relaxed/content is optional; it's there in case ending on sad is too depressing. (The manual states that if the first two responses are excellent, the rest are all optional.) 122 | 123 | * Cues: 124 | 125 | * So now I just have some questions about feelings kids have 126 | 127 | # Social Difficulties and Annoyance 128 | 129 | * Cues: 130 | 131 | * Have you ever had problems getting along with people at school? 132 | 133 | # Break 134 | 135 | **Not transcribed** 136 | 137 | * There's a bunch of things for the kid to play with, including a radio. If you hear the radio on, it's probably Break. 138 | * The examiner approaches the kid at the end. There will often be some conversation at the end of Break, which is currently not transcribed. 139 | 140 | * Cues: 141 | 142 | * Now I have some stuff to write down here so I need to take a little break. 143 | * Now I need to take a break... 144 | 145 | # Friends and Marriage 146 | 147 | * Cues: 148 | 149 | * Do you have some friends? Tell me about them. 150 | 151 | # Loneliness 152 | 153 | * This section is almost always much shorter than the others. It has far fewer prompts. 154 | 155 | * Cues: 156 | 157 | * Do you ever feel lonely? 158 | 159 | # Creating a Story 160 | 161 | **Transcription status TBD** 162 | 163 | * The examiner pulls five things out of a box and makes a story with them. Then the kid is asked to do the same. 164 | * Since this is the last activity, there's often other stuff at the end of it, e.g. the kid packing up, getting a sticker, etc. 165 | * Cues: 166 | * One last thing... -------------------------------------------------------------------------------- /02-protocols/example/transcription/adhd.md: -------------------------------------------------------------------------------- 1 | # ID conventions 2 | Most files from ADHD have the prefix ADHD.
A few have the prefix ADHD02, which is for follow-up ADOSes administered later. 3 | 4 | The ID number after the prefix is a subject ID. The last digit is almost always 1 because it's used to track siblings in the study; if it's greater than 1, then there's multiple siblings in the study sharing the rest of the ID. This doesn't affect anything, it's just a fun fact. 5 | 6 | The software used to record the ADHD ADOSes and send them to us requires that they use a pre-existing ID to record the file. Sometimes this causes issues when they can't connect to our server to make a new ID. To deal with this, they have pre-designed IDs to temporarily save files until they can get the file linked to the proper ID number. This is what the ADHDNA prefix is for. It's also the reason behind odd ID numbers like 99999. -------------------------------------------------------------------------------- /02-protocols/example/transcription/ados.md: -------------------------------------------------------------------------------- 1 | The main thing we transcribe are examiner-child interactions during a semi-structured, standardized, play-based assessment instrument called the ADOS-2, or the Autism Diagnostic Observation Schedule (Second Edition). The ADOS is the “gold standard” for observational assessment of autism spectrum disorders (ASDs). 2 | 3 | The ADOS-2 includes 5 modules, each requiring 40 to 60 minutes to administer. The individual being evaluated is given only one module, selected on the basis of his or her expressive language level and chronological age. 
The 5 modules are: 4 | 5 | * Toddler Module—for ambulatory children between 12 and 30 months of age who do not consistently use phrase speech 6 | * Module 1—for children 31 months and older who do not consistently use phrase speech 7 | * Module 2—for children of any age who use phrase speech but are not verbally fluent 8 | * Module 3—for verbally fluent children and young adolescents (i.e., complex utterances) 9 | * Module 4—for verbally fluent older adolescents and adults 10 | 11 | We mainly transcribe the ADOS-2 Module 3 assessments. As implied from the phrasing, subjects can stay on modules 1 and 2 if their speech doesn't become fluent as they get older. Typically we only transcribe module 3 because module 1 and 2 administrations don't have enough speech data, and our sources don't tend to use module 4 even for older teenagers. 12 | 13 | Modules 3 and 4 are supposed to be conducted without a parent in the room if the subject is over 6. Sometimes parents end up sitting in because the kid isn't comfortable without them, but these administrations are then considered nonstandard and can't be included in a lot of our analyses. So if the parents come in, the file probably shouldn't be transcribed (or should be transcribed last). 14 | 15 | Some module 3 administrations will switch to module 2 because the kid isn't using enough complex language to be eligible for a module 3. These are low priority to transcribe unless instructed otherwise. 16 | 17 | The most detailed information can be found in the ADOS-2 manual, a large spiral-bound book that should be found with the other assessment manuals, on the little set of shelves between the mail cubbies and the admin office in GH40. 18 | 19 | # ADOS Observations 20 | 21 | All CSLU transcribers are asked to observe an ADOS Module 3 administration at OHSU's Child Development & Rehabilitation Center early in the training process. Email [faculty member name here] to set up a date/time to observe. 
We also encourage experienced transcribers to observe an ADOS in person every once in a while! -------------------------------------------------------------------------------- /02-protocols/example/transcription/audio2.md: -------------------------------------------------------------------------------- 1 | If you're just transcribing, you should be getting your audio from `language/ADOS/audio`. You should probably only pull from audio2 if someone has told you to. 2 | 3 | The full address of audio2 is `language/autism/audio2`. It contains folders for [OGI](./Erpa) and [FNL](./Fnl) audio (and some other things). Note that there are other assessments than the ADOS that are conducted for these kids, and not all folders will even have an ADOS -- many will have VerbalFluency or some other test. 4 | 5 | **An odd but important note: the "date modified" of a file is actually meaningful because it is set by the upload software. Do not touch -- i.e. literally use the `touch` command or similar -- the files.** 6 | 7 | Recordings in audio2 are often not as neat and clean as the recordings that have already been vetted and moved into language. It may be easier to use the Mac "Connect to Server" interface than to browse via Terminal, so that you can easily preview each file to find the ones you want. 8 | 9 | There will often be multiple recordings of the ADOS. There are typically two mics used to record each session, so there will often be two recordings of the same size that differ in the second to last digit, e.g.: 10 | 11 | ``` 12 | OHSU-12345-1-ADOS-0-0-0-1-0.wav 13 | OHSU-12345-1-ADOS-0-0-0-2-0.wav 14 | ``` 15 | 16 | The one from mic number 2 is often clearer (it is intended to be the mic positioned nearer to the child), but you should double check this. 17 | 18 | Also, there may be multiple files of different sizes that differ in the digit between the ID and the ADOS, e.g. 
19 | 20 | ``` 21 | OHSU-12345-0-ADOS-0-0-0-1-0.wav 22 | OHSU-12345-1-ADOS-0-0-0-1-0.wav 23 | ```` 24 | 25 | (Actually twice this because there'll be one per mic as well). 26 | 27 | These are different sound files. Sometimes the ADOS went long and was split into two recordings, which should be concatenated before they're put into language (`sox` is good for this). Sometimes, however, the short recording is a test recording, which shouldn't be combined with the real ADOS. The only way to be sure is to check both files. 28 | 29 | When copying files into language, make sure to set permissions so that others have read access to the files you upload, or else other transcribers won't be allowed to download them. -------------------------------------------------------------------------------- /02-protocols/example/transcription/bash.md: -------------------------------------------------------------------------------- 1 | You will likely want to use bash at least for some retrieving/storing files on the server, if not for other work at CSLU. It's a good idea to get used to bash in general ([here](http://wellformedness.com/courses/CS606-RP/PDFs/L1-2-UNIX-environment.pdf) is past CSLU prof Kyle Gorman's basic bash guide, and there are innumerable others out there) but here are a few specific tricks you may find useful. 2 | 3 | # Connecting to the Server Easily 4 | 5 | Many of these assume you're on MacOS as that's what most of the work machines are, but they can be easily adapted to Linux and even Windows (with newer versions or Cygwin). This also assumes you're working from the command line. 6 | 7 | ## ssh/config 8 | 9 | Edit the file ~/.ssh/config to make aliases for servers, making logging in easy. As an example, here's how to set up asd. 
Open the file (it's ok if it didn't exist before) and 10 | add: 11 | 12 | ``` 13 | Host asd 14 | User yourusername 15 | HostName asd.ohsu.edu 16 | ``` 17 | 18 | Now, instead of typing `ssh yourusername@asd.ohsu.edu`, you can just type `ssh asd`! You can make the alias (what you type after `Host`) whatever is easiest for you to remember and type, and you can add aliases for other servers to the end of the file. 19 | 20 | This also works when specifying paths; for example, this is why our example `rsync` code begins with just `language...` 21 | 22 | ## Connecting via SSH key 23 | 24 | You can add an SSH key to allow a given computer to connect to a server without requiring your password every time. This works for servers like ASD as well as for GitLab and other services. There are lots of tutorials on how to do this on the web, but here's a quick one. 25 | 26 | If you think you may have done this before, type `cat ~/.ssh/id_rsa.pub`. This will try to print to the Terminal screen whatever is in the file that would hold your public key. If it prints a long string of characters starting with `ssh-rsa` onto the screen, you do have one. 27 | 28 | If you don't have a key yet, type `ssh-keygen`. 29 | Hit return when prompted for a path; this will put it in the default place, which will make later configuration easier. You can optionally associate a passphrase with your key, but you don't have to. 30 | 31 | Once you have a key (new or not), you can use the nifty `ssh-copy-id` utility to copy your public key onto the remote server. Type `ssh-copy-id yourusername@asd.cslu.ohsu.edu` (or just `ssh-copy-id asd` if you've got the alias set up already). It will prompt you for your password for asd one more time. You're set up! With this in place you don't need to enter your password to use utilities like `ssh`, `scp`, and `rsync` to the server. You will need to repeat this process if you want to connect to the server from another machine, or to another server (like the BigBirds).
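
The alias and key steps in this section can be sketched as a small script. This is a minimal sketch, not lab tooling: `add_ssh_alias` is a hypothetical helper name, and the alias, username, and hostname in the usage comment are just the placeholder values from the config example above. It only appends a `Host` block if the alias is not already present, so it is safe to run more than once.

```bash
#!/usr/bin/env bash
# Sketch only: add_ssh_alias is a hypothetical helper, not an existing lab script.

# Append a Host block to an ssh config file unless the alias is already defined.
# Usage: add_ssh_alias CONFIG_FILE ALIAS USER HOSTNAME
add_ssh_alias() {
    local config="$1" name="$2" user="$3" hostname="$4"
    # Make sure the config file (and its directory) exist before searching it.
    mkdir -p "$(dirname "$config")"
    touch "$config"
    # Do nothing if a "Host <alias>" line is already in the file.
    if grep -qE "^[[:space:]]*Host[[:space:]]+${name}[[:space:]]*$" "$config"; then
        echo "alias '${name}' already present in ${config}"
        return 0
    fi
    printf '\nHost %s\n    User %s\n    HostName %s\n' "$name" "$user" "$hostname" >> "$config"
    echo "added alias '${name}' to ${config}"
}

# Example usage (commented out so nothing runs by accident):
#   add_ssh_alias ~/.ssh/config asd yourusername asd.ohsu.edu
#   [ -f ~/.ssh/id_rsa.pub ] || ssh-keygen   # create a key only if none exists
#   ssh-copy-id asd                          # asks for your asd password once
```

The key-related commands are left as comments because they are interactive and contact the server; run them by hand once the alias is in place.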
32 | 33 | # Text Editing 34 | 35 | While you are on the command line, the text editor you should obviously use is *insert holy war here*. 36 | 37 | Now that that's out of the way, on Mac [TextMate](http://macromates.com) is a useful local editor that can be set up to launch from the server, allowing you to edit files in a handy GUI from your local machine. It also provides syntax coloring and lots of other nifty features. For Windows folks, it's similar to notepad++. (While the site includes payment information for those who choose to purchase a license, version 2.0 is open source and the currently available prebuilt binaries work without a license key. In other words: you can just use it.) 38 | 39 | Info about setting up ssh tunneling with TextMate can be found [here](http://blog.macromates.com/2011/mate-and-rmate/) -------------------------------------------------------------------------------- /02-protocols/example/transcription/cleaning.md: -------------------------------------------------------------------------------- 1 | [Audacity](http://www.audacityteam.org/home/) is an excellent free tool for audio manipulation which can be useful for cleaning and otherwise manipulating files to make them easier to transcribe. There's plenty of tutorials online for Audacity, but here are a couple things you may want to do. 2 | 3 | Note that Audacity works with its own file format, .aup, by default, so when you load in the wav file it will need to import it, and when you're done you will need to export rather than save the file. 4 | 5 | # File too quiet 6 | 7 | If the entire file is too quiet, you can amplify it in Audacity using `Effect>Amplify...` Make sure `Allow clipping` is _not_ checked to avoid introducing distortion. 8 | 9 | # File has static 10 | 11 | This is helpful only for relatively constant, unchanging background noise such as static, a fan in the background, or a tone throughout the recording. 
It will not help with dynamic background noise such as a kid frequently banging on the table or people talking in the hallway. Results are mixed for line noise/corruption that gets a file marked "bad audio". 12 | 13 | *Disclaimer* Audacity has tools to remove/reduce background noise, _but_ they are not perfect. ELAN can handle two audio files simultaneously associated with the same transcript. If you clean a file this way it's recommended that you associate _both_ files with your transcript. Use `Edit>Linked Files...` in ELAN to add the second sound file, then use the Controls tab in Annotation or Segmentation mode to control which file is being played by muting the other file. (It may otherwise try to play both at once, which is odd and echoey and unhelpful.) Check with both files to make sure that speech isn't lost or distorted by the noise removal process, especially when segmenting. 14 | 15 | To reduce a constant background noise throughout the entire file, first select a few seconds of just noise (where neither the examiner nor the child is speaking, and ideally there also aren't significant background noises from toys, feet, etc.), then go to `Effect>Noise Reduction` and click `Get Noise Profile`. Next, select the entire file, open `Effect>Noise Reduction` again, and click `OK`. If too much or too little was filtered, you can undo the change and adjust the Step 2 settings that control how much noise is filtered. -------------------------------------------------------------------------------- /02-protocols/example/transcription/elan.md: -------------------------------------------------------------------------------- 1 | The suggested tool for transcription is [ELAN](https://tla.mpi.nl/tools/tla-tools/elan/), a free professional grade tool for transcription and annotation made by the Max Planck Institute. Versions for MacOS, Windows, and Linux can be downloaded from the link above.
2 | 3 | One of the great advantages to ELAN is that it is very customizable; each transcriber will likely find their own specific set of shortcuts that are most comfortable to them. There are many guides to ELAN online. For example, the official handbook is [here](http://www.mpi.nl/corpus/manuals/manual-elan.pdf) (pdf) while a guide from UPenn with some useful suggested shortcuts is [here](http://fave.ling.upenn.edu/downloads/ELAN_Introduction.pdf) (pdf). Most non-official guides will include some stuff specific to their lab's workflow, so here's ours. 4 | 5 | ELAN can be kind of buggy. We recommend that you configure the autosave (as described below). If you have trouble with the beta, feel free to use the last stable version instead. 6 | 7 | # Autosave 8 | 9 | This gets its own section! ELAN has an autosave, but _it is not on by default_ (R.I. discovered this exactly the way you'd think). Go to `File>Automatic Backup` and select the frequency with which you want it to autosave. The autosave will be in another file with 001 appended to the end of the filename. 10 | 11 | # Overview of Modes 12 | 13 | This is an overview of the three modes relevant to our transcriptions here. The other two, Media Synchronization and Interlinearization, are used for other types of annotation (multiple media types in the first case and multiple tiers per speaker in the second). 14 | 15 | ## Segmentation Mode 16 | 17 | This mode is typically where you want to start. It allows you to mark off segments of speech in different tiers, as well as to change the segment times and merge segments, but not to enter text in the annotations. 18 | 19 | ## Transcription Mode 20 | 21 | This mode presents you with a list of all the segments you've made in other modes. It is meant for entering transcription text quickly and easily. 22 | 23 | Hint: the colors assigned to the different tiers seem to be random. 
Sometimes it'll assign very similar colors to two tiers; the right-click menu gives you the option to change them (per transcription). 24 | 25 | ## Annotation Mode 26 | 27 | You can segment and transcribe in this mode. It is possible to do your entire workflow in this mode, however this is not recommended because it involves using a lot of typing chords and can be very rough on your hands. This mode can be useful for correcting/tweaking annotations, for example adding a note in the Comments tier about a sound overlapping the child's speech, or extending an annotation that accidentally ended before the utterance did. 28 | 29 | # Setting Up 30 | 31 | When you have your audio file, select `File>New...` and pick the audio file, then click OK to create your transcription. 32 | 33 | For your first transcription, you'll see a Default tier. Go to `Tier>Add New Tier...` and use the options there to remove or rename Default and add other tiers until your tiers are the four required in the transcription guidelines (Child, Examiner, Activity, Comments). You don't need to fill out any other information in the tiers. 34 | 35 | Once you have your tier information, you can select `File>Save As Template...` to create a template to store your tier information. For subsequent transcriptions you can add this in the new file dialogue under `Add Template File`, and the tiers will automatically be in the right order. 36 | 37 | If you move the audio file before you are done with your transcription, you will have to tell ELAN where to find it again. 38 | 39 | # Suggested Workflow 40 | 41 | There are probably as many workflows as transcribers. However, we all follow the same basic segmentation > transcription order. This section lists some options; find what works for you. Definitely pay attention to ergonomics, as transcription can be hard on the hands if you're doing it for long stretches. 42 | 43 | Note that ELAN allows you to change the rate at which it plays back audio. 
Slowing down can be helpful for hearing fast speech or finding difficult speech boundaries; speeding up can help with broad activity segmentation. Play around with it and find what works for you. 44 | 45 | ## Activity Segmentation 46 | 47 | You may want to segment activities across the file first. This is most easily done in segmentation mode. One way is to use the `One keystroke per annotation (adjacent annotations)` option, which automatically makes the long annotations connected to each other (though it can be a little annoying if you get off by a bit.) Another method is to make short annotations at the beginning of each section, make another short annotation next to it at the end of the last section, then click on the ending annotation and select `Merge with Annotation Before`. A third option is to segment the file with one long activity annotation then as you listen to the audio, split the annotation at the appropriate points between activities. 48 | 49 | ## Segmentation 50 | 51 | You might want to segment the whole file before transcribing, or you might want to go section by section. It's up to you. 52 | 53 | Segmentation mode is typically the easiest here. You can configure shortcuts to switch tiers (configured by default in the beta). You can segment audio while the audio is playing, or you can pause and mark off segments. You can also click and drag parts of the annotation to fix the timestamps, though only when the annotation's tier is highlighted. Make sure that overlapping speech gets overlapping annotations. 54 | 55 | ## Transcription 56 | 57 | At some point you will have segmented enough audio that you want to transcribe it. Switch to Transcription Mode. (The first time, you will have to select a "type"; the only choice will probably be "default-lt", which is the right one.) You can go down the list of segments and each will play the associated audio. Tab will play the audio again, or start/stop it, to help with transcribing tricky parts. 
58 | 59 | You may find that some of the segmentation wasn't quite right, or that you missed an annotation. It's easiest to fix these in either Segmentation or Annotation Mode, your choice. Also, if one end of the audio sounds cut off or odd, it's highly recommended that you go back to one of the other modes and listen around the start/end of the annotation -- it's very easy to mess up on one side, and Transcription Mode won't play that audio for you. If you want to split an utterance in Annotation Mode, note that the right click option `Split Annotation` makes the split at the cursor location. 60 | 61 | ## Checking 62 | 63 | Once you've transcribed the whole file, you will probably want to check your work. There are many ways to do this and you can find the one that works for you. 64 | 65 | You can click on the activity segment and let the whole section play while you scroll down the line, following along with your transcription. This will help you find any audio you missed. Alternatively, you can go down the file in transcription mode listening to each annotation; this may miss un-segmented audio but will help you see if any of your annotations are misaligned. 66 | 67 | When you're done with your work, you're ready to [export and upload](transcription/formatting-uploading). 68 | 69 | # Importing TextGrids 70 | 71 | Since all work is saved in [TextGrid](transcription/textgrids) format you may need to import TextGrid files into ELAN, for example if you need to transcribe an un-transcribed section of a pre-existing transcription. Select `Import>Praat TextGrid File...` and browse for the TextGrid. Make sure `Skip empty intervals/annotations` is _checked_ or else you will have a bunch of empty annotations between the meaningful ones. Leave everything else as default and click through the dialogue. 72 | 73 | If you're checking against the audio file, you will have to re-link the audio file manually after you've imported the TextGrid. 
Once you've imported the TextGrid go to `Edit>Linked Files...`, select `Add...` then select your sound file and click Apply. Then the two should be lined up properly. -------------------------------------------------------------------------------- /02-protocols/example/transcription/erpa.md: -------------------------------------------------------------------------------- 1 | ERPA, Expressive and Receptive Prosody in Autism, was the old study done at CSLU long ago. The prefix for ERPA is OGI-. ERPA has been fully transcribed for quite some time; you'll only need to interact with it if someone specifically asks you to find or fix something. 2 | 3 | # Guidelines 4 | 5 | ERPA was transcribed according to an older form of the guidelines that used significantly different conventions from the ones used now. These guidelines are more like traditional SALT guidelines and include a lot more error marking and grammar marking as well as long comments. If you need to do something in an ERPA file it's worth reading the original guidelines to make sure you follow that style. 6 | 7 | # Chronological 8 | 9 | ERPA files, unlike the others, are saved as _chronological_ textgrids. If you want to edit them in ELAN you have to load them into Praat first and export to non-chronological format before importing them into ELAN. Similarly, when you finish you have to load them into Praat and convert them back to chronological. -------------------------------------------------------------------------------- /02-protocols/example/transcription/formatting-uploading.md: -------------------------------------------------------------------------------- 1 | # Saving 2 | 3 | All transcriptions are saved as Praat [TextGrids](transcription/textgrids). UW and FNL are non-chronological, ERPA is chronological. 
To export a non-chronological TextGrid from ELAN: 4 | 5 | * `File>Export As>Praat TextGrid...` 6 | 7 | Naming conventions are the same as the .wav file you transcribed, with the .TextGrid suffix instead of .wav. FNL files just have an ID number; UW Estes files have an ID number and a visit number because it's a longitudinal study! 8 | 9 | # Checking 10 | 11 | Use the scripts from Kyle Gorman's ados-scripts, which are hosted here in the folder /ados-scripts/. In Terminal, put the TextGrid in the ados-scripts folder (or change addresses accordingly) and do: 12 | 13 | ``` 14 | ./serialize.py your-file-here.TextGrid 15 | ./validate.rb your-file-here.txt 16 | ``` 17 | 18 | This will check your file for common typos and syntax errors (omitted punctuation, mismatched brackets, and the like) and give you feedback in Terminal. When you've fixed any errors (remember to re-export the TextGrid!) `validate.rb` will give no output. Then you're good to upload! Upload the TextGrid, not the .txt file -- the latter is just to make it easier on the validation script. 19 | 20 | In the rare case where you had to do something very weird with a transcription file, such as adding an extra tier for another speaker, the script will likely throw errors at you about it. 21 | 22 | Tip: If you need to make changes in ELAN, after your last change in Transcription Mode click over to another annotation before you export. Sometimes if you change an annotation but don't click another annotation, the exported file won't have the change you made. 23 | 24 | # Uploading 25 | 26 | Transcriptions go on the `asd` server. The locations are as follows: 27 | 28 | * FNL: `asd:language/FNL/ADOS/transcripts/TextGrid` 29 | * UW Estes: `asd:language/UW/ADOS/transcripts/TextGrid` 30 | * UW GENDAAR: `asd:language/UW_GENDAAR/transcripts/` 31 | 32 | You can connect to server and drag the files in, or use your preferred Terminal file upload utility. 
33 | 34 | Sample code using rsync: 35 | 36 | ``` 37 | rsync -avP ./your-file-here.TextGrid asd:language/FNL/ADOS/transcripts/TextGrid 38 | ``` 39 | 40 | or using scp: 41 | ``` 42 | scp your-file-here.TextGrid yourusername@asd:/home/language/FNL/ADOS/transcripts/TextGrid/ 43 | ``` 44 | 45 | for an FNL file. -------------------------------------------------------------------------------- /02-protocols/example/transcription/guidelines.md: -------------------------------------------------------------------------------- 1 | Here are the current transcription guidelines: [SALT Conventions 2017](https://repo.cslu.ohsu.edu/language-outcomes/transcription/blob/master/guidelines/SALT_Conventions_2017.pdf) 2 | 3 | 4 | What follows is a collection of weird examples that aren't covered in the guidelines, and don't necessarily merit their own section in the official document. This page should be used as a supplement to, but not in place of, the formal guidelines. 5 | 6 | # Content Note 7 | Sometimes what the kids talk about can be fairly distressing: intense bullying, trouble with their parents, violent tendencies. Be aware that there is a protocol for addressing suspected child abuse: generally the examiner will try to finish the ADOS as per usual, but then at the end of their meeting may ask the child more pointed questions about it. Often this wouldn't be on the audio file that we receive, but that doesn't mean the examiner wasn't worried and didn't ask! The ADOS examiners are mandatory reporters and will follow up with anything they find concerning. 8 | 9 | # C-Unit Segmentation 10 | If a speaker says a laundry list of three items of identical phrase type (i.e., they are at equal tiers on a syntax tree), the items should be segmented as separate C-units. For example, "I mean like chips or apples or something" should be transcribed as "I mean like chips. Or apples. Or something." whereas "I mean like chips or something or apples" should be transcribed "I mean like chips or something.
Or apples." This is a confusing distinction, and looking up syntax trees or discussing it with a linguist can help a lot. 11 | 12 | 13 | # Spelling 14 | For consistency, if one of these phrases comes up, transcribe it as it appears here, even if that differs from your own intuition. 15 | * P_B_and_J (connecting words are not capitalized) 16 | * Calvin_and_Hobbes 17 | * As/Bs/Cs/Ds/Fs (for grades) 18 | * Four-hundred-and-something 19 | * Nine-and-a-half 20 | * Xbox_Three_Sixty 21 | * Big_Island (when talking about Hawaii) 22 | * Ann_Arbor, Michigan (place names follow the format City, State) 23 | * Valentines_Day (no apostrophe) 24 | * super-frog (often said in context of Tuesday) 25 | * scooch (as in "scooch up your chair a bit") 26 | * Sometimes the indefinite article "a" is hard to distinguish from the filled pause "uh". There's no magic rule to clarify this--but don't worry, it's not just you! People mumble. It sucks. 27 | * Don't mark g-dropping (e.g. "tryin'", "doin'") 28 | 29 | Generally beware of [eye dialect](https://en.wikipedia.org/wiki/Eye_dialect), or the use of nonstandard spellings for normal phonological processes. 30 | 31 | If a child mispronounces a proper noun, transcribe it as it sounds, then place the correct spelling in square brackets. E.g. "Poke_Place_Market[=Pike_Place_Market]" or "A_F_O_Schwarz[=F_A_O_Schwarz]". 32 | 33 | # Hyphenation 34 | Cake types always count as two words for the purposes of MLU: adjective and noun. If the cake type adjective is two words, then it must be hyphenated. Chocolate cake, carrot cake, white cake, birthday cake, wedding cake, pound cake, sponge cake; but red-velvet cake, key-lime pie, Boston-creme pie, blueberry-chocolate cake, black-forest cake. 35 | 36 | Note that "cheesecake" is an exception. 37 | 38 | # Media 39 | 40 | A lot of kids will talk about games, TV shows, movies, etc. that they like.
If they mention an identifiable name of something you're not familiar with, it can help a lot to do a quick Google search for that thing; a lot of weird names/words will be much easier to recognize if you've seen them in print first. 41 | 42 | # Play Sounds 43 | 44 | The .ps marking covers a variety of sounds the kid might make during play. The same sound acoustically might be marked as .ns or another tag if used outside the play context; for example, the kid having one of the action figures yell in a fight would be .ps[child yell], whereas the kid yelling while the examiner is talking would be .ns[child yell]. 45 | 46 | The line between play sounds and sound effects is indeed a bit fuzzy. 47 | 48 | # Letter Sequences 49 | 50 | The rules for letter sequences are to be used when the speaker actually pronounces each letter separately; letter sequences that are pronounced like words are spelled as proper nouns. E.g. (for most speakers) Nasa vs the N_S_A. 51 | 52 | # Bracket Capitals 53 | 54 | The _only_ notation in brackets that gets all caps is the "sounds like" notation. E.g. "It was a XX[sounds like GERMAN_SHEPHERD]" vs "It was a German_Shep*[=Shepherd]" 55 | 56 | # Repeated Activities 57 | 58 | Sometimes an activity will happen twice. For example, the examiner may return to an unfinished activity later in the ADOS to complete it. Mark each of them with the activity name as normal, and make a note of it in the spreadsheet. This includes multiple breaks (usually one is the Break activity and any others are breaks for food or medication) -- in this case, label them each as Break. -------------------------------------------------------------------------------- /02-protocols/example/transcription/textgrids.md: -------------------------------------------------------------------------------- 1 | **NB:** this is more information than you will need if you are just transcribing. If you are manipulating or using TextGrids in other ways, it might be helpful. 
2 | 3 | TextGrids come in two flavors: object-oriented and chronological. If you open a chronological TextGrid in a text editor, you will see that the annotations are ordered by their start time in the file. Object-oriented TextGrids order annotations by tier, so in our ADOSes it would be all the Child annotations first, then Examiner, etc. This difference is not visible when opening them in Praat, but some scripts might require one type or the other. As noted on the [Formatting and Uploading](transcription/formatting-uploading) page, all current projects ask that you save them as the default, object-oriented type. ELAN doesn't even have an export option for chronological TextGrids, so you don't have to worry about messing this up. 4 | 5 | There is an NLTK TextGrid module called `textgrid.py`. It can be found on the server at 6 | 7 | ``` 8 | asd:/home/language/ERPA/ADOS/transcripts/Scripts/textgrid.py 9 | ``` 10 | 11 | The contents are well-documented within. Two useful methods it contains are `to_chron()` and `to_oo()`. As you might guess, you can use these methods to convert easily between the two TextGrid types (another way would be to open the TextGrid in Praat and save it as the other type). `oo_to_chron.py` and `chron_to_oo.py` in that same directory take an input TextGrid and convert it to the other type. -------------------------------------------------------------------------------- /02-protocols/example/transcription/tracking.md: -------------------------------------------------------------------------------- 1 | We use a service called Airtable to keep track of transcription progress. It's basically a spreadsheet with more detailed options than a traditional spreadsheet. The [tracker is located here](https://airtable.com). 2 | 3 | # The Simple Version 4 | 5 | Find a file to transcribe. Put your name and today's date under date started. If there's anything to note about the file put it under "transcriber notes". Transcribe!
When you're done and have uploaded it, mark the date you finish and select "complete" in the status column for that file. 6 | 7 | ## Reasons Not to Transcribe a File (Yet) 8 | 9 | * If the audio_location is not "ASD server", then we don't actually have an audio file associated with that ID/visit. Entries without a file are for completeness' sake so we know what happened to that ID. These we always mark as "non-transcribable" in the status column, since they can never be transcribed. 10 | * If it already says "complete" in the status column, someone's already done that one. 11 | * By default we only transcribe module 3. ADHD and UWG are almost all module 3, so the module field is blank -- these are fine to transcribe. However, _if_ there's an entry in "module" that is a number _other_ than 3 then don't transcribe it unless specifically asked. (There's discussion about eventually including module 2s as well. Module 4 would presumably be transcribable, but we never get them.) We don't mark these as non-transcribable in the status column, since they technically could be transcribed--instead we leave it blank. 12 | * Most audio_flags are reasons that we shouldn't transcribe the tape yet. The nonstandard administrations mean we can't use the data for most results reporting, so those files shouldn't be transcribed while we have standard administrations left. Incomplete tapes might still have enough to transcribe, and "bad audio" can mean anything from untranscribably corrupted to just hard to hear. (Those tapes often will give you a headache, though.) We don't automatically mark these as non-transcribable in the status column, since many of them could technically be transcribed when we run out of standard administrations. If you do try to transcribe one and there is reason to think it could never be transcribed--e.g., because the audio is so bad--then mark it as "non-transcribable" in the status column. 
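The checks in the list above can be sketched as a small function. This is only an illustration, assuming each tracker row has been exported as a plain Python dict keyed by the column names in the Column Guide (`status`, `audio_location`, `ados_module`, `audio_flags`). The function name `transcription_decision` and its return labels are hypothetical, and nothing here talks to Airtable itself.

```python
def transcription_decision(row):
    """Return 'transcribe', 'skip', or 'defer' for one tracker row.

    `row` is assumed to be a plain dict exported from the tracker;
    missing or empty fields are treated as blank.
    """
    status = (row.get("status") or "").strip().lower()
    location = (row.get("audio_location") or "").strip().lower()
    module = str(row.get("ados_module") or "").strip()
    flags = row.get("audio_flags") or []

    # No audio on the ASD server: there is nothing to transcribe.
    if location != "asd server":
        return "skip"
    # Already finished, or already judged non-transcribable.
    if status in ("complete", "non-transcribable"):
        return "skip"
    # A blank module is assumed to be 3; any other number needs a specific request.
    if module and module != "3":
        return "skip"
    # Flagged tapes wait until the standard administrations run out.
    if flags:
        return "defer"
    return "transcribe"
```

A row with `audio_location` set to "ASD server" and every other field blank comes back as `'transcribe'`, which matches the default case described above. Whether a deferred file is eventually worth attempting is still a human judgment call.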
13 | 14 | # Column Guide 15 | 16 | ## file_name 17 | Self-explanatory -- formatted as the prefix plus ID, so it'll leave off the "ADOS.wav" parts. This uniquely identifies files from UWG and ADHD; UWL also needs visit_number. 18 | 19 | ## corpus 20 | Which corpus it came from: ADHD, UWL, UWG. 21 | 22 | ## visit_number 23 | This is for UWL which was longitudinal. It tracks the different visits (1, 2, 3, 4) for each kid, because each kid has up to 4 files. Visit 1 didn't include any Module 3s so Visit 1 tapes are pretty much never going to be transcribed. For UWL you need both the file_name and visit_number to identify the wav file/textgrid that corresponds to the row. 24 | 25 | ## status 26 | A drop-down representing which files have a complete transcription up on the asd server! This option can be blank (meaning it hasn't been transcribed yet), complete (meaning it's transcribed and on the asd server) or non-transcribable (meaning the audio location is not on the asd server, or the file is non-transcribable for another reason, as described above). This helps keep track of what's done and gives transcribers a sense of accomplishment after finishing each file. Also, some of the old UW files for timepoint 4 don't have information for transcriber or transcription dates, so this is the most reliable way to see if files are done or not. 27 | 28 | ## audio_location 29 | Either the audio is on the asd server where you can download it, or something happened to it such that we don't have it. 30 | This keeps track of that. From a transcriber's point of view, if it's on "ASD Server" then it's available, otherwise it isn't. 31 | 32 | ## transcriber 33 | This keeps track of which transcriber transcribed which file. 34 | 35 | ## date_started 36 | The date a transcriber started work on the file. 37 | 38 | ## date_finished 39 | The date a transcriber finished work on the file. 40 | 41 | ## ados_module 42 | If it's not filled out, assume 3. 
For studies where we have multiple modules (mostly UWL) this tracks which module each file is. A few ADHD files switch to module 2 halfway through and these are marked here too. 43 | 44 | ## audio_flags 45 | Flags for common issues with tape, reasons audio can't be transcribed, or reasons why we don't have a tape at all. If a tape turns out to be untranscribable for the reasons listed flag it so other transcribers don't have to repeat your check. 46 | 47 | ## transcriber_notes 48 | A place for relevant notes about the file -- whatever you think would be relevant to analysis. The most common notes are if activities are out of order or if some activities are missing. This is also the place to put more details about why a file is untranscribable if you feel it would be helpful. 49 | 50 | For a while it wasn't normal to see Conversation and Reporting, so some of the first tapes to have it are marked. You don't have to mark if a tape has C&R (or not); it's now considered normal either way. 51 | 52 | If you have to concatenate two audio files when [bringing over audio from audio2](./audio2) that is often mentioned here. 53 | 54 | Audio problems that affect much or all of a file should get some marking here. If marking substantial background or line noise in a file, timestamps should be in seconds. (This is for noise that persists through most of the file; short periods are marked only in the transcript itself.) If the kid is regularly too quiet for the mic to pick up or so loud the mic is overloading, those are often noted here as well. 55 | 56 | Low-verbal or less verbal are just tags transcriptionists include for kids who take module 3 but really don't talk much/display much language. They don't reflect anything clinical and are just a transcriber intuition thing. 57 | 58 | If there are two examiners (definitely not a parent sitting in as described above) that's typically marked here, too. 
59 | 60 | 61 | ## additional_notes 62 | These are for notes that came from the lab that gave us the tapes; some of them are outdated e.g. "may be coming" for files we now have. Kept for historical reasons. 63 | 64 | 65 | ## Conversation and Reporting 66 | 67 | There was a long issue where we should have been transcribing Conversation and Reporting but we weren't. Every transcript that has a C&R section as of August 2017 that wasn't transcribed has been marked here as "Not yet transcribed". One of the ongoing projects is to go through those transcripts, transcribe C&R, then upload the full transcript and change this marking to "Transcribed". If nothing is marked here you can assume the file is fine. For new transcripts going forward, transcribe C&R like any other activity and don't mark anything here. Once they're all transcribed we can forget about this whole column. 68 | 69 | # Airtable Hints 70 | 71 | * The dropdown list options have an autocomplete, so you can start typing e.g. your name in the transcriber column and then click to auto-complete it 72 | * You can do a lot of sorting and filtering to find the transcripts you want. For example, one of the views will show you the "To do" to easily choose a new transcription file. 73 | * Sometimes one of the columns will hide itself; look for a slider that says "Drag to adjust the number of frozen columns" and drag it a bit to find the hidden column. This is a mystery. -------------------------------------------------------------------------------- /02-protocols/example/transcription/transcription.md: -------------------------------------------------------------------------------- 1 | This page is a big picture overview of how to transcribe and why we do it. We hope at least one experienced transcriber will be around to talk about this in person too. 2 | 3 | # Transcription Goals 4 | 5 | Most transcribers have historically had a linguistics background. This section may be a bit redundant for those who do. 
6 | 7 | The goal of transcription is to preserve what the child actually said, in a form amenable to study. The whole idea of _actually said_ is not nearly as simple as it sounds; we necessarily make choices about what details are relevant to include in a transcript. Speech is also messy. Some of the children whose tapes we're transcribing may have language disorders and other conditions that influence their speech, but even the adult examiners will display some non-standard speech at times -- everyone does. If you haven't transcribed before you will be tempted to clean up the child's speech into something closer to a novel or script. One of the main tasks of learning to be a transcriber is learning to hear the words as they were said, and write them down, messy as they may be. 8 | 9 | # Uses for the Transcripts 10 | 11 | The transcripts are used for a variety of automated analyses. We can't prepare for every possible future use, and there will probably be new analyses conducted that we can't predict yet. We don't want to focus on only a few predicted uses and thus unduly influence the transcripts, but knowing what transcripts will be used for downstream can help us choose which phenomena deserve special notation. It's also helpful to have enough information in a transcript that a human can read it and figure out what's going on, but this is more for troubleshooting; most of the publications based on our transcripts have to do with a computational analysis. 12 | 13 | One set of measures commonly used to study the transcripts includes MLU (mean length of utterance -- an important measure and the reason so many rules revolve around what counts as one word) as well as `other stuff`. 14 | 15 | Some things that have been studied include turn-taking, mazes, and pedantic speech. 16 | 17 | # Transcriber Training 18 | 19 | New transcribers start off by studying the transcription guidelines, then transcribe the gold standard file.
The gold standard file is a file that already has a transcription generally agreed to be of high quality. The current one is ADHD-83322-ADOS (no peeking!). Then experienced transcribers compare the new transcriber's transcription to the gold standard and offer advice and guidance based on the differences. There are certain parts of transcribing that are subjective, so the goal isn't necessarily to produce a character-for-character identical transcript, but this is a good way to get used to the conventions and learn to avoid common mistakes. 20 | 21 | In addition to the version on asd, there are two other versions of the gold standard file made by other transcribers and revised in a group discussion. All canonical versions of the gold standard transcription can be found at `asd:/home/language/Transcription/gold_standard`. The point of keeping multiple versions is to make it easier for trainers to see if the new transcriber is showing normal human variation in judgement or making actual mistakes. 22 | 23 | # Standardization Measures 24 | 25 | Every September and March all working transcribers need to do a standardization check. Anyone who is transcribing at the time needs to share a transcript with their fellow transcribers at some point during that month. Transcribers should review each other's work and discuss any discrepancies. This is to ensure that transcribers remain standardized to both the guidelines and each other. -------------------------------------------------------------------------------- /02-protocols/example/transcription/uw.md: -------------------------------------------------------------------------------- 1 | There are two UW studies we have data from, UWL and UWG. UWL is pretty much transcribed; UWG is a work in progress. 2 | 3 | # UWL 4 | 5 | This was a longitudinal study. The files will have the format UWL-###-ADOS-#.wav . The first set of three numbers is an ID number which stays consistent across timepoints.
The last number, between 1 and 4, is the timepoint. The kids were roughly 5, 8, 11, and 14 years old across the timepoints. (**This may not be right: I'm pretty sure about the spacing but not the absolute ages!**) Many kids dropped out at different points, which is noted on the AirTable spreadsheet. 6 | 7 | Time point 1 isn't usable because all the kids were too young for module 3, and only some of the kids ever made it to module 3. Also, a lot of time point 1 has the parent interview in the background while the kid plays. There are a bunch of kids who came in just for time point 4 and only have one file; they tend to have the highest ID numbers. 8 | 9 | Many of the time point 4 transcripts were made before the current generation of transcribers, and the record-keeping was less strict in those days, so some of the information about who transcribed which files and when has been lost forever. That's why there are gaps on the spreadsheet about time point 4. 10 | 11 | Time point 1-3 recordings came off of VHS tapes shipped to us by UW and digitized at CSLU. Some of the tapes were lost or damaged, or didn't contain an ADOS, and so those weren't transferred. Time point 4 files came on DVD and are therefore a bit higher quality. 12 | 13 | # UWG 14 | 15 | This study is currently being transcribed. Files will have the format ###.03-ADOS.wav . -------------------------------------------------------------------------------- /02-protocols/template/protocol.md: -------------------------------------------------------------------------------- 1 | Use this template to start identifying and documenting procedures that are regular and essential to the everyday happenings and forward momentum of your lab. 2 | 3 | The point here isn't to detail experimental protocols. Rather, think about the workflows that ensure you and your team have the information and materials needed to perform and communicate your research.
4 | 5 | ## Recognize What is Regular 6 | > Make a list of the recurring activities that take place in your lab on a daily, weekly, monthly, and annual basis. 7 | 8 | Examples: 9 | 10 | * Accessing servers 11 | * Ordering supplies 12 | * Lab meetings 13 | * Maintaining equipment 14 | * Running scripts 15 | * Progress reports 16 | 17 | ## The Who, What, When, Where & Why 18 | > For each activity note its purpose, the actions needed to complete the task, individual and organizational roles, the timeline, where needed information or materials live, and where it takes place. 19 | 20 | ## Describe the Procedure 21 | > Once you've broken down the full context of an activity (those 5 Ws above), translate it into a set of instructions. A good protocol is self-contained, so readers shouldn't have to hunt for additional information to understand a task or take action. 22 | 23 | :100: Pro-tip! Update your protocols regularly and don't forget about authentication. There are many activities, like reporting, that require access to protected databases and personal information. Don't share passwords! Many of these systems, like NIH NCBI and eRA Commons, have delegate access capabilities. 24 | 25 | 26 | -------------------------------------------------------------------------------- /03-housekeeping/example/IRB/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/03-housekeeping/example/IRB/.gitkeep -------------------------------------------------------------------------------- /03-housekeeping/example/IRB/2017-notes.md: -------------------------------------------------------------------------------- 1 | For 2017 continuing review, @apreshill was advised that the **total subjects since study began** should only reflect subjects that we ourselves consent. So *N = 237*, and this should not change unless we recruit new participants at **our** site. 
-------------------------------------------------------------------------------- /03-housekeeping/example/NIH-progress-reports/README.md: -------------------------------------------------------------------------------- 1 | Nothing here yet! TBD 2018 -------------------------------------------------------------------------------- /03-housekeeping/example/README.md: -------------------------------------------------------------------------------- 1 | # [Meetings](meetings) 2 | 3 | Our language outcomes project team will meet as a group monthly, usually on the first Wednesday of each month at 1pm in the conference room at Gaines Hall. In this directory, we'll keep our meeting agendas for future meetings, annotated with the minutes and other notes posted by the first Friday of each month. 4 | 5 | # [IRB](IRB) 6 | 7 | You'll also find in here our IRB updates and docs we use to track modifications, continuing reviews, etc. 8 | 9 | # [NIH Progress Reports](NIH-progress-reports) 10 | 11 | Here you'll find materials and docs related to our yearly NIH progress reports. 12 | 13 | # [Team Contacts](team-contacts.md) 14 | 15 | Here is where you can find all team members' preferred ways of communicating, including the location of their OHSU on-campus habitat, and other details like google/skype usernames for remote conferencing.
16 | -------------------------------------------------------------------------------- /03-housekeeping/example/meetings/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/03-housekeeping/example/meetings/.gitkeep -------------------------------------------------------------------------------- /03-housekeeping/example/meetings/2017-08-16.md: -------------------------------------------------------------------------------- 1 | Attendees: @you, @me, @superstarphdstudent1, @superstarphdstudent2, @skilledra1, @skilledra2, @skilledra3, etc. 2 | 3 | 4 | 5 | **Main project overview + updates** 6 | 7 | - Discuss what the data will look like when we get it 8 | - Hiring + on-boarding (note: @skilledra1's last day is 2017-08-23) 9 | 10 | 1. Full-time RA 11 | - Training: transcription, REDCap 12 | - What else? 13 | - `Update 2017-08-16: @skilledra2 has been offered the position (yay!). @skilledra3 will help with transcription training` 14 | 1. New graduate RA (@superstarphdstudent, in September) 15 | - Training: transcription, REDCap 16 | - What else? 17 | 1. `Update 2017-08-16: @me created gitlab repo for project with` [onboarding](https://repo.cslu.ohsu.edu/language-outcomes/onboarding)` section- check it out and feel free to edit/contribute!` 18 | 19 | 20 | **Housekeeping** 21 | 22 | - PMC for all future conference proceedings, papers, etc. 
`Update 2017-08-16: Robin Champieux will join us October 4!` 23 | - IRB update re: CRQ 2017 24 | - `Update 2017-08-16: @skilledra1/@skilledra2 have the` [transcription wiki](https://repo.cslu.ohsu.edu/language-outcomes/transcription/wikis/home) `up and running- please review and suggest edits!` -------------------------------------------------------------------------------- /03-housekeeping/example/meetings/2017-09-06.md: -------------------------------------------------------------------------------- 1 | Attendees: @you (by Google hangout), @me, @skilledra, etc. 2 | 3 | **Main project overview + updates** 4 | 5 | - Transcript progress updates via AirTable 6 | - Hiring + on-boarding 7 | 1. Skilled new RA starts 2017-09-18 8 | - @skilledra to do transcription training right away, starting in September 9 | 10 | **Housekeeping** 11 | 12 | - IRB update re: CRQ 2017 submitted (thanks to all for finishing all those darn trainings!) 13 | - Random Q: does OCTRI provide API keys for REDCap anymore? 14 | - idea: let's add our ORCIDs to [our contact page](https://repo.cslu.ohsu.edu/language-outcomes/housekeeping/blob/master/contact-info.md) 15 | - @skilledra suggested switching up our sesame street avatars for transcription, broad consensus on this issue by all attendees 16 | - `Update 2017-09-07: transcription is now represented by thoughtful Kermit, freeing up Cookie Monster for another repo` -------------------------------------------------------------------------------- /03-housekeeping/example/meetings/2017-10-04.md: -------------------------------------------------------------------------------- 1 | Attendees: @you, @me (by Skype), @skilledra, etc. 2 | 3 | **NIH Public Access Policy walk-through with Robin Champieux** 4 | - Background: 5 | - The NIH has a Public Access Policy for peer-reviewed publications, to ensure public access to publicly funded research. 
6 | - This means that these NIH-funded papers must be available on PubMed Central (PMC) within 12 months of publication. 7 | - For this project, it is the job of the first author of the paper to make this happen. 8 | - At the end of this process your paper will have a PMC id number. 9 | - Process for publishers who submit it on your behalf: 10 | - Most publishers will be willing to submit your paper to PMC on your behalf. 11 | - You will know this is true if you have an option to check in submitting the paper that says you would like them to submit it on your behalf. 12 | - If that's the case, you should know these things: 13 | - You will need to provide your NIH grant number to them when checking that option. 14 | - Once you check that option, you will get a notification (usually within a month) where they provide a pdf of your submission for you to approve. 15 | - You do want to proof the pdf they provide before approving it--sometimes formatting can be altered in this process. 16 | - Process for submitting it independently: 17 | 1. You need an NCBI account. Create one if you don't already have one. 18 | 2. Go to https://www.nihms.nih.gov/db/sub.cgi. Go down to the bottom, and sign in through NCBI under the label "Publishers and Others" 19 | 3. Click on Upload Manuscript 20 | 4. Follow the steps listed there: 21 | 1. Title 22 | 2. Funding 23 | 3. Choose your files (You have to upload a manuscript file, and you may also upload separate figures, etc. Your manuscript file cannot have the publisher's formatting, and must instead be your generated file.) 24 | 4. Check files 25 | 5. Set reviewer and embargo (The reviewer must be an author. As the first author, you should make the reviewer yourself. The embargo is either 12 months (the standard) if it's not given, or a specified and potentially shorter period given by the journal.) 
26 | - Other things to be careful about: 27 | - If the publisher does not submit your paper on your behalf, you will want to submit it independently *as soon as possible*. There is a 3-month grace period, but it is safest to get it submitted as soon as your paper is accepted. 28 | - PubMed Central (PMC) and PubMed (PM) are different. PMC is a full-text repository run by the NIH, while PM is a searchable index of citations and abstracts for biomedical literature. Additionally, a PMC id number is different from a PM number. To be compliant with this Public Access Policy, you need a PMC id number. 29 | 30 | 31 | **Main project overview + updates** 32 | - MIND Institute perseverative speech coding--finalized! 33 | - WE HAVE NEW TRANSCRIPTS! From the MIND Institute 34 | - Email @skilledra about access 35 | 36 | 37 | **Housekeeping** 38 | - Our CRQ 2017 has been approved by IRB (2017-10-02) 39 | - Our repository protocol modification was approved (2017-10-02) 40 | -------------------------------------------------------------------------------- /03-housekeeping/example/meetings/2017-11-01.md: -------------------------------------------------------------------------------- 1 | Attendees: @you, @me, @superstarphdstudent1, @superstarphdstudent2, @skilledra1, @skilledra2, etc. 2 | 3 | **Article Presentation** 4 | 5 | - @superstarphdstudent1 presented a paper titled "Linguistic camouflage in girls with autism spectrum disorder" by Julia Parish-Morris et al., published in 2017 in Molecular Autism. 6 | - The article can be found [here](https://molecularautism.biomedcentral.com/track/pdf/10.1186/s13229-017-0164-6?site=molecularautism.biomedcentral.com).
7 | - Important differences in methods discussed: 8 | - Their utterances are breath segmented as opposed to c-units 9 | - Thus, the MLU is different from ours 10 | - They use the "interview" segments of the ADOS--emotions, social difficulties, friends, relationships, & marriage, loneliness; we used all sections 11 | - They count all disfluency, instead of just the first per utterance 12 | 13 | **Pedantic Speech Presentation** 14 | 15 | @superstarphdstudent2 presented her research from last summer on pedantic speech in children with autism 16 | - Had some mystery ERPA data with unknown desaltify flags run on it 17 | - But potentially solved--@skilledra1 knows where it came from! 18 | 19 | **NIH Info Updates** 20 | - @superstarphdstudent1 updated on PMC submission process 21 | - Success! 22 | - Added more info on the NIH Public Access Policy .md file [here](https://repo.cslu.ohsu.edu/language-outcomes/onboarding/blob/master/nih-public-access-policy.md) 23 | 24 | 25 | **Main Project Overview & Updates** 26 | - @skilledra2 discussed that transcribers will be working on a file that shows the changes to transcription guidelines 27 | - `Update: @skilledra2 and others completed file and it now exists in our transcription wiki` [here](https://repo.cslu.ohsu.edu/language-outcomes/transcription/wikis/transcription/guideline-changelog) 28 | 29 | 30 | **Housekeeping** 31 | - Box Folder: 32 | - We decided to establish a Box folder with articles relevant to this work 33 | - We should have a naming system and organize them by topic --> have a .bib file 34 | - `Update: access shared box folder here:` [LINK] 35 | - @me will be doing a workshop for OHSU researchers on having an on boarding process, developing a lab code of conduct, issue tracking, etc.--email her if you're interested in being involved! 
-------------------------------------------------------------------------------- /03-housekeeping/example/team-contacts.md: -------------------------------------------------------------------------------- 1 | # Project Team 2 | 3 | You can email all current project team members using our group email: **ourlab [at] ohsu [dot] edu** 4 | 5 | ### Alison Presmanes Hill (co-PI), PhD 6 | 7 | * email: 8 | * Office: 9 | * Office phone: 10 | * Cell phone: 11 | * Skype: 12 | * Google (for Hangouts, etc.): 13 | * Website: 14 | * ORCID: [0000-0002-8082-1890](http://orcid.org/0000-0002-8082-1890) 15 | 16 | 17 | # Project Alumni 18 | 19 | Preferred contact information to stay in touch with past graduate students, post-docs, and faculty 20 | 21 | -------------------------------------------------------------------------------- /03-housekeeping/template/IRB/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/03-housekeeping/template/IRB/.gitkeep -------------------------------------------------------------------------------- /03-housekeeping/template/IRB/delete-me.md: -------------------------------------------------------------------------------- 1 | --- 2 | 3 | # h1 Heading 4 | ## h2 Heading 5 | ### h3 Heading 6 | #### h4 Heading 7 | ##### h5 Heading 8 | ###### h6 Heading 9 | 10 | 11 | ## Horizontal Rules 12 | 13 | ___ 14 | 15 | --- 16 | 17 | *** 18 | 19 | 20 | 21 | 22 | ## Emphasis 23 | 24 | **This is bold text** 25 | 26 | __This is bold text__ 27 | 28 | *This is italic text* 29 | 30 | _This is italic text_ 31 | 32 | ~~Strikethrough~~ 33 | 34 | 35 | ## Blockquotes 36 | 37 | 38 | > Blockquotes can also be nested... 39 | >> ...by using additional greater-than signs right next to each other... 40 | > > > ...or with spaces between arrows. 
41 | 42 | 43 | ## Lists 44 | 45 | Unordered 46 | 47 | + Create a list by starting a line with `+`, `-`, or `*` 48 | + Sub-lists are made by indenting 2 spaces: 49 | - Marker character change forces new list start: 50 | * Ac tristique libero volutpat at 51 | + Facilisis in pretium nisl aliquet 52 | - Nulla volutpat aliquam velit 53 | + Very easy! 54 | 55 | Ordered 56 | 57 | 1. Lorem ipsum dolor sit amet 58 | 2. Consectetur adipiscing elit 59 | 3. Integer molestie lorem at massa 60 | 61 | 62 | 1. You can use sequential numbers... 63 | 1. ...or keep all the numbers as `1.` 64 | 65 | Start numbering with offset: 66 | 67 | 57. foo 68 | 1. bar 69 | 70 | 71 | ## Code 72 | 73 | Inline `code` 74 | 75 | Indented code 76 | 77 | // Some comments 78 | line 1 of code 79 | line 2 of code 80 | line 3 of code 81 | 82 | 83 | Block code "fences" 84 | 85 | ``` 86 | Sample text here... 87 | ``` 88 | 89 | Syntax highlighting 90 | 91 | ``` js 92 | var foo = function (bar) { 93 | return bar++; 94 | }; 95 | 96 | console.log(foo(5)); 97 | ``` 98 | 99 | ## Tables 100 | 101 | | Option | Description | 102 | | ------ | ----------- | 103 | | data | path to data files to supply the data that will be passed into templates. | 104 | | engine | engine to be used for processing templates. Handlebars is the default. | 105 | | ext | extension to be used for dest files. | 106 | 107 | Right aligned columns 108 | 109 | | Option | Description | 110 | | ------:| -----------:| 111 | | data | path to data files to supply the data that will be passed into templates. | 112 | | engine | engine to be used for processing templates. Handlebars is the default. | 113 | | ext | extension to be used for dest files. 
| 114 | 115 | 116 | ## Links 117 | 118 | `[GitHub](https://github.com)` 119 | 120 | [GitHub](https://github.com) 121 | 122 | 123 | 124 | 125 | ## Images 126 | 127 | `![Collabocats](https://octodex.github.com/images/collabocats.jpg)` 128 | ![Collabocats](https://octodex.github.com/images/collabocats.jpg) 129 | 130 | `![Labtocat](https://octodex.github.com/images/labtocat.png "Labtocat")` 131 | ![Labtocat](https://octodex.github.com/images/labtocat.png "Labtocat") 132 | 133 | 134 | 135 | 136 | ### [Emojis](https://gist.github.com/rxaviers/7360908) 137 | 138 | :heart: `:heart:` 139 | 140 | :monkey: `:monkey:` 141 | 142 | :umbrella: `:umbrella:` 143 | 144 | 145 | 146 | 147 | 148 | 149 | 150 | 151 | 152 | -------------------------------------------------------------------------------- /03-housekeeping/template/NIH-progress-reports/template.md: -------------------------------------------------------------------------------- 1 | Add your content in markdown files -------------------------------------------------------------------------------- /03-housekeeping/template/README.md: -------------------------------------------------------------------------------- 1 | Use this template to start getting your ducks in a row. Think about the administrative pieces of your project that need to happen in order for the science to happen and must "live" somewhere. These can include pieces like Institutional Review Board applications, modifications, and continuing reviews, meeting agendas and minutes, and your project progress reports and supporting documentation. 
2 | 3 | 4 | 5 | 6 | -------------------------------------------------------------------------------- /03-housekeeping/template/meetings/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/apreshill/labhub/17133b1e3b321cc703e4f6ed75ba77daeb679ab3/03-housekeeping/template/meetings/.gitkeep -------------------------------------------------------------------------------- /03-housekeeping/template/meetings/template.md: -------------------------------------------------------------------------------- 1 | Add your content in markdown files -------------------------------------------------------------------------------- /03-housekeeping/template/team-contacts.md: -------------------------------------------------------------------------------- 1 | # Project Team 2 | 3 | You can email all current project team members using our group email: **ourlab [at] ohsu [dot] edu** 4 | 5 | ### Alison Presmanes Hill (co-PI), PhD 6 | 7 | * email: 8 | * Office: 9 | * Office phone: 10 | * Cell phone: 11 | * Skype: 12 | * Google (for Hangouts, etc.): 13 | * Website: 14 | * ORCID: [0000-0002-8082-1890](http://orcid.org/0000-0002-8082-1890) 15 | 16 | 17 | # Project Alumni 18 | 19 | Preferred contact information to stay in touch with past graduate students, post-docs, and faculty 20 | 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Welcome to Labhub 2 | The Labhub workshop and repository were created as part of the [**Open Science at OHSU**](http://www.ohsu.edu/blogs/researchnews/2017/12/04/open-science-panel-the-evolving-landscape-of-scientific-communication-dec-8/) event on December 8, 2017. 3 | 4 | The Labhub repository was adapted from a Gitlab repository, which [Dr.
Alison Hill](https://alison.rbind.io/) and her colleagues built to onboard new students and postdoctoral fellows, and to facilitate data management, transparency, the long-term value of research contributions, and a safe academic space. 5 | 6 | Labhub is a work in progress. We created this repository as an educational and demonstration tool for faculty, postdocs, and students curious about how documentation, open science workflows, and tools like GitHub can contribute to a healthy and productive research environment. Your ideas and contributions are welcome! 7 | 8 | ## What you'll find in the Labhub repository 9 | Advice, tools, and examples are organized into four areas: onboarding, protocols, housekeeping, and wiki. 10 | 11 | **Onboarding** 12 | 13 | This folder includes an [example repository](https://github.com/apreshill/labhub/tree/master/01-onboarding/example) of resources for new students, graduate research assistants, research staff, and volunteers. We've also created an [onboarding template](https://github.com/apreshill/labhub/blob/master/01-onboarding/template/onboarding.md) to help you think through the goals and issues important to your environment. 14 | 15 | **Protocols** 16 | 17 | Document, document, document! Every lab has a story about an essential process that was lost when a postdoc, student, or research staff member moved on. Having a shared system for documenting and versioning your protocols can help with the transition between lab members, manuscript writing, and reproducibility. 18 | 19 | This folder includes an [example protocol](https://github.com/apreshill/labhub/tree/master/02-protocols/example) from the [Hill lab](https://alison.rbind.io/), and a [template](https://github.com/apreshill/labhub/blob/master/02-protocols/template/protocol.md) to adapt for your own documentation.
20 | 21 | **Housekeeping** 22 | 23 | Need to find a team member's Skype ID, add someone to your Slack Team, save [meeting minutes](http://third-bit.com/teaching/community.html#meetings-meetings-meetings), or confirm when your next NIH progress report is due? Check out our Housekeeping template to record everyday organizational data for easy sharing and retrieval. 24 | 25 | **Wiki** 26 | 27 | Use a [Wiki](https://github.com/apreshill/labhub/wiki) to document helpful but less formal information about working in your lab, like where to find coffee, pens, and a key for deciphering lab acronyms. 28 | 29 | ## You don't have to use GitHub for your Labhub! 30 | The tools and strategies included in this repository can be translated to different environments. There are many reasons why GitHub may not be the best solution for your team: maybe you're dealing with PHI, or most of your labmates aren't comfortable using GitHub. In these cases, tools like [GitLab](https://about.gitlab.com/), an electronic lab notebook, or even a simple folder structure might be the better starting place. 31 | 32 | ## A note about scope 33 | This repository and the methods we suggest hardly scratch the surface when it comes to addressing research rigor and reuse, let alone sexism, racism, ageism, or ableism in a lab. We plan to add a reading list to this repository that addresses these issues with more depth and breadth. 34 | 35 | ## Contributing 36 | First off, thanks for taking the time to contribute! We want Labhub to be a community-driven project, and we want to know about your experience and ideas for facilitating transparency, reproducibility, safety, and inclusion in a research lab. 37 | 38 | Please feel free to create an issue or submit a pull request, or just fork this repository to use it in your lab. 39 | 40 | All contributions will be licensed under the same terms noted below. 41 | 42 | ## Licensing 43 | Creative Commons License
Except where otherwise noted, this work is licensed under a Creative Commons Attribution 4.0 International License. 44 | -------------------------------------------------------------------------------- /labhub.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | -------------------------------------------------------------------------------- /labhub.wiki/01-about-cslu.md: -------------------------------------------------------------------------------- 1 | # CSLU 2 | 3 | CSLU stands for the [Center for Spoken Language Understanding](https://cslu.ohsu.edu). We do research, and have a [Computer Science & Electrical Engineering (CSEE) education program](http://www.ohsu.edu/csee). 4 | 5 | # People 6 | 7 | You may see some of these folks around: https://www.ohsu.edu/xd/research/centers-institutes/center-for-spoken-language-understanding/people.cfm 8 | 9 | # CSLU Seminar Series 10 | 11 | The seminar series is generally Tuesdays from 12-1pm; lunch is provided. The [events calendar](https://www.ohsu.edu/xd/education/schools/school-of-medicine/departments/basic-science-departments/csee/events.cfm) lists upcoming seminars. 
12 | 13 | # Websites 14 | 15 | * CSLU: https://cslu.ohsu.edu 16 | * CSEE education program: http://www.ohsu.edu/csee 17 | 18 | # Mailing Address 19 | 20 | 3181 SW Sam Jackson Park Rd, GH40 21 | 22 | Portland, OR 97239-3098 23 | 24 | # Physical Address (for mapping) 25 | 26 | 840 SW Gaines Street 27 | 28 | Portland, OR 97239-3098 -------------------------------------------------------------------------------- /labhub.wiki/02-seminars-and-journal-clubs.md: -------------------------------------------------------------------------------- 1 | All lab members are invited to attend: 2 | 3 | 1. [OHSU's monthly Autism Seminar Series](https://www.ohsu.edu/xd/research/about/calendar.cfm#/?i=1); email Eric Fombonne to be added to the mailing list 4 | 5 | 2. [CSLU's weekly CSEE Seminar Series](https://www.ohsu.edu/xd/about/news_events/events/index.cfm#/?i=4); email Patricia Dickerson to be added to the mailing list 6 | 7 | 3. Our weekly(-ish) Natural Language Processing reading group; email Steven Bedrick 8 | -------------------------------------------------------------------------------- /labhub.wiki/03-servers-and-data-repositories.md: -------------------------------------------------------------------------------- 1 | In this project, we'll use data from several different sources, and the file location depends on the type of data (text files, flat files like .csvs). We'll organize this by corpora: 2 | 3 | # ERPA 4 | 5 | This was data collected at OHSU between 6 | 7 | **Text files of ADOS transcripts** are stored on the asd server: asd.cslu.ohsu.edu:transcripts/TextGrid/chronTextGrid_merged_by_subject/ 8 | 9 | **Participant-level data** is stored in REDCap (Email Alison for access using your OHSU username and password): https://octri.ohsu.edu/redcap/ 10 | The project is called `OCTRI 10001 ERPA: Expressive & Receptive Prosody in Autism - Version 2` 11 | 12 | 2. 
13 | 14 | # Additional cloud storage 15 | 16 | [OHSU provides](http://www.ohsu.edu/blogs/researchnews/2014/08/05/cloud-storage-now-available-for-ohsu-researchers/) you with an institutional Box.com account, which is the only approved cloud storage that fits OHSU's data protection policy: https://ohsu.app.box.com/login 17 | 18 | # Servers 19 | 20 | The main server for transcription-related purposes is asd, formally `asd.cslu.ohsu.edu`. 21 | 22 | Some other servers you may hear about are bergamot (`login.cslu.ohsu.edu`), which is used to tunnel into internal servers from outside the network, and the Big Birds (`bigbirdN.cslu.ohsu.edu`), which are the center's computing cluster. If someone wants you to work on one of these, they'll probably give you more information about it at the time. 23 | 24 | # Raw Data Storage 25 | 26 | There may come a day when you want to consult the raw source data for a project. The main data stored on-site is from ERPA -- UW and other shared projects do not have such raw data available. 27 | 28 | * [Paper Files](datastorage/files) 29 | * [Recordings](datastorage/recordings) -------------------------------------------------------------------------------- /labhub.wiki/04-CSLU-acronyms.md: -------------------------------------------------------------------------------- 1 | This page provides a list of the acronyms used at the Center for Spoken Language Understanding (CSLU)--see, they come up a lot ;). These may or may not come up in your work.
2 | 3 | # Project-specific jargon 4 | 5 | - **ERPA**: Expressive and Receptive Prosody in Autism (old CSLU study) 6 | - **MIND Institute**: Medical Investigation of Neurodevelopmental Disorders Institute at UC Davis (source of transcripts) 7 | - **FNL**: Fair Neuroimaging Lab at OHSU (source of ADOS recordings) 8 | - **CON**: Conversation (one of the MIND Institute's language samples) 9 | - **NAR**: Narrative (one of the MIND Institute's language samples) 10 | - **PPs**: Pivotal Parameters 11 | - **ACW**: Affirmative Cue Words 12 | - **ADMs**: Automated Discourse Measures 13 | - **DMs**: Discourse Markers (e.g., um, uh) 14 | - **REC**: Replication/Extension Corpora (there are 2: MIND and FNL!) 15 | - **SOR**: Semantic Overlap Ratio 16 | - **WRRs**: Word Repetition Ratios 17 | - **NLs**: Natural Language Samples 18 | 19 | # Labs and Institutions 20 | 21 | - **CDRC**: Child Development and Rehabilitation Center 22 | - **CSEE**: Computer Science and Electrical Engineering Department at OHSU (educational counterpart to CSLU) 23 | - **CSLU**: Center for Spoken Language Understanding (here!) 24 | - **IRB**: Institutional Review Board 25 | - **NIH**: National Institutes of Health 26 | - **OGI**: Oregon Graduate Institute (defunct graduate institution that used to host CSLU) 27 | - **OHSU**: Oregon Health and Science University 28 | - **UW**: University of Washington (their autism lab sends ADOS recordings) 29 | 30 | # Buildings 31 | 32 | Also see [here](http://www.ohsu.edu/xd/about/visiting/directions/upload/OHSU_ext_map_BW_8-5x11_FNL.pdf) for a map of campus with commonly used campus acronyms for various buildings.
33 | 34 | - **BICC**: Biomedical Information and Communication Center (the school library) 35 | - **CHH**: Center for Health and Healing (the building next to the bottom of the tram) 36 | - **GH**: Gaines Hall (location of CSLU) 37 | - **SON**: School of Nursing (across the street from CSLU) 38 | 39 | # Journals 40 | 41 | - **ACL**: Association for Computational Linguistics 42 | - **ACM**: Association for Computing Machinery 43 | - **IEEE**: Institute of Electrical and Electronics Engineers 44 | - **NAACL**: North American Chapter of the Association for Computational Linguistics (a chapter of ACL) 45 | - **PLoS**: Public Library of Science (open access) 46 | 47 | # Technical (transcription, statistics, etc.) 48 | 49 | - **NLP**: Natural Language Processing 50 | - **tf-idf**: Term Frequency - Inverse Document Frequency (how often a term appears in this document compared to how often it occurs in your corpus) 51 | - **CFA**: Confirmatory Factor Analysis 52 | - **GEEs**: Generalized Estimating Equations 53 | - **LDA**: Latent Dirichlet Allocation (generative model in NLP) or Linear Discriminant Analysis (machine learning dimensionality reduction technique) 54 | - **RMSEA**: Root Mean Square Error of Approximation 55 | - **SVM**: Support Vector Machine (machine learning classification technique) 56 | - **MLU**: Mean Length of Utterance 57 | - **MLUM**: Mean Length of Utterance in Morphemes 58 | - **SALT**: Systematic Analysis of Language Transcripts (protocol for transcription) 59 | 60 | 61 | # Neurodevelopmental disorders/diagnostic categories (or lack thereof) 62 | 63 | - **ASD**: Autism Spectrum Disorder 64 | - **ALI**: Autism with Language Impairment (subset used in ERPA) 65 | - **ALN**: Autism with Language Normal (subset used in ERPA) 66 | - **DD**: Developmentally Delayed 67 | - **DS**: Down Syndrome 68 | - **FXS**: Fragile X Syndrome 69 | - **TD**: Typically Developing 70 | - **SLI**: Specific Language Impairment 71 | 72 | # Measures (clinical assessments, parent-reported 
questionnaires, etc.) 73 | 74 | - **ADOS**: Autism Diagnostic Observation Schedule (semi-structured standardized play-based assessment for ASD) 75 | - **BRIEF**: Behavior Rating Inventory of Executive Function (parent questionnaire; measure of executive function) 76 | - **CCC-2**: Children's Communication Checklist, Second Edition (parent questionnaire; measures child's language use in natural settings) 77 | - **CELF**: Clinical Evaluation of Language Fundamentals (test for expressive and receptive language abilities) 78 | - **SCQ**: Social Communication Questionnaire (parent questionnaire; assessment of core autism symptoms) 79 | - **SDQ**: Strengths and Difficulties Questionnaire 80 | - **SRS-2**: Social Responsiveness Scale, Second Edition 81 | - **VABS-II**: Vineland Adaptive Behavior Scales, Second Edition (parent questionnaire; assessment of daily life skills used to estimate general adaptive functioning) 82 | - **FSIQ**: Full Scale IQ 83 | - **NVIQ**: Nonverbal IQ 84 | - **PIQ**: Performance IQ 85 | 86 | # Other 87 | 88 | - **R01**: NIH Research Project Grant Program (a type of grant: see https://grants.nih.gov/grants/funding/r01.htm) 89 | - **IFSP**: Individualized Family Service Plan (a plan for special services for young children with developmental delays) 90 | - **IEP**: Individualized Education Plan (a document that is developed for each public school child who needs special education) 91 | - **IDP**: Individual Development Plan (see: https://myidp.sciencecareers.org; http://www.sciencemag.org/careers/2012/09/you-need-game-plan) 92 | - **RPPR**: Research Performance Progress Report (see: https://grants.nih.gov/grants/rppr/index.htm) 93 | - **myNCBI**: [my National Center for Biotechnology Information](https://www.ncbi.nlm.nih.gov/myncbi/) 94 | - **sciENcv**: [Science Experts Network Curriculum Vitae](https://www.ncbi.nlm.nih.gov/sciencv/) 95 | - **NIHMS**: [NIH Manuscript Submission System](https://www.nihms.nih.gov/) 
-------------------------------------------------------------------------------- /labhub.wiki/05-food-and-coffee.md: -------------------------------------------------------------------------------- 1 | OHSU has a [formal page on this](http://www.ohsu.edu/xd/about/services/food-and-nutrition/where-to-eat/). 2 | 3 | # Food (may also sell coffee) 4 | 5 | * **Mac Hall** is on the first floor of Mackenzie Hall and has cafeteria-style lunch. They close at 4. 6 | * **Thai Yummy** is a Thai food cart. They officially close at 3, but in practice it's usually more like 2. 7 | * There's a **[farmer's market](http://www.ohsu.edu/xd/about/services/food-and-nutrition/farmers-market/index.cfm)** June-September on Tuesdays from 10-2 with several food carts/stalls. 8 | * **It's All Good** is the fancy natural food store, next to the gift store in the main hospital building. It has some prepackaged food and lots of snacks. 9 | * **Hatfield Cafe** is in Hatfield; they have deli fare at lunch and an espresso and pastry window. 10 | * At the base of the tram: **Pizzicato/Lovejoy** (pizza mostly), **Cha! Cha! Cha!** (Mexican food), **Greenleaf** (fancy juice) 11 | * On the waterfront: **Let's Eat Thai Food** (food cart; cheap and giant but a fair walk), **Bambuza** (Vietnamese), **Little Big Burger** (burgers). These are all fairly long walks, so they won't really fit in a standard lunch break. 12 | * There is a **vending machine** outside GH40. There is some formal process to request your money back if it eats it. 13 | * **The Feathered Nesst** is a quick walk from Gaines Hall and sells burgers, sandwiches, salad, etc.; however, they are often slow. This is a popular destination for going-away lunches. 14 | 15 | # Coffee (may also sell food) 16 | 17 | * Pat makes **coffee** (or you can too!) 18 | * We have an **espresso maker** in the kitchen now--but bring your own beans! 19 | * **Nightingale Cafe** is in the School of Nursing (get it?) on the first floor. They close at 2pm and have coffee and pastries. 
20 | * **Sky Bridge Espresso** is where the VA skybridge and Doernbecher meet. It closes at 3pm. 21 | * The **Summit Cafe**, at the top of the tram, is open until 4pm. 22 | * There's a **Starbucks** in the lobby of Doernbecher that is usually open until 8pm, making it the last coffee shop to close in the area. -------------------------------------------------------------------------------- /labhub.wiki/06-office-stuff.md: -------------------------------------------------------------------------------- 1 | Pat D. is your resource for office needs! 2 | 3 | # Office Supplies 4 | 5 | The office supplies are located in the cabinets near the sink in GH 40. These cabinets are stocked with the basics--paper, pencils, folders, highlighters, tissues, etc. If you want anything not available there, ask Pat and she can show you the secret storage or order items we don't have. (Pro-tip: ask about the fancy pens.) 6 | 7 | # Office Protocol 8 | If you work in GH 30 or 40, please be sure to shut off lights and lock up if you are the last person out for the day. If you’re unsure whether anyone else is in, lock the door anyway. If you need after-hours access, Pat can request keys for you. 9 | 10 | Locked out? Call public safety at 503-494-7744. -------------------------------------------------------------------------------- /labhub.wiki/07-transportation-and-parking.md: -------------------------------------------------------------------------------- 1 | ## Transportation 2 | 3 | ### The Great Tram Debate 4 | 5 | If you take the tram up the hill, there are two ways to get to Gaines Hall. 6 | 7 | * The Outdoor Way: walk straight away from the tram, either out the front doors toward the ER or juke around to the left until you hit the next set of doors. Walk past the library then turn left before phys plant. Walk on the swoopy path past the apartment building/Feathered Nesst. Cross the street to School of Nursing. 
Either enter the SoN and go up via the internal stairs/elevator, or go to the right and take the steep stairs up. Cross the street to Gaines Hall. 8 | 9 | * The Indoor Way: walk straight away from the tram, juke left then turn left at the gift store. Turn right at the coffee shop, which puts you into the Doernbecher skybridge. Cross the skybridge then take the stairs to your left one floor up. Turn left from the stairs and walk outside. Cross the street, then walk up to the enclosed bridge. Take a left out of the enclosed bridge to the parking lot and walk up Gaines Street to Gaines Hall on the left. 10 | 11 | People have opinions about which of these you should take. Steven Bedrick timed them and says they take the same amount of time. It's up to you. 12 | 13 | ### Parking 14 | 15 | Parking for non-patients is intentionally hard up here (the city doesn't want the traffic jam of everyone driving up the hill), so if you're going to work a full day it's usually easier to use transit and/or bike. If you do drive, there's 3-hour pay parking on Gaines and a 2-hour visitor parking zone up towards the nature park. The parking is enforced pretty strictly, so you do risk a ticket if you overstay. If you move your car within the visitor zone, they can still ticket you for overstaying. In the pay parking, you have to switch blocks and/or sides of the street after 3 hours and buy a new ticket. 16 | 17 | Also, if you're coming from the east, it's easier to take Barbur and turn up Bancroft than it is to take the front way; this skips most of the traffic. However, it is a very sharp turn and a somewhat steep hill, so be careful. 18 | 19 | ### GPS 20 | 21 | Sometimes telling people "Gaines Hall" will erroneously send them to the waterfront, but "SW 9th and Gaines" usually works. 
22 | 23 | The physical address for Gaines Hall is 840 SW Gaines Street, Portland, OR 97239. 24 | 25 | ### Go By Bike 26 | 27 | There is a [free bicycle valet](http://www.gobybikepdx.com) at the base of the tram, open from 6 am to 7:30 pm when the tram is running. You can swipe your badge and they'll use that to store which bike is yours; otherwise you can get a physical claim check for your bike. Bikes are stored in a guarded lot, so you don't have to lock them or strip off your bike accessories. They'll usually have bags to cover your seat if it's rainy, but you may wish to take your helmet with you on wet days. 28 | 29 | They have a small attached bike shop that can do minor repairs like fixing a flat tire. -------------------------------------------------------------------------------- /labhub.wiki/Home.md: -------------------------------------------------------------------------------- 1 | Welcome to the labhub wiki! This is our place to share institutional knowledge about our lab, CSLU, and OHSU. The goal of this wiki is to collect all the tips, tricks, and good-to-knows that make your fellow lab members' days a little bit easier :smiley:, without providing overwhelming details 😩. 2 | 3 | This is a work in progress! Feel free to contribute and add to the knowledge base. Throw redlinks around, it'll be fun! Use the links on the right to get started :thumbsup: 4 | 5 | # Some useful topics 6 | * Confused about why people keep saying ADOS like it's a real word? Read through [our acronyms and jargon list](./04-CSLU-acronyms). 7 | * About to buy your own office supplies? Don't!! Check out [Office Stuff](./06-office-stuff). 8 | * Experiencing caffeine withdrawal? Go immediately to [Food and Coffee](./05-food-and-coffee). 9 | * Don't know how you got here/how to get home/where anything is? Look at [Transportation and Parking](./07-transportation-and-parking). 
10 | 11 | 12 | -------------------------------------------------------------------------------- /labhub.wiki/_Sidebar.md: -------------------------------------------------------------------------------- 1 | * [Home](./Home) 2 | 3 | - [About CSLU](./01-about-cslu) 4 | - [Seminars and Journal Clubs](./02-seminars-and-journal-clubs) 5 | - [Servers and Data Repositories](./03-servers-and-data-repositories) 6 | - [Our acronyms and jargon](./04-CSLU-acronyms) 7 | - [Food and coffee](./05-food-and-coffee) 8 | - [Office stuff](./06-office-stuff) 9 | - [Transportation and Parking](./07-transportation-and-parking) --------------------------------------------------------------------------------