├── .datalad
│   ├── config
│   ├── .gitattributes
│   └── status
│       └── fetched-subs.log
├── NWB
│   └── NWB_for_SFN2023.mp4
├── BABS
│   └── BABS_OHBM2023_20230622.mp4
├── DataLad
│   ├── What_is_DataLad_.m
│   ├── Research_Data_Management_01.m
│   ├── Research_Data_Management_02.m
│   ├── Research_Data_Management_03.m
│   ├── Research_Data_Management_04.m
│   ├── A_hands-on_introduction_to_DataLad.m
│   ├── OHBM_Poster_presentation__844__DataCat.m
│   ├── OHBM_Poster_presentation__2057__FAIRly_big.m
│   ├── DataLad_for_Machine_Learning_-_An_Introduction.m
│   ├── Data_versioning_and_transformation_with_DataLad.m
│   ├── DataLad_vs_Git_Git-annex_for_modular_data_management.m
│   ├── Demo__Fully_recomputing_a_real_scientific_paper__DIY_.m
│   ├── 01_Introduction_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 09_Collaboration_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 05_Drop_and_remove_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 06_Branching__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── How_to_introduce_data_management_technology_without_sinking_the_ship_.m
│   ├── DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.m
│   ├── DataLad__-_Decentralized_Management_of_Digital_Objects_for_Open_Science.m
│   ├── Follow_the_rabbits__The_2020_OHBM_Brainhack_Traintrack_Session_on_DataLad.m
│   ├── 03_Basics_of_Version_Control_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 07_Data_publication__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 08_Data_publication__part_2__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── Perpetual_decentralized_management_of_digital_objects_for_collaborative_open_science.m
│   ├── 04_Version_control_underneath_the_hood__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 10_Preview_of_reproducibility_features_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── FAIRly_big__A_framework_for_computationally_reproducible_processing_of_large_scale_data.m
│   ├── OHBM_2022_Educational_course__How_to_Write_a_Re-executable_Publication__-_What_is_DataLad_.m
│   ├── Demo__Fully_recomputing_a_real_scientific_paper__DIY_.en.vtt
│   ├── What_is_DataLad_.en.vtt
│   ├── OHBM_Poster_presentation__844__DataCat.srt
│   ├── OHBM_Poster_presentation__844__DataCat.en.vtt
│   ├── OHBM_Poster_presentation__2057__FAIRly_big.en.vtt
│   ├── 02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.en.vtt
│   └── DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.en.vtt
├── ReproNim
│   ├── Introduction_to_DataLad.m
│   ├── How_Would_ReproNim_do_That_.m
│   ├── Introduction_to_Containers.m
│   ├── ReproNim_Webinar__COINSTAC.m
│   ├── ReproNim_Webinar__Containers.m
│   ├── ReproNim_Webinar__ReproLake.m
│   ├── ReproNim_Webinar__ReproPond.m
│   ├── ReproNim_Webinar__ReproSchema.m
│   ├── ReproNim_Webinar__eCOBIDAS_ReproSchema.m
│   ├── Introduction_to_Semantic_Web_and_Linked_Data.m
│   ├── ReproNim_Webinar__IQ_in_Typical_Development.m
│   ├── The_NeuroImaging_Data_Model__NIDM__in_Action.m
│   ├── ReproNim_Webinar__How_Would_ReproNim_do_That_.m
│   ├── ReproNim_Webinar__Reproducible_Execution_of_Data_Collection_Processing.m
│   └── Depression_and_obesity__using_the_ReproNim_technologies_to_study_public_health_problems.m
├── .gitattributes
├── ABCD-ReproNim_Course
│   ├── Week_10_Instructor_Q_A.m
│   ├── Week_11_Instructor_Q_A.m
│   ├── Week_12_Instructor_Q_A.m
│   ├── Week_7_Instructor_Q_A.m
│   ├── Week_8_Instructor_Q_A.m
│   ├── Week_9_Instructor_Q_A.m
│   ├── Week_9_ABCD__Biospecimens.m
│   ├── Week_11_ABCD__Visualizing_Data.m
│   ├── Week_13_Q_A__Project_Week_Pitches.m
│   ├── ABCD-ReproNim_Project_Week__2021_Project_Week_Kickoff.m
│   ├── ABCD-ReproNim_Project_Week__Team_Project_Presentations.m
│   ├── Week_10_ReproNim__ReproMan_Execution_and_Environment_Manager.m
│   ├── Week_11_ReproNim__ReproPub_-_The_Re-Executable_Publication.m
│   ├── Week12__Analytic_Approaches__Reproducible_Practices_in_Machine_Learning.m
│   └── Week_10_ABCD__Novel_Technologies_-_Mobile__Wearable__and_Social_Media.m
├── Open_Data_In_Neurophysiology_Symposium_2023
│   ├── Day_1_Session_1__Panel_Discussion.m
│   ├── Day_1_Session_2__Panel_Discussion.m
│   ├── Introduction__Satrajit_Ghosh___Nima_Dehghani.m
│   ├── Day_1_Session_2__Oliver_Rubel___NWB__Neurodata_without_borders_.m
│   ├── Day_1__Session_3__Jerome_Lecoq__Brain_Observatory___OpenScope.m
│   ├── Day_1_Session_1.Tim_Harris__Neuropixels_NXT__in_vivo_high_density_electrophysiology.m
│   ├── Day_1_Session_3__David_Feng__Compute__data___standards_in_large-scale_neuroscience.m
│   ├── Day_1_Session_1._Alipasha_Vaziri__Single_cell_resolution_cortex-wide_volumetric_recording.m
│   ├── Day_1_Session2___Jeremy_Magland__Web-based_visualization___analysis_of_neurophysiology_data.m
│   ├── Day_1_Session_1__Adam_Cohen__Voltage_Imaging__all-optical_electrophysiology_of_neuron_excitability.m
│   ├── Day_1_Session_1__Shadi_Dayeh_Recording_the_human_brain_activity__multi-thousand_channel_ecog_grids_.m
│   ├── Day_1_Session_2_Satrajit_Ghosh__DANDI__Distributed_Archives_for_Neurophysiology_Data_Integration.m
│   ├── Day_1_Session_2__Dimitri_Yatsenko__End-to-end_computational_workflows_for_neuroscience_research.m
│   ├── Keynote_1__Andrea_Beckel_Mitchener__Brain_Research_Through_Advancing_Innovative_Neurotechnologies.m
│   └── Day1_Session3__Hideyuki_Okano__Brain_Mapping___Disease_Modellings_with_Genetically_Modified_Marmoset.m
├── Open_Minds___Pitt
│   ├── Vendor-Neutral_Applications_for_Quantitative_MRI_Quality_Control.m
│   ├── Overview_of_various_noise_contributions_to_fMRI_signal_by_Dr._Thomas_T._Liu.m
│   ├── Open_discussion_on_MR_Imaging_Centre_Facility_Operations__focus_on_QA_Processes.m
│   ├── Review_of_Quality_Control_Considerations_for_Resting-state_fMRI__Dr._Jean_Chen.m
│   ├── Setting_up_your_experiment_for__not_success__but_less_failure__by_Dr._Ben_Inglis.m
│   ├── Academic_Exit_Plan__awareness_of_and_planning_for_non-traditional_careers_beyond_academia.m
│   ├── MR_Scanner_QA__Phantoms__commercial_solutions__cloud_services_and_potential_standards_.m
│   ├── Relationship_between_Structural_MRI_Quality_ratings_and_scores__and_morphometric_measures.m
│   ├── _Quality_Conversation__Phantom_data_matter_in_Neuroimaging_QA_QC_beyond_basic_scanner_QA.m
│   ├── Diffusion_Weighted_MRI_QC__Validation_of_tractography_methods_and_related_issues_by_Dr._Yendiki.m
│   ├── Automatic_quality_assessment_of_structural_MRI_in_pediatric_neuroimaging__Quality_Conversations_.m
│   ├── Comparison_of_retrospective_motion_correction_strategies_in_resting-state_fMRI_by_Dr._Linden_Parkes.m
│   ├── Influence_of_Motion___Physiological_noise_on_fMRI__QC__solutions__and_challenges_by_Dr._Rasmus_Birn.m
│   ├── Overview_of_prospective_motion_detection_and_correction_methods_in_neuroimaging_by_Dr._Paul_Wighton.m
│   └── Restoring_statistical_validity_in_group_analyses_of_motion_corrupted_MRI_data_by_Dr._Antoine_Lutti.m
├── README.md
└── code
    └── fetch_subs.sh

/.datalad/config:
--------------------------------------------------------------------------------
1 | [datalad "dataset"]
2 | id = ddba1970-21ed-45b9-90fb-a5c5033fcc7e
--------------------------------------------------------------------------------
/.datalad/.gitattributes:
--------------------------------------------------------------------------------
1 | 
2 | config annex.largefiles=nothing
3 | metadata/aggregate* annex.largefiles=nothing
4 | metadata/objects/** annex.largefiles=(anything)
--------------------------------------------------------------------------------
/NWB/NWB_for_SFN2023.mp4:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/zk/GZ/MD5E-s1043812232--b45fc82669ae2fdc5621b157096ba819.mp4/MD5E-s1043812232--b45fc82669ae2fdc5621b157096ba819.mp4
--------------------------------------------------------------------------------
/BABS/BABS_OHBM2023_20230622.mp4:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Vw/K5/MD5E-s79204972--7360871b25553476871170cf2490f01d.mp4/MD5E-s79204972--7360871b25553476871170cf2490f01d.mp4
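The NWB and BABS videos are annexed under git-annex's MD5E backend (as requested by `* annex.backend=MD5E` in the root `.gitattributes`): the key name itself records the file's byte size, its MD5 checksum, and the original extension, so integrity can be checked from the symlink target alone. A minimal parser for such keys, as a sketch — real git-annex keys may carry extra fields (e.g. chunk metadata) that this pattern ignores:

```python
import re

def parse_md5e_key(key: str):
    """Split a git-annex MD5E key into (size_bytes, md5_hex, extension).

    MD5E keys look like 'MD5E-s<bytes>--<md5hex><.ext>'; the 'E' variant
    keeps the file extension so extension-sniffing tools keep working.
    """
    m = re.fullmatch(r"MD5E-s(\d+)--([0-9a-f]{32})(\..+)?", key)
    if m is None:
        raise ValueError(f"not a plain MD5E key: {key}")
    size, md5, ext = m.groups()
    return int(size), md5, ext or ""

# The NWB recording above: ~1 GB mp4 with its MD5 embedded in the key.
print(parse_md5e_key("MD5E-s1043812232--b45fc82669ae2fdc5621b157096ba819.mp4"))
# (1043812232, 'b45fc82669ae2fdc5621b157096ba819', '.mp4')
```

Comparing the parsed size and checksum against the file on disk reproduces what `git annex fsck` verifies for this backend.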
--------------------------------------------------------------------------------
/DataLad/What_is_DataLad_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Mf/Qx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IN0vowZ67vs/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IN0vowZ67vs
--------------------------------------------------------------------------------
/ReproNim/Introduction_to_DataLad.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/6z/qJ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pVrjRRrmKbY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pVrjRRrmKbY
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | 
2 | * annex.backend=MD5E
3 | **/.git* annex.largefiles=nothing
4 | * annex.largefiles=((mimeencoding=binary)and(largerthan=0))
5 | *.srt annex.largefiles=nothing
--------------------------------------------------------------------------------
/DataLad/Research_Data_Management_01.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/wK/K8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61fL3DWzSWFL8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61fL3DWzSWFL8
--------------------------------------------------------------------------------
/DataLad/Research_Data_Management_02.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/P1/JP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61GrOfE8jv12s/URL--yt&chttps&c%%www.youtube.com%watch,63v,61GrOfE8jv12s
--------------------------------------------------------------------------------
/DataLad/Research_Data_Management_03.m:
--------------------------------------------------------------------------------
1 | 
../.git/annex/objects/M1/fv/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lO4yfl30_uc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lO4yfl30_uc -------------------------------------------------------------------------------- /DataLad/Research_Data_Management_04.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Zv/qQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ePgH-kK8h8/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ePgH-kK8h8 -------------------------------------------------------------------------------- /ReproNim/How_Would_ReproNim_do_That_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/F1/Fv/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dcY1eXs6EkM/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dcY1eXs6EkM -------------------------------------------------------------------------------- /ReproNim/Introduction_to_Containers.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/3w/xW/URL--yt&chttps&c%%www.youtube.com%watch,63v,615arBTnYWZq4/URL--yt&chttps&c%%www.youtube.com%watch,63v,615arBTnYWZq4 -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__COINSTAC.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/gf/Xg/URL--yt&chttps&c%%www.youtube.com%watch,63v,616lpsro_L9-Y/URL--yt&chttps&c%%www.youtube.com%watch,63v,616lpsro_L9-Y -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__Containers.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Xf/Z1/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ix3lC6HGo-Q/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ix3lC6HGo-Q -------------------------------------------------------------------------------- 
/ReproNim/ReproNim_Webinar__ReproLake.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/M7/5z/URL--yt&chttps&c%%www.youtube.com%watch,63v,61VQ5t24mrvJI/URL--yt&chttps&c%%www.youtube.com%watch,63v,61VQ5t24mrvJI -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__ReproPond.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/G1/kx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61clIL2LJcHXY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61clIL2LJcHXY -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__ReproSchema.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/kV/1F/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dDuP-Znso5Y/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dDuP-Znso5Y -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_10_Instructor_Q_A.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Q7/WJ/URL--yt&chttps&c%%www.youtube.com%watch,63v,614pEOGYcbx64/URL--yt&chttps&c%%www.youtube.com%watch,63v,614pEOGYcbx64 -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_11_Instructor_Q_A.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/zW/KW/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QFngbg74H1o/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QFngbg74H1o -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_12_Instructor_Q_A.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/fJ/VG/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zAqkd9sSspk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zAqkd9sSspk -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_7_Instructor_Q_A.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/X4/F3/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IQU77HcUfwI/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IQU77HcUfwI -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_8_Instructor_Q_A.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/5j/PP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WIBQ7k5rVhc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WIBQ7k5rVhc -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_9_Instructor_Q_A.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/q5/Jv/URL--yt&chttps&c%%www.youtube.com%watch,63v,619-8SwBIkN2k/URL--yt&chttps&c%%www.youtube.com%watch,63v,619-8SwBIkN2k -------------------------------------------------------------------------------- /DataLad/A_hands-on_introduction_to_DataLad.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/zf/2v/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_I3JFhJJtW0/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_I3JFhJJtW0 -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_9_ABCD__Biospecimens.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/FG/JW/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QcsifMz5_fQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QcsifMz5_fQ 
-------------------------------------------------------------------------------- /DataLad/OHBM_Poster_presentation__844__DataCat.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/V5/wZ/URL--yt&chttps&c%%www.youtube.com%watch,63v,614GERwj49KFc/URL--yt&chttps&c%%www.youtube.com%watch,63v,614GERwj49KFc -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__eCOBIDAS_ReproSchema.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Mf/92/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bQd-e_v2iCc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bQd-e_v2iCc -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_11_ABCD__Visualizing_Data.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/WK/2k/URL--yt&chttps&c%%www.youtube.com%watch,63v,613r73oYta0yA/URL--yt&chttps&c%%www.youtube.com%watch,63v,613r73oYta0yA -------------------------------------------------------------------------------- /DataLad/OHBM_Poster_presentation__2057__FAIRly_big.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/ww/50/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YvZacWgGRZY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YvZacWgGRZY -------------------------------------------------------------------------------- /ReproNim/Introduction_to_Semantic_Web_and_Linked_Data.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/5X/wf/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KDMEes_syjE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KDMEes_syjE -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__IQ_in_Typical_Development.m: 
-------------------------------------------------------------------------------- 1 | ../.git/annex/objects/zj/12/URL--yt&chttps&c%%www.youtube.com%watch,63v,61RdJ_Ac1ZO8M/URL--yt&chttps&c%%www.youtube.com%watch,63v,61RdJ_Ac1ZO8M -------------------------------------------------------------------------------- /ReproNim/The_NeuroImaging_Data_Model__NIDM__in_Action.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/2J/MZ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61223T_s9xSKo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61223T_s9xSKo -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_13_Q_A__Project_Week_Pitches.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/vP/89/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SySRHAp3uRk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SySRHAp3uRk -------------------------------------------------------------------------------- /DataLad/DataLad_for_Machine_Learning_-_An_Introduction.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/1V/jx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61oXd1GPf-Zv4/URL--yt&chttps&c%%www.youtube.com%watch,63v,61oXd1GPf-Zv4 -------------------------------------------------------------------------------- /DataLad/Data_versioning_and_transformation_with_DataLad.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/fp/xP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61wimd1uhIJ8g/URL--yt&chttps&c%%www.youtube.com%watch,63v,61wimd1uhIJ8g -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__How_Would_ReproNim_do_That_.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/K6/63/URL--yt&chttps&c%%www.youtube.com%watch,63v,61NPlAQdSDnBk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61NPlAQdSDnBk -------------------------------------------------------------------------------- /DataLad/DataLad_vs_Git_Git-annex_for_modular_data_management.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/5x/Ff/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Yrg6DgOcbPE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Yrg6DgOcbPE -------------------------------------------------------------------------------- /DataLad/Demo__Fully_recomputing_a_real_scientific_paper__DIY_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/vJ/G8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61nhLqmF58SLQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61nhLqmF58SLQ -------------------------------------------------------------------------------- /DataLad/01_Introduction_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Gw/Jp/URL--yt&chttps&c%%www.youtube.com%watch,63v,6140ZcGp2vHXk/URL--yt&chttps&c%%www.youtube.com%watch,63v,6140ZcGp2vHXk -------------------------------------------------------------------------------- /DataLad/09_Collaboration_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/W8/k5/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AuM6bc7-N6U/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AuM6bc7-N6U -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/ABCD-ReproNim_Project_Week__2021_Project_Week_Kickoff.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/m7/7Q/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zTOleP0JIqo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zTOleP0JIqo -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/ABCD-ReproNim_Project_Week__Team_Project_Presentations.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/QG/G9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61q2xdwKgtbos/URL--yt&chttps&c%%www.youtube.com%watch,63v,61q2xdwKgtbos -------------------------------------------------------------------------------- /DataLad/02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/5k/0v/URL--yt&chttps&c%%www.youtube.com%watch,63v,61N7wMaaTAyzE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61N7wMaaTAyzE -------------------------------------------------------------------------------- /DataLad/05_Drop_and_remove_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/XJ/wk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61iulQIhPqRzQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61iulQIhPqRzQ -------------------------------------------------------------------------------- /DataLad/06_Branching__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/mK/v9/URL--yt&chttps&c%%www.youtube.com%watch,63v,618TyMg9SK35U/URL--yt&chttps&c%%www.youtube.com%watch,63v,618TyMg9SK35U -------------------------------------------------------------------------------- /DataLad/How_to_introduce_data_management_technology_without_sinking_the_ship_.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/zW/Kg/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uH75kYgwLH4/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uH75kYgwLH4 -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1__Panel_Discussion.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/PP/Xp/URL--yt&chttps&c%%www.youtube.com%watch,63v,61jI9grk7l9kk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61jI9grk7l9kk -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2__Panel_Discussion.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/QF/WF/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z4iTFH1adLw/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z4iTFH1adLw -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_10_ReproNim__ReproMan_Execution_and_Environment_Manager.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/M2/mz/URL--yt&chttps&c%%www.youtube.com%watch,63v,61grIVFbYH7YE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61grIVFbYH7YE -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_11_ReproNim__ReproPub_-_The_Re-Executable_Publication.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/pJ/95/URL--yt&chttps&c%%www.youtube.com%watch,63v,61PlTJpErMCEk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61PlTJpErMCEk -------------------------------------------------------------------------------- /DataLad/DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/wp/Zx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61sDP1jhRkKRo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61sDP1jhRkKRo -------------------------------------------------------------------------------- /DataLad/DataLad__-_Decentralized_Management_of_Digital_Objects_for_Open_Science.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/gW/3K/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pIGFS8XDjco/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pIGFS8XDjco -------------------------------------------------------------------------------- /DataLad/Follow_the_rabbits__The_2020_OHBM_Brainhack_Traintrack_Session_on_DataLad.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/6M/pW/URL--yt&chttps&c%%www.youtube.com%watch,63v,61L5A0MXqFrOY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61L5A0MXqFrOY -------------------------------------------------------------------------------- /Open_Minds___Pitt/Vendor-Neutral_Applications_for_Quantitative_MRI_Quality_Control.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/vZ/zq/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ob0hPa1JQac/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ob0hPa1JQac -------------------------------------------------------------------------------- /ReproNim/ReproNim_Webinar__Reproducible_Execution_of_Data_Collection_Processing.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/xq/g8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dwBtrpI2iS0/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dwBtrpI2iS0 -------------------------------------------------------------------------------- /DataLad/03_Basics_of_Version_Control_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/vj/VJ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IXSE-KtQVBs/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IXSE-KtQVBs -------------------------------------------------------------------------------- /DataLad/07_Data_publication__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/0j/Zx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WwSp22zVwV8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WwSp22zVwV8 -------------------------------------------------------------------------------- /DataLad/08_Data_publication__part_2__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Zp/m4/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LQ3gmSOT-Io/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LQ3gmSOT-Io -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Introduction__Satrajit_Ghosh___Nima_Dehghani.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Kk/FF/URL--yt&chttps&c%%www.youtube.com%watch,63v,61EO8QVOcdQYY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61EO8QVOcdQYY -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week12__Analytic_Approaches__Reproducible_Practices_in_Machine_Learning.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/pX/Wq/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LAddDaqUe0A/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LAddDaqUe0A -------------------------------------------------------------------------------- /ABCD-ReproNim_Course/Week_10_ABCD__Novel_Technologies_-_Mobile__Wearable__and_Social_Media.m: 
-------------------------------------------------------------------------------- 1 | ../.git/annex/objects/z4/Pm/URL--yt&chttps&c%%www.youtube.com%watch,63v,61MFk98_ykknQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61MFk98_ykknQ -------------------------------------------------------------------------------- /DataLad/Perpetual_decentralized_management_of_digital_objects_for_collaborative_open_science.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/ZX/57/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SJ64rSMD9PU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SJ64rSMD9PU -------------------------------------------------------------------------------- /Open_Minds___Pitt/Overview_of_various_noise_contributions_to_fMRI_signal_by_Dr._Thomas_T._Liu.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/vP/28/URL--yt&chttps&c%%www.youtube.com%watch,63v,6176y1Vg12oeA/URL--yt&chttps&c%%www.youtube.com%watch,63v,6176y1Vg12oeA -------------------------------------------------------------------------------- /DataLad/04_Version_control_underneath_the_hood__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/X4/m8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lBj5J7aKnPc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lBj5J7aKnPc -------------------------------------------------------------------------------- /DataLad/10_Preview_of_reproducibility_features_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/WP/Q9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AX3lIw9LQbA/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AX3lIw9LQbA -------------------------------------------------------------------------------- 
/DataLad/FAIRly_big__A_framework_for_computationally_reproducible_processing_of_large_scale_data.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/3M/g1/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YDtEKUWUPTQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YDtEKUWUPTQ -------------------------------------------------------------------------------- /DataLad/OHBM_2022_Educational_course__How_to_Write_a_Re-executable_Publication__-_What_is_DataLad_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/V8/xX/URL--yt&chttps&c%%www.youtube.com%watch,63v,61s1zrB_sDbDU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61s1zrB_sDbDU -------------------------------------------------------------------------------- /Open_Minds___Pitt/Open_discussion_on_MR_Imaging_Centre_Facility_Operations__focus_on_QA_Processes.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/zJ/91/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_Vhe892uDVQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_Vhe892uDVQ -------------------------------------------------------------------------------- /Open_Minds___Pitt/Review_of_Quality_Control_Considerations_for_Resting-state_fMRI__Dr._Jean_Chen.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/fF/xp/URL--yt&chttps&c%%www.youtube.com%watch,63v,612HlQPaPDzNQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,612HlQPaPDzNQ -------------------------------------------------------------------------------- /Open_Minds___Pitt/Setting_up_your_experiment_for__not_success__but_less_failure__by_Dr._Ben_Inglis.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/5p/gx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61xpHzkg4JOkU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61xpHzkg4JOkU 
-------------------------------------------------------------------------------- /ReproNim/Depression_and_obesity__using_the_ReproNim_technologies_to_study_public_health_problems.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/50/2W/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KpgF18p3Woo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KpgF18p3Woo -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2__Oliver_Rubel___NWB__Neurodata_without_borders_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/1P/Vx/URL--yt&chttps&c%%www.youtube.com%watch,63v,611pqggHHvHdw/URL--yt&chttps&c%%www.youtube.com%watch,63v,611pqggHHvHdw -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1__Session_3__Jerome_Lecoq__Brain_Observatory___OpenScope.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/P9/vp/URL--yt&chttps&c%%www.youtube.com%watch,63v,61b97exHKmgho/URL--yt&chttps&c%%www.youtube.com%watch,63v,61b97exHKmgho -------------------------------------------------------------------------------- /Open_Minds___Pitt/Academic_Exit_Plan__awareness_of_and_planning_for_non-traditional_careers_beyond_academia.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/XM/4F/URL--yt&chttps&c%%www.youtube.com%watch,63v,614s8xan-eH0c/URL--yt&chttps&c%%www.youtube.com%watch,63v,614s8xan-eH0c -------------------------------------------------------------------------------- /Open_Minds___Pitt/MR_Scanner_QA__Phantoms__commercial_solutions__cloud_services_and_potential_standards_.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/9x/qk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HFEt3ZxLBl8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HFEt3ZxLBl8 -------------------------------------------------------------------------------- /Open_Minds___Pitt/Relationship_between_Structural_MRI_Quality_ratings_and_scores__and_morphometric_measures.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/WQ/7z/URL--yt&chttps&c%%www.youtube.com%watch,63v,61md3_oVugOUc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61md3_oVugOUc -------------------------------------------------------------------------------- /Open_Minds___Pitt/_Quality_Conversation__Phantom_data_matter_in_Neuroimaging_QA_QC_beyond_basic_scanner_QA.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/81/J6/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HcS9_LFdoPw/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HcS9_LFdoPw -------------------------------------------------------------------------------- /Open_Minds___Pitt/Diffusion_Weighted_MRI_QC__Validation_of_tractography_methods_and_related_issues_by_Dr._Yendiki.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Jg/9F/URL--yt&chttps&c%%www.youtube.com%watch,63v,61plB-wmuhEQk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61plB-wmuhEQk -------------------------------------------------------------------------------- /Open_Minds___Pitt/Automatic_quality_assessment_of_structural_MRI_in_pediatric_neuroimaging__Quality_Conversations_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/kP/j5/URL--yt&chttps&c%%www.youtube.com%watch,63v,618QYirk8opLA/URL--yt&chttps&c%%www.youtube.com%watch,63v,618QYirk8opLA -------------------------------------------------------------------------------- 
/Open_Minds___Pitt/Comparison_of_retrospective_motion_correction_strategies_in_resting-state_fMRI_by_Dr._Linden_Parkes.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/xj/2V/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bo2AFvJ5mYU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bo2AFvJ5mYU -------------------------------------------------------------------------------- /Open_Minds___Pitt/Influence_of_Motion___Physiological_noise_on_fMRI__QC__solutions__and_challenges_by_Dr._Rasmus_Birn.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/Qz/kz/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z2d_3eyzfJw/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z2d_3eyzfJw -------------------------------------------------------------------------------- /Open_Minds___Pitt/Overview_of_prospective_motion_detection_and_correction_methods_in_neuroimaging_by_Dr._Paul_Wighton.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/JV/0X/URL--yt&chttps&c%%www.youtube.com%watch,63v,619_BH3NJcRzs/URL--yt&chttps&c%%www.youtube.com%watch,63v,619_BH3NJcRzs -------------------------------------------------------------------------------- /Open_Minds___Pitt/Restoring_statistical_validity_in_group_analyses_of_motion_corrupted_MRI_data_by_Dr._Antoine_Lutti.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/1w/V6/URL--yt&chttps&c%%www.youtube.com%watch,63v,61XLSzzJzmtvc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61XLSzzJzmtvc -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1.Tim_Harris__Neuropixels_NXT__in_vivo_high_density_electrophysiology.m: -------------------------------------------------------------------------------- 1 | 
../.git/annex/objects/8M/jp/URL--yt&chttps&c%%www.youtube.com%watch,63v,61DZRvA7c5UzQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61DZRvA7c5UzQ -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_3__David_Feng__Compute__data___standards_in_large-scale_neuroscience.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/qQ/JQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uvavLax2Txs/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uvavLax2Txs -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1._Alipasha_Vaziri__Single_cell_resolution_cortex-wide_volumetric_recording.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/6M/GQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61TGaPM72WdDE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61TGaPM72WdDE -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session2___Jeremy_Magland__Web-based_visualization___analysis_of_neurophysiology_data.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/0k/87/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ZSK5jHy3WzU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ZSK5jHy3WzU -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1__Adam_Cohen__Voltage_Imaging__all-optical_electrophysiology_of_neuron_excitability.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/8V/X9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61yEx4YbtlO9M/URL--yt&chttps&c%%www.youtube.com%watch,63v,61yEx4YbtlO9M 
-------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1__Shadi_Dayeh_Recording_the_human_brain_activity__multi-thousand_channel_ecog_grids_.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/fW/56/URL--yt&chttps&c%%www.youtube.com%watch,63v,61hBLuh4hs-To/URL--yt&chttps&c%%www.youtube.com%watch,63v,61hBLuh4hs-To -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2_Satrajit_Ghosh__DANDI__Distributed_Archives_for_Neurophysiology_Data_Integration.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/3F/Zg/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LBjGJ_DJ91M/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LBjGJ_DJ91M -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2__Dimitri_Yatsenko__End-to-end_computational_workflows_for_neuroscience_research.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/M5/jP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61C_BG6cVHSbQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61C_BG6cVHSbQ -------------------------------------------------------------------------------- /Open_Data_In_Neurophysiology_Symposium_2023/Keynote_1__Andrea_Beckel_Mitchener__Brain_Research_Through_Advancing_Innovative_Neurotechnologies.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/2p/F9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61x15DSuXCmRM/URL--yt&chttps&c%%www.youtube.com%watch,63v,61x15DSuXCmRM -------------------------------------------------------------------------------- 
/Open_Data_In_Neurophysiology_Symposium_2023/Day1_Session3__Hideyuki_Okano__Brain_Mapping___Disease_Modellings_with_Genetically_Modified_Marmoset.m: -------------------------------------------------------------------------------- 1 | ../.git/annex/objects/mg/5Z/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Zv5NOB-mkXg/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Zv5NOB-mkXg -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ReproTube inceptor 2 | 3 | This is not even a prototype but just the result of running 3 git annex commands 4 | (check the git history for datalad run records) to fetch 3 sample channels of interest. 5 | 6 | ## HOWTO 7 | 8 | At the moment, to download an actual video you need to have youtube-dl installed 9 | and to invoke git annex with a special option that allows downloads from/through potentially 10 | dangerous IP addresses, e.g. 11 | 12 | git -c annex.security.allowed-ip-addresses=all annex get ReproNim/How_Would_ReproNim_do_That_.m 13 | 14 | -------------------------------------------------------------------------------- /code/fetch_subs.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eu 4 | 5 | # fetch subtitles for video file(s) if there is none 6 | mkdir -p .datalad/status 7 | subs_done=.datalad/status/fetched-subs.log 8 | touch "$subs_done" # to >> or grep at the beginning of the universe 9 | 10 | for f in "$@"; do 11 | fbase=${f%.*} 12 | vtts=$(/bin/ls -d "$fbase".*.vtt 2>/dev/null || :) 13 | if [ ! 
-z "$vtts" ]; then 14 | # echo "$fbase: already has some vtts" # : $vtts" 15 | continue 16 | fi 17 | if grep -q "^$f" $subs_done; then 18 | # echo "$f: already was getting subs, might have none" 19 | continue 20 | fi 21 | url=$(git annex whereis --in web "$f" | awk '/^ *web:/{print $2;}') 22 | echo "$fbase: getting some for $url" 23 | yt-dlp --write-subs --write-auto-subs -k --sub-lang=en --skip-download -o "$fbase" "$url" && status=ok || status=error 24 | echo -e "$f\t$url\t$(date)\t$status" >> "$subs_done" 25 | done 26 | -------------------------------------------------------------------------------- /DataLad/Demo__Fully_recomputing_a_real_scientific_paper__DIY_.en.vtt: -------------------------------------------------------------------------------- 1 | WEBVTT 2 | Kind: captions 3 | Language: en 4 | 5 | 00:00:05.750 --> 00:00:16.880 align:start position:0% 6 | 7 | [Music] 8 | 9 | 00:00:16.880 --> 00:00:16.890 align:start position:0% 10 | 11 | 12 | 13 | 00:00:16.890 --> 00:00:37.630 align:start position:0% 14 | 15 | [Music] 16 | 17 | 00:00:37.630 --> 00:00:37.640 align:start position:0% 18 | 19 | 20 | 21 | 00:00:37.640 --> 00:00:51.780 align:start position:0% 22 | 23 | [Music] 24 | 25 | 00:00:51.780 --> 00:00:51.790 align:start position:0% 26 | 27 | 28 | 29 | 00:00:51.790 --> 00:01:12.720 align:start position:0% 30 | 31 | [Music] 32 | 33 | 00:01:12.720 --> 00:01:12.730 align:start position:0% 34 | 35 | 36 | 37 | 00:01:12.730 --> 00:01:16.320 align:start position:0% 38 | 39 | yes 40 | 41 | 00:01:16.320 --> 00:01:16.330 align:start position:0% 42 | 43 | 44 | 45 | 00:01:16.330 --> 00:01:26.700 align:start position:0% 46 | 47 | [Music] 48 | 49 | 00:01:26.700 --> 00:01:26.710 align:start position:0% 50 | 51 | 52 | 53 | 00:01:26.710 --> 00:01:35.819 align:start position:0% 54 | 55 | [Music] 56 | 57 | -------------------------------------------------------------------------------- /.datalad/status/fetched-subs.log: 
-------------------------------------------------------------------------------- 1 | DataLad/01_Introduction_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=40ZcGp2vHXk Thu Jun 27 11:45:14 AM KST 2024 ok 2 | DataLad/02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=N7wMaaTAyzE Thu Jun 27 11:45:20 AM KST 2024 ok 3 | DataLad/03_Basics_of_Version_Control_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=IXSE-KtQVBs Thu Jun 27 11:45:27 AM KST 2024 ok 4 | DataLad/04_Version_control_underneath_the_hood__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=lBj5J7aKnPc Thu Jun 27 11:45:32 AM KST 2024 ok 5 | DataLad/05_Drop_and_remove_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=iulQIhPqRzQ Thu Jun 27 11:45:39 AM KST 2024 ok 6 | DataLad/06_Branching__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=8TyMg9SK35U Thu Jun 27 11:45:44 AM KST 2024 ok 7 | DataLad/07_Data_publication__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=WwSp22zVwV8 Thu Jun 27 11:45:50 AM KST 2024 ok 8 | DataLad/08_Data_publication__part_2__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=LQ3gmSOT-Io Thu Jun 27 11:45:55 AM KST 2024 ok 9 | DataLad/09_Collaboration_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=AuM6bc7-N6U Thu Jun 27 11:46:01 AM KST 2024 ok 10 | DataLad/10_Preview_of_reproducibility_features_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=AX3lIw9LQbA Thu Jun 27 11:46:06 AM KST 2024 ok 11 | DataLad/A_hands-on_introduction_to_DataLad.m https://www.youtube.com/watch?v=_I3JFhJJtW0 Thu Jun 27 11:46:12 AM KST 2024 ok 12 | DataLad/DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.m 
https://www.youtube.com/watch?v=sDP1jhRkKRo Thu Jun 27 11:46:17 AM KST 2024 ok 13 | DataLad/DataLad__-_Decentralized_Management_of_Digital_Objects_for_Open_Science.m https://www.youtube.com/watch?v=pIGFS8XDjco Thu Jun 27 11:46:23 AM KST 2024 ok 14 | DataLad/DataLad_for_Machine_Learning_-_An_Introduction.m https://www.youtube.com/watch?v=oXd1GPf-Zv4 Thu Jun 27 11:46:28 AM KST 2024 ok 15 | DataLad/DataLad_vs_Git_Git-annex_for_modular_data_management.m https://www.youtube.com/watch?v=Yrg6DgOcbPE Thu Jun 27 11:46:38 AM KST 2024 ok 16 | DataLad/Data_versioning_and_transformation_with_DataLad.m https://www.youtube.com/watch?v=wimd1uhIJ8g Thu Jun 27 11:46:43 AM KST 2024 ok 17 | DataLad/Demo__Fully_recomputing_a_real_scientific_paper__DIY_.m https://www.youtube.com/watch?v=nhLqmF58SLQ Thu Jun 27 11:46:49 AM KST 2024 ok 18 | DataLad/FAIRly_big__A_framework_for_computationally_reproducible_processing_of_large_scale_data.m https://www.youtube.com/watch?v=YDtEKUWUPTQ Thu Jun 27 11:46:55 AM KST 2024 ok 19 | DataLad/Follow_the_rabbits__The_2020_OHBM_Brainhack_Traintrack_Session_on_DataLad.m https://www.youtube.com/watch?v=L5A0MXqFrOY Thu Jun 27 11:47:00 AM KST 2024 ok 20 | DataLad/How_to_introduce_data_management_technology_without_sinking_the_ship_.m https://www.youtube.com/watch?v=uH75kYgwLH4 Thu Jun 27 11:47:05 AM KST 2024 ok 21 | DataLad/OHBM_2022_Educational_course__How_to_Write_a_Re-executable_Publication__-_What_is_DataLad_.m https://www.youtube.com/watch?v=s1zrB_sDbDU Thu Jun 27 11:47:10 AM KST 2024 ok 22 | DataLad/OHBM_Poster_presentation__2057__FAIRly_big.m https://www.youtube.com/watch?v=YvZacWgGRZY Thu Jun 27 11:47:15 AM KST 2024 ok 23 | DataLad/OHBM_Poster_presentation__844__DataCat.m https://www.youtube.com/watch?v=4GERwj49KFc Thu Jun 27 11:47:19 AM KST 2024 ok 24 | DataLad/Perpetual_decentralized_management_of_digital_objects_for_collaborative_open_science.m https://www.youtube.com/watch?v=SJ64rSMD9PU Thu Jun 27 11:47:24 AM KST 2024 ok 25 | 
DataLad/Research_Data_Management_01.m https://www.youtube.com/watch?v=fL3DWzSWFL8 Thu Jun 27 11:47:28 AM KST 2024 ok 26 | DataLad/Research_Data_Management_02.m https://www.youtube.com/watch?v=GrOfE8jv12s Thu Jun 27 11:47:32 AM KST 2024 ok 27 | DataLad/Research_Data_Management_03.m https://www.youtube.com/watch?v=lO4yfl30_uc Thu Jun 27 11:47:37 AM KST 2024 ok 28 | DataLad/Research_Data_Management_04.m https://www.youtube.com/watch?v=3ePgH-kK8h8 Thu Jun 27 11:47:41 AM KST 2024 ok 29 | DataLad/What_is_DataLad_.m https://www.youtube.com/watch?v=IN0vowZ67vs Thu Jun 27 11:47:46 AM KST 2024 ok 30 | -------------------------------------------------------------------------------- /DataLad/What_is_DataLad_.en.vtt: -------------------------------------------------------------------------------- 1 | WEBVTT 2 | Kind: captions 3 | Language: en 4 | 5 | 00:00:01.920 --> 00:00:12.000 6 | What is DataLad? Everyone needs data! Data are  7 | indispensable for learning, understanding, and   8 | 9 | 00:00:12.000 --> 00:00:19.520 10 | decision making. Working with data responsibly for  11 | the benefit of our communities and the environment   12 | 13 | 00:00:20.160 --> 00:00:26.880 14 | requires us to bring together diverse expertise  15 | and to share findings in a way that fosters trust   16 | 17 | 00:00:26.880 --> 00:00:32.400 18 | in the facts that we base our  19 | actions on. But how can we achieve   20 | 21 | 00:00:32.400 --> 00:00:37.600 22 | this in a world that overwhelms us with  23 | information and limitless possibilities?   24 | 25 | 00:00:39.280 --> 00:00:46.720 26 | Here DataLad can help by recording and documenting  27 | the collaborative process of distilling knowledge   28 | 29 | 00:00:46.720 --> 00:00:55.520 30 | from data, such that it becomes more accessible  31 | and ultimately verifiable. Any investigation   32 | 33 | 00:00:55.520 --> 00:01:03.840 34 | is built on facts. Many of them are digital.  
35 | They come in text documents, program code,   36 | 37 | 00:01:04.400 --> 00:01:12.240 38 | and binary files such as images. In our connected  39 | world, files cannot only be on our own computers,   40 | 41 | 00:01:12.240 --> 00:01:17.760 42 | but also at many different cloud  43 | services. Wherever data lives,   44 | 45 | 00:01:17.760 --> 00:01:25.440 46 | DataLad can keep track of them to do this! It  47 | provides a data structure, the data set. It can   48 | 49 | 00:01:25.440 --> 00:01:33.520 50 | reference the precise identity and availability  51 | of ANY digital object. Importantly, it can record   52 | 53 | 00:01:33.520 --> 00:01:39.840 54 | how exactly data and program code are used  55 | to derive the results of an investigation. 56 | 57 | 00:01:42.640 --> 00:01:50.400 58 | DataLad can clone a dataset to another location  59 | on a different computer. Like the original dataset   60 | 61 | 00:01:50.400 --> 00:01:56.080 62 | each clone is completely self-contained.  63 | The information in a dataset can be used   64 | 65 | 00:01:56.080 --> 00:02:01.360 66 | to retrieve all data at the precise version  67 | that is needed, either from the internet   68 | 69 | 00:02:01.360 --> 00:02:09.840 70 | or from other clones of a dataset. The process  71 | records in a dataset enable reproducing results   72 | 73 | 00:02:09.840 --> 00:02:15.440 74 | by a collaborator and make it possible to  75 | verify exactly what was done to get them.   76 | 77 | 00:02:17.440 --> 00:02:24.000 78 | Dataset content looks just like any other  79 | directory on a computer. But datasets can   80 | 81 | 00:02:24.000 --> 00:02:30.400 82 | also be nested to form reusable modular units  83 | that can be assembled into bigger projects.   
84 | 85 | 00:02:32.240 --> 00:02:38.720 86 | With these basic principles DataLad can  87 | support a diversity of tasks, whether that is   88 | 89 | 00:02:38.720 --> 00:02:44.640 90 | editing a video clip by yourself or collaborative  91 | research on the world's most powerful computers.   92 | 93 | 00:02:46.480 --> 00:02:53.200 94 | Bringing structure to the data flood is no easy  95 | task. But we are a community of people working   96 | 97 | 00:02:53.200 --> 00:03:00.160 98 | hard to make the tools more accessible every day.  99 | Online training resources offer a convenient start   100 | 101 | 00:03:00.160 --> 00:03:07.520 102 | to working with DataLad. A comprehensive handbook  103 | is your guide for a deep dive into all the   104 | 105 | 00:03:07.520 --> 00:03:16.320 106 | features DataLad has to offer. DataLad is free  107 | and open source software. Anyone is free to use   108 | 109 | 00:03:16.320 --> 00:03:21.920 110 | for any purpose. It has been adapted to  111 | facilitate scientific collaborations,   112 | 113 | 00:03:22.560 --> 00:03:29.160 114 | working with the world's largest health data  115 | sets, and to enable reproducible research. [Music]   116 | 117 | 00:03:31.440 --> 00:03:45.840 118 | Join us to help better  119 | serve even more communities! 120 | 121 | -------------------------------------------------------------------------------- /DataLad/OHBM_Poster_presentation__844__DataCat.srt: -------------------------------------------------------------------------------- 1 | 1 2 | 00:00:01,800 --> 00:00:04,500 3 | Welcome to the presentation of DataLad Catalog: 4 | 5 | 2 6 | 00:00:05,000 --> 00:00:07,600 7 | a free and open source command line tool, with a Python API, 8 | 9 | 3 10 | 00:00:08,000 --> 00:00:13,500 11 | that lets you create user-friendly, browser-based data catalogs from structured metadata 
12 | 13 | 4 14 | 00:00:14,500 --> 00:00:20,800 15 | The importance and benefits of making research data Findable, Accessible, Interoperable, and Reusable are clear. 16 | 17 | 6 18 | 00:00:21,000 --> 00:00:28,500 19 | But of equal importance are our legal and ethical obligations to protect the personal data privacy of research participants. 20 | 21 | 7 22 | 00:00:29,000 --> 00:00:35,000 23 | So we are struck with this apparent contradiction: how can we share our data openly…yet keep it secure and protected? 24 | 25 | 8 26 | 00:00:35,200 --> 00:00:40,000 27 | Should we err on the side of FAIRness, or of data privacy? And do we even have to choose? 28 | 29 | 9 30 | 00:00:41,000 --> 00:00:44,000 31 | Ideally, no. And in practice, also no, 32 | 33 | 10 34 | 00:00:44,500 --> 00:00:50,500 35 | because we have a powerful opportunity in the form of linked, structured, and machine-readable metadata. 36 | 37 | 11 38 | 00:00:51,000 --> 00:00:57,400 39 | Metadata provides not only high-level information about our research data, such as study and data acquisition parameters, 40 | 41 | 12 42 | 00:00:57,800 --> 00:01:02,700 43 | but also the descriptive aspects of each file in the dataset, such as file paths, sizes, and formats. 44 | 45 | 13 46 | 00:01:03,000 --> 00:01:09,500 47 | With this metadata, we can create an abstract representation of the full dataset that is separate from the actual data content. 48 | 49 | 14 50 | 00:01:10,000 --> 00:01:15,000 51 | This means that the content can be stored securely, while we openly share the metadata to make our work more FAIR. 52 | 53 | 15 54 | 00:01:16,000 --> 00:01:21,400 55 | As an added benefit, structured and machine-readable metadata that conforms to industry standards 56 | 57 | 16 58 | 00:01:21,500 --> 00:01:27,000 59 | improves the interoperability and allows the use of automated pipelines and tools. 
60 | 61 | 17 62 | 00:01:30,500 --> 00:01:36,500 63 | These ideals are achievable in practice, with a toolset that includes the distributed data management system DataLad, 64 | 65 | 18 66 | 00:01:36,800 --> 00:01:40,500 67 | and its extensions for metadata handling and catalog generation. 68 | 69 | 19 70 | 00:01:41,000 --> 00:01:47,800 71 | DataLad can be used for decentralised management of data as lightweight, portable and extensible representations. 72 | 73 | 20 74 | 00:01:48,000 --> 00:01:54,500 75 | Datalad-metalad can extract structured high- and low-level metadata and associate it with these datasets or with individual files. 76 | 77 | 21 78 | 00:01:55,300 --> 00:02:00,800 79 | And at the end of the workflow, Datalad-catalog can turn the structured metadata into a user-friendly data browser! 80 | 81 | 22 82 | 00:02:02,200 --> 00:02:04,800 83 | So how does the catalog generation process work? 84 | 85 | 23 86 | 00:02:05,000 --> 00:02:08,500 87 | Metadata extracted from various sources, even custom sources, 88 | 89 | 24 90 | 00:02:08,700 --> 00:02:10,800 91 | can be aggregated and added to a catalog. 92 | 93 | 25 94 | 00:02:11,000 --> 00:02:13,200 95 | Incoming metadata will first be validated 96 | 97 | 26 98 | 00:02:13,300 --> 00:02:14,800 99 | against a catalog-specific schema, 100 | 101 | 27 102 | 00:02:15,000 --> 00:02:18,500 103 | before the catalog is generated or individual entries are added. 104 | 105 | 28 106 | 00:02:19,000 --> 00:02:20,500 107 | Once the process is finished, 108 | 109 | 29 110 | 00:02:20,800 --> 00:02:23,300 111 | the output is a set of structured metadata files, 112 | 113 | 29 114 | 00:02:23,500 --> 00:02:26,000 115 | as well as a Vue.js-based browser interface 116 | 117 | 30 118 | 00:02:26,400 --> 00:02:28,800 119 | that understands how to render this metadata. 
120 | 121 | 31 122 | 00:02:29,900 --> 00:02:31,100 123 | What is left for the user 124 | 125 | 32 126 | 00:02:31,300 --> 00:02:33,500 127 | is to host this content on their platform of choice 128 | 129 | 32 130 | 00:02:33,600 --> 00:02:35,300 131 | and serve it for the world to see. 132 | 133 | 33 134 | 00:02:36,300 --> 00:02:42,800 135 | Datalad catalog brings the powerful functionality of decentralised metadata handling and data publishing into the hands of users, 136 | 137 | 34 138 | 00:02:43,000 --> 00:02:49,000 139 | preventing dependence on centralised infrastructure and keeping private data secure while adhering to FAIR principles. 140 | 141 | 35 142 | 00:02:49,600 --> 00:02:53,000 143 | Please explore the demo catalog, walk through the interactive tutorial 144 | 145 | 36 146 | 00:02:53,100 --> 00:02:58,000 147 | or visit the codebase to start using or contributing to datalad catalog. 148 | -------------------------------------------------------------------------------- /DataLad/OHBM_Poster_presentation__844__DataCat.en.vtt: -------------------------------------------------------------------------------- 1 | WEBVTT 2 | Kind: captions 3 | Language: en 4 | 5 | 00:00:02.080 --> 00:00:03.750 align:start position:0% 6 | 7 | welcome<00:00:02.399> to<00:00:02.560> the<00:00:02.639> presentation<00:00:03.199> of<00:00:03.280> datalit 8 | 9 | 00:00:03.750 --> 00:00:03.760 align:start position:0% 10 | welcome to the presentation of datalit 11 | 12 | 13 | 00:00:03.760 --> 00:00:05.829 align:start position:0% 14 | welcome to the presentation of datalit 15 | catalog<00:00:04.560> a<00:00:04.720> free<00:00:04.960> and<00:00:05.120> open<00:00:05.279> source<00:00:05.520> command 16 | 17 | 00:00:05.829 --> 00:00:05.839 align:start position:0% 18 | catalog a free and open source command 19 | 20 | 21 | 00:00:05.839 --> 00:00:07.990 align:start position:0% 22 | catalog a free and open source command 23 | line<00:00:06.080> tool<00:00:06.240> with<00:00:06.399> 
a<00:00:06.480> python<00:00:06.879> api<00:00:07.520> that<00:00:07.759> lets 24 | 25 | 00:00:07.990 --> 00:00:08.000 align:start position:0% 26 | line tool with a python api that lets 27 | 28 | 29 | 00:00:08.000 --> 00:00:10.070 align:start position:0% 30 | line tool with a python api that lets 31 | you<00:00:08.160> create<00:00:08.559> user-friendly<00:00:09.360> browser-based 32 | 33 | 00:00:10.070 --> 00:00:10.080 align:start position:0% 34 | you create user-friendly browser-based 35 | 36 | 37 | 00:00:10.080 --> 00:00:15.030 align:start position:0% 38 | you create user-friendly browser-based 39 | data<00:00:10.400> catalogs<00:00:10.960> from<00:00:11.120> structured<00:00:11.759> metadata 40 | 41 | 00:00:15.030 --> 00:00:15.040 align:start position:0% 42 | 43 | 44 | 45 | 00:00:15.040 --> 00:00:16.550 align:start position:0% 46 | 47 | the<00:00:15.200> importance<00:00:15.679> and<00:00:15.679> benefits<00:00:16.160> of<00:00:16.240> making 48 | 49 | 00:00:16.550 --> 00:00:16.560 align:start position:0% 50 | the importance and benefits of making 51 | 52 | 53 | 00:00:16.560 --> 00:00:18.710 align:start position:0% 54 | the importance and benefits of making 55 | research<00:00:16.960> data<00:00:17.279> finable<00:00:17.920> accessible 56 | 57 | 00:00:18.710 --> 00:00:18.720 align:start position:0% 58 | research data finable accessible 59 | 60 | 61 | 00:00:18.720 --> 00:00:21.189 align:start position:0% 62 | research data finable accessible 63 | interoperable<00:00:19.359> and<00:00:19.520> reusable<00:00:20.080> are<00:00:20.240> clear<00:00:21.039> but 64 | 65 | 00:00:21.189 --> 00:00:21.199 align:start position:0% 66 | interoperable and reusable are clear but 67 | 68 | 69 | 00:00:21.199 --> 00:00:23.029 align:start position:0% 70 | interoperable and reusable are clear but 71 | of<00:00:21.439> equal<00:00:21.760> importance 72 | 73 | 00:00:23.029 --> 00:00:23.039 align:start position:0% 74 | of equal importance 75 | 76 | 77 | 00:00:23.039 --> 00:00:25.269 
of equal importance are our legal and ethical obligations to protect the personal data privacy of research participants. So we are stuck with this apparent contradiction: how can we share our data openly, yet keep it secure and protected? Should we err on the side of FAIRness or of data privacy? Or do we even have to choose? Ideally, no. And in practice, also no, because we have a powerful opportunity in the form of linked, structured, and machine-readable metadata.

Metadata provides not only high-level information about our research data, such as study and data acquisition parameters, but also the descriptive aspects of each file in the dataset, such as file paths, sizes, and formats. With this metadata we can create an abstract representation of the full dataset that is separate from the actual data content. This means that the content can be stored securely while we openly share the metadata to make our work more FAIR. As an added benefit, structured and machine-readable metadata that conforms to industry standards improves interoperability and allows the use of automated pipelines and tools.

These ideals are achievable in practice with a toolset that includes the distributed data management system DataLad and its extensions for metadata handling and catalog generation. DataLad datasets can be used for decentralized management of data as lightweight, portable, and extensible representations. DataLad MetaLad can extract structured high- and low-level metadata and associate it with datasets or with individual files. And at the end of the workflow, DataLad Catalog can turn the structured metadata into a user-friendly data browser.

So how does this catalog generation process work? Well, metadata extracted from various sources, even custom sources, can be aggregated and added to a catalog. Incoming metadata will first be validated against a catalog-specific schema before the catalog is generated or individual entries are added. Once the process is finished, the output is a set of structured metadata files as well as a Vue.js-based browser interface that understands how to render this metadata. What is left for the user is to host this content on their platform of choice and serve it for the world to see.

DataLad Catalog brings the powerful functionality of decentralized metadata handling and data publishing into the hands of users, preventing dependence on centralized infrastructure and keeping private data secure while adhering to FAIR principles. Please explore the demo catalog, walk through the interactive tutorial, or visit the code base to start using or contributing to DataLad Catalog. Thank you.

-------------------------------------------------------------------------------- /DataLad/OHBM_Poster_presentation__2057__FAIRly_big.en.vtt: --------------------------------------------------------------------------------

WEBVTT
Kind: captions
Language: en

[Music]
Once upon a time, there was an institute with access to a variety of large neuroscientific datasets. Dozens of researchers depended on pre-processed versions of these
datasets for the project. The central, coordinated pre-processing efforts, however, suffered from a lack of transparency and reproducibility: there was pre-processed data, but over time the knowledge of who created it, how it was created, or where it was stored was lost. With this lack of transparency, reuse was difficult and ceased. But when every research group pre-processed the data individually, it resulted not only in unsustainable duplicate computing efforts, but also filled up the disk space of the compute cluster in no time. And when datasets became too big to be computed even once, things had to change.

My name is Adina, and my colleagues and I created a framework to not only make processing of large-scale datasets possible, but its outcomes also easily shareable, transparent, and automatically recomputable.

We start with a DataLad dataset on a computational cluster. Datasets are based on Git repositories but can version control digital files of any size, such as the UK Biobank data, which we retrieved using datalad-ukbiobank. As you can see here, datasets can contain other datasets. This is useful to structure large datasets into smaller units, but also to link datasets as dependencies to one another. We use one dataset to link UKB data and a software dataset with a computational pipeline, in the form of a software container, as analysis dependencies. Datasets can drop and re-retrieve file content that is hosted elsewhere. Despite tracking terabytes of data, datasets can thus be tiny in size. This feature is
00:01:40.880 align:start position:0% 386 | thus be tiny in size this feature is 387 | 388 | 389 | 00:01:40.880 --> 00:01:42.469 align:start position:0% 390 | thus be tiny in size this feature is 391 | used<00:01:41.200> to<00:01:41.360> create<00:01:41.600> hundreds<00:01:42.000> of<00:01:42.159> single 392 | 393 | 00:01:42.469 --> 00:01:42.479 align:start position:0% 394 | used to create hundreds of single 395 | 396 | 397 | 00:01:42.479 --> 00:01:44.710 align:start position:0% 398 | used to create hundreds of single 399 | subject<00:01:42.880> analysis<00:01:43.600> that<00:01:43.840> only<00:01:44.159> retrieve<00:01:44.560> the 400 | 401 | 00:01:44.710 --> 00:01:44.720 align:start position:0% 402 | subject analysis that only retrieve the 403 | 404 | 405 | 00:01:44.720 --> 00:01:46.149 align:start position:0% 406 | subject analysis that only retrieve the 407 | files<00:01:45.040> they<00:01:45.200> need 408 | 409 | 00:01:46.149 --> 00:01:46.159 align:start position:0% 410 | files they need 411 | 412 | 413 | 00:01:46.159 --> 00:01:48.550 align:start position:0% 414 | files they need 415 | at<00:01:46.399> analysis<00:01:46.880> execution<00:01:47.680> a<00:01:47.840> job<00:01:48.079> scheduler 416 | 417 | 00:01:48.550 --> 00:01:48.560 align:start position:0% 418 | at analysis execution a job scheduler 419 | 420 | 421 | 00:01:48.560 --> 00:01:50.469 align:start position:0% 422 | at analysis execution a job scheduler 423 | distributes<00:01:49.119> the<00:01:49.200> analysis<00:01:49.759> over<00:01:50.000> available 424 | 425 | 00:01:50.469 --> 00:01:50.479 align:start position:0% 426 | distributes the analysis over available 427 | 428 | 429 | 00:01:50.479 --> 00:01:52.710 align:start position:0% 430 | distributes the analysis over available 431 | compute<00:01:50.880> nodes<00:01:51.280> each<00:01:51.520> compute<00:01:51.920> node<00:01:52.399> clones 432 | 433 | 00:01:52.710 --> 00:01:52.720 align:start position:0% 434 | compute nodes each compute node clones 
435 | 436 | 437 | 00:01:52.720 --> 00:01:54.230 align:start position:0% 438 | compute nodes each compute node clones 439 | the<00:01:52.799> topmost<00:01:53.280> dataset<00:01:53.759> to<00:01:53.840> create<00:01:54.159> a 440 | 441 | 00:01:54.230 --> 00:01:54.240 align:start position:0% 442 | the topmost dataset to create a 443 | 444 | 445 | 00:01:54.240 --> 00:01:56.469 align:start position:0% 446 | the topmost dataset to create a 447 | short-lived<00:01:54.799> ephemeral<00:01:55.280> clone<00:01:55.840> resulting<00:01:56.320> in 448 | 449 | 00:01:56.469 --> 00:01:56.479 align:start position:0% 450 | short-lived ephemeral clone resulting in 451 | 452 | 453 | 00:01:56.479 --> 00:01:58.789 align:start position:0% 454 | short-lived ephemeral clone resulting in 455 | a<00:01:56.560> network<00:01:56.960> of<00:01:57.040> temporary<00:01:57.600> dataset<00:01:58.000> copies 456 | 457 | 00:01:58.789 --> 00:01:58.799 align:start position:0% 458 | a network of temporary dataset copies 459 | 460 | 461 | 00:01:58.799 --> 00:02:01.030 align:start position:0% 462 | a network of temporary dataset copies 463 | each<00:01:59.040> ephemeral<00:01:59.600> clone<00:02:00.159> is<00:02:00.320> tasked<00:02:00.640> with<00:02:00.799> one 464 | 465 | 00:02:01.030 --> 00:02:01.040 align:start position:0% 466 | each ephemeral clone is tasked with one 467 | 468 | 469 | 00:02:01.040 --> 00:02:03.749 align:start position:0% 470 | each ephemeral clone is tasked with one 471 | subset<00:02:01.439> of<00:02:01.600> analysis<00:02:02.079> execution<00:02:03.119> this<00:02:03.360> job<00:02:03.680> is 472 | 473 | 00:02:03.749 --> 00:02:03.759 align:start position:0% 474 | subset of analysis execution this job is 475 | 476 | 477 | 00:02:03.759 --> 00:02:05.590 align:start position:0% 478 | subset of analysis execution this job is 479 | performed<00:02:04.240> with<00:02:04.399> a<00:02:04.479> data<00:02:04.719> that<00:02:04.880> contains<00:02:05.360> one 480 | 481 | 00:02:05.590 --> 
00:02:05.600 align:start position:0% 482 | performed with a data that contains one 483 | 484 | 485 | 00:02:05.600 --> 00:02:07.670 align:start position:0% 486 | performed with a data that contains one 487 | call<00:02:06.240> its<00:02:06.479> advantage<00:02:06.960> is<00:02:07.119> that<00:02:07.280> the<00:02:07.439> full 488 | 489 | 00:02:07.670 --> 00:02:07.680 align:start position:0% 490 | call its advantage is that the full 491 | 492 | 493 | 00:02:07.680 --> 00:02:09.589 align:start position:0% 494 | call its advantage is that the full 495 | digital<00:02:08.160> analysis<00:02:08.640> provenance<00:02:09.119> is<00:02:09.200> captured 496 | 497 | 00:02:09.589 --> 00:02:09.599 align:start position:0% 498 | digital analysis provenance is captured 499 | 500 | 501 | 00:02:09.599 --> 00:02:11.270 align:start position:0% 502 | digital analysis provenance is captured 503 | in<00:02:09.759> a<00:02:09.840> structured<00:02:10.239> record<00:02:10.560> that<00:02:10.720> can<00:02:10.879> be<00:02:11.039> used 504 | 505 | 00:02:11.270 --> 00:02:11.280 align:start position:0% 506 | in a structured record that can be used 507 | 508 | 509 | 00:02:11.280 --> 00:02:13.190 align:start position:0% 510 | in a structured record that can be used 511 | for<00:02:11.360> automatic<00:02:11.840> re-execution 512 | 513 | 00:02:13.190 --> 00:02:13.200 align:start position:0% 514 | for automatic re-execution 515 | 516 | 517 | 00:02:13.200 --> 00:02:14.710 align:start position:0% 518 | for automatic re-execution 519 | provenance<00:02:13.680> and<00:02:13.760> computed<00:02:14.239> results<00:02:14.640> are 520 | 521 | 00:02:14.710 --> 00:02:14.720 align:start position:0% 522 | provenance and computed results are 523 | 524 | 525 | 00:02:14.720 --> 00:02:17.190 align:start position:0% 526 | provenance and computed results are 527 | saved<00:02:15.280> pushed<00:02:16.080> and<00:02:16.400> when<00:02:16.640> all<00:02:16.800> jobs<00:02:17.040> are 528 | 529 | 00:02:17.190 --> 
00:02:17.200 align:start position:0% 530 | saved pushed and when all jobs are 531 | 532 | 533 | 00:02:17.200 --> 00:02:19.350 align:start position:0% 534 | saved pushed and when all jobs are 535 | finished<00:02:17.840> merged<00:02:18.319> back<00:02:18.560> into<00:02:18.800> the<00:02:18.959> central 536 | 537 | 00:02:19.350 --> 00:02:19.360 align:start position:0% 538 | finished merged back into the central 539 | 540 | 541 | 00:02:19.360 --> 00:02:20.470 align:start position:0% 542 | finished merged back into the central 543 | dataset 544 | 545 | 00:02:20.470 --> 00:02:20.480 align:start position:0% 546 | dataset 547 | 548 | 549 | 00:02:20.480 --> 00:02:21.990 align:start position:0% 550 | dataset 551 | throughout<00:02:20.879> the<00:02:21.040> process<00:02:21.440> a<00:02:21.599> special 552 | 553 | 00:02:21.990 --> 00:02:22.000 align:start position:0% 554 | throughout the process a special 555 | 556 | 557 | 00:02:22.000 --> 00:02:23.670 align:start position:0% 558 | throughout the process a special 559 | internal<00:02:22.480> data<00:02:22.720> set<00:02:22.959> representation 560 | 561 | 00:02:23.670 --> 00:02:23.680 align:start position:0% 562 | internal data set representation 563 | 564 | 565 | 00:02:23.680 --> 00:02:25.990 align:start position:0% 566 | internal data set representation 567 | minimizes<00:02:24.239> disk<00:02:24.560> space<00:02:24.879> and<00:02:24.959> inode<00:02:25.360> usage<00:02:25.840> and 568 | 569 | 00:02:25.990 --> 00:02:26.000 align:start position:0% 570 | minimizes disk space and inode usage and 571 | 572 | 573 | 00:02:26.000 --> 00:02:27.510 align:start position:0% 574 | minimizes disk space and inode usage and 575 | provides<00:02:26.400> optional<00:02:26.800> encryption<00:02:27.200> during 576 | 577 | 00:02:27.510 --> 00:02:27.520 align:start position:0% 578 | provides optional encryption during 579 | 580 | 581 | 00:02:27.520 --> 00:02:29.830 align:start position:0% 582 | provides optional encryption during 583 | 
transport<00:02:28.480> this<00:02:28.720> allows<00:02:29.040> processing<00:02:29.440> of<00:02:29.599> data 584 | 585 | 00:02:29.830 --> 00:02:29.840 align:start position:0% 586 | transport this allows processing of data 587 | 588 | 589 | 00:02:29.840 --> 00:02:31.509 align:start position:0% 590 | transport this allows processing of data 591 | sets<00:02:30.080> that<00:02:30.239> are<00:02:30.319> larger<00:02:30.720> than<00:02:30.879> the<00:02:31.040> available 592 | 593 | 00:02:31.509 --> 00:02:31.519 align:start position:0% 594 | sets that are larger than the available 595 | 596 | 597 | 00:02:31.519 --> 00:02:33.270 align:start position:0% 598 | sets that are larger than the available 599 | resources<00:02:32.080> with<00:02:32.319> only<00:02:32.560> minimal<00:02:32.879> software 600 | 601 | 00:02:33.270 --> 00:02:33.280 align:start position:0% 602 | resources with only minimal software 603 | 604 | 605 | 00:02:33.280 --> 00:02:35.270 align:start position:0% 606 | resources with only minimal software 607 | requirements<00:02:33.840> on<00:02:34.000> the<00:02:34.080> server<00:02:34.400> side 608 | 609 | 00:02:35.270 --> 00:02:35.280 align:start position:0% 610 | requirements on the server side 611 | 612 | 613 | 00:02:35.280 --> 00:02:37.270 align:start position:0% 614 | requirements on the server side 615 | because<00:02:35.519> data<00:02:35.920> sets<00:02:36.239> can<00:02:36.480> be<00:02:36.640> easily<00:02:37.040> shared 616 | 617 | 00:02:37.270 --> 00:02:37.280 align:start position:0% 618 | because data sets can be easily shared 619 | 620 | 621 | 00:02:37.280 --> 00:02:39.190 align:start position:0% 622 | because data sets can be easily shared 623 | with<00:02:37.440> appropriate<00:02:38.000> audiences<00:02:38.560> the<00:02:38.720> resulting 624 | 625 | 00:02:39.190 --> 00:02:39.200 align:start position:0% 626 | with appropriate audiences the resulting 627 | 628 | 629 | 00:02:39.200 --> 00:02:40.790 align:start position:0% 630 | with 
appropriate audiences the resulting 631 | data<00:02:39.440> sets<00:02:39.680> can<00:02:39.840> be<00:02:40.000> distributed<00:02:40.640> in<00:02:40.720> a 632 | 633 | 00:02:40.790 --> 00:02:40.800 align:start position:0% 634 | data sets can be distributed in a 635 | 636 | 637 | 00:02:40.800 --> 00:02:42.869 align:start position:0% 638 | data sets can be distributed in a 639 | streamlined<00:02:41.360> transparent<00:02:42.080> and<00:02:42.239> reusable 640 | 641 | 00:02:42.869 --> 00:02:42.879 align:start position:0% 642 | streamlined transparent and reusable 643 | 644 | 645 | 00:02:42.879 --> 00:02:45.430 align:start position:0% 646 | streamlined transparent and reusable 647 | manner<00:02:43.519> as<00:02:43.840> jobs<00:02:44.080> were<00:02:44.239> computed<00:02:44.720> in<00:02:44.879> isolated 648 | 649 | 00:02:45.430 --> 00:02:45.440 align:start position:0% 650 | manner as jobs were computed in isolated 651 | 652 | 653 | 00:02:45.440 --> 00:02:46.869 align:start position:0% 654 | manner as jobs were computed in isolated 655 | compute<00:02:45.760> environments<00:02:46.560> they<00:02:46.800> are 656 | 657 | 00:02:46.869 --> 00:02:46.879 align:start position:0% 658 | compute environments they are 659 | 660 | 661 | 00:02:46.879 --> 00:02:48.550 align:start position:0% 662 | compute environments they are 663 | automatically<00:02:47.519> portable<00:02:48.080> to<00:02:48.319> other 664 | 665 | 00:02:48.550 --> 00:02:48.560 align:start position:0% 666 | automatically portable to other 667 | 668 | 669 | 00:02:48.560 --> 00:02:50.550 align:start position:0% 670 | automatically portable to other 671 | infrastructure<00:02:49.519> as<00:02:49.680> long<00:02:49.840> as<00:02:50.000> data<00:02:50.239> light<00:02:50.400> and 672 | 673 | 00:02:50.550 --> 00:02:50.560 align:start position:0% 674 | infrastructure as long as data light and 675 | 676 | 677 | 00:02:50.560 --> 00:02:52.150 align:start position:0% 678 | infrastructure as long as data light and 679 
| the<00:02:50.640> employed<00:02:51.040> container<00:02:51.440> technology<00:02:52.000> are 680 | 681 | 00:02:52.150 --> 00:02:52.160 align:start position:0% 682 | the employed container technology are 683 | 684 | 685 | 00:02:52.160 --> 00:02:54.470 align:start position:0% 686 | the employed container technology are 687 | available<00:02:52.959> whoever<00:02:53.360> obtains<00:02:53.680> this<00:02:53.840> data<00:02:54.160> set 688 | 689 | 00:02:54.470 --> 00:02:54.480 align:start position:0% 690 | available whoever obtains this data set 691 | 692 | 693 | 00:02:54.480 --> 00:02:56.630 align:start position:0% 694 | available whoever obtains this data set 695 | can<00:02:54.640> thus<00:02:54.879> recompute<00:02:55.440> each<00:02:55.599> individual<00:02:56.239> job 696 | 697 | 00:02:56.630 --> 00:02:56.640 align:start position:0% 698 | can thus recompute each individual job 699 | 700 | 701 | 00:02:56.640 --> 00:03:00.680 align:start position:0% 702 | can thus recompute each individual job 703 | on<00:02:56.720> their<00:02:56.959> own<00:02:57.120> computer<00:02:57.680> automatically 704 | 705 | -------------------------------------------------------------------------------- /DataLad/02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.en.vtt: -------------------------------------------------------------------------------- 1 | WEBVTT 2 | Kind: captions 3 | Language: en 4 | 5 | 00:00:00.480 --> 00:00:02.629 align:start position:0% 6 | 7 | all<00:00:00.640> right<00:00:01.280> one<00:00:01.520> of<00:00:01.599> the 8 | 9 | 00:00:02.629 --> 00:00:02.639 align:start position:0% 10 | all right one of the 11 | 12 | 13 | 00:00:02.639 --> 00:00:04.870 align:start position:0% 14 | all right one of the 15 | things<00:00:02.960> we've<00:00:03.439> prepared<00:00:03.760> for<00:00:03.919> today<00:00:04.319> is 16 | 17 | 00:00:04.870 --> 00:00:04.880 align:start position:0% 18 | things we've prepared for today is 19 | 20 | 21 | 00:00:04.880 --> 
00:00:06.230 align:start position:0% 22 | things we've prepared for today is 23 | jupiter<00:00:05.359> hub 24 | 25 | 00:00:06.230 --> 00:00:06.240 align:start position:0% 26 | jupiter hub 27 | 28 | 29 | 00:00:06.240 --> 00:00:09.190 align:start position:0% 30 | jupiter hub 31 | so<00:00:06.480> it<00:00:06.640> is<00:00:06.879> one<00:00:07.200> of<00:00:07.440> the<00:00:08.000> ways<00:00:08.400> you<00:00:08.720> will<00:00:09.040> be 32 | 33 | 00:00:09.190 --> 00:00:09.200 align:start position:0% 34 | so it is one of the ways you will be 35 | 36 | 37 | 00:00:09.200 --> 00:00:11.110 align:start position:0% 38 | so it is one of the ways you will be 39 | able<00:00:09.679> to 40 | 41 | 00:00:11.110 --> 00:00:11.120 align:start position:0% 42 | able to 43 | 44 | 45 | 00:00:11.120 --> 00:00:14.150 align:start position:0% 46 | able to 47 | run<00:00:11.280> the<00:00:11.599> examples<00:00:12.639> uh<00:00:13.120> all<00:00:13.200> the<00:00:13.599> all<00:00:13.759> the<00:00:13.920> code 48 | 49 | 00:00:14.150 --> 00:00:14.160 align:start position:0% 50 | run the examples uh all the all the code 51 | 52 | 53 | 00:00:14.160 --> 00:00:17.029 align:start position:0% 54 | run the examples uh all the all the code 55 | examples<00:00:14.639> called<00:00:15.040> exercises<00:00:15.839> yourself 56 | 57 | 00:00:17.029 --> 00:00:17.039 align:start position:0% 58 | examples called exercises yourself 59 | 60 | 61 | 00:00:17.039 --> 00:00:19.109 align:start position:0% 62 | examples called exercises yourself 63 | you<00:00:17.119> can<00:00:17.359> of<00:00:17.440> course<00:00:17.680> use<00:00:17.920> your<00:00:18.320> own<00:00:18.480> computer 64 | 65 | 00:00:19.109 --> 00:00:19.119 align:start position:0% 66 | you can of course use your own computer 67 | 68 | 69 | 00:00:19.119 --> 00:00:21.429 align:start position:0% 70 | you can of course use your own computer 71 | but<00:00:19.279> you<00:00:19.439> can<00:00:19.680> also<00:00:20.000> use<00:00:20.400> this 72 | 73 
| 00:00:21.429 --> 00:00:21.439 align:start position:0% 74 | but you can also use this 75 | 76 | 77 | 00:00:21.439 --> 00:00:23.910 align:start position:0% 78 | but you can also use this 79 | shared<00:00:21.760> resource<00:00:22.720> so<00:00:23.039> the<00:00:23.279> url<00:00:23.760> is 80 | 81 | 00:00:23.910 --> 00:00:23.920 align:start position:0% 82 | shared resource so the url is 83 | 84 | 85 | 00:00:23.920 --> 00:00:27.990 align:start position:0% 86 | shared resource so the url is 87 | data-hub.inm7.de 88 | 89 | 00:00:27.990 --> 00:00:28.000 align:start position:0% 90 | 91 | 92 | 93 | 00:00:28.000 --> 00:00:30.710 align:start position:0% 94 | 95 | you<00:00:28.240> should<00:00:28.480> have<00:00:28.640> received<00:00:29.119> your<00:00:30.000> usernames 96 | 97 | 00:00:30.710 --> 00:00:30.720 align:start position:0% 98 | you should have received your usernames 99 | 100 | 101 | 00:00:30.720 --> 00:00:32.950 align:start position:0% 102 | you should have received your usernames 103 | in<00:00:30.880> your<00:00:31.119> emails<00:00:31.599> they're<00:00:32.000> usually<00:00:32.559> your 104 | 105 | 00:00:32.950 --> 00:00:32.960 align:start position:0% 106 | in your emails they're usually your 107 | 108 | 109 | 00:00:32.960 --> 00:00:35.190 align:start position:0% 110 | in your emails they're usually your 111 | first<00:00:33.760> usually<00:00:34.079> but<00:00:34.239> not<00:00:34.480> always<00:00:34.800> they<00:00:35.040> are 112 | 113 | 00:00:35.190 --> 00:00:35.200 align:start position:0% 114 | first usually but not always they are 115 | 116 | 117 | 00:00:35.200 --> 00:00:36.709 align:start position:0% 118 | first usually but not always they are 119 | your<00:00:35.440> first<00:00:35.760> names 120 | 121 | 00:00:36.709 --> 00:00:36.719 align:start position:0% 122 | your first names 123 | 124 | 125 | 00:00:36.719 --> 00:00:39.590 align:start position:0% 126 | your first names 127 | so<00:00:37.120> if<00:00:37.360> you<00:00:37.920> if<00:00:38.079> 
you<00:00:38.559> if<00:00:38.719> you<00:00:39.040> if<00:00:39.200> you're<00:00:39.440> going 128 | 129 | 00:00:39.590 --> 00:00:39.600 align:start position:0% 130 | so if you if you if you if you're going 131 | 132 | 133 | 00:00:39.600 --> 00:00:42.069 align:start position:0% 134 | so if you if you if you if you're going 135 | to<00:00:39.760> use<00:00:39.920> the<00:00:40.000> jupiter<00:00:40.399> hub<00:00:40.640> and<00:00:40.879> you<00:00:41.280> don't 136 | 137 | 00:00:42.069 --> 00:00:42.079 align:start position:0% 138 | to use the jupiter hub and you don't 139 | 140 | 141 | 00:00:42.079 --> 00:00:43.350 align:start position:0% 142 | to use the jupiter hub and you don't 143 | don't<00:00:42.320> have<00:00:42.480> the 144 | 145 | 00:00:43.350 --> 00:00:43.360 align:start position:0% 146 | don't have the 147 | 148 | 149 | 00:00:43.360 --> 00:00:45.670 align:start position:0% 150 | don't have the 151 | username<00:00:44.000> then<00:00:44.399> please<00:00:44.960> then<00:00:45.120> please<00:00:45.360> let<00:00:45.600> us 152 | 153 | 00:00:45.670 --> 00:00:45.680 align:start position:0% 154 | username then please then please let us 155 | 156 | 157 | 00:00:45.680 --> 00:00:46.869 align:start position:0% 158 | username then please then please let us 159 | know 160 | 161 | 00:00:46.869 --> 00:00:46.879 align:start position:0% 162 | know 163 | 164 | 165 | 00:00:46.879 --> 00:00:49.990 align:start position:0% 166 | know 167 | the<00:00:47.039> password<00:00:47.680> is<00:00:48.079> whatever<00:00:48.719> you<00:00:49.039> use<00:00:49.440> for<00:00:49.760> the 168 | 169 | 00:00:49.990 --> 00:00:50.000 align:start position:0% 170 | the password is whatever you use for the 171 | 172 | 173 | 00:00:50.000 --> 00:00:51.270 align:start position:0% 174 | the password is whatever you use for the 175 | first<00:00:50.320> access 176 | 177 | 00:00:51.270 --> 00:00:51.280 align:start position:0% 178 | first access 179 | 180 | 181 | 00:00:51.280 --> 00:00:53.350 
align:start position:0% 182 | first access 183 | so<00:00:51.680> whatever<00:00:52.079> you<00:00:52.239> type<00:00:52.480> for<00:00:52.640> the<00:00:52.800> first<00:00:53.039> time 184 | 185 | 00:00:53.350 --> 00:00:53.360 align:start position:0% 186 | so whatever you type for the first time 187 | 188 | 189 | 00:00:53.360 --> 00:00:56.310 align:start position:0% 190 | so whatever you type for the first time 191 | will<00:00:53.600> become<00:00:54.239> your<00:00:54.719> password<00:00:55.199> to<00:00:55.360> the<00:00:55.520> hub<00:00:56.160> for 192 | 193 | 00:00:56.310 --> 00:00:56.320 align:start position:0% 194 | will become your password to the hub for 195 | 196 | 197 | 00:00:56.320 --> 00:00:58.709 align:start position:0% 198 | will become your password to the hub for 199 | the<00:00:56.480> next<00:00:56.800> two<00:00:56.960> days<00:00:57.600> so<00:00:57.760> whatever<00:00:58.239> you<00:00:58.399> choose 200 | 201 | 00:00:58.709 --> 00:00:58.719 align:start position:0% 202 | the next two days so whatever you choose 203 | 204 | 205 | 00:00:58.719 --> 00:00:59.910 align:start position:0% 206 | the next two days so whatever you choose 207 | make<00:00:58.879> sure<00:00:59.199> you 208 | 209 | 00:00:59.910 --> 00:00:59.920 align:start position:0% 210 | make sure you 211 | 212 | 213 | 00:00:59.920 --> 00:01:02.229 align:start position:0% 214 | make sure you 215 | save<00:01:00.239> it 216 | 217 | 00:01:02.229 --> 00:01:02.239 align:start position:0% 218 | save it 219 | 220 | 221 | 00:01:02.239 --> 00:01:04.710 align:start position:0% 222 | save it 223 | and<00:01:02.480> let<00:01:02.640> me<00:01:02.800> give<00:01:02.960> you<00:01:03.120> a<00:01:03.359> quick<00:01:03.760> tour<00:01:04.159> of<00:01:04.400> the 224 | 225 | 00:01:04.710 --> 00:01:04.720 align:start position:0% 226 | and let me give you a quick tour of the 227 | 228 | 229 | 00:01:04.720 --> 00:01:07.830 align:start position:0% 230 | and let me give you a quick tour of the 231 | 
interface<00:01:05.199> that<00:01:05.439> we<00:01:05.680> will<00:01:05.840> be<00:01:06.080> using 232 | 233 | 00:01:07.830 --> 00:01:07.840 align:start position:0% 234 | interface that we will be using 235 | 236 | 237 | 00:01:07.840 --> 00:01:10.149 align:start position:0% 238 | interface that we will be using 239 | this<00:01:08.080> is<00:01:08.240> what<00:01:08.479> you<00:01:08.799> what<00:01:09.040> you<00:01:09.200> will<00:01:09.439> see<00:01:10.000> the 240 | 241 | 00:01:10.149 --> 00:01:10.159 align:start position:0% 242 | this is what you what you will see the 243 | 244 | 245 | 00:01:10.159 --> 00:01:12.710 align:start position:0% 246 | this is what you what you will see the 247 | first<00:01:10.400> time<00:01:10.880> you<00:01:11.040> log<00:01:11.360> in<00:01:12.320> and<00:01:12.400> there<00:01:12.640> are 248 | 249 | 00:01:12.710 --> 00:01:12.720 align:start position:0% 250 | first time you log in and there are 251 | 252 | 253 | 00:01:12.720 --> 00:01:15.190 align:start position:0% 254 | first time you log in and there are 255 | basically<00:01:13.280> two<00:01:13.520> parts<00:01:14.240> one<00:01:14.479> will<00:01:14.720> be<00:01:14.880> called 256 | 257 | 00:01:15.190 --> 00:01:15.200 align:start position:0% 258 | basically two parts one will be called 259 | 260 | 261 | 00:01:15.200 --> 00:01:17.749 align:start position:0% 262 | basically two parts one will be called 263 | launcher<00:01:16.159> and<00:01:16.320> the<00:01:16.479> other<00:01:16.799> will<00:01:17.040> be 264 | 265 | 00:01:17.749 --> 00:01:17.759 align:start position:0% 266 | launcher and the other will be 267 | 268 | 269 | 00:01:17.759 --> 00:01:19.830 align:start position:0% 270 | launcher and the other will be 271 | a<00:01:17.920> side<00:01:18.159> panel<00:01:18.560> and<00:01:18.720> also<00:01:19.040> a<00:01:19.200> short<00:01:19.439> menu<00:01:19.759> on 272 | 273 | 00:01:19.830 --> 00:01:19.840 align:start position:0% 274 | a side panel and also a short 
menu on 275 | 276 | 277 | 00:01:19.840 --> 00:01:21.109 align:start position:0% 278 | a side panel and also a short menu on 279 | the<00:01:20.000> top 280 | 281 | 00:01:21.109 --> 00:01:21.119 align:start position:0% 282 | the top 283 | 284 | 285 | 00:01:21.119 --> 00:01:23.350 align:start position:0% 286 | the top 287 | uh<00:01:21.439> so<00:01:21.680> first<00:01:21.920> of<00:01:22.080> all<00:01:22.720> there<00:01:22.960> are<00:01:23.119> some 288 | 289 | 00:01:23.350 --> 00:01:23.360 align:start position:0% 290 | uh so first of all there are some 291 | 292 | 293 | 00:01:23.360 --> 00:01:25.830 align:start position:0% 294 | uh so first of all there are some 295 | settings<00:01:24.240> you<00:01:24.400> can<00:01:24.640> adjust<00:01:25.040> how<00:01:25.280> things<00:01:25.680> look 296 | 297 | 00:01:25.830 --> 00:01:25.840 align:start position:0% 298 | settings you can adjust how things look 299 | 300 | 301 | 00:01:25.840 --> 00:01:27.350 align:start position:0% 302 | settings you can adjust how things look 303 | for<00:01:26.000> example<00:01:26.400> by 304 | 305 | 00:01:27.350 --> 00:01:27.360 align:start position:0% 306 | for example by 307 | 308 | 309 | 00:01:27.360 --> 00:01:28.710 align:start position:0% 310 | for example by 311 | choosing<00:01:27.680> between 312 | 313 | 00:01:28.710 --> 00:01:28.720 align:start position:0% 314 | choosing between 315 | 316 | 317 | 00:01:28.720 --> 00:01:31.030 align:start position:0% 318 | choosing between 319 | a<00:01:28.880> light<00:01:29.280> or<00:01:29.520> dark<00:01:29.920> theme 320 | 321 | 00:01:31.030 --> 00:01:31.040 align:start position:0% 322 | a light or dark theme 323 | 324 | 325 | 00:01:31.040 --> 00:01:33.109 align:start position:0% 326 | a light or dark theme 327 | here<00:01:31.360> also<00:01:31.600> in<00:01:31.759> the<00:01:31.920> settings<00:01:32.400> theme<00:01:32.640> you<00:01:32.880> have 328 | 329 | 00:01:33.109 --> 00:01:33.119 align:start position:0% 330 | here also in the 
settings theme you have 331 | 332 | 333 | 00:01:33.119 --> 00:01:35.350 align:start position:0% 334 | here also in the settings theme you have 335 | buttons<00:01:33.520> like<00:01:33.840> increase<00:01:34.320> or<00:01:34.560> decrease<00:01:35.040> font 336 | 337 | 00:01:35.350 --> 00:01:35.360 align:start position:0% 338 | buttons like increase or decrease font 339 | 340 | 341 | 00:01:35.360 --> 00:01:36.149 align:start position:0% 342 | buttons like increase or decrease font 343 | size 344 | 345 | 00:01:36.149 --> 00:01:36.159 align:start position:0% 346 | size 347 | 348 | 349 | 00:01:36.159 --> 00:01:37.910 align:start position:0% 350 | size 351 | and<00:01:36.320> they<00:01:36.400> are<00:01:36.640> separately<00:01:37.200> for<00:01:37.439> code<00:01:37.759> for 352 | 353 | 00:01:37.910 --> 00:01:37.920 align:start position:0% 354 | and they are separately for code for 355 | 356 | 357 | 00:01:37.920 --> 00:01:39.670 align:start position:0% 358 | and they are separately for code for 359 | content<00:01:38.400> for<00:01:38.640> ui 360 | 361 | 00:01:39.670 --> 00:01:39.680 align:start position:0% 362 | content for ui 363 | 364 | 365 | 00:01:39.680 --> 00:01:43.350 align:start position:0% 366 | content for ui 367 | and<00:01:40.079> for<00:01:40.560> also<00:01:41.200> terminal<00:01:42.159> out<00:01:42.479> here 368 | 369 | 00:01:43.350 --> 00:01:43.360 align:start position:0% 370 | and for also terminal out here 371 | 372 | 373 | 00:01:43.360 --> 00:01:45.670 align:start position:0% 374 | and for also terminal out here 375 | i<00:01:43.600> have<00:01:43.840> increased<00:01:44.240> mine<00:01:44.799> how<00:01:45.119> i<00:01:45.280> hope<00:01:45.520> they 376 | 377 | 00:01:45.670 --> 00:01:45.680 align:start position:0% 378 | i have increased mine how i hope they 379 | 380 | 381 | 00:01:45.680 --> 00:01:46.870 align:start position:0% 382 | i have increased mine how i hope they 383 | are 384 | 385 | 00:01:46.870 --> 00:01:46.880 align:start position:0% 
386 | are 387 | 388 | 389 | 00:01:46.880 --> 00:01:51.990 align:start position:0% 390 | are 391 | they<00:01:47.040> are<00:01:47.280> visible<00:01:47.920> on<00:01:48.479> zoom<00:01:48.799> as<00:01:48.880> well 392 | 393 | 00:01:51.990 --> 00:01:52.000 align:start position:0% 394 | 395 | 396 | 397 | 00:01:52.000 --> 00:01:54.389 align:start position:0% 398 | 399 | and<00:01:52.479> here<00:01:52.880> you<00:01:53.040> can<00:01:53.280> also 400 | 401 | 00:01:54.389 --> 00:01:54.399 align:start position:0% 402 | and here you can also 403 | 404 | 405 | 00:01:54.399 --> 00:01:56.069 align:start position:0% 406 | and here you can also 407 | change<00:01:54.720> how<00:01:54.960> things<00:01:55.200> look<00:01:55.439> so<00:01:55.600> you<00:01:55.759> can<00:01:55.920> for 408 | 409 | 00:01:56.069 --> 00:01:56.079 align:start position:0% 410 | change how things look so you can for 411 | 412 | 413 | 00:01:56.079 --> 00:01:58.469 align:start position:0% 414 | change how things look so you can for 415 | example<00:01:56.640> slide<00:01:57.280> the<00:01:57.439> side<00:01:57.680> panel<00:01:58.000> to<00:01:58.079> make<00:01:58.320> it 416 | 417 | 00:01:58.469 --> 00:01:58.479 align:start position:0% 418 | example slide the side panel to make it 419 | 420 | 421 | 00:01:58.479 --> 00:02:00.870 align:start position:0% 422 | example slide the side panel to make it 423 | smaller<00:01:58.960> or<00:01:59.200> bigger 424 | 425 | 00:02:00.870 --> 00:02:00.880 align:start position:0% 426 | smaller or bigger 427 | 428 | 429 | 00:02:00.880 --> 00:02:03.670 align:start position:0% 430 | smaller or bigger 431 | jupiter<00:02:01.439> lab<00:02:01.840> is<00:02:02.159> something<00:02:02.640> that<00:02:03.119> has<00:02:03.439> been 432 | 433 | 00:02:03.670 --> 00:02:03.680 align:start position:0% 434 | jupiter lab is something that has been 435 | 436 | 437 | 00:02:03.680 --> 00:02:06.310 align:start position:0% 438 | jupiter lab is something that has been 439 | 
made especially for notebooks. You may have heard of Jupyter notebooks; these are environments to combine code and markdown and outputs in one place. We won't be using notebooks
today. We will instead be using the terminal that JupyterLab provides. So here in the launcher, in the "Other" section, you have a Terminal tile that opens a
terminal. That is a regular Unix terminal, regular Bash, that runs on the server in the cloud, and you have your own user account for the duration of the
workshop. So, for example: the terminal works in a way that I type a command and I get a response. For example, "whoami" will print my username here.
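The command-and-response loop just described can be tried with a couple of harmless commands (the output will of course show your own workshop account and directory, not the ones from the demo):

```shell
whoami   # prints the username of the account you are logged in as
pwd      # prints the directory the terminal is currently in
```

Both are standard Unix commands and work in any Bash terminal, including the one JupyterLab opens.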
We will be using the terminal to run the DataLad commands. Many of these commands will create files, and the files that are created will also be visible to you in this file browser here. So for now I
don't have many files, but with time it will be populated. I can make directories either from the terminal here — and it appeared here; I can go into it by double clicking, I can go back —
but I can also use these buttons here that will create new folders, like this, and I can also create a new file. I can name it
anything I want, and I can open it with an editor. So here I'll get an editor tab where I can write things, and I have the buttons: File, Save Python File. I can close things,
I can open them from the launcher, or I can open them by double clicking in the file browser window. So: we will be looking at the file browser, and we will be working mostly with the terminal.
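Everything done above with the file-browser buttons has a terminal equivalent; a minimal session (the directory and file names are just examples):

```shell
# Work in a scratch directory so the demo leaves no traces behind.
cd "$(mktemp -d)"

mkdir my_folder          # same as the "new folder" button in the file browser
cd my_folder             # same as double-clicking the folder
touch notes.txt          # same as creating a new (empty) file in the browser
ls                       # the file browser would now show notes.txt
cd ..                    # "go back"
```

Changes made this way show up in the JupyterLab file browser, and vice versa, since both views operate on the same server-side files.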
We will be using the built-in editor to edit files. You have some panes here that show other Jupyter things, but we'll mostly be looking at the file browser. And you can also
click on this button here to make the side panel appear or disappear. And I think that's the shortest possible tour of the interface. We'll be

--------------------------------------------------------------------------------
/DataLad/DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.en.vtt:
--------------------------------------------------------------------------------

WEBVTT
Kind: captions
Language: en

My name is Yaroslav Halchenko, and I am talking to you from Dartmouth about the DataLad project. This project was initiated by me and Michael Hanke from Germany, and we have had a successful few years of collaboration. Before that you might know us because of our other projects, such as PyMVPA and NeuroDebian. I hope that you use them and that they help you in your research projects. More about these and other projects you could discover if you go to the centerforopenneuroscience.org website, or you could also find contacts for us on social media. Before I proceed with the talk, I want first of all to acknowledge the work of others on the project. It wasn't only my and Michael's work.

Our project is heavily based on the git-annex tool, which Joey Hess wrote for managing his own collection of files, which has nothing to do with science. He is also well known for his work in the Debian project. We had...
we still have tireless workers on the project: Benjamin, working with Michael, and Alex. Alex recently refurbished — or rather wrote from scratch — a new version of the website; I hope that you'll like it, and we'll see a bit more of it later. Also, we had Jason, Debanjum, and Gergana working on the project. They were quite successful and accomplished a lot within a short period of time. So if you're looking for a project to contribute to, this might be an interesting one for you to start working on open source and leave your footprint in the ecosystem of open source for neuroscience. This project is supported by the NSF and the Federal Ministry for Education and Research in Germany. If you go to centerforopenneuroscience.org you could discover more interesting and exciting projects that we either collaborate with or contribute to.

Before we proceed, I want to formulate the problem we are trying to solve with DataLad: data is a second-class citizen within software platforms. What could that potentially be?
One aspect is how people distribute data nowadays. Quite often you find that even large arrays of data are distributed as tarballs or zip files. The problems with these ways of distribution are multiple: if one file changes, you need to re-distribute the entire tarball, which might be gigabytes in size. That's also partially why we couldn't just adopt technologies that are proven to work for software — let's say in Debian we distribute complete packages, but the problem is the same: as long as you force wrapping all the data together into some big file, it wouldn't work. It won't scale.

Another problem is the absence of data versioning. Many people underappreciate it and think that it doesn't actually exist or relate to their way of working. But no, this problem is quite generic. So if you look at this PhD Comics caricature, you'll find that it probably relates to many ways how you deal with files, data, or
documents. And you could see how we tend to version our data: quite often by appending the date, which creates some kind of linear progression. So we annotate: "Oh! I've worked on these on those dates, but also maybe a little bit later..." And we try to annotate it with some description of what was done to the data. So in this case it was a test run; then we tested it again and calibrated it, and then we ran into a problem. In all of these cases you saved the result of your work and annotated it, so later on you could either get back to a previous state (let's say you indeed made an error and want to roll back), or compare: what have you done that broke your code or data? And as you could see, those messages could be quite descriptive.

But the problem is that version control systems created for code are inadequate for data. The problem, quite often, is duplication: you have a copy of the data inside the version control system somewhere, so you couldn't use it directly, but it's also present on your hard drive — so you have at least two copies quite often. Or maybe it's duplicated just on a single server.
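The dated-filename habit described above is exactly what version control replaces; a throwaway sketch with plain Git, which works for small text data (though, as the talk notes, plain Git is inadequate for large data — that is what git-annex and DataLad address):

```shell
# Throwaway demo: version a small data file with commits instead of
# data_2014-01-15_calibrated.csv-style filenames. Runs in a temp directory.
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email demo@example.com   # local identity just for this demo
git config user.name  Demo

echo "1,2,3" > data.csv
git add data.csv
git commit -q -m "test run"

echo "1,2,4" > data.csv
git commit -q -am "calibrated last column"

git log --oneline -- data.csv   # full annotated history of the file
```

Each commit message plays the role of the hand-written annotation, and any previous state can be restored or compared without keeping extra copies around.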
I could give you examples where data in a version control system filled up the version control system, meanwhile filling up the hard drive, and sometimes you try to commit a new file and apparently run out of space on the server — and that might ruin your version control backend on the server, rendering it impossible to get to a previous version. You don't want to have that, right?

Then another problem is that there are no generic data distributions — or at least there were none before DataLad. There are no efficient ways to install and upgrade datasets. When you deal with different data hosting portals, you need to learn how to navigate them: how you authenticate, which page you need to go to, and what to download and how to download it — just to get to that dataset. And then maybe you get the announcement that the dataset was fixed, and you need to repeat this over and over again, trying to remember how to deal with it. And I'm not even talking about when the website becomes much better and sleeker and changes all the ways it deals with downloads from what it did before.
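As a sketch of what "installing and upgrading a dataset" looks like with the DataLad CLI (the commands below — `install`, `get`, `update` — are real DataLad commands, but the dataset URL is deliberately left as a placeholder: set `DATASET_URL` to a real dataset, e.g. one listed at datasets.datalad.org, before running it for real):

```shell
# Illustrative only: DataLad as a "package manager for data".
# DATASET_URL is a placeholder; without it (or without datalad installed),
# the script just reports that it is running in sketch mode.
DATASET_URL=${DATASET_URL:-}

if [ -z "$DATASET_URL" ] || ! command -v datalad >/dev/null 2>&1; then
    echo "sketch mode: set DATASET_URL and install datalad to run for real"
else
    datalad install -s "$DATASET_URL" mydataset   # clone: metadata and file stubs, little disk used
    datalad get -d mydataset mydataset            # fetch actual file content, on demand
    datalad update -d mydataset --merge           # later: pull fixes and new versions, like 'apt upgrade'
fi
```

The point is uniformity: one authentication story and one set of commands, instead of re-learning each portal's download pages.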
Another aspect is that data is rarely tested. So what does it mean for data to have bugs? Any derived data is the product of running a script or some kind of procedure on original data, generating new derived data. Quite prominent examples — which you could find in the references later in this presentation — are atlases. An atlas is usually produced from data by some really sophisticated script which generates new data: the atlas. And those atlases could be buggy. So how do you test data? The same way as software: if we could establish an efficient process where we produce some data and verify that it at least meets the assumptions we expect. If it's a probabilistic atlas, and an area must be present in the entire population, then the probabilities should add up to 100, or near 100. If it doesn't add up, then you have a bug. It's a really simple assumption, but verifying that your data doesn't break such assumptions is really important. A unified way of dealing with data and code could help establish those data testing procedures. Also, it's quite difficult to share your derived data.
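The probabilistic-atlas sanity check described above can be sketched as a tiny data test (the file format here is hypothetical: one row per region, columns are per-label probabilities in percent, expected to sum to ~100):

```shell
# Hypothetical derived-data file; row 3 is deliberately buggy (sums to 95).
cat > atlas_probs.txt <<'EOF'
60 30 10
25 25 50
70 20 5
EOF

# Fail (non-zero exit) if any row's probabilities do not add up to ~100.
awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i
       if (s < 99 || s > 101) { print "bug in row " NR ": sum=" s; bad = 1 } }
     END { exit bad }' atlas_probs.txt
echo "check exit status: $?"   # prints: check exit status: 1
```

Run as part of the pipeline that produces the atlas, a check like this turns "the data has a bug" into an ordinary failing test, just as for software.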
If you downloaded some data set from a well-known portal... 316 | 317 | 00:07:46.720 --> 00:07:50.160 318 | How do you share it? What data could be shared? 319 | 320 | 00:07:51.100 --> 00:07:53.340 321 | Where do you deposit it, so people later on 322 | 323 | 00:07:53.830 --> 00:07:59.819 324 | could download maybe the original data and your derived data without even worrying that, oh, 325 | 326 | 00:07:59.820 --> 00:08:04.710 327 | they need to get this piece from the original source and your derived data from another place? 328 | 329 | 00:08:05.110 --> 00:08:09.119 330 | So how do we link those pieces together to make it convenient? 331 | 332 | 00:08:10.210 --> 00:08:17.279 333 | What we're trying to achieve is to make managing of the data as easy as managing code and software. 334 | 335 | 00:08:18.430 --> 00:08:21.690 336 | Is it possible? I hope that you'll see that it is so. 337 | 338 | 00:08:22.300 --> 00:08:23.350 339 | 340 | 341 | 00:08:23.350 --> 00:08:25.350 342 | What DataLad is based on... 343 | 344 | 00:08:25.780 --> 00:08:31.380 345 | is two pieces, and one of them is Git. I hope that everybody knows what Git is. 346 | 347 | 00:08:31.900 --> 00:08:33.010 348 | 349 | 350 | 00:08:33.010 --> 00:08:38.369 351 | But I'll give a small presentation nevertheless. So Git is a version control system, 352 | 353 | 00:08:38.650 --> 00:08:46.259 354 | and initially it was developed to manage the Linux project code. If somebody doesn't know what Linux is, it is one of the 355 | 356 | 00:08:46.840 --> 00:08:54.599 357 | most recognized and probably most used operating systems, because it's used everywhere: on phones, on servers.
358 | 359 | 00:08:54.970 --> 00:09:02.010 360 | It's free and open source and it's developed in the open, and at some point they needed a new version control system 361 | 362 | 00:09:02.010 --> 00:09:07.739 363 | which would scale to the demand of having lots of code managed there and many people working with it. 364 | 365 | 00:09:07.980 --> 00:09:13.360 366 | So it's not a geeky project just for... between a few people. 367 | 368 | 00:09:13.900 --> 00:09:16.000 369 | It is developed by hundreds. 370 | 371 | 00:09:16.000 --> 00:09:17.240 372 | It's used by millions. 373 | 374 | 00:09:18.540 --> 00:09:21.560 375 | What's great about Git is that it's distributed. 376 | 377 | 00:09:21.780 --> 00:09:27.440 378 | So content is available across all copies of the repository. If you clone the repository, 379 | 380 | 00:09:28.000 --> 00:09:30.960 381 | you have the entire history of the project, 382 | 383 | 00:09:30.960 --> 00:09:35.840 384 | and you could get to any point in that development, you could compare different versions. 385 | 386 | 00:09:35.840 --> 00:09:41.080 387 | You could do exactly the same things as the original developers did on this repository. 388 | 389 | 00:09:41.080 --> 00:09:46.900 390 | So it provides you as much flexibility to accomplish things locally 391 | 392 | 00:09:47.470 --> 00:09:50.460 393 | without requiring any network access. 394 | 395 | 00:09:51.070 --> 00:09:55.679 396 | Git became a backbone for GitHub and other social coding portals. 397 | 398 | 00:09:55.900 --> 00:10:01.860 399 | So GitHub came to fill the niche that there was no convenient online resource 400 | 401 | 00:10:01.960 --> 00:10:05.960 402 | where people could easily share these repositories and work on them together. 403 | 404 | 00:10:06.480 --> 00:10:12.820 405 | So Git is just a tool, and GitHub is just a web portal which provides you 406 | 407 | 00:10:13.000 --> 00:10:20.720 408 | convenient centralized management of the repositories and collaboration between people.
409 | 410 | 00:10:20.740 --> 00:10:23.220 411 | But it's not the only one; there are other 412 | 413 | 00:10:23.400 --> 00:10:25.740 414 | systems which use Git underneath: 415 | 416 | 00:10:26.400 --> 00:10:29.740 417 | GitLab, Bitbucket, so... 418 | 419 | 00:10:31.000 --> 00:10:37.320 420 | It just creates this entire ecosystem of the tool and additional services and resources. 421 | 422 | 00:10:38.280 --> 00:10:48.120 423 | What Git is great for is very efficient management of textual information, right? So if you manage code, text, configuration files... 424 | 425 | 00:10:48.280 --> 00:10:58.700 426 | Maybe some dumped documentation or JSON files? All of those are nicely managed by Git, because it has a really good mechanism to 427 | 428 | 00:10:58.940 --> 00:11:02.662 429 | annotate the differences and compress them efficiently. 430 | 431 | 00:11:02.662 --> 00:11:07.000 432 | So all those distributed copies are actually not that big. 433 | 434 | 00:11:07.200 --> 00:11:10.560 435 | But the problem, or inefficiency, of Git 436 | 437 | 00:11:10.820 --> 00:11:17.020 438 | is exactly this distributed nature of it: it stores all the copies of the documents 439 | 440 | 00:11:17.320 --> 00:11:23.400 441 | on all the systems, right? So if I have big files, then it becomes inefficient, 442 | 443 | 00:11:23.400 --> 00:11:28.300 444 | because now you will have two copies, right? (At least two copies...) 445 | 446 | 00:11:28.520 --> 00:11:32.049 447 | One on your hard drive and then one committed into Git.
448 | 449 | 00:11:32.050 --> 00:11:35.620 450 | Then if you push this to GitHub you will have again a 451 | 452 | 00:11:35.779 --> 00:11:40.029 453 | big copy of that file somewhere, and anybody who clones that repository 454 | 455 | 00:11:40.580 --> 00:11:43.900 456 | might wait for a while to just get it, and then 457 | 458 | 00:11:44.000 --> 00:11:51.640 459 | they might be a little bit upset, because they wanted just one file from the repository and didn't care to download a gigabyte 460 | 461 | 00:11:52.120 --> 00:11:54.180 462 | of data just to see it. 463 | 464 | 00:11:54.180 --> 00:11:56.640 465 | So it's inefficient for storing data. 466 | 467 | 00:11:57.760 --> 00:12:04.980 468 | The other tool we rely on, as I said, written by Joey Hess, is git-annex. 469 | 470 | 00:12:05.240 --> 00:12:11.080 471 | So the idea was to build on top of Git to provide management for the data files 472 | 473 | 00:12:11.720 --> 00:12:15.279 474 | without committing those files directly into Git. 475 | 476 | 00:12:16.520 --> 00:12:22.449 477 | So git-annex allows you to add data files under Git control without 478 | 479 | 00:12:23.120 --> 00:12:26.770 480 | committing the content of the files into Git. 481 | 482 | 00:12:27.589 --> 00:12:33.748 483 | While playing with git-annex and DataLad you might see that files get replaced with a symlink. 484 | 485 | 00:12:33.748 --> 00:12:38.280 486 | So what git-annex commits into Git is actually just a symlink 487 | 488 | 00:12:38.280 --> 00:12:41.780 489 | which points to the file which contains the data. 490 | 491 | 00:12:42.170 --> 00:12:48.130 492 | This way you can commit a really lightweight symlink and keep the data on the hard drive in a single 493 | 494 | 00:12:48.500 --> 00:12:55.299 495 | copy. So, it's not in Git.
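The symlink trick just described can be sketched as a toy content-addressed store. This is a simplification for illustration: real git-annex keys also encode the backend and file size, and the objects live under `.git/annex/objects`.

```python
import hashlib
import os
import tempfile

# Toy sketch of git-annex's symlink mechanism: move a big file into a
# content-addressed object store and leave a lightweight symlink behind.
# What Git would then track is only the tiny symlink, not the content.

def annex_add(path, store=".annex/objects"):
    with open(path, "rb") as f:
        key = hashlib.sha256(f.read()).hexdigest()  # content-derived key
    os.makedirs(store, exist_ok=True)
    obj = os.path.join(store, key)
    os.replace(path, obj)   # the single copy now lives in the store
    os.symlink(obj, path)   # the file name becomes a symlink to it
    return key

workdir = tempfile.mkdtemp()
os.chdir(workdir)
with open("bold.nii", "wb") as f:
    f.write(b"pretend this is a large imaging file")
annex_add("bold.nii")
print(os.path.islink("bold.nii"))         # True: only a symlink remains
print(open("bold.nii", "rb").read()[:7])  # b'pretend': reads through the link
```

Reading through the symlink is transparent, which is why tools that just open files keep working after annexing.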
And then what git-annex does: it orchestrates the 496 | 497 | 00:12:56.089 --> 00:12:58.089 498 | management of those files between 499 | 500 | 00:12:58.279 --> 00:13:03.519 501 | different clones of the repository, or, called another way, special remotes. 502 | 503 | 00:13:03.800 --> 00:13:10.630 504 | But it also provides access to those files if they are, let's say, uploaded to some website, 505 | 506 | 00:13:10.630 --> 00:13:15.760 507 | so you have a URL. You could associate the URL with the file; you could upload it to FTP, 508 | 509 | 00:13:16.100 --> 00:13:18.820 510 | you could upload it to a web server. 511 | 512 | 00:13:19.550 --> 00:13:25.839 513 | You could even get content through BitTorrent, or you could use Amazon S3 storage as your 514 | 515 | 00:13:26.510 --> 00:13:31.029 516 | container for the files. And it allows for custom extensions: 517 | 518 | 00:13:31.370 --> 00:13:37.389 519 | let's say you could upload data to Dropbox, Google Drive, box.com and many, many other 520 | 521 | 00:13:38.839 --> 00:13:40.958 522 | data hosting providers. 523 | 524 | 00:13:42.079 --> 00:13:46.929 525 | git-annex also takes care of working around the limitations of those platforms. 526 | 527 | 00:13:47.480 --> 00:13:54.279 528 | Let's say box.com, with a public account, doesn't allow you to have files larger than, I believe, a hundred megabytes.
529 | 530 | 00:13:54.889 --> 00:14:00.489 531 | git-annex will chop it up, so on box.com you'll have little pieces. 532 | 533 | 00:14:00.490 --> 00:14:03.820 534 | You will not use them directly from box.com, but then git-annex 535 | 536 | 00:14:03.820 --> 00:14:09.129 537 | will re-assemble the big file when it gets it onto your hard drive. So all those 538 | 539 | 00:14:09.410 --> 00:14:15.339 540 | conveniences, and in addition encryption (that's if you want to share some sensitive data, and you cannot just upload it 541 | 542 | 00:14:16.130 --> 00:14:20.079 543 | unencrypted to a public service), all those are provided by git-annex. 544 | 545 | 00:14:20.839 --> 00:14:27.789 546 | Also, an additional feature, which we don't use in the project, is the git-annex assistant, which is a Dropbox-like 547 | 548 | 00:14:28.850 --> 00:14:30.850 549 | synchronization mechanism. You could establish 550 | 551 | 00:14:31.519 --> 00:14:35.649 552 | synchronization between your Git/git-annex repositories across multiple servers and 553 | 554 | 00:14:35.990 --> 00:14:42.909 555 | configure them really flexibly, so you have, let's say, a backup of all the data files on one server, and 556 | 557 | 00:14:42.980 --> 00:14:47.589 558 | some other server will have only the files which it cares about, let's say 559 | 560 | 00:14:47.930 --> 00:14:52.299 561 | data files; another one might have only video files; 562 | 563 | 00:14:53.149 --> 00:14:57.698 564 | another one maybe just music files, who knows. So the flexibility is there, and 565 | 566 | 00:14:58.060 --> 00:15:02.400 567 | it's all up to you to configure what you want where. 568 | 569 | 00:15:02.400 --> 00:15:07.480 570 | In our project we don't use it yet, but we do use it locally for synchronizing 571 | 572 | 00:15:07.700 --> 00:15:10.020 573 | different git-annex repositories.
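The chunking just described is a simple round-trip: split content into fixed-size pieces, upload the pieces, and re-assemble them bit-identically on download. The sketch below uses tiny sizes for illustration; the real limit mentioned is around a hundred megabytes.

```python
# Toy sketch of chunked storage, as git-annex does for providers with
# per-file size limits: chop a payload into fixed-size pieces, then
# re-assemble them losslessly.

def chop(data: bytes, chunk_size: int) -> list:
    """Split data into consecutive pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def reassemble(chunks: list) -> bytes:
    """Concatenate the pieces back into the original payload."""
    return b"".join(chunks)

payload = b"0123456789" * 7            # pretend this is a big file (70 bytes)
pieces = chop(payload, chunk_size=16)  # what would be uploaded piece by piece
print(len(pieces))                     # 5 pieces of <= 16 bytes each
print(reassemble(pieces) == payload)   # True: the round-trip is lossless
```

Encryption, as mentioned in the talk, would be applied per piece before upload; that layer is omitted here.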
574 | 575 | 00:15:12.380 --> 00:15:15.300 576 | But there is another problem here. So we have two really 577 | 578 | 00:15:20.540 --> 00:15:21.180 579 | great tools, Git and git-annex, but both of them work at a single-repository level. 580 | 581 | 00:15:21.340 --> 00:15:26.560 582 | So, to work in a Git repository you need to go into that directory and 583 | 584 | 00:15:27.320 --> 00:15:29.500 585 | accomplish whatever you want to do. 586 | 587 | 00:15:29.500 --> 00:15:33.400 588 | That kind of doesn't go along well with the notion of distribution. 589 | 590 | 00:15:33.840 --> 00:15:38.760 591 | You don't care where you are on your hard drive, so you just want to say: 592 | 593 | 00:15:39.100 --> 00:15:42.720 594 | oh, search, find me something, install this, and 595 | 596 | 00:15:43.140 --> 00:15:46.940 597 | give me access to this data. Right? Or: get me this file, 598 | 599 | 00:15:47.080 --> 00:15:50.700 600 | even though maybe I'm not in that Git or git-annex repository. 601 | 602 | 00:15:52.180 --> 00:15:56.780 603 | Also, another aspect: those are just tools. So, similarly to 604 | 605 | 00:15:57.120 --> 00:16:02.820 606 | how GitHub provided a convenient portal to the tool Git, 607 | 608 | 00:16:03.760 --> 00:16:07.178 609 | we want to accomplish something where we use these tools, 610 | 611 | 00:16:07.178 --> 00:16:09.620 612 | which are agnostic of the domain of the data 613 | 614 | 00:16:09.630 --> 00:16:11.290 615 | (let's say neuroimaging), 616 | 617 | 00:16:11.290 --> 00:16:16.410 618 | to give you guys access to those terabytes of publicly shared data 619 | 620 | 00:16:16.720 --> 00:16:19.920 621 | which already live out there somewhere, 622 | 623 | 00:16:19.920 --> 00:16:21.569 624 | so we don't need to collect it.
We don't need to 625 | 626 | 00:16:22.270 --> 00:16:26.309 627 | make a copy of it locally, right? It's already there. So 628 | 629 | 00:16:26.950 --> 00:16:31.890 630 | what we want to achieve is just to provide access to that data without 631 | 632 | 00:16:32.230 --> 00:16:36.180 633 | mirroring it on our servers or duplicating it elsewhere. 634 | 635 | 00:16:38.260 --> 00:16:47.740 636 | Before going into demos I want to give you a kind of more illustrative demo of the lifecycle of data 637 | 638 | 00:16:47.940 --> 00:16:51.140 639 | which we provide with DataLad. 640 | 641 | 00:16:51.340 --> 00:16:59.180 642 | Let's imagine that we have a data set which comes initially from OpenfMRI, right, and lives somewhere in the cloud or 643 | 644 | 00:16:59.410 --> 00:17:07.139 645 | on a data hosting portal. Actually, we have two copies of the data: one of them might be in a tarball somewhere on an HTTP server, 646 | 647 | 00:17:07.420 --> 00:17:09.420 648 | right, and another one might be 649 | 650 | 00:17:09.850 --> 00:17:16.860 651 | extracted from the tarball, somewhere on a cloud which might have HTTP access, might have S3 access, 652 | 653 | 00:17:16.860 --> 00:17:18.900 654 | but the point is that the data is there. And 655 | 656 | 00:17:19.480 --> 00:17:25.980 657 | then we have a data user, and that's us, right? Me, you, everybody who wants to use this data. 658 | 659 | 00:17:26.160 --> 00:17:28.160 660 | So now the options are: we either... 661 | 662 | 00:17:28.390 --> 00:17:34.680 663 | go download the tarball and extract it, or we learn how to use S3 and go and install some tool, 664 | 665 | 00:17:35.440 --> 00:17:37.589 666 | browse the S3 bucket, download those files. 667 | 668 | 00:17:38.950 --> 00:17:42.510 669 | But what we are trying to establish here is actually a middle layer, right?
670 | 671 | 00:17:42.710 --> 00:17:48.499 672 | We want to provide a data distribution which might be hosted somewhere, maybe on GitHub, maybe on our server, 673 | 674 | 00:17:49.050 --> 00:17:55.310 675 | which will take this data available online and will automatically crawl it. So here 676 | 677 | 00:17:55.310 --> 00:18:02.600 678 | I mention this command, crawl, which is one of the commands DataLad provides to automate monitoring of external resources, 679 | 680 | 00:18:03.750 --> 00:18:09.230 681 | so we could get them into Git repositories. And actually you could see here that this 682 | 683 | 00:18:11.040 --> 00:18:12.750 684 | greenish-yellow... 685 | 686 | 00:18:12.750 --> 00:18:15.440 687 | Why don't you draw here? Greenish-yellow... 688 | 689 | 00:18:16.260 --> 00:18:17.550 690 | color. 691 | 692 | 00:18:17.550 --> 00:18:19.550 693 | Why don't you draw here? 694 | 695 | 00:18:20.580 --> 00:18:27.919 696 | Here we go! So, this greenish-yellow color represents just a content reference 697 | 698 | 00:18:28.620 --> 00:18:34.280 699 | instead of the actual content. That's why we could host it on GitHub or anywhere, because it doesn't have the actual data. 700 | 701 | 00:18:35.250 --> 00:18:40.310 702 | So we collect those data sets into collections, which we might share, 703 | 704 | 00:18:40.310 --> 00:18:44.929 705 | let's say the one which we share from datasets on DataLad.org. 706 | 707 | 00:18:45.330 --> 00:18:51.080 708 | Underneath we use git submodules, which is a built-in mechanism within Git to organize these collections of 709 | 710 | 00:18:51.270 --> 00:18:55.340 711 | multiple repositories while keeping track of versioning information. 712 | 713 | 00:18:55.340 --> 00:18:58.369 714 | So you could get the entire collection of, let's say, OpenfMRI data sets 715 | 716 | 00:18:58.560 --> 00:19:02.749 717 | for a specific date, for a specific version, if you want to reproduce some of those analyses. 718 | 719 | 00:19:02.750 --> 00:19:06.140 720 | And
then we are making it possible to install 721 | 722 | 00:19:06.660 --> 00:19:10.009 723 | an arbitrary number of those data sets via a unified interface. 724 | 725 | 00:19:10.710 --> 00:19:16.639 726 | So here we mention the command datalad install, which you will see later, and hopefully 727 | 728 | 00:19:17.400 --> 00:19:22.400 729 | those parameters, like install into the current data set and get all the data, 730 | 731 | 00:19:23.010 --> 00:19:28.550 732 | will be less surprising. And we also provide shortcuts, which I'll talk about later. 733 | 734 | 00:19:28.800 --> 00:19:31.100 735 | But the point is that you could now easily 736 | 737 | 00:19:31.680 --> 00:19:36.680 738 | install those data sets onto your local hard drive, and if you are doing some processing, 739 | 740 | 00:19:37.530 --> 00:19:44.810 741 | it might add results of the processing. In this case we've got a new file, a filtered bold file, which we could easily add 742 | 743 | 00:19:45.660 --> 00:19:50.480 744 | into this repository, which means it is committed into the repository 745 | 746 | 00:19:51.000 --> 00:19:58.739 747 | under git-annex control.
And later we could publish this entirety of maybe a collection of the datasets 748 | 749 | 00:20:00.010 --> 00:20:05.010 750 | to multiple places. One of them might be GitHub, where we publish only the 751 | 752 | 00:20:06.130 --> 00:20:08.729 753 | repository itself, without the actual data files. Again, 754 | 755 | 00:20:08.730 --> 00:20:16.170 756 | those are just symlinks, and we maybe offload the actual data to some server, which might be an HTTP server 757 | 758 | 00:20:18.100 --> 00:20:25.230 759 | or some other server, through some mechanism, right? But the point is that data goes somewhere, and the magic happens here 760 | 761 | 00:20:25.330 --> 00:20:31.709 762 | thanks to git-annex, because that's the beast which keeps track of where each data file 763 | 764 | 00:20:32.080 --> 00:20:36.929 765 | could be obtained from. So these red links point to the information 766 | 767 | 00:20:36.930 --> 00:20:44.339 768 | that git-annex stores for us: that, let's say, this bold file is available from the original web portal, right, it's available from an S3 bucket, 769 | 770 | 00:20:44.340 --> 00:20:48.300 771 | it might be coming from a tarball, so that's one of the extensions 772 | 773 | 00:20:48.300 --> 00:20:53.190 774 | we added to git-annex to support extraction of the files from a tarball. 775 | 776 | 00:20:53.740 --> 00:20:57.330 777 | So it becomes really transparent to the user. And this new file, 778 | 779 | 00:20:58.570 --> 00:21:04.889 780 | we published it there, so it might be available now through HTTP. So people who cloned this repository 781 | 782 | 00:21:06.220 --> 00:21:13.620 783 | would be able to get any file from the original storage, or any derived data 784 | 785 | 00:21:13.720 --> 00:21:15.720 786 | which we published on our website. 787 | 788 | 00:21:16.840 --> 00:21:20.250 789 | So that's kind of the main idea behind DataLad.
790 | 791 | 00:21:21.610 --> 00:21:23.260 792 | So, altogether: 793 | 794 | 00:21:23.260 --> 00:21:28.650 795 | DataLad allows you to manage multiple repositories organized into these super-datasets, 796 | 797 | 00:21:28.650 --> 00:21:34.680 798 | which are just collections of Git repositories, using the standard git submodules mechanism. 799 | 800 | 00:21:34.990 --> 00:21:38.760 801 | It supports both Git and git-annex repositories, so if you have 802 | 803 | 00:21:39.490 --> 00:21:45.180 804 | just regular Git repositories where you don't want to add any data, it's perfectly fine. 805 | 806 | 00:21:45.940 --> 00:21:52.499 807 | We can crawl external online data resources and update git-annex repositories upon changes. 808 | 809 | 00:21:53.770 --> 00:21:59.160 810 | It seems to scale quite nicely, because data stays with the original data provider, 811 | 812 | 00:21:59.160 --> 00:22:02.369 813 | so we don't need to increase the storage on our server. And 814 | 815 | 00:22:02.920 --> 00:22:09.020 816 | we could use, or you could use, because anybody could use DataLad to publish 817 | 818 | 00:22:09.200 --> 00:22:11.300 819 | their collections of the datasets on 820 | 821 | 00:22:12.760 --> 00:22:19.679 822 | GitHub, and maybe offload the data itself to portals like box.com or Dropbox. 823 | 824 | 00:22:21.279 --> 00:22:25.769 825 | What happens now is that we have unified access to data regardless of its origin: 826 | 827 | 00:22:25.770 --> 00:22:30.389 828 | I don't care if data comes from OpenfMRI or CRCNS. 829 | 830 | 00:22:30.820 --> 00:22:36.960 831 | The only difference might be that you need to authenticate. Let's say CRCNS doesn't allow downloads without authentication.
832 | 833 | 00:22:37.539 --> 00:22:42.929 834 | So DataLad will ask you for credentials, which it will store locally on the hard drive. 835 | 836 | 00:22:42.960 --> 00:22:46.649 837 | Nothing is shared with us, and later on, when you need to get more data, 838 | 839 | 00:22:46.750 --> 00:22:51.089 840 | it will just use those credentials to authenticate on your behalf to CRCNS, 841 | 842 | 00:22:51.640 --> 00:22:55.559 843 | download those tarballs, and extract them for you, so you don't need to worry about that. And 844 | 845 | 00:22:56.260 --> 00:23:00.929 846 | also, we take care of serialization: if the original website distributes only tarballs, 847 | 848 | 00:23:01.779 --> 00:23:04.919 849 | we download the tarballs for you and extract them. And again, 850 | 851 | 00:23:04.919 --> 00:23:08.939 852 | you don't need to worry how the data is actually serialized by the original data provider. 853 | 854 | 00:23:09.940 --> 00:23:13.770 855 | What we do on top is that we aggregate metadata. 856 | 857 | 00:23:14.320 --> 00:23:18.390 858 | What is metadata? It is data about the data. 859 | 860 | 00:23:18.880 --> 00:23:24.779 861 | So let's say you have a data set which contains the data, the results; information about what this data 862 | 863 | 00:23:24.779 --> 00:23:28.409 864 | set is about, what it's named, who its authors were, 865 | 866 | 00:23:29.080 --> 00:23:31.640 867 | what the license might be, if applicable: 868 | 869 | 00:23:32.160 --> 00:23:35.880 870 | any additional information about the data constitutes metadata. 871 | 872 | 00:23:36.140 --> 00:23:42.080 873 | What we do in DataLad: we aggregate the metadata which we find about the original data sets and 874 | 875 | 00:23:42.820 --> 00:23:44.490 876 | provide you a convenient interface, 877 | 878 | 00:23:44.490 --> 00:23:48.329 879 | so you could search across all of it, across all the data sets which we have already 880 | 881 | 00:23:48.640 --> 00:23:52.049 882 | integrated in DataLad.
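The aggregated-metadata search described here can be sketched as a tiny catalog query. The records and field names below are invented for illustration; DataLad's real metadata model and its search command are considerably richer.

```python
# Toy sketch of searching aggregated metadata, in the spirit of what the
# speaker describes: collect per-dataset metadata in one place, then
# filter datasets whose metadata mentions a search term.

catalog = {
    "openfmri/ds-a": {"name": "Balloon analog risk task",
                      "authors": ["A. Author"], "license": "PDDL"},
    "crcns/ds-b":    {"name": "Auditory cortex recordings",
                      "authors": ["B. Author"], "license": "CC-BY"},
    "openfmri/ds-c": {"name": "Movie watching study",
                      "authors": ["C. Author"], "license": "PDDL"},
}

def search(catalog, term):
    """Return dataset ids whose metadata mentions `term` (case-insensitive)."""
    term = term.lower()
    return sorted(ds for ds, meta in catalog.items()
                  if any(term in str(value).lower() for value in meta.values()))

print(search(catalog, "movie"))  # ['openfmri/ds-c']
```

Because the metadata is aggregated into one structure, a single query spans every integrated dataset regardless of which portal it originally came from.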
And I hope you'll find the demonstration 883 | 884 | 00:23:52.750 --> 00:23:54.959 885 | quite appealing later on. 886 | 887 | 00:23:56.049 --> 00:24:01.679 888 | Then, with DataLad, after you have consumed, added to, or extended data sets, or just created them from scratch, 889 | 890 | 00:24:02.380 --> 00:24:09.839 891 | you could share original or derived datasets publicly, as I mentioned, or internally: you could always 892 | 893 | 00:24:10.659 --> 00:24:16.709 894 | publish them locally over SSH, maybe, to collaborate with somebody, and that's what we do regularly. And 895 | 896 | 00:24:17.919 --> 00:24:19.919 897 | meanwhile we keep data... 898 | 899 | 00:24:20.320 --> 00:24:27.780 900 | we could keep data available elsewhere. Or you could even share the data set without sharing the data, which is quite keen as a 901 | 902 | 00:24:29.860 --> 00:24:36.360 903 | demonstration of good intent when you are about to publish a paper. That's what we did with our recent submission: 904 | 905 | 00:24:36.360 --> 00:24:40.829 906 | we published the data set, but not with the entirety of the data, 907 | 908 | 00:24:40.830 --> 00:24:45.449 909 | but just with the first subject, so reviewers could verify that there is 910 | 911 | 00:24:46.450 --> 00:24:48.130 912 | good-quality data, 913 | 914 | 00:24:48.130 --> 00:24:49.990 915 | that 916 | 917 | 00:24:49.990 --> 00:24:56.099 918 | they could get access to it, right, and that the entirety of the data is in principle available 919 | 920 | 00:24:56.100 --> 00:25:00.089 921 | and it was processed accordingly, because the entire Git history 922 | 923 | 00:25:00.850 --> 00:25:04.650 924 | is maintained and shared, but the data files are not. 925 | 926 | 00:25:06.160 --> 00:25:10.560 927 | Okay. An additional benefit, some of which is work in progress: 928 | 929 | 00:25:11.080 --> 00:25:15.449 930 | you could export the data set. If you want to share just the data itself, you could 931 | 932 | 00:25:15.580 --> 00:25:21.420 933 | export the data set at its
current version in a tarball and give it to somebody. But a more exciting feature 934 | 935 | 00:25:21.700 --> 00:25:25.680 936 | we've been working on is exporting into 937 | 938 | 00:25:26.290 --> 00:25:27.370 939 | some 940 | 941 | 00:25:27.370 --> 00:25:31.170 942 | metadata-heavy data formats. If you're publishing scientific data, 943 | 944 | 00:25:31.660 --> 00:25:37.170 945 | you will be asked to fill out a big spreadsheet, which is called easy to have, 946 | 947 | 00:25:38.950 --> 00:25:44.340 948 | to annotate metadata for your data set. It's a really tedious and unpleasant job. 949 | 950 | 00:25:44.340 --> 00:25:48.630 951 | But the beauty is that all that information is contained within the 952 | 953 | 00:25:49.030 --> 00:25:54.180 954 | metadata of either the data set or of git-annex, so we could automatically 955 | 956 | 00:25:54.580 --> 00:26:01.199 957 | export the majority of the information for you, so you just need to fill in the left-out information and be done. 958 | 959 | 00:26:04.150 --> 00:26:07.680 960 | DataLad comes with both command line and Python interfaces, 961 | 962 | 00:26:07.680 --> 00:26:14.489 963 | so you could work with it interactively in the command line, or script it in bash, or work with it interactively in IPython, 964 | 965 | 00:26:14.830 --> 00:26:20.100 966 | or script it with the Python language. It gives you the same capabilities and really similar syntax. 967 | 968 | 00:26:22.300 --> 00:26:24.100 969 | Our distribution 970 | 971 | 00:26:24.100 --> 00:26:27.089 972 | has already grown to cover over ten terabytes of data. 973 | 974 | 00:26:27.910 --> 00:26:31.469 975 | We cover such data sets as OpenfMRI, CRCNS, 976 | 977 | 00:26:32.560 --> 00:26:34.560 978 | functional connectome, 979 | 980 | 00:26:34.870 --> 00:26:42.430 981 | INDI data sets, and even some data sets from Kaggle, and some Rathole Radio podcast show, 982 | 983 | 00:26:42.830 --> 00:26:49.059 984 | because it was a cool experiment to be able to crawl that website and collect all the
data 985 | 986 | 00:26:49.520 --> 00:26:52.599 987 | about timing of the songs. So check it out: 988 | 989 | 00:26:52.600 --> 00:26:57.100 990 | it's available on GitHub, although the data, again, stays with the original provider. 991 | 992 | 00:26:57.370 --> 00:27:03.609 993 | What is coming? More data: we'll cover the Human Connectome Project and data available from XNAT servers. 994 | 995 | 00:27:03.890 --> 00:27:08.410 996 | We want to provide extended metadata support, so we cover not only dataset- 997 | 998 | 00:27:08.410 --> 00:27:09.190 999 | level metadata 1000 | 1001 | 00:27:09.190 --> 00:27:16.750 1002 | but also metadata for separate files. If you know about any other interesting data set or data provider, 1003 | 1004 | 00:27:17.390 --> 00:27:20.739 1005 | file a new issue, or shoot us an email. 1006 | 1007 | 00:27:21.590 --> 00:27:27.850 1008 | We are also working on integrating with NeuroDebian, so you could apt-get install those datasets, and on deposition of data to 1009 | 1010 | 00:27:28.580 --> 00:27:32.350 1011 | OSF and other platforms. Another interesting integration 1012 | 1013 | 00:27:32.350 --> 00:27:39.880 1014 | which we've done was to introduce DataLad support into HeuDiConv, which stands for Heuristic DICOM Converter, 1015 | 1016 | 00:27:39.880 --> 00:27:41.809 1017 | which allows you to 1018 | 1019 | 00:27:41.809 --> 00:27:47.739 1020 | automate conversion of your DICOM data obtained from an MRI scanner into NIfTI files. 1021 | 1022 | 00:27:48.080 --> 00:27:51.520 1023 | But we went one step further and 1024 | 1025 | 00:27:52.280 --> 00:27:57.489 1026 | standardized it to convert not only to DataLad data sets but to DataLad BIDS data 1027 | 1028 | 00:27:57.490 --> 00:28:02.020 1029 | sets. So if you don't know what BIDS is, it is something you must know nowadays 1030 | 1031 | 00:28:02.540 --> 00:28:07.479 1032 | if you're doing imaging research.
It's the Brain Imaging Data Structure format, 1033 | 1034 | 00:28:07.700 --> 00:28:14.679 1035 | which describes how you should lay out your files on a file system, so anybody who finds your data set will be immediately 1036 | 1037 | 00:28:15.230 --> 00:28:17.230 1038 | capable of understanding 1039 | 1040 | 00:28:17.690 --> 00:28:24.970 1041 | your design, how many subjects you have. So it standardizes beyond NIfTI: it standardizes how you 1042 | 1043 | 00:28:25.280 --> 00:28:27.280 1044 | work with your files. So now, 1045 | 1046 | 00:28:27.680 --> 00:28:33.500 1047 | with this HeuDiConv integration, we can obtain DataLad datasets 1048 | 1049 | 00:28:34.360 --> 00:28:39.660 1050 | with BIDSified neuroimaging data, so it's ready to be shared, 1051 | 1052 | 00:28:39.670 --> 00:28:44.109 1053 | it's ready to be processed by any BIDS-compatible tool. So it opens ample 1054 | 1055 | 00:28:44.660 --> 00:28:46.660 1056 | opportunities. 1057 | 1058 | 00:28:46.790 --> 00:28:50.930 1059 | And at this point I guess we should switch and do some demos. 1060 | 1061 | 00:28:53.640 --> 00:28:59.989 1062 | And before I actually give any demo I want to familiarize you with our new website, DataLad.org. 1063 | 1064 | 00:29:01.110 --> 00:29:03.000 1065 | On top you could see 1066 | 1067 | 00:29:03.000 --> 00:29:06.619 1068 | navigation among the major portions of the website. 1069 | 1070 | 00:29:07.140 --> 00:29:09.170 1071 | One of them is the About page, which 1072 | 1073 | 00:29:09.870 --> 00:29:13.910 1074 | just describes the purpose of DataLad and provides 1075 | 1076 | 00:29:14.580 --> 00:29:18.379 1077 | information about funding agencies and involved institutions. 1078 | 1079 | 00:29:20.010 --> 00:29:22.010 1080 | The next link is "Get DataLad", 1081 | 1082 | 00:29:22.890 --> 00:29:30.350 1083 | which describes how to install DataLad. The easiest installation is if you are using NeuroDebian already.
1084 | 1085 | 00:29:30.600 --> 00:29:38.280 1086 | Then it's just an apt-get install datalad command, or you could find it in the package manager and install it within seconds. 1087 | 1088 | 00:29:38.720 --> 00:29:45.320 1089 | Alternatively, if you are on OS X or any other operating system (Windows support is initial, but it 1090 | 1091 | 00:29:46.230 --> 00:29:49.190 1092 | should work for the basic set of features), 1093 | 1094 | 00:29:49.800 --> 00:29:51.950 1095 | you have to install git-annex by 1096 | 1097 | 00:29:52.500 --> 00:29:54.590 1098 | going to the git-annex website, 1099 | 1100 | 00:29:56.850 --> 00:29:58.850 1101 | into the install page, 1102 | 1103 | 00:29:59.280 --> 00:30:05.840 1104 | choosing the operating system of your choice and following the instructions there on how to get it. And 1105 | 1106 | 00:30:07.200 --> 00:30:09.140 1107 | after you have installed git-annex, 1108 | 1109 | 00:30:09.140 --> 00:30:15.499 1110 | you just need to install DataLad from the Python package index through a pip install datalad command.
1111 | 1112 | 00:30:16.920 --> 00:30:19.399 1113 | The next page is the features page, 1114 | 1115 | 00:30:19.410 --> 00:30:27.379 1116 | which those pretty boxes on the main page actually lead to, and this page we will go through 1117 | 1118 | 00:30:27.840 --> 00:30:29.840 1119 | later in greater detail. 1120 | 1121 | 00:30:30.150 --> 00:30:33.379 1122 | Another interesting page is Datasets, which presents you our 1123 | 1124 | 00:30:34.110 --> 00:30:39.200 1125 | ultimate official distribution, which points to datasets.datalad.org, 1126 | 1127 | 00:30:39.990 --> 00:30:44.479 1128 | which is the collection of data sets which are already pre-crawled for you, and 1129 | 1130 | 00:30:45.240 --> 00:30:50.420 1131 | that is where we provide those data sets, like for OpenfMRI, 1132 | 1133 | 00:30:51.510 --> 00:30:53.340 1134 | CRCNS, 1135 | 1136 | 00:30:53.340 --> 00:30:55.290 1137 | ADHD, and 1138 | 1139 | 00:30:55.290 --> 00:30:56.940 1140 | many others. 1141 | 1142 | 00:30:56.940 --> 00:31:00.469 1143 | I will just briefly describe the features of this 1144 | 1145 | 00:31:00.990 --> 00:31:07.819 1146 | basic website, and mention that such websites, if you have any HTTP server available somewhere, 1147 | 1148 | 00:31:08.130 --> 00:31:13.459 1149 | maybe your institution provides one, because you will not actually host the data here, or you don't have to, 1150 | 1151 | 00:31:14.429 --> 00:31:21.349 1152 | you could upload similar views of your data sets pretty much anywhere you could host a website. And 1153 | 1154 | 00:31:22.140 --> 00:31:26.479 1155 | OpenfMRI, let's say we go to OpenfMRI: it lists all those data sets 1156 | 1157 | 00:31:26.480 --> 00:31:31.459 1158 | which we crawled from OpenfMRI. You could also immediately see a mention of the version 1159 | 1160 | 00:31:31.980 --> 00:31:33.980 1161 | here, and the version comes 1162 | 1163 | 00:31:33.990 --> 00:31:35.990 1164 | from 1165 | 1166 | 00:31:36.000 --> 00:31:42.949 1167 | whatever version OpenfMRI gave it, but also with additional indices
pointing to exact commits

00:31:44.490 --> 00:31:50.539
within our Git repository which identify that version. Another neat feature here is

00:31:51.389 --> 00:31:52.919
immediate

00:31:52.919 --> 00:31:57.469
search: you could start typing, and now if you're interested in resting-state,

00:31:58.320 --> 00:32:00.320
here we go, it goes

00:32:01.529 --> 00:32:06.199
pretty fast and limits the view to only the datasets whose metadata

00:32:07.350 --> 00:32:12.169
mentions this word. And say let's look for Haxby... There we go!

00:32:12.779 --> 00:32:18.619
Or let's look for "movie". There we go! So, you could quickly identify the datasets by

00:32:19.350 --> 00:32:21.829
browsing, and we'll see how we could do

00:32:22.289 --> 00:32:25.039
such actions later in the command line. And

00:32:25.289 --> 00:32:31.638
when you get to the dataset of interest (it could be at pretty much any level), you'll see on top the command which

00:32:31.639 --> 00:32:35.509
could be used to install this dataset, and a description of some options.

00:32:35.510 --> 00:32:40.969
Let's say -r is to install this dataset with any possible subdatasets recursively.

00:32:41.309 --> 00:32:46.699
There's -g to install it and also obtain all the data for it, and if you want to speed up

00:32:48.360 --> 00:32:51.110
obtaining the data, you could use -J

00:32:51.110 --> 00:32:56.899
and specify the number of parallel downloads your server and bandwidth could allow.

00:32:57.929 --> 00:33:01.788
Okay, let's go back to the DataLad website, and another

00:33:03.779 --> 00:33:10.489
page on the website is Development.
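Combining the options just described, an install invocation might look like this (the dataset path here is illustrative, not one shown in the talk):

```shell
# -r  : recurse into any subdatasets
# -g  : also obtain all the data, not just the repository skeleton
# -J 4: use four parallel downloads (an illustrative count)
datalad install -r -g -J 4 ///openfmri/ds000114
```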
So, if you're interested in helping (contributing datasets, providing

00:33:11.039 --> 00:33:13.039
patches, improving documentation):

00:33:13.320 --> 00:33:20.140
all of the development is done in the open. We use GitHub intensively, we use Travis, we use Codecov.

00:33:20.920 --> 00:33:27.200
The documentation is built the same way, and that will be our next point.

00:33:28.509 --> 00:33:33.479
The documentation is hosted on docs.datalad.org, and it provides

00:33:34.720 --> 00:33:39.480
not yet as thorough documentation as we wanted, but some

00:33:39.789 --> 00:33:46.289
documentation about the major features of DataLad, such as a comparison between Git, git-annex, and DataLad.

00:33:46.659 --> 00:33:48.659
But it also provides

00:33:49.179 --> 00:33:56.429
really thorough interface documentation. As I mentioned before, we have command-line and Python

00:33:57.519 --> 00:34:01.199
interfaces; both of those interfaces are generated from the same code,

00:34:01.200 --> 00:34:03.539
so they should be pretty much identical.

00:34:03.759 --> 00:34:07.829
Just depending on whether you use the command line or Python, the invocation will be different,

00:34:07.839 --> 00:34:11.129
but otherwise all the options and all the commands

00:34:11.169 --> 00:34:15.449
look exactly the same. And in the command-line reference

00:34:15.450 --> 00:34:23.069
you could find the documentation for all the commands. I have opened some popular ones in my case,

00:34:23.069 --> 00:34:29.099
right where I went before, and it provides documentation on what those do. And of course there are

00:34:30.129 --> 00:34:33.629
notes for power users and quite elaborate
documentation here about all the options which are available

00:34:38.230 --> 00:34:40.230
in those commands.

00:34:41.139 --> 00:34:45.779
OK, so let's go back to Features, and

00:34:47.710 --> 00:34:52.859
the first of the demos which I want to show you will be about data discovery.

00:34:54.399 --> 00:34:59.608
As with any other demo on the website, it is provided with a

00:35:00.880 --> 00:35:03.420
screencast which shows all the

00:35:04.720 --> 00:35:07.770
necessary commands to carry out the

00:35:09.670 --> 00:35:12.119
presentation, but it also provides you with

00:35:13.210 --> 00:35:17.399
comments describing the purpose of the actions taken.

00:35:18.520 --> 00:35:20.079
Moreover,

00:35:20.079 --> 00:35:25.989
you could obtain the full script for the demo, so you could run the same use case on your hardware,

00:35:28.040 --> 00:35:32.379
by clicking underneath the screencast. But

00:35:33.800 --> 00:35:38.769
for this demonstration, I'll do it interactively in a shell together with you.

00:35:40.280 --> 00:35:41.960
So,

00:35:41.960 --> 00:35:43.960
let's get started!
00:35:44.510 --> 00:35:46.540
As you remember,

00:35:47.270 --> 00:35:53.590
we aggregate a lot of metadata in DataLad to provide efficient search mechanisms.

00:35:55.520 --> 00:35:59.919
In this example we'll imagine that we are looking for a dataset which mentions

00:36:00.650 --> 00:36:07.600
"Raiders", this word being associated with the movie Raiders of the Lost Ark, used during neuroimaging.

00:36:09.230 --> 00:36:12.399
So we'll use the "datalad search" command, where we'll

00:36:13.430 --> 00:36:16.300
just state it, right: we'll call "datalad search

00:36:16.300 --> 00:36:22.449
raiders neuroimaging". As with many, or rather all, commands in DataLad,

00:36:23.060 --> 00:36:26.110
they are composed by calling "datalad",

00:36:26.110 --> 00:36:33.819
then typing the command you want to invoke, right, and then you could ask for help for that command,

00:36:36.440 --> 00:36:37.850
which

00:36:37.850 --> 00:36:39.850
provides you with the

00:36:39.950 --> 00:36:41.950
associated help. And

00:36:42.170 --> 00:36:47.740
on my screen it took a little bit longer, just because of the video recording; usually it's a bit faster, like

00:36:48.380 --> 00:36:50.359
five times. And

00:36:50.359 --> 00:36:54.369
then you actually type the parameters for this command.
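The command anatomy described here (the "datalad" entry point, then a subcommand, then its parameters) can be sketched as:

```shell
# Ask for help on a specific subcommand
datalad search --help

# Then run it with parameters; for "search", the parameters are search terms
datalad search raiders neuroimaging
```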
For search,

00:36:54.369 --> 00:36:58.869
it's actually the search terms, and I'll present a few other options later on.

00:36:59.780 --> 00:37:01.780
Whenever you

00:37:02.270 --> 00:37:09.759
start this command for the first time, it will ask you to install our superdataset

00:37:11.210 --> 00:37:17.590
under a "datalad" directory in your home; in my case /demo is the home directory. So it asks whether

00:37:17.590 --> 00:37:21.609
we want to install that superdataset, which you saw available on datalad.org,

00:37:22.310 --> 00:37:24.459
in your home directory.

00:37:24.460 --> 00:37:32.139
And that's what it's doing. It installed quickly, because it's just a small Git repository without any of those datasets

00:37:32.720 --> 00:37:38.970
being directly a part of it; they are linked to it as submodules. It was really fast, and then it loads and caches the

00:37:39.610 --> 00:37:44.069
metadata which became available in that dataset, and that takes a few seconds.

00:37:51.010 --> 00:37:58.649
Whenever that is done, you see that by default it just returns the paths, or names, of the

00:38:00.190 --> 00:38:05.369
datasets as they are within the hierarchy of our superdataset. And

00:38:07.510 --> 00:38:09.510
search searches within the

00:38:10.030 --> 00:38:13.709
repository, the dataset, you are in. So if next time

00:38:13.710 --> 00:38:20.159
I just run the same command, instead of "Oh, do you want to install it?"
it'll ask me whether

00:38:20.160 --> 00:38:23.879
I want to search in this superdataset which I installed in my home directory.

00:38:26.530 --> 00:38:28.530
I type yes,

00:38:32.710 --> 00:38:37.619
and it provides the same result. So, to avoid such interactive questions,

00:38:37.620 --> 00:38:42.780
you could explicitly mention which dataset you want to search in.

00:38:43.060 --> 00:38:45.840
In our case it will be, I'll just specify,

00:38:46.570 --> 00:38:49.170
that the dataset will be this

00:38:49.930 --> 00:38:51.400
canonical

00:38:51.400 --> 00:38:54.270
DataLad dataset which is installed in your

00:38:55.210 --> 00:38:57.750
DataLad directory. When you specify it like this,

00:38:57.750 --> 00:39:06.160
it assumes the location in your home directory; when you use the triple-slash resource identifier as the source URL

00:39:06.360 --> 00:39:14.020
to install datasets, it will go to datasets.datalad.org. And this time we'll search not for "raiders

00:39:14.040 --> 00:39:20.190
neuroimaging", but we'll search for Haxby, one of the authors within this dataset.

00:39:20.950 --> 00:39:27.839
So -s stands for the fields which we want to search through, and -R will now report

00:39:27.840 --> 00:39:30.299
not just the path to the dataset, but also

00:39:30.940 --> 00:39:32.940
list the fields which match the

00:39:33.610 --> 00:39:39.120
query we ran. So in this case it should search the datasets and report the field "author",

00:39:40.210 --> 00:39:43.919
and only the datasets where Haxby was one of the authors.
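Put together, the field-restricted search described here might look like this (the flag spellings follow what the speaker names; consult "datalad search --help" for the exact options in your version):

```shell
# -d ///    : search the canonical superdataset (datasets.datalad.org)
# -s author : restrict the search to the "author" metadata field
# -R author : also report the matching "author" field in the output
datalad search -d /// -s author -R author haxby
```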
00:39:44.620 --> 00:39:46.390
So here they are.

00:39:46.390 --> 00:39:49.410
For convenience, let's just switch to that directory

00:39:50.160 --> 00:39:52.160
under our home.

00:39:52.329 --> 00:39:56.939
Let me clear the screen and go to that directory.

00:39:58.000 --> 00:40:00.299
So now we don't have to specify the

00:40:01.539 --> 00:40:03.660
location of the dataset explicitly,

00:40:03.660 --> 00:40:07.360
and we could just type the same query without -d,

00:40:07.360 --> 00:40:08.669
and it will provide the same results.

00:40:13.420 --> 00:40:15.160
Instead of listing all matching fields,

00:40:15.160 --> 00:40:19.200
let's say in our case it was the "author" field, we could

00:40:20.019 --> 00:40:25.409
explicitly specify which fields we want to search through or to report.

00:40:26.289 --> 00:40:27.813
So in this case, I want to see:

00:40:27.813 --> 00:40:31.319
what's the name of the dataset and what is the author of the dataset?

00:40:31.319 --> 00:40:33.989
We already saw the author, but we didn't see the name.

00:40:35.109 --> 00:40:39.929
And we rerun the command to get the output with those fields included.

00:40:41.710 --> 00:40:43.630
Well, enough of searching!
00:40:43.630 --> 00:40:47.369
Let's clear the screen. What we could do now: we found

00:40:47.680 --> 00:40:51.539
the datasets, right? It seems that the list of datasets which we found

00:40:52.360 --> 00:40:55.860
is worth installing, and we could just

00:40:56.760 --> 00:40:59.350
rely on a paradigm of Linux

00:40:59.350 --> 00:41:05.680
where you compose commands together by using a pipe.

00:41:05.860 --> 00:41:08.900
So, what would this magic do?

00:41:08.900 --> 00:41:16.560
If we run just the first part, what happens: we get only the list of datasets, or paths,

00:41:16.690 --> 00:41:22.560
those which are not installed yet,

00:41:22.560 --> 00:41:24.569
and the OpenfMRI directory is still empty, so we get the list of datasets.

00:41:25.320 --> 00:41:29.060
But then, instead of manually going and doing

00:41:29.060 --> 00:41:33.800
"datalad install openfmri/ds00233"...

00:41:33.980 --> 00:41:39.060
or doing copy-paste, we could just say that the result of this command

00:41:39.400 --> 00:41:41.400
should be passed as

00:41:41.400 --> 00:41:45.760
arguments to the next command, which will be "datalad install".

00:41:45.760 --> 00:41:48.680
The "datalad install" command installs those datasets

00:41:48.680 --> 00:41:51.140
which are either specified by a

00:41:51.819 --> 00:41:56.069
path within the current dataset, or you could provide URLs to the

00:41:56.800 --> 00:42:02.300
install command, and it will go to those websites and download them explicitly from there.
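A sketch of such a pipeline, using xargs to turn the search results into install arguments (the search term is the one from the demo; any filtering of already-installed paths the speaker applied is not shown on screen):

```shell
# Find matching dataset paths, then pass them as arguments to install
datalad search haxby | xargs datalad install
```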
00:42:02.300 --> 00:42:05.780
"datalad install" could be used with other resources

00:42:06.340 --> 00:42:10.800
beyond our canonical DataLad distribution.

00:42:11.140 --> 00:42:13.140
So let's run this command.

00:42:14.960 --> 00:42:18.640
As a result of it, you'll see that now it goes online and

00:42:19.550 --> 00:42:25.630
installs all those datasets, or Git/git-annex repositories, without any data yet.

00:42:25.670 --> 00:42:29.950
So only the files which are committed directly into Git will be present.

00:42:42.040 --> 00:42:47.320
And now we could explore what we have actually got here.

00:42:47.320 --> 00:42:52.200
I'll use another DataLad command. Let me clear the screen to bring it to the top of the screen.

00:42:53.079 --> 00:42:56.099
The next command is "ls", which just lists

00:42:56.940 --> 00:43:01.400
either datasets, or it could also be used to list S3 URLs,

00:43:01.400 --> 00:43:04.380
if you are interested to see what is available in an S3 bucket.

00:43:04.380 --> 00:43:11.159
And we are specifying the options: capital -L for long listing, and -r for recursive,

00:43:11.160 --> 00:43:15.780
so it will go through all datasets locally in the current directory

00:43:15.780 --> 00:43:20.060
(that's why there is a period). And then we'll just remove from the listing the datasets

00:43:20.069 --> 00:43:23.129
which are not installed, because they are not of interest here.

00:43:56.050 --> 00:44:01.889
As you can see, all those datasets which we initially searched for and found,

00:44:03.820 --> 00:44:05.820
right?
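The listing step might be sketched as follows (the grep filter pattern is an assumption; the talk only says that uninstalled datasets are removed from the listing):

```shell
# -L: long listing (repository type, branch, last commit date, sizes)
# -r: recurse into datasets under the current directory (".")
# grep -v drops lines for datasets reported as not installed
datalad ls -L -r . | grep -v 'not installed'
```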
00:44:09.850 --> 00:44:13.860
They became installed, so they became available on our

00:44:14.500 --> 00:44:19.290
local file system, and "ls" gives us an idea of what kind of repository each is:

00:44:19.530 --> 00:44:21.870
is it Git versus annex, which branch it is on,

00:44:22.320 --> 00:44:24.560
what was the date of the last commit.

00:44:24.920 --> 00:44:33.360
Also the sizes: what it tells here is that we have about 4 gigabytes of data referenced in this dataset at the current

00:44:33.900 --> 00:44:39.340
version, but we've got only 0 bytes locally installed.

00:44:39.520 --> 00:44:44.720
We installed only those symlinks I was talking about.

00:44:46.150 --> 00:44:51.570
So, now we could actually explore what we have got.

00:44:53.290 --> 00:44:58.709
Some of the files were committed directly into Git, so they became available on the file system as-is,

00:44:59.320 --> 00:45:06.029
but the data files we could obtain now using the "datalad get" command.

00:45:06.670 --> 00:45:09.629
So what will this command do... Let me clear the screen again...

00:45:10.440 --> 00:45:16.960
We're saying: "Obtain those files! Do it in four parallel processes."

00:45:16.960 --> 00:45:24.040
All the files which match these shell glob expressions:

00:45:24.040 --> 00:45:26.480
all the datasets which we have locally,

00:45:26.860 --> 00:45:32.759
for all the subjects underneath, in the anatomy directory, right? We already obtained two OpenfMRI datasets,

00:45:32.760 --> 00:45:35.040
and now we just want to obtain those data files.

00:45:35.320 --> 00:45:42.780
Let's actually see what this one is pointing to...
It points to all those data files.

00:45:42.940 --> 00:45:46.240
And if we list them with a long listing,

00:45:46.240 --> 00:45:51.480
we'll see that those were symlinks whose targets are at the moment not even present;

00:45:51.820 --> 00:45:56.759
they point to files which we don't have locally. And that's what git-annex will do for us:

00:45:56.760 --> 00:45:59.760
it will go online and fetch all those files

00:46:00.910 --> 00:46:02.910
from wherever they are available.

00:46:03.520 --> 00:46:05.520
So let me run this command now.

00:46:18.220 --> 00:46:21.629
As you can see, there are four processes going on.

00:46:27.640 --> 00:46:29.640
And at the end,

00:46:30.120 --> 00:46:37.060
all DataLad commands provide you a summary of what actions they took.

00:46:37.280 --> 00:46:42.020
Here you could see that it got all those files: "get" was okay.

00:46:42.299 --> 00:46:49.619
Or it might say "get failed" if it failed to get them, and then it provides an action summary, which we might see later in other demos.

00:46:50.650 --> 00:46:57.150
So let's now run the same command which we ran before, to see how much data we actually got.
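The fetch described above might be sketched like this (the glob pattern is an assumption, reconstructing "all subjects' anatomy files in every local dataset"; the exact paths were only shown on screen):

```shell
# -J 4: fetch annexed file content with four parallel download processes
datalad get -J 4 ds*/sub*/anatomy/*
```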
00:47:26.540 --> 00:47:30.199
As you can see, all those for which we didn't ask for any data

00:47:30.200 --> 00:47:33.500
still keep zero bytes, although all

00:47:33.810 --> 00:47:37.969
the files are available and we could browse them;

00:47:37.970 --> 00:47:42.830
but those where we requested additional data files to be obtained finally list how much data

00:47:42.830 --> 00:47:46.159
we have in the working tree

00:47:47.070 --> 00:47:50.059
of those datasets.

00:47:51.120 --> 00:47:54.739
That completes the demo for "search" and "install".

00:47:56.610 --> 00:48:01.489
Now it's your turn to find some datasets interesting to you, and to get the data for them.

00:48:03.330 --> 00:48:10.009
Now that we went through one of the demos on our website (or, as we call them, features), which was data discovery,

00:48:10.410 --> 00:48:12.860
you could go and visit the other

00:48:15.450 --> 00:48:19.010
features described on this page. The first one is for data consumers,

00:48:19.010 --> 00:48:25.580
and describes how you could generate native DataLad datasets from websites or

00:48:26.250 --> 00:48:29.540
S3 buckets using our crawler. So,

00:48:30.870 --> 00:48:33.409
if you know some resource, you could create your own

00:48:33.870 --> 00:48:40.999
DataLad crawler to obtain that data into a DataLad dataset and keep it up to date with periodic reruns.
00:48:41.760 --> 00:48:43.760
The data sharing demo will later show

00:48:44.970 --> 00:48:50.570
examples of how you could share the data, either on GitHub (through GitHub while depositing the data on your website,

00:48:51.090 --> 00:48:56.840
as I demonstrated earlier), or just for collaboration through SSH servers.

00:48:57.870 --> 00:48:59.600
For Git and git-annex users,

00:48:59.600 --> 00:49:07.579
we give a little example of the unique features present in DataLad, contrasting it with

00:49:08.310 --> 00:49:10.881
regular Git and git-annex usage.

00:49:10.881 --> 00:49:16.280
This table outlines those features:

00:49:16.280 --> 00:49:23.740
we operate on multiple datasets at the same time, we operate across datasets seamlessly

00:49:23.750 --> 00:49:30.169
(so you don't have to switch directories just to operate on specific data files), we provide metadata support

00:49:30.860 --> 00:49:37.840
and aggregation from different data sources, and a unified authentication interface.

00:49:38.480 --> 00:49:40.480
Also, one of the

00:49:40.700 --> 00:49:48.159
new unique features in DataLad is the ability to rerun previously run commands on the data, to see how

00:49:49.820 --> 00:49:52.299
things changed, or just to keep a nice

00:49:53.060 --> 00:49:58.509
protocol of the actions you have done and record them within your Git/git-annex history.

00:50:00.350 --> 00:50:07.539
And the last one goes in detail, with an example, on how to use HeuDiConv with your datasets,

00:50:07.730 --> 00:50:09.730
relying on our

00:50:10.100 --> 00:50:14.559
naming convention for how to name scanning sequences in the scanner.
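The record-and-rerun capability mentioned above is exposed in DataLad as the "run" and "rerun" commands; a minimal sketch (the script name and commit message are hypothetical):

```shell
# Execute a command and record it, together with the resulting changes,
# in the dataset's Git history
datalad run -m "compute summary statistics" python summarize.py

# Later, re-execute the most recently recorded command to see how
# results change
datalad rerun
```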
00:50:15.640 --> 00:50:18.540
I hope that you liked this presentation,

00:50:18.540 --> 00:50:27.700
and that you liked what DataLad has to offer, so I just want to summarize what DataLad does.

00:50:27.700 --> 00:50:30.300
And what does it do? It helps to manage and share

00:50:30.380 --> 00:50:34.300
available and your own data via a simple command-line or Python interface.

00:50:34.880 --> 00:50:38.530
We already provide access to over 10 terabytes of neuroimaging data,

00:50:38.530 --> 00:50:46.269
and we help with authentication, crawling of websites, getting data from the archives in which it was originally distributed,

00:50:47.000 --> 00:50:49.000
and publishing new or derived data.

00:50:50.000 --> 00:50:55.449
Underneath we use regular, pure Git and git-annex repositories, so whatever tools

00:50:55.450 --> 00:50:58.780
you've gotten used to, you could still use them.

00:50:58.780 --> 00:51:01.810
And if you're an expert Git and git-annex user,

00:51:02.210 --> 00:51:03.880
we will not limit your powers:

00:51:03.880 --> 00:51:11.800
you could do the same stuff you did before with your Git/git-annex repositories. We also provide a somewhat human-

00:51:12.380 --> 00:51:14.360
accessible

00:51:14.360 --> 00:51:22.060
metadata interface, so in general, if you just want to search for some datasets, it's quite convenient with "datalad search".

00:51:23.240 --> 00:51:25.070
Documentation is growing;

00:51:25.070 --> 00:51:28.389
you're welcome to contribute, and the project is open source.
00:51:29.240 --> 00:51:36.790
I hope that after you've seen this presentation, you will agree that managing data can be as simple as managing code and software. Thank you!