├── .datalad
│   ├── config
│   ├── .gitattributes
│   └── status
│       └── fetched-subs.log
├── NWB
│   └── NWB_for_SFN2023.mp4
├── BABS
│   └── BABS_OHBM2023_20230622.mp4
├── DataLad
│   ├── What_is_DataLad_.m
│   ├── Research_Data_Management_01.m
│   ├── Research_Data_Management_02.m
│   ├── Research_Data_Management_03.m
│   ├── Research_Data_Management_04.m
│   ├── A_hands-on_introduction_to_DataLad.m
│   ├── OHBM_Poster_presentation__844__DataCat.m
│   ├── OHBM_Poster_presentation__2057__FAIRly_big.m
│   ├── DataLad_for_Machine_Learning_-_An_Introduction.m
│   ├── Data_versioning_and_transformation_with_DataLad.m
│   ├── DataLad_vs_Git_Git-annex_for_modular_data_management.m
│   ├── Demo__Fully_recomputing_a_real_scientific_paper__DIY_.m
│   ├── 01_Introduction_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 09_Collaboration_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 05_Drop_and_remove_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 06_Branching__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── How_to_introduce_data_management_technology_without_sinking_the_ship_.m
│   ├── DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.m
│   ├── DataLad__-_Decentralized_Management_of_Digital_Objects_for_Open_Science.m
│   ├── Follow_the_rabbits__The_2020_OHBM_Brainhack_Traintrack_Session_on_DataLad.m
│   ├── 03_Basics_of_Version_Control_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 07_Data_publication__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 08_Data_publication__part_2__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── Perpetual_decentralized_management_of_digital_objects_for_collaborative_open_science.m
│   ├── 04_Version_control_underneath_the_hood__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── 10_Preview_of_reproducibility_features_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m
│   ├── FAIRly_big__A_framework_for_computationally_reproducible_processing_of_large_scale_data.m
│   ├── OHBM_2022_Educational_course__How_to_Write_a_Re-executable_Publication__-_What_is_DataLad_.m
│   ├── Demo__Fully_recomputing_a_real_scientific_paper__DIY_.en.vtt
│   ├── What_is_DataLad_.en.vtt
│   ├── OHBM_Poster_presentation__844__DataCat.srt
│   ├── OHBM_Poster_presentation__844__DataCat.en.vtt
│   ├── OHBM_Poster_presentation__2057__FAIRly_big.en.vtt
│   ├── 02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.en.vtt
│   └── DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.en.vtt
├── ReproNim
│   ├── Introduction_to_DataLad.m
│   ├── How_Would_ReproNim_do_That_.m
│   ├── Introduction_to_Containers.m
│   ├── ReproNim_Webinar__COINSTAC.m
│   ├── ReproNim_Webinar__Containers.m
│   ├── ReproNim_Webinar__ReproLake.m
│   ├── ReproNim_Webinar__ReproPond.m
│   ├── ReproNim_Webinar__ReproSchema.m
│   ├── ReproNim_Webinar__eCOBIDAS_ReproSchema.m
│   ├── Introduction_to_Semantic_Web_and_Linked_Data.m
│   ├── ReproNim_Webinar__IQ_in_Typical_Development.m
│   ├── The_NeuroImaging_Data_Model__NIDM__in_Action.m
│   ├── ReproNim_Webinar__How_Would_ReproNim_do_That_.m
│   ├── ReproNim_Webinar__Reproducible_Execution_of_Data_Collection_Processing.m
│   └── Depression_and_obesity__using_the_ReproNim_technologies_to_study_public_health_problems.m
├── .gitattributes
├── ABCD-ReproNim_Course
│   ├── Week_10_Instructor_Q_A.m
│   ├── Week_11_Instructor_Q_A.m
│   ├── Week_12_Instructor_Q_A.m
│   ├── Week_7_Instructor_Q_A.m
│   ├── Week_8_Instructor_Q_A.m
│   ├── Week_9_Instructor_Q_A.m
│   ├── Week_9_ABCD__Biospecimens.m
│   ├── Week_11_ABCD__Visualizing_Data.m
│   ├── Week_13_Q_A__Project_Week_Pitches.m
│   ├── ABCD-ReproNim_Project_Week__2021_Project_Week_Kickoff.m
│   ├── ABCD-ReproNim_Project_Week__Team_Project_Presentations.m
│   ├── Week_10_ReproNim__ReproMan_Execution_and_Environment_Manager.m
│   ├── Week_11_ReproNim__ReproPub_-_The_Re-Executable_Publication.m
│   ├── Week12__Analytic_Approaches__Reproducible_Practices_in_Machine_Learning.m
│   └── Week_10_ABCD__Novel_Technologies_-_Mobile__Wearable__and_Social_Media.m
├── Open_Data_In_Neurophysiology_Symposium_2023
│   ├── Day_1_Session_1__Panel_Discussion.m
│   ├── Day_1_Session_2__Panel_Discussion.m
│   ├── Introduction__Satrajit_Ghosh___Nima_Dehghani.m
│   ├── Day_1_Session_2__Oliver_Rubel___NWB__Neurodata_without_borders_.m
│   ├── Day_1__Session_3__Jerome_Lecoq__Brain_Observatory___OpenScope.m
│   ├── Day_1_Session_1.Tim_Harris__Neuropixels_NXT__in_vivo_high_density_electrophysiology.m
│   ├── Day_1_Session_3__David_Feng__Compute__data___standards_in_large-scale_neuroscience.m
│   ├── Day_1_Session_1._Alipasha_Vaziri__Single_cell_resolution_cortex-wide_volumetric_recording.m
│   ├── Day_1_Session2___Jeremy_Magland__Web-based_visualization___analysis_of_neurophysiology_data.m
│   ├── Day_1_Session_1__Adam_Cohen__Voltage_Imaging__all-optical_electrophysiology_of_neuron_excitability.m
│   ├── Day_1_Session_1__Shadi_Dayeh_Recording_the_human_brain_activity__multi-thousand_channel_ecog_grids_.m
│   ├── Day_1_Session_2_Satrajit_Ghosh__DANDI__Distributed_Archives_for_Neurophysiology_Data_Integration.m
│   ├── Day_1_Session_2__Dimitri_Yatsenko__End-to-end_computational_workflows_for_neuroscience_research.m
│   ├── Keynote_1__Andrea_Beckel_Mitchener__Brain_Research_Through_Advancing_Innovative_Neurotechnologies.m
│   └── Day1_Session3__Hideyuki_Okano__Brain_Mapping___Disease_Modellings_with_Genetically_Modified_Marmoset.m
├── Open_Minds___Pitt
│   ├── Vendor-Neutral_Applications_for_Quantitative_MRI_Quality_Control.m
│   ├── Overview_of_various_noise_contributions_to_fMRI_signal_by_Dr._Thomas_T._Liu.m
│   ├── Open_discussion_on_MR_Imaging_Centre_Facility_Operations__focus_on_QA_Processes.m
│   ├── Review_of_Quality_Control_Considerations_for_Resting-state_fMRI__Dr._Jean_Chen.m
│   ├── Setting_up_your_experiment_for__not_success__but_less_failure__by_Dr._Ben_Inglis.m
│   ├── Academic_Exit_Plan__awareness_of_and_planning_for_non-traditional_careers_beyond_academia.m
│   ├── MR_Scanner_QA__Phantoms__commercial_solutions__cloud_services_and_potential_standards_.m
│   ├── Relationship_between_Structural_MRI_Quality_ratings_and_scores__and_morphometric_measures.m
│   ├── _Quality_Conversation__Phantom_data_matter_in_Neuroimaging_QA_QC_beyond_basic_scanner_QA.m
│   ├── Diffusion_Weighted_MRI_QC__Validation_of_tractography_methods_and_related_issues_by_Dr._Yendiki.m
│   ├── Automatic_quality_assessment_of_structural_MRI_in_pediatric_neuroimaging__Quality_Conversations_.m
│   ├── Comparison_of_retrospective_motion_correction_strategies_in_resting-state_fMRI_by_Dr._Linden_Parkes.m
│   ├── Influence_of_Motion___Physiological_noise_on_fMRI__QC__solutions__and_challenges_by_Dr._Rasmus_Birn.m
│   ├── Overview_of_prospective_motion_detection_and_correction_methods_in_neuroimaging_by_Dr._Paul_Wighton.m
│   └── Restoring_statistical_validity_in_group_analyses_of_motion_corrupted_MRI_data_by_Dr._Antoine_Lutti.m
├── README.md
└── code
    └── fetch_subs.sh
/.datalad/config:
--------------------------------------------------------------------------------
1 | [datalad "dataset"]
2 | id = ddba1970-21ed-45b9-90fb-a5c5033fcc7e
3 |
--------------------------------------------------------------------------------
/.datalad/.gitattributes:
--------------------------------------------------------------------------------
1 |
2 | config annex.largefiles=nothing
3 | metadata/aggregate* annex.largefiles=nothing
4 | metadata/objects/** annex.largefiles=(anything)
--------------------------------------------------------------------------------
/NWB/NWB_for_SFN2023.mp4:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/zk/GZ/MD5E-s1043812232--b45fc82669ae2fdc5621b157096ba819.mp4/MD5E-s1043812232--b45fc82669ae2fdc5621b157096ba819.mp4
--------------------------------------------------------------------------------
/BABS/BABS_OHBM2023_20230622.mp4:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Vw/K5/MD5E-s79204972--7360871b25553476871170cf2490f01d.mp4/MD5E-s79204972--7360871b25553476871170cf2490f01d.mp4
--------------------------------------------------------------------------------
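The two .mp4 entries above are symlinks into `.git/annex/objects`, and their targets end in git-annex MD5E keys of the shape `MD5E-s<size>--<md5hex><ext>`: content identity is the byte size plus the MD5 digest, with the original file extension preserved. A minimal sketch of how such a key is derived; `md5e_key` is a hypothetical helper for illustration, not part of git-annex:

```python
import hashlib

def md5e_key(data: bytes, ext: str = "") -> str:
    """Build a git-annex MD5E-style key: MD5E-s<size>--<md5hex><ext>.

    The annexed .mp4 files in this dataset point into
    .git/annex/objects via keys of exactly this shape.
    """
    digest = hashlib.md5(data).hexdigest()
    return f"MD5E-s{len(data)}--{digest}{ext}"

# For example, a 5-byte file containing b"hello" stored as an .mp4:
key = md5e_key(b"hello", ".mp4")
# → "MD5E-s5--5d41402abc4b2a76b9719d911017c592.mp4"
```

The two-level directory prefix in the link targets (e.g. `zk/GZ/`) is a separate hash of the key that git-annex uses to shard the object store, so it is not reproduced here.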
/DataLad/What_is_DataLad_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Mf/Qx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IN0vowZ67vs/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IN0vowZ67vs
--------------------------------------------------------------------------------
/ReproNim/Introduction_to_DataLad.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/6z/qJ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pVrjRRrmKbY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pVrjRRrmKbY
--------------------------------------------------------------------------------
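Unlike the .mp4 files, the .m entries carry URL-backend keys: the source URL (here a `yt:`-prefixed YouTube URL) is embedded in the key name with git-annex's filename sanitization, where `/` becomes `%`, `:` becomes `&c`, and other reserved characters become `,` plus their decimal character code (`,63` for `?`, `,61` for `=`). A small decoder sketch under that assumption (two-digit codes, as in the keys above); `decode_annex_url_key` is a hypothetical helper, not a git-annex API:

```python
import re

def decode_annex_url_key(key: str) -> str:
    """Recover the source URL embedded in a git-annex URL-backend key.

    Assumed escaping, inferred from the keys in this dataset:
      '%'  -> '/'
      '&c' -> ':'   '&a' -> '&'   '&s' -> '%'
      ',NN' (two decimal digits) -> chr(NN), e.g. ',63' -> '?', ',61' -> '='
    """
    body = key[len("URL--"):] if key.startswith("URL--") else key
    out, i = [], 0
    while i < len(body):
        ch = body[i]
        if ch == "%":                       # '/' was replaced by '%'
            out.append("/"); i += 1
        elif ch == "&" and i + 1 < len(body):
            out.append({"c": ":", "a": "&", "s": "%"}.get(body[i + 1], "&" + body[i + 1]))
            i += 2
        elif (m := re.match(r",(\d{2})", body[i:])):
            out.append(chr(int(m.group(1))))  # e.g. ',63' -> '?'
            i += 3
        else:
            out.append(ch); i += 1
    return "".join(out)

# The key for What_is_DataLad_.m decodes to a yt: YouTube URL:
decode_annex_url_key(
    "URL--yt&chttps&c%%www.youtube.com%watch,63v,61IN0vowZ67vs"
)
# → "yt:https://www.youtube.com/watch?v=IN0vowZ67vs"
```

This is why the .m files need no stored content at all: any clone can re-fetch a video from its embedded URL on `datalad get` / `git annex get`.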
/.gitattributes:
--------------------------------------------------------------------------------
1 |
2 | * annex.backend=MD5E
3 | **/.git* annex.largefiles=nothing
4 | * annex.largefiles=((mimeencoding=binary)and(largerthan=0))
5 | *.srt annex.largefiles=nothing
6 |
--------------------------------------------------------------------------------
/DataLad/Research_Data_Management_01.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/wK/K8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61fL3DWzSWFL8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61fL3DWzSWFL8
--------------------------------------------------------------------------------
/DataLad/Research_Data_Management_02.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/P1/JP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61GrOfE8jv12s/URL--yt&chttps&c%%www.youtube.com%watch,63v,61GrOfE8jv12s
--------------------------------------------------------------------------------
/DataLad/Research_Data_Management_03.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/M1/fv/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lO4yfl30_uc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lO4yfl30_uc
--------------------------------------------------------------------------------
/DataLad/Research_Data_Management_04.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Zv/qQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ePgH-kK8h8/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ePgH-kK8h8
--------------------------------------------------------------------------------
/ReproNim/How_Would_ReproNim_do_That_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/F1/Fv/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dcY1eXs6EkM/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dcY1eXs6EkM
--------------------------------------------------------------------------------
/ReproNim/Introduction_to_Containers.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/3w/xW/URL--yt&chttps&c%%www.youtube.com%watch,63v,615arBTnYWZq4/URL--yt&chttps&c%%www.youtube.com%watch,63v,615arBTnYWZq4
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__COINSTAC.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/gf/Xg/URL--yt&chttps&c%%www.youtube.com%watch,63v,616lpsro_L9-Y/URL--yt&chttps&c%%www.youtube.com%watch,63v,616lpsro_L9-Y
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__Containers.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Xf/Z1/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ix3lC6HGo-Q/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ix3lC6HGo-Q
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__ReproLake.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/M7/5z/URL--yt&chttps&c%%www.youtube.com%watch,63v,61VQ5t24mrvJI/URL--yt&chttps&c%%www.youtube.com%watch,63v,61VQ5t24mrvJI
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__ReproPond.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/G1/kx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61clIL2LJcHXY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61clIL2LJcHXY
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__ReproSchema.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/kV/1F/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dDuP-Znso5Y/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dDuP-Znso5Y
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_10_Instructor_Q_A.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Q7/WJ/URL--yt&chttps&c%%www.youtube.com%watch,63v,614pEOGYcbx64/URL--yt&chttps&c%%www.youtube.com%watch,63v,614pEOGYcbx64
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_11_Instructor_Q_A.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/zW/KW/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QFngbg74H1o/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QFngbg74H1o
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_12_Instructor_Q_A.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/fJ/VG/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zAqkd9sSspk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zAqkd9sSspk
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_7_Instructor_Q_A.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/X4/F3/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IQU77HcUfwI/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IQU77HcUfwI
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_8_Instructor_Q_A.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/5j/PP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WIBQ7k5rVhc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WIBQ7k5rVhc
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_9_Instructor_Q_A.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/q5/Jv/URL--yt&chttps&c%%www.youtube.com%watch,63v,619-8SwBIkN2k/URL--yt&chttps&c%%www.youtube.com%watch,63v,619-8SwBIkN2k
--------------------------------------------------------------------------------
/DataLad/A_hands-on_introduction_to_DataLad.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/zf/2v/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_I3JFhJJtW0/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_I3JFhJJtW0
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_9_ABCD__Biospecimens.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/FG/JW/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QcsifMz5_fQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61QcsifMz5_fQ
--------------------------------------------------------------------------------
/DataLad/OHBM_Poster_presentation__844__DataCat.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/V5/wZ/URL--yt&chttps&c%%www.youtube.com%watch,63v,614GERwj49KFc/URL--yt&chttps&c%%www.youtube.com%watch,63v,614GERwj49KFc
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__eCOBIDAS_ReproSchema.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Mf/92/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bQd-e_v2iCc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bQd-e_v2iCc
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_11_ABCD__Visualizing_Data.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/WK/2k/URL--yt&chttps&c%%www.youtube.com%watch,63v,613r73oYta0yA/URL--yt&chttps&c%%www.youtube.com%watch,63v,613r73oYta0yA
--------------------------------------------------------------------------------
/DataLad/OHBM_Poster_presentation__2057__FAIRly_big.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/ww/50/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YvZacWgGRZY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YvZacWgGRZY
--------------------------------------------------------------------------------
/ReproNim/Introduction_to_Semantic_Web_and_Linked_Data.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/5X/wf/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KDMEes_syjE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KDMEes_syjE
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__IQ_in_Typical_Development.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/zj/12/URL--yt&chttps&c%%www.youtube.com%watch,63v,61RdJ_Ac1ZO8M/URL--yt&chttps&c%%www.youtube.com%watch,63v,61RdJ_Ac1ZO8M
--------------------------------------------------------------------------------
/ReproNim/The_NeuroImaging_Data_Model__NIDM__in_Action.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/2J/MZ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61223T_s9xSKo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61223T_s9xSKo
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_13_Q_A__Project_Week_Pitches.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/vP/89/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SySRHAp3uRk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SySRHAp3uRk
--------------------------------------------------------------------------------
/DataLad/DataLad_for_Machine_Learning_-_An_Introduction.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/1V/jx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61oXd1GPf-Zv4/URL--yt&chttps&c%%www.youtube.com%watch,63v,61oXd1GPf-Zv4
--------------------------------------------------------------------------------
/DataLad/Data_versioning_and_transformation_with_DataLad.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/fp/xP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61wimd1uhIJ8g/URL--yt&chttps&c%%www.youtube.com%watch,63v,61wimd1uhIJ8g
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__How_Would_ReproNim_do_That_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/K6/63/URL--yt&chttps&c%%www.youtube.com%watch,63v,61NPlAQdSDnBk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61NPlAQdSDnBk
--------------------------------------------------------------------------------
/DataLad/DataLad_vs_Git_Git-annex_for_modular_data_management.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/5x/Ff/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Yrg6DgOcbPE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Yrg6DgOcbPE
--------------------------------------------------------------------------------
/DataLad/Demo__Fully_recomputing_a_real_scientific_paper__DIY_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/vJ/G8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61nhLqmF58SLQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61nhLqmF58SLQ
--------------------------------------------------------------------------------
/DataLad/01_Introduction_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Gw/Jp/URL--yt&chttps&c%%www.youtube.com%watch,63v,6140ZcGp2vHXk/URL--yt&chttps&c%%www.youtube.com%watch,63v,6140ZcGp2vHXk
--------------------------------------------------------------------------------
/DataLad/09_Collaboration_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/W8/k5/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AuM6bc7-N6U/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AuM6bc7-N6U
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/ABCD-ReproNim_Project_Week__2021_Project_Week_Kickoff.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/m7/7Q/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zTOleP0JIqo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61zTOleP0JIqo
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/ABCD-ReproNim_Project_Week__Team_Project_Presentations.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/QG/G9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61q2xdwKgtbos/URL--yt&chttps&c%%www.youtube.com%watch,63v,61q2xdwKgtbos
--------------------------------------------------------------------------------
/DataLad/02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/5k/0v/URL--yt&chttps&c%%www.youtube.com%watch,63v,61N7wMaaTAyzE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61N7wMaaTAyzE
--------------------------------------------------------------------------------
/DataLad/05_Drop_and_remove_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/XJ/wk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61iulQIhPqRzQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61iulQIhPqRzQ
--------------------------------------------------------------------------------
/DataLad/06_Branching__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/mK/v9/URL--yt&chttps&c%%www.youtube.com%watch,63v,618TyMg9SK35U/URL--yt&chttps&c%%www.youtube.com%watch,63v,618TyMg9SK35U
--------------------------------------------------------------------------------
/DataLad/How_to_introduce_data_management_technology_without_sinking_the_ship_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/zW/Kg/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uH75kYgwLH4/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uH75kYgwLH4
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1__Panel_Discussion.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/PP/Xp/URL--yt&chttps&c%%www.youtube.com%watch,63v,61jI9grk7l9kk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61jI9grk7l9kk
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2__Panel_Discussion.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/QF/WF/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z4iTFH1adLw/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z4iTFH1adLw
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_10_ReproNim__ReproMan_Execution_and_Environment_Manager.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/M2/mz/URL--yt&chttps&c%%www.youtube.com%watch,63v,61grIVFbYH7YE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61grIVFbYH7YE
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_11_ReproNim__ReproPub_-_The_Re-Executable_Publication.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/pJ/95/URL--yt&chttps&c%%www.youtube.com%watch,63v,61PlTJpErMCEk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61PlTJpErMCEk
--------------------------------------------------------------------------------
/DataLad/DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/wp/Zx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61sDP1jhRkKRo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61sDP1jhRkKRo
--------------------------------------------------------------------------------
/DataLad/DataLad__-_Decentralized_Management_of_Digital_Objects_for_Open_Science.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/gW/3K/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pIGFS8XDjco/URL--yt&chttps&c%%www.youtube.com%watch,63v,61pIGFS8XDjco
--------------------------------------------------------------------------------
/DataLad/Follow_the_rabbits__The_2020_OHBM_Brainhack_Traintrack_Session_on_DataLad.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/6M/pW/URL--yt&chttps&c%%www.youtube.com%watch,63v,61L5A0MXqFrOY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61L5A0MXqFrOY
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Vendor-Neutral_Applications_for_Quantitative_MRI_Quality_Control.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/vZ/zq/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ob0hPa1JQac/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ob0hPa1JQac
--------------------------------------------------------------------------------
/ReproNim/ReproNim_Webinar__Reproducible_Execution_of_Data_Collection_Processing.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/xq/g8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dwBtrpI2iS0/URL--yt&chttps&c%%www.youtube.com%watch,63v,61dwBtrpI2iS0
--------------------------------------------------------------------------------
/DataLad/03_Basics_of_Version_Control_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/vj/VJ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IXSE-KtQVBs/URL--yt&chttps&c%%www.youtube.com%watch,63v,61IXSE-KtQVBs
--------------------------------------------------------------------------------
/DataLad/07_Data_publication__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/0j/Zx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WwSp22zVwV8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61WwSp22zVwV8
--------------------------------------------------------------------------------
/DataLad/08_Data_publication__part_2__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Zp/m4/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LQ3gmSOT-Io/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LQ3gmSOT-Io
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Introduction__Satrajit_Ghosh___Nima_Dehghani.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Kk/FF/URL--yt&chttps&c%%www.youtube.com%watch,63v,61EO8QVOcdQYY/URL--yt&chttps&c%%www.youtube.com%watch,63v,61EO8QVOcdQYY
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week12__Analytic_Approaches__Reproducible_Practices_in_Machine_Learning.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/pX/Wq/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LAddDaqUe0A/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LAddDaqUe0A
--------------------------------------------------------------------------------
/ABCD-ReproNim_Course/Week_10_ABCD__Novel_Technologies_-_Mobile__Wearable__and_Social_Media.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/z4/Pm/URL--yt&chttps&c%%www.youtube.com%watch,63v,61MFk98_ykknQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61MFk98_ykknQ
--------------------------------------------------------------------------------
/DataLad/Perpetual_decentralized_management_of_digital_objects_for_collaborative_open_science.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/ZX/57/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SJ64rSMD9PU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61SJ64rSMD9PU
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Overview_of_various_noise_contributions_to_fMRI_signal_by_Dr._Thomas_T._Liu.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/vP/28/URL--yt&chttps&c%%www.youtube.com%watch,63v,6176y1Vg12oeA/URL--yt&chttps&c%%www.youtube.com%watch,63v,6176y1Vg12oeA
--------------------------------------------------------------------------------
/DataLad/04_Version_control_underneath_the_hood__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/X4/m8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lBj5J7aKnPc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61lBj5J7aKnPc
--------------------------------------------------------------------------------
/DataLad/10_Preview_of_reproducibility_features_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/WP/Q9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AX3lIw9LQbA/URL--yt&chttps&c%%www.youtube.com%watch,63v,61AX3lIw9LQbA
--------------------------------------------------------------------------------
/DataLad/FAIRly_big__A_framework_for_computationally_reproducible_processing_of_large_scale_data.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/3M/g1/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YDtEKUWUPTQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61YDtEKUWUPTQ
--------------------------------------------------------------------------------
/DataLad/OHBM_2022_Educational_course__How_to_Write_a_Re-executable_Publication__-_What_is_DataLad_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/V8/xX/URL--yt&chttps&c%%www.youtube.com%watch,63v,61s1zrB_sDbDU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61s1zrB_sDbDU
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Open_discussion_on_MR_Imaging_Centre_Facility_Operations__focus_on_QA_Processes.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/zJ/91/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_Vhe892uDVQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61_Vhe892uDVQ
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Review_of_Quality_Control_Considerations_for_Resting-state_fMRI__Dr._Jean_Chen.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/fF/xp/URL--yt&chttps&c%%www.youtube.com%watch,63v,612HlQPaPDzNQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,612HlQPaPDzNQ
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Setting_up_your_experiment_for__not_success__but_less_failure__by_Dr._Ben_Inglis.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/5p/gx/URL--yt&chttps&c%%www.youtube.com%watch,63v,61xpHzkg4JOkU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61xpHzkg4JOkU
--------------------------------------------------------------------------------
/ReproNim/Depression_and_obesity__using_the_ReproNim_technologies_to_study_public_health_problems.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/50/2W/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KpgF18p3Woo/URL--yt&chttps&c%%www.youtube.com%watch,63v,61KpgF18p3Woo
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2__Oliver_Rubel___NWB__Neurodata_without_borders_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/1P/Vx/URL--yt&chttps&c%%www.youtube.com%watch,63v,611pqggHHvHdw/URL--yt&chttps&c%%www.youtube.com%watch,63v,611pqggHHvHdw
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1__Session_3__Jerome_Lecoq__Brain_Observatory___OpenScope.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/P9/vp/URL--yt&chttps&c%%www.youtube.com%watch,63v,61b97exHKmgho/URL--yt&chttps&c%%www.youtube.com%watch,63v,61b97exHKmgho
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Academic_Exit_Plan__awareness_of_and_planning_for_non-traditional_careers_beyond_academia.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/XM/4F/URL--yt&chttps&c%%www.youtube.com%watch,63v,614s8xan-eH0c/URL--yt&chttps&c%%www.youtube.com%watch,63v,614s8xan-eH0c
--------------------------------------------------------------------------------
/Open_Minds___Pitt/MR_Scanner_QA__Phantoms__commercial_solutions__cloud_services_and_potential_standards_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/9x/qk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HFEt3ZxLBl8/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HFEt3ZxLBl8
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Relationship_between_Structural_MRI_Quality_ratings_and_scores__and_morphometric_measures.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/WQ/7z/URL--yt&chttps&c%%www.youtube.com%watch,63v,61md3_oVugOUc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61md3_oVugOUc
--------------------------------------------------------------------------------
/Open_Minds___Pitt/_Quality_Conversation__Phantom_data_matter_in_Neuroimaging_QA_QC_beyond_basic_scanner_QA.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/81/J6/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HcS9_LFdoPw/URL--yt&chttps&c%%www.youtube.com%watch,63v,61HcS9_LFdoPw
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Diffusion_Weighted_MRI_QC__Validation_of_tractography_methods_and_related_issues_by_Dr._Yendiki.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Jg/9F/URL--yt&chttps&c%%www.youtube.com%watch,63v,61plB-wmuhEQk/URL--yt&chttps&c%%www.youtube.com%watch,63v,61plB-wmuhEQk
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Automatic_quality_assessment_of_structural_MRI_in_pediatric_neuroimaging__Quality_Conversations_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/kP/j5/URL--yt&chttps&c%%www.youtube.com%watch,63v,618QYirk8opLA/URL--yt&chttps&c%%www.youtube.com%watch,63v,618QYirk8opLA
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Comparison_of_retrospective_motion_correction_strategies_in_resting-state_fMRI_by_Dr._Linden_Parkes.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/xj/2V/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bo2AFvJ5mYU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61bo2AFvJ5mYU
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Influence_of_Motion___Physiological_noise_on_fMRI__QC__solutions__and_challenges_by_Dr._Rasmus_Birn.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/Qz/kz/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z2d_3eyzfJw/URL--yt&chttps&c%%www.youtube.com%watch,63v,61z2d_3eyzfJw
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Overview_of_prospective_motion_detection_and_correction_methods_in_neuroimaging_by_Dr._Paul_Wighton.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/JV/0X/URL--yt&chttps&c%%www.youtube.com%watch,63v,619_BH3NJcRzs/URL--yt&chttps&c%%www.youtube.com%watch,63v,619_BH3NJcRzs
--------------------------------------------------------------------------------
/Open_Minds___Pitt/Restoring_statistical_validity_in_group_analyses_of_motion_corrupted_MRI_data_by_Dr._Antoine_Lutti.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/1w/V6/URL--yt&chttps&c%%www.youtube.com%watch,63v,61XLSzzJzmtvc/URL--yt&chttps&c%%www.youtube.com%watch,63v,61XLSzzJzmtvc
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1.Tim_Harris__Neuropixels_NXT__in_vivo_high_density_electrophysiology.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/8M/jp/URL--yt&chttps&c%%www.youtube.com%watch,63v,61DZRvA7c5UzQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61DZRvA7c5UzQ
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_3__David_Feng__Compute__data___standards_in_large-scale_neuroscience.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/qQ/JQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uvavLax2Txs/URL--yt&chttps&c%%www.youtube.com%watch,63v,61uvavLax2Txs
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1._Alipasha_Vaziri__Single_cell_resolution_cortex-wide_volumetric_recording.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/6M/GQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61TGaPM72WdDE/URL--yt&chttps&c%%www.youtube.com%watch,63v,61TGaPM72WdDE
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session2___Jeremy_Magland__Web-based_visualization___analysis_of_neurophysiology_data.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/0k/87/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ZSK5jHy3WzU/URL--yt&chttps&c%%www.youtube.com%watch,63v,61ZSK5jHy3WzU
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1__Adam_Cohen__Voltage_Imaging__all-optical_electrophysiology_of_neuron_excitability.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/8V/X9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61yEx4YbtlO9M/URL--yt&chttps&c%%www.youtube.com%watch,63v,61yEx4YbtlO9M
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_1__Shadi_Dayeh_Recording_the_human_brain_activity__multi-thousand_channel_ecog_grids_.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/fW/56/URL--yt&chttps&c%%www.youtube.com%watch,63v,61hBLuh4hs-To/URL--yt&chttps&c%%www.youtube.com%watch,63v,61hBLuh4hs-To
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2_Satrajit_Ghosh__DANDI__Distributed_Archives_for_Neurophysiology_Data_Integration.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/3F/Zg/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LBjGJ_DJ91M/URL--yt&chttps&c%%www.youtube.com%watch,63v,61LBjGJ_DJ91M
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day_1_Session_2__Dimitri_Yatsenko__End-to-end_computational_workflows_for_neuroscience_research.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/M5/jP/URL--yt&chttps&c%%www.youtube.com%watch,63v,61C_BG6cVHSbQ/URL--yt&chttps&c%%www.youtube.com%watch,63v,61C_BG6cVHSbQ
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Keynote_1__Andrea_Beckel_Mitchener__Brain_Research_Through_Advancing_Innovative_Neurotechnologies.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/2p/F9/URL--yt&chttps&c%%www.youtube.com%watch,63v,61x15DSuXCmRM/URL--yt&chttps&c%%www.youtube.com%watch,63v,61x15DSuXCmRM
--------------------------------------------------------------------------------
/Open_Data_In_Neurophysiology_Symposium_2023/Day1_Session3__Hideyuki_Okano__Brain_Mapping___Disease_Modellings_with_Genetically_Modified_Marmoset.m:
--------------------------------------------------------------------------------
1 | ../.git/annex/objects/mg/5Z/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Zv5NOB-mkXg/URL--yt&chttps&c%%www.youtube.com%watch,63v,61Zv5NOB-mkXg
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # ReproTube inceptor
2 |
3 | This is not even a prototype, just the result of running 3 git-annex commands
4 | (check the git history for datalad run records) to fetch 3 sample channels of interest.
5 |
6 | ## HOWTO
7 |
8 | At the moment, to download an actual video you need youtube-dl (or yt-dlp) installed,
9 | and you need to invoke git-annex with a special option that allows downloads from
10 | addresses it would otherwise consider potentially dangerous, e.g.
11 |
12 | git -c annex.security.allowed-ip-addresses=all annex get ReproNim/How_Would_ReproNim_do_That_.m
13 |
14 |
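Subtitles for already-fetched videos can be pulled with the helper script
`code/fetch_subs.sh`, which strips the `.m` extension and lets
`yt-dlp --write-subs` drop an `.en.vtt` sidecar next to the video. A minimal
sketch of that naming rule (the filename below is just one example from this
dataset):

```shell
# Derive the subtitle sidecar name the same way code/fetch_subs.sh does
f="DataLad/What_is_DataLad_.m"
fbase=${f%.*}               # strip the trailing extension
subs="${fbase}.en.vtt"      # where yt-dlp would write English captions
echo "$subs"
```

Invoke the script as `bash code/fetch_subs.sh DataLad/*.m`; every attempt is
logged to `.datalad/status/fetched-subs.log`, so already-tried files are
skipped on re-runs.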
--------------------------------------------------------------------------------
/code/fetch_subs.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -eu
4 |
5 | # fetch subtitles for video file(s) if there are none yet
6 | mkdir -p .datalad/status
7 | subs_done=.datalad/status/fetched-subs.log
8 | touch "$subs_done" # ensure the log exists so >> and grep work on the first run
9 |
10 | for f in "$@"; do
11 | fbase=${f%.*}
12 | vtts=$(/bin/ls -d "$fbase".*.vtt 2>/dev/null || :)
13 |     if [ -n "$vtts" ]; then
14 | # echo "$fbase: already has some vtts" # : $vtts"
15 | continue
16 | fi
17 |     if grep -q "^$f" "$subs_done"; then
18 | # echo "$f: already was getting subs, might have none"
19 | continue
20 | fi
21 | url=$(git annex whereis --in web "$f" | awk '/^ *web:/{print $2;}')
22 | echo "$fbase: getting some for $url"
23 | yt-dlp --write-subs --write-auto-subs -k --sub-lang=en --skip-download -o "$fbase" "$url" && status=ok || status=error
24 | echo -e "$f\t$url\t$(date)\t$status" >> "$subs_done"
25 | done
26 |
--------------------------------------------------------------------------------
/DataLad/Demo__Fully_recomputing_a_real_scientific_paper__DIY_.en.vtt:
--------------------------------------------------------------------------------
1 | WEBVTT
2 | Kind: captions
3 | Language: en
4 |
5 | 00:00:05.750 --> 00:00:16.880 align:start position:0%
6 |
7 | [Music]
8 |
9 | 00:00:16.880 --> 00:00:16.890 align:start position:0%
10 |
11 |
12 |
13 | 00:00:16.890 --> 00:00:37.630 align:start position:0%
14 |
15 | [Music]
16 |
17 | 00:00:37.630 --> 00:00:37.640 align:start position:0%
18 |
19 |
20 |
21 | 00:00:37.640 --> 00:00:51.780 align:start position:0%
22 |
23 | [Music]
24 |
25 | 00:00:51.780 --> 00:00:51.790 align:start position:0%
26 |
27 |
28 |
29 | 00:00:51.790 --> 00:01:12.720 align:start position:0%
30 |
31 | [Music]
32 |
33 | 00:01:12.720 --> 00:01:12.730 align:start position:0%
34 |
35 |
36 |
37 | 00:01:12.730 --> 00:01:16.320 align:start position:0%
38 |
39 | yes
40 |
41 | 00:01:16.320 --> 00:01:16.330 align:start position:0%
42 |
43 |
44 |
45 | 00:01:16.330 --> 00:01:26.700 align:start position:0%
46 |
47 | [Music]
48 |
49 | 00:01:26.700 --> 00:01:26.710 align:start position:0%
50 |
51 |
52 |
53 | 00:01:26.710 --> 00:01:35.819 align:start position:0%
54 |
55 | [Music]
56 |
57 |
--------------------------------------------------------------------------------
/.datalad/status/fetched-subs.log:
--------------------------------------------------------------------------------
1 | DataLad/01_Introduction_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=40ZcGp2vHXk Thu Jun 27 11:45:14 AM KST 2024 ok
2 | DataLad/02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=N7wMaaTAyzE Thu Jun 27 11:45:20 AM KST 2024 ok
3 | DataLad/03_Basics_of_Version_Control_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=IXSE-KtQVBs Thu Jun 27 11:45:27 AM KST 2024 ok
4 | DataLad/04_Version_control_underneath_the_hood__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=lBj5J7aKnPc Thu Jun 27 11:45:32 AM KST 2024 ok
5 | DataLad/05_Drop_and_remove_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=iulQIhPqRzQ Thu Jun 27 11:45:39 AM KST 2024 ok
6 | DataLad/06_Branching__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=8TyMg9SK35U Thu Jun 27 11:45:44 AM KST 2024 ok
7 | DataLad/07_Data_publication__part_1__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=WwSp22zVwV8 Thu Jun 27 11:45:50 AM KST 2024 ok
8 | DataLad/08_Data_publication__part_2__-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=LQ3gmSOT-Io Thu Jun 27 11:45:55 AM KST 2024 ok
9 | DataLad/09_Collaboration_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=AuM6bc7-N6U Thu Jun 27 11:46:01 AM KST 2024 ok
10 | DataLad/10_Preview_of_reproducibility_features_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.m https://www.youtube.com/watch?v=AX3lIw9LQbA Thu Jun 27 11:46:06 AM KST 2024 ok
11 | DataLad/A_hands-on_introduction_to_DataLad.m https://www.youtube.com/watch?v=_I3JFhJJtW0 Thu Jun 27 11:46:12 AM KST 2024 ok
12 | DataLad/DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.m https://www.youtube.com/watch?v=sDP1jhRkKRo Thu Jun 27 11:46:17 AM KST 2024 ok
13 | DataLad/DataLad__-_Decentralized_Management_of_Digital_Objects_for_Open_Science.m https://www.youtube.com/watch?v=pIGFS8XDjco Thu Jun 27 11:46:23 AM KST 2024 ok
14 | DataLad/DataLad_for_Machine_Learning_-_An_Introduction.m https://www.youtube.com/watch?v=oXd1GPf-Zv4 Thu Jun 27 11:46:28 AM KST 2024 ok
15 | DataLad/DataLad_vs_Git_Git-annex_for_modular_data_management.m https://www.youtube.com/watch?v=Yrg6DgOcbPE Thu Jun 27 11:46:38 AM KST 2024 ok
16 | DataLad/Data_versioning_and_transformation_with_DataLad.m https://www.youtube.com/watch?v=wimd1uhIJ8g Thu Jun 27 11:46:43 AM KST 2024 ok
17 | DataLad/Demo__Fully_recomputing_a_real_scientific_paper__DIY_.m https://www.youtube.com/watch?v=nhLqmF58SLQ Thu Jun 27 11:46:49 AM KST 2024 ok
18 | DataLad/FAIRly_big__A_framework_for_computationally_reproducible_processing_of_large_scale_data.m https://www.youtube.com/watch?v=YDtEKUWUPTQ Thu Jun 27 11:46:55 AM KST 2024 ok
19 | DataLad/Follow_the_rabbits__The_2020_OHBM_Brainhack_Traintrack_Session_on_DataLad.m https://www.youtube.com/watch?v=L5A0MXqFrOY Thu Jun 27 11:47:00 AM KST 2024 ok
20 | DataLad/How_to_introduce_data_management_technology_without_sinking_the_ship_.m https://www.youtube.com/watch?v=uH75kYgwLH4 Thu Jun 27 11:47:05 AM KST 2024 ok
21 | DataLad/OHBM_2022_Educational_course__How_to_Write_a_Re-executable_Publication__-_What_is_DataLad_.m https://www.youtube.com/watch?v=s1zrB_sDbDU Thu Jun 27 11:47:10 AM KST 2024 ok
22 | DataLad/OHBM_Poster_presentation__2057__FAIRly_big.m https://www.youtube.com/watch?v=YvZacWgGRZY Thu Jun 27 11:47:15 AM KST 2024 ok
23 | DataLad/OHBM_Poster_presentation__844__DataCat.m https://www.youtube.com/watch?v=4GERwj49KFc Thu Jun 27 11:47:19 AM KST 2024 ok
24 | DataLad/Perpetual_decentralized_management_of_digital_objects_for_collaborative_open_science.m https://www.youtube.com/watch?v=SJ64rSMD9PU Thu Jun 27 11:47:24 AM KST 2024 ok
25 | DataLad/Research_Data_Management_01.m https://www.youtube.com/watch?v=fL3DWzSWFL8 Thu Jun 27 11:47:28 AM KST 2024 ok
26 | DataLad/Research_Data_Management_02.m https://www.youtube.com/watch?v=GrOfE8jv12s Thu Jun 27 11:47:32 AM KST 2024 ok
27 | DataLad/Research_Data_Management_03.m https://www.youtube.com/watch?v=lO4yfl30_uc Thu Jun 27 11:47:37 AM KST 2024 ok
28 | DataLad/Research_Data_Management_04.m https://www.youtube.com/watch?v=3ePgH-kK8h8 Thu Jun 27 11:47:41 AM KST 2024 ok
29 | DataLad/What_is_DataLad_.m https://www.youtube.com/watch?v=IN0vowZ67vs Thu Jun 27 11:47:46 AM KST 2024 ok
30 |
--------------------------------------------------------------------------------
/DataLad/What_is_DataLad_.en.vtt:
--------------------------------------------------------------------------------
1 | WEBVTT
2 | Kind: captions
3 | Language: en
4 |
5 | 00:00:01.920 --> 00:00:12.000
6 | What is DataLad? Everyone needs data! Data are
7 | indispensable for learning, understanding, and
8 |
9 | 00:00:12.000 --> 00:00:19.520
10 | decision making. Working with data responsibly for
11 | the benefit of our communities and the environment
12 |
13 | 00:00:20.160 --> 00:00:26.880
14 | requires us to bring together diverse expertise
15 | and to share findings in a way that fosters trust
16 |
17 | 00:00:26.880 --> 00:00:32.400
18 | in the facts that we base our
19 | actions on. But how can we achieve
20 |
21 | 00:00:32.400 --> 00:00:37.600
22 | this in a world that overwhelms us with
23 | information and limitless possibilities?
24 |
25 | 00:00:39.280 --> 00:00:46.720
26 | Here DataLad can help by recording and documenting
27 | the collaborative process of distilling knowledge
28 |
29 | 00:00:46.720 --> 00:00:55.520
30 | from data, such that it becomes more accessible
31 | and ultimately verifiable. Any investigation
32 |
33 | 00:00:55.520 --> 00:01:03.840
34 | is built on facts. Many of them are digital.
35 | They come in text documents, program code,
36 |
37 | 00:01:04.400 --> 00:01:12.240
38 | and binary files such as images. In our connected
39 | world, files can not only be on our own computers,
40 |
41 | 00:01:12.240 --> 00:01:17.760
42 | but also at many different cloud
43 | services. Wherever data lives,
44 |
45 | 00:01:17.760 --> 00:01:25.440
46 | DataLad can keep track of them. To do this, it
47 | provides a data structure: the dataset. It can
48 |
49 | 00:01:25.440 --> 00:01:33.520
50 | reference the precise identity and availability
51 | of ANY digital object. Importantly, it can record
52 |
53 | 00:01:33.520 --> 00:01:39.840
54 | how exactly data and program code are used
55 | to derive the results of an investigation.
56 |
57 | 00:01:42.640 --> 00:01:50.400
58 | DataLad can clone a dataset to another location
59 | on a different computer. Like the original dataset,
60 |
61 | 00:01:50.400 --> 00:01:56.080
62 | each clone is completely self-contained.
63 | The information in a dataset can be used
64 |
65 | 00:01:56.080 --> 00:02:01.360
66 | to retrieve all data at the precise version
67 | that is needed, either from the internet
68 |
69 | 00:02:01.360 --> 00:02:09.840
70 | or from other clones of a dataset. The process
71 | records in a dataset enable reproducing results
72 |
73 | 00:02:09.840 --> 00:02:15.440
74 | by a collaborator and make it possible to
75 | verify exactly what was done to get them.
76 |
77 | 00:02:17.440 --> 00:02:24.000
78 | Dataset content looks just like any other
79 | directory on a computer. But datasets can
80 |
81 | 00:02:24.000 --> 00:02:30.400
82 | also be nested to form reusable modular units
83 | that can be assembled into bigger projects.
84 |
85 | 00:02:32.240 --> 00:02:38.720
86 | With these basic principles DataLad can
87 | support a diversity of tasks, whether that is
88 |
89 | 00:02:38.720 --> 00:02:44.640
90 | editing a video clip by yourself or collaborative
91 | research on the world's most powerful computers.
92 |
93 | 00:02:46.480 --> 00:02:53.200
94 | Bringing structure to the data flood is no easy
95 | task. But we are a community of people working
96 |
97 | 00:02:53.200 --> 00:03:00.160
98 | hard to make the tools more accessible every day.
99 | Online training resources offer a convenient start
100 |
101 | 00:03:00.160 --> 00:03:07.520
102 | to working with DataLad. A comprehensive handbook
103 | is your guide for a deep dive into all the
104 |
105 | 00:03:07.520 --> 00:03:16.320
106 | features DataLad has to offer. DataLad is free
107 | and open source software. Anyone is free to use it
108 |
109 | 00:03:16.320 --> 00:03:21.920
110 | for any purpose. It has been adapted to
111 | facilitate scientific collaborations,
112 |
113 | 00:03:22.560 --> 00:03:29.160
114 | working with the world's largest health data
115 | sets, and to enable reproducible research. [Music]
116 |
117 | 00:03:31.440 --> 00:03:45.840
118 | Join us to help better
119 | serve even more communities!
120 |
121 |
--------------------------------------------------------------------------------
/DataLad/OHBM_Poster_presentation__844__DataCat.srt:
--------------------------------------------------------------------------------
1 | 1
2 | 00:00:01,800 --> 00:00:04,500
3 | Welcome to the presentation of DataLad Catalog:
4 |
5 | 2
6 | 00:00:05,000 --> 00:00:07,600
7 | a free and open source command line tool, with a Python API,
8 |
9 | 3
10 | 00:00:08,000 --> 00:00:13,500
11 | that lets you create user-friendly, browser-based data catalogs from structured metadata.
12 |
13 | 4
14 | 00:00:14,500 --> 00:00:20,800
15 | The importance and benefits of making research data Findable, Accessible, Interoperable, and Reusable are clear.
16 |
17 | 5
18 | 00:00:21,000 --> 00:00:28,500
19 | But of equal importance are our legal and ethical obligations to protect the personal data privacy of research participants.
20 |
21 | 6
22 | 00:00:29,000 --> 00:00:35,000
23 | So we are struck with this apparent contradiction: how can we share our data openly…yet keep it secure and protected?
24 |
25 | 7
26 | 00:00:35,200 --> 00:00:40,000
27 | Should we err on the side of FAIRness, or of data privacy? And do we even have to choose?
28 |
29 | 8
30 | 00:00:41,000 --> 00:00:44,000
31 | Ideally, no. And in practice, also no,
32 |
33 | 9
34 | 00:00:44,500 --> 00:00:50,500
35 | because we have a powerful opportunity in the form of linked, structured, and machine-readable metadata.
36 |
37 | 10
38 | 00:00:51,000 --> 00:00:57,400
39 | Metadata provides not only high-level information about our research data, such as study and data acquisition parameters,
40 |
41 | 11
42 | 00:00:57,800 --> 00:01:02,700
43 | but also the descriptive aspects of each file in the dataset, such as file paths, sizes, and formats.
44 |
45 | 12
46 | 00:01:03,000 --> 00:01:09,500
47 | With this metadata, we can create an abstract representation of the full dataset that is separate from the actual data content.
48 |
49 | 13
50 | 00:01:10,000 --> 00:01:15,000
51 | This means that the content can be stored securely, while we openly share the metadata to make our work more FAIR.
52 |
53 | 14
54 | 00:01:16,000 --> 00:01:21,400
55 | As an added benefit, structured and machine-readable metadata that conforms to industry standards
56 |
57 | 15
58 | 00:01:21,500 --> 00:01:27,000
59 | improves the interoperability and allows the use of automated pipelines and tools.
60 |
61 | 16
62 | 00:01:30,500 --> 00:01:36,500
63 | These ideals are achievable in practice, with a toolset that includes the distributed data management system DataLad,
64 |
65 | 17
66 | 00:01:36,800 --> 00:01:40,500
67 | and its extensions for metadata handling and catalog generation.
68 |
69 | 18
70 | 00:01:41,000 --> 00:01:47,800
71 | DataLad can be used for decentralised management of data as lightweight, portable and extensible representations.
72 |
73 | 19
74 | 00:01:48,000 --> 00:01:54,500
75 | Datalad-metalad can extract structured high- and low-level metadata and associate it with these datasets or with individual files.
76 |
77 | 20
78 | 00:01:55,300 --> 00:02:00,800
79 | And at the end of the workflow, Datalad-catalog can turn the structured metadata into a user-friendly data browser!
80 |
81 | 21
82 | 00:02:02,200 --> 00:02:04,800
83 | So how does the catalog generation process work?
84 |
85 | 22
86 | 00:02:05,000 --> 00:02:08,500
87 | Metadata extracted from various sources, even custom sources,
88 |
89 | 23
90 | 00:02:08,700 --> 00:02:10,800
91 | can be aggregated and added to a catalog.
92 |
93 | 24
94 | 00:02:11,000 --> 00:02:13,200
95 | Incoming metadata will first be validated
96 |
97 | 25
98 | 00:02:13,300 --> 00:02:14,800
99 | against a catalog-specific schema,
100 |
101 | 26
102 | 00:02:15,000 --> 00:02:18,500
103 | before the catalog is generated or individual entries are added.
104 |
105 | 27
106 | 00:02:19,000 --> 00:02:20,500
107 | Once the process is finished,
108 |
109 | 28
110 | 00:02:20,800 --> 00:02:23,300
111 | the output is a set of structured metadata files,
112 |
113 | 29
114 | 00:02:23,500 --> 00:02:26,000
115 | as well as a Vue.js-based browser interface
116 |
117 | 30
118 | 00:02:26,400 --> 00:02:28,800
119 | that understands how to render this metadata.
120 |
121 | 31
122 | 00:02:29,900 --> 00:02:31,100
123 | What is left for the user
124 |
125 | 32
126 | 00:02:31,300 --> 00:02:33,500
127 | is to host this content on their platform of choice
128 |
129 | 33
130 | 00:02:33,600 --> 00:02:35,300
131 | and serve it for the world to see.
132 |
133 | 34
134 | 00:02:36,300 --> 00:02:42,800
135 | Datalad catalog brings the powerful functionality of decentralised metadata handling and data publishing into the hands of users,
136 |
137 | 35
138 | 00:02:43,000 --> 00:02:49,000
139 | preventing dependence on centralised infrastructure and keeping private data secure while adhering to FAIR principles.
140 |
141 | 36
142 | 00:02:49,600 --> 00:02:53,000
143 | Please explore the demo catalog, walk through the interactive tutorial
144 |
145 | 37
146 | 00:02:53,100 --> 00:02:58,000
147 | or visit the codebase to start using or contributing to datalad catalog.
148 |
--------------------------------------------------------------------------------
/DataLad/OHBM_Poster_presentation__844__DataCat.en.vtt:
--------------------------------------------------------------------------------
1 | WEBVTT
2 | Kind: captions
3 | Language: en
4 |
5 | 00:00:02.080 --> 00:00:03.750 align:start position:0%
6 |
7 | welcome<00:00:02.399> to<00:00:02.560> the<00:00:02.639> presentation<00:00:03.199> of<00:00:03.280> datalad
8 |
9 | 00:00:03.750 --> 00:00:03.760 align:start position:0%
10 | welcome to the presentation of datalad
11 |
12 |
13 | 00:00:03.760 --> 00:00:05.829 align:start position:0%
14 | welcome to the presentation of datalad
15 | catalog<00:00:04.560> a<00:00:04.720> free<00:00:04.960> and<00:00:05.120> open<00:00:05.279> source<00:00:05.520> command
16 |
17 | 00:00:05.829 --> 00:00:05.839 align:start position:0%
18 | catalog a free and open source command
19 |
20 |
21 | 00:00:05.839 --> 00:00:07.990 align:start position:0%
22 | catalog a free and open source command
23 | line<00:00:06.080> tool<00:00:06.240> with<00:00:06.399> a<00:00:06.480> python<00:00:06.879> api<00:00:07.520> that<00:00:07.759> lets
24 |
25 | 00:00:07.990 --> 00:00:08.000 align:start position:0%
26 | line tool with a python api that lets
27 |
28 |
29 | 00:00:08.000 --> 00:00:10.070 align:start position:0%
30 | line tool with a python api that lets
31 | you<00:00:08.160> create<00:00:08.559> user-friendly<00:00:09.360> browser-based
32 |
33 | 00:00:10.070 --> 00:00:10.080 align:start position:0%
34 | you create user-friendly browser-based
35 |
36 |
37 | 00:00:10.080 --> 00:00:15.030 align:start position:0%
38 | you create user-friendly browser-based
39 | data<00:00:10.400> catalogs<00:00:10.960> from<00:00:11.120> structured<00:00:11.759> metadata
40 |
41 | 00:00:15.030 --> 00:00:15.040 align:start position:0%
42 |
43 |
44 |
45 | 00:00:15.040 --> 00:00:16.550 align:start position:0%
46 |
47 | the<00:00:15.200> importance<00:00:15.679> and<00:00:15.679> benefits<00:00:16.160> of<00:00:16.240> making
48 |
49 | 00:00:16.550 --> 00:00:16.560 align:start position:0%
50 | the importance and benefits of making
51 |
52 |
53 | 00:00:16.560 --> 00:00:18.710 align:start position:0%
54 | the importance and benefits of making
55 | research<00:00:16.960> data<00:00:17.279> findable<00:00:17.920> accessible
56 |
57 | 00:00:18.710 --> 00:00:18.720 align:start position:0%
58 | research data findable accessible
59 |
60 |
61 | 00:00:18.720 --> 00:00:21.189 align:start position:0%
62 | research data findable accessible
63 | interoperable<00:00:19.359> and<00:00:19.520> reusable<00:00:20.080> are<00:00:20.240> clear<00:00:21.039> but
64 |
65 | 00:00:21.189 --> 00:00:21.199 align:start position:0%
66 | interoperable and reusable are clear but
67 |
68 |
69 | 00:00:21.199 --> 00:00:23.029 align:start position:0%
70 | interoperable and reusable are clear but
71 | of<00:00:21.439> equal<00:00:21.760> importance
72 |
73 | 00:00:23.029 --> 00:00:23.039 align:start position:0%
74 | of equal importance
75 |
76 |
77 | 00:00:23.039 --> 00:00:25.269 align:start position:0%
78 | of equal importance
79 | are<00:00:23.279> our<00:00:23.600> legal<00:00:24.080> and<00:00:24.240> ethical<00:00:24.640> obligations<00:00:25.199> to
80 |
81 | 00:00:25.269 --> 00:00:25.279 align:start position:0%
82 | are our legal and ethical obligations to
83 |
84 |
85 | 00:00:25.279 --> 00:00:27.189 align:start position:0%
86 | are our legal and ethical obligations to
87 | protect<00:00:25.840> the<00:00:26.000> personal<00:00:26.320> data<00:00:26.640> privacy<00:00:27.039> of
88 |
89 | 00:00:27.189 --> 00:00:27.199 align:start position:0%
90 | protect the personal data privacy of
91 |
92 |
93 | 00:00:27.199 --> 00:00:28.870 align:start position:0%
94 | protect the personal data privacy of
95 | research<00:00:27.519> participants
96 |
97 | 00:00:28.870 --> 00:00:28.880 align:start position:0%
98 | research participants
99 |
100 |
101 | 00:00:28.880 --> 00:00:30.310 align:start position:0%
102 | research participants
103 | so<00:00:29.039> we<00:00:29.199> are<00:00:29.359> struck<00:00:29.599> with<00:00:29.760> this<00:00:29.920> apparent
104 |
105 | 00:00:30.310 --> 00:00:30.320 align:start position:0%
106 | so we are struck with this apparent
107 |
108 |
109 | 00:00:30.320 --> 00:00:32.709 align:start position:0%
110 | so we are struck with this apparent
111 | contradiction<00:00:31.199> how<00:00:31.439> can<00:00:31.599> we<00:00:31.840> share<00:00:32.160> our<00:00:32.320> data
112 |
113 | 00:00:32.709 --> 00:00:32.719 align:start position:0%
114 | contradiction how can we share our data
115 |
116 |
117 | 00:00:32.719 --> 00:00:35.270 align:start position:0%
118 | contradiction how can we share our data
119 | openly<00:00:33.280> it<00:00:33.440> keep<00:00:33.680> it<00:00:33.760> secure<00:00:34.239> and<00:00:34.320> protected
120 |
121 | 00:00:35.270 --> 00:00:35.280 align:start position:0%
122 | openly it keep it secure and protected
123 |
124 |
125 | 00:00:35.280 --> 00:00:37.270 align:start position:0%
126 | openly it keep it secure and protected
127 | should<00:00:35.520> we<00:00:35.680> err<00:00:35.920> on<00:00:36.000> the<00:00:36.079> side<00:00:36.320> of<00:00:36.480> fairness<00:00:37.040> or
128 |
129 | 00:00:37.270 --> 00:00:37.280 align:start position:0%
130 | should we err on the side of fairness or
131 |
132 |
133 | 00:00:37.280 --> 00:00:39.590 align:start position:0%
134 | should we err on the side of fairness or
135 | of<00:00:37.440> data<00:00:37.760> privacy<00:00:38.480> or<00:00:38.640> do<00:00:38.879> we<00:00:39.040> even<00:00:39.280> have<00:00:39.440> to
136 |
137 | 00:00:39.590 --> 00:00:39.600 align:start position:0%
138 | of data privacy or do we even have to
139 |
140 |
141 | 00:00:39.600 --> 00:00:41.030 align:start position:0%
142 | of data privacy or do we even have to
143 | choose
144 |
145 | 00:00:41.030 --> 00:00:41.040 align:start position:0%
146 | choose
147 |
148 |
149 | 00:00:41.040 --> 00:00:42.630 align:start position:0%
150 | choose
151 | ideally<00:00:41.840> no
152 |
153 | 00:00:42.630 --> 00:00:42.640 align:start position:0%
154 | ideally no
155 |
156 |
157 | 00:00:42.640 --> 00:00:45.110 align:start position:0%
158 | ideally no
159 | and<00:00:42.800> in<00:00:42.879> practice<00:00:43.600> also<00:00:44.000> no<00:00:44.559> because<00:00:44.879> we<00:00:45.039> have
160 |
161 | 00:00:45.110 --> 00:00:45.120 align:start position:0%
162 | and in practice also no because we have
163 |
164 |
165 | 00:00:45.120 --> 00:00:47.029 align:start position:0%
166 | and in practice also no because we have
167 | a<00:00:45.200> powerful<00:00:45.600> opportunity<00:00:46.239> in<00:00:46.320> the<00:00:46.399> form<00:00:46.640> of
168 |
169 | 00:00:47.029 --> 00:00:47.039 align:start position:0%
170 | a powerful opportunity in the form of
171 |
172 |
173 | 00:00:47.039 --> 00:00:49.670 align:start position:0%
174 | a powerful opportunity in the form of
175 | linked<00:00:47.840> structured<00:00:48.640> and<00:00:48.800> machine-readable
176 |
177 | 00:00:49.670 --> 00:00:49.680 align:start position:0%
178 | linked structured and machine-readable
179 |
180 |
181 | 00:00:49.680 --> 00:00:51.110 align:start position:0%
182 | linked structured and machine-readable
183 | metadata
184 |
185 | 00:00:51.110 --> 00:00:51.120 align:start position:0%
186 | metadata
187 |
188 |
189 | 00:00:51.120 --> 00:00:53.110 align:start position:0%
190 | metadata
191 | metadata<00:00:51.840> provides<00:00:52.160> not<00:00:52.399> only<00:00:52.640> high-level
192 |
193 | 00:00:53.110 --> 00:00:53.120 align:start position:0%
194 | metadata provides not only high-level
195 |
196 |
197 | 00:00:53.120 --> 00:00:54.950 align:start position:0%
198 | metadata provides not only high-level
199 | information<00:00:53.600> about<00:00:53.840> our<00:00:53.920> research<00:00:54.399> data<00:00:54.719> such
200 |
201 | 00:00:54.950 --> 00:00:54.960 align:start position:0%
202 | information about our research data such
203 |
204 |
205 | 00:00:54.960 --> 00:00:57.350 align:start position:0%
206 | information about our research data such
207 | as<00:00:55.120> study<00:00:55.440> and<00:00:55.600> data<00:00:55.840> acquisition<00:00:56.320> parameters
208 |
209 | 00:00:57.350 --> 00:00:57.360 align:start position:0%
210 | as study and data acquisition parameters
211 |
212 |
213 | 00:00:57.360 --> 00:00:59.590 align:start position:0%
214 | as study and data acquisition parameters
215 | but<00:00:57.520> also<00:00:57.840> the<00:00:57.920> descriptive<00:00:58.480> aspects<00:00:58.960> of<00:00:59.280> each
216 |
217 | 00:00:59.590 --> 00:00:59.600 align:start position:0%
218 | but also the descriptive aspects of each
219 |
220 |
221 | 00:00:59.600 --> 00:01:01.590 align:start position:0%
222 | but also the descriptive aspects of each
223 | file<00:00:59.840> in<00:01:00.000> the<00:01:00.079> data<00:01:00.320> set<00:01:00.480> such<00:01:00.800> as<00:01:00.879> file<00:01:01.199> paths
224 |
225 | 00:01:01.590 --> 00:01:01.600 align:start position:0%
226 | file in the data set such as file paths
227 |
228 |
229 | 00:01:01.600 --> 00:01:03.189 align:start position:0%
230 | file in the data set such as file paths
231 | sizes<00:01:02.079> and<00:01:02.160> formats
232 |
233 | 00:01:03.189 --> 00:01:03.199 align:start position:0%
234 | sizes and formats
235 |
236 |
237 | 00:01:03.199 --> 00:01:04.789 align:start position:0%
238 | sizes and formats
239 | with<00:01:03.359> this<00:01:03.600> metadata<00:01:04.159> we<00:01:04.320> can<00:01:04.400> create<00:01:04.640> an
240 |
241 | 00:01:04.789 --> 00:01:04.799 align:start position:0%
242 | with this metadata we can create an
243 |
244 |
245 | 00:01:04.799 --> 00:01:06.550 align:start position:0%
246 | with this metadata we can create an
247 | abstract<00:01:05.199> representation<00:01:05.840> of<00:01:06.000> the<00:01:06.080> full<00:01:06.240> data
248 |
249 | 00:01:06.550 --> 00:01:06.560 align:start position:0%
250 | abstract representation of the full data
251 |
252 |
253 | 00:01:06.560 --> 00:01:08.310 align:start position:0%
254 | abstract representation of the full data
255 | set<00:01:06.960> that<00:01:07.200> is<00:01:07.360> separate<00:01:07.680> from<00:01:07.840> the<00:01:08.000> actual
256 |
257 | 00:01:08.310 --> 00:01:08.320 align:start position:0%
258 | set that is separate from the actual
259 |
260 |
261 | 00:01:08.320 --> 00:01:09.750 align:start position:0%
262 | set that is separate from the actual
263 | data<00:01:08.720> content
264 |
265 | 00:01:09.750 --> 00:01:09.760 align:start position:0%
266 | data content
267 |
268 |
269 | 00:01:09.760 --> 00:01:11.109 align:start position:0%
270 | data content
271 | this<00:01:10.000> means<00:01:10.240> that<00:01:10.320> the<00:01:10.479> content<00:01:10.799> can<00:01:10.960> be
272 |
273 | 00:01:11.109 --> 00:01:11.119 align:start position:0%
274 | this means that the content can be
275 |
276 |
277 | 00:01:11.119 --> 00:01:12.870 align:start position:0%
278 | this means that the content can be
279 | stored<00:01:11.439> securely<00:01:12.080> while<00:01:12.240> we<00:01:12.400> openly<00:01:12.720> share
280 |
281 | 00:01:12.870 --> 00:01:12.880 align:start position:0%
282 | stored securely while we openly share
283 |
284 |
285 | 00:01:12.880 --> 00:01:16.149 align:start position:0%
286 | stored securely while we openly share
287 | the<00:01:13.040> metadata<00:01:13.600> to<00:01:13.680> make<00:01:13.920> our<00:01:14.080> work<00:01:14.320> more<00:01:14.640> fair
288 |
289 | 00:01:16.149 --> 00:01:16.159 align:start position:0%
290 | the metadata to make our work more fair
291 |
292 |
293 | 00:01:16.159 --> 00:01:18.310 align:start position:0%
294 | the metadata to make our work more fair
295 | as<00:01:16.320> an<00:01:16.479> added<00:01:16.640> benefit<00:01:17.600> structured<00:01:18.159> and
296 |
297 | 00:01:18.310 --> 00:01:18.320 align:start position:0%
298 | as an added benefit structured and
299 |
300 |
301 | 00:01:18.320 --> 00:01:20.310 align:start position:0%
302 | as an added benefit structured and
303 | machine<00:01:18.640> readable<00:01:19.200> metadata<00:01:19.759> that<00:01:19.920> conforms
304 |
305 | 00:01:20.310 --> 00:01:20.320 align:start position:0%
306 | machine readable metadata that conforms
307 |
308 |
309 | 00:01:20.320 --> 00:01:22.469 align:start position:0%
310 | machine readable metadata that conforms
311 | to<00:01:20.479> industry<00:01:20.960> standards<00:01:21.920> improves<00:01:22.320> the
312 |
313 | 00:01:22.469 --> 00:01:22.479 align:start position:0%
314 | to industry standards improves the
315 |
316 |
317 | 00:01:22.479 --> 00:01:24.469 align:start position:0%
318 | to industry standards improves the
319 | interoperability<00:01:23.439> and<00:01:23.600> allows<00:01:24.000> the<00:01:24.080> use<00:01:24.320> of
320 |
321 | 00:01:24.469 --> 00:01:24.479 align:start position:0%
322 | interoperability and allows the use of
323 |
324 |
325 | 00:01:24.479 --> 00:01:30.710 align:start position:0%
326 | interoperability and allows the use of
327 | automated<00:01:24.960> pipelines<00:01:25.680> and<00:01:25.920> tools
328 |
329 | 00:01:30.710 --> 00:01:30.720 align:start position:0%
330 |
331 |
332 |
333 | 00:01:30.720 --> 00:01:32.710 align:start position:0%
334 |
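The "abstract representation separate from the actual data content" idea can be sketched in plain Python: walk a dataset directory and record only descriptive properties (path, size, format) per file. This is an illustrative sketch with made-up example filenames, not the actual record format that DataLad's metadata extractors produce.

```python
import os
import tempfile

def describe_dataset(root):
    """Build a content-free representation of a dataset:
    one record per file with path, size, and format only.
    The file contents themselves are never copied or exposed."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            records.append({
                "path": os.path.relpath(full, root),
                "bytesize": os.path.getsize(full),
                "format": os.path.splitext(name)[1].lstrip(".") or "unknown",
            })
    return records

# Demo on a throwaway directory with two hypothetical files.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "sub-01_T1w.nii"), "wb") as f:
        f.write(b"\x00" * 16)
    with open(os.path.join(root, "participants.tsv"), "w") as f:
        f.write("participant_id\nsub-01\n")
    meta = describe_dataset(root)
    print(meta)
```

The resulting records can be shared openly: they describe the layout of the dataset without revealing any file content, which is exactly what makes the metadata safe to publish while the data stays private.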
These ideals are achievable in practice with a toolset that includes the distributed data management system DataLad and its extensions for metadata handling and catalog generation. DataLad can be used for decentralized management of data as lightweight, portable, and extensible representations. DataLad MetaLad can extract structured high- and low-level metadata and associate it with datasets or with individual files. And at the end of the workflow, DataLad Catalog can turn the structured metadata into a user-friendly data browser.
So how does this catalog generation process work? Well, metadata extracted from various sources, even custom sources, can be aggregated and added to a catalog. Incoming metadata will first be validated against a catalog-specific schema before the catalog is generated or individual entries are added. Once the process is finished, the output is a set of structured metadata files as well as a Vue.js-based browser interface that understands how to render this metadata. What is left for the user is to host this content on their platform of choice and serve it for the world to see.
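The validate-before-add step can be illustrated with a minimal hand-rolled check. The real catalog validates entries against a JSON schema; the required fields used here are only an assumption for the example, and `add_entry` is a hypothetical helper, not part of the DataLad Catalog API.

```python
# Hypothetical required fields for a catalog entry (assumption for
# illustration; the actual catalog schema defines its own fields).
REQUIRED = {"type": str, "dataset_id": str, "dataset_version": str, "name": str}

def validate_entry(entry):
    """Return a list of problems; an empty list means the entry is valid."""
    problems = []
    for field, expected in REQUIRED.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected):
            problems.append(f"wrong type for {field}")
    return problems

catalog = []

def add_entry(entry):
    """Add an entry to the catalog only if it passes validation."""
    problems = validate_entry(entry)
    if problems:
        raise ValueError("; ".join(problems))
    catalog.append(entry)

# A complete entry is accepted...
add_entry({
    "type": "dataset",
    "dataset_id": "abc-123",
    "dataset_version": "v1.0",
    "name": "example study",
})

# ...while an incomplete one is rejected before it reaches the catalog.
try:
    add_entry({"type": "dataset", "name": "incomplete"})
except ValueError as err:
    print(err)
```

Rejecting malformed metadata up front is what lets the downstream browser interface render every entry without per-entry error handling.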
DataLad Catalog brings the powerful functionality of decentralized metadata handling and data publishing into the hands of users, preventing dependence on centralized infrastructure and keeping private data secure while adhering to FAIR principles. Please explore the demo catalog, walk through the interactive tutorial, or visit the codebase to start using or contributing to DataLad Catalog. Thank you.
--------------------------------------------------------------------------------
/DataLad/OHBM_Poster_presentation__2057__FAIRly_big.en.vtt:
--------------------------------------------------------------------------------
WEBVTT
Kind: captions
Language: en

[Music]

Once upon a time, there was an institute with access to a variety of large neuroscientific datasets. Dozens of researchers depended on pre-processed versions of these datasets for their projects. The central, coordinated pre-processing efforts, however, suffered from a lack of transparency and reproducibility. There was pre-processed data, but over time the knowledge of who created it, how it was created, or where it was stored was lost. With this lack of transparency, reuse was difficult and ceased. But when every research group pre-processed the data individually, it resulted not only in unsustainable duplicate computing efforts but also filled up the disk space of the compute cluster in no time. And when datasets became too big to be computed even once, things had to change.
191 | my<00:00:51.440> name<00:00:51.600> is<00:00:51.760> adina<00:00:52.239> and<00:00:52.320> my<00:00:52.480> colleagues<00:00:52.879> and<00:00:52.960> i
192 |
193 | 00:00:53.189 --> 00:00:53.199 align:start position:0%
194 | my name is adina and my colleagues and i
195 |
196 |
197 | 00:00:53.199 --> 00:00:54.790 align:start position:0%
198 | my name is adina and my colleagues and i
199 | created<00:00:53.520> a<00:00:53.600> framework<00:00:54.000> to<00:00:54.160> not<00:00:54.399> only<00:00:54.559> make
200 |
201 | 00:00:54.790 --> 00:00:54.800 align:start position:0%
202 | created a framework to not only make
203 |
204 |
205 | 00:00:54.800 --> 00:00:56.470 align:start position:0%
206 | created a framework to not only make
207 | processing<00:00:55.280> of<00:00:55.440> large<00:00:55.680> scale<00:00:55.920> data<00:00:56.239> sets
208 |
209 | 00:00:56.470 --> 00:00:56.480 align:start position:0%
210 | processing of large scale data sets
211 |
212 |
213 | 00:00:56.480 --> 00:00:58.630 align:start position:0%
214 | processing of large scale data sets
215 | possible<00:00:57.199> but<00:00:57.360> its<00:00:57.520> outcomes<00:00:58.000> also<00:00:58.239> easily
216 |
217 | 00:00:58.630 --> 00:00:58.640 align:start position:0%
218 | possible but its outcomes also easily
219 |
220 |
221 | 00:00:58.640 --> 00:01:00.549 align:start position:0%
222 | possible but its outcomes also easily
223 | shareable<00:00:59.120> transparent<00:00:59.760> and<00:00:59.920> automatically
224 |
225 | 00:01:00.549 --> 00:01:00.559 align:start position:0%
226 | shareable transparent and automatically
227 |
228 |
229 | 00:01:00.559 --> 00:01:01.830 align:start position:0%
230 | shareable transparent and automatically
231 | recomputable
232 |
233 | 00:01:01.830 --> 00:01:01.840 align:start position:0%
234 | recomputable
235 |
236 |
237 | 00:01:01.840 --> 00:01:03.590 align:start position:0%
238 | recomputable
we<00:01:02.000> start<00:01:02.239> with<00:01:02.399> a<00:01:02.559> DataLad<00:01:02.960> dataset<00:01:03.359> on<00:01:03.520> a
240 |
241 | 00:01:03.590 --> 00:01:03.600 align:start position:0%
we start with a DataLad dataset on a
243 |
244 |
245 | 00:01:03.600 --> 00:01:05.990 align:start position:0%
we start with a DataLad dataset on a
247 | computational<00:01:04.239> cluster<00:01:05.119> datasets<00:01:05.680> are<00:01:05.760> based
248 |
249 | 00:01:05.990 --> 00:01:06.000 align:start position:0%
250 | computational cluster datasets are based
251 |
252 |
253 | 00:01:06.000 --> 00:01:07.590 align:start position:0%
254 | computational cluster datasets are based
on<00:01:06.080> Git<00:01:06.240> repositories<00:01:06.960> but<00:01:07.119> can<00:01:07.280> version
256 |
257 | 00:01:07.590 --> 00:01:07.600 align:start position:0%
on Git repositories but can version
259 |
260 |
261 | 00:01:07.600 --> 00:01:09.670 align:start position:0%
on Git repositories but can version
263 | control<00:01:08.000> digital<00:01:08.320> files<00:01:08.720> of<00:01:08.880> any<00:01:09.040> size<00:01:09.439> such
264 |
265 | 00:01:09.670 --> 00:01:09.680 align:start position:0%
266 | control digital files of any size such
267 |
268 |
269 | 00:01:09.680 --> 00:01:11.510 align:start position:0%
270 | control digital files of any size such
as<00:01:09.760> the<00:01:09.920> UK<00:01:10.159> Biobank<00:01:10.640> data<00:01:11.119> which<00:01:11.360> we
272 |
273 | 00:01:11.510 --> 00:01:11.520 align:start position:0%
as the UK Biobank data which we
275 |
276 |
277 | 00:01:11.520 --> 00:01:14.149 align:start position:0%
as the UK Biobank data which we
retrieved<00:01:12.000> using<00:01:12.240> datalad-ukbiobank<00:01:13.920> as<00:01:14.080> you
280 |
281 | 00:01:14.149 --> 00:01:14.159 align:start position:0%
retrieved using datalad-ukbiobank as you
283 |
284 |
285 | 00:01:14.159 --> 00:01:16.469 align:start position:0%
retrieved using datalad-ukbiobank as you
287 | can<00:01:14.400> see<00:01:14.560> here<00:01:15.200> datasets<00:01:15.759> can<00:01:15.920> contain<00:01:16.320> other
288 |
289 | 00:01:16.469 --> 00:01:16.479 align:start position:0%
290 | can see here datasets can contain other
291 |
292 |
293 | 00:01:16.479 --> 00:01:18.469 align:start position:0%
294 | can see here datasets can contain other
295 | datasets<00:01:17.200> this<00:01:17.439> is<00:01:17.520> useful<00:01:17.920> to<00:01:18.080> structure
296 |
297 | 00:01:18.469 --> 00:01:18.479 align:start position:0%
298 | datasets this is useful to structure
299 |
300 |
301 | 00:01:18.479 --> 00:01:20.469 align:start position:0%
302 | datasets this is useful to structure
303 | large<00:01:18.720> datasets<00:01:19.200> into<00:01:19.439> smaller<00:01:19.759> units<00:01:20.240> but
304 |
305 | 00:01:20.469 --> 00:01:20.479 align:start position:0%
306 | large datasets into smaller units but
307 |
308 |
309 | 00:01:20.479 --> 00:01:22.710 align:start position:0%
310 | large datasets into smaller units but
311 | also<00:01:20.880> to<00:01:21.040> link<00:01:21.280> datasets<00:01:21.759> as<00:01:21.920> dependencies<00:01:22.560> to
312 |
313 | 00:01:22.710 --> 00:01:22.720 align:start position:0%
314 | also to link datasets as dependencies to
315 |
316 |
317 | 00:01:22.720 --> 00:01:23.830 align:start position:0%
318 | also to link datasets as dependencies to
319 | one<00:01:22.880> another
320 |
321 | 00:01:23.830 --> 00:01:23.840 align:start position:0%
322 | one another
323 |
324 |
325 | 00:01:23.840 --> 00:01:26.230 align:start position:0%
326 | one another
we<00:01:24.000> use<00:01:24.159> one<00:01:24.320> dataset<00:01:24.799> to<00:01:24.960> link<00:01:25.200> UKB<00:01:25.600> data<00:01:26.080> and
328 |
329 | 00:01:26.230 --> 00:01:26.240 align:start position:0%
we use one dataset to link UKB data and
331 |
332 |
333 | 00:01:26.240 --> 00:01:27.990 align:start position:0%
we use one dataset to link UKB data and
335 | a<00:01:26.320> software<00:01:26.640> dataset<00:01:27.119> with<00:01:27.280> a<00:01:27.360> computational
336 |
337 | 00:01:27.990 --> 00:01:28.000 align:start position:0%
338 | a software dataset with a computational
339 |
340 |
341 | 00:01:28.000 --> 00:01:29.429 align:start position:0%
342 | a software dataset with a computational
343 | pipeline<00:01:28.479> in<00:01:28.560> the<00:01:28.640> form<00:01:28.880> of<00:01:29.040> a<00:01:29.119> software
344 |
345 | 00:01:29.429 --> 00:01:29.439 align:start position:0%
346 | pipeline in the form of a software
347 |
348 |
349 | 00:01:29.439 --> 00:01:31.830 align:start position:0%
350 | pipeline in the form of a software
351 | container<00:01:29.920> as<00:01:30.079> analysis<00:01:30.560> dependencies
352 |
353 | 00:01:31.830 --> 00:01:31.840 align:start position:0%
354 | container as analysis dependencies
355 |
356 |
357 | 00:01:31.840 --> 00:01:33.830 align:start position:0%
358 | container as analysis dependencies
359 | datasets<00:01:32.400> can<00:01:32.560> drop<00:01:32.880> and<00:01:32.960> re-retrieve<00:01:33.600> file
360 |
361 | 00:01:33.830 --> 00:01:33.840 align:start position:0%
362 | datasets can drop and re-retrieve file
363 |
364 |
365 | 00:01:33.840 --> 00:01:36.149 align:start position:0%
366 | datasets can drop and re-retrieve file
367 | content<00:01:34.159> that<00:01:34.320> is<00:01:34.400> hosted<00:01:34.799> elsewhere<00:01:35.680> despite
368 |
369 | 00:01:36.149 --> 00:01:36.159 align:start position:0%
370 | content that is hosted elsewhere despite
371 |
372 |
373 | 00:01:36.159 --> 00:01:38.469 align:start position:0%
374 | content that is hosted elsewhere despite
375 | tracking<00:01:36.560> terabytes<00:01:37.119> of<00:01:37.200> data<00:01:37.759> datasets<00:01:38.320> can
376 |
377 | 00:01:38.469 --> 00:01:38.479 align:start position:0%
378 | tracking terabytes of data datasets can
379 |
380 |
381 | 00:01:38.479 --> 00:01:40.870 align:start position:0%
382 | tracking terabytes of data datasets can
383 | thus<00:01:38.720> be<00:01:38.960> tiny<00:01:39.280> in<00:01:39.439> size<00:01:40.159> this<00:01:40.400> feature<00:01:40.799> is
384 |
385 | 00:01:40.870 --> 00:01:40.880 align:start position:0%
386 | thus be tiny in size this feature is
387 |
388 |
389 | 00:01:40.880 --> 00:01:42.469 align:start position:0%
390 | thus be tiny in size this feature is
391 | used<00:01:41.200> to<00:01:41.360> create<00:01:41.600> hundreds<00:01:42.000> of<00:01:42.159> single
392 |
393 | 00:01:42.469 --> 00:01:42.479 align:start position:0%
394 | used to create hundreds of single
395 |
396 |
397 | 00:01:42.479 --> 00:01:44.710 align:start position:0%
398 | used to create hundreds of single
subject<00:01:42.880> analyses<00:01:43.600> that<00:01:43.840> only<00:01:44.159> retrieve<00:01:44.560> the
400 |
401 | 00:01:44.710 --> 00:01:44.720 align:start position:0%
subject analyses that only retrieve the
403 |
404 |
405 | 00:01:44.720 --> 00:01:46.149 align:start position:0%
subject analyses that only retrieve the
407 | files<00:01:45.040> they<00:01:45.200> need
408 |
409 | 00:01:46.149 --> 00:01:46.159 align:start position:0%
410 | files they need
411 |
412 |
413 | 00:01:46.159 --> 00:01:48.550 align:start position:0%
414 | files they need
415 | at<00:01:46.399> analysis<00:01:46.880> execution<00:01:47.680> a<00:01:47.840> job<00:01:48.079> scheduler
416 |
417 | 00:01:48.550 --> 00:01:48.560 align:start position:0%
418 | at analysis execution a job scheduler
419 |
420 |
421 | 00:01:48.560 --> 00:01:50.469 align:start position:0%
422 | at analysis execution a job scheduler
423 | distributes<00:01:49.119> the<00:01:49.200> analysis<00:01:49.759> over<00:01:50.000> available
424 |
425 | 00:01:50.469 --> 00:01:50.479 align:start position:0%
426 | distributes the analysis over available
427 |
428 |
429 | 00:01:50.479 --> 00:01:52.710 align:start position:0%
430 | distributes the analysis over available
431 | compute<00:01:50.880> nodes<00:01:51.280> each<00:01:51.520> compute<00:01:51.920> node<00:01:52.399> clones
432 |
433 | 00:01:52.710 --> 00:01:52.720 align:start position:0%
434 | compute nodes each compute node clones
435 |
436 |
437 | 00:01:52.720 --> 00:01:54.230 align:start position:0%
438 | compute nodes each compute node clones
439 | the<00:01:52.799> topmost<00:01:53.280> dataset<00:01:53.759> to<00:01:53.840> create<00:01:54.159> a
440 |
441 | 00:01:54.230 --> 00:01:54.240 align:start position:0%
442 | the topmost dataset to create a
443 |
444 |
445 | 00:01:54.240 --> 00:01:56.469 align:start position:0%
446 | the topmost dataset to create a
447 | short-lived<00:01:54.799> ephemeral<00:01:55.280> clone<00:01:55.840> resulting<00:01:56.320> in
448 |
449 | 00:01:56.469 --> 00:01:56.479 align:start position:0%
450 | short-lived ephemeral clone resulting in
451 |
452 |
453 | 00:01:56.479 --> 00:01:58.789 align:start position:0%
454 | short-lived ephemeral clone resulting in
455 | a<00:01:56.560> network<00:01:56.960> of<00:01:57.040> temporary<00:01:57.600> dataset<00:01:58.000> copies
456 |
457 | 00:01:58.789 --> 00:01:58.799 align:start position:0%
458 | a network of temporary dataset copies
459 |
460 |
461 | 00:01:58.799 --> 00:02:01.030 align:start position:0%
462 | a network of temporary dataset copies
463 | each<00:01:59.040> ephemeral<00:01:59.600> clone<00:02:00.159> is<00:02:00.320> tasked<00:02:00.640> with<00:02:00.799> one
464 |
465 | 00:02:01.030 --> 00:02:01.040 align:start position:0%
466 | each ephemeral clone is tasked with one
467 |
468 |
469 | 00:02:01.040 --> 00:02:03.749 align:start position:0%
470 | each ephemeral clone is tasked with one
471 | subset<00:02:01.439> of<00:02:01.600> analysis<00:02:02.079> execution<00:02:03.119> this<00:02:03.360> job<00:02:03.680> is
472 |
473 | 00:02:03.749 --> 00:02:03.759 align:start position:0%
474 | subset of analysis execution this job is
475 |
476 |
477 | 00:02:03.759 --> 00:02:05.590 align:start position:0%
478 | subset of analysis execution this job is
performed<00:02:04.240> with<00:02:04.399> a<00:02:04.479> datalad<00:02:04.719> containers-run<00:02:05.360>
480 |
481 | 00:02:05.590 --> 00:02:05.600 align:start position:0%
performed with a datalad containers-run
483 |
484 |
485 | 00:02:05.600 --> 00:02:07.670 align:start position:0%
performed with a datalad containers-run
487 | call<00:02:06.240> its<00:02:06.479> advantage<00:02:06.960> is<00:02:07.119> that<00:02:07.280> the<00:02:07.439> full
488 |
489 | 00:02:07.670 --> 00:02:07.680 align:start position:0%
490 | call its advantage is that the full
491 |
492 |
493 | 00:02:07.680 --> 00:02:09.589 align:start position:0%
494 | call its advantage is that the full
495 | digital<00:02:08.160> analysis<00:02:08.640> provenance<00:02:09.119> is<00:02:09.200> captured
496 |
497 | 00:02:09.589 --> 00:02:09.599 align:start position:0%
498 | digital analysis provenance is captured
499 |
500 |
501 | 00:02:09.599 --> 00:02:11.270 align:start position:0%
502 | digital analysis provenance is captured
503 | in<00:02:09.759> a<00:02:09.840> structured<00:02:10.239> record<00:02:10.560> that<00:02:10.720> can<00:02:10.879> be<00:02:11.039> used
504 |
505 | 00:02:11.270 --> 00:02:11.280 align:start position:0%
506 | in a structured record that can be used
507 |
508 |
509 | 00:02:11.280 --> 00:02:13.190 align:start position:0%
510 | in a structured record that can be used
511 | for<00:02:11.360> automatic<00:02:11.840> re-execution
512 |
513 | 00:02:13.190 --> 00:02:13.200 align:start position:0%
514 | for automatic re-execution
515 |
516 |
517 | 00:02:13.200 --> 00:02:14.710 align:start position:0%
518 | for automatic re-execution
519 | provenance<00:02:13.680> and<00:02:13.760> computed<00:02:14.239> results<00:02:14.640> are
520 |
521 | 00:02:14.710 --> 00:02:14.720 align:start position:0%
522 | provenance and computed results are
523 |
524 |
525 | 00:02:14.720 --> 00:02:17.190 align:start position:0%
526 | provenance and computed results are
527 | saved<00:02:15.280> pushed<00:02:16.080> and<00:02:16.400> when<00:02:16.640> all<00:02:16.800> jobs<00:02:17.040> are
528 |
529 | 00:02:17.190 --> 00:02:17.200 align:start position:0%
530 | saved pushed and when all jobs are
531 |
532 |
533 | 00:02:17.200 --> 00:02:19.350 align:start position:0%
534 | saved pushed and when all jobs are
535 | finished<00:02:17.840> merged<00:02:18.319> back<00:02:18.560> into<00:02:18.800> the<00:02:18.959> central
536 |
537 | 00:02:19.350 --> 00:02:19.360 align:start position:0%
538 | finished merged back into the central
539 |
540 |
541 | 00:02:19.360 --> 00:02:20.470 align:start position:0%
542 | finished merged back into the central
543 | dataset
544 |
545 | 00:02:20.470 --> 00:02:20.480 align:start position:0%
546 | dataset
547 |
548 |
549 | 00:02:20.480 --> 00:02:21.990 align:start position:0%
550 | dataset
551 | throughout<00:02:20.879> the<00:02:21.040> process<00:02:21.440> a<00:02:21.599> special
552 |
553 | 00:02:21.990 --> 00:02:22.000 align:start position:0%
554 | throughout the process a special
555 |
556 |
557 | 00:02:22.000 --> 00:02:23.670 align:start position:0%
558 | throughout the process a special
559 | internal<00:02:22.480> data<00:02:22.720> set<00:02:22.959> representation
560 |
561 | 00:02:23.670 --> 00:02:23.680 align:start position:0%
562 | internal data set representation
563 |
564 |
565 | 00:02:23.680 --> 00:02:25.990 align:start position:0%
566 | internal data set representation
567 | minimizes<00:02:24.239> disk<00:02:24.560> space<00:02:24.879> and<00:02:24.959> inode<00:02:25.360> usage<00:02:25.840> and
568 |
569 | 00:02:25.990 --> 00:02:26.000 align:start position:0%
570 | minimizes disk space and inode usage and
571 |
572 |
573 | 00:02:26.000 --> 00:02:27.510 align:start position:0%
574 | minimizes disk space and inode usage and
575 | provides<00:02:26.400> optional<00:02:26.800> encryption<00:02:27.200> during
576 |
577 | 00:02:27.510 --> 00:02:27.520 align:start position:0%
578 | provides optional encryption during
579 |
580 |
581 | 00:02:27.520 --> 00:02:29.830 align:start position:0%
582 | provides optional encryption during
583 | transport<00:02:28.480> this<00:02:28.720> allows<00:02:29.040> processing<00:02:29.440> of<00:02:29.599> data
584 |
585 | 00:02:29.830 --> 00:02:29.840 align:start position:0%
586 | transport this allows processing of data
587 |
588 |
589 | 00:02:29.840 --> 00:02:31.509 align:start position:0%
590 | transport this allows processing of data
591 | sets<00:02:30.080> that<00:02:30.239> are<00:02:30.319> larger<00:02:30.720> than<00:02:30.879> the<00:02:31.040> available
592 |
593 | 00:02:31.509 --> 00:02:31.519 align:start position:0%
594 | sets that are larger than the available
595 |
596 |
597 | 00:02:31.519 --> 00:02:33.270 align:start position:0%
598 | sets that are larger than the available
599 | resources<00:02:32.080> with<00:02:32.319> only<00:02:32.560> minimal<00:02:32.879> software
600 |
601 | 00:02:33.270 --> 00:02:33.280 align:start position:0%
602 | resources with only minimal software
603 |
604 |
605 | 00:02:33.280 --> 00:02:35.270 align:start position:0%
606 | resources with only minimal software
607 | requirements<00:02:33.840> on<00:02:34.000> the<00:02:34.080> server<00:02:34.400> side
608 |
609 | 00:02:35.270 --> 00:02:35.280 align:start position:0%
610 | requirements on the server side
611 |
612 |
613 | 00:02:35.280 --> 00:02:37.270 align:start position:0%
614 | requirements on the server side
615 | because<00:02:35.519> data<00:02:35.920> sets<00:02:36.239> can<00:02:36.480> be<00:02:36.640> easily<00:02:37.040> shared
616 |
617 | 00:02:37.270 --> 00:02:37.280 align:start position:0%
618 | because data sets can be easily shared
619 |
620 |
621 | 00:02:37.280 --> 00:02:39.190 align:start position:0%
622 | because data sets can be easily shared
623 | with<00:02:37.440> appropriate<00:02:38.000> audiences<00:02:38.560> the<00:02:38.720> resulting
624 |
625 | 00:02:39.190 --> 00:02:39.200 align:start position:0%
626 | with appropriate audiences the resulting
627 |
628 |
629 | 00:02:39.200 --> 00:02:40.790 align:start position:0%
630 | with appropriate audiences the resulting
631 | data<00:02:39.440> sets<00:02:39.680> can<00:02:39.840> be<00:02:40.000> distributed<00:02:40.640> in<00:02:40.720> a
632 |
633 | 00:02:40.790 --> 00:02:40.800 align:start position:0%
634 | data sets can be distributed in a
635 |
636 |
637 | 00:02:40.800 --> 00:02:42.869 align:start position:0%
638 | data sets can be distributed in a
639 | streamlined<00:02:41.360> transparent<00:02:42.080> and<00:02:42.239> reusable
640 |
641 | 00:02:42.869 --> 00:02:42.879 align:start position:0%
642 | streamlined transparent and reusable
643 |
644 |
645 | 00:02:42.879 --> 00:02:45.430 align:start position:0%
646 | streamlined transparent and reusable
647 | manner<00:02:43.519> as<00:02:43.840> jobs<00:02:44.080> were<00:02:44.239> computed<00:02:44.720> in<00:02:44.879> isolated
648 |
649 | 00:02:45.430 --> 00:02:45.440 align:start position:0%
650 | manner as jobs were computed in isolated
651 |
652 |
653 | 00:02:45.440 --> 00:02:46.869 align:start position:0%
654 | manner as jobs were computed in isolated
655 | compute<00:02:45.760> environments<00:02:46.560> they<00:02:46.800> are
656 |
657 | 00:02:46.869 --> 00:02:46.879 align:start position:0%
658 | compute environments they are
659 |
660 |
661 | 00:02:46.879 --> 00:02:48.550 align:start position:0%
662 | compute environments they are
663 | automatically<00:02:47.519> portable<00:02:48.080> to<00:02:48.319> other
664 |
665 | 00:02:48.550 --> 00:02:48.560 align:start position:0%
666 | automatically portable to other
667 |
668 |
669 | 00:02:48.560 --> 00:02:50.550 align:start position:0%
670 | automatically portable to other
infrastructure<00:02:49.519> as<00:02:49.680> long<00:02:49.840> as<00:02:50.000> DataLad<00:02:50.400> and
672 |
673 | 00:02:50.550 --> 00:02:50.560 align:start position:0%
infrastructure as long as DataLad and
675 |
676 |
677 | 00:02:50.560 --> 00:02:52.150 align:start position:0%
infrastructure as long as DataLad and
679 | the<00:02:50.640> employed<00:02:51.040> container<00:02:51.440> technology<00:02:52.000> are
680 |
681 | 00:02:52.150 --> 00:02:52.160 align:start position:0%
682 | the employed container technology are
683 |
684 |
685 | 00:02:52.160 --> 00:02:54.470 align:start position:0%
686 | the employed container technology are
687 | available<00:02:52.959> whoever<00:02:53.360> obtains<00:02:53.680> this<00:02:53.840> data<00:02:54.160> set
688 |
689 | 00:02:54.470 --> 00:02:54.480 align:start position:0%
690 | available whoever obtains this data set
691 |
692 |
693 | 00:02:54.480 --> 00:02:56.630 align:start position:0%
694 | available whoever obtains this data set
695 | can<00:02:54.640> thus<00:02:54.879> recompute<00:02:55.440> each<00:02:55.599> individual<00:02:56.239> job
696 |
697 | 00:02:56.630 --> 00:02:56.640 align:start position:0%
698 | can thus recompute each individual job
699 |
700 |
701 | 00:02:56.640 --> 00:03:00.680 align:start position:0%
702 | can thus recompute each individual job
703 | on<00:02:56.720> their<00:02:56.959> own<00:02:57.120> computer<00:02:57.680> automatically
704 |
705 |
--------------------------------------------------------------------------------
/DataLad/02_JuypterHub_overview_-_DataLad_0.16_Workshop_-_UKE_Hamburg__virtual_.en.vtt:
--------------------------------------------------------------------------------
1 | WEBVTT
2 | Kind: captions
3 | Language: en
4 |
5 | 00:00:00.480 --> 00:00:02.629 align:start position:0%
6 |
7 | all<00:00:00.640> right<00:00:01.280> one<00:00:01.520> of<00:00:01.599> the
8 |
9 | 00:00:02.629 --> 00:00:02.639 align:start position:0%
10 | all right one of the
11 |
12 |
13 | 00:00:02.639 --> 00:00:04.870 align:start position:0%
14 | all right one of the
15 | things<00:00:02.960> we've<00:00:03.439> prepared<00:00:03.760> for<00:00:03.919> today<00:00:04.319> is
16 |
17 | 00:00:04.870 --> 00:00:04.880 align:start position:0%
18 | things we've prepared for today is
19 |
20 |
21 | 00:00:04.880 --> 00:00:06.230 align:start position:0%
22 | things we've prepared for today is
JupyterHub<00:00:05.359>
24 |
25 | 00:00:06.230 --> 00:00:06.240 align:start position:0%
JupyterHub
27 |
28 |
29 | 00:00:06.240 --> 00:00:09.190 align:start position:0%
JupyterHub
31 | so<00:00:06.480> it<00:00:06.640> is<00:00:06.879> one<00:00:07.200> of<00:00:07.440> the<00:00:08.000> ways<00:00:08.400> you<00:00:08.720> will<00:00:09.040> be
32 |
33 | 00:00:09.190 --> 00:00:09.200 align:start position:0%
34 | so it is one of the ways you will be
35 |
36 |
37 | 00:00:09.200 --> 00:00:11.110 align:start position:0%
38 | so it is one of the ways you will be
39 | able<00:00:09.679> to
40 |
41 | 00:00:11.110 --> 00:00:11.120 align:start position:0%
42 | able to
43 |
44 |
45 | 00:00:11.120 --> 00:00:14.150 align:start position:0%
46 | able to
47 | run<00:00:11.280> the<00:00:11.599> examples<00:00:12.639> uh<00:00:13.120> all<00:00:13.200> the<00:00:13.599> all<00:00:13.759> the<00:00:13.920> code
48 |
49 | 00:00:14.150 --> 00:00:14.160 align:start position:0%
50 | run the examples uh all the all the code
51 |
52 |
53 | 00:00:14.160 --> 00:00:17.029 align:start position:0%
54 | run the examples uh all the all the code
55 | examples<00:00:14.639> called<00:00:15.040> exercises<00:00:15.839> yourself
56 |
57 | 00:00:17.029 --> 00:00:17.039 align:start position:0%
58 | examples called exercises yourself
59 |
60 |
61 | 00:00:17.039 --> 00:00:19.109 align:start position:0%
62 | examples called exercises yourself
63 | you<00:00:17.119> can<00:00:17.359> of<00:00:17.440> course<00:00:17.680> use<00:00:17.920> your<00:00:18.320> own<00:00:18.480> computer
64 |
65 | 00:00:19.109 --> 00:00:19.119 align:start position:0%
66 | you can of course use your own computer
67 |
68 |
69 | 00:00:19.119 --> 00:00:21.429 align:start position:0%
70 | you can of course use your own computer
71 | but<00:00:19.279> you<00:00:19.439> can<00:00:19.680> also<00:00:20.000> use<00:00:20.400> this
72 |
73 | 00:00:21.429 --> 00:00:21.439 align:start position:0%
74 | but you can also use this
75 |
76 |
77 | 00:00:21.439 --> 00:00:23.910 align:start position:0%
78 | but you can also use this
79 | shared<00:00:21.760> resource<00:00:22.720> so<00:00:23.039> the<00:00:23.279> url<00:00:23.760> is
80 |
81 | 00:00:23.910 --> 00:00:23.920 align:start position:0%
82 | shared resource so the url is
83 |
84 |
85 | 00:00:23.920 --> 00:00:27.990 align:start position:0%
86 | shared resource so the url is
87 | data-hub.inm7.de
88 |
89 | 00:00:27.990 --> 00:00:28.000 align:start position:0%
90 |
91 |
92 |
93 | 00:00:28.000 --> 00:00:30.710 align:start position:0%
94 |
95 | you<00:00:28.240> should<00:00:28.480> have<00:00:28.640> received<00:00:29.119> your<00:00:30.000> usernames
96 |
97 | 00:00:30.710 --> 00:00:30.720 align:start position:0%
98 | you should have received your usernames
99 |
100 |
101 | 00:00:30.720 --> 00:00:32.950 align:start position:0%
102 | you should have received your usernames
103 | in<00:00:30.880> your<00:00:31.119> emails<00:00:31.599> they're<00:00:32.000> usually<00:00:32.559> your
104 |
105 | 00:00:32.950 --> 00:00:32.960 align:start position:0%
106 | in your emails they're usually your
107 |
108 |
109 | 00:00:32.960 --> 00:00:35.190 align:start position:0%
110 | in your emails they're usually your
111 | first<00:00:33.760> usually<00:00:34.079> but<00:00:34.239> not<00:00:34.480> always<00:00:34.800> they<00:00:35.040> are
112 |
113 | 00:00:35.190 --> 00:00:35.200 align:start position:0%
114 | first usually but not always they are
115 |
116 |
117 | 00:00:35.200 --> 00:00:36.709 align:start position:0%
118 | first usually but not always they are
119 | your<00:00:35.440> first<00:00:35.760> names
120 |
121 | 00:00:36.709 --> 00:00:36.719 align:start position:0%
122 | your first names
123 |
124 |
125 | 00:00:36.719 --> 00:00:39.590 align:start position:0%
126 | your first names
127 | so<00:00:37.120> if<00:00:37.360> you<00:00:37.920> if<00:00:38.079> you<00:00:38.559> if<00:00:38.719> you<00:00:39.040> if<00:00:39.200> you're<00:00:39.440> going
128 |
129 | 00:00:39.590 --> 00:00:39.600 align:start position:0%
130 | so if you if you if you if you're going
131 |
132 |
133 | 00:00:39.600 --> 00:00:42.069 align:start position:0%
134 | so if you if you if you if you're going
to<00:00:39.760> use<00:00:39.920> the<00:00:40.000> JupyterHub<00:00:40.640> and<00:00:40.879> you<00:00:41.280> don't
136 |
137 | 00:00:42.069 --> 00:00:42.079 align:start position:0%
to use the JupyterHub and you don't
139 |
140 |
141 | 00:00:42.079 --> 00:00:43.350 align:start position:0%
to use the JupyterHub and you don't
143 | don't<00:00:42.320> have<00:00:42.480> the
144 |
145 | 00:00:43.350 --> 00:00:43.360 align:start position:0%
146 | don't have the
147 |
148 |
149 | 00:00:43.360 --> 00:00:45.670 align:start position:0%
150 | don't have the
151 | username<00:00:44.000> then<00:00:44.399> please<00:00:44.960> then<00:00:45.120> please<00:00:45.360> let<00:00:45.600> us
152 |
153 | 00:00:45.670 --> 00:00:45.680 align:start position:0%
154 | username then please then please let us
155 |
156 |
157 | 00:00:45.680 --> 00:00:46.869 align:start position:0%
158 | username then please then please let us
159 | know
160 |
161 | 00:00:46.869 --> 00:00:46.879 align:start position:0%
162 | know
163 |
164 |
165 | 00:00:46.879 --> 00:00:49.990 align:start position:0%
166 | know
167 | the<00:00:47.039> password<00:00:47.680> is<00:00:48.079> whatever<00:00:48.719> you<00:00:49.039> use<00:00:49.440> for<00:00:49.760> the
168 |
169 | 00:00:49.990 --> 00:00:50.000 align:start position:0%
170 | the password is whatever you use for the
171 |
172 |
173 | 00:00:50.000 --> 00:00:51.270 align:start position:0%
174 | the password is whatever you use for the
175 | first<00:00:50.320> access
176 |
177 | 00:00:51.270 --> 00:00:51.280 align:start position:0%
178 | first access
179 |
180 |
181 | 00:00:51.280 --> 00:00:53.350 align:start position:0%
182 | first access
183 | so<00:00:51.680> whatever<00:00:52.079> you<00:00:52.239> type<00:00:52.480> for<00:00:52.640> the<00:00:52.800> first<00:00:53.039> time
184 |
185 | 00:00:53.350 --> 00:00:53.360 align:start position:0%
186 | so whatever you type for the first time
187 |
188 |
189 | 00:00:53.360 --> 00:00:56.310 align:start position:0%
190 | so whatever you type for the first time
191 | will<00:00:53.600> become<00:00:54.239> your<00:00:54.719> password<00:00:55.199> to<00:00:55.360> the<00:00:55.520> hub<00:00:56.160> for
192 |
193 | 00:00:56.310 --> 00:00:56.320 align:start position:0%
194 | will become your password to the hub for
195 |
196 |
197 | 00:00:56.320 --> 00:00:58.709 align:start position:0%
198 | will become your password to the hub for
199 | the<00:00:56.480> next<00:00:56.800> two<00:00:56.960> days<00:00:57.600> so<00:00:57.760> whatever<00:00:58.239> you<00:00:58.399> choose
200 |
201 | 00:00:58.709 --> 00:00:58.719 align:start position:0%
202 | the next two days so whatever you choose
203 |
204 |
205 | 00:00:58.719 --> 00:00:59.910 align:start position:0%
206 | the next two days so whatever you choose
207 | make<00:00:58.879> sure<00:00:59.199> you
208 |
209 | 00:00:59.910 --> 00:00:59.920 align:start position:0%
210 | make sure you
211 |
212 |
213 | 00:00:59.920 --> 00:01:02.229 align:start position:0%
214 | make sure you
215 | save<00:01:00.239> it
216 |
217 | 00:01:02.229 --> 00:01:02.239 align:start position:0%
218 | save it
219 |
220 |
221 | 00:01:02.239 --> 00:01:04.710 align:start position:0%
222 | save it
223 | and<00:01:02.480> let<00:01:02.640> me<00:01:02.800> give<00:01:02.960> you<00:01:03.120> a<00:01:03.359> quick<00:01:03.760> tour<00:01:04.159> of<00:01:04.400> the
224 |
225 | 00:01:04.710 --> 00:01:04.720 align:start position:0%
226 | and let me give you a quick tour of the
227 |
228 |
229 | 00:01:04.720 --> 00:01:07.830 align:start position:0%
230 | and let me give you a quick tour of the
231 | interface<00:01:05.199> that<00:01:05.439> we<00:01:05.680> will<00:01:05.840> be<00:01:06.080> using
232 |
233 | 00:01:07.830 --> 00:01:07.840 align:start position:0%
234 | interface that we will be using
235 |
236 |
237 | 00:01:07.840 --> 00:01:10.149 align:start position:0%
238 | interface that we will be using
239 | this<00:01:08.080> is<00:01:08.240> what<00:01:08.479> you<00:01:08.799> what<00:01:09.040> you<00:01:09.200> will<00:01:09.439> see<00:01:10.000> the
240 |
241 | 00:01:10.149 --> 00:01:10.159 align:start position:0%
242 | this is what you what you will see the
243 |
244 |
245 | 00:01:10.159 --> 00:01:12.710 align:start position:0%
246 | this is what you what you will see the
00:01:10.400 --> 00:01:12.710
first time you log in, and there are

00:01:12.720 --> 00:01:15.190
basically two parts: one will be called

00:01:15.200 --> 00:01:17.749
the Launcher, and the other will be

00:01:17.759 --> 00:01:21.109
a side panel, and also a short menu on the top.

00:01:21.119 --> 00:01:25.830
So first of all, there are some settings you can adjust for how things look,

00:01:25.840 --> 00:01:31.030
for example by choosing between a light or a dark theme.

00:01:31.040 --> 00:01:36.149
Here, also in the settings, you have buttons like increase or decrease font size,

00:01:36.159 --> 00:01:39.670
and they are separate for code, for content, for UI,

00:01:39.680 --> 00:01:43.350
and also for the terminal. Out here,

00:01:43.360 --> 00:01:51.990
I have increased mine, and I hope they are visible on Zoom as well.

00:01:52.000 --> 00:01:56.069
And here you can also change how things look, so you can, for

00:01:56.079 --> 00:02:00.870
example, slide the side panel to make it smaller or bigger.

00:02:00.880 --> 00:02:08.469
JupyterLab is something that has been made especially for notebooks.

00:02:08.479 --> 00:02:10.309
You may have heard of Jupyter notebooks;

00:02:10.319 --> 00:02:18.710
these are environments to combine code and markdown and outputs

00:02:18.720 --> 00:02:21.670
in one place. We won't be using notebooks today;

00:02:21.680 --> 00:02:27.350
instead, we will be using JupyterLab for the terminal it

00:02:27.360 --> 00:02:31.990
provides. So here in the Launcher, in the "Other" section,

00:02:32.000 --> 00:02:35.830
you have a "Terminal" entry that opens a terminal:

00:02:35.840 --> 00:02:42.390
that is, a regular Unix terminal, a regular bash

00:02:42.400 --> 00:02:48.630
that runs on the server in the cloud, and you have your own

00:02:48.640 --> 00:02:53.110
user account for the duration of the workshop.

00:02:53.120 --> 00:02:58.790
So the terminal works in a way that I

00:02:58.800 --> 00:03:03.430
type a command and I get a response. So, for example,

00:03:03.440 --> 00:03:09.990
"whoami" will print my username here.

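As an aside from the transcript: the command/response loop described here can be tried in any POSIX shell, such as the one the JupyterLab terminal provides. This is a minimal sketch using only standard utilities; nothing in it is specific to the workshop setup.

```shell
# The terminal loop demonstrated in the tour: type a command, get a response.
# `whoami` prints the name of the user running the shell.
whoami

# A few other harmless commands for exploring the environment:
pwd            # print the current working directory
echo "$HOME"   # print the path of the home directory
```
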
00:03:10.000 --> 00:03:14.309
We will be using the terminal to run the DataLad commands.

00:03:14.319 --> 00:03:19.190
Many of these commands will create files, and the files that are created

00:03:19.200 --> 00:03:24.229
will also be visible to you in this file browser here.

00:03:24.239 --> 00:03:28.949
So for now I don't have many files, but with time

00:03:28.959 --> 00:03:31.750
they will be populated. I can make

00:03:31.760 --> 00:03:38.309
directories, either from the terminal here...

00:03:38.319 --> 00:03:41.990
and... it appeared here. I can go into it by

00:03:42.000 --> 00:03:44.229
double clicking, and I can go back,

00:03:44.239 --> 00:03:47.750
but I can also use these buttons here

00:03:47.760 --> 00:03:54.229
that will create new folders, like this. And I can also

00:03:54.239 --> 00:03:57.429
create a new file;

00:03:57.439 --> 00:04:02.789
I can name it anything I want,

00:04:02.799 --> 00:04:07.270
and I can open it with an editor.

00:04:07.280 --> 00:04:12.550
So here I'll get an editor tab

00:04:12.560 --> 00:04:19.670
where I can write things, and I have the buttons: File, Save Python File...

00:04:19.680 --> 00:04:24.550
I can close things, I can open them from the Launcher,

00:04:24.560 --> 00:04:30.870
or I can open them by double clicking in the file browser window.

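As an aside from the transcript: the point-and-click file operations demonstrated here (creating a folder, entering it, creating a file) have direct terminal equivalents. This is a hedged sketch with made-up example names (`demo-dir`, `notes.txt`), not the presenter's actual commands.

```shell
# Create a directory, as the "new folder" button does
mkdir -p demo-dir

# Go into it and back out, as double-clicking and going back do
cd demo-dir
cd ..

# Create an empty file with an arbitrary name, as the "new file" button does
touch demo-dir/notes.txt

# List the contents, as the file browser would display them
ls demo-dir
```
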
00:04:30.880 --> 00:04:36.550
So we will be looking at the file browser, we will be working mostly with the terminal,

00:04:36.560 --> 00:04:45.110
and we will be using the built-in editor to edit files.

00:04:45.120 --> 00:04:51.110
You have some panes here that show other Jupyter things,

00:04:51.120 --> 00:04:56.710
but we'll mostly be looking at the file browser. And you can also

00:04:56.720 --> 00:05:04.870
click on this button here to make the side panel appear or disappear.

00:05:04.880 --> 00:05:11.430
And I think that that's the shortest possible tour of the

00:05:11.440 --> 00:05:14.800
interface... will be

--------------------------------------------------------------------------------
/DataLad/DataLad_-_Decentralized_Distribution_and_Sharing_of_Scientific_Datasets.en.vtt:
--------------------------------------------------------------------------------
1 | WEBVTT
2 | Kind: captions
3 | Language: en
4 |
5 | 00:00:02.920 --> 00:00:05.121
6 | My name is Yaroslav Halchenko
7 |
8 | 00:00:05.121 --> 00:00:09.750
9 | and I am talking to you from Dartmouth about the DataLad project.
10 |
11 | 00:00:10.299 --> 00:00:14.099
12 | This project was initiated by me and Michael Hanke from Germany
13 |
14 | 00:00:14.099 --> 00:00:18.141
15 | and we have had a few successful years of collaboration.
16 |
17 | 00:00:18.141 --> 00:00:22.000
18 | Before that you might know us
19 |
20 | 00:00:22.480 --> 00:00:24.599
21 | because of our other projects such as PyMVPA and NeuroDebian.
22 |
23 | 00:00:25.140 --> 00:00:26.901
24 | I hope that you use them
25 |
26 | 00:00:26.901 --> 00:00:30.129
27 | and they help you in your research projects.
28 |
29 | 00:00:30.129 --> 00:00:32.660
30 | More about these and other projects
31 |
32 | 00:00:32.660 --> 00:00:36.530
33 | You could discover if you go to the centerforopenneuroscience.org website,
34 |
35 | 00:00:36.530 --> 00:00:37.680
36 | or you could also find
37 |
38 | 00:00:37.860 --> 00:00:43.100
39 | contacts for us on social media. And before I proceed with the talk,
40 |
41 | 00:00:43.110 --> 00:00:47.272
42 | I want first of all to acknowledge the work of others on the project.
43 |
44 | 00:00:47.272 --> 00:00:49.650
45 | It wasn't only my and Michael's work
46 |
47 | 00:00:51.580 --> 00:00:54.201
48 | Our project is heavily based on the git-annex tool,
49 |
50 | 00:00:54.201 --> 00:00:57.659
51 | which Joey Hess wrote for managing his own collection of files
52 |
53 | 00:00:58.060 --> 00:01:00.269
54 | which has nothing to do with science.
55 |
56 | 00:01:01.240 --> 00:01:04.229
57 | Also, he is well known for his work in the Debian project.
58 |
59 | 00:01:04.600 --> 00:01:09.390
60 | We had... we still have tireless workers on the project:
61 |
62 | 00:01:09.909 --> 00:01:11.020
63 | Benjamin
64 |
65 | 00:01:11.020 --> 00:01:14.140
66 | working with Michael and Alex.
67 |
68 | 00:01:14.140 --> 00:01:16.920
69 | Alex recently refurbished or wrote from scratch a new version of the website
70 |
71 | 00:01:16.920 --> 00:01:20.249
72 | I hope that you'll like it and we'll see a bit more of it later.
73 |
74 | 00:01:21.490 --> 00:01:25.439
75 | Also, we had Jason, Debanjum and Gergana working on the project.
76 |
77 | 00:01:26.469 --> 00:01:30.299
78 | They were quite successful, accomplishing a lot within a short period of time.
79 |
80 | 00:01:31.119 --> 00:01:33.260
81 | So if you're looking for a project to contribute to
82 |
83 | 00:01:33.260 --> 00:01:37.199
84 | it might be an interesting project for you to start
85 |
86 | 00:01:37.740 --> 00:01:39.600
87 | working on open source projects
88 |
89 | 00:01:39.600 --> 00:01:42.200
90 | and leave, in a way, your own footprint in the
91 |
92 | 00:01:42.260 --> 00:01:46.160
93 | ecosystem of Open Source for Neuroscience.
94 |
95 | 00:01:46.160 --> 00:01:49.920
96 | This project is supported by NSF and
97 |
98 | 00:01:50.100 --> 00:01:53.960
99 | the Federal Ministry for Education and Research in Germany.
100 |
101 | 00:01:54.400 --> 00:02:00.160
102 | If you go to centerforopenneuroscience.org you could discover more
103 |
104 | 00:02:00.380 --> 00:02:04.500
105 | interesting and exciting projects we either collaborate with or contribute to.
106 |
107 | 00:02:06.369 --> 00:02:12.239
108 | Before we proceed, I want actually to formulate the problem we are trying to solve with DataLad.
109 |
110 | 00:02:12.970 --> 00:02:16.506
111 | Data is a second-class citizen within software platforms.
112 |
113 | 00:02:16.506 --> 00:02:18.499
114 | What could that potentially be?
115 |
116 | 00:02:20.310 --> 00:02:25.009
117 | One of the aspects is, if you look at how people distribute data nowadays,
118 |
119 | 00:02:25.710 --> 00:02:32.239
120 | Quite often you find that even large arrays of data are distributed in tarballs or zip files.
121 |
122 | 00:02:34.110 --> 00:02:37.459
123 | There are multiple problems with these ways of distribution:
124 |
125 | 00:02:37.459 --> 00:02:40.249
126 | if one file changes, you need to re-distribute
127 |
128 | 00:02:40.820 --> 00:02:42.540
129 | the entire tarball, which might be gigabytes in size,
130 |
131 | 00:02:42.540 --> 00:02:48.120
132 | and that's partially why we also couldn't just adopt
133 |
134 | 00:02:48.720 --> 00:02:50.130
135 | technologies which are
136 |
137 | 00:02:50.130 --> 00:02:54.840
138 | proven to work for software, let's say in Debian we distribute complete packages.
139 |
140 | 00:02:54.980 --> 00:02:56.420
141 | But again the problem is the same.
142 |
143 | 00:02:56.660 --> 00:02:59.160
144 | As long as you force
145 |
146 | 00:02:59.360 --> 00:03:04.220
147 | wrapping all the data together in some big file, it won't work. It won't scale.
148 |
149 | 00:03:06.060 --> 00:03:09.023
150 | Also, another problem is the absence of data versioning.
151 |
152 | 00:03:09.023 --> 00:03:15.060
153 | And many people actually underappreciate it and think that it doesn't actually exist
154 |
155 | 00:03:15.060 --> 00:03:21.780
156 | or relate to their way of working. But no, actually this problem is quite generic.
157 |
158 | 00:03:22.920 --> 00:03:24.920
159 | So if you look into this
160 |
161 | 00:03:25.620 --> 00:03:30.860
162 | PhD Comics caricature, you'll find that this probably relates to many
163 |
164 | 00:03:32.549 --> 00:03:35.209
165 | ways how you deal with files, data or
166 |
167 | 00:03:35.880 --> 00:03:39.079
168 | documents. And you could see that actually
169 |
170 | 00:03:39.930 --> 00:03:47.600
171 | how we tend to version our data is by providing - quite often - the date, right, which creates some kind of linear progression.
172 |
173 | 00:03:48.239 --> 00:03:50.539
174 | Right, so we annotate that: "Oh!"
175 |
176 | 00:03:51.540 --> 00:03:56.209
177 | "I've worked on these in those dates, but also maybe a little bit later..."
178 |
179 | 00:03:56.209 --> 00:04:01.129
180 | And we try to annotate it with some description of what was maybe done to the data
181 |
182 | 00:04:01.380 --> 00:04:04.399
183 | Right, so in this case it was a test run.
184 |
185 | 00:04:04.400 --> 00:04:09.890
186 | Then we test it again and calibrate it and then we ran into a problem, right? So...
187 |
188 | 00:04:10.470 --> 00:04:16.519
189 | In all of these, kind of, you saved the result of your work and annotated it so later on you could
190 |
191 | 00:04:17.010 --> 00:04:23.779
192 | either get back to the previous state. Let's say maybe you indeed made an error and you want to rollback.
193 |
194 | 00:04:24.419 --> 00:04:29.939
195 | Or maybe you want to just compare what you have done that broke your code or data.
196 |
197 | 00:04:30.700 --> 00:04:34.770
198 | Right, and as you could see those messages could be quite descriptive.
199 |
200 | 00:04:35.830 --> 00:04:43.679
201 | But the problem is that version control systems which are created for code are inadequate for data, right? So the problem is,
202 |
203 | 00:04:44.200 --> 00:04:51.029
204 | quite often, that it's duplication: you have a copy of the data inside the version control system somewhere,
205 |
206 | 00:04:51.030 --> 00:04:52.450
207 | so you can't use it directly.
208 |
209 | 00:04:52.450 --> 00:04:57.440
210 | But also it's present on your hard drive, so at least you have two copies quite often.
211 |
212 | 00:04:57.600 --> 00:05:02.520
213 | Or maybe it's duplicated and just on a single server, right?
214 |
215 | 00:05:03.010 --> 00:05:09.330
216 | I could give you examples where data in a version control system filled up the version control system,
217 |
218 | 00:05:09.940 --> 00:05:12.480
219 | meanwhile filling up the hard drive and
220 |
221 | 00:05:13.180 --> 00:05:19.770
222 | sometimes you try to commit a new file and apparently run out of space on the server, and it might ruin your
223 |
224 | 00:05:20.110 --> 00:05:22.319
225 | version control backend on the server,
226 |
227 | 00:05:23.020 --> 00:05:28.409
228 | rendering it impossible to get to the previous version, so you don't want to have that, right?
229 |
230 | 00:05:29.830 --> 00:05:37.770
231 | Then another problem is that there are no generic data distributions or at least there were no before DataLad.
232 |
233 | 00:05:38.110 --> 00:05:40.410
234 | So there are no efficient ways to
235 |
236 | 00:05:41.170 --> 00:05:43.259
237 | install and upgrade data sets and
238 |
239 | 00:05:44.740 --> 00:05:52.109
240 | When you also deal with different data hosting portals you need to learn how to navigate them
241 |
242 | 00:05:52.110 --> 00:05:52.470
243 | All right
244 |
245 | 00:05:52.470 --> 00:05:53.830
246 | you need to learn how you
247 |
248 | 00:05:53.830 --> 00:06:00.149
249 | authenticate, which page you need to go to and what to download and how to download it?
250 |
251 | 00:06:00.340 --> 00:06:03.750
252 | So just to get to that data set. And then, maybe you
253 |
254 | 00:06:05.230 --> 00:06:08.129
255 | get the announcement that the dataset was fixed,
256 |
257 | 00:06:08.130 --> 00:06:14.309
258 | and you need to repeat this over and over again, trying to remember how to deal with it. And I'm not even talking about when
259 |
260 | 00:06:14.920 --> 00:06:22.379
261 | the website became much better and sleeker and changed the way it deals with downloads from what it did before.
262 |
263 | 00:06:23.980 --> 00:06:27.029
264 | Another aspect is that data is rarely tested.
265 |
266 | 00:06:27.160 --> 00:06:29.999
267 | So what does it mean for data to have bugs?
268 |
269 | 00:06:30.340 --> 00:06:36.210
270 | Any derived data is a product of running a script or some kind of procedure on
271 |
272 | 00:06:36.880 --> 00:06:39.839
273 | original data and generating new derived data.
274 |
275 | 00:06:40.990 --> 00:06:44.939
276 | Quite prominent ones, which you could find in references later on in this presentation,
277 |
278 | 00:06:45.460 --> 00:06:52.079
279 | are atlases. An atlas is usually produced from the data by running some really sophisticated script
280 |
281 | 00:06:52.330 --> 00:06:53.920
282 | which generates new data:
283 |
284 | 00:06:53.920 --> 00:06:56.819
285 | the atlas. And those atlases could be buggy.
286 |
287 | 00:06:57.160 --> 00:07:03.630
288 | So how do you test the data? The same way as software. If we could establish this efficient process where we
289 |
290 | 00:07:04.630 --> 00:07:08.999
291 | produce some data and verify that at least data meets the assumptions
292 |
293 | 00:07:09.000 --> 00:07:12.869
294 | which you expect. If it's a probability of an area
295 |
296 | 00:07:12.870 --> 00:07:15.990
297 | which must be present in the entirety of the population,
298 |
299 | 00:07:16.060 --> 00:07:21.300
300 | then the probability should add up to 100 or near 100. If it doesn't add up,
301 |
302 | 00:07:22.060 --> 00:07:25.020
303 | then you have a bug. It's a really simple assumption,
304 |
305 | 00:07:25.020 --> 00:07:28.145
306 | but verifying that your data...
307 |
308 | 00:07:28.145 --> 00:07:30.779
309 | doesn't break those assumptions is really important.
310 |
311 | 00:07:32.140 --> 00:07:38.729
312 | A unified way of dealing with data and code could help to establish those data testing procedures.
313 |
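The sanity check described above can be a one-liner. Here is a hedged sketch with made-up numbers, where each input line holds the per-region probabilities (in percent) for one location; the third line is deliberately broken to show what a "data bug" looks like:

```shell
# Each line: probabilities (%) of the regions covering one location.
# They should sum to (roughly) 100; anything else is a data bug.
printf '20 70 10\n50 25 25\n40 40 10\n' |
awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i;
       print ((s >= 99 && s <= 101) ? "ok" : "BUG: sums to " s) }'
```

Run against real atlas files, a check like this would live in the dataset's test suite and fail loudly on a bad release.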
314 | 00:07:39.580 --> 00:07:46.619
315 | Also, it's quite difficult to share your derived data. If you downloaded some data set from a well-known portal...
316 |
317 | 00:07:46.720 --> 00:07:50.160
318 | How do you share it? What data could be shared?
319 |
320 | 00:07:51.100 --> 00:07:53.340
321 | Where do you deposit it so people later on
322 |
323 | 00:07:53.830 --> 00:07:59.819
324 | could download maybe the original data and your derived data, without even worrying that, oh,
325 |
326 | 00:07:59.820 --> 00:08:04.710
327 | they need to get this piece from the original source and your derived data from another place.
328 |
329 | 00:08:05.110 --> 00:08:09.119
330 | So how do we link those pieces together to make it convenient?
331 |
332 | 00:08:10.210 --> 00:08:17.279
333 | What we're trying to achieve is to make managing of the data as easy as managing code and software.
334 |
335 | 00:08:18.430 --> 00:08:21.690
336 | Is it possible? I hope that you'll see that it is so
337 |
338 | 00:08:22.300 --> 00:08:23.350
339 |
340 |
341 | 00:08:23.350 --> 00:08:25.350
342 | What DataLad is based on...
343 |
344 | 00:08:25.780 --> 00:08:31.380
345 | is two pieces, and one of them is Git. I hope that everybody knows what Git is.
346 |
347 | 00:08:31.900 --> 00:08:33.010
348 |
349 |
350 | 00:08:33.010 --> 00:08:38.369
351 | But I'll give a small presentation nevertheless. So Git is a version control system,
352 |
353 | 00:08:38.650 --> 00:08:46.259
354 | and initially it was developed to manage the Linux project's code. If somebody doesn't know what Linux is, it is one of the
355 |
356 | 00:08:46.840 --> 00:08:54.599
357 | most recognized and probably most used operating systems, because it's used everywhere: on phones, on servers.
358 |
359 | 00:08:54.970 --> 00:09:02.010
360 | It's free and open source and it's developed in the open, and at some point they needed a new version control system
361 |
362 | 00:09:02.010 --> 00:09:07.739
363 | which would scale to the demand of having lots of code managed there and many people working with it.
364 |
365 | 00:09:07.980 --> 00:09:13.360
366 | So it's not a geeky project just shared between a few people.
367 |
368 | 00:09:13.900 --> 00:09:16.000
369 | It is developed by hundreds.
370 |
371 | 00:09:16.000 --> 00:09:17.240
372 | It's used by millions.
373 |
374 | 00:09:18.540 --> 00:09:21.560
375 | What's great about Git is that it's distributed.
376 |
377 | 00:09:21.780 --> 00:09:27.440
378 | So content is available across all copies of the repository: if you clone the repository,
379 |
380 | 00:09:28.000 --> 00:09:30.960
381 | You have the entire history of the project
382 |
383 | 00:09:30.960 --> 00:09:35.840
384 | and you could get to any point in that development you could compare different versions.
385 |
386 | 00:09:35.840 --> 00:09:41.080
387 | You could do exactly the same things as the original developers did on this repository.
388 |
389 | 00:09:41.080 --> 00:09:46.900
390 | So it provides you with maximum flexibility to accomplish things locally
391 |
392 | 00:09:47.470 --> 00:09:50.460
393 | without requiring any network access.
394 |
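This full-history property is easy to see with plain Git. A minimal sketch in a throwaway temp directory (the repository names and commit messages are made up):

```shell
# A clone carries the entire project history; everything below is local,
# no network involved.
cd "$(mktemp -d)"
git init -q upstream
git -C upstream -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first version"
git -C upstream -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second version"
git clone -q upstream local-copy          # a full copy, history included
git -C local-copy log --oneline           # both commits are here, offline
git -C local-copy rev-list --count HEAD   # prints: 2
```

The clone can diff, branch, and roll back entirely offline, which is exactly the flexibility the speaker is describing.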
395 | 00:09:51.070 --> 00:09:55.679
396 | Git became the backbone of GitHub and other social coding portals.
397 |
398 | 00:09:55.900 --> 00:10:01.860
399 | So GitHub came to fill the niche: there was no convenient online resource
400 |
401 | 00:10:01.960 --> 00:10:05.960
402 | where people could easily share these repositories and work on them together.
403 |
404 | 00:10:06.480 --> 00:10:12.820
405 | So Git is just a tool and GitHub is just a web portal which provides you
406 |
407 | 00:10:13.000 --> 00:10:20.720
408 | a convenient centralized management of the repositories and collaboration between people.
409 |
410 | 00:10:20.740 --> 00:10:23.220
411 | But it's not the only one; there are other
412 |
413 | 00:10:23.400 --> 00:10:25.740
414 | systems which use Git underneath.
415 |
416 | 00:10:26.400 --> 00:10:29.740
417 | GitLab, Bitbucket, and so on...
418 |
419 | 00:10:31.000 --> 00:10:37.320
420 | It just creates this entire ecosystem of the tool and additional services and resources.
421 |
422 | 00:10:38.280 --> 00:10:48.120
423 | What git is great for is very efficient management of textual information, right, so if you manage code, text, configuration files...
424 |
425 | 00:10:48.280 --> 00:10:58.700
426 | maybe some documentation or JSON files? All of those are nicely managed by Git because it has a really good mechanism to
427 |
428 | 00:10:58.940 --> 00:11:02.662
429 | annotate the differences and compress that efficiently.
430 |
431 | 00:11:02.662 --> 00:11:07.000
432 | So all those distributed copies are actually not that big.
433 |
434 | 00:11:07.200 --> 00:11:10.560
435 | But the problem or inefficiency of Git
436 |
437 | 00:11:10.820 --> 00:11:17.020
438 | is exactly this distributed nature of it: it stores all the copies of the documents
439 |
440 | 00:11:17.320 --> 00:11:23.400
441 | on all the systems, right? So, if I have big files, then it becomes inefficient.
442 |
443 | 00:11:23.400 --> 00:11:28.300
444 | because now you will have at least two copies, right?
445 |
446 | 00:11:28.520 --> 00:11:32.049
447 | One on your hard drive and then one committed into Git.
448 |
449 | 00:11:32.050 --> 00:11:35.620
450 | Then if you push this to GitHub you will have again a
451 |
452 | 00:11:35.779 --> 00:11:40.029
453 | big copy of that file somewhere and anybody who clones that repository
454 |
455 | 00:11:40.580 --> 00:11:43.900
456 | might wait for a while to just get it and then
457 |
458 | 00:11:44.000 --> 00:11:51.640
459 | they might be a little bit upset because they wanted just one file from the repository and didn't care to download a gigabyte
460 |
461 | 00:11:52.120 --> 00:11:54.180
462 | of data just to see it.
463 |
464 | 00:11:54.180 --> 00:11:56.640
465 | So it's inefficient for storing data.
466 |
467 | 00:11:57.760 --> 00:12:04.980
468 | The other tool we rely on, as I said, is git-annex, written by Joey Hess.
469 |
470 | 00:12:05.240 --> 00:12:11.080
471 | So the idea was to build on top of git to provide management for the data files
472 |
473 | 00:12:11.720 --> 00:12:15.279
474 | Without committing those files directly into Git.
475 |
476 | 00:12:16.520 --> 00:12:22.449
477 | So git-annex allows you to add data files under Git control without
478 |
479 | 00:12:23.120 --> 00:12:26.770
480 | committing the content of the files into Git.
481 |
482 | 00:12:27.589 --> 00:12:33.748
483 | While playing with git-annex and DataLad you might see that files get replaced with a symlink.
484 |
485 | 00:12:33.748 --> 00:12:38.280
486 | So what git-annex commits into Git is actually just a symlink
487 |
488 | 00:12:38.280 --> 00:12:41.780
489 | which points to the file which contains the data.
490 |
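The pointer idea can be mimicked with a plain symlink. This is only an illustration, not git-annex itself (which keys files by content under `.git/annex/objects`); the store directory, key, and file names below are invented:

```shell
# Plain-shell illustration of the annex pointer idea.
cd "$(mktemp -d)"
mkdir store                                  # stands in for the annex object store
echo "huge imaging data" > store/KEY-abc123  # content lives here, exactly once
ln -s store/KEY-abc123 sub-01_bold.nii.gz    # the tiny symlink Git would commit
readlink sub-01_bold.nii.gz                  # prints: store/KEY-abc123
cat sub-01_bold.nii.gz                       # prints: huge imaging data
```

Git only ever sees the few bytes of the link target, while tools that open the file transparently read the real content through the symlink.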
491 | 00:12:42.170 --> 00:12:48.130
492 | This way you can commit a really lightweight symlink and keep the data on the hard drive in a single
493 |
494 | 00:12:48.500 --> 00:12:55.299
495 | copy, which is not in Git. And then what git-annex does is orchestrate the
496 |
497 | 00:12:56.089 --> 00:12:58.089
498 | management of those files between
499 |
500 | 00:12:58.279 --> 00:13:03.519
501 | different clones of the repository, or so-called special remotes.
502 |
503 | 00:13:03.800 --> 00:13:10.630
504 | But also it provides access to those files if they are let's say uploaded on to some website,
505 |
506 | 00:13:10.630 --> 00:13:15.760
507 | so you have a URL. You could associate the URL with the file, you could upload it to FTP,
508 |
509 | 00:13:16.100 --> 00:13:18.820
510 | you could upload it to a web server.
511 |
512 | 00:13:19.550 --> 00:13:25.839
513 | You could even get content through BitTorrent, or you could use Amazon S3 storage as your
514 |
515 | 00:13:26.510 --> 00:13:31.029
516 | container for the files and it allows for custom extensions.
517 |
518 | 00:13:31.370 --> 00:13:37.389
519 | Let's say you could upload data to Dropbox, Google Drive, box.com and many, many other
520 |
521 | 00:13:38.839 --> 00:13:40.958
522 | data hosting providers.
523 |
524 | 00:13:42.079 --> 00:13:46.929
525 | Git-annex also takes care of working around the limitations of those platforms.
526 |
527 | 00:13:47.480 --> 00:13:54.279
528 | Let's say box.com with a public account doesn't allow you to have files larger than, I believe, a hundred megabytes.
529 |
530 | 00:13:54.889 --> 00:14:00.489
531 | Git-annex will chop it up, so on box.com you'll have little pieces.
532 |
533 | 00:14:00.490 --> 00:14:03.820
534 | You will not use them directly from box.com, but then git-annex
535 |
536 | 00:14:03.820 --> 00:14:09.129
537 | will re-assemble the big file when it gets it onto your hard drive. So all those
538 |
539 | 00:14:09.410 --> 00:14:15.339
540 | conveniences, and in addition encryption, if you want to share some sensitive data and cannot just upload it
541 |
542 | 00:14:16.130 --> 00:14:20.079
543 | unencrypted to a public service, are all provided by git-annex.
544 |
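The chunk-and-reassemble behaviour can be illustrated with plain `split` and `cat` (git-annex does the equivalent internally; the file sizes and names here are made up):

```shell
# Chop a "large" file into hosting-sized pieces, then restore it.
cd "$(mktemp -d)"
head -c 1000000 /dev/zero > big.dat      # ~1 MB stand-in for a large data file
split -b 100000 big.dat chunk.           # 100 kB pieces: chunk.aa, chunk.ab, ...
ls chunk.* | wc -l                       # prints: 10
cat chunk.* > reassembled.dat            # lexical glob order restores the file
cmp -s big.dat reassembled.dat && echo "identical"
```

The hosting service only ever sees the small pieces; the user on the other end gets the original file back, byte for byte.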
545 | 00:14:20.839 --> 00:14:27.789
546 | An additional feature, which we don't use in the project, is the git-annex assistant, which is a Dropbox-like
547 |
548 | 00:14:28.850 --> 00:14:30.850
549 | synchronization mechanism. You could establish
550 |
551 | 00:14:31.519 --> 00:14:35.649
552 | synchronization between your Git, git-annex repositories across multiple servers and
553 |
554 | 00:14:35.990 --> 00:14:42.909
555 | configure them really flexibly, so you have, let's say, a backup of all the data files on one server, and
556 |
557 | 00:14:42.980 --> 00:14:47.589
558 | some other server will have only the files it cares about, let's say
559 |
560 | 00:14:47.930 --> 00:14:52.299
561 | data files; another one might have only video files,
562 |
563 | 00:14:53.149 --> 00:14:57.698
564 | another one maybe just music files, who knows. So the flexibility is there, and
565 |
566 | 00:14:58.060 --> 00:15:02.400
567 | It's all up to you to configure what you want where.
568 |
569 | 00:15:02.400 --> 00:15:07.480
570 | In our project we don't use it yet, but we do use it locally for synchronizing
571 |
572 | 00:15:07.700 --> 00:15:10.020
573 | different git-annex repositories.
574 |
575 | 00:15:12.380 --> 00:15:15.300
576 | But another problem here, so we have really
577 |
578 | 00:15:20.540 --> 00:15:21.180
579 | two great tools, Git and git-annex, but both of them work on a single-repository level.
580 |
581 | 00:15:21.340 --> 00:15:26.560
582 | So, to work in a git repository you need to go into that directory and
583 |
584 | 00:15:27.320 --> 00:15:29.500
585 | Accomplish whatever you want to do.
586 |
587 | 00:15:29.500 --> 00:15:33.400
588 | It kind of doesn't go along well with the notion of distribution
589 |
590 | 00:15:33.840 --> 00:15:38.760
591 | You don't care where you are on your hard drive; you just want to say:
592 |
593 | 00:15:39.100 --> 00:15:42.720
594 | Oh, search, find me something, install this and
595 |
596 | 00:15:43.140 --> 00:15:46.940
597 | give me access to this data. Right? Or get me this file,
598 |
599 | 00:15:47.080 --> 00:15:50.700
600 | Even though maybe I'm not in that git or git-annex repository
601 |
602 | 00:15:52.180 --> 00:15:56.780
603 | Another aspect: those are just tools. So, similarly to
604 |
605 | 00:15:57.120 --> 00:16:02.820
606 | how GitHub provided a convenient portal to the tool Git,
607 |
608 | 00:16:03.760 --> 00:16:07.178
609 | We want to accomplish something where we use these tools
610 |
611 | 00:16:07.178 --> 00:16:09.620
612 | which are agnostic of the domain of the data
613 |
614 | 00:16:09.630 --> 00:16:11.290
615 | (let's say neuroimaging)
616 |
617 | 00:16:11.290 --> 00:16:16.410
618 | to give you guys access to those terabytes of publicly shared data already
619 |
620 | 00:16:16.720 --> 00:16:19.920
621 | which lives out there somewhere,
622 |
623 | 00:16:19.920 --> 00:16:21.569
624 | so we don't need to collect it. We don't need to
625 |
626 | 00:16:22.270 --> 00:16:26.309
627 | make a copy of it locally, right? It's already there. So
628 |
629 | 00:16:26.950 --> 00:16:31.890
630 | What we want to achieve is just to provide access to that data without
631 |
632 | 00:16:32.230 --> 00:16:36.180
633 | mirroring it on our servers or duplicating it elsewhere.
634 |
635 | 00:16:38.260 --> 00:16:47.740
636 | Before going into demos I want to give you a kind of more illustrative overview of the lifecycle of data
637 |
638 | 00:16:47.940 --> 00:16:51.140
639 | which we provide by DataLad.
640 |
641 | 00:16:51.340 --> 00:16:59.180
642 | Let's imagine that we have a data set which comes initially from OpenfMRI, right, and lives somewhere in the cloud or
643 |
644 | 00:16:59.410 --> 00:17:07.139
645 | on a data hosting portal. Actually we have two copies of the data: one of them might be in a tarball somewhere on an HTTP server,
646 |
647 | 00:17:07.420 --> 00:17:09.420
648 | right and another one might be
649 |
650 | 00:17:09.850 --> 00:17:16.860
651 | extracted from the tarball, somewhere on a cloud which might have HTTP access or S3 access,
652 |
653 | 00:17:16.860 --> 00:17:18.900
654 | but the point is that data is there and
655 |
656 | 00:17:19.480 --> 00:17:25.980
657 | Then we have a data user, and that's us, right: me, you, everybody who wants to use this data.
658 |
659 | 00:17:26.160 --> 00:17:28.160
660 | So now options are: we either...
661 |
662 | 00:17:28.390 --> 00:17:34.680
663 | go download the tarball and extract it, or we learn how to use S3 and go and install some tool,
664 |
665 | 00:17:35.440 --> 00:17:37.589
666 | browse the S3 bucket, and download those files.
667 |
668 | 00:17:38.950 --> 00:17:42.510
669 | But what we are trying to establish here is actually a middle layer, right?
670 |
671 | 00:17:42.710 --> 00:17:48.499
672 | We want to provide a data distribution which might be hosted somewhere, maybe on GitHub, maybe on our server,
673 |
674 | 00:17:49.050 --> 00:17:55.310
675 | which will take this data available online and will automatically crawl it. So here
676 |
677 | 00:17:55.310 --> 00:18:02.600
678 | I mentioned this command crawl which is one of the commands DataLad provides to automate monitoring of external resources
679 |
680 | 00:18:03.750 --> 00:18:09.230
681 | So we could get them into Git repositories and actually you could see here that these
682 |
683 | 00:18:11.040 --> 00:18:12.750
684 | Greenish-yellow
685 |
686 | 00:18:12.750 --> 00:18:15.440
687 | Why you don't draw here? Greenish yellow...
688 |
689 | 00:18:16.260 --> 00:18:17.550
690 | color.
691 |
692 | 00:18:17.550 --> 00:18:19.550
693 | Why you don't draw here?
694 |
695 | 00:18:20.580 --> 00:18:27.919
696 | Here we go! So, this greenish-yellow color represents just a content reference
697 |
698 | 00:18:28.620 --> 00:18:34.280
699 | instead of the actual content. That's why we could host it on GitHub or anywhere, because it doesn't have the actual data.
700 |
701 | 00:18:35.250 --> 00:18:40.310
702 | So we collect those data sets into collections, which we might share
703 |
704 | 00:18:40.310 --> 00:18:44.929
705 | let's say the one which we share from data sets that are on DataLad.org.
706 |
707 | 00:18:45.330 --> 00:18:51.080
708 | underneath we use git submodules, which is a built-in mechanism within Git to organize these collections of
709 |
710 | 00:18:51.270 --> 00:18:55.340
711 | multiple repositories while keeping track of versioning information.
712 |
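The submodule mechanics behind such collections can be sketched with plain Git; local throwaway paths stand in for real dataset URLs, and the names are made up:

```shell
# A "superdataset": one repository that pins other repositories at exact commits.
base="$(mktemp -d)"; cd "$base"
git init -q ds000001
git -C ds000001 -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "dataset content"
git init -q superdataset
cd superdataset
# newer Git restricts file-protocol submodules; allow it for this local demo
git -c protocol.file.allow=always submodule --quiet add "$base/ds000001" ds000001
git -c user.name=Demo -c user.email=demo@example.com \
    commit -qm "register subdataset at a pinned commit"
git submodule status        # shows the exact commit each subdataset is pinned to
```

Because the superdataset records only a commit hash per subdataset, checking out an old superdataset state reproduces the whole collection as it was on that date.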
713 | 00:18:55.340 --> 00:18:58.369
714 | So you could get the entire collection of let's say of OpenfMRI data sets
715 |
716 | 00:18:58.560 --> 00:19:02.749
717 | for a specific date, for a specific version, if you want to reproduce somebody else's analysis.
718 |
719 | 00:19:02.750 --> 00:19:06.140
720 | And then we are making it possible to install
721 |
722 | 00:19:06.660 --> 00:19:10.009
723 | an arbitrary number of those data sets via a unified interface.
724 |
725 | 00:19:10.710 --> 00:19:16.639
726 | So here we mentioned the command datalad install, which you will see later, and hopefully
727 |
728 | 00:19:17.400 --> 00:19:22.400
729 | those parameters, like install into the current data set and get all the data,
730 |
731 | 00:19:23.010 --> 00:19:28.550
732 | will be less surprising. And also we provide shortcuts, which I'll talk about later.
733 |
734 | 00:19:28.800 --> 00:19:31.100
735 | But the point is that you could now easily
736 |
737 | 00:19:31.680 --> 00:19:36.680
738 | Install those data sets onto your local hard drive, and if you are doing some processing
739 |
740 | 00:19:37.530 --> 00:19:44.810
741 | you might add results of the processing. In this case we've got a new filtered bold file, which we could easily add
742 |
743 | 00:19:45.660 --> 00:19:50.480
744 | into this repository, which means it is committed into the repository
745 |
746 | 00:19:51.000 --> 00:19:58.739
747 | under git-annex control. And later we could publish this entire collection of datasets
748 |
749 | 00:20:00.010 --> 00:20:05.010
750 | to multiple places. One of them might be GitHub, where we publish only the
751 |
752 | 00:20:06.130 --> 00:20:08.729
753 | repository itself without the actual data files; again,
754 |
755 | 00:20:08.730 --> 00:20:16.170
756 | those are just symlinks, and maybe offload the actual data to some server, which might be an HTTP server
757 |
758 | 00:20:18.100 --> 00:20:25.230
759 | or some other server through some mechanism right, but the point is that data goes somewhere and the magic happens here
760 |
761 | 00:20:25.330 --> 00:20:31.709
762 | thanks to git-annex, because that's the beast which keeps track of where each data file
763 |
764 | 00:20:32.080 --> 00:20:36.929
765 | could be obtained from. So these red links point to the information
766 |
767 | 00:20:36.930 --> 00:20:44.339
768 | that git-annex stores for us: let's say this bold file is available from the original web portal, right, it's available from an S3 bucket,
769 |
770 | 00:20:44.340 --> 00:20:48.300
771 | it might be coming from a tarball, so that's one of the extensions
772 |
773 | 00:20:48.300 --> 00:20:53.190
774 | we added to git-annex to support extraction of the files from the tarball.
775 |
776 | 00:20:53.740 --> 00:20:57.330
777 | So it becomes really transparent to the user and this new file
778 |
779 | 00:20:58.570 --> 00:21:04.889
780 | We published it there. So it might be available now through HTTP so people who cloned this repository
781 |
782 | 00:21:06.220 --> 00:21:13.620
783 | would be able to get any file from the original storage, or any derived data
784 |
785 | 00:21:13.720 --> 00:21:15.720
786 | which we published on our website.
787 |
788 | 00:21:16.840 --> 00:21:20.250
789 | So that's kind of the main idea behind DataLad.
790 |
791 | 00:21:21.610 --> 00:21:23.260
792 | So, altogether
793 |
794 | 00:21:23.260 --> 00:21:28.650
795 | DataLad allows you to manage multiple repositories organized into these super datasets
796 |
797 | 00:21:28.650 --> 00:21:34.680
798 | which are just collections of git repositories using the standard git submodules mechanism.
799 |
800 | 00:21:34.990 --> 00:21:38.760
801 | It supports both git and git-annex repositories, so if you have
802 |
803 | 00:21:39.490 --> 00:21:45.180
804 | Just regular git repositories where you don't want to add any data. It's perfectly fine.
805 |
806 | 00:21:45.940 --> 00:21:52.499
807 | We can crawl external online data resources and update git-annex repositories upon changes.
808 |
809 | 00:21:53.770 --> 00:21:59.160
810 | It seems to scale quite nicely because data stays with original data provider
811 |
812 | 00:21:59.160 --> 00:22:02.369
813 | so we don't need to increase the storage in our server and
814 |
815 | 00:22:02.920 --> 00:22:09.020
816 | we could use, or you could use (because anybody could use DataLad), to publish
817 |
818 | 00:22:09.200 --> 00:22:11.300
819 | their collections of the datasets on
820 |
821 | 00:22:12.760 --> 00:22:19.679
822 | github and maybe offloading data itself to portals like box.com or dropbox.
823 |
824 | 00:22:21.279 --> 00:22:25.769
825 | What happens now is that we have unified access to data regardless of its origin.
826 |
827 | 00:22:25.770 --> 00:22:30.389
828 | We don't care if data comes from OpenfMRI or CRCNS.
829 |
830 | 00:22:30.820 --> 00:22:36.960
831 | The only difference might be that you need to authenticate. Let's say CRCNS doesn't allow downloads without authentication.
832 |
833 | 00:22:37.539 --> 00:22:42.929
834 | So DataLad will ask you for credentials, which are stored locally on your hard drive.
835 |
836 | 00:22:42.960 --> 00:22:46.649
837 | Nothing is shared with us and later on when you need to get more data
838 |
839 | 00:22:46.750 --> 00:22:51.089
840 | It will just use those credentials to authenticate on your behalf to CRCNS,
841 |
842 | 00:22:51.640 --> 00:22:55.559
843 | download those tarballs and extract them for you, so you don't need to worry about that. And
844 |
845 | 00:22:56.260 --> 00:23:00.929
846 | also, we take care of serialization: if the original website distributes only tarballs,
847 |
848 | 00:23:01.779 --> 00:23:04.919
849 | We download tarballs for you, extract them and again
850 |
851 | 00:23:04.919 --> 00:23:08.939
852 | you don't need to worry how the data is actually serialized by the original data provider.
853 |
854 | 00:23:09.940 --> 00:23:13.770
855 | What we do on top is that we aggregate metadata.
856 |
857 | 00:23:14.320 --> 00:23:18.390
858 | What is metadata? It is data about the data.
859 |
860 | 00:23:18.880 --> 00:23:24.779
861 | So let's say you have a data set which contains the data, and there is information about what this data
862 |
863 | 00:23:24.779 --> 00:23:28.409
864 | set is about, what it's named, who were its authors.
865 |
866 | 00:23:29.080 --> 00:23:31.640
867 | What might be the license if it's applicable?
868 |
869 | 00:23:32.160 --> 00:23:35.880
870 | so any additional information about the data constitutes metadata.
871 |
872 | 00:23:36.140 --> 00:23:42.080
873 | What we do in DataLad, we aggregate metadata, which we find about the original data sets and
874 |
875 | 00:23:42.820 --> 00:23:44.490
876 | Provide you convenient interface
877 |
878 | 00:23:44.490 --> 00:23:48.329
879 | So you could search across all of it across all the data sets which we already
880 |
881 | 00:23:48.640 --> 00:23:52.049
882 | Integrated in DataLad. And I hope you'll see the demonstration
883 |
884 | 00:23:52.750 --> 00:23:54.959
885 | quite appealing later on.
886 |
887 | 00:23:56.049 --> 00:24:01.679
888 | Then with DataLad, after you consumed, extended, or just created data sets from scratch,
889 |
890 | 00:24:02.380 --> 00:24:09.839
891 | You could share original or derived datasets publicly as I mentioned or internally you could always
892 |
893 | 00:24:10.659 --> 00:24:16.709
894 | publish them locally over SSH, maybe to collaborate with somebody; that's what we do regularly. And
895 |
896 | 00:24:17.919 --> 00:24:19.919
897 | Meanwhile we'll keep data
898 |
899 | 00:24:20.320 --> 00:24:27.780
900 | we could keep data available elsewhere, or you could even share the data set without sharing the data, which is quite nice as a
901 |
902 | 00:24:29.860 --> 00:24:36.360
903 | demonstration of good intent when you are about to publish a paper. That's what we did with our recent submission:
904 |
905 | 00:24:36.360 --> 00:24:40.829
906 | we published the data set, but not with the entirety of the data,
907 |
908 | 00:24:40.830 --> 00:24:45.449
909 | but just with the first subject, so reviewers could verify that there is
910 |
911 | 00:24:46.450 --> 00:24:48.130
912 | Good quality data
913 |
914 | 00:24:48.130 --> 00:24:49.990
915 | that
916 |
917 | 00:24:49.990 --> 00:24:56.099
918 | they could get access to it, right, and that the entirety of the data is in principle available,
919 |
920 | 00:24:56.100 --> 00:25:00.089
921 | and it was processed accordingly, because the entire Git history
922 |
923 | 00:25:00.850 --> 00:25:04.650
924 | is maintained and shared, but the data files are not
925 |
926 | 00:25:06.160 --> 00:25:10.560
927 | Okay, and there are additional benefits, some of which are work in progress.
928 |
929 | 00:25:11.080 --> 00:25:15.449
930 | You could export the data set: if you want to share just the data itself, you could
931 |
932 | 00:25:15.580 --> 00:25:21.420
933 | export the data set at its current version in a tarball and give it to somebody. But a more exciting feature
934 |
935 | 00:25:21.700 --> 00:25:25.680
936 | we've been working on is exporting into
937 |
938 | 00:25:26.290 --> 00:25:27.370
939 | some
940 |
941 | 00:25:27.370 --> 00:25:31.170
942 | metadata-heavy data formats. If you're publishing scientific data,
943 |
944 | 00:25:31.660 --> 00:25:37.170
945 | You will be asked to fill out a big spreadsheet, which is called easy to have
946 |
947 | 00:25:38.950 --> 00:25:44.340
948 | to annotate metadata for your data set; it's a really tedious and unpleasant job.
949 |
950 | 00:25:44.340 --> 00:25:48.630
951 | But the beauty is that all that information is contained within
952 |
953 | 00:25:49.030 --> 00:25:54.180
954 | metadata of either data set or of git-annex. So we could automatically
955 |
956 | 00:25:54.580 --> 00:26:01.199
957 | export the majority of that information for you, so you just need to fill out what's left and be done.
958 |
959 | 00:26:04.150 --> 00:26:07.680
960 | DataLad comes with both command line and Python interfaces.
961 |
962 | 00:26:07.680 --> 00:26:14.489
963 | So you could work with it interactively in the command line, or script it in bash, or work with it interactively in IPython,
964 |
965 | 00:26:14.830 --> 00:26:20.100
966 | or script it with the Python language; it gives you the same capabilities and really similar syntax.
967 |
968 | 00:26:22.300 --> 00:26:24.100
969 | Our distribution
970 |
971 | 00:26:24.100 --> 00:26:27.089
972 | has already grown to cover over ten terabytes of data.
973 |
974 | 00:26:27.910 --> 00:26:31.469
975 | We cover such data sets as OpenfMRI, CRCNS,
976 |
977 | 00:26:32.560 --> 00:26:34.560
978 | functional connectome
979 |
980 | 00:26:34.870 --> 00:26:42.430
981 | INDI data sets and even some data sets from Kaggle and some RatHole radio podcast show
982 |
983 | 00:26:42.830 --> 00:26:49.059
984 | Because it was a cool experiment to be able to crawl that website and collect all the data
985 |
986 | 00:26:49.520 --> 00:26:52.599
987 | About timing of the songs. So check it out
988 |
989 | 00:26:52.600 --> 00:26:57.100
990 | It's available on GitHub, although the data stays, again, with the original provider.
991 |
992 | 00:26:57.370 --> 00:27:03.609
993 | What is coming? More data: we'll cover the Human Connectome Project and data available from XNAT servers.
994 |
995 | 00:27:03.890 --> 00:27:08.410
996 | We want to provide extended metadata support, so we cover not only dataset-
997 |
998 | 00:27:08.410 --> 00:27:09.190
999 | level data
1000 |
1001 | 00:27:09.190 --> 00:27:16.750
1002 | but also data for separate files. If you know about any other interesting data set or data provider,
1003 |
1004 | 00:27:17.390 --> 00:27:20.739
1005 | File a new issue, or shoot us an email.
1006 |
1007 | 00:27:21.590 --> 00:27:27.850
1008 | We are also working on integrating with NeuroDebian, so you could apt-get install those datasets, and on deposition of data to
1009 |
1010 | 00:27:28.580 --> 00:27:32.350
1011 | OSF and other platforms. Another interesting integration
1012 |
1013 | 00:27:32.350 --> 00:27:39.880
1014 | which we've done was to introduce DataLad support into HeuDiConv, which stands for Heuristic DICOM Converter,
1015 |
1016 | 00:27:39.880 --> 00:27:41.809
1017 | which allows you to
1018 |
1019 | 00:27:41.809 --> 00:27:47.739
1020 | automate conversion of your DICOM data obtained from MRI scanner into NIfTI files
1021 |
1022 | 00:27:48.080 --> 00:27:51.520
1023 | but we went one step further and
1024 |
1025 | 00:27:52.280 --> 00:27:57.489
1026 | extended it to convert not only to a DataLad data set but to DataLad BIDS data sets.
1027 |
1028 | 00:27:57.490 --> 00:28:02.020
1029 | So if you don't know what BIDS is, it is something you must know nowadays
1030 |
1031 | 00:28:02.540 --> 00:28:07.479
1032 | if you're doing imaging research. It's the Brain Imaging Data Structure format,
1033 |
1034 | 00:28:07.700 --> 00:28:14.679
1035 | Which describes how you should lay out your files on a file system so anybody who finds your data set will be immediately
1036 |
1037 | 00:28:15.230 --> 00:28:17.230
1038 | capable of understanding
1039 |
1040 | 00:28:17.690 --> 00:28:24.970
1041 | your design, how many subjects you have. So it standardizes beyond NIfTI: it standardizes how you
1042 |
1043 | 00:28:25.280 --> 00:28:27.280
1044 | Work with your files so now
1045 |
1046 | 00:28:27.680 --> 00:28:33.500
1047 | with this HeuDiConv integration we can obtain DataLad datasets
1048 |
1049 | 00:28:34.360 --> 00:28:39.660
1050 | with BIDSified neuroimaging data, so it's ready to be shared.
1051 |
1052 | 00:28:39.670 --> 00:28:44.109
1053 | It's ready to be processed by any BIDS-compatible tool, so it opens ample
1054 |
1055 | 00:28:44.660 --> 00:28:46.660
1056 | opportunities
1057 |
1058 | 00:28:46.790 --> 00:28:50.930
1059 | And at this point I guess we should switch and do some demos
1060 |
1061 | 00:28:53.640 --> 00:28:59.989
1062 | And before I actually give any demo I want to familiarize you with our new website DataLad.org
1063 |
1064 | 00:29:01.110 --> 00:29:03.000
1065 | On top you could see
1066 |
1067 | 00:29:03.000 --> 00:29:06.619
1068 | navigation among the major portions of the website.
1069 |
1070 | 00:29:07.140 --> 00:29:09.170
1071 | One of them is about page
1072 |
1073 | 00:29:09.870 --> 00:29:13.910
1074 | It just describes the purpose of DataLad and provides
1075 |
1076 | 00:29:14.580 --> 00:29:18.379
1077 | information about funding agencies and involved institutions
1078 |
1079 | 00:29:20.010 --> 00:29:22.010
1080 | Next link is "Get DataLad"
1081 |
1082 | 00:29:22.890 --> 00:29:30.350
1083 | which describes how to install DataLad. The easiest installation is if you are using NeuroDebian already:
1084 |
1085 | 00:29:30.600 --> 00:29:38.280
1086 | then it's just the apt-get install datalad command, or you could find it in the package manager and install it within seconds.
1087 |
1088 | 00:29:38.720 --> 00:29:45.320
1089 | Alternatively, if you are on OS X or any other operating system (Windows support is initial, but it
1090 |
1091 | 00:29:46.230 --> 00:29:49.190
1092 | should work for the basic set of features),
1093 |
1094 | 00:29:49.800 --> 00:29:51.950
1095 | you have to install git-annex by
1096 |
1097 | 00:29:52.500 --> 00:29:54.590
1098 | going to git-annex website and
1099 |
1100 | 00:29:56.850 --> 00:29:58.850
1101 | onto the install page,
1102 |
1103 | 00:29:59.280 --> 00:30:05.840
1104 | choosing your operating system and following the instructions there on how to get it, and
1105 |
1106 | 00:30:07.200 --> 00:30:09.140
1107 | after you installed git-annex
1108 |
1109 | 00:30:09.140 --> 00:30:15.499
1110 | you just need to install DataLad from the Python package index through the pip install datalad command.
1111 |
1112 | 00:30:16.920 --> 00:30:19.399
1113 | Next page is features page
1114 |
1115 | 00:30:19.410 --> 00:30:27.379
1116 | which is actually linked from those pretty boxes on the main page, and this page we will go through
1117 |
1118 | 00:30:27.840 --> 00:30:29.840
1119 | later in greater detail
1120 |
1121 | 00:30:30.150 --> 00:30:33.379
1122 | Another interesting page is Datasets which presents you our
1123 |
1124 | 00:30:34.110 --> 00:30:39.200
1125 | ultimate official distribution which points to datasets.datalad.org
1126 |
1127 | 00:30:39.990 --> 00:30:44.479
1128 | which is the collection of data sets already pre-crawled for you, and
1129 |
1130 | 00:30:45.240 --> 00:30:50.420
1131 | where we provide those data sets, like for OpenfMRI,
1132 |
1133 | 00:30:51.510 --> 00:30:53.340
1134 | CRCNS
1135 |
1136 | 00:30:53.340 --> 00:30:55.290
1137 | ADHD and
1138 |
1139 | 00:30:55.290 --> 00:30:56.940
1140 | many others
1141 |
1142 | 00:30:56.940 --> 00:31:00.469
1143 | I will just briefly describe the features of these
1144 |
1145 | 00:31:00.990 --> 00:31:07.819
1146 | basic website, and mention that such websites, if you have any HTTP server available somewhere,
1147 |
1148 | 00:31:08.130 --> 00:31:13.459
1149 | maybe your institution provides one, because you will not actually host the data here, or you don't have to;
1150 |
1151 | 00:31:14.429 --> 00:31:21.349
1152 | you could upload similar views of your data sets pretty much anywhere where you could host a website and
1153 |
1154 | 00:31:22.140 --> 00:31:26.479
1155 | OpenfMRI, let's say we go to OpenfMRI, it lists all those data sets
1156 |
1157 | 00:31:26.480 --> 00:31:31.459
1158 | which we crawled from OpenfMRI, you could see also immediately mentioning of the version
1159 |
1160 | 00:31:31.980 --> 00:31:33.980
1161 | here and version goes
1162 |
1163 | 00:31:33.990 --> 00:31:35.990
1164 | from
1165 |
1166 | 00:31:36.000 --> 00:31:42.949
1167 | What version OpenfMRI gave it but also with additional indices pointing to exact commits
1168 |
1169 | 00:31:44.490 --> 00:31:50.539
1170 | within our Git repository to identify that version. Another neat feature here is
1171 |
1172 | 00:31:51.389 --> 00:31:52.919
1173 | immediate
1174 |
1175 | 00:31:52.919 --> 00:31:57.469
1176 | Search so you could start typing and now if you're interested in resting-state
1177 |
1178 | 00:31:58.320 --> 00:32:00.320
1179 | So here we go it goes
1180 |
1181 | 00:32:01.529 --> 00:32:06.199
1182 | pretty fast and limits the view to only the data sets where the metadata
1183 |
1184 | 00:32:07.350 --> 00:32:12.169
1185 | Mentions this word and say let's look for Haxby... There we go!
1186 |
1187 | 00:32:12.779 --> 00:32:18.619
1188 | Or let's look for "movie". There we go! So, you could quickly identify the data sets by
1189 |
1190 | 00:32:19.350 --> 00:32:21.829
1191 | browsing and we'll see how we could do
1192 |
1193 | 00:32:22.289 --> 00:32:25.039
1194 | such actions later in the command line and
1195 |
1196 | 00:32:25.289 --> 00:32:31.638
1197 | When you get to the data set of interest, at pretty much any level, you'll see on top the command which
1198 |
1199 | 00:32:31.639 --> 00:32:35.509
1200 | could be used to install this data set, and some options are described.
1201 |
1202 | 00:32:35.510 --> 00:32:40.969
1203 | Let's say -r is to install this data set with any possible sub data set recursively.
1204 |
1205 | 00:32:41.309 --> 00:32:46.699
1206 | There's -g to install it and also obtain all the data for it, and if you want to speed up the
1207 |
1208 | 00:32:48.360 --> 00:32:51.110
1209 | obtaining the data you could use -J
1210 |
1211 | 00:32:51.110 --> 00:32:56.899
1212 | And specify the number of parallel downloads your server and bandwidth could allow you
1213 |
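
As a rough sketch of what a parallel-download flag like -J does: a fixed number of workers run at once. The stock xargs -P option shows the same idea (the file names below are made up, and the download is simulated with echo):

```shell
# "-J 4" conceptually: up to four workers run in parallel.
# Each input line becomes one simulated "download" job.
printf 'f1.nii.gz\nf2.nii.gz\nf3.nii.gz\nf4.nii.gz\n' |
    xargs -P4 -I{} echo "got {}" > got.txt
# Completion order is nondeterministic, so sort for display:
sort got.txt
```

The actual fetching in DataLad is of course done by git-annex; this only illustrates the worker-pool concept behind the flag.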
1214 | 00:32:57.929 --> 00:33:01.788
1215 | Okay, let's go back to DataLad website and another
1216 |
1217 | 00:33:03.779 --> 00:33:10.489
1218 | page on the website is Development. So, if you're interested to help: contribute datasets, provide
1219 |
1220 | 00:33:11.039 --> 00:33:13.039
1221 | patches, improve documentation.
1222 |
1223 | 00:33:13.320 --> 00:33:20.140
1224 | All of the development is made in the open. We use GitHub intensively, we use Travis, we use codecov.
1225 |
1226 | 00:33:20.920 --> 00:33:27.200
1227 | We use Grid for documentation, and that will be our next point.
1228 |
1229 | 00:33:28.509 --> 00:33:33.479
1230 | Documentation is hosted on docs.datalad.org and it provides
1231 |
1232 | 00:33:34.720 --> 00:33:39.480
1233 | Not yet as thorough documentation as we wanted but some
1234 |
1235 | 00:33:39.789 --> 00:33:46.289
1236 | documentation about the major features of DataLad, and a comparison between Git, git-annex and DataLad.
1237 |
1238 | 00:33:46.659 --> 00:33:48.659
1239 | But it also provides
1240 |
1241 | 00:33:49.179 --> 00:33:56.429
1242 | Really thorough interface documentation so as I mentioned before we have command line and Python
1243 |
1244 | 00:33:57.519 --> 00:34:01.199
1245 | interfaces; both of those interfaces are generated from the same code,
1246 |
1247 | 00:34:01.200 --> 00:34:03.539
1248 | So they should be pretty much identical
1249 |
1250 | 00:34:03.759 --> 00:34:07.829
1251 | it's just that depending on whether you use the command line or Python, the usage will be different.
1252 |
1253 | 00:34:07.839 --> 00:34:11.129
1254 | But otherwise all the options all the commands
1255 |
1256 | 00:34:11.169 --> 00:34:15.449
1257 | They look exactly the same and in command line reference
1258 |
1259 | 00:34:15.450 --> 00:34:23.069
1260 | you could find the documentation for all the commands; here I have some popular ones in my case,
1261 |
1262 | 00:34:23.069 --> 00:34:29.099
1263 | right where I went before, and it provides documentation on what those do. And of course there are
1264 |
1265 | 00:34:30.129 --> 00:34:33.629
1266 | notes for power users and quite elaborate
1267 |
1268 | 00:34:34.179 --> 00:34:37.648
1269 | documentation here about all the options which are available
1270 |
1271 | 00:34:38.230 --> 00:34:40.230
1272 | in those commands
1273 |
1274 | 00:34:41.139 --> 00:34:45.779
1275 | Ok so let's go back to features and
1276 |
1277 | 00:34:47.710 --> 00:34:52.859
1278 | First of the demos which I want to show you will be about data discovery
1279 |
1280 | 00:34:54.399 --> 00:34:59.608
1281 | This, as any other demo on the website, is provided with a
1282 |
1283 | 00:35:00.880 --> 00:35:03.420
1284 | screencast which shows all
1285 |
1286 | 00:35:04.720 --> 00:35:07.770
1287 | necessary commands to carry out the
1288 |
1289 | 00:35:09.670 --> 00:35:12.119
1290 | Presentation, but also provides you with
1291 |
1292 | 00:35:13.210 --> 00:35:17.399
1293 | comments describing the purpose of the actions taken
1294 |
1295 | 00:35:18.520 --> 00:35:20.079
1296 | moreover
1297 |
1298 | 00:35:20.079 --> 00:35:25.989
1299 | You could obtain the full script for the demo, so you could run it as is on your hardware,
1300 |
1301 | 00:35:28.040 --> 00:35:32.379
1302 | by clicking underneath the screencast. But
1303 |
1304 | 00:35:33.800 --> 00:35:38.769
1305 | for this demonstration I'll do it interactively in a shell together with you.
1306 |
1307 | 00:35:40.280 --> 00:35:41.960
1308 | So
1309 |
1310 | 00:35:41.960 --> 00:35:43.960
1311 | Let's get started!
1312 |
1313 | 00:35:44.510 --> 00:35:46.540
1314 | As you remember,
1315 |
1316 | 00:35:47.270 --> 00:35:53.590
1317 | We aggregate a lot of metadata in DataLad to provide efficient search mechanisms
1318 |
1319 | 00:35:55.520 --> 00:35:59.919
1320 | In this example we'll imagine that we were looking for a data set which mentions
1321 |
1322 | 00:36:00.650 --> 00:36:07.600
1323 | Raiders as a word, being associated with the movie Raiders of the Lost Ark, and neuroimaging.
1324 |
1325 | 00:36:09.230 --> 00:36:12.399
1326 | So we'll use the datalad search command, where we'll
1327 |
1328 | 00:36:13.430 --> 00:36:16.300
1329 | just state it right, so we'll call datalad search
1330 |
1331 | 00:36:16.300 --> 00:36:22.449
1332 | Raiders neuroimaging. As with many, or all, commands in DataLad,
1333 |
1334 | 00:36:23.060 --> 00:36:26.110
1335 | They are composed by calling datalad
1336 |
1337 | 00:36:26.110 --> 00:36:33.819
1338 | then typing the command you want to invoke, and then you could ask for help for that command,
1339 |
1340 | 00:36:36.440 --> 00:36:37.850
1341 | Which
1342 |
1343 | 00:36:37.850 --> 00:36:39.850
1344 | provides you with
1345 |
1346 | 00:36:39.950 --> 00:36:41.950
1347 | associated help and
1348 |
1349 | 00:36:42.170 --> 00:36:47.740
1350 | On my screen it took a little bit longer just because of the video recording; usually it's a little bit faster, like
1351 |
1352 | 00:36:48.380 --> 00:36:50.359
1353 | five times and
1354 |
1355 | 00:36:50.359 --> 00:36:54.369
1356 | Then you actually type the parameters for this command. For search
1357 |
1358 | 00:36:54.369 --> 00:36:58.869
1359 | It's actually search terms, and I'll present a few other options later on
1360 |
1361 | 00:36:59.780 --> 00:37:01.780
1362 | whenever you
1363 |
1364 | 00:37:02.270 --> 00:37:09.759
1365 | Start this command for the first time it will ask you to install our super data set
1366 |
1367 | 00:37:11.210 --> 00:37:17.590
1368 | under your home, ~/datalad; in my case /demo is the home directory. So it asks whether
1369 |
1370 | 00:37:17.590 --> 00:37:21.609
1371 | we want to install that super dataset which you saw available on DataLad.org
1372 |
1373 | 00:37:22.310 --> 00:37:24.459
1374 | in your home directory.
1375 |
1376 | 00:37:24.460 --> 00:37:32.139
1377 | And that's what it's doing so it quickly installed it because it's just a small git repository without any of those data sets
1378 |
1379 | 00:37:32.720 --> 00:37:38.970
1380 | directly as part of it; they are linked to it as submodules. It was really fast, and then it loads and caches
1381 |
1382 | 00:37:39.610 --> 00:37:44.069
1383 | metadata, which became available in that dataset and that takes few seconds
1384 |
1385 | 00:37:51.010 --> 00:37:58.649
1386 | Whenever that is done, you see that by default it just returns the paths or names of the
1387 |
1388 | 00:38:00.190 --> 00:38:05.369
1389 | Datasets as they are within the hierarchy of our super dataset and
1390 |
1391 | 00:38:07.510 --> 00:38:09.510
1392 | Search searches within the
1393 |
1394 | 00:38:10.030 --> 00:38:13.709
1395 | Repository data set you are in so if next time
1396 |
1397 | 00:38:13.710 --> 00:38:20.159
1398 | I am just running the same command, then instead of "Oh, do you want to install it?" it'll ask me whether
1399 |
1400 | 00:38:20.160 --> 00:38:23.879
1401 | I want to search in this super dataset which I installed in my home directory
1402 |
1403 | 00:38:26.530 --> 00:38:28.530
1404 | Type yes
1405 |
1406 | 00:38:32.710 --> 00:38:37.619
1407 | And it provides the same result so to avoid such interactive questions
1408 |
1409 | 00:38:37.620 --> 00:38:42.780
1410 | you could explicitly mention which data set you want to search in.
1411 |
1412 | 00:38:43.060 --> 00:38:45.840
1413 | In our case it will be, I'll just specify
1414 |
1415 | 00:38:46.570 --> 00:38:49.170
1416 | That data set will be this
1417 |
1418 | 00:38:49.930 --> 00:38:51.400
1419 | canonical
1420 |
1421 | 00:38:51.400 --> 00:38:54.270
1422 | DataLad data set which is installed in your
1423 |
1424 | 00:38:55.210 --> 00:38:57.750
1425 | DataLad directory when you specify it like this
1426 |
1427 | 00:38:57.750 --> 00:39:06.160
1428 | It assumes a location in your home directory. When you use the triple-slash (///) resource identifier as the source for URLs
1429 |
1430 | 00:39:06.360 --> 00:39:14.020
1431 | To install data sets then it will go to the datasets.datalad.org. And this time we'll search not for Raiders
1432 |
1433 | 00:39:14.040 --> 00:39:20.190
1434 | neuroimaging, but we'll search for Haxby, one of the authors within this data set
1435 |
1436 | 00:39:20.950 --> 00:39:27.839
1437 | So -s stands for the fields which we want to search through and -R will report now
1438 |
1439 | 00:39:27.840 --> 00:39:30.299
1440 | Not just the path to the data set but also
1441 |
1442 | 00:39:30.940 --> 00:39:32.940
1443 | list the fields which match the
1444 |
1445 | 00:39:33.610 --> 00:39:39.120
1446 | Query which we ran. So in this case it should search for data sets and report the field "author".
1447 |
1448 | 00:39:40.210 --> 00:39:43.919
1449 | And only the data sets where Haxby was one of the authors.
1450 |
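
A toy sketch of such field-restricted search, with made-up metadata records: keep only the datasets whose author field matches the term, and report the dataset path. This only illustrates the idea; DataLad aggregates real metadata for you.

```shell
# Hypothetical per-dataset metadata: path, then author field
cat > meta.txt <<'EOF'
openfmri/ds1:Haxby J.; Other A.
openfmri/ds2:Someone E.
EOF
# Roughly what searching the author field for "haxby" does:
# match the term case-insensitively, report the matching dataset path
grep -i 'haxby' meta.txt | cut -d: -f1
```

In the real command, the field to search and the fields to report are selected with options, as described above.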
1451 | 00:39:44.620 --> 00:39:46.390
1452 | So here they are
1453 |
1454 | 00:39:46.390 --> 00:39:49.410
1455 | For convenience, let's just switch to that directory
1456 |
1457 | 00:39:50.160 --> 00:39:52.160
1458 | under our home
1459 |
1460 | 00:39:52.329 --> 00:39:56.939
1461 | Let me clear the screen and go to that directory
1462 |
1463 | 00:39:58.000 --> 00:40:00.299
1464 | So now we don't have to specify
1465 |
1466 | 00:40:01.539 --> 00:40:03.660
1467 | Location of the data set explicitly,
1468 |
1469 | 00:40:03.660 --> 00:40:07.360
1470 | and we could just type the same query without -d
1471 |
1472 | 00:40:07.360 --> 00:40:08.669
1473 | and it will provide the same results
1474 |
1475 | 00:40:13.420 --> 00:40:15.160
1476 | Instead of listing all matching fields,
1477 |
1478 | 00:40:15.160 --> 00:40:19.200
1479 | let's say in our case it was "author" field, we could
1480 |
1481 | 00:40:20.019 --> 00:40:25.409
1482 | explicitly specify which fields you want to search through or to report.
1483 |
1484 | 00:40:26.289 --> 00:40:27.813
1485 | So in this case, I want to see
1486 |
1487 | 00:40:27.813 --> 00:40:31.319
1488 | what's the name of the dataset and what is the author of the dataset?
1489 |
1490 | 00:40:31.319 --> 00:40:33.989
1491 | It already shows the author, but we didn't see the name.
1492 |
1493 | 00:40:35.109 --> 00:40:39.929
1494 | And you rerun the command to get the output with those fields included.
1495 |
1496 | 00:40:41.710 --> 00:40:43.630
1497 | Well enough of searching!
1498 |
1499 | 00:40:43.630 --> 00:40:47.369
1500 | Let's clear the screen and what we could do now -- we found
1501 |
1502 | 00:40:47.680 --> 00:40:51.539
1503 | the datasets, right? It seems that the list of data sets which we found
1504 |
1505 | 00:40:52.360 --> 00:40:55.860
1506 | is good to be installed and we could just
1507 |
1508 | 00:40:56.760 --> 00:40:59.350
1509 | rely on a paradigm of Linux
1510 |
1511 | 00:40:59.350 --> 00:41:05.680
1512 | where you compose commands together by using pipes.
1513 |
1514 | 00:41:05.860 --> 00:41:08.900
1515 | So, what would this magic do?
1516 |
1517 | 00:41:08.900 --> 00:41:16.560
1518 | If we didn't have these installed already, what happens: we get only the list of data sets, or paths,
1519 |
1520 | 00:41:16.690 --> 00:41:22.560
1521 | Those which are not installed yet,
1522 |
1523 | 00:41:22.560 --> 00:41:24.569
1524 | and the OpenfMRI directory is still empty, so we get the list of data sets.
1525 |
1526 | 00:41:25.320 --> 00:41:29.060
1527 | But then instead of manually going and doing:
1528 |
1529 | 00:41:29.060 --> 00:41:33.800
1530 | "datalad install openfmri/ds00233"...
1531 |
1532 | 00:41:33.980 --> 00:41:39.060
1533 | or doing copy-paste, we could just say that result of this command
1534 |
1535 | 00:41:39.400 --> 00:41:41.400
1536 | should be passed as
1537 |
1538 | 00:41:41.400 --> 00:41:45.760
1539 | Arguments to the next command which will be "datalad install".
1540 |
1541 | 00:41:45.760 --> 00:41:48.680
1542 | "datalad install" command installs those datasets
1543 |
1544 | 00:41:48.680 --> 00:41:51.140
1545 | which are either specified by the
1546 |
1547 | 00:41:51.819 --> 00:41:56.069
1548 | path within current data set or you could provide URLs to
1549 |
1550 | 00:41:56.800 --> 00:42:02.300
1551 | Install command and it will go to those websites and download them explicitly from there.
1552 |
1553 | 00:42:02.300 --> 00:42:05.780
1554 | "datalad install" could be used with other resources
1555 |
1556 | 00:42:06.340 --> 00:42:10.800
1557 | beyond our canonical DataLad distribution.
1558 |
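
The pipe pattern just described, feeding the list of found datasets into the install command, can be sketched with stand-in paths (the dataset names are hypothetical, and echo stands in for actually running datalad install):

```shell
# Stand-in for the output of a search: one dataset path per line
printf 'openfmri/ds1\nopenfmri/ds2\n' > found.txt
# xargs turns each input line into an argument of the next command;
# "echo" here stands in for actually invoking "datalad install":
xargs -n1 echo datalad install < found.txt > commands.txt
cat commands.txt
```

In the demo itself, the search output is piped straight into xargs, so every found dataset gets installed in one shot.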
1559 | 00:42:11.140 --> 00:42:13.140
1560 | So let's run this command
1561 |
1562 | 00:42:14.960 --> 00:42:18.640
1563 | as a result of it you'll see that now it goes online and
1564 |
1565 | 00:42:19.550 --> 00:42:25.630
1566 | Installs all those data sets or git/git-annex repositories without any data yet
1567 |
1568 | 00:42:25.670 --> 00:42:29.950
1569 | So only the files which are committed directly into git will be present.
1570 |
1571 | 00:42:42.040 --> 00:42:47.320
1572 | And now we could explore what actually we have got here.
1573 |
1574 | 00:42:47.320 --> 00:42:52.200
1575 | I'll use another DataLad command. Let me clear the screen to bring it on top of the screen.
1576 |
1577 | 00:42:53.079 --> 00:42:56.099
1578 | Next command is "ls", which just lists
1579 |
1580 | 00:42:56.940 --> 00:43:01.400
1581 | either data sets, or it could also be used to list S3 URLs,
1582 |
1583 | 00:43:01.400 --> 00:43:04.380
1584 | If you are interested to see what is available in S3 bucket.
1585 |
1586 | 00:43:04.380 --> 00:43:11.159
1587 | And we are specifying the options: capital -L for long listing, and -r recursively
1588 |
1589 | 00:43:11.160 --> 00:43:15.780
1590 | So it will go through all data sets locally in current directory.
1591 |
1592 | 00:43:15.780 --> 00:43:20.060
1593 | (that's why there is a period), and then we'll just remove from the listing those data sets
1594 |
1595 | 00:43:20.069 --> 00:43:23.129
1596 | which are not installed because they are not of our interest here.
1597 |
1598 | 00:43:56.050 --> 00:44:01.889
1599 | As you can see all those datasets, which we initially searched for and found
1600 |
1601 | 00:44:03.820 --> 00:44:05.820
1602 | Right?
1603 |
1604 | 00:44:09.850 --> 00:44:13.860
1605 | They became installed, so they became available on our
1606 |
1607 | 00:44:14.500 --> 00:44:19.290
1608 | local file system, and "ls" gives us an idea of what kind of repository it is:
1609 |
1610 | 00:44:19.530 --> 00:44:21.870
1611 | It's Git versus annex, which branch it is in...
1612 |
1613 | 00:44:22.320 --> 00:44:24.560
1614 | What was the date of the last commit?
1615 |
1616 | 00:44:24.920 --> 00:44:33.360
1617 | Also the sizes: what it tells here is that we have a total of 4 gigabytes of data referenced in this data set at the current
1618 |
1619 | 00:44:33.900 --> 00:44:39.340
1620 | version we've got only 0 bytes locally installed.
1621 |
1622 | 00:44:39.520 --> 00:44:44.720
1623 | We installed only those symlinks I was talking about.
1624 |
1625 | 00:44:46.150 --> 00:44:51.570
1626 | So, now we could actually explore what we have got.
1627 |
1628 | 00:44:53.290 --> 00:44:58.709
1629 | Some of the files were committed directly into Git, so they became available on the file system as is,
1630 |
1631 | 00:44:59.320 --> 00:45:06.029
1632 | But data files we could obtain now using the "datalad get" command.
1633 |
1634 | 00:45:06.670 --> 00:45:09.629
1635 | So what this command will do... Let me clear the screen again...
1636 |
1637 | 00:45:10.440 --> 00:45:16.960
1638 | So you're saying: "Obtain those files! Do it in four parallel processes."
1639 |
1640 | 00:45:16.960 --> 00:45:24.040
1641 | all the files which match these shell glob expressions,
1642 |
1643 | 00:45:24.040 --> 00:45:26.480
1644 | so all the data sets locally which we have
1645 |
1646 | 00:45:26.860 --> 00:45:32.759
1647 | for all the subjects, underneath an anatomical directory, right? We already obtained two OpenfMRI datasets,
1648 |
1649 | 00:45:32.760 --> 00:45:35.040
1650 | And now we just want to obtain those data files
1651 |
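
A minimal stand-in layout (all directory and file names hypothetical) shows how a glob of that kind expands to the anatomical files of every subject:

```shell
# Build a tiny stand-in layout: two subjects with anatomical files,
# plus one functional file the glob should NOT match
mkdir -p ds1/sub001/anatomy ds1/sub002/anatomy ds1/sub001/functional
touch ds1/sub001/anatomy/highres.nii.gz ds1/sub002/anatomy/highres.nii.gz
touch ds1/sub001/functional/bold.nii.gz
# The glob selects only the anatomical files, for every subject:
ls ds1/sub*/anat*/*.nii.gz
```

The same expansion is what lets a single get command address one class of files across all installed datasets at once.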
1652 | 00:45:35.320 --> 00:45:42.780
1653 | Let's actually see what this one is pointing to... It points to all those data files.
1654 |
1655 | 00:45:42.940 --> 00:45:46.240
1656 | And if we list them with a long listing,
1657 |
1658 | 00:45:46.240 --> 00:45:51.480
1659 | we'll see that those are symlinks whose targets are actually at the moment not even present; they
1660 |
1661 | 00:45:51.820 --> 00:45:56.759
1662 | point into the files which we don't have locally, and that's what git-annex would do for us:
1663 |
1664 | 00:45:56.760 --> 00:45:59.760
1665 | It would go online and fetch all those files
1666 |
1667 | 00:46:00.910 --> 00:46:02.910
1668 | from wherever they are available
1669 |
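
The broken-symlink state described above can be illustrated without git-annex at all; the key-like file name below is made up, and "getting" the file amounts to materializing the target the symlink already points to:

```shell
# A broken symlink, like an annexed file before its content is fetched:
mkdir -p annex-objects
ln -s "$PWD/annex-objects/MD5-s100--bold.nii.gz" bold.nii.gz  # made-up key
[ -L bold.nii.gz ] && echo "symlink exists"
[ -e bold.nii.gz ] || echo "content not here yet"
# Fetching the content fills in the target, and the symlink resolves:
echo data > annex-objects/MD5-s100--bold.nii.gz
cat bold.nii.gz
```

In a real dataset the targets live under .git/annex/objects and git-annex knows all the remote locations it can fetch each key from.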
1670 | 00:46:03.520 --> 00:46:05.520
1671 | So let me run this command now
1672 |
1673 | 00:46:18.220 --> 00:46:21.629
1674 | As you can see, there are four processes going on.
1675 |
1676 | 00:46:27.640 --> 00:46:29.640
1677 | And at the end,
1678 |
1679 | 00:46:30.120 --> 00:46:37.060
1680 | all DataLad commands provide you a summary of what actions they took.
1681 |
1682 | 00:46:37.280 --> 00:46:42.020
1683 | Here you could see that it got all those files, each reported as "get ok",
1684 |
1685 | 00:46:42.299 --> 00:46:49.619
1686 | or it might say "get failed" if it failed to get them, and then it provides an action summary, which we might see later in other demos.
1687 |
1688 | 00:46:50.650 --> 00:46:57.150
1689 | So let's now run the same command which we ran before, to see how much data we actually got.
1690 |
1691 | 00:47:26.540 --> 00:47:30.199
1692 | As you can see, all those for which we didn't ask for any data
1693 |
1694 | 00:47:30.200 --> 00:47:33.500
1695 | They still keep zero bytes although all
1696 |
1697 | 00:47:33.810 --> 00:47:37.969
1698 | the files that are available and we could browse them,
1699 |
1700 | 00:47:37.970 --> 00:47:42.830
1701 | but those where we requested additional data files to be obtained finally list how much data
1702 |
1703 | 00:47:42.830 --> 00:47:46.159
1704 | we have in the working tree
1705 |
1706 | 00:47:47.070 --> 00:47:50.059
1707 | of those data sets.
1708 |
1709 | 00:47:51.120 --> 00:47:54.739
1710 | That would complete the demo for "search" and "install".
1711 |
1712 | 00:47:56.610 --> 00:48:01.489
1713 | Now it's your turn to find data sets interesting to you and get the data for them.
1714 |
1715 | 00:48:03.330 --> 00:48:10.009
1716 | Now that we went through one of the demos on our website or we call it features which was data discovery
1717 |
1718 | 00:48:10.410 --> 00:48:12.860
1719 | You could go and visit other
1720 |
1721 | 00:48:15.450 --> 00:48:19.010
1722 | features described on this page. First one is for data consumers
1723 |
1724 | 00:48:19.010 --> 00:48:25.580
1725 | which describes how you could generate native DataLad datasets from the website or
1726 |
1727 | 00:48:26.250 --> 00:48:29.540
1728 | S3 buckets using our crawler so
1729 |
1730 | 00:48:30.870 --> 00:48:33.409
1731 | If you know some resource you could create your own
1732 |
1733 | 00:48:33.870 --> 00:48:40.999
1734 | DataLad crawler to obtain that data into DataLad dataset and keep it up to date with periodic reruns.
1735 |
1736 | 00:48:41.760 --> 00:48:43.760
1737 | Data sharing demo will later show
1738 |
1739 | 00:48:44.970 --> 00:48:50.570
1740 | examples of how you could share the data either on GitHub, through GitHub while depositing data to your website,
1741 |
1742 | 00:48:51.090 --> 00:48:56.840
1743 | as I demonstrated earlier, or just for collaboration through SSH servers.
1744 |
1745 | 00:48:57.870 --> 00:48:59.600
1746 | For Git and git-annex users
1747 |
1748 | 00:48:59.600 --> 00:49:07.579
1749 | We give a little example of unique features present in DataLad contrasting it with
1750 |
1751 | 00:49:08.310 --> 00:49:10.881
1752 | regular Git and git-annex usage.
1753 |
1754 | 00:49:10.881 --> 00:49:16.280
1755 | This table outlines those features.
1756 |
1757 | 00:49:16.280 --> 00:49:23.740
1758 | We operate on multiple data sets at the same time, we operate across data sets seamlessly
1759 |
1760 | 00:49:23.750 --> 00:49:30.169
1761 | So you don't have to switch directories just to operate with specific data files. We provide metadata support
1762 |
1763 | 00:49:30.860 --> 00:49:37.840
1764 | and aggregate it from different data sources, and a unified authentication interface.
1765 |
1766 | 00:49:38.480 --> 00:49:40.480
1767 | Also, one of the
1768 |
1769 | 00:49:40.700 --> 00:49:48.159
1770 | new unique features in DataLad is the ability to rerun previously run commands on the data, to see how
1771 |
1772 | 00:49:49.820 --> 00:49:52.299
1773 | things changed or just keep nice
1774 |
1775 | 00:49:53.060 --> 00:49:58.509
1776 | Protocol of actions you have done and record them within your git/git-annex history.
1777 |
1778 | 00:50:00.350 --> 00:50:07.539
1779 | And the last one goes in detail, with an example, on how to use HeuDiConv with your data sets,
1780 |
1781 | 00:50:07.730 --> 00:50:09.730
1782 | relying on our
1783 |
1784 | 00:50:10.100 --> 00:50:14.559
1785 | naming convention for how to name scanning sequences in the scanner.
1786 |
1787 | 00:50:15.640 --> 00:50:18.540
1788 | I hope that you liked this presentation
1789 |
1790 | 00:50:18.540 --> 00:50:27.700
1791 | and you liked what DataLad has to offer so I just want to summarize what DataLad does.
1792 |
1793 | 00:50:27.700 --> 00:50:30.300
1794 | And what it does? It helps to manage and share
1795 |
1796 | 00:50:30.380 --> 00:50:34.300
1797 | available and your own data via a simple command line or Python interface.
1798 |
1799 | 00:50:34.880 --> 00:50:38.530
1800 | We provide already access to over 10 terabytes of neuroimaging data
1801 |
1802 | 00:50:38.530 --> 00:50:46.269
1803 | And we help with authentication, crawling of the websites, getting data from the archives in which it was originally distributed
1804 |
1805 | 00:50:47.000 --> 00:50:49.000
1806 | publishing new or derived data.
1807 |
1808 | 00:50:50.000 --> 00:50:55.449
1809 | Underneath we use regular, pure Git and git-annex repositories, so whatever tools
1810 |
1811 | 00:50:55.450 --> 00:50:58.780
1812 | you've got used to, you could still use them.
1813 |
1814 | 00:50:58.780 --> 00:51:01.810
1815 | And if you're an expert git and git-annex user
1816 |
1817 | 00:51:02.210 --> 00:51:03.880
1818 | We will not limit your powers
1819 |
1820 | 00:51:03.880 --> 00:51:11.800
1821 | You could do the same stuff you did before with your Git/git-annex repositories. We also provide a somewhat human-
1822 |
1823 | 00:51:12.380 --> 00:51:14.360
1824 | accessible
1825 |
1826 | 00:51:14.360 --> 00:51:22.060
1827 | metadata interface, so in general if you want just to search for some datasets, it's quite convenient with datalad search.
1828 |
1829 | 00:51:23.240 --> 00:51:25.070
1830 | Documentation is growing
1831 |
1832 | 00:51:25.070 --> 00:51:28.389
1833 | You're welcome to contribute, the project is open source.
1834 |
1835 | 00:51:29.240 --> 00:51:36.790
1836 | I hope that after you've seen the presentation you will agree that managing data can be as simple as managing code and software. Thank you!
1837 |
1838 |
--------------------------------------------------------------------------------