├── README.md
├── data
    ├── janiszewski_rep_cleaned.csv
    ├── majic.csv
    ├── sgf_wide.csv
    ├── sklar_data.csv
    ├── stiller_scales_data.csv
    ├── ws.csv
    └── ws.feather
├── tidyverse_examples.Rmd
├── tidyverse_tutorial.Rmd
├── tidyverse_tutorial_short.Rmd
└── tidyverse_tutorial_short_CDS.Rmd

/README.md:
--------------------------------------------------------------------------------
1 | # tidyverse-tutorial
2 | 
3 | This is a `tidyverse` tutorial that I have used in many contexts, originally for the `data on the mind` workshop at Berkeley in 2017.
4 | 
5 | # Original course blurb
6 | 
7 | The availability of data on the web has opened up many resources for cognitive scientists who know how to deal with "medium data": big enough to crash Excel but small enough to load into memory. R is a powerful tool for statistical data analysis and reproducible research, and the "tidyverse", an ecosystem of R packages for manipulating, analyzing, and visualizing data, provides many tools for working with this kind of data quickly and easily. In this tutorial, I'll walk through how to go from a database or tabular data file to an interactive plot with surprisingly little pain (and less code than you'd imagine). My focus will be on introducing a workflow that uses a wide variety of tools and packages, including readr, dplyr, tidyr, and shiny. I'll assume basic familiarity with R and will use (but not spend too much time teaching) ggplot2.
Featuring data from http://wordbank.stanford.edu
8 | 
9 | # Versions
10 | 
11 | * Standard tutorial: `tidyverse_tutorial.Rmd`
12 | * Short version of the tutorial (currently most up-to-date): `tidyverse_tutorial_short.Rmd`
13 | * with notes from CDS2019: `tidyverse_tutorial_short_CDS.Rmd`
14 | * Exercises (for extra practice): `tidyverse_exercises.Rmd`
15 | 
16 | 
--------------------------------------------------------------------------------
/data/janiszewski_rep_cleaned.csv:
--------------------------------------------------------------------------------
1 | HITId,HITTypeId,Title,Description,Keywords,Reward,CreationTime,MaxAssignments,RequesterAnnotation,AssignmentDurationInSeconds,AutoApprovalDelayInSeconds,Expiration,NumberOfSimilarHITs,LifetimeInSeconds,AssignmentId,WorkerId,AssignmentStatus,AcceptTime,SubmitTime,AutoApprovalTime,ApprovalTime,RejectionTime,RequesterFeedback,WorkTimeInSeconds,LifetimeApprovalRate,Last30DaysApprovalRate,Last7DaysApprovalRate,Input.condition,Input.price1,Input.price2,Input.price3,Answer.dog_cost,Answer.plasma_cost,Answer.sushi_cost
2 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,23NOK5UFJ51YN8N3B5PQZ1CSVW67AD,1,Submitted,Wed Jan 25 18:31:40 GMT 2012,Wed Jan 25 18:32:03 GMT 2012,Wed Feb 01 10:32:03 PST 2012,,,,23,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2300,4800,8.7
3 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,23Q24E6AUBXHCAQMRB092BZWF3D0WV,2,Submitted,Thu Jan 26 23:37:33 GMT 2012,Thu Jan 26 23:41:21 GMT 2012,Thu Feb 02 15:41:21 PST 2012,,,,228,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2450,4850,9 4 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,23UD5NQYN5F36ISVHJPMUCWYDESB7D,3,Submitted,Wed Jan 25 18:39:43 GMT 2012,Wed Jan 25 18:41:12 GMT 2012,Wed Feb 01 10:41:12 PST 2012,,,,89,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,800,1200,8 5 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,240Y1THV8ER9EQ22W8DZZM5KHTE1OH,4,Submitted,Thu Jan 26 19:58:01 GMT 2012,Thu Jan 26 20:00:26 GMT 2012,Thu Feb 02 12:00:26 PST 2012,,,,145,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4200,9 6 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,250ER9Y8TNKI2H11HA6HZ2OM20IU7Y,5,Submitted,Thu Jan 26 02:05:54 GMT 2012,Thu Jan 26 02:09:46 GMT 2012,Wed Feb 01 18:09:46 PST 2012,,,,232,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4500,8.5 7 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,276RSQV66RLPX02ITLNA2KI64Z84C3,6,Submitted,Thu Jan 26 22:14:07 GMT 2012,Thu Jan 26 22:15:33 GMT 2012,Thu Feb 02 14:15:33 PST 2012,,,,86,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1600,4600,8 8 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,298KPEIB9BAW1AVC1GWV4UD3O39X07,7,Submitted,Wed Jan 25 18:24:14 GMT 2012,Wed Jan 25 18:25:47 GMT 2012,Wed Feb 01 10:25:47 PST 2012,,,,93,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1750,4650,6.95 9 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,29EAZHHRRLFQG6W27H81RHO9GR5NEX,8,Submitted,Thu Jan 26 06:01:52 GMT 2012,Thu Jan 26 06:02:46 GMT 2012,Wed Feb 01 22:02:46 PST 2012,,,,54,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2200,4800,8 10 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,29YKW4ZMAZHH79UU2QFNUVRH3U5I9A,9,Submitted,Wed Jan 25 20:54:16 GMT 2012,Wed Jan 25 20:56:02 GMT 2012,Wed Feb 01 12:56:02 PST 2012,,,,106,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4500,8.99 11 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2BG3W5XH0JPP80BZA4GLZP75MV5EKM,10,Submitted,Wed Jan 25 19:05:36 GMT 2012,Wed Jan 25 19:07:26 GMT 2012,Wed Feb 01 11:07:26 PST 2012,,,,110,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1000,3000,9 12 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2BP3V0FK0COKI1NGYXGKR9LRYG6HOO,11,Submitted,Wed Jan 25 18:20:31 GMT 2012,Wed Jan 25 18:22:20 GMT 2012,Wed Feb 01 10:22:20 PST 2012,,,,109,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1600,4578,8.5 13 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2BWIXKO2L92HUU5PINUYVCCF2MDEDE,12,Submitted,Wed Jan 25 20:36:36 GMT 2012,Wed Jan 25 20:39:12 GMT 2012,Wed Feb 01 12:39:12 PST 2012,,,,156,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2250,4999,8.5 14 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2C2ESCWY2AOOIYV1Q6FONQSCOG111N,13,Submitted,Wed Jan 25 18:57:28 GMT 2012,Wed Jan 25 18:57:46 GMT 2012,Wed Feb 01 10:57:46 PST 2012,,,,18,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4000, 15 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2CWSMEZZ2MIK25I9KF0YA8TWY98M59,14,Submitted,Thu Jan 26 23:12:41 GMT 2012,Thu Jan 26 23:14:34 GMT 2012,Thu Feb 02 15:14:34 PST 2012,,,,113,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4500,8.5 16 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2DLVOK5UFJ51EPZT0P6YXS1CU5V69T,15,Submitted,Thu Jan 26 16:07:25 GMT 2012,Thu Jan 26 16:09:05 GMT 2012,Thu Feb 02 08:09:05 PST 2012,,,,100,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2200,4500,8 17 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2GK8SV7WGKCCB3UIO0U3THAPSF9R1P,16,Submitted,Thu Jan 26 12:15:40 GMT 2012,Thu Jan 26 12:17:21 GMT 2012,Thu Feb 02 04:17:21 PST 2012,,,,101,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2200,4700,8 18 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2J0JHSTLB3L0VTSOUE2JLJREXORTHQ,17,Submitted,Wed Jan 25 20:17:25 GMT 2012,Wed Jan 25 20:19:35 GMT 2012,Wed Feb 01 12:19:35 PST 2012,,,,130,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1000,5000,7.75 19 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2NAPDOBDDP8PZET9P3ZXTB3Q5LMD2E,18,Submitted,Thu Jan 26 20:47:27 GMT 2012,Thu Jan 26 20:51:43 GMT 2012,Thu Feb 02 12:51:43 PST 2012,,,,256,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2250,4899,8 20 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2NE25L8I9NZWMZNEIE54XW7TAH5JTX,19,Submitted,Thu Jan 26 11:44:48 GMT 2012,Thu Jan 26 11:46:17 GMT 2012,Thu Feb 02 03:46:17 PST 2012,,,,89,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1749,3499,8.99 21 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2PD6VCPWZ9RNTER37RJ7ZN3DVSJPAD,20,Submitted,Thu Jan 26 21:54:32 GMT 2012,Thu Jan 26 21:55:46 GMT 2012,Thu Feb 02 13:55:46 PST 2012,,,,74,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1850,4999.99,7.99 22 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2RQ5PQJQ9OPL1FLM6CIXN8KCQ9HY0S,21,Submitted,Wed Jan 25 22:52:40 GMT 2012,Wed Jan 25 22:54:26 GMT 2012,Wed Feb 01 14:54:26 PST 2012,,,,106,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2500,4999,9.3 23 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2RSJPPSI2KYE5314JVBLT6WJ500QKL,22,Submitted,Thu Jan 26 23:48:06 GMT 2012,Thu Jan 26 23:50:11 GMT 2012,Thu Feb 02 15:50:11 PST 2012,,,,125,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4500,8.5 24 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2RX7DZKPFE4EWVQSYSWLFI9N18U5FJ,23,Submitted,Thu Jan 26 20:43:16 GMT 2012,Thu Jan 26 20:44:20 GMT 2012,Thu Feb 02 12:44:20 PST 2012,,,,64,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1500,3200,7.35 25 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2VHHIT3HVWAV00FHZ9H2872ZBK4DLS,24,Submitted,Thu Jan 26 06:05:59 GMT 2012,Thu Jan 26 06:07:52 GMT 2012,Wed Feb 01 22:07:52 PST 2012,,,,113,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2250,4500,8.5 26 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2WF9U8P9Y38TCE6XFILYPM0JVKYXGA,25,Submitted,Wed Jan 25 18:32:50 GMT 2012,Wed Jan 25 18:34:41 GMT 2012,Wed Feb 01 10:34:41 PST 2012,,,,111,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1500,3000,8 27 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2YM3K4F0OHOV7JDX9AF2S92HGN221U,26,Submitted,Thu Jan 26 03:40:52 GMT 2012,Thu Jan 26 03:42:20 GMT 2012,Wed Feb 01 19:42:20 PST 2012,,,,88,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1875,4699,7.89 28 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2Z1746OQ1SNQ6PZLJQW1XKCNTCVO4M,27,Submitted,Thu Jan 26 10:50:04 GMT 2012,Thu Jan 26 10:54:48 GMT 2012,Thu Feb 02 02:54:48 PST 2012,,,,284,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4540,8.95 29 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2Z8FL6VCPWZ975MBUEM1Z7SN5NC8NM,28,Submitted,Wed Jan 25 18:40:38 GMT 2012,Wed Jan 25 18:43:38 GMT 2012,Wed Feb 01 10:43:38 PST 2012,,,,180,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,1800,3800,7 30 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2ZKI2KYEPLSPNNT0YWNJAOAHSO8UO2,29,Submitted,Wed Jan 25 18:21:36 GMT 2012,Wed Jan 25 18:24:03 GMT 2012,Wed Feb 01 10:24:03 PST 2012,,,,147,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2000,4000,8.5 31 | 261WKUDD8XMBJ8COBQ261TO6T1XNCJ,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,2,,2ZVTVOK5UFJ5HGG5QEQF5QS1E3B85X,30,Submitted,Thu Jan 26 07:20:44 GMT 2012,Thu Jan 26 07:23:15 GMT 2012,Wed Feb 01 23:23:15 PST 2012,,,,151,0% (0/0),0% (0/0),0% (0/0),over,5012,2508,9.36,2325,4997,8.99 32 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,211Y8TNKIMZS2NTUTITOT0PGXROXA5,31,Submitted,Thu Jan 26 18:50:19 GMT 2012,Thu Jan 26 18:51:09 GMT 2012,Thu Feb 02 10:51:09 PST 2012,,,,50,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1000,2500,7.5 33 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,21KQV66RLPHI9LQA80MKP62NL23E63,32,Submitted,Thu Jan 26 12:55:34 GMT 2012,Thu Jan 26 12:58:31 GMT 2012,Thu Feb 02 04:58:31 PST 2012,,,,177,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,2500,7.8 34 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,223RI8IUB2YUPQSW4JCBAL0FDVWI62,33,Submitted,Thu Jan 26 21:16:03 GMT 2012,Thu Jan 26 21:19:17 GMT 2012,Thu Feb 02 13:19:17 PST 2012,,,,194,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2192,4888,7.5 35 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,254PWZ9RNDWI4DA7JIE3KTGU3FESDP,34,Submitted,Thu Jan 26 16:57:18 GMT 2012,Thu Jan 26 16:58:53 GMT 2012,Thu Feb 02 08:58:53 PST 2012,,,,95,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,4400,7.99 36 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,262VKI62NJQ2HPBELZHHDBKLE50TLX,35,Submitted,Wed Jan 25 20:19:26 GMT 2012,Wed Jan 25 20:22:39 GMT 2012,Wed Feb 01 12:22:39 PST 2012,,,,193,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,4500,8 37 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,273AT2D5NQYNLXC5C9750YMNE6I84S,36,Submitted,Wed Jan 25 19:06:30 GMT 2012,Wed Jan 25 19:07:56 GMT 2012,Wed Feb 01 11:07:56 PST 2012,,,,86,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2100,4300,7.99 38 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,28349F9SSSHXHJS3ZIK5O5F8LGDTXQ,37,Submitted,Thu Jan 26 22:45:26 GMT 2012,Thu Jan 26 22:47:18 GMT 2012,Thu Feb 02 14:47:18 PST 2012,,,,112,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2299,4500,7.8 39 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2AJYNJ92NJKMGNK1VJLH5W2GV0BC76,38,Submitted,Thu Jan 26 02:36:34 GMT 2012,Thu Jan 26 02:37:40 GMT 2012,Wed Feb 01 18:37:40 PST 2012,,,,66,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2100,3500,6.99 40 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2AJYNJ92NJKMGNK1VJLH5W2GVZR7CF,39,Submitted,Wed Jan 25 18:33:07 GMT 2012,Wed Jan 25 18:35:21 GMT 2012,Wed Feb 01 10:35:21 PST 2012,,,,134,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2292,4888,7.5 41 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2BUA1QP6AUC2MVPLJ3SXAV0FMCZ70V,40,Submitted,Thu Jan 26 20:28:29 GMT 2012,Thu Jan 26 20:29:25 GMT 2012,Thu Feb 02 12:29:25 PST 2012,,,,56,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2482,4968,8.46 42 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2C2OO2GMMEGO4YZ7OCXIHA2SGUG885,41,Submitted,Thu Jan 26 19:05:17 GMT 2012,Thu Jan 26 19:07:47 GMT 2012,Thu Feb 02 11:07:47 PST 2012,,,,150,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1500,3000,7.5 43 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2EUJ1LY58B72N49XS5BHK16YH3X7SV,42,Submitted,Thu Jan 26 01:39:06 GMT 2012,Thu Jan 26 01:41:37 GMT 2012,Wed Feb 01 17:41:37 PST 2012,,,,151,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2200,4000,6.99 44 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2EYOQ1SNQQ7QMP9KDGBCUR12V0U7RL,43,Submitted,Wed Jan 25 18:44:07 GMT 2012,Wed Jan 25 18:46:53 GMT 2012,Wed Feb 01 10:46:53 PST 2012,,,,166,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1500,4000,7.99 45 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2FD8I9NZW6HEFOXTGGN70869I5CWMY,44,Submitted,Thu Jan 26 09:31:07 GMT 2012,Thu Jan 26 09:34:34 GMT 2012,Thu Feb 02 01:34:34 PST 2012,,,,207,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1900,4700,7.99 46 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2FMFJ51Y7QEOFX754R3S0MOFZ30EBA,45,Submitted,Thu Jan 26 20:23:02 GMT 2012,Thu Jan 26 20:23:50 GMT 2012,Thu Feb 02 12:23:50 PST 2012,,,,48,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,4500,6.99 47 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2GZD1X3V0FK0S6THV4SMEPKKBV9LEV,46,Submitted,Wed Jan 25 18:43:20 GMT 2012,Wed Jan 25 18:45:30 GMT 2012,Wed Feb 01 10:45:30 PST 2012,,,,130,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2200,4750,7.5 48 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2HW6OQ1SNQQ76OGFHRHKJNR1438Q64,47,Submitted,Wed Jan 25 18:36:05 GMT 2012,Wed Jan 25 18:38:08 GMT 2012,Wed Feb 01 10:38:08 PST 2012,,,,123,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1500,2500,7.58 49 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2IBIA0RYNJ9231T1CV2MQTUH06L381,48,Submitted,Wed Jan 25 18:39:05 GMT 2012,Wed Jan 25 18:41:38 GMT 2012,Wed Feb 01 10:41:38 PST 2012,,,,153,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1900,4120,7.75 50 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2IEQU6L5OBVCYJ0C9U1IQF7GWCJYFB,49,Submitted,Thu Jan 26 04:44:48 GMT 2012,Thu Jan 26 04:47:49 GMT 2012,Wed Feb 01 20:47:49 PST 2012,,,,181,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2200,4900,8 51 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2JG9Z6KW4ZMAFZQ63B6Q7ONNX1ZF6Z,50,Submitted,Wed Jan 25 18:33:35 GMT 2012,Wed Jan 25 18:35:32 GMT 2012,Wed Feb 01 10:35:32 PST 2012,,,,117,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,4000,7.56 52 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2KRHHRRLFQ0O3546TRBHV9EGTMEPG5,51,Submitted,Thu Jan 26 03:55:33 GMT 2012,Thu Jan 26 03:58:49 GMT 2012,Wed Feb 01 19:58:49 PST 2012,,,,196,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1990,3250,7.75 53 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2LFVP3TVOK5UV1EGAXHEVZFYS2N52T,52,Submitted,Wed Jan 25 20:05:15 GMT 2012,Wed Jan 25 20:06:29 GMT 2012,Wed Feb 01 12:06:29 PST 2012,,,,74,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,4000,7.99 54 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2PD6VCPWZ9RNTER37RJ7ZN3DVQSAP3,53,Submitted,Wed Jan 25 23:02:10 GMT 2012,Wed Jan 25 23:02:34 GMT 2012,Wed Feb 01 15:02:34 PST 2012,,,,24,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1750,3500,7.99 55 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2R8YQ4J3NAB6RXVUREPYPT1TLRYQD3,54,Submitted,Thu Jan 26 19:22:55 GMT 2012,Thu Jan 26 19:24:48 GMT 2012,Thu Feb 02 11:24:48 PST 2012,,,,113,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,1790,4350,7.5 56 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2T85TYMNCWYBKROO4IJH411JQZ9OK2,55,Submitted,Thu Jan 26 16:59:02 GMT 2012,Thu Jan 26 17:00:27 GMT 2012,Thu Feb 02 09:00:27 PST 2012,,,,85,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,3200,8.5 57 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2WFCWYB49F9S8AQCDRAOUST5JGOPTG,56,Submitted,Thu Jan 26 04:34:27 GMT 2012,Thu Jan 26 04:36:06 GMT 2012,Wed Feb 01 20:36:06 PST 2012,,,,99,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,500,4500,7.99 58 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2X4WYB49F9SS8Z6GD9FNZT5H7Q8QUQ,57,Submitted,Thu Jan 26 00:21:48 GMT 2012,Thu Jan 26 00:22:47 GMT 2012,Wed Feb 01 16:22:47 PST 2012,,,,59,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,4800,8 59 | 2WQ06UFBNFSVWCUUFZWS36NG1CAH35,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,3,,2XXF3Q0JG5TY25LBA1V9M9SSUTJEII,58,Submitted,Thu Jan 26 19:44:01 GMT 2012,Thu Jan 26 19:46:18 GMT 2012,Thu Feb 02 11:46:18 PST 2012,,,,137,0% (0/0),0% (0/0),0% (0/0),under,4988,2492,8.64,2000,3500,7.69 60 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,206OZFYQS1CS94XU9H4SBIC1G86JMR,59,Submitted,Wed Jan 25 22:04:26 GMT 2012,Wed Jan 25 22:05:48 GMT 2012,Wed Feb 01 14:05:48 PST 2012,,,,82,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2300,4500,8 61 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,22LMOFXRDS4ISJNDPDOJMM11CC4WTT,60,Submitted,Wed Jan 25 20:27:50 GMT 2012,Wed Jan 25 20:30:24 GMT 2012,Wed Feb 01 12:30:24 PST 2012,,,,154,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4500,8.49 62 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,23EOFXRDS4ICHW7SZNAFT11A4UZXU0,61,Submitted,Thu Jan 26 14:19:53 GMT 2012,Thu Jan 26 14:22:29 GMT 2012,Thu Feb 02 06:22:29 PST 2012,,,,156,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2200,4500,8 63 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,24FU8LZ6R70GL9INPYA32E7WU2CM6E,62,Submitted,Wed Jan 25 18:51:08 GMT 2012,Wed Jan 25 18:52:25 GMT 2012,Wed Feb 01 10:52:25 PST 2012,,,,77,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1900,3500,7.5 64 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,259VKCJ011K2NMSMLAHWD18N6IYA8T,63,Submitted,Thu Jan 26 14:19:49 GMT 2012,Thu Jan 26 14:21:42 GMT 2012,Thu Feb 02 06:21:42 PST 2012,,,,113,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1000,2500,6.95 65 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,25B8EBDPGFL6BUYBBZINKWIOXDI2HF,64,Submitted,Thu Jan 26 23:05:44 GMT 2012,Thu Jan 26 23:06:52 GMT 2012,Thu Feb 02 15:06:52 PST 2012,,,,68,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1700,4495,7.5 66 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,262VKI62NJQ2HPBELZHHDBKLE5XLTM,65,Submitted,Wed Jan 25 18:44:34 GMT 2012,Wed Jan 25 18:48:40 GMT 2012,Wed Feb 01 10:48:40 PST 2012,,,,246,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2400,4849,8.69 67 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,27MLRGNTBJ3M2V6KRFYND2KHDL6U85,66,Submitted,Thu Jan 26 18:01:49 GMT 2012,Thu Jan 26 18:04:02 GMT 2012,Thu Feb 02 10:04:02 PST 2012,,,,133,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1700,3500,8 68 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,27VV0FK0COK2ZWA1JFBKGLRW80PIPB,67,Submitted,Thu Jan 26 22:56:39 GMT 2012,Thu Jan 26 22:57:48 GMT 2012,Thu Feb 02 14:57:48 PST 2012,,,,69,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4000,8 69 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,27VYOCCF0CP50FY8UD9I4YG4RYTUTW,68,Submitted,Thu Jan 26 19:48:47 GMT 2012,Thu Jan 26 19:50:33 GMT 2012,Thu Feb 02 11:50:33 PST 2012,,,,106,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1800,1500,8 70 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,28L3DHJHP4BDT2PJAYIS49X3IE13WV,69,Submitted,Thu Jan 26 11:17:02 GMT 2012,Thu Jan 26 11:19:02 GMT 2012,Thu Feb 02 03:19:02 PST 2012,,,,120,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4200,8 71 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,28X8B727M0IGV2QSDWPFZVF4XUOCX1,70,Submitted,Thu Jan 26 12:49:25 GMT 2012,Thu Jan 26 12:52:26 GMT 2012,Thu Feb 02 04:52:26 PST 2012,,,,181,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,2999,7 72 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,291O9Z6KW4ZMQHQW3HCFX0ONP7ZE5U,71,Submitted,Thu Jan 26 16:00:28 GMT 2012,Thu Jan 26 16:01:28 GMT 2012,Thu Feb 02 08:01:28 PST 2012,,,,60,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2450,4995,8.95 73 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2AGHOVR14IXK4KUOE75C3A6X59P78U,72,Submitted,Thu Jan 26 00:30:43 GMT 2012,Thu Jan 26 00:32:31 GMT 2012,Wed Feb 01 16:32:31 PST 2012,,,,108,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2300,4700,7.5 74 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2BKNQQ7Q6705H8TRZHS20QC1ZY8BVI,73,Submitted,Fri Jan 27 01:06:05 GMT 2012,Fri Jan 27 01:09:06 GMT 2012,Thu Feb 02 17:09:06 PST 2012,,,,181,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4300,8.25 75 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2CJCJK63ZVE3X99Z681QGJL5TEHY9C,74,Submitted,Thu Jan 26 02:43:02 GMT 2012,Thu Jan 26 02:45:03 GMT 2012,Wed Feb 01 18:45:03 PST 2012,,,,121,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1000,3500,8 76 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2CYDG67D1X3VGXTFOEB2QE1M9Z9AHP,75,Submitted,Wed Jan 25 21:13:31 GMT 2012,Wed Jan 25 21:15:06 GMT 2012,Wed Feb 01 13:15:06 PST 2012,,,,95,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1950,4675,7.89 77 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2F3MJTUHYW2G97I6MXJ53M16B8BJOW,76,Submitted,Wed Jan 25 18:49:44 GMT 2012,Wed Jan 25 18:50:49 GMT 2012,Wed Feb 01 10:50:49 PST 2012,,,,65,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4000,7.75 78 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2IHJWKUDD8XMRLZILPRBDUTO84KMBY,77,Submitted,Thu Jan 26 06:46:26 GMT 2012,Thu Jan 26 06:47:25 GMT 2012,Wed Feb 01 22:47:25 PST 2012,,,,59,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4500,6.5 79 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2KRHHRRLFQ0O3546TRBHV9EGTL6PGV,78,Submitted,Wed Jan 25 23:46:59 GMT 2012,Wed Jan 25 23:50:08 GMT 2012,Wed Feb 01 15:50:08 PST 2012,,,,189,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,500,4500,8 80 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2RE8JIA0RYNJPKWYWCR5IMJTWTQ613,79,Submitted,Fri Jan 27 00:38:58 GMT 2012,Fri Jan 27 00:40:50 GMT 2012,Thu Feb 02 16:40:50 PST 2012,,,,112,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1500,4500,7 81 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2REVHVKCJ0110KGJVX0KXW61AXR86G,80,Submitted,Wed Jan 25 20:37:49 GMT 2012,Wed Jan 25 20:40:28 GMT 2012,Wed Feb 01 12:40:28 PST 2012,,,,159,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4700,8.5 82 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2RNKCJ011K27K1GOWGN688N484M9BT,81,Submitted,Wed Jan 25 21:00:35 GMT 2012,Wed Jan 25 21:04:08 GMT 2012,Wed Feb 01 13:04:08 PST 2012,,,,213,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1000,2000,7.5 83 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? 
Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2RNWAVKI62NJ6KAMEP09XH6BMWIJRU,82,Submitted,Thu Jan 26 03:34:58 GMT 2012,Thu Jan 26 03:36:57 GMT 2012,Wed Feb 01 19:36:57 PST 2012,,,,119,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2200,4700,8 84 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2U1RJ85OZIZEX4JP1CFOX5EWR9C9X8,83,Submitted,Thu Jan 26 23:47:37 GMT 2012,Thu Jan 26 23:49:46 GMT 2012,Thu Feb 02 15:49:46 PST 2012,,,,129,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4000,8.5 85 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2UHEIB9BAWLS2FY5HLOUK3MTT1T2ZV,84,Submitted,Wed Jan 25 19:08:28 GMT 2012,Wed Jan 25 19:11:34 GMT 2012,Wed Feb 01 11:11:34 PST 2012,,,,186,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2350,4750,8.25 86 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2VCH1Q6SVQ8HOQV56WC5VBVCKC63MN,85,Submitted,Thu Jan 26 06:06:05 GMT 2012,Thu Jan 26 06:09:00 GMT 2012,Wed Feb 01 22:09:00 PST 2012,,,,175,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2000,4000,8.5 87 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2W33Q39Z0B6U96F8G4MS0VRKI09LW8,86,Submitted,Thu Jan 26 05:54:54 GMT 2012,Thu Jan 26 05:58:11 GMT 2012,Wed Feb 01 21:58:11 PST 2012,,,,197,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,1900,4800,8.5 88 | 2XNTYMNCWYB4PXI74I8X81JONPBLPS,2DVGTP9RA7S5C4ALLOJOM70JMHCRW9,How much is it worth? Quick 2 minute survey.,A quick two minute survey asking about prices. 
DO THIS HIT ONLY ONCE.,"survey, quick, prices",0.1,Wed Jan 25 18:13:24 GMT 2012,30,Department:Other,600,604800,Wed Feb 01 18:13:24 GMT 2012,1,,2WBTUHYW2GTPP9JM4VNM869YUZSQL9,87,Submitted,Thu Jan 26 05:42:31 GMT 2012,Thu Jan 26 05:43:31 GMT 2012,Wed Feb 01 21:43:31 PST 2012,,,,60,0% (0/0),0% (0/0),0% (0/0),rounded,5000,2500,9,2499.99,4995,8.99 -------------------------------------------------------------------------------- /data/majic.csv: -------------------------------------------------------------------------------- 1 | subid,grade,group,2015,2016 2 | S1-02-03,first grade,CNTL,0,12 3 | S1-02-08,first grade,CNTL,0,4 4 | S1-02-17,first grade,CNTL,8,NA 5 | S1-03-05,first grade,MA,4,16 6 | S1-03-14,first grade,MA,3,8 7 | S1-03-15,first grade,MA,0,4 8 | S1-04-01,first grade,MA,0,13 9 | S1-04-03,first grade,MA,3,10 10 | S1-04-04,first grade,MA,3,6 11 | S1-04-07,first grade,MA,1,4 12 | S1-04-08,first grade,MA,3,6 13 | S1-04-11,first grade,MA,2,5 14 | S1-04-17,first grade,MA,4,8 15 | S1-04-18,first grade,MA,2,12 16 | S1-07-02,first grade,CNTL,3,15 17 | S1-07-11,first grade,CNTL,2,7 18 | S1-07-18,first grade,CNTL,1,5 19 | S1-07-20,first grade,CNTL,4,8 20 | S1-09-01,first grade,CNTL,3,4 21 | S1-09-05,first grade,CNTL,2,5 22 | S1-09-06,first grade,CNTL,3,9 23 | S1-09-07,first grade,CNTL,3,8 24 | S1-09-11,first grade,CNTL,3,9 25 | S1-10-01,first grade,MA,0,6 26 | S1-10-02,first grade,MA,1,9 27 | S1-10-06,first grade,MA,0,13 28 | S1-10-07,first grade,MA,2,12 29 | S1-10-08,first grade,MA,3,13 30 | S1-10-09,first grade,MA,3,10 31 | S1-10-12,first grade,MA,3,10 32 | S1-10-13,first grade,MA,3,12 33 | S1-11-01,first grade,MA,3,9 34 | S1-11-02,first grade,MA,4,13 35 | S1-11-03,first grade,MA,2,3 36 | S1-11-04,first grade,MA,3,5 37 | S1-11-09,first grade,MA,3,6 38 | S1-11-10,first grade,MA,1,6 39 | S1-11-11,first grade,MA,0,5 40 | S1-11-17,first grade,MA,1,4 41 | S1-12-01,first grade,CNTL,1,14 42 | S1-12-04,first grade,CNTL,0,4 43 | S1-12-05,first grade,CNTL,2,5 44 | S1-12-08,first 
grade,CNTL,1,2 45 | S1-12-11,first grade,CNTL,1,6 46 | S1-14-01,first grade,CNTL,2,9 47 | S1-14-02,first grade,CNTL,2,4 48 | S1-14-07,first grade,CNTL,2,12 49 | S1-14-09,first grade,CNTL,3,12 50 | S1-14-11,first grade,CNTL,1,16 51 | S1-14-12,first grade,CNTL,9,22 52 | S1-14-14,first grade,CNTL,3,11 53 | S1-14-15,first grade,CNTL,4,13 54 | S1-14-17,first grade,CNTL,1,11 55 | S1-15-01,first grade,MA,0,4 56 | S1-15-02,first grade,MA,2,6 57 | S1-15-04,first grade,MA,2,4 58 | S1-15-05,first grade,MA,1,5 59 | S1-15-08,first grade,MA,4,17 60 | S1-15-09,first grade,MA,4,13 61 | S1-15-10,first grade,MA,3,12 62 | S1-15-11,first grade,MA,2,13 63 | S2-03-01,second grade,MA,4,9 64 | S2-03-02,second grade,MA,6,11 65 | S2-03-03,second grade,MA,12,17 66 | S2-03-04,second grade,MA,5,7 67 | S2-03-13,second grade,MA,4,9 68 | S2-04-01,second grade,MA,12,29 69 | S2-04-05,second grade,MA,6,13 70 | S2-04-07,second grade,MA,5,15 71 | S2-04-08,second grade,MA,13,17 72 | S2-04-09,second grade,MA,4,16 73 | S2-04-10,second grade,MA,9,16 74 | S2-04-11,second grade,MA,13,18 75 | S2-04-12,second grade,MA,7,10 76 | S2-04-16,second grade,MA,10,14 77 | S2-04-17,second grade,MA,13,26 78 | S2-04-18,second grade,MA,10,20 79 | S2-05-02,second grade,MA,3,6 80 | S2-08-01,second grade,CNTL,8,12 81 | S2-08-03,second grade,CNTL,4,9 82 | S2-09-05,second grade,CNTL,5,12 83 | S2-09-07,second grade,CNTL,4,10 84 | S2-09-08,second grade,CNTL,4,9 85 | S2-09-10,second grade,CNTL,8,20 86 | S2-09-13,second grade,CNTL,14,17 87 | S2-09-19,second grade,CNTL,5,20 88 | S2-09-20,second grade,CNTL,10,24 89 | S2-10-02,second grade,CNTL,0,25 90 | S2-10-03,second grade,CNTL,5,17 91 | S2-10-05,second grade,CNTL,4,6 92 | S2-10-10,second grade,CNTL,4,15 93 | S2-10-11,second grade,CNTL,0,14 94 | S2-10-13,second grade,CNTL,4,15 95 | S2-10-14,second grade,CNTL,11,9 96 | S2-10-15,second grade,CNTL,5,10 97 | S2-10-18,second grade,CNTL,4,13 98 | S2-10-20,second grade,CNTL,8,11 99 | S2-13-01,second grade,CNTL,11,19 100 | S2-13-02,second 
grade,CNTL,10,24 101 | S2-13-04,second grade,CNTL,9,18 102 | S2-13-05,second grade,CNTL,8,14 103 | S2-13-06,second grade,CNTL,4,19 104 | S2-13-09,second grade,CNTL,7,6 105 | S2-13-12,second grade,CNTL,8,9 106 | S2-13-15,second grade,CNTL,10,18 107 | S2-13-18,second grade,CNTL,4,6 108 | S2-13-19,second grade,CNTL,5,14 109 | S2-13-20,second grade,CNTL,12,15 110 | S2-13-21,second grade,CNTL,6,10 111 | S2-16-01,second grade,MA,14,13 112 | S2-16-05,second grade,MA,12,16 113 | S2-16-08,second grade,MA,7,9 114 | S2-16-09,second grade,MA,6,14 115 | S2-16-11,second grade,MA,4,9 116 | S2-16-14,second grade,MA,5,9 117 | S2-16-15,second grade,MA,8,15 118 | S2-16-17,second grade,MA,3,9 119 | S2-16-18,second grade,MA,4,7 120 | S2-16-19,second grade,MA,16,19 121 | S2-16-20,second grade,MA,4,10 122 | S2-17-02,second grade,MA,11,16 123 | S2-17-03,second grade,MA,6,12 124 | S2-17-04,second grade,MA,8,12 125 | S2-17-05,second grade,MA,4,9 126 | S2-17-06,second grade,MA,8,14 127 | S2-17-07,second grade,MA,5,21 128 | S2-17-08,second grade,MA,3,16 129 | S2-17-09,second grade,MA,8,17 130 | S2-17-10,second grade,MA,10,15 131 | S2-17-11,second grade,MA,4,19 132 | S2-17-12,second grade,MA,8,12 133 | S2-17-13,second grade,MA,5,12 134 | S2-17-14,second grade,MA,5,7 135 | S2-17-15,second grade,MA,14,18 136 | S2-17-16,second grade,MA,4,13 137 | S2-17-17,second grade,MA,5,16 138 | S2-17-19,second grade,MA,8,15 139 | S2-18-01,second grade,MA,6,17 140 | S2-18-02,second grade,MA,12,16 141 | S2-18-06,second grade,MA,2,4 142 | S2-18-07,second grade,MA,0,5 143 | S2-18-09,second grade,MA,4,7 144 | S2-18-10,second grade,MA,4,8 145 | S2-18-11,second grade,MA,5,10 146 | S2-18-12,second grade,MA,6,21 147 | S2-18-13,second grade,MA,7,20 148 | S2-18-15,second grade,MA,14,19 149 | S2-18-16,second grade,MA,6,6 150 | S2-18-17,second grade,MA,8,11 151 | S2-18-19,second grade,MA,11,14 152 | S2-18-20,second grade,MA,7,13 153 | S2-19-01,second grade,CNTL,3,14 154 | S2-19-03,second grade,CNTL,4,6 155 | 
S2-19-04,second grade,CNTL,1,12 156 | S2-19-05,second grade,CNTL,5,3 157 | S2-19-06,second grade,CNTL,4,9 158 | S2-19-07,second grade,CNTL,6,8 159 | S2-19-08,second grade,CNTL,14,17 160 | S2-19-09,second grade,CNTL,5,11 161 | S2-19-11,second grade,CNTL,4,8 162 | S2-19-12,second grade,CNTL,3,15 163 | S2-19-13,second grade,CNTL,14,24 164 | S2-19-14,second grade,CNTL,6,7 165 | S2-19-18,second grade,CNTL,4,12 166 | -------------------------------------------------------------------------------- /data/sgf_wide.csv: -------------------------------------------------------------------------------- 1 | subid,age,condition,age_group,beds,faces,houses,pasta 2 | C1,4.16,Label,"(4,5]",1,1,1,1 3 | C10,3.46,Label,"(3,4]",1,0,0,1 4 | C11,4.22,Label,"(4,5]",1,1,0,1 5 | C12,3.56,Label,"(3,4]",1,1,0,1 6 | C13,4.38,Label,"(4,5]",1,0,1,0 7 | C14,4.57,Label,"(4,5]",1,1,1,0 8 | C15,3.59,Label,"(3,4]",1,1,1,1 9 | C16,3.22,Label,"(3,4]",1,0,0,1 10 | C17,3.25,Label,"(3,4]",0,1,0,1 11 | C18,4.95,Label,"(4,5]",1,0,1,1 12 | C19,4.14,Label,"(4,5]",1,1,0,0 13 | C2,4.6,Label,"(4,5]",1,1,1,1 14 | C20,3.75,Label,"(3,4]",1,1,1,1 15 | C21,4.73,Label,"(4,5]",1,1,0,1 16 | C22,3.92,Label,"(3,4]",1,0,0,1 17 | C23,4.62,Label,"(4,5]",0,0,1,1 18 | C24,3.85,Label,"(3,4]",1,1,0,0 19 | C3,3.46,Label,"(3,4]",1,1,0,1 20 | C4,4.55,Label,"(4,5]",1,1,1,1 21 | C5,4.29,Label,"(4,5]",1,1,1,1 22 | C6,3.26,Label,"(3,4]",1,0,1,1 23 | C7,3.55,Label,"(3,4]",0,0,1,0 24 | C8,3.92,Label,"(3,4]",1,1,1,1 25 | C9,3.82,Label,"(3,4]",1,1,1,1 26 | M1,3.2,Label,"(3,4]",1,0,0,0 27 | M10,3.28,Label,"(3,4]",1,1,1,1 28 | M11,3.82,Label,"(3,4]",1,1,0,1 29 | M12,2.88,Label,"[2,3]",0,1,0,1 30 | M13,2.88,Label,"[2,3]",1,1,0,0 31 | M15,2.98,Label,"[2,3]",1,1,1,1 32 | M16,3.5,Label,"(3,4]",1,0,0,0 33 | M17,4.58,Label,"(4,5]",1,1,1,1 34 | M18,3.46,Label,"(3,4]",1,0,1,1 35 | M19,4.16,Label,"(4,5]",1,0,0,1 36 | M2,4.28,Label,"(4,5]",1,1,0,1 37 | M20,4.64,Label,"(4,5]",1,0,0,1 38 | M21,4.64,Label,"(4,5]",1,1,1,1 39 | M22,2,Label,"[2,3]",0,1,1,0 
40 | M23,3.52,Label,"(3,4]",1,1,0,1 41 | M24,4.82,Label,"(4,5]",1,1,1,1 42 | M25,4.96,Label,"(4,5]",1,1,1,1 43 | M26,2.59,Label,"[2,3]",1,1,1,0 44 | M29,3.72,Label,"(3,4]",1,0,1,1 45 | M3,2.38,Label,"[2,3]",1,0,1,1 46 | M30,4.33,Label,"(4,5]",1,1,0,1 47 | M31,3.3,Label,"(3,4]",1,0,1,1 48 | M32,3.19,Label,"(3,4]",1,1,0,1 49 | M4,3.96,Label,"(3,4]",1,1,1,1 50 | M5,4.84,Label,"(4,5]",1,0,0,0 51 | M6,4.5,Label,"(4,5]",0,0,1,1 52 | M7,4.89,Label,"(4,5]",0,1,1,1 53 | M8,4.89,Label,"(4,5]",1,1,1,1 54 | M9,4.26,Label,"(4,5]",1,1,1,1 55 | MSCH38,2.59,No Label,"[2,3]",0,0,0,1 56 | MSCH39,3.93,No Label,"(3,4]",1,NA,NA,0 57 | MSCH39,3.94,No Label,"(3,4]",NA,0,0,NA 58 | MSCH40,3.02,No Label,"(3,4]",1,0,1,0 59 | MSCH41,3.18,No Label,"(3,4]",0,0,0,0 60 | MSCH42,2.93,No Label,"[2,3]",0,1,0,0 61 | MSCH43,2.71,No Label,"[2,3]",0,0,0,0 62 | MSCH44,2.25,No Label,"[2,3]",0,0,0,0 63 | MSCH45,2.9,No Label,"[2,3]",1,0,0,0 64 | MSCH46,3.76,No Label,"(3,4]",0,0,0,1 65 | MSCH47,2.01,No Label,"[2,3]",0,1,0,1 66 | MSCH48,3.5,No Label,"(3,4]",0,0,1,0 67 | MSCH49,2.88,No Label,"[2,3]",0,0,0,0 68 | MSCH50,2.03,No Label,"[2,3]",0,0,0,0 69 | MSCH51,2.07,No Label,"[2,3]",0,0,0,0 70 | MSCH52,2.5,No Label,"[2,3]",1,0,1,0 71 | MSCH53,2.99,No Label,"[2,3]",0,1,1,0 72 | MSCH66,3.5,No Label,"(3,4]",0,0,0,1 73 | MSCH67,3.24,No Label,"(3,4]",1,0,1,0 74 | MSCH68,3.94,No Label,"(3,4]",0,0,0,0 75 | MSCH69,2.72,No Label,"[2,3]",0,0,1,1 76 | MSCH70,2.31,No Label,"[2,3]",1,0,0,0 77 | MSCH71,3.14,No Label,"(3,4]",0,1,1,1 78 | MSCH72,3.72,No Label,"(3,4]",0,1,1,0 79 | MSCH73,3.1,No Label,"(3,4]",0,0,0,0 80 | MSCH74,2.34,No Label,"[2,3]",1,1,0,0 81 | MSCH75,3.66,No Label,"(3,4]",0,NA,NA,NA 82 | MSCH75,3.67,No Label,"(3,4]",NA,0,0,0 83 | MSCH76,2.58,No Label,"[2,3]",0,0,0,0 84 | MSCH77,2.55,No Label,"[2,3]",1,0,0,0 85 | MSCH78,2.43,No Label,"[2,3]",1,0,0,0 86 | MSCH79,2.7,No Label,"[2,3]",1,0,1,0 87 | MSCH80,2.76,No Label,"[2,3]",0,0,0,0 88 | MSCH81,2.84,No Label,"[2,3]",0,1,0,0 89 | MSCH82,2.46,No 
Label,"[2,3]",0,1,0,1 90 | MSCH83,2.37,No Label,"[2,3]",0,0,0,1 91 | MSCH84,2.83,No Label,"[2,3]",0,0,0,1 92 | MSCH85,2.69,No Label,"[2,3]",0,0,0,0 93 | SCH1,4.82,No Label,"(4,5]",0,0,0,0 94 | SCH10,4.32,No Label,"(4,5]",1,0,0,0 95 | SCH11,3.41,No Label,"(3,4]",0,NA,NA,NA 96 | SCH11,3.82,No Label,"(3,4]",NA,1,1,1 97 | SCH12,3.41,No Label,"(3,4]",0,0,0,0 98 | SCH13,4.75,No Label,"(4,5]",0,0,0,0 99 | SCH14,4.58,No Label,"(4,5]",1,0,0,0 100 | SCH15,4.42,No Label,"(4,5]",0,1,0,0 101 | SCH16,4.55,No Label,"(4,5]",1,0,0,0 102 | SCH17,4.25,No Label,"(4,5]",0,0,0,1 103 | SCH18,3.45,No Label,"(3,4]",0,0,0,0 104 | SCH19,4.79,No Label,"(4,5]",1,0,0,0 105 | SCH2,4.61,No Label,"(4,5]",0,0,0,0 106 | SCH20,4.39,No Label,"(4,5]",0,0,0,0 107 | SCH21,4.76,No Label,"(4,5]",0,0,0,0 108 | SCH22,4.02,No Label,"(4,5]",1,0,0,0 109 | SCH23,4.82,No Label,"(4,5]",0,0,0,0 110 | SCH24,4.07,No Label,"(4,5]",0,0,0,1 111 | SCH25,3.54,No Label,"(3,4]",0,0,1,1 112 | SCH26,4.47,No Label,"(4,5]",0,0,0,1 113 | SCH27,4.09,No Label,"(4,5]",0,0,0,1 114 | SCH28,4.02,No Label,"(4,5]",0,0,0,0 115 | SCH29,3.83,No Label,"(3,4]",0,0,0,0 116 | SCH3,4.47,No Label,"(4,5]",0,0,0,0 117 | SCH30,4.44,No Label,"(4,5]",0,0,0,1 118 | SCH31,3.71,No Label,"(3,4]",0,0,0,0 119 | SCH32,3.27,No Label,"(3,4]",0,1,0,0 120 | SCH33,3.06,No Label,"(3,4]",0,0,0,0 121 | SCH34,3.06,No Label,"(3,4]",0,0,0,0 122 | SCH35,3.02,No Label,"(3,4]",0,0,0,0 123 | SCH36,3.33,No Label,"(3,4]",0,0,1,1 124 | SCH37,3.27,No Label,"(3,4]",0,1,0,1 125 | SCH5,4.61,No Label,"(4,5]",0,0,0,0 126 | SCH6,4.41,No Label,"(4,5]",0,0,0,0 127 | SCH7,4.41,No Label,"(4,5]",0,1,0,0 128 | SCH8,4.52,No Label,"(4,5]",0,0,0,0 129 | SCH9,4.37,No Label,"(4,5]",0,0,0,0 130 | T1,2.95,Label,"[2,3]",1,1,0,0 131 | T10,3,Label,"[2,3]",1,0,1,1 132 | T11,2.99,Label,"[2,3]",1,1,0,1 133 | T12,2.72,Label,"[2,3]",0,0,1,0 134 | T13,2.89,Label,"[2,3]",0,0,1,1 135 | T14,2.83,Label,"[2,3]",1,1,1,0 136 | T15,2.85,Label,"[2,3]",0,0,0,1 137 | T16,2.73,Label,"[2,3]",1,1,0,1 138 | 
T17,2.32,Label,"[2,3]",0,0,0,0 139 | T18,2.61,Label,"[2,3]",0,1,0,1 140 | T19,2.47,Label,"[2,3]",1,0,0,1 141 | T2,2.83,Label,"[2,3]",1,0,0,1 142 | T20,2.5,Label,"[2,3]",1,1,1,0 143 | T21,2.58,Label,"[2,3]",0,1,1,1 144 | T22,2.13,Label,"[2,3]",0,0,1,1 145 | T3,3.09,Label,"(3,4]",1,1,1,1 146 | T4,3.24,Label,"(3,4]",1,1,0,0 147 | T5,2.8,Label,"[2,3]",1,1,1,0 148 | T6,3.1,Label,"(3,4]",1,1,1,1 149 | T7,2.74,Label,"[2,3]",0,1,0,0 150 | T8,2.91,Label,"[2,3]",1,1,0,1 151 | T9,2.79,Label,"[2,3]",1,1,0,0 152 | -------------------------------------------------------------------------------- /data/sklar_data.csv: -------------------------------------------------------------------------------- 1 | prime,prime.result,target,congruent,operand,distance,counterbalance,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21 2 | =1+2+5,8,9,no,A,-1,1,597,560,872,596,664,597,596,700,664,700,700,559,595,560,629,629,523,524,597,664,664 3 | =1+3+5,9,11,no,A,-2,1,699,595,805,1084,525,664,736,804,736,595,735,353,NA,737,768,768,596,596,839,525,664 4 | =1+4+3,8,12,no,A,-4,1,700,596,700,943,736,803,597,840,804,628,735,NA,664,628,664,596,524,560,804,595,596 5 | =1+6+3,10,12,no,A,-2,1,628,317,839,596,628,872,523,664,805,1219,700,NA,664,560,737,804,525,492,977,664,596 6 | =1+9+2,12,11,no,A,1,1,768,663,804,700,629,560,736,628,944,559,596,664,700,736,768,768,596,597,840,597,627 7 | =1+9+3,13,12,no,A,1,1,595,700,664,456,736,737,596,700,768,664,596,NA,628,735,597,769,560,628,701,700,664 8 | =2+1+9,12,11,no,A,1,1,664,736,769,975,628,700,769,699,664,664,664,663,559,737,872,736,595,596,805,664,525 9 | =2+3+6,11,10,no,A,1,1,803,736,768,NA,628,804,560,908,NA,803,595,664,559,627,NA,841,456,524,561,492,525 10 | =2+3+7,12,11,no,A,1,1,767,561,768,840,525,560,627,595,840,628,804,735,664,NA,664,944,560,456,804,664,736 11 | =2+5+6,13,9,no,A,4,1,700,525,872,628,523,629,628,524,596,840,700,664,420,560,700,664,559,560,596,559,524 12 | 
=2+7+1,10,12,no,A,-2,1,596,736,NA,768,768,664,700,769,664,701,840,872,700,627,737,804,664,627,664,596,664 13 | =2+8+3,13,11,no,A,2,1,841,665,736,908,628,597,664,804,664,560,803,NA,559,663,561,768,908,492,597,664,804 14 | =3+1+5,9,12,no,A,-3,1,664,176,769,628,597,736,628,597,737,107,736,664,736,560,664,768,524,560,628,525,595 15 | =3+1+7,11,12,no,A,-1,1,737,664,1048,664,NA,525,627,736,737,700,737,352,664,664,872,664,559,492,908,664,597 16 | =3+2+5,10,14,no,A,-4,1,596,523,1116,560,523,NA,628,664,839,840,628,384,736,560,700,664,492,595,699,560,628 17 | =3+4+2,9,10,no,A,-1,1,1048,595,736,595,597,628,701,524,NA,595,699,595,596,NA,768,664,523,420,700,456,664 18 | =3+5+4,12,10,no,A,2,1,628,768,560,664,456,664,699,736,700,1359,560,523,700,561,628,872,456,385,699,559,596 19 | =3+5+6,14,13,no,A,1,1,664,561,700,944,629,700,628,NA,768,559,701,736,664,561,664,804,628,664,701,664,560 20 | =3+6+2,11,12,no,A,-1,1,737,804,943,664,840,NA,559,664,524,1117,737,560,664,699,840,700,596,628,769,628,735 21 | =3+7+2,12,8,no,A,4,1,1048,524,907,736,664,664,628,596,700,735,700,420,628,596,872,769,597,596,804,597,700 22 | =4+1+7,12,9,no,A,3,1,700,561,804,524,561,664,NA,629,664,628,628,663,525,560,768,664,628,492,664,664,596 23 | =4+1+8,13,10,no,A,3,1,840,524,767,908,561,664,736,840,628,804,596,596,945,491,736,803,524,491,629,804,596 24 | =4+3+2,9,10,no,A,-1,1,560,524,803,595,560,596,628,664,664,1084,597,804,597,560,700,736,492,524,700,456,524 25 | =4+6+1,11,12,no,A,-1,1,699,628,737,597,628,736,628,1012,NA,NA,767,872,736,524,735,804,629,664,736,596,596 26 | =4+6+2,12,13,no,A,-1,1,737,628,803,699,737,699,768,628,597,1292,701,352,561,596,561,769,560,840,596,665,596 27 | =4+6+3,13,12,no,A,1,1,736,628,664,595,561,804,736,700,840,664,664,315,595,628,700,737,628,561,629,597,664 28 | =5+1+3,9,14,no,A,-5,1,840,560,805,736,664,524,701,664,804,664,1084,492,596,524,944,664,560,596,664,596,524 29 | =5+1+4,10,11,no,A,-1,1,873,769,872,737,664,628,803,768,701,456,597,NA,597,735,840,803,736,664,700,664,907 30 
| =5+1+6,12,8,no,A,4,1,736,596,736,628,699,700,700,700,736,700,768,523,736,560,664,736,596,420,628,559,840 31 | =5+2+4,11,13,no,A,-2,1,976,664,839,524,559,976,596,NA,664,737,596,628,664,628,700,908,456,525,597,628,628 32 | =5+4+1,10,12,no,A,-2,1,804,701,736,1049,872,628,664,664,804,595,664,840,700,664,699,872,561,560,700,699,663 33 | =6+1+2,9,13,no,A,-4,1,700,768,769,664,628,561,595,767,700,NA,805,1048,700,595,664,596,524,561,628,595,628 34 | =6+4+1,11,13,no,A,-2,1,804,736,840,628,699,699,597,628,737,597,700,420,560,524,1116,628,525,700,628,596,768 35 | =6+5+1,12,13,no,A,-1,1,735,560,700,735,561,664,735,840,628,628,596,560,596,560,596,NA,525,560,701,700,629 36 | =7+1+3,11,9,no,A,2,1,700,524,804,2336,700,767,597,872,735,560,595,596,628,596,700,909,628,596,596,524,524 37 | =7+1+4,12,10,no,A,2,1,523,628,1048,596,596,560,628,628,804,803,628,628,NA,NA,595,664,456,457,664,524,524 38 | =7+1+5,13,9,no,A,4,1,975,560,804,663,700,664,561,976,628,524,492,560,628,628,627,596,524,525,664,628,597 39 | =7+2+1,10,11,no,A,-1,1,737,596,735,1011,664,628,699,665,735,628,596,595,701,560,840,NA,596,456,596,492,736 40 | =8+2+1,11,9,no,A,2,1,803,596,804,629,700,700,523,560,628,1152,523,595,629,491,840,664,523,493,525,664,700 41 | =8+2+4,14,11,no,A,3,1,804,664,768,628,700,699,735,805,804,596,769,804,664,596,736,804,NA,492,976,NA,736 42 | =1+2+9,12,12,yes,A,0,1,736,596,907,803,944,768,736,909,664,NA,628,735,664,595,804,803,699,595,664,628,664 43 | =1+3+6,10,10,yes,A,0,1,596,736,1188,596,492,664,628,NA,700,628,840,523,628,524,597,700,560,456,664,524,596 44 | =1+5+3,9,9,yes,A,0,1,841,524,768,596,597,596,595,700,699,872,559,975,840,559,664,664,523,524,596,628,628 45 | =1+6+2,9,9,yes,A,0,1,872,560,768,NA,524,664,664,663,737,736,769,804,700,492,596,700,596,456,628,525,525 46 | =1+7+2,10,10,yes,A,0,1,700,664,700,NA,596,664,664,560,597,NA,492,664,700,523,629,700,456,596,628,NA,664 47 | =2+3+5,10,10,yes,A,0,1,596,908,736,628,524,872,560,628,700,663,524,559,596,525,664,872,NA,525,596,596,560 48 | 
=2+4+5,11,11,yes,A,0,1,804,492,804,628,664,700,664,872,NA,NA,664,NA,NA,559,769,805,596,627,1116,597,767 49 | =2+6+4,12,12,yes,A,0,1,700,700,943,735,664,1048,664,769,524,559,596,735,628,699,840,700,664,596,628,628,628 50 | =2+7+5,14,14,yes,A,0,1,804,736,840,736,664,908,561,701,1011,664,628,248,768,596,1012,597,596,525,700,597,664 51 | =2+8+1,11,11,yes,A,0,1,804,627,908,1012,699,628,700,596,736,596,664,NA,664,420,872,768,628,596,629,628,735 52 | =3+1+4,8,8,yes,A,0,1,596,839,735,908,736,664,524,700,908,456,1013,559,736,596,872,700,525,456,665,628,768 53 | =3+2+4,9,9,yes,A,0,1,596,560,839,737,945,804,629,700,595,768,523,560,664,597,NA,737,596,597,596,595,595 54 | =3+2+6,11,11,yes,A,0,1,804,664,737,628,NA,213,805,737,663,597,560,596,664,524,664,804,NA,596,664,628,767 55 | =3+7+1,11,11,yes,A,0,1,908,524,1084,596,628,664,664,840,737,523,561,597,597,596,804,768,559,597,559,664,525 56 | =4+1+5,10,10,yes,A,0,1,560,701,804,560,524,492,699,664,767,628,456,664,NA,559,701,596,523,420,596,560,596 57 | =4+1+6,11,11,yes,A,0,1,804,596,945,664,700,595,840,700,840,628,664,699,596,700,804,736,665,492,NA,597,628 58 | =4+2+3,9,9,yes,A,0,1,768,525,1084,700,664,908,628,597,664,664,596,596,NA,561,628,663,248,524,700,664,699 59 | =4+2+6,12,12,yes,A,0,1,700,664,665,NA,NA,1152,561,804,840,699,664,664,628,700,975,768,560,456,664,769,701 60 | =4+2+7,13,13,yes,A,0,1,596,596,525,700,699,629,768,NA,736,628,628,664,664,628,561,629,628,560,664,595,628 61 | =4+5+1,10,10,yes,A,0,1,872,871,1188,1084,768,628,560,561,628,628,664,524,664,628,597,664,560,456,664,524,596 62 | =4+5+2,11,11,yes,A,0,1,803,736,769,NA,597,525,628,840,841,628,664,736,628,700,735,596,384,664,628,492,700 63 | =4+7+1,12,12,yes,A,0,1,804,628,803,839,628,1048,628,804,840,737,628,840,664,597,736,735,492,628,804,664,737 64 | =5+1+2,8,8,yes,A,0,1,840,492,804,768,736,559,628,736,908,1708,700,596,736,664,840,628,596,560,664,597,596 65 | =5+3+1,9,9,yes,A,0,1,NA,596,909,628,524,628,664,665,628,664,524,352,596,597,872,872,560,628,597,596,595 
66 | =5+3+2,10,10,yes,A,0,1,768,871,1255,628,628,524,596,664,596,700,628,628,596,524,768,596,492,492,737,491,NA 67 | =5+3+4,12,12,yes,A,0,1,976,840,736,1049,628,596,664,664,560,455,663,NA,665,664,701,840,560,596,737,628,664 68 | =6+1+5,12,12,yes,A,0,1,735,596,768,664,664,872,664,665,664,871,768,700,595,736,737,908,628,524,628,700,665 69 | =6+2+1,9,9,yes,A,0,1,976,559,1048,596,560,597,523,736,627,560,597,420,629,560,628,736,664,559,768,561,628 70 | =6+2+3,11,11,yes,A,0,1,803,736,1189,628,628,596,768,737,840,664,597,628,768,597,840,805,560,595,664,805,735 71 | =6+4+2,12,12,yes,A,0,1,628,735,664,700,628,700,596,628,840,1220,628,491,701,560,977,628,492,524,805,700,664 72 | =6+4+3,13,13,yes,A,0,1,NA,664,872,700,735,701,628,872,736,872,701,492,596,596,664,627,664,628,595,596,628 73 | =6+5+2,13,13,yes,A,0,1,699,701,873,700,735,840,628,596,628,523,628,840,736,664,804,665,596,524,597,597,596 74 | =7+2+3,12,12,yes,A,0,1,664,560,769,664,596,699,595,768,NA,1012,769,NA,700,628,768,871,628,524,560,597,596 75 | =7+3+1,11,11,yes,A,0,1,839,628,872,976,NA,596,737,664,767,735,803,628,596,700,664,945,597,595,804,628,559 76 | =7+3+2,12,12,yes,A,0,1,595,804,908,769,701,456,595,840,804,1012,628,628,736,664,700,700,524,597,769,664,595 77 | =7+3+4,14,14,yes,A,0,1,767,561,736,560,560,493,769,804,664,664,664,596,805,665,559,664,492,628,700,140,596 78 | =8+2+3,13,13,yes,A,0,1,700,596,1256,840,628,628,561,767,736,699,664,804,628,804,736,736,628,596,628,597,803 79 | =9+1+2,12,12,yes,A,0,1,736,628,907,909,699,767,699,737,628,872,596,628,628,628,840,736,560,596,737,596,628 80 | =9+1+3,13,13,yes,A,0,1,769,737,944,663,492,803,628,628,700,1084,804,975,628,596,664,699,560,560,804,664,596 81 | =9+3+1,13,13,yes,A,0,1,700,595,768,908,700,664,664,769,871,872,664,597,664,943,1360,841,NA,493,596,664,NA 82 | =3-1-2,0,4,no,S,-4,1,664,664,804,663,596,561,597,664,664,560,735,735,560,492,628,767,561,595,628,559,628 83 | 
=4-3-1,0,2,no,S,-2,1,768,735,803,628,597,943,737,736,768,664,735,737,628,736,768,701,596,597,701,700,664 84 | =5-1-4,0,3,no,S,-3,1,840,872,735,735,699,559,700,664,665,628,736,736,560,628,560,735,596,597,769,596,525 85 | =5-3-2,0,1,no,S,-1,1,804,664,804,872,664,976,736,872,841,524,525,560,665,664,596,804,700,735,804,628,492 86 | =6-1-2,3,0,no,S,3,1,561,736,840,597,525,664,595,628,628,628,736,840,525,560,596,663,456,524,736,524,525 87 | =6-1-3,2,0,no,S,2,1,736,596,805,561,664,524,628,628,664,596,NA,597,559,492,700,804,492,492,596,420,560 88 | =6-3-2,1,4,no,S,-3,1,664,524,736,700,524,945,628,628,664,700,628,456,628,700,560,736,560,560,595,560,736 89 | =6-4-2,0,1,no,S,-1,1,803,595,1219,628,699,664,492,737,767,736,628,736,492,597,524,944,664,456,804,560,524 90 | =6-5-1,0,3,no,S,-3,1,628,700,872,628,700,628,596,700,700,628,664,768,560,628,701,628,628,523,596,629,560 91 | =7-1-2,4,0,no,S,4,1,560,597,872,561,561,664,628,559,700,596,NA,524,700,628,664,700,664,420,596,524,493 92 | =7-1-4,2,0,no,S,2,1,628,559,700,628,456,628,596,628,803,456,NA,663,561,595,559,736,560,523,736,524,NA 93 | =7-1-6,0,3,no,S,-3,1,944,803,1084,628,559,768,492,664,736,664,525,805,628,596,664,597,628,560,596,524,NA 94 | =7-2-4,1,0,no,S,1,1,596,664,NA,627,597,596,491,908,841,628,700,700,561,456,629,596,560,523,596,596,975 95 | =7-2-5,0,1,no,S,-1,1,840,700,908,455,664,736,700,872,663,736,840,628,663,628,664,595,700,664,595,663,595 96 | =7-4-3,0,1,no,S,-1,1,805,804,976,805,664,768,804,664,872,769,736,840,664,595,560,665,597,701,628,1084,524 97 | =8-1-2,5,0,no,S,5,1,700,700,944,NA,525,628,597,596,664,628,524,664,664,596,596,803,492,492,664,628,803 98 | =8-1-3,4,0,no,S,4,1,560,663,736,596,559,595,628,NA,NA,595,629,NA,595,NA,524,736,596,560,664,456,700 99 | =8-1-5,2,0,no,S,2,1,559,736,NA,596,595,596,628,597,596,767,493,492,628,524,597,595,595,384,628,492,840 100 | =8-1-7,0,5,no,S,-5,1,700,768,700,1708,664,908,628,596,700,561,664,596,628,700,492,664,665,595,664,595,560 101 | 
=8-2-5,1,6,no,S,-5,1,NA,873,944,664,700,628,NA,628,804,872,699,804,664,456,700,803,663,597,699,700,804 102 | =8-3-5,0,2,no,S,-2,1,804,700,976,664,664,560,700,804,804,664,664,872,736,664,627,736,560,664,665,664,524 103 | =8-4-1,3,0,no,S,3,1,628,NA,768,664,561,596,493,1012,767,736,NA,596,628,595,NA,803,560,NA,628,NA,561 104 | =8-4-3,1,0,no,S,1,1,767,628,804,628,456,NA,560,736,736,768,700,872,664,628,523,804,523,457,628,492,456 105 | =8-6-2,0,5,no,S,-5,1,736,768,944,596,491,560,596,736,596,596,736,805,560,736,596,664,804,561,699,492,664 106 | =9-1-2,6,3,no,S,3,1,664,248,700,628,700,803,664,628,664,628,700,872,664,524,559,665,561,524,628,701,525 107 | =9-1-8,0,2,no,S,-2,1,768,699,1048,664,664,700,700,840,701,736,664,NA,628,524,736,805,628,628,699,701,735 108 | =9-2-3,4,0,no,S,4,1,944,628,701,524,456,664,524,664,595,595,523,664,559,492,524,840,560,560,628,456,493 109 | =9-2-4,3,0,no,S,3,1,804,872,767,559,420,840,597,628,803,595,456,524,524,455,1152,699,456,523,700,NA,768 110 | =9-2-6,1,0,no,S,1,1,491,628,872,560,803,663,628,1188,664,736,701,628,595,492,524,803,NA,492,664,492,455 111 | =9-3-1,5,0,no,S,5,1,664,559,944,700,628,699,628,871,664,767,736,700,597,523,492,803,492,596,664,492,628 112 | =9-4-3,2,1,no,S,1,1,524,628,1012,803,840,596,736,803,840,768,596,804,664,663,597,872,664,596,908,595,597 113 | =9-5-1,3,2,no,S,1,1,664,735,908,767,700,737,700,664,804,628,664,NA,595,523,492,804,560,493,736,737,767 114 | =9-5-3,1,0,no,S,1,1,736,492,737,596,597,769,525,628,664,456,700,663,596,596,560,700,560,523,803,456,524 115 | =9-5-4,0,2,no,S,-2,1,737,804,1013,457,663,559,595,804,700,663,768,908,735,561,736,663,628,628,735,737,664 116 | =9-6-1,2,0,no,S,2,1,872,908,804,628,664,525,596,700,NA,NA,456,560,596,560,839,805,595,561,664,456,559 117 | =9-6-3,0,1,no,S,-1,1,873,596,840,699,736,736,872,736,769,736,597,737,628,559,628,804,700,628,737,455,456 118 | =9-7-2,0,4,no,S,-4,1,664,627,872,596,628,628,804,595,664,107,597,561,524,525,628,664,595,491,628,561,628 119 | 
=3-2-1,0,0,yes,S,0,1,737,664,736,596,NA,736,560,872,629,456,596,560,628,663,493,664,456,523,664,628,628 120 | =4-1-3,0,0,yes,S,0,1,561,628,840,700,664,560,524,1084,700,560,523,561,628,NA,561,768,248,560,628,560,NA 121 | =5-2-3,0,0,yes,S,0,1,523,628,803,736,560,840,700,596,628,559,456,595,596,596,628,701,560,561,560,559,492 122 | =5-4-1,0,0,yes,S,0,1,664,628,805,628,559,804,664,872,664,665,596,665,628,805,561,767,700,523,700,456,700 123 | =6-1-5,0,0,yes,S,0,1,664,628,841,596,596,559,524,840,628,628,559,804,559,523,456,664,525,524,841,523,561 124 | =6-2-1,3,3,yes,S,0,1,769,NA,840,628,492,664,628,699,629,628,804,524,NA,524,596,664,664,560,525,596,525 125 | =6-2-3,1,1,yes,S,0,1,596,664,944,664,768,736,560,803,768,560,841,664,664,628,560,664,700,492,NA,493,492 126 | =6-2-4,0,0,yes,S,0,1,524,943,1012,561,595,664,524,736,628,596,701,700,664,561,597,840,456,456,1011,492,524 127 | =6-3-1,2,2,yes,S,0,1,768,700,943,840,664,628,700,700,803,737,664,804,524,596,628,839,627,664,736,699,492 128 | =7-2-1,4,4,yes,S,0,1,596,628,804,664,523,561,524,597,803,628,NA,524,560,595,560,628,596,492,596,561,596 129 | =7-3-4,0,0,yes,S,0,1,523,492,663,805,NA,976,628,628,627,597,560,700,597,493,700,699,524,456,735,493,560 130 | =7-4-1,2,2,yes,S,0,1,736,768,804,492,805,840,664,736,804,699,628,699,701,NA,101,700,628,628,664,700,628 131 | =7-4-2,1,1,yes,S,0,1,839,664,944,735,628,735,803,628,768,176,595,805,597,628,627,840,595,628,872,491,664 132 | =7-5-2,0,0,yes,S,0,1,664,212,701,700,315,596,595,700,628,523,596,561,596,493,456,840,492,492,628,NA,524 133 | =7-6-1,0,0,yes,S,0,1,597,524,664,663,456,664,628,1153,701,NA,456,524,628,664,596,944,560,456,596,492,840 134 | =8-1-4,3,3,yes,S,0,1,871,664,NA,767,664,664,628,629,524,561,700,700,628,595,176,841,596,524,701,628,523 135 | =8-2-1,5,5,yes,S,0,1,700,NA,768,664,627,664,560,628,629,560,872,699,561,629,628,1048,596,595,699,628,491 136 | =8-2-6,0,0,yes,S,0,1,664,596,839,596,NA,664,597,805,700,663,596,525,628,700,663,804,561,492,628,NA,383 137 | 
=8-3-1,4,4,yes,S,0,1,628,664,804,628,628,664,596,700,700,524,524,700,596,664,596,737,524,523,700,596,977 138 | =8-3-4,1,1,yes,S,0,1,768,664,872,628,699,628,701,737,737,663,628,560,701,664,524,768,628,560,908,492,492 139 | =8-5-1,2,2,yes,S,0,1,804,664,804,628,628,804,628,596,804,628,736,907,525,700,524,560,560,664,700,663,597 140 | =8-5-2,1,1,yes,S,0,1,525,628,976,736,944,840,736,768,872,736,737,804,664,767,664,736,628,560,596,628,560 141 | =8-5-3,0,0,yes,S,0,1,NA,NA,803,597,525,NA,595,561,628,492,492,NA,597,560,560,699,560,560,595,523,664 142 | =8-7-1,0,0,yes,S,0,1,663,664,803,597,560,628,NA,559,663,1187,NA,596,595,491,523,628,596,525,628,804,596 143 | =9-1-3,5,5,yes,S,0,1,700,840,840,596,664,664,596,664,597,NA,736,596,735,735,456,803,560,524,663,664,456 144 | =9-1-5,3,3,yes,S,0,1,664,628,700,NA,559,737,597,559,595,559,664,768,597,664,560,736,560,560,735,665,629 145 | =9-1-6,2,2,yes,S,0,1,735,767,872,NA,700,840,NA,735,768,699,769,664,595,560,803,769,596,628,700,664,596 146 | =9-2-1,6,6,yes,S,0,1,700,803,1047,664,628,596,737,597,700,628,735,805,664,597,559,908,699,596,700,560,559 147 | =9-2-7,0,0,yes,S,0,1,664,560,804,628,492,628,769,700,664,597,596,597,596,596,596,735,492,420,736,456,523 148 | =9-3-2,4,4,yes,S,0,1,595,736,700,664,803,597,561,735,NA,456,629,736,597,628,525,840,524,493,628,596,456 149 | =9-3-4,2,2,yes,S,0,1,840,804,976,736,664,524,628,803,700,627,561,945,628,595,735,736,664,596,664,699,700 150 | =9-3-5,1,1,yes,S,0,1,768,595,907,871,664,736,768,736,735,736,700,736,664,664,456,769,596,596,628,737,525 151 | =9-3-6,0,0,yes,S,0,1,664,736,805,736,491,560,597,597,840,492,736,944,663,596,596,664,559,524,628,NA,664 152 | =9-4-2,3,3,yes,S,0,1,701,804,805,737,736,596,561,664,596,805,524,524,595,628,736,700,629,628,628,559,524 153 | =9-4-5,0,0,yes,S,0,1,524,736,908,596,560,944,629,596,629,560,628,524,595,456,701,767,525,701,664,456,456 154 | =9-6-2,1,1,yes,S,0,1,873,699,908,316,596,736,768,699,735,664,664,663,700,628,559,596,597,596,560,595,523 155 | 
=9-8-1,0,0,yes,S,0,1,560,908,909,629,493,628,420,736,736,NA,559,560,559,560,595,840,NA,628,945,492,595 156 | -------------------------------------------------------------------------------- /data/stiller_scales_data.csv: -------------------------------------------------------------------------------- 1 | subid,item,correct,age,condition 2 | M22,faces,1,2,Label 3 | M22,houses,1,2,Label 4 | M22,pasta,0,2,Label 5 | M22,beds,0,2,Label 6 | T22,beds,0,2.13,Label 7 | T22,faces,0,2.13,Label 8 | T22,houses,1,2.13,Label 9 | T22,pasta,1,2.13,Label 10 | T17,pasta,0,2.32,Label 11 | T17,faces,0,2.32,Label 12 | T17,houses,0,2.32,Label 13 | T17,beds,0,2.32,Label 14 | M3,faces,0,2.38,Label 15 | M3,houses,1,2.38,Label 16 | M3,pasta,1,2.38,Label 17 | M3,beds,1,2.38,Label 18 | T19,faces,0,2.47,Label 19 | T19,houses,0,2.47,Label 20 | T19,pasta,1,2.47,Label 21 | T19,beds,1,2.47,Label 22 | T20,faces,1,2.5,Label 23 | T20,houses,1,2.5,Label 24 | T20,pasta,0,2.5,Label 25 | T20,beds,1,2.5,Label 26 | T21,faces,1,2.58,Label 27 | T21,houses,1,2.58,Label 28 | T21,pasta,1,2.58,Label 29 | T21,beds,0,2.58,Label 30 | M26,faces,1,2.59,Label 31 | M26,houses,1,2.59,Label 32 | M26,pasta,0,2.59,Label 33 | M26,beds,1,2.59,Label 34 | T18,faces,1,2.61,Label 35 | T18,houses,0,2.61,Label 36 | T18,pasta,1,2.61,Label 37 | T18,beds,0,2.61,Label 38 | T12,beds,0,2.72,Label 39 | T12,faces,0,2.72,Label 40 | T12,houses,1,2.72,Label 41 | T12,pasta,0,2.72,Label 42 | T16,faces,1,2.73,Label 43 | T16,houses,0,2.73,Label 44 | T16,pasta,1,2.73,Label 45 | T16,beds,1,2.73,Label 46 | T7,faces,1,2.74,Label 47 | T7,houses,0,2.74,Label 48 | T7,pasta,0,2.74,Label 49 | T7,beds,0,2.74,Label 50 | T9,houses,0,2.79,Label 51 | T9,faces,1,2.79,Label 52 | T9,pasta,0,2.79,Label 53 | T9,beds,1,2.79,Label 54 | T5,faces,1,2.8,Label 55 | T5,houses,1,2.8,Label 56 | T5,pasta,0,2.8,Label 57 | T5,beds,1,2.8,Label 58 | T14,faces,1,2.83,Label 59 | T14,houses,1,2.83,Label 60 | T14,pasta,0,2.83,Label 61 | T14,beds,1,2.83,Label 62 | 
T2,houses,0,2.83,Label 63 | T2,faces,0,2.83,Label 64 | T2,pasta,1,2.83,Label 65 | T2,beds,1,2.83,Label 66 | T15,faces,0,2.85,Label 67 | T15,houses,0,2.85,Label 68 | T15,pasta,1,2.85,Label 69 | T15,beds,0,2.85,Label 70 | M13,houses,0,2.88,Label 71 | M13,beds,1,2.88,Label 72 | M13,faces,1,2.88,Label 73 | M13,pasta,0,2.88,Label 74 | M12,faces,1,2.88,Label 75 | M12,houses,0,2.88,Label 76 | M12,pasta,1,2.88,Label 77 | M12,beds,0,2.88,Label 78 | T13,beds,0,2.89,Label 79 | T13,faces,0,2.89,Label 80 | T13,houses,1,2.89,Label 81 | T13,pasta,1,2.89,Label 82 | T8,faces,1,2.91,Label 83 | T8,houses,0,2.91,Label 84 | T8,pasta,1,2.91,Label 85 | T8,beds,1,2.91,Label 86 | T1,faces,1,2.95,Label 87 | T1,houses,0,2.95,Label 88 | T1,pasta,0,2.95,Label 89 | T1,beds,1,2.95,Label 90 | M15,faces,1,2.98,Label 91 | M15,houses,1,2.98,Label 92 | M15,pasta,1,2.98,Label 93 | M15,beds,1,2.98,Label 94 | T11,faces,1,2.99,Label 95 | T11,houses,0,2.99,Label 96 | T11,pasta,1,2.99,Label 97 | T11,beds,1,2.99,Label 98 | T10,faces,0,3,Label 99 | T10,houses,1,3,Label 100 | T10,pasta,1,3,Label 101 | T10,beds,1,3,Label 102 | T3,faces,1,3.09,Label 103 | T3,houses,1,3.09,Label 104 | T3,pasta,1,3.09,Label 105 | T3,beds,1,3.09,Label 106 | T6,faces,1,3.1,Label 107 | T6,houses,1,3.1,Label 108 | T6,pasta,1,3.1,Label 109 | T6,beds,1,3.1,Label 110 | M32,beds,1,3.19,Label 111 | M32,faces,1,3.19,Label 112 | M32,houses,0,3.19,Label 113 | M32,pasta,1,3.19,Label 114 | M1,faces,0,3.2,Label 115 | M1,beds,1,3.2,Label 116 | M1,pasta,0,3.2,Label 117 | M1,houses,0,3.2,Label 118 | C16,faces,0,3.22,Label 119 | C16,houses,0,3.22,Label 120 | C16,pasta,1,3.22,Label 121 | C16,beds,1,3.22,Label 122 | T4,faces,1,3.24,Label 123 | T4,houses,0,3.24,Label 124 | T4,pasta,0,3.24,Label 125 | T4,beds,1,3.24,Label 126 | C17,faces,1,3.25,Label 127 | C17,houses,0,3.25,Label 128 | C17,pasta,1,3.25,Label 129 | C17,beds,0,3.25,Label 130 | C6,faces,0,3.26,Label 131 | C6,houses,1,3.26,Label 132 | C6,pasta,1,3.26,Label 133 | C6,beds,1,3.26,Label 134 | 
M10,faces,1,3.28,Label 135 | M10,houses,1,3.28,Label 136 | M10,beds,1,3.28,Label 137 | M10,pasta,1,3.28,Label 138 | M31,faces,0,3.3,Label 139 | M31,houses,1,3.3,Label 140 | M31,pasta,1,3.3,Label 141 | M31,beds,1,3.3,Label 142 | C3,houses,0,3.46,Label 143 | C3,pasta,1,3.46,Label 144 | C3,beds,1,3.46,Label 145 | C3,faces,1,3.46,Label 146 | C10,faces,0,3.46,Label 147 | C10,houses,0,3.46,Label 148 | C10,pasta,1,3.46,Label 149 | C10,beds,1,3.46,Label 150 | M18,faces,0,3.46,Label 151 | M18,houses,1,3.46,Label 152 | M18,pasta,1,3.46,Label 153 | M18,beds,1,3.46,Label 154 | M16,faces,0,3.5,Label 155 | M16,houses,0,3.5,Label 156 | M16,pasta,0,3.5,Label 157 | M16,beds,1,3.5,Label 158 | M23,faces,1,3.52,Label 159 | M23,houses,0,3.52,Label 160 | M23,pasta,1,3.52,Label 161 | M23,beds,1,3.52,Label 162 | C7,faces,0,3.55,Label 163 | C7,houses,1,3.55,Label 164 | C7,pasta,0,3.55,Label 165 | C7,beds,0,3.55,Label 166 | C12,faces,1,3.56,Label 167 | C12,houses,0,3.56,Label 168 | C12,pasta,1,3.56,Label 169 | C12,beds,1,3.56,Label 170 | C15,faces,1,3.59,Label 171 | C15,houses,1,3.59,Label 172 | C15,pasta,1,3.59,Label 173 | C15,beds,1,3.59,Label 174 | M29,faces,0,3.72,Label 175 | M29,houses,1,3.72,Label 176 | M29,pasta,1,3.72,Label 177 | M29,beds,1,3.72,Label 178 | C20,faces,1,3.75,Label 179 | C20,houses,1,3.75,Label 180 | C20,pasta,1,3.75,Label 181 | C20,beds,1,3.75,Label 182 | M11,faces,1,3.82,Label 183 | M11,houses,0,3.82,Label 184 | M11,pasta,1,3.82,Label 185 | M11,beds,1,3.82,Label 186 | C9,beds,1,3.82,Label 187 | C9,faces,1,3.82,Label 188 | C9,houses,1,3.82,Label 189 | C9,pasta,1,3.82,Label 190 | C24,faces,1,3.85,Label 191 | C24,houses,0,3.85,Label 192 | C24,pasta,0,3.85,Label 193 | C24,beds,1,3.85,Label 194 | C22,faces,0,3.92,Label 195 | C22,houses,0,3.92,Label 196 | C22,pasta,1,3.92,Label 197 | C22,beds,1,3.92,Label 198 | C8,faces,1,3.92,Label 199 | C8,houses,1,3.92,Label 200 | C8,pasta,1,3.92,Label 201 | C8,beds,1,3.92,Label 202 | M4,faces,1,3.96,Label 203 | M4,houses,1,3.96,Label 
204 | M4,pasta,1,3.96,Label 205 | M4,beds,1,3.96,Label 206 | M6,faces,0,4.5,Label 207 | M6,houses,1,4.5,Label 208 | M6,pasta,1,4.5,Label 209 | M6,beds,0,4.5,Label 210 | C19,faces,1,4.14,Label 211 | C19,houses,0,4.14,Label 212 | C19,pasta,0,4.14,Label 213 | C19,beds,1,4.14,Label 214 | C1,faces,1,4.16,Label 215 | C1,houses,1,4.16,Label 216 | C1,pasta,1,4.16,Label 217 | C1,beds,1,4.16,Label 218 | M19,beds,1,4.16,Label 219 | M19,faces,0,4.16,Label 220 | M19,houses,0,4.16,Label 221 | M19,pasta,1,4.16,Label 222 | C11,faces,1,4.22,Label 223 | C11,houses,0,4.22,Label 224 | C11,pasta,1,4.22,Label 225 | C11,beds,1,4.22,Label 226 | M9,faces,1,4.26,Label 227 | M9,houses,1,4.26,Label 228 | M9,pasta,1,4.26,Label 229 | M9,beds,1,4.26,Label 230 | M2,faces,1,4.28,Label 231 | M2,houses,0,4.28,Label 232 | M2,pasta,1,4.28,Label 233 | M2,beds,1,4.28,Label 234 | C5,faces,1,4.29,Label 235 | C5,houses,1,4.29,Label 236 | C5,pasta,1,4.29,Label 237 | C5,beds,1,4.29,Label 238 | M30,beds,1,4.33,Label 239 | M30,faces,1,4.33,Label 240 | M30,houses,0,4.33,Label 241 | M30,pasta,1,4.33,Label 242 | C13,faces,0,4.38,Label 243 | C13,houses,1,4.38,Label 244 | C13,pasta,0,4.38,Label 245 | C13,beds,1,4.38,Label 246 | C4,faces,1,4.55,Label 247 | C4,houses,1,4.55,Label 248 | C4,pasta,1,4.55,Label 249 | C4,beds,1,4.55,Label 250 | C14,faces,1,4.57,Label 251 | C14,houses,1,4.57,Label 252 | C14,pasta,0,4.57,Label 253 | C14,beds,1,4.57,Label 254 | M17,faces,1,4.58,Label 255 | M17,houses,1,4.58,Label 256 | M17,pasta,1,4.58,Label 257 | M17,beds,1,4.58,Label 258 | C2,faces,1,4.6,Label 259 | C2,houses,1,4.6,Label 260 | C2,pasta,1,4.6,Label 261 | C2,beds,1,4.6,Label 262 | C23,faces,0,4.62,Label 263 | C23,houses,1,4.62,Label 264 | C23,pasta,1,4.62,Label 265 | C23,beds,0,4.62,Label 266 | M20,faces,0,4.64,Label 267 | M20,houses,0,4.64,Label 268 | M20,pasta,1,4.64,Label 269 | M20,beds,1,4.64,Label 270 | M21,faces,1,4.64,Label 271 | M21,houses,1,4.64,Label 272 | M21,pasta,1,4.64,Label 273 | M21,beds,1,4.64,Label 274 | 
C21,faces,1,4.73,Label 275 | C21,houses,0,4.73,Label 276 | C21,pasta,1,4.73,Label 277 | C21,beds,1,4.73,Label 278 | M24,faces,1,4.82,Label 279 | M24,houses,1,4.82,Label 280 | M24,pasta,1,4.82,Label 281 | M24,beds,1,4.82,Label 282 | M5,faces,0,4.84,Label 283 | M5,houses,0,4.84,Label 284 | M5,pasta,0,4.84,Label 285 | M5,beds,1,4.84,Label 286 | M7,faces,1,4.89,Label 287 | M7,houses,1,4.89,Label 288 | M7,pasta,1,4.89,Label 289 | M7,beds,0,4.89,Label 290 | M8,faces,1,4.89,Label 291 | M8,houses,1,4.89,Label 292 | M8,pasta,1,4.89,Label 293 | M8,beds,1,4.89,Label 294 | C18,faces,0,4.95,Label 295 | C18,houses,1,4.95,Label 296 | C18,pasta,1,4.95,Label 297 | C18,beds,1,4.95,Label 298 | M25,faces,1,4.96,Label 299 | M25,houses,1,4.96,Label 300 | M25,pasta,1,4.96,Label 301 | M25,beds,1,4.96,Label 302 | MSCH47,faces,1,2.01,No Label 303 | MSCH47,houses,0,2.01,No Label 304 | MSCH47,pasta,1,2.01,No Label 305 | MSCH47,beds,0,2.01,No Label 306 | MSCH50,faces,0,2.03,No Label 307 | MSCH50,houses,0,2.03,No Label 308 | MSCH50,pasta,0,2.03,No Label 309 | MSCH50,beds,0,2.03,No Label 310 | MSCH51,faces,0,2.07,No Label 311 | MSCH51,houses,0,2.07,No Label 312 | MSCH51,pasta,0,2.07,No Label 313 | MSCH51,beds,0,2.07,No Label 314 | MSCH44,faces,0,2.25,No Label 315 | MSCH44,houses,0,2.25,No Label 316 | MSCH44,pasta,0,2.25,No Label 317 | MSCH44,beds,0,2.25,No Label 318 | MSCH52,faces,0,2.5,No Label 319 | MSCH52,houses,1,2.5,No Label 320 | MSCH52,pasta,0,2.5,No Label 321 | MSCH52,beds,1,2.5,No Label 322 | MSCH38,faces,0,2.59,No Label 323 | MSCH38,houses,0,2.59,No Label 324 | MSCH38,pasta,1,2.59,No Label 325 | MSCH38,beds,0,2.59,No Label 326 | MSCH43,faces,0,2.71,No Label 327 | MSCH43,houses,0,2.71,No Label 328 | MSCH43,pasta,0,2.71,No Label 329 | MSCH43,beds,0,2.71,No Label 330 | MSCH49,faces,0,2.88,No Label 331 | MSCH49,houses,0,2.88,No Label 332 | MSCH49,pasta,0,2.88,No Label 333 | MSCH49,beds,0,2.88,No Label 334 | MSCH45,faces,0,2.9,No Label 335 | MSCH45,houses,0,2.9,No Label 336 | 
MSCH45,pasta,0,2.9,No Label 337 | MSCH45,beds,1,2.9,No Label 338 | MSCH42,faces,1,2.93,No Label 339 | MSCH42,houses,0,2.93,No Label 340 | MSCH42,pasta,0,2.93,No Label 341 | MSCH42,beds,0,2.93,No Label 342 | MSCH53,faces,1,2.99,No Label 343 | MSCH53,houses,1,2.99,No Label 344 | MSCH53,pasta,0,2.99,No Label 345 | MSCH53,beds,0,2.99,No Label 346 | SCH35,faces,0,3.02,No Label 347 | SCH35,houses,0,3.02,No Label 348 | SCH35,pasta,0,3.02,No Label 349 | SCH35,beds,0,3.02,No Label 350 | MSCH40,faces,0,3.02,No Label 351 | MSCH40,houses,1,3.02,No Label 352 | MSCH40,pasta,0,3.02,No Label 353 | MSCH40,beds,1,3.02,No Label 354 | SCH34,faces,0,3.06,No Label 355 | SCH34,houses,0,3.06,No Label 356 | SCH34,pasta,0,3.06,No Label 357 | SCH34,beds,0,3.06,No Label 358 | SCH33,faces,0,3.06,No Label 359 | SCH33,houses,0,3.06,No Label 360 | SCH33,pasta,0,3.06,No Label 361 | SCH33,beds,0,3.06,No Label 362 | MSCH41,faces,0,3.18,No Label 363 | MSCH41,houses,0,3.18,No Label 364 | MSCH41,pasta,0,3.18,No Label 365 | MSCH41,beds,0,3.18,No Label 366 | SCH37,beds,0,3.27,No Label 367 | SCH37,faces,1,3.27,No Label 368 | SCH37,houses,0,3.27,No Label 369 | SCH37,pasta,1,3.27,No Label 370 | SCH32,faces,1,3.27,No Label 371 | SCH32,houses,0,3.27,No Label 372 | SCH32,pasta,0,3.27,No Label 373 | SCH32,beds,0,3.27,No Label 374 | SCH36,beds,0,3.33,No Label 375 | SCH36,faces,0,3.33,No Label 376 | SCH36,houses,1,3.33,No Label 377 | SCH36,pasta,1,3.33,No Label 378 | SCH11,beds,0,3.41,No Label 379 | SCH12,faces,0,3.41,No Label 380 | SCH12,houses,0,3.41,No Label 381 | SCH12,pasta,0,3.41,No Label 382 | SCH12,beds,0,3.41,No Label 383 | SCH18,faces,0,3.45,No Label 384 | SCH18,houses,0,3.45,No Label 385 | SCH18,pasta,0,3.45,No Label 386 | SCH18,beds,0,3.45,No Label 387 | MSCH48,faces,0,3.5,No Label 388 | MSCH48,houses,1,3.5,No Label 389 | MSCH48,pasta,0,3.5,No Label 390 | MSCH48,beds,0,3.5,No Label 391 | SCH25,faces,0,3.54,No Label 392 | SCH25,houses,1,3.54,No Label 393 | SCH25,pasta,1,3.54,No Label 394 | 
SCH25,beds,0,3.54,No Label 395 | SCH31,faces,0,3.71,No Label 396 | SCH31,houses,0,3.71,No Label 397 | SCH31,pasta,0,3.71,No Label 398 | SCH31,beds,0,3.71,No Label 399 | MSCH46,faces,0,3.76,No Label 400 | MSCH46,houses,0,3.76,No Label 401 | MSCH46,pasta,1,3.76,No Label 402 | MSCH46,beds,0,3.76,No Label 403 | SCH11,faces,1,3.82,No Label 404 | SCH11,houses,1,3.82,No Label 405 | SCH11,pasta,1,3.82,No Label 406 | SCH29,faces,0,3.83,No Label 407 | SCH29,houses,0,3.83,No Label 408 | SCH29,pasta,0,3.83,No Label 409 | SCH29,beds,0,3.83,No Label 410 | MSCH39,beds,1,3.93,No Label 411 | MSCH39,pasta,0,3.93,No Label 412 | MSCH39,houses,0,3.94,No Label 413 | MSCH39,faces,0,3.94,No Label 414 | SCH28,faces,0,4.02,No Label 415 | SCH28,houses,0,4.02,No Label 416 | SCH28,pasta,0,4.02,No Label 417 | SCH28,beds,0,4.02,No Label 418 | SCH22,faces,0,4.02,No Label 419 | SCH22,houses,0,4.02,No Label 420 | SCH22,pasta,0,4.02,No Label 421 | SCH22,beds,1,4.02,No Label 422 | SCH24,faces,0,4.07,No Label 423 | SCH24,houses,0,4.07,No Label 424 | SCH24,pasta,1,4.07,No Label 425 | SCH24,beds,0,4.07,No Label 426 | SCH27,faces,0,4.09,No Label 427 | SCH27,houses,0,4.09,No Label 428 | SCH27,pasta,1,4.09,No Label 429 | SCH27,beds,0,4.09,No Label 430 | SCH17,faces,0,4.25,No Label 431 | SCH17,houses,0,4.25,No Label 432 | SCH17,pasta,1,4.25,No Label 433 | SCH17,beds,0,4.25,No Label 434 | SCH10,faces,0,4.32,No Label 435 | SCH10,houses,0,4.32,No Label 436 | SCH10,pasta,0,4.32,No Label 437 | SCH10,beds,1,4.32,No Label 438 | SCH9,faces,0,4.37,No Label 439 | SCH9,houses,0,4.37,No Label 440 | SCH9,pasta,0,4.37,No Label 441 | SCH9,beds,0,4.37,No Label 442 | SCH20,faces,0,4.39,No Label 443 | SCH20,houses,0,4.39,No Label 444 | SCH20,pasta,0,4.39,No Label 445 | SCH20,beds,0,4.39,No Label 446 | SCH6,faces,0,4.41,No Label 447 | SCH6,houses,0,4.41,No Label 448 | SCH6,pasta,0,4.41,No Label 449 | SCH6,beds,0,4.41,No Label 450 | SCH7,faces,1,4.41,No Label 451 | SCH7,houses,0,4.41,No Label 452 | SCH7,pasta,0,4.41,No Label 
453 | SCH7,beds,0,4.41,No Label 454 | SCH15,faces,1,4.42,No Label 455 | SCH15,houses,0,4.42,No Label 456 | SCH15,pasta,0,4.42,No Label 457 | SCH15,beds,0,4.42,No Label 458 | SCH30,faces,0,4.44,No Label 459 | SCH30,houses,0,4.44,No Label 460 | SCH30,pasta,1,4.44,No Label 461 | SCH30,beds,0,4.44,No Label 462 | SCH3,faces,0,4.47,No Label 463 | SCH3,houses,0,4.47,No Label 464 | SCH3,pasta,0,4.47,No Label 465 | SCH3,beds,0,4.47,No Label 466 | SCH26,faces,0,4.47,No Label 467 | SCH26,houses,0,4.47,No Label 468 | SCH26,pasta,1,4.47,No Label 469 | SCH26,beds,0,4.47,No Label 470 | SCH8,faces,0,4.52,No Label 471 | SCH8,houses,0,4.52,No Label 472 | SCH8,pasta,0,4.52,No Label 473 | SCH8,beds,0,4.52,No Label 474 | SCH16,faces,0,4.55,No Label 475 | SCH16,houses,0,4.55,No Label 476 | SCH16,pasta,0,4.55,No Label 477 | SCH16,beds,1,4.55,No Label 478 | SCH14,faces,0,4.58,No Label 479 | SCH14,houses,0,4.58,No Label 480 | SCH14,pasta,0,4.58,No Label 481 | SCH14,beds,1,4.58,No Label 482 | SCH2,faces,0,4.61,No Label 483 | SCH2,houses,0,4.61,No Label 484 | SCH2,pasta,0,4.61,No Label 485 | SCH2,beds,0,4.61,No Label 486 | SCH5,faces,0,4.61,No Label 487 | SCH5,houses,0,4.61,No Label 488 | SCH5,pasta,0,4.61,No Label 489 | SCH5,beds,0,4.61,No Label 490 | SCH13,faces,0,4.75,No Label 491 | SCH13,houses,0,4.75,No Label 492 | SCH13,pasta,0,4.75,No Label 493 | SCH13,beds,0,4.75,No Label 494 | SCH21,faces,0,4.76,No Label 495 | SCH21,houses,0,4.76,No Label 496 | SCH21,pasta,0,4.76,No Label 497 | SCH21,beds,0,4.76,No Label 498 | SCH19,faces,0,4.79,No Label 499 | SCH19,houses,0,4.79,No Label 500 | SCH19,pasta,0,4.79,No Label 501 | SCH19,beds,1,4.79,No Label 502 | SCH23,faces,0,4.82,No Label 503 | SCH23,houses,0,4.82,No Label 504 | SCH23,pasta,0,4.82,No Label 505 | SCH23,beds,0,4.82,No Label 506 | SCH1,faces,0,4.82,No Label 507 | SCH1,houses,0,4.82,No Label 508 | SCH1,pasta,0,4.82,No Label 509 | SCH1,beds,0,4.82,No Label 510 | MSCH66,faces,0,3.5,No Label 511 | MSCH66,houses,0,3.5,No Label 512 | 
MSCH66,pasta,1,3.5,No Label 513 | MSCH66,beds,0,3.5,No Label 514 | MSCH67,faces,0,3.24,No Label 515 | MSCH67,houses,1,3.24,No Label 516 | MSCH67,pasta,0,3.24,No Label 517 | MSCH67,beds,1,3.24,No Label 518 | MSCH68,faces,0,3.94,No Label 519 | MSCH68,houses,0,3.94,No Label 520 | MSCH68,pasta,0,3.94,No Label 521 | MSCH68,beds,0,3.94,No Label 522 | MSCH69,faces,0,2.72,No Label 523 | MSCH69,houses,1,2.72,No Label 524 | MSCH69,pasta,1,2.72,No Label 525 | MSCH69,beds,0,2.72,No Label 526 | MSCH70,faces,0,2.31,No Label 527 | MSCH70,houses,0,2.31,No Label 528 | MSCH70,pasta,0,2.31,No Label 529 | MSCH70,beds,1,2.31,No Label 530 | MSCH71,faces,1,3.14,No Label 531 | MSCH71,houses,1,3.14,No Label 532 | MSCH71,pasta,1,3.14,No Label 533 | MSCH71,beds,0,3.14,No Label 534 | MSCH72,faces,1,3.72,No Label 535 | MSCH72,houses,1,3.72,No Label 536 | MSCH72,pasta,0,3.72,No Label 537 | MSCH72,beds,0,3.72,No Label 538 | MSCH73,faces,0,3.1,No Label 539 | MSCH73,houses,0,3.1,No Label 540 | MSCH73,pasta,0,3.1,No Label 541 | MSCH73,beds,0,3.1,No Label 542 | MSCH74,faces,1,2.34,No Label 543 | MSCH74,houses,0,2.34,No Label 544 | MSCH74,pasta,0,2.34,No Label 545 | MSCH74,beds,1,2.34,No Label 546 | MSCH75,faces,0,3.67,No Label 547 | MSCH75,houses,0,3.67,No Label 548 | MSCH75,pasta,0,3.67,No Label 549 | MSCH75,beds,0,3.66,No Label 550 | MSCH76,faces,0,2.58,No Label 551 | MSCH76,houses,0,2.58,No Label 552 | MSCH76,pasta,0,2.58,No Label 553 | MSCH76,beds,0,2.58,No Label 554 | MSCH77,faces,0,2.55,No Label 555 | MSCH77,houses,0,2.55,No Label 556 | MSCH77,pasta,0,2.55,No Label 557 | MSCH77,beds,1,2.55,No Label 558 | MSCH78,faces,0,2.43,No Label 559 | MSCH78,houses,0,2.43,No Label 560 | MSCH78,pasta,0,2.43,No Label 561 | MSCH78,beds,1,2.43,No Label 562 | MSCH79,faces,0,2.7,No Label 563 | MSCH79,houses,1,2.7,No Label 564 | MSCH79,pasta,0,2.7,No Label 565 | MSCH79,beds,1,2.7,No Label 566 | MSCH80,faces,0,2.76,No Label 567 | MSCH80,houses,0,2.76,No Label 568 | MSCH80,pasta,0,2.76,No Label 569 | 
MSCH80,beds,0,2.76,No Label 570 | MSCH81,faces,1,2.84,No Label 571 | MSCH81,houses,0,2.84,No Label 572 | MSCH81,pasta,0,2.84,No Label 573 | MSCH81,beds,0,2.84,No Label 574 | MSCH82,faces,1,2.46,No Label 575 | MSCH82,houses,0,2.46,No Label 576 | MSCH82,pasta,1,2.46,No Label 577 | MSCH82,beds,0,2.46,No Label 578 | MSCH83,faces,0,2.37,No Label 579 | MSCH83,houses,0,2.37,No Label 580 | MSCH83,pasta,1,2.37,No Label 581 | MSCH83,beds,0,2.37,No Label 582 | MSCH84,faces,0,2.83,No Label 583 | MSCH84,houses,0,2.83,No Label 584 | MSCH84,pasta,1,2.83,No Label 585 | MSCH84,beds,0,2.83,No Label 586 | MSCH85,faces,0,2.69,No Label 587 | MSCH85,houses,0,2.69,No Label 588 | MSCH85,pasta,0,2.69,No Label 589 | MSCH85,beds,0,2.69,No Label 590 | -------------------------------------------------------------------------------- /data/ws.feather: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mcfrank/tidyverse-tutorial/23a6a29c75041a014de29254934117ac30d7f45b/data/ws.feather -------------------------------------------------------------------------------- /tidyverse_examples.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Tidyverse Examples" 3 | author: "Psych 251 Staff" 4 | date: "10/2/2019" 5 | output: html_document 6 | --- 7 | 8 | ```{r setup, include=FALSE} 9 | library(tidyverse) 10 | ``` 11 | 12 | # Manipulating data with dplyr 13 | 14 | Let's use `mtcars`, a built in dataset of cars and their miles/gallon (mpg), number of cylinders (cyl), displacement (disp), gross horsepower (hp), etc. 15 | 16 | ```{r} 17 | mtcars 18 | ``` 19 | 20 | **Exercise**: First, summarise the average miles/gallon (mpg) across the entire dataset. 21 | 22 | ```{r} 23 | mtcars %>% 24 | summarise(mean = mean(mpg)) 25 | ``` 26 | 27 | **Exercise**: A car can either have 4, 6, or 8 cylinders (cyl). Summarise the average mpg, broken down by the number of cylinders. 
Hint: You may want to "group" by cyl in order to do this. 28 | 29 | ```{r} 30 | mtcars %>% 31 | group_by(cyl) %>% 32 | summarise(mean = mean(mpg)) 33 | ``` 34 | 35 | **Exercise**: In addition to the means, add standard deviations to this summary (still grouped by cyl). 36 | 37 | ```{r} 38 | mtcars %>% 39 | group_by(cyl) %>% 40 | summarise(mean = mean(mpg), 41 | sd = sd(mpg)) 42 | 43 | ``` 44 | 45 | **BONUS**: Let's visualize! Use ggplot (included in the tidyverse package) to make a scatter plot of mpg by horsepower. If you are feeling extra fancy, you can add a smoothing line. (Hint: Google "geom_smooth() scatterplot".) 46 | 47 | ```{r} 48 | ggplot(mtcars, 49 | aes(x = hp, y = mpg)) + 50 | geom_point() + 51 | geom_smooth() 52 | ``` 53 | 54 | 55 | 56 | # Reshaping with tidyr 57 | 58 | ## From long to wide and back again 59 | 60 | We will first use a built-in table from the `tidyr` package: `table3`. We can use `help(table3)` to read its documentation. 61 | 62 | ```{r} 63 | table3 64 | help(table3) 65 | ``` 66 | 67 | `table3` is in tidy format. Make this into wide data. 68 | 69 | ```{r} 70 | table3_wide <- table3 %>% 71 | spread(year, rate) 72 | ``` 73 | 74 | Now make it back into tidy data. 75 | 76 | ```{r} 77 | table3_long <- table3_wide %>% 78 | gather(year, rate, `1999`:`2000`) 79 | ``` 80 | 81 | Here are examples of more recently published functions for converting wide to long or long to wide. These two functions have more straightforward names and argument names, which makes them easier to use. 82 | 83 | ```{r} 84 | table3_wide <- table3 %>% 85 | pivot_wider(names_from = year, values_from = rate) 86 | 87 | table3_long <- table3_wide %>% 88 | pivot_longer(cols = `1999`:`2000`, names_to = "year", values_to = "rate") 89 | ``` 90 | 91 | ## From wide to long without seeing the tidy version 92 | 93 | These are pre-post data on children's arithmetic scores from an RCT (Randomized Controlled Trial) in which they were assigned either to CNTL (control) or MA (mental abacus math intervention). 
They were tested twice, once in 2015 and once in 2016. The paper can be found at https://jnc.psychopen.eu/article/view/106. 94 | 95 | ```{r} 96 | majic <- read_csv("data/majic.csv") 97 | ``` 98 | 99 | Make these tidy. 100 | 101 | ```{r} 102 | majic_long <- majic %>% 103 | gather(year, score, `2015`, `2016`) 104 | 105 | # The new way, using pivot_longer()! 106 | majic_long <- majic %>% 107 | pivot_longer(cols = c(`2015`, `2016`), names_to = "year", values_to = "score") 108 | ``` 109 | 110 | **OPTIONAL**: convert these back to wide format. 111 | 112 | ```{r} 113 | majic_wide <- majic_long %>% 114 | spread(year, score) 115 | 116 | majic_wide <- majic_long %>% 117 | pivot_wider(names_from = year, values_from = score) 118 | ``` -------------------------------------------------------------------------------- /tidyverse_tutorial.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Medium Data in the Tidyverse" 3 | author: "Mike Frank" 4 | date: "6/22/2017, updated 10/14/2019" 5 | output: html_document 6 | --- 7 | 8 | Starting note: The best reference for this material is Hadley Wickham's [R for Data Science](http://r4ds.had.co.nz/). My contribution here is to translate this reference for psychology. 9 | 10 | ```{r setup, include=FALSE} 11 | library(tidyverse) 12 | knitr::opts_chunk$set(echo = TRUE, cache=TRUE) 13 | ``` 14 | 15 | 16 | # Goals and Introduction 17 | 18 | By the end of this tutorial, you will know: 19 | 20 | + What "tidy data" is and why it's an awesome format 21 | + How to do some stuff with tidy data 22 | + How to get your data to be tidy 23 | + Some tips'n'tricks for dealing with "medium data" in R 24 | 25 | This intro will describe a few concepts that you will need to know, using the famous `iris` dataset that comes with base R. 26 | 27 | ## Data frames 28 | 29 | The basic data structure we're working with is the data frame, or `tibble` (in the `tidyverse` reimplementation). 
30 | Data frames have rows and columns, and each column has a distinct data type. The implementation in Python's `pandas` is distinct but most of the concepts are the same. 31 | 32 | `iris` is a data frame showing the measurements of a bunch of different instances of iris flowers from different species. (Sepals are the things outside the petals of the flowers that protect the petals while it's blooming, petals are the actual petals of the flower). 33 | 34 | ```{r} 35 | head(iris) 36 | ``` 37 | 38 | > **Exercise.** R is a very flexible programming language, which is both a strength and a weakness. There are many ways to get a particular value of a variable in a data frame. You can use `$` to access a column, as in `iris$Sepal.Length` or you can treat the data frame as a matrix, e.g. `iris[1,1]` or even as a list, as in `iris[[1]]`. You can also mix numeric references and named references, e.g. `iris[["Sepal.Length"]]`. Turn to your neighbor (and/or google) and find as many ways as you can to access the petal length of the third iris in the dataset (row 3). 39 | 40 | ```{r} 41 | # fill me in with calls to the iris dataset that all return the same cell (third from the top, Petal Length). 42 | 43 | ``` 44 | 45 | > **Discussion.** Why might some ways of doing this be better than others? 46 | 47 | ## Tidy data 48 | 49 | > “Tidy datasets are all alike, but every messy dataset is messy in its own way.” –– Hadley Wickham 50 | 51 | Here's the basic idea: In tidy data, every row is a single **observation** (trial), and every column describes a **variable** with some **value** describing that trial. 52 | 53 | And if you know that data are formatted this way, then you can do amazing things, basically because you can take a uniform approach to the dataset. From R4DS: 54 | 55 | "There’s a general advantage to picking one consistent way of storing data. 
If you have a consistent data structure, it’s easier to learn the tools that work with it because they have an underlying uniformity. There’s a specific advantage to placing variables in columns because it allows R’s vectorised nature to shine." 56 | 57 | `iris` is a tidy dataset. Each row is an observation of an individual iris, each column is a different variable. 58 | 59 | > **Exercise.** Take a look at these data, as downloaded from Amazon Mechanical Turk. They describe an experiment where people had to estimate the price of a dog, a plasma TV, and a sushi dinner (and they were primed with anchors that differed across conditions). It's a replication of a paper by [Janiszewski & Uy (2008)](http://warrington.ufl.edu/departments/mkt/docs/janiszewski/Anchor.pdf). Examine this dataset with your neighbor and sketch out what a tidy version of the dataset would look like (using paper and pencil). 60 | 61 | ```{r} 62 | ju <- read_csv("data/janiszewski_rep_cleaned.csv") 63 | head(ju) 64 | ``` 65 | 66 | ## Functions and Pipes 67 | 68 | Everything you typically want to do in statistical programming uses **functions**. `mean` is a good example. `mean` takes one **argument**, a numeric vector. 69 | 70 | ```{r} 71 | mean(iris$Petal.Length) 72 | ``` 73 | 74 | We're going to call this **applying** the function `mean` to the variable `Petal.Length`. 75 | 76 | Pipes are a way to write strings of functions more easily. They bring the first argument of the function to the beginning. So you can write: 77 | 78 | ```{r} 79 | iris$Petal.Length %>% mean 80 | ``` 81 | 82 | That's not very useful yet, but when you start **nesting** functions, it gets better. 
83 | 84 | ```{r} 85 | mean(unique(iris$Petal.Length)) 86 | iris$Petal.Length %>% unique() %>% mean(na.rm=TRUE) 87 | ``` 88 | 89 | or 90 | 91 | ```{r} 92 | round(mean(unique(iris$Petal.Length)), digits = 2) 93 | iris$Petal.Length %>% unique %>% mean %>% round(digits = 2) 94 | 95 | # indenting makes things even easier to read 96 | iris$Petal.Length %>% 97 | unique %>% 98 | mean %>% 99 | round(digits = 2) 100 | ``` 101 | 102 | This can be super helpful for writing strings of functions so that they are readable and distinct. 103 | 104 | We'll be doing a lot of piping of functions with multiple arguments later, and it will really help keep our syntax simple. 105 | 106 | > **Exercise.** Rewrite these commands using pipes and check that they do the same thing! (Or at least produce the same output). Unpiped version: 107 | 108 | ```{r} 109 | length(unique(iris$Species)) # number of species 110 | ``` 111 | 112 | Piped version: 113 | 114 | ```{r} 115 | iris$Species %>% 116 | unique %>% 117 | length 118 | ``` 119 | 120 | ## `ggplot2` and tidy data 121 | 122 | The last piece of our workflow here is going to be the addition of visualization elements. `ggplot2` is a plotting package that easily takes advantage of tidy data. ggplots have two important parts (there are of course more): 123 | 124 | + `aes` - the aesthetic mapping, or which data variables get mapped to which visual variables (x, y, color, symbol, etc.) 125 | + `geom` - the plotting objects that represent the data (points, lines, shapes, etc.) 126 | 127 | ```{r} 128 | iris %>% 129 | ggplot(aes(x = Sepal.Width, y = Sepal.Length, col = Species)) + 130 | geom_point() 131 | ``` 132 | 133 | And just to let you know my biases, I like `theme_few` from `ggthemes` and `scale_color_solarized` as my palette. 
134 | 135 | ```{r} 136 | iris %>% 137 | ggplot(aes(Sepal.Width, Sepal.Length, col = Species)) + 138 | geom_point() + 139 | ggthemes::theme_few() + 140 | ggthemes::scale_color_solarized() 141 | ``` 142 | 143 | 144 | # Tidy Data Analysis with `dplyr` 145 | 146 | Reference: [R4DS Chapter 5](http://r4ds.had.co.nz/transform.html) 147 | 148 | Let's take a psychological dataset. Here are the raw data from [Stiller, Goodman, & Frank (2015)](http://langcog.stanford.edu/papers_new/SGF-LLD-2015.pdf). 149 | 150 | These data are tidy: each row describes a single trial, each column describes some aspect of that trial, including the participant's id (`subid`), age (`age`), condition (`condition` - "Label" is the experimental condition, "No Label" is the control), item (`item` - which thing Furble was trying to find). 151 | 152 | We are going to manipulate these data using "verbs" from `dplyr`. I'll only teach four verbs, the most common in my workflow (but there are many other useful ones): 153 | 154 | + `filter` - remove rows by some logical condition 155 | + `mutate` - create new columns 156 | + `group_by` - group the data into subsets by some column 157 | + `summarize` - apply some function over columns in each group 158 | 159 | 160 | ## Exploring and characterizing the dataset 161 | 162 | 163 | ```{r} 164 | sgf <- read_csv("data/stiller_scales_data.csv") 165 | sgf 166 | ``` 167 | 168 | Inspect the various variables before you start any analysis. Lots of people recommend `summary` but TBH I don't find it useful. 169 | 170 | ```{r} 171 | summary(sgf) 172 | ``` 173 | 174 | This output just feels overwhelming and uninformative. 175 | 176 | You can look at each variable by itself: 177 | 178 | ```{r} 179 | unique(sgf$condition) 180 | 181 | sgf$subid %>% 182 | unique %>% 183 | length 184 | ``` 185 | 186 | Or use interactive tools like `View` or `DT::datatable` (which I really like). 
187 | 188 | ```{r} 189 | View(sgf) 190 | DT::datatable(sgf) 191 | ``` 192 | 193 | ## Filtering & Mutating 194 | 195 | There are lots of reasons you might want to remove *rows* from your dataset, including getting rid of outliers, selecting subpopulations, etc. `filter` is a verb (function) that takes a data frame as its first argument, and then as its second takes the **condition** you want to filter on. 196 | 197 | So if you wanted to look only at two year olds, you could do this. (Note you can give two conditions, could also do `age > 2 & age < 3`). (equivalent: `filter(sgf, age > 2, age < 3)`) 198 | 199 | Note that we're going to be using pipes with functions over data frames here. The way this works is that: 200 | 201 | + `dplyr` verbs always take the data frame as their first argument, and 202 | + because pipes pull out the first argument, the data frame just gets passed through successive operations 203 | + so you can read a pipe chain as "take this data frame and first do this, then do this, then do that." 204 | 205 | This is essentially the huge insight of `dplyr`: you can chain verbs into readable and efficient sequences of operations over dataframes, provided 1) the verbs all have the same syntax (which they do) and 2) the data all have the same structure (which they do if they are tidy). 206 | 207 | OK, so filtering: 208 | 209 | ```{r} 210 | sgf %>% 211 | filter(age > 2, 212 | age < 3) 213 | ``` 214 | 215 | **Exercise.** Filter out only the "face" trial in the "Label" condition. 216 | 217 | ```{r} 218 | sgf %>% 219 | filter(condition == "Label", 220 | item == "faces") 221 | 222 | sgf[sgf$condition == "Label" & sgf$item == "faces", ] # all the columns 223 | ``` 224 | 225 | There are also times when you want to add or remove *columns*. You might want to remove columns to simplify the dataset. There's not much to simplify here, but if you wanted to do that, the verb is `select`. 
226 | 227 | ```{r} 228 | sgf %>% 229 | select(subid, age, correct) 230 | 231 | sgf %>% 232 | select(-condition) 233 | 234 | sgf %>% 235 | select(1) 236 | 237 | sgf %>% 238 | select(starts_with("sub")) 239 | 240 | # learn about this with ?select 241 | ``` 242 | 243 | Perhaps more useful is *adding columns*. You might do this perhaps to compute some kind of derived variable. `mutate` is the verb for these situations - it allows you to add a column. Let's add a discrete age group factor to our dataset. 244 | 245 | ```{r} 246 | sgf <- sgf %>% 247 | mutate(age_group = cut(age, 2:5, include.lowest = TRUE), 248 | age_group_halfyear = cut(age, seq(2,5,.5), include.lowest = TRUE)) 249 | 250 | # sgf$age_group <- cut(sgf$age, 2:5, include.lowest = TRUE) 251 | # sgf$age_group <- with(sgf, cut(age, 2:5, include.lowest = TRUE)) 252 | 253 | head(sgf$age_group) 254 | ``` 255 | 256 | ## Standard psychological descriptives 257 | 258 | We typically describe datasets at the level of subjects, not trials. We need two verbs to get a summary at the level of subjects: `group_by` and `summarise` (kiwi spelling). Grouping alone doesn't do much. 259 | 260 | ```{r} 261 | sgf %>% 262 | group_by(age_group) 263 | ``` 264 | 265 | All it does is add a grouping marker. 266 | 267 | What `summarise` does is to *apply a function* to a part of the dataset to create a new summary dataset. So we can apply the function `mean` to the dataset and get the grand mean. 268 | 269 | ```{r} 270 | ## DO NOT DO THIS!!! 
271 | # foo <- initialize_the_thing_being_bound() 272 | # for (i in 1:length(unique(sgf$item))) { 273 | # for (j in 1:length(unique(sgf$condition))) { 274 | # this_data <- sgf[sgf$item == unique(sgf$item)[i] & 275 | # sgf$condition == unique(sgf$condition)[j],] 276 | # do_a_thing(this_data) 277 | # bind_together_somehow(this_data) 278 | # } 279 | # } 280 | 281 | sgf %>% 282 | summarise(correct = mean(correct)) 283 | ``` 284 | Note the syntax here: `summarise` takes multiple `new_column_name = function_to_be_applied_to_data(data_column)` entries in a list. Using this syntax, we can create more elaborate summary datasets also: 285 | 286 | ```{r} 287 | sgf %>% 288 | summarise(correct = mean(correct), 289 | n_observations = length(subid)) 290 | ``` 291 | 292 | Where these two verbs shine is in combination, though. Because `summarise` applies functions to columns in your *grouped data*, not just to the whole dataset! 293 | 294 | So we can group by age or condition or whatever else we want and then carry out the same procedure, and all of a sudden we are doing something extremely useful! 295 | 296 | ```{r} 297 | sgf_means <- sgf %>% 298 | group_by(age_group, condition) %>% 299 | summarise(correct = mean(correct), 300 | n_observations = length(subid)) 301 | sgf_means 302 | ``` 303 | 304 | These summary data are typically very useful for plotting. 305 | 306 | ```{r} 307 | ggplot(sgf_means, 308 | aes(x = age_group, y = correct, col = condition, group = condition)) + 309 | geom_line() + 310 | ylim(0,1) + 311 | ggthemes::theme_few() 312 | 313 | # sgf %>% 314 | # mutate(age_group) %>% 315 | # group_by() %>% 316 | # summarise %>% 317 | # ggplot() 318 | 319 | ``` 320 | 321 | > **Exercise.** One of the most important analytic workflows for psychological data is to take some function (e.g., the mean) *for each participant* and then look at grand means and variability *across participant means*. 
This analytic workflow requires grouping, summarising, and then grouping again and summarising again! Use `dplyr` to make the same table as above (`sgf_means`) but with means (and SDs if you want) computed across subject means, not across all data points. (The means will be pretty similar as this is a balanced design but in a case with lots of missing data, they will vary. In contrast, the SD doesn't even really make sense across the binary data before you aggregate across subjects.) 322 | 323 | ```{r} 324 | # exercise 325 | sgf_sub_means <- sgf %>% 326 | group_by(age_group, condition, subid) %>% 327 | summarise(correct = mean(correct)) 328 | 329 | sgf_grand_means <- sgf_sub_means %>% 330 | group_by(age_group, condition) %>% 331 | summarise(mean_correct = mean(correct), 332 | sd_correct = sd(correct)) 333 | ``` 334 | 335 | 336 | # Getting to Tidy with `tidyr` 337 | 338 | Reference: [R4DS Chapter 12](http://r4ds.had.co.nz/tidy-data.html) 339 | 340 | Psychological data often comes in two flavors: *long* and *wide* data. Long form data is *tidy*, but that format is less common. It's much more common to get *wide* data, in which every row is a case (e.g., a subject), and each column is a variable. In this format multiple trials (observations) are stored as columns. 341 | 342 | This can go a bunch of ways, for example, the most common might be to have subjects as rows and trials as columns. But here's an example from a real dataset on "unconscious arithmetic" from [Sklar et al. (2012)](http://www.pnas.org/content/109/48/19614.short). In it, *items* (particular arithmetic problems) are rows and *subjects* are columns. 343 | 344 | 345 | ```{r} 346 | sklar <- read_csv("data/sklar_data.csv") 347 | head(sklar) 348 | ``` 349 | 350 | ## Tidy verbs 351 | 352 | The two main verbs for tidying are `gather` and `spread`. (There are lots of others in the `tidyr` package if you want to split or merge columns etc.). 353 | 354 | First, let's go *away* from tidiness. 
We're going to `spread` a tidy dataset. Remember that tidy data has one observation in each row, but we want to "spread" it out so it's wide. (The metaphor works better in this description). This may not be helpful, but I think of the data as a long cream cheese pat, and I "spread" it over a wide bagel. 355 | 356 | Let's try it on the SGF data above. First we'll spread it so it's wide. I do this by indicating what column is going to be the *column labels* in the new data frame, here it's `item`, and what column is going to have the *values* in those columns, here it's `correct`: 357 | 358 | ```{r} 359 | sgf_wide <- sgf %>% 360 | spread(item, correct) 361 | head(sgf_wide) 362 | ``` 363 | 364 | Now you can see that there is no explicit specification that all those item columns, e.g. `faces`, `beds` are holding `correct` values, but the data are much more compact. (This form is easy to work with in Excel, so that's probably why people use it in psych). 365 | 366 | OK, let's go back to our original format. `gather` is about making wide data into tidy (long) data. When you gather a dataset you are "gathering" a bunch of columns (maybe that you previously `spread`). You specify what all the columns have in common (e.g., they are all `subject_id`s in the example above), and you say what measure they all contain (they all have RTs). So in that sense, it's the flip of `spread`. You did `spread(item, correct)` and now you'll `gather(item, correct, ...)`. The one extra argument is that you need to specify the columns that will go into `item`! 367 | 368 | ```{r} 369 | sgf_long <- sgf_wide %>% 370 | gather(item, correct, beds, faces, houses, pasta) 371 | head(sgf_long) 372 | head(sgf) 373 | ``` 374 | 375 | There are lots of flexible ways to specify these columns - you can enumerate their names like I did. 
376 | 377 | ```{r} 378 | # gather(item, correct, 5:8) 379 | # gather(item, correct, starts_with("foo")) 380 | ``` 381 | 382 | > **Exercise.** Take the Sklar data from above, where each column is a separate subject, and `gather` it so that it's a tidy dataset. What challenges come up? 383 | 384 | ```{r} 385 | sklar 386 | ``` 387 | 388 | ```{r} 389 | sklar_tidy <- sklar %>% 390 | gather(subid, rt, 8:28) 391 | 392 | sklar_tidy 393 | ``` 394 | 395 | Let's also go back and tidy an easier one: `iris`. 396 | 397 | ```{r} 398 | iris 399 | ``` 400 | 401 | ```{r} 402 | iris %>% 403 | mutate(iris_id = 1:nrow(iris)) %>% 404 | gather(measurement, centimeters, Sepal.Length, Petal.Length, Sepal.Width, Petal.Width) 405 | ``` 406 | 407 | 408 | 409 | 410 | # A bigger worked example: Wordbank data 411 | 412 | We're going to be using some data on vocabulary growth that we load from the Wordbank database. [Wordbank](http://wordbank.stanford.edu) is a database of children's language learning. 413 | 414 | (Go explore it for a moment). 415 | 416 | We're going to look at data from the English Words and Sentences form. These data describe the responses of parents to questions about whether their child says 680 different words. 417 | 418 | `dplyr` really shines in this context. 419 | 420 | ```{r} 421 | # to avoid dependency on the wordbankr package, we cache these data. 422 | # ws <- wordbankr::get_administration_data(language = "English", 423 | # form = "WS") 424 | 425 | ws <- read_csv("data/ws.csv") 426 | ``` 427 | 428 | Take a look at the data that comes out. 429 | 430 | ```{r} 431 | DT::datatable(ws) 432 | ``` 433 | 434 | 435 | ```{r} 436 | ggplot(ws, aes(x = age, y = production)) + 437 | geom_point() 438 | ``` 439 | 440 | Aside: How can we fix this plot? Suggestions from group? 
441 | 442 | ```{r} 443 | ggplot(ws, aes(x = age, y = production)) + 444 | geom_jitter(size = .5, width = .25, height = 0, alpha = .3) 445 | ``` 446 | 447 | Ok, let's plot the relationship between sex and productive vocabulary, using `dplyr`. 448 | 449 | ```{r} 450 | ggplot(ws, aes(x = age, y = production, col=sex)) + 451 | geom_jitter(size = .5, width = .25, height = 0, alpha = .3) 452 | ``` 453 | This is a bit useless, because the variability is so high. So let's summarise! 454 | 455 | > **Exercise.** Get means and SDs of productive vocabulary (`production`) by `age` and `sex`. Filter out the kids with missing data for `sex` (coded by `NA`). 456 | 457 | HINT: `is.na(x)` is useful for filtering. 458 | 459 | ```{r} 460 | # View(ws) 461 | ws_sex <- ws %>% 462 | filter(!is.na(sex)) %>% 463 | group_by(age, sex) %>% 464 | summarise(production_sd = sd(production), 465 | production = mean(production)) # overwrite `production` last so sd() sees the raw values 466 | ws_sex 467 | ``` 468 | 469 | Now plot: 470 | 471 | ```{r} 472 | ggplot(ws_sex, 473 | aes(x = age, y = production, col = sex)) + 474 | geom_line() + 475 | geom_jitter(data = filter(ws, !is.na(sex)), 476 | size = .5, width = .25, height = 0, alpha = .3) + 477 | geom_linerange(aes(ymin = production - production_sd, 478 | ymax = production + production_sd), 479 | position = position_dodge(width = .2)) # keep SDs from overlapping 480 | ``` 481 | 482 | **Bonus: Compute effect size.** 483 | 484 | ```{r} 485 | # instructor demo 486 | ``` 487 | 488 | 489 | 490 | # Exciting stuff you can do with this workflow 491 | 492 | Here are three little demos of exciting stuff that you can do (and that are facilitated by this workflow). 493 | 494 | ## Reading bigger files, faster 495 | 496 | A few other things will help you with "medium size data": 497 | 498 | + `read_csv` - Much faster than `read.csv` and has better defaults. 499 | + `dbplyr` - For connecting directly to databases. This package got forked off of `dplyr` recently but is very useful. 
500 | + `feather` - The `feather` package is a fast-loading binary format that is interoperable with python. All you need to know is `write_feather(d, "filename")` and `read_feather("filename")`. 501 | 502 | Here's a timing demo for `read.csv`, `read_csv`, and `read_feather`. 503 | 504 | ```{r} 505 | system.time(read.csv("data/ws.csv")) 506 | system.time(read_csv("data/ws.csv")) 507 | system.time(feather::read_feather("data/ws.feather")) 508 | ``` 509 | I see about a 2x speedup for `read_csv` (bigger for bigger files) and a 20x speedup for `read_feather`. 510 | 511 | ## Interactive visualization 512 | 513 | The `shiny` package is a great way to do interactives in R. We'll walk through constructing a simple shiny app for the wordbank data here. 514 | 515 | Technically, this is [embedded shiny](http://rmarkdown.rstudio.com/authoring_embedded_shiny.html) as opposed to freestanding shiny apps (like Wordbank). 516 | 517 | The two parts of a shiny app are `ui` and `server`. Both of these are funny in that they are lists of other things. The `ui` is a list of elements of an HTML page, and the server is a list of "reactive" elements. In brief, the UI says what should be shown, and the server specifies the mechanics of how to create those elements. 518 | 519 | This little embedded shiny app shows a page with two elements: 1) a selector that lets you choose a demographic field, and 2) a plot of vocabulary split by that field. 520 | 521 | The server then has the job of splitting the data by that field (for `ws_split`) and rendering the plot (`agePlot`). 522 | 523 | The one fancy thing that's going on here is that the app makes use of the calls `group_by_` (in the `dplyr` chain) and `aes_` (for the `ggplot` call). These `_` functions are a little complex - they are an example of "standard evaluation" that lets you feed *actual variables* into `ggplot2` and `dplyr` rather than *names of variables*. 
For more information, there is a nice vignette on standard and non-standard evaluation: try `vignette("nse")`. 524 | 525 | ```{r} 526 | shinyApp( 527 | ui <- fluidPage( 528 | selectInput("demographic", "Demographic Split Variable", 529 | c("Sex" = "sex", "Maternal Education" = "mom_ed", 530 | "Birth Order" = "birth_order", "Ethnicity" = "ethnicity")), 531 | plotOutput("agePlot") 532 | ), 533 | 534 | server <- function(input, output) { 535 | ws_split <- reactive({ 536 | ws %>% 537 | group_by_("age", input$demographic) %>% 538 | summarise(production_mean = mean(production)) 539 | }) 540 | 541 | output$agePlot <- renderPlot({ 542 | ggplot(ws_split(), 543 | aes_(quote(age), quote(production_mean), col = as.name(input$demographic))) + 544 | geom_line() 545 | }) 546 | }, 547 | 548 | options = list(height = 500) 549 | ) 550 | ``` 551 | 552 | ## Function application 553 | 554 | As I've tried to highlight, `dplyr` is actually all about applying functions. `summarise` is a verb that helps you apply functions to chunks of data and then bind them together. But that creates a requirement that all the functions return a single value (e.g., `mean`). There are lots of things you can do that summarise data but *don't* return a single value. For example, maybe you want to run a linear regression and return the slope *and* the intercept. 555 | 556 | For that, I want to highlight two things. 557 | 558 | One is `do`, which allows function application to grouped data. The only tricky thing about using `do` is that you have to refer to the dataframe that you're working on as `.`. 559 | 560 | The second is the amazing `broom` package, which provides methods to `tidy` the output of lots of different statistical models. So for example, you can run a linear regression on chunks of a dataset and get back out the coefficients in a data frame. 561 | 562 | Here's a toy example, again with Wordbank data. 
563 | 564 | ```{r} 565 | ws %>% 566 | filter(!is.na(sex)) %>% 567 | group_by(sex) %>% 568 | do(broom::tidy(lm(production ~ age, data = .))) 569 | ``` 570 | 571 | In recent years, this workflow in R has gotten really good. `purrr` is an amazing package that introduces consistent ways to `map` functions. It's beyond the scope of the course. 572 | 573 | 574 | # Exercise solutions 575 | 576 | Returning the third cell. 577 | 578 | ```{r} 579 | iris$Petal.Length[3] 580 | iris[3,3] 581 | iris[3,"Petal.Length"] 582 | iris[[3]][3] 583 | iris[["Petal.Length"]][3] 584 | # probably more? 585 | ``` 586 | 587 | Piped commands. 588 | 589 | ```{r} 590 | iris$Species %>% 591 | unique %>% 592 | length 593 | ``` 594 | 595 | Mean of participant means. 596 | 597 | ```{r} 598 | sgf %>% 599 | group_by(age_group, subid) %>% 600 | summarise(correct = mean(correct)) %>% 601 | summarise(mean_correct = mean(correct), 602 | sd_correct = sd(correct)) 603 | ``` 604 | 605 | Sklar tidying. 606 | 607 | ```{r} 608 | sklar %>% 609 | gather(participant, RT, 8:28) 610 | # might be a better way to select these columns than by number, e.g. regex 611 | ``` 612 | 613 | Sex means. 614 | 615 | ```{r} 616 | ws_sex <- ws %>% 617 | filter(!is.na(sex)) %>% 618 | group_by(age, sex) %>% 619 | summarise(production_sd = sd(production, na.rm=TRUE), 620 | production_mean = mean(production)) 621 | ``` 622 | 623 | Effect size. 
(Instructor demo) 624 | 625 | ```{r} 626 | ws_es <- ws_sex %>% 627 | group_by(age) %>% 628 | summarise(es = (production_mean[sex=="Female"] - production_mean[sex=="Male"]) / 629 | mean(production_sd)) 630 | 631 | ggplot(ws_es, aes(x = age, y = es)) + 632 | geom_point() + 633 | geom_smooth(span = 1) + 634 | ylab("Female advantage (standard deviations)") + 635 | xlab("Age (months)") 636 | ``` -------------------------------------------------------------------------------- /tidyverse_tutorial_short.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Medium Data in the Tidyverse" 3 | author: "Mike Frank" 4 | date: "6/22/2017, updated 10/14/2019" 5 | output: 6 | html_document: 7 | toc: true 8 | toc_float: true 9 | --- 10 | 11 | Starting note: The best reference for this material is Hadley Wickham's [R for Data Science](http://r4ds.had.co.nz/). My contribution here is to translate this reference for psychology. 12 | 13 | If you have tidyverse installed, you can `knit` the tutorial into an HTML document for better readability by pressing the `knit` button at the top. 14 | 15 | ```{r setup, include=FALSE} 16 | library(tidyverse) 17 | knitr::opts_chunk$set(echo = TRUE, cache=TRUE) 18 | ``` 19 | 20 | 21 | # Goals and Introduction 22 | 23 | By the end of this tutorial, you will know: 24 | 25 | + What "tidy data" is and why it's an awesome format 26 | + How to do some stuff with tidy data 27 | + How to get your data to be tidy 28 | + Some tips'n'tricks for dealing with "medium data" in R 29 | 30 | In order to do that, we'll start by introducing the concepts of **tidy data** and **functions and pipes**. 31 | 32 | ## Tidy data 33 | 34 | > “Tidy datasets are all alike, but every messy dataset is messy in its own way.” –– Hadley Wickham 35 | 36 | Here's the basic idea: In tidy data, every row is a single **observation** (trial), and every column describes a **variable** with some **value** describing that trial. 
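
To make the definition concrete, here is a minimal sketch of what a tidy table could look like. The column names mirror the ones used later in this tutorial, but the rows are made up for illustration and are not taken from any of the datasets in `data/`.

```{r}
library(tibble)

# Hypothetical tidy data: one row per trial, one column per variable.
# The values are invented purely to show the shape.
toy_tidy <- tribble(
  ~subid, ~item,   ~correct,
  "s1",   "faces", 1,
  "s1",   "beds",  0,
  "s2",   "faces", 1,
  "s2",   "beds",  1
)

# Because each variable lives in its own column,
# vectorised functions apply to it directly.
mean(toy_tidy$correct)
```

Each child contributes several rows here (one per item), which is what "long" means in the reshaping sections later on.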
37 | 38 | And if you know that data are formatted this way, then you can do amazing things, basically because you can take a uniform approach to the dataset. From R4DS: 39 | 40 | "There’s a general advantage to picking one consistent way of storing data. If you have a consistent data structure, it’s easier to learn the tools that work with it because they have an underlying uniformity. There’s a specific advantage to placing variables in columns because it allows R’s vectorised nature to shine." 41 | 42 | ## Functions and Pipes 43 | 44 | Everything you typically want to do in statistical programming uses **functions**. `mean` is a good example. `mean` takes one **argument**, a numeric vector. Pipes are a way to write strings of functions more easily. They bring the first argument of the function to the beginning. 45 | 46 | We'll use the `mtcars` dataset that's built into R and look at the `mpg` variable (miles per gallon). Instead of writing `mean(mtcars$mpg)`, with a pipe you can write: 47 | 48 | ```{r} 49 | mtcars$mpg %>% mean 50 | ``` 51 | 52 | That's not very useful yet, but when you start **nesting** functions, it gets better. 53 | 54 | ```{r} 55 | gpm <- function (mpg) {1/mpg} # gallons per mile, maybe better than miles per gallon. 56 | 57 | round(mean(gpm(mtcars$mpg)), digits = 2) 58 | 59 | # how do we do this with pipes? 60 | ``` 61 | This can be super helpful for writing strings of functions so that they are readable and distinct. We'll be doing a lot of piping of functions with multiple arguments later, and it will really help keep our syntax simple. 62 | 63 | 64 | # Tidy Data Analysis with `dplyr` 65 | 66 | Reference: [R4DS Chapter 5](http://r4ds.had.co.nz/transform.html) 67 | 68 | Let's take a psychological dataset. Here are the raw data from [Stiller, Goodman, & Frank (2015)](http://langcog.stanford.edu/papers_new/SGF-LLD-2015.pdf). Children met a puppet named "Furble." Furble would show them three pictures, e.g. 
face, face with glasses, face with hat and glasses and would say "my friend has glasses." They then had to choose which face was Furble's friend. (The prediction was that they'd choose *glasses and not a hat*, indicating that they'd made a correct pragmatic inference). In the control condition, Furble just mumbled. 69 | 70 | These data are tidy: each row describes a single trial, each column describes some aspect of that trial, including the participant's id (`subid`), age (`age`), condition (`condition` - "Label" is the experimental condition, "No Label" is the control), item (`item` - which thing Furble was trying to find). 71 | 72 | We are going to manipulate these data using "verbs" from `dplyr`. I'll only teach four verbs, the most common in my workflow (but there are many other useful ones): 73 | 74 | + `filter` - remove rows by some logical condition 75 | + `mutate` - create new columns 76 | + `group_by` - group the data into subsets by some column 77 | + `summarize` - apply some function over columns in each group 78 | 79 | ## Exploring and characterizing the dataset 80 | 81 | ```{r} 82 | sgf <- read_csv("data/stiller_scales_data.csv") 83 | sgf 84 | ``` 85 | 86 | Inspect the various variables before you start any analysis. Lots of people recommend `summary` but TBH I don't find it useful. 87 | 88 | ```{r} 89 | summary(sgf) 90 | ``` 91 | 92 | I prefer interactive tools like `View` or `DT::datatable` (which I really like, especially in knitted reports). 93 | 94 | ```{r, eval=FALSE} 95 | View(sgf) 96 | ``` 97 | 98 | ## Filtering & Mutating 99 | 100 | There are lots of reasons you might want to remove *rows* from your dataset, including getting rid of outliers, selecting subpopulations, etc. `filter` is a verb (function) that takes a data frame as its first argument, and then as its second takes the **condition** you want to filter on. 101 | 102 | So if you wanted to look only at two year olds, you could do this. 
(Note you can give two conditions, could also do `age > 2 & age < 3`). (equivalent: `filter(sgf, age > 2, age < 3)`) 103 | 104 | Note that we're going to be using pipes with functions over data frames here. The way this works is that: 105 | 106 | + `tidyverse` verbs always take the data frame as their first argument, and 107 | + because pipes pull out the first argument, the data frame just gets passed through successive operations 108 | + so you can read a pipe chain as "take this data frame and first do this, then do this, then do that." 109 | 110 | This is essentially the huge insight of `dplyr`: you can chain verbs into readable and efficient sequences of operations over dataframes, provided 1) the verbs all have the same syntax (which they do) and 2) the data all have the same structure (which they do if they are tidy). 111 | 112 | OK, so filtering: 113 | 114 | ```{r} 115 | sgf %>% 116 | filter(age > 2, 117 | age < 3) 118 | ``` 119 | 120 | **Exercise.** Filter out only the "face" trial in the "Label" condition. 121 | 122 | ```{r} 123 | 124 | ``` 125 | 126 | Next up, *adding columns*. You might do this perhaps to compute some kind of derived variable. `mutate` is the verb for these situations - it allows you to add a column. Let's add a discrete age group factor to our dataset. 127 | 128 | ```{r} 129 | sgf <- sgf %>% 130 | mutate(age_group = cut(age, 2:5, include.lowest = TRUE)) 131 | 132 | head(sgf$age_group) 133 | ``` 134 | 135 | ## Standard descriptives using `summarise` and `group_by` 136 | 137 | We typically describe datasets at the level of subjects, not trials. We need two verbs to get a summary at the level of subjects: `group_by` and `summarise` (kiwi spelling). Grouping alone doesn't do much. 138 | 139 | ```{r} 140 | sgf %>% 141 | group_by(age_group) 142 | ``` 143 | 144 | All it does is add a grouping marker. 145 | 146 | What `summarise` does is to *apply a function* to a part of the dataset to create a new summary dataset. 
So we can apply the function `mean` to the `correct` column and get the grand mean. 147 | 148 | ```{r} 149 | ## DO NOT DO THIS!!! 150 | # foo <- initialize_the_thing_being_bound() 151 | # for (i in 1:length(unique(sgf$item))) { 152 | # for (j in 1:length(unique(sgf$condition))) { 153 | # this_data <- sgf[sgf$item == unique(sgf$item)[i] & 154 | # sgf$condition == unique(sgf$condition)[j],] 155 | # do_a_thing(this_data) 156 | # bind_together_somehow(this_data) 157 | # } 158 | # } 159 | 160 | sgf %>% 161 | summarise(correct = mean(correct)) 162 | ``` 163 | Note the syntax here: `summarise` takes multiple `new_column_name = function_to_be_applied_to_data(data_column)` entries, separated by commas. Using this syntax, we can create more elaborate summary datasets also: 164 | 165 | ```{r} 166 | sgf %>% 167 | summarise(correct = mean(correct), 168 | n_observations = length(subid)) 169 | ``` 170 | 171 | Where these two verbs shine is in combination, though, because `summarise` applies functions to columns in your *grouped data*, not just to the whole dataset! 172 | 173 | So we can group by age or condition or whatever else we want and then carry out the same procedure, and all of a sudden we are doing something extremely useful! 174 | 175 | ```{r} 176 | sgf_means <- sgf %>% 177 | group_by(age_group, condition) %>% 178 | summarise(correct = mean(correct), 179 | n_observations = length(subid)) 180 | sgf_means 181 | ``` 182 | 183 | These summary data are typically very useful for plotting. 184 | 185 | ```{r} 186 | ggplot(sgf_means, 187 | aes(x = age_group, y = correct, col = condition, group = condition)) + 188 | geom_line() + 189 | ylim(0,1) + 190 | theme_classic() 191 | ``` 192 | 193 | **Exercise**. Adapt the code above to split the data by item, rather than age group. **BONUS**: plot the data this way as well.
194 | 195 | ```{r} 196 | 197 | ``` 198 | 199 | 200 | 201 | # Getting to Tidy with `tidyr` 202 | 203 | Reference: [R4DS Chapter 12](http://r4ds.had.co.nz/tidy-data.html) 204 | 205 | Psychological data often comes in two flavors: *long* and *wide* data. Long form data is *tidy*, but that format is less common. It's much more common to get *wide* data, in which every row is a case (e.g., a subject), and each column is a variable. In this format multiple trials (observations) are stored as columns. This can go a bunch of ways, for example, the most common might be to have subjects as rows and trials as columns. 206 | 207 | For example, let's take a look at a wide version of the `sgf` dataset above. 208 | 209 | ```{r} 210 | sgf_wide <- read_csv("data/sgf_wide.csv") 211 | head(sgf_wide) 212 | ``` 213 | 214 | The two main verbs for tidying are `pivot_longer` and `pivot_wider`. (There are lots of others in the `tidyr` package if you want to split or merge columns etc.). 215 | 216 | Here, we'll just show how to use `pivot_longer` to make the data tidy; we'll try to make a single column called `item` and a single column called `correct` rather than having four different columns, one for each item. 217 | 218 | `pivot_longer` takes three arguments: 219 | 220 | - a `tidyselect` way of getting columns. This is the columns you want to make longer. You can select them by name (e.g. `beds, faces, houses, pasta`), you can use numbers (e.g., `5:8`), or you can use markers like `starts_with(...)`. 221 | - a `names_to` argument. this argument is the **name of the column names**. in this case, the column names are items, so the "missing label" for them is `item`. 222 | - a `values_to` argument. this is the name of the thing in each column, in this case, the accuracy of the response (`correct`). 
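
To make the three arguments above concrete, here is a minimal sketch on a made-up wide table. The `toy_wide` data and its values are invented for illustration and are not taken from `sgf_wide`; only the column names mirror the items mentioned above.

```{r}
library(tidyr)
library(tibble)

# Hypothetical wide data: one row per child, one column per item.
toy_wide <- tibble(
  subid  = c("s1", "s2"),
  beds   = c(1, 0),
  faces  = c(1, 1),
  houses = c(0, 1),
  pasta  = c(1, 0)
)

toy_long <- pivot_longer(
  toy_wide,
  cols = beds:pasta,      # tidyselect: the columns to make longer
  names_to = "item",      # new column holding the old column names
  values_to = "correct"   # new column holding the cell values
)

toy_long
```

The four item columns collapse into two: `item` holds the old column names and `correct` holds the cell values, so the two wide rows become eight long rows.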
223 | 224 | Let's try it: 225 | 226 | ```{r} 227 | sgf_tidy <- sgf_wide %>% 228 | pivot_longer(beds:pasta, 229 | names_to = "item", 230 | values_to = "correct") 231 | sgf_tidy 232 | ``` 233 | We can compare this to `sgf` and see that we've recovered the original long form. (This is good, because I used `pivot_wider` to *make* the `sgf_wide` dataframe). 234 | 235 | **Exercise.** Use `pivot_wider` to try to make `sgf_wide` from `sgf`. The two arguments you need are `names_from` and `values_from`, which specify the columns that supply the new column names and the cell values (the mirror image of `names_to` and `values_to` in `pivot_longer`). 236 | 237 | 238 | # Extras 239 | 240 | These extras are fun things to go through at the end of the tutorial, time permitting. Because they require more data and packages, they are set by default not to evaluate if you knit the tutorial. 241 | 242 | ## A bigger worked example: Wordbank data 243 | 244 | We're going to be using some data on vocabulary growth that we load from the Wordbank database. [Wordbank](http://wordbank.stanford.edu) is a database of children's language learning. 245 | 246 | We're going to look at data from the English Words and Sentences form. These data describe the responses of parents to questions about whether their child says 680 different words. 247 | 248 | `tidyverse` really shines in this context. 249 | 250 | ```{r, eval=FALSE} 251 | # to avoid dependency on the wordbankr package, we cache these data. 252 | # ws <- wordbankr::get_administration_data(language = "English", 253 | # form = "WS") 254 | 255 | ws <- read_csv("data/ws.csv") 256 | ``` 257 | 258 | Take a look at the data that comes out. 259 | 260 | ```{r, eval=FALSE} 261 | DT::datatable(ws) 262 | ``` 263 | 264 | 265 | ```{r, eval=FALSE} 266 | ggplot(ws, aes(x = age, y = production)) + 267 | geom_point() 268 | ``` 269 | 270 | Aside: How can we fix this plot? Suggestions from group?
271 | 272 | ```{r, eval=FALSE} 273 | ggplot(ws, aes(x = age, y = production)) + 274 | geom_jitter(size = .5, width = .25, height = 0, alpha = .3) 275 | ``` 276 | 277 | Ok, let's plot the relationship between sex and productive vocabulary, using `dplyr`. 278 | 279 | ```{r, eval=FALSE} 280 | ggplot(ws, aes(x = age, y = production, col=sex)) + 281 | geom_jitter(size = .5, width = .25, height = 0, alpha = .3) + 282 | geom_smooth() 283 | ``` 284 | 285 | 286 | ## More exciting stuff you can do with this workflow 287 | 288 | Here are three little demos of exciting stuff that you can do (and that are facilitated by this workflow). 289 | 290 | ### Reading bigger files, faster 291 | 292 | A few other things will help you with "medium size data": 293 | 294 | + `read_csv` - Much faster than `read.csv` and has better defaults. 295 | + `dbplyr` - For connecting directly to databases. This package got forked off of `dplyr` recently but is very useful. 296 | + `feather` - The `feather` package is a fast-loading binary format that is interoperable with python. All you need to know is `write_feather(d, "filename")` and `read_feather("filename")`. 297 | 298 | Here's a timing demo for `read.csv`, `read_csv`, and `read_feather`. 299 | 300 | ```{r, eval=FALSE} 301 | system.time(read.csv("data/ws.csv")) 302 | system.time(read_csv("data/ws.csv")) 303 | system.time(feather::read_feather("data/ws.feather")) 304 | ``` 305 | I see about a 2x speedup for `read_csv` (bigger for bigger files) and a 20x speedup for `read_feather`. 306 | 307 | ### Interactive visualization 308 | 309 | The `shiny` package is a great way to do interactives in R. We'll walk through constructing a simple shiny app for the wordbank data here. 310 | 311 | Technically, this is [embedded shiny](http://rmarkdown.rstudio.com/authoring_embedded_shiny.html) as opposed to freestanding shiny apps (like Wordbank). 312 | 313 | The two parts of a shiny app are `ui` and `server`. 
Both of these are funny in that they are lists of other things. The `ui` is a list of elements of an HTML page, and the server is a list of "reactive" elements. In brief, the UI says what should be shown, and the server specifies the mechanics of how to create those elements. 314 | 315 | This little embedded shiny app shows a page with two elements: 1) a selector that lets you choose a demographic field, and 2) a plot of vocabulary split by that field. 316 | 317 | The server then has the job of splitting the data by that field (for `ws_split`) and rendering the plot (`agePlot`). 318 | 319 | The one fancy thing that's going on here is that the app makes use of the calls `group_by_` (in the `dplyr` chain) and `aes_` (for the `ggplot` call). These `_` functions are a little complex - they are an example of "standard evaluation" that lets you feed *variables that hold column names* (here `input$demographic`, a string) into `ggplot2` and `dplyr` rather than typing the *column names themselves*. Both functions are deprecated in current releases of `dplyr` and `ggplot2`, but they still show the idea. For more information, there is a nice vignette on standard and non-standard evaluation: try `vignette("nse")` (in newer `dplyr` releases, `vignette("programming")`). 320 | 321 | ```{r, eval=FALSE} 322 | library(shiny) 323 | shinyApp( 324 | ui = fluidPage( 325 | selectInput("demographic", "Demographic Split Variable", 326 | c("Sex" = "sex", "Maternal Education" = "mom_ed", 327 | "Birth Order" = "birth_order", "Ethnicity" = "ethnicity")), 328 | plotOutput("agePlot") 329 | ), 330 | 331 | server = function(input, output) { 332 | ws_split <- reactive({ 333 | ws %>% 334 | group_by_("age", input$demographic) %>% 335 | summarise(production_mean = mean(production)) 336 | }) 337 | 338 | output$agePlot <- renderPlot({ 339 | ggplot(ws_split(), 340 | aes_(quote(age), quote(production_mean), col = as.name(input$demographic))) + 341 | geom_line() 342 | }) 343 | }, 344 | 345 | options = list(height = 500) 346 | ) 347 | ``` 348 | 349 | ### Function application 350 | 351 | As I've tried to highlight, `tidyverse` is actually all about applying functions.
`summarise` is a verb that helps you apply functions to chunks of data and then bind them together. But that creates a requirement that all the functions return a single value (e.g., `mean`). There are lots of things you can do that summarise data but *don't* return a single value. For example, maybe you want to run a linear regression and return the slope *and* the intercept. 352 | 353 | For that, I want to highlight two things. 354 | 355 | One is `do`, which allows function application to grouped data. The only tricky thing about using `do` is that you have to refer to the dataframe that you're working on as `.`. 356 | 357 | The second is the amazing `broom` package, which provides methods to `tidy` the output of lots of different statistical models. So for example, you can run a linear regression on chunks of a dataset and get back out the coefficients in a data frame. 358 | 359 | Here's a toy example, again with Wordbank data. 360 | 361 | ```{r, eval=FALSE} 362 | ws %>% 363 | filter(!is.na(sex)) %>% 364 | group_by(sex) %>% 365 | do(broom::tidy(lm(production ~ age, data = .))) 366 | ``` 367 | 368 | In recent years, this workflow in R has gotten really good. `purrr` is an amazing package that introduces consistent ways to `map` functions over data, but it's beyond the scope of the course. 369 | 370 | # Conclusions 371 | 372 | Thanks for taking part. The `tidyverse` has been a transformative tool for me in teaching and doing data analysis. With a little practice it can make many seemingly-difficult tasks surprisingly easy! For example, my entire book was written in a tidyverse idiom ([wordbank book](https://langcog.github.io/wordbank-book/index.html)).
-------------------------------------------------------------------------------- /tidyverse_tutorial_short_CDS.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Medium Data in the Tidyverse" 3 | author: "Mike Frank" 4 | date: "6/22/2017, updated 10/14/2019" 5 | output: 6 | html_document: 7 | toc: true 8 | toc_float: true 9 | --- 10 | 11 | Starting note: The best reference for this material is Hadley Wickham's [R for data scientists](http://r4ds.had.co.nz/). My contribution here is to translate this reference for psychology. 12 | 13 | If you have tidyverse installed, you can `knit` the tutorial into an HTML document for better readability by pressing the `knit` button at the top. 14 | 15 | ```{r setup, include=FALSE} 16 | #install.packages("tidyverse") 17 | library(tidyverse) 18 | knitr::opts_chunk$set(echo = TRUE, cache=TRUE) 19 | ``` 20 | 21 | 22 | # Goals and Introduction 23 | 24 | By the end of this tutorial, you will know: 25 | 26 | + What "tidy data" is and why it's an awesome format 27 | + How to do some stuff with tidy data 28 | + How to get your data to be tidy 29 | + Some tips'n'tricks for dealing with "medium data" in R 30 | 31 | In order to do that, we'll start by introducing the concepts of **tidy data** and **functions and pipes**. 32 | 33 | ## Tidy data 34 | 35 | > “Tidy datasets are all alike, but every messy dataset is messy in its own way.” –– Hadley Wickham 36 | 37 | Here's the basic idea: In tidy data, every row is a single **observation** (trial), and every column describes a **variable** with some **value** describing that trial. 38 | 39 | And if you know that data are formatted this way, then you can do amazing things, basically because you can take a uniform approach to the dataset. From R4DS: 40 | 41 | "There’s a general advantage to picking one consistent way of storing data. 
If you have a consistent data structure, it’s easier to learn the tools that work with it because they have an underlying uniformity. There’s a specific advantage to placing variables in columns because it allows R’s vectorised nature to shine." 42 | 43 | ## Functions and Pipes 44 | 45 | Everything you typically want to do in statistical programming uses **functions**. `mean` is a good example. `mean` takes one main **argument**, a numeric vector. Pipes are a way to write strings of functions more easily. They move the first argument of a function in front of the function name, so `x %>% f(y)` means `f(x, y)`. 46 | 47 | We'll use the `mtcars` dataset that's built into R and look at the `mpg` variable (miles per gallon). Instead of writing `mean(mtcars$mpg)`, with a pipe you can write: 48 | 49 | ```{r} 50 | mtcars 51 | mean(mtcars$mpg) 52 | mtcars$mpg %>% mean 53 | ``` 54 | 55 | That's not very useful yet, but when you start **nesting** functions, it gets better. 56 | 57 | ```{r} 58 | gpm <- function (mpg) {1/mpg} # gallons per mile, maybe better than miles per gallon. 59 | 60 | round(mean(gpm(mtcars$mpg)), digits = 2) 61 | 62 | # how do we do this with pipes? 63 | mtcars$mpg %>% 64 | gpm %>% 65 | mean %>% 66 | round(digits = 2) 67 | ``` 68 | This can be super helpful for writing strings of functions so that they are readable and distinct. We'll be doing a lot of piping of functions with multiple arguments later, and it will really help keep our syntax simple. 69 | 70 | 71 | # Tidy Data Analysis with `dplyr` 72 | 73 | Reference: [R4DS Chapter 5](http://r4ds.had.co.nz/transform.html) 74 | 75 | Let's take a psychological dataset. Here are the raw data from [Stiller, Goodman, & Frank (2015)](http://langcog.stanford.edu/papers_new/SGF-LLD-2015.pdf). Children met a puppet named "Furble." Furble would show them three pictures, e.g. face, face with glasses, face with hat and glasses and would say "my friend has glasses." They then had to choose which face was Furble's friend.
(The prediction was that they'd choose *glasses and not a hat*, indicating that they'd made a correct pragmatic inference). In the control condition, Furble just mumbled. 76 | 77 | These data are tidy: each row describes a single trial, each column describes some aspect of that trial, including the child's id (`subid`), age (`age`), condition (`condition` - "Label" is the experimental condition, "No Label" is the control), item (`item` - which thing Furble was trying to find). 78 | 79 | We are going to manipulate these data using "verbs" from `dplyr`. I'll only teach four verbs, the most common in my workflow (but there are many other useful ones): 80 | 81 | + `filter` - keep only the rows that meet some logical condition (removing the rest) 82 | + `mutate` - create new columns 83 | + `group_by` - group the data into subsets by some column 84 | + `summarize` - apply some function over columns in each group 85 | 86 | ## Exploring and characterizing the dataset 87 | 88 | ```{r} 89 | sgf <- read_csv("data/stiller_scales_data.csv") 90 | sgf 91 | ``` 92 | 93 | Inspect the various variables before you start any analysis. Lots of people recommend `summary` but TBH I don't find it useful. 94 | 95 | ```{r} 96 | summary(sgf) 97 | ``` 98 | 99 | I prefer interactive tools like `View` or `DT::datatable` (which I really like, especially in knitted reports). 100 | 101 | ```{r, eval=FALSE} 102 | View(sgf) 103 | ``` 104 | 105 | ## Filtering & Mutating 106 | 107 | There are lots of reasons you might want to remove *rows* from your dataset, including getting rid of outliers, selecting subpopulations, etc. `filter` is a verb (function) that takes a data frame as its first argument, and then as its second takes the **condition** you want to filter on. 108 | 109 | So if you wanted to look only at two-year-olds, you could do this. (Note you can give two conditions, could also do `age >= 2 & age < 3`).
(equivalent: `filter(sgf, age >= 2, age < 3)`) 110 | 111 | Note that we're going to be using pipes with functions over data frames here. The way this works is that: 112 | 113 | + `tidyverse` verbs always take the data frame as their first argument, and 114 | + because pipes pull out the first argument, the data frame just gets passed through successive operations 115 | + so you can read a pipe chain as "take this data frame and first do this, then do this, then do that." 116 | 117 | This is essentially the huge insight of `dplyr`: you can chain verbs into readable and efficient sequences of operations over dataframes, provided 1) the verbs all have the same syntax (which they do) and 2) the data all have the same structure (which they do if they are tidy). 118 | 119 | OK, so filtering: 120 | 121 | ```{r} 122 | sgf %>% 123 | filter(age >= 2, age < 3) 124 | # sgf <- 125 | # -> sgf 126 | # sgf %<>% filter 127 | # help: ?filter 128 | ``` 129 | 130 | **Exercise.** Filter so only the "face" trial in the "Label" condition is present. 131 | 132 | ```{r} 133 | faces_labelcond <- sgf %>% 134 | filter(item == "faces", 135 | condition == "Label") 136 | ``` 137 | 138 | Next up, *adding columns*. You might do this perhaps to compute some kind of derived variable. `mutate` is the verb for these situations - it allows you to add a column. Let's add a discrete age group factor to our dataset. 139 | 140 | ```{r} 141 | sgf <- sgf %>% 142 | mutate(age_group = cut(age, 2:5, include.lowest = TRUE), 143 | age_group_halfyear = cut(age, seq(2, 5, .5), 144 | include.lowest = TRUE)) 145 | 146 | sgf 147 | ``` 148 | 149 | ```{r} 150 | sgf %>% 151 | select(-starts_with("age_group")) 152 | ``` 153 | 154 | ## Standard descriptives using `summarise` and `group_by` 155 | 156 | We typically describe datasets at the level of subjects, not trials. We need two verbs to get a summary at the level of subjects: `group_by` and `summarise` (kiwi spelling). Grouping alone doesn't do much. 
157 | 158 | ```{r} 159 | sgf %>% 160 | group_by(age_group) 161 | ``` 162 | 163 | All it does is add a grouping marker. 164 | 165 | What `summarise` does is to *apply a function* to a part of the dataset to create a new summary dataset. So we can apply the function `mean` to the `correct` column: on ungrouped data this gives the grand mean, and on data grouped by age group and condition (as below) it gives one mean and standard deviation per cell. 166 | 167 | ```{r} 168 | ## DO NOT DO THIS!!! 169 | # foo <- initialize_the_thing_being_bound() 170 | # for (i in 1:length(unique(sgf$item))) { 171 | # for (j in 1:length(unique(sgf$condition))) { 172 | # this_data <- sgf[sgf$item == unique(sgf$item)[i] & 173 | # sgf$condition == unique(sgf$condition)[j],] 174 | # do_a_thing(this_data) 175 | # bind_together_somehow(this_data) 176 | # } 177 | # } 178 | 179 | sgf %>% 180 | group_by(age_group, condition) %>% 181 | summarise(mean_correct = mean(correct), 182 | sd_correct = sd(correct)) 183 | ``` 184 | Note the syntax here: `summarise` takes multiple `new_column_name = function_to_be_applied_to_data(data_column)` entries, separated by commas. Using this syntax, we can create more elaborate summary datasets also: 185 | 186 | ```{r} 187 | sgf %>% 188 | summarise(correct = mean(correct), 189 | n_observations = length(subid)) 190 | ``` 191 | 192 | Where these two verbs shine is in combination, though, because `summarise` applies functions to columns in your *grouped data*, not just to the whole dataset! 193 | 194 | So we can group by age or condition or whatever else we want and then carry out the same procedure, and all of a sudden we are doing something extremely useful! 195 | 196 | ```{r} 197 | sgf_means <- sgf %>% 198 | group_by(age_group, condition) %>% 199 | summarise(correct = mean(correct), 200 | n_observations = length(subid)) 201 | sgf_means 202 | ``` 203 | 204 | These summary data are typically very useful for plotting.
205 | 206 | ```{r} 207 | ggplot(sgf_means, 208 | aes(x = age_group, y = correct, col = condition, group = condition)) + 209 | geom_line() + 210 | ylim(0,1) + 211 | theme_classic() 212 | ``` 213 | 214 | Grouping by participants 215 | 216 | ```{r} 217 | sgf_means <- sgf %>% 218 | group_by(age_group, condition, subid) %>% 219 | summarise(mean_correct = mean(correct)) %>% 220 | group_by(age_group, condition) %>% 221 | summarise(sd_correct = sd(mean_correct), 222 | n_obs = length(mean_correct), 223 | mean_correct = mean(mean_correct)) 224 | ``` 225 | 226 | Getting confidence intervals 227 | 228 | ```{r} 229 | sgf_sub_means <- sgf %>% 230 | group_by(age_group, condition, subid) %>% 231 | summarise(mean_correct = mean(correct)) 232 | 233 | sgf_group_means <- sgf_sub_means %>% 234 | group_by(age_group, condition) %>% 235 | summarise(sd_correct = sd(mean_correct), 236 | n_obs = length(mean_correct), 237 | sem = sd_correct / sqrt(n_obs), 238 | ci = sem * 1.96, 239 | mean_correct = mean(mean_correct)) 240 | ``` 241 | 242 | Now plot with CIs. 243 | 244 | ```{r} 245 | ggplot(sgf_group_means, 246 | aes(x = age_group, y = mean_correct, 247 | col = condition, group = condition)) + 248 | geom_line() + 249 | geom_pointrange(aes(ymin = mean_correct - ci, 250 | ymax = mean_correct + ci)) + 251 | geom_jitter(data = sgf_sub_means, alpha = .3, 252 | width = .1, height = .1) + 253 | ylim(0,1) + 254 | theme_classic() 255 | ``` 256 | 257 | Complex exclusion logic: 258 | 259 | remove (filter out) participants with < 4 trials. 
260 | 261 | ```{r} 262 | # simple solution 263 | sgf %>% 264 | group_by(subid) %>% 265 | mutate(total_trials = length(subid)) %>% 266 | filter(total_trials >= 4) 267 | 268 | # another (more elegant) solution 269 | sgf %>% 270 | group_by(subid) %>% 271 | filter(length(subid) >= 4) 272 | 273 | # another way to do it by creating a new dataframe 274 | four_trial_participants <- sgf %>% 275 | group_by(subid) %>% 276 | summarise(total_trials = length(subid)) %>% 277 | filter(total_trials >= 4) 278 | 279 | filter(sgf, subid %in% four_trial_participants$subid) 280 | ``` 281 | 282 | 283 | 284 | 285 | **Exercise**. Adapt the code above to split the data by item, rather than age group. **BONUS**: plot the data this way as well. 286 | 287 | ```{r} 288 | 289 | ``` 290 | 291 | 292 | 293 | # Getting to Tidy with `tidyr` 294 | 295 | Reference: [R4DS Chapter 12](http://r4ds.had.co.nz/tidy-data.html) 296 | 297 | Psychological data often comes in two flavors: *long* and *wide* data. Long form data is *tidy*, but that format is less common. It's much more common to get *wide* data, in which every row is a case (e.g., a subject), and each column is a variable. In this format multiple trials (observations) are stored as columns. This can go a bunch of ways, for example, the most common might be to have subjects as rows and trials as columns. 298 | 299 | For example, let's take a look at a wide version of the `sgf` dataset above. 300 | 301 | ```{r} 302 | sgf_wide <- read_csv("data/sgf_wide.csv") 303 | sgf_wide 304 | ``` 305 | 306 | The two main verbs for tidying are `pivot_longer` and `pivot_wider`. (There are lots of others in the `tidyr` package if you want to split or merge columns etc.). 307 | 308 | Here, we'll just show how to use `pivot_longer` to make the data tidy; we'll try to make a single column called `item` and a single column called `correct` rather than having four different columns, one for each item. 
309 | 310 | `pivot_longer` takes three arguments: 311 | 312 | - a `tidyselect` way of getting columns. These are the columns you want to make longer. You can select them by name (e.g. `beds, faces, houses, pasta`), you can use numbers (e.g., `5:8`), or you can use markers like `starts_with(...)`. 313 | - a `names_to` argument. This argument is the **name of the column names**. In this case, the column names are items, so the "missing label" for them is `item`. 314 | - a `values_to` argument. This is the name of the thing in each column, in this case, the accuracy of the response (`correct`). 315 | 316 | Let's try it: 317 | 318 | ```{r} 319 | sgf_tidy <- sgf_wide %>% 320 | pivot_longer(beds:pasta, 321 | names_to = "item", 322 | values_to = "correct") 323 | sgf_tidy 324 | ``` 325 | We can compare this to `sgf` and see that we've recovered the original long form. (This is good, because I used `pivot_wider` to *make* the `sgf_wide` dataframe). 326 | 327 | **Exercise.** Use `pivot_wider` to try to make `sgf_wide` from `sgf`. The two arguments you need are `names_from` and `values_from`, which specify the columns that supply the new column names and the cell values (the mirror image of `names_to` and `values_to` in `pivot_longer`). 328 | 329 | 330 | ```{r} 331 | # install.packages("tidyr") # run once in the console, not on every knit 332 | library(tidyr) 333 | 334 | # pivot_longer(beds:pasta, 335 | # names_to = "item", 336 | # values_to = "correct") 337 | 338 | sgf %>% 339 | pivot_wider(names_from = "item", 340 | values_from = "correct") 341 | # ?pivot_wider # opens the help page when run interactively 342 | ``` 343 | 344 | 345 | 346 | # Extras 347 | 348 | These extras are fun things to go through at the end of the tutorial, time permitting. Because they require more data and packages, they are set by default not to evaluate if you knit the tutorial. 349 | 350 | ## A bigger worked example: Wordbank data 351 | 352 | We're going to be using some data on vocabulary growth that we load from the Wordbank database. [Wordbank](http://wordbank.stanford.edu) is a database of children's language learning.
353 | 354 | We're going to look at data from the English Words and Sentences form. These data describe the responses of parents to questions about whether their child says 680 different words. 355 | 356 | `tidyverse` really shines in this context. 357 | 358 | ```{r, eval=FALSE} 359 | # to avoid dependency on the wordbankr package, we cache these data. 360 | # ws <- wordbankr::get_administration_data(language = "English", 361 | # form = "WS") 362 | 363 | ws <- read_csv("data/ws.csv") 364 | ws 365 | ``` 366 | 367 | Take a look at the data that comes out. 368 | 369 | ```{r, eval=FALSE} 370 | DT::datatable(ws) 371 | ``` 372 | 373 | 374 | ```{r, eval=FALSE} 375 | ggplot(ws, aes(x = age, y = production)) + 376 | geom_point() 377 | ``` 378 | 379 | Aside: How can we fix this plot? Suggestions from group? 380 | 381 | ```{r, eval=FALSE} 382 | ggplot(ws, aes(x = age, y = production)) + 383 | geom_jitter(size = .5, width = .25, height = 0, alpha = .3) 384 | ``` 385 | 386 | Ok, let's plot the relationship between sex and productive vocabulary, using `dplyr`. 387 | 388 | ```{r, eval=FALSE} 389 | sex_means <- ws %>% 390 | group_by(sex, age) %>% 391 | summarise(mean = mean(production)) 392 | 393 | ggplot(sex_means, aes(x = age, y = mean, col=sex)) + 394 | geom_line() 395 | ``` 396 | 397 | 398 | ## More exciting stuff you can do with this workflow 399 | 400 | Here are three little demos of exciting stuff that you can do (and that are facilitated by this workflow). 401 | 402 | ### Reading bigger files, faster 403 | 404 | A few other things will help you with "medium size data": 405 | 406 | + `read_csv` - Much faster than `read.csv` and has better defaults. 407 | + `dbplyr` - For connecting directly to databases. This package was split off from `dplyr` and is very useful. 408 | + `feather` - The `feather` package reads and writes a fast-loading binary format that is interoperable with Python. All you need to know is `write_feather(d, "filename")` and `read_feather("filename")`.
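
The `dbplyr` bullet above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not part of the original tutorial: it assumes the `DBI`, `RSQLite`, and `dbplyr` packages are installed, and it uses an in-memory SQLite database holding the built-in `mtcars` data as a stand-in for a real database server.

```{r, eval=FALSE}
library(dplyr)  # dbplyr must be installed; dplyr uses it for database tables

# Assumption: an in-memory SQLite database stands in for a real server.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "mtcars", mtcars)

# The same verbs as before, but translated to SQL and run in the database.
cyl_means <- tbl(con, "mtcars") %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()  # only now are the (small) results pulled into R

DBI::dbDisconnect(con)
cyl_means
```

The point of the design is that the heavy lifting stays in the database; only the summary table comes back into memory.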
409 | 410 | Here's a timing demo for `read.csv`, `read_csv`, and `read_feather`. 411 | 412 | ```{r, eval=FALSE} 413 | system.time(read.csv("data/ws.csv")) 414 | system.time(read_csv("data/ws.csv")) 415 | system.time(feather::read_feather("data/ws.feather")) 416 | ``` 417 | I see about a 2x speedup for `read_csv` (bigger for bigger files) and a 20x speedup for `read_feather`. 418 | 419 | ### Interactive visualization 420 | 421 | The `shiny` package is a great way to do interactives in R. We'll walk through constructing a simple shiny app for the wordbank data here. 422 | 423 | Technically, this is [embedded shiny](http://rmarkdown.rstudio.com/authoring_embedded_shiny.html) as opposed to freestanding shiny apps (like Wordbank). 424 | 425 | The two parts of a shiny app are `ui` and `server`. Both of these are funny in that they are lists of other things. The `ui` is a list of elements of an HTML page, and the server is a list of "reactive" elements. In brief, the UI says what should be shown, and the server specifies the mechanics of how to create those elements. 426 | 427 | This little embedded shiny app shows a page with two elements: 1) a selector that lets you choose a demographic field, and 2) a plot of vocabulary split by that field. 428 | 429 | The server then has the job of splitting the data by that field (for `ws_split`) and rendering the plot (`agePlot`). 430 | 431 | The one fancy thing that's going on here is that the app makes use of the calls `group_by_` (in the `dplyr` chain) and `aes_` (for the `ggplot` call). These `_` functions are a little complex - they are an example of "standard evaluation" that lets you feed *variables that hold column names* (here `input$demographic`, a string) into `ggplot2` and `dplyr` rather than typing the *column names themselves*. Both functions are deprecated in current releases of `dplyr` and `ggplot2`, but they still show the idea. For more information, there is a nice vignette on standard and non-standard evaluation: try `vignette("nse")` (in newer `dplyr` releases, `vignette("programming")`).
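
For comparison, here is a rough sketch of the same idea using the newer tidy-evaluation tools, outside of shiny. It is an illustration under stated assumptions rather than the tutorial's own code: `toy_ws` is an invented stand-in for `ws`, and `split_var` is a hypothetical name playing the role of `input$demographic`. It assumes `dplyr` 1.0 or later and `ggplot2` 3.0 or later.

```{r, eval=FALSE}
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `ws`; the values are made up.
toy_ws <- tibble(
  age        = c(16, 16, 16, 16, 17, 17, 17, 17),
  sex        = c("Female", "Female", "Male", "Male",
                 "Female", "Female", "Male", "Male"),
  production = c(40, 60, 30, 50, 80, 100, 60, 80)
)

split_var <- "sex"  # in the app this string would come from input$demographic

# across(all_of(...)) groups by columns whose names are stored as strings.
toy_split <- toy_ws %>%
  group_by(across(all_of(c("age", split_var)))) %>%
  summarise(production_mean = mean(production), .groups = "drop")

# The .data pronoun looks up a column by a string inside aes().
p <- ggplot(toy_split,
            aes(x = age, y = production_mean, col = .data[[split_var]])) +
  geom_line() +
  labs(col = split_var)

toy_split
```

Changing `split_var` to another column name is all that is needed to regroup and recolor, which is the behavior the selector in the app relies on.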
432 | 433 | ```{r, eval=FALSE} 434 | library(shiny) 435 | shinyApp( 436 | ui = fluidPage( 437 | selectInput("demographic", "Demographic Split Variable", 438 | c("Sex" = "sex", "Maternal Education" = "mom_ed", 439 | "Birth Order" = "birth_order", "Ethnicity" = "ethnicity")), 440 | plotOutput("agePlot") 441 | ), 442 | 443 | server = function(input, output) { 444 | ws_split <- reactive({ 445 | ws %>% 446 | group_by_("age", input$demographic) %>% 447 | summarise(production_mean = mean(production)) 448 | }) 449 | 450 | output$agePlot <- renderPlot({ 451 | ggplot(ws_split(), 452 | aes_(quote(age), quote(production_mean), col = as.name(input$demographic))) + 453 | geom_line() 454 | }) 455 | }, 456 | 457 | options = list(height = 500) 458 | ) 459 | ``` 460 | 461 | ### Function application 462 | 463 | As I've tried to highlight, `tidyverse` is actually all about applying functions. `summarise` is a verb that helps you apply functions to chunks of data and then bind them together. But that creates a requirement that all the functions return a single value (e.g., `mean`). There are lots of things you can do that summarise data but *don't* return a single value. For example, maybe you want to run a linear regression and return the slope *and* the intercept. 464 | 465 | For that, I want to highlight two things. 466 | 467 | One is `do`, which allows function application to grouped data. The only tricky thing about using `do` is that you have to refer to the dataframe that you're working on as `.`. 468 | 469 | The second is the amazing `broom` package, which provides methods to `tidy` the output of lots of different statistical models. So for example, you can run a linear regression on chunks of a dataset and get back out the coefficients in a data frame. 470 | 471 | Here's a toy example, again with Wordbank data.
472 | 473 | ```{r, eval=FALSE} 474 | ws %>% 475 | filter(!is.na(sex)) %>% 476 | group_by(sex) %>% 477 | do(broom::tidy(lm(production ~ age, data = .))) 478 | ``` 479 | 480 | In recent years, this workflow in R has gotten really good. `purrr` is an amazing package that introduces consistent ways to `map` functions over data, but it's beyond the scope of the course. 481 | 482 | # Conclusions 483 | 484 | Thanks for taking part. The `tidyverse` has been a transformative tool for me in teaching and doing data analysis. With a little practice it can make many seemingly-difficult tasks surprisingly easy! For example, my entire book was written in a tidyverse idiom ([wordbank book](https://langcog.github.io/wordbank-book/index.html)). --------------------------------------------------------------------------------