├── README.html
└── README.md
/README.md:
--------------------------------------------------------------------------------
1 | # Data Science in Education: Syllabus
2 |
3 | * **Course:** [EDCT-GE2550, NYU Steinhardt](http://steinhardt.nyu.edu/alt/ect/courses)
4 | * **Instructor:** Charles Lang, [charles.lang@nyu.edu](mailto:charles.lang@nyu.edu), @learng00d
5 | * **Location:** 2 MetroTech, Room 845
6 |
7 | ## Course Description
8 |
9 | New class motto: "If it's not messing up, it's not technology"
10 |
11 | The Internet and mobile computing are changing our relationship to data. Data can be collected from more people, across longer periods of time, and across a greater number of variables, at a lower cost and with less effort than ever before. This has brought opportunities and challenges to many domains, but the full impact on education is only beginning to be felt. On the one hand, there is a critical mass of educators, technologists and investors who believe that there is great promise in the analysis of this data. On the other, there are concerns about what the utilization of this data may mean for education and society more broadly. Data Science in Education provides an overview of the use of new data sources in education with the aim of developing students’ ability to perform analyses and critically evaluate the technologies and consequences of this emerging field. It covers methods and technologies associated with Data Science, Educational Data Mining and Learning Analytics, and discusses the opportunities for education that these methods present and the problems that they may create.
12 |
13 | No previous experience in statistics, computer science or data manipulation will be expected. However, students will be encouraged to get hands-on experience, applying methods or technologies to educational problems. Students will be assessed on their understanding of technological or analytical innovations and how they critique the consequences of these innovations within the broader educational context.
14 |
15 | ## Course Goals
16 |
17 | The overarching goal of this course is for students to acquire the knowledge and skills to be intelligent producers and consumers of data science in education. By the end of the course students should:
18 | * Systematically develop a line of inquiry utilizing data to make an argument about learning
19 | * Be able to evaluate the implications of data science for educational research, policy, and practice
20 |
21 | This necessarily means that students become comfortable with the educational applications of three domain areas: computer science, statistics and the context surrounding data use. Students are not expected to become experts in any one of these areas; rather, the course aims to enhance student competency in identifying issues at the level of data acquisition, data analysis and application of analysis in education.
22 |
23 | ## Assessment
24 |
25 | In EDCT-GE 2550 students will attempt several data science projects. However, unlike in most courses, students will not be assessed on how successfully they complete these projects. Rather, students will be assessed on two components that are key to future success: contribution and organization. **Contribution** reflects the extent to which students participate in the course: how often they tweet, whether or not they complete assignments and quizzes, whether they attend class, etc. **Organization** reflects how well students document their process and maintain data and software resources, for example by maintaining a well-organized Zotero library with notes, a well-organized GitHub account and organized data sets that are labelled appropriately. Doing well in EDCT-GE 2550 requires that students finish the course with the resources to successfully use data science in education *in the future*. Do the work and stay organized and all will be well!
26 |
27 | Tasks that need to be completed during the semester:
28 |
29 | * Attend class
30 | * Weekly readings
31 | * Comment on readings on Twitter
32 | * Weekly in-class questionnaire
33 | * Maintain documentation of work (GitHub, R Markdown, Zotero)
34 | * Ask one question on Stack Overflow
35 | * In-person meeting with instructor
36 | * 8 short assignments (including one group assignment)
37 | * Group presentation of group assignment, 3-5 students each
38 | * Produce one argument about learning using data from the class
39 |
40 |
41 | ## Week-by-week
42 |
43 | Unit 1: Introduction
44 |
45 | Unit 2: Data Sources
46 |
47 | Unit 3: Networks
48 |
49 | Unit 4: Prediction
50 |
51 | Unit 5: Natural Language Processing
52 |
53 | Unit 6: Quantified Student
54 |
55 | Unit 7: Advanced Graphics
56 |
57 | ## Unit 1: Introduction (1/28/16 - 2/4/16)
58 |
59 | ### Learning Objectives
60 |
61 | * Be familiar with course philosophy, logic & structure
62 | * Install and be familiar with the software to be used in the course
63 | * Consider informed consent and its complexity in education technology
64 | * Appreciate the importance of tightly defining educational goals
65 |
66 | ### Tasks to be completed:
67 |
68 | 1. Read and comment on by 1/30/16:
69 | * [Leong, B. and Polonetsky, J. 2015. Why Opting Out of Student Data Collection Isn’t the Solution. EdSurge.](https://www.edsurge.com/news/2015-03-16-why-opting-out-of-student-data-collection-isn-t-the-solution)
70 | * [Young, J.R. 2014. Why Students Should Own Their Educational Data. The Chronicle of Higher Education Blogs: Wired Campus.](http://chronicle.com/blogs/wiredcampus/why-students-should-own-their-educational-data/54329)
71 |
72 | 2. Assignment 1: Set up
73 |
74 | # Unit 2: Data Sources & their Manipulation
75 | ## Week 2 Data Sources (2/4/16 - 2/11/16)
76 |
77 | ### Learning Objectives
78 |
79 | * Be familiar with a range of data sources, formats and extraction processes
80 | * Be familiar with R, GitHub and Markdown
81 | * Be familiar with the kinds of work done in the fields of LA and EDM
82 |
83 | ### Tasks to be completed:
84 |
85 | 1. Read/watch and comment:
86 | * [Siemens, G. and Baker, R.S.J. d. 2012. Learning Analytics and Educational Data Mining: Towards Communication and Collaboration. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (New York, NY, USA, 2012), 252–254.](http://users.wpi.edu/~rsbaker/LAKs%20reformatting%20v2.pdf)
87 | * [Educause 2015. Why Is Measuring Learning So Difficult?](http://er.educause.edu/multimedia/2015/8/why-is-measuring-learning-so-difficult-v)
88 | * [Saturday Morning Breakfast Cereal: 2016.](http://www.smbc-comics.com/index.php?id=3978)
89 | * [The R Markdown Cheat sheet: 2014.](http://shiny.rstudio.com/articles/rm-cheatsheet.html)
90 |
91 | 2. Assignment 2: GitHub and RStudio
92 |
93 | ## Week 3 Data Tidying (2/11/16 - 2/18/16)
94 |
95 | ### Learning Objectives
96 |
97 | * Be able to perform a data tidying workflow
98 | * Be able to do basic visualization
99 | * Understand the importance of workflow and recording workflow
100 |
101 | ### Tasks to be completed:
102 |
103 | 1. Read/watch:
104 | * [Poulson, B. Up and Running with R. Lynda.com. Section 3 - 4](http://www.lynda.com/R-tutorials/Up-Running-R/120612-2.html?org=nyu.edu)
105 | * [Data Wrangling Cheatsheet: 2015.](http://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf)
106 |
107 | 2. Read/comment:
108 | * [Clow, D. 2014. Data wranglers: human interpreters to help close the feedback loop. Proceedings of the Fourth International Conference on Learning Analytics And Knowledge (2014), 49–53.](http://oro.open.ac.uk/40608/2/Clow-DataWranglers-final.pdf)
109 |
110 | 3. Assignment 3
111 |
112 | ## Week 4: Personalization through Features (2/18/16 - 2/25/16)
113 |
114 | * Understand why dimensionality reduction is necessary
115 | * Be familiar with broad groups of dimensionality reduction (feature transformation, feature selection, feature extraction)
116 | * Understand the complexity of personalization in education
117 |
118 | ### Tasks to be completed:
119 |
120 | 1. Read/Comment:
121 |
122 | * [Kucirkova, N. and FitzGerald, E. 2015. Zuckerberg is Ploughing Billions into “Personalised Learning” – Why? The Conversation.](https://theconversation.com/zuckerberg-is-ploughing-billions-into-personalised-learning-why-51940)
123 |
124 | 2. Read/Watch:
125 |
126 | * [Georgia Tech 2015. Feature Selection. Youtube.](https://www.youtube.com/watch?v=8CpRLplmdqE)
127 | * [Perez-Riverol, Y. 2013. Introduction to Feature Selection for Bioinformaticians Using R, Correlation Matrix Filters, PCA & Backward Selection. R-bloggers.](http://www.r-bloggers.com/introduction-to-feature-selection-for-bioinformaticians-using-r-correlation-matrix-filters-pca-backward-selection/)
128 |
129 | 3. Assignment 4
130 |
131 | ## Week 5: Dimension Reduction (2/25/16 - 3/3/16)
132 |
133 | * Perform one method from each group of dimensionality reduction methods
134 | * Be aware of the complexity of Open Data
135 |
136 | ### Tasks to be completed:
137 |
138 | 1. Read/Comment:
139 |
140 | * [Ridgway, J. and Smith, A. 2013. Open data, official statistics and statistics education: threats, and opportunities for collaboration. Proceedings of the Joint IASE/IAOS Satellite Conference “Statistics Education for Progress”, Macao, China (2013).](http://iase-web.org/documents/papers/sat2013/IASE_IAOS_2013_Paper_K3_Ridgway_Smith.pdf)
141 |
142 | 2. Assignment 5
143 |
144 | # Unit 3: Networks
145 | ## Week 6 Introduction to Networks (3/3/16 - 3/10/16)
146 |
147 | ### Learning Objectives
148 |
149 | * Define social network analysis and its main analysis methods
150 | * Perform social network analysis and visualize analysis results in R
151 | * Develop a well defined opinion on how to approach student privacy and data use
152 |
153 | ### Tasks to be completed:
154 |
155 | 1. Read/Comment:
156 | * [Hanneman, R.A. and Riddle, M. Chapter 1: Social Network Data. Introduction to Social Network Methods.](http://faculty.ucr.edu/~hanneman/nettext/C1_Social_Network_Data.html)
157 | * [Krueger, K.R. and Moore, B. 2015. New Technology “Clouds” Student Data Privacy. Phi Delta Kappan. 96, 5 (Feb. 2015), 19–24.](http://www.greeleyschools.org/cms/lib2/CO01001723/Centricity/Domain/2387/New%20technology%20clouds%20student%20data%20privacy.pdf)
158 | * [Leong, B. and Polonetsky, J. 2016. Passing the Privacy Test as Student Data Laws Take Effect. EdSurge.](https://www.edsurge.com/news/2016-01-12-passing-the-privacy-test-as-student-data-laws-take-effect?utm_content=bufferc0042&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer)
159 |
160 | 2. Assignment 6
161 |
162 | ## Week 7 Social Network Analysis (3/10/16 - 3/17/16)
163 |
164 | * Describe and interpret the results of social network analysis for the study of learning
165 | * Describe and critically reflect on approaches to the use of social network analysis for the study of learning
166 |
167 | ### Tasks to be completed:
168 |
169 | 1. Read/Comment:
170 |
171 | * [Grunspan, D. Z., Wiggins, B. L., & Goodreau, S. M. (2014). Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research. CBE-Life Sciences Education, 13(2), 167–178. doi:10.1187/cbe.13-08-0162](http://www.lifescied.org/content/13/2/167.full.pdf)
172 | * [Manai, J. 2015. The Learning Analytics Landscape: Tension Between Student Privacy and the Process of Data Mining. Carnegie Foundation for the Advancement of Teaching.](http://www.carnegiefoundation.org/blog/the-learning-analytics-landscape-tension-between-student-privacy-and-the-process-of-data-mining/)
173 |
174 | 2. Assignment 7
175 |
176 | # Unit 4: Prediction
177 |
178 | ## Week 8 Prediction Modelling (3/17/16 - 3/24/16)
179 |
180 | * Conduct one form of prediction modeling effectively and appropriately
181 | * Understand the basis of predictive inference
182 | * Develop a well defined opinion of the complexity of adaptation
183 |
184 | ### Tasks to be completed:
185 |
186 | 1. Read/Comment:
187 |
188 | * [Honan, M. (2014, August 11). I Liked Everything I Saw on Facebook for Two Days. Here’s What It Did to Me | Gadget Lab. WIRED. Retrieved August 12, 2014](http://www.wired.com/2014/08/i-liked-everything-i-saw-on-facebook-for-two-days-heres-what-it-did-to-me/)
189 | * [Farr, C. 2014. Microsoft and Knewton partner up to bring adaptive learning to publishers & schools. VentureBeat.](http://venturebeat.com/2014/03/13/microsoft-and-knewton-partner-up-to-bring-adaptive-learning-to-publishers-schools/)
190 |
191 | 2. Read:
192 |
193 | * [Zheng, A. 2015. Evaluating Machine Learning Models. O’Reilly Media. Chapter 2: Evaluation Metrics p.7-18](http://www.oreilly.com/data/free/evaluating-machine-learning-models.csp?intcmp=il-data-free-lp-lgen_free_reports_page)
194 |
195 | 3. Assignment 8
196 |
197 | ## Week 9 Prediction Modelling (3/24/16 - 3/31/16)
198 |
199 | * Understand core uses of prediction modeling in intelligent tutors
200 | * Learn how to engineer both features and training labels
201 | * Learn about key diagnostic metrics and their uses
202 |
203 | ### Tasks to be completed:
204 |
205 | 1. Read/Comment:
206 |
207 | * [San Pedro, M.O.Z., Baker, R.S.J.d., Bowers, A.J., Heffernan, N.T. (2013) Predicting College Enrollment from Student Interaction with an Intelligent Tutoring System in Middle School. Proceedings of the 6th International Conference on Educational Data Mining, 177-184.](http://www.columbia.edu/~rsb2162/EDM2013_SBBH.pdf)
208 | * [Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J. and Steinberg, D. 2007. Top 10 algorithms in data mining. Knowledge and Information Systems. 14, 1 (Dec. 2007), 1–6.](https://www.cs.umd.edu/~samir/498/10Algorithms-08.pdf)
209 |
210 | 2. Assignment 9
211 |
212 | # Unit 5: Natural Language Processing
213 | ## Week 10 Natural Language Processing (3/31/16 - 4/7/16)
214 |
215 | * Describe prominent areas of text mining
216 | * Assemble a corpus of documents
217 | * Describe applications of text mining to education
218 |
219 | ### Tasks to be completed:
220 |
221 | 1. Read/Comment:
222 | * [Nadkarni, P.M., Ohno-Machado, L. and Chapman, W.W. 2011. Natural language processing: an introduction. Journal of the American Medical Informatics Association : JAMIA. 18, 5 (2011), 544–551.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168328/)
223 | * [Shermis, M. D. (2014). State-of-the-art automated essay scoring: Competition, results, and future directions from a United States demonstration. Assessing Writing, 20, 53–76.](https://s3.amazonaws.com/s3.documentcloud.org/documents/1094637/shermis-aw-final.pdf)
224 |
225 | 2. Assignment 10
226 |
227 | ## Week 11 Natural Language Processing (4/7/16 - 4/14/16)
228 |
229 | * Perform a basic NLP analysis
230 | * Develop a well defined opinion on whether students should have a right to understand how they are judged
231 |
232 | ### Tasks to be completed:
233 |
234 | 1. Read/Comment:
235 | * [Crawford, K. and Schultz, J. 2014. Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms - Boston College Law Review. Boston College Law Review. LV, 1 (2014).](http://bclawreview.org/files/2014/01/03_crawford_schultz.pdf)
236 | * [Thompson, J. 2015. Text Mining, Big Data, Unstructured Data. Dell Computing.](http://documents.software.dell.com/Statistics/Textbook/Text-Mining)
237 |
238 | 2. Assignment 11
239 |
240 | # Unit 6: The Quantified Student
241 |
242 | ## Week 12 The Quantified Student (4/14/16 - 4/21/16)
243 |
244 | * Have a well defined opinion of the use of biometric data in education
245 | * Extract orientation data from a mobile device
246 |
247 | ### Tasks to be completed:
248 |
249 | 1. Read/Comment
250 | * [Lee, V. R., & Drake, J. (2013). Quantified Recess: Design of an Activity for Elementary Students Involving Analyses of Their Own Movement Data. In Proceedings of the 12th International Conference on Interaction Design and Children (pp. 273–276). New York, NY, USA: ACM. doi:10.1145/2485760.2485822](http://quantifiedself.com/wp-content/uploads/2014/11/Quantified-recess_-Design-of-an-activity-for-elementary-students.pdf)
251 | * [Kamenetz, A. 2015. The Quantified Student: An App That Predicts GPA. NPR.](http://www.npr.org/sections/ed/2015/06/02/409780423/the-quantified-student-an-app-that-predicts-gpa)
252 | * [Meyer, R. (2016, February 25). The Quantified Welp. The Atlantic.](http://www.theatlantic.com/technology/archive/2016/02/the-quantified-welp/470874/)
253 |
254 | 2. Assignment 12
255 |
256 | # Unit 7: Advanced Graphics
257 |
258 | ## Week 13 Advanced Graphics (4/21/16 - 4/28/16)
259 |
260 | * Understand the basic principles of the grammar of graphics
261 | * Understand the basic principles of effective data visualization
262 | * Produce a range of graphical representations using ggplot & D3.js for R
263 |
264 | ### Tasks to be completed: IMPORTANT
265 |
266 | 1. Read/Watch:
267 | * [Datacamp 2015. The ggvis R package - How to Work With The Grammar of Graphics - YouTube. Youtube.](https://www.youtube.com/watch?v=rf55oB6xX3w)
268 | * [Friendly, M. 2008. A Brief History of Data Visualization. Handbook of Data Visualization. Springer Berlin Heidelberg. 15–56.](http://download.springer.com.ezp-prod1.hul.harvard.edu/static/pdf/797/chp%253A10.1007%252F978-3-540-33037-0_2.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Fchapter%2F10.1007%2F978-3-540-33037-0_2&token2=exp=1453237938~acl=%2Fstatic%2Fpdf%2F797%2Fchp%25253A10.1007%25252F978-3-540-33037-0_2.pdf%3ForiginUrl%3Dhttp%253A%252F%252Flink.springer.com%252Fchapter%252F10.1007%252F978-3-540-33037-0_2*~hmac=f39b47d9779f7d2ef33b7e231c7385fb79662ec5cc43ff39d52e812fe9ca466c)
269 |
270 | 2. Assignment 13
271 |
--------------------------------------------------------------------------------