└── README.rst

/README.rst:
--------------------------------------------------------------------------------
PyData Berlin 2016 Materials
============================


Keynotes
--------

Olivier Grisel, Evolution of the pydata ecosystem

- http://ogrisel.github.io/decks/2016_pydata_berlin/
- https://github.com/ogrisel/docker-distributed


Julia Evans, How to trick a neural network

- http://jvns.ca/blog/2016/05/21/a-few-notes-from-my-pydata-berlin-keynote/


Wes McKinney, Python Data Ecosystem: Thoughts on Building for the Future

- http://de.slideshare.net/wesm/python-data-ecosystem-thoughts-on-building-for-the-future


Regular
-------

Daniel Kirsch, Functional Programming in Python

- https://github.com/kirel/functional-python


Trent McConaghy, BigchainDB: a Scalable Blockchain Database, in Python

- https://github.com/bigchaindb/bigchaindb


David Higgins, Introduction to Julia for Python programmers

- https://github.com/daveh19/pydataberlin2016


Katharina Rasch, What every Data Scientist should know about data anonymization

- https://github.com/krasch/presentations/blob/master/pydata_Berlin_2016.pdf


Alexander Sibiryakov, Frontera: open source, large scale web crawling framework

- https://github.com/scrapinghub/frontera


Thomas Reineking, Plumbing in Python: Pipelines for Data Science Applications

- Yamal: not yet open-sourced


Ryan Henderson, image-match: a python library for searching for similar images in large corpora

- https://github.com/ascribe/image-match


Ian Ozsvald, Statistically Solving Sneezes and Sniffles (a work in progress)

- https://speakerdeck.com/ianozsvald/statistically-solving-sniffles-step-by-step-a-work-in-progress
- http://ianozsvald.com/2016/05/07/statistically-solving-sneezes-and-sniffles-a-work-in-progress-report-at-pydatalondon-2016/


Felix Biessmann, Predicting Political Views From Text

- https://github.com/felixbiessmann/


Jie Bao, ExpAn - A Python Library for A/B Testing Analysis

- https://github.com/zalando/expan
- http://www.slideshare.net/JieBao3/expan-presentation-pydata-berlin-2016


Anne Matthies, Zero-Administration Data Pipelines using AWS Simple Workflow

- https://github.com/babbel/floto


Daniel Moisset, Bridging the gap: from Data Science to service

- https://github.com/machinalis/slides/tree/master/data-science-to-service


Katharine Jarmul, Holy D@t*! How to Deal with Imperfect, Unclean Datasets

- https://docs.google.com/presentation/d/1G-lgHKTdrqeeJhcvVmd7C9gOIfTRe429zhBN6lmKKzA/


Nora Neumann, Usable A/B testing – A Bayesian approach

- https://speakerdeck.com/nneu/b-testing-a-bayesian-approach


Frank Kaufer, Building Polyglot Data Science Platform on Big Data Systems

- https://speakerdeck.com/fkaufer/polyglot-data-science-platforms-on-big-data-systems


Lukasz Czarnecki, Brand recognition in real-life photos using deep learning

- http://de.slideshare.net/ukaszCzarnecki/brand-recognition-in-reallife-photos-using-deep-learning-lukasz-czarnecki-pydata-berlin-2016/


Edouard Fouché, Accelerating Python Analytics by In-Database Processing

- https://ibmdbanalytics.github.io/pydata-berlin-2016-ibmdbpy.slides.html


Anton Dubrau, Using small data in the client instead of big data in the cloud

Nils Magnus, Dealing with TBytes of Data in Realtime

Abhishek Thakur, Classifying Search Queries without User Click Data

Nathan Epstein, Machine Learning at Scale

Angelos Kapsimanis, The Simple Leads To The Spectacular

Jessica Palmer, Python and TouchDesigner for Interactive Experiments

Maciej Gryka, Removing Soft Shadows with Hard Data

Andreas Lattner, Setting up predictive analytics services with Palladium

Martina Pugliese, Spotting trends and tailoring recommendations: PySpark on Big Data in fashion

Andrej Warkentin, Visualizing FragDenStaat.de

James Powell, The kwarg problem

Moritz Neeb, Bayesian Optimization and its application to Neural Networks

Kashif Rasul, What's new in Deep Learning?

Jakob van Santen, The IceCube data pipeline from the South Pole to publication

Matthew Honnibal, Designing spaCy: A high-performance natural language processing (NLP) library written in Cython

Valentine Gogichashvili, Data Integration in the World of Microservices

Michelle Tran, Chain, Loop & Group: How Celery Empowered our Data Scientists to Take Control of our Data Pipeline

Guertel Idai, Artificial Body Representation in Robots, Expectation and Surprise

Robert Meyer, pypet: A Python Toolkit for Simulations and Numerical Experiments

Ronert Obst and Dat Tran, PySpark in Practice

Juha Suomalainen, Visualizing research data: Challenges of combining different datasources

Danny Bickson, Python based predictive analytics with GraphLab Create

Jose Quesada, A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and cons

Fang Xu, Connecting Keywords to Knowledge Base Using Search Keywords and Wikidata

Delia Rusu, Estimating stock price correlations using Wikipedia

Dr. Markus Abel, Python Learns to Control Complex Systems


Tutorials
---------

Frank Gerhardt, Using Spark - with PySpark

- https://gitlab.com/gerhardt.io/pyspark-workshop

Mike Müller, Single-source Python 2/3

- http://www.python-academy.com/download/pydatabln2016/Single_Source_Python_2_3.pdf

Katharine Jarmul, Data Wrangling with Python

Lev Konstantinovskiy, Practical Word2vec in Gensim

Shoaib Burq, Which city is the cultural capital of Europe? An introduction to Apache PySpark for GeoAnalytics


Lightning Talks
---------------

Oliver Zeigermann

- https://djcordhose.github.io/big-data-visualization/2016_pydata_berlin_lightning.html#/


Piotr Migdał, Teaching machine learning

- https://speakerdeck.com/pmigdal/teaching-machine-learning
- http://p.migdal.pl/2016/03/15/data-science-intro-for-math-phys-background.html

Mentioned tools:

- Pybuilder: Tired of writing setup.py? http://pybuilder.github.io/
- Sputnik: Package manager for Data https://github.com/spacy-io/sputnik

--------------------------------------------------------------------------------